Artificially intelligent (AI) agents are increasingly being adopted for a wide range of tasks. Studies have concluded that the goal-directed behavior of AI agents can be dangerous, depending on the nature of their goals. For instance, agents might behave pathologically outside the regimes their designers anticipated, and they may pursue convergent instrumental goals such as resource acquisition and self-preservation. There is therefore a need for methods that identify which systems are agents, that is, systems that act intentionally toward goals.
To examine the robustness of machine learning systems from a security perspective, causal models of agents have been used. However, identifying agents is difficult: modelers frequently make assumptions about the causal model without providing much supporting evidence, which can lead to errors in the safety analysis.
One method of modeling decision-making scenarios used in such studies is the causal influence diagram (CID). CIDs help expose potential dangers before an agent is trained and can inspire improved agent designs by connecting training setups to the incentives that shape agent behavior.
A new work by DeepMind presents the first formal causal definition of agents. In essence, it states that agents are systems that would adapt their policy if their actions influenced the world in a different way.
The definition targets systems whose outputs are driven by reasons. To put it another way, when an agent makes a decision, it does so because it “expects” that action to lead to a particular result.
Reason-driven systems are the ones that would change their behavior if they “learned” the world functioned differently. Behavior sensitivity to changes in the surrounding environment can be formally modeled using causality and structural causal models (SCMs).
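The idea that an agent's behavior is sensitive to how the world works can be illustrated with a small toy check. The sketch below is not taken from the paper: the names `adaptive_policy`, `fixed_policy`, and `adapts_to_mechanism`, and the two-action world, are assumptions made purely for illustration. It intervenes on the mechanism that links actions to outcomes and asks whether the system's chosen action changes in response.

```python
from typing import Callable, Dict

# A mechanism maps each available action to the utility it causes.
# (Hypothetical toy setup, not the paper's formalism.)
Mechanism = Dict[str, float]
Policy = Callable[[Mechanism], str]


def adaptive_policy(mechanism: Mechanism) -> str:
    """Choose the action with the highest utility under the given mechanism."""
    return max(sorted(mechanism), key=lambda action: mechanism[action])


def fixed_policy(mechanism: Mechanism) -> str:
    """Always output the same action, ignoring how the world works."""
    return "left"


def adapts_to_mechanism(policy: Policy, original: Mechanism, intervened: Mechanism) -> bool:
    """Return True if the chosen action differs after the mechanism is changed."""
    return policy(original) != policy(intervened)


# Original world: "left" pays off. Intervened world: "right" pays off.
original = {"left": 1.0, "right": 0.0}
intervened = {"left": 0.0, "right": 1.0}

print(adapts_to_mechanism(adaptive_policy, original, intervened))  # True
print(adapts_to_mechanism(fixed_policy, original, intervened))     # False
```

Under this toy criterion only the first system behaves like a reason-driven one: its output changes when the consequences of its actions change, while the fixed rule keeps producing the same action regardless.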
For this, the team introduces mechanized SCMs, a variant of mechanized causal games, and provides an algorithm for producing the graph of such a model from a set of interventional distributions. They then build on this with a method to identify which variables represent agent decisions and which represent the objectives those decisions optimize. Consequently, they can transform a mechanized SCM into a (structural) causal game.
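A rough sense of how a graph can be recovered from interventions, and how decisions and objectives might then be labeled, is given by the sketch below. It is a heavily simplified illustration and not the paper's algorithm: the four-variable deterministic system, the names `simulate`, `discover_edges`, and `label_decisions_and_utilities`, and the labeling rule are all assumptions. An edge X -> Y is recorded when changing X changes Y while every other variable is held fixed by intervention. A mechanism variable that responds to another mechanism variable is then read as a policy optimizing the corresponding objective; the paper's actual criteria are more refined than this.

```python
from itertools import product

# Hypothetical toy system. "policy" and "U_mech" are mechanism variables:
# they describe how the object-level variables D and U are produced.
DOMAINS = {
    "U_mech": ["prefers_left", "prefers_right"],
    "policy": ["left", "right"],
    "D": ["left", "right"],
    "U": [0, 1],
}
MECHANISM_OF = {"policy": "D", "U_mech": "U"}


def simulate(interventions):
    """Evaluate the system in causal order, overriding intervened variables."""
    values = {}

    def assign(name, natural_value):
        values[name] = interventions.get(name, natural_value)

    assign("U_mech", "prefers_left")
    # The policy adapts to whichever action the utility mechanism rewards.
    assign("policy", "left" if values["U_mech"] == "prefers_left" else "right")
    assign("D", values["policy"])
    preferred = "left" if values["U_mech"] == "prefers_left" else "right"
    assign("U", 1 if values["D"] == preferred else 0)
    return values


def discover_edges():
    """Record X -> Y if varying X changes Y with all other variables fixed."""
    edges = set()
    for x, y in product(DOMAINS, repeat=2):
        if x == y:
            continue
        others = [name for name in DOMAINS if name not in (x, y)]
        for setting in product(*(DOMAINS[name] for name in others)):
            fixed = dict(zip(others, setting))
            outcomes = {simulate({**fixed, x: value})[y] for value in DOMAINS[x]}
            if len(outcomes) > 1:
                edges.add((x, y))
                break
    return edges


def label_decisions_and_utilities(edges):
    """Toy rule: a mechanism-to-mechanism edge marks a utility and a decision."""
    decisions, utilities = set(), set()
    for source, target in edges:
        if source in MECHANISM_OF and target in MECHANISM_OF:
            utilities.add(MECHANISM_OF[source])
            decisions.add(MECHANISM_OF[target])
    return decisions, utilities


edges = discover_edges()
print(sorted(edges))
print(label_decisions_and_utilities(edges))
```

In this toy system the only edge between mechanism variables runs from `U_mech` to `policy`, so the sketch labels `D` as a decision and `U` as the objective it optimizes. Exhaustively intervening on every other variable is only practical here because the domains are tiny.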
With this, they suggest that it is possible to discover agents by inferring a game graph from a collection of tests, provided certain assumptions are met.
The team considered multiple factors while forming the new definition of agents. To begin, they ground game-graph representations of agents in causal experiments. These experiments can be carried out on existing systems, or used in thought experiments, to establish the proper game graph and clear up misunderstandings. The right game graph gives the researcher more confidence that no modeling error has been made, allowing them to better grasp the agent’s incentives and safety properties. The researchers believe that in cases where experimentation is cheap, such as software simulations, the algorithms also pave the way for automated inference of game graphs.
This Article is written as a research summary article by Marktechpost Staff based on the research paper 'Discovering Agents'. All Credit For This Research Goes To Researchers on This Project.
Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a keen interest in the applications of artificial intelligence across various fields. She is passionate about exploring new advancements in technology and their real-life applications.