In the future, it’s likely that many aspects of human society will be controlled — either partly or wholly — by artificial intelligence. AI computer agents could manage systems from the quotidian (e.g., traffic lights) to the complex (e.g., a nation’s whole economy), but leaving aside the problem of whether or not they can do their jobs well, there is another challenge: will these agents be able to play nice with one another? What happens if one AI’s aims conflict with another’s? Will they fight, or work together?
Google’s AI subsidiary DeepMind has been exploring this problem in a new study published today. The company’s researchers decided to test how AI agents interacted with one another in a series of “social dilemmas.” This is a rather generic term for situations in which individuals can profit from being selfish — but where everyone loses if everyone is selfish. The most famous example of this is the prisoner’s dilemma, where two individuals can choose to betray one another for a prize, but lose out if both choose this option.
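To make the structure of a social dilemma concrete, here is a minimal sketch of the prisoner’s dilemma in Python. The payoff numbers are illustrative assumptions, not figures from DeepMind’s study, and the names `PAYOFFS`, `best_response`, and `total_payoff` are hypothetical helpers invented for this example.

```python
# Illustrative prisoner's dilemma. The payoff values are assumptions chosen
# only to show the structure of the dilemma (higher is better).
COOPERATE, DEFECT = "cooperate", "defect"

# PAYOFFS[(my_move, their_move)] -> (my_payoff, their_payoff)
PAYOFFS = {
    (COOPERATE, COOPERATE): (3, 3),
    (COOPERATE, DEFECT): (0, 5),
    (DEFECT, COOPERATE): (5, 0),
    (DEFECT, DEFECT): (1, 1),
}


def best_response(their_move):
    """Return the move that maximizes my own payoff against a fixed opponent move."""
    return max((COOPERATE, DEFECT), key=lambda mine: PAYOFFS[(mine, their_move)][0])


def total_payoff(move_a, move_b):
    """Return the combined payoff of both players for a pair of moves."""
    return sum(PAYOFFS[(move_a, move_b)])
```

With these assumed numbers, betraying is each player’s best individual response whatever the other does, yet mutual betrayal leaves both players worse off than mutual cooperation.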
As explained in a blog post from DeepMind, the company’s researchers tested how AI agents would perform in these sorts of situations by dropping them into a pair of very basic video games.
In the first game, Gathering, two players have to collect apples from a central pile. They have the option of “tagging” the other player with a laser beam, temporarily removing them from the game, and giving the first player a chance to collect more apples.
In the second game, Wolfpack, two players have to hunt a third in an environment filled with obstacles. Points are claimed not just by the player that captures the prey, but by all players near to the prey when it’s captured.
What the researchers found was interesting, but perhaps not surprising: the AI agents altered their behavior, becoming more cooperative or antagonistic, depending on the context.
For example, in the Gathering game, when apples were in plentiful supply, the agents didn’t really bother zapping one another with the laser beam. But when stocks dwindled, the amount of zapping increased. Most interesting, perhaps, was what happened when a more computationally powerful agent was introduced into the mix: it tended to zap the other player regardless of how many apples there were. That is to say, the cleverer AI decided it was better to be aggressive in all situations.
Does that mean that the AI agent thinks being combative is the “best” strategy? Not necessarily. The researchers hypothesize that the increase in zapping behavior by the more-advanced AI was simply because the act of zapping itself is computationally challenging. The agent has to aim its weapon at the other player and track their movement — activities which require more computing power, and which take up valuable apple-gathering time. Unless the agent knows these strategies will pay off, it’s easier just to cooperate.
Conversely, in the Wolfpack game, the cleverer the AI agent, the more likely it was to cooperate with other players. As the researchers explain, this is because learning to work with the other player to track and herd the prey requires more computational power.
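The Wolfpack scoring rule, in which every player near the prey at the moment of capture earns points, can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `wolfpack_rewards`, the `capture_radius` and `reward` parameters, and the use of straight-line distance are all hypothetical and are not taken from DeepMind’s implementation.

```python
import math


def wolfpack_rewards(wolves, prey, capture_radius=2.0, reward=1.0):
    """Assign points at the moment of capture (hypothetical scoring sketch).

    wolves: dict mapping a player name to its (x, y) position.
    prey:   (x, y) position of the captured prey.
    Every wolf within capture_radius of the prey receives `reward`;
    wolves farther away receive nothing.
    """
    return {
        name: (reward if math.dist(position, prey) <= capture_radius else 0.0)
        for name, position in wolves.items()
    }
```

Under a rule like this, a player that stays close to its partner still scores when the partner makes the capture, which is why coordinated hunting pays off.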
The results of the study, then, show that the behavior of AI agents changes based on the rules they’re faced with. If those rules reward aggressive behavior (“Zap that player to get more apples”), the AI will be more aggressive; if they reward cooperative behavior (“Work together and you both get points!”), it will be more cooperative.
That means part of the challenge in controlling AI agents in the future will be making sure the right rules are in place. As the researchers conclude in their blog post: “As a consequence [of this research], we may be able to better understand and control complex multi-agent systems such as the economy, traffic systems, or the ecological health of our planet - all of which depend on our continued cooperation.”