Artificial intelligence captured the world’s attention last year when it defeated humanity’s champion at the game of Go. It was a landmark event for AI, much like the moment in 1997 when IBM’s Deep Blue defeated Garry Kasparov at chess. Starting next week an artificial intelligence system named Libratus, developed by a team at Carnegie Mellon University, will try to establish a new milestone: beating some of the best human players at Heads-Up No-Limit Texas Hold’em poker.
While Libratus may one day be listed in the history books alongside Deep Blue and AlphaGo, it’s actually attempting to solve a very different kind of problem. Go and chess are perfect-information games: each player knows exactly what moves have been made and what space is left on the board to consider. Poker is an imperfect-information game, which makes it far more challenging for artificial intelligence to master.
“In a complete information game you can solve a subtree of the game tree,” says Professor Tuomas Sandholm, who built the Libratus system with PhD student Noam Brown. AI trying to win a game of chess or Go can work through how a sequence of moves will play out. “With incomplete-information games, it’s not like that at all. You can’t know what cards the other player has been dealt,” he explains. “That means you don’t know exactly what subgame you’re in. Also, you don’t know which cards chance will produce next from the deck.”
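The uncertainty Sandholm describes can be made concrete with a toy sketch. It assumes a three-card deck (J, Q, K) with one card dealt to each player, a setup borrowed from the textbook game Kuhn poker rather than from Libratus itself. The function names are hypothetical and chosen only for illustration.

```python
from itertools import permutations

# Assumed toy deck for illustration; real Hold'em uses 52 cards.
DECK = ("J", "Q", "K")

def all_deals(deck=DECK):
    """Every way to deal one card to player 1 and one card to player 2."""
    return list(permutations(deck, 2))

def states_consistent_with(my_card, deck=DECK):
    """Deals player 1 cannot tell apart when holding my_card.

    Player 1 sees only their own card, so every deal that starts
    with that card looks identical from their seat.
    """
    return [deal for deal in all_deals(deck) if deal[0] == my_card]

if __name__ == "__main__":
    for card in DECK:
        # Each line lists the true states the player might be in.
        print(card, states_consistent_with(card))
```

Even in this tiny game, a player holding the queen cannot know whether the opponent has the jack or the king, so any decision has to work reasonably well against both possibilities at once.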
Incomplete-information games have thus far proved much harder to solve. CMU’s AI focuses on information sets, groupings of possible game states that take into account both the known and unknown variables. It’s a massive mathematical undertaking. “The game has 10 to the power of 160 information sets, and 10 to the power of 165 nodes in the game tree,” says Sandholm. That means there are more possible permutations in a hand of poker than there are atoms in our universe. “And even if you had another whole universe for each atom in our universe and counted all the atoms in those universes, it would be more than that.”
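Sandholm’s comparison can be checked with rough arithmetic. The figure of about 10^80 atoms in the observable universe is a commonly cited order-of-magnitude estimate and is an assumption here, not a number from Sandholm. Giving every one of those atoms its own universe of the same size yields:

```latex
\[
\underbrace{10^{80}}_{\text{atoms (assumed)}} \times
\underbrace{10^{80}}_{\text{one universe per atom}}
= 10^{160} < 10^{165}
\]
```

Under that assumption, the atoms in all those universes roughly match the number of information sets, and the game tree still holds about 100,000 times more nodes.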
Rather than merely strategize many moves in advance, as AI might do when playing chess or Go, the system built by CMU is looking to achieve the perfect balance of risk and reward, a state of play defined by the Nash equilibrium. You might be familiar with this seminal piece of mathematics from the film A Beautiful Mind, which chronicled the life of John Nash; he introduced the concept back in 1950. It has since become a cornerstone of game theory, earning Nash a Nobel Prize in 1994.
“In these two-player zero sum games, if the other player doesn’t play a Nash equilibrium strategy, that means they are playing worse, and we are making more money,” explains Sandholm. “In such games, playing Nash equilibrium is safe. It has the flavor where it plays rationally and is not exploitable anywhere.”
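A minimal sketch of what “not exploitable” means, using rock-paper-scissors as a stand-in for poker. This is an illustration of the concept, not the method Libratus uses, and the function name `best_response_value` is hypothetical. It computes the most an opponent can win per round, on average, once they know the mixed strategy they are facing.

```python
from fractions import Fraction

# Payoff to the row player; the game is zero-sum, so the column
# player receives the negative. Order: rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper    vs rock, paper, scissors
    [-1, 1, 0],   # scissors vs rock, paper, scissors
]

def best_response_value(strategy, payoff=PAYOFF):
    """Best average payoff the column player can get against a fixed
    row-player mixed strategy (a list of probabilities)."""
    column_payoffs = []
    for j in range(len(payoff[0])):
        row_expected = sum(p * payoff[i][j] for i, p in enumerate(strategy))
        column_payoffs.append(-row_expected)  # zero-sum: opponent gets the negative
    return max(column_payoffs)

# Equilibrium strategy for rock-paper-scissors: play each move a third of the time.
uniform = [Fraction(1, 3)] * 3
# A player who leans on rock, chosen here as an example of a non-equilibrium strategy.
rock_heavy = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

if __name__ == "__main__":
    print(best_response_value(uniform))     # 0
    print(best_response_value(rock_heavy))  # 1/4
```

Against the uniform strategy no counter-strategy wins anything on average, while the rock-heavy player gives up a quarter of a unit per round to an opponent who always plays paper. That gap is the edge Sandholm describes when the other player strays from equilibrium.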
For the humans matched against the machine, this approach produces a relentless opponent. “I always tell people the one word I can use to describe the experience: a grind. The first few days we ended up playing til midnight, and when we were done we went back to the hotel and studied for a few hours before going to sleep. Then we would wake up at 9AM and do it all over again,” says Jason Les, a poker pro who played against CMU’s prior AI system during its first tournament, and who will be returning this time.
To the casual observer it can seem a lot less sexy than the strange or creative moves produced by a system like AlphaGo. “Many people see this as a defensive strategy, where all you’re trying to do is avoid being beaten and then getting a small edge in all the spots your opponent is not playing optimally themselves.”
Sandholm is quick to point out, however, that playing it safe is not the same as playing conventionally. “This poker program, and the Claudico program a year and a half ago, they come up with new moves. They play moves that established poker literature considers really bad.” For example, on the first move in a hand of poker, limping means you just call the opponent: you put in the minimum amount of money needed to continue the hand. All the poker books say that is a terrible move, but CMU’s poker bots limp somewhere between 7 and 16 percent of the time.
“That really contradicts the folk wisdom on how to play this game,” says Sandholm. “The algorithms figure it out just from the rules of the game, we don’t give them any historical data about how humans play. They play like Martians, they figure out their own strategy.” The AI also flouted convention by donk betting a lot, taking the initiative from the player who placed the final wager in the previous round.
“I think they show human players that they can make these unconventional strategies work,” says Les. “However, in practice they are too difficult to emulate without the help of a computer.” Dong Kyu Kim, who played against CMU’s prior system in 2015, has adopted some of its strange techniques. “I have learned a lot from Claudico to use in my own game,” says Kim, who believes that following its lead can offer an edge over many human opponents.
A team from the University of Alberta built an AI system that was better than the best humans at limit Texas Hold’em back in 2008, and achieved near-perfect play at that variant of the game in 2015. No-Limit, where the size of bets is not constrained, is much more complex, but all the poker pros involved in this tournament felt it was only a matter of time before the machines would prevail.
“I do not believe that poker is different enough from chess and Go, and ultimately think that computers will dominate the game,” said Jimmy Chou. “Humans may have the upper-hand occasionally due to our unpredictable nature, but long-term I will put my money on the effectiveness of machines due to math and science.” Kim agrees. “I hate to admit it as a professional poker player, but I do believe machines will be able to beat humans in all forms of poker. It is just a matter of time.”
While the triumphs of Deep Blue and AlphaGo captured the public imagination, systems that solve perfect-information games have limited application. “Most real world interactions include multiple parties and incomplete information,” says Sandholm. Crafting a system that can outperform humans at these types of tasks will be “much more important from an AI perspective, and for making the world better in general.” AlphaGo’s creator has his eye on no-limit poker and StarCraft II, both imperfect-information games.
Sandholm sees systems like the one built by his team doing automated negotiation or bargaining on behalf of a consumer or a business in a complex auction, for example. It might also find a home in cybersecurity, helping to optimize a network’s defenses against hackers. And Sandholm hopes it might one day be widely applied to medicine. “We’ve been looking at auto-immune diseases and cancer, steering one’s own immune system to better battle disease at hand,” he explained. “The T-cell is not really an opponent, but you can deal with them using these techniques.”
The matches will begin on January 11th at the Rivers Casino in Pittsburgh, Pennsylvania. Four of the world’s top poker pros — Jason Les, Dong Kim, Daniel McAulay, and Jimmy Chou — will collectively play 120,000 hands over the course of the 20-day tournament, vying for a cut of the $200,000 prize purse. If you want to tune in, live streams of the matches between Libratus and its human opponents will be made available on Twitch as the tournament unfolds.