Today begins week three of the poker tournament between Libratus, an AI system built by researchers from Carnegie Mellon University, and four of the world’s top pros. While the humans plan to soldier on, a gallows humor has taken hold. With a little over 80,000 hands played, out of 120,000 total, the humans are down by roughly $750,000, a massive amount that will be all but impossible to come back from.
“We’re all down about the price of a small house,” said Jason Les, chatting with onlookers about the score while he played. The players don’t actually have to pay the AI anything, and in fact all get paid depending on how well they perform relative to one another. “It’s not about the money, it’s about preserving human dignity,” quipped Les. “And it’s not going well.”
The AI is unpredictable in a way no human can be
So far the AI’s main advantage is its ability to remain unpredictable. While the pros appreciate the way Libratus is playing, they don’t believe there are many tricks they can pick up from the system. “There are a lot of things I see Libratus do that I really like. However, they are really only possible because they are mixed and randomized by the reasoning of a computer,” says Les. “Its balanced dispersion of hand ranges into different actions is not really feasible for the human mind to imitate correctly.”
One of the things Libratus does well is bluff. Take the hand where Les was dealt a pair of 10s to start, a diamond and a heart. The flop was king-9-4 with two clubs, and there was more betting. At this point the AI might have been looking for a third club to complete a flush. The turn brought a 5, not a club, and both sides checked. The final card, the river, was a non-club queen, and the AI made an aggressive bet, wagering all its chips.
AI is impervious to the psychological ups and downs
Les, faced with this bet, folded. A short while later his partner, Dong Kyu, got the other side of the hand. To eliminate the impact of luck, the tournament is arranged as two pairs of duplicate hands. One player plays side A of the cards against the AI, while his partner plays the same cards, but on side B. This mirror matchup happens simultaneously, and the players and poker bots can’t share info.
Kyu’s hand was a 7-3 of clubs. That meant the AI had been fishing for a flush, and its very aggressive bet against Les on the river was made with almost no chance of winning had Les called. To add insult to injury, with the situation reversed, the AI bet hard on its pair of 10s early, and Kyu folded, giving Libratus a victory on both sides of the mirror match. It was one of many situations in which the system’s aggression flummoxed the human pros.
Libratus has also been over-betting frequently, wagering far more to win a hand than is currently up for grabs in the pot. “If you have $200 in the middle and $20,000 in your stack, you can bet that,” says Doug Polk, a poker pro who bested a previous AI built by CMU in 2015. “But humans don’t really like that. It feels like you’re risking a lot of money to win so little. The computer doesn’t have that psychology. It just looks at the best play.”
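To put rough numbers on Polk's example, here is a minimal sketch of the standard break-even arithmetic for a bet of $20,000 into a $200 pot. It is an illustration only, not a description of how Libratus reasons. The function names are hypothetical, and the bluff calculation assumes a pure bluff that always loses when called.

```python
def bluff_break_even_fold_frequency(pot: float, bet: float) -> float:
    """Fraction of the time the opponent must fold for a pure bluff to break even.

    Assumption: the bluff wins `pot` when the opponent folds and loses `bet`
    when called, so break-even is f * pot = (1 - f) * bet.
    """
    return bet / (pot + bet)


def caller_required_equity(pot: float, bet: float) -> float:
    """Share of the time the caller must hold the best hand to break even.

    The caller risks `bet` to win the pot plus the bettor's wager.
    """
    return bet / (pot + 2 * bet)


if __name__ == "__main__":
    pot, bet = 200.0, 20_000.0  # figures from Polk's example
    print(f"Bluff needs folds {bluff_break_even_fold_frequency(pot, bet):.2%} of the time")
    print(f"Caller needs to win {caller_required_equity(pot, bet):.2%} of the time")
```

Under these assumptions, a pure bluff of that size needs the opponent to fold about 99 percent of the time, which is the "risking a lot to win so little" feeling Polk describes. A program can weigh that figure without flinching.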
Imagine this type of AI directing cyberattacks
While beating the best humans at chess, checkers, backgammon, and Go were important accomplishments for artificial intelligence, there are limited real-world applications for systems that master these kinds of games. Most real-world problems aren’t laid out neatly on a board where both sides know exactly how the opponent will operate and where all the important pieces are at all times.
Mastering the art of the bluff requires AI that can calculate risk and reward in real time without having perfect information about what its opponent can do in return. It implies the system does more than simply play a perfectly safe game where it only grinds out wins when it has the stronger hand. The team from Carnegie Mellon University that built the Libratus poker bot hopes this kind of system can, after being tested on games like poker, learn how to tackle thorny decisions in the world of military strategy, cybersecurity, and even medicine. And they aren’t alone.
As the pros get tired, the bot is getting stronger
Regardless of the pure ability of the humans and the AI, it seems clear that the pros will be less effective as the tournament goes on. Ten hours of poker a day for 20 days straight against an emotionless computer is exhausting and demoralizing, even for pros like Doug Polk. And while the humans sleep at night, Libratus takes the supercomputer powering its in-game decision making and applies it to refining its overall strategy.
“The bot gets better and better every day. It’s like a tougher version of us,” said Jimmy Chou, one of the four pros battling Libratus. “The first couple of days, we had high hopes. But every time we find a weakness, it learns from us and the weakness disappears the next day.”
“The bot has certainly surprised us in its shifts in strategy so far. Without going into too much detail, there were specific vulnerabilities in Libratus’ strategy that we identified in the early days of the competition and we attacked them relentlessly,” said Les. “After a few days in though, these leaks slowly started to disappear.” The players believe the bot’s creators at CMU must be tinkering with it by hand at night, but so far the academics aren’t talking. “We don’t know exactly what happened or how it’s adjusting, but that was impressive.”