Artificial Intelligence Defeats Humans in Heads-Up Poker
In January 2017, an AI system called Libratus played 120,000 hands of heads-up no-limit Texas Hold’em against 4 professional poker players at a casino in Pittsburgh. By the end of the 20-day contest, Libratus had won $1,766,250 in chips. The margin was not close. Two years later, a successor system called Pluribus beat professionals in 6-player no-limit Hold’em, a format closer to what most people encounter at a real card table. These results closed a chapter in artificial intelligence research that had been open since the 1990s.
Why Poker Was a Hard Problem for AI
Chess fell to IBM’s Deep Blue in 1997. Go fell to DeepMind’s AlphaGo in 2016. Both are games of complete information, meaning every player can see the full board state at all times. Poker is different. Cards are hidden. Players can bluff. The optimal move depends on what an opponent might be holding, which in turn depends on what that opponent thinks you are holding.
This property is called imperfect information. In game theory terms, a poker hand involves sequential decisions where each player acts without full knowledge of the state. Computing an optimal strategy for this kind of game requires an approach that accounts for every possible combination of hidden cards and opponent behaviors. The number of possible game states in heads-up no-limit Texas Hold’em reaches roughly 10 to the 161st power, more than the number of atoms in the observable universe.
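The scale of the hidden-information problem shows up even before any betting happens. A small sketch (standard combinatorics, nothing specific to Libratus) counts how many hidden hands an opponent could hold:

```python
from math import comb

# From a 52-card deck, once you hold 2 hole cards, the opponent's
# hidden hand is any 2 of the remaining 50 cards.
opponent_hands = comb(50, 2)
print(opponent_hands)  # 1225

# By the river, 5 community cards are also visible, so the opponent's
# possible holdings shrink to C(45, 2) -- still far too many to reason
# about card-by-card at every decision point without grouping similar
# situations together.
print(comb(45, 2))  # 990
```

Multiply those hidden possibilities across every betting decision in a hand, and across both players' beliefs about each other, and the full game tree reaches the 10-to-the-161st-power figure.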

Computation Meets Card Rooms
Poker has always attracted people who think in probabilities. The same mathematical reasoning that drives AI research appears in players studying hand ranges, pot odds, and positional play, and anyone playing poker games online encounters these calculations at the table, applied in real time against opponents rather than algorithms. Stratego and bridge share the same incomplete-information structure, which is why both drew researchers' attention before poker became the primary testing ground.
Libratus and the 2017 Brains vs. AI Challenge
Libratus was built by Noam Brown and Tuomas Sandholm at Carnegie Mellon University. The system used a 3-part approach. First, it precomputed a broad strategy by abstracting the game into manageable clusters of similar situations. Second, during play, it refined that strategy in real time based on what was happening at the table. Third, after each day of competition, it analyzed spots where opponents had exploited weaknesses and patched them overnight.
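The first step, abstraction, can be sketched in miniature. The bucketing rule and the equity numbers below are hypothetical illustrations, not Libratus's actual clustering, which used far richer features than a single statistic:

```python
# Toy abstraction: collapse many distinct situations into a few buckets
# using a summary statistic (here, a hypothetical precomputed win
# probability), so one strategy can cover every situation in a bucket.
def bucket(equity, num_buckets=5):
    """Map a win probability in [0, 1] to a bucket index 0..num_buckets-1."""
    return min(int(equity * num_buckets), num_buckets - 1)

# Hypothetical equities for a few starting hands in one situation:
hands = {"AA": 0.85, "KQs": 0.65, "97o": 0.42, "72o": 0.31}
abstracted = {h: bucket(e) for h, e in hands.items()}
print(abstracted)  # {'AA': 4, 'KQs': 3, '97o': 2, '72o': 1}
```

Hands that land in the same bucket are played with the same precomputed strategy, which is what makes the blueprint small enough to store and refine.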
The 4 human opponents were Jason Les, Dong Kim, Daniel McAulay, and Jimmy Chou, all recognized professionals in heads-up no-limit play. Over 120,000 hands at the Rivers Casino in Pittsburgh, Libratus finished ahead by a margin of 14.7 big blinds per 100 hands. In poker terms, that is a dominant winrate, and the result was statistically significant at the 99.98% level (p = 0.0002).
Pluribus Expanded the Challenge to 6 Players
Two-player poker is complex, but multiplayer poker is a fundamentally harder problem. A 2-player zero-sum game has a well-defined optimal strategy, a Nash equilibrium, from which neither player can profit by unilaterally deviating; playing it guarantees you cannot lose in expectation. In a 6-player game, Nash equilibria are far more difficult to compute, and even a computed equilibrium may not be the best practical strategy.
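The 2-player equilibrium property can be checked directly in a game small enough to write down. Rock-paper-scissors, whose unique Nash equilibrium mixes all three throws uniformly, serves as a stand-in here; the verification below is generic game theory, not anything taken from the poker systems:

```python
# Row player's payoff matrix for rock-paper-scissors
# (win = +1, loss = -1, tie = 0); zero-sum, so the column
# player's payoff is the negation.
PAYOFF = [
    [0, -1, 1],   # rock vs (rock, paper, scissors)
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]

def expected_payoff(row, col):
    """Row player's expected payoff for mixed strategies `row` and `col`."""
    return sum(row[i] * col[j] * PAYOFF[i][j]
               for i in range(3) for j in range(3))

uniform = [1/3, 1/3, 1/3]
for action in range(3):
    pure = [1.0 if a == action else 0.0 for a in range(3)]
    # Neither unilateral deviation improves on the equilibrium value of 0.
    assert expected_payoff(pure, uniform) <= 1e-9
    assert expected_payoff(uniform, pure) >= -1e-9
```

In heads-up poker the same logic applies at vastly larger scale: a true equilibrium strategy cannot lose in expectation no matter what the opponent does, which is why approximating it was the research target.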
Pluribus, built by Noam Brown and colleagues at Facebook AI Research in collaboration with Carnegie Mellon, took a different route. Instead of trying to solve the game outright, it used a limited-lookahead search algorithm that planned a few moves ahead during each decision. The system computed its baseline strategy in 8 days using 12,400 core hours and ran on only 28 processor cores during live play. That is a fraction of the computing power required for comparable AI milestones in other games.
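The limited-lookahead idea can be illustrated on a toy perfect-information game (a subtraction game: take 1 or 2 stones, and taking the last stone wins). Pluribus's real search reasons over hidden cards and several opponent continuation strategies; this sketch only shows the core idea of planning a few moves ahead and falling back on a value estimate at the frontier instead of expanding the whole game:

```python
def legal_moves(n):
    # Take 1 or 2 stones, never more than remain.
    return [m for m in (1, 2) if m <= n]

def evaluate(n):
    # Terminal: the player to move has lost (no stones left to take).
    # Non-terminal frontier: a neutral placeholder estimate; a real
    # system would plug in a precomputed "blueprint" value here.
    return -1 if n == 0 else 0

def search(n, depth):
    """Negamax limited to `depth` plies; values are from the
    perspective of the player to move."""
    if n == 0 or depth == 0:
        return evaluate(n)
    return max(-search(n - m, depth - 1) for m in legal_moves(n))
```

With enough depth the search recovers the known pattern (piles divisible by 3 are losing for the player to move, e.g. `search(6, 10)` returns -1), while a too-shallow search like `search(5, 2)` just returns the neutral frontier estimate.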
In testing, Pluribus faced 15 professional players including Darren Elias, who holds the record for most World Poker Tour titles, and Chris Ferguson, winner of 6 World Series of Poker events. Over 10,000 hands, Pluribus averaged $5 in profit per hand and roughly $1,000 per hour. The results were published in Science in July 2019.
What the AI Did That Humans Did Not Expect
Pluribus displayed behaviors that surprised the professionals it played against. It bluffed more frequently than human players typically do in multiplayer formats. It used mixed strategies, meaning it would sometimes take different actions in identical situations to remain unpredictable. It also employed donk bets, in which a player who merely called on the previous betting round leads out into that round's aggressor on the next one. Donk betting is considered a weak or unorthodox play by many professionals, but Pluribus used it effectively as part of a balanced approach.
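Mixing takes only a few lines to sketch. The 70/30 split below is a hypothetical illustration, not a frequency taken from Pluribus:

```python
import random

# Hypothetical mixed strategy for one recurring situation: bet 70%,
# check 30%. Because the action is sampled fresh each time, identical
# situations produce different actions, and an observer cannot map
# the situation to a single predictable move.
MIXED_STRATEGY = {"bet": 0.7, "check": 0.3}

def act(strategy, rng=random):
    actions = list(strategy)
    return rng.choices(actions, weights=[strategy[a] for a in actions])[0]

random.seed(7)
sample = [act(MIXED_STRATEGY) for _ in range(10_000)]
print(sample.count("bet") / len(sample))  # close to 0.7
```

Over many hands the frequencies match the intended mix, but no single hand reveals which action is coming, which is exactly what makes the strategy hard to exploit.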
The system did not learn from observing human play. It trained entirely through self-play, running millions of simulated hands against copies of itself. This means its strategy was not derivative of human tendencies but emerged from mathematical optimization alone.
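The self-play loop can be sketched with regret matching, a classic building block of the counterfactual-regret family of methods these systems grew out of, shown here on rock-paper-scissors rather than poker. Two copies of the same learner play each other; each tracks how much better every action would have done against the opponent's current strategy, and the time-averaged strategy drifts toward the equilibrium (1/3 each) with no human data involved:

```python
ACTIONS = 3  # rock, paper, scissors
# Payoff for the player choosing the row action against the column action.
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]

def current_strategy(regrets):
    # Play actions in proportion to accumulated positive regret.
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    return [p / total for p in positive] if total > 0 else [1 / ACTIONS] * ACTIONS

def self_play(iterations):
    # Start slightly off-balance so the dynamics have something to correct.
    regrets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    strategy_sums = [[0.0] * ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strategies = [current_strategy(regrets[p]) for p in range(2)]
        for p in range(2):
            opp = strategies[1 - p]
            # Expected value of each action against the opponent's mix.
            evs = [sum(opp[o] * PAYOFF[a][o] for o in range(ACTIONS))
                   for a in range(ACTIONS)]
            baseline = sum(strategies[p][a] * evs[a] for a in range(ACTIONS))
            for a in range(ACTIONS):
                regrets[p][a] += evs[a] - baseline  # regret for not playing a
                strategy_sums[p][a] += strategies[p][a]
    total = sum(strategy_sums[0])
    return [s / total for s in strategy_sums[0]]  # time-averaged strategy

avg = self_play(20_000)
print([round(p, 3) for p in avg])  # each entry close to 0.333
```

The moment-to-moment strategies cycle, but the average converges toward equilibrium, which is the general mechanism behind training "against copies of itself": the poker systems apply the same principle to a game with hidden cards and an astronomically larger tree.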

Where the Technology Goes After Poker
The techniques behind Libratus and Pluribus extend beyond card games. Imperfect-information decision making applies to military strategy, financial trading, cybersecurity, and negotiation. Sandholm’s lab at Carnegie Mellon has applied related methods to organ exchange programs, where matching donors and recipients involves sequential decisions under uncertainty.
The U.S. Department of Defense funded further development of poker AI concepts for strategic applications. The core idea is transferable: any scenario where participants make decisions without complete knowledge of what others know or intend can benefit from the same algorithmic frameworks. Poker was the proving ground because it offered a measurable, competitive environment with well-defined rules and skilled human opponents to benchmark against.
The Gap Between AI and Human Decision-Making
Poker AI does not think the way a human player does. A professional at the table reads body language, tracks timing tells, and adjusts based on personal history with specific opponents. Libratus and Pluribus had none of that. They operated entirely on mathematical models of the game state, ignoring any information that could not be expressed as cards, bets, and positions.
This constraint made the results more striking. The AI won without access to soft information that human players consider essential. The victory suggests that the mathematical structure of the game contains more exploitable information than most professionals realized, and that the physical and psychological dimensions of poker, while real, are secondary to sound probabilistic reasoning in heads-up and short-handed formats.
