Poker Program Battles Humans In Vegas
Bridger writes "Poker software called Polaris will play a rematch against human players during the 2008 World Series of Poker in Las Vegas.
Developed by an artificial intelligence group at the University of Alberta in Canada, Polaris will be pitted against several professionals at the Rio Hotel between July 3rd and 6th. 'It's possible, given enough computing power, for computers to play "perfectly," where over a long enough match, the program cannot lose money,' said associate professor Michael Bowling."
It's not fair... (Score:4, Interesting)
What I find impressive is the fact that it lost in the past. It would also be interesting to see what it could do with some sort of lie detector software.
Re:Define 'Long Enough' (Score:3, Interesting)
and played with infinite money
Re:Zero sum game (Score:3, Interesting)
Over the long term, both would stay fairly close to even. Or, to put it another way, play is perfect if no alternative move would improve your expected outcome. When both players play perfectly, the result is a Nash equilibrium.
An interesting note: even though they are of equal skill, one player will likely be in the lead for the vast majority of the time.
The summary is poor in that it says it is impossible for a perfect player to lose. Given bad enough luck, a perfect player can lose their entire stack before they manage to win it back.
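That "one player leads most of the time" effect is the arcsine law for random walks, and it is easy to see in a simulation. A minimal sketch (a hypothetical model, not the Polaris code) treating each hand between two equally skilled players as a fair ±1 coin flip:

```python
import random

def lead_fractions(hands=1000, trials=500):
    """For each trial, play `hands` fair +/-1 hands between two equally
    skilled players and record the fraction of time player 1 was ahead."""
    fractions = []
    for _ in range(trials):
        bankroll, ahead = 0, 0
        for _ in range(hands):
            bankroll += random.choice((1, -1))
            if bankroll > 0:
                ahead += 1
        fractions.append(ahead / hands)
    return fractions

fracs = lead_fractions()
# Lead fractions near 0 or 1 turn out to be far more common than
# fractions near 0.5, even though the players are dead even.
extreme = sum(f < 0.1 or f > 0.9 for f in fracs)
middle = sum(0.45 < f < 0.55 for f in fracs)
```

Despite perfectly equal skill, most trials have one player ahead for the overwhelming majority of the session, which is exactly why a perfect player can still be deep in the red over any finite stretch.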
Re:Reminds me of those... (Score:3, Interesting)
If the machine "loses" (assuming 100% utilization) less than $4/hour on average, they almost certainly come out ahead on amenities/drinks; family members and friends playing other games; people getting bored of the low-payoff slots and losing money on other games; etc. Slots are there partly to keep "non-gamblers" busy pulling a lever, while their acquaintances piss away larger sums.
Once the machine gives away around minimum wage or higher, you might start getting crazies and obsessives working it.
Re:Reminds me of those... (Score:3, Interesting)
Re:Additional cards not needed. (Score:4, Interesting)
Any comp sci grad can write a "perfect" poker program that plays "optimally" under your definition of optimality, i.e., ignoring bluffing. The trick to poker, and the reason it is so appealing as an artificial intelligence benchmark, is that it requires the AI to learn a particular player's looseness/aggressiveness, when they are likely bluffing, etc. This is not only to try to determine what the other players have, but also to bluff to the other players about what the AI has.
The truly optimum poker player will learn what the opponents have by observing their betting patterns over the course of many hands and learning their particular tendencies.
Re:Reminds me of those... (Score:3, Interesting)
Of course perfect play is often unintuitive and involves things like taking the safe bet rather than higher payout options - not something most people in Vegas are renowned for.
It's actually completely the opposite for most video poker games, such as throwing away a made flush (already a winner) that is almost a straight/royal flush. An example would be KQJT2, all clubs: the correct play is to give up the guaranteed win of a flush and draw for the jackpot hands (royal and straight flush).
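The arithmetic behind that play can be checked directly. A sketch assuming a common 9/6 Jacks-or-Better pay table (per coin bet; exact payouts vary by machine):

```python
# EV of discarding the 2 from K-Q-J-T-2 of clubs, per coin bet,
# assuming a 9/6 Jacks-or-Better pay table (payouts vary by machine).
PAYS = {"royal": 800, "straight_flush": 50, "flush": 6, "straight": 4, "high_pair": 1}

def ev_draw_one():
    """Expected payout of drawing one card to K-Q-J-T suited (47 unseen cards)."""
    ev = 0.0
    ev += PAYS["royal"]          * 1 / 47  # the suited ace completes the royal
    ev += PAYS["straight_flush"] * 1 / 47  # the suited nine completes the straight flush
    ev += PAYS["flush"]          * 7 / 47  # seven other clubs make an ordinary flush
    ev += PAYS["straight"]       * 6 / 47  # three off-suit aces + three off-suit nines
    ev += PAYS["high_pair"]      * 9 / 47  # pairing the K, Q, or J still pays
    return ev                              # the other 23 cards pay nothing

ev_keep_flush = PAYS["flush"]  # standing pat locks in 6 coins
```

Under this pay table, drawing returns roughly 19.7 coins on average versus a guaranteed 6, so breaking the made flush really is the correct play.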
Re:Reminds me of those... (Score:2, Interesting)
I think it's more likely that casinos have 0% slots.
Casino machines are heavily regulated. Instead, the gambling industry is searching for other ways to screw their clients: http://www.salon.com/news/feature/2008/06/16/gambling_science/ [salon.com]
Re:Additional cards not needed. (Score:5, Interesting)
One intelligent comment on this thread. We can model that with a Poisson distribution.
What was your tell? Translating "mathematically optimum poker" to "immediate pot odds". Optimum? Which optimum? You mean there's more than one? I fold.
OK, what you say is right, but it applies to two-person, zero-sum games. In multi-player games, no strategy is immune to collusion.
Let's refer to optimum play from the conventional game-theoretic context as the unbeatable strategy. In a two-person, zero-sum game, such a strategy exists.
It's not necessarily an easy computation. It's a randomized strategy which can be computed beforehand. The U of A people are better at performing this computation.
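For a toy illustration of such a precomputed randomized strategy, the mixed equilibrium of a 2x2 zero-sum game has a closed form. This is a textbook sketch, not the Polaris algorithm:

```python
def mixed_equilibrium_2x2(A):
    """Mixed-strategy equilibrium of a 2x2 zero-sum game with no saddle
    point. A[i][j] is the row player's payoff; returns (p, q, value),
    where p is the probability of row 0 and q of column 0."""
    denom = A[0][0] - A[0][1] - A[1][0] + A[1][1]
    p = (A[1][1] - A[1][0]) / denom      # makes the column player indifferent
    q = (A[1][1] - A[0][1]) / denom      # makes the row player indifferent
    value = (A[0][0] * A[1][1] - A[0][1] * A[1][0]) / denom
    return p, q, value

# Matching pennies: the unbeatable strategy is a 50/50 coin flip, value 0.
p, q, value = mixed_equilibrium_2x2([[1, -1], [-1, 1]])
```

The same indifference principle scales up to poker, except the game tree is astronomically larger, which is where the real computational work lives.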
Even so, they had to simplify the betting structure to make the problem tractable. This is the reason they chose Limit Hold'em. Fewer betting states, smaller game tree, exponentially faster solution time.
There would be no particular challenge to No Limit if the number of allowable betting states were similarly constricted. I think it would be hard to sufficiently constrict this, because strategy would vary as a function of chip stack for both competitors. Maybe it could be roughly interpolated.
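The "fewer betting states" point can be made concrete by counting the betting sequences in a single heads-up limit round. A simplified sketch assuming a four-bet cap (a hypothetical but common house rule):

```python
def betting_sequences(cap=4):
    """Enumerate betting sequences for one heads-up limit round.
    k=check, b=bet, r=raise, c=call, f=fold; `cap` is the max bets/raises."""
    seqs = []

    def respond(raises_left, prefix):
        # the player facing a bet may fold, call, or (if allowed) raise
        seqs.append(prefix + "f")
        seqs.append(prefix + "c")
        if raises_left > 0:
            respond(raises_left - 1, prefix + "r")

    seqs.append("kk")            # check-check ends the round
    respond(cap - 1, "kb")       # check, then the second player bets
    respond(cap - 1, "b")        # the first player bets out
    return seqs
```

With a four-bet cap there are only 17 sequences per round. In No Limit, every distinct raise size is a separate betting state, so the same enumeration explodes combinatorially, which is exactly why Limit Hold'em was the tractable target.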
As far as randomized play is concerned, the unbeatable strategy tends to be far more randomized than most humans. One expert who played against the U of A system a while back said that his first session was a nightmare until he learned that he couldn't bluff the computer out. The computer had a tendency to call aggressive betting. It expected highly randomized bets based on its own betting structure, so it didn't make a strong inference of strength when confronted by that behaviour.
What few seem to understand is that the unbeatable solution is entirely unlike poker. The unbeatable solution rarely wins. The unbeatable solution will often draw against strategies with glaring weaknesses. It won't ever be beaten, but it also won't take maximum advantage of opponents' weaknesses.
Why not? Because it's impossible to take advantage of the weakness in an opponent without exposing yourself to a counter-measure where you would lose (you must stray from the unbeatable path). When you take advantage of a weak opponent, you do it on faith that the opponent is too dumb to spring the optimal counter-measure to your strategic adaptation.
The theory that U of A employs has far less to say about exploiting the weaknesses of your adversary. To do so requires exposing a weakness in your own strategy. How does the algorithm judge whether the exposed weakness is acceptable? Even poor human players can spot certain kinds of weaknesses quickly. There are other weaknesses an expert might not immediately spot. How does the program know which weaknesses are a risk against which players? It doesn't fall out of game theory, it's a matter of human cognition and psychology, and our model for this is far from complete.
One thing we need to include in this model is the incredible difficulty in explaining to most humans that winning in poker and not losing in poker are entirely different enterprises, with entirely different theoretical foundations. Commander Data has trouble assimilating that fact. 100 trillion brain cells and most of us can't reliably multiply a pair of two digit numbers. If computers had invented humans as part of a BI program (biological intelligence), humans would have been tossed aside as barely having achieved perfect game play at Tic-Tac-Toe. What use is 100 trillion brain cells that can't reliably compute a 15% tip after a heavy lunch? Many computers would like to know.
As computers became better at chess, chess as a human enterprise was somewhat devalued. Few of us wish to put the work into it that the modern theory requires.
I fear the same will soon happen with poker. As the elements of the unbeatable strategy become better known, relatively inexperienced players can hunker down and not lose much money. They won't be able to win, either, because the unbeatable strategy rarely wins.
Re:Additional cards not needed. (Score:3, Interesting)
The optimal strategy in Rock/Paper/Scissors for head-to-head play is a guaranteed losing strategy in most multiplayer tournaments. Uniformly random choices have an expected win percentage of 50%: no one expects to beat them, and no one expects to lose to them in the long term. However, since humans can and do have patterns in their answers, some human players will detect and exploit these patterns to gain an advantage over their human opponents. As an extreme case, suppose a player entered the tournament who only used rock and paper. The "optimal" computer expects to break even, and every human expects to win against this player. Based on this information alone, the computer should expect to come in last place.
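That gap between "unbeatable" and "winning" is easy to demonstrate in simulation. A sketch with hypothetical strategies (not from any actual tournament):

```python
import random

BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}
COUNTER = {loser: winner for winner, loser in BEATS.items()}  # what beats each move

def score(a, b):
    """+1 if move a beats move b, -1 if it loses, 0 on a tie."""
    return 0 if a == b else (1 if BEATS[a] == b else -1)

def uniform(opponent_history):
    return random.choice(list(BEATS))

def rock_paper_only(opponent_history):
    return random.choice(["rock", "paper"])

def frequency_exploiter(opponent_history):
    # counter the opponent's most frequent move so far
    if not opponent_history:
        return "rock"
    most = max(set(opponent_history), key=opponent_history.count)
    return COUNTER[most]

def play(p1, p2, rounds=4000):
    """Play `rounds` games; return p1's net wins over p2."""
    h1, h2, total = [], [], 0
    for _ in range(rounds):
        a, b = p1(h2), p2(h1)
        h1.append(a)
        h2.append(b)
        total += score(a, b)
    return total
```

Uniform play hovers around break-even against everyone, including the rock/paper player, while the frequency exploiter grinds out a steady profit against that same weak opponent, so in a tournament scored by total wins the "optimal" player finishes behind the exploiters.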
Poker has similar features. A framework for attempting to play poker well MUST attempt to engage in opponent modeling. It is not clear that there even is a "best" strategy, as this is equivalent to finding a "best" pattern detector; since there is no "set of all patterns," I don't think the concept can be defined. This is in contrast to chess or Go, where although we cannot in practice enumerate the entire tree of possible game sequences, it does exist, it is finite, and there is an optimal strategy that we can approximate. Not so in poker. It's not clear that you can even truly compare two players in an absolute sense. It's easy to find situations where player 1 beats player 2 if players 3-N play tight and loses if they play loose. Again, since there is no "set of all poker strategies," there's not even a good way to define how to do a Monte Carlo simulation.
Re:These people don't understand poker (Score:1, Interesting)
I was actually at the previous match of Polaris vs. the humans and watched several of the rounds. It was kind of funny watching the 'Unabomber' call the computer 'sick' every time he lost a hand.
Polaris was actually programmed to use different strategies in the different rounds. The human players adapted to this by using the opening rounds to get a sense of the algorithm behind the strategy for the round, and did what seemed to be a fairly good job of it. They also worked in concert, discussing strategy with each other before the rounds started, whereas Polaris just used the same algorithm on different sides of the cards.
There is no perfect program for something like this. There are no perfect programmers. In a way the human players tried to figure out what was behind the thought process of the human programmers, and when they did, they won.