

ChatGPT Loses in a Game of Chess Against Magnus Carlsen (time.com) 34
The world's best human chess player beat ChatGPT, reports Time magazine. Magnus Carlsen posted on X.com earlier this month that "I sometimes get bored while travelling," and shared screenshots of his conversations with ChatGPT after he beat the AI chatbot "without losing a single piece."
ChatGPT lost all its pawns, screenshots the Norwegian grandmaster shared on X on July 10 showed. ChatGPT resigned the match... "That was methodical, clean, and sharp. Well played!" ChatGPT said to him, according to the screenshots Carlsen posted.
Carlsen told the AI bot that he thought it "played really well in the opening," but ultimately "failed to follow it up correctly." He went on to ask ChatGPT for feedback on his performance. "Your play showed several strong traits," ChatGPT told him...
About a week after Carlsen posted that he beat ChatGPT in the online chess match, he lost the Freestyle Chess Grand Slam Tour in Las Vegas to teenage Indian grandmaster Rameshbabu Praggnanandhaa.
Umm... (Score:4, Insightful)
Re: (Score:2)
LLMs aren't chess playing computers. This is a surprise to anyone?
And specialized chess software is "playing"? Like a human it's analyzing a series of moves and countermoves, perhaps a longer series than a human, but is it "instinctively" pruning those paths to explore like a human? Probably not with the old-school programs that were more brute force with limited pruning, maybe more so with ML-based series pruning?
Re: (Score:2)
Yeah, that's not how LLMs work.
Something like Stockfish absolutely does do something like that. Stockfish is not an LLM, its entire theory of operation is vastly different to how LLMs work.
For the most part LLMs play worse chess than children.
Billions of 1 move advice (Score:2)
We cannot compare a general-purpose AI, which is essentially a one-move "advice" tool that does not look ahead 10 or 20 moves, to a dedicated chess-playing program that has been trained on long chains of chess moves.
Noted: Tree pruning, min-max algorithm, etc., etc.
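For anyone who hasn't met the terms noted above, here is a minimal sketch of minimax search with alpha-beta pruning. It runs on a tiny made-up game tree rather than real chess positions; the tree values, the `alphabeta` function name, and the `stats` counter are all illustrative assumptions, not how Stockfish or any particular engine is implemented.

```python
from typing import List, Optional, Union

# A node is either a leaf score (int) or a list of child nodes.
Tree = Union[int, List["Tree"]]


def alphabeta(node: Tree, maximizing: bool,
              alpha: float = float("-inf"), beta: float = float("inf"),
              stats: Optional[dict] = None) -> float:
    """Return the minimax value of `node`, skipping branches that cannot matter."""
    if isinstance(node, int):
        # Leaf: a static evaluation of the position (hypothetical numbers here).
        if stats is not None:
            stats["leaves"] += 1
        return node

    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, stats))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizing opponent would never allow this line
        return best

    best = float("inf")
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, stats))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizer already has a better option elsewhere
    return best


# Toy tree: the root is our move, each inner list is the opponent's replies.
TREE: Tree = [[3, 5], [2, 9], [0, 1]]
stats = {"leaves": 0}
value = alphabeta(TREE, True, stats=stats)
print(value, stats["leaves"])
```

On this toy tree the search returns 3 after evaluating 4 of the 6 leaves: once the first branch guarantees a 3, the second and third branches are abandoned as soon as one reply scores lower. That look-ahead over explicit move trees is the part a next-token predictor has no equivalent of.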
Re:Umm... (Score:4, Insightful)
Re: (Score:2)
The interesting part of this is how it played, why it lost and how that might be fixed.
1. In all these kinds of examples they use models bad at this type of stuff. Here, they used ChatGPT-4o, which is tuned for speed and not for reasoning. So the obvious 'improvement' here would be to use o3 or any of the other competing 'reasoning' models designed for these kinds of things.
2. These examples 'play' chess in the form of a textual representation of a sequence of moves, not 2D representations of the board. Tha
Re: (Score:2)
Why would anyone expect a linguistic parlor trick to be good at chess?
So ChatGPT is a magnificent cut-and-paste machine? (Score:3)
Carlsen told the AI bot that he thought it "played really well in the opening," but ultimately "failed to follow it up correctly."
So ChatGPT is a magnificent cut and paste machine? For chess, its training probably included discussions of opening moves. For coding, it includes discussions on algorithms. No real reasoning going on here. Just pattern matching to training materials. Which is useful. As a human coder, I sometimes look up algorithms in a reference book or online to look at a sample implementation. And a chess enthusiast friend reads books and articles on opening moves.
Re: (Score:2)
There are dedicated chess engines that are a lot stronger than LLM chatbots. That being said, an LLM chatbot should be able to instantiate a chess engine and have it make the actual moves.
Re: (Score:2)
There are dedicated chess engines that are a lot stronger than LLM chatbots. That being said, an LLM chatbot should be able to instantiate a chess engine and have it make the actual moves.
Yes, but are they really "playing"? Like a human they are analyzing a series of moves and countermoves, perhaps a longer series than a human, but is it "instinctively" pruning those paths to explore like a human? Probably not with the old-school programs that were more brute force with limited pruning, maybe more so with ML-based series pruning?
Re: So ChatGPT is a magnificent cut-and-paste mach (Score:2)
Yes, but are they really "playing"?
Yes, and prepare to have your mind blown: planes can fly.
Re: (Score:2)
Yes, but are they really "playing"?
Yes, and prepare to have your mind blown: planes can fly.
I can strap wings onto a brick and drop it from 10K ft. It moves through the air, but it's not "flying" in any reasoned, controlled-flight sense.
Re: (Score:2)
Re: (Score:2)
Yes, but are they really "playing"? Like a human they are analyzing a series of moves and countermoves, perhaps a longer series than a human, but is it "instinctively" pruning those paths to explore like a human? Probably not with the old-school programs that were more brute force with limited pruning, maybe more so with ML-based series pruning?
You do not understand how LLMs work. It's not playing or analyzing anything at all. It's a glorified random number generator that is literally predicting characters, in this case chess move sequences. It's not judging those sequences by rules, or using lookup tables or probability matrices for movement. It's literally predicting, character by character, what it's been trained on.
The fact that it plays a 'real' game and tells a master it played a good game is meaningless - it would tell a child its ideas abo
Re: (Score:2)
Yes, but are they really "playing"? Like a human they are analyzing a series of moves and countermoves, perhaps a longer series than a human, but is it "instinctively" pruning those paths to explore like a human? Probably not with the old-school programs that were more brute force with limited pruning, maybe more so with ML-based series pruning?
You do not understand how LLMs work. It's not playing or analyzing anything at all.
Guess again. From my first post in this thread: "So ChatGPT is a magnificent cut and paste machine? For chess, its training probably included discussions of opening moves. For coding, it includes discussions on algorithms. No real reasoning going on here. Just pattern matching to training materials."
A thread which is titled "So ChatGPT is a magnificent cut-and-paste machine"
Re: (Score:2)
Re: (Score:2)
William Shakespeare vs Stockfish (Score:3, Insightful)
Re: (Score:2)
In other news, William Shakespeare won in a language fluency, comprehension and poetry composition contest against Stockfish.
You're somehow overlooking how impressive an accomplishment that is, for a guy that's been dead 400 years...
Re: (Score:2)
Re: (Score:2)
I did get your point, and do agree with you. But I couldn't resist the joke!
One other thing this illustrates, at least in my opinion - companies like OpenAI and Anthropic have done a remarkable PR job, convincing people that their products are significantly more general-purpose (and significantly more advanced) than they really are. The general public - represented by Magnus Carlsen, in this instance - basically sees them as AGIs, rather than the hallucinating-and-still-simplistic tools they actually are.
Re: (Score:2)
True. I should probably have used someone contemporary.
Perhaps not. Quite a bit of what we attribute to the Bard didn't come from him. Romeo and Juliet for example is a regurgitation of an Italian work from thirty years prior.
ChatGPT is actually terrible at chess (Score:1)
This illustrates why AI isn't a doomsday threat (Score:3)
ChatGPT and its siblings *seem* to be amazingly good at everything. But when it's necessary to do anything that requires a specialized skill, you need specialized AI.
For example, LLMs can interact with patients in a hospital, but it takes specialized AI to properly read an X-ray or an MRI. An LLM can tell you what medications are commonly used to treat specific illnesses, but if you blindly follow the advice, you're likely to miss important clues that a doctor wouldn't miss.
The LLM appears to understand how chess is played, but when it gets beyond the opening gambits that are well-documented, it falls apart.
In any kind of business, LLMs can provide output that appears sound but falls apart when you get into the weeds. We're going to need people to do the heavy lifting for some time to come. The death of white-collar employment is greatly exaggerated.
Super Excited (Score:3)
Re: (Score:2)
Posting this for the lulz.
https://www.businessinsider.co... [businessinsider.com]
asking ChatGPT for critique (Score:2)
It's a bit like asking the dumbest kid in class for his notes.