ChatGPT Loses in a Game of Chess Against Magnus Carlsen (time.com)

The world's best human chess player beat ChatGPT, reports Time magazine. Magnus Carlsen posted on X.com earlier this month that "I sometimes get bored while travelling," and shared screenshots of his conversations with ChatGPT after he beat the AI chatbot "without losing a single piece." ChatGPT lost all its pawns, screenshots the Norwegian grandmaster shared on X on July 10 showed. ChatGPT resigned the match... "That was methodical, clean, and sharp. Well played!" ChatGPT said to him, according to the screenshots Carlsen posted.

Carlsen told the AI bot that he thought it "played really well in the opening," but ultimately "failed to follow it up correctly." He went on to ask ChatGPT for feedback on his performance. "Your play showed several strong traits," ChatGPT told him...

About a week after Carlsen posted that he beat ChatGPT in the online chess match, he lost the Freestyle Chess Grand Slam Tour in Las Vegas to teenage Indian grandmaster Rameshbabu Praggnanandhaa.

  • Umm... (Score:4, Insightful)

    by Valgrus Thunderaxe ( 8769977 ) on Saturday July 26, 2025 @03:37PM (#65547304)
    LLMs aren't chess playing computers. This is a surprise to anyone?
    • by drnb ( 2434720 )

      LLMs aren't chess playing computers. This is a surprise to anyone?

      And specialized chess software is "playing"? Like a human it's analyzing a series of moves and countermoves, perhaps a longer series than a human, but is it "instinctively" pruning those paths to explore like a human? Probably not with the old school programs that were more brute force with limited pruning, maybe more so with ML based series pruning?

      • Yeah, that's not how LLMs work.

        Something like Stockfish absolutely does do something like that. Stockfish is not an LLM; its entire theory of operation is vastly different from how LLMs work.

        For the most part LLMs play worse chess than children.

        • We cannot compare an AI that is essentially a one-move "advice" tool, which does not look ahead 10 or 20 moves, to a dedicated chess-playing program that has been trained on long chains of chess moves.

          Noted: Tree pruning, min-max algorithm, etc., etc.
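
For readers unfamiliar with the terms in the note above, here is a minimal sketch of minimax search with alpha-beta pruning. It runs on a toy game tree of nested lists, not on chess positions; the tree, the `alphabeta` name, and the `visited` parameter are all illustrative assumptions, and real engines add far more (move ordering, evaluation functions, transposition tables).

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Return the minimax value of `node`, skipping branches that cannot matter.

    A node is either a number (a leaf score) or a list of child nodes.
    `visited` optionally collects the leaves that were actually evaluated.
    """
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizing parent already has a better option
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizing parent already has a better option
    return best

# Toy tree (an assumption for illustration): three replies, two leaves each.
tree = [[3, 5], [2, 9], [0, 1]]
seen = []
value = alphabeta(tree, True, visited=seen)
print(value, seen)
```

In this toy tree the search settles on the first branch after evaluating four of the six leaves; the leaves 9 and 1 are never looked at because their branches are already known to be worse.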

    • Re:Umm... (Score:4, Insightful)

      by LainTouko ( 926420 ) on Saturday July 26, 2025 @06:50PM (#65547594)
      It's unlikely to be a surprise to anyone here. To someone not particularly technical who has only been listening to LLM marketing, it's likely to be quite a surprise.
      • The interesting part of this is how it played, why it lost and how that might be fixed.

        1. In all these kinds of examples they use models bad at this type of stuff. Here, they used ChatGPT-4o, which is tuned for speed and not for reasoning. So the obvious 'improvement' here would be to use o3 or any of the other competing 'reasoning' models designed for these kinds of things.
        2. These examples 'play' chess in the form of a textual representation of a sequence of moves, not 2D representations of the board. Tha

        • 2. These examples 'play' chess in the form of a textual representation of a sequence of moves, not 2D representations of the board. That means that the AI needs to 'mentally' replay the moves to determine the state of the board. It's like asking it to do 20 consecutive math operations on a matrix of 8x8 and then asking it to come up with an operation that leads to a specific type of state of the matrix.

          What you are describing is a good way to play blindfolded chess. Every time someone makes a move, repeat in your mind all previous moves from the beginning until the present. I've taught several people to play blindfolded chess this way (and also learned it myself, from George Koltanowski).
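
The "replay every move from the beginning" idea discussed above can be sketched in a few lines. This is only an illustration under loud assumptions: moves are given as coordinate pairs such as `e2e4` (not standard algebraic notation), and the hypothetical `replay` helper ignores legality checks, castling, en passant, and promotion.

```python
def starting_board():
    """Return an 8x8 list of lists; row 0 is rank 8, '.' is an empty square."""
    back = "rnbqkbnr"
    return [
        list(back),
        list("p" * 8),
        *[list("." * 8) for _ in range(4)],
        list("P" * 8),
        list(back.upper()),
    ]

def square(name):
    """Convert a square name such as 'e2' to (row, column) indices."""
    col = ord(name[0]) - ord("a")
    row = 8 - int(name[1])
    return row, col

def replay(moves):
    """Rebuild the position by applying every move from the start, in order."""
    board = starting_board()
    for move in moves:
        (r1, c1), (r2, c2) = square(move[:2]), square(move[2:4])
        board[r2][c2] = board[r1][c1]  # place the piece on its destination
        board[r1][c1] = "."            # vacate the origin square
    return board

moves = ["e2e4", "e7e5", "g1f3", "b8c6", "f1c4"]
board = replay(moves)
print("\n".join("".join(row) for row in board))
```

Each move is one small update to the 8x8 grid, so a single slip anywhere in the sequence leaves every later position wrong, which is the difficulty the parent comment describes.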

    • by Ossifer ( 703813 )

      Why would anyone expect a linguistic parlor trick to be good at chess?

    • Exactly, it always loses at Rock, Paper, Scissors to me, no fingers. :-)

  • by drnb ( 2434720 ) on Saturday July 26, 2025 @03:43PM (#65547310)

    Carlsen told the AI bot that he thought it "played really well in the opening," but ultimately "failed to follow it up correctly."

    So ChatGPT is a magnificent cut and paste machine? For chess, its training probably included discussions of opening moves. For coding, it includes discussions on algorithms. No real reasoning going on here. Just pattern matching to training materials. Which is useful. As a human coder I sometimes look up algorithms in a reference book or online to look at a sample implementation. And a chess enthusiast friend reads books and articles on opening moves.

    • There are dedicated chess engines that are a lot stronger than LLM chatbots. That being said, an LLM chatbot should be able to instantiate a chess engine and have it make the actual moves.
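
A rough sketch of what "have the engine make the actual moves" might look like: a front end that hands anything resembling a move to an engine and leaves everything else to the chat model. Every name here (`stub_engine`, `chat_model`, `respond`) is hypothetical, and the engine is a canned stub, not a call into Stockfish or any real product's plug-in API.

```python
import re

# Matches coordinate-style moves such as "e2e4" (an assumed input format).
MOVE_PATTERN = re.compile(r"\b[a-h][1-8][a-h][1-8]\b")

def stub_engine(moves):
    """Hypothetical stand-in for a real chess engine; returns a canned reply."""
    return "e7e5" if moves == ["e2e4"] else "resign"

def chat_model(message):
    """Hypothetical stand-in for the language model's free-text answer."""
    return "I can talk about that: " + message

def respond(message, engine=stub_engine):
    """Delegate to the engine when the message contains moves, else chat."""
    moves = MOVE_PATTERN.findall(message)
    if moves:
        return "Engine plays " + engine(moves)
    return chat_model(message)

print(respond("I play e2e4"))
print(respond("Tell me about the Sicilian"))
```

The point of the split is that the language model only has to recognize when a question is a chess move; choosing the reply is left to a component built for that job.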

      • by drnb ( 2434720 )

        There are dedicated chess engines that are a lot stronger than LLM chatbots. That being said, an LLM chatbot should be able to instantiate a chess engine and have it make the actual moves.

        Yes, but are they really "playing"? Like a human they are analyzing a series of moves and countermoves, perhaps a longer series than a human, but is it "instinctively" pruning those paths to explore like a human? Probably not with the old school programs that were more brute force with limited pruning, maybe more so with ML based series pruning?

        • Yes, but are they really "playing"?

          Yes, and prepare to have your mind blown: planes can fly.

          • by drnb ( 2434720 )

            Yes, but are they really "playing"?

            Yes, and prepare to have your mind blown: planes can fly.

            I can strap wings onto a brick and drop it from 10K ft. It moves through the air, but it's not "flying" in any reasoned, controlled-flight sense.

        • by PDXNerd ( 654900 )

          Yes, but are they really "playing"? Like a human they are analyzing a series of moves and countermoves, perhaps a longer series than a human, but is it "instinctively" pruning those paths to explore like a human? Probably not with the old school programs that were more brute force with limited pruning, maybe more so with ML based series pruning?

          You do not understand how LLMs work. It's not playing or analyzing anything at all. It's a glorified random number generator that is literally predicting characters, in this case chess move sequences. It's not judging those sequences by rules, or using lookup tables or probability matrices for movement. It's literally predicting, character by character, what it's been trained with.

          The fact that it plays a 'real' game and tells a master it played a good game is meaningless - it would tell a child its ideas abo

          • by drnb ( 2434720 )

            Yes, but are they really "playing"? Like a human they are analyzing a series of moves and countermoves, perhaps a longer series than a human, but is it "instinctively" pruning those paths to explore like a human? Probably not with the old school programs that were more brute force with limited pruning, maybe more so with ML based series pruning?

            You do not understand how LLMs work. It's not playing or analyzing anything at all.

            Guess again. From my first post in this thread: "So ChatGPT is a magnificent cut and paste machine? For chess, its training probably included discussions of opening moves. For coding, it includes discussions on algorithms. No real reasoning going on here. Just pattern matching to training materials."

            A thread which is titled "So ChatGPT is a magnificent cut-and-paste machine"

            • by PDXNerd ( 654900 )
              An LLM is not a "cut and paste machine" unless you're talking about your usage of it, and your discussion devolved into nonsense about an LLM pruning paths. If that's not what you meant, perhaps you should re-read the post *you* responded to. You said

              Probably not with the old school programs that were more brute force with limited pruning, maybe more so with ML based series pruning?

              Explain to me with your knowledge of machine learning, how does an ML model 'prune paths'? Or if you don't know, name one that does for chess that isn't a dedicated chess engine.

          • An AI is now more than a bare LLM. The 'deliberative' models now can and do think steps ahead. And an LLM-based chess engine can analyze a potential move for validity by examining the rules. But it's still grossly inefficient and not very effective compared to a special-purpose chess engine. The point of doing it is to find the minimal single type of "AI" that can do anything and everything well. And so far that goal is not met.
  • by VaccinesCauseAdults ( 7114361 ) on Saturday July 26, 2025 @04:01PM (#65547338)
    In other news, William Shakespeare won in a language fluency, comprehension and poetry composition contest against Stockfish.
    • In other news, William Shakespeare won in a language fluency, comprehension and poetry composition contest against Stockfish.

      You're somehow overlooking how impressive an accomplishment that is, for a guy that's been dead 400 years...

      • True. I should probably have used someone contemporary. But the point is it is utterly ridiculous talking about beating ChatGPT at chess. I thought even quite ordinary people could do that, let alone the world chess champion. LLMs and AIs can do many things but excelling at chess is not one of them. Unless it just does external calls to dedicated engines like Stockfish?
        • I did get your point, and do agree with you. But I couldn't resist the joke!

          One other thing this illustrates, at least in my opinion - companies like OpenAI and Anthropic have done a remarkable PR job, convincing people that their products are significantly more general-purpose (and significantly more advanced) than they really are. The general public - represented by Magnus Carlsen, in this instance - basically sees them as AGIs, rather than the hallucinating-and-still-simplistic tools they actually are.

          • True. I should probably have used someone contemporary.

            Perhaps not. Quite a bit of what we attribute to the Bard didn't come from him. Romeo and Juliet for example is a regurgitation of an Italian work from thirty years prior.

  • It can't even remember the moves played and frequently has to be corrected about where the pieces are. This headline implies that Carlsen did something other than being funny. He knows ChatGPT is not a chess computer. Proper chess software, even running on a shitty old CPU, could beat him easily.
  • ChatGPT and its siblings *seem* to be amazingly good at everything. But when it's necessary to do anything that requires a specialized skill, you need specialized AI.

    For example, LLMs can interact with patients in a hospital, but it takes specialized AI to properly read an X-ray or an MRI. An LLM can tell you what medications are commonly used to treat specific illnesses, but if you blindly follow the advice, you're likely to miss important clues that a doctor wouldn't miss.

    The LLM appears to understand how chess is played, but when it gets beyond the opening gambits that are well-documented, it falls apart.

    In any kind of business, LLMs can provide output that appears sound but falls apart when you get into the weeds. We're going to need people to do the heavy lifting for some time to come. The death of white-collar employment is greatly exaggerated.

    • They merely have to make a plug-in for a chess bot which takes over chess game questions. Sure, it will not be reasoning but operating another heuristic bot; but Turing's test is proving to not be such a bad one after all.

      Once you've trained the bot to fool a person for everything they ask it's on to the next person until only some experts are not fooled... There are only so many tests most people can dream up and you only need to cover that large problem space. So, if it fools most people, is it alive?

  • by Luscious868 ( 679143 ) on Saturday July 26, 2025 @07:43PM (#65547686)
    All three people who follow chess are super excited
  • It's a bit like asking the dumbest kid in class for his notes.
