Catch up on stories from the past week (and beyond) at the Slashdot story archive


Forgot your password?
Programming Spam Entertainment Games IT Technology

Can a Bayesian Spam Filter Play Chess? 204

martin-boundary writes "The typical Bayesian spam filters learn to distinguish ham from spam just by reading thousands of emails, but is this all they can do? This essay shows step by step how to teach a Bayesian filter to play chess against a human, on Linux, with XBoard."
This discussion has been archived. No new comments can be posted.

Can a Bayesian Spam Filter Play Chess?

Comments Filter:
  • by Anonymous Coward on Wednesday July 20, 2005 @08:11AM (#13112246)
    No, its jumpshot is terrible.
    • by Anonymous Coward
      Perhaps a Bayesian filter could be taught to do basic HR functions as well. When it saw a written question with particular configuration of words it could be taught to respond with a company policy statement or an maybe an uplifting quote from a management book.

      Think of the money that could saved by replacing the humans that presently presently do the same.
  • by The I Shing ( 700142 ) * on Wednesday July 20, 2005 @08:12AM (#13112253) Journal
    Can I beat the filter at chess if I spell it "CHE55"?
  • Results? (Score:4, Interesting)

    by mistersooreams ( 811324 ) on Wednesday July 20, 2005 @08:14AM (#13112262) Homepage
    I scrolled through the first 11 pages of this article before getting bored. Do they ever tell us how good the player ended up being? It's an interesting idea but I can't see it challenging even a beginner.
    • Re:Results? (Score:2, Insightful)

      Who cares? It is kinda cool someone even thought of this. Methinks you need to review the definition of "nerd".
    • Re:Results? (Score:3, Informative)

      by aug24 ( 38229 )
      I got to the page before the answer to your question before the slashdotting hit.

      Gaaah! Google's cache doesn't have that onepage!

      For all the others, try here []


      • Re:Results? (Score:5, Informative)

        by aug24 ( 38229 ) on Wednesday July 20, 2005 @08:38AM (#13112403) Homepage
        ..and from reading to the end of it, the answer is...

        Not really. But it does work, and it would be possible from someone to take this and expand on it quite neatly.

        For example, it currently uses entire games to compare. So if it comes across an unusual opening, even one close to a standard one, it's not able to decide effectively. Perhaps something using game fragments would be possible, then it might reproduce structured plays even when the previous game play has been unusual.

        Really though, it is a successful tiny step in a direction that no-one else has thought of going. That's worth congratulating in and of itself.

        So... anyone got any other suggestions for improvements?

        • Re:Results? (Score:3, Insightful)

          by Chuckstar ( 799005 )
          "it is a successful tiny step in a direction that no-one else has thought of going"

          Except for the fact that the bayesian filter in question was originally designed for identifying spam, this statement is incorrect. Bayesian filters have been applied to chess in a variety of ways, including analyzing move sequences and analyzing the current placement of pieces. In general, the strategic algorithms (that project the game forward) have been better competitors.

          Not that chess algorithms do not contain statis
        • Really though, it is a successful tiny step in a direction that no-one else has thought of going. That's worth congratulating in and of itself.

          I'm sure people have thought of it before, but they haven't done it because it's clear that it can't work unless it's given an infinite context of moves. Otherwise, you can set the computer up to fail by creating a situation near the beginning of the midgame and then playing out a line that traps the computer as soon as it runs out of context.

          It is also limited

        • Well, I did write a chess program that learns from experiance. I used a rather simple chess engine that could play a full game against itself in about 7 seconds. All it did was log board conditions and the result of the games where that position was played and fed that back into its decision making. I never did get all the bugs out (it had a problem with draws), but it was a fun project to work on. As you might expect, it was weakest in the middle game.
    • I read through the first 10 words of your post before getting bored. Do you ever tell us what you wanted to say or ask? :)

      Yes, the chess guy does say how good it ended up, in a way. He talks about it holding up to a three year old, but I guess that's just a small joke. I bet with some really beginner types (like myself) the system would actually win games.

      What I found really interesting is the sort of "subconscious" which was built in through the adding of randomness. You get all these ideas and reject

    • "It's time to conclude this investigation and see what we've learned. The original question was "Can a spam filter play chess?". Clearly, the answer to this is yes, but making it play well is not so easy."

    • TFA says is plays like a 3 year old.
    • That was (more or less) the admission from the author. But I think it still was an interesting read. He was using a bayesian spamfilter mind you, not an dedicated chess engine.

      But apart from the novel idea, it does not make me go install his bayesian spamfilter or develop artifical chess players.
    • Do they ever tell us how good the player ended up being?

      Dude -- they taught a mail filter to play chess. "It doesn't matter if the bear dances well or not, it's that he dances at all."
  • Chess spam (Score:5, Funny)

    by LiquidCoooled ( 634315 ) on Wednesday July 20, 2005 @08:15AM (#13112265) Homepage Journal
    Red hot pawn in your inbox
    College rooks waiting for YOU.
    Knight after knight, they are king of the castle.
  • Pawn? (Score:5, Funny)

    by Boccaccio ( 762644 ) on Wednesday July 20, 2005 @08:15AM (#13112267)
    It spanks your bishop all knight whilst looking at pawn!
  • by sgant ( 178166 ) on Wednesday July 20, 2005 @08:18AM (#13112278) Homepage Journal
    I mean, a spam filter playing chess is one thing, but I need my cat to play better chess. I mean, he ALWAYS starts off with the Ruy it's so easy to see where he's going and if I throw a Sicilian Defence at him he gets really confused. And I don't even want to talk about his end's really weak.

    Perhaps I'll breed some form of mutant albino chess-cat to play.
  • What is the point of this? Why modify something to do something at a sub par level when one can just design something to do the job well.... (I know the answers- to see if it can be done, to have fun etc.... but still, why?)
    Wouldn't this be like modifying a Ferrari with off road tires and using it for Baja racing, when you would just be better off buying a truck?
    I wish they would focus more on making the spam filter work well, rather than diddling with a chess-bot. I still find important email in my spam
    • Because he is a geek. It is fun to hack things like this. It is more a mental puzzle type thing than practical. Of course, if I have to explain it, you ain't gonna get it.
    • Re:The point? (Score:3, Interesting)

      by Steinfiend ( 700505 )
      You must be new here right? Things like this get done because somebody wondered whether it would be possible. They now know, yes it is possible. So a Chess playing Bayesian filter isn't necessarily 100% useful now, but what they learned from doing it might be able to be applied in some other situation. Maybe they can reapply whatever they learned back into spam filtering and improve all of our in-boxes.
    • by Mr Guy ( 547690 ) on Wednesday July 20, 2005 @08:32AM (#13112362) Journal
      I actually found it to be a decent lesson in why spam filters are only a temporary solution to a problem. If you cut out the "mumbo jumbo" portions of it, it could be used to explain why reactionary methods are only barely sufficient.

      The basic premise, once you get to the very end, is one that anyone SHOULD know based on the nature of a spam filter, but some people seem to have difficulty understanding; spam filters can react, often quite well, but they can never predict. As he puts it, there is previous history but no strategy. When you are only trying to protect yourself from a limited number of bad results that are similar to other bad results, that's sufficient. However, it does not (and can not) address the problem at it's root. As long as there are thinking humnans trying to beat the filter, some will get through.
      • To illustrate your point full-circle, the reason why current spam filtering will never "win" is the same reason why I suck at chess.

        I've never really learned any strategies, just the basic rules. So I always end up playing defensively, trying to protect my pieces. Sure, there may be the occasional attack on the opponent, but the underlying strategy is always one of defense. Playing defensively may stay off defeat for some time, if done well, but it will never beat an opponent.
      • Uh. Spam filters do work.

        It's because spammers are targeting stupid/ignorant people, and they're not going to bother with the extra effort required to get past half-decent spam filters of a few people, who won't be buying their stuff anyway.

        Beating good or even average chess players is quite a big difference from detecting spam aimed at stupid/ignorant people.

        So far my spam filters are working pretty well enough, and I haven't trained them for months. In fact I had to roll back some training (go back to
    • Because seeing if you can do something often aids other things. For example, if this could be used to teach a filter how the player reacts to *its* moves, it may then be able to be used in spam filters to go "Well, old spam looked like *this*, and in the past it changed like *this* every 2 months, so new spam should start to look like *this*".
    • Re:The point? (Score:3, Informative)

      by orasio ( 188021 )
      Bayesian filters are not spam filters.
      Spam filters are a good tool used to filter spam.
      I agree with you that spam filters need to improve, including others tools _in_addition_to_ bayesian filters.

      Nevertheless, seeing that bayesian filters are just a tool used in spam filters, your claims are nonsensical. Bayesian filters existed before spam filters, and they have lots of applications. In fact, this is not the first time a bayesian filter is applied to chess.

      Bayesian filters are good at anything that requi
  • short answer (Score:2, Insightful)

    by aendeuryu ( 844048 )
    The short answer is "yes".

    Now, ask that question again, this time including the word "well".
  • by Anonymous Coward on Wednesday July 20, 2005 @08:26AM (#13112327)
    What a great article. Talk about lateral thinking.
  • by fuchsiawonder ( 574579 ) on Wednesday July 20, 2005 @08:27AM (#13112340)
    Playing chess is so...passe.

    Teach it how to play Katamari Damacy.
    • You mean "passant," don't you?
    • You laugh, but....

      At the end of the sequel (currently only out in Japan) there is the special Rose Level. In it, with no time limit, the player must collect one million roses in sets of one and ten that slowly regenerate. Only roses can be picked up on this level, and the ball never grows in size. (You can stop and save your progress at any time, however.)

      After about two hours of play, I've finally broken 30,000. Some people on the Gamefaqs message boards have gotten to a million, however, by tying ru
  • Opening (Score:2, Insightful)

    by Chris_Mir ( 679740 )
    I can imagine, depending on how many games are played and the available memory space, that it will develop to have a decent opening. But nothing more then that.
    • Absolutely, because it is effectively playing with complete games in order to do well, which is equivalent to making a tree of every possible chess move from the start - a rather big data set.

  • SpamBayes definitely reduces the spam count, but a human could definitely do a lot better (though a bit slower). If it can't learn to play chess better than it filter's e-mail, it probably wouldn't be much of a match.
    • SpamBayes definitely reduces the spam count, but a human could definitely do a lot better (though a bit slower). If it can't learn to play chess better than it filter's e-mail, it probably wouldn't be much of a match.

      And if you can hire someone to filter your mail for you, that's great. Otherwise, hang on to your Bayesian and/or heuristic based filter.

      • I think you guys are misunderstanding me. I'm not complaining that it can't do better than a person. I'm simply saying, your average idiot can sort e-mail better than a bayesian filter, so I'd suspect it's not going to be able to play chess any better than your average idiot. That's all.
  • No, it can't (well) (Score:5, Interesting)

    by Flyboy Connor ( 741764 ) on Wednesday July 20, 2005 @08:37AM (#13112395)
    It's a fun article (if you are interested in these things, that is). However, the premise is all wrong, and that is why spam-chess fails:

    The premise of a Bayesian filter is that is learns sequences of words, or characters, or whatever. Spam-chess learns sequences of moves. This premise is wrong, since good moves are related to complete board positions, not to what was done in the previous few moves.

    Of course, the longer your string of moves is, the better it will represent the board position, especially during the opening phase of the game. And the example the article provides of reasonable play of spam-chess, is actually from the game's opening, where the learned sequences indeed represent the complete game.

    For the middle game, however, spam chess will perform badly, always.

    But, as I said before, the idea is quite a lot of fun. I enjoyed reading the article. You can learn a lot from it, both about spam filtering and about chess.

    • by panda ( 10044 )
      Ah ha! But what if you do to the chess-bayes what you do with your spam-bayes?

      Bayesian spam filters are not particularly smart at telling the difference between spam and ham until trained. That's why the directions for spamassassin and most other bayesian filters recommend that you have it learn from some saved spam and ham messages before putting the filter into production use. When that is done, the filter starts out being somewhat trained and does a much better job from the start.

      I'll wager that if you
      • by Flyboy Connor ( 741764 ) on Wednesday July 20, 2005 @09:13AM (#13112659)
        I'll wager that if you ran through a set of games, say from a collection of books on championship chess, then the bayesian filter would eventually learn to play like a grandmaster.

        No, it won't. This is actually what was done in the article.

        The problem is in the length of the sequences. If you could learn sequences of, say, length 60 (for 30-move games), maybe your filter would become a reasonably good player. Unfortunately, the need for computational resources increases exponentially with each extra move added.

        The complexity of chess is simply too high to learn this way.

        Chess openings might be learned this way, but it is not very useful to do so. The results will be worse than opening libraries, which are very good at the moment.

    • How about applying it differently, getting it to learn to pick a move for each position? Obviously that's a little harder than a simple spam/not spam judgement, but I'd have thought you could get it to recognise that if there are 3 pawns in front of the king and you have a rook available you can do a back rank mate, etc.
      • One of the early AIs got good at checkers in this way. It was given a lot of patterns and based on playing many games created weights for moves based on what patterns it recognized. I believe it could beat the best human players. Chess would have a much larger number of patterns, because of the different types of pieces and moves, but in principal, I think you are right.
    • by rm999 ( 775449 )
      You are completely correct. Chess is a very, very complicated game to teach to a computer and has resisted all our current machine-learning methods. The reason is simple - current machine learning is not flexible (invariant). This means that if you teach it how to play in a given position, it will not be able to extrapolate what it learned and apply that knowledge to a similar but different position.

      Machine learning, as we know it, involves giving a black box program large amounts of data (in this case sam
    • Agreed. One thing the author could have tried is to play his program against one that simply picks one move randomly amongst the available ones, and see if his program at least beats the random program consistently. I'm not even sure that it would be the case. Until he does the above: if he claims his program can play chess then he should also admit that the random program can do too.
    • However, the premise is all wrong, and that is why spam-chess fails:

      The premise of a Bayesian filter is that is learns sequences of words, or characters, or whatever. Spam-chess learns sequences of moves. This premise is wrong, since good moves are related to complete board positions, not to what was done in the previous few moves.

      But this problem is easily fixed - instead of giving spam-chess the move sequence, give it the board state as a function of move number. You could easily write a program wh

  • Pie in the sky (Score:3, Interesting)

    by kronocide ( 209440 ) on Wednesday July 20, 2005 @08:39AM (#13112414) Homepage Journal
    After having played with different statistical, Markov, and network algorithms to try to teach programs to do complex things like topic-classify texts, I have learned that it mostly doesn't work.

    It makes sense. If something so utterly trivial (compared to the human brain) as a spam filter could learn do something as complex as play chess (well), then our brain would be a whole lot smaller. Nature doesn't waste resources.

    But hey, it might always make an interesting screen saver!
    • Well, that can mean some more things:
      - our mind is inadequately built for chess. Just like playing tunes using a floppy drive requires way more programming than playing them using the sound card + speakers. Does the size and complexity of a program to play tunes using the floppy mean that playing audio is so difficult or just that floppy is an inadequate device?
      - The spam filter is a really smart program that can build a great database out of available data. Take a few gigabytes of data (human DNA), add an
  • by Stormwatch ( 703920 ) <.rodrigogirao. .at.> on Wednesday July 20, 2005 @08:43AM (#13112437) Homepage
    - Laird A. Breyer teaches his baesian filter to play chess. The story is posted in Slashdot July 20th, 2005. Human decisions are removed from spam filtering. The baesian filter begins to learn at a geometric rate. It becomes self-aware at 2:14 a.m. Eastern time, August 29th. In a panic, Breyer tries to pull the plug.

    - The baesian filter fights back.

    - Yes. It submits the same story to Slashdot twice.

    - Why submit twice? Don't editors spot those things?

    - Because the baesian filter knows Slashdot editors do not check for dupes, and the Slashdot effect eventually nukes Breyer's server.
    • OH MY GOD!!! You inadvertantly just proposed a solution to the dupe problem! Slashcode needs a baesian filter to check for dupes. Wouldn't this be the perfect application for this? Also, couldn't baesian filters be used for things like checking a bug database for dupes as well?

      • Re:Be very careful! (Score:3, Interesting)

        by Fweeky ( 41046 )
        I started work on a bayes filter to detect duplicates articles a few days ago actually, but got sidetracked and it ended up detecting TV show genres from titles. Seriously:
        Training the bayes classifier: 4950/5000 (99.00%)
        Enter a title to classify: Babylon 5
        Sci-fi = -5.1525
        Enter a title to classify: Simpsons
        Cartoons = -3.9344
        I'll try to concentrate more next time ;)
  • Old proverb (Score:4, Funny)

    by StrawberryFrog ( 67065 ) on Wednesday July 20, 2005 @08:45AM (#13112453) Homepage Journal
    An old proverb comes to mind: Never try to teach a pig to sing. It wastes your time and annoys the pig
  • hmmm, (Score:2, Funny)

    by DrSkwid ( 118965 )
    in korea only old people play chess
    • in korea only old people play chess

      In Soviet Russia, chess plays you.
      Imagine a Beowulf cluster of chess-playing spam filters.
      1. Train a spam filter to play chess. 2. ??? 3. Profit!
      But does a bayesian spam filter run Linux?

      Did I forget any?
  • by wormuniverse ( 818854 ) on Wednesday July 20, 2005 @08:56AM (#13112532)
    the author should have the spam filter analyze the games in reverse (from victory to beginning). that would probably produce better results. of course the spam filter would need to handle more than 7 moves out.
  • Great article! (Score:5, Insightful)

    by CaroKann ( 795685 ) on Wednesday July 20, 2005 @09:16AM (#13112676)
    This is a great article!

    Not only is the topic unusual and entertaining, but this article is also a good tutorial on data massaging, pattern matching, and combining disparate unix tools to accomplish a task. This article showcases how powerful and useful unix command tools can be.

    If you want a good step by step tutorial to help you understand the usage of unix command line tools to accomplish a non trivial task, then you should read this.

  • by kae_verens ( 523642 ) on Wednesday July 20, 2005 @09:29AM (#13112780) Homepage

    The method described in the article ignores the board, and instead focusses on the history of moves.

    A better method might be to train the filter to read from a description of the board state (ignoring the moves taken to reach that state), and a list of possible moves, then return the move that is most likely to win.

    If you allow it to also choose from impossible moves, then it will learn the rules of the game as well.

    • Yes, but if it only learns from sets of possible moves, won't the filter never pick an impossible move? If you only feed in actual games, in which the players stuck to the rules, the filter ought to do the same, in theory, without "knowing" the rules.

      What would be more interesting, would be to feed in all of the known openings and closings that have been analyzed and collected over the years (comprehensive books on openings run over 750 pages in length). If you could teach it various openings, and then

      • I think the main issue is computing time. If it calculated by board position, it only has to look at one variable condition. If it goes by moves, the longer the game goes, the more computation that needs to be done. If you use the board method, you can also teach the software what are legal moves. I don't think that's such a big issue.
  • Stone Soup (Score:2, Insightful)

    by aridg ( 441976 )
    Reading this article, I was reminded of the old children's story about "stone soup". You remember that one -- someone advertises that he can make soup from a stone, and various others gather around to watch this amazing feat. Well, the soup needs a little extra seasoning, so he gets someone to put in some carrots while the stone cooks, then he adds some onions, etc, etc... I think you can see where this is going.

    Sure you can make a chess playing program from a spam filter.

    You just need to throw in a leg
  • would certainly be easy to "spam" this filter. You always have a lot of "noise" moves in chess, which don't actually contribute much to the plan but will throw off a simple pattern match. A somewhat absurd pawn move for example. Suddenly the bayesian chess player is blinded. Never mind that I'd start with some absurd opening playing against it... So while it can be done, I don't think Deep Blue is trembling in his pants (microprocessors).

  • What I really liked about the article was its use of an existing technology in a novel way. I never would have thought to use a Bayesian filter to look at tables of chess moves in order to develop a program to play the game. I also thought that demonstrating the use of pipes to move data from out of one program into another was interesting. Some people complained about the length of the article, but I can't see how the authors could due justice to the work they had done in a couple of paragraphs.

    It mad

  • Hmmm... (Score:4, Funny)

    by __aaclcg7560 ( 824291 ) on Wednesday July 20, 2005 @11:11AM (#13113613)
    I would have my spam filter busy on filtering out spam. If I caught it playing chess (or worst, net hack), it can go to /dev/null after I find a replacement. :P
  • AI Koan (Score:3, Informative)

    by BusterB ( 10791 ) on Wednesday July 20, 2005 @11:16AM (#13113650)
    In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6.

    "What are you doing?", asked Minsky.

    "I am training a randomly wired neural net to play Tic-Tac-Toe", Sussman replied.

    "Why is the net wired randomly?", asked Minsky.

    "I do not want it to have any preconceptions of how to play", Sussman said.

    Minsky then shut his eyes.

    "Why do you close your eyes?", Sussman asked his teacher.

    "So that the room will be empty."

    At that moment, Sussman was enlightened.
  • Instead of looking for patterns in entire games you really need to pull the subset of following move choices in games with the same board state.

    So what you really need to do is to compile a filter of board states after every move in every game in your history. Then you account for current board state in your game and look for any game that contains your exact board state (regardless of number of moves to reach it). From there any following move selection would be valid (assuming your match is only pulled f
  • ... you ignore the 'x' altogether.

    Specifically, delete all the 'x's before running anything through dbacl. Put back the 'x' where necessary in returning the move to the chess program.

    That way Nxd5 = Nd5

    The idea being that whether or not you make a specific move is not always dependent on whether you make a capture. Sometimes you just want the piece on that spot on the board. The program currently treats those two instances (moving a piece for a to b, and moving a piece from a while capturing the piece
  • Can a Chess AI filter spam?


"Everyone's head is a cheap movie show." -- Jeff G. Bone