Can a Bayesian Spam Filter Play Chess? 204
martin-boundary writes "The typical Bayesian spam filters learn to distinguish ham from spam just by reading thousands of emails, but is this all they can do?
This essay shows step by step how to teach a Bayesian filter to play chess against a human, on Linux, with
XBoard."
But can a Bayesian filter play basketball? (Score:5, Funny)
Re:But can a Bayesian filter play basketball? (Score:2, Interesting)
Think of the money that could saved by replacing the humans that presently presently do the same.
I can beat that filter... (Score:5, Funny)
Re:I can beat that filter... (Score:5, Funny)
Re:I can beat that filter... (Score:2)
Results? (Score:4, Interesting)
Re:Results? (Score:2, Insightful)
Re:Results? (Score:3, Informative)
Gaaah! Google's cache doesn't have that onepage!
For all the others, try here [google.com]
J.
Re:Results? (Score:5, Informative)
Not really. But it does work, and it would be possible from someone to take this and expand on it quite neatly.
For example, it currently uses entire games to compare. So if it comes across an unusual opening, even one close to a standard one, it's not able to decide effectively. Perhaps something using game fragments would be possible, then it might reproduce structured plays even when the previous game play has been unusual.
Really though, it is a successful tiny step in a direction that no-one else has thought of going. That's worth congratulating in and of itself.
So... anyone got any other suggestions for improvements?
Justin.
Re:Results? (Score:3, Insightful)
Except for the fact that the bayesian filter in question was originally designed for identifying spam, this statement is incorrect. Bayesian filters have been applied to chess in a variety of ways, including analyzing move sequences and analyzing the current placement of pieces. In general, the strategic algorithms (that project the game forward) have been better competitors.
Not that chess algorithms do not contain statis
Re:Results? (Score:2)
I'm sure people have thought of it before, but they haven't done it because it's clear that it can't work unless it's given an infinite context of moves. Otherwise, you can set the computer up to fail by creating a situation near the beginning of the midgame and then playing out a line that traps the computer as soon as it runs out of context.
It is also limited
Re:Results? (Score:2)
Re:Results? (Score:2)
Obviously taking/checking will require different analyses, but then we are getting away from the simple bayesian approach.
J.
Re:Results? (Score:2)
I read through the first 10 words of your post before getting bored. Do you ever tell us what you wanted to say or ask? :)
Yes, the chess guy does say how good it ended up, in a way. He talks about it holding up to a three year old, but I guess that's just a small joke. I bet with some really beginner types (like myself) the system would actually win games.
What I found really interesting is the sort of "subconscious" which was built in through the adding of randomness. You get all these ideas and reject
Re:Results? (Score:2)
-Adam
Re:Results? (Score:2)
could maybe beat a 3 year old. (Score:2)
But apart from the novel idea, it does not make me go install his bayesian spamfilter or develop artifical chess players.
Re:Results? (Score:2)
Dude -- they taught a mail filter to play chess. "It doesn't matter if the bear dances well or not, it's that he dances at all."
Chess spam (Score:5, Funny)
College rooks waiting for YOU.
Knight after knight, they are king of the castle.
Re:Chess spam (Score:5, Funny)
Pawn? (Score:5, Funny)
Re:Pawn? (Score:4, Funny)
See? By making fun of other people's pronunciation of English, you can fit yet another chess piece in...
Re:Pawn? (Score:2)
Get back to me when I can teach my cat chess (Score:5, Funny)
Perhaps I'll breed some form of mutant albino chess-cat to play.
Re:Get back to me when I can teach my cat chess (Score:2, Insightful)
Re:Get back to me when I can teach my cat chess (Score:2)
The point? (Score:1, Insightful)
Wouldn't this be like modifying a Ferrari with off road tires and using it for Baja racing, when you would just be better off buying a truck?
I wish they would focus more on making the spam filter work well, rather than diddling with a chess-bot. I still find important email in my spam
Re:The point? (Score:2)
Re:The point? (Score:3, Interesting)
A decent lesson in why spam filters don't work (Score:5, Insightful)
The basic premise, once you get to the very end, is one that anyone SHOULD know based on the nature of a spam filter, but some people seem to have difficulty understanding; spam filters can react, often quite well, but they can never predict. As he puts it, there is previous history but no strategy. When you are only trying to protect yourself from a limited number of bad results that are similar to other bad results, that's sufficient. However, it does not (and can not) address the problem at it's root. As long as there are thinking humnans trying to beat the filter, some will get through.
Re:A decent lesson in why spam filters don't work (Score:3, Insightful)
I've never really learned any strategies, just the basic rules. So I always end up playing defensively, trying to protect my pieces. Sure, there may be the occasional attack on the opponent, but the underlying strategy is always one of defense. Playing defensively may stay off defeat for some time, if done well, but it will never beat an opponent.
Re:A decent lesson in why spam filters don't work (Score:2)
It's because spammers are targeting stupid/ignorant people, and they're not going to bother with the extra effort required to get past half-decent spam filters of a few people, who won't be buying their stuff anyway.
Beating good or even average chess players is quite a big difference from detecting spam aimed at stupid/ignorant people.
So far my spam filters are working pretty well enough, and I haven't trained them for months. In fact I had to roll back some training (go back to
Re:The point? (Score:2)
Re:The point? (Score:3, Informative)
Spam filters are a good tool used to filter spam.
I agree with you that spam filters need to improve, including others tools _in_addition_to_ bayesian filters.
Nevertheless, seeing that bayesian filters are just a tool used in spam filters, your claims are nonsensical. Bayesian filters existed before spam filters, and they have lots of applications. In fact, this is not the first time a bayesian filter is applied to chess.
Bayesian filters are good at anything that requi
short answer (Score:2, Insightful)
Now, ask that question again, this time including the word "well".
Re:short answer (Score:5, Funny)
Re:short answer (Score:2)
Finally - news for nerds! (Score:4, Insightful)
Get with the times. (Score:5, Funny)
Teach it how to play Katamari Damacy.
Re:Get with the times. (Score:3)
Re:Get with the times. (Score:2)
At the end of the sequel (currently only out in Japan) there is the special Rose Level. In it, with no time limit, the player must collect one million roses in sets of one and ten that slowly regenerate. Only roses can be picked up on this level, and the ball never grows in size. (You can stop and save your progress at any time, however.)
After about two hours of play, I've finally broken 30,000. Some people on the Gamefaqs message boards have gotten to a million, however, by tying ru
Opening (Score:2, Insightful)
Re:Opening (Score:2)
J>
Re:Opening (Score:2)
Can't be that good (Score:2)
Re:Can't be that good (Score:2)
And if you can hire someone to filter your mail for you, that's great. Otherwise, hang on to your Bayesian and/or heuristic based filter.
Re:Can't be that good (Score:2)
No, it can't (well) (Score:5, Interesting)
The premise of a Bayesian filter is that is learns sequences of words, or characters, or whatever. Spam-chess learns sequences of moves. This premise is wrong, since good moves are related to complete board positions, not to what was done in the previous few moves.
Of course, the longer your string of moves is, the better it will represent the board position, especially during the opening phase of the game. And the example the article provides of reasonable play of spam-chess, is actually from the game's opening, where the learned sequences indeed represent the complete game.
For the middle game, however, spam chess will perform badly, always.
But, as I said before, the idea is quite a lot of fun. I enjoyed reading the article. You can learn a lot from it, both about spam filtering and about chess.
Re:No, it can't (well) (Score:2, Redundant)
Bayesian spam filters are not particularly smart at telling the difference between spam and ham until trained. That's why the directions for spamassassin and most other bayesian filters recommend that you have it learn from some saved spam and ham messages before putting the filter into production use. When that is done, the filter starts out being somewhat trained and does a much better job from the start.
I'll wager that if you
Re:No, it can't (well) (Score:4, Insightful)
No, it won't. This is actually what was done in the article.
The problem is in the length of the sequences. If you could learn sequences of, say, length 60 (for 30-move games), maybe your filter would become a reasonably good player. Unfortunately, the need for computational resources increases exponentially with each extra move added.
The complexity of chess is simply too high to learn this way.
Chess openings might be learned this way, but it is not very useful to do so. The results will be worse than opening libraries, which are very good at the moment.
Re:No, it can't (well) (Score:2)
That, however, is the way it is used in the article.
Playing chess is not something that you can naturally represent as a binary classification task. Any approach that attempts to represent playing chess in this way is going to perform poorly.
Bayesian filtering for spam is not by definition binary classification. True, the end result is "spam" or "not spam", but what it does is calculating how likely a message is to be spam. Then, if the likelihood is above
Re:No, it can't (well) (Score:2)
Re:No, it can't (well) (Score:2)
Re:No, it can't (well) (Score:3, Insightful)
Machine learning, as we know it, involves giving a black box program large amounts of data (in this case sam
Re:No, it can't (well) (Score:2)
Re:No, it can't (well) (Score:2)
But this problem is easily fixed - instead of giving spam-chess the move sequence, give it the board state as a function of move number. You could easily write a program wh
Re:No, it can't (well) (Score:2)
It depends what you mean by "playing chess". Can a monkey play chess? If you let the monkey move pieces at random, and the board simply disallows illegal moves, is the monkey playing chess? If you say "yes", then sure, the spam-chess engine can play chess. But then anything that can move pieces can play chess. True, spam-chess plays a little better than random moves, tha
Pie in the sky (Score:3, Interesting)
It makes sense. If something so utterly trivial (compared to the human brain) as a spam filter could learn do something as complex as play chess (well), then our brain would be a whole lot smaller. Nature doesn't waste resources.
But hey, it might always make an interesting screen saver!
Re:Pie in the sky (Score:2)
- our mind is inadequately built for chess. Just like playing tunes using a floppy drive requires way more programming than playing them using the sound card + speakers. Does the size and complexity of a program to play tunes using the floppy mean that playing audio is so difficult or just that floppy is an inadequate device?
- The spam filter is a really smart program that can build a great database out of available data. Take a few gigabytes of data (human DNA), add an
Be very careful! (Score:5, Funny)
- The baesian filter fights back.
- Yes. It submits the same story to Slashdot twice.
- Why submit twice? Don't editors spot those things?
- Because the baesian filter knows Slashdot editors do not check for dupes, and the Slashdot effect eventually nukes Breyer's server.
Re:Be very careful! (Score:2)
LS
Re:Be very careful! (Score:3, Interesting)
Old proverb (Score:4, Funny)
Re:Old proverb (Score:5, Interesting)
Re:Old proverb (Score:2)
Re:Old proverb (Score:2)
Re:Old proverb (Score:2)
Pragmatists. Sheesh.
The thing is, in the process, you might learn some valuable things about yourself, the pig, and music.
hmmm, (Score:2, Funny)
Re:hmmm, (Score:2)
In Soviet Russia, chess plays you.
Imagine a Beowulf cluster of chess-playing spam filters.
1. Train a spam filter to play chess. 2. ??? 3. Profit!
But does a bayesian spam filter run Linux?
Did I forget any?
Re:hmmm, (Score:2)
the flaw in his teaching: (Score:3, Insightful)
Re:the flaw in his teaching: (Score:2)
Great article! (Score:5, Insightful)
Not only is the topic unusual and entertaining, but this article is also a good tutorial on data massaging, pattern matching, and combining disparate unix tools to accomplish a task. This article showcases how powerful and useful unix command tools can be.
If you want a good step by step tutorial to help you understand the usage of unix command line tools to accomplish a non trivial task, then you should read this.
teaching it to understand the board (Score:4, Interesting)
The method described in the article ignores the board, and instead focusses on the history of moves.
A better method might be to train the filter to read from a description of the board state (ignoring the moves taken to reach that state), and a list of possible moves, then return the move that is most likely to win.
If you allow it to also choose from impossible moves, then it will learn the rules of the game as well.
Re:teaching it to understand the board (Score:3, Interesting)
What would be more interesting, would be to feed in all of the known openings and closings that have been analyzed and collected over the years (comprehensive books on openings run over 750 pages in length). If you could teach it various openings, and then
Re:teaching it to understand the board (Score:2)
Re:teaching it to understand the board (Score:2)
Re:teaching it to understand the board (Score:2)
Fortunately hard drive space is cheap so he can easily add more as needed.
If he can build a good local index of the game positions he can even use the Gmail filesystem thingy for free storage. The speed would be rather lethargic though.
Stone Soup (Score:2, Insightful)
Sure you can make a chess playing program from a spam filter.
You just need to throw in a leg
Well I must say... (Score:2)
Kjella
clever application of an old idea (Score:2)
It mad
Hmmm... (Score:4, Funny)
AI Koan (Score:3, Informative)
"What are you doing?", asked Minsky.
"I am training a randomly wired neural net to play Tic-Tac-Toe", Sussman replied.
"Why is the net wired randomly?", asked Minsky.
"I do not want it to have any preconceptions of how to play", Sussman said.
Minsky then shut his eyes.
"Why do you close your eyes?", Sussman asked his teacher.
"So that the room will be empty."
At that moment, Sussman was enlightened.
Could be done better methinks..... (Score:2)
So what you really need to do is to compile a filter of board states after every move in every game in your history. Then you account for current board state in your game and look for any game that contains your exact board state (regardless of number of moves to reach it). From there any following move selection would be valid (assuming your match is only pulled f
What if... (Score:2)
Specifically, delete all the 'x's before running anything through dbacl. Put back the 'x' where necessary in returning the move to the chess program.
That way Nxd5 = Nd5
The idea being that whether or not you make a specific move is not always dependent on whether you make a capture. Sometimes you just want the piece on that spot on the board. The program currently treats those two instances (moving a piece for a to b, and moving a piece from a while capturing the piece
Next week on slashdot: (Score:2)
-
Re:Would you like to play a game? (Score:2)
I suppose this is the reason for this experiment: is a Bayesian filter some kind of universal AI? If I train a filter to emulate my slashdot posts will it be as intelligent (or not) as me?
Re:Would you like to play a game? (Score:2)
Re:Would you like to play a game? (Score:2)
Nuke Redmond, please.
Re:A poor reinvention of the wheel..... (Score:3, Interesting)
Re:Spam Filter (Score:2)
Re:Spam Filter (Score:2)
Re:Yes it can, as well as Deep Thought (Score:2)
Technically, nothing is "Turing-complete". A Turing machine has a tape of infinite length extending in two directions. However, if you discount that (which is always what happens when someone refers to anything in RL as Turing-complete) then you're right - they're both "Turing-complete".
Re:problem building dbacl (Score:3, Informative)
Re:problem building dbacl (Score:2)
Re:problem building dbacl (Score:2)
Re:problem building dbacl (Score:2)
Re:More importantly.... (Score:2)
Re:anything (Score:2)
Re:This topic is silly (Score:2)