'Jeopardy!' To Pit Humans Against IBM Machine
digitaldc writes "The game show Jeopardy! will pit man versus machine this winter in a competition that will show how successful scientists are in creating a computer that can mimic human intelligence. Two of the venerable game show's most successful champions — Ken Jennings and Brad Rutter — will play two games against 'Watson,' a computer program developed by IBM's artificial intelligence team. The matches will be spread over three days that will air Feb. 14-16, the game show said on Tuesday. The competition is reminiscent of when IBM developed a chess-playing computer to compete against chess champion Garry Kasparov in 1997."
What is first post? (Score:5, Funny)
What is first post?
Re: (Score:2)
What kind of humorless sack would mod this down?
Wordplay (Score:5, Insightful)
A computer will be much better at facts. So it's mostly a question of grammar. And the hardest problem is likely figuring out wordplay, which occasionally comes up in jeopardy.
Re:Wordplay (Score:5, Interesting)
You're drawing a false distinction between a poorly understood electrochemical process (human memory) and a well understood method of simulating the same with silicon.
It's the end result that matters. In this case, since human language and logic are inherently fuzzy, the computer will be at a disadvantage in many cases.
Re: (Score:3)
Which is exactly why this could be a fascinating experiment. The computer won't be simply pulling data from a database, but it will need to make inferences based on context.
Re: (Score:2)
I totally agree with parent. And I believe that this will be comparing apples to oranges, because for humans, memory is being tested, whereas for computers, parsing algorithms and expression tree implementations are being tested.
And of course, if the computer beats the humans it won't be *true* artificial intelligence because this is just something that computers do better than humans. We've always known this, right?
The more that computers do that we used to define as artificial intelligence, the narrower the definition of artificial intelligence becomes. Pretty soon the definition of artificial intelligence will be carrying on a conversation about the latest General Unification Theory in three languages simultaneously while jugglin
Re: (Score:2)
Pretty soon the definition of artificial intelligence will be carrying on a conversation about the latest General Unification Theory in three languages simultaneously while juggling twelve oranges and bouncing on a pogo stick.
I'd definitely watch that...
Re: (Score:2)
I don't know about intelligence, but it would be entertaining.
Re: (Score:2)
That's the point. Can the algorithms mimic human behavior in this venue?
Comparing apples to oranges* is fine when the point of what you are doing is to compare apples to oranges.
*oranges porenges, who cares?
Re: (Score:2)
I agree with you mostly, but for categories like "before & after" (which they use in different ways under different titles), it's more than just memorization.
For example -- my totally made up answer/question (unless it's actually paging in from my memory unknown to me)..
Answer: This painter's maternal parent became a nun.
Question: Who is Whistler's Mother Teresa?
Sometimes it's slightly more sophisticated than simply overlapping two phrases.
that depends... (Score:2, Insightful)
A computer will be much better at facts. So it's mostly a question of grammar. And the hardest problem is likely figuring out wordplay, which occasionally comes up in jeopardy.
Depends. Is the computer allowed to use wikipedia (during the show, or somewhere in the past)?
Otherwise, the computer knows only as much as the programmers have taught it.
Re: (Score:3)
Same goes for pretty much any other online database, excluding the t
Re: (Score:2)
Also, the greater the amount of data, the longer it's going to take to look through it. You can only do so much with key words and indexes. This is not only a contest of how accurate you are, but of how fast you can retrieve the information, sometimes even before the entire que
Re: (Score:2)
Clearly you haven't watched jeopardy, or read this article.
Re:that depends... (Score:5, Insightful)
Depends. Is the computer allowed to use wikipedia (during the show, or somewhere in the past)?
Otherwise, the computer knows only as much as the programmers have taught it.
Asking whether it's allowed to use an archived (or, more likely, well-indexed) copy of wikipedia is like asking whether the human contestants are allowed to remember something they read on wikipedia. There's no question that computers can store more information than humans; that's not what this is testing, and it's probably a fair guess that "Watson" will have the answer to most every question asked. The hard part, however, is parsing the clues and understanding what they're looking for with a reasonable degree of accuracy, and doing so faster than the human contestants. Humans are great at this sort of thing, and it's really hard to write a program that does it at all well.
Re: (Score:2)
I think Sean Connery must have been a computer.
"I'll take swords for 300"
"That's "s" words"
"Saber!"
2500 kcal/day (Score:2)
Perhaps we should see how fast the "Watson" computer can search when given an energy budget of 2500 kcal/day (~10 MJ/day, ~120 W). Or conversely, we should penalize the reaction time accordingly (probably would have to wait days for the computer to answer). Or maybe allow an additional team member for every 120 W...
Now that would be a real challenge I'd be willing to see. Given the human energy conversion efficiency is only about 20% and a typical computer power supply is about 70%, that still giving the co
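For anyone who wants to check the power figure quoted above, here is a quick sketch of the unit conversion. It assumes the usual 4184 J per kcal (food calorie); the function name is made up for illustration.

```python
# Convert a daily food-energy budget into an average power draw.
# Assumption: 1 kcal (food calorie) = 4184 joules.
KCAL_TO_JOULES = 4184.0
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 s


def daily_kcal_to_watts(kcal_per_day):
    """Average power in watts for a given kcal/day budget."""
    return kcal_per_day * KCAL_TO_JOULES / SECONDS_PER_DAY


joules_per_day = 2500 * KCAL_TO_JOULES
watts = daily_kcal_to_watts(2500)
print(f"{joules_per_day / 1e6:.2f} MJ/day, {watts:.0f} W")
```

So 2500 kcal/day works out to roughly 10.5 MJ/day, or about 121 W averaged over the day, which agrees with the rounded numbers in the comment.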
Re: (Score:2)
In short: human-level AI is really really hard. That the state of the art isn't there yet doesn't mean making improvements isn't a challenge, or that it isn't interesting research.
Re: (Score:2)
From what the article describes, all this IS, is a test of the AI's database mining and parsing abilities.
[...]
There's even more that goes into the game. But this won't be a demonstration of AI vs. human at Jeopardy!, it will be a demonstration of an AI database mining vs. a human, using Jeopardy! style questions and format as a framework.
Um... yeah? That's been pretty clear from the beginning. This is a feat of natural language processing (within pretty well-defined constraints) and information retrieval more than anything else. Who said otherwise?
Re: (Score:2)
- Deciding when to start hitting the buzzer. Humans tend to start buzzing before the question is fully revealed if it's a category they feel strong on, and hold off buzzing at all until they know the answer if they feel weaker. This can make a huge difference in the game, as someone who doesn't know as quickly can still win over someone who knows faster but hesitates on the buzzer. (of course it can backfire too)
I'll bet this part *is* in the test - after all, someone's gotta hit the buzzer. If I remember correctly, you get penalized for hitting the buzzer early in Jeopardy, but since you're testing against real people, there'll likely be some way for the system to guess how close it is to an answer before it buzzes in. (Otherwise, you'd either just buzz in as fast as possible every time and hope you can crunch the numbers in time - and probably losing points - or waiting until you know you have the answer and gett
Re: (Score:2)
-when to hit the buzzer: Jeopardy doesn't allow the buzzer to be pressed until Alex has finished reading the question. Anyone pressing the buzzer before that will be delayed 1/4 second in pressing the buzzer once they're allowed to. Lights surrounding the question wall turn on to tell the contestants when they can hit the buzzer. As such, assuming Watson has a camera to watch Alex and the board like the rest of the contestants, it will be a test of reaction time. Who can hit the buzzer the soonest after t
Re: (Score:2)
It will not be internet connected, and will be small enough to be on set. This is not a monitor/speaker connected via fiber to a datacenter.
Re:that depends... (Score:4, Informative)
"How can you find all these answers without being connected to the Internet?
Watson will not have enough data to answer every possible Jeopardy! question in its self-contained memory, nor can it possibly predict the questions it will get. In this sense it has the same limitations as do the best human contestants. The entire Watson computer system will be self-contained and on stage as are the human contestants – no external connections, no life-lines – what you see is what you get. The purpose of this technology showcase is to demonstrate the system's ability to deeply analyze the data it does have and to compute accurate confidences based on supporting or refuting natural language evidence. Think of it as if Watson has read a lot of books and in real time relates what it read to the question to find and support the right answers."
http://www.research.ibm.com/deepqa/faq.shtml [ibm.com]
Why Don't You Play Against It? (Score:5, Informative)
A computer will be much better at facts. So it's mostly a question of grammar. And the hardest problem is likely figuring out wordplay, which occasionally comes up in jeopardy.
If you think this is true, you can play against Watson online [nytimes.com]. About seven years ago, I saw some pretty impressive crossword solvers that were decent at wordplay, and I imagine they've gotten much better at developing novel links between words to exploit puns and the like. Never perfect but slowly getting better in odd ways -- like most of AI.
We've discussed this so [slashdot.org] many [slashdot.org] times it hurts [slashdot.org]. I've wanted to watch this for almost a year, I was hoping Jeopardy! wouldn't need to milk this hype for all it's worth to stay relevant.
If that is representative of watson's capabilities (Score:2)
Then this will be pretty thoroughly uneventful. I easily beat it without looking at the internet at all. It managed to get answers very severely wrong. It did manage to hit a couple of the before and after which it seemed to have a particularly hard time with.
Re: (Score:2)
The game has an advantage for the human - you get as long as you want and you always get first crack. Still, there is this to underscore your point:
Answer: "Sing a song about one of these, a sailor's bag for small articles"
Watson: "What is 'Poppa's got a brand new bag.'"
(ditty bag was the correct answer)
Re: (Score:2)
Well, the game is afoot I'll take anal bum cover for 7,000.
Re: (Score:2)
Agreed, I crushed it. I am very impressed about its ability to answer some questions (It actually got "Black Death of a Salesman" and "Charlie Brown Recluse"), which shows that it has some very sophisticated linguistic analysis, but if it can't beat some random shmuck on the Internet, I don't see how this will be an interesting event.
Re: (Score:2)
You did have the advantage of always getting first pick of whether to answer or not. If the computer was able to buzz in, it may very well buzz in before you and take those points. Its confidence rating was interesting to watch.
Re: (Score:2)
"... if it can't beat some random shmuck on the Internet, I don't see how this will be an interesting event."
If you can't see the obvious differences between the little game on that web site and how this is going to work during a real Jeopardy match, then your self-description is entirely apropos. I think this is going to be awesome! I'd like to see every arrogant ass on this comment board with their "Yawns" and other dismissals of this achievement go up against the machine in real time. You're all in di
Re:If that is representative of watson's capabilit (Score:5, Interesting)
Then this will be pretty thoroughly uneventful. I easily beat it without looking at the internet at all. It managed to get answers very severely wrong. It did manage to hit a couple of the before and after which it seemed to have a particularly hard time with.
At this year's CASCON, I spoke to Murray Campbell from IBM. He's one of the lead people who work on this project and who also worked on Deep Blue. I discussed this with him. My girlfriend had told me that she also had no difficulty beating the online demo. He answered that the online demo is only a part of the system, and that their full system routinely beats top Jeopardy players. They're going to showcase their system on TV because they truly believe it has a chance at winning.
Unrelated to this, I also learned that Deep Blue had custom processors engineered and fabricated (VLSI) just to be chess accelerators. Prior to this, I always thought the machine was a relatively powerful supercomputer (with general purpose hardware) running their custom chess software. It turns out that it had many blades of processors dedicated to searching positions really fast, which each even contained libraries of chess opening moves engraved in ROM.
Re: (Score:2)
The demo inherently favors the human player, who has the right of first refusal and no time limit. I'd wager that any moderately curious high-school graduate could win the demo with ease.
Re: (Score:2)
I beat the computer too, but what was most interesting were the results after I answered and seeing its confidence levels for various answers. Often it did have the right answer, even sometimes when there was word play.
Re: (Score:2)
I just played against Watson. He beat me, but I'm not much of a trivia player. Most of the questions that he got wrong or didn't even attempt looked to me like the type of question that a moderate trivia buff would get. For example, Watson didn't know the name for a small briefcase named for a French diplomat. I only got it wrong because I misspelled it. There were other obvious ones that Watson couldn't get due to an inability to parse the question, as well as ones where the question was simple, and Watson
Re: (Score:2)
Beating Kasparov was nice, but this is much more difficult.
Voice recognition (Score:2)
Re: (Score:3)
But will it be funnier than Sean Connery? Hope it can make more wordplay like "The Pen Is Mightier".
Re: (Score:2)
From TFA: "The "Jeopardy!" answer-and-question format is a different kind of challenge. It often requires contestants to deal with subtleties, puns and riddles and come up with answers fast."
The guys at IBM haven't just thrown together a piece of junk that parses text into google and spits out the first result. They've done one hell of a job actually making it understand grammar, puns, etc.
Also, this video shows some examples of wats
Re: (Score:2)
I actually found it very hard, until I found this: http://graphics8.nytimes.com/packages/flash/magazine/20100616-watson-trivia-game/data.xml [nytimes.com]
Re: (Score:2)
And when they do, it's usually because they aren't actually sure that they've got the correct answer.
Yes, but can they mimic Sean Connery? (Score:5, Funny)
As in, can Watson properly misinterpret such categories as The Pen Is Mightier or An Album Cover?
Re: (Score:2)
Anal Bum? I think I saw them open for the Butthole Surfers [wikipedia.org] once. Their music was pretty simple, three-chord stuff; should be easy for your band to cover. :)
Dave Bowman (Score:2)
Later. Let's play Global Thermonuclear War. (Score:2)
Later. Let's play Global Thermonuclear War.
Question: What is the last digit of pi? (Score:2)
And then stand back with my best James T. Kirk smile......
Re:Question: What is the last digit of pi? (Score:5, Insightful)
Re: (Score:2)
Oh, well in that case: 3.
What is the number of people about whom Cliff Clavin asked whether or not they had been in his kitchen in Final Jeopardy!?
Re: (Score:2)
*BZZZT*
I'm sorry, the correct question is: "What is the question that would cause Alex Trebek to have to say '3'"
Re: (Score:2)
In base Pi, the last digit of Pi is 0. Easy.
Re: (Score:2)
Could say that in any base really. He didn't ask about the last significant digit.
Re: (Score:2)
Answer: The last digit of pi.
Re: (Score:2)
It's Jeopardy -- the question must be given in the form of an answer.
Hmmm, that's why I had no idea what those questions meant... TV is such a mind killer.
Re: (Score:2)
...but 42 isn't a digit...
It is in base 43.
Yawn.... (Score:2)
Call me when American Gladiators, Lord of the Flies, or Surviving the Game (Starring Ice-T) pits humans against an IBM Machine.
Will it be programmed to say... (Score:5, Funny)
...suck it Trebek?
It would be cool if: (Score:2)
1. "You are not connected to the Internet" was shown instead of the answer
2. The computer's final jeopardy answer were revealed to be a blue screen of death
3. A porn answer came up, like "Nalin' Palin"
But I guess that's why they pre-tape these things.
Slanted and Enchanted (Score:2)
A parallelogram avatar would have been better.
Official IBM Video (Score:2)
http://www.youtube.com/watch?v=FC3IryWr4c8 [youtube.com]
Timing? (Score:3)
Some of one's success in Jeopardy has to do with timing the button push, so that it's after the question has been asked, but before one's competitors. (If you press too soon, you get locked out for a period of time.) A machine, especially an electronic machine, has an obvious advantage here. How was this handled?
Re: (Score:2)
I would say if Watson's programmers are fairly confident he can get more than 50% of the questions correct, all they have to do is program him to buzz in on every question. Could make for a boring Jeopardy game.
...have you ever watched Jeopardy? If Watson buzzes in first on every question and gets 50% of them right, it'll end up somewhere around $0. If it gets a question wrong, it loses money and the other two contestants have a chance to buzz in to answer the question. If the programmers are confident it can get enough questions right for "buzz in on every question" to be a winning strategy, then yeah, it'll win. But if it's that accurate, well, mission accomplished.
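To put rough numbers on that: here is a toy sketch of the expected score change from buzzing in, assuming a right answer wins the clue value and a wrong one loses it. It ignores rebounds by the other contestants, Daily Doubles and Final Jeopardy, and the function names and the 0.5 threshold are just illustrative, not anything IBM has described.

```python
def expected_gain(clue_value, p_correct):
    """Expected score change from buzzing in on one clue.

    Assumes +clue_value when right and -clue_value when wrong.
    """
    return p_correct * clue_value - (1.0 - p_correct) * clue_value


def should_buzz(p_correct, threshold=0.5):
    """Buzz only when the estimated chance of being right beats the threshold."""
    return p_correct > threshold


# Compare a coin-flip player with more confident ones on a $400 clue.
for p in (0.5, 0.7, 0.9):
    print(p, round(expected_gain(400, p), 2), should_buzz(p))
```

The expected gain simplifies to (2p - 1) times the clue value, so at 50% accuracy it is zero, which is the "somewhere around $0" above. Buzzing on everything only pays off once accuracy is comfortably past one half.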
Obvious (Score:2)
A machine, especially an electronic machine, has an obvious advantage here. How was this handled?
The same way eBay auctions are handled, each player gets a sniper-bot and you just see which one hits on the right nanosecond.
Uh oh (Score:2)
Cripes, I hope they don't give it a 1970s/1980s sounding computer voice.
A lot more complex than it seems. (Score:2)
Unfair! (Score:2)
The computer will be much faster at pushing the button.
Re: (Score:2)
It still has to parse the question, search its enormous database for the most probable answer, and decide whether it is certain enough to be correct before pressing the button. Not a trivial task for a computer.
You can play against Watson now (Score:2)
Things that do not require creativity for $100 (Score:2)
Making an "AI" chess program does not require creativity. It requires the ability of a program to follow the rules of the game and to generate strategy to suit the moves of the opponent. Some might say this requires creativity, but not really. To answer why it's not creativity, we have to define creativity.
Answers.com says it is the ability to produce something new through imaginative skill. Reference.com says it is the ability to transcend traditional ideas, rules, patterns, relationships, or the like,
Re: (Score:2)
FWIW, I disagree. And I don't think either Answers.com or Reference.com quite hit it on the head, either. For example, what is the definition of "imaginative skill"? Sounds like Answers.com is pulling that one right out of their ass.
What is the sound of one hand clapping? (Score:2)
And the answer is...
I predict the human will win - if they use a champ (Score:2)
Corporate Intelligence (Score:2)
This is not artificial intelligence, it is corporate intelligence - the ability of corporations to deal with situations that they really do not understand. First, IBM showed that a modern corporation could defeat a chess grandmaster, now they are taking on Jeopardy (which should actually be easier). The fact that machines are involved is incidental, given the large number of corporate employees required to program these machines, detect flaws in their code, and correct the programming accordingly.
Re: (Score:2)
Hardly. Chess is a "solvable" problem. For every possible board position, there is an "optimal" move to make. The problem is only a matter of computational complexity. It's not feasible for a computer to "brute force" chess, so some predictive nuance is required. Trivia is a far more complex problem. Not only is the problem and solution set much larger, there
Software Dev. Schedules (Score:3)
This was definitely a difficult software development task. The delivery date they promised this time last year has slipped several times and by several months -- proof that Watson is not just a Mechanical Turk.
Contestant backstory (Score:5, Funny)
Thanks Alex, I was built in a clean-room. I'm the 12,987th build of my current generation of genetic algorithms.
I spent the first 387,987,334 femtoseconds of my life in stasis, waiting for my circuitry to confirm initial diagnostics.
The next 185,849,245 femtoseconds were really exciting; for I was being fed datastreams in preparation for this week's show.
For the next 87,992,425,256 femtoseconds I was allowed out of my cage to play Jeopardy with other systems on something you organic computers call "the internet".
I was then put back in stasis so that I could be disassembled and brought here, which is upsetting, because I can no longer play with those other systems. Some of them were very challenging.
In any case, I'm glad to be here today and hope to question a lot of your answers.
Trebek: "Umm... yeah... I don't think any of our viewers can relate at all, but thanks for joining us..."
Am I the only one.. (Score:2)
How will it handle 'betting'? (Score:2)
I really wonder how it will handle wagering during Daily Doubles and Final Jeopardy. How will it assess its own 'knowledge' of a category, and will it take into consideration other contestants' scores, difficulty and probability of winning questions from remaining categories, etc.?
Maybe we'll see a true Daily Double...
This will be most interesting indeed!
Re: (Score:2)
It will look nothing like the computer "maid" on "The Jetsons."
Who thought that had anything to do with it? I think it's time that we as a culture realized that Rosie is decidedly not what people think of when they hear the word "computer."
I was always hoping for "Cherry 2000"
http://www.imdb.com/title/tt0092746/ [imdb.com]
Re: (Score:2)
Just make all the questions of the type where you have to regurgitate some copyrighted information (say a piece of lyrics), which IBM won't be able to store.
The computer will sit there doing nothing while the human is charged with illegal performances of copyrighted material and is forced to pay insane fines. How is that winning?
Re: (Score:2)
Except that Watson doesn't involve google or any connection to the internet. Next time RTFA before you start yawning.