Neural Net Learns Breakout By Watching It On Screen, Then Beats Humans
KentuckyFC writes "A curious thing about video games is that computers have never been very good at playing them like humans by simply looking at a monitor and judging actions accordingly. Sure, they're pretty good if they have direct access to the program itself, but 'hand-to-eye-co-ordination' has never been their thing. Now our superiority in this area is coming to an end. A team of AI specialists in London have created a neural network that learns to play games simply by looking at the RGB output from the console. They've tested it successfully on a number of games from the legendary Atari 2600 system from the 1980s. The method is relatively straightforward. To simplify the visual part of the problem, the system down-samples the Atari's 128-colour, 210x160 pixel image to create an 84x84 grayscale version. Then it simply practices repeatedly to learn what to do. That's time-consuming, but fairly simple since at any instant in time during a game, a player can choose from a finite set of actions that the game allows: move to the left, move to the right, fire and so on. So the task for any player, human or otherwise, is to choose an action at each point in the game that maximizes the eventual score. The researchers say that after learning Atari classics such as Breakout and Pong, the neural net can then thrash expert human players. However, the neural net still struggles to match average human performance in games such as Seaquest, Q*bert and, most importantly, Space Invaders. So there's hope for us yet... just not for very much longer."
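For anyone who wants to see what the two steps in the summary might look like in code, here is a rough sketch, not the researchers' implementation. It assumes a frame arrives as 210 rows of 160 (r, g, b) tuples, uses a standard luma weighting for the grayscale conversion, and shrinks the image with plain nearest-neighbour sampling. The function names (`to_grayscale`, `downsample`, `preprocess`, `choose_action`) and the epsilon-greedy rule for picking from the finite action set are illustrative assumptions; the summary only says the net picks whatever maximizes the eventual score.

```python
import random


def to_grayscale(frame):
    """Convert rows of (r, g, b) tuples (0-255 ints) to rows of 0-255 luma values.

    The 299/587/114 weights are the common ITU-R BT.601 luma weights,
    an assumption here; the summary does not say which conversion was used.
    """
    return [
        [(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
        for row in frame
    ]


def downsample(gray, out_h=84, out_w=84):
    """Nearest-neighbour resize of a 2-D grayscale image to out_h x out_w."""
    in_h, in_w = len(gray), len(gray[0])
    return [
        [gray[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def preprocess(frame):
    """210x160 RGB frame -> 84x84 grayscale frame, as described in the summary."""
    return downsample(to_grayscale(frame))


def choose_action(action_values, epsilon=0.1, rng=random):
    """Pick an action index from a finite set (hypothetical epsilon-greedy rule).

    action_values[i] is the learner's current estimate of the eventual score
    if action i (left, right, fire, ...) is taken now. With probability
    epsilon a random action is tried instead, which is the "practice" part.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(action_values))
    return max(range(len(action_values)), key=lambda a: action_values[a])


if __name__ == "__main__":
    # A blank white frame stands in for real console output.
    white = [[(255, 255, 255)] * 160 for _ in range(210)]
    small = preprocess(white)
    print(len(small), len(small[0]))
    # Made-up value estimates for three actions; epsilon=0 means always greedy.
    print(choose_action([0.1, 0.9, 0.3], epsilon=0.0))
```

The hard part, learning good value estimates from the 84x84 frames, is exactly what the neural net is for and is not shown here.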
I for one (Score:5, Funny)
Re: (Score:1)
This is the weakest form of trolling I have ever seen. I award you no mod points and may God have mercy on your soul.
Excerpt from "Starfish" by Peter Watts (Score:5, Interesting)
"I hope the lifter pilot doesn't get too bored." Jarvis is all chummy again.
"There is no pilot. It's a smart gel."
"Really? You don't say." Jarvis frowns. "Those are scary things, those gels. You know one suffocated a bunch of people in London a while back?"
Yes, Joel's about to say, but Jarvis is back in spew mode. "No shit. It was running the subway system over there, perfect operational record, and then one day it just forgets to crank up the ventilators when it's supposed to. Train slides into station fifteen meters underground, everybody gets out, no air, boom."
Joel's heard this before. The punchline's got something to do with a broken clock, if he remembers it right.
"These things teach themselves from experience, right?," Jarvis continues. "So everyone just assumed it had learned to cue the ventilators on something obvious. Body heat, motion, CO2 levels, you know. Turns out instead it was watching a clock on the wall. Train arrival correlated with a predictable subset of patterns on the digital display, so it started the fans whenever it saw one of those patterns."
"Yeah. That's right." Joel shakes his head. "And vandals had smashed the clock, or something."
"Hey. You did hear about it."
"Jarvis, that story's ten years old if it's a day. That was way back when they were starting out with these things. Those gels have been debugged from the molecules up since then."
"Yeah? What makes you so sure?"
"Because a gel's been running the lifter for the better part of a year now, and it's had plenty of opportunity to fuck up. It hasn't."
"So you like these things?"
"Fuck no," Joel says, thinking about Ray Stericker. Thinking about himself. "I'd like 'em a lot better if they did screw up sometimes, you know?"
"Well, I don't like 'em or trust 'em. You've got to wonder what they're up to."
Re: (Score:2)
The cheese company had a person whose job it was to check the cheese and would go and poke the wheels to see if they were ready. The company created a robot to go and do the same job and trained it by having it poke fully ripened and unripened cheese. The problem was that when it was put to use it failed miserably at correctly and consistently telling if the cheese was ripe. The problem was the
Re: (Score:2)
You would be able to tell a real AI its mistake, and it would be able to figure out how to correct it. 3D print its own smell sensors...
Re: (Score:2)
http://www.youtube.com/watch?v=GzzOw2tmb3A
Re: (Score:1)
Summary of post:
tl;dr
Re: (Score:2, Insightful)
tl;dr == "I hate reading". You should NEVER see a tl;dr at slashdot. NERDS READ.
Define Slashdot.org (Score:2)
It's rambling, obtuse, and pretty much incomprehensible. One can read it, but there's nothing of value to be obtained by reading it. [...] its lack of substance and insight, if not its outright incorrectness, means that reading it is a pointless activity.
Re:Can we get a summary of that excerpt, please? (Score:4, Informative)
Re: (Score:2)
Re: (Score:1)
The performance would be better than that of a human regardless.
Until it's not. Perhaps the best solution is to use both. That is, automate but have a human operator as backup. That way when the automation goes off the rails (either figuratively or literally), there's a human there saying "that's not quite right" and can resume the reins. Automation should allow for a higher system-to-human ratio, so it's not a complete loss.
Re: (Score:1)
Jarvis and Joel are discussing smart gel - some kind of AI that apparently has had bugs. What was difficult to understand?
Re: (Score:1)
Re: (Score:2, Insightful)
Oh, wait, that's a dumb idea, because we'll end up looking like the stupid ones."
That is the conversation you should have had with yourself before you posted.
In the excerpt one of the chars expresses a begrudged acceptance of the 'gels' because they haven't 'fucked up' which is not, despite the anecdote which precedes the opinion, exclusive to fatalit
Re: (Score:2)
Alternatively, your mind is just too rigid to dig the style and too inflexible to get what the dialog is about.
Re: (Score:2)
I weep for humanity...
"weird writing style" apparently means "written beyond a 6th grade reading level" (which is incidentally what USA Today is written for, by and large — a good reason to aspire to better news periodicals, even though editorial standards are slipping across the board)
Incidentally, the writing style is not that uncommon, and some of the techniques he uses can be found in other great novels of multiple genres (e.g., detective novels). At least one review describes Starfish as a thril
Re: (Score:2)
"but I'm starting to think programs like No Child Left Behind may have de-emphasized that in exchange for teaching kids how to pass more and more standardized tests that focus on bare essentials"
so you're positing that holding teachers and children to a fairly standardized level of math and reading comprehension by forcing them to prove they have the very skills that society needs them to have is somehow bad for an educational system??
that is just so asinine..."teaching for the test"?? jesus almost every min
Re: (Score:2)
There are more ways to learn, and prove that you have learned, than taking and passing tests. The idea that we go to school only to learn the rote of what is taught is the very problem with the system. Our education needs to focus on critical thinking and analysis, not memorizing the answers to test questions out of their textbooks.
I should be able to ask a class of high school students what they believe was the cause of some historic event, and hear back several different answers. They don't have to be the
Re: (Score:2)
Don't trust computers if your name is Jarvis.
In the greater context I think the post is trying to make the same point: AIs that use visuals to make decisions may not be trustworthy with human life.
Re: (Score:2)
Unless, of course, your name is Jarvis and you are a computer. [wikia.com]
Re: (Score:2)
Touché. Well played, for I had forgotten that reference.
Don't let it watch terminator (Score:1, Insightful)
Next up we have Skynet.
Re: (Score:1)
Next up we have Skynet.
Slow down there AC. We need to get WOPR first.
AI (Score:5, Insightful)
For once, something based on proper AI (rather than human-generated heuristics).
However - notice its limitations: Where there is a direct correlation between where you need to be, and where something else is on the screen (basically a 1:1 relationship in Pong, for example), it can cope with going higher or lower as required.
But when you put it into something that has more than a single thing to "learn" (move left/right, avoid bombs, shoot aliens, choose which aliens to shoot, don't shoot your own base, etc.) then the amount of training required goes up exponentially. And thus we could spend centuries of computer time in order to get something that can do as well as a simple heuristic designed by someone who knows the game (not saying heuristics don't have their place!).
"Trained" devices require training relative to some power of the variety of the inputs and the directness of their correlation to the game-arena. And thus, proper AI is really stymied when it comes to learning complex tasks.
But still - this is the sort of thing we should be doing. If it takes an infant two years with the best "computer" in the universe that we know of to learn how to talk, why should we think it will take a machine at even the top-end of the supercomputer scale (which can't have as many "connections" as the average human brain) any less?
Re: (Score:3, Interesting)
If it takes an infant two years with the best "computer" in the universe that we know of to learn how to talk, why should we think it will take a machine at even the top-end of the supercomputer scale (which can't have as many "connections" as the average human brain) any less?
Because we're learning languages in the wrong way.
Re: (Score:3, Interesting)
Re: (Score:1)
Re: (Score:2)
Re:AI (Score:4, Interesting)
If it takes an infant two years with the best "computer" in the universe that we know of to learn how to talk, why should we think it will take a machine at even the top-end of the supercomputer scale (which can't have as many "connections" as the average human brain) any less?
Because neurons are much slower than transistors?
Re: (Score:1)
A neuron is more like a network router with local integrated storage, packed in a density somewhat comparable to integrated circuits.
Re: (Score:1)
If it takes an infant two years with the best "computer" in the universe that we know of to learn how to talk, why should we think it will take a machine at even the top-end of the supercomputer scale (which can't have as many "connections" as the average human brain) any less?
While I kind of agree with the point, it absolutely doesn't take 2 years to learn how to talk. It takes a few months to learn to talk (which would include learning that sounds have meaning). Just like it doesn't take over a year to learn to walk, it takes a couple weeks. Interestingly, they learn some very complex things all at the same time.
Re: (Score:2)
Re: (Score:2)
Why not include a heuristic processor in the AI, that would override the statistical training in certain cases?
So you could tell the program, in real time while it's playing, something like "Watch out for bombs while moving left or right" and it would be able to ignore what its statistical training told it to do, in a context where the training told it to move right but that would send it into a bomb.
Re: (Score:1)
It's called a "JavaScript Programmer" algorithm. (Score:5, Funny)
This neural-net-combined-with-trial-and-error style of algorithm is typically referred to as a "JavaScript Programmer"-type algorithm in recent AI literature. (I'm being completely serious, too, in case you think this is a joke; it isn't.)
The name derives from the similarity between how these kinds of algorithms work, and how JavaScript programmers tend to work.
Both the algorithms and JavaScript programmers use a very basic, minute form of pseudo-intelligence.
This small dab of pseudo-intelligence is then used to repeatedly attempt to solve a problem, followed by an analysis of the success of the attempt.
In the case described in this article, it involves the computer trying to play the game, with the aim of winning.
In the case of the JavaScript programmer, it involves the programmer repeatedly searching through Stack Overflow, finding code to copy-and-paste, and then hoping that it works well enough to trick the customer or employer into thinking the job is done.
The summary should have probably mentioned this, but I suspect that the submitter may not be following the latest AI journals and research very closely.
Re: (Score:3, Funny)
the programmer repeatedly searching through Stack Overflow, finding code to copy-and-paste, and then hoping that it works well enough to trick the customer or employer into thinking the job is done.
Re: (Score:2)
Truer words were never spoken:
the programmer repeatedly searching through Stack Overflow, finding code to copy-and-paste, and then hoping that it works well enough to trick the customer or employer into thinking the job is done.
If it really works, if the specifications are met, and if it passes testing, then the job is done.
Wisely leveraging the shared knowledge of others is a good thing to do.
Re:It's called a "JavaScript Programmer" algorithm (Score:4, Interesting)
This neural-net-combined-with-trial-and-error style of algorithm is typically referred to as a "JavaScript Programmer"-type algorithm in recent AI literature. (I'm being completely serious, too, in case you think this is a joke; it isn't.)
The name derives from the similarity between how these kinds of algorithms work, and how JavaScript programmers tend to work.
Funny, of course :)
But, you got me thinking. The JavaScript programmer is generally trying to affect the appearance of stuff on the screen, therefore, he looks at the stuff on the screen, and tries to affect ... the stuff on the screen. So, it makes more sense than it might.
Our new pong-playing overlords, on the other hand, if they are actually doing something important like remotely fighting wars or trying to save people or something, well, then we don't really know if they are looking at the right input, and it becomes much more important that they, and we, understand exactly how they are coming to their decisions.
Re: (Score:2)
And now, I'm wondering if there is another way of creating DOM-manipulating JavaScript. I mean, I can most of the time make a Linux module by reading the documentation of a device and writing code that makes it work (but for some devices, it's the Javascript wa
I know it was tongue in cheek , but... (Score:2)
"used to repeatedly attempt to solve a problem, followed by an analysis of the success of the attempt."
The above is exactly how humans learn to play simple games. Sure, you learn a few rules beforehand but then you actively - and to an extent subconsciously - engage in trial and error about what to hit/kick/click at what time in what scenario. It's called "practice". No one for example becomes a good football (soccer for the yanks) player by analysing angles of attack of other players' feet - they just go out
Re: (Score:2)
It's also more formally called TDD. Create some code that tests the suitability of the existing code to solve the problem. Then randomly change the code until it passes the test, and all the others. Repeat, rinse, etc.
The handwriting on the wall (Score:2)
Deep Blue [ibm.com]
Re:The handwriting on the wall (Score:5, Interesting)
Did a Computer Bug Help Deep Blue Beat Kasparov? [wired.com]
A little nit (Score:1)
Where did the researchers find the "expert Breakout and Pong players" to match their neural net against? Was it that same loudmouth kid down the hall who is always "beating the spread" on football?
Wrong question (Score:2)
The question is not, "when can a bunch of machinery beat a human at X." The question is "when can a bunch of machinery beat a team of humans _with access to similar computational resources_ at X." I don't see much progress there.
Re: (Score:2)
No problem.
I can wait 20 years for the computer to catch up.
I can already hear the daleks... (Score:2)
"Exterminate... Exterminate...."
Actually, when they become advanced enough, we won't need to work anymore.
I'll buy TWO. One to do my job and one ... just in case.
Re: (Score:2)
"Exterminate... Exterminate...."
Actually, when they become advanced enough, we won't need to work anymore.
I'll buy TWO. One to do my job and one ... just in case.
Daleks are mutated life forms, riding in a machine. They are not robots or AIs.
Re: (Score:2)
Daleks are mutated life forms, riding in a machine.
Now I have an image of the little squishy Dalek sitting inside in a little chair, turning a little wheel and going "wheeee!"
(Username recognised)
Re: (Score:2)
Nah, it won't be the machines taking over. When machines become advanced enough, the 1% will no longer need the rest of us humans to grow their food, to make their toys, to be their servants and chauffeurs. Why would they pay us to do nothing? Why let us use up food and oxygen? It will be time to exterminate the teeming masses. In the name of sustainability, no doubt.
Oh, KentuckyFC (Score:1)
You were just asking for an oblig [xkcd.org], weren't you?
Re:Oh, KentuckyFC (Score:4, Funny)
You were just asking for an oblig [xkcd.org], weren't you?
http://xkcd.com/347/ [xkcd.com] ...now that was truly obligatory.
Tetris (Score:3)
Re:Tetris (Score:5, Informative)
Tetris is a solved problem if you're going for survival (assuming you don't get an extremely unlucky piece selection). Since AI has access to the current piece, the next piece, and can do a probability check on the next piece, it can basically last forever.
Tetris: The Grand Master: http://www.youtube.com/watch?v=jwC544Z37qo [youtube.com] - fast forward to 3:00 to see the first major speedup, 4:45 for the final speedup, and 5:01 for invisible pieces.
That, and 999999 was done on a real NES within 3 minutes 11 seconds: http://www.youtube.com/watch?v=bR0BKCHJ48s [youtube.com]
Re: (Score:2)
Re: (Score:2)
Play itself (Score:1)
They should spin up two instances of the neural net and have it play itself
All is lost! (Score:3)
The AI has another advantage over us human players with the Atari 2600. No blisters.
Re: (Score:2)
But (Score:3)
But has it learned to let someone else design Breakout and then steal a couple thousand dollars from him for his efforts? When it does that, it will truly be an intelligence. (And it will be a superior intelligence if it leaves off the black turtlenecks.)
Minecraft (Score:1)
Show me an AI that can play minecraft; that would be impressive.
Re: (Score:1)
Define the goals of minecraft. If you mean "An AI that can form its own aesthetic desire of what a nice house/castle/statue/dick should look like and creates it for fun," I'll agree with you, but if the standards are low, I can make an AI that plays minecraft by looking straight down and holding left click for a few minutes. I'll call it a zombie survivalist AI.
Re: (Score:1)
Kill the dragon in hardcore mode.
Perhaps I can teach it (Score:3)
to farm gold for me.
That's Not Impressive (Score:2)
So it can play Breakout, big deal.
Wake me when it's giving the checkers-playing chicken a run for her money.
What is the missing piece (Score:2)
The interesting part of the slim article was the part left out: why did it not perform as well on some of the games? There was not much detail on that issue. I'm not familiar with the poorly played games, but I would guess they introduce a level of visual complexity that overwhelms the AI?
Other than that, simply astounding accomplishment.
Re: (Score:2)
Oh hey! That was the article that inspired The Adolescence of P-1 [wikipedia.org]! I never read the original article (being that I wasn't even born when that book was written), but that was a fun book. I actually wasn't aware it was based off a real article rather than one the author made up, but apparently it was. Neat.
I for one... (Score:2)
...Welcome the new King of Kong!
winning is pointless (Score:2)
Re: (Score:3)
It's learning to have fun.
Run for the hills! (Score:2)
Neural Net Learns Breakout By Watching It On Screen, Then Beats Humans
Women and children and nerds first!! The machines are coming!
Oh, my mistake. I thought it said "neural learns to break out" and then something about beating humans.
This is one reason why most people don't Capitalise Every Word In A Headline.
Destination: Void (Score:2)
You have been warned.
**not an AI advancement** (Score:1)
TFA does not describe an advancement in AI technology whatsoever.
It is an external 'computer player'... We have had AIs that play video games virtually since we had video games.
Take good ol' Tecmo Bowl...you play against an AI opponent that does absolutely everything this AI did and more.
This is not an AI advancement, it is....an **application** of new and better **sensor inputs** for an external AI
I don't see anything in this that would indicate we are some kind of 'step' closer to having Terminator kill b
Re: (Score:2)
Nah, not related. In-game AI is written specifically with that game in mind, often knowing more than the eye can see. This is a general-purpose thing that attempts to learn only with visual input, without direct access to the program itself.
Re: (Score:2)
Yeah good points all around, AC...I rushed to judgement a bit and my analogies were off. Probably deserved the downmod I got.
I like your Google Translate analogy...but yeah, I guess chalk this up to my intense hatred of 'AI' hype, which I feel is hurting our industry in many ways.
Neural Net Also Sings (Score:2)
"Would you like to hear the song I learned today while we play? Daisy, Daisy, give me your answer do ..."
Waiting for computer to get so smart..... (Score:2)
... they don't need us creators any longer. Now if only we could see what they come up with about their origins after we are gone.
Comes with a caveat (Score:1)
More AI Hyperbole (Score:1)
Repeat it with me: "This is not an AI breakthrough".
Re: (Score:1)
Where did they find the expert players? (Score:2)
That could be interesting (Score:1)
Arcade Learning Environment (Score:2)
See http://www.arcadelearningenvironment.org/ [arcadelear...onment.org] for a few other approaches to this de-facto AI test.
Ha! (Score:2)
"However, the neural net still struggles to match average human performance in games such as Seaquest, Q*bert and, most importantly, Space Invaders."
There's the Singularity put off for another year.