AI Games

Computer Learns Language By Playing Games 133

Posted by Soulskill
from the did-you-just-teach-computers-how-to-conquer-the-world dept.
Frans Faase writes "By basing its strategies on the text of a manual, a computer infers the meanings of words without human supervision. The paper Learning to Win by Reading Manuals in a Monte-Carlo Framework (PDF) explains how a computer program succeeds in playing Civilization II using the official game manual as a strategy guide. This manual uses a large vocabulary of 3638 words, and is composed of 2083 sentences, each on average 16.9 words long. With this, the program improves its success rate from 45% to 78% in playing the game. No prior knowledge of the language is used."
This discussion has been archived. No new comments can be posted.

  • by AndyAndyAndyAndy (967043) on Wednesday July 13, 2011 @01:20PM (#36751222)
    All Civilization-franchise manuals soon to be confiscated and destroyed in the name of national security.
  • in schools? Get kids reading decent manuals (text-books) and perhaps they may actually learn something and find they can do decent things with the new-found knowledge.
    • by vlm (69642) on Wednesday July 13, 2011 @01:48PM (#36751738)

      in schools? Get kids reading decent manuals (text-books) and perhaps they may actually learn something and find they can do decent things with the new-found knowledge.

      This probably dates me quite accurately, but pretty much, Infocom taught me how to read and type.

      It has its side effects, saying "inv" when I look in my wallet, saying "save" before I do something dangerous, but overall it worked pretty well.

      • I was once slicing potatoes. I had always done this by hand, but I had one of those new plastic cutting boards with the blade built in that I wanted to try. I loved the speed, but I didn't know just how fast it cut. I would do half a potato, then switch to holding down the potato with a hand held plunger thingy. The plunger was really hard to hold, so I'd go longer and longer without it. By the 5th potato (the last), I went so fast and so far that the pain of slicing a millimeter off the tip of my thumb wa

        • You should have sued the maker for not including a warning:
          Warning! This device does not have an undo function!
          SCNR :-)

        • by Quirkz (1206400)
          I remember once, after having spent a couple of very lengthy days working in Photoshop, making a silly real life mistake (spilling a glass, perhaps) and not only thinking "undo" but actually reaching out to the kitchen table top and tapping the space where my hands thought ctrl-Z would be, if there had been a keyboard in front of me.
        • by retchdog (1319261)

          it's called a mandoline. they're not particularly new.

    • by Dyinobal (1427207)
      I learned to read playing the Original Dragon Warrior on NES. I've always felt games in general get a bad rap when it comes to their educational value. Even games that aren't obviously educational often have some value to them. Granted, some more than others. A text-heavy RPG can really improve someone's vocabulary as well as their reading, whereas a first-person shooter may not have much in the way of obvious education.
      • by Nadaka (224565) on Wednesday July 13, 2011 @01:59PM (#36751870)

        Indeed, I make daily use of the differences between a partisan, ranseur, glaive, guisarme, glaive-guisarm, guisarm-glaive, lucern hammer, military fork, volge, etc.

        • by TDyl (862130)
          Too true, as do I, but as a member of a medieval recreation society I guess I need to.
        • by dkleinsc (563838)

          Well, it means you get an entirely different meaning from the phrase "partisan hack"

        • by Verteiron (224042)

          Actually if the GP learned to read on the original Dragon Warrior, then he probably learned the difference between a PARTSN, RANSR, GUISA, GLAIVE, LHAMMR, and MFORK.

        • by Daetrin (576516)

          Indeed, I make daily use of the differences between a partisan, ranseur, glaive, guisarme, glaive-guisarm, guisarm-glaive, lucern hammer, military fork, volge, etc.

          SRSLY? Where do you shop? I've been having some issues at my local store. [giantitp.com]

      • by gstoddart (321705)

        I learned to read playing the Original Dragon Warrior on NES.

        Wow, as someone old enough to have had to learn to read with books ... just wow.

        I can't imagine having learned to read on a video game ... we had Dr. Seuss and "Little Golden Books" and the like.

        Rocks and snow, uphill, both ways ... we had it tough I tell you. ;-)

        • Re: (Score:3, Funny)

          by Anonymous Coward

          Rocks and snow, uphill, both ways ... we had it tough I tell you. ;-)

          That's what we get for letting Escher on the city planning committee...

        • Hey old-timer! You do realize that if you have a path that is uphill both ways, that you also could've gone downhill both ways, right?
      • by hitmark (640295)

        Heh, i am tempted to claim i learned English by reading RPG books. Well, that and trying to play computer games with interfaces and manuals in said language. At least i feel i picked up more of the language that way than i ever did trying to memorize lists of words in school.

    • My first thought was to use this as a way to gauge the effectiveness of educational texts.
  • by Anonymous Coward

    I wonder how good that algorithm is at identifying people using their style, grammar and errorrs. Think Facebook and Google+.

    • their style, grammar and errorrs.

      Intentional? I would guess that's a rare one; maybe this AC could be found as he describes.

  • If no prior knowledge of the language is used, how is the program able to determine word boundaries? Perhaps they meant the domain-specific language (meaning vocabulary), for example "unit" and "cell"? Otherwise, there's a plethora of bits of knowledge about English grammar and structure that they probably coded into the AI...

    I also find it junk that the web.mit.edu link posts a screenshot of Civ 5, when the AI they are discussing runs against Civ 2... My bullshit meter is starting to tickle.

    • Re: (Score:2, Interesting)

      by Anonymous Coward

      Right, so because the journalist of the article picked the wrong screenshot, and because they likely told their software that whitespace separates words, it must be bullshit.

      Standard slashdot loser, trying so desperately to degrade the efforts of others to make himself feel better about his dead-end IT job.

      • Right, so because the journalist of the article picked the wrong screenshot, and because they likely told their software that whitespace separates words, it must be bullshit.

        Standard slashdot loser, trying so desperately to degrade the efforts of others to make himself feel better about his dead-end IT job.

        Taken from the article itself:

        But what would it mean for a computer to actually understand the meaning of a sentence written in ordinary English — or French, or Urdu, or Mandarin

        So if they're coding that "whitespace separates words", then any text written in Mandarin will consist of sentences with one single word? Mandarin and many other Asian languages (other Chinese dialects, Korean, Japanese, Thai) do not use whitespace to indicate word boundary.

        I won't find language AI interesting until we have true language learning. Sure, this may be better than previous attempts at language AI, but when there are limiting assumptions built into the foundation of

        • They do use separations, but in their own way. Each character is a self-contained unit, separated from the others by being a different character. Each character is comprised of 5 or so different sections, each with its own function.
          • They do use separations, but in their own way. Each character is a self-contained unit, separated from the others by being a different character. Each character is comprised of 5 or so different sections, each with its own function.

            You're partially right.

            In Mandarin, each character is a self-contained unit, and is separate from others around it. The problem, though, is that one character is not always a complete word. If you look character-by-character, you'll break down multi-character words like "shou ji" (cellphone) to "hand" and "machine".

            Further, there isn't one single way of constructing a character in Chinese; there are 6 ways. The only consistency is that in some ways, there are radicals that can be used to glean the general m

            • by HiThere (15173)

              My impression, from just LOOKING at Hiragana(?? The ideographic Japanese printing), was that they DID use spaces to separate words. And tended to use them frequently. Also to separate sentences, but larger spaces. (I'll grant that frequently the separation was left out, but not always. My guess was that it was put in to disambiguate.)

              OTOH, the layout also seems to be a bit inconsistent. Even to sometimes being vertical and sometimes being horizontal. (I'm assuming that vertical is the older form, and

              • My impression, from just LOOKING at Hiragana(?? The ideographic Japanese printing), was that they DID use spaces to separate words. And tended to use them frequently. Also to separate sentences, but larger spaces. (I'll grant that frequently the separation was left out, but not always. My guess was that it was put in to disambiguate.)

                OTOH, the layout also seems to be a bit inconsistent. Even to sometimes being vertical and sometimes being horizontal. (I'm assuming that vertical is the older form, and that horizontal is much more recent. The same may be true for spacing.)

                Can I ask what Hiragana you're looking at?

                If you go to yahoo.jp, you'll see lots of Hiragana, Katakana, and Kanji all mixed together. I didn't see many spaces between words or sentences. Note that Hiragana and Katakana are both phonetic alphabets, while Kanji refers to characters borrowed from Chinese. I've studied much less Japanese than Mandarin, but my understanding is that one Japanese word, depending on conjugation, can include one or more Kanji characters followed by any number of Hiragana letters.

                • by HiThere (15173)

                  It's a legitimate question, but I can't answer, as the particular example I'm thinking of was from a printed newspaper whose name I couldn't read. But I saw it just a few days ago in Oakland, CA...which, of course, helps LOTS when you're trying to confirm a report.

                  I saw it on a bus, and didn't see who left it. (I suppose there could have been Katakana and Kanji mixed in, as I can't tell the difference between them. I just know it wasn't roman letters. But I do recognize spaces. [Sorry, Hirigana was the

            • by GooberToo (74388)

              like "shou ji" (cellphone) to "hand" and "machine".

              What does that matter in context? Doesn't sound like it matters at all. If the computer infers "hand machine" to mean signal and cause an action, what does it matter if, "cellphone", is technically more accurate for you and me? AFAICT, it doesn't matter one bit.

              • The actual meaning doesn't matter. What matters is the fact that Mandarin requires special processing in order to determine the boundaries of a word. If one word (eg, "cellphone") gets broken out to two words ("hand" and "machine"), there is a much greater probability for it to infer the wrong meaning of the word(s).

                An equivalent example in English might be "preposition"; breaking it down a bit we can get "pre" and "position". If the computer correctly infers that "pre" means "before" and position means "lo

                • We actually had this problem on a class project trying to process brand names & part numbers from English text, where spaces are separators or not at the whim of the manufacturer. Compound words make English word demarcation not quite so clear-cut, but granted, still easier than Chinese or Japanese.
        • by jpate (1356395)

          So if they're coding that "whitespace separates words", then any text written in Mandarin will consist of sentences with one single word? Mandarin and many other Asian languages (other Chinese dialects, Korean, Japanese, Thai) do not use whitespace to indicate word boundary.

          Look at the proceedings of any major NLP conference in the last five years (e.g. ACL 2011 [acl2011.org] or EMNLP 2010 [ohio-state.edu]) and you'll find a number of papers on unsupervised word segmentation.

          I won't find language AI interesting until we have true language learning. Sure, this may be better than previous attempts at language AI, but when there are limiting assumptions built into the foundation of the code, I find it hard to believe that it will ever be able to "learn" any language.

          Do you mean that? You won't even find AI interesting until we have solved the entire problem of language acquisition? I don't know about you, but problems strike me as much less interesting once we have solved them, and consider progress towards that solution extremely interesting.

          • So if they're coding that "whitespace separates words", then any text written in Mandarin will consist of sentences with one single word? Mandarin and many other Asian languages (other Chinese dialects, Korean, Japanese, Thai) do not use whitespace to indicate word boundary.

            Look at the proceedings of any major NLP conference in the last five years (e.g. ACL 2011 [acl2011.org] or EMNLP 2010 [ohio-state.edu]) and you'll find a number of papers on unsupervised word segmentation.

            Thanks for the links; it is interesting to see what NLP conferences are talking about. That said, I didn't see many articles on unsupervised word segmentation...

            I won't find language AI interesting until we have true language learning. Sure, this may be better than previous attempts at language AI, but when there are limiting assumptions built into the foundation of the code, I find it hard to believe that it will ever be able to "learn" any language.

            Do you mean that? You won't even find AI interesting until we have solved the entire problem of language acquisition? I don't know about you, but problems strike me as much less interesting once we have solved them, and consider progress towards that solution extremely interesting.

            I meant that I don't find language AI interesting when it starts learning with coded assumptions about the language. I just don't think it will be useful beyond the specific case they're programming for. I'd be a lot more interested in this /. story if they showed the win percentage increase across multiple languages.

    • by vlm (69642)

      I also find it junk that the web.mit.edu link posts a screenshot of Civ 5, when the AI they are discussing runs against Civ 2... My bullshit meter is starting to tickle.

      Even worse if you read the paper, they ran it on freeciv. I'm sure the screenshots would show Civ5 on Vista, of course.

      • They first tried to run it on Civ 5, but the program started by reading the license, and then refused to continue. :-)
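
A small illustration of the word-boundary point argued in this thread, as a hedged sketch: splitting on whitespace works for a spaced English sentence but returns a single token for unspaced Mandarin, and splitting per character breaks "shou ji" (cellphone) into "hand" and "machine". The sample sentences, the one-entry lexicon, and the function names `whitespace_tokens` and `max_match` are all made up for this example. The greedy longest-match segmenter relies on a hand-supplied lexicon, so it is not the unsupervised segmentation jpate refers to, and none of this is taken from the paper.

```python
def whitespace_tokens(text):
    """Split on runs of whitespace, as a naive tokenizer would."""
    return text.split()


def max_match(text, lexicon, max_len=4):
    """Greedy longest-match segmentation against a known lexicon.

    Falls back to single characters when no lexicon entry matches.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


english = "build a city near the river"   # hypothetical sample sentence
mandarin = "我的手机坏了"                   # "my cellphone is broken", no spaces

print(whitespace_tokens(english))    # six separate words
print(whitespace_tokens(mandarin))   # one token: the whole sentence
print(list(mandarin))                # per-character split separates 手 (hand) and 机 (machine)
print(max_match(mandarin, {"手机"}))  # keeps 手机 (cellphone) together
```

The lexicon lookup is the "special processing" mentioned above: without some source of word knowledge, neither whitespace nor character boundaries recover the word "cellphone".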

  • Every other word out of its "mouth" would be 'anal'

  • by jfengel (409917) on Wednesday July 13, 2011 @01:26PM (#36751344) Homepage Journal

    Computers have always been good for doing tedious jobs that people don't want to do.

    Like reading manuals.

    • by vlm (69642)

      Computers have always been good for doing tedious jobs that people don't want to do.

      Like playing Civ. Ow the burn. Just kidding, I really like Civ. The recent versions have too much touchy feely timefilling with animations and readers, but they're still tolerable. Still not sure what to think about the recent square to hex conversion.

      • by tmosley (996283)
        Hex is a major improvement in my opinion. I don't know about only having one unit per "square", though I do like that it makes combat tactical. But sometimes I would rather just let the computer handle that, or design a unit formation and have that whole unit take up a square, with a limit to the number of soldiers/artillery/whatever that could fit in the area. That way you could set up spearmen guarding archers with cavalry on the flanks for ancient warfare in open fields, or a core tank unit with infantry in
        • I like the simplicity of not having to worry about formations in Civ.

          If you want formations and really in-depth strategy for battles, check out the Total War franchise of games. My little brother played Rome: TW endlessly, and my understanding is that Rome:TW is the best one in the franchise.

    • by Anonymous Coward

      Computers have always been good for doing tedious jobs that people don't want to do.

      Like reading manuals.

      Or playing Civilization II.

    • by Quirkz (1206400)
      Problem being, what happens when only computers read manuals and then something goes wrong with the computer? Who reads the manual telling you how to fix the computer?
  • Before "gaming" became synonymous with exclusively FPS "if you can see it, shoot it", there used to be all kinds of games available, often with interesting manuals.

    Needless to say, the downloaded copies were better than store bought, because they didn't have copy protection / DRM, but obviously they didn't have the manual that came in the box from the store.

    Section 6 of the paper seems to imply that even the most illiterate fool would still win about 30% more games by having a copy of the manual, no matter

    • by McGiraf (196030)

      pdf

      • by Haedrian (1676506)

        It's not nearly the same thing, is it?

        I used to like reading the manual while waiting for the 'loading' screen to disappear.

    • by nedlohs (1335013)

      If only the games industry made non-FPS game

      And yet somehow I've managed to buy dozens of games that are non-FPS in the last few years. Must have been made by aliens I guess.

    • by Dachannien (617929) on Wednesday July 13, 2011 @01:47PM (#36751708)

      Section 6 of the paper seems to imply that even the most illiterate fool would still win about 30% more games by having a copy of the manual, no matter how illiterate they are.

      I just like to look at the pictures.

    • by brit74 (831798)
      If only the games industry made non-FPS games, then they could use this to motivate people to buy the game with the manual, instead of just downloading...
      In general, games are designed to function without a manual. Why? Because a lot of people don't bother reading the manual - so game developers get better sales with an easy-to-learn game that requires no manual reading. I can't remember the last time I read a game manual. I think it might've been Civ 3, because I need to find out more detail about ho
      • by Raenex (947668)

        In general, games are designed to function without a manual. Why? Because a lot of people don't bother reading the manual - so game developers get better sales with an easy-to-learn game that requires no manual reading.

        Just to add to this, what changed was that developers got a clue and made learning how to play part of the game. It's pretty much standard for games now that the opening levels are a tutorial.

        It wasn't that people just didn't read the manuals, it's that even if they did, it's much more fun to be taught in-game.

        • by gl4ss (559668)

          i bought and read the new issue of edge lately. game developers are proud their games can be played by retards without hitting a wall.

          • by Bucky24 (1943328)
            "The moment you make something idiot proof someone will just make a better idiot"

            There will always be someone hitting the wall.
    • Needless to say, the downloaded copies were better than store bought, because they didn't have copy protection / DRM, but obviously they didn't have the manual that came in the box from the store.

      You'd think it might have been easier to have the computer use the same technique we used to do in that situation - try every key one by one until you figured out how the game worked.

    • by gman003 (1693318)

      If only the games industry made non-FPS games

      That's odd, I've played dozens of non-FPS games. Arkham Asylum. Mass Effect. Portal. Dragon Age. Final Fantasy. Assassin's Creed. Sure, the FPS is popular, but no more so than the platformer was in the early 90s.

      Further, more and more games are blurring genres. Mass Effect is combining the RPG and the third-person shooter, Borderlands is almost a dungeon-crawler at times, and pretty much every game has some sort of RPG mechanic. People are making hybrids of genres normally left alone - platformer-shooters,

      • by gl4ss (559668)

        you seem to confuse tunnel runs with rpg's, rpg's with animations, animations with ultra realistic, and ultra realistic with just plain unfinished.

        tho, fear2 is just blood 2 with added press-ctrl-quickly fuckings to player(and half a game from philips cdi).

  • by idontgno (624372) on Wednesday July 13, 2011 @01:28PM (#36751374) Journal

    Mostly the kind of language you don't use among polite company.

    Call me when computers learn to swear idiomatically and emotionally appropriately.

    • Re: (Score:2, Funny)

      by Anonymous Coward

      Well, double dumb ass on you!

    • As far as children learning to swear goes, I learned from my dad. One of my first complete and correct sentences was "Oh fuck this!". I was about one and a half years old and trying to put together a roof rake (it was the summer); there was a screw and wing nut to hold it together and I just couldn't get it together. So after a little bit I got frustrated and, with a piece of the rake in each hand, held it up in the air, proclaimed "Oh fuck this!", threw it to the ground, and stomped off.
      • by idontgno (624372)

        It's amazing, isn't it?

        When my son was 2 years old, I discovered he had learned my tendency to mutter "Well, shit.." when encountering a frustrating delay in some process I'm doing (canonical example: I've bought the wrong fasteners for putting something together).

        He's playing with his Duplos, and he discovers he can't find the piece he needs to bridge two little pillars he's assembled... and he mutters "Well, shit" while shaking his head.

        I couldn't decide whether to be mortified or fall down laughing.

        • by Inda (580031)
          And I still remember the looks I got when I said "Oh sugar bags!" as an infant.

          It was something my mother said too much.

          Everyone knew the secret - my mother was obviously saying "shit bags" - and I was confused.
  • Nice, now I have an AI to play against that isn't completely retarded.

    The next step is to get this sort of thing onto Battle.net.
    • by vlm (69642)

      Nice, now I have an AI to play against that isn't completely retarded.

      Read the paper. The "reading AI" developed at MIT was still crushed by the game-provided AI about half the time. Gives you an idea just how badly the game-provided AI plays.

      • by RJHelms (1554807)

        Nice, now I have an AI to play against that isn't completely retarded.

        Read the paper. The "reading AI" developed at MIT was still crushed by the game-provided AI about half the time. Gives you an idea just how badly the game-provided AI plays.

        Read the paper. That was before they had the "reading AI" read anything. After it read the manual, their AI beat the game-provided AI 78% of the time.

  • If you read the paper, you see that they are using FreeCiv, and not Civilization II.

  • Also from TFA: initially, its behavior is almost totally random.

    I have no idea what constitutes a "win" in this game, but if a "totally random" strategy can win 46% of the time it sounds a little cheesy. Sorta like life, I guess.

    • by Ezubaric (464724)

      The types of games they played are very constrained. It's only two civs on a very small map, and the only way the algorithm learns to win is a settler rush. It's not deep strategy.

  • I played Civilization 2 for years, and I still don't understand the rules.
  • How does it know it wins??

    I think emotion would yield better results because bad emotions, such as from losing, tend to make people try harder and not lose rather than sequentially try different strategies.

    I would make them think further than just choosing different pathways in the game, as well as learn from their mistakes.

    • by vlm (69642)

      How does it know it wins??

      I think emotion would yield better results because bad emotions, such as from losing, tend to make people try harder and not lose rather than sequentially try different strategies.

      I would make them think further than just choosing different pathways in the game, as well as learn from their mistakes.

      Page 3 algorithm 1, kind of neural network like learning feedback. Probably gets stuck in local maxima, should do simulated annealing.
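
For readers unfamiliar with the local-maxima remark, here is a minimal sketch on a toy one-dimensional objective. It assumes nothing about the paper's actual learning procedure: the objective `score`, the step sizes, and the linear cooling schedule are all invented for illustration. Strict hill climbing started on the lower peak stays there, while simulated annealing sometimes accepts worse moves early on, which gives it a chance (not a guarantee) of crossing the valley to the higher peak.

```python
import math
import random


def score(x):
    """Toy objective: a lower peak near x = -1 and a higher peak near x = 2."""
    return math.exp(-(x + 1) ** 2) + 2 * math.exp(-(x - 2) ** 2)


def hill_climb(x, steps=2000, step_size=0.1, rng=None):
    """Accept a small random move only if it strictly improves the score."""
    rng = rng or random.Random(0)
    for _ in range(steps):
        cand = x + rng.uniform(-step_size, step_size)
        if score(cand) > score(x):
            x = cand
    return x


def anneal(x, steps=5000, step_size=0.5, t0=1.0, rng=None):
    """Simulated annealing with a linear cooling schedule (an assumption).

    Worse moves are accepted with probability exp(delta / temperature),
    so early iterations can step downhill and leave a local maximum.
    """
    rng = rng or random.Random(0)
    best = x
    for k in range(steps):
        temp = t0 * (1 - k / steps) + 1e-9  # never exactly zero
        cand = x + rng.uniform(-step_size, step_size)
        delta = score(cand) - score(x)
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            x = cand
        if score(x) > score(best):
            best = x
    return best


start = -1.5
print("hill climbing ends at", hill_climb(start))
print("annealing best found ", anneal(start))
```

Starting from -1.5, the hill climber cannot cross the valley between the peaks because every accepted step must improve the score and no step is longer than 0.1.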

  • by paiute (550198) on Wednesday July 13, 2011 @01:47PM (#36751714)
    If this thing gets a copy of the Bible, we are boned.
  • It could learn that "bad" is a noun and "fail" is an adjective.

  • I learned a whole new language playing Halo 2 on Xbox live.

  • by Anonymous Coward

    Shouldn't this be "computer learns game by reading manual"? The computer didn't learn to speak, it just learned to play the game...

  • I guess they picked a Civ manual for a reason. I don't remember the Civ II manual, but I remember the original Civ manual - that thing was a brick, a few hundred pages, with an appendix which had most of the algorithms used in the game documented. Not surprised a bot could get better at playing the game with that kind of reference material!

    I wish someone was still publishing manuals like that.

  • It would be interesting to feed this thing a "Java for Dummies" or "Learn C# in 21 Days" book and see if it can start writing its own software. Maybe even throw in some books on AI and see if it can generate its own AI software and become self aware.

  • Before the AI takes over the world.

  • Shall we play a game?

    How about Global Thermonuclear War?
  • I can't believe that it had a manual that had enough information to actually boost your score. Documentation nowadays is usually pretty lame, and doesn't actually provide anything instructive.

    Kudos to the writers of the Civilization II documentation ... I bet if you tried it with a modern game manual, the computer's score would go down. ;-)

    This tells me they actually wrote a comprehensive guide, which was well written.

  • Have it play against me on the XBox. So it can learn profanity.
  • the amount of NLP required for this task is almost nil. since the backend is doing massive combinatorial search anyway, all that the "reading" does is bias that search to look deeper at combinations of keywords which occur in the manual. it just so happens that game manuals are very simply written (since their point is not to be stand-alone literature).

    for example, if i wrote a manual full of weird phrases like "it's pointless to consider strategies combining $foo and $bar", it would probably trip up this a
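
A rough sketch of the "bias the search toward keywords from the manual" idea as this comment describes it. This is an assumption-laden toy, not the paper's algorithm: the manual sentences, the action names, and the function `manual_weight` are hypothetical. Each candidate action gets a sampling weight that grows with the number of manual sentences containing all of its words, so a randomized search would try manual-supported actions more often.

```python
import random


def manual_weight(action, manual_sentences, boost=1.0):
    """Return 1.0 plus `boost` for each sentence containing every word of the action."""
    words = set(action.lower().split())
    hits = sum(1 for s in manual_sentences if words <= set(s.lower().split()))
    return 1.0 + boost * hits


# Hypothetical manual text and candidate actions, for illustration only.
manual = [
    "build a city near a river",
    "irrigate land next to the city",
]
actions = ["build city", "irrigate land", "disband settler"]

weights = [manual_weight(a, manual) for a in actions]
print(dict(zip(actions, weights)))

# A biased random pick, as one step of a randomized search might make.
rng = random.Random(0)
print(rng.choices(actions, weights=weights, k=1)[0])
```

Note that a scheme like this only rewards co-occurrence, which is exactly why a sentence such as "it's pointless to consider strategies combining $foo and $bar" would mislead it.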

  • A bit off topic, but I was always amused by the fact that when you play versus the computer 1v1 in Starcraft 2, the computer says "gg" when it realises that it can't possibly win.
    And then they surrender.

    I'm just waiting for the days when they start swearing at you and you can't tell the difference between AI and a person.

  • My first impression of the linked article is one of skepticism that they are really getting out of it what they think they are... While a computer program could certainly apply word relationships from an instruction manual to its interactions with a game program, presumably it has some method of characterizing and tracking word relevance as it "learns."

    That very characterization process may actually contain all the necessary "learning," and the actual text be irrelevant. The real test they need to
    • Or rather, perhaps the best text to try it with is the same instructional text words, but just scrambled. The fact that the instructional text contains the words expected to be encountered in the interface may be all that is relevant, and the conclusion that it is actually conveying some kind of information via the logical statements contained therein may be completely erroneous.
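
The scrambled-text control proposed here is easy to set up. A minimal sketch, assuming the manual is available as plain text: shuffle the words so that the vocabulary and word counts are unchanged while sentence structure is destroyed. The function name `scramble_words` and the sample text are invented for the example; whether the authors ran such a control is not stated in this thread.

```python
import random
from collections import Counter


def scramble_words(text, seed=0):
    """Shuffle all whitespace-separated words, keeping the same multiset of words."""
    words = text.split()
    rng = random.Random(seed)  # fixed seed so the control is reproducible
    rng.shuffle(words)
    return " ".join(words)


# Hypothetical stand-in for a passage of manual text.
manual_text = "build a city near a river then irrigate the land next to the city"
control_text = scramble_words(manual_text)

print(control_text)
# Same words, same counts; only the order differs.
print(Counter(control_text.split()) == Counter(manual_text.split()))
```

If the program's win rate with the scrambled text matched its win rate with the real manual, that would suggest the vocabulary alone, not the sentences, is doing the work.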
