AI Games

Clever Clues Clobber Crossword Computer

Posted by Soulskill
from the blithe-banter-bamboozles-brainy-bots dept.
Hugh Pickens writes "Steve Lohr reports that an impressive crossword-solving computer program called Dr. Fill matched its digital wits against 600 of the nation's best human crossword-solvers, finishing only 141st at the American Crossword Puzzle Tournament in New York. 'I wish it had done better,' says Dr. Matthew Ginsberg, the creator of Dr. Fill and an expert in artificial intelligence. Dr. Fill typically thrives on conventional crosswords, even ones with arcane clues and answers; it solved one of the most difficult puzzles at the tournament perfectly. But the computer does poorly with clever clues based on puns or jokes, because humans and machines solve the crosswords very differently. Humans recognize patterns based on accumulated knowledge and experience, while computers make endless calculations to determine the most statistically probable answer. The computer program is literal minded, and tends to struggle on puzzles with humor, and puzzles with unusual themes or letter arrangements. Take this clue from a 2010 puzzle in The Times: Apollo 11 and 12 (180 degrees). The answer is SNOISSIWNOOW, seemingly gibberish. A clever human could eventually figure out that those letters, when rotated 180 degrees, spell MOON MISSIONS. Humans get the joke, while a literal-minded computer does not. 'Occasionally, Dr. Fill just doesn't get it,' says Ginsberg. 'That's my nightmare.'"
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward

    Can it manage ironic clues?

  • It's not Dr Fill's fault for not getting it. Clearly it's Dr Ginsberg who is not getting it.
  • by whyloginwhysubscribe (993688) on Wednesday March 21, 2012 @04:27AM (#39424649)
    If I see someone doing a crossword I usually say "I was stuck on a crossword the other day - the clue was 'very busy postman'". Eventually (sometimes it takes a while) they ask "how many letters" at which point you can say "hundreds!"

    I'm such a funny guy...

    Oh - another one is to say "seven up is lemonade"...
    • Re: (Score:2, Funny)

      by Anonymous Coward

      Perhaps unsurprisingly, one down is justifiable homicide.

    • by Inda (580031)
      I went to a funeral the other day. The deceased was a crossword compiler.

      He was buried 6 down and 4 across.

      ^^ you can have that one for free :)
  • by mapkinase (958129) on Wednesday March 21, 2012 @04:29AM (#39424661) Homepage Journal

    I'd luck to congratulate submitter on a clever title. Does not happen very often here.

    • by mapkinase (958129)

      *like. If somebody thought that this was some kind of word play and modded up because of that (that's the only explanation of modups I have: my typo somehow made my comment clever beyond my understanding), it was not.

    • by DaFallus (805248)
      I have to respectfully disagree. I find these over the top alliterative titles to be incredibly contrived and annoying.
  • ...film at 11. And extended debugging session afterwards.
  • Would the Apollo example really trip up a decently-written program though? I mean, my first thought was "Well, what if it had a fallback routine where it tries anagrams of possible answers?" so I have to imagine someone smarter than me has thought of that. I guess there's some limitation I'm not seeing...

    On second thought...what am I still doing awake at 5:48am commenting on a post about crossword-solving computers?!

    • by N1AK (864906)
      You could, and chances are that each time it hits something new like this it can be improved a little, so that any clue about rotation, flipping, etc. will check for this kind of skullduggery. However, just building in functionality to look for something that looks like the word if viewed backwards and upside down isn't automatically something you'd add to a crossword computer.
    • I don't think it's that simple. Of course you could write the program to include answers made of letters rotated 180 degrees. But how often does an answer like that happen? I'm not a big crossword person, but I'd guess it's pretty rare. Maybe 1 out of 10,000 - if even that much? So while you might get that one clue right, meanwhile you've expanded the set of possible answers to include numerous elements with an extremely low chance of ever being correct. In addition to the added expense of considering them,

      • Re:Poor example? (Score:4, Interesting)

        by bws111 (1216812) on Wednesday March 21, 2012 @07:58AM (#39425999)

        Here is one that just appeared this week (LA Times, I think):

        Clue: Hail Answer: DANTESINFERNO
        Clue: Poe Answer: FLATBROKE
        Clue: What you need to get the above two answers: SOUTHERNDRAWL

        Not sure how you make a routine to come up with those answers.

        • by operagost (62405)
          That clinches it... not going to bother doing a crossword again!
        • Christ! Those are... Those are just ridiculous... I guess what I didn't understand is what crosswords have become!

          Now I get what they mean by requiring some cleverness... I mean, there're PEOPLE who wouldn't get those! It makes a lot more sense now.
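For what it's worth, the rotation fallback discussed in this thread (the Apollo example) is cheap to sketch. The snippet below is only an illustration, not how Dr. Fill works: the `ROTATE_180` table and the `rotate_180` helper are made-up names, and the table assumes block capitals in which only H, I, N, O, S, X, and Z read the same upside down while M and W swap.

```python
# Hypothetical lookup table: what each capital letter looks like after a
# 180-degree rotation. Letters missing from the table have no rotated form.
ROTATE_180 = {
    "H": "H", "I": "I", "N": "N", "O": "O", "S": "S",
    "X": "X", "Z": "Z", "M": "W", "W": "M",
}


def rotate_180(entry):
    """Return the grid entry as it reads when turned upside down.

    Spaces and punctuation are dropped, because crossword grids hold
    letters only. Returns None if any letter cannot be rotated.
    """
    letters = [c for c in entry.upper() if c.isalpha()]
    if any(c not in ROTATE_180 for c in letters):
        return None
    # Turning the page over reverses the letter order as well as
    # flipping each individual letter.
    return "".join(ROTATE_180[c] for c in reversed(letters))


print(rotate_180("MOON MISSIONS"))
print(rotate_180("APOLLO"))
```

The catch raised above still applies: generating the rotated string is easy, but deciding when a clue calls for it, without flooding the candidate list with junk, is the hard part.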

  • All decent crosswords in the UK tend to be of the cryptic kind, rather than just needing a thesaurus most of the time. Writing answers backwards wouldn't be allowed, though, as the answer has to be an actual word. Here's one that a computer might struggle with.... V? (6,2,7) Answer: Centre of Gravity
    • by digitig (1056110)
      That would be considered an exceptionally poor cryptic clue, though, because one of the "rules" is that there should be some allusion to the -- or an -- actual meaning of the answer, however misleading. A better version of the clue would be "V is a source of strength".
      • A clue definition of '...V' is acceptable, if it follows on from a previous related question or answer. Even then it may seem related, but isn't. This will throw one off the scent, but the answer will often be part of an overarching theme. A theme often allows weaker definition cluing. Several other clues might include tangential references to science, but there will be one clue that is often referred to by clue number only, elsewhere in the puzzle - e.g. 'Amphibian is working by force' (6) - hinting at the ove

        • by digitig (1056110)
          I'm not sure, but I seem to remember that it was Araucaria who laid down the general rules that most British setters are nowadays expected to follow. In which case he probably feels somewhat at liberty to bend them :-)
    • The computer would definitely struggle with that one because you spelled 'Center' incorrectly. :)
      • by CastrTroy (595695)
        Not in the UK. Or Canada, or Australia. Actually I think the Americans are the only ones that have it wrong.
      • by tlhIngan (30335)

        The computer would definitely struggle with that one because you spelled 'Center' incorrectly. :)

        There is a nasty one inside the crossword app on the Nook Color (not sure if it's in other Nooks, but I'm guessing it is) where the answer is spelled "CENTRE". The problem is the down answers really want "CENTER" to make any sense (one of the down ones was "TENT" which became "TRNT").

        Not sure if it was a typo or not. And the puzzles have no identifier so you can point it out.

    • Another one an AI might struggle with:

      Lisping girl of legend (4) - (ans: 'Myth')

      How about some Cockney Rhyming slang?:

      Beehive in North London? (4, 6) - (ans: 'High Barnet')

      • by 91degrees (207121)
        A really good thesaurus might help with that second one.

        Use of cockney isn't uncommon so it would make sense to include both of those in the definition for "hairstyle".
  • by Anonymous Coward

    the computer needs help from Joe Piscopo [youtube.com]

  • I'd like to see if Dr Fill manages these two:

    HIJKLMNO (5)
    ___ (2, 3, 4, 1, 4)

    • by 91degrees (207121)
      HIJKLMNO

      Yes, I remember seeing that one. One of those clues that filled me with delight when I got it.

      ___ (2, 3, 4, 1, 4)

      To Not Have A Clue?
    • by dkf (304284)

      HIJKLMNO (5)

      That reminds me of this one:

      ABCDEFGHIJKMNOPQRSTUVWXYZ (4)

  • Redundancy (Score:4, Insightful)

    by Hentes (2461350) on Wednesday March 21, 2012 @05:50AM (#39424975)

    Not being able to guess a few words might not be a problem: skip them and solve the other ones. Once there are enough letters filled in, a computer can easily look up the matching words, and if there is more than one, even use a nonlinear approach. Even without any clues, a few words can't be that hard to brute-force.

    • by CastrTroy (595695)
      This is what I was thinking. Crossword puzzles should be pretty easy to solve if you can brute force the thing. If you can't solve something, solve all the other clues in the other direction, and you have an answer. Unless these contests don't actually solve whole puzzles, but rather are given a partial puzzle with some parts filled in, and have to answer particular clues. Also, are the contestants allowed to use dictionaries, thesauruses, and other reference materials? Because if they aren't it's even mo
    • by Hillgiant (916436)

      Not necessarily. The computer likely double-checks cross clues to verify any individual answer. If one of the answers starts to look like gibberish, it will throw a large swath of intersecting answers into doubt. If the gibberish answer crosses a large portion of the grid, the doubt can propagate through the entire puzzle.
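The "look up the available words once enough letters are in" idea from this thread can be sketched in a few lines. This is a rough illustration under stated assumptions: `candidates` is a made-up helper, `?` marks an unfilled square, and the tiny word list stands in for a real dictionary. It also shows the objection above: a themed gibberish entry matches no dictionary word, so the lookup comes back empty and its crossing answers start to look wrong.

```python
def candidates(pattern, words):
    """Return the words that fit a partially filled grid entry.

    pattern uses '?' for an empty square and a letter for a square
    already fixed by a crossing answer.
    """
    pattern = pattern.upper()
    return [
        w for w in words
        if len(w) == len(pattern)
        and all(p == "?" or p == c for p, c in zip(pattern, w.upper()))
    ]


# Hypothetical stand-in for a real word list.
WORDS = ["TENT", "TINT", "TEST", "TAUNT", "MOONMISSIONS"]

# Three of four letters known from crossings: two words still fit.
print(candidates("T?NT", WORDS))

# A rotated theme answer has no dictionary match at all.
print(candidates("SNOISS?WNOOW", WORDS))
```

When the result is empty, a solver has to choose between distrusting the crossing letters and accepting a non-word, which is exactly where the doubt starts to propagate.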

  • by Anonymous Coward

    Unfortunately for Dr Fill's creator, the problem of how to get the program to work with such unorthodox solutions is the same as getting it to think like a person. At a certain point, all AI questions become the same AI question: this is the very essence of Turing horizon, and all such efforts converge there.

    The program he wants to write is, sadly, doomed, as it will be impossible until such time as our species generates a true artificial consciousness with human intelligence, at which point the problem wi

  • by Anonymous Coward

    The crossword puzzle guy needs lessons from Watson, who clobbered several Jeopardy human champions [slashdot.org]. That show has clever categories and clues. Watson probably had more impressive computing power, but I doubt that was the issue. The Watson designers clearly had a better grasp of natural language, including humor-filled and storied language.

  • Humans recognize patterns based on accumulated knowledge and experience, while computers make endless calculations to determine the most statistically probable answer.

    What's different about that? I've often said, "The more I learn about AI, the more I think it lacks any intelligence at all. The more I learn about psychology, the more I believe that humans think just like an AI."

    Humans are also determining the most statistically probable answer. They just have a better algorithm for factoring humor into those statistics.

  • So Doctor Fill has the same ability to comprehend humour as Doctor Phil?

  • Isn't having a gibberish answer in a crossword puzzle like making up your own words in Scrabble? Doesn't the creation of a crossword puzzle have any rules? No wonder I often do poorly with them; I had no idea that they could be making up nonsense words.
  • ...were supposed to be composed of REAL words.

    WTF is "SNOISSIWNOOW"??

  • I can't blame the computer for not doing well on these; a lot of crossword puzzles are a puzzle of "guess what the creator was thinking", and not a puzzle of words and language. Quite frankly, I'm not interested in guessing what someone else happens to be thinking when they write down a clue like "blue, red, and big"; I find that fundamentally uninteresting and of no long term value.

    I have the same problem with many Mensa puzzles. A lot of them I can do, but puzzles that require deep and specific informat
