## Data Mining the Web Reveals What Makes Puzzles Hard For Humans

KentuckyFC (1144503) writes

*"The question of what makes puzzles hard for humans is deceptively tricky. One possibility is that puzzles that are hard for computers must also be hard for people. That's undoubtedly true, and in recent years computational complexity theorists have spent some time trying to classify the games people play in this way (Pac-Man is NP-hard, by the way). But humans don't always solve problems in the same way as computers, because they don't necessarily pick the best method or even a good way to do it. And that makes it hard to predict the difficulty of a puzzle in advance. Cognitive psychologists have attempted to tease this apart by measuring how long it takes people to solve puzzles and then creating a model of the problem-solving process that explains the data.*

*But the datasets gathered in this way have been tiny, typically 20 people playing a handful of puzzles. Now one researcher has taken a different approach by mining the data from websites on which people can play games such as Sudoku. That's given him data on the way hundreds of players solve over 2000 puzzles, a vast increase over previous datasets, and this has allowed him to plot the average time it takes to finish different puzzles. One way to assess the difficulty of a Sudoku puzzle is by the complexity of each step required to solve it. But the new work suggests that another factor is important too: whether the steps are independent and so can be attempted in parallel, or dependent and so must be tried in sequence, one after the other. A new model of this puzzle-solving process accurately reproduces the time it takes real humans to finish the problems, and that makes it possible to accurately predict the difficulty of a puzzle in advance for the first time. It also opens the way for other studies of human problem solving using the vast datasets that have been collected over the web. Indeed, work has already begun on the Sudoku-like puzzle game Nurikabe."*

## Human vs computer (Score:5, Insightful)

People do not do well with recursion. People do well at recognizing simple hidden patterns. Give us a simple pattern, and we can recognize it everywhere. The simplest example of this is optical character recognition, i.e. recognizing letters in a picture. There is an effectively unlimited number of fonts, yet humans can read them all, because we look for the pattern. Computers have major issues with this, and to get any real accuracy they do it more slowly than people do.

## Re:Sudoku's complexity (Score:4, Insightful)

He's looking at both complexity of the move and how many possible moves there are at each step.

It is much easier to find a valid move and solve a puzzle if there are 10 opening moves.

It is much harder if there is only a single path of 20 moves in a particular sequence.

A puzzle with 20 steps that must be done in order is much, much harder than a puzzle with 20 steps that can be done in any order, just like it would be much harder to solve a word search if you had to find the words in order.

This should be pretty accurate, as the time for a truly serial puzzle would be the sum of the times to solve each step, while in a puzzle with parallel steps each step would take only the average time to discover one of the possible next moves. On average, if a step has more than one solution, then finding one of the many solutions should be faster than finding the one solution.
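
To put rough numbers on that, here is a minimal sketch under assumptions that are mine, not the researcher's: spotting any one particular move takes an exponentially distributed time with the same mean, and the searches for different available moves are independent. Under that toy model, the expected time to find the first of `k` available moves is the mean divided by `k`. The function names are made up for illustration.

```python
from fractions import Fraction


def expected_serial_time(n_steps, mean_step_time=1):
    """Only one move is ever available, so the step times simply add up."""
    return n_steps * Fraction(mean_step_time)


def expected_parallel_time(n_steps, mean_step_time=1):
    """All remaining steps are available at once and can be done in any order.

    Assumption: with k moves available, the time to spot the first one is the
    minimum of k independent exponential times, whose mean is mean_step_time / k.
    Summing over k = n_steps, ..., 1 gives mean_step_time * (1 + 1/2 + ... + 1/n).
    """
    return sum(Fraction(mean_step_time) / k for k in range(1, n_steps + 1))


if __name__ == "__main__":
    n = 20
    print("serial:  ", float(expected_serial_time(n)))
    print("parallel:", round(float(expected_parallel_time(n)), 2))
```

With 20 steps and a mean of 1 time unit per move, the serial puzzle comes out at 20 units, while the fully parallel one comes out at under 4. Real puzzles sit somewhere in between, and the exponential assumption is only a convenience for the arithmetic.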