Biotech Games Science

Gamers Outdo Computers At DNA Sequence Alignments

Posted by Soulskill
from the must-not-be-using-watson dept.
ananyo writes "In another victory for crowdsourcing, gamers playing Phylo have beaten a state-of-the-art program at aligning regions of 521 disease-associated genes from different species. The 'multiple sequence alignment problem' refers to the difficulty of aligning roughly similar sequences of DNA in genes common to many species. DNA sequences that are conserved across species may play an important role in the ultimate function of that particular gene. But with thousands of genomes likely to be sequenced in the next few years, sequence alignment will only become more difficult in future. Researchers now report that players of Phylo have produced roughly 350,000 solutions to various multiple sequence alignment problems, beating the accuracy of alignments from a program in roughly 70% of the sequences they manipulated."
This discussion has been archived. No new comments can be posted.

  • by Trepidity (597) <delirium-slashdot@@@hackish...org> on Monday March 12, 2012 @04:41PM (#39332039)

    I'm highly skeptical that these gamers are really using some un-automatable human-only deep skills, especially since they aren't exactly extensively trained in this game, not to the level of, say, good Go players. So the interesting question to me is not that they beat current algorithms, but whether data mining these hundreds of thousands of alignments can tell us something about how they're doing it. My guess is that there are some heuristics that can be mined from this data that would massively speed up search.

    That's a more general point about how these stories are always pushed, though, sometimes by media, sometimes by the researchers themselves. Imo the most exciting thing about successful uses of "human computation" isn't that we can harness people to do things, but that we can gain some large data sets that will make it so we don't have to get people to do them anymore. Or at least, that should be the baseline, imo: that humans can beat some hand-crafted algorithm is one thing, but can they beat machine-learned algorithms trained on those humans' own gameplay logs?

    • by K. S. Kyosuke (729550) on Monday March 12, 2012 @04:50PM (#39332163)
      Perhaps they could make it into yet another captcha: "If you want to download your porn movie, please align the following two DNA fragments." :) If people can be made to do OCR for others, why not DNA alignment?
    • by rish87 (2460742)
      I agree 100% with the sentiment of figuring out how the players make their decisions and using that as new heuristics. The MSA problem isn't that computers cannot get the optimal solution; the problem is doing it quickly. Given enough time, a computer will always outdo or match a human. What needs to be done is to improve the existing computational algorithms with heuristics learned from these players. Then we would get much better results at a much faster rate.
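      To put a rough number on "doing it quickly": a minimal sketch, assuming the textbook exact approach of one dynamic-programming table with one axis per sequence. The function name and the equal-length simplification are illustrative assumptions, not anything taken from the article or from Phylo.

```python
def exact_msa_dp_cost(num_sequences, seq_length):
    """Rough size of the exact dynamic-programming search for an MSA.

    Assumes num_sequences sequences, each seq_length long, aligned with
    one table cell per combination of prefix lengths.
    """
    # One cell for every combination of prefix lengths (0..seq_length each).
    cells = (seq_length + 1) ** num_sequences
    # Each cell is reached by advancing any non-empty subset of sequences.
    moves_per_cell = 2 ** num_sequences - 1
    return cells, cells * moves_per_cell


if __name__ == "__main__":
    for k in (2, 3, 5, 8):
        cells, work = exact_msa_dp_cost(k, 100)
        print(f"{k} sequences of length 100: {cells:.3e} cells, {work:.3e} cell updates")
```

      Both factors grow exponentially with the number of sequences, which is why practical tools settle for heuristics.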
      • by mug funky (910186) on Monday March 12, 2012 @07:57PM (#39334191)

        wouldn't the problem at hand be NP-hard? maybe that's why gamers are beating the algos?

        could this be a new way to "monetize" the internet? outsourcing hard problems for cash. with a cloud paradigm, it doesn't matter whether it's a cluster of computers or a crowd of aspies when the end result is the same.

        • by Anonymous Coward

          Yes, the problem is NP-hard. Computers can solve NP-hard problems, just the algorithms to do so are often too slow to be useful so approximation algorithms are used instead. The humans are competing against results generated by an approximation algorithm. Having humans do the computation is more or less a different approximation algorithm. Given enough time, a computer could simply work out the full solution, but the amount of computation would be way too high.

          Paying people for better results would be an in

          • by biodata (1981610)
            Is there another aspect to this, other than simply hardness? We can talk about exact solutions, and approximate solutions, but both are dependent on having some scoring metric that 'knows' what the correct solution is. In real alignment, we do not actually know what the answer is (assuming that the purpose of the alignment is to find the most likely evolutionary relationship between the bases in the sequences). When bases have been inserted/deleted and mutated, it is not necessarily possible to tell what
          • by mattack2 (1165421)

            Paying people for better results would be an interesting model. There is the problem of the perverse incentive to keep the algorithm secret though: if you come up with a better algorithm, you want to get paid to run it on instances, not tell the researchers so they can run it themselves.

            Though they could have a separate large payment for an algorithm. Sure, it wouldn't be as much as paying for the work over time forever, but the algorithm inventor/discoverer is betting that someone else doesn't come up wit

      • My hypothesis is that humans may have learnt how to find a path to parsimony. We have evolved to use resources efficiently, so finding stepwise approaches that use resources most parsimoniously would have been important. MSA seems like mostly a parsimony problem - what arrangement of bases most parsimoniously explains the likely evolutionary relationships. Typical computational approaches to this involve MCMC and various more or less random moves to try to find the most parsimonious solution. Humans are
      • by KDR_11k (778916)

        The basic visual sense of a large animal includes an insane amount of brainpower for pattern recognition, interpretation and such things. There's a reason even very dumb animals can maneuver through the world much faster than our smartest robots. The heuristics used by humans in tasks like that are likely backed by the enormous processing power of the brain when it comes to analyzing pictures and patterns so they may not be terribly useful for computers.

    • Cf. earlier summary about a similar achievement in protein folding [slashdot.org].
    • by Anonymous Coward

      I don't have the data to look through, but the general process of a human learning new rules can be described as a sort of 'brute cunning algorithm.' It starts as a brute force, but recognizes certain trends, assumes consistency, and then makes jumps, narrowing back until they find a peak. Each person will display a different balance of brute force and portion skipping, so with a large enough gamerbase, you will get a collection of results that includes local maximums and a good chance of the true maximum

    • by whydavid (2593831)
      Agreed. I would be interested to see what the researchers learned from this exercise in terms of improving MSA algorithms. Perhaps the performance of the human players suggests that aligning a small subset of the problem with a high quality alignment algorithm before completing the problem with a run-of-the-mill algorithm is the way to go. The fact that puzzles completed repeatedly were where the phylo solutions performed best would indicate that running this first algorithm repeatedly with some element
  • So can we extract any insights from this, and use them to improve diff?
  • ... a beowulf cluster of these!
  • A fantastic example of why the building blocks of human life should not be patentable and hidden away by pharmaceutical companies.

  • I just started playing and I am having slight trouble with it. These people must be geniuses!
  • Cured Lupus! 150G / Platinum Trophy
  • by whydavid (2593831) on Monday March 12, 2012 @05:43PM (#39332747)
    This is an interesting finding, but let's not get too carried away. If you read the article, you'll see that: a) The phylo-based alignments are partial solutions. They are simplified for the human user by leaving many orthologous sequences out of the alignment. This means there is another algorithm that finishes these partial solutions before they can be compared to solutions produced solely by algorithms. b) Only 36% of the _best_ phylo-based solutions, once completed, were better than the algorithms' solutions. This is still an improvement, but it DOES NOT suggest that humans are better than computers at multiple sequence alignment. If you were to ever try to solve a real MSA problem by hand, you would quickly understand how completely hopeless it is. In fact, even aligning 2 sequences of any appreciable length by hand is a chore. The problem here is the misguided title: "Gamers outdo computers at matching up disease genes" which should read: "Gamers + computer outdo computers only at matching up very small fragments of disease genes, some of the time"
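    For anyone who wants to see what aligning just two sequences involves, here is a minimal sketch of Needleman-Wunsch global alignment, the standard pairwise dynamic-programming method. The match, mismatch and gap values are arbitrary assumptions for illustration; they are not the scoring used by Phylo or by the program it was compared against.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against nothing costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for placing a[i-1] opposite b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell takes the best of three possible moves.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align two bases
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    total, top, bottom = needleman_wunsch("ACGT", "AGT")
    print(total)
    print(top)
    print(bottom)
```

    The table has one cell per pair of prefix lengths, so the work grows with the product of the two sequence lengths. That is tedious by hand but routine for a machine.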
    • by tibit (1762298)

      I'm not surprised. Their UI is disgusting, their scoring rules hidden behind a most amateurishly done video (they must expect you to write down fucking notes), and the whole project just seems in-your-face obnoxious. What a let-down :(

    • If you were to ever try to solve a real MSA problem by hand, you would quickly understand how completely hopeless it is.

      Nope nope nope [u-strasbg.fr]. From scratch, perhaps it looks daunting. But the big parts are actually pretty easy. I should stress that BAliBASE is used as a benchmark for new alignment programs, including MultiZ (which, btw, is actually a little old now.)

      • by whydavid (2593831)
        BAliBASE is a great reference, but all of the sequence alignments in the database were refined from algorithmically-derived alignments (implemented on computers) in the first place. I think it furthers my assertion that computers + humans > either alone when it comes to MSA. Certainly, the sheer scale of the data would prevent any sort of economic use of manual global alignment, even if the local alignments were best carried out by biologists. Again, my issue here is that the article gives the impress
        • Wholly agreed—but it should be emphasized that the mere existence of BAliBASE asserts that the trickiest part still requires direct intervention. There are precious few things in the universe that a computer can do that a human can't do more slowly or in smaller chunks, after all—and most of those are comparatively silly things like set voltages. I could, for example, implement ClustalW by hand, no sweat—just give me your favourite BLOSUM table, a few other parameters for gap size, a stack

  • Link to the English version that actually works:

    http://phylo.cs.mcgill.ca/eng/ [mcgill.ca]
  • I couldn't even solve one puzzle, so gave up.
  • Time on planet to optimize pattern matching algorithms --

    Humans: Millions of years

    Computers: Tens of years.

    Not sure there is a story, here...
