Chess Ratings — Move Over Elo

databuff writes "Less than 24 hours ago, Jeff Sonas, the creator of the Chessmetrics rating system, launched a competition to find a chess rating algorithm that performs better than the official Elo rating system. The competition requires entrants to build their rating systems based on the results of more than 65,000 historical chess games. Entrants then test their algorithms by predicting the results of another 7,809 games. Already three teams have managed to create systems that make more accurate predictions than the official Elo approach. It's not a surprise that Elo has been outdone — after all, the system was invented half a century ago, before we could easily crunch large amounts of historical data. However, it is a big surprise that Elo has been bettered so quickly!"
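For context, the benchmark the entrants are trying to beat comes down to two standard Elo formulas: an expected score from the rating difference, and a rating update after each game. A minimal sketch (the K-factor of 24 is an illustrative choice; real federations use several K values):

```python
def expected_score(r_a, r_b):
    """Standard Elo expected score for player A against player B.

    A 400-point rating advantage corresponds to 10-to-1 expected odds.
    """
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update(r_a, r_b, score_a, k=24):
    """Return player A's new rating after one game.

    score_a is 1 for a win, 0.5 for a draw, 0 for a loss.
    """
    return r_a + k * (score_a - expected_score(r_a, r_b))
```

Predicting a game then just means emitting `expected_score` for the two players, which is presumably what the "Elo Benchmark" entry on the leaderboard does.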
This discussion has been archived. No new comments can be posted.

  • umm (Score:5, Informative)

    by buddyglass ( 925859 ) on Wednesday August 04, 2010 @05:08PM (#33143698)

However, it is a big surprise that Elo has been bettered so quickly!

    Not really. Jeff Sagarin has had two systems of rating sports teams for a while now. One, ELO_CHESS, is based purely on win-loss, while the other, PURE POINTS, takes into account margin of victory. According to him, the latter is better at predicting future results. From his analysis:

    In ELO CHESS, only winning and losing matters; the score margin is of no consequence, which makes it very "politically correct". However it is less accurate in its predictions for upcoming games than is the PURE POINTS, in which the score margin is the only thing that matters. PURE POINTS is also known as PREDICTOR, BALLANTINE, RHEINGOLD, WHITE OWL and is the best single PREDICTOR of future games.

  • Submission error (Score:3, Informative)

    by TubeSteak ( 669689 ) on Wednesday August 04, 2010 @05:09PM (#33143706) Journal

Already three teams [kaggle.com] have managed to create systems that make more accurate predictions than the official Elo approach.

    1 EdR* 0.729125
    2 whiteknight* 0.731656
    3 Elo Benchmark* 0.738107 <-- The "official Elo approach"

    Maybe we're counting from zero and they forgot to put it on the leaderboard?

  • by Anonymous Coward on Wednesday August 04, 2010 @05:20PM (#33143836)

That number is "Root Mean Square Error," so lower is better.
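For anyone unfamiliar with the metric, here is a minimal sketch of RMSE over game predictions, assuming predictions are expected scores in [0, 1] and actual results are coded 0, 0.5, or 1 (the exact scoring details used by the competition may differ):

```python
import math

def rmse(predicted, actual):
    """Root mean square error between predicted expected scores
    and actual game results."""
    assert len(predicted) == len(actual)
    total = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(total / len(predicted))
```

A system that predicts 0.5 for every game would score 0.5 on a set of decisive games, so the leaderboard values around 0.73 per game suggest a different aggregation, but the "lower is better" direction holds either way.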

  • by mooingyak ( 720677 ) on Wednesday August 04, 2010 @05:32PM (#33143988)

    Since the Elo system is not designed to predict future performance (it's designed to capture current relative rankings), then is it really surprising that programs designed to predict future performance are better at it?

    And if my current relative rank is higher than yours, doesn't that imply that if we play each other I should win? If not, what purpose does the rank serve?

  • by databuff ( 1789500 ) on Wednesday August 04, 2010 @05:51PM (#33144194)
The data only shows results, so there's no scope for gauging the margin of victory.
  • Re:Submission error (Score:4, Informative)

    by databuff ( 1789500 ) on Wednesday August 04, 2010 @05:53PM (#33144232)
    The Elo Benchmark was submitted a second time. I wrote to Sonas about this. Apparently the rating system has to be seeded. He tried a different approach to calculating seed ratings and this performed better - pushing him one place higher in the rankings.
  • by phantomfive ( 622387 ) on Wednesday August 04, 2010 @06:19PM (#33144540) Journal
    You know, you're really asking for it when you take a small point that isn't even relevant to his main point and attack it. Sorry, YOU'RE WRONG!!!!! [gameknot.com].

    If you ever find yourself in a game where you can sacrifice all your pieces to get to that position, DO IT!
  • by Maarx ( 1794262 ) on Wednesday August 04, 2010 @07:07PM (#33144976)
Not to belittle what Microsoft did, but in the interest of giving credit where credit is due:

Here’s the problem with Battle.net 2.0: 2002’s Warcraft III: Reign of Chaos is one of the most underrated video games ever created. And that’s before you learn its online apparatus is the foundation for modern matchmaking, where Blizzard Entertainment should get royalties every time you brag about your Xbox Live TrueSkill rating. (Then again, I shouldn’t be giving Blizzard ideas right now.)

    Here’s how Warcraft III matchmaking worked: Everyone starts at level one. The maximum level is fifty. You play players within six levels of your own. Win five games, gain a level. Lose five games, lose a level. The penalty for losing is reduced during levels one to nine. Thus, players who win half their games will become level ten.

    It was simple and transparent. That was the hook, and people choked on it. It turned Warcraft III ladder play into what ICCUP serves for Starcraft players, a stomping ground so competitive that climbing the food chain gave you a shot at the guys who played for a living. That’s what a good online gaming system does.

    The quote comes from Battle.net 2.0: The Antithesis of Consumer Confidence [the-ghetto.org]. I would encourage you to read the entire thing, but for reasons completely unrelated to this thread.

  • by shimage ( 954282 ) on Wednesday August 04, 2010 @07:34PM (#33145254)
    Bullshit. Mistakes are roughly stochastic, ergo, there are random elements in chess players' performance. This is why chess matches involve more than just two games.
  • Elo Anecdote (Score:5, Informative)

    by afabbro ( 33948 ) on Wednesday August 04, 2010 @10:24PM (#33146324) Homepage

Not relevant specifically to this story, but I always laugh at the story of how a prisoner manipulated the Elo system via closed pool ratings inflation [wikipedia.org].

Short summary: said prisoner only played against other prisoners, whom he'd trained. Due to careful scheduling of the games, he rose from his true strength (probably sub-master) to being the second-highest rated player in the U.S. in 1996.

  • by Anonymous Coward on Wednesday August 04, 2010 @11:44PM (#33146714)

    The leaderboard changes over time, and also consider this:

Update: The team Elo Benchmark (see the leaderboard) uses the Elo rating system. Note, the method for creating seed ratings for Elo Benchmark is being refined, so don't be surprised if the benchmark improves a little in the competition's first week.

  • Re:Submission error (Score:3, Informative)

    by Martian_Kyo ( 1161137 ) on Thursday August 05, 2010 @02:15AM (#33147304)

    1 Elo Benchmark 0.723834
    2 EdR 0.729125
    3 whiteknight 0.731656

    So at this moment Elo is back on top.

    Could it be that people have been done some quickly jumpening to conclusions?

    I guess george [nanc.com] is working at /. now.

  • by Olivier Galibert ( 774 ) on Thursday August 05, 2010 @03:28AM (#33147538)

    No we don't. This is not the crawler you're looking for.

        OG.
