
The Problem With Metacritic

Posted by Soulskill
from the saturday-morning-game-developers dept.
Metacritic has risen to a position of prominence in the gaming community -- but is it given more credit than it's due? This article delves into some of the problems with using Metacritic as a measure of quality or success. Quoting: "The scores used to calculate the Metascore have issues before they are even averaged. Metacritic operates on a 0-100 scale. While it's simple to convert some scores into this scale (if it's necessary at all), others are not so easy. 1UP, for example, uses letter grades. The manner in which these scores should be converted into Metacritic scores is a matter of some debate; Metacritic says a B- is equal to a 67 because the grades A+ through F- have to be mapped to the full range of its scale, when in reality most people would view a B- as being more positive than a 67. This also doesn't account for the different interpretation of scores that outlets have -- some treat 7 as an average score, which I see as a problem in and of itself, while others see 5 as average. Trying to compensate for these variations is a nigh-impossible task and, lest we forget, Metacritic will assign scores to reviews that do not provide them. ... The act of simplifying reviews into a single Metascore also feeds into a misconception some hold about reviews. If you browse into the comments of a review anywhere on the web (particularly those of especially big games), you're likely to come across those criticizing the reviewer for his or her take on a game. People seem to mistake reviews for something which should be 'objective.' 'Stop giving your opinion and tell us about the game' is a notion you'll see expressed from time to time, as if it is the job of a reviewer to go down a list of items that need to be addressed -- objectively! -- and nothing else."
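The B- to 67 figure quoted above is what you get if letter grades are spread evenly across the 0-100 range. A minimal sketch of that arithmetic, assuming a 13-step ladder from A down to F; the ladder and the name `grade_to_score` are illustrative assumptions, not Metacritic's published conversion table:

```python
# Hypothetical evenly spaced grade ladder (an assumption for illustration,
# not Metacritic's documented table).
GRADES = ["A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D+", "D", "D-", "F+", "F"]

def grade_to_score(grade: str) -> int:
    """Map a letter grade to 0-100 by spacing the grades evenly."""
    position = GRADES.index(grade)       # 0 for the best grade
    step = 100 / (len(GRADES) - 1)       # about 8.33 points per grade step
    return round(100 - position * step)

print(grade_to_score("B-"))
```

Under this spacing a B- sits four steps below the top, which is why it lands in the high 60s rather than the low 80s a school-style scale would suggest.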
This discussion has been archived. No new comments can be posted.

  • by TheoGB (786170) <theo@gra h a m -brown.org.uk> on Wednesday July 18, 2012 @02:58AM (#40683097) Homepage
    Personally I think anything less than 7 out of 10 isn't worth my while bothering with. That's just me and the amount of time I have. Friends of mine, however, would give a film a 5 out of 10 and say it's still decent enough to stick on one night when you want something to watch. Even if Metacritic were showing exactly the score that we agreed was 'accurate', it wouldn't really matter. Aggregation of this sort is as good as doing it by eye yourself, surely?
    • by Anonymous Coward

      It really doesn't matter what scale you use or how you weight them. If you add enough similarly-weighted ratings together, the central limit theorem essentially guarantees that the result will have a normal distribution. So in the end, the mean is irrelevant; it's really only the (score-mean)/stdev that matters.

      • by Razgorov Prikazka (1699498) on Wednesday July 18, 2012 @03:13AM (#40683175)
        Obligatory XKCD reference: http://xkcd.com/937/
        • by Anonymous Coward

          For the love of Baal, please at least use <a href [xkcd.com], m'kay?

        • Does anyone else play the "Predict which XKCD this will be" game when Oblig XKCD links are posted?

          (Those that don't know the strip numbers off by heart of course)

          • Re: (Score:3, Funny)

            Yeah, I surely hope that Randall Munroe makes a cartoon on the RaspberryPi or Bitcoins, that would make the prediction a whole lot easier on a lot of /. stories :-)
          • by mooingyak (720677)

            Does anyone else play the "Predict which XKCD this will be" game when Oblig XKCD links are posted?

            (Those that don't know the strip numbers off by heart of course)

            Every time. I don't have the numbers memorized, but I do recognize 'recent' vs 'not recent' and that helps sometimes.

      • by Anonymous Coward

        Nobody rates lower than 0 or higher than 100 so it can't be a normal distribution because the tail would be cut off. This would primarily be a problem for large standard deviations or extreme scores.

        • The problem is a lack of agreement on what the median should be.

          A normal distribution works when we know what the average value is (the peak of the distribution curve). However, we first need a distribution of what people think "normal" is in order to come up with the best median for the distribution of grades for a product.

        • by mooingyak (720677)

          Nobody rates lower than 0 or higher than 100 so it can't be a normal distribution because the tail would be cut off. This would primarily be a problem for large standard deviations or extreme scores.

          Not just that, but there are numbers in the low range that just aren't going to get used. Who rates something as a 3 out of 100? What does that mean? It was absolutely terrible except for one relatively minor part that was done right?

    • This line of thought seems faulty, but I have to admit that I feel the same way for most scores reviewing sites and magazines deal out. A score of less than 7 out of 10 is reserved for seriously failed works, and typically these works never merit a recommendation. However, if anything below 7 is not worth experiencing, you essentially only have five possible scores: bad, 7, 8, 9 and 10. The rest of the scale is simply wasted.

      I always wondered if this is caused by school grades in youth influencing what peop

      • by TheoGB (786170)
        Certainly, I'm not suggesting it's logical for me to feel that way. But it's just how I am. With music I am more open than I was 10-15 years ago and I think it's all about how much time I have: I can stream music for free (legally) all day at work so I have a lot more time to check it out for myself, but I only get a few hours a day to spend watching films and reading. This means I don't really have time to risk watching a film rated 5 (unless clearly there is a personal rec, etc.) when I could be sifting
      • Re:Is that so? (Score:5, Interesting)

        by SpooForBrains (771537) on Wednesday July 18, 2012 @05:15AM (#40683753)

        I work for a review platform. We have decided that you only really need four ratings, Bad, Poor, Good, Excellent. We don't have a neutral option because really neutral tends to mean bad.

        Of course, quite a lot of our users (and our marketing department) seem to prefer stars. Because an arbitrary scale is so much more useful than simply saying what you think of something. Apparently.

        • by Warma (1220342)

          Might I ask what you review and for a link to the site (if there is one)? It would be nice to check out reviews using this philosophy.

        • by Zadaz (950521)

          See also the San Francisco Chronicle's movie review "little man" [austinkleon.com]. Not a star or a thumb, but a very clear indication of what the reviewer thought. Is the little man applauding or asleep. Or is his little movie theater seat empty?

          Stars are awful and should never be used. There's a reason YouTube dropped stars in favor of thumbs up/down. There was a serious bathtub curve on the stars, putting a huge percentage at 1 or 5 stars, and only a few 2,3 or 4 stars.

          That's vastly different than how I use stars. But w

        • Does every one of your reviews give the 4 different possible values - in correct order - and then list what the reviewer gave the current game? Without that, having a word like 'Good' is so bland/overused as to be meaningless. Context matters, and stars give a much better, immediate view of that kind of rating.

          Of course I hate that most places disallow zero-stars :p

    • Re: (Score:2, Offtopic)

      by jellomizer (103300)

      We often think of using the school level of grading.
      Below 60 F (0.0)
      60-69 D (1.0)
      70-79 C (2.0)
      80-89 B (3.0)
      90-100 A (4.0).

      Of course school grading has gotten corrupted to just mean: less than a C, you don't go to college; C to B, you can go to a normal college; A, you go to a well-known college.

      However it should be the following...
      F is a Failure where they should take the class over again.
      D They can advance to the next class... However they are at high risk of failing the next class, you should have the op

      • It may just be that I am also used to this sort of grading system (well, the corrupted one where I would have been aghast upon receiving a high school C)--but for really in depth reviews of something like a game that you might spend 20-40 hours on, I think this is the way to go.

        Sure they can be mapped to real numbers but let's face it, 5 and below are not useful to me. If a game gets an F, then I am probably not going to play it. A score of 1 is almost unreasonable...it's like they managed to ship me a bla

  • This would be perfect for machine learning. Just analyze _all_ the scores from a certain source and calculate a most probable score with a standard deviation. Then assign a score from 0-100 accordingly. I don't know that much about machine learning so I'm sure an expert could find a way better algorithm for that.

    "Trying to compensate for these variations is a nigh-impossible task" - definitely not.
    • by bWareiWare.co.uk (660144) on Wednesday July 18, 2012 @04:18AM (#40683495) Homepage

      Your description is not so much machine learning as basic math. If you just want each scoring system to have equal weight on the results then compensating for the variation is trivial.

      Where machine learning would come in is to find underlying patterns in the data.

      This could be used to weed out reviewers who lazily copy scores, or are subject to influence.

      It would also allow them to test your scores for some games against the population of reviewers to find reviewers with similar tastes.

      You could also use clustering algorithms to find niche games which got a few really strong scores but whose average was really pulled down because they don't have wide appeal.
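The "find reviewers with similar tastes" idea in the comment above can be sketched without any machine learning library. This is a minimal illustration, assuming plain Pearson correlation over the games both parties have scored; the function names and the sample data are made up:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def most_similar_reviewer(my_scores, reviewer_scores):
    """Return (name, correlation) for the reviewer whose scores track
    mine most closely on the games we have both rated."""
    best_name, best_corr = None, -2.0
    for name, scores in reviewer_scores.items():
        shared = [game for game in my_scores if game in scores]
        if len(shared) < 2:
            continue  # not enough overlap to compare
        corr = pearson([my_scores[g] for g in shared],
                       [scores[g] for g in shared])
        if corr > best_corr:
            best_name, best_corr = name, corr
    return best_name, best_corr

# Made-up data: my scores are out of 10, the reviewers' out of 100.
mine = {"Game A": 9, "Game B": 4, "Game C": 7}
reviewers = {
    "reviewer_1": {"Game A": 90, "Game B": 50, "Game C": 75},
    "reviewer_2": {"Game A": 40, "Game B": 85, "Game C": 60},
}
name, corr = most_similar_reviewer(mine, reviewers)
print(name, round(corr, 3))
```

Correlation ignores the scale each party uses, so a 10-point user scale and a 100-point outlet scale can be compared directly.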

  • by wjh31 (1372867) on Wednesday July 18, 2012 @03:07AM (#40683151) Homepage
    Sounds in principle like a fairly simple solution. Put together a separate histogram of the scores by each reviewer. From this you can estimate what an average score really is and how many standard deviations an individual score is above or below. The meta-score then becomes the average number of sigmas the game is above or below the various averages. If necessary this score can be sanitised to something easier to read for those less familiar with Gaussian statistics.
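The per-reviewer normalisation described in this comment can be sketched as follows. It is only an illustration of the commenter's proposal, not anything Metacritic is known to do; the function name, the outlet names and the numbers are made up:

```python
from statistics import mean, pstdev

def standardized_metascore(history, reviews):
    """Average, over reviewers, of how many standard deviations each
    review sits above or below that reviewer's own average.

    history: reviewer -> list of all past scores from that reviewer
    reviews: reviewer -> score that reviewer gave the game in question
    """
    z_scores = []
    for reviewer, score in reviews.items():
        past = history[reviewer]
        spread = pstdev(past)
        if spread == 0:
            continue  # reviewer always gives the same score: no information
        z_scores.append((score - mean(past)) / spread)
    return mean(z_scores) if z_scores else 0.0

# Made-up data: one outlet that scores low, one that scores high.
history = {
    "harsh_site": [40, 50, 60],
    "generous_site": [70, 80, 90],
}
reviews = {"harsh_site": 60, "generous_site": 80}
result = standardized_metascore(history, reviews)
print(round(result, 3))
```

Here a 60 from the harsh outlet counts as well above its norm, while an 80 from the generous outlet is merely average for it, so the combined figure is modestly positive.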
    • by fph il quozientatore (971015) on Wednesday July 18, 2012 @03:37AM (#40683283) Homepage
      My thoughts exactly. This is not rocket science; it is sad to read that no one in their whole company seems to have a clue about statistics.
      • by Anonymous Coward
        Well it is extremely likely that Metacritic has a system like this implemented, but the article is just ignorant.
    • by F69631 (2421974) on Wednesday July 18, 2012 @03:45AM (#40683329)

      One reviewer might only rate highly hyped games which he expects to be good (nearly all fall in the 60-100 range) and another reviewer tries out pretty much everything he encounters to find those lone gems among less well-known indie games, etc. (let's say ranging from 20 to 95). We can't just take a bell curve of each and say "Game A is slightly above average on the first reviewer's scale and Game B is slightly above average on the second reviewer's scale... so they're probably about equally good!". Sure, with a large number of reviewers, you can still see which games do well and which don't, but you have lost at least as much precision as you would have if you hadn't taken the bell curve in the first place.

      That said, I don't know if reviews are that relevant anymore. I am an active gamer but don't remember when I last read a full review... There have been two times recently when I bought newer games from series I had played years ago (Cossacks and Anno 1602). I just wanted to take a quick peek at whether the games were considered about equally good, better or worse than the ones I had liked and whether they were very similar with just better graphics etc. or if some major concept had changed. That consisted mostly of looking the games up on Wikipedia and quickly glancing at the first reviews I found using Google. I think I also checked the metascore, but it was more along the lines of "I'll buy it unless it turns out to have a metascore under 60 or something". I didn't use that as an exact metric.

      Most games I buy are ones recommended to me by my friends, those recommended by blogs I follow (e.g., the Penny Arcade guys' news feed... you could consider those reviews, but they don't mention the games they hated, don't give scores, etc., just mention "Hey, that was pretty good. Try it out.") or those that just seem fun and don't cost much (When I noticed Orcs Must Die on Steam for under 5 euros, I didn't start doing extensive research on the critical acclaim of the game.)

      • I was thinking about this yesterday when I was looking through CmdrTaco's reddit AMA. If we wanted a truly fair system, couldn't we just do relational ratings? (It would probably be too much work server side, but it sure sounds good in my mind.) Have each reviewer rate a game in comparison to another game. Portal 1 better than 18 Wheels of Steel. Team Fortress 2 better than Data Jammers: FastForward. With enough people, everything would end up somewhere on the scale and it wouldn't all be squashed up toward
        • That sounds like a good idea. Give me a moment and I'll patent it.

          Seriously though, that idea is mostly good but I think that the passing of time might be a problem. Deus Ex was an awesome game when it came out, but if it were to come out today it wouldn't be all that good because video games have taken a huge leap forward in the last 12 years. So, should I vote it worse than nearly all new games and completely ignore the context, how much it influenced the genre, etc.? In general, it's very hard to compar

          • Surely enough games are released each year or two that you could chunk it up like that and only allow comparisons in that range.
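The relational-rating idea from this subthread, including the suggestion to only compare games released within a year or two of each other, might look roughly like this. It is a sketch under simple assumptions: ranking by win rate, placeholder game names, made-up release years, and a hypothetical `max_gap` parameter:

```python
from collections import defaultdict

def rank_by_comparisons(comparisons, release_year, max_gap=2):
    """Rank games by the share of pairwise comparisons they won.

    comparisons: iterable of (better, worse) game names
    release_year: game name -> year of release
    Pairs released more than max_gap years apart are ignored.
    """
    wins = defaultdict(int)
    played = defaultdict(int)
    for better, worse in comparisons:
        if abs(release_year[better] - release_year[worse]) > max_gap:
            continue  # too far apart in time to compare fairly
        wins[better] += 1
        played[better] += 1
        played[worse] += 1
    return sorted(played, key=lambda game: wins[game] / played[game], reverse=True)

# Placeholder games and made-up years.
years = {"A": 2011, "B": 2011, "C": 2012, "Old": 2000}
votes = [("A", "B"), ("A", "C"), ("C", "B"), ("A", "Old"), ("C", "B")]
ranking = rank_by_comparisons(votes, years)
print(ranking)
```

A plain win rate is the simplest possible scoring rule; a rating scheme that accounts for the strength of each opponent would be a natural refinement.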
    • Go ahead and do all the math and I'll bet that 95% of scores move less than 5%. There's really very little wrong with a simple average.
    • by Nemyst (1383049)

      And then the average reader of Metacritic doesn't understand how the score is calculated and cries foul every single time.

      Yeah, that'll work.

  • by Lando (9348)

    Metacritic, a scoring system that seems to work for a lot of people, isn't perfect? News at 11. Really? So someone writes an opinion piece on it backed with opinion and this is interesting? The methods Metacritic uses seem to be fair and work, so who cares that someone doesn't think they are perfect?

    I'll use the numbers as a guideline, but not as fact. Just like Wikipedia, it's a place to start. Now if the article was aimed at pointing out that the publishers put too much emphas

    • The problem is not Metacritic; it's the people who read it. They seem to take this whole critic thing very personally. Like you said, it's not perfect, but people don't seem to get that. On top of that, the Metacritic score can be "hacked" in some way, to my knowledge, just like any other score built on an average. Just think about it, it's simple math: give a game a super bad review and another one a good review and you get a decent score. I don't think this is good, if you ask me.
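The sensitivity of a plain average to a single extreme review can be shown with a few made-up numbers. This sketch only compares the mean with the median; it makes no claim about how Metacritic actually weights or filters reviews:

```python
from statistics import mean, median

# Made-up review scores: four positive reviews and one extreme outlier.
scores = [85, 88, 90, 86, 10]

average = mean(scores)    # dragged down by the single 10
middle = median(scores)   # barely affected by the outlier

print(f"mean={average:.1f} median={middle}")
```

One very low score moves the mean by more than ten points here, while the median stays with the bulk of the reviews.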
      • by Lando (9348)

        It just seems to me that the article in question wasn't trying to point out that people should use the Metacritic score with some reservation; instead it seems like it was just an attack on Metacritic itself, complaining that it has too much influence. If the article had approached it in a different manner, e.g. because of people's blind belief in Metacritic producers of software are more interested in a high Metacritic score than about the product itself, but here are reasons the Metacritic scor

  • by Riceballsan (816702) on Wednesday July 18, 2012 @03:27AM (#40683241)
    For me anyway, when it comes to finding good reviews of things, I've always found a mass average entirely useless. Just because 10,000 out of 15,000 like something, it has no bearing on whether I will like it, and quite often leads to the opposite. Instead what I do is check reviews of the games, movies, shows, etc. that I have already seen, and find the reviewers that are the closest match to how I felt about things in the past. Then I check their reviews of what I haven't seen. It isn't a perfect system, but it works overall, and tends to be more accurate to my tastes than other methods that I have tried. In addition, of course, I actually read detailed reviews with explanations of why they felt that way. If you are one who is looking for a game with a deep story and the review is 9/10 saying "Great explosions, incredible action at every turn, the graphics were spectacular, the story was a little weak but that is made up for by the incredible pace of the combat", odds are it isn't a good game for someone looking for a deep plot.
    • by Sycraft-fu (314770) on Wednesday July 18, 2012 @03:55AM (#40683375)

      I find it useful for that. If there's something you have little knowledge or information about it can give you a quick breakdown of what you might expect. For example if a game has a 90 metascore, you know it is something that you should probably look in to further, that is uncommonly high. If something has a 40 metascore, you can pretty much give it a miss, that is uncommonly low.

      What it'll then get you for things you do want to look in to further is a list of reviews. So you can see what sites have reviewed it, and then go and read the specifics if you wish. Along those lines it is a quick way to find good and bad reviews. When I'm on the fence about something I like to see what people thought was good and bad. I can then weigh for myself how much those matter to me.

      Average ratings really can be of some use to filter. I just don't give enough of a shit about every game to go and read multiple full reviews on it and research it. So if it isn't a game I was already interested in, I want a sort of executive summary to decide if I should give it any more time. Metacritic helps with that.

      Two recent examples:

      1) Endless Space. I had never heard of this game, an indie 4X space game apparently, though rather well developed. Ok, well, ambitious indie titles can be all over the map. Metascore is 78. That tells me it is worth looking at, it is on my list and I'll look at it more in depth when I feel like playing such a game.

      2) Fray, a turn-based strategy sci-fi game. Again, something I hadn't heard of, however a kind of game I like so maybe I'd be interested. Metascore of 32. So no, not wasting time on that.

      Other games I won't bother on the Metascore, just use it to find reviews. Like Orcs Must Die 2. Looking forward to that one, so I'll spend time researching it to see if I want to buy it. I liked the original enough it'll be worth looking at reviews, no matter what the score, so see if I think I'll like the next one.

      • by JMJimmy (2036122)

        Orcs Must Die 2 is a must buy - it's on sale 10% off right now too ;) (I LOVED the first one)

        Related to the article:

        It's just retarded. XKCD aside, it's worrying about the subjectivity of converted numbers when the initial number is completely subjective as well. If a reviewer is giving something say 3 out of 5 stars, maybe he thinks it's 3.12658 stars out of 5, maybe 2.9674... who cares!

      • by Anonymous Coward

        As a buyer of Endless Space, be careful. It's like a formalistic, spreadsheety MOO2 without combat, without charm, without espionage and poor diplomacy, and no real racial picks.

        This largely comes from ship design. There are three types of weapons, and they work like this: Beams 1 (100 damage), Beams 2 (200 damage), Beams 3 (400 damage), Beams 4 (600 damage), Beams 5, (1000 damage).. all the way to something like Beams 8. No special systems. No range triplers, no MIRV missiles, no black hole generators.

        Warl

      • An 80 game is still, when ranked by a bunch of people, better than a 70 game. On average.

        You can do all kinds of statistical analysis on each reviewer for each platform and each genre of game and make adjustments to them, but at the end of the day you still put the pig in the sausage grinder to get a single number. And that number will not be significantly different than what's there now and the ranking of games will not change.

        Compare Metacritic's movie review numbers to Rotten Tomatoes'. They both use dif

    • by thegarbz (1787294) on Wednesday July 18, 2012 @04:22AM (#40683509)

      Just because 10,000 out of 15,000 like something, it has no bearing over if I will like it and quite often leads to the opposite.

      So you're not interested in my remake of Twilight starring Justin Bieber?

  • Of course it can be wrong, but it's a good indicator of what others think of the game.

    Nothing on earth can tell you if YOU will like the game unless you play it.

    I personally sometimes enjoy playing terrible games, (or games with terrible reviews) and find them quite charming.

  • by Anonymous Coward

    They shouldn't worry, Slashdot beats Metacritic hands down for subjectively erroneous (mod) scoring.
    Mod me up for the hell of it.
    tnx.

  • This article is a classic example of why game and movie ratings are so terrible nowadays. Since when is a 67 a terrible score? In a proper scoring system it should be at least a pass. It seems sites tend to rate even trash within a range of 60-100. A B- or a 67 is NOT a terrible score; if scoring is done correctly it should be above average but not great. What is the point of having a 0-100 scale if you are not using the whole range?
    • by Sycraft-fu (314770) on Wednesday July 18, 2012 @04:06AM (#40683433)

      In most US schools, the scale is:

      A: 100-90
      B: 89-80
      C: 79-70
      D: 69-60
      F (or sometimes E): 59-0

      So while you can percentage wise score anywhere from 0-100 on an assignment and on the final grade, 59% or below is failing. In terms of the grades an A means (or is supposed to mean) an excellent grasp of the material, a B a good grasp, a C an acceptable grasp, a D a below average grasp but still enough, and an F an unsatisfactory grasp.

      So translate that to reviews and you get the same system. Also it can be useful to have a range of bad. Anything under 60% is bad in grade terms but looking at the percentages can tell you how bad. A 55% means you failed, but were close to passing. A 10% means you probably didn't even try.

      So games could be looked at the same way. The ratings do seem to get used that way too. When you see sites hand out ratings in the 60s (or 6/10) they usually are giving it a marginal rating, like "We wouldn't really recommend this, but it isn't horrible so maybe if you really like this kind of game." A rating in the 50s is pretty much a no recommendation but for a game that is just bad not truly horrible. When a real piece of shit comes along, it will get things in the 30s or 20s (maybe lower).

      A "grade style" rating system does make some sense, also in particular since we are not rating in terms of averages. I don't think anyone gives a shit if the game is "average" or not, they care if it is good. The "average" game could be good or bad, that really isn't relevant. What is relevant is do you want to play a specific game.

      • South Africa uses a different set of numbers, so the same grade letter means something different for us, and probably something else completely for other countries too.

        • A - 80% to 100%
        • B - 70% to 79%
        • C - 60% to 69%
        • D - 50% to 59%
        • E - 40% to 49%
        • F - 30% to 39%
        • G - 20% to 29%
        • H - 0% to 19%
      • by flitty (981864)
        Grades make the most sense for game reviews for me, or, put another way, converting a star system to grades is the best way to think of it. 0-star = 0-59%, 1 star = 60-69%, 2 star = 70-79%, 3 star = 80-89%, 4 star= 90-100%

        I find it very important to think of games this way because technical incompetence (much like failing comprehension or ability with grades) is such a large part of gaming. There needs to be that large bottom half of the scale to allow for certain technical failures, which render the g
    • by Zephyn (415698)

      It's called the Four Point Scale [tvtropes.org] for precisely this reason. Professional reviewers are very limited on how severely they can criticize the faults of a game when it comes from a big publisher.

      When the critic reviews of a AAA game are all 80+, but the user reviews are in sub 4.0 red text, tread carefully.

  • I'd never, ever let a metacritic score determine whether or not I buy a game. That's not to say I don't let reviews (and review scores) influence a purchase, but I find metacritic useless.

    When I check reviews of a game to work out whether I want to buy it, I'll look at the score, but it's only one small factor. What I'm actually looking for are certain factors that might be picked up in a review that will be highly likely to influence whether or not I like a game.

    For example, I hate - and I do mean absolute

  • Even though Metacritic has become the standard for measuring the quality of a game, they sadly do not check the quality or sincerity of the reviewers they pick. I myself work at a smaller indie game studio. Our last project got reviews ranging from 2 to 10. How that is even possible comes down to several factors, the main one being that some reviewers didn't really review the game at all. They just scraped at the surface of it, and Metacritic then used that score. Our game wasn't perfect
    • Our game wasn't perfect, neither was it crap. It is fun, addictive, beautiful

      That's completely subjective.

      But was it a 2 or a 10?

      Apparently to them, it was.

      • by metacell (523607)

        Well then, you could argue that their opinion isn't very relevant, if they only scraped the surface of the game. Reviews are for those who want to shell out their hard-earned cash on a game, and they probably want to go a little deeper.

      • But was it a 2 or a 10?

        Apparently to them, it was.

        Sure it was, but should they be at Metacritic? Anyone can rate anything without looking if they like to, but it will be of little or no use to anybody else.

  • They should give you an option to give more weight the later a review came out. I just find myself generally distrustful of 0-day reviews because they usually mean:
    a) The reviewer didn't actually spend enough time with the game to give it a meaningful review and/or
    b) The reviewer had access to the game early, which of course raises questions about objectivity.....
    The best reviews IMO are those that come out at least a week after the game's release....
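One way to picture the "more weight the later a review came out" option is a weighted average whose weights grow with days since release. This is purely a sketch of the commenter's wish; the weighting formula, its parameters, and the function name are all assumptions chosen for illustration:

```python
def late_weighted_average(reviews, days_per_unit=7.0, max_weight=3.0):
    """Weighted mean of review scores, favouring later reviews.

    reviews: iterable of (score, days_after_release) pairs
    A launch-day review has weight 1; each extra `days_per_unit` days adds 1,
    up to `max_weight`. (Hypothetical weighting, for illustration only.)
    """
    total = 0.0
    weight_sum = 0.0
    for score, days_after_release in reviews:
        weight = min(1.0 + max(days_after_release, 0) / days_per_unit, max_weight)
        total += weight * score
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0

# Made-up data: two glowing launch-day reviews, one cooler review two weeks in.
sample = [(90, 0), (90, 0), (60, 14)]
weighted = late_weighted_average(sample)
print(weighted)
```

With these made-up numbers the plain average would be 80, while the later, cooler review pulls the weighted figure noticeably lower.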
    • Well, it's not as though reviewers get the game the same day we do and then put them out the same day. They get them often weeks in advance, and save for really open ended games like Skyrim, most games can be thoroughly played in the window the reviewers are already given.

      • I should also point out that just because reviewers are given games early doesn't raise many questions about objectivity. It's pretty much how the entire industry functions. The reviews you need to take with a grain of salt are the pre-release reviews that come out some time before the game actually launches. Publishers often impose restrictions on what you can say before release, but they have no control over what is said after release. As long as the reviews come out the day the game launches, there shoul

    • I can't find the story now, but about a year ago a game publisher accused a reviewer of not playing a game much and used the server logs of the game to detail what the reviewer had and hadn't done.
  • And that is how maths for students works :-)

  • Diablo 3 on Metacritic is the 2nd highest rated current game.
    Don't take averages for truth, they're just averages. Use Metacritic as a source of reviews, find the reviewers (people) who you have the most affinity with over time, and then focus on what their own scores are.

    • by nschubach (922175)

      Sure, but if you look at the user scores, it's also in the lowest 5-7.

      http://www.metacritic.com/browse/games/score/metascore/year/pc?sort=desc&year_selected=2012 [metacritic.com]

      Now, if there were only 10-15 reviews I can understand that this number may not be accurate... but generally the fans don't like the game and that tells me it's probably not worth playing, especially Blizzard fans who are usually not critical to the mothership. Currently, the majority of user reviews are all negative, by almost 2:1.

      • by homb (82455)

        User scores are not reviewer reviews.
        Reviewers have experience and thoughtful analyses but suffer from small sample size and conflicts of interest.
        Users have the strength of numbers but suffer from groupthink and emotional coloring.

        The way I use MC is that I get a good feel for the game based on user score averages, then look at the reviewers that I like and analyze their pros and cons. Then I make a decision, based on whether the cons are bad enough for me or not.

        • by nschubach (922175)

          User scores are not reviewer reviews.

          Not sure where I said they were...

          Reviewers have experience and thoughtful analyses but suffer from small sample size and conflicts of interest.
          Users have the strength of numbers but suffer from groupthink and emotional coloring.

          In an entertainment medium I find emotional coloring an important metric, especially for companies that have such a huge following as Blizzard with such a rabid fan base. There are a few fans, but a majority are not happy with the game.
          Professional reviewers have ad revenues to look into. If they continually rate EA games poorly, EA will pull its advertising dollars from their income. Of course they are biased and there have been multiple... multiple stories and articles

  • Metacritic is good for avoiding games that are complete crap. Other than that, you really have to read some of the reviews to decide which game you will like more.

    I just played The Last Story (metascore 82) after buying it instead of Xenoblade Chronicles (92). After reading some reviews I was sure that I would prefer The Last Story even though Xenoblade has a greater score and the games are in the same genre. Xenoblade is usually praised e.g. for having lots of stuff to do, but I really wanted a game that I

  • I like Metacritic. I consult it when I'm considering a new game on Steam, and I don't have gameplay footage or word of mouth to convince me otherwise. However, I don't use the aggregated score as anything other than an average.

    There are links to all of the collated reviews. They're there so that you, the consumer, can perform your due diligence more easily. Personally, I like to scan the middling and bad reviews to get an idea of the kind of warts I should expect if I buy it. One reviewer's selling point ma

  • Fortunately, all smart, discerning, handsome, virile, gamers can still get their properly graded reviews from 1UP. Phew!
  • Rottentomatoes (Score:5, Interesting)

    by cheesecake23 (1110663) on Wednesday July 18, 2012 @05:35AM (#40683863)

    By that logic, Rottentomatoes (which averages reviews using only a binary fresh/rotten scale) should be utterly useless. Except it isn't. It's IMHO the most dependable rating site on the net.

    It seems the magic lies not in the rating resolution, but in the quality and size of the reviewer pool (100+ for Rottentomatoes). In other words, make the law of averages work for you.

    • Re:Rottentomatoes (Score:4, Interesting)

      by Dorkmaster Flek (1013045) on Wednesday July 18, 2012 @08:31AM (#40685145)
      Rotten Tomatoes uses a different system though. In fact, I really like their system. They look at a review and decide ultimately whether the critic enjoyed the movie enough to recommend it or not. It's like Siskel & Ebert's thumbs up or down system; fresh or rotten. The only factor is whether they enjoyed the movie or not. There's none of this trying to take a letter grade and turn it into a number from 1-100 bullshit. The Rotten Tomatoes rating is simply a percentage of the number of critics who liked the film enough to recommend it out of the total number of reviews, which I find much more useful. It's still no substitute for the most reliable method, which somebody else above mentioned: find a reviewer whose taste agrees with you on past films/games/whatever and see what they say about new ones. Rotten Tomatoes takes less time though.
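For what it's worth, the percentage described above is easy to sketch. This is only an illustration, assuming each review has already been boiled down to a recommend / don't recommend flag; the function name and the sample data are made up, and this is not Rotten Tomatoes' actual code.

```python
def fresh_percentage(recommendations):
    """Share of reviews that recommend the title, as a percentage.

    `recommendations` is a list of booleans (True = critic recommends it).
    The name and input format are hypothetical, chosen for illustration.
    """
    if not recommendations:
        raise ValueError("need at least one review")
    # True counts as 1 and False as 0, so sum() counts the recommendations.
    return 100.0 * sum(recommendations) / len(recommendations)


# Hypothetical data: 7 of 10 critics recommend the film.
reviews = [True] * 7 + [False] * 3
print(fresh_percentage(reviews))
```

Note that no letter-grade or star-rating conversion appears anywhere: a lukewarm recommendation and a rave count the same.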
      • by Magius_AR (198796)

        Rotten Tomatoes uses a different system though. In fact, I really like their system. They look at a review and decide ultimately whether the critic enjoyed the movie enough to recommend it or not. I

        How is that a good metric? It's _terrible_ granularity. I can't count the number of movies I was so-so about and had no strong opinion one way or another. And Rottentomatoes forces a "HELL YES!" or "HELL NO!" response with the thumbs up/thumbs down. Hell, if I'm just not in the mood one day for an action flic

    • Exactly, came here to say this.

      Rottentomatoes rating is not a rating of how good a movie is. Rather, it is how likely you are to enjoy it. It is a probability!!

      A movie with 10% on rottentomatoes doesn't mean it's a movie worth a 10 grade, it means that only a niche audience enjoyed it. So you're less likely to be part of that 10%, but it's absolutely possible you still love the film, for very specific reasons. Similarly, a movie with a 98% rating isn't necessarily the best movie or a very high quality film
    • by kamapuaa (555446)

      Look at how game reviews work though. Unless the game is a complete debacle, every game will get a positive review, and the game would have a "fresh" rating of 100%

  • Anyone who reviews videogames -- or any form of entertainment, really -- will tell you the score is but one part of the puzzle; in some cases, it's looked upon as a necessary evil, as certain outlets' experiments with ditching scores altogether have been deemed failures.

    Yeah, that's why Rockpapershotgun have failed...oh wait.

  • Hmmmm...I hadn't realized this issue with Metacritic before. I now rate Metacritic's quality as 4.7 kumquats, down from 2-8 exahogsheads per quadriliter :(

  • by Anonymous Coward

    As a ten+ year game reviewer (shameless plug: game-over.net), I see the problem from the other side. Even on a single review board, there are variations in how "hard" individual reviewers score. Over the years we have tried to implement a scoring system, giving XX/20 for graphics, XX/20 for story, that kind of thing, but found disagreements among reviewers as to the weightings. Are good graphics as important as a good plot? What about good music and sound effects? What one reviewer sees as retro,

  • The problem with metacritic is that they don't take into account the Law of Truly Large Numbers.

    (I don't know what that means, but I've decided I'm going to say this in all discussions involving statistics).

  • If you're going to use some crazy scheme to rate games, then perhaps those websites should provide some kind of a translation value so that Metacritic can correctly identify the intention of the crazy review schemes.

    But ultimately if you are getting reviews from hundreds of websites, the aggregate value should be fairly accurate. I think a game that is rated 50% is bad compared to a game rated 90%, but I don't think people really care about the perceived quality between two games if their ratings are like 85% a

  • Larger/louder/more voices drown out smaller/quieter/fewer voices -- regardless of the authority or quality of comment. (Unlike on /., which has moderation and meta-moderation based on content.)

    True story: My sister and brother-in-law left their kids with their grandmother and escaped to see a movie and relative peace for a few hours. My sister came back, really angry with her husband. "But sweetie," he said, "_2012_ got a good score on Metacritic!"

    Really. Happened.

  • by swb (14022) on Wednesday July 18, 2012 @09:05AM (#40685555)

    I think the value in metacritic isn't the "score" but the variation across all reviews. You could have two titles with identical "80" scores, which would otherwise indicate both titles are equally well liked.

    That being said, one title could have all of its reviews be between 70 and 90, while the other could have a lot of low scores and a lot of high scores. The high variation in scores tells you that there's something about that title that's amiss.

    It would be interesting to see statistics compiled for reviewers, too. Do some reviewers always deviate above the average? Below? I would think a reviewer with a higher variability of ratings would be more trustworthy than one who was consistent with their reviews.
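A rough sketch of the "same average, different spread" idea above, using only Python's standard library. The score lists are invented for illustration (not real Metacritic data), and using the population standard deviation as the measure of variation is an assumption, not something Metacritic publishes.

```python
from statistics import mean, pstdev


def summarize(scores):
    """Return (average, spread) for a list of 0-100 review scores.

    Spread is the population standard deviation; that choice is an
    assumption made for this sketch.
    """
    return mean(scores), pstdev(scores)


# Hypothetical titles that both average exactly 80.
consistent = [70, 75, 80, 85, 90]    # every review lands between 70 and 90
divisive = [40, 60, 100, 100, 100]   # a mix of low and very high scores

for name, scores in (("consistent", consistent), ("divisive", divisive)):
    avg, spread = summarize(scores)
    print(f"{name}: average={avg}, spread={spread:.1f}")
```

The headline number is identical for both titles; only the spread shows that reviewers disagree about the second one. The same calculation could be run per reviewer (their scores minus each title's average) to look for reviewers who consistently score above or below the pack.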

  • The problem isn't one of 100 to 0 or A to F. It's that people are hardly objective or, rather, it's how people rate games.

    Take a look around. Games will get a rating of 90 to 100 if they're really good, 80 to 90 if they're halfway decent and 70 to 80 if they're kinda lukewarm. Then there's a big nothing until the 0-10 bracket for the stinkers. WTF?

    This doesn't make any sense at all. But that's how things run today. Every game maker presses to get a 90+ review. Even if the game doesn't really deserve it. What is a

  • Oversimplifying things leaves them... what's that word? Right. "Oversimplified".

    Who could have guessed that?

  • Reviews are subjective things to begin with. Any aggregate is just intended to be a loose heuristic, not some auditable fact.
  • I've found Metacritic's scores to be pretty good when it's pulling from a large number of reviews. It's further enhanced by the inclusion of a separate user score. Score inflation is a problem with nearly all reviews so it isn't like Metacritic is really suffering from any inconsistency. I think at this point your average person is well aware of that and assesses scores accordingly.

    My problem isn't so much that most scores float above 75 except when they're exceptionally bad. My problem is with blatant infl

    • by Cederic (9623)

      I also tend to pick out reviews across the scoring range to get a more detailed assessment.

      I do that too. Focussing on the raw headline average score is a crude way of using Metacritic and people shouldn't be surprised that it leads to crude results.

      It's also interesting when you're looking at 1-2 year old games how many games have a lowish score because of release day issues. Buying the game fully patched for half the price means you can get a game that were it released in today's form back then would've got 10-20 points higher on the average, and is in fact an excellent game.

      There are some game

  • People were complaining about this stuff years ago. Nothing beats the time they told a reviewer he didn't understand his own score system [youtube.com].
  • It should be the common understanding among the public that results from MetaCritic (or any other sites of such nature) must bring to mind Slashdot's poll caveat, i.e.:

    "This whole thing is wildly inaccurate. Rounding errors, ballot stuffers, dynamic IPs, firewalls. If you're using these numbers to do anything important, you're insane"
