
Behind the Scenes At Sony's NOC

VonGuard writes "Earlier this year, I spoke to Mark Rizzo, the man who manages the people who run Sony's online game servers. Rizzo learned the ropes of MMO hosting back on Ultima Online, and we chatted about where the tough problems were then versus now. Rizzo compares the operation to a 24/7 scientific simulation, albeit with some sassier and more involved end-users. His favorite innovation since those early days? Rapidly provisioning and deploying Linux installations tailor-made to their purposes. Here's my article on Rizzo and his band of 50-some-odd sysadmin-cum-dungeon-masters, written for the new newspaper The Systems Management News."

  • Re:And Remedy :P (Score:5, Insightful)

    by kjart ( 941720 ) on Tuesday June 03, 2008 @04:42AM (#23635683)
    All the problems you are describing are engineering/development issues and don't have anything to do with operations. The architects would be responsible for the infrastructure, deployment, monitoring, and so on, not for the games themselves.
  • by Capitalist Piggy ( 1298699 ) on Tuesday June 03, 2008 @06:07AM (#23635927)
    You make this whole thing sound like it's a 99.999% uptime venture, when I've seen EQ, WoW and Planetside servers go down for days at a time.

    Having spent much of my grown life as a NOC monkey, I can assure you heads would roll at the ISPs I've worked at if we had nearly the number and lengths of outages experienced in the gaming world.

    I don't see how this is more "involved" as far as the end user is concerned. What's going to happen on an MMORPG? People will post in forums and never see a response. That's not involvement. Involvement is when you've got three call centers with a two-hour hold time, the random crazy person finding your NOC number, and directors having emergency meetings over even minor outages, because these particular millions of customers have stocks to purchase, games to play, and email to check, and they have a nice 1-800 number to dial instead of hitting a forum that's likely going to be down if your game servers are having trouble.

    I think Sony is just doing a little self-appreciation in the article, as I don't really expect anyone at any company to say the guy monitoring the network at night is playing Q3 on his workstation or about the guy who shows up on meth sometimes.
  • by sticks_us ( 150624 ) on Tuesday June 03, 2008 @06:28AM (#23636027) Homepage
    As you say, there's no doubt these people are doing impressive things, but to anyone with experience dealing with e-commerce solutions (read: involving people's money), all of these measures will probably seem familiar.

    The problems mentioned above about transactional integrity, backup/restore, availability, clustering, "five nines" uptime have all been largely addressed at places like Amazon, Bank of America, and so on.
  • Re:Linux? (Score:3, Insightful)

    by magamiako1 ( 1026318 ) on Tuesday June 03, 2008 @09:06AM (#23636951)
    The client and what the server has to do are entirely separate things. They have essentially no relation to each other except that they pass data back and forth for one or the other to process.
  • "Having spent much of my grown life as a NOC monkey, I can assure you heads would roll at the ISPs I've worked at if we had nearly the number and lengths of outages experienced in the gaming world."

    And the obvious difference is that with an ISP you don't have dozens or hundreds of people trying new ways to game the system. With failover, live backup servers and cron jobs aplenty, you just swap out/swap in and you are good to go. With MMORPGs, someone hacks the system and you have to shut it down deliberately, pour yourself a double-shot and let out a loud WTF. Then you study the hack, if you can, engineer a workaround, test it, and deploy it. Then you bring the system back up. Yeah, these are very comparable systems all right.

    Another difference is when a new MMORPG becomes popular and you go from a hundred test users to a thousand gamers to 100,000 to 1,000,000 in about a week. Gee, our roomful of servers has to become a building full. Should take just 0.001% of a year to do that. Capacity is a chicken-and-egg situation -- you aren't going to buy a thousand servers before you have even launched a game, so you are forced to play catch-up.
