Parallel Processing For Cardiac Simulations Using an Xbox 360
Foot-in-Mouth writes "Physorg has an article about a researcher, Dr. Simon Scarle at the University of Warwick's WMG Digital Laboratory, who needed to model some cardiological processes. Conventionally, he would requisition time on a university parallel-processing computer or use a network of PCs. However, Dr. Scarle's work history included gaming industry experience as a software engineer at a company associated with Microsoft Games Studio. His idea was that researchers could use Xbox 360s as an inexpensive parallel computing platform due to the console's hefty parallel processing-enabled GPU. He said, 'Although major reworking of any previous code framework is required, the Xbox 360 is a very easy platform to develop for and this cost can easily be outweighed by the benefits in gained computational power and speed, as well as the relative ease of visualization of the system.'"
How would this work in practice? (Score:4, Interesting)
How would this work? Does Microsoft sell licenses for such purposes? Would they need to buy special development boxes instead of cheap off-the-shelf hardware? Has the Xbox 360 been hacked enough to make this practical?
And most important of all: why use an Xbox 360 GPU in the first place? Aren't there PC GPUs that could run circles around what's in the Xbox 360? Wouldn't a PS3 be better suited due to being an open platform (well, at least as long as the old models are still available)?
Re:Why not the PS3? (Score:4, Interesting)
He should've used something like CUDA instead, for long-term gains. It would have delivered far better performance than the Xbox's GPU (which is quite dated now), plus easy scalability as better GPUs keep coming to market. His familiarity with Xbox programming might have helped him come up to speed with CUDA quickly.
Re:isn't this somewhat boring? (Score:1, Interesting)
GPU computing is nothing new.
However, the paper "Implications of the Turing completeness of reaction-diffusion models" is fascinating.
It's about spatially diffused chemical reactions, which is basically cell chemistry and thus life.
If this process can be regarded as Turing complete, then we have to regard a cell membrane, or even a simple mix of chemicals, as being able to compute, and to compute anything.
Turing's original reaction-diffusion model of morphogenesis concentrated on animal skin patterns, but these guys have taken a whole new approach.
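For readers unfamiliar with reaction-diffusion models, here is a minimal 1-D sketch of a Gray-Scott-style two-species system, the classic textbook example of the kind of model the paper discusses. This is illustrative only: the grid size, feed/kill rates, and diffusion constants are hypothetical and are not taken from the paper.

```python
# Minimal 1-D reaction-diffusion sketch (Gray-Scott-style).
# All parameters are illustrative, not from the paper under discussion.

def laplacian(u):
    """Discrete Laplacian on a periodic 1-D grid."""
    n = len(u)
    return [u[(i - 1) % n] - 2 * u[i] + u[(i + 1) % n] for i in range(n)]

def step(u, v, du=0.2, dv=0.1, f=0.03, k=0.06, dt=1.0):
    """One explicit-Euler step: u is the substrate, v the activator.
    f and k are hypothetical feed/kill rates."""
    lu, lv = laplacian(u), laplacian(v)
    un = [u[i] + dt * (du * lu[i] - u[i] * v[i] ** 2 + f * (1 - u[i]))
          for i in range(len(u))]
    vn = [v[i] + dt * (dv * lv[i] + u[i] * v[i] ** 2 - (f + k) * v[i])
          for i in range(len(v))]
    return un, vn

# Seed: uniform substrate with a small activator spot in the middle.
n = 64
u = [1.0] * n
v = [0.0] * n
v[n // 2] = 0.25

for _ in range(100):
    u, v = step(u, v)
```

After 100 steps the activator has diffused outward from the seed cell; each grid cell updates independently from its neighbours' previous values, which is exactly why this kind of model maps so naturally onto the per-pixel parallelism of a GPU.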
Sounds like a DMCA violation... (Score:1, Interesting)
Better send him to jail.
"take the steep learning curve to SPU programming" (Score:2, Interesting)
Give me a break. So long as your program doesn't need more than 256k of memory per thread you can port it by typing little more than "CC=gcc-spu make". There is a nice helper library that wraps your main() and passes I/O to and from the SPU. I got about 40kloc of audio processing code running on the SPU in a grand total of 30 minutes, which included downloading the SDK (and reading the docs while it downloaded and installed). It required zero code changes. Getting good performance requires vectorizing the code with intrinsics which is exactly like coding for x86's SSE2 but easier because the SPU vector engine is a lot more complete. If you need to work with more memory you need to do some manual memory management, but it's nowhere near as invasive as the shuffling you need to do to move data on and off a video card. CELL is quite simply FAR easier than GPGPU.
tl;dr version: GPGPU is complex and specialized enough that you have to write for it, CELL is so simple that you can get regular CPU code running on it with few to no changes.
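The "manual memory management" mentioned above can be sketched in a few lines: once a dataset exceeds the 256 KB local store, you stream it through in chunks. The names, sizes, and toy kernel below are for illustration only; a real SPU program would use DMA transfers rather than Python slices.

```python
# Illustrative sketch of chunked streaming through a small local store,
# in the spirit of SPU programming. Sizes and names are hypothetical.

LOCAL_STORE = 256 * 1024  # bytes available per SPU-style thread

def process_in_chunks(data, work, chunk_size=LOCAL_STORE // 2):
    """Stream a large buffer through a small local store, keeping half
    the store for code/stack, and apply `work` (the stand-in for the
    SPU kernel) to each resident chunk."""
    out = bytearray()
    for off in range(0, len(data), chunk_size):
        chunk = data[off:off + chunk_size]  # stand-in for a DMA transfer in
        out += work(chunk)                  # compute on the local copy
    return bytes(out)

# Toy kernel: invert every byte of a 512 KB buffer (exceeds the store).
big = bytes(range(256)) * 2048
result = process_in_chunks(big, lambda c: bytes(255 - b for b in c))
```

The point of the comment stands: this loop is a small, local change to otherwise ordinary CPU code, whereas GPGPU code has to be restructured around explicit host-to-device transfers from the start.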
Re:Because it's an advertorial, perhaps? (Score:5, Interesting)
The only problem I have with the Xbox is how crappy the hardware has been so far: red rings of death, failed ROM drives, failed hard drives...
Why on Earth would you want to rely on such a poorly constructed piece of hardware to do real work? Every component has failures, but when so many of my children's friends are on their third or fourth console in a few years, there is a real issue. And no, they are not abusive to their equipment. The same kids have Wiis, PCs, PS3s, GameCubes and more without all the issues they have with Xbox 360s.
InnerWeb
Re:How would this work in practice? (Score:1, Interesting)
It is important to note, though, that the CPU and GPU share the RAM, so handing a dataset off to be crunched by a shader is essentially a free operation.
Re:Why not the PS3? (Score:4, Interesting)
Perhaps. Depends on how many nodes he had to set up.
Let's do a bit of "napkin math" on this:
I believe there are 48 unified shader cores in the 360's GPU. That's a nice amount.
There are 112 shader cores in the 9800GT. With an SLI setup, that's 224 of them at your disposal for GPGPU thread processing.
Now, done right (meaning not going overboard on the CPU, etc.), you can field a machine for about $600 with an inexpensive SLI board, case, memory, and so on. If you're building a cluster node, you wouldn't need a disk, so you could shave a fair bit more off the price of every machine after the first.
$200 versus $600. The price is compelling. But, unfortunately, the PC is nearly 5 times more powerful at this sort of task (possibly more; I'm not doing an apples-to-apples comparison of the shader cores) for only about 3 times the cost. To match its performance you would have to field five 360s per PC compute node. If you only need the power of two or three 360 nodes, then it makes some sense to go that route, especially if you're familiar with the environment (as the gent we're talking about in these threads was). Power consumption will be comparable across the board, so that's not much of a consideration.
Where it really hits the wall is the cluster fabric itself. Using PS3s and 360s is "cool," but it's not very practical past about 10 machines for most performance-computing applications because of the limits of the cluster interconnect at your disposal. With those machines you are stuck with 1Gb Ethernet, which caps interconnect performance at about 750Mbit per node. You can match the performance of the PC box, but it will take five or so 360s because of the overhead and the lower-performing hardware, and you'll have difficulty matching a cluster of the same number of PCs. And that's before we get into Myrinet, Infiniband, or iWarp channel adapters for 10Gb interconnects on the PCs, which make the cluster behave like one huge SMP machine for all intents and purposes until you scale to about 32 machines.
I think it's familiarity and the "cool" factor that drove this decision, not price or actual usefulness.
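The napkin math above can be written out explicitly. All figures come from the comment itself (48 shaders per 360, two 112-shader 9800GTs, $200 versus roughly $600); they are rough shader counts, not benchmarked throughput.

```python
# Redoing the comment's "napkin math". Figures are the comment's own
# rough estimates, not measured performance numbers.

xbox_shaders, xbox_cost = 48, 200
pc_shaders, pc_cost = 2 * 112, 600     # two 9800GTs in SLI, ~$600 node

speed_ratio = pc_shaders / xbox_shaders           # raw shader-count ratio
cost_ratio = pc_cost / xbox_cost                  # price ratio
consoles_per_pc = -(-pc_shaders // xbox_shaders)  # ceil: 360s per PC node
```

This reproduces the comment's conclusion: roughly 4.7x the shader count for 3x the cost, so about five 360s are needed to match one PC node on raw shaders alone, before interconnect overhead makes the gap worse.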
Re:isn't this somewhat boring? (Score:2, Interesting)
I can compute anything, given enough time.
Then have a go at a tiling problem [cs4fn.org] and let us know when you've finished.
Namgge
Re:Because it's an advertorial, perhaps? (Score:3, Interesting)
Because it's really a publicity stunt from Microsoft trying to get the Xbox 360 to the forefront of people's minds in the lead-up to Christmas.
The article reads like most of the marketing cover I see from Microsoft (and for that matter most other software companies).
I've worked with WMG people before, and they aren't the kind of organisation that takes a payoff like that. And they certainly aren't a typical MS shop, either. My guess is simply that the guy was more familiar with Xbox as a platform than he was with PS3... I doubt there's much more to it than that.