
News for nerds, stuff that matters

More Cell Processor Details And First Pictures

Posted by timothy on Mon Feb 07, 2005 07:22 PM
from the never-a-good-time-to-buy-a-computer dept.
slashflood writes "After reading two articles on slashdot about the Cell architecture and another one that criticizes the extensive roundup of the STI patents, I found the first pictures of the Cell core. It seems that at least some predictions were true. Seeing is believing." mtgarden points to this ZDNet article which says that the "first version of the chip will run at speeds faster than 4GHz. Engineers were vague on how much faster, but reports from design partners say 4.6GHz is likely. By comparison, the fastest current Pentium PC processor tops out at 3.8GHz." (More below.)

Hack Jandy writes "Anand Shimpi has some details about the upcoming Cell processor (PS3) in his personal blog. According to Anand, "Rambus announced that the new Cell processor uses both Rambus XDR memory and their FlexIO processor bus. Because Rambus designed the interface for both the memory controller(s) and the processor interface, the vast majority of signaling pins are using Rambus interfaces - a total of 90% according to Rambus." Hasn't Rambus been showing up a lot again recently? The fact that Cell uses XDR has been widely speculated, but the fact that it will also use the Rambus bus signalling is something completely new."

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.

  • Pictures? (Score:5, Funny)

    by vurg (639307) on Monday February 07 2005, @07:25PM (#11601972)
    How about HL2 benchmarks?
  • Cell (Score:5, Interesting)

    by ryanmfw (774163) on Monday February 07 2005, @07:26PM (#11601981)
    Cell processors could really dominate. With how cheap they are speculated to be, their distributed processing, and their all-around speed, they could take over a significant part of the computer market share. If Cell processors also have the Power4 processors in them, this could be a replacement for x86. Could be. As other articles have pointed out, x86 has had superior competition in the past, and has been able to weather it. We shall wait and see. Cheers
    • Re:Cell (Score:5, Interesting)

      by hattig (47930) on Monday February 07 2005, @07:31PM (#11602024) Journal
      From http://www.aceshardware.com/forums/read_post.jsp?id=115121622&forumid=1

      CELL is a Multi-Core Architecture

      Contains 8 SPUs each containing a 128 entry 128-bit register file and 256KB Local Store
      Contains 64-bit Power Architecture(TM) with VMX that is a dual thread SMT design - views system memory as a 10-way coherent threaded machine
      2.5MB of on Chip memory (512KB L2 and 8 * 256KB)
      234 million transistors
      Prototype die size of 221mm^2
      Fabricated with 90 nanometer (nm) SOI process technology
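The figures in that list can be cross-checked with a few lines of arithmetic. This is only a sanity check on the quoted numbers, in Python; the variable names are mine, and the density figure is a plain average over the whole die, not anything IBM has published.

```python
# Sanity-check the figures quoted in the spec list above.
KB = 1024

l2_kb = 512
spu_count = 8
local_store_kb = 256

# 512KB L2 plus eight 256KB local stores
on_chip_kb = l2_kb + spu_count * local_store_kb
on_chip_mb = on_chip_kb / KB

# 128 registers of 128 bits each, per SPU
register_file_bytes = 128 * 128 // 8

# Average transistor density over the whole prototype die
transistors = 234_000_000
die_area_mm2 = 221
density_per_mm2 = transistors / die_area_mm2

print(f"on-chip memory: {on_chip_kb} KB = {on_chip_mb} MB")
print(f"register file per SPU: {register_file_bytes} bytes")
print(f"average density: {density_per_mm2:,.0f} transistors per mm^2")
```

The L2 plus the eight local stores does add up to the quoted 2.5MB, and each SPU's register file works out to 2KB.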


      We're talking about a single-core POWER5 design (because of the SMT).

      But 221mm^2 ... that's big, bigger than a 130nm Opteron, bigger than a dual-core 90nm Opteron. But wait for 65nm, and you've got something of a manageable size to make a cheaper console. I don't see 4 Cells in a PS3 though, not even at 65nm, unless it is going to cost a boatload. Still, Sony aren't a little company, I'm sure they could sort it out.

      Still, I guess this means the next PowerMac G5 will be using processors with SMT finally.
      [ Parent ]
      • Re:Cell (Score:5, Informative)

        by doormat (63648) on Monday February 07 2005, @07:59PM (#11602304) Journal
        234M transistors @ 90nm is actually about as big as most graphics processors are. They tend to be 150M-200M @ 110nm or 130nm. I don't see it being terribly difficult to fab, really.
        [ Parent ]
      • by Ideaphile (678292) on Tuesday February 08 2005, @02:15AM (#11604297) Homepage
        I was at the Cell event today, and quoted in some of the news stories. I also have the ISSCC technical papers.

        The PowerPC core in the Cell prototype chip is NOT a Power5, as speculated here. According to IBM, this core was designed from scratch for this application. One critical difference is that the new pipeline executes instructions in strict program order rather than reordering instructions to improve throughput as is done with Power5.

        Also, IBM has not described the core as "simultaneous multithreaded", just "multithreaded." I presume from this that the multithreading is coarse-grained: only one thread is active at a time, unlike Power5, which can execute instructions from two different threads in the same cycle.

        The logic design for the Cell CPU was optimized for higher clock speeds in a given process than Power5 can achieve. This is a good tradeoff for more linear multimedia algorithms, but reduces effective throughput on other types of code.

        I think it's reasonable to suppose that if Apple were interested in using the Cell architecture, it would prefer to use a version of the design that includes a Power5 core in place of the one in the Cell prototype.

        [ Parent ]
    • Re:Cell (Score:5, Funny)

      by Anonymous Coward on Monday February 07 2005, @08:30PM (#11602591)
      There isn't much info on this processor yet, but from what I've heard about it, I conjecture that its design is in danger of violating Nakamura's law of quantum molecular finitism, especially as the clock speeds are increased. This could result in an asymmetric shift of the lattice substrate, in which case the transistor conductivity would actually start to skew in the direction of anticonductivity (the inverse of superconductivity), forming insulating barriers. As insulating barriers would form and more heat would be generated, unbounded oscillations at the molecular level could cause regenerative superheterodyning - a cascading effect leading to the processor eventually failing catastrophically while emitting a sound remarkably similar to the Love Boat theme. Or not.
      [ Parent ]
        • Re:I wonder.... (Score:5, Funny)

          by cosmo7 (325616) on Monday February 07 2005, @08:18PM (#11602480) Homepage
          I wonder what M$ has to beat back a server processor with essentially hyper threading, running at 4.6 ghz, attached to 8 vector processors, each with a lot of registers and cache, which are using extremely fast memory, that can connect to other, similar processors nearby.

          Microsoft has consistently overwhelmed the fastest processors on the market and I am confident that with the right bloatware they will continue to do so.
          [ Parent ]
              • Re:Cell (Score:5, Insightful)

                by Fulcrum of Evil (560260) on Monday February 07 2005, @11:34PM (#11603548)
                Any links to back that up? The only confirmed unit loss consoles I know of are the Xbox and the Dreamcast. Everything else (to my knowledge) has been profitable. I looked around and the only sites that are claiming that the PSP is sold at a loss are 1up.com and some people on the chat forums. I also found a bunch of posts claiming that Sony expects to not sell the hardware at a loss. Perhaps the initial allocation was sold cheap to create buzz?
                [ Parent ]
  • We flame Intel for touting speed... (Score:5, Interesting)

    by X43B (577258) on Monday February 07 2005, @07:29PM (#11602012) Homepage Journal
    I'm waiting to see how much work it can actually do before making a judgement. At the least, it's always exciting to have another option. I wonder how difficult it will be to take advantage of the new architecture.
  • Speed isn't everything (Score:5, Insightful)

    by leathered (780018) on Monday February 07 2005, @07:30PM (#11602017)
    While 4.6 GHz sounds impressive, I thought we were getting away from the notion that clock speed = performance. The Pentium 4 killed off clock speed comparisons.

    I must admit the specs are impressive, but show me the benchmarks!
    • Re:Speed isn't everything (Score:5, Interesting)

      by MBCook (132727) <foobarsoft@foobarsoft.com> on Monday February 07 2005, @07:39PM (#11602102) Homepage
      That's true. But there are two important things here. The first is that it's at 4GHz. The P4 hasn't been able to reach that (though Intel originally said it would happen by now). So it's already up there.

      The second is that it's STARTING at 4GHz. It's one thing to say a chip can scale and run at some speed (again, I'm looking at you Intel), but to debut it running faster than the fastest mass-produced CPU in the world is something altogether different.

      Cell should be quite formidable, and I think it will be quite interesting to see what comes of it. I've held the opinion for a few years that computers would move to having a couple of CPUs each running their own task (like in Cell), with one main (quite possibly slower) CPU controlling them all and running the OS (traffic cop, again like in the Cell). While the individual processing units are not general purpose (they are more vector oriented), it should still be interesting to see what comes of this. After all, most things people use high-end CPUs for are (or can be) vector ops, right? Compression, 3D, etc. Wordprocessing and spreadsheets don't tend to need much power. A large generalization, I know, but still... the introduction of the Cell (especially the way it should be able to "group" itself with other Cell processors in your house) should prove quite interesting even if it turned out to be a failure (which I SERIOUSLY doubt.)

      [ Parent ]
    • Re:Speed isn't everything (Score:5, Funny)

      by TexVex (669445) on Monday February 07 2005, @07:40PM (#11602104)
      The Pentium 4 killed off clock speed comparisons.
      No, that was the Athlon.
      [ Parent ]
    • Re:Speed isn't everything (Score:5, Informative)

      by drmerope (771119) on Monday February 07 2005, @09:33PM (#11602829)
      Indeed. Even in a slow 0.18um technology, I can easily make an 8 GHz 3-inverter oscillator ring. So what?

      The "chip frequency" is determined by
      1) how fast the transistors can switch
      2) how many FO4 (fan-out-of-4) inverter equivalents (standard measure of logic complexity) there are between the latches.

      #1 is just a process technology attribute

      #2 is where all the magic is because it is "how much work can take place in one cycle"

      #2 is commonly reduced in a technique called pipelining.

      General rule: Pipelining increases throughput at the cost of latency.

      With branches especially, but in other situations as well, latency becomes a limiting factor.

      When this happens trading against latency is a bad decision.

      For any given ISA you're likely to reach this break point *somewhere*. The i386 architecture has reached it. This is because of the latency of decoding the _complex_ instructions.

      A simpler instruction set => incurs less latency penalty => can be pipelined further => can achieve higher clock speeds and accrue performance benefits to additional pipelining.
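To put rough numbers on the throughput-versus-latency rule above, here is a toy model in Python. It is a sketch only: the inverter delay, the total logic depth and the per-stage latch overhead are all made-up values chosen for illustration, not figures from the Cell papers or any real process.

```python
# Toy model of pipelining: all numbers are illustrative assumptions,
# not measurements of any real chip.

FO4_DELAY_PS = 25.0      # assumed delay of one fan-out-of-4 inverter, in picoseconds
TOTAL_LOGIC_FO4 = 120    # assumed logic depth of one instruction, in inverter delays
LATCH_OVERHEAD_FO4 = 3   # assumed per-stage latch/clock-skew overhead, in inverter delays


def pipeline_stats(stages):
    """Return (clock in GHz, instruction latency in ps) for a given stage count."""
    logic_per_stage = TOTAL_LOGIC_FO4 / stages
    cycle_ps = (logic_per_stage + LATCH_OVERHEAD_FO4) * FO4_DELAY_PS
    clock_ghz = 1000.0 / cycle_ps      # a 1000 ps cycle corresponds to 1 GHz
    latency_ps = cycle_ps * stages     # time for one instruction to pass every stage
    return clock_ghz, latency_ps


for stages in (5, 10, 20, 40):
    ghz, latency = pipeline_stats(stages)
    print(f"{stages:2d} stages: {ghz:5.2f} GHz, latency {latency:6.0f} ps")
```

Under these assumptions, cutting the same logic into more stages keeps raising the clock, but the time for any single instruction to get through the pipe goes up rather than down, because every extra stage adds its own latch overhead. That is the cost a mispredicted branch ends up paying.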

      Intel, though, still has probably the best process technology in the world and as a consequence if Intel were manufacturing these cell processors they'd run even faster.

      But simpler instructions tend to do less work. This means you need more instructions for the same task. More instructions might code to larger memory footprints. Larger memory footprints require faster i/o to memory and larger caches to not incur performance penalties. Thus in the end you might gain nothing.

      You can see this effect within amd64. Running in 64-bit mode gives you more registers, more registers should mean faster programs, but moving around all those 64-bit variables erases the benefit. (at least in compiler run-time benchmarks that I've seen).
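A rough illustration of the footprint side of that trade-off, in Python. The node layout (two pointers plus a 32-bit value) and the 512KB cache size are hypothetical, picked only to show the direction of the effect; this says nothing about how any particular compiler benchmark behaves.

```python
import struct

# Hypothetical linked-list node: two pointers plus a 32-bit payload.
# "<" selects standard sizes with no alignment padding, so real compilers
# would typically make the 64-bit node even larger than shown here.
NODE_32 = struct.calcsize("<IIi")   # 4-byte pointers
NODE_64 = struct.calcsize("<QQi")   # 8-byte pointers

CACHE_BYTES = 512 * 1024            # assumed cache size, for illustration only

nodes_32 = CACHE_BYTES // NODE_32
nodes_64 = CACHE_BYTES // NODE_64

print(f"32-bit node: {NODE_32} bytes, {nodes_32} nodes fit in cache")
print(f"64-bit node: {NODE_64} bytes, {nodes_64} nodes fit in cache")
```

Pointer-heavy data gets noticeably fatter in 64-bit mode, so fewer items fit in the same cache. Whether that outweighs the extra registers depends entirely on the workload.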
      [ Parent ]
  • joint venture (Score:5, Funny)

    by LittleGuernica (736577) on Monday February 07 2005, @07:31PM (#11602023) Homepage
    I believe Sony and IBM and Toshiba are going to produce this thing as a joint venture, calling it "Cyberdyne" and naming the PS3 online game network Skynet. Sounds promising...
  • Some specs from Sony press material (Score:5, Informative)

    by Anonymous Coward on Monday February 07 2005, @07:34PM (#11602048)
    http://www.scee.presscentre.com/imagelibrary/detail.asp?MediaDetailsID=25555
    :

    CELL...bringing supercomputer power to everyday life with latest technology optimized for compute-intensive and broadband rich media applications

    SUMMARY:

    Cell is a breakthrough architectural design -- featuring 8 Synergistic Processing Units (SPU) with Power-based core, with top clock speeds exceeding 4 GHz (as measured during initial laboratory testing).

    Cell is OS neutral - supporting multiple operating systems simultaneously

    Cell is a multicore chip comprising 8 SPUs and a 64-bit Power processor core capable of massive floating point processing

    Special circuit techniques, rules for modularity and reuse, customized clocking structures, and unique power and thermal management concepts were applied to optimize the design

    CELL is a Multi-Core Architecture

    Contains 8 SPUs each containing a 128 entry 128-bit register file and 256KB Local Store

    Contains 64-bit Power Architecture(TM) with VMX that is a dual thread SMT design - views system memory as a 10-way coherent threaded machine

    2.5MB of on Chip memory (512KB L2 and 8 * 256KB)

    234 million transistors

    Prototype die size of 221mm^2

    Fabricated with 90 nanometer (nm) SOI process technology

    Cell is a modular architecture and floating point calculation capabilities can be adjusted by increasing or reducing the number of SPUs

    CELL is a Broadband Architecture

    Compatible with 64b Power Architecture(TM)

    SPU is a RISC architecture with SIMD organization and Local Store

    128+ concurrent transactions to memory per processor

    High speed internal element interconnect bus performing at 96B/cycle

    CELL is a Real-Time Architecture

    Resource allocation (for Bandwidth Management)

    Locking caches (via Replacement Management Tables)

    Virtualization support with real time response characteristics across multiple operating systems running simultaneously

    CELL is Security Enabled Architecture

    SPUs dynamically configurable as secure processors for flexible security programming

    CELL is a Confluence of New Technologies

    Virtualization techniques to support conventional and real time applications

    Autonomic power management features

    Resource management for real time human interaction

    Smart memory flow controllers (DMA) to sustain bandwidth
    • by Sunspire (784352) on Monday February 07 2005, @07:48PM (#11602183)
      Warning: Pregnant women, the elderly, and children should avoid prolonged exposure to CELL.
      Caution: CELL may suddenly accelerate to dangerous speeds.
      CELL contains a liquid core, which if exposed due to rupture should not be touched, inhaled, or looked at.
      Do not use CELL on concrete.
      Discontinue use of CELL if any of the following occurs:
      * Itching
      * Vertigo
      * Dizziness
      * Tingling in extremities
      * Loss of balance or coordination
      * Slurred speech
      * Temporary blindness
      * Profuse Sweating
      or
      * Heart palpitations

      If CELL begins to smoke, get away immediately. Seek shelter and cover head.
      CELL may stick to certain types of skin.
      When not in use, CELL should be returned to its special container and kept under refrigeration.
      Failure to do so relieves the makers of CELL, Sony Incorporated of any and all liability.
      Ingredients of CELL include an unknown glowing substance which fell to Earth, presumably from outer space.
      CELL has been shipped to our troops in Saudi Arabia and is also being dropped by our warplanes on Iraq.

      Do not taunt CELL.
      CELL comes with a lifetime guarantee.
      CELL! Accept no substitutes!
      [ Parent ]
  • The Sony hype machine strikes again (Score:5, Insightful)

    by Laconian (578463) on Monday February 07 2005, @07:41PM (#11602116)
    Remember how the Emotion Engine worked us all into a lather five years ago? And when it came out, it was merely competitive with contemporary processors? Sony is great at churning out nerd fetish tech, but they have a terrible track record of living up to their promises. Let's hope it's different this time.
  • Power consumption (Score:5, Interesting)

    by Anonymous Coward on Monday February 07 2005, @07:50PM (#11602208)
    For those of you wondering about the power consumption of this thing, perhaps you should note that Sony just licensed LongRun2 from Transmeta. It is a dynamic solution for power consumption and leakage that will probably end up in the 65nm versions coming out next year. Google "transmeta sony" for more.

    Once touted as the Intel killer, perhaps Transmeta will finally have its day.
  • Missing the point (Score:5, Informative)

    by egrinake (308662) <{on.teopedoc} {ta} {gkire}> on Monday February 07 2005, @09:04PM (#11602645)

    There seems to be a lot of confusion surrounding the Cell chip. This is not "just another processor", and it certainly has little to do with clock frequencies - the Cell is a whole new architecture, which might just be a glimpse into the future of computing.

    To begin with, it might be useful with some background on the ps2 architecture - there are a couple of really great in-depth articles at Ars Technica [arstechnica.com]; Sound and Vision: A Technical Overview of the Emotion Engine [arstechnica.com] and The PlayStation2 vs. the PC: a system-level comparison of two 3D platforms [arstechnica.com].

    What made the ps2 so awesome was that it was custom-built specifically for multimedia-processing, which requires completely different processing environments than general-purpose computing. Normal PCs are made for computing where you have a large number of instructions working on a small data-set (such as a spreadsheet) - this requires large data-caches close to the CPU, while instructions are streamed continually from RAM. Media-processing is the other way around; you have "simple" operations (like doing the calculations for a single pixel), which are run on a large set of data - so you wouldn't really need any data-caches. The ps2 did exactly this; it removed almost all the caches (only a few tiny ones were left), but it had a totally insane bus bandwidth. To borrow an analogy from the mentioned Ars Technica article:

    "Here's a goofy example to help you visualize what I'm talking about: imagine a series of large buckets, connected by pipes to a main tank, with a cow lapping water out of each bucket. Since cows don't drink too fast, the pipes don't have to be too large to keep the buckets full and the cows happy. Now imagine that same setup, except with elephants on the other end instead of cows. The elephants are sucking water out so fast that you've got to do something drastic to keep them happy. One option would be to enlarge the pipes just a little (*cough* AGP *cough*), and stick insanely large buckets on the ends of them (*cough* 64MB GeForce *cough*). You then fill the buckets up to the top every morning, leave the water on all day, and pray to God that the elephants don't get too thirsty. This only works to a certain extent though, because a really thirsty elephant would still end up draining the bucket faster than you can fill it. And what happens when the elephants have kids, and the kids are even thirstier? You're only delaying the inevitable with this solution, because the problem isn't with the buckets, it's with the pipes (assuming an infinite supply of water). A better approach would be to just ditch the buckets altogether and make the pipes really, really large. You'd also want to stick some pans on the ends of the pipes as a place to collect the water before it gets consumed, but the pans don't have to be that big because the water isn't staying in them very long."

    So, what does this have to do with the Cell? The Cell takes this concept even further. Cell systems are made up of multiple processors, called APUs (Attached Processing Units), which are connected using an insanely fast data bus. Each APU can be programmed to handle one specific task, and then pass the data on to the next APU for a different task. By doing this, you can just put in more processors to increase the throughput of the system. This works especially well for multimedia processing, which can be pipelined like this pretty easily. Here are a couple of snippets from the Wikipedia entry [wikipedia.org]:

    "While the Cell chip can have a number of different configurations, the workstation and PlayStation 3 version of Cell consists of one "Processing Element" ("PE"), and eight "Attached Processing Units" ("APU"). The PE is based on the POWER Architecture, basis of their existing POWER line and related to the PowerPC used by Apple
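The "each APU handles one task, then passes the data on" idea described in this comment can be sketched with chained generators in Python. To be clear about the assumptions: the stage names and operations are invented for illustration, this is ordinary Python and not Cell code, and the stages here run interleaved on one CPU where real APUs would run side by side.

```python
# Toy model of the "each unit does one task, then passes the data on" idea.
# The stage names and operations are invented for illustration.

def decode(samples):
    """Stage 1: turn raw byte values into signed values centred on zero."""
    for s in samples:
        yield s - 128


def scale(samples, gain):
    """Stage 2: apply a gain to each value."""
    for s in samples:
        yield s * gain


def clamp(samples, lo, hi):
    """Stage 3: limit each value to the range [lo, hi]."""
    for s in samples:
        yield max(lo, min(hi, s))


def run_pipeline(raw):
    # Chain the stages: the output stream of one is the input of the next.
    return list(clamp(scale(decode(raw), gain=2), lo=-100, hi=100))


result = run_pipeline([0, 100, 128, 150, 255])
print(result)
```

Adding another processing step means adding another stage to the chain, which is the software analogue of dropping in another processor to raise throughput.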

  • Intel not impressed (Score:5, Funny)

    by vandan (151516) on Monday February 07 2005, @09:55PM (#11602956) Homepage
    We are very reluctant to adopt architectures like this because they take compatibility and throw it out the window.

    You mean like the Itanic? Shoe's on the other foot now, eh?
    • Re:Xbox (Score:5, Interesting)

      by Thu25245 (801369) on Monday February 07 2005, @07:36PM (#11602065)
      Thing is, the next Xbox will be using a PowerPC 970. So it will share a common ancestor (POWER) with the Cell.

      I wonder, how compatible are the two CPUs' instruction sets? Will Microsoft be able to drop a Cell into a future revision of the Xbox2 and maintain backward compatibility? Could someone theoretically hack a PlayStation3 to run Xbox2 games?
      [ Parent ]
    • RTFA (Score:5, Informative)

      by temojen (678985) on Monday February 07 2005, @07:46PM (#11602154) Journal
      The Cell CPU has a POWER Processor with VMX (it's vector based), plus 8 stream processors (which kick ass on vector processing units for some applications). So you've got
      • a regular CPU (good for program flow/logic and interdependent operations),
      • a vector unit (good for large arrays with no conditionals),
      • and 8 stream processors (good for applying the same operations plus flow control to lots of independent chunks of data).
      w00t!
      [ Parent ]
        • Re:I did, I'm still confused (Score:5, Informative)

          by be-fan (61476) on Monday February 07 2005, @10:16PM (#11603092)
          So the CPU is just a normal POWER, right?

          No. Each Cell has one main (controller) CPU called a PU, and up to 8 separate vector CPUs called SPEs. The main CPU is a regular 64-bit POWER processor (with SMT --- IBM's equivalent of hyperthreading), while the SPEs are very simple processors with a lot of execution resources and insane bandwidth. Such processors are known as "stream processors" in the literature, because they are designed to handle streams of data.

          it's just a different brandname, right?

          Yes, "AltiVec" (like "G5") is an Apple/Motorola trademark, so IBM can't use it. And you're right, the AltiVec unit is on the PU.

          For what purposes is the VMX more suited?

          It's there most likely because if you're running some code that isn't suitable for the SPEs, but does need to do vector computations, you don't have to send it off to the SPEs.

          Will the SPEs have this same starvation problem?

          Potentially, but probably not. Altivec on the G4 was starved because the G4's bus was exceedingly slow. The SPEs are supposed to be on a shared 128GB/sec internal bus, and the Cell has 100GB/sec of bandwidth to main memory.

          That each of the SPEs has 256k of private memory to work with?

          Yes. In the Cell model, you design your code in "cells". A cell is a clump of code and data that's copied to the SPE's local memory. The code then runs, streaming in additional data from memory, and using the local memory as a workspace.

          Can SPEs freely read other SPEs "local memory", or only their own? And who fills up this memory initially, and who deals with it once it's done?

          The SPEs' local memories are not connected to each other, so each SPE can only read from its own local memory. The memory is filled up by the PU, when a cell is loaded onto the SPE. The SPE then runs autonomously, and when it finishes, sends the results back to the PU via main memory.
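The load, run, write-back flow described above can be sketched as a toy model in Python. The class and method names here are hypothetical and this is not an actual Cell API; it only mirrors the steps in the text (the PU loads a cell into an SPE's private memory, the SPE works on local data, and the result goes back through main memory). For simplicity the size check counts only the data, not the code.

```python
# Toy model of the "software cell" flow described above. The class and
# method names are hypothetical; this is not an actual Cell API.

LOCAL_STORE_BYTES = 256 * 1024  # each SPE has 256KB of private memory


class ToySPE:
    def __init__(self):
        self.local_store = None

    def load(self, kernel, data):
        """The PU copies a cell (code plus data) into this SPE's local store."""
        if len(data) > LOCAL_STORE_BYTES:
            raise ValueError("cell does not fit in the local store")
        self.local_store = (kernel, bytes(data))

    def run(self, main_memory, result_key):
        """Run the kernel on local data only, then write the result to main memory."""
        kernel, data = self.local_store
        main_memory[result_key] = kernel(data)


main_memory = {}
spes = [ToySPE() for _ in range(8)]

# The PU splits the work into eight chunks, one per SPE.
payload = bytes(range(256)) * 8
chunk = len(payload) // len(spes)
for i, spe in enumerate(spes):
    spe.load(sum, payload[i * chunk:(i + 1) * chunk])
    spe.run(main_memory, result_key=i)

# The PU collects the partial results from main memory.
total = sum(main_memory.values())
print(total)
```

No SPE ever touches another SPE's local store in this model; the only shared place is `main_memory`, which is where the PU picks up the partial results.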

          I.E., do the SPEs have access to main or video memory or other hardware, or do they ever require for the CPU to shuttle data to keep them fed?

          The SPEs and the PU all talk to a single DMAC, which has access to main memory.

          But then the article seems to be saying that SPE access to memory is limited-- i.e. it can only be done in block load/stores.

          Yes. The DMAC, actually, can only read/write in 1024-bit blocks. This isn't really a big deal if you think about it. When a regular CPU reads a memory address, it doesn't read a byte at a time. It loads a whole cacheline at a time. So a P4, for example, usually reads a 128-byte (1024-bit) block at a time from memory anyway.
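A quick way to see what block-granular access means in practice, in Python. The 1024-bit block size is the figure given above; the helper names are mine, and rounding every request up to whole blocks is an assumption about how such a transfer would be accounted for, not a documented DMAC behaviour.

```python
# Block-granular transfers: 1024 bits per block, as stated above.
BLOCK_BITS = 1024
BLOCK_BYTES = BLOCK_BITS // 8


def blocks_needed(nbytes):
    """Number of whole blocks required to move nbytes (rounded up)."""
    return (nbytes + BLOCK_BYTES - 1) // BLOCK_BYTES


def bytes_transferred(nbytes):
    """Bytes actually moved once the request is rounded up to whole blocks."""
    return blocks_needed(nbytes) * BLOCK_BYTES


for request in (1, 128, 129, 256 * 1024):
    print(request, blocks_needed(request), bytes_transferred(request))
```

Small, scattered reads pay for a full 128-byte block each time, while large sequential transfers (such as filling a whole 256KB local store) waste nothing, which fits the streaming style the SPEs are built for.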

          Do each of the 8 SPEs actually independently load their own instruction streams?

          Yes. All the processor units run separate instruction streams. Each "software cell" runs in its own thread, if you will.
          [ Parent ]