
Building A Homemade Chess Supercomputer 282

nado writes "There's a new article on Chessbase.com which has GM John Nunn showing you his chess-orientated PC upgrade to a double Xeon system, with some Fritz benchmarks." Elsewhere in the article, John Nunn discusses the unique computer needs for chess computation: "One of the problems with currently available processors is that they are not particularly well suited to the integer calculations used for chess. A Pentium 4 will be slower at chess than a Pentium 3 of an equivalent clock speed."

  • Great! (Score:5, Funny)

    by thx2001r ( 635969 ) on Sunday June 22, 2003 @11:36PM (#6270795) Homepage
    And Chessmaster 2000 kicked my arse on a 486!

    I've got no chance.
  • Well... (Score:5, Funny)

    by rosewood ( 99925 ) <<ur.tahc> <ta> <doowesor>> on Sunday June 22, 2003 @11:37PM (#6270797) Homepage Journal
    No thanks, I still get my ass kicked when I play chess on my pocket PC, let alone on a chess supercomputer. I'm lucky I can even win in Othello :(
  • by qbproger ( 467459 ) on Sunday June 22, 2003 @11:39PM (#6270806)
    I'm working my way up to chess. I'm starting by becoming a tic tac toe master.
  • by S. Traaken ( 28509 ) on Sunday June 22, 2003 @11:40PM (#6270808)
    Does this actually surprise anyone? The P4 was only an exercise in marketing by Intel - redesign the chipset so it can be clocked nice and high (so it appeals to the average consumer) and to hell with the performance...
    • Athlon 2000+! It's only 1.3ghz, but it beats a 2ghz Pentium4!

      How long until: Backport GNU/Linux, now with the Linux 0.2 kernel it's 1.00000001 times as GNU as the SCO-encumbered 2.4 kernel Debian GNU/Linux.
    • Exactly. A P4 has a longer pipeline than a PIII, so any branch misprediction will result in a longer time penalty for a pipeline flush. The PIII 1GHz I have sitting on my floor over there --> is the equivalent of about a P4 1.8GHz.

      Although the longer pipe does allow for ramping of clock speeds higher than before (part of the reason AMD added 2 more stages to the Opteron and by association the Athlon64), it needs to be complemented with a more efficient branch prediction algorithm.
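The pipeline-flush argument above can be put into a rough back-of-the-envelope model. This is only a sketch: the function name, the branch fraction, the misprediction rate, and the penalty cycle counts are all illustrative assumptions, not measured PIII or P4 figures.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """Average cycles per instruction once branch mispredictions are included.

    Every mispredicted branch costs a pipeline flush, so a deeper pipeline
    (larger flush_penalty_cycles) raises the average cost per instruction.
    """
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles


# Hypothetical workload: 20% branches, 10% of them mispredicted.
short_pipe = effective_cpi(1.0, 0.20, 0.10, 10)  # assumed shallow pipeline
long_pipe = effective_cpi(1.0, 0.20, 0.10, 20)   # assumed deep pipeline

# How much faster the deep pipeline must be clocked just to break even.
required_clock_ratio = long_pipe / short_pipe

print(f"short pipeline CPI: {short_pipe:.2f}")
print(f"long pipeline CPI:  {long_pipe:.2f}")
print(f"break-even clock ratio: {required_clock_ratio:.2f}x")
```

With these made-up inputs the deeper pipeline needs a clock roughly 17% higher to break even; a better branch predictor lowers `mispredict_rate` and shrinks that gap.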
    • by ciroknight ( 601098 ) on Monday June 23, 2003 @12:04AM (#6270905)
      This is sorta true, but the P4 was an exercise to see where AMD would go. By making chips that they knew would be really fast once they got the foundation down, they drove AMD to finding a way out. My guess is that Intel was trying to drive them into a corner since they couldn't build an Itanium-compatible processor without a license, but instead AMD knew when they were beat and decided to move on to a higher playing field.

      Like this delicate game of chess, Intel's next move is uncertain. While the P4 has what is needed to smoke the Athlon for years (just as long as they keep tweaking the predication engine and improve on branch prediction's accuracy), it can't really compete with Opteron. Neither can Itanium. Intel just hasn't invested enough in the future since they were ruling the present. I even read somewhere that they had started Willamette and Itanium (forgot the codename) at nearly the same time back in 95? but neither really caught their supervisors' eyes since they were more than profitable already. So in short, Intel's game of chess has been too passive for too long. And it's not time to look back on the P3 and say what is good... the P4 is something completely new, like the Pentium one was so long ago. Give it some time and it will vastly outperform the P3 comparatively, but you have to realize that the P3 is something like 6 years old, and the Athlon even older than that.... It's time people stop trolling on how past processors were faster comparatively and move on to making the new processors faster. Borrowing from the old is ok.. Look at Banias for example.. but we seriously need to worry about the future more. But thanks for trolling on by.
    • by Ralph Wiggam ( 22354 ) * on Monday June 23, 2003 @12:07AM (#6270921) Homepage
      IIRC, the P3 was the end of the Pentium Pro core family. The P4 is the beginning of a new core. So, yes, the first P4s were not the best chips and the P3s of the same time were better buys. But do you remember the P-Pro? Give them a few years of refinement and then judge the core.

      -B
    • by JebusIsLord ( 566856 ) on Monday June 23, 2003 @12:11AM (#6270937)
      Total and complete bullshit!

      Japanese sports cars have small engines, which rev really high. These are high-tech, powerful and sought-after technologies. The P4 is analogous. American and European sports cars have large, slow-revving, high-torque monster engines which are also powerful and sought-after. This is analogous to the Athlon. They have been leapfrogging each other for years now in performance - each has a different but equally valid way to get there.

      It is true though, that the customer only sees clockspeed (RPMs in my analogy) - which tends to help Intel. This does NOT however make it an inferior method of achieving performance.
      • So the question is, is all this shit worth it, or would it be better for both companies to be building CPUs which were capable both of quick operations and wide execution, perhaps with multiple cores? Maybe someone could make a machine that had disparate popular processors to execute different types of code, and an OS that would run modules using those cores in different threads.

        Of course there's always the notion that we should be using asynchronous logic now, anyway; That logic has sped up to the point where it is not actually important to have everything happen on the same clock, but instead more useful to have it occur as rapidly as possible.

        As for your automotive analogy, the primary reason that it seems that doing things the Japanese way is practical is that you keep the weight down, which is good from the standpoint that your handling improves and you simply have less weight to push around. On the other hand, a large engine need not necessarily be heavier than a small one. The primary advantage today (it seems to me) is that it ends up being cheaper on gas to run the smaller motor, but of course the more power you use, the more fuel you throw down the thing. Larger cars can be much lighter now than they used to be, though, what with aluminum getting cheap and high strength steel being readily available, not to mention that monocoque technology has moved along nicely with all this computer modeling.

        Anyway aside from that digression; I don't think either company really has the win here, just like small engines and large engines are interesting for different reasons, though you can certainly get more power out of larger engines... It only becomes more expensive at a certain point. Of course, Intel's processors are artificially expensive, simply because people pay for them; AMD's are as well, though to a lesser extent. Silly analogies :(

      • Total and complete bullshit!

        Japanese sports cars are small and have little engines because Japanese men are small and have little engines. European sports cars... well, have you seen the size of those Germans? They need bigger cars with large engines and lots of torque to shift that amount of lard down the autobahn.
      • Totally agree.

        Raw clock speed can make up for a deficiency in efficiency. I don't like it, but I have to concede. Why else are all the top machines in Maximum PC P4 3.2 GHz w/ dual channel DDR 400?

        However, for me, an additional issue comes into play. I am also concerned about price.

        To bring it around to the car analogy again, I could take my $5000 used Honda Accord and drop another $6000 into it, and thus roughly equal my brother's TransAm, in price and horsepower, which he paid $11000 for. I.e. I c
    • The P4 was only an exercise in marketing by Intel - redesign the chipset so it can be clocked nice and high (so it appeals to the average consumer) and to hell with the performance

      The P4 handily outperforms the P3. It is irrelevant that it does so partly by running at a higher clockrate.

      • Actually, it's pretty relevant -- if a processor requires higher clock speed to reach the same performance as another chip, it'll require more power, require a more expensive supporting chipset, require more expensive RAM, generate more heat, etc.
    • That's what caught my eye right off, the statement in the blurb makes it sound like this is some kind of fluke, but it's not. That's true for any application, not just the chess board. I'm not gonna get into the argument about whether it was just 'marketing' or not - there are *some* technical justifications, but it's a well-known fact to anyone that pays attention to this sort of thing that the PIV is vastly inferior to the PIII at the same clock speed, and that the virtue of its design is in being able t

    • by akuma(x86) ( 224898 ) on Monday June 23, 2003 @02:05AM (#6271279)
      Does this actually surprise anyone? The P4 was only an exercise in marketing by Intel - redesign the chipset so it can be clocked nice and high (so it appeals to the average consumer) and to hell with the performance...

      Let me use the converse of your argument. AMD redesigned their chipset to make their IPC too high and to hell with performance.

      Why do people insist that high frequency automatically means low performance? I'd say the P4 is pretty damn fast.

      It does not matter if the frequency is high or low. If you get the performance, who cares if the frequency is 1GHz or 4GHz? There are lots of ways to go for performance - 2 extremes are "narrow-and-fast" and "wide-and-slow".

      Nobody complained when Alpha went for low-IPC/high-frequency designs. Students of computer architecture will remember the days in the early 90s when there was a contest between the "speed-demons" and the "brainiacs". HP built the 'brainiac' machine (which was lower in frequency but had a wider issue) and DEC (Alpha) went for the 'speed-demon' (faster clock, lower IPC). History shows that Alpha won that particular battle (performance-wise, not market-wise).

      Getting higher IPC is hard. In fact, making a superscalar, out-of-order machine wider is really hard. The hardware cost and power grow as the square of the width. Getting higher frequency is hard too, but some believe it is not as hard as getting higher IPC. The cost of the hardware and power of a higher frequency machine grows linearly with frequency.

      Yes, the P4 is designed to clock higher than an Athlon. They use fewer gates-per-clock and therefore, necessarily do less work per clock. Unfortunately, performance is not measured in work-done-per-clock. It's measured in absolute time. So if you can get the same amount of work done in the same amount of time, but use more clocks to do it, why should you as a user care? You still got the performance.
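The "same work, same time, more clocks" point can be checked with the usual run-time relation, time = instructions / (IPC × clock frequency). The sketch below is a minimal illustration; the two machines and their IPC and clock values are hypothetical, chosen only so that the products match.

```python
def run_time_seconds(instructions, ipc, clock_hz):
    """Wall-clock time for a fixed instruction count at a given IPC and clock."""
    return instructions / (ipc * clock_hz)


WORK = 3_000_000_000  # hypothetical instruction count for one task

# "Narrow-and-fast": less work per clock, higher frequency (assumed values).
narrow_fast = run_time_seconds(WORK, ipc=1.0, clock_hz=3.0e9)

# "Wide-and-slow": more work per clock, lower frequency (assumed values).
wide_slow = run_time_seconds(WORK, ipc=1.5, clock_hz=2.0e9)

print(f"narrow-and-fast: {narrow_fast:.3f} s")
print(f"wide-and-slow:   {wide_slow:.3f} s")
```

Both hypothetical designs finish in the same time, which is the sense in which clock speed alone says nothing about performance.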
      • Performance != speed (Score:2, Informative)

        by yerricde ( 125198 )

        Unfortunately, performance is not measured in work-done-per-clock. It's measured in absolute time.

        Not always. Performance may be measured in main loop executions per hour, but sometimes it is more useful to measure main loop executions per megajoule (speed vs. energy consumption; there are 3.6 MJ in 1 kWh) or main loop executions per cubic meter hour (speed vs. rack space). And if increasing work done per clock can increase the rate of work done for a given amount of electric power or rented rack space

        • Ah, yes. You bring up a great point.

          Performance, as I have it defined in my head is simply the time it takes to complete a task. The lower the better. For this guy's chess application, I think he is more concerned with absolute performance irrespective of total energy consumed.

          Your definition includes power efficiency as a consideration. This is a worthy metric that is not lost on the engineers at Intel and AMD - let me assure you. There's a very talented team of engineers in Haifa, Israel building v
  • by Anonymous Coward on Sunday June 22, 2003 @11:41PM (#6270812)
    I'm not sure this should have been said:

    "A Pentium 4 will be slower at chess than a Pentium 3 of an equivalent clock speed."

    That's too easy to be distorted

    I'm sure a marketing group or some such, for intel competitors or even PPC, will say

    "A Pentium 4 will be slower ... than a Pentium 3 of an equivalent clock speed."

    And then use it to justify their own means.

    Hmmm?
  • by AEton ( 654737 ) on Sunday June 22, 2003 @11:43PM (#6270816)
    From the article:

    As this computer was to be focussed on chess, video performance was not important.

    Hardcore Slashdot Games readers cringe...
  • by robindmorris ( 682328 ) on Sunday June 22, 2003 @11:43PM (#6270817)
    IBM's Deep Blue [ibm.com] used special purpose chips, so it shouldn't really come as too much of a surprise that general-purpose processors aren't the best for chess computers.
    • That is true.

      But if you can include special purpose chips (ASICs) in any comparison, then general purpose processors won't be the best at any single task. An ASIC can be made for any program that can be written, and it'll run that program faster than any CPU made by the same process.

      For instance: want to know what is the best CPU for performing matrix multiplication? An ASIC. What's the best CPU for rendering 3D images? An ASIC (like the ones used in modern video cards - they're ASICs of a sort). Wha
  • by Rosco P. Coltrane ( 209368 ) on Sunday June 22, 2003 @11:51PM (#6270844)
    A Pentium 4 will be slower at chess than a Pentium 3 of an equivalent clock speed

    Just imagine the chess performances of an 8086 at 1GHz. And you get a space heater too, for those cold chess-playing winter nights ...
    • An 8086 @ 1GHz would be substantially slower than a P3 or P4 @ 1GHz. The 8086 was a scalar, unpipelined, single issue machine, with no cache, no FPU, and a substantial average CPI (something like one or two dozen cycles). Most performance increases since the 8086 have been due to increasing IPC, not frequency. I think a Pentium, and maybe a 486, would be faster than P3/P4 but not 8086.

      • So this begs the question, why not do a little custom logic, and tie a bunch of older chips together at some lower clock rate, thereby essentially gaining a bunch of functional units? I'd think you could use some fast FPGAs with some nice CPU cores to manage them to essentially make a widely parallel integer math coprocessor with the older CPUs acting as integer units.

        Of course that might be pure masturbation, the question is, what would it cost to do that and are there any cheap CPUs which are fast enoug

      • PPro/P2/P3 should probably be the fastest Intel processors in terms of instructions per clock. The first P2's at 233MHz (which I bought in Jan 1998 at a pretty hefty price and am still using now) didn't have much megahertz difference with Pentium MMX/200, but are just so much faster.

        Since P4's have such a high frequency, it is impossible to still do everything as fast in terms of clock cycles. As far as I know, most non-vector integer applications run slower on a P4 than a P3 at the same clock frequency.

  • by Anonymous Coward on Sunday June 22, 2003 @11:54PM (#6270859)
    You should obviously change the game to take advantage of the hardware. Imagine it! Three dimensional chess where each piece has weapons, or magical attacks, deformable terrain, and lots of special effects to make use of the latest video cards! I can't wait!
    • You should obviously change the game to take advantage of the hardware. Imagine it! Three dimensional chess where each piece has weapons, or magical attacks, deformable terrain, and lots of special effects to make use of the latest video cards! I can't wait!

      Been there, done that. See The Chess Variant Pages. [chessvariants.com] It lists about a hundred chess variants, some of which are three dimensional, and has links to places where you can download software to play variants (commercial and otherwise). The site has an apple [chessvariants.com]
  • FritzMark (Score:4, Insightful)

    by sn00ker ( 172521 ) on Sunday June 22, 2003 @11:58PM (#6270878) Homepage
    How old is this software that it's not multi-threaded?
    Software to examine chess games would be a perfect example of the major performance improvements to be had with multi-threading. A new thread per processor, with each thread examining different possible move paths, would give dramatic speed gains.
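A minimal sketch of the "one worker per candidate move" idea, assuming a toy game tree written as nested lists with integer leaf scores. The function names and the tree are hypothetical and have nothing to do with how Fritz is actually written. Note that CPython threads illustrate the structure only; they do not run CPU-bound code in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def minimax(node, maximizing):
    """Plain minimax over a nested-list tree; integer leaves are position scores."""
    if isinstance(node, int):
        return node
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)


def search_root_in_parallel(root_moves, workers=2):
    """Score each root move's subtree on its own worker; return (best_index, best_score)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # After our move the opponent replies, so each subtree is a minimizing node.
        scores = list(pool.map(lambda subtree: minimax(subtree, False), root_moves))
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores[best_index]


# Three candidate moves, each with two possible replies (toy values).
root_moves = [[3, 7], [8, 9], [2, 6]]
best_index, best_score = search_root_in_parallel(root_moves)
print(best_index, best_score)
```

Splitting at the root like this is the simplest scheme. It ignores any information one worker could share with another, so real speedups depend on how the search algorithm prunes.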

    • Re:FritzMark (Score:5, Informative)

      by addaon ( 41825 ) <addaon+slashdot.gmail@com> on Monday June 23, 2003 @12:08AM (#6270923)
      Fritz is multithreaded. FritzMark, the benchmarking program that uses instruction sequences similar to those in Fritz, is not.
      • Re:FritzMark (Score:3, Informative)

        by abucior ( 306728 )
        To be precise, I think it's Deep Fritz that's the multiprocessor version. Fritz by itself is just a single processor version. To quote their blurb from Deep Fritz 7:

        "Deep Fritz is the multi-processor version of Fritz7, which leads the world ranking list since four years. Deep Fritz 7 will run in computers with between one and eight processors. On a dual system the increase in speed is around 85% compared to a single processor of equivalent speed. But even if you have a single processor system the playing s
    • Re:FritzMark (Score:2, Interesting)

      by tijsvd ( 548670 )
      That's not so easy as it sounds. The search algorithm used in chess engines, called alpha-beta search, performs best if the best move (the one that is eventually chosen) is searched first. Once the score for the best move is known, the rest of the search tree can be done very fast.

      Therefore, chess programs try to estimate the best moves, based on attack patterns and history. They are quite good at this and take a correct estimation for the best move in most cases.
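To make the move-ordering effect concrete, here is a small sketch of alpha-beta search over a toy tree (nested lists, integer leaves) that counts how many leaves get evaluated. The tree values and helper names are assumptions for illustration; a real engine searches generated chess positions, not a fixed list.

```python
import math


def alphabeta(node, alpha, beta, maximizing, counter):
    """Minimax with alpha-beta pruning; counter[0] tallies evaluated leaves."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, counter))
            alpha = max(alpha, best)
            if alpha >= beta:  # opponent already has a better option elsewhere
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, counter))
        beta = min(beta, best)
        if alpha >= beta:  # this line is already worse than a known alternative
            break
    return best


def search(tree):
    """Run a full-window search from the root and report (value, leaves evaluated)."""
    counter = [0]
    value = alphabeta(tree, -math.inf, math.inf, True, counter)
    return value, counter[0]


# Same three moves, two orderings: best move searched first vs. last.
best_first_value, best_first_leaves = search([[8, 9], [3, 7], [2, 6]])
worst_first_value, worst_first_leaves = search([[2, 6], [3, 7], [8, 9]])
print(best_first_value, best_first_leaves)
print(worst_first_value, worst_first_leaves)
```

Both orderings return the same score, but searching the best move first lets the remaining subtrees be cut off after a single leaf each, so fewer leaves are evaluated in total.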

      This means that adding processors/threads

  • by Anonymous Coward on Monday June 23, 2003 @12:01AM (#6270892)

    First of all, the whole point of the P4 is to rev up the clockspeed, so there are not and cannot be any "equivalent" P3s available (excepting early versions of the P4 which are way obsolete today anyway and irrelevant to the problem at hand).

    Secondly, the Athlons are well known for their stellar integer performance, so who'd use P4s when high IP is needed?

    • It's pretty simple really, the guy is not a computer nerd like us. Note that from the article he says his brother is really the one that built it, and that putting a chip on a motherboard is tricky.
    • Simple, whoever demands that their product a) runs efficiently, b) demands reliability, and c) demands that they want really high speed multitasking.

      The P4s have double ALUs for this purpose alone; turn on HT and integer performance improves drastically.. but floating point suffers because their FPUs aren't nearly as efficient at HT, yet. So think of it as having 4 processors instead of two for this case alone.

      Arguably though, Athlons wouldn't be a bad choice either, but since you can't use thermocom
  • by djupedal ( 584558 ) on Monday June 23, 2003 @12:06AM (#6270920)
    Why not just buy a computer, you might ask? There are two reasons why I preferred the do-it-yourself route.

    1. The economy sucks and I just lost my job.
    2. I just lost my job and the economy sucks.

    From America in 2003, where you damn well better DIY or DW (do without). Then, write it up and sell it as a big 'hint'...
  • Theoretically, a dual processor machine for chess WOULD be twice as fast as a single processor machine, unlike in normal tasks where dual doesn't mean double. Chess is full of integer operations, but at the same time, conditionals up the ass. To calculate the best move, the computer has to check every possibility a move can have and the possible consequences several moves ahead. The nice thing about a dual processor machine is that each processor can focus on the branches of moves pending from different pieces. While one is calculating what one of the rooks can do, the other can calculate what one of the knights can do. One thing I see, though, is that hyperthreading would probably not do any good for such a game b/c all of the integer ALUs on a processor would be used by one thread, so there wouldn't be any ALUs open for another thread. I think in this sort of application of the Xeon, turning hyperthreading off would help boost performance, although I can't be 100% sure of it. Just a thought.
  • he could get a Mac and play chess.app on his "supercomputer."
  • by Akai ( 11434 ) on Monday June 23, 2003 @12:23AM (#6270979) Homepage Journal
    I just turned up a dual Xeon 2.4 rack-mount server for work and its BIOS warned us to turn off Hyperthreading for anything other than Windows XP or Linux 2.4 (yeah, mention of Linux in BIOS! :).

    Anyways, since I am using Linux 2.4, two hyperthreaded Xeons look like four processors to the box. I'm sure it's not the same performance as four separate processors, but I'm hoping it's at least slightly better than two non-Xeons :)

    The writer of the article wrote that for Windows he prefers 2000 over XP. I am curious if XP (or Linux 2.4) and thus Hyperthreading might help his already built computer with a bit more performance...
    • Having just done some serious testing on a couple of hyperthreading capable machines (dual Xeon 3.06GHz and 3.0GHz P4), I can say a bit about its effects on programs. If the code is multi-threaded (I didn't read the article to see if his is and this is meant to be really general) it will be distributed over all the "processors" equally. This works great for programs that have 2 very different threads. However, for an app that is very int or very fp intensive in multiple threads, hyperthreading actually hinders overall throughput.

      This is due to the fact that hyperthreading is still limited to the number of functional units in the processor. For code that is very intensive on a particular type of unit (int or fp), you basically end up with a stall condition on the virtual processor while all the functional units of that type are used by the first processor.

      Hyperthreading is better suited to cases such as a user using a 3d modeling program and a MP3 player. The MP3 player will hopefully end up on one virtual processor and use the int units while the 3d modeling will end up on the other and use the fp units. This would allow both to run in parallel on the same processor.

      So, if you are using a very int or very fp intensive, multi-threaded app, turn off hyperthreading. If you are a typical user running many programs that use both int and fp, then turn it on.
  • Mac attack (Score:4, Interesting)

    by mnemonic_ ( 164550 ) <jamec@umich. e d u> on Monday June 23, 2003 @12:25AM (#6270984) Homepage Journal
    Pentium 4 clock speed vs. performance discussion...

    Seconds before G3, G4 or PPC970 is mentioned:
    3...
  • by Anonymous Coward
    the computer beats YOU! ...wait a sec...
  • It seems you could make about 3 dual Athlon ~2GHz systems for the price of one 2x2.8 Xeon and that the cluster would outperform. Or maybe build, like, a 20-processor VIA C3 system that would perform the same and use less power.
  • by pz ( 113803 ) on Monday June 23, 2003 @12:48AM (#6271049) Journal
    So this Brit (who's REALLY good at chess) put together a machine that overall isn't all that stunning, specifically to play chess.

    Let me get this straight: he didn't select a purpose-designed processor, he didn't even do a survey of available processors (forget including non-Intel architectures) to see which would give him the best integer performance for the task, he doesn't consider chipset, he doesn't consider memory architecture, he's willing to accept one hardware-caused crash per month, he seems to think that configuring a machine and having his brother put it together is "building" one, and thinks that a purpose-built machine should be able to accept the OS and data (read: disk contents) from a previous machine without hiccough. While perhaps interesting to the chess aficionados, I fail to see the relevance on Slashdot.

    Why are we seeing this article instead of something on any one of the serious chess machines? Why is this article more newsworthy than, say, Anandtech or SharkyExtreme or Tom's Hardware's pick for the baddest machine you can currently build? Just because a Grand Master did it?

    To be fair, I have great respect for anyone who can attain the Grand Master level -- that's something I'll never do in my lifetime. He's clearly shown tremendous talent and devotion to chess, and my hat is off to John Nunn for that. But he's a computer hardware expert? A supercomputer architect? Are we at the start of a new series of Slashdot articles on computers of the Rich and Famous? What's next, diet tips from RMS? Health advice from Linus? The EFF Cookbook?
    • Just to echo the parent, the guy basically had his brother build a dual Xeon system, which is really nothing special by itself, and certainly doesn't justify the title, Building A Homemade Chess Supercomputer.

      Tierce
  • by Animats ( 122034 ) on Monday June 23, 2003 @12:55AM (#6271075) Homepage
    If he's concerned about reliability and is having problems convincing his vendor that he's getting hardware errors, he should get ECC memory.

    My home desktop machines both have ECC memory. I never open the boxes. Haven't had a crash on either the Windows 2000 machine or the QNX machine in over a year.

    • If he's concerned about reliability and is having problems convincing his vendor that he's getting hardware errors, he should get ECC memory.

      While he's got him on the phone, he should ask the vendor where he can get one of these "equivalent" Pentium III's. I didn't know PIII's came in 3Ghz these days.

      The whole point of the differing Pentium 4 architecture is that it scales well with clockspeed; and with the introduction of Hyperthreading on the newer chips, The P4 has really come into its own as far

    • So you've gone a year without a crash. Big deal.

      I've got a number of dual-proc P3 systems in a rack that have gone for 3+ years without a crash, and not a one of them has ECC. In that time, they've been shut down about 3 or 4 times to move them and for kernel upgrades. I've also got an NT4 machine that hasn't crashed in about 4 years, it's been shut down twice - each time, to be moved to a different building.

      Now, don't get me wrong - when something's critical, I do use ECC. But NOT using ECC isn
  • Secondly, you shouldn't attempt to build your own computer unless you are confident of your ability to do so. Obviously, if you do it, then it is entirely at your own risk.

    Valuable advice to /.ers...
  • why is it? (Score:2, Interesting)

    by pixitha ( 589341 )
    Why is it that no one has written a chess benchmarking program for the mac (ie *nix)?

    I mean, for number crunching and math and calcs, the mac seems to rule close to the top...

    just my 2cents
    • Re:why is it? (Score:4, Insightful)

      by jericho4.0 ( 565125 ) on Monday June 23, 2003 @01:10AM (#6271128)
      I'm guessing you mean the G4 line of Macs. These boxes excel at certain types of calculations, with the help of the Altivec, but would suffer from the same disadvantage the P4 does, namely lousy integer performance.

      I expect this might be a different picture tomorrow, with the much rumored announcement of the G5@2GHz.

  • You know, the P4 is slower at a lot of things than the P3 at the same clock speed. The thing is, you can jack the hell out of the clock speed with the core design they use.
  • Damn! (Score:5, Funny)

    by Ridge ( 37884 ) on Monday June 23, 2003 @01:18AM (#6271154)
    "It didn't take long to find a missed win by Adams against Spraggett at the Elista Olympiad in 1998."


    Damn, those Pentium 4 Xeons are slow!
  • However, this time I decided to build a new computer completely from scratch.

    Your idea of building completely from scratch is to buy a pre-made motherboard and bolt on a few other pre-assembled components. Your concept of from scratch has certainly varied from what I would consider that concept to entail.

  • freechess.org (Score:2, Interesting)

    by jtcm ( 452335 )
    not sure if this is off-topic...mod me thus if you must.

    I play free internet chess at the Free Internet Chess Server. Find them at...you guessed it: www.freechess.org [freechess.org].

    All you CLI guys out there will love the fact that using a graphical client is optional! For those of us who are sane, there are a handful of graphical boards available to complement the irc-ish interface that allows people to find opponents.

    It's fairly popular already, but I sure wouldn't mind a bigger crowd...cause all the guys on there
  • by akuma(x86) ( 224898 ) on Monday June 23, 2003 @02:27AM (#6271324)
    If you look at SPECint2000, you will find an integer benchmark called 'crafty'. This is a chess simulator with code sequences that are probably similar to what this guy used.

    Intel D875PBZ motherboard (3.0 GHz, Pentium 4 processor with HT Technology) scores 1137

    ASUS A7N8X Motherboard rev. 2.0, AMD Athlon (TM) XP 3200+ scores 1324

    You'll find that P6 derivatives (Banias, Athlon, Opteron etc...) do better on this benchmark. There are lots of unpredictable conditional branches in this application, so the incidence of mispredictions is higher than normal. You would think that this is the main contributor to poor P4 performance, but actually that is a second order effect, because the predictor on the P4 is far better than on other machines. It's the fact that the code will not fit inside the trace cache, but will fit nicely within Athlon's 64KB I-Cache.
  • It's too bad... (Score:5, Insightful)

    by NerveGas ( 168686 ) on Monday June 23, 2003 @02:40AM (#6271344)

    That the software doesn't (seem) to exist to use a cluster instead.

    No, really, this isn't one of the "imagine a Beowulf of these..." posts. Here's my point: For the cost of just one of the *processors* that he bought, you can build an *entire machine*, happily running an AthlonXP 2700+. An ENTIRE MACHINE. So, for the cost of the two processors, you've got two machines. For the cost of the SuperMicro motherboard and chassis, you can build two MORE machines. With the cost for the rest of the stuff, there's a fifth machine thrown in to boot.

    So, what will be faster - a dual 2.8 GHz Xeon, or 5 AthlonXP 2700+ machines? My money's on the cluster, for this particular application. The Xeon machine has 533 MHz of total memory bandwidth, split between two processors, effectively 266 MHz each. The AthlonXP systems, with 333 MHz each, would have a combined bandwidth of 1,665 MHz - about three times that of the Xeon system.

    To make it better, the Athlon is MUCH better than the P3 OR the P4 for integer work, which makes me wonder why he would choose the P4 in the first place. Furthermore, not only does the Athlon do much more in a clock cycle than a P4, you'd have a combined clock speed of 10.8 GHz with the Athlons instead of the 5.6 GHz of the Xeons. Twice the clock speed, AND more work per cycle!
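The aggregate figures quoted above can be re-derived in a few lines. All inputs are the numbers from this comment; adding up bus speeds and clock speeds across machines is the comment's own simplification and says nothing about communication overhead between the nodes.

```python
# Figures taken from the comment above (bus speeds in MHz, clocks in GHz).
xeon_bus_mhz = 533
xeon_cpus = 2
xeon_clock_ghz = 2.8

athlon_machines = 5
athlon_bus_mhz = 333
combined_athlon_clock_ghz = 10.8  # the comment's quoted total for five machines

per_xeon_bus_mhz = xeon_bus_mhz / xeon_cpus            # share of the bus per CPU
cluster_bus_mhz = athlon_machines * athlon_bus_mhz     # summed across five boxes
bus_ratio = cluster_bus_mhz / xeon_bus_mhz

combined_xeon_clock_ghz = xeon_cpus * xeon_clock_ghz
implied_athlon_clock_ghz = combined_athlon_clock_ghz / athlon_machines
clock_ratio = combined_athlon_clock_ghz / combined_xeon_clock_ghz

print(f"bus per Xeon CPU: {per_xeon_bus_mhz:.1f} MHz")
print(f"cluster bus total: {cluster_bus_mhz} MHz ({bus_ratio:.2f}x the Xeon box)")
print(f"implied clock per Athlon: {implied_athlon_clock_ghz:.2f} GHz")
print(f"combined clock ratio: {clock_ratio:.2f}x")
```

The numbers are consistent with the comment: about three times the summed bus speed and just under twice the summed clock.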

    Now, of course, being able to actually USE that clock speed would be dependent upon actually transmitting the messages back and forth, and efficiently dividing the work between the machines. In this sort of situation, where for any one point in time, there would be a great deal of possibilities to compute, it would seem like it would divide up very well.

    steve
  • by Space Coyote ( 413320 ) on Monday June 23, 2003 @04:09AM (#6271586) Homepage
    This thesis [mcgill.ca] shows a system that a guy from McGill University built to use Field Programmable Gate Arrays to generate possible moves. Since FPGAs allow you to do many simple tasks in parallel instead of trying to do one thing at a time very fast as in software, he was able to get an order-of-magnitude speed increase. Special chess computers like Deep Blue used custom-designed ASICs for this same purpose, but FPGAs are a much more accessible solution and will blow a software solution out of the water.

