US Army’s ‘MACH 5’ Apple supercomputer offers unmatched price/performance

“The Army Research and Development Command will use a giant cluster of Apple Computer Inc.’s G5 servers [Xserves] to build one of the fastest supercomputers in the world to research the aerodynamics of hypersonic flight,” Brian Robinson reports for Federal Computer Week.

“The MACH 5 (Multiple Advanced Computers for Hypersonic research) supercomputer, announced earlier this week, will use 1,566 of the 64-bit dual-processor servers and is expected to top 25 teraflops per second when it comes online later this year. The fastest supercomputer in the world now is Japan’s Earth Simulator with a maximum performance of just less than 36 teraflops,” Robinson reports.

“MACH 5 will cost $5.8 million to construct, a fraction of the price purpose-built supercomputers bring. The Earth Simulator cost around $350 million. Apple won the Army contract after a competition among half a dozen companies based on such things as power requirements, cooling needs and floor space requirements, as well as performance,” Robinson reports.

Full article here.

51 Comments

  1. Hypersonic means Mach 5 or more, so it’s a very cool acronym, even if its application (weaponry) is less cool than “Earth Simulator.” Whatever, nice coup for Apple. Congrats. Apple should build one for itself to simulate the mind of Windoze users (marketing purposes). Let’s see … hmm, a cluster of three G3s should do it and save the G4s and 5s the ignominy of dealing with that.

  2. Somehow 25 TF seems a bit high when you consider that Big Mac, with 1,100 nodes, only did 10 TF. One might expect 12 or 13 TF if performance scales linearly with node count, maybe as high as 15 with improved system performance and better optimized code.

    Of course it’s not entirely beyond reason, since the theoretical performance of a G5 is 4 flops per clock cycle. So a 2*2GHz node could theoretically do 16 GigaFlops. That would make the theoretical peak of a 1,566-node cluster around 25 TF.

    Hmm – The theoretical peak is about 25 TF but in practice it’ll probably do 12 or 13 TF. Still nothing to sneeze at but certainly not as high as 25.

  3. jfbiii,

    It is a matter of diminishing returns.
    Except for the very few, very specific, very processor intensive tasks which can be very highly decoupled and “parallelized” (some computational fluid dynamic modelling is like this, many Monte Carlo based simulations are like this, however most applications are not) adding a second processor does not double the computational throughput. Having 2 XServes tied together is not twice as fast as having one.

    Except in those rare cases, 1566 XServes are not even close to 1500 times as fast as one XServe.
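A rough way to picture this diminishing return is Amdahl's law. The comment above does not name it, so treat it as an assumed model rather than the commenter's own method. The parallel fractions below are made-up illustrations, not measurements of any real workload.

```python
def amdahl_speedup(parallel_fraction: float, n_machines: int) -> float:
    """Speedup over a single machine when only part of the work parallelizes.

    parallel_fraction is the share of the work (0.0 to 1.0) that can be
    spread across machines; the rest is assumed to run serially.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_machines)


# Hypothetical parallel fractions, chosen only to show the trend.
for p in (0.90, 0.99, 0.999, 1.0):
    two = amdahl_speedup(p, 2)
    many = amdahl_speedup(p, 1566)
    print(f"parallel fraction {p:.3f}: 2 machines -> {two:.2f}x, "
          f"1,566 machines -> {many:.0f}x")
```

Only the perfectly parallel case (fraction 1.0) gives a speedup equal to the machine count. Under this model, even a 1% serial share caps 1,566 machines at under 100 times the speed of one.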

    Theoretical Peak Performance (TPP in “supercomputerese”) for 1,566 XServes is about 37 to 38 TFLOP. I will be very pleasantly surprised if the Mach5 team reaches their Peak Performance (PP in “supercomputerese”) goal of 25 TFLOP on the LINPACK benchmark. I will actually be pleasantly surprised if the PP is over 20 TFLOP.

    In common supercomputer applications roll-off in additional capability is not too severe, but even at an assumed 67% effectiveness (what the Mach5 team is expecting) the addition of another 1,566 machines would take a significant hit in effective throughput.

    Making the extremely gross assumption that the Mach5 team’s scaling continues on to higher clustering this becomes— in very, very approximate terms… (performance given in TFLOP and $$ in millions)
    Processors     TPP     PP      $$    PP/$$
    1,566           37     25     5.8     4.3
    3,132           75     42    11       3.8
    6,264          150     64    22       2.9
    12,528         300     94    44       2.1

    Doing the same thing with VT’s scaling factor
    Processors     TPP     PP      $$    PP/$$
    1,100           26     10.5   5.4     1.9
    2,200           53     15    10       1.5
    4,400          106     19    20       0.9
    8,800          211     26    40       0.6
    17,600         422     36    80       0.4

    In reality the Peak Performance (PP) numbers in these tables are probably overly optimistic.

    True, even the VT cluster scaled up to beat the Earth Simulator is much less expensive than the Earth Simulator was, but you can see that the roll-off in performance is significant compared to the more modest 1,000 to 2,000 machine systems.
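The PP/$$ column in the two tables above is plain division, and the same rows also give the fraction of theoretical peak actually achieved. A minimal sketch, with the rows copied from the tables as posted; the variable and function names are hypothetical, and the figures are the commenter's rough estimates, not measured results:

```python
# Rows copied from the two tables above:
# (machines, TPP in TFLOP, PP in TFLOP, cost in $ millions)
MACH5_SCALING = [
    (1566, 37, 25, 5.8),
    (3132, 75, 42, 11),
    (6264, 150, 64, 22),
    (12528, 300, 94, 44),
]
VT_SCALING = [
    (1100, 26, 10.5, 5.4),
    (2200, 53, 15, 10),
    (4400, 106, 19, 20),
    (8800, 211, 26, 40),
    (17600, 422, 36, 80),
]


def summarize(rows):
    """Return (machines, PP/TPP efficiency, PP per $ million) for each row."""
    return [(n, pp / tpp, pp / cost) for n, tpp, pp, cost in rows]


for name, rows in (("MACH 5 scaling", MACH5_SCALING), ("VT scaling", VT_SCALING)):
    print(name)
    for n, efficiency, pp_per_million in summarize(rows):
        print(f"  {n:>6,} machines: {efficiency:4.0%} of TPP, "
              f"{pp_per_million:.1f} TFLOP per $M")
```

The printed ratios may differ in the last digit from the hand-computed PP/$$ column, since the tables appear to truncate rather than round.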

  4. “Hypersonic means Mach 5 or more, so it’s a very cool acronym, even if its application (weaponry) is less cool than ‘Earth Simulator.’” – Less is More

    Not necessarily. The military once developed a technology to connect computers together, and it became the Internet. The same thing happened with a bunch of satellites built to pinpoint the locations of enemy and friendly forces. Today, GPS is one of the most important navigational tools.

    Science is science. It’s neither good nor evil. Once upon a time, NASA dreamt of making a hypersonic airplane to reach Mach 25. While the project died, I think the dream is still alive, and one day perhaps one can travel to the furthest corner of the earth in an hour. When that happens, research like this will probably have contributed a lot to it.

  5. I’m sorry, I lost track of all the flops somewhere. I’m still trying to count all the viruses, trojans, worms, etc. released for windoze machines so far this month.

  6. I think Apple should give them a “buy one at regular price, get one free” sale price, basically doubling the size. Screw #2 – go for #1. The few million dollars it would cost Apple is cheap advertising.

  7. Yeah BUT. Apple has always had the mentality of buy two for one. In other words, charge you two times for one…

    I think this would be a great system for M$ Windows HPC Edition

    EEEeeeeekkk!!

  8. Yeah BUT. Apple has always had the mentality of buy two for one. In other words, charge you two times for one…

    I think this would be a great system for M$ Windows HPC Edition

    EEEeeeeekkk!!

  9. iSteve, I second that. Nothing would be better if Apple could claim that the #1 supercomputer in the world is a cluster of OSX/G5 machines. IBM would eat it up! Take that MS! Screw you!!

  10. Your point, Nobody, is what? The acronym is not cool? Simulating hypersonic projectiles is cooler than simulating natural phenomena? Did I say science was good or evil? The knowledge gained from any kind of research may lead to advances in other fields, such as GPS, as you say, but it dudn’t have anything to do with cool [a subjective term if ever there was any]. I’d rather simulate the flight dynamics of SpaceShipOne’s feathered wings at high altitude than how a hypersonic projectile penetrates various surfaces. I find one activity cooler than the other, even if from a scientific standpoint, both may be cool for you.

  11. 5.8 million dollars is not a lot of money for the US Army or any of its contractors. This MACH 5 system sounds like an experiment to test the viability of an OS X cluster. If it delivers the goods then bigger and better systems will probably be built with XServe G5s.

  12. My point is, the application of MACH 5 is to do scientific research. Just because it’s done by the military and you don’t see the peaceful purpose for it now, it isn’t automatically less cool than research done by non-military. You may have a distaste for anything done by the military, but lots of the state-of-the-art technology used for the benefit of the public now originated from military research.

  13. blah blah blah……..where is the headless g5 imac?

    maybe a dual g4 headless imac?

    anything that is worth buying? +$2k cheese graters and +$3k xservers aren’t it. So says the non buying public.

  14. Sol is right, 5.8 mil is chump change for the Army. They just ordered helmets to the tune of 80 million for 230,000 units. Not that helmets are not important technology; I’m just pointing out that the Army certainly knows how to spend when they’ve still got unwritten checks in their book.

  15. Nobody,
    If I remember correctly, the Germans and Japanese did a lot of scientific research in the big war with Jews and Chinese; as did the US with its above-ground nuclear tests in the Pacific. I just dropped a casual comment two posts ago that certain fields of research were cooler than others. I didn’t imply anything about the ethics or value of the various fields of research. You just assumed that I was injecting some editorial content, and responded to that assumption. Maybe you are overly sensitive to that topic or you don’t read properly. So let me put it this way:

    I’d rather do research on how the instinct to propagate the human species ~ to reproduce, to mate ~ affects adolescent behaviour than to study the effects of varying fiber intake on bowel movement. For me, one is cooler than the other. Note I didn’t say necessary, valuable, ethical, equivalent, important … just cool.

  16. Aryugaetu:

    …and the speed of light doesn’t seem so fast any more

    ummm, I’m not that up with my quantum physics, but I’m sure that no one has proven that ANYTHING can travel faster than light? (And if they have, it is only a theory – yet to be proven)

    I’d say that the speed of light is still the ultimate measure of speed of a “thing”.

    When talking of instructions, they are talking about the “volume” of information, not the speed.

    I have limited knowledge of physics, so may I explain my point with an analogy:

    Say I pass a piece of paper to you in 1 sec, and it has 1000 words on it. I now pass you a dictionary in 1 second, it has 40,000+ words.

    The speed is the same – the amount of data is significantly larger.

    These P�’s are not doing things FASTER than the speed of light, they are simply passing more “words” (instructions) along a path at a given speed. I’d say that is the reason for terms such as bandwidth etc.

    I’d love to know if Apple has succeeded in proving Einstein (and many other great minds) incorrect!

    Just my 2 cents. It doesn’t make it any less amazing an achievement, I’m just loath to credit Apple with redefining the physical laws of the universe as we know it.

    Cheers,

    Luke

  17. shadowself wrote:
    “Theoretical Peak Performance (TPP in “supercomputerese”) for 1,566 XServes is about 37 to 38 TFLOP”

    The PPC970 has two independent floating-point units. Each FP unit can execute a multiplication and addition simultaneously (fused multiply-add, i.e. something like a := a + b * c)
    At 2GHz, each PPC970 can thus execute 4 billion multiplication and 4 billion addition operations per second.
    The TPP of a 1,566 dual-CPU 2GHz Xserve cluster is thus 1566*2*8 ~= 25 TFLOPS
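The arithmetic in this comment can be written out as a small calculation. This is only a sketch of the comment's own figures (two CPUs per Xserve, two floating-point units per CPU, two FLOPs per unit per cycle via fused multiply-add, 2 GHz clock); the function and parameter names are hypothetical.

```python
def peak_tflops(nodes: int,
                cpus_per_node: int = 2,
                fp_units_per_cpu: int = 2,
                flops_per_unit_per_cycle: int = 2,  # one fused multiply-add
                clock_ghz: float = 2.0) -> float:
    """Theoretical peak in TFLOPS, using the per-CPU figures from the comment."""
    gflops_per_cpu = fp_units_per_cpu * flops_per_unit_per_cycle * clock_ghz
    total_gflops = nodes * cpus_per_node * gflops_per_cpu
    return total_gflops / 1000.0


print(f"{peak_tflops(1566):.1f} TFLOPS")  # 1566 * 2 * 8 GFLOPS
```

Under these assumptions the per-CPU peak is 8 GFLOPS and the cluster peak is about 25 TFLOPS, which matches the figure in the comment.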

  18. Sal,

    Ah but you forget the Multiply-Add-Fuse instruction — two FLOPs in a single clock cycle.

    Also it is possible to do 64 bit floating point (not easy, but possible) in the vector processor.

    After taking these into account, I stand by my TPP number for the XServe. This is “Theoretical” Peak Performance after all.

  19. shadowself:
    > Ah but you forget the Multiply-Add-Fuse instruction — two FLOPs in a single clock cycle.

    Let’s see: fused multiply-add (a:=a+b*c) two FLOPs per pipeline clock cycle.
    Execution pipeline clocked at 2GHz.
    This means 4GFLOPs (2GHz times 2 FLOP/cycle)
    There are two independent floating-point units per CPU
    Peak performance per CPU is thus 4GFLOPs times two = 8GFLOPs

    A cluster of 1566 dual-CPU Xserves contains 3,132 CPUs.
    The theoretical peak performance is thus… drum roll … 3132×8 ~= 25TFLOPs

    > Also it is possible to do 64 bit floating point (not easy, but possible) in the vector processor.

    Possible, but it would be quite cumbersome and slow.
    AltiVec natively supports 32-bit floating-point numbers, with a 23-bit mantissa.
    A 64-bit FP number has a 52-bit mantissa. To maintain precision, most any FPU with 64-bit support — be it from AMD, Motorola, IBM, Intel, Sun… — computes intermediary results with 80-bit numbers before reducing them to 64-bit. Don’t expect much performance piecing together e.g. 32-bit floating point instructions and arithmetic shifts in a vector processor to try to construct a 52-bit or 64-bit intermediary mantissa…
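The mantissa gap mentioned here (23 bits versus 52 bits) is easy to see in practice. A minimal sketch using Python's standard `struct` module to round a 64-bit value to 32-bit precision; it illustrates the precision difference only and says nothing about AltiVec performance. The helper name is hypothetical.

```python
import struct


def round_to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


# 1 + 2**-30 needs 30 fraction bits: it fits in a 52-bit mantissa
# but not in a 23-bit one.
x = 1.0 + 2.0 ** -30

print(x == 1.0)                    # False: the 64-bit value keeps the small term
print(round_to_float32(x) == 1.0)  # True: in 32 bits the small term is rounded away
```

Any term smaller than about 2**-24 relative to the running value is lost in 32-bit arithmetic, which is why emulating 64-bit results from 32-bit vector operations takes extra work.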

    > After taking these into account, I stand by my TPP number for the XServe

    I’m afraid your numbers are irrelevant.
