The Top500 list of the world's most powerful supercomputers passed a milestone Wednesday with the first system to achieve peak performance of 1 petaflop/s, or one quadrillion floating point operations per second.
The system, called Roadrunner, was built by IBM for the US Department of Energy's Los Alamos National Laboratory. It's based on an advanced version of the Cell processor used in Sony's PlayStation 3, and its performance far outstrips that of the previous fastest system, another IBM computer that topped out at 478.2 teraflop/s.
Erich Strohmaier, a computer scientist at Lawrence Berkeley National Laboratory, was one of the founding editors of the Top500 list back in 1993. He talked with IDG News Service about the performance gains the list has seen, the quad-core processors that are coming to dominate it, and mistakes that can creep in when the list is put together. Following is an edited transcript:
Did you expect to see performance of a petaflop/s when you started this list?
No, 15 years ago the big question was whether all 500 systems together would amount to 1 teraflop -- and it was just above 1 teraflop, all 500 of them together.
Where does the performance of the IBM system come from? Is it mainly the Cell processor or advances somewhere else?
For the Roadrunner it's a very dense package in terms of the computing power. The advanced Cell is important, with eight of those [cores] on a single processor ... but it's also because it's tightly integrated. It's a blade system so you get a lot of these in a rack.
Does that cut down on latency between the blades?
Yes, you lose that latency, and you also need that kind of packaging to cut down on the power. Using the Cell is one way, but using these tightly integrated blade systems is another way to control power.
Does someone go around and audit these systems? How do you know the results are genuine?
In the first place it's an honor system, but of course for the big systems we ask them to run the benchmark and we want to see the output files.
Have you ever caught anyone cheating?
Not on the larger-scale systems, but there are always mistakes on the list. Big companies don't really know precisely how much [equipment] they've sold where, because they don't track sales by system, they track them by components. So they know they've shipped so many blades of a certain type to the UK, but they don't know how they are configured at customer sites. So yes, there have been mistakes made.
The more common mistake is that there are still systems on the list even though they have been decommissioned, because companies don't usually tell us when they shut their systems down. The thing that keeps the list healthy is that we lose, over a typical six-month interval, about 200 to 220 systems. So if we made some mistakes they'll be off the list very quickly. This time we had record turnover: we lost 300 systems.