IBM plans to load future supercomputers with more co-processors and accelerators to increase computing speed and power efficiency.
Supercomputers with this new architecture could be out within the next year. The aim is to boost data processing at the storage, memory and I/O levels, said Dave Turek, vice president of technical computing for OpenPower at IBM.
That will help break down parallel computational tasks into small chunks, reducing the compute cycles required to solve problems. That's one way to overcome scaling and economic limitations of parallel computing that affect conventional computing models, Turek said.
"We looked at this and said, we can't keep doing what we've done, but that won't even work [any longer] when you look at the volume of data people are starting to entertain," Turek said.
Memory, storage and I/O work in tandem to boost system performance, but there are bottlenecks with current supercomputing models. A lot of time and energy is wasted in continuously moving large chunks of data between processors, memory and storage. IBM wants to decrease the amount of data that has to be moved, which could help process data up to three times faster than current supercomputing models.
"When we are working with petabytes and exabytes of data, moving this amount of data is extremely inefficient and time consuming, so we have to move processing to the data. We do this by providing compute capability throughout the system hierarchy," Turek said.
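The idea of moving processing to the data can be illustrated with a toy sketch. It is not IBM's execution model, which is proprietary. The StorageNode class and both summing functions are hypothetical names invented for this illustration. The sketch compares shipping every record to a central processor with computing a partial result where the data lives and shipping only that result.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StorageNode:
    """Hypothetical storage device that can also run a function on its own data."""
    records: List[float]

    def read_all(self) -> List[float]:
        # Conventional path: every record leaves the node.
        return list(self.records)

    def compute_local(self, fn: Callable[[List[float]], float]) -> float:
        # Compute-near-data path: only the result of fn leaves the node.
        return fn(self.records)


def central_sum(nodes: List[StorageNode]) -> Tuple[float, int]:
    """Move all data to one place, then add it up.

    Returns (total, number of values moved off the storage nodes).
    """
    total = 0.0
    moved = 0
    for node in nodes:
        data = node.read_all()
        moved += len(data)
        total += sum(data)
    return total, moved


def near_data_sum(nodes: List[StorageNode]) -> Tuple[float, int]:
    """Add up the data on each node, then combine the partial sums.

    Returns (total, number of values moved off the storage nodes).
    """
    partials = [node.compute_local(sum) for node in nodes]
    return sum(partials), len(partials)


if __name__ == "__main__":
    # Four nodes, each holding the values 0.0 through 999.0.
    nodes = [StorageNode([float(i) for i in range(1000)]) for _ in range(4)]
    print(central_sum(nodes))
    print(near_data_sum(nodes))
```

With four nodes holding 1,000 values each, both paths produce the same total, but the first moves 4,000 values and the second moves four. A real system would also have to account for the cost of the processors placed near storage, which this sketch ignores.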
IBM has built some of the world's fastest computers for decades, including the third- and fifth-fastest on a recent Top500 list. But the amount of data being fed to servers is outpacing the growth of supercomputing speeds. Networks aren't getting faster, chip clock speeds aren't increasing and data-access times aren't improving much, Turek said.
"Applications no longer just live in the classic compute microprocessors, instead application and workflow computation are distributed throughout the system hierarchy," Turek said.
IBM's execution model is proprietary, but Turek provided a simple example of reducing the size of data sets by decomposing information in storage, which can then be moved to memory. Such a model can be applied to an oil and gas workflow -- which typically takes months -- and it would significantly shorten the time required to make decisions about drilling.
"We see a hierarchy of storage and memory including nonvolatile RAM, which means much lower latency, higher bandwidths, without the requirement to move the data all the way back to central storage," Turek said.
IBM is not trying to challenge conventional computing architectures such as the Von Neumann approach, in which data is pushed into a processor, processed and pushed back into memory. Most computer systems today are based on the Von Neumann architecture, which was described in the 1940s by mathematician John von Neumann.
"At the individual compute element level, we continue the Von Neumann approach. At the level of the system, however, we are providing an additional way to compute, which is to move the compute to the data. There are multiple ways to reduce latency in a system and reduce the amount of data which has to be moved. This saves time and energy," Turek said.
Moving computing closer to data in storage or memory is not a new concept. IBM has been building appliances and servers with CPUs targeted at specific workloads, and has been disaggregating memory, storage and processing subsystems into separate boxes. But IBM is looking at optimizing entire supercomputing workloads that involve modeling, simulation, visualization and complex analytics on massive data sets.
The model will work in research areas like oil and gas exploration, life sciences, weather modeling, and materials research. Applications will need to be written and well-defined for processing at different levels, and IBM is working with companies, institutions and researchers to define software models for key sectors.
The fastest supercomputers today are ranked with the LINPACK benchmark, which is a simple measurement based on floating point operations. IBM isn't ignoring Top500, but is providing a different approach to speed up supercomputing.
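A LINPACK-style measurement can be sketched in a few lines: time the solution of a dense system of n linear equations, then divide a nominal operation count by the elapsed time. The sketch below is a toy in pure Python, not the benchmark code used for Top500 submissions. The function names are hypothetical, and the count of 2/3 n^3 + 2 n^2 is the one commonly quoted for LINPACK, with the lower-order term varying between implementations.

```python
import random
import time


def linpack_flop_count(n: int) -> float:
    """Nominal floating point operations credited for an n x n dense solve."""
    return (2.0 / 3.0) * n**3 + 2.0 * n**2


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] for row in a]  # work on copies so the inputs are untouched
    b = b[:]
    for k in range(n):
        # Swap in the row with the largest entry in column k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        # Eliminate column k from the rows below.
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
            b[i] -= factor * b[k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def measure_flops(n: int, seed: int = 0) -> float:
    """Time one random n x n solve and return nominal operations per second."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    start = time.perf_counter()
    solve(a, b)
    elapsed = time.perf_counter() - start
    return linpack_flop_count(n) / elapsed


if __name__ == "__main__":
    print(f"{measure_flops(100):.3e} nominal floating point operations per second")
```

The result is a single number that reflects only dense floating point arithmetic. It does not capture data movement, integer work or the behavior of specialized accelerators.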
LINPACK is good at measuring basic speed, but has under-represented the utility of supercomputers, Turek said, adding that the benchmark doesn't fully account for specialized processing elements like integer processing and FPGAs.
"The Top500 list measures some elements of the behavior of compute nodes, but is incomplete in terms of its characterization of workflows that require merging modeling, simulation and analytics. Our own research shows that many classic HPC applications are only moderately related to the measure of LINPACK," Turek said.
Organizations building supercomputers have learned to build software to take advantage of LINPACK, which is a poor measurement of supercomputing performance, said Nathan Brookwood, principal analyst at Insight 64.
"Top500 takes a very simple view of computer performance. Everybody loves simplicity," Brookwood said.
The real performance of some specialized applications goes far beyond LINPACK, and IBM's approach makes sense, Brookwood said.
"IBM is right, there's a lot of ways to skin the cat for different applications. Those with different applications will have a different effect, and it's hard to capture those numbers," Brookwood said.
Some companies are developing computers that put a new spin on how data is accessed and interpreted. D-Wave Systems is offering what is believed to be the world's first and only quantum computer, which is being used by NASA, Lockheed Martin and Google for specific tasks. Others are in the experimental phase. IBM has built an experimental computer with a chip designed to mimic a human brain. Hewlett-Packard's Machine has a new type of memory called the memristor and will transfer data using light beams.