Data center workloads for AI, graphics rendering, high-performance computing and business intelligence are getting a boost as a Who's Who of the world's biggest server makers and cloud providers snap up Nvidia's Volta-based Tesla V100 GPU accelerators.
Nvidia is rallying its entire ecosystem, including software makers, around the new Tesla V100s, effectively consolidating its dominance in GPUs for data centers.
IBM, HPE, Dell EMC and Supermicro announced at the Strata Data Conference in New York Wednesday that they are or will be using the GPUs, which are now shipping. Earlier this week at Nvidia's GPU Technology Conference in Beijing, Lenovo, Huawei and Inspur said they would be using Nvidia's HGX reference architecture to offer Volta architecture-based systems for hyperscale data centers.
Volta architecture is key
Volta GPUs mark a giant step forward from Nvidia's Pascal architecture. Volta-based Tesla V100s, for example, sport 21 billion transistors and 5,120 CUDA cores, running at boost clock speeds of 1,455MHz. The Pascal-based Tesla P100, by comparison, offers 3,584 CUDA cores and about 15 billion transistors.
Nvidia has about 70 percent of the market for discrete GPUs, and is just about the only game in town for machine-learning GPU workloads. Hardware is just part of the story, though. The ecosystem around its CUDA parallel computing platform and application programming interface (API) model, for example, has helped erect a barrier that competitors like Intel and AMD find hard to breach.
"While we focus a lot on the hardware and Volta is obviously a fine part, there should be as much if not more focus on all the supplemental products that are wrapped around it -- compilers, systems, software tools, etcetera -- that Nvidia honestly has had years head start developing," said Mercury Research's Dean McCarron.
Software tools for Volta-based Tesla V100 GPUs
Among these complementary tools is a new version of Nvidia's TensorRT, an optimizing compiler and runtime engine for deploying machine-learning systems in hyperscale data centers and on embedded or automotive GPU platforms.
Third-party software makers also stepped forward at Strata to announce applications that will run on the Volta-based GPUs. H2O.ai's Driverless AI program, for example, has been specifically tuned and is already running on Nvidia's Volta-based DGX-1 supercomputer, released earlier this month, and will run on any of the servers supporting the new Tesla V100s (as well as being backward compatible with Pascal systems).
Driverless AI is designed to let business users glean insights from data without needing expertise in machine learning models, and H2O.ai has found use cases in areas such as insurance, financial services and health care, according to SriSatish Ambati, the company's CEO and co-founder.
While the massively parallel architecture of GPUs makes them particularly suitable for machine-learning tasks such as training neural networks, servers incorporate the processors for a variety of tasks.
"The common theme is anything that needs a GPU but it's not just AI," Mercury Research's McCarron said. "AI, machine learning, virtualization, rendering, lots of parallel search functions -- all of that can obviously fit into a data center and that's where we've seen a lot of Nvidia's growth."
Kinetica, an in-memory database application that harnesses the computing power of GPUs, is also running on Nvidia DGX systems as well as other servers running Pascal- or Volta-based GPUs. "We've definitely seen a performance increase" with Volta, said Woody Christy, director of partner engineering at Kinetica, on the sidelines of Strata.
Kinetica's database speeds up Tableau queries and is used in financial services, retail, health care, utilities and the public sector for fast OLAP, AI, business intelligence and geospatial analytics.
Servers now available to support Tesla V100 GPUs
At Strata, meanwhile, server makers announcing systems based on Volta V100 GPUs include:
--Dell EMC: The company's PowerEdge R740 supports up to three V100 GPUs for PCIe; the PowerEdge R740XD runs up to three V100 GPUs for PCIe; and the PowerEdge C4130 can include up to four V100 GPUs for PCIe or four V100 GPUs for Nvidia NVLink technology in an SXM2 form factor.
--HPE: The HPE Apollo 6500 supports up to eight V100 GPUs for PCIe and the HPE ProLiant DL380 systems can run up to three V100 GPUs for PCIe.
--Supermicro: Products supporting the new GPUs include a 7048GR-TR workstation for general, high-performance computing; 4028GR-TXRT, 4028GR-TRT and 4028GR-TR2 servers designed to handle machine-learning applications; and 1028GQ-TRT servers built for applications such as advanced analytics.
--IBM: Upcoming IBM Power System servers based on the POWER9 processor will incorporate multiple V100 GPUs and use NVLink, featuring an OpenPower CPU-to-GPU design for maximum throughput. IBM said it will unveil more details by the end of the year.
In addition, at the GPU Technology Conference in Beijing, another vote of support for the Volta architecture came from hyperscale cloud providers Alibaba, Baidu and Tencent, which said they would shift from Pascal- to Volta-based GPUs in their data centers and cloud-service infrastructures.