Nvidia has announced a slew of AI-focused enterprise products at its annual GTC conference. They include details of its new silicon architecture, Hopper; the first datacenter GPU built using that architecture, the H100; a new Grace CPU “superchip”; and vague plans to build what the company claims will be the world’s fastest AI supercomputer, named Eos.
Nvidia has benefited hugely from the AI boom of the last decade, with its GPUs proving a perfect match for popular, data-intensive deep learning methods. As the AI sector’s demand for data compute grows, says Nvidia, it wants to provide more firepower.
In particular, the company stressed the popularity of a type of machine learning system known as a Transformer. This method has been incredibly fruitful, powering everything from language models like OpenAI’s GPT-3 to medical systems like DeepMind’s AlphaFold. Such models have increased exponentially in size over the space of a few years. When OpenAI launched GPT-2 in 2019, for example, it contained 1.5 billion parameters (or connections). When Google trained a similar model just two years later, it used 1.6 trillion parameters.
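To put that jump in perspective, here is a rough back-of-the-envelope sketch using only the two parameter counts quoted above. It treats the two figures as points on a single smooth growth curve, which is a simplifying assumption made for illustration, not something Nvidia or either lab has claimed.

```python
# Parameter counts as quoted in the article.
gpt2_params = 1.5e9      # OpenAI GPT-2, launched 2019
google_params = 1.6e12   # Google's model, trained two years later
years = 2

# Total growth factor between the two models.
growth = google_params / gpt2_params

# Implied yearly multiplier, ASSUMING steady compounding over the two years.
annual = growth ** (1 / years)

print(f"Total growth: about {growth:,.0f}x")
print(f"Implied annual growth: about {annual:.1f}x per year")
```

On those assumptions, the quoted figures work out to a roughly thousandfold increase in two years.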
As AI demands more compute, Nvidia wants to deliver it
“Training these giant models still takes months,” said Nvidia senior director of product management Paresh Kharya in a press briefing. “So you fire a job and wait for one and half months to see what happens. A key challenge to reducing this time to train is that performance gains start to decline as you increase the number of GPUs in a data center.”
Nvidia says its new Hopper architecture will help ameliorate these difficulties. Named after pioneering computer scientist and US Navy Rear Admiral Grace Hopper, the architecture is specialized to accelerate the training of Transformer models on H100 GPUs by six times compared to previous-generation chips, while the new fourth-generation Nvidia NVLink can connect up to 256 H100 GPUs at nine times higher bandwidth than the previous generation.
The H100 GPU itself contains 80 billion transistors and is the first GPU to support PCIe Gen5 and utilize HBM3, enabling memory bandwidth of 3TB/s. Nvidia says an H100 GPU is three times faster than its previous-generation A100 at FP16, FP32, and FP64 compute, and six times faster at 8-bit floating point math.
“For the training of giant Transformer models, H100 will offer up to nine times higher performance, training in days what used to take weeks,” said Kharya.
The company also announced a new data center CPU, the Grace CPU Superchip, which consists of two CPUs connected directly via a new low-latency NVLink-C2C. The chip is designed to “serve giant-scale HPC and AI applications” alongside the new Hopper-based GPUs, and can be used for CPU-only systems or GPU-accelerated servers. It has 144 Arm cores and 1TB/s of memory bandwidth.
In addition to hardware and infrastructure news, Nvidia announced updates to its various enterprise AI software services, including Maxine (an SDK to deliver audio and video enhancements, intended to power things like virtual avatars) and Riva (an SDK used for both speech recognition and text-to-speech).
The company also teased that it was building a new AI supercomputer, which it claims will be the world’s fastest when deployed. The supercomputer, named Eos, will be built using the Hopper architecture and contain some 4,600 H100 GPUs to offer 18.4 exaflops of “AI performance.” The system will be used for Nvidia’s internal research only, and the company said it would be online in a few months’ time.
Over the past few years, a number of companies with strong interest in AI have built or announced their own in-house “AI supercomputers” for internal research, including Microsoft, Tesla, and Meta. These systems are not directly comparable with regular supercomputers as they run at a lower level of accuracy, which has allowed a number of firms to quickly leapfrog one another by announcing the world’s fastest.
However, during his keynote address, Nvidia CEO Jensen Huang did say that Eos, when running traditional supercomputer tasks, would rack up 275 petaFLOPS of compute, 1.4 times faster than “the fastest science computer in the US” (Summit). “We expect Eos to be the fastest AI computer in the world,” said Huang. “Eos will be the blueprint for the most advanced AI infrastructure for our OEMs and cloud partners.”
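The gap between Eos's "AI performance" and its traditional supercomputing figure is easier to see with some quick arithmetic. The sketch below uses only the numbers Nvidia quoted; the GPU count is approximate ("some 4,600"), so the per-GPU results are rough estimates rather than official specifications, and the Summit figure is simply what the 1.4x claim implies.

```python
# Figures as quoted by Nvidia for Eos.
eos_gpus = 4_600            # "some 4,600" H100 GPUs (approximate)
eos_ai_flops = 18.4e18      # 18.4 exaflops of "AI performance"
eos_hpc_flops = 275e15      # 275 petaFLOPS on traditional supercomputer tasks
speedup_vs_summit = 1.4     # claimed advantage over Summit

# Per-GPU share of each headline number (an estimate, given the rounded GPU count).
ai_per_gpu = eos_ai_flops / eos_gpus
hpc_per_gpu = eos_hpc_flops / eos_gpus

# Summit performance implied by the "1.4 times faster" claim.
implied_summit = eos_hpc_flops / speedup_vs_summit

# How much larger the "AI performance" number is than the traditional one.
ai_to_hpc = eos_ai_flops / eos_hpc_flops

print(f"AI performance per GPU: about {ai_per_gpu / 1e15:.1f} petaflops")
print(f"Traditional compute per GPU: about {hpc_per_gpu / 1e12:.1f} teraflops")
print(f"Implied Summit performance: about {implied_summit / 1e15:.0f} petaflops")
print(f"AI figure vs. traditional figure: about {ai_to_hpc:.0f}x")
```

The headline "AI performance" number comes out dozens of times larger than the traditional figure for the same machine, which is why the two kinds of supercomputer claims are not directly comparable.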