Nvidia is introducing a new top-of-the-line chip for AI work, the HGX H200. The new GPU upgrades the wildly in-demand H100 with 1.4x more memory bandwidth and 1.8x more memory capacity, improving its ability to handle intensive generative AI work.
The big question is whether companies will be able to get their hands on the new chips or whether they’ll be as supply constrained as the H100 — and Nvidia doesn’t quite have an answer for that. The first H200 chips will be released in the second quarter of 2024, and Nvidia says it’s working with “global system manufacturers and cloud service providers” to make them available. Nvidia spokesperson Kristin Uchiyama declined to comment on production numbers.
The H200 appears to be substantially the same as the H100 outside of its memory. But the changes to its memory make for a meaningful upgrade. The new GPU is the first to use a new, faster memory spec called HBM3e. That brings the GPU’s memory bandwidth to 4.8 terabytes per second, up from 3.35 terabytes per second on the H100, and its total memory capacity to 141GB, up from the 80GB of its predecessor.
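The headline multipliers follow directly from those spec figures. As a rough back-of-the-envelope check (the variable names here are ours, not Nvidia’s):

```python
# Sanity-check the claimed H200-vs-H100 multipliers using the
# spec figures quoted above (approximate, not official Nvidia math).
h100_bandwidth_tbps = 3.35   # H100 memory bandwidth, TB/s
h200_bandwidth_tbps = 4.8    # H200 memory bandwidth, TB/s
h100_capacity_gb = 80        # H100 memory capacity, GB
h200_capacity_gb = 141       # H200 memory capacity, GB

bandwidth_ratio = h200_bandwidth_tbps / h100_bandwidth_tbps
capacity_ratio = h200_capacity_gb / h100_capacity_gb

print(f"Bandwidth: {bandwidth_ratio:.2f}x")  # ~1.43x, the "1.4x" figure
print(f"Capacity:  {capacity_ratio:.2f}x")   # ~1.76x, roughly the "1.8x" figure
```

The capacity ratio rounds up slightly in Nvidia’s marketing (1.76x becomes “1.8x”), but both claims line up with the published numbers.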
“The integration of faster and more extensive HBM memory serves to accelerate performance across computationally demanding tasks including generative AI models and [high-performance computing] applications while optimizing GPU utilization and efficiency,” Ian Buck, Nvidia’s VP of high-performance computing products, said in a video presentation this morning.
The H200 is also built to be compatible with the same systems that already support H100s. Nvidia says cloud providers won’t need to make any changes as they add H200s into the mix. The cloud arms of Amazon, Google, Microsoft, and Oracle will be among the first to offer the new GPUs next year.
Once they launch, the new chips are sure to be expensive. Nvidia doesn’t list how much they cost, but CNBC reports that the prior-generation H100s are estimated to sell for anywhere between $25,000 and $40,000 each, with thousands of them needed to operate at the highest levels. Uchiyama said pricing is set by Nvidia’s partners.
Nvidia’s announcement comes as AI companies remain desperately on the hunt for its H100 chips. Nvidia’s chips are seen as the best option for efficiently processing the huge quantities of data needed to train and operate generative image tools and large language models. The chips are valuable enough that companies are using them as collateral for loans. Who has H100s is the subject of Silicon Valley gossip, and startups have been banding together just to share whatever access to them they can get.
Uchiyama said that the H200’s debut won’t impact production of the H100. “You’ll see us add overall supply throughout the year and we are continuing to purchase supply for the long term,” Uchiyama wrote in an email to The Verge.
Next year is shaping up to be a more auspicious time for GPU buyers. In August, the Financial Times reported that Nvidia was planning to triple its production of the H100 in 2024. The goal was to produce up to 2 million of them next year, up from around 500,000 in 2023. But with generative AI just as explosive today as it was at the beginning of the year, the demand may only be greater — and that’s before Nvidia threw an even hotter new chip in the mix.
Update November 13th 4:35PM ET: Added additional information from Nvidia spokesperson Kristin Uchiyama.