Skip to main content

Meta has built an AI supercomputer it says will be world’s fastest by end of 2022

Meta has built an AI supercomputer it says will be world’s fastest by end of 2022

/

Designed to train the next generation of machine learning systems

Share this story

Facebook Prineville Data Center
Servers at a Facebook datacenter.
Photo: Vjeran Pavic

Social media conglomerate Meta is the latest tech company to build an “AI supercomputer” — a high-speed computer designed specifically to train machine learning systems. The company says its new AI Research SuperCluster, or RSC, is already among the fastest machines of its type and, when complete in mid-2022, will be the world’s fastest.

“Meta has developed what we believe is the world’s fastest AI supercomputer,” said Meta CEO Mark Zuckerberg in a statement. “We’re calling it RSC for AI Research SuperCluster and it’ll be complete later this year.”

The news demonstrates the absolute centrality of AI research to companies like Meta. Rivals like Microsoft and Nvidia have already announced their own “AI supercomputers,” which are slightly different from what we think of as regular supercomputers. RSC will be used to train a range of systems across Meta’s businesses: from content moderation algorithms used to detect hate speech on Facebook and Instagram to augmented reality features that will one day be available in the company’s future AR hardware. And, yes, Meta says RSC will be used to design experiences for the metaverse — the company’s insistent branding for an interconnected series of virtual spaces, from offices to online arenas.

“RSC will help Meta’s AI researchers build new and better AI models”

“RSC will help Meta’s AI researchers build new and better AI models that can learn from trillions of examples; work across hundreds of different languages; seamlessly analyze text, images, and video together; develop new augmented reality tools; and much more,” write Meta engineers Kevin Lee and Shubho Sengupta in a blog post outlining the news.

“We hope RSC will help us build entirely new AI systems that can, for example, power real-time voice translations to large groups of people, each speaking a different language, so they can seamlessly collaborate on a research project or play an AR game together.”

Meta’s AI supercomputer is due to be complete by mid-2022.
Meta’s AI supercomputer is due to be complete by mid-2022.
Image: Meta

Work on RSC began a year and a half ago, with Meta’s engineers designing the machine’s various systems — cooling, power, networking, and cabling — entirely from scratch. Phase one of RSC is already up and running and consists of 760 Nvidia GGX A100 systems containing 6,080 connected GPUs (a type of processor that’s particularly good at tackling machine learning problems). Meta says it’s already providing up to 20 times improved performance on its standard machine vision research tasks.

Before the end of 2022, though, phase two of RSC will be complete. At that point, it’ll contain some 16,000 total GPUs and will be able to train AI systems “with more than a trillion parameters on data sets as large as an exabyte.” (This raw number of GPUs only provides a narrow metric for a system’s overall performance, but, for comparison’s sake, Microsoft’s AI supercomputer built with research lab OpenAI is built from 10,000 GPUs.)

These numbers are all very impressive, but they do invite the question: what is an AI supercomputer anyway? And how does it compare to what we usually think of as supercomputers — vast machines deployed by universities and governments to crunch numbers in complex domains like space, nuclear physics, and climate change?

The two types of systems, known as high-performance computers or HPCs, are certainly more similar than they are different. Both are closer to datacenters than individual computers in size and appearance and rely on large numbers of interconnected processors to exchange data at blisteringly fast speeds. But there are key differences between the two, as HPC analyst Bob Sorensen of Hyperion Research explains to The Verge. “AI-based HPCs live in a somewhat different world than their traditional HPC counterparts,” says Sorensen, and the big distinction is all about accuracy.

AI supercomputers and regular supercomputers can’t necessarily be compared apples to apples

The brief explanation is that machine learning requires less accuracy than the tasks put to traditional supercomputers, and so “AI supercomputers” (a bit of recent branding) can carry out more calculations per second than their regular brethren using the same hardware. That means when Meta says it’s built the “world’s fastest AI supercomputer,” it’s not necessarily a direct comparison to the supercomputers you often see in the news (rankings of which are compiled by the independent Top500.org and published twice a year).

To explain this a little more, you need to know that both supercomputers and AI supercomputers make calculations using what is known as floating-point arithmetic — a mathematical shorthand that’s extremely useful for making calculations using very large and very small numbers (the “floating point” in question is the decimal point, which “floats” between significant figures). The degree of accuracy deployed in floating-point calculations can be adjusted based on different formats, and the speed of most supercomputers is calculated using what are known as 64-bit floating-point operations per second, or FLOPs. However, because AI calculations require less accuracy, AI supercomputers are often measured in 32-bit or even 16-bit FLOPs. That’s why comparing the two types of systems is not necessarily apples to apples, though this caveat doesn’t diminish the incredible power and capacity of AI supercomputers.

Sorensen offers one extra word of caution, too. As is often the case with the “speeds and feeds” approach to assessing hardware, vaunted top speeds are not always representative. “HPC vendors typically quote performance numbers that indicate the absolute fastest their machine can run. We call that the theoretical peak performance,” says Sorensen. “However, the real measure of a good system design is one that can run fast on the jobs they are designed to do. Indeed, it is not uncommon for some HPCs to achieve less than 25 percent of their so-called peak performance when running real-world applications.”

In other words: the true utility of supercomputers is to be found in the work they do, not their theoretical peak performance. For Meta, that work means building moderation systems at a time when trust in the company is at an all-time low and means creating a new computing platform — whether based on augmented reality glasses or the metaverse — that it can dominate in the face of rivals like Google, Microsoft, and Apple. An AI supercomputer offers the company raw power, but Meta still needs to find the winning strategy on its own.

Today’s Storystream

Feed refreshed 22 minutes ago Not just you

T
Youtube
Thomas Ricker22 minutes ago
Table breaks before Apple Watch Ultra’s sapphire glass.

”It’s the most rugged and capable Apple Watch yet,” said Apple at the launch of the Apple Watch Ultra (read The Verge review here). YouTuber TechRax put that claim to the test with a series of drop, scratch, and hammer tests. Takeaways: the titanium case will scratch with enough abuse, and that flat sapphire front crystal is tough — tougher than the table which cracks before the Ultra fails — but not indestructible.


E
Twitter
Emma RothSep 25
Rihanna’s headlining the Super Bowl Halftime Show.

Apple Music’s set to sponsor the Halftime Show next February, and it’s starting out strong with a performance from Rihanna. I honestly can’t remember which company sponsored the Halftime Show before Pepsi, so it’ll be nice to see how Apple handles the show for Super Bowl LVII.


E
Twitter
Emma RothSep 25
Starlink is growing.

The Elon Musk-owned satellite internet service, which covers all seven continents including Antarctica, has now made over 1 million user terminals. Musk has big plans for the service, which he hopes to expand to cruise ships, planes, and even school buses.

Musk recently said he’ll sidestep sanctions to activate the service in Iran, where the government put restrictions on communications due to mass protests. He followed through on his promise to bring Starlink to Ukraine at the start of Russia’s invasion, so we’ll have to wait and see if he manages to bring the service to Iran as well.


E
External Link
Emma RothSep 25
We might not get another Apple event this year.

While Apple was initially expected to hold an event to launch its rumored M2-equipped Macs and iPads in October, Bloomberg’s Mark Gurman predicts Apple will announce its new devices in a series of press releases, website updates, and media briefings instead.

I know that it probably takes a lot of work to put these polished events together, but if Apple does pass on it this year, I will kind of miss vibing to the livestream’s music and seeing all the new products get presented.


Welcome to the new Verge

Revolutionizing the media with blog posts

Nilay PatelSep 13
E
External Link
Emma RothSep 24
California Governor Gavin Newsom vetoes the state’s “BitLicense” law.

The bill, called the Digital Financial Assets Law, would establish a regulatory framework for companies that transact with cryptocurrency in the state, similar to New York’s BitLicense system. In a statement, Newsom says it’s “premature to lock a licensing structure” and that implementing such a program is a “costly undertaking:”

A more flexible approach is needed to ensure regulatory oversight can keep up with rapidly evolving technology and use cases, and is tailored with the proper tools to address trends and mitigate consumer harm.


A
The Verge
Andrew WebsterSep 24
Get ready for some Netflix news.

At 1PM ET today Netflix is streaming its second annual Tudum event, where you can expect to hear news about and see trailers from its biggest franchises, including The Witcher and Bridgerton. I’ll be covering the event live alongside my colleague Charles Pulliam-Moore, and you can also watch along at the link below. There will be lots of expected names during the stream, but I have my fingers crossed for a new season of Hemlock Grove.


J
Twitter
Jay PetersSep 23
Twitch’s creators SVP is leaving the company.

Constance Knight, Twitch’s senior vice president of global creators, is leaving for a new opportunity, according to Bloomberg’s Cecilia D’Anastasio. Knight shared her departure with staff on the same day Twitch announced impending cuts to how much its biggest streamers will earn from subscriptions.


T
Twitter
Tom WarrenSep 23
Has the Windows 11 2022 Update made your gaming PC stutter?

Nvidia GPU owners have been complaining of stuttering and poor frame rates with the latest Windows 11 update, but thankfully there’s a fix. Nvidia has identified an issue with its GeForce Experience overlay and the Windows 11 2022 Update (22H2). A fix is available in beta from Nvidia’s website.