Nvidia has created the first video game demo using AI-generated graphics

A side-by-side comparison of real video footage and Nvidia’s AI-generated demo.
Credit: Nvidia

The recent boom in artificial intelligence has produced impressive results in a somewhat surprising realm: the world of image and video generation. The latest example comes from chip designer Nvidia, which today published research showing how AI-generated visuals can be combined with a traditional video game engine. The result is a hybrid graphics system that could one day be used in video games, movies, and virtual reality.

“It’s a new way to render video content using deep learning,” Nvidia’s vice president of applied deep learning, Bryan Catanzaro, told The Verge. “Obviously Nvidia cares a lot about generating graphics [and] we’re thinking about how AI is going to revolutionize the field.”

The results of Nvidia’s work aren’t photorealistic and show the trademark visual smearing found in much AI-generated imagery. Nor are they totally novel. In a research paper, the company’s engineers explain how they built upon a number of existing methods, including an influential open-source system called pix2pix. Their work deploys a type of neural network known as a generative adversarial network, or GAN. These are widely used in AI image generation, including for the creation of an AI portrait recently sold by Christie’s.

But Nvidia has introduced a number of innovations, and one product of this work, it says, is the first ever video game demo with AI-generated graphics. It’s a simple driving simulator where players navigate a few city blocks of AI-generated space, but can’t leave their car or otherwise interact with the world. The demo is powered by just a single GPU — a notable achievement for such cutting-edge work. (Though admittedly that GPU is the company’s top-of-the-range $3,000 Titan V, “the most powerful PC GPU ever created” and one typically used for advanced simulation processing rather than gaming.)

Nvidia’s system generates graphics using a few steps. First, researchers have to collect training data, which in this case was taken from open-source datasets used for autonomous driving research. This footage is then segmented, meaning each frame is broken into different categories: sky, cars, trees, road, buildings, and so on. A generative adversarial network is then trained on this segmented data to generate new versions of these objects.
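The segmentation step above can be sketched in a few lines. This is a toy illustration, not Nvidia’s pipeline: the class IDs and NumPy encoding below are assumptions, shown only to make concrete what “broken into different categories” means before a GAN is conditioned on it.

```python
import numpy as np

# Hypothetical class IDs for the categories named in the article.
CLASSES = {"sky": 0, "car": 1, "tree": 2, "road": 3, "building": 4}

def one_hot_segmentation(label_map: np.ndarray, num_classes: int) -> np.ndarray:
    """Turn an (H, W) integer label map into (num_classes, H, W) binary channels.

    A generator conditioned on these channels never sees raw pixels during
    inference; it only needs to have learned what each category looks like.
    """
    h, w = label_map.shape
    one_hot = np.zeros((num_classes, h, w), dtype=np.float32)
    for c in range(num_classes):
        one_hot[c][label_map == c] = 1.0
    return one_hot

# A tiny 2x3 "frame": top row is sky, bottom row is road with one car.
frame = np.array([[0, 0, 0],
                  [3, 1, 3]])
encoded = one_hot_segmentation(frame, num_classes=len(CLASSES))
```

Each pixel ends up lit in exactly one channel, which is what lets the network treat “car pixels” and “road pixels” as separate generation problems.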

Next, engineers created the basic topology of the virtual environment using a traditional game engine. In this case the system was Unreal Engine 4, a popular engine used for titles such as Fortnite, PUBG, Gears of War 4, and many others. Using this environment as a framework, deep learning algorithms then generate the graphics for each different category of item in real time, pasting them on to the game engine’s models.
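The division of labour described above implies a simple per-frame loop: the engine emits a semantic layout, and a trained generator fills in the pixels. The sketch below is a stand-in illustration with made-up functions and a flat-colour “generator,” not Nvidia’s code; a real generator would synthesise textures learned from the driving footage.

```python
import numpy as np

def engine_layout(t: int, h: int = 4, w: int = 4) -> np.ndarray:
    """Stand-in for the game engine: returns a per-pixel class map
    for frame t (here just a fixed sky-over-road scene)."""
    layout = np.zeros((h, w), dtype=np.int64)
    layout[h // 2:, :] = 3  # bottom half is "road"
    return layout

def generator(layout: np.ndarray) -> np.ndarray:
    """Stand-in for the trained GAN: paints each class a flat colour."""
    palette = {0: (135, 206, 235), 3: (90, 90, 90)}  # sky blue, asphalt grey
    h, w = layout.shape
    frame = np.zeros((h, w, 3), dtype=np.uint8)
    for cls, colour in palette.items():
        frame[layout == cls] = colour
    return frame

# One step of the loop: layout from the engine, pixels from the network.
rendered = generator(engine_layout(t=0))
```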

“The structure of the world is being created traditionally,” explains Catanzaro, “the only thing the AI generates is the graphics.” He adds that the demo itself is basic, and was put together by a single engineer. “It’s proof-of-concept rather than a game that’s fun to play.”

A comparison of AI-generated imagery. Top left is the segmentation map; top right pix2pixHD; bottom left COVST; bottom right, Nvidia’s system, vid2vid.
Credit: Nvidia

To create this system Nvidia’s engineers had to work around a number of challenges, the biggest of which was object permanence. The problem is, if the deep learning algorithms are generating the graphics for the world at a rate of 25 frames per second, how do they keep objects looking the same? Catanzaro says this problem meant the initial results of the system were “painful to look at” as colors and textures “changed every frame.”

The solution was to give the system a short-term memory, so that it would compare each new frame with what’s gone before. It tries to predict things like motion within these images, and creates new frames that are consistent with what’s on screen. All this computation is expensive though, and so the game only runs at 25 frames per second.
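One simple way to picture that short-term memory is as a blend between what was on screen and what the generator just produced. This is only a schematic sketch: vid2vid actually warps previous frames using predicted motion rather than averaging them, but the principle — each frame is constrained by its predecessors — is the same.

```python
import numpy as np

def temporally_smoothed(prev_output: np.ndarray,
                        new_generation: np.ndarray,
                        weight: float = 0.7) -> np.ndarray:
    """Blend the new frame with the previous output so colours and
    textures can't change abruptly from one frame to the next.
    (A crude proxy for the flow-based consistency in the real system.)"""
    return weight * prev_output + (1.0 - weight) * new_generation

prev = np.full((2, 2), 100.0)   # what was on screen last frame
fresh = np.full((2, 2), 200.0)  # raw, unconstrained generator output
frame = temporally_smoothed(prev, fresh)  # 0.7*100 + 0.3*200 = 130
```

The extra work of comparing against past frames is part of why the demo tops out at 25 frames per second.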

The technology is very much at the early stages, stresses Catanzaro, and it will likely be decades until AI-generated graphics show up in consumer titles. He compares the situation to the development of ray tracing, the current hot technique in graphics rendering where individual rays of light are generated in real time to create realistic reflections, shadows, and opacity in virtual environments. “The very first interactive ray tracing demo happened a long, long time ago, but we didn’t get it in games until just a few weeks ago,” he says.

The work does have potential applications in other areas of research, though, including robotics and self-driving cars, where it could be used to generate training environments. And it could show up in consumer products sooner, albeit in a more limited capacity.

For example, this technology could be used in a hybrid graphics system, where the majority of a game is rendered using traditional methods, but AI is used to create the likenesses of people or objects. Consumers could capture footage themselves using smartphones, then upload this data to the cloud where algorithms would learn to copy it and insert it into games. It would make it easier to create avatars that look just like players, for example.

This sort of technology raises some obvious questions, though. In recent years experts have become increasingly worried about the use of AI-generated deepfakes for disinformation and propaganda. Researchers have shown it’s easy to generate fake footage of politicians and celebrities saying or doing things that they didn’t, a potent weapon in the wrong hands. By pushing forward the capabilities of this technology and publishing its research, Nvidia is arguably contributing to this potential problem.

The company, though, says this is hardly a new issue. “Can [this technology] be used for creating content that’s misleading? Yes. Any technology for rendering can be used to do that,” says Catanzaro. He says Nvidia is working with partners to research methods for detecting AI fakes, but that ultimately the problem of misinformation is a “trust issue.” And, like many trust issues before it, it will have to be solved with an array of methods, not just technological.

Catanzaro says tech companies like Nvidia can only take so much responsibility. “Do you hold the power company responsible because they created the electricity that powers the computer that makes the fake video?” he asks.

And ultimately, for Nvidia, pushing forward with AI-generated graphics has an obvious benefit: it will help sell more of the company’s hardware. Since the deep learning boom took off in the early 2010s, Nvidia’s stock price has surged as it became obvious that its computer chips were ideally suited for machine learning research and development.

So would an AI revolution in computer graphics be good for the company’s revenue? It certainly wouldn’t hurt, Catanzaro laughs. “Anything that increases our ability to generate graphics that are more realistic and compelling I think is good for Nvidia’s bottom line.”


Machines creating an artificial reality…no big deal.

I’m all ears…

Now give me that red pill already

About bloody time….given how many years it takes to create top tier immersive worlds for a video game, we need tech that can reduce the workload.

First movies now games.

Soon you will be able to just check a few boxes of what game or movie you want, wait 24 hours for the game to be built and voila.

The technology is very much at the early stages, stresses Catanzaro

Yes, clearly. That demo was very impressive from an engineering standpoint but awful from a CG graphics design perspe—

and it will likely be decades until AI-generated graphics show up in consumer titles.

Wow. It’s only been four hours, I’ve trudged through a Fox News comment section, and even interacted with a The_Donald and pro-Duterte Redditor, and yet this takes the cake for the most conservative thing I’ve read all day.

Unless you’re predicting sapient general AI, the absolute worst kind of timescale you can use in computer science for something that already exists (but is experimental) is "decades". Maybe decade—period— but if it takes until the 2040s or 2050s for this tech to be used in commercially-released products, that only signals civilization has entered a state of collapse.

That, or he’s trying to not cause any panic. I can’t imagine many artists and animators who specialize in 3D graphics would be very excited to hear a machine could do their job within a decade.

and it will likely be decades until AI-generated graphics show up in consumer titles.

Not to mention the AI anti-aliasing technology used by RTX cards already. The moment somebody makes a huge breakthrough (i.e., working on an AI that can develop a better neural network training program), the timeframe is shortened by thousands of times for any future research.

As scary as things can be, it got me excited about all the things that will happen in this next decade. 2028 will be way different.

"AI-generated" for increasingly trivial definitions of "AI". I remember when AI meant actual synthetic intelligence, not just pretty sophisticated algorithms. But whatever. It’s a cool demo, I guess.

I mean, aren’t you just a pretty sophisticated algorithm?


"I remember when AI meant actual synthetic intelligence"
But here’s the thing we discovered: AI is everything that hasn’t been done yet. Once we do it, it’s no longer considered AI.
Once upon a time, it was thought that you needed AI in order for a computer to beat all humans at tic-tac-toe. AI mastered that in the 1950s. Then we thought you needed synthetic intelligence to beat all humans at chess. Nope, we accomplished that in the ’80s and then Deep Blue defeated Kasparov in ’97. Afterwards, we thought you needed AGI to beat a human champion at Go. Not at all: ANI is all you really need.

Basically, anything that we think requires AGI to accomplish, we can do with a sufficiently powerful ANI (aka "sophisticated algorithms"). Not including actual sapience, mind you, but anything really practical can be done with machine learning, tree searches, scripts, neural networks, whathaveyou.

How long until Star Citizen announces that they’re going to start over again to implement this?

Sounds like maybe they need to be embedding some kind of digital DNA into characters generated through these systems to avoid fakes being passed off as real.

It’s all fake! I say it’s all fake!! Oh, wait. It’s supposed to be fake. Doh!

This is just going to be RTX all over again: another overhyped technology that will be exclusive to expensive high-end Nvidia cards, that performs terribly on both their own cards and AMD’s, and also destroys the framerate.


Even if this tech were to be improved with better framerates, it still wouldn’t be available on mainstream cards for a long time.

The world generated by AI in real time at 25 fps!
This is frikking amazing!

Imagine the amount of time that will be saved by studios.
It may be bad for graphics generating coders, but amazing time saving for everyone else – including game designers and artists.

Please see this video from Blender conference to see what I mean: https://www.youtube.com/watch?v=FlgLxSLsYWQ
