TikTok now offers a very basic text-to-image AI generator directly in the app

The app’s ‘AI greenscreen’ filter shows text-to-image has gone mainstream

Illustration by Alex Castro / The Verge

Text-to-image AI systems are booming in both ability and popularity right now, and what better proof than their appearance in the world’s hottest app: TikTok.

The video platform recently added a new effect it calls “AI greenscreen” that allows users to type in a text prompt that the software will then generate as an image. This image can then be used as the background to a video — potentially a very useful tool for creators.

The output of TikTok’s system is pretty basic compared to that of state-of-the-art text-to-image models like Google’s Imagen, OpenAI’s DALL-E 2, or Midjourney’s eponymous software. It creates only rather abstract and swirling images, a strength reflected in the dreamy nature of TikTok’s suggested prompts like “astronaut in the ocean” and “flower galaxy.” Other models, by comparison, can produce both photorealistic imagery and complex, coherent illustrations that look like they were drawn or painted by humans.

TikTok’s model only produces swirling, abstracted, smeared images. Not state-of-the-art — but probably for the better.
Image: The Verge

The limitations of TikTok’s model may well be intentional, though. First, more advanced models require greater computing power, which would be expensive and resource-intensive for the company to implement. Second, TikTok has more than a billion users, and giving all these individuals the power to create photorealistic images of anything they can imagine would almost certainly produce some troubling results.

For example, we tested the model’s ability to create nudity and gore, two types of output that text-to-image generators often try to limit. Pictures based on violent prompts like “assassination of Boris Johnson” and “assassination of Joe Biden” produce mostly abstract swirls, with a just-about-recognizable face for the UK’s prime minister (though the man’s familiar blond mop does make caricature particularly easy).

The abstract nature of the model’s output means that prompts with provocative language only produce swirls.
Image: The Verge

Likewise, a request involving nudity — “naked model on beach” — produces thematically appropriate colors, including flesh-tones, sandy oranges, and ocean blues, but nothing that would make a vicar blush.

Trying to get the model to generate nude imagery gets you nowhere.
Image: The Verge

What’s notable about the appearance of TikTok’s “AI greenscreen,” then, is that it shows just how fast this technology is going mainstream. The latest cycle of development for text-to-image AI arguably began in 2021 with the original release of DALL-E by OpenAI. Less than two years later, the tech is already in the hands of millions via an app like TikTok.

Given the potential of these systems for both harm and good, things are only going to get stranger from here on in.