OpenAI announced the third version of its generative AI visual art platform DALL-E, which now lets users create prompts with ChatGPT and includes more safety options.
DALL-E converts text prompts to images. But even DALL-E 2 got things wrong, often ignoring specific wording. The latest version, OpenAI researchers said, understands context much better.
A new feature of DALL-E 3 is integration with ChatGPT. Users no longer have to come up with their own detailed prompt to guide DALL-E 3; they can just ask ChatGPT for one, and the chatbot will write out a paragraph (DALL-E works better with longer sentences) for DALL-E 3 to follow. Users who have specific ideas for DALL-E can still write their own prompts.
In a demo to The Verge, Aditya Ramesh, lead researcher and head of the DALL-E team, prompted ChatGPT to help him come up with a logo for a ramen restaurant in the mountains. ChatGPT then wrote a longer prompt, and DALL-E came up with four options. My favorite was a rendering of a mountain with ramen snowcaps, broth flowing down like a waterfall, and pickled eggs on the ground like garden stones — although it looked more like an illustration for some nice merch than a conventional restaurant logo. This connection with the chatbot, OpenAI said, allows more people to create AI art because they don’t have to be very good at coming up with a prompt.
DALL-E, first released in January 2021, came before other text-to-image generative AI art platforms by Stability AI and Midjourney. When DALL-E 2 was released in 2022, OpenAI opened a waitlist to control who got to use the platform after criticism that DALL-E could generate photorealistic explicit images and showed bias when generating images. The company removed the waitlist in September 2022 and opened DALL-E 2 to the public.
This new version of DALL-E will be released first to ChatGPT Plus and ChatGPT Enterprise users in October, followed by research labs and its API service in the fall. OpenAI plans to stagger the release of DALL-E 3 but has not committed to a date for a free public version.
OpenAI says much of its work on DALL-E 3 went into building robust safety measures to prevent the creation of lewd or potentially hateful images. OpenAI said it worked with external red teamers (a group that intentionally tries to break a system to test its safety) and relied on input classifiers, a way to teach language models to ignore certain words to avoid explicit or violent prompts. DALL-E 3 will also be unable to recreate images of public figures when the prompt specifically mentions a name.
Sandhini Agarwal, a policy researcher at the company, said she has “high confidence” in its safety measures but clarified that the model continually improves and is not perfect. OpenAI representatives said in an email that DALL-E 3 has been trained to decline to generate images in the style of living artists, unlike DALL-E 2, which, when prompted, can sort of mimic the style of certain artists.
OpenAI, possibly to avoid lawsuits, will also allow artists to opt their art out of future versions of text-to-image AI models. Creators can submit an image that they own the rights to and request its removal through a form on OpenAI's website. A future version of DALL-E can then block results that look similar to the artist’s image and style. Artists have sued DALL-E competitors Stability AI and Midjourney, along with art website DeviantArt, for allegedly using the artists' copyrighted work to train text-to-image models.