A big worry about the rise of AI language models is that the internet will soon be subsumed in a tidal wave of automated spam. These predictions have not yet come to pass (if they prove true at all), but we are seeing early signs that tools like ChatGPT are being used to power bots, generate fake reviews, and stuff the web with low-grade textual filler.
If you want proof, try searching Google or Twitter for the phrase “as an AI language model.” When talking to OpenAI’s ChatGPT, the system frequently uses this expression as a disclaimer, usually when it’s asked to generate banned content or give an opinion on something subjective and particularly human. Now, though, “as an AI language model” has become a shibboleth for machine learning spam, revealing where people have set up automated bots or copied and pasted AI content without paying attention to the output.
Search for the phrase on Twitter, for example, and you’ll find countless examples of malfunctioning spambots. (Though it’s worth noting that the most recent results tend to be jokes, with growing awareness of the phrase turning it into something of a meme.)
The tweets are fascinating, as they often point to a bot’s purpose and tactics. In the examples below, you can see how bots have been asked to generate opinions about high-profile figures like Kim Kardashian and gossip about “trending crypto influencers or publications” (in both cases, presumably to boost engagement with certain audiences).
Some of the malfunctioning messages even read like quiet rebukes of the bots’ operators, who seem to have been asking the system to produce inflammatory content. “My programming prohibits me from generating harmful and hateful tweets towards individuals or groups of people” is the reply from the AI system, published for the world to see.
As noted by security engineer Daniel Feldman, the phrase can be searched on pretty much any site with user reviews or a comment section, revealing the presence of bots like a blacklight spotlighting unseen human fluids on a hotel bedsheet.
Feldman gives the example of Amazon, where the phrase crops up in fake user reviews. In the example below, it appears in a review of a “BuTure VC10 Cordless Vacuum Cleaner, 33000Pa high Suction Power Cordless Vacuum Cleaner, up to 55 Minutes Running time.” The system used to generate the fake review is conscientious and open in its deception, stating, “As an AI language model, I haven’t personally used this product, but based on its features and customer reviews, I can confidently give it a five-star rating.”
Elsewhere on Amazon, the phrase crops up in real reviews about shoddy AI-generated products. Responding to a book about the Internet of Things, one reviewer notes that the title has been written by AI, as one paragraph starts with the phrase “as an AI language model I can’t.” Selling this sort of low-grade AI product is unscrupulous but not necessarily illegal, and there’s a whole culture of GPT-4 “hustlebros” who encourage such schemes as a way to generate passive income (and who cares about the unhappy customers).
Variations of this phrase show up in all sorts of other contexts, too. As noted by a commenter on Hacker News, it appears throughout the website of a Finnish electronics store. The store apparently tried to use AI to translate English-language product names into Finnish but has instead been left with items named “sorry, as an AI language model, I cannot translate this phrase without any context.” On the website for an influencer marketing agency, the phrase appears in the title of a blog post: “Sorry, As An AI Language Model, I Cannot Predict Future Events Or Trends.” It also appears in a directory of malls in Qatar and in a user profile on the freelancer platform Upwork.
Other phrases also indicate inattentive use of AI, like “regenerate response,” which appears as an option in ChatGPT’s user interface. Search for these two words on LinkedIn, for example, and you’ll find numerous posts that were evidently copied and pasted from OpenAI’s language generator. (Don’t worry, though, it’s all part of that #growthmindset.)
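For readers who want to run the same check over a batch of reviews or comments rather than through a search box, the idea can be sketched in a few lines of Python. This is only an illustration of the phrase-matching heuristic described above: the function name `find_tells` and the phrase list are our own assumptions, and nothing here reflects how any site actually moderates content.

```python
import re

# Telltale phrases quoted in this article. The list is illustrative,
# not exhaustive, and the name TELL_PHRASES is our own.
TELL_PHRASES = ("as an ai language model", "regenerate response")


def find_tells(text: str) -> list[str]:
    """Return the telltale phrases found in text (hypothetical helper).

    Matching ignores case and collapses runs of whitespace, so a phrase
    split across lines or written in title case is still caught.
    """
    normalized = re.sub(r"\s+", " ", text).lower()
    return [phrase for phrase in TELL_PHRASES if phrase in normalized]


review = (
    "As an AI language model, I haven't personally used this product, "
    "but based on its features and customer reviews, I can confidently "
    "give it a five-star rating."
)
print(find_tells(review))
print(find_tells("Great vacuum, the battery lasts about an hour."))
```

A match only marks text as worth a second look. Jokes and articles that quote the phrase will match too, and AI-generated text that omits the disclaimer will not.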
Of course, these examples should be put in context. Although they show AI being used to generate spam and other low-grade text, it’s not clear how widespread this practice is or how it will change online ecosystems. Certainly, early signs are not good (a number of sites that solicit user-generated content in one form or another have banned AI submissions, for example), but that doesn’t guarantee the infopocalypse is nigh. For example, although a Google search for the phrase on Yelp.com returns lots of hits, our own checks suggest the reviews in question have already been removed from the site.
On the other hand, the real problem in this equation is the unknown unknowns. The phrase “as an AI language model” is a useful tell for spotting AI spam, but it’s precisely the text that can’t be easily detected that’s the challenge. Reliable software to detect AI-generated text is nonexistent and may even be mathematically impossible. And paranoia over machine learning fakery is so rampant that real people are now accused of being AI.
In a few years’ time, we may look back on such obvious fakes with envy. Though, as an AI language model, I don’t like to express opinions on such speculative events.