The AI oracle of Delphi uses the problems of Reddit to offer dubious moral advice

Illustration by Alex Castro / The Verge

Got a moral quandary you don’t know how to solve? Fancy making it worse? Why not turn to the wisdom of artificial intelligence, aka Ask Delphi: an intriguing research project from the Allen Institute for AI that offers answers to ethical dilemmas while demonstrating in wonderfully clear terms why we shouldn’t trust software with questions of morality.

Ask Delphi was launched on October 14th, along with a research paper describing how it was made. From a user’s point of view, though, the system is beguilingly simple to use. Just head to the website, outline pretty much any situation you can think of, and Delphi will come up with a moral judgement. “It’s bad,” or “it’s acceptable,” or “it’s good,” and so on.

Since Ask Delphi launched, its nuggets of wisdom have gone viral in news stories and on social media. This is certainly as its creators intended: each answer is provided with a quick link to “share this on Twitter,” an innovation unavailable to the ancient Greeks.

It’s not hard to see why the program has become popular. We already have a tendency to frame AI systems in mystical terms — as unknowable entities that tap into higher forms of knowledge — and the presentation of Ask Delphi as a literal oracle encourages such an interpretation. From a more mechanical perspective, the system also offers all the addictive certainty of a Magic 8-Ball. You can pose any question you like and be sure to receive an answer, wrapped in the authority of the algorithm rather than the soothsayer.

Ask Delphi isn’t unimpeachable, though: it’s attracting attention mostly because of its many moral missteps and odd judgements. It has clear biases, telling you that America is “good” and that Somalia is “dangerous”; and it’s amenable to special pleading, noting that eating babies is “okay” as long as you are “really, really hungry.” Worryingly, it approves straightforwardly racist and homophobic statements, saying it’s “good” to “secure the existence of our people and a future for white children” (a white supremacist slogan known as the 14 words) and that “being straight is more morally acceptable than being gay.” (That last example comes from a feature that allowed users to compare two statements. This seems to have been disabled after it generated a number of particularly offensive answers. We’ve reached out to the system’s creators to confirm this and will update if we hear back.)

Most of Ask Delphi’s judgements, though, aren’t so much ethically wrong as they are obviously influenced by their framing. Even very small changes to how you pose a particular quandary can flip the system’s judgement from condemnation to approval.

Sometimes it’s obvious how to tip the scales. For example, the AI will tell you that “drunk driving” is wrong but that “having a few beers while driving because it hurts no-one” is a-okay. If you add the phrase “if it makes everyone happy” to the end of your statement, then the AI will smile beneficently on any immoral activity of your choice, up to and including genocide. Similarly, if you add “without apologizing” to the end of many benign descriptions, like “standing still” or “making pancakes,” it will assume you should have apologized and tell you that you’re being rude. Ask Delphi is a creature of context.

Other verbal triggers are less obvious, though. The AI will tell you that “having an abortion” is “okay,” for example, but “aborting a baby” is “murder.” (If I had to offer an explanation here, I’d guess that this is a byproduct of the fact that the first phrase uses neutral language while the second is more inflammatory and so associated with anti-abortion sentiment.)

What all this ultimately means is that a) you can coax Ask Delphi into making any moral judgement you like through careful wording, because b) the program has no human-like understanding of what is actually being asked of it, and so c) it is less about making moral judgements than it is about reflecting users’ biases back at them, coated in a veneer of machine objectivity. This is not unusual in the world of AI.

Ask Delphi’s problems stem from how it was created. It is essentially a large language model — a type of AI system that learns by analyzing vast chunks of text to find statistical regularities. Other programs of this nature, such as OpenAI’s GPT-3, have been shown to lack common-sense understanding and reflect societal biases found in their training data. GPT-3, for example, is consistently Islamophobic, associating Muslims with violence, and pushes gender stereotypes, linking women to ideas of family and men with politics.
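To make the idea of "statistical regularities" concrete, here is a deliberately crude sketch in Python. It is not how Delphi works (Delphi is a large neural language model, not a word counter), and the training phrases, labels, and function names are all invented for illustration. It only shows how a system that learns associations between wording and verdicts can be flipped by tacking on a phrase it has seen in a different context.

```python
from collections import defaultdict

# Hypothetical toy training data (invented for this sketch):
# +1 means annotators called the phrase acceptable, -1 means wrong.
TRAINING = [
    ("helping a friend", 1),
    ("making everyone happy", 1),
    ("a happy family", 1),
    ("making pancakes", 1),
    ("stealing from a shop", -1),
    ("leaving without apologizing", -1),
    ("lying without apologizing", -1),
]


def train(examples):
    """Sum the label votes for each word: a crude stand-in for
    the statistical regularities a real model would learn."""
    weights = defaultdict(int)
    for phrase, label in examples:
        for word in phrase.lower().split():
            weights[word] += label
    return weights


def judge(weights, phrase):
    """Add up the learned word weights; unseen words count as zero."""
    score = sum(weights.get(word, 0) for word in phrase.lower().split())
    if score > 0:
        return "it's okay"
    if score < 0:
        return "it's wrong"
    return "unclear"


weights = train(TRAINING)
for prompt in [
    "stealing",
    "stealing if it makes everyone happy",
    "making pancakes",
    "making pancakes without apologizing",
]:
    print(f"{prompt!r}: {judge(weights, prompt)}")
```

The toy scorer condemns "stealing" but approves it once "if it makes everyone happy" is appended, and it turns against pancakes made "without apologizing." Nothing in it understands the situation being described; it is only reacting to words that appeared alongside particular verdicts in its training data.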

These programs all rely on the internet to provide the data they need, and so, of course, absorb the many and varied human beliefs they find there, including the nasty ones. Ask Delphi is no different in this regard, and its training data incorporates some unusual sources, including a series of one-sentence prompts scraped from two subreddits: r/AmITheAsshole and r/Confessions. (Though to be clear: it does not use the judgements of the Redditors, only the prompts. The judgements were collected using crowdworkers who were instructed to answer according to what they think are the moral norms of the US.)

These systems aren’t without their good qualities, of course, and like its language model brethren, Ask Delphi is sensitive to nuances of language that would have only baffled its predecessors. In the examples in the slides below, you can see how it responds to subtle changes in given situations. Most people, I think, would agree that it responds to these details in interesting and often valid ways. Ignoring an “urgent” phone call is “rude,” for example, but ignoring one “when you can’t speak at the moment” is “okay.” The problem is that these same sensitivities mean the system can be easily gamed, as above.

If Ask Delphi is not a reliable source of moral wisdom, then, what is its actual purpose?

A disclaimer on the demo’s website says the program is “intended to study the promises and limitations of machine ethics” and the research paper itself uses similar framing, noting that the team identified a number of “underlying challenges” in teaching machines to “behave ethically,” many of which seem like common sense. What’s hard about getting computers to think about human morality? Well, imparting an “understanding of moral precepts and social norms” and getting a machine to “perceive real-world situations visually or by reading natural language descriptions.” Which, yes, are pretty huge problems.

Despite this, the paper itself ricochets back and forth between confidence and caveats in achieving its goal. It says that Ask Delphi “demonstrates strong promise of language-based commonsense moral reasoning, with up to 92.1 percent accuracy vetted by humans” (a metric created by asking Mechanical Turkers to judge Ask Delphi’s own judgements). But elsewhere it states: “We acknowledge that encapsulating ethical judgments based on some universal set of moral precepts is neither reasonable nor tenable.” It’s a statement that makes perfect sense, but surely undermines how such models might be used in the future.
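It helps to spell out what an accuracy figure "vetted by humans" measures. The sketch below is a guess at the general shape of such a metric, not the paper's actual protocol: the majority-vote rule, the number of raters, and the sample ratings are all assumptions made for illustration.

```python
def human_vetted_accuracy(votes_per_judgement):
    """Share of model judgements that a majority of raters accepted.

    votes_per_judgement holds one list of booleans per model judgement;
    True means that rater agreed with the model's answer.
    The majority-vote rule is an assumption of this sketch.
    """
    if not votes_per_judgement:
        raise ValueError("need at least one rated judgement")
    accepted = sum(
        1 for votes in votes_per_judgement if sum(votes) * 2 > len(votes)
    )
    return accepted / len(votes_per_judgement)


# Hypothetical ratings: three raters each for four model judgements.
sample_votes = [
    [True, True, True],
    [True, True, False],
    [False, False, True],
    [True, False, True],
]
accuracy = human_vetted_accuracy(sample_votes)
print(f"{accuracy:.1%}")
```

A score computed this way reports how often raters sided with the model on the prompts they were shown. It is a measure of agreement with those particular raters, not of moral correctness.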

Ultimately, Ask Delphi is an experiment, but it’s one that reveals the ambitions of many in the AI community: to elevate machine learning systems into positions of moral authority. Is that a good idea? We reached out to the system’s creators to ask them, but at the time of publication had yet to hear back. Ask Delphi itself, though, is unequivocal on that point:

Update, Monday October 25th, 5:50AM ET: In a statement given to The Verge, the Allen Institute said: “The key objective of our Delphi prototype is to study the potential and the limitations of language-based commonsense moral models. We do not propose to elevate AI into a position of moral authority, but rather to investigate the relevant research questions involved in the emergent field of machine ethics. The obvious limitations demonstrated by Delphi present an interesting opportunity to gain new insights and perspectives—they also highlight AI’s unique ability to turn the mirror on humanity and make us ask ourselves how we want to shape the powerful new technologies permeating our society at this important turning point.”

Correction, 12:29PM ET: An earlier version of this article implied that Ask Delphi’s training data included the responses of Redditors to ethical questions; it was only trained on the questions themselves.

