Interest in the phenomenon of “deepfakes” has died down a little in recent months, presumably as the public comes to terms with what seems like an inevitability in 2018 — that people can and will use AI to create super-realistic fake videos and images. But a recent news story by BuzzFeed surfaced the term again in an unexpected setting, inviting the question: what is a deepfake anyway?
The article in question was titled “A Belgian Political Party Is Circulating A Trump Deepfake Video.” From the headline you might expect that this was a high-tech political propaganda campaign; someone using AI to put words in Trump’s mouth and mislead voters. In other words, exactly the sort of scenario experts are deeply worried about with deepfakes. But if you watch the actual video, it’s clear this isn’t the case. The clip is an obvious parody, with an exaggerated vocal impersonation and unrealistic computer effects. (The creators said it was made using Adobe After Effects — so, not AI.) At one point “Trump” even says: “We all know climate change is fake, just like this video.”
What’s the danger of misusing the term?
So should we call this a deepfake? Experts The Verge spoke to were pretty confident saying “no,” but the question raises a number of interesting issues: not only our difficulty in defining deepfakes, but the problems that could arise if the term is applied vaguely in the future. Could “deepfake” become the next “fake news,” for example: a phrase that once described a distinct phenomenon (people publishing fabricated news stories on social media for profit), but that has now been co-opted to discredit legitimate reporting?
But let’s start with a quick definition of what a “deepfake” is. The term originally came from a Reddit user called “deepfakes,” who, in December 2017, used off-the-shelf AI tools to paste celebrities’ faces onto pornographic video clips. The username was simply a portmanteau of “deep learning” (the particular flavor of AI used for the task) and “fakes,” but it would be hard to ask a branding department to come up with something catchier.
Although the term was originally only applied to pornographic fakes, it was quickly adopted as shorthand for a broad range of video and imagery edited using machine learning. Pornographic deepfakes are what brought this sub-discipline of AI to mainstream attention, but researchers have been working on this sort of audiovisual manipulation for a long time. Methods that now fall under the deepfake umbrella include face swaps (like the above), audio deepfakes (copying someone’s voice), deepfake puppetry or facial re-enactment (mapping a target’s face to an actor’s and manipulating it that way), and deepfake lip-synching (creating video of someone speaking from audio and footage of their face).
But what makes a deepfake in the first place? Well, experts stress that the term is a vague one, and still in flux, as the technology develops and becomes more widely recognized. But one baseline characteristic is that some part of the editing process is automated using AI techniques, usually deep learning. This is significant, not only because it reflects the fact that deepfakes are new, but because it means they’re also easy to make. A big part of the danger of the technology is that, unlike older photo and video editing techniques, it will be more widely accessible to people without great technical skill.
Deepfakes involve AI, automation, and (potentially) deception
Miles Brundage, a policy expert who co-authored a recent report on the malicious uses of AI, said the term “deepfake” does not have distinct boundaries, but generally refers to a “subset of fake video that leverages deep learning [...] to make the faking process easier.” Giorgio Patrini, an AI researcher at the University of Amsterdam who’s written on the subject of digital fakes, offered a similar definition, saying a deepfake should include “some automated, learned component.” Aviv Ovadya, chief technologist at the Center for Social Media Responsibility at the University of Michigan School of Information, agreed that we certainly need a term to describe “audio or video fabrication or manipulation that would have been extremely difficult and expensive without AI advances,” and that deepfake does the job pretty well.
If we agree with these definitions, it means video and images edited with existing software like Adobe Photoshop and After Effects aren’t deepfakes. But, as Patrini pointed out, this isn’t a reliable rule. For a start, programs like these already automate at least some part of the editing process, and they’ll soon be offering AI-powered features as well. (Adobe, for example, has shown off a number of AI editing tools currently in development.)
The experts also added that intent wasn’t part of the definition: whether or not someone is trying to deceive you has no bearing on whether something is a deepfake. However, this doesn’t seem like the whole picture, and perhaps the potential to deceive is part of the equation. For example, Snapchat uses AI techniques to apply filters to people’s faces and we don’t call those deepfakes. Ditto Apple’s animoji, which you could call “cartoon deepfake puppetry” if you were feeling obtuse.
Looking at these counterexamples, it seems that when we talk about “deepfakes” we are talking about content that has the potential to deceive someone, and perhaps meaningfully affect their lives. This could be by swaying their political opinions, or by being used in a courtroom as false evidence. Or, in the case of pornographic deepfakes, it affects the people targeted, while the people making them want to believe they’re real for personal gratification.
This would give you a definition of deepfakes that sits in the middle of a Venn diagram composed of three circles labeled “AI,” “automated,” and “potentially deceptive.” Even then, though, you can come up with edge cases that don’t fit.
“I’m more worried about what this does to authentic content.”
And if that’s the case, why quibble about it at all? Well, if we can’t agree on what a deepfake is and is not, it makes the subject difficult to talk about. And all the experts in this area say an informed citizenry is vital in combatting any future harms from this tech. There’s also the danger that if we deploy the term “deepfake” too casually and too loosely, it will become omnipresent: a cultural force that looms larger than the technology’s actual impact. That means people who want to deceive us can co-opt it, using the term (and people’s vague familiarity with it) to cast doubt on evidence they don’t like the look of. This is arguably what happened with “fake news.”
Speaking to The Verge, Hany Farid, an expert in digital forensics at Dartmouth College, stressed that this was perhaps the greatest near-term danger. “I’m more worried about what this does to authentic content,” said Farid. “Think about Donald Trump. If that audio recording of him saying he grabbed a woman was released today, he would have plausible deniability. He could say ‘someone could have synthesized this’ and what’s more, he would have a fair point.”
Having a widely agreed upon definition of what a deepfake is will not protect against this sort of scenario, of course. But, if the most dire predictions are to be believed, and if we are heading toward a world where any audiovisual content can be faked, leading to distrust in the media, courts, and other public institutions, then a clear definition would at least help public discussion of these issues. If we can’t even speak the same language, we’ll lose trust in one another even more quickly.
Update Monday June 11th: Updated to include information about how the Belgian “deepfake” was made.