I'm sitting across from young Michelle Moller, whose mother I just found brutally murdered. As the lead detective on the case, I've been given the unenviable role of harbinger of bad news. "I'm sorry to have to tell you this, Michelle, but your mother is dead." She hunches forward a few inches and throws her head into two upraised hands, audibly sobbing. Still, something's very wrong here, and I don't mean the case at hand.
Her fingers don't quite settle upon her face. There's no reddening of her cheeks — no change in color at all, in fact. Most of all, there are no tears. I ask her if she'd be okay with answering a few questions for the investigation. The sounds of sadness fade away as quickly as they arrived. She raises her head. "I could try," she says, with no forced effort to maintain composure — her face as bright as it was before I walked through that door.
What I'd like to be doing here is setting up a Voight-Kampff machine and asking about a tortoise in the desert (protip: whatever she says, choose "doubt"). But she isn't my primary suspect, and this isn't Blade Runner. This is the world of Team Bondi's action/thriller game L.A. Noire. The developer prides itself on its motion capture technology and interrogation game mechanic — players are supposed to be able to tell who's lying in interrogations based on how the actors actually performed. The technology "allows us to bring a sense of humanity to the game that has yet to be achieved up until now," according to Depth Analysis' Oliver Bao. But in striving for this visual realism, as many games are, developers seem to be missing the point. There's a degree of empathic response, of humanity itself, being lost in translation — something very uncanny about the digital interpretation of life — and engineering alone won't fill that void.
Note: spoilers ahead for a variety of titles, tread cautiously!
Defining the uncanny valley, starring Tom Hanks
"The subject of the 'uncanny' is... undoubtedly related to what is frightening — to what arouses dread and horror; equally certainly, too, the word is not always used in a clearly definable sense, so that it tends to coincide with what excites fear in general."
— Sigmund Freud, "The Uncanny"
A quick history lesson on the term uncanny. Ernst Jentsch is credited with it via his 1906 essay "On the Psychology of the Uncanny," and Sigmund Freud later expounded upon it in the aptly-titled 1919 essay "The Uncanny." (Both cite an 1816 German short story by E.T.A. Hoffmann, "Der Sandmann," in which the protagonist Nathanael falls in love with a humanoid automaton named Olympia.) In both pieces, "uncanny" refers to something that is "creepy" or "not quite right" in appearance. Playing off that, the phrase "uncanny valley" comes from roboticist Masahiro Mori in 1970. Much of his hypothesis is best summed up in the accompanying chart:
According to Mori, as a robot becomes more humanoid (both in appearance and in motion), people will initially show more empathy towards it. Eventually, though, it hits an "edge" of realism where one's reaction very sharply turns to discomfort and aversion. The dip is less pronounced when motion is taken out of the equation, but it's still very much an issue. This is the uncanny valley, the lowest point, the wax mannequin of Samuel L. Jackson that comes to life in your nightmare and chases you around. The only way to get out of the valley would be to create something so similar to a human being that you couldn't tell the difference — a visual Turing Test, if you will. Here's another take, care of Judah Friedlander from an episode of 30 Rock: "We like R2-D2 (industrial robot) and C-3PO (humanoid robot), and up here (healthy person) we have a real person like Han Solo, but down (in the valley) we have a CGI Stormtrooper or Tom Hanks from The Polar Express."
Unlike film, video games don't have the option to use live-action characters and performances to tell a story. In-game characters have to be built largely from scratch. Rather than reinventing the wheel, many developers use motion capture and facial mapping as a way of bridging that gap between the real and the digital. That can expedite the process of creating something "human like," but it isn't quite human. The uncanny debate in gaming is far from new, but as graphical processing and motion capture technology continues to improve, character design pushes ever closer against this so-called valley.
(A disclaimer: the uncanny valley is more philosophical than it is quantifiable. There is no scientific method for testing whether or not something is "in the uncanny valley," and for many people, this might not be an issue at all. Certainly a number of people found The Polar Express not at all unsettling — I just don't know any of them.)
MotionScan: creating the world of L.A. Noire
"At the time, both of them were working on avatars. He was working on bodies, she was working on faces. She was the face department, because nobody thought that faces were all that important—they were just flesh-toned busts on top of the avatars. She was just in the process of proving them all desperately wrong."
—Neal Stephenson, Snow Crash
The room is all white, brightly lit to eliminate any trace of shadows. Aaron Staton sits down, his head the dead center of that room. Thirty-two cameras, grouped in pairs, surround him. Each pair is assigned a different portion of his face to record, working in tandem to provide a 3D scan as he goes through his lines with as much vim and vigor as his virtualized high-strung, do-gooder detective Cole Phelps requires. This face-centric performance will be coupled with a separate motion-capture session that records all the body movement.
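Why cameras in pairs? MotionScan's actual pipeline is proprietary, but the principle behind any paired-camera rig is stereo triangulation: a facial feature appears shifted horizontally between the two views, and that shift (the disparity) is inversely proportional to the feature's distance. A minimal sketch of the pinhole stereo model, with made-up numbers:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Pinhole stereo model: Z = f * B / d.

    focal_px     -- camera focal length, in pixels
    baseline_m   -- distance between the paired cameras, in meters
    disparity_px -- horizontal shift of the feature between the views
    """
    if disparity_px <= 0:
        raise ValueError("feature must appear shifted between the views")
    return focal_px * baseline_m / disparity_px

# Hypothetical rig: 1000px focal length, cameras 10cm apart, a feature
# shifted 50px between views sits 2 meters from the pair.
distance = depth_from_disparity(1000, 0.1, 50)
```

Run this over every matched feature on the face, sixteen pairs at a time, and you get the dense 3D scan that each pair contributes for its assigned portion of the actor's face.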
This is MotionScan, the technology developed by Team Bondi offshoot Depth Analysis, and the end result is the in-game population of Los Angeles, 1947. The character model of Phelps in particular, a replicant of Staton, stands out with a range of emotions — mostly variations of anger, which give Team Bondi a chance to demonstrate how much raw data is being captured. When Phelps surveys a crime scene and converses with Dr. Malcolm Carruthers, the two pointing to various clues strewn about a victim's body, it's the merging of these two disparate recording sessions (face and body). It's impressive, but as I said in the introduction, it's also notably imperfect.
Let's return to the first example, the interrogation of Michelle Moller. Here's the full scene:
Something is off, but should I attribute it to the actress or the technology? To both? Coincidentally, the actress — 18-year-old Abigail Mavity — has been questioned by authorities before, when she appeared in an episode of the procedural cop drama NCIS (video here; embedding unfortunately disabled). The structure of the interrogation is amusingly similar to L.A. Noire's. In the TV clip, we see Mavity's character lie about sending an email, which prompts Special Agent Leroy Gibbs (Mark Harmon) to present evidence to the contrary: her smartphone. She then tries to lie about her whereabouts last Thursday night. Gibbs doubts her answer, which is presented as a stern stare (L.A. Noire's Phelps can also "choose doubt," although his method involves a lot of yelling). That's all it takes for Mavity's character to open up — and when she does, Mavity's live-action face becomes very animated and telling.
That range of expression frankly isn't in the game; comparatively, the game character Michelle Moller is much more stilted. Whenever I show that L.A. Noire clip to others (or recount the scene with friends who have already played through the game), there's a general consensus that something feels very unnatural from a technical standpoint. I know Michelle is upset because of the emotion in her voice and the context of the situation (i.e. the death of her mother), but her physical reactions don't match up appropriately. Many times throughout the game, these abnormalities seem to work against having any sentimental resonance with the characters, up to and including Phelps himself. It's clear that Team Bondi wants the player to make an emotional investment — why make a narrative-driven game for any other reason? MotionScan is an impressive engineering achievement, but there's an over-reliance on the technology alone to render sympathetic eyes.
Where Matthew Broderick and Half-Life 2 cross paths
In the 1997 movie Addicted to Love (not recommended), Matthew Broderick and Meg Ryan become obsessed with (and inevitably spy on) their ex-lovers, who at that point have become something of an item. Broderick's character, painted as the obsessive-compulsive type, keeps track of their actions and even goes so far as to create a "smile list," analyzing how variations in a smile carry different meanings. Obviously an important detail in the film (spoiler: it's how Meg Ryan realizes she and Broderick were meant for each other, as if being the two best-known actors in the film wasn't enough of a hint), it's also, amusingly enough, a pretty smart read on how people interpret facial reactions.
In 1978, Paul Ekman and Wallace V. Friesen set out to deconstruct facial expressions into combinations of Action Units (AUs) — various muscles contracting and relaxing. An insincere smile, for example, would be just the contraction of the zygomatic major muscle (which pulls up the corners of the mouth), whereas a sincere and involuntary smile also contracts part of the orbicularis oculi (the muscle encircling the eye). This Facial Action Coding System (FACS), as it's known, has been incorporated into Valve's Source engine and showcased via the character Alyx from Half-Life 2. Check out her reaction to an exploding gunship — it's exaggerated, yes, but it's fully emotive and a convincing complement to the voice work.
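The FACS insight — that an expression is a set of Action Units, not a single pose — is easy to sketch in code. The AU numbers and muscle names below are genuine FACS coding (AU12 is the lip corner puller, AU6 the cheek raiser); the classification logic itself is a simplified illustration, not how Source or any real system implements it.

```python
# Two of Ekman & Friesen's Action Units, with the muscles behind them.
FACS_MUSCLES = {
    6:  "orbicularis oculi, pars orbitalis (cheek raiser)",
    12: "zygomatic major (lip corner puller)",
}

def classify_smile(active_aus):
    """Classify a smile from the set of active Action Unit numbers.

    A felt ("Duchenne") smile recruits the eyes (AU6) as well as the
    mouth (AU12); a social smile is the mouth alone.
    """
    if 12 not in active_aus:
        return "no smile"
    return "sincere smile" if 6 in active_aus else "social smile"
```

So `classify_smile({12})` is the polite, mouth-only grin, while `classify_smile({6, 12})` is the real thing — exactly the distinction Broderick's "smile list" was cataloguing, and the kind of composable building block an animation system can drive.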
Smile like you mean it: Heavy Rain's casting call
If you've read any articles about the uncanny valley in the last few years, particularly in the world of video games, there's a good chance they were inspired by Heavy Rain. In 2006, developer Quantic Dream unveiled a tech demo to show off its own motion capture technology. Heavy Rain: The Casting starred Aurélie Bancilhon as a woman auditioning for a role, her face going through a wide, wide range of emotion. The eyes, eyebrows, and mouth show a range of malleability — hey, look, tears! — but still, not every piece was there. I'd argue the biggest problem was the glossy texture of her skin and mouth, which Quantic Dream probably considered secondary to highlighting the actual motion capture. In any case, the video elicited an extremely uneasy response from the gaming community.
A lot has changed since then. When the game launched in early 2010, character models were more detailed, edges softer. The lighting helped, too. It's not perfect, but with a game that lives and dies by its story, the characters had to sell the narrative — or at least not get in the way. Given the critical praise, I'd argue they did. As a point of comparison, Bancilhon was cast in the final game as Lauren Winters. Jump to the two-minute mark in this video:
The hollow men: a trip into Bethesda's world
It's rare that a game criticized for uncanny characters sees much of a dip in review scores or sales as a direct result. In gaming, mechanics are still king, and talking about the uncanny is more of an academic exercise. Metacritic shows an aggregate review score of 89 (out of 100) for L.A. Noire; the game also topped North American sales charts in its first month. And then there's Bethesda Softworks, developer of enormous and sprawling role-playing titles such as Fallout 3 and the Elder Scrolls series (Morrowind, Oblivion). Every single one of these games has been an award-winning bestseller — and every single one of them has been populated with bizarre, uncanny non-player characters (NPCs). In both appearance and in action, something always seems a little off: clever dialog, eccentric voice work, but again, emotions like anger and surprise barely register an eyebrow lift. (At times, too, many of the characters seem to have the same, or eerily similar, faces in a way that isn't meant to imply family lineage.) Check out, for example, this video clip from Fallout 3, in which you (playing a character lacking any Intelligence) chat with several scientists:
None of those interactions feels at all natural — but on the other hand, maybe that doesn't matter. Again, game mechanics are king. Bethesda games always seem to be more a reflection of you, from the very beginning when you're given the ability to design every minute detail of your avatar. They're more than "sandboxes," they're playgrounds: vast, open worlds that let you roam around as you see fit and play as you choose. In some ways this uncanny aversion, this disconnect from the virtual population, might actually be an unintended boon for the title — or at least it doesn't hurt.
There is little external incentive to follow the main quest other than your own personal decision to do so. Relatively unimportant characters can give you daunting tasks, and you can decide to help. Any item can be picked up, any person can be killed, and at times there are consequences for your actions. (This isn't as clear cut, perhaps by technical limitation — people seem to be way too forgiving about murdering entire villages, for example.) In a sense, what Bethesda does is give you a large thematic sandbox and lets you run wild.
In a game where strong narrative isn't the primary motivator, empathic characters are much less of a priority. As dry as these characters are, they don't influence my actions one way or another on an emotional level — it's my playground, it's all about what I want to do. The stories I take from the post-apocalyptic Fallout 3 are guided only by my wandering. For example, exploring northeast of Megaton, I stumbled upon a run-down Super-Duper Mart. There was a group of raiders in the parking lot in a shootout with god-knows-what. I snuck in — poorly, I might add — and alerted the patrolling raiders both inside and out. What followed was a pretty long skirmish wherein I did my best to take cover between aisles and behind pharmacy counters, finding the occasional extra ammo clip and taking out the gun-toting pillagers one by one. Scavenging for medicine after the firefight, I decided for a laugh to play with the intercom; by the time I walked out, another group of raiders had entered to investigate the noise. Whoops...
Left 4 Dead elicited similar experiences — given a setting, the narrative ultimately came from how you survived and escaped a zombie onslaught. Unlike Left 4 Dead, however, Bethesda titles are solo experiences — every decision is yours to make, every story is yours to tell. These characters don't need to elicit an emotional response, because that might change how I would play it. The enjoyment of a Bethesda title is directly proportional to what you're willing to put into it.
And that brings us to Skyrim. Bethesda gave a 20-minute presentation of the next Elder Scrolls title earlier this month at E3, and though the NPC interaction was kept to a minimum, it felt like the same formula — characters being "not quite right," a story built around your choices and your encounters — and killing giant dragons, of course. What I'm impressed with is just how detailed the NPCs appear in the screenshots. To be fair, the Fallout 3 characters looked great in static snapshots, too, but this is why Mori hypothesized a second, more extreme dip for such characters in motion. Call me cautiously optimistic, and know that I'll probably sink dozens of hours into the game regardless.
Learning from Pixar
To get an emotional response from the audience, sometimes all it takes is a more liberal interpretation of a human. I love this quote from Clive Thompson's 2004 Slate piece "The Undead Zone," speaking with cartoonist Scott McCloud:
"Charlie Brown doesn't trigger our obsession with the missing details the way a not-quite-photorealistic character does, so we project ourselves onto him more easily. That's part of the genius behind modernist artists such as Picasso or Matisse. They realized that the best way to capture the essence of a person or object was with a single, broad-stroked detail."
In trying to become film, games may be chasing the wrong model; developers seeking "realism" should perhaps learn more from animation than from live-action. Take Pixar's 1988 short "Tin Toy," for example — better yet, just look at this picture. This isn't some Benjamin Button child, its face wrinkled and brow furrowed from inverse aging. It was a technical achievement in its day, a test of RenderMan software, a short that Pixar wanted to use to sell a half-hour TV special. Instead, it convinced Disney to push Pixar into making a feature-length film: Toy Story. Notice the humans in that film — they're never modeled with nearly as many wrinkles. Or really any wrinkles, for that matter.
From the first film onwards, Pixar has taken a more liberal interpretation of the human face, keeping key features intact and at times even more exaggerated for the display of emotion. But still, each character is unmistakably human (with the possible exception of WALL-E's Captain B. McCrea, but then again, years of isolation can do that to a body).
Speaking to The New York Times back in 2004, around the time The Incredibles was launching, Pixar co-founder and President Ed Catmull summed it up best: "Anyone who thinks that emulating reality is the Holy Grail is not a great animator. Because the goal isn't to emulate humans. The goal is to create works of art and to tell stories." The company's obviously doing something right: Pixar movies are reliable box office hits with great critical reception.
Wrap-up: I'm Cole Phelps, Blade Runner
On roboticist Dario Floreano's "Talking Robots" podcast, David Hanson — famous for making robotic heads of Philip K. Dick and Albert Einstein — argues that achieving realism is done through "very high-end artistry with groundbreaking engineering." He notes that film itself started as a technically-focused medium in the 1920s but didn't become meaningful until artists took note. "If [robots are] designed poorly," Hanson says, "it doesn't matter what level of realism is achieved. You can make something that looks like it could be a real human, it's perfectly realistic then, it could be very ugly or scary looking, or just flat mean and that can be disturbing."
If realism isn't the catalyst for an emotional response, then what creative liberties should be taken in game character design? I'd like to propose that developers think of motion scanning as more of a canvas than a final product, taking creative license where it makes sense. If a CG unicycle can tug on my heartstrings, certainly a human character can be stylized even just a little bit in the interest of player empathy. Even the best motion capture technology at this point isn't going to perfectly recreate a human performance — but with the right blend of technical prowess and artistic styling, I believe games can mold their own unique (and interactive) stories with an even greater sense of humanity.
So let's return once more to L.A. Noire, to Michelle Moller's interview. But this time, mostly in jest, I'd like to think of L.A. Noire as a testing ground for early Nexus series replicants. Cole Phelps, a precursor to Rick Deckard (and also a character of questionable humanity, depending on how you view it), sketches a picture of Michelle Moller into his pad with phenomenal speed. I say to her, "Some of your mother's jewelry was missing. Can you describe her things?" Michelle fidgets, convulses even, just a little. "A ring, a watch... I never paid much attention to that stuff," she responds. If only Phelps could see what I see with my eyes. The degree of her empathic response, for one. Her askew, subdued facial contortion, for another. But he doesn't notice, or he doesn't seem to care. I'm drawing a grand illusion over a simple game mechanic. No matter what her face does or does not say, no matter how unnerving I find the character model to be, she fidgets. And I know that means to doubt her. It's not intuition, nor any empathy for her well-being, that steers me that way: there's an obvious game mechanic poking through Moller's painstakingly crafted digital skin. At the end of the day, L.A. Noire is just a game.