Could this be any more disturbing? A dozen years after they went off the air, the cast of Friends has been virtually reunited for a project that turned one of them into a video chatbot. As spotted by Prosthetic Knowledge, researchers at the University of Leeds used machine learning to create automated video avatars that speak in the voices of their characters. The result is a system that uses the original performances recorded by the actors to generate brief new scenes, starting with Matt LeBlanc’s immortal Joey Tribbiani.
The system was demonstrated this weekend at a European Conference on Computer Vision workshop. Researchers James Charles, Derek Magee, and David Hogg offered a proof of concept for what they call "a generative computational model of a person’s motion, appearance, speech, language and their style of interaction and behavior." After deconstructing all 236 episodes of Friends using software, they created language models that were able to build new sentences and speech for Joey. They then matched his new speech with corresponding mouth positions, which they digitally pasted over the mouth position from LeBlanc’s original performance.
The implications are fun to consider
The result is rough around the edges, but its implications are fun to consider. The authors imagine that in the future, companies like Apple and Amazon will use technology like this to create video representations of their voice assistants, Siri and Alexa. The technology could also be used to create interactive avatars of the living — or the dead. "This model [can] generate brand-new and interactive content, effectively rendering the person virtually immortal," the authors write.
A video produced along with the paper shows Joey speaking a pair of sentences generated by the language model. The model attempted to simulate things Joey would actually say, and came up with "I like pizza with cheese" and "Dude, I don’t care." That may fall short of the crackling dialogue of the series’ best episodes, but it still points to a future where we’ll all be able to create complete, fan-scripted Friends episodes using reassembled fragments of the original performances. At least until the entire thing is shut down over copyright concerns.
This month I wrote about Eugenia Kuyda, who rebuilt her best friend, Roman Mazurenko, as a chatbot after he died tragically. This spring, Kuyda and I discussed whether a video avatar of Mazurenko might some day be possible, but the technology at the time fell far short of what would be needed to convincingly re-create speech and motion.
Hogg, a professor of artificial intelligence at Leeds, said in an interview that the next step would be to build an interactive bot from the parts the researchers have assembled. A user could ask Joey (or another video avatar) a question, and the models developed by the researchers would attempt to generate a response.
"Joey knows all kinds of stuff"
But the models won’t be truly compelling until they have a better understanding of their world, whether fictional or real, Hogg said. Researchers can roughly simulate short phrases that Joey might use, but they can't yet simulate his character. "The major problem with it is it’s all surface-level modeling," he said. "It needs a much deeper model to do anything really interesting." Better models might know that Ross and Monica are siblings, for example, or that Phoebe is the songwriter of "Smelly Cat." "Joey knows all kinds of stuff, and that stuff influences what he says on the show," Hogg said.
Hogg said the researchers plan to submit their work for publication in a journal. In the meantime, we’re still a long way from receiving video postcards from departed loved ones on our birthdays. But the Leeds researchers’ proof of concept shows how quickly the state of the art is advancing.