Adobe is working on an audio app that lets you add words someone never said

Photo by Justin Sullivan/Getty Images

Adobe is working on a new piece of software that would act like a Photoshop for audio, according to Adobe developer Zeyu Jin, who spoke at the Adobe MAX conference in San Diego, California today. The software is codenamed Project VoCo, and it’s not clear at this time when it will materialize as a commercial product. The standout feature, however, is the ability to add words not originally found in the audio file.

An Adobe representative confirmed the project’s existence to The Verge, clarifying that it was shown off today as part of a sneak-peek program at the MAX conference. The project is currently in development as part of a collaboration between members of Adobe Research and Princeton University. News of Project VoCo was first reported by the art and design website Creative Bloq earlier today.

Like Photoshop for images, Project VoCo is designed to be a state-of-the-art editing application for audio. Beyond standard speech editing and noise-cancellation features, Project VoCo can also apparently generate new words using a speaker’s recorded voice. Essentially, the software can understand the makeup of a person’s voice and replicate it, so long as there’s about 20 minutes of recorded speech. In Jin’s demo, the developer showcased how Project VoCo let him add a word to a sentence in a near-perfect replication of the speaker’s voice, according to Creative Bloq.


"When recording voiceovers, dialog, and narration, people would often like to change or insert a word or a few words due to either a mistake they made or simply because they would like to change part of the narrative," reads an official Adobe statement. "We have developed a technology called Project VoCo in which you can simply type in the word or words that you would like to change or insert into the voiceover. The algorithm does the rest and makes it sound like the original speaker said those words."

Similar to how Photoshop ushered in a new era of editing and image creation, this tool could transform how audio engineers work with sound, polish clips, and clean up recordings and podcasts. Of course, there are all sorts of ethical implications involved when we have the ability to falsify entire sentences using a person’s voice. But just as Photoshop taught the general public to be wary of suspect images, Project VoCo might do the same with regard to doctored audio clips.

- Via: Creative Bloq


In the dystopian future, are witnesses more reliable than video and audio recordings?

That’s what I was thinking. I mean, this is going to be ripe for abuse.

A future where utterly everything is flexible.
The "truth" cannot be determined.

Do we need hard timestamping security on all media?
Refuse to believe anything that wasn’t encrypted and start and end time-stamped at the very moment it was recorded?

Exclusive recording, "I hate America" says Hillary Clinton, caught on tape! Coming soon to a Facebook feed near you!

Also coming soon, people at work trying to convince you this news is true and you should vote for Trump because they saw it on Facebook, shared by millions of people!

Definitely fake – I can hear the pixels

So now you can tell compelling lies visually AND sonically!

Holy crap.

I would have so much fun with this application

Ethan Hunt: I’ve had this for twenty years!

Also at MAX: Only getting a shitty hoodie as a free gift after years of giving away phones, computers, and cameras. Price was still over $1k, and no heads up about the "gift" before we got here either.

You pay $1k to attend?

Impressive, but why call it Photoshop? Audio has nothing to do with Photo.

A headline written for industry outsiders I suppose.

"just as Photoshop taught the general public to be wary of suspect images"

Really? Because every week on Facebook I see lots of people taking everything they see at face value, unless it goes against their preconceived view of the world.

I hope people become more sceptical of what they hear, but I wouldn’t bet on it.

Interestingly there is a technique that’s used to detect editing of audio, in which you look for the incredibly subtle artifacts caused by interference from the electrical grid. It will be interesting to see if this tool can work around that, given enough material to work with.

Any and every excuse for these techies to invent something just to make more money.

"Your scientists were so preoccupied with whether or not they could that they didn’t stop to think if they should."

- Dr. Ian Malcolm in Jurassic Park, played by Jeff Goldblum

someone at Adobe: "wouldn’t it be really funny if we made facts not matter any more?"

What could possibly go wrong?

And this is good in what way?

This is great. A time-saving machine to edit interviews.
