Google Duplex really works and testing begins this summer


In a restaurant in Mountain View, California yesterday, Google gave several small groups of journalists a chance to demo Duplex. If you don’t recall, Duplex is the AI system designed to make human-sounding voice calls on your behalf so as to automate things like booking restaurant tables and hair appointments. In the demo, we saw what it would be like for a restaurant to receive a phone call — and in fact, each of us in turn took a call from Duplex as it tried to book a reservation.

The briefings were in service of the news that Google is about to begin limited testing “in the coming weeks.” If you’re hoping that means you’ll be able to try it yourself, sorry: Google is starting with “a set of trusted tester users,” according to Nick Fox, VP of product and design for the Google Assistant. It will also be limited to businesses that Google has partnered with rather than any old restaurant.

The rollout will be phased, in other words. First up will be calls about holiday hours, restaurant reservations will come later this summer, and haircut appointments will come last. Those are the only three domains that Google has trained Duplex on.

The demos we saw had many of the same elements that made the original demonstration at Google I/O so impressive: the voice sounded much more human than normal, complete with “umms” and “ahhs.” It also featured something we didn’t hear in May: each call started with an explicit statement that the call was being recorded.

There were a few variations on the disclosure, but they all included some indication that you were talking to a machine and the call was being recorded. For example, one call began with “Hi, I’m calling to make a reservation. I’m Google’s automated booking service, so I’ll record the call. Uh, can I book a table for Sunday the first?”

A few things to note about that call. The voice sounded just as natural as in the video above, not at all like a robot. There were several variations on the robot disclosure; Google seems to be testing to see which is most effective at making people feel comfortable sticking with the call. The other thing to know is that every variation I heard definitely said that it was recording, usually followed by a quick “umm” before jumping into the request for the reservation.

The more natural, human-sounding voice wasn’t there in the very first prototypes that Google built (amusingly, they worked by setting a literal handset on the speaker on a laptop). According to VP of engineering for the Google Assistant Scott Huffman, “It didn’t work ... we got a lot of hangups, we got a lot of incompletion of the task. People didn’t deal well with how unnatural it sounded.”

Part of making it sound natural enough to not trigger an aural sense of the uncanny valley was adding those ums and ahs, which Huffman identified as “speech disfluencies.” He emphasized that they weren’t there to trick anybody, but because those vocal tics “play a key part in progressing a conversation between humans.” He says it came from a well-known branch of linguistics called “pragmatics,” which encompasses all the non-word communications that happen in human speech: the ums, the ahs, the hand gestures, etc.

“Google has invented a lot of things,” Huffman said, “but we did not invent ums and aahs.”

If you take a Duplex call and want to take that initial “um” as an opportunity to say “yeah no, I don’t want to be recorded,” Duplex can recognize that and end the call with something like “‘Okay, I’ll call back on an unrecorded line’ and then we have an operator just call back,” Fox says.

There are a few states where Duplex won’t work — Fox says Google doesn’t yet have the permitting for Texas, for example — but it should start making calls in the vast majority of the US soon. Duplex only works in English for now, but Google has worked to ensure that it is able to understand lots of dialects and accents.

Fox emphasized that the behavior of Duplex arises out of Google’s recently published “core AI principles.” “We’re going to be very slow, very careful, and very thoughtful as we go here,” said Fox. That’s part of the reason why the initial testing will only be with businesses that Google has partnered with. Google will also allow businesses to opt out of being called by Duplex, likely through the “Google My Business” portal. Of course, if you’re the sort of business that doesn’t have online bookings, you’re probably the sort of business that has never identified itself to Google using its tools.

“We want to be very respectful of the businesses we’re working with,” Fox stressed. Google will ensure that businesses won’t receive too many calls from Duplex, for example from people who might use it to prank restaurants with fake reservations.

When you set up Duplex on your Google Assistant, you’ll give it a few pieces of information and some permissions. In one call, Duplex told the human on the phone it wasn’t authorized to share an email address but could share a phone number, for example.

As for haircuts, Fox noted that Google hadn’t worked out all the details of what Duplex would need to know, but he imagined a case where at the least it might be able to ask for something like your “usual” haircut.

Duplex conveyed politeness in the demos we saw. It paused with a little “mmhmm” when the called human asked it to wait, a pragmatic tactic Huffman called “conversational acknowledgement.” It showed that Duplex was still on the line and listening, but would wait for the human to continue speaking.

It handled a bunch of interruptions, out-of-order questions, and even weird discursive statements pretty well. When a human sounded confused or flustered, Duplex took a tone that was almost apologetic. It really seems to be designed to be a super considerate and non-confrontational customer on the phone.

But Duplex can’t handle everything, and so it will be paired with a bank of human operators who can take over a call if it goes sideways. Valerie Nygaard, product manager for Duplex, emphasized that “this is a system with a human fallback.” Those operators serve two purposes: they handle calls that Duplex can’t complete and they also mark up the call transcripts for Google’s AI algorithms to learn from.

None of the phone calls we listened to required human fallback, however. Huffman says that right now four out of five calls that Duplex makes can be handled without the human operator. That’s either a very low or very high percentage, depending on your attitude toward the technology.

If you’re skeptical of all this, I don’t blame you. Google got quite a bit of blowback after its Google I/O demo, both from people who were wondering about disclosure and from those who thought it might not have been a real call. Fox insists that it was, though it was edited to take out “personal information.”

Another reason to believe the demos we saw in that Mountain View restaurant were real? My own demo totally flopped. I played the role of a busy, crabby bartender who kept interrupting Duplex. The system handled the interruptions fine, but it got flummoxed when I told it that I could go ahead and make a reservation for seven, but the full kitchen closed at six that day so it would have to settle for bar food.

My call should have been handed off to a human operator at that point, but instead Duplex misunderstood my meaning about the kitchen closing. When I said there would only be bar food in a harried and snippy tone, it replied “Oh I see. Bye, thank you,” and hung up.

It was a very human thing to do.

Comments

Really cool tech. I’m glad Google added the disclosure, definitely the right thing to do to allow people to opt out of the call and being recorded if they don’t want to participate.

What are you talking about, opt out? If you’re a restaurant host making barely above minimum wage, and you decide you don’t wanna take someone’s robot reservation and cost your employer business, your ass is fired. There is no opt out, only using economic duress to force service workers to talk to robots.

In the same way that employers require their employees to talk to asshole customers that will be meaner and nastier than any AI will be (with a smile on their face).

I’ll definitely take those robo-reservations so when f*ck-wad-customer shows up at the wrong time, he can listen to his dumb ass recording!

That might be the case for some, but not for all. Some will absolutely choose to opt out and won’t be fired. Again, it’ll be met with a mixed response for sure.

It’s a business call. It’s not really a big deal at all and let’s not pretend that it is, because hint: it isn’t.

But but… employees’ privacy and shit!

I’m not sure I’d even desire to accept a call from anyone (human or computer) who starts the conversation by saying "this call will be recorded".

Secondly, if you’re too busy to call and make a reservation yourself, then you probably are way too dependent on technology today.

I come from the mindset that if Netflix binge-watching is getting in the way of eating, I’d say it’s time to delete a few smartphone apps; and secondly, you probably smell bad, so take a shower.

However, I could see value in an automated dialer for Craigslist Ads. If something comes available, it should dial the seller every 5 minutes until someone answers or responds. The partnership with Google Translate will make it into a great product for immigrants who can’t speak very much English but have a sale price on some ancillary item I don’t necessarily need.

Assistive technology for the deaf, dumb and hard of hearing can be helpful. For me, however, I actually don’t have much respect for the person who is too lazy to call and instead wants me to talk to their robot.

In all honesty, they probably don’t want your respect. They were calling (or having something call) for a reservation. People with the capability have their human assistants call all the time to do these mundane things. Do you still hold them in the same low regard?

Why does it matter if you take a robot’s reservation or a human’s? Hell, if I get a call and know it’s a robot, I know it’s going to be nice and not argue or do petty stuff.

I think it’s less the robotic nature of the call and more so the recording of the call. Some people may have a problem with Google recording their call and their voice. Reasonable people can argue whether or not they should have a problem with that, but undoubtedly people will.

If you call most large businesses, you will usually be informed that the call is being recorded for "training purposes." If you leave a voicemail message, your voice is obviously being recorded.

For the last few decades, people have become accustomed to having their voices recorded while participating in a telephone call, so this aspect of it is nothing new.

As long as they announce it, like they are in this iteration of Duplex, I’m fine with it because people have choice.

I’d hang up the phone out of respect for the person who has the ability (and time) to make a personal phone call rather than having the automated attendant or robot call.

First, most businesses record calls.

Second, who cares if they record an appointment? There’s no address, no billing info…people are getting way too paranoid over this. Your data is everywhere and even before the internet it was easy to get personal info.

I’m just telling you people will care; that is exactly why there was such a call for Google to announce that it is an assistant and that it is recording the call in the first place. You can think that’s paranoid if you want, but many would disagree.

Most people won’t care enough to regularly turn down business. Money has a way of allowing people to overcome all kinds of objections.

Meh. My phone automatically records all my calls, I tell no one. (Except you.)

I’m sure many people do the same. Generally, anywhere I call I expect to be recorded.

Recording a phone call without the other party being aware is illegal in many places.

Lol I’m pretty sure he knows that.

I do, I don’t care. How is someone going to make a fuss if they don’t know?

This is, to me, the same as my dash cam. People may not consent to being filmed either.

It’s only illegal if the other person files a complaint in court. To build a strong case, it’d be best to produce some sort of evidence that illegal recordings are happening.

Still, if you’re sued for this... lol

Really looking forward to how this works tech wise but also how our civilization reacts to these types of calls. I’m sure it will raise ethical questions as well as be abused by people.

I suspect many spam callers will simply copy the sound of the voice to gain the caller’s trust and use it to their own advantage.

It’s a dumb publicity stunt.

A much smarter approach would’ve been to create some incredibly easy-to-implement system for businesses to accept reservations and whatnot from customers using Android phones.

There’s no need for a computer system to interact the way humans do for the sake of being kinda like humans.

There’s no benefit to the voice aspect. If speech is super important, they could’ve had a basic robo-voice say "RESERVATION REQUESTED FOR" [recording of person saying their name] "AT" [recording of person saying times] "PRESS 1 TO ACCEPT, 2 TO SUGGEST ALTERNATE TIME, 3 TO DECLINE, 9 IF THIS IS A WRONG NUMBER".

It would’ve taken half as long to build (if not way less) and would be more effective than a voice assistant that’s undoubtedly going to fail in any strange situations.
