Google Duplex really works and testing begins this summer


This is what it’s like to take a restaurant reservation from an AI


In a restaurant in Mountain View, California yesterday, Google gave several small groups of journalists a chance to demo Duplex. If you don’t recall, Duplex is the AI system designed to make human-sounding voice calls on your behalf so as to automate things like booking restaurant tables and hair appointments. In the demo, we saw what it would be like for a restaurant to receive a phone call — and in fact, each of us in turn took a call from Duplex as it tried to book a reservation.

The briefings were in service of the news that Google is about to begin limited testing “in the coming weeks.” If you’re hoping that means you’ll be able to try it yourself, sorry: Google is starting with “a set of trusted tester users,” according to Nick Fox, VP of product and design for the Google Assistant. It will also be limited to businesses that Google has partnered with rather than any old restaurant.

The rollout will be phased, in other words. First up will be calls about holiday hours, then restaurant reservations will come later this summer, and finally haircut appointments. Those are the only three domains that Google has trained Duplex on.

The demos we saw had many of the same elements that made the original demonstration at Google I/O so impressive: the voice sounded much more human than normal, complete with “umms” and “ahhs.” It also featured something we didn’t hear in May: each call started with an explicit statement that the call was being recorded.

There were a few variations on the disclosure, but they all included some indication that you were talking to a machine and the call was being recorded. For example, one call began with “Hi, I’m calling to make a reservation. I’m Google’s automated booking service, so I’ll record the call. Uh, can I book a table for Sunday the first?”

“I’m Google’s automated booking service, so I’ll record the call.”

A few things to note about that call. The voice sounded just as natural as in the video above, not at all like a robot. There were several variations on the robot disclosure; Google seems to be testing to see which is most effective at making people feel comfortable sticking with the call. The other thing to know is that every variation I heard definitely said that it was recording, usually followed by a quick “umm” before jumping into the reservation request.

The more natural, human-sounding voice wasn’t there in the very first prototypes that Google built (amusingly, they worked by setting a literal handset on the speaker on a laptop). According to VP of engineering for the Google Assistant Scott Huffman, “It didn’t work ... we got a lot of hangups, we got a lot of incompletion of the task. People didn’t deal well with how unnatural it sounded.”

Part of making it sound natural enough to not trigger an aural sense of the uncanny valley was adding those ums and ahs, which Huffman identified as “speech disfluencies.” He emphasized that they weren’t there to trick anybody, but because those vocal tics “play a key part in progressing a conversation between humans.” He says it came from a well-known branch of linguistics called “pragmatics,” which encompasses all the non-word communications that happen in human speech: the ums, the ahs, the hand gestures, etc.

“Google has invented a lot of things,” Huffman said, “but we did not invent ums and aahs.”

If you take a Duplex call and want to take that initial “um” as an opportunity to say “yeah no, I don’t want to be recorded,” Duplex can recognize that and end the call with something like “‘Okay, I’ll call back on an unrecorded line’ and then we have an operator just call back,” Fox says.

There are a few states where Duplex won’t work — Fox says Google doesn’t yet have the permitting for Texas, for example — but it should start making calls in the vast majority of the US soon. Duplex only works in English for now, but Google has worked to ensure that it is able to understand lots of dialects and accents.

“We’re going to be very slow, very careful, and very thoughtful.”

Fox emphasized that the behavior of Duplex arises out of Google’s recently published “core AI principles.” “We’re going to be very slow, very careful, and very thoughtful as we go here,” said Fox. That’s part of the reason why the initial testing will only be with businesses that Google has partnered with. Google will also allow businesses to opt out of being called by Duplex, likely through the “Google My Business” portal. Of course, if you’re the sort of business that doesn’t have online bookings, you’re probably the sort of business that has never identified itself to Google using its tools.

“We want to be very respectful of the businesses we’re working with,” Fox stressed. Google will ensure that businesses won’t receive too many calls from Duplex, for example from people who might use it to prank restaurants with fake reservations.

When you set up Duplex on your Google Assistant, you’ll give it a few pieces of information and some permissions. In one call, Duplex told the human on the phone it wasn’t authorized to share an email address but could share a phone number, for example.

As for haircuts, Fox noted that Google hadn’t worked out all the details of what Duplex would need to know, but he imagined a case where at the least it might be able to ask for something like your “usual” haircut.

Duplex conveyed politeness in the demos we saw. It paused with a little “mmhmm” when the called human asked it to wait, a pragmatic tactic Huffman called “conversational acknowledgement.” It showed that Duplex was still on the line and listening, but would wait for the human to continue speaking.

Duplex’s tone was unfailingly polite and at times apologetic

It handled a bunch of interruptions, out-of-order questions, and even weird discursive statements pretty well. When a human sounded confused or flustered, Duplex took a tone that was almost apologetic. It really seems to be designed to be a super considerate and non-confrontational customer on the phone.

But Duplex can’t handle everything, and so it will be paired with a bank of human operators who can take over a call if it goes sideways. Valerie Nygaard, product manager for Duplex, emphasized that “this is a system with a human fallback.” Those operators serve two purposes: they handle calls that Duplex can’t complete and they also mark up the call transcripts for Google’s AI algorithms to learn from.

None of the phone calls we listened to required human fallback, however. Huffman says that right now four out of five calls that Duplex makes can be handled without the human operator. That’s either a very low or very high percentage, depending on your attitude toward the technology.

If you’re skeptical of all this, I don’t blame you. Google got quite a bit of blowback after its Google I/O demo, both from people who were wondering about disclosure and from those who thought it might not have been a real call. Fox insists that it was, though it was edited to take out “personal information.”

I duped Duplex

Another reason to believe the demos we saw in that Mountain View restaurant were real? My own demo totally flopped. I played the role of a busy, crabby bartender who kept interrupting Duplex. The system handled the interruptions fine, but it got flummoxed when I told it that I could go ahead and make a reservation for seven, but the full kitchen closed at six that day so it would have to settle for bar food.

My call should have been handed off to a human operator at that point, but instead Duplex misunderstood my meaning about the kitchen closing. When I said there would only be bar food in a harried and snippy tone, it replied “Oh I see. Bye, thank you,” and hung up.

It was a very human thing to do.
