Amazon confirms it holds on to Alexa data even if you delete audio files

‘The American people deserve to understand how their personal data is being used’

The Amazon logo against a yellow and black background
Illustration by Alex Castro / The Verge

Amazon has admitted that it doesn’t always delete the stored data that it obtains through voice interactions with the company’s Alexa and Echo devices, even after a user chooses to wipe the audio files from their account. The revelations, outlined explicitly by Amazon in a letter to Sen. Chris Coons (D-DE) that was dated June 28th and published today, shed even more light on the company’s privacy practices with regard to its digital voice assistant.

The answers follow up on a request Coons made last month, when he questioned how long the company holds on to voice recordings and transcripts from Echo interactions. In this week’s letter, Amazon confirmed some of the allegations. “We retain customers’ voice recordings and transcripts until the customer chooses to delete them,” the letter reads.

Amazon doesn’t always delete data gathered using Alexa, even when you tell it to

Following a CNET investigation published in May, there was also a question about whether Amazon held on to text transcripts of voice interactions with Alexa, even after a user has chosen to delete the audio equivalent. Amazon says some of those transcripts or information gleaned from the transcripts are indeed not removed, both because the company has to scrub the data from various parts of its global data storage systems and because, in some cases, Amazon chooses to hold on to the data without telling the user.

In the response, Brian Huseman, Amazon’s vice president of public policy, said the company is engaged in an “ongoing effort to ensure those transcripts do not remain in any of Alexa’s other storage systems.” In other words, even if a user manually deletes the audio version, some text versions are still saved in separate storage systems for some unknown amount of time. And in certain cases where Amazon deems that deleting data would hinder Alexa’s feature set, the company decides to hold on to some version of the data.

Amazon is claiming it doesn’t hold on to the audio files, but it may hold on to transaction information if someone uses Alexa to call an Uber or place a food delivery order, for instance. “We do not store the audio of Alexa’s response. However, we may still retain other records of customers’ Alexa interactions, including records of actions Alexa took in response to the customer’s request,” Huseman wrote.

The letter also points out that the company, and even developers of Alexa skills, can keep a record of every transaction or routinely scheduled activity a user makes with an Echo device. This, Amazon says, ensures that the task is easily repeatable and convenient for the user.

“And for other types of Alexa requests — for instance, setting a recurring alarm, asking Alexa to remind you of your anniversary, placing a meeting on your calendar, sending a message to a friend — customers would not want or expect deletion of the voice recording to delete the underlying data or prevent Alexa from performing the requested task,” Huseman explained.

Much attention has been paid in recent months to the inner workings of Alexa, following a Bloomberg report in April that outlined how thousands of employees, many of whom are contract workers and some not even directly employed by Amazon, have access to both voice and text transcripts of Alexa interactions that could, in theory, be used to piece together information about a user’s personal life. Amazon claims this data is reviewed and annotated by humans to help improve Alexa over time, using machine learning methods to train the underlying artificial intelligence software.

Amazon’s Alexa privacy practices are under renewed scrutiny

But the lack of clarity around how and to what end Amazon collects and stores this data, and why it’s confusing to get it completely scrubbed from the company’s servers, has brought renewed scrutiny to what Amazon claims are industry-standard practices for companies building AI-dependent tools and services.

The stakes are only getting higher, as Alexa now handles sensitive patient health information. Amazon has also come under fire from child and privacy advocacy groups that claim that the company is violating the Children’s Online Privacy Protection Act (COPPA) by collecting and storing data on children under the age of 13 with its Amazon Echo Dot Kids devices.

“Amazon’s response leaves open the possibility that transcripts of user voice interactions with Alexa are not deleted from all of Amazon’s servers, even after a user has deleted a recording of his or her voice,” Coons said in a statement. “The American people deserve to understand how their personal data is being used by tech companies, and I will continue to work with both consumers and companies to identify how to best protect Americans’ personal information.”