Researchers are building a database of human voices that they’ll use to develop AI-based tools that could eventually diagnose serious diseases; they’re targeting everything from Alzheimer’s to cancer. The National Institutes of Health-funded project, announced Tuesday, is an effort to turn the human voice into something that could be used as a biomarker for disease, like blood or temperature.
“What’s beautiful about voice data is that it’s probably one of the cheapest types of data that you can collect from people,” says Olivier Elemento, a professor at the Institute for Computational Biomedicine at Weill Cornell Medicine and one of the lead investigators on the project. “It’s a very accessible type of information you could access from any patient.”
Studies over the past few years have explored the potential for voice to help diagnose disease, but most have been small and siloed, says Yaël Bensoussan, an otolaryngologist at USF Health and the other lead investigator. There also aren’t any large databases of voice data, and it’s such a new area of study that researchers haven’t yet figured out best practices around how to collect voice information for research. “We’ll build standards for how to collect the data,” she says.
The project is funded through the Bridge2AI program at the NIH, which supports projects that build ethical, rigorous, and accessible datasets that can be used to develop AI tools. It’ll run over four years and could get up to $14 million in funding over that time period.
The research team will start by building an app that will collect voice data from participants with conditions like vocal fold paralysis, Alzheimer’s disease, Parkinson’s disease, depression, pneumonia, and autism. All the voice collections will be supervised by a clinician. “So for example, somebody that has Parkinson’s disease, their voice can be lower and the way they talk is slower,” Bensoussan says. Participants would be asked to say sounds, read sentences, and read full texts through the app.
The team would be able to easily add other diseases that come up over the course of the research. Elemento says he has a friend who is an oncologist and often treats patients who have metastases (cancer growths) in their brains. “He told me that he could actually tell from the changes in people’s voices that they were having metastases to the brain,” Elemento says. “I was blown away.”
Then, they’ll use the datasets to build AI models that could detect the various conditions. The research team is collaborating with the medical AI company Owkin to build and train the AI models in the project. Owkin’s framework lets the patient data stay housed at the healthcare center where it was collected; the AI model, rather than the data, travels between institutions. The model learns separately on each dataset, and then the results of each training run come back to a central location, where they’re combined. Then, the updated combined model is sent back out to each of the locations, and the process begins again.
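The train-locally, combine-centrally, redistribute loop described above is commonly called federated averaging. The following is a minimal sketch of that general idea, not Owkin’s actual framework, whose internals the article does not describe. It assumes a toy one-feature linear model and synthetic numbers standing in for each site’s data; the function names (`local_update`, `federated_average`, `run_rounds`, `make_site`) are hypothetical and chosen only for illustration.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Train a copy of the shared model on one site's private data.

    Only the resulting weights leave the site; the raw data does not.
    """
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y  # prediction error for this sample
            w -= lr * err * x      # gradient step on the slope
            b -= lr * err          # gradient step on the intercept
    return (w, b)


def federated_average(updates, sizes):
    """Combine per-site weights, weighting each site by its sample count."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)


def run_rounds(sites, rounds=50):
    """Repeat: send model out, train at each site, combine, send back."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in sites]
        sizes = [len(data) for data in sites]
        global_weights = federated_average(updates, sizes)
    return global_weights


def make_site(n, seed):
    """Synthetic stand-in for one institution's dataset (assumption: y = 2x + 1)."""
    rng = random.Random(seed)
    return [(x, 2.0 * x + 1.0) for x in (rng.uniform(-1, 1) for _ in range(n))]


# Three hypothetical institutions with different amounts of data.
sites = [make_site(20, 1), make_site(30, 2), make_site(10, 3)]
w, b = run_rounds(sites)
print(round(w, 3), round(b, 3))
```

In this sketch the central step only ever sees model weights, which mirrors the privacy property the article describes. A real system would train a far larger model on audio features and would likely add further safeguards that are not shown here.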
That lends an additional layer of privacy protection to the voice data, which is unique in that it can be easily tied back to the person it comes from. People’s voices are easily identifiable, even if their name is removed. A team of bioethicists is working on the project to study the ethical and legal implications of a voice database and of voice-based diagnostics. They’re going to be thinking through, for example, whether voice is protected by the Health Insurance Portability and Accountability Act (or HIPAA) and whether patients own their own vocal data, Bensoussan says.
Medical researchers aren’t the only groups interested in using voice to diagnose disease — big tech companies that make voice assistants are as well. Amazon has patents that would use Alexa to figure out if people have emotional problems, like depression, or physical problems like a sore throat. Theoretically, if the sounds in someone’s voice showed signs of something like Alzheimer’s, a passive in-home voice assistant could flag the condition. That’d raise another layer of ethical and legal problems, which experts are already starting to think through.
“Technologies like this are coming. And I think they’re coming faster than the law is equipped to address in a complete way,” David Simon, a research fellow studying digital home health at the Petrie-Flom Center for Health Law Policy, Biotechnology, and Bioethics at Harvard Law School, told Inverse in August.
For now, the new research program isn’t interested in building programs for home devices. It’s focused on developing tools that would be used by doctors in their offices and clinics. Those tools would be particularly helpful in lower-resourced settings where someone might not be able to see a specialist. “Someone could register the patient’s voice, and have the app say that there’s a really high chance that there’s laryngeal cancer, so you should send the patient to be seen by an expert,” Bensoussan says. “AI for screening is going to be really important.”