Telling others what you want is easy for most people, but it's a luxury that those with speech disorders don't have. The National Institute on Deafness and Other Communication Disorders estimates that 7.5 million people in the US have trouble using their voices, and now a program called VocaliD wants to help by having others donate their voices, which are blended into customized synthetic voices that those with impairments can use.
The process starts with VocaliD cofounder and speech scientist Rupal Patel and her team of specialists, who listen to the sounds a patient can make to gauge what that patient's voice could sound like. They then find a surrogate who is similar in age and sex to the patient, and that surrogate records thousands of sample sentences from books like White Fang or The Velveteen Rabbit, a vocabulary- and sound-building process similar to the one used to make Siri and other text-to-speech systems. After the recording sessions, VocaliD uses software to blend the surrogate's voice with the sounds the patient makes, creating speech units like vowel sounds. Once all the sounds are blended, the result is a synthetic voice with characteristics of both the patient and the surrogate.
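VocaliD hasn't published the details of its blending software, which works on learned speech units rather than raw audio. But the general idea of mixing two voices can be sketched in miniature: the hypothetical function below simply takes a weighted average of two audio sample sequences, with the weight and sample data made up for illustration.

```python
# Toy illustration of voice blending: mix a surrogate's audio samples
# with a patient's vocalization by weighted averaging. This is NOT
# VocaliD's actual method, which blends speech units (like vowel
# sounds), not raw waveform samples; the weight here is hypothetical.

def blend_samples(surrogate, patient, patient_weight=0.3):
    """Weighted average of two equal-length audio sample sequences."""
    if len(surrogate) != len(patient):
        raise ValueError("sample sequences must be the same length")
    w = patient_weight
    # Each output sample leans toward the surrogate's voice, with a
    # smaller contribution from the patient's own sounds.
    return [(1 - w) * s + w * p for s, p in zip(surrogate, patient)]

# Blend two short, made-up sample sequences (values in [-1.0, 1.0]).
surrogate = [0.0, 0.5, 1.0, 0.5]
patient = [0.0, 1.0, 0.0, 1.0]
print(blend_samples(surrogate, patient))
```

In a real system, the weighting would vary per speech unit so that the output keeps the surrogate's clarity while preserving the patient's vocal identity.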
Donors must read thousands of sample sentences for hours
VocaliD is still in its early stages, funded by grants from the National Science Foundation, and it's still refining the recording process, which takes a surrogate several hours spread over many days. The team eventually wants to cut the recording studio out of the mix by letting donors record their voices over time through an app, using either a smartphone or a computer with a microphone. There's also no word yet on how patients access these voices once they're made; presumably the customized voice is loaded onto their preferred communication device or app.
But it is a hopeful start for those with speech disorders. People with conditions like Parkinson's, cerebral palsy, or verbal apraxia typically rely on dedicated speech devices that offer a limited number of voices, which tend to lack personality and sound computerized. And while VocaliD's synthesized voices will likely never sound as fluid as speech from a person's mouth, they let patients communicate and hear what they might sound like if they could speak on their own.