I woke up this morning, ate my breakfast, and spoke some Spanish. I can’t even speak Spanish. I didn’t emerge from a dream or experience a medical miracle; I used Skype Translator. It’s a new app designed to translate your speech in real time, making it possible to hold a conversation in a language you don’t understand.
I spoke with Maria Romero Garcia, a Spanish professor who has helped Microsoft test early versions of its Skype Translator software. We discussed everything from the weather to holiday plans and Garcia’s work as a translator. When the call ended 20 minutes later, I felt bemused that I’d actually managed to hold a conversation with a Spanish speaker with the aid of a computer. The app converted Garcia’s spoken words into robotic English audio and text, although we could still hear each other speaking our native languages. The original speech and the translation play at the same time, generating a conversation that can at times feel a little unusual and unnatural, but it works once you understand the flow and the slight delays. It wasn’t without issues, though.
Translations during the beta aren't 100 percent perfect yet
I found that the speech recognition didn’t pick up the vast majority of my own speech very well. Garcia’s translations were almost always accurate, so I could easily understand her, but Skype Translator struggled to pick up on my fine English accent. That might be because I have a thick London accent. It’s not quite Cockney, but I have a habit of pronouncing three as "free." During the call, "how does this work?" turned into "how does this button?" and "so what are you doing for Christmas" was translated as "and so you want to for Christmas." Garcia was undisturbed by some of the awkward translations, and the conversation continued to flow. Garcia also understands English, which may have helped keep the conversation moving.
My accent often confuses those outside of London, and especially Cortana, Siri, and a lot of other speech assistants. I was rather hoping Skype Translator might be the one computer that could understand me, my one true love, if you will. There was no spark, so it didn’t really work out.
Despite my issues, this technology has a magical quality to it and huge potential. You can imagine schools using this for educational purposes, families communicating in different tongues, or even businesses striking important deals without having to translate things word by word. Microsoft and many other companies have been trying to crack speech recognition for years, with varying success. Bill Gates constantly discussed the power of speech technology during the ‘90s, promising transformative advancements in Windows that never really arrived. What Microsoft is doing here is piecing together a number of separate technologies and trying to build a simple, working translation engine.
There's an impressive amount of technology behind Skype Translator
There’s voice recognition, machine translation, speech synthesis, cloud processing, and machine learning all happening in near real time. From a technology point of view, it’s impressive, and the translation will improve over time as Skype learns more about your accent and the words you regularly use. Microsoft is currently limiting Skype Translator to an invite-only beta for Spanish and English speakers, to help the company analyze and improve its translation techniques before a broader rollout. Other languages will be added for speech translation in the future, and the app currently supports more than 40 languages for text-based translation during chat conversations.
Science fiction hasn’t come to life just yet with Skype Translator, but the future of this technology is exciting. It reminds me of the work Microsoft is doing to improve navigation for visually impaired people. It’s yet another example of the type of challenge you’d expect the new Microsoft to be working on: transforming the complexities of technology into useful solutions for everyday needs.