AT&T's research arm has spent over two decades developing its Watson speech and language engine, which translates spoken words into text. Now, AT&T is planning to release a number of Watson APIs for developers in June, in an effort to accelerate development and innovation in the voice recognition space. Instead of having to develop their own speech recognition software, developers will now be able to plug AT&T's Watson APIs into their apps to more easily include voice recognition features.
AT&T's first APIs will cover seven areas: web search, local business search, Q&A, voicemail-to-text, SMS, AT&T's U-verse video programming guide, and general-purpose dictation. AT&T has found that speech recognition works best when focused on specific categories, so these categories help Watson anticipate the types of words to expect. Unsurprisingly, AT&T's informational video (included below) focused on the example of building a Watson-enabled U-verse programming guide, so you could tell it what channel, movie, or actor you were looking for. While these seven categories will be part of the initial release, it sounds like AT&T plans to add more over time.
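To make the category idea concrete, here's a minimal sketch of how a developer might scope a recognition request to one of those seven areas. The function name, header, and "context" values are our own illustrative assumptions, not AT&T's actual API surface:

```python
# Hypothetical sketch: scoping a speech-recognition request by category.
# The category names and request shape are assumptions for illustration only.

def build_recognition_request(audio_format, context):
    """Build the headers/params a category-scoped request might carry."""
    # Mirrors the seven areas described above (names are our own shorthand).
    allowed = {"WebSearch", "BusinessSearch", "QA", "VoicemailToText",
               "SMS", "TVGuide", "Dictation"}
    if context not in allowed:
        raise ValueError(f"unknown context: {context}")
    return {
        "headers": {"Content-Type": audio_format},
        "params": {"context": context},
    }

# A U-verse-style programming-guide query would hint the TV category:
req = build_recognition_request("audio/wav", "TVGuide")
```

The point of the `context` hint is exactly what the paragraph above describes: telling the recognizer which vocabulary (channel names, actors, movie titles) it should expect.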
AT&T will also release an SDK it's calling the Speech Kit, which will let developers build software that captures spoken words and sends them over the network for transcription. There are few details on where exactly the captured audio is sent, but we expect to hear more when the SDK ships. As a way of showing off Watson in action before the API release, AT&T recently launched the AT&T Translator app for Android and iOS. It purports to translate your speech into another language of your choice, but the few reviews in iTunes suggest there are still some bugs to be worked out.
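The capture-and-send flow the Speech Kit describes can be sketched in a few lines. Since AT&T hasn't published the endpoint or wire format, this is a hedged mock-up: the transport is injected as a function so the round trip can be shown without a real service, and a production client would presumably POST over HTTPS instead:

```python
# Hypothetical sketch of the Speech Kit flow: capture audio locally, ship it
# to a network transcription service, and get text back. All names here are
# illustrative assumptions, not AT&T's real SDK.

def transcribe(audio_bytes, send):
    """Send captured audio to a transcription backend via `send` and
    return the transcript string from its JSON-like response."""
    response = send({"audio": audio_bytes, "format": "audio/wav"})
    if response.get("status") != "ok":
        raise RuntimeError("transcription failed")
    return response["transcript"]

# Stand-in for the network hop; swap in a real HTTPS call when the SDK lands.
def fake_send(request):
    assert request["audio"]  # non-empty capture
    return {"status": "ok", "transcript": "tune to channel five"}

text = transcribe(b"\x00\x01fake-pcm-audio", fake_send)
```

Injecting the transport also makes the client testable offline, which matters for an SDK whose backend details are still unannounced.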
Historically, AT&T has used Watson internally for interactive voice response in the automated customer care systems we've grown to know (and possibly hate) over the years, as well as voicemail-to-text, voice search, and many other applications that turn human speech into something a computer can act on. While that usage has its place, we're happy to see Watson's technology get into the hands of developers who will hopefully apply it to situations beyond yelling at voice-automated menus for an operator.
Update: AT&T had plenty of Watson-enabled experiments to show us at an event in New York today, the flashiest of which was a QNX-equipped Porsche 911 that used the carrier's cloud-based service to handle voice commands. With its top down amid the din of New York City, the convertible unfortunately had trouble picking up commands accurately and reliably, but be sure to check out our impressions of the car (which appears to be unchanged from CES 2012) here.