Every day, Facebook performs some 4.5 billion automatic translations — and as of yesterday, they’re all processed using neural networks. Previously, the social networking site used simpler phrase-based machine translation models, but it’s now switched to the more advanced method. “Creating seamless, highly accurate translation experiences for the 2 billion people who use Facebook is difficult,” explained the company in a blog post. “We need to account for context, slang, typos, abbreviations, and intent simultaneously.”
The big difference between the old system and the new one is the attention span. While the phrase-based system translated sentences word by word, or by looking at short phrases, the neural networks consider whole sentences at a time. They do this using a particular sort of machine learning component known as an LSTM or long short-term memory network.
The benefits are pretty clear. Compare these two examples from Facebook of a Turkish-to-English translation. The top one comes from the old phrase-based system, and the bottom one from the new system. You can see how taking into account the full context of the sentence produces a more accurate result.
“With the new system, we saw an average relative increase of 11 percent in BLEU — a widely used metric for judging the accuracy of machine translation — across all languages compared with the phrase-based systems,” the company said.
When a word in a sentence doesn’t have a direct corresponding translation in a target language, the neural system will generate a placeholder for the unknown word. A translation of that word is searched for in a sort of in-house dictionary built from Facebook’s training data, and the unknown word is replaced. That allows abbreviations like “tmrw” to be translated into their intended meaning — “tomorrow.”
“Neural networks open up many future development paths related to adding further context, such as a photo accompanying the text of a post, to create better translations,” the company said. “We are also starting to explore multilingual models that can translate many different language directions.”