Social media conglomerate Meta has created a single AI model capable of translating across 200 different languages, including many not supported by current commercial tools. The company is open-sourcing the project in the hopes that others will build on its work.
The AI model is part of an ambitious R&D project by Meta to create a so-called “universal speech translator,” which the company sees as important for growth across its many platforms — from Facebook and Instagram, to developing domains like VR and AR. Machine translation not only allows Meta to better understand its users (and so improve the advertising systems that generate 97 percent of its revenue) but could also be the foundation of a killer app for future projects like its augmented reality glasses.
Experts in machine translation told The Verge that Meta’s latest research was ambitious and thorough, but noted that the quality of some of the model’s translations would likely be well below that of better-supported languages like Italian or German.
“The major contribution here is data,” Professor Alexander Fraser, an expert in computational linguistics at LMU Munich in Germany, told The Verge. “What is significant is 100 new languages [that can be translated by Meta’s model].”
Meta’s achievements stem, somewhat paradoxically, from both the scope and focus of its research. While most machine translation models handle only a handful of languages, Meta’s model is all-encompassing: it’s a single system able to translate in more than 40,000 different directions between 200 different languages. But Meta is also interested in including “low-resource languages” in the model, meaning languages with fewer than 1 million publicly available translated sentence pairs. These include many African and Indian languages not usually supported by commercial machine translation tools.
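For a sense of where a figure like 40,000 comes from: a single model covering n languages has n × (n − 1) ordered source-to-target pairs. The short Python sketch below is illustrative only. The function name is hypothetical, and the exact language count behind Meta’s figure is not given in this article. Exactly 200 languages yields 39,800 directions, so a total above 40,000 would suggest the count is rounded or covers slightly more than 200 languages (202 is used below purely as an assumed example).

```python
def translation_directions(num_languages: int) -> int:
    """Count ordered (source, target) pairs where source != target."""
    return num_languages * (num_languages - 1)


# 200 is the headline figure; 202 is an assumed value for comparison only.
for n in (200, 202):
    print(n, "languages ->", translation_directions(n), "directions")
```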
Meta AI research scientist Angela Fan, who worked on the project, told The Verge that the team was inspired by the lack of attention paid to such lower-resource languages in this field. “Translation doesn’t even work for the languages we speak, so that’s why we started this project,” said Fan. “We have this inclusion motivation of like — ‘what would it take to produce translation technology that works for everybody’?”
Fan says the model, described in a research paper, is already being tested to support a project that helps Wikipedia editors translate articles into other languages. The techniques developed in creating the model will also be integrated into Meta’s translation tools soon.
How do you judge a translation?
Translation is a difficult task at the best of times, and machine translation can be notoriously flaky. When applied at scale on Meta’s platforms, even a small number of errors can produce disastrous results — as, for example, when Facebook mistranslated a post by a Palestinian man from “good morning” to “hurt them,” leading to his arrest by Israeli police.
To evaluate the quality of the new model’s output, Meta created a test dataset consisting of 3001 sentence-pairs for each language covered by the model, each translated from English into a target language by someone who is both a professional translator and native speaker.
The researchers ran these sentences through their model, and compared the machine’s translation with the human reference sentences using a benchmark common in machine translation known as BLEU (which stands for BiLingual Evaluation Understudy).
BLEU allows researchers to assign numerical scores measuring the overlap between pairs of sentences, and Meta says its model produces an improvement of 44 percent in BLEU scores across supported languages (compared to previous state-of-the-art work). However, as is often the case in AI research, judging progress based on benchmarks requires context.
Although BLEU scores allow researchers to compare the relative progress of different machine translation models, they do not offer an absolute measure of software’s ability to produce human-quality translations.
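To make the idea of “overlap” concrete, here is a minimal Python sketch of a simplified sentence-level BLEU score. It is not Meta’s evaluation pipeline: it assumes plain whitespace tokenization, applies no smoothing, and uses made-up example sentences. The function names `ngrams` and `bleu` are hypothetical. It counts how many of the candidate’s word sequences (n-grams) also appear in a reference, then penalizes candidates that are too short.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Simplified sentence-level BLEU on whitespace tokens, in [0, 1]."""
    cand = candidate.split()
    refs = [r.split() for r in references]
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        total = sum(cand_counts.values())
        # Clip each n-gram count by the most times it occurs in any one reference.
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        matched = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # no smoothing: one empty n-gram order zeroes the score
        log_precisions.append(math.log(matched / total))

    # Brevity penalty: compare against the reference closest in length.
    ref_len = min((abs(len(r) - len(cand)), len(r)) for r in refs)[1]
    bp = 1.0 if len(cand) >= ref_len else math.exp(1 - ref_len / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "the cat is sitting on the mat"
paraphrase = "a cat sits on the rug"  # a reasonable translation, different wording

print(bleu(reference, [reference]))               # identical to the reference
print(bleu(paraphrase, [reference]))              # shares no 3-word sequence
print(bleu(paraphrase, [reference], max_n=2))     # partial credit on 1- and 2-grams
print(bleu(paraphrase, [reference, paraphrase]))  # a second reference changes the result
```

In this toy case the paraphrase scores far below the exact match against a single reference, yet scores perfectly once a second acceptable reference is added. That sensitivity to the choice and number of references is one reason BLEU works better for comparing systems on the same test set than as an absolute measure of quality.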
Remember: Meta’s dataset consists of 3001 sentences, and each has been translated only by a single individual. This provides a baseline for judging translation quality, but the total expressive power of an entire language cannot be captured by such a small sliver of actual language. This problem is in no way limited to Meta — it’s something that affects all machine translation work, and is particularly acute when assessing low-resource languages — but it shows the scope of the challenges facing the field.
Christian Federmann, a principal research manager who works on machine translation at Microsoft, said the project as a whole was “commendable” in its desire to expand the scope of machine translation software to lesser-covered languages, but noted that BLEU scores by themselves can only provide a limited measure of output quality.
“Translation is a creative, generative process which may result in many different translations which are all equally good (or bad),” Federmann told The Verge. “It is impossible to provide general levels of ‘BLEU score goodness’ as they are dependent on the test set used, its reference quality, but also inherent properties of the language pair under investigation.”
Fan said the BLEU scores had been complemented with human evaluation, and that this feedback was very positive and produced some surprising reactions.
“One really interesting phenomenon is that people who speak low-resource languages often have a lower bar for translation quality because they don’t have any other tool,” said Fan, who is herself a speaker of a low-resource language, Shanghainese. “They’re super generous, and so we actually have to go back and say ‘hey, no, you need to be very precise, and if you see an error, call it out.’”
The power imbalances of corporate AI
Working on AI translation is often presented as an unambiguous good, but creating this software comes with particular difficulties for speakers of low-resource languages. For some communities, the attention of Big Tech is simply unwelcome: they don’t want the tools needed to preserve their language in anyone’s hands but their own. For others, the issues are less existential, but more concerned with questions of quality and influence.
Meta’s engineers explored some of these questions by conducting interviews with 44 speakers of low-resource languages. These interviewees raised a number of positive and negative effects of opening up their languages to machine translation.
One positive, for example, is that such tools allow speakers to access more media and information. They can be used to translate rich resources, like English-language Wikipedia and educational texts. At the same time, though, if low-resource language speakers consume more media generated by speakers of better-supported languages, this could diminish the incentives to create such materials in their own language.
Balancing these issues is challenging, and the problems encountered even within this recent project show why. Meta’s researchers note, for example, that the majority of the 44 low-resource language speakers they interviewed were “immigrants living in the US and Europe, and about a third of them identify as tech workers,” meaning their perspectives are likely different from those of their home communities and biased from the start.
Professor Fraser of LMU Munich said that despite this, the research was certainly conducted “in a way that is becoming more of involving native speakers” and that such efforts were “laudable.”
“Overall, I’m glad that Meta has been doing this. More of this from companies like Google, Meta, and Microsoft, all of whom have substantial work in low resource machine translation, is great for the world,” said Fraser. “And of course some of the thinking behind why and how to do this is coming out of academia as well, as well as the training of most of the listed researchers.”
Fan said Meta attempted to preempt many of these social challenges by broadening the expertise they consulted on the project. “I think when AI is developing it’s often very engineering — like, ‘Okay, where are my computer science PhDs? Let’s get together and build it just because we can.’ But actually, for this, we worked with linguists, sociologists, and ethicists,” she said. “And I think this kind of interdisciplinary approach focuses on the human problem. Like, who wants this technology to be built? How do they want it to be built? How are they going to use it?”
Just as important, says Fan, is the decision to open-source as many elements of the project as possible — from the model to the evaluation dataset and training code — which should help redress the power imbalance inherent in a corporation working on such an initiative. Meta also offers grants to researchers who want to contribute to such translation projects but are unable to finance their own projects.
“I think that’s really, really important, because it’s not like one company will be able to holistically solve the problem of machine translation,” said Fan. “It’s everyone — globally — and so we’re really interested in supporting these types of community efforts.”