Google has announced an ambitious new project to develop a single AI language model that supports the world’s “1,000 most spoken languages.” As a first step towards this goal, the company is unveiling an AI model trained on over 400 languages, which it describes as “the largest language coverage seen in a speech model today.”
Language and AI have arguably always been at the heart of Google’s products, but recent advances in machine learning — particularly the development of powerful, multi-functional “large language models” or LLMs — have placed new emphasis on these domains.
Google has already begun integrating these language models into products like Google Search, while fending off criticism about the systems’ functionality. Language models have a number of flaws, including a tendency to regurgitate harmful societal biases like racism and xenophobia, and an inability to parse language with human sensitivity. Google itself infamously fired its own researchers after they published papers outlining these problems.
These models are capable of many tasks, though, from language generation (like OpenAI’s GPT-3) to translation (see Meta’s No Language Left Behind work). Google’s “1,000 Languages Initiative” is not focusing on any particular functionality, but instead on creating a single system with huge breadth of knowledge across the world’s languages.
Speaking to The Verge, Zoubin Ghahramani, vice president of research at Google AI, said the company believes that creating a model of this size will make it easier to bring various AI functionalities to languages that are poorly represented in online spaces and AI training datasets (also known as “low-resource languages”).
“By having a single model that is exposed to and trained on many different languages, we get much better performance on our low-resource languages,” says Ghahramani. “The way we get to 1,000 languages is not by building 1,000 different models. Languages are like organisms, they’ve evolved from one another and they have certain similarities. And we can find some pretty spectacular advances in what we call zero-shot learning when we incorporate data from a new language into our 1,000-language model and get the ability to translate [what it’s learned] from a high-resource language to a low-resource language.”
Past research has shown the effectiveness of this approach, and the scale of Google’s planned model could offer substantial gains over past work. Such large-scale projects have become typical of tech companies’ ambition to dominate AI research, and draw on these firms’ unique advantages in terms of access to vast amounts of computing power and training data. A comparable project is Facebook parent company Meta’s ongoing attempt to build a “universal speech translator.”
Access to data is a problem when training across so many languages, though. Google says that to support work on the 1,000-language model, it will fund the collection of data for low-resource languages, including audio recordings and written texts.
The company says it has no specific plans for where to apply this model’s functionality, only that it expects a range of uses across Google’s products, from Google Translate to YouTube captions and more.
“One of the really interesting things about large language models and language research in general is that they can do lots and lots of different tasks,” says Ghahramani. “The same language model can turn commands for a robot into code; it can solve maths problems; it can do translation. The really interesting things about language models is they’re becoming repositories of a lot of knowledge, and by probing them in different ways you can get to different bits of useful functionality.”
Google announced the 1,000-language model at a showcase for new AI products. The company also shared new research on text-to-video models, a prototype AI writing assistant named Wordcraft, and an update to its AI Test Kitchen app, which gives users limited access to under-development AI models like its text-to-image model Imagen.