Google has announced PaLM 2: its latest AI language model and a rival to systems like OpenAI’s GPT-4.
“PaLM 2 models are stronger in logic and reasoning, thanks to broad training in logic and reasoning,” said Google CEO Sundar Pichai onstage at the company’s I/O conference. “It’s also trained on multilingual text spanning over 100 languages.”
PaLM 2 is much better at a range of text-based tasks, including reasoning, coding, and translation, Google senior research director Slav Petrov told journalists in a roundtable ahead of the model’s announcement. “It is significantly improved compared to PaLM 1 [which was announced in April 2022],” said Petrov.
As an example of its multilingual capabilities, Petrov showed how PaLM 2 is able to understand idioms in different languages, giving the example of the German phrase “Ich verstehe nur Bahnhof,” which literally translates to “I only understand train station” but is better understood as “I don’t understand what you’re saying” or, as an English idiom, “it’s all Greek to me.”
In a research paper describing PaLM 2’s capabilities, Google’s engineers claimed the system’s language proficiency is “sufficient to teach that language” and noted this is in part due to a greater prevalence of non-English texts in its training data.
Like other large language models, which take huge amounts of time and resources to create, PaLM 2 is less a single product than a family of products — with different versions that will be deployed in consumer and enterprise settings. The system is available in four sizes, named Gecko, Otter, Bison, and Unicorn, from smallest to largest, and has been fine-tuned on domain-specific data to perform certain tasks for enterprise customers.
Think of these adaptations like taking a basic truck chassis and adding a new engine or front bumper to accomplish certain tasks or work better in specific terrain. There’s a version of PaLM 2 trained on health data (Med-PaLM 2), which Google says can answer questions similar to those found on the US Medical Licensing Examination at an “expert” level, and another version trained on cybersecurity data (Sec-PaLM 2) that can “explain the behavior of potential malicious scripts and help detect threats in code,” said Petrov. Both of these models will be available via Google Cloud, initially to select customers.
Within Google’s own domain, PaLM 2 is already being used to power 25 features and products, including Bard, the company’s experimental chatbot. Updates available through Bard include improved coding capabilities and greater language support. It’s also being used to power features in Google Workspace apps like Docs, Slides, and Sheets.
Notably, Google says the lightest version of PaLM 2, Gecko, is small enough to run on mobile phones, processing 20 tokens per second — roughly equivalent to around 16 or 17 words. Google did not say what hardware was used to test this model, only that it was running “on the latest phones.” Nevertheless, the miniaturization of such language models is significant. Such systems are expensive to run in the cloud, and being able to use them locally would have other benefits, like improved privacy. The problem is that smaller versions of language models are inevitably less capable than their larger brethren.
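As a back-of-the-envelope check on that figure, the conversion from tokens to words looks like this (the words-per-token ratio below is an assumption chosen to match the article’s “16 or 17 words,” not a number Google has published; real ratios vary by tokenizer and language):

```python
# Rough sketch of the tokens-to-words arithmetic above.
WORDS_PER_TOKEN = 0.82  # assumed average; tokens are sub-word chunks

def tokens_to_words(tokens_per_second: float) -> float:
    """Convert a token throughput to an approximate word throughput."""
    return tokens_per_second * WORDS_PER_TOKEN

print(tokens_to_words(20))  # roughly 16 words per second
```

At 20 tokens per second, that works out to around 16 words per second of generated text on-device.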
With PaLM 2, Google will be hoping to close the “AI gap” between the company and competitors like Microsoft, which has been aggressively pushing AI language tools into its suite of Office software. Microsoft now offers AI features that help summarize documents, write emails, generate slides for presentations, and much more. Google will need to keep parity with the company or risk being perceived as slow to implement its AI research.
Although PaLM 2 is certainly a step forward for Google’s work on AI language models, it suffers from problems and challenges common to the technology more broadly.
For example, some experts are beginning to question the legality of training data used to create language models. This data is usually scraped from the internet and often includes copyright-protected text and pirated ebooks. Tech companies creating these models have generally responded by refusing to answer questions about where they source their training data from. Google has continued this tradition in its description of PaLM 2, noting only that the system’s training corpus comprises “a diverse set of sources: web documents, books, code, mathematics, and conversational data,” without offering further detail.
There are also problems inherent to the output of language models, like “hallucinations,” or the tendency of these systems to simply make up information. Speaking to The Verge, Google VP of research Zoubin Ghahramani said that, in this regard, PaLM 2 was an improvement on earlier models “in the sense that we’re putting a huge amount of effort into continually improving metrics of groundedness and attribution” but noted that the field as a whole “still has a ways to go” in combating false information generated by AI.