Earlier this year, Google unveiled AI Test Kitchen — an Android app that lets users talk to one of its most advanced AI chatbots, LaMDA 2. Today, the company is opening up registrations for early access. You can sign up here, and Google says it will soon be letting people download the app and start chatting. (Though it’s limited to US users right now.)
The timing is notable: Meta made an almost identical move earlier this month, opening up its latest and greatest AI chatbot, BlenderBot 3, for public consumption. Of course, people quickly found that they could get BlenderBot to say creepy or untruthful things (or even criticize the bot’s nominal boss, Mark Zuckerberg), but that’s kind of the whole point of releasing these demos.
AI researchers say testing chatbots in the wild is still very helpful
As Mary Williamson, a research engineering manager at Facebook AI Research (FAIR), told me at the beginning of the month, many companies don’t like to test their chatbots in the wild because what the bots say can be damaging to the company, as happened with Microsoft’s Tay. But for many researchers, the best way to improve these same bots is to throw them into the public arena, where the chattering populace will stress-test and manipulate them in ways no fair-minded engineer would dream of.
“This lack of tolerance for bots saying unhelpful things, in the broad sense of it, is unfortunate,” said Williamson. “And what we’re trying to do is release this very responsibly and push the research forward.”
It’s interesting to compare Google and Meta in that regard, as Meta definitely applied fewer restrictions to interacting with BlenderBot. Google, on the other hand, is limiting conversations with LaMDA 2 to a few basic modes. As I wrote during the announcement:
The app has three modes: “Imagine It,” “Talk About It,” and “List It,” with each intended to test a different aspect of the system’s functionality. “Imagine It” asks users to name a real or imaginary place, which LaMDA will then describe (the test is whether LaMDA can match your description); “Talk About It” offers a conversational prompt (like “talk to a tennis ball about dog”) with the intention of testing whether the AI stays on topic; while “List It” asks users to name any task or topic, with the aim of seeing if LaMDA can break it down into useful bullet points (so, if you say “I want to plant a vegetable garden,” the response might include sub-topics like “What do you want to grow?” and “Water and care”).
That means the potential for embarrassing slips of the virtual tongue is certainly reduced. But, I’ll wager, not entirely eliminated.