Nothing in this world — animal or robot — quite comes close to the flexibility and dexterity of the human hand. For engineers at the Elon Musk-founded nonprofit OpenAI, this presented both a challenge and an opportunity. How could their researchers use artificial intelligence to teach a robot to manipulate objects as artfully as a human?
When teaching an AI to control a physical robot, scientists tend to come up against the same problems. Training is often done using reinforcement learning, a method where the AI learns through a process of trial and error. But this requires a lot of time, usually amounting to years of experience. That’s fine if you want an AI to beat, say, a video game: you just let it play the game at an accelerated rate. But if you want to teach it a real-life task, you’re in trouble. You can’t wait for robot arms to muddle through years of practice, and it’s hard to get a simulation of the world that’s accurate enough for training purposes.
For OpenAI, the task they’d set themselves was teaching a robot hand to manipulate a six-sided cube, moving it from one position to another so a specific side was facing up. As with earlier research, they began by simulating this environment as accurately as possible, but their next step was what made the difference: they began messing with the simulation.
First, they added random visual noise. Then, they changed the colors of the virtual hand and cube. They randomized the size of the cube, how slippery its surfaces were, and how heavy it was. They even messed with the simulation’s gravity. The effect of all this was to give the AI a better understanding of what it might be like to manipulate the cube in the real world. While the simulation may not have been totally true to life, it had enough variations that it allowed the system to learn to deal with the unexpected.
OpenAI’s Matthias Plappert, who worked on the project, explains that changing the simulation’s gravity was a particularly fun hack. The team knew that when the AI system — known as Dactyl — was controlling a real robot hand, the base of the hand might not be positioned at the same angle each time. A lower angle would mean the cube would fall out of the hand more easily. In order to teach Dactyl how to handle this variant, they decided they would randomize the angle of gravity in the simulation. “Without this randomization, it would just drop the object all the time because it wasn’t used to it,” says Plappert.
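For readers who want to see the idea in code, here is a minimal sketch of that kind of randomization, written in Python. It is an illustration, not OpenAI’s code: the names `EpisodeParams` and `sample_episode_params` are hypothetical, and every numeric range (cube size, mass, friction, the 15-degree maximum tilt) is an assumption chosen only for the example. Before each simulated practice run, it draws a fresh cube and tilts gravity away from straight down, which stands in for a hand mounted at a slightly different angle.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class EpisodeParams:
    cube_size_m: float   # edge length of the cube
    cube_mass_kg: float  # how heavy the cube is
    friction: float      # how slippery its surfaces are
    gravity: tuple       # (x, y, z) gravity vector in m/s^2


def sample_episode_params(rng: random.Random, max_tilt_deg: float = 15.0) -> EpisodeParams:
    """Draw one random variation of the simulated world (all ranges are illustrative)."""
    # Tilt gravity away from straight down to mimic a hand base set at a different angle.
    tilt = math.radians(rng.uniform(0.0, max_tilt_deg))
    heading = rng.uniform(0.0, 2.0 * math.pi)
    g = 9.81
    gravity = (
        g * math.sin(tilt) * math.cos(heading),
        g * math.sin(tilt) * math.sin(heading),
        -g * math.cos(tilt),
    )
    return EpisodeParams(
        cube_size_m=rng.uniform(0.045, 0.065),
        cube_mass_kg=rng.uniform(0.05, 0.15),
        friction=rng.uniform(0.5, 1.5),
        gravity=gravity,
    )


# Each training episode would start from a freshly sampled world.
rng = random.Random(0)
episodes = [sample_episode_params(rng) for _ in range(1000)]
```

In a real training setup, each sampled set of values would be handed to the physics simulator before an episode begins, so the learning system never sees exactly the same world twice.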
Going through all these randomizations took a long time, though. A seriously long time. In fact, Dactyl had to accumulate roughly 100 years’ worth of experience to reach top performance. That, in turn, meant the team had to use a lot of computing power: some 6,144 CPUs and eight powerful Nvidia V100 GPUs. That’s the sort of hardware that’s accessible to only a very few research institutions.
But the end results were worth it, says Plappert. Once fully trained, Dactyl was able to move the cube from one position to another up to 50 times in a row without dropping it. (Although the median number of times it did so was much smaller: just 13.) And in learning to move the cube around in its hand, Dactyl even developed human-like behaviors. All this was learned without any human instruction, just trial and error, for decades at a time.
“This shows that what we humans do for manipulation is very optimized,” says Plappert. “It’s a very interesting moment when you look at a robot trying to solve a problem and you think ‘Oh, hey, that’s how I would do that, too.’”
Experts in the fields of robotics and AI speaking to The Verge praised OpenAI’s work, but cautioned that it did not represent a breakthrough for robotic manipulation. Smruti Amarjyoti of Carnegie Mellon University’s Robotics Institute noted that randomizing the system’s training environment has been done before, but said Dactyl’s movements were “graceful” in a way he’d thought impossible for AI.
“The end result is highly sophisticated and polished,” said Amarjyoti. “[But] I would consider the biggest achievement of OpenAI in this field would be the engineering coordination that it took and the amount of compute power that was utilized to achieve this feat.”
Antonio Bicchi, a professor of robotics at the Istituto Italiano di Tecnologia, said the research was “elegant and enthusing” but noted a number of limitations. “The result is still limited to a specific task (rolling a die of convenient size) in rather favorable conditions (the hand is facing up, so that the die falls in the palm), and is not even close to be a conclusive argument that these techniques can solve real-world robotics problem,” said Bicchi.
For OpenAI, the research is gratifying for reasons beyond Dactyl’s dice-juggling. The system was taught using a number of the same algorithms and techniques the lab developed to train its video game playing bot, OpenAI Five. This, the company suggests, shows that it is building general purpose algorithms that can be used to tackle a wide array of tasks — something of a holy grail for ambitious AI labs and companies.
Creating more dextrous robots with the help of artificial intelligence would be a huge boon to companies trying to automate manual labor, and there are a number of startups actively pursuing research in this area. But while improving the state of the art in robotics would certainly allow more jobs to be automated, whether or not this wave of job destruction can be offset by the jobs created by new technology is something of an open question.
Either way, it’s clear that artificial intelligence still has a way to go before it can match humanity’s motor skills. Abilities that took Dactyl nearly a hundred years of learning can be picked up by a human with “only very few trials, [even] with new objects and tasks,” notes Bicchi. But certainly the machines are catching up, faster than ever.