A loss for humanity! Man succumbs to machine!
If you heard about AlphaGo’s latest exploits last week — crushing the world’s best Go player and confirming that artificial intelligence had mastered the ancient Chinese board game — you may have heard the news delivered in doomsday terms.
There was a certain melancholy to Ke Jie’s capitulation, to be sure. Following AlphaGo’s earthshaking victory over Lee Se-dol last year, the 19-year-old Chinese prodigy had declared he would never lose to an AI. To see him onstage last week, nearly bent double over the Go board and fidgeting with his hair, was to see a man comprehensively put in his place.
But focusing on that would miss the point. DeepMind, the Google-owned company that developed AlphaGo, isn’t attempting to crush humanity — after all, the company is made up of humans itself. AlphaGo represents a major human achievement and the takeaway shouldn’t be that AI is surpassing our abilities, but instead that AI will enhance our abilities.
When speaking to DeepMind and Google developers at the Future of Go Summit in Wuzhen, China last week, I didn’t hear much about the four games AlphaGo won over Lee Se-dol last year. Instead, I heard a lot about the one that it lost.
“We were interested to see if we could fix the problems, the knowledge gaps as we call them, that Lee Se-dol brilliantly exposed in game four with his incredible win, showing that there was a weakness in AlphaGo’s knowledge,” DeepMind co-founder and CEO Demis Hassabis said on the first day of the event. “We worked hard to see if we could fix that knowledge gap and actually teach, or have AlphaGo learn itself, how to deal with those kinds of positions. We’re confident now that AlphaGo is better in those situations, but again we don’t know for sure until we play against an amazing master like Ke Jie.”
As it happened, AlphaGo steamrolled Ke into a 3-0 defeat, suggesting that those knowledge gaps have been closed. It’s worth noting, however, that DeepMind had to learn from AlphaGo’s past mistakes to reach this level. If the AI had stood still for the past year, it’s entirely possible that Ke would have won; he’s a far stronger player than Lee. But AlphaGo did not stand still.
The version of AlphaGo that played Ke has been completely rearchitected — DeepMind calls it AlphaGo Master. “The main innovation in AlphaGo Master is that it’s become its own teacher,” says Dave Silver, DeepMind’s lead researcher on AlphaGo. “So [now] AlphaGo actually learns from its own searches to improve its neural networks, both the policy network and the value network, and this makes it learn in a much more general way. One of the things we’re most excited about is not just that it can play Go better but we hope that this’ll actually lead to technologies that are more generally applicable to other challenging domains.”
AlphaGo comprises two networks: a policy network that selects the next move to play, and a value network that estimates the probability of winning. The policy network was initially based on millions of historical moves from actual games played by Go professionals. But AlphaGo Master goes much further by searching through the possible moves that could occur if a particular move is played, increasing its understanding of the potential fallout.
“The original system played against itself millions of times, but it didn’t have this component of using the search,” Hassabis tells The Verge. “[AlphaGo Master is] using its own strength to improve its own predictions. So whereas in the previous version it was mostly about generating data, in this version it’s actually using the power of its own search function and its own abilities to improve one part of itself, the policy net.” Essentially, AlphaGo is now better at assessing why a particular move would be the strongest possible option.
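To make the idea of a system acting as its own teacher a little more concrete, here is a minimal toy sketch in Python. It is not DeepMind’s algorithm: the three candidate moves, the `VALUE_ESTIMATE` table standing in for a value network, the scoring rule, and every parameter are assumptions invented for illustration. It shows only the general shape of the loop Hassabis describes, in which a policy guides a search and the search results are then used to update the policy.

```python
import math

# Hypothetical toy position: three candidate moves and a stand-in "value
# network" that reports each move's win probability. Not real Go data.
VALUE_ESTIMATE = {"a": 0.40, "b": 0.55, "c": 0.70}


def search(policy, simulations=200, exploration=1.0):
    """Spend a simulation budget across moves and return visit fractions.

    Each step picks the move with the highest value plus a prior-weighted
    exploration bonus, so the policy guides the search but strong moves
    still attract most of the visits.
    """
    visits = {move: 0 for move in policy}
    for step in range(simulations):
        def score(move):
            # Unvisited moves have no value estimate yet, only a bonus.
            value = VALUE_ESTIMATE[move] if visits[move] else 0.0
            bonus = exploration * policy[move] * math.sqrt(step + 1) / (1 + visits[move])
            return value + bonus

        best = max(policy, key=score)
        visits[best] += 1
    return {move: count / simulations for move, count in visits.items()}


def improve_policy(policy, rounds=5, learning_rate=0.5):
    """Move the policy toward the search's visit fractions, round by round."""
    for _ in range(rounds):
        target = search(policy)
        policy = {
            move: (1 - learning_rate) * policy[move] + learning_rate * target[move]
            for move in policy
        }
    return policy


# A starting policy that favors the weakest move, standing in for an
# imperfect prior learned from example games.
initial_policy = {"a": 0.6, "b": 0.3, "c": 0.1}
trained_policy = improve_policy(initial_policy)
print(trained_policy)
```

In this toy, the starting policy prefers the weakest move, but because each round’s search spends most of its budget on the move with the best value, the updated policy ends up preferring move `c`. The real system trains neural networks on full games of Go; the sketch only mirrors the feedback loop.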
I asked Hassabis whether he thought this system could work without the initial dataset taken from historical games of Go. “We’re running those tests at the moment and we’re pretty confident, actually,” he said. “The initial results have been that it’s looking pretty good. That’ll be part of this future paper that we’re going to publish, so we’re not talking about that at the moment, but it’s looking promising. The whole idea is to reduce your reliance on that human bootstrapping step.”
But in order to defeat Ke, DeepMind needed to fix the weaknesses in the original AlphaGo that Lee exposed. Although the AI gets ever stronger by playing against itself, DeepMind couldn’t rely on that baseline training to cover the knowledge gaps — nor could it hand-code a solution. “It’s not like a traditional program where you just fix a bug,” says Hassabis, who believes that similar knowledge gaps are likely to be a problem faced by all kinds of learning systems in the future. “You have to kind of coax it to learn new knowledge or explore that new area of the domain, and there are various strategies to do that. You can use adversarial opponents that push you into exploring those spaces, and you can keep different varieties of the AlphaGo versions to play each other so there’s more variety in the player pool.”
“Another thing we did is when we assessed what kinds of positions we thought AlphaGo had a problem with, we looked at the self-play games and we identified games algorithmically — we wrote another algorithm to look at all those games and identify places where AlphaGo seemed to have this kind of problem. So we have a library of those sorts of positions, and we can test our new systems not only against each other in the self-play but against this database of known problematic positions, so then we could quantify the improvement against that.”
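The testing approach Hassabis describes resembles a regression suite in ordinary software. As a rough sketch only, assuming a library that pairs each problem position with the move judged correct there, a comparison between two versions might look like the following. The position labels, the moves, and the two lookup tables standing in for engine versions are all hypothetical.

```python
# Hypothetical library of known problem positions. Each entry pairs a
# position label with the move judged correct there; labels and moves are
# invented for illustration.
PROBLEM_POSITIONS = [
    ("position-1", "D4"),
    ("position-2", "Q16"),
    ("position-3", "K10"),
    ("position-4", "C3"),
]

# Stand-ins for two versions of a Go engine: each maps a position label to
# the move that version would play there.
OLD_VERSION = {"position-1": "D4", "position-2": "R14", "position-3": "J9", "position-4": "D3"}
NEW_VERSION = {"position-1": "D4", "position-2": "Q16", "position-3": "K10", "position-4": "D3"}


def solve_rate(engine_moves, positions):
    """Return the fraction of problem positions where the engine plays the expected move."""
    solved = sum(1 for label, expected in positions if engine_moves[label] == expected)
    return solved / len(positions)


old_rate = solve_rate(OLD_VERSION, PROBLEM_POSITIONS)
new_rate = solve_rate(NEW_VERSION, PROBLEM_POSITIONS)
improvement = new_rate - old_rate
print(f"old: {old_rate:.0%}, new: {new_rate:.0%}, improvement: {improvement:+.0%}")
```

The point of a fixed library is that the score is comparable from one version to the next, which self-play results alone are not.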
None of this increase in performance has required an increase in power. In fact, AlphaGo Master uses much less power than the version of AlphaGo that beat Lee Se-dol; it runs on a single second-gen Tensor Processing Unit machine in the Google Cloud, whereas the previous version used 50 TPUs at once. “You shouldn’t think of this as running on compute power that’s beyond the access of normal people,” says Silver. “The special thing about it is the algorithm that’s being used as opposed to the amount of compute.”
AlphaGo learned from humans, and humans are learning from AlphaGo
AlphaGo is learning from humans, then, even if it may not need to in the future. And in turn, humans have learned from AlphaGo. The simplest demonstration of this came in Ke Jie’s first game against the AI, where he used a 3-3 point as part of his opening strategy. That’s a move that fell out of favor over the past several decades, but it’s seen a resurgence in popularity after AlphaGo employed it with some success. And Ke pushed AlphaGo to its limits in the second game; the AI determined that his first 50 moves were “perfect,” and his first 100 were better than anyone had ever played against the Master version.
Although the Go community might not necessarily understand why a given AlphaGo move works in the moment, the AI provides a whole new way to approach the game. Go has been around for thousands of years, and AlphaGo has sparked one of the most profound shifts yet in how the game is played and studied.
But if you’re reading this in the West, you probably don’t play Go. What can AlphaGo do for you?
Say you’re a data center architect working at Google. It’s your job to make sure everything runs efficiently and coolly. To date, you’ve achieved that by designing the system so that you’re running as few pieces of cooling equipment at once as possible — you turn on the second piece only after the first is maxed out, and so on. This makes sense, right? Well, a variant of AlphaGo named Dr. Data disagreed.
“What Dr. Data decided to do was actually turn on as many units as possible and run them at a very low level,” Hassabis says. “Because of the switching and the pumps and the other things, that turned out to be better, and I think they’re now taking that into new data center designs, potentially. They’re taking some of those ideas and reincorporating them into the new designs, which obviously the AI system can’t do. So the human designers are looking at what the AlphaGo variant was doing, and then that’s informing their next decisions.” Dr. Data is at work right now in Google’s data centers, cutting the electricity required for cooling by 40 percent and overall energy usage by 15 percent.
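One idealized way to see why many lightly loaded units can beat a few fully loaded ones is the pump affinity law, under which a pump’s power draw scales roughly with the cube of its speed. The sketch below assumes that law holds exactly and that the load splits evenly across units. Both are simplifications, and the function and its parameters are hypothetical rather than anything Google has described.

```python
def cooling_power(units_running, total_load, max_power_per_unit=1.0):
    """Idealized power draw when a cooling load is split evenly across units.

    total_load is measured in multiples of one unit's full capacity. Power
    per unit is assumed to scale with the cube of its load fraction (the
    pump affinity law), ignoring fixed overheads and switching costs.
    """
    if total_load > units_running:
        raise ValueError("not enough units running to carry this load")
    load_fraction = total_load / units_running
    return units_running * max_power_per_unit * load_fraction ** 3


# Same total load, carried by one unit at full speed or four at a quarter.
one_unit = cooling_power(units_running=1, total_load=1.0)
four_units = cooling_power(units_running=4, total_load=1.0)
print(one_unit, four_units)
```

Under these assumptions, four units at a quarter of their capacity draw a sixteenth of the power of one unit at full capacity. Real equipment has fixed costs that narrow the gap, which is part of why the best setting has to be learned from data.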
DeepMind believes that the same principle will apply to science and health care, with deep-learning techniques helping to improve the accuracy and efficiency of everything from protein-folding to radiography. Perhaps less ambitiously but no less importantly, it may also lead to more sensible workflows. “You can imagine across a hospital or many hospitals you might be able to figure out that there’s this process one hospital’s using, or one nurse is using, that’s super effective over time,” says Hassabis. “Maybe they’re doing something slightly different to this other hospital, and perhaps the other hospital can learn from that. I think at the moment you’d never know that was happening, but you can imagine that an AI system might be able to pick up on that and share that knowledge effectively between different doctors and hospitals so they all end up with the best practice.”
These are areas particularly fraught with roadblocks and worries for many, of course. And it’s natural for people to be suspicious of AI — I experienced it myself somewhat last week. My hotel was part of the same compound as the Future of Go Summit, and access to certain areas was gated by Baidu’s machine learning-powered facial recognition tech. It worked instantly, every time, often without me even knowing where the camera was; I’d just go through the gate and see my Verge profile photo flash up on a screen. I never saw it fail for the thousands of other people at the event, either. And all of this worked based on nothing more than a picture of me taken on an iPad at check-in.
I know that Facebook and Google and probably tons of other companies also know what I look like. But the weird feeling I got from seeing my face flawlessly recognized multiple times a day for a week shows that companies ought to be sensitive about the way they roll out AI technologies. It also, to some extent, probably explains why so many people seem unsettled by AlphaGo’s success.
But again, that success is a success built by humans. AlphaGo is already demonstrating the power of what can happen not only when AI learns from us, but when we learn from AI. At this stage, it’s technology worth being optimistic about.
Photography by Sam Byford / The Verge