What does the world look like to AI?
Researchers have puzzled over this for decades, but in recent years, the question has become more pressing. Machine vision systems are being deployed in more and more areas of life, from health care to self-driving cars, but “seeing” through the eyes of a machine (understanding why it classified that person as a pedestrian but that one as a signpost) is still a challenge. Our inability to do so could have serious, even fatal, consequences. Some would say it already has, given the deaths involving self-driving cars.
New research from Google and nonprofit lab OpenAI hopes to further pry open the black box of AI vision by mapping the visual data these systems use to understand the world. The method, dubbed “Activation Atlases,” lets researchers analyze the workings of individual algorithms, unveiling not only the abstract shapes, colors, and patterns they recognize, but also how they combine these elements to identify specific objects, animals, and scenes.
Google’s Shan Carter, a lead researcher on the work, told The Verge that if previous research had been like revealing individual letters in algorithms’ visual alphabet, Activation Atlases offers something closer to a whole dictionary, showing how letters are put together to make actual words. “So within an image category like ‘shark,’ for example, there will be lots of activations that contribute to it, like ‘teeth’ and ‘water,’” says Carter.
The work is not necessarily a huge breakthrough, but it’s a step forward in a wider field of research known as “feature visualization.” Ramprasaath Selvaraju, a PhD student at Georgia Tech who was not involved in the work, said the research was “extremely fascinating” and had combined a number of existing ideas to create a new “incredibly useful” tool.
Selvaraju told The Verge that, in the future, work like this will have many uses, helping us to build more efficient and advanced algorithms as well as improve their safety and remove bias by letting researchers peer inside. “Due to the inherent complex nature [of neural networks], they lack interpretability,” says Selvaraju. But in the future, he says, when such networks are routinely used to steer cars and guide robots, interpretability will be a necessity.
OpenAI’s Chris Olah, who also worked on the project, said, “It feels a little like creating a microscope. At least, that’s what we’re aspiring towards.”
You can explore an interactive version of the Activation Atlas pictured below here.
Activating the neurons
To understand how Activation Atlases and other feature visualization tools work, you first need to know a little about how AI systems recognize objects in the first place.
The basic way to do this is to use a neural network: a computational structure that’s broadly similar to the human brain (though it’s light-years behind in sophistication). Inside each neural network are layers of artificial neurons connected like webs. Like cells in your brain, these fire in response to stimuli, a process known as activation. Importantly, they don’t just fire on or off; they register on a spectrum, giving each activation a specific value. (The strengths of the connections between neurons, which the network adjusts as it learns, are called “weights.”)
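To make the idea of an activation concrete, here is a minimal sketch of a single artificial neuron in Python. It assumes the common textbook formulation (a weighted sum of inputs plus a bias, passed through a ReLU function); the input values, connection weights, and bias are hypothetical numbers chosen purely for illustration, not taken from any real network.

```python
def relu(x):
    """Rectified linear unit: keeps positive values, clamps negatives to zero."""
    return max(0.0, x)


def neuron_activation(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through ReLU."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return relu(total)


# Hypothetical values, for illustration only.
pixels = [0.2, 0.8, 0.5]      # three input signals, e.g. pixel brightnesses
weights = [1.0, -0.5, 2.0]    # connection strengths the network would learn
bias = -0.3

activation = neuron_activation(pixels, weights, bias)
print(activation)  # a value on a spectrum, not just "on" or "off"
```

The output is a number rather than a yes-or-no signal, which is what lets later layers combine many such values into finer distinctions.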
To turn a neural network into something useful, you must feed it lots of training data. In the case of a vision algorithm, that will mean hundreds of thousands, perhaps even millions, of images, each labeled with a specific category. In the case of the neural network tested by Google and OpenAI’s researchers for this work, these categories were wide-ranging: everything from wool to Windsor ties, from seat belts to space heaters.
As it’s fed this data, different neurons in the neural network light up in response to each image. This pattern is connected to the image’s label, and it’s this association that allows the network to “learn” what things look like. Once trained, you can show the network a picture it’s never seen before, and the neurons will activate, matching the input to a specific category. Congrats! You’ve just trained a machine learning vision algorithm.
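The training loop described above can be sketched in a few lines, with heavy simplifications. This toy uses the classic perceptron learning rule on a single neuron rather than the backpropagation that real vision networks rely on, and its four "images" are just made-up pairs of numbers with labels 1 and 0. All names and values here are assumptions for illustration.

```python
def predict(weights, bias, features):
    """Return 1 if the weighted score is positive, otherwise 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(examples, epochs=20, learning_rate=0.1):
    """Perceptron rule: nudge the weights whenever a prediction is wrong."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, bias, features)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical labeled data: two features per "image", label 1 or 0.
examples = [
    ([0.9, 0.1], 1),
    ([0.8, 0.2], 1),
    ([0.2, 0.9], 0),
    ([0.1, 0.7], 0),
]

weights, bias = train(examples)

# An input the model never saw during training.
unseen = [0.7, 0.3]
print(predict(weights, bias, unseen))
```

The same principle, repeated across millions of images and millions of adjustable connections, is what produces a working classifier.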
If all of this sounds disturbingly simple, that’s because, in many ways, it is. Like a lot of machine learning programs, vision algorithms are, at heart, simply pattern-matching machines. This gives them certain strengths (like the fact that they’re straightforward to train as long as you have the necessary data and computing power). But it gives them certain weaknesses, too (like the fact that they’re easily confused by inputs they haven’t seen before).
Since researchers discovered the potential of neural networks for vision tasks in the early 2010s, they’ve been tinkering with their mechanics, trying to figure out exactly how they do what they do.
One early experiment was DeepDream, a computer vision program released in 2015 that turned any picture into a hallucinogenic version of itself. DeepDream’s visuals were certainly entertaining (in some ways, they became the defining aesthetic for AI), but the program was also an early foray into seeing like an algorithm. “In some ways, this all starts with DeepDream,” says Olah.
What DeepDream does is tune images to be as interesting as possible to algorithms. It may seem like the software is unearthing “hidden” patterns in an image, but it’s more like someone scribbling in a coloring book: filling every inch with eyes, stalks, whorls, and snouts, all to excite the algorithm as much as possible.
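The core trick, nudging an image so that it excites the network more, is a form of gradient ascent on the input. The sketch below shows the idea on a single linear neuron, which is a drastic simplification: DeepDream itself works on whole layers of a deep network. The weights and the four-pixel "image" are hypothetical.

```python
def activation(image, weights):
    """Response of one linear neuron to the image."""
    return sum(p * w for p, w in zip(image, weights))


def dream_step(image, weights, step_size=0.1):
    """Move each pixel in the direction that raises the activation.

    For a linear neuron, the gradient of the activation with respect to
    each pixel is simply that pixel's weight.
    """
    nudged = [p + step_size * w for p, w in zip(image, weights)]
    # Keep pixel values in the valid range [0, 1].
    return [min(1.0, max(0.0, p)) for p in nudged]


# Hypothetical neuron and a flat gray starting image.
weights = [0.5, -1.0, 0.25, 1.0]
image = [0.5, 0.5, 0.5, 0.5]

before = activation(image, weights)
for _ in range(10):
    image = dream_step(image, weights)
after = activation(image, weights)

print(before, after)  # the activation rises as the image is tuned
```

Note that the neuron itself never changes here. Only the image does, which is why the result shows what the neuron "wants" to see rather than anything hidden in the original picture.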
Later research has taken this same basic approach and fine-tuned it: first targeting individual neurons within the network to see what excites them, then clusters of neurons, then combinations of neurons in different layers of the network. If early experiments were dedicated but haphazard, like Isaac Newton poking himself in the eye with a blunt needle to understand vision, recent work is like Newton splitting a ray of light with a prism. It’s much more focused. By mapping out what visual elements are activated in each part of a neural network, time and time again, eventually, you get the atlas: a visual index to its brain.
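One way to picture the final mapping step is as a grid: activations recorded from many images are laid out on a 2D plane, and those that land in the same grid cell are averaged into a single representative. The sketch below implements only that averaging step. It assumes each activation vector has already been given a 2D position (by some layout method this sketch does not implement), and the coordinates and vectors are invented for illustration.

```python
from collections import defaultdict


def build_atlas(points, grid_size):
    """Average activation vectors that fall into the same grid cell.

    points: list of ((x, y), activation_vector), with x and y in [0, 1).
    Returns a dict mapping (row, col) to the mean activation vector.
    """
    cells = defaultdict(list)
    for (x, y), vector in points:
        col = min(int(x * grid_size), grid_size - 1)
        row = min(int(y * grid_size), grid_size - 1)
        cells[(row, col)].append(vector)

    atlas = {}
    for cell, vectors in cells.items():
        count = len(vectors)
        width = len(vectors[0])
        atlas[cell] = [sum(v[i] for v in vectors) / count for i in range(width)]
    return atlas


# Hypothetical positions and two-number activation vectors.
points = [
    ((0.1, 0.1), [1.0, 0.0]),
    ((0.2, 0.3), [3.0, 2.0]),
    ((0.9, 0.8), [0.0, 4.0]),
]

atlas = build_atlas(points, grid_size=2)
for cell, mean_vector in sorted(atlas.items()):
    print(cell, mean_vector)
```

In a real atlas, each averaged vector would then be rendered as an image using a feature visualization technique, so that every cell shows what that region of the network responds to.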
A machine’s-eye view
But what do Activation Atlases actually show us about the inner workings of algorithms? Well, you can start by just navigating around Google and OpenAI’s example here, built to unspool the innards of a well-known neural network called GoogLeNet or InceptionV1.
Scrolling around, you can see how different parts of the network respond to different concepts, and how these concepts are clustered together. (So, for example, dogs are all in one place, and birds are in another.) You can also see how different layers of the network represent different kinds of information. Lower layers respond to simple features such as edges, textures, and basic geometric shapes, while higher layers combine these into recognizable concepts.
Where this gets really interesting is when you dig into individual classifications. One example Google and OpenAI give is the difference between the categories for “snorkel” and “scuba diver.”
In the image below, you can see the various activations that are used by the neural network to identify these labels. On the left are activations that are strongly associated with “snorkel,” and on the right are activations that are strongly associated with “scuba diver.” The ones in the middle are shared between the two classes, while the ones on the fringes are more differentiated.
At a glance, you can make out some obvious colors and patterns. At the top, you have what looks like the spots and stripes of brightly colored fish, while at the bottom, there are shapes that look like face masks. But highlighted on the right-hand side is an unusual activation — one strongly associated with locomotives. When the researchers found this, they were puzzled. Why was this bit of visual information about locomotives important to recognizing scuba divers?
“So we tested it,” says Carter. “We’re like, ‘Okay, if we put a picture of a steam locomotive will it flip the classification from a snorkeler or to a scuba diver?’ And lo and behold, it does.”
The team eventually figured out the reason: the smooth metal curves of a locomotive are visually similar to a diver’s air tanks. So, to a neural network, that’s one obvious difference between divers and snorkelers. And rather than work out a fresh way to distinguish the two categories, it simply borrowed the identifying visual data it needed from elsewhere.
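The flip the researchers observed can be imitated with a toy model. The sketch below is not the network they studied: it is a hand-built scorer with three named features and invented weights, set up so that "metal curves" counts strongly toward "scuba diver." It only illustrates how one borrowed feature can tip a decision.

```python
# Hypothetical feature weights per class, invented for illustration.
CLASS_WEIGHTS = {
    "snorkel": {"face mask": 1.0, "colorful fish": 0.8, "metal curves": -0.5},
    "scuba diver": {"face mask": 1.0, "colorful fish": 0.3, "metal curves": 1.5},
}


def classify(activations):
    """Score each class as a weighted sum of feature activations."""
    scores = {
        label: sum(weight * activations.get(feature, 0.0)
                   for feature, weight in weights.items())
        for label, weights in CLASS_WEIGHTS.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# A snorkeling photo: mask and fish, no metal curves.
snorkeler = {"face mask": 1.0, "colorful fish": 1.0}
label_before, _ = classify(snorkeler)

# The same photo with a locomotive pasted in, which triggers "metal curves".
with_locomotive = {**snorkeler, "metal curves": 1.0}
label_after, _ = classify(with_locomotive)

print(label_before, "->", label_after)
```

In this toy, the mask and fish activations are identical in both photos. The only change is the extra "metal curves" signal, and that alone is enough to move the top label.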
This sort of example is incredibly revealing of how neural networks operate. To skeptics, it shows the limitations of these systems. Vision algorithms may be effective, they say, but the information they learn actually has little to do with how humans understand the world. This makes them susceptible to certain tricks. For example, if you sprinkle just a few carefully chosen pixels into an image, it can be enough to make an algorithm misclassify it.
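The pixel trick can also be shown in miniature. The following sketch assumes a toy linear classifier with made-up weights, where a positive score means one class and a negative score the other. It changes only the two pixels the classifier is most sensitive to, which is a simplified stand-in for real adversarial attacks on deep networks.

```python
def score(image, weights):
    """Positive score means class A, negative means class B."""
    return sum(p * w for p, w in zip(image, weights))


def perturb_top_pixels(image, weights, count, amount):
    """Shift the `count` most influential pixels against the current decision."""
    direction = -1.0 if score(image, weights) > 0 else 1.0
    # Pixels with the largest absolute weight have the most influence.
    targets = sorted(range(len(weights)),
                     key=lambda i: abs(weights[i]), reverse=True)[:count]
    changed = list(image)
    for i in targets:
        sign = 1.0 if weights[i] > 0 else -1.0
        changed[i] = changed[i] + direction * sign * amount
    return changed


# Hypothetical six-pixel image and classifier weights.
weights = [0.1, -0.2, 3.0, 0.1, -2.5, 0.2]
image = [0.5, 0.5, 0.6, 0.5, 0.5, 0.5]

adversarial = perturb_top_pixels(image, weights, count=2, amount=0.2)

print(score(image, weights) > 0)        # original decision
print(score(adversarial, weights) > 0)  # decision after changing two pixels
```

Four of the six pixels are untouched, yet the decision reverses, because the classifier leans so heavily on the two that were changed.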
But to Carter, Olah, and others like them, the information revealed by Activation Atlases and similar tools shows the surprising depth and flexibility of these algorithms. Carter points out, for example, that in order for the algorithm to distinguish between scuba divers and snorkelers, it also associates different types of animals with each category.
“[Animals] that occur in deep water, like turtles, are over by the scuba, and ones that occur on the surface, like birds, are over by the snorkel,” he says. He points out that this is information the system was never directed to learn. Instead, it just picked it up by itself. “And that’s sort of like a deeper understanding of the world. That’s really exciting to me.”
Olah agrees. “I find it almost awe inducing to look through these atlases at higher resolutions and just see the giant space of things these networks can represent.”
The pair hope that by developing tools like this, they’ll help push forward the whole field of artificial intelligence. By understanding how machine vision systems view the world, we can theoretically build them more efficiently and vet their accuracy more thoroughly.
“We have a limited toolbox for that right now,” says Olah. He says we can throw test data at systems to try and trick them, but this approach will always be limited by what we know can go wrong. “But this is giving us — if we want to invest the energy — a new tool for surfacing unknown-unknown problems,” he says. “It feels like each generation of these tools is moving us closer to being able to really understand what’s going on throughout these networks.”