Artificial intelligence seems to have become ubiquitous in the technology industry. AIs, we’re told, are replying to our emails on Gmail, learning how to drive our cars, and sorting our holiday photos. Mark Zuckerberg is even building one to help out around the house. The problem is that the concept of "artificial intelligence" is way too potent for its own good, conjuring images of supercomputers that operate spaceships, rather than particularly clever spam filters. The next thing you know, people are worrying about exactly how and when AI is going to doom humanity.
Tech companies have partly encouraged this conflation of artificial intelligence and sci-fi AI (especially with their anthropomorphic digital assistants), but it’s not useful when it comes to understanding what our computers are doing that's new and exciting. With that in mind, this primer aims to explain some of the most commonly used terms in consumer applications of artificial intelligence — as well as looking at the limitations of our current technology, and why we shouldn’t be worrying about the robot uprising just yet.
A robot in DARPA's robotics challenge struggles with a door. (Image credit: IEEE Spectrum / DARPA)
What do ‘neural network,’ ‘machine learning,’ and ‘deep learning’ actually mean?
These are the three terms you’re most likely to have heard lately, and, to be as simple as possible, we can think of them in layers. Neural networks are at the bottom — they're a type of computer architecture onto which artificial intelligence is built. Machine learning comes next — it’s a program you might run on a neural network, training computers to look for certain answers in pots of data. And deep learning is on top — it’s a particular type of machine learning that’s only become popular over the past decade, largely thanks to two new resources: cheap processing power and abundant data (otherwise known as the internet).
The concept of neural networks goes all the way back to the ‘50s and the beginning of AI as a field of research. In a nutshell, these networks are a way of structuring a computer so that it looks like a cartoon of the brain, made up of neuron-like nodes connected together in a web. Individually these nodes are dumb, answering extremely basic questions, but collectively they can tackle difficult problems. More importantly, with the right algorithms, they can be taught.
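To make the idea concrete, here's a toy sketch in Python (ours, not anything from a real AI lab) of one of those neuron-like nodes, plus a tiny web of them. Each node just weighs its inputs and answers with a number between 0 and 1; the interesting behavior only appears when nodes are wired together and, crucially, trained.

```python
import numpy as np

def node(inputs, weights, bias):
    """One 'dumb' node: weigh the evidence, then answer between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-(np.dot(inputs, weights) + bias)))

def tiny_network(x, layers):
    """A web of nodes: each layer passes its answers up to the next."""
    for weights, bias in layers:
        x = 1.0 / (1.0 + np.exp(-(x @ weights + bias)))  # whole layer fires at once
    return x

rng = np.random.default_rng(0)
layers = [(rng.normal(size=(4, 8)), np.zeros(8)),   # 4 inputs feed 8 hidden nodes,
          (rng.normal(size=(8, 1)), np.zeros(1))]   # which feed 1 output node
print(node(np.array([1.0, 0.0]), np.array([2.0, -2.0]), 0.0))  # a single node's answer
print(tiny_network(rng.normal(size=4), layers))     # the web's answer
```

Untrained, the web's answer is noise; the learning algorithms described below are what give those weights meaning.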
With conventional programming, you tell a computer what to do; with machine learning, you show it how
Say you want a computer to know how to cross a road, suggests Ernest Davis, a professor of computer science at New York University. With conventional programming you would give it a very precise set of rules, telling it how to look left and right, wait for cars, use pedestrian crossings, etc., and then let it go. With machine learning, you’d instead show it 10,000 videos of someone crossing the road safely (and 10,000 videos of someone getting hit by a car), and then let it do its thing.
The tricky part is getting the computer to absorb the information from all these videos in the first place. Over the past couple of decades, people have tried all sorts of different methods to try to teach computers. These methods include, for example, reinforcement learning, where you give a computer a "reward" when it does the thing you want, gradually optimizing the best solution; and genetic algorithms, where competing methods for solving a problem are pitted against one another in a manner comparable to natural selection.
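Here's what the "reward" idea looks like in miniature: a toy Python sketch where the actions and payouts are invented for illustration, but the core move is real. The computer's estimate of each action drifts toward whatever the environment actually pays out.

```python
import random

values = {"wait_for_gap": 0.0, "dash_across": 0.0}  # how good each action looks so far
learning_rate = 0.1

def reward(action):
    # Hypothetical environment: dashing across sometimes ends badly
    if action == "wait_for_gap":
        return 1.0
    return 1.0 if random.random() > 0.5 else -1.0

for _ in range(1000):
    action = random.choice(list(values))  # try something
    # Nudge the estimate toward the reward actually received
    values[action] += learning_rate * (reward(action) - values[action])

print(values)  # after enough trials, the safer action scores noticeably higher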
Google and several carmakers use machine learning to teach their cars to drive.
In today’s classrooms-for-computers, there’s one teaching method that's become particularly useful: deep learning — a type of machine learning that uses lots of layers in a neural network to analyze data at different abstractions. So, if a deep learning system is looking at a picture, each layer essentially tackles a different level of magnification. The bottom layer might look at just a 5 x 5 grid of pixels, answering simply "yes" or "no" as to whether something shows up in that grid. If it answers yes, then the layer above looks to see how this grid fits into a larger pattern. Is this the beginning of a line, for example, or a corner? This process gradually builds up, allowing the software to understand even the most complicated data by breaking it down into constituent parts.
"As you go up these layers the things that are detected are more and more global," Yann LeCun, the head of Facebook’s artificial intelligence research team, tells The Verge. "More and more abstract. And then, at the very top layer you have detectors that can tell you whether you’re looking at a person or a dog or a sailplane or whatever it is."
Deep learning systems need a lot of data and a lot of time to work
Next, let’s imagine that we want to teach a computer what a cat looks like using deep learning. First, we’d take a neural network whose layers learn to pick out different elements of a cat: claws, paws, whiskers, etc. (Each layer is itself built on lower layers that let it recognize that particular element, which is why this is called deep learning.) Then, the network is shown a lot of images of cats and other animals and told which is which. "This is a cat," we tell the computer, showing it a picture of a cat. "This is also a cat. This is not a cat." As the neural network sees different images, different layers and nodes within it light up as they recognize claws, paws, and whiskers, etc. Over time, it remembers which of these connections are important and which aren’t, strengthening some and disregarding others. It might discover that paws, for example, are strongly correlated with cats, but that they also appear on things that are not cats, so it learns to look for paws that also appear alongside whiskers.
This is a long, iterative process, with the system slowly getting better based on feedback. Either a human corrects the computer, nudging it in the right direction, or, if the network has a large enough pot of labeled data, it can test itself, seeing which weightings of all its layers produce the most accurate answers. Now, you can imagine how many steps are needed just to say whether something is or is not a cat, so think how complex these systems have to be to recognize, well, everything else that exists in the world. That’s why Microsoft was proud to launch an app the other week that identifies different breeds of dogs. The difference between a Doberman and a schnauzer might seem obvious to us, but a computer has to learn a lot of fine distinctions before it can tell them apart.
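Here's that feedback loop in its most stripped-down form: a one-layer "network" (logistic regression, vastly simpler than anything Facebook or Google runs) trained on made-up, labeled feature vectors. The features and data are invented for illustration, but the strengthening and weakening of connections is the real mechanism.

```python
import numpy as np

# Invented features per animal: [has_paws, has_whiskers, barks]
X = np.array([[1, 1, 0], [1, 1, 0], [1, 0, 1], [0, 0, 1]], dtype=float)
y = np.array([1, 1, 0, 0], dtype=float)  # labels: 1 = cat, 0 = not a cat

rng = np.random.default_rng(0)
w, b = rng.normal(size=3), 0.0

for _ in range(2000):                   # the long, iterative part
    p = 1 / (1 + np.exp(-(X @ w + b)))  # current guesses for every example
    error = p - y                       # feedback: how wrong was each guess?
    w -= 0.1 * (X.T @ error) / len(y)   # strengthen or weaken each connection
    b -= 0.1 * error.mean()

print(np.round(w, 2))  # whiskers end up weighted heavily; barking counts against
```

Notice that paws show up in cats and non-cats alike, so that connection stays weak: exactly the paws-plus-whiskers lesson described above.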
So this is what Google, Facebook, and the rest are using?
For the most part, yes.
Deep learning techniques are now being employed for all sorts of everyday tasks. Many of the big tech companies have their own AI divisions, and both Facebook and Google have launched efforts to open up their research by open-sourcing some of their software. Google even launched a free three-month online course in deep learning last month. And while academic researchers might work in relative obscurity, these corporate institutions are churning out novel applications for this technology every week: everything from Microsoft’s "emotional recognition" web app to Google’s surreal Deep Dream images. This is another reason why we’re hearing a lot about deep learning lately: big, consumer-facing companies are playing with it, and they’re sharing some of the weirder stuff they’re making.
However, while deep learning has proved adept at tasks involving speech and image recognition — stuff that has lots of commercial applications — it also has plenty of limitations. Not only do deep-learning techniques require a lot of data and fine-tuning to work, but their intelligence is narrow and brittle. As cognitive psychologist Gary Marcus writes in The New Yorker, the methods that are currently popular "lack ways of representing causal relationships (such as between diseases and their symptoms), and are likely to face challenges in acquiring abstract ideas like ‘sibling’ or ‘identical to.’ They have no obvious ways of performing logical inferences, and they are also still a long way from integrating abstract knowledge, such as information about what objects are, what they are for, and how they are typically used." In other words, they don’t have any common sense.
For example, in a research project from Google, a neural network was used to generate a picture of a dumbbell after being trained on sample images. The pictures of dumbbells it produced were pretty good: two gray circles connected by a horizontal tube. But in the middle of each weight was the muscular outline of a bodybuilder’s arm. The scientists involved suggest this might be because the pictures the network had been trained on showed a bodybuilder holding the dumbbell. Deep learning might be able to work out the common visual properties of tens of thousands of pictures of dumbbells, but it would never make the cognitive leap to say that dumbbells don’t have arms. And these problems aren’t limited to a lack of common sense. Because of the way they examine data, deep-learning networks can also be fooled by random patterns of pixels. You might see static, but a computer can be 95 percent certain it’s looking at a cheetah.
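The confidence problem is easy to reproduce in miniature. This toy Python sketch (nothing to do with the actual research) builds an arbitrary ten-class classifier and feeds it pure static; because a softmax layer has to hand out probabilities that sum to one, whichever class happens to score highest can look eerily certain.

```python
import numpy as np

rng = np.random.default_rng(7)
w = rng.normal(size=(100, 10))    # an arbitrary, made-up 10-class classifier
static = rng.uniform(size=100)    # pure random noise: "you might see static"

scores = static @ w
probs = np.exp(scores - scores.max()) / np.exp(scores - scores.max()).sum()
print(f"class {probs.argmax()} with {probs.max():.0%} confidence")
```

Nothing here was trained to say "I don't know," and that refusal option is exactly what most classifiers lack.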
These sorts of limitations can be artfully hidden, though. Take the new wave of digital assistants like Siri, for example, which often seem like they can understand us — answering questions, setting alarms, and telling a few preprogrammed jokes and quips along the way. But as the computer scientist Hector Levesque points out, these tricks just show how big the gap between AI and real intelligence is. Levesque uses the example of the Turing test, and points out that the machines that do best at this challenge rely on tricks to make people think they’re talking to a human. They use jokes, quotations, emotional outbursts, misdirection, and all manner of verbal dodges to confuse and distract questioners. And indeed, the machine that was said by some publications to have beaten the Turing test last year did so by claiming to be a 13-year-old Ukrainian boy — a cover story that excused its occasional ignorance, clunky phrasing, and conversational non sequiturs.
A better test of AI in this domain, says Levesque, would be to quiz machines with surreal but logical questions that demand the sort of wide, causal knowledge that Marcus describes. Levesque offers example questions like, "Could a crocodile run a steeplechase?" and "Should baseball players be allowed to glue small wings onto their caps?" Imagine the sort of things a computer would have to know to even attempt to answer these.
If this isn’t artificial intelligence, what is?
This is one of the difficulties of using the term artificial intelligence: it’s just so tricky to define. In fact, it’s axiomatic within the industry that as soon as machines have conquered a task that previously only humans could do — whether that’s playing chess or recognizing faces — then it’s no longer considered to be a mark of intelligence. As computer scientist Larry Tesler put it: "Intelligence is whatever machines haven't done yet." And even when computers beat us at a task, they aren’t doing it by replicating human intelligence. "When we say the neural network is like the brain it’s not true," says LeCun. "It’s not true in the same way that airplanes aren’t like birds. They don’t flap their wings, they don’t have feathers or muscles." If we do create intelligence, he says, it "won’t be like human intelligence or animal intelligence. It’s very difficult for us to imagine, for example, an intelligent entity that does not have [the impulse towards] self-preservation."
Many people working within the field of AI are dismissive of the idea that we’ll ever be able to create artificial intelligence that is truly sentient. "There is no approach at the moment that has any hope of being flexible and performing multiple tasks or going beyond the basic tasks that it’s programmed to do," Professor Andrei Barbu, from MIT’s Center for Brains, Minds and Machines, told The Verge, adding that effective AI research is just about creating systems that have been fine-tuned to solve a specific problem. He says that although there have been forays into unsupervised learning, where systems work through data that hasn’t been labeled in any way, this work is still in its infancy. One of the better-known examples is a neural network created by Google that was fed random YouTube thumbnails from 10 million videos. Eventually, it taught itself what a cat looked like, but its creators did not make any wider claims for its ability. As LeCun said at an event at the Orange Institute last year: "We don't know how to do unsupervised learning. That's the biggest obstacle."
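For a sense of what unsupervised learning means at its simplest, here's a k-means clustering sketch in Python. It's nothing like the scale of Google's YouTube experiment, but it's the same basic bargain: the data carries no labels at all, and structure still falls out.

```python
import numpy as np

rng = np.random.default_rng(1)
# 100 unlabeled points drawn from two hidden groups (the algorithm isn't told this)
data = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(6, 1, (50, 2))])

centers = data[rng.choice(len(data), size=2, replace=False)]
for _ in range(10):
    # Assign each point to its nearest center, then move centers to the mean
    labels = np.argmin(((data[:, None] - centers) ** 2).sum(axis=2), axis=1)
    centers = np.array([data[labels == k].mean(axis=0) for k in range(2)])

print(np.round(centers, 1))  # two clusters discovered without a single label
```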
Artificial intelligence as a field of study also has a tendency to fall victim to hype. It often happens that a new method is found, progress is made quickly, and commentators (and often computer scientists too) make bold claims that this rate of improvement will continue until we've created robot butlers. Just take this New York Times article from 1958, for example, which describes a very early form of AI — a machine that can just about tell the difference between left and right — as an electronic "embryo" that will one day be able to "walk, talk, see, write, reproduce itself, and be conscious of its existence." When these sorts of promises aren’t fulfilled, the field tends to fall into what’s known as an AI winter — a period of pessimism and reduced funding. (There have been half a dozen minor AI winters, and two major ones — in the late ‘70s and early ‘90s.) And while it’s true that every field of scientific research goes through fallow periods, it’s worth noting that few disciplines disappoint their acolytes so reliably that they come up with a special name for it.
So that's it: No artificial intelligence ever? Just gimmicks and tricks?
Well, not quite. It depends on what you want an AI to be. Our machines are certainly getting more intelligent, but not in a way that we can easily categorize. Take the self-driving software used in Tesla’s cars, for example. The company’s CEO Elon Musk describes it as a "fleet learning network" which pools data so that "when one car learns something, all learn." The end point of this research won’t be a general or flexible AI, but it will be a lot of smarts spread out over a whole network of computers — something that LeCun refers to as "hidden intelligence."
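To be clear about what pooling data across a fleet means mechanically, here's a deliberately crude Python sketch (emphatically not Tesla's actual system; the update rule and numbers are invented). Each car nudges its own copy of a shared model using what it saw on the road, and the fleet then averages everyone's copies, so one car's lesson reaches all of them.

```python
import numpy as np

def local_update(weights, observation):
    # Hypothetical per-car step: nudge this car's copy toward what it observed
    return weights + 0.1 * (observation - weights)

rng = np.random.default_rng(3)
fleet_model = np.zeros(4)                                         # shared starting model
observations = [rng.normal(1.0, 0.2, size=4) for _ in range(100)]  # one per car

local_models = [local_update(fleet_model, obs) for obs in observations]
fleet_model = np.mean(local_models, axis=0)  # pool: when one car learns, all learn
print(np.round(fleet_model, 2))
```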
The future of AI is likely to be subtle
Imagine, in the future, that you have a flawless self-driving car that comes equipped with an advanced digital assistant — one of Siri’s descendants perhaps. This might be the sort of cheating chatbot that Levesque isn’t impressed with, but one that can trick anyone into treating it as a human. You exchange jokes on your morning commute, chat about the news, arrange things in your calendar, and change your destination if you need to, all in a self-driving car that’s learned not only the rules of the road, but also how to account for the inconsistencies of other drivers. By the time we get to this point (and this is almost certainly an achievable goal), will we really care that we don’t have "true" artificial intelligence? Won’t it be enough that it seems like we do?