
If you can identify what’s in these images, you’re smarter than AI


Researchers collect confusing images to expose the weak spots in AI vision


From top to bottom and left to right, these images are misidentified as “digital clock,” “lighthouse,” “organ,” “syringe,” “toucan,” and “Persian cat.”

Computer vision has improved massively in recent years, but it’s still capable of making serious errors. So much so that there’s a whole field of research dedicated to studying pictures that are routinely misidentified by AI, known as “adversarial images.” Think of them as optical illusions for computers. While you see a cat up a tree, the AI sees a squirrel.

There’s a great need to study these images. As we put machine vision systems at the heart of new technology like AI security cameras and self-driving cars, we’re trusting that computers see the world the same way we do. Adversarial images prove that they don’t.

Adversarial images exploit weaknesses in machine learning systems

But while a lot of attention in this field is focused on pictures that have been specifically designed to fool AI (like this 3D-printed turtle, which Google’s algorithms mistake for a gun), these sorts of confusing visuals occur naturally as well. This category of images is, if anything, more worrying, as it shows that vision systems can make unforced errors.

To demonstrate this, a group of researchers from UC Berkeley, the University of Washington, and the University of Chicago created a dataset of some 7,500 “natural adversarial examples.” They tested a number of machine vision systems on this data and found that their accuracy dropped by as much as 90 percent, with the software able to identify just two or three percent of images in some cases.
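To get a feel for what such a test looks like in practice, here is a minimal sketch of scoring an off-the-shelf ImageNet classifier on a local folder of adversarial images using PyTorch and torchvision. The directory name, the choice of ResNet-50, and the assumption that the folder labels already line up with the model’s class indices are all illustrative assumptions, not details from the paper.

```python
# Minimal sketch: score a stock ImageNet classifier on a folder of
# "natural adversarial examples". Paths and model choice are assumptions.
import torch
from torchvision import datasets, models, transforms

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Assumes one subfolder per class; in practice you would need to map the
# folder names onto the model's ImageNet class indices.
dataset = datasets.ImageFolder("natural_adversarial_examples/", preprocess)
loader = torch.utils.data.DataLoader(dataset, batch_size=32)

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1).eval()

correct = total = 0
with torch.no_grad():
    for images, labels in loader:
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()

print(f"Top-1 accuracy: {100 * correct / total:.1f}%")
```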

You can see what these “natural adversarial examples” look like in the gallery below:

Bugs on a leaf misidentified as a “shipwreck.”

In an accompanying paper, the researchers say the data will hopefully help train more robust vision systems. They explain that the images exploit “deep flaws” that stem from the software’s “over-reliance on color, texture, and background cues” to identify what it sees.

In the images below, for example, AI mistakes the pictures on the left for a nail, likely because of the wooden backgrounds. In the images on the right, it fixates on the hummingbird feeder but misses the fact that no actual hummingbirds are present.

And in the four images of dragonflies below, AI homes in on colors and textures, seeing, from left to right, a skunk, a banana, a sea lion, and a mitten. In each case you can see why the mistake was made, but that doesn’t make the error any less obvious.

That AI systems make these sorts of mistakes is not news. Researchers have warned for years that vision systems created using deep learning (a flavor of machine learning that’s responsible for many of the recent advances in AI) are “shallow” and “brittle,” meaning they don’t understand the world with the same nuance and flexibility as a human.

These systems are trained on thousands of example images in order to learn what things look like, but we don’t often know which exact elements within pictures AI is using to make its judgments.
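One common way researchers probe this is with a saliency map: backpropagate the predicted class score to the input pixels and see which ones influence it most. The sketch below uses a stock ResNet-50 and a hypothetical image file; it is a generic introspection technique under those assumptions, not a method from the article or the paper.

```python
# Minimal sketch of a vanilla gradient saliency map, showing which pixels
# most affect a classifier's top prediction. Model and file are assumptions.
import torch
from PIL import Image
from torchvision import models, transforms

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1).eval()

image = preprocess(Image.open("dragonfly.jpg").convert("RGB")).unsqueeze(0)
image.requires_grad_(True)

# Backpropagate the top class score to the input pixels.
score = model(image).max()
score.backward()

# Per-pixel importance: largest absolute gradient across color channels.
saliency = image.grad.abs().max(dim=1).values.squeeze(0)
print(saliency.shape)  # a 224 x 224 heat map of pixel influence
```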

Some research suggests that rather than looking at images holistically, considering the overall shape and content, algorithms focus on specific textures and details. The findings from this dataset seem to support that interpretation, as when, for example, pictures that show clear shadows on a brightly lit surface are misidentified as sundials. AI is essentially missing the forest for the trees.

But does this mean these machine vision systems are irretrievably broken? Not at all. Often the mistakes being made are pretty trivial, like identifying a drain cover as a manhole or mistaking a van for a limousine.

And while the researchers say that these “natural adversarial examples” will fool a wide range of vision systems, that doesn’t mean they’ll fool them all. Many machine vision systems are incredibly specialized, like those used to identify diseases in medical scans, for example. And while these have their own shortcomings, their inability to understand the world as well as a human doesn’t stop them from spotting a cancerous tumor.

Machine vision may be quick and dirty sometimes, but it often gets results. Research like this shows us the blind spots we need to fill in next.