Google hired professional photographers to help train its AI camera

Photo by James Bareham / The Verge

How did Google get Clips, its AI-powered camera, to learn to automatically take the best shots of users and their families? Well, as the company explains in a new blog post, its engineers went to the professionals — hiring “a documentary filmmaker, a photojournalist, and a fine arts photographer” to produce visual data to train the neural network powering the camera.

The blog post explains this process in a little more detail, but it’s basically what you’d expect for this sort of AI. In order for the software to recognize what makes a good or a bad photo, it had to be fed lots of examples. The programmers thought about not only obvious markers (e.g., it’s a bad photo if there is blurring or if something’s covering the lens) but also more abstract criteria, such as “time,” training Clips with the rule, “Don’t go too long without capturing something.”

Two examples of bad snaps that were used to train Google’s Clips.
Image: Google

In teaching Clips how to recognize good photos and making the user interface as intuitive as possible, Google said it was practicing what it’s calling “human-centered design” — that is, trying to make AI products that work for users without creating extra stress. The Clips camera isn’t actually on general sale yet, but we look forward to testing out the device to see if it lives up to these ambitious goals.

What’s also notable, though, is that Google admits in the blog post that training AI programs like these can be an imprecise process, and that no matter how much data you give a device like Clips, it’s never going to know exactly what photos you value the most. It may be able to recognize a well-framed, in-focus, brightly lit image, but how will it know that the blurry shot of your son riding his bike without stabilizers for the first time is also priceless?

“In the context of subjectivity and personalization, perfection simply isn’t possible, and it really shouldn’t even be a goal,” write the blog post’s authors. “Unlike traditional software development, ML systems will never be ‘bug-free’ because prediction is an innately fuzzy science.”

Comments

The whole idea of Clips is just really weird to me. Do I expect all the really interesting things in my life to happen in just one place? Am I supposed to buy a load of these things and stow them about the house? Or am I supposed to have just one and move it around (in which case, why not just take a picture with my phone, with probably better results)? And that price is just too high for something without an obvious use for me. If it were more like a GoPro linked in with Google Photos to automatically select images from the movies I make (like their Storyboard app), then I could understand. But, as it is, I can’t see this ever becoming much of a ‘must buy’ accessory for your photographic life.

Keep one on you. Literally, like a GoPro, or just set it up whenever you think you’d want photos. Then actually participate in the activity that you’d otherwise be photographing.

Obviously, if you’re the kind of person who would have enjoyed the photographing part, this isn’t for you.

Clips seems like a solution looking for a problem. We have more ways than ever to capture life’s every moment; I don’t see a need for this device.

The price to me is the biggest problem. I understand the tech and what the value is, but the price is a barrier for me.

Agreed, if it was like 100-150 I’d actually consider it, but it’d also need to function as a regular camera/action camera too

Disagree.

Not sure if you have kids, but I do and the best moments to capture are the candid ones. It’s impossible to capture candid moments if you have to take a phone out and snap.

This product makes it like you have a professional photographer just out of sight snapping the best candid moments of your family.

I would get one, but not at $249. If they get it down to $129 I’d probably get one.

I think parents capture more than enough candid moments. Of course some can’t help themselves and want every blinking moment of cuteness & bonding recorded. To each their own.

I agree, but when you’re capturing the candid moments, you’re not in them. One of the draws here is that it captures your entire family.

Let’s see. I like that Google is trying a whole new type of product instead of just trying to improve existing tech.

I don’t want this… I could see maybe wanting this if I had kids. I wouldn’t turn it down if I got one for free but otherwise I don’t personally have a use for this.

This is such a "weird Sony" type of product. It may seem like a silly hobby but I’m sure Google will find useful ways to incorporate the real life learnings from Clips. I don’t want or need one, but am excited at how it can advance mobile photography in general.

I’d love to give it a try and see how well it works. I do wish it were a bit cheaper for taking a flyer.

In the larger context of Google being an "AI first" company, Clips looks like 50% Google trying to build off-line AI assistants and 50% getting people used to off-line AI assistants with a camera.

I 100% expect Clips to be a one-off product (unless it becomes insanely popular), but elements of it will be built into the next Pixel phone or one of their Google Homes with a screen.

Camera glasses seem like a product that tech companies will just keep trying to get to work no matter what, so I won’t be surprised to see these built into a Google Glass type product at some point either.

I guess some sort of robot or drone is probably going to happen, too, at some point.

How much does it cost?

With two kids, we would really appreciate having such a device. While we have tons of pictures from the first years of our son, we are less enthusiastic (or have less time) to take pictures with two kids. I can really imagine having a party and just putting the camera somewhere, then moving it to a different place after two hours.

Clips is what informs apps like Storyboard, or Google Photos when it tries to automatically find ‘scenes’.
This is very much a product intended to help inform their ecosystem first, and possibly satisfy the customer secondarily.

Seems like an obscure Google X project that somehow found daylight.
I could see this clipped to drones, but setting it up on a stationary object is just going to give you the same picture over and over again.
