A few weeks ago, DJI unveiled its newest drone, the Phantom 4, the first craft to offer robust obstacle avoidance at a price the average consumer can afford. It relies on computer vision to power its autonomous flight, and since DJI had shown off this kind of tech before, we assumed that all the hardware on the Phantom 4 was homegrown, or backed by a giant like Intel. But today the chipmaker Movidius announced that its latest offering, the Myriad 2, is at the center of the onboard processing that powers the Phantom 4’s incredible new abilities.
As it turns out, this isn’t the first time Movidius has partnered with a big name to develop cutting-edge technology. Back in 2014 its first chip, the Myriad 1, was revealed as the brains inside Google’s first generation of Project Tango tablets. After a decade toiling in relative obscurity, the small 125-person company is suddenly poised to emerge as a leader at the intersection of several major markets, from drones to phones to virtual reality, all of which are looking for ways to enable cheap, power-efficient computer vision.
At the forefront of a new computing paradigm
"The company was founded in late 2005, so we’ve had a long gestation," says CEO Remi El-Ouazzane with a laugh. In its early years it found some business converting old movies into 3D, helping to shore up content offerings for a 3D TV market that never took off. In 2010 its chips were put to use as an engine for 3D rendering, but it was competing with plenty of established chipmakers in that market. It wasn’t until 2013, and its partnership with Tango, that the company realized how widespread the applications of computer vision could be, and focused on optimizing for what it believed would be the next wave of devices.
"A different compute paradigm requires a new chip architecture. It’s very similar to the birth of the GPU 20 years ago when 3D graphics was the new paradigm requiring a novel approach," says El-Ouazzane. The company is now focused on developing its chip as a successor to the CPU and the GPU, something it has dubbed the VPU, a vision processing unit optimized specifically for computer vision tasks. It designs the chips in-house, and has them produced by TSMC, the Taiwanese giant which makes chips for Qualcomm, Nvidia, and Apple.
After drones comes virtual reality
Along with its chip, Movidius also designed a set of complementary algorithms and an SDK. It gave DJI access to this suite of hardware and software to help it perfect the computer vision onboard the Phantom 4. The custom algorithms ensure the VPU can handle a number of different tasks: processing depth from the stereo cameras, proximity from the sonar sensors, spatial orientation, and an advanced version of optical flow for recognizing and following human subjects as part of DJI’s ActiveTrack feature. "There are many [chip] companies now who can do one or two pieces of that," claims El-Ouazzane. "But we are the only company who can aggregate them all at the right level of power, performance, and price."
As we wrote recently, technology like this may soon be at the heart of virtual and augmented reality, allowing mobile phones and relatively cheap headsets to accomplish what today requires an expensive and stationary system. The HTC Vive, for example, relies on a wired headset and two laser light boxes to map out the space around a user and track their movements. But Tango allows for much of that with just a mobile device and a headset.
Samsung may be busy building headphones that can trick your body into believing it’s in motion, but it would probably prefer that its Gear VR headset rely on a far less experimental approach. "It’s all about presence, the ability to immerse yourself into the scene," says El-Ouazzane. "Inside-out tracking, gesture tracking, gaze tracking, depth sensing, there are all these applications where our technology is suddenly very relevant."
Teaching mobile devices to see, and soon, how to learn
Beyond computer vision, Movidius is eager to bring its chips to bear on another very hot field in modern computing: artificial intelligence. Earlier this year it announced a partnership with Google to bring deep learning to smartphones, enabling powerful image recognition algorithms to run locally on the device. That is a big part of what it’s doing for DJI’s new drone, allowing the device to see and understand the world around it while avoiding the latency that comes with having to query the cloud.
For now the onboard intelligence is limited to pre-programmed skills. "Most of the features in Phantom 4 are hand-crafted descriptors and classifiers which allow it to do things like subject tracking," says El-Ouazzane. But he believes that embedded neural networks will soon go beyond that. "Moving forward, you need to get to a situation where the drone or the phone, through neural networks, will start to learn on its own." A drone that can identify and track humans is neat, sure. But what about one that can recognize its owner at a glance?