I am in a tiny room in a basement somewhere in Microsoft’s Redmond, Washington headquarters, wearing an early version of the HoloLens 2 headset. In front of me is a very real ATV, which is missing a bolt. Not quite at the corner of my vision — but certainly off to the side — I see a glowing indicator pointing to a bucket of the right bolts. I walk over, bend down to look more closely at the shelf, and grab one.
Back at the ATV, a holographic set of instructions hovers above it, telling me what to do and pointing to the exact spot where the bolt needs to go. After a couple of minutes, I’ve successfully fixed the thing — guided by holograms. I tap a holographic button on the guide to close out the instructions.
This sort of demo is quickly becoming commonplace for tech journalists like myself. But if you read the previous description closely, you’ll find that there are three key pieces of technical innovation hidden in plain sight.
Here they are: I saw a hologram off to the side because the field of view in which holograms can appear is much larger than before. I bent down and didn’t worry about an awkward headset shifting around because it was better balanced on my head. I pushed a button just by pushing a button because I didn’t need to learn a complicated gesture to operate the HoloLens 2.
Those three things might not seem all that remarkable to you, but that’s precisely the point. Microsoft needed to make the HoloLens feel much more natural if it really plans to get people to use it, and it has.
There’s one more unremarkably remarkable thing: even though it was just a demo, I was playing the part of a worker because that’s who the HoloLens 2 is exclusively designed for — workers, not consumers.
The Microsoft HoloLens 2 is available for preorder today for $3,500, and it’s expected to ship later this year. However, Microsoft has decided that it is only going to sell to enterprise customers who want to deploy the headset to their workers. As of right now, Microsoft isn’t even announcing a developer kit version of the HoloLens 2.
Compared to the HoloLens we first saw demonstrated four years ago, the second version is better in nearly every important way. It’s more comfortable, it has a much larger field of view, and it’s better able to detect real physical objects in the room. It features new components like the Azure Kinect sensor, an ARM processor, eye-tracking sensors, and an entirely different display system.
It has a couple of speakers, the visor flips up, and it can see what your hands are doing more accurately than before. There’s an 8-megapixel front-facing camera for video conferencing, it’s capable of full six-degrees-of-freedom tracking, and it also uses USB-C to charge. It is, in short, chock-full of new technology. But after four years, that should be no surprise.
The biggest complaint about the first HoloLens was simple: you only saw the holograms in a relatively small box, directly in front of you. Turn your head even a little, and they would disappear from your field of view. Worse, their edges would clip out of existence even when you were staring right at them. It was like looking at a digital world through a tiny rectangle.
The HoloLens 2 has a field of view that’s twice as big as before. It doesn’t quite fill your entire field of vision — there’s still clipping — but it’s big enough now that you no longer feel constantly annoyed by a letterbox. Microsoft says that each eye has the equivalent of a 2K display in front of it, but it’s better to think of that as a metaphor than a precise spec. The exact spec is that it has a “holographic density of 47 pixels per degree,” which means that the pixel density is high enough to allow you to read 8-point font.
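To put the 47-pixels-per-degree figure in perspective, here is a rough back-of-the-envelope sketch of how many pixels an 8-point character would span. The 0.5-meter viewing distance and the function name are assumptions made for illustration, not figures from Microsoft.

```python
import math

PIXELS_PER_DEGREE = 47   # "holographic density" figure quoted by Microsoft
POINTS_PER_INCH = 72     # typographic points per inch
MM_PER_INCH = 25.4

def pixels_for_text(point_size, distance_mm, ppd=PIXELS_PER_DEGREE):
    """Approximate vertical pixels spanned by text of a given point size."""
    height_mm = point_size / POINTS_PER_INCH * MM_PER_INCH
    # Visual angle subtended by the text height at the given distance.
    angle_deg = math.degrees(2 * math.atan(height_mm / (2 * distance_mm)))
    return angle_deg * ppd

# Assumed viewing distance of 500 mm (hypothetical; the article gives none).
print(round(pixels_for_text(8, 500), 1))
```

Under that assumption, an 8-point character covers roughly 15 pixels of height, which is in the range where small text stays legible.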
Typically, when a tech product gets better specs like these, it happens through sheer force of technical iteration: faster processors, bigger batteries, more RAM, and so on. But that strategy wouldn’t have worked for the display on the HoloLens 2. It needed to get lighter, not heavier. So Microsoft had to completely change over to a different kind of display technology.
Lasers and mirrors
Laser-based displays have become the thing to do for computers on your face. Intel’s Vaunt project used lasers, and the North Focals smart glasses do, too. Although Microsoft is using some of the same basic components, it’s taken them in a different direction and gone much further in developing what they can do.
The lasers in the HoloLens 2 shine into a set of mirrors that oscillate as quickly as 54,000 cycles per second so the reflected light can paint a display. Those two pieces together form the basis of a microelectromechanical system (MEMS) display. That’s all tricky to make, but the really tricky part for a MEMS display is getting the image that it paints into your eyeball.
One solution that companies like North have used is a holographic film on the lens to reflect the image directly into your retina. That has lots of drawbacks: a tiny display and low resolution, for two. But the truly problematic part is simply ensuring the display is aimed right into your eye. You have to be custom-fitted for the North glasses, and the image can disappear entirely if they’re misaligned.
Microsoft doesn’t want any of those problems, so it turned to the same thing it used on the first HoloLens: waveguides. They’re the pieces of glass in front of your eye that are carefully etched so they can reflect the holograms in front of your eyes. The waveguides on the HoloLens 2 are lighter now because Microsoft is using two sandwiched glass plates instead of three.
When you put the whole system together (the lasers, the mirrors, and the waveguide), you can get a brighter display with a wider field of view that doesn’t have to be precisely aimed into your eyes to work. Zulfi Alam, general manager for Optics Engineering at Microsoft, contends that Microsoft is way out ahead with this system and that waveguides are definitely the way to go for mixed reality. “There’s no competition for the next two or three years that can come close [to] this level of fidelity in the waveguides,” he argues.
Do you want a wider field of view? Simple. Just increase the angle of the mirrors that reflect the laser light. A wider angle means a bigger image.
Do you want brighter images? Simple again. Lasers, not to put too fine a point on it, have light to spare. Of course, you have to deal with the fact that waveguides lose a ton of light, but the displays I saw were set to 500 nits and looked plenty bright to me. Microsoft thinks it could go much brighter in the final version, depending on the power draw.
Do you want to see the holograms without getting specifically fitted for your headset? Simple yet again. The waveguide doesn’t require specific fitting or measurement. You can just put the headset on and get going. It also can sit far enough in front of your eyes to allow you to wear whatever glasses you need comfortably.
Simple, simple, simple, right? In truth, it’s devilishly complex. Microsoft had to create an entirely new etching system for the waveguides. It had to figure out how to direct light to the right place in the waveguides nearly photon by photon. “We are simulating every photon that comes from the laser,” Alam says. The light from the lasers isn’t just reflected; it’s split apart into multiple colors and through multiple “pupils” in the display system and then “reconstituted” into the right spot on the waveguides. “Each photon is calculated where it’s expected to go,” Alam says. That takes a ton of computing power, so Microsoft had to develop custom silicon to do all of the calculations on where the photons would go.
And though alignment is much easier with the waveguide, that doesn’t mean it’s perfect. That’s why there are two tiny cameras on the nose bridge, directed at your eyeballs. They will allow the HoloLens 2 to automatically measure the distance between your pupils and adjust the image accordingly. Those cameras will also allow the HoloLens 2 to vertically adjust the image if it gets tilted or if your eyes are not perfectly even. (They are not. Sorry.)
A sort of free benefit of those cameras is that they can also scan your irises to log you into the HoloLens 2 securely. It runs Windows, after all, and therefore it supports Windows Hello. They also track where you’re looking, which enables some new user interactions I’ll get to below.
Then there’s power: lasers, oscillating mirrors, and custom chips to handle the computing for all of that must chew through battery. But Alam tells me that even with all of that, it still manages to require less power than the alternative. The mirrors oscillate in resonance, so it takes less energy to move them, sort of like they’re the fastest metronomes ever. Lasers are also less lossy than LEDs, and custom silicon can be optimized to its specific task.
“Our evolution is toward a form factor that is truly glasses,” Alam says, “and all these are significant steps in this journey.”
All that tech is impressive for sure, but I don’t want to oversell the image quality. What I was using wasn’t a finished product. I did see a tiny halo around some of the holograms, and they sometimes jumped around a bit. Most of the features based on the nose bridge eye scanners weren’t flipped on yet, either. Still, compared to the first HoloLens, what I saw crossed over the line from “cool demo I’d use for 20 minutes and then be annoyed” to “I could see people using this for a few hours if the software was really useful.”
But if you’re going to use a headset for “a few hours,” it needs to be comfortable enough to leave on in the first place.
Here’s how you put the HoloLens 2 on: you put it on like a baseball cap, twist a knob on the back to tighten the headband, and then you’ll start seeing holograms. The end.
It’s much less fiddly than the last HoloLens or any other face-mounted display I’ve ever tried. Because of all the work on the display system, you can skip the extra “fuss with the position to make sure you can see the image” step. The body of the thing is simpler, too. It’s a single band that’s held on with minimal pressure on the back of your head and on your forehead. (There’s an optional top strap if you need it.)
All of that is nice, but it’s pointless if the headset is uncomfortable to wear. And though I never had it on for more than a 20-minute stint, I think it will hold up for longer periods.
Microsoft has a “human factors” lab where it loves to show off its collection of dummy human heads and high-speed cameras. Carl Ledbetter, senior director of design for the Microsoft Device Design Team, walked me through all of the prototypes and material Microsoft tried to get into the final product. He explained how Microsoft experimented with different designs and materials, ultimately landing on carbon fiber to save weight.
“The reality is [we have to] fit kids, adults, men, women, and different ethnicities around the world. Everybody’s head is different,” he says. Microsoft has a database of around 600 heads tracking the shape of the cranium, eye depth, the size and relative position of the nose bridge, and other variations. Ledbetter’s team attached sensors to people’s necks to measure muscle strain, to make sure the center of gravity was right.
The result is that the HoloLens 2 has a more forgiving and flexible fit. It simply does a better job of accommodating basic, physical human realities. You can flip the visor up and out of your field of view so you can make eye contact without removing the headset. The memory foam pad that rests on your forehead is removable and cleanable, and the thermals have been completely redesigned so heat is piped away from your head.
All of that really helps, but the most important thing Microsoft did was move the center of gravity right behind your ears instead of up by your eyes. The HoloLens 2 isn’t really much lighter than the original HoloLens. It feels lighter, though, because it’s balanced more naturally on your head. That balance makes a huge difference. The weight of it is less noticeable and should put less strain on your neck.
Ledbetter moved the weight by literally moving the heaviest part: the main processor and battery are now located in a module that sits on the back of the headset, with wires inside the headband running up to the display board and components in the front. That processor, by the way, is an ARM-based Qualcomm Snapdragon 850, and that’s important because it addresses another basic human reality: we hate when the battery dies, and we hate plugging stuff in. An ARM processor means it can have a smaller battery.
The original HoloLens ran on an Intel processor, and it ran Windows. Since then, Microsoft has done a ton of work to get Windows working well on ARM. Those efforts are slowly coming to fruition on laptops, but Intel is still the order of the day on those machines where raw speed is usually more important to users than battery life. In general, there’s a tension with Intel. It’s not delivering the lower-power chips that mobile devices demand. Intel even reportedly had to lobby Microsoft to keep the Surface Go on its chips.
So what about the HoloLens 2? Alex Kipman is the person in charge of the whole HoloLens project. He says that “ARM rules in battery-operated devices. The ARM decision became fairly easy. If you’re going to be on battery, [it’s] hard to find a product that’s not running ARM today.”
When I point out that there are plenty of Windows laptops running on batteries using Intel chips, he becomes blunter. “Intel doesn’t even have an SoC [system on chip] right now for these types of products that run on battery. They did have one, the previous version [of the HoloLens] had Cherry Trail, which they discontinued. That decision is a no-brainer.”
For workers, not consumers
The HoloLens 2 is only being sold to corporations, not to consumers. It’s designed for what Kipman calls “first-line workers,” people in auto shops, factory floors, operating rooms, and out in the field fixing stuff. It’s designed for people who work with their hands and find it difficult to integrate a computer or smartphone into their daily work. Kipman wants to replace the grease-stained Windows 2000 computer sitting in the corner of the workroom. It’s pretty much the same decision Google made for Google Glass.
“If you think about 7 billion people in the world, people like you and I, knowledge workers, are by far the minority,” Kipman says. To him, the workers who will use this are “maybe people that are fixing our jet propulsion engine. Maybe they are the people that are in some retail space. Maybe they’re the doctors that are operating on you in an operating room.”
He continues, saying it’s for “people that have been, in a sense, neglected or haven’t had access to technology [in their hands-on jobs] because PCs, tablets, phones don’t really lend themselves to those experiences.”
Fair enough. That’s completely in keeping with Microsoft’s new focus on serving corporate and enterprise needs instead of trying to crank out hit consumer products. That was one of my takeaways when I interviewed CEO Satya Nadella last year, and it holds true today. As I wrote then, it’s “a different kind of Microsoft than what we’re used to thinking of. It’s a little less flashy, yes, but it has the benefit of being a lot more likely to succeed.”
Besides, Kipman argues, even the HoloLens 2 isn’t good enough to be a real mass-market consumer technology product. “This is the best, highest watermark of what can be achieved in mixed reality and I’m here to tell you that it’s still not a consumer product,” he says, then continues:
“Why is it not a consumer product? It’s not as immersive as you want it to be. It’s more than twice as immersive as the previous one, [but it’s] still not immersive enough for that consumer off the street to go use it. It’s still not comfortable enough … I would say that until these things are way more immersive than the most immersive product, way more comfortable than the most comfortable product, and at or under $1,000, I think people are kidding themselves in thinking that these products are ready.”
Kipman says that Microsoft has not participated in the consumer hype cycle for these types of products. “We were not the company that hyped VR. We are certainly not the company that hyped AR. And since we merged the two into the mixed reality and AI efforts, we haven’t hyped either.”
That’s not exactly true. We have seen plenty of demos from Microsoft showing off games — including Minecraft — and other consumer applications for the HoloLens. So this move to the enterprise market is absolutely a pivot.
But it’s a pivot that’s part and parcel of Microsoft’s larger corporate strategy. And just because it’s no longer being positioned as a consumer product doesn’t mean that it’s not an important product, one that Microsoft appears to be committed to and is developing software for.
A better interface on your face
The first HoloLens required users to learn awkward gestures with names like “Air Tap” and “Bloom.” You had to make these really specific hand gestures because that’s all the first HoloLens’ sensors could detect and understand.
The HoloLens 2 can detect and understand much more because of a new array of sensors for reading the room called the Azure Kinect. “Kinect” because that’s the brand for Microsoft’s cameras that can scan rooms, “Azure” because seemingly everything the company does these days is somehow connected to its cloud service and as a further signal that this is a business product, not an Xbox add-on.
“HoloLens 1 is just one big mesh. It’s like dropping a blanket over the real world,” Kipman says. “With HoloLens 2, we go from spatial mapping to semantic understanding of spaces. You understand what’s a couch, what is a human sitting on the couch, what’s the difference between a window and a wall.”
I can’t speak to how well Kinect is actually able to identify objects — Microsoft didn’t demo any of that for us — but it theoretically works because the Azure Kinect sees the room at a higher resolution and because it is hooked up to cloud services that help it figure out what things are.
There’s one aspect where I can definitively say that the higher fidelity is real: it’s able to identify my hand and what it’s doing much more easily. It can track up to 25 points of articulation on both hands in space, which means that you shouldn’t need to use the Air Tap gesture to interact with holograms anymore.
In one demo, I paced around a room looking at various holograms that were set up on tables. As I reached my hands in, a box appeared around each one with little grab handles on the edges and corners. I could just reach in and grab the whole box and move the hologram around. I could also just grab one edge to rotate it, or two to resize it. When there was a button, I could stick my finger out and push it. I doubt that it’s accurate enough to, say, let you type on a virtual QWERTY keyboard, but it’s a big step up over the first generation, nonetheless.
Eye tracking also comes into play in how you interact with holograms. The HoloLens 2 can detect where you’re looking and use that information as a kind of user interface. There were demos where I just stared at a little bubble to make it pop into holographic fireworks, but the most useful one was an auto-scroller. The closer to the bottom of the page I got, the faster the words scrolled, but then it stopped when I looked back up.
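The auto-scroller behavior described above can be sketched in a few lines. This is only an illustration of the idea, not Microsoft’s implementation: the function name, the normalized gaze coordinate, the linear ramp, and the default values are all assumptions.

```python
def scroll_speed(gaze_y, max_speed=400.0, start=0.5):
    """Map vertical gaze position to a scroll speed in pixels per second.

    gaze_y is normalized: 0.0 is the top of the page view, 1.0 the bottom.
    At or above `start`, scrolling stops; below it, speed ramps up linearly.
    """
    gaze_y = min(max(gaze_y, 0.0), 1.0)  # clamp noisy gaze samples
    if gaze_y <= start:
        return 0.0
    return max_speed * (gaze_y - start) / (1.0 - start)

# Looking near the top stops the scroll; looking lower speeds it up.
for y in (0.2, 0.6, 0.8, 1.0):
    print(y, scroll_speed(y))
```

A real system would presumably also smooth the gaze signal over time so that a quick glance downward doesn’t make the page lurch.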
I didn’t see the full top-level user interface, so I don’t know if that’s changing. But one thing absolutely isn’t: it still runs Windows. It utilizes the shared code in Windows OneCore, which means you won’t get a traditional Windows desktop shell, but you will be able to run any Universal Windows App on it. It also has the necessary drivers to let you connect a keyboard and a mouse to it over Bluetooth if you really want to.
Chaitanya Sareen, the principal group program manager for Microsoft Mixed Reality, explains that they’re trying to “make the machine work around the person versus the other way around.” Sareen calls this “instinctual interaction” as opposed to “intuitive,” since it can piggyback off of what we already do with real objects in the world. “Is anyone born saying ‘There’s going to be a close button [in the upper corner of a window]’? No,” he says. “A lot of interfaces we use are learned.”
Sareen is still thinking through some of the details of what the user interface will be, but the goal is to use many of the natural gestures you picked up as a toddler instead of making you learn a whole new interface language.
Microsoft is also making new software tools available to developers. One of the most important, Dynamics 365 Guides, will be a mixed reality app with templates to create instructions for repairing real-world things like that ATV. Other tools depend on Microsoft’s cloud services. One is Azure Remote Rendering, which lets the HoloLens offload some compute load to the cloud. It exists because the HoloLens 2 can only store and render a limited amount of detail for something like a 3D render of an engine locally. With Remote Rendering, some of the detail can come in real time from the cloud, so it displays potentially infinite levels of detail, allowing you to model and interact with the smallest parts of a holographic machine.
Finally, there’s Azure Spatial Anchors. It lets you pin holograms to real places in the world. At a basic level, it’s not all that different from what Apple and Google are already doing in augmented reality: letting multiple devices see and interact with the same virtual object. Microsoft’s ambitions are much grander, though: it wants to create the infrastructure for a “world scale” set of holograms, and it’s building tools that let developers use that infrastructure across platforms, including iOS and Android.
Solving that requires more than just GPS location and object recognition. Kipman talks a lot about distinguishing between identically boring conference rooms that are in the same spot on different floors. Tracking objects in space using optics is famously difficult. Walk in a circle around a building, and your position will drift, so the computer won’t put your ending point at the starting point. It’s a little fuzzy how far along Microsoft has actually gotten toward solving these problems, but it’s actively working on them.
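The drift problem is easy to reproduce in a toy simulation. The sketch below dead-reckons a closed loop, first with a perfect heading estimate and then with a tiny constant error added at every step. The step count, the bias value, and the function name are invented for illustration and say nothing about how the HoloLens actually tracks position.

```python
import math

def walk_loop(steps=360, step_len=1.0, heading_bias_deg=0.0):
    """Dead-reckon a closed loop; return the (x, y) where the tracker thinks it ended."""
    x = y = 0.0
    heading = 0.0
    turn = 2 * math.pi / steps             # true turn per step for a closed loop
    bias = math.radians(heading_bias_deg)  # small per-step heading error
    for _ in range(steps):
        x += step_len * math.cos(heading)
        y += step_len * math.sin(heading)
        heading += turn + bias
    return x, y

ideal = walk_loop()
drifted = walk_loop(heading_bias_deg=0.01)
# Distance between the estimated end point and the true start point.
print(math.hypot(*ideal), math.hypot(*drifted))
```

With a perfect heading, the estimated path closes on itself. With even a hundredth of a degree of error per step, the estimated end point lands several step lengths away from the start, which is why systems like this need recognizable anchors in the world to correct themselves.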
Alex Kipman believes we are on the precipice of the “third era of computing.” First came PCs with their open architectures, second came phones with walled garden app stores, and now he hopes mixed reality headsets will swing the pendulum back to openness because Microsoft intends to keep the HoloLens open. The HoloLens works with Microsoft’s cloud services, but it would work with other ecosystems, too. Kipman says the HoloLens and Azure are “loosely coupled, but tightly aligned.”
I could do more than quibble with his summary of the history of computing and point out that there’s also quite a history of underdogs calling for openness, but the larger point stands: Microsoft thinks that mixed reality is going to be a Big Deal.
Understanding Microsoft’s plans lately has required wading through a lot more jargon than it used to. With the HoloLens 2 specifically, expect a lot of discussion about “time-to-value” (how quickly a user can do something useful after getting a device from an employer) and “intelligent edge” (devices with their own computing power nevertheless connected to the cloud).
There’s a cognitive dissonance for regular consumers with all of that talk. Kipman’s protestations to the contrary, there is plenty of hype around the HoloLens 2. It’s just directed at corporations now. Some of it is well-deserved. I think that the HoloLens 2 is a technical marvel. Just because it isn’t being sold as a consumer device doesn’t mean that it’s not also an important piece of technology, something that could change our conception of what a computer should look like.
But we’re used to consumer electronics companies doing their best to put such technical marvels on store shelves, translating that hype into gadgets in our pockets and on our heads.
For the HoloLens 2, the hype isn’t about personal technology. It’s just business.