Dr Simon Taylor, Co-founder and Research Director, Zappar
Last week the AR world was surprised to learn of the $542 million investment into Magic Leap. I don’t know if anyone else had heard of them before but I certainly hadn’t, and it looks like their website came out of stealth mode just after the announcement. Precious little information is available publicly other than a couple of patent applications and a website full of dreams, but having skimmed those details it did get me thinking about the requirements for a wearable display to enable truly immersive AR applications.
What is an AR experience and how can it be delivered?
Google Glass has been widely described as an “Augmented Reality” display. To me this always felt like a stretch (although to be fair to them I don’t think it’s a term Google themselves use to describe Glass). There is no universally accepted definition of what AR actually is, but I like the criteria proposed by Ron Azuma, way back in 1997 (1). Azuma says that to be considered “Augmented Reality” an experience must have three characteristics:
- Combine real and virtual
- Be interactive in real time
- Be registered in 3D
I like that definition as it doesn’t mention anything about the hardware devices used to deliver the experience or the type of sensors or tracking employed to meet the “registered in 3D” requirement. When talking about AR experiences, registration refers to positioning the virtual content so that it appears anchored to a certain place or object in the real world. Apps that use the GPS and compass on a smartphone to overlay points of interest on a live camera feed would meet the criteria, as would a head-mounted display that uses cameras to visually track a target object and optics to additively combine virtual content with the direct view of the real world.
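To make the GPS-and-compass case concrete, here is a minimal Python sketch of that kind of registration: it works out where on the screen a point-of-interest label should be drawn so that it lines up with the real thing in the camera feed. The function names, the 60-degree horizontal field of view and the 1080-pixel screen width are all assumptions made for illustration, and the sketch ignores device pitch, roll and altitude, which a real app would have to handle.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360.0


def poi_screen_x(user_lat, user_lon, heading_deg, poi_lat, poi_lon,
                 hfov_deg=60.0, width_px=1080):
    """Horizontal pixel position for a POI label, or None if it is outside the camera view.

    heading_deg is the compass direction the camera is pointing (an assumption:
    the phone is held upright, so compass heading equals camera direction).
    """
    bearing = bearing_deg(user_lat, user_lon, poi_lat, poi_lon)
    # Signed angle from the camera direction to the POI, wrapped to [-180, 180).
    delta = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    if abs(delta) >= hfov_deg / 2.0:
        return None
    half_width = width_px / 2.0
    # Pinhole projection: the edge of the image corresponds to hfov_deg / 2.
    scale = math.tan(math.radians(delta)) / math.tan(math.radians(hfov_deg / 2.0))
    return half_width * (1.0 + scale)


# A POI due north of the user, with the camera also facing north,
# lands in the centre of the screen.
print(poi_screen_x(51.50, -0.12, 0.0, 51.51, -0.12))
```

As the user turns, the label slides across the screen and disappears once the point of interest leaves the camera's field of view, which is the behaviour that makes the content feel anchored to the world rather than to the screen.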
Google Glass is by design a head-up display. It is point 3 above where it falls down on Azuma’s definition of AR - it’s not a device designed to align virtual content with the real world, it is more about simply providing pop-up notifications in a particular small part of the user’s field of view. That’s why I’ve found it hard to get too excited by the potential of Glass for AR applications. The field of view covered by the display is simply too small for what most people imagine when they think of an immersive head-mounted AR experience. You can display the live camera feed on the screen on Glass along with some AR content “registered” to that camera image, but the overall impression is a disorienting picture-in-picture effect with a small and laggy view containing the camera feed and AR content. Immersive it is not.
Smartphones and tablets tick a lot of boxes for delivering AR experiences. They are convenient mobile packages containing a good screen, camera, network connectivity, and decent processing power. Hundreds of millions of us already have them, so they represent by far the best option for delivering AR experiences to consumers today. They are the devices that have allowed AR to break out of the lab and into the hands of consumers.
However there is a limitation to how immersive a smartphone AR experience can possibly be. The virtual content is displayed on top of live images from the device’s camera, meaning the view onto the world is from the point of view of the camera rather than the user’s eyes. It’s another case of the picture-in-picture effect I’ve talked about with Google Glass. I feel that is less of an issue with smartphones - it is for me easier to accept that there is this screen through which I must view the augmented world, and having the freedom to move that screen (and the viewpoint) independently of my head is a welcome feature. However the fact there is a device that must be the focus of my attention between the real world and my eyes does of course limit immersion.
The route to more immersive experiences is clear - we must remove the device between the world and our eyes. In the same way an Oculus VR headset provides a more immersive experience than playing a game on a monitor, so we would expect more immersive AR experiences to be delivered by wearable displays. Currently smartphones provide a small window into an augmented world. To step through the window and into the world ourselves will require removing the smartphone from the setup.
Existing AR headsets
The tiny field-of-view (FOV) covered by the display of Google Glass means that it is unable to deliver experiences that would meet Azuma’s AR criteria. However head-mounted displays with wider FOVs have been used to deliver AR experiences in research, industrial and defence applications for decades.
A major decision when producing a device for AR experiences is the mechanism to achieve the first aspect of Azuma’s definition - how to combine the real and virtual views of the world. There are two general approaches here: video see-through or optical see-through. Video see-through displays are essentially VR headsets combined with a camera for each eye to provide live video for the “real” view of the world. With optical see-through displays the real world is observed directly and the virtual view is combined with it optically. Imagine a semi-transparent mirror at a 45 degree angle to horizontal - you can see straight through it to the real world (somewhat darkened), but you can also see a reflection of whatever is above the mirror. If you place a tiny screen above the mirror in a darkened housing then whenever bright images are displayed on the screen they will be visible as a semi-transparent overlay on the real world.
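The semi-transparent mirror arrangement can be written as a one-line model, which makes its limitations easy to see. This is an idealised sketch in Python, not a description of any particular headset: it treats light as a single luminance value between 0 and 1, and the `transmittance` and `reflectance` values of 0.5 are assumptions standing in for a half-silvered mirror.

```python
def optical_see_through(real, display, transmittance=0.5, reflectance=0.5):
    """Luminance reaching the eye through an idealised semi-transparent mirror.

    real    -- luminance of the real scene behind the mirror (0.0 to 1.0)
    display -- luminance of the small screen reflected by the mirror (0.0 to 1.0)
    """
    return transmittance * real + reflectance * display


bright_scene = 0.8

# Screen off: the real world is still visible, but darkened by the mirror.
print(optical_see_through(bright_scene, 0.0))

# Screen fully on: the virtual image is added on top of the real scene,
# so the real world always shows through it.
print(optical_see_through(bright_scene, 1.0))
```

Because the display term can only add light, a black pixel on the screen is simply transparent, and no display value can hide a bright real-world object behind it.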
There are devices available commercially that take both the optical see-through and video see-through approaches. At Zappar we have done some work with Vuzix, a specialist in eyewear displays for industrial AR applications. They offer the Wrap 1200DXAR which uses a video see-through approach and the Star 1200XLD offering an optical see-through solution. These devices can provide real value in certain use cases but are not aimed at the consumer and are not yet at the level of delivering virtual content that is indistinguishable from the real world, which is the dream being sold by Magic Leap.
The ideal AR headset
Let’s assume we want to build a consumer-facing bit of hardware to be used as a head-mounted AR display. The first thing to consider is the use case: a lot of AR uses are about providing small bits of entertainment or information that are relevant to the current context of the user. That’s distinct from the use case for a VR headset for gaming. The Oculus value proposition works as the user has taken the decision to dedicate a block of time to immersing themselves in a game world, and so the few seconds it takes them to position the device on their head is worth it for the pay-off of increased immersion. At the other end of the “experience richness” spectrum are the notifications of Google Glass. The Glass value proposition only makes sense if the device is worn by the user at all times - no user would be prepared to hear a notification beep and then take Glass out of their pocket, place it on their head, respond to the notification, and put it away again.
AR experiences can cover the spectrum from simple pop-ups and labels of things in the world, to richer interactive experiences taking place in your environment. For our fantasy AR headset it makes sense to shoot for an always-worn bit of hardware in the Google Glass mould, so that the value equation for users works down to the simplest experiences and in-vision notifications. Our desire for a piece of hardware that is always worn, even when it is not being used, means the device must be almost imperceptible when inactive. The view of the real world must be as good as it is without the device being worn. That rules out the video see-through display idea (think of Oculus with a couple of cameras outside); no users would want to go through their everyday life wearing such a display. To me, the need for an unobstructed view of the world when the device is not in use strongly suggests an optical see-through display is the way to go.
Existing optical see-through displays will need a lot of improvement before they can provide AR content that sits seamlessly in the real world, with the virtual and real indistinguishable. Here’s a quick rundown of the key areas that need significant improvement, focusing just on the display hardware issues:
- Wider field of view and higher resolution for the virtual content. This site suggests 529 megapixels per eye over a 120-degree field of view would get pretty close to the limits of human eyes.
- Correct focus behaviour within the virtual scene. When the user is looking at a virtual object near to them, other virtual objects further away should appear out-of-focus. This does not happen on current VR or AR devices, which typically have a single effective focal distance for all their content.
- Controllable transparency across the field of view. The simple semi-transparent mirror set-up described above for optical see-through displays gives a darkened view of the real world and a floating, semi-transparent overlay of the virtual content. Neither of those properties is acceptable for our immersive AR display - the mixing of real and virtual views needs to be controllable (ideally at the same resolution as our virtual content) across the whole field of view from 100% real without any darkening to 100% virtual where the real world is completely occluded.
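The controllable transparency requirement in the last point can be stated as a per-pixel mix. The Python sketch below describes only the target behaviour, not how any hardware would achieve it. The `opacity` value per pixel and the function names are assumptions introduced for illustration, and light is again reduced to a single luminance value between 0 and 1.

```python
def mix_pixel(real, virtual, opacity):
    """Blend one pixel: opacity 0.0 is 100% real, opacity 1.0 is 100% virtual."""
    if not 0.0 <= opacity <= 1.0:
        raise ValueError("opacity must be between 0.0 and 1.0")
    return (1.0 - opacity) * real + opacity * virtual


def mix_row(real_row, virtual_row, opacity_row):
    """Apply the blend independently to every pixel in a row of the display."""
    return [mix_pixel(r, v, a) for r, v, a in zip(real_row, virtual_row, opacity_row)]


real_row = [0.8, 0.8, 0.8]      # a bright real-world scene
virtual_row = [0.2, 0.2, 0.2]   # a dark virtual object
opacity_row = [0.0, 0.5, 1.0]   # untouched, half-mixed, fully occluded

print(mix_row(real_row, virtual_row, opacity_row))
```

Two things separate this from the simple mirror set-up: at opacity 0.0 the real world passes through with no darkening at all, and at opacity 1.0 a dark virtual object completely hides a bright real one, which an additive overlay cannot do.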
There are real and serious challenges to delivering all three of those goals. Let’s just assume we’ve solved all of these issues, and move on to think about why the device might offer an interesting consumer proposition.
There was a lot of discussion around the Magic Leap announcement about why people should be interested in an AR headset and whether it would offer any real value to consumers. For me the interesting point is that just because our fantasy device is the perfect AR display doesn’t mean that AR is all that it can do. Our device has controllable transparency, so it’s also the perfect VR device. Just set the whole device to opaque and you can be completely immersed in a game world. Just like Oculus, but with higher resolution, and the ability to freely focus on different objects in the virtual world. And you’re wearing it already! Pretty cool, right?
How about at the other end of the spectrum? You’re wearing the device but it’s currently completely inactive and you’re fully present in the real world. Then a notification pops into a small portion of your vision, much like with Google Glass. However now there’s no need to get out another device to respond to it. “Oh it’s a call from Dad. Make it look like it’s on a 60-inch display 3 metres in front of me, and make it, say, 20% transparent so I don’t walk into anything”.
Neither of these examples is an AR experience by the definition we have been using - in the first case there is no combination of real and virtual, and in the second case the real and virtual objects are not registered in 3D. However they are both compelling use cases for the headset.
Of course it could also be used to deliver experiences that tick all of our criteria, with virtual objects correctly registered to the real world. In short the device can provide experiences from Google Glass to Oculus, and lots of things in between.
For me it is this flexibility in the device that makes it most interesting. Magic Leap’s CEO is keen to avoid the term Augmented Reality, and if the device I’ve outlined is similar to the one they envision then I would agree with him that the term is too limiting to encompass the full potential of the display. AR is one type of experience that it can deliver, but it’s that controllable transparency between real and virtual that makes it so much more. Perhaps a term like “Controllable Reality” would be a better description.
Will it happen?
It sounds to me like an exciting dream world full of possibilities. I share the feelings in Caspar’s blog post from last week about the social issues to overcome, and the importance of compelling content. For me the “Controllable Reality” aspect gives the device enough utility outside of AR-specific experiences to drive adoption of the device first, and then the adoption of such a device will in turn drive the creation of more interactive AR experiences.
Here at Zappar we focus on the software side of the AR story, and produce tools and experiences aimed at today’s smartphones. Much of our technology and experience would be directly applicable in this fantasy future of AR eyewear. We’ll certainly be watching developments in this area with interest, and continue working hard to make Zappar a key player in the software side of this emerging ecosystem.
I suppose the big question is how close our fantasy device is to becoming a reality. This Magic Leap patent discusses a way of approximating a display with the correct focus behaviour mentioned above, which seems plausible. Other multi-focal displays have been demonstrated already.
However I have yet to see a solution for the controllable transparency aspect that would give sharp transitions between real and virtual, work correctly in a combined virtual and real world where the user is able to freely adjust their focus, and keep the user’s view of the real world broadly unchanged in areas without virtual content.
Perhaps there’s a first step along the journey to our ideal device that would still deliver value to users. Truly believable and immersive AR is really hard in a lot of ways, not just in terms of the requirements it imposes on the display hardware that I’ve been discussing but for all sorts of software and usability reasons that I haven’t even mentioned. To me, “Controllable Reality” feels like the right place to start. I think there’s something in a device that would be always worn and could adapt from a small Google Glass style notification display to a fully immersive Oculus-style headset. Even with compromises on resolution, focus depths and sharpness of the transitions between real and virtual content, that sounds like a device I’d love to try out.
(1) Ronald Azuma. A survey of augmented reality. Presence, 6(4):355–385, 1997.