Virtual reality was hot about five years ago. The Oculus Rift, then the HTC Vive, made virtual reality close to plug-and-play on moderately powerful Windows personal computers. Further, Unity and the SteamVR platform introduced programmers to building experiences that put the user inside the simulated world rather than mediating a view through a flat screen. I thought this was the future.
However, virtual reality did not advance much after that initial burst. The screens got higher resolution, but the best experience remained the lighthouse-and-tether, outside-in approach. Lighthouses are boxes mounted in fixed positions around the room; their job is to help the central unit keep track of the relative position and orientation of the user.
The big question is: where is the user's head? The lighthouses sweep the room with infra-red signals that sensors on the head-mounted display (HMD) pick up. From the angle and timing of what it saw from each lighthouse, the HMD could work out where you were and transmit that back to the computer for decision-making.
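To make the geometry concrete, here is a minimal two-dimensional sketch in Python. The sweep rate, base-station positions, and single-sensor setup are simplifying assumptions of mine; a real Lighthouse rig tracks many sensors in three dimensions and fuses the result with an inertial sensor.

```python
import math

SWEEP_HZ = 60.0                      # assumed sweep rate; illustrative only
ROTATION_PERIOD = 1.0 / SWEEP_HZ

def sweep_angle(t_sync: float, t_hit: float) -> float:
    """Turn the delay between a base station's sync flash and the moment
    its sweep reaches a sensor into an angle from that base station."""
    return 2.0 * math.pi * (t_hit - t_sync) / ROTATION_PERIOD

def triangulate_2d(p1, theta1, p2, theta2):
    """Intersect two rays from known base-station positions p1 and p2
    (a 2-D slice of the real 3-D problem) to locate the sensor."""
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))
    det = d1[0] * -d2[1] - d1[1] * -d2[0]
    if abs(det) < 1e-9:
        raise ValueError("rays are parallel; no unique intersection")
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    a = (rx * -d2[1] - ry * -d2[0]) / det
    return (p1[0] + a * d1[0], p1[1] + a * d1[1])

# Example: base stations in opposite corners, each reporting an angle.
print(triangulate_2d((0.0, 0.0), math.radians(45),
                     (4.0, 0.0), math.radians(135)))  # -> (2.0, 2.0)
```

Two angles from two known positions pin down a point; do it tens of times a second for many sensors and you have a head pose.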
One could express intent through the handheld controllers. They similarly used patterns of light so the system could figure out where they were in space, and they reported grip, button, and potentially stick movements, detected through varieties of potentiometers, over a radio connection.
But all this intent and position data paled next to the need to deliver two high-resolution images to the user at a very high refresh rate. These images needed to reflect the user's location accurately - and immediately - or there would be unpleasant consequences: lag in the draw would create motion sickness, and low fidelity would strain one's eyes. And eyes were already a bit tired from mini-monitors shooting out light just a few inches from one's pupils.
For this last problem, the constraint became transmission bandwidth and lag. The solution was to make the connection back to the processor as high-bandwidth as possible: an HDMI cable sent raw image data from the computer to your headset.
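A back-of-envelope calculation shows why a cable was the pragmatic answer. The panel figures below are the first-generation Vive's (2160x1200 combined across both eyes, at 90 Hz); treating the stream as uncompressed 24-bit color and ignoring blanking and encoding overhead is a simplification:

```python
# Rough uncompressed bandwidth for a first-generation Vive-class display.
width, height = 2160, 1200   # combined resolution across both eyes
refresh_hz = 90              # frames per second the panels expect
bits_per_pixel = 24          # 8 bits each for red, green, blue

gbps = width * height * refresh_hz * bits_per_pixel / 1e9
print(f"{gbps:.1f} Gbit/s")  # ~5.6 Gbit/s, sustained, latency-sensitive
```

Several gigabits per second, delivered continuously with almost no tolerance for delay, was routine for a video cable and far beyond ordinary consumer radios of the day - which is why cutting the cord was hard.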
This lighthouse-and-tether model created the highest-fidelity experience. Efforts to remove the lighthouses introduced "inside-out" VR. Cameras on the head-mounted display attempt to identify reference points in the world around you; then, as you move the headset, they track how those points shift in position and orientation. Phone-based augmented reality uses a similar technique when anchoring a scene. In theory, these points remain constant, so any apparent change means your headset is moving. In practice, the inside-out system's ability to track those points was imperfect. The lag caused motion sickness, as the representation of the world was not exactly in tune with what one's inner ear said was reality. Inside-out VR meant less hardware and often a lower price, but a worse experience.
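The core loop is one that computer-vision libraries expose directly. Here is a minimal sketch in Python using OpenCV; the camera intrinsics are placeholder values of mine, and a real headset would fuse this with an inertial sensor inside a far more robust pipeline:

```python
import cv2
import numpy as np

# Assumed camera intrinsics (focal lengths, principal point); placeholder
# values for illustration - a real headset camera would be calibrated.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])

def estimate_motion(prev_gray, curr_gray):
    """Estimate camera rotation R and translation direction t between
    two grayscale frames by tracking fixed reference points."""
    # Pick stable corner features to serve as reference points.
    pts_prev = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                       qualityLevel=0.01, minDistance=8)
    # Find where those points appear in the next frame.
    pts_curr, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                                   pts_prev, None)
    ok = status.ravel() == 1
    good_prev, good_curr = pts_prev[ok], pts_curr[ok]
    # The points are fixed in the world, so their apparent motion
    # constrains the camera's motion via epipolar geometry.
    E, mask = cv2.findEssentialMat(good_prev, good_curr, K,
                                   method=cv2.RANSAC, threshold=1.0)
    # A single camera recovers translation only up to scale; real
    # systems resolve the scale with stereo cameras or an IMU.
    _, R, t, _ = cv2.recoverPose(E, good_prev, good_curr, K, mask=mask)
    return R, t
```

Every step here - feature detection, matching, pose recovery - can go subtly wrong under motion blur, bad lighting, or featureless walls, which is exactly where the early inside-out headsets struggled.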
Losing the tether meant using a high-bandwidth radio to transmit the data (e.g., using WiGig) or moving the computational hardware into the headset itself (e.g., the Oculus Quest). The former is a technical marvel. The latter brings down the overall quality because of the limited processing power that can sit comfortably in a head-mounted unit. Those powerful GPUs on the personal computer are harder to fit - and harder to cool - on the device.
It seems to me that one should lose the cable and keep the lighthouses. The combination would require more setup but would create the highest-quality experience for the user. However, I have not seen this in the market.
Of course, technology gets better, so I expect that both inside-out and tetherless technology will improve.
One constant between the modes, however, is that the user is functionally blind. With screens completely replacing your surroundings, you are at the mercy of whatever that HDMI cable sends your way. The only context it can share is what it knows about - the position of your head and of the devices in your hands. You cannot see your own hands, which makes complex manipulations impossible. You can usually see the controllers rendered into the field of view, which helps some but does not offer much precision.
As a consequence, one cannot type or use fine motor skills to communicate complex ideas. For me, the most exciting innovations on the software side were those that improved the precision and bandwidth of expressing that intent. Tilt Brush, an app released in 2016, was quite impressive here - one hand held an instrument that could create in the volume of space all around you, while the other mounted a plethora of tools, colors, and effects, as if a palette were attached to your arm. The interaction to make those selections was intuitive and rich - uncommon in VR. The ability to create whole scenes in a matter of minutes was extraordinary. Exploring high-dimensional data visualizations was also a treat, though the tooling was limited when I was working with it.
The VR manufacturers are already working on these issues. For example, there are experimental devices that identify finger motion in space. In the Vive ecosystem, some objects and attachments carry their own tracking hardware so the system can follow them just as it does the handheld controllers - a magic wand one could wave, say, or a baseball bat one could swing. Right now, these objects are large, inaccurate, or both. But technology marches forward, and innovation comes from the direction we are not looking. And with that blocking visor, we are not looking at much right now.
A second issue, after the clumsiness, is the inability to read. When I was working with the tech, one would not want to read a sheet of paper through it. Headlines for a menuing system, perhaps, but reading medium-form text was a trial. Again, that is fine for a video game, but it is a problem for generating creative and productive output.
VR illiteracy is challenging, and I do not know if linear development in screen and sensor technology will fix it. We may need more creativity.
Finally, VR is isolating. One can share a virtual universe of some kind with other people, but virtual reality is fundamentally one person blocked off from the world so that a machine can take over more of the senses. I see enormous potential for focusing productivity - data can fully surround one, and all expressed intent goes back into the system. I can fly among data points to drive insight. If I can read well enough to consume more data, and have a high-bandwidth mechanism to deploy more intent, that isolation becomes a strength.
It is also incredibly intense, to the point of exhausting one in a much shorter window than a more casual human-computer relationship, such as the laptop on which I type this. To wit: I can look away from this screen; to leave the VR context, I must exit it completely by removing the HMD.
Virtual reality has tremendous potential for facilitating insight and creation. The technology can power highly immersive and stimulating entertainment, to be sure. Still, the opportunity to let a professional express themselves to a machine to a degree not possible with a screen and keyboard is phenomenal. I am excited for what comes next in this field - in hardware, in software, and in the modalities of interaction that we have not thought of yet.
Photo by stephan sorkin on Unsplash