Virtual Reality - Textbook PDF


VIRTUAL REALITY
Steven M. LaValle
University of Oulu

Copyright Steven M. LaValle 2020
Available for downloading at http://lavalle.pl/vr/

Contents

Preface
1 Introduction
  1.1 What Is Virtual Reality?
  1.2 Modern VR Experiences
  1.3 History Repeats
2 Bird’s-Eye View
  2.1 Hardware
  2.2 Software
  2.3 Human Physiology and Perception
3 The Geometry of Virtual Worlds
  3.1 Geometric Models
  3.2 Changing Position and Orientation
  3.3 Axis-Angle Representations of Rotation
  3.4 Viewing Transformations
  3.5 Chaining the Transformations
4 Light and Optics
  4.1 Basic Behavior of Light
  4.2 Lenses
  4.3 Optical Aberrations
  4.4 The Human Eye
  4.5 Cameras
  4.6 Displays
5 The Physiology of Human Vision
  5.1 From the Cornea to Photoreceptors
  5.2 From Photoreceptors to the Visual Cortex
  5.3 Eye Movements
  5.4 Implications for VR
6 Visual Perception
  6.1 Perception of Depth
  6.2 Perception of Motion
  6.3 Perception of Color
  6.4 Combining Sources of Information
7 Visual Rendering
  7.1 Ray Tracing and Shading Models
  7.2 Rasterization
  7.3 Correcting Optical Distortions
  7.4 Improving Latency and Frame Rates
  7.5 Immersive Photos and Videos
8 Motion in Real and Virtual Worlds
  8.1 Velocities and Accelerations
  8.2 The Vestibular System
  8.3 Physics in the Virtual World
  8.4 Mismatched Motion and Vection
9 Tracking
  9.1 Tracking 2D Orientation
  9.2 Tracking 3D Orientation
  9.3 Tracking Position and Orientation
  9.4 Tracking Attached Bodies
  9.5 3D Scanning of Environments
10 Interaction
  10.1 Motor Programs and Remapping
  10.2 Locomotion
  10.3 Manipulation
  10.4 Social Interaction
  10.5 Additional Interaction Mechanisms
11 Audio
  11.1 The Physics of Sound
  11.2 The Physiology of Human Hearing
  11.3 Auditory Perception
  11.4 Auditory Rendering
12 Evaluating VR Systems and Experiences
  12.1 Perceptual Training
  12.2 Recommendations for Developers
  12.3 Comfort and VR Sickness
  12.4 Experiments on Human Subjects
13 Frontiers
  13.1 Touch and Proprioception
  13.2 Smell and Taste
  13.3 Robotic Interfaces
  13.4 Brain-Machine Interfaces

Preface

The Rebirth of Virtual Reality

Virtual reality (VR) is a powerful technology that promises to change our lives unlike any other. By artificially stimulating our senses, our bodies become tricked into accepting another version of reality.
VR is like a waking dream that could take place in a magical cartoon-like world, or could transport us to another part of the Earth or universe. It is the next step along a path that includes many familiar media, from paintings to movies to video games. We can even socialize with people inside of new worlds, which could be real or artificial.

At the same time, VR bears the stigma of unkept promises. The hype and excitement have often far exceeded the delivery of VR experiences to match them, especially for people without access to expensive laboratory equipment. This was particularly painful in the early 1990s when VR seemed poised to enter mainstream use but failed to catch on (outside of some niche markets). Decades later, we are witnessing an exciting rebirth. The latest technological components, mainly arising from the smartphone industry, have enabled high-resolution, low-cost, portable VR headsets to provide compelling VR experiences. From 2014 onward, this has mobilized leading technology companies to invest billions of dollars into growing a VR ecosystem that includes art, communication, entertainment, enhanced work productivity, and social interaction. At the same time, a new generation of technologists is entering the field with fresh ideas. Online communities of hackers and makers, along with college students around the world, are excitedly following the rapid advances in VR and are starting to shape it by starting new companies, working to improve the technology, and making new kinds of experiences. The whole ecosystem is growing at a steady pace, while some particular use cases such as industry training are rapidly expanding. A current challenge is to introduce advanced hardware that is not simply derived from other markets. The greatest need for innovation is in visual displays that are particularly designed for VR.
Distinctions with other technologies such as augmented reality (AR) and mixed reality (MR) are becoming less significant as the technology progresses because they can all be handled by the same or similar devices. At the time of writing, the relatively new term XR (or extended reality) has become popular to represent this unification; however, this book will refer to these as variations of VR.

The Intended Audience

The book grew out of material for an undergraduate course on VR that I introduced at the University of Illinois in 2015. I have never in decades of teaching seen students so excited to take a course. We could not offer enough slots to come even close to meeting the demand. Therefore, the primary target of this book is undergraduate students around the world. This book would be an ideal source for starting similar VR courses at other universities. Although most of the interested students have been computer scientists, the course at Illinois has attracted students from many disciplines, such as psychology, music, kinesiology, engineering, medicine, and economics. Students in these other fields come with the most exciting project ideas because they can see how VR has the potential to radically alter their discipline. To make the course accessible to students with such diverse backgrounds, I have made the material as self-contained as possible. There is no assumed background in software development or advanced mathematics. If prospective readers have at least written some scripts before and can remember how to multiply matrices together, they should be ready to go.

In addition to use by students who are studying VR in university courses, the book is also targeted at developers in industry, hobbyists on the forums, and researchers in academia. The book appears online so that it may serve as a convenient reference for all of these groups.
To provide further assistance, there are also accompanying materials online, including lecture slides (prepared by Anna Yershova LaValle) and recorded lectures (provided online for free by NPTEL of India).

Why Am I Writing This Book?

I enjoy teaching and research, especially when I can tie the two together. I have been a professor and have taught university courses for two decades. Robotics has been my main field of expertise; however, in 2012, while on a sabbatical in Oulu, Finland, I started working at Oculus VR a few days after its Kickstarter campaign. I left the university and became their chief scientist, working on head tracking methods, perceptual psychology, health and safety, and numerous other problems. I was struck by how many new challenges arose during that time because engineers and computer scientists (myself included) did not recognize human perception problems that were disrupting our progress. I became convinced that for VR to succeed, perceptual psychology must permeate the design of VR systems. As we tackled some of these challenges, the company rapidly grew in visibility and influence, eventually being acquired by Facebook. Oculus VR is largely credited with stimulating the recent rebirth of VR.

I quickly returned to the University of Illinois with a new educational mission: Teach a new generation of students, developers, and researchers the fundamentals of VR in a way that fuses perceptual psychology with engineering. Furthermore, this book focuses on principles that do not depend heavily on the particular technology of today. The goal is to improve the reader’s understanding of how VR systems work, what limitations they have, and what can be done to improve them. One important component is that even though technology rapidly evolves, humans who use it do not. It is therefore crucial to understand how our sensory systems function, especially when matched with artificial stimulation.
The intent is to provide a useful foundation as the technology evolves. In many cases, open challenges remain. The book does not provide the solutions to them, but instead provides the background to begin researching them.

Online Materials

A full draft of this book is posted online at:

http://lavalle.pl/vr/

along with pointers to additional materials, such as lecture videos and slides.

Suggested Use

This text may be used for a one-semester course by spending roughly one week per chapter, with the exception of Chapter 3, which may require two weeks. The book can also be used to augment other courses such as computer graphics, interfaces, and game development. Selected topics may also be drawn for a short course or seminar series.

Depending on the technical level of the students, the mathematical concepts in Chapter 3 might seem too oppressive. If that is the case, students may be advised to skim over it and jump to subsequent chapters. They can understand most of the later concepts without the full mathematical details of Chapter 3. Nevertheless, understanding these concepts will enhance their comprehension throughout the book and will also make them more comfortable with programming exercises.

Lab Component

Since 2015, we have used high-end consumer VR headsets on PCs with graphics cards that were specifically designed for VR. Development on many other platforms is possible, including all-in-one VR headsets, but one must be careful to ensure that the performance requirements for projects and exercises are met by the particular choice of platform. For software, almost all students develop VR projects using Unity 3D. Alternatives may be Unreal Engine and CryEngine, depending on their level of technical coding skills. Unity 3D is the easiest because knowledge of C++ and associated low-level concepts is unnecessary.
Students with strong programming and computer graphics skills may instead want to develop projects “from scratch”, but they should be aware that implementation times may be much longer.

Acknowledgments

I am very grateful to many students and colleagues who have given me extensive feedback and advice in developing this text. It evolved over many years through course development and teaching at the University of Illinois at Urbana-Champaign (UIUC), starting in early 2015. The biggest thanks goes to Anna Yershova LaValle, who has also taught the virtual reality course at the University of Illinois and collaborated on the course development. We also worked side by side at Oculus VR from its earliest days. We continue to work together at the University of Oulu in Finland.

I am grateful to the College of Engineering and Computer Science Department at the University of Illinois for their support of the course. Furthermore, Oculus/Facebook has generously supported the lab with headset donations. I am also grateful to the Indian Institute of Technology (IIT) Madras in Chennai, India for their hospitality and support while I taught a short version of the course. I also appreciate the efforts of my colleagues at the University of Oulu, who recently recruited me and support me in building the Perception Engineering Laboratory, which investigates fundamental VR issues. Most of the thanks goes to Timo Ojala and the Center for Ubiquitous Computing. Finally, I am extremely grateful to the hundreds of students who have served as test subjects for the course and book while it was under development. Their endless enthusiasm and questions helped shape this material.

Among many helpful colleagues, I especially thank Alan B. Craig, Ian Bailey, Henry Fuchs, Don Greenberg, Jukka Häkkinen, T. Kesh Kesavadas, Paul MacNeilage, M. Manivannan, Betty Mohler, Aaron Nichols, Timo Ojala, Yury Petrov, Dan Simons, and Richard Yao for their helpful insights, explanations, suggestions, feedback, and pointers to materials.

I sincerely thank the many more people who have given me corrections and comments on early drafts of this book. This includes Lawrence Angrave, Frank Dellaert, Blake J. Harris, Evgeny Klavir, Cameron Merrill, Katherine J. Mimnaugh, Peter Newell, Matti Pouke, Yingying (Samara) Ren, Matthew Romano, Tish Shute, Killivalavan Solai, Karthik Srikanth, Markku Suomalainen, Jiang Tin, David Tranah, Ilija Vukotic, Chris Widdowson, Kan Zi Yang, and Xu (Richard) Yu. Finally, thanks to Deborah Nicholls and Adam Warkoczewski for their help with all of the book figures.

Steve LaValle
Oulu, Finland, March 2019

Chapter 1
Introduction

1.1 What Is Virtual Reality?

Virtual reality (VR) technology is evolving rapidly, making it undesirable to define VR in terms of specific devices that may fall out of favor in a year or two. In this book, we are concerned with fundamental principles that are less sensitive to particular technologies and therefore survive the test of time. Our first challenge is to consider what VR actually means, in a way that captures the most crucial aspects in spite of rapidly changing technology. The concept must also be general enough to encompass what VR is considered today and what we envision for its future.

We start with two thought-provoking examples: 1) A human having an experience of flying over virtual San Francisco by flapping his own wings (Figure 1.1); 2) a gerbil running on a freely rotating ball while exploring a virtual maze that appears on a projection screen around the animal (Figure 1.2). We want our definition of VR to be broad enough to include these examples and many more, which are coming in Section 1.2. This motivates the following.
Definition of VR: Inducing targeted behavior in an organism by using artificial sensory stimulation, while the organism has little or no awareness of the interference.

Four key components appear in the definition:

1. Targeted behavior: The organism is having an “experience” that was designed by the creator. Examples include flying, walking, exploring, watching a movie, and socializing with other organisms.

2. Organism: This could be you, someone else, or even another life form such as a fruit fly, cockroach, fish, rodent, or monkey (scientists have used VR technology on all of these!).

3. Artificial sensory stimulation: Through the power of engineering, one or more senses of the organism become co-opted, at least partly, and their ordinary inputs are replaced or enhanced by artificial stimulation.

4. Awareness: While having the experience, the organism seems unaware of the interference, thereby being “fooled” into feeling present in a virtual world. This unawareness leads to a sense of presence in an altered or alternative world. It is accepted as being natural.

You have probably seen optical illusions before. A VR system causes a perceptual illusion to be maintained for the organism. For this reason, human physiology and perception represent a large part of this book.

Figure 1.1: In the Birdly experience from the Zurich University of the Arts, the user, wearing a VR headset, flaps his wings while flying over virtual San Francisco. A motion platform and fan provide additional sensory stimulation. The figure on the right shows the stimulus presented to each eye.

Figure 1.2: (a) An experimental setup used by neurobiologists at LMU Munich to present visual stimuli to a gerbil while it runs on a spherical ball that acts as a treadmill (Figure from ). (b) A picture of a similar experiment, performed at Princeton University.
Testing the boundaries

The examples shown in Figures 1.1 and 1.2 clearly fit the definition. Anyone donning a modern VR headset (footnote 1) and enjoying a session should also be included. How far does our VR definition allow one to stray from the most common examples? Perhaps listening to music through headphones should be included. What about watching a movie at a theater? Clearly, technology has been used in the form of movie projectors and audio systems to provide artificial sensory stimulation. Continuing further, what about a portrait or painting on the wall? The technology in this case involves paints and a canvas. Finally, we might even want reading a novel to be considered as VR. The technologies are writing and printing. The stimulation is visual, but does not seem as direct as a movie screen and audio system. In this book, we will not worry too much about the precise boundary of our VR definition. Good arguments could be made either way about some of these borderline cases, but it is more important to understand the key ideas for the core of VR. The boundary cases also serve as a good point of reference for historical perspective, which is presented in Section 1.3.

Who is the fool?

Returning to the VR definition above, the idea of “fooling” an organism might seem fluffy or meaningless; however, this can be made surprisingly concrete using research from neurobiology. When an animal explores its environment, neural structures composed of place cells are formed that encode spatial information about its surroundings [239, 243]; see Figure 1.3(a). Each place cell is activated precisely when the organism returns to a particular location that is covered by it. Although less understood, grid cells even encode locations in a manner similar to Cartesian coordinates (Figure 1.3(b)). It has been shown that these neural structures may form in an organism, even when having a VR experience [2, 44, 114]. In other words, our brains may form place cells for places that are not real!
This is a clear indication that VR is fooling our brains, at least partially. At this point, you may wonder whether reading a novel that meticulously describes an environment that does not exist will cause place cells to be generated.

Footnote 1: This is also referred to as a head-mounted display or HMD.

Figure 1.3: (a) We animals assign neurons as place cells, which fire when we return to specific locations. This figure depicts the spatial firing patterns of eight place cells in a rat brain as it runs back and forth along a winding track (figure by Stuart Layton). (b) We even have grid cells, which fire in uniformly, spatially distributed patterns, apparently encoding location coordinates (figure by Torkel Hafting).

Figure 1.4: A VR thought experiment: The brain in a vat, by Gilbert Harman in 1973. (Figure by Alexander Wivel.)

We also cannot help wondering whether we are always being fooled, and some greater reality has yet to reveal itself to us. This problem has intrigued the greatest philosophers over many centuries. One of the oldest instances is the Allegory of the Cave, presented by Plato in Republic. In this, Socrates describes the perspective of people who have spent their whole lives chained to a cave wall. They face a blank wall and only see shadows projected onto the walls as people pass by. He explains that the philosopher is like one of the cave people being finally freed from the cave to see the true nature of reality, rather than observing it only through projections. This idea has been repeated and popularized throughout history, and also connects deeply with spirituality and religion. In 1641, René Descartes hypothesized the idea of an evil demon who has directed his entire effort at deceiving humans with the illusion of the external physical world.
In 1973, Gilbert Harman introduced the idea of a brain in a vat (Figure 1.4), which is a thought experiment that suggests how such an evil demon might operate. This is the basis of the 1999 movie The Matrix. In that story, machines have fooled the entire human race by connecting their brains to a convincing simulated world, while harvesting their real bodies. The lead character Neo must decide whether to face the new reality or take a memory-erasing pill that will allow him to comfortably live in the simulation without awareness of the ruse.

Terminology regarding various “realities”

The term virtual reality dates back to German philosopher Immanuel Kant, although its use did not involve technology. Kant introduced the term to refer to the “reality” that exists in someone’s mind, as differentiated from the external physical world, which is also a reality. The modern use of the term VR was popularized by Jaron Lanier in the 1980s. Unfortunately, the name virtual reality itself seems to be self-contradictory, a philosophical problem that has been addressed by proposing the alternative term virtuality. While acknowledging this issue, we will nevertheless continue onward with the term virtual reality. The following distinction, however, will become important: The real world refers to the physical world that contains the user at the time of the experience, and the virtual world refers to the perceived world as part of the targeted VR experience.

Although the term VR is already quite encompassing, several competing terms related to VR are in common use at present. The term virtual environments predates widespread usage of VR and is preferred by most university researchers. It is typically considered to be synonymous with VR; however, we emphasize in this book that the perceived environment could be a photographically captured “real” world just as well as a completely synthetic world. Thus, the perceived environment presented in VR need not seem “virtual”.
Augmented reality (AR) refers to systems in which most of the visual stimuli are propagated directly through glass or cameras to the eyes, and some additional structures, such as text and graphics, appear to be superimposed onto the user’s world. The term mixed reality (MR) is sometimes used to refer to an entire spectrum that encompasses VR, AR, and ordinary reality. People have realized that these decades-old terms and distinctions have eroded away in recent years, especially as unifying technologies have rapidly advanced. Therefore, attempts have been recently made to hastily unify them back together again under the headings XR, X Reality, VR/AR, AR/VR, VR/AR/MR and so on. The related notion of telepresence refers to systems that enable users to feel like they are somewhere else in the real world; if they are able to control anything, such as a flying drone, then teleoperation is an appropriate term. For our purposes, virtual environments, AR, mixed reality, telepresence, and teleoperation will all be considered as perfect examples of VR.

Figure 1.5: When considering a VR system, it is tempting to focus only on the traditional engineering parts: Hardware and software. However, it is equally important, if not more important, to understand and exploit the characteristics of human physiology and perception. Because we did not design ourselves, these fields can be considered as reverse engineering. All of these parts tightly fit together to form perception engineering.

The most important idea of VR is that the user’s perception of reality has been altered through engineering, rather than whether the environment they believe they are in seems more “real” or “virtual”. A perceptual illusion has been engineered.
Thus, another reasonable term for this area, especially if considered as an academic discipline, could be perception engineering: engineering methods are being used to design, develop, and deliver perceptual illusions to the user. Figure 1.5 illustrates the ingredients of perception engineering, which also motivates the topics of this book, which are a mixture of engineering and human physiology and perception.

Interactivity

Most VR experiences involve another crucial component: interaction. Does the sensory stimulation depend on actions taken by the organism? If the answer is “no”, then the VR system is called open-loop; otherwise, it is closed-loop. In the case of closed-loop VR, the organism has partial control over the sensory stimulation, which could vary as a result of body motions, including eyes, head, hands, or legs. Other possibilities include voice commands, heart rate, body temperature, and skin conductance (are you sweating?).

First- vs. Third-person

Some readers of this book might want to develop VR systems or experiences. In this case, pay close attention to this next point! When a scientist designs an experiment for an organism, as shown in Figure 1.2, then the separation is clear: The laboratory subject (organism) has a first-person experience, while the scientist is a third-person observer. The scientist carefully designs the VR system as part of an experiment that will help to resolve a scientific hypothesis. For example, how does turning off a few neurons in a rat’s brain affect its navigation ability? On the other hand, when engineers or developers construct a VR system or experience, they are usually targeting themselves and people like them. They feel perfectly comfortable moving back and forth between being the “scientist” and the “lab subject” while evaluating and refining their work. As you will learn throughout this book, this is a bad idea!
The creators of the experience are heavily biased by their desire for it to succeed without having to redo their work. They also know what the experience is supposed to mean or accomplish, which provides a strong bias in comparison to a fresh subject. To complicate matters further, the creator’s body will physically and mentally adapt to whatever flaws are present so that they may soon become invisible. You have probably seen these kinds of things before. For example, it is hard to predict how others will react to your own writing. Also, it is usually harder to proofread your own writing in comparison to that of others. In the case of VR, these effects are much stronger and yet elusive to the point that you must force yourself to pay attention to them. Take great care when hijacking the senses that you have trusted all of your life. This will most likely be uncharted territory for you.

More real than reality?

How “real” should the VR experience be? It is tempting to try to make it match our physical world as closely as possible. This is referred to in Section 10.1 as the universal simulation principle: Any interaction mechanism in the real world can be simulated in VR. Our brains are most familiar with these settings, thereby making it seem most appropriate. This philosophy has dominated the video game industry at times, for example, in the development of highly realistic first-person shooter (FPS) games that are beautifully rendered on increasingly advanced graphics cards. In spite of this, understand that extremely simple, cartoon-like environments can also be effective and even preferable. Examples appear throughout history, as discussed in Section 1.3.

If you are a creator of VR experiences, think carefully about the task, goals, or desired effect you want to have on the user. You have the opportunity to make the experience better than reality. What will they be doing? Taking a math course? Experiencing a live theatrical performance? Writing software?
Designing a house? Maintaining a long-distance relationship? Playing a game? Having a meditation and relaxation session? Traveling to another place on Earth, or in the universe? For each of these, think about how the realism requirements might vary. For example, consider writing software in VR. We currently write software by typing into windows that appear on a large screen. Note that even though this is a familiar experience for many people, it was not even possible in the physical world of the 1950s. In VR, we could simulate the modern software development environment by convincing the programmer that she is sitting in front of a screen; however, this misses the point that we can create almost anything in VR. Perhaps a completely new interface will emerge that does not appear to be a screen sitting on a desk in an office. For example, the windows could be floating above a secluded beach or forest. Furthermore, imagine how a debugger could show the program execution trace. In all of these examples, it will be important to determine the perception-based criteria that need to be satisfied for the perceptual illusions to be convincingly and comfortably maintained for the particular VR experience of interest.

Synthetic vs. captured

Two extremes exist when constructing a virtual world as part of a VR experience. At one end, we may program a synthetic world, which is completely invented from geometric primitives and simulated physics. This is common in video games, and such virtual environments were assumed to be the main way to experience VR in earlier decades. At the other end, the world may be captured using modern imaging techniques. For viewing on a screen, the video camera has served this purpose for over a century. Capturing panoramic images and videos and then seeing them from any viewpoint in a VR system is a natural extension.
In many settings, however, too much information is lost when projecting the real world onto the camera sensor. What happens when the user changes her head position and viewpoint? More information should be captured in this case. Using depth sensors and SLAM (Simultaneous Localization And Mapping) techniques, a 3D representation of the surrounding world can be captured and maintained over time as it changes. It is extremely difficult, however, to construct an accurate and reliable representation, unless the environment is explicitly engineered for such capture (for example, a motion capture studio).

As humans interact, it becomes important to track their motions, which is an important form of capture. What are their facial expressions while wearing a VR headset? Do we need to know their hand gestures? What can we infer about their emotional state? Are their eyes focused on me? Synthetic representations of ourselves called avatars enable us to interact and provide a level of anonymity, if desired in some contexts. The attentiveness or emotional state can be generated synthetically. We can also enhance our avatars by tracking the motions and other attributes of our actual bodies. A well-known problem is the uncanny valley, in which a high degree of realism has been achieved in an avatar, but its appearance makes people feel uneasy. It seems almost right, but the small differences are disturbing. There is currently no easy way to make ourselves appear to others in a VR experience exactly as we do in the real world, and in most cases, we might not want to.

Health and safety

Although the degree of required realism may vary based on the tasks, one requirement remains invariant: The health and safety of the users. Unlike simpler media such as radio or television, VR has the power to overwhelm the senses and the brain, leading to fatigue or sickness.
This phenomenon has been studied under the heading of simulator sickness for decades; in this book we will refer to adverse symptoms from VR usage as VR sickness. Sometimes the discomfort is due to problems in the VR hardware and low-level software; however, in many cases, it is caused by a careless developer who misunderstands or disregards the side effects of the experience on the user. This is one reason why human physiology and perceptual psychology are large components of this book. To engineer comfortable VR experiences, one must understand how these factor in. In many cases, fatigue arises because the brain appears to work harder to integrate the unusual stimuli being presented to the senses. In some cases, inconsistencies with prior expectations, and outputs from other senses, even lead to dizziness and nausea.

Another factor that leads to fatigue is an interface that requires large amounts of muscular effort. For example, it might be tempting to move objects around in a sandbox game by moving your arms around in space. This quickly leads to fatigue and an avoidable phenomenon called gorilla arms, in which people feel that the weight of their extended arms is unbearable. Instead, by following the principle of the computer mouse, it may be possible to execute large, effective motions in the virtual world by small, comfortable motions of a controller. Over long periods of time, the brain will associate the motions well enough for it to seem realistic while also greatly reducing fatigue. This will be revisited in Section ??.

1.2 Modern VR Experiences

The current generation of VR systems was brought about by advances in display, sensing, and computing technology from the smartphone industry. From Palmer Luckey’s 2012 Oculus Rift design to building a viewing case for smartphones [123, 244, 312], the world has quickly changed as VR headsets are mass produced and placed onto the heads of millions of people.
This trend is similar in many ways to the home computer and web browser revolutions; as a wider variety of people have access to the technology, the set of things they do with it substantially broadens. This section provides a quick overview of what people are doing with VR systems, and provides a starting point for searching for similar experiences on the Internet. Here, we can only describe the experiences in words and pictures, which is a long way from the appreciation gained by experiencing them yourself. This printed medium (a book) is woefully inadequate for fully conveying the medium of VR. Perhaps it was much the same in the 1890s to explain in a newspaper what a movie theater was like! If possible, it is strongly recommended that you try many VR experiences yourself to form first-hand opinions and spark your imagination to do something better.

Figure 1.6: (a) Valve’s Portal 2 demo, which shipped with The Lab for the HTC Vive headset, is a puzzle-solving experience in a virtual world. (b) The Virtuix Omni treadmill for walking through first-person shooter games. (c) Lucky’s Tale for the Oculus Rift maintains a third-person perspective as the player floats above his character. (d) In the Dumpy game from DePaul University, the player appears to have a large elephant trunk. The purpose of the game is to enjoy this unusual embodiment by knocking things down with a swinging trunk.

Video games People have dreamed of entering their video game worlds for decades. By 1982, this concept was already popularized by the Disney movie Tron. Figure 1.6 shows several video game experiences in VR. Most gamers currently want to explore large, realistic worlds through an avatar. Figure 1.6(a) shows Valve’s Portal 2 for the HTC Vive headset. Figure 1.6(b) shows an omnidirectional treadmill peripheral that gives users the sense of walking while they slide their feet in a dish on the floor.
These two examples give the user a first-person perspective of their character. By contrast, Figure 1.6(c) shows Lucky’s Tale, which instead yields a comfortable third-person perspective as the user seems to float above the character that she controls. Figure 1.6(d) shows a game that contrasts with all the others in that it was designed specifically to exploit the power of VR.

Figure 1.7: In 2015, Oculus Story Studio produced Emmy-winning Henry, an immersive short story about an unloved hedgehog who hopes to make a new friend, the viewer.

Figure 1.8: VR Cinema, developed in 2013 by Joo-Hyung Ahn for the Oculus Rift. Viewers could choose their seats in the theater and watch any movie they like.

Immersive cinema Hollywood movies continue to offer increasing degrees of realism. Why not make the viewers feel like they are part of the scene? Figure 1.7 shows an immersive short story. Movie directors are entering a fascinating new era of film. The tricks of the trade that were learned across the 20th century need to be reinvestigated because they are based on the assumption that the cinematographer controls the camera viewpoint. In VR, viewers can look in any direction, and perhaps even walk through the scene. What should they be allowed to do? How do you make sure they do not miss part of the story? Should the story be linear, or should it adapt to the viewer’s actions? Should the viewer be a first-person character in the film, or a third-person observer who is invisible to the other characters? How can a group of friends experience a VR film together? When are animations more appropriate than the capture of real scenes?

It will take many years to resolve these questions and countless more that will arise. In the meantime, VR can also be used as a kind of “wrapper” around existing movies. Figure 1.8 shows the VR Cinema application, which allows the user to choose any seat in a virtual movie theater.
Any standard movies or videos on the user’s hard drive can be streamed to the screen in the theater. These could be 2D or 3D movies. A projector in the back emits flickering lights and the audio is adjusted to mimic the acoustics of a real theater. This provides an immediate way to leverage all content that was developed for viewing on a screen, and bring it into VR. Many simple extensions can be made without modifying the films. For example, in a movie about zombies, a few virtual zombies could enter the theater and start to chase you. In a movie about tornadoes, perhaps the theater rips apart. You can also have a social experience. Imagine having “movie night” with your friends from around the world, while you sit together in the virtual movie theater. You can even have the thrill of misbehaving in the theater without getting thrown out.

Figure 1.9: An important component for achieving telepresence is to capture a panoramic view: (a) A car with cameras and depth sensors on top, used by Google to make Street View. (b) The Insta360 Pro captures and streams omnidirectional videos.

Telepresence The first step toward feeling like we are somewhere else is capturing a panoramic view of the remote environment (Figure 1.9). Google’s Street View and Earth apps already rely on the captured panoramic images from millions of locations around the world. Simple VR apps that query the Street View server directly enable the user to feel like he is standing in each of these locations, while easily being able to transition between nearby locations (Figure 1.10). Panoramic video capture is even more compelling. Figure 1.11 shows a frame from an immersive rock concert experience. Even better is to provide live panoramic video interfaces, through which people can attend sporting events and concerts. Through a live interface, interaction is possible.
People can take video conferencing to the next level by feeling present at the remote location. By connecting panoramic cameras to robots, the user is even allowed to move around in the remote environment (Figure 1.12). Current VR technology allows us to virtually visit faraway places and interact in most of the ways that were previously possible only while physically present. This leads to improved opportunities for telecommuting to work. This could ultimately help reverse the urbanization trend sparked by the 19th-century industrial revolution, leading to deurbanization as we distribute more uniformly around the Earth.

Figure 1.10: A simple VR experience that presents Google Street View images through a VR headset: (a) A familiar scene in Paris. (b) Left and right eye views are created inside the headset, while also taking into account the user’s looking direction.

Figure 1.11: Jaunt captured a panoramic video of Paul McCartney performing Live and Let Die, providing a VR experience in which users feel like they are on stage with the rock star.

Figure 1.12: Examples of robotic avatars: (a) The DORA robot from the University of Pennsylvania mimics the user’s head motions, allowing him to look around in a remote world while maintaining a stereo view (panoramas are monoscopic). (b) The Plexidrone, a flying robot that is designed for streaming panoramic video.

Virtual societies Whereas telepresence makes us feel like we are in another part of the physical world, VR also allows us to form entire societies that remind us of the physical world, but are synthetic worlds that contain avatars connected to real people. Figure 1.13 shows a Second Life scene in which people interact in a fantasy world through avatars; such experiences were originally designed to be viewed on a screen but can now be experienced through VR.
Groups of people could spend time together in these spaces for a variety of reasons, including common special interests, educational goals, or simply an escape from ordinary life.

Empathy The first-person perspective provided by VR is a powerful tool for causing people to feel empathy for someone else’s situation. The world continues to struggle with acceptance and equality for others of different race, religion, age, gender, sexuality, social status, and education, while the greatest barrier to progress is that most people cannot fathom what it is like to have a different identity. Figure 1.14 shows a VR project sponsored by the United Nations to yield feelings of empathy for those caught up in the Syrian crisis of 2015. Some of us may have compassion for the plight of others, but it is a much stronger feeling to understand their struggle because you have been there before. Figure 1.15 shows a VR system that allows men and women to swap bodies. Through virtual societies, many more possibilities can be explored. What if you were 10 cm shorter than everyone else? What if you teach your course with a different gender? What if you were the victim of racial discrimination by the police? Using VR, we can imagine many “games of life” where you might not get as far without being in the “proper” group.

Figure 1.13: Virtual societies develop through interacting avatars that meet in virtual worlds that are maintained on a common server. A snapshot from Second Life is shown here.

Figure 1.14: In Clouds Over Sidra, 2015, film producer Chris Milk offered a first-person perspective on the suffering of Syrian refugees (figure by Within, Clouds Over Sidra).

Figure 1.15: In 2014, BeAnotherLab, an interdisciplinary collective, made “The Machine to Be Another” where you can swap bodies with the other gender. Each person wears a VR headset that has cameras mounted on its front. Each therefore sees the world from the approximate viewpoint of the other person. Participants are asked to move their hands in coordinated motions so that they see their new body moving appropriately.

Education In addition to teaching empathy, the first-person perspective could revolutionize many areas of education. In engineering, mathematics, and the sciences, VR offers the chance to visualize geometric relationships in difficult concepts or data that are hard to interpret. Furthermore, VR is naturally suited for practical training because skills developed in a realistic virtual environment may transfer naturally to the real environment. The motivation is particularly high if the real environment is costly to provide or poses health risks. One of the earliest and most common examples of training in VR is flight simulation (Figure 1.16). Other examples include firefighting, nuclear power plant safety, search-and-rescue, military operations, and medical procedures.

Figure 1.16: A flight simulator in use by the US Air Force (photo by Javier Garcia, U.S. Air Force). The user sits in a physical cockpit while being surrounded by displays that show the environment.

Figure 1.17: A tour of the Nimrud palace of Assyrian King Ashurnasirpal II, a VR experience developed by Learning Sites Inc. and the University of Illinois in 2016.

Beyond these common uses of VR, perhaps the greatest opportunities for VR education lie in the humanities, including history, anthropology, and foreign language acquisition. Consider the difference between reading a book on the Victorian era in England and being able to roam the streets of 19th-century London, in a simulation that has been painstakingly constructed by historians. We could even visit an ancient city that has been reconstructed from ruins (Figure 1.17).
Fascinating possibilities exist for either touring physical museums through a VR interface or scanning and exhibiting artifacts directly in virtual museums. These examples fall under the heading of digital heritage.

Virtual prototyping In the real world, we build prototypes to understand how a proposed design feels or functions. Thanks to 3D printing and related technologies, this is easier than ever. At the same time, virtual prototyping enables designers to inhabit a virtual world that contains their prototype (Figure 1.18). They can quickly interact with it and make modifications. They also have opportunities to bring clients into their virtual world so that they can communicate their ideas. Imagine you want to remodel your kitchen. You could construct a model in VR and then explain to a contractor exactly how it should look. Virtual prototyping in VR has important uses in many businesses, including real estate, architecture, and the design of aircraft, spacecraft, cars, furniture, clothing, and medical instruments.

Figure 1.18: Architecture is a prime example of where a virtual prototype is invaluable. This demo, called Ty Hedfan, was created by IVR-NATION. The real kitchen is above and the virtual kitchen is below.

Figure 1.19: A heart visualization system based on images of a real human heart. This was developed by the Jump Trading Simulation and Education Center and the University of Illinois.

Health care Although health and safety are challenging VR issues, the technology can also help to improve our health. There is an increasing trend toward distributed medicine, in which doctors train people to perform routine medical procedures in remote communities around the world. Doctors can provide guidance through telepresence, and also use VR technology for training. In another use of VR, doctors can immerse themselves in 3D organ models that were generated from
medical scan data (Figure 1.19). This enables them to better plan and prepare for a medical procedure by studying the patient’s body shortly before an operation. They can also explain medical options to the patient or his family so that they may make more informed decisions. In yet another use, VR can directly provide therapy to help patients. Examples include overcoming phobias and stress disorders through repeated exposure, improving or maintaining cognitive skills in spite of aging, and improving motor skills to overcome balance, muscular, or nervous system disorders. VR systems could also one day improve longevity by enabling aging people to virtually travel, engage in fun physical therapy, and overcome loneliness by connecting with family and friends through an interface that makes them feel present and included in remote activities.

Figure 1.20: The Microsoft Hololens, 2016, uses advanced see-through display technology to superimpose graphical images onto the ordinary physical world, as perceived by looking through the glasses.

Figure 1.21: Nintendo Pokemon Go is a geolocation-based game from 2016 that allows users to imagine a virtual world that is superimposed onto the real world. They can see Pokemon characters only by looking “through” their smartphone screen.

Augmented and mixed reality In many applications, it is advantageous for users to see the live, real world with some additional graphics superimposed to enhance its appearance; see Figure 1.20. This has been referred to as augmented reality or mixed reality (both of which we consider to be part of VR in this book). By placing text, icons, and other graphics into the real world, the user could leverage the power of the Internet to help with many operations such as navigation, social interaction, and mechanical maintenance. Many applications to date are targeted at helping businesses to conduct operations more efficiently. Imagine a factory environment in which workers see identifying labels above parts that need to be assembled, or they can look directly inside of a machine to determine potential replacement parts.

These applications rely heavily on advanced computer vision techniques, which must identify objects, reconstruct shapes, and identify lighting sources in the real world before determining how to draw virtual objects that appear to be naturally embedded. Achieving a high degree of reliability becomes a challenge because vision algorithms make frequent errors in unforeseen environments. The real-world lighting conditions must be estimated to determine how to draw the virtual objects and any shadows they might cast onto real parts of the environment and other virtual objects. Furthermore, the real and virtual objects may need to be perfectly aligned in some use cases, which places strong burdens on both tracking and computer vision systems.

Several possibilities exist for visual displays. A fixed screen could show images that are enhanced through 3D glasses. A digital projector could augment the environment by shining light onto objects, giving them new colors and textures, or by placing text into the real world. A handheld screen, which is part of a smartphone or tablet, could be used as a window into the augmented or mixed world. This is the basis of the popular Nintendo Pokemon Go game (Figure 1.21). The cases more relevant for this book involve mounting the display on the head. In this case, two main approaches exist. In a see-through display, the user sees most of the real world by simply looking through a transparent material, while the virtual objects appear on the display to disrupt part of the view. Recent prototype headsets with advanced see-through display technology include Google Glass, Microsoft Hololens, and Magic Leap.
Achieving high resolution, wide field of view, and the ability to block out incoming light remain significant challenges for affordable consumer-grade devices; however, these may be largely solved within a few years. An alternative is a pass-through display, which sends images from an outward-facing camera to a standard screen inside of the headset. Pass-through displays overcome current see-through display problems, but instead suffer from latency, optical distortion, color distortion, and limited dynamic range.

New human experiences Finally, the point might be to simply provide a new human experience. Through telepresence, people can try experiences through the eyes of robots or other people. However, we can go further by giving people experiences that are impossible (or perhaps deadly) in the real world. Most often, artists are the ones leading this effort. The Birdly experience of human flying (Figure 1.1) was an excellent example. Figure 1.22 shows two more. What if we change our scale? Imagine being 2 mm tall and looking ants right in the face. Compare that to being 50 m tall and towering over a city while people scream and run from you. What if we simulate the effect of drugs in your system? What if you could become your favorite animal? What if you became a piece of food? The creative possibilities for artists seem to be endless. We are limited only by what our bodies can comfortably handle. Exciting adventures lie ahead!

Figure 1.22: (a) In 2014, Epic Games created a wild roller coaster ride through a virtual living room. (b) A guillotine simulator was made in 2013 by Andre Berlemont, Morten Brunbjerg, and Erkki Trummal. Participants were hit on the neck by friends as the blade dropped, and they could see the proper perspective as their heads rolled.

1.3 History Repeats

Staring at rectangles How did we arrive at VR as it exists today?
We start with a history that predates what most people would consider to be VR, but includes many aspects crucial to VR that have been among us for tens of thousands of years. Long ago, our ancestors were trained to look at the walls and imagine a 3D world that is part of a story. Figure 1.23 shows some examples of this, starting with cave paintings such as the one shown in Figure 1.23(a) from 30,000 years ago. Figure 1.23(b) shows a painting from the European Middle Ages. Similar to the cave painting, it relates to military conflict, a fascination of humans regardless of the era or technology. There is much greater detail in the newer painting, leaving less to the imagination; however, the drawing perspective is comically wrong. Some people seem short relative to others, rather than being further away. The rear portion of the fence looks incorrect. Figure 1.23(c) shows a later painting in which the perspective has been meticulously accounted for, leading to a beautiful palace view that requires no imagination for us to perceive it as “3D”. By the 19th century, many artists had grown tired of such realism and started the controversial impressionist movement, an example of which is shown in Figure 1.23(d). Such paintings leave more to the imagination of the viewer, much like the earlier cave paintings.

Figure 1.23: (a) A 30,000-year-old painting from the Bhimbetka rock shelters in India (photo by Archaeological Survey of India). (b) An English painting from around 1470 that depicts John Ball encouraging Wat Tyler’s rebels (unknown artist). (c) A painting by Hans Vredeman de Vries in 1596. (d) An impressionist painting by Claude Monet in 1874.

Moving pictures Once humans were content with staring at rectangles on the wall, the next step was to put them into motion. The phenomenon of stroboscopic apparent motion is the basis for what we call movies or motion pictures today.
Flipping quickly through a sequence of pictures gives the illusion of motion, even at a rate as low as two pictures per second. Above ten pictures per second, the motion even appears to be continuous, rather than perceived as individual pictures. One of the earliest examples of this effect is the race horse movie created by Eadweard Muybridge in 1878 at the request of Leland Stanford (yes, that one!); see Figure 1.24.

Figure 1.24: This 1878 Horse in Motion motion picture by Eadweard Muybridge was created by evenly spacing 24 cameras along a track and triggering them by trip wire as the horse passes. The animation was played on a zoopraxiscope, which was a precursor to the movie projector, but was mechanically similar to a record player.

Motion picture technology quickly improved, and by 1896, a room full of spectators in a movie theater screamed in terror as a short film of a train pulling into a station convinced them that the train was about to crash into them (Figure 1.25(a)). There was no audio track. Such a reaction seems ridiculous to anyone who has been to a modern movie theater. As audience expectations increased, so did the degree of realism produced by special effects. In 1902, viewers were inspired by A Trip to the Moon (Figure 1.25(b)), but by 2013, an extremely high degree of realism seemed necessary to keep viewers believing (Figures 1.25(c) and 1.25(d)).

At the same time, motion picture audiences have been willing to accept lower degrees of realism. One motivation, as for paintings, is to leave more to the imagination. The popularity of animation (also called anime or cartoons) is a prime example (Figure 1.26). Even within the realm of animations, a similar trend has emerged as with motion pictures in general. Starting from simple line drawings in 1908 with Fantasmagorie (Figure 1.26(a)), greater detail appears in 1928 with the introduction of Mickey Mouse (Figure 1.26(b)).
By 2003, animated films achieved a much higher degree of realism (Figure 1.26(c)); however, excessively simple animations have also enjoyed widespread popularity (Figure 1.26(d)).

Figure 1.25: A progression of special effects: (a) Arrival of a Train at La Ciotat Station, 1896. (b) A Trip to the Moon, 1902. (c) The movie 2001, from 1968. (d) Gravity, 2013.

Figure 1.26: A progression of cartoons: (a) Emile Cohl, Fantasmagorie, 1908. (b) Mickey Mouse in Steamboat Willie, 1928. (c) The Clone Wars Series, 2003. (d) South Park, 1997.

Toward convenience and portability Further motivations for accepting lower levels of realism are cost and portability. As shown in Figure 1.27, families were willing to gather in front of a television to watch free broadcasts in their homes, even though they could go to theaters and watch high-resolution, color, panoramic, and 3D movies at the time. Such tiny, blurry, black-and-white television sets seem comically intolerable with respect to our current expectations. The next level of portability is to carry the system around with you. Thus, the progression is from: 1) having to go somewhere to watch it, to 2) being able to watch it in your home, to 3) being able to carry it anywhere. Whether pictures, movies, phones, computers, or video games, the same progression continues. We can therefore expect the same for VR systems. At the same time, note that the gap is closing between these levels: The quality we expect from a portable device is closer than ever before to the version that requires going somewhere to experience it.

Figure 1.27: Although movie theaters with large screens were available, families were also content to gather around television sets that produced a viewing quality that would be unbearable by current standards, as shown in this photo from 1958.

Video games Motion pictures yield a passive, third-person experience, in contrast to video games, which are closer to a first-person experience by allowing us to interact with them. Recall from Section 1.1 the differences between open-loop and closed-loop VR. Video games are an important step toward closed-loop VR, whereas motion pictures are open-loop. As shown in Figure 1.28, we see the same trend from simplicity to improved realism and then back to simplicity. The earliest games, such as Pong and Donkey Kong, left much to the imagination. First-person shooter (FPS) games such as Doom gave the player a first-person perspective and launched a major campaign over the following decade toward higher quality graphics and realism. Assassin’s Creed shows a typical scene from a modern, realistic video game. At the same time, wildly popular games have emerged by focusing on simplicity. Angry Birds looks reminiscent of games from the 1980s, and Minecraft allows users to create and inhabit worlds composed of coarse blocks. Note that reduced realism often leads to simpler engineering requirements; in 2015, an advanced FPS game might require a powerful PC and graphics card, whereas simpler games would run on a basic smartphone. Repeated lesson: Don’t assume that more realistic is better!

Beyond staring at a rectangle The concepts so far are still closely centered on staring at a rectangle that is fixed on a wall. Two important steps come next: 1) Presenting a separate picture to each eye to induce a “3D” effect. 2) Increasing the field of view so that the user is not distracted by the stimulus boundary.

One way our brains infer the distance of objects from our eyes is by stereopsis. Information is gained by observing and matching features in the world that are visible to both the left and right eyes. The differences between their images on the retina yield cues about distances; keep in mind that there are many more such cues, which are explained in Section 6.1.
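The link between left/right image differences and distance can be illustrated with a simplified two-camera analogy. The sketch below is not a model of human vision and is not taken from the book; it assumes two idealized, parallel pinhole cameras, for which the standard triangulation relation is depth = focal length × baseline / disparity. The function name, the 0.064 m baseline (chosen only as a rough stand-in for the spacing between two eyes), and the focal length in pixels are all illustrative assumptions.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Estimate distance (meters) to a point seen by two parallel pinhole cameras.

    focal_length_px: focal length expressed in pixels (assumed value).
    baseline_m:      horizontal spacing between the two cameras, in meters.
    disparity_px:    horizontal shift of the matched feature between the
                     left and right images, in pixels.
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point infinitely far away;
        # negative disparity cannot occur for a point in front of both cameras.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical numbers: larger disparity means the feature is closer.
    for disparity in (8.0, 16.0, 32.0, 64.0):
        depth = depth_from_disparity(1000.0, 0.064, disparity)
        print(f"disparity {disparity:5.1f} px -> depth {depth:.2f} m")
```

Because depth is inversely proportional to disparity, doubling the disparity halves the estimated distance, which is one way to see why this cue is most informative for nearby objects.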
The first experiment that showed the 3D effect of stereopsis was performed in 1838 by Charles Wheatstone in a system called the stereoscope (Figure 1.29(a)). By the 1930s, a portable version became a successful commercial product known to this day as the View-Master (Figure 1.29(b)). Pursuing this idea further led to Sensorama, which added motion pictures, sound, vibration, and even smells to the experience (Figure 1.29(c)). An unfortunate limitation of these designs is that the viewpoint must be fixed with respect to the picture. If the device is too large, then the user’s head also becomes fixed in the world. An alternative has been available in movie theaters since the 1950s. Stereopsis was achieved when participants wore special glasses that select a different image for each eye using polarized light filters. This popularized 3D movies, which are viewed the same way in the theaters today.

Figure 1.28: A progression of video games: (a) Atari’s Pong, 1972. (b) Nintendo’s Donkey Kong, 1981. (c) id Software’s Doom, 1993. (d) Ubisoft’s Assassin’s Creed Unity, 2014. (e) Rovio Entertainment’s Angry Birds, 2009. (f) Markus “Notch” Persson’s Minecraft, 2011.

Figure 1.29: (a) The first stereoscope, developed by Charles Wheatstone in 1838, used mirrors to present a different image to each eye; the mirrors were replaced by lenses soon afterward. (b) The View-Master is a mass-produced stereoscope that has been available since the 1930s. (c) In 1957, Morton Heilig’s Sensorama added motion pictures, sound, vibration, and even smells to the experience. (d) In competition with stereoscopic viewing, Cinerama offered a larger field of view. Larger movie screens caused the popularity of 3D movies to wane in the 1950s.

Another way to increase the sense of immersion and depth is to increase the field of view.
The Cinerama system from the 1950s offered a curved, wide field of view that is similar to the curved, large LED (Light-Emitting Diode) displays offered today (Figure 1.29(d)). Along these lines, we could place screens all around us. This idea led to one important family of VR systems called the CAVE, which was introduced in 1992 at the University of Illinois (Figure 1.30(a)). The user enters a room in which video is projected onto several walls. The CAVE system also offers stereoscopic viewing by presenting different images to each eye using polarized light and special glasses. Often, head tracking is additionally performed to allow viewpoint-dependent video to appear on the walls.

VR headsets Once again, the trend toward portability appears. An important step for VR was taken in 1968 with the introduction of Ivan Sutherland’s Ultimate Display, which leveraged the power of modern displays and computers (Figure 1.30(b)) [321, 322]. He constructed what is widely considered to be the first VR headset. As the user turns his head, the images presented on the screen are adjusted to compensate so that the virtual objects appear to be fixed in space. This yielded the first glimpse of an important concept in this book: The perception of stationarity. To make an object appear to be stationary while you move your sense organ, the device producing the stimulus must change its output to compensate for the motion. This requires sensors and tracking systems to become part of the VR system. Commercial VR headsets started appearing in the 1980s with Jaron Lanier’s company VPL, thereby popularizing the image of goggles and gloves (Figure 1.30(c)). In the 1990s, VR-based video games appeared in arcades (Figure 1.30(d)) and in home units (Figure 1.30(e)). The experiences were not compelling or comfortable enough to attract mass interest.
However, the current generation of VR headsets leverages the widespread availability of high-resolution screens and sensors, due to the smartphone industry, to offer lightweight, low-cost, high-field-of-view headsets, such as the Oculus Rift (Figure 1.30(f)). This has greatly improved the quality of VR experiences while significantly lowering the barrier of entry for developers and hobbyists. This also caused a recent flood of interest in VR technology and applications.

Figure 1.30: (a) CAVE virtual environment, Illinois Simulator Laboratory, Beckman Institute, University of Illinois at Urbana-Champaign, 1992 (photo by Hank Kaczmarski). (b) Sutherland’s Ultimate Display, 1968. (c) VPL Eyephones, 1980s. (d) Virtuality gaming, 1990s. (e) Nintendo Virtual Boy, 1995. (f) Oculus Rift, 2016.

Bringing people together

We have so far neglected an important aspect, which is human-to-human or social interaction. We use formats such as a live theater performance, a classroom, or a lecture hall for a few people to communicate with or entertain a large audience. We write and read novels to tell stories to each other. Prior to writing, skilled storytellers would propagate experiences to others, including future generations. We have communicated for centuries by writing letters to each other. More recent technologies have allowed us to interact directly without delay. The audio part has been transmitted through telephones for over a century, and now the video part is transmitted as well through videoconferencing over the Internet. At the same time, simple text messaging has become a valuable part of our interaction, providing yet another example of a preference for decreased realism. Communities of online users who interact through text messages over the Internet have been growing since the 1970s. In the context of games, early Multi-User Dungeons (MUDs) grew into the Massively Multiplayer Online Role-Playing Games (MMORPGs) that we have today. In the context of education, the PLATO system from the University of Illinois was the first computer-assisted instruction system, which included message boards, instant messaging, screen sharing, chat rooms, and emoticons. This was a precursor to many community-based, online learning systems, such as the Khan Academy and Coursera. The largest amount of online social interaction today occurs through Facebook apps, which involve direct communication through text along with the sharing of pictures, videos, and links.

Returning to VR, we can create avatar representations of ourselves and “live” together in virtual environments, as is the case with Second Life and Opensimulator (Figure 1.31). Without being limited to staring at rectangles, what kinds of societies will emerge with VR? Popular science fiction novels have painted a thrilling, yet dystopian future of a world where everyone prefers to interact through VR [47, 96, 314]. It remains to be seen what the future will bring.

Figure 1.31: Second Life was introduced in 2003 as a way for people to socialize through avatars and essentially build a virtual world to live in. Shown here is the author giving a keynote address at the 2014 Opensimulator Community Conference. The developers build open source software tools for constructing and hosting such communities of avatars with real people behind them.

As the technologies evolve over the years, keep in mind the power of simplicity when making a VR experience. In some cases, maximum realism may be important; however, leaving much to the imagination of the users is also valuable. Although the technology changes, one important invariant is that humans are still designed the same way. Understanding how our senses, brains, and bodies work is crucial to understanding the fundamentals of VR systems.
Further reading

Each chapter of this book concludes with pointers to additional, related literature that might not have been mentioned in the preceding text. Numerous books have been written on VR. A couple of key textbooks that precede the consumer VR revolution are Understanding Virtual Reality by W. R. Sherman and A. B. Craig, 2002 and 3D User Interfaces by D. A. Bowman et al., 2005. Books based on the current technology include [137, 186]. For a survey of the concept of reality, see. For recent coverage of augmented reality that is beyond the scope of this book, see.

A vast amount of research literature has been written on VR. Unfortunately, there is a considerable recognition gap in the sense that current industry approaches to consumer VR appear to have forgotten the longer history of VR research. Many of the issues being raised today and methods being introduced in industry were well addressed decades earlier, albeit with older technological components. Much of the earlier work remains relevant today and is therefore worth studying carefully. An excellent starting place is the Handbook on Virtual Environments, 2015, which contains dozens of recent survey articles and thousands of references to research articles. More recent works can be found in venues that publish papers related to VR. Browsing through recent publications in these venues may be useful: IEEE Virtual Reality (IEEE VR), IEEE International Conference on Mixed and Augmented Reality (ISMAR), ACM SIGGRAPH Conference, ACM Symposium on Applied Perception, ACM SIGCHI Conference, IEEE Symposium on 3D User Interfaces, Journal of Vision, Presence: Teleoperators and Virtual Environments.

Chapter 2 Bird’s-Eye View

This chapter presents an overview of VR systems from hardware (Section 2.1) to software (Section 2.2) to human perception (Section 2.3).
The purpose is to quickly provide a sweeping perspective so that the detailed subjects in the remaining chapters will be understood within the larger context. Further perspective can be gained by quickly jumping ahead to Section 12.2, which provides recommendations to VR developers. The fundamental concepts from the chapters leading up to that will provide the engineering and scientific background to understand why the recommendations are made. Furthermore, readers of this book should be able to develop new techniques and derive their own recommendations to others so that the VR systems and experiences are effective and comfortable.

2.1 Hardware

The first step to understanding how VR works is to consider what constitutes the entire VR system. It is tempting to think of it as being merely the hardware components, such as computers, headsets, and controllers. This would be woefully incomplete. As shown in Figure 2.1, it is equally important to account for the organism, which in this chapter will exclusively refer to a human user. The hardware produces stimuli that override the senses of the user. In the head-mounted display from Section 1.3 (Figure 1.30(b)), recall that tracking was needed to adjust the stimulus based on human motions. The VR hardware accomplishes this by using its own sensors, thereby tracking motions of the user. Head tracking is the most important, but tracking also may include button presses, controller movements, eye movements, or the movements of any other body parts.

Finally, it is also important to consider the surrounding physical world as part of the VR system. In spite of stimulation provided by the VR hardware, the user will always have other senses that respond to stimuli from the real world. She also has the ability to change her environment through body motions. The VR hardware might also track objects other than the user, especially if interaction with them is part of the VR experience.
Through a robotic interface, the VR hardware might also change the real world. One example is teleoperation of a robot through a VR interface.

Figure 2.1: A third-person perspective of a VR system. It is wrong to assume that the engineered hardware and software are the complete VR system: The organism and its interaction with the hardware are equally important. Furthermore, interactions with the surrounding physical world continue to occur during a VR experience.

Sensors and sense organs

How is information extracted from the physical world? Clearly this is crucial to a VR system. In engineering, a transducer refers to a device that converts energy from one form to another. A sensor is a special transducer that converts the energy it receives into a signal for an electrical circuit. This may be an analog or digital signal, depending on the circuit type. A sensor typically has a receptor that collects the energy for conversion. Organisms work in a similar way. The “sensor” is called a sense organ, with common examples being eyes and ears. Because our “circuits” are formed from interconnected neurons, the sense organs convert energy into neural impulses. As you progress through this book, keep in mind the similarities between engineered sensors and natural sense organs. They are measuring the same things and sometimes even function in a similar manner. This should not be surprising because we and our engineered devices share the same physical world: The laws of physics and chemistry remain the same.

Configuration space of sense organs

As the user moves through the physical world, his sense organs move along with him. Furthermore, some sense organs move relative to the body skeleton, such as our eyes rotating within their sockets. Each sense organ has a configuration space, which corresponds to all possible ways it can be transformed or configured.
The most important aspect of this is the number of degrees of freedom or DOFs of the sense organ. Chapter 3 will cover this thoroughly, but for now note that a rigid object that moves through ordinary space has six DOFs. Three DOFs correspond to its changing position in space: 1) side-to-side motion, 2) vertical motion, and 3) closer-further motion. The other three DOFs correspond to possible ways the object could be rotated; in other words, exactly three independent parameters are needed to specify how the object is oriented. These are called yaw, pitch, and roll, and are covered in Section 3.2.

Figure 2.2: Under normal conditions, the brain (and body parts) control the configuration of sense organs (eyes, ears, fingertips) as they receive natural stimulation from the surrounding, physical world.

Figure 2.3: In comparison to Figure 2.2, a VR system “hijacks” each sense by replacing the natural stimulation with artificial stimulation that is provided by hardware called a display. Using a computer, a virtual world generator maintains a coherent, virtual world. Appropriate “views” of this virtual world are rendered to the display.

As an example, consider your left ear. As you rotate your head or move your body through space, the position of the ear changes, as well as its orientation. This yields six DOFs. The same is true for your right eye, but it is also capable of rotating independently of the head. Keep in mind that our bodies have many more degrees of freedom, which affect the configuration of our sense organs. A tracking system may be necessary to determine the position and orientation of each sense organ that receives artificial stimuli, which will be explained shortly.
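The six DOFs described above can be sketched as a small data structure. This is only an illustration and is not taken from the book: the class names `RigidPose` and `EyeConfiguration`, the field names, the units, and the choice of two eye-in-head angles are all assumptions made for this example.

```python
from dataclasses import dataclass, field, fields


@dataclass
class RigidPose:
    """Hypothetical container for the six DOFs of a rigid body."""
    # Three translational DOFs (units assumed to be meters).
    x: float = 0.0  # side-to-side
    y: float = 0.0  # vertical
    z: float = 0.0  # closer-further
    # Three rotational DOFs (units assumed to be radians).
    yaw: float = 0.0
    pitch: float = 0.0
    roll: float = 0.0

    def dof_count(self) -> int:
        # One DOF per independent parameter stored above.
        return len(fields(self))


@dataclass
class EyeConfiguration:
    """Hypothetical eye model: head pose plus rotation within the socket."""
    head: RigidPose = field(default_factory=RigidPose)
    # Assumption for illustration: two extra angles relative to the head.
    eye_yaw: float = 0.0
    eye_pitch: float = 0.0

    def dof_count(self) -> int:
        # Head DOFs plus the eye's own rotational parameters.
        return self.head.dof_count() + 2


# An ear moves rigidly with the head, so the head pose describes it fully.
head_pose = RigidPose(x=0.1, y=1.6, z=0.0, yaw=0.3)
eye = EyeConfiguration(head=head_pose, eye_yaw=0.05)
print(head_pose.dof_count())  # 6
print(eye.dof_count())        # 8
```

A tracking system would have to estimate values like these for every sense organ that receives artificial stimuli. How the three rotation angles combine into an orientation depends on conventions that this sketch deliberately leaves out.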
An abstract view

Figure 2.2 illustrates the normal operation of one of our sense organs without interference from VR hardware. The brain controls its configuration, while the sense organ converts natural stimulation from the environment into neural impulses that are sent to the brain. Figure 2.3 shows how it appears in a VR system. The VR hardware contains several components that will be discussed shortly. A Virtual World Generator (VWG) runs on a computer and produces “another world”, which could take many forms, such as a pure simulation of a synthetic world, a recording of the real world, or a live connection to another part of the real world. The human perceives the virtual world through each targeted sense organ using a display, which emits energy that is specifically designed to mimic the type of stimulus that would appear without VR. The process of converting information from the VWG into output for the display is called rendering. In the case of human eyes, the display might be a smartphone screen or the screen of a video projector. In the case of ears, the display is referred to as a speaker. (A display need not be visual, even though this is the common usage in everyday life.) If the VR system is effective, then the brain is hopefully “fooled” in the sense shown in Figure 2.4. The user should believe that the stimulation of the senses is natural and comes from a plausible world, being consistent with at least some past experiences.

Figure 2.4: If done well, the brain is “fooled” into believing that the virtual world is in fact the surrounding physical world and natural stimulation is resulting from it.

Aural: world-fixed vs. user-fixed

Recall from Section 1.3 the trend of having to go somewhere for an experience, to having it in the home, and then finally to having it be completely portable. To understand these choices for VR systems and their implications on technology, it will be helpful to compare a simpler case: Audio or aural systems.

Figure 2.5 shows the speaker setup and listener location for a Dolby 7.1 Surround Sound theater system, which could be installed at a theater or a home family room. Seven speakers distributed around the room periphery generate most of the sound, while a subwoofer (the “1” of the “7.1”) delivers the lowest frequency components. The aural displays are therefore world-fixed. Compare this to a listener wearing headphones, as shown in Figure 2.6. In this case, the aural displays are user-fixed. Hopefully, you have already experienced settings similar to these many times.

Figure 2.5: In a surround-sound system, the aural displays (speakers) are world-fixed while the user listens from the center.

Figure 2.6: Using headphones, the displays are user-fixed, unlike the case of a surround-sound system.

What are the key differences? In addition to the obvious portability of headphones, the following quickly come to mind: In the surround-sound system, the generated sound (or stimulus) is far away from the ears, whereas it is quite close for the headphones. One implication of the difference in distance is that much less power is needed for the headphones to generate an equivalent perceived loudness level compared with distant speakers. Another implication based on distance is the degree of privacy afforded to the wearer of headphones. A surround-sound system at high volume levels could prompt a visit from angry neighbors. Wearing electronics on your head could be uncomfortable over long periods of time, causing a preference for surround sound over headphones. Several people can enjoy the same experience in a surround-sound system (although they cannot all sit in the optimal location).
Using headphones, they would need to split the audio source across their individual headphones simultaneously. They are likely to have different costs, depending on the manufacturing difficulty and available component technology. At present, headphones are favored by costing much less than a set of surround-sound speakers (although one can spend a large amount of money on either).

All of these differences carry over to VR systems. This should not be too surprising because we could easily consider a pure audio experience to be a special kind of VR experience based on our definition from Section 1.1.

While listening to music, close your eyes and imagine you are at a live performance with the artists surrounding you. Where do you perceive the artists and their instruments to be located? Are they surrounding you, or do they seem to be in the middle of your head? Using headphones, it is most likely that they seem to be inside your head. In a surround-sound system, if recorded and displayed properly, the sounds should seem to be coming from their original locations well outside of your head. They probably seem constrained, however, into the horizontal plane that you are sitting in.

This shortcoming of headphones is not widely recognized at present, but nevertheless represents a problem that becomes much larger for VR systems that include visual displays. If you want to preserve your perception of where sounds are coming from, then headphones would need to take into account the configurations of your ears in space to adjust the output accordingly. For example, if you shake your head from side to side in a “no” gesture, then the sound being presented to each ear needs to be adjusted so that the simulated sound source is rotated in the opposite direction. In the surround-sound system, the speaker does not follow your head and therefore does not need to rotate.
If the speaker rotates with your head, then a counter-rotation is needed to “undo” your head rotation so that the sound source location is perceived to be stationary.

Visual: world-fixed vs. user-fixed

Now consider adding a visual display. You might not worry much about the perceived location of artists and instruments while listening to music, but you will quickly notice if their locations do not appear correct to your eyes. Our vision sense is much more powerful and complex than our sense of hearing. Figure 2.7(a) shows a CAVE system, which parallels the surround-sound system in many ways. The user again sits in the center while displays around the periphery present visual stimuli to his eyes. The speakers are replaced by video screens. Figure 2.7(b) shows a user wearing a VR headset, which parallels the headphones.

Figure 2.7: (a) A CAVE VR system developed at Teesside University, UK. (b) A 90-year-old woman (Rachel Mahassel) wearing the Oculus Rift DK1 headset in 2013.

Suppose the screen in front of the user’s eyes shows a fixed image in the headset. If the user rotates his head, then the image will be perceived as being attached to the head. This would occur, for example, if you rotate your head while using the View-Master (recall Figure 1.29(b)). If you would like to instead perceive the image as part of a fixed world around you, then the image inside the headset must change to compensate as you rotate your head. The surrounding virtual world should be counter-rotated, the meaning of which will be made more precise in Section 3.4. Once we agree that such transformations are necessary, it becomes a significant engineering challenge to estimate the amount of head and eye movement that has occurred and apply the appropriate transformation in a timely and accurate manner. If this is not handled well, then users could have poor or unconvincing experiences. Worse yet, they could fall prey to VR sickness.
This is one of the main reasons why the popularity of VR headsets waned in the 1990s. The component technology was not good enough yet. Fortunately, the situation is much improved at present. For audio, few seemed to bother with this transformation, but for the visual counterpart, it is absolutely critical. One final note is that tracking and applying transformations also become necessary in CAVE systems if we want the images on the screens to be altered according to changes in the eye positions inside of the room.

Now that you have a high-level understanding of the common hardware arrangements, we will take a closer look at hardware components that are widely available for constructing VR systems. These are expected to change quickly, with costs decreasing and performance improving. We also expect many new devices to appear in the marketplace in the coming years. In spite of this, the fundamentals in this book remain unchanged. Knowledge of the current technology provides concrete examples to make the fundamental VR concepts clearer.

The hardware components of VR systems are conveniently classified as:

Displays (output): Devices that each stimulate a sense organ.
Sensors (input): Devices that extract information from the real world.
Computers: Devices that process inputs and outputs sequentially.

Displays

A display generates stimuli for a targeted sense organ. Vision is our dominant sense, and any display constructed for the eye must cause the desired image to be formed on the retina. Because of this importance, Chapters 4 and 5 will explain displays and their connection to the human vision system. For CAVE systems, some combination of digital projectors and mirrors is used. Due to the plummeting costs, an array of large-panel displays may alternatively be employed. For headsets, a smartphone display can be placed close to the eyes and brought into focus using one magnifying lens for each eye. Screen manufacturers are currently making custom displays for VR headsets by leveraging the latest LED display technology from the smartphone industry. Some are targeting one display per eye with frame rates above 90 Hz and over two megapixels per eye. Reasons for this are explained in Chapter 5.

Now imagine displays for other sense organs. Sound is displayed to the ears using classic speaker technology. Bone conduction methods may also be used, which vibrate the skull and propagate the waves to the inner ear; this method appeared in Google Glass. Chapter 11 covers the auditory part of VR in detail. For the sense of touch, there are haptic displays. Two examples are pictured in Figure 2.8. Haptic feedback can be given in the form of vibration, pressure, or temperature. More details on displays for touch, and even taste and smell, appear in Chapter 13.

Figure 2.8: Two examples of haptic feedback devices. (a) The Touch X system by 3D Systems allows the user to feel strong resistance when poking into a virtual object with a real stylus. A robot arm provides the appropriate forces. (b) Some game controllers occasionally vibrate.

Sensors

Consider the input side of the VR hardware. A brief overview is given here; Chapter 9 covers sensors and tracking systems in detail. For visual and auditory body-mounted displays, the position and orientation of the sense organ must be tracked by sensors to appropriately adapt the stimulus. The orientation part is usually accomplished by an inertial measurement unit or IMU. The main component is a gyroscope, which measures its own rate of rotation; the rate is referred to as angular velocity and has three components. Measurements from the gyroscope are integrated over time to obtain an estimate of the cumulative change in orientation. The resulting error, called drift error, would gradually grow unless other sensors are used. To reduce drift error, IMUs also contain an accelerometer and possibly a magnetometer. Over the years, IMUs have gone from existing only as large mechanical systems in aircraft and missiles to being tiny devices inside of smartphones; see Figure 2.9. Due to their small size, weight, and cost, IMUs can be easily embedded in wearable devices. They are one of the most important enabling technologies for the current generation of VR headsets and are mainly used for tracking the user’s head orientation.

Figure 2.9: Inertial measurement units (IMUs) have gone from large, heavy mechanical systems to cheap, microscopic MEMS circuits. (a) The LN-3 Inertial Navigation System, developed in the 1960s by Litton Industries. (b) The internal structures of a MEMS gyroscope, for which the total width is less than 1 mm.

Digital cameras provide another critical source of information for tracking systems. Like IMUs, they have become increasingly cheap and portable due to the smartphone industry, while at the same time improving in image quality. Cameras enable tracking approaches that exploit line-of-sight visibility. The idea is to identify features or markers in the image that serve as reference points for a moving object or a stationary background. Such visibility constraints severely limit the possible object positions and orientations. Standard cameras passively form an image by focusing the light through an optical system, much like the human eye. Once the camera calibration parameters are known, an observed marker is known to lie along a ray in space. Cameras are commonly used to track eyes, heads, hands, entire human bodies, and any other objects in the physical world. One of the main challenges at present is to obtain reliable and accurate performance without placing special markers on the user or objects around the scene.

As opposed to standard cameras, depth cameras work actively by projecting light into the scene and then observing its reflection in the image.
This is typically done in the infrared (IR) spectrum so that humans do not notice; see Figure 2.10.

In addition to these sensors, we rely heavily on good old mechanical switches and potentiometers to create keyboards and game controllers. An optical mouse is also commonly used. One advantage of these familiar devices is that users can rapidly input data or control their characters by leveraging their existing training. A disadvantage is that the devices might be hard to find or interact with if users’ faces are covered by a headset.

Figure 2.10: (a) The Microsoft Kinect sensor gathers both an ordinary RGB image and a depth map (the distance away from the sensor for each pixel). (b) The depth is determined by observing the locations of projected IR dots in an image obtained from an IR camera.

Computers

A computer executes the virtual world generator (VWG). Where should this computer be? Although unimportant for world-fixed displays, the location is crucial for body-fixed displays. If a separate PC is needed to power the system, then fast, reliable communication must be provided between the headset and the PC. This connection is currently made by wires, leading to an awkward tether; current wireless speeds are not sufficient. As you have noticed, most of the needed sensors exist on a smartphone, as well as a moderately powerful computer. Therefore, a smartphone can be dropped into a case with lenses to provide a VR experience with little added cost (Figure 2.11). The limitation, though, is that the VWG must be simpler than in the case of a separate PC so that it runs on less-powerful computing hardware. In the near future, we expect to see wireless, all-in-one headsets that contain all of the essential parts of smartphones for delivering VR experiences. These will eliminate unnecessary components of smartphones (such as the addi
