Document Details


Uploaded by DeadOnCoralReef

Vrije Universiteit Amsterdam

Christopher D. Wickens, John Lee, Yili Liu, Sallie Gordon Becker

Tags

visual sensory systems, human factors engineering, perception, human factors

Summary

This document is chapter 4 of an introduction to human factors engineering. It introduces the concepts of visual perception, including the visual sensory system, light stimulus, and eyeball anatomy. It also discusses the important characteristics of human visual performance.

Full Transcript


Visual Sensory Systems

From Chapter 4 of An Introduction to Human Factors Engineering, Second Edition. Christopher D. Wickens, John Lee, Yili Liu, Sallie Gordon Becker. Copyright © 2004 by Pearson Education, Inc. All rights reserved.

The 50-year-old traveler, arriving in an unfamiliar city on a dark, rainy night, is picking up a rental car. The rental agency bus driver points to "the red sedan over there" and drives off, but in the dim light of the parking lot, our traveler cannot easily tell which car is red and which is brown. He climbs into the wrong car, realizes his mistake, and settles at last in the correct vehicle. He pulls out a city map to figure out the way to his destination, but in the dim illumination of the dome light, the printed street names on the map are just a haze of black. Giving up on the map, he remains confident that he will see the appropriate signage to Route 60 that will direct him toward his destination, so he starts the motor to pull out of the lot. The streaming rain forces him to search for the wiper switch, but the switch is hard to find because the dark printed labels cannot be read against the gray color of the interior. A little fumbling, however, and the wipers are on, and he emerges from the lot onto the highway. The rapid traffic closing behind him and the bright glare of headlights in his rearview mirror force him to accelerate to an uncomfortably rapid speed. He cannot read the first sign to his right as he speeds by. Did that sign say Route 60 or Route 66? He drives on, assuming that the turnoff will be announced again; he peers ahead, watching for the sign. Suddenly, there it is on the left side of the highway, not the right where he had expected it, and he passes it before he can change lanes. Frustrated, he turns on the dome light to glance at the map again, but in the fraction of a second that his head is down, the sound of gravel on the undercarriage signals that his car has slid off the highway. As he drives along the berm, waiting to pull back on the road, he fails to see the huge pothole that unkindly brings his car to an abrupt halt.

Our unfortunate traveler is in a situation that is far from unique. Night driving in unfamiliar locations is one of the more hazardous endeavors that humans undertake (Evans, 1991), especially as they become older. The reasons the dangers are so great relate to the pronounced limits of the visual sensory system. Many of these limits reside within the peripheral features of the eyeball itself and the neural pathways that send messages of visual information to the brain. Others relate more directly to brain processing and to many of the perceptual processes. In this chapter we discuss the nature of the light stimulus and the eyeball anatomy as it processes this light. We then discuss several of the important characteristics of human visual performance as it is affected by this interaction between characteristics of the stimulus and the human perceiver.

THE STIMULUS: LIGHT

Essentially all visual stimuli that the human can perceive may be described as a wave of electromagnetic energy. The wave can be represented as a point along the visual spectrum. As shown in Figure 1a, this point has a wavelength, typically expressed in nanometers along the horizontal axis, and an amplitude on the vertical axis. The wavelength determines the hue of the stimulus that is perceived, and the amplitude determines its brightness.
As the figure shows, the range of wavelengths typically visible to the eye runs from short wavelengths of around 400 nm (typically observed as blue-violet) to long wavelengths of around 700 nm (typically observed as red).

FIGURE 1 (a) The visible spectrum of electromagnetic energy (light). Very short (ultraviolet) and very long (infrared) wavelengths falling just outside of this spectrum are shown. Monochromatic (black, gray, white) hues are not shown because these are generated by the combinations of wavelengths. (b) The CIE color space, showing some typical colors created by levels of x and y specifications. (Source: Helander, M., 1987. The design of visual displays. In Handbook of Human Factors, G. Salvendy, ed. New York: Wiley, Fig. 5.1.35, p. 535; Fig. 5.1.36, p. 539. Reprinted by permission of John Wiley and Sons, Inc.)

In fact, the eye rarely encounters "pure" wavelengths. On the one hand, mixtures of different wavelengths often act as stimuli. For example, Figure 1a depicts a spectrum that is a mixture of red and blue, which would be perceived as purple. On the other hand, the pure wavelengths characterizing a hue, like blue or yellow, may be "diluted" by mixture with varying amounts of gray or white (called achromatic light; this is light with no dominant hue and is therefore not represented on the spectrum). When wavelengths are not diluted by gray, like pure red, they are said to be saturated. Diluted wavelengths, like pink, are of course unsaturated. Hence, a given light stimulus can be characterized by its hue (spectral values), saturation, and brightness.

The actual hue of a light is typically specified by the combination of the three primary colors (red, green, and blue) necessary to match it (Helander, 1987). This specification follows a procedure developed by the Commission Internationale de l'Éclairage and hence is called the CIE color system. As shown in Figure 1b, the CIE color space represents all colors in terms of two primary colors of long and medium wavelengths specified by the x and y axes, respectively (Wyszecki, 1986). Those colors on the rim of the curved lines defining the space are pure, saturated colors. A monochrome light is represented at point C in the middle of the space. The figure does not represent brightness, but this could be shown as a third dimension running above and below the color space of 1b. Use of this standard coordinate system allows common specification of colors across different users. For example, a "lipstick red" color would be established as having .5 units of long wavelength and .33 units of medium wavelength (see Post, 1992, for a more detailed discussion of color standardization issues).

While we can measure or specify the hue of a stimulus reaching the eyeball by its wavelength, the measurement of brightness is more complex because there are several different meanings of light intensity (Boyce, 1997).
This is shown in Figure 2, where we see a source of light, like the sun or, in this case, the headlight of our driver's car. This source may be characterized by its luminous intensity, or luminous flux, which is the actual light energy of the source. It is measured in units of candela. But the amount of this energy that actually strikes the surface of an object to be seen (the road sign, for example) is a very different measure, described as the illuminance and measured in units of lux or foot candles. Hence, the term illumination characterizes the lighting quality of a given working environment. How much illuminance an object receives depends on the distance of the object from the light source. As the figure shows, the illuminance declines with the square of the distance from the source.

FIGURE 2 Concepts behind the perception of visual brightness. Luminous energy (flux) is present at the source (the headlight), but for a given illuminated area (illuminance), this energy declines with the square of the distance from the source. This is illustrated by the values under the three signs at increasing intervals of two units, four units, and six units away from the headlight. Some of the illuminance (solid rays) is absorbed by the sign, and the remainder is reflected back to the observer, characterizing the luminance of the viewed sign. Brightness is the subjective experience of the perceiver.

Although we may sometimes be concerned about light sources in direct viewing (for example, the amount of glare produced by the headlights of oncoming vehicles; Theeuwes et al., 2002) and about the illumination of the workplace, human factors is also concerned with the amount of light reflected off of objects to be detected, discriminated, and recognized by the observer when these objects are not themselves the source of light. This may characterize, for example, the road sign in Figure 2. We refer to this measure as the luminance of a particular stimulus, typically measured in foot lamberts (FL). Luminance is different from illuminance because of differences in the amount of light that surfaces either reflect or absorb. Black surfaces absorb most of the illuminance striking the surface, leaving little luminance to be seen by the observer. White surfaces reflect most of the illuminance. In fact, we can define the reflectance of a surface as the following ratio:

Reflectance (%) = luminance (FL) / illuminance (FC)    (1)

(A useful hint is to think of the illuminance light leaving some of itself [the "il"] on the surface and sending back to the eye only the luminance.) The brightness of a stimulus, then, is the actual experience of visual intensity, an intensity that often determines its visibility. From this discussion, we can see how the visibility or brightness of a given stimulus may be the same if it is a dark (poorly reflective) sign that is well illuminated or a white (highly reflective) sign that is poorly illuminated. In addition to brightness, the ability to see an object (its visibility) is also affected by the contrast between the stimulus and its surround, but that is another story that we shall describe in a few pages.

Table 1 summarizes these various measures of light and shows the units by which they are typically measured. A photometer is an electronic device that measures luminous intensity in terms of foot lamberts. An illumination meter is a device that measures illuminance.
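As a quick numeric illustration of formula 1, the sketch below computes reflectance from a luminance reading (what a photometer would report, in foot lamberts) and an illuminance reading (what an illumination meter would report, in foot candles). The specific readings are invented for illustration and are not values from the chapter.

```python
def reflectance(luminance_fl, illuminance_fc):
    """Reflectance of a surface per formula 1: luminance (fL) / illuminance (fC)."""
    return luminance_fl / illuminance_fc

# Hypothetical meter readings for a road sign:
sign_luminance = 20.0    # foot lamberts, from a photometer aimed at the sign
sign_illuminance = 50.0  # foot candles, from an illumination meter at the sign

r = reflectance(sign_luminance, sign_illuminance)
print(f"Reflectance = {r:.2f} ({r:.0%})")
# A highly reflective white sign approaches 1.0 (100 percent);
# a dark, absorbing surface approaches 0.
```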
TABLE 1 Physical Quantities of Light and Their Units
Luminous flux: 1 candela or 12.57 lumens
Illuminance: foot candle or 10.76 lux
Luminance: candela/m2 or foot lambert
Reflectance: a ratio
Brightness: (the subjective experience of the perceiver)

THE RECEPTOR SYSTEM: THE EYEBALL AND THE OPTIC NERVE

Light, or electromagnetic energy, must be transformed to electrochemical neural energy, a process that is accomplished by the eye. Figure 3 presents a schematic view of the wonderful receptor system for vision, the eyeball. As we describe certain key features of its anatomy and how this anatomy affects characteristics of the light energy that passes through it, we identify some of the distortions that disrupt our ability to see in many working environments and therefore should be the focus of concern for the human factors engineer.

FIGURE 3 Key aspects of the anatomy of the eyeball.

The Lens

As we see in the figure, the light rays first pass through the cornea, which is a protective surface that absorbs some of the light energy (and does so progressively more as we age). Light rays then pass through the pupil, which opens or dilates (in darkness) and closes or constricts (in brightness) to admit adaptively more light when illumination is low and less when illumination is high. The lens of the eye is responsible for adjusting its shape, or accommodating, to bring the image to a precise focus on the back surface of the eyeball, the retina. This accommodation is accomplished by a set of ciliary muscles surrounding the lens. Sensory receptors located within the ciliary muscles send information regarding accommodation to the higher perceptual centers of the brain.

When we view images up close, the light rays emanating from the images converge as they approach the eye, and the muscles must accommodate by changing the lens to a rounder shape, as reflected in Figure 3. When the image is far away and the light rays reach the eye in essentially parallel fashion, the muscles accommodate by creating a flatter lens. Somewhere in between is a point where the lens comes to a natural "resting" point, at which the muscles are doing little work at all. This is referred to as the resting state of accommodation. The amount of accommodation can be described in terms of the distance of a focused object from the eye. Formally, the amount of accommodation required is measured in diopters, which equal 1/viewing distance (meters). Thus, 1 diopter is the accommodation required to view an object at 1 meter.

As our driver discovered when he struggled to read the fine print of the map, our eyeball does not always accommodate easily. It takes time to change its shape, and sometimes there are factors that limit the amount of shape change that is possible. Myopia, or nearsightedness, results when the lens cannot flatten and hence distant objects cannot be brought into focus. Presbyopia, or farsightedness, results when the lens cannot accommodate to very near stimuli. As we grow older, the lens becomes less flexible in general, but farsightedness in particular becomes more evident. Hence, we see that the older reader, when not using corrective lenses, must hold the map farther away from the eyes to try to gain focus, and it takes longer for that focus to be achieved.
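To make the diopter definition concrete, the short sketch below computes the accommodation demand for a few viewing distances; the distances are illustrative choices, not values taken from the text.

```python
def accommodation_diopters(viewing_distance_m):
    """Accommodation demand in diopters = 1 / viewing distance (in meters)."""
    return 1.0 / viewing_distance_m

# Illustrative viewing distances:
for label, distance_m in [("map held on the lap", 0.4),
                          ("instrument panel", 0.75),
                          ("road sign far ahead", 100.0)]:
    demand = accommodation_diopters(distance_m)
    print(f"{label:20s} {distance_m:7.2f} m -> {demand:.2f} diopters")
# Near work such as map reading demands roughly 2.5 diopters, while the
# distant sign, whose rays arrive nearly parallel, demands almost none.
```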
While accommodation may be hindered by limits on flexibility of the lens and compensated by corrective lenses, it is also greatly influenced by the amount of visibility of the image to be fixated, which is determined by both its brightness and its contrast.

The Visual Receptor System

An image, whether focused or not, eventually reaches the retina at the back of the eyeball. The image may be characterized by its intensity (luminance), its wavelengths, and its size. The image size is typically expressed by its visual angle, which is depicted by the two-headed arrows in front of the eyes in Figure 3. The visual angle of an object of height H, viewed at distance D, is approximately equal to arctan (H/D) (the angle whose tangent = H/D). Knowing the distance of an object from a viewer and its size, one can compute this ratio. For visual angles less than around 10 degrees, the angle may be expressed in minutes of arc rather than degrees (60 minutes = 1 degree) and approximated by the formula

VA = 57.3 x 60 x (H/D)    (2)

Importantly, the image can also be characterized by where it falls on the back of the retina because this location determines the types of visual receptor cells that are responsible for transforming electromagnetic light energy into the electrical impulses of neural energy to be relayed up the optic nerve to the brain. There are two types of receptor cells, rods and cones, which differ in six distinct properties. Collectively, these different properties have numerous implications for our visual sensory processing.

1. Location. The middle region of the retina, the fovea, consisting of an area of around 2 degrees of visual angle, is inhabited exclusively by cones (Figure 3). Outside of the fovea, the periphery is inhabited by rods as well as cones, but the concentration of cones declines rapidly moving farther away from the fovea (i.e., with greater eccentricity).

2. Acuity. The amount of fine detail that can be resolved is far greater when the image falls on the closely spaced cones than on the more sparsely spaced rods. We refer to this ability to resolve detail as the acuity, often expressed as the inverse of the smallest visual angle (in minutes of arc) that can just be detected. Thus, an acuity of 1.0 means that the operator can resolve a visual angle of 1 minute of arc (1/60 of 1 degree). Table 2 provides various ways of measuring visual acuity. Since acuity is higher with cones than rods, it is not surprising that our best ability to resolve detail is in the fovea, where the cone density is greatest. Hence, we "look at" objects that require high acuity, meaning that we orient the eyeball to bring the image into focus on the fovea. While visual acuity drops rapidly toward the periphery, the sensitivity to motion declines at a far less rapid rate. We often use the relatively high sensitivity to motion in the periphery as a cue for something important on which we later fixate. That is, we notice motion in the periphery and move our eyes to focus on the moving object.

3. Sensitivity. Although the cones have an advantage over the rods in acuity, the rods have an advantage in terms of sensitivity, characterizing the minimum amount of light that can be detected, or the threshold. Sensitivity and threshold are reciprocally related: As one increases, the other decreases. Since there are no rods in the fovea, it is not surprising that our fovea is very poor at picking up dim illumination (i.e., it has a high threshold).
To illustrate this, note that if you try to look directly at a faint star, it will appear to vanish. Scotopic vision refers to vision at night when only rods are operating. Photopic vision refers to vision when the illumination is sufficient to activate both rods and cones (but when most of our visual experience is due to actions of cones).

4. Color sensitivity. Rods cannot discriminate different wavelengths of light (unless they also differ in intensity). Rods are "color blind," and so the extent to which hues can be resolved declines both in peripheral vision (where fewer cones are present) and at night (when only rods are operating). Hence, we can understand how our driver, trying to locate his car at night, was unable to discriminate the poorly illuminated red car from its surrounding neighbors.

5. Adaptation. When stimulated by light, rods rapidly lose their sensitivity, and it takes a long time for them to regain it (up to a half hour) once they are returned to the darkness that is characteristic of the rods' "optimal viewing environment." This phenomenon describes the temporary "blindness" we experience when we enter a darkened movie theater on a bright afternoon. Environments in which operators are periodically exposed to bright light but often need to use their scotopic vision are particularly disruptive. In contrast to rods, the low sensitivity of the cones is little affected by light stimulation. However, cones may become hypersensitive when they have received little stimulation. This is the source of glare from bright lights, particularly at night.

6. Differential wavelength sensitivity. Whereas cones are generally sensitive to all wavelengths, rods are particularly insensitive to long (i.e., red) wavelengths. Hence, red objects and surfaces look very black at night. More important, illuminating objects in red light in an otherwise dark environment will not destroy the rods' dark adaptation. For example, on the bridge of a ship, the navigator may use a red lamp to stimulate cones in order to read the fine detail of a chart, but this stimulation will not destroy the rods' dark adaptation and hence will not disrupt the ability of personnel to scan the horizon for faint lights or dark forms.

TABLE 2 Some Measures of Acuity
Minimum separable acuity: general measurement of smallest detail detectable
Vernier acuity: are two parallel lines aligned?
Landolt ring: is the gap in a ring detectable?
Snellen acuity: measurement of detail resolved at 20 feet, relative to the distance at which a normal observer can resolve the same detail (e.g., 20/30)

Collectively, these pronounced differences between rods and cones are responsible for a wide range of visual phenomena. We consider some of the more complex implications of these phenomena for human factors issues related to three important aspects of our sensory processing: contrast sensitivity (CS), night vision, and color vision.

SENSORY PROCESSING LIMITATIONS

Contrast Sensitivity

Our unfortunate driver could not discern the wiper control label, the map detail, or the pothole for a variety of reasons, all related to the vitally important human factors concept of contrast sensitivity. Contrast sensitivity may be defined as the reciprocal of the minimum contrast between a lighter and darker spatial area that can just be detected; that is, with a level of contrast below this minimum, the two areas appear homogeneous.
Hence, the ability to detect contrast is necessary in order to detect and recognize shapes, whether the discriminating shape of a letter or the blob of a pothole. The contrast of a given visual pattern is typically expressed as the ratio of the difference between the luminance of light, L, and dark, D, areas to the sum of these two luminance values:

c = (L - D)/(L + D)    (3)

The higher the contrast sensitivity that an observer possesses, the smaller the minimum amount of contrast that can just be detected, CM, a quantity that describes the contrast threshold. Hence,

CS = 1/CM    (4)

The minimum separable acuity (the width of light separating two dark lines) represents one measure of contrast sensitivity, because a gap that is smaller than this minimum will be perceived as a uniform line of constant brightness. Contrast sensitivity may often be measured by a grating, such as that shown along the x axis of Figure 4. If the grating appears to be a smooth bar like the grating on the far right of the figure (if it is viewed from a distance), the viewer is unable to discern the alternating patterns of dark and light, and the contrast is below the viewer's CS threshold.

FIGURE 4 Spatial frequency gratings, used to measure contrast sensitivity. The particular values on the x axis will vary as a function of visual angle and therefore the distances at which the figure is held from the eyes. The line above each grating will occupy 1 degree of visual angle when the book is viewed at a distance of 52 cm. The two curves represent contrast sensitivity as a function of spatial frequency for two different contrast levels.

Expressed in this way, we can consider the first of several influences on contrast sensitivity: the spatial frequency of the grating. As shown in Figure 4, spatial frequency may be expressed as the number of dark-light pairs that occupy 1 degree of visual angle (cycles/degree or c/d). If you hold this book approximately 1 foot away, then the spatial frequency of the left grating is 0.6 c/d, of the next grating is 1.25 c/d, and of the third grating is 2.0 c/d. We can also see that the spatial frequency is inversely related to the width of the light or dark bar. The human eye is most sensitive to spatial frequencies of around 3 c/d, as shown by the two CS functions drawn as curved lines across the axis of Figure 4. When the contrast (between light and dark bars) is greater, sensitivity is greater across all spatial frequencies.

The high spatial frequencies on the right side of Figure 4 characterize our sensitivity to small visual angles and fine detail (and hence reflect the standard measurement of visual acuity), such as that involved in reading fine print or making fine adjustments on a vernier scale. Much lower frequencies characterize the recognition of shapes in blurred or degraded conditions, like the road sign sought by our lost driver or the unseen pothole that terminated his trip. Low contrasts at low spatial frequencies often characterize the viewing of images that are degraded by poor "sensor resolution," like those from infrared radar (Uttal et al., 1994).

A second important influence on contrast, as seen in Figure 4, is that lower contrasts are less easily discerned. Hence, we can understand the difficulty our driver had in trying to read the label against the gray dashboard. Had the label been printed against a white background, it would have been far easier to read.
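As a minimal numeric illustration of formulas 3 and 4, the sketch below computes the contrast of two printed labels from assumed luminance values and compares each against a hypothetical contrast threshold; none of the specific numbers come from the chapter.

```python
def contrast(light_lum, dark_lum):
    """Contrast c = (L - D) / (L + D), formula 3."""
    return (light_lum - dark_lum) / (light_lum + dark_lum)

def contrast_sensitivity(contrast_threshold):
    """Contrast sensitivity CS = 1 / CM, formula 4."""
    return 1.0 / contrast_threshold

cm = 0.05                      # assumed contrast threshold CM for this viewer
cs = contrast_sensitivity(cm)  # -> 20

# Invented luminance values (arbitrary units):
labels = {
    "black print on white, well lit": contrast(light_lum=100.0, dark_lum=5.0),
    "dark print on gray, dim light":  contrast(light_lum=1.10, dark_lum=1.00),
}
for name, c in labels.items():
    status = "visible" if c > cm else "below threshold"
    print(f"{name}: c = {c:.2f} ({status}; viewer CS = {cs:.0f})")
```

With these invented numbers, the dimly lit gray label falls just below the assumed threshold, which is roughly the situation our driver faced on the dashboard.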
Many users of products like VCRs are frustrated by the black on black raised printing instructions (Figure 5). Color contrast does not necessarily produce good luminance-contrast ratios. Thus, for example, slides that produce black text against a blue background may be very hard for the viewing audience to read.

FIGURE 5 Difficult visibility of low-contrast, raised-plastic printing. With small letters and black plastic, such information is often nearly illegible in poor illumination. (Source: Courtesy of Anthony D. Andre, Interface Analysis Associates, San Jose, CA.)

A third influence on contrast sensitivity is the level of illumination of the stimulus (L + D, the denominator of formula 3). Not surprisingly, lower illumination reduces the sensitivity and does so more severely for sensing high spatial frequencies (which depend on cones) than for low frequencies. This explains the obvious difficulty we have reading fine print under low illumination. However, low illumination can also disrupt vision at low spatial frequencies: Note the loss of visibility that our driver suffered for the low spatial frequency pothole.

Two final influences on contrast sensitivity are the resolution of the eye itself and the dynamic characteristics of the viewing conditions. Increasing age reduces the amount of light passing through the cornea and greatly reduces the sensitivity. This factor, coupled with the loss of visual accommodation ability at close viewing, produces a severe deficit for older readers in poor illumination. Contrast sensitivity also declines when the stimulus is moving relative to the viewer, as our driver found when trying to read the highway sign.

All of these factors, summarized in Table 3, are critical for predicting whether or not detail will be perceived and shapes will be recognized in a variety of degraded viewing conditions, and hence these factors are critical for indirectly informing the designer of certain standards that should be adhered to in order to guarantee viewability of critical symbols.

TABLE 3 Some Variables That Affect Contrast and Visibility
Variable: Effect (Example)
↓ Contrast: ↓ Visibility (black print on gray)
↓ Illumination: ↓ Contrast sensitivity (reading a map in poor light)
Polarity: black on white better than white on black (designing viewgraphs)
Spatial frequency: optimum CS at 3 c/d (ideal size of text font given viewing distance)
Visual accommodation: ↓ CS (map reading during night driving)
Motion: ↓ CS (reading a road sign while moving)

Many of these standards may be found in handbooks like Boff and Lincoln (1988) or textbooks such as Salvendy (1997). Human factors researchers are also trying to develop models to show how all the influences in Table 3 interact in a way that would, for example, allow one to specify the minimum text size for presenting instructions to be viewed by someone with 20/40 vision in certain illumination or to determine the probability of recognizing targets at night at a particular distance (Owens et al., 1994). However, the accuracy of such models has not yet reached a point where they are readily applicable when several variables are involved. What can be done instead is to clearly identify how these factors influence the best design whenever print or symbols must be read under less than optimal circumstances. We describe some of these guidelines as they pertain to the readability of the printed word.

Reading Print.
Most obviously, print should not be too fine, in order to guarantee its readability. When space is not at a premium and viewing conditions may be less than optimal, one should seek to come as close to the 3 cycles/degree value as possible (i.e., stroke width of 1/6 degree of visual angle) to guarantee maximum readability. Fine print and very narrow stroke widths are dangerous choices. Similarly, one should maximize contrast by employing black letters on white background rather than, for example, using the "sexier" but less readable hued backgrounds (e.g., black on blue). Black on red is particularly dangerous with low illumination, since red is not seen by rods. Because of certain asymmetries in the visual processing system, dark text on lighter background ("negative contrast") also offers higher contrast sensitivity than light on dark ("positive contrast"). The disruptive tendency for white letters to spread out or "bleed" over a black background is called irradiation.

The actual character font matters too. Fonts that adhere to "typical" letter shapes like the text of this book are easier to read because of their greater familiarity than those that create block letters or other nonstandardized shapes. Another effect on readability is the case of the print. For single, isolated words, UPPERCASE appears to be as good as, if not better than, lowercase print, as, for example, the label of an "on" switch. This advantage results in part because of the wider visual angle and lower spatial frequency presented. However, for multiword text, UPPERCASE PRINT IS MORE DIFFICULT TO READ than lowercase or mixed-case text. This is because lowercase text typically offers a greater variety of word shapes. This variety conveys sensory information at lower spatial frequencies that can be used to discern some aspects of word meaning in parallel with the high spatial frequency analysis of the individual letters (Broadbent & Broadbent, 1980; Allen et al., 1995). BLOCKED WORDS IN ALL CAPITALS will eliminate the contributions of this lower spatial frequency channel. Other guidelines for text size and font type may be found in Sanders and McCormick (1993).

Color Sensation

Color vision is a facility employed in the well-illuminated environment. Our driver had trouble judging the color of his red sedan because of the poor illumination in the parking lot. A second characteristic that limits the effectiveness of color is that approximately 7 percent of the male population is color deficient; that is, they are unable to discriminate certain hues from each other. Most prevalent is red-green "color blindness" (protanopia), in which the wavelengths of these two hues create identical sensations if they are of the same luminance intensity. Many computer graphics packages use color to discriminate lines. If this is the only discriminating feature between lines, the graph may be useless for the color-blind reader or the reader of the paper passed through a monochrome photocopier. Because of these two important sensory limitations on color processing, a most important human factors guideline is to design for monochrome first (Shneiderman, 1987) and use color only as a redundant backup to signal important information. Thus, for example, a traffic signal uses the location of the illuminated lamp (top, middle, bottom) redundantly with its color to signal the important traffic command information.

Two additional characteristics of the sensory processing of color have some effect on its use.
Simultaneous contrast is the tendency of some hues to appear different when viewed adjacent to other hues (e.g., green will look deeper when viewed next to red than when viewed next to a neutral gray). This may affect the usability of multicolor-coded displays, like maps, as the number of colors grows large. The negative afterimage is a similar phenomenon to simultaneous contrast but describes the greater intensity of certain colors when viewed after prolonged viewing of other colors.

Night Vision

The loss of contrast sensitivity at all spatial frequencies can inhibit the perception of print as well as the detection and recognition of objects by their shape or color in poorly illuminated viewing conditions. Coupled with the loss of contrast sensitivity due to age, it is apparent that night driving for the older population is a hazardous undertaking, particularly in unfamiliar territory (Waller, 1991; Shinar & Schieber, 1991). Added to these hazards of night vision are those associated with glare, which may be defined as irrelevant light of high intensity. Beyond its annoyance and distraction properties, glare has the effect of temporarily destroying the rods' sensitivity to low spatial frequencies. Hence, the glare-subjected driver is less able to spot the dimly illuminated road hazard (the pothole or the darkly dressed pedestrian; Theeuwes et al., 2002).

BOTTOM-UP VERSUS TOP-DOWN PROCESSING

Up to now, we have discussed primarily the factors of the human visual system that affect the quality of the sensory information that arrives at the brain in order to be perceived. As shown in Figure 6, we may represent these influences as those that affect processing from the bottom (lower levels of stimulus processing) upward (toward the higher centers of the brain involved with perception and understanding). As examples, we may describe loss of acuity as a degradation in bottom-up processing or high contrast sensitivity as an enhancement of bottom-up processing. In contrast, an equally important influence on processing operates from the top downward. This is perception based on our knowledge (and desire) of what should be there. Thus, if I read the instructions, "After the procedure is completed, turn the system off," I need not worry as much if the last word happens to be printed in very small letters or is visible with low contrast, because I can pretty much guess what it will say.

FIGURE 6 The relation between bottom-up and top-down processing.

Much of our processing of perceptual information depends on the delicate interplay between top-down processing, signaling what should be there, and bottom-up processing, signaling what is there. Deficiencies in one (e.g., small, barely legible text) can often be compensated by the operation of the other (e.g., expectations of what the text should say). Our initial introduction to the interplay between these two modes of processing is in a discussion of depth perception, and the distinction between the two modes is amplified further in our treatment of signal detection.

DEPTH PERCEPTION

Humans navigate and manipulate in a three-dimensional (3-D) world, and we usually do so quite accurately and automatically (Gibson, 1979). Yet there are times when our ability to perceive where we and other things are in 3-D space breaks down.
Airplane pilots flying without using their instruments are also very susceptible to dangerous illusions of where they are in 3-D space and how fast they are moving (O'Hare & Roscoe, 1990; Hawkins & Orlady, 1993; Leibowitz, 1988). In order to judge our distance from objects (and the distance between objects) in 3-D space, we rely on a host of depth cues to inform us of how far away things are. The first three cues we discuss (accommodation, binocular convergence, and binocular disparity) are all inherent in the physiological structure and wiring of the visual sensory system. Hence, they may be said to operate on bottom-up processing.

Accommodation, as we have seen, occurs when an out-of-focus image triggers a change in lens shape to accommodate, or bring the image into focus on the retina. As shown in Figure 3, sensory receptors within the ciliary muscles that accomplish this change send signals to the higher perceptual centers of the brain that inform those centers how much accommodation was accomplished and hence the extent to which objects are close or far (within a range of about 3 m). (These signals from the muscles to the brain are called proprioceptive input.) Convergence is a corresponding cue based on the amount of inward rotation ("cross-eyedness") that the muscles in the eyeball must accomplish to bring an image to rest on corresponding parts of the retina in the two eyes. The closer the distance at which the image is viewed, the greater the amount of proprioceptive "convergence signal" sent to the higher brain centers by the sensory receptors within the muscles that control convergence. Binocular disparity, sometimes called stereopsis, is a depth cue that results because the closer an object is to the observer, the greater the amount of disparity there is between the views of the object received by each eyeball. Hence, the brain can use this disparity measure, computed at a location where the visual signals from the two eyes combine in the brain, to estimate how far away the object is.

All three of these bottom-up cues are only effective for judging distance, slant, and speed for objects that are within a few meters of the viewer (Cutting & Vishton, 1995). (However, stereopsis can be created in stereoscopic displays to simulate depth information at much greater distances.) Judgment of depth and distance for more distant objects and surfaces depends on a host of what are sometimes called "pictorial" cues because they are the kinds of cues that artists put into pictures to convey a sense of depth. Because the effectiveness of most pictorial cues is based on past experience, they are subject to top-down influences. As shown in Figure 7, some of the important pictorial cues to depth are the following:

Linear perspective: The converging of parallel lines (i.e., the road) toward the more distant points.

Relative size: A cue based on the knowledge that if two objects are the same true size (e.g., the two trucks in the figure), then the object that occupies a smaller visual angle (the more distant vehicle in the figure) is farther away.

Interposition: Nearer objects tend to obscure the contours of objects that are farther away (see the two buildings).

Light and shading: Three-dimensional objects tend to cast shadows and reveal reflections and shadows on themselves from illuminating light. These shadows provide evidence of their location and their 3-D form (Ramachandran, 1988).
FIGURE 7 Some pictorial depth cues. (Source: Wickens, C. D., 1992. Engineering Psychology and Human Performance. New York: HarperCollins. Reprinted by permission of Addison-Wesley Educational Publishers, Inc.)

Textural gradients: Any textured surface, viewed from an oblique angle, will show a gradient or change in texture density (spatial frequency) across the visual field (see the Illinois cornfield in the figure). The finer texture signals the more distant region, and the amount of texture change per unit of visual angle signals the angle of slant relative to the line of sight.

Relative motion, or motion parallax, describes the fact that more distant objects show relatively smaller movement across the visual field as the observer moves. Thus, we often move our head back and forth to judge the relative distance of objects. Relative motion also accounts for the accelerating growth in the retinal image size of things as we approach them in space, a cue sometimes called looming (Regan et al., 1986). We would perceive the vehicle in the left lane of the road in Figure 7 to be approaching because of its growing image size on the retina.

Collectively, these cues provide us with a very rich sense of our position and motion in 3-D space as long as the world through which we move is well illuminated and contains rich visual texture. Gibson (1979) clearly described how the richness of these cues in our natural environment supports very accurate space and motion perception. However, when cues are degraded, impoverished, or eliminated by darkness or other unusual viewing circumstances, depth perception can be distorted. This sometimes leads to dangerous circumstances. For example, a pilot flying at night or over an untextured snow cover has very poor visual cues to help determine where he or she is relative to the ground (O'Hare & Roscoe, 1990), so pilots must rely on precision flight instruments. Correspondingly, the implementation of both edge markers and high-angle lighting on highways greatly enriches the cues available for judging speed (changing position in depth) and the distance of hazards, and allows for safer driving.

Just as we may predict poorer performance in tasks that demand depth judgments when the quality of depth cues is impoverished, we can also predict that certain distortions of perception will occur when features of the world violate our expectations, and top-down processing takes over to give us an inappropriate perception. For example, Eberts and MacMillan (1985) established that the higher-than-average rate at which small cars are hit from behind results from the cue of relative size. A small car is perceived as more distant than it really is from the observer approaching it from the rear. Hence, a small car is approached faster (and braking begins later) than is appropriate, sometimes leading to the unfortunate collision.

Of course, clever application of human factors can sometimes turn these distortions to advantage, as in the case of the redesign of a dangerous traffic circle in Scotland (Denton, 1980). Drivers tended to overspeed when coming into the traffic circle, with a high accident rate as a consequence. In suggesting a solution, Denton decided to trick the driver's perceptual system by drawing lines of diminishing separation across the roadway as the circle was approached. Approaching the circle at a constant (and excessive) speed, the driver experiences the "flow" of texture past the vehicle as signaling an increase in speed (i.e., acceleration).
Because of the nearly automatic way in which many aspects of perception are carried out, the driver should instinctively brake in response to the perceived acceleration, bringing the speed closer to the desired safe value. This is exactly the effect that was observed in relation to driving behavior after the marked pavement was introduced, resulting in a substantial reduction in fatal accidents at the traffic circle, a result that has been sustained for several years (Godley, 1997).

VISUAL SEARCH AND DETECTION

A critical aspect of human performance in many systems concerns the closely linked processes of visual search and object or event detection. Our driver at the beginning of the chapter was searching for several things: the appropriate control for the wipers, the needed road sign, and of course any number of possible hazards or obstacles that could appear on the road (the pothole was one that was missed). The goal of these searches was to detect the object or event in question. These tasks are analogous to the kind of processes we go through when we search the phone book for the pizza delivery listing, search the index of this book for a needed topic, search a cluttered graph for a data point, or when the quality control inspector searches the product (say, a circuit board) for a flaw. In all cases, the search may or may not successfully end in a detection.

Despite the close link between visual search and detection, it is important to separate our treatment of these topics, both because different factors affect each and because human factors personnel are sometimes interested in detection when there is no search (e.g., the detection of a fire alarm). We consider the process of search itself, but to understand visual search, we must first consider the nature of eye movements, which are heavily involved in searching large areas of space. Then we consider the process of detection.

Eye Movements

Eye movements are necessary to search the visual field (Monty & Senders, 1976; Hallett, 1986). Eye movements can generally be divided into two major classes. Pursuit movements are those of constant velocity that are designed to follow moving targets, for example, following the rapid flight of an aircraft across the sky. More related to visual search are saccadic eye movements, which are abrupt, discrete movements from one location to the next. Each saccadic movement can be characterized by a set of three critical features: an initiation latency, a movement time (or speed), and a destination. Each destination, or dwell, can be characterized by both its dwell duration and a useful field of view (UFOV). In continuous search, the initiation latency and the dwell duration cannot be distinguished. The actual movement time is generally quite fast (typically less than 50 msec) and is not much greater for longer than for shorter movements. The greatest time is spent during dwells and initiations. These time limits are such that even in rapid search there are no more than about 3 to 4 dwells per second (Moray, 1986), and this frequency is usually lower because of variables that prolong the dwell. The destination of a scan is usually driven by top-down processes (i.e., expectancy; Senders, 1964), although on occasion a saccade may be drawn by salient bottom-up processes (e.g., a flashing light).
The dwell duration is governed jointly by two factors: (1) the information content of the item fixated (e.g., when reading, long words require longer dwells than short ones), and (2) the ease of information extraction, which is often influenced by stimulus quality (e.g., in target search, longer dwells on a degraded target). Finally, once the eyes have landed a saccade on a particular location, the useful field of view defines how large an area, surrounding the center of fixation, is available for information extraction (Sanders, 1970; Ball et al., 1988). The useful field of view defines the diameter of the region within which a target might be detected if it is present. The useful field of view should be carefully distinguished from the area of foveal vision, defined earlier in the chapter. Foveal vision defines a specific area of approximately 2 degrees of visual angle surrounding the center of fixation, which provides high visual acuity and low sensitivity. The diameter of the useful field of view, in contrast, is task-dependent. It may be quite small if the operator is searching for very subtle targets demanding high visual acuity but may be much larger than the fovea if the targets are conspicuous and can be easily detected in peripheral vision. Recent developments in technology have produced more efficient means of measuring eye movements with oculometers, which measure the orientation of the eyeball relative to an image plane and can therefore be used to infer the precise destination of a saccade.

Visual Search

The Serial Search Model. In describing a person searching any visual field for something, we distinguish between targets and nontargets (nontargets are sometimes called distractors). The latter may be thought of as "visual noise" that must be inspected in order to determine that it is not in fact the desired target. Many searches are serial in that each item is inspected in turn to determine whether it is or is not a target. If each inspection takes a relatively constant time, I, and the expected location of the target is unknown beforehand, then it is possible to predict the average time it will take to find the target as

T = (N x I)/2    (5)

where I is the average inspection time for each item, and N is the total number of items in the search field (Neisser et al., 1964). Because, on the average, the target will be encountered after half of the items have been inspected (sometimes earlier, sometimes later), the product (N x I) is divided by two. This serial search model has been applied to predicting performance in numerous environments in which people search through maps or lists, such as phone books or computer menus (Lee & MacGregor, 1985; Yeh & Wickens, 2001).

If the visual search space is organized coherently, people tend to search from top to bottom and left to right. However, if the space does not benefit from such organization (e.g., searching a map for a target or searching the ground below the aircraft for a downed airplane [Stager & Angus, 1978]), then people's searches tend to be considerably more random in structure and do not "exhaustively" examine all locations (Wickens, 1992; Stager & Angus, 1978). If targets are not readily visible, this nonexhaustive characteristic leads to a search-time function that looks like that shown in Figure 8 (Drury, 1975).

FIGURE 8 Predicted search success probability as a function of the time spent searching. (Source: Adapted from Drury, C., 1975. "Inspection of sheet metal: Models and data." Reprinted with permission from Human Factors, 17. Copyright 1975 by the Human Factors and Ergonomics Society.)

The figure suggests that there are diminishing returns associated with giving people too long to search a given area if time is at a premium. Drury has used such a model to define the optimum inspection time that people should be allowed to examine each image in a quality-control inspection task.
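A small sketch of the serial search model in equation 5, using an assumed per-item inspection time (the value is illustrative, not one given in the chapter):

```python
def mean_serial_search_time(n_items, inspection_time_s):
    """Expected time to find the target, T = (N * I) / 2 (equation 5)."""
    return (n_items * inspection_time_s) / 2.0

inspection_time = 0.3  # assumed seconds per item inspected
for n_items in (8, 16, 32, 64):
    t = mean_serial_search_time(n_items, inspection_time)
    print(f"{n_items:3d} items -> expected search time {t:4.1f} s")
# Doubling the number of items to be inspected doubles the expected
# search time, which is why cluttered displays are costly when search
# is serial rather than parallel.
```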
Search models can be extremely important in human factors (Brogan, 1993) for predicting search time in time-critical environments; for example, how long will a driver keep eyes off the highway to search for a road sign? Unfortunately, however, there are two important circumstances that can render the strict serial model inappropriate, one related to bottom-up processing and the other to top-down processing. Both factors force models of visual search to become more complex and less precise.

Conspicuity. The bottom-up influence is the conspicuity of the target. Certain targets are so conspicuous that they may "pop out" no matter where they are in the visual field, and so nontarget items need not be inspected (Yantis, 1993; Treisman, 1986). Psychologists describe the search for such targets as parallel because, in essence, all items are examined at once (i.e., in parallel), and in contrast to equation 5, search time does not increase with the total number of items. Such is normally the case with "attention grabbers," such as a flashing warning signal, a moving target, or a uniquely colored, highlighted item on a checklist, a computer screen, or in a phone book.

Conspicuity is a desirable property if the task requires the target to be processed, but an undesirable one if the conspicuous item is irrelevant to the task at hand. Thus, if I am designing a checklist that highlights emergency items in red, this may help the operator in responding to emergencies but will be a distraction if the operator is using the list to guide normal operating instructions; that is, it will be more difficult to focus attention on the normal instructions. As a result of these dual consequences of conspicuity, the choice of highlighting (and the effectiveness of its implementation) must be guided by a careful analysis of the likelihood that the user will need the highlighted item as a target (Fisher & Tan, 1989). Table 4 lists some key variables that can influence the conspicuity of targets and, therefore, the likelihood that the field in which they are embedded will be searched in parallel.

Expectancies. The second influence on visual search that leads to departures from the serial model has to do with the top-down implications of searcher expectancies of where the target might be likely to lie. Expectancies, like all top-down processes, are based upon prior knowledge. Our driver did not expect to see the road sign on the left of the highway and, as a result, only found it after it was too late. As another example, when searching a phone book we do not usually blanket the entire page with fixations, but our knowledge of the alphabet allows us to start the search near or around the spelling of the target name. Similarly, when searching an index, we often have an idea what the topic is likely to be called, which guides our starting point. It is important to realize that these expectancies, like all knowledge, come only with experience.
Hence, we might predict that the skilled operator will have more top-down processes driving visual search than the unskilled one and as a result will be more efficient, a conclusion borne out by research (Parasuraman, 1986). These top-down influences also provide guidance for designers who develop search fields, such as indexes and menu pages, to understand the subjective orderings and groupings of the items that users have.

TABLE 4 Target Properties Inducing Parallel Search
1. Discriminability from background elements:
   a. In color (particularly if nontarget items are uniformly colored)
   b. In size (particularly if the target is larger)
   c. In brightness (particularly if the target is brighter)
   d. In motion (particularly if background is stationary)
2. Simplicity: Can the target be defined by only one dimension (i.e., "red") and not several (i.e., "red and small")?
3. Automaticity: a target that is highly familiar (e.g., one's own name)
Note that unique shapes (e.g., letters, numbers) do not generally support parallel search (Treisman, 1986).

Conclusion. In conclusion, research on visual search has four general implications, all of which are important in system design.

1. Knowledge of conspicuity effects can lead the designer to try to enhance the visibility of target items (consider, for example, reflective jogging suits [Owens et al., 1994] or highlighting critical menu items). In dynamic displays, automation can highlight critical targets to be attended by the operator (Yeh & Wickens, 2001b; Dzindolet et al., 2002).

2. Knowledge of the serial aspects of many visual search processes should forewarn the designer about the costs of cluttered displays (or search environments). Many maps, for example, present an extraordinary amount of clutter when too much information is included. For electronic displays, this fact should lead to consideration of decluttering options in which certain categories of information can be electronically turned off or deintensified (Mykityshyn et al., 1994; Stokes et al., 1990; Yeh & Wickens, 2001a). However, careful use of color and intensity as discriminating cues between different classes of information can make decluttering unnecessary (Yeh & Wickens, 2001a).

3. Knowledge of the role of top-down processing in visual search should lead the designer to make the structure of the search field as apparent to the user as possible and consistent with the user's knowledge (i.e., past experience). For verbal information, this may involve an alphabetical organization or one based on the semantic similarity of items. In positioning road signs, this involves the use of consistent placement.

4. Knowledge of all of these influences can lead to the development of models of visual search that will predict how long it will take to find particular targets, such as the flaw in a piece of sheet metal (Drury, 1975), an item on a computer menu (Lee & MacGregor, 1985; Fisher & Tan, 1989), or a traffic sign by a highway (Theeuwes, 1994). For visual search, however, the major challenge of such models resides in the fact that search appears to be guided much more by top-down than by bottom-up processes (Theeuwes, 1994), and developing precise mathematical terms to characterize the level of expertise necessary to support top-down processing is a major challenge.

Detection

Once a possible target is located in visual search, it becomes necessary to confirm that it really is the item of interest (i.e., detect it).
This process may be trivial if the target is well known and reasonably visible (e.g., the name on a list), but it is far from trivial if the target is degraded, like a faint flaw in a piece of sheet metal, a small crack in an x-rayed bone, or the faint glimmer of the lighthouse on the horizon at sea. In these cases, we must describe the operator's ability to detect signals. Signal detection is often critical even when there is no visual search at all. For example, the quality-control inspector may have only one place to look to examine the product for a defect. Similarly, human factors is concerned with detection of auditory signals, like the warning sound in a noisy industrial plant, when search is not at all relevant.

Signal Detection Theory. In any of a variety of tasks, the process of signal detection can be modeled by signal detection theory (SDT) (Green & Swets, 1988; Swets, 1996; T. D. Wickens, 2002), which is represented schematically in Figure 9. SDT assumes that "the world" (as it is relevant to the operator's task) can be modeled as being in one of two states: the "signal" to be detected is either present or absent, as shown across the top of the matrix in Figure 9. Whether the signal is present or absent, the world is assumed to contain noise: Thus, the luggage inspected by the airport security guard may contain a weapon (signal) in addition to a number of things that might look like weapons (i.e., the noise of hair blowers, calculators, carabiners, etc.), or it may contain the noise alone, with no signal. The goal of the operator in detecting signals is to discriminate signals from noise. Thus, we may describe the relevant behavior of the observer as that represented by the two rows of Figure 9: saying "Yes (I see a signal)" or "No (there is only noise)." This combination of two states of the world and two responses yields four joint events, shown as the four cells of the figure labeled hits, false alarms, misses, and correct rejections. Two of these cells (hits and correct rejections) clearly represent "good" outcomes and ideally should characterize much of the performance, while two are "bad" (misses and false alarms) and ideally should never occur.

FIGURE 9 Representation of the outcomes in signal detection theory. The figure shows how changes in the four joint events within the matrix influence the primary performance measures of response bias and sensitivity, shown at the bottom.

If several encounters with the state of the world (signal detection trials) are aggregated, some involving signals and some involving noise alone, we may then express the numbers within each cell as the probability of a hit [#hits/#signals = p(hit)]; the probability of a miss [1 - p(hit)]; the probability of a false alarm [#FA/#no-signal encounters]; and the probability of a correct rejection [1 - p(FA)]. As you can see from these equations, if the values of p(hit) and p(FA) are measured, then the other two cells contain entirely redundant information.
Thus, the data from a signal detection environment (e.g., the performance of an airport security inspector) may easily be represented in the form of the matrix shown in Figure 9, if a large number of trials are observed so that the probabilities can be reliably estimated. However, SDT considers these same numbers in terms of two fundamentally different influences on performance: the sensitivity of the observer and the response bias.
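As a brief sketch of how the four cells of Figure 9 reduce to the two probabilities just described, the code below tallies hypothetical inspection trials and reports p(hit) and p(FA); the counts are invented for illustration.

```python
# Hypothetical outcome counts for an inspector over many trials:
hits, misses = 45, 5                          # trials on which a signal was present
false_alarms, correct_rejections = 12, 138    # trials containing noise only

signal_trials = hits + misses
noise_trials = false_alarms + correct_rejections

p_hit = hits / signal_trials
p_fa = false_alarms / noise_trials

print(f"p(hit) = {p_hit:.2f}   p(miss) = {1 - p_hit:.2f}")
print(f"p(FA)  = {p_fa:.2f}   p(CR)   = {1 - p_fa:.2f}")
# As the text notes, p(miss) and p(CR) are redundant once p(hit) and
# p(FA) are known, so the whole matrix is summarized by two numbers.
```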
