Perception: Visual Systems & Cognitive Psychology
Summary
This textbook excerpt discusses visual perception: the visual system and the brain, photoreceptors, lateral inhibition, visual coding and single-cell recording, parallel processing (including the what and where pathways), the binding problem, and the beginning of a discussion of visual maps and firing synchrony.
Full Transcript
what if… You look around the world and instantly, effortlessly, recognize the objects that surround you — words on this page, objects in the room where you’re sitting, things you can view out the window. Perception, in other words, seems fast, easy, and automatic. But even so, there is complexity here, and your ability to perceive the world depends on many separate and individually complicated processes.

Consider the disorder akinetopsia (Zeki, 1991). This condition is rare, and much of what we know comes from a single patient — L.M. — who developed this disorder because of a blood clot in her brain, at age 43. L.M. was completely unable to perceive motion — even though other aspects of her vision (e.g., her ability to recognize objects, to see color, or to discern detail in a visual pattern) seemed normal. Because of her akinetopsia, L.M. can detect that an object now is in a position different from its position a moment ago, but she reports seeing “nothing in between.”

As a way of capturing this experience, think about what you see when you’re looking at really slow movement. If, for example, you stare at the hour hand on a clock as it creeps around the clock face, you cannot discern its motion. But you can easily see the hand is now pointing, say, at the 4, and if you come back a while later, you can see that it’s closer to the 5. In this way, you can infer motion from the change in position, but you can’t perceive the motion. This is your experience with very slow movement; L.M., suffering from akinetopsia, has the same experience with all movement.

What’s it like to have this disorder? L.M. complained, as one concern, that it was hard to cross the street because she couldn’t tell which of the cars in view were moving and which ones were parked. (She eventually learned to estimate the position and movement of traffic by listening to cars’ sounds as they approached, even though she couldn’t see their movement.)
Other problems caused by akinetopsia are more surprising. For example, L.M. complained about difficulties in following conversations, because she was essentially blind to the speaker’s lip movement or changing facial expressions. She also felt insecure in social settings. If more than two people were moving around in a room, she felt anxious because “people were suddenly here or there, but [she had] not seen them moving” (Zihl, von Cramon, & Mai, 1983, p. 315). Or, as a different example: She had trouble in everyday activities like pouring a cup of coffee. She couldn’t see the fluid level’s gradual rise as she poured, so she didn’t know when to stop pouring. For her, “the fluid appeared to be frozen, like a glacier” (Zihl et al., 1983, p. 315; also Schenk, Ellison, Rice, & Milner, 2005; Zihl, von Cramon, Mai, & Schmid, 1991).

Preview of Chapter Themes

- We explore vision — humans’ dominant sensory modality. We discuss the mechanisms through which the visual system detects patterns in the incoming light, but we also showcase the activity of the visual system in interpreting and shaping the incoming information.
- We also highlight the ways in which perception of one aspect of the input is shaped by perception of other aspects — so that the detection of simple features depends on how the overall form is organized, and the perception of size depends on the perceived distance of the target object.
- We emphasize that the interpretation of the visual input is usually accurate — but the same mechanisms can lead to illusions, and the study of those illusions can often illuminate the processes through which perception functions.

We will have more to say about cases of disordered perception later in the chapter. For now, though, let’s note the specificity of this disorder — a disruption of movement perception, with other aspects of perception still intact.
Let’s also highlight the important point that each of us is, in countless ways, dependent on our perceptual contact with the world. That point demands that we ask: What makes this perception possible?

The Visual System

You receive information about the world through various sensory modalities: You hear the sound of the approaching train, you smell the freshly baked bread, you feel the tap on your shoulder. Researchers have made impressive progress in studying all of these modalities, and students interested in, say, hearing or the sense of smell will find a course in (or a book about) sensation and perception to be fascinating. There’s no question, though, that for humans vision is the dominant sense. This is reflected in how much brain area is devoted to vision compared to any of the other senses. It’s also reflected in many aspects of our behavior. For example, if visual information conflicts with information received from other senses, you usually place your trust in vision. This is the basis for ventriloquism, in which you see the dummy’s mouth moving while the sounds themselves are coming from the dummy’s master. Vision wins out in this contest, and so you experience the illusion that the voice is coming from the dummy.

The Photoreceptors

How does vision operate? The process begins, of course, with light. Light is produced by many objects in our surroundings — the sun, lamps, candles — and then reflects off other objects. In most cases, it’s this reflected light — reflected from this book page or from a friend’s face — that launches the processes of visual perception. Some of this light hits the front surface of the eyeball, passes through the cornea and the lens, and then hits the retina, the light-sensitive tissue that lines the back of the eyeball (see Figure 3.1). The cornea and lens focus the incoming light, just as a camera lens might, so that a sharp image is cast onto the retina.
Adjustments in this process can take place because the lens is surrounded by a band of muscle. When the muscle tightens, the lens bulges somewhat, creating the proper shape for focusing the images cast by nearby objects. When the muscle relaxes, the lens returns to a flatter shape, allowing the proper focus for objects farther away.

On the retina, there are two types of photoreceptors — specialized neural cells that respond directly to the incoming light. One type, the rods, are sensitive to very low levels of light and so play an essential role whenever you’re moving around in semidarkness or trying to view a fairly dim stimulus.

FIGURE 3.1 THE HUMAN EYE. Light enters the eye through the cornea, and the cornea and lens refract the light rays to produce a sharply focused image on the retina. The iris can open or close to control the amount of light that reaches the retina. The retina is made up of three main layers: the rods and cones, which are the photoreceptors; the bipolar cells; and the ganglion cells, whose axons make up the optic nerve.

FIGURE 3.2 RODS AND CONES. (Panel A) Rods and cones are the light-sensitive cells at the back of the retina that launch the neural process of vision. In this (colorized) photo, cones appear green; rods appear brown. (Panel B) Distribution of photoreceptors. Cones are most frequent at the fovea, and the number of cones drops off sharply as we move away from the fovea. In contrast, there are no rods at all on the fovea. There are neither rods nor cones at the retina’s blind spot — the position at which the neural fibers that make up the optic nerve exit the eyeball. Because this position is filled with these fibers, there’s no space for any rods or cones.
But the rods are also color-blind: They can distinguish different intensities of light (and in that way contribute to your perception of brightness), but they provide no means of discriminating one hue from another (see Figure 3.2).

Cones, in contrast, are less sensitive than rods and so need more incoming light to operate at all. But cones are sensitive to color differences. More precisely, there are three different types of cones, each having its own pattern of sensitivities to different wavelengths (see Figure 3.3). You perceive color, therefore, by comparing the outputs from these three cone types. Strong firing from only the cones that prefer short wavelengths, for example, accompanied by weak (or no) firing from the other cone types, signals purple. Blue is signaled by equally strong firing from the cones that prefer short wavelengths and those that prefer medium wavelengths, with only modest firing by cones that prefer long wavelengths. And so on, with other patterns of firing, across the three cone types, corresponding to different perceived hues.

FIGURE 3.3 WAVELENGTHS OF LIGHT. The physics of light are complex, but for many purposes light can be thought of as a wave (Panel A), and the shape of the wave can be described in terms of its amplitude and its wavelength (i.e., the distance from “crest” to “crest”). The wavelengths our visual system can sense are only a tiny part of the broader electromagnetic spectrum (Panel B). Light with a wavelength longer than 750 nanometers is invisible to us, although we feel these longer infrared waves as heat. Ultraviolet light, which has a wavelength shorter than 360 nanometers, is also invisible to us. That leaves the narrow band of wavelengths between 750 and 360 nanometers — the so-called visible spectrum. Within this spectrum, we usually see wavelengths close to 400 nanometers as violet, those close to 700 nanometers as red, and those in between as the rest of the colors in the rainbow.

Cones have another function: They enable you to discern fine detail. The ability to see fine detail is referred to as acuity, and acuity is much higher for the cones than it is for the rods. This explains why you point your eyes toward a target whenever you want to perceive it in detail. What you’re actually doing is positioning your eyes so that the image of the target falls onto the fovea, the very center of the retina. Here, cones far outnumber rods (and, in fact, the center of the fovea has no rods at all). As a result, this is the region of the retina with the greatest acuity.

In portions of the retina more distant from the fovea (i.e., portions of the retina in the so-called visual periphery), the rods predominate; well out into the periphery, there are no cones at all. This distribution of photoreceptors explains why you’re better able to see very dim lights out of the corner of your eyes. Psychologists have understood this point for at least a century, but the key observation here has a much longer history. Sailors and astronomers have known for hundreds of years that when looking at a barely visible star, it’s best not to look directly at the star’s location. By looking slightly away from the star, they ensured that the star’s image would fall outside of the fovea and onto a region of the retina dense with the more light-sensitive rods.

Lateral Inhibition

Rods and cones do not report directly to the cortex. Instead, the photoreceptors stimulate bipolar cells, which in turn excite ganglion cells. The ganglion cells are spread uniformly across the entire retina, but all of their axons converge to form the bundle of nerve fibers that we call the optic nerve.
This is the nerve tract that leaves the eyeball and carries information to various sites in the brain. The information is sent first to a way station in the thalamus called the lateral geniculate nucleus (LGN); from there, information is transmitted to the primary projection area for vision, in the occipital lobe.

Let’s be clear, though, that the optic nerve is not just a cable that conducts signals from one site to another. Instead, the cells that link retina to brain are already analyzing the visual input. One example lies in the phenomenon of lateral inhibition, a pattern in which cells, when stimulated, inhibit the activity of neighboring cells. To see why this is important, consider two cells, each receiving stimulation from a brightly lit area (see Figure 3.4). One cell (Cell B in the figure) is receiving its stimulation from the middle of the lit area. It is intensely stimulated, but so are its neighbors (including Cell A and Cell C). As a result, all of these cells are active, and therefore each one is trying to inhibit its neighbors. The upshot is that the activity level of Cell B is increased by the stimulation but decreased by the lateral inhibition it’s receiving from Cells A and C. This combination leads to only a moderate level of activity in Cell B. In contrast, another cell (Cell C in the figure) is receiving its stimulation from the edge of the lit area. It is intensely stimulated, and so are its neighbors on one side. Therefore, this cell will receive inhibition from one side but not from the other (in the figure: inhibition from Cell B but not from Cell D), so it will be less inhibited than Cell B (which is receiving inhibition from both sides). Thus, Cells B and C initially receive the same input, but C is less inhibited than B and so will end up firing more strongly than B.
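The arithmetic of this process can be sketched as a toy simulation. The input values and the inhibition strength below are made-up numbers, not physiology; they are chosen only to reproduce the qualitative pattern just described, in which cells at the edge of a lit region end up firing differently from their same-input neighbors.

```python
# Toy simulation of lateral inhibition. Each cell's output equals its own
# input minus a fixed fraction of each immediate neighbor's input.
# All numbers are hypothetical, chosen only to illustrate the pattern.

def responses_with_inhibition(inputs, weight=0.125):
    out = []
    for i, stimulation in enumerate(inputs):
        left = inputs[i - 1] if i > 0 else 0
        right = inputs[i + 1] if i < len(inputs) - 1 else 0
        out.append(stimulation - weight * (left + right))
    return out

# Cells A, B, C sit under a bright region; cells D, E, F under a dim one.
cells = responses_with_inhibition([100, 100, 100, 20, 20, 20])
print(cells)  # → [87.5, 75.0, 85.0, 5.0, 15.0, 17.5]
# Cell C (85.0) out-fires Cell B (75.0), and Cell D (5.0) under-fires
# Cell E (15.0): the contrast at the edge is exaggerated.
```

This same subtraction of neighboring activity is what produces the edge enhancement, and the Mach-band illusion, discussed next.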
FIGURE 3.4 LATERAL INHIBITION. Cell B receives strong inhibition from all its neighbors, because its neighbors are intensely stimulated. Cell C, in contrast, receives inhibition only from one side (because its neighbor on the other side, Cell D, is only moderately stimulated). As a result, Cells B and C start with the same input, but Cell C, receiving less inhibition, sends a stronger signal to the brain, emphasizing the edge in the stimulus. The same logic applies to Cells D and E, and it explains why Cell D sends a weaker signal to the brain. Note, by the way, that the spikes-per-second numbers, shown in the figure, are hypothetical and intended only to illustrate lateral inhibition’s effects.

Notice that the pattern of lateral inhibition highlights a surface’s edges, because the response of cells detecting the edge of the surface (such as Cell C) will be stronger than that of cells detecting the middle of the surface (such as Cell B). For that matter, by increasing the response by Cell C and decreasing the response by Cell D, lateral inhibition actually exaggerates the contrast at the edge — a process called edge enhancement. This process is of enormous importance, because it’s obviously highlighting the information that defines an object’s shape — information essential for figuring out what the object is. And let’s emphasize that this edge enhancement occurs at a very early stage of the visual processing. In other words, the information sent to the brain isn’t a mere copy of the incoming stimulation; instead, the steps of interpretation and analysis begin immediately, in the eyeball. (For a demonstration of an illusion caused by this edge enhancement — the so-called Mach bands — see Figure 3.5.)

FIGURE 3.5 MACH BANDS. Edge enhancement, produced by lateral inhibition, helps us to perceive the outline that defines an object’s shape. But the same process can produce illusions — including the Mach bands. Each vertical strip in this figure is of uniform light intensity, but the strips don’t appear uniform. For each strip, contrast makes the left edge (next to its darker neighbor) look brighter than the rest, while the right edge (next to its lighter neighbor) looks darker. To see that the differences are illusions, try placing a thin object (such as a toothpick or a straightened paper clip) on top of the boundary between strips. With the strips separated in this manner, the illusion disappears.

TEST YOURSELF
1. What are the differences between rods and cones? What traits do these cells share?
2. What is lateral inhibition? How does it contribute to edge perception?

Visual Coding

In Chapter 2, we introduced the idea of coding in the nervous system. This term refers to the relationship between activity in the nervous system and the stimulus (or idea or operation) that is somehow represented by that activity. In the study of perception, we can ask: What’s the code through which neurons (or groups of neurons) manage to represent the shapes, colors, sizes, and movements that you perceive?

Single Neurons and Single-Cell Recording

Part of what we know about the visual system — actually, part of what we know about the entire brain — comes from a technique called single-cell recording. As the name implies, this is a procedure through which investigators can record, moment by moment, the pattern of electrical changes within a single neuron.
We mentioned in Chapter 2 that when a neuron fires, each response is the same size; this is the all-or-none law. But neurons can vary in how often they fire, and when investigators record the activity of a single neuron, what they’re usually interested in is the cell’s firing rate, measured in “spikes per second.” The investigator can then vary the circumstances (either in the external world or elsewhere in the nervous system) in order to learn what makes the cell fire more and what makes it fire less. In this way, we can figure out what job the neuron does within the broad context of the entire nervous system.

The technique of single-cell recording has been used with enormous success in the study of vision. In a typical procedure, the animal being studied is first immobilized. Then, electrodes are placed just outside a neuron in the animal’s optic nerve or brain. Next, a computer screen is placed in front of the animal’s eyes, and various patterns are flashed on the screen: circles, lines at various angles, or squares of various sizes at various positions. Researchers can then ask: Which patterns cause that neuron to fire? To what visual inputs does that cell respond?

By analogy, we know that a smoke detector is a smoke detector because it “fires” (i.e., makes noise) when smoke is on the scene. We know that a motion detector is a motion detector because it “fires” when something moves nearby. But what kind of detector is a given neuron? Is it responsive to any light in any position within the field of view? In that case, we might call it a “light detector.” Or is it perhaps responsive only to certain shapes at certain positions (and therefore is a “shape detector”)? With this logic, we can map out precisely what the cell responds to — what kind of detector it is. More formally, this procedure allows us to define the cell’s receptive field — that is, the size and shape of the area in the visual world to which that cell responds.
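The reasoning behind this mapping procedure is easy to express in code. In the sketch below, the spike timestamps and the three stimuli are invented for illustration; the point is only the method: compute a firing rate for each stimulus and see which stimulus drives the cell hardest.

```python
# A sketch of the logic behind single-cell recording. The spike timestamps
# (in seconds) and the stimuli are hypothetical, invented for illustration.

def firing_rate(spike_times, duration):
    """Firing rate in spikes per second over a recording of given duration."""
    return len(spike_times) / duration

recordings = {  # hypothetical 2-second recordings, one per stimulus
    "spot of light": [0.3, 0.9, 1.5],
    "vertical line": [0.1, 0.2, 0.4, 0.5, 0.7, 0.9, 1.1, 1.3, 1.6, 1.8],
    "horizontal line": [0.6, 1.4],
}

rates = {stim: firing_rate(times, duration=2.0)
         for stim, times in recordings.items()}
best = max(rates, key=rates.get)
print(rates)  # spikes per second for each stimulus
print(best)   # → vertical line  (this cell acts like a vertical-line detector)
```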
TORSTEN WIESEL AND DAVID HUBEL. Much of what we know about the visual system is based on the pioneering work done by David Hubel and Torsten Wiesel. This pair of researchers won the 1981 Nobel Prize for their discoveries. (They shared the Nobel with Roger Sperry for his independent research on the cerebral hemispheres.)

Multiple Types of Receptive Fields

In 1981, the neurophysiologists David Hubel and Torsten Wiesel were awarded the Nobel Prize for their exploration of the mammalian visual system (e.g., Hubel & Wiesel, 1959, 1968). They documented the existence of specialized neurons within the brain, each of which has a different type of receptive field, a different kind of visual trigger. For example, some neurons seem to function as “dot detectors.” These cells fire at their maximum rate when light is presented in a small, roughly circular area in a specific position within the field of view. Presentations of light just outside of this area cause the cell to fire at less than its usual “resting” rate, so the input must be precisely positioned to make this cell fire. Figure 3.6 depicts such a receptive field.

FIGURE 3.6 CENTER-SURROUND CELLS. Some neurons in the visual system have receptive fields with a “center-surround” organization. Panels A through D show the firing frequency for one of those cells. (A) This graph shows the cell’s firing rate when no stimulus is presented. (B) The cell’s firing rate goes up when a stimulus is presented in the middle of the cell’s receptive field. (C) In contrast, the cell’s firing rate goes down if a stimulus is presented at the edge of the cell’s receptive field. (D) If a stimulus is presented both to the center of the receptive field and to the edge, the cell’s firing rate does not change from its baseline level.
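The response pattern in Figure 3.6 can be captured by a toy model in which light in the center adds to a baseline firing rate and light in the surround subtracts from it. The baseline of 10 spikes per second and the unit weighting are assumptions for illustration, not measurements.

```python
# Toy model of a center-surround cell: light in the center pushes firing
# above baseline, light in the surround pushes it below, and both together
# cancel. Baseline rate and weights are hypothetical.

def center_surround_rate(center_light, surround_light, baseline=10.0):
    return baseline + center_light - surround_light

print(center_surround_rate(0, 0))  # → 10.0  (no stimulus: baseline firing)
print(center_surround_rate(5, 0))  # → 15.0  (center lit: excitation)
print(center_surround_rate(0, 5))  # → 5.0   (surround lit: inhibition)
print(center_surround_rate(5, 5))  # → 10.0  (uniform light: no net change)
```

The last line is the key property: for this cell, a strong uniform stimulus is indistinguishable from no stimulus at all.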
These cells are often called center-surround cells, to mark the fact that light presented to the central region of the receptive field has one influence, while light presented to the surrounding ring has the opposite influence. If both the center and the surround are strongly stimulated, the cell will fire neither more nor less than usual. For this cell, a strong uniform stimulus is equivalent to no stimulus at all.

FIGURE 3.7 ORIENTATION-SPECIFIC VISUAL FIELDS. Some cells in the visual system fire only when the input contains a line segment at a certain orientation. For example, one cell might fire very little in response to a horizontal line, fire only occasionally in response to a diagonal, and fire at its maximum rate only when a vertical line is present. In this figure, the circles show the stimulus that was presented. The right side shows records of neural firing. Each vertical stroke represents a firing by the cell; the left–right position reflects the passage of time. (After Hubel, 1963)

Other cells fire at their maximum only when a stimulus containing an edge of just the right orientation appears within their receptive fields. These cells, therefore, can be thought of as “edge detectors.” Some of these cells fire at their maximum rate when a horizontal edge is presented; others, when a vertical edge is in view; still others fire at their maximum to orientations in between horizontal and vertical. Note, though, that in each case, these orientations merely define the cells’ “preference,” because these cells are not oblivious to edges of other orientations. If a cell’s preference is for, say, horizontal edges, then the cell will still respond to other orientations — but less strongly than it does for horizontals.
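A cell’s orientation “preference” can likewise be captured with a toy tuning curve. The cosine shape and the peak rate of 20 spikes per second below are assumptions chosen for illustration; the point is only that firing falls off as an edge rotates away from the preferred orientation.

```python
import math

# Toy tuning curve for an "edge detector." The cosine shape and peak rate
# are hypothetical; real tuning curves vary from cell to cell.

def tuned_rate(edge_deg, preferred_deg, peak=20.0):
    # Orientation repeats every 180 degrees, hence the factor of 2.
    diff = math.radians(edge_deg - preferred_deg)
    return peak * max(0.0, math.cos(2 * diff))

# A cell that prefers horizontal edges (0 degrees):
for angle in (0, 30, 60, 90):
    print(angle, round(tuned_rate(angle, preferred_deg=0), 1))
# → 0 20.0 / 30 10.0 / 60 0.0 / 90 0.0
```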
Specifically, the farther the edge is from the cell’s preferred orientation, the weaker the firing will be, and edges sharply different from the cell’s preferred orientation (e.g., a vertical edge for a cell that prefers horizontal) will elicit virtually no response (see Figure 3.7).

Other cells, elsewhere in the visual cortex, have receptive fields that are more specific. Some cells fire maximally only if an angle of a particular size appears in their receptive fields; others fire maximally in response to corners and notches. Still other cells appear to be “movement detectors” and fire strongly if a stimulus moves, say, from right to left across the cell’s receptive field. Other cells favor left-to-right movement, and so on through the various possible directions of movement.

Parallel Processing in the Visual System

This proliferation of cell types highlights another important principle — namely, that the visual system relies on a “divide and conquer” strategy, with different types of cells, located in different areas of the cortex, each specializing in a particular kind of analysis. This pattern is plainly evident in Area V1, the site on the occipital lobe where axons from the LGN first reach the cortex (see Figure 3.8). In this brain area, some cells fire to (say) horizontals in this position in the visual world, others to horizontals in that position, others to verticals in specific positions, and so on.

FIGURE 3.8 AREA V1 IN THE HUMAN BRAIN. Area V1 is the site on the occipital lobe where axons from the LGN first reach the cortex. The top panel shows the brain as if sliced vertically down the middle, revealing the “inside” surface of the brain’s right hemisphere. The bottom panel shows the left hemisphere of the brain viewed from the side. As the two panels show, most of Area V1 is located on the cortical surface between the two cerebral hemispheres.

The full ensemble of cells in this area provides a detector for every possible stimulus, making certain that no matter what the input is or where it’s located, some cell will respond to it.

The pattern of specialization is also evident when we consider other brain areas. Figure 3.9, for example, reflects one summary of the brain areas known to be involved in vision. The details of the figure aren’t crucial, but it is noteworthy that some of these areas (V1, V2, V3, V4, PO, and MT) are in the occipital cortex; other areas are in the parietal cortex; others are in the temporal cortex. (We’ll have more to say in a moment about these areas outside of the occipital cortex.) Most important, each area seems to have its own function. Neurons in Area MT, for example, are acutely sensitive to direction and speed of movement. (This area is the brain region that has suffered damage in cases involving akinetopsia.) Cells in Area V4 fire most strongly when the input is of a certain color and a certain shape.

FIGURE 3.9 THE VISUAL PROCESSING PATHWAYS. Each box in this figure refers to a specific location within the visual system. Notice that vision depends on many brain sites, each performing a specialized type of analysis. Note also that the flow of information is complex, so there’s no strict sequence of “this step” of analysis followed by “that step.” Instead, everything happens at once, with a great deal of back-and-forth communication among the various elements.

Let’s also emphasize that all of these specialized areas are active at the same time, so that (for example) cells in Area MT are detecting movement in the visual input at the same time that cells in Area V4 are detecting shapes. In other words, the visual system relies on parallel processing — a system in which many different steps (in this case, different kinds of analysis) are going on simultaneously. (Parallel processing is usually contrasted with serial processing, in which steps are carried out one at a time — i.e., in a series.)

One advantage of this simultaneous processing is speed: Brain areas trying to discern the shape of the incoming stimulus don’t need to wait until the motion analysis or the color analysis is complete. Instead, all of the analyses go forward immediately when the input appears before the eyes, with no waiting time.

Another advantage of parallel processing is the possibility of mutual influence among multiple systems. To see why this matters, consider the fact that sometimes your interpretation of an object’s motion depends on your understanding of the object’s three-dimensional shape. This suggests that it might be best if the perception of shape happened first. That way, you could use the results of this processing step as a guide to later analyses. In other cases, though, the relationship between shape and motion is reversed. In these cases, your interpretation of an object’s three-dimensional shape depends on your understanding of its motion. To allow for this possibility, it might be best if the perception of motion happened first, so that it could guide the subsequent analysis of shape. How does the brain deal with these contradictory demands? Parallel processing provides the answer. Since both sorts of analysis go on simultaneously, each type of analysis can be informed by the other. Put differently, neither the shape-analyzing system nor the motion-analyzing system gets priority. Instead, the two systems work concurrently and “negotiate” a solution that satisfies both systems (Van Essen & DeYoe, 1995).

Parallel processing is easy to document throughout the visual system.
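As a loose software analogy (not a model of cortex), the contrast between parallel and serial processing can be illustrated by launching two stand-in analyses concurrently, so that neither waits for the other. The two functions here are hypothetical placeholders, not models of Area MT or Area V4.

```python
from concurrent.futures import ThreadPoolExecutor

# A loose software analogy for parallel processing: two "analyses" run at
# the same time rather than one waiting for the other. The functions are
# hypothetical stand-ins for illustration only.

def analyze_shape(frames):
    return "cylinder"        # pretend result of shape analysis

def analyze_motion(frames):
    return "moving left"     # pretend result of motion analysis

frames = ["frame 1", "frame 2"]  # stand-in for the incoming visual input

with ThreadPoolExecutor() as pool:
    shape = pool.submit(analyze_shape, frames)    # both analyses are
    motion = pool.submit(analyze_motion, frames)  # launched immediately
    print(shape.result(), "|", motion.result())   # → cylinder | moving left
```

In a serial design, by contrast, `analyze_motion` could not begin until `analyze_shape` had returned, and neither analysis could inform the other midstream.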
As we’ve seen, the retina contains two types of specialized receptors (rods and cones), each doing its own job (e.g., the rods detecting stimuli in the periphery of your vision and stimuli presented at low light levels, and the cones detecting hues and detail at the center of your vision). Both types of receptors function at the same time — another case of parallel processing. Likewise, within the optic nerve itself, there are two types of cells, P cells and M cells. The P cells provide the main input for the LGN’s parvocellular cells and appear to be specialized for spatial analysis and the detailed analysis of form. M cells provide the input for the LGN’s magnocellular cells and are specialized for the detection of motion and the perception of depth.1 And, again, both of these systems are functioning at the same time — more parallel processing.

1. The names here refer to the relative sizes of the relevant cells: parvo derives from the Latin word for “small,” and magno from the word for “large.” To remember the function of these two types of cells, many students think of the P cells as specialized roughly for the perception of pattern and M cells as specialized for the perception of motion. These descriptions are crude, but they’re easy to remember.

Parallel processing remains in evidence when we move beyond the occipital cortex. As Figure 3.10 shows, some of the activation from the occipital lobe is passed along to the cortex of the temporal lobe. This pathway, often called the what system, plays a major role in the identification of visual objects, telling you whether the object is a cat, an apple, or whatever. At the same time, activation from the occipital lobe is also passed along a second pathway, leading to the parietal cortex, in what is often called the where system. This system seems to guide your action based on your perception of where an object is located — above or below you, to your right or to your left. (See Goodale & Milner, 2004; Humphreys & Riddoch, 2014; Ungerleider & Haxby, 1994; Ungerleider & Mishkin, 1982. For some complications, though, see Borst, Thompson, & Kosslyn, 2011; de Haan & Cowey, 2011.)

FIGURE 3.10 THE WHAT AND WHERE PATHWAYS. Information from the primary visual cortex at the back of the head is transmitted to the inferotemporal cortex (the so-called what system) and to the posterior parietal cortex (the where system). The term “inferotemporal” refers to the lower part of the temporal lobe. The term “posterior parietal cortex” refers to the rearmost portion of this cortex.

The contrasting roles of these two systems can be revealed in many ways, including through studies of brain damage. Patients with lesions in the what system show visual agnosia — an inability to recognize visually presented objects, including such common things as a cup or a pencil. However, these patients show little disorder in recognizing visual orientation or in reaching. The reverse pattern occurs with patients who have suffered lesions in the where system: They have difficulty in reaching, but no problem in object identification (Damasio, Tranel, & Damasio, 1989; Farah, 1990; Goodale, 1995; Newcombe, Ratcliff, & Damasio, 1987).

Still other data echo this broad theme of parallel processing among separate systems. For example, we noted earlier that different brain areas are critical for the perception of color, motion, and form. If this is right, then someone who has suffered damage in just one of these areas might show problems in the perception of color but not the perception of motion or form, or problems in the perception of motion but not the perception of form or color. These predictions are correct.
As we mentioned at the chapter’s start, some patients suffer damage to the motion system and so develop akinetopsia (Zihl et al., 1983). For such patients, the world is described as a succession of static photographs. They’re unable to report the speed or direction of a moving object; as one patient put it, “When I’m looking at the car first, it seems far away. But then when I want to cross the road, suddenly the car is very near” (Zihl et al., 1983, p. 315). Other patients suffer a specific loss of color vision through damage to the central nervous system, even though their perception of form and motion remains normal (Damasio, 1985; Gazzaniga, Ivry, & Mangun, 2014; Meadows, 1974). To them, the entire world is clothed only in “dirty shades of gray.”2

Cases like these provide dramatic confirmation of the separateness of our visual system’s various elements and the ways in which the visual system is vulnerable to very specific forms of damage. (For further evidence with neurologically intact participants, see Bundesen, Kyllingsbaek, & Larsen, 2003.)

Putting the Pieces Back Together

Let’s emphasize once again, therefore, that even the simplest of our intellectual achievements depends on an array of different, highly specialized brain areas all working together in parallel. This was evident in Chapter 2 in our consideration of Capgras syndrome, and the same pattern has emerged in our description of the visual system. Here, too, many brain areas must work together: the what system and the where system, areas specialized for the detection of movement and areas specialized for the identification of simple forms. We have identified the advantages that come from this division of labor and the parallel processing it allows. But the division of labor also creates a problem: If multiple brain areas contribute to an overall task, how is their functioning coordinated?
When you see an athlete make an astonishing jump, the jump itself is registered by motion-sensitive neurons, but your recognition of the athlete depends on shape-sensitive neurons. How are the pieces put back together? When you reach for a coffee cup but stop midway because you see that the cup is empty, the reach itself is guided by the where system; the fact that the cup is empty is registered by the what system. How are these two streams of processing coordinated?

Investigators refer to this broad issue as the binding problem — the task of reuniting the various elements of a scene, elements that are initially addressed by different systems in different parts of the brain. And obviously this problem is solved. What you perceive is not an unordered catalogue of sensory elements. Instead, you perceive a coherent, integrated perceptual world. Apparently, this is a case in which the various pieces of Humpty Dumpty are reassembled to form an organized whole.

2. This is different from ordinary color blindness, which is usually present from birth and results from abnormalities that are outside the brain itself — for example, abnormalities in the photoreceptors.

Visual Maps and Firing Synchrony

Look around you. Your visual system registers whiteness and blueness and brownness; it also registers a small cylindrical shape (your coffee cup), a medium-sized rectangle (this book page), and a much larger rectangle (your desk). How do you put these pieces together so that you see that it’s the coffee cup, and not the book page, that’s blue; the desktop, and not the cup, that’s brown?

There is debate about how the visual system solves this problem, but we can identify three elements that contribute to the solution. One element is spatial position. The part of the brain registering the cup’s shape is separate from the parts registering its color or its motion; nonetheless, these various brain areas all have something in common.
They each keep track of where the target is — where the cylindrical shape was located, and where the blueness was; where the motion was detected, and where things were still. As a result, the reassembling of these pieces can be done with reference to position. In essence, you can overlay the map of which forms are where on top of the map of which colors are where to get the right colors with the right forms, and likewise for the map showing which motion patterns are where.

Information about spatial position is, of course, useful for its own sake: You have a compelling reason to care whether the tiger is close to you or far away, or whether the bus is on your side of the street or the other. But in addition, location information apparently provides a frame of reference used to solve the binding problem. Given this double function, we shouldn’t be surprised that spatial position is a major organizing theme in all the various brain areas concerned with vision, with each area seeming to provide its own map of the visual world.

Spatial position, however, is not the whole story. Evidence also suggests that the brain uses special rhythms to identify which sensory elements belong with which. Imagine two groups of neurons in the visual cortex. One group of neurons fires maximally whenever a vertical line is in view; another group fires maximally whenever a stimulus is in view moving from a high position to a low one. Let’s also imagine that right now a vertical line is presented and it is moving downward; as a result, both groups of neurons are firing strongly. How does the brain encode the fact that these attributes are bound together, different aspects of a single object? There is evidence that the visual system marks this fact by means of neural synchrony: If the neurons detecting a vertical line are firing in synchrony with those signaling movement, then these attributes are registered as belonging to the same object.
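As a loose caricature of these two ideas — binding by shared position on the visual maps, plus synchrony as a "same object" tag — consider the following sketch. Everything in it (the coordinates, the firing "phases," the tolerance) is invented for illustration; real neural codes are far messier, and this is a toy model, not a claim about actual circuitry.

```python
# Toy model: each feature map is keyed by spatial position, and features
# at the same position count as bound only if their (hypothetical) firing
# phases are nearly synchronized. All values are invented for illustration.

color_map = {(2, 3): ("blue", 0.00), (5, 1): ("red", 0.50)}  # position -> (feature, phase)
shape_map = {(2, 3): ("H", 0.01), (5, 1): ("T", 0.49)}

def bind(colors, shapes, tolerance=0.05):
    bound = []
    for pos, (color, c_phase) in colors.items():
        if pos in shapes:                            # same place on both maps...
            shape, s_phase = shapes[pos]
            if abs(c_phase - s_phase) < tolerance:   # ...and firing in synchrony
                bound.append((pos, color, shape))
    return bound

print(bind(color_map, shape_map))
# [((2, 3), 'blue', 'H'), ((5, 1), 'red', 'T')]
```

Desynchronizing the phases at one location (say, shifting the shape's phase there from 0.01 to 0.40) leaves both features detected but no longer bound — a crude analogue of a failure of binding.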
If they aren’t in synchrony, then the features aren’t bound together (Buzsáki & Draguhn, 2004; Csibra, Davis, Spratling, & Johnson, 2000; Elliott & Müller, 2000; Fries, Reynolds, Rorie, & Desimone, 2001).

What causes this synchrony? How do the neurons become synchronized in the first place? Here, another factor appears to be important: attention. We’ll have more to say about attention in Chapter 5, but for now let’s note that attention plays a key role in binding together the separate features of a stimulus. (For a classic statement of this argument, see Treisman & Gelade, 1980; Treisman, Sykes, & Gelade, 1977. For more recent views, see Quinlan, 2003; Rensink, 2012; and also Chapter 5.)

Evidence for attention’s role comes from many sources, including the fact that when we overload someone’s attention, she is likely to make conjunction errors. This means that she’s likely to correctly detect the features present in a visual display, but then to make mistakes about how the features are bound together (or conjoined). Thus, for example, someone shown a blue H and a red T might report seeing a blue T and a red H — an error in binding. Similarly, individuals who suffer from severe attention deficits (because of brain damage in the parietal cortex) are particularly impaired in tasks that require them to judge how features are conjoined to form complex objects (e.g., Robertson, Treisman, Friedman-Hill, & Grabowecky, 1997).

Finally, studies suggest that synchronized neural firing occurs in an animal’s brain when the animal is attending to a specific stimulus but does not occur in neurons activated by an unattended stimulus (e.g., Buschman & Miller, 2007; Saalmann, Pigarev, & Vidyasagar, 2007; Womelsdorf et al., 2007). All of these results point toward the claim that attention is crucial for the binding problem and, moreover, that attention is linked to the neural synchrony that seems to unite a stimulus’s features.

Notice, then, that there are several ways in which information is represented in the brain. In Chapter 2, we noted that the brain uses different chemical signals (i.e., different neurotransmitters) to transmit different types of information. We now see that there is information reflected in which cells are firing, how often they are firing, whether the cells are firing in synchrony with other cells, and the rhythm in which they are firing. Plainly, this is a system of considerable complexity!

TEST YOURSELF
3. How do researchers use single-cell recording to reveal a cell’s receptive field?
4. What are the advantages of parallel processing in the visual system? What are the disadvantages?
5. How is firing synchrony relevant to the solution of the binding problem?

Form Perception

So far in this chapter, we’ve been discussing how visual perception begins: with the detection of simple attributes in the stimulus — its color, its motion, and its catalogue of features. But this detection is just the start of the process, because the visual system still has to assemble these features into recognizable wholes. We’ve mentioned the binding problem as part of this “assembly” — but binding isn’t the whole story. This point is reflected in the fact that our perception of the visual world is organized in ways that the stimulus input is not — a point documented early in the 20th century by a group called the “Gestalt psychologists.”3

3. Gestalt is the German word for “shape” or “form.” The Gestalt psychology movement was committed to the view that theories about perception and thought need to emphasize the organization of patterns, not just focus on a pattern’s elements.

The Gestaltists argued that the organization is
Some years later, Jerome Bruner (1973) voiced related claims and coined the phrase “beyond the information given” to describe some of the ways our perception of a stimulus differs from (and goes beyond) the stimulus itself.

For example, consider the form shown in the top of Figure 3.11: the Necker cube. This drawing is an example of a reversible (or ambiguous) figure — so-called because people perceive it first one way and then another. Specifically, this form can be perceived as a drawing of a cube viewed from above (in which case it’s similar to the cube marked A in the figure); it can also be perceived as a cube viewed from below (in which case it’s similar to the cube marked B). Let’s be clear, though, that this isn’t an “illusion,” because neither of these interpretations is “wrong,” and the drawing itself (and, therefore, the information reaching your eyes) is fully compatible with either interpretation. Put differently, the drawing shown in Figure 3.11 is entirely neutral with regard to the shape’s configuration in depth; the lines on the page don’t specify which is the “proper” interpretation. Your perception of the cube, however, is not neutral. Instead, you perceive the cube as having one configuration or the other — similar either to Cube A or to Cube B. Your perception goes beyond the information given in the drawing, by specifying an arrangement in depth.

FIGURE 3.11 THE NECKER CUBE
The top cube can be perceived as if viewed from above (in which case it is a transparent version of Cube A) or as if viewed from below (in which case it is a transparent version of Cube B).

FIGURE 3.12 AMBIGUOUS FIGURES
Some stimuli easily lend themselves to reinterpretation. The figure in Panel A, for example, is perceived by many to be a white vase or candlestick on a black background; others see it as two black faces shown in profile. A similar bistable form is visible in the Canadian flag (Panel B).
The same point can be made for many other stimuli. Figure 3.12A (after Rubin, 1915, 1921) can be perceived either as a vase centered in the picture or as two profiles facing each other. The drawing by itself is compatible with either of these perceptions, and so, once again, the drawing is neutral with regard to perceptual organization. In particular, it is neutral with regard to figure/ground organization, the determination of what is the figure (the depicted object, displayed against a background) and what is the ground. Your perception of this drawing, however, isn’t neutral about this point. Instead, your perception somehow specifies that you’re looking at the vase and not the profiles, or that you’re looking at the profiles and not the vase.

Figure/ground ambiguity is also detectable in the Canadian flag (Figure 3.12B). Since 1965, the centerpiece of Canada’s flag has been a red maple leaf. Many observers, however, note that a different organization is possible, at least for part of the flag. On their view, the flag depicts two profiles, shown in white against a red backdrop. Each profile has a large nose, an open mouth, and a prominent brow ridge, and the profiles are looking downward, toward the flag’s center.

In all these examples, then, your perception contains information — about how the form is arranged in depth, or about which part of the form is figure and which is ground — that is not contained within the stimulus itself. Apparently, this is information contributed by you, the perceiver.

The Gestalt Principles

With figures like the Necker cube or the vase/profiles, your role in shaping the perception seems undeniable. In fact, if you stare at either of these figures, your perception flips back and forth — first you see the figure one way, then another, then back to the first way. But the stimulus itself isn’t changing, and so the information that’s reaching your eyes is constant.
Any changes in perception, therefore, are caused by you and not by some change in the stimulus.

One might argue, though, that reversible figures are special — carefully designed to support multiple interpretations. On this basis, perhaps you play a smaller role when perceiving other, more “natural” stimuli. This position is plausible — but wrong, because many stimuli (and not just the reversible figures) are ambiguous and in need of interpretation. We often don’t detect this ambiguity, but that’s because the interpretation happens so quickly that we don’t notice it.

FIGURE 3.13 THE ROLE OF INTERPRETATION IN PERCEIVING AN ORDINARY SCENE
Consider the still life (Panel A) and an overlay designating five different segments of the scene (Panel B). For this picture to be perceived correctly, the perceptual system must first decide what goes with what — for example, that Segment B and Segment E are different bits of the same object (even though they’re separated by Segment D) and that Segment B and Segment A are different objects (even though they’re adjacent and the same color).

Consider, for example, the scene shown in Figure 3.13. It’s almost certain that you perceive segments B and E as being united, forming a complete apple, but notice that this information isn’t provided by the stimulus; instead, it’s your interpretation. (If we simply go with the information in the figure, it’s possible that segments B and E are parts of entirely different fruits, with the “gap” between the two fruits hidden from view by the banana.) It’s also likely that you perceive the banana as entirely banana-shaped and therefore continuing downward out of your view, into the bowl, where it eventually ends with the sort of point that’s normal for a banana. In the same way, surely you perceive the horizontal stripes in the background as continuous and merely hidden from view by the pitcher.
(You’d be surprised if we removed the pitcher and revealed a pitcher-shaped gap in the stripes.) But, of course, the stimulus doesn’t in any way “guarantee” the banana’s shape or the continuity of the stripes; these points are, again, just your interpretation. Even with this ordinary scene, therefore, your perception goes “beyond the information given” — and so the unity of the two apple slices and the continuity of the stripes is “in the eye of the beholder,” not in the stimulus itself. Of course, you don’t feel like you’re “interpreting” this picture or extrapolating beyond what’s on the page. But your role becomes clear the moment we start cataloguing the differences between your perception and the information that’s truly present in the photograph.

Let’s emphasize, though, that your interpretation of the stimulus isn’t careless or capricious. Instead, you’re guided by a few straightforward principles that the Gestalt psychologists catalogued many years ago — and so they’re routinely referred to as the Gestalt principles. For example, your perception is guided by proximity and similarity: If, within the visual scene, you see elements that are close to each other, or elements that resemble each other, you assume these elements are parts of the same object (Figure 3.14). You also tend to assume that contours are smooth, not jagged, and you avoid interpretations that involve coincidences. (For a modern perspective on these principles and Gestalt psychology in general, see Wagemans, Elder, Kubovy et al., 2012; Wagemans, Feldman, Gephstein et al., 2010.)

FIGURE 3.14 GESTALT PRINCIPLES OF ORGANIZATION
Similarity: We tend to group these dots into columns rather than rows, grouping dots of similar colors. Proximity: We tend to perceive groups, linking dots that are close together. Good continuation: We tend to see a continuous green bar rather than two smaller rectangles. Closure: We tend to perceive an intact triangle, reflecting our bias toward perceiving closed figures rather than incomplete ones. Simplicity: We tend to interpret a form in the simplest way possible; we would see the form on the left as two intersecting rectangles (as shown on the right) rather than as a single 12-sided irregular polygon. As Figure 3.13 illustrated, your ordinary perception of the world requires you to make decisions about what goes with what — which elements are part of the same object, and which elements belong to different objects. Your decisions are guided by a few simple principles, catalogued many years ago by the Gestalt psychologists.

These perceptual principles are quite straightforward, but they’re essential if your perceptual apparatus is going to make sense of the often ambiguous, often incomplete information provided by your senses. In addition, it’s worth mentioning that everyone’s perceptions are guided by the same principles, and that’s why you generally perceive the world in the same way that other people do. Each of us imposes our own interpretation on the perceptual input, but we all tend to impose the same interpretation because we’re all governed by the same rules.

Organization and Features

We’ve now considered two broad topics — the detection of simple attributes in the stimulus, and then the ways in which you organize those attributes. In thinking about these topics, you might be tempted to treat them as two separate steps. First, you collect information about the stimulus, so that you know (for example) what corners or angles or curves are in view — the visual features contained within the input. Then, once you’ve gathered the “raw data,” you interpret this information. That’s when you “go beyond the information given” — deciding how the form is laid out in depth (as in Figure 3.11), deciding what is figure and what is ground (Figure 3.12A or B), and so on.
The idea, then, is that perception might be divided (roughly) into an “information gathering” step followed by an “interpretation” step. This view, however, is wrong, and, in fact, it’s easy to show that in many settings, your interpretation of the input happens before you start cataloguing the input’s basic features, not after.

FIGURE 3.15 A HIDDEN FIGURE
Initially, these dark shapes have no meaning, but after a moment the hidden figure becomes clearly visible. Notice, therefore, that at the start the figure seems not to contain the features needed to identify the various letters. Once the figure is reorganized, with the white parts (not the dark parts) making up the figure, the features are easily detected. Apparently, the analysis of features depends on how the figure is first organized by the viewer.

Consider Figure 3.15. Initially, these shapes seem to have no meaning, but after a moment most people discover the word hidden in the figure. That is, people find a way to reorganize the figure so that the familiar letters come into view. But let’s be clear about what this means. At the start, the form seems not to contain the features needed to identify the L, the I, and so on. Once the form is reorganized, though, it does contain these features, and the letters are immediately recognized. In other words, with one organization, the features are absent; with another, they’re plainly present. It would seem, then, that the features themselves depend on how the form is organized by the viewer — and so the features are as much “in the eye of the beholder” as they are in the figure itself.

As a different example, you have no difficulty reading the word printed in Figure 3.16, although most of the features needed for this recognition are absent. You easily “provide” the missing features, though, thanks to the fact that you interpret the black marks in the figure as shadows cast by solid letters.
Given this interpretation and the extrapolation it involves, you can easily “fill in” the missing features and read the word.

How should we think about all of this? On one hand, your perception of a form surely has to start with the stimulus itself and must in some ways be governed by what’s in that stimulus. (After all, no matter how you try to interpret Figure 3.16, it won’t look to you like a photograph of Queen Elizabeth — the basic features of the queen are just not present, and your perception respects this obvious fact.) This suggests that the features must be in place before an interpretation is offered, because the features govern the interpretation. But, on the other hand, Figures 3.15 and 3.16 suggest that the opposite is the case: that the features you find in an input depend on how the figure is interpreted. Therefore, it’s the interpretation, not the features, that must be first.

The solution to this puzzle, however, is easy, and builds on ideas that we’ve already met: Many aspects of the brain’s functioning depend on parallel processing, with different brain areas all doing their work at the same time. In addition, the various brain areas all influence one another, so that what’s going on in one brain region is shaped by what’s going on elsewhere.

FIGURE 3.16 MISSING FEATURES PERCEPTION
People have no trouble reading this word, even though most of the features needed for recognition are absent from the stimulus. People easily “supply” the missing features, illustrating once again that the analysis of features depends on how the overall figure has been interpreted and organized.

In this way, the brain areas that analyze a pattern’s basic features do their work at the same time as the brain areas that analyze the pattern’s large-scale configuration, and these brain areas interact so that the perception of the features is guided by the configuration, and analysis of the configuration is guided by the features. In other words, neither type of processing “goes first.” Neither has priority. Instead, they work together, with the result that the perception that is achieved makes sense at both the large-scale and fine-grained levels.

TEST YOURSELF
6. What evidence tells us that perception goes beyond (includes more information than) the stimulus input?
7. What are the Gestalt principles, and how do they influence visual perception?
8. What evidence is there that the perception of an overall form depends on the detection of features? What evidence is there that the detection of features depends on the overall form?

Constancy

We’ve now seen many indications of the perceiver’s role in “going beyond the information given” in the stimulus itself. This theme is also evident in another aspect of perception: the achievement of perceptual constancy. This term refers to the fact that we perceive the constant properties of objects in the world (their sizes, shapes, and so on) even though the sensory information we receive about these attributes changes whenever our viewing circumstances change.

To illustrate this point, consider the perception of size. If you happen to be far away from the object you’re viewing, then the image cast onto your retinas by that object will be relatively small. If you approach the object, then the image size will increase. This change in image size is a simple consequence of physics, but you’re not fooled by this variation. Instead, you manage to achieve size constancy — you correctly perceive the sizes of objects despite the changes in retinal-image size created by changes in viewing distance.

Similarly, if you view a door straight on, the retinal image will be rectangular; but if you view the same door from an angle, the retinal image will have a different shape (see Figure 3.17).
Still, you achieve shape constancy — that is, you correctly perceive the shapes of objects despite changes in the retinal image created by shifts in your viewing angle. You also achieve brightness constancy — you correctly perceive the brightness of objects whether they’re illuminated by dim light or strong sun.

FIGURE 3.17 SHAPE CONSTANCY
If you change your viewing angle, the shape of the retinal image cast by a target changes. In this figure, the door viewed straight on casts a rectangular image on your retina; the door viewed from an angle casts a trapezoidal image. Nonetheless, you generally achieve shape constancy.

Unconscious Inference

How do you achieve each of these forms of constancy? One hypothesis focuses on relationships within the retinal image. In judging size, for example, you generally see objects against some background, and this can provide a basis for comparison with the target object. To see how this works, imagine that you’re looking at a dog sitting on the kitchen floor. Let’s say the dog is half as tall as the nearby chair and hides eight of the kitchen’s floor tiles from view. If you take several steps back from the dog, none of these relationships change, even though the sizes of all the retinal images are reduced. Size constancy, therefore, might be achieved by focusing not on the images themselves but on these unchanging relationships (see Figure 3.18).

Relationships do contribute to size constancy, and that’s why you’re better able to judge size when comparison objects are in view or when the target you’re judging sits on a surface that has a uniform visual texture (like the floor tiles in the example). But these relationships don’t tell the whole story.
Size constancy is achieved even when the visual scene offers no basis for comparison (if, for example, the object to be judged is the only object in view), provided that other cues signal the distance of the target object (Harvey & Leibowitz, 1967; Holway & Boring, 1947).

FIGURE 3.18 AN INVARIANT RELATIONSHIP THAT PROVIDES INFORMATION ABOUT SIZE
One proposal is that you achieve size constancy by focusing on relationships in the visual scene. For example, the dog sitting nearby on the kitchen floor (Panel A) is half as tall as the chair and hides eight of the kitchen’s floor tiles from view. If you take several steps back from the dog (Panel B), none of these relationships change, even though the sizes of all the retinal images are reduced. By focusing on the relationships, then, you can see that the dog’s size hasn’t changed.

How does your visual system use this distance information? More than a century ago, the German physicist Hermann von Helmholtz developed an influential hypothesis regarding this question. Helmholtz started with the fact that there’s a simple inverse relationship between distance and retinal image size: If an object doubles its distance from the viewer, the size of its image is reduced by half. If an object triples its distance, the size of its image is reduced to a third of its initial size. This relationship is guaranteed to hold true because of the principles of optics, and the relationship makes it possible for perceivers to achieve size constancy by means of a simple calculation. Of course, Helmholtz knew that we don’t run through a conscious calculation every time we perceive an object’s size, but he believed we’re calculating nonetheless — and so he referred to the process as an unconscious inference (Helmholtz, 1909).

What is the calculation that enables someone to perceive size correctly?
It’s multiplication: the size of the image on the retina, multiplied by the distance between you and the object. (We’ll have more to say about how you know this distance in a later section.) As an example, imagine an object that, at a distance of 10 ft, casts an image on the retina that’s 4 mm across. Because of straightforward principles of optics, the same object, at a distance of 20 ft, casts an image of 2 mm. In both cases, the product — 10 × 4 or 20 × 2 — is the same. If, therefore, your size estimate depends on that product, your size estimate won’t be thrown off by viewing distance — and that’s exactly what we want (see Figure 3.19).

FIGURE 3.19 THE RELATIONSHIP BETWEEN IMAGE SIZE AND DISTANCE
Closer objects cast larger retinal images; farther objects cast smaller ones. If you view an object from a greater distance, the object casts a smaller image on your retina. Nonetheless, you generally achieve size constancy — perceiving the object’s actual size. Helmholtz proposed that you achieve constancy through an unconscious inference — essentially multiplying the image size by the distance.

What’s the evidence that size constancy does depend on this sort of inference? In many experiments, researchers have shown participants an object and, without changing the object’s retinal image, have changed the apparent distance of the object. (There are many ways to do this — lenses that change how the eye has to focus to bring the object into sharp view, or mirrors that change how the two eyes have to angle inward so that the object’s image is centered on both foveas.) If people are — as Helmholtz proposed — using distance information to judge size, then these manipulations should affect size perception. Any manipulation that makes an object seem farther away (without changing retinal image size) should make that object seem bigger (because, in essence, the perceiver would be “multiplying” by a larger number).
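The arithmetic of this inference is easy to sketch in code. The snippet below is only a toy illustration of the idea (the units and function names are invented for the example, not taken from the text): image size falls off in inverse proportion to distance, while the inferred size multiplies image size by distance, so the two effects cancel — and holding the image fixed while inflating the apparent distance inflates the estimate.

```python
# Toy sketch of Helmholtz's unconscious inference for size constancy.
# Units and helper names are invented for illustration.

def retinal_image_size(true_size, distance):
    # Optics: image size is inversely proportional to viewing distance.
    return true_size / distance

def inferred_size(image_size, perceived_distance):
    # Helmholtz's proposal: multiply image size by perceived distance.
    return image_size * perceived_distance

true_size = 40                       # arbitrary units
for distance in (10, 20, 40):
    image = retinal_image_size(true_size, distance)
    print(distance, image, inferred_size(image, distance))
# 10 4.0 40.0
# 20 2.0 40.0
# 40 1.0 40.0  -> the estimate is distance-invariant

# The illusion experiments: hold the image fixed (4.0) but double the
# apparent distance; the inferred size doubles, so the object looks bigger.
print(inferred_size(4.0, 20))  # 80.0
```

The same arithmetic runs in the other direction as well: multiplying that fixed image by a smaller apparent distance shrinks the estimate.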
Any manipulation that makes the object seem closer should make it look smaller. And, in fact, these predictions are correct — a powerful confirmation that people do use distance to judge size.

A similar proposal explains how people achieve shape constancy. Here, you take the slant of the surface into account and make appropriate adjustments — again, an unconscious inference — in your interpretation of the retinal image’s shape. Likewise for brightness constancy: Perceivers are sensitive to how a surface is oriented relative to the available light sources, and they take this information into account in estimating how much light is reaching the surface. Then, they use this assessment of lighting to judge the surface’s brightness (e.g., whether it’s black or gray or white). In all these cases, therefore, it appears that the perceptual system does draw some sort of unconscious inference, taking viewing circumstances into account in a way that enables you to perceive the constant properties of the visual world.

Illusions

This process of taking information into account — whether it’s distance (in order to judge size), viewing angle (to judge shape), or illumination (to judge brightness) — is crucial for achieving constancy. More than that, it’s another indication that you don’t just “receive” visual information; instead, you interpret it. The interpretation is an essential part of your perception and generally helps you perceive the world correctly.

The role of the interpretation becomes especially clear, however, in circumstances in which you misinterpret the information available to you and end up misperceiving the world. Consider the two tabletops shown in Figure 3.20. The table on the left looks quite a bit longer and thinner than the one on the right; a tablecloth that fits one table surely won’t fit the other. Objectively, though, the parallelogram depicting the left tabletop is exactly the same shape as the one depicting the right tabletop.
If you were to cut out the shape on the page depicting the left tabletop, rotate it, and slide it onto the right tabletop, they’d be an exact match. (Not convinced? Just lay another piece of paper on top of the page, trace the left tabletop, and then move your tracing onto the right tabletop.)

FIGURE 3.20 TWO TABLETOPS These two tabletops seem to have very different shapes and sizes. However, this contrast is an illusion — and the shapes drawn here (the two parallelograms depicting the tabletops) are identical in shape and size. The illusion is caused by the same mechanisms that, in most circumstances, allow you to achieve constancy.

Why do people misperceive these shapes? The answer involves the normal mechanisms of shape constancy. Cues to depth in this figure cause you to perceive the figure as a drawing of three-dimensional objects, each viewed from a particular angle. This leads you — quite automatically — to adjust for the (apparent) viewing angles in order to perceive the two tabletops, and it’s this adjustment that causes the illusion. Notice, then, that this illusion about shape is caused by a misperception of depth: You misperceive the depth relationships in the drawing and then take this faulty information into account in interpreting the shapes. (For a related illusion, see Figure 3.21.)

FIGURE 3.21 THE MONSTER ILLUSION The two monsters appear rather different in size. But, again, this is an illusion, because the two drawings are exactly the same size. The illusion is created by the distance cues in the picture, which make the monster on the right appear to be farther away. This (mis)perception of distance leads to a (mis)perception of size.

FIGURE 3.22 A BRIGHTNESS ILLUSION The central square (third row, third column) appears much brighter than the square marked by the arrow. Once again, though, this is an illusion.
If you don’t believe it, use your fingers or pieces of paper to cover everything in the figure except for these two squares.

A different example is shown in Figure 3.22. It seems obvious to most viewers that the center square in this checkerboard (third row, third column) is a brighter shade than the square indicated by the arrow. But, in truth, the shade of gray shown on the page is identical for these two squares. What has happened here? The answer again involves the normal processes of perception. First, the mechanisms of lateral inhibition (described earlier) play a role here in producing a contrast effect: The central square in this figure is surrounded by dark squares, and the contrast makes the central square look brighter. The square marked at the edge of the checkerboard, however, is surrounded by white squares; here, contrast makes the marked square look darker.

But, in addition, the visual system also detects that the central square is in the shadow cast by the cylinder. Your vision compensates for this fact — again, an example of unconscious inference that takes the shadow into account in judging brightness — and therefore powerfully magnifies the illusion.

TEST YOURSELF

9. What does it mean to say that size constancy may depend on an unconscious inference? An inference about what?

10. How do the ordinary mechanisms of constancy lead to visual illusions?

The Perception of Depth

In discussing constancy, we said that perceivers take distance, slant, and illumination into account in judging size, shape, and brightness. But to do this, they need to know what the distance is (how far away is the target object?), what the viewing angle is (“Am I looking at the shape straight on or at an angle?”), and what the illumination is. Otherwise, they’d have no way to take these factors into account and, therefore, no way to achieve constancy. Let’s pursue this issue by asking how people judge distance.
We’ve just said that distance perception is crucial for size constancy, but, of course, information about where things are in your world is also valuable for its own sake. If you want to walk down a hallway without bumping into obstacles, you need to know which obstacles are close to you and which ones are far off. If you wish to caress a loved one, you need to know where he or she is; otherwise, you’re likely to swat empty space when you reach out with your caress or (worse) poke him or her in the eye. Plainly, then, you need to know where objects in your world are located.

Binocular Cues

The perception of distance depends on various distance cues — features of the stimulus that indicate an object’s position. One cue comes from the fact that your eyes look out on the world from slightly different positions; as a result, each eye has a slightly different view. This difference between the two eyes’ views is called binocular disparity, and it provides important information about distance relationships in the world.

Binocular disparity can lead to the perception of depth even when no other distance cues are present. For example, the bottom panels of Figure 3.23 show the views that each eye would receive while looking at a pair of nearby objects. If we present each of these views to the appropriate eye (e.g., by drawing the views on two cards and placing