
Psychology 8th - Gleitman, Gross, Reisberg-201-237.pdf




CHAPTER 5: Perception

Should you find yourself at a karaoke party, performing a song that you think you know by heart, be careful: You might end up falling prey to a mondegreen. Mondegreens are misperceptions of common phrases, especially from poems and songs. Thousands of people are convinced, for example, that Creedence Clearwater Revival is crooning, "There's a bathroom on the right." The actual lyric is "a bad moon on the rise." Likewise, when country star Crystal Gayle proclaims, "Don't it make my brown eyes blue?" legions of fans think she's singing, "Doughnuts make my brown eyes blue." Even religious music is open to these errors. Countless churchgoers unknowingly change "Gladly the cross I'd bear" into a hymn to visually challenged wildlife: "Gladly, the cross-eyed bear."

Mondegreens reveal how active and interpretive perception is. Although perception feels effortless, much of the sensory information we receive is incomplete and ambiguous—so we have to supplement and interpret that information in order to understand what's going on around us. In many mondegreens, the acoustic input is truly ambiguous—the sound waves that reach you when someone says "the cross I'd bear" are virtually identical to the sound waves that comprise "the cross-eyed bear." You have to interpret the input. Other mondegreens don't match the stimuli as closely, showing that people do more than interpret the sensory information; sometimes, they actually overrule it. This is true of the mondegreen that gave the phenomenon its name. As a girl, the American writer Sylvia Wright loved the 17th-century Scottish ballad "The Bonnie Earl O' Murray," which she perceived as saying:

They hae [have] slain the Earl Amurray,
And Lady Mondegreen.

"I saw it all clearly," Wright recalls in a Harper's Weekly article: "The Earl had yellow curly hair and a yellow beard and of course wore a kilt....
Lady Mondegreen lay at his side, her long, dark brown curls spread out over the moss." This is a wonderfully romantic image but, it turns out, not what the balladeers intended: Wright may have heard "And Lady Mondegreen," but the stanza actually ends "And laid him on the green."

As we'll see, errors like mondegreens are relatively rare—perception is generally accurate. However, these errors plainly reveal that perception involves interpretation. After all, if you weren't interpreting to begin with, how could you ever misinterpret? In this chapter, we'll examine this interpretation process, looking at how it leads to accurate perception of the world, and also how it occasionally leads to perceptual error. We'll focus on humans' most important source of perceptual information—our sense of vision—starting with the crucial question of how we recognize objects. We'll then turn to how variations in our circumstances affect perception. Finally, we'll consider one last layer of complication: It's not enough just to see what something is; we also need to know what it is doing. We'll therefore examine how we perceive motion.

Perception feels easy and immediate: You open your eyes and recognize what you see. You understand sounds the second they reach your ears. In truth, perception only seems simple because you're extraordinarily skilled at it. In this chapter, we'll explore just how complex the perceptual process really is.

FORM PERCEPTION: WHAT IS IT?

The ability to recognize objects is, of course, enormously important for us. If we couldn't tell a piece of bread from a piece of paper, we might try to write on the bread and eat the paper. If we couldn't tell the difference between a lamppost and a potential mate, our social lives would be strange indeed. So how do we manage to recognize bread, paper, mates, and myriad other objects? In vision, our primary means of recognizing objects is through the perception of their form.
Of course, we sometimes do rely on color (e.g., a violet) and occasionally on size (e.g., a toy model of an automobile); but in most cases, form is our major avenue for identifying what we see. The question is how? How do we recognize the forms and patterns we see in the world around us?

The Importance of Features

One simple hypothesis regarding our ability to recognize objects is suggested by data we considered in Chapter 4. There we saw that the visual system contains cells that serve as feature detectors—and so one of these cells might fire if a particular angle is in view; another might fire if a vertical line is in view; and so on. Perhaps, therefore, we just need to keep track of which feature detectors are firing in response to a particular input; that way, we'd have an inventory of the input's features, and we could then compare this inventory to some sort of checklist in memory. Does the inventory tell us that the object in front of our eyes has four right angles and four straight sides of equal length? If so, then it must be a square. Does the inventory tell us that the object in view has four legs and a very long neck? If so, we conclude that we're looking at a giraffe.

Features do play a central role in object recognition. If you detect four straight lines on an otherwise blank field, you're not likely to decide you're looking at a circle or a picture of downtown Chicago; those perceptions don't fit with the set of features presented to you by the stimulus. But let's be clear from the start that there are some complexities here. For starters, consider the enormous variety of objects we encounter, perceive, and recognize: cats and cars, gardens and gorillas, shoes and shops; the list goes on and on. Do we have a checklist for each of these objects? If so, how do we search through this huge set of checklists, so that we can recognize objects the moment they appear in front of us?
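The checklist hypothesis just described can be sketched in a few lines of code. This is only a toy illustration of the idea, not a model from the text; the feature names and the two stored checklists are invented for the example.

```python
# Toy sketch of the feature-checklist hypothesis: each stored object is a
# set of required features, and recognition succeeds when the detected
# inventory contains all of them. Feature names are illustrative only.

CHECKLISTS = {
    "square":  {"four right angles", "four equal straight sides"},
    "giraffe": {"four legs", "very long neck"},
}

def recognize(inventory):
    """Return names of stored objects whose checklist is fully contained
    in the detected feature inventory."""
    return [name for name, required in CHECKLISTS.items()
            if required <= set(inventory)]

print(recognize({"four right angles", "four equal straight sides"}))  # ['square']
```

Even this toy version makes the chapter's worry concrete: the lookup scans every stored checklist, and a partial or unusual view of an object would fail the subset test outright.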
Besides that, for any one of the objects we can recognize there is still more variety. After all, we recognize cats when we see them close up or far away, from the front or the side, and sitting down or walking toward us (Figure 5.1). Do we have a different feature checklist for each of these views? Or do we somehow have a procedure for converting these views into some sort of "standardized view," which we then compare to a checklist? If so, what might that standardization procedure be? We clearly need some more theory to address these issues. (For more on how we might recognize objects despite changes in our view of them, see Tarr, 1995; Vuong & Tarr, 2004.)

Figure 5.1: The variability of stimuli we recognize. We recognize cats from the side or the front, whether we see them close up or far away.

Similarly, we often have partial views of the objects around us; but we can recognize them anyway. Thus, we recognize a cat even if it's sitting behind a tree and we can see only its head and one paw. We recognize a chair even when someone is sitting on it, blocking much of the chair from view. We identify the blue form in Figure 5.2 as a square, even though one corner is hidden. These facts, too, must be accommodated by our theory of recognition.

Figure 5.2: Recognizing partially occluded figures. We recognize the blue form as a square even though it doesn't have the right inventory of features—thanks to the fact that one of its corners is hidden from view.

The Importance of Organization

It seems, then, that we'll need a great number of feature lists (due to the sheer number of objects we can recognize). We'll also need flexibility in how we use the feature lists (to address the diversity of views we get of each object, and the problem of partial views). But there's still another problem for us to confront: It turns out that the catalog of features we rely on—and so the starting point for our recognition of objects in the world—depends on how we interpret the visual input.
This role for interpretation can be demonstrated in several ways. One is that we choose to ignore many of the features available to our visual system, basing our recognition only on a subset of the features that are actually in view. To make this idea concrete, consider Figure 5.3. The pattern in the man's shirt contains many lines and angles, but we give them little priority when analyzing the scene. Instead, we focus on the features defining the outline of the man's body and telling us that this is, in fact, a picture of a man wearing sunglasses. Likewise, we ignore the outline of the shadows cast by the trees because we realize that the features making up this outline don't tell us much about the identity of the objects in the picture. It seems, then, that we somehow choose which features "matter" for our recognition and which features are incidental.

Figure 5.3: The complexity of real-world scenes. Our recognition of objects depends only on a well-chosen subset of the features that are in view. So, for example, we recognize the man's shirt by paying attention to its outline; we're not misled by the lines that form the pattern in the fabric, or by the edges of the shadows. Instead, we manage to zoom in on just those features that define the shapes we're interested in.

Figure 5.4 illustrates a different type of interpretation. These dark shapes seem meaningless at first, but after a moment most people find a way to reorganize the figure so that the familiar letters come into view. But let's be clear about what this means.

Figure 5.4: A hidden figure. At first the figure seems not to contain the features needed to identify the various letters. But once the figure is reorganized with the white parts forming the figure and not the dark parts, its features are easily detected. So it seems that the analysis of features depends on a preliminary step in which the viewer must organize the figure.
At the start, the form doesn't seem to contain the features we need to identify the L, the I, and so on—and so we don't detect these letters. Once we've reorganized the form, though, it does contain the relevant features and so we immediately recognize the letters. Apparently, therefore, the catalog of features present in this figure depends on how we interpret its overall form. Based on one interpretation, the features defining these letters are absent—and so we can't detect the letters or the word LIFT. With a different interpretation, the features are easily visible and we can immediately read the word. It seems, then, that features are as much "in the eye of the beholder" as they are in the figure itself.

As a related example, consider Figure 5.5. Here, most people easily perceive two complete triangles—one on the orange background, one on the green. But again, the features of these triangles aren't present on the page; specifically, the sides of the triangles aren't marked in the figure at all. However, the perceiver organizes the overall form so that the missing sides are filled in—she's essentially creating the features for herself. Once that's done, she can clearly perceive the triangles.

Figure 5.5: Subjective contours. In (A) we see an orange triangle whose vertices lie on top of the three green circles. The three sides of this orange triangle (which looks brighter than the orange background) are clearly visible, even though they don't exist physically. In (B) we see the same effect with green and orange reversed. Here, the green triangle—which looks darker than the green background—has subjective contours.

PERCEPTUAL PARSING

The previous examples powerfully suggest that the perception of form depends both on feature input—that is, what's actually in front of your eyes—and on how you organize and interpret the form. But what exactly does it mean for a perceiver to "interpret" a form, or to find an "organization" within a figure? And why do you end up with one interpretation and not another? Why, for example, do most people decide that Figure 5.5 should be organized in a way that connects the angles so that they become parts of a single form rather than treating them as separate forms?

Questions like these were crucial for Gestalt psychology, a school of psychology that emphasized that organization is an essential feature of all mental activity: We understand the elements of the visual input as linked to each other in a certain way, and the identity of these elements depends on the linkage. (That's why in Figure 5.5, we perceive the round elements as intact circles, each partially hidden by another form, rather than as a series of "Pac-Man" figures.) Likewise, we appreciate a work of music because we perceive the individual notes as forming a cohesive whole. Similarly, our thoughts have meaning only in relationship to each other. In all cases, the Gestalt psychologists wanted to ask how this organization was achieved, and how it influenced us. (The word Gestalt is derived from a German word meaning "form" or "appearance.")

Gestalt psychology: A theoretical approach that emphasizes the role of organized wholes in perception and other psychological processes.

Gestalt psychologists described several aspects of this organization and identified several principles that guided it. Some of the principles are concerned with the way you parse the input—that is, how you separate a scene into individual objects, linking together the parts of each object but not linking one object's parts to some other object. To make this idea concrete, consider the still life in Figure 5.6. To make sense of this picture, your perception must somehow group the elements of the scene appropriately.

Figure 5.6: Perceptual parsing. (A) A still life. (B) An overlay designating five different segments of the scene shown in (A).
To determine what an object is, the perceptual system must first decide what goes with what: Does portion 2 go with 1 or with 3, 4, or 5? Or does it go with none of them? Portion 2 (part of the apple) must be united with portion 5 (more of the apple), even though they're separated by portion 4 (a banana). Portion 2 should not be united with portion 1 (a bunch of grapes), even though they're adjacent and about the same color. The bit of the apple hidden from view by the banana must somehow be filled in, so that you perceive an intact apple rather than two apple slices. All of these steps involved in deciding which bits go with which other bits fall under the label of parsing.

What cues guide you toward parsing a stimulus pattern one way rather than another? The answer involves both feature information and information about the larger-scale pattern. For example, we tend to interpret certain features (such as a T-junction—Figure 5.7) as indicating that one edge has disappeared behind another; we interpret other features differently. In this way, "local" information—information contained in one small part of the scene—helps guide our parsing. But "global" information—information about the whole scene—is also crucial. For example, perceivers tend to group things together according to a principle of similarity—meaning that, all other things being equal, they group together figures that resemble each other. So in Figure 5.8A, we group blue dots with blue dots, red with red. Perceivers are also influenced by proximity—the closer two figures are to each other, the more we tend to group them together perceptually (for more on these principles of perceptual organization, see Figure 5.8; Palmer, 2002; Wertheimer, 1923).

similarity: In perception, a principle by which we tend to group like figures, especially by color and orientation.

proximity: In perception, the closeness of two figures. The closer together they are, the more we tend to group them together perceptually.

Figure 5.7: How features guide parsing. We noted earlier that feature analysis depends on a preliminary step in which the viewer organizes the overall figure. But it turns out that the opposite is also true: The features determine how the viewer organizes the figure. For example, viewers usually interpret a T-junction as one surface dropping from view behind another. They usually interpret a Y-junction as a corner pointing toward them.

Figure 5.8: Other Gestalt principles.
(A) Similarity: We tend to group these dots into columns rather than rows, grouping dots of similar colors.
(B) Proximity: We tend to perceive groups, linking dots that are close together.
(C) Good continuation: We tend to see a continuous green bar rather than two smaller rectangles.
(D) Closure: We tend to perceive an intact triangle, reflecting our bias toward perceiving closed figures rather than incomplete ones.
(E) Simplicity: We tend to interpret a form in the simplest way possible. We would see the form on the left as two intersecting rectangles (as shown on the right) rather than as a single 12-sided irregular polygon.

Parsing is also guided by several other principles, including good continuation—a preference for organizations in which contours continue smoothly along their original course. This helps us understand why portions 2 and 5 in Figure 5.6 are grouped together as parts of a single object; but good continuation can also be documented in much simpler stimuli (Figure 5.8C). Good continuation is also relevant to Figure 5.5. Some theorists interpret the subjective contours visible in this figure as a special case of good continuation. In their view, the contour is seen to continue along its original path—even, if necessary, jumping a gap or two to achieve this continuation (Kellman & Shipley, 1991; Kellman, Garrigan, & Shipley, 2005).

good continuation: A factor in visual grouping; we tend to perceive contours in a way that alters their direction as little as possible.

subjective contours: Perceived contours that do not exist physically. We tend to complete figures that have gaps in them by perceiving a contour as continuing along its original path.

FIGURE AND GROUND

Another part of visual organization is the separation of the object from its setting, so that the object is seen as a coherent whole, separate from its background. This separation of figure and ground allows you to focus on just the banana in Figure 5.6, treating everything else in the scene as merely the backdrop for the banana. But the separation of figure and ground is just as important with simpler and entirely unfamiliar figures. In Figure 5.9A, the white splotch appears to most people as the figure and is perceived as closer to the viewer than the blue region (which is seen as the ground), as shown in Figure 5.9C. The edge between the blue and white regions is perceived as part of the figure, defining its shape. The same edge does not mark a contour for the blue region but merely marks the point at which the blue region drops from view. Of course, you can usually identify a figure so quickly and easily that it feels like the figure is somehow specified by the stimulus itself and is not an element of your interpretation.

Figure 5.9: Figure and ground. One of the early steps in seeing a form is to segregate it from its background. If we perceive the figure in part (A) as a blue rectangle with a hole in it (B), the edge marks the contour of the hole. The situation is reversed in (C). Now the edge marks the white blob, not a break in the blue background. In this sense, the edge belongs to the figure, not the ground. As it turns out, the perception in (C) is much more likely.
But the fact remains that identifying the figure, like all aspects of perceptual organization, is up to you. This is most evident whenever you realize there's more than one way to interpret a given stimulus—as in Figure 5.10, which can be seen either as a white vase or as two blue faces in profile. This reversible figure makes it clear that the stimulus itself is neutral in its organization. What is figure and what is ground, it seems, depends on how we look at it.

Figure 5.10: Reversible figure-ground pattern. This figure can be seen as either a pair of silhouetted faces or a white vase.

reversible figure: A visual pattern that easily allows more than one interpretation, in some cases changing the specification of figure and ground, in other cases changing the perceived organization in depth.

Other examples point to the same broad conclusion and highlight the perceiver's active role in interpreting the input. Is Figure 5.11A—the Necker cube—aligned with the solid cube shown in Figure 5.11B, so we're viewing it from above? Or is it aligned with the cube shown in Figure 5.11C, so we're viewing it from below? Most people can organize the Necker cube in either way, so they first perceive it to have one orientation and then the other. Apparently, then, the organization is not specified by the figure itself but is instead up to the perceiver.

All of these observations suggest that perception is less "objective" than one might suppose, because what we perceive is, it seems, often determined by how we interpret or organize the input. At the same time, it's important to realize that perceivers' inferences and interpretations tend to be neither foolish nor random. Quite the contrary: Our interpretations of the sensory input are, first of all, shaped by our experience; and they're correct far more often than not (Enns, 2004). Likewise, the interpretations themselves tend to be quite logical, as if our visual system always follows certain rules.
We've already mentioned some of these rules—a preference for grouping similar things together, for example, or a preference for parsing the input so that it creates smooth contours. But other rules also guide us: For example, we seem to prefer perceptual interpretations that explain all the information contained within the stimulus, and so we avoid interpretations that would explain only bits and pieces of the stimulus. We also seem to avoid interpretations that would involve some contradiction, such as perceiving a surface to be both opaque and transparent. What's more, we seem to avoid interpretations that depend on accident or coincidence. ("This is what the form would look like if viewed from exactly the right position.") Of course, no one claims that the perceptual apparatus is literally proceeding through a sequence of logical steps, weighing each of these rules in turn. Still, our perception does seem guided by these principles, so that our interpretations of the input will be logical and usually correct (Figure 5.12).

Figure 5.11: The Necker cube. The ambiguous Necker cube, shown in (A), can be perceived as aligned with either the cube shown in (B) or the one in (C).

Figure 5.12: Impossible figures. We've mentioned how "logical" the perceptual system seems to be, but it's important to realize that this logic has limits. As an example, consider these so-called impossible figures. We perceive them as if they show three-dimensional objects, although contradictions within each figure guarantee that they can't be three-dimensional.

NETWORK MODELS OF PERCEPTION

The last few pages have created a new question for us. We've been focusing on the interpretive nature of perception, and we'll keep doing that throughout this chapter. In all cases, people don't just "pick up" and record the stimuli that reach the eye, the way a camera or video recorder might.
Instead, they organize and shape the input. When they encounter ambiguity—and they often do—they make choices about how the ambiguity should be resolved.

But how exactly does this interpretation take place? This question needs to be pursued at two levels. First, we can try to describe the sequence of events in functional terms—first, this is analyzed; then that is analyzed—laying out in detail the steps needed to accomplish the task. Second, we can specify the neural processes that actually support the analysis and carry out the processing. Let's look at both types of explanation, starting with the functional approach.

Feature Nets

Earlier in this chapter, we noted some complications for any theorizing that involves features. Even with these complications, though, it's clear that feature detection plays a central role in object recognition. We saw in Chapter 4 that the visual system does analyze the input in terms of features: Specialized cells—feature detectors—respond to lines at various angles, curves at various positions, and the like. More evidence for the importance of features comes from behavioral studies that use a visual search procedure. In this task, a research participant is shown an array of visual forms and asked to indicate as quickly as she can whether a particular target is present—whether a vertical line is visible among the forms shown, perhaps, or a red circle is visible amid a field of squares. This task is easy if the target can be distinguished from the field by just one salient feature—for example, searching for a vertical among a field of horizontals, or for a green target amidst a group of red distracters.

visual search: A task in which participants are asked to determine whether a specified target is present within a field of stimuli.
In such cases, the target "pops out" from the distracter elements, and search time is virtually independent of the number of items in the display—so people can search through four items, say, as fast as they can search through two, or eight as fast as they can search through four (Figure 5.13, part A or B; Treisman, 1986a, 1986b; Wolfe & Horowitz, 2004; we'll have more to say about visual search later in the chapter). These results make it clear that features have priority in our visual perception. We can detect them swiftly, easily, and presumably at an early stage in the sequence of events required to recognize an object.

Figure 5.13: Visual search. In (A), you can immediately spot the vertical, distinguished from the other shapes by just one feature. Likewise, in (B), you can immediately spot the lone green bar in the field of reds. In (C), it takes much longer to find the one red vertical, because now you need to search for a combination of features—not just for red or vertical, but for the one form that has both of these attributes.

But how do we use this feature information? And how, knowing about the complications we've discussed, can we use this information to build a full model of object recognition? One option is to set up a hierarchy of detectors, with detectors in each layer serving as the triggers for detectors in the next layer (Figure 5.14).

Figure 5.14: How a feature net guides identification. An example of a feature net. Here the feature detectors respond to simple elements in the visual input. When the appropriate feature detectors are activated, they trigger a response in the letter detectors. When these are activated, in turn, they can trigger a response from a higher-level detector, such as a detector for an entire word (e.g., CLOCK).
In the figure, we've illustrated this idea with a hierarchy for recognizing words; the idea would be the same with one for recognizing objects. At the lowest level of the hierarchy would be the feature detectors we've already described—those responsive to horizontals, verticals, and so forth. At the next level of the hierarchy would be detectors that respond to combinations of these simple features. Detectors at this second level would not have to survey the visual world directly. Instead, they'd be triggered by activity at the initial level. Thus, there might be an "L" detector in the second layer of detectors that fires only when triggered by both the vertical- and horizontal-line detectors at the first level.

Hierarchical models like the one just described are known as feature nets because they involve a network of detectors that has feature detectors at its bottom level. In the earliest feature nets proposed, activation flowed only from the bottom up—from feature detectors to more complex detectors and so on through a series of larger and larger units (see, for example, Selfridge, 1959). Said differently, the input pushes the process forward, and so we can think of these processes as "data driven." More recent models, however, have also included a provision for "top-down" or "knowledge-driven" processes—processes that are guided by the ideas and expectations that the perceiver brings to the situation.

feature net: A model of pattern recognition involving a network of detectors and having feature detectors as the network's starting point.

To see how top-down and bottom-up processes interact, consider a problem in word recognition. Suppose you're shown a three-letter word in dim light. In this setting, your visual system might register the fact that the word's last two letters are AT; but at least initially, the system has no information about the first letter. How, then, would you choose among MAT, CAT, and RAT?
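The bottom-up hierarchy described above can be sketched as a tiny program. This is a toy illustration, not any published model: the feature lists assigned to each letter and the two-word vocabulary are invented for the example.

```python
# Minimal bottom-up ("data-driven") feature net in the spirit of Figure 5.14:
# feature detectors trigger letter detectors, which trigger word detectors.
# The per-letter feature lists and the vocabulary are illustrative assumptions.

LETTER_FEATURES = {
    "C": {"open-curve"},
    "A": {"two-diagonals", "crossbar"},
    "T": {"vertical", "top-bar"},
    "R": {"vertical", "closed-curve", "diagonal"},
}
WORDS = ["CAT", "RAT"]

def active_letters(features):
    """Letter detectors that fire: every feature they require is present."""
    return {l for l, req in LETTER_FEATURES.items() if req <= features}

def recognize_word(feature_sets):
    """Takes one detected feature set per letter position; a word detector
    fires only if each of its letters fired in the matching position."""
    per_pos = [active_letters(fs) for fs in feature_sets]
    return [w for w in WORDS
            if len(w) == len(per_pos)
            and all(ch in s for ch, s in zip(w, per_pos))]

stimulus = [{"open-curve"}, {"two-diagonals", "crossbar"}, {"vertical", "top-bar"}]
print(recognize_word(stimulus))  # ['CAT']
```

Notice that activation here flows strictly upward, features to letters to words; nothing in this sketch lets knowledge about likely words influence the letter level, which is exactly the limitation the top-down extensions address.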
Suppose that, as part of the same experiment, you've just been shown a series of words including several names of animals (dog, mouse, canary). This experience will activate your detectors for these words, and the activation is likely to spread out to the memory neighbors of these detectors—including (probably) the detectors for CAT and RAT. Activation of the CAT or RAT detector, in turn, will cause a top-down, knowledge-driven activation of the detectors for the letters in these words, including C and R (Figure 5.15).

Figure 5.15: Bidirectional activation. (A) Thoughts about the concept animal activate the words CAT and RAT (among others), which then activate their constituent letters, including the first-position letters C and R. This process also inhibits incompatible words, such as MAT. Activation is indicated with arrows; inhibition in red. (B) Some milliseconds later, further perceptual analysis of the first letter of the stimulus word has activated the feature curved-to-the-left, which partially activates the letter C. This process adds to the activation of the word CAT and inhibits the letter R and the word RAT. Thus, the word CAT is more intensely activated than all other words. To keep things simple, many other connections (e.g., between the words CAT and RAT) are not shown in the figure.

While all this is going on, the data-driven analysis continues; by now, your visual system has likely detected that the left edge of the target letter is curved (Figure 5.15B).
This bottom-up effect alone might not be enough to activate the detector for the letter C; but notice that this detector is also receiving some (top-down) stimulation (Figure 5.15A). As a result, the C detector is now receiving stimulation from two sources—from below (the feature detector) and from above (from CAT)—and this combination of inputs will probably be enough to activate the C detector. Then, once this detector is activated, it will feed back to the CAT detector, activating it still further. (For an example of models that work in this way, see McClelland, Rumelhart, & Hinton, 1986; also Grainger, Rey, & Dufau, 2008.)

It's important, though, that we can describe all of these steps in two different ways. If we look at the actual mechanics of the process, we see that detectors are activating (or inhibiting) other detectors; that's the only thing going on here. At the same time, we can also describe the process in broader terms: Basically, the initial activation of CAT functions as a knowledge-driven "hypothesis" about the stimulus, and that hypothesis makes the visual system more receptive to the relevant "data" coming from the feature detectors. In this example, the arriving data confirm the hypothesis, thus leading to the exclusion of alternative hypotheses.

With these points in view, let's return to the question we asked earlier: We've discussed how the perceptual system interprets the input, and we've emphasized that the interpretation is guided by rules. But what processes in our mind actually do the "interpreting"? We can now see that the interpretive process is carried out by a network of detectors, and the interpretive "rules" are built into the way the network functions. For example, how do we ensure that the perceptual interpretation is compatible with all the information in the input?
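The MAT/CAT/RAT example can be sketched as a short interactive-activation loop. The connection weights and the single priming step below are invented for illustration; they are not parameters from McClelland and Rumelhart's model, only a demonstration of how top-down priming, bottom-up evidence, and inhibition can combine.

```python
# Toy interactive-activation sketch of the CAT/MAT/RAT example.
# All weights and the priming step are illustrative assumptions.

WORDS = {"MAT": "M", "CAT": "C", "RAT": "R"}   # each word's first letter
act = {w: 0.0 for w in WORDS}                  # word-detector activations
letters = {"M": 0.0, "C": 0.0, "R": 0.0}       # first-position letter detectors

# Top-down: thinking about animals primes CAT and RAT (knowledge-driven).
for w in ("CAT", "RAT"):
    act[w] += 1.0
for w, first in WORDS.items():                 # words pass activation to letters
    letters[first] += 0.5 * act[w]

# Bottom-up: the "curved-to-the-left" feature weakly supports the letter C.
letters["C"] += 0.6

# Letters feed back to their words; words whose first letter lost are inhibited.
for w, first in WORDS.items():
    act[w] += letters[first]
    act[w] -= 0.3 * sum(v for l, v in letters.items() if l != first)

winner = max(act, key=act.get)
print(winner)  # CAT ends up more active than RAT or MAT
```

Neither source alone decides the outcome: the priming left CAT and RAT tied, and the curve feature alone barely favors C; only their combination, plus inhibition of the incompatible alternatives, makes CAT win.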
This point is guaranteed by the fact that the feature detectors help shape the network's output, and this simple fact makes it certain that the output will be constrained by information in the stimulus. How do we ensure that our perception contains no contradiction (e.g., perceiving a surface to be both opaque and transparent)? This is guaranteed by the fact that detectors within the network can inhibit other (incompatible) detectors. With mechanisms like these in place, the network's output is sure to satisfy all of our rules—or at least to provide the best compromise possible among the various rules.

From Features to Geons to Meaning

The network we've described so far can easily recognize targets as simple as squares and circles, letters and numerals. But what about the endless variety of three-dimensional objects that surround us? For these, theorists believe we can still rely on a network of detectors; but we need to add some intermediate levels of analysis.

A model proposed by Irving Biederman, for example, relies on some 30 geometric components that he calls geons (short for "geometric ions"). These are three-dimensional figures such as cubes, cylinders, pyramids, and the like; nearly all objects can be broken down perceptually into some number of these geons. To recognize an object, therefore, we first identify its features and then use these to identify the component geons and their relationships. We then consult our visual memory to see if there's an object that matches up with what we've detected (Biederman, 1987; Figure 5.16).

5.16 Some proposed geometric primitives Part (A) shows some of the geons (geometric ions) that our perceptual system uses in its analysis of complex forms. Part (B) shows how geons are assembled into objects—so that geon 5 side-attached to geon 3 creates the shape of a coffee cup.

geons (geometric ions) Simple geometric figures, such as cubes, cylinders, and pyramids, that can be combined to create all other shapes. An early (and crucial) step in some models of object recognition is determining which geons are present.

In Biederman's system, we might describe a lamp, say, as being a certain geon (number 4 in Figure 5.16) on top of another (number 3). This combination of geons gives us a complete description of the lamp's geometry. But this isn't the final step in object recognition, because we still need to assign some meaning to this geometry. We need to know that the shape is something we call a lamp—that it's an object that casts light and can be switched on and off.

As with most other aspects of perception, those further steps usually seem effortless. We see a lamp (or a chair, or a pickup truck) and immediately know what it is and what it is for. But as easy as these steps seem, they're far from trivial. Remarkably, we can find cases in which the visual system successfully produces an accurate structural description but fails in these last steps of endowing the perceived object with meaning. The cases involve patients who have suffered certain brain lesions leading to visual agnosia (Farah, 1990). Patients with this disorder can see, but they can't recognize what they see (Chapter 3). Some patients can perceive objects well enough to draw recognizable pictures of them; but they're unable to identify either the objects or their own drawings. One patient, for example, produced the drawings shown in Figure 5.17. When asked to say what he had drawn, he couldn't name the key and said the bird was a tree stump. He evidently had formed adequate structural descriptions of these objects, but his ability to process what he saw stopped there; his perceptions were stripped of their meaning (Farah, 1990, 2004).
5.17 Drawings by a patient with associative agnosia The left column shows the forms shown to the patient; the right column shows the patient's drawings. While the patient could see the models well enough to reproduce them accurately, he was unable to recognize these objects.

THE NEUROSCIENCE OF VISION

Where are we so far? We started by acknowledging the complexities of form perception—including the perceiver's need to interpret and organize visual information. We then considered how this interpretation might be achieved—through a network of detectors, shaped by an interplay between bottom-up and top-down processes. But these points simply invite the next question: How does the nervous system implement these processes? More broadly, what events in the eye, the optic nerve, and the brain make perception possible?

The Visual Pathway

As we saw in Chapter 4, the rods and cones pass their signals to the bipolar cells, which relay them to the ganglion cells (Figure 4.26). The axons of the ganglion cells form the optic nerve, which leaves the eyeball and begins the journey toward the brain. But even at this early stage, the neurons are specialized in important ways, and different cells are responsible for detecting different aspects of the visual world. The ganglion cells, for example, can be broadly classified into two categories: the smaller ones are called parvo cells, and the larger are called magno cells (parvo and magno are Latin for "small" and "large"). Parvo cells, which blanket the entire retina, far outnumber magno cells. Magno cells, in contrast, are found largely in the retina's periphery.

parvo cells Ganglion cells that, because of their sensitivity to differences in hue, are particularly suited to perceiving color and form.
Parvo cells appear to be sensitive to color differences (to be more precise, to differences either in hue or in brightness), and they probably play a crucial role in our perception of pattern and form. Magno cells, on the other hand, are insensitive to hue differences but respond strongly to changes in brightness; they play a central role in the detection of motion and the perception of depth.

magno cells Ganglion cells that, because of their sensitivity to brightness changes, are particularly suited to perceiving motion and depth.

This pattern of neural specialization continues and sharpens as we look more deeply into the nervous system. The relevant evidence comes largely from the single-cell recording technique, which lets investigators determine which specific stimuli elicit a response from a cell and which do not (see Chapter 4). This technique has allowed investigators to explore the visual system cell by cell and has given us a rich understanding of the neural basis for vision.

PARALLEL PROCESSING IN THE VISUAL CORTEX

In Chapter 4, we noted that cells in the visual cortex each seem to have a "preferred stimulus"—and each cell fires most rapidly whenever this special stimulus is in view. For some cells, the preferred stimulus is relatively simple—a curve, or a line tilted at a particular orientation. For other cells, the preferred stimulus is more complex—a corner, an angle, or a notch. Still other cells are sensitive to the color (hue) of the input. Others fire rapidly in response to motion—some cells are particularly sensitive to left-to-right motion, others to the reverse.

This abundance of cell types suggests that the visual system relies on a "divide-and-conquer" strategy. Different cells—and even different areas of the brain—each specialize in a particular kind of analysis.
Moreover, these different analyses go on in parallel: The cells analyzing the forms do their work at the same time that other cells are analyzing the motion and still others the colors. Using single-cell recording, investigators have been able to map where these various cells are located in the visual cortex as well as how they communicate with each other; Figure 5.18 shows one of these maps.

[Figure: a diagram of connections among the retina, the LGN, and cortical areas including V1, V2, V3, V4, MT, MST, PO, VIP, LIP, TEO, and TE, spanning the occipital, parietal, and inferotemporal cortex.]

5.18 The visual processing pathways Each box in this figure refers to a specific location within the visual system; the two blue boxes at the left refer to locations outside the cortex; all other boxes refer to locations on the cortex. Notice two key points: First, vision depends on many different brain sites, each performing a specialized type of analysis. Second, the flow of information is complex—so there's surely no strict sequence of "this step" of analysis followed by "that step." Instead, everything happens at once and there's a great deal of back-and-forth communication among the various elements.

Why this reliance on parallel processing? For one thing, parallel processing allows greater speed, since (for example) brain areas trying to recognize the shape of the stimulus aren't kept waiting while other brain areas complete the motion analysis or the color analysis. Instead, all types of analysis can take place simultaneously. Another advantage of parallel processing lies in its ability to allow each system to draw information from the others. Thus, your understanding of an object's shape can be sharpened by a consideration of how the object is moving; your understanding of its movement can be sharpened by noting what the shape is (especially the shape in three dimensions). This sort of mutual influence is easy to arrange if the various types of analysis are all going on at the same time (Van Essen & DeYoe, 1995).
The parallel processing we see in the visual cortex also clarifies a point we've already discussed. Earlier in the chapter, we argued that the inventory of a figure's features depends on how the perceiver organizes the figure. So, in Figure 5.4, the features of the letters were absent with one interpretation of the shapes, but they're easily visible with a different interpretation. In Figure 5.5, some of the features (the triangle's sides) aren't present in the figure but seem to be created by the way we interpret the arrangement of features. Observations like these make it sound like the interpretation has priority because it determines what features are present.

But it would seem that the reverse claim must also be correct, because the way we interpret a form depends on the features we can see. After all, no matter how you try to interpret Figure 5.18, it's not going to look like a race car, or a map of Great Britain, or a drawing of a porcupine. The form doesn't include the features needed for those interpretations; and as a result, the form will not allow these interpretations. This seems to suggest that it's the features, not the interpretation, that have priority: The features guide the interpretation, and so they must be in place before the interpretation can be found.

How can both of these claims be true—with the features depending on the interpretation and the interpretation depending on the features? The answer involves parallel processing. Certain brain areas are sensitive to the input's features, and the cells in these areas do their work at the same time that other brain areas are analyzing the larger-scale configuration. These two types of analysis, operating in parallel, can then interact with each other, ensuring that our perception makes sense at both the large-scale and fine-grained levels.
THE "WHAT" AND "WHERE" SYSTEMS

Evidence for specialized neural processes, all operating in parallel, continues as we move beyond the visual cortex. As Figure 5.19 indicates, information from the visual cortex is transmitted to two other important brain areas: the inferotemporal cortex (the lower part of the temporal cortex) and the parietal cortex. The pathway carrying information to the temporal cortex is often called the "what" system; it plays a major role in the identification of visual objects, telling us whether the object is a cat, an apple, or whatever. The second pathway, which carries information to the parietal cortex, is often called the "where" system; it tells us where an object is located—above or below, to our right or left (Ungerleider & Haxby, 1994; Ungerleider & Mishkin, 1982).

"what" system The visual pathway leading from the visual cortex to the temporal lobe; especially involved in identifying objects.

"where" system The visual pathway leading from the visual cortex to the parietal lobe; especially involved in locating objects in space and coordinating movements.

There's been some controversy, however, over how exactly we should think about these two systems. Some theorists, for example, propose that the path to the parietal cortex isn't concerned with the conscious perception of position. Instead, it's primarily involved in the unnoticed, automatic registration of spatial location that allows us to control our movements as we reach for or walk toward objects in our visual world. Likewise, this view proposes that the pathway to the temporal cortex isn't really a "what" system; instead, it's associated with our conscious sense of the world around us, including our conscious recognition of objects and our assessment of what these objects look like (e.g., Goodale & Milner, 2004; also D. Carey, 2001; Sereno & Maunsell, 1998).
No matter how this debate is settled, there can be no question that these two pathways serve very different functions. Patients who have suffered lesions in the occipital-temporal pathway—most people agree this is the "what" pathway—show visual agnosia (see Chapter 3). They may be unable to recognize common objects, such as a cup or a pencil. They're often unable to recognize the faces of relatives and friends—but if the relatives speak, the patients can recognize them by their voices. At the same time, these patients show little disorder in visual orientation and reaching. On the other hand, patients who have suffered lesions in the occipital-parietal pathway—usually understood as the "where" pathway—show the reverse pattern. They have difficulty in reaching for objects but no problem with identifying them (A. Damasio, Tranel, & H. Damasio, 1989; Farah, 1990; Goodale, 1995; Newcombe, Ratcliff, & Damasio, 1987).

[Figure labels: frontal lobe, parietal lobe, posterior parietal cortex, occipital lobe, temporal lobe, inferotemporal cortex.]

5.19 The "what" and "where" pathways Information from the primary visual cortex at the back of the head is transmitted to the inferotemporal cortex (the so-called "what" system) and to the posterior parietal cortex (the "where" system).

The Binding Problem

It's clear, then, that natural selection has favored a division-of-labor strategy for vision: The processes of perception are made possible by an intricate network of subsystems, each specialized for a particular task and all working together to create the final product—an organized and coherent perception of the world. We've seen the benefits of this design, but the division-of-labor setup also creates a problem for the visual system. If the different aspects of vision—the perception of shape, color, movement, and distance—are carried out by different processing modules, then how do we manage to recombine these pieces of information into one whole?
For example, when we see a ballet dancer in a graceful leap, the leap itself is registered by the magno cells; the recognition of the ballet dancer depends on parvo cells. How are these pieces put back together? Likewise, when we reach for a coffee cup but stop midway because we see that the cup is empty, the reach itself is guided by the occipital-parietal system (the "where" system); the fact that the cup is empty is perceived by the occipital-temporal system (the "what" system). How are these two streams of processing coordinated?

We can examine the same issue in light of our subjective impression of the world around us. Our impression, of course, is that we perceive a cohesive and organized world. After all, we don't perceive big and blue and distant; we instead perceive sky. We don't perceive brown and large shape on top of four shapes and moving; instead, we perceive our pet dog running along. Somehow, therefore, we do manage to re-integrate the separate pieces of visual information. How do we achieve this reunification? Neuroscientists call this the binding problem—how the nervous system manages to bind together elements that were initially detected by separate systems.

binding problem The problem confronted by the brain of recombining the elements of a stimulus, given the fact that these elements are initially analyzed separately by different neural systems.

We're just beginning to understand how the nervous system solves the binding problem. But evidence is accumulating that the brain uses a pattern of neural synchrony—different groups of neurons firing in synchrony with each other—to identify which sensory elements belong with which. Specifically, imagine two groups of neurons in the visual cortex. One group of neurons fires maximally whenever a vertical line is in view. Another group of neurons fires maximally whenever a stimulus is in view moving from left to right.
Also imagine that, right now, a vertical line is presented, and it is moving to the right. As a result, both groups of neurons are firing rapidly. But how does the brain encode the fact that these attributes are bound together, different aspects of a single object? How does the brain differentiate between this stimulus and one in which the features being detected actually belong to different objects—perhaps a static vertical and a moving diagonal?

The answer lies in the timing of the firing by these two groups of neurons. In Chapter 3, we emphasized the firing rates of various neurons—whether a neuron was firing at, say, 100 spikes per second or 10. But we also need to consider exactly when a neuron is firing, and whether, in particular, it is firing at the same moment as other neurons. When the neurons are synchronized, this seems to be the nervous system's indication that the messages from the synchronized neurons are in fact bound together. To return to our example, if the neurons detecting a vertical line are firing in synchrony with the neurons signaling movement—like a group of drummers all keeping the same beat—then these attributes, vertical and moving, are registered as belonging to the same object. If the neurons are not firing in synchrony, the features are registered as belonging to separate objects (Buzsáki & Draguhn, 2004; Csibra, Davis, Spratling, & Johnson, 2000; M. Elliott & Müller, 2000; Fries, Reynolds, Rorie, & Desimone, 2001; Gregoriou, Gotts, Zhou, & Desimone, 2009).

PERCEPTUAL CONSTANCY

We began this chapter by noting that perception seems easy and immediate—we open our eyes and see, with no apparent complexities. But we also said that this intuition is misleading, because perception involves considerable complexities—many steps to be taken, and many points at which the perceiver must play an active role in interpreting the input. This message emerged again and again in our discussion of how we recognize the objects that surround us, and the same broad message—multiple steps, and an active role—emerges when we consider another essential aspect of perceiving: the achievement of perceptual constancy. This term refers to the fact that we perceive the constant properties of objects in the world (their sizes, shapes, and so on) even though the sensory information we receive about these attributes changes whenever our viewing circumstances change.

perceptual constancy The accurate perception of certain attributes of a distal object, such as its shape, size, and brightness, despite changes in the proximal stimulus caused by variations in our viewing circumstances.

To illustrate this point, consider the perception of size. If we happen to be far away from the object we're viewing, then the image cast onto our retinas by that object will be relatively small. If we approach the object, then the image size will increase. We're not fooled by this variation in image size, though. Instead, we manage to achieve size constancy—correctly perceiving the sizes of objects in the world despite the changes in retinal-image size created by changes in viewing distance. Likewise, if we view a door straight on, the retinal image will be rectangular in shape; but if we view the same door from an angle, the retinal image will have a different shape (Figure 5.20). Still, we achieve shape constancy—that is, we correctly perceive the shapes of objects despite changes in the retinal image created by shifts in our viewing angle. We also achieve brightness constancy—we correctly perceive the brightness of objects whether they're illuminated by dim light or strong sun.
5.20 Shape constancy When we see a door at various slants from us, it appears rectangular even though its retinal image is often a trapezoid.

5.21 An invariant relationship that provides information about size (A) and (B) show a dog at different distances from the observer. The retinal size of the dog varies with distance, but the ratio between the retinal size of the dog and the retinal size of the textural elements (e.g., the floor tiles) is constant.

Unconscious Inference

How do we achieve each of these forms of constancy? One hypothesis focuses on relationships within the retinal image. In judging size, for example, we might be helped by the fact that we generally see objects against some background, and various elements in the background can provide a basis for comparison with the target object. Thus the dog sitting nearby on the kitchen floor is half as tall as the chair and hides a number of the kitchen's floor tiles from view. If we take several steps back from the dog, none of these relationships changes, even though the sizes of all the retinal images are reduced (Figure 5.21). Size constancy, therefore, might be achieved by focusing not on the images themselves but on these unchanging relationships.

Relationships do contribute to size constancy, and that's why we are better able to judge size when comparison objects are in view or when the target we're judging sits on a surface that has a uniform visual texture (like the floor tiles in the example). But these relationships don't tell the whole story. Size constancy is found even when the visual scene offers no basis for comparison—if, for example, the object to be judged is the only object in view—provided that other cues signal the distance of the target object (Chevrier & Delorme, 1983; Harvey & Leibowitz, 1967; Holway & Boring, 1947). How might our visual system use this distance information?
More than a century ago, the German physicist Hermann von Helmholtz developed an influential hypothesis regarding this question. Helmholtz started with the fact that there's a simple inverse relationship between distance and retinal image size: If an object doubles its distance from the viewer, the size of its image is reduced by half. If an object triples its distance, the size of its image is reduced to a third of its initial size. This relationship is guaranteed to hold true because of the principles of optics, and the relationship makes it possible for perceivers to achieve size constancy by means of a simple calculation. Of course, Helmholtz knew that we don't run through a conscious calculation every time we perceive an object's size; but he believed we were calculating nonetheless—and so he referred to the process as an unconscious inference (Helmholtz, 1909).

unconscious inference A process postulated by Hermann von Helmholtz to explain certain perceptual phenomena such as size constancy. For example, an object is perceived to be at a certain distance and this is unconsciously taken into account in assessing its retinal image size, with the result that size constancy is maintained.

What is the calculation that allows someone to perceive size correctly? It's simply multiplication: the size of the image on the retina, multiplied by the distance between you and the object. (We'll have more to say about how you know this distance in a later section.) Thus, imagine an object that, at a distance of 10 feet, casts an image on the retina that's 4 millimeters across (Figure 5.22). The same object, at a distance of 20 feet, casts an image of 2 millimeters. In both cases, the product—10 × 4 or 20 × 2—is the same. If, therefore, your size estimate depends on that product, your size estimate won't be thrown off by viewing distance—and of course, that's exactly what we want.

5.22 The relationship between image size and distance If an object moves to a new distance, the size of the retinal image cast by that object changes. A doubling of the distance reduces the retinal image by half. If the distance is tripled, the retinal image is cut to one-third of its initial size. (A) Closer objects cast larger retinal images; (B) farther objects cast smaller retinal images.

What's the evidence that size constancy does depend on this sort of inference? In many experiments, researchers have shown people some object and, without changing the object's retinal image, changed the apparent distance of the object. (There are many ways to do this—lenses that change how the eye has to focus to bring the object into sharp view, or mirrors that change how the two eyes have to angle inward so that the object's image is centered on both foveas.) If people are—as Helmholtz proposed—using distance information to judge size, then these manipulations should affect size perception. Any manipulation that makes an object seem farther away (without changing retinal image size) should make that object seem bigger. Any manipulation that makes the object seem closer should make it look smaller. And, in fact, these predictions are correct—a powerful confirmation that we do use distance to judge size.

A similar proposal explains how people achieve shape constancy. Here, we take the slant of the surface into account and make appropriate adjustments—again, an unconscious inference—in our interpretation of the retinal image's shape. Likewise for brightness constancy: We seem to be quite sensitive to how a surface is oriented relative to the available light sources, and we take this information into account in estimating how much light is reaching the surface. Then we use this assessment of lighting to judge the surface's brightness (e.g., whether it's black or gray or white).
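Helmholtz's unconscious inference for size amounts to simple arithmetic: perceived size tracks retinal image size multiplied by perceived distance. The sketch below uses the chapter's own numbers (4 mm at 10 feet, 2 mm at 20 feet); the scaling constant is an arbitrary assumption, and the code is an illustration of the arithmetic, not a model of any actual neural computation.

```python
# Size constancy as multiplication: image size x perceived distance.
# Units (feet, millimeters) and the scaling constant are illustrative.

def retinal_image_mm(object_size: float, distance: float) -> float:
    """Retinal image size is inversely proportional to distance.
    The constant 40.0 is an arbitrary scaling factor for this sketch."""
    return 40.0 * object_size / distance

def perceived_size(image_mm: float, perceived_distance: float) -> float:
    """The 'unconscious inference': image size times apparent distance."""
    return image_mm * perceived_distance

# The chapter's example: at 10 feet the image is 4 mm; at 20 feet, 2 mm.
near = retinal_image_mm(object_size=1.0, distance=10.0)   # 4.0 mm
far = retinal_image_mm(object_size=1.0, distance=20.0)    # 2.0 mm

# The product is constant (10 x 4 == 20 x 2), so the size estimate
# is unaffected by viewing distance.
assert perceived_size(near, 10.0) == perceived_size(far, 20.0)

# And the illusions described next follow directly: if cues make the
# object seem twice as far away without changing its image, it looks bigger.
assert perceived_size(near, 20.0) > perceived_size(near, 10.0)
```

The second assertion is exactly the logic behind the experiments described above: hold the retinal image fixed, manipulate apparent distance, and the size estimate changes.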
In all these cases, therefore, it appears that our perceptual system does draw some sort of unconscious inference, taking our viewing circumstances into account in a way that allows us to perceive the constant properties of the visual world.

Illusions

This process of taking information into account—no matter whether we're taking viewing distance into account, or viewing angle, or illumination—is crucial for achieving constancy. More than that, it's yet another indication that we don't just "receive" visual information, we interpret it. The interpretation is always an essential part of our perception and generally helps us perceive the world correctly. But the role of the interpretation becomes especially clear when we misinterpret the information available to us and end up misperceiving the world.

Consider the two tabletops shown in Figure 5.23. The table on the left looks appreciably longer and thinner than the one on the right; a tablecloth that fits one table surely won't fit the other. Objectively, though, the parallelogram depicting the left tabletop is exactly the same shape as the one depicting the right tabletop. If you were to cut out the left tabletop, rotate it, and slide it onto the right tabletop, they'd be an exact match. (Not convinced? Just lay another piece of paper on top of the page, trace the left tabletop, and then move your tracing onto the right tabletop.)

5.23 Two tabletops The table on the left looks longer and thinner than the one on the right, but in fact, the parallelograms depicting each tabletop are identical. If you were to cut out the left parallelogram, rotate it, and slide it onto the right parallelogram, they'd line up perfectly. The apparent difference in their shapes is an illusion resulting from the way viewers interpret the figure.

Why do people misperceive these shapes? The answer involves the normal mechanisms of shape constancy.
Cues to depth in this figure cause the viewer to perceive the figure as a drawing of two three-dimensional objects, each viewed from a particular angle. This leads the viewer—quite automatically—to adjust for the (apparent) viewing angles in order to perceive the two tabletops, and it's this adjustment that causes the illusion. Notice, then, that this illusion about shape is caused by a misperception of depth: The viewer misperceives the depth relationships in the drawing and then takes this faulty information into account in interpreting the shapes. (For a related illusion, see Figure 5.24.)

5.24 The monster illusion The two monsters shown here are the same size on the page, but the monster on the right appears larger. This is because the depth cues in the picture make the monster on the right appear to be farther away. This (mis)perception of distance leads to a (mis)perception of size.

A different example is shown in Figure 5.25. It seems obvious to most viewers that the center square in this checkerboard (third row, third column) is a brighter shade than the square indicated by the arrow. But, in truth, the shade of gray shown on the page is identical for these two squares. What has happened here? The answer again involves the normal mechanisms of perception. Notice, first, that the central square is surrounded by dark squares; this arrangement creates a contrast effect that makes the central square look brighter. The square marked at the edge of the checkerboard, on the other hand, is surrounded by white squares; here, contrast makes the marked square look darker. So for both squares, we have contrast effects that move us toward the illusory perception. But the visual system also detects that the central square is in the shadow cast by the cylinder. Our vision compensates for this fact—again, an example of unconscious inference that takes the shadow into account in judging brightness—and powerfully magnifies the illusion.
5.25 A brightness illusion Most viewers will agree that the center square in this checkerboard (third row, third column) is a brighter shade than the square indicated with the arrow. But, in truth, the shade of gray shown on the page is identical for these two squares! If you don't believe it, use your fingers or pieces of paper to cover everything in the figure except these two squares.

DISTANCE PERCEPTION: WHERE IS IT?

So far in this chapter, we've emphasized how you recognize the objects you encounter. This focus has led us to consider how you manage to perceive forms as well as how you cope with variations in viewing circumstances in order to perceive an object's shape and size correctly. And once again, this discussion leads to a new question: To perceive what something is, you need to achieve constancy. But, to achieve constancy, you need to perceive where something is—how far it is from you (so that you can achieve size constancy) and how it is angled relative to your line of view (so that you can achieve shape constancy).

depth cues Sources of information that signal the distance from the observer to the distal stimulus.

binocular disparity A depth cue based on the differences between the two eyes' views of the world. This difference becomes less pronounced the farther an object is from the observer.

Of course, information about where things are in your world is also valuable for its own sake. If you want to walk down a hallway without bumping into things, you need to know which obstacles are close to you and which ones are far off. If you wish to caress a loved one, you need to know where he or she is; otherwise, you're likely to poke him or her in the eye. Plainly, then, you need to know where objects in your world are located. How, therefore, do you manage to perceive a three-dimensional world, judging which objects are close and which are far?
The answer centers on depth cues—features of the stimulus that indicate an object's position. What are these cues?

Binocular Cues

One important cue for distance comes from the fact that our two eyes look out onto the world from slightly different positions; as a result, each eye has a slightly different view. This difference between the two eyes' views is called binocular disparity, and it gives us important information about distance relationships in the world (Figure 5.26). Binocular disparity can induce the perception of depth even when no other distance cues are present. For example, the bottom panels of Figure 5.26 show the views that each eye would receive while looking at a pair of nearby objects. If we present each of these views to the appropriate eye (e.g., by drawing the views on two cards and placing one card in front of each eye), we can obtain a striking impression of depth.

Disparity was the principle behind the stereoscope, a device popular in the 19th century (Figure 5.27), which presented a slightly different photograph to each eye and so created a vivid sense of depth. The same principle is used in 3-D movies, in which two different movies—presenting two slightly different views of each scene—are projected simultaneously onto the theatre's screen. For these movies, viewers wear special glasses to ensure that their left eye sees one of the movies and their right eye sees the other. In this way, each eye gets the appropriate input and creates the binocular disparity that in turn produces a compelling perception of depth.

5.26 Binocular disparity  Two images at different distances from the observer will present somewhat different retinal images. In the left eye's view, these images are close together on the retina; in the right eye's view, the images are farther apart. This disparity between the views serves as a powerful cue for depth.

Monocular Cues
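The geometry behind binocular disparity can be made concrete with a small numerical sketch. The chapter describes the cue qualitatively; the formula and the focal-length and eye-separation values below are illustrative assumptions, not figures from the text. For small angles, an object at depth z projects to the two retinas with a horizontal offset of roughly d = f·b/z, where b is the separation between the eyes and f is the eye's focal length, so nearer objects produce larger disparities:

```python
# Illustrative sketch of stereo geometry (values are assumptions, not from
# the text). For small angles, an object at depth z metres projects to the
# two retinas with a horizontal offset (disparity) of about d = f * b / z.

F_EYE = 0.017   # assumed focal length of the eye, in metres
B_EYES = 0.065  # assumed separation between the two pupils, in metres

def disparity(depth_m: float) -> float:
    """Approximate retinal disparity (metres on the retina) for an object
    at the given depth, relative to a fixation point very far away."""
    return F_EYE * B_EYES / depth_m

# Nearer objects yield larger disparities; the cue weakens with distance,
# which matches the definition of binocular disparity given in the margin.
print(f"disparity at 0.5 m: {disparity(0.5) * 1000:.3f} mm")
print(f"disparity at 10 m:  {disparity(10.0) * 1000:.4f} mm")
```

The rapid falloff with depth is one reason disparity is most useful for judging the relative distance of nearby objects.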
Binocular disparity has a powerful effect on the way we perceive depth. But we can also perceive depth with one eye closed; so, clearly, there must be cues for depth that depend only on what each eye sees by itself. These are the monocular depth cues.

monocular depth cues  Features of the visual stimulus that indicate distance even if the stimulus is viewed with only one eye.

One of the monocular depth cues depends on the adjustment that the eye must make to see the world clearly. Specifically, we've already mentioned that in each eye, muscles adjust the shape of the lens to produce a sharply focused image on the retina. The amount of adjustment depends on how far away the viewed object is—there's a lot of adjustment for nearby objects, less for those a few steps away, and virtually no adjustment at all for objects more than a few meters away. It turns out that perceivers are sensitive to the amount of adjustment and use it as a cue indicating how far away the object is.

5.27 Stereoscope and View-Master  After their invention in 1833, stereoscopes were popular for many years. They work by presenting one picture to the left eye and another to the right; the disparity between the pictures creates a vivid sense of depth. The View-Master, a popular children's toy, works exactly the same way. The photos on the wheel are actually pairs—at any rotation, the left eye views the leftmost photo (the one at 9 o'clock on the wheel) and the right eye views the rightmost photo (the one at 3 o'clock).

Another set of monocular cues has been exploited for centuries by artists to create an impression of depth on a flat surface—that is, within a picture—which is why these cues are often called pictorial cues. In each case, these cues rely on straightforward principles of physics. For example, imagine a situation in which a man is trying to admire a sports car, but a mailbox is in the way (Figure 5.28A). In this case, the mailbox will inevitably block the view simply because light can't travel through an opaque object. This fact about the physical world provides a cue we can use in judging distance. The cue is known as interposition (Figure 5.28B)—the blocking of our view of one object by some other object. In this example, interposition tells the man that the mailbox is closer than the car.

pictorial cues  Patterns that can be represented on a flat surface in order to create a sense of a three-dimensional object or scene.

interposition  A monocular cue to distance that relies on the fact that objects farther away are blocked from view by closer objects.

5.28 Pictorial cues  (A) This man is looking at the sports car, but the mailbox blocks part of his view. (B) Here's how this scene looks from the man's point of view. Because the mailbox blocks the view, we get a simple but powerful cue that the mailbox must be closer to the man than the sports car is.

In the same way, distant objects necessarily produce a smaller retinal image
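The accommodation cue described under Monocular Cues can also be illustrated numerically. The sketch below is not from the text: it applies the thin-lens equation, 1/f = 1/d_o + 1/d_i, with an assumed fixed lens-to-retina distance, to show why the lens must adjust a great deal for nearby objects but hardly at all once an object is more than a few meters away:

```python
# Illustrative sketch of the accommodation cue (the lens-to-retina
# distance below is an assumed value, not a figure from the text).
# Thin-lens equation: 1/f = 1/d_o + 1/d_i, with d_i roughly fixed in
# the eye, so required lens power (1/f, in dioptres) varies with d_o.

D_IMAGE = 0.017  # assumed lens-to-retina distance, in metres

def lens_power(object_distance_m: float) -> float:
    """Lens power in dioptres needed to focus an object at the given
    distance onto the retina."""
    return 1.0 / object_distance_m + 1.0 / D_IMAGE

resting = lens_power(1e9)  # effectively "optical infinity"

# Extra accommodation required, relative to far viewing: large for
# nearby objects, negligible beyond a few metres.
for d in (0.25, 1.0, 3.0, 100.0):
    print(f"{d:>6} m -> +{lens_power(d) - resting:.2f} dioptres")
```

The extra accommodation shrinks roughly as 1/distance, which is why this cue carries useful distance information only for objects within a few meters of the viewer.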
