Visual Perception PDF

CHAPTER 3: VISUAL PERCEPTION

Perception (from the Latin perceptio) is the organization, identification, and interpretation of sensory information in order to represent and understand the presented information, or the environment. All perception involves signals that go through the nervous system, which in turn result from physical or chemical stimulation of the sensory system. For example, vision involves light striking the retina of the eye, smell is mediated by odour molecules, and hearing involves pressure waves. Perception is not only the passive receipt of these signals; it is also shaped by the recipient's learning, memory, expectation, and attention. Perception can be split into two processes: (1) processing the sensory input, which transforms this low-level information into higher-level information (e.g., extracts shapes for object recognition), and (2) processing which is connected with a person's concepts and expectations (or knowledge) and selective mechanisms (attention) that influence perception. Perception depends on complex functions of the nervous system, but subjectively seems mostly effortless because this processing happens outside conscious awareness.

Since the rise of experimental psychology in the 19th century, psychology's understanding of perception has progressed by combining a variety of techniques. Psychophysics quantitatively describes the relationships between the physical qualities of sensory input and perception. Sensory neuroscience studies the brain mechanisms underlying perception. Perceptual systems can also be studied computationally, in terms of the information they process. Perceptual issues in philosophy include the extent to which sensory qualities such as sound, smell or colour exist in objective reality rather than in the mind of the perceiver.
Although the senses were traditionally viewed as passive receptors, the study of illusions and ambiguous images has demonstrated that the brain's perceptual systems actively and pre-consciously attempt to make sense of their input. There is still active debate about the extent to which perception is an active process of hypothesis testing, analogous to science, or whether realistic sensory information is rich enough to make this process unnecessary. The perceptual systems of the brain enable individuals to see the world around them as stable, even though the sensory information is typically incomplete and rapidly varying. Human and animal brains are structured in a modular way, with different areas processing different kinds of sensory information. Some of these modules take the form of sensory maps, mapping some aspects of the world across part of the brain's surface. These different modules are interconnected and influence each other. For instance, taste is strongly influenced by smell.

PROCESS AND TERMINOLOGY

The process of perception begins with an object in the real world, termed the distal stimulus or distal object. By means of light, sound or another physical process, the object stimulates the body's sensory organs. These sensory organs transform the input energy into neural activity, a process called transduction. This raw pattern of neural activity is called the proximal stimulus. These neural signals are transmitted to the brain and processed. The resulting mental re-creation of the distal stimulus is the percept. An example would be a shoe. The shoe itself is the distal stimulus. When light from the shoe enters a person's eye and stimulates the retina, that stimulation is the proximal stimulus. The image of the shoe reconstructed by the brain of the person is the percept. Another example would be a telephone ringing. The ringing of the telephone is the distal stimulus.
The sound stimulating a person's auditory receptors is the proximal stimulus, and the brain's interpretation of this as the ringing of a telephone is the percept. The different kinds of sensation such as warmth, sound, and taste are called "sensory modalities".

Perceptual Illusions: What They Are, Causes, Types and Examples

By observing things around us, we tend to integrate different parts of visual stimuli and organize them in meaningful ways. In general, we carry out this perceptual elaboration of stimuli completely unconsciously, so it seems to us that we grasp things in this way simply because "they are like that." This impression that the percept adheres to objective reality is normally quite correct, but sometimes perception misleads us and illusory perceptual configurations are created. So, what are perceptual illusions? Perceptual illusions, the intriguing phenomena that challenge our sensory perception, have fascinated psychologists and neuroscientists for centuries. These illusions deceive our senses, distorting our perception of reality and revealing the intricate workings of the human mind. In this exploration, we delve into what perceptual illusions are, their underlying causes, and their various types, and provide illustrative examples to elucidate their captivating nature.

What are perceptual illusions?

Perceptual illusions are discrepancies between the objective reality of a stimulus and our subjective perception of it. These illusions occur when our sensory systems misinterpret sensory information, leading to perceptual distortions or misjudgments of size, shape, color, motion, or depth. Unlike perceptual constancy, perceptual illusions represent an inaccurate perception of real objects. An illusion is, therefore, a perceptual alteration in which perception does not conform to the characteristics of the stimulus, causing a discrepancy between the physical world and the perceived world.
It should be said that perceptual illusions are not only optical illusions: the latter are based on visual tricks that work with human perception, while perceptual illusions are rather a cognitive phenomenon whose protagonist is the brain's processing of sensory information.

The Visual System

The visual system constructs a mental representation of the world around us. This contributes to our ability to successfully navigate through physical space and interact with important individuals and objects in our environments.

Anatomy of the Visual System

The eye is the major sensory organ involved in vision. Light waves are transmitted across the cornea and enter the eye through the pupil. The cornea is the transparent covering over the eye. It serves as a barrier between the inner eye and the outside world, and it is involved in focusing light waves that enter the eye. The pupil is the small opening in the eye through which light passes, and the size of the pupil can change as a function of light levels as well as emotional arousal. When light levels are low, the pupil will become dilated, or expanded, to allow more light to enter the eye. When light levels are high, the pupil will constrict, or become smaller, to reduce the amount of light that enters the eye. The pupil's size is controlled by muscles that are connected to the iris, which is the colored portion of the eye. After passing through the pupil, light crosses the lens, a curved, transparent structure that serves to provide additional focus. The lens is attached to muscles that can change its shape to aid in focusing light that is reflected from near or far objects. In a normal-sighted individual, the lens will focus images perfectly on a small indentation in the back of the eye known as the fovea, which is part of the retina, the light-sensitive lining of the eye. The fovea contains densely packed specialized photoreceptor cells.
These photoreceptor cells, known as cones, are light-detecting cells. The cones are specialized types of photoreceptors that work best in bright light conditions. Cones are very sensitive to acute detail and provide tremendous spatial resolution. They also are directly involved in our ability to perceive color. While cones are concentrated in the fovea, where images tend to be focused, rods, another type of photoreceptor, are located throughout the remainder of the retina. Rods are specialized photoreceptors that work well in low light conditions, and while they lack the spatial resolution and color function of the cones, they are involved in our vision in dimly lit environments as well as in our perception of movement on the periphery of our visual field. The two types of photoreceptors are shown in this image. Cones are colored green and rods are blue. We have all experienced the different sensitivities of rods and cones when making the transition from a brightly lit environment to a dimly lit environment. Imagine going to see a blockbuster movie on a clear summer day. As you walk from the brightly lit lobby into the dark theater, you notice that you immediately have difficulty seeing much of anything. After a few minutes, you begin to adjust to the darkness and can see the interior of the theater. In the bright environment, your vision was dominated primarily by cone activity. As you move to the dark environment, rod activity dominates, but there is a delay in transitioning between the phases. If your rods do not transform light into nerve impulses as easily and efficiently as they should, you will have difficulty seeing in dim light, a condition known as night blindness. Rods and cones are connected (via several interneurons) to retinal ganglion cells. Axons from the retinal ganglion cells converge and exit through the back of the eye to form the optic nerve. The optic nerve carries visual information from the retina to the brain. 
There is a point in the visual field called the blind spot: Even when light from a small object is focused on the blind spot, we do not see it. We are not consciously aware of our blind spots for two reasons: First, each eye gets a slightly different view of the visual field; therefore, the blind spots do not overlap. Second, our visual system fills in the blind spot so that although we cannot respond to visual information that occurs in that portion of the visual field, we are also not aware that information is missing. The optic nerve from each eye merges just below the brain at a point called the optic chiasm. The optic chiasm is an X-shaped structure that sits just below the cerebral cortex at the front of the brain. At the point of the optic chiasm, information from the right visual field (which comes from both eyes) is sent to the left side of the brain, and information from the left visual field is sent to the right side of the brain. Once inside the brain, visual information is sent via a number of structures to the occipital lobe at the back of the brain for processing. Visual information might be processed in parallel pathways which can generally be described as the “what pathway” (ventral) and the “where/how” pathway (dorsal). The “what pathway” is involved in object recognition and identification, while the “where/how pathway” is involved with location in space and how one might interact with a particular visual stimulus (Milner & Goodale, 2008; Ungerleider & Haxby, 1994). For example, when you see a ball rolling down the street, the “what pathway” identifies what the object is, and the “where/how pathway” identifies its location or movement in space. 
Object and Form Perception

The perception of forms can be considered from two fundamental perspectives: viewer-centered representation considers the appearance of an object relative to the viewer, and object-centered representation considers the appearance of the object itself, regardless of the distance and angle from which it is viewed. Both of these feed into the mental representation of the object itself. Descriptions of objects from a viewer-centered perspective detail their appearance and position from the perspective of the speaker: an object is "three feet away" and "horizontal" or "tilted at about a twenty-degree angle downward to the left." Meanwhile, descriptions from an object-centered perspective consider the object itself (the desk is rectangular and rounded on one end), in relation to other objects (the pencil is in the middle of the desk), or in relation to its own parts (on each side of the desk are two drawers, a shallow one on top of a deeper one). People may combine these two approaches in describing the shape and form of an object, though they tend to become more object-centered when describing an imaginary object and more viewer-centered when describing a real one.

In addition to considering the size and shape of an object, our perception tends to organize objects into visual groups. That is to say, when we look at a car, we see the tires, fenders, lights, glass, and other objects that comprise the car, but group all of these objects into a single "car" gestalt and regard them as a singular thing. The "Law of Prägnanz" suggests that we tend to group items into larger forms so as to perceive fewer objects with more properties. This requires some cognitive function, as the grouping of objects is neither arbitrary nor random, but based on patterns that the mind has stored, and against which it compares what we perceive at any given moment. Some distinction is made between our consideration of figures and backgrounds.
That is, we notice upon entering a room that there is a chair, a window, a table, a lamp, and curtains. We perceive the chair, table, and lamp to be figures within the room, but consider the window and curtains to be part of the background. This plays into the optical illusion that shows a vase, with the silhouettes of two people facing one another on either side. Subjects often fail to notice the silhouettes because they are perceived as part of the background, which is often ignored when attention is given to the figure of the vase.

Gestalts are commonly applied to familiar environments, such as when we regard the space in which we work as "my desk" and not a collection of the various items that rest upon it. People also apply gestalts in unfamiliar environments to simplify the task of becoming oriented. Walking down an unfamiliar street, we see cars (not the individual parts) along the side of the road, people (not distinguishing bodies from clothing from objects they may be carrying), and the sidewalk (not distinguishing each square of cement or the ventilation grates and the like). While the application of gestalt to the environment is a simple matter that is second nature, and hardly bears consideration for practical purposes, it represents a great complexity in our perception of our environment and the objects within it, in which cognition is a factor: the mind must be active to recognize a group of shapes as being a single object.

Pattern recognition is another factor of perception that bears consideration. We recognize the stamen, pistil, and petals of a plant to recognize it as a daisy, and we recognize the eyes, nose, mouth, hair, and shape of the face to recognize someone we know. This requires not only organizing shapes in the environment into a gestalt, but then matching that gestalt against a pattern that resides in memory.
An experiment is mentioned in which subjects associate faces with names, and are then asked to recall the names when shown different images. Some images are of only part of the face (a close-up of an eye), other images are the entire face. Not surprisingly, people are easily able to recall the name when shown the whole face but struggle when they are shown only part of it. When the experiment was repeated using houses instead of faces, people were far more likely to be able to name the house from seeing a part of it than they were to recognize a face from a part - and even did slightly better at recalling names from a part of a house than from seeing the whole image.

Much of our ability to recognize patterns is based on experience - which is little wonder, given that recognition requires previous exposure - but what is noted, particularly in the task of reading, is that experience teaches us to recognize shapes of entire words. While we originally learn to pronounce a word phonetically, letter by letter, we eventually learn the shapes of entire words through experience in seeing them often: we do not need to "sound out" a word or read it letter by letter, but recognize it at a glance. (EN: I recall there was some experimentation in elementary education that attempted to skip the phonetic approach and instead teach students entire words based on shape - which failed horribly to the detriment of public literacy.)

There is a brief mention of prosopagnosia, a condition in which subjects are unable to recognize faces, which has been associated with damage in the lower temporal lobes. There is some debate as to whether the condition is more closely related to perception or memory.
Theoretical Approaches

Theories of perception fall into two general categories: bottom-up theories, which begin with individual perceptual stimuli that are combined into higher-order perceptions, and top-down theories, which consider perception as a whole and then reduce it to component stimuli.

Bottom-Up Approaches

Perception is a process of pattern recognition: when we receive stimuli from the environment, we first attempt to match them against an existing mental model based on their properties and context. Mental models are loose enough so that if something does not perfectly match a pattern, it may still be close enough to our model that we can identify it. For example, consider seeing an engineer's prototype of a concept car - it may not exactly match any car we have ever seen, but it has enough semblance to our understanding of what a car is that we can tell very quickly that it is a car of some kind. This holds true even if there are obvious mismatches, such as lacking headlamps or a windshield. By one theory, our mind attempts to be efficient by quickly identifying the general nature of things, effectively to exclude them from the things we need to think about, so that we may better focus on the things that we do.

Laboratory experiments are heavily contrived to elicit a desired response - for example, an image is carefully designed to suggest a shape by means of negative space. There is the valid concern that the natural environment is not as structured or contrived - that no context is intentionally superimposed by reality to guide us to a predetermined perceptual goal.

One set of theories holds that we maintain a mental store of templates or patterns, based on experience, that enable us to group stimuli related to a given phenomenon. Where there is a template in storage, we can recognize a group of stimuli at once rather than having to analyze the sensory stimuli we receive to recognize it.
That is, we recognize a given letter such as "A" because we have seen it before, in varying styles and positions, and can remember what it is when we see it again.

There are also prototype theories that maintain what we have is not a template, but a set of general principles by which we define a thing, which integrates the most frequently observed features of a class. For example, we can immediately recognize the letter "A" in a variety of fonts and scripts because each instance of the letter matches the basic features (two lines meeting at the top, with a crossbar) even when the details change (serifs are added) or the figure is slightly distorted (the letter is italicized). It's also noted that a prototype can be entirely theoretical - that a person who has not actually seen an example of an object can still recognize it if they have conceptualized it and understand the prototype. (EN: Though I do believe that this depends on the quality of the description. Consider the experience of a child who has been told what a giraffe looks like but has never actually seen one, or an adult who visits a place they have been told about but have never been. There does seem to be a moment of shock and uncertainty when confronted with the genuine article, and the comment that things are not as expected likely reflects the accuracy and granularity of the information that was used to build their mental prototype.)

Yet another approach to perception is found in feature-matching theories, in which we attempt to match specific aspects of a perceived phenomenon rather than the entire form. By this theory, raw perception notices features, formulates a suggestion of what the object might be, then gives greater attention to additional details to conclude that it is so. In the first match, features that define an object are considered in assessing what it might be, and in the second match, features that would exclude an object from a class are checked to disqualify it.
For example, two vertical lines with a crossbar lead us to suspect we might be looking at the letter "A", but this conclusion is discarded when we recognize the vertical lines are parallel and we are actually looking at the letter "H."

The recognition-by-components theory suggests that we recognize objects in three-dimensional space by reducing them to geons (geometric shapes such as spheres, cubes, cones, cylinders, and the like) and recognizing the way in which they are interconnected. (EN: Much in the way that an "Art 101" student learns to draw.) It's suggested that this is very handy in explaining how we recognize generic objects (a human face) but not very good at explaining how we differentiate within broad categories (knowing one person's face from another's).

Top-Down Approaches

In contrast to the bottom-up approaches, there is a separate school of theory that suggests perception begins with higher-order thinking: that our perception is not merely the assembling of sensory data into patterns, but draws on other sources of information - conceptual constructs that exist within the mind. In that way, perception is not a perfect reflection of reality, but a compromise between what we perceive and what we know, with our mental fiction filling in the gaps in sensory facts. That is to say that what we perceive is the result of a negotiation between what we sense and what we think.

This approach explains the way in which we form perceptions based on incomplete data. The example given is a red octagonal placard with the letters "ST", then a gap where a vine has overgrown the placard, then the letter "P." The reason we recognize this as a stop sign is that our conceptual model fills in the gap in sensory data. Another example is the use of night vision, when the lighting is so low that we cannot discern the color of things, but see only their physical forms.
While we often describe things in terms of their color, we can overcome the lack of sensory data with the conceptual construct: we know a given object is a banana because of its shape, even though we cannot see that it is yellow.

By this theory, the observer quickly forms and tests various hypotheses about the information he receives from the environment, based on three factors:

1. The sensory data the observer receives from the environment
2. The knowledge stored in the observer's memory
3. Inferences that fill the gaps in the data

Perception is therefore a cognitive process, though it occurs at the unconscious level. Were it based on external data alone, without the application of experience and intelligence, we would be dumbfounded by most of what we see in the natural environment.

The bottom-up approaches are rightly criticized for their failure to consider the context of perception, when in reality the context in which an object is perceived (both the physical environment surrounding the object and the mental state of the observer) blatantly results in significant differences in perception. Even more striking is a phenomenon known as the configural-superiority effect, in which objects presented in certain configurations are more readily identifiable than if they were viewed in isolation, even though the configuration adds complexity.

By the constructive approach, intelligence is a critical part of perception: we do not perceive what is "out there in the world" with an uninformed eye, but instead apply our intelligence to understand the things we perceive.

Synthesizing the Approaches

Both the top-down and bottom-up theories have been able to generate empirical support through experimentation. Rather than asking whether one or the other is correct, consider how they work together. Taken to extremes, the top-down position would underestimate the value of sensory data.
It maintains that perception relies upon memory, but the memory itself has to be resident in order for this to occur. As such, the top-down approach explains how we recognize an object, but not how we come to the knowledge that enables us to recognize it.

That is to say that when we encounter a wholly unfamiliar object, we revert to a bottom-up approach in order to create the construct or memory to which future perceptions will be compared. The first time a person sees an elephant, they observe the details they sense in order to understand what it is, and each subsequent encounter with an elephant or similar creature can then reflect on previous experience in a top-down manner.

In a similar vein, the bottom-up approach may also be useful in refining the construct. In essence, the top-level model may include a construct that indicates all fish have scales, until we observe fish-like creatures that do not have scales. Our choice is either to create a separate class of object, or to allow our bottom-up perceptions to modify the top-down ones - that is, to adjust the model to reflect that many species of fish, but not all, have scales.

Computational Theory of Perception

Marr's computational theory of perception is another compromise between the top-down and bottom-up approaches. It considers the bottom-up approach without entirely dismissing the importance of prior knowledge in interpreting sensory data. Working with visual data, Marr considered shapes to be discerned by their edges (the borders of the shape), contours (nuances of their visual texture), and regions (areas within a shape that are undifferentiated). His reckoning is that the eyes perceive a two-dimensional shape that the brain translates into three dimensions. An object is perceived by the eyes in two dimensions, its edges and its fill.
The brain then recognizes that there are levels of shading and distortion in the area within the shape, organized into regions and then contours, to imply that it is a three-dimensional form.

Deficits in Perception

Additional understanding of perception can be gleaned from individuals whose perceptual processes differ from the norm.

Agnosia

Agnosia is a severe deficit in the ability to perceive sensory information. There are various kinds, pertaining to different senses, and they are attributed to lesions in specific parts of the brain. People with visual agnosia are believed to have normal sensations of what exists in their field of vision, but cannot recognize what they see. They can describe the shapes before them, and even seem to discern entire objects, but cannot recall the name of the object. One subject was presented with a pair of spectacles - he was able to recognize that there were two round objects joined together by a bar, but struggled to say the name of the object. With some deliberation, he guessed the object was a bicycle. This response is interesting in that it does demonstrate the bottom-up practice of observing individual shapes and their conjunction, and in a sense the response was close (in the sense that a bicycle also consists of two round shapes joined by a bar).

Other agnosias render subjects unable to recognize more than one object at a time, unable to recognize familiar environments, unable to recognize faces, etc. There are even selective agnosias in which a subject recognizes the faces of other people but does not recognize a specific few, including his own. Auditory agnosia pertains to the inability to recognize sounds, such as being unable to discern between the voices of different people, or being unable to remember or even to perceive a specific piece of music as anything but random noise.
In some instances, agnosia has an extreme level of specificity: an individual with otherwise normal skills may routinely be unable to recognize a specific thing (they may recognize most people but blank on a specific one) or a specific class of things (they may recognize the difference in faces of animals but not in the faces of humans), which gives evidence to the notion that memories are stored in very specific parts of the brain. But while this would explain the inability to recognize a face for the first time (the place that face is recorded was damaged), it does not explain the continued inability to recognize it a second time (a new recording should in theory have been made outside the damaged area).

While damage to certain parts of the brain seems to explain why some people have agnosia, it fails to suggest why most people do not. That is, we understand people have the ability to recognize specific faces, but have no sense of why this should be so.

Color Perception

Various deficiencies in the perception of color, including "color blindness," are more common in men than in women. The author details a number of conditions, such as red-green and blue-yellow colorblindness, all the way to complete colorblindness. He does not, however, relate these conditions to the brain rather than the eye, or expound upon the matter further than to acknowledge it.

Akinetopsia

A final condition is a selective loss of motion perception, and it is reckoned that individuals with this condition perceive motion as if it were a series of snapshots. This condition is associated with severe bilateral damage to the temporoparietal cortices.
