Object Recognition & Spatial Cognition
Lecture notes on Banich & Compton, Cognitive Neuroscience (Simon Fraser University)
Summary
This document discusses object recognition and spatial cognition, covering topics such as the ventral visual system, receptive fields, visual agnosia, and different coding models. It also includes information about prosopagnosia and neural regions involved in motion perception and spatial navigation.
Full Transcript
Object Recognition and Spatial Cognition: Chapters 6 & 7

Chapter 6: Object Recognition

Ventral Visual System = "What"
- The ventral visual processing stream processes information related to object recognition.
- Inferotemporal cortex subdivisions:
  - PIT: posterior inferotemporal cortex
  - CIT: central inferotemporal cortex
  - AIT: anterior inferotemporal cortex
- Figure 6.1A (DiCarlo et al., 2012)
- Posterior regions respond to relatively simple stimuli; cells further along the ventral stream (toward AIT) respond to more complex and specific stimuli.

Receptive Fields
- A receptive field is the area of space to which a cell is sensitive.
- Receptive fields become larger as we move further along the ventral stream.
- A large receptive field allows an object to be identified regardless of its size or location.
- Consequence: some information about an item's position in space is lost.
- Figure 6.3

Visual Agnosia
- An inability to recognize objects in the visual modality that cannot be explained by other causes (such as problems with attention, memory, or language).
- Two types: apperceptive and associative.

Apperceptive Agnosia
- A fundamental difficulty in forming a percept (a mental impression of something perceived by the senses).
- Sensory information is still processed in a rudimentary way, but the data cannot be put together to allow the person to perceive a meaningful whole.
- Figure 6.5

Associative Agnosia
- Basic visual information can be integrated to form a meaningful perceptual whole, but that perceptual whole cannot be linked to stored knowledge.
- Figure 6.7

Apperceptive vs. Associative Agnosia
- The site of brain damage differs:
  - Apperceptive agnosia: damage to the occipital lobe and surrounding areas.
  - Associative agnosia: damage to the occipitotemporal regions of both hemispheres and subadjacent white matter.
- Figure 6.4 (agnosia and brain damage)

Prosopagnosia
- Agnosia specific to faces: a selective inability to recognize or differentiate among faces.
- Results from damage to the ventral stream of the right hemisphere.
- Individuals can usually determine basic information (e.g., that they are looking at a face) but have limited ability to recognize a face as belonging to a particular person.
- Developmental (congenital) prosopagnosia is thought to be present in approximately 2% of the general population; in these individuals the anterior temporal lobe is not as activated by images of faces.

Word vs. Face Recognition
- Figure 6.8 (Martinaud et al., 2012)

Sparse Coding
- The theory that a small but specific group of cells responds to the presence of a given object.
- The "grandmother cell" theory: there is a particular cell in the ventral stream whose job is to fire when you see a particular object or person (such as your grandmother). This is an extreme version of sparse coding and is not thought to be correct.

Population Coding
- The theory that the pattern of activity across a large population of cells codes for individual objects.
- Extreme version: every cell in the ventral stream is involved in coding for every object.
- The reality probably lies somewhere in between sparse and population coding, but the evidence leans closer toward population coding.
- Figure 6.10 (sparse vs. population coding)
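The difference between the two schemes can be made concrete with a toy simulation. The sketch below is a minimal illustration only, not anything from the textbook: the array sizes, noise level, and all names are invented. Each object is stored as a vector of firing rates, and identity is decoded by nearest-pattern matching; in the sparse scheme only a few dedicated cells respond per object, while in the population scheme every cell carries some information.

```python
# A toy illustration (not from the textbook) of the two coding schemes.
# Object identity is read out from a vector of firing rates in both cases;
# the schemes differ in how many cells carry information about each object.
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_objects = 100, 5

# Sparse code: each object drives a small, dedicated subset of cells.
sparse_code = np.zeros((n_objects, n_cells))
for obj in range(n_objects):
    dedicated = rng.choice(n_cells, size=3, replace=False)  # ~3 cells per object
    sparse_code[obj, dedicated] = 1.0

# Population code: every object evokes a graded response in every cell;
# identity lives in the overall pattern, not in any single cell.
population_code = rng.random((n_objects, n_cells))

def decode(codebook, noisy_response):
    """Identify an object by finding the stored pattern nearest the response."""
    return int(np.argmin(np.linalg.norm(codebook - noisy_response, axis=1)))

# Present object 2 with neural noise; both schemes can still decode it.
for name, codebook in [("sparse", sparse_code), ("population", population_code)]:
    response = codebook[2] + rng.normal(0, 0.1, n_cells)
    print(name, "decoded object:", decode(codebook, response))
```

One intuition the toy captures: in the sparse scheme, deleting a handful of cells erases specific objects, whereas the population scheme degrades gracefully because information is spread across all cells.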
Invariance in Recognition
- Humans can identify objects under many different conditions.
- Form-cue invariance: the brain's categorization is constant regardless of the form of the cue that represents the object.
- Perceptual constancy: the ability to recognize objects seen from different angles, positions, sizes, and kinds of illumination.
- Therefore, at some level, our mental representation of objects is abstract/conceptual.

Lateral Occipital Complex (LOC)
- Involved in form-cue invariance and perceptual constancy.
- Shows evidence of perceptual constancy across variations in the size, location, and form of a shape.
- May be the stage in visual processing where abstract shape representations are formed, supporting recognition of objects despite differing conditions.
- Figure 6.14

fMRI Adaptation Model
- Figure 6.13

Neural Representations: Viewpoint-Independent or Viewpoint-Dependent?
- That is, do neural representations depend on the viewpoint from which an object is seen?
- There seems to be some amount of viewpoint dependency: we can recognize objects from multiple viewpoints, but recognition is faster and more accurate when an object is viewed from a familiar viewpoint.
- Figure 6.15 (viewpoint dependency in coding by ventral stream cells)
- The answer is a bit of both: some ventral stream cells change their response depending on object orientation, while other cells respond to an object in the same way regardless of orientation.
- Neuroimaging studies have found viewpoint dependence at earlier stages of the ventral stream and viewpoint independence at later stages.
- Recognition of an object from a particular viewpoint likely depends on comparison with stored descriptions (abstract mental representations) in the brain.

Hemispheric Differences in Processing
- Distinct neural mechanisms are involved in object recognition:
  - Left ventral stream: important for analyzing the parts of objects.
  - Right ventral stream: important for analyzing whole forms (the configuration of the parts).
- Figure 6.16

Inversion Effect
- Recognition is poorer when an object is turned upside down.
- Suggests that we tend to rely on the way features are put together and related to each other (configural information) when identifying objects.
- Especially important for object categories in which we have a lot of expertise.
- Figure 6.17

Models for How Whole Shapes Are Recognized
- Nonlocal binding: a whole object is represented by the co-activation of cells that represent the parts of the object in particular locations. There is no separate unit that represents the whole; rather, the whole is perceived when the units representing all the constituent parts are simultaneously activated.
- Conjunctive encoding: lower-level regions represent features and send output to higher-level regions representing the shapes that result from joining those features. A separate unit represents the whole, beyond the units that respond to the parts (a sketch of this idea follows below).
- Evidence leans toward supporting the conjunctive encoding model.
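To make the contrast concrete, here is a minimal sketch of conjunctive encoding. It is an invented toy example, not the textbook's model: the feature names, weights, and threshold are all assumptions. A dedicated higher-level unit fires only when the lower-level feature units it pools over are co-active, whereas under nonlocal binding there would be no such dedicated unit, only the simultaneous feature activity itself.

```python
# A minimal sketch (invented for illustration) of conjunctive encoding:
# a dedicated higher-level unit fires only when the lower-level feature
# units it pools over are active together.
import numpy as np

FEATURES = ["vertical bar", "horizontal bar", "curve"]

def feature_layer(image_features):
    """Lower level: one unit per feature, active (1.0) if the feature is present."""
    return np.array([1.0 if f in image_features else 0.0 for f in FEATURES])

def conjunction_unit(activity, weights, threshold):
    """Higher level: fires only when its weighted input reaches threshold,
    i.e., when the right combination of features is co-active."""
    return float(activity @ weights >= threshold)

# A hypothetical "T-shape" unit that requires both bars but not the curve.
t_weights = np.array([1.0, 1.0, 0.0])

print(conjunction_unit(feature_layer({"vertical bar", "horizontal bar"}), t_weights, 2.0))  # 1.0: whole shape detected
print(conjunction_unit(feature_layer({"vertical bar"}), t_weights, 2.0))                    # 0.0: a lone part is not enough
```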
Object Recognition in Tactile and Auditory Modalities
- Vision is the dominant sense in humans, but humans can also recognize objects by hearing and touch.
- Similarities in brain organization emerge across modalities:
  - Early cortical areas code basic features.
  - Higher-level areas organize sensations into representations of recognizable objects.

Agnosias in Other Modalities
- Auditory agnosia: the inability to recognize the meaning of sounds.
- Somatosensory (tactile) agnosia: the inability to recognize an item by touch, even though the object can be recognized through other modalities.

What Versus Where Across Modalities
- One commonality across the senses is that each must distinguish "what" from "where."
- The distinction between "what" and "where" pathways in the brain seems to be a basic organizational feature of all modalities.
- Visual system: ventral stream = "what"; dorsal stream = "where."

Chapter 7: Spatial Cognition

Dorsal Visual System = "Where"
- Supports many aspects of spatial processing.
- Key components are located in the parietal cortex, which has subdivisions:
  - The anterior parietal lobe contains somatosensory representations (not considered part of the dorsal stream proper).
  - The posterior parietal cortex (PPC) is multisensory and crucial to many aspects of spatial cognition.

Anatomy of the Dorsal Stream
- Within the PPC, the superior and inferior parietal lobules are separated by the intraparietal sulcus.
- The middle temporal (MT) and medial superior temporal (MST) areas contribute to motion processing.
- The dorsal stream receives visual information from primary visual cortex, as well as input from somatosensory cortex and the vestibular system (information about the position of the body in space).
- Figure 7.1 (the dorsal visual processing stream)

3 Pathways of the Dorsal Stream
1. Connects the parietal cortex with prefrontal cortex; supports spatial working memory.
2. Connects the parietal cortex with premotor cortex; supports visually guided actions such as reaching and grasping.
3. Connects the parietal cortex with medial temporal cortex; supports spatial navigation.
- Figure 7.2 (Kravitz et al., 2011)

Cells of the Dorsal Stream
- Responsive to attributes of visual information that are useful for processing spatial relations.
- These cells do not play a large role in object recognition, as they are not sensitive to form or color, or to items positioned in central vision.

Coding for the Three Dimensions of Space
- The brain codes for the vertical (up-down), horizontal (left-right), and depth (near-far) dimensions.
- The retinal images the brain receives are two-dimensional, so the depth dimension must be computed in the cortex.

Distinguishing Left From Right
- The visual world is mapped in a retinotopic manner onto visual cortex, with the map reversed relative to the visual world in both the up-down and left-right dimensions.
- Left and right are inherently relative terms (my left or yours?).
- The cortex of the left parietal lobe may be involved in left/right discriminations: TMS applied over the left (but not the right) parietal cortex impairs left/right discriminations.

Depth Perception
- Depth perception helps to localize items in the near-far plane and is crucial to understanding where things are in space.
- Depth is determined by the amount of binocular disparity.
- Cells in primary visual cortex are sensitive to different amounts of binocular disparity and provide important information about depth for use by dorsal stream regions.
- Cells in various regions of the dorsal stream have also been shown to be sensitive to binocular disparity.

Motion Parallax
- Refers to the fact that as you move through an environment, near objects are displaced more on your retina than objects that are far away.
- Motion parallax is a monocular depth cue (in contrast to binocular disparity, which requires two eyes).
- Recordings in monkeys indicate that cells in area MT integrate different kinds of cues (binocular disparity and motion parallax) to code for depth; a worked example of both cues follows below.
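Both cues can be put into rough numbers. The sketch below uses standard pinhole-camera stereo geometry rather than anything from the textbook, and every value in it (focal length, disparity, viewing distances, walking speed) is an invented round number chosen purely for illustration.

```python
# A back-of-the-envelope sketch (standard geometry, not from the textbook;
# all numbers are invented) of the two depth cues the dorsal stream can use.
import math

# Binocular disparity: for eyes separated by baseline B viewing a point at
# distance Z, a simple pinhole model gives disparity d = f * B / Z,
# so depth can be recovered as Z = f * B / d.
f = 0.017       # approximate focal length of the eye, meters (assumption)
B = 0.065       # interocular distance, meters (typical adult value)
d = 0.0001105   # measured retinal disparity, meters (invented)
Z = f * B / d
print(f"depth from disparity: {Z:.2f} m")   # -> 10.00 m

# Motion parallax: an observer moving at speed v past an object at distance Z
# (perpendicular to the line of sight) sees it sweep across the retina at
# roughly v / Z radians per second, so nearer objects move faster.
v = 1.4  # walking speed, m/s (assumption)
for Z_obj in (2.0, 10.0, 50.0):
    omega = v / Z_obj
    print(f"object at {Z_obj:>4.0f} m drifts at {math.degrees(omega):.1f} deg/s")
```

The 1/Z dependence in the second half is exactly why near objects appear to rush past a train window while distant hills barely move.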
Spatial Frames of Reference
- People can understand the spatial location of an object via two kinds of reference frames:
  - Egocentric reference frames specify an object's location in relation to some aspect of the self (body, head, eye, etc.).
  - Allocentric reference frames specify an object's location in relation to other objects, independent of one's own location.
- These kinds of coding rely on separable brain processes:
  - Egocentric neglect is associated with right parietal lobe damage.
  - Object-centered neglect is associated with damage to the right-hemisphere middle and inferior temporal lobes.
- A sketch contrasting the two frames follows below.
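The relationship between the two frames is just a coordinate transform. The following sketch is an invented example (the function name, scene, and numbers are all assumptions): it expresses one fixed allocentric location in the egocentric frame of an observer, showing that the egocentric coordinates change whenever the observer moves or turns even though the object itself stays put.

```python
# A minimal sketch (invented example) of the same object location expressed in
# an allocentric frame (fixed to the world) and an egocentric frame (fixed to
# the observer's body position and heading).
import numpy as np

def allocentric_to_egocentric(obj_xy, body_xy, heading_rad):
    """Translate by the body's position, then rotate by its heading,
    so the result reads as 'where the object is relative to me'."""
    dx, dy = np.asarray(obj_xy) - np.asarray(body_xy)
    c, s = np.cos(-heading_rad), np.sin(-heading_rad)
    return np.array([c * dx - s * dy, s * dx + c * dy])

cup_world = (3.0, 4.0)   # allocentric: the cup's fixed map coordinates
me = (1.0, 4.0)          # the observer stands at (1, 4) ...
facing_east = 0.0        # ... facing along the +x axis

# Egocentric coordinates (x = ahead, y = left) change when the observer turns,
# even though the cup has not moved.
print(allocentric_to_egocentric(cup_world, me, facing_east))   # [2. 0.]: 2 m straight ahead
print(allocentric_to_egocentric(cup_world, me, np.pi / 2))     # facing north: [0. -2.]: 2 m to the right
```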
Motion Perception
- Motion perception is inherently tied to spatial perception, because perceiving motion involves perceiving changes in an object's spatial location over time.
- We must also be able to represent our own motion through the world in order to fully understand where we are presently located.

Neural Regions for Motion Perception
- Akinetopsia: a selective deficit in motion perception.
- Studies of brain-damaged patients and neuroimaging studies indicate that a specific region, area MT (also known as area V5), is critically important for perceiving motion.
- The neighboring region MST is also involved in coding more complex motion, such as optic flow (the pattern of movement of images on your retina as you move actively through an environment).

Optic Flow
- The movement of the retinal image for each aspect of the scene depends on the speed and direction of self-motion, as well as the spatial relationship of each scene element to the body.
- Figure 7.8

Incorporating Knowledge of Self-Motion
- A problem for the brain: did the object move, or did the eye move?
- Two sources provide the visual system with information about eye movements:
  1. Motor regions of the brain that are planning to move the eyes send a corollary discharge.
  2. Sensory receptors within the eye muscles provide ongoing feedback about changes in the position of the eye.
- Figure 7.9 (movement of an image on the retina)

Accounting for Movement of the Body
- To accurately understand whether external objects are moving or stationary, one must take the body's own motion into account.
- Researchers are still determining how the brain constructs stable visual representations even as the body continuously moves through space.
- Parietal lobe regions receive input from the vestibular system and from areas controlling and sensing eye movements, so that the movement of external objects can be calculated in reference to the self; a sketch of this bookkeeping follows below.
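The comparison the brain must make can be caricatured in a few lines. The sketch below is a deliberately simplified toy, not the textbook's model; the function name and sign conventions are invented. It treats the corollary discharge as a known eye velocity and adds it to the retinal slip to estimate how the world itself is moving.

```python
# A toy sketch (invented) of how a corollary discharge lets the brain decide
# whether the world moved or the eye moved: world motion is estimated as
# retinal image motion plus the eye's own (internally signaled) motion.
def estimated_world_motion(retinal_slip_deg_s, corollary_eye_velocity_deg_s):
    """If the image slides across the retina exactly as fast as the eye is
    known to be turning (but in the opposite direction), the world is stable."""
    return retinal_slip_deg_s + corollary_eye_velocity_deg_s

# Case 1: the eye sweeps right at 10 deg/s over a stationary scene.
# The image slips left across the retina at 10 deg/s; the two cancel.
print(estimated_world_motion(-10.0, +10.0))  # 0.0 -> world perceived as stable

# Case 2: the eye is still, but the image moves at 5 deg/s.
print(estimated_world_motion(5.0, 0.0))      # 5.0 -> object perceived as moving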
Optic Ataxia
- A disorder of visually guided reaching, resulting from parietal lobe damage.
- Usually most pronounced in the peripheral rather than the central visual field.
- The deficit is most profound for real-time integration of vision and motion.
- May result from an inability to integrate cues involving eye-gaze direction (perception) and reaching direction (action).
- Figure 7.11 (from Battaglia-Mayer and Caminiti, 2002)

Space and Action
- Perception of spatial dimensions, reference frames, spatial relations, and motion is important for constructing an accurate view of the spatial layout of the external world.
- We also need to be able to act upon our representations, so one important function of the dorsal stream is to participate in sensory-motor translation: transforming sensory representations into action patterns.
- Figure 7.12 (Janzen and van Turennout, 2004): an anti-saccade task used to study sensory-motor translation

Coding for Intended Movements Toward Targets
- Sensory-motor integration occurs within subregions of the parietal cortex:
  - Cells in the lateral intraparietal region (LIP) code the location of the stimulus first and then shift to coding the location of the planned movement; coding of space for intended eye movements takes place in LIP.
  - Disrupting activity in the parietal reach region (PRR) produces symptoms similar to optic ataxia.
- Figure 7.13 (Andersen et al., 2019): subregions that support sensorimotor translation

Spatial Navigation
- The ability to navigate around an environment is crucial. There are two basic strategies (see the sketch at the end of this section):
  - Route-based strategies: the person's understanding is represented as a sequence of steps, often specified in terms of particular landmarks.
  - Map-based strategies: involve an allocentric understanding of how all the different parts of the landscape relate to one another.

Egocentric Disorientation
- Spatial navigation can be disrupted in several different ways. Egocentric disorientation is the inability to represent the location of objects in relation to the self.
- Associated with damage to the posterior parietal region.
- Patients have difficulties with navigation (both route-based and map-based) because they are unable to represent spatial relations.

Landmark Agnosia
- Patients lose the ability to recognize landmarks that are usually used for navigation, so route-based navigation becomes difficult.
- The difficulty is in recognizing what the landmark is; deficits typically occur following damage to ventral stream areas.

Neural Coding for Spatial Navigation
1. The parahippocampal place area (PPA) codes for landmarks relevant to navigation.
2. The retrosplenial complex (RSC) determines a person's location within a familiar spatial environment.
3. The medial temporal lobe (MTL) contains a map-like allocentric representation of familiar environments.
- Figure 7.16 (three main spatial navigation regions)

Additional Navigational Deficits
- Anterograde disorientation: trouble constructing new environmental representations, though navigation of previously learned environments is preserved. Associated with PPA damage.
- Heading disorientation: trouble understanding one's own orientation (heading), though landmarks can still be recognized and relations between locations in space are understood. Associated with RSC damage.
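The two navigation strategies introduced above map naturally onto two data structures. The sketch below is an invented illustration (the place names and graph are made up): a route is a fixed sequence of landmark-action pairs that only works from its starting point, while a map is a graph over places that supports planning a path between any two locations, here via breadth-first search.

```python
# A small illustration (invented, not from the textbook) of the two navigation
# strategies. A route is a fixed sequence of landmark-action pairs; a map is a
# graph of places that supports novel paths, found here by breadth-first search.
from collections import deque

# Route-based: brittle; only usable from its fixed starting point.
route_home = [("bakery", "turn left"), ("fountain", "go straight"), ("red door", "stop")]

# Map-based (allocentric): which places connect to which.
campus_map = {
    "library": ["bakery", "gym"],
    "bakery": ["library", "fountain"],
    "gym": ["library", "fountain"],
    "fountain": ["bakery", "gym", "home"],
    "home": ["fountain"],
}

def find_path(graph, start, goal):
    """Breadth-first search: a map supports planning from ANY starting point."""
    frontier, seen = deque([[start]]), {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None

print(find_path(campus_map, "library", "home"))  # ['library', 'bakery', 'fountain', 'home']
print(find_path(campus_map, "gym", "home"))      # a novel start that the fixed route cannot handle
```

The dissociations above fit this picture: losing the landmark labels (landmark agnosia) breaks the route representation, while damage to the map-like allocentric representation leaves landmark recognition itself intact.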
Connections
- Object recognition = ventral "what" pathway; spatial cognition = dorsal "where" pathway.
- These functions are not completely segregated:
  - Other regions can be involved in navigation.
  - Some processes are important in both object recognition and spatial understanding: depth perception, for example, can help determine both what an object is and where it is in space.
  - Navigation requires a combination of both: recognizing landmarks and then determining where you are based on them.

Quiz 2
- In class on Wednesday, June 5.
- Material from Sensation and Perception (ch. 5), Object Recognition (ch. 6), and Spatial Cognition (ch. 7).
- The second half of Motor Control (ch. 4) will not be on Quiz 2 (saved for the midterm).
- Same style and structure as Quiz 1.