Summary

This document discusses object recognition, challenges in interpreting 3D objects in 2D, and how individuals recognize and interpret objects with partial information. It covers aspects like gestalt principles, figure-ground segregation, and the ventral pathway.

Full Transcript

Section 6: Object Recognition

Recognition of individuals within your species conveys many benefits, like preferentially responding to kin and maintaining social bonds
Recognition of individuals of a different species is not as well understood
  o Farmers recognize individual animals even when breeding large numbers of them
    ▪ Familiarity, with benefits coming from being able to monitor the health of a flock
  o Species that spend a lot of time with non-conspecifics (domesticated pets, honeybees) may also show individual recognition capabilities
    ▪ This ability was investigated in a species that is neither domestic nor spends much time with non-conspecifics: wild American crows. Known to use tools in various ways, this species is considered quite intelligent.

Crow Research Study
  o Objective: To examine how crows associate negative experiences with specific human faces.
  o Banding Experience: The crows were banded by researchers wearing specific masks, creating a negative association for the birds.
  o Experiment Design:
    ▪ Three Initial Experiments: Crows were banded while researchers wore specific masks. Researchers later walked along the crows' habitat wearing either trapping masks (the masks used during the banding, labelled as "dangerous") or non-trapping masks (other masks that were not used during the trapping process).
    ▪ Final Experiment: Two researchers walked the route simultaneously, one wearing a "dangerous" mask and the other wearing a different mask. This setup allowed researchers to observe how well crows could discriminate between faces based on their prior experiences.
  o Key Findings: The experiments aimed to determine crows' recognition and memory of specific human faces associated with negative experiences. Scolding behaviour (the crows circling overhead and making specific calls) increased only for the various combinations of the dangerous masks.
  o Thus, other species demonstrate sophisticated object discrimination abilities

Challenges to Object Recognition
Going from a 3D world to a 2D retinal image
Viewpoint and orientation independence
Distinguishing objects that share features
  o For example, being able to tell that there are two red balls that look the same rather than just one ball (the difference in spatial location is enough to inform you that they are separate)
Recognizing objects with only partial information available

Gestalt
The Gestalt psychologists suggested that we use a number of heuristics or rules of thumb to help impose order on the visual images as they come in.
You can make sense of this word largely because of those principles.
The G is incomplete, but you use a law of closure to form the letter. Despite the incomplete image we tend to "fill it in" in sensible ways.
The small black and grey squares enable the E to pop out because of their proximity to one another: things close together normally go together or form groups.
Even though the S is broken up by a vertical image, you have no trouble seeing it because you apply a law of good continuation: it is more probable that parts of the image lying along the same path belong together.
The two Ts appear as they do given the law that things that look similar typically go together.
And the AL gives an example of the differentiation of figure from ground (as well as the influence of context): the first thing you see is the AL, as that is clearly part of the word. Look closely and a small, stylized tree pops up inside the A. Viewed on its own, it is more difficult to see the two letters.

Figure-Ground Segregation
Faces / vases illusion

Viewpoint / Orientation Independence
Not only do you see all these images as a car, but as the same car. So we can recognize objects independent of our own viewpoint or the orientation of the object.
We can also recognize objects from viewpoints we rarely see

Occluded Objects

Degraded Images
We can recognize degraded images

Impossible Objects
Recognition of Impossible Objects: Humans can recognize or process images of impossible objects (e.g., objects that cannot physically exist) in an intelligible manner.
Concept of Object Constancy: Refers to the ability to recognize objects consistently despite changes in their appearance.
  o These principles support object constancy
  o Changes in the retinal size of the image, lighting conditions, orientation and the viewer's perspective are all accompanied by large changes in the retinal image
    ▪ Yet you have no trouble recognising the object
Persistence of Recognition: Once an image is suggested, it can be challenging to stop recognizing it, indicating strong cognitive processing capabilities.
Implications: Highlights the brain's ability to maintain a stable perception of objects, even when there are significant changes in the retinal image (e.g., changes in perspective, lighting, or occlusion).
Importance in Vision Science: Understanding object constancy is crucial for studying how we perceive and interpret visual information in our environment.

Bistable Images
Images that can be interpreted in multiple ways (the duck/rabbit)
The visual information supports multiple interpretations

The Ventral 'What' Pathway
Early parts of this pathway (V1 and V2) process very localised aspects of the image and low-level properties such as:
  o Line orientation
  o Luminance (brightness)
  o Spatial frequency
As you move further forward in this stream, cells begin responding to more complex combinations of stimuli
In the later stages, in the inferotemporal cortex, cells show response preferences to particular classes of objects

Building the Image
The representation of an image is built up in a hierarchical fashion.
  o Early stages process localised, low-level aspects of an image, with full recognition and semantic processing only occurring at the later stages of processing.
  o And this is reflected to some degree in the receptive field sizes of neurons at each stage of processing.
    ▪ Here, you're looking at a schematic of receptive field sizes from the retina, to the LGN and eventually V1.

Many to One
This arrangement, where many cells project to one further up the hierarchy, reflects the fact that higher up in the hierarchy the system integrates information from the lower levels.
Lots of information being funneled into one area

Orientation Tuning
Cells are "tuned" to specific orientations
  o Hubel and Wiesel first showed this
  o Individual cells fired most strongly to specific orientations and directions of motion, so the cell could be said to 'prefer' that orientation
Orientation "preferences" are organized in columns in V1
Columnar organization is common elsewhere in the brain
  o e.g. the motion sensitive regions V5/MT

Complex Preferences in the Ventral Stream
The further forward in the system, the more sophisticated the preferences demonstrated by populations of neurons
LOC: object form
V4/V8: colour
V5/MT: motion
AIT: complex object representations
Firing rates in the top panel show sensitivity to hands, even when shown in a mitten! But the objects on the bottom row, even those on the left that are designed to look 'hand-like', fail to elicit a response from the cell.

Receptive Fields and Cell Density
Cell density is highest at the fovea
  o we foveate the objects we want to process and so devote more processing power to this region of space
Receptive field (RF) size increases with distance from the fovea
RF size increases in more anterior regions of temporal cortex
  o RF size in general increases as we move further forward in the ventral pathway and dorsal pathway.
    ▪ but in the dorsal stream only 60% of RFs in the anterior regions of the pathway include the fovea.
  o RF size becomes so large that cells ubiquitously encompass the fovea.
Advantages to larger RF
  o Cells respond regardless of location or size
  o Cells respond to global shape properties
  o Supports object constancy
  o Cells in IT always include fovea (central vision is always represented)
    ▪ (Note: anterior portions of dorsal stream have larger RFs but only 60% include fovea)

Synaesthesia and Modularity of Function
Ventral Pathway & Modularity:
  o The ventral visual pathway in the brain is typically described as modular.
  o Different regions of the pathway specialize in processing specific aspects of an object or image, such as shape, color, and texture.
  o However, this modular view doesn't fully explain how objects are processed in vision as a whole.
Synesthesia as an Exception:
  o Definition: Synesthesia is a condition where a stimulus in one sensory modality triggers additional, unexpected perceptions in another modality.
  o Grapheme-color synesthesia: The most common form, where numbers or letters are perceived as having specific colors (e.g., the number 7 is always seen as yellow).
  o Other types:
    ▪ Shape-taste synesthesia: Shapes trigger the perception of specific tastes.
    ▪ Space-time synesthesia: Time (like days of the week) is experienced as having a spatial location.
    ▪ Object-personality synesthesia: Certain objects, like letters, are perceived as having personalities (e.g., "E is a jerk, S is kind").
Modularity Breakdown in Synesthesia:
  o Synesthesia challenges the modular view because a single stimulus (like a letter) can activate multiple perceptual experiences (like color).
  o It suggests cross-linking of different sensory modules that are usually distinct in non-synesthetes.
Underlying Mechanisms of Synesthesia: Two key theories:
  o Reduced Synaptic Pruning:
    ▪ During development, synaptic pruning reduces neural connections to streamline processing.
    ▪ In synesthetes, there may be less pruning, leading to more cross-linkages between sensory regions (e.g., numbers and color processing areas).
  o Enhanced Re-entrant Activation:
    ▪ When a module (e.g., for numbers) activates, there is increased back propagation of signals to other areas, like those responsible for color, causing synesthetic experiences.
  o It is difficult to differentiate between these two possibilities.
Memory and Synesthesia:
  o Synesthesia can enhance memory for certain items.
  o The additional perceptual information (such as color associations) can help synesthetes organize and recall information more effectively.
  o Grapheme-Color Synaesthesia and Memory:
    ▪ Grapheme-color synaesthetes can recall more information when presented with alphanumeric characters that are congruent with their synesthetic color associations.
    ▪ Example: If a synaesthete associates the number "7" with yellow and sees it presented in yellow, their recall improves.
    ▪ If the colors of the characters are incongruent (not matching their synesthetic experience), their memory performance decreases to the level of neurotypical individuals.
  o Case Study: Patient S:
    ▪ Patient S: A famous case studied by Alexandr Luria in the mid-20th century.
    ▪ Exceptional Memory: Patient S was known for extraordinary memory abilities, possibly linked to a highly developed form of synaesthesia.
    ▪ Synaesthetic Descriptions: Patient S didn't just associate colors with numbers but also gave rich, detailed personality traits to them. Example descriptions:
      o The number "1" was perceived as a slender man with a long face and upright posture.
      o The number "2" was perceived as a plump woman with an elaborate hairdo, dressed in velvet or silk.
  o Implications for Memory:
    ▪ Synaesthesia can provide additional sensory information (such as colors or character traits) that helps synaesthetes organize and recall information more effectively.
    ▪ However, this memory enhancement depends on the congruency of the stimuli with their synesthetic experiences.

Modularity and the Binding Problem
Earlier parts of the visual pathway process more rudimentary elements of an image and later parts represent objects and scenes in more holistic ways.
All this modularity poses some challenges for the visual system.
  o How do we combine what each module does into a coherent representation we can use?
    ▪ This is the so-called binding problem
  o How do we recognise things when the image falling on the retina constantly changes?
    ▪ This is the process known as object constancy.
The image here shows that distinct brain regions do different things
  o They extract features in early visual stages, describe the form or shape in later stages (like LOC) and then actually recognize the object, perhaps by matching to some representation in memory.
  o But the fact that these different processes occur in distinct brain regions only highlights the binding problem: is there a single brain region that 'knows' what we're looking at?

View-Invariant Representations
Sensory inputs define some basic properties that we can extract from an image despite changes at the level of the retina
  o Build the image up from basic properties to higher order representations
Marr's computational model
  o Built the image up in specific stages
View-dependent representations would place large demands on memory systems

Biederman's Geons: A Visual Alphabet
The recognition by components theory: objects are a combination of parts, and these parts (geons) form a kind of visual / perceptual alphabet
  o An object is defined by the unique set and arrangement of geons
Problems
  o How is it that we can recognize objects with very different geons as the same objects?
  o How can we distinguish between objects that share many/all of the same geons?

Object Representation and the Grandmother Cell Theory
Hierarchical and Modular Representation in the Brain:
  o The brain processes objects in a hierarchical and modular way. Early stages involve low-level processing, such as recognizing basic features (e.g., lines, shapes).
  o As information moves through the brain, it is integrated to form higher-level concepts, such as recognizing familiar objects or people (e.g., a table or grandmother).
The Grandmother Cell Hypothesis:
  o This hypothesis suggests that at some point, the brain could have individual cells (or small populations of cells) dedicated to recognizing specific entities, like a "grandmother cell" for recognizing your grandmother.
  o By extension, there would be "table cells," "phone cells," and "motorbike cells" responsible for recognizing those specific objects.
Problems with the Grandmother Cell Hypothesis:
  o Vulnerability to Brain Damage: If this hypothesis were true, damage to a small number of cells could lead to the loss of recognition for entire categories of objects or people. For example, if the "grandmother cells" were damaged, you might lose the ability to recognize your grandmother altogether.
  o Agnosias: Brain damage doesn't typically eliminate all recognition capacities for a given object or person. For instance, you might fail to recognize your grandmother by sight (visual form agnosia), but you would still recognize her by voice or other cues.
Novel Object Recognition:
  o Another issue with the grandmother cell theory is explaining how we recognize novel objects we've never seen before.
  o Example: How did the first Europeans who saw a platypus process it? They wouldn't have a pre-existing population of "platypus cells" in their inferotemporal cortex to handle this new stimulus.
  o This suggests that object recognition involves more dynamic and flexible processes than just relying on dedicated cells for each familiar object.

Ensemble Encoding
One potential solution to the previous problem is known as ensemble encoding
There is no single population that represents a class of objects (or Granny). Instead, recognition is achieved as a result of synchronised activity across the regions that process the object: a combination of activity from complex feature detectors of various kinds.
  o Recognition via synchronized firing across multiple regions

Failures of Object Recognition
Agnosia: a = without, gnosis = knowledge
Apperceptive Agnosia: impaired object recognition due to degraded perceptual abilities
  o The figure shows the copying performance of a patient with apperceptive agnosia, sometimes referred to as visual form agnosia.
  o They struggle to accurately match shapes, cannot copy simple line drawings and cannot name objects based on visual information alone.
  o For this patient, elementary processing of visual properties such as brightness, colour and even texture remains intact.
  o They can, if given the opportunity, name objects using other sensory modalities
    ▪ Thus, the patient does not have any deficits in semantic or memory processes.
Associative Agnosia: impaired object recognition despite intact perceptual processing
  o Patients have little difficulty representing the form of what they see, so they can copy objects and their matching ability remains largely intact.
  o Where they have trouble is in attaching an appropriate label to what they see.
    ▪ So they fail to name objects accurately despite being able to draw them correctly, and they struggle to make drawings from memory alone.
  o Brain damage is later in the visual pathway, at the occipito-temporal border or even further into temporal cortex. Although it more commonly arises from bilateral damage, associative agnosia has also been seen after unilateral left hemisphere lesions.

Apperceptive Agnosia
Brain damage that leads to apperceptive agnosia occurs early in the visual pathway, often affecting the lateral occipital complex
The challenge for these patients is in binding elements of an image into a coherent whole. Their perception is fragmented, making it difficult for them to extract the global shape properties.
They become heavily reliant on local properties of an image, demonstrate impaired object constancy and struggle to move from processing parts of the image to recognizing the whole image.

Associative Agnosia (aka Category-Specific Agnosias)
Distinction between living and non-living things
  o Patient may fail to recognize living things but have less difficulty recognizing non-living things
    ▪ Tools and man-made objects share fewer things in common than do living things, which commonly have appendages (arms, legs, etc.), heads, and bodies.
    ▪ Their similarity to one another likely evokes activation within a common neural network and makes them susceptible to damage
  o The diverse physical and functional characteristics of man-made objects may rely on more diverse and distinct networks of brain regions, making them more robust to brain damage
Homomorphism
  o The notion that living things share more attributes in common rests on the assumption that living things are 'homomorphic': they have the same general shape
Objects have functions
  o Non-living things can be distinguished based on their functional characteristics: how we use them and what we use them for.
  o The same can't be said of living things: what functional difference is there between a lion and a tiger?
Evolutionary pressures led to distinct processing systems

Psychological Neighbourhood
Thus, tools are more resistant to category-specific agnosias
  o Because they share fewer perceptual characteristics and have distinct functional properties (e.g. wooden hammer vs. metal saw)
  o Tools have functional representations
Category-specific agnosia may occur more for objects with few or closely linked representations
  o E.g. musical instruments
Mike Dixon, studying a patient with category-specific agnosia (due to herpes encephalitis) for a specific class of musical instruments (e.g. string but not brass),
  o proposed the theory of a 'psychological neighborhood': objects that share some property (e.g., perceptual, action affordances, functions, etc.) would be closer together in a psychological neighborhood and presumably have similar neurological underpinnings.
  o This in turn makes them more susceptible to damage.

Prosopagnosia
Most common category-specific agnosia, where the patient cannot recognize faces using vision alone
  o Can recognize that there is a face, but cannot associate it with a specific identity
  o Can recognize people through other modalities (e.g. their voice or gait)
  o Sex and relative age of a person can be determined (which is low frequency, global information), but their identity cannot be determined (higher frequency, local information)
It is controversial whether this condition is congenital
Are faces a special category of object?
  o They are the most salient thing we use not just to identify individuals but to communicate with them as well.
  o As Ekman showed many years ago, some facial expressions are seemingly universal.
    ▪ The notion that there are only 6 classes of emotional expression (sadness, happiness, anger, fear, surprise, and disgust) has been strongly challenged, along with the notion that we should consider them as opposing pairs somehow.
    ▪ But what is undeniable is that the human face conveys an enormous amount of information communicated non-verbally in social interactions.
  o Face selective cells in monkey IT (inferotemporal) cortex
    ▪ Firing rates for cells in the IT cortex to a variety of images: strong responses to faces in images 1, 3, 6 and 7; images 4 and 5, where the mouth and eye are occluded, still elicit a strong response; very little response to non-faces in images 2 and 8
    ▪ thus, in primates it is undeniable that there are regions of the brain that are preferentially active for faces and face-like stimuli
  o we have no trouble imposing a face structure on objects that do not normally have them (e.g. houses and cars)
    ▪ Giuseppe Arcimboldo: in a painting of a bowl of vegetables, we can see a face

Thatcher Illusion
Aka the face-inversion effect
Demonstrates something important about face processing.
  o Local details in a face are more difficult to process when inverted (a non-canonical orientation)
    ▪ Thus, inverted faces are more difficult to remember
The N170 shows distinct patterns for faces and inverted faces
In an upside down face where the eyes and mouth have been flipped right side up, the face is recognizable
However, a face right side up with the eyes and mouth upside down looks off-putting
This suggests that faces are not typically processed one feature at a time; instead they are processed as a whole
  o Martha Farah first referred to this as 'holistic face processing'

Holistic Processing
Can see this form of processing in memory tasks
  o People are asked to study a range of faces and houses
  o Later they do a recognition memory test, in which they are presented with either the whole image or only part of the image
    ▪ Faces are poorly recalled when only part of the face is available
    ▪ Houses show no such deficit

FFA: Fusiform Face Area
Faces and houses have now become a common theme in neuroscience research, perhaps because both things are such common objects in our world.
Aina Puce and colleagues first showed using fMRI that there was a specific region in the fusiform gyrus of the temporal lobe that responds preferentially to faces.
Nancy Kanwisher and colleagues followed that up a few years later, showing face specificity in the fusiform and labelling the region the 'fusiform face area'.
Early work showed that there was also a left visual field/right hemisphere advantage for responding to faces; that effect seems to replicate reasonably well (left is right in the brain images shown here).
  o But early work also showed that the FFA was not silent when other objects were shown. Clearly, face processing involves a network of brain regions, and what each does functionally is still an open question.
  o Other face sensitive regions are in the occipital cortex (OFA) and in the superior temporal sulcus (STS).

Object Processing in the FFA
Alex Martin, Kalanit Grill-Spector and others have shown clearly that the FFA also responds to other object classes.
F=faces, A=animals, C=cars and S=sculpture.
  o Each class of object shows significant activation of the FFA (although clearly faces show the strongest activity).

Holistic Processing and the Fusiform Gyrus
Prosopagnosia: common after right inferior temporal lesions
Alexia: letter-by-letter reading, common after left inferior temporal lesions
What is common about these deficits?
  o impaired holistic processing
So, prosopagnosics should also be poor at reading tasks and vice versa

Lesions and Co-morbidity

Special Category or Expertise?
Early recognition memory studies suggest our visual memory capacities far exceed our verbal memory and that this is particularly true for faces. People can be shown thousands of images on one day and accurately recognize around 90% of them the next (Lionel Standing originated much of this work in the 1970s).
Isabel Gauthier and colleagues addressed this in a visual training study.
  o They created "Greebles": abstract figures with various appendages that could, after extensive training, be identified at the family level (these are the Smiths, these are the Joneses) and at the individual level (this is Bob and this is Joe).
  o It took a lot of training. fMRI scans prior to training showed no activation of the FFA in response to Greebles, but after training the FFA showed robust activity to the Greebles.
    ▪ So perhaps we're just seeing an expertise effect here. Others have shown FFA activation to dogs in expert dog show judges, FFA activity to muscle cars in experts in vintage cars, and so on.

Section 7: Spatial Processing / Perception of the Dorsal Stream

Ants exhibit dead reckoning: they leave the nest to search for food and explore widely, including many turns, cutbacks, etc.
  o When food is found, they go directly back to the nest in a way that suggests they kept track of their initial path
  o Thus, they calculate and remember the distance and direction relative to the starting point
  o Similar behaviour has been found in spiders, geese and…
  o Golden Hamsters
    ▪ Experiments used variants of a two-legged outward-bound journey (two paths orthogonal to one another with different lengths and different angles) as well as more meandering outward journeys. A third experiment had the hamsters slowly rotated on a platform while collecting their food

Dorsal Stream Functions
Focus: Spatial processes and spatial perception.
Key Studies: Early lesion studies in monkeys demonstrated a dissociation between dorsal (spatial) and ventral (object recognition) processing.

Key Experimental Tasks
1. Alternating Rule Task:
  o Purpose: To learn the location of food based on spatial rules.
  o Landmark Task:
    ▪ The monkey learns that food is initially placed in a well indicated by a landmark (e.g., a cylinder).
    ▪ After several trials, the food location shifts to the opposite well.
    ▪ Critical Aspect: The monkey learns the spatial rule rather than recognizing the landmark object itself.
    ▪ Findings: Monkeys with parietal lesions struggle with this task, indicating the dorsal stream's role in spatial awareness.
2. Object Discrimination Task:
  o Purpose: To learn the location of food based on object associations.
  o The monkey learns that food is in a well associated with one of two objects (e.g., cylinder or cube) for several trials.
  o The food then shifts to the well associated with the other object.
  o Critical Aspect: The identity of the objects matters, but their spatial arrangement can change without issue.
  o Findings: Monkeys with parietal lesions perform well, while those with temporal (ventral stream) lesions show impairments
Summary of Findings
  o Dorsal Stream (Spatial Processing): Implicated in learning and executing spatial rules, as evidenced by performance deficits in parietal lesioned monkeys.
  o Ventral Stream (Object Recognition): Implicated in object identity recognition, as evidenced by performance deficits in temporal lesioned monkeys when the task relies on object discrimination.

Frames of Reference
When coding the spatial information in the environment we can:
  o Code it relative to ourselves as actors (egocentric): where does my foot go next?
  o Code it in terms of the relations between objects (allocentric): which block is in front of, beside, lower than, etc.
  o Fred Previc developed a theoretical account of 3-dimensional space based on evolutionary models of primate behaviour.
    ▪ He described several regions of space:
      personal space (spatial relations of the body)
      peripersonal space (regions of space within arm's, or foot's, reach)
      focal extrapersonal space, which essentially meant wherever we direct our attention or eye movements
      extrapersonal space, further split into extrapersonal 'action' space (regions just beyond arm's reach but that we could immediately step into) and ambient extrapersonal space.
Support for Previc's Account
  o General Consensus: Support for Previc's account is limited.
  o Impact: The account encouraged research into how information is processed differently across various regions of space.
Differences in Spatial Information Processing
  o Saccadic Eye Movements:
    ▪ Observation: Saccades (rapid eye movements) are faster to targets in the upper visual field.
  o Visual Memory Bias:
    ▪ Study by Previc & Intraub (1997): Participants in a lecture theater were asked to copy a picture.
    ▪ Findings: Analysis of the copies revealed that the upper portion of the image was expanded relative to the lower portion, indicating a bias in visual memory favoring the upper visual field.
  o Grasping and Motor Processes:
    ▪ Efficiency in Lower Visual Field: Grasping movements are more efficient in the lower visual field. Speed-accuracy trade-offs are also more favorable in the lower visual field. Motor imagery tasks show similar efficiency in the lower visual field.
Terminology
  o Anisotropies: The differences in processing speed and efficiency across various regions of space, particularly favoring the upper visual field for saccades and the lower visual field for grasping and motor tasks.

Saccadic Remapping
The velocity of our eye movements is so fast that we suppress perception of the world during the saccade; otherwise we'd have to deal with blur all the time.
The bottom panels highlight a process known as saccadic remapping. Given the sheer velocity of eye movements, and the fact that to cope with that we suppress perception during the saccade, it is plausible that we would need to recreate our mental representations of the world anew every time our eyes landed on something, close to five times a second!
  o This is not what happens. Instead, our visual system is capable of anticipating what the world will look like before we start to move our eyes, so we remap the RFs in anticipation of the landing point of our next saccade.

Single Case Study of a Right Fronto-Parietal Patient with Neglect
The task they used is known as the double step saccade task (by Jean-René Duhamel).
Double Step Saccade Task
  o The task involves starting with the eyes fixed on a fixation point (FP).
  o Two saccade targets (A and B) are flashed on the screen for less than 200 milliseconds.
  o Timing Importance:
    ▪ Critical Time Frame: The less than 200 msec duration is crucial, as an eye movement cannot be initiated in this time frame (except for express saccades, which are not the focus here).
Eye Movement Mechanism
  o Retinal Coordinates Issue:
    ▪ If eye movements were based solely on retinal coordinates, the movement would go down and slightly left towards target A. After moving to A, the coordinates for target B would change, leading to a missed target.
  o Actual Eye Movement:
    ▪ Instead of relying solely on retinal coordinates, the brain remaps the coordinates of the visual world based on the anticipated landing point for target A.
    ▪ This allows for an accurate movement towards target B: the eyes move down and slightly to the right, successfully acquiring target B.
Key Points
  o Remapping Process: The brain processes and adjusts visual coordinates dynamically, allowing for accurate targeting of moving stimuli.
  o Implication: This task illustrates the complexity of saccadic eye movements and the brain's ability to predict and adjust for spatial changes in the environment.
Duhamel's patient had no trouble saccading to targets when given enough time (top panel), but when the two targets disappeared in under 200 msec the patient acquired target 1 OK but failed to remap their representation of the world in order to accurately acquire target 2 (lower panel).
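
The remapping step in the double step saccade task can be written as simple vector arithmetic. The sketch below is only an illustration, assuming both targets are coded as (x, y) offsets from the fixation point in arbitrary units. The function names and the example coordinates are hypothetical and do not come from Duhamel's study.

```python
from typing import Tuple

Vec = Tuple[float, float]  # (x, y) offset in arbitrary units; negative y = down


def second_saccade_vector(target_a: Vec, target_b: Vec, remap: bool) -> Vec:
    """Eye-movement vector for the second saccade.

    Both targets are coded relative to the fixation point, i.e. in the
    retinal coordinates available while the targets were flashed.
    """
    if remap:
        # Anticipate that the eye will be at A, so B is re-coded relative to A.
        return (target_b[0] - target_a[0], target_b[1] - target_a[1])
    # No remapping: reuse the stale retinal vector recorded at fixation.
    return target_b


def final_eye_position(target_a: Vec, target_b: Vec, remap: bool) -> Vec:
    """Where the eye ends up after saccading to A and then making the second saccade."""
    dx, dy = second_saccade_vector(target_a, target_b, remap)
    return (target_a[0] + dx, target_a[1] + dy)


# Hypothetical layout: A is down and slightly left of fixation,
# B is below A and slightly to its right.
A: Vec = (-2.0, -6.0)
B: Vec = (-1.0, -12.0)

print(second_saccade_vector(A, B, remap=True))  # down and slightly right
print(final_eye_position(A, B, remap=True))     # lands on B
print(final_eye_position(A, B, remap=False))    # overshoots and misses B
```

With remapping switched off, the second movement reuses the vector that was valid only at fixation, which is one way to picture the patient's failure to acquire the second target.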
Anamorphisms
A technique used to create depth.
To see the 3D depth, viewers are required to use a special lens or stand at a specific vantage point.
Julian Beever is an artist who uses this technique, and Hans Holbein is the most famous to have used it.
o In his “Ambassadors” painting there is a skull painted in great detail that is only viewable from a specific vantage point.
Binocular Disparity
This helps us to perceive depth.
Each eye receives a slightly different image, and your brain calculates the difference between them to determine the relative depths of objects in the field of view.
Binocular disparity is not the only route to depth perception, because people who have lost one eye (enucleated) are still capable of it.
One static cue to depth is shape from shading.
o Most people perceive the central circle to be concave and the surrounding circles to be convex.
Motion Parallax
Represents a dynamic depth cue.
Objects closer to you than the fixation point will appear to move quickly in the opposite direction to your motion, whereas objects beyond the fixation point will appear to move more slowly, in the same direction as your motion.
o We can use this information (which is the appearance of motion, or apparent motion) to determine depth.
Motion Perception
Basic Perceptual Category: Motion perception is considered fundamental due to its prevalence in the environment.
Brain Regions: Regions responsible for motion perception do not fall strictly within the dorsal or ventral visual streams but occupy areas in between.
o Often associated with the dorsal stream (action-related stream) because motion typically triggers actions, such as dodging or avoiding obstacles.
Key Brain Areas:
Area V5: Also known as MT (Middle Temporal) and MST (Medial Superior Temporal) regions.
o Labels like MT and MST come from studies on monkeys, where they better correspond to motion-sensitive regions than in humans.
MT in Monkeys: Cells in the MT area respond to both direction and speed of motion.
o A common method for testing motion sensitivity is the coherent motion task, which involves fields of dots moving either randomly or in a coherent direction.
Sensitivity:
o Monkeys and humans have exquisite sensitivity to motion, needing only about 5% of dots in a full field to move in a coherent direction to accurately detect motion direction.
Optic Flow and Direction Heading
Early 2000s work using fMRI shows activation in area V5 to full field stimulation using optic flow.
Important: a large region of the brain is strongly activated by motion in the image.
Each hemisphere responded most strongly to contralateral stimuli and also significantly to ipsilateral stimuli.
Optic flow: the perception of apparent movement of stimuli relative to our own direction of heading.
o Movement appears to be slowest at the point we are heading towards (usually the fixation point) and fastest in the periphery.
o Important for processes like calculating direction and speed of heading, which is critical for locomotion (as a means of representing depth, and even in segmenting objects in the environment).
Structure from Motion
We can use motion to discern structure in the environment.
o Imagine trying to find a tiger among the reeds in the plains of the Serengeti.
o The stripes of the tiger blend with the reeds and you struggle to find the animal that could kill you.
o But when he moves the tiger betrays himself: now, by virtue of the motion, you can see where he lurks.
Ferber and colleagues showed this in fMRI.
o Had random segments move in a different direction from background segments.
o In two conditions, when the movement stopped, the segments forming a rhino either remained present or were scrambled.
o While motion activated V5, the motion after-effect (a persistent image of the rhino) activated the LOC.
▪ Highlights that motion perception can help for action or perception.
Point-Light Method
Lights attached to people are recorded as the people move (observers can tell whether they are walking, jumping, or dancing and, at above chance levels, their sex). This is another example of discerning information from motion.
Psychopaths and Gait Processing
Wheeler, Book, and colleagues showed that those higher in psychopathy were better able to recognize body language characteristics of victims.
o Experimenters filmed people walking from behind and asked participants to indicate which of them had been a victim of a violent crime.
o Found that those higher in interpersonal and affective aspects of psychopathy did better at discriminating people who had been victims of a violent crime.
o In a sample of incarcerated people, the justifications focused largely on gait: something about the swing of the arms gave away victims.
▪ This is touted then as a potential mechanism by which violent offenders select their victims.
Optic Ataxia
Alain Vighetto and Marie-Thérèse Perenin were the first to fully explore the disorder.
Arises usually as a consequence of bilateral parietal lesions (sometimes unilateral).
Patients with the disorder mis-reach for things in the periphery.
Consequences
Look-away passes: A basketball technique exemplified by Steve Nash, where the player deceives defenders by looking in one direction while passing in another.
Touch typing: The skill of typing without looking at the keyboard, which can be hindered if a person watches their fingers.
Reaching for objects: Such as grabbing a coffee mug while focusing on something else, where hands and eyes coordinate but focus on separate tasks.
In cases of optic ataxia, patients find it difficult to perform such tasks where eye and hand coordination diverges (the hands move somewhere different than the eyes). The example of Steve Nash is used to emphasize how important this coordination is in complex tasks like sports, while more mundane activities such as typing and grabbing objects also rely on it.
Patients with optic ataxia struggle to direct the eyes and hands to do different things. This can be shown clinically:
Test Setup: The patient is instructed to look straight ahead while attempting to grasp a target placed to their left or right, using either hand.
Contralesional Hand: The most notable impairment occurs when the patient uses the hand opposite the side of their brain lesion (e.g., a right-hand difficulty due to a left parietal lesion) to reach into the contralesional (opposite) space.
o But it is also true that patients have trouble reaching for peripheral targets in contralesional space with the ipsilesional hand or in ipsilesional space with the contralesional hand!
Field Effect: Patients have trouble reaching for objects in contralesional space (opposite the lesion) with either hand. This is visualized as a "field effect" where reaching across the body is impaired.
Hand Effect: The "hand effect" describes difficulties using the contralesional hand (right hand for a left lesion), even when reaching for objects within either space (left or right).
In the example provided, a patient with a left parietal lesion struggles both to reach into right space and to use their right hand regardless of the target's location. The red area indicates where field effects disrupt reaching, while green highlights hand-specific disruptions.
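The field and hand effects described above can be summarised as a small decision rule. The sketch below is a simplification for study purposes: the function name and the string labels are hypothetical, and it treats each effect as all-or-none, whereas real patients vary in severity.

```python
# Illustrative sketch: which effects apply to a reach made while the
# patient fixates straight ahead and the target is in the periphery.
# Labels ("left"/"right", "field"/"hand") are assumptions for this example.

def opposite(side):
    """Return the other side of space or body."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    return "right" if side == "left" else "left"


def reaching_effects(lesion_side, hand, target_side):
    """List the effects expected to disrupt a peripheral reach."""
    contralesional = opposite(lesion_side)
    opposite(hand)         # validates the hand label
    opposite(target_side)  # validates the target label
    effects = []
    if target_side == contralesional:
        effects.append("field")  # target lies in contralesional space
    if hand == contralesional:
        effects.append("hand")   # reach is made with the contralesional hand
    return effects


# Left parietal lesion, right hand reaching into right space:
# both effects combine, giving the most notable impairment.
print(reaching_effects("left", "right", "right"))
# Same lesion, left hand reaching into left space: neither effect applies.
print(reaching_effects("left", "left", "left"))
```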
Unilateral Superior Parietal Brain Damage
Video shows a patient with unilateral superior parietal brain damage demonstrating optic ataxia.
The patient faces one of my students and fixates on the student’s nose, which gives the patient something to fixate on.
I am standing behind the patient and will place a pencil in front of her, to the left or right, and simply ask her to grasp it with either hand.
Optic Ataxia and the Automatic Pilot
Patient IG – bilateral superior parietal lobe lesions
Pisella and colleagues showed an interesting effect with an optic ataxia patient using what is called a target-perturbation task.
In this task, people reach rapidly for a target and on some trials (25% of them) the target moves (or is perturbed) as soon as they start to move.
o Turns out that we rapidly adjust to such things.
Study Instructions:
o Participants were told that if a target moved, they should stop their current movement rather than adjust to the new location.
o Interestingly, 15% of the time, participants still adjusted their movements even when instructed to stop.
Testing on Patient IG (with optic ataxia):
o IG had bilateral superior parietal lesions.
o In the "Location-Go" trials (where participants should adjust), IG adjusted her movements to the shifted target, but she did so more slowly compared to controls.
o In the "Location-Stop" trials (where participants were instructed to stop their movement after a target jump), IG outperformed the controls, as she had no trouble inhibiting her movements when the target was perturbed.
Conclusion:
o The superior parietal cortex in healthy individuals seems to act as an "automatic pilot" for guiding visually-directed movements. This automatic control can make it difficult for people to stop a movement once initiated.
o In IG, whose ballistic (fast and automatic) movements are disrupted due to parietal lesions, these movements are easier to inhibit when necessary, such as when the target is perturbed.
This suggests that damage to the parietal cortex can interfere with automatic movement planning, leading to slower adjustments but better inhibition of unwanted movements.
Other Studies on Patient IG (By Milner)
Misreaching in the Periphery:
o IG's difficulty in reaching for targets in her peripheral vision improves when there is a delay between the target being presented and the onset of her movement.
o The phenomenon of misreaching in the periphery is referred to as "magnetic misreaching" by David Carey and colleagues, indicating that IG's hand movements tend to be drawn towards her point of fixation, almost like a magnetic pull.
Observation of Misreaching:
o When IG's fixation point (black target) is centrally located, she has no problem reaching for it, accurately pointing where she is looking.
o As the targets move further into the periphery, IG's reach movements are pulled towards her central fixation, showing significant deviation in the endpoint of her movements.
Effect of Delays:
o Introducing a delay between when the target is shown and when IG starts her movement greatly reduces this misreaching effect.
o This improvement suggests that in-the-moment movements are disrupted for IG due to her parietal damage, but when given extra time, she can use alternative coordinate systems to improve her accuracy.
These results indicate that IG's optic ataxia primarily affects movements that are planned and executed quickly, but with sufficient time, she can compensate using other coordinate systems, enhancing her reaching performance.
Section 8: Attention
In order to maximize reward (nectar), both honeybees and bumblebees need to select from a multitude of possible targets.
o Similar to humans, they cannot represent everything they see and must filter out irrelevant things (grasses and low-yield nectar) and focus on only those that are relevant (high-reward nectar).
o This is attentional selection.
Honeybees
o Evolved in the tropics and subtropics
▪ environment likely shaped differences in foraging strategies
o better at discriminating different colours
o larger colonies
o achieve a division of labour based on age
Bumblebees
o Evolved in the northern temperate zone
o Higher spatial resolution (for both chromatic and achromatic vision)
o Smaller colonies
o achieve a division of labour based on size
One of the key consequences of the distinct visual apparatus of the two species is the degree to which they co-operate.
o Bumblebees don’t communicate information regarding food sources to nest mates, which serves to optimize individual performance (higher spatial resolution).
o The more frenetic honeybee strategy serves speed.
o A cross-species speed-accuracy trade-off
Comparative biology – the field examines natural variation across species to examine the mechanisms that support common functions (similarities, differences, and the influence of ecosystems on the organism and community)
o Sheds light on structure-function relationships
E.g. work by Anne Treisman: each bee is tagged and placed in an environment where they search for a target with a reward (a pad with nectar) among distractors (pads with no nectar).
Bees are then filmed from above.
o Can look at the flight path and the time it takes them to find the target as a function of the number of distractors.
o Bumblebees were more accurate than honeybees regardless of target size.
▪ Less affected by the number of distractors (suggestive of a distinction between parallel and serial search)
▪ Deviate from straight ahead earlier than do the honeybees (shown on the right), so they decided where to fly earlier on but took longer to get there.
o Honeybees were faster to get to the targets.
▪ More affected by the number of distractors
Everybody Knows
William James: “Everyone knows what attention is. It is the taking possession by the mind, in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought. Focalization, concentration of consciousness are of its essence. It implies withdrawal from some things in order to deal effectively with others, and is a condition which has a real opposite in the confused, dazed, scatterbrained state...”
o This description is dense:
1. We cannot possibly attend to everything we see because our physical plant (what we use to perceive and interact with the environment) is limited.
2. He conflates consciousness with attention, such that we can’t possibly attend to something we are not conscious of.
3. It highlights the opposite side of the selection coin: to attend to something we may have to withdraw from others.
2nd Quote: "Compared to what we ought to be, we are only half awake. We are making use of only a small part of our physical and mental resources."
Interpretation: Emphasizes the idea that we are not fully utilizing our potential.
o Suggests we have untapped physical and mental resources.
James stresses that attention is necessarily selective: we can't process all stimuli equally.
o This selectivity indicates we are limited in focus and attention capacity.
The quote hints at a belief that we could potentially increase our mental and physical usage if we push beyond our current limitations. The idea that humans only use a small portion of their resources could be linked to the misconception that we only use 10% of our brains o James' statement might have been misinterpreted over time, contributing to this widespread myth. Where’s Waldo Phenomenon The puzzles play on our limited capacity and the challenges we face in selecting something Waldo Analogy: o Easy Search: If only one Waldo (or object) is in a distinct setting (e.g., a field of green grass), he can be spotted easily from a long distance. o Difficult Search: When there are many similar individuals (like during a Guinness World Record event with thousands of Waldo enthusiasts), finding a specific person (like Jane) becomes very difficult. Search Types: o Serial Search: In a crowd, we typically search one-by-one, making the process more time-consuming as the number of similar individuals increases. o Search Complexity: The more "Waldos" (similar items/people) present, the more challenging it becomes to locate the desired one. Definitions Alertness and arousal – reticular activating system Vigilance or sustained attention Selective attention – fronto-parietal network Sociological Context of Early Attention Theories Strict Behaviorism: Figures like Watson focused on observable stimulus-response associations, rejecting the need to consider internal mental states to explain human behavior. B.F. Skinner: Often linked to behaviorism, though his views were more moderate, acknowledging the existence of internal mental states. Computers as a Cognitive Metaphor: o With the rise of computers, a metaphor for cognition developed. The internal computations of a machine (which could not be seen) served as a comparison for understanding human mental processes. o Inputs were fed into the computer, internal processes were engaged, and outputs were produced. 
This mirrored the human cognitive system, where mental processes transformed perception (input) into behavior (output).
Treisman (1964):
o Anne Treisman referred to humans as an "information handling system," emphasizing the roles of input, computation, and feedback in cognition.
The UNIVAC Example:
o The UNIVAC (Universal Automatic Computer), built in 1951, was used to predict the outcome of the 1952 U.S. Presidential election.
o Although UNIVAC accurately predicted Eisenhower's landslide victory within 1% of the actual result, CBS initially doubted the results, reflecting skepticism around trusting machines to handle complex computations.
o The cognitive revolution of the 1950s helped society begin to trust the idea of internal, unseen processes, both in machines and humans.
Cognitive Revolution:
o This period marked a shift in psychology, from behaviorism to cognitive psychology, focusing on internal mental mechanisms that guide behavior, similar to a computer’s internal processes.
What is the Mechanism of Selection?
What is the mechanism of selection in the brain that determines which sensory information is processed, while the rest is discarded?
Broadbent's Early Selection Model:
o Donald Broadbent (1950s) proposed that information from multiple sensory channels is selected very early in the cognitive processing stream.
o Information on unselected channels is lost.
o This model suggests that sensory registers (e.g., iconic and echoic memory) have an unlimited capacity to temporarily store large amounts of sensory data.
Iconic and Echoic Memory:
o Iconic memory = visual sensory register.
o Echoic memory = auditory sensory register.
o Sensory registers hold information briefly but decay rapidly unless selected for further processing.
Sperling's Partial Report Experiment (1960s):
o Tested iconic memory using a 3x4 grid of letters.
o In a full report condition, participants could only recall 3-5 letters out of 12.
o In a partial report condition (where participants were cued to recall one row), accuracy rose to about 75%, suggesting that large amounts of information are momentarily registered before decaying.
Attention and Bottleneck Theory:
o To deeply process information, we must select a specific channel for further analysis. This selection allows information to pass through a bottleneck into short-term memory.
o Bottleneck model: Information is like balls moving through tubes and gates. Only one channel’s information passes through at a time, blocking others from entry.
Physical Models of Attention:
o Broadbent encouraged building physical models (tubes and gates) to represent cognitive processes.
o These models serve two purposes:
1. Account for known findings in attention research.
2. Characterize processes with predictable outcomes for further testing.
Broadbent's Dichotic Listening Experiment (1956):
o Participants were presented with different number strings in each ear.
o They never mixed up channels, demonstrating the brain’s capacity to maintain separate channels for processing sensory input.
Limitations of the Model:
o Broadbent warned that the model is not a literal description of brain processes but a useful metaphor.
o Information (balls) represents cognitive data, not raw sensory stimuli.
Elaborate Model and Memory Consolidation:
o Broadbent extended his model to explain how information can be consolidated into immediate memory (short-term/working memory).
o Rehearsing processed information strengthens it in memory. This idea relates to modern views of memory as re-enactment, where recalling strengthens neural networks (including errors).
Key Takeaways:
o Sensory memory captures large amounts of information, but selective attention filters what reaches deeper cognitive levels.
o Broadbent’s models laid the groundwork for understanding attention bottlenecks and selective processing in the brain.
o Modern theories continue to explore how information is selected and rehearsed, consolidating into working and long-term memory.
Early vs. Late Selection
This all led to competing models of where in the information processing stream the bottleneck occurred. Essentially, where is information selected and what happens to information on unselected channels?
The Cocktail Party Effect
How do we tune out irrelevant information?
Probabilistic predictions
o same direction, pitch, sensible narratives
Very little processed on unattended channel
Change in gender, pure tones, reversed speech
Turning your ear to the speaker
Broadbent’s Early Selection Model (1958):
o Early selection theory: Channels of information are selected for processing early in the information processing stream.
o Bottleneck: Only selected information proceeds to further, deeper processing, while unselected information is lost.
o Influence: Set the stage for much of the cognitive theorizing on attention.
o Challenges: Later theorists (e.g., late selection models) argued that information is processed more deeply before selection occurs.
Cherry’s Contributions (1953):
o Focused on how we select and process information among multiple sensory channels.
o Cocktail party effect: How we recognize one person’s speech amidst simultaneous conversations.
o Unlike Broadbent, Cherry approached this from a behaviourist perspective, focusing less on subjective experiences.
Key Findings from Cherry’s Experiments:
a) Simultaneous Speech Streams (Bilateral Input):
o Two simultaneous speech streams were presented to both ears.
o Participants were able to report the attended message and separate words from each narrative.
o There was no cross talk between the speech streams, meaning participants correctly attributed words to their respective narratives.
o When transpositions did occur, they were predictable based on narrative probabilities (the natural flow of speech).
b) Dichotic Listening Task (One Stream per Ear):
o In a dichotic listening task, participants heard one speech stream in each ear.
o They could easily “tune in” to one message while ignoring the other.
o Ignored stream: Participants retained very little from the ignored stream and were unable to report even basic details like the language used.
o However, certain changes in the unattended stream were detected:
▪ Voice pitch/sex change (e.g., male to female) was reliably noticed.
▪ Non-speech stimuli (pure tones) were always detected.
▪ Reversed speech was noted by participants as “having something odd about it.”
Cherry's Conclusion:
o Selective attention to one stream involves using cues such as direction, pitch, and transitional probabilities (predicting how the narrative unfolds).
o People can filter out unattended streams but still detect significant changes, like a change in speaker characteristics.
o Cocktail party behavior: Turning one ear towards the speaker is a sensible strategy for focusing attention, as his results suggested.
o Importantly, unattended information wasn’t lost in an all-or-none fashion; some stimuli could still break through.
Key Takeaways:
o Broadbent's early selection model emphasizes selective attention early in processing, where unattended information is lost.
o Cherry’s work demonstrates the ability to selectively attend to one speech stream and detect specific types of information in an unattended stream (e.g., speaker changes).
o Selective attention involves using cues and predicting the flow of information, but unattended information isn't entirely discarded.
Where is the Bottleneck and what happens to unattended information
Both Cherry and Broadbent thought that selection occurred early in the system and most, if not all, unattended information was lost.
Essentially, this is a signal-to-noise problem. How do we extract relevant from irrelevant information?
o At times things might “pop-out” and be easy to represent.
▪ What Anne Treisman and her colleagues showed time and time again is that the speed with which you detect a pop-out target of this kind (e.g. a red apple target among green apple distractors) is unaffected by the number of distractors.
o When things don’t just ‘pop-out’ we take more time to extract the signal from the noise.
Visual Search – Pop Out vs. Serial Search
Feature Search vs. Conjunction Search:
o Feature search: Involves finding a target that differs from distractors based on one salient dimension (e.g., color, shape).
▪ Example: Searching for a red circle among green circles.
o Conjunction search: Involves searching for a target based on a combination of features (e.g., color and shape together).
▪ Example: Searching for a red circle among green circles and red squares.
Key Findings:
o Feature Search:
▪ Number of distractors has little to no effect on search time and accuracy.
▪ The target "pops out" due to the distinct single feature, leading to fast and accurate identification.
o Conjunction Search:
▪ Search time increases as the number of distractors increases.
▪ Performance suffers: More conjunction errors occur (mistakenly identifying distractors as targets).
▪ This type of search requires more effort and is processed more slowly because it involves comparing multiple features at once.
Mechanisms of Visual Search:
o Treisman proposed that different mechanisms are involved in these two types of search:
▪ Feature search relies on a parallel processing mechanism, where all items are processed simultaneously.
▪ Conjunction search relies on a serial processing mechanism, where items are examined one by one, making it more time-consuming and error-prone as the number of distractors increases.
Treisman on Object Representation and Visual Search
Association vs. Gestalt Theories:
o Association Theory: Suggests that objects are perceived by assembling individual parts to form the whole.
o Gestalt Theory: Proposes that we perceive the whole object first before recognizing its individual components.
o Example: The image provided can be interpreted as either a skull or two astronauts on the moon with Earth in the background.
▪ Your initial perception pops out instantly, and only after further inspection can you switch between interpretations.
Treisman's Visual Search Experiment:
o Feature Search:
▪ Involves searching for a single feature (e.g., a red T among green L’s).
▪ Fast search times that are unaffected by the number of distractors.
▪ Parallel processing occurs, where the target "pops out" quickly. A feature search in which something pops out, like this red apple, arises because you only need to find the red thing among the green things.
o Conjunction Search:
▪ Involves searching for a combination of features (e.g., a red T among green L’s, red L’s, and green T’s).
▪ Slower search times, and reaction times increase linearly as the number of distractors increases.
▪ Serial processing occurs, requiring more cognitive effort to distinguish between feature combinations.
Feature Integration Theory
Objects, which require us to bind different features to the one thing being represented, take longer and require attention.
In Treisman’s theory (feature integration theory), spatial locations are processed serially.
Also, attention “glues” co-located features together.
Even with practice, subjects weren’t able to turn conjunction searches into feature searches. In other words, practice was not sufficient to automate the search mechanism for conjunctive searches; they still needed to proceed in a serial fashion.
Illusory conjunctions
No cube in this image (it is just an illusion)
Illusory conjunctions suggest unattended information is not ‘lost’
What Treisman often found in conjunction searches were errors indicative of illusory conjunctions – reporting a red X when it had really been a blue X in the display.
Aside from being a measurable error, these illusory conjunctions suggest that unattended information is nevertheless processed and available for (erroneous) report!
The Binding Problem
Unattended Information and Illusory Conjunctions:
o Illusory conjunctions refer to the incorrect combination of features from multiple objects into one perception.
o The occurrence of illusory conjunctions suggests that unattended information is not completely ignored.
▪ Even non-target information is processed to some extent, though it may be inaccurately combined with attended information.
o Attenuation Theory (Treisman):
▪ Unattended information is weakened but not entirely lost. It is processed at a lower level than attended information.
The Binding Problem:
o The brain faces the challenge of binding independent features (color, shape, size, etc.) into cohesive objects.
o There are two types of binding problems:
▪ 1. Object Segregation: Distinguishing discrete objects from each other.
▪ 2. Feature Binding: Combining features (like color, shape) that belong to the same object.
▪ Treisman’s Solution: We serially search through space to locate features. Attention acts like "glue," binding together features of objects that share the same location.
Attention and Pop-Out Effect:
o The pop-out effect occurs when an object, such as a red apple among green apples, stands out immediately.
▪ This is because the spatial localization of the unique object is easy to process.
▪ Attention binds features, making the red apple easy to differentiate from the green apples.
Segregation: How do we represent objects as discrete from one another?
Combination: How are multiple features combined to make a single object?
Grandmother Cell Hypothesis:
o The Grandmother cell dilemma poses the question of how we recognize complex objects like Grandma if different features (color, texture, etc.) are processed by different regions of the brain.
o The hypothesis suggests that there could be a single neuron (or a small set of neurons) responsible for recognizing Grandma.
Issues with the Grandmother Cell Hypothesis:
o If there were a single Grandmother cell, damage to that neuron would cause the total loss of the ability to recognize Grandma.
o Extending this logic implies that each recognizable person or object (Grandpa, Professor Danckert, etc.) would need its own unique neuron.
Distributed Representation:
o Instead of relying on a single cell, recognition depends on distributed representations across multiple brain regions.
o Different aspects of an image (e.g., texture, color) are processed by distinct neural regions, but all contribute to recognition.
o Even if one part of the network is damaged, other regions can still support recognition.
o Ensemble encoding = synchronized firing across multiple regions
▪ It is thought that this is responsible for binding, but more work is needed to confirm this.
Resilience of the Brain:
o The brain's ability to still recognize Grandma despite potential damage to specific areas illustrates the importance of redundancy and networked processing.
o This system of distributed neural processing provides a safeguard against complete loss of recognition abilities due to damage to a single region.
Is Attention Just About Eye Movement
Treisman's Feature Integration Theory (FIT):
o According to Treisman’s theory, the spatial location of objects is selected first.
o Attention plays a key role in binding various features (e.g., color, shape) that are co-located in space to create a unified perception of the object.
o The theory raises the question of how we choose which location to attend to in complex environments.
Alfred Yarbus and Saccadic Eye Movements:
o Alfred Lukyanovich Yarbus, a pioneering Russian psychologist, contributed significantly to the understanding of saccadic eye movements in the 1960s.
o He collaborated with Alexander Luria, a prominent figure in 20th-century neuropsychology.
o Yarbus conducted experiments where participants viewed complex images (e.g., paintings) and their eye movements were tracked under different task conditions.
Findings from Yarbus’ Research:
o Under free viewing conditions, people tend to focus on similar parts of an image, suggesting that there are common elements or features that draw attention across individuals.
o His studies helped illustrate the ways in which our eyes select and prioritize visual information based on tasks or goals.
Relevance to Treisman’s FIT:
o Yarbus' work complements Treisman’s theory by addressing the pre-attentive selection of spatial locations.
o His findings suggest that visual attention may be directed toward certain salient features in an image, which then serve as the foundation for feature binding through attention.
These patterns of eye movements that Yarbus examined (saccades and fixations) are particularly striking for faces.
But, are eye movements all there is to attention?
o Not really. We’re still left with the question of which mechanisms determine where we look just as much as we need to ask which mechanisms determine where we attend.
o Furthermore, it’s certainly possible to send the eyes looking one way while attention is directed elsewhere: think of the Steve Nash look-away pass, or magicians (when you watch one hand too closely, you miss the trick).
▪ What both examples show is that normally, where we attend is tightly coupled to where we are looking. But it doesn’t have to be that way.
▪ This tells us something beyond the fact that the eyes and attention don’t ubiquitously go together.
▪ It also shows that we are often drawn to where other people are looking, which is referred to as joint attention. If you’re looking at something, it must be important so I’d better take a gander myself!
  ▪ This example also brings up an important distinction between overt attention – observable movements of the eyes – and covert attention: shifts of attention in the absence of observable movements of the eyes.
Spotlight Metaphor
Here the spotlight is perhaps the most powerful metaphor – that attention acts to shine a light on attended information and we can move that spotlight around in the world.
Hermann von Helmholtz
He was curious as to how much information we can represent in brief instants of time.
He explored this by placing a large array of stimuli (e.g., a grid of letters) in a darkened room, illuminating them briefly with an electric spark and relying on introspective reports – tell me what you saw?
It was clear that it was not possible to attend to all the available stimuli.
But he did note that it was possible to attend to subsets of information one was not directly looking at.
o So attention could be dissociated from where the eyes were looking – what we’ve referred to as covert attention.
o Wilhelm Wundt: “Separating the line of fixation from the line of attention.”
How Do We Move Attention Around
Metaphors have played a large role in how we have theorised about the control of attention throughout space. Here, the “Spotlight” metaphor is most prominent. Attention acts to enhance processing wherever it is directed (a contentious claim in itself) and can be moved from place to place in the manner of a flashlight or spotlight. But does the data stack up to the metaphor?
The Posner Task and Spatial Attention
1. Posner’s Task Overview:
o Developed by Michael Posner in the late 1970s/early 80s.
o A simple task to test the control of spatial attention.
o Participants are asked to keep their eyes on a central cross while a target appears in the periphery.
o Participants either detect the target (e.g., press a button) or perform a discrimination task (e.g., identify the target as a certain color or letter).
2.
Cue and Target Setup:
o Before the target appears, a cue is presented (e.g., an arrow, word, or peripheral event).
o The cue directs participants to covertly attend to one side of the display (left or right).
o The target then appears either in the cued location (a valid trial) or the opposite location (an invalid trial).
3. Reaction Time (RT) Differences:
o Reaction times (RTs) are significantly faster in valid trials compared to invalid trials.
o Valid trials are easier because attention is already focused on the correct location.
o In invalid trials, attention must be shifted to the new location, leading to a delay.
4. Posner’s Model of Attention Shifting:
o Three core components are required to redirect attention:
  1. Disengage attention from the current focus.
  2. Shift attention to the new location.
  3. Engage attention on the target.
o Each of these processes is dependent on distinct brain regions (to be discussed later).
5. Cue Predictability:
o The cue can be predictive or unpredictive:
  1. Predictive cues reliably indicate the location of the upcoming target on most trials.
  2. Unpredictive cues have equal numbers of valid and invalid trials.
o The benefit of the cue is stronger when it is predictive and can be trusted.
6. Legacy of the Posner Task:
o Posner’s task is widely recognized, with his name associated with the design.
o Like the Stroop Effect (named after J.R. Stroop) and the Sternberg Task (by Saul Sternberg), the Posner Task is one of the few tasks named after its creator due to its influential results.
Key Takeaways:
o Posner’s task is a classic experiment that measures how attention is shifted in space, with valid trials being faster than invalid ones.
o Attention shifting involves disengaging, shifting, and engaging, which rely on different brain regions.
o Predictive cues enhance the effectiveness of attention direction, while unpredictive cues offer no consistent advantage.
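A rough way to make the valid/invalid logic concrete is to build a trial list and compute the validity effect (mean invalid RT minus mean valid RT). The Python sketch below is only an illustration under stated assumptions: the function names make_trials and validity_effect are hypothetical, the 80% valid share for a predictive cue is an assumed value (the notes only say "most trials"), and the 300 ms and 350 ms reaction times are made up rather than taken from Posner's data.

```python
import random
from statistics import mean


def make_trials(n_trials, p_valid, seed=0):
    """Return shuffled (cue_side, target_side) pairs with a fixed share of valid trials."""
    rng = random.Random(seed)
    n_valid = round(n_trials * p_valid)
    trials = []
    for i in range(n_trials):
        cue = rng.choice(["left", "right"])
        if i < n_valid:
            target = cue  # valid trial: target appears at the cued side
        else:
            target = "right" if cue == "left" else "left"  # invalid trial
        trials.append((cue, target))
    rng.shuffle(trials)
    return trials


def validity_effect(trials, rts):
    """Mean RT on invalid trials minus mean RT on valid trials (in ms)."""
    valid = [rt for (cue, target), rt in zip(trials, rts) if cue == target]
    invalid = [rt for (cue, target), rt in zip(trials, rts) if cue != target]
    return mean(invalid) - mean(valid)


# Assumed proportions: 80% valid for a predictive cue, 50% for an unpredictive one.
predictive = make_trials(100, p_valid=0.8)
unpredictive = make_trials(100, p_valid=0.5)

# Made-up RTs: 300 ms on valid trials, 350 ms on invalid trials.
fake_rts = [300 if cue == target else 350 for cue, target in predictive]
print(validity_effect(predictive, fake_rts))  # 50 with these made-up RTs
```

A positive validity effect corresponds to the cueing benefit described above. With real data, the list of RTs would come from participants rather than from fixed values, and the effect would be compared between the predictive and unpredictive cue conditions.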
Inhibition of Return in Spatial Attention
Importance of Timing in Attention:
o Spatial attention does not stay focused on one location for too long.
o Vigilance is essential for survival (e.g., in the Serengeti), where continuously attending to one location (e.g., rustling leaves) can make you vulnerable to threats from other locations (like a predator creeping up).
o This highlights the need for flexibility in attention, ensuring one doesn’t stay fixated on a single spot for too long.
Inhibition of Return (IOR):
o In the Posner task, while reaction times (RTs) are faster for valid trials, this benefit diminishes after a cue has been present for more than half a second.
o After this point, attention shows a bias toward the uncued location.
o This phenomenon is known as Inhibition of Return (IOR), described by Posner and Robert D. Rafal.
o IOR prevents continuous attention on the same location, redirecting attention elsewhere after enough time has passed.
Evolutionary Context: Foraging and Spatial Attention:
o Foraging behavior in animals is linked to this shift in attention, requiring a balance between:
o Exploitation: Extracting all possible resources from the current location (e.g., all berries from a bush).
  ▪ Exploitation: Staying too long in one place can make animals vulnerable to predators and opportunity costs (missing better resources).
o Exploration: Searching for better resources elsewhere (e.g., finding a better bush).
  ▪ Exploration: Requires extra effort and may risk missing out on full exploitation of available resources.
o Inhibition of Return (IOR) may represent a mechanism in spatial attention that balances exploitation (focusing on the current location) with exploration (shifting attention to new locations).
IOR’s Role in Balancing Attention:
o IOR helps maintain vigilance by ensuring attention shifts away from previously attended locations.
o This mechanism enables individuals to keep an eye on their surroundings, preventing over-fixation and allowing for quick detection of new stimuli, which may be important for survival.
Key Takeaways:
o IOR shifts attention away from locations once enough time has been spent there, avoiding fixation.
o This process might be evolutionarily linked to foraging behavior, where animals need to balance the use of current resources and exploration for better ones.
o IOR helps maintain vigilance and ensures that attention can be redirected to new, potentially more important stimuli.
Early PET Studies
Stimuli don’t change
Attend to shape, colour or motion
Posner and Rafal’s Mechanisms of Attentional Orienting
Three Mechanisms for Orienting Attention:
o Disengage: Detaching attention from its current focus.
o Shift: Moving attention to a new location.
o Engage: Fixing attention on the new target.
Neural Structures Supporting Each Mechanism:
o Disengage:
  ▪ Critical for detaching attention from its current focus.
  ▪ Impaired in patients with inferior parietal damage, particularly when damage is to the right hemisphere.
o Shift:
  ▪ Responsible for relocating attention to a new position.
  ▪ Linked to the superior colliculus, a midbrain structure that controls saccadic eye movements (rapid eye movements between fixation points).
  ▪ Progressive Supranuclear Palsy (PSP) affects the superior colliculus, resulting in direction-specific issues with eye movements (up/down) and difficulty shifting attention in the same directions, even when eye movements aren’t required.
o Engage:
  ▪ Ensures attention is locked onto the new location.
  ▪ Dependent on the thalamus, which plays a vital role in attention engagement.
Key Observations:
o Damage to specific brain regions disrupts particular aspects of attentional control.
o The disengage function is seen as particularly crucial, as without it, shifting or engaging attention is hindered.
o Patients with PSP exhibit difficulties shifting attention in certain directions, aligning with the role of the superior colliculus in controlling saccades and attentional shifts.
Key Takeaways:
o Posner and Rafal’s model identifies three core attentional operations: disengaging, shifting, and engaging attention.
o These operations are supported by distinct brain structures: inferior parietal lobe, superior colliculus, and thalamus.
o Damage to any of these areas leads to specific deficits in attention orienting, with the disengage mechanism being particularly vulnerable following right inferior parietal damage.
Your ability to inhibit and initiate eye movements is reliant on the frontal eye fields, not just the parietal regions.
Cortical Control of Selective Attention
Right hemisphere important for redirecting attention to the left and right
BOLD signal increased in response to both valid and invalid responses
Dorsolateral Prefrontal Cortex and Premotor Theory of Attention
Dorsolateral Prefrontal Cortex (DLPFC):
o A cluster of frontal brain regions involved in attention.
o Contains frontal eye fields and supplementary eye fields (and supplementary motor cortex), critical for controlling saccadic eye movements (overt attention shifts to objects/events).
Premotor Theory of Attention (Rizzolatti):
o Proposes that shifts in visual attention serve the purpose of preparing for an action.
o Attention shifts are closely tied to motor planning.
o When attention shifts, the eyes typically follow (link between covert and overt attention).
o Costs of responding to uncued locations in Posner’s task can be explained by the time required to cancel one motor plan (e.g., looking left) and develop another (e.g., looking right).
o This theory links visual attention with motor action preparation.
Posner’s Task and Premotor Theory:
o Posner’s task represents an atypical scenario where attention and eye movements are separated.
o The task helped isolate the mechanisms of attention from overt eye movements, though Posner acknowledged that normally attention and eye movements coincide.
Anterior Cingulate Cortex (ACC):
o Located on the medial surface of the frontal cortex.
o Involved in a wide range of attentional behaviors, such as:
  ▪ Monitoring performance.
  ▪ Error correction.
  ▪ Recently linked to exploratory behaviors, aiding in the decision to switch between exploitation (using current resources) and exploration (seeking new resources).
Challenges in Defining Attention:
o Many brain regions contribute to attention, making it difficult to pinpoint a single area or mechanism responsible for attentional control.
o The question arises: Are the ways we talk about attention limiting our understanding?
o The complexity of attention mechanisms across various brain regions suggests that attention might be more distributed and nuanced than previously thought.
There is no single part of the brain that you can point to as the controller of attention; it involves a large-scale network.
Right TPJ as a circuit breaker
o Interrupts current focus for behaviourally relevant information
o Exists predominantly in the right hemisphere, not the left
Superior parietal regions orient attention contralaterally
Evidence from both fMRI and patients with neglect
Left picture: Common activations during attention
Right picture: Common cortical areas damaged in those with spatial neglect
The overlap between the two shows that the right hemisphere’s role in spatial attention is to determine what is relevant in your environment.
Historical Perspectives and the Complexity of Attention
Historical Attempts to Find the 'Seat of the Soul':
o Early thinkers like Aristotle, Descartes, and Da Vinci sought a single structure or organ (e.g., heart, pineal gland, ventricles) as the seat of the soul.
o These efforts reflect the mistaken search for a 'smoking gun' – a singular mechanism or structure responsible for controlling attention.
No Singular Mechanism or Structure for Attention:
o The assumption of a single brain region controlling attention is false.
o Attention is a distributed function, relying on a large-scale network of brain areas.
o There is no one specific brain structure that can definitively be labeled as the seat of attention.
Parallel Processing and Multiple Perception-to-Action Networks:
o Allan Allport suggests that parallel brain networks are involved in attention.
  ▪ Think of the dorsal/ventral distinction: the same visual information is processed for different reasons across parallel networks.
o There are multiple perception-to-action networks, each specific to different behaviors or effectors.
Spatial Selection and Attention:
o Early selection theorists (e.g., Treisman, and to some extent Posner) proposed that spatial selection has primacy in information processing.
o However, spatial processing is not “low level” or “simple.”
  ▪ It is a complex process that involves broad networks of brain regions.
o Spatial selection can occur in late stages of processing, influencing regions beyond early visual areas (e.g., frontal eye fields), right up until an action is chosen.
Problems of Attentional Theorizing
Allport says that much of our theorising about attention “…rests on the assumption… that attentional functions were all of one type… The penalty for such wishful thinking is to be condemned forever to appeal … to ill-defined causal mechanisms… – attentional resources, central processing system, (anterior) attentional system, central executive… – whose explanatory horsepower is nil.” (my emphasis).
In some sense then we have put the cart before the horse – developing our metaphors before collecting our data and trying to force the data to fit – admittedly intuitively appealing – metaphors.
False Dichotomies
A few main problems with the way we talk about attention
1) First, we have a predilection for binary thinking or for dichotomous theorising – things are either one way or another.
o For attention that has led to dichotomies like top-down vs. bottom-up; automatic vs. voluntary. This can lead us to design experiments to test the dichotomy, as opposed to examining the mechanisms we’re really interested in. And things are rarely, if ever, so clearly dichotomous.
Posner’s Dichotomy of Attention (Automatic vs. Voluntary)
Posner's Early Work on Spatial Attention:
o Dichotomy: Posner's research highlighted a distinction between automatic (exogenous) and voluntary (endogenous) attention.
