Summary

This document discusses visual perception and imagery, including models like Marr's tri-level hypothesis. It explores the hierarchical organization of visual processing and the neural implementation of cognitive tasks in the visual system.

Full Transcript


Perception I: Visual Perception & Imagery

I. Turn to the brain
   A. Two pathways of visual processing
   B. Marr’s tri-level hypothesis
II. Hierarchical organization of visual processing
   A. Marr’s model of visual processing
   B. Human visual system
      1. Primary visual cortex and simple cells
      2. Extrastriate cortex
         a. Depth perception
         b. Color coding
      3. What and where pathways
   C. Blindsight
III. How are mental representations stored in the brain?
   A. Analog code
   B. Propositional code

The Turn to the Brain

- As mentioned previously, early models of visual perception focused on top-down analysis and included relatively little discussion of neural implementation
- It was only in the 1980s, with the emergence of functional neuroimaging, that cognitive scientists started studying what goes on in the brain when people are actually performing different types of cognitive tasks
  - An early move in this direction came with Ungerleider and Mishkin’s two visual systems hypothesis (1982)

Two Pathways of Visual Processing

The two visual systems hypothesis:
- Formed from experiments on monkeys in which different parts of the brain were selectively removed
- Holds that there are two pathways of visual processing:
  - Dorsal stream: a system of interconnected regions of the visual cortex involved in the perception of spatial location, beginning with the striate cortex and ending with the posterior parietal cortex
  - Ventral stream: a system of interconnected regions of the visual cortex involved in the perception of form, beginning with the striate cortex and ending with the inferior temporal cortex

Tri-Level Hypothesis

In addition, David Marr’s tri-level hypothesis of information processing emphasized the importance of understanding how cognitive processes are implemented in the brain. The hypothesis holds that mental or artificial information-processing events can be evaluated on three different levels:
- Computational level: the highest, most abstract level
  - What does the problem entail, i.e., what output is the system trying to produce? What is the purpose or reason for the process?
  - Ex: The aim in visual processing is object recognition
- Algorithmic level: the programming level
  - What information-processing steps are being used to solve the problem?
  - Requires a formal procedure that specifies how the data is to be transformed, what the steps are, and what the order of the steps is
- Implementational level: the lowest level
  - Where is the hardware that is being used? How can the representations and algorithms be realized physically?
  - Ex: computer hardware, or the human brain and neurons

Marr’s Model of Visual Processing

- Marr developed a theory of visual processing that was built on a hierarchy of different levels for studying cognition
  - Drew on psychology, mathematics, neuroscience, and the clinical study of brain-damaged patients
- The system has to take a complex pattern of unstructured stimuli in the visual field and interpret them into representations that can then serve as input to more complex cognitive functions, such as object recognition

First stage:
- The image projected onto the retina is analyzed in terms of the intensity of areas of light and dark
- Adjacent regions of sharp contrast (of light and dark) indicate the presence of edges and contours
- The edges and contours determine the basic features of the object, including line segments and circular shapes
- The result is a raw primal sketch of the image

Second stage:
- Features in the raw primal sketch that are similar in size and orientation get grouped
- The groups of features are then processed again to produce a representation of the object that includes its surfaces and layout: a 2.5-D sketch
- For every point in the field of view, the 2.5-D sketch represents how far it is from the observer; it is viewer-centered
- The problem, though, is that we need to be able to establish object constancy, that is, to recognize that an object is the same object even though the image projected on our retina may change because either the object or the viewer is moving

Third stage:
- The image is then transformed into a 3-D sketch in which the axes of symmetry and elongation link the object parts
  - Symmetry axis: line that divides an object into mirror-image halves
  - Elongation axis: line defining the direction along which the main bulk or mass of a shape is distributed
- The 3-D sketch is object-centered: the object’s parts are described relative to one another and are linked on the basis of shared properties and axes
➜ This solves the object constancy problem, allowing recognition of an object presented in different orientations and under different conditions, e.g., lighting changes
- Remember the Samoyed-white wolf selectivity/invariance problem?

Hierarchical Organization of Human Visual System

- Marr’s own work on vision contains relatively little discussion of neural implementation
- However, subsequent research on the mammalian visual system indicates that information in the visual cortex is in fact processed hierarchically
- Information flows through a progression of different areas, each of which generates a representation of increasing complexity

Human visual system

- First, input from the retina is conveyed via the optic nerve through the optic chiasm; most fibers continue to the lateral geniculate nucleus (LGN) of the thalamus, while a smaller projection goes to the superior colliculus of the brainstem
- The LGN projects to area V1, the primary visual cortex, which maps onto the striate cortex, an anatomically distinct region of the brain
- Area V1 is where information processing proper begins
  - V1 is retinotopically organized, i.e., neighboring regions of the visual field are represented by neighboring regions of V1
  - Neurons in V1 (feature detectors) are sensitive to low-level features of the visual field, such as orientation and direction of movement
    - Ex: There are simple cells that respond only to the presence of line segments with a particular orientation
  - This area also filters information to accentuate edges and contours
- Area V1 projects to area V2
- Neurons in V2 process some of the same features as V1, along with more complex features, such as complexes of edges, shape, and depth
  - Depth is principally determined by retinal disparity: the fact that points on objects located at different distances from the observer will fall on slightly different locations on the two retinas
    - The closer the object, the larger the disparity
    - This provides the basis for stereopsis, or depth perception
- In general, the extrastriate cortex (the region surrounding the striate cortex) processes additional features of visual information, such as movement, spatial frequency, retinal disparity, and color

Coding of Color

Trichromatic (three-color) theory: the retina has three types of color receptors (cones), each especially sensitive to one of three colors: red, green, and blue
- Most color-blind people are not truly color blind; they simply lack functioning red- or green-sensitive cones
  - They see the world in shades of yellow and blue
  - Both red and green look yellowish to them
  - Visual acuity is normal
- 2015 Buffalo Bills vs. New York Jets game: “For the 8% of Americans like me that are red-green colorblind, this game is a nightmare to watch,” tweeted one fan

Opponent-process theory: color-sensitive receptor cells respond in an opposing center-surround fashion to pairs of primary colors
- They are excited by light in the center and inhibited by light in the surround
- They are also excited by light of a particular color but inhibited by light of the opposing color
  - Ex: Some neurons are turned “on” by red but turned “off” by green
➜ This results in the color afterimage phenomenon
  - Ex: the Green Dot Illusion (video)

So which theory is correct? Both.
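As a rough illustration of how the two accounts can fit together, the sketch below feeds three cone responses into two opponent channels. The names (`ConeResponse`, `OpponentSignal`, `opponent_stage`) and the channel weights are assumptions made for this example; they are not taken from the lecture and are not physiological measurements.

```python
from typing import NamedTuple


class ConeResponse(NamedTuple):
    """Stage 1 (trichromatic): activity of the three cone types, each in [0, 1]."""
    red: float
    green: float
    blue: float


class OpponentSignal(NamedTuple):
    """Stage 2 (opponent-process): signed outputs of two opponent channels."""
    red_green: float    # > 0 signals red, < 0 signals green
    blue_yellow: float  # > 0 signals blue, < 0 signals yellow


def opponent_stage(cones: ConeResponse) -> OpponentSignal:
    """Combine cone responses into opponent signals (illustrative weights only)."""
    # A red-green cell is excited by red-cone activity and inhibited by green.
    red_green = cones.red - cones.green
    # Assumption: "yellow" is modeled as the average of red and green cone activity.
    blue_yellow = cones.blue - (cones.red + cones.green) / 2
    return OpponentSignal(red_green, blue_yellow)


if __name__ == "__main__":
    stimuli = {
        "red light": ConeResponse(1.0, 0.0, 0.0),
        "green light": ConeResponse(0.0, 1.0, 0.0),
        "yellow light": ConeResponse(1.0, 1.0, 0.0),
        "blue light": ConeResponse(0.0, 0.0, 1.0),
    }
    for name, cones in stimuli.items():
        print(name, opponent_stage(cones))
```

In this toy model, red and green stimuli push the same channel in opposite directions, which is the "on by red, off by green" behavior described for opponent cells.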
Color processing occurs in two stages:
- The retina’s red, green, and blue cones respond in varying degrees to different color stimuli
- The cones’ responses are then processed by opponent-process cells

Effects of Color in Marketing

- Assume that you are considering buying condoms
- You enter a store and notice that the store doesn’t carry all the brands you may be familiar with, so you’re going to have to make your choice based on the product package alone
- You are really interested in finding a brand that is considered:
  - Durable, strong, and well built (“rugged” condition), OR
  - Classy, attractive, and refined (“sophisticated” condition)
- Which would you select?
  - Purple hue, low saturation, high value
  - Red hue, high saturation, low value
- Conditions: “Rugged”, “Sophisticated”

Match the following qualities with the associated color(s) (Labrecque & Milne, 2012):
- Sincerity
- Excitement
- Competence
- Sophistication
- Ruggedness

Answers:
- Sincerity: white, yellow*
- Excitement: red, orange*
- Competence: blue
- Sophistication: black, pink, purple
- Ruggedness: brown
* Marginally significant

Human visual system (continued)

- From V2, information follows the ventral (“what”) or dorsal (“where”) pathway
- For the ventral pathway:
  - Information goes from V2 to V4, then to the inferior temporal cortex (ITC)
  - The ITC includes specialized areas for face recognition (fusiform face area) and identification of the human body and body parts (fusiform body area)

Similarities between Human Visual System & Neural Networks

- Information processing in the visual cortex is hierarchically organized, as are neural networks
- In addition, some parts of the visual system are retinotopically organized, as are convolutional layers in convolutional neural networks

Blindsight

- Blindsight: a rare neurological condition where people who are blind in one or both visual fields due to damage to their visual cortex can nonetheless “guess” significantly above chance:
  - The identity or location of particular objects
  - The particular emotions expressed by a face in a photo in front of them
- Participants were asked to “post” a card into a slot (Milner & Goodale, 1998)
- A blindsight patient was able to meander around all the clutter in a hallway that he was told was empty (Weiskrantz)
- Figure: Patient’s responses to pictures of animals presented in his blind field. Correct answers are underlined. (Trevethan, Sahraie, & Weiskrantz, 2007)
- Proposed explanation for blindsight: There is a second pathway of visual perception that:
  - Does not go through the visual cortex
  - Instead simply makes a very short loop through the limbic system: from the superior colliculus (visual processing center in the brainstem) directly to the emotional/instinctual centers of the brain
➜ Proposed mechanism for “intuition”

How Are Mental Representations Stored?

The mental imagery debate: Is information stored
- In analog code (i.e., as a pictorial representation), OR
- As a propositional code (i.e., descriptive)?

Experiments on mental imagery by Roger Shepard and Jacqueline Metzler in the early 1970s spawned the imagery debate
- They suggested that some types of cognitive information processing involve forms of representation that are very different from how information is represented in, and manipulated by, a digital computer
- Imagery and rotation studies:
  - Rotate each object on the left to see if it matches the object on the right:
➜ The amount of time it takes to rotate a mental image depends on the extent of the rotation
- One feature of digitally encoded information is that the length of time it takes to process a piece of information is typically a function only of the quantity of information (the number of bits that are required to encode it)
  - The particular information that is encoded ought not to matter
- But what the mental rotation experiments show is that there are information-processing tasks that take varying amounts of time even though the quantity of information remains the same
➜ This suggests that mental rotation tasks tap into ways of encoding information that are very different from how information is encoded in a digital computer
  - More specifically, the information may be encoded in pictorial form, similar to the way a map represents a geographical region (analog code), rather than as a description (propositional code)

Stephen Kosslyn found a similar effect in a different type of study
- Participants were asked to focus on a point in a picture and then to answer questions about other parts of the picture
- Ex: Focus on the tail of the plane
  1) Does the plane have a propeller?
  2) Is there a pilot in the cockpit?
➜ Participants took longer to answer Question #1 than #2: the length of time it took to answer varied according to the distance of the parts from the original point of focus

So which viewpoint is correct: analog or propositional?

Evidence in support of analog code (pictorial representation)

Imagery and size:
- Condition #1: Imagine a rabbit standing next to an elephant
- Condition #2: Imagine a rabbit standing next to a fly
- “Does the rabbit have two front paws?”
➜ People make faster judgments about the characteristics of large mental images than of small mental images; also, they take longer to travel a large mental distance, whether that’s visual or auditory

Imagery and interference:
- Create a clear mental image of a friend’s face
- Keeping that image in mind, simultaneously let your eyes wander over the scene in front of you
➜ Visual imagery may interfere with visual perception, and motor imagery with motor movement

Imagery and neuroimaging research:
- The primary visual cortex is activated when people work on tasks that require visual imagery
- Visual imagery activates about 70-90% of the same brain regions that are activated during visual perception
  - Similar findings have been reported for auditory and motor imagery
- People with prosopagnosia cannot create a mental image of a face

Evidence in support of propositional code (descriptive representation)

Imagery and parts of figures:
- Look at the figure below, and form a clear mental image of the figure:
- Without glancing back at the figure in the previous slide, consult your mental image. Does that mental image contain a parallelogram?
➜ People have difficulty identifying that a part belongs to a whole if they have not included the part in their original verbal description of the whole

Imagery and ambiguous figures:
- Create a clear mental image of the figure below:
- Write down what the figure in the previous slide depicted. Then give a second, different interpretation of the figure you saw
➜ Some ambiguous figures are difficult to reinterpret in a mental image

So which viewpoint is correct: analog or propositional?
☞ The majority of research supports the analog viewpoint, but some people on some tasks use a propositional code
- Also, some people tend to be visualizers and others tend to be verbalizers, and research has found neural correlates of these different cognitive styles, but more on that in the next lecture…

Video Reference

Video excerpted from: Green Dot illusion https://www.youtube.com/watch?v=gur-_IGV7F8
