Lecture 2 (Perception I): Vision
This document is a lecture on the topic of perception. It discusses topics such as sensation, perception, early models of object perception, and artificial neural networks in pattern recognition. It goes into detail on how the visual processing of the human brain works.
Sensation vs. Perception

As humans, we are cognitive beings who
  Acquire information about the world around us
  Integrate that information with prior knowledge from our stored memory
  Store that knowledge in our memory so we can use it later to help us achieve our goals
First step in this process of acquiring knowledge about the world involves sensation and perception
  Sensation: process by which our sensory receptors and nervous system receive stimulus energies from the environment and transduce them into neural impulses
  Perception: process of interpreting and organizing sensory information through use of previous knowledge

Early Models of Object Perception

Template matching model: object perception involves a comparison of the stimulus with a set of templates or specific patterns stored in memory
  Problem: cannot account for complexity and flexibility of object recognition (e.g., individual differences in handwriting)
Feature-analysis model: discrimination of objects is based on a small number of characteristics of stimuli
  Are these two the same letter? G M vs. P R
    ➜ People are faster at deciding whether G and M are different than P and R
  Supported by neurological evidence: some neurons respond only to horizontal lines, others to diagonals, etc.
  Problem: cannot explain recognition of complex objects with features that move and distort (e.g., horse or kangaroo)
Recognition-by-components model: view that an object is represented as an arrangement of simple 3-D shapes called geons
  Cup/pail composed of cylinder and curved tube geons in a particular arrangement
Prototype model: object perception involves a comparison of the stimulus with an ideal, abstract example
  People are faster at identifying a sparrow as a bird than a penguin
  One of the most famous models in all of cognitive psychology
It has been hypothesized that our sensory systems act primarily as a selective filtering mechanism
  This filter sorts things according to a limited number of variables (e.g., warm, unpleasant, green) out of which we construct our world
  But prototype theory suggests that our minds can also perceive objects in a very different way...
Alternative modes of perception: mindfulness is largely about seeing the “suchness” of things, that is, seeing things directly without conceptual filters
  ➜ Our preconceived notions prevent us from seeing the real person in front of us

Artificial Neural Networks in Pattern Recognition

Human neurons
  Many different neurons connect to the dendrites of each neuron
  Some produce an excitatory effect; others produce an inhibitory effect
  There are also different levels of intensity of these effects
  If the activation of the neuron reaches a certain minimum threshold, the neuron will fire
Artificial neural networks (ANN)
  The nodes or neurons are organized into layers in much the same way that human neural networks are
  The weights attached to the connections between pairs of units in adjacent layers determine the overall behavior of the network
    This is similar to the way in which excitatory and inhibitory neurons of various strengths connect to a particular neuron in human neural networks
  The bias term indicates what the weighted sum needs to be before the node/neuron will activate
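The relationship between weights, bias, and firing described above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the lecture: the function name `neuron_fires`, the three example connections, and the specific weight and bias values are all assumptions chosen for the example. Positive weights play the role of excitatory connections, negative weights the role of inhibitory ones, and the bias sets how large the weighted sum must be before the unit activates.

```python
def neuron_fires(inputs, weights, bias):
    """Return (net_input, fired) for one artificial neuron (hypothetical helper)."""
    # Weighted sum of incoming activations, shifted by the bias.
    net_input = sum(x * w for x, w in zip(inputs, weights)) + bias
    # The unit is treated as "firing" once the shifted sum is above zero.
    return net_input, net_input > 0


# Assumed example: two excitatory connections and one inhibitory connection.
weights = [0.8, 0.6, -1.0]
bias = -1.0  # the weighted sum must exceed 1.0 before the neuron fires

weak = [1.0, 0.0, 0.0]       # only one excitatory input is active
strong = [1.0, 1.0, 0.0]     # both excitatory inputs are active
inhibited = [1.0, 1.0, 1.0]  # the inhibitory input is active as well

for name, pattern in [("weak", weak), ("strong", strong), ("inhibited", inhibited)]:
    net_input, fired = neuron_fires(pattern, weights, bias)
    print(name, "fires" if fired else "stays silent")
```

With these assumed values, only the pattern with both excitatory inputs on and the inhibitory input off pushes the weighted sum past the threshold set by the bias.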
    The bias is similar to the threshold necessary for activation of a neuron in human neural networks
Ex: How might a computer recognize a “9” using neural networks, given the huge variety of ways in which people write 9’s?
  To simplify things, we can represent the “9” using a grid of 28 x 28 pixels of varying shades of gray
First (input) layer of network:
  Starts with a bunch of neurons or nodes corresponding to an array of 28 x 28 pixels in the image
  Each node holds a number that represents the grayscale value of the corresponding pixel, ranging from 0 for black to 1 for white
    This is the neuron’s activation level
  Activations in one layer bring about activations in the next layer, which in turn bring about activations in the next layer...
    This is loosely analogous to how, in biological networks of neurons, some groups of neurons cause other neurons to fire
Second layer (or first “hidden layer”):
  Each neuron in the second layer might pick up on whether there is an edge in one particular region
  You assign a weight to each one of the connections between a particular neuron in the second layer and the neurons in the first layer
  Then you take all the activations from the first layer and compute their weighted sum according to the weights
    Could make the weights associated with almost all of the pixels 0 except for some positive weights in the target region
    To really pick up on whether there is an edge here, could also have some negative weights associated with the surrounding pixels
      ➜ Sum is largest when those middle pixels are bright but surrounding pixels are darker
  But maybe you don’t want the neuron to light up anytime the sum is bigger than zero; maybe you only want it to be active when the sum is bigger than, say, 10
    So you add in some other number (the bias), like -10, to the weighted sum
      ➜ The bias tells you how high the weighted sum needs to be before the neuron starts getting meaningfully active
  The connections between the other layers also have weights and biases associated with them
Third layer (or second “hidden layer”):
  When we recognize digits, we piece together various components
    Ex: A “9” has a loop near the top and a line on the right whereas an “8” has a loop on the top and one below
  Each neuron in the third layer corresponds to one of these subcomponents
    Ex: A particular neuron in the third layer might be activated by any generally loopy pattern toward the top
  These subcomponents are made up of the various edges from the second layer
Last (output) layer:
  Has 10 neurons, each representing one of the digits
  The activation in these neurons (some number between 0 and 1) represents how much the system thinks a given image corresponds with a given digit
Learning is about getting the computer to find a setting for all of the different weights and biases so that it will actually solve the problem at hand
  ➜ This is done through backpropagation

Learning in Neural Nets: Backpropagation

ANNs can compute any function that can be computed by a digital computer
However, it was not until the emergence of the backpropagation learning algorithm that it became possible to train multilayer neural networks
The strength or weight of the connections between neurons in adjacent layers varies: neural networks learn by modifying these weights
  Learning algorithms that are programmed into the ANN change the weights of the connections between pairs of neurons in adjacent layers in order to reduce the “mistakes” that the network makes
  The basic idea is that each hidden unit connected to an output unit bears a degree of “responsibility” for the error of that output unit
    If the activation level of an output unit is too low, then the weight between the output unit and each hidden unit connected to it is increased to decrease the error
  The network then assigns error levels to the next layer of hidden units, so the error is propagated back down through the network until the input layer is reached

Other Neural Networks Q&A

Q: How many neurons should there be in each hidden layer?
A: There are a number of empirically derived rules of thumb. Of these, the most commonly relied on is “the optimal size of the hidden layer is usually between the size of the input and size of the output layers”
Q: How many hidden layers are needed? Are more layers better?
A: No. Situations in which performance improves with additional hidden layers are very few. One hidden layer is sufficient most of the time.
Q: Why are more hidden layers not necessarily better?
A: Increasing the number of hidden layers much beyond the sufficient number will cause the network to overfit the training set
  It will learn the training data, but it won’t be able to generalize to new unseen data
  Underfitting: network is trying to fit a function to the data, but the function is not complex enough to correctly represent the data
  Optimum: this model has the appropriate complexity to accurately represent the data and to generalize, since it has learned the trend that this data follows (inverted parabolic shape)
  Overfitting: this model fits the data, but it overfits to it; it hasn’t learned the trend and thus is not able to generalize to new data

Top-down Processing in Object Recognition

Limitations of models of object perception discussed above: they assume that perception is always objective and accurate, but in real life, that is often not the case...
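Looking back at the backpropagation description in the neural network section above, the idea of assigning each hidden unit a share of “responsibility” for the output error can be made concrete with a very small sketch. Everything specific here is an assumption made for illustration and does not come from the lecture: the 3-pixel “image”, the two hidden units, the sigmoid activation, the squared-error measure, the learning rate, and the names `train`, `err_out`, and `err_hid`. The toy task is to activate the output only when the middle pixel is bright and both neighbours are dark.

```python
import math
import random


def sigmoid(z):
    """Squash a weighted sum into an activation between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train(data, n_hidden=2, epochs=3000, lr=0.5, seed=0):
    """Train a tiny input -> hidden -> output network; return (initial, final) error."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w_hid = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b_hid = [0.0] * n_hidden
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b_out = 0.0

    def forward(x):
        # Activations in one layer bring about activations in the next.
        hidden = [sigmoid(sum(w * xi for w, xi in zip(w_hid[j], x)) + b_hid[j])
                  for j in range(n_hidden)]
        output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
        return hidden, output

    def mean_squared_error():
        return sum((forward(x)[1] - t) ** 2 for x, t in data) / len(data)

    initial = mean_squared_error()
    for _ in range(epochs):
        for x, target in data:
            hidden, output = forward(x)
            # Error signal at the output unit (negative when the output is too low).
            err_out = (output - target) * output * (1 - output)
            # Each hidden unit's share of responsibility, via its outgoing weight.
            err_hid = [err_out * w_out[j] * hidden[j] * (1 - hidden[j])
                       for j in range(n_hidden)]
            for j in range(n_hidden):
                # Output too low -> err_out < 0 -> this weight is increased.
                w_out[j] -= lr * err_out * hidden[j]
                b_hid[j] -= lr * err_hid[j]
                for i in range(n_in):
                    w_hid[j][i] -= lr * err_hid[j] * x[i]
            b_out -= lr * err_out
    return initial, mean_squared_error()


# All eight 3-pixel patterns (0 = black, 1 = white); target is 1 only for dark-bright-dark.
patterns = [(a, b, c) for a in (0.0, 1.0) for b in (0.0, 1.0) for c in (0.0, 1.0)]
data = [(p, 1.0 if p == (0.0, 1.0, 0.0) else 0.0) for p in patterns]

initial_loss, final_loss = train(data)
print("error before training:", round(initial_loss, 4))
print("error after training:", round(final_loss, 4))
```

The hidden-unit error terms are computed before the output weights are updated, so each hidden unit is blamed according to the weights that actually produced the mistake.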
  What we perceive, and the way we perceive it, is not always what would be predicted by these models
  Our concepts, expectations, and beliefs play a much bigger role in perception than we usually realize
💡 Perception engages both top-down and bottom-up processing
Bottom-up processing: analysis of information coming from stimuli through sensory receptors
  Object perception as combination of stimulus information from sensory receptors
  Emphasizes the importance of information coming from the outside world
Top-down processing: information processing guided by higher-level processes, such as our beliefs, expectations, and memories
  Our knowledge and beliefs about the world inform our perceptions
  Emphasizes the importance of information coming from our minds
“Objective reality” is often not as objective as we think...
  Our perception of an object may change even though the image on the retina does not change
  Reversible figures (e.g., Necker cube; vase/profiles); ambiguous figures (e.g., old woman/young woman)

Effects of Prior Experience on Perception

Children who have been physically abused are significantly more likely to misperceive a fearful face as angry (Pollak)
Cultural effects: culture affects what is perceived
Emotional effect: when angry, people more often perceive neutral objects as guns
Self-fulfilling prophecies: people generally think that it is our experiences and perceptions that create our beliefs, but often, it is actually our beliefs that create our experiences and perceptions
Our beliefs and expectations influence others’ behavior
  The Pygmalion effect: study found that students who were (randomly) labeled intellectual “spurters” showed significantly greater gains in IQ and academic performance after 8 months than controls
    Follow-up: if teacher believed that girls learn to read faster than boys, they did
  Children who were told they were neat and tidy became more neat and tidy than those who were told they should be neat and tidy
    Follow-up: children who were told that they are good at math showed greater improvements in math scores than those who were told that they should try to become good at math
  Those who over-idealize romantic partners as having many virtues and few faults tend to have happier and longer-lasting relationships
    Moreover, the partners who are over-idealized tend to develop those traits over time!
Our beliefs and expectations influence our own behavior
  Study by Mark Snyder found that when a man was led to believe that a woman found him attractive, she was more likely to act as if she did

Perceptual Constancies

Perceptual constancy: perceiving objects as unchanging (having consistent lightness, color, shape, and size) even as illumination and retinal images change
  Many visual illusions result from the overuse of strategies employed to achieve perceptual constancy
Shape constancy: we perceive the form of familiar objects as constant even while our retinal images of them change
  A door casts an increasingly trapezoidal image on our retinas as it opens, yet we still perceive it as rectangular
Lightness constancy: we perceive an object as having a constant color, even if changing illumination alters the wavelengths reflected by the object
  ➜ Illusion results from visual system’s attempt to maintain lightness constancy
Müller-Lyer illusion: Is line AB or line BC longer?
  ➜ Size-distance constancy: our brains are used to perceiving angles as corners that are near or far away and see the inward-facing corners as more distant and therefore smaller
Moon illusion: Does the moon appear larger near the horizon or when it is high in the sky?
  ➜ When the moon is near the horizon we perceive it to be farther away from us than when it is high in the sky, but since the moon is actually the same size, our minds make it look bigger when it is near the horizon to compensate for the increased distance

The Magical Kingdom of Salt

In the Salar de Uyuni of Bolivia, the world’s largest salt flat, with no other objects in sight, the human eye loses its ability to establish a proper sense of depth. The result is some bizarre pictures.

Effects of Color in Marketing

Assume that you are considering buying condoms. You enter a store and notice that the store doesn’t carry all the brands you may be familiar with, so you’re going to have to make your choice based on the product package alone
You are really interested in finding a brand that is considered
  Durable, strong, and well built (“rugged” condition) OR
  Classy, attractive, and refined (“sophisticated” condition)
Which would you select?
  Purple hue, low saturation, high value: “Sophisticated”
  Red hue, high saturation, low value: “Rugged”

Face Recognition

Face recognition is “special”
  Single-cell recordings of monkeys show activation of particular cells in the lower temporal lobe only when full-face photos of other monkeys are presented
Recognition accuracy for faces and houses: parts vs. whole
  Study in which participants were shown a series of faces with the person’s name and a series of houses with the owner’s name
  Later, on a recognition test, they showed greater recall of
    Parts of houses
    Whole faces
  Do people tend to perceive men or women more in “parts”?
  ➜ Women
Prosopagnosia: failure to recognize particular people by the sight of their faces
  After stroke, sheep rancher could not recognize people but could recognize sheep
Note: the eyes also play a special role in perception
  70-90% of famous portrait paintings sampled from the last five centuries have an eye at or within 5% of the painting’s exact centerline

Modular Processing

Visual illusions suggest that the mind is at least in part modular
  That is, it is not solely organized in terms of faculties, such as memory and attention, that can process any type of information
  Rather, there are specialized information-processing modules that
    Respond automatically
    Cannot be “switched off”
Modular processes are usually characterized by:
  Fixed neural architecture
    It is sometimes possible to identify determinate regions of the brain that are associated with particular types of modular processing
    Ex: fusiform face area for face recognition
  Specific breakdown patterns
    Modules can fail in highly determinate ways, which provide clues on the form and structure of processing
    Ex: prosopagnosia

Other Neurological Disorders Related to Visual Perception

1) Visual agnosia: inability to recognize/identify visual objects despite relatively good visual perception
  Usually due to damage in occipital or temporal lobes
  “Dr. P” in Oliver Sacks’ The Man Who Mistook His Wife for a Hat
  Man with agnosia puzzling over a picture of a cow suddenly found himself making alternating up-and-down movements with his fists.
  He looked down at his hands and said, “Oh, a cow!”
2) Visual neglect syndrome or unilateral spatial neglect: tendency to ignore, or to be unaware of, information on one half of the visual field, usually the left side
  Typically occurs after damage (e.g., stroke) to right hemisphere, particularly damage to the parietal and frontal lobes
  Relatively common

Rare syndromes

Capgras syndrome: characterized by belief that family and/or friends are imposters
  Damage to pathway between visual cortex and amygdala, which regulates emotions
  Emotional “glow” that we normally feel around people we are close to is missing
  Ramachandran argues that this emotional “glow” is, to a large extent, what gives us a sense of continuity in our relationships
Functional blindness (conversion disorder): unexplained vision loss with no organic basis
  Cambodian women who had witnessed horrible war atrocities became either partially or wholly blind
Blindsight: vision without awareness
  Blindness resulting from damage to visual cortex
  When presented with various shapes like circles and squares, or photos of faces of men and women, patient could not tell (or guess) what his eyes were gazing at
  However, when shown pictures of people with angry or happy faces, he was able to guess the emotions expressed, at a rate far better than chance
  Patients are also able to correctly “guess” the identity or location of particular objects
  ★ Patients report that they get a “gut” feeling that allows them to perform these tasks
  ☞ A second pathway of visual perception may account for this phenomenon

Two Pathways of Visual Perception

Study looked at the speed with which people were able to find a specific hidden object among a group of similar objects
  Participants were instructed to
    1) Passively allow the target to just “pop” into their minds OR
    2) Actively direct their attention to the target
  ➜ Participants in Group 1 outperformed those in Group 2
  Task: look for the circle with just one gap, and say whether the gap is on the left or the right (once using the “relax” strategy, once using the active search strategy)
Proposed explanation: participants who were basically told to relax and go with their gut instinct used a secondary pathway of visual perception that
  Does not go through the visual cortex
  Instead simply makes a very short loop through the limbic system: the emotional, instinctual center of the brain
Research evidence for existence of two pathways:
  Auditory cortex of rats was destroyed, then rats were exposed to a tone paired with an electric shock
    ➜ Rats quickly learned to fear the tone, though they could not “hear” it!
  Explanation: the sound took the direct route from ear to thalamus to amygdala, bypassing all higher avenues

Development of Perception

There is a critical period for normal sensory and perceptual development
  Kittens reared in a cylinder with only vertical black and white stripes later had difficulty perceiving horizontal bars
    ➜ Kitten would play with rod only when it was held upright
  Adults who were born blind and later gained vision through newly developed surgical interventions (e.g., cataract surgery) usually have some difficulty recognizing objects
    At age 3, Mike May lost his vision in an explosion. Decades later, a new cornea restored vision to his right eye. Unfortunately, although signals were now reaching his visual cortex, it lacked the experience to interpret them
    May could not recognize expressions or faces, apart from features such as hair
    Yet he can see an object in motion