Lecture 2 (Object & Scene Perception) v2.pdf

Full Transcript

3/30/23

But First…
- Before we start lecture 2, let's quickly discuss the answers to the questions that I posted at the end of lecture 1.
- These were:
  - What is the binding problem?
  - What is an illusory conjunction?
  - Why does Feature Integration Theory (FIT) predict conjunction searches to be slow?

What is the Binding Problem?
- Different aspects of a stimulus are processed independently, often in separate brain areas.
- For example, motion is processed by the dorsal stream and form is processed by the ventral stream.
- The issue of how an object's individual features are combined (i.e. bound) to create a coherent percept is known as the binding problem.
[Figure labels: Dorsal Stream, Ventral Stream]

What is an Illusory Conjunction?
- A prediction of FIT is that if attention is inhibited, features from different objects will be incorrectly bound together.
- Treisman & Schmidt (1982) showed that such illusory conjunctions occur.
- They presented character strings very briefly (95-168 ms) followed by a noise mask.
- The primary task was to report the two numbers.
- Then O's (i.e. observers) were asked to report the coloured letters.
- O's often associated a colour with the wrong letter.
- Such incorrect bindings are known as illusory conjunctions.

Why Are Conjunction Searches Predicted To Be Slow?
- Some forms of visual search require binding to occur.
- For example, binding is required if the target contains the same features as the distractors.
- If the target differs from the distractors only by its particular conjunction of features, then that is a conjunction search.
- FIT predicts that in conjunction searches attention needs to be applied to each object in turn (i.e. one at a time) to determine whether or not the attended object is the target.
- Thus, these searches are predicted to be very slow.
[Figure label: Target features: red, horizontal]
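The one-at-a-time prediction for conjunction searches can be illustrated with a small simulation. This is only a sketch under a simplifying assumption that is not stated in the slides: attention visits the items in random order and stops as soon as it reaches the target (a serial, self-terminating search). The function names `items_inspected` and `mean_items_inspected` are hypothetical, chosen for illustration.

```python
import random


def items_inspected(set_size, rng):
    """Simulate one target-present trial of a serial, self-terminating search.

    Assumption (not from the lecture): items are attended one at a time in
    random order, and the search stops once the target has been attended.
    Returns how many items were attended, including the target.
    """
    target_position = rng.randrange(set_size)
    visit_order = list(range(set_size))
    rng.shuffle(visit_order)
    for count, position in enumerate(visit_order, start=1):
        if position == target_position:
            return count


def mean_items_inspected(set_size, trials=10000, seed=0):
    """Average number of items attended over many simulated trials."""
    rng = random.Random(seed)
    total = sum(items_inspected(set_size, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for n in (4, 8, 16, 32):
        print(n, round(mean_items_inspected(n), 2))
```

Under this assumption the average number of attended items grows with the number of items in the display (roughly half the display on target-present trials), which is one way of seeing why such searches are predicted to be slow.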
[Figure label, conjunction search slide: Distractor features: red, green; horizontal, vertical]

Overview
- In this lecture, I am going to discuss how we perceive objects and scenes.
- I am going to cover the following topics:
  - The problem of object and scene perception
  - Competing solutions
  - Principles of grouping
  - Principles of segregation
  - Gist perception

The Problem
- Perception seems effortless but it is much harder than it seems.
- One way to appreciate the difficulties in perceiving objects and scenes is to try to get a computer to do it.
- It turns out that computers are worse at recognising objects than humans…
- …and fail in very unpredictable ways.

State Of The Art
- Currently, the state-of-the-art computer object recognition systems use artificial neural networks.
- Athalye et al. (2018) investigated what sort of images these object recognition systems would misclassify.
- Based on what they discovered, they then designed images that would fool these systems.

What Is This?
- Amazingly, TensorFlow's InceptionV3 classifier thought that this was an image of a rifle!
- Seemingly bizarre misclassifications such as this are unsettling and fairly common.
- In fact, you don't have to use specially-generated images to fool an image classifier.
- Misclassifications commonly occur with natural images if they are presented at unexpected orientations (Alcon, 2019).

State Of The Art
- In the previous example, common objects presented at unusual angles were often misclassified.
- This shows how hard it is to build an effective image classifier…
- …and demonstrates that scene and object perception is quite difficult!
[Figure credit: Alcon (2019)]

Take Home Message
- Object perception is very hard.
- Our best computer algorithms are still quite bad at it.
- What makes the task so hard?
- A number of factors, but the three most important ones are:
  - The stimulus on the retina is ambiguous
  - Objects can be hidden or blurred
  - Objects look different from different viewpoints and in different poses

Difficulty 1: The Stimulus On the Retina is Ambiguous
[Figure labels: lines a, b, c, d]
- All these lines form the same retinal image.
- Thus, this 1D retinal image is ambiguous.
- Similarly, 2D retinal images are also ambiguous in that multiple stimuli can give rise to the same 2D retinal image.

Difficulty 2: Objects Can Be Partially Occluded or Blurred
[Figure label: My glasses]
- In the above photo, can you see my glasses that are partially occluded by the book?
- Most likely a machine would have difficulty recognising my glasses because they are partially occluded.

Difficulty 3: Objects look different in different poses and from different viewpoints
- Machines find it hard to recognise objects when they appear in unexpected poses or are viewed from unexpected angles.

How Do Humans Succeed?
- How do humans solve these problems and successfully perceive objects and scenes?
- Although a complete explanation of this is beyond the scope of this lecture, we can make some progress towards this goal.
- We start by discussing two competing schools of thought:
  - Structuralism
  - Gestaltism

Structuralism
- Structuralism was proposed by Edward Titchener, based on his studies under Wilhelm Wundt.
- Structuralism distinguishes between sensations and perceptions:
  - Sensations: elementary processes that occur in response to stimulation
  - Perceptions: conscious awareness of objects and scenes
- Structuralism claims that sensations combine to form perceptions.
- In other words, according to Structuralism, conscious awareness is the sum of these elementary sensations…
- …and contains nothing that was not already present in these elementary sensations.

Gestaltism
- Gestaltism directly contradicts Structuralism.
- The Gestaltists claim that conscious awareness is more than the sum of the elementary sensations.
- In other words, conscious awareness can have characteristics not present in any of the elementary sensations.
- What evidence is there for this claim?

Evidence for Gestaltism
- There are two main pieces of evidence that support the claim that conscious awareness can be more than the sum of the elementary sensations.
- These two pieces of evidence are:
  - Apparent motion
  - Illusory contours

Apparent Motion
- In apparent motion an observer sees two stationary dots flashed in succession.
- Although each of the dots is stationary, the observer perceives motion.
- The physical stimulus itself is not moving.
- In other words, the conscious awareness has a character (i.e. motion) not present in the elementary sensations (because they were both stationary).
- The conscious percept of motion was constructed and was not present in the elementary sensations.

Illusory Contours
- Illusory contours are a second example of where the conscious awareness has a characteristic not present in the elementary sensations.
- Illusory contours are seen in locations where there are no physical contours.
- The conscious awareness of the illusory contour is constructed: there is no physical contour at these locations.

Take Home Message
- There is plenty of evidence that conscious awareness is constructed and can contain characteristics not physically present in the image.
- For example:
  - motion can be perceived when there is no motion in the image (e.g. apparent motion)
  - contours can be seen when there are no contours in the image (e.g. illusory contours)
- This evidence argues against Structuralism but in favour of Gestaltism.
- For the rest of this lecture, we will therefore confine our attention to Gestaltism.

Gestalt Principles of Grouping
- According to Gestaltism, humans are able to perceive objects and scenes because of perceptual organisation.
- In other words, humans are able to make sense of a visual image because they can perceptually organise it into the constituent objects.
- How do they do this?

Grouping and Segregation
- Perceptual organisation is achieved by the processes of grouping and segregation.
- Grouping is the process by which parts of an image are perceptually bound together to form a perceptual whole (e.g. the perception of an object).
- Segregation is the process by which parts of a scene are perceptually separated to form separate wholes (e.g. the perception of separate objects).
- Together, grouping and segregation allow a scene to be perceptually organised into its constituent objects, thereby allowing observers to make sense of the scene.

For Example
- The next slide contains a scene containing two objects, with each object containing two components.
- For each object, the two components are grouped together to form a single perceptual object.
- The two objects are segregated to form two separate objects.
- Thus, both grouping and segregation are needed to make sense of the scene.

A Simple Example
[Figure: the two-object scene described above]

A More Natural Example
[Figure annotations: "All this grouped together to form a single object" (on each of the two objects); "Additionally, the two objects are perceptually segregated so they can be perceived as separate objects"]

Take Home Message
- To make sense of scenes, both grouping and segregation are needed.
- Otherwise, the scenes cannot be perceptually organised into meaningful units.

Gestalt Principles of Grouping
- Grouping is governed by 5 key principles, with two further principles added later.
- The more of these principles that apply, the more likely components of an image will be grouped together to form a perceptual object.
- Original Gestalt principles:
  - Good continuation
  - Pragnanz
  - Similarity
  - Proximity
  - Common fate
- Two additional ones (added later):
  - Common region
  - Uniform connectedness

Good Continuation
- Remember we mentioned that occlusions can make object recognition difficult.
- The principle of good continuation can help.
- Aligned (or nearly aligned) contours are grouped together to form a single object.
- This is why contour A is grouped with contour B, instead of with contours C or D.
[Figure labels: contours A, B, C, D]

Pragnanz
- A German term, usually translated as "good figure".
- Also known as the "principle of good figure" or the "principle of simplicity".
- Essentially, groupings occur to make the resultant figure as simple as possible.
- In the figure to the right you see a panda, not a collection of unconnected splotches.

Similarity
- The more similar objects are, the more likely they will be grouped together.
- In a), all the dots are the same colour so it is unclear whether things are organised vertically or horizontally.
- In b), colour similarity groups the dots into columns.
[Figure panels: a), b)]

Proximity
- The closer the dots are, the more likely they are to be grouped together.
- In b), grouping by proximity forms horizontal rows.
[Figure panels: a), b)]

Common Fate
- Things that are moving in the same way are grouped together.

Common Region
- Elements that are within the same region of space tend to group together (Palmer, 1992).
[Figure panels: a), b)]

Uniform Connectedness
- Connected regions with the same visual characteristics (e.g. colour) tend to group together (Palmer & Rock, 1994).
[Figure panels: a), b)]

Take Home Message
- There are a number of principles that help people to group together parts of an image to form perceptual wholes.
- These principles include:
  - Good continuation
  - Pragnanz
  - Similarity
  - Proximity
  - Common fate
  - Common region
  - Uniform connectedness

Pop Quiz
- Please write down your answers to the following questions:
  - What are the three main difficulties of object perception?
  - Describe two bits of evidence for Gestaltism.
  - How does Gestaltism claim that perceptual organisation is achieved?
  - Name four of the Gestalt principles of grouping.

What Are the Three Main Difficulties of Object Perception?
- There are a number of difficulties, but the three most important ones are:
  - The stimulus on the retina is ambiguous
  - Objects can be hidden or blurred
  - Objects look different from different viewpoints and in different poses

What is the Evidence for Gestaltism?
- There are two main pieces of evidence that support the claim that conscious awareness can be more than the sum of the elementary sensations.
- These two pieces of evidence are:
  - Apparent motion
  - Illusory contours

How is Perceptual Organisation Achieved?
- Perceptual organisation is achieved by the processes of grouping and segregation.
- Grouping is the process by which parts of an image are perceptually bound together to form a perceptual whole (e.g. the perception of an object).
- Segregation is the process by which parts of a scene are perceptually separated to form separate wholes (e.g. the perception of separate objects).
- Together, grouping and segregation allow a scene to be perceptually organised into its constituent objects, thereby allowing observers to make sense of the scene.

What are the Gestalt Principles of Grouping?
- These principles include:
  - Good continuation
  - Pragnanz
  - Proximity
  - Common fate
  - Similarity
  - Common region
  - Uniform connectedness

Segregation
- It is not enough to group components of an image together to form an object; you also need to segregate the different objects in the scene from each other…
- …and also segregate the objects from the background.
- If you did not do this, you would perceive the entire image as just a single object…
- …which would be very confusing.

Segregation
- Much of the perceptual segregation literature has focused on figure-ground segregation.
- The reason for this is that objects are normally perceived as "figures" and the background is typically perceived as the "ground".
- Consequently, if you can identify what the figure is, you can typically identify the objects.
- But how does a person determine what is "figure" and what is "ground"?

Figural Properties
- Regions of the image are more likely to be seen as figure if:
  - They are in front of the rest of the image
  - They are at the bottom of the image
  - They are convex
  - They are recognisable

Figural Properties
[Figure: Rubin Vase]
- The Rubin vase is ambiguous: it can be perceived as either a vase or two faces.
- It is therefore not clear what the figure is, two faces or one vase.
- If the vase is brought in front of the image it is then seen as the figure.
- If the two faces are brought in front of the image, they are then seen as the figure.
- This shows that depth ordering affects figure perception.
- Take home message: Regions of an image in front of the rest of the image tend to be seen as figures (i.e. they are seen as objects).

Figural Properties
[Figure panels: a), b). Bar chart with a 0-100 axis and bars "Lower seen as figure" and "Left seen as figure". Modified from Vecera et al. (2002)]
- Most people perceive image (a) as a red object in front of a green background.
- This is because lower areas are more likely to be seen as figures (i.e. are more likely to be perceived as objects).
- However, there is no left-right bias.
- Consequently, image (b) is ambiguous.
- It is not clear which side is the figure and which side is the ground.

Figural Properties - Convexity
[Figures from Peterson & Salvagio (2008): "Is the black shape figure or ground?"; "Are the white shapes figures or ground?"; labels: Concave regions, Convex regions]
- Peterson & Salvagio (2008) showed that if you see a single border, there is a slight tendency to perceive the convex region as figure.
- However, if you see multiple convex regions, each with the same colour, you are more likely to perceive those regions as figure.
- Take home message: Convex regions are assumed to be figures (i.e. objects).

Experience
- People also use past experience to segregate overlapping objects.
- What letters do you see below?
- You use your knowledge of letters to segregate these two letters into separate objects.
[Figure: the overlapping letters W and M]

Experience
- As a) is in a familiar orientation it is easier to segregate it from the background than in b).
[Figure panels: a), b). From Gibson & Peterson (1994)]

Experience
[Figure credit: Life Magazine:58;7 1965-02-19, p 120]
- Once you have seen the Dalmatian you cannot "unsee" it.
- That knowledge even survives when the image is flipped left to right.

Gist Perception
- When scenes are flashed rapidly in front of an observer, she may not be able to identify all the objects in the scene.
- Nevertheless, she gets an overall impression of what the scene is about.
- For example, she might think that the image shows "a crowded cafe".
- That "overall impression" is what is known as the "gist" of the scene.

Gist Perception
- Potter (1976) studied gist perception using the following paradigm.
- In each trial, the observer was cued with a particular scene description.
- Then she saw 16 randomly chosen scenes, each for 250 ms.
- Then she was asked if any of the scenes fitted the description.
- Observers were at near 100% accuracy.
- This showed that observers can rapidly perceive a scene's gist.
[Figure: the cue "Bridge?" followed by a sequence of scenes, each shown for 250 ms, along a time axis. Potter (1976)]

Gist Perception
- Fei-Fei investigated the minimum scene exposure time needed to perceive a scene's gist.
- Observers were presented with just a single scene, followed by a mask.
- Observers were then asked to describe what they had seen.
[Figure credit: Li et al. (2007)]

Gist Perception
- Fei-Fei et al. reported that the longer the stimulus presentation time, the more detailed and accurate the description.
- People could start to perceive aspects of the scene at about 27 ms, but the perceptions were not very detailed.
- Example descriptions at each presentation time:
  - 27 ms: "Couldn't see much; it was mostly dark w/ some square things, maybe furniture." (Subject: AM)
  - 40 ms: "This looked like an indoor shot. Saw what looked like a large framed object (a painting?) on a white background (i.e., the wall)." (Subject: RW)
  - 67 ms: "I saw the interior of a room in a house. There was a picture to the right, that was black, and possibly a table in the center. It seemed like a formal dinning room." (Subject: JB)
  - 500 ms: "Some fancy 1800s living room with ornate single seaters and some portraits on the wall." (Subject: WC)

Take Home Message
- Although observers can extract the gist of a scene very rapidly, the gist they extract is not very detailed.
- The longer observers view a scene, the more detailed the gist they extract.
- 27 ms is enough time to extract some gist, and very accurate perception can be achieved in just 250 ms.

Summary
- In this lecture, I have discussed object and scene perception.
- We have covered the following topics:
  - The problem
  - Competing solutions
  - Principles of grouping
  - Principles of segregation
  - Gist perception

Questions
- Before the next lecture, please write down your answers to the following questions:
  - Please list four figural cues
  - What is the gist of a scene?
  - How long do you need to get a rudimentary gist?

The End
