Schacter Chapter 7 Learning dprime PDF
Douglas College
Schacter
Summary
This document provides an overview of learning, focusing on classical and operant conditioning, as well as observational learning. The text details concepts such as acquisition, extinction, and spontaneous recovery in the context of classical conditioning. Operant conditioning is also discussed, including Thorndike's Law of Effect and Skinner's reinforcement concepts.
Full Transcript
Learning: Chapter 7

Overview
Learning associations: classical conditioning, operant conditioning, and observational learning.

Defining Learning
A behaviourist’s definition: Learning is a relatively permanent (long-lasting) change in an organism’s behaviour due to experience. As we will see later, more modern approaches to learning focus on more than just behaviour.

Associative Learning
Brains naturally associate events that co-occur. This is called associative learning, or learning by association. Many of the early psychological investigations of learning focused on this type of learning.

Classical Conditioning
Classical conditioning, learning to associate one stimulus with another, was discovered by the Russian physiologist Ivan Pavlov (1849-1936). His work provided a basis for behaviourism.

Before Conditioning
Before conditioning, food (unconditioned stimulus, US) produces salivation (unconditioned response, UR).
Unconditioned stimulus (US): a stimulus that unconditionally (automatically) triggers a response. The response is usually instinctual. Example: yummy dog food.
Unconditioned response (UR): the unlearned, naturally occurring response to the unconditioned stimulus (US). Example: the dog salivates.
A neutral stimulus (NS), such as a tone, does not produce a salivation response.

During Conditioning
The bell/tone (NS) is repeatedly presented with the food (US). The food still produces the unconditioned response (UR): the dog salivates.

After Conditioning
The dog begins to salivate upon hearing the tone. The tone is no longer neutral; it causes a response. The tone is now a conditioned stimulus (CS) and the salivation is the conditioned response (CR).
The conditioned stimulus (formerly neutral) now produces the conditioned response: the dog salivates.
Conditioned stimulus (CS): an originally irrelevant stimulus that, after association with an unconditioned stimulus (US), triggers a conditioned response.
Conditioned response (CR): the learned response to a previously neutral conditioned stimulus (CS). It is usually the same behaviour as the UR.

Classical Conditioning
The NS and the CS are the same stimulus. The difference is whether or not the stimulus triggers the conditioned response.
The UR and the CR are the same response, triggered by different events. The difference is whether learning was necessary for the response to happen.

Acquisition
Acquisition is the initial learning stage in classical conditioning, in which an association between a neutral stimulus and an unconditioned stimulus is formed. In most cases, for conditioning to occur, the neutral stimulus needs to come before the unconditioned stimulus. The optimal time between the two stimuli is about half a second. Learning is useful when it helps predict future relevant events.

Acquisition and Extinction
Extinction refers to the diminishing of a conditioned response. If the US (food) stops appearing with the CS (bell), the CR decreases, because the CS no longer predicts the US.

Spontaneous Recovery
When extinction is followed by a rest period, presenting the tone alone might lead to spontaneous recovery (a return of the conditioned response). If the CS (tone) is again presented repeatedly without the US, the CR is extinguished again.

Stimulus Generalization
The tendency to respond to stimuli similar to the CS is called generalization. Pavlov conditioned the dog’s salivation (CR) by using miniature vibrators (CS) on the thigh. When he subsequently stimulated other parts of the dog’s body, salivation dropped but still occurred.
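The acquisition, extinction, and spontaneous recovery phases described above can be sketched numerically. The following is a minimal sketch, not a model taken from the slides: it assumes a simple error-correcting update (of the kind used in the Rescorla-Wagner family of models) in which the strength of the CS-US association moves a fixed fraction of the way toward a target on every trial. The function name `run_trials`, the learning rate, the trial counts, and the recovery fraction are all illustrative assumptions.

```python
def run_trials(strength, n_trials, us_present, rate=0.3):
    """Return the trial-by-trial association strength, starting from `strength`.

    Assumption: strength moves a fraction `rate` of the way toward 1.0 when
    the CS is paired with the US, and toward 0.0 when the CS appears alone.
    """
    target = 1.0 if us_present else 0.0
    history = []
    for _ in range(n_trials):
        strength += rate * (target - strength)
        history.append(strength)
    return history


# Acquisition: the tone (CS) is repeatedly paired with food (US).
acquisition = run_trials(0.0, 10, us_present=True)

# Extinction: the tone is presented alone, so the CR diminishes.
extinction = run_trials(acquisition[-1], 10, us_present=False)

# Spontaneous recovery: after a rest period the CR partly returns.
# RECOVERY_FRACTION is a hypothetical value chosen only for illustration.
RECOVERY_FRACTION = 0.5
recovered = RECOVERY_FRACTION * acquisition[-1]

# Presenting the tone alone again extinguishes the CR a second time.
re_extinction = run_trials(recovered, 10, us_present=False)

for label, series in [("acquisition", acquisition),
                      ("extinction", extinction),
                      ("re-extinction", re_extinction)]:
    print(label, [round(v, 2) for v in series])
```

The printed series rise during acquisition and fall during both extinction phases. The rebound between the two extinction phases is imposed by hand here; the sketch does not explain why recovery happens.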
Stimulus Discrimination
Discrimination is the learned ability to distinguish between a conditioned stimulus and other stimuli that do not signal an unconditioned stimulus.

Watson: Applying Classical Conditioning to People
Little Albert Study (1920):
9-month-old Little Albert was not afraid of rats.
John B. Watson and Rosalie Rayner then clanged a steel bar every time a rat was presented to Albert.
Albert acquired a fear of rats, and generalized this fear to other soft and furry things.
Watson prided himself on his ability to shape people’s emotions. He later went into advertising.

Applications of Classical Conditioning
Classical conditioning has been applied in many contexts. For example, in drug rehabilitation:
Substance abuse involves conditioned associations between drugs and various stimuli (people, places, etc.).
To avoid cravings, users should avoid people and places associated with previous drug use.

Limits to the Behaviourist Approach
Although the behaviourist approach provided many insights into learning, over time it became clear that it was inadequate. Behaviourists failed to fully appreciate two important factors that affect learning:
1) Biological predispositions and constraints
2) Cognitive processes

Biological Predispositions
Behaviourists believed that the laws of learning were similar (almost identical) for all animals. On this view, a pigeon and a person do not differ in how they learn.
However, later research demonstrated that learning is constrained by an organism’s abilities, which were acquired through natural selection.
People and animals are not blank slates (tabula rasa). Our ability to learn new behaviours is constrained by our innate nature.

Biological Predisposition in Classical Conditioning: Taste Aversion
Taste aversion conditioning demonstrates that the interval between the CS and the US may sometimes be long (hours). In John Garcia’s research, a biologically relevant CS (taste) led to conditioning, but other non-relevant stimuli (sight or sound) did not.
Only a single pairing of the CS and US is required for this type of conditioning.

Cognitive Processes
Early behaviourists believed that the learned behaviours of all organisms (including humans) could be described by simple mechanisms linking stimuli and behaviours. Reference to internal mental states was not required.
However, late in the behaviourist era it was suggested that animals learn the predictability of a stimulus, meaning that they learn an expectancy about the occurrence of a stimulus.

Expectancy and Classical Conditioning
In the Rescorla-Wagner model of classical conditioning, a CS sets up an expectation that the US will soon appear. The expectation leads to many behaviours associated with the US.

Respondent vs. Operant Behaviour
Respondent behaviour: behaviour that occurs as an automatic response to a stimulus. In classical conditioning, the behaviour elicited by the CS and US is respondent.
Operant behaviour: any behaviour that operates on (affects) the environment. It may or may not be in response to an external event.

Operant Conditioning
Operant conditioning forms an association between behaviours and their consequences (events that are a result of the behaviour). The results of a behaviour determine the likelihood of that behaviour being repeated in the future. The consequence of the behaviour may be reinforcing (rewarding), punishing, or neutral.

Thorndike’s Law of Effect
Edward Thorndike studied problem solving by placing cats in a puzzle box; they were rewarded with food when they solved the puzzle. The cats appeared to learn by trial and error.
Thorndike proposed the law of effect: behaviours followed by favourable consequences become more likely, and behaviours followed by unfavourable consequences become less likely.

B. F. Skinner
B. F. Skinner pioneered more controlled methods to explore Edward Thorndike’s principles.
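Thorndike’s law of effect, stated above, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not Thorndike’s or Skinner’s actual procedure: a hypothetical `choose` function picks a behaviour in proportion to its current weight, and a hypothetical `apply_consequence` function raises that weight after a favourable consequence and lowers it after an unfavourable one. The behaviour names, step size, weight floor, and trial count are invented for illustration.

```python
import random


def choose(weights, rng):
    """Pick one behaviour with probability proportional to its weight."""
    behaviours = list(weights)
    return rng.choices(behaviours,
                       weights=[weights[b] for b in behaviours], k=1)[0]


def apply_consequence(weights, behaviour, outcome, step=0.5, floor=0.1):
    """Adjust a behaviour's weight by its consequence.

    outcome is +1 (favourable), 0 (neutral), or -1 (unfavourable).
    The floor keeps every behaviour at least slightly possible (assumption).
    """
    weights[behaviour] = max(floor, weights[behaviour] + step * outcome)


def simulate(n_trials=200, seed=0):
    """Run repeated trials in a hypothetical puzzle box; return final weights."""
    rng = random.Random(seed)
    # All behaviours start out equally likely (trial and error).
    weights = {"press_lever": 1.0, "scratch_door": 1.0, "meow": 1.0}
    # Hypothetical consequences: only pressing the lever opens the box.
    outcomes = {"press_lever": 1, "scratch_door": 0, "meow": -1}
    for _ in range(n_trials):
        behaviour = choose(weights, rng)
        apply_consequence(weights, behaviour, outcomes[behaviour])
    return weights


final_weights = simulate()
print(final_weights)
```

In this sketch the rewarded behaviour comes to dominate, the neutral one keeps its starting weight, and the behaviour with an unfavourable consequence never gains weight. That is the qualitative pattern the law of effect describes.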
Reinforcement
Reinforcement refers to any feedback from the environment that makes a behaviour more likely to recur.
Positive reinforcement: adding something desirable (e.g., warmth).
Negative reinforcement: ending something unpleasant (e.g., turning off an annoying sound).
Figure caption: This meerkat has just completed a task out in the cold. For the meerkat, this warm light is desirable.

Context and Learning
The consequence of a behaviour can depend on the context in which it is performed.
Reinforcement and punishment become associated with salient aspects of the environment in which they occur (discriminative stimuli).
The learned response will only be emitted when the appropriate discriminative stimulus is present.

Primary & Secondary Reinforcers
Primary reinforcer: a stimulus that meets a basic need or is otherwise intrinsically desirable, such as food, sex, fun, attention, or power.
Secondary/conditioned reinforcer: a stimulus (e.g., money) that has become associated with a primary reinforcer (money buys food, builds power).

Shaping Behaviour
When a creature is not likely to randomly perform exactly the behaviour you are trying to teach, you can reward any behaviour that comes close to the desired behaviour.
Shaping is the operant conditioning procedure in which reinforcers guide behaviour towards the desired target behaviour through successive approximations. This technique is used extensively in animal training.

Immediate & Delayed Reinforcers
Immediate reinforcer: a reinforcer that occurs instantly after a behaviour. For example, a rat gets a food pellet for a bar press.
Delayed reinforcer: a reinforcer that is delayed in time from the behaviour that produces it. For example, a paycheck that comes twice a month.
Reinforcers become increasingly less effective as the delay increases.

Delayed Reinforcers
Humans have the ability to link a consequence to a behaviour even if the two do not follow each other closely in time.
However, we may be inclined to engage in behaviours that provide small immediate reinforcers (watching TV) rather than large delayed reinforcers (getting an A in a course), which require consistent work.

Reinforcement Schedules
B. F. Skinner experimented with the effects of giving reinforcements in different patterns, or “schedules”, to determine what worked best to establish and maintain a target behaviour.
Continuous reinforcement: reinforces the desired response each time it occurs. The subject acquires the desired behaviour quickly.
Partial/intermittent reinforcement: reinforces a response only part of the time. The target behaviour takes longer to be acquired, but persists longer without reward.

Partial Reinforcement: Ratio Schedules
Ratio schedules determine rewards based on the number of instances of the desired behaviour.
Fixed-ratio schedule: reinforces a response only after a specified number of responses (e.g., piecework pay).
Variable-ratio schedule: reinforces a response after an unpredictable number of responses. The behaviour is hard to extinguish because of the unpredictability (e.g., behaviours like gambling and fishing).

Partial Reinforcement: Interval Schedules
Interval schedules reinforce behaviours based on the interval of time since the last reinforcement.
Fixed-interval schedule: reinforces a response only after a specified time has elapsed (e.g., getting your paycheck every two weeks).
Variable-interval schedule: reinforces a response at unpredictable time intervals. It produces slow, steady responding.

Effect of Schedules of Reinforcement
Fixed ratio: predictable rewards lead to fast learning and a high rate of responding.
Fixed interval: short bursts of responding when anticipating reward.
Variable ratio: high, consistent responding, even if reinforcement stops (resists extinction).
Variable interval: slow, consistent responding.

Punishment
Punishments have the opposite effects of reinforcement.
These consequences make the target behaviour less likely to occur in the future.
Positive punishment: ADD something unpleasant/aversive (e.g., spank a child).
Negative punishment: TAKE AWAY something pleasant/desired (e.g., no TV time, no attention).
Positive does not mean “good” or “desirable”, and negative does not mean “bad” or “undesirable.”

When is punishment effective?
Punishment works best in natural settings, when we encounter punishing consequences from actions such as reaching into a fire; in that case, operant conditioning helps us to avoid dangers.
Artificial punishments work best when consequences happen as they do in nature: the severity of a punishment is not as helpful as making the punishment immediate and certain.

Problems with Physical Punishment
Punished behaviours may restart when the punishment is over; the learning is not lasting.
A child may learn to discriminate among situations, and avoid those in which punishment might occur.
The child might learn an attitude of fear or hatred, which can interfere with learning. This can generalize to a fear/hatred of all adults or many settings.
Physical punishment models aggression and control as a method of dealing with problems.
Punishing focuses on what NOT to do, which does not guide people to a desired behaviour.
Even if undesirable behaviours do stop, another problem behaviour may emerge that serves the same purpose, especially if no replacement behaviours are taught and reinforced.
To teach desired behaviour, reinforce what’s right more often than you punish what’s wrong.

Applications of Operant Conditioning
Operant conditioning principles (reinforcement, shaping) are widely applied in education, sports, and business.

Parenting
1) Rewarding small improvements toward desired behaviours works better than expecting complete success, and also works better than punishing problem behaviours.
2) Giving in to temper tantrums stops them in the short run but increases them in the long run.

Self-Improvement
Reward yourself for steps you take toward your goals. As you establish good habits, make your rewards more infrequent (intermittent).

Biological Predisposition in Operant Conditioning
Biological constraints predispose organisms to learn associations that are naturally adaptive.
The article “The misbehavior of organisms” (Breland and Breland, 1961) showed that animals drift towards their biologically predisposed instinctive behaviours. (Pictured: Marian Breland Bailey.)

Evolutionary Predisposition and Operant Conditioning
After being rewarded with food for turning right, rats turn left on the next trial. Why was the reinforced behaviour not emitted?
Rats are foraging animals. After clearing out the food in one location, they search new locations. It takes many trials for them to consistently turn right.

Cognition & Operant Conditioning
According to behaviourist laws of operant conditioning, animals learn from the consequences of their actions. If no reward or punishment occurs, nothing is learned.
However, it was discovered that if rats are allowed to explore a maze without any reward (e.g., cheese), they develop a mental representation of the layout of the maze (a cognitive map).
This type of learning is known as latent learning: skills or knowledge gained from experience, but not apparent in behaviour until rewards are given.

Latent Learning
When a reward for completing a maze is given, previously unrewarded rats that have wandered the maze demonstrate their knowledge by quickly completing the maze.

Cognitive Maps
When rats were rewarded with food in maze A, they could easily emit the correct behaviour to obtain the food reward in maze B. This indicates that they had built a cognitive map of the maze, allowing them to determine a new route.

Learning by Observation
Higher animals, especially humans, learn through observing and imitating others.
Observational learning: watching other people’s behaviour and learning from their experience.

Modelling
Modelling: the behaviour of others serves as an example of how to respond to a situation. We may imitate this model regardless of reinforcement.

Modelling and Imitation
Modelling and imitation begin early in life. (Pictured: a 14-month-old child imitates the adult on TV in pulling a toy apart.)
Children are prone to over-imitating: copying adult behaviours that have no function and no reward.
Humans imitate both behaviours and emotions (“emotional contagion”).

Bandura’s Bobo Doll Experiment
Bandura’s classic “Bobo doll” experiment illustrated the powerful effect of imitation.
Video: Bobo (2m:47s)

Antisocial Effects of Observational Learning
Children who witness violence in their homes, but are not physically harmed themselves, may hate violence but still may become violent more often than the average child.

Media Models of Violence
Research shows that viewing media violence (TV, video games) leads to increased aggression and reduced prosocial behaviour (such as helping an injured person).
This violence-viewing effect might be explained by imitation, and also by desensitization toward pain in others.

Prosocial Effects of Observational Learning
Prosocial behaviour: actions that benefit others, contribute value to groups, and follow moral codes and social norms.
Children model prosocial behaviour at least as well as antisocial behaviour.
Demonstrating prosocial behaviour is more effective than creating rules, lecturing, or punishing, especially when the positive effects of the behaviour are made clear.

Defining Learning
A modern definition: Learning is the acquisition of new knowledge, skills, or responses from experience that results in a relatively permanent change in the state of the learner.