Operant Conditioning PDF
2023
Hande Kaynak
Summary
These lecture notes cover operant conditioning, including Thorndike's Laws and Skinner's work. They discuss how learning occurs incrementally and the role of reinforcement and punishment in shaping behavior.
Full Transcript
PSY381-Assoc Prof Hande Kaynak 10/10/2023

Operant Conditioning
Chapter 6: Operant Conditioning (Introduction)

Do we always learn by associating neutral stimuli with other stimuli in the environment?

response → stimulus
R → SR (reinforcing stimulus) or SP (punishing stimulus)
Behavior → Consequences (B → C)

Importance of controlling learning, particularly complex, voluntary, goal-directed behavior.
Also called instrumental conditioning because the response is instrumental in producing the consequence.
It’s called operant conditioning because the response operates on the environment to produce a consequence.

Examples
The dog goes through the circle (Behavior) → FOOD (Consequence): the frequency of the behavior increases!
Using the vending machine (Behavior) → Drinking coffee (Consequence): the behavior increases!

OC: Thorndike (1890s)
A puzzle box: http://www.youtube.com/watch?v=BDujDOLre8

Learning Curve for Cats in Box
There was no sudden improvement in performance. Rather, the response that worked (stepping on the treadle) was gradually strengthened, while responses that did not work (e.g., clawing at the gate, chewing on the cage) were gradually weakened.

Thorndike’s Laws
Also called S-R learning. Type R (also called operant conditioning): behavior is controlled by its consequences.
Law of effect: A chance act becomes a learned behavior when a connection is formed between a stimulus (S) and a response (R) that is rewarded. If a response is followed by a satisfying state of affairs, the strength of the connection is increased (“stamped in”). If a response is followed by an annoying state of affairs, the strength of the connection is decreased (“stamped out”).
Law of exercise: the S-R connection is strengthened by use and weakened with disuse.
Law of readiness: motivation is needed to develop an association or display changed behavior.
When someone is ready to perform some act, to do so is satisfying.
When someone is ready to perform some act, not to do so is annoying.
When someone is not ready to perform some act and is forced to do so, it is annoying.

Thorndike’s Laws
Learning Is Incremental, not Insightful. In other words, learning occurs in very small systematic steps rather than in huge jumps.
Thorndike’s cats learned to solve the puzzle box problem (gradually/suddenly) ________.
According to Thorndike, behaviors that worked were st____ i__, while behaviors that did not work were st___ o__.

Burrhus Frederic Skinner (Operant Conditioning)
Learning that relies on associating behavior with its results or consequences.
Defined as “operant”: the animal operates on the environment and is not passive, as it is in classical conditioning (CC).
Highlights the importance of reinforcement and punishment in learning.
Any response that is followed by a reinforcing stimulus tends to be repeated. (The emphasis is on behavior and its consequences.)
A reinforcing stimulus is anything that increases the rate with which an operant response occurs. (This process exemplifies contingent reinforcement, because getting the reinforcer is contingent (dependent) on the organism emitting a certain response.)

Skinner
To study this type of learning, Skinner needed to design a controlled environment.
Skinner Box
https://www.youtube.com/watch?v=PQtDTdDr8vs
https://www.youtube.com/watch?v=MOgowRy2WC0
https://www.youtube.com/watch?v=CC7q6wE89yE&feature=emb_logo&ab_channel=vanbergen90

Pecking the response key → Food pellet
The effect: The future probability of pecking the key increases.
The behavior is an operant response because its occurrence results in the delivery of a certain consequence, and that consequence affects the future probability of the response.

Cumulative Recording
Time is recorded on the x-axis and the total number of responses is recorded on the y-axis. The cumulative recording never goes down. The rate with which the line ascends indicates the rate of responding.

In the original version of the Skinner box, rats earn food by p______ a _________; in a later version, pigeons earn a few seconds of access to food by p_________ at an illuminated plastic disc known as a _______ ___________.
Skinner’s procedures are also known as fr_____ ope____ procedures in that the animal controls the rate at which it earns food.

Operant Consequences: Reinforcers and Punishers
A stimulus is a reinforcer if (1) it follows a behavior, and (2) the future probability of that behavior increases. Conversely, a stimulus is a punisher if (1) it follows a behavior, and (2) the future probability of that behavior decreases.
The terms reinforcement and punishment usually refer to the process or procedure by which a certain consequence changes the strength of a behavior. Thus, the use of food to increase the strength of lever pressing is an example of reinforcement, while the food itself is a reinforcer. We can use the terms reward and reinforcer interchangeably.
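The cumulative recording described above (time on the x-axis, total responses on the y-axis) can be sketched in a few lines of Python. This is only an illustration: the function names `cumulative_record` and `response_rate`, the idea of counting responses in equal time bins, and the sample peck counts are all assumptions made for the example, not part of the lecture.

```python
from itertools import accumulate


def cumulative_record(responses_per_bin):
    """Return the running total of responses after each time bin."""
    if any(count < 0 for count in responses_per_bin):
        raise ValueError("response counts cannot be negative")
    return list(accumulate(responses_per_bin))


def response_rate(record, start_bin, end_bin):
    """Slope of the record between two bins: responses per time bin."""
    if end_bin <= start_bin:
        raise ValueError("end_bin must come after start_bin")
    return (record[end_bin] - record[start_bin]) / (end_bin - start_bin)


# Hypothetical session: key pecks counted in five equal time bins.
pecks = [0, 2, 3, 0, 5]
record = cumulative_record(pecks)
print(record)                       # [0, 2, 5, 5, 10]
print(response_rate(record, 2, 3))  # 0.0 (flat line: no responding)
print(response_rate(record, 3, 4))  # 5.0 (steep line: fast responding)
```

Because each bin can only add zero or more responses, the running total never decreases, and a steeper stretch of the record corresponds to a higher rate of responding.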
More specifically, a reinforcer is a consequence that (precedes/follows) ___ a behavior and (increases/decreases) ___ the probability of that behavior. A punisher is a consequence that (precedes/follows) ___ a behavior and (increases/decreases) ___ the probability of that behavior.
Strengthening a roommate’s tendency toward cleanliness by thanking her when she cleans the bathroom is an example of ______, while the thanks itself is a _____.
When labeling an operant conditioning procedure, punishing consequences (punishers) are given the symbol ___ (which stands for ___ ___ ), while reinforcing consequences (reinforcers) are given the symbol ____ (which stands for ___ _____ ). The operant response is given the symbol _____.
The terms reinforcers and punishers refer to the specific __________ that follows a behavior.
Operant behaviors are sometimes simply called _____. These can be contrasted with elicited behaviors, which Skinner called ___ behaviors or simply ___.
A discriminative stimulus that signals the availability of reinforcement is given the symbol ___.

Discriminative Stimuli
When a behavior is consistently reinforced or punished in the presence of certain stimuli, those stimuli will begin to influence the occurrence of the behavior.
If lever pressing produces food only when a tone is sounding, the rat soon learns to press the lever only when it hears the tone. In the presence of the tone, if the rat presses the lever, it will receive food.
In the presence of discriminative stimuli, responses are reinforced; in their absence, they are not reinforced.
A discriminative stimulus is a signal that indicates that a response will be followed by a reinforcer. The presence of the SD makes the response more likely to occur.
Discriminative stimuli do not elicit behavior in the manner of a CS or UCS in classical conditioning. For example, the tone does not automatically elicit a lever press; it merely increases the probability that a lever press will occur.

Three-term contingency
The three-term contingency can be viewed as consisting of an antecedent event (a preceding event), a behavior, and a consequence (which can be remembered by the initials ABC).
Operant Conditioning: Discriminative Stimulus → Behavior → Consequence
You notice something (tone), do something (press a lever), and get something (food).

Discriminative Stimuli
discriminative stimulus → response → reinforcer
SD → R → SR
Antecedent → Behavior → Consequences
A → B → C
The discriminative stimulus can signal not only reinforcement but also punishment.

Differences between OC and CC
Q: A discriminative stimulus (does/does not) __ elicit behavior in the same manner as a CS.
Rather than saying that the SD elicits the behavior, we say that the person emits the behavior in the presence of the SD.

Four Types of Contingencies
2 questions to ask:
(1) Does the consequence consist of something being presented or withdrawn?
(2) Does the consequence serve to strengthen or weaken the behavior?
Learn the terminology:
“Reinforcement” always means strengthening behavior.
“Punishment” always means decreasing behavior.
“Positive” always means adding a stimulus.
“Negative” always means removing a stimulus.
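The two questions above jointly determine which of the four contingencies applies. A minimal sketch of that decision, assuming a hypothetical helper named `classify_contingency` and the string labels "presented"/"removed" and "increases"/"decreases" (none of which come from the lecture), might look like this:

```python
def classify_contingency(stimulus_change, behavior_change):
    """Name the contingency from the answers to the two questions.

    stimulus_change: "presented" or "removed" (question 1)
    behavior_change: "increases" or "decreases" (question 2)
    """
    if stimulus_change not in ("presented", "removed"):
        raise ValueError("stimulus_change must be 'presented' or 'removed'")
    if behavior_change not in ("increases", "decreases"):
        raise ValueError("behavior_change must be 'increases' or 'decreases'")

    # "Positive" means a stimulus was added; "negative" means one was removed.
    sign = "positive" if stimulus_change == "presented" else "negative"
    # "Reinforcement" strengthens behavior; "punishment" weakens it.
    process = "reinforcement" if behavior_change == "increases" else "punishment"
    return f"{sign} {process}"


# Food is presented after a lever press and lever pressing increases.
print(classify_contingency("presented", "increases"))  # positive reinforcement
# A shock is presented after a lever press and lever pressing decreases.
print(classify_contingency("presented", "decreases"))  # positive punishment
```

The function deliberately ignores whether the stimulus seems pleasant or unpleasant: the label depends only on what happened to the stimulus and what happened to the behavior afterwards.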
Procedure                                      After behavior occurs:       Result:
Positive Reinforcement                         Present pleasant stimulus    Behavior increases
Negative Reinforcement (escape or avoidance)   Remove aversive stimulus     Behavior increases
Positive Punishment (or just “punishment”)     Present aversive stimulus    Behavior decreases
Negative Punishment (omission)                 Remove pleasant stimulus     Behavior decreases

Positive reinforcement

Negative reinforcement
Negative means only that the behavior has resulted in something being removed or subtracted. This is an example of reinforcement because the behavior increases in strength; it is negative reinforcement because the consequence consists of taking something away.
Karen cries while saying to her boyfriend, “John, I don’t feel as though you love me.” John gives Karen a big hug saying, “That’s not true, dear, I love you very much.” If John’s hug is a reinforcer, Karen is (more/less) ___________ likely to cry the next time she feels insecure about her relationship. More specifically, this is an example of _________ reinforcement of Karen’s crying behavior.

Positive punishment
Positive means only that the behavior has resulted in something being presented or added. For example, if a rat received a shock when it pressed a lever, it would stop pressing the lever.

Negative punishment
It is punishment because the behavior decreases in strength, and it is negative punishment because the consequence consists of the removal of something.
Jonathan’s behavior of talking to other women at parties has been negatively punished. What about the girlfriend’s behavior?
Negative punishment / Negative reinforcement

Disadvantages of Using Punishment
Emotional effects
Suppression of other behaviors
Need for continual monitoring
Attempts to escape situation
Aggression against punisher or others

Whenever Sasha pulled the dog’s tail, the dog left and went into another room. As a result, Sasha now pulls the dog’s tail less often when it is around. The consequence for pulling the dog’s tail was the (presentation/removal) _________ of a stimulus, and the behavior of pulling the dog’s tail subsequently (increased/decreased) _________ in frequency; therefore, this is an example of _______ _______.

Examples…
When Sasha was teasing the dog, it bit her. As a result, she no longer teases the dog. The consequence for Sasha’s behavior of teasing the dog was the (presentation/removal) ___________ of a stimulus, and the teasing behavior subsequently (increased/decreased) _______ in frequency; therefore, this is an example of _____ _____.
When Alex held the car door open for Stephanie, she made a big fuss over what a gentleman he was becoming. Alex no longer holds the car door open for her. The consequence for holding open the door was the ______________ of a stimulus, and the behavior of holding open the door subsequently ______________ in frequency; therefore, this is an example of ______ __________.
When Tenzing shared his toys with his brother, his mother stopped criticizing him. Tenzing now shares his toys with his brother quite often. The consequence for sharing the toys was the ________ of a stimulus, and the behavior of sharing the toys subsequently _______ in frequency; therefore, this is an example of ______ _______.
When Alex burped in public during his date with Stephanie, she got angry with him. Alex now burps quite often when he is out on a date with Stephanie. The consequence for burping was the _________ of a stimulus, and the behavior of belching subsequently ________ in frequency; therefore, this is an example of ________ ___________.
Primary and Secondary Reinforcers
A primary reinforcer (also called an unconditioned reinforcer) is an event that is innately reinforcing. It is associated with basic physiological needs: it satisfies some biological need and works naturally, regardless of a person’s prior experience. E.g. food, water, proper temperature (neither too hot nor too cold), and sexual contact.
A secondary reinforcer is a stimulus that becomes reinforcing because of its association with a primary reinforcer. E.g. money, status, ‘good morning’, ‘well done’, ‘bravo’.

Shaping: Reinforcing What Doesn’t Come Naturally
The process of teaching a complex behavior by rewarding closer and closer approximations of the desired behavior. E.g. teaching a child how to wear glasses:
Putting the glasses on a table.
Touching the glasses is followed by a reward.
Playing with the glasses is followed by a reward.
Wearing the glasses is followed by a reward.
Continuing to wear the glasses is followed by a reward.
http://www.youtube.com/watch?v=_7kIV6zvAQY
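The shaping procedure above can be sketched as a loop that rewards a behavior only if it meets the current criterion and then raises the criterion. This is a rough illustration under stated assumptions: the function name `shape_behavior`, the ordered list of approximations, the observed sequence, and the rule that earlier approximations stop earning rewards once a closer one has been rewarded are all assumptions for the example, not claims from the lecture.

```python
def shape_behavior(approximations, observed):
    """Reward successively closer approximations of a target behavior.

    approximations: behaviors ordered from crudest to the target behavior.
    observed: behaviors in the order the learner emitted them.
    Returns a list of (behavior, rewarded) pairs.
    """
    criterion = 0  # index of the crudest approximation that still earns a reward
    last = len(approximations) - 1
    log = []
    for behavior in observed:
        # Behaviors outside the list never meet the criterion.
        level = approximations.index(behavior) if behavior in approximations else -1
        rewarded = level >= criterion
        log.append((behavior, rewarded))
        if rewarded:
            # Assumption: after a reward, only a closer approximation counts.
            criterion = min(level + 1, last)
    return log


steps = ["touching", "playing with", "wearing", "continuing to wear"]
session = ["touching", "touching", "playing with",
           "touching", "wearing", "continuing to wear"]
for behavior, rewarded in shape_behavior(steps, session):
    print(behavior, "-> reward" if rewarded else "-> no reward")
```

In this sketch the second "touching" is not rewarded because the criterion has already moved on to "playing with", which is one way to express the idea of rewarding closer and closer approximations.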