PSYF 212 Principles of Learning PDF 2024
Mehmet Akif GÜZEL
Summary
This document presents lecture notes on "Instrumental (Goal-Directed) Conditioning," part of a psychology course (PSYF 212 Principles of Learning) offered in the Fall 2024 term. The document covers foundational concepts, early investigations, and modern approaches to the study of instrumental conditioning, including discrete-trial and free-operant procedures. It touches on reinforcement and punishment.
Full Transcript
10/22/24
PSYF 212 Principles of Learning
Mehmet Akif GÜZEL, PhD, Assoc Prof, Cognitive Psy, Dep of Psychology, Factory Building, FOA11
2024-2024 Fall Term
Mon, 2 hrs (13.00-14.45); Thurs, 3 hrs (13.00-15.45)
Chapter 5
Instrumental Conditioning: Foundations
Copyright © 2017 Cengage Learning. All Rights Reserved.

Introduction to Instrumental (Goal-Directed) Conditioning
- Analysis of learning situations in which stimuli are the result or consequence of behavior
- Responding is necessary to produce the desired outcome
- Instrumental behavior: behavior that occurs because it was previously effective in producing certain consequences

Introduction to Instrumental (Goal-Directed) Conditioning
[Diagram: S* and R]

Practice makes perfect

Early Investigations of Instrumental Conditioning
- Thorndike’s animal boxes
- Discrete trials vs. free-operant trials

Early Investigations of Instrumental Conditioning
Thorndike’s animal boxes
FIGURE 5.1 Two of Thorndike’s puzzle boxes, A and I. In Box A, the participant had to pull a loop to release the door. In Box I, pressing down on a lever released a latch on the other side. (Left: Based on “Thorndike’s Puzzle Boxes and the Origins of the Experimental Analysis of Behavior,” by P. Chance, 1999, Journal of the Experimental Analysis of Behavior, 72, pp. 433–440. Right: Thorndike, Animal Intelligence Experimental Studies, 1898.)

Thorndike’s Findings
- Law of effect: if a response R in the presence of a stimulus S is followed by a satisfying event, the association between stimulus S and response R becomes strengthened
- If the response is followed by an annoying event, the S–R association is weakened
- What is learned is an association between the response and the stimuli present at the time of the response
Latencies to escape Box A:
- Longest = 160 seconds
- Shortest = 6 seconds

Modern Instrumental Conditioning: Discrete-Trial Procedures
- Conducted by Willard Stanton Small
- Similar to Thorndike’s method; each training trial begins with putting the animal in the apparatus and ends with removal of the animal after the instrumental response has been performed
- Uses mazes
- Measures running speed and latency

REGARDING THE USE OF MAZES! & MICE!
Edward Chace TOLMAN (1886-1956)

Modern Instrumental Conditioning: Free-Operant Procedures
- Conducted by B.F. Skinner
- Free-operant procedures: allow the animal to repeat the instrumental response over and over again, without constraint and without being taken out of the apparatus, until the end of an experimental session
- Study behavior in a more continuous manner than is possible with mazes
- Operant response (example = lever press): defined in terms of the effect that the behavior has on the environment, that is, by how it operates on the environment
- Any response that is required to produce the desired consequence is an instrumental response because it is “instrumental” in producing that particular outcome

Modern Instrumental Conditioning: Free-Operant Procedures

Magazine Training and Shaping
Magazine Training (training rat to press lever for food)
- Food magazine: food delivery device
- Pair sound with food magazine until the sound elicits a classically conditioned approach response
- Animal goes to food cup and picks up pellet
- After magazine training, the rat is ready to learn the operant response
Shaping (training rat to press lever)
- Carefully designed
- Successful shaping of behavior involves three components:
  - Clearly define the final response you want performed
  - Clearly assess the starting level of performance
  - Divide the progression from the starting point to the final target behavior into appropriate training steps/successive approximations
- Successive approximations comprise the training plan
- Free-operant techniques provide a special opportunity to observe changes in the likelihood of behavior over time

Instrumental Conditioning Procedures
- Appetitive stimulus: pleasant event
- Aversive stimulus: unpleasant stimulus

Positive Reinforcement vs Positive Punishment
Positive Reinforcement
- Instrumental response produces an appetitive stimulus
- If the response occurs, the appetitive stimulus is presented; if the response does not occur, the appetitive stimulus is not presented
- Positive reinforcement produces an increase in the rate of responding
- Example = A+ for studying
Positive Punishment
- Instrumental response produces an unpleasant stimulus
- Response produces an outcome, but the outcome is aversive
- Effective punishment procedures produce a decrease in the rate of responding
- Example = F for not studying

Negative Reinforcement vs Negative Punishment
Negative Reinforcement
- Instrumental response turns off an aversive stimulus
- Negative contingency between the instrumental response and the aversive stimulus
- Increases instrumental responding
- Example = Opening umbrella to stop rain from getting you wet
Negative Punishment (Omission Training)
- Preferred over positive punishment; does not involve delivering an aversive stimulus
- Negative contingency between the response and an environmental event
- Decreases instrumental responding
- Example = Getting “grounded”
- Differential reinforcement of other behavior (DRO): involves the reinforcement of other behavior

Instrumental Conditioning Procedures

Instrumental Response
- Does instrumental response inevitably produce uniformity or stereotypy?
- Yes, but you can increase variability/creativity by requiring it to earn reinforcement!

Example: Degree of Response Variability
- Participants were reinforced for varying the type of rectangles they drew (VARY) or received reinforcement without any requirement to vary their drawings (YOKED)
- Higher U-values indicate greater response variability
- Lack of creativity can happen but is NOT inevitable
FIGURE 5.8 Degree of response variability along three dimensions of drawing a rectangle (size, shape, and location) for human participants who were reinforced for varying the type of rectangles they drew (VARY) or received reinforcement on the same trials but without any requirement to vary the nature of their drawings (YOKED). Higher values of U indicate greater variability in responding (based on Ross & Neuringer, 2002).

Belongingness
- Explains why certain responses are more easily trained than others
- Certain responses belong with the reinforcer because of evolutionary history
- Marian Breland Bailey
- Example = operating a latch/pulling a string are manipulatory responses that are naturally related to release from confinement but not a puzzle box
- Breland and Breland and the “miserly” raccoon
- Instinctive drift: extra (instinctively performed) responses that develop during food reinforcement

Constraints on Instrumental Conditioning
Is the response part of a behavior system?
- When an animal is food-deprived and is in a situation where it might encounter food, its feeding system becomes activated
- The animal engages in foraging
- We can predict which responses will increase with food reinforcement by studying what animals do when their feeding system is activated in the absence of instrumental conditioning
- Perform a classical conditioning experiment: the CS elicits components of the behavior system activated by the US
- If instinctive drift reflects responses of the behavior system, responses akin to instinctive drift should be evident
“Will I put the money in the bank?”

Instrumental Reinforcer
- Quantity, quality, and what the subject previously received are all important!
- Getting more/better reinforcement soon after behavior is best
- Behavioral contrast: a small reward is treated as especially poor after reinforcement with a large reward, and vice versa
  - Behavioral-contrast effects can occur because of a shift from a prior reward magnitude or because of an anticipated reward
  - Example = “anticipatory negative contrast” may explain why individuals addicted to cocaine derive little satisfaction from conventional reinforcers (a tasty meal) that others enjoy on a daily basis

Response-Reinforcer Relation
Two types of relationships between a response and a reinforcer:
- Temporal relation: time between the response and the reinforcer
  - Temporal contiguity: special case of temporal relation; refers to reinforcer delivery immediately after the response
- Response-reinforcer contingency: extent to which the instrumental response is necessary and sufficient to produce the reinforcer
- Temporal and causal factors are independent of each other!

Response-Reinforcer Relation
Delay of Reinforcement

Delay between R and S*?
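
The point that temporal and causal factors are independent can be illustrated with a small sketch. It is not from the slides: it assumes that S* denotes the reinforcer, uses the commonly used Delta-P index, P(S* | R) - P(S* | no R), as one possible way to quantify contingency, and runs on invented trial data. The `Trial` class, its field names, and both helper functions are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Trial:
    """One observation window in a hypothetical session log."""
    responded: bool                   # did the instrumental response (R) occur?
    reinforced: bool                  # was the reinforcer (assumed to be S*) delivered?
    delay_s: Optional[float] = None   # R-to-S* delay in seconds, if both occurred


def contingency_delta_p(trials: Sequence[Trial]) -> float:
    """Delta-P = P(S* | R) - P(S* | no R); an assumed contingency index."""
    with_r = [t for t in trials if t.responded]
    without_r = [t for t in trials if not t.responded]
    if not with_r or not without_r:
        raise ValueError("need trials both with and without a response")
    p_given_r = sum(t.reinforced for t in with_r) / len(with_r)
    p_given_no_r = sum(t.reinforced for t in without_r) / len(without_r)
    return p_given_r - p_given_no_r


def mean_delay(trials: Sequence[Trial]) -> float:
    """Average R-to-S* delay over trials where the response was reinforced."""
    delays = [t.delay_s for t in trials
              if t.responded and t.reinforced and t.delay_s is not None]
    if not delays:
        raise ValueError("no reinforced responses with a recorded delay")
    return sum(delays) / len(delays)


# Invented data: same contingency, different delays.
immediate = [Trial(True, True, 0.5)] * 4 + [Trial(False, False)] * 4
delayed = [Trial(True, True, 5.0)] * 4 + [Trial(False, False)] * 4

# Invented data: short delay, but the reinforcer is just as likely without R.
noncontingent = (
    [Trial(True, True, 0.5)] * 2 + [Trial(True, False)] * 2
    + [Trial(False, True)] * 2 + [Trial(False, False)] * 2
)

for name, log in [("immediate", immediate), ("delayed", delayed),
                  ("noncontingent", noncontingent)]:
    print(name, contingency_delta_p(log), mean_delay(log))
```

In this toy data the first two logs share the same contingency but differ in delay, while the third has the same short delay as the first but no contingency at all, so either factor can be varied while the other is held constant.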

Best Practices in Response-Reinforcer Relation
- Mark the target instrumental response in some way to distinguish it from other activities (e.g., clicker training in dogs)

Superstition
- Superstitious behavior rests on the idea of “accidental/adventitious” reinforcement
- Skinner: “Temporal contiguity” (rather than response-reinforcer contingency) is the most important factor in instrumental conditioning! NO!
- Terminal responses: species-typical responses that reflect anticipation of food as time draws closer to the next food presentation
  - For example, R1 and R7 (orienting to the food magazine and pecking at something on the magazine wall) occurred more often at the end of the food-food interval than at other times
- Interim responses: reflect other sources of motivation early in the inter-food interval, when food presentation is unlikely

Superstition

Learned Helplessness
- Learned-helplessness effect: exposure to uncontrollable shock disrupts subsequent learning
  - Firmly established; has been replicated in numerous studies
- Learned-helplessness hypothesis: assumes that during exposure to uncontrollable shocks, animals learn that the shocks are independent of their behavior. The expectation of future lack of control undermines their ability to learn a new instrumental response
  - Is provocative and controversial
  - The learning deficit occurs for two reasons:
    - Expectation of lack of control reduces motivation to perform the response
    - Previously learned expectation of lack of control makes learning more difficult

Learned Helplessness

Learned Helplessness
Learned-helplessness hypothesis vs.
- Activity deficit hypothesis: Group Y shows a learning deficit following exposure to inescapable shock because inescapable shocks encourage animals to become inactive or freeze (little support)
- Attention deficit hypothesis: exposure to inescapable shock reduces the extent to which animals pay attention to their own behavior, and that is why they show a learning deficit
- Stimulus relations in escape conditioning: why is exposure to escapable shock not nearly as bad?
  - Instead of focusing on why inescapable shock disrupts subsequent learning, Minor, Dess, & Overmier (1991) asked why exposure to escapable shock is not nearly as bad
  - What is it about the ability to make an escape response that makes exposure to shock less debilitating?

Learned Helplessness
Stimulus relations in escape conditioning: why is exposure to escapable shock not nearly as bad?
“Safety-signal feedback cues are reliably followed by the intertrial interval and hence by the absence of shock. Therefore, such feedback cues can become conditioned inhibitors of fear and limit or inhibit fear elicited by contextual cues of the experimental chamber. (For a discussion of conditioned inhibition, see Chapter 3.) No such safety signals exist for animals given yoked, inescapable shock because for them, shocks and shock-free periods are not predictable. Therefore, contextual cues of the chamber in which shocks are delivered are more likely to become conditioned to elicit fear with inescapable shock.”

Summary
- Early Investigations of Instrumental Conditioning
- Modern Approaches to the Study of Instrumental Conditioning
  - Discrete-Trial Procedures
  - Free-Operant Procedures
- Instrumental Conditioning Procedures
  - Positive/Negative Reinforcement and Punishment
- Fundamental Elements of Instrumental Conditioning
  - Instrumental Response
  - Instrumental Reinforcer
  - Response-Reinforcer Relation
Pro-Tip: Learning to predict stressful events (and when we are safe) is effective in reducing harmful effects of stress