PY2111 Lecture 10 - Part 2 - PDF

Part 2 RECAP

Lecture 1 Recap
OVERVIEW OF HUMAN AND ANIMAL LEARNING AND BEHAVIOUR

What is “Behaviour”?
Any outward or inward response to a stimulus.
Includes actions, speech/language, feelings, and thoughts.
All behaviour is elicited by a stimulus, seen or unseen, conscious or unconscious.

What Causes Changes in Behaviour?
2 main sources
◦ Evolution (nature)
◦ Learning (nurture)

Evolution
Darwin’s Theory of Evolution
1. Populations have the potential to increase exponentially if there are enough resources.
2. Resources are limited.
3. Competition for resources leads to a struggle amongst individuals.
4. There is variability within a species due to genetic mutations or random chance.
5. This variability is often heritable.
Individuals with physical or psychological advantages to adapt to their environment will gain more resources, reproduce more, and their genes will be passed down more frequently than individuals who are less well-adapted to their environment.

Learning & Experience
Learning is an enduring change in the mechanisms of behaviour based on prior experience.
◦ Can occur consciously or unconsciously.
◦ Can occur with or without immediate changes to behaviour.
◦ Can be classified as either associative or non-associative (more on this in later segments).
Learning is often stimulus- and response-specific.
Understanding the relationship between learning and behaviour allows us to better predict and modulate human behaviour.
Works in combination with evolutionary adaptations.

Evolution-Dominant Behaviours
[Figure: reflex arc]

Reflexes vs. Modal Action Patterns
Reflexes
◦ Single action.
Modal action patterns
◦ Response sequences that are typical of a particular species.
◦ Examples: mating/courtship rituals, defensive action patterns.

Repeated Stimulation
Repeatedly experiencing the same stimulus has two effects:
◦ Habituation: Decline in responding
◦ Sensitisation: Increase in responding
These changes are considered non-associative learning.

Spontaneous Recovery vs. Dishabituation
Spontaneous Recovery
◦ Imposing a retention interval should result in spontaneous recovery of the response.
Dishabituation
◦ Presenting a different/novel stimulus should result in a recovery (dishabituation) of the response.

Dual-Process Theory
Groves & Thompson (1970)
Addresses underlying factors responsible for behavioural habituation and sensitisation.
◦ Note that this theory separates underlying processes from outward behavioural changes.
Assumes that different types of underlying neural processes are responsible for increases and decreases in responsiveness.
◦ Habituation process (occurs in the S-R system)
◦ Sensitisation process (occurs in the state system)
Not mutually exclusive.
◦ Behavioural output depends on the relative strength of each process.
◦ Thus, habituation and sensitisation effects are both due to habituation and sensitisation processes, with the sum determining outward behaviour.

Opponent Process Theory
Primary process, also known as the a process.
◦ Responsible for quality of the emotional state that occurs in presence of the stimulus.
Opponent process, aka b process.
◦ Generates the opposite emotional reaction.
◦ Lags behind primary emotional disturbance.
[Figure: manifest affective response (a - b, the observed emotional reaction) shown above the underlying opponent processes a and b, around a stimulus event]

After extensive stimulus exposure:
Primary reaction becomes weaker; after-reaction becomes stronger.
[Figure: the same plot after many trials, with the manifest affective response (a - b) and the underlying opponent processes a and b around a stimulus event]

Can be applied to conditioned drug tolerance
◦ A form of associative learning
◦ Stimulus cues get paired with stimulus effects
◦ Stimulus cues = conditioned stimulus (CS)
◦ Stimulus effects = unconditioned stimulus (US)
◦ Natural response to the stimulus = unconditioned response (UR)
◦ After repeated CS to US pairings (i.e., CS-US), the b process activates upon presentation of the CS alone
◦ The b process is the body’s attempt to maintain homeostasis in anticipation of the UR upon gaining the US
◦ Tolerance to the stimulus forms as a result, requiring more of the stimulus to achieve initial UR levels
◦ The learned association can be interrupted by inserting novel stimuli, such as a change in environment
◦ Novel stimuli, however, can disrupt the body’s attempt at homeostasis, resulting in an overdose

Lecture 2 Recap
CLASSICAL CONDITIONING I

Classical Conditioning
Unconditioned Stimulus (US): naturally provokes a response.
Unconditioned Response (UR): naturally elicited by a US.
Conditioned Stimulus (CS): initially neutral, and later paired with the US.
Conditioned Response (CR): learned response to the CS.
Examples (US → UR): Food → Salivation; Sex → Arousal; Shock → Startle.

Excitatory Conditioning (Acquisition)
Acquisition = CS is followed by an outcome.
Excitatory associations = CS predicts the occurrence of an outcome.
Excitatory CS = a CS that predicts the occurrence of the US.

Excitatory Conditioning Procedures
Importance of timing and stimulus duration.
◦ Inter-stimulus interval (ISI)
◦ Inter-trial interval (ITI)
Order of stimulus presentations.
[Figure: timeline of repeated CS-US trials, marking the ISI within a trial and the ITI between trials]

Excitatory Conditioning
Organisms learn the temporal relationship between CS and US, but this learning is not always directly measurable when testing the CS alone.
There is a distinction between knowing and performing.
[Figure: CS and US timing in delay, trace, long-delay, simultaneous, and backward conditioning]

Measuring CRs
Quantifying behaviour
◦ Magnitude
  ◦ E.g., amount drunk, amount of conditioned emotional responding (CER) suppression.
◦ Probability
  ◦ E.g., likelihood of blinking, human conditioning studies.
◦ Latency
  ◦ Amount of time to respond or stop suppressing (RT).
Not always convergent.

Inhibitory Conditioning (Extinction)
Extinction = stop reinforcing a previously excitatory CS.
Inhibitory associations = CS predicts the absence of the outcome.
Slower to learn.
◦ Eventually, subject stops producing a CR to the CS.

Inhibitory Conditioning Procedures
Generally speaking, there are five different types of inhibitory conditioning paradigms:

Procedure                              Design
Pavlovian Conditioned Inhibition (CI)  A+ / AX-
Differential CI                        A+ / X-
Explicitly Unpaired CI                 + / X-
Inhibition of Delay                    X+
Backward CI                            +X
[Figure: timing of the conditioned excitor (CE), conditioned inhibitor (CI), and US in each procedure]

Measuring CI
Two conventional tests
◦ Negative summation test
◦ Retardation test

Group  Phase 1   Phase 2  Summation Test  Retardation Test
Grp1   A+/AB-    X+       XB → cr         B+, then B → cr
Grp2   A+/AB-    X+       X → CR
Grp3   A+/AC-    X+       XB → CR         B+, then B → CR
(cr = weak responding; CR = strong responding.)
A = training excitor; X = transfer excitor.
Grp3 is included to show stimulus specificity (generalisation decrement control).

Experimental Conditioning Paradigms
Aversive conditioning
Appetitive conditioning
Eyeblink conditioning
Sign and goal tracking
Conditioned taste aversion

Lecture 3 Recap
CLASSICAL CONDITIONING II

Contiguity

Contingency (Necessity)
              Outcome present   Outcome absent
CS present           a                 b
CS absent            c                 d
[Timeline labels on each design slide: Before, During, After conditioning]

Contingency (Necessity): CS pre-exposure
Design: Gp Ctrl, X → CR; Gp Exp, X → cr.
Gp Exp reduces attention to CS = weaker responding to the CS at test.

Contingency (Necessity): US pre-exposure
Design: Gp Ctrl, X → CR; Gp Exp, X → cr.
Gp Exp habituates to the US = weaker responding to the CS at test.
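
The 2 x 2 contingency table above can be summarised as a single number. The following is a minimal sketch, assuming the standard ΔP index, P(outcome | CS) - P(outcome | no CS) = a/(a+b) - c/(c+d). The slides give the cells a to d but do not name a formula, so the index, the function name `delta_p`, and the trial counts are all assumptions made for illustration.

```python
def delta_p(a: int, b: int, c: int, d: int) -> float:
    """Contingency between CS and outcome from the 2 x 2 table cells.

    a: CS present, outcome present    b: CS present, outcome absent
    c: CS absent,  outcome present    d: CS absent,  outcome absent
    Assumption: contingency is indexed by delta-P (not stated on the slides).
    """
    p_outcome_given_cs = a / (a + b)
    p_outcome_given_no_cs = c / (c + d)
    return p_outcome_given_cs - p_outcome_given_no_cs


# Hypothetical trial counts, for illustration only.
control = delta_p(a=10, b=0, c=0, d=10)          # CS and US always paired
cs_preexposed = delta_p(a=10, b=10, c=0, d=10)   # extra CS-alone trials fill cell b
us_preexposed = delta_p(a=10, b=0, c=10, d=10)   # extra US-alone trials fill cell c

print(control, cs_preexposed, us_preexposed)
```

In this toy example both pre-exposure manipulations lower ΔP relative to the control. The slides attribute the weaker responding to reduced attention and to US habituation; ΔP only describes the degraded statistical relation between CS and outcome.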
Contingency (Necessity)
              Outcome present   Outcome absent
CS present           a                 b
CS absent            c                 d
[Timeline labels on each design slide: Before, During, After conditioning]

Contingency (Necessity): Partial reinforcement
Design: Gp Ctrl, X → CR; Gp Exp, X → cr.
Note: Must equate total number of US presentations for both groups.
Gp Exp has partial extinction on CS- trials = weaker responding to the CS at test.

Contingency (Necessity): Degraded contingency
Design: Gp Ctrl, X → CR; Gp Exp, X → cr.
Note: Must equate total number of US presentations for both groups.
Gp Exp habituates to the US = weaker responding to the CS at test.

Contingency (Necessity): CS post-exposure
Design: Gp Ctrl, X → CR; Gp Exp, X → cr.
Gp Exp undergoes procedural EXTINCTION = weaker responding to the CS at test.

Contingency (Necessity): US post-exposure
Design: Gp Ctrl, X → CR; Gp Exp, X → cr.
Gp Exp experiences retroactive interference (repeated presentations of the US alone interfere with the originally learned CS-US association) = weakened responding to the CS at test.

Overshadowing and Blocking
Overshadowing
Group     Acquisition  Test
OV        AX+          X → cr
Ctrl      X+           X → CR

Blocking
Group     Phase 1  Phase 2  Test
Exp       A+       AX+      X → cr
Acq Ctrl  A+       X+       X → CR

Novelty
Latent inhibition (CS pre-exposure) and US pre-exposure
Group     Pre-exposure  Acquisition  Test
LI        X-            X+           X → cr
Ctrl      A-            X+           X → CR

Group     Pre-exposure  Acquisition  Test
US-Pre    +             X+           X → cr
Ctrl      -             X+           X → CR

Latent inhibition is context specific (the numbers 1 and 2 mark the context)
Group     Pre-exposure  Acquisition  Test
LI-Diff   (X-)1         (X+)2        (X)2 → CR
LI-Same   (X-)2         (X+)2        (X)2 → cr

Novelty
Latent inhibition (CS pre-exposure) and US pre-exposure
◦ Not truly inhibitory.
◦ The CS does not pass negative summation test (but does pass retardation test).
Group       Pre-exposure  Acquisition  Test
NegSum      X-            B+           XB → CR
NegSumCtrl  X-            B+           B → CR
Ret         X-            X+           X → cr
RetCtrl     A-            X+           X → CR
◦ Attention deficit explanation.
◦ Protective benefit from irrelevant stimuli and later acquisition of phobias.

CS-US Relevance or Belongingness
Garcia and Koelling’s bright and noisy (and flavored) water experiment
Conditioning                       Test
Flavor + audiovisual → shock       Flavor; Audiovisual
Flavor + audiovisual → sickness    Flavor; Audiovisual
[Figure: licks/min to the flavor and audiovisual cues after sickness vs. shock conditioning]
Phobias and fear conditioning
Evolutionarily based genetic predispositions

Higher-Order Conditioning
Learning without a direct CS-US experience.

Second Order Conditioning
Group  First Order Cond  Second Order Cond  Test
SOC    A+                XA                 X → CR
Ctrl   A+                X/A                X → cr
[Diagram: SOC CS → FOC CS → US → CR]
SOC is the result of a simple S-R association.

Sensory Preconditioning
Same as SOC but Phases 1 and 2 are switched.

Occasion Setting
Occasion setting refers to the ability of an event (occasion setter) to modulate the association between a CS-US pair.
[Diagram: occasion setter acting on the link between the conditioned stimulus (CS) and the unconditioned stimulus (US)]

Stimulus Control by Contextual Cues
Contexts can serve as conditioned excitors and conditioned inhibitors that can pass retardation and summation tests.
Contextual cues do not have to directly signal reinforcement/non-reinforcement to gain behavioural control.
◦ They can also control behaviour through occasion setting, just like a discrete cue.

Lecture 4 Recap
ASSOCIATIVE LEARNING THEORIES

Bush-Mosteller Model
Local error reduction rule: ΔV = (learning rate) × (λ − Vn)
λ refers to the total amount of conditioning that a US can support on a given trial.
V refers to the associative strength of the particular cue in question.
n refers to trial number.
(λ − V) = predictive error
Note: λ = 1 when US is present; λ = 0 when US is absent.

Rescorla-Wagner Model
Total error reduction rule: ΔV = αβ(λ − ΣVn)
ΣV (also noted as VT or V(present cues)) refers to the total associative strength of all CSs present on trial number n.
α is the learning rate parameter that refers to the associability of the CS.
β is the learning rate parameter that refers to the associability of the US.
◦ Associability roughly corresponds to salience or intensity.
◦ 0 < α < 1
◦ 0 < β < 1

Problems with Rescorla-Wagner Model
Spontaneous recovery
[Figure: responding across Acquisition, Extinction, Retention interval, and Test]

Problems with Rescorla-Wagner Model
Latent inhibition (CS pre-exposure)
Group  Preexp  Acq  Test (observed)  Predicted by R-W model
Exp    X-      X+   X → cr           CR
Con    Y-      X+   X → CR

Problems with Rescorla-Wagner Model
Extinction of inhibition
Group  Pav CI Train  CI Ext  Test (observed)  Predicted by R-W model
Exp    A+ / AX-      X-      X → cr           CR
Con    A+ / AX-      -       X → cr
R-W model during Phase CI Ext:
Trial 1: ΔV = αβ(0 − [−1.0]) → predictive error is 1.0

Problems with Rescorla-Wagner Model
Nonreinforcement of a neutral cue in the presence of a conditioned inhibitor should make the cue excitatory
Group  PavCI Train  Acq  Test (observed)  Predicted by R-W model
Exp    A+ / AX-     XY-  Y → cr           CR
Con    A+ / AX-     Y-   Y → cr
R-W model during Phase Acq: ΔV = αβ(λ − [X + Y]n)
Trial 1:  ΔV = αβ(0 − [−1.0 + 0.0]) → predictive error is 1.0
Trial 2:  ΔV = αβ(0 − [−1.0 + 0.1]) → predictive error is 0.9
Trial 3:  ΔV = αβ(0 − [−1.0 + 0.2]) → predictive error is 0.8
...
Trial 30: ΔV = αβ(0 − [−1.0 + 1.0]) → predictive error is 0.0

Problems with Rescorla-Wagner Model
Retrospective revaluation: change in response to the target CS as a function of manipulating the associative status of a related CS.
◦ E.g., recovery from blocking
Group  Phase 1  Phase 2  Phase 3  Test
Block  A+       AX+      -        X → cr
Con    A+       BX+      -        X → CR
Rec    A+       AX+      A-       X → CR (not predicted by R-W model)
R-W model during Phase 2:
Trial 30: ΔV = αβ(1 − [0.8 + 0.2]) → predictive error is 0.0

Comparator Process (Miller’s Comparator Hypothesis)
We treat contextual cues as a comparator stimulus that indirectly invokes the cognitive representation of the US.
[Diagram: the target CS at test (X) directly activates a US representation via Link 1; X also activates the comparator stimulus representation (context) via Link 2, which indirectly activates a US representation via Link 3; the comparator process compares the two to produce the response]
Link 2 x Link 3 = Comparator term

Models of Attention
Pearce & Hall (1980)
Proposed that attention is determined by how surprising the US was on the preceding trial. More surprising means more attention is paid on the subsequent trial.
Mackintosh (1975)
Proposed that attention increases to cues that are reliable predictors of the US.

Models of Attention
Hogarth, Dickinson, & Duka (2011)
Looking for action: attention that a stimulus commands after it has become a good predictor of the US and can generate a CR with minimal cognitive effort.
◦ Similar to Mackintosh’s attentional mechanism.
Looking for learning: attention that is involved in processing cues that are not yet good predictors of the US and therefore have much to be learned about.
◦ Similar to Pearce & Hall’s attentional mechanism.
Looking for liking: attention that stimuli command because of their emotional value.

Timing Models
The CS-US interval (ISI) is important for strength and rate of learning and CR (think about trace conditioning).
◦ When something occurs is as important as what occurs.
The intertrial interval (ITI) can influence conditioning.
◦ Generally, longer ITIs lead to better conditioning than massed trials (i.e., short ITIs).
ITI and ISI interact to determine responding.
◦ Responding is determined by the length of the ISI relative to the length of the ITI.
◦ Not absolute value.

Lecture 5 Recap
OPERANT CONDITIONING I

Instrumental Conditioning
Learning about outcomes that are contingent upon the organism’s behaviour.
A response is elicited with the purpose of producing or avoiding the outcome.

Thorndike’s Law of Effect
Responses made in the presence of a stimulus that are followed by a pleasant reward result in a stronger association between the stimulus and the response.
◦ More responding
Responses made in the presence of the stimulus that are followed by a noxious outcome (punishment) result in the S-R association being weakened.
◦ Less responding

Instrumental Conditioning Procedures
Positive = Instrumental response produces outcome.
Negative = Instrumental response removes outcome.
Reinforcement = rewarding outcome.
Punishment = aversive outcome.

               Positive                  Negative                                       Effect
Reinforcement  (Positive) Reinforcement  Escape or Avoidance (Negative reinforcement)   Increases IR
Punishment     (Positive) Punishment     Omission training (Negative punishment)        Decreases IR

Discrete-Trial Procedures
Clear and distinct beginning and end of trial.
Instrumental response is performed only once, and that ends the trial.
Often used with rats and running mazes.
◦ Measures running speed and response latency.

Free-Operant Procedures
Invented by B. F. Skinner.
Organism is allowed to make operant response repeatedly and without constraint before being removed from the apparatus.
Allows for the study of behaviour in a more continuous manner.
◦ Able to observe chains of behaviours.

Operant Conditioning
Organisms learn to emit a particular behaviour in the presence of a particular stimulus to receive a reward.
The stimulus that signals the contingency between a response and the outcome is called a discriminative stimulus.
A discriminative stimulus that signals a positive contingency between the response and the outcome is S+.
A discriminative stimulus that indicates a negative contingency between the response and the outcome is S-.

Shaping Behaviours
Response shaping involves the establishment of behaviour through a sequence of reinforced approximations and nonreinforcement of earlier response forms.
Careful balance of reinforcement and withholding.
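
The four instrumental procedures from the "Instrumental Conditioning Procedures" slide in this lecture recap can be written as a small lookup. This is a minimal sketch: the key labels ("produces"/"removes", "appetitive"/"aversive") and the function name `classify_procedure` are hypothetical, and "removes" is assumed to cover both removing and preventing the stimulus.

```python
# (what the response does to the stimulus, kind of stimulus) -> (procedure, effect on the IR)
PROCEDURES = {
    ("produces", "appetitive"): ("positive reinforcement", "increases"),
    ("removes", "aversive"): ("negative reinforcement (escape or avoidance)", "increases"),
    ("produces", "aversive"): ("positive punishment", "decreases"),
    ("removes", "appetitive"): ("negative punishment (omission training)", "decreases"),
}


def classify_procedure(contingency: str, stimulus: str) -> tuple:
    """Return (procedure name, effect on the instrumental response)."""
    return PROCEDURES[(contingency, stimulus)]


# Example: a lever press that switches off a shock.
name, effect = classify_procedure("removes", "aversive")
print(f"{name}: {effect} the instrumental response")
```

Both reinforcement cells increase the instrumental response (IR) and both punishment cells decrease it, whichever way the contingency runs.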
Instrumental Reinforcer
Quality and quantity
◦ Better rewards result in better performance.
◦ But responding is also influenced by the response requirement.
[Figure: mean number of reinforcers earned as a function of response requirement (5 to 40) for okay, good, and best USs]

Instrumental Reinforcer
Quality and quantity
◦ Perceived quality and quantity influenced by organism’s previous experience with that reinforcer.
◦ Similar to the idea of expectation driving excitation and inhibition.
[Figure: running speed across trials for groups L-S, L-L, S-L, and S-S, showing positive and negative contrast]

Response-Reinforcer Relation
Temporal relation
◦ Refers to time between response and reinforcer.
◦ Immediate reinforcement > delayed reinforcement, partly due to ambiguity as to cause of outcome.
◦ Techniques to reduce ambiguity or misattribution of cause:
  ◦ Provide secondary, or conditioned, reinforcer (e.g., clicker training, verbal reinforcement).
  ◦ Marking procedure.

Response-Reinforcer Relation
Response-reinforcer contingency
◦ Extent to which the instrumental response is necessary and sufficient to produce the reinforcer.
◦ Perfect contingency not necessary.
◦ Perceived contingency is what is important, not actual contingency.

Learned-Helplessness Effect
Seligman, Overmier, & Maier
◦ Application to clinical depression.
◦ Triadic design using shuttle apparatus.
Group  Exposure                  Conditioning               Result
E      Escapable shock           Escape-avoidance training  Rapid avoidance learning
Y      Yoked inescapable shock   Escape-avoidance training  Slow avoidance learning
R      Restricted to apparatus   Escape-avoidance training  Rapid avoidance learning

Learned Helplessness Hypothesis
Animal perceives a zero contingency relationship and assumes that future reinforcers will also be independent of its behaviour.
This undermines ability to learn new instrumental responses.
Learning deficit due to reduced motivation to perform an instrumental response and deficient ability to learn that behaviour is now effective.

Alternative Hypotheses
Activity deficit hypothesis
◦ Specific to producing movement.
◦ Uncontrollable shocks disrupt escape learning in shuttle box due to freezing response but facilitate eyeblink conditioning.
Attention deficit hypothesis
◦ Exposure to inescapable shock reduces the extent to which animals pay attention to their own behaviour.
◦ This causes a learning deficit.

Lecture 6 Recap
OPERANT CONDITIONING II

Simple Schedules of Reinforcement
          Fixed                                          Variable
Ratio     Reinforced every nth trial                     Reinforced on average after n trials
Interval  Reinforced every nth second/minute/hour, etc.  Reinforced on average after n seconds/minutes/hours, etc.
[Figure: response records with post-reinforcement pauses marked]

Choice Behaviour

Associative Structure of Instrumental Conditioning
Three-term contingency:
◦ Contextual stimulus (S)
◦ Instrumental response (R)
◦ Reinforcing or response outcome (O)
[Diagram: S, R, and O]
S-R association (law of effect)
◦ The role of the reinforcer is simply to “stamp in” or strengthen the S-R association.
◦ The basic motivation to perform the conditioned instrumental response is activation of the S-R association via exposure to the pertinent contextual stimuli, in the presence of which the response was previously reinforced.
◦ Research on S-R association has resurged due to interest in habits, habitual drug taking, and other automatised behaviours.

Two-Process Theory
Two types of learning.
◦ Pavlovian and instrumental.
During conditioning, the organism is reinforced for performing an instrumental response in the presence of a stimulus.
◦ This allows the stimulus to become directly associated with the outcome (S-O association).
The S-O association activates an emotional state, positive or negative, that motivates the instrumental behaviour.

Two-Process Theory
Test the two-process theory using a Pavlovian instrumental transfer (PIT) experiment.
Phase 1: Instrumental conditioning. Lever press → food
Phase 2: Pavlovian conditioning. Tone → food
Transfer test: Present Pavlovian CS during performance of the instrumental response. Lever press → food, with or without tone
If a Pavlovian S-O association motivates instrumental behaviour, then the rate of lever pressing should increase when the tone is present.
Example: lever pressing for food decreases in presence of CS+ for footshock.

Associative Structure of Instrumental Conditioning
Direct R-O associations
◦ Not formally included in the Two-Process Theory, but evidence of R-O associations comes from devaluation studies.
If an instrumental response is motivated by an R-O association, devaluation of the reinforcer should reduce the rate of the instrumental response.

Associative Structure of Instrumental Conditioning
S(R-O) association
◦ S activates both R and the R-O association when it is present, which motivates behaviour.

Premack Principle
Premack noted that responses that accompany commonly used reinforcers involve activities that individuals are highly likely to perform.
However, the instrumental responses are typically low-probability activities.
Premack proposed that given two responses of different likelihoods, H and L, the opportunity to perform the higher probability response (H) after the lower probability response (L) will result in reinforcement of response L.
◦ The opportunity to perform response L after response H will not result in reinforcement of response H.

Lecture 7 Recap
EXTINCTION, RECOVERY AND AVOIDANCE

Reducing Responding
Extinction
◦ In Classical Conditioning, the CS is no longer followed by the US.
◦ In Instrumental Conditioning, the CR in the presence of the CS is no longer followed by the US.
◦ Likened to the opposite of acquisition (think Rescorla-Wagner).
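
To see why extinction is "likened to the opposite of acquisition" under Rescorla-Wagner, here is a minimal sketch of the rule ΔV = αβ(λ − ΣV) for a single CS, with λ = 1 on reinforced trials and λ = 0 on nonreinforced trials. The parameter values (α = 0.5, β = 0.4, 30 trials per phase) and the function names are hypothetical, chosen only for illustration.

```python
def rescorla_wagner(v: float, lam: float, alpha: float, beta: float) -> float:
    """Return V for a single CS after one trial.

    Applies delta_V = alpha * beta * (lam - V); with one CS present, sum(V) is just V.
    """
    return v + alpha * beta * (lam - v)


def run_phase(v: float, lam: float, n_trials: int,
              alpha: float = 0.5, beta: float = 0.4) -> list:
    """Run n_trials with a fixed lambda and return V after each trial."""
    history = []
    for _ in range(n_trials):
        v = rescorla_wagner(v, lam, alpha, beta)
        history.append(v)
    return history


# Acquisition: CS followed by the US on every trial (lambda = 1).
acquisition = run_phase(v=0.0, lam=1.0, n_trials=30)
# Extinction: CS presented alone (lambda = 0), starting from the acquired V.
extinction = run_phase(v=acquisition[-1], lam=0.0, n_trials=30)

print(f"V after acquisition: {acquisition[-1]:.3f}")
print(f"V after extinction:  {extinction[-1]:.3f}")
```

In this sketch V climbs toward λ during acquisition and falls back toward 0 during extinction, so the model treats extinction as a reversal of the original learning. That is the reason spontaneous recovery is listed as a problem for the Rescorla-Wagner model in the Lecture 4 recap.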
Interference
◦ Outcome interference: A-B, A-C
◦ Cue interference: A-B, C-B
Punishment
◦ Thorndike’s Law of Effect

Partial Reinforcement Extinction Effect (PREE)
Slower rate of extinction following partial reinforcement relative to continuous reinforcement (CRF). Why?
Discrimination hypothesis
◦ Change from partial reinforcement to extinction is difficult to perceive relative to change from CRF to extinction.
◦ Therefore, behaviour will continue longer after partial reinforcement due to an inability to discriminate the new contingency.
Generalisation decrement hypothesis
◦ Organisms continue responding because the situation is similar to one in which they responded in the past and were reinforced.
Frustration theory
◦ Organisms learn to persist in the face of frustration.

What is Happening During Extinction?
Interference vs. Forgetting vs. Unlearning
Unlearning is an active process by which the CS becomes unassociated with the US. In other words, the negative contingency between the CS and US erases the excitatory association.
Forgetting is a passive process that reduces responding due to degradation of the memory trace that comes about through passage of time. Explicit nonreinforcement of the CS or the CR is not required for forgetting to reduce responding.
Interference is an active process in which new competing inhibitory associations interfere with expressing old excitatory associations.

Interference
Proactive interference: Phase 1 reduces ability to recall learned information from Phase 2.
Retroactive interference: Phase 2 reduces ability to recall learned information from Phase 1.

Interference
Cue interference
[Diagram: X → Outcome ← Y]
Outcome interference
[Diagram: O1 ← X → O2]

Interference
What type of interference is EXTINCTION?
Phase 1  Phase 2  Test     Type
X+       X-       X → cr   Retroactive outcome interference
X+       X-       X → CR   Proactive outcome interference

Recovery from Extinction
Extinction learning is not permanent.
Even if there is total cessation of responding at end of extinction training, the conditioned response (in Classical Conditioning or Instrumental Conditioning) is often observed to return.
Thus, extinction does not produce erasure or unlearning, and reduced responding at the end of extinction is not the result of forgetting.
Extinction results in new learning of an inhibitory S-R association that suppresses responding whenever S is present.
Consistent with observation that extinction behaviour is specific to contextual cues in which response was extinguished.
◦ Changing contexts reduces the similarity of the learning phases and reduces interference.

Spontaneous Recovery
Return of responding after a period of time has passed during which no training of the CS or US occurred.

Renewal
Recovery of responding when the subject is tested outside of the extinction context. This can mean being tested:
◦ Back in the original training context (ABA renewal).
◦ In an entirely novel context (ABC renewal & AAB renewal).

Reinstatement

Facilitated Reacquisition
Rapid reacquisition of an extinguished response relative to acquiring responding to a novel cue.

Bouton’s (1993) Theory of Retrieval
Based on interference research where extinction is an example of retroactive interference.
Extinction is an example of Retroactive Outcome Interference, in which the second-learned X-O2 association disrupts retrieval of the X-O1 association.
Phase 1  Phase 2   Test
X-US     X-noUS    X? (X-US or X-noUS?) → X-noUS
When the test context is not similar to the extinction context, then Proactive Outcome Interference is observed.
Phase 1  Phase 2   Test
X-US     X-noUS    X? → X-US
Same mechanism to explain spontaneous recovery, renewal, and reinstatement.
Context = TIME & SPACE.

Enhancing Extinction
Enhance learning during extinction.
◦ Trial spacing and session spacing.
◦ Massive extinction.
◦ Deepened extinction.
Enhance retrieval of extinction memory.
◦ Extinction in multiple contexts.
◦ Increase context similarity.
◦ Reminder cues.

Punishment
Punishment involves the presentation of an aversive event as a result of producing a target instrumental response.
Intended to suppress occurrence of target instrumental response that would otherwise occur with relatively high frequency.
Laboratory studies of punishment typically begin by pairing the target instrumental response with a positive reinforcement to encourage high rate of responding.
Degree of suppression observed is determined by variables related to presentation of punisher and variables related to availability of positive reinforcement.
The actual punisher can take many forms.
◦ Remember that punishment can be positive or negative.
◦ Punishment does not have to be physically aversive (e.g., shock) to be punishing.
◦ Point deductions or time outs are also effective punishment.

Variables Affecting Effectiveness of Punishment
Intensity
◦ More intense punishment is generally more effective at suppressing behaviour, up to a point.
◦ Behaviour suppressed with low-intensity punishers often recovers due to habituation, even with continued punishment.
◦ Escalating punishment is less effective than initially severe punishment, even if the final intensity of punishment is the same.
◦ Resistance.
Variables Affecting Effectiveness of Punishment
Schedules of punishment
◦ Punishment need not be administered after every instrumental response to be effective in suppressing behaviour.
◦ Degree of response suppression produced by punishment depends on the proportion of responses that are punished.
◦ If behaviour is maintained by FR schedule, punishment increases the length of the postreinforcement pause but has little effect on ratio run.
[Figure: cumulative responses under no punishment (NP) and under several FR punishment schedules (FR 1, FR 100, and higher)]

Variables Affecting Effectiveness of Punishment
Contingency
◦ Stronger contingency leads to more effective behaviour suppression.
◦ Goodall compared lever-press responding in the presence of a punishing cue (IC group) and a yoked conditioned suppression cue (CC group), both trained with a mild US.
◦ What does this say about practicality though for reducing behaviour?
[Figure: mean suppression ratio across days for the CER and PUN conditions]

Variables Affecting Effectiveness of Punishment
Contiguity
◦ Decreasing contiguity by adding delay between instrumental responding and punishment reduces effectiveness of punisher to suppress behaviour.
◦ Important for practical applications.
Just wait until your father gets home…

Variables Affecting Effectiveness of Punishment
Presence of positive reinforcement
◦ Despite receiving punishment, instrumental response may continue if organism receives positive reinforcement (extrinsic or intrinsic) for performing the instrumental response.
◦ Example: drug addictions, S&M
◦ Pelloux et al. used rats to model drug addiction with simultaneous shock punishment and cocaine reinforcement.
  ◦ IR 1: drug-seeking lever; VI 120-s reinforced with drug-taking lever
  ◦ IR 2: drug-taking lever; FR 1 reinforced with cocaine hit.
  ◦ Moderate or extensive training.
  ◦ Punishment procedure introduced on half trials with shock and no cocaine.
[Figure: seeking responses at baseline and under punishment for the unpunished (control), moderate cocaine, extended cocaine (sensitive), and extended cocaine (resistant) groups]
Variables Affecting Effectiveness of Punishment
Discriminative stimulus
◦ Presence indicates that the instrumental response will be punished.
◦ Same thing as positive occasion setter.
◦ Example: dog whisperer
◦ Punishment as a discriminative stimulus for positive reinforcement (e.g., attention from parents) may result in increase in instrumental responding (misbehaviour) rather than suppression.

Theories of Punishment
Conditioned emotional response theory of punishment (Estes)
◦ Estes proposed that punishment suppresses behaviour through the same mechanism that produces response suppression to a fear-conditioned stimulus.
◦ Various stimuli an individual experiences just before making the punished response serve as the CS that signals the delivery of the US.
◦ Behavioural suppression occurs primarily because a fear-conditioned stimulus elicits freezing, which then interferes with other activities.
◦ Explains observation that more intense and longer shocks produce more suppression by assuming that these aversive events elicit more vigorous conditioned emotional responses.
◦ Also accounts for response-contingency effect.

Theories of Punishment
Negative law of effect
◦ Originated with Thorndike’s idea that positive reinforcement strengthens behaviour and punishment weakens it.
◦ Negative law of effect opposite to positive law of effect.
◦ Punishment caused a reduction in sensitivity to relative reinforcement rates.
◦ Punishment was 3x more effective than reinforcement.
◦ Consistent with general literature showing that negative events or emotions are much stronger than positive events or emotions.

Reducing Responding
Punishment
◦ Involves a positive contingency between the instrumental response and the aversive outcome.
◦ Suppresses instrumental responding.
◦ Passive avoidance.
Avoidance
◦ Requires individuals to make a specific instrumental response to prevent an aversive stimulus from occurring.
◦ Increases occurrence of instrumental behaviour.
◦ Active avoidance.
◦ Simple classical conditioning without opportunity to avoid shock does not produce avoidance behaviour.

Lecture 8 Recap
STIMULUS DISCRIMINATION AND GENERALISATION

Pearce’s (1987) Configural Processing Model
Each individual learning situation is a single configured representation that consists of the CS and the context.
Responding occurs to stimuli as a function of how similar they are to the CS.
[Figure: after A+ training, responding (CR or cr) to test stimuli A, AB, and Y depends on their similarity to A]

Learning Factors in Stimulus Control
Stimulus Discrimination Training in classical conditioning: CS+ / CS-
The CS+ is excitatory and predicts the presentation of the outcome.
The CS- is inhibitory and predicts the absence of the outcome.
[Figure: CR over time to the CS+ and the CS-]

Learning Factors in Stimulus Control
Stimulus Discrimination Training in instrumental conditioning: S+ / S-
Making the operant response in the presence of the S+ will be followed by reinforcement.
Making the operant response in the presence of the S- will not be followed by reinforcement.
[Figure: CR over time to the S+ (CS+) and the S- (CS-)]

Learning Factors in Stimulus Control
Experience affects your ability to perceive and identify elements of a complex stimulus and whether you will attend to them.

Stimulus Generalisation Gradient

Interactions Between S+ and S-
Intradimensional discrimination produces a peak-shift effect when S+ and S- are very similar.
Demonstrated by Hanson (1959) using pigeons and wavelength CSs (in nanometers; nm).
[Figure: responding across the stimulus dimension with S+ and S- marked; negative summation]

Explanation of Peak-Shift Effect
[Figure: responding around S+ and S-; negative summation of the two produces the peak-shift effect]

Acquired Equivalence
Acquired equivalence promotes generalisation to other stimuli that belong to the same class.
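
The negative-summation explanation of the peak-shift effect given earlier in this lecture recap can be illustrated numerically. This is a minimal sketch, assuming bell-shaped (Gaussian) excitatory and inhibitory gradients of equal width; the gradient shape, the wavelengths, and every parameter value are hypothetical and are not taken from Hanson (1959).

```python
import math


def gaussian(x: float, centre: float, width: float) -> float:
    """Bell-shaped generalisation gradient centred on a trained stimulus."""
    return math.exp(-((x - centre) ** 2) / (2 * width ** 2))


def net_responding(x: float, s_plus: float, s_minus: float,
                   width: float = 15.0, inhibition: float = 0.6) -> float:
    """Excitatory gradient around S+ minus a scaled inhibitory gradient around S-."""
    return gaussian(x, s_plus, width) - inhibition * gaussian(x, s_minus, width)


# Hypothetical wavelengths in nm, for illustration only.
S_PLUS, S_MINUS = 550, 560
wavelengths = range(500, 601)
peak = max(wavelengths, key=lambda x: net_responding(x, S_PLUS, S_MINUS))

print(f"S+ = {S_PLUS} nm, S- = {S_MINUS} nm, peak responding at {peak} nm")
```

Because inhibition from S- subtracts more from responding at S+ than from stimuli on the far side of S+, the maximum of the summed gradient lands away from S-, at a wavelength below S+ in this sketch. Widening the gap between S+ and S- in the code shrinks the shift, consistent with the slide's point that the effect appears when S+ and S- are very similar.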
Lecture 9 Recap
EVALUATIVE CONDITIONING

Acquisition of Likes and Dislikes via Associative Processes
Classical conditioning
◦ CS → US
◦ DV: changes in preparatory/defensive behaviours (reflexes)
Evaluative conditioning
◦ Neutral stimulus (CS) → Affective stimulus (US)
◦ DV: change in valence of neutral stimulus

Generality of Evaluative Conditioning
Visual domain
Gustatory domain
Cross-modal domain
◦ Boundaries of evaluative conditioning
Biologically significant USs
◦ Affective priming task

Functional Characteristics of Evaluative Conditioning
Extinction
Statistical contingency
Contingency awareness
Counterconditioning
Postacquisition revaluation

Models of Evaluative Conditioning
Conceptual-categorisation account
◦ Davey (1994)
Holistic account
◦ Martin and Levey (1970s)
Referential account
◦ Baeyens et al. (1992)
◦ Signal vs. referential

Questions?
