Lecture 3_ Rescorla-Wagner cont'd; some failures of R-W.pptx
Rescorla-Wagner model continued

Reminder slide
ΔVi = α (λ – Vi-1), where α is the CS salience and (λ – Vi-1) is the US surprise.
Learning depends on the 'surprisingness' of the US.
The R-W learning curve is negatively accelerating (asymptote at λ = 1).
Can we use R-W to model:
Effects of CS intensity
Blocking
Overshadowing
Extinction…
[Figure: V and ΔV plotted across trials]

Effects of CS intensity on learning (α parameter in R-W model)
Kamin & Schaub (1963): tone paired with shock
Group Strong CS: Tone = 81 dB
Group Medium CS: Tone = 62 dB
Group Weak CS: Tone = 49 dB
The more intense the CS, the faster the acquisition of the conditioned response.
Any learning theory should account for the effect of the intensity of the CS.

The Rescorla-Wagner Model
ΔVi = α (λ – Vi-1)
To model increased intensity, let's increase αCS from 0.5 to 0.7 and compare:

Trial   α = 0.5                                   α = 0.7
0       ΔV0 = 0                                   ΔV0 = 0
1       ΔV1 = 0.5 (1 – V0) = 0.5;  V1 = 0.5       ΔV1 = 0.7 (1 – V0) = 0.7;  V1 = 0.7
2       ΔV2 = 0.5 (1 – V1) = 0.25; V2 = 0.75      ΔV2 = 0.7 (1 – V1) = 0.21; V2 = 0.91

Higher CS intensity (α) causes a faster increase in associative strength, i.e. a steeper learning curve.

Extinction: effect of reward omission
Extinction: effect of presenting the CS in the absence of the US
Acquisition: tone paired with shock. Extinction: tone alone.
Bouton & King (1983)
In the absence of reward, the conditioned response gradually weakens.
Any learning theory should account for this extinction effect.

Accounting for extinction in the R-W model
λ = maximum conditioning possible for the US; i.e., maximum extinction will be achieved when λ = 0.
We crunch the R-W model 'backwards', with V0 = the maximum value attained during acquisition.
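The single-cue calculations on these slides (acquisition with α = 0.5 versus α = 0.7, and extinction run from the acquired value with λ = 0) can be checked with a few lines of Python. This is only a minimal sketch: the function name rw_single_cue and its parameter names are made up for illustration and are not part of the lecture.

```python
def rw_single_cue(alpha, lam, v_start=0.0, n_trials=5):
    """Return [V0, V1, ..., Vn] for one CS under dV_i = alpha * (lam - V_{i-1})."""
    values = [v_start]
    for _ in range(n_trials):
        delta_v = alpha * (lam - values[-1])  # surprise scaled by CS salience
        values.append(values[-1] + delta_v)
    return values


# Acquisition (US present, lam = 1) for a medium and a stronger CS.
medium = rw_single_cue(alpha=0.5, lam=1.0, n_trials=5)
strong = rw_single_cue(alpha=0.7, lam=1.0, n_trials=5)
print([round(v, 2) for v in medium[:3]])  # [0.0, 0.5, 0.75]
print([round(v, 2) for v in strong[:3]])  # [0.0, 0.7, 0.91]

# Extinction (US absent, lam = 0), starting from the value reached in acquisition.
extinction = rw_single_cue(alpha=0.5, lam=0.0, v_start=medium[-1], n_trials=5)
print([round(v, 2) for v in extinction])  # [0.97, 0.48, 0.24, 0.12, 0.06, 0.03]
```

The same update rule produces both curves; only λ and the starting value differ between acquisition and extinction.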
αCS = 0.5; λ = 0; starting value V0 = V5 from acquisition = 0.96875 ≈ 0.97
General step: ΔVi = 0.5 (0 – Vi-1); Vi = Vi-1 + ΔVi
For Trial 1: ΔV1 = 0.5 (0 – 0.97) = -0.48; V1 = V0 + ΔV1 = 0.97 – 0.48 = 0.48 (computed from the unrounded V0, then rounded), and so on for all trials:

Trial   ΔV      V
0        0.00   0.97
1       -0.48   0.48
2       -0.24   0.24
3       -0.12   0.12
4       -0.06   0.06
5       -0.03   0.03

Cue Competition Effects: Overshadowing and Blocking
[Figure: training and test phases]
In overshadowing, CS1 and CS2 compete for associative strength.

R-W model and overshadowing
Calculate ΔV separately for light and tone.
Calculate surprise as λ – Vtotal, where Vtotal = Vtone + Vlight.
αtone = 0.5, αlight = 0.3:

Trial   ΔVtone   Vtone   ΔVlight   Vlight
0       0.00     0.00    0.00      0.00
1       0.50     0.50    0.30      0.30
2       0.10     0.60    0.06      0.36
3       0.02     0.62    0.01      0.37
4       0.00     0.62    0.00      0.37
5       0.00     0.62    0.00      0.37

Associative strength for tone
ΔVi,tone = αtone (λ – Vi-1,tone – Vi-1,light); Vi,tone = Vi-1,tone + ΔVi,tone
Trial 0: ΔV0,tone = 0; V0,tone = 0
Trial 1: ΔV1,tone = 0.5 (1 – V0,tone – V0,light) = 0.5; V1,tone = 0.5
Trial 2: ΔV2,tone = 0.5 (1 – V1,tone – V1,light) = 0.10; V2,tone = 0.6

Associative strength for light
ΔVi,light = αlight (λ – Vi-1,tone – Vi-1,light); Vi,light = Vi-1,light + ΔVi,light
Trial 0: ΔV0,light = 0; V0,light = 0
Trial 1: ΔV1,light = 0.3 (1 – V0,tone – V0,light) = 0.3; V1,light = 0.3
Trial 2: ΔV2,light = 0.3 (1 – V1,tone – V1,light) = 0.06; V2,light = 0.36

Compare to the classical study on overshadowing by Mackintosh (1976).
In every trial, the surprise of the US presentation depends on the animal's 'expectation' of the US given all the CSs that are present: Vtotal = Vtone + Vlight.
Low Vtotal will make the US surprising, whereas high Vtotal will make the US 'unsurprising'.
R-W correctly predicts that stimuli compete for predictive value, and that higher associative strength (V) accrues to the higher-intensity (or higher-salience) stimulus.

Blocking: pre-training with one CS blocks learning about a second CS presented in a compound training session
Phase 1: CS1 → US
Phase 2: CS1 + CS2 → US
Test: CS2 alone; the test shows impaired learning for CS2.
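The overshadowing calculation above, in which both cues share the single error term λ – Vtotal, can be sketched as follows. The helper name rw_compound and the dictionary layout are illustrative assumptions; the α values are the ones given on the slide.

```python
def rw_compound(alphas, lam, n_trials=5):
    """Train all cues in `alphas` (cue name -> salience) together.

    Returns a list of dicts with each cue's V after every trial
    (index 0 holds the starting values, all zero).
    """
    v = {cue: 0.0 for cue in alphas}
    history = [dict(v)]
    for _ in range(n_trials):
        error = lam - sum(v.values())  # shared surprise: lam - Vtotal
        for cue, alpha in alphas.items():
            v[cue] += alpha * error    # each cue scales the same error by its own alpha
        history.append(dict(v))
    return history


history = rw_compound({"tone": 0.5, "light": 0.3}, lam=1.0, n_trials=5)
for trial, v in enumerate(history):
    print(trial, round(v["tone"], 2), round(v["light"], 2))
# final row: 5 0.62 0.37
```

Because the error is computed once per trial from the summed strengths, the more salient tone ends up with the larger share of the total associative strength.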
R-W and blocking
Phase 1: standard V, ΔV, etc., calculations for CS1. αlight = 0.3, 10 trials; crunch the numbers… Vlight,10 = 0.97

R-W and blocking
Phase 2: as in overshadowing, but with initial Vlight = 0.97 (Vlight,10 from Phase 1).
Let's see what happens if the preconditioning phase did not result in 'complete' learning (e.g., by dropping light intensity).

Summary
The R-W model assumes that learning will take place only when the US is surprising. Based on this basic assumption, the model accounts for a wide range of phenomena, including:
Gradual learning and negatively accelerated curves
Extinction
Stimulus competition (Overshadowing)
Stimulus competition (Blocking)

Some failures of the R-W model

Assessment of the Rescorla-Wagner model
Successfully accounts for                            Fails to account for (among others)
Conditioning                                         Spontaneous recovery
Overshadowing                                        Latent inhibition
Blocking
Conditioned inhibition (we didn't model this one)
Extinction

Wagner's model (1976; 1981)
variations in US & CS processing
role of context

Assessment of Wagner's model
Habituation
Latent inhibition

R-W model recap and importance of US 'surprisingness'
Increases in "associative strength" (V) depend on the CS and the US being processed conjointly in short-term memory (STM).
α = "CS intensity" (R-W model assumes α is constant)
λ = max learning possible for this US; for simplicity:
– λ = 1 when US present
– λ = 0 when US absent
ΔV = 'change' in V
ΔVi = α (λ – Vi-1), where (λ – Vi-1) is the 'surprise' term
V < λ: positive surprise!
V = λ: no surprise
V > λ: negative surprise

Jamie found a yellow food cover… and under the cover, an apple.
If V < λ, positive surprise: (λ – V) = (1 – 0) = 1
This sequence was repeated many times… so Jamie learnt that the cover always hides a delicious apple.
If V = λ, surprise is low: (λ – V) = (1 – 1) = 0
One day he found the tray and he was expecting to find an apple, but… to his surprise, the apple was not there!
If V > λ, negative surprise: (λ – V) = (0 – 1) = -1
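The two-phase blocking calculation from earlier in this lecture (Phase 1 with the light alone, Phase 2 with the light + tone compound) can be sketched as below. Several things here are assumptions rather than slide content: the helper names, αtone = 0.5 (taken over from the overshadowing example), 10 trials in Phase 2, and αlight = 0.1 as a stand-in for the 'dropped light intensity' case.

```python
def rw_compound(alphas, lam, v_start, n_trials):
    """Train all cues in `alphas` together; every cue shares the error lam - Vtotal."""
    v = dict(v_start)
    for _ in range(n_trials):
        error = lam - sum(v.values())
        for cue, alpha in alphas.items():
            v[cue] += alpha * error
    return v


def blocking(alpha_light, alpha_tone=0.5, n_trials=10):
    """Return (Vlight after Phase 1, Vtone after Phase 2)."""
    # Phase 1: light alone, paired with the US (lam = 1).
    phase1 = rw_compound({"light": alpha_light}, 1.0, {"light": 0.0}, n_trials)
    # Phase 2: light + tone compound with the US; the tone starts from zero.
    start = {"light": phase1["light"], "tone": 0.0}
    phase2 = rw_compound({"light": alpha_light, "tone": alpha_tone}, 1.0, start, n_trials)
    return phase1["light"], phase2["tone"]


v_light_full, v_tone_blocked = blocking(alpha_light=0.3)
v_light_weak, v_tone_partial = blocking(alpha_light=0.1)  # hypothetical dimmer light
print(round(v_light_full, 2), round(v_tone_blocked, 2))   # 0.97 0.02
print(round(v_light_weak, 2), round(v_tone_partial, 2))   # 0.65 0.29
```

With near-complete Phase 1 learning the US is barely surprising in Phase 2, so the tone gains almost nothing; with incomplete Phase 1 learning some surprise remains and the tone picks up part of it.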
The Rescorla-Wagner model, extinction reminder
For extinction, set λ = 0
ΔVi = α (λ – Vi-1); Vi = Vi-1 + ΔVi
For this example, αCS = 0.8
Conditioning (Trial 10): V9 = 1; ΔV10 = 0.8 (1 – 1) = 0.0; V10 = V9 + ΔV10 = 1 + 0 = 1
Extinction (Trial 11): ΔV11 = 0.8 (0 – 1) = -0.8; V11 = 1 + (-0.8) = 0.2
Extinction (Trial 12): ΔV12 = 0.8 (0 – 0.2) = -0.16; V12 = 0.2 + (-0.16) = 0.04
Extinction (Trial 15): ΔV15 = 0.8 (0 – V14) ≈ 0; V15 ≈ 0

Let's crunch the following across trials (acquisition, then extinction):
surprise (λ – V)
change in associative strength, increment or decrement (ΔV)
associative strength (V)
[Figure: surprise (λ – V), ΔV and V plotted across acquisition and extinction trials]

Problems with the R-W model I: Spontaneous Recovery
The R-W model accounts for extinction by allowing for negative surprise, resulting in total 'unlearning' when V reaches 0.
However, extinction ≠ 'unlearning'.
R-W fails to account for spontaneous recovery after extinction.
R-W also fails to account for 'facilitated reacquisition' after extinction.

Interference interpretation of spontaneous recovery (e.g., Bouton, 1991)
Acquisition and extinction memories compete for the control of behaviour.
Extinction: an inhibitory CS → no-US association is superimposed at test onto the previous excitatory CS → US learning.
Spontaneous recovery: retrievability of the inhibitory memory fades faster than that of the excitatory memory (which presumably fades very slowly).

Retrospective revaluation
Retrospective revaluation changes the response to an absent cue by changing the status of a companion cue.
The R-W model cannot account for this, although later R-W variants can.

Unblocking
Blocking is impaired when the reward magnitude is unexpectedly reduced during the compound training phase.
The surprising downshift in reward enhances attention to the light cue, which now carries informational value.
Observed result: the light cue becomes a strong positive predictor of reward.
R-W prediction: because of the disappointing reward downshift, the model predicts negative associative strength for the previously unexperienced light cue, making it a conditioned inhibitor.
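The R-W prediction for unblocking described above can be illustrated numerically. This is a sketch under assumed values that do not come from the slides: the pretrained cue has α = 0.5, the added light has α = 0.3, each phase lasts 10 trials, and the reduced reward is modelled as λ = 0.5. The helper name rw_compound is likewise made up for illustration.

```python
def rw_compound(alphas, lam, v_start, n_trials):
    """Train all cues in `alphas` together; every cue shares the error lam - Vtotal."""
    v = dict(v_start)
    for _ in range(n_trials):
        error = lam - sum(v.values())
        for cue, alpha in alphas.items():
            v[cue] += alpha * error
    return v


# Phase 1: pretrained cue alone with the full reward (lam = 1).
phase1 = rw_compound({"pretrained": 0.5}, 1.0, {"pretrained": 0.0}, 10)

# Phase 2: pretrained cue + new light, reward unexpectedly reduced.
# lam = 0.5 is a hypothetical stand-in for the smaller reward.
start = {"pretrained": phase1["pretrained"], "light": 0.0}
phase2 = rw_compound({"pretrained": 0.5, "light": 0.3}, 0.5, start, 10)

print(round(phase2["light"], 2))  # -0.19
```

Since Vtotal exceeds the new λ at the start of Phase 2, the shared error is negative on every compound trial, so the light can only lose strength from zero. That is the inhibitory prediction which conflicts with the observed result.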
Sensory preconditioning
Sensory cue results in conditioning to the tone.
The R-W model cannot explain this, because no change in the associative strength of either stimulus occurs during the 'preconditioning' phase.

Latent inhibition
Pre-exposure to the to-be-CS impairs conditioning.
Lubow (1965)
[Figure: % leg flexion across blocks of 20 trials (1 to 4), Control vs. Pre-exposure groups; CS = light]
Something happens during the pre-exposure to the to-be-CS that impairs later conditioning.

The Rescorla-Wagner model fails to predict latent inhibition
Pre-exposure phase: λ = 0 (US never presented)
Conditioning phase: λ = 1 (US is present)
(αCS = 0.8 for this example)
ΔVi = α (λ – Vi-1); Vi = Vi-1 + ΔVi
Pre-exposure Trial 1: ΔV1 = 0.8 (0 – 0) = 0.0; V1 = V0 + ΔV1 = 0 + 0 = 0
Pre-exposure Trial 1000: ΔV1000 = 0.8 (0 – 0) = 0.0; V1000 = V999 + ΔV1000 = 0 + 0 = 0
V is still 0 when conditioning starts, so the model predicts normal conditioning after pre-exposure.
[Figure: model output across the pre-exposure and conditioning phases]

Next time …
Accounting for latent inhibition using Wagner's 'Sometimes Opponent Process' model
More recent attentional models of learning
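The latent inhibition failure described in this lecture can be checked directly: under R-W, 1000 pre-exposure trials with λ = 0 leave V at 0, so the conditioning curve is identical to that of a control group. A minimal sketch, using α = 0.8 from the slide; the function name and the choice of four conditioning trials are illustrative assumptions.

```python
def rw_single_cue(alpha, lam, v_start=0.0, n_trials=5):
    """Return [V0, V1, ..., Vn] for one CS under dV_i = alpha * (lam - V_{i-1})."""
    values = [v_start]
    for _ in range(n_trials):
        values.append(values[-1] + alpha * (lam - values[-1]))
    return values


ALPHA = 0.8

# Pre-exposure: CS alone, US never presented (lam = 0), starting from V = 0.
v_after_preexposure = rw_single_cue(ALPHA, lam=0.0, n_trials=1000)[-1]
print(v_after_preexposure)  # 0.0

# Conditioning (lam = 1) for the control and the pre-exposed group.
control = rw_single_cue(ALPHA, lam=1.0, v_start=0.0, n_trials=4)
preexposed = rw_single_cue(ALPHA, lam=1.0, v_start=v_after_preexposure, n_trials=4)
print(control == preexposed)  # True
```

The model has no state other than V, so it has nothing in which to store the effect of pre-exposure; the slower conditioning in the pre-exposed group is not captured by this update rule.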