Lecture 2_ Pavlovian Phenomena & the Rescorla-Wagner Model.pptx
Conditions of Learning (cont’d)
Significance of S-S learning studies
Sensory preconditioning experiments (e.g., Rizley & Rescorla, 1972) challenge the notion that learning can only occur in the form of S-R associations.
– Learning can result in the establishment of S-R associations (‘Law of Effect’).
– But it can also involve the establishment of associations between stimuli (S-S).
But can animals use other associative structures?
– Outcome-Response (Goal-Directed Behaviour)
– Stimulus-Response-Outcome

Outcome-response (O-R) or goal-directed theory
Colwill & Rescorla (1985): “Outcome Devaluation Procedure”
– Phase 1: chain pull → sucrose; lever press → food pellet
– Phase 2: sucrose → sickness
– Test: chain pull or lever press?
Associative structure: O1 → Response1; O2 → Response2
Theories exist that distinguish between R-O and O-R associative structures, but we won’t go into this (de Wit & Dickinson, 2009).
[Diagram: (S), (O), (R)]
Extensive training or specific reinforcement protocols (random interval) bias learning towards an S-R (habitual) representation.

Demonstrating S-O-R association using Pavlovian instrumental transfer (PIT)
– S-R learning cannot account for PIT because the action (R) was not paired with the Pavlovian cue before the transfer test.
– S-O-R account: the Pavlovian cue (S) activates a representation of the outcome (O), and this activation excites the appropriate action (R) associated with the outcome.

Hierarchical Associations: Goal-Directed Behaviour Modulated by a Discriminative Stimulus
Colwill & Rescorla (1990)
[Figure: responses per minute (0 to 30) for R1 in the presence of the LIGHT and the TONE]

Summary: According to the Law of Effect, conditioning depends on stimulus-response (S-R) associations. This is a very narrow view.
Animals encode other relationships between events, including:
– Stimuli (S-S; Rizley & Rescorla, 1972)
– Responses and outcomes (O-R; Colwill & Rescorla, 1985): the basis of Goal-Directed Behaviour
– Hierarchical relationships (S1-(R1-O1); S2-(R1-O2); Colwill & Rescorla, 1990): the basis of Goal-Directed Behaviour Modulated by Discriminative Stimuli

The three questions to be asked about any learning phenomenon
– What are the conditions that bring learning about?
– What is learned?
– How does learning affect behaviour?
[Photo: Robert A. Rescorla]

Learning is not performance
Silent learning: sometimes learning does not result in a behavioural change (e.g., sensory preconditioning).
Also, learning can affect the behaviour of an individual in many different ways, depending on:
– the CS-US relationship
– the nature of the US
– the nature of the CS
Behaviour may reflect:
– Stimulus substitution
– Preparatory response (e.g., eye blink to avoid an air puff)
– Compensatory response (e.g., conditioned drug tolerance)

Influence of the CS-US relationship on the CR
Wasserman, Franklin & Hearst (1974)
[Figure: approach / withdrawal scores across blocks of 4 sessions (1 to 5) for three groups: CS → US, CS / US, CS → no US]
– Positive relationship (CS → US): approach CR
– No relationship (CS / US): no behavioural change
– Negative relationship (CS → no US): withdrawal

Influence of the US on the CR: stimulus substitution
Responding to the CS “as if” it were a substitute for the reward US.
Male pigeons peck a key that signals food as if eating it, and a key that signals water as if sipping it (Moore, 1969).
http://www.youtube.com/watch?v=50EmqiYC9Xw
The CR depends on the nature of the reinforcer:
– Food US: CR = pecking at the disk with open beak and closed eyes
– Water US: CR = pecking at the disk with closed beak and open eyes

Stimulus substitution
Rats bite and lick a bar that predicts food (Peterson et al., 1972).
https://www.youtube.com/watch?v=bEDoNJAwarE
A raccoon could not be trained to insert a coin into a money box for food: it treated the coin as food and would not release it into the box (Breland & Breland, 1961).
https://www.youtube.com/watch?v=D52sQJEQI0Y

A second rat used as a CS (Timberlake & Grant, 1975)
– Stimulus substitution would predict biting the CS rat.
– The CR was not biting or salivating: the participant rat approached the CS rat and performed social contact and solicitation.

Sign tracking vs. goal tracking
Sign tracking: attentional bias toward the reward-paired stimulus (CS), rather than the goal itself (reward US).
What determines the balance? Individual differences:
– Sign-trackers respond more to the CS’s incentive properties
– Goal-trackers focus more on the CS’s predictive properties
Link to drug abuse: sign-trackers show a heightened response to drug-related cues.
[Figure: goal tracker vs. sign tracker; labels: response lever acting as cue, reward port]

Compensatory Conditioned Responses
The CR may be the opposite of the UR.
Noxious substances
– We already learned about conditioned taste aversion.
– But what happens if the noxious substance is ingested chronically?
– Cues predicting the US (the noxious substance) cause a CR that aims to counteract the effects of the noxious substance (the US).
– Later we’ll talk about a related concept called an ‘opponent process’.

Conditioned drug tolerance theory (Shepard Siegel)
– Conditioned hyperalgesia: enhanced pain sensitivity in a morphine-paired environment (paw licking after touching a hot surface); Siegel, Hinson, & Krank (1978)
– Mortality in mice from a high dose of morphine after repeated morphine pretreatments
Interview with Shepard Siegel: https://www.youtube.com/watch?v=g0pc_ihQ6sY
Often, S-R learning is not a mindless, ‘jump the gun’ phenomenon. Its purpose is to optimize the organism’s interaction with the upcoming unconditioned stimulus.

Associative Learning Theories: introducing the Rescorla-Wagner model
Can the model explain basic Pavlovian phenomena?
– Shape of learning curve
– Different types of learning paradigms
– Overshadowing
– Blocking
– Extinction
[Photo: Robert A. Rescorla]

(1) Shape of the learning curve
Example CS → US pairings: tone → air puff; light → food; tone → shock; flavour → sickness
The learning curve is gradual and negatively accelerated.

(2) Overshadowing in Pavlovian conditioning
Mackintosh (1976)
– Compound CS (light + tone) followed by a shock US
– The tone CS varies in intensity across different groups
[Figure: test results for groups with tone intensities of 85 dB, 75 dB, 60 dB and 50 dB; photo: Nick Mackintosh]
Stimuli compete for predictive value when presented in a compound preceding the US. The one with higher intensity (or salience) is likely to overshadow the weaker one.

(3) Kamin blocking (Leon Kamin)
CS1 → US, followed by CS1 + CS2 → US
A test of CS2 shows impaired learning for CS2.
The low informative value of the tone (CS2), which is redundant, and the lack of surprise of the US seem to be of more relevance.

Classical Conditioning and the Rescorla-Wagner Model
CS = Tone, US = Food
Co-presentation of the CS and US allows joint processing; the relationship is encoded in memory.
The CS-US relationship is encoded as V, the value of the association between the CS and the US.
[Figure: real-world events (CS = tone, US = food) mapped onto CS and US nodes in a conceptual nervous system, linked by V]
V = “association strength”, 0 ≤ V ≤ 1

What factors cause changes in associative strength (∆V)?
∆V (pronounced ‘delta V’) = ‘change’ in V
Increases in “association strength” (∆V) depend on the CS and US being processed conjointly:
∆V = CS processing × US processing
Let’s call CS processing α:
– α = “CS intensity” (or salience); ranges from 0 to 1
– In the R-W model we assume α is constant
∆V = α × US processing
US processing depends on the ‘surprisingness’ of its presentation:
– λ = maximum conditioning possible for the US
– When V ≠ λ: the US is not predicted, so the US is processed
– When V = λ: no surprise
[Figure: V (associative strength), with λ marked]
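
The rule ∆V = α × US processing can be tried out numerically. The following is a minimal Python sketch, not the lecturer’s own material. It assumes the standard Rescorla-Wagner error term, in which US processing on a trial equals λ minus the summed V of every CS present (the slides have not spelled this term out at this point), and it fixes the US learning-rate parameter β at 1. The function names `rw_trial` and `run_phase`, the stimulus labels and all α values are hypothetical choices made for illustration.

```python
def rw_trial(V, present, alphas, lam):
    """Apply one Rescorla-Wagner update to every CS present on the trial.

    V       : dict mapping CS name -> current associative strength
    present : list of CS names presented on this trial
    alphas  : dict mapping CS name -> salience (alpha, between 0 and 1)
    lam     : lambda for this trial (1.0 if the US occurs, 0.0 if omitted)
    """
    # Assumption: US processing ("surprise") is lambda minus the summed
    # prediction of all CSs present, the standard R-W error term.
    surprise = lam - sum(V[cs] for cs in present)
    for cs in present:
        V[cs] += alphas[cs] * surprise  # delta V = alpha * US processing
    return V


def run_phase(V, present, alphas, lam, n_trials):
    """Run n_trials identical trials; return a snapshot of V after each one."""
    history = []
    for _ in range(n_trials):
        rw_trial(V, present, alphas, lam)
        history.append(dict(V))
    return history


# Illustrative salience values (hypothetical, not taken from the slides).
equal = {"CS1": 0.3, "CS2": 0.3}
unequal = {"CS1": 0.5, "CS2": 0.1}  # CS1 is the more intense stimulus

# (1) Acquisition: CS1 -> US. Gains shrink as V approaches lambda.
V = {"CS1": 0.0, "CS2": 0.0}
curve = [snap["CS1"] for snap in run_phase(V, ["CS1"], equal, 1.0, 10)]
gains = [after - before for before, after in zip([0.0] + curve, curve)]

# Extinction: CS1 without the US (lambda = 0) drives V back toward 0.
run_phase(V, ["CS1"], equal, 0.0, 10)
extinguished = V["CS1"]

# (2) Overshadowing: the compound CS1 + CS2 -> US with unequal salience.
over = {"CS1": 0.0, "CS2": 0.0}
run_phase(over, ["CS1", "CS2"], unequal, 1.0, 30)

# (3) Blocking: CS1 -> US first, then CS1 + CS2 -> US.
blocked = {"CS1": 0.0, "CS2": 0.0}
run_phase(blocked, ["CS1"], equal, 1.0, 20)
run_phase(blocked, ["CS1", "CS2"], equal, 1.0, 20)

# Control group: compound training only, no CS1 pretraining.
control = {"CS1": 0.0, "CS2": 0.0}
run_phase(control, ["CS1", "CS2"], equal, 1.0, 20)

print("acquisition curve:", [round(v, 3) for v in curve])
print("after extinction:", round(extinguished, 3))
print("overshadowing:", {k: round(v, 3) for k, v in over.items()})
print("CS2 blocked vs control:", round(blocked["CS2"], 4), round(control["CS2"], 4))
```

Under these assumptions the per-trial gains in `gains` get smaller on every trial, which is the gradual, negatively accelerated curve described in (1). In the blocking group CS1 already predicts the US almost fully by phase 2, so little surprise is left and CS2 gains almost nothing, whereas CS2 in the control group picks up a substantial share of V.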