Affective Neuroscience for MCQs
Contents

Affective Neuroscience Overview
Approach and Avoidance Motivation
Approach Motivation and the Reward System
Outputs of Reward Systems
Function of the reward system
What is Reward?
Discovery of the Reward System
Dopamine Pathways
Dopamine and Reward: example studies
1. Psychopharmacological Studies
DA Agonists and Antagonists: Powell et al., 1996 (PPIs & REWRESP)
2. Neuroimaging Studies
Monetary Incentive Delay (MID) task: Knutson et al., 2005
Wilson et al., 2018: whole-brain meta-analysis – many brain areas involved in the task
Reward Processing Theories of DA
Reward Liking Theory ("the anhedonia hypothesis"): Wise et al., 1978
Salamone & Correa (2012)
Problems with the Reward Liking Theory
Studies Challenging the Liking Theory: Martinez, 2004; Ikemoto & Panksepp, 1996; Pecina, Smith and Berridge, 2006
Reward Wanting Theory
Animal Studies (Berridge & Robinson, 1998)
Human Studies
Effort-Based Paradigms
Effort-based reward task demonstrating the role of DA in reward wanting in humans (in mitigating effort costs): Chong et al., 2015
Reward Size vs Amount of Effort: size of reward (rather than effort) predicts the magnitude of phasic DA release
Other problems with the reward wanting theory
Reward Learning Theory
Reinforcement
Reward-Prediction-Error Signalling: Schultz, 1998
The Reward Value of Information: an extension of the reward-prediction-error idea
Animal Studies: Bromberg-Martin & Hikosaka, 2009 – Information Choice Task
Phasic DA release activity in reward-related events
Human Studies: EEG as a Distal Signature of Phasic DA Activity
EEG Index of Reward Prediction Error
Event-Related Potentials (ERPs)
The Reward Positivity (RewP, aka FRN; fERN; MFN): Proudfit, 2015
Sambrook & Goslin, 2015: meta-analysis – RewP predicted by expected value (size x predictability)
Potts et al. (2006) Gambling Task
Other Evidence of a Link Between Reward Positivity and DA
Other Ways to Translate Findings from Animal Studies to Humans
The Reward Value of Information
Dan Bennett et al., 2016: Information Choice Task – paying a cost for reward-related information
Brydevall et al., 2018: Information Seeking Task
Smillie, Bennett et al., 2021
Coda: Reward Wanting
Incentive Salience Hypothesis: Saunders & Robinson (2012)
Sign- vs goal-tracking paradigm
Reconciling the Learning and Wanting Literature?
Summary
Key Terms
Summary of Key Findings
Seminar Readings
Treadway et al., 2012: Dopamine Mechanisms of Individual Differences in Human Effort-Based Decision-Making
Kaack et al., 2020: Exploring approach motivation: Correlating self-report, frontal asymmetry, and performance in the Effort Expenditure for Rewards Task
Hubbard et al., 2016: A General Benevolence Dimension that Links Neural, Psychological, Economic, and Life-Span Data on Altruistic Tendencies

From subject manual: This lecture will provide a brief overview of the field of affective neuroscience—the study of the neural underpinnings of emotion and motivation—and take a deep dive into one of the richest topics in this literature, concerning the motivational functions of dopamine. The focus will be on the different methods that have been used over the years to compare and contrast evidence for the various theories of dopamine that have been advanced over the last 50 years. Luke Smillie (5 September 2023)

Overview

Approach motivation and the "reward system"
Functions of dopamine (DA)

Affective neuroscience is the study of the "emotional systems of the mammalian brain". The name was coined by Jaak Panksepp to distinguish the field from other areas of neuroscience, notably cognitive neuroscience. It covers the neuroscience of emotion, mood, affect, motivation, feelings, and personality (topics which, in the mid-20th century, were seen as pseudo-scientific).
Origins within:
Behaviourism
Evolutionary Psychology (when hypothesising about the functions of emotions, adaptiveness is often at the centre of theorising, e.g., why might we feel jealousy? Why are we more anxious when we see a snake than a car?)
(Animal) Behavioural Neuroscience
The methods particularly draw on old behaviourist paradigms.

Approach and Avoidance Motivation

Fundamental affective-behavioural repertoires that involve the direction and energisation of behaviour toward reward (a stimulus theorised to be accompanied by desire or wanting) or away from threat (accompanied by negatively valenced feelings/emotions, such as anxiety).
Characterised in terms of 'fear' and 'desire' – because we are testing animals, we don't really know what they are feeling; these labels refer to the implied significance of a behavioural response.
These systems are fundamentally important, as all organisms with some form of nervous system have (reward) approach and (threat) avoidance capacities (Elliot & Covington, 2001; Jones & Gosling, 2008; Schneirla, 1959): most things we study are built upon the fundamental approach and avoidance systems.
Amoebae – protoplasm flow.
Complex organisms display complex behavioural repertoires around approach and avoidance motivation, e.g., rats – dithering, oscillating, freezing, scanning, sniffing, exploration, etc.

Approach Motivation and the Reward System

Some argue there are no defined modules for distinct systems in the brain – that it is all interconnected and fluid – but others, for heuristic purposes, refer to different systems/modules, e.g.:
Behavioural Approach/Activation System (Gray)
Behavioural Facilitation System (Depue)
Seeking System (Panksepp)
The central idea is that these systems are activated by rewards and have certain outputs, i.e., the focus is upon the various outputs of reward systems:

Outputs of Reward Systems

Animal literature – "vigorous investigatory sniffing, exploration, foraging… and predatory urges" (Panksepp & Moskal, 2008)
Human literature – desire, hunger, effort, curiosity, interest, addiction, anhedonia, aggression (Elliot, 2008)

Function of the reward system

To move the organism up a spatiotemporal gradient (e.g., forward in time or across space) towards reward.

What is Reward?

"…a general term that refers to all aspects of appetitive learning, motivation, and emotion… so broad as to be essentially meaningless." — Salamone & Correa (2012). (Luke thinks this is quite harsh, but we do need to be clear about how we define reward.)
A few definitions: a 'reward' is…
Anything that aids survival/reproduction ('innate'/'primary' rewards – food and sex). Limited: a lot of rewards of interest are only secondary to these functions – they might have a distal association with a primary reward but are not directly linked to reproduction and food.
From cybernetics – the study of self-regulating, goal-directed systems: a reward is anything that serves as an attractor or goal, i.e., something that elicits approach motivation/behaviour. Circular: approach motivation is, in turn, defined as whatever energises behaviour towards reward.
Anything for which an organism will 'work' or pay a cost; anything it is willing to expend effort/resources to obtain. This is the most accepted definition: a lot of the ways reward is operationalised in the literature fit it, and it makes intuitive sense – if you are not willing to work for something, or give something up to obtain it, its value to you is questionable (i.e., it is arguably not a reward).
Discovery of the Reward System

James Olds et al. (1950s) wanted to identify where in the brain reinforcement was happening. The logic was to take a rat and insert an electrode into its brain, delivering a small current to stimulate the release of neurotransmitters in particular brain areas, and to look for a site the rat would find reinforcing to stimulate. What area of the brain would rats be motivated to stimulate?
Self-stimulation studies… the tail end of behaviourism.
There was high motivation to self-stimulate when the electrode was around the nucleus accumbens (NA), the ventral part of the striatum.
The NA has since been viewed as one of the core structures of the reward system.
The NA receives ascending dopaminergic projections from the midbrain (the ventral tegmental area), making it a target of major branches of the dopamine system.
Rats would pay a clear cost to receive the stimulation – effort; they would run across an electrified grid to get to the lever.
This is clear evidence that the NA is involved in some way in reward-directed responses; a key clue was that it receives ascending dopaminergic projections.
It is generally well established that DA release in the nucleus accumbens is triggered in animals by food, sex, and drugs of abuse, and in humans by sexual imagery, cocaine, and drug stimuli – i.e., by rewards (Berridge & Robinson, 1998; Redouté et al., 2000; Verma, 2015).
The dopamine transporter is responsible for recycling DA in the synaptic cleft (pumping excess DA out of the cleft). Cocaine stops the transporter recycling ('mopping up') DA from the synaptic cleft, so cocaine acts as an (indirect) DA agonist.

Dopamine Pathways

The figure gives a reasonable approximation of the reward system (you will see different versions). The ventral striatum is basically the nucleus accumbens. The striatum and nucleus accumbens are the structures mainly spoken about in the literature.
Green, blue and red lines are different ascending DA pathways:
Red: the nigrostriatal pathway runs from the substantia nigra (top of the brain stem) up to the caudate (part of the dorsal striatum).
Green: the mesolimbic pathway runs from the VTA to the NA.
Blue: the mesocortical pathway runs from the VTA to prefrontal and limbic areas, including the NA.

Dopamine and Reward: example studies

Evidence that DA neurons are involved in reward systems comes principally from:
Psychopharmacological studies
Neuroimaging studies

1. Psychopharmacological Studies

DA Agonists and Antagonists: Powell et al., 1996 (PPIs & REWRESP)
Can a dopamine agonist, bromocriptine (which stimulates D2 receptors), aid in the treatment of brain-injury-related motivational anhedonia? Anhedonia is a lack of desire to do what one usually wants to do – often a symptom of depression and brain injury.
The study looked at the effect of bromocriptine upon various clinical ratings of motivation: Percentage Participation Index (PPI), Motivation, Spontaneity, Engagement (a clinical rating of how much a patient engages in discussion).
It also included a behavioural measure of reward-responsiveness (REWRESP): a simple task in which the participant has to sort cards at a particular rate. They do it three times, and on the middle trial they get a monetary reward for sorting speed – a simple measure of how much one increases psychomotor speed when it is linked to a reward, i.e., a simple behavioural reward test (the 'carrot test').
Five assessments:
Two baseline assessments 14–21 days prior to administration of the bromocriptine
One at maximum dosage during treatment
Two post-withdrawal assessments, more than 14 days after withdrawal of treatment
Everything goes up on the drug. REWRESP is the reward-responsiveness ('carrot') task. This is one study suggesting a role for DA in reward-related behaviour.

2. Neuroimaging Studies

Monetary Incentive Delay (MID) task: Knutson et al., 2005
This task is often used in the neuroimaging context.
Each trial starts with a cue that signals both the probability AND the size of the reward that might be delivered on that trial. In this example, the bar height (how close it is to the top of the circle) indicates the size of the reward; the vertical bar (how far it is to the right) indicates the likelihood of reward. Participants go through practice trials so that they become good at perceiving the cues quickly; the cues need to be simple.
After the cue there is a delay (2 seconds), then a target to which participants must respond quickly; if they respond in time, they get the monetary reward. The task is performed many hundreds of times in a scanner. Researchers are often interested in BOLD activation during reward anticipation, and might also look at activation at the outcome/evaluation stage and responses to the received reward.
Knutson et al. (2005) showed activation during reward anticipation as a function of (i.e., predicted by) expected value (probability x magnitude): activation in the nucleus accumbens and various parts of the midbrain.

Wilson et al., 2018: whole-brain meta-analysis – many brain areas involved in the task

A more recent whole-brain meta-analysis of 15 datasets, designed to overcome underpowered studies. The key conclusion was that many areas are involved in the task, with the strongest responses in the caudate, putamen and nucleus accumbens when comparing reward and neutral outcomes.

Reward Processing Theories of DA

What is dopamine actually doing? How should this process be described? Three broad accounts:
'Liking' theories (Wise, 1978): DA's role in reward is to generate good feelings – pleasure, euphoria, hedonia.
'Seeking' or 'wanting' theories (Berridge & Robinson, 1998; Panksepp, 1998): DA's role is to energise behaviour and mitigate the effort costs associated with pursuing reward (remember, a reward is something we are willing to expend effort or pay a cost to gain).
This account is more tied to the approach/motivation idea.
'Learning' theories (e.g., Schultz, 1998): DA's role in reward is to learn predictions about rewards – the formation of associations and habits.
(Alcaro, Huber & Panksepp, 2007; Wise, 2004)

Reward Liking Theory ("the anhedonia hypothesis"): Wise et al., 1978

Dopamine generates reward-related pleasure or hedonia. On this view, dopamine deficiency in Major Depressive Disorder and Parkinson's Disease results in anhedonia.
Classic study: Wise and colleagues (1978) blocked DA using the antagonist pimozide (which blocks D2 DA receptors).
Four groups of rats were trained to run to a food reward cup (which becomes familiar to them). Next came two test days, with the rats split into 4 groups:
Reward as usual
Reward + 0.5 mg/kg pimozide
Reward + 1.0 mg/kg pimozide
No reward
There were 8 trial runs over the 2 test days; the plots show running time.
On Day 1, the no-reward group was beginning to run less smoothly – less engaged (it is hard to say 'less motivated').
By Day 2 the dose-dependent effect is visible: slowest for no reward, then the larger dose of the DA blocker, then the smaller dose, through to reward as usual with no DA blocker.
The DA antagonist appears to take the pleasure, euphoria, or 'goodness' out of normally rewarding food. This is how DA got its name as the "pleasure chemical" – but it is a misnomer.

Salamone & Correa (2012)

A review of the Wise study and the studies that followed it identified the conclusions this work gave rise to:
"DA is a 'reward' transmitter that mediates hedonic reactions…"
"Depression is due to a reduction of DA-regulated experience of pleasure…"
"Addiction depends on the experience of pleasure induced by drugs that hijack the brain's 'reward system'…"
I.e., this approach influenced many different literatures.

Problems with the Reward Liking Theory

Pharmacological findings show that not all DA agonists elicit euphoria (i.e., not all pharmacological ways of increasing DA levels elicit euphoria). Also from the drug literature: drug euphoria ratings are only weakly correlated with the amount of DA release.
Drug euphoria from amphetamine is not blunted by co-administration of a DA antagonist. So perhaps these drugs affect DA, but DA is not the reason for their effect on euphoria.

Studies Challenging the Liking Theory

Martinez 2004: subjective ratings of cocaine euphoria only scaled significantly with dose when effort was introduced (if DA simply produced liking, the ratings should be dose-dependent regardless).
Ikemoto & Panksepp, 1996: a DA antagonist decreased approach behaviour (rats slowed down) but not apparent liking of the reward (they ate just as much).
Pecina, Smith and Berridge, 2006: hedonic reactions to taste in animals and babies do not seem to be affected by DA agonists/antagonists.

Martinez 2004

Dissociation of behavioural reinforcement from subjective pleasure. 16 cocaine-dependent subjects smoked cocaine or placebo.
Day 1 (top graph): each dose administered at a different time. Subjects rated positive effects ('high', 'stimulated', etc.) at baseline and then 4 times across a 1-hour period. Dose did not strongly affect subjective ratings of euphoria (cocaine blocks dopamine reuptake, yet larger doses of this DA reuptake inhibitor did not increase self-reported euphoria).
Days 2 & 3: the same, but with an effort-based choice added: subjects could choose between a $5 voucher (worth more than the drug) and another dose of the same amount. The choice was made effortful – to get another dose, they had to press a button up to 1600 times. This happened 5 times across a 1-hour period. Now the ratings scaled well with the size of the dose.

Ikemoto & Panksepp, 1996

A conceptual replication of Wise et al. 1978. The study looked at running speeds towards the food, as well as general behavioural activity around the test environment and the amount of food eaten. Two DA antagonists were injected into the rats' nucleus accumbens (NAc) and VTA.
The DA antagonists affected running towards the food (i.e., approach behaviour) and general activity, but did not affect how much the rats actually ate.
Blocking dopamine function impairs efforts to approach rewards, but not the (apparent) enjoyment of rewards (reward consumption).

Pecina, Smith and Berridge, 2006

Hedonic (sweet) and aversive (bitter) taste reactions generalise across species. Hedonic reactions don't seem to be:
• Blunted by DA depletion; or
• Enhanced by DA agonists (or in DAT knockout mice)
Berridge (2000) concluded that "whatever dopamine systems contribute to the process of reward, they do not mediate the hedonic impact of tastes".
Hedonic experiences associated with reward are instead linked with NAc opioid receptors and endocannabinoid receptors (Berridge, 2000, 2007; Castro & Berridge, 2014).
Roy Wise – the main proponent of the anhedonia hypothesis, the idea that DA is involved in reward liking – concedes that the evidence does not support it, so the theory has been abandoned.
SO, if not reward 'liking', then what?
Perhaps DA mediates something that might broadly be referred to as "wanting" – any direction of approach motivation, including reward 'drive', energisation of behaviour, desire, anticipation, behavioural activation, and reward-directed effort. There may be some feelings involved, but it is less about pleasure and euphoria (Berridge / Robinson / Panksepp).
Advocates of the "learning" interpretation say it's not to do with effort, but rather with the formation of reward associations and the adaptation of behaviour to reward contingencies (Schultz / Dayan / Montague).

Reward Wanting Theory

Animal Studies (Berridge & Robinson, 1998)

Supported in animal studies in which rats were given a neurotoxin – 6-hydroxydopamine (6-OHDA) – which destroyed DA neurons:
Motor capacity was retained; the rats could walk, chew, etc.
– i.e., no effect on motor capacity.
But the rats will not seek food; they will starve unless fed artificially – reward seeking is completely obliterated.
"Even if rats still 'like' food, and retain the capacity to learn hedonic associations about it, and to make the movements needed to eat, they do not 'want' food and so do not eat it." – Berridge & Robinson (1998)

Human Studies

In humans, reports of craving are suppressed by DA blockers, e.g.:
Alcohol-dependent subjects, following a sip of alcohol (Modell et al., 1993)
Cocaine-dependent subjects, shown a video of drug preparation

Effort-Based Paradigms

Some of the most compelling paradigms showing DA's role in the wanting element of reward are effort-based paradigms. They led to the interpretation that the role of DA is to mitigate the effort-related "cost" of reward pursuit, where rewards are defined as something we are willing to pay a cost for, or invest some effort in obtaining.
Animals are given a choice between something they like and something they don't like as much. The rats must expend more effort to get the highly palatable food (press a lever, climb barriers, etc.).
Across studies using different ways of changing DA levels – whether genetically or pharmacologically (e.g., a DA agonist to increase DA function or a DA antagonist to decrease it) – increasing dopamine function increases the bias towards high-effort choices for the more desirable food (i.e., increases willingness to work for a reward), whereas animals with lowered DA function make fewer high-effort choices (Bardgett et al., 2009; Salamone et al., 2007).
Here again we can see the connection with motivational anhedonia: a classic description is that patients don't want to invest even a small amount of effort to do something they would normally have enjoyed.
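The effort-based choice logic above can be sketched as a toy softmax choice model (a hypothetical illustration, not the analysis used in these studies): subjective value is reward minus a weighted effort cost, and lowering the effort weight (a stand-in for increased DA function) shifts choices toward the high-effort/high-reward option. All numbers are invented for illustration.

```python
import math

def choice_prob_high_effort(reward_high, reward_low, effort_high, effort_low,
                            effort_weight, temperature=1.0):
    """P(choose the high-effort/high-reward option) under a softmax rule.

    Subjective value = reward - effort_weight * effort.
    A lower effort_weight (hypothetically: higher DA function) makes effort
    'cheaper', biasing choice toward the high-effort option.
    """
    v_high = reward_high - effort_weight * effort_high
    v_low = reward_low - effort_weight * effort_low
    return 1.0 / (1.0 + math.exp(-(v_high - v_low) / temperature))

# Hypothetical numbers: palatable food worth 10, freely available chow worth 4;
# lever pressing costs 6 effort units, free chow costs 1.
p_normal = choice_prob_high_effort(10, 4, 6, 1, effort_weight=0.5)   # 'intact DA'
p_blocked = choice_prob_high_effort(10, 4, 6, 1, effort_weight=1.5)  # 'DA antagonist'
print(round(p_normal, 2), round(p_blocked, 2))  # high-effort choices drop when effort is costly
```

Under these made-up parameters, the 'intact DA' agent almost always works for the palatable food, while the 'antagonist' agent mostly settles for the free chow – mirroring the Bardgett/Salamone pattern described above.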
Effort-based reward task demonstrating the role of DA in reward wanting in humans (in mitigating effort costs): Chong et al., 2015

Parkinson's Disease (PD) patients – already prescribed dopamine-boosting medication. 26 PD patients, 26 controls.
Week 1: DA-boosting medication as usual
Week 2: overnight withdrawal from DA medication
The number of rings on the tree trunk indicates the amount of effort, and the number of apples corresponds to the amount of reward (which corresponds to money). Effort is the percentage contraction, on a dynamometer, of the participant's maximal voluntary contraction. Over multiple trials, participants see the amount of reward on offer and the amount of effort needed to get it.
The task dissociates the size of the reward from the effort required. In other studies where the higher reward always requires more effort, such as the rat studies above, effort costs are confounded with the size of the reward.
Effort indifference point (EIP): the effort level at which participants choose to go for the reward on 50% of trials. An EIP of 3 for a given reward value means that at effort level 3, half the time participants chose to go for the reward and half the time they chose not to.
Results:
Controls: as the stake (the size of the reward) increases, people's EIPs increase – they are willing to expend more effort for a larger reward.
PD off medication (less DA): at lower reward levels the EIPs are lower – expending less effort, particularly for somewhat small rewards. Once the reward gets larger, they expend as much effort as controls.
PD on medication: expending more effort for larger rewards than the controls.
This supports the idea that DA might be mitigating the effort costs involved in obtaining rewards.

Problems with the reward wanting theory

Problem #1: Reward Size vs. Amount of Effort

The difficulty with a lot of reward-wanting studies is that reward size is matched with effort size, so it is difficult to know whether dopamine is responding to the size of the reward or to the effort required.
The problem with many "reward wanting" studies is that the tasks conflate reward size with effort costs. E.g., in the Effort Expenditure for Rewards Task (EEfRT – to be discussed in the seminar), high reward is always matched with high effort (and low reward with low effort), so it is hard to know whether dopamine is responding to the size of the reward OR to the effort required.

Reward Size vs Amount of Effort: size of reward (rather than effort) predicts the magnitude of phasic DA release – Walton & Bouret, 2018

Studies disentangling reward size and effort cost are mainly animal studies, e.g., predicting phasic release from DA neurons during a task with trial-by-trial variation in reward and effort cost (Walton & Bouret, 2018).
The plots show regression coefficients from analyses in which reward size and effort cost on each trial are regressed onto phasic release from DA neurons (the dependent variable). The strong predictor of DA release is the size of the reward; the regression coefficient for effort cost is around zero (and sometimes even negative).
Blue line: regression line predicting DA release from the size of the reward. Red line: regression line predicting DA release from the effort required.
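The regression logic described above can be sketched on synthetic data (a hypothetical illustration; the generative model and numbers are invented, not taken from Walton & Bouret): if phasic DA tracks reward size but not effort, regressing DA release onto both predictors recovers a large reward coefficient and a near-zero effort coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials = 500

# Synthetic trial-by-trial predictors: reward size and effort cost vary
# independently (the key design feature that removes the confound).
reward_size = rng.uniform(0, 1, n_trials)
effort_cost = rng.uniform(0, 1, n_trials)

# Hypothetical generative model: phasic DA tracks reward size, not effort.
phasic_da = 2.0 * reward_size + 0.0 * effort_cost + rng.normal(0, 0.2, n_trials)

# Regress DA release onto both predictors (plus an intercept).
X = np.column_stack([np.ones(n_trials), reward_size, effort_cost])
coefs, *_ = np.linalg.lstsq(X, phasic_da, rcond=None)
print(f"reward coefficient: {coefs[1]:.2f}, effort coefficient: {coefs[2]:.2f}")
```

The recovered reward coefficient is close to the generative value, and the effort coefficient hovers around zero – the pattern the panels described above are reporting.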
The amount of effort required does not predict the magnitude of the phasic dopamine release: (A) Pasquereau & Turner, 2013; (B) Varazzani et al., 2015.
"Dopamine is reliably modulated by expectations of future reward, whereas the negative effects of effort costs are much more limited": Walton & Bouret, 2018.

Problem #2: Gradual Impact of Dopamine

DA agonists and antagonists often have a gradual impact on reward-directed behaviour. The effects seem to require repeated trials, indicating some kind of familiarisation or experience with the rewards. This suggests a role for learning – the reinforcement/maintenance of reward-directed behaviours.
Go back to the Wise et al. (2004) study: (a) controls were given food and kept responding with lever presses across all 4 test days. Those given doses of the DA antagonist (b and c) showed a progressive decline in responding. But the home cage group (f) were given the same dose as those in the trials, yet were not tested until Day 4. One interpretation of the progressive decline is a build-up of the antagonist's effect – but if that were the case, the home cage group's Day 4 response should match the Day 4 response of those on the same dose who had been tested throughout. Instead, they behaved as if it were test day 1.
So the pattern is not fully explained by reward wanting: if DA just regulated motivation, effort, etc., these results would not make sense. There seems to be some learning involved – it can't be just reward wanting.

Reward Learning Theory

Learning as an explanation of the role DA plays in reward processing: DA's role in reward is to learn predictions about rewards and habits.

Reinforcement

The 'stamping in' of stimulus associations and response habits. Not a conscious process.
Support for DA involvement in learning comes from:
Lever pressing for food (Wise 1978/2004)
Place preference conditioning (when left to wander freely, rodents will prefer the physical location where they received the DA agonist)
Reward-prediction-error signalling – the most compelling evidence that DA plays a role in reward learning (Schultz, 1998)

Reward-Prediction-Error Signalling: Schultz, 1998

Reward prediction error = the value of an outcome relative to its expected value.
Schultz (1998): single-cell recordings of the phasic activity of DA neurons in the VTA of monkeys, showing 3 different stages of the learning process:
No prediction, reward occurs (no conditioned stimulus): a reward of orange juice arrives unannounced; there is a phasic increase in the firing of dopamine neurons upon receipt of the reward.
Reward reliably predicted by a cue (e.g., a light or a tone): the monkeys learn that the cue is followed by the reward. Now the phasic response occurs at the cue, not upon receipt of the reward – i.e., the phasic response occurs once the reward can be predicted. This is evidence that DA is not driving the hedonic reaction to the stimulus; otherwise the phasic reaction would mainly occur when the reward is actually received.
Reward predicted by a cue, but no reward delivered: a dip in the firing rate.
This pattern of phasic DA activity has been interpreted as signalling not merely that a reward has arrived or is predicted, but a "reward-prediction error" – the difference between the predicted and the actual reward. There is an increase in firing for a positive reward prediction error (reward greater than expected) and a dip in phasic activity when the reward prediction error is negative (reward less than expected, or no reward when one was expected).
This is an adaptation of the Rescorla-Wagner model from the 1970s.
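The delta-rule idea behind reward-prediction-error signalling can be sketched as follows (a minimal Rescorla-Wagner-style illustration; the learning rate and reward values are arbitrary). It reproduces the three Schultz cases: no error once a reward is fully predicted, a negative error when a predicted reward is omitted, and a positive error for an unpredicted reward.

```python
def update_value(v, reward, alpha=0.3):
    """One Rescorla-Wagner / delta-rule step.

    delta (the reward prediction error) is positive when the reward exceeds
    the current prediction, negative when it falls short, and zero when the
    reward is fully predicted. The prediction v moves a fraction alpha of
    the way toward the observed reward.
    """
    delta = reward - v
    return v + alpha * delta, delta

# Learning phase: a cue is repeatedly followed by a reward of 1.
v = 0.0
for _ in range(30):
    v, delta = update_value(v, reward=1.0)

# Case 1: predicted reward delivered -> essentially no prediction error.
_, delta_predicted = update_value(v, reward=1.0)
# Case 2: predicted reward omitted -> negative prediction error (phasic dip).
_, delta_omitted = update_value(v, reward=0.0)
# Case 3: unpredicted reward (prediction still 0) -> positive error (phasic burst).
_, delta_surprise = update_value(0.0, reward=1.0)
print(round(delta_predicted, 3), round(delta_omitted, 3), round(delta_surprise, 3))
```

The three deltas map onto the three recording conditions: roughly zero firing change for a fully predicted reward, a dip for an omitted one, and a burst for a surprise reward.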
Given the interpretation that DA signals a reward prediction error (i.e., that a reward has occurred that is bigger or smaller than expected – a reward-related event that violates a prediction), it was thought that DA must be playing a key role in reinforcement learning. The figure illustrates this: a cue generates a prediction; if the prediction matches the reward, nothing changes, but if there is a difference, this drives the strengthening and weakening of associations (we will not spend much time on this).

The Reward Value of Information: an extension of the reward-prediction-error idea

Animal Studies: Bromberg-Martin & Hikosaka 2009 – Information Choice Task

On a third of trials, the macaque monkeys could choose between an orange square and a blue square.
Blue square: information will be given about the reward, but crucially the choice did not influence the trial outcome (either the size or the timing of the reward) – the informative stimuli are therefore non-instrumental. Half the time the monkeys would then see a '+' sign, indicating a big reward, or a squiggle, indicating a small reward.
Orange square: a cue is shown, but it is random, i.e., it gives no information about the reward.
The reward itself does not vary with the choice – there is no choice about the size of reward; the only choice is whether or not to receive information about whether the reward will be small or large.
When given the choice (on a third of trials), the monkeys chose to get the information (green) 90% of the time. On another third of trials they were forced into the information condition, and this is where a phasic increase in firing rate is seen. On the remaining third – forced into the random (uncertainty) condition – a phasic dip is seen.
This panel shows the forced information condition: when the subject learns it is going to receive information, there is a phasic increase in DA cell firing.
The monkey then sees a cue that says it will get a large reward (another increase in DA firing), but when it receives that same reward, there is no change in cell firing. This supports the idea that the changes in phasic cell firing code a reward prediction error. If the monkey gets information that it will receive a small reward, there is a phasic dip, and then no change when the predicted small reward is received.
This panel shows the forced random condition: the subject learns it won't get information about the reward (a phasic dip in DA cell firing). The uninformative cue produces no change in DA activity, but a larger-than-average reward produces an increase in phasic DA activity, and a smaller reward produces a decrease.
In both conditions there is no change in phasic activity when there is no prediction error – activity only increases if the reward is bigger than expected, and dips if it is smaller than expected. I.e., these phasic responses mirror the reward prediction errors that DA neurons code in relation to primary incentives.

Summary: phasic DA release activity in reward-related events

It does not occur when a reward is delivered reliably, nor when there is no reward (and no reward was expected); it does occur when the reward is larger than expected/predicted; and phasic activity decreases when the reward is smaller than expected or is not delivered.
This seems to generalise from primary rewards (like water and juice) to other things, like information.

Human Studies: EEG Event-Related Potentials as a Distal Signature of Phasic DA Activity

In human subjects we can't do single-cell studies, as they are invasive, but we can use non-invasive techniques. Some studies argue there are EEG components that might be a distal signature of phasic DA activity – or at least that seem to be responsive to reward prediction error.
EEG Index of Reward Prediction Error

Another interpretation of the role the DA system plays in reward-related processes is the idea that the phasic activity of DA neurons signals a reward prediction error – basically, that a reward larger or smaller than expected has occurred. This applies to "specific" rewards (e.g., food for animals) and also to informational rewards (information about upcoming rewards).
A useful way to study this in humans has been with Event-Related Potentials (ERPs) derived from EEG data and involved in performance and outcome monitoring; it has been argued these might be a distal signal of reward prediction error, and there is quite a lot of evidence that one ERP in particular might reflect reward-prediction-error signalling.

Event-Related Potentials (ERPs)

The relevant ERP was originally identified in the context of error processing. Two ERPs:
Error-Related Negativity (ERN): a negative deflection in the EEG just after people make an error, capturing a subconscious corrective process; and
Feedback-Related Negativity (FRN) (the most important here): a negative deflection in the EEG occurring after negative feedback or after a loss.
"Negativity" because there is a negative deflection (negative amplitude is PLOTTED UP on EEG plots). "Error-" and "feedback-related" negativity because they were first identified in tasks with correct and incorrect trials. When people had made an error, there was a larger error-related negativity than on correct trials. This negative activity happened immediately upon making the error, i.e., before feedback – the idea being that at some level people know they've made an error, and this is a kind of corrective process.
The Feedback-Related Negativity (FRN) was a more negative deflection in response to negative feedback or information about losses, compared to positive information about correct trials. The FRN is also known as the "feedback ERN (fERN)" and the "medial frontal negativity (MFN)", because it is calculated from medial frontal EEG sites.
Is the FRN a signal of "loss" or of "error"? Proudfit (2015)
"Error"? After a while the FRN was thought to be misnamed, as it does not seem to be simply a signal of error: losses produce a negative FRN even when the response was correct, i.e., in gaslighting-type studies that tell you that although you were correct, you lose money anyway.
"Loss"? A negative FRN shows up on reward/non-reward trials where there is no loss as such. Negative FRNs also appear when people break even (i.e., no loss), so perhaps it is not a signal of loss either: "breaking even was just like losing".
Proudfit (2015) argued that the FRN component is by "default" (i.e., after any event) a negative-going deflection that is suppressed (made less negative) by reward, so the negativity is suppressed on gain trials (negative amplitude is often plotted up). Hence the name "Reward Positivity" (RewP).

The Reward Positivity (RewP, aka FRN; fERN; MFN): Proudfit, 2015
Proudfit argued we can compute a difference wave for either the loss-minus-gain or the gain-minus-loss difference, by subtracting the EEG waveform for gain trials from the waveform for loss trials (or vice versa). The grey line in the figure is the reward positivity. Some argue the RewP can be interpreted as a reward-related component potentially related to reward-prediction-error signalling.
Evidence supporting the claim (Proudfit, 2015): the magnitude of the reward positivity CORRELATES with the magnitude of BOLD activity shown in fMRI in the ventral striatum (the key target of ascending DA projections) in response to rewards. This is only a correlation, but it does show the RewP is positively associated with a more direct marker of dopaminergic activity.

Sambrook & Goslin, 2015: meta-analysis; RewP predicted by expected value (size x predictability)
Possibly the most compelling evidence comes from the meta-analysis by Sambrook & Goslin (2015), although it does not involve any direct measurement of DA activity per se.
They meta-analysed studies that had computed the RewP and quantified, in subgroup analyses, how the RewP was affected by (1) the size of the reward and (2) the predictability of the reward. They found strong main effects of reward size and likelihood on the FRN, consistent with it signalling/coding a reward prediction error. The best way to model the size of the reward positivity was in terms of expected value, i.e., the product of the size of the reward and the probability of the reward.
RewP is maximal for big unpredicted reward outcomes vs non-rewards.
RewP size is predictable from manipulated Reward Prediction Error (RPE) size.
RewP is larger for reward outcomes that are less likely and of greater magnitude, i.e., largest for a big unpredicted reward (Sambrook & Goslin, 2015).
RewP is smaller (or even negative-going) for a large unpredicted non-reward event.

Example (important study): Potts et al. (2006) Gambling Task
Tried to assess whether the RewP data mirror what is seen in phasic DA activity. On every trial there were 2 stimuli, always lemons or a gold bar. On 80% of trials, if lemons appeared first they also appeared second, and there was no reward. On 80% of trials, if a gold bar was the first stimulus, the second was also a gold bar and a reward was given. So, across trials, this creates an expectation of reward if the first stimulus is a gold bar (and of no reward if it is a lemon). On 20% of trials the contingency is flipped: lemon then gold bar gives a reward, operationalising an unexpected reward; gold bar then lemons gives no reward, an unpredicted non-reward.
The waveform is plotted for the 4 trial types (negative plotted down; electrodes over the medial frontal area). The negativity was most pronounced on unpredicted non-reward trials (gold bar, then lemons, then no reward) and least negative (almost positive) on unpredicted reward trials (lemon, then gold bar, then reward).
There was no difference between the predicted conditions (regardless of reward). Potts et al. (2006) argued that, whether or not it is directly related to phasic dopamine activity, the RewP varies across trial types in a similar way to the single-cell recordings of phasic DA neuron activity.

Other Evidence of a Link Between the Reward Positivity and DA
The strong version of the claim is that the RewP, the EEG component, is some kind of neural signature of phasic DA activity. It is unlikely we could ever be sure of that, but the component does seem to behave in a similar way. Other evidence people point to includes:
The broad positivity source-localises to the anterior cingulate cortex (ACC) (a target of the mesocorticolimbic DA neurons) and putamen (Foti et al., 2011; Holroyd & Coles, 2002). EEG has poor spatial specificity ("somewhere under the scalp"); some source-localisation studies can be persuasive, but take them with a grain of salt. The component is plausibly generated by input from midbrain DA neurons, but we cannot claim to really know that the RewP is coming from the ACC. It might be enough to say it is a neural signature.
Correlational evidence that the RewP is associated with other, more direct markers of dopaminergic activity, e.g., BOLD activation in reward-processing regions such as the NAcc (see Proudfit, 2015).
Pharmacological studies suggesting the size of the Reward Positivity is modulated by dopaminergic agents.
Candidate-gene studies of alleles with a role in DA function (e.g., the COMT gene, involved in DA catabolism).
So there is some basis for thinking that the reward positivity might be a useful peripheral index for studying reward processing in humans, where invasive studies are very hard to do. Refs: Holroyd & Yeung, 2012; Mueller et al., 2014.
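The Potts et al. (2006) contingencies discussed above can be reduced to prediction-error arithmetic (our own illustration of the logic, not the authors' analysis): the first stimulus sets the expectation (0.8 after a gold bar, 0.2 after lemons) and the outcome is 1 (reward) or 0 (non-reward).

```python
# RPE for each Potts et al. (2006) trial type. The expectation values
# come from the 80/20 contingencies described in the notes.

def rpe(p_reward_expected: float, outcome: int) -> float:
    return outcome - p_reward_expected

trial_types = {
    "predicted reward (gold, gold)":        rpe(0.8, 1),  # small positive
    "unpredicted reward (lemon, gold)":     rpe(0.2, 1),  # largest positive RPE
    "predicted non-reward (lemon, lemon)":  rpe(0.2, 0),  # small negative
    "unpredicted non-reward (gold, lemon)": rpe(0.8, 0),  # largest negative RPE
}
for name, value in trial_types.items():
    print(f"{name}: {value:+.1f}")
```

The ordering of these four values matches the waveform pattern in the study: most negative for the unpredicted non-reward, most positive for the unpredicted reward, and near-equivalent for the two predicted conditions.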
Other Ways to Translate Findings from Animal Studies to Humans: The Reward Value of Information
Bromberg-Martin & Hikosaka (2009) showed monkeys preferred to choose trials on which they would see advance information about whether they would get a reward; the cues perfectly predicted the reward received. A human analogue of this information choice task was developed by Bennett et al. (2016).

Dan Bennett et al. (2016): Information Choice Task; paying a cost for reward-related information
Developed a simple task for humans based on the monkey paradigm. On every trial participants could choose to view stimuli providing advance knowledge of the trial outcome, or perceptually similar but uninformative stimuli.
Informative option (A): a 50% chance of turning over a set of cards with a majority of black cards (reward) or a majority of red cards (no reward); costs 3 cents per card.
Non-informative option (B): the cards have no relationship to the outcome (random stimuli); no cost.
The trial process: the informative option has a cost (3 cents/card); remember, one way to tell what people value is what they are willing to bear a cost for. Bennett et al. found people are willing to pay a cost for information. Like monkeys, humans strongly preferred informative over non-informative stimuli despite their identical economic value, and 25% were willing to incur a monetary cost to view the informative stimuli.

Link between the Bennett study and the RewP component: Myra Brydevall's study built on Bennett's work.

Brydevall et al. (2018): Information Seeking Task
Computed the reward positivity for every card turned over in the information preference task: if participants choose to see information, 5 cards are turned over; a majority black = $, a majority red = no $.
Positive and negative "information prediction errors" (IPEs): the difference between actual and expected information. Each card turned over represents a gain in information about the likely reward outcome, or is neutral, or is a loss of information.
They found that gains in non-instrumental information elicited the evoked EEG reward positivity (FRN) response. Looking at the reward positivity (aka Feedback-Related Negativity, FRN) for positive information prediction errors (gains in certainty) and negative ones (decreases in certainty), the FRN was more positive for positive IPEs, again characteristic of the reward positivity (negativity is plotted up; computing the difference wave gives the top purple line).

Smillie, Bennett et al. (2021)
Found that dopamine enhancement increases preferences for (moderately) costly information. After administering sulpiride (a D2 receptor antagonist which at low doses is thought to enhance DA transmission), the proportion of choices for moderately costly information (1 & 3 cents) increased; there was no difference at zero cost or at high cost (5 cents). This gives some confidence that the task taps information rewards in the same sense as the monkey task. It confirms the finding of Bennett et al. (2016) that non-instrumental information has reward value, and the finding that a dopaminergic drug increases the reward value of moderately costly information is consistent with evidence for a dopaminergic basis to information valuation (Bromberg-Martin & Hikosaka, 2009). Taken together, these studies provide behavioural and neural evidence for the reward value of information.

Coda: Reward Wanting
To summarise the lecture: the general case in the literature is that the hedonic liking interpretation of DA (that DA is a pleasure chemical, responsible for drug euphoria and for blunted affect in anhedonia and depression) has been debunked. The wanting theory says DA is about motivation, desire, and anticipation: driving us toward reward, seeking out reward.
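Returning to Brydevall et al.'s information prediction errors: the card-by-card logic can be sketched as a change in certainty about the majority colour (a toy Bayesian construction of ours, not the authors' exact measure), with unseen cards treated as black with probability 0.5:

```python
from math import comb

# Toy information-prediction-error (IPE) sketch for a 5-card trial:
# IPE = change in certainty about "majority black" after each card flip.

def p_majority_black(black_seen: int, cards_seen: int, total: int = 5) -> float:
    """P(majority black) given the cards seen so far (unseen cards 50/50)."""
    remaining = total - cards_seen
    needed = max(total // 2 + 1 - black_seen, 0)  # black cards still required
    return sum(comb(remaining, k) for k in range(needed, remaining + 1)) / 2 ** remaining

def certainty(p: float) -> float:
    """0 when the outcome is a coin flip, 1 when it is fully known."""
    return abs(p - 0.5) * 2

before = certainty(p_majority_black(black_seen=2, cards_seen=2))       # 2 black so far
after_black = certainty(p_majority_black(black_seen=3, cards_seen=3))  # 3rd black: outcome sealed
after_red = certainty(p_majority_black(black_seen=2, cards_seen=3))    # 3rd red: less certain

print(after_black - before)  # positive IPE (gain in certainty)
print(after_red - before)    # negative IPE (loss of certainty)
```

On this toy account, a card can raise or lower certainty about the trial outcome, mirroring the positive and negative IPEs to which the RewP responded.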
The learning theory says DA is about learning from reward: it is involved in coding predictions about reward outcomes and associations with reward outcomes. Reward-prediction-error signalling, the idea that the role of dopamine is to predict outcomes, is highly regarded in the literature: DA is a signal that helps us adapt to reward outcomes, predict them better, and learn what things are associated with reward. But proponents of the reward wanting theory ask how learning or reinforcement can explain many other aspects of reward-related behaviour, e.g.:
"Vigorous investigatory sniffing, exploration, foraging"...? (Panksepp & Moskal, 2008)
Desire, hunger, effort, curiosity, interest, craving...?
"Knowledge, in any implicit or explicit form of association, no matter how strong, does not necessarily [for example] compel the pursuit of drugs." (Robinson & Berridge, 2003)
There are aspects of reward processing (desire, effort, curiosity, etc.) that do not seem to be accounted for by the reward learning approach.

Incentive Salience Hypothesis: Saunders & Robinson (2012)
A specific hypothesis, within the reward wanting interpretation of DA's function, about what exactly DA does. It proposes that DA is involved in the attribution of "incentive salience": the process by which initially neutral things become motivationally rewarding, attractive and desired. Saunders & Robinson (2012) aimed to test DA's role in the attribution of incentive salience, pitting the wanting and learning theories against each other in a paradigm called "sign-tracking" vs. "goal-tracking".

Sign- vs. goal-tracking paradigm
An animal task: a rat in a chamber.
A lever is introduced into the chamber for 8 seconds.
The lever is retracted; a pellet is delivered to a food cup (not near the lever). This just creates an association.
Repeat for 25 trials.
Some rats increasingly approach the food cup as soon as the lever is presented ("goal trackers").
For goal trackers, the lever has become a reward predictor. Other rats (15-20%) increasingly approach the lever itself: the lever is treated as the reward. The pairing of the lever with the reward makes the lever motivationally attractive ("sign trackers").
Research question: does DA mediate goal tracking (learning about reward) or sign tracking (finding something rewarding/motivating)?
The DA antagonist flupentixol (a DA blocker) impacted sign tracking in a dose-dependent fashion, but not goal tracking. They concluded that DA had no impact on rats learning to predict the arrival of a reward, but did affect the rats finding the lever motivationally attractive. This is taken as a compelling illustration of what is meant by incentive salience: how a neutral object can become motivationally attractive.

Reconciling the Learning and Wanting Literature?
The goal-tracking/sign-tracking studies seem to make a compelling case for DA being involved in the attribution of incentive salience; others suggest DA is involved in mitigating effort costs for reward. But another literature provides compelling evidence that DA neurons encode reward prediction errors and play a pivotal role in learning. No one really knows how to reconcile the competing evidence.

Hierarchical Reinforcement Learning Framework
Tries to reconcile the wanting and learning theories into an integrated model: the differences reflect different parts of a more complex system. E.g., DA neurons have two firing modes: phasic (short time course) and tonic (longer time course).
Phasic DA firing (milliseconds): involved in making predictions, detecting motivational salience, and RPE signalling ("learning" associations between a novel event and reward); hence the short time course.
Tonic DA: involved in tracking reward likelihoods, sustained behaviour/vigour, and motivation to pursue rewards ("wanting"); a longer time course.
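The two firing modes can be caricatured in code: a fast, event-locked prediction error alongside a slowly updating average reward rate (a toy rendering loosely inspired by the average-reward account in Niv et al., 2007; all parameter values are arbitrary):

```python
# Toy two-timescale sketch. Phasic signal = per-event RPE ("learning");
# tonic signal = slow running average of reward rate ("wanting"/vigour).
# Learning rates and the reward stream are invented for illustration.
alpha_fast, alpha_slow = 0.5, 0.05
value, tonic_rate = 0.0, 0.0

for reward in [1, 1, 0, 1, 1, 1, 0, 1]:               # a stream of outcomes
    phasic = reward - value                           # fast, event-locked RPE
    value += alpha_fast * phasic                      # rapid prediction update
    tonic_rate += alpha_slow * (reward - tonic_rate)  # slow reward-rate tracking

print(value, tonic_rate)
```

The point of the sketch is only the separation of timescales: the phasic term jumps event by event, while the tonic term drifts slowly, as in the sign-tracking (minutes) vs ERP (hundreds of milliseconds) contrast discussed below.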
The framework is plausible because it maps onto the time course of the phenomena observed in various studies. E.g., the sign-tracking vs goal-tracking paradigm (Saunders & Robinson) unfolds over at least several minutes (a longer time course), whereas the ERP/reward-prediction-error studies look at the arrival of an RPE and a response within a couple of hundred milliseconds. So different paradigms may capture different DA firing modes and reflect different positions in the approach/motivation "hierarchy" (e.g., ERPs vs. sign/goal tracking). Pursuing any goal will involve short-time-course processes (learning associations, detecting reward-related information) as well as longer-time-course ones, such as sustaining effort over time in pursuit of a reward. Refs: Holroyd & Yeung, 2012; Niv et al., 2007.

Summary
DA has long been implicated in reward-directed behaviour (approach motivation).
Clear consensus that DA does not mediate reward "liking" (e.g., hedonia, drug euphoria): it is not the pleasure chemical.
Evidence from single-cell recordings of the phasic responses of DA neurons makes a strong case that phasic DA activity codes reward prediction error, a key piece of evidence favouring reward learning theories.
Mixed evidence regarding the role of DA in mitigating the effort costs of reward-directed behaviour.
DA may be involved in the attribution of "incentive salience", the process through which reward "wanting" occurs.
Integrative models of the various reward-processing functions of DA (as well as of other neurotransmitters) are still emerging.

Reading for this week:
Wise, R. (2004). Dopamine, learning and motivation. Nature Reviews Neuroscience, 5, 483-494. A review of the reward-processing functions of dopamine, including comparisons of competing theories (by Roy Wise, famous for the defunct anhedonia hypothesis).

Further reading, for interest:
Sambrook, T. D., & Goslin, J. (2015).
A neural reward prediction error revealed by a meta-analysis of ERPs using great grand averages. Psychological Bulletin, 141(1), 213-235. A meta-analysis of the Feedback Related Negativity / Reward Positivity.
Harmon-Jones, E., Gable, P. A., & Peterson, C. K. (2010). The role of asymmetric frontal cortical activity in emotion-related phenomena: A review and update. Biological Psychology, 84, 451-462. An overview of Left Frontal Asymmetry (LFA), an EEG index of approach motivation we will discuss in the workshop (week 9).

Key Terms
Approach motivation: incentivised behaviour that stems from internal processes and/or external stimuli (Kaack, introduction).
DA: Dopamine, a neurotransmitter that facilitates exploration, approach, learning, and cognitive flexibility in response to unexpected rewards and cues indicating the possibility of reward (from Allen & DeYoung, 2017, p. 13, citing Bromberg-Martin et al., 2010).
DA agonist: increases dopamine function/release (e.g., by inhibiting reuptake in the synaptic cleft).
DA antagonist: decreases DA function.
EEfRT: Effort Expenditure for Rewards Task, a behavioural measure of cost/benefit decision-making in humans. DA increases willingness to endure effort and probabilistic costs to obtain a monetary reward (Treadway et al., 2012). May be an important tool for assessing approach motivation at a behavioural level (Hughes et al., 2015; Kaack et al., 2020).
EEG: Electroencephalography measures neural activity by recording electrical activity along the scalp. It has much higher temporal resolution than fMRI, capable of tracking differences in brain activity on the order of milliseconds (as opposed to seconds for fMRI), but poor spatial resolution.
fMRI: functional magnetic resonance imaging creates images of the brain based on the magnetic properties of different tissue types. The blood-oxygen-level-dependent (BOLD) signal from fMRI can therefore be used to indicate when different regions of the brain are more or less active.
Feedback Related Negativity (FRN): a negative deflection in EEG occurring after negative feedback or after a loss (aka feedback ERN (fERN), medial frontal negativity (MFN) and Reward Positivity (RewP)). The FRN is an EEG waveform that appears 200-350 milliseconds after receiving feedback about an outcome and appears to be generated by the dorsal anterior cingulate cortex (ACC) in response to dopaminergic signalling of deviations from the expected value of the outcome (Sambrook & Goslin, 2015). Animal research has shown that one type of dopaminergic neuron encodes a prediction-error learning signal by spiking in response to better-than-expected outcomes and dropping below baseline levels of activity in response to worse-than-expected outcomes (Bromberg-Martin, Matsumoto, & Hikosaka, 2010). The FRN shows the same pattern (becoming most negative for worse-than-expected outcomes and least negative for better-than-expected outcomes), indicating that it is a prediction-error signal driven by dopamine (Proudfit, 2015; Sambrook & Goslin, 2015). (From Allen & DeYoung, 2017, p. 18.)
Information prediction error (IPE): the difference between actual and expected information (Brydevall et al., 2018).
Left Frontal Asymmetry (LFA): higher activity in the left frontal cortex at rest, compared to right frontal activity, said to reflect a physiological index of approach motivation. Increased LFA has been found to correlate positively with increased approach motivation (Kaack, 2020, p. 1238).
Nucleus Accumbens (NAcc): ventral striatum; a core structure involved in reward processing.
Reward Wanting Theory: dopamine's role in reward is to energise behaviour and mitigate effort costs (Bardgett et al., 2009; Salamone et al., 2007). DA is about motivation, desire, and anticipation: driving us toward reward, seeking out reward.
Reward Learning Theory: dopamine's role in reward is to learn predictions about rewards and habits.
DA is involved in coding predictions about reward outcomes and associations with reward outcomes.
Reward Liking Theory: dopamine's role in reward is to generate good feelings, such as euphoria.
Reward Positivity (RewP): an event-related potential (ERP) component that indexes sensitivity to reward and can be elicited by feedback indicating monetary gains relative to losses (from the abstract of Speed et al.). It is the difference wave between the unpredicted reward and the unpredicted non-reward, and a putative marker of dopaminergic reward-prediction-error signalling (Proudfit, 2015; Sambrook & Goslin, 2015). The RewP is modulated by Reward Prediction Error (RPE), i.e., it seems to reflect the degree of RPE, such that the RewP is larger as a function of the size and unpredictability of rewards. The RewP reflects the magnitude and predictability of reward-related outcomes and is modulated by DA challenge (from L8, p. 13). A large reward positivity means the waveform was more positive for the unpredicted reward and more negative for the unpredicted non-reward; it responds to reward-related events.
PET: positron emission tomography; has the advantage of allowing measurement of receptors for particular neurotransmitters, but the disadvantage of being invasive, as it requires injection of radioactive tracers into the bloodstream.
Both MRI and PET are valuable for their spatial resolution.
Reward Prediction Error (RPE): the value of an outcome (reward) relative to its expected value.
Reward Prediction Error Signalling: the pattern of phasic DA response activity.
Schultz, 1998: firing increases for a large reward and decreases if the reward is absent or smaller than expected.
Bromberg-Martin & Hikosaka (2009): phasic firing depends on whether information is given and whether the reward matches the prediction.
Bennett, 2016; Brydevall, 2018: EEG and the RewP.

Summary of Key Findings
Evidence of DA involvement in approach motivation systems:
Treadway et al., 2012 (EEfRT task and PET scans with and without a DA agonist, d-amphetamine): individual differences in DA responsivity correlate with willingness to expend greater effort for larger rewards.
Kaack et al., 2020.

Seminar Readings
Treadway et al., 2012: Dopamine Mechanisms of Individual Differences in Human Effort-Based Decision-Making
Individual differences in effort-based decision-making and how DA might be involved.
Aim: to evaluate whether variability in the responsivity of the DA system was associated with individual differences in cost/benefit decision-making.
In the introduction section, the authors cite evidence that dopamine plays a role in which aspect of reward processing? Evidence that DA plays a role in mitigating the cost of effort for reward: blocking DA signalling results in a behavioural shift towards low-effort, low-risk options, whereas enhancement of DA increases willingness to work for rewards. They frame the role of DA within a decision-making framework, as a kind of cost-benefit analysis. Luke: approach motivation might be a better term to use, or willingness to pay a cost.
Describe the EEfRT task (Effort Expenditure for Reward Task). (How does it work? What is the "logic" of the task? What is the main variable/measure the authors obtained from the task?)
The EEfRT is a behavioural task: cost/benefit decision-making involving both effort and probabilistic costs. A multi-trial game in which participants choose between high-effort/high-reward and low-effort/low-reward options at 3 levels of reward probability (88% "high"; 50% "medium"; 12% "low").
2 task conditions:
Low-effort task (30 seconds, dominant hand, index finger) for less reward (a chance to win $1).
High-effort task (90 seconds, non-dominant hand, pinky finger) for more reward (a chance to win $2.37).
Do you notice any possible limitations of the EEfRT task?
Ecological validity: the tasks are artificial, but it is hard to do anything more naturalistic in PET studies, which involve injection with a radioactive tracer and going into a PET imaging machine.
Both conditions are physically tiring, so by the end of the trials participants might choose less effort simply because they are tired; trial number is one of the strongest effects in these kinds of tasks.
As participants are told the difference between the tasks is the amount of time spent (and which finger), it is easy to work out the cost-benefit trade-off: the hard task takes 3x longer, so if the reward is not 3 times larger it may no longer seem worth it.
What imaging technique do the authors use? Did you get a sense of how this method works?
PET: participants are injected with a radioactive tracer; as it is metabolised it emits positrons, which are picked up by the tomograph. The tracers have an affinity for (and so bind to) particular receptors. As the radioactive material is metabolised it produces an image of activity. With PET we are looking at the availability and function of receptors. PET is the only way to get down to the neuron populations of interest; fMRI cannot show conclusively that dopamine neurons are involved in the nucleus accumbens.
How did the researchers try to engage/stimulate the dopamine/reward system?
They used d-amphetamine, which potentiates DA release, and measured the resulting change in D2/D3 receptor availability.
Describe the basic procedure of the study.
What were the participants asked to do? Three sessions:
PET with placebo: injected with a radioactive tracer that has an affinity for DA receptors, enabling us to see the availability of DA receptors on two different occasions (with placebo or d-amphetamine).
PET scan with d-amphetamine.
EEfRT: without amphetamine or placebo.
Three things they are trying to connect:
How much effort people invest in the task.
The availability of participants' DA receptors under placebo.
The availability of participants' DA receptors with d-amphetamine.
They look at how individual differences in how people do the task relate to individual differences in receptor availability: calculating the difference in DA receptor availability produced by the amphetamine (comparing DA receptor availability with (session 2) and without (session 1) the d-amphetamine), and then relating individual differences in this amphetamine effect to individual differences in how people do the task, i.e., how much effort they are willing to put in.
What did the authors treat as a measure of stimulant-induced DA responsivity?
[18F]fallypride measurements of non-displaceable binding potential (BPND), a computed estimate of the number of available D2/D3 receptors in both striatal and prefrontal areas. Two scans were taken: one at baseline (placebo) and one with 0.43 mg/kg of d-amphetamine. This produced an index of individual differences in stimulant-induced DA responsivity: the difference (or percentage change) in binding potential after amphetamine vs after placebo in the brain regions of interest (the vmPFC and left caudate).
Figure 2 depicts the key results of the study. Can you describe what is being presented here? THIS IS THE TAKE-HOME MESSAGE.
The primary dependent variable was the proportion of high-effo