Lecture 7: PDP Models & Declarative Memory
Document Details
University of Alberta
2024
James Farley
Summary
This lecture introduces parallel distributed processing (PDP) models of declarative memory, as part of an introduction to cognitive neuroscience course. It explains how the models operate and how they relate to concepts such as long-term potentiation (LTP) and long-term depression (LTD).
Full Transcript
Lecture 7: PDP Models & Declarative Memory
Introduction to Cognitive Neuroscience (PSYCH 375 A1), James Farley, Fall 2024

PDP Models: How They Work
We just talked about two broad classes of models: cognitive models and neurological models. One type of model that (debatably) could be considered intermediate between the two is a 'parallel distributed processing' (PDP) model. Knowledge is represented in the distributed pattern of activity of many units (or 'nodes'). Weights (values between 0 and 1) at each of the many connections determine how a signal sent from a given unit will either increase or decrease the activity of the next unit when transmitted through that particular connection.

PDP models are conceptual in many ways (like cognitive models), yet the fact that the essential 'pieces' look and work a lot like how we understand neurons to look and work (i.e. communicate) means the model has a biologically plausible form that we could consider a sort of 'neural network' (hence the role/application in artificial intelligence!). The way PDP models operate has some similarity to how repeated firing of connected neurons can strengthen those connections (long-term potentiation, LTP; 'neurons that fire together wire together': Donald Hebb). Long-term depression (LTD) can also be related to these models, in which repeated firing can also (depending on how everything is connected) selectively weaken connections. This interplay between strengthening and weakening certain connections to 'fine-tune' the operation of the system to accomplish some goal can be compared to what happens when neural networks 'learn' to do things (usually through trial and error) by modifying connection weights.

PDP networks consist of three 'layers' of units:
1. Input units: activated by stimulation from the environment (e.g. retinal activity when viewing pictures of faces during an experiment)
2. Hidden units: receive signals from input units (e.g. feature detectors in visual cortex that receive signals from the retina, neurons involved in perceptual judgments and decision making, etc.)
3. Output units: receive signals from hidden units (e.g. neurons involved in delivering the response provided by the person making the judgment)
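To make the units, weights, and layers described above concrete, here is a minimal illustrative sketch in Python (the layer sizes, initial weights, and function names are invented here, not taken from the lecture or from Rogers et al.). Activity flows from input units through weighted connections to hidden and then output units, and a simple Hebbian-style rule nudges a connection weight up when the units on either side are strongly active together (an LTP-like change) and down when their coactivity falls below a small threshold (an LTD-like change).

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny PDP-style network: 4 input units, 3 hidden units, 2 output units.
# Each connection carries a weight between 0 and 1; a unit's activity is a
# weighted sum of the activity feeding into it, squashed back into the 0-1 range.
W_in_hid = rng.uniform(0.0, 1.0, size=(4, 3))    # input -> hidden weights
W_hid_out = rng.uniform(0.0, 1.0, size=(3, 2))   # hidden -> output weights

def squash(x):
    """Keep unit activity between 0 and 1 (logistic function)."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(input_activity):
    """Propagate activity: input units -> hidden units -> output units."""
    hidden = squash(input_activity @ W_in_hid)
    output = squash(hidden @ W_hid_out)
    return hidden, output

def hebbian_update(weights, pre, post, rate=0.05, threshold=0.25):
    """LTP/LTD analogy: strengthen a connection when the units on both sides
    are strongly active together, weaken it when they are not."""
    coactivity = np.outer(pre, post)              # pairwise 'firing together'
    change = rate * (coactivity - threshold)      # above threshold -> up, below -> down
    return np.clip(weights + change, 0.0, 1.0)    # keep weights in the 0-1 range

# One 'trial': a stimulus activates the input units, activity spreads forward,
# and the input->hidden connections are fine-tuned based on coactivity.
stimulus = np.array([1.0, 0.8, 0.0, 0.2])         # e.g. retinal activity for one face
hidden, output = forward(stimulus)
W_in_hid = hebbian_update(W_in_hid, stimulus, hidden)
print("output unit activity:", output)
```

Models built for real tasks (such as the Rogers et al., 2004 simulation) are larger and trained with more elaborate procedures, but the same two ingredients, distributed patterns of activity and adjustable connection weights, do the work.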
PDP Models
https://www.youtube.com/watch?v=ITwpDxx0UBE
The beginning and end of this video frame PDP models in terms of what they contribute to discussions about philosophy of mind. While fascinating, that isn't the focus of our course, so don't worry too much about that stuff (unless you want to!).

PDP Models: Modelling Recognition
PDP models can be built to test hypotheses. Imagine we want to test whether we can get a PDP model to 'behave like' (in any number of ways) a human would when performing a particular task. Rogers et al. (2004) built a PDP model to simulate what happens when humans make judgments about faces.

One kind of basic question is simply, 'could it work?' (hypothesis: this is a viable model for accomplishing object recognition). Finding that the PDP system is able to accomplish whatever task it was intended to do doesn't necessarily speak to how humans do that task, but it's a sort of proof of concept. In other words, if our model can do it, and we understand that our model functions like interconnected neurons (a neural network), that could be how our brains do it.

Beyond simply being a proof of concept, we can also use PDP models to test more specific hypotheses, e.g. does the particular pattern of deficits in a 'damaged' (compromised in some way) PDP model resemble what happens to humans with neurological damage (e.g. a stroke)? Rogers et al. (2004) programmed simulated 'lesions' into the system to see how disabling or compromising a random assortment of units affects the output. They found the errors made by the compromised system were qualitatively similar to those shown by patients with semantic dementia. Again, this doesn't mean that this must be how it works in humans with stroke damage, but rather that it could be how it works in humans with stroke damage.

PDP Models: Graceful Degradation
PDP models demonstrate graceful degradation, which means selectively damaging certain units doesn't necessarily result in a complete breakdown of the whole system (as can happen with other kinds of systems, e.g. damage the AC adaptor for a game console and it won't even turn on). Depending on the extent of the damage, this seems similar to what happens with neurological damage (thought continued on the next slide…).
A different kind of graceful degradation: https://creativeengineeringstudio.com/graceful-degradation-with-acceptable-fallbacks/

If we thought that conceptual knowledge gets stored by a specific neuron or group of neurons (e.g. what a 'dog' is gets stored in neuron X), we would expect stroke damage to result in very selective losses for recognizing specific objects on the one hand (e.g. lose neuron X and lose your concept of dog), while leaving knowledge for all other kinds of objects completely intact on the other. This doesn't seem to be what happens; people tend to show more generalized kinds of behavioural effects from neurological damage. Note that one complication to the above is that people sometimes do seem to have pretty specific behavioural consequences from neurological damage, e.g. a case study of visual agnosia that was restricted to musical instruments.
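To connect the lesioning idea to the sketch above, the snippet below (again illustrative, not the Rogers et al. implementation) disables a random assortment of hidden units and measures how far the damaged network's output drifts from the intact output. Because knowledge is spread across many units and weights, the output changes gradually with the size of the lesion rather than failing all at once, which is the graceful degradation described here.

```python
import numpy as np

rng = np.random.default_rng(1)

# Same toy architecture as the earlier sketch, with 6 hidden units this time.
W_in_hid = rng.uniform(0.0, 1.0, size=(4, 6))
W_hid_out = rng.uniform(0.0, 1.0, size=(6, 2))

def squash(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(stimulus, lesioned_units=()):
    """Run the network, silencing any 'lesioned' hidden units."""
    hidden = squash(stimulus @ W_in_hid)
    hidden[list(lesioned_units)] = 0.0        # disabled units contribute nothing downstream
    return squash(hidden @ W_hid_out)

stimulus = np.array([1.0, 0.8, 0.0, 0.2])
intact_output = forward(stimulus)

# Disable progressively larger random assortments of hidden units and compare
# the damaged output with the intact one: performance degrades gradually
# rather than collapsing after the first lesion.
for n_lesioned in (1, 2, 3, 4):
    lesion = rng.choice(6, size=n_lesioned, replace=False)
    damaged_output = forward(stimulus, lesion)
    print(n_lesioned, "hidden units lesioned, mean output change:",
          np.abs(damaged_output - intact_output).mean().round(3))
```

Whether the errors a damaged network makes actually resemble those of patients with semantic dementia is, of course, the empirical question this kind of simulation is meant to address.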
PDP Models vs. ?: How They Represent Information
Think back to our discussion about whether knowledge can be thought of as being stored in an 'amodal' form (independent of any one modality, not constrained to vision, hearing, etc.) that is then applied to representations in specific modalities. This could be contrasted with the two being less clearly separable (i.e. the idea that conceptual and sensory-based representations are not entirely distinct, but rather somehow one and the same). This is all a bit abstract in some ways, but consider the fact that the way a PDP model works doesn't seem very compatible with the idea that having learned what a 'dog' is could be attributed or localized to any one unit (or other small number of units), which, remember, are like neurons in many ways.

The implication is that 'meaning' (this is where it gets abstract!) isn't represented by any one singular part of the PDP model. Rather, it seems to be an emergent property that arises out of the interaction of all the parts of the system, not any one specific part itself. Could this also have implications for how we think about localization of function? We're talking about representations and not functions here, but still… While not what they were talking about, the Gestalt mantra applies here: 'The whole is more than the sum of its parts!'
"Semantic knowledge does not reside in representations that are separate from those that subserve perception and recognition, but emerges from the learned associations amongst such representations in different modalities." (Rogers, 2004)

Declarative LTM (Memory, Chapter 11)
https://www.nature.com/articles/nrn2850/figures/1

Imaging Long-Term Memory: Experimental Challenges
Early work geared towards better understanding long-term memory using imaging highlights some of the more general challenges associated with these methods, on both the design and the analysis front. One obstacle relates to the constraints imposed by the block designs typical of the time (e.g. Petersen et al., 1988; see next slide).
https://meteoreducation.com/long-term-memory/

Imaging Long-Term Memory: Petersen et al. (1988)
Petersen et al. (1988) compared activation during word blocks with fixation blocks (i.e. used subtractive logic).
Methodological note: it isn't clear what (exactly) participants were doing during the fixation blocks, and they may have encoded things during those blocks (not necessarily related to the stimuli, but maybe just the context, e.g. 'this is super boring!'). Recall the possibility discussed in an earlier lecture that participants in FFA fMRI studies may have approached the presentation of faces (vs. houses, for example) differently when spontaneously (and with minimal experimental instructions) shown stimuli from each category.
https://www.researchgate.net/publication/282051206_Statistical_Analysis_Methods_for_the_fMRI_Data/figures?lo=1
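The subtractive logic itself is easy to write down. The sketch below uses made-up numbers for a single voxel (nothing here is Petersen et al.'s actual data): the signal is averaged within each block type and the fixation average is subtracted from the word average, with the remainder attributed to processing present during word blocks but not during fixation. This is exactly where the methodological worry bites: anything participants were doing (or encoding) during the 'fixation' blocks gets subtracted away too.

```python
import numpy as np

# Hypothetical signal from one voxel, sampled across alternating word and
# fixation blocks (three samples per block, two blocks of each type).
signal = np.array([1.2, 1.4, 1.3, 0.6, 0.5, 0.7,
                   1.5, 1.3, 1.4, 0.6, 0.8, 0.5])
block = np.array(["word", "word", "word", "fix", "fix", "fix",
                  "word", "word", "word", "fix", "fix", "fix"])

# Subtractive logic: average within each block type, then subtract the
# fixation 'baseline' from the word condition. Whatever participants were
# actually doing during fixation is removed along with it.
word_mean = signal[block == "word"].mean()
fixation_mean = signal[block == "fix"].mean()
print("word - fixation difference:", round(word_mean - fixation_mean, 3))
```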
Imaging Long-Term Memory: Tulving et al. (1994)
Another kind of approach taken to study LTM compares neural activation during the presentation of novel stimuli with activation during the presentation of familiar stimuli. This approach is based on the assumption that the inherent novelty would, on average, result in more robust encoding mechanisms being automatically recruited: 'if we can't devise a condition when it (encoding) is not active, perhaps we can at least devise one when it is less active'. Note that this isn't about recollection but encoding, and in fact you might make rather different predictions if you were focused on testing a hypothesis related to recollection.

Based on this logic, Tulving et al. (1994) showed participants a set of photos on day 1, then brought them back for a follow-up session on another day (day 2). Two kinds of photos were presented on day 2: some 'old' ones (that were shown on day 1), as well as some 'new' ones (that were not shown on day 1). It was predicted that more encoding would take place while viewing new stimuli, as compared to old, which would be reflected in differences in neural activation. They found greater hippocampal and parahippocampal activation during the presentation of 'new' stimuli, as compared to 'old' ones.

Imaging Long-Term Memory: Stern et al. (1996)
Stern et al. (1996) showed 40 novel images with instructions to encode them for a later memory test. They compared activation in a 'one-item' condition, in which only a single image is presented on each trial, to that in a 'many-items' condition, in which several images are presented on each trial (with the trial duration held constant across conditions). They found greater activation in the 'many-items' condition, as compared to the 'one-item' condition, within several regions (including posterior hippocampus, parahippocampal gyrus, and the fusiform gyri).
Methodological note: could boredom be a confound in this design?

Imaging Long-Term Memory: An ERP-Based Approach
Event-related potentials (ERPs) have also been used to study recollection using this general paradigm:
1. Present a series of stimuli, one at a time
2. Average ERPs time-locked to stimulus onset
3. Test participants' memory for the stimuli
4. Compare the ERPs recorded during initial stimulus presentation, as a function of subsequent memory performance
http://faculty.washington.edu/losterho/erp_tutorial.htm

Note the advantage of having higher temporal resolution (with EEG) here. This approach relies on 'incidental encoding', or encoding that occurs without explicit intention.
Methodological note: one advantage of this design could be that participants perform the same task on every trial (unlike some of the other PET/fMRI work we discussed, e.g. comparing blocks of fixation to blocks of encoding).
One complication: there is lots of variability in how predictive the signal is, which seems to (at least in part) be related to the form the test takes (free recall vs. cued recall vs. comprehension).
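As a final illustration, here is a small sketch of the subsequent-memory logic in the ERP paradigm above (made-up epochs and effect size, illustrative only): single-trial epochs time-locked to stimulus onset are sorted by whether the item was later remembered, the epochs are averaged within each group, and the two average waveforms are compared (a difference of this kind is sometimes called the Dm, or 'difference due to memory', effect).

```python
import numpy as np

rng = np.random.default_rng(2)

# Fake single-trial EEG epochs: 20 trials x 100 time points, each time-locked
# to stimulus onset. Later-remembered items are given a slightly larger
# deflection so that the averaging step has something to find.
n_trials, n_times = 20, 100
epochs = rng.normal(0.0, 1.0, size=(n_trials, n_times))
remembered = rng.random(n_trials) < 0.5          # outcome of the later memory test
epochs[remembered, 40:60] += 1.5                 # bigger response on later-remembered trials

# Average the stimulus-locked epochs separately for later-remembered and
# later-forgotten items, then compare the two waveforms.
erp_remembered = epochs[remembered].mean(axis=0)
erp_forgotten = epochs[~remembered].mean(axis=0)
difference_wave = erp_remembered - erp_forgotten

print("peak remembered-minus-forgotten difference:", difference_wave.max().round(2))
```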