Summary

These notes cover weeks 4, 5, and 6 of a psychology research methods course. They discuss assessing research, secondary data analysis, observational research, and critical appraisal, and they highlight the importance of research design and psychology's reproducibility crisis.

Full Transcript


Week 4/5/6
Week 4: Assessing research
Week 5: Secondary Data Analysis
Week 6: Observational Research

Week 4

Psychology's "reproducibility crisis"
― Difficult to replicate studies and find the same effects
― If something is real/true/accurate, you should be able to find the effect multiple times, across different contexts, etc.
― Just because an article has been peer reviewed and published doesn't mean it is always good research/true

The scientific method
Our conclusions can be faulty due to threats to:
― Internal validity: is there only one explanation of the results? Threat: a confounding variable provides an alternative explanation for the relationship between the two variables (e.g. time of day, participant gender).
― External validity: generalisability of findings. Threat: selection bias, where the sampling procedure favours the selection of some individuals over others and so provides a limited range of participant characteristics.

Critical appraisal
― A balanced assessment of the strengths of research against its limitations
― Assess the process of research + the results of research (a murky distinction)
― Must consider all aspects of the research:
  ― Clarity of the research question and/or scope
  ― Quality of the research design
  ― Appropriateness of the statistical analysis
  ― Appropriateness of the interpretation of results
Why retrospectively and critically review research? To decide:
― Whether to "believe" an effect
― Whether to build your (future) research upon past findings
― Whether to endorse a particular therapy or intervention for your client or organisation

Critically appraising an article: Should CBT be coupled with between-session internet-based clinician support? (Ivanov et al.)
― Results show a significant reduction in hoarding behaviour, which is a good thing!
― There was no control group, which is OK here; it just means we cannot take two different time points from the study and say that the intervention CAUSED the differences between them
― Small sample size – this is OK in this case because it is a "feasibility study", where the researchers implement an intervention on a small scale to see if it would work on a large scale

Formal Assessment Tools
Assessment of quality:
― CASP (Critical Appraisal Skills Programme) checklists: to assess specific types of studies (e.g. RCTs)
― Cochrane's Risk of Bias Tool (RCTs specifically)
Article reporting guidelines:
― CONSORT (Consolidated Standards of Reporting Trials) checklist
― APA JARS (Journal Article Reporting Standards)
Assessment of strength of evidence for specific interventions:
― NHMRC evidence hierarchy (used by the APS in their review of therapies)

CASP questions to help you understand a randomised controlled trial
A. Are the results valid?
1. Did the trial address a clearly focused issue?
2. Were patients randomly assigned to treatments?
3. Were all of the patients who entered the trial properly accounted for at its conclusion (i.e. did any drop out)?
4. Were patients, health workers, and study personnel "blind" to treatment (did they know if they were in the control vs treatment group)?
5. Were the groups similar at the start of the trial?
6. Aside from the experimental intervention, were the groups treated equally?
B. What are the results?
7. How large was the treatment effect?
8. How precise was the treatment effect?
C. Will the results help locally?
9. Can the results be applied to the local population, or in your context?
10. Were all clinically important outcomes considered?

A study to critically appraise: gamification using a mobile health app
1. Clarity of aims: aims clearly stated, as well as how they were addressed; hypotheses clearly stated
2. Random allocation: participants were randomised AND the method used to randomise was stated
3. All subjects accounted for: drop-outs were mentioned
4. Blinding: the experimenter did not intervene and tell people which group they were in
5. Baseline group comparability: no difference between groups on demographics (age, gender, ethnicity)
6. Equal treatment?
7. Treatment effect: standardised (partial eta squared, d) and unstandardised (mean differences) effect sizes reported for psychological effects; lots of analyses: interaction effects, main effects, and post-hoc contrasts
8. Precision of results: no error bars or CIs included in the graphs!! You should always include them! SDs reported in Table 1 but no CIs in the text either – this is bad! CIs were reported around adherence effects (odds ratios). (A sketch of reporting these effects follows this list.)
9. Application of results: most participants were 35–44 years of age, male, and white – issues with generalisability
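To make points 7 and 8 concrete, here is a minimal sketch of reporting both an unstandardised effect (mean difference with a 95% CI) and a standardised effect (Cohen's d) for two independent groups. All numbers are invented for illustration; none come from the gamification study.

```python
# Minimal sketch (hypothetical data): reporting an unstandardised effect
# (mean difference with 95% CI) and a standardised effect (Cohen's d),
# the quantities CASP questions 7-8 ask about.
import numpy as np
from scipy import stats

treatment = np.array([12.1, 9.8, 11.4, 13.0, 10.2, 12.7, 11.9, 10.8])
control   = np.array([ 9.1, 8.4, 10.0,  9.7,  8.8,  9.5, 10.3,  8.9])

n1, n2 = len(treatment), len(control)
mean_diff = treatment.mean() - control.mean()   # unstandardised effect

# Pooled SD and Cohen's d (standardised effect)
sp = np.sqrt(((n1 - 1) * treatment.var(ddof=1) +
              (n2 - 1) * control.var(ddof=1)) / (n1 + n2 - 2))
d = mean_diff / sp

# 95% CI for the mean difference (pooled-variance t interval)
se = sp * np.sqrt(1 / n1 + 1 / n2)
t_crit = stats.t.ppf(0.975, df=n1 + n2 - 2)
ci = (mean_diff - t_crit * se, mean_diff + t_crit * se)

print(f"Mean diff = {mean_diff:.2f}, 95% CI = [{ci[0]:.2f}, {ci[1]:.2f}], d = {d:.2f}")
```

Reporting all three together is exactly what point 8 criticises the study for omitting: the CI conveys the precision that a bare mean difference or d value does not.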
Journal quality
― Peer review should reduce bias in published research
― Journals each have an impact factor (IF): the number of citations per article published in the journal, reported in InCites Journal Citation Reports (JCR) (a worked example of the calculation follows below)
― Some 2023 IFs – generally the higher the better
  ― Annual Review of Psychology: 23.6 – this is great!

Impact factor
Does the IF measure quality?
― Consider disciplines with small numbers of researchers
― An article may be problematic: lots of researchers discuss it and therefore cite it, giving it a high IF because it is a bad article, not a good one
Can the IF mislead?
― Some types of articles are more likely to be cited than others → higher IF
― Journals can act to increase their IF
― There are different types of IF (e.g. SJR) where data are weighted and the source data can be different
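For reference, the standard two-year JCR impact factor is simply citations per article over a two-year window. The sketch below uses hypothetical counts chosen to reproduce the 23.6 figure above; the real citation counts for Annual Review of Psychology are not in these notes.

```python
# Sketch of the standard two-year JCR impact factor (hypothetical numbers).
# IF for year Y = citations received in Y by items published in Y-1 and Y-2,
# divided by the number of citable items published in Y-1 and Y-2.
citations_2023_to_2021_2022 = 4720   # hypothetical citation count
citable_items_2021_2022 = 200        # hypothetical article count

impact_factor_2023 = citations_2023_to_2021_2022 / citable_items_2021_2022
print(impact_factor_2023)  # 23.6, i.e. 23.6 citations per article on average
```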
Week 5

Secondary data analysis
― Retrospective analysis using existing datasets
― Additional analysis of previously collected data

Advantages
― Access to a broad range of populations, e.g. minority groups
― Can use longitudinal research and birth cohorts to address RQs (where you couldn't run the experiment yourself)

Ethical use of data
― Think about whether the benefits of using the secondary data outweigh the risks/concerns to participants (i.e. are they going to be distressed?)

Disadvantages
― YOU don't have control over how the data were collected
― Often a mismatch between the study design, the measures used, and your RQ/hypotheses: the study was designed for a different RQ to yours
― Often a non-experimental study design – cannot answer cause-and-effect questions
― The measures used may not be ideal (brief, different scales, etc.)
― Need to really understand the data, measures, participants, etc.
― Sometimes it costs to access the dataset

Secondary Data Analysis Process (lecturer outline)
1. Determine the RQ (and possible hypotheses)
2. Find an appropriate dataset for your area/RQ
3. Identify the study's design and variables
4. Refine hypotheses using the actual variables: your planned RQ/hypotheses might not actually be testable on this dataset, so refine them around the data that are actually there
5. Analyse the data
6. Make conclusions and write up the results

Secondary Data Analysis Process: Donnellan & Lucas (2013)
1. Find existing data sets
2. Read codebooks: these detail the procedures and methods used to acquire the data and provide a list of all the assessments collected
3. Obtain the dataset: researchers may create a smaller "working file" by taking only the relevant variables from the larger master files
4. Conduct analyses: be cautious and sceptical of the data; try to detect any mistakes in the data

Important Considerations
― May need ethical approval at the start
― Compromising (on design, variables, etc.) can undermine your conclusions
― Data are being used for purposes other than the original intention
― Is it worth using secondary data, or is it better to do primary research?
― Analytic methods need to match the sampling design
― How will you handle the data if people have dropped out?
― Don't data snoop!! That is, looking at the results first and then coming up with a hypothesis based on what the secondary data already found, making your research seem "perfect"

Big datasets
― Longitudinal projects often have lots of missing data; decide how you will deal with missing data
― Need to consider significance levels, effect sizes, etc. if the sample size is huge
― Consider using effect-size cut-offs, rather than p-values alone, as support (or not) for hypotheses (see the sketch below)
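A minimal simulation (invented data, not from the lecture) of why effect-size cut-offs matter when N is huge: a trivially small group difference can still come out statistically significant.

```python
# Sketch (simulated data) of why huge samples need effect-size cut-offs:
# a tiny true difference becomes "significant" at large N.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 100_000                                        # huge sample per group
group_a = rng.normal(loc=0.00, scale=1.0, size=n)
group_b = rng.normal(loc=0.02, scale=1.0, size=n)  # tiny true difference

t, p = stats.ttest_ind(group_a, group_b)
d = (group_b.mean() - group_a.mean()) / np.sqrt(
    (group_a.var(ddof=1) + group_b.var(ddof=1)) / 2
)

print(f"p = {p:.4f}, d = {d:.3f}")
# p will very likely fall below .05 here, yet d sits far below a cut-off
# such as 0.2 ("small" by Cohen's conventions), so under an effect-size
# criterion we would not treat this as meaningful support.
```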
Increased potential for Questionable (dodgy) Research Practices (QRPs), including:
― p-hacking (i.e., exploiting analytic flexibility to obtain statistically significant results)
― Selective reporting of statistically significant results
― Hypothesising after the results are known

Cognitive biases are often the cause of QRPs:
― Apophenia (the tendency to see patterns in random data)
― Confirmation bias (focusing on evidence that is consistent with one's beliefs)
― Hindsight bias (viewing past events as predictable)

Protect against researcher bias
― Pre-registration of hypotheses, methods, and analysis plans, submitting these either to a third-party registry (e.g. the Open Science Framework [OSF]) or as a Registered Report
― Registered Reports are more likely to report null results, report smaller effect sizes, and be replicated

Week 6

Observational research design
Ways to observe behaviour
1. Naturalistic observation
2. Participant observation
3. Field experiments
4. Contrived observation
5. Observation of physical traces
6. Archival research, content analysis, and digital traces
Ways to sample behaviour
1. Continuous sampling
2. Situation sampling
3. Time sampling
4. Instantaneous sampling
5. Event sampling
6. Individual sampling
Ways to record behaviour
1. The frequency method
2. The duration method
3. The interval method
4. Calculating other measures
Reliability and validity

Ways to observe behaviour

Naturalistic observation
Observation in a natural setting, unobtrusive (e.g. Goodall's studies of chimpanzees in the 1960s)
Pros:
― Insight into real-world behaviour (high ecological validity)
― Can observe behaviours that cannot be manipulated by the researcher
Cons:
― Time consuming and expensive
― Potential for observer influence / subjective interpretation
― Can only see public behaviours
― Ethical concerns – participants may not have given consent
Can reduce some concerns: have multiple observers; video record for later analysis; conceal the observer; habituate participants to the observer

Interventions
― Can repeat observations of the same behaviour
― Can manipulate conditions
― Can make causal inferences

Participant observation
The observer engages in the same activities as the people being observed
Pros:
― Allows observation of secretive behaviours
Cons:
― Time consuming
― May affect subjects' behaviours
― May lose objectivity
― Huge ethical concerns – you are lying to people

Field experiments
Manipulate one or more IVs in natural settings
Pros:
― Allow causal inferences
― More time effective
― Allow repeat observations
Cons:
― Confounds may decrease internal validity – is what we see due to what we think it is?
― Ethical concerns – no consent given

Contrived observation (i.e. structured observation)
Settings arranged specifically to facilitate the occurrence of specific behaviours (e.g. invite people into the lab, decorated as an office, to complete a task)
Pros:
― Do not need to wait for behaviours to occur naturally
― More time effective
― Can manipulate independent variables if necessary
Cons:
― Because the environment is less natural, behaviours may be less natural
― Difficult to know what features of the environment need to be preserved

Observation of physical traces
Examining physical surroundings to find reflections of subjects' previous activity. Traces may be conscious changes (e.g., decorations) or things unconsciously left behind/affected (e.g., rubbish). There may be privacy concerns.
The University of Arizona Garbage Project is one of the best-known trace studies, used to find out about such things as food preferences, waste behaviour, and alcohol consumption.
Pros:
― High ecological validity
― No chance of affecting behaviours
― Can observe behaviours that cannot be manipulated by the researcher
― Inexpensive and easy
Cons:
― Interesting behaviours may be missed – no control
― Behaviours might leave NO trace
― Ethical concerns – no consent given, invading people's privacy
Other examples of physical-trace analysis:
― The number of different fingerprints on a page, used to understand readership of various advertisements in a magazine
― The magazines people donated to charity, used to determine people's favourite magazines

Archival research, content analysis, and digital traces

Archival research
Uses historical records to measure and describe behaviours/events occurring in the past
Pros:
― No chance of interfering with behaviours
― Can compare data from different times
― Other researchers can look at the same data
Cons:
― Not actually witnessing behaviours, which limits the inferences that can be drawn

Content analysis
Analyses documents and communication artifacts (e.g., images, text, media, manuals) to infer the occurrence, frequency, and quality of behavioural events
Pros:
― Can identify patterns/features that are not immediately obvious
― Clear and transparent how inferences and conclusions were drawn
Cons:
― Not actually witnessing behaviours
― Analysis may not actually be objective

Digital traces
Artefacts of digital technology users' activities, transactions, and communications
Pros:
― Extremely large data sets
― Data-driven approaches can find surprising patterns
Cons:
― Difficult to explain results
― Ethical concerns: gives researchers power over people; who "owns" the data? Who gets to profit?

Ways to sample behaviour

Continuous Sampling
All behaviour that occurs within a specified time period is coded
Pros:
― Can see antecedents and consequents of the target behaviour
Cons:
― Expensive and time consuming
― When do you record data if you are always observing?

Situation Sampling
Studying a target behaviour in different locations and under different circumstances (e.g. speeding in school zones: check at different times of day, am and/or pm)
Pros:
― Observer only has to focus on events of interest – has time to record data
― High external validity of findings (range of situations, times, locations, etc.)
Cons:
― Expensive to carry out
― Difficult to pre-plan

Time Sampling (a.k.a. Interval Sampling)
The observer decides in advance to observe only during specified time periods (e.g. 1 hour per day) and records the specified behaviour during that period only
Pros:
― Only has to observe behaviours during certain times
― Less time recording
Cons:
― May miss important behaviours
― Difficult to pre-select times

Instantaneous Sampling (a.k.a. Target Time Sampling)
The observer decides in advance the pre-selected moments to observe and records what is happening at that instant; everything happening before or after is ignored
Pros:
― Only has to observe behaviours during certain times
― Less time recording
Cons:
― May miss important behaviours
― Difficult to pre-select times

Event Sampling
The observer pre-decides one specific event or behaviour to be observed (once per observation interval)
Pros:
― Observer only has to focus on events of interest
― Less time recording
Cons:
― May miss important behaviours – antecedents and consequents
― Behaviours may not generalise

Individual Sampling
Only one participant is observed at a time; focus shifts to another participant for the next interval
Pros:
― Less likely to miss important behaviours from the participant currently being observed
Cons:
― Time consuming and expensive
― Observing only one person, we may miss a participant exhibiting behaviours if they only did so while the observer was focusing on others

Recording Behaviour
What you record will depend on the goals of the study. Do you need a comprehensive description of behaviour, OR just a description of selected behaviours? Are the data in a qualitative or a quantitative form?

Quantitative Measures (a sketch contrasting the three methods follows below)
― The Frequency Method (how many): counting the instances of each specific behaviour that occur during a fixed-time observation period (e.g. count the number of iLearn posts)
― The Duration Method (how long): recording how much time an individual spends engaged in a specific behaviour during a fixed observation period (e.g. time spent studying/revising)
― The Interval Method: dividing the observation period into a series of intervals and then recording whether a specific behaviour occurs during each interval (e.g. logged onto iLearn that day, Y/N)
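As flagged above, a short sketch (hypothetical event log, not from the lecture) contrasting the three quantitative recording methods applied to the same 60-minute observation of one behaviour:

```python
# Sketch (hypothetical event log) contrasting the frequency, duration, and
# interval recording methods for one observed behaviour ("studying").
# Each event is (start_minute, end_minute) within a 60-minute observation.
events = [(2, 10), (14, 15), (31, 44)]
interval_len = 10                    # 10-minute intervals for the interval method

frequency = len(events)                               # how many instances
duration = sum(end - start for start, end in events)  # total minutes engaged

# Interval method: did the behaviour occur at all in each 10-minute interval?
intervals = []
for i in range(0, 60, interval_len):
    occurred = any(start < i + interval_len and end > i for start, end in events)
    intervals.append("Y" if occurred else "N")

print(f"Frequency: {frequency} instances")     # 3
print(f"Duration:  {duration} minutes")        # 22
print(f"Interval:  {intervals}")               # ['Y', 'Y', 'N', 'Y', 'Y', 'N']
```

The same observation session yields three quite different summaries, which is why the choice of recording method should follow from the goals of the study.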
Reliability and validity

Inter-observer reliability = the extent to which independent observers agree in their observations.

Intra-rater reliability = the degree of similarity between ratings made at different time points by the same rater for the same stimulus. Would the same rater code the same behaviours in the same way on a second occasion viewing the same sequence of behaviours?

High inter-observer reliability increases confidence that observations are accurate (valid). It:
― is increased by providing clear definitions of behaviours, training observers, and providing feedback about discrepancies
― is assessed by calculating percentage agreement or correlations, depending on how the behaviours were measured and recorded (a minimal sketch appears at the end of these notes)
― Agreement of 90% (or higher) = a "good" level of reliability

What factors could negatively influence inter-rater reliability? Some common problems:
― Error of Apprehending: the observer might not be able to see and determine the behaviour, e.g. cannot tell what the driver is doing due to the angle of the camera
― Error of Recording: poor techniques and equipment; inexperience
― Observer Bias: observers' expectations determine the behaviours they choose to observe and lead to systematic errors in recording behaviour
― Computational Error: usually due to an inappropriate choice of statistical test
― Observer Effect: due to the presence of the observer or stimuli
― Observer Error: inexperience; poorly defined behavioural units; observer drift

Influence of the Observer
If individuals change their behaviour when they know they are being observed ("reactivity"), their behaviour may no longer be representative of their normal behaviour (a.k.a. the Hawthorne effect). Methods to control reactivity include unobtrusive (non-reactive) measurement, adaptation (habituation, desensitisation), and indirect observations of behaviour.
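Finally, the percentage-agreement calculation mentioned under inter-observer reliability above, as a minimal sketch. The interval codes are hypothetical, invented for illustration.

```python
# Sketch (hypothetical interval codes) of percentage agreement between two
# observers who each coded the same 10 intervals as behaviour present/absent.
observer_1 = ["Y", "Y", "N", "Y", "N", "N", "Y", "Y", "N", "Y"]
observer_2 = ["Y", "Y", "N", "Y", "Y", "N", "Y", "Y", "N", "Y"]

agreements = sum(a == b for a, b in zip(observer_1, observer_2))
percent_agreement = 100 * agreements / len(observer_1)

print(f"Percentage agreement: {percent_agreement:.0f}%")
# 90%, which just meets the "good" rule of thumb from these notes
```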
