Lecture 2 - One Way Between Subjects ANOVA 2023-2024
One-way between-subjects ANOVA (Part Revision from Year 1)
T Gerdjikov

Overview of ANOVA lectures
Analysis of Variance (ANOVA) is essential for the successful analysis of undergraduate project/dissertation work.
Using ANOVA we can analyse data in a way that is acceptable for most undergraduate projects, and possibly for publication.
The emphasis is on application.
Advanced statistical material is kept to a minimum.

Overview of lectures on ANOVA
We will cover one-way and factorial ANOVA and how to make sense of and apply ANOVA designs.
Many concrete examples to support learning.
Introduce some of the most frequent experimental designs in psychology.

Aims and objectives of this lecture
Introduce the theory behind ANOVA.
Understand the basic principles of ANOVA: Why is it done? What does it tell us?
Do not worry about complicated maths. The PC will do the grunt work for us!
Emphasis on underlying logic and concepts.

ANOVA vs. t-test
Both do the same job: testing for mean differences.
t-tests are limited to one pair of means (one IV, or ‘factor’, with two levels).
Example: 2 groups (Coffee vs. No-coffee) complete a puzzle. We measure their speed in solving the task (hence the DV is ‘Time’).
Coffee group…
No-coffee (control) group…

Compare the t-test and ANOVA results

Digression: what do error bars tell us?
We usually use standard errors of the mean (SEM) in plotting error bars, rather than the standard deviation, which you may be more familiar with.
SEM error bars can aid inferential statistics (eyeballing whether the means differ significantly). However, to be sure we should always look at the printed p-values.
If two SEM error bars overlap, the means most likely do not differ significantly.
However, in general, the opposite is not true: if the SEM error bars don’t overlap, we cannot be sure that the means are different.
JASP also offers 95% confidence interval (CI) error bars, which we won’t cover.
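To make the SEM idea concrete, here is a minimal sketch of how an SEM error bar could be computed by hand. It borrows the Coffee and No-coffee scores from the three-group worked example later in this lecture, which is an assumption because the two-group slide does not list its data. The helper names `sem`, `sem_bar` and `bars_overlap` are made up for illustration and are not JASP functions.

```python
import math
import statistics

# Puzzle times borrowed from the worked example later in the lecture
# (assumption: the two-group slide does not show its own data).
coffee = [35, 28, 33, 29, 18, 44, 28, 31, 33, 26]
no_coffee = [45, 34, 45, 56, 32, 48, 51, 32, 55, 48]


def sem(scores):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(scores) / math.sqrt(len(scores))


def sem_bar(scores):
    """Return the (lower, upper) ends of a mean +/- 1 SEM error bar."""
    centre = statistics.mean(scores)
    half_width = sem(scores)
    return (centre - half_width, centre + half_width)


def bars_overlap(a, b):
    """True if the two SEM error bars share any values."""
    low_a, high_a = sem_bar(a)
    low_b, high_b = sem_bar(b)
    return low_a <= high_b and low_b <= high_a


for name, scores in [("Coffee", coffee), ("No-coffee", no_coffee)]:
    print(f"{name}: mean = {statistics.mean(scores):.2f}, "
          f"SD = {statistics.stdev(scores):.2f}, SEM = {sem(scores):.2f}")

print("SEM bars overlap:", bars_overlap(coffee, no_coffee))
```

As the slide warns, non-overlapping bars are only a hint; the printed p-value is still what decides significance.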
Now let’s add a third group: ‘Decaf drink’
We want to answer the question: is the effect on performance due to caffeine, or to coffee smell, expectation, etc.?
Now we have 3 groups: Coffee, No-coffee and Decaf.
One IV with 3 levels, same DV.
Group 1…
Group 2…
Group 3…

Comparison between ANOVA and t-test
Why not use lots of t-tests? Why not just compare pairs of means with t-tests?
This would require a total of 3 t-tests.
Each t-test carries a 5% chance of finding a difference when none exists in reality (Type I error, or α); that is the risk we accept when we use p < 0.05 as the criterion.
The three separate t-tests accumulate a relatively large α level:
Familywise error rate = $1 - (0.95)^3 = 0.1426$ (14%)
ANOVA performs one ‘omnibus’ test which circumvents this problem.
Slight disadvantage: ANOVA cannot tell us which specific groups differ statistically, only that at least two groups do.

Analysis of Variance (ANOVA)
ANOVA can be used to analyse experiments with more than one independent variable (aka ‘factor’), each with two or more levels.
E.g. effect of coffee on recall for easy vs. difficult problems. Two IVs:
Drink type with 3 levels: coffee, no-coffee, decaf
Problem difficulty with 2 levels: easy vs. difficult
Allows us to analyse complex designs and look at several IVs simultaneously.
Some scientific questions may not be answerable using a single IV.
In experimental research, most studies are analysed using ANOVA.

Sources of variability in ANOVA

What causes the variability among our participants?
Even in the same experimental group, participant scores are likely to differ.
This ‘unexplained’ variability may be described as within-group variability or residual (error) variance.
Sources:
Individual differences: different levels of skill, knowledge and strategy used
State of the participant, e.g., current level of attention
Measurement error, e.g., accuracy in recording behavioural scores

The ‘average’ scores of the different groups may also differ.
This is often referred to as between-group variability.
It is due to experimental manipulations.
Can we partial that out from residual or ‘error’ variability?

Theory of ANOVA: partitioning variability
[Figure: three panels plotting the dependent variable for Group1, Group2 and Group3: 1. Total variability; 2. Between-group variability; 3. Within-group (error) variability]
Quick and intuitive hand calculation of each source of variability (Sums of Squares):
SStotal - Khan Academy video link
SSbetween and SSerror - Khan Academy video link

Theory of ANOVA (3 types of variability)
Mean Squares: ‘df-adjusted Sums of Squares’.
1) Total Mean Square (MSTOTAL): variability between all the scores.
2) Model Mean Square (MSMODEL or BETWEEN-GROUP): variability explained by our experimental manipulation or independent variable (IV), i.e., by our ‘model’.
3) Residual Mean Square (MSRESIDUAL, WITHIN-GROUP or ERROR): variability which cannot be explained by the model; residual differences within the groups.

Partitioning the variance
It’s all about the cake!
[Figure: the total variance in the data split into experimental variance and random variance, shown twice: one split labelled ‘Likely to be significant’, the other ‘Likely to be non-significant’]

One-way between-subjects ANOVA example

One-way between-subjects ANOVA
Example: Does drinking coffee affect problem-solving?
Three groups of 10 individuals each.
IV: drink type with 3 conditions: Coffee, No-coffee and Decaf.
DV: time taken to complete a puzzle.
Group 1…
Group 2…
Group 3…

One-way between-subjects ANOVA
H0: no difference in time to complete the puzzle between the three groups.
H1: a difference between at least two of the three groups.

Coffee   No Coffee   Decaf
35       45          47
28       34          54
33       45          50
29       56          35
18       32          55
44       48          49
28       51          49
31       32          39
33       55          44
26       48          43

Ideally, between-group variance should be greater than residual variance (within-group).
We could also say we are testing whether the different groups (‘samples’) come from the same underlying population.
If they do, participants perform ‘the same’ regardless of treatment.
Samples from the same population should have similar means (subject to some noise).
So ANOVA tests whether the three group means come from the same or different populations.

One-way between-subjects ANOVA: How ANOVA works
Estimate the spread of scores (variance) between groups, called Mean Square Model (MSBetween-group).
Estimate the spread of scores (variance) within groups, called Mean Square Residual (MSResidual).
If the samples belong to the same population, then the ratio MSBetween-group/MSResidual should be small, i.e. close to 1 (Example 1).
If the samples belong to different populations, then the ratio MSBetween-group/MSResidual should be large (Example 2).
[Figure: plots labelled ‘Example 1’ and ‘Example 2’]

One-way between-subjects ANOVA: How ANOVA works
Visual inspection suggests scores for coffee-drinkers are lower: they completed the puzzle in less time than the other two groups.
Let’s see if this can be supported statistically…

One-way between-subjects ANOVA results: Descriptive statistics
Is the ‘spread’ similar across groups (i.e. homogeneity of variance)?
This is one important assumption of ANOVA which needs to be tested.

One-way between-subjects ANOVA results: Homogeneity of Variance
Strictly speaking, we should only use ANOVA (without corrections) if the groups have very similar spread (variance).
Tested using Levene’s Test for Equality of Variances.
If the sample variances do not differ significantly, then you can use ANOVA.
Here p > .05: conclude there is no evidence that the variances differ.
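The homogeneity check can be sketched by hand as follows. This is an illustration under stated assumptions: it uses the classic mean-centred form of Levene's test (a one-way ANOVA run on each score's absolute deviation from its group mean), and JASP's own implementation may centre the scores differently. The function names are made up for this sketch. No p-value is computed; the statistic is compared with the tabulated critical F for df = (2, 27) at α = .05.

```python
import statistics

# Puzzle times from the worked example, one list per beverage group.
groups = {
    "Coffee": [35, 28, 33, 29, 18, 44, 28, 31, 33, 26],
    "No coffee": [45, 34, 45, 56, 32, 48, 51, 32, 55, 48],
    "Decaf": [47, 54, 50, 35, 55, 49, 49, 39, 44, 43],
}


def f_ratio(samples):
    """One-way ANOVA F-ratio: MS between groups / MS within groups."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    ss_between = sum(
        len(s) * (statistics.mean(s) - grand_mean) ** 2 for s in samples
    )
    ss_within = sum(
        sum((x - statistics.mean(s)) ** 2 for x in s) for s in samples
    )
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def levene_w(samples):
    """Mean-centred Levene statistic: an ANOVA on absolute deviations."""
    abs_dev = [[abs(x - statistics.mean(s)) for x in s] for s in samples]
    return f_ratio(abs_dev)


for name, scores in groups.items():
    print(f"{name}: variance = {statistics.variance(scores):.2f}")

w = levene_w(list(groups.values()))
F_CRIT_2_27 = 3.35  # tabulated critical F for df = (2, 27) at alpha = .05
print(f"Levene W = {w:.3f}; exceeds critical value: {w > F_CRIT_2_27}")
```

A statistic below the critical value agrees with the slide's conclusion that there is no evidence the variances differ.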
Calculations shown on the main ANOVA table

Computing the degrees of freedom (df)
dfBetween-group (here dfBeverage) = number of groups − 1 = k − 1 = 3 − 1 = 2
dfResidual = number of groups × (number of people in each group − 1) = 3 × (10 − 1) = 27

Computing the Mean Square
Just divide each Sum of Squares by its corresponding df.
Mean Square (MS) Between-Groups (here ‘Beverage’) = Sum of SquaresBeverage/dfBeverage = 1528.067/2 = 764.033
Mean Square (MS) Within Groups = Sum of SquaresResiduals/dfResiduals = 1499.4/27 = 55.533

Computing the Mean Square
Average variability that our model can explain = 764
Average variability our model cannot explain (residual/errors) = 55.5
We expect MSBeverage to be greater than MSResidual.

Computing the F-ratio
$$F = \frac{MS_{\text{between-group}}}{MS_{\text{residual}}} = \frac{764.03333}{55.53333} = 13.75810$$
The model explains approx. 14 times more variation than the residual.
The F-ratio is the ratio of the variability that the model explains to the variability that the model does not explain.

Effect Size with ANOVA
When reporting effect size with ANOVA, we use ηp² (partial eta squared).
How do we interpret ηp² (Cohen, 1988)?
Effect size   η² (eta squared)
Small         0.01
Medium        0.06
Large         0.14
E.g., ηp² = 0.35: conclude a large effect size. It also means that 35% of the variance in the DV can be accounted for by the IV.

Why effect sizes?
Statistical vs. practical significance
Statistical significance (p-value) is influenced by sample size.
Hence p = 0.0001 does not imply a larger effect than p = .04.
Example: weight loss with weight app vs. dietician
p < 0.05, significant! But is a 0.2 kg weight loss of ‘practical’ significance?
https://www.graphpad.com/quickcalcs/ttest2/
Used in meta-analyses (later lectures)
Used in power analyses

Effect Size with ANOVA: “% of variance accounted for by model”?
$$\eta_p^2 = \frac{SS_{\text{between-group}}}{SS_{\text{between-group}} + SS_{\text{residual}}} = \frac{1528}{1528 + 1499} = 0.5048$$
Large effect size: about 50% of the variance is accounted for by ‘Beverage’ (experimental manipulation).
(Small print: η² and ηp²: for a single IV, they are the same.)
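The calculations on the preceding slides can be reproduced with a short script. This is a minimal sketch rather than what JASP does internally: the function `one_way_anova` and its dictionary keys are names chosen here for illustration, and the scores are the 30 puzzle times from the worked example. It omits the p-value, which needs the F distribution.

```python
# Puzzle times from the worked example, one list per beverage group.
groups = {
    "Coffee": [35, 28, 33, 29, 18, 44, 28, 31, 33, 26],
    "No coffee": [45, 34, 45, 56, 32, 48, 51, 32, 55, 48],
    "Decaf": [47, 54, 50, 35, 55, 49, 49, 39, 44, 43],
}


def one_way_anova(samples):
    """Return the entries of a one-way between-subjects ANOVA table."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    means = [sum(s) / len(s) for s in samples]

    # Between-group SS: group means around the grand mean, weighted by n.
    ss_between = sum(
        len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means)
    )
    # Residual SS: each score around its own group mean.
    ss_residual = sum(
        (x - m) ** 2 for s, m in zip(samples, means) for x in s
    )

    df_between = k - 1
    df_residual = n_total - k  # equals k * (n - 1) when group sizes are equal
    ms_between = ss_between / df_between
    ms_residual = ss_residual / df_residual

    return {
        "ss_between": ss_between,
        "ss_residual": ss_residual,
        "df_between": df_between,
        "df_residual": df_residual,
        "ms_between": ms_between,
        "ms_residual": ms_residual,
        "f": ms_between / ms_residual,
        "eta_sq": ss_between / (ss_between + ss_residual),
    }


table = one_way_anova(list(groups.values()))
for key, value in table.items():
    print(f"{key:12s} {value:10.3f}")
```

Running it on the lecture data gives the same Sums of Squares, Mean Squares, F-ratio and effect size as the slides, which is a useful check that the table has been read correctly.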
Use effect size to estimate required sample size (power analysis)
Calculate how many participants you should test to be confident of obtaining a significant effect (total sample size).
More participants are required to detect smaller effect sizes.
Power: what probability of detecting a true effect do we desire? Test ‘sensitivity’.

Posthocs

Posthoc tests
Why needed?
To put it simply, the F-ratio only tells us that there is at least one pair of means among all the conditions that differ. But which pair?
Coffee vs. decaf
Coffee vs. no-coffee
Decaf vs. no-coffee
Additional tests are needed…
Why not just do t-tests? Because of the familywise error rate.
Think of posthocs as ‘kind-of-like’ t-tests but with tweaks to how the p-value is calculated to control the familywise error rate.
This comes with a penalty:
Stricter criterion to accept an effect as significant.
Reduced statistical power (increase in Type II error).

Posthoc tests: two examples
Bonferroni-corrected t-test: divide the alpha level by the number of comparisons. E.g., with 3 comparisons, alpha = 0.05/3 = 0.017 for each comparison.
Bonferroni is a very conservative post-hoc test which can be used as a safe option.
Tukey HSD: adequate protection against Type I error without being excessively conservative. A good choice when we have near-equal sample sizes.

There is a whole zoo of different posthocs…
Choice of posthoc may depend on whether sample sizes are equal and whether different assumptions are met.
In this module we will concentrate on Tukey’s HSD, Bonferroni and …
E.g., what if homogeneity of variance is violated…
Choose a method that does not rely on the assumption of equal variances, e.g. the Games-Howell procedure.
A selection of posthocs available through JASP…

Posthoc using JASP
The posthoc table lists all pairwise comparisons with: associated mean differences, standard errors of the difference, an associated t-test and a p-value.
We should get used to reading these tables even if they seem a bit confusing at first!
Best to read them in conjunction with a descriptives plot.

Final points
How is ANOVA calculated if group sizes differ (the more realistic case)?
Calculations are modified (not discussed here).
Equal group size calculations are the most straightforward for an intuitive understanding of the main concepts.
What happens if the data are not normal?
ANOVA is fairly robust and can tolerate some violation of its assumptions.
Use of non-parametric statistics (discussed in week 2 lectures).
Why is today’s ANOVA variety called one-way between-subjects?
One-way: only one factor (or independent variable).
Between-subjects: the factor manipulation is given between different subjects.

Textbook chapters – see Handbook timetable
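As a recap of the familywise error rate and the Bonferroni correction covered earlier in the lecture, the arithmetic can be sketched in a few lines. The function name `familywise_error` is made up for this illustration, and the formula assumes the tests are independent, which pairwise comparisons on the same data only approximate.

```python
from itertools import combinations

ALPHA = 0.05
conditions = ["Coffee", "No-coffee", "Decaf"]


def familywise_error(alpha, n_tests):
    """Chance of at least one Type I error across n_tests tests.

    Assumes the tests are independent (a simplification).
    """
    return 1 - (1 - alpha) ** n_tests


# Every pair of conditions needs its own comparison.
pairs = list(combinations(conditions, 2))
n_comparisons = len(pairs)

# Bonferroni: share the overall alpha equally between the comparisons.
bonferroni_alpha = ALPHA / n_comparisons

uncorrected = familywise_error(ALPHA, n_comparisons)
corrected = familywise_error(bonferroni_alpha, n_comparisons)

print("Pairwise comparisons:", pairs)
print(f"Uncorrected familywise error rate: {uncorrected:.4f}")
print(f"Bonferroni alpha per comparison:   {bonferroni_alpha:.3f}")
print(f"Familywise rate after correction:  {corrected:.4f}")
```

With three groups there are three pairs, so the uncorrected rate matches the 14% quoted in the lecture, while the corrected per-comparison alpha keeps the familywise rate just under 5%.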