Lecture 6 - ANCOVA
University of York
Summary
This document is a lecture on research design and statistics, focusing on ANCOVA and factorial independent ANOVA. It includes a decision tree illustrating different statistical methods.
Full Transcript
Research Design and Statistics
Lecture 6: ANCOVA and Factorial Independent ANOVA (891962)

Overview
- Modelling a covariate using the linear model
- ANCOVA and its link to regression and ANOVA
- Multiple factors: extending the independent ANOVA to account for more than one categorical predictor variable

Decision tree - our learning framework
[Figure: decision tree for choosing a statistical test. It asks, in order: (1) What sort of measurement - continuous (CONT) or categorical (CAT)? (2) How many predictor variables - one, or two (or more)? (3) What type of predictor variable - CONT, CAT or both? (4) How many levels of the categorical predictor - two, or more than two? (5) Same (S) or different (D) participants for each predictor level? (6) Are the assumptions for parametric tests met? If yes, the branches lead to Pearson correlation or regression, multiple regression, independent or dependent t-test, one-way independent ANOVA, one-way repeated-measures ANOVA, factorial independent, repeated-measures or mixed ANOVA, ANCOVA, logistic regression, log-linear analysis, or the chi-squared test; if not, to the Spearman correlation, Mann-Whitney, Wilcoxon, Kruskal-Wallis or Friedman tests.]

Last week: More than two means - ANOVA
Much like the model for a t-test, we can write a model for more than two means:

Outcomei = (model) + errori
Yi = b0 + b1X1i + b2X2i + ei
Yi = (b0 + b1X1i + b2X2i) + ei

In the case here we have three means, and the model accounts for the three levels of a categorical variable with dummy variables, as before.

Let's go through an example [Field, 5th Ed., Chapter 12]. We have:
1. a control group (this could be a treatment-as-usual, a no-treatment, or ideally some kind of placebo group - for example, if our hypothesis was specifically about puppies we could give people in this group a cat disguised as a dog);
2. 15 minutes of puppy therapy (a low-dose group);
3. 30 minutes of puppy contact (a high-dose group).
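The dummy-variable model above can be sketched with ordinary least squares. This is a minimal illustration, not Field's data: the happiness scores below are made up, and only NumPy is used. It shows that b0 recovers the control-group mean and b1, b2 recover each therapy group's difference from control.

```python
import numpy as np

# Hypothetical happiness scores for the three groups (invented for illustration)
control = np.array([3.0, 2.0, 1.0, 1.0, 4.0])   # no puppy therapy
short   = np.array([5.0, 2.0, 4.0, 2.0, 3.0])   # 15 minutes (low dose)
long_   = np.array([7.0, 4.0, 5.0, 3.0, 6.0])   # 30 minutes (high dose)

y = np.concatenate([control, short, long_])

# Dummy coding from the lecture: Long = 1 only for the 30-minute group,
# Short = 1 only for the 15-minute group; control is the baseline (0, 0)
n = len(control)
long_dummy  = np.concatenate([np.zeros(n), np.zeros(n), np.ones(n)])
short_dummy = np.concatenate([np.zeros(n), np.ones(n), np.zeros(n)])

# Design matrix for Y = b0 + b1*Long + b2*Short + e
X = np.column_stack([np.ones(3 * n), long_dummy, short_dummy])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

print(b[0], control.mean())                 # b0 equals the control mean
print(b[1], long_.mean() - control.mean())  # b1: high dose vs control
print(b[2], short.mean() - control.mean())  # b2: low dose vs control
```

With dummy coding the least-squares fit reproduces the group means exactly, which is why ANOVA is just a special case of the linear model.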
The dependent variable was a measure of happiness ranging from 0 (as unhappy as I can possibly imagine being) to 10 (as happy as I can possibly imagine being).

Dummy variables
Yi = b0 + b1X1i + b2X2i + ei
Happinessi = b0 + b1Longi + b2Shorti + ei

Group                   Dummy variable 1 (Long)   Dummy variable 2 (Short)
Control                 0                         0
15 minutes of therapy   0                         1
30 minutes of therapy   1                         0

Dummy variables and a continuous covariate
Happinessi = b0 + b1Longi + b2Shorti + b3Puppylovei + ei

The model is now modified to include a continuous variable, Puppylove, for which each individual has a value. The aim is essentially to discount the effect of this variable on the outcome before accounting for the effects of the treatment groups. This is similar to performing a hierarchical regression, which tests whether the addition of a predictor increases the variance explained by the model after the variance accounted for by other predictors is removed.

Data would look like this
Three columns:
Column 1: the dose of puppy therapy, which defines the groups and is the categorical predictor of the differences between means
Column 2: the outcome score, which is participant happiness
Column 3: the continuous covariate, which is the participant's love of puppies

ANCOVA assumptions: The usual suspects
Normality
Homogeneity of variance (Levene's test)

ANCOVA assumptions: Independence of the covariate
Independence of the covariate and the treatment effect means that the categorical predictors and the covariate should not be dependent on each other [see Adam's video]. There is a similar issue in multiple regression: if predictor variables correlate too much with each other, it is unlikely that a reliable estimate of their relationship with the outcome can be calculated.

ANCOVA assumptions: Homogeneity of regression slopes
Homogeneity of regression slopes means that the covariate has a similar relationship with the outcome measure irrespective of the level of the categorical variable - in this case, the group. On the right we have a figure showing that the effect of the covariate is similar for the control and 15-minute groups, which is an example of homogeneity of regression slopes. But the third group, receiving 30 minutes of therapy, exhibits a different slope [see video by Adam]. There are alternative, somewhat more advanced, methods to account for such differences - they are not, in general, uninteresting - but for the ANCOVA analysis they do present a problem.

Analyze -> General Linear Model -> Univariate
The panel on the right shows the window that pops up. Specify:
1. the outcome/dependent variable: happiness
2. the fixed factor, i.e. the categorical predictor: dose of therapy
3. the covariate: love of puppies

Output
The crucial table to look at is the one on the right, labelled Tests of Between-Subjects Effects. Rows 3 and 4 of the table give the effects of the covariate, puppy love, and the predictor, therapy dose, respectively. The F and p values can be read from the final two columns. Quote the df for the effect and the error, e.g. 2, 26.

Output: Adjusted means
The group means can be recalculated once the effect of the covariate is 'discounted'. These values can differ markedly from the original group means and help with interpretation.

Decision tree - our learning framework
[Decision-tree figure repeated - see above.]
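The ANCOVA just described - dummy-coded groups plus a continuous covariate, with adjusted means - can be sketched with ordinary least squares. Everything here is hypothetical: the data are simulated, and the adjusted means are obtained by evaluating each group's prediction at the overall mean of the covariate, i.e. with the covariate's contribution held constant.

```python
import numpy as np

# Simulated data: happiness outcome, therapy-dose group, and the
# continuous covariate "love of puppies" (not Field's real data)
rng = np.random.default_rng(0)
n = 10
group = np.repeat([0, 1, 2], n)              # 0 control, 1 short, 2 long
puppylove = rng.uniform(0, 7, size=3 * n)    # covariate value per person
happiness = (2.0 + 0.5 * puppylove + np.array([0.0, 1.0, 2.5])[group]
             + rng.normal(0, 1, size=3 * n))

# Design matrix: intercept, Long dummy, Short dummy, covariate
long_d = (group == 2).astype(float)
short_d = (group == 1).astype(float)
X = np.column_stack([np.ones(3 * n), long_d, short_d, puppylove])
b, *_ = np.linalg.lstsq(X, happiness, rcond=None)

# Adjusted means: each group's predicted score at the covariate's grand mean
cov_mean = puppylove.mean()
adj_control = b[0] + b[3] * cov_mean
adj_short = b[0] + b[2] + b[3] * cov_mean
adj_long = b[0] + b[1] + b[3] * cov_mean
print(adj_control, adj_short, adj_long)
```

This mirrors what SPSS reports as estimated marginal means: group means after the covariate has been 'discounted' from the outcome.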
Factorial Designs
Independent factorial design: there are several independent variables or predictors, and each has been measured using different entities (between groups). We discuss this design today [see Field, Chapter 14].
Repeated-measures (related) factorial design: several independent variables or predictors have been measured, but the same entities (participants) have been used in all conditions [see Field, Chapter 15].
Mixed design: several independent variables or predictors have been measured; some have been measured with different entities, whereas others used the same entities [see Field, Chapter 16].

Independent factorial design: An example
The study tested the prediction that subjective perceptions of physical attractiveness become inaccurate after drinking alcohol (the well-known beer-goggles effect), but also tests whether this effect differs for attractive and unattractive faces.* 48 participants were randomly subdivided into three groups of 16: (1) a placebo group drank 500 ml of alcohol-free beer; (2) a low-dose group drank 500 ml of average-strength beer (4% ABV); and (3) a high-dose group drank 500 ml of strong beer (7% ABV).
Within each group, half (n = 8) rated the attractiveness of 50 photos of unattractive faces on a scale from 0 (pass me a paper bag) to 10 (pass me their phone number), and the remaining half rated 50 photos of attractive faces. The outcome for each participant was their median rating across the 50 photos.

*If the beer-goggles effect is driven by alcohol impairing symmetry judgements, then you'd expect a stronger effect for unattractive (asymmetric) faces (because alcohol will affect the perception of asymmetry) than for attractive (symmetric) ones.

Independent factorial designs and the linear model
Yi = b0 + b1X1i + b2X2i + ei
Happinessi = b0 + b1Longi + b2Shorti + ei

Adapted to our new experiment:
Attractivenessi = b0 + b1FaceTypei + b2Alcoholi + ei

Independent factorial designs: Interactions
Attractivenessi = b0 + b1FaceTypei + b2Alcoholi + ei
Attractivenessi = b0 + b1FaceTypei + b2Alcoholi + b3Interacti + ei

The first equation models the two predictors in a way that allows them to account for variance in the outcome separately, much like a multiple regression model. The second equation adds a term that models how the two predictor variables interact with each other, accounting for variance in the outcome that neither predictor can account for alone. The interaction is important to us because it tests our hypothesis that alcohol will have a stronger effect on the ratings of unattractive than attractive faces.

Type of face    Alcohol     Dummy (FaceType)  Dummy (Alcohol)  Interaction  Mean rating
Unattractive    Placebo     0                 0                0            3.500
Unattractive    High dose   0                 1                0            6.625
Attractive      Placebo     1                 0                0            6.375
Attractive      High dose   1                 1                1            6.125

This is a simplified version of the coding scheme. Importantly, it shows that the interaction is zero for all conditions other than the one in which both predictors are 'present'. It therefore models the benefit or cost of having both variables present.
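The coding scheme above can be checked numerically. A short sketch: build the interaction predictor as the product of the two dummies, then solve the saturated model (four coefficients, four cell means) exactly. The cell means are the ones from the lecture's table.

```python
import numpy as np

# Cell means from the lecture's simplified coding table:
# (FaceType dummy, Alcohol dummy, mean rating)
cells = [
    (0.0, 0.0, 3.500),   # Unattractive, Placebo
    (0.0, 1.0, 6.625),   # Unattractive, High dose
    (1.0, 0.0, 6.375),   # Attractive, Placebo
    (1.0, 1.0, 6.125),   # Attractive, High dose
]
face = np.array([c[0] for c in cells])
alcohol = np.array([c[1] for c in cells])
mean = np.array([c[2] for c in cells])

# The interaction predictor is simply the product of the two dummies
interact = face * alcohol

# Saturated model: four coefficients and four cell means, so an exact solve
X = np.column_stack([np.ones(4), face, alcohol, interact])
b = np.linalg.solve(X, mean)
print(b)  # b0 = 3.5, b1 = 2.875, b2 = 3.125, b3 = -3.375
```

Note that b3 = (6.125 - 6.625) - (6.375 - 3.500) = -3.375: exactly the difference-of-differences the interaction coefficient is meant to capture.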
The interaction is the product of the dummy variables. It is often denoted with a cross - in our case, FaceType x Alcohol.

Interactions: following the maths to get some intuition

Type of face    Alcohol     Dummy (FaceType)  Dummy (Alcohol)  Interaction  Mean rating
Unattractive    Placebo     0                 0                0            3.500
Unattractive    High dose   0                 1                0            6.625
Attractive      Placebo     1                 0                0            6.375
Attractive      High dose   1                 1                1            6.125

Substituting each row's dummy values into the model in turn:

Unattractive, Placebo: Attractivenessi = b0 + b1 x 0 + b2 x 0 + b3 x 0 + ei
b0 = X̄Unattractive,Placebo

Attractive, Placebo: Attractivenessi = b0 + b1 x 1 + b2 x 0 + b3 x 0 + ei
b1 = X̄Attractive,Placebo - X̄Unattractive,Placebo

Unattractive, High dose: Attractivenessi = b0 + b1 x 0 + b2 x 1 + b3 x 0 + ei
b2 = X̄Unattractive,HighDose - X̄Unattractive,Placebo

Attractive, High dose: Attractivenessi = b0 + b1 x 1 + b2 x 1 + b3 x 1 + ei
b3 = (X̄Attractive,HighDose - X̄Unattractive,HighDose) - (X̄Attractive,Placebo - X̄Unattractive,Placebo)
Interactions: What it represents
b3 = (X̄Attractive,HighDose - X̄Unattractive,HighDose) - (X̄Attractive,Placebo - X̄Unattractive,Placebo)

The interaction coefficient, b3, measures how the effect of face type (on face ratings) depends on the dose of alcohol. If b3 is large in size, either positive or negative, we will know that alcohol dose has a large effect on the ratings of the different face types. If, however, the interaction coefficient is small, we would know that the difference in attractiveness ratings between the face types does not depend on the alcohol dose.

How do we know whether our coefficients in the model are significant?
We follow the same routine: compute sums of squares for each factor of the model (and their interaction) and compare them to the residual sum of squares, which measures what the model cannot explain.

Compute the Total Sum of Squares
SST = Σ(xi - x̄grand)²
Here we sum over all participants, i = 1 to N. The degrees of freedom, df, are N - 1 - in this case 47.

Compute the Model Sum of Squares
SSM = Σ ng(x̄g - x̄grand)²
Here we sum over all groups (g = 1 to 6, because we had six groups). There is a mean value for each group, g, and we multiply the squared difference between the group mean and the grand mean by the number of participants in each group.
In our example we had 8 participants per group. The degrees of freedom, df, are g - 1 - in this case 5.

Compute the Main Effect of Face Type Sum of Squares
SSA = Σ ng(x̄g - x̄grand)²
We use the same general equation, but we now consider only two groups: the participants who rated attractive faces and the participants who rated unattractive faces. We therefore sum over g = 1 to 2, because we have two groups. There is a mean value for each group, g, and we multiply the squared difference between the group mean and the grand mean by the number of participants in each group. In our example we had 24 participants per group. The degrees of freedom, df, are g - 1 - in this case 1.

Compute the Main Effect of Alcohol Sum of Squares
SSB = Σ ng(x̄g - x̄grand)²
Again we use the same general equation, but we now consider three groups: the participants who received the placebo, the low dose, and the high dose of alcohol. We therefore sum over g = 1 to 3, because we have three groups. There is a mean value for each group, and we multiply the squared difference between the group mean and the grand mean by the number of participants in each group. In our example we had 16 participants per group. The degrees of freedom, df, are g - 1 - in this case 2.

Compute the Interaction Sum of Squares
SSAxB = SSM - SSA - SSB
dfAxB = dfM - dfA - dfB = 5 - 1 - 2 = 2

Compute the Residual Sum of Squares
SSR = Σ sg²(ng - 1)
We use the individual variance of each group and multiply it by one less than the number of people within the group (ng), in this case ng = 8. The degrees of freedom, df, are g(ng - 1) - in this case 6 x 7 = 42.

Mean Sums of Squares
MSA = SSA / dfA
MSB = SSB / dfB
MSAxB = SSAxB / dfAxB
MSR = SSR / dfR

F-ratios
FA = MSA / MSR
FB = MSB / MSR
FAxB = MSAxB / MSR

Effect size
There are three effects that have associated F values.
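The sum-of-squares recipe above can be followed step by step in code. This sketch simulates a 2 (face type) x 3 (alcohol dose) design with 8 participants per cell (the cell means are invented, not the study's results) and computes SST, SSM, SSA, SSB, SSAxB, SSR and the three F-ratios exactly as in the lecture.

```python
import numpy as np

# Hypothetical ratings for a 2 x 3 independent factorial design, n = 8 per cell
rng = np.random.default_rng(42)
cell_means = {('unattractive', 'placebo'): 3.5, ('unattractive', 'low'): 4.9,
              ('unattractive', 'high'): 6.6, ('attractive', 'placebo'): 6.4,
              ('attractive', 'low'): 6.3, ('attractive', 'high'): 6.1}
n = 8
data = {k: m + rng.normal(0, 1, n) for k, m in cell_means.items()}

all_scores = np.concatenate(list(data.values()))
grand = all_scores.mean()
N = all_scores.size                                         # 48 participants

# SST: squared deviations of every score from the grand mean (df = N - 1 = 47)
sst = ((all_scores - grand) ** 2).sum()

# SSM: six cell means vs the grand mean, weighted by cell size (df = 5)
ssm = sum(n * (v.mean() - grand) ** 2 for v in data.values())

# SSA: main effect of face type - two groups of 24 (df = 1)
ssa = sum(24 * (np.concatenate([v for k, v in data.items() if k[0] == f]).mean()
                - grand) ** 2 for f in ('unattractive', 'attractive'))

# SSB: main effect of alcohol - three groups of 16 (df = 2)
ssb = sum(16 * (np.concatenate([v for k, v in data.items() if k[1] == d]).mean()
                - grand) ** 2 for d in ('placebo', 'low', 'high'))

# Interaction by subtraction; residual from the pooled within-cell variances
ssaxb = ssm - ssa - ssb                                     # df = 5 - 1 - 2 = 2
ssr = sum(v.var(ddof=1) * (n - 1) for v in data.values())   # df = 6 * 7 = 42

# F-ratios: each effect's mean square over the residual mean square
f_a = (ssa / 1) / (ssr / 42)
f_b = (ssb / 2) / (ssr / 42)
f_axb = (ssaxb / 2) / (ssr / 42)
print(f_a, f_b, f_axb)
```

A useful sanity check on any such calculation is the partition identity SST = SSM + SSR, which holds exactly in a balanced design.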
Each F value has a straightforward associated effect size in terms of η² (partial η² = SSeffect / (SSeffect + SSR)). SPSS gives partial η² values for each F test if you request them in the options when setting up the model.

Return to our example
Assumptions met?

Main output table
The table has 8 rows and 5 columns (with numbers). The first two rows are largely unimportant for us. The columns give: Sum of Squares, degrees of freedom, Mean Sum of Squares, F, and p (Sig.). Rows 3 and 4 give those column values for the main effects of FaceType and Alcohol, respectively. Row 5 gives the column values for the interaction between FaceType and Alcohol. *IMPORTANT: a sixth column gives the effect sizes, but only if you request them.

Decision tree - our learning framework
[Decision-tree figure repeated - see above.]

Next week
One-way repeated-measures ANOVA
Factorial repeated-measures ANOVA
Assumptions to be met
Main effects and interactions
Mixed-model ANOVA
Essentially, the same approach applied to different experimental designs, in which the same participant is tested in all the experimental conditions.

Decision tree - our learning framework
[Decision-tree figure repeated - see above.]