# Chapter 14: Factorial ANOVA

## 14.1 Basic Logic

- **Factorial Design:** An experiment with more than one independent variable.
- **Factor:** Another name for independent variable.
- **Levels:** The different values of the independent variable.
- A factorial ANOVA allows us to test for more complex effects than a one-way ANOVA.

### 14.1.1 Example: 2x2 Factorial Design

Let's say we conduct an experiment that looks at the effect of:

1. Caffeine (0mg vs 100mg)
2. Exam Difficulty (Easy vs Hard)

on exam performance.

- This is a $2 \times 2$ design, because there are 2 factors, each with 2 levels.
- The number of conditions can be obtained by multiplying the number of levels: $2 \times 2 = 4$ conditions.

### 14.1.2 Main Effects

- The effect of one independent variable on the dependent variable, ignoring the other independent variable(s).
- In our example, there are two main effects:
  - The main effect of caffeine on exam performance
  - The main effect of exam difficulty on exam performance

### 14.1.3 Interaction Effects

- The effect of one independent variable on the dependent variable, taking into account the other independent variable(s).
- An interaction occurs when the effect of one factor depends on the level of another factor.
- In our example:
  - Is the effect of caffeine on exam performance different depending on whether the exam is easy or hard?
  - Equivalently: Is the effect of exam difficulty on exam performance different depending on whether the participant ingested caffeine or not?

## 14.2 Assumptions of Factorial ANOVA

1. **Normality:** The scores for each condition should be sampled from a normal distribution.
2. **Homogeneity of variance:** The variance within each condition should be similar.
3. **Independence:** The scores should be independent.

## 14.3 How the ANOVA Works

### 14.3.1 Logic

1. Calculate the variance between the conditions.
2. Calculate the variance within each of the conditions.
3. Calculate an F-statistic for each of the effects: main effect of factor 1, main effect of factor 2, interaction effect.

### 14.3.2 Partitioning the Variance

- In factorial ANOVA, the total variance is partitioned into more sources than in one-way ANOVA.
- In a two-way ANOVA, we partition the variance into:
  - Variance explained by Factor A
  - Variance explained by Factor B
  - Variance explained by the interaction between Factor A and B
  - Error Variance

## 14.4 Computation

### 14.4.1 Sum of Squares

The formulas to calculate the Sum of Squares for each effect are:

- $SS_{between cells} = \sum_{i=1}^{n_{cells}} n_i (\bar{Y_i} - \bar{Y_{Grand}})^2$
- $SS_{A} = \sum_{i=1}^{n_{levels A}} n_i (\bar{Y_{A_i}} - \bar{Y_{Grand}})^2$
- $SS_{B} = \sum_{i=1}^{n_{levels B}} n_i (\bar{Y_{B_i}} - \bar{Y_{Grand}})^2$
- $SS_{AxB} = SS_{between cells} - SS_A - SS_B$
- $SS_{within cells} = \sum_{i=1}^{n_{cells}} (n_i - 1)S^2_i$
- $SS_{total} = \sum_{i=1}^{N} (Y_i - \bar{Y_{Grand}})^2$

Where:

- $i$ is the index of the current cell in the cell-based sums, of the current level in $SS_A$ and $SS_B$, and of the current observation in $SS_{total}$.
- $n_{cells}$ is the number of cells (i.e. conditions).
- $n_i$ is the number of observations in cell $i$ (in $SS_A$ and $SS_B$, the number of observations in level $i$ of that factor).
- $\bar{Y_i}$ is the mean of cell $i$.
- $S^2_i$ is the sample variance of cell $i$.
- $\bar{Y_{Grand}}$ is the grand mean.
- $n_{levels A}$ is the number of levels of factor A.
- $\bar{Y_{A_i}}$ is the mean of level $i$ of factor A.
- $n_{levels B}$ is the number of levels of factor B.
- $\bar{Y_{B_i}}$ is the mean of level $i$ of factor B.
- $N$ is the total number of observations.
- $Y_i$ is the value of observation $i$.

**Note:** $SS_{total} = SS_A + SS_B + SS_{AxB} + SS_{within cells}$

### 14.4.2 Degrees of Freedom

The formulas to calculate the Degrees of Freedom for each effect are:

- $df_A = levels_A - 1$
- $df_B = levels_B - 1$
- $df_{AxB} = df_A \times df_B$
- $df_{within cells} = N - n_{cells}$
- $df_{total} = N - 1$

Where:

- $levels_A$ is the number of levels for factor A.
- $levels_B$ is the number of levels for factor B.
- $N$ is the total number of observations.
- $n_{cells}$ is the number of cells (i.e. conditions).

**Note:** $df_{total} = df_A + df_B + df_{AxB} + df_{within cells}$

### 14.4.3 Mean Square

The formulas to calculate the Mean Square for each effect are:

- $MS_A = \frac{SS_A}{df_A}$
- $MS_B = \frac{SS_B}{df_B}$
- $MS_{AxB} = \frac{SS_{AxB}}{df_{AxB}}$
- $MS_{within cells} = \frac{SS_{within cells}}{df_{within cells}}$

### 14.4.4 F-Statistic

The formulas to calculate the F-Statistic for each effect are:

- $F_A = \frac{MS_A}{MS_{within cells}}$
- $F_B = \frac{MS_B}{MS_{within cells}}$
- $F_{AxB} = \frac{MS_{AxB}}{MS_{within cells}}$

## 14.5 Example

### 14.5.1 The Data

Let's continue with our example looking at the effect of:

1. Caffeine (0mg vs 100mg)
2. Exam Difficulty (Easy vs Hard)

on exam performance. Here factor A is caffeine and factor B is exam difficulty.

| Participant | Caffeine (mg) | Difficulty | Score |
| ----------- | ------------- | ---------- | ----- |
| 1           | 0             | Easy       | 50    |
| 2           | 0             | Easy       | 60    |
| 3           | 0             | Easy       | 70    |
| 4           | 0             | Easy       | 80    |
| 5           | 0             | Hard       | 30    |
| 6           | 0             | Hard       | 40    |
| 7           | 0             | Hard       | 50    |
| 8           | 0             | Hard       | 60    |
| 9           | 100           | Easy       | 70    |
| 10          | 100           | Easy       | 80    |
| 11          | 100           | Easy       | 90    |
| 12          | 100           | Easy       | 100   |
| 13          | 100           | Hard       | 50    |
| 14          | 100           | Hard       | 60    |
| 15          | 100           | Hard       | 70    |
| 16          | 100           | Hard       | 80    |

### 14.5.2 Sum of Squares

First, we calculate the means for each cell, together with the row means (caffeine) and column means (difficulty):

|           | Easy   | Hard   | **Means** |
| --------- | ------ | ------ | --------- |
| 0mg       | 65     | 45     | **55**    |
| 100mg     | 85     | 65     | **75**    |
| **Means** | **75** | **55** | **65**    |

And the grand mean:

- $\bar{Y_{Grand}} = 65$

We can now plug these values into the formulas:

- $SS_{between cells} = \sum_{i=1}^{n_{cells}} n_i (\bar{Y_i} - \bar{Y_{Grand}})^2$
  - $SS_{between cells} = 4(65-65)^2 + 4(45-65)^2 + 4(85-65)^2 + 4(65-65)^2$
  - $SS_{between cells} = 3200$
- $SS_{A} = \sum_{i=1}^{n_{levels A}} n_i (\bar{Y_{A_i}} - \bar{Y_{Grand}})^2$
  - $SS_A = 8(55-65)^2 + 8(75-65)^2$
  - $SS_A = 1600$
- $SS_{B} = \sum_{i=1}^{n_{levels B}} n_i (\bar{Y_{B_i}} - \bar{Y_{Grand}})^2$
  - $SS_B = 8(75-65)^2 + 8(55-65)^2$
  - $SS_B = 1600$
- $SS_{AxB} = SS_{between cells} - SS_A - SS_B$
  - $SS_{AxB} = 3200 - 1600 - 1600$
  - $SS_{AxB} = 0$
- $SS_{within cells} = \sum_{i=1}^{n_{cells}} (n_i - 1)S^2_i$

For the within-cells sum of squares we also need the variance of each cell:

|       | Easy   | Hard   |
| ----- | ------ | ------ |
| 0mg   | 166.67 | 166.67 |
| 100mg | 166.67 | 166.67 |

- $SS_{within cells} = (4-1)166.67 + (4-1)166.67 + (4-1)166.67 + (4-1)166.67$
- $SS_{within cells} = 2000$
- $SS_{total} = \sum_{i=1}^{N} (Y_i - \bar{Y_{Grand}})^2$
  - $SS_{total} = (50 - 65)^2 + (60 - 65)^2 + (70 - 65)^2 + (80 - 65)^2 + (30 - 65)^2 + (40 - 65)^2 + (50 - 65)^2 + (60 - 65)^2 + (70 - 65)^2 + (80 - 65)^2 + (90 - 65)^2 + (100 - 65)^2 + (50 - 65)^2 + (60 - 65)^2 + (70 - 65)^2 + (80 - 65)^2$
  - $SS_{total} = 5200$

As a check, we can verify that:

- $SS_{total} = SS_A + SS_B + SS_{AxB} + SS_{within cells}$
- $5200 = 1600 + 1600 + 0 + 2000$
- $5200 = 5200$

### 14.5.3 Degrees of Freedom

We can now calculate the degrees of freedom:

- $df_A = levels_A - 1$
  - $df_A = 2 - 1$
  - $df_A = 1$
- $df_B = levels_B - 1$
  - $df_B = 2 - 1$
  - $df_B = 1$
- $df_{AxB} = df_A \times df_B$
  - $df_{AxB} = 1 \times 1$
  - $df_{AxB} = 1$
- $df_{within cells} = N - n_{cells}$
  - $df_{within cells} = 16 - 4$
  - $df_{within cells} = 12$
- $df_{total} = N - 1$
  - $df_{total} = 16 - 1$
  - $df_{total} = 15$

As a check, we can verify that:

- $df_{total} = df_A + df_B + df_{AxB} + df_{within cells}$
- $15 = 1 + 1 + 1 + 12$
- $15 = 15$

### 14.5.4 Mean Square

We can now calculate the Mean Squares:

- $MS_A = \frac{SS_A}{df_A}$
  - $MS_A = \frac{1600}{1}$
  - $MS_A = 1600$
- $MS_B = \frac{SS_B}{df_B}$
  - $MS_B = \frac{1600}{1}$
  - $MS_B = 1600$
- $MS_{AxB} = \frac{SS_{AxB}}{df_{AxB}}$
  - $MS_{AxB} = \frac{0}{1}$
  - $MS_{AxB} = 0$
- $MS_{within cells} = \frac{SS_{within cells}}{df_{within cells}}$
  - $MS_{within cells} = \frac{2000}{12}$
  - $MS_{within cells} = 166.67$

### 14.5.5 F-Statistic

We can now calculate the F-Statistics:

- $F_A = \frac{MS_A}{MS_{within cells}}$
  - $F_A = \frac{1600}{166.67}$
  - $F_A = 9.6$
- $F_B = \frac{MS_B}{MS_{within cells}}$
  - $F_B = \frac{1600}{166.67}$
  - $F_B = 9.6$
- $F_{AxB} = \frac{MS_{AxB}}{MS_{within cells}}$
  - $F_{AxB} = \frac{0}{166.67}$
  - $F_{AxB} = 0$

### 14.5.6 ANOVA Summary Table

We can summarize these results in an ANOVA summary table:

| Source     | SS   | df | MS     | F   |
| ---------- | ---- | -- | ------ | --- |
| Caffeine   | 1600 | 1  | 1600   | 9.6 |
| Difficulty | 1600 | 1  | 1600   | 9.6 |
| CxD        | 0    | 1  | 0      | 0   |
| Within     | 2000 | 12 | 166.67 |     |
| Total      | 5200 | 15 |        |     |

### 14.5.7 Interpretation

- There was a significant main effect of caffeine on exam performance, F(1,12) = 9.6, p < .01.
- There was a significant main effect of difficulty on exam performance, F(1,12) = 9.6, p < .01.
- There was no significant interaction between caffeine and difficulty on exam performance, F(1,12) = 0, p > .05. In these data caffeine raised the mean score by 20 points on both the easy and the hard exam.

## 14.6 Post-Hoc Tests

### 14.6.1 Main Effects

- If there are more than two levels for a significant main effect, you can perform post-hoc tests to determine which levels are significantly different from each other.
- The logic is the same as for the one-way ANOVA (Chapter 13).

### 14.6.2 Interaction Effects

- To follow up on a significant interaction, you can perform **simple main effects** analyses.
- A simple main effect is the effect of one factor at a single level of the other factor.
- The interaction in our example was not significant, so these follow-ups would not normally be run; the example is used here only to illustrate what they would look like.
- In our example, we could look at the simple main effect of caffeine at each level of difficulty.
  - The effect of caffeine on exam performance when the exam is easy.
  - The effect of caffeine on exam performance when the exam is hard.
- Equivalently, we could look at the simple main effect of difficulty at each level of caffeine.
  - The effect of exam difficulty on exam performance when the participant ingested 0mg of caffeine.
  - The effect of exam difficulty on exam performance when the participant ingested 100mg of caffeine.
- You can use a Bonferroni correction to correct for multiple comparisons.
  - Divide the alpha level by the number of simple main effects you are testing.
  - In our example, if we were testing the simple main effect of caffeine at each level of difficulty, we would divide the alpha level by 2 (e.g. .05 / 2 = .025).

## 14.7 Reporting the Results

Figure: a line graph with exam scores on the y-axis and exam difficulty on the x-axis, with different colored lines for 0mg and 100mg of caffeine. Plotting the cell means from Section 14.5.2 this way gives two parallel lines, with the 100mg line 20 points above the 0mg line for both the easy and the hard exam. Parallel lines are what the absence of an interaction looks like: the effect of caffeine did not depend on the difficulty of the exam.

### Example Write-Up

Here's an example of how you might report the results of this analysis:

"A $2 \times 2$ factorial ANOVA was conducted to examine the effects of caffeine (0mg vs 100mg) and exam difficulty (easy vs hard) on exam performance. The results indicated a significant main effect of caffeine, $F(1, 12) = 9.6, p < .01$, such that participants who ingested 100mg of caffeine performed better on the exam than those who ingested 0mg of caffeine. There was also a significant main effect of exam difficulty, $F(1, 12) = 9.6, p < .01$, such that scores were higher on the easy exam than on the hard exam. The interaction between these two independent variables was not significant, $F(1, 12) = 0, p > .05$, so no follow-up simple main effects tests were conducted. These results suggest that caffeine improved exam performance by a similar amount regardless of the difficulty of the exam."
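
The hand calculation in Section 14.5 can also be checked programmatically. The following is a minimal Python sketch, not part of the original chapter, that applies the formulas from Section 14.4 to the data table in Section 14.5.1 using only the standard library. The names `two_way_anova`, `group_scores`, and `ss_between` are illustrative choices. Factor A is assumed to be caffeine and factor B exam difficulty, and the subtraction formula for the interaction is used as given in 14.4.1, which assumes equal cell sizes as in this example.

```python
from collections import defaultdict
from statistics import mean

# (caffeine_mg, difficulty, score) rows copied from the table in 14.5.1
data = [
    (0, "Easy", 50), (0, "Easy", 60), (0, "Easy", 70), (0, "Easy", 80),
    (0, "Hard", 30), (0, "Hard", 40), (0, "Hard", 50), (0, "Hard", 60),
    (100, "Easy", 70), (100, "Easy", 80), (100, "Easy", 90), (100, "Easy", 100),
    (100, "Hard", 50), (100, "Hard", 60), (100, "Hard", 70), (100, "Hard", 80),
]


def group_scores(rows, key):
    """Group the scores by key(row) and return {group: [scores]}."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[2])
    return groups


def ss_between(groups, grand_mean):
    """Sum over groups of n_i * (group mean - grand mean)^2."""
    return sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())


def two_way_anova(rows):
    """Return {source: (SS, df, MS, F)} for a balanced two-way design."""
    scores = [r[2] for r in rows]
    grand = mean(scores)
    cells = group_scores(rows, lambda r: (r[0], r[1]))
    levels_a = group_scores(rows, lambda r: r[0])  # factor A: caffeine
    levels_b = group_scores(rows, lambda r: r[1])  # factor B: difficulty

    # Sums of squares (Section 14.4.1)
    ss_cells = ss_between(cells, grand)
    ss_a = ss_between(levels_a, grand)
    ss_b = ss_between(levels_b, grand)
    ss_ab = ss_cells - ss_a - ss_b
    ss_within = sum(sum((y - mean(g)) ** 2 for y in g) for g in cells.values())
    ss_total = sum((y - grand) ** 2 for y in scores)

    # Degrees of freedom (Section 14.4.2)
    df_a = len(levels_a) - 1
    df_b = len(levels_b) - 1
    df_ab = df_a * df_b
    df_within = len(scores) - len(cells)

    # Mean squares and F-statistics (Sections 14.4.3 and 14.4.4)
    ms_within = ss_within / df_within
    table = {}
    for name, ss, df in [("A", ss_a, df_a), ("B", ss_b, df_b), ("AxB", ss_ab, df_ab)]:
        ms = ss / df
        table[name] = (ss, df, ms, ms / ms_within)
    table["Within"] = (ss_within, df_within, ms_within, None)
    table["Total"] = (ss_total, len(scores) - 1, None, None)
    return table


if __name__ == "__main__":
    for source, (ss, df, ms, f) in two_way_anova(data).items():
        ms_txt = "" if ms is None else f"{ms:.2f}"
        f_txt = "" if f is None else f"{f:.2f}"
        print(f"{source:<7}{ss:>9.2f}{df:>4}{ms_txt:>10}{f_txt:>8}")
```

Running the script prints one row per source in the same layout as the ANOVA summary table, which makes it easy to compare against the hand calculation. It does not compute p-values; those require the F distribution, for example from a statistics library.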