Biostatistics I (EVSC 500) Fall 2024 ANOVA Test PDF
Tuskegee University
2024
Dr. Tej Gautam
Summary
This document is a lecture on the analysis of variance (ANOVA), specifically focusing on one-way ANOVA for comparing means across groups. The lecture explores the definition, assumptions, and methodology for applying ANOVA, as well as offering relevant examples. Notes include formulas and a summary table.
Biostatistics I (EVSC 500), Fall 2024
ANOVA Test
Instructor: Dr. Tej Gautam
Email: [email protected]
***Strictly prohibited to share with others or upload online without the permission of the instructor.***

Outline
In this session, we will focus on:
- One-way ANOVA test (definition and brief description)
- Assumptions
- Method of testing
- Relevant examples

ANOVA Test
The statistical method for comparing the means of several groups is called the analysis of variance (ANOVA). Because the ANOVA test rests on several statistical assumptions, we need to make sure that our data satisfy them before using this method.

One-way ANOVA Test: What is ANOVA?
The one-way analysis of variance (ANOVA) is used to determine whether the mean of a dependent variable is the same in two or more unrelated, independent groups. However, it is typically only used when you have three or more independent, unrelated groups.

One-way ANOVA Test: Assumptions
1. The dependent variable should be measured at the continuous level.
2. The independent variable should consist of two or more categorical, independent (unrelated) groups.
3. There should be independence of observations, which means that there is no relationship between the observations in each group or between the groups themselves.
4. There should be no significant outliers.
5. The dependent variable should be approximately normally distributed for each category of the independent variable.
6. There needs to be homogeneity of variances.
Assumptions 1, 2, and 3 relate to the study design and the choice of variables; they cannot be tested for directly. Assumptions 4, 5, and 6 are tested using statistical packages.

Elements of a Designed Experiment
Response Variable: The response variable is the variable of interest to be measured in the experiment. We also refer to the response as the dependent variable. Typically, the response/dependent variable is quantitative in nature.
Factors: Factors are those variables whose effect on the response is of interest to the experimenter. Quantitative factors are measured on a numerical scale, whereas qualitative factors are those that are not (naturally) measured on a numerical scale. Factors are also referred to as independent variables.
Factor Levels and Treatments: Factor levels are the values of the factor used in the experiment. The treatments of an experiment are the factor-level combinations used.
Experimental Unit: An experimental unit is the object on which the response and factors are observed or measured.

Analysis of Variance (Formula)
We can use these new terms to write an F-statistic to test the null hypothesis that the means are equivalent across the two or more groups against the alternative that at least two group means differ.

ANOVA Summary Table

Source of Variation   Degrees of Freedom   Sum of Squares          Mean Square (Variance)   F
Treatment             k – 1                SST                     MST = SST/(k – 1)        MST/MSE
Error                 n – k                SSE                     MSE = SSE/(n – k)
Total                 n – 1                SS(Total) = SST + SSE

1. Test statistic: F = MST / MSE, where MST is the Mean Square for Treatment and MSE is the Mean Square for Error.
2. Degrees of freedom:
v1 = k – 1 (numerator degrees of freedom)
v2 = n – k (denominator degrees of freedom)
k = number of groups
n = total sample size

ANOVA F-Test Critical Value
If the means are equal, F = MST / MSE ≈ 1. Reject H0 only for large F.
[Figure: F distribution with the "Do Not Reject H0" region to the left of the critical value F(α; k – 1, n – k) and the "Reject H0" region in the upper tail. Always a one-sided tail.]

ANOVA F-Test to Compare k Treatment Means: Completely Randomized Design
H0: µ1 = µ2 = … = µk
Ha: At least two treatment means differ
Test statistic: F = MST / MSE
Rejection region: F > Fα
p-value: P(F > Fc), where Fc is the computed value of the test statistic
Here Fα = F(α; k – 1, n – k) is based on (k – 1) numerator degrees of freedom (associated with MST) and (n – k) denominator degrees of freedom (associated with MSE).

Steps for Conducting an ANOVA for a Completely Randomized Design
Note: Be careful not to automatically conclude that the treatment means are equal, because the possibility of a Type II error must be considered if you accept H0.
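The ANOVA summary table uses SST and SSE without spelling them out. For reference, the standard one-way ANOVA definitions can be sketched as follows. The notation is an assumption introduced here and does not come from the slides: x_{ij} is observation j in group i, n_i is the size of group i, \bar{x}_i is the mean of group i, and \bar{x} is the grand mean of all n observations.

```latex
\begin{aligned}
SST &= \sum_{i=1}^{k} n_i \left(\bar{x}_i - \bar{x}\right)^2
  && \text{(variation between groups)} \\
SSE &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(x_{ij} - \bar{x}_i\right)^2
  && \text{(variation within groups)} \\
SS(\text{Total}) &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(x_{ij} - \bar{x}\right)^2 = SST + SSE \\
MST &= \frac{SST}{k - 1}, \qquad MSE = \frac{SSE}{n - k}, \qquad F = \frac{MST}{MSE}
\end{aligned}
```

When the group means are equal, both MST and MSE estimate the same underlying variance, which is why F is expected to be close to 1 under H0.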
Example: Production F-Test
As production manager, you want to see if three filling machines have different mean filling times. You assign 15 similarly trained and experienced workers, 5 per machine, to the machines. At the 0.05 level of significance, is there a difference in mean filling times?

Mach1   Mach2   Mach3
25.40   23.40   20.00
26.31   21.80   22.20
24.10   23.50   19.75
23.74   22.75   20.60
25.10   21.60   20.40

H0: µ1 = µ2 = µ3
Ha: Not all equal
α = 0.05

Recall that k = number of groups and n = total sample size, so k = 3 and n = 15 for this example. Then
v1 = k – 1 = 2 (numerator degrees of freedom)
v2 = n – k = 12 (denominator degrees of freedom)

Critical value: F(0.05; 2, 12) = 3.89
[Figure: F distribution with the rejection region of area α = 0.05 to the right of 3.89.]

Example: Production F-Test (cont'd)

Source of Variation    Degrees of Freedom   Sum of Squares   Mean Square (Variance)   F
Treatment (Machines)   3 – 1 = 2            47.1640          23.5820                  23.5820/0.9211 = 25.6
Error                  15 – 3 = 12          11.0532          0.9211
Total                  15 – 1 = 14          58.2172

Critical value: 3.89
Decision: Reject H0 at α = 0.05, because F = 25.6 exceeds 3.89.
Conclusion: There is evidence that the population means are different.

Example: Rice Yield

Example: Rice Yield, Solution (manual)

Example: Rice Yield, Solution (using Excel or other software)
Here, the critical F value with 3 numerator and 12 denominator degrees of freedom at the 0.01 significance level is F(0.01; 3, 12) = 5.95. We reject the null hypothesis because the F value is greater than the critical value. You can also make the decision based on the p-value. Here, the p-value is 0.0054, which is below 0.01, so we reject the null hypothesis at the 1% significance level. (You get the p-value from Excel or other statistical packages.)

Example: Housing Price
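Returning to the filling-machine example, the summary table there can be reproduced with a few lines of plain Python. This is a minimal sketch rather than the instructor's method: the function name `one_way_anova` and the dictionary keys are made up for illustration, and the critical value 3.89 is simply copied from the slides instead of being computed.

```python
# Filling times from the production example (5 workers per machine).
machines = {
    "Mach1": [25.40, 26.31, 24.10, 23.74, 25.10],
    "Mach2": [23.40, 21.80, 23.50, 22.75, 21.60],
    "Mach3": [20.00, 22.20, 19.75, 20.60, 20.40],
}


def one_way_anova(groups):
    """Return the one-way ANOVA summary quantities for a list of samples.

    Hypothetical helper written for this example; not part of any library.
    """
    k = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total sample size
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Between-group (treatment) and within-group (error) sums of squares.
    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    mst = sst / (k - 1)
    mse = sse / (n - k)
    return {
        "df_treatment": k - 1,
        "df_error": n - k,
        "SST": sst,
        "SSE": sse,
        "MST": mst,
        "MSE": mse,
        "F": mst / mse,
    }


result = one_way_anova(list(machines.values()))

F_CRITICAL = 3.89  # F(0.05; 2, 12), taken from the slides, not computed here
reject_h0 = result["F"] > F_CRITICAL

for name, value in result.items():
    print(f"{name}: {value:.4f}")
print("Reject H0 at alpha = 0.05:", reject_h0)
```

The computed sums of squares and the F statistic (about 25.6) agree with the summary table for this example. If SciPy is installed, `scipy.stats.f_oneway(*machines.values())` should return the same F statistic together with a p-value, which avoids looking up the critical value by hand.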