Chi-Square Test for Goodness of Fit
Monash University
Summary
This document explains the chi-square test for goodness of fit in statistical inference, focusing on categorical variables. It provides an example analyzing birth distributions across days of the week. The analysis demonstrates statistical methods used for hypothesis testing.
Full Transcript
Statistical Inference – Categorical Variable
Chi-Square Test for Goodness of Fit

Test for a specific distribution, Step 1 (Goodness of Fit Test)

Previously, we tested whether there is a relationship (dependence) between two categorical variables. ⇒ Chi-square test for independence
Now, we will test whether a categorical variable has a specific distribution. ⇒ Chi-square test for goodness of fit

The four steps in carrying out a significance test (chi-square test for goodness of fit):
1. State the null and alternative hypotheses.
2. Check conditions and then calculate the test statistic.
3. Find the P-value using the appropriate distribution.
4. Make a decision and state the conclusion in the context of the specific setting of the test.

Example - Birth

People say that births are not evenly distributed across the days of the week. Fewer babies are born on Saturday and Sunday than on other days, probably because doctors find weekend births inconvenient. A random sample of 700 births from local records shows this distribution across the days of the week:

Day      Sun.   Mon.   Tues.   Wed.   Thurs.   Fri.   Sat.
Births   84     110    124     104    94       112    72

Sure enough, the two smallest counts of births are on Saturday and Sunday. Do these data give statistically significant evidence that local births are not equally likely on all days of the week?

Step 2a, 2b (Chi-square Test for Goodness of Fit)

Day        Sun.   Mon.   Tues.   Wed.   Thurs.   Fri.   Sat.
Births     84     110    124     104    94       112    72
Expected   100    100    100     100    100      100    100

Under the null hypothesis that births are equally likely on all seven days, the expected count for each day is 700 × 1/7 = 100. The test statistic is χ² = Σ (observed − expected)² / expected, summed over the seven days.

Step 3 (Chi-square Test for Goodness of Fit)

In the Birth Example: k = 7, df = k − 1 = 7 − 1 = 6. The value of the chi-square statistic is 19.12. Use Table D to find the P-value: it is between 0.0025 and 0.005.

Step 4 (Chi-square Test for Goodness of Fit)

Step 4: Decision and Conclusion
Whether or not the result is statistically significant depends on the P-value and the chosen level of significance α:
If P-value ≤ α → reject the null hypothesis H0 and conclude that there is significant evidence for the alternative hypothesis that the categorical variable has a distribution different from the one stated in the null.
If P-value > α → cannot reject the null hypothesis; conclude that there is not enough evidence to support the alternative hypothesis.

In the Birth Example: Use a level of significance of 5%.
Decision: The P-value is between 0.0025 and 0.005, i.e., P-value < 0.05, so reject H0.
Conclusion: The sample gives statistically significant evidence at the 5% level that local births are not equally likely on all days of the week.

Comparison: Goodness of Fit Test and Two-Tailed z-Test for Two Categories (Part 1)

The two methods, a two-tailed test for a proportion based on the normal distribution and a chi-square test for two categories, are equivalent.

Example: A student collected a sample by spinning a coin on edge to see if it would land Heads or Tails. Her data showed 168 heads and 232 tails in 400 spins. Test at α = 0.05 to see if the sample provides convincing evidence that spinning a coin contradicts a 50–50 distribution of heads and tails. Do this test by:
1. Using the z-test for a proportion; and
2. Using the chi-square goodness-of-fit test

Step 3. Using a standard normal distribution, P-value = 2 × 0.0007 = 0.0014.
Step 4. P-value < 0.05, reject H0. This sample contradicts a 50–50 distribution of heads and tails.

Comparison: Goodness of Fit Test and Two-Tailed z-Test for Two Categories (Part 2)

Step 4. P-value < 0.05, reject H0. The sample is significant evidence that the distribution of heads and tails is not 50–50.

The value of the χ² test statistic for the goodness-of-fit test (10.24) is just the square of the z-statistic (−3.2). That’s one reason for having “square” in the name Chi-square.
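The two worked examples above can be checked numerically. The following is a minimal sketch, not the method used in the slides (which read the P-value from Table D): it uses only Python's standard library, and the function names `gof_statistic` and `chi2_upper_tail` are hypothetical helpers introduced here for illustration. The upper-tail formula is implemented only for df = 1 and for even df, which happens to cover both examples.

```python
import math

def gof_statistic(observed, null_props):
    """Return (chi-square statistic, degrees of freedom, expected counts)."""
    n = sum(observed)
    expected = [n * p for p in null_props]  # expected count = n * null proportion
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1, expected

def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom > x).

    Sketch only: closed forms for df == 1 and for even df.
    """
    if df == 1:
        # A chi-square(1) variable is a squared standard normal.
        return math.erfc(math.sqrt(x / 2))
    if df % 2 == 0:
        # exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        half = x / 2
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    raise ValueError("this sketch handles only df == 1 or even df")

# Birth example: H0 says each day has probability 1/7.
births = [84, 110, 124, 104, 94, 112, 72]
birth_stat, birth_df, birth_expected = gof_statistic(births, [1 / 7] * 7)
birth_p = chi2_upper_tail(birth_stat, birth_df)
print(f"births: chi2 = {birth_stat:.2f}, df = {birth_df}, p = {birth_p:.4f}")

# Coin example, method 1: two-tailed z-test for a proportion (H0: p = 0.5).
heads, tails = 168, 232
n = heads + tails
z = (heads / n - 0.5) / math.sqrt(0.5 * 0.5 / n)
z_p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * P(Z > |z|)

# Coin example, method 2: chi-square goodness of fit with two categories.
coin_stat, coin_df, _ = gof_statistic([heads, tails], [0.5, 0.5])
coin_p = chi2_upper_tail(coin_stat, coin_df)
print(f"coin: z = {z:.2f}, z^2 = {z * z:.2f}, chi2 = {coin_stat:.2f}")
print(f"coin: z-test p = {z_p:.4f}, chi-square p = {coin_p:.4f}")
```

The birth P-value computed this way falls inside the Table D range of 0.0025 to 0.005, and for the coin data the two P-values coincide because the chi-square statistic is the square of z.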