Statistical Inference - Comparing Two Means PDF
Monash University
Summary
This document explains statistical inference methods for comparing two means using matched pairs t-procedures. It provides notation, examples, and conditions for the tests. The document is designed for an undergraduate statistics course and covers topics like confidence intervals and hypothesis tests.
Full Transcript
Statistical Inference – Comparing Two Means
Matched Pairs t Procedures

Matched pairs t procedures

So far, we have studied statistical inference about a mean using a single sample. However, one-sample inference is less common than comparative inference.

Comparing Two Population Means

Matched pairs studies involve making two observations on the same individual, or one observation on each of two similar individuals.

When paired data result from measuring the same variable twice, we can make comparisons by analyzing the differences in each pair. If the conditions for inference are met, we can use one-sample t procedures to perform inference about the mean difference.

Paired data can also come from measuring two different variables on each individual (for example, when we are interested in the difference between two variables such as height and weight), or from measuring the same variable on similar individuals (matched pairs).

Paired Data - Notation

Example: does cooking change the Vitamin C content of tomatoes?

Pair   Before   After   d = Before - After
1      32       30      d1 = 2
2      29       30      d2 = -1
3      34       33      d3 = 1
4      28       26      d4 = 2
5      31       33      d5 = -2
6      30       26      d6 = 4
7      28       29      d7 = -1
8      31       29      d8 = 2

Confidence Interval for Paired Data

Conditions: Except in the case of small samples, the assumption that the data are a random sample from the population of interest is more important than the assumption that the population distribution is Normal.

n < 15: Use t procedures if the data appear close to Normal (symmetric, single peak, no outliers). If the data are clearly skewed or if outliers are present, do not use t.
n ≥ 15: The t procedures can be used except in the presence of outliers or strong skewness in the data.
Large samples (n ≥ 40): The t procedures can be used even for clearly skewed distributions.

Example: Air pollution (CI for Paired Data)

Pollution index measurements were recorded for two areas of a city on each of 8 random days. Are the average pollution levels the same for the two areas of the city?

The data are paired by day. Calculate the difference d = A - B for each day.

Area A   Area B   d = A - B
2.92     1.84     1.08
1.88     0.95     0.93
5.35     4.26     1.09
3.81     3.18     0.63
4.69     3.44     1.25
4.86     3.69     1.17
5.81     4.95     0.86
5.55     4.47     1.08

Conditions to check:
1. Is it reasonable to regard these 8 measurement pairs as a random sample from the population of all paired measurements?
2. Use a dot plot or a histogram of the differences to check Normality.

[Dot plot of the differences d; axis ticks at 0.6, 0.9, 1.2, 1.5]

Test of Significance for Paired Data (Paired t-test)

Step 1: Determine null and alternative hypotheses.
H0: µd = 0 vs Ha: µd ≠ 0 ("are different")
or Ha: µd < 0 (µ1 < µ2 so µ1 - µ2 < 0)
or Ha: µd > 0 (µ1 > µ2 so µ1 - µ2 > 0)

Step 2a: Verify data conditions. Remember, the conditions apply to the differences. They are the same as for the t-test for a single mean. Except in the case of small samples, the assumption that the data are a random sample from the population of interest is more important than the assumption that the population distribution is Normal.

n < 15: Use t procedures if the data appear close to Normal (symmetric, single peak, no outliers). If the data are clearly skewed or if outliers are present, do not use t.
n ≥ 15: The t procedures can be used except in the presence of outliers or strong skewness in the data.
Large samples (n ≥ 40): The t procedures can be used even for clearly skewed distributions.

Step 2b: Compute the test statistic. The t-test statistic is:
t = (d̄ - 0) / (s_d / √n)
where d̄ is the sample mean of the differences, s_d is their sample standard deviation, and n is the number of differences.

Step 3: Find P-value. Use the t distribution with df = n - 1, where n is the number of differences.

Step 4: Decision and Conclusion

Example: Air pollution (Paired t-test)

The data are the same 8 paired measurements for Area A and Area B, with differences d = A - B, as in the confidence interval example above, shown with the same dot plot of the differences.
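The paired t calculations for the air pollution data can be sketched in a few lines of Python. This is a minimal illustration, not the course's own worked solution. The variable names are made up for the example, and the 95% confidence level is an assumption, since the transcript does not state one. The critical value t* = 2.365 (df = 7) is read from a standard t table. The interval uses the usual one-sample form applied to the differences: mean difference ± t* × s_d / √n.

```python
import math
import statistics

# Pollution index on 8 random days (values from the air pollution example).
area_a = [2.92, 1.88, 5.35, 3.81, 4.69, 4.86, 5.81, 5.55]
area_b = [1.84, 0.95, 4.26, 3.18, 3.44, 3.69, 4.95, 4.47]

# Paired differences d = A - B, one per day.
d = [a - b for a, b in zip(area_a, area_b)]
n = len(d)

mean_d = statistics.mean(d)   # sample mean of the differences
sd_d = statistics.stdev(d)    # sample standard deviation (n - 1 divisor)
se_d = sd_d / math.sqrt(n)    # standard error of the mean difference

# Test statistic for H0: mu_d = 0, referred to a t distribution with n - 1 df.
t_stat = (mean_d - 0) / se_d

# Assumption: 95% confidence level; t* for df = 7 taken from a standard t table.
T_STAR_95_DF7 = 2.365
margin = T_STAR_95_DF7 * se_d
ci_low, ci_high = mean_d - margin, mean_d + margin

print(f"mean difference = {mean_d:.3f}, s_d = {sd_d:.3f}, n = {n}")
print(f"t = {t_stat:.2f} with df = {n - 1}")
print(f"95% CI for mu_d: ({ci_low:.3f}, {ci_high:.3f})")
```

For these data the t statistic comes out well above the table value 2.365, and the interval lies entirely above 0. Both point to a higher average pollution index in Area A, provided the conditions on the differences are judged reasonable.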