Lecture 15: Statistical Methods
Document Details
Anfal Alqefari
Summary
This lecture introduces methods for estimating population means using confidence intervals. It covers cases where the population standard deviation is known or unknown, and applies these methods to sample data. Examples are provided with solutions demonstrating the calculations.
Full Transcript
Estimating the Mean
Anfal Alqefari

1 Single Sample: Estimating the Mean ($\mu$)

1.1 Case I: Confidence Interval for $\mu$ from Normal Population and Standard Deviation Known

When estimating the population mean $\mu$ from a sample drawn from a normal population with a known population standard deviation $\sigma$, the confidence interval for the population mean can be calculated using the following formula:

$$\mathrm{CI} = \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

Where:
$\bar{x}$ is the sample mean.
$z_{\alpha/2}$ is the critical value from the standard normal distribution, leaving an area of $\alpha/2$ in each tail.
$\sigma$ is the known population standard deviation.
$n$ is the sample size.

1.2 Case II: Confidence Interval for $\mu$ from Normal Population and Standard Deviation $\sigma$ Unknown

Definition 1. If $\bar{x}$ and $s$ are the mean and standard deviation of a random sample of size $n$ from a normal population with unknown variance $\sigma^2$, a $100(1-\alpha)\%$ confidence interval for $\mu$ is:

$$\bar{x} - t_{\alpha/2}(\nu) \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}(\nu) \frac{s}{\sqrt{n}},$$

where:
$\bar{x}$ is the sample mean,
$s$ is the sample standard deviation,
$t_{\alpha/2}(\nu)$ is the critical value from the t-distribution with $\nu = n - 1$ degrees of freedom, leaving an area of $\alpha/2$ in each tail,
$\alpha$ is the significance level.

Example 1. The weights of seven similar bags of rice are 4.8, 5.2, 5.4, 4.8, 5.0, 5.2, and 4.6 kilograms. Find a 95% confidence interval for the mean weight of all such bags, assuming an approximately normal distribution.

Solution:

1. Calculate the Sample Mean

$$\bar{x} = \frac{4.8 + 5.2 + 5.4 + 4.8 + 5.0 + 5.2 + 4.6}{7} = \frac{35}{7} = 5.$$

2. Calculate the Sample Standard Deviation

$$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2} = \sqrt{\frac{0.04 + 0.04 + 0.16 + 0.04 + 0 + 0.04 + 0.16}{6}} = \sqrt{\frac{0.48}{6}} \approx 0.283$$

3. Find the Critical Value and Construct the Interval
For a 95% confidence interval, and with $\nu = n - 1 = 6$ degrees of freedom, the critical value $t_{\alpha/2}$ is found using the t-distribution table as

$$t_{\alpha/2}(\nu) = t_{0.05/2}(6) = t_{0.025}(6) = 2.447$$

Figure 1: $t_{\alpha/2}(\nu) = t_{0.05/2}(6) = t_{0.025}(6) = 2.447$

Then, the 95% confidence interval is given as

$$5 - 2.447 \frac{0.283}{\sqrt{7}} < \mu < 5 + 2.447 \frac{0.283}{\sqrt{7}}$$
$$5 - 0.262 < \mu < 5 + 0.262$$
$$4.738 < \mu < 5.262$$

1.3 Case III: Confidence Interval for $\mu$ from Non-Normal Population and Large Sample Size $n$

Definition 2. When normality cannot be assumed but the sample is large (in practice, $n \ge 30$), the confidence interval for $\mu$ can be written approximately as:

1. If the standard deviation $\sigma$ is known, then:

$$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}},$$

2. If the standard deviation $\sigma$ is unknown, we can replace $\sigma$ by $s$, then:

$$\bar{x} - z_{\alpha/2} \frac{s}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{s}{\sqrt{n}},$$

This is often referred to as a large-sample confidence interval.

Example 2. A researcher is studying the heights of a random sample of 40 students. The sample mean height is 170 cm, and the sample standard deviation $s$ is 8 cm. Construct a 99% confidence interval for the population mean height.

Solution: Given:
sample mean height $\bar{x} = 170$
sample standard deviation $s = 8$
sample size $n = 40$

Since $n = 40 \ge 30$ and the standard deviation $\sigma$ is unknown, the confidence interval for $\mu$ is given by

$$\bar{x} - z_{\alpha/2} \frac{s}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{s}{\sqrt{n}},$$

For a 99% confidence level, the z-value can be found from the Z-table as

$$z_{\alpha/2} = z_{0.01/2} = z_{0.005} = \frac{2.57 + 2.58}{2} = 2.575$$

Then, the 99% confidence interval for the population mean height is

$$170 - 2.575 \frac{8}{\sqrt{40}} < \mu < 170 + 2.575 \frac{8}{\sqrt{40}}$$
$$170 - 3.257 < \mu < 170 + 3.257$$
$$166.743 < \mu < 173.257$$

Example 3. A random sample of 50 elements is taken, and the sample mean is found to be 1200 units. The population standard deviation $\sigma$ is known to be 100 units. Construct a 90% confidence interval for the population mean $\mu$.
Solution: Given:
sample mean $\bar{x} = 1200$
standard deviation $\sigma = 100$
sample size $n = 50$

Since $n = 50 \ge 30$ and the standard deviation $\sigma$ is known, the confidence interval for $\mu$ is given by

$$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}},$$

For a 90% confidence level, the z-value can be found from the Z-table as

$$z_{\alpha/2} = z_{0.1/2} = z_{0.05} = \frac{1.64 + 1.65}{2} = 1.645.$$

Figure 2: $z_{0.05} = \frac{1.64 + 1.65}{2} = 1.645$

Then, the 90% confidence interval for the population mean is

$$1200 - 1.645 \frac{100}{\sqrt{50}} < \mu < 1200 + 1.645 \frac{100}{\sqrt{50}}$$
$$1200 - 1.645\,(14.14) < \mu < 1200 + 1.645\,(14.14)$$
$$1200 - 23.26 < \mu < 1200 + 23.26$$
$$1176.74 < \mu < 1223.26$$

Definition 3. Standard Error. The standard deviation of $\bar{X}$ is also known as the standard error of $\bar{X}$. There are two cases:

1. If the standard deviation $\sigma$ is known:

$$\text{Standard Error} = \frac{\sigma}{\sqrt{n}}$$

2. If the standard deviation $\sigma$ is unknown, it is estimated by:

$$\text{Standard Error} = \frac{s}{\sqrt{n}}$$

2 Two Samples: Estimating the Difference between Two Means

2.1 Case I: Confidence Interval for the Difference Between Two Population Means $\mu_1 - \mu_2$ from Normal Populations and $\sigma_1^2$ and $\sigma_2^2$ Known

Definition 4. If $\bar{x}_1$ and $\bar{x}_2$ are means of independent random samples of sizes $n_1$ and $n_2$ from normal populations with known variances $\sigma_1^2$ and $\sigma_2^2$, respectively, a $100(1-\alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is given by:

$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

where $z_{\alpha/2}$ is the critical value from the standard normal distribution, leaving an area of $\alpha/2$ in each tail.

Example 4. A study was done to compare two teaching methods, A and B. The final exam scores of students using both methods were measured. 50 students were taught using method A, and 75 students were taught using method B. The teaching conditions and other factors were kept the same for both groups. On average, students taught using method A scored 70, while students taught using method B scored 78.
Find a 96% confidence interval for $\mu_B - \mu_A$, where $\mu_A$ and $\mu_B$ are the average exam scores for all students taught by methods A and B, respectively. Assume that the population standard deviations are 10 and 12 for methods A and B, respectively.

Solution: We are given the following data for the two teaching methods:

Method   $n$    $\bar{x}$   $\sigma$
A        50     70          10
B        75     78          12

We need to calculate a 96% confidence interval for $\mu_B - \mu_A$. The formula for the confidence interval when the population variances are known is:

$$(\bar{x}_B - \bar{x}_A) - z_{\alpha/2} \sqrt{\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}} < \mu_B - \mu_A < (\bar{x}_B - \bar{x}_A) + z_{\alpha/2} \sqrt{\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}}$$

For a 96% confidence level, the z-value can be found from the Z-table as

$$z_{\alpha/2} = z_{0.04/2} = z_{0.02} = \frac{2.05 + 2.06}{2} = 2.055$$

Figure 3: $z_{0.04/2} = z_{0.02} = \frac{2.05 + 2.06}{2} = 2.055$

Then, the 96% confidence interval for the difference in population means $\mu_B - \mu_A$ is

$$(78 - 70) - 2.055 \sqrt{\frac{10^2}{50} + \frac{12^2}{75}} < \mu_B - \mu_A < (78 - 70) + 2.055 \sqrt{\frac{10^2}{50} + \frac{12^2}{75}}$$
$$8 - 2.055 \sqrt{3.92} < \mu_B - \mu_A < 8 + 2.055 \sqrt{3.92}$$
$$8 - 4.069 < \mu_B - \mu_A < 8 + 4.069$$
$$3.931 < \mu_B - \mu_A < 12.069$$

2.2 Case II: Confidence Interval for the Difference Between Two Population Means $\mu_1 - \mu_2$ from Normal Populations and $\sigma_1^2$ and $\sigma_2^2$ Unknown

Definition 5. If $\bar{x}_1$ and $\bar{x}_2$ are means of independent random samples of sizes $n_1$ and $n_2$, respectively, from approximately normal populations with unknown but equal variances, a $100(1-\alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is given by:

$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

where $s_p$ is the pooled estimate of the population standard deviation and $t_{\alpha/2}$ is the t-value with $\nu = n_1 + n_2 - 2$ degrees of freedom, leaving an area of $\alpha/2$ in each tail.

Definition 6. Pooled Variance. The pooled estimator of the population variance, denoted by $s_p^2$, is:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Corollary 1.
Pooled Standard Deviation. The pooled estimator of the population standard deviation, denoted by $s_p$, is:

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

Example 5. Two independent departments, Department 1 and Department 2, were chosen for a study on employee performance. For 12 monthly performance reviews collected at Department 1, the average performance score was $\bar{x}_1 = 85.3$ with a standard deviation of $s_1 = 4.5$, while 10 monthly reviews collected at Department 2 had an average performance score of $\bar{x}_2 = 78.2$ and a standard deviation of $s_2 = 3.2$. Find a 95% confidence interval for the difference between the population means for the two departments, assuming that the populations are approximately normally distributed with equal variances.

Solution: We are given the following information for Department 1 and Department 2:

Department     $n$    $\bar{x}$   $s$
Department 1   12     85.3        4.5
Department 2   10     78.2        3.2

We need to calculate a 95% confidence interval for the difference between the population means $\mu_1 - \mu_2$, assuming that the variances are equal. The formula for the confidence interval is given by:

$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

For a 95% confidence interval, and with $\nu = n_1 + n_2 - 2 = 20$ degrees of freedom, the critical value $t_{\alpha/2}$ is found using the t-distribution table as

$$t_{\alpha/2}(\nu) = t_{0.05/2}(20) = t_{0.025}(20) = 2.086.$$
Figure 4: $t_{0.025}(20) = 2.086$

We calculate the pooled variance $s_p^2$ using the formula:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(12 - 1)(4.5^2) + (10 - 1)(3.2^2)}{12 + 10 - 2} = \frac{222.75 + 92.16}{20} = \frac{314.91}{20} \approx 15.7455$$

Then, the pooled estimator for the population standard deviation is:

$$s_p = \sqrt{15.7455} \approx 3.97$$

Then, the 95% confidence interval for the difference between the population means $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
$$(85.3 - 78.2) - (2.086)(3.97) \sqrt{\frac{1}{12} + \frac{1}{10}} < \mu_1 - \mu_2 < (85.3 - 78.2) + (2.086)(3.97) \sqrt{\frac{1}{12} + \frac{1}{10}}$$
$$7.1 - 3.546 < \mu_1 - \mu_2 < 7.1 + 3.546$$
$$3.554 < \mu_1 - \mu_2 < 10.646$$
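The intervals in this lecture all have the form "estimate ± critical value × standard error", so the worked examples can be checked with a short script. The following is a minimal sketch in Python using only the standard library, not part of the lecture itself. The helper names `z_crit` and `interval` are hypothetical, chosen for illustration. Two assumptions to note: the z critical values are computed exactly with `statistics.NormalDist` rather than interpolated from a Z-table, so they differ from the table values in the third decimal place (for example about 2.576 instead of 2.575), and the t critical values 2.447 and 2.086 are simply copied from the t-table lookups above, since the standard library has no t-distribution.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def z_crit(conf):
    """Standard normal critical value z_{alpha/2} for a confidence level such as 0.95."""
    alpha = 1 - conf
    return NormalDist().inv_cdf(1 - alpha / 2)


def interval(estimate, crit, std_err):
    """Return (lower, upper) = estimate -/+ crit * std_err."""
    margin = crit * std_err
    return estimate - margin, estimate + margin


# Example 1: normal population, sigma unknown, n = 7 (t interval, nu = 6).
weights = [4.8, 5.2, 5.4, 4.8, 5.0, 5.2, 4.6]
t_6 = 2.447  # t_{0.025}(6), taken from the t-table
ex1 = interval(mean(weights), t_6, stdev(weights) / sqrt(len(weights)))

# Example 2: large sample, sigma unknown, s used in its place (99% level).
ex2 = interval(170, z_crit(0.99), 8 / sqrt(40))

# Example 3: large sample, sigma known (90% level).
ex3 = interval(1200, z_crit(0.90), 100 / sqrt(50))

# Example 4: two samples, known variances (Definition 4, 96% level).
se4 = sqrt(10**2 / 50 + 12**2 / 75)
ex4 = interval(78 - 70, z_crit(0.96), se4)

# Example 5: two samples, unknown but equal variances (pooled t, nu = 20).
n1, n2, s1, s2 = 12, 10, 4.5, 3.2
sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
t_20 = 2.086  # t_{0.025}(20), taken from the t-table
ex5 = interval(85.3 - 78.2, t_20, sp * sqrt(1 / n1 + 1 / n2))

for name, (lo, hi) in [("Example 1", ex1), ("Example 2", ex2),
                       ("Example 3", ex3), ("Example 4", ex4),
                       ("Example 5", ex5)]:
    print(f"{name}: {lo:.3f} < mu < {hi:.3f}")
```

Because the script keeps full precision for the critical values and for $s_p$, the intervals for Examples 1, 2, 3 and 5 should match the worked solutions only to within about 0.01. Example 4 is computed from the square-root form in Definition 4.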