Chapter 8: Estimates and Sample Sizes PDF

Document Details

Uploaded by Joeeeyism

Beijing Foreign Studies University

Tags

statistics population mean sample size estimation

Summary

This document is a chapter on estimating population means and sample sizes. It covers point estimation, confidence intervals, critical values, and assumptions. It's part of a lecture from Beijing Foreign Studies University, focusing on mathematical statistics.

Full Transcript

Estimate a Population Proportion; Estimating a Population Mean: σ Known; Estimating a Population Mean: σ not Known

Chapter 8: Estimates and Sample Sizes
International Business School, BEIJING FOREIGN STUDIES UNIVERSITY

Estimate a Population Mean

Point Estimate

Definition: The value of a sample statistic that is used to estimate a population parameter is called a point estimate.

There are two types of estimation: point estimation and confidence interval estimation.

Confidence Interval

Definition: A confidence interval (or interval estimate) is a range (or an interval) of values used to estimate the true value of a population parameter. A confidence interval is sometimes abbreviated as CI.

The confidence level is the probability 1 − α (often expressed as the equivalent percentage value) that the confidence interval actually does contain the population parameter, assuming that the estimation process is repeated a large number of times. (The confidence level is also called the degree of confidence, or the confidence coefficient.)

The most common choices are 90% (α = 10%), 95% (α = 5%), and 99% (α = 1%).

Critical Values

Definition: A standard z score can be used to distinguish between sample statistics that are likely to occur and those that are unlikely to occur. Such a z score is called a critical value. Critical values are based on the following observations:

1. Under certain conditions, the sampling distribution of sample means can be approximated by a normal distribution.
2. A z score associated with a sample mean has a probability of α/2 of falling in the right tail.

3. The z score separating the right-tail region is commonly denoted by zα/2 and is referred to as a critical value because it is on the borderline separating z scores from sample means that are likely to occur from those that are unlikely to occur.

Example: Finding zα/2 for a 95% Confidence Level

[Figure: Finding zα/2 for a 95% Confidence Level]

For a 95% confidence level, α = 0.05, so α/2 = 0.025 and zα/2 = 1.96.

Estimating a Population Mean: σ Known

Notation

µ = population mean
σ = population standard deviation
x̄ = sample mean
E = margin of error
zα/2 = z score separating an area of α/2 in the right tail of the standard normal distribution

Point Estimate of the Population Mean

The sample mean x̄ is the best point estimate of the population mean µ.

Assumptions

1. The sample is a simple random sample. (All samples of the same size have an equal chance of being selected.)
2. The value of the population standard deviation σ is known.
3. Either or both of these conditions are satisfied: the population is normally distributed or n > 30.
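The critical value zα/2 described above can also be computed rather than read from a table. The following is a minimal sketch using Python's standard library; the function name `z_critical` is a hypothetical helper, not something from the slides.

```python
from statistics import NormalDist


def z_critical(confidence_level: float) -> float:
    """Return z_{alpha/2}: the z score with area alpha/2 in the right tail.

    Hypothetical helper for illustration; the slides use a printed Z-table.
    """
    alpha = 1.0 - confidence_level
    # Area to the left of the critical value is 1 - alpha/2.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

This gives roughly 1.645, 1.960, and 2.576. A table lookup rounds these to two decimals, which is why the worked solutions in this chapter use 1.65 and 2.58.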
Confidence Interval for Estimating a Population Mean (with σ Known)

x̄ − E < µ < x̄ + E, where E = zα/2 · σ/√n

or x̄ ± E

or (x̄ − E, x̄ + E)

Example

A publishing company has just published a new college textbook. Before the company decides the price at which to sell this textbook, it wants to know the average price of all such textbooks in the market. The research department at the company took a sample of 25 comparable textbooks and collected information on their prices. This information produced a mean price of $90.50 for this sample. It is known that the standard deviation of the prices of all such textbooks is $7.50 and the population of such prices is normal.

1. What is the point estimate of the mean price of all such college textbooks?
2. Construct a 90% confidence interval for the mean price of all such college textbooks.

Solution

From the given information, n = 25, x̄ = $90.50, and σ = $7.50.

1. The point estimate of the mean price of all such college textbooks is $90.50; that is, point estimate of µ = x̄ = $90.50.

2. The confidence level is 90%, or 0.90. First we find the z value for a 90% confidence level. Here, the area in each tail of the normal distribution curve is α/2 = (1 − 0.90)/2 = 0.05. Now, in the Z-table, look for the areas 0.0500 and 0.9500 and find the corresponding values of z. These values are z = −1.65 and z = 1.65.
Substitute the z values in the CI formula for µ:

x̄ ± E = 90.50 ± zα/2 (σ/√n)
= 90.50 ± 1.65 (7.50/√25)
= 90.50 ± 1.65(1.50)
= 90.50 ± 2.48
= (90.50 − 2.48) to (90.50 + 2.48)
= $88.02 to $92.98

Example

According to CardWeb.com, the mean bank credit card debt for households was $7868 in 2004 (USA Today, July 1, 2005). Assume that this mean was based on a random sample of 900 households and the standard deviation of such debts for all households in 2004 was $2070. Make a 99% confidence interval for the 2004 mean bank credit card debt for all households.

Solution

From the given information, n = 900, x̄ = $7868, σ = $2070, and the confidence level is 99%, or 0.99. Note that the sample size is large (n > 30), hence we can use the normal distribution to make a confidence interval for µ.
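The σ-known interval x̄ ± zα/2 (σ/√n) can be checked numerically. The sketch below is only an illustration: `z_interval` is a hypothetical helper, and the critical values 1.65 and 2.58 are passed in as the table values the slides use for 90% and 99% confidence.

```python
from math import sqrt


def z_interval(x_bar: float, sigma: float, n: int, z: float):
    """Return (lower, upper, margin) for x_bar ± z * sigma / sqrt(n).

    Hypothetical helper; z is the critical value z_{alpha/2} from a Z-table.
    """
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin, margin


# Textbook prices: n = 25, x_bar = 90.50, sigma = 7.50, 90% confidence.
low, high, e = z_interval(90.50, 7.50, 25, 1.65)
print(f"Textbooks: E = {e:.3f}, interval = {low:.3f} to {high:.3f}")

# Credit card debt: n = 900, x_bar = 7868, sigma = 2070, 99% confidence.
low, high, e = z_interval(7868, 2070, 900, 2.58)
print(f"Credit cards: E = {e:.2f}, interval = {low:.2f} to {high:.2f}")
```

The unrounded margin for the textbook data is 2.475. The slides round it to 2.48 before adding and subtracting, so their endpoints can differ from the computed ones in the last digit.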
To find z for a 99% confidence level, first we find the area in each of the two tails: (1 − 0.99)/2 = 0.0050. Then we look for the areas 0.0050 and 0.0050 + 0.99 = 0.9950 in the Z-table to find the z values. The values are −2.58 and 2.58.

Substituting these values in the CI formula:

x̄ ± E = 7868 ± zα/2 (σ/√n)
= 7868 ± 2.58 (2070/√900)
= 7868 ± 2.58(69)
= 7868 ± 178.02
= $7689.98 to $8046.02

Estimate the Sample Size

Notation and Formula

µ = population mean
σ = population standard deviation
x̄ = sample mean
E = margin of error
zα/2 = z score separating an area of α/2 in the right tail of the standard normal distribution

n = [zα/2 · σ / E]²

Example

Assume that we want to estimate the mean IQ score for the population of statistics students and the standard deviation for all statistics students is 15. How many statistics students must be randomly selected for IQ tests if we want 95% confidence that the sample mean is within 3 IQ points of the population mean? Given that σ = 15.

Solution

α = 0.05, α/2 = 0.025, zα/2 = 1.96, E = 3 and σ = 15.

n = [zα/2 · σ / E]² = [(1.96 · 15)/3]² = 96.04, which is rounded up to 97.

With a simple random sample of only 97 statistics students, we will be 95% confident that the sample mean is within 3 IQ points of the true population mean µ.

Estimating a Population Mean: σ not Known

There are three possible cases for constructing a confidence interval (CI) for µ when the population standard deviation σ is unknown.

Case I

If the following three conditions are fulfilled:
1. The population standard deviation σ is not known
2. The sample size is small (i.e., n < 30)
3. The population from which the sample is selected is normally distributed,
then we use the t distribution (explained in the next slides) to make the confidence interval for µ.
Case II

If the following two conditions are fulfilled:
1. The population standard deviation σ is not known
2. The sample size is large (i.e., n ≥ 30),
then again we use the t distribution to make the confidence interval for µ.

Case III

If the following three conditions are fulfilled:
1. The population standard deviation σ is not known
2. The sample size is small (i.e., n < 30)
3. The population from which the sample is selected is not normally distributed (or its distribution is unknown),
then we use a nonparametric method to make the confidence interval for µ.

[Figure: Choosing the Appropriate Distribution]

The t Distribution

Definition: The t distribution is a specific type of bell-shaped distribution with a lower height and a wider spread than the standard normal distribution. As the sample size becomes larger, the t distribution approaches the standard normal distribution. The t distribution has only one parameter, called the degrees of freedom (df, where df = n − 1). The mean of the t distribution is equal to 0, and its standard deviation is √(df/(df − 2)), which is defined for df > 2.

Point Estimate of the Population Mean

The sample mean is the best point estimate of the population mean.
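The three cases for σ unknown, together with the σ-known assumptions stated earlier in the chapter, amount to a small decision rule. The sketch below is one way to write it down; the function `choose_method` and its return labels are hypothetical, and the branch for σ known with a small non-normal sample is an assumption, since the transcript does not cover that situation explicitly.

```python
def choose_method(sigma_known: bool, n: int, population_normal: bool) -> str:
    """Pick the method for a confidence interval for the mean.

    Hypothetical helper that follows the thresholds as stated in the slides:
    n > 30 for the sigma-known case, n >= 30 for the sigma-unknown cases.
    """
    if sigma_known:
        if population_normal or n > 30:
            return "z"
        # Assumption: not covered by the transcript; treated like Case III.
        return "nonparametric"
    if n >= 30:
        return "t"  # Case II: large sample, any population shape
    if population_normal:
        return "t"  # Case I: small sample, normal population
    return "nonparametric"  # Case III: small sample, non-normal or unknown


print(choose_method(sigma_known=False, n=25, population_normal=True))
print(choose_method(sigma_known=False, n=20, population_normal=False))
```

The first call corresponds to Case I and the second to Case III.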
Notation

µ = population mean
σ = population standard deviation
s = sample standard deviation
x̄ = sample mean
n = number of sample values
E = margin of error
tα/2 = critical t value separating an area of α/2 in the right tail of the t distribution

Margin of Error

E = tα/2 · s/√n, where tα/2 has n − 1 degrees of freedom.

The t distribution table in the Formulae Book lists values for tα/2.

Confidence Interval for the Estimate of µ (with σ not Known)

x̄ − E < µ < x̄ + E, where E = tα/2 · s/√n and df = n − 1

tα/2 is found in the t distribution table in the Formulae Book.

Example

Dr. Moore wanted to estimate the mean cholesterol level for all adult men living in Hartford. He took a sample of 25 adult men from Hartford and found that the mean cholesterol level for this sample is 186 mg/dL with a standard deviation of 12 mg/dL. Assume that the cholesterol levels for all adult men in Hartford are (approximately) normally distributed. Construct a 95% confidence interval for the population mean µ.

Solution

To find the value of t, we need to know the degrees of freedom and the area under the t distribution curve in each tail.

Degrees of freedom = n − 1 = 25 − 1 = 24.
Area in two tails = 1 − 0.95 = 0.05, so the area in each tail = 0.025.

From the t distribution table, the value of t for df = 24 and 0.025 area in the right tail is 2.064. Substituting all values in the CI formula, the 95% confidence interval is:

x̄ ± tα/2 (s/√n) = 186 ± 2.064 (12/√25)
= 186 ± 2.064(2.40)
= 186 ± 4.95
= 181.05 to 190.95

Therefore, we can state with 95% confidence that the mean cholesterol level for all adult men living in Hartford is between 181.05 and 190.95 mg/dL. Note that x̄ = 186 is a point estimate of µ in this example, and 4.95 is the margin of error.

Example

Sixty-four randomly selected adults who buy books for general reading were asked how much they usually spend on books per year. The sample produced a mean of $1450 and a standard deviation of $300 for such annual expenses. Determine a 99% confidence interval for the corresponding population mean.

Solution

From the given information, n = 64, x̄ = $1450, s = $300, and the confidence level is 99%, or 0.99. Here σ is not known, but the sample size is large (n > 30). Hence, we will use the t distribution to make a confidence interval for µ.

Degrees of freedom = n − 1 = 64 − 1 = 63.
Area in two tails = 1 − 0.99 = 0.01.

From the t distribution table, the value of t for df = 63 and 0.01 area in two tails is 2.660. Substituting all values in the CI formula, the 99% confidence interval is:

x̄ ± tα/2 (s/√n) = $1450 ± 2.660 (300/√64)
= $1450 ± 2.660(37.50)
= $1450 ± $99.75
= $1350.25 to $1549.75

Thus, we can state with 99% confidence that, based on this sample, the mean annual expenditure on books by all adults who buy books for general reading is between $1350.25 and $1549.75.
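Both t-based solutions above follow the same arithmetic, x̄ ± tα/2 (s/√n). A minimal sketch is given below, assuming the critical value is read from a t table and passed in, because Python's standard library has no t distribution. The name `t_interval` is a hypothetical helper, and 2.064 and 2.660 are the table values quoted in the solutions.

```python
from math import sqrt


def t_interval(x_bar: float, s: float, n: int, t_crit: float):
    """Return (lower, upper, margin) for x_bar ± t_crit * s / sqrt(n).

    Hypothetical helper; t_crit is t_{alpha/2} with n - 1 degrees of
    freedom, taken from a t distribution table.
    """
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin, margin


# Cholesterol: n = 25, x_bar = 186, s = 12, df = 24, 95% confidence.
low, high, e = t_interval(186, 12, 25, 2.064)
print(f"Cholesterol: E = {e:.2f}, interval = {low:.2f} to {high:.2f}")

# Book spending: n = 64, x_bar = 1450, s = 300, df = 63, 99% confidence.
low, high, e = t_interval(1450, 300, 64, 2.660)
print(f"Books: E = {e:.2f}, interval = {low:.2f} to {high:.2f}")
```

Keeping the critical value as an argument mirrors the table lookup step in the solutions and avoids tying the sketch to any particular statistics package.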
