1703-Ch 8 Lecture Notes (1) PDF

Chapter 8: Confidence Intervals

Introduction

There are two major techniques for classical statistical inference: confidence intervals and hypothesis testing. We will be studying both techniques for our last unit.

Statistical Inference: the process of drawing conclusions about a population based on sample data.

• The purpose of collecting data on a sample is not simply to have data on that sample.
• Researchers take the sample to infer from that data some conclusion about the wider population represented by the sample.
• For confidence intervals, we use sample statistics to construct an interval that we think contains the population parameter.

Confidence intervals estimate the value of a population parameter; they do not estimate the value of a sample statistic or of an individual observation.

CC BY Creative Commons Attribution 4.0 International License

A confidence interval is an interval of values computed from sample data that is likely to include the unknown value of a population parameter. There is no guarantee that a given confidence interval captures the parameter, but there is a predictable probability of success known as the confidence level. Common confidence levels (CLs) are 90%, 95%, 98%, and 99%.

The confidence level is a measure of how confident we are that the confidence interval contains the true population parameter.

$CL = 1 - \alpha$ or $C = 1 - \alpha$, where $\alpha$ is the significance level.

The confidence level (for example, 95%) tells us that, under repeated sampling, about 95% of the confidence intervals constructed would contain the parameter of interest (and about 5% would not).

• We are 95% confident that the true population parameter is between (LB, UB).

Theory: if we select many different samples of size $n$ from the same population and construct the corresponding 95% confidence intervals, about 95% of them would contain the parameter.
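The repeated-sampling idea can be checked with a small simulation. The sketch below is illustrative only: the true proportion 0.30, the sample size 200, the number of repetitions, and the function name `coverage` are assumptions chosen for the demonstration, not values from these notes. Each simulated interval is built as $\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}$ with $z^* = 1.96$, the proportion interval developed later in this chapter.

```python
import math
import random


def coverage(true_p, n, trials, z_star=1.96, seed=1):
    """Fraction of simulated intervals that contain true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and count the successes.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        # Margin of error: critical value times standard error.
        margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Assumed demonstration values, not data from the notes.
    print(coverage(true_p=0.30, n=200, trials=2000))
```

The printed fraction should land close to 0.95 but not exactly on it, since it is itself an estimate based on a finite number of simulated samples.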
Constructing Confidence Intervals (for a population proportion $p$ or a population mean $\mu$)

To construct a confidence interval, we need a point estimate and the margin of error. A point estimate is a single value used to estimate an unknown population parameter.

• The sample proportion $\hat{p}$ is the best point estimate for the population proportion, $p$.
• The sample mean $\bar{x}$ is the best point estimate for the population mean, $\mu$.

We realize that the point estimate is most likely not the exact value of the population parameter, but it should be close to it. We can take those point estimates (either $\hat{p}$ or $\bar{x}$) and construct interval estimates, called confidence intervals, for the population parameter $p$ or $\mu$.

The half-width of a confidence interval is often called the margin of error, $E$.

For proportions: $E = z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
For means: $E = t^* \left( \dfrac{s}{\sqrt{n}} \right)$

$z^*$ and $t^*$ are known as critical values.

What affects the margin of error?

1. The level of confidence, which determines the value of $z^*$ or $t^*$. As the confidence level increases, the margin of error and the width of the confidence interval also increase.
2. The sample size. A larger sample size produces a narrower confidence interval whenever other factors remain the same. In this case, we generally say our estimate of the population parameter is more precise.

A confidence interval: point estimate ± margin of error

A confidence interval can be written as an interval or an inequality.

(Lower Bound, Upper Bound) OR Lower Bound < parameter < Upper Bound

Example 1: Find the confidence interval given the point estimate and the margin of error. Write the answer in interval notation and inequality notation.

A. Write the confidence interval for a population proportion $p$ if the point estimate is 0.72 and the margin of error is 0.14.
B. Write the confidence interval for a population mean $\mu$ if the point estimate is 48.7 and the margin of error is 5.24.

8.3: A Population Proportion

To construct a confidence interval for a single unknown population proportion, $p$, we need a point estimate for $p$ and the margin of error, $E$.

• The sample proportion, $\hat{p}$, is the best point estimate of the population proportion, $p$.
• The margin of error, $E$, is the critical value times the standard error for the sample proportion.
• A critical value is a table value based on the sampling distribution of the point estimate and the desired confidence level. $z^*$ represents the critical value for a population proportion, $p$.
• The standard error is the estimated standard deviation of the sample statistic: $\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$.

Confidence Interval for a Population Proportion, $p$

Point Estimate ± Margin of Error

$\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$

Confidence Interval for $p$: (LB, UB) or LB < $p$ < UB

Interpretation: “We are [confidence level] confident the population proportion is between [lower bound] and [upper bound].”

Conditions for Normality:

1. Representative sample
2. $X$, the number of successes, follows a binomial distribution
3. Both $n\hat{p}$ and $n\hat{q}$ are at least 5 (at least 5 successes and 5 failures), where $\hat{q} = 1 - \hat{p}$

One strategy for determining $z^*$, or $z_{\alpha/2}$, for a general confidence level is to use the fact that $-z^*$ is the z-score that has area $(1 - C)/2$ below it. This can be read from the z-table provided: search within the body of the table for the area $(1 - C)/2$ and identify the corresponding value of $-z^*$. The table at the bottom left of our z-chart gives the critical values $z^*$ for the common confidence levels (90%, 95%, 98%, and 99%).

Example 2. Find the critical value for a confidence interval in which we are 88% confident.

Example 3. Find the critical value for a confidence interval in which we are 80% confident.
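As a cross-check on table lookups, the critical value and the interval for a proportion can also be computed directly. This is a minimal sketch using `statistics.NormalDist` from Python's standard library; the function names `z_critical` and `proportion_ci` and the sample values $\hat{p} = 0.50$, $n = 100$ are illustrative assumptions, not part of these notes.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """z* such that the central area under the standard normal is `confidence`."""
    tail_area = (1 - confidence) / 2         # area below -z*
    return -NormalDist().inv_cdf(tail_area)  # flip the sign to get +z*


def proportion_ci(p_hat, n, confidence):
    """Interval p_hat ± z* sqrt(p_hat (1 - p_hat) / n) for a proportion."""
    # Condition from the notes: at least 5 successes and 5 failures.
    if n * p_hat < 5 or n * (1 - p_hat) < 5:
        raise ValueError("need n*p_hat >= 5 and n*q_hat >= 5")
    margin = z_critical(confidence) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


if __name__ == "__main__":
    # Common confidence levels listed in the notes.
    for level in (0.90, 0.95, 0.98, 0.99):
        print(level, round(z_critical(level), 3))
    # Hypothetical sample (assumed values): p_hat = 0.50, n = 100.
    low, high = proportion_ci(0.50, 100, 0.95)
    print(round(low, 3), round(high, 3))
```

Table values are usually rounded (for example, 1.96 for 95%), so results computed this way may differ from hand calculations in the last decimal place.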
Example 4. The Marist Institute for Public Opinion surveyed 883 randomly selected American adults about allergies. According to a report posted at the Institute’s website, 36% of the sample answered “yes” to the question “Are you allergic to anything?”

(a) Construct a 95% confidence interval to estimate the population parameter $p$ = proportion of all American adults who are allergic to something.

point estimate: $\hat{p} = 0.36$
sample size: $n = 883$
critical value: $z^* = 1.96$ (to achieve 95% confidence)
margin of error: $E = z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} = 1.96 \sqrt{\dfrac{0.36(0.64)}{883}} \approx 0.0317$

The 95% confidence interval is point estimate ± margin of error:

$\hat{p} \pm E = \hat{p} \pm z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} = 0.36 \pm 0.0317$

ans: (0.3283, 0.3917) or 32.83% < $p$ < 39.17%

(b) Interpret the interval.

ans: We are 95% confident the population proportion of American adults who are allergic to something is between 32.83% and 39.17%.

(c) Suppose a researcher claimed that less than 40% of the population would answer “yes.” Address this claim based on our interval.

ans: The value 0.40 does NOT lie in the interval, and the entire interval lies BELOW 0.40 (the upper bound is 0.3917), so at the 95% confidence level the interval supports the claim.

(d) Address all assumptions and the validity of your results.

ans: The sampling methodology is sound (a random sample). $X$ follows a binomial distribution with $n\hat{p} = 883(0.36) = 317.88 \geq 5$ and $n\hat{q} = 883(1 - 0.36) = 565.12 \geq 5$. The needed assumption that $\hat{p}$ is approximately normal is confirmed.

Example 5. A survey of 500 airline passengers found that 338, or about 68%, were satisfied with the service they received from the flight attendants. Calculate and interpret a 95% confidence interval for the proportion of passengers who are satisfied with the service from flight attendants.
(a) Find the estimate for $p$:
(b) Calculate the 95% CI for $p$:
(c) Interpret the confidence interval:
(d) Check the assumptions:
(e) Suppose a researcher claimed that less than 68% of airline passengers were satisfied with the service from flight attendants. Address this claim according to our interval.

Working Backwards to Find the Margin of Error or Point Estimate

When we calculate a confidence interval, we find the point estimate (the sample mean or the sample proportion), calculate the margin of error, and combine them (point estimate ± margin of error). However, sometimes when we read statistical studies, the study may state the confidence interval only. If we know the confidence interval, we can work backwards to find both the margin of error and the point estimate.

Finding the Margin of Error and Point Estimate When Given the Confidence Interval (LB, UB)

• Margin of Error: $E = \dfrac{UB - LB}{2}$
• Point Estimate: $\bar{x}$ or $\hat{p} = \dfrac{UB + LB}{2}$

Example 6. Suppose you have the following confidence interval: (10.21, 10.59).
(a) Find the margin of error.
(b) Find the point estimate.

Example 7. Suppose you have the following confidence interval: (8, 20). Find the following without using a calculator.
(a) The margin of error.
(b) The point estimate.

8.2: A Single Population Mean Using the Student t Distribution

To construct a confidence interval for a single unknown population mean, $\mu$, where the population standard deviation is UNKNOWN, we need a point estimate for $\mu$ and the margin of error, $E$. If we DO NOT KNOW the population standard deviation, $\sigma$, we use the t-distribution for means (instead of the z-distribution) to find the critical value needed.

• The sample mean, $\bar{x}$, is the best point estimate of the population mean, $\mu$.
• The margin of error, $E$, is the critical value ($t^*$) times the standard error for the sample mean, $\dfrac{s}{\sqrt{n}}$.
• $t^*$ represents the critical value from the t-distribution for the confidence level desired, with $n - 1$ degrees of freedom ($df = n - 1$).

The degrees of freedom is the number of sample observations that can vary after certain restrictions have been imposed. For the t distribution, the degrees of freedom is equal to $n - 1$.

NOTE: Your textbook uses $t_{\alpha/2}$ as the critical value, where we use $t^*$.

Properties of the t-Distribution

• There are an infinite number of t distributions, one for each possible value for degrees of freedom ($df = n - 1$).
• The t distribution has the same symmetric bell shape as the z distribution, but it reflects the greater variability that is expected when estimating $\sigma$.
• The t distribution has a mean of 0.
• The standard deviation of the t distribution varies with the sample size but is always greater than 1 (unlike the z).
• As the sample size $n$ gets larger, the t distribution gets closer to the z distribution.

Example 8. Find the $t^*$ critical value for the following:
a. 95% confidence, n = 15
b. 99% confidence, n = 43
c. 80% confidence, n = 30

Confidence Interval for a Population Mean, $\mu$

Point Estimate ± Margin of Error

$\bar{x} \pm t^* \left( \dfrac{s}{\sqrt{n}} \right)$

Confidence Interval for $\mu$: (LB, UB) or LB < $\mu$ < UB

Interpretation: “We are [confidence level] confident the population (true) mean is between [lower bound] and [upper bound].”

Conditions for Normality:

1. Representative sample
2. One of the following:
(a) The population must be normally distributed, $X \sim$ normal, OR
(b) The sample size needs to be large enough, $n \geq 30$.

Example 9. A physician wants to test a medicine to lower blood glucose concentrations in diabetic patients.
The drug is considered “successful” if it can lower blood glucose concentrations below 110 mg/dL. He uses the drug on 30 patients, and at the end of the trial the sample mean is 102 mg/dL with a sample standard deviation of 3.2 mg/dL. Assume values are normally distributed.

1. Construct a 90% confidence interval for the true value of the population mean blood glucose concentration.
(a) Check the conditions for the validity of the interval.
(b) Determine the critical value.
(c) Determine the point estimate and margin of error.
(d) Construct the confidence interval.
(e) Interpret the interval found in part (d).
2. The physician claims the drug is successful. Is this claim reasonable?

Example 10. The Dallas Stars active roster has 24 players on it, and the average player weight (in pounds) is 200.2, while the standard deviation (in pounds) is 18.2.
(a) Assuming that the data are roughly symmetric and treating the Stars as a representative sample of NHL players, construct a 95% confidence interval for the average weight of hockey players.
(b) Interpret the interval.
(c) Address the claim that the mean weight of all hockey players is over 195 pounds.
(d) Address all assumptions and the validity of your results.

Mixed Examples

Example 11. The Berkman Center for Internet & Society at Harvard conducted a study analyzing the privacy management habits of teen internet users. In a group of 50 teens, 13 reported having more than 500 friends on Facebook. Find a 90% confidence interval for the true proportion of teens who would report having more than 500 Facebook friends.
(a) Obtain a point estimate for the proportion of teens who have more than 500 Facebook friends.
(b) Verify that the requirements for constructing a confidence interval for $p$ are satisfied.
(c) Construct a 90% confidence interval for the true proportion of teens who have more than 500 Facebook friends.
(d) Interpret the confidence interval.
(e) What is the effect of increasing the level of confidence on the width of the interval?

Example 12. A study was done on acupuncture to determine how effective it is in relieving pain. Sensory rates for 15 subjects were measured, which resulted in a mean of 8.23 and a standard deviation of 1.67.
a) Construct a 95% confidence interval for the mean sensory rate for the population.
b) Interpret the confidence interval.
c) Check the assumptions for normality.

Example 13. Suppose you have the following confidence interval: (0.247, 0.304).
(a) Find the margin of error.
(b) Find the point estimate.

Example 14. What are the two (2) ways to reduce the width of a confidence interval?
a. Larger sample size and higher level of confidence
b. Smaller sample size and lower level of confidence
c. Larger sample size and lower level of confidence
d. Unable to determine without seeing the data

Example 15. Suppose you compute an interval with a sample size of 77. What will happen to the confidence interval if the sample size increases to 90?
a. The confidence interval will widen.
b. The confidence interval will narrow.
c. The width of the confidence interval will stay the same.

Example 16. A student was asked to find a 99% confidence interval for the proportion of students who take notes using data from a random sample of n = 77. Which of the following is a correct interpretation of the interval 0.12 < $p$ < 0.3?
a. There is a 99% probability that the proportion of the population is between 0.12 and 0.3.
b. The proportion of all students who take notes is between 0.12 and 0.3, 99% of the time.
c. With 99% confidence, a randomly selected student takes notes in a proportion of their classes that is between 0.12 and 0.3.
d. With 99% confidence, the proportion of all students who take notes is between 0.12 and 0.3.

Example 17. Answer the following.
a) Given a confidence interval of (3.4, 5.6), can we conclude that the parameter is no more than 3.8?
b) Given a confidence interval of (3.4, 6.2), can we conclude that the parameter is different from 5.8?
c) Given a confidence interval of (3.4, 5.6), can we conclude that the parameter is at least 3.2?
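The computations practiced in the mixed examples can be sketched in a few lines. This is an illustrative sketch, not part of the original notes: the function names are hypothetical, the critical value $t^*$ is passed in by hand (looked up in a t-table, since Python's standard library has no t-distribution), and the numbers in the usage lines are made-up values rather than answers to any example above.

```python
import math


def mean_ci(x_bar, s, n, t_star):
    """Interval x_bar ± t* (s / sqrt(n)); t_star comes from a t-table, df = n - 1."""
    margin = t_star * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


def work_backwards(lower, upper):
    """Recover (point estimate, margin of error) from an interval (LB, UB)."""
    point_estimate = (upper + lower) / 2
    margin = (upper - lower) / 2
    return point_estimate, margin


def contains(lower, upper, value):
    """True when `value` lies inside (LB, UB), i.e. is a plausible parameter value."""
    return lower < value < upper


if __name__ == "__main__":
    # Hypothetical sample (assumed values): x_bar = 50, s = 10, n = 25, so df = 24.
    # t* = 2.064 is the usual table value for 95% confidence with df = 24.
    low, high = mean_ci(50, 10, 25, 2.064)
    print(round(low, 3), round(high, 3))
    print(work_backwards(low, high))
    print(contains(low, high, 47))
```

A claim about the parameter is judged by where the claimed values fall relative to the whole interval, which is why `contains` compares against both bounds instead of the point estimate alone.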
