Summary

This document contains homework questions and answers on statistical studies and data analysis: populations, samples, parameters, and statistics; sampling methods; measures of center and variation; the normal distribution; and counting and probability. It also includes worked calculations of sample statistics for several data sets.

Full Transcript

HW 4

1. The population is the complete set of people or things being studied, a population parameter is a number that describes a characteristic of the population, a sample is the set of people or things from which the data are obtained, a sample statistic is a number that describes a characteristic of a sample, and raw data are the individual measurements collected.

2. Each year, a group surveys 50,000 households to study internet usage. In one area of the study, the group is interested in finding out how many hours a day the household spends streaming video from the internet. Describe the five basic steps in a statistical study with an example of their application below.

Part 1
A. State the goal of your study. In this case, it is to discover how many hours per day a household spends streaming internet video.
B. State the goal of your study. In this case, it is to discover how many households have internet access.
C. Create a question to ask members of the study. In this case, the question is "How many hours per day do you spend streaming internet video?"
D. State the goal of your study. In this case, it is to discover how many hours per week a household spends streaming internet video.

Part 2
A. Start collecting data. In this case, we randomly call phone numbers and ask them about their internet habits.
B. Redefine the goal of your study. In this case, we also want to discover how many hours per day a household spends checking email.
C. Choose a representative sample from the population. In this case, it would be choosing a small sample from the 50,000 households.
D. Choose a representative sample from the population. In this case, it would be choosing a sample of 50,000 households.

Part 3
A. Estimate the data for the sample. In this case, we guess that, on average, a household spends 1.5 hours per day streaming internet video.
B. Collect raw data from the sample and summarize these data by finding sample statistics of interest. In this case, it would be asking the households how many hours they spend streaming internet video and turning this data into an average.
C. Discover if there is interest in this study. In this case, we ask other researchers if they would be interested in reviewing the collected data.
D. Collect raw data from the sample and summarize these data by finding sample statistics of interest. In this case, it would be asking the households if they have internet access.

Part 4
A. Use the sample statistics to infer the population parameters. In this case, based on the data gathered, the group estimates the average time per day that a household spends streaming internet video.
B. Use the sample statistics to infer the population parameters. In this case, the group estimates how many households have internet access.
C. Act on the collected data. In this case, we see that some households spend 0 hours streaming internet video and so we show these households how to stream video online.
D. Estimate any missing data. In this case, we would estimate the data for any households that we could not collect data from.

Part 5
A. Draw conclusions to determine what you learned and whether you achieved your goal. In this case, we discovered the average time per day that a household spends streaming internet video.
B. Draw conclusions to determine what you learned and whether you achieved your goal. In this case, we learned that some households use the internet to check email.
C. Act on the results. In this case, we determine that we need to have more households stream internet video.
D. Repeat your study. In this case, we find a new sample to discover their internet usage.

3. A recent telephone poll of 995 randomly selected adults revealed that 6 in 10 adults believe there has been progress in finding a cure for cancer in the last 26 years. Describe the population, sample, population parameters, and sample statistics.

Part 1
A. All adults in the country
B. The 995 adults selected
C. 6 out of 10 adults in the country
D. 6 out of 10 of the 995 adults selected

Part 2
A. 6 out of 10 of the 995 adults selected
B. The 995 adults selected
C. All adults in the country
D. 6 out of 10 adults in the country

Part 3
A. The total number of all adults in the country
B. The percentage of all adults in the country who believe there has been progress
C. The number of adults selected
D. The percentage of the 995 adults selected who believe there has been progress

Part 4
A. The proportion of all adults in the country who believe there has been progress
B. 6 out of 10
C. The total number of all adults in the country
D. 995

4. An IRS (Internal Revenue Service) auditor randomly selects for audits 10 taxpayers in each of the filing status categories: single, head of household, married filing jointly, and married filing separately.
A. Simple random sampling
B. Stratified sampling
C. Convenience sampling
D. Systematic sampling

5. Blood alcohol concentrations of drivers involved in fatal crashes and then given jail sentences are shown below. Find the mean, median, and mode of the listed numbers.
0.28 0.18 0.18 0.16 0.13 0.23 0.29 0.23 0.14 0.16 0.11 0.16
The mean is 0.188
The median is 0.17
What is (are) the mode(s)? Select the correct choice below and, if necessary, fill in the answer box within your choice.
A. The mode(s) is (are) 0.16
B. There is no mode.

6. 27 4 4 11 12 34 3 15 6 4 3 4 4 14 6
a. Find the mean and median margin of victory.
b. Identify the outlier in the data set. If the outlier is eliminated, what are the new mean and median?

a. Find the mean and median of the margins of victory.
The mean is 10.1
The median is 6
b. Which, if any, of the margins of victory would be considered the outlier? Select the correct choice below and, if necessary, fill in the answer box to complete your choice.
A. The outlier is 34
B. None of the margins would be considered an outlier.
c. Find the mean with the outlier excluded.
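The mean, median, and mode stated in questions 5 and 6 can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative, and 34 is treated as the outlier because that is what the answer choices indicate. `Decimal` is used for the blood alcohol values so that the mean comes out as exactly 0.1875, which the answer key rounds to 0.188.

```python
from decimal import Decimal
from statistics import mean, median, multimode

# Question 5: blood alcohol concentrations, kept as exact decimals.
bac = [Decimal(s) for s in
       "0.28 0.18 0.18 0.16 0.13 0.23 0.29 0.23 0.14 0.16 0.11 0.16".split()]
bac_mean = mean(bac)        # exact mean before rounding to three places
bac_median = median(bac)    # average of the two middle values
bac_modes = multimode(bac)  # list of the most frequent value(s)
print(bac_mean, bac_median, bac_modes)

# Question 6: margins of victory, with and without the outlier 34.
margins = [27, 4, 4, 11, 12, 34, 3, 15, 6, 4, 3, 4, 4, 14, 6]
trimmed = [m for m in margins if m != 34]
print(round(mean(margins), 1), median(margins))
print(round(mean(trimmed), 1), median(trimmed))
```

Removing the single value 34 lowers the mean from about 10.1 to about 8.4 but moves the median only from 6 to 5, which illustrates why the median is described as less sensitive to outliers.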
Select the correct choice below and, if necessary, fill in the answer box to complete your choice.
A. The mean without the outlier is 8.4
B. None of the margins of victory would be considered an outlier.
d. Find the median with the outlier excluded. Select the correct choice below and, if necessary, fill in the answer box to complete your choice.
A. The median without the outlier is 5
B. None of the margins of victory would be considered an outlier.

7. Per capita earnings in New York City.
A. The median because it is easier to calculate.
B. The mode because there can be more than one.
C. The median because it is unaffected by outliers.
D. The mean because it is the most commonly understood measure of center.
E. The mode because it will find the most common per capita earning(s) in New York City.
F. The mean because it will find the average earnings for everyone in New York City.

8. Daily snowfall in Omaha in January.
A. The mean because it takes into account all days, regardless of the amount of snow.
B. The median because it is easier to calculate.
C. The mean because it is the most commonly understood measure of center.
D. The mode because there can be more than one.
E. The mode because it will find the most common snowfall amount(s).
F. The median because it is unaffected by outliers, such as extreme storms.

9. Exam results for 100 students are given below. For the given exam results, briefly describe the shape and variation of the distribution.
median = 75, mean = 75, low score = 60, high score = 90
Describe the shape of the distribution.
A. The distribution is right-skewed because the median is greater than the mean.
B. The distribution is left-skewed because the median is less than the mean.
C. The distribution is left-skewed because the median is greater than the mean.
D. The distribution is symmetric because the median is equal to the mean.
E. The distribution is right-skewed because the median is less than the mean.
The difference between the greatest and least scores is 30, which is small compared to the number of students.

10. Define and distinguish among positive correlation, negative correlation, and no correlation. How do we determine the strength of a correlation?

Define positive correlation. Choose the correct answer below.
A. Positive correlation means that both variables tend to increase (or decrease) together.
B. Positive correlation means that there is no apparent relationship between the two variables.
C. Positive correlation means that two variables tend to change in opposite directions, with one increasing while the other decreases.
D. Positive correlation means that there is a good relationship between the two variables.

Define negative correlation. Choose the correct answer below.
A. Negative correlation means that both variables tend to increase (or decrease) together.
B. Negative correlation means that two variables tend to change in opposite directions, with one increasing while the other decreases.
C. Negative correlation means that there is a bad relationship between the two variables.
D. Negative correlation means that there is no apparent relationship between the two variables.

Define no correlation. Choose the correct answer below.
A. No correlation means that the two variables are always zero.
B. No correlation means that both variables tend to increase (or decrease) together.
C. No correlation means that two variables tend to change in opposite directions, with one increasing while the other decreases.
D. No correlation means that there is no apparent relationship between the two variables.

How do we determine the strength of a correlation?
A. The more closely two variables follow the general trend, the stronger the correlation (which may be positive or negative).
B. Negative correlation is stronger than no correlation. Positive correlation is stronger than negative correlation.
C. No correlation is stronger than negative correlation. Positive correlation is stronger than no correlation.
D. The more closely two variables follow the general trend, the weaker the correlation (which may be positive or negative).

11. The histogram in the figure shows times between eruptions of a geyser. Draw a smooth curve that captures its important features. Then classify the distribution according to its number of peaks, symmetry or skewness, and variation.
a. Draw a smooth curve that captures the important features. Choose the correct answer below.
b. The distribution has 2 peak(s), is not symmetric, and has moderate to large variation.

12. The histogram of a sample of the weights of 213 rugby players is shown to the right. Draw a smooth curve that captures its important features. Then classify the distribution according to its number of peaks, symmetry or skewness, and variation.
a. Draw a smooth curve that captures the important features. Choose the correct answer below.
b. The distribution has 1 peak(s), is symmetric, and has fairly low variation.

HW 5

1. Consider two grocery stores at which the mean time in line is the same but the variation is different. At which store would you expect the customers to have more complaints about the waiting time?
A. The customers would have more complaints about the waiting time at the store that has more variation because some customers would have longer waits and might think they are being treated unequally.
B. The customers would have more complaints about the waiting time at the store that has less variation because some customers would have longer waits and might think they are being treated unequally.
C. The customers would have more complaints about the waiting time at the store that has more variation because some customers are easily annoyed.
D. The customers would have more complaints about the waiting time at the store that has less variation because some customers are easily annoyed.

2. What are the quartiles of a distribution? How do we find them?
a. What are the quartiles of a distribution?
A. The quartiles consist of the mean, the median, and the standard deviation of the data distribution.
B. When the data distribution is divided equally into four sets, each set of values is called a quartile.
C. The quartiles consist of the lowest value, the median, and the highest value of the data distribution.
D. The quartiles are values that divide the data distribution into quarters.
b. How do we find them?
A. The lower quartile is the median of the data values in the lower half of a data set. The middle quartile is the overall median. The upper quartile is the median of the data values in the upper half of a data set.
B. The lower quartile is the mean of the data values in the lower half of a data set. The middle quartile is the overall mean. The upper quartile is the mean of the data values in the upper half of a data set.
C. The lower quartile is the median of the data values in the first quarter of a data set. The middle quartile is the overall median. The upper quartile is the median of the data values in the fourth quarter of a data set.
D. The lower quartile is the mean of the data values in the first quarter of a data set. The middle quartile is the overall mean. The upper quartile is the mean of the data values in the fourth quarter of a data set.

3. After recording the pizza delivery times for two different pizza shops, you conclude that one pizza shop has a mean delivery time of 43 minutes with a standard deviation of 4 minutes. The other shop has a mean delivery time of 41 minutes with a standard deviation of 21 minutes. Interpret these figures. If you liked the pizzas from both shops equally well, which one would you order from? Why?
a. Interpret these figures. Choose the correct answer below.
A. Both the means and the variations are nearly equal.
B. The means are nearly equal, but the variation is significantly greater for the second shop than for the first.
C. The variations are nearly equal, but the mean is greater for the first shop than for the second.
D. The means are nearly equal, but the variation is significantly lower for the second shop than for the first.
b. If you liked the pizzas from both shops equally well, which one would you order from? Why?
A. Choose the second shop. The delivery time is more reliable because it has a larger standard deviation.
B. Choose the first shop. The delivery time is more reliable because it has a lower standard deviation.
C. Choose the first shop. The delivery time is more reliable because it has a lower mean.
D. Choose the second shop. The delivery time is more reliable because it has a larger mean.

4. A small animal veterinarian reviews her records for the day and notes that she has seen eight dogs and eight cats with the following weights (in pounds).
Dogs: 14, 25, 39, 45, 53, 64, 75, 102
Cats: 4, 5, 9, 12, 16, 19, 21, 21
a. Make correct conjectures below about which set has the larger mean, median, and standard deviation. Choose the correct answer below.
A. The mean, median, and standard deviation are all higher for dogs because most of the weights are larger, so the average value, middle value, and spread must be larger.
B. The mean and median are higher for dogs because most of the weights are larger, so the average value and middle value must be larger. The standard deviation is higher for cats because there is more variation in the weights.
C. The mean and median are higher for cats because there is less variation in the weights. The standard deviation is higher for dogs because there is more variation in the weights.
D. The mean, median, and standard deviation are all higher for cats because there is less variation in the weights, so the average value, middle value, and spread must be larger.
b. Compute the mean and standard deviation of each set.
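One way to carry out the computation asked for in part b is sketched below with Python's standard `statistics` module. The names `dogs`, `cats`, and `summary` are illustrative. The sketch assumes the sample standard deviation (dividing by n - 1), since that is the version that reproduces the values 28.2 and 6.9 given in this transcript.

```python
from statistics import mean, median, stdev

dogs = [14, 25, 39, 45, 53, 64, 75, 102]
cats = [4, 5, 9, 12, 16, 19, 21, 21]

# stdev() is the sample standard deviation (divides by n - 1).
summary = {
    name: (round(mean(weights), 1), median(weights), round(stdev(weights), 1))
    for name, weights in (("dogs", dogs), ("cats", cats))
}

for name, (avg, mid, sd) in summary.items():
    print(f"{name}: mean={avg}, median={mid}, standard deviation={sd}")
```

All three measures come out larger for the dogs, consistent with conjecture A in part a.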
The mean for the dogs is 52.1
The mean for the cats is 13.4
The standard deviation for the dogs is 28.2
The standard deviation for the cats is 6.9

5. What is a normal distribution? Briefly describe the conditions that make a normal distribution. What is a normal distribution and what conditions make a distribution normal? Choose the correct answer below.
A. A normal distribution is an asymmetric distribution with a single peak. Its peak corresponds to the mode of the distribution. Its variation is characterized by the standard deviation of the distribution.
B. A normal distribution is a symmetric, bell-shaped distribution with a single peak. Its peak corresponds to the mean, median, and mode of the distribution. Its variation is characterized by the standard deviation of the distribution.
C. A normal distribution is a symmetric distribution with two peaks. Its peaks correspond to the maximum and the minimum of the distribution. Its variation is characterized by the standard deviation of the distribution.
D. A normal distribution is a symmetric, bell-shaped distribution with a single peak. Its peak corresponds to the mean, median, and mode of the distribution. Its variation is characterized by the range of the distribution.

6. What is the 68-95-99.7 rule for normal distributions? Explain how it can be used to answer questions about frequencies of data values in a normal distribution.
A. The rule states that about 1, 2, and 3 data points lie in 68%, 95%, and 99.7% of the data points, respectively, in a normal distribution.
B. The rule states that about 68%, 95%, and 99.7% of the data points in a normal distribution lie within 1, 2, and 3 standard deviations of the mean, respectively.
C. The rule states that about 68%, 95%, and 99.7% of the data points in a normal distribution lie within 0, 1, and 2 standard deviations of the mean, respectively.
D. The rule states that about 0, 1, and 2 data points lie in 68%, 95%, and 99.7% of the data points, respectively, in a normal distribution.

7. What is a standard score? How do you find the standard score for a particular data value?
a. Choose the correct definition of a standard score below.
A. A standard score is a data value equal to the mean.
B. A standard score is a data value that lies within one standard deviation of the mean.
C. A standard score is the distance between a data value and the nearest outlier.
D. A standard score is the number of standard deviations a data value lies above or below the mean.
b. Choose the correct formula for computing a standard score below.
A. The standard score for a particular data value is given by z = standard deviation / data value
B. The standard score for a particular data value is given by z = standard deviation / (mean - data value)
C. The standard score for a particular data value is given by z = data value / standard deviation
D. The standard score for a particular data value is given by z = (data value - mean) / standard deviation

8. Consider the following set of three distributions, all of which are drawn to the same scale. Identify the two distributions that are normal. Of the two normal distributions, which one has the larger variation?
The two normal distributions are (a) & (c), where (c) has the larger standard deviation.

9. State whether you would expect the following data set to be normally distributed or not.
Scores on an easy statistics exam
A. Normally distributed
B. Not normally distributed

10. Assume that a set of test scores is normally distributed with a mean of 100 and a standard deviation of 25. Use the 68-95-99.7 rule to find the following quantities.
a. The percentage of scores less than 100 is 50%
b. The percentage of scores greater than 125 is 16%

11. The scores on a psychology exam were normally distributed with a mean of 61 and a standard deviation of 9. What is the standard score for an exam score of 51?
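The standard score defined in question 7 can be computed directly. The sketch below is a minimal illustration in Python; the function name `standard_score` and its parameter names are assumptions made for this example. It is applied to the values in question 11 and in question 12 (mean 55, standard deviation 6, score 44), and it also spells out the 68-95-99.7 reasoning behind the 16% in question 10b.

```python
def standard_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev

z_q11 = round(standard_score(51, 61, 9), 2)  # question 11
z_q12 = round(standard_score(44, 55, 6), 2)  # question 12
print(z_q11, z_q12)

# Question 10b: 125 is one standard deviation above the mean of 100.
# About 68% of scores lie within one standard deviation, so the remaining
# 32% is split evenly between the two tails.
percent_above_125 = (100 - 68) / 2
print(percent_above_125)
```

A negative result means the score lies below the mean, as both exam scores here do.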
The standard score is -1.11

12. The scores on a psychology exam were normally distributed with a mean of 55 and a standard deviation of 6. What is the standard score for an exam score of 44?
The standard score is -1.83

HW 6

2. 15!/12!
The solution is 2730

4. A city council with eleven members must elect a five-person executive committee consisting of a mayor, deputy mayor, secretary, administrator, and treasurer. How many executive committees are possible?
A. Arrangements with repetition should be used because no item may be selected more than once and the order matters.
B. Permutations should be used because we make selections from a group of choices.
C. Combinations should be used because no item may be selected more than once and the order does not matter.
D. Arrangements with repetition should be used because we make selections from a group of choices.
E. Permutations should be used because no item may be selected more than once and the order matters.
Calculate how many different committees are possible. 55,440

5. Answer the following question using arrangements with repetition, permutations, or combinations. The President must assign ambassadors to five different foreign embassies. From a pool of ten candidates, how many different diplomatic teams can she form?
A. Permutations should be used because we make selections from a group of choices.
B. Arrangements with repetition should be used because no item may be selected more than once and the order matters.
C. Arrangements with repetition should be used because we make selections from a group of choices.
D. Permutations should be used because no item may be selected more than once and the order matters.
E. Combinations should be used because no item may be selected more than once and the order does not matter.
Calculate how many different diplomatic teams are possible. 30,240

6. Answer the following question using arrangements with repetition, permutations, or combinations. How many anagrams (rearrangements) of the letters BODKIN can you make?
All the letters can be arranged in 720 ways.

7. A dog shelter is giving away 11 different dogs, but you have room for only 3 of them. How many different dog families could you have?
A. Arrangements with repetition should be used because no item may be selected more than once and the order matters.
B. Permutations should be used because no item may be selected more than once and the order matters.
C. Permutations should be used because we make selections from a group of choices.
D. Arrangements with repetition should be used because we make selections from a group of choices.
E. Combinations should be used because no item may be selected more than once and the order does not matter.
Calculate how many different dog families are possible. 165

8. A certain lottery has 39 numbers. In how many different ways can 5 of the numbers be selected? (Assume that order of selection is not important.)
There are 575,757 different ways the numbers can be selected.

10. What does it mean when we write P(A)? What is the possible range of values for P(A), and why?
P(A) means which of the following?
A. P(A) means the number of times that event A will occur.
B. P(A) means the probability that event A will not occur.
C. P(A) means the probability that event A will occur.
Which of the following is true for the possible range of values for P(A)?
A. The range of possible values for P(A) is from 0 to 1 (inclusive), with 0 meaning there is no chance that event A will occur and 1 meaning it is certain that event A will occur.
B. The range of possible values for P(A) is the number of possible events where A could occur.
C. The range of possible values for P(A) is 0 and 1 with 0 meaning that event A did not occur and 1 meaning that event A did occur.
D. The range of possible values for P(A) can be any positive real number, with larger numbers being more likely to occur.

11. Decide whether the following statement makes sense (or is clearly true) or does not make sense (or is clearly false). Explain your reasoning.
The probability that Jonas will win the race is 0.6 and the probability that he will not win is 0.5.
A. The statement does not make sense because the probability of Jonas winning the race cannot be greater than the probability of him not winning the race.
B. The statement makes sense because it is true that the probability of Jonas not winning the race is 0.5.
C. The statement makes sense because the probability of Jonas winning the race will always be between 0 and 1.
D. The statement does not make sense because the sum of the probabilities of Jonas winning and not winning the race must equal 1.

12. How many different choices of car does a person have if a particular model comes in 9 colors and 4 styles (sedan, station wagon, full-size SUV, or minivan)?
There are 36 different choices of car.

13. Pizza House offers 2 different salads, 5 different kinds of pizza, and 3 different desserts. How many different three-course meals can be ordered?
How many different meals can be ordered? 30

15. Use the theoretical method to determine the probability of the following outcome and event. State any assumptions made.
Tossing two coins and getting either one head or two heads
A. Assuming that each coin is fair and is equally likely to land heads or tails, the probability is three fourths
B. Assuming that each coin is fair and is equally likely to land heads or tails, the probability is four thirds
C. Assuming that each coin is fair and is equally likely to land heads or tails, the probability is one half
D. Assuming that each coin is fair and is equally likely to land heads or tails, the probability is 2 times 2.

16. Use the theoretical method to determine the probability of the given outcome or event. Assume that the die is fair.
Rolling a single six-sided die and getting a low number (1, 2, or 3)
The probability of rolling a single six-sided die and getting a low number (1, 2, or 3) is 1/2

19. Use the theoretical method to determine the probability of the outcome or event given below.
The next president of the United States was born on Thursday
The probability of the given event is 1/7

20. Determine the probability of the given opposite event. What is the probability of rolling a fair die and not getting an outcome less than 3?
The probability of rolling a fair die and not getting an outcome less than 3 is 2/3
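The counting and probability answers in HW 6 can be checked with a short script. This is a minimal sketch using only Python's standard library (`math.perm` and `math.comb` require Python 3.8 or later); the variable names are illustrative, and the probabilities assume fair coins and a fair die, as the questions state.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial, perm

counts = {
    "15!/12!": factorial(15) // factorial(12),
    "committee, 5 ordered roles from 11": perm(11, 5),
    "ambassadors, 5 embassies from 10": perm(10, 5),
    "anagrams of BODKIN": factorial(6),  # six distinct letters
    "3 dogs from 11": comb(11, 3),
    "5 lottery numbers from 39": comb(39, 5),
    "cars, 9 colors x 4 styles": 9 * 4,
    "meals, 2 salads x 5 pizzas x 3 desserts": 2 * 5 * 3,
}
for label, value in counts.items():
    print(f"{label}: {value:,}")

# Question 15: two fair coins, at least one head.
tosses = list(product("HT", repeat=2))
p_heads = Fraction(sum("H" in toss for toss in tosses), len(tosses))

# Questions 16 and 20: one fair six-sided die.
faces = range(1, 7)
p_low = Fraction(sum(face <= 3 for face in faces), 6)
p_not_less_than_3 = 1 - Fraction(sum(face < 3 for face in faces), 6)
print(p_heads, p_low, p_not_less_than_3)
```

Permutations are used where the selected items fill distinct roles (committee posts, embassies), and combinations where only the chosen set matters (dogs, lottery numbers).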
