QR Exam PDF
Document Details
Summary
This document contains homework questions and answers on statistical studies and data analysis, covering populations, samples, sampling methods, and sample statistics. It also includes worked calculations of sample statistics for several data sets.
Full Transcript
HW 4
1. The population is the complete set of people or things being studied; a population parameter is a number that describes a characteristic of the population; a sample is the set of people or things from which the data are obtained; a sample statistic is a number that describes a characteristic of a sample; and raw data are the individual measurements collected.
2. Each year, a group surveys 50,000 households to study internet usage. In one area of the study, the group is interested in finding out how many hours a day the household spends streaming video from the internet. Describe the five basic steps in a statistical study with an example of their application below.
Part 1
A. State the goal of your study. In this case, it is to discover how many hours per day a household spends streaming internet video.
B. State the goal of your study. In this case, it is to discover how many households have internet access.
C. Create a question to ask members of the study. In this case, the question is "How many hours per day do you spend streaming internet video?"
D. State the goal of your study. In this case, it is to discover how many hours per week a household spends streaming internet video.
Part 2
A. Start collecting data. In this case, we randomly call phone numbers and ask them about their internet habits.
B. Redefine the goal of your study. In this case, we also want to discover how many hours per day a household spends checking email.
C. Choose a representative sample from the population. In this case, it would be choosing a small sample from the 50,000 households.
D. Choose a representative sample from the population. In this case, it would be choosing a sample of 50,000 households.
Part 3
A. Estimate the data for the sample. In this case, we guess that, on average, a household spends 1.5 hours per day streaming internet video.
B. Collect raw data from the sample and summarize these data by finding sample statistics of interest.
In this case, it would be asking the households how many hours they spend streaming internet video and turning this data into an average.
C. Discover if there is interest in this study. In this case, we ask other researchers if they would be interested in reviewing the collected data.
D. Collect raw data from the sample and summarize these data by finding sample statistics of interest. In this case, it would be asking the households if they have internet access.
Part 4
A. Use the sample statistics to infer the population parameters. In this case, based on the data gathered, the group estimates the average time per day that a household spends streaming internet video.
B. Use the sample statistics to infer the population parameters. In this case, the group estimates how many households have internet access.
C. Act on the collected data. In this case, we see that some households spend 0 hours streaming internet video and so we show these households how to stream video online.
D. Estimate any missing data. In this case, we would estimate the data for any households that we could not collect data from.
Part 5
A. Draw conclusions to determine what you learned and whether you achieved your goal. In this case, we discovered the average time per day that a household spends streaming internet video.
B. Draw conclusions to determine what you learned and whether you achieved your goal. In this case, we learned that some households use the internet to check email.
C. Act on the results. In this case, we determine that we need to have more households stream internet video.
D. Repeat your study. In this case, we find a new sample to discover their internet usage.
3. A recent telephone poll of 995 randomly selected adults revealed that 6 in 10 adults believe there has been progress in finding a cure for cancer in the last 26 years. Describe the population, sample, population parameters, and sample statistics.
Part 1
A. All adults in the country
B. The 995 adults selected
C.
6 out of 10 adults in the country
D. 6 out of 10 of the 995 adults selected
Part 2
A. 6 out of 10 of the 995 adults selected
B. The 995 adults selected
C. All adults in the country
D. 6 out of 10 adults in the country
Part 3
A. The total number of all adults in the country
B. The percentage of all adults in the country who believe there has been progress
C. The number of adults selected
D. The percentage of the 995 adults selected who believe there has been progress
Part 4
A. The proportion of all adults in the country who believe there has been progress
B. 6 out of 10
C. The total number of all adults in the country
D. 995
4. An IRS (Internal Revenue Service) auditor randomly selects for audits 10 taxpayers in each of the filing status categories: single, head of household, married filing jointly, and married filing separately.
A. Simple random sampling
B. Stratified sampling
C. Convenience sampling
D. Systematic sampling
5. Blood alcohol concentrations of drivers involved in fatal crashes and then given jail sentences are shown below. Find the mean, median, and mode of the listed numbers.
0.28 0.18 0.18 0.16 0.13 0.23 0.29 0.23 0.14 0.16 0.11 0.16
The mean is 0.188.
The median is 0.17.
What is (are) the mode(s)? Select the correct choice below and, if necessary, fill in the answer box within your choice.
A. The mode(s) is (are) 0.16
B. There is no mode.
6. 27 4 4 11 12 34 3 15 6 4 3 4 4 14 6
a. Find the mean and median margin of victory.
b. Identify the outlier in the data set. If the outlier is eliminated, what are the new mean and median?
a. Find the mean and median of the margins of victory.
The mean is 10.1.
The median is 6.
b. Which, if any, of the margins of victory would be considered the outlier? Select the correct choice below and, if necessary, fill in the answer box to complete your choice.
A. The outlier is 34
B. None of the margins would be considered an outlier.
c. Find the mean with the outlier excluded.
Select the correct choice below and, if necessary, fill in the answer box to complete your choice.
A. The mean without the outlier is 8.4
B. None of the margins would be considered an outlier.
d. Find the median with the outlier excluded. Select the correct choice below and, if necessary, fill in the answer box to complete your choice.
A. The median without the outlier is 5
B. None of the margins would be considered an outlier.
7. Per capita earnings in New York City.
A. The median because it is easier to calculate.
B. The mode because there can be more than one.
C. The median because it is unaffected by outliers.
D. The mean because it is the most commonly understood measure of center.
E. The mode because it will find the most common per capita earning(s) in New York City.
F. The mean because it will find the average earnings for everyone in New York City.
8. Daily snowfall in Omaha in January.
A. The mean because it takes into account all days, regardless of the amount of snow.
B. The median because it is easier to calculate.
C. The mean because it is the most commonly understood measure of center.
D. The mode because there can be more than one.
E. The mode because it will find the most common snowfall amount(s).
F. The median because it is unaffected by outliers, such as extreme storms.
9. Exam results for 100 students are given below. For the given exam results, briefly describe the shape and variation of the distribution.
median = 75, mean = 75, low score = 60, high score = 90
Describe the shape of the distribution.
A. The distribution is right-skewed because the median is greater than the mean.
B. The distribution is left-skewed because the median is less than the mean.
C. The distribution is left-skewed because the median is greater than the mean.
D. The distribution is symmetric because the median is equal to the mean.
E. The distribution is right-skewed because the median is less than the mean.
The difference between the greatest and least scores is 30, which is small compared to the number of students.
10. Define and distinguish among positive correlation, negative correlation, and no correlation. How do we determine the strength of a correlation?
Define positive correlation. Choose the correct answer below.
A. Positive correlation means that both variables tend to increase (or decrease) together.
B. Positive correlation means that there is no apparent relationship between the two variables.
C. Positive correlation means that two variables tend to change in opposite directions, with one increasing while the other decreases.
D. Positive correlation means that there is a good relationship between the two variables.
Define negative correlation. Choose the correct answer below.
A. Negative correlation means that both variables tend to increase (or decrease) together.
B. Negative correlation means that two variables tend to change in opposite directions, with one increasing while the other decreases.
C. Negative correlation means that there is a bad relationship between the two variables.
D. Negative correlation means that there is no apparent relationship between the two variables.
Define no correlation. Choose the correct answer below.
A. No correlation means that the two variables are always zero.
B. No correlation means that both variables tend to increase (or decrease) together.
C. No correlation means that two variables tend to change in opposite directions, with one increasing while the other decreases.
D. No correlation means that there is no apparent relationship between the two variables.
How do we determine the strength of a correlation?
A. The more closely two variables follow the general trend, the stronger the correlation (which may be positive or negative).
B. Negative correlation is stronger than no correlation. Positive correlation is stronger than negative correlation.
C. No correlation is stronger than negative correlation.
Positive correlation is stronger than no correlation.
D. The more closely two variables follow the general trend, the weaker the correlation (which may be positive or negative).
11. The histogram in the figure shows times between eruptions of a geyser. Draw a smooth curve that captures its important features. Then classify the distribution according to its number of peaks, symmetry or skewness, and variation.
a. Draw a smooth curve that captures the important features. Choose the correct answer below.
b. The distribution has 2 peak(s), is not symmetric, and has moderate to large variation.
12. The histogram of a sample of the weights of 213 rugby players is shown to the right. Draw a smooth curve that captures its important features. Then classify the distribution according to its number of peaks, symmetry or skewness, and variation.
a. Draw a smooth curve that captures the important features. Choose the correct answer below.
b. The distribution has 1 peak(s), is symmetric, and has fairly low variation.
HW 5
1. Consider two grocery stores at which the mean time in line is the same but the variation is different. At which store would you expect the customers to have more complaints about the waiting time?
A. The customers would have more complaints about the waiting time at the store that has more variation because some customers would have longer waits and might think they are being treated unequally.
B. The customers would have more complaints about the waiting time at the store that has less variation because some customers would have longer waits and might think they are being treated unequally.
C. The customers would have more complaints about the waiting time at the store that has more variation because some customers are easily annoyed.
D. The customers would have more complaints about the waiting time at the store that has less variation because some customers are easily annoyed.
2. What are the quartiles of a distribution? How do we find them?
a.
What are the quartiles of a distribution?
A. The quartiles consist of the mean, the median, and the standard deviation of the data distribution.
B. When the data distribution is divided equally into four sets, each set of values is called a quartile.
C. The quartiles consist of the lowest value, the median, and the highest value of the data distribution.
D. The quartiles are values that divide the data distribution into quarters.
b. How do we find them?
A. The lower quartile is the median of the data values in the lower half of a data set. The middle quartile is the overall median. The upper quartile is the median of the data values in the upper half of a data set.
B. The lower quartile is the mean of the data values in the lower half of a data set. The middle quartile is the overall mean. The upper quartile is the mean of the data values in the upper half of a data set.
C. The lower quartile is the median of the data values in the first quarter of a data set. The middle quartile is the overall median. The upper quartile is the median of the data values in the fourth quarter of a data set.
D. The lower quartile is the mean of the data values in the first quarter of a data set. The middle quartile is the overall mean. The upper quartile is the mean of the data values in the fourth quarter of a data set.
3. After recording the pizza delivery times for two different pizza shops, you conclude that one pizza shop has a mean delivery time of 43 minutes with a standard deviation of 4 minutes. The other shop has a mean delivery time of 41 minutes with a standard deviation of 21 minutes. Interpret these figures. If you liked the pizzas from both shops equally well, which one would you order from? Why?
a. Interpret these figures. Choose the correct answer below.
A. Both the means and the variations are nearly equal.
B. The means are nearly equal, but the variation is significantly greater for the second shop than for the first.
C.
The variations are nearly equal, but the mean is greater for the first shop than for the second.
D. The means are nearly equal, but the variation is significantly lower for the second shop than for the first.
b. If you liked the pizzas from both shops equally well, which one would you order from? Why?
A. Choose the second shop. The delivery time is more reliable because it has a larger standard deviation.
B. Choose the first shop. The delivery time is more reliable because it has a lower standard deviation.
C. Choose the first shop. The delivery time is more reliable because it has a lower mean.
D. Choose the second shop. The delivery time is more reliable because it has a larger mean.
4. A small animal veterinarian reviews her records for the day and notes that she has seen eight dogs and eight cats with the following weights (in pounds).
Dogs: 14, 25, 39, 45, 53, 64, 75, 102
Cats: 4, 5, 9, 12, 16, 19, 21, 21
a. Make correct conjectures below about which set has the larger mean, median, and standard deviation. Choose the correct answer below.
A. The mean, median, and standard deviation are all higher for dogs because most of the weights are larger, so the average value, middle value, and spread must be larger.
B. The mean and median are higher for dogs because most of the weights are larger, so the average value and middle value must be larger. The standard deviation is higher for cats because there is more variation in the weights.
C. The mean and median are higher for cats because there is less variation in the weights. The standard deviation is higher for dogs because there is more variation in the weights.
D. The mean, median, and standard deviation are all higher for cats because there is less variation in the weights, so the average value, middle value, and spread must be larger.
b. Compute the mean and standard deviation of each set.
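One way to carry out the computation in part b is sketched here in Python. It assumes the sample standard deviation (dividing by n - 1), which is an assumption about the convention this homework uses; the names dogs, cats, and summarize are illustrative labels, not part of the original problem.

```python
from statistics import mean, stdev

# Weights in pounds, copied from the problem statement
dogs = [14, 25, 39, 45, 53, 64, 75, 102]
cats = [4, 5, 9, 12, 16, 19, 21, 21]


def summarize(weights):
    """Return (mean, sample standard deviation), each rounded to one decimal."""
    # stdev() divides by n - 1 (sample convention); pstdev() would divide by n
    return round(mean(weights), 1), round(stdev(weights), 1)


dog_summary = summarize(dogs)
cat_summary = summarize(cats)
print("dogs (mean, sd):", dog_summary)
print("cats (mean, sd):", cat_summary)
```

Switching stdev to pstdev gives the population standard deviation instead, which is slightly smaller for both sets.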
The mean for the dogs is 52.1.
The mean for the cats is 13.4.
The standard deviation for the dogs is 28.2.
The standard deviation for the cats is 6.9.
5. What is a normal distribution? Briefly describe the conditions that make a normal distribution.
What is a normal distribution and what conditions make a distribution normal? Choose the correct answer below.
A. A normal distribution is an asymmetric distribution with a single peak. Its peak corresponds to the mode of the distribution. Its variation is characterized by the standard deviation of the distribution.
B. A normal distribution is a symmetric, bell-shaped distribution with a single peak. Its peak corresponds to the mean, median, and mode of the distribution. Its variation is characterized by the standard deviation of the distribution.
C. A normal distribution is a symmetric distribution with two peaks. Its peaks correspond to the maximum and the minimum of the distribution. Its variation is characterized by the standard deviation of the distribution.
D. A normal distribution is a symmetric, bell-shaped distribution with a single peak. Its peak corresponds to the mean, median, and mode of the distribution. Its variation is characterized by the range of the distribution.
6. What is the 68-95-99.7 rule for normal distributions? Explain how it can be used to answer questions about frequencies of data values in a normal distribution.
A. The rule states that about 1, 2, and 3 data points lie in 68%, 95%, and 99.7% of the data points, respectively, in a normal distribution.
B. The rule states that about 68%, 95%, and 99.7% of the data points in a normal distribution lie within 1, 2, and 3 standard deviations of the mean, respectively.
C. The rule states that about 68%, 95%, and 99.7% of the data points in a normal distribution lie within 0, 1, and 2 standard deviations of the mean, respectively.
D. The rule states that about 0, 1, and 2 data points lie in 68%, 95%, and 99.7% of the data points, respectively, in a normal distribution.
7.
What is a standard score? How do you find the standard score for a particular data value?
a. Choose the correct definition of a standard score below.
A. A standard score is a data value equal to the mean.
B. A standard score is a data value that lies within one standard deviation of the mean.
C. A standard score is the distance between a data value and the nearest outlier.
D. A standard score is the number of standard deviations a data value lies above or below the mean.
b. Choose the correct formula for computing a standard score below.
A. The standard score for a particular data value is given by z = standard deviation / data value
B. The standard score for a particular data value is given by z = standard deviation / (mean - data value)
C. The standard score for a particular data value is given by z = data value / standard deviation
D. The standard score for a particular data value is given by z = (data value - mean) / standard deviation
8. Consider the following set of three distributions, all of which are drawn to the same scale. Identify the two distributions that are normal. Of the two normal distributions, which one has the larger variation?
The two normal distributions are (a) and (c), where (c) has the larger standard deviation.
9. State whether you would expect the following data set to be normally distributed or not.
Scores on an easy statistics exam
A. Normally distributed
B. Not normally distributed
10. Assume that a set of test scores is normally distributed with a mean of 100 and a standard deviation of 25. Use the 68-95-99.7 rule to find the following quantities.
a. The percentage of scores less than 100 is 50%.
b. The percentage of scores greater than 125 is 16%.
11. The scores on a psychology exam were normally distributed with a mean of 61 and a standard deviation of 9. What is the standard score for an exam score of 51?
The standard score is -1.11.
12. The scores on a psychology exam were normally distributed with a mean of 55 and a standard deviation of 6.
What is the standard score for an exam score of 44?
The standard score is -1.83.
HW 6
2. 15!/12!
The solution is 2730.
4. A city council with eleven members must elect a five-person executive committee consisting of a mayor, deputy mayor, secretary, administrator, and treasurer. How many executive committees are possible?
A. Arrangements with repetition should be used because no item may be selected more than once and the order matters.
B. Permutations should be used because we make selections from a group of choices.
C. Combinations should be used because no item may be selected more than once and the order does not matter.
D. Arrangements with repetition should be used because we make selections from a group of choices.
E. Permutations should be used because no item may be selected more than once and the order matters.
Calculate how many different committees are possible. 55,440
5. Answer the following question using arrangements with repetition, permutations, or combinations. The President must assign ambassadors to five different foreign embassies. From a pool of ten candidates, how many different diplomatic teams can she form?
A. Permutations should be used because we make selections from a group of choices.
B. Arrangements with repetition should be used because no item may be selected more than once and the order matters.
C. Arrangements with repetition should be used because we make selections from a group of choices.
D. Permutations should be used because no item may be selected more than once and the order matters.
E. Combinations should be used because no item may be selected more than once and the order does not matter.
Calculate how many different diplomatic teams are possible. 30,240
6. Answer the following question using arrangements with repetition, permutations, or combinations. How many anagrams (rearrangements) of the letters BODKIN can you make?
All the letters can be arranged in 720 ways.
7.
A dog shelter is giving away 11 different dogs, but you have room for only 3 of them. How many different dog families could you have?
A. Arrangements with repetition should be used because no item may be selected more than once and the order matters.
B. Permutations should be used because no item may be selected more than once and the order matters.
C. Permutations should be used because we make selections from a group of choices.
D. Arrangements with repetition should be used because we make selections from a group of choices.
E. Combinations should be used because no item may be selected more than once and the order does not matter.
Calculate how many different dog families are possible. 165
8. A certain lottery has 39 numbers. In how many different ways can 5 of the numbers be selected? (Assume that order of selection is not important.)
There are 575,757 different ways the numbers can be selected.
10. What does it mean when we write P(A)? What is the possible range of values for P(A), and why?
P(A) means which of the following?
A. P(A) means the number of times that event A will occur.
B. P(A) means the probability that event A will not occur.
C. P(A) means the probability that event A will occur.
Which of the following is true for the possible range of values for P(A)?
A. The range of possible values for P(A) is from 0 to 1 (inclusive), with 0 meaning there is no chance that event A will occur and 1 meaning it is certain that event A will occur.
B. The range of possible values for P(A) is the number of possible events where A could occur.
C. The range of possible values for P(A) is 0 and 1, with 0 meaning that event A did not occur and 1 meaning that event A did occur.
D. The range of possible values for P(A) can be any positive real number, with larger numbers being more likely to occur.
11. Decide whether the following statement makes sense (or is clearly true) or does not make sense (or is clearly false). Explain your reasoning.
The probability that Jonas will win the race is 0.6 and the probability that he will not win is 0.5.
A. The statement does not make sense because the probability of Jonas winning the race cannot be greater than the probability of him not winning the race.
B. The statement makes sense because it is true that the probability of Jonas not winning the race is 0.5.
C. The statement makes sense because the probability of Jonas winning the race will always be between 0 and 1.
D. The statement does not make sense because the sum of the probabilities of Jonas winning and not winning the race must equal 1.
12. How many different choices of car does a person have if a particular model comes in 9 colors and 4 styles (sedan, station wagon, full-size SUV, or minivan)?
There are 36 different choices of car.
13. Pizza House offers 2 different salads, 5 different kinds of pizza, and 3 different desserts. How many different three-course meals can be ordered?
How many different meals can be ordered? 30
15. Use the theoretical method to determine the probability of the following outcome and event. State any assumptions made.
Tossing two coins and getting either one head or two heads
A. Assuming that each coin is fair and is equally likely to land heads or tails, the probability is three fourths.
B. Assuming that each coin is fair and is equally likely to land heads or tails, the probability is four thirds.
C. Assuming that each coin is fair and is equally likely to land heads or tails, the probability is one half.
D. Assuming that each coin is fair and is equally likely to land heads or tails, the probability is 2 times 2.
16. Use the theoretical method to determine the probability of the given outcome or event. Assume that the die is fair.
Rolling a single six-sided die and getting a low number (1, 2, or 3)
The probability of rolling a single six-sided die and getting a low number (1, 2, or 3) is 1/2.
19.
Use the theoretical method to determine the probability of the outcome or event given below.
The next president of the United States was born on Thursday
The probability of the given event is 1/7.
20. Determine the probability of the given opposite event. What is the probability of rolling a fair die and not getting an outcome less than 3?
The probability of rolling a fair die and not getting an outcome less than 3 is 2/3.
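The counting and probability answers in HW 6 can be checked with a short Python sketch like the one below. It is only a sketch: the dictionary keys and the helper probability are illustrative names, and the birthday calculation assumes all seven days of the week are equally likely, as the theoretical method does.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial, perm

# Counting results quoted in HW 6
counts = {
    "15!/12!": factorial(15) // factorial(12),
    "committees (11 members, 5 distinct offices)": perm(11, 5),
    "ambassador assignments (10 candidates, 5 embassies)": perm(10, 5),
    "anagrams of BODKIN (6 distinct letters)": factorial(6),
    "dog families (3 of 11 dogs, order ignored)": comb(11, 3),
    "lottery picks (5 of 39 numbers, order ignored)": comb(39, 5),
    "car choices (9 colors x 4 styles)": 9 * 4,
    "three-course meals (2 x 5 x 3)": 2 * 5 * 3,
}


def probability(outcomes, event):
    """Theoretical probability: favorable outcomes / equally likely outcomes."""
    outcomes = list(outcomes)
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


die = range(1, 7)
# Two fair coins: at least one head (one head or two heads)
p_heads = probability(product("HT", repeat=2), lambda toss: "H" in toss)
# One fair die: low number (1, 2, or 3)
p_low = probability(die, lambda n: n <= 3)
# Opposite event: 1 minus the probability of an outcome less than 3
p_not_less_than_3 = 1 - probability(die, lambda n: n < 3)
# Assumption: each of the seven weekdays is equally likely as a birthday
p_thursday = Fraction(1, 7)

for label, value in counts.items():
    print(f"{label}: {value:,}")
print(p_heads, p_low, p_not_less_than_3, p_thursday)
```

Permutations (perm) are used where each selected person fills a distinct role, and combinations (comb) where only the set of selected items matters.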