PSY 230 Midterm Review Sheet PDF
Summary
This document is a review sheet for a midterm exam in a psychology course (PSY 230). It covers topics such as data types, graphical representations, and different probability concepts. The document includes various questions and examples.
Full Transcript
PSY 230 Midterm Review Sheet

1. If we collect information on people's favorite restaurant, the restaurants selected will be of what data type and level of measurement? Which graphical summaries would be best to portray this type of data?
2. If we collect information on the ages of patients seen on a Friday night at the ER, the patients' ages will constitute which data type and level of measurement? Which graphical summaries would be best to portray this type of data?
3. We present a satisfaction survey where people are asked a series of questions and the responses are given on a scale from 1 to 5, with 1 being the least satisfied and 5 being the most satisfied. The data collected (survey score) will be of what data type and level of measurement? Which graphical summaries would be best to portray this type of data?
4. A researcher is looking to assess the temperature patterns during the winter in Colorado. The researcher measures the temperature in a specific location every day during the winter at exactly 2:00 PM. The collection of these temperatures constitutes which data type and level of measurement? Which graphical summaries would be best to portray this type of data?
5. What is the main difference between a parameter and a statistic?
6. Do the following symbols represent a statistic or a parameter? $\overline{x}, \mu, \sigma, s$
7. Describe the difference between a stratified random sample, cluster sample, sample of convenience, systematic sample, and random sample.
8. For the following histograms, determine whether each histogram is symmetric, skewed left, or skewed right.

   [Figure: three histograms, labeled (a), (b), and (c)]
9. If a data set is considered to be positively skewed (skewed right), what is the relationship between the mean and median?
10. If a data set is considered to be negatively skewed (skewed left), what is the relationship between the mean and median?
11. If a data set is considered to be approximately symmetric, what is the relationship between the mean and median?
12. A frequency distribution is created and the lowest class falls between the values 30–38. What is the midpoint of this class?
13. A frequency distribution is created and the highest class falls between the values 80–91. What is the midpoint of this class?
14. For the following frequency histogram, answer the following questions:
    a. What proportion of data falls at or above 28?
    b. What proportion of data falls below 21?

    [Figure: frequency histogram]
15. For the following frequency histogram, answer the following questions:
    a. What proportion of data falls at or above 8?
    b. What proportion of data falls below 12?

    [Figure: frequency histogram]
16. For the following set of numbers, find the mean, median, and mode: 1, 3, 6, 3, 8, 9, 2, 5, 8, 10
17. For the following set of numbers, find the mean, median, and mode: 18, 27, 16, 25, 35, 42
18. What are the key differences between the mean, median, and mode?
19. If we have the following data set: 1, 3, 6, 9, 15, which measures of center would be impacted by changing the value of 15 to 35?
20. Given the two sets of data, which will have the larger standard deviation? Set 1: 1, 1, 3, 3, 5, 5 or Set 2: 3, 5, 9, 16, 20
21. Given the two sets of data, which will have the smaller standard deviation? Set 1: 15, 16, 17, 19, 21 or Set 2: 1, 5, 13, 16, 25
22. If we roll a six-sided die, what is the probability that we roll the following:
    a. The number 4
    b. An even number
    c. A number less than 5
    d. A number greater than 3
23. Suppose we have two events that are independent of one another. What does this tell us about how we compute P(A and B)?
24. If events A and B are independent of one another, and P(A) = 0.2 and P(B) = 0.3, what is P(A and B)?
25. If events A and B are independent of one another, and P(A) = 0.5 and P(B) = 0.8, what is P(A|B)?
26. 25% of people in the US have type B+ blood. If we randomly sample 2 unrelated people in the US, answer the following:
    a. What is the probability a single individual has type B+ blood?
    b. What is the probability that a single individual does not have type B+ blood?
    c. What is the probability both people we survey have type B+ blood?
    d. What is the probability neither of the people has type B+ blood?
27. IRS data shows that 34.4% of households in the US bring in \$100,000+. If we randomly sample 3 households, answer the following:
    a. What is the probability an individual household brings in \$100,000+?
    b. What is the probability that an individual household does not bring in \$100,000+?
    c. What is the probability all 3 households bring in \$100,000+?
    d. What is the probability none of the households bring in \$100,000+?
28. Find the area under the Z-curve to the left of Z = -1.56.
29. Find the area under the Z-curve to the left of Z = 2.01.
30. Find the area under the Z-curve to the right of Z = -0.84.
31. Find the area under the Z-curve to the right of Z = 1.83.
32. Find the area under the Z-curve between Z = -1.56 and Z = -0.84.
33. Gummy bears are packaged using a process with a mean of 16 ounces and a standard deviation of 0.2 ounces. Find the Z-score that represents each of the following package weights and determine which is the most extreme value:
    a. 16.5 ounces
    b. 15.8 ounces
    c. 16.1 ounces
    d. 15.4 ounces
34. What does a Z-score represent?
35. What is the empirical rule?
36. Mandy is headed to her doctor appointment and does not want to be late. The drive time from her house to the doctor's office follows a Normal distribution with an average of 28 minutes and a standard deviation of 7 minutes. She decides to leave 40 minutes early. What is the probability she is late for this appointment (takes longer than 40 minutes to arrive)?
37. A new questionnaire is developed to determine the happiness of people. The score follows a Normal distribution with a mean of 50 and a standard deviation of 15. Based on the grading scale, a person is deemed truly happy if they score 70 and above, and truly unhappy if they score 25 and below. Answer the following:
    a. Based on these claims of the new questionnaire, what proportion of the population is truly happy? (score greater than or equal to 70)
    b. Based on these claims of the new questionnaire, what proportion of the population is truly unhappy? (score less than or equal to 25)
38. What is the Central Limit Theorem? What information does it give us?
39. The local soda shop fills their soda bottles with a process that follows a normal distribution with a population mean of 32 fl oz and a population standard deviation of 1.1 fl oz. If we sample 36 soda bottles, what is the Z-score pertaining to a sample mean of 32.6 fl oz?
40. Gummy bears are packaged using a process with a mean of 16 ounces and a standard deviation of 1.4 ounces. If we sample 64 packages and find the mean of those packages to be 15.5 ounces, what is the Z-score corresponding to this value?
41. How do sample size and standard error relate to one another?

Answers:

1. Qualitative Data, Nominal Data. Pie Charts, Bar Charts/Pareto Charts
2. Quantitative Data, Ratio Data. Histograms, Polygons, Boxplots, Dot plots (lesser extent)
3. Qualitative Data, Ordinal Data. Bar Charts/Pareto Charts, Pie Charts (lesser extent)
4. Quantitative Data, Interval Data. Histograms, Polygons, Boxplots, Dot plots (lesser extent)
5. A parameter is a numerical description (mean, median, standard deviation, etc.) of a population, and a statistic is a numerical description (mean, median, standard deviation, etc.) of a sample. We use sample statistics to help estimate population parameters.
6. $\overline{x}$ = sample mean (statistic); $\mu$ = population mean (parameter); $\sigma$ = population standard deviation (parameter); $s$ = sample standard deviation (statistic)
7. a. Stratified Random Sample: The population is divided into at least two distinct groups, or strata, then a random sample of a certain size is drawn from each stratum to ensure the sample represents the population well across the specified categories.
   b. Cluster Sample: If the population breaks into naturally occurring subgroups/clusters, with each subgroup/cluster being similar to the population as a whole, several clusters can be randomly selected to be a part of the sample, and every element within each selected cluster is included in the sample.
   c. Sample of Convenience: The sample collected is one that comes from readily available members of the population.
   d. Systematic Sample: The population is arranged in some natural sequential order, and every kth element is selected to be a part of the sample.
   e. Simple Random Sample: A subset of the population is selected in such a manner that every sample of size n has an equal chance of being selected.
8. a. Symmetric
   b. Skewed Right
   c. Skewed Left
9. The mean will be larger than the median
10. The mean will be less than the median
11. The mean and the median will be approximately equal to one another
12. Midpoint = 34
13. Midpoint = 85.5
14. a. 9/21
    b. 6/21
15. a. 6/23
    b. 21/23
16. Mean = 5.5, Median = 5.5, Mode = 3, 8
17. Mean = 27.167, Median = 26, Mode = None
18. a. The mean measures the center and incorporates every single data point in the set. This makes it susceptible to outliers.
    b. The median finds the center of the data set such that half of the data points fall below this value, and half of the data points fall above this value. The median does not incorporate every single value in the data set, which means it misses out on some subtle variation, but it is not susceptible to outliers.
    c. The mode measures which value shows up most frequently in the data set. There may be a single mode, multiple modes, or no mode.
19. The mean would be impacted as it incorporates every data point. The median and mode would remain unaffected by this change.
20. Set 2 will have the larger standard deviation. We can determine this upon inspection as the values are further spread apart on average than those in the first set.
21. Set 1 will have the smaller standard deviation. We can determine this upon inspection as the values are more clustered together and less spread apart than those in the second set.
22. a. 1/6
    b. 3/6 = 1/2
    c. 4/6 = 2/3
    d. 3/6 = 1/2
23. If two events are independent of one another, then P(A and B) = P(A)P(B)
24. P(A and B) = 0.06
25. P(A|B) = P(A) = 0.5
26. a. P(B+) = 0.25
    b. P(B+^c^) = 0.75
    c. P(B+ and B+) = 0.0625
    d. P(B+^c^ and B+^c^) = 0.5625
27. a. P(100k+) = 0.344
    b. P(100k+^c^) = 0.656
    c. P(100k+ and 100k+ and 100k+) = 0.0407
    d. P(100k+^c^ and 100k+^c^ and 100k+^c^) = 0.2823
28. 0.0594
29. 0.9778
30. 0.7995
31. 0.0336
32. 0.1411
33. a. Z = 2.5 -> Extreme
    b. Z = -1.0 -> Not Extreme
    c. Z = 0.5 -> Not Extreme
    d. Z = -3.0 -> Most Extreme
34. A Z-score tells us how many standard deviations an observed measurement falls from the population mean.
35. The empirical rule states that if the data is symmetric and bell shaped, then approximately 68% of the data will fall within 1 standard deviation of the mean, approximately 95% of the data will fall within 2 standard deviations of the mean, and approximately 99.7% of the data will fall within 3 standard deviations of the mean.
36. 0.0436
37. a. Truly Happy: 0.0918
    b. Truly Unhappy: 0.0475
38. The Central Limit Theorem tells us that when the sample size is sufficiently large (n ≥ 30 in most cases), the distribution of the sample mean ($\overline{x}$) will be approximately normal with a mean equal to the mean of the original distribution ($\mu$) and a standard deviation (also known as the standard error) of $\frac{\sigma}{\sqrt{n}}$.
39. Z = 3.273
40. Z = -2.857
41. Since standard error is computed as $\frac{\sigma}{\sqrt{n}}$, the larger the sample size, the smaller the standard error will be.
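
Several of the numeric answers above can be re-derived with a short script. The following is a minimal sketch using only Python's standard library; the helper names `summarize` and `z_score` are illustrative and not part of the review sheet. One assumption worth flagging: the sheet's normal-curve answers appear to come from a Z-table with Z rounded to two decimals, so the exact values computed here for answers 36 and 37 can differ in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist, mean, median, multimode, stdev


def summarize(values):
    """Return (mean, median, modes); modes is [] when no value repeats."""
    modes = sorted(multimode(values)) if len(set(values)) < len(values) else []
    return mean(values), median(values), modes


def z_score(x, mu, sigma, n=1):
    """Z-score of x. With n > 1, x is a sample mean and sigma/sqrt(n) is the standard error."""
    return (x - mu) / (sigma / sqrt(n))


# Answers 16-17: measures of center.
center_16 = summarize([1, 3, 6, 3, 8, 9, 2, 5, 8, 10])
center_17 = summarize([18, 27, 16, 25, 35, 42])

# Answer 19: replacing 15 with 35 shifts the mean but leaves the median alone.
before, after = [1, 3, 6, 9, 15], [1, 3, 6, 9, 35]
mean_shift = mean(after) - mean(before)
median_shift = median(after) - median(before)

# Answers 20-21: sample standard deviations as (Set 1, Set 2).
sd_20 = (stdev([1, 1, 3, 3, 5, 5]), stdev([3, 5, 9, 16, 20]))
sd_21 = (stdev([15, 16, 17, 19, 21]), stdev([1, 5, 13, 16, 25]))

# Answers 26-27: probabilities of independent events multiply.
both_b_pos = 0.25 ** 2
neither_b_pos = (1 - 0.25) ** 2
all_three_100k = 0.344 ** 3
none_100k = (1 - 0.344) ** 3

# Answers 28-32: areas under the standard normal (Z) curve.
std_normal = NormalDist()  # mean 0, standard deviation 1
area_28 = std_normal.cdf(-1.56)
area_29 = std_normal.cdf(2.01)
area_30 = 1 - std_normal.cdf(-0.84)
area_31 = 1 - std_normal.cdf(1.83)
area_32 = std_normal.cdf(-0.84) - std_normal.cdf(-1.56)

# Answer 33: Z-scores for individual package weights; the largest |Z| is most extreme.
gummy_z = {w: z_score(w, 16, 0.2) for w in [16.5, 15.8, 16.1, 15.4]}
most_extreme = max(gummy_z, key=lambda w: abs(gummy_z[w]))

# Answers 36-37: exact normal probabilities (no rounding of Z to two decimals).
p_late = 1 - NormalDist(28, 7).cdf(40)
happiness = NormalDist(50, 15)
p_happy = 1 - happiness.cdf(70)
p_unhappy = happiness.cdf(25)

# Answers 39-40: Z-scores of sample means, using the standard error sigma/sqrt(n).
z_soda = z_score(32.6, 32, 1.1, n=36)
z_gummy_mean = z_score(15.5, 16, 1.4, n=64)

print(center_16, center_17)
print(round(area_28, 4), round(area_32, 4))
print(round(p_late, 4), round(p_happy, 4), round(p_unhappy, 4))
print(round(z_soda, 3), round(z_gummy_mean, 3))
```

Passing `n` to `z_score` is what separates answers 39 and 40 from answer 33: for a single observation the denominator is the standard deviation itself, while for a sample mean it is the standard error, which shrinks as the sample size grows.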