Chapter 2: Numerical Descriptors Learning Objectives PDF
Document Details
Uploaded by .keeks.
Marian University
Summary
This document details the learning objectives for Chapter 2, which focuses on numerical descriptors in statistics. It covers measures of center (mean and median), measures of spread (quartiles and standard deviation), and how these concepts feed into five-number summaries and boxplots. It also explains how to deal with outliers and how to choose appropriate summary statistics.
Full Transcript
Chapter 2: Numerical Descriptors Learning Objectives

LO 1. Measures of Center: Mean and Median
1. Mean: calculated by adding all values, then dividing the sum by the number of individuals
2. Median: midpoint of a distribution; the number such that half of the observations are smaller and half are larger
   a. Sort the observations from smallest to largest; the median is at position (n+1)/2 in the sorted list
   b. If n is odd, the median is the value of the center observation
   c. If n is even, the median is the mean of the two center observations
3. The median is resistant to skew and outliers, while the mean is not
   a. In a left-skewed distribution, the median is larger than the mean
   b. In a right-skewed distribution, the mean is larger than the median

LO 2. Measures of Spread: Quartiles and Standard Deviation
1. First Quartile: median of the values below the median in the sorted data set
2. Third Quartile: median of the values above the median in the sorted data set
3. Standard Deviation: describes variation around the mean
   a. Calculate the variance (s²) first, then take its square root to get the standard deviation (s)
   b. Features:
      i. Measures spread about the mean, and should only be used when the mean is the measure of center
      ii. Always 0 or greater; when it’s 0, the values in the sample are identical
      iii. Same units of measurement as the original observations
      iv. The variance has the squared units of the original observations and is harder to interpret
      v. Not resistant; outliers have an even larger effect on the standard deviation than they do on the mean

LO 3. The Five-Number Summary and Boxplots
1. Five-Number Summary: minimum, first quartile, median, third quartile, and maximum
2. Boxplot: graphical view of the five-number summary

LO 4. IQR and Outliers
1. Interquartile Range (IQR): distance between the first and third quartiles (the length of the box in a boxplot)
   a. Because the quartiles are medians themselves (of each half of the data set), the interquartile range is a resistant statistic
   b. The IQR can equal 0 if the first and third quartiles are equal
2. Outlier: individual value that falls outside the overall pattern
   a. Suspected Low Outlier: any value less than Q1 - 1.5(IQR)
   b. Suspected High Outlier: any value greater than Q3 + 1.5(IQR)

LO 5. Dealing with Outliers
1. How to deal with outliers depends in part on what kind of outliers they are
   a. Human error in recording information
   b. Human error in experimentation or data collection
   c. Unexplainable but legitimate wild observations
      i. Are you interested in all individuals or in typical individuals?
2. Don’t disregard outliers to make data look better, and don’t act as if they don’t exist

LO 6. Choosing Among Summary Statistics
1. Because the mean is not resistant to outliers or skew, use it to describe distributions that are symmetrical and don’t have outliers
   a. Plot the mean and use the standard deviation for error bars
2. Otherwise, use the median and the five-number summary, which can be plotted as a boxplot

LO 7. Organizing a Statistical Problem
1. State: practical question in a real-world setting
2. Plan: specific statistical operations called for
3. Solve: make the graphs and carry out the calculations
4. Conclude: practical conclusion in a real-world setting
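
Worked sketch for LO 1 through LO 4: the calculations above can be tried out in Python using only the standard library. This is an illustration under stated assumptions, not the course's official method. The data set `sample` is made up for this example, and the helper names `five_number_summary` and `suspected_outliers` are hypothetical. Quartiles are taken as the median of the observations positioned below and above the overall median (for odd n, the median itself is left out of both halves). The standard deviation is assumed to be the sample version with an n - 1 denominator, which is what `statistics.stdev` computes.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Assumes at least two observations so that both halves are non-empty.
    """
    xs = sorted(values)
    n = len(xs)
    lower = xs[: n // 2]        # observations positioned below the median
    upper = xs[(n + 1) // 2:]   # observations positioned above the median
    return (
        xs[0],
        statistics.median(lower),
        statistics.median(xs),
        statistics.median(upper),
        xs[-1],
    )


def suspected_outliers(values):
    """Flag values beyond the 1.5 x IQR fences described in LO 4."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]


# Made-up, right-skewed sample with one unusually large value.
sample = [2, 4, 4, 5, 7, 9, 30]

summary = five_number_summary(sample)   # (2, 4, 5, 9, 30)
sample_mean = statistics.mean(sample)   # 61 / 7, larger than the median of 5
sample_sd = statistics.stdev(sample)    # sample standard deviation (n - 1)
flagged = suspected_outliers(sample)    # [30]

print("five-number summary:", summary)
print("mean:", sample_mean, "standard deviation:", sample_sd)
print("suspected outliers:", flagged)
```

In this sample the single large value pulls the mean above the median, which matches the right-skew rule in LO 1, and the value 30 lies above Q3 + 1.5(IQR) = 9 + 7.5 = 16.5, so it is flagged as a suspected high outlier. Following LO 6, the median and five-number summary would be the more appropriate description here.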