Measures of Central Tendency PDF
Document Details
Uploaded by TransparentMusicalSaw1414
Hamilton College
Summary
This document provides an overview of measures of central tendency, including mode, median, and mean. It also discusses how the level of measurement of the data and the skewness of the distribution affect the choice of measure.
Full Transcript
Measures of Central Tendency

Outline for Today
I. Measures of central tendency and their limitations
- Mode
- Median: on your own (OYO) (…mostly)
- Mean
II. Factors that affect which measure of central tendency we use
- Level of measurement of the variable
- Skewness of the distribution

Some rounding rules:
- In this class, you will generally be asked to round most of your final answers to 2 decimal places. But you need to work all of your intermediate calculations to 3 decimal places. POINTS!
- How to round numbers to two decimal places:
  - 10.4814 → 10.48 (round down)
  - 10.4852 → 10.49 (round up)
  - 10.485000 → 10.48 (round to even number)
  - 10.475000 → 10.48 (round to even number)
- SO, if the digit to the right of the decimal place you plan to round to is:
  - Less than 5: round DOWN
  - Greater than 5 (or 5 with non-zero digits following it): round UP
  - EXACTLY 5: round so that the last digit of the final answer is always even (so, approximately half the time you round up and half the time you round down)

Parameters vs. Statistics
PP and SS:
- Describe a population using a parameter. Mean = $\mu$
- Describe a sample using a statistic. Mean = $M = \bar{x}$

Self-Compassion Data
"When I'm going through a very hard time, I give myself the caring and tenderness I need."

| X (rating) | f |
|---|---|
| 5 (almost always) | 4 |
| 4 (often) | 13 |
| 3 (sometimes) | 18 |
| 2 (rarely) | 13 |
| 1 (almost never) | 6 |

So far: Frequency tables and histograms provide a sense of the shape and distribution of a set of numbers.
Next: How we represent the center or average of this distribution with just a single number.

The Mode
- What is it? The most frequently occurring score.
- The mode of this distribution (the self-compassion ratings in the table above) is: Mode = 3
- The modal response was "sometimes" (f = 18)
- Careful!
The mode is the X value, NOT the frequency! Mode = 3 (not 18)

The Mode
- How do you spot the mode in a histogram?
[Figure: histogram of the ratings; y-axis: Frequency]

Limitations of the Mode
- At times, the mode is not the most informative measure of central tendency:
  - When there are multiple modes
  - When there is no mode (e.g., rectangular distribution)
- A major limitation of the mode: it doesn't tell you about all of the observations.

For example, imagine two courses in which an A is the modal grade.
- Course 1: 100 As, 2 Cs
- Course 2: 100 As, 90 Cs
Which course would you rather be enrolled in? The mode doesn't give you any information about the Cs (just the As).

The Median
- The point in a set of scores (a distribution) that divides the distribution in half.
- Half the scores are above the median, half are below the median.

A data set with 25 participants:
7, 7, 7, 8, 8, 8, 8, 8, 8, 9, 9, 9, 10, 10, 10, 11, 11, 14, 14, 14, 15, 16, 17, 20, 25

OYO work: On the next FIVE slides, I provide instructions about how to calculate the median. You are all very, very bright, and class time is precious. Therefore, my expectation is that you will learn the material on the next few slides, even though I am not taking class time to teach you these steps. All this information is also in your Privitera textbook. Calculating the median (and showing the formulas) will be on Exam 1.

The Median (On your own, OYO)
- Let's look at two sets of scores and compute the median for each.
  - Data set 1: 5, 2, 1, 4, 2, 3, 1, 4, 1, 5, 5
  - Data set 2: 3, 1, 4, 2, 4, 2, 1, 2, 3, 4, 1, 3
- Step 1: Put each set of scores in order from smallest to largest.

Computing the median (step 2) (On your own, OYO)
Position of the median: $\frac{N+1}{2}$

For a data set with an odd # of scores (Data set 1; N = 11):
1, 1, 1, 2, 2, 3, 4, 4, 5, 5, 5
Position: $\frac{11+1}{2} = 6$, i.e., the 6th position. The 6th (middle) score is the median.
Median = 3

For a data set with an even # of scores (Data set 2; N = 12):
1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4
Position: $\frac{12+1}{2} = 6.5$. The median is the average of the two scores that surround position 6.5 (i.e., the average of the 6th and 7th scores): Median = (2 + 3)/2 = 2.5

Computing the median from a frequency distribution (On your own, OYO)
Position of the median: $\frac{N+1}{2}$

| X (rating) | f |
|---|---|
| 5 (almost always) | 4 |
| 4 (often) | 13 |
| 3 (sometimes) | 18 |
| 2 (rarely) | 13 |
| 1 (almost never) | 6 |
| Total | 54 |

- Step 1: Calculate N by summing the frequencies (f): N = 54
- Step 2: Use the formula to determine the position (out of 54) that divides the data set in half: (54+1)/2 = 27.5
- Step 3: Determine the rating (X) that occupies the 27th and 28th positions (by counting from the bottom).

Computing the median from a frequency distribution (On your own, OYO)
(54+1)/2 = 27.5 (position of the median)
- Start counting from the lowest X value: the first 6 scores are 1s.
- 6 + 13 = 19, so after the 2s we're at the 19th score in the distribution.
- 19 + 18 = 37, so the 27th & 28th scores are in this portion of the distribution (the 3s).
The median = 3, because the 27th and 28th scores are both a 3.

Computing the median from a frequency distribution: a graphical view (On your own, OYO)
[Figure: histogram annotated by position: scores 1-6 are in the bar for X = 1, scores 7-19 are in the bar for X = 2, and scores 20-37 are in the bar for X = 3]
(54+1)/2 = 27.5 (position of the median). The median = 3, because the 27th and 28th scores are both a 3.
Note: On an exam, you have to SHOW YOUR WORK. To do so, you would write the formula (N+1)/2, then you would solve the formula, then find the median, and report it.

Limitations of the Median
An example to bring the concepts to life: You've been invited to two parties and have been told the median age for each party is 20 years old. Which party do you want to attend?
Ages of party-goers (in years):
- Party 1: 1, 4, 8, 10, 12, 13, 19, 20, 25, 32, 36, 40, 42, 60, 62
- Party 2: 17, 18, 18, 18, 18, 19, 19, 20, 21, 21, 21, 21, 21, 21, 21

Limitations of the Median
1. Isn't computed from all the data in the data set (distribution) (similar to the mode).
2. Only makes use of the ranked position of the measurements.
3. Not sensitive to many changes in the data set (e.g., if subjects are removed/added).

The Mean
- What is it? The arithmetic average:

$$\bar{x} = \frac{\sum x}{N}$$

Computing the Mean When Given a Frequency Distribution

$$\bar{x} = \frac{\sum f(x)}{N}$$

where N = sample size ($\sum f$). Raw data (compassion score):

| x (rating) | f | f(x) |
|---|---|---|
| 5 (almost always) | 4 | 20 |
| 4 (often) | 13 | 52 |
| 3 (sometimes) | 18 | 54 |
| 2 (rarely) | 13 | 26 |
| 1 (almost never) | 6 | 6 |
| Sum | $\sum f$ = 54 | $\sum f(x)$ = 158 |

To calculate the mean ($\bar{x}$):
1. Sum ($\sum$) the frequencies (f). Note: $\sum f$ should equal the sample size (# participants).
2. Multiply each rating (x) by its frequency (f) to get f(x).
3. Sum ($\sum$) the f(x) values.
4. Divide $\sum f(x)$ by the sample size (N).

$$\bar{x} = \frac{\sum f(x)}{N} = \frac{158}{54} = 2.93$$

The mean response = ~sometimes

Limitation of the Mean
- The mean is sensitive to extreme scores.
- Example: # of children in a sample of families
  - Sample 1: 0, 1, 2, 2, 3 (mean = 1.6)
  - Sample 2: 0, 1, 2, 2, 12 (mean = 3.4)
Note: neither the median nor the mode changes! (both = 2)

Outline for Today
II. Factors that affect which measure of central tendency to use
- Level of measurement of the variable
- Skewness of the distribution

Summary slide first: Levels of measurement determine which measures of central tendency are reported/used
- Nominal data? (e.g., sports team)
  - Use mode only
- Ordinal data?
  - Can use mode or median.
  - Median is best: it provides more information, so report it instead of the mode.
- Interval/ratio data?
  - Can use mode, median, or mean
  - Mean is best, but if the distribution is skewed, report mean AND median

Levels of measurement determine which measures of central tendency are reported/used
- Nominal data?
(e.g., sports team)
  - Use mode only
  - Numbers on a nominal scale do not have (meaningful) order, so no median should be calculated (recall, to generate the median, first you rank all the data!).
  - Numbers on a nominal scale describe something or someone, nothing more.
  - Example coding: 1 = soccer, 2 = track, 3 = field hockey, 4 = football, 5 = swim team

Levels of measurement determine which measures of central tendency are reported/used
- Ordinal data?
  - Use median (best) or mode
  - Example coding: 1 = extra small, 2 = small, 3 = medium, 4 = large, 5 = jumbo

Ordinal scales indicate that one value is greater than another (i.e., the numbers on the scale can be ordered/ranked). Therefore the median (which requires ordering the scores) is an appropriate measure of central tendency. But it cannot be determined whether or not the differences between ranks/values are equal. [see Privitera, Table 1.2 (pg 17)]

Question: Why not use the mean when numbers are on an ordinal scale?
Answer: To calculate a mean, you have to calculate the sum of the observations. But a sum is meaningful only when the intervals between successive categories are approximately equal. In this example we simply do not know this information.

Levels of measurement determine which measures of central tendency are reported/used
- Interval/ratio data?
  - Can use mode, median, or mean
  - Mean is best, but if the distribution is skewed, report mean AND median

QUICK QUIZ: Which measure of central tendency is most affected by extreme scores?
- The mean!
- So ALWAYS look at the shape of the distribution when deciding which measure(s) to use.

The skewness/outliers in a distribution affect which measure of central tendency to use
Example from evolutionary psychology (Psych 101): How do the sexual strategies of men and women differ?

Mean ideal number of sexual partners over the next 30 years (from Pederson et al., 2002):

| Men | Women |
|---|---|
| 7.69 | 2.78 |

(Statistically significant difference)

Take home message: The mean does not provide an accurate description of this data set.
- Mode for men & women: 1
- Median for men & women: 1
- Means: men = 7.69, women = 2.78
[Figure: frequency distributions of the ideal number of partners for men and women, from Pederson et al. (2002)]
SIDENOTE: There should be hash marks along this tail of the X-axis because the scale changes.

Normal Distribution
[Figure: three normal distributions; x-axis: puppy obedience scores (x), from 0 to 100. Modified from Privitera, Fig. 6.1]

Fine details about distribution curves:
- Tails do not touch the x-axis
- Curve is symmetric (left and right halves are mirror images)
- Peak is rounded (not flat, not pointy)
- Shapes: platykurtic, mesokurtic, leptokurtic

Looking ahead/thought exercise: Imagine these are three frequency distributions of data from three different puppy discipline schools. If you were a dog trainer and wanted all your little pups to behave similarly (all really good) at the conclusion of their six months of training, which normal distribution would you want to represent your data, and why?

Looking at distributions and central tendency measures
Amount of skew:
- If the mean and median are close together, there usually is not much skew.
- The greater the difference, the greater the skew.
Direction of skew:
- Mean > Median → positive skew
- Mean < Median → negative skew

Summary of Central Tendency
I. Measures of central tendency
- Mode
- Median
- Mean
II. What affects which measure of central tendency we use?
- Level of measurement of the variable
- Skewness of the distribution

Levels of measurement (again!)

| Scale | Meaningful order? | Equal & meaningful differences? | True 0? | Type of data |
|---|---|---|---|---|
| Nominal | No | | | Qualitative |
| Ordinal | Yes | No | | Quantitative |
| Interval | Yes | Yes | No | Quantitative |
| Ratio | Yes | Yes | Yes | Quantitative |

- Numbers on a nominal scale identify something (good for coding in SPSS).
- Numbers on an ordinal scale can be ranked (ordered by value: greater than, less than).
- Numbers on an interval scale can be ranked and are equidistant, but a value of 0 does not convey the absence of something (no true zero).
- Numbers on a ratio scale can be ranked, are equidistant, and zero means absence.
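
As a recap of the calculations worked through in this lecture, here is a minimal Python sketch that recomputes the mode, median, and mean of the self-compassion frequency table and applies the round-to-even rule from the rounding slide. It is an illustration only, not part of the course materials: the names `freq`, `round_half_even`, `score_at`, `mode`, `median`, and `mean` are hypothetical, and values are passed as strings or integers so that a number such as 10.485 is treated as exactly 10.485.

```python
from decimal import Decimal, ROUND_HALF_EVEN

# Self-compassion frequency table from the lecture: rating (x) -> frequency (f)
freq = {1: 6, 2: 13, 3: 18, 4: 13, 5: 4}


def round_half_even(value, places=2):
    """Round to `places` decimals; an exact trailing 5 goes to the even digit."""
    quantum = Decimal(10) ** -places  # Decimal('0.01') when places = 2
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_EVEN)


def mode(freq):
    """Return the X value(s) with the highest frequency (not the frequency)."""
    top = max(freq.values())
    return [x for x, f in sorted(freq.items()) if f == top]


def score_at(freq, position):
    """Count up from the lowest X value until `position` is reached."""
    cumulative = 0
    for x in sorted(freq):
        cumulative += freq[x]
        if position <= cumulative:
            return x
    raise ValueError("position exceeds N")


def median(freq):
    """Median position is (N + 1) / 2; average the two scores around it."""
    n = sum(freq.values())      # N = sum of f
    lower = (n + 1) // 2        # 27 when N = 54; 6 when N = 11
    upper = (n + 2) // 2        # 28 when N = 54; 6 when N = 11
    return (score_at(freq, lower) + score_at(freq, upper)) / 2


def mean(freq):
    """Mean = sum of f(x) divided by N."""
    n = sum(freq.values())
    total = sum(x * f for x, f in freq.items())  # sum of f(x) = 158
    return Decimal(total) / Decimal(n)


print(mode(freq))                      # [3]
print(median(freq))                    # 3.0
print(round_half_even(mean(freq)))     # 2.93
print(round_half_even("10.485"))       # 10.48
print(round_half_even("10.475"))       # 10.48
```

The sketch uses `decimal.Decimal` rather than binary floats because a float cannot store 10.485 exactly, so the "exactly 5" case from the rounding rules would not be reliably detected.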