Unit 3: Measures of Central Tendency PDF

Summary

This document discusses measures of central tendency including mean, median, and mode in statistics. It provides definitions, formulas and examples, making it useful for learning about these fundamental statistical concepts.

Full Transcript

Unit 3: MEASURES OF CENTRAL TENDENCY
Prepared by Angelique DUKUNDE

✓ Central tendency is measured by averages
✓ Averages describe the point about which the observed values cluster
✓ In mathematics, an average, or central tendency, of a data set is a measure of the "middle" or "expected" value of the data set

1. ARITHMETIC MEAN
✓ The most common measure of central tendency, and the best for making predictions
✓ Applicable when scores are measured at the interval level
✓ To find the mean, add all of the values, then divide by the number of values (number of observations), i.e. n = sample size or N = population size
✓ The lower-case Greek letter mu ($\mu$) symbolizes the population mean:

$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$

✓ An "x" with a bar over it ($\bar{x}$, read "x-bar") is used for the sample mean:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

✓ If we have n real numbers $x_1, x_2, x_3, \dots, x_n$, their arithmetic mean, denoted by $\bar{x}$, can be expressed as:

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$$

✓ For example, let us find the mean of the marks of 15 students in an examination marked out of 100:

| Listing | X |
|---------|-----|
| 1 | 14 |
| 2 | 17 |
| 3 | 31 |
| 4 | 28 |
| 5 | 42 |
| 6 | 43 |
| 7 | 51 |
| 8 | 51 |
| 9 | 66 |
| 10 | 70 |
| 11 | 67 |
| 12 | 70 |
| 13 | 78 |
| 14 | 62 |
| 15 | 47 |
| Total (n = 15) | 737 |

$\bar{x} = 737 / 15 = 49.13333$

✓ Mathematical properties of the arithmetic mean:
(i) The sum of the deviations of the items from the arithmetic mean is always zero, i.e. $\sum (x_i - \bar{x}) = 0$
(ii) The sum of the squared deviations of the items from the arithmetic mean is a minimum, that is, less than the sum of the squared deviations of the items from any other value
(iii) Since $\bar{x} = \frac{\sum x_i}{n}$, it follows that $n\bar{x} = \sum x_i$
(iv) If we have the arithmetic means and the numbers of items of two or more related groups, we can compute the combined mean of these groups. For two groups:

$$\bar{x}_{1,2} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$$

where
$\bar{x}_{1,2}$ = combined mean of the two groups
$\bar{x}_1$ = arithmetic mean of the first group
$\bar{x}_2$ = arithmetic mean of the second group
$n_1$ = number of items in the first group
$n_2$ = number of items in the second group

Note: When the values are presented in a frequency table:

Sample mean:
$$\bar{x} = \frac{\sum f_i x_i}{n}$$

Population mean:
$$\mu = \frac{1}{N} \sum f_i X_i$$

For grouped data, if $m_1, m_2, m_3, \dots, m_k$ are the mid-values and $f_1, f_2, f_3, \dots, f_k$ are the corresponding frequencies, where the subscript k stands for the number of classes, then the mean is

$$\bar{X} = \frac{\sum f_i m_i}{\sum f_i}$$

5. Weighted Mean
✓ The weighted mean of the positive real numbers $x_1, x_2, \dots, x_n$ with weights $w_1, w_2, \dots, w_n$ is defined to be

$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$

6. Median
✓ Because the mean can be sensitive to extreme values, the median is sometimes a more useful and more representative measure
✓ The median is simply the middle value among the ordered scores of a variable
✓ The implication of this definition is that the median is the middle value of the observations, such that the number of observations above it is equal to the number of observations below it

❑ To find the median:
✓ List the numbers in ascending order
✓ If there is a number in the middle (odd number of values), that number is the median
✓ If there is no middle number (even number of values), take the two values in the middle; their average is the median

❑ Median for frequency series
✓ Arrange the data in ascending order and assign their frequencies
✓ Find the cumulative frequencies
✓ Apply the formula: Median = size of the $\left(\frac{n+1}{2}\right)$-th item
✓ Now look at the cumulative frequency column, find the total that is either equal to $\frac{n+1}{2}$ or the next higher, and take the value of the variable corresponding to it

Illustration:

| Listing | X (n = 15, odd) | X (n = 16, even) |
|---------|-----------------|------------------|
| 1 | 14 | 14 |
| 2 | 17 | 17 |
| 3 | 28 | 28 |
| 4 | 31 | 31 |
| 5 | 42 | 42 |
| 6 | 43 | 43 |
| 7 | 47 | 47 |
| 8 | 51 | 51 |
| 9 | 51 | 53 |
| 10 | 62 | 57 |
| 11 | 66 | 62 |
| 12 | 67 | 66 |
| 13 | 70 | 67 |
| 14 | 70 | 70 |
| 15 | 78 | 70 |
| 16 | | 78 |

If n is odd:
$$M_e = X_{\frac{n+1}{2}}$$
Here n = 15, so $M_e = X_8 = 51$.

If n is even:
$$M_e = \frac{1}{2}\left(X_{\frac{n}{2}} + X_{\frac{n}{2}+1}\right)$$
Here n = 16, so $M_e = \frac{51 + 53}{2} = 52$.

Exercise: Calculate the median income from the following data

| Income (USD) | No. of persons |
|--------------|----------------|
| 800 | 16 |
| 1000 | 24 |
| 1500 | 26 |
| 1800 | 30 |
| 2000 | 20 |
| 2500 | 6 |

❑ Median for grouped data
✓ Compute the less-than type cumulative frequencies
✓ Determine N/2, one-half of the total number of cases
✓ Locate the median class: the first class whose cumulative frequency reaches or exceeds N/2
✓ Determine the lower limit of the median class. This is $L_0$
✓ Sum the frequencies of all classes prior to the median class. This is F
✓ Determine the frequency of the median class. This is $f_0$
✓ Determine the class width of the median class. This is h
✓ Apply the following formula:

$$M_e = L_0 + \frac{h}{f_0}\left(\frac{n}{2} - F\right)$$

where
$L_0$ = lower class boundary of the median class
h = width of the median class
$f_0$ = frequency of the median class
F = cumulative frequency of the pre-median class
n = total number of cases (written N in the steps above)

Exercise: Find the median age from the following data

| Age in years | Number of persons | Cumulative number of persons |
|--------------|-------------------|------------------------------|
| 14.5-19.5 | 677 | 677 |
| 19.5-24.5 | 1908 | 2585 |
| 24.5-29.5 | 1737 | 4322 |
| 29.5-34.5 | 1040 | 5362 |
| 34.5-39.5 | 294 | 5656 |
| 39.5-44.5 | 91 | 5747 |
| 44.5-49.5 | 16 | 5763 |
| All ages | 5763 | - |
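
The two median procedures above (frequency series and grouped data) can be sketched in Python. This is a minimal illustration and not part of the original slides. The function names `median_frequency_series` and `median_grouped` are hypothetical, the input data are the two exercise tables above, and the cumulative frequencies are recomputed from the class frequencies rather than read from the table. The frequency-series function follows the slide's "equal to or next higher" rule, so it does not average two different middle values when n is even.

```python
from itertools import accumulate


def median_frequency_series(values, freqs):
    """Median of a discrete frequency series via the (n + 1) / 2 rule."""
    pairs = sorted(zip(values, freqs))      # ascending order of the variable
    n = sum(f for _, f in pairs)
    if n == 0:
        raise ValueError("empty frequency series")
    position = (n + 1) / 2                  # rank of the median item
    cumulative = 0
    for value, freq in pairs:
        cumulative += freq
        if cumulative >= position:          # equal to, or next higher than, (n + 1) / 2
            return value


def median_grouped(boundaries, freqs):
    """Median of grouped data: Me = L0 + (h / f0) * (n / 2 - F).

    boundaries[i] and boundaries[i + 1] are the lower and upper class
    boundaries of class i, so len(boundaries) == len(freqs) + 1.
    """
    n = sum(freqs)
    if n == 0:
        raise ValueError("empty frequency table")
    half = n / 2
    cumulative = list(accumulate(freqs))    # less-than cumulative frequencies
    for i, cum in enumerate(cumulative):
        if cum >= half:                     # first class reaching n / 2: median class
            lower = boundaries[i]                        # L0
            width = boundaries[i + 1] - lower            # h
            before = cumulative[i - 1] if i > 0 else 0   # F
            return lower + width / freqs[i] * (half - before)


# Income exercise: n = 122, so (n + 1) / 2 = 61.5
income = median_frequency_series(
    [800, 1000, 1500, 1800, 2000, 2500], [16, 24, 26, 30, 20, 6]
)

# Age exercise: n = 5763, n / 2 = 2881.5, median class 24.5-29.5
age = median_grouped(
    [14.5, 19.5, 24.5, 29.5, 34.5, 39.5, 44.5, 49.5],
    [677, 1908, 1737, 1040, 294, 91, 16],
)

print(income)          # 1500
print(round(age, 2))   # 25.35
```

Passing class boundaries as one list of length k + 1 is a design choice of this sketch: it lets classes of unequal width be handled with the same formula, since h is read off the median class itself.
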
7. Mode
✓ The mode is the value of a distribution for which the frequency is maximum
✓ In other words, the mode is the value of a variable that occurs with the highest frequency
✓ So the mode of the list (1, 2, 2, 3, 3, 3, 4) is 3
✓ The mode is not necessarily well defined. The list (1, 2, 2, 3, 3, 5) has two modes, 2 and 3

Illustration: The weekly pocket money for 9 first-year pupils was found to be: 3, 12, 4, 6, 1, 4, 2, 5, 8. Find the mean, median and mode.

| Mean | Median | Mode |
|------|--------|------|
| 5 | 4 | 4 |

❑ Mode of grouped data: Steps
✓ Find the modal class, which is the class with the highest frequency
✓ $L_0$ = lower class boundary of the modal class
✓ h = interval (width) of the modal class
✓ $\Delta_1$ = difference between the frequency of the modal class and that of the class before it
✓ $\Delta_2$ = difference between the frequency of the modal class and that of the class after it
✓ Apply the following formula:

$$M_o = L_0 + h\,\frac{\Delta_1}{\Delta_1 + \Delta_2}$$

Exercise: Find the modal income from the following data

| Income in USD | Midpoint (x) | Frequency (f) | Midpoint × frequency (fx) |
|---------------|--------------|---------------|---------------------------|
| 0-4 | 2 | 6 | 12 |
| 5-9 | 7 | 12 | 84 |
| 10-14 | 12 | 7 | 84 |
| 15-19 | 17 | 5 | 85 |
| Total | | n = 30 | ∑(fx) = 265 |

❑ Mean
✓ Definition: The mean is the arithmetic average of a set of numbers. It is calculated by summing all the values and dividing by the number of values.
✓ Usage: The mean is useful for datasets without extreme outliers, as it takes all values into account.
✓ Sensitivity: It can be heavily influenced by outliers. For example, in the dataset [1, 2, 3, 4, 100], the mean (22) does not represent the central location well.

❑ Median
✓ Definition: The median is the middle value when a dataset is ordered from least to greatest. If there is an even number of observations, the median is the average of the two middle values.
✓ Usage: The median is a better measure for skewed distributions or datasets with outliers. It represents the point at which half the values are below and half are above.
✓ Stability: It remains stable even when extreme values are present, making it a reliable measure for skewed data.

❑ Mode
✓ The mode is the value that appears most frequently in a dataset. A dataset may have one mode (unimodal), more than one mode (multimodal), or no mode at all (when all values are unique).
✓ The mode is useful for categorical data, where we want to know the most common category. It can also provide insights into the distribution of numerical data.
✓ In some datasets the mode may not be representative, for example when its frequency differs only slightly from the frequencies of the other values.

Conclusion
✓ For symmetric distributions: the mean is often best.
✓ For skewed distributions or outliers: the median is preferable.
✓ For categorical data or identifying common values: the mode is useful.
✓ Choosing the right measure of central tendency depends on the nature of the data and the specific insights you are looking to gain.
✓ In many cases, it is beneficial to look at all three measures to gain a comprehensive understanding of your data.

Exercises

1) Imagine we have the following dataset representing the ages (in years) of 15 patients diagnosed with diabetes:
Ages: 34, 45, 29, 52, 38, 41, 60, 55, 47, 39, 50, 36, 44, 59, 31
i) What is the mean age of the patients diagnosed with diabetes?
ii) What is the median age of the patients?
iii) Is there a mode in the ages of the patients? If so, what is it?

2) Given the systolic blood pressure readings of 15 patients:
120, 130, 125, 140, 135, 130, 128, 123, 132, 138, 140, 125, 129, 137, 136
i) What is the mean systolic blood pressure?
ii) What is the median systolic blood pressure for the same dataset?
iii) What is the mode of the systolic blood pressure readings?

3) Analyze the following total cholesterol levels of 15 patients:
190, 210, 200, 180, 205, 200, 215, 190, 195, 200, 205, 210, 220, 180, 185
i) What is the mean total cholesterol level?
ii) What is the median total cholesterol level for the above dataset?
iii) What is the mode of the cholesterol levels in this dataset?

4) Given the following grouped systolic blood pressure readings:

| BP Group (mmHg) | Frequency |
|------------------|----------|
| 110-119 | 4 |
| 120-129 | 8 |
| 130-139 | 12 |
| 140-149 | 6 |
| 150-159 | 2 |

i) Calculate the mean systolic blood pressure of the patients using the grouped data.
ii) What is the median blood pressure group? Provide the cumulative frequencies, determine the median class and find the median value.
iii) Identify the modal blood pressure group from the dataset. What is the value of the mode?

5) With the following cholesterol levels and frequencies:

| Cholesterol Range | Frequency |
|-------------------|-----------|
| 180-189 | 3 |
| 190-199 | 8 |
| 200-209 | 9 |
| 210-219 | 5 |

i) What is the mean of the cholesterol levels?
ii) What is the mode of the cholesterol levels?
iii) What is the median of the cholesterol levels?

6) Consider the following dataset representing the ages of patients diagnosed with hypertension, grouped into exclusive age intervals:

| Age Group (Years) | Frequency |
|-------------------|-----------|
| 20-30 | 3 |
| 30-40 | 7 |
| 40-50 | 10 |
| 50-60 | 15 |
| 60-70 | 12 |
| 70-80 | 5 |

i) What is the mean of the ages?
ii) What is the mode of the ages?
iii) What is the median of the ages?
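
As a cross-check on the worked figures in this unit (the pocket-money illustration, the [1, 2, 3, 4, 100] outlier example, and the income table from the mode exercise), the ungrouped measures and the grouped-data formulas for the mean and the mode can be sketched in Python. This is an illustrative sketch and not part of the original slides. `grouped_mean` and `grouped_mode` are hypothetical helper names, and the lower boundary 4.5 used for the class 5-9 assumes the usual half-unit correction for classes written with inclusive limits; the slides do not state which boundary they intend.

```python
import statistics

# Weekly pocket money of the 9 pupils from the Mode illustration.
pocket_money = [3, 12, 4, 6, 1, 4, 2, 5, 8]
pm_mean = statistics.mean(pocket_money)      # 45 / 9
pm_median = statistics.median(pocket_money)  # middle of the sorted list
pm_mode = statistics.mode(pocket_money)      # most frequent value

# Outlier example from the summary: the mean moves, the median does not.
outlier_data = [1, 2, 3, 4, 100]
out_mean = statistics.mean(outlier_data)
out_median = statistics.median(outlier_data)

# A list with two modes: multimode (Python 3.8+) returns all of them.
two_modes = statistics.multimode([1, 2, 2, 3, 3, 5])


def grouped_mean(midpoints, freqs):
    """Mean of grouped data: sum(f_i * m_i) / sum(f_i)."""
    return sum(f * m for f, m in zip(freqs, midpoints)) / sum(freqs)


def grouped_mode(lower_boundaries, width, freqs):
    """Mode of grouped data: Mo = L0 + h * d1 / (d1 + d2).

    Assumes equal class widths and a single modal class; a missing
    neighbouring class is treated as having frequency 0. Fails with
    ZeroDivisionError if the modal class and both neighbours are equal.
    """
    i = freqs.index(max(freqs))                        # modal class (first if tied)
    before = freqs[i - 1] if i > 0 else 0
    after = freqs[i + 1] if i + 1 < len(freqs) else 0
    d1 = freqs[i] - before                             # delta 1
    d2 = freqs[i] - after                              # delta 2
    return lower_boundaries[i] + width * d1 / (d1 + d2)


# Income table from the mode exercise (classes 0-4, 5-9, 10-14, 15-19).
freqs = [6, 12, 7, 5]
income_mean = grouped_mean([2, 7, 12, 17], freqs)               # 265 / 30
income_mode = grouped_mode([-0.5, 4.5, 9.5, 14.5], 5, freqs)    # assumed boundaries

print(pm_mean, pm_median, pm_mode)   # 5 4 4
print(out_mean, out_median)          # 22 3
print(two_modes)                     # [2, 3]
print(round(income_mean, 2))         # 8.83
print(round(income_mode, 2))         # 7.23
```

If the class limits are instead read as exact boundaries (so that the modal class starts at 5 rather than 4.5), the computed mode shifts up by 0.5; the choice of boundary convention should follow whatever the course prescribes.
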
