Basic Statistics (FBQT 1024) Chapter 3 PDF
This document provides a chapter on basic statistics, focusing on measures of central tendency including mean, median, and mode. It differentiates between ungrouped and grouped data, and includes examples and exercises.
Chapter 3 : Measures of Central Tendency

By the end of this chapter, students shall be able to:
- Compute and interpret the mean, median, and mode for a set of data
- Differentiate ungrouped and grouped data

Describing Data Numerically
Central Tendency: Mean, Median, Mode
Variation: Range, Interquartile Range, Variance, Standard Deviation

Central Tendency
Mean: the arithmetic average, $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Median: the midpoint of ranked values
Mode: the most frequently observed value

Topics: Mean; Median; Mode; Relationships among the Mean, Median, and Mode

The mean for ungrouped data is obtained by dividing the sum of all values by the number of values in the data set. Thus,

Mean for population data: $\mu = \frac{\sum x}{N}$
Mean for sample data: $\bar{x} = \frac{\sum x}{n}$

The mean:
- is the most common measure of central tendency
- equals the sum of values divided by the number of values
- is affected by extreme values (outliers)

[Figure: two dot plots on a 0 to 10 scale. For the values 1, 2, 3, 4, 5 the mean is (1 + 2 + 3 + 4 + 5)/5 = 15/5 = 3. For the values 1, 2, 3, 4, 10 the mean is (1 + 2 + 3 + 4 + 10)/5 = 20/5 = 4.]

Table 3.1 gives the 2002 total payrolls of five Major League Baseball (MLB) teams. Find the mean of the 2002 payrolls of these five MLB teams.

Table 3.1
MLB Team                 2002 Total Payroll (millions of dollars)
Anaheim Angels           62
Atlanta Braves           93
New York Yankees         126
St. Louis Cardinals      75
Tampa Bay Devil Rays     34

$$\bar{x} = \frac{\sum x}{n} = \frac{390}{5} = \$78 \text{ million}$$

Thus, the mean 2002 payroll of these five MLB teams was $78 million.

The following are the ages of all eight employees of a small company:

53 32 61 27 39 44 49 57

Find the mean age of these employees.

$$\mu = \frac{\sum x}{N} = \frac{362}{8} = 45.25 \text{ years}$$

Thus, the mean age of all eight employees of this company is 45.25 years, or 45 years and 3 months.

Table 3.2 lists the 2000 populations (in thousands) of the five Pacific states.

Table 3.2
State          Population (thousands)
Washington     5894
Oregon         3421
Alaska         627
Hawaii         1212
California     33,872   (an outlier)

Notice that the population of California is very large compared to the populations of the other four states. Hence, it is an outlier. The calculations that follow show how the inclusion of this outlier affects the value of the mean.
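The mean calculations for Table 3.1, the employees' ages, and the Pacific states in Table 3.2 can be checked with a few lines of Python. This is only a sketch: the helper name `mean_of` and the variable names are hypothetical, chosen for illustration, and the data are copied from the tables above.

```python
def mean_of(values):
    """Ungrouped mean: sum of all values divided by the number of values."""
    values = list(values)
    return sum(values) / len(values)

# Table 3.1: 2002 payrolls (millions of dollars)
payrolls = [62, 93, 126, 75, 34]

# Ages of all eight employees (a population)
ages = [53, 32, 61, 27, 39, 44, 49, 57]

# Table 3.2: 2000 populations (thousands)
populations = {
    "Washington": 5894,
    "Oregon": 3421,
    "Alaska": 627,
    "Hawaii": 1212,
    "California": 33872,  # the outlier
}

mean_payroll = mean_of(payrolls)
mean_age = mean_of(ages)
mean_without_outlier = mean_of(
    v for state, v in populations.items() if state != "California"
)
mean_with_outlier = mean_of(populations.values())

print(mean_payroll)          # 78.0
print(mean_age)              # 45.25
print(mean_without_outlier)  # 2788.5
print(mean_with_outlier)     # 9005.2
```

Adding the single outlier more than triples the mean, which is why the mean is described as sensitive to extreme values.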
If we do not include the population of California (the outlier), the mean population of the remaining four states (Washington, Oregon, Alaska, and Hawaii) is

$$\text{Mean} = \frac{5894 + 3421 + 627 + 1212}{4} = 2788.5 \text{ thousand}$$

Now, to see the impact of the outlier on the value of the mean, we include the population of California and find the mean population of all five Pacific states. This mean is

$$\text{Mean} = \frac{5894 + 3421 + 627 + 1212 + 33{,}872}{5} = 9005.2 \text{ thousand}$$

A sample of five executives received the following bonuses last year (in RM):

14 000, 15 000, 17 000, 15 000, 16 000

a) Is this information a sample or a population?
b) Find the mean bonus of the five executives.

a) Is this information a sample or a population?
b) What is the mean number of patents granted?

The median is the value of the middle term in a data set that has been ranked in increasing order. The median is not affected by extreme values.

[Figure: two dot plots on a 0 to 10 scale, each with Median = 3.]

The location of the median:

$$\text{Median position} = \frac{n + 1}{2} \text{ position in the ordered data}$$

- If the number of values is odd, the median is the middle number.
- If the number of values is even, the median is the average of the two middle numbers.

Note that $\frac{n + 1}{2}$ is not the value of the median, only the position of the median in the ranked data.

The following data give the weight lost (in pounds) by a sample of five members of a health club at the end of two months of membership:

10 5 19 8 3

Find the median.

First, we rank the given data in increasing order as follows:

3 5 8 10 19

There are five observations in the data set. Consequently, n = 5 and

$$\text{Position of the middle term} = \frac{n + 1}{2} = \frac{5 + 1}{2} = 3$$

Therefore, the median is the value of the third term in the ranked data:

3 5 [8] 10 19

The median weight loss for this sample of five members of this health club is 8 pounds.

The table below lists the total revenue for the 12 top-grossing North American concert tours of all time. Find the median revenue for these data.
Tour                         Artist                   Total Revenue (millions of dollars)
Steel Wheels, 1989           The Rolling Stones       98.0
Magic Summer, 1990           New Kids on the Block    74.1
Voodoo Lounge, 1994          The Rolling Stones       121.2
The Division Bell, 1994      Pink Floyd               103.5
Hell Freezes Over, 1994      The Eagles               79.4
Bridges to Babylon, 1997     The Rolling Stones       89.3
Popmart, 1997                U2                       79.9
Twenty-Four Seven, 2000      Tina Turner              80.2
No Strings Attached, 2000    'N-Sync                  76.4
Elevation, 2001              U2                       109.7
Popodyssey, 2001             'N-Sync                  86.8
Black and Blue, 2001         The Backstreet Boys      82.1

First we rank the given data in increasing order, as follows:

74.1 76.4 79.4 79.9 80.2 82.1 86.8 89.3 98.0 103.5 109.7 121.2

There are 12 values in this data set. Hence, n = 12 and

$$\text{Position of the middle term} = \frac{n + 1}{2} = \frac{12 + 1}{2} = 6.5$$

Therefore, the median is given by the mean of the sixth and the seventh values in the ranked data:

74.1 76.4 79.4 79.9 80.2 [82.1 86.8] 89.3 98.0 103.5 109.7 121.2

$$\text{Median} = \frac{82.1 + 86.8}{2} = 84.45 = \$84.45 \text{ million}$$

Thus, the median revenue for the 12 top-grossing North American concert tours of all time is $84.45 million.

A real estate broker intends to determine the median selling price of 10 houses listed below:

What is the median price?

The following data represent the number of home runs hit by all teams in the Indian League in 2004.

157 133 189 215 208 139 152 167 202 197 124 239 191 169

Find the median of this data set.

The mode:
- is the value that occurs most often
- is not affected by extreme values
- can be used for either quantitative or qualitative data
- may not exist (there may be no mode)
- may not be unique (there may be several modes)

[Figure: two dot plots, one with Mode = 9 and one with no mode.]

The following data give the speeds (in miles per hour) of eight cars that were stopped on I-95 for speeding violations.

77 69 74 81 71 68 74 73

Find the mode.

In this data set, 74 occurs twice and each of the remaining values occurs only once. Because 74 occurs with the highest frequency, it is the mode.
Therefore, Mode = 74 miles per hour.

A data set may have no mode or many modes, whereas it will have only one mean and only one median.
- A data set with only one mode is called unimodal.
- A data set with two modes is called bimodal.
- A data set with more than two modes is called multimodal.

Last year's incomes of five randomly selected families were $36,150, $95,750, $54,985, $77,490, and $23,740. Find the mode.

The prices of the same brand of television set at eight stores are found to be $495, $486, $503, $495, $470, $505, $470 and $499. Find the mode.

The ages of 10 randomly selected students from a class are 21, 19, 27, 22, 29, 19, 25, 21, 22 and 30. Find the mode.

Five houses on a hill by the beach

House prices:
$2,000,000
   500,000
   300,000
   100,000
   100,000
Sum 3,000,000

Mean: $3,000,000 / 5 = $600,000
Median: middle value of ranked data = $300,000
Mode: most frequent value = $100,000

The mean is generally used, unless extreme values (outliers) exist. In that case the median is often used, since the median is not sensitive to extreme values. Example: median home prices may be reported for a region because the median is less sensitive to outliers.

Shape of a distribution
- Describes how data are distributed
- Measures of shape: symmetric or skewed

Left-skewed: Mean < Median
Symmetric: Mean = Median
Right-skewed: Median < Mean

For grouped data, the mean is computed from the class midpoints and frequencies.

Mean for population data: $\mu = \frac{\sum fx}{N}$ or $\mu = \frac{\sum fx}{\sum f}$
Mean for sample data: $\bar{x} = \frac{\sum fx}{n}$ or $\bar{x} = \frac{\sum fx}{\sum f}$

where x is the midpoint and f is the frequency of a class.

Table 3.3 gives the frequency distribution of the daily commuting times (in minutes) from home to work for all 25 employees of a company. Calculate the mean of the daily commuting times.
Daily Commuting Time (minutes)    Number of Employees
0 to less than 10                 4
10 to less than 20                9
20 to less than 30                6
30 to less than 40                4
40 to less than 50                2

Daily Commuting Time (minutes)    f     x     fx
0 to less than 10                 4     5     20
10 to less than 20                9     15    135
20 to less than 30                6     25    150
30 to less than 40                4     35    140
40 to less than 50                2     45    90
                                  N = 25      ∑fx = 535

$$\mu = \frac{\sum fx}{N} = \frac{535}{25} = 21.40 \text{ minutes}$$

Thus, the employees of this company spend an average of 21.40 minutes a day commuting from home to work.

Table 3.4 gives the frequency distribution of the number of orders received each day during the past 50 days at the office of a mail-order company. Calculate the mean.

Number of Orders    Number of Days
10 – 12             4
13 – 15             12
16 – 18             20
19 – 21             14

The following data refer to the number of bicycles owned by 27 families at Taman Bukit Katil. Find the mean of these data.

Number of bicycles    Number of families
0                     2
1                     6
2                     13
3                     4
4                     2

The following data show the number of visits to the library made by all 100 international students in one year.

Number of visits    Number of students
0 – 4               17
5 – 9               41
10 – 14             22
15 – 19             11
20 – 24             8
25 – 29             1

Find the average number of visits to the library.

Mode for grouped data:

$$\hat{x} = L_{mode} + \left( \frac{f_0 - f_1}{(f_0 - f_1) + (f_0 - f_2)} \right) C_{mode}$$

where
$L_{mode}$ is the lower boundary of the class containing the mode
$f_0$ is the frequency of the modal class
$f_1$ is the frequency of the class before the modal class
$f_2$ is the frequency of the class after the modal class
$C_{mode}$ is the class interval of the modal class

Median for grouped data:

$$\tilde{x} = L_m + \left( \frac{\frac{n}{2} - \sum f_{m-1}}{f_m} \right) C_m$$

where
$L_m$ is the lower boundary of the class containing the median
$\sum f_{m-1}$ is the cumulative frequency before the median class
$f_m$ is the frequency of the median class
$C_m$ is the class interval of the class containing the median

The grouped frequency table shows the number of CDs bought by a class of children in the past year.

Number of CDs    Frequency (f)
0 – 4            10
5 – 9            12
10 – 14          6
15 – 19          2
> 19             0

Calculate the mean, mode and median.
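The grouped-data formulas for the mean, median, and mode can be sketched in Python using the daily commuting-time table as input. This is an illustration under stated assumptions, not part of the chapter: the function names and the `(lower boundary, upper boundary, frequency)` layout are hypothetical, a missing neighbouring class is treated as frequency 0, and on a tie the first class with the highest frequency is taken as the modal class. The text works only the mean (21.40 minutes) for this table; the median and mode values are what the formulas give when applied to the same table.

```python
# Each class: (lower boundary, upper boundary, frequency).
commuting = [(0, 10, 4), (10, 20, 9), (20, 30, 6), (30, 40, 4), (40, 50, 2)]

def grouped_mean(classes):
    """Sum of f * x divided by sum of f, where x is the class midpoint."""
    total_f = sum(f for _, _, f in classes)
    total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return total_fx / total_f

def grouped_median(classes):
    """L_m + ((n/2 - cumulative frequency before median class) / f_m) * C_m."""
    n = sum(f for _, _, f in classes)
    cumulative = 0  # cumulative frequency before the current class
    for lo, hi, f in classes:
        if cumulative + f >= n / 2:
            return lo + (n / 2 - cumulative) / f * (hi - lo)
        cumulative += f

def grouped_mode(classes):
    """L_mode + ((f0 - f1) / ((f0 - f1) + (f0 - f2))) * C_mode."""
    # Assumption: the first class with the highest frequency is the modal class.
    i = max(range(len(classes)), key=lambda k: classes[k][2])
    lo, hi, f0 = classes[i]
    # Assumption: a missing neighbouring class counts as frequency 0.
    f1 = classes[i - 1][2] if i > 0 else 0
    f2 = classes[i + 1][2] if i < len(classes) - 1 else 0
    return lo + (f0 - f1) / ((f0 - f1) + (f0 - f2)) * (hi - lo)

mean_time = grouped_mean(commuting)
median_time = grouped_median(commuting)
mode_time = grouped_mode(commuting)

print(round(mean_time, 2))    # 21.4
print(round(median_time, 2))  # 19.44
print(round(mode_time, 2))    # 16.25
```

For tables with gaps between classes, such as 0 – 4, 5 – 9, the class boundaries (for example −0.5 and 4.5) rather than the stated limits would be passed in as the lower and upper values.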