Chapter 7a Statistics - Measures of Central Tendency
Document Details
Uploaded by RomanticCedar
Central Philippine University
Summary
This document is about measures of central tendency in introductory statistics. It explains different concepts like mean, mode, median, and weighted mean. It also briefly describes the concept of frequency distribution.
Full Transcript
CHAPTER 7a. STATISTICS – Measures of Central Tendency

Core Idea
“Statistical tools derived from mathematics are useful in processing and managing numerical data in order to describe a phenomenon and predict values.”

Learning Objectives
1. discuss statistics;
2. describe the different measures of central tendency;
3. compute the arithmetic mean, weighted mean, median, and mode of a given set of data; and
4. construct a frequency table.

population – the entire group under consideration
sample – any subset of the population

AREAS OF STATISTICS
❑ descriptive statistics – involves the collection, organization, summarization, and presentation of data that describe the sample
❑ inferential statistics – interprets results and draws conclusions about a population based on a sample

[Figure: a population (parameter) and a sample (statistic) linked by sampling, labeled with descriptive statistics and inferential statistics.]

MEASURES OF CENTRAL TENDENCY
❑ Mean
❑ Median
❑ Mode

MEAN/ARITHMETIC MEAN
It is computed by finding the sum of the data values divided by the number of data values.

The mean of $n$ numbers is the sum of the numbers, divided by $n$:

$\text{mean} = \dfrac{\sum x}{n}$

where $\sum x$ is the sum of the numbers.

$\mu$ : mean of a population
$\bar{x}$ : mean of a sample

Example 1. Finding the Mean (Page 102)
Six friends in a biology class of 20 students received test grades of 92, 84, 65, 76, 88, and 90. Find the mean of these test scores.

MEDIAN
It is the middle number, or the mean of the two middle numbers, in a list of numbers that have been arranged in numerical order from smallest to largest or largest to smallest.

The median of a ranked list of $n$ numbers is the:
▪ middle number if $n$ is odd; and
▪ mean of the two middle numbers if $n$ is even.

ranked list – a list of numbers that is arranged in numerical order from smallest to largest or vice versa.

Example 2. Finding the Median (Page 103)
Find the median of the data in the following lists.
a. 4, 8, 1, 14, 9, 21, 12
b. 46, 23, 92, 89, 77, 108

MODE
The mode of a list of numbers is the number that occurs most frequently.

A list of numbers may have one or more modes. In some instances, a list of numbers has no mode.
▪ unimodal – a list of numbers that contains one mode.
▪ bimodal – a list of numbers that contains two modes.
▪ multimodal – a list of numbers that contains three or more modes.

Example 3. Finding the Mode (Page 103)
Find the mode of the data in the following lists.
a. 18, 15, 21, 16, 15, 14, 15, 21
b. 2, 5, 8, 9, 11, 4, 7, 23

COMPARISON BETWEEN MEAN, MEDIAN, & MODE
❑ The mean, median, and mode are all averages.

Consider the following example and compare the mean, median, and mode for the salaries of the employees of a company.

Php 370,000
Php 42,000
Php 36,000
Php 32,000
Php 30,000
Php 30,000
Php 25,000
Php 5,000

Which best represents the average of these salaries?

❑ The mean is affected by extreme values, while the mode and median are not.
❑ The median is useful when one or more extreme values exist in the list.
❑ The mode is the least used of the three, and it is the only one that can be applied to nominal data.

THE WEIGHTED MEAN
The weighted mean of the $n$ numbers $x_1, x_2, x_3, \ldots, x_n$ with the respective assigned weights $w_1, w_2, w_3, \ldots, w_n$ is

$\text{weighted mean} = \dfrac{\sum (x \cdot w)}{\sum w}$

where $\sum (x \cdot w)$ is the sum of the products formed by multiplying each number by its assigned weight and $\sum w$ is the sum of all the weights.

Consider the example below:

Subjects              Grade   Unit/s
Mathematics           98      3
Filipino              90      3
English               88      3
Science               89      3
Computer              92      1
Physical Education    95      2
Total                         15

GWA: __________

What is the general weighted average (GWA)?
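The measures defined so far can be checked with a short Python sketch. It uses the standard-library `statistics` module for the mean, median, and mode; the helper `weighted_mean` and all variable names are assumptions made for this illustration and do not come from the slides.

```python
from statistics import mean, median, multimode


def weighted_mean(values, weights):
    """Return sum(x * w) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight


# Example 1: test grades of the six friends.
grades = [92, 84, 65, 76, 88, 90]
print("Example 1 mean:", mean(grades))

# Example 2: median() ranks the list internally before taking the middle.
median_a = median([4, 8, 1, 14, 9, 21, 12])
median_b = median([46, 23, 92, 89, 77, 108])
print("Example 2 medians:", median_a, median_b)

# Example 3a: multimode() returns every value tied for most frequent.
# Caveat: for a list such as Example 3b, where each value occurs once,
# it returns all the values, while the text calls that case "no mode".
modes_a = multimode([18, 15, 21, 16, 15, 14, 15, 21])
print("Example 3a mode(s):", modes_a)

# Salary comparison: one extreme value pulls the mean away from the median.
salaries = [370_000, 42_000, 36_000, 32_000, 30_000, 30_000, 25_000, 5_000]
print("Salaries:", mean(salaries), median(salaries), multimode(salaries))

# GWA table: grades weighted by units.
subject_grades = [98, 90, 88, 89, 92, 95]
subject_units = [3, 3, 3, 3, 1, 2]
gwa = weighted_mean(subject_grades, subject_units)
print("GWA:", round(gwa, 2))
```

Passing percentage weights such as 25, 35, 20, and 20 to the same helper works unchanged, because the function divides by the sum of the weights.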
THE WEIGHTED MEAN
Suppose Anita obtained the following average grades for each component:

Prelim Exam – 85
Midterm Exam – 88
Final Exam – 88
Quiz – 90

If the grading system states that the prelim is 25%, the midterm is 35%, the final exam is 20%, and the quiz is 20%, what is her final grade?

FREQUENCY DISTRIBUTION
A frequency distribution is a table that lists observed events and the frequency of occurrence of each observed event. It is often used to organize raw data.

Consider the raw data: (Refer to Page 106)

Number of Laptop Computers per Household
2 0 3 1 2 1 0 4
2 1 1 7 2 0 1 1
0 2 2 1 3 2 2 1
1 4 2 5 2 3 1 2
2 1 2 1 5 0 2 5

RELATIVE FREQUENCY DISTRIBUTION
It is a type of frequency distribution that lists the percent of data in each class.

Download Time (s)   Number of subscribers   Percent of subscribers
0-5                 6
5-10                17
10-15               43
15-20               92
20-25               151
25-30               192
30-35               190
35-40               149
40-45               90
45-50               45
50-55               15
55-60               10

a. What is the percent of subscribers who required at least 25 s?
b. What is the percent of subscribers who required at most 20 s?
c. What is the percent of subscribers who required at least 5 s but less than 20 s to download a file?

CHAPTER 7b. STATISTICS – Measures of Dispersion

Core Idea
“Statistical tools derived from mathematics are useful in processing and managing numerical data in order to describe a phenomenon and predict values.”

Learning Objectives
1. discuss measures of dispersion;
2. compute the range, standard deviation, or variance of a given set of data; and
3. solve problems involving the measures of dispersion.

MEASURES OF DISPERSION
❑ Range
❑ Standard Deviation
❑ Variance

THE RANGE
The range of a set of data values is the difference between the greatest data value and the least data value.

Example 1. Find the Range (Page 112)
Find the range of the numbers of ounces dispensed by Machine 1 and Machine 2 in Table 4.5.

Table 4.5. Soda Dispensed (ounces)

Machine 1         Machine 2
9.52              8.01
6.41              7.99
10.07             7.95
5.85              8.03
8.15              8.02
$\bar{x}$ = 8.0   $\bar{x}$ = 8.0

Note: The range is sensitive to extreme values.

THE STANDARD DEVIATION
The standard deviation is a measure that indicates how much the data scatter around the mean of a set of data.

[Figure: data values scattered around a mean of 155 cm.]

If $x_1, x_2, x_3, \ldots, x_n$ is a population of $n$ numbers with a mean of $\mu$, then the standard deviation of the population is

$\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{n}}$   (1)

If $x_1, x_2, x_3, \ldots, x_n$ is a sample of $n$ numbers with a mean of $\bar{x}$, then the standard deviation of the sample is

$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$   (2)

Note: The standard deviation is less sensitive to extreme values than the range.

To find the standard deviation:
1. Determine the mean.
2. Calculate the deviation (difference) between each number and the mean.
3. Square each deviation and find the sum of these squared deviations.
4. If the data is a population, divide the sum by $n$; if the data is a sample, divide the sum by $n - 1$.
5. Compute the square root.

Example 2. Find the Standard Deviation (Page 114)
The following numbers were obtained by sampling a population.
2, 4, 7, 12, 15
Find the standard deviation of the sample.

THE VARIANCE
The variance for a given set of data is the square of the standard deviation of the data.

If $x_1, x_2, x_3, \ldots, x_n$ is a population of $n$ numbers with a mean of $\mu$, then the variance of the population is

$\sigma^2 = \dfrac{\sum (x - \mu)^2}{n}$   (3)

If $x_1, x_2, x_3, \ldots, x_n$ is a sample of $n$ numbers with a mean of $\bar{x}$, then the variance of the sample is

$s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$   (4)

Consider Example 2. Find the variance of the set of data.
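The range, variance, and standard deviation can be sketched in Python by following the five-step procedure directly. The function names and the `sample` flag below are assumptions chosen for this illustration; the data are the soda-machine values of Table 4.5 and the sample 2, 4, 7, 12, 15 from Example 2.

```python
import math


def data_range(values):
    """Greatest data value minus least data value."""
    return max(values) - min(values)


def variance(values, sample=True):
    """Steps 1-4: mean, deviations, sum of squares, divide by n - 1 or n."""
    n = len(values)
    divisor = n - 1 if sample else n
    if divisor <= 0:
        raise ValueError("not enough data values")
    mean = sum(values) / n                                   # step 1
    squared_deviations = [(x - mean) ** 2 for x in values]   # steps 2-3
    return sum(squared_deviations) / divisor                 # step 4


def standard_deviation(values, sample=True):
    """Step 5: square root of the variance."""
    return math.sqrt(variance(values, sample=sample))


# Example 1 (range): both machines have a mean of 8.0 ounces,
# but their ranges differ widely.
machine_1 = [9.52, 6.41, 10.07, 5.85, 8.15]
machine_2 = [8.01, 7.99, 7.95, 8.03, 8.02]
print("Ranges:", round(data_range(machine_1), 2), round(data_range(machine_2), 2))

# Example 2: the numbers come from sampling, so divide by n - 1.
example_2 = [2, 4, 7, 12, 15]
s_squared = variance(example_2, sample=True)
s = standard_deviation(example_2, sample=True)
print("Sample variance:", s_squared)
print("Sample standard deviation:", round(s, 3))
```

Calling the same functions with `sample=False` applies the population formulas (1) and (3), which divide by $n$ instead of $n - 1$.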
EXERCISES: (ASSIGNMENT)
Consider the following data.
80 82 83 90 84 88 81 85 92 82
Solve for the measures of central tendency and the measures of dispersion.

CHAPTER 7c. MEASURES OF POSITION
❑ Percentile
❑ Decile
❑ Quartile

Percentile
Divides the distribution into 100 parts.

A value $x$ is called the $p$th percentile of a data set provided $p\%$ of the data values are less than $x$.

P50 = 50(20)/100 = 10
But since the number of data values is even, P50 = (84 + 80)/2 = 82.

Percentile for a Given Data Value
Given a set of data and a data value $x$,

$\text{Percentile of score } x = \dfrac{\text{number of data values less than } x}{\text{total number of data values}} \times 100$

Example: On a reading examination given to 900 students, Elaine’s score of 602 was higher than the scores of 576 of the students who took the examination. What is the percentile for Elaine’s score?

Decile
Divides the distribution into 10 parts.

Quartile
Divides the distribution into 4 parts.

[Figure: interrelatedness of percentile, decile, quartile, and the median.]

CHAPTER 7d. LINEAR REGRESSION & CORRELATION

TERMS & DEFINITIONS
BIVARIATE DATA – Data involving two variables where each value of one of the variables is paired with a value of the other.
SCATTER PLOT – A graph that presents the relationship between two variables in a data set.

Example 1
Consider the bivariate data below:

Weight (lbs)   Height (inches)
140            60
155            62
159            67
179            70
192            71
200            72
212            75

[Figure: scatter plot titled "Weight (lbs) and Height (inches)", with weight on the horizontal axis and height on the vertical axis.]

LEAST-SQUARES REGRESSION LINE

[Figure: the scatter plot "Weight (lbs) and Height (inches)" shown again for the least-squares regression line.]

The least-squares regression line for a set of bivariate data is the line that minimizes the sum of the squares of the vertical deviations from each data point to the line.
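To make "minimizes the sum of the squares of the vertical deviations" concrete, the sketch below evaluates that sum for two candidate lines over the weight and height data of Example 1. Both candidate lines and the function name are hypothetical, picked only for illustration; neither is claimed to be the least-squares line itself.

```python
# Example 1 data: x = weight (lbs), y = height (inches).
weights_lb = [140, 155, 159, 179, 192, 200, 212]
heights_in = [60, 62, 67, 70, 71, 72, 75]


def sum_squared_vertical_deviations(a, b, xs, ys):
    """Sum of (y - (a*x + b))**2 over all data points for the line y = a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


# Two hypothetical candidate lines (slope, intercept), for illustration only.
candidate_1 = (0.20, 33.0)
candidate_2 = (0.30, 15.0)

sse_1 = sum_squared_vertical_deviations(*candidate_1, weights_lb, heights_in)
sse_2 = sum_squared_vertical_deviations(*candidate_2, weights_lb, heights_in)

print(f"y = 0.20x + 33: sum of squared deviations = {sse_1:.2f}")
print(f"y = 0.30x + 15: sum of squared deviations = {sse_2:.2f}")
print("Better fit:", "candidate 1" if sse_1 < sse_2 else "candidate 2")
```

The candidate with the smaller sum fits the points more closely; the least-squares line is the one line for which no other choice of slope and intercept gives a smaller sum.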
THE FORMULA FOR THE LEAST-SQUARES LINE
The equation of the least-squares line for the $n$ ordered pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ is

$\hat{y} = ax + b$

where

$a = \dfrac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2}$ and $b = \bar{y} - a\bar{x}$

Example 1
Determine the least-squares line of the bivariate data below:

Weight (lbs)   Height (inches)
140            60
155            62
159            67
179            70
192            71
200            72
212            75

[Figure: scatter plot titled "Weight (lbs) and Height (inches)".]

Example 2
Find the equation of the least-squares line for the ordered pairs in the table below.

Stride Length (m)   2.5   3.0   3.3   3.5   3.8   4.0   4.2   4.5
Speed (m/s)         3.4   4.9   5.5   6.6   7.0   7.7   8.3   8.7

LINEAR CORRELATION COEFFICIENT
The strength of a linear relationship between two variables is measured by the linear correlation coefficient $r$, which is defined as follows:

For the $n$ ordered pairs $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)$, the linear correlation coefficient $r$ is given by

$r = \dfrac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2} \cdot \sqrt{n\sum y^2 - \left(\sum y\right)^2}}$

If $r$ is positive, then the relationship between the two variables has a positive correlation. In this case, if one variable increases, the other variable also tends to increase.
Moreover, if $r$ is negative, then the relationship between the two variables has a negative correlation. In this case, if one variable increases, the other variable tends to decrease.

In addition, if $r$ is 0, then there is no linear relationship between the two variables. A correlation coefficient of −1 or +1 indicates a perfect linear relationship.

[Figure: scatter plots of data with different correlation coefficients.]

Example 3
Find the value of the correlation coefficient in Examples 1 and 2.

End of Discussion…
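As a rough check on Examples 1 to 3, the following Python sketch applies the summation formulas for the slope $a$, the intercept $b$, and the correlation coefficient $r$ to both data sets. The function names are assumptions made for this illustration, and the code is one straightforward way to evaluate the formulas rather than a prescribed method.

```python
import math


def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a*x + b using the summation formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = sum_y / n - a * (sum_x / n)  # b = y-bar - a * x-bar
    return a, b


def correlation_coefficient(xs, ys):
    """Return the linear correlation coefficient r."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Example 1: x = weight (lbs), y = height (inches).
weights_lb = [140, 155, 159, 179, 192, 200, 212]
heights_in = [60, 62, 67, 70, 71, 72, 75]

# Example 2: x = stride length (m), y = speed (m/s).
stride_m = [2.5, 3.0, 3.3, 3.5, 3.8, 4.0, 4.2, 4.5]
speed_mps = [3.4, 4.9, 5.5, 6.6, 7.0, 7.7, 8.3, 8.7]

a1, b1 = least_squares_line(weights_lb, heights_in)
a2, b2 = least_squares_line(stride_m, speed_mps)
r1 = correlation_coefficient(weights_lb, heights_in)
r2 = correlation_coefficient(stride_m, speed_mps)

print(f"Example 1: y-hat = {a1:.4f}x + {b1:.4f}, r = {r1:.4f}")
print(f"Example 2: y-hat = {a2:.4f}x + {b2:.4f}, r = {r2:.4f}")
```

Both values of $r$ come out positive and close to 1, which matches the upward trend in each table.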