Chapter 4: Statistics PDF
Kayseri Ann Eureka Gabriel
Summary
This document provides an introduction to statistics, focusing on measures of central tendency, dispersion, and relative position. It includes examples, definitions, and formulas, suitable for an introductory course or self-study.
Full Transcript
CHAPTER 4: STATISTICS
GE 3 – MATH IN THE MODERN WORLD
PREPARED BY: KAYSERI ANN EUREKA GABRIEL

MEASURES OF CENTRAL TENDENCY
▪ MEAN
▪ MEDIAN
▪ MODE
▪ WEIGHTED MEAN
▪ FDT

MEASURES OF DISPERSION
▪ RANGE
▪ STANDARD DEVIATION
▪ VARIANCE

MEASURES OF RELATIVE POSITION
▪ Z-SCORE
▪ PERCENTILE
▪ QUARTILE

STATISTICS - the science of collecting, analyzing, interpreting, and presenting data to make informed decisions or understand patterns.

4.1. MEASURES OF CENTRAL TENDENCY (AVERAGE)

1. MEAN
- the mean of n numbers is the sum of the numbers divided by n

Mean = Σx / N    or    Mean = Σx / n

SAMPLE vs. POPULATION
μ = Σx / N    (population)
x̅ = Σx / n    (sample)

x̅ – sample mean
μ – population mean
Σ – summation notation
x – data point
Σx – summation of data points
n or N – number of data points

2. MEDIAN
- if n is odd, the median is the middle number in a ranked list
- if n is even, the median is the mean of the two middle numbers

3. MODE
- the data point that occurs most frequently

MEAN vs. MEDIAN vs. MODE
data set: salaries of 5 employees
$370,000
$60,000
$36,000
$20,000
$20,000

Mean: _____
Median: _____
Mode: _____

WHEN TO USE:
MEAN: when all values in a dataset contribute equally to the final result, and there are no extreme outliers.
MEDIAN: when you have a dataset with outliers or a skewed distribution.
MODE: when you're interested in the most frequent value in a dataset.

WEIGHTED MEAN
- used when some data values are more important than others

Weighted mean = Σ(x·w) / Σw

Illustration: Use the weighted mean formula to find Joey’s GWA for the first semester.

Course       Course Unit   Grade
English      2             92
History      2             90
Chemistry    3             88
Algebra      3             89

FREQUENCY DISTRIBUTION TABLE
- a table that lists observed events and the frequency of occurrence of each observed event

Mean = Σ(x·f) / Σf

5 Steps in the construction of a grouped FDT:
1. Identify the largest data value or the maximum (MAX) and smallest data value or the minimum (MIN) from the data set and compute the range, R.
The range is computed as the difference between the largest and smallest values, i.e., R = MAX - MIN.
2. Determine the number of classes (k) using k = √N, where N is the total number of observations in the data set. Round off k to the nearest whole number. Note that the computed k might not be equal to the actual number of classes constructed in an FDT.
3. Calculate the class size, c, using c = R/k. Round off c to the nearest value with the same precision as the raw data.
4. Construct the classes or the class intervals. A class interval is defined by a lower limit (LL) and an upper limit (UL).
   o Start with the LL of the lowest class (LL1): LL1 is the MIN of the data set.
   o The LLs of the succeeding classes are then obtained by adding c to the LL of the preceding class.
   o Then get the UL of the lowest class (UL1): UL1 is obtained by subtracting one unit of measure from the LL of the next class.
   o The ULs of the succeeding classes are then obtained by adding c to the UL of the preceding class.
   o The lowest class should contain the MIN while the highest class should contain the MAX.
5. Tally the data into the classes constructed in Step 4 to obtain the frequency of each class. Each observation must fall in one and only one class.

Simplified steps in the construction of a grouped FDT:
1. Identify the largest (MAX) and smallest (MIN) data values from the data set and compute the range, R.
2. Determine the number of classes, k, using k = √N.
3. Calculate the class size, c, using c = R/k.
4. Construct the classes or the class intervals. A class interval is defined by a lower limit (LL) and an upper limit (UL).
5. Tally the data into the classes constructed in Step 4 to obtain the frequency of each class.
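The five steps above can be sketched in Python. This is only an illustrative sketch, not part of the original slides: the function name grouped_fdt is hypothetical, the data are the ten pulse rates from the illustration in this chapter, the unit of measure is assumed to be 1 (whole bpm), and Python's built-in round() (which rounds exact halves to the nearest even number) stands in for "round off".

```python
import math

# Pulse rates (bpm) from the illustration in this chapter.
pulse_rates = [42, 38, 66, 81, 59, 84, 67, 82, 77, 68]


def grouped_fdt(data):
    """Build a grouped FDT for whole-number data (hypothetical helper)."""
    # Step 1: range R = MAX - MIN
    lowest, highest = min(data), max(data)
    r = highest - lowest
    # Step 2: number of classes k = sqrt(N), rounded to a whole number
    k = round(math.sqrt(len(data)))
    # Step 3: class size c = R / k, rounded to whole units (at least 1)
    c = max(1, round(r / k))
    # Step 4: class intervals; each UL is the next LL minus one unit (1 here).
    # Classes are added until MAX is covered, so the actual number of
    # classes may differ from the computed k.
    classes = []
    ll = lowest
    while ll <= highest:
        classes.append((ll, ll + c - 1))
        ll += c
    # Step 5: tally each observation into exactly one class
    table = [(ll, ul, sum(ll <= x <= ul for x in data)) for ll, ul in classes]
    return r, k, c, table


r, k, c, table = grouped_fdt(pulse_rates)
print(f"R = {r}, k = {k}, c = {c}")
for ll, ul, freq in table:
    print(f"{ll}-{ul}: {freq}")
```

With these data the computed k is 3, but a fourth class is needed to contain the MAX, which illustrates the remark in Step 2 that the computed k might not equal the actual number of classes.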
Illustration: Create an FDT for the pulse rates given below:

No.   PULSE RATE (bpm)
1     42
2     38
3     66
4     81
5     59
6     84
7     67
8     82
9     77
10    68

Illustration: Identify the following, then create a grouped FDT for the pulse rates.
MIN: _________
MAX: _________
RANGE: _________
Number of classes, k: _________
Class size, c: _________
LL of the lowest class (LL1): _________
LL of the second class (LL2): _________
UL of the lowest class (UL1): _________

Frequency Distribution Table
Pulse Rate Range      Frequency

4.2. MEASURES OF DISPERSION

1. RANGE
- the difference between the greatest data value and the least data value

2. STANDARD DEVIATION
- shows how much the values in the data differ from the mean
- if the standard deviation is small, the numbers are close to the average. If it is large, the numbers are more spread out.

σ = √( Σ(x − μ)² / N )        (population)
s = √( Σ(x − x̅)² / (n − 1) )  (sample)

Procedure for computing a standard deviation:
1. Determine the mean of the data set.
2. For each data point, calculate its deviation (difference) from the mean.
3. Calculate the square of each deviation, and find the sum of these squared deviations.
4. If the data is a population, divide the sum by N. If the data is a sample, divide the sum by n − 1.
5. Find the square root of the quotient in Step 4.

3. VARIANCE
- the square of the standard deviation
- σ² (population)
- s² (sample)

Illustration: Compute the standard deviation and variance of the pulse rates.

No.   PULSE RATE (bpm), x   x − μ   (x − μ)²
1     42
2     38
3     66
4     81
5     59
6     84
7     67
8     82
9     77
10    68

4.3. MEASURES OF RELATIVE POSITION

1. z-SCORES
- the z-score for a given data point x is the number of standard deviations that x is above or below the mean of the data

z_x = (x − μ) / σ    (population)
z_x = (x − x̅) / s    (sample)

2. PERCENTILES
- a data point x is called the pth percentile of a data set provided p% of the data values are less than x

Percentile of score x = (number of data points less than x / total number of data points) ∙ 100

3. QUARTILES
- Q1 is the first quartile, and the median of the data points less than Q2
- Q2 is the second quartile, and the median of the whole data set
- Q3 is the third quartile, and the median of the data points greater than Q2

Procedure for finding quartiles:
1. Rank the data set.
2. Find the median of the data set. This is Q2.
3. Find the median of the data points less than Q2. This is Q1.
4. Find the median of the data points greater than Q2. This is Q3.

Illustration: Using the pulse rate data below, find the following:
a. z-score for the pulse rate 38 bpm
b. z-score for the pulse rate 84 bpm
c. percent of individuals whose pulse rate is less than 68 bpm
d. Q1, Q2, and Q3

No.   PULSE RATE (bpm)
1     42
2     38
3     66
4     81
5     59
6     84
7     67
8     82
9     77
10    68

μ = Σx / N
x̅ = Σx / n
Mean = Σ(x·f) / Σf
Weighted mean = Σ(x·w) / Σw

R = MAX - MIN
k = √N
c = R/k
LL1 = MIN
LL2 = LL1 + c
UL1 = LL2 - 1
UL2 = UL1 + c

z_x = (x − μ) / σ
z_x = (x − x̅) / s

Percentile of score x = (number of data points less than x / total number of data points) ∙ 100
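
The formulas collected above can be tried out with a short Python sketch. It is offered as an illustration under stated assumptions rather than as the chapter's own method: the helper names (weighted_mean, z_score, percentile_of, quartiles) are hypothetical, the ten pulse rates are treated as a population (as the x − μ column headings suggest), and Python's standard statistics module is used for the mean, median, and population variance.

```python
import math
import statistics

# Pulse rates (bpm) from the illustration, treated here as a population.
pulse_rates = [42, 38, 66, 81, 59, 84, 67, 82, 77, 68]


def weighted_mean(values, weights):
    """Weighted mean = sum(x * w) / sum(w)."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above or below the mean."""
    return (x - mean) / sd


def percentile_of(x, data):
    """Percent of data points strictly less than x."""
    return sum(v < x for v in data) / len(data) * 100


def quartiles(data):
    """Q1, Q2, Q3 using the median-of-halves procedure from this chapter."""
    q2 = statistics.median(data)
    q1 = statistics.median([v for v in data if v < q2])
    q3 = statistics.median([v for v in data if v > q2])
    return q1, q2, q3


# Joey's grades weighted by course units (English, History, Chemistry, Algebra)
gwa = weighted_mean([92, 90, 88, 89], [2, 2, 3, 3])

mu = statistics.mean(pulse_rates)              # population mean
variance = statistics.pvariance(pulse_rates)   # sum of squared deviations / N
sigma = math.sqrt(variance)                    # population standard deviation

print(f"GWA = {gwa}")
print(f"mean = {mu:.1f}, variance = {variance:.2f}, sd = {sigma:.2f}")
print(f"z(38) = {z_score(38, mu, sigma):.2f}")
print(f"z(84) = {z_score(84, mu, sigma):.2f}")
print(f"percent below 68 bpm = {percentile_of(68, pulse_rates)}")
print(f"Q1, Q2, Q3 = {quartiles(pulse_rates)}")
```

If the pulse rates were instead regarded as a sample, statistics.variance and statistics.stdev (which divide by n − 1) would replace the population versions used here.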