C04-Data-Analysis (1) PDF
Summary
This document introduces the concepts of statistics, including data collection, organization, analysis, and interpretation. It covers different types of variables (qualitative and quantitative, discrete and continuous) and two main divisions of statistics: Descriptive and Inferential Statistics. It also highlights the importance of statistics in various fields, including economics.
Introduction

Statistics is very useful nowadays. It enables researchers to find solutions to problems, whether personal or societal, to interpret the results, and to draw out the implications of these solutions for our everyday lives. These results have led to many improvements and inventions that have had a positive impact on society. Statistics may be used in education, politics, economics, and the like. It also gives us information on trends in society and helps us discover problems that may need immediate solutions.

STATISTICS AND ITS IMPORTANCE

Key Concepts

Statistics is a science that deals with the collection, organization, analysis, and interpretation of data.
- Collection means gathering relevant information or data from the population through a survey, test, interview, experiment, etc.
- Organization or presentation refers to the systematic arrangement of data into textual form, a table, a graph, or a chart.
- Analysis is the careful examination of data, possibly with the use of a statistical tool.
- Interpretation of data is making a generalization or conclusion from the data that have been analyzed.

Population - the group from which data are to be collected.
Sample - a subset of a population.
Variable - a feature or characteristic of any member of a population that differs in quality or quantity from one member to another.
Quantitative variable - a variable differing in quantity. For example, the weight of a person or the number of people in a car.
Qualitative variable - a variable differing in quality or attribute. For example, color or the degree of damage to a car in an accident.
Discrete variable - a variable for which no value may be assumed between two given values, for example, the number of children in a family. Its values are whole numbers and are usually counts of objects.
Continuous variable - a variable for which any value may be assumed between two given values, for example, the length and width of a rectangular table measuring 3.5 meters by 1.75 meters.

Two Divisions of Statistics:

1. Descriptive Statistics: Descriptive statistics deals with the collection of data, its presentation in various forms such as tables, graphs, and diagrams, and the finding of averages and other measures that describe the data.
Example: Industrial statistics, population statistics, trade statistics, etc. Businessmen also use descriptive statistics in presenting their annual reports, final accounts, and bank statements.

2. Inferential Statistics: Inferential statistics deals with techniques used for analyzing data, making predictions and comparisons, and drawing conclusions about a population using information gathered from a representative portion, or sample, of that population.
Example: Suppose we want to have an idea about the percentage of indigents in our country. We take a sample from the population and find the proportion of indigents in the sample. This sample proportion, with the help of probability, enables us to make some inferences about the population proportion.

Worktext: Mathematics in the Modern World 83

Importance of Statistics

Statistics plays a vital role in every field of human activity. Statistical methods are useful tools in aiding research and studies in different fields such as education, economics, social sciences, business, health, and many others. They help provide more critical analyses of information.

Examples:
(1) In Economics: Economics largely depends upon statistics. National income accounts are multipurpose indicators for economists and administrators. Statistical methods are used in the preparation of these accounts.
(2) In Natural and Social Sciences: Statistical methods are commonly used for analyzing experimental results and testing their significance in Biology, Physics, Chemistry, Mathematics, Meteorology, research chambers of commerce, Sociology, Business, Public Administration, Communication and Information Technology, etc.

MEASURES OF CENTRAL TENDENCY

A measure of central tendency, or measure of central location, is a summary measure that describes a whole set of data. It is a single number that indicates the center of a collection of data. The most commonly used measures of central tendency are the mean, median, and mode.

A. Mean, Median and Mode of Ungrouped Data

MEAN ($\bar{x}$)

The mean, also called the "average" or "arithmetic mean", is the most commonly used measure of central tendency. It is said to be the most reliable measure of central tendency. To calculate the mean, add all the numbers in a set and then divide the sum by the total count of numbers.

Properties of Mean
1. A set of data has only one mean.
2. All values in the data set are included in computing the mean.
3. It is very useful in comparing two or more data sets.
4. It is affected by extremely small or large values in a data set.
5. It is appropriate for symmetrical data.

Mean: $\bar{x} = \dfrac{\text{Sum of all values}}{\text{Number of values}}$

Sample Mean: $\bar{x} = \dfrac{\sum x}{n}$

Population Mean: $\mu = \dfrac{\sum x}{N}$

Where:
$\bar{x}$ - sample mean (read as "x bar")
$\mu$ - population mean (read as "mu")
$x$ - the value of any particular observation or measurement
$\sum x$ - sum of all values
$n$ - total number of values in the sample
$N$ - total number of values in the population

Illustrative Examples:
1. Jean has been working part-time at a fast-food company. The following numbers represent the number of hours Jean has worked at this fast-food company for each of the past 8 months: 30, 45, 43, 60, 71, 82, 71, 83. What is the mean (average) number of hours that Jean worked at this company?
Solution:
Step 1: Add the numbers to determine the total number of hours Jean worked.
30 + 45 + 43 + 60 + 71 + 82 + 71 + 83 = 485
Step 2: Divide the total by the number of months.
$\dfrac{485}{8} = 60.625 \approx 60.63$ hours/month
The average number of hours Jean worked at the fast-food company is 60.63 hours/month.

2. Joseph operates Technology Giant, a website service that employs 8 people. Find the mean age (in years) of his workers if the ages of the employees are as follows: 26, 23, 30, 25, 29, 33, 38, 35
Solution:
Step 1: Add the numbers to determine the total age of the workers.
26 + 23 + 30 + 25 + 29 + 33 + 38 + 35 = 239
Step 2: Divide the total by the number of workers.
$\dfrac{239}{8} = 29.875$ or 30 years
The average age of Joseph's workers is 30 years old.

Weighted Mean/Average

The weighted mean/average is particularly useful when various classes or groups contribute differently to the total. The weighted mean/average may be calculated by using the following three-step procedure:
1. multiply each value by its corresponding weight;
2. find the sum of those products; and
3. divide that sum by the sum of the weights.

The following formula expresses the procedure:

$\bar{x} = \dfrac{\sum wx}{\sum w} = \dfrac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n}$

where $w$ represents the weight and $x$ represents the data value.

Illustrative Example:
Rena, a fourth-year student majoring in mathematics, took the following courses with the corresponding units and grades during the first semester of the school year. What is her average grade?
Course Title                                  Unit   Grade
The Teaching Profession                       3      1.4
Field Study 5                                 1      1.3
Field Study 6                                 1      1.5
Special Topic 3                               1      1.8
Calculus 1                                    3      1.7
Calculus 11                                   3      1.8
Seminar on Technology in Mathematics          3      1.2
Abstract Algebra                              3      1.8
Mathematical Investigation and Modelling      3      1.7
TOTAL                                         21

Solution:

$\bar{x} = \dfrac{3(1.4) + 1(1.3) + 1(1.5) + 1(1.8) + 3(1.7) + 3(1.8) + 3(1.2) + 3(1.8) + 3(1.7)}{21}$

$\bar{x} = \dfrac{33.4}{21}$

$\bar{x} \approx 1.59$

The weighted average grade of Rena is 1.59.

MEDIAN ($\tilde{x}$)

The median is the number that falls in the middle position after the data have been organized in an array, in either ascending or descending order.

Properties of Median
1. It is unique; there is only one median for a set of data.
2. It is found by arranging the set of data in ascending or descending order and getting the value of the middle observation.
3. It is not affected by extreme values.

To determine the value of the median for ungrouped data, consider two rules.
1. If n is odd, the median is the middle-ranked value.
2. If n is even, the median is the average of the two middle-ranked values.

Position of the median in the array: $\dfrac{n+1}{2}$

Illustrative Examples:
1. Find the median of the following data: 12, 3, 17, 8, 14, 10, 6
Solution:
Step 1: Organize the data in an array.
3, 6, 8, 10, 12, 14, 17
Step 2: Since the number of data values is odd, the median is the value in the middle position. Here $\frac{n+1}{2} = \frac{7+1}{2} = 4$, so the median is the value found in the fourth position of the array.
3, 6, 8, [10], 12, 14, 17
The median is 10.

2. Find the median of the following data: 7, 9, 3, 4, 15, 2, 8, 6, 2, 4
Solution:
Step 1: Arrange the data in an array.
2, 2, 3, 4, 4, 6, 7, 8, 9, 15
Step 2: Since the number of data values is even, the median is the mean of the values found before and after the $\frac{n+1}{2}$ position.
$\dfrac{n+1}{2} = \dfrac{10+1}{2} = \dfrac{11}{2} = 5.5$
Step 3: The number before the 5.5 position (the 5th value) is 4 and the number after it (the 6th value) is 6. Now find their mean.
2, 2, 3, 4, [4, 6], 7, 8, 9, 15
$\dfrac{4+6}{2} = 5$
The median is 5.
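The ungrouped-data examples so far (Jean's hours, Rena's weighted average, and the two median exercises) can be checked with a short Python sketch. This is only an illustration, not part of the worktext: `weighted_mean` is a hypothetical helper written for this example, while `mean` and `median` come from Python's standard `statistics` module.

```python
from statistics import mean, median


def weighted_mean(values, weights):
    """Hypothetical helper: sum of (weight * value) divided by the sum of weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Jean's hours for the past 8 months (Illustrative Example 1 of the mean)
hours = [30, 45, 43, 60, 71, 82, 71, 83]
mean_hours = mean(hours)  # 485 / 8

# Rena's units (weights) and grades (values), in the order listed in the table
units = [3, 1, 1, 1, 3, 3, 3, 3, 3]
grades = [1.4, 1.3, 1.5, 1.8, 1.7, 1.8, 1.2, 1.8, 1.7]
rena_average = weighted_mean(grades, units)  # 33.4 / 21

# Median with an odd and with an even number of values;
# statistics.median sorts the data and averages the two middle values when n is even
median_odd = median([12, 3, 17, 8, 14, 10, 6])
median_even = median([7, 9, 3, 4, 15, 2, 8, 6, 2, 4])

print(mean_hours, rena_average, median_odd, median_even)
```

The unrounded mean for Jean is 60.625, which the worktext reports as 60.63; rounding is left to the reader because different tools break ties at the third decimal differently.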
MODE ($\hat{x}$)

The mode ($\hat{x}$) is the value in a data set that appears most frequently.

Illustrative Examples:
1. Find the mode of the following data: 76, 81, 76, 80, 76, 83, 77, 79, 82, 76
Solution:
There is no need to organize the data in an array, unless you think that it would be easier to locate the mode if the numbers are in an array. In the above data set, the number 76 appears four times, but all the other numbers appear only once. Since 76 appears with the greatest frequency, it is the mode of the data set.

2. The ages of 12 randomly selected customers at a local 7-Eleven are listed below:
21, 21, 29, 24, 31, 21, 27, 24, 24, 32, 33, 19
What is the mode of the above ages?
Solution:
The above data set has two values that each occur with a frequency of 3. The data set therefore has two modes, 21 and 24, and is called bimodal. All other values occur only once.

3. The coach of a sports team begins to observe the color of the t-shirts his athletes wear. His goal is to find out what color is worn most frequently so that he can offer a common color for uniform shirts to his athletes.
Monday: Green, Blue, Pink, White, Blue, and Blue
Tuesday: Blue, Red, Black, Pink, Green, and Blue
Wednesday: Orange, White, White, Blue, Blue, and Red
Thursday: Brown, Black, Brown, Blue, White, and Blue
Friday: Black, Blue, Red, Blue, Red, and Pink
What is the mode of the colors above?
Solution:
The color blue was worn 11 times during the week. All other colors were worn much less frequently. The mode is blue, so the coach can offer a blue shirt to his athletes.

Name:                    Date:
Program and Section:     Score:

Try this!

Direction: Answer the following.

A. Solve for the mean, median, and mode of the following data sets and interpret the results.
1. 54, 50, 54, 55, 56, 57, 57, 58, 58, 60, 68
2. 45, 48, 52, 46, 41, 26, 36, 34, 38, 41, 39, 38, 30, 49, 46, 55
3. 154, 133, 232, 267, 289, 274, 321, 348, 188, 439

B.
Ben and his friends are comparing the number of times they have been to the movies in the past year. The table below shows how many times each person went to the movie theatre in each month.

          Jan  Feb  Mar  Apr  May  June  July  Aug  Sept  Oct  Nov  Dec
Ben       3    3    2    5    2    3     2     4    2     3    2    2
John      3    2    1    1    1    3     3     3    2     4    1    2
Matthew   1    3    3    2    1    4     5     3    2     2    2    3
Rose      2    2    2    1    3    2     4     1    3     2    3    3

1. By comparing modes, who among the friends has gone to the movies the least per month?
2. By comparing medians, who among the friends has gone to the movies the most per month?
3. Rank the friends in order from most movies seen to least movies seen by comparing their means.
4. By comparing the means of movies seen in each month, which month is the most popular movie-watching month?
5. By comparing medians, which month is the least popular movie-watching month?
6. What is the mean of the medians for each month (the arithmetic average of the medians of the number of movies seen in each month)?

Definition of Terms

Raw Data is the data collected in original form.
Range is the difference between the highest value and the lowest value in the distribution.
Frequency Distribution Table is the organization of data in tabular form, using mutually exclusive classes that show the frequency or count of the occurrences of values in the sample.
Class Interval/width/size ($i$) is the distance between the class lower limit and the class upper limit.
Class Limits are the smallest and largest observations (data, events, etc.) in each class. Hence, each class has two limits: a lower and an upper limit.
Class Boundary or True Class Limit is 0.5 more than an upper class limit and 0.5 less than a lower class limit. Therefore, each class has an upper and a lower class boundary, or true upper and true lower class limit.
Midpoint or Class Mark ($X$) is found by adding the upper and lower class limits of any class and dividing the sum by 2.
Frequency ($f$) is the number of values in a specific class of a frequency distribution table.
Cumulative Frequency ($cf$) is the sum of the frequencies accumulated up to the upper boundary of a class in a frequency distribution table.

Frequency Distribution Table

Illustrative Example:
1. Construct a frequency distribution table for the following total scores in the 1st Quarter Quizzes in a Mathematics class.
118, 123, 128, 129, 130, 130, 133, 124, 125, 127, 136, 138, 141, 141, 149, 154, 150

Solution:
The following steps are involved in the construction of a frequency distribution.

1. Decide the approximate number of classes in which the data are to be grouped. There are no hard and fast rules for the number of classes. In most cases we have 5 to 20 classes. H.A. Sturges provides a formula for determining the approximate number of classes:
$K = 1 + 3.322 \log N$, where $K$ = number of classes and $N$ = the total number of observations
$K = 1 + 3.322 \log 17$
$K = 5.09$
$K \approx 5$

2. Find the range of the data. The range is the difference between the largest and the smallest value.
Range: $R = 154 - 118 = 36$

3. Determine the approximate class interval/width/size ($i$). The class interval is obtained by dividing the range by the number of classes.
$i = \dfrac{\text{Range}}{\text{Number of Classes}}$
Class size, $i = \dfrac{36}{5} = 7.2$
In the case of fractional results, the next higher whole number is taken as the size of the class interval. Class size $i = 7.2$ becomes 8.

4. Decide the starting point. The lower class limit of the first class should cover the smallest value in the raw data. Use the lowest data value as the first lower class limit. The lowest value is 118.

5. Determine the remaining class limits. Once the lowest class limit has been determined, the next lower class limit is found by adding the class width/size to it (118 + 8 = 126).
The remaining lower class limits may be determined by adding the class size repeatedly until the largest value of the data is contained in a class. You can compute the upper class limit by subtracting one from the class width and adding the result to the lower class limit. For example: 118 + (8 - 1) = 125. The classes may be listed in ascending or descending order:

118-125    or    150-157
126-133          142-149
134-141          134-141
142-149          126-133
150-157          118-125

6. Tally the observations or scores in each class, and determine the frequency. The total of the frequencies must be equal to the number of observations.
The scores are: 118, 123, 128, 129, 130, 130, 133, 124, 125, 127, 136, 138, 141, 141, 149, 154, 150

Frequency Distribution Table

Score      Tally       Frequency (f)
118-125    IIII        4
126-133    IIII - I    6
134-141    IIII        4
142-149    I           1
150-157    II          2
Total                  17

By using the frequency distribution table above:

a) What are the lower and the upper class limits of the first two classes?
For the first class, 118-125, the lower class limit is 118 and the upper class limit is 125. For the second class, 126-133, the lower class limit is 126 and the upper class limit is 133.

b) What are the true class limits/class boundaries of the first two classes?
The true lower limit or lower class boundary of the first class, 118-125, is 118 - 0.5 = 117.5, and that of the second class, 126-133, is 126 - 0.5 = 125.5. The true upper limit or upper class boundary of the first class, 118-125, is 125 + 0.5 = 125.5, and that of the second class, 126-133, is 133 + 0.5 = 133.5.

c) What is the class interval/width/size?
The class interval/width/size is the distance between the class lower limit and the class upper limit. It can be obtained by getting the difference of the lower limits or of the upper limits of two succeeding classes.
For the two succeeding classes 118-125 and 126-133, the class width is 126 - 118 = 8 or 133 - 125 = 8.

d) Find the class midpoint or class mark of the first class.
For the first class, 118-125, the class midpoint is

$X = \dfrac{\text{Lower class limit} + \text{Upper class limit}}{2}$

$X = \dfrac{118 + 125}{2}$

$X = 121.5$

B. Mean, Median, Mode of Grouped Data

MEAN ($\bar{x}$) OF GROUPED DATA

Steps in computing the mean of grouped data:
a. Find the midpoint/class mark ($X$) of each class.
b. Multiply the frequency ($f$) of each class by its midpoint ($X$) to get $fX$.
c. Find the sum of $fX$.
d. Find the sum of all the frequencies ($n$).
e. Divide the sum of $fX$ by the sum of the frequencies.

Formula for the Mean ($\bar{x}$) of Grouped Data:

$\bar{x} = \dfrac{\sum fX}{n}$

Illustrative Example:
Nowadays, most people spend their leisure time on Facebook. Fifty-eight students in a class recorded the time they spent on Facebook during their free time. The frequency distribution table below shows the number of minutes they spent on Facebook.

Classes    f     X     fX
5-9        1     7     7
10-14      2     12    24
15-19      2     17    34
20-24      6     22    132
25-29      7     27    189
30-34      10    32    320
35-39      30    37    1,110
           $\sum f = n = 58$    $\sum fX = 1{,}816$

Solution:
Applying the formula,

$\bar{x} = \dfrac{\sum fX}{n} = \dfrac{1{,}816}{58} = 31.31$

Therefore, the mean time spent on Facebook by the 58 students is 31.31 minutes.

MEDIAN ($\tilde{x}$) OF GROUPED DATA

Steps in computing the median for grouped data:
a. Compute the less than cumulative frequency ($<cf$) of the data. The less than cumulative frequency ($<cf$) is obtained by adding the frequencies successively starting from the lowest class.
b. Determine the median class by computing the value of $\frac{n}{2}$.
c. Determine the value of the cumulative frequency before the median class ($<cf$).
d. Determine the true lower class limit $L_{\tilde{x}}$ of the median class.
e. Determine the class width ($i$).
f. Apply the formula.

Formula of Median of Grouped Data:

$\tilde{x} = L_{\tilde{x}} + \left( \dfrac{\frac{n}{2} - {<}cf}{f_m} \right) i$

(Using the same data in the previous lesson.) The data show the time spent by 43 students studying during examinations in their math course. Find the median.
Classes    f    <cf
25-29      3    3
30-34      2    5
35-39      5    10
40-44      8    18
45-49      8    26    (median class)
50-54      8    34
55-59      9    43

Solution:
Follow the steps in determining the median for grouped data.
a) $\dfrac{n}{2} = \dfrac{43}{2} = 21.5$, so the median class is 45-49, the first class whose cumulative frequency reaches 21.5.
b) The cumulative frequency before the median class ($<cf$) is 18. (If the classes are arranged in ascending order, the word "before" refers to the cumulative frequency of the class immediately below the median class.)
c) The frequency of the median class ($f_m$) is 8.
d) The true lower class limit of the median class is $L_{\tilde{x}} = 45 - 0.5 = 44.5$.
e) The class width is $i = 5$.

$\tilde{x} = L_{\tilde{x}} + \left( \dfrac{\frac{n}{2} - {<}cf}{f_m} \right) i$

$\tilde{x} = 44.5 + \left( \dfrac{\frac{43}{2} - 18}{8} \right) 5$

$\tilde{x} = 44.5 + \left( \dfrac{21.5 - 18}{8} \right) 5$

$\tilde{x} = 44.5 + 2.1875$

$\tilde{x} = 46.69$

The median time spent by the students in studying is 46.69 minutes.

MODE ($\hat{x}$) OF GROUPED DATA

Steps in computing the mode for grouped data:
a. Identify the modal class by determining the class with the highest frequency.
b. Determine the true lower limit or class boundary ($L_{\hat{x}}$) of the modal class.
c. Calculate $d_1$, the difference between the frequency of the modal class and the frequency of the class preceding the modal class (one class lower in value).
d. Calculate $d_2$, the difference between the frequency of the modal class and the frequency of the class succeeding the modal class (one class higher in value).
e. Determine the class width/size ($i$).
f. Substitute the values in the formula.

Formula of Mode of Grouped Data:

$\hat{x} = L_{\hat{x}} + \left( \dfrac{d_1}{d_1 + d_2} \right) i$

Illustrative Example:
The data show the time spent of 43 students in studying during examination in their math course. Find the mode and interpret the result.

Classes    f
25-29      3
30-34      2
35-39      5
40-44      7
45-49      6
50-54      8
55-59      9    (modal class)

Solution:
a) 55-59 is the modal class.
b) The lower boundary of the modal class is $L_{\hat{x}} = 55 - 0.5 = 54.5$.
c) $d_1 = 9 - 8$, $d_1 = 1$
d) $d_2 = 9 - 0$, $d_2 = 9$ (there is no class above the modal class, so its frequency is taken as 0)
e) $i = 5$
f) Substitute the values in the formula.
$\hat{x} = L_{\hat{x}} + \left( \dfrac{d_1}{d_1 + d_2} \right) i$

$\hat{x} = 54.5 + \left( \dfrac{1}{1 + 9} \right) 5$

$\hat{x} = 54.5 + 0.5 = 55$

Therefore, most students spent about 55 minutes studying during their exam in math.

Note: The above formula for finding the mode of grouped data applies only to a unimodal distribution.

OTHER MEASURES OF RELATIVE POSITION

Measures of relative position or location, also called quantiles, are used to partition or divide an ordered data set (array) into equal parts, as the median does. The common measures of relative position are the quartiles, deciles, and percentiles.

The median divides the ordered data into 2 equal parts, while quartiles divide a data set into four equal parts. There are three quartiles: Quartile 1 ($Q_1$), also called the lower quartile, is the value below which 25% of the data lie; Quartile 2 ($Q_2$), which is equivalent to the median, is the value below which 50% of the data lie; and Quartile 3 ($Q_3$), also called the upper quartile, is the value below which 75% or three-fourths of the data lie.

Deciles divide the ordered data set into ten equal parts. There are 9 deciles, denoted by $D_1, D_2, \ldots, D_9$. Decile 1, or $D_1$, is the value below which 10% of the data lie.

Percentiles divide the ordered data set into one hundred equal parts. There are 99 percentiles, denoted by $P_1, P_2, \ldots, P_{99}$. Percentile 1, or $P_1$, is the value below which 1% of the data lie.

Interquartile range (IQR) = Upper quartile - Lower quartile = $Q_3 - Q_1$

Note: A quantile is a number or cut-off, and not a range of values.

The figure below illustrates the relationship of the quantiles in a given distribution.
[Figure: quantiles marked on a distribution, with $Q_1 = P_{25}$, $Q_2 = P_{50} = D_5 = \tilde{x}$, and $Q_3 = P_{75}$]

The formulas are as follows:

Quartile
Ungrouped data: $Q_k = \dfrac{k(N+1)}{4}$ (the position of the $k$th quartile in the ordered data)
Grouped data: $Q_k = LB_{Q_k} + \left( \dfrac{\frac{kN}{4} - {<}CF_b}{cf_{Q_k}} \right) i$, or, in the notation of the median formula, $Q_k = L + \left( \dfrac{\frac{kn}{4} - {<}cf}{f_m} \right) i$
Where:
$Q_k$ - $k$th quartile
$LB_{Q_k}$ - lower class boundary of the $k$th quartile class
$<CF_b$ - less than cumulative frequency below the $k$th quartile class
$cf_{Q_k}$ - frequency of the $k$th quartile class

Decile
Ungrouped data: $D_k = \dfrac{k(N+1)}{10}$ (the position of the $k$th decile in the ordered data)
Grouped data: $D_k = LB_{D_k} + \left( \dfrac{\frac{kN}{10} - {<}CF_b}{cf_{D_k}} \right) i$
Where:
$D_k$ - $k$th decile
$LB_{D_k}$ - lower class boundary of the $k$th decile class
$<CF_b$ - less than cumulative frequency below the $k$th decile class
$cf_{D_k}$ - frequency of the $k$th decile class

Percentile
Ungrouped data: $P_k = \dfrac{k(N+1)}{100}$ (the position of the $k$th percentile in the ordered data)
Grouped data: $P_k = LB_{P_k} + \left( \dfrac{\frac{kN}{100} - {<}CF_b}{cf_{P_k}} \right) i$
Where:
$P_k$ - $k$th percentile
$LB_{P_k}$ - lower class boundary of the $k$th percentile class
$<CF_b$ - less than cumulative frequency below the $k$th percentile class
$cf_{P_k}$ - frequency of the $k$th percentile class

Illustrative Example:
The monthly salaries in pesos of 16 DepEd Elementary teachers are as follows:

Teacher    Salary     Teacher    Salary
1          30,531     9          22,216
2          32,469     10         32,072
3          36,942     11         21,038
4          20,754     12         21,038
5          23,222     13         21,038
6          21,327     14         20,754
7          37,400     15         20,754
8          45,269     16         20,754

Find the a) lower quartile ($Q_1$), b) 7th decile, and c) 30th percentile.

Solution:
First, arrange the observations in an array.

Position    Salary
16          45,269
15          37,400
14          36,942
13          32,469
12          32,072
11          30,531    D7
10          23,222
9           22,216
8           21,327
7           21,038
6           21,038
5           21,038    P30
4           20,754    Q1
3           20,754
2           20,754
1           20,754

a) lower quartile ($Q_1$)
Substitute the values in the formula:

$Q_k = \dfrac{k(N+1)}{4}$

$Q_1 = \dfrac{1(16+1)}{4}$

$Q_1 = \dfrac{17}{4} = 4.25$

The 4th observation or item in the table is Php 20,754. Therefore, 25% of the 16 DepEd Elementary teachers have salaries that are at or below Php 20,754.
b) 7th decile
Using the formula:

$D_k = \dfrac{k(N+1)}{10}$

$D_7 = \dfrac{7(16+1)}{10} = \dfrac{7(17)}{10} = \dfrac{119}{10}$

$D_7 = 11.9$

The 11th observation or item, which is Php 30,531, shows that 70% of the 16 DepEd Elementary teachers have salaries that are at or below Php 30,531.

c) 30th percentile
Using the formula:

$P_k = \dfrac{k(N+1)}{100}$

$P_{30} = \dfrac{30(16+1)}{100}$

$P_{30} = \dfrac{30(17)}{100} = \dfrac{510}{100}$

$P_{30} = 5.10$

The 5th observation or item, which is Php 21,038, implies that 30% of the 16 DepEd Elementary teachers have salaries that are at or below Php 21,038.

QUANTILES FOR GROUPED DATA

Illustrative Examples:
The data show the time spent on Facebook by 43 students. Find: (a) Quartile 1, (b) Decile 2, and (c) Percentile 52.

Class interval    f    <cf
25-29             3    3
30-34             2    5
35-39             5    10    (Decile 2)
40-44             8    18    (Quartile 1)
45-49             8    26    (Percentile 52)
50-54             8    34
55-59             9    43

Solution:
a) Solving for the 1st quartile, $Q_1$
Quartile 1 is the value on or below which one-fourth (25%) of the data fall. Replace $\frac{n}{2}$ by $0.25n$ or $\frac{1(n)}{4}$ in the formula of the median.
First solve $0.25n$ or $\frac{1(n)}{4}$ and locate its position in the $<cf$ column to find the location of $Q_1$. So, $\dfrac{1(n)}{4} = \dfrac{43}{4} = 10.75$
𝑛 0.25𝑛−