Stat-213 Chapter 2 PDF
University of Southern Mindanao
Summary
This document is a chapter on frequency distributions and graphs. It covers the basics of organizing data into tables and constructing various types of graphs, including bar graphs, pie charts, histograms, and frequency polygons. The chapter also introduces concepts like class limits, class boundaries, class widths, and how to calculate frequencies.
FREQUENCY DISTRIBUTIONS AND GRAPHS

Intended Learning Outcomes:
1. Discuss the frequency distribution table and different types of graphs.
2. Construct a frequency distribution table and different types of graphs.

This chapter explains how to organize and display data using tables and graphs. We will learn how to prepare frequency distribution tables for qualitative and quantitative data and how to construct bar graphs, pie graphs, histograms, and frequency polygons.

FREQUENCY DISTRIBUTIONS

The most convenient way of organizing data is to construct a frequency distribution. A frequency distribution is a collection of observations sorted into classes, showing the frequency (number of occurrences) in each class. There are three basic types of frequency distribution: categorical, ungrouped, and grouped.

The categorical frequency distribution is used for data that can be placed in specific categories, such as nominal- or ordinal-level data.

Example 1
The following data give the results of a sample survey. The letters A, B, and C represent the three categories.

A B A A C C A C C C
C B C C B C B B B C
B C C A C C C B C A

Construct a frequency distribution table for these data.

Solution
The categories are the letters. Record these categories in the first column. Then read each result from the given data and mark a tally, denoted by “|”, in the second column next to the corresponding category. The tallies are marked in blocks of five for counting convenience. Lastly, record the number of tallies for each category in the third column. This column is called the frequency column.

Category   Tally                Frequency (f)
A          ||||| |              6
B          ||||| ||||           9
C          ||||| ||||| |||||    15
                                sum = 30

The sum of the entries in the frequency column gives the sample size, or total frequency.

When observations are sorted into classes of single values, the result is called a frequency distribution for ungrouped data.
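The hand tally of Example 1 can also be reproduced programmatically. The following is a minimal Python sketch, not part of the original chapter; the names `responses` and `frequency_table` are chosen here purely for illustration.

```python
from collections import Counter

# Survey responses from Example 1, copied in the order they are listed.
responses = (
    "A B A A C C A C C C "
    "C B C C B C B B B C "
    "B C C A C C C B C A"
).split()


def frequency_table(data):
    """Return (category, tally, frequency) rows, one per category."""
    counts = Counter(data)
    rows = []
    for category in sorted(counts):
        f = counts[category]
        # Tally marks in blocks of five, e.g. 6 -> "||||| |".
        blocks = ["|||||"] * (f // 5)
        if f % 5:
            blocks.append("|" * (f % 5))
        rows.append((category, " ".join(blocks), f))
    return rows


table = frequency_table(responses)
for category, tally, f in table:
    print(f"{category:<10}{tally:<22}{f}")
print("sum =", sum(f for _, _, f in table))
```

The frequencies add up to the sample size, 30. Counting the letters exactly as they are listed in the example gives 6, 8, and 16 for A, B, and C, which is one B fewer and one C more than the hand tally, so a single letter in either the listing or the table appears to have been miscopied.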
When observations are sorted into classes of more than one value, the result is called a frequency distribution for grouped data.

The following are the basic terms associated with frequency tables.

Lower class limit – the smallest data value that can be included in the class.
Upper class limit – the largest data value that can be included in the class.
Class boundaries – the values used to separate the classes so that there are no gaps in the frequency distribution.
Class marks – the midpoints of the classes:

\[ X_m = \frac{\text{lower limit} + \text{upper limit}}{2} \]

Class width – the difference between two consecutive lower class limits. For example, a distribution whose first two lower class limits are 100 and 105 has a class width of 5 (105 − 100 = 5).

The following are the steps in constructing a frequency table.

Step 1  Decide on the number of classes your frequency table will have. Usually, it is between 5 and 20.
Step 2  Find the range. This is the difference between the highest and lowest scores.
Step 3  Find the class width. Divide the range by the number of classes and round up. The class width should be an odd number. This ensures that the midpoint of each class has the same place value as the data.
Step 4  Select a starting point for the first lower class limit, either the lowest score or a convenient value below it. Add the class width to the starting point to get the second lower class limit, and so on. Then enter the upper class limits.
Step 5  Find the boundaries by subtracting 0.5 from each lower class limit and adding 0.5 to each upper class limit (for whole-number data).
Step 6  Represent each score by a tally.
Step 7  Count the total frequency for each class.

Example 2
When 40 people were surveyed at Greenbelt 3, they reported the distance they drove to the mall, and the results (in kilometers) are given below.

 2   8   1   5   9   5  14  10  31  20
15   4  10   6   5   5   1   8  12  10
25  40  31  24  20  20   3   9  15  15
25   8   1   1  16  23  18  25  21  12

Construct a frequency distribution table.

Solution
Follow the steps:

Step 1  The number of classes is 8 (chosen arbitrarily).

Step 2  Range = highest − lowest = 40 − 1 = 39

Step 3  \[ \text{class width} = \frac{R}{\text{number of classes}} = \frac{39}{8} = 4.875 \approx 5 \]

Step 4  Determine the lower class limits: 1, 6, 11, 16, 21, 26, 31, 36.
Subtract 1 unit from the lower class limit of the second class to obtain the upper limit of the first class: 6 − 1 = 5. Then add the class width to get the succeeding upper class limits.

Class limits
1 – 5
6 – 10
11 – 15
16 – 20
21 – 25
26 – 30
31 – 35
36 – 40

Step 5  Determine the class boundaries.

Class limits   Class boundaries
1 – 5          0.5 – 5.5
6 – 10         5.5 – 10.5
11 – 15        10.5 – 15.5
16 – 20        15.5 – 20.5
21 – 25        20.5 – 25.5
26 – 30        25.5 – 30.5
31 – 35        30.5 – 35.5
36 – 40        35.5 – 40.5

Step 6  Tally the scores.

Class limits   Class boundaries   Tally
1 – 5          0.5 – 5.5          ||||| ||||| |
6 – 10         5.5 – 10.5         ||||| ||||
11 – 15        10.5 – 15.5        ||||| |
16 – 20        15.5 – 20.5        |||||
21 – 25        20.5 – 25.5        ||||| |
26 – 30        25.5 – 30.5
31 – 35        30.5 – 35.5        ||
36 – 40        35.5 – 40.5        |

Step 7  Make the frequency distribution table.

Class limits   Class boundaries   Tally            Frequency
1 – 5          0.5 – 5.5          ||||| ||||| |    11
6 – 10         5.5 – 10.5         ||||| ||||       9
11 – 15        10.5 – 15.5        ||||| |          6
16 – 20        15.5 – 20.5        |||||            5
21 – 25        20.5 – 25.5        ||||| |          6
26 – 30        25.5 – 30.5                         0
31 – 35        30.5 – 35.5        ||               2
36 – 40        35.5 – 40.5        |                1

A variation of the standard frequency table is used when cumulative totals are desired. In a table whose classes are in increasing order, the cumulative frequency of a class is the sum of the frequencies of that class and all previous classes.

Class limits   Class boundaries   Class midpoints   Tally            Frequency   Cumulative frequency
1 – 5          0.5 – 5.5          3                 ||||| ||||| |    11          11
6 – 10         5.5 – 10.5         8                 ||||| ||||       9           20
11 – 15        10.5 – 15.5        13                ||||| |          6           26
16 – 20        15.5 – 20.5        18                |||||            5           31
21 – 25        20.5 – 25.5        23                ||||| |          6           37
26 – 30        25.5 – 30.5        28                                 0           37
31 – 35        30.5 – 35.5        33                ||               2           39
36 – 40        35.5 – 40.5        38                |                1           40

Note: When constructing frequency tables,
1. the classes must be mutually exclusive; each score must belong to only one class.
2. include all classes, even if their frequency is zero.
3. make sure that all classes have the same width.
4. try to select convenient numbers for class limits.
5. keep the number of classes between 5 and 20.

In a table whose classes are in decreasing order, the cumulative frequency of a class is the sum of the frequencies of that class and all succeeding classes, as in the next example.

Example 3
Construct a grouped frequency table for the given data below.

112  100  127  120  134  118  105  110  109  112
110  118  117  116  118  122  114  114  105  109
107  112  114  115  118  117  118  122  106  110
116  108  110  121  113  120  119  111  104  111
120  113  120  117  105  110  118  112  114  114

Solution
Following the steps discussed before, the frequency distribution of the given data is shown below. The classes are arranged in decreasing order.

Class limits   Class boundaries   Class midpoints   Frequency   Cumulative frequency
130 – 134      129.5 – 134.5      132               1           50
125 – 129      124.5 – 129.5      127               1           49
120 – 124      119.5 – 124.5      122               7           48
115 – 119      114.5 – 119.5      117               13          41
110 – 114      109.5 – 114.5      112               18          28
105 – 109      104.5 – 109.5      107               8           10
100 – 104      99.5 – 104.5       102               2           2

Example 4
Students who applied for a scholarship at a certain university were classified according to their class rank: F – freshman, S – sophomore, J – junior, Se – senior. Construct a frequency distribution for the data.

F   S   J   Se  F   F   F   S   S   J
Se  J   J   J   Se  Se  F   Se  Se  F
Se  F   F   J   J   S   J   S   F   F

Solution
Since there are 4 categories of students, there will be 4 classes, namely Freshman, Sophomore, Junior, and Senior.

Category    Tally          Frequency (f)
Freshman    ||||| |||||    10
Sophomore   |||||          5
Junior      ||||| |||      8
Senior      ||||| ||       7

HISTOGRAM, FREQUENCY POLYGONS, AND OGIVES

After the data have been organized, they can be presented in graphic form. Most people find the meaning of data easier to comprehend when it is presented graphically rather than numerically.
The three most common graphs are (a) the histogram, (b) the frequency polygon, and (c) the cumulative frequency graph, or ogive.

Histogram
This is a graph that displays the data by using vertical bars of various heights to represent the frequencies. To draw a histogram, first mark the classes on the horizontal axis and the frequencies on the vertical axis. Next, draw a bar for each class so that its height represents the frequency of that class. The bars are drawn adjacent to each other.

Frequency Polygon
This graph displays the data by using lines that connect points plotted for the frequencies at the midpoints of the classes. To draw a frequency polygon, mark a dot above the midpoint of each class at a height equal to the frequency of that class. Next, mark two more classes, one at each end, and mark their midpoints. Note that these two classes have zero frequencies. Lastly, join the consecutive dots with straight lines.

Ogive
This is a graph that represents the cumulative frequencies of the classes. To draw an ogive, mark the class boundaries on the horizontal axis and the cumulative frequencies on the vertical axis. Plot the cumulative frequencies at each upper class boundary. Upper class boundaries are used since the cumulative frequencies represent the number of observations accumulated up to the upper boundary of each class.

Example 1
Using the frequency distribution given in Example 2 of Section 2.1, construct the following:
a) a histogram
b) a frequency polygon
c) an ogive

Solution
[Figures: a) histogram, b) frequency polygon, c) ogive]

OTHER TYPES OF GRAPHS

Pareto Graph
It is used to represent a frequency distribution for categorical or qualitative data, and the frequencies are displayed by the heights of vertical bars.

Time Series Graph
It is used to represent data that occur over a specific period of time.

Pie Graph
It is a circle that is divided into sections or wedges according to the percentage of frequencies in each category of the distribution.
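Each wedge of a pie graph takes the same share of the 360° circle as its category takes of the total frequency, so the angle is f/n × 360° and the percentage is f/n × 100%. As a rough illustration, the Python sketch below applies that conversion to the vacation survey counts used in the example that follows; the function name `pie_sectors` is hypothetical and introduced only for this sketch.

```python
def pie_sectors(frequencies):
    """Map each category to (degrees, percent) of the pie graph."""
    n = sum(frequencies.values())  # n = sum of the frequencies
    return {
        place: (f * 360 / n, f * 100 / n)
        for place, f in frequencies.items()
    }


# Counts from the vacation survey of 500 families.
survey = {"Davao": 50, "Boracay": 200, "Palawan": 125, "Tagaytay": 90, "Baguio": 35}

sectors = pie_sectors(survey)
for place, (degrees, percent) in sectors.items():
    print(f"{place:<10}{degrees:6.1f} degrees {percent:5.1f} %")
```

A quick check on any pie graph computed this way is that the angles add up to 360° and the percentages to 100%.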
Example 1
In a survey, 500 families were asked the question “Where are you planning to spend your vacation this summer?” It resulted in the following distribution.

Place       Number of People
Davao       50
Boracay     200
Palawan     125
Tagaytay    90
Baguio      35

Construct a pie graph for the data and summarize the results.

Solution
Step 1  Since there are 360° in a circle, the frequency of each class must be converted into a proportional part of the circle. This conversion is done by using the formula

\[ \text{degrees} = \frac{f}{n}\,(360^\circ) \]

where f is the frequency of the class and n is the sum of the frequencies.

Davao:      50/500 × 360° = 36°
Boracay:    200/500 × 360° = 144°
Palawan:    125/500 × 360° = 90°
Tagaytay:   90/500 × 360° = 64.8°
Baguio:     35/500 × 360° = 25.2°

Step 2  Convert each frequency to a percentage.

\[ \% = \frac{f}{n}\,(100\%) \]

Davao:      50/500 × 100% = 10%
Boracay:    200/500 × 100% = 40%
Palawan:    125/500 × 100% = 25%
Tagaytay:   90/500 × 100% = 18%
Baguio:     35/500 × 100% = 7%

Step 3  Using a protractor, draw the graph and label each section with the name and percentage.

FREQUENCY DISTRIBUTION TABLE AND HISTOGRAM IN EXCEL
https://www.statisticshowto.com/frequency-distribution-table-in-excel/
https://www.youtube.com/watch?v=Giewd9yH4q0

A frequency distribution table in Excel gives you a snapshot of how your data is spread out. It’s usual to pair a frequency distribution table with a histogram. A histogram gives you a graph to go with the table. In order to make a frequency distribution table in Excel with a histogram, you must have the Data Analysis Toolpak installed.

Example Problem: Make a frequency distribution table in Excel. Use the following IQ scores: 99, 101, 121, 132, 140, 155, 98, 90, 100, 111, 115, 116, 121, 124.

Step 1: Type your data into a worksheet. Make sure you put your data into columns. Use column headers. For this example, type “IQ Scores” into cell A1. Then type the IQ scores into cells A2 to A15. Note: Column headers will become the labels on the histogram.
Step 2: Type the upper limits for your bins into a separate column. For this sample problem, type 99, 109, 119, 129, 139, and 149 as your upper limits into column C. Note that I “missed” the top value of 155. You’ll see what Excel does with the “outlier” in the last step.
Step 3: Make a column of labels so it’s clear which bins the upper limits belong to.
Step 4: Click the “Data” tab. Then click “Data Analysis”. If you don’t see Data Analysis, make sure you have installed the Data Analysis Toolpak.
Step 5: Click “Histogram” and then click “OK.”
Step 6: Type the location of your data into the “Input Range” text box. For this sample problem, type “A2:A15”.
Step 7: Type the location of your upper limits into the “Bin Range” text box. For this sample problem, type “C2:C7”.
Step 8: Select a location where you want your output to appear. For example, click the “New Worksheet” button.
Step 9: Click “Chart Output” and then click “OK.” Excel will put the histogram next to your frequency table.

Note that the item I missed (155) has been automatically placed in the chart, in the bin labeled “More”. Of course, if you want this upper bin to be labeled, you can always add a new bin (150-159) and redo the chart!

FREQUENCY POLYGON IN EXCEL
https://www.hopewell.k12.pa.us/Downloads/polygonexcel.pdf
https://www.youtube.com/watch?v=VDQlqenGlsQ

1. Start Excel.
2. Label cells A1 and B1 Midpoints and Frequency, respectively.
3. Enter your data in the Midpoints and Frequency columns.
4. Once your data is in Excel, select your frequencies, including the label.
5. Select INSERT from the top toolbar.
6. Click on Insert Line Chart, and select Line with Markers from the 2-D Line section.
7. A line chart of the frequencies should now appear. Right-click on any region of the graph and choose Select Data.
8. Select Edit under Horizontal Axis Labels and highlight the midpoints (excluding the header) from column A, then click [OK].
9. Click [OK] on the Select Data Source box.
By now the chart should show the class midpoints along the horizontal axis.
10. At the top toolbar select Chart Tools, then Quick Layout. Choose Layout 10 and add the axis labels and title for your graph.
11. The finished graph is a frequency polygon with labeled axes and a title.

OGIVE IN EXCEL
https://www.statology.org/ogive-excel/
https://www.youtube.com/watch?v=b4uxatdMYuM

An ogive is a graph that shows how many data values lie above or below a certain value in a dataset. This tutorial explains how to create an ogive in Excel.

Example: How to Create an Ogive in Excel
Perform the following steps to create an ogive for a dataset in Excel.

Step 1: Enter the data. Enter the data values in a single column.
Step 2: Define the class limits. Next, define the class limits you’d like to use for the ogive. I’ll choose class widths of 10.
Step 3: Find class frequencies. Next, calculate the frequency for the first class with a formula (see the linked tutorial for the formula), then copy this formula to the rest of the classes.
Step 4: Find cumulative frequencies. Next, calculate the cumulative frequency for each class (see the linked tutorial for the formulas).
Step 5: Create the ogive graph. To create the ogive graph, hold down CTRL and highlight columns D and F. Along the top ribbon in Excel, go to the Insert tab, then the Charts group. Click Scatter Chart, then click Scatter with Straight Lines and Markers. This will automatically produce the ogive graph.

Feel free to modify the axes and the title to make the graph more aesthetically pleasing.
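The frequency and cumulative frequency columns needed for an ogive can also be computed outside Excel. As a minimal sketch, the Python code below rebuilds the grouped table of Example 2 (driving distances, eight classes of width 5 starting at 1) and lists the points to plot. The function name `grouped_table` and its parameters are assumptions made for this illustration, and the code assumes whole-number data, so that each boundary lies 0.5 beyond its class limit.

```python
from itertools import accumulate

# Distances (km) reported by the 40 people in Example 2.
distances = [
    2, 8, 1, 5, 9, 5, 14, 10, 31, 20,
    15, 4, 10, 6, 5, 5, 1, 8, 12, 10,
    25, 40, 31, 24, 20, 20, 3, 9, 15, 15,
    25, 8, 1, 1, 16, 23, 18, 25, 21, 12,
]


def grouped_table(data, start, width, num_classes):
    """Build the rows of a grouped frequency table for whole-number data."""
    lowers = [start + i * width for i in range(num_classes)]
    uppers = [low + width - 1 for low in lowers]
    # Count the scores falling between each pair of class limits.
    freqs = [sum(low <= x <= up for x in data) for low, up in zip(lowers, uppers)]
    # Running total of the frequencies, classes in increasing order.
    cum_freqs = list(accumulate(freqs))
    return [
        {
            "limits": (low, up),
            "boundaries": (low - 0.5, up + 0.5),
            "midpoint": (low + up) / 2,
            "frequency": f,
            "cumulative": cf,
        }
        for low, up, f, cf in zip(lowers, uppers, freqs, cum_freqs)
    ]


table = grouped_table(distances, start=1, width=5, num_classes=8)

# Ogive points: (upper class boundary, cumulative frequency).
ogive_points = [(row["boundaries"][1], row["cumulative"]) for row in table]

for row in table:
    print(row["limits"], row["boundaries"], row["midpoint"],
          row["frequency"], row["cumulative"])
```

Joining `ogive_points` with straight line segments gives the ogive. The last cumulative frequency equals the sample size, 40, which is a convenient check on the table.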