Chapter 4: Collection and Presentation of Data
College of Engineering, Department of Electrical Engineering
Summary
This document covers various methods for collecting and presenting data in research, including sampling techniques (simple random, cluster, stratified, systematic) and data presentation methods (textual, tabular, graphical). It also describes different methods of collecting data, such as direct/interview, questionnaire, registration, and experimental. It is a chapter from a textbook on a subject in the field of statistics and data analysis.
CHAPTER 4
Collection and Presentation of Data

SAMPLING TECHNIQUES

The value of a statistic depends on the particular sample selected from the population and changes from sample to sample; that is, its value is subject to sampling variability. To ensure the validity of the conclusions or inferences drawn from the sample about the population, sampling techniques are employed.

Simple Random Sampling

A simple random sample is a subset of individuals chosen at random from a larger set, with every individual having the same probability of being chosen. The simplest method of random sampling is the lottery. The names of the persons or objects in the population are written on small slips of paper, which are then jumbled thoroughly. Without looking at the slips, draw as many as the desired sample size. In this procedure, each item has an equal chance of being chosen for the sample.

Cluster Sampling

If the population is spread out over a wide area and a complete list of its members is not available, this technique is the most suitable. In this kind of sampling, the total population is divided into a number of relatively small areas, and some of these areas, or clusters, are randomly selected for inclusion in the overall sample. As an example, suppose that the Dean of Student Affairs of the University wants to know how fraternity men at the school feel about a certain new regulation. He can take a cluster sample by interviewing some or all of the members of several randomly selected fraternities.

Stratified Random Sampling

To avoid bias in the selection of samples, stratified random sampling is used, wherein the population is divided into categories or strata. From each of these divisions, members are drawn in proportion to the size of the stratum.
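The three procedures described so far can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the function names, the student list, the fraternity names, and the year-level strata are all hypothetical, and the stratified draw assumes that "proportionate to the stratum" means proportional to each stratum's size.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Lottery method: draw n distinct items, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and keep all members of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

def stratified_sample(strata, n, seed=None):
    """Draw from every stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; rounding may make the total differ slightly from n.
        k = round(n * len(members) / total)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical data, for illustration only.
students = [f"student_{i}" for i in range(1, 101)]
fraternities = {
    "Alpha": ["a1", "a2", "a3"],
    "Beta": ["b1", "b2"],
    "Gamma": ["g1", "g2", "g3", "g4"],
    "Delta": ["d1", "d2"],
}
year_levels = {
    "first_year": students[:60],
    "second_year": students[60:90],
    "third_year": students[90:],
}

print(simple_random_sample(students, 5, seed=1))
print(cluster_sample(fraternities, 2, seed=1))
print({k: len(v) for k, v in stratified_sample(year_levels, 10, seed=1).items()})
```

With strata of 60, 30, and 10 students and a sample size of 10, the proportional draw takes 6, 3, and 1 members respectively. The cluster sketch keeps every member of each selected fraternity; interviewing only some of them, as the text allows, would add a second sampling step inside each cluster.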
In using stratified sampling, it is very important that the researcher first determine the characteristics of the population in order to obtain the appropriate divisions needed in the problem. Essentially, the goal of stratification is to form strata in such a way that there is some relationship between being in a particular stratum and the answer sought.

Systematic Sampling

In some instances, the most practical way of sampling is to select, say, every 25th name on a list, every 7th house on one side of the street, every 10th piece coming off a production line, and so on. This is called systematic sampling, and an element of randomness can be introduced by using random numbers to pick the unit with which to start.

Methods of Collecting Data

After a research problem has been definitely decided upon, the next step is to determine and select the method of collecting data. There are a number of tools or techniques for gathering data; some of the more important ones are:

1. Direct or Interview Method. This method may be considered an oral type of questionnaire in which the researcher gets the needed information from the subject or interviewee verbally and directly, in face-to-face contact.
2. Indirect or Questionnaire Method. The questionnaire, according to Good, is a list of planned, written questions related to a particular topic, intended for the respondent to answer. In this method, written responses are given to prepared questions intended to elicit answers to the problems of the study. Questionnaires may be mailed or hand-carried.
3. Registration Method. This data-gathering method is governed by certain laws. Examples are the registries of births, deaths, marriages, and licenses.
4. Experimental Method. This is used when the objective is to find out the cause-and-effect relationship of certain phenomena under controlled conditions. This method is usually employed by scientific researchers.
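Returning briefly to systematic sampling as described earlier in this section, the "every k-th item with a random start" rule can be sketched as follows. The function name and the list of house numbers are hypothetical, chosen only to mirror the "every 7th house" example.

```python
import random

def systematic_sample(items, k, seed=None):
    """Take every k-th item, starting at a random position among the first k."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start index in 0..k-1
    return items[start::k]

# Hypothetical street with houses numbered 1 to 50; sample every 7th house.
houses = list(range(1, 51))
print(systematic_sample(houses, 7, seed=3))
```

Only the starting unit is random; once it is fixed, the rest of the sample is determined by the interval k.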
Methods of Presenting Data

Collected data may be presented using the following methods:

Textual Method. In this method, collected data are presented in narrative or paragraph form. The investigator gets information by simply reading the gathered data.

Tabular Method. Data are arranged in an orderly way and presented in rows and columns for easier and more comprehensible comparison of figures.

Graphical Method. Data gathered are presented in visual or pictorial form. This enables the researcher to get a clear view of the relationships in the data through pictures and colored maps.

Organizing and Presenting of Data

Organizing and presenting a set of data is one of the first tasks in understanding a problem. The most common method of summarizing data is to present them in condensed form in tables or graphs.

1. Stem and Leaf Display

A stem and leaf display is a more compact way of conveying the information while at the same time highlighting the important aspects of the data. Consider the following data:

26 40 24 32 25 28 23 26
30 34 29 44 32 33 30 25
42 55 26 52 27 25 33 32
35 46 32 33 31 36 41 27
24 38 26 34 23 31 26 24

STEM AND LEAF DISPLAY

To illustrate the technique, break each number into its tens and units digits, tallying together the values that share the same tens digit. That is, we will think of the number 26 as 2 | 6. The tens digits are aligned vertically, with the units digits displayed to the side. The first row of the display tells us that the list contains the values 23, 24, 25, 26, …, 29. The third row tells us that the list contains 5 values in the 40's, and the fourth row tells us that it contains two values in the 50's.

There are various ways of modifying the stem-and-leaf display to meet particular needs. If we want to construct a display with more stems, we can divide each stem position into two and have a double-stem display.
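The construction described above can be sketched in Python using the 40 values listed in this section. The function names are hypothetical, and sorting the leaves within each row is an assumption made for readability; a hand tally would list them in order of appearance.

```python
from collections import defaultdict

data = [26, 40, 24, 32, 25, 28, 23, 26,
        30, 34, 29, 44, 32, 33, 30, 25,
        42, 55, 26, 52, 27, 25, 33, 32,
        35, 46, 32, 33, 31, 36, 41, 27,
        24, 38, 26, 34, 23, 31, 26, 24]

def stem_and_leaf(values):
    """Group units digits (leaves) under their tens digit (stem)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)

def format_display(stems):
    """Render one 'stem | leaves' row per tens digit."""
    return "\n".join(
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(stems.items())
    )

print(format_display(stem_and_leaf(data)))
```

For this data set the sketch prints four rows:

2 | 3 3 4 4 4 5 5 5 6 6 6 6 6 7 7 8 9
3 | 0 0 1 1 2 2 2 2 3 3 3 4 4 5 6 8
4 | 0 1 2 4 6
5 | 2 5

The third and fourth rows show the 5 values in the 40's and the two values in the 50's mentioned in the text.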
2. Frequency Distribution Table

When the bulk of the data is quite large, a good overall picture of the information can be presented by grouping the data into a number of classes, intervals, or categories. For example, 1990 data obtained from a pediatric clinic on the weights of babies within a year may be summarized in this way.

Example: The following are the scores of 120 students in an 80-item pre-test in math.

44 58 73 60 76 65 64 65 75 65 70 44
50 54 74 75 60 62 55 56 63 68 75 49
42 74 55 65 78 60 57 60 64 62 69 40
64 56 59 63 57 54 59 59 58 59 60 46
65 48 45 55 75 61 54 67 55 74 63 73
45 51 49 64 50 51 62 51 53 63 55 47
59 76 60 56 70 52 50 66 50 61 62 48
70 77 69 60 65 60 55 60 54 57 56 42
75 66 70 50 69 53 51 50 58 64 68 45
63 43 64 63 77 64 70 60 65 60 72 74

Solution: Since the highest value is 78 and the smallest value is 40, the range of values is obtained by subtraction: 78 - 40 = 38. If we choose 5 as our class interval, there will be 8 classes, obtained by dividing the range by the class interval: 38/5 = 7.6, rounded up to 8. The lowest class will be 40-44, followed by 45-49, and so on.

Class Intervals   Frequency   Class Boundaries   Class Marks
75-79             10          74.5-79.5          77
70-74             12          69.5-74.5          72
65-69             15          64.5-69.5          67
60-64             30          59.5-64.5          62
55-59             20          54.5-59.5          57
50-54             18          49.5-54.5          52
45-49             9           44.5-49.5          47
40-44             6           39.5-44.5          42
                  N = 120

The numbers in the Frequency column of the table above, which show how many items fall into each class, are called class frequencies. The smallest and largest values that can go into any given class are called its class limits. For the distribution of the scores in the above data, 40, 45, 50, 55, … are called lower class limits, while 44, 49, 54, 59, … are called upper class limits. The scores grouped above are in whole numbers, so that 75-79 actually includes 74.5 to 79.5, 70-74 actually includes 69.5 to 74.5, and so on for the succeeding classes. These values, 74.5-79.5, 69.5-74.5, etc., are the true class limits or class boundaries.

FREQUENCY DISTRIBUTION

A frequency distribution is the tabulation of data by category or class interval, with the corresponding frequency for each class.

Frequency is the number of items belonging to a category.
The class interval is the grouping per category, defined by a lower limit and an upper limit.
The class boundaries are more accurate expressions of the class limits.
The class mark is the midpoint of a class interval. It can be found by taking the average of the lower class limit and the upper class limit.

There are essentially two ways in which a frequency distribution can be modified to suit particular needs. One way is to convert the distribution into a percentage distribution by dividing each class frequency by the total number of items or observations and then multiplying by 100%. Percentage distributions are often used when we want to describe a particular distribution.

The other way of modifying a frequency distribution is to convert it into a less than or more than cumulative distribution. To construct a cumulative distribution, we simply add the class frequencies, starting either at the top or at the bottom of the distribution.

Graphical Presentation of a Frequency Distribution

The most common graphs used to show a frequency distribution are the histogram and the frequency polygon.

1. The histogram is a set of vertical bars whose bases lie on the horizontal axis and are centered on the class marks. The widths of the bars correspond to the class intervals, and the heights correspond to the class frequencies.
2. The frequency polygon is a modification of the histogram: it is a line graph in which the class frequencies are plotted against the class marks. The cumulative frequency polygon is the graph of a cumulative frequency distribution.
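The quantities defined above (class boundaries, class marks, percentage distribution, and the two cumulative distributions) can be computed directly from the class limits and frequencies in the table. The sketch below takes those frequencies as given rather than re-tallying the raw scores. The function names and dictionary keys are hypothetical, and the half-unit adjustment for boundaries assumes whole-number scores, as in the example.

```python
import math

# Class limits and frequencies copied from the table above (lowest class first).
classes = [(40, 44, 6), (45, 49, 9), (50, 54, 18), (55, 59, 20),
           (60, 64, 30), (65, 69, 15), (70, 74, 12), (75, 79, 10)]

def number_of_classes(highest, lowest, width):
    """Range divided by the class interval, rounded up."""
    return math.ceil((highest - lowest) / width)

def describe(classes):
    """Return boundaries, mark, percent, and cumulative counts for each class."""
    total = sum(freq for _, _, freq in classes)
    rows, running = [], 0
    for lower, upper, freq in classes:
        running += freq
        rows.append({
            "class": f"{lower}-{upper}",
            "frequency": freq,
            "boundaries": (lower - 0.5, upper + 0.5),  # whole-number data
            "mark": (lower + upper) / 2,
            "percent": 100 * freq / total,
            "less_than_cum": running,                  # below the upper boundary
            "more_than_cum": total - running + freq,   # above the lower boundary
        })
    return rows

print(number_of_classes(78, 40, 5))
for r in reversed(describe(classes)):
    lo, hi = r["boundaries"]
    print(f'{r["class"]:>5}  f={r["frequency"]:>2}  {lo}-{hi}  mark={r["mark"]}  '
          f'{r["percent"]:5.2f}%  <cum={r["less_than_cum"]:>3}  >cum={r["more_than_cum"]:>3}')
```

For the class 60-64, for instance, this gives boundaries 59.5-64.5, a class mark of 62, and 30/120 = 25% of the scores. The less than cumulative frequency reaches N = 120 at the highest class, and the more than cumulative frequency equals 120 at the lowest class.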
THE PIE GRAPH OR PIE CHART

The pie graph or pie chart is useful in presenting a frequency distribution in which the entire circle represents the entire population. The pie is subdivided into segments, each of which is proportional in size to the quantity or percentage it represents. Consider the distribution of a budget of 3 billion allocated to the 6 major agencies of the government for the fiscal year 1992.

Agency          Budget (in Billions)   Proportion   Percent   Degree
Education       1.1                    .367         36.7      132
Nat’l Defense   .9                     .3           30        108
Public Works    .53                    .176         17.6      64
Agri. and AR    .24                    .08          8         29
DILG            .12                    .04          4         14
Health          .11                    .037         3.7       13
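The Proportion, Percent, and Degree columns follow from the budget figures: each proportion is the agency's budget divided by the total, the percent is that proportion times 100, and the central angle is the proportion times 360 degrees. A minimal sketch, with a hypothetical function name and the budget figures taken from the table:

```python
# Budgets in billions, copied from the table above.
budgets = {
    "Education": 1.1,
    "Nat'l Defense": 0.9,
    "Public Works": 0.53,
    "Agri. and AR": 0.24,
    "DILG": 0.12,
    "Health": 0.11,
}

def pie_segments(amounts):
    """Proportion, percent, and central angle (degrees) for each category."""
    total = sum(amounts.values())
    return {
        name: {
            "proportion": amount / total,
            "percent": 100 * amount / total,
            "degrees": 360 * amount / total,
        }
        for name, amount in amounts.items()
    }

for name, seg in pie_segments(budgets).items():
    print(f'{name:<14} {seg["proportion"]:.3f}  {seg["percent"]:5.1f}%  {seg["degrees"]:6.1f} deg')
```

Rounded to whole degrees, the angles come out as 132, 108, 64, 29, 14, and 13, which sum to 360. The computed proportions agree with the table up to rounding in the third decimal place.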