Addis College
School of Graduate Studies
Department of Construction Technology and Management

Scientific Research Methods and Statistical Application (June, 2024)
Kuma Gowwomsa E. (PhD)
Email: kuma.s@addiscoll...
Mobile: +251911268836

Chapter 5
Data Processing and Analysis

Contents
1. Introduction
2. Processing Operations
3. Elements/Types of Analysis
4. Statistics in Research

1. Introduction
❑ Research can be defined as a scientific and systematic search for pertinent information on a specific topic.
❑ The data, after collection, have to be processed and analyzed in accordance with the outline laid down at the time of developing the research plan.
❑ Technically speaking, processing implies editing, coding, classification and tabulation of collected data so that they are ready for analysis.
❑ The term analysis refers to the computation of certain measures along with searching for patterns of relationship that exist among data groups.
❑ Thus, “in the process of analysis, relationships or differences supporting or conflicting with original or new hypotheses should be subjected to statistical tests of significance to determine with what validity data can be said to indicate any conclusions.”
❑ Specify the analysis procedures you will use, and label them accurately. The analysis plan should be described in detail.
❑ If coding procedures are to be used, describe them in reasonable detail.
❑ If triangulation is applied, carefully explain how it is done.
❑ Each research question will usually require its own analysis.
❑ Thus, the research questions should be addressed one at a time, each followed by a description of the type of statistical tests (if necessary) that will be performed to answer that research question.
❑ Be specific: state what variables will be included in the analyses, and identify the dependent and independent variables if such a relationship exists.
❑ Decision-making criteria (e.g., the critical alpha level) should also be stated, as well as the computer software that will be used (if there is a need to use one).
❑ These will help you and the reader evaluate the choices you made and the procedures you followed.

2. Processing Operations
❑ With this brief introduction concerning the concepts of processing and analysis, we can now proceed with the explanation of the processing operations.

2.1 Editing
❑ Editing of data is the process of examining the collected raw data (especially in surveys) to detect errors and omissions and to correct these when possible.

2.2 Coding
❑ Coding refers to the process of assigning numerals or other symbols to answers so that responses can be put into a limited number of categories or classes.

2.3 Classification
❑ Most research studies result in a large volume of raw data which must be reduced into homogeneous groups to get meaningful relationships.
❑ This necessitates classification of data, which is the process of arranging data in groups or classes on the basis of common characteristics.

2.3.1 Classification according to “Attributes”
❑ As stated above, data are classified on the basis of common characteristics, which can be either descriptive (such as contractor, consultant, client, etc.) or numerical (such as weight, height, income, etc.). Classification on the basis of descriptive characteristics is classification according to attributes.

2.3.2 Classification according to “Class-intervals”
❑ Unlike descriptive characteristics, numerical characteristics refer to quantitative phenomena which can be measured in some statistical units.
❑ Data relating to income, production, age, weight, etc. come under this category.
❑ Example: Persons whose incomes are, say, between 201 Birr and 400 Birr can form one group, those whose incomes are between 401 Birr and 600 Birr can form another group, and so on.
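The coding and classification steps described above can be illustrated with a short Python sketch. The respondent records, the code numbers assigned to each category, and the names `CATEGORY_CODES` and `income_class` are assumptions made for illustration only; the 200-Birr class width follows the income example, and incomes are assumed to be whole Birr.

```python
from collections import Counter

# Hypothetical survey responses: (respondent type, monthly income in Birr).
responses = [
    ("contractor", 250),
    ("consultant", 410),
    ("client", 399),
    ("contractor", 600),
    ("client", 201),
    ("consultant", 575),
]

# Coding: assign a numeral to each descriptive answer (the codes are arbitrary).
CATEGORY_CODES = {"contractor": 1, "consultant": 2, "client": 3}


def income_class(income, width=200):
    """Return the (lower, upper) class interval containing a whole-Birr income.

    Intervals are 1-200, 201-400, 401-600, ... as in the income example.
    """
    lower = ((income - 1) // width) * width + 1
    return (lower, lower + width - 1)


coded = [(CATEGORY_CODES[kind], income_class(income)) for kind, income in responses]

# Classification according to attributes: respondents per category code.
by_attribute = Counter(code for code, _ in coded)

# Classification according to class intervals: respondents per income class.
by_interval = Counter(interval for _, interval in coded)

for (lower, upper), count in sorted(by_interval.items()):
    print(f"{lower}-{upper} Birr: {count}")
```

Deriving the interval from the income itself, rather than listing every class by hand, keeps the classes mutually exclusive and lets the same function handle incomes outside the two groups named in the example.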
2.4 Tabulation
❑ When a mass of data has been assembled, it becomes necessary for the researcher to arrange it in some kind of concise and logical order. This procedure is referred to as tabulation.
❑ Thus, tabulation is the process of summarizing raw data and displaying them in compact form (i.e., in the form of statistical tables) for further analysis.
❑ In a broader sense, tabulation is an orderly arrangement of data in columns and rows.

3. Elements/Types of Analysis
❑ As stated earlier, by analysis we mean the computation of certain indices or measures along with searching for patterns of relationship that exist among the data groups.
❑ Analysis, particularly in the case of survey or experimental data, involves estimating the values of unknown parameters of the population and testing hypotheses for drawing inferences.
❑ Analysis may, therefore, be categorized as descriptive analysis and inferential analysis (inferential analysis is often known as statistical analysis).
❑ We may as well talk of correlation analysis and causal analysis.

A. Correlation Analysis
❑ It studies the joint variation of two or more variables to determine the amount of correlation between them.

B. Causal Analysis
❑ It is concerned with the study of how one or more variables affect changes in another variable.
❑ It is thus a study of functional relationships existing between two or more variables. This analysis can be termed regression analysis.
❑ In modern times, with the availability of computer facilities, there has been a rapid development of multivariate analysis, which may be defined as “all statistical methods which simultaneously analyze more than two variables on a sample of observations”.

4. Statistics in Research
❑ There are two major areas of statistics: descriptive statistics and inferential statistics.
1. Descriptive statistics concern the development of certain indices from the raw data, whereas inferential statistics are concerned with the process of generalization.
2. Inferential statistics are also known as sampling statistics, and are mainly concerned with two major types of problems:
   1. Estimation of population parameters, and
   2. Testing of statistical hypotheses.
❑ The important statistical measures that are used to summarize the survey/research data are:
1. Measures of central tendency or statistical averages;
2. Measures of dispersion;
3. Measures of asymmetry (skewness);
4. Measures of relationships; and
5. Other measures.
❑ Amongst the measures of central tendency, the three most important ones are the arithmetic average or mean, the median and the mode.
❑ Among the measures of dispersion, the variance and its square root, the standard deviation, are the most often used. Other measures such as mean deviation, range, etc. are also used. For comparison purposes, the coefficient of standard deviation or the coefficient of variation is mostly used.
❑ With respect to the measures of skewness and kurtosis, the first measure of skewness, based on mean and mode or on mean and median, is mostly used.
❑ Amongst the measures of relationship, Karl Pearson’s coefficient of correlation is the most frequently used measure in the case of statistics of variables.
❑ Multiple correlation coefficient, partial correlation coefficient, regression analysis, etc., are other important measures often used by a researcher.

4.1 Measures of Central Tendency
❑ Measures of central tendency (or statistical averages) tell us the point about which items have a tendency to cluster.
❑ Mean, median and mode are the most popular averages.
❑ Mean, also known as the arithmetic average, is the most common measure of central tendency and may be defined as the value obtained by dividing the total of the values of the items in a series by the total number of items.
❑ Median is the value of the middle item of a series when it is arranged in ascending or descending order of magnitude.
❑ Mode is the most commonly or frequently occurring value in a series. The mode in a distribution is that item around which there is maximum concentration.

4.2 Measures of Dispersion
❑ An average can represent a series only as well as a single figure can; it cannot reveal the entire story of the phenomenon under study. In particular, it fails to give any idea of the scatter of the values of the items of a variable around the average.
❑ To measure this scatter, statistical devices called measures of dispersion are calculated.
❑ Important measures of dispersion are:
1. Range,
2. Mean deviation, and
3. Standard deviation.

1. Range
❑ Range is the simplest possible measure of dispersion and is defined as the difference between the values of the extreme items of a series (i.e. the highest and the lowest).

2. Mean deviation
❑ It is the average of the differences of the values of items from some average of the series, taken without regard to sign. Such a difference is technically described as a deviation.

3. Standard deviation
❑ It is the most widely used measure of dispersion of a series and is defined as the square root of the average of the squares of deviations, when such deviations for the values of individual items in a series are obtained from the arithmetic average.

AAU, EiABC, Research Methods & TRW, Lecture Notes, December 2016, Muluken T.
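The measures of central tendency and dispersion defined above can be computed directly from their verbal definitions. The following Python sketch is illustrative only: the data series is hypothetical, the function names are not taken from the lecture notes, and the standard deviation divides by the number of items (the "average of squares of deviations" wording), not by n - 1 as sample formulas do. The coefficient of variation and Karl Pearson's mean-and-mode skewness coefficient are included using their usual textbook forms, which the notes name but do not spell out.

```python
import math
from collections import Counter

# Hypothetical series of observations (illustrative values only).
values = [2, 4, 4, 4, 5, 5, 7, 9]


def mean(xs):
    """Total of the values divided by the number of items."""
    return sum(xs) / len(xs)


def median(xs):
    """Middle item of the ordered series (average of the two middle items if n is even)."""
    s = sorted(xs)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2


def mode(xs):
    """Most frequently occurring value (the first one found if several tie)."""
    return Counter(xs).most_common(1)[0][0]


def value_range(xs):
    """Difference between the highest and lowest items."""
    return max(xs) - min(xs)


def mean_deviation(xs):
    """Average absolute deviation, here taken from the arithmetic mean."""
    m = mean(xs)
    return sum(abs(x - m) for x in xs) / len(xs)


def standard_deviation(xs):
    """Square root of the average squared deviation from the mean (divides by n)."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def coefficient_of_variation(xs):
    """Standard deviation expressed as a percentage of the mean."""
    return 100 * standard_deviation(xs) / mean(xs)


def pearson_skewness(xs):
    """Karl Pearson's first coefficient of skewness: (mean - mode) / standard deviation."""
    return (mean(xs) - mode(xs)) / standard_deviation(xs)


print("mean:", mean(values))
print("median:", median(values))
print("mode:", mode(values))
print("range:", value_range(values))
print("mean deviation:", mean_deviation(values))
print("standard deviation:", standard_deviation(values))
print("coefficient of variation (%):", coefficient_of_variation(values))
print("skewness (mean and mode):", pearson_skewness(values))
```

For this series the mean (5) lies above the median (4.5) and the mode (4), so the skewness coefficient is positive, meaning the balance of the distribution is thrown toward the higher values.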
4.3 Measures of Skewness
❑ Skewness is a measure of asymmetry and shows the manner in which the items are clustered around the average. In a symmetrical distribution, the items show a perfect balance on either side of the mode, but in a skewed distribution the balance is thrown to one side.

4.4 Measures of Relationship
❑ So far we have dealt with those statistical measures that we use in the context of a uni-variate population, i.e., a population consisting of measurements of only one variable.
❑ But if we have data on two variables, we are said to have a bi-variate population, and if the data happen to be on more than two variables, the population is known as a multi-variate population.
❑ If for every measurement of a variable, X, we have a corresponding value of a second variable, Y, the resulting pairs of values are called a bi-variate population.
❑ If, in addition, we also have a corresponding value of a third variable, Z, or a fourth variable, W, and so on, the resulting sets of values are called a multi-variate population.
❑ In the case of bi-variate or multi-variate populations, we often wish to know the relation of the two or more variables in the data to one another.
❑ There are several methods of determining the relationship between variables, but no method can tell us for certain that a correlation is indicative of a causal relationship.
❑ Thus we have to answer two types of questions in bi-variate or multi-variate populations:
1. Does there exist association or correlation between the two (or more) variables? If yes, of what degree?
2. Is there any cause and effect relationship between the two variables in the case of a bi-variate population, or between one variable on one side and two or more variables on the other side in the case of a multi-variate population? If yes, of what degree and in which direction?
❑ The first question is answered by the use of the correlation technique and the second by the technique of regression, i.e. simple or multiple regression.
a) Simple Regression
b) Multiple Regression
❑ There are several methods of applying the two techniques, but the important ones are shown below.
❑ In the case of a “bi-variate” population, correlation can be studied through:
1. Charles Spearman’s coefficient of correlation; and
2. Karl Pearson’s coefficient of correlation, etc.
Whereas the cause and effect relationship can be studied through simple regression equations.
❑ In the case of a “multi-variate” population, correlation can be studied through:
1. Coefficient of multiple correlation; and
2. Coefficient of partial correlation.
Whereas the cause and effect relationship can be studied through multiple regression equations.
a) Multiple correlation
b) Coefficient of partial and regression correlation
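For the bi-variate case, the two correlation coefficients and the simple regression equation named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the notes' own worked example: the paired data are hypothetical, the function names are invented for this sketch, the formulas are the standard textbook forms (product-moment correlation, rank-difference correlation, least-squares line), and the rank-difference formula used here is valid only when no values are tied.

```python
import math

# Hypothetical paired observations (X, Y); illustrative values only, no ties.
xs = [10, 20, 30, 40, 50]
ys = [25, 20, 45, 40, 60]


def pearson_r(xs, ys):
    """Karl Pearson's coefficient: co-variation of X and Y scaled by their spreads."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank 1 for the smallest value; assumes there are no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(xs, ys):
    """Spearman's rank coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1)), no ties."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def simple_regression(xs, ys):
    """Least-squares line Y = a + b * X; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b


r = pearson_r(xs, ys)
rho = spearman_rho(xs, ys)
a, b = simple_regression(xs, ys)

print(f"Pearson r    = {r:.3f}")
print(f"Spearman rho = {rho:.3f}")
print(f"Regression   : Y = {a:.2f} + {b:.2f} X")
```

The correlation coefficients answer the first question (is there association, and of what degree), while the fitted line describes how Y changes with X. As the notes caution, neither result by itself establishes that X causes Y.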