
**CHAPTER FOUR**

**RESEARCH DESIGN**

1. **Meaning of Research Design**

Research design is the plan, structure and strategy of investigation conceived so as to obtain answers to research questions and to control variance. It includes an outline of everything the researcher will do, including what observations to make, how to make them, and what type of statistical analysis to use. A research design is, in a way, a set of instructions to the researcher on how to arrange the conditions for collection and analysis of data in a manner that will achieve the objectives of the study. In a sense, it can be taken as a control mechanism.

The formidable problem that follows the task of defining the research problem is the preparation of the design of the research project, popularly known as the "research design". Decisions regarding what, where, when, how much, and by what means concerning an inquiry or a research study constitute a research design. "A research design is the arrangement of conditions for collection and analysis of data in a manner that aims to combine relevance to the research purpose with economy in procedure." In fact, the research design is the conceptual structure within which research is conducted; it constitutes the blueprint for the collection, measurement and analysis of data. As such, the design includes an outline of what the researcher will do, from writing the hypothesis and its operational implications to the final analysis of data. Moreover, research design is a set of advance decisions that make up the master plan specifying the methods and procedures for collecting and analyzing the needed information.

1. **The Significance of Research Design**

There are reasons to justify the significance placed upon research design. First, although every research problem may seem totally unique, there are usually enough similarities among research problems to allow us to make some decisions, in advance, as to the best plan to use to resolve the problem.
Thus, it facilitates the smooth sailing of the various research operations, thereby making research as efficient as possible, yielding maximal information with minimal expenditure of effort, time and money. Second, there are some basic business research designs that can be successfully matched to given research problems. In this way, they serve the researcher much like the blueprint serves the builder. In this regard, research design stands for advance planning of the methods to be adopted for collecting the relevant data and the techniques to be used in their analysis, keeping in view the objective of the research and the availability of staff, time and money.

The main purpose of a research design is to enable the researcher to answer research questions as validly, objectively, accurately and economically as possible. In more specific terms, a research design sets up the framework for adequate tests of relationships among variables. In a sense, it indicates what observations to make, how to make them, and how to analyze the data obtained from observations. Moreover, a design specifies what type of statistical analysis to use and can even suggest the possible conclusions to be drawn from the analysis. Any research plan is deliberately and specifically conceived and executed to bring empirical evidence to bear on the research problem.

2. **Important concepts relating to research design**

Before describing the different research designs, it is appropriate to explain the various concepts relating to designs so that they may be better and more easily understood.

**1. Dependent and independent variables:** A concept which can take on different quantitative values is called a variable. As such, concepts like weight, height and income are all examples of variables. If one variable depends upon or is a consequence of another variable, it is termed a dependent variable, and the variable that is antecedent to the dependent variable is termed an independent variable.
For instance, if we say that height depends upon age, then height is a dependent variable and age is an independent variable. Further, if in addition to being dependent upon age, height also depends upon the individual's sex, then height is a dependent variable and age and sex are independent variables. Similarly, readymade films and lectures are examples of independent variables, whereas behavioural changes occurring as a result of the environmental manipulations are examples of dependent variables.

**2. Extraneous variable:** Independent variables that are not related to the purpose of the study but may affect the dependent variable are termed extraneous variables. A study must always be so designed that *the effect upon the dependent variable is attributed entirely to the independent variable(s), and not to some extraneous variable or variables*.

**3. Control:** One important characteristic of a good research design is that it minimizes the influence or effect of extraneous variable(s). The technical term 'control' is used when we design the study so as to minimize the effects of extraneous independent variables. In experimental research, the term 'control' is used to refer to restraining experimental conditions.

**4. Confounded relationship:** When the dependent variable is not free from the influence of extraneous variable(s), the relationship between the dependent and independent variables is said to be confounded by the extraneous variable(s).

**5. Research hypothesis:** When a prediction or a hypothesized relationship is to be tested by scientific methods, it is termed a research hypothesis. The research hypothesis is a predictive statement that relates an independent variable to a dependent variable. Usually a research hypothesis must contain at least one independent and one dependent variable. Predictive statements which are not to be objectively verified, or relationships that are assumed but not to be tested, are not termed research hypotheses.

3.
**Features of a good design**

A good design is often characterized by adjectives like flexible, appropriate, efficient, economical and so on. Generally, the design which minimizes bias and maximizes the reliability of the data collected and analyzed is considered a good design. The design which gives the smallest experimental error is supposed to be the best design in many investigations. Similarly, a design which yields maximal information and provides an opportunity for considering many different aspects of a problem is considered the most appropriate and efficient design in respect of many research problems. Thus, the question of good design is related to the purpose or objective of the research problem and also to the nature of the problem to be studied. A design may be quite suitable in one case, but may be found wanting in one respect or another in the context of some other research problem. One single design cannot serve the purpose of all types of research problems.

If the research study happens to be an exploratory or a formulative one, wherein the major emphasis is on discovery of ideas and insights, the most appropriate research design must be flexible enough to permit the consideration of many different aspects of a phenomenon. But when the purpose of a study is accurate description of a situation or of an association between variables (in what are called descriptive studies), accuracy becomes a major consideration, and a research design which minimizes bias and maximizes the reliability of the evidence collected is considered a good design. Studies involving the testing of a hypothesis of a causal relationship between variables require a design which will permit inferences about causality in addition to the minimization of bias and maximization of reliability.
But in practice it is very difficult to place a particular study in a particular group, for a given piece of research may have in it elements of two or more of the functions of different studies. It is only on the basis of its primary function that a study can be categorized as an exploratory, descriptive or hypothesis-testing study, and the choice of a research design may be made accordingly for that particular study. Besides, the availability of time, money, skills of the research staff and the means of obtaining the information must be given due weight while working out the relevant details of the research design.

4. **Three Types of Research Design**

Research designs are classified into three major categories: exploratory, descriptive, and causal. The choice of the most appropriate design depends largely upon the objective of the research. It has been said that research has three objectives:

-
-
-

Three points should be made relative to the interdependency of research designs.

- First, in some cases, it may be perfectly legitimate to begin with any one of the three designs and to use only that one design.
- Second, research is an "iterative" process; by conducting one research project, we learn that we may need additional research, and so on. This may mean that we need to utilize multiple research designs. We could very well find, for example, that after conducting descriptive research, we need to go back and conduct exploratory research.
- Third, if multiple designs are used in any particular order (if there is an order), it makes sense to first conduct exploratory research, then descriptive research, and finally causal research. The only reason for this order is that each subsequent design requires greater knowledge about the research problem on the part of the researcher. Therefore, exploratory research may give one the information needed to conduct a descriptive study which, in turn, may provide the information necessary to design a causal experiment.

1.
**Exploratory Research:** Exploratory research is most commonly unstructured, informal research that is undertaken to gain background information about the general nature of the research problem. By unstructured, we mean that exploratory research does not have a formalized set of objectives, sample plan, or questionnaire. It is usually conducted when the researcher does not know much about the problem and needs additional information or desires new or more recent information. Because exploratory research is aimed at gaining additional information about a topic and generating possible hypotheses to test, it is described as informal. Such research may consist of going to the library and reading published secondary data; of asking customers, salespersons, and acquaintances for their opinions about a company, its products, services and prices; or of simply observing everyday company practices. Exploratory research is systematic, but it is very flexible in that it allows the researcher to investigate whatever source s/he desires, to the extent s/he feels is necessary, in order to gain a good feel for the problem at hand.

Exploratory research studies are also termed formulative research studies. The main purpose of such studies is that of formulating a problem for more precise investigation or of developing working hypotheses from an operational point of view. The major emphasis in such studies is on the discovery of ideas and insights. As such, the research design appropriate for such studies must be flexible enough to provide opportunity for considering different aspects of the problem under study. Inbuilt flexibility in research design is needed because the research problem, broadly defined initially, is transformed into one with more precise meaning in exploratory studies, which may necessitate changes in the research procedure for gathering relevant data.
**Uses of Exploratory Research:** Exploratory research is used in a number of situations: to gain background information, to define terms, to clarify problems and hypotheses, and to establish research priorities.

a. **Gain Background Information**: when very little is known about the problem or when the problem has not been clearly formulated, exploratory research may be used to gain much-needed background information.

b. **Define Terms**: exploratory research helps to define terms and concepts.

c. **Clarify Problems and Hypotheses**: exploratory research allows the researcher to define the problem more precisely and to generate hypotheses for the upcoming study. Exploratory research can also be beneficial in the formulation of hypotheses, which are statements describing the speculated relationships among two or more variables.

d. **Establish Research Priorities:** exploratory research can help a firm prioritize research topics in order of importance, especially when it is faced with conducting several research studies. A review of customer complaint letters, for example, may indicate which products or services are most in need of management's attention.

**Methods of Conducting Exploratory Research:** A variety of methods are available to conduct exploratory research. These include secondary data analysis, experience surveys, case analysis, focus groups, and projective techniques.

a. **Secondary Data Analysis**: by secondary data analysis we refer to the process of searching for and interpreting existing information relevant to the research problem. Secondary data are data that have been collected for some other purpose. An analysis of secondary data is often the "core" of exploratory research. This is because there are many benefits to examining secondary data and the costs are typically minimal. Furthermore, the cost and time of searching for such data are being reduced every day as more and more computerized databases become available.

b.
**Experience Survey**: an experience survey refers to gathering information from those thought to be knowledgeable on the issues relevant to the research problem. For instance, if the research problem deals with difficulties encountered when buying infant clothing, then surveys of mothers (or fathers) with infants may be in order. Experience surveys differ from surveys conducted as part of descriptive research in that there is usually no formal attempt to ensure that the survey results are representative of any defined group of subjects.

c. **Case Analysis**: by case analysis, we refer to a review of available information about a former situation(s) that has some similarities to the present research problem.

d. **Focus Groups**: An increasingly popular method of conducting exploratory research is the focus group: a small group of people brought together and guided by a moderator through unstructured, spontaneous discussion for the purpose of gaining information relevant to the research problem. Although a focus group should encourage openness on the part of the participants, the moderator's task is to ensure the discussion is "focused" on some general area of interest.

2. **Descriptive Research**

As the name implies, the major objective of descriptive research is describing the characteristics of a particular individual, or of a group, state of nature, or set of variables under study. Since the aim is to obtain complete and accurate information in such studies, the procedure to be used must be carefully planned. The research design must make enough provision for protection against bias and must maximize reliability, with due concern for the economical completion of the research study. The design in such studies must be rigid and not flexible.
For example, when we wish to know:

- How many customers a specific bank has,
- What services they buy and how frequently,
- Which advertisements of the service they recall, and
- What their attitudes are towards the bank and its competitors,

we turn to descriptive research, which provides answers to questions such as who, what, where, when, and how, as they relate to the research problem. Typically, answers to these questions are found in secondary data or by conducting surveys. Managers and other decision makers often need answers to these basic questions before they can formulate and implement any business strategies.

**Classification of Descriptive Research Studies**

There are two basic descriptive research studies available to the business researcher: cross-sectional and longitudinal.

  ***Types of Study***   ***Features of Study***
  ---------------------- ----------------------------------------------------------------------------------------------------------------------------------------------
  Cross-sectional        One-time measurement, including a sample survey where the emphasis is placed on a large, representative sample.
  Longitudinal           Repeated measurements on the same sample, including traditional panels (questions remain the same) and omnibus panels (questions differ).

***Table 5.1: Classifications of Descriptive Research Studies***

a. **Cross-sectional studies:** measure a population at only one point in time. Cross-sectional studies are very prevalent in marketing research, outnumbering longitudinal studies and causal studies. Because cross-sectional studies are one-time measurements, they are often described as "snapshots" of the population. As an example, many magazines survey a sample of their subscribers and ask them questions about their age, occupation, income, educational level, and so on. This sample data, taken at one point in time, is used to describe the readership of the magazine in terms of demographics.
Cross-sectional studies normally employ fairly large sample sizes, so many cross-sectional studies are referred to as sample surveys. Sample surveys are cross-sectional studies whose samples are drawn in such a way as to be representative of a specific population.

b. **Longitudinal studies**: repeatedly measure the same population over a period of time. Because longitudinal studies involve multiple measurements, they are often described as "movies" of the population. Longitudinal studies are employed by almost 50 percent of businesses using marketing research. To ensure the success of a longitudinal study, the researcher must have access to the same members of the sample, called a panel, so as to take repeated measurements. There are two types of panels: traditional panels and omnibus panels. Traditional panels ask panel members the same questions on each panel measurement. Omnibus panels vary questions from one panel measurement to the next. Usually, firms are interested in using data from traditional panels because they can gain insights into changes in consumers' purchases, attitudes, and so on.

3. **Causal Research**

Causality may be thought of as understanding a phenomenon in terms of conditional statements of the form "if *x*, then *y*". These "if-then" statements become our way of manipulating variables of interest. For example, if I spend more on advertising, then sales will rise. Fortunately for mankind, there is an inborn tendency to determine causal relationships. This tendency is ever present in our thinking and our actions. Likewise, marketing managers are always trying to determine what will cause a change in consumer satisfaction, a gain in market share, or an increase in sales.

2. ***SAMPLE DESIGN AND PROCEDURE***

**Introduction**

- Sampling is selecting a small number of units from a population in such a manner that they can be used to make estimates about the population.
A sample is drawn with the aim of inferring certain facts about the population from the results in the sample, a process known as statistical inference. A sample can provide very accurate estimates about the population it represents if it is carefully drawn using well-established procedures. An essential feature of a sample is the definition of the rule by which it is selected.

***Basic Concepts in Samples and Sampling***

To begin, we acquaint you with some basic terminology used in sampling. The terms we discuss here are population, sample and sample unit, census, sampling error, sample frame and sample frame error.

- ***Population:*** the entire group of people of interest from whom the researcher needs to obtain information.
- ***Element / sampling unit:*** one unit from a population.
- ***Sampling:*** the selection of a subset of the population.
- ***Sampling frame:*** a listing of the population from which a sample is chosen.
- ***Census:*** a polling of the entire population.
- ***Survey:*** a polling of the sample.
- ***Parameter:*** the variable of interest in the population.
- ***Statistic:*** the information obtained from the sample about the parameter.
- ***Sampling error:*** any error in a survey that occurs because a sample is used. Sampling error is caused by two factors: the method of sample selection and the size of the sample.
- ***Sample frame and sample frame error:*** to select a sample, you will need a sample frame, which is a master list of all the sample units in the population. For instance, if a researcher had defined a population to be all shoe repair stores in Gondar, he or she would need a master listing of these stores as a frame from which to sample. Similarly, if the population being researched were certified public accountants, a sample frame for this group would be needed.

A sample frame invariably contains sample frame error, which is the degree to which it fails to account for all of the population.
A way to envision sample frame error is by matching the list with the population and seeing to what degree the list adequately matches the targeted population. Whenever a sample is drawn, the amount of potential sample frame error should be judged by the researcher. It is the researcher's responsibility to seek out a sample frame with the least amount of error at a reasonable cost. The researcher should also apprise the client of the degree of sample frame error involved.

**Census vs. Sample**

The objective of most business research projects is to obtain information about the characteristics or parameters of a **population**. A population is the aggregate of all the elements that share some common set of characteristics and that comprise the universe for the purpose of the business research problem. The population parameters are typically numbers, such as the proportion of consumers who are loyal to a particular brand of toothpaste. Information about population parameters may be obtained by taking a **census** or a **sample**.

- **A census** involves a complete enumeration of the elements of a population. The population parameters can be calculated directly in a straightforward way after the census is enumerated.
- **A sample**, on the other hand, is a subgroup of the population selected for participation in the study. Sample characteristics, called statistics, are then used to make inferences about the population parameters.

**The Need for Sampling**

Sampling is a critical issue in survey research; usually the time, money, and effort involved do not permit a researcher to study all possible members of a population. Furthermore, it is generally not necessary to study all possible cases to understand the phenomenon under consideration. Sampling comes to our aid by enabling us to study a portion of the population rather than the entire population.
Since the purpose of drawing a sample from a population is to obtain information concerning that population, it is extremely important that the individuals included in a sample constitute a representative cross-section of the individuals in the population. That is, samples must be representative if one is to generalize with confidence from the sample to the population. So, randomness is basic to scientific observation and inference. A sample that contains many unrepresentative characteristics is termed a biased sample. The findings of a biased sample cannot legitimately be generalized to the population from which it is taken.

### Limitations of sampling

- **Less accuracy:** in comparison to the census technique, the conclusions derived from a sample are more liable to error. Therefore, the sampling technique is less accurate than the census technique.
- **Misleading conclusions:** if the sample is not carefully selected, or if samples are arbitrarily selected, the conclusions derived from them will be misleading if extended to the whole population.
- **Need for specialized knowledge:** the sampling technique can be successful only if a competent and able scientist makes the selection.

**Sampling technique is used under the following conditions:**

-
-
-
-
-
-

**Sampling Techniques**

Sampling techniques are basically of two types: non-probability sampling and probability sampling. In the final analysis, all sample designs fall into one of these two categories.

- Probability samples are ones in which members of the population have a known chance (probability) of being selected into the sample.
- Non-probability samples, on the other hand, are ones in which the chances (probability) of selecting members from the population into the sample are unknown.

The essence of a "known" probability rests in the sampling method rather than in knowing the exact size of the population.
Probability sampling methods are those that ensure that, if the exact size of the population were known at the moment sampling took place, the exact probability of any member of the population being selected into the sample could be calculated. In other words, this probability value is rarely calculated in practice, but the sampling method assures us that the chances of any one population member being selected into the sample could be computed. With non-probability methods there is no way to determine the probability even if the population size is known, because the selection technique is subjective. So it is the sampling method, rather than knowledge of the size of the sample or the size of the population, that determines whether sampling is probability or non-probability sampling.

**Probability Sampling Methods**

1. **Simple Random Sampling**

- **Simple random sampling** is a probability sampling procedure that gives every element in the target population, and each possible sample of a given size, an equal chance of being selected. As such, it is an **equal probability selection method** (EPSEM). This sampling technique is expressed by the formula:

$$\text{Probability of selection} = \frac{\text{Sample size}}{\text{Population size}}$$

There are a number of examples of simple random sampling, including the "blind draw" method and the table of random numbers method.

- **The "blind draw" method**: the "blind draw" method involves blindly choosing participants by their names or some other unique designation.
- **The table of random numbers method**: a more sophisticated application of simple random sampling is to use a table of random numbers, which is a listing of numbers whose random order is assured. If you look at a table of random numbers, you will not be able to see any systematic sequence in the numbers, regardless of where on the table you begin and whether you go up, down, left, right or diagonally across the entries.
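As a minimal sketch, assuming the sampling frame is available as a Python list in which each element carries a unique number, the idea behind the table of random numbers can be imitated with the standard library's `random.sample`, which draws without replacement so that every element has the same chance of selection. The frame of 500 numbered elements, the sample size of 50, the seed, and the function names `simple_random_sample` and `selection_probability` are all hypothetical and serve only as an illustration.

```python
import random


def simple_random_sample(frame, sample_size, seed=None):
    """Draw a simple random sample (without replacement) from a sampling frame."""
    if sample_size > len(frame):
        raise ValueError("sample size cannot exceed the size of the frame")
    rng = random.Random(seed)  # a seeded generator makes the draw repeatable
    return rng.sample(frame, sample_size)


def selection_probability(sample_size, population_size):
    """Probability that any one element is selected: sample size / population size."""
    return sample_size / population_size


# Hypothetical frame: 500 elements, each identified by a unique number.
frame = list(range(1, 501))
sample = simple_random_sample(frame, 50, seed=1)
prob = selection_probability(len(sample), len(frame))
```

With these assumed figures, `prob` is 50 divided by 500, so each element has a one-in-ten chance of entering the sample.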
***What Are the Steps in Selecting a Simple Random Sample?***

There are six major steps in selecting a simple random sample:

1. Define the target population.
2. Identify an existing sampling frame of the target population or develop a new one.
3. Evaluate the sampling frame for undercoverage, overcoverage, multiple coverage, and clustering, and make adjustments where necessary.
4. Assign a unique number to each element in the frame.
5. Determine the sample size.
6. Randomly select the targeted number of population elements.

2. **Systematic Sampling**

Systematic sampling is one of the most prevalent sampling techniques used in place of simple random sampling. Its popularity over simple random sampling is based on the following:

- **Its "economic efficiency"**: it can be applied with less difficulty.
- It can be accomplished in a shorter time period than simple random sampling.
- It has the potential to create a sample that is almost identical in quality to samples created from simple random sampling.

Systematic sampling is probability sampling because it employs a random starting point, which ensures there is sufficient randomness in the systematic sample to approximate a known and equal probability of any person or item in the population being selected into the sample. In essence, systematic sampling envisions the list as a set of mutually exclusive samples, each one of which is equally representative of the listed population.

To use systematic sampling, it is necessary to obtain a listing of the population, just as in the case of simple random sampling. However, it is not necessary to transcribe names, numbers, or any other designations onto slips of paper or computer files. Instead, the researcher decides on a "skip interval," which is calculated by dividing the number of names on the list by the sample size. Names are selected based on this skip interval.
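The skip-interval idea can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a prescribed procedure: the function name `systematic_sample`, the list of 1,000 entries, the sample size of 50 and the seed are hypothetical, and integer division is used for the interval so that every selected position stays within the list.

```python
import random


def systematic_sample(population_list, sample_size, seed=None):
    """Select every i-th entry of a list after a random start within the first interval."""
    interval = len(population_list) // sample_size  # skip interval = list size / sample size
    if interval < 1:
        raise ValueError("sample size cannot exceed the size of the list")
    rng = random.Random(seed)
    start = rng.randint(1, interval)  # random starting position, 1..interval inclusive
    positions = [start + k * interval for k in range(sample_size)]
    chosen = [population_list[p - 1] for p in positions]  # positions are 1-based
    return interval, start, chosen


# Hypothetical list of 1,000 numbered entries and a desired sample of 50.
population_list = list(range(1, 1001))
interval, start, chosen = systematic_sample(population_list, 50, seed=7)
```

Because only the starting point is random, a single random draw fixes the entire sample, which is why the method is quicker to apply than drawing each element separately.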
The skip interval is computed very simply through the use of the following formula:

$$\text{Skip interval} = \frac{\text{Population list size}}{\text{Sample size}}$$

For example, suppose there are 100,000 elements in the population and a sample of 1,000 is desired. In this case, the sampling interval, *i*, is 100. A random number between 1 and 100 is selected. If, for example, this number is 23, the sample consists of elements 23, 123, 223, 323, 423, 523, and so on.

**Procedure**

1. Select a suitable sampling frame.
2. Assign each element a number from 1 to N (the population size).
3. Determine the sampling interval i, where i = N/n. If i is a fraction, round to the nearest whole number.
4. Select a random number, r, between 1 and i, as explained in simple random sampling.
5. The elements with the following numbers will comprise the systematic random sample: r, r + i, r + 2i, r + 3i, ..., r + (n - 1)i.

3. **Cluster Sampling**

Cluster sampling is used when the population under study is large, when the members are widely scattered, or when the selection of individual members is not convenient for several reasons. It is used in situations where the population members are naturally grouped in units that can conveniently be used as clusters.

Example: A researcher is interested in surveying the mathematics achievement of 4^th^ grade students in elementary schools in Ethiopia. It is practically impossible for a single researcher to test all 4^th^ grade elementary school students in the country. Since the 4^th^ graders are naturally grouped by regions, zones, woredas, schools and classes, the researcher may take zones, woredas or schools as clusters. If schools are selected, it will be easier for the researcher to take students as a sample. Then, all students in the selected schools are included in the sample.

Cluster sampling differs from stratified random sampling in that, in cluster sampling, random selection occurs not with the individual members but with the clusters.
The clusters for the sample are randomly selected from the larger population of clusters, and once a cluster is selected for the sample, all the population members in that cluster are included in the sample. This is in contrast to stratified sampling, in which the individual members within strata are randomly selected. We illustrate cluster sampling by describing a type of cluster sample known as area sampling.

**Area sampling as a form of Cluster Sampling**

In area sampling, the researcher subdivides the population to be surveyed into areas, such as regions, cities, sub-cities, neighborhoods, or any other convenient and identifiable geographic designation. The researcher has two options at this point:

a. **In the one-step approach**, the researcher may believe the various geographic areas to be sufficiently identical to permit concentrating on just one area and then generalizing the results to the full population. But the researcher would need to select that one area randomly and perform a census of its members.

b. **In the two-step approach**, there are two steps to the sampling process.

- In the first step, the researcher selects a random sample of areas.
- In the second step, the researcher decides on a probability method to sample individuals within the chosen areas.

The two-step approach is preferable to the one-step approach because there is always the possibility that a single cluster may be less representative than the researcher believed. But the two-step method is more costly because more areas and more time are involved.

***Cluster sampling procedure***

4. **Stratified Sampling**

- Stratified sampling is a probability sampling procedure in which the target population is first separated into mutually exclusive, homogeneous segments (strata), and then a simple random sample is selected from each segment (stratum). The samples selected from the various strata are then combined into a single sample.
This sampling procedure is sometimes referred to as "quota random sampling."

- All of the sampling methods we have described thus far implicitly assume that the population has a normal (bell-shaped) distribution for its key properties. That is, there is the assumption that every potential sample unit is a fairly good representation of the population, and any that are extreme in one way are perfectly counterbalanced by potential sample units at the opposite extreme. Unfortunately, it is common to work with populations in commercial research that contain unique subgroupings; you might encounter a population that is not distributed symmetrically across a normal curve. In this situation, unless you make adjustments in your sample design, you will end up with a sample described as "statistically inefficient" or, in other words, inaccurate. One solution is ***stratified sampling, which separates the population into different subgroups and then samples all of these subgroups.***

- With stratified random sampling, one takes a skewed population and identifies the subgroups or strata contained within it. Simple random sampling, systematic sampling, or some other type of probability sampling procedure is then applied to draw a sample from each stratum. The stratum sample sizes can differ based on knowledge of the variability in each population stratum and with the aim of achieving the greatest statistical efficiency.

**Accuracy of stratified sampling:** How does stratified sampling result in a more accurate overall sample? There are two ways this accuracy is achieved.

- First, stratified sampling allows for explicit analysis of each stratum. The college degree example illustrates why a researcher would want to know about the distinguishing differences between the strata in order to assess the true picture.
Each stratum represents a different response profile, and by allocating sample size based on the variability in the strata profiles, a more efficient sample design is achieved.

- Second, there is a procedure that allows the estimation of the overall sample mean by use of a weighted mean, whose formula takes into consideration the sizes of the strata relative to the total population size and applies those proportions to the strata's means. The population mean is calculated by multiplying each stratum's mean by its proportion and summing the weighted means. This formula results in an estimate that is consistent with the true distribution of the population when the sample sizes used in the strata are not proportionate to their shares of the population. Here is the formula that is used for two strata:

    Mean~population~ = (mean~A~)(proportion~A~) + (mean~B~)(proportion~B~)

    where A signifies stratum A, and B signifies stratum B.

**Procedures**

1. Select a suitable sampling frame.
2. Select the stratification variables and the number of strata, H.
3. Divide the entire population into H strata. Based on the classification variable, each element of the population is assigned to one of the H strata.
4. In each stratum, number the elements from 1 to N~h~ (the population size of stratum h).
5. Determine the sample size of each stratum, n~h~, based on proportionate or disproportionate stratified sampling, where

    $$\sum_{h=1}^{H} n_h = n$$

6. In each stratum, select a simple random sample of size n~h~.

***What Are the Subtypes of Stratified Sampling?***

There are two major subtypes of stratified sampling: proportionate stratified sampling and disproportionate stratified sampling. Disproportionate stratified sampling has various subcategories.

***[Proportionate Stratified Sampling]***

- In proportionate stratified sampling, the number of elements allocated to the various strata is proportional to the representation of the strata in the target population.
That is, the size of the sample drawn from each stratum is proportional to the relative size of that stratum in the target population. As such, it is a self-weighting and EPSEM (equal probability of selection method) sampling procedure. The same sampling fraction is applied to each stratum, giving every element in the population an equal chance to be selected.

- The resulting sample is a self-weighting sample. This sampling procedure is used when the purpose of the research is to estimate a population's parameters.

***[Disproportionate Stratified Sampling]***

- Disproportionate stratified sampling is a stratified sampling procedure in which the number of elements sampled from each stratum is not proportional to their representation in the total population. Population elements are not given an equal chance to be included in the sample. The same sampling fraction is not applied to each stratum. Instead, the strata have different sampling fractions, and as such, this sampling procedure is not an EPSEM sampling procedure. In order to estimate population parameters, the population composition must be used as weights to compensate for the disproportionality in the sample. However, for some research projects, disproportionate stratified sampling may be more appropriate than proportionate stratified sampling.

- Disproportionate stratified sampling may be broken into three subtypes based on the purpose of the allocation that is implemented. The purpose of the allocation could be to facilitate within-strata analyses, between-strata analyses, or optimum allocation. Optimum allocation may focus on the optimization of costs, the optimization of precision, or the optimization of both precision and costs.

**5-Stage (multi-stage) Sampling** It is a further development of the principle of cluster sampling. It is used in large-scale surveys for a more comprehensive investigation. The researcher may have to use two-, three-, or four-stage sampling.
For example, a researcher wants to study the opinion of teachers towards self-centered class system. He wants to select a sample from all elementary school teachers in Ethiopia. A simple random sample would be impractical, so a sample of five regional states could be selected randomly from the Northern, Eastern, Southern, Western and Central regions. From the five states chosen, all zones could be listed and a random sample of 15 zones selected. From the 15 zones, 30 Woredas can be selected randomly, and from these Woredas all elementary schools could be listed and a random sample of 100 schools selected. It would not be difficult to compile a list of all elementary school teachers working in the 100 schools and to select a random sample of, say, 650 teachers. The successive random sampling of regional states, zones, Woredas, schools and finally teachers constitutes a multistage (five-stage) sample. Multistage sampling is a comparatively convenient, less time-consuming, and less expensive method of sampling. However, an element of sample bias is introduced because of the unequal size of some of the selected sub-samples.

***[Non-Probability Sampling]***

- All of the sampling methods we have described thus far embody probability sampling assumptions. In each case, the probability of any item being selected from the population into the sample is known, even though it cannot be calculated precisely. The critical difference between probability and non-probability sampling methods is the mechanics used in the sample design. With a non-probability sampling method, selection is not based on probability. That is, you cannot calculate the probability of any one person in the population being selected into the sample. Still, each non-probability sampling method strives to draw a representative sample.

- There are four non-probability sampling methods.

1. **Convenience Samples:** Convenience samples are samples drawn at the convenience of the interviewer.
Accordingly, the most convenient areas to a researcher in terms of time and effort turn out to be ***"high traffic" areas such as shopping or busy pedestrian areas***. The selection of the place and, consequently, of prospective respondents is subjective rather than objective. Certain members of the population are automatically eliminated from the sampling process. For instance, there are those people who may be infrequent visitors or even non-visitors of the particular high-traffic areas being used. In addition, in the absence of strict selection procedures, there are members of the population who may be omitted because of their physical appearance, or by the fact that they are in a group rather than alone.

2. **[Judgment Samples:]** Judgment samples are somewhat different from convenience samples in concept because they require judgment or an "educated guess" as to who should represent the population. Often, the researcher or some individual helping the researcher who has considerable knowledge about the population will choose those individuals he or she feels constitute the sample. Focus group studies often use judgment sampling rather than probability sampling.

3. **[Referral Sampling:]** Referral samples, sometimes called **"snowball samples,"** require respondents to provide the names of additional respondents. Such lists begin when the researcher compiles a short list of sample units that is smaller than the total sample he or she desires for the study. After each respondent is interviewed, he or she is queried about the names of other possible respondents. In this manner, additional respondents are referred by previous respondents. Referral samples are most appropriate when there is a limited and disappointingly short sample frame and when respondents can provide the names of others who would qualify for the survey. The non-probability aspects of referral sampling come from the selectivity used throughout.

4. **[Quota Samples:]** The quota sample establishes a specific quota for various types of individuals to be interviewed. It is a form of non-probability sampling used prevalently by commercial researchers. The quotas are determined through application of the research objectives and are defined by key characteristics used to identify the population. In the application of quota sampling, a field worker is provided with screening criteria that will classify the potential respondent into a particular quota cell. A large bank, for instance, might stipulate that the final sample be one-half adult males and one-half adult females because, in its understanding of its market, the customer profile is about 50-50, male and female. A quota system overcomes much of the non-representativeness danger inherent in convenience samples. It may guarantee that the researcher has sufficient subsample sizes for meaningful subgroup analysis.

**Strengths and Weaknesses of Sampling Methods**

***[Developing a Sample Plan]*** The various aspects of sampling are logically joined together, and there is a definite sequence of steps, called the sample plan, that the researcher goes through in order to draw and ultimately arrive at the final sample. These steps are illustrated in the following figure.

**Illustration of steps of a sample plan**

- **[Step 1: Define the Relevant Population]**
- The very first step to be considered in the sampling process requires a definition of the target population under study. The target population is identified by the research study objectives. The researcher has to specify the sample unit in the form of a precise description of the type of person to be surveyed. Demographic descriptions, such as age range, income range, educational level, and so forth, are typically used in population definitions. Lifestyle descriptions may also be included. Finally, other types of descriptions are available.
- **[Step 2: Obtain a "listing" of the Population]**
- Once the relevant population has been defined, the researcher begins searching for a suitable list to serve as the sample frame. In some studies, candidate lists are readily available in the form of directories, company files or records, either public or private, that are made available to the researcher. Most lists suffer from sample frame error: as we noted earlier, the listing does not contain a complete enumeration of the population, or some of those listed may not belong to the population. The key to assessing sample frame error lies in two factors:

    1. Judging how different the people listed in the sample frame are from the population, and
    2. Estimating what kinds of people in the population are not listed in the sample frame.

- With the first factor, screening questions at the beginning of an interview will usually suffice as a means of disqualifying those contacted who are not consistent with the population definition.
- With the second consideration, if the researcher cannot find any reason that those population members who were left off the list would adversely affect the final sample, the degree of frame error is judged tolerable.

**[Step 3: Design Sample Plan (size, method)]**

- Armed with a precise definition of the population and an understanding of the availability and condition of the target population, the researcher progresses directly into the design of the sample itself. At this point, the cost of the various data collection methods comes into play. That is, the researcher begins to simultaneously balance sample design, data collection costs, and sample size. There may be a need for a trade-off between the desire for statistical precision and the requirements of efficiency and economy. Regardless of the size of the sample, the specific sampling method or combination of sampling methods to be employed must be stipulated in detail by the researcher. There is no one "best" sampling method.
The sample plan varies according to the objective of the survey and its constraints. The sampling method description includes all of the necessary steps to draw the sample. For instance, if we decide to use systematic sampling, the sampling method would detail the sampling frame, the sample size, the skip interval, how the random starting point would be determined, qualifying questions, recontacts, and replacement procedures. Obviously, it is vital to the success of the survey that the sampling method is adhered to throughout the entire sampling process.

**Step 4: Access the Population** It refers to approaching the population as per the specified definition.

**Step 5: Draw the sample** Drawing the sample is a two-phase process. Simply put, you need to choose a person and ask him or her some questions. However, as you realize, not everyone will agree to answer. So there comes the question of substitutions. Substitutions occur whenever an individual who was qualified to be in the sample proves to be unavailable, unwilling to respond, or unsuitable.

**Step 6: Validate the Sample**

- The final activity in the sampling process is the validation stage. Sample validation can take a number of forms, one of which is to compare the sample's demographic profile with a known profile such as the census. With quota sample validation, of course, the researcher must use demographic characteristics other than those used to set up the quota system. The essence of sample validation is to assure the client that this sample is, in fact, a representative sample of the population about which the decision maker wishes to make decisions.

**Step 7: Resample, if Necessary**

- When a sample fails the validation, it means that it does not adequately represent the population. This problem may arise even when sample substitutions are incorporated.
Sometimes when this condition is found, the researcher uses a weighting scheme in the tabulations and analyses to compensate for the misrepresentation. On the other hand, it is sometimes possible to perform resampling by selecting more respondents and adding them until a satisfactory level of validation is reached.

***[Sampling and Non Sampling Error]***

- Sampling studies are subject to sampling and non-sampling errors, which are of a random or of a constant nature. The errors created due to sampling, and of which the average magnitude can be determined, are sampling errors, while the others are called biases.

**[Sampling Error:]**

- Is the difference between the result of a sample and the result of a census
- Is the difference between the sample estimate and the actual value of the population
- Is an error that is created because of chance only

Although a sample is properly selected, there will be some difference between the estimate obtained from the sample (sample statistic) and the actual value of the population (parameter). The mean of the sample might be different from the population mean by chance alone. The standard deviation of the sample will also probably be different from the population standard deviation. We can, therefore, expect some difference between the sample statistics and the corresponding population values known as parameters. This difference is known as the sampling error.

***[Systematic/Non Probability/Sampling Bias]*** It is a non-probability error, which can be created from errors in the sampling procedures, and it cannot be reduced or eliminated by increasing the sample size. Such errors occur because of human mistakes and not chance. The possible factors that contribute to the creation of such bias include:

1. **Inappropriate Sampling:** If the sample unit is a misrepresentation of the universe, it will result in sampling bias. This could happen when a researcher gathers data from a sample that was drawn from some favored locations.
It occurs when there is a failure of all units in the universe to have some probability of being selected for the sample.

2. **Accessibility Bias:** In a considerable number of research studies, researchers tend to select respondents who are the most accessible to them (easily reached). But it should be noted that when all members of the population are not equally accessible, the researcher must provide some mechanism of control so as to ensure the absence of over- and under-representation of some respondents.

3. **Defective Measuring Devices:** In some instances, questions in a questionnaire may not be phrased so that they are fully understandable by respondents. Consequently, the answers obtained are not accurate. Furthermore, on any measuring device, most individuals are likely to be *mismeasured to some degree due to errors in procedures of observation, interviewing, coding, etc.*

4. **Non-Response Bias:** This is an incomplete coverage of a sample or an inability to get complete responses from all the individuals initially included in the sample. This arises due to failure in locating some of the individuals of the population selected for the sample or due to their refusal. Non-response errors are also due to the respondents not possessing (having) correct information or due to their giving deliberately biased responses.

Note that non-sampling errors occur both in a sample survey and in a census, whereas sampling errors occur only when a sample survey is conducted. Preparing the survey questionnaire and handling the data carefully can minimize non-sampling errors.

***[Determining the Sample Size]*** Instead of determining representativeness, the size of the sample affects the accuracy of results. Sample accuracy refers to how close the sample's statistic is to the true population's value it represents. Note that:

- Sample size is not related to representativeness. Representativeness is dependent on the sample plan.
- Sample size is related to accuracy.
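To make the link between sample size and accuracy concrete, the sketch below computes the approximate margin of error of a sample proportion for several sample sizes, and then inverts the relationship to find the sample size needed for a target margin. It is an illustration under stated assumptions, not a formula taken from this chapter: it assumes simple random sampling from a large population, the most conservative variability (p = 0.5), and z = 1.96 for a 95% confidence level. The function names are hypothetical.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate half-width of the confidence interval for a proportion.

    Assumes simple random sampling from a large population; p = 0.5 is the
    most conservative variability and z = 1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)


def required_sample_size(e, p=0.5, z=1.96):
    """Smallest n whose margin of error does not exceed e (same assumptions)."""
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)


# Accuracy improves as the sample grows, but with diminishing returns:
# quadrupling the sample size only halves the margin of error.
for n in (100, 400, 1000, 2500):
    print(n, f"{100 * margin_of_error(n):.1f}%")

print(required_sample_size(0.05))  # 385
```

Note that nothing in this calculation refers to how the sample was drawn; a large sample taken with a poor sample plan is still unrepresentative.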
- **Methods of Determining Sample Size**

**1. Arbitrary Approach:** The arbitrary approach uses a "rule of thumb" to determine the sample size. The researcher may determine, for instance, that the sample size should be 5% of the population. Arbitrary sample sizes are simple and easy to apply, but they are neither efficient nor economical.

**2. Conventional Approach:** The conventional approach follows some "convention" or number believed somehow to be the right sample size. The convention might be an average of the sample sizes of similar studies, it might be the largest sample size of previous surveys, or it might be equal to the sample size of a competitor's survey. For instance, the researcher may take the conventional sample size of between 1000 and 1200 as determined by the industry. Use of a conventional sample size can result in a sample that may be too small or too large.

**3. Cost Basis Approach:** It uses cost as a basis for sample size. In this case, instead of the value of the information to be gained from the survey being the primary consideration, the sample size is determined by cost factors that ignore that value.

**4. Statistical Analysis Approach:** Some advanced statistical techniques require certain minimum sample sizes in order to be reliable, or to safeguard the validity of the statistical results. Statistical analysis is also used to analyze subgroups within a sample.

**5. Confidence Interval Approach:** To create a valid sample, the confidence interval approach applies the concepts of:

- **Variability:** the amount of dissimilarity in respondents' answers to a particular question.
- **Confidence interval:** a range whose end points define a certain percentage of the responses to a question.
- **Standard error:** it indicates how far away from the true population value a typical sample result is expected to be.

3. **[METHOD OF DATA COLLECTION AND DESIGN OF DATA COLLECTION FORMS]**

- **Sources of Data**
- Data needed for business research can be grouped into two types: primary and secondary. Primary data refers to information that is developed or gathered by the researcher specifically for the research project. Secondary data has previously been gathered by someone other than the researcher and/or for some other purpose than the research project at hand. The term **secondary data** refers to data not gathered for the immediate study at hand but for some other purpose.
- **Secondary Data** **Any data which have been gathered earlier for some other purpose are secondary data in the hands of the business researcher.**
- **Advantages and Disadvantages of secondary data**

**Advantages of secondary data**

- **Secondary sources can usually be found more quickly and cheaply.**
- **In many cases we cannot hope to gather primary information, at any cost, comparable to published data, for example, census reports and industry statistics.**
- **Secondary data sources also extend our time and space range. Most research on past events has to rely on secondary data sources to some degree.**
- **Data gathered about distant places, for example, foreign countries, often can be collected more cheaply through secondary materials.**

**Disadvantages of secondary data**

- **The most important limitation is that the information often will not meet our specific needs. Secondary source information has been collected by someone else for their purpose rather than ours.**
- **Units of measure may not match those we need.**
- **Different time periods may be involved.**
- **We often cannot even assess the accuracy of the information because we know little about the conditions under which the research took place.**
- **Secondary information is often partially obsolete before it is available.**
- ***[Characteristics of secondary data]***

**The researcher must be very careful in using secondary data.
He must make a minute scrutiny because it is quite possible that the secondary data may be unsuitable or inadequate in the context of the problem which the researcher wants to study. By way of caution, before using secondary data, we must see that they possess the following characteristics:**

1. **Reliability of Data: The reliability can be tested by finding out such things about the said data as:**
    - **Who collected the data?**
    - **What were the sources of data?**
    - **Were they collected by using proper methods?**
    - **At what time were they collected?**
    - **Was there any bias of the compiler?**
    - **What level of accuracy was desired?**
    - **Was it achieved?**
2. **Suitability of Data: The researcher must very carefully scrutinize the definitions of the various terms and units of collection used at the time of collecting the data from the primary source originally.**
3. **Adequacy of Data: If the level of accuracy achieved in the data is found inadequate for the purpose of the present enquiry, they will be considered inadequate and should not be used by the researcher.**

1. **Internal data** refers to data that has been collected within the firm. Such data include sales records, purchase requisitions, **departmental reports, production summaries, financial and accounting reports, marketing, sales studies** and invoices. Obviously, a good business researcher always determines what internal information is already available. Today, a major source of internal data is a database that contains information on customers, sales, suppliers, and any other facet of business a firm may wish to track.
2. **External data** is data obtained from outside the firm. We classify external data into three sources: published, syndicated, and databases.
    - Published sources are those that are available from either libraries or other entities such as trade associations.
    - Syndicated sources are highly specialized and are not available in libraries for the general public.
**Syndicated (or commercial) data** normally consist of data that have been collected and compiled according to some standardized procedure. In most cases these data are collected for a particular business or company, with a specific reason or purpose driving the data collection procedure.

- **Generally, published sources are used, but in a few specific cases other sources of information may be useful. There are five categories of external sources:**
    - **Books**
    - **Periodicals**
    - **Government documents**
    - **Variety of materials: reference books, university publications, etc.**
    - **Census data**
- **Primary Data** **Primary data** are originated by a researcher for the specific purpose of addressing the problem at hand. They are individually tailored for the decision-makers of organizations that pay for well-focused and exclusive support. Primary data are:
    - *Information that is developed or gathered by the researcher specifically for the research at hand,*
    - *Those which are collected afresh and for the first time, and thus happen to be original in character,*
    - *Those data which are collected first-hand by the researcher especially for the purpose of the study*
- **Advantages and disadvantages of primary data**

**Advantages of primary data**

- Applicable and usable, if done right
- Accurate and reliable; can answer your research questions directly
- Up to date, as you have collected the data

**Disadvantages of primary data**

- Expensive
- Not immediately available
- Not as readily accessible

- ***[Data Gathering Instruments]***

1. Questionnaire
2. Interview
3. Observation
4. Focus group discussion

*[Questionnaire]* A formalized framework consisting of a set of questions and scales designed to generate primary data.

1. **Unstructured questions:** are open-ended questions formatted to allow respondents to reply in their own words.
2. **Structured questions:** are closed-ended questions that require the respondent to choose from a predetermined set of responses or scale points.
**Advantages of closed-ended/structured questions:**

- Easier and quicker for respondents to answer
- Easier to compare the answers of different respondents
- Easier to code and statistically analyze
- Less literate respondents are not at a disadvantage

**Disadvantages of closed-ended/structured questions:**

- Respondents with no opinion or no knowledge can answer anyway
- Respondents can be confused because of too many choices
- Misinterpretation of a question can go unnoticed

**Advantages of open-ended/unstructured questions**

- Permit an unlimited number of possible answers
- Respondents can answer in detail
- Unanticipated findings can be discovered
- Permit adequate answers to complex questions
- Permit creativity, self-expression, and richness of detail
- Reveal a respondent's logic, thinking process and frame of reference

**Disadvantages of open-ended/unstructured questions**

- Different respondents give different degrees of detail in their answers; some responses may not be considered
- Responses may be irrelevant or buried in useless detail
- Coding responses is difficult
- Articulate and highly literate respondents have an advantage
- A greater amount of respondent time, thought, and effort is necessary

**Types of Closed-Ended Questions**

1. **Dichotomous Questions**

2. **Ranked Responses**

      ------------ --
      Cost
      People
      Atmosphere
      ------------ --

3. **Rated Responses/Likert scale**

                                 **VI**   **I**   **N**   **U**   **VU**
      ------------------------ -------- ------- ------- ------- --------
      Community life            **1**    **2**   **3**   **4**   **5**
      Low cost                  **1**    **2**   **3**   **4**   **5**
      Outdoor life              **1**    **2**   **3**   **4**   **5**
      Ability to move around    **1**    **2**   **3**   **4**   **5**

Note that each of the four rows will form a separate variable that contains the appropriate numeric code from 1 to 5. It is important to keep the scale balanced.

Semantic Differential Scales
----------------------------

Bipolar questions have a scale with two ends. The subject is asked to show bias towards either end.
What is your balance of preference here?

  ------------------------- -------- -------- -------- -------- -------- --------------------------
  I like going for walks.   \[ \]    \[ \]    \[ \]    \[ \]    \[ \]    I like watching a movie.
  ------------------------- -------- -------- -------- -------- -------- --------------------------

This type of question can be used to force a choice between very similar items (thus causing a semantic differential choice).

Multiple choice questions:
--------------------------

Selection questions ask you to make a choice from a list of items given. Single-selection questions ask you to choose only one item. Multiple-choice questions let you choose as many responses as you wish.

Which of the following places have you visited? (Check all that apply):

\[ \] Paris \[ \] London \[ \] Rome \[ \] Moscow

*[Types of Bad Questions]* A bad question is any question that prevents or distorts the fundamental communication between the researcher and the respondents.

[1. Double-Barreled Questions] Double-barreled questions cover more than one topic. The goal should be one topic per question.

Example: When was the last time you upgraded your computer and printer?

This is a double-barreled question because it asks about two separate topics (computer and printer) at the same time. Respondents who have upgraded one without upgrading the other wouldn't know how to answer. When you are analyzing the data, how will you know which part of the question was answered? "And" or "or" within a question usually makes it double-barreled. Try eliminating the less important topic or creating two questions.

***[2. Leading / Loaded Questions]*** A [leading question](http://knowledge-base.supersurvey.com/glossary.htm#leading_question) suggests to the respondent that the researcher expects or desires a certain answer. The respondent should not be able to discern what type of answer the researcher wants to hear.

Example: Now that you've seen how you can save time, would you buy our product?

- By citing proof that the product is good (i.e., it saves time), the questioner has tipped his hand that he wants a "yes" answer.

A loaded question asks the respondent to rely on their emotions more than the facts. Loaded questions contain "emotive" words with a positive or negative connotation. For example, while politicians are willing to associate themselves with democracy and liberty, they try to avoid tags like environmentalist and liberal. They know such "charged" terms can create negative reactions in people regardless of the content of the statement.

Example: Do you approve of the President's oppressive immigration policy?

- This question includes an emotive word (i.e., oppressive) which carries a negatively charged connotation for most respondents. The respondent is asked to answer based on how it would feel to be oppressed rather than on the merits of the president's policy.

**Funnel sequence of questions** Five distinct parts of a questionnaire

1. **The title & identification of the survey's sponsors**
    - Purpose of the survey
    - How the information will be used
    - Sponsors of the survey
    - Guarantee of individual anonymity
2. **Instructions to respondents**
    - How to fill in the questionnaire
    - Information about potential confusion
    - Who is expected to answer which sections of questions
3. **Warm-up questions**
    - Simple and easy to answer with a minimum of effort
    - Bringing respondents to a level of comfort
4. 
**Body of the questionnaire** - The place where the most important questions can be located - The easiest questions should appear before the more difficult or complex questions - The questions should be arranged in a logical order 5. **Classification questions** 6. Pre-testing/Pilot testing ------------------------- - The last step in questionnaire design is to test a questionnaire with a small number of interviews before conducting your main interviews. Ideally, you should test the survey on the same kinds of people you will include in the main study. If that is not possible, at least have a few people, other than the question writer, try the questionnaire. This kind of test run can reveal unanticipated problems with question wording, instructions to skip questions, etc. It can help see if the interviewees understand your questions and giving useful answers. - ***[Interview]*** - Individual interviews occur as conversations between a researcher and a respondent. Highly secluded interviews list all the questions the interviewer is supposed to ask. The interviewer follows this interview schedule carefully so that interviews by different interviewers and with different respondents are conducted in a similar way. Moderately scheduled interviews identify the specific questions interviews are to ask but allow them freedom to probe for additional information after responses to the primary questions are given. Unscheduled interview list the primary topics the interviewer should cover but allow maximum freedom for phrasing. - **[Focus Group Discussion]** Powell et al (1996: 499) define a focus group as a group of individuals selected and assembled by researchers to discuss and comment on, from personal experience, the topic that is the subject of the research. Focus groups are a form of group interviewing but it is important to distinguish between the two. 
Group interviewing involves interviewing a number of people at the same time, the emphasis being on questions and responses between the researcher and participants. Focus groups, however, rely on interaction within the group based on topics that are supplied by the researcher (Morgan 1997: 12).

Merton and Kendall's (1946) influential article on the focused interview set the parameters for focus group development. This was in terms of ensuring that participants have a specific experience of or opinion about the topic under investigation; that an explicit interview guide is used; and that the subjective experiences of participants are explored in relation to predetermined research questions.

***[Why use focus groups and not other methods?]***

The main purpose of focus group research is to draw upon respondents' attitudes, feelings, beliefs, experiences and reactions in a way that would not be feasible using other methods, for example observation, one-to-one interviewing, or questionnaire surveys.

Compared to individual interviews, which aim to obtain individual attitudes, beliefs and feelings, focus groups elicit a multiplicity of views and emotional processes within a group context. The individual interview is easier for the researcher to control than a focus group, in which participants may take the initiative.

Compared to observation, a focus group enables the researcher to gain a larger amount of information in a shorter period of time. Observational methods tend to depend on waiting for things to happen, whereas the researcher follows an interview guide in a focus group. In this sense focus groups are not natural but organized events.
**[The role of focus groups]**

Focus groups can be used at the preliminary or exploratory stages of a study (Kreuger 1988); during a study, perhaps to evaluate or develop a particular programme of activities (Race et al 1994); or after a programme has been completed, to assess its impact or to generate further avenues of research. They can be used either as a method in their own right or as a complement to other methods, especially for triangulation (Morgan 1988) and validity checking.

Focus groups can help to explore or generate hypotheses (Powell and Single 1996) and to develop questions or concepts for questionnaires and interview guides (Hoppe et al 1995; Lankshear 1993). They are, however, limited in their ability to generalize findings to a whole population, mainly because of the small numbers of people participating and the likelihood that the participants will not be a representative sample.

Examples of research in which focus groups have been employed include developing HIV education in Zimbabwe (Munodawafa et al 1995), understanding how media messages are processed (Kitzinger 1994 and 1995), exploring people's fear of woodlands (Burgess 1996) and distance interviewing of family doctors (White and Thomson 1995).

**[The practical organization of focus groups]**

Organizing focus group interviews usually requires more planning than other types of interviewing, as getting people to group gatherings can be difficult and setting up appropriate venues with adequate recording facilities requires a lot of time.

The recommended number of people per group is usually six to ten (MacIntosh 1993), but some researchers have used up to fifteen people (Goss and Leinbach 1996) or as few as four (Kitzinger 1995). Numbers of groups vary, some studies using only one meeting with each of several focus groups (Burgess 1996), others meeting the same group several times. Focus group sessions usually last from one to two hours.
Neutral locations can be helpful for avoiding either negative or positive associations with a particular site or building (Powell and Single 1996). Otherwise the focus group meetings can be held in a variety of places, for example in people's homes, in rented facilities, or where the participants hold their regular meetings if they are a pre-existing group.

It is not always easy to identify the most appropriate participants for a focus group. If a group is too heterogeneous, whether in terms of gender or class, or in terms of professional and 'lay' perspectives, the differences between participants can make a considerable impact on their contributions. Alternatively, if a group is homogeneous with regard to specific characteristics, diverse opinions and experiences may not be revealed. Participants need to feel comfortable with each other. Meeting with others whom they think of as possessing similar characteristics or levels of understanding about a given topic will be more appealing than meeting with those who are perceived to be different (Morgan 1988).

### The role of moderator

Once a meeting has been arranged, the role of moderator or group facilitator becomes critical, especially in terms of providing clear explanations of the purpose of the group, helping people feel at ease, and facilitating interaction between group members.

During the meeting moderators will need to promote debate, perhaps by asking open questions. They may also need to challenge participants, especially to draw out people's differences, and tease out a diverse range of meanings on the topic under discussion. Sometimes moderators will need to probe for details, or move things forward when the conversation is drifting or has reached a minor conclusion. Moderators also have to keep the session focused, so sometimes they may deliberately have to steer the conversation back on course. Moderators also have to ensure everyone participates and gets a chance to speak.
At the same time moderators are encouraged not to show too much approval (Kreuger 1988), so as to avoid favoring particular participants. They must avoid giving personal opinions so as not to influence participants towards any particular position or opinion.

**Triangulation**

Triangulation refers to the use of more than one approach to the investigation of a research question in order to enhance confidence in the ensuing findings. By combining multiple observers, theories, methods, and empirical materials, researchers can hope to overcome the weaknesses, intrinsic biases and problems that come from single-method, single-observer, single-theory studies. Often the purpose of triangulation in specific contexts is to obtain confirmation of findings through convergence of different perspectives. The point at which the perspectives converge is seen to represent reality.

**Types of Triangulation:**

According to Stake (1995) there are five protocols of triangulation. These are:

- ***Data Source Triangulation -*** The analyst asks whether or not what they are reporting is likely to be constant at other times or in other circumstances.
- ***Investigator Triangulation -*** Other researchers take a look at the same scene, or findings are presented to other researchers to discuss alternative interpretations.
- ***Theory Triangulation -*** Multiple investigators agree as to the meaning of the phenomenon.
- ***Methodological Triangulation -*** This involves using a variety of data collection methods to build confidence in the interpretations made so far.
- ***Member Triangulation -*** The respondent is asked to review the material for accuracy and to add further comments that might aid description and explanation. By doing so, the actors personally help to triangulate the researcher's observations and interpretations.

4. **DATA PROCESSING AND ANALYSIS**

The goal of any research is to provide information out of raw data.
The raw data, after collection, have to be processed and analyzed in line with the outline (plan) laid down for the purpose at the time of developing the research plan. The compiled data must be classified, processed, analyzed and interpreted carefully before their complete meanings and implications can be understood.

**Data Processing**

Data processing implies editing, coding, classification and tabulation of collected data so that they are amenable to analysis.

**1. Editing:** Editing is the process of examining the collected raw data to detect errors and omissions (and extreme values) and to correct them when possible.

- It involves a careful scrutiny of completed questionnaires or schedules.
- It is done to assure that the data are: - - - - -
- Editing can be either field editing or central editing.
    - **Field editing:** consists of the investigator reviewing the reporting forms to complete what was written in abbreviated and/or illegible form at the time of recording the respondent's response. This sort of editing should be done as soon as possible after the interview or observation.
    - **Central editing:** takes place at the research office. Its objective is to correct errors such as an entry in the wrong place or an entry recorded in the wrong units (weeks instead of months).

**2. Coding:** Coding refers to the process of assigning numerals or other symbols to answers so that responses can be put into a limited number of categories or classes. Such classes should be appropriate to the research problem under consideration. There must be a class for every data item (the classes must be exhaustive). They must also be mutually exclusive (a specific answer can be placed in one and only one cell in a given category set). Coding decisions should usually be taken at the designing stage of the questionnaire.

**3. Classification:** Most research studies result in a large volume of raw data, which must be reduced into homogeneous groups.
Data classification implies the process of arranging data in groups or classes on the basis of common characteristics. Data having common characteristics are placed in one class, and in this way the entire data set gets divided into a number of groups or classes.

- **Classification according to attributes:** Data are classified on the basis of common characteristics that are descriptive, such as literacy, sex, honesty, etc. Such descriptive characteristics refer to qualitative phenomena, which cannot be measured quantitatively: only their presence or absence in an individual item can be noticed. Data obtained this way on the basis of certain attributes are known as statistics of attributes, and their classification is said to be classification according to attributes.
- **Classification according to class interval:** Unlike descriptive characteristics, numerical characteristics refer to quantitative phenomena, which can be measured through some statistical unit. Data relating to income, age, weight, etc. come under this category. Such data are known as statistics of variables and are classified on the basis of class intervals. For example, individuals whose incomes, say, are within 1001-1500 birr can form one group, those whose incomes are within 1501-2000 birr can form another group, and so on. In this way the entire data set may be divided into a number of groups or classes, usually called class intervals. Each class interval thus has an upper as well as a lower limit, known as the class limits. The difference between the two class limits is known as the class magnitude. The number of items that fall in a given class is known as the frequency of the given class.

**Data Analysis**

Data analysis is the further transformation of the processed data to look for patterns and relations among data groups. By analysis we mean the computation of certain indices or measures, along with searching for patterns or relationships that exist among the data groups.
Analysis, particularly in the case of survey or experimental data, involves estimating the values of unknown parameters of the population and testing hypotheses for drawing inferences. Analysis can be categorized as: - - -

**Descriptive Analysis**

- Descriptive analysis is largely the study of the distribution of one variable. Analysis begins, for most projects, with some form of descriptive analysis to reduce the data into a summary format. Descriptive analysis refers to the transformation of raw data into a form that will make them easy to understand and interpret.
- The most common forms of describing the processed data are:
    - Tabulation
    - Percentage
    - Measures of central tendency
    - Measures of dispersion
    - Measures of asymmetry

**[Tabulation]**

Tabulation refers to the orderly arrangement of data in a table or other summary format. It presents responses or observations on a question-by-question or item-by-item basis and provides the most basic form of information. It tells the researcher how frequently each response occurs. Tabulation may be done by hand or by mechanical or electronic devices such as the computer.

Tabulation may be classified as simple or complex:

- **Simple tabulation** gives information about one or more groups of independent questions, resulting in a one-way table.
- **Complex tabulation** shows the division of data into two or more categories. It is designed to give information concerning one or more sets of inter-related questions.

**Need for Tabulation**
=======================

- It conserves space and reduces explanatory and descriptive statements to a minimum.
- It facilitates the process of comparison.
- It facilitates the summation of items and the detection of errors and omissions.
- It provides a basis for various statistical computations.

**[Percentage]:** Whether the data are tabulated by computer or by hand, it is useful to have percentages and cumulative percentages.
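The tabulation and percentage steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `frequency_table` and the sample answers are hypothetical and are not part of the original material.

```python
from collections import Counter


def frequency_table(responses):
    """Return rows of (category, count, percent, cumulative percent)."""
    counts = Counter(responses)
    total = len(responses)
    rows = []
    cumulative = 0.0
    # Most frequent category first; ties are broken alphabetically.
    for category, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        percent = 100.0 * count / total
        cumulative += percent
        rows.append((category, count, round(percent, 1), round(cumulative, 1)))
    return rows


# Hypothetical answers to a single-selection question (assumed data).
answers = ["Paris", "London", "Paris", "Rome", "Paris", "London", "Moscow", "Paris"]
table = frequency_table(answers)
for row in table:
    print(row)
```

Each row is a one-way (simple) tabulation of a single question: how often a response occurs, its share of all responses, and the running total of those shares.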
A table containing percentages and a frequency distribution is easier to interpret.

**[Measures of Central Tendency]**

Describing the central tendency of the distribution with the mean, median, or mode is another basic form of descriptive analysis. These measures are most useful when the purpose is to identify typical values of a variable or the most common characteristics of a group. Measures of central tendency are also known as statistical averages. The mean, median, and mode are the most popular averages.

- The most commonly used measure of central tendency is the **mean**. To compute the mean, you add up all the numbers and divide by how many numbers there are. It is the arithmetic average: not a halfway point, but a kind of center that balances high numbers with low numbers. For this reason, it is most often reported along with some simple measure of dispersion, such as the range, which is expressed as the lowest and highest number.
- The **median** is the number that falls in the middle of a ranked set of numbers. It is not the arithmetic average; it is the halfway point. There are always just as many numbers above the median as below it. Where there is an even count of numbers, you average the two middle numbers. The median is best suited for data that are ordinal, or ranked. It is also useful when you have extremely low or high scores.
- The **mode** is the most frequently occurring number in a list of numbers. It is the closest thing to what people mean when they say something is average or typical. The mode does not even have to be a number: it will be a category when the data are nominal or qualitative. The mode is useful when you have a highly skewed set of numbers, mostly low or mostly high. You can also have two modes (a bimodal distribution) when one group of scores is mostly low and the other group is mostly high, with few in the middle.

**[Measure of Dispersion]:**

A measure of dispersion is a measurement of how the values of an item are scattered around the true value of the average.
An average value alone fails to give any idea about the dispersion of the values of an item or a variable around the true value of the average. After identifying the typical value of a variable, the researcher can measure how the values of an item are scattered around the true value of the mean. Dispersion is a measurement of how far the values of the variable lie from the average value; it measures the variation in the values of an item. Important measures of dispersion are: - - - -

**[Measure of asymmetry (Skewness)]:**

When the distribution of items happens to be perfectly symmetrical, we have a normal curve and the related distribution is a normal distribution. Such a curve is a perfectly bell-shaped curve, in which case Mean = Median = Mode.

Skewness is thus a measurement of asymmetry and shows the manner in which the items are clustered around the average. In a symmetric (normal) distribution the items show a perfect balance on either side of the mode, but in a skewed distribution the balance is shifted to one side or distorted. The amount by which the balance exceeds on one side measures the skewness. Knowledge about the shape of the distribution is crucial to the use of statistical measures in research analysis, since most methods make specific assumptions about the nature of the distribution.

Skewness describes the asymmetry of a distribution. A skewed distribution therefore has one tail longer than the other.

- A **positively skewed** distribution has a longer tail to the right
- A **negatively skewed** distribution has a longer tail to the left
- A distribution with **no skew** (e.g. a normal distribution) is symmetrical

***[Inferential Analysis]***

Most researchers wish to go beyond the simple tabulation of frequency distributions and the calculation of averages and/or dispersion. They frequently seek to determine the relationship between variables and to test statistical significance.
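Before relationships between variables are examined, the descriptive measures covered so far (central tendency, dispersion and skewness) can be computed directly. The following is a minimal sketch using Python's standard `statistics` module; the function name `describe`, the choice of the population standard deviation, the moment coefficient of skewness, and the sample scores are all assumptions made for illustration, not formulas taken from the original text.

```python
import statistics


def describe(values):
    """Summarize one variable with common descriptive measures."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation (assumed choice)
    # Moment coefficient of skewness: average cubed deviation divided by sd cubed.
    skewness = sum((x - mean) ** 3 for x in values) / n / sd ** 3
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": (min(values), max(values)),
        "std_dev": sd,
        "skewness": skewness,
    }


# Hypothetical scores with one extreme high value (assumed data).
scores = [2, 3, 3, 4, 5, 6, 19]
summary = describe(scores)
for name, value in summary.items():
    print(name, value)
```

With one extreme high score the mean is pulled above the median and the skewness comes out positive, which matches the description of a distribution with a longer tail to the right.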
When the population consists of more than one variable, it is possible to measure the relationship between them. Is there any association or correlation between two or more variables? If yes, to what degree? These questions are answered by the use of correlation techniques.

**[Correlation]**

The most commonly used relational statistic is correlation. It is a measure of the strength of a relationship between two variables, not of causality. Interpretation of a correlation coefficient does not allow even the slightest hint of causality. The most a researcher can say is that the variables share something in common; that is, they are related in some way. The more two things have in common, the more strongly they are related. There can also be negative relations, but the important quality of a correlation coefficient is not its sign but its absolute value. A correlation of -.58 is stronger than a correlation of .43, even though with the former the relationship is negative. The following table lists the interpretations for various correlation coefficients:

  Coefficient (absolute value)   Interpretation
  ------------------------------ ----------------
  .8 to 1.0                      Very strong
  .6 to .8                       Strong
  .4 to .6                       Moderate
  .2 to .4                       Weak
  .0 to .2                       Very weak

Pearson's correlation coefficient, or small **r**, represents the degree of linear *association* between any two variables. Unlike regression, correlation does not distinguish which variable is the independent one and which is the dependent one; therefore, you cannot infer causality. Correlations are also dimension-free, but they require a good deal of variability or randomness in your outcome measures.

A correlation coefficient always ranges from negative one (-1) to one (1). A negative coefficient of -0.65 indicates a fairly strong inverse linear relationship: high values of one variable tend to go with low values of the other. A positive coefficient of 0.65 indicates a relationship of the same strength in which high values of one variable tend to go with high values of the other. The coefficient is not a percentage of cases; it is its square (here about 0.42) that gives the proportion of variance the two variables share.
A correlation coefficient at zero, or close to zero, indicates no linear relationship.

The most frequently used correlation coefficient in data analysis is the Pearson product-moment correlation. It is symbolized by the small letter r, and is fairly easy to compute from raw scores.

Is there any cause and effect (causal relationship) between two variables, or between one variable on one side and two or more variables on the other side? This question can be approached through regression analysis. In regression analysis the researcher tries to estimate or predict the average value of one variable on the basis of the value of another variable.

**[Regression]**

Regression is the closest thing to estimating causality in data analysis, because it predicts how well the numbers "fit" a projected straight line, although a good fit alone does not prove causation. The most common form of regression is linear regression, which uses the least squares method to find the equation of the line that best fits the data, called the regression of y on x. Instead of finding the best single number, one is interested in finding the best line: for any set of data points, however scattered, there is one and only one least squares line (represented by an equation). The slope of the line provides information about the predicted direction of the relationship, and the estimated coefficients (or beta weights) of the independent variables indicate the strength of the relationship.

Zi = independent variable
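As an illustration of the correlation and regression ideas above, the following sketch computes Pearson's r and the least squares line for a simple regression of y on x. It assumes the usual textbook form y = a + b·x with intercept `a` and slope `b`; these symbols, the function names and the sample data are hypothetical and are not the notation of the original equation, which did not survive in the source.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: x = hours of study, y = test score (assumed values).
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

r = pearson_r(hours, score)
intercept, slope = least_squares_line(hours, score)
print("r =", round(r, 3), "r squared =", round(r * r, 3))
print("fitted line: y =", round(intercept, 2), "+", round(slope, 2), "* x")
```

Swapping the two lists leaves `r` unchanged but changes the fitted line, which reflects the point made above: correlation does not distinguish independent from dependent variables, while regression does.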
