Data Collection Methods and Techniques PDF

Summary

This document describes different methods and techniques for gathering data, including primary methods such as surveys and secondary methods such as literature reviews. Detailed explanations are provided regarding the various survey types and their potential applications.

Full Transcript

**Data collection** is the **process of gathering information** and **data through various methods and techniques** in order to answer the research question. It involves the collection of both primary and secondary data (Gray, 2014; Creswell, 2012; Saunders, Lewis, & Thornhill, 2007). **Primary data** is data that is **collected directly from research participants** through methods such as **surveys**, **interviews**, and **observations**. **Secondary data** is **data that has already been collected** by others and is available in sources such as **books**, **journals**, **archives**, **repositories**, and **databases**.

The **survey** is a common technique **used to gather primary data**. Surveys can be conducted through various means, such as **online surveys**, **mail surveys**, **phone surveys**, or **in-person surveys**. Each survey technique has its own advantages and disadvantages. The most common types are:

**(i) Mail surveys.** Questionnaires are sent **through the mail** to a representative sample of the population. The primary benefit is that this method can **reach a vast and diverse audience** and **gives respondents time to think about their responses**. On the other hand, **response rates are often quite low**, and the **quality of the data may be jeopardized** by incomplete or erroneous responses.

**(ii) Phone surveys.** Responses to questionnaires are gathered by **telephone**, using either **pre-programmed devices** or the **services of skilled interviewers**. The primary benefit is the ability to **reach a large number of individuals in a short amount of time**. Another advantage is that the **replies may be collected and analyzed in real time**.
However, **response rates may be lower than those seen in surveys conducted in person**, and there may be **questions over the reliability and precision of the data collected**.

**(iii) Online surveys.** Surveys are developed and administered through **online channels**, such as **email**, **social media**, or **websites** dedicated to surveying (e.g., SurveyMonkey, Google Forms). The primary benefit is the ability to **reach a broad audience that is representative of a variety of demographics in a short amount of time and at a low cost**. **Answers can be gathered and analyzed in real time**, and a range of tools is available to help ensure the quality of the data obtained. Nonetheless, there may be questions regarding the representativeness of the sample, and respondents may be more likely to submit incomplete or erroneous responses.

**(iv) In-person surveys.** An "in-person survey" is conducted face to face, either through **interviews or questionnaires** that respondents fill out on their own. The primary benefit is the capacity to **deliver a high response rate and provide more in-depth responses**, as well as the **capability to clarify and probe responses**. However, this approach can be time-consuming and costly, and there is always the possibility of "social desirability bias" or interviewer effects influencing the results.

**(v) Mixed-mode surveys.** This involves using a **combination of survey techniques, such as online and in-person surveys**, to reach a larger and more diverse population. The main advantage is that it **can help overcome the limitations of individual survey methods** and **provide a more comprehensive and representative sample**.
However, this method can be complex and expensive to implement, and there may be concerns about sample bias and data consistency across different survey modes.

Secondary data is **data that has already been collected and is available for use by other researchers**. Some common techniques used for secondary data collection are:

**(a) Literature gathering.** Literature gathering involves **collecting and analyzing** previously published research studies on a particular topic.

**(b) Government reports.** **Government agencies** collect a large amount of data, which is available for public use. These reports can be used as a secondary data source.

**(c) Online databases.** There are many **online databases that provide access** to a wide range of data sources, including academic journals, government reports, and industry data.

**(d) Social media and web analytics.** **Social media platforms and web analytics tools** provide a wealth of information about online behavior and can be used as a secondary data source.

Validity and reliability are important concepts in data collection (Drost, 2011). **Validity** refers to the extent to which the data collected **accurately reflects the research question or topic**. **Reliability** refers to the **consistency and stability** of the data collected. To ensure validity and reliability in data collection, researchers use appropriate data collection methods, establish clear and consistent procedures, and use standardized instruments and protocols. They should also ensure that the sample size is adequate and representative, and that the data collection process is well documented and transparent. While validity and reliability are relevant to both quantitative and qualitative research, they may be interpreted differently in each approach.

**(i) Content validity** refers to the degree to which a measurement tool covers all the **relevant aspects** of the concept being measured.
Content validity is ensured in two ways: literature review and expert opinion.

**(ii) Construct validity** refers to the degree to which a measurement tool **accurately measures the theoretical construct it is intended to measure**. Convergent validity and divergent validity are two components of construct validity.

**(iii) Criterion validity** refers to the degree to which a **measurement tool correlates with an external criterion**, such as another measurement tool or an observed behavior.

**(a) Reliability in quantitative research.** Reliability refers to the **consistency and stability** of a measurement tool over time and across different situations. In quantitative research, several types of reliability can be assessed, including test-retest reliability, inter-rater reliability, and internal consistency reliability.

**(i) Test-retest reliability** refers to the degree to which a measurement tool produces the **same results** when **administered multiple times** to the same group of participants.

**(ii) Inter-rater reliability** refers to the degree to which **different raters or observers** produce **consistent results** when using the same measurement tool.

**(iii) Internal consistency reliability** refers to the degree to which **different items** in a measurement tool are **consistent** with each other.

**Editing**

**Irrespective of the method of data collection**, the information collected is called raw data or simply data. The first step in processing your data is to ensure that the **data is 'clean'** -- that is, free from inconsistencies and incompleteness. This process of 'cleaning' is called editing.

There are several ways of minimizing such problems:

- By inference -- certain questions in a research instrument may be related to one another, and it might be possible to **find out the answer to one question from the answer to another**.
Of course, you must be careful about making such inferences or you may introduce new errors into the data.
- By recall -- if the data is collected by means of interviews, it might sometimes be possible for the interviewer to **recall a respondent's answers**. Again, you must be extremely careful.
- By going back to the respondent -- if the data has been collected by means of interviews, or the questionnaires contain some identifying information, it is possible to visit or phone a respondent to confirm or ascertain an answer. This is, of course, expensive and time-consuming.

There are two ways of editing the data:

1. Examine all the answers to one question or variable at a time.
2. Examine all the responses given to all the questions by one respondent at a time.

Having 'cleaned' the data, the next step is to code it. The method of coding is largely dictated by two considerations:

1. The way a variable has been measured (measurement scale) in your research instrument (e.g. whether a response to a question is descriptive, categorical or quantitative).
2. The way you want to communicate the findings about a variable to your readers.

For coding, the first level of distinction is whether a set of data is qualitative or quantitative in nature.

Data analysis is the process of examining and interpreting data collected through various methods and techniques to answer the research question (Creswell, 2012). It involves organizing, cleaning, coding, and transforming data to derive meaningful insights and conclusions (Saunders, Lewis, & Thornhill, 2007). Data analysis can involve both quantitative and qualitative methods, depending on the nature of the research question and the data collected.

**Descriptive analysis.** This technique involves summarizing and describing the characteristics of a set of data, such as the mean, median, and standard deviation. The following are some examples of descriptive statistics.

**(i) Measures of central tendency.**
These statistics are used to summarize the typical or central value of a dataset. The most commonly used measures of central tendency are the mean, median, and mode.

**(ii) Measures of dispersion.** These statistics are used to describe how spread out the values in a dataset are. The most commonly used measures of dispersion are the range, variance, and standard deviation.

**(iii) Frequency distributions.** These statistics are used to display the number or proportion of times that each value in a dataset occurs. They are often visualized as histograms or bar charts.

**(iv) Percentiles.** These statistics are used to describe the relative position of an individual value within a dataset. The most commonly used percentiles are the 25th, 50th (median), and 75th percentiles.

**(v) Measures of association.** Correlation coefficients are used to describe the strength and direction of the relationship between two variables. The most commonly used correlation coefficients are Pearson's r and Spearman's rho.

Having analyzed the data that you collected through either quantitative or qualitative method(s), the next task is to present your findings to your readers. The main purpose of using data display techniques is to make the findings easy and clear to understand, and to provide extensive and comprehensive information in a succinct and effective way. There are many ways of presenting information. The choice of a particular method should be determined primarily by your impression or knowledge of your likely readership's familiarity with the topic and with the research methodology and statistical procedures. If your readers are likely to be familiar with 'reading' data, you can use complicated methods of data display; if not, it is wise to keep to simple techniques. Although there are many ways of displaying data, this module is limited to the more commonly used ones. Broadly, there are four ways of communicating and displaying the analyzed data.
These are:
- text
- tables
- graphs; and
- statistical measures.

In quantitative studies the text is very commonly combined with other forms of data display, the extent of which depends upon your familiarity with them, the purpose of the study and what you think would make it easier for your readership to understand the content and sustain their interest in it. Hence, as a researcher, it is entirely up to you to decide the best way of communicating your findings to your readers.

**Text**

Text is by far the most common method of communication in both quantitative and qualitative research studies and, perhaps, the only method in the latter. It is, therefore, essential that you know how to communicate effectively, keeping in view the level of understanding, interest in the topic and need for academic and scientific rigor of those for whom you are writing. Your style should strike a balance between academic and scientific rigor and the level that attracts and sustains the interest of your readers. Of course, it goes without saying that a reasonable command of the language and clarity of thought are imperative for good communication.

Your writing should be thematic: that is, written around the various themes of your report. Findings should be integrated into the literature, citing references using an acceptable system of citation; your writing should follow a logical progression of thought; and the layout should be attractive and pleasing to the eye. Language, in terms of clarity and flow, plays an important role in communication.

**Tables**

**Structure**

Other than text, tables are the most common method of presenting analyzed data. According to The Chicago Manual of Style (1993: 21), 'Tables offer a useful means of presenting large amounts of detailed information in a small space.' According to the Commonwealth of Australia Style Manual (2002: 46), 'tables can be a boon for readers.
They can dramatically clarify text, provide visual relief, and serve as quick point of reference.' It is, therefore, essential for beginners to know about their structure and types. Figure 16.1 shows the structure of a table. A table has five parts:

1. **Title** -- this normally indicates the table number and describes the type of data the table contains. It is important to give each table its own number, as you will need to refer to the tables when interpreting and discussing the data. The tables should be numbered sequentially as they appear in the text. The procedure for numbering tables is a personal choice. If you are writing an article, simply identifying tables by number is sufficient. In the case of a dissertation or a report, one way to identify a table is by the chapter number followed by the sequential number of the table in the chapter. The main advantage of this procedure is that if it becomes necessary to add or delete a table when revising the report, only the table numbers for that chapter, rather than for the whole report, will need to be changed. The description accompanying the table number must clearly specify the contents of that table. In the description, identify the variables about which information is contained in the table, for example 'Respondents by age' or 'Attitudes towards uranium mining'. If a table contains information about two variables, the dependent variable should be identified first in the title, for example 'Attitudes towards uranium mining [dependent variable] by gender [independent variable]'.

2. **Stub** -- the subcategories of a variable, listed along the y-axis (the left-hand column of the table). According to The McGraw-Hill Style Manual (Longyear 1983: 97), 'the stub, usually the first column on the left, lists the items about which information is provided in the horizontal rows to the right.'
The Chicago Manual of Style (1993: 331) describes the stub as: 'a vertical listing of categories or individuals about which information is given in the columns of the table'.

3. **Column headings** -- the subcategories of a variable, listed along the x-axis (the top of the table). In univariate tables (tables displaying information about one variable) the column heading is usually the 'number of respondents' and/or the 'percentage of respondents' (Tables 16.1 and 16.2). In bivariate tables (tables displaying information about two variables) it is the subcategories of one of the variables that are displayed in the column headings (Table 16.3).

4. **Body** -- the cells housing the analyzed data.

5. **Supplementary notes or footnotes** -- there are four types of footnotes: source notes; other general notes; notes on specific parts of the table; and notes on the level of probability (The Chicago Manual of Style 1993: 333). If the data is taken from another source, you have an obligation to acknowledge this. The source should be identified at the bottom of the table and labelled by the word 'Source:', as in Figure 16.1. Similarly, other explanatory notes should be added at the bottom of a table.

**TYPES OF TABLES**

Depending upon the number of variables about which information is displayed, tables can be categorized as:

- **univariate** (also known as **frequency tables**) -- containing information about one variable, for example Tables 16.1 and 16.2;
- **bivariate** (also known as **cross-tabulations**) -- containing information about two variables, for example Table 16.3; and
- **polyvariate or multivariate** -- containing information about more than two variables, for example Table 16.4.
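
The difference between a univariate (frequency) table and a bivariate table (cross-tabulation) can be illustrated with a short sketch. This is only one possible way to build such tables, using Python's standard library. The eight responses are invented for illustration, and the helper names `frequency_table` and `cross_tabulation` are hypothetical, not taken from the source.

```python
from collections import Counter

# Invented example data: (attitude towards uranium mining, gender) per respondent.
responses = [
    ("favourable", "female"),
    ("unfavourable", "female"),
    ("favourable", "male"),
    ("favourable", "male"),
    ("unfavourable", "female"),
    ("unfavourable", "male"),
    ("favourable", "female"),
    ("favourable", "male"),
]


def frequency_table(values):
    """Univariate table: category -> (number, percentage of respondents)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: (n, round(100 * n / total, 1)) for cat, n in sorted(counts.items())}


def cross_tabulation(pairs):
    """Bivariate table: stub (row) category -> Counter of column categories."""
    table = {}
    for row_cat, col_cat in pairs:
        table.setdefault(row_cat, Counter())[col_cat] += 1
    return table


# Univariate: one variable (attitude), with counts and percentages as column headings.
freq = frequency_table(attitude for attitude, _ in responses)

# Bivariate: attitude (dependent variable) in the stub, gender in the column headings.
crosstab = cross_tabulation(responses)

for attitude, (n, pct) in freq.items():
    print(f"{attitude:<14}{n:>3}{pct:>7}%")

for attitude, row in sorted(crosstab.items()):
    print(f"{attitude:<14}female={row['female']} male={row['male']}")
```

Here the attitude categories form the stub and the gender categories form the column headings, matching the 'Attitudes towards uranium mining by gender' title convention described above. A multivariate table would add a further level of nesting for each extra variable.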
