Summary

This document introduces the basics of descriptive and inferential statistics, covering different types of data and data collection methods used in research. It outlines the importance of careful data collection for research integrity.

Full Transcript

INTRODUCTION TO STATISTICS

Statistics can be described as a branch of Mathematics concerned with collecting, organizing, summarizing, presenting, analyzing and interpreting data, and with drawing conclusions or inferences from them. It can also be described as the science of data. It is clear from this definition that statistics is not only the tabulation and graphical presentation of numbers, but also the drawing of inferences from them. Statistics is a very broad subject with applications in a vast number of fields, including the Sciences, Engineering, Medicine, Arts and Social Sciences.

Statistical methods fall into two broad areas:
1. Descriptive statistics
2. Inferential statistics

Descriptive statistics merely describes, organizes, or summarizes data. It refers only to the actual data available. Examples include the mean blood pressure of a group of patients and the success rate of a surgical procedure. Descriptive statistics includes the construction of graphs, charts and tables, and the calculation of various descriptive measures such as averages, measures of variation and percentiles. However, this branch of statistics does not allow one to draw conclusions or make inferences beyond the data at hand.

Inferential statistics involves making inferences that go beyond the actual data. It usually involves inductive reasoning (i.e., generalizing to a population after having observed only a sample). Examples include the mean blood pressure of all citizens and the expected success rate of a surgical procedure in patients who have not yet undergone the operation. It is a method of using information from a sample to draw conclusions about the population. Inferential statistics includes methods such as point estimation, interval estimation, hypothesis testing, the t-test, Analysis of Variance (ANOVA), Pearson's correlation and regression.
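The difference between describing a sample and inferring about a population can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the blood pressure readings are made-up values, and the constant 2.262 is the standard two-sided 95% critical value of Student's t distribution with 9 degrees of freedom, which matches a sample of 10.

```python
import math
import statistics

# Hypothetical systolic blood pressure readings (mmHg) for a sample of 10 patients.
sample = [118, 125, 130, 121, 140, 135, 128, 119, 132, 127]

# Descriptive statistics: summarise only the data actually observed.
mean_bp = statistics.mean(sample)
sd_bp = statistics.stdev(sample)                # sample standard deviation (divides by n - 1)
quartiles = statistics.quantiles(sample, n=4)   # estimates of the 25th, 50th and 75th percentiles

# Inferential statistics: a 95% interval estimate for the unknown population mean.
T_CRIT_95_DF9 = 2.262                           # t critical value for 9 degrees of freedom
margin = T_CRIT_95_DF9 * sd_bp / math.sqrt(len(sample))
ci_low, ci_high = mean_bp - margin, mean_bp + margin

print(f"Sample mean: {mean_bp:.1f} mmHg, sample SD: {sd_bp:.2f} mmHg")
print(f"Quartiles: {quartiles}")
print(f"95% confidence interval for the population mean: ({ci_low:.1f}, {ci_high:.1f})")
```

The first three quantities describe the ten patients only. The confidence interval is a statement about the wider population, and it is meaningful only if the patients can be treated as a random sample from that population.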
COLLECTION AND ORGANISATION OF DATA

CONCEPT OF DATA COLLECTION

Data collection is the process of gathering and measuring information on variables of interest, in an established systematic fashion that enables one to answer stated research questions, test hypotheses, and evaluate outcomes. The data collection component of research is common to all fields of study, including the humanities, business, and the physical, medical and social sciences. While methods vary by discipline, the emphasis on ensuring accurate and honest collection remains the same. The goal of all data collection is to capture quality evidence that translates into rich data analysis and allows the building of a convincing and credible answer to the questions that have been posed. Therefore, accurate data collection is essential to maintaining the integrity of research. Both the selection of appropriate data collection instruments (existing, modified, or newly developed) and clearly defined instructions for their correct use reduce the likelihood of errors occurring.

Data collection is one of the most important stages in conducting research. You can have the best research design in the world, but if you cannot collect the required data, you will not be able to complete your project. Data collection is a very demanding job that needs thorough planning, hard work, patience, perseverance and more to complete successfully. It starts with determining what kind of data is required, followed by the selection of a sample from a certain population. After that, suitable instruments are used to collect the data from the selected sample.

TYPES OF DATA
a. Qualitative Data
b. Quantitative Data
c. Mixed Data

a. Qualitative Data
Qualitative data are mostly non-numerical and usually descriptive or nominal in nature. This means the data collected are in the form of words and sentences. Often, such data capture feelings, emotions, or subjective perceptions of something.
To collect qualitative data in a research study, the questions asked are open-ended. Qualitative methods include:
1. focus groups;
2. group discussions;
3. interviews.

Qualitative approaches are expensive and time-consuming to implement. The findings cannot be generalized to participants outside of the program and are only indicative of the group involved.

Qualitative data collection methods have the following characteristics:
1. they tend to be open-ended and have less structured protocols (i.e., researchers may change the data collection strategy by adding, refining, or dropping techniques or informants);
2. they rely more heavily on interactive interviews; respondents may be interviewed several times to follow up on a particular issue, clarify concepts or check the reliability of data;
3. researchers rely on multiple data collection methods to check the authenticity of their results;
4. the findings are not generalizable to any specific population.

The collection of data in a qualitative study usually takes a great deal of time. The researcher needs to record any potentially useful data thoroughly, accurately, and systematically, using the following:
1. field notes
2. sketches
3. audiotapes
4. photographs
5. other suitable means.

The data collection methods must observe the ethical principles of research.

b. Quantitative Data
Quantitative data are numerical in nature and can be mathematically computed. Quantitative data are measured on different scales, which can be classified as nominal, ordinal, interval and ratio scales. Quantitative approaches are systematic and standardized, and employ methods such as surveys and structured questioning. These approaches have the advantage that they are cheaper to implement and are standardized, so that comparisons can easily be made and the size of an effect can usually be measured.
Quantitative approaches, however, are limited in their capacity for the investigation and explanation of similarities and unexpected differences. It is important to note that, for peer-based programs, agencies often find quantitative data collection approaches difficult to implement. Commonly reported obstacles are a lack of the resources needed to implement surveys rigorously, low participation, and high loss-to-follow-up rates.

Quantitative data collection methods rely on random sampling and structured data collection instruments that fit diverse experiences into predetermined response categories. They produce results that are easy to summarize, compare, and generalize. If the intent is to generalize from the research participants to a larger population, the researcher will employ probability sampling to select participants.

Typical quantitative data gathering strategies include:
1. Experiments/clinical trials.
2. Observing and recording well-defined events. For example, counting the number of patients waiting in emergency at specified times of the day.
3. Obtaining relevant data from management information systems.
4. Administering surveys with closed-ended questions. For example, face-to-face and telephone interviews, questionnaires, etc.

In quantitative research (survey research), interviews are more structured than in qualitative research. In a structured interview, the researcher asks a standard set of questions and nothing more. Face-to-face interviews have the distinct advantage of enabling the researcher to establish rapport with potential participants and therefore gain their cooperation. Paper-and-pencil questionnaires can be sent to a large number of people, saving the researcher time and money. A questionnaire can also be administered online, for instance with Google Forms or Microsoft Forms.
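As a rough illustration of probability sampling and predetermined response categories, the sketch below draws a simple random sample from a sampling frame and tallies closed-ended answers. Everything in it is an assumption made for illustration: the frame of 500 patient IDs, the sample size of 50, the three answer categories, and the responses themselves, which are randomly generated placeholders rather than real survey data.

```python
import random
from collections import Counter

# Hypothetical sampling frame: ID numbers for a population of 500 patients.
population = list(range(1, 501))

# Simple random sampling without replacement: every patient has the same
# chance of selection. A fixed seed keeps the sketch reproducible.
rng = random.Random(42)
sample_ids = rng.sample(population, k=50)

# Placeholder answers to one closed-ended question. In a real survey these
# would come from the respondents; here they are simulated.
CATEGORIES = ["Yes", "No", "Not sure"]
responses = {pid: rng.choice(CATEGORIES) for pid in sample_ids}

# Because answers fall into predetermined categories, they are easy to
# summarize as counts and percentages.
tally = Counter(responses.values())
for category in CATEGORIES:
    share = 100 * tally[category] / len(sample_ids)
    print(f"{category}: {tally[category]} ({share:.0f}%)")
```

Simple random sampling is only one form of probability sampling; it is used here because it is the easiest to show in a few lines.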
Respondents tend to be more truthful when answering questionnaires, particularly on controversial issues, because their responses are anonymous.

c. Mixed Methods (Data)
A mixed methods approach combines qualitative and quantitative research data, techniques and methods within a single research framework. The term may mean a number of things: using several different types of methods in a study, using them at different points within a study, or using a mixture of qualitative and quantitative methods. Mixed methods encompass multifaceted approaches that combine to capitalize on strengths and reduce the weaknesses that stem from using a single research design. Using this approach to gather and evaluate data may help to increase the validity and reliability of the research.

Some of the common areas in which mixed methods approaches may be used include:
1. Initiating, designing, developing and expanding interventions;
2. Evaluation;
3. Improving research design;
4. Corroborating findings.

Some of the challenges of using a mixed methods approach include:
1. Designing complementary qualitative and quantitative research questions;
2. Time-intensive data collection and analysis;
3. Decisions regarding which research methods to combine.

Mixed methods are useful in highlighting complex research problems such as disparities in technology, and can also be transformative in addressing issues for vulnerable or marginalized populations or in research that involves community participation. Using a mixed methods approach is one way to develop creative alternatives to traditional or single-design approaches to research and evaluation.

In Statistics, there are many ways of classifying data. The two main classes of data are:
a. Primary data
b. Secondary data

a. Primary Data
Data that have been collected from first-hand experience are known as primary data. Primary data have not yet been published and are more reliable, authentic and objective.
Primary data have not been changed or altered by human beings; therefore their validity is greater than that of secondary data. In statistical surveys, it is necessary to get information from primary sources and work on primary data. For example, the statistical records of the female population in a country cannot be based on newspapers, magazines and other printed sources. Research can be conducted without secondary data, but research based only on secondary data is the least reliable and may be biased, because secondary data have already been manipulated by human beings. Such sources may be old, they contain limited information, and they can be misleading and biased.

Sources of primary data are limited, and at times it becomes difficult to obtain data from a primary source because of either scarcity of population or lack of cooperation. The following are some of the sources of primary data:
1. Experiments: Experiments require an artificial or natural setting in which to perform a logical study in order to collect data. Experiments are more suitable for medicine, psychological studies, nutrition and other scientific studies. In experiments, the experimenter has to control the influence of any extraneous variable on the results.
2. Survey: The survey is the most commonly used method in medicine, social sciences, management, marketing and, to some extent, psychology. Surveys can be conducted using different methods.
3. Questionnaire: It is the most commonly used method in surveys. A questionnaire is a list of questions, either open-ended or closed-ended, to which the respondents give answers. A questionnaire can be administered by telephone, by mail, live in a public area or in an institute, through electronic mail, through fax and by other methods.
4. Interview: An interview is a face-to-face conversation with the respondent. The main problem in an interview arises when the respondent deliberately hides information; otherwise, it is an in-depth source of information.
The interviewer can not only record the statements the interviewee makes but can also observe body language, expressions and other reactions to the questions. This enables the interviewer to draw conclusions easily.
5. Observations: Observation can be done with or without letting the observed person know that he or she is being observed. Observations can be made in natural settings as well as in an artificially created environment.

Advantages of Using Primary Data
1. The investigator collects data specific to the problem under study.
2. There is no doubt about the quality of the data collected (for the investigator).
3. If required, it may be possible to obtain additional data during the study period.

Disadvantages of Using Primary Data
1. The investigator has to contend with all the hassles of data collection: deciding why, what, how and when to collect; getting the data collected (personally or through others); getting funding and dealing with funding agencies; ethical considerations (consent, permissions, etc.).
2. Ensuring the data collected are of a high standard: all desired data are obtained accurately and in the format required; there are no fake or cooked-up data; unnecessary or useless data have not been included.
3. The cost of obtaining the data is often the major expense in studies.

b. Secondary Data
Data collected from a source that has already been published in any form are called secondary data. The review of literature in any research is based on secondary data. Secondary data are collected by someone else for some other purpose, but are utilized by the investigator for another purpose. For example, census data may be used to analyze the impact of education on career choice and earnings. Common sources of secondary data include censuses, organizational records and data collected through qualitative methodologies or qualitative research.
Secondary data are essential, since it is impossible to conduct a new survey that can adequately capture past changes and/or developments.

Sources of Secondary Data
The following are some sources of secondary data:
1. Books
2. Records
3. Biographies
4. Newspapers
5. Published censuses or other statistical data
6. Data archives
7. Internet articles
8. Research articles by other researchers (journals)
9. Databases, etc.

Importance of Secondary Data
1. Secondary data can be less valid, but they are still important. Sometimes it is difficult to obtain primary data. In these cases, getting information from secondary sources is easier and possible.
2. Sometimes, primary data do not exist. In such a situation, one has to confine the research to secondary data.
3. Sometimes, primary data are present but the respondents did not provide enough relevant information, so secondary data can suffice.
4. A clear benefit of using secondary data is that much of the background work needed has already been carried out. For example, literature reviews and case studies might have been carried out, published texts and statistics could already have been used elsewhere, and media promotion and personal contacts might also have been utilized. This wealth of background work means that secondary data generally have a pre-established degree of validity and reliability, which need not be re-examined by the researcher who is re-using such data.
5. Secondary data can also be helpful in the research design of subsequent primary research and can provide a baseline with which the collected primary data results can be compared. Therefore, it is always wise to begin any research activity with a review of the secondary data.

Other advantages of using secondary data are:
1. No hassles of data collection.
2. It is less expensive.
3. The investigator is not personally responsible for the quality of the data ('I didn't do it').

Disadvantages of Using Secondary Data
1. The third party that collected the data may not be reliable, so the reliability and accuracy of the data go down.
2. With the passage of time, the data become old and obsolete.
3. Secondary data can distort the results of the research. Special care is required to amend or modify secondary data before use.
4. Secondary data can also raise issues of authenticity and copyright.