M2A: Basic Concepts of Sampling Design PDF
Polytechnic University of the Philippines
Rumel Angelo T. Alfaro
Summary
This document provides an overview of basic sampling design concepts in a statistical analysis course, including data collection methods. It details the differences between primary and secondary data, with examples like interviews and questionnaires.
Full Transcript
STAT 203 STATISTICAL ANALYSIS with SOFTWARE APPLICATION
M2A: BASIC CONCEPTS OF SAMPLING DESIGN
RUMEL ANGELO T. ALFARO
P U P

COURSE OUTCOME & TOPICS
Where we’re going…
Define data collection and its importance to research.
Determine the sources of data (primary and secondary data).
Understand the different types of primary data.
Develop the knowledge on the use of secondary data.
Topics
Data Collection (2.1)
Sources of Data (2.2)
Types of Primary Data (2.3)
Secondary Data (2.4)

DATA COLLECTION
Data collection is the process of gathering and measuring information on variables of interest, in an established systematic fashion that enables one to answer stated research questions, test hypotheses, and evaluate outcomes.
Without proper planning for data collection, a number of problems can occur. If the data collection steps and processes are not properly planned, the research project can ultimately end up with a data set that does not serve the purpose for which it was intended.

STEPS IN DATA COLLECTION
1. Set the objectives for collecting data.
2. Determine the data needed based on the set objectives.
3. Determine the method to be used in data gathering and define the comprehensive data collection points.
4. Design data gathering forms to be used.
5. Collect data.

SOURCES OF DATA
1. PRIMARY DATA - data which are collected fresh and for the first time and thus happen to be original in character. These provide a first-hand account of an event or time period and are authoritative. They represent original thinking, reports on discoveries or events, or they can share new information.
2. SECONDARY DATA - data which have been collected by someone else and which have already been passed through the statistical process.
These offer an analysis, interpretation, or restatement of primary sources and are considered to be persuasive. They often attempt to describe or explain primary sources.

PRIMARY DATA
Primary data can generally be collected by the following four methods:
1. Direct/Interview Method
2. Indirect/Questionnaire Method
3. Experiment
4. Observation

DIRECT/INTERVIEW METHOD
This method of collecting data involves the presentation of oral-verbal stimuli and replies in terms of oral-verbal responses.
The interview method is an oral-verbal communication in which the interviewer asks the respondent questions aimed at getting the information required for the study.

TYPES OF INTERVIEW
Personal / Telephonic
Structured / Unstructured
Individual / Group
Clinical / Selection

INDIRECT/QUESTIONNAIRE METHOD
A questionnaire is a self-administered research instrument that can be done by mail or in a group setting such as a classroom. Respondents are expected to read and understand the questions and write down their replies in the space provided for that purpose in the questionnaire itself.
Most of the time, this method of data collection involves sourcing and accessing existing data that were originally collected for the purpose of the study.
Questionnaires help researchers gather the statistical information needed for their studies; however, much care must be given to proper questionnaire design and usage. Otherwise, the results will be unreliable.

TYPES OF QUESTION
An open-ended question is a type of question that does not include response categories. This type of question is usually appropriate for collecting subjective data.
A closed-ended question is a type of question that includes a list of response categories from which the respondent selects an answer. This type of question is usually appropriate for collecting objective data.
Open-Ended Question: “List three activities that you plan to spend more time on when you retire.”
Closed-Ended Question: “Select three activities that you plan to spend more time on after you retire: traveling; eating out; fishing and hunting; exercising; visiting relatives.”

VALIDITY AND RELIABILITY
Validity refers to the extent that the instrument measures what it was designed to measure.
Content validity measures the extent to which the items that comprise the scale accurately represent or measure the information that is being assessed. Are the questions that are asked representative of the possible questions that could be asked?
Construct validity measures what the calculated scores mean and if they can be generalized. Construct validity uses statistical analyses, such as correlations, to verify the relevance of the questions.
Criterion-related validity has to do with how well the scores from the instrument predict a known outcome they are expected to predict. Statistical analyses, such as correlations, are used to determine if criterion-related validity exists. Scores from the instrument in question should be correlated with an item they are known to predict. If a correlation of greater than 0.60 exists, criterion-related validity exists as well.
Source: Statistics Solutions

VALIDITY AND RELIABILITY
Reliability refers to the extent that the instrument yields the same results over multiple trials.
Test-retest is a method that administers the same instrument to the same sample at two different points in time, perhaps at one-year intervals.
If the scores at both time periods are highly correlated (greater than 0.60), they can be considered reliable.
The alternative form method requires two different instruments consisting of similar content.
Internal consistency uses one instrument administered only once. The coefficient alpha (or Cronbach’s alpha) is used to assess the internal consistency of the items. If the alpha value is 0.70 or higher, the instrument is considered reliable.
The split-halves method also requires one test administered once. The items in the scale are divided into halves and a correlation is taken to estimate the reliability of each half of the test.
Source: Statistics Solutions

KEY DESIGN PRINCIPLES OF A GOOD QUESTIONNAIRE
✓ Keep the questionnaire as short as possible.
✓ Decide on the type of questionnaire (Open-Ended or Closed-Ended).
✓ Write the questions properly.
✓ Order the questions appropriately.
✓ Avoid questions that prompt or motivate the respondent to say what you would like to hear.
✓ Write an introductory letter or an introduction.
✓ Write special instructions for interviewers or respondents.
✓ Translate the questions if necessary.
✓ Always test your questions before taking the survey. (Pre-test)

EXPERIMENT
An experiment is a method of collecting data where there is direct human intervention on the conditions that may affect the values of the variable of interest.
Bear in mind that the experimental method has several limitations that you should be aware of; these are considered basic threats to the validity of the experiment:
Ethical, moral, and legal concerns
Unrealistic controlled environments
Inability to control for all variables

VARIABLES IN AN EXPERIMENT
Statistical studies usually include one or more independent variables and one dependent variable.
The independent variable in an experimental study is the one that is being manipulated by the researcher. The independent variable is also called the explanatory variable.
The resultant variable is called the dependent variable or the outcome variable.
The treatment group (also called the experimental group) receives the treatment whose effect the researcher is interested in.
The control group receives either no treatment, a standard treatment whose effect is already known, or a placebo (a fake treatment).
Example: A study conducted at Virginia Polytechnic Institute and presented in Psychology Today divided female undergraduate students into two groups and had the students perform as many sit-ups as possible in 90 sec. The first group was told only to “Do your best,” while the second group was told to try to increase the actual number of sit-ups done each day by 10%. After four days, the subjects in the group who were given the vague instructions to “Do your best” averaged 43 sit-ups, while the group that was given the more specific instructions to increase the number of sit-ups by 10% averaged 56 sit-ups by the last day’s session. The conclusion then was that athletes who were given specific goals performed better than those who were not given specific goals. Here the independent variable is the type of instruction given, and the dependent variable is the number of sit-ups performed.

EXPERIMENTAL STUDY
In an experimental study, the researcher manipulates one of the variables and tries to determine how the manipulation influences other variables.
In a true experiment, participants are randomly assigned to either the treatment or the control group.
A quasi-experiment, also called a non-randomized study, is similar to a true experiment but lacks random assignment to treatment and control groups.

EXPERIMENTAL DESIGNS
A research design includes the structure of a study and the strategies for conducting that study.
Specific to experiments, experimental designs have been developed to reduce biases of all kinds as much as possible.
Let R = Randomized, that is, subjects are randomly assigned to the treatment group; O = Observation or testing; X = the Treatment.
Design 1 (randomized; pre-test, treatment, post-test)
  Treatment group: R  O1  X  O2
  Control group:   R  O3     O4
Design 2 (randomized; treatment, post-test)
  Treatment group: R  X  O2
  Control group:   R     O4
Design 3 (not randomized; pre-test, treatment, post-test)
  Treatment group: O1  X  O2
  Control group:   O3     O4
Design 4 (not randomized; treatment, post-test)
  Treatment group: X  O2
  Control group:      O4

OBSERVATION METHOD
Observation method is a method under which data from the field is collected with the help of observation by the observer or by personally going to the field.
ADVANTAGES
Subjective bias eliminated
Current information
Independent to respondent’s variable
DISADVANTAGES
Time consuming
Limited information
Unforeseen factors

TYPES OF OBSERVATION
By Structure
1. Structured Observation - when observation is done by characterizing the style of recording the observed information, standardized conditions of observation, definition of the units to be observed, or selection of pertinent data of observation.
Example: An auditor performing inventory analysis in a store.
2. Unstructured Observation - when observation is done without any thought before observation.
Example: Observing children playing with new toys.
By Participation
1. Participant - when the observer is a member of the group which he is observing.
2. Non-participant - when the observer is observing people without giving any information to them.

SECONDARY DATA
Secondary data are data that have already been collected by, and are readily available from, other sources. Such data are cheaper and more quickly obtainable than primary data and may also be available when primary data cannot be obtained at all.
Secondary data can be collected by the following five methods:
1. Internal data (inside a firm) such as internal reports, sales call reports, etc.
2. Published reports in newspapers and periodicals.
3. Financial data reported in annual reports.
4. Records maintained by the institution.
5. Information from official publications.

EVALUATION OF SECONDARY DATA
Evaluation means the following four requirements must be satisfied:
1. Availability - Check whether the kind of data you want is available. If it is not available, then you have to go for primary data.
2. Relevance - The data should meet the requirements of the research problem. For this:
Units of measurement should be the same;
Concepts used must be the same, and the data should be current, not outdated.
3. Accuracy - In order to find how accurate the data is, the following points must be considered:
Specification and methodology used;
Margin of error should be examined;
The dependability of the source must be seen.
4. Sufficiency - Adequate data should be available.

QUESTIONS?
You may reach me through the following channels during consultation hours:
Google Classroom
Microsoft Teams
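
ILLUSTRATIVE EXAMPLE: RELIABILITY CHECKS
The reliability rules of thumb quoted in the VALIDITY AND RELIABILITY slides (test-retest correlation greater than 0.60, Cronbach’s alpha of 0.70 or higher) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not part of the original slides: the response data and the function names `pearson_r` and `cronbach_alpha` are hypothetical, and the alpha computed here is the standard textbook form, k/(k-1) × (1 − sum of item variances / variance of total scores), using sample variances.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(responses):
    """Cronbach's alpha; rows are respondents, columns are items."""
    k = len(responses[0])                  # number of items in the scale
    items = list(zip(*responses))          # one tuple of scores per item
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var / total_var)


# Hypothetical data: 5 respondents answering 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
]

# Hypothetical total scores of the same 5 respondents at two points in time.
time1 = [18, 9, 19, 13, 6]
time2 = [17, 10, 19, 12, 8]

alpha = cronbach_alpha(responses)
r = pearson_r(time1, time2)

# Apply the thresholds quoted in the slides.
print(f"Cronbach's alpha = {alpha:.3f}; meets the 0.70 rule: {alpha >= 0.70}")
print(f"Test-retest r    = {r:.3f}; exceeds the 0.60 rule: {r > 0.60}")
```

The same `pearson_r` function could also serve the split-halves method by correlating each respondent’s total on one half of the items with the total on the other half. In practice a statistical package would normally be used for these computations.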