HE 201 Data Collection & Sampling (October 21, 2024) PDF
Dr. Ketan Shankardass
Data Collection & Sampling
HE 201: Research Methods in the Health Sciences
October 21, 2024
Dr. Ketan Shankardass, Associate Professor, Department of Health Sciences

Today’s objectives:
◦ Chapters 19, 20
◦ To begin our discussion on how to collect data and how to select participants into a study (sampling).

The Research Process (Rehman et al., 2021)

Population
◦ A group (or subgroup) of individuals, communities, or organizations
◦ Research studies usually carefully identify target population, source populations, sample population, and study population

TYPES OF RESEARCH POPULATIONS

Target and Source Populations
◦ Target population (abstract): the broad population to which the results of a study should be applicable
◦ Source population (sampling frame; concrete): a well-defined subset of individuals from the target population from which potential study participants will be sampled
– e.g., cancer registry; telephone books; list of households within the vicinity of some exposure

Target and Source Populations
◦ A well-defined study question identifies a target population to which the results of the study should apply
◦ Unless a study goal is extremely narrow in scope, it is usually not possible to invite all members of a target population to participate in a study
◦ Instead, a more specific source population should be identified
◦ Ideally, the sampling frame consists of an enumerated list of population members. Why?

Sample Populations
Not the same as the sampling frame!
◦ Sample population: the individuals from a source population who are invited to participate in the research project
◦ Ideally, probability-based sampling is used to ensure that the sample population is representative of the source population.
Sample Populations
◦ When a source population is much larger than the sample size required for a study, a subset of the source population may serve as a sample population
◦ Sampling bias occurs when the individuals sampled for a study are systematically not representative of the source population as a whole
◦ Non-random-sampling bias occurs when each individual in the source population does not have an equal chance of being selected for the sample population

Probability-based sampling
◦ A method to ensure members of a source population have an equal likelihood (i.e., probability) of being invited to participate in a research study.

Simple random sampling
◦ Every person in the population has an equal probability of being selected in a study
◦ Random number generator
◦ Pulling names out of a hat
◦ Flipping a coin

Systematic Sampling
◦ For larger populations, we might not want every other individual
◦ e.g. using a list or inventory, choosing every 10th or 100th person
◦ Not always free from bias?

Stratified random sampling
◦ Populations and groups are first divided and grouped based on a characteristic before selection takes place
◦ e.g. sampling different years of university students

Cluster sampling
◦ When a population is divided into clusters, typically based on naturally-occurring groups
◦ e.g. conducting a study with classrooms in a school district; you don’t have resources to include every classroom; pick a random sample of classrooms.

Question
◦ Have you ever received a survey in your e-mail, or been contacted by telephone to participate, and not responded?
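The four probability-based methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame, the `(student_id, year)` record layout, the classroom names, and all function names are hypothetical, and the random starting point used for systematic sampling is one common choice rather than something the slides specify.

```python
import random


def simple_random_sample(frame, n, rng):
    """Every member of the frame has the same chance of being drawn."""
    return rng.sample(frame, n)


def systematic_sample(frame, step, rng):
    """Take every `step`-th member of an ordered list, from a random start."""
    start = rng.randrange(step)  # assumption: random start within the first interval
    return frame[start::step]


def stratified_sample(frame, stratum_of, n_per_stratum, rng):
    """Group members by a characteristic, then sample randomly within each group."""
    strata = {}
    for member in frame:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and include everyone inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]


# Hypothetical sampling frame: 400 students as (student_id, year of study).
rng = random.Random(201)
frame = [(student_id, 1 + student_id % 4) for student_id in range(400)]

# Hypothetical clusters: 10 classrooms of 40 students each.
classrooms = {f"room-{i}": frame[i * 40:(i + 1) * 40] for i in range(10)}

srs = simple_random_sample(frame, 20, rng)
every_tenth = systematic_sample(frame, 10, rng)
by_year = stratified_sample(frame, lambda s: s[1], 5, rng)
three_rooms = cluster_sample(classrooms, 3, rng)

print(len(srs), len(every_tenth), len(by_year), len(three_rooms))
```

Note that systematic sampling can still be biased if the list has a pattern that lines up with the step size, which is one reason it is "not always free from bias".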
Study Populations
◦ Study population: the eligible members of the sample population who consent to participate in the study and complete required study activities

Generalizability
◦ Also known as “external validity” or “representativeness”
◦ Allows results to be applicable to a target audience
◦ To apply findings to a target population (higher level)

Sometimes a nonprobability-based sample is appropriate

Convenience samples
◦ A convenience population is a nonprobability-based source population selected due to ease of access to those individuals, schools, workplaces, organizations, or communities
◦ Convenience sampling must always be used with caution, since convenient sample populations are often systematically different from the target and source populations they are intended to represent

Non-probability Sampling
◦ Chain/snowball/referral sampling: encouraging people who have taken your survey to send it to other people they know, e.g. like a snowball growing as it travels downhill
◦ Respondent-driven sampling: combines "snowball sampling" with a mathematical model that weights the sample to compensate for the fact that the sample was collected in a non-random way

Sample Populations (Cont.)
◦ No matter which sampling method is used, the ideal goal is to end up with a sample population that is representative of the source population
◦ Some errors occur by chance and cannot be resolved by randomization, but many forms of bias can be mitigated with careful planning, rigorous methods, and sufficiently large sample sizes
– Sampling bias can be avoided by selecting more appropriate source populations and applying inclusion and exclusion criteria consistently

Study Populations
◦ Participation rate: the percentage of members of a sample population who are included in the study population
◦ In an ideal situation all of the sampled individuals agree to participate in the study, but a 100% participation rate is extremely rare
◦ A high participation rate helps prevent the selection bias that occurs when the members of the study population are not representative of the source population from which they were drawn
◦ A low response rate may result in nonresponse bias if the members of a sample population who agree to participate in a study are systematically different from nonparticipants
◦ That’s especially bad if the difference is non-random with respect to the study factors of interest, e.g., alcohol use

Vulnerable Populations
◦ Vulnerable populations are populations whose members might have limited ability to make an autonomous decision about volunteering to participate in a research study
◦ Examples: young children, people in prison, people with severe health issues, people with cognitive disabilities, and some other socially-marginalized groups, e.g., people living in poverty.
◦ Pay close attention to the ethics of research with vulnerable groups.

Vulnerable Populations
◦ Vulnerable populations should not be selected as the source population for studies that do not require their participation. Why?
◦ At the same time, it is problematic when members of vulnerable populations are systematically excluded from research, since the only way to study health issues of special importance to potentially vulnerable populations is to allow members of those populations to participate in relevant research studies
◦ Research studies including members of vulnerable populations require extra consideration of the potential risks of research to participants

Community Involvement
◦ Some studies benefit from or require the participation and support of geographic, cultural, educational, religious, or social communities and their leaders
◦ Connections with community representatives should be established early in the research planning process and maintained throughout the data collection and dissemination period
– The approval of community leaders does not negate the requirement to obtain individual informed consent from participants whose individual data will be collected

Community Involvement
◦ Community-based participatory research (CBPR) is based on partnerships in which community members identify research priorities and are involved in every stage of the research process.
◦ EXAMPLE: TRANS PULSE PROJECT https://transpulseproject.ca/about-us/project-history/

Importance of Sample Size
◦ Recruiting too many participants wastes resources.
◦ Recruiting too few participants makes the study invalid: you cannot answer your study question.
◦ There’s a threshold of participants that would be adequate to help you answer your research question.
Sample Size and Certainty Levels
◦ In statistics, sample size is the number of observations in a data set
– In the health sciences, the sample size may be the number of individual humans or animals in the study population, or the number of neighbourhoods (ecological study)
◦ The desired sample size for a quantitative study is based on statistical estimations about how many data points are required in order to answer the study question with a specified level of certainty

Sample Size and Certainty Levels
◦ Larger sample sizes produce narrower confidence intervals for statistical measures.
◦ Larger sample sizes yield more statistically significant results.

What is a mean?
◦ The average value of a variable that is calculated by adding up the values and dividing that sum by the total number of individuals who answered the question.
◦ e.g. 25 30 30 40 50 62

What is a confidence interval?
◦ A confidence interval is a statistical estimate of the range of likely values of a parameter in a source population based on the value of that statistic in a study population
◦ e.g. The mean age is 39.5, but we are 95% confident that the value falls somewhere between 31 and 46.

Larger Samples from a Population Have a Narrower 95% Confidence Interval Than Smaller Samples

Sample Size Estimation
◦ A sample size calculator is a tool used to identify an appropriate number of participants to recruit for a quantitative study
◦ The range of suggested sample sizes is based on a series of assumptions (= guesses) about the expected characteristics of the sample population
◦ Sample size calculators are freely available online and are bundled with most statistical software programs
◦ When the level of certainty about inputs is low, it is better to err on the side of a larger sample size.
◦ https://sample-size.net/sample-size-means/

Power Estimation
◦ Power is the ability of a statistical test to detect significant differences between subgroups of a population when differences really do exist.
◦ Power is defined as 1 – β, so a 20% likelihood of a type 2 error corresponds to a power of 80%
◦ Studies with more participants have more power.

Type 1 and 2 Errors
◦ In statistics, an error is a difference between the value obtained from a study population and the true value in the larger population from which the study participants were drawn that occurs by chance rather than as a result of systematic bias
◦ Type 1 error: “False Positive”
◦ Type 2 error: “False Negative”

Type 1 and Type 2 error
◦ Type 1 (α) error
– occurs when a study population yields a statistically-significant test result even though a significant difference or association does not actually exist in the source population
– When α is set at 5%, about 1 in 20 statistical tests of a difference that does not really exist will result in a type 1 error
◦ Type 2 (β) error
– occurs when a statistical test of data from a study population finds no significant result even though a significant difference or association actually exists in the source population
– The best way to minimize the likelihood of type 2 errors is to have a large sample size

Type 1 and Type 2 error
◦ Type 1 (α) error
– If α = 5%, 1 in 20 statistical tests of a difference that does not really exist will result in a type 1 error
◦ Type 2 (β) error
– Power = 1 – β
– If β = 20%, Power = 80%
– If β = 20%, 20% likelihood of type 2 error

Refining the Study Approach
◦ Researchers must be prepared to rethink the study approach if their estimated number of available participants will not yield sufficient power
◦ The sample size estimates generated by sample size and power calculators refer to the study population (the actual number of participants), not the sample population (the number of individuals invited to participate in the study)
◦ The number of people sampled for a study needs to be larger than the required number of participants, because the participation rate is unlikely to be 100%

Next Class
◦ October 23, 2024
◦ Questionnaire Development, Surveys/Questionnaires
◦ Collecting Quantitative Data
◦ Chapter 21, 22
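As a rough companion to the slides on means, confidence intervals, power, and participation rates, the following Python sketch walks through the arithmetic. It is a minimal illustration, not the method used by any particular sample size calculator. It assumes a normal approximation throughout (real calculators often use the t distribution, so their answers can differ, especially for small samples). The function names, the effect size of 5, the standard deviation of 10, and the 65% participation rate are all hypothetical inputs chosen for the example; only the six ages come from the slide on means.

```python
import math
from statistics import NormalDist, mean, stdev


def normal_ci(values, confidence=0.95):
    """Normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(values) / math.sqrt(len(values))
    centre = mean(values)
    return centre - half_width, centre + half_width


def n_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate participants per group needed to compare two means
    with a two-sided test at the given alpha and power (= 1 - beta)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)


def n_to_invite(n_required, participation_rate):
    """Inflate the required study population to the number to sample,
    because fewer than 100% of invited people will take part."""
    return math.ceil(n_required / participation_rate)


ages = [25, 30, 30, 40, 50, 62]  # example values from the slide on means
print("mean:", mean(ages))

# Same spread of values, four times as many observations (hypothetical):
# the interval narrows as the sample grows.
small_low, small_high = normal_ci(ages)
large_low, large_high = normal_ci(ages * 4)
print("width, n=6: ", round(small_high - small_low, 1))
print("width, n=24:", round(large_high - large_low, 1))

# Hypothetical planning inputs: detect a difference of 5 units, SD of 10.
required = n_per_group(effect=5, sd=10)
invited = n_to_invite(required, participation_rate=0.65)
print("required per group:", required, "| invite per group:", invited)
```

Raising `power` or lowering `alpha` in `n_per_group` increases the required sample size, which mirrors the point that larger studies have more power and fewer type 2 errors.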