Sampling and Sample Size Estimation

Document Details

School of Public Health

2023

Tanko Abdulai

Tags

sampling methods sample size estimation randomization statistics

Summary

This document presents a lecture or presentation on sampling methods, randomization procedures, and sample size determination in research. It details various probability and non-probability sampling techniques, including convenience, purposive, quota, and systematic sampling. The document also covers strategies for determining sample size in both qualitative and quantitative studies.

Full Transcript

Sampling and Sample size estimation
Tanko Abdulai (PhD)
Department of Epidemiology, Biostatistics, and Disease Control
School of Public Health, UDS
2023

Outline
• Sampling
• Randomization
• Sample size estimation

WHY SELECT A SAMPLE?
Population? Sample?

Sampling
• A sample is a subset of the population, the part that is actually being observed or studied.
• We can only rarely study whole populations, so inferential statistics are almost always needed to draw conclusions about a population when only a sample has actually been studied.
• A single observation, such as one person’s blood pressure, is an element, denoted by X. The number of elements in a population is denoted by N, and the number of elements in a sample by n. A population therefore consists of all the elements from X_1 to X_N, and a sample consists of n of these N elements.

Sampling procedure
• Probability (random) sampling
• Non-probability sampling

Types of Sampling

Non-probability sampling
• Convenience sampling
• Purposive sampling
• Snowballing
• Quota sampling

Convenience sampling
• This is the least rigorous technique. It involves selecting participants who are handy, available, or most accessible (or who volunteer) in terms of time, effort and money.
• May result in poor-quality data and lacks intellectual credibility.

Purposive sampling/Judgement sampling
• Uses personal judgment to select a sample that would be representative, OR selects those who are known to have the needed information.
• Snowball sampling is a type (used with hard-to-identify groups such as addicts).

Quota sampling
• The population is first segmented into mutually exclusive sub-groups, just as in stratified sampling. Then judgment is used to select the subjects or units from each segment based on a specified proportion.
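The quota procedure just described can be sketched in Python. This is only an illustration under stated assumptions, not the lecturer's procedure: the participant records, the `sex` field, the quotas, and the helper name `quota_sample` are all hypothetical, and the researcher's "judgment" is stood in for by taking the first available people in each segment.

```python
from collections import defaultdict


def quota_sample(people, key, quotas):
    """Fill a fixed quota from each mutually exclusive segment.

    people: list of dicts, in the order the researcher encounters them.
    key:    field that defines the segments (hypothetical, e.g. "sex").
    quotas: dict mapping segment value -> number of people wanted.
    Selection is NOT random: the first available people are taken,
    which stands in for the researcher's judgment.
    """
    chosen = defaultdict(list)
    for person in people:
        segment = person[key]
        if len(chosen[segment]) < quotas.get(segment, 0):
            chosen[segment].append(person)
    return dict(chosen)


# Hypothetical frame of 10 people: 6 women (F) and 4 men (M).
people = [{"id": i, "sex": "F" if i % 5 < 3 else "M"} for i in range(10)]

# Quotas proportional to the 60/40 split for a sample of 5.
quotas = {"F": 3, "M": 2}
sample = quota_sample(people, "sex", quotas)
print({segment: [p["id"] for p in members] for segment, members in sample.items()})
```

Because nobody is chosen by a chance mechanism, the result matches the population on the quota variable only. That is why quota sampling remains a non-probability method even though it resembles stratified sampling.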
Random sampling
• Simple random sample
• Cluster random sampling
• Stratified random sampling
• Systematic sampling

Simple random sampling
• E.g., names in a hat, flipping a coin, or a table of random numbers
• Larger samples are more likely to represent the population
• Any difference between population and sample is random and small (random sampling error)

Stratified random sampling
• Ensures small subgroups (strata, e.g., by gender or age) are represented
• Normally proportional to their share of the population
• Break the population into strata, then randomly select within strata
• Multistage sampling

Cluster random sampling
• Selects groups as sample units rather than individuals
• E.g., schools in Tamale metro are grouped into, say, 30 circuits; 5 circuits are then randomly selected
• REQUIRES a large number of groups/clusters
• Multistage sampling

Systematic (Nth) sampling
• Considered either random or non-random
• Random: if the list is randomly ordered
• Non-random: systematic with a random starting point
• Divide the population size by the sample size to get the sampling interval (population size / sample size = Nth)

Randomization
• The allocation of study participants to an intervention/treatment or a control/placebo group in randomized clinical/field trials.
• This is done to ensure internal validity.
• It ensures the allocation groups are free from selection bias.

Methods of randomization: Simple randomization
• Flip a coin: heads = Treatment A, tails = Treatment B
• Roll a die: even number = Treatment A, odd number = Treatment B
• Random number table
• Computer-generated random numbers
• *Unequal group sizes may arise if the study sample is small

Methods of randomization: randomization into groups of two/Block randomization
• Study participants are randomized two at a time. Envelopes are numbered sequentially (e.g., from 1 to 100) and separated into groups of two.
• If the first number is even, “experimental group” is written on a slip of paper and put in the first envelope.
• The paired envelope then automatically gets the alternative group, in this case “control group.”
• If an even number of patients have been admitted into the study, exactly half would be in the experimental group and half in the control group.

Systematic Allocation
• The first patient is randomly assigned to a group, and the next patient is automatically assigned to the alternate group. Subsequent patients are given group assignments on an alternating basis. This method also ensures that the experimental and control groups are of equal size if there is an even number of patients entered in the study.
• The variance of the data from a systematic allocation is smaller than that from a simple random allocation, so the statistical power is improved.

Stratified Allocation
• In clinical research, stratified allocation is often called prognostic stratification. It is used when the investigators want to assign patients to different risk groups depending on such baseline variables as the severity of disease (e.g., stage of cancer) and age. When such risk groups have been created, patients within each stratum can be allocated randomly to the experimental group or the control group. This is usually done to ensure homogeneity of the study groups by severity of disease. If homogeneity is achieved, the analysis can be done for the entire group and within the prognostic groups.
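The simple, paired (block), systematic and stratified allocation methods above can be compared with a short Python sketch. It is a minimal illustration and not a validated trial randomization tool: the function names, the group labels "A" and "B", the fixed seed, and the example strata are all assumptions made for the example.

```python
import random


def simple_randomization(n, rng):
    """Independent 'coin flip' per participant; group sizes may differ."""
    return [rng.choice(["A", "B"]) for _ in range(n)]


def block_randomization_pairs(n, rng):
    """Randomize two participants at a time: one of each pair gets A, the other B."""
    allocation = []
    while len(allocation) < n:
        pair = ["A", "B"]
        rng.shuffle(pair)  # random order within the pair
        allocation.extend(pair)
    return allocation[:n]


def systematic_allocation(n, rng):
    """Random assignment for the first participant, then strict alternation."""
    first = rng.choice(["A", "B"])
    other = "B" if first == "A" else "A"
    return [first if i % 2 == 0 else other for i in range(n)]


def stratified_allocation(strata, rng):
    """Allocate within each prognostic stratum separately (here in pairs)."""
    assignment = {}
    for patients in strata.values():
        groups = block_randomization_pairs(len(patients), rng)
        assignment.update(zip(patients, groups))
    return assignment


rng = random.Random(2023)  # fixed seed only so the sketch is repeatable
n = 10
simple = simple_randomization(n, rng)
blocked = block_randomization_pairs(n, rng)
alternating = systematic_allocation(n, rng)

# Hypothetical strata keyed by disease stage.
strata = {"early": ["p1", "p2", "p3", "p4"], "late": ["p5", "p6"]}
by_stratum = stratified_allocation(strata, rng)

print("simple     :", simple.count("A"), "in A of", n)
print("paired     :", blocked.count("A"), "in A of", n)
print("systematic :", alternating.count("A"), "in A of", n)
print("stratified :", by_stratum)
```

With an even number of participants, the paired and systematic methods always give equal group sizes, while simple randomization can leave the groups unequal, as noted for small study samples.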
Advantages of probability sample
• Provides a quantitative measure of the extent of variation due to random effects
• Provides data of known quality
• Better control over non-sampling sources of error
• Mathematical statistics and probability can be applied to analyze and interpret the data

Disadvantages of Non-probability Sampling
• Selection bias likely
• No mathematical property
• Non-probability sampling should not be undertaken with science in mind
• Provides false economy

Sampling Issues in Survey
• Produce best estimation
• Not too low, not too large
• Minimum error/unbiased estimation
• Economic consideration
• Design consideration: best collection strategy

Determination of sample size
• The purpose of the study, population size, the risk of selecting a “bad” sample and the allowable sampling error inform the sample size.
• Data analysis plan, e.g., number of cells one will have in cross tabulation
• Whether undertaking a qualitative or quantitative study

Sample size in qualitative studies
• Probability sampling not appropriate as sample not intended to be statistically representative
• But, sample should have ability to represent salient characteristics in population.
• Sample size is usually small (15-50) to allow in-depth exploration and understanding of phenomena under investigation

Determining sample size in quantitative study
Several criteria will need to be specified to determine the appropriate sample size:
• Level of precision
• Level of confidence or risk
• Degree of variability in the attributes being measured (prevalence)
• External validity

Degree of variability refers to the distribution of attributes in the population. The more heterogeneous a population, the larger the sample size required to obtain a given level of precision. The less variable (more homogeneous) a population, the smaller the sample size.

Degree of variability
• A proportion of 50% indicates a greater level of variability than either 20% or 80%.
This is because 20% and 80% indicate that a large majority do not or do, respectively, have the attribute of interest.
• Because a proportion of 0.5 indicates the maximum variability in a population, it is often used in determining a more conservative sample size, that is, the sample size may be larger than if the true variability of the population attribute were used.

Sample size determination strategies
• Using a census for small populations
• Using published tables
• Applying formulas to calculate a sample size
• Using computer software, e.g., EPI-info

Cochran equation
$$n_0 = \frac{Z^2 p q}{e^2} \quad \text{OR} \quad n_0 = \frac{Z^2 p (1 - p)}{e^2}$$
• Where $n_0$ is the sample size,
• $Z$ is the abscissa of the normal curve that cuts off an area $\alpha$ at the tails,
• $e$ is the desired level of precision,
• $p$ is the estimated proportion of an attribute that is present in the population, and $q$ is $1 - p$.
• The value for $Z$ is found in statistical tables which contain the area under the normal curve, e.g., $Z = 1.96$ for a 95% level of confidence.

For smaller populations, a modified Cochran formula for sample size ($n$) calculation is used:
$$n = \frac{n_0}{1 + \frac{n_0 - 1}{N}}$$

The Yamane formula
$$n = \frac{N}{1 + N e^2}$$
• ASSUMPTION: 95% confidence level and $p = 0.5$
• Where $n$ is the sample size,
• $N$ is the population size,
• $e$ is the level of precision.

*Researchers commonly add 10% to the sample size to compensate for persons that the researcher is unable to contact.
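The formulas above can be checked with a short Python calculation. This is a sketch under assumed inputs rather than a recommended design: the population size of 2000, the precision of 0.05, and the function names are hypothetical, and results are rounded up because a sample cannot contain a fraction of a person.

```python
import math


def cochran_n0(z, p, e):
    """Cochran sample size for a large population: Z^2 * p * (1 - p) / e^2."""
    return (z ** 2) * p * (1 - p) / (e ** 2)


def cochran_finite(n0, N):
    """Modified Cochran formula for smaller populations: n0 / (1 + (n0 - 1) / N)."""
    return n0 / (1 + (n0 - 1) / N)


def yamane(N, e):
    """Yamane formula (assumes 95% confidence and p = 0.5): N / (1 + N * e^2)."""
    return N / (1 + N * e ** 2)


z = 1.96   # 95% level of confidence
e = 0.05   # assumed level of precision
N = 2000   # hypothetical population size

# p = 0.5 gives the largest (most conservative) sample size.
for p in (0.2, 0.5, 0.8):
    print(p, math.ceil(cochran_n0(z, p, e)))  # 246, 385, 246

n0 = cochran_n0(z, 0.5, e)
n_cochran = math.ceil(n0)                      # 385
n_finite = math.ceil(cochran_finite(n0, N))    # 323
n_yamane = math.ceil(yamane(N, e))             # 334

# Add 10% to compensate for persons who cannot be contacted.
n_with_allowance = math.ceil(n_cochran * 1.10)  # 424
print(n_cochran, n_finite, n_yamane, n_with_allowance)
```

The loop shows why 0.5 is the conservative choice: 20% and 80% give the same, smaller sample size, while 50% gives the maximum.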
