Epidemiology Foundations I PDF - POPM4040 - Week 1 2024

Summary

These lecture notes cover the foundations of epidemiology and the associated learning outcomes. Topics include causal inference, epidemiological measures, study designs, types of data, and bias. This set of notes was presented on September 10 and 12, 2024.

Full Transcript

Epidemiology Foundations I
POPM4040 Epidemiology of Foodborne Diseases
Week 1: September 10 and 12, 2024
Dr. Lauren Grant

Outline
- Scope of epidemiology
- Causal inference and causal models
- Epidemiologic measures (occurrence, association)
- Epidemiologic study design (cross-sectional, case-control, cohort)
- Types of data
- Bias (selection, information, confounding)
- Epidemiologic data analysis

Learning Outcomes
1. Define epidemiology, describe aims, and distinguish between types of epidemiology (descriptive versus etiologic).
2. Define epidemiological terms.
3. Distinguish between closed and open populations and describe why this differentiation is important for epidemiological studies.
4. Distinguish between four measures of disease occurrence.
5. Describe considerations for measuring incidence, incidence proportion, incidence rates, and prevalence.
6. Calculate incidence proportions, incidence rates, and prevalence.
7. Distinguish between incidence and prevalence. Discuss use of both measures in epidemiology and relate to epidemiology of enteric illnesses.
8. Describe and distinguish between different causal models (counterfactual and sufficient component cause models).
9. Distinguish between association and causation. Describe why association is not causation.
10. Distinguish between measures of association (risk ratio, rate ratio, odds ratio).
11. Construct appropriate 2x2 tables. Calculate and interpret measures of association.
12. Describe and distinguish between different epidemiological study designs.
13. Distinguish between prospective and retrospective studies.
14. Describe pros and cons of epidemiological study designs.
15. Identify measures that can be calculated by each study design.
16. Discuss why incidence measures cannot be calculated in case-control studies.
17. Distinguish between probability and odds.
18. Describe tools for quality assessment of epidemiological studies.
19. Distinguish between primary and secondary data sources.
20. Distinguish between random and systematic error.
21. Describe three major sources of bias in epidemiological studies.
22. Describe types of data analyses performed in epidemiological studies.

Epidemiology is the science that studies disease occurrence and health states in human populations (Rothman).

Epidemiology is the study of the distribution and determinants of disease, health-related states, or other events in specified human populations and the application of this study to the control of human health problems (Last).

Subfields include:
- Infectious Disease Epidemiology
- Reproductive Epidemiology
- Psychiatric Epidemiology
- Clinical Epidemiology
- Molecular Epidemiology
- Genetic Epidemiology
- Injury & Violence Epidemiology
- Social Epidemiology
- Environmental Epidemiology
- Occupational Epidemiology
- Nutritional Epidemiology
- Pharmacoepidemiology

Descriptive epidemiology (medical demography) measures how disease frequency and other population health indicators vary by characteristics of person, time, and place.

Etiologic or analytic epidemiology assesses the effect of exposures on the occurrence of disease.

Descriptive epidemiology seeks to characterize the distributions of health, disease, and harmful or beneficial exposures in a defined population as they exist, including any meaningful differences in distribution, and whether the distribution is changing over time (Fox: PMID 35325036).

What proportion of the population experiences acute gastrointestinal illness each year?
Descriptive epidemiology involves assembling measures of disease occurrence for demographic and geographic subpopulations.

Examples: Contrasts of disease incidence by age, sex, gender identity, race/ethnicity, urban or rural residence, or by time period (Person, Place, Time).

Demographic and geographic patterns in disease occurrence can inform public health programming and point to potential causal mechanisms.

Etiologic or analytic epidemiology assesses the effect of exposures, which include possible causes, on the occurrence of disease.

Exposure: Any factor that may explain or predict a health outcome of interest.
- Microscale: Genetics, microbiome
- Person-level: Demographics, SES, behaviours, medical history, environmental hazards
- Macroscale: Neighbourhood SES, national income inequality

One aim of epidemiology is to obtain a valid and precise estimate of the frequency of disease occurrence and to identify patterns of disease occurrence that vary with time, place, or population subgroups.

All measures of disease occurrence must be measured in a population (a group of people who share characteristics or meet criteria that define membership in the population) observed at or over a specific time.
- Cross-sectional: A single point in time
- Longitudinal: Over time
- Closed: Adds no new members and loses members only to death
- Open: Gains and loses members

There are four measures of disease occurrence or frequency: incidence proportion, incidence time, incidence rate, and prevalence.

Incidence: The number of new occurrences of a condition or disease in a population over a specified time period.

Incidence = (Number of new cases during t) / (Population at risk during t)

Considerations:
- Clear case definition
- Identification of truly new cases
- Well-defined population at risk
- Appropriate t for the disease being studied

Incidence proportion (cumulative incidence, risk, attack rate) is the proportion of a closed population at risk that becomes diseased within a given period of time.

Proportion: Division of two related numbers where the numerator is a subset of the denominator.

To calculate incidence proportion:
- Measure the size of the population at risk
- Count the number of new cases that arise over the given time interval

Incidence proportion is a measure of average risk (the probability that a disease develops in an individual within t) over t for persons in a closed population.

Goldstein ND. The relationship of cumulative incidence to incidence rate. Oct 27, 2015. DOI: 10.17918/goldsteinepi.
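The two calculation steps for incidence proportion can be sketched in Python. This is a minimal illustration only: the function name and the outbreak numbers are hypothetical and do not come from the lecture.

```python
def incidence_proportion(new_cases: int, population_at_risk: int) -> float:
    """Proportion of a closed population at risk that becomes diseased during t."""
    if population_at_risk <= 0:
        raise ValueError("population at risk must be positive")
    if not 0 <= new_cases <= population_at_risk:
        # The numerator of a proportion must be a subset of the denominator.
        raise ValueError("new cases must be between 0 and the population at risk")
    return new_cases / population_at_risk


# Hypothetical outbreak (assumed numbers): 200 people attend a meal and
# 50 of them develop illness within 3 days.
risk = incidence_proportion(new_cases=50, population_at_risk=200)
print(f"3-day incidence proportion: {risk:.2f}")
```

The result is only interpretable together with its time window, so the period t (here, 3 days) should always be reported alongside the proportion.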
Additional considerations for incidence proportion:
- Open populations (e.g. Ontario): if the population is at a steady state, the population size at the midpoint of the time period can be used as the population at risk
- Competing risks
- Loss to follow-up
- Cannot distinguish when a disease occurs, as long as it is within t

Incidence proportion is best when the follow-up time is short and there is little or no loss to follow-up.

Incidence rate is a measure of the number of new cases in a defined population per unit of time.

Incidence time: The time span from zero time to the time at which the outcome event occurs, if it occurs.

What about events that may not occur during the period of observation? We account for the length of time each individual was in the population at risk for the disease event. This length of time is called the person-time contribution of the individual.

Incidence rate = (Number of new cases during t) / (Sum over persons of time spent in the population at risk)

Numerator: Number of new cases during t
Denominator: Total amount of time at risk of disease for all persons being followed.
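The person-time denominator can be made concrete with a short Python sketch. The follow-up records below are invented for illustration, and the record layout (years at risk, whether the person became a case) is an assumption, not a format used in the lecture.

```python
from typing import List, Tuple


def incidence_rate(follow_up: List[Tuple[float, bool]]) -> float:
    """New cases divided by total person-time at risk.

    Each record is (years at risk, became a case). A person stops
    contributing person-time once they become a case, are lost to
    follow-up, or the observation period ends.
    """
    cases = sum(1 for _, is_case in follow_up if is_case)
    person_years = sum(years for years, _ in follow_up)
    if person_years <= 0:
        raise ValueError("total person-time must be positive")
    return cases / person_years


# Hypothetical follow-up of six people (assumed numbers).
records = [
    (2.0, True),   # became a case after 2 years
    (5.0, False),  # followed for the full 5 years, no disease
    (3.5, False),  # lost to follow-up at 3.5 years
    (1.5, True),   # became a case after 1.5 years
    (4.0, False),  # entered late, 4 years at risk
    (4.0, False),
]
rate = incidence_rate(records)  # 2 cases over 20 person-years
print(f"{rate * 100:.0f} new cases per 100 person-years")
```

Because each person contributes only the time they were actually at risk, the same function handles late entry and loss to follow-up without special cases.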
The incidence rate estimates the rate at which the outcome develops: X new cases per Y person-years.

Incidence rate is best when measuring incidence in open populations and when the follow-up time is long.

Prevalence is the proportion of the population at risk that has a given disease, condition, or characteristic at a particular time.

Prevalence = (Number of new and existing cases during t) / (Population at risk during t)

Prevalence reflects both the incidence rate and the duration of disease. Associations with prevalence therefore reflect both causes of disease and determinants of survival with disease.

Uses in epidemiology:
- Measure of disease burden
- Occurrence of diseases with no clear moment of onset
- Seroprevalence studies of infection
- Health system and public health planning

Another aim of epidemiology is to estimate the effect of a potential cause or causes on the occurrence of the disease, to explain the disease phenomenon or to identify interventions to prevent disease or reduce harm.

A cause is an event, condition, or characteristic without which the disease would not have occurred (Rothman).

Counterfactual model
- Exposure A (1: E+, 0: E-)
- Outcome Y (1: D+, 0: D-)
- Y^{a=1}: the outcome that would be observed under exposure (a = 1)
- Y^{a=0}: the outcome that would be observed under no exposure (a = 0)

Individual causal effect: The exposure A has a causal effect on an individual's outcome Y if Y^{a=1} ≠ Y^{a=0} for the individual.

Average causal effect in a population: An average causal effect of treatment A on outcome Y is present if Pr[Y^{a=1} = 1] ≠ Pr[Y^{a=0} = 1] in the population. This can be rewritten as: E[Y^{a=1}] ≠ E[Y^{a=0}]

Sufficient-Component Cause Model

Enteric illnesses generally have one necessary cause. However, other factors can influence a) exposure or b) severity of disease once exposed.

A cause is an act or event or state of nature which initiates or permits, alone or in conjunction with other causes, a sequence of events resulting in an effect. A sufficient cause inevitably produces the effect.

Causality can be studied at multiple levels.
- Population-level: Socioeconomic status determines extent of and susceptibility to exposure to tobacco smoke.
- Individual-level: Smoking behaviour (duration, frequency, number)
- Organ-level: Pathological changes in the lung after exposure to tobacco smoke.
- Cellular or molecular-level: Changes in DNA methylation alter gene expression of tumor suppressor genes.

Association is not causation.

An exposure A and an outcome Y are associated if information about A conveys information about (or allows one to partially predict) the outcome Y.

https://tylervigen.com/

Two variables may be associated without a causal relationship.

Causation implies a contrast between all individuals exposed and all individuals unexposed. Causation is defined by a different risk in the same population under two different exposure values.

Association implies a contrast between the exposed and unexposed groups in a population. Association is defined by a different risk in two disjoint subsets of a population determined by their actual exposure value.

If the exposed and unexposed groups differ by a characteristic that also affects the outcome, then an association will be observed where there is no causal effect. CONFOUNDING!!!

Hernán MA, Robins JM (2020). Causal Inference: What If. Boca Raton: Chapman & Hall/CRC.

Measures of association quantify the strength and direction of the relationship between exposure and outcome variables by comparing two or more groups defined by their exposure status.

Identifying causes of disease involves comparison between groups of people who differ by exposure. By measuring and comparing disease incidence between two or more exposure groups, we can estimate whether there is an association between exposure and outcome.

                  Disease (D+)      No Disease (D-)       Total
Exposed (E+)                                              Total Exposed
Unexposed (E-)                                            Total Unexposed
Total             Total Diseased    Total Non-Diseased    Population Total
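The earlier point that an association can be observed where there is no causal effect can be illustrated with a small, made-up stratified table. In this sketch the counts, the stratum names, and the helper function are all hypothetical. Within each stratum the exposed and unexposed have the same risk, but the exposed group contains more members of the high-risk stratum, so the crude 2x2 table shows an association.

```python
def risk_ratio(a: int, b: int, c: int, d: int) -> float:
    """Risk ratio from 2x2 cells.

    a, b = exposed with / without disease
    c, d = unexposed with / without disease
    """
    return (a / (a + b)) / (c / (c + d))


# Hypothetical counts (a, b, c, d) within two strata of a third variable.
strata = {
    "older":   (160, 640, 40, 160),  # risk 0.20 in exposed and in unexposed
    "younger": (10, 190, 40, 760),   # risk 0.05 in exposed and in unexposed
}

# Stratum-specific risk ratios: no association within either stratum.
stratum_rr = {name: risk_ratio(*cells) for name, cells in strata.items()}

# Crude table: add the strata together cell by cell.
crude_cells = tuple(sum(cells[i] for cells in strata.values()) for i in range(4))
crude_rr = risk_ratio(*crude_cells)

for name, rr in stratum_rr.items():
    print(f"RR in {name} stratum: {rr:.2f}")
print(f"Crude RR ignoring strata: {crude_rr:.3f}")
```

Here the crude risk ratio is well above 1 even though exposure makes no difference within either stratum, which is the pattern the slide labels as confounding.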
Measures of disease frequency can be compared by calculating their ratio: risk ratio (relative risk) and rate ratio (relative rate).

The risk ratio compares the incidence of disease among exposed individuals to the incidence of disease among unexposed individuals:

                  Disease (D+)      No Disease (D-)       Total
Exposed (E+)      a                 b                     Total Exposed
Unexposed (E-)    c                 d                     Total Unexposed
Total             Total Diseased    Total Non-Diseased    Population Total

Risk of disease in exposed group: (D+ | E+) / E+ = a / (a + b)
Risk of disease in unexposed group: (D+ | E-) / E- = c / (c + d)

The risk ratio expresses how many times higher the risk of developing an outcome is in an exposed person relative to an unexposed person.
- RR > 1: Increased risk of outcome
- RR = 1: No difference in risk
- RR < 1: Reduced risk of outcome

The rate ratio compares the incidence rate of disease among exposed individuals to the incidence rate of disease among unexposed individuals:

                  Disease (D+)      Person-Years          Total
Exposed (E+)      a                 b                     Total Exposed
Unexposed (E-)    c                 d                     Total Unexposed
Total             Total Diseased    Total Time at Risk

Rate of disease in exposed group: (D+ | E+) / (Person-Years | E+) = a / b
Rate of disease in unexposed group: (D+ | E-) / (Person-Years | E-) = c / d

The rate ratio expresses how many times higher the rate of developing an outcome is in an exposed person relative to an unexposed person.
- IRR > 1: Increased rate of outcome
- IRR = 1: No difference in rate
- IRR < 1: Reduced rate of outcome
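A short Python sketch of the rate ratio and its interpretation follows. The case counts and person-years are hypothetical, and the function names are assumptions made for this example.

```python
def rate_ratio(cases_exposed: int, py_exposed: float,
               cases_unexposed: int, py_unexposed: float) -> float:
    """Incidence rate in the exposed divided by incidence rate in the unexposed."""
    rate_exposed = cases_exposed / py_exposed
    rate_unexposed = cases_unexposed / py_unexposed
    return rate_exposed / rate_unexposed


def interpret(irr: float, tolerance: float = 1e-9) -> str:
    """Map a rate ratio onto the three interpretations listed above."""
    if abs(irr - 1.0) <= tolerance:
        return "no difference in rate"
    return "increased rate of outcome" if irr > 1.0 else "reduced rate of outcome"


# Hypothetical data (assumed numbers): 30 cases over 1,500 person-years
# among the exposed, 10 cases over 1,000 person-years among the unexposed.
irr = rate_ratio(30, 1500.0, 10, 1000.0)
print(f"IRR = {irr:.2f} ({interpret(irr)})")
```

In practice a point estimate sitting exactly at 1 is rare, so the interpretation would normally be read together with a confidence interval, which this sketch does not compute.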
The odds ratio compares the odds of disease among exposed individuals to the odds of disease among non-exposed individuals:

                  Disease (D+)      No Disease (D-)       Total
Exposed (E+)      a                 b                     Unknown
Unexposed (E-)    c                 d                     Unknown
Total             Total Diseased    Total Non-Diseased
                                    (Contrived)

Odds of disease in exposed group: (D+ | E+) / (D- | E+) = a / b
Odds of disease in unexposed group: (D+ | E-) / (D- | E-) = c / d

- OR > 1: Increased odds of outcome
- OR = 1: No difference in odds

The odds ratio is a good estimate of the risk ratio when the outcome is uncommon (...

1. ... Large cohorts.
2. Long latency period between exposure and development of disease -> Long follow-up
3. Expensive or difficult to obtain exposure information from a cohort

In a case-control study, all cases and a sample of non-diseased individuals (controls) are identified from the same source population, and their exposure status is determined. Odds of disease are then calculated. There is no follow-up period.

A case-control study can be conceptualized as a more efficient cohort study.

Source population: Residents of the City of Guelph in 2023

Cohort study:

                Disease +    Disease -      Total
Exposure +            155       22,500     22,655
Exposure -            135      126,765    126,900
Total                 290      149,265    149,555

- Data on all individuals in the source population
- Size of population at risk is known for both exposed and non-exposed groups (can calculate incidence and risk ratio)

Considerations:
- Expensive and not feasible to collect information on every single resident of Guelph
- Random sample? Because the outcome is rare, a random sample may not contain any (or enough) cases to compute measures of occurrence and association
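As a worked example, the Guelph cohort table can be run through the risk ratio and odds ratio formulas. The sketch below uses only the counts shown in the table; the variable names are assumptions made for this example.

```python
# Cell counts from the Guelph cohort table.
a, b = 155, 22_500     # exposed: diseased, non-diseased
c, d = 135, 126_765    # unexposed: diseased, non-diseased

# Incidence proportion (risk) in each exposure group.
risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)

# Risk ratio compares risks; odds ratio compares odds (diseased / non-diseased).
risk_ratio = risk_exposed / risk_unexposed
odds_ratio = (a / b) / (c / d)

print(f"Risk in exposed:   {risk_exposed * 1000:.2f} per 1,000")    # 6.84
print(f"Risk in unexposed: {risk_unexposed * 1000:.2f} per 1,000")  # 1.06
print(f"Risk ratio: {risk_ratio:.2f}")                              # 6.43
print(f"Odds ratio: {odds_ratio:.2f}")                              # 6.47
```

The outcome is uncommon in both groups (well under 1%), so the odds ratio and risk ratio land close together, consistent with the statement that the odds ratio approximates the risk ratio for uncommon outcomes.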
Source population: Residents of the City of Guelph in 2023

Case-control study:

                Disease +    Disease -      Total
Exposure +            155          225          -
Exposure -            135        1,265          -
Total                 290            -          -

- Collect data on all diseased individuals in the source population (cases) and determine their exposure status.
- Take a sample of the non-diseased population (e.g. 1%) and determine their exposure status. This sample represents the exposure distribution of non-diseased individuals in the source population.

Incidence (and risk or rate measures) cannot be calculated because the size of the population at risk is unknown. The probability of disease in each exposure group cannot be calculated, but the odds of disease can be calculated. When the outcome is uncommon, OR ≈ RR or IRR.

Odds: Probability of disease / Probability of non-disease

Probability (likelihood, risk): Proportion of diseased individuals out of all individuals (diseased and non-diseased) in a population.

Example: A horse runs 10 races and wins 6 of the 10 races.
- Probability of winning: 6/10 or 60%
- Odds of winning: 6/4 or 1.5 to 1

A cross-sectional study collects information on both exposure and outcome status of a well-defined population at one point in time. Cross-sectional studies measure prevalence and can be used to estimate causal effects of exposures on prevalence via estimation of a prevalence odds ratio.

Tools for quality assessment:
https://www.equator-network.org/reporting-guidelines/strobe/
https://casp-uk.net/casp-tools-checklists/

There are two types of data used in epidemiological studies: primary and secondary.

There are two types of error in epidemiological studies: random and systematic.

Error: A difference between an observed value and the true value.
Random error: Difference due to chance alone (unpredictable)
Systematic error (bias): Predictable difference resulting in inaccurate effect estimation

There are three major types of bias in epidemiological studies: selection, information, and confounding.

Selection bias can occur when the selection of subjects into a study or their retention in a study leads to a study population that does not have the same exposure-outcome distribution as the source population.
- Control selection bias
- Loss to follow-up bias
- Self-selection bias
- Healthy worker effect
- Differential referral or diagnosis of subjects

Cohort study:

                Disease +    Disease -      Total
Exposure +            155       22,500     22,655
Exposure -            135      126,765    126,900
Total                 290      149,265    149,555

Case-control study:

                Disease +    Disease -      Total
Exposure +            155          225          -
Exposure -            135        1,265          -
Total                 290            -          -

Case-control study:

                Disease +    Disease -      Total
Exposure +            155          525          -
Exposure -            135          965          -
Total                 290            -          -

Information or observation bias (measurement error or misclassification) occurs when measured exposure and/or outcome status differs from the true values.
- Non-differential misclassification: Error is independent of exposure or outcome status
- Differential misclassification: Error depends on exposure or outcome status

Consider the ability to remember what you ate over a 7-day period…

Confounding is the distortion of a measure of association that occurs when other risk factors for the outcome are unevenly distributed between the groups being compared.

[Diagram: income (exposure), use of a meat thermometer (outcome), age (confounder)]

1. The confounder is an independent risk factor for the outcome. The confounder is associated with the outcome. Age is associated with use of a meat thermometer.
2. The distribution of the confounder differs by exposure groups. The confounder is associated with the exposure. The high-income group has a greater proportion of older individuals compared to the low-income group.
3. The confounder is not an intervening variable between the exposure and outcome.

Epidemiologists use qualitative and quantitative methods.

Qualitative methods: Collect and analyze non-numerical data, usually to provide richer, contextual information regarding exposures and outcomes.

Quantitative methods: Numerical techniques to quantify disease occurrence or associations between exposures and outcomes.
- Descriptive statistics
- Regression analysis
- Explanatory models
- Predictive models
- Machine learning and other data science approaches
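As a small instance of the quantitative calculations listed above, the three Guelph tables shown under selection bias can be compared by their odds ratios. The counts are taken from those tables. The dictionary labels, including "representative" and "distorted", are this sketch's own reading of the slides and not wording from the lecture.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Cross-product odds ratio (a*d)/(b*c) for a 2x2 table.

    a, b = exposed with / without disease
    c, d = unexposed with / without disease
    """
    return (a * d) / (b * c)


# Cell counts (a, b, c, d) from the three tables above.
tables = {
    "cohort (whole source population)":      (155, 22_500, 135, 126_765),
    "case-control, representative controls": (155, 225, 135, 1_265),
    "case-control, distorted controls":      (155, 525, 135, 965),
}

for label, cells in tables.items():
    _, b, _, d = cells
    exposed_share = b / (b + d)  # share of non-diseased people who are exposed
    print(f"{label}: OR = {odds_ratio(*cells):.2f}, "
          f"exposed among non-diseased = {exposed_share:.0%}")
```

With representative controls the odds ratio (about 6.46) stays close to the cohort value (about 6.47), while the third table, in which roughly 35% of controls are exposed instead of about 15%, gives an odds ratio near 2.11. The cases are identical in both case-control tables, so the whole difference comes from which controls were selected.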
