Statistics in Health Sciences PDF
Universitat Autònoma de Barcelona
2023
Jose Barrera
Summary
These are lecture slides by Jose Barrera for the course Statistics in Health Sciences (B.Sc. Degree in Applied Statistics, academic year 2023/2024). They introduce measures of association between exposure and disease (relative risk, odds ratio) and measures of impact (attributable risk), and explain how risks are compared under cohort, case-control and cross-sectional study designs.
Full Transcript
B.Sc. Degree in Applied Statistics
Statistics in Health Sciences
8. Measuring the association between exposure and disease

Jose Barrera (a, b)
https://sites.google.com/view/josebarrera
(a) ISGlobal, Barcelona Institute for Global Health - Campus MAR
(b) Department of Mathematics (UAB)

This work is licensed under a Creative Commons "Attribution-NonCommercial-ShareAlike 4.0 International" license.

Statistics in Health Sciences

1 Measures of association
    Introduction
    Comparing risks under different study designs
    Comparing RR, PR, OR and POR
    Confidence intervals
2 Measures of impact: Attributable risk, attributable fraction and attributable number
    Population attributable risk or fraction
    Exposure attributable risk or fraction
    Attributable fractions vs attributable numbers

Jose Barrera (ISGlobal & UAB) Statistics in Health Sciences, 2023/2024 2 / 32

Measuring the association between exposure and disease: Introduction

In these slides...
• How can we assess the association between having been exposed to a given condition and suffering from a given disease?
• How can we compare the risk of suffering from a given disease when a given exposure is present or absent?
• What indicators can we consider? Under what conditions are they estimable?

Concepts: Relative Risk, Odds Ratio, Attributable Risk.

[Photo: smoke-free bus stop in Warsaw, Poland (summer 2018). @overdispersion]

Comparing risks: Introduction

• Suppose we want to explore the relationship between a binary exposure E and the presence of a given disease D, based on a 2 × 2 contingency table.
• For example, suppose that, at the end of a study, we have the following data:

                 Disease
  Exposed    No    Yes   Total
  No        192     48     240
  Yes        84     36     120
  Total     276     84     360

• We can consider some indicator summarizing the comparison of the risk of the disease between the two exposure groups.
• This is usually done with a measure of relative risk, with the exposed group in the numerator and the non-exposed group in the denominator.
• The proper measure depends on the study design...

Comparing risks under different study designs: Cohort studies

In cohort studies, we can compare the cumulative incidence of (i.e. probability of developing) the disease between the two exposure groups to get the risk ratio (RR) and the risk difference (RD):

\[ \text{Risk ratio} = RR = \frac{CI_E}{CI_{\bar{E}}}, \qquad \text{Risk difference} = RD = CI_E - CI_{\bar{E}}. \]

Prove that, for the following cohort study, RR = 1.50, which can be interpreted as the cumulative incidence of (i.e. probability of developing) the disease among the exposed being 50% higher than among the non-exposed.

Study start:

                 Disease
  Exposed    No    Yes   Total
  No        240      0     240
  Yes       120      0     120
  Total     360      0     360

End of study (after follow-up):

                 Disease
  Exposed    No    Yes   Total
  No        192     48     240
  Yes        84     36     120
  Total     276     84     360

Comparing risks: Case-control studies (1/5)

In case-control studies, we can compare the prevalence of the exposure (i.e. probability of having been exposed) between the two disease status groups:

\[ \frac{P(E \mid D)}{P(E \mid \bar{D})}. \]

Study start:

                        Exposed
  Disease           No    Yes   Total
  No (control)       ?      ?     276
  Yes (case)         ?      ?      84
  Total              ?      ?     360

End of study (after E assessment):

                        Exposed
  Disease           No    Yes   Total
  No (control)     192     84     276
  Yes (case)        48     36      84
  Total            240    120     360

Comparing risks: Case-control studies (2/5)

• In case-control studies, we can compare the prevalence of the exposure (i.e. probability of having been exposed) between the two disease status groups:

\[ \frac{P(E \mid D)}{P(E \mid \bar{D})}, \]

which is not very useful because we are interested in comparing risks of D. I.e. we would like to estimate the following ratio instead:

\[ \frac{P(D \mid E)}{P(D \mid \bar{E})}. \]

• Using elementary probability theory, it can be shown that

\[ \frac{P(D \mid E)}{P(D \mid \bar{E})} = \frac{P(E \mid D)}{1 - P(E \mid D)} \times \frac{1 - P(E)}{P(E)}, \]

which cannot be estimated because, with data from a case-control study, the prevalence of the exposure, P(E), cannot be estimated.
• To solve this problem, we use the concept of odds...

Comparing risks: Case-control studies (3/5)

• The odds of an event X is a monotone transformation of its probability:

\[ \mathrm{odds}(X) = \frac{P(X)}{1 - P(X)} = \frac{P(X)}{P(\bar{X})}. \]

• The odds transforms the space $P(X) \in [0, 1]$ into the space $\mathrm{odds}(X) \in [0, \infty)$:

[Figure: odds(X) on the vertical axis (shown from 0 to 10) plotted against P(X) on the horizontal axis (from 0 to 1).]

Comparing risks: Case-control studies (4/5)

• Now, we can compare the odds of the exposure (i.e. of having been exposed) between the two disease status groups to get the odds ratio (OR):

\[ \text{Odds ratio} = OR = \frac{\mathrm{odds}(E \mid D)}{\mathrm{odds}(E \mid \bar{D})}. \]

Prove that

\[ \frac{P(E \mid D)}{P(E \mid \bar{D})} \neq \frac{P(D \mid E)}{P(D \mid \bar{E})} \quad \text{while} \quad \frac{\mathrm{odds}(E \mid D)}{\mathrm{odds}(E \mid \bar{D})} = \frac{\mathrm{odds}(D \mid E)}{\mathrm{odds}(D \mid \bar{E})}. \]

• According to the previous result, we can compute and interpret the odds ratio as a ratio of odds of the disease instead of a ratio of odds of being exposed:

\[ \text{Odds ratio} = OR = \frac{\mathrm{odds}(E \mid D)}{\mathrm{odds}(E \mid \bar{D})} = \frac{\mathrm{odds}(D \mid E)}{\mathrm{odds}(D \mid \bar{E})} = \frac{P(E \mid D) / P(\bar{E} \mid D)}{P(E \mid \bar{D}) / P(\bar{E} \mid \bar{D})} = \frac{P(E \mid D) \, P(\bar{E} \mid \bar{D})}{P(E \mid \bar{D}) \, P(\bar{E} \mid D)}. \]

Comparing risks: Case-control studies (5/5)

Prove that, for the following case-control study (a), odds(D|E) = 0.43, odds(D|Ē) = 0.25 and OR = 1.71, which can be interpreted as the odds of having the disease among the exposed being 71% higher than among the non-exposed.

Study start:

                        Exposed
  Disease           No    Yes   Total
  No (control)       ?      ?     276
  Yes (case)         ?      ?      84
  Total              ?      ?     360

End of study (after E assessment):

                        Exposed
  Disease           No    Yes   Total
  No (control)     192     84     276
  Yes (case)        48     36      84
  Total            240    120     360

(a) In contingency tables for case-control studies, disease status groups are usually arranged in rows instead of in columns.

Comparing risks: Cross-sectional studies

In cross-sectional studies, we can compare the prevalence of (i.e. probability of having) the disease between the two exposure groups to get the following measures of association:

• Prevalence ratio: \[ PR = \frac{P(D \mid E)}{P(D \mid \bar{E})}. \]
• Prevalence difference: \[ PD = P(D \mid E) - P(D \mid \bar{E}). \]
• (Prevalence) odds ratio: \[ POR = \frac{\mathrm{odds}(D \mid E)}{\mathrm{odds}(D \mid \bar{E})}. \]

Exercise: Compute and interpret PR, PD and POR for the following cross-sectional study data.

Study start:

                 Disease
  Exposed    No    Yes   Total
  No          ?      ?       ?
  Yes         ?      ?       ?
  Total       ?      ?     360

End of study (after E and D assessment):

                 Disease
  Exposed    No    Yes   Total
  No        192     48     240
  Yes        84     36     120
  Total     276     84     360

Ratios vs differences (1/2)

• In a given study, when comparing two groups, using ratios can lead, depending on the data, to different conclusions than using differences. E.g. it can happen that PR_1 > PR_2 while PD_1 < PD_2, or vice versa. This also applies to risks (cohort studies) and odds (case-control studies).
• This is natural, since ratios and differences work on different metrics. (a)
• Usually, ratios are more widely used than differences.
• Don't compare a study based on ratios with another study based on differences!

(a) Naive example: A's income has increased from 100 units to 120 units (difference: 20 units; percentage: 20%). B's income has increased from 150 units to 174 units (difference: 24 units; percentage: 16%).
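
Why the two scales can rank groups differently becomes easier to see when the difference is rewritten in terms of the ratio. The following short derivation is a sketch that uses only the definitions of PR and PD given for cross-sectional studies; the same algebra holds for risks in cohort studies.

```latex
\begin{align*}
PD &= P(D \mid E) - P(D \mid \bar{E}) \\
   &= P(D \mid \bar{E}) \left( \frac{P(D \mid E)}{P(D \mid \bar{E})} - 1 \right) \\
   &= P(D \mid \bar{E}) \, (PR - 1).
\end{align*}
% Income example from the footnote, with the starting income as the baseline:
\begin{align*}
\text{A: } & 100 \times (1.20 - 1) = 20, \\
\text{B: } & 150 \times (1.16 - 1) = 24.
\end{align*}
```

The difference is the ratio-based excess scaled by the baseline level in the non-exposed group, so a group with a smaller ratio but a larger baseline (B here) can still show the larger difference.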
Ratios vs differences (2/2)

Ratios vs differences: example (data)

A hypothetical cross-sectional study: PR was higher among women while PD was higher among men:

                  Women                      Men
            D̄      D    Total        D̄      D    Total
  Ē        565     70     635       461    260     721
  E        280     43     323       246    161     407
  Total    845    113     958       707    421    1128

Ratios vs differences: example (results)

            PR       PD
  Women   1.208    0.023
  Men     1.097    0.035

Ratios vs differences: generalization

It can be easily shown that this example is a particular case of the general situation in which (taking $PR_{\text{men}} > 1$)

\[ \text{if } PR_{\text{women}} > PR_{\text{men}} \text{ and } \frac{P(D \mid \bar{E}_{\text{men}})}{P(D \mid \bar{E}_{\text{women}})} > \frac{PR_{\text{women}} - 1}{PR_{\text{men}} - 1}, \text{ then } PD_{\text{women}} < PD_{\text{men}}. \]

Example: calculations in R using the epiR package (1/4)

Toy data

> # Cell counts: E0/E1 = non exposed/exposed, D0/D1 = no disease/disease
> E0D0 <- 192
> E0D1 <- 48
> E1D0 <- 84
> E1D1 <- 36
> # Marginal totals
> E0 <- E0D0 + E0D1
> E1 <- E1D0 + E1D1
> D0 <- E0D0 + E1D0
> D1 <- E0D1 + E1D1
> n <- E0 + E1
> # Risk ratio and odds ratio (point estimates)
> rr <- (E1D1 / E1) / (E0D1 / E0)
> or <- (E0D0 * E1D1) / (E0D1 * E1D0)
> # 2 x 2 table with exposed and diseased in the first row and column
> dd <- matrix(c(E1D1, E1D0, E0D1, E0D0),
+              nrow = 2, byrow = TRUE)
> rownames(dd) <- c("exposed", "non exposed")
> colnames(dd) <- c("disease", "no disease")

Manually (point estimates)

> dd
##             disease no disease
## exposed          36         84
## non exposed      48        192
> rr
## [1] 1.5
> or
## [1] 1.714286

Example: calculations in R using the epiR package (2/4)

Cohort study

> library(epiR)
> epi.2by2(dat = as.table(dd), method = "cohort.count")
##              Outcome +    Outcome -      Total    Inc risk *     Odds
## Exposed +           36           84        120          30.0    0.429
## Exposed -           48          192        240          20.0    0.250
## Total               84          276        360          23.3    0.304
##
## Point estimates and 95% CIs:
## -------------------------------------------------------------------
## Inc risk ratio                               1.50 (1.03, 2.18)
## Odds ratio                                   1.71 (1.04, 2.83)
## Attrib risk in the exposed *                 10.00 (0.36, 19.64)
## Attrib fraction in the exposed (%)           33.33 (3.25, 54.06)
## Attrib risk in the population *              3.33 (-3.35, 10.02)
## Attrib fraction in the population (%)        14.29 (-0.69, 27.03)
## -------------------------------------------------------------------
## Uncorrected chi2 test that OR = 1: chi2(1) = 4.472 Pr>chi2 = 0.034
## Fisher exact test that OR = 1: Pr>chi2 = 0.047
##  Wald confidence limits
##  CI: confidence interval
##  * Outcomes per 100 population units

Example: calculations in R using the epiR package (3/4)

Cross-sectional study

> library(epiR)
> epi.2by2(dat = as.table(dd), method = "cross.sectional")
##              Outcome +    Outcome -      Total    Prevalence *     Odds
## Exposed +           36           84        120            30.0    0.429
## Exposed -           48          192        240            20.0    0.250
## Total               84          276        360            23.3    0.304
##
## Point estimates and 95% CIs:
## -------------------------------------------------------------------
## Prevalence ratio                             1.50 (1.03, 2.18)
## Odds ratio                                   1.71 (1.04, 2.83)
## Attrib prevalence in the exposed *           10.00 (0.36, 19.64)
## Attrib fraction in the exposed (%)           33.33 (3.25, 54.06)
## Attrib prevalence in the population *        3.33 (-3.35, 10.02)
## Attrib fraction in the population (%)        14.29 (-0.69, 27.03)
## -------------------------------------------------------------------
## Uncorrected chi2 test that OR = 1: chi2(1) = 4.472 Pr>chi2 = 0.034
## Fisher exact test that OR = 1: Pr>chi2 = 0.047
##  Wald confidence limits
##  CI: confidence interval
##  * Outcomes per 100 population units

Example: calculations in R using the epiR package (4/4)

Case-control study

> epi.2by2(dat = as.table(dd), method = "case.control")
##              Outcome +    Outcome -      Total    Prevalence *     Odds
## Exposed +           36           84        120            30.0    0.429
## Exposed -           48          192        240            20.0    0.250
## Total               84          276        360            23.3    0.304
##
## Point estimates and 95% CIs:
## -------------------------------------------------------------------
## Odds ratio                                   1.71 (1.04, 2.83)
## Attrib fraction (est) in the exposed (%)     41.58 (0.11, 65.68)
## Attrib fraction (est) in the population (%)  17.86 (-0.43, 32.81)
## -------------------------------------------------------------------
## Uncorrected chi2 test that OR = 1: chi2(1) = 4.472 Pr>chi2 = 0.034
## Fisher exact test that OR = 1: Pr>chi2 = 0.047
##  Wald confidence limits
##  CI: confidence interval
##  * Outcomes per 100 population units
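
The output above labels its intervals as "Wald confidence limits". As a hedged check, the 95% intervals for the risk ratio and the odds ratio can be reproduced by hand with the usual large-sample Wald intervals on the log scale. That epiR uses exactly these formulas is an assumption here, supported only by the matching numbers. The labels a, b, c, d are introduced for illustration and do not appear in the slides: a = 36 (exposed, disease), b = 84 (exposed, no disease), c = 48 (non exposed, disease), d = 192 (non exposed, no disease).

```latex
\begin{align*}
SE(\log RR) &= \sqrt{\frac{1}{a} - \frac{1}{a+b} + \frac{1}{c} - \frac{1}{c+d}}
             = \sqrt{\frac{1}{36} - \frac{1}{120} + \frac{1}{48} - \frac{1}{240}} \approx 0.190, \\
CI_{95\%}(RR) &= \exp\left( \log 1.50 \pm 1.96 \times 0.190 \right) \approx (1.03,\ 2.18), \\[6pt]
SE(\log OR) &= \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}
             = \sqrt{\frac{1}{36} + \frac{1}{84} + \frac{1}{48} + \frac{1}{192}} \approx 0.256, \\
CI_{95\%}(OR) &= \exp\left( \log 1.714 \pm 1.96 \times 0.256 \right) \approx (1.04,\ 2.83).
\end{align*}
```

Both intervals exclude 1, which agrees with the chi-squared test reported in the same output (p = 0.034).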