Biostatistics Refresher PDF

Summary

This document provides a refresher on biostatistics, covering types of variables, descriptive and inferential statistics (including measures of central tendency and variability), population distributions, confidence intervals, hypothesis testing, correlation and regression, and survival analysis. It is aimed at healthcare professionals, specifically those practicing pharmacotherapy.

Full Transcript

Biostatistics: A Refresher

I. INTRODUCTION TO STATISTICS
A. Statistics allow us to classify, summarize, and analyze data.
B. Statistics enable investigators to summarize data, determine the likelihood that a treatment or procedure will affect a group of patients, and estimate the effect size.
C. Statistics help us determine whether the results of a study can be applied to our practice.
D. Why Pharmacists Need to Know Statistics
E. As Statistics Pertains to Most of You
1. Pharmacotherapy Specialty Examination content outline, Domain 2: Application of Evidence to Practice and Education (25%)
2. Task statements:
a. Retrieve relevant information that addresses pharmacotherapy-related inquiries.
b. Evaluate pharmacotherapy-related literature and health information.
c. Disseminate pharmacotherapy-related information to educate health care professionals, patients, and caregivers.
F. Examples of Online Statistical and Study Design Tools
1. www.graphpad.com/quickcalcs/
2. http://statpages.org/
G. Several papers have investigated the various types of statistical tests used in the biomedical literature.

Table 1. Statistical Content of Original Articles in New England Journal of Medicine, 2004–2005 (% of articles)a
No statistics or descriptive statistics: 13
t-tests: 26
Contingency tables: 53
Nonparametric tests: 27
Epidemiologic statistics: 35
Pearson correlation: 3
Simple linear regression: 6
Analysis of variance: 16
Transformation: 10
Nonparametric correlation: 5
Survival methods: 61
Multiple regression: 51
Multiple comparisons: 23
Adjustment and standardization: 1
Multiway tables: 13
Power analyses: 39
Cost-benefit analysis: <1
Sensitivity analysis: 6
Repeated-measures analysis: 12
Missing-data methods: 8
Noninferiority trials: 4
Receiver operating characteristics: 2
Resampling: 2
Principal component and cluster analyses: 2
Other methods: 4
a Only applies to articles with Methods.
ACCP Updates in Therapeutics® 2023: Pharmacotherapy Preparatory Review and Recertification Course

Table 2. Statistical Content of Original Articles from Six Major Medical Journals, January–March 2005 (n=239 articles)a; values are No. (%)
Descriptive statistics (mean, median, frequency, SD, and IQR): 219 (92)
Simple statistics: 120 (50)
Chi-square analysis: 70 (29)
t-test: 48 (20)
Kaplan-Meier analysis: 48 (20)
Wilcoxon rank sum test: 38 (16)
Fisher's exact test: 33 (14)
Analysis of variance: 21 (9)
Correlation: 16 (7)
Multivariate analysis: 164 (69)
Cox proportional hazards: 64 (27)
Multiple logistic regression: 54 (23)
Multiple linear regression: 7 (3)
Other regression analysis: 38 (16)
None: 5 (2)
Others:
Intention-to-treat analysis: 42 (17.6)
Incidence or prevalence: 39 (16.3)
Relative risk or risk ratio: 29 (12.2)
Sensitivity analysis: 21 (8.8)
Sensitivity or specificity: 15 (6.3)
a Articles published in American Journal of Medicine, Annals of Internal Medicine, BMJ, Journal of the American Medical Association, Lancet, and New England Journal of Medicine.
IQR = interquartile range; SD = standard deviation.
Source of data in Tables 1 and 2: Horton NJ, Switzer SS. Statistical methods in the Journal. N Engl J Med 2005;353:1977-9; Windish DM, Huot SJ, Green ML. Medicine residents' understanding of the biostatistics and results in the medical literature. JAMA 2007;298:1010-22.

II. TYPES OF VARIABLES AND DATA
A. Definition: Random variables are variables whose observed values may be considered outcomes of an experiment and whose values cannot be anticipated with certainty before the experiment is conducted.
B. Two Types of Random Variables
1. Qualitative variables
2. Quantitative variables
C. Qualitative Variables
1. Can take only a limited number of values within a given range
2. Often referred to as categorical variables
3.
Nominal: Classified into groups in an unordered manner, with no indication of relative severity (e.g., mortality [dead or alive], disease presence [yes or no], race, marital status)
4. Ordinal: Ranked in a specific order but with no consistent magnitude of difference between ranks (e.g., NYHA functional class, which describes the functional status of patients with heart failure by classifying subjects in increasing order of symptoms: I, II, III, IV; Likert-type scales [strongly agree {SA}, agree {A}, neutral {N}, disagree {D}, strongly disagree {SD}]; disease stages [A, B, C, D])
a. These are not real numbers.
b. Common error: Measures of central tendency (in most cases, means and standard deviations [SDs]) should not be reported with ordinal data.
D. Quantitative Variables
1. Discrete quantitative variables can take only specific numeric values (1, 2, 3, 4, etc.), rather than any value in an interval as described for continuous variables below. These numeric values have a clear quantitative meaning (e.g., number of hospitalizations, number of pregnancies).
2. Quantitative continuous variables can take on any value within a given range.
a. Interval: Data are ranked in a specific order with a consistent change in magnitude between units, but the zero point is arbitrary (e.g., degrees Fahrenheit).
b. Ratio: Like "interval" but with an absolute zero (e.g., kelvin temperature, heart rate, blood pressure, time, distance)

III. TYPES OF STATISTICS
A. Descriptive Statistics: Used to summarize and describe data that are collected or generated in research studies, both visually and numerically
1. Visual methods of describing data
a. Frequency distribution
b. Histogram
c. Scatterplot
d. Boxplot
2. Numerical methods of describing data: Measures of central tendency
a. Arithmetic mean (i.e., average)
i.
Sum of all values divided by the total number of values
ii. Should generally be used only for continuous and normally distributed data
iii. Very sensitive to outliers; pulled toward the tail that contains the outliers
iv. Most commonly used and best understood measure of central tendency
v. Geometric mean: Useful for data that follow a log-normal distribution
b. Median
i. Midpoint of the values when placed in order from highest to lowest. Half of the observations are above and half are below. When there is an even number of observations, the median is the mean of the two middle values.
ii. Also called the 50th percentile
iii. Can be used for ordinal or continuous data (especially good for skewed distributions)
iv. Insensitive to outliers
c. Mode
i. Most common value in a distribution
ii. Can be used for nominal, ordinal, or continuous data
iii. Sometimes there may be more than one mode (e.g., bimodal, trimodal).
iv. Does not help describe distributions with a large range of values, each of which occurs infrequently
3. Numerical methods of describing data: Measures of data spread or variability
a. Standard deviation
i. Measure of the variability about the mean; the most common measure used to describe the spread of data
ii. Square root of the variance (the average squared difference of each observation from the mean); returns the variance to the original (non-squared) units
iii. Appropriately applied only to continuous data that are normally or near normally distributed or that can be transformed to be normally distributed
iv. By the empirical rule for normal distributions, about 68% of the sample values are found within ±1 SD, 95% within ±2 SD, and 99.7% within ±3 SD.
v. The coefficient of variation relates the mean and the SD (CV = SD/mean × 100%).
b. Range
i.
Difference between the smallest and largest values in a data set; by itself, it does not convey a great deal of information.
ii. Easy to compute (simple subtraction)
iii. Size of the range is very sensitive to outliers.
iv. Often reported as the two extreme values themselves rather than the difference between them
c. Percentiles
i. The point (value) in a distribution at which a value is larger than some percentage of the other values in the sample; can be calculated by ranking all data in the data set
ii. The 75th percentile lies at the point at which 75% of the other values are smaller.
iii. Does not assume the population has a normal distribution (or any other distribution)
iv. The interquartile range (IQR) is an example of the use of percentiles to describe the middle 50% of values. The IQR encompasses the 25th–75th percentiles.
4. Presenting data using only measures of central tendency can be misleading without some idea of the data spread. Studies that report only medians or means without their accompanying measures of spread should be closely scrutinized. What measures of spread should be used with means and medians?
5. Example data set (Table 3)

Table 3. Twenty Baseline HDL Concentrations from an Experiment Evaluating the Impact of Green Tea on HDL
64, 54, 59, 60, 68, 65, 59, 67, 87, 65, 79, 49, 64, 55, 46, 62, 48, 46, 54, 65

a. Calculate the mean, median, and mode of the data set given in Table 3.
b. Calculate the range and SD (on the examination, you will most likely not have to do this by hand).
c. Which figure(s) would allow us to evaluate the visual presentation of the data?
B. Inferential Statistics
1. Conclusions or generalizations made about a population (large group) from the study of a sample of that population
2. Choosing and evaluating statistical methods depend, in part, on the type of data used.
3. An educated statement about an unknown population is commonly called an inference in statistics.
4.
Statistical inference can be made by estimation or hypothesis testing.

IV. POPULATION DISTRIBUTIONS
A. Discrete Distributions
1. Binomial distribution: There are two possible outcomes, the probability of obtaining either outcome is known, and you want to know the chance of observing a certain number of successes in a certain number of trials.
2. Poisson distribution: Counts of events in a certain period of observation: the average number of counts is known, and you want to know the likelihood of observing various numbers of events.
B. Normal (Gaussian) Distribution
1. Most common model for population distributions
2. Symmetric, bell-shaped frequency distribution
3. Landmarks for continuous, normally distributed data
a. µ: Population mean (equal to zero for the standard normal distribution)
b. σ: Population SD (equal to 1 for the standard normal distribution)
c. x̄ and s: The sample mean and SD
4. When a random variable is measured in a large enough sample of any population, some values will occur more often than others.
5. A visual check of a distribution can help determine whether it is normally distributed (whether it appears symmetric and bell shaped). The raw data are needed to perform these checks.
a. Frequency distributions and histograms (visually inspect the data)
b. The median and mean will be about equal for normally distributed data (the most practical and easiest check to use).
c. Formal tests: Kolmogorov-Smirnov test and others
d. This is more challenging to evaluate when we do not have access to the data (when we are reading an article), because most articles do not present all the data or both the mean and the median.
6. The parameters mean and SD define a normally distributed population.
7. Probability: The likelihood that any one event will occur given all the possible outcomes
8. Estimation and sampling variability
a.
Estimation is one method that can be used to make an inference about a population parameter.
b. Separate samples (even of the same size) from a single population will give slightly different estimates.
c. The distribution of means from random samples approximates a normal distribution.
i. The mean of this "distribution of means" is equal to the unknown population mean, µ.
ii. The SD of the means is estimated by the standard error of the mean (SEM).
iii. As in any normal distribution, about 95% of the sample means lie within ±2 SEM of the population mean.
d. The distribution of means from these random samples is approximately normal regardless of the underlying population distribution (central limit theorem). You will get slightly different mean and SD values each time you repeat such an experiment.
e. The SEM is estimated from a single sample by dividing the SD by the square root of the sample size (n). The SEM quantifies uncertainty in the estimate of the mean, not variability in the sample. It is important for hypothesis testing and 95% CI estimation.
f. Why is the difference between the SEM and the SD worth knowing?
i. Calculation of CIs (the 95% CI is approximately the mean ± 2 times the SEM)
ii. Hypothesis testing
iii. Deception (e.g., using the SEM makes results look less "variable," especially in graphic format)
9. Recall the previous example about HDL and green tea. From the values calculated in section III, do these data appear to be normally distributed?

V. CONFIDENCE INTERVALS
A. Commonly Reported as a Way to Estimate a Population Parameter
1. In the medical literature, 95% CIs are the most commonly reported CIs. In repeated samples, 95% of all such CIs will include the true population value (i.e., the likelihood [or probability] that the population value is contained within the interval). In some cases, 90% or 99% CIs are reported. Why are 95% CIs most often reported?
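As an illustration of the mean/SD/SEM/CI relationships described above, the Table 3 HDL data can be run through a short calculation. This sketch is not part of the original handout; it uses the large-sample 1.96 multiplier the text describes.

```python
import math
import statistics

# Baseline HDL concentrations from Table 3
hdl = [64, 54, 59, 60, 68, 65, 59, 67, 87, 65,
       79, 49, 64, 55, 46, 62, 48, 46, 54, 65]

n = len(hdl)
mean = statistics.mean(hdl)    # arithmetic mean: sum of values / n
sd = statistics.stdev(hdl)     # sample SD: square root of the variance
sem = sd / math.sqrt(n)        # SEM = SD / sqrt(n): uncertainty in the mean estimate

# Approximate 95% CI for the population mean: mean +/- 1.96 x SEM
ci = (mean - 1.96 * sem, mean + 1.96 * sem)

print(f"mean = {mean}, median = {statistics.median(hdl)}, mode = {statistics.mode(hdl)}")
print(f"range = {max(hdl) - min(hdl)}, SD = {sd:.2f}, SEM = {sem:.2f}")
print(f"approximate 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```

This gives a mean of 60.8 and a median of 61 (close enough to suggest a roughly normal distribution, per the check in section IV.B.5), a mode of 65, a range of 41, an SD of about 10.4, and an approximate 95% CI of about 56.2 to 65.4. For a sample this small (n=20), a t-based multiplier (about 2.09 rather than 1.96) would widen the interval slightly.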
2. Example
a. Assume a baseline birth weight in a group (n=51) with a mean ± SD of 1.18 ± 0.4 kg.
b. The 95% CI is approximately equal to the mean ± 1.96 × SEM (or 2 × SEM). In reality, the multiplier depends on the distribution being used and is a bit more complicated.
c. What is the 95% CI? The 95% CI is calculated to be (1.07, 1.29), meaning there is 95% certainty that the true mean of the entire population studied lies between 1.07 and 1.29 kg.
d. What is the 90% CI? The 90% CI is calculated to be (1.09, 1.27). The 95% CI will always be wider than the 90% CI for any given sample. Therefore, the wider the CI, the more likely it is to encompass the true population mean.
3. The differences among the SD, SEM, and CIs should be noted when interpreting the literature because they are often used interchangeably. Although it is common for CIs to be confused with SDs, the information each provides is quite different and must be assessed correctly.
4. Recall the previous example about HDL and green tea. What is the 95% CI of the data set, and what does it mean?
B. CIs can also be used for any sample estimate. Estimates derived from categorical data, such as risks, risk differences, and risk ratios, are often presented with CIs and are discussed in the text that follows.

VI. HYPOTHESIS TESTING
A. Null and Alternative Hypotheses (see Table 4 for other types of examples)
1. Null hypothesis (H0): Example: No difference between the groups being compared (treatment A equals treatment B)
2. Alternative hypothesis (HA): Opposite of the null hypothesis; states that there is a difference (treatment A does not equal treatment B)
3. The structure, or the manner in which the hypothesis is written, dictates which statistical test is used. Example: two-sample t-test: H0: Mean 1 = Mean 2
4.
Used to assist in determining whether any observed differences between groups can be explained by chance
5. Tests for statistical significance (hypothesis testing) determine whether the data are consistent with H0 (no difference).
6. The results of hypothesis testing indicate whether enough evidence exists for H0 to be rejected.
a. If H0 is rejected: Statistically significant difference between groups (unlikely attributable to chance)
b. If H0 is not rejected: No statistically significant difference between groups (any apparent differences may be attributable to chance). Note that we are not concluding that the treatments are equal.
7. Types of hypothesis testing. These are situations in which two groups are being compared; the procedures can be applied to numerous other situations (Table 4).

Table 4. Types of Hypothesis Testing
Nondirectional
1. Difference. Question: Are the means different? Hypotheses: H0: Mean1 = Mean2; HA: Mean1 ≠ Mean2 (equivalently, H0: Mean1 − Mean2 = 0; HA: Mean1 − Mean2 ≠ 0). Methods: traditional 2-sided t-test; confidence intervals
2. Equivalence. Question: Are the means practically equivalent? Hypotheses: H0: |Mean1 − Mean2| ≥ Δ; HA: |Mean1 − Mean2| < Δ. Methods: two one-sided tests (TOST) procedure; confidence intervals
Directional
3. Superiority. Question: Is mean 1 > mean 2? (or some similarly worded question) Hypotheses: H0: Mean1 ≤ Mean2; HA: Mean1 > Mean2 (equivalently, H0: Mean1 − Mean2 ≤ 0; HA: Mean1 − Mean2 > 0). Methods: traditional 1-sided t-test; confidence intervals
4. Noninferiority. Question: Is mean 1 no more than a certain amount lower than mean 2? Hypotheses: H0: Mean2 − Mean1 ≥ Δ; HA: Mean2 − Mean1 < Δ. Methods: confidence intervals
Δ = equivalence or noninferiority margin; H0 = null hypothesis; HA = alternative hypothesis.

B. To determine what constitutes sufficient evidence to reject H0: Set the a priori significance level (α) and generate the decision rule.
1.
Developed after the research question has been stated in hypothesis form
2. Used to determine the acceptable level of error caused by a false positive (also known as the level of significance)
a. Convention: The a priori α is usually 0.05.
b. A critical value is calculated, capturing how extreme the sample data must be to reject H0.
C. Perform the Experiment and Estimate the Test Statistic.
1. A test statistic is calculated from the observed data in the study and compared with the critical value.
2. Depending on the test statistic's value, H0 is either not rejected (often called "fail to reject") or rejected.
3. In general, the test statistic and critical value are not presented in the literature; instead, p-values are reported and compared with the a priori α to assess statistical significance. p-value: The probability of obtaining a test statistic as extreme as, or more extreme than, the one actually obtained, assuming H0 is true
4. Because computers are used for these tests, this step is often transparent; the p-value estimated in the statistical test is compared with the a priori α (usually 0.05), and the decision is made.
D. CIs Instead of Hypothesis Testing
1. Hypothesis testing and p-values tell us (ideally) whether there is a statistically significant difference between groups, but nothing about the magnitude of the difference.
2. CIs help us determine the importance of a finding, which we can apply to a situation.
3. CIs give us an idea of both the magnitude of the difference between groups and the statistical significance.
4. CIs are a range of values, together with a point estimate of the difference.
5. Wide CIs
a. Many results are possible, either larger or smaller than the point estimate provided by the study.
b. All values contained in the CI are statistically plausible.
6.
If the estimate is the difference between two continuous variables: A CI that includes zero (no difference between the two variables) can be interpreted as not statistically significant (a p-value of 0.05 or greater).
7. The interpretation of CIs for odds ratios and relative risks is somewhat different. In this case, a value of 1 indicates no difference in risk, and if the CI includes 1, there is no statistically significant difference. (See the discussion of case-control/cohort studies in other sections for how to interpret CIs for odds ratios and relative risks.)
8. There is no need to report both the 95% CI and the p-value.

VII. DECISION ERRORS

Table 5. Summary of Decision Errors
Underlying truth: H0 is true (no difference) vs. H0 is false (difference)
Test result, accept H0 (no difference): no error (correct decision) if H0 is true; type II error (β error) if H0 is false
Test result, reject H0 (difference): type I error (α error) if H0 is true; no error (correct decision) if H0 is false
H0 = null hypothesis.

A. Type I Error: The probability of making this error is defined as the significance level α.
1. Convention is to set α to 0.05, effectively meaning that, 1 in 20 times, a type I error will occur when H0 is rejected. Thus, 5.0% of the time, a researcher will conclude that there is a statistically significant difference when one does not actually exist.
2. The calculated chance that a type I error has occurred is called the p-value.
3. The p-value tells us the likelihood of obtaining a given (or a more extreme) test result if H0 is true. When the α level is set a priori, H0 is rejected when p is less than α. In other words, the p-value tells us the probability of being wrong when we conclude that a true difference exists (a false positive).
4. A lower p-value does not mean the result is more important or more meaningful, only that it is statistically significant and not likely attributable to chance.
B. Type II Error: The probability of making this error is called β.
1. Concluding that no difference exists when one truly does (not rejecting H0 when it should be rejected)
2. It has become convention to set β at 0.10–0.20.
C. Power (1 − β)
1. The probability of making a correct decision when H0 is false; the ability to detect a difference between groups if one actually exists
2. Dependent on the following factors:
a. The predetermined α
b. Sample size
c. The size of the difference between the outcomes you want to detect: often not known before conducting the experiment, so to estimate the power of your test, you will have to specify how large a change is worth detecting (usually determined from previous data the investigator has or from the literature)
d. The variability of the outcomes being measured (usually determined from previous data the investigator has or from the literature)
e. Power is decreased by poor study design and/or the use of incorrect statistical tests (e.g., use of nonparametric tests when parametric tests are appropriate).
3. Statistical power analysis and sample size calculation
a. Related to the previous discussion of power and sample size
b. Sample size estimates should be performed a priori in all studies.
c. Necessary components for estimating an appropriate sample size:
i. Acceptable type II error rate (usually 0.10–0.20)
ii. The difference in predicted study outcomes that is considered clinically significant
iii. The expected variability in item ii
iv. Acceptable type I error rate (usually 0.05)
v. The statistical test that will be used for the primary end point
4. Statistical significance versus clinical significance
a. As stated earlier, the size of the p-value is not necessarily related to the clinical importance of the result. Smaller values mean only that chance is less likely to explain the observed differences.
b. Statistically significant does not necessarily mean clinically significant.
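The interplay of effect size and sample size in Table 6 (below) can be reproduced with a pooled two-proportion z-test, a common choice for comparing two response rates. This sketch uses only the standard library and is an illustration, not the handout's own calculation.

```python
import math

def two_proportion_p(x1, n1, x2, n2):
    """Two-sided p-value for H0: p1 = p2 (pooled two-proportion z-test)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                      # pooled response rate
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return math.erfc(abs(z) / math.sqrt(2))             # two-sided normal tail area

# Studies 1 and 2 of Table 6: the identical 8% difference (60% vs. 52%)
p_large = two_proportion_p(480, 800, 416, 800)   # n = 800 per group
p_small = two_proportion_p(15, 25, 13, 25)       # n = 25 per group
print(f"study 1: p = {p_large:.4f}; study 2: p = {p_small:.2f}")
```

The same 8% absolute difference is statistically significant with 800 patients per group (p of about 0.001) but far from significant with 25 per group (p of about 0.57), in line with the p-values reported in Table 6.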
c. Lack of statistical significance does not mean that the results are not clinically important.
d. When considering nonsignificant findings, consider the sample size, estimated power, and observed variability.

Table 6. Four Studies Carried Out to Test the Response Rate to a New Drug vs. That to a Standard Drug
Study 1: New drug, 480 of 800 (60%); standard drug, 416 of 800 (52%); point estimate of difference, 8%; 95% CI, 3%–13%; p=0.001
Study 2: New drug, 15 of 25 (60%); standard drug, 13 of 25 (52%); point estimate of difference, 8%; 95% CI, −19% to 35%; p=0.57
Study 3: New drug, 15 of 25 (60%); standard drug, 9 of 25 (36%); point estimate of difference, 24%; 95% CI, −3% to 51%; p=0.09
Study 4: New drug, 240 of 400 (60%); standard drug, 144 of 400 (36%); point estimate of difference, 24%; 95% CI, 17%–31%; p<0.0001

e. Which study (or studies) observed a statistically significant difference in response rate?
f. If the smallest change in response rate thought to be clinically significant is 15%, which of these trials may be convincing enough to change your practice?

VIII. STATISTICAL TESTS AND CHOOSING A STATISTICAL TEST
A. Basic Study Designs
1. Number of study groups or samples
a. One
b. Two
c. More than two
2. Related samples (paired or matched)
a. Crossover design
b. Two or more observations in the same biologic unit
3. Independent samples
a. Parallel design
b. Different people in each group or sample
c. Less statistical efficiency than a crossover design
B. Choosing the Appropriate Statistical Test Depends on the Following:
1. Type of data (e.g., qualitative [nominal, ordinal] vs. quantitative [continuous])
2. Distribution and/or spread of the data (e.g., normal vs. nonnormal)
3. Number of groups
4. Study design (e.g., parallel, crossover)
C. Parametric vs. Nonparametric Tests
1. Parametric tests assume the following:
a. The data being investigated have an underlying distribution that is normal or close to normal or, more correctly, are randomly drawn from a parent population with a normal distribution.
Remember how to estimate this (mean ≈ median)?
b. The data measured are quantitative continuous data measured on either an interval or a ratio scale.
c. Parametric tests assume that the data have variances that are homogeneous between the groups investigated; this is often called homoscedasticity.
2. Nonparametric tests
a. "Distribution-free" statistics
b. Data are not (or do not need to be) normally distributed (e.g., skewed quantitative continuous data).
c. Data do not meet other criteria (e.g., quantitative discrete, ordinal, or nominal data).
D. Examples of Representative Statistical Tests

Figure 1. Flowchart of representative statistical tests. Information from: Baker WE, Sowinski KM. Biostatistics and study designs: fundamentals of design and interpretation. In: Updates in Therapeutics®: Cardiology Pharmacy Preparatory Review Course. ACCP, 2018.

E. Comparing Groups, Continuous and Ordinal Data
1. Parametric tests
a. Student t-test: Several different types
i. One-sample test: Compares the mean of the study sample with a known population mean
ii. Two-sample, independent-samples, or unpaired test: Compares the means of two independent samples (an independent-samples test)
(a) Equal-variance test
(i) Rule of thumb for variances: If the ratio of the larger variance to the smaller variance is greater than 2, we generally conclude that the variances are different.
(ii) Formal test for differences in variances: F test
(iii) Adjustments to the data can be made for cases of unequal variance.
(b) Unequal-variance test: A correction is applied.
iii. Paired test: Compares the mean difference of paired or matched samples (a related-samples test)
b. Analysis of variance (ANOVA): A more generalized version of the t-test that can apply to more than two groups
i.
One-way ANOVA: Compares the means of three or more groups in a study; also known as a single-factor ANOVA (an independent-samples test)
ii. Two-way ANOVA: An additional factor (e.g., age) is added, so that each level of the second factor (e.g., younger and older groups) contains each study group.
iii. Repeated-measures ANOVA: A related-samples test comparing repeated measurements made in the same subjects
iv. Several more complex factorial ANOVAs can be used.
v. Multiple-comparison procedures are used to determine which groups actually differ from each other. Post hoc tests: Tukey HSD (honestly significant difference), Bonferroni, Scheffé, Newman-Keuls
c. Analysis of covariance: Provides a method to explain the influence of a categorical variable (independent variable) on a continuous variable (dependent variable) while statistically controlling for other variables
2. Nonparametric tests
a. Types of data that may require their use: continuous data that are not normally distributed (skewed data) or ordinal data
b. Tests for independent samples
i. Wilcoxon rank sum test, Mann-Whitney U test, or Wilcoxon-Mann-Whitney test: Compares two independent samples (related to a two-sample t-test)
ii. Kruskal-Wallis one-way ANOVA by ranks: Compares three or more independent groups (related to a one-way ANOVA)
c. Tests for related or paired samples
i. Sign test and Wilcoxon signed rank test: Compare two matched or paired samples (related to a paired t-test)
ii. Friedman ANOVA by ranks: Compares three or more matched or paired groups
F. Comparing Groups for Categorical or Nominal Data
a. Independent samples
i. Chi-square (χ2) test: Compares expected and observed proportions between two or more groups
(a) Test of independence
(b) Test of goodness of fit
ii.
Fisher exact test: A specialized version of the chi-square test used when small groups (cells) contain fewer than five predicted observations
b. Paired samples
i. McNemar test: Used to determine whether there are differences in a dichotomous variable between two related groups; similar to a paired t-test for this type of data

IX. CORRELATION AND REGRESSION
A. Introduction: Correlation vs. Regression
1. Correlation examines the strength of the association between two variables. It does not necessarily assume that one variable is useful in predicting the other.
2. Regression examines the ability of one or more variables to predict another variable.
B. Pearson Correlation
1. The strength of the relationship between two variables that are normally distributed, ratio or interval scaled, and linearly related is measured with a correlation coefficient.
2. Often called the degree of association between the two variables
3. Does not necessarily imply that one variable is dependent on the other (regression analysis will do that)
4. The Pearson correlation coefficient (r) ranges from −1 to +1 and can take any value in between: −1, perfect negative linear relationship; 0, no linear relationship; +1, perfect positive linear relationship
5. Hypothesis testing is performed to determine whether the correlation coefficient is different from zero. This test is highly influenced by sample size.
C. Pearls About Correlation
1. The closer the magnitude of r is to 1 (either + or −), the more highly correlated the two variables. The weaker the relationship between the two variables, the closer r is to 0.
2. There is no agreed-on or consistent interpretation of the value of the correlation coefficient. It depends on the environment of the investigation (laboratory vs. clinical experiment).
3.
Pay more attention to the magnitude of the correlation than to the p-value, because the p-value is influenced by sample size.
4. Crucial to the proper use of correlation analysis is the interpretation of the graphic representation of the two variables. Before using correlation analysis, it is essential to generate a scatterplot of the two variables to visually examine the relationship.
D. Spearman Rank Correlation: A nonparametric test that quantifies the strength of an association between two variables but does not assume a normal distribution of continuous data. It can be used for ordinal data or nonnormally distributed continuous data.
E. Regression
1. A statistical technique related to correlation; there are many different types. Simple linear regression involves one continuous outcome (dependent) variable and one continuous independent (causative) variable.
2. Two main purposes of regression: development of a prediction model and assessment of the accuracy of prediction
3. Prediction model: Making predictions of the dependent variable from the independent variable; Y = mX + b (dependent variable = slope × independent variable + intercept)
4. Accuracy of prediction: How well the independent variable predicts the dependent variable. Regression analysis determines the extent of the variability in the dependent variable that can be explained by the independent variable.
a. The coefficient of determination (r2) describes this relationship. Values of r2 can range from 0 to 1.
b. An r2 of 0.80 could be interpreted as saying that 80% of the variation in Y is explained by the variation in X.
c. This does not provide a mechanistic understanding of the relationship between X and Y, but rather a description of how clearly such a model (linear or otherwise) describes the relationship between the two variables.
d. Like the interpretation of r, the interpretation of r2 depends on the scientific arena (e.g., clinical research, basic research, social science research) to which it is applied.
5.
For simple linear regression, two statistical tests can be used:
a. To test the hypothesis that the y-intercept differs from zero
b. To test the hypothesis that the slope of the line differs from zero
6. Regression is useful in constructing predictive models. The literature is full of examples of such predictions. The process involves developing a formula for a regression line that best fits the observed data.
7. As with correlation, there are many different types of regression analysis.
a. Multiple linear regression: One continuous dependent variable and two or more continuous independent variables
b. Simple logistic regression: One categorical response (dependent) variable and one continuous or categorical explanatory (independent) variable
c. Multiple logistic regression: One categorical response (dependent) variable and two or more continuous or categorical explanatory (independent) variables
d. Nonlinear regression: Variables are not linearly related (or cannot be transformed into a linear relationship). This is where our pharmacokinetic equations come from.
e. Polynomial regression: Any number of response and continuous variables with a curvilinear relationship (e.g., cubed, squared)
8. Example of regression
a. The following data are taken from a study evaluating enoxaparin use. The authors were interested in predicting patient response (measured as anti-factor Xa concentrations) from the enoxaparin dose in the 75 subjects who were studied.
[Figure 2. Relationship between anti-factor Xa concentrations (U/mL; y-axis, 0.00–1.20) and enoxaparin dose (mg/kg; x-axis, 0.00–4.00); scatterplot.]
b. The authors performed regression analysis and reported the following: slope = 0.227, y-intercept = 0.097, p<0.05, r2 = 0.31.
c. Answer the following questions:
i.
What are the assumptions necessary to use regression analysis?
ii. Provide an interpretation of the coefficient of determination.
iii. Predict anti-factor Xa concentrations at enoxaparin doses of 2 and 3.75 mg/kg.
iv. What does the p<0.05 value indicate?

X. SURVIVAL ANALYSIS

A. Studies the Time Between Entry into a Study and Some Event (e.g., death, myocardial infarction)
1. Censoring makes survival methods unique; it accounts for subjects who leave the study for reasons other than the event (e.g., lost to follow-up, end of study period).
2. Accounts for the fact that not all subjects enter the study at the same time
3. Standard methods of statistical analysis, such as t-tests and linear or logistic regression, may not be appropriately applied to survival data because of censoring.

B. Estimating the Survival Function
1. Kaplan-Meier method
a. Uses survival times (or censored survival times) to estimate the proportion of people who would survive a given length of time under the same circumstances
b. Allows the production of a table (life table) and a graph (survival curve)
c. We can visually evaluate the curves, but we need a test to evaluate them formally.
2. Log-rank test: Compares the survival distributions between two or more groups.
a. This test precludes an analysis of the effects of several variables, of the magnitude of difference between groups, or of the CI (see the Cox proportional hazards model, which follows).
b. H0: No difference in survival between the two populations
c. The log-rank test relies on several assumptions:
i. Random sampling and subjects chosen independently
ii. Consistent criteria for entry or end point
iii. Baseline survival rate does not change as time progresses.
iv. Censored subjects have the same average survival time as uncensored subjects.
3. Cox proportional hazards model
a.
Most popular method to evaluate the impact of covariates; reported graphically like the Kaplan-Meier method
b. Investigates several variables at a time
c. The actual method of construction and calculation is complex.
d. Compares survival in two or more groups after adjusting for other variables
e. Allows calculation of a hazard ratio (and its CI)
4. Example of survival analysis:
Figure 3. Hemofilter survival with bivalirudin versus heparin in patients receiving continuous renal replacement therapy. Source: Kiser TH, MacLaren R, Fish DN, et al. Bivalirudin versus unfractionated heparin for prevention of hemofilter occlusion during continuous renal replacement therapy. Pharmacotherapy 2010;30:1117-26.

XI. SUMMARY OF SELECTING STATISTICAL TESTS

Table 7. Representative Statistical Tests
Each goal is listed with the appropriate test for each type of data: [1] continuous quantitative data from a normal distribution; [2] ordinal data, or quantitative data from a nonnormal distribution; [3] nominal or named data (two possible outcomes); [4] survival time.

Descriptive: [1] Mean, SD; [2] Median, interquartile range, range; [3] Proportion or percent; [4] Kaplan-Meier curve
Compare one group to a known value: [1] One-sample t test; [2] One-sample Wilcoxon sign test; [3] Chi-square or binomial test(b)
Compare two independent groups: [1] Unpaired t test (equal or unequal variance); [2] Wilcoxon rank sum test(a); [3] Fisher's exact test (for small samples) or chi-square; [4] Log-rank test
Compare two paired or related groups: [1] Paired t test; [2] Wilcoxon sign-rank test or sign test; [3] McNemar's test; [4] Conditional proportional hazards regression(b)
Compare three or more independent groups: [1] One-way ANOVA; [2] Kruskal-Wallis test; [3] Chi-square test; [4] Cox proportional hazards regression
Compare three or more related groups: [1] Repeated-measures ANOVA; [2] Friedman ANOVA test; [3] Cochran Q; [4] Conditional proportional hazards regression(b)
Quantify association between two variables: [1] Pearson correlation; [2] Spearman rank correlation; [3] Contingency coefficients(b)
Predict value from another measured variable: [1] Simple linear regression or nonlinear regression(b); [2] Nonparametric regression(b); [3] Simple logistic regression(b); [4] Cox proportional hazards regression
Predict value from several measured or binomial variables: [1] Multiple linear regression(b) or multiple nonlinear regression(b); [3] Multiple logistic regression(b); [4] Cox proportional hazards regression

(a) Wilcoxon rank sum test = Mann-Whitney U test = Wilcoxon-Mann-Whitney test
(b) Not covered in this chapter or presentation
ANOVA = analysis of variance; SD = standard deviation.
Source: Modified from Motulsky H. Intuitive Biostatistics. Oxford University Press, 2015; DiCenzo R, ed. Clinical Pharmacist's Guide to Biostatistics and Literature Evaluation. ACCP, 2015.
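As a worked check of the enoxaparin regression example above (Y = mX + b), the reported fit can be used directly to answer the prediction question. Here is a minimal Python sketch, assuming only the slope (0.227), y-intercept (0.097), and r2 (0.31) reported in the text; the function name is illustrative, not from the source:

```python
# Worked check of the enoxaparin regression example.
# Reported fit (from the text): slope = 0.227, y-intercept = 0.097, r^2 = 0.31.
import math

SLOPE = 0.227      # change in anti-factor Xa (U/mL) per 1-mg/kg change in dose
INTERCEPT = 0.097  # predicted anti-factor Xa concentration at a dose of 0 mg/kg
R_SQUARED = 0.31   # fraction of variability in response explained by dose

def predict_anti_xa(dose_mg_per_kg: float) -> float:
    """Predicted anti-factor Xa concentration (U/mL): Y = mX + b."""
    return SLOPE * dose_mg_per_kg + INTERCEPT

for dose in (2.0, 3.75):
    print(f"{dose:5.2f} mg/kg -> predicted anti-Xa {predict_anti_xa(dose):.3f} U/mL")

# Because the reported slope is positive, r is the positive square root of r^2.
print(f"r = {math.sqrt(R_SQUARED):.2f}")
```

With these values, the predicted concentrations are about 0.551 U/mL at 2 mg/kg and 0.948 U/mL at 3.75 mg/kg, and an r2 of 0.31 means only about 31% of the variability in anti-factor Xa response is explained by dose, so other patient factors matter.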
