Study Designs: Fundamentals and Interpretation
ACCP Updates in Therapeutics® 2023: Pharmacotherapy Preparatory Review and Recertification Course

I. INTRODUCTION: WHY DO PHARMACISTS NEED TO KNOW ABOUT STUDY DESIGN AND INTERPRETATION?
   A. Pharmacotherapy Specialty Examination content outline, Domain 2: Application of Evidence to Practice and Education (25%)
   B. Subgroups:
      1. Retrieve relevant information that addresses pharmacotherapy-related inquiries.
      2. Evaluate pharmacotherapy-related literature and health information.
      3. Disseminate pharmacotherapy-related information to educate health care professionals, patients, and caregivers.

II. VARIOUS CONCEPTS IN STUDY DESIGN
   A. Research Design Classification
      1. Study purpose: Descriptive versus analytical
      2. Time orientation: Prospective versus retrospective design
         a. Prospective: Begin in the present and progress forward, collecting data from subjects whose outcomes lie in the future.
         b. Retrospective: Begin and end in the present; however, this design involves a major backward look to collect information about events that occurred in the past.
      3. Investigator orientation:
         a. Experimental trials
         b. Quasi-experimental trials
         c. Observational trials
   B. Relative Strength of Evidence: Hierarchy of Study Designs: This hierarchy holds assuming that all of the study designs are performed using the best possible techniques (e.g., a poorly conducted RCT is not necessarily higher on the hierarchy than a well-done cohort study).
      Figure 1. Hierarchy of clinical study design, from highest to lowest: Systematic Reviews and Meta-Analysis; RCT; Cohort; Case-Control; Cross-Sectional; Case Series; Case Reports; Ideas, opinions, and reviews. RCT = randomized controlled clinical trial.
   C. Validity in Study Design
      1. Internal validity
         a. The degree to which the outcome (either efficacy or safety) can be explained by differences in assigned groups
         b. Often related to the study methods: proper design, conduct, and analysis of the study in order to minimize systematic bias
         c. Factors that may affect internal validity:
            i. Poor study design
            ii. Inadequate randomization or allocation of subjects
            iii. Lack of or inappropriate blinding of participants, personnel, and outcome assessors
            iv. Use of imprecise or inaccurate measurements or inappropriate statistical methods
            v. Incomplete outcome data or selective outcome reporting
         d. Limitations are magnified with use of nonrandomized/observational study designs.
      2. External validity
         a. The degree to which the findings can be generalized or extrapolated to a population beyond a study
         b. Lack of consideration for external validity is a common criticism of clinical research (regardless of the design) and may explain the underuse of clinical trial results in routine clinical practice.
         c. Factors that may affect external validity:
            i. Setting of the trial (health care system, country, selection of participating centers/clinicians)
            ii. Selection of patients (eligibility and exclusion criteria, placebo/treatment run-in period)
            iii. Study patient characteristics (baseline clinical characteristics, racial/ethnic/sex group breakdowns, uniformity of underlying pathology, comorbidities, severity of disease)
            iv. Differences between trial protocol and routine practice (intervention timing, appropriateness of control, background therapy and standardization, frequency of monitoring)
            v. Outcome measures and follow-up (relevance and acceptance of surrogate outcomes, reproducibility of findings, use of patient-centered outcomes, frequency/adequacy of follow-up)
            vi. Adverse effects of treatment (impact of run-in period, discontinuation rates, completeness of adverse drug event reporting, intensity of safety procedures)
         d. Limitations could be minimized with use of pragmatic design (although internal validity is reduced).
   D.
      Random Error in Study Design: A difference between the observed and expected results of a study that is caused by chance, or a nonsystematic error. These errors impact the reliability of the results and can be controlled, but not eliminated. This error diminishes as the study sample gets larger.
   E. Bias in Study Design
      1. Definition: A difference between the observed and expected results of a study that is caused by systematic, nonrandom error. Bias can be caused by variation in study methodology and conduct, ultimately introducing error in outcome interpretation. Bias can be broadly categorized as selection bias, measurement bias, and confounding. The most important design techniques used to reduce bias while designing a study are blinding and randomization.
      2. Bias can be controlled or accounted for with appropriate planning in the design, collection, and analysis of data. Bias may have a different impact depending on how the collected data are interpreted.
      3. Examples of bias
         a. Selection bias: An error in the selection or sampling of individuals for a clinical study. This systematic selection leads to an advantage for one group over the other. Selection bias may impact case-control studies more than cohort studies.
         b. Observational or information bias: Incorrect determination of outcomes or exposures. Examples include an error in the recording of individual factors of a study, such as inaccurate recording of a patient’s risk factor or inaccurate recording of the timing of a blood sample.
         c. Recall bias: Systematic error caused by differences in the accuracy or completeness of the recollections of study participants regarding past events or experiences. For example, cases are more likely to remember exposures than controls.
         d. Interviewer bias: Interviews are not conducted in a uniform manner for all study participants.
         e. Misclassification bias: Subject is categorized into an incorrect group or category, potentially altering the association or research outcome.
            i. Differential bias: Bias that occurs when information errors differ between groups being studied (e.g., in a cohort study, between those with the disease and those without). This is a nonrandom error.
            ii. Non-differential bias: Bias that occurs when the results collected are incorrect but affect both groups the same. This is a systematic error.
         f. Publication bias: Studies that report positive findings are more likely to be published than studies that report negative findings.
   F. Confounding in Study Design
      1. A nonrandomized variable that affects the independent or dependent variable, altering the ability to determine the true effect on the measured outcome. These factors may hide or exaggerate a true association.
      2. To minimize the potential for missing a confounding variable, all relevant information should be collected and evaluated. Confounding can be addressed during the design by randomizing or matching subjects, or in the analysis by stratification, by propensity score matching, or by use of multivariable analysis techniques.
   G. Factors Important in Assessing Causality:
      1. Temporality: Cause before effect
      2. Strength: Plausibility increases with strength of relationship
      3. Biological gradient: Dose-response
      4. Consistency: Observations over several settings
      5. Specificity: Single cause for effect
      6. Plausibility: Biologically plausible
      7. Coherence: Consistency with existing knowledge
      8. Analogy: Preclinical expectation applied to clinical testing
      9. Experimental design

III. CASE REPORTS/CASE SERIES
   A. Document and Describe Experiences, Novel Treatments, and Unusual Events. Allows hypothesis generation that can be tested with other study designs. Note that the title does not state “study.”
      1. Possible adverse drug reactions in one or more patients
         a. QT-interval prolongation associated with fluoroquinolone antibiotics
      2. Case report: One patient
      3. Case series: More than one patient with a similar experience or many case reports combined into a descriptive review
      4. Reports should provide sufficient detail to allow readers to recognize same/similar cases at their center/practice.
      5. The Case Report (CARE) guidelines describe what information should be included in a case report.
   B. Advantages and Disadvantages
      1. Advantages: Hypotheses are formed, which may be the first step in describing an important clinical problem. Easy to perform and inexpensive
      2. Disadvantages: Does not provide explanation other than conjecture and does not establish causality or association

IV. OBSERVATIONAL STUDY DESIGNS
   A. Design Does Not Involve Investigator Intervention, Only Observation. It is essential to remember that observational study designs investigate associations; they cannot prove causation.
   B. Case-Control Study: Study Exposure in Those with and without the Outcome of Interest
      Figure 2. Case-control study design. Beginning in the present, subjects with the condition of interest (cases) and subjects without it (controls) are classified by whether the risk factor was present or absent in the past, and the groups are compared.
      1. Determine the association between exposures/risk factors and disease/condition. Classic example: Aspirin use and Reye syndrome
      2. Retrospective studies
      3. Practical method to study exposures in rare diseases or diseases that take long periods to develop
      4. Critical assumptions to minimize bias
         a. Cases are selected to be representative of those who have the disease. Randomly select cases when possible.
         b. Controls are representative of the general population that does not have the disease and are as identical as possible to the cases, minus the presence of the disease.
         c. Information is collected from cases and controls in the same way.
      5. Examples
         a. Risk of myocardial infarction associated with antihypertensive drug therapies (JAMA 1995;274:620-5)
            i. Purpose of the study: To estimate the association between myocardial infarction (MI) and use of antihypertensives
            ii. Disease: Myocardial infarction. Exposure: Antihypertensives
            iii. Cases (n=623): Hypertensive patients who sustained a first fatal or nonfatal MI. Controls (n=2032): Patients with hypertension matched for age, sex, and calendar year. All 623 cases and 2032 controls had pharmacologically treated hypertension.
            iv. Data collection: Review of medical record; telephone interview of consenting survivors. Exposure (antihypertensive therapy) was assessed using the computerized pharmacy database.
         b. Phenylpropanolamine (PPA) and the risk of hemorrhagic stroke (N Engl J Med 2000;343:1826-32). Purpose of study:
            i. To estimate in women the association between hemorrhagic stroke and the use of appetite suppressants containing PPA
            ii. To estimate the association between any use of PPA (in appetite suppressant or cough or cold remedy) and hemorrhagic stroke
            iii. To estimate in men and women the association between hemorrhagic stroke and the type of exposure to PPA
            iv. Disease: Hemorrhagic stroke (several types). Exposure: PPA
            v. Cases: Symptomatic subarachnoid or intracerebral hemorrhage (n=702). Controls: Matched by sex, race, and age (n=1376)
            vi. Exposure assessed by structured questionnaire, product photographs, and ingredient confirmation
      6. Advantages
         a. Inexpensive and can be conducted quickly
         b. Allows investigation of several possible exposures or associations, particularly when risk factors are unknown.
      7. Disadvantages
         a. Confounding must be controlled.
         b. Observational and recall bias: Looking back to recall exposures and their possible levels of exposure
         c. Selection bias: Case selection and control matching are difficult.
      8. Measure of association: OR (odds ratio): The OR is interpreted as the odds of exposure to a factor in those with a condition or disease compared with those who do not have the condition or disease. Interpretation of these concepts will be presented below.
   C. Cohort Study
      1. Determines the association between exposures/factors and disease/condition development. Allows an estimation of the risk of outcome (and the RR between the exposure groups). Study outcome of interest in those with and without the exposure of interest. Classic examples follow:
         a. Framingham Study. A prospective “cohort” of subjects from Framingham, Massachusetts, was (and is) studied over time to evaluate the relationship between a variety of conditions (exposures) and the development of cardiovascular disease (summarized in Int J Epidemiol 2015;44:1800-13).
         b. Nurses’ Health Study: Prospective study that investigated the potential long-term consequences of the use of oral contraceptives (summarized in BMJ 2014;349:g6356)
         c. Thimerosal DTP (diphtheria-tetanus-whole-cell pertussis) vaccine study: Retrospective cohort study that investigated the impact of thimerosal on developmental neurologic disorders (Pediatrics 2004;114:584-91)
      2. Describes the incidence or natural history of a disease/condition and measures it in time sequence
      3. Retrospective (historical): Begins and ends in the present but involves a major backward look to collect information about events that occurred in the past
         Figure 3. “Retrospective” (historical) cohort study design. Starting from the present and looking back to the past, subjects in the population who already have the outcome are excluded; the study sample is classified into subjects with and without the risk factor, and outcomes (present or absent) are measured and compared.
         a. Advantages: Less expensive and time-consuming; no loss to follow-up, ability to investigate issues not amenable to a clinical trial or ethical or safety issues
         b. Disadvantages: Only as good as the data available, little control of confounding variables through nonstatistical approaches, recall bias
      4. Prospective or longitudinal: Begin in the present and progress forward, collecting data from subjects whose outcomes lie in the future
         Figure 4. Prospective cohort study design. Starting in the present and following into the future, subjects in the population who already have the outcome are excluded; the study sample is classified into subjects with and without the risk factor, and outcomes (present or absent) are measured and compared.
         a. Example: Prospective, observational study: Postmenopausal hormone use and secondary prevention of coronary events in the Nurses’ Health Study (Ann Intern Med 2001;135:1-8)
         b. Advantages: Can control for confounding factors to a greater extent, easier to plan for data collection
         c. Disadvantages: More expensive and time-intensive, loss of subject follow-up, difficult to study rare diseases/conditions at a reasonable cost
      5. Measure of association: RR: The risk of an event or development of a condition relative to exposure; the risk of someone developing a condition when exposed compared with someone who has not been exposed
   D. Cross-sectional (a.k.a. prevalence study)
      1. Identify the prevalence or characteristics of a condition in a group of individuals.
      2. Examples
         a. Population-based, cross-sectional study: Prevalence of serious eye disease and visual impairment in a north London population (BMJ 1998;316:1643-7)
         b. Cross-sectional analysis of data from a large cohort study: Maternal characteristics and migraine pharmacotherapy during pregnancy (Cephalalgia 2009;29:1267-76)
      3. Advantages: Easy design, “snapshot in time,” all data collected at one time, studies are accomplished by questionnaire, interview, or other available biomedical information (e.g., laboratory values)
      4. Disadvantages: Does not allow the study of a factor (or factors) in individual subjects over time, just at the time of assessment; difficult to study rare conditions

V. INCIDENCE, PREVALENCE, RELATIVE RISK/RISK RATIOS, ODDS RATIOS, AND HAZARD RATIOS
   A. Incidence
      1. Measure of the probability of developing an event or condition
      2. Incidence rate: Number of new cases of an event or condition per population in a specified time
      3. Calculated by dividing the number of individuals who develop an event or condition during a given period by the number of individuals who were at risk of developing an event or condition during the same period
   B. Prevalence
      1. Measure of the number of individuals who have an event or condition at any given time
      2. Point prevalence: Prevalence on a given date
      3. Period prevalence: Prevalence in a period (e.g., year, month)
   C. Interpreting Risk Ratios (RR, OR, HR)
      1. Estimate the magnitude of association between exposure and an event or condition. Key point: For observational studies, this is not cause and effect; it is an association.
      2. The RR and OR differ in how they are interpreted and calculated.
         a. The RR is the incidence of an event in the exposed group divided by the incidence of the event in the unexposed (control) group.
         b. The OR is the odds of an event occurring in the exposed group divided by the odds of the event occurring in the control group.
      3. The RR cannot be directly calculated for most case-control studies because the baseline risk is not available; instead, the OR is used. The RR can, however, be calculated for prospective studies, including cohort studies, because the baseline risk can be estimated.
      4. The HR differs from the OR and RR in that it can estimate risk at any given point of time within a time period, whereas OR and RR are estimates of the risk over a period of time.
      5. When interpreting ratios, the point of unity (no difference) is 1. The RR, OR, and HR are, therefore, interpreted based on their difference from unity (see Table 1). If the 95% confidence interval includes 1, then there is no statistical difference between groups.
      6. Interpretation of the index of risk, including RR, OR, and HR
         a. Direction of risk
            Table 1. Direction of Risk Associated with OR, RR, and HR
               RR/OR/HR < 1: Negative association
                  RR: Risk of disease is lower in the exposed group
                  OR: Odds of exposure/having an outcome is lower in the diseased group
                  HR: Event rates are lower in the experimental group than control group (experimental treatment is better than control treatment)
               RR/OR/HR = 1: No association
                  RR: Risk of disease in the two groups is the same
                  OR: Odds of exposure/having an outcome in the two groups is the same
                  HR: Event rates are the same in both groups
               RR/OR/HR > 1: Positive association
                  RR: Risk of disease is greater in the exposed group
                  OR: Odds of exposure/having an outcome is greater in the diseased group
                  HR: Event rates are higher in the experimental group than control group (experimental treatment is worse than control treatment)
               HR = hazard ratio; OR = odds ratio; RR = relative risk/risk ratio.
         b. Magnitude of risk
            Table 2. Magnitude of Risk Associated with OR and RR
               RR/OR    Interpretation
               0.75     RR: 25% reduction in the risk; OR: Odds are 0.75/1
               1.0      No difference in risk/odds
               1.5      RR: 50% increase in the risk; OR: Odds are 1.5/1 higher
               3.0      RR: 200% increase in the risk (3-fold risk); OR: Odds are 3/1 higher
      7. Calculating RR/OR/contingency tables
            Table 3. Contingency Table for Estimating RR and OR
                              Event or Condition?
               Exposure?      Yes      No
               Yes            A        B
               No             C        D
         a. RR = [A/(A+B)] / [C/(C+D)]
         b. OR = (A/C)/(B/D) = (A x D)/(B x C)
      8. Example (PPA study): N Engl J Med 2000;343:1826-32
            Table 4. Contingency Table
                                             Event? Hemorrhagic Stroke in Women
               Exposure?                     Yes      No
               Appetite suppressant use
                  Yes                        6        1
                  No                         377      749
               From: Kernan WN, Viscoli CM, Brass LM, et al. Phenylpropanolamine (PPA) and the risk of hemorrhagic stroke. N Engl J Med 2000;343:1826-32.
         a. OR = (6/377)/(1/749) = 12
         b. Data from the PPA study above related to appetite suppressant and development of hemorrhagic stroke
            Table 5. Use of PPA and Appetite Suppressants and the Risk of Developing Hemorrhagic Stroke
               Cases = + hemorrhagic stroke (n=383); Controls = − hemorrhagic stroke (n=750)
               Exposure                        Cases    Controls    Adjusted OR (95% CI)
               Appetite suppressant: Women     6        1           16.6 (1.51–182)
               Appetite suppressant: Men       0        0           --
               Appetite suppressant: Either    6        1           15.9 (1.38–184)
               PPA: Women                      21       20          1.98 (1.00–3.90)
               PPA: Men                        6        13          0.62 (0.20–1.92)
               PPA: Either                     27       33          1.49 (0.84–2.64)
               CI = confidence interval; OR = odds ratio; PPA = phenylpropanolamine.
         c. What do these numbers mean?
         d. Can you interpret the point estimate and 95% CI in all cases?
            i. What does the point estimate mean?
            ii. What does the CI mean?
            iii. Which ones are statistically significant?
   D. Causation
      1. REMEMBER: We do not prove or show causality with observational studies, but there is some general “guidance” to consider when evaluating them.
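As a check on the calculations in items C.7 and C.8 above, the following is a minimal Python sketch that applies the Table 3 formulas to the Table 4 counts. The class name `TwoByTwo` and the helper `excludes_unity` are illustrative names chosen here, not part of the source material. The odds ratio computed is the crude (unadjusted) value, so it will not match the adjusted ORs in Table 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """2x2 contingency table laid out as in Table 3 (hypothetical helper).

    a: exposed, event        b: exposed, no event
    c: unexposed, event      d: unexposed, no event
    """
    a: int
    b: int
    c: int
    d: int

    def relative_risk(self) -> float:
        # RR = [A/(A+B)] / [C/(C+D)]; meaningful only when incidence
        # can be estimated (e.g., a cohort study).
        return (self.a / (self.a + self.b)) / (self.c / (self.c + self.d))

    def odds_ratio(self) -> float:
        # OR = (A/C)/(B/D) = (A x D)/(B x C)
        return (self.a * self.d) / (self.b * self.c)


def excludes_unity(ci_low: float, ci_high: float) -> bool:
    """True when a 95% CI for a ratio does not include 1."""
    return ci_low > 1.0 or ci_high < 1.0


# Counts from Table 4 (appetite suppressant use in women, PPA study).
ppa_women = TwoByTwo(a=6, b=1, c=377, d=749)
print(f"Crude OR: {ppa_women.odds_ratio():.1f}")

# Adjusted OR confidence limits copied from Table 5.
table5_ci = {
    "Appetite suppressant: Women": (1.51, 182),
    "Appetite suppressant: Either": (1.38, 184),
    "PPA: Women": (1.00, 3.90),
    "PPA: Men": (0.20, 1.92),
    "PPA: Either": (0.84, 2.64),
}
for label, (low, high) in table5_ci.items():
    print(label, "CI excludes 1:", excludes_unity(low, high))
```

The crude OR rounds to the value of about 12 given in item C.8.a. The `relative_risk` method is included only to mirror the RR formula; as noted in item C.3, it should not be applied to case-control data such as the PPA study.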
         It is important to recognize that, in certain situations, the conduct of studies to establish causality is not possible, practical, or ethical (e.g., exposure to a drug leading to a birth defect or exposure to an environmental factor leading to the development of cancer).
      2. Questions used to evaluate causality
         a. Was statistical significance observed?
         b. What was the strength of the association, as measured by the OR, RR, or HR?
         c. Were dose-response relationships evaluated?
         d. Was there a temporal relationship between exposure and disease/outcome?
         e. Have the results been consistently shown?
         f. Is there biologic plausibility to the association?
         g. Is there any experimental (e.g., animal, in vitro) evidence?

VI. RANDOMIZED CONTROLLED TRIAL DESIGN
   A. Characteristics
      1. Experimental or interventional: the investigator makes an intervention and evaluates cause and effect. Examine etiology, cause, efficacy, using comparative groups.
      2. Some previous background information or studies should exist to suggest that the intervention used will likely be beneficial.
      3. Design allows assessment of causality.
         a. Sufficient cause
         b. Necessary cause
         c. Risk factor
      4. Minimizes bias through randomization and/or stratification
         a. Randomization
         b. Block randomization
         c. Stratification
         d. Cluster randomization
      5. Treatment controls
         a. Placebo controlled
         b. Active controlled
         c. Historical control
      6. Blinding methods
         a. Single-blind: Either subjects or investigators are unaware of subject assignment to active/control.
         b. Double-blind: Both subjects and investigators are unaware of subject assignment to active/control.
         c. Triple-blind: Both subjects and investigators are unaware of subject assignment to active/control; in addition, the group analyzing the data is unaware.
         d. Double-dummy: Method used to match active and control therapies when there are differences in delivery.
            i. Example: One group receives treatment twice-a-day and the other receives treatment once-a-day. The once-a-day group will receive a placebo for the second dose.
            ii. Example: One group receives an oral medication and the other group receives an intravenous medication. The oral group will receive an intravenous placebo and the intravenous group will receive an oral placebo.
         e. Open-label (Nonblind): Everyone is aware of subject assignment to active/control.
      7. May use parallel or crossover design (see additional information below)
         a. Crossover can provide practical and statistical efficiency compared with a parallel design because each subject serves as his or her own control.
         b. Crossover is not appropriate for certain types of treatment questions (e.g., effect of treatment on a disease that worsens quickly over time or worsens during the study period).
      8. Factorial design: Designed to answer two separate research questions in a single group of subjects
      9. Examples
         a. Clinical trial: Comparison of two drugs, comparison of two behavioral modifications, etc.
         b. Educational intervention: Online course versus lecture class format
         c. Health care intervention: Pharmacist-based health care team versus non–pharmacist-based health care team
   B. Randomized Controlled Trial: Parallel Design
      Figure 5. Randomized controlled trial: parallel design. Beginning in the present, the study sample drawn from the population is randomized and allocated to Intervention #1 or Intervention #2 (or placebo, etc.); outcomes (outcome or no outcome) are measured in the future and compared.
   C. Randomized Controlled Trial: Crossover Design
      Figure 6. Randomized controlled trial: crossover design.
   D. RCT: Factorial Design
   E. Examples of Considerations for Controlled Trials
      1. Are the results of the study valid (methods)?
         a. Did the subjects undergo randomization, and what was the randomization technique? Did the randomization process result in equal baseline characteristics?
         b. Were all subjects who entered the trial accounted for? Was follow-up complete? If not, how many were lost to follow-up, from which groups did they leave, and why?
         c. Were subjects analyzed in the groups to which they were randomized? Was intention-to-treat, per-protocol, or actual treatment analysis used?
         d. How was blinding conducted (e.g., subject, investigator), if applicable?
         e. Were the inclusion and exclusion criteria appropriate, or were they too restrictive or inclusive? Were the groups similar at the start of the trial?
         f. Was the sample size sufficient, and was a power calculation included?
         g. Were the groups handled the same way, aside from the intervention(s)?
         h. Were the statistical tests appropriate and understandable?
         i. What was assessed: Surrogate markers or true outcomes? Were a priori subgroup analyses performed?
      2. What were the results?
         a. How large was the treatment effect?
         b. How precise was the estimate of the effect (e.g., based on the CIs)?
         c. Did the authors properly interpret the results?
      3. Can I apply the results of this study to my patient population? Will they help me care for my patients?
         a. Can the results of this study be applied to general practice?
         b. Was a representative population studied? Can I apply this to my setting?
         c. Do the patients I care for fulfill the enrollment criteria for this study?
         d. Do the patients I care for fulfill the subgroup criteria evaluated?
         e. Do the expected benefits outweigh the expected and/or unanticipated risks?

VII. OTHER ISSUES TO CONSIDER IN CONTROLLED TRIALS
   A.
Subgroup Analysis 1. Important part of controlled clinical trials (if set a priori) 2. Many times, they are overused and overinterpreted, leading to unnecessary research, misinterpretation of results, and/or suboptimal patient care. 3. Many potential pitfalls in identifying and interpreting a. Failure to consider several comparisons or to adjust p-values b. Problems with sample size (power), classification, and lack of assessment of interaction B. Composite End Points: Often, the impression is that this practice is not a good practice. 1. The primary end point is one of the most important decisions to make in the design of a clinical study. 2. A composite end point combines several end points. a. For example, in cardiovascular trials, major adverse cardiovascular events are commonly used and include: Cardiovascular death, nonfatal MI, and nonfatal stroke and may also include target vessel revascularization or hospitalization b. Usually combines measures of morbidity and mortality c. What does the following statement mean? Our findings show that ramipril reduces the rates of death, MI, stroke, revascularization, cardiac arrest, heart failure, complications related to diabetes, and new cases of diabetes in a broad spectrum of high-risk patients. Treating 1000 patients with ramipril for 4 years prevents about 150 events in around 70 patients. ACCP Updates in Therapeutics® 2023: Pharmacotherapy Preparatory Review and Recertification Course 2-504 Study Designs: Fundamentals and Interpretation i. Was there a reduction in all the end points or just in some? ii. Are all the outcomes just as likely to occur? iii. Why would the investigators of this trial have been interested in all of these outcomes? 3. What are the benefits of using composite end points? a. No single primary outcome b. Alleviate problems of multiple testing c. Increase number of events, which decreases sample size and cost to the investigator 4. What are the limitations? a. 
Difficulties in interpreting composite end points; consider our earlier example b. Misattribution of statistically beneficial effects of composite measure to each of its component end points c. Dilution of effects, negative results for relatively common component of composite end point “hide” real differences in other end points. Undue influence exerted on composite end point by “softer” component end points d. “Averaging” of overall effect: Problems when component end points move in opposite directions; a sign the composite end point should be abandoned without valid conclusions being drawn e. Should all end points be weighed the same, or should death “weigh” more? 5. The results for each individual end point should be reported together with the results for the composite. C. Surrogate End Points 1. Parameters thought to be associated with clinical outcomes a. Blood pressure for stroke prevention b. LDL-C reduction for cardiovascular death reduction i. Statins: Yes ii. Hormone replacement therapy: No c. Premature ventricular contraction suppression and reduced mortality 2. Surrogate outcomes do not always predict clinical outcomes. 3. Short-duration studies that evaluate surrogate end points may not be large enough to detect uncommon adverse events. D. Superiority Versus Equivalence Versus Noninferiority 1. A superiority trial is designed to detect a difference between experimental treatments. This is the typical design in a clinical trial. 2. An equivalence trial is designed to confirm the absence of a meaningful difference(s) between treatments, neither better nor worse (both directions). The key is the definition of the specified margin. What difference is important? One example is a bioequivalence trial. 3. A noninferiority trial is designed to investigate whether a treatment is not clinically worse (not less effective than stated margin, [noninferiority margin] or inferior) than an existing treatment. a. 
It may be the most effective, or it may have a similar effect b. Useful when placebo administration is not possible for ethical reasons c. ONTARGET (The Ongoing Telmisartan Alone and in Combination with Ramipril Global Endpoint Trial): N Engl J Med 2008;358:1547-59. i. Designed to evaluate telmisartan, ramipril, or their combination in patients with a high risk of vascular disease ii. Objective was to determine whether telmisartan was noninferior to ramipril in the incidence of cardiovascular deaths iii. Noninferior margin was defined as 13% or less d. Essentials of noninferiority design i. Control group must be effective ACCP Updates in Therapeutics® 2023: Pharmacotherapy Preparatory Review and Recertification Course 2-505 Study Designs: Fundamentals and Interpretation ii. Current study similar to previous study with control and with equal doses, clinical conditions, and design used iii. Adequate power is essential, and usually, larger sample sizes are required. iv. Need to have a clinically defined noninferiority margin a priori v. May perform both an intention-to-treat and a per-protocol analysis E. Explanatory vs. Pragmatic Clinical Trials (N Engl J Med 2016;375:454-63) 1. Explanatory trials: Test a physiologic or clinical hypothesis 2. Pragmatic trials: Inform a clinical or policy decision using real-world populations a. Designed to show the real-world effectiveness of interventions in broad patient populations outside the rigors of a traditional RCT b. Study participants should be similar to those who would receive the intervention if it were to become standard of care. c. Investigators involved with routine clinical practice are preferred to experienced trialists, particularly those who work in a group practice setting. d. Interventions are commonly not masked. e. End points focus on those that are important to patients (e.g., hospitalizations, quality of life, symptoms) or that have been underrepresented in RCTs. f. 
Disadvantage: Generally have lower internal validity VIII. CONTROLLED CLINICAL TRIALS: ANALYSIS A. Questions to Consider in Evaluating and Interpreting a Clinical Trial 1. Study design a. Was the studied sample representative of the population or the individual to whom the results were being applied? b. Were the inclusion/exclusion criteria appropriate, or were they overly restrictive or inclusive? c. Were the sample size and power sufficient? Was a power analysis included? d. Was a study objective and/or hypothesis provided? e. Was the study blinded, and if so, who was blinded (subject, investigator, study personnel, or all)? f. Was a run-in phase used? If so, why? Did it affect the interpretation of the trial? g. What type of randomization method was performed? Did the randomization process produce similar baseline characteristics across all groups? 2. Outcomes/assessments a. Were the primary and/or secondary outcomes identified, were they reasonable, and did they apply to clinical practice? b. Was a composite outcome used, and were all the individual components identified and clearly stated in the methods and results? c. Were surrogate markers used instead of (or in addition to) clinically relevant outcomes? 3. Analysis a. What analysis technique was used: Intention-to-treat, as-treated, or per-protocol? b. Were the statistical tests appropriate? 4. Interpretation: Was the author’s interpretation appropriate and within the confines of the study design? 5. Extrapolation a. Are you applying the results to similar patients in a similar setting? b. Are there possible additional adverse effects that were not measured in this study? IX. COMMON APPROACHES TO ANALYZING CLINICAL TRIALS A. Intention-to-Treat Analysis 1.
Compares outcomes on the basis of initial group assignment or “as randomized.” The allocation to groups was how they were “intended to be treated,” even though they may not have taken the medication for the duration of the study, may have dropped out, or may not have complied with the protocol. 2. Determines effect of treatment under usual conditions of use. Analogous to routine clinical practice in which a patient receives a prescription but may not adhere to the prescribed drug regimen. 3. Gives a conservative estimate of differences in treatments; may underestimate treatment benefits 4. Most common approach to assessing clinical trial results 5. This is the preferred type of analysis in a superiority trial. B. Per-Protocol Analysis 1. Subjects who do not adhere to allocated treatment are not included in the final analysis; only those who completed the trial and adhered to the protocol (based on some predetermined definition [e.g., 80% adherence]) are analyzed. 2. Provides additional information about treatment efficacy and provides more generous estimates of differences between treatments 3. Subject to several issues because of factors such as lower sample size and definitions of adherence. Results are more difficult to interpret and would be validly applied only to adherent patients like those in the trial; not necessarily generalizable to all patients 4. Preferred type of analysis in noninferiority trials, although intention-to-treat analysis is often also used C. As-Treated Analysis 1. Subjects are analyzed by the actual intervention received. If subjects were in the active treatment group but did not take active treatment, the data would be analyzed as if they were in the placebo group. 2. This analysis essentially ignores/destroys the randomization process for those who did not adhere to the study protocol. Results should be interpreted with caution. X. REVIEWS A. Narrative Review (nonsystematic review) 1. Summarizes results of several studies without systematic methods.
The review is qualitative and subjective in nature. 2. These methods would be typical of a standard literature review. B. Qualitative Systematic Review 1. Summary that uses explicit methods to perform a comprehensive literature search, critically appraise it, and synthesize the world literature on a specific topic. This is a summary that uses systematic methods to objectively review a topic, but in a qualitative way. 2. Differs from a standard literature review: The study results are more comprehensively synthesized and reviewed. Studies included in the review are based on specific inclusion/exclusion criteria, so not all studies on a given clinical question are included. 3. As with a controlled clinical trial (or other studies), the key is a well-documented and well-described systematic review. C. Quantitative Systematic Review (meta-analysis) 1. Dramatic increase in the number of these types of papers 2. First meta-analysis probably published in 1904: Assessment of typhoid vaccine effectiveness 3. Systematic review that uses mathematical/statistical techniques to summarize the results of the evaluated studies 4. These techniques may improve on the following: a. Calculation of effect size b. Increase in statistical power c. Interpretation of disparate results d. Reduction in bias e. Answers to questions that may not be addressable with individual studies 5. Reliant on criteria for inclusion of previous studies and statistical methods to ensure validity. Details of included studies are essential. 6. Elements of trial methodology a. Research question b. Identification of available studies c. Criteria for trial inclusion/exclusion d. Data collection and presentation of findings e. Calculation of summary estimate: Ideally with forest plot (Figure 7) f. Assessment of heterogeneity i. Statistical heterogeneity ii.
Chi², Cochran Q, and I² are common tests for heterogeneity. g. Assessment of publication bias: Funnel plot h. Sensitivity analysis
[Figure 7: forest plot of risk ratios (M-H, random effects, 95% CI) for Lactobacillus versus placebo, with subgroups 1.3.1 Age > 18yo and 1.3.2 Age < 18yo; the horizontal axis runs from "Favours Lactobacillus" to "Favours Placebo".]
Figure 7. Forest plots. CI = confidence interval; yo = years old. From: Kale-Pradhan PB, Jassal HK, Wilhelm SM. Role of Lactobacillus in the prevention of antibiotic-associated diarrhea: a meta-analysis. Pharmacotherapy 2010;30:119-26.
XI. SUMMARY MEASURES OF EFFECT A. Absolute Risk Measures 1.
Absolute risk (AR): Chance of an outcome occurring a. Table 6 example: AR(ramipril) = 14.0% or 0.140 b. Table 6 example: AR(placebo) = 17.8% or 0.178 2. Absolute risk difference (ARD): Difference in the absolute risk in exposed vs. unexposed a. The ARD can be an absolute risk reduction (ARR) or absolute risk increase (ARI) depending on the outcome. b. The ARD is the absolute difference between the incidence in the exposed group (experimental event rate [EER]) and the incidence in the unexposed group (control event rate [CER]). c. Mathematically, the ARD = |EER – CER| d. Table 6 example: |EER (ramipril) – CER (placebo)| e. Table 6 example: ARR = |0.140 – 0.178| = 0.038 3. Number needed to treat (NNT) a. Another means to characterize changes or differences in absolute risk b. Definition: The reciprocal of the absolute risk reduction (ARR). It is the number of patients that need to be treated for a given period of time to see one patient have a positive outcome or prevent a negative outcome in one patient. c. NNT = 1/ARR (Table 6 example: 1/0.038 = 26.3, rounded up to 27) d. Rounding up to the next-highest whole number is the most conservative approach. e. Applied to clinical outcomes with dichotomous data (e.g., yes/no, alive/dead, MI/no MI) f. Assumes the baseline risk is the same for all patients (or that it is unrelated to RR) g. Extrapolation beyond studied time points should not occur. h. NNTs should be provided only for statistically significant effects. i. Number needed to harm depends on the outcome being addressed. B. Relative Risk Measures 1. RR: Discussed earlier. Ratio of the risk of the outcome in the exposed group to the risk of the outcome in the unexposed group 2. Relative risk difference (RRD) a. A relative risk reduction (RRR) is calculated when the RR is less than 1 and a relative risk increase (RRI) is calculated when the RR is greater than 1. b. RRR can be calculated based on the ARR or the RR i. RRR = ARR/CER ii. RRR = 1 – RR c. RRI can be calculated based on the ARI or the RR i. RRI = ARI/CER ii.
RRI = RR – 1 d. Example using Table 6 i. RRR = ARR/CER ii. RRR = 0.038/0.178 = 0.21 (or 21%) C. Absolute differences are more important than relative differences, though the authors of many clinical studies highlight the differences observed in trials with relative differences because they are numerically larger. Why? Larger numbers are more convincing to practitioners and patients. Most drug advertisements (both directly to patients and to health care professionals) quote relative differences. D. NNT application 1. HOPE (Heart Outcomes Prevention Evaluation) study (N Engl J Med 2000;342:145-53) 2. Study evaluated the effect of ramipril on cardiovascular events in high-risk patients. 3. Prospective randomized double-blind study a. 9297 high-risk patients received ramipril or matching placebo once daily for an average follow-up of 5 years. b. Primary outcome: Composite of MI, stroke, or death from cardiovascular causes 4. Results shown in Table 6.
Table 6. Risk of Primary and Secondary End Points by Treatment Group
Outcome                 Absolute Risk (%) Ramipril   Absolute Risk (%) Placebo   Relative Risk   RRR    ARR     NNT
Combined                14.0                         17.8                        0.79            0.21   0.038   27
Death from CV causes    6.1                          8.1                         0.75            0.25   0.02    50
Myocardial infarction   9.9                          12.3                        0.80            0.20   0.024   42
Stroke                  3.4                          4.9                         0.69            0.31   0.015   67
ARR = absolute risk reduction; CV = cardiovascular; NNT = number needed to treat; RRR = relative risk reduction.
XII. REPORTING GUIDELINES FOR CLINICAL STUDIES A. The Consolidated Standards of Reporting Trials (CONSORT) 1. Initially published in 1996 and updated several times since, most recently in 2010 2. Created in an effort to improve, standardize, and increase the transparency of the reporting of clinical trials and to facilitate the improvement of literature evaluation 3. Available at www.consort-statement.org/ 4.
The CONSORT statement has been endorsed by several publications and published in these journals. 5. The CONSORT statement a. The checklist: 25-item checklist pertaining to the content of the following: i. Title ii. Abstract iii. Introduction iv. Methods v. Results vi. Discussion vii. Other information b. The flow diagram: Intended to depict the passage of study participants through the randomized controlled trial 6. Extensions of the CONSORT statement a. Design extensions i. Cluster trials ii. Noninferiority and equivalence trials iii. Pragmatic trials b. Intervention extensions i. Herbal medicinal interventions ii. Nonpharmacologic interventions iii. Acupuncture interventions c. Data extensions i. Patient-reported outcomes ii. Harms iii. Abstracts B. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement 1. Initially published in 2007 2. “An international, collaborative initiative of epidemiologists, methodologists, statisticians, researchers and journal editors involved in the conduct and dissemination of observational studies” 3. Available at www.strobe-statement.org 4. Endorsed by several publications and published in these journals 5. The STROBE checklist: 22-item checklist, same basic concepts as the CONSORT checklist, with alterations germane to observational studies C. Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) 1. Established in 1996 (as QUOROM), renamed in 2009 2. Evidence-based minimum set of items for reporting systematic reviews and meta-analyses 3. Available at www.prisma-statement.org/ a. The PRISMA checklist: 27-item checklist with alterations germane to systematic reviews and meta-analyses b. The PRISMA flow diagram: Four-stage diagram, depicting the flow of information through the systematic review c.
The PRISMA explanation and elaboration document: Intended to enhance the use and understanding of the PRISMA statement D. Enhancing the Quality and Transparency of Health Research (EQUATOR) Network 1. International initiative to improve the reliability and value of medical research literature by promoting transparent and accurate reporting of research studies 2. Does not have its own statements, but promotes the use of key reporting guidelines 3. Many other statements regarding study types not addressed in the discussion related to CONSORT, STROBE, and PRISMA are listed on the EQUATOR Network website (www.equator-network.org). XIII. PHARMACOECONOMIC STUDIES A. Cost-Minimization Analysis 1. Differences in cost among comparable therapies are evaluated. 2. Only useful to compare therapies that have similar outcomes B. Cost-Effectiveness Analysis 1. Outcome: Clinical units or cost per unit health outcome (outcome examples: years of life saved, number of symptom-free days, blood glucose, blood pressure, etc.) 2. Useful to measure the cost impact when health outcomes are improved C. Cost-Utility Analysis 1. Assigns utility weights to outcomes so the impact can be measured in relation to cost (outcome example: quality-adjusted life-years) 2. Compares outcomes related to mortality when mortality may not be the most important outcome D. Cost-Benefit Analysis 1. Monetary value is placed on both therapy costs and beneficial health outcomes. 2. Allows analysis of both the cost of treatment and the costs saved with beneficial outcomes XIV. SENSITIVITY/SPECIFICITY/PREDICTIVE VALUES A. Sensitivity: Proportion of True Positives That Are Correctly Identified by a Test; a test with a high sensitivity means that a negative test can rule OUT the disorder B.
Specificity: Proportion of True Negatives That Are Correctly Identified by a Test; a test with high specificity means that a positive test can rule IN the disorder C. Positive Predictive Value: Proportion of Patients with a Positive Test Result Who Actually HAVE the Disease D. Negative Predictive Value: Proportion of Patients with a Negative Test Result Who Actually DO NOT HAVE the Disease E. Example: Tables 7 and 8
Table 7. Relationship Between Test and Correct Diagnosis Identified by Disease
Test            Disease Present       Disease Absent        Total
Test positive   True positive (TP)    False positive (FP)   TP + FP
Test negative   False negative (FN)   True negative (TN)    TN + FN
Total           TP + FN               FP + TN               Total
Sensitivity = TP/(TP + FN)
Specificity = TN/(TN + FP)
Positive predictive value = TP/(TP + FP)
Negative predictive value = TN/(TN + FN)
Positive likelihood ratio = sensitivity/(1 − specificity)
Negative likelihood ratio = (1 − sensitivity)/specificity
Table 8. Relationship Between Test and Correct Diagnosis Identified by Disease in a Published Study
Test       Positive disease      Negative disease      Total
Positive   231 (true positive)   32 (false positive)   263
Negative   27 (false negative)   54 (true negative)    81
Total      258                   86                    344
From: Drum DE, Christacapoulos JS. Hepatic scintigraphy in clinical decision making. J Nucl Med 1972;13:908-15.
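As an illustration only, the Table 7 formulas can be applied to the Table 8 counts with a short Python sketch. The class name TwoByTwo, its method names, and the variable table8 are hypothetical names chosen for this example and do not come from the source; the counts are copied from Table 8.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a 2x2 diagnostic table (hypothetical helper for illustration)."""
    tp: int  # test positive, disease present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent

    def sensitivity(self) -> float:
        # TP / (TP + FN)
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # TN / (TN + FP)
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Positive predictive value: TP / (TP + FP)
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Negative predictive value: TN / (TN + FN)
        return self.tn / (self.tn + self.fn)

    def positive_lr(self) -> float:
        # sensitivity / (1 - specificity)
        return self.sensitivity() / (1 - self.specificity())

    def negative_lr(self) -> float:
        # (1 - sensitivity) / specificity
        return (1 - self.sensitivity()) / self.specificity()


# Counts taken from Table 8.
table8 = TwoByTwo(tp=231, fp=32, fn=27, tn=54)

for label, value in [
    ("Sensitivity", table8.sensitivity()),
    ("Specificity", table8.specificity()),
    ("Positive predictive value", table8.ppv()),
    ("Negative predictive value", table8.npv()),
    ("Positive likelihood ratio", table8.positive_lr()),
    ("Negative likelihood ratio", table8.negative_lr()),
]:
    print(f"{label}: {value:.3f}")
```

With the Table 8 counts, this gives a sensitivity of about 0.90 (231/258), a specificity of about 0.63 (54/86), a positive predictive value of about 0.88 (231/263), and a negative predictive value of about 0.67 (54/81). The predictive values depend on the disease prevalence in the studied sample (258 of 344 here), so they would likely differ in a population with a different prevalence.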