Gordis Epidemiology 7th Edition PDF
Summary
This book is a textbook on epidemiology written for undergraduate students. It details various elements and design issues related to randomized trials in the context of preventive and therapeutic medical interventions.
10 Assessing Preventive and Therapeutic Interventions: Randomized Trials All who drink of this treatment recover in a short time, Except those whom it does not help, who all die, It is obvious, therefore, that it fails only in incurable cases. —Galen1 (129–c. 199 ce) LEARNING OBJECTIVES To describe the important elements of randomized To introduce design issues related to randomized trials. trials, including stratified randomization, planned To define the purpose of randomization and of and unplanned crossovers, and factorial design. masking. To illustrate the problems posed by noncompliance in randomized trials. Some ways of quantifying the natural history of disease The question regards a matter of fact, that has to be and of expressing disease prognosis were discussed in determined by observation and not by authority; and Chapter 6. Our objective, both in clinical practice and in it is one that appears to be a very suitable topic for public health, is to modify the natural history of a dis- statistical inquiry.... Are prayers answered, or are they ease so as to prevent or delay death or disability and to not?... [D]o sick persons who pray, or are prayed for, improve the health of the patient or the population. The recover on the average more rapidly than others?2 challenge is to select the best available preventive or therapeutic interventions to achieve this goal. To do so, As with many pioneering ideas in science and medi- we need to carry out studies that determine whether cine, many years were to pass before this suggestion was these interventions work (efficacy or effectiveness) and actually implemented. In 1965, Joyce and Welldon re- are safe. The randomized trial is considered the ideal ported the results of a randomized double-blind trial of design for evaluating both the efficacy and the side the efficacy of prayer.3 The findings of this study did not effects of new forms of intervention. 
indicate that patients who were prayed for derived any The notion of using a rigorous methodology to as- special benefits from that prayer. However, a more re- sess the efficacy of new drugs, or of any new modalities cent study by Byrd4 evaluated the effectiveness of inter- of care, is not recent. In 1883, Sir Francis Galton, the cessory prayer in a coronary care unit population using British anthropologist, wrote as follows: a randomized double-blind protocol. The findings from this study suggested that prayer had a beneficial thera- It is asserted by some, that men possess the faculty of peutic effect. Which is correct? obtaining results over which they have little or no di- In this chapter and the one following, we discuss rect personal control, by means of devout and earnest study designs that can be used for evaluating ap- prayer, while others doubt the truth of this assertion. proaches to treatment and prevention and focus on the 200 CHAPTER 10 Assessing Preventive and Therapeutic Interventions: Randomized Trials 201 use of the randomized trial. Although the term ran- A planned trial was described by the Scottish sur- domized clinical trial is often used together with its ac- geon James Lind in 1747.6 Lind became interested in ronym, RCT, the randomized trial design also has ma- scurvy, which killed thousands of British seamen jor applicability to studies outside the clinical setting, each year. He was intrigued by the story of a sailor such as community-based trials. For this reason, we use who had developed scurvy and had been put ashore the term randomized trial. To facilitate our discussion, on an isolated island, where he subsisted on a diet of reference is generally made to treatments and drugs; grasses and then recovered from the scurvy. 
Lind the reader should bear in mind that the principles de- conducted an experiment, which he described as scribed apply equally to assess new programs for follows: screening and early detection, to compare different ap- I took 12 patients in the scurvy on board the Salis- proaches to prevention, or new ways of organizing and bury at sea. The cases were as similar as I could have delivering health services. them... they lay together in one place and had one Trials are essentially experiments that are under the diet common to them all. Two of these were ordered control of the investigator. Compare this with observa- a quart of cider per day.... Two others took 25 gutts tional studies reviewed in Chapter 7, where the investi- of elixir vitriol.... Two others took two spoonfuls of gator watches what unfolds but does not interfere. vinegar.... Two were put under a course of sea wa- Suggestions of many of the elements that are impor- ter.... Two others had two oranges and one lemon tant to randomized trials can be seen in many anecdotal given them each day.... Two others took the bigness descriptions of early trials. In a review of the history of of nutmeg. The most sudden and visible good effects clinical trials, Bull described an unintentional trial con- were perceived from the use of oranges and lemons, ducted by Ambroise Paré (1510–1590), a leading figure one of those who had taken them being at the end of in surgery during the Renaissance.5 Paré lived at a time 6 days fit for duty.... The other... was appointed when the standard treatment for war wounds was the nurse to the rest of the sick. application of boiling oil. In 1537 Paré was responsible for the treatment of the wounded after the capture of Interestingly, the idea of a dietary cause of scurvy the castle of Villaine. The wounded were so numerous proved unacceptable in Lind’s day. 
Only 47 years later that, he says: did the British Admiralty allow the experiment to be repeated—this time on an entire fleet of ships. The re- At length my oil lacked and I was constrained to sults were so dramatic that, in 1795, the Admiralty made apply in its place a digestive made of yolks of eggs, lemon juice a required part of the standard diet of Brit- oil of roses and turpentine. That night I could not ish seamen and later changed this to lime juice. Scurvy sleep at my ease, fearing that by lack of cauteriza- essentially disappeared from British sailors, who, even tion I would find the wounded upon which I had today, are referred to as “limeys.” not used the said oil, dead from the poison. I raised Randomized trials can be used for many purposes. myself early to visit them, when beyond my hope I They can be used for evaluating new drugs and other found those to whom I had applied the digestive treatments of disease, including lifestyle interventions medicament feeling but little pain, their wounds and tests of new health and medical care technology. neither swollen nor inflamed, and having slept The basic design of a randomized trial is shown in through the night. The others to whom I had ap- Fig. 10.1. plied the boiling oil were feverish with much pain We often begin with a defined population in which and swelling about their wounds. Then I deter- participants are randomized to receive either a new mined never again to burn thus so cruelly the poor treatment or the current treatment (often referred to as wounded. “usual care” or “standard of care”), and we then follow Although this was not a randomized trial, it was a the subjects in each group to see how many are im- form of unplanned trial, which has been carried out proved in the new treatment group compared with how many times when a therapy thought to be the best avail- many are improved in the current treatment group. 
If able has been in short supply and has not been available the new treatment is associated with a better outcome, for all of the patients who needed it. we would expect to find better outcomes in more of the 202 SECTION II Using Epidemiology to Identify the Cause of Disease written criteria is to ask: If we have spelled out our STUDY criteria in writing and someone not involved in the POPULATION study walks in off the street and applies our criteria to the same population, will that person select the same RANDOMLY ASSIGNED subjects whom we would have selected? There should be no element of subjective decision-making on the NEW CURRENT part of the investigator in deciding who is included or TREATMENT TREATMENT not included in the study. Any study procedure must in principle be replicable by others, just as is the case with laboratory experiments. Clearly, this is easier said than DO NOT DO NOT done, because in randomized trials we are often dealing IMPROVE IMPROVE with relatively large populations. The principle is nev- IMPROVE IMPROVE ertheless important, and the selection criteria must Fig. 10.1 Design of a randomized trial. therefore be precisely stated. ALLOCATING SUBJECTS TO TREATMENT new treatment group than in the current treatment group. GROUPS WITHOUT RANDOMIZATION We may choose to compare two groups receiving dif- Before discussing the process of randomization, let us ferent therapies, or we may compare more than two ask whether there might be some alternatives to ran- groups. Although at times a new treatment may be domization that could be used. compared with no treatment, often a decision is made not to use an untreated group. For example, if we Studies Without Comparison wanted to evaluate a newly developed therapy for ac- The first possible alternative is the case study or case se- quired immunodeficiency syndrome (AIDS), would we ries (as was presented in Chapter 7). 
In this type of be willing to have a group of AIDS patients in our study study, no comparison is made with an untreated group who were untreated? The answer is clearly no; we would or with a group that is receiving some other treatment. compare the newly developed therapy with a currently The following story was told by Dr. Earl Peacock when recommended regimen, which would clearly be much he was chairman of the Department of Surgery at the better than no therapy at all. Perhaps the use of an un- University of Arizona: treated group may only be justified if there are not ap- One day when I was a junior medical student, a proved therapies for the disease being studied. In this very important Boston surgeon visited the school case, a “placebo” can be used. and delivered a great treatise on a large number of Let us now turn to some of the issues that must be patients who had undergone successful operations considered in the design of randomized trials. Chief for vascular reconstruction. At the end of the lecture, among them is specification of the study “arms,” or a young student at the back of the room timidly treatments. These must be clearly stated with criteria for asked, “Do you have any controls?” Well, the great their measurement, as well as the duration of the treat- surgeon drew himself up to his full height, hit the ments and how long the study will last. First, let’s start desk, and said, “Do you mean did I not operate on with who is eligible to be studied. half of the patients?” The hall grew very quiet then. The voice at the back of the room very hesitantly SELECTION OF SUBJECTS replied, “Yes, that’s what I had in mind.” Then the visitor’s fist really came down as he thundered, “Of The criteria for determining who will or will not be course not. That would have doomed half of them to included in the study must be spelled out with great their death.” God, it was quiet then, and one could precision and in writing before the study is begun. 
Simi- scarcely hear the small voice ask, “Which half?”7 lar to a study hypothesis, these should be prespecified (also called a priori) at the time of the study design The issue of comparison is important because we planning. An excellent test of the adequacy of these want to be able to derive a causal inference regarding the CHAPTER 10 Assessing Preventive and Therapeutic Interventions: Randomized Trials 203 relationship of a treatment and subsequent outcome. Studies With Comparison The problem of inferring a causal relationship from a If we therefore recognize the need for our study to include sequence of events without any comparison is demon- some type of comparison, what are the possible designs? strated in a story cited by Ederer.8 Historical Controls During World War II, rescue workers, digging in the We could use a comparison group from the past, called ruins of an apartment house blown up in the Lon- historical controls. Some refer to them as external controls don blitz, found an old man lying naked in a bath- or synthetic controls. We have a therapy today that we tub, fully conscious. He said to his rescuers, “You believe will be quite effective, and we would like to test it know, that was the most amazing experience I ever in a group of patients; we know that we need a compari- had. When I pulled the plug and the water started son group. So, for comparison, we will go back to the re- down the drain, the whole house blew up.” cords of patients with the same disease who were treated The problem exemplified by this story is: If we ad- before the new therapy became available. This type of minister a drug and the patient improves, can we attri- design seems inherently simple and attractive. bute the improvement to the administration of that What are the problems in using historical controls? drug? 
Professor Hugo Muensch of Harvard University First, if today we decide to carry out the study just de- articulated his Second Law: “Results can always be im- scribed, we may set up a very meticulous system for data proved by omitting controls.”9 collection from the patients currently being treated. But, For example, in June 2020, in the first peak of the of course, we cannot do that for the patients who were COVID-19 pandemic, results of a study were pub- treated in the past, for whom we must abstract data lished,10 titled “Safety Update: COVID-19 Convalescent from medical records which are likely useful for manag- Plasma in 20,000 Hospitalized Patients.” The paper was ing individual care but are fraught with error and omis- instrumental in the Food and Drug Administration sions when used for research purposes. Consequently, if (FDA)´s decision to issue the Emergency Use Authoriza- at the end of the study we find a difference in outcome tion for convalescent plasma for hospitalized COVID-19 between patients treated in the early period (historical patients. The authors stated that the objective of their controls) and patients treated in the later (current) pe- study was to provide an update on key safety metrics riod, we will not know whether there was a true differ- after transfusion of convalescent plasma in hospitalized ence in outcome or whether the observed difference was COVID-19 patients. They analyzed data from a conve- due only to a difference in the quality of the data collec- nience sample of the US FDA Expanded Access Program tion. The data obtained from the study groups must be for COVID-19 convalescent plasma in which ALL of the comparable in kind and quality; in studies using his- 20,000 hospitalized patients with COVID-19 were torical controls, this is often not the case. transfused with convalescent plasma. 
That was a critical The second problem is that if we observe a difference flaw of the study—the absence of a controlled compari- in outcome between the early group and the later group, son group who were not exposed to convalescent we will not be sure that the difference is due to the plasma. Despite this, the authors concluded that “These therapy because many things other than the therapy updated data provide robust evidence that transfusion of change over calendar time (e.g., ancillary supportive convalescent plasma is safe in hospitalized patients with therapy, living conditions, nutrition, and lifestyles). This COVID-19, and support the notion that earlier adminis- is often referred to as “secular changes.” Hence, if we tration of plasma within the clinical course of COVID-19 observe a difference and if we have ruled out differences is more likely to reduce mortality.” Unfortunately, this in data quality as the reason for the observed difference, was a conclusion that did not align with the kind of we will not know whether the difference is a result of the evidence that came from such a single arm non-con- drug we are studying or of other changes that take place trolled study and was the basis of a letter to the editor in many other factors that may be associated with the criticizing the methodology and the resulting flawed outcome over calendar time. This is why when an inves- conclusion.11 Since this paper included all exposed pa- tigator chooses historical controls, careful consideration tients, it is neither a cohort study not a clinical trial; it is should be made to whether there were major secular simply a large case series of 20,000 patients from which changes in accuracy of diagnosis, other treatments or we cannot draw a causal inference. management of the patient population being studied. 204 SECTION II Using Epidemiology to Identify the Cause of Disease However, at times, this type of design may be useful. 
“Practically every one of the controls was ill, and not For example, when a disease is uniformly fatal and a new one of the subjects had any trouble. Really wonder- drug becomes available, a decline in case-fatality that par- ful stuff.” A skeptic asked how he had chosen the allels use of the drug would strongly support the conclu- controls and the subjects. “Oh, I gave the stuff to my sion that the new drug is having an effect. Examples in- seamen and used the passengers as controls.”14 clude the discovery of insulin to treat diabetes, of penicillin There are a number of possible approaches for se- to treat serious infections, of streptomycin to treat tuber- lecting controls in such a nonrandomized fashion. One culosis, the St Jude protocol to treat children with acute is to assign patients by the day of the month on which lymphocytic leukemia and of tyrosine kinase inhibitors the patient is admitted to the hospital: for example, if (TKIs) such as imatinib (Gleevec) to treat chronic myelo- admission is on an odd-numbered day of the month the cytic leukemia. The use of historical controls has also been patient is in group A, and if admission is on an even- used by regulatory authorities for approvals of new drugs. numbered day of the month the patient is in group B. In For example, pretomanid was developed by Tuberculosis a trial of anticoagulant therapy after World War II, in (TB) Alliance as a combination drug (along with bedaqui- which this day-of-the-month method was used, it was line and linezolid) in the treatment of patients with highly discovered that more patients than expected were ad- drug-resistant forms of tuberculosis. During the clinical mitted on odd-numbered days. 
The investigators re- development program of the drug, FDA issued a Fast ported that “as physicians observed the benefits of antico- Track and Orphan Drug Designation given the rare dis- agulant therapy, they speeded up, where feasible, the ease status of the drug-resistant forms of TB, which po- hospitalization of those patients... who would routinely tentially justifies the use of an external control arm. In have been hospitalized on an even day in order to bring as 2019, pretomanid was approved by the FDA as a New many as possible under the odd-day deadline.”15 Molecular Entity (NME) based on one single arm phase 3 The problem here is that the assignment system was trial.12 However, while the trial itself was single arm, the predictable: it was possible for the physicians to know sponsor (TB Alliance), used a Real-World Evidence what the assignment of the next patient would be. The (RWE) methodology to support the approval of the drug primary goal of randomization is to eliminate the pos- using the external control arm approach with the agree- sibility that the investigator will know what the assign- ment of the FDA. They conducted a literature summary ment of the next patient will be, because such knowledge and case-matched analysis of historical control data for introduces the possibility of bias on the part of the inves- extensively resistant TB patients using carefully chosen tigator regarding the treatment group to which each inclusion and exclusion criteria. They were able to identify participant will be assigned. Randomization, if done 16 research articles that met their criteria and calculated a correctly, eliminates this type of selection bias. 
pooled relative risk of a favorable outcome comparing the Many years ago, a study was carried out of the effects new combination therapy to the external control arm of of bacillus Calmette-Guérin (BCG) vaccination against 6.6 (95% confidence interval 4.6, 9.6, P ,.0001).13 tuberculosis in children from families with tuberculosis Simultaneous Nonrandomized Controls in New York City.16 The physicians were told to divide the group of eligible children into a group to be immu- Because of the importance of the problems posed by nized and a comparison or control group who were not historical controls and the difficulties of dealing with immunized. As seen in Table 10.1, tuberculosis mortality changes over calendar time, an alternative approach is was almost five times higher in the controls than in the to use simultaneous controls that are not selected in a vaccinated children. However, as the investigators wrote: randomized manner. The problem with selecting simul- taneous controls in a nonrandomized manner is illus- Subsequent experience has shown that by this trated by the following story: method of selection, the tendency was to inoculate the children of the more intelligent and cooperative A sea captain was given samples of anti-nausea pills parents and to keep the children of the noncoopera- to test during a voyage. The need for controls was tive parents as controls. This was probably of con- carefully explained to him. Upon return of the ship, siderable error since the cooperative parent will not the captain reported the results enthusiastically. only keep more careful precautions, but will usually CHAPTER 10 Assessing Preventive and Therapeutic Interventions: Randomized Trials 205 TABLE 10.1 Results of a Trial of Bacillus of a patient to a study group. The critical element of Calmette-Guérin Vaccination: I randomization is the unpredictability of the next as- signment.17 TUBERCULOSIS How is randomization accomplished? 
Although ran- DEATHS dom allocation (randomization) is currently usually No. of Children Number % done through computer generated programs, on occa- Vaccinated 445 3 0.67 sion manual randomization is used either as a backup to Controls 545 18 3.30 computer-generation assignment or when access to a computer is limited. In this hypothetical example of Data from Levine MI, Sackett MF. Results of BCG immuniza- tion in New York City. Am Rev Tuberculosis. 1946;53:517–532. manual assignment, we use a selection from a table of random numbers (Table 10.3). (Such random number tables are available in an appendix in most statistics TABLE 10.2 Results of a Trial of Bacillus textbooks or can be generated on computers.) Calmette-Guérin Vaccination: II First, how do we look at Table 10.3? Note that the table is divided into 10 rows and 4 numbered columns TUBERCULOSIS (row numbers appear on the far left columns). The col- DEATHS umns are numbered along the top, 00–04, 05–09, and so No. of Children Number % on. This means that the first number in column 00 is 5, Vaccinated 556 8 1.44 the number in column 01 is 6, the number in column 03 Controls 528 8 1.52 is 3, etc. Similarly, the rows are numbered along the left, Data from Levine MI, Sackett MF. Results of BCG immuniza- 00, 01, 02, and so on. Thus, it is possible to refer to any tion in New York City. Am Rev Tuberculosis. 1946;53:517–532. digit in the table by giving its column and row numbers. This is important if the quality of the randomization process is to be checked by an outsider. How do we use bring the child more regularly to the clinic for in- this table? Let us say that we are conducting a study in struction as to child care and feeding.16 which there will be two groups: therapy A and therapy B. In this example, we will consider every odd number Recognizing that the vaccinations were selectively per- an assignment to A and every even number an assign- formed in children from families that were more likely to ment to B. 
We close our eyes and put a finger anywhere be conscious of health and related issues, the investigators on the table and write down the column and row num- realized that it was possible that the case-fatality rate from ber that was our starting point. We also write down the tuberculosis was lower in the vaccinated group, not be- direction we will move in the table from that starting cause of the vaccination itself but because these children were selected from more health-conscious families who had a lower risk of mortality from tuberculosis, with or TABLE 10.3 Table of Random Numbers without vaccination. To address this problem, a change 00–04 05–09 10–14 15–19 was made in the study design: alternate children were vac- 00 56348 01458 36236 07253 cinated, and the remainder served as controls. This does not constitute randomization, but it was a marked im- 01 09372 27651 30103 37004 provement over the initial design. As seen in Table 10.2, 02 44782 54023 61355 71692 there was now no difference between the groups. 03 04383 90952 57204 57810 04 98190 89997 98839 76129 ALLOCATING SUBJECTS USING 05 16263 35632 88105 59090 RANDOMIZATION 06 62032 90741 13468 02647 07 48457 78538 22759 12188 In view of the problems discussed, randomization is the 08 36782 06157 73084 48094 best approach in the design of a trial. Randomization 09 63302 55103 19703 74741 means, in effect, tossing a coin to decide the assignment 206 SECTION II Using Epidemiology to Identify the Cause of Disease point (horizontally to the right, horizontally to the left, The envelopes are then sealed. When the first patient up, or down). Let us assume that we point to the “5” at is enrolled, envelope 1 is opened and the assignment is the intersection of column 07 and row 07 and move read; this process is repeated for each of the remaining horizontally to the right. The first patient, then, is desig- patients in the study. nated by an odd number, 5, and will receive therapy A. 
However, this process is not foolproof. The following The second patient is also designated by an odd num- anecdote illustrates the need for careful quality control ber, 3, and will receive therapy A. The third is designated of any randomized study: by an even number, 8, and will receive therapy B, and so In a randomized study comparing radical and sim- on. Note that the next patient assignment is not predict- ple mastectomy for breast cancer, one of the sur- able; it is not a strict alternation, which would be pre- geons participating was convinced that radical mas- dictable and hence subject to investigator bias, know- tectomy was the treatment of choice and could not ingly or unknowingly. reconcile himself to performing simple mastectomy There are many ways of using a table of random on any of his patients who were included in the numbers for allocating patients to treatment groups in study. When randomization was carried out for his a randomized trial (Box 10.1). Although many ap- patients and an envelope was opened that indicated proaches are valid, the important point is to spell out in simple mastectomy for the next assignment, he writing whatever approach is selected for use, before would set the envelope aside and keep opening enve- randomization is actually begun. lopes until he reached one with an assignment to Having decided conceptually how to use the random radical mastectomy. numbers for allocating study participants, how do we make a practical decision as to which patients get which What is reflected here is the conflict experienced by therapy? Let us assume, for example, that a decision has many clinicians who enroll their own patients in ran- been made that odd digits will designate assignment to domized trials. On the one hand, the clinician has the treatment A and even digits will designate treatment B. 
obligation to do the best they can for the patient; on the other hand, when a clinician participates in a clinical trial, they are, in effect, asked to step aside from the usual decision-making role and essentially to “flip a coin” to decide which therapy the patient will receive. Thus, there is often an underlying conflict between the clinician’s role and the role of the physician participating in enrolling patients in a clinical trial, and as a result, unintentional biases may occur.

This is such a common problem, particularly in large, multicenter trials, that randomization is not carried out by each participating clinical field center; rather, it is done by an impartial separate coordinating center. When a new patient is registered at a clinical center, the coordinating center is called, or an assignment is downloaded by the coordinating center. A randomized assignment is then made for that patient by the coordinating center, and the assignment is noted in both the clinical and coordinating centers.

The treatment assignment that is designated by the random number is written on a card, and this card is placed inside an opaque envelope. Each envelope is labeled on the outside: Patient 1, Patient 2, Patient 3, and so on, to match the sequence in which the patients are enrolled in the study. For example, if the first random number is 2, a card for therapy B would be placed in the first envelope; if the next random number is 7, a card for therapy A in the second one, and so on, as determined by the random numbers.

BOX 10.1 Examples of Using a Random Numbers Table for Allocating Patients to Treatment Groups in a Randomized Trial

If we plan to compare two groups:
    We decide that even digits designate treatment A, odd digits designate treatment B, or
    We decide that digits 0–4 designate treatment A, digits 5–9 designate treatment B
If we plan to compare three groups:
    We decide that digits 1–3 designate treatment A, digits 4–6 designate treatment B, digits 7–9 designate treatment C, and digit 0 would be ignored

What do we hope to accomplish by randomization? If we randomize properly, we achieve non-predictability of the next assignment; we do not have to worry that any subjective biases of the investigators, either overt or covert, may be introduced into the process of selecting patients for one treatment group or the other. In addition, if the study is large enough and there are enough participants, we hope that randomization will increase the likelihood that the groups will be comparable to each other in regard to characteristics about which we may be concerned, such as sex, age, race, and severity of disease, all factors that may affect prognosis. Randomization is not a guarantee of comparability because chance may play a role in the process of random treatment assignment. However, if the treatment groups that are being randomized are large enough and the randomization procedure is free of bias, they will tend to be similar. In the meantime, if the two groups are found to be different according to some baseline variables, we can infer that these differences occurred by chance. (Note that even chance differences have to be adjusted for.)

What Is the Main Purpose of Randomization?
The main purpose of randomization is to prevent any potential selection biases on the part of the investigators from influencing the assignment of participants to different treatment groups. When participants are randomly assigned to different treatment groups, all decisions on treatment assignment are removed from the control of the investigators. Thus, the use of randomization is crucial to protect the study from any biases that might be introduced consciously or subconsciously by the investigator into the assignment process.

Fig. 10.2 presents a hypothetical example of the effect of lack of comparability on a comparison of mortality rates of the groups being studied. Let us assume a study population of 2,000 subjects with myocardial infarctions, of whom half receive an intervention and the other half do not. Let us further assume that of the 2,000 patients, 700 have an arrhythmia and 1,300 do not. Case-fatality in patients with the arrhythmia is 50%, and in patients without the arrhythmia it is 10%.

Fig. 10.2 Nonrandomized versus randomized studies. I, If the study is not randomized, the proportions of patients with arrhythmia in the two intervention groups may differ. In this example, individuals with arrhythmia are less likely to receive the intervention than individuals without arrhythmia. II, If the study is randomized, the proportions of patients with arrhythmia in the two intervention groups are more likely to be similar.

Let us look at the nonrandomized study on the left side of Fig. 10.2. Because there is no randomization, the intervention groups may not be comparable in the proportion of patients who have the arrhythmia. Perhaps 200 in the intervention group may have the arrhythmia (with a case-fatality of 50%) and 500 in the no-intervention group may have the arrhythmia (with its 50% case-fatality). The resulting case-fatality will be 18% in the intervention group and 30% in the no-intervention group. We might be tempted to conclude that the intervention is more effective than not intervening.

However, let us now look at the randomized study on the right side of Fig. 10.2. As seen here, the groups are comparable, as is likely to occur when we randomize, so that 350 of the 1,000 patients in the intervention group and 350 of the 1,000 patients in the no-intervention group have the arrhythmia. When the case-fatality is calculated for this example, it is 24% in both groups.
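The case-fatality figures quoted for Fig. 10.2 follow directly from the numbers assumed in the example: 1,000 patients per group, with a case-fatality of 50% among patients with the arrhythmia and 10% among those without it. The short Python sketch below reproduces that arithmetic. The function name `case_fatality_pct` is introduced here purely for illustration and does not come from the text.

```python
def case_fatality_pct(n_arrhythmia, n_total, cf_arrhythmia_pct=50, cf_other_pct=10):
    """Overall case-fatality (%) in a group that mixes patients
    with and without the arrhythmia (rates taken from the Fig. 10.2 example)."""
    n_other = n_total - n_arrhythmia
    # Weight each subgroup's case-fatality (%) by the size of that subgroup.
    weighted = n_arrhythmia * cf_arrhythmia_pct + n_other * cf_other_pct
    return weighted / n_total


# Nonrandomized study: 200 vs 500 patients with the arrhythmia per 1,000.
nonrandomized = {
    "intervention": case_fatality_pct(200, 1000),
    "no intervention": case_fatality_pct(500, 1000),
}

# Randomized study: 350 patients with the arrhythmia in each group of 1,000.
randomized = {
    "intervention": case_fatality_pct(350, 1000),
    "no intervention": case_fatality_pct(350, 1000),
}

print(nonrandomized)  # {'intervention': 18.0, 'no intervention': 30.0}
print(randomized)     # {'intervention': 24.0, 'no intervention': 24.0}
```

The intervention itself plays no part in the calculation: the 18% versus 30% gap in the nonrandomized comparison comes entirely from the different proportions of patients with the arrhythmia in the two groups.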
Thus the difference observed between intervention and no intervention when the groups were not comparable in terms of the arrhythmia was entirely due to the noncomparability and not to any effects of the intervention itself. In the nonrandomized study (also known as observational), arrhythmia is defined as a confounding variable (or confounder), as it is associated with case-fatality and is different between the groups. (Please note that although Fig. 10.2 shows 1,000 participants in both the intervention and no-intervention group, randomization does not guarantee an equal number of participants in each group; however, with large numbers, on average the two groups will generally be comparable.)

One might ask, if we are so concerned about the comparability of the groups, why not just match the groups on the specific variables about which we are concerned, rather than randomizing? The answer is that we can match only on variables that we know about and that we can measure. Thus, we cannot match on many variables that may affect prognosis, such as an individual’s genetic constitution, elements of an individual’s immune status, or other variables of which we may not even be aware and hence we will not measure during the study’s data collection. In addition, if we match on a particular characteristic, we cannot analyze its association with the outcome because the two groups will already be identical regarding this characteristic.

To summarize, randomization increases the likelihood that the groups will be comparable not only in terms of variables that we recognize and can measure, but also in terms of variables that we may not recognize, may not be able to test now, and may not be able to measure with today’s technologies. However, at the end of the day, randomization cannot always guarantee comparability of the groups being studied. We can analyze whether there are important differences between the two groups that may be associated with the trial outcome.

As mentioned previously, although randomization often increases the comparability of the different treatment groups, randomization does not guarantee comparability. Another benefit of randomization is that to whatever extent it contributes to comparability, this contribution applies both to variables we can measure and to variables that we cannot measure and may not even be aware of, even though they may be important in interpreting the findings of the trial.

Stratified Randomization
Sometimes we may be particularly concerned about comparability of the groups in terms of one or a few important characteristics that we strongly think may influence prognosis or response to therapy in the groups being studied, but as we have just said, randomization does not ensure comparability. An option that can be used is stratified randomization, an assignment method that can be very helpful in increasing the likelihood of comparability of the study groups. In this section, we will show how this method is used to assign participants to different study groups.

For example, let us say that we are particularly concerned about age as a prognostic variable: prognosis is much worse in older patients than among the younger. Therefore, we are concerned that the two treatment groups be directly comparable in terms of age. Although one of the benefits of randomization is that it may increase the likelihood of such comparability, it does not guarantee it. It is still possible that after we randomize, we may, by chance, find that most of the older patients are in one group and most of the younger patients are in the other. Our results would then be impossible to interpret because the higher-risk patients would be clustered in one group and the lower-risk patients in the other. Any difference in outcome between intervention
groups may then be attributable to this difference in the age distributions of the two groups rather than to the effects of the intervention.

In stratified randomization, we first stratify (stratum = layer or level) our study population by each variable that we consider important and then randomize participants to treatment groups within each stratum.

Let us consider the example shown in Fig. 10.3. We are studying 1,000 patients and are concerned that sex and age are important determinants of prognosis. If we randomize, we do not know what the composition of the groups may be in terms of sex and age; therefore, we decide to use stratified randomization.

We first stratify the 1,000 patients by sex into 600 males and 400 females. We then separately stratify the males by age and the females by age. We now have four groups (strata): younger males, older males, younger females, and older females. We now randomize within each group (stratum), and the result is a new treatment group and a current treatment group for each of the four groups. As in randomization without stratification, we end up with two intervention groups, but having initially stratified the groups, we increase the likelihood that the two groups will be comparable in terms of sex and age. (As in Fig. 10.2, Fig. 10.3 shows that randomization results in an equal number of participants in each treatment group, although this result is not guaranteed by randomization.) We may also decide to examine the results in each stratum, e.g., compare the randomized groups in younger women (say under 50 years old) and older women, with the same analysis done for men. Subgroup analysis according to randomization strata is akin to small, randomized trials, which may not be sufficiently statistically powered as the overall trial to detect treatment differences.

Fig. 10.3 Example of stratified randomization. See discussion in text.

DATA COLLECTION ON SUBJECTS
As mentioned earlier, it is essential that the data collected for each of the study groups be of the same quality. We do not want any differences in results between the groups to be due to differences in the quality or completeness of the data that were collected in the study groups. Let us consider some of the variables about which data need to be obtained on the subjects.

Treatment (Assigned and Received)
What data are needed? First, we must know to which treatment group the patient was assigned. In addition, we must know which therapy the patient actually received. It is important to know, for example, if the patient was assigned to receive treatment A but did not comply. A patient may agree to be randomized but may later change his or her mind and refuse to comply. Conversely, it is also clearly important to know whether a patient who was not assigned to receive treatment A may have taken treatment A on his or her own, often without the investigators knowing. This can also happen if the investigator or the clinical research coordinator inadvertently administers the wrong treatment (or wrong dose) to the patient.

Outcome
The need for comparable measurements in all study groups is particularly true for measurements of outcome. Such measurements include both improvement (the desired effect) and any side effects (often referred to as adverse events) that may appear. There is therefore a need for explicitly stated criteria for all outcomes to be measured in a study. Once the criteria are explicitly stated, we must be certain that they are measured comparably in all study groups. In particular, the potential pitfall of outcomes being measured more carefully in those receiving a new drug than in those receiving currently available therapy must be avoided. Masking (blinding), discussed later, can prevent much of this problem, but because masking is not always possible, attention must be given to ensuring comparability of measurements and of data quality in all of the study groups.

All-Cause Mortality Outcome (“Public Health Outcome”)
On occasion a medication or a preventive strategy for mortality that is effective with regard to the main outcome of interest does not increase event-free survival. For example, in the 13-year follow-up of the European Randomized Study of Screening for Prostate Cancer, there was a reduction of approximately 27% in prostate cancer mortality.18 However, overall mortality (also known as “public health outcome”) was similar in the two study groups, thus suggesting that effectiveness of screening with regard to all-cause mortality was null.

Prognostic Profile at Entry
If we know the risk factors for a bad outcome, we want to verify that randomization has provided reasonable similarity between the two groups in terms of these risk factors. For example, if age is a significant risk factor, we would want to know that randomization has resulted in groups that are comparable for age. Data for prognostic factors should be obtained at the time of subject entry into the study, and then the two (or more) groups can be compared on these factors at baseline (i.e., before the treatment is provided). Another strategy to evaluate comparability is to examine an outcome totally unrelated to the treatment that is being evaluated. For example, if the randomized trial’s objective is to evaluate a new medication for migraines, it is expected that mortality from cancer would be similar in the two groups.

Masking (Blinding)
Masking involves several components: First, we would like the subjects not to know which group they are assigned to. This is of particular importance when the outcome is a subjective measure, such as a psychiatric condition, self-reported severity of headache or low back pain. If the patient knows that they are receiving a new therapy, enthusiasm and certain psychological factors on the part of the patient may operate to elicit a positive response even if the therapy itself had no positive biologic or clinical effect.

How can subjects be masked? One way is by using a placebo, an inert substance that looks, tastes, and smells like the active agent. However, use of a placebo does not automatically guarantee that the patients are masked (blinded). Some participants may try to determine whether they are taking the placebo or active drug. For example, in a randomized trial of vitamin C for the common cold, patients were blinded by use of a placebo and were then asked whether they knew or suspected which drug they were taking. Even with masking of medications, participants may decide to alternate doses with another participant so that they hope they each get some of the study medication; this was commonly seen when new antiretroviral therapy was initiated for AIDS in resource limited settings.

As seen in Table 10.4, of the 52 people who were receiving vitamin C and were willing to make a guess, 40 stated they had been receiving vitamin C. Of the 50 who were receiving placebo, 39 said they were receiving placebo. How did they know? They had bitten into the capsule and could tell by the bitter taste. Does it make any difference that they knew? The data suggest that the rate of colds was higher in subjects who received vitamin C but thought they were receiving placebo than in subjects who received placebo but thought they were receiving vitamin C. Thus, we must be very concerned about lack of masking of the subjects and its potential effects on the results of the study, particularly when we are dealing with subjective end points.

TABLE 10.4 Randomized Trial of Vitamin C and Placebo for the Common Cold: Results of a Questionnaire Study to Determine Whether Subjects Suspected Which Agent They Had Been Given

                    SUSPECTED DRUG
Actual Drug     Vitamin C    Placebo    Total
Vitamin C           40          12        52
Placebo             11          39        50
Total               51          51       102

P < .001.
From Karlowski TR, Chalmers TC, Frenkel LD, et al. Ascorbic acid for the common cold. A prophylactic and therapeutic trial. JAMA. 1975;231:1038. Copyright 1975, American Medical Association.

Use of a placebo is also important for studying the rates of side effects and reactions, formally called adverse events. The Physicians’ Health Study was a randomized trial of the use of aspirin to prevent myocardial infarctions. Table 10.5 shows the side effects that were reported in groups receiving aspirin and those receiving placebo in this study.

TABLE 10.5 Physicians’ Health Study: Side Effects According to Treatment Group

Side Effect                    Aspirin Group (%)    Placebo Group (%)    P
GI symptoms (except ulcer)          34.8                 34.2            .48
Upper GI tract ulcers                1.5                  1.3            .08
Bleeding problems                   27.0                 20.4            <.00001

GI, Gastrointestinal.
Data from Steering Committee of the Physicians’ Health Study Research Group. Final report on the aspirin component of the Ongoing Physicians’ Health Study. N Engl J Med. 1989;321:129–135. Copyright 1989, Massachusetts Medical Society. All rights reserved.

Note the high rates of reported reactions in people receiving placebo. Thus it is not sufficient to say that 34% of the people receiving aspirin had gastrointestinal symptoms; what we really want to know is the extent to which the risk of side effects is increased in people taking aspirin compared with those not taking aspirin (i.e., those taking placebo). Thus, the placebo plays a major role in identifying both the real benefits of an agent and its side effects. Sometimes it is possible to use a medication in both the new therapy and in the placebo groups to prevent the occurrence of the most obvious side effects of the therapy. In the aspirin example, a proton pump inhibitor, which is a class of medication that is used to prevent gastrointestinal symptoms from excess acid, could be given to both randomized groups, thus masking the participants with regard to the group to which they were allocated. In addition to masking (or blinding) the subjects, we also want to mask the observers or data collectors in regard to which group a patient is in. The masking of both participants and study personnel is called “double masking.” Some years ago, a study was being conducted to evaluate coronary care units in the treatment of myocardial infarction. It was planned in the following manner:

    Patients who met strict criteria for categories of myocardial infarction [were to] be randomly assigned either to the group that was admitted immediately to the coronary care unit or to the group that was returned to their homes for domiciliary care. When the preliminary data were presented, it was apparent in the early phases of the experiment that the group of patients labeled as having been admitted to the coronary care unit did somewhat better than the patients sent home. An enthusiast for coronary care units was uncompromising in his insistence that the experiment was unethical and should be terminated and that the data showed that all such patients should be admitted to the coronary care unit. The statistician then revealed the headings of the data columns had been interchanged and that really the home care group seemed to have a slight advantage. The enthusiast then changed his mind but could not be persuaded to declare coronary care units unethical.19

The message of this example is that each of us comes to whatever study we are conducting with a certain number of subconscious or conscious biases and preconceptions. The methods discussed in this chapter and Chapter 11 are designed to shield the study from the biases of the investigators.

We will now turn to two other aspects of the design of randomized trials: crossover and factorial designs.

CROSSOVER
Another important issue in clinical trials is crossover. Crossover may be of two types: planned or unplanned.

A planned crossover is shown in Fig. 10.4. In this example, a new treatment is being compared with current treatment. Subjects are randomized to new treatment or current treatment (see Fig. 10.4A). After being observed for a certain period of time on one therapy and after any changes are measured (see Fig. 10.4B), the patients are switched to the other therapy (see Fig. 10.4C). Both groups are then again observed for a certain period of time (see Fig. 10.4D). Changes in group 1 patients while they are on the new treatment can be compared with changes in these patients while they are on the current treatment (see Fig. 10.4E). Changes in group 2 patients while they are on the new treatment can also be compared with changes in these patients while they are on the current treatment (see Fig. 10.4F). Thus each patient can serve as his or her own control, holding constant the variation between individuals in many characteristics that do not vary over time and that could potentially affect a comparison of the efficacy of the two agents in the trial.

Fig. 10.4 (A–F) Design of a planned crossover trial. See discussion in text.

This type of design is very attractive and useful provided that certain cautions are taken into account. First is that of carryover: For example, if a subject is changed from therapy A to therapy B and observed under each therapy, the observations under therapy B will be valid only if there is no residual carryover from therapy A. There must be enough of a “washout period” to be sure none of therapy A, or its effects, remains before starting therapy B. Second, the order in which the therapies are given may elicit psychological responses. Patients may react differently to the first therapy given in a study as a result of the enthusiasm that is often accorded a new study (Hawthorne effect); this enthusiasm may diminish over time. We therefore want to be sure that any differences observed are indeed due to the agents being evaluated, and not to any effect of the order in which they were administered. Finally, the planned crossover design is clearly not possible if the new therapy is surgical or if the new therapy cures the disease.

A more important consideration is that of an unplanned crossover. This is also called contamination of treatment arms. Fig. 10.5A shows the design of a randomized trial of coronary bypass surgery, comparing it with medical care for coronary heart disease. Randomization is carried out after informed consent has been obtained. Although the initial design is straightforward, in reality, unplanned crossovers may occur. Some subjects randomized to bypass surgery may begin to have second thoughts and decide not to have the surgery (see Fig. 10.5B). They are therefore crossovers into the medical care group (see Fig. 10.5C). In addition, the condition of some subjects assigned to medical care may begin to deteriorate and urgent bypass surgery may be required (see Fig. 10.5B); these subjects are crossovers from the medical to the surgical care group (see Fig. 10.5C). The patients seen on the left in Fig. 10.5D are now treated surgically, and those on the right in this figure are treated medically. Those treated surgically include some who were randomized to surgery (shown in pink) and some who crossed over to surgery (shown in yellow). Those treated medically include some who were randomized to medical treatment (shown in yellow) and some who crossed over to medical treatment (shown in pink).

Fig. 10.5 (A–E) Unplanned crossover in a study of cardiac bypass surgery and the use of intention to treat analysis. (A) Original study design. (B–D) Unplanned crossovers. (E) Use of intention to treat analysis.

Unplanned crossovers pose a serious challenge in analyzing the data. If we analyze according to the original assignment (called an intention to treat analysis), we will include in the surgical group some patients who received only medical care, and we will include in the medical group some patients who had surgery. In other words, we would compare the patients according to the treatment to which they were originally randomized, regardless of what treatment actually occurred. Fig. 10.5E shows an intention to treat analysis in which we compare the group in pink (randomized to surgical treatment) with the group in yellow (randomized to medical treatment). Although it may be counterintuitive, it is the most conservative analysis of randomized trials, and also highly regarded by regulatory authorities for the purpose of new drug approvals. If, however, we analyze according to the treatment that the patients actually receive (as treated analysis), we will have broken, and therefore lost the benefits of, the randomization. In essence, as treated analysis of a randomized trial can be considered as an analysis of a cohort study since randomization is not intact anymore and, thus, observational (nonrandomized) study biases may have occurred.

No perfect solution is available for this dilemma. Current practice is to perform the primary analysis by intention to treat, that is, according to the original randomized assignment. We would hope that the results of other comparisons would be consistent with this primary approach. The bottom line is that because there are no perfect solutions, the number of unplanned crossovers must be kept to a minimum. Obviously, if we analyze according to the original randomization and there have been many crossovers, the interpretation of the study results will be questionable. If the number of crossovers becomes large, the problem of interpreting the study results may become insurmountable.

FACTORIAL DESIGN
An attractive alternative option in the study designs discussed in these chapters is the factorial design. Assuming that two drugs are to be tested, the anticipated outcomes for the two drugs are different, and their modes of action are independent, one can economically use the same study population for testing both drugs. This factorial type of design is shown in Fig. 10.6.

If the effects of the two treatments are indeed completely independent, we could evaluate the effects of treatment A by comparing the results in cells a + c to the results in cells b + d (Fig. 10.7A). Similarly, the results for treatment B could be evaluated by comparing the effects in cells a + b to those in cells c + d (see Fig. 10.7B).
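The two marginal comparisons described for the factorial design can be written out as a short calculation. The Python sketch below is illustrative only. The cell labels follow Fig. 10.6 (a = both A and B, b = B only, c = A only, d = neither A nor B), but the event counts and group sizes are hypothetical numbers, chosen so that treatment A halves the risk and treatment B has no effect. The function name `margin_risks` is likewise an assumption made for this example.

```python
def margin_risks(cells):
    """cells maps 'a'..'d' to (events, participants).
    Returns the pooled risk on each side of the two marginal comparisons."""
    def pooled(*labels):
        events = sum(cells[label][0] for label in labels)
        people = sum(cells[label][1] for label in labels)
        return events / people

    return {
        "A": pooled("a", "c"),     # everyone who received treatment A
        "no A": pooled("b", "d"),  # everyone who did not receive A
        "B": pooled("a", "b"),     # everyone who received treatment B
        "no B": pooled("c", "d"),  # everyone who did not receive B
    }


# Hypothetical data: (events, participants) in each cell of the 2 x 2 layout.
hypothetical = {"a": (30, 1000), "b": (60, 1000), "c": (30, 1000), "d": (60, 1000)}

risks = margin_risks(hypothetical)
risk_ratio_a = risks["A"] / risks["no A"]  # a + c versus b + d
risk_ratio_b = risks["B"] / risks["no B"]  # a + b versus c + d
print(f"Risk ratio for A: {risk_ratio_a:.2f}")
print(f"Risk ratio for B: {risk_ratio_b:.2f}")
```

Every participant contributes to both comparisons, which is why one study population can answer two questions. The pooling is only meaningful under the independence assumption stated in the text: if the two treatments interacted, the effect of A would differ between those who did and did not receive B, and the margins would blur that difference.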
In the event that it is decided to terminate the study of treatment A, this design permits continuing the study to determine the effects of treatment B.

Fig. 10.6 Factorial design for studying the effects of two treatments. [2 × 2 layout of Treatment A (+/−) by Treatment B (+/−): cell a, both A and B; cell b, B only; cell c, A only; cell d, neither A nor B.]

Fig. 10.7 (A and B) Factorial design. (A) The effects of treatment A (orange cells) versus no treatment A. (B) The effects of treatment B (purple cells) versus no treatment B.

An example of a factorial design is seen in the Physicians’ Health Study.20 More than 22,000 physicians were randomized using a 2 × 2 factorial design that tested aspirin for primary prevention of cardiovascular disease and beta carotene for primary prevention of cancer. Each physician received one of four possible interventions: both aspirin and beta carotene, neither aspirin nor beta carotene, aspirin and beta carotene placebo, or beta carotene and aspirin placebo. The resulting four groups are shown in Figs. 10.8 and 10.9. The aspirin part of the study (Fig. 10.10A) was terminated early, on the advice of the external data monitoring board, because a statistically significant 44% decrease in the risk of first myocardial infarction was observed in the group taking aspirin. The randomized beta carotene component (see Fig. 10.10B) continued until the originally scheduled date of completion. After 12 years of beta carotene supplementation, no benefit or harm was observed in terms of the incidence of cancer or heart disease or death from all causes. Subsequent reports have shown greater risk of cancer with beta carotene in smokers.21

Fig. 10.8 Factorial design used in a study of aspirin and beta carotene. [Study population of 22,071 randomly assigned to aspirin (11,037) or placebo (11,034); the aspirin group was then randomly assigned to beta carotene (5,517) or placebo (5,520), and the placebo group to beta carotene (5,520) or placebo (5,514).]

Fig. 10.9 Factorial design of the study of aspirin and beta carotene in 2 × 2 table format. [Cells: both aspirin and beta carotene; beta carotene only; aspirin only; neither aspirin nor beta carotene.]

Fig. 10.10 (A and B) Factorial design. (A) The effects of aspirin (orange cells) versus no aspirin. (B) The effects of beta carotene (purple cells) versus no beta carotene.

NONCOMPLIANCE
Patients may agree to be randomized, but following randomization they may not comply with the assigned treatment. Noncompliance may be overt or covert: On the one hand, people may overtly articulate their refusal to comply or may stop participating in the study. These noncompliers are also called dropouts from the study. On the other hand, people may just stop taking the agent assigned without admitting this to the investigator or the study staff. Whenever possible, checks on potential noncompliance are built into the study. These may include, for example, urine or blood tests for the agent being tested or for one of its metabolites.

Another problem in randomized trials has been called drop-ins. Patients in one group may inadvertently take the agent assigned to the other group. For example, in a trial of the effect of aspirin for prevention of myocardial infarction, patients were randomized to aspirin or to no aspirin. However, a problem arose in that, because of the large number of over-the-counter preparations that contain aspirin, many of the control patients might well be taking aspirin without knowing it. This was previously discussed as an unplanned crossover. Two steps were taken to address this problem: (1) controls were provided with lists of aspirin-containing over-the-counter preparations that they should avoid, and (2) urine tests for salicylates were carried out both in the aspirin group and in the controls.

The net effect of noncompliance on the study results will be to reduce any observed differences (i.e., driving the difference toward the null) because the treatment group will include some who did not receive the therapy, and the no-treatment group may include some who received the treatment. Thus, the groups will be less different in terms of therapy than they would have been had there been no noncompliance, so that even if there is a difference in the effects of the treatments, it will appear much smaller.

One approach that was used in the Veterans Administration Study of the Treatment of Hypertension22 was to carry out a pilot study in which compliers and noncompliers were identified. When the actual full study was later carried out, the study population was limited to those who had been compliers during the pilot study (sometimes referred to as a “run-in period”). The problem with this approach is that when we want to generalize from the results of such a study, we can only do so to other populations of compliers, which may be different from the population in any free-living community, which would consist of both compliers and noncompliers.

Table 10.6 shows data from the Coronary Drug Project reported by Canner and coworkers.23 This study was a comparison of clofibrate and placebo for lowering cholesterol. The table presents the mortality in the two groups.

TABLE 10.6 Coronary Drug Project: 5-Year Mortality in Patients Given Clofibrate or Placebo

               No. of Patients    Mortality (%)
Clofibrate          1,065             18.2
Placebo             2,695             19.4

Modified from Canner PL, Forman SA, Prud’homme GJ, for the Coronary Drug Project Research Group. Influence of adherence to treatment and response to cholesterol on mortality in the coronary drug project. N Engl J Med. 1980;303:1038–1041.

No large difference in 5-year mortality was seen between the two groups. The investigators speculated that perhaps this was the result of the patients not having taken their medication. Table 10.7 shows the results of separating the clofibrate subjects into good compliers and poor compliers. Here we see the 5-year mortality was 24.6% in the poor-complier group compared with 15% in the good-complier group. We might thus be tempted to conclude that compliance was indeed the factor that produced the results seen in Table 10.6: no significant difference between the clofibrate and placebo groups.

TABLE 10.7 Coronary Drug Project: 5-Year Mortality in Patients Given Clofibrate or Placebo According to Level of Compliance

                             No. of Patients    Mortality (%)
Clofibrate
    Poor complier (<80%)           357              24.6
    Good complier (≥80%)           708              15.0
Placebo                          2,695              19.4

Modified from Canner PL, Forman SA, Prud’homme GJ, for the Coronary Drug Project Research Group. Influence of adherence to treatment and response to cholesterol on mortality in the coronary drug project. N Engl J Med. 1980;303:1038–1041.

Table 10.8 separates both groups, clofibrate and placebo, into compliers and noncompliers. Even in the placebo group, 5-year mortality in the poor compliers was higher than in the good compliers: 28% compared with 15%. One way to maximize compliance is to administer a single pill that includes a combination of two medications needed to achieve a therapeutic target. This is commonly called fixed-dose combinations (FDCs). A systematic review and meta-analysis found that the use of FDCs of antihypertensive medications is associated with a significant improvement in medication compliance or persistence, despite non-statistically significant yet beneficial trends in blood pressure and adverse effects.24

TABLE 10.8 Coronary Drug Project: 5-Year Mortality in Patients Given Clofibrate or Placebo According to Level of Compliance

                         CLOFIBRATE                          PLACEBO
Compliance       No. of Patients   Mortality (%)    No. of Patients   Mortality (%)
Poor (<80%)            357             24.6               882             28.2
Good (≥80%)            708             15.0             1,813             15.1
Total Group          1,065             18.2             2,695             19.4

Modified from Canner PL, Forman SA, Prud’homme GJ, for the Coronary Drug Project Research Group. Influence of adherence to treatment and response of cholesterol on mortality in the coronary drug project. N Engl J Med. 1980;303:1038–1041.

What can we learn from these tables? People who do not comply or who do not participate in studies differ from those who do comply and who do participate. Therefore, in conducting a study to evaluate a therapy or other intervention, we cannot offer the agent to a population and compare the effects in those who take the agent to the effects in those who refuse or do not, because the two groups are basically different in terms of many demographic, social, psychological, and cultural variables that may have important roles in determining outcome. These are all forms of selection bias that were discussed previously when we talked about observational study designs. Randomization, or some other approach that reduces selection bias, is essential in a valid clinical trial.

CONCLUSION
The randomized trial is generally considered the gold standard of study designs. For the purpose of new drug approval by regulatory authorities, results of the randomized, double-masked, placebo-controlled clinical trial are considered the highest level of evidence. When hierarchies of study design are created to assess the strength of the available evidence supporting clinical and public health policy, randomized trials are virtually always at the top of the list when study designs are ranked in order of descending quality. However, a recently developed observational study approach, Mendelian randomization (the discussion of which is not within the scope of this textbook), mimics random allocation if its rather stringent assumptions can be met.25

This chapter has discussed many of the components of the randomized trial that are designed to shield the study from any preconceptions and biases of the investigator and of others involved in conducting the study, as well as from other biases that might inadvertently be introduced. In Chapter 11 we will address some other issues relating to the design of randomized trials and will consider several interesting examples and applications of the randomized trial design. Later in this book, we will discuss the use of randomized trials and other study designs for evaluating health services and for studying the effectiveness of screening.

REFERENCES
1. Cited in Silverman WA. Where’s the Evidence? Debates in Modern Medicine. New York: Oxford University Press; 1998.
2. Galton F. Inquiries Into Human Faculty and Its Development. London: Macmillan; 1883.
3. Joyce CRB, Welldon RMC. The efficacy of prayer: a double blind clinical trial. J Chronic Dis. 1965;18:367.
12. Conradie F, Diacon AH, Ngubane Nosipho, et al. Treatment of highly drug-resistant pulmonary tuberculosis. N Engl J Med. 2020;382(10):893–902.
13. U.S. Food & Drug Administration. Drug Approval Package: Pretomanid; Sept 13, 2019. https://www.accessdata.fda.gov/drugsatfda_docs/nda/2019/212862Orig1s000TOC.cfm
14. Wilson EB. Cited in Ederer F: Why do we need controls? Why do we need to randomize? Am J Ophthalmol. 1975;79:761.
15. Wright IS, Marple CD, Beck DF. Cited in Ederer F: Why do we need controls? Why do we need to randomize? Am J Ophthalmol. 1975;79:761.
16. Levine MI, Sackett MF. Results of BCG immunization in New York City. Am Rev Tuber. 1946;53:517–532.
17. Ederer F. Practical problems in collaborative clinical trials. Am J Epidemiol. 1975;102:111–118.
18. Schröder FH, Hugosson J, Roobol MJ, et al. Screening and prostate-cancer mortality in a randomized European study. N Engl J Med. 2009;360:1320–1328.
19. Cochrane AL. Cited in Ballintine EJ: Objective measurements and the double masked procedure. Am J Ophthalmol. 1975;79:764.
20. Hennekens CH, Buring JE, Manson JE. Lack of effect of long-term supplementation with beta carotene on the incidence of malignant neoplasms and cardiovascular disease. N Engl J Med.
1996;334:1145–1149. 4. Byrd RC. Positive therapeutic effects of intercessory 21. Goralczyk R. Beta-carotene and lung cancer in smokers: prayer in a coronary care unit population. South Med J. review of hypotheses and status of research. Nutr Cancer. 1988;81:826. 2009;61(6):767–774. 5. Bull JP. The historical development of clinical therapeutic 22. Materson BJ, Reda DJ, Massie BM, et al. Single-drug trials. J Chronic Dis. 1959;10:218. therapy for hypertension in men. A comparison of six 6. Lind J. A Treatise of the Scurvy. Edinburgh: Sands, Murray antihypertensive agents with placebo. The Department & Cochran; 1753. of Veterans Affairs Cooperative Study Group on Anti- 7. Peacock E. Cited in Tufte ER: Data Analysis for Politics hypertensive Agents. N Engl J Med. 1993;328(13): and Policy. Englewood Cliffs, NJ: Prentice-Hall; 1974. 914–21. 8. Ederer F. Why do we need controls? Why do we need to 23. Canner PL, Forman SA, Prud’homme GJ. Influence of randomize? Am J Ophthalmol. 1975;79:758. adherence to treatment and response of cholesterol on 9. Bearman JE, Loewenson RB, Gullen WH. Muensch’s mortality in the coronary drug project. N Engl J Med. Postulates, Laws and Corollaries. Biometrics Note No. 4. 1980;303:1038–1041. Bethesda, MD, Office of Biometry and Epidemiology, 24. Gupta AK, Arshad S, Poulter NR. Compliance, safety, and National Eye Institute, April 1974. effectiveness of fixed-dose combinations of antihyperten- 10. Joyner MJ, Bruno KA, Klassen SA, et al. Safety update: sive agents: a meta-analysis. Hypertension. 2010;55(2): COVID-19 convalescent plasma in 20,000 hospitalized 399–407. patients. Mayo Clin Proc. 2020;95(9):1888–1897. 25. Smith GD, Ebrhaim S. ‘Mendelian randomization’: 11. Farag YM. Limitations of safety update on convalescent Can genetic epidemiology contribute to understanding plasma transfusion in COVID-19 patients. Mayo Clin environmental determinants of disease? Int J Epidemiol. Proc. 2020; 95(12):2801–2802. 2003;32(1):1–22. 
REVIEW QUESTIONS FOR CHAPTERS 10 AND 11 ARE AT THE END OF CHAPTER 11.
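As a rough arithmetic check on the Coronary Drug Project figures in Table 10.8 of this chapter, the short Python sketch below recomputes each arm's overall 5-year mortality as the patient-weighted average of its poor-complier and good-complier strata, and also reports the gap in mortality between poor and good compliers within each arm. The patient counts and percentages are copied from the table; the names `TABLE_10_8`, `overall_mortality`, and `summarize` are illustrative assumptions and are not part of the original analysis by Canner and coworkers.

```python
# Data copied from Table 10.8: (number of patients, 5-year mortality in %).
# All names here are hypothetical and chosen only for illustration.
TABLE_10_8 = {
    "clofibrate": {"poor": (357, 24.6), "good": (708, 15.0)},
    "placebo": {"poor": (882, 28.2), "good": (1813, 15.1)},
}


def overall_mortality(strata):
    """Patient-weighted average mortality (%) across compliance strata."""
    total_patients = sum(n for n, _ in strata.values())
    weighted_sum = sum(n * pct for n, pct in strata.values())
    return weighted_sum / total_patients


def summarize(table):
    """Overall mortality and poor-minus-good mortality gap for each arm."""
    summary = {}
    for arm, strata in table.items():
        summary[arm] = {
            # Should reproduce the "Total Group" row of the table.
            "overall": round(overall_mortality(strata), 1),
            # Percentage-point gap between poor and good compliers.
            "poor_minus_good": round(strata["poor"][1] - strata["good"][1], 1),
        }
    return summary


if __name__ == "__main__":
    for arm, stats in summarize(TABLE_10_8).items():
        print(arm, stats)
```

Under these assumptions the weighted averages round to 18.2% for clofibrate and 19.4% for placebo, matching the Total Group row. The gap between poor and good compliers is about 9.6 percentage points in the clofibrate arm and 13.1 in the placebo arm, so the difference by compliance appears even among patients taking an inert pill. That is the pattern the text attributes to compliers and noncompliers being different kinds of people rather than to the drug itself.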