
Journal of Clinical Epidemiology 61 (2008) 324-330

REVIEW ARTICLES

Efficient ways exist to obtain the optimal sample size in clinical trials
in rare diseases
J.H. van der Lee (a,*), J. Wesseling (a,b), M.W.T. Tanck (c), M. Offringa (a,d)
(a) Department of Pediatric Clinical Epidemiology, Emma Children’s Hospital (ECH), Academic Medical Center (AMC), University of Amsterdam, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands
(b) Department of Pediatrics, Red Cross Hospital, Beverwijk, The Netherlands
(c) Department of Clinical Epidemiology, Biostatistics and Bioinformatics, AMC, University of Amsterdam, The Netherlands
(d) Department of Neonatology, ECH AMC, University of Amsterdam, The Netherlands
Accepted 12 July 2007

Abstract
Objective: Recruitment of pediatric patients in randomized clinical trials is hampered by the rarity of many conditions and by ethical
constraints. The objective of this paper is to give an overview of design options to obtain a statistically valid result while including a
minimum number of subjects.
Study Design and Setting: Overview and discussion of several approaches to conduct valid randomized clinical trials in rare diseases
and vulnerable populations.
Results: Sequential designs have been developed as efficient ways to evaluate accumulating information from a clinical trial, thereby
reducing the average size of trials. Different sequential procedures exist, including group sequential designs, boundaries designs, and adaptive
designs. The sample size attained at the end of the trial is unknown at the start. The sample size for a given set of α, β, and effect size
may turn out to be larger than with a classical fixed sample size approach. Simulations have shown that on average, sample sizes are
smaller.
Conclusion: There are several possibilities to optimize the number of subjects in a clinical trial. The rarity of many disorders in
children and the ethical requirements in this patient population should not obstruct the performance of well-designed research to support
clinical decision making. © 2008 Elsevier Inc. All rights reserved.
Keywords: Epidemiologic research design; Sample size; Randomized controlled trials; Ethics; Rare diseases; Child

* Corresponding author. Tel.: +31-20-566-3299; fax: +31-20-696-5099.
E-mail address: j.h.vanderlee@amc.uva.nl (J.H. van der Lee).
0895-4356/08/$ - see front matter © 2008 Elsevier Inc. All rights reserved.
doi: 10.1016/j.jclinepi.2007.07.008

1. Introduction

At present, the evidence base for treatments in clinical pediatrics is, at best, weak [1]. Expectations on the effectiveness of most drug interventions in children are based on extrapolations from results of studies in adults. However, responses to drug treatments and other interventions in children are often unpredictably different from adults [1]. The need for clinical trials in children has been increasingly recognized by pediatricians and pediatric research groups from all over the world. A recently published policy statement points out the importance of performing randomized controlled trials on health care interventions in children [1]. Meanwhile, the European Union has taken measures to encourage pharmaceutical trials in children by legislation, which has been in effect since January 26, 2007 (http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:32006R1901:EN:NOT) [2]. As a result, an increase in the number of drug trials in children is expected.

These developments have sparked the debate on various design issues in pediatric clinical research, including the “optimal” number of patients to be included in these trials. This debate is relevant because the number of eligible children for pediatric trials is often small because many conditions are relatively uncommon in children, and recruitment of children into trials is challenging because the threshold for gaining consent is often high and complex. One potential solution is the application of techniques that minimize the number of subjects in a randomized clinical trial. In the present paper, we will present some readily available alternative approaches to conduct valid research while optimizing or minimizing the number of subjects to be included. The aim of this paper is to give an overview of several relatively unfamiliar design options that may be used to obtain
a statistically valid result while including a minimum number of subjects to expand the readers’ knowledge of alternative possibilities to consider for designing future pediatric trials. Some examples of recent pediatric trials that have used these approaches are presented. Although this paper is prompted by the challenges surrounding pediatric drug trials, the methods described here can be applied to any discipline performing clinical trials in small or vulnerable populations.

2. Optimal number of subjects in pediatric clinical trials

From a statistical point of view, the number of subjects in a trial should be as large as possible, thus increasing the precision of effect estimation of the intervention (i.e., narrower confidence intervals [CIs]) and enhancing the possibility of a justified rejection of the null hypothesis (usually stated as “There is no difference in the effect of the interventions”) in favor of the alternative hypothesis (“There is a difference in the effect of the interventions”). This probability of reaching a true positive conclusion is called the power of a study [3].

When designing any clinical trial, the power should be maximized. This can be achieved by maximizing the number of subjects included in the study. However, as discussed above, there are practical and ethical restraints to the inclusion of large numbers of subjects, especially in the pediatric population.

Contrary to the statistical viewpoint, from an ethical point of view the number of children participating in a trial should be as small as possible. Not only to protect participating children from receiving the less favorable intervention unnecessarily, when the information gathered is already sufficient to conclude which intervention is superior, but also to start treating patients not participating in the trial according to the evidently best intervention strategy as soon as possible.

It is essential to conduct trials in children, but how can the optimal number of participants be established given the aforementioned constraints? Sample size calculations offer a possibility to estimate the optimal number of patients needed to obtain a power of 80% or 90%. The problem of limited availability of subjects to be included in a trial has led to different strategies used by researchers to improve power. Several strategies are use of 1) composite or 2) surrogate outcomes, 3) the crossover design, 4) repeated measurements, 5) analysis of covariance instead of simple comparison of outcomes in two groups, 6) conducting an under-powered trial for a later meta-analysis, 7) the prospective meta-analysis approach, 8) use of one-sided instead of two-sided hypothesis testing, and 9) inflation of the minimal clinically relevant difference. The first five strategies have to do with truly improving the power; strategies #6 and #7 make use of meta-analysis techniques, strategy #8 changes the assumptions underlying the trial, and thus will lead to results that answer a somewhat different research question, and strategy #9 (mis)uses the fact that a minimal clinically relevant difference is arbitrary and that there is no standardized way to its determination [4]. All these strategies have their pros and cons, which have to be weighed carefully in the design phase of each trial. More detailed discussion of these approaches can be found in numerous articles and textbooks (e.g., [4-7]).

3. Alternative approaches minimizing the number of subjects to be included

3.1. Sequential designs

The principles of randomized clinical trials were originally derived from agricultural research, in which the statistical methods of analysis of variance and regression analysis were developed [8]. An essential difference between agricultural and clinical trials is the time period during which data are gathered. In an agricultural field trial, all crops are harvested simultaneously at the end of the growth season, whereas in clinical trials it usually takes weeks or months, sometimes even years before all subjects are included, and consequently the gathering of the outcome data is extended over a similarly long time period. Thus, it is possible that before the end of the inclusion phase of a clinical trial enough information has already been assembled, though usually not analyzed, to decide which intervention is superior. Because subjects will be included and randomized until the sample size that was determined in advance has been reached, this may lead to inefficiency, that is, inappropriate inclusion of subjects, unnecessarily prolonging the trial duration, and, more importantly, to allocation of trial subjects to the “inferior” intervention at a time when the evidence of its inferiority might be available. Sequential designs were developed to overcome these problems. Various sequential procedures exist, and they can be (roughly) divided into two types, namely those derived from the repeated significance test approach, also called group sequential designs, and those derived from the boundaries approach. Because there is no uniform terminology, we make a distinction between “design” and “analysis” [9]. Thus, within the boundaries design a distinction can be made between “continuous sequential analysis” and “group sequential analysis.” Recently, a modification of the group sequential design has been proposed, called the adaptive design. We will discuss the group sequential design, the boundaries design, and adaptive design in more detail below.

3.1.1. Group sequential design, repeated significance testing

In this approach, a series of conventional statistical analyses, so-called interim analyses, is carried out at various predetermined time points on the accumulating data. The basis for this methodology was laid by Armitage in the
1970s [10,11]. To ensure an overall type I error probability (α), the significance thresholds at the various time points are adjusted to allow for the repetition (α*). Various methods for α-adjustment have been proposed, either with the same α* for all analyses, Pocock’s method [12], or with different α* at each analysis, for example, O’Brien and Fleming [13], and Lan and DeMets [14]. In contrast to the earlier methods of Pocock and O’Brien and Fleming, the so-called “α-spending function” developed by Lan and DeMets [14] that characterizes the rate at which the α is spent and thus determines the α* at each interim analysis, only depends on the number of past and current interim analyses, and not on the number of future interim analyses. This approach enhances the flexibility of the design, because the number of (interim) analyses does not have to be predetermined. Further modifications included the possibility of stopping early when there is no relevant difference between the intervention groups (stopping for futility) [15,16]. Because, in general, the assumed effect size is too optimistic, the probability of early stopping is very low, depending on the type of α-spending that is used. Therefore, it is unrealistic to plan a trial that is too large to be conducted, in the hope that it can be stopped at an early stage.

An example of repeated significance testing is a controlled clinical trial to investigate whether the survival and neurological outcomes of pediatric patients requiring airway management in an out-of-hospital emergency situation differed between those treated with bag-valve-mask ventilation or with endotracheal intubation [17]. A group sequential design was used with an α correction according to O’Brien and Fleming to allow for early stopping if the outcome in one of the treatment groups would be much better or worse than that in the other group [13]. The trial was designed to have 80% power to detect an increase in “survival to hospital Emergency Room” from 5% to 10% with a two-sided α of 0.05, resulting in a required sample size of 800 patients, with three interim analyses after each 200 patients. The interim analyses did not result in stopping the trial, and the trial was continued until 830 patients were included. The final intention-to-treat analysis showed no significant difference between the two airway management strategies for both outcomes. The clinical interpretation of these results gave rise to some discussion [18-21]. However, this is outside the scope of this paper.

Another example, in which the effect size estimation appeared too conservative in retrospect, is a randomized clinical trial to investigate the difference in incidence of injection pain during intravenous induction of anesthesia in children between a new formulation (Etomidate-®Lipuro) and the existing standard of propofol with added lidocaine [22]. The required sample size, calculated based on an expected proportion of 25% in the propofol-lidocaine group and 5% in the Etomidate-®Lipuro group, α of 0.05 and power of 90%, was reported to be 110. On request of the Ethics Committee, an interim analysis was planned after the inclusion of 80 patients, with a nominal value α* = 0.02. The rationale for the interim analysis was to prevent unnecessary injection pain in children if there was a difference in pain incidence between the two groups. The underlying considerations for the timing and α* of the interim analysis were not reported. The study was ended when this interim analysis showed a significantly lower incidence of injection pain in the Etomidate-®Lipuro group (5%; 95% CI = 0.61%-16.9%) than in the propofol-lidocaine group (47.5%; 95% CI = 31.5%-63.9%) (P = 0.0007) [22].

3.1.2. Boundaries design

This approach relies on a graphical rule, where a V statistic, representing the amount of information gathered in the course of a trial, is plotted on the X-axis, and a Z statistic, representing the effect size, is plotted on the Y-axis. Prior to the start of the experiment, the boundaries are calculated based on the alternative hypothesis and the desired levels of the type I and II errors (α and β, respectively). Examples of boundaries are shown in Fig. 1. At each analysis, which can be done after each patient (continuous sequential analysis) or after a fixed or variable number of patients (group sequential analysis), the two statistics Z and V are calculated based on the data accumulated thus far, and plotted, creating a so-called sample path. This is illustrated in Fig. 2. Inclusion and randomization of subjects is continued as long as the sample path remains between the boundaries (continuation region). A conclusion is reached when a boundary is crossed. In case of a one-sided test, that is, investigating only whether the experimental treatment leads to a better outcome than the control treatment, a single pair of boundaries is plotted (Fig. 1a-c). Crossing of the upper boundary leads to rejection of the null hypothesis; crossing of the lower boundary leads to nonrejection of the null hypothesis. Two pairs of boundaries are plotted symmetrically around the horizontal axis in case of a two-sided test, in which both possibilities of better and worse outcomes of the experimental compared to the control treatment are investigated (Fig. 1d). Crossing of the upper or lower boundary leads to the conclusion that the experimental treatment is superior or inferior, respectively; crossing of the boundaries between both pairs of boundaries leads to nonrejection of the null hypothesis.

Tests of this type are descendants of the sequential probability ratio test (SPRT) of Wald [8,23]. Initially, these methods could only be used to compare a single proportion or mean to a hypothetical value, or for the comparison of two proportions (paired observations). These restrictions hampered a successful application of these types of tests. After modifications by Whitehead [24,25], comparison of two independent groups with respect to continuous, binomial (e.g., alive/dead), or censored outcomes (survival data) was possible, thus increasing the applicability of the methods.

In an SPRT, the boundaries are parallel (Fig. 1a), giving an infinite “continuation region.” Therefore, it is theoretically

Fig. 1. (a-c) Boundaries for a one-sided superiority trial with one-sided α = 0.05 and β = 0.10, using a) a single sequential probability ratio test (SPRT), b)
a single truncated SPRT with a maximal sample size of 600, c) a single triangular test. (d) Boundaries for a two-sided trial with two-sided α = 0.05 and
β = 0.10, using a double triangular test. All four trials were designed to detect a difference between means of 0.25 standard deviations. The vertical dotted
line represents the V value corresponding with the equivalent fixed sample size V_FIXED.
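
The quantities named in the Fig. 1 caption can be approximated with a short calculation. The sketch below is illustrative only and is not the software used for the figure. It assumes the usual normal approximation in which Z is distributed as N(θV, V), uses Wald's approximate thresholds for the SPRT, and takes the design values from the caption (one-sided α = 0.05, β = 0.10, standardized difference 0.25). The function names are hypothetical, and the conversion from V to a total sample size (n ≈ 4V) additionally assumes equal allocation and a known standard deviation of 1.

```python
from math import log
from statistics import NormalDist


def fixed_information(alpha: float, beta: float, theta_r: float) -> float:
    """Information V required by a one-sided fixed-sample test.

    Assumption: Z ~ N(theta * V, V), so the test has power 1 - beta at
    theta_r when theta_r * sqrt(V) = z_(1-alpha) + z_(1-beta).
    """
    z = NormalDist().inv_cdf
    return ((z(1 - alpha) + z(1 - beta)) / theta_r) ** 2


def sprt_boundaries(alpha: float, beta: float, theta_r: float):
    """Intercepts and common slope of the parallel SPRT boundaries in the (V, Z) plane.

    Assumption: Wald's approximate thresholds applied to the log-likelihood
    ratio theta_r * Z - 0.5 * theta_r**2 * V.
    """
    upper_intercept = log((1 - beta) / alpha) / theta_r
    lower_intercept = -log((1 - alpha) / beta) / theta_r
    slope = theta_r / 2
    return upper_intercept, lower_intercept, slope


def sprt_decision(z_stat: float, v: float, alpha: float, beta: float, theta_r: float) -> str:
    """Classify one point of the sample path against the SPRT boundaries."""
    upper, lower, slope = sprt_boundaries(alpha, beta, theta_r)
    if z_stat >= upper + slope * v:
        return "reject H0"
    if z_stat <= lower + slope * v:
        return "do not reject H0"
    return "continue"


if __name__ == "__main__":
    ALPHA, BETA, THETA_R = 0.05, 0.10, 0.25  # values from the Fig. 1 caption
    v_fixed = fixed_information(ALPHA, BETA, THETA_R)
    # n ~ 4V holds only under equal allocation and unit standard deviation.
    print(f"V_FIXED ~ {v_fixed:.1f}, total n ~ {4 * v_fixed:.0f}")
    print("SPRT (upper intercept, lower intercept, slope):",
          sprt_boundaries(ALPHA, BETA, THETA_R))
    print("Decision at Z = 5, V = 40:", sprt_decision(5.0, 40.0, ALPHA, BETA, THETA_R))
```

Because both SPRT boundaries share the same slope, the region between them never closes, which is the infinite continuation region referred to in the text. The truncated SPRT and the triangular test need different boundary formulas and are not covered by this sketch.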

possible that a trial requires an almost infinite value of V (and thus n), which renders this test impracticable for clinical trials. Alternatives that are more useful in clinical trials are the truncated SPRT (Fig. 1b) and the triangular test (TT) (Fig. 1c) [8,26]. In the truncated SPRT, a maximum value (L) of V is determined after which the trial stops, irrespective of the result. This truncation point L is chosen more or less arbitrarily, but it has to be beyond the amount of information V_FIXED required for the equivalent fixed sample design to correct for multiple hypothesis testing. The space between the parallel boundaries is influenced by the choice of L. In other words, the choice of L has consequences for the sample size. For a given α, β, and effect size, the boundaries are closer to each other with increasing L. In case of the TT (Fig. 1c-d) the boundaries are convergent, resulting in a finite “continuation region.” Therefore, the optimal properties of the SPRT and TT differ: the average sample size reduction is larger with the SPRT when the actual effect size is (much) larger than expected and smaller when the actual effect size is smaller than expected (Fig. 1a vs. 1c) [27]. When the actual and expected effect sizes are similar, the expected sample size reductions are about equal. In most trials, the actual effect size turns out to be smaller than expected. In those cases the TT is more efficient [27].

Fig. 2. One-sided superiority triangular test and sample path of trial of Bellissant et al. [30].

In contrast to the classical fixed sample size design, the eventual amount of information, that is, number of patients, needed to complete a trial is unknown at the start of a sequential trial. Although correction for multiple testing results in a larger maximal possible sample size (the amount of information which is represented by the apex of the triangle in Fig. 1c and d), the average sample size
needed to complete a trial using a sequential method in simulations was always smaller than that of the corresponding fixed design, irrespective of the effect size or power [27]. The 90th percentiles of the sample size distributions of the sequential designs were in the same order of magnitude as the corresponding fixed designs, when the actual effect was close to the expected effect, but larger when the actual effect was smaller than expected, especially for the SPRT.

In principle, Z and V can be calculated after each individual patient, but generally Z and V are calculated after the data of a number of patients have become available, that is, group sequential analysis. Given the possibility that intermediate points, if plotted, could have lain outside the triangular region during long gaps between inspections, and thus opportunities for stopping might have been missed, an adjustment of the stopping boundaries is made, resulting in a so-called Christmas tree shape (Fig. 2). After a boundary has been crossed, an adjusted point estimate and CI of the effect can be calculated with the computer program PEST or EaSt [28,29]. Due to the sequential nature of the analysis, the CIs are wider than those obtained with conventional fixed sample size methods of analysis.

Bellissant et al. described a TT to assess the efficacy of metoclopramide on gastroesophageal reflux in infants [30]. The trial was designed to detect a mean benefit on a continuous outcome scale of 0.5 with an expected standard deviation of 0.5 with 95% power and a one-sided α of 0.05. In a fixed design, 23 patients per treatment arm would have to be included. The authors anticipated that recruitment would be difficult and wanted to stop the study as soon as sufficient information was collected and decided, therefore, to use the TT. After 3 years and 9 months and inclusion of 39 children, the trial ended in futility, because the lower boundary was crossed (Fig. 2). The observed benefit of metoclopramide over placebo was approximately 0.2 instead of 0.5.

3.2. Adaptive or flexible design

Adaptive designs share a number of features with sequential designs, in which the null hypothesis is tested at a sequence of interim analyses [31]. However, in contrast, the design of an adaptive trial can be changed based on full knowledge gained from the interim analyses. When modifications are made, a new phase of the trial starts, and data accumulated in (an) earlier phase(s) are no longer combined with data from the new phase. All phases are analyzed separately, and the P-values of the different phases are then combined using a predefined rule. Examples of combination rules are the product criterion of Fisher [32] and the inverse normal method by Lehmacher [33]. The emphasis in these designs is more on flexibility of the design than on minimization of the average sample size.

Different adaptations are possible including reassessment of sample size (see critical reflection by [34]), selection of treatments (e.g., [35,36]), adaptation of endpoints (e.g., [37,38]), or inserting or deleting interim analyses. For a more extensive description of this design, see the papers by, for example, Bauer and Brannath, Posch et al., Chang et al., and Shen and Cheng [39-42]. However, many features, for example, definition of stopping rules [43] or interpretation of the results when primary endpoints have been changed, are still subject to debate [44]. The term “adaptive design” is a comprehensive term comprising many possible design adaptations [44]. It should not be confused with the more specific term “adaptive treatment allocation design,” where the allocation ratio can be adapted based on preliminary results from the trial [45]. To our knowledge no pediatric trials with an adaptive design have been published so far. Because it maximizes the efficiency of data gathering from individual patients, this approach deserves more attention [46].

4. Discussion

The basic approach of trying to minimize the number of subjects needed to obtain enough information either to decide which intervention is best, or to decide that the inclusion of patients in the trial can be stopped because of futility [47,48] is appealing. There is no simple general rule to decide which of the aforementioned approaches to reduce the necessary number of children to be included in a trial is most appropriate in specific circumstances. The trial about ventilation techniques in the emergency out-of-hospital setting in the example above might have been stopped earlier because of futility if a boundaries design had been used instead of the O’Brien and Fleming method [17,48]. Although on average the boundaries approach leads to inclusion of fewer subjects than the fixed sample size approach [24,27], there is no guarantee for any individual trial that this will be the case.

An important prerequisite for the boundaries approaches is that the time between inclusion and outcome measurement is short in comparison to the accrual rate. If many patients can be included in a short period, using effects measured in only a part of those patients to decide when to stop including patients makes no sense. The boundaries approaches can also be used for survival analysis. However, if the median survival time is long compared to the rate of inclusion, again, the analysis will be too late to make a sensible decision about ending the inclusion of patients. Although this does not lead to a reduction in sample size, it may lead to a reduction in total trial duration.

It is evident that in a sequential design, the boundaries are designed based on prior knowledge and assumptions, and that these boundaries should not be changed based on information gathered during the sequential analyses. Nevertheless, some mid-trial design reviews are possible [49].

In this paper, we have also discussed the adaptive designs, in which the assumptions are deliberately changed
during the trial, based on information gathered in a so-called internal pilot. The advantage of this type of design is that all information gathered is used for the analysis. For these designs in particular, the prospective recording of assumptions and adaptations “along the way” is of utmost importance for the credibility of the final conclusions of a report. This can be guaranteed if all randomized trials, including sequential and adaptive designs, are registered in a prospective trial register, with documentation of their approaches to sample size calculation. Furthermore, to preclude bias due to changing experimental conditions, it is essential that the data analysis is performed independently from the actual performance of the trial, that is, inclusion and treatment of patients, and assessment of the outcome. An independent data monitoring committee should decide about sample size adaptations, continuing or stopping the inclusion of subjects without giving any information on the details of the analysis to those involved in the trial performance. A decision to stop a trial should not be made light-heartedly. It will be much more difficult to defend future trials on the same intervention after one trial has been stopped early.

One of the obstacles for designing and publishing a sequential trial with a boundaries approach is probably that this approach is, so far, relatively unknown among researchers and journal editors. We suggest that, in addition to the CONSORT statement for randomized clinical trials [www.consort-statement.org], a list of criteria is developed to assess the validity of sequential trials using the boundaries approach, and to establish minimum criteria for reporting this type of trial.

An objection that was brought forward by an expert was that he worried about the external validity of a sequential trial, because the results obtained in such a small sample may be due to selection bias (Jan Tijssen, oral communication, April 2006). In our opinion, this argument could be used against all randomized controlled trials, not specifically those using the boundaries approach. Yet another objection might be that attempts to reduce the sample size in a trial also reduce the ability of detecting side effects or adverse effects. If side effects are expected to be a substantial issue, the trial should be powered in such a way that they can be reasonably evaluated. Apart from that, there are many side effects or adverse effects that are too infrequent to be detected even in large trials. Clinicians always have to be on the alert for possible unwanted effects.

In conclusion, the approaches to optimize the number of subjects in a trial described in this paper show that limitations in the available numbers of patients should not be accepted as a prime reason not to conduct a trial to answer a clinically relevant question. There are several possibilities to minimize the sample size necessary to yield a valid result. Nevertheless, it is important to remain critical and alert for false-positive results, especially when effect sizes are larger than expected and when vested (financial) interests may be at stake [50]. Replication of results in different settings will still be necessary. However, the rarity of many disorders in children and the ethical requirements in this patient population should not obstruct the performance of well-designed research to support clinical decision making.

References

[1] Caldwell PH, Murphy SB, Butow PN, Craig JC. Clinical trials in children. Lancet 2004;364:803-11.
[2] Clinical trials in children, for children. Lancet 2006;367:1953.
[3] Altman DG. Practical statistics for medical research. London: Chapman & Hall; 1991.
[4] Schulz KF, Grimes DA. Sample size calculations in randomised trials: mandatory and mystical. Lancet 2005;365:1348-53.
[5] Bland JM, Altman DG. One and two sided tests of significance. BMJ 1994;309:248.
[6] Friedman LM, Furberg CD, DeMets DL. Fundamentals of clinical trials. 3rd ed. New York: Springer-Verlag; 1998.
[7] Sackett DL, Cook DJ. Can we learn anything from small trials? Ann N Y Acad Sci 1993;703:25-31.
[8] Whitehead J. The design and analysis of sequential clinical trials. Chichester, UK: John Wiley and Sons; 1997.
[9] Sebille V, Bellissant E. Sequential methods and group sequential designs for comparative clinical trials. Fundam Clin Pharmacol 2003;17:505-16.
[10] Armitage P. Sequential methods in clinical trials. Am J Public Health 1958;48:1395-402.
[11] Armitage P. Sequential medical trials. Oxford: Blackwell Scientific Publications; 1975.
[12] Pocock SJ. Group sequential methods in the design and analysis of clinical trials. Biometrika 1977;64:191-9.
[13] O’Brien PC, Fleming TR. A multiple testing procedure for clinical trials. Biometrics 1979;35:549-56.
[14] Lan KKG, DeMets DL. Discrete sequential boundaries for clinical trials. Biometrika 1983;70:659-63.
[15] Emerson SS, Fleming TR. Symmetric group sequential test designs. Biometrics 1989;45:905-23.
[16] Pampallona S, Tsiatis AA. Group sequential designs for one-sided and two-sided hypothesis testing with provision for early stopping in favor of the null hypothesis. J Stat Plan Infer 1994;42:19-35.
[17] Gausche M, Lewis RJ, Stratton SJ, Haynes BE, Gunter CS, Goodrich SM, et al. Effect of out-of-hospital pediatric endotracheal intubation on survival and neurological outcome: a controlled clinical trial. JAMA 2000;283:783-90.
[18] Cristofani C. Out-of-hospital endotracheal intubation of children. JAMA 2000;283:2791.
[19] Eckstein M. Out-of-hospital endotracheal intubation of children. JAMA 2000;283:2790.
[20] Nieman C, Merlino J, Polk JD, Kovach B, Mancuso C, Fallon WF Jr. Out-of-hospital endotracheal intubation of children. JAMA 2000;283:2790-1.
[21] Sagel JS. Out-of-hospital endotracheal intubation of children. JAMA 2000;283:2791-2.
[22] Nyman Y, Von HK, Palm C, Eksborg S, Lonnqvist PA. Etomidate-Lipuro is associated with considerably less injection pain in children compared with propofol with added lidocaine. Br J Anaesth 2006;97:536-9.
[23] Wald A. Sequential analysis. New York: Wiley; 1947.
[24] Whitehead J, Jones DR. The analysis of sequential clinical trials. Biometrika 1979;66:443-52.
[25] Whitehead J, Stratton I. Group sequential clinical trials with triangular continuation regions. Biometrics 1983;39:227-36.
[26] Anderson TW. A modification of the sequential probability ratio test to reduce the sample size. Ann Math Stat 1960;31:165-97.
[27] Sebille V, Bellissant E. Comparison of four sequential methods allowing for early stopping of comparative clinical trials. Clin Sci (Lond) 2000;98:569-78.
[28] PEST 4.4 Operating manual. Reading, UK: The University of Reading; 2004.
[29] Cytel Software Corporation. EaSt: A software package for the design and interim monitoring of group sequential clinical trials. Cambridge, MA: Cytel Software Corporation; 1992.
[30] Bellissant E, Duhamel JF, Guillot M, Pariente-Khayat A, Olive G, Pons G. The triangular test to assess the efficacy of metoclopramide in gastroesophageal reflux. Clin Pharmacol Ther 1997;61:377-84.
[31] Wassmer G. Basic concepts of group sequential and adaptive group sequential test procedures. Stat Pap 2000;41:253-79.
[32] Bauer P, Kohne K. Evaluation of experiments with adaptive interim analyses. Biometrics 1994;50:1029-41.
[33] Lehmacher W, Wassmer G. Adaptive sample size calculations in group sequential trials. Biometrics 1999;55:1286-90.
[34] Jennison C, Turnbull BW. Mid-course sample size modification in clinical trials based on the observed treatment effect. Stat Med 2003;22:971-93.
[35] Hommel G. Adaptive modifications of hypotheses after an interim analysis. Biometrical J 2001;43:581-9.
[36] Kelly PJ, Stallard N, Todd S. An adaptive group sequential design for phase II/III clinical trials that select a single treatment from several. J Biopharm Stat 2005;15:641-58.
[37] Bauer P, Kieser M. Combining different phases in the development of medical treatments within a single trial. Stat Med 1999;18:1833-48.
[38] Kieser M, Bauer P, Lehmacher W. Inference on multiple endpoints in clinical trials with adaptive interim analyses. Biometrical J 1999;41:261-77.
[39] Bauer P, Brannath W. The advantages and disadvantages of adaptive designs for clinical trials. Drug Discov Today 2004;9:351-7.
[40] Chang M, Chow SC, Pong A. Adaptive design in clinical research: issues, opportunities, and recommendations. J Biopharm Stat 2006;16:299-309.
[41] Posch M, Koenig F, Branson M, Brannath W, Dunger-Baldauf C, Bauer P. Testing and estimation in flexible group sequential designs with adaptive treatment selection. Stat Med 2005;24:3697-714.
[42] Shen Y, Cheng Y. Adaptive design: estimation and inference with censored data in a semiparametric model. Biostatistics 2007;8:306-22.
[43] van Houwelingen HC. On “Bayesian monitoring”. J Clin Epidemiol 1999;52:713-4.
[44] Committee for medicinal products for human use (CHMP). Reflection paper on methodological issues in confirmatory clinical trials with flexible design and analysis plan. London: European Medicines Agency (EMEA); 2006.
[45] Coad DS, Ivanova A. The use of the triangular test with response-adaptive treatment allocation. Stat Med 2005;24:1483-93.
[46] Hirtz DG, Gilbert PR, Terrill CM, Buckman SY. Clinical trials in children - how are they implemented? Pediatr Neurol 2006;34:436-8.
[47] Whitehead J. Stopping rules for clinical trials. Control Clin Trials 2004;25:69-70.
[48] van der Tweel I, van Noord PA. Early stopping in clinical trials and epidemiologic studies for “futility”: conditional power versus sequential analysis. J Clin Epidemiol 2003;56:610-7.
[49] Whitehead J, Whitehead A, Todd S, Bolland K, Sooriyarachchi MR. Mid-trial design reviews for sequential clinical trials. Stat Med 2001;20:165-76.
[50] Montori VM, Devereaux PJ, Adhikari NK, Burns KE, Eggert CH, Briel M, et al. Randomized trials stopped early for benefit: a systematic review. JAMA 2005;294:2203-9.
