American Statistical Association is collaborating with JSTOR to digitize, preserve and extend access to The
American Statistician.
Approximate is Better than "Exact" for Interval Estimation of Binomial Proportions
Alan AGRESTI and Brent A. COULL
Table 1. Mean Coverage Probabilities of Nominal 95% Confidence Intervals for the Binomial Parameter p, with Root Mean Square Errors in Parentheses, for Sampling p from a Uniform Distribution

Method                        n = 5    n = 15   n = 30   n = 50   n = 100
Exact                         .990     .980     .973     .969     .965
                              (.041)   (.031)   (.026)   (.022)   (.017)
Score                         .955     .953     .952     .952     .951
                              (.029)   (.019)   (.014)   (.012)   (.008)
Wald                          .641     .819     .875     .901     .922
                              (.400)   (.238)   (.170)   (.133)   (.094)
Wald with t                   .664     .837     .886     .905     .926
                              (.391)   (.233)   (.167)   (.131)   (.093)
Mid-P                         .978     .964     .958     .955     .953
                              (.033)   (.021)   (.017)   (.013)   (.010)
Continuity-corrected score    .987     .979     .973     .969     .965
                              (.039)   (.030)   (.025)   (.021)   (.016)
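As a rough check on how entries such as those in Table 1 can be computed, the following sketch evaluates the uniform-weighted mean coverage for the Wald and score intervals. The function names, the value z = 1.96, and the closed-form integration through a binomial tail identity are choices made for this illustration rather than details taken from the article; the interval formulas are the standard Wald and (Wilson) score forms.

```python
from math import comb, sqrt

Z = 1.96  # assumed normal quantile for nominal 95% intervals


def wald(x, n, z=Z):
    """Standard Wald interval, clipped to [0, 1]."""
    p = x / n
    half = z * sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)


def score(x, n, z=Z):
    """Standard (Wilson) score interval, clipped to [0, 1]."""
    p = x / n
    center = (x + z * z / 2) / (n + z * z)
    half = z * sqrt(n * p * (1 - p) + z * z / 4) / (n + z * z)
    return max(0.0, center - half), min(1.0, center + half)


def binom_upper_tail(m, k, u):
    """P(Bin(m, u) >= k)."""
    return sum(comb(m, j) * u**j * (1 - u) ** (m - j) for j in range(k, m + 1))


def mean_coverage(interval, n):
    """Mean of the coverage probability C_n(p) when p ~ Uniform(0, 1).

    Uses the identity: the integral of C(n, x) p^x (1 - p)^(n - x) over
    [lo, hi] equals [T(hi) - T(lo)] / (n + 1), with T(u) = P(Bin(n + 1, u) >= x + 1).
    """
    total = 0.0
    for x in range(n + 1):
        lo, hi = interval(x, n)
        total += binom_upper_tail(n + 1, x + 1, hi) - binom_upper_tail(n + 1, x + 1, lo)
    return total / (n + 1)


for n in (5, 15, 30):
    print(n, round(mean_coverage(wald, n), 3), round(mean_coverage(score, n), 3))
```

For n = 5 this gives about .641 for the Wald interval and .955 for the score interval, in line with the first column of Table 1.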
120 General
in Section 4). The mean actual coverage probabilities for the Wald interval tend to be much too small. On the other hand, the exact interval is very conservative. For instance, for this method, $C_n$ = .990 when n = 5, .980 when n = 15, and .973 when n = 30. By contrast, $C_n$ for the score method is close to the nominal confidence level, even for n = 5 where it is .955. Figure 1, which plots $C_n$ as a function of n for the three interval estimators with the uniform and skewed beta weightings, illustrates their performance. Similar results were obtained with the bell-shaped weighting and using .90 nominal confidence coefficient, but are not reported here.

To describe how far actual coverage probabilities typically fall from the nominal confidence level, Table 1 also reports $\sqrt{\int_0^1 (C_n(p) - .95)^2\,dp}$, the uniform-weighted root mean squared error of those probabilities about that confidence level. These values indicate that the variability about the nominal level is much smaller for the score confidence interval than for the Wald or exact confidence intervals. The improved performance of the score method relative to the Wald method is no surprise and simply adds to other evidence of this type accumulated over the years (e.g., Ghosh 1979; Vollset 1993). Some readers, though, may be surprised at just how much better the score method does than the exact method. The exact interval remains quite conservative even for moderately large sample sizes when p tends to be near 0 or 1. The Wald interval is also especially inadequate when p is near 0 or 1, partly a consequence of using $\hat{p}$ as its midpoint when the binomial distribution is highly skewed.

Even though the score intervals tend to have considerably higher actual coverage probabilities than the Wald intervals, they are not necessarily wider. In fact, unless the sample proportions fall near 0 or 1, they are shorter. Direct comparison of the formulas for the two interval widths yields that the score interval is narrower than the Wald interval whenever $\hat{p}$ falls within $\sqrt{(n + z^2)/(8n + 4z^2)}$ of 1/2. In particular, since this term decreases in the limit toward $1/\sqrt{8}$ = .35 as n increases or |z| decreases, the score interval is narrower than the Wald interval whenever $\hat{p}$ falls in (.15, .85) for any n and any nominal confidence level. See Ghosh (1979) for additional results about the relative lengths of the two types of interval. This comparison has limited relevance, since the actual coverage probabilities of the two methods differ. We mention this, however, to stress that the inadequacy of the Wald approach is not that the intervals are too short.

For fixed n and p, the expected width of an interval estimator is a useful measure of its performance. Figure 2 illustrates the relative sizes of the expected widths for the nominal 95% Wald, score, and exact intervals by plotting them as a function of p, for n = 15. For small n, the score intervals tend to be much shorter than exact intervals. The narrowness of the Wald intervals as p approaches 0 or 1 reflects the fact that when x = 0 or n, that interval is degenerate at 0 or at 1. By contrast, when x = 0, the score interval is $[0, z^2/(n + z^2)] = [0, 3.84/(n + 3.84)]$ and the exact interval is $[0, 1 - (.025)^{1/n}]$, which is approximately $[0, -\log(.025)/n] = [0, 3.69/n]$; the latter shows an extension of the "rule of 3/n" (Jovanovic and Levy 1997) from the .95 upper confidence bound to .95 confidence limits.

Is anything sacrificed by using the score intervals? Well, since they are not "exact," they are not guaranteed to have coverage probabilities uniformly bounded below by the nominal confidence level, and their actual confidence coefficient (the infimum of such probabilities) is, in fact, well below it. Vollset's (1993) plots of the coverage probabilities as a function of p, for various methods, are illuminating for
Figure 1. Mean Coverage Probability as a Function of Sample Size for the Nominal 95% Exact (E), Score (S), and Wald (W) Intervals, When p has (a) a Uniform (0,1) Distribution and (b) a Beta Distribution with μ = .10 and σ = .05.
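The claim that the score interval is narrower than the Wald interval whenever the sample proportion lies within $\sqrt{(n + z^2)/(8n + 4z^2)}$ of 1/2 can be checked with a short derivation. The sketch below assumes the standard forms of the two intervals (their defining formulas are not reproduced in this part of the article), writes $a = z^2/n$ and $\hat q = 1 - \hat p$, and compares squared half-widths.

```latex
\begin{align*}
(\text{Wald half-width})^2 &= \frac{z^2\,\hat p\hat q}{n},
\qquad
(\text{score half-width})^2 = \frac{z^2\,(\hat p\hat q + a/4)}{n\,(1+a)^2}. \\[4pt]
\text{score narrower}
 &\iff \hat p\hat q + \frac{a}{4} < \hat p\hat q\,(1+a)^2
 \iff \frac{1}{4} < \hat p\hat q\,(2+a)
 \iff \hat p\hat q > \frac{n}{8n+4z^2}. \\[4pt]
\text{With } \hat p = \tfrac12 + d:\quad
 \tfrac14 - d^2 > \frac{n}{8n+4z^2}
 &\iff |d| < \sqrt{\frac{n+z^2}{8n+4z^2}} .
\end{align*}
```

The bound decreases in n and increases in |z|, and it tends to $\sqrt{1/8} \approx .354$ as n grows, which is where the (.15, .85) range comes from.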
one near 1, at which the actual coverage probability falls seriously below the nominal confidence level, and this badly affects the actual confidence coefficient. These regions get closer to 0 and to 1 as n increases. For n = 10 with nominal 95% confidence intervals, for instance, there is a minimum coverage of .835 at p = .018 and p = .982, whereas at n = 100, there is a minimum coverage of .838 at p = .002 and p = .998.

We now explain why this happens. There is a region of values [0, r) for p that falls in the score confidence interval only when X = 0. The upper bound r of this region is the lower endpoint of the confidence interval when X = 1, which for large n is approximately $(1 + z^2/2 - z\sqrt{4 + z^2}/2)/n$. The coverage probability just below r is approximately $P(X = 0) = [1 - (1 + z^2/2 - z\sqrt{4 + z^2}/2)/n]^n \approx \exp\{-(1 + z^2/2 - z\sqrt{4 + z^2}/2)\}$. The analogous remark applies for values of p near 1. This limiting coverage probability is .800 for nominal 90% intervals, .838 for 95% intervals, and .889 for 99% intervals. See Huwang (1995) for related remarks. In particular, the actual confidence level does not converge to the nominal level as n increases.

Though this may seem problematic, the portion of the [0, 1] parameter space over which the actual coverage proba-

Table 2. Proportion of Parameter Space for which (a) Nominal 95% Score Interval has Actual Coverage Probability Below .90; (b) Nominal 95% Score and Exact Intervals Have Actual Coverage Probabilities Between .93 and .97; (c) Actual Coverage Probability is Closer to .95 for Score Interval than Exact Interval

score interval is $(X + z^2/2)/(n + z^2) \approx (X + 2)/(n + 4)$) an instructor will not go far wrong in giving the following advice: "Add two successes and two failures and then use the Wald formula (1)." That is, this "adjusted Wald" interval uses the usual simple formula presented in such courses, but with (n + 4) trials and point estimate $\tilde{p} = (X + 2)/(n + 4)$.

The midpoint of this interval, $\tilde{p} = (X + 2)/(n + 4)$, is nearly identical to the midpoint of the 95% score interval. It is identical to the Bayes estimate (mean of the posterior distribution) for the beta prior distribution with parameters 2 and 2, which has mean .50 and standard deviation .224 and which shrinks the sample proportion toward .50 somewhat more than does the uniform prior. This simple adjustment to the ordinary Wald interval changes it from highly liberal to slightly conservative, on the average, and a bit more conservative than the score method. Figure 3 illustrates, showing the mean actual coverage probability $C_n$ for the nominal 95% Wald and adjusted Wald intervals as a function of n, for the uniform and skewed weightings of p. The adjusted Wald confidence interval behaves surprisingly well, even for very small sample sizes.

Figure 4 shows the actual coverage probabilities as a function of p for the Wald, adjusted Wald, and Clopper-Pearson exact intervals when n = 5 and n = 10. The improvement of the adjusted Wald interval over the ordinary Wald interval is dramatic. The adjusted Wald interval also has the advantage, relative to the score interval, of not having spikes with seriously low coverage near p = 0 and 1. This is because this interval's rather crude bounds contain 0 when X = 0 or 1 and contain 1 when X = n - 1 or n. For

Figure 3. Mean Coverage Probability as a Function of Sample Size for the Nominal 95% Wald (W) and Adjusted Wald (A) Intervals, When p has (a) a Uniform (0,1) Distribution and (b) a Beta Distribution with μ = .10 and σ = .05.
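The "add two successes and two failures" advice translates directly into code. The following is a minimal sketch: the function names and the default z = 1.96 are assumptions made for the example, and the endpoints are clipped to [0, 1].

```python
from math import sqrt


def adjusted_wald(x, n, z=1.96):
    """Wald formula applied to x + 2 successes in n + 4 trials."""
    n_adj = n + 4
    p_adj = (x + 2) / n_adj
    half = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half), min(1.0, p_adj + half)


def score_midpoint(x, n, z=1.96):
    """Midpoint of the score interval, (x + z^2/2) / (n + z^2)."""
    return (x + z * z / 2) / (n + z * z)


lo, hi = adjusted_wald(0, 10)
print(f"x = 0, n = 10: ({lo:.3f}, {hi:.3f})")
print(f"midpoints: adjusted Wald {2 / 14:.4f}, score {score_midpoint(0, 10):.4f}")
```

With x = 0 the ordinary Wald interval collapses to the single point 0, whereas the adjusted version still has positive width, and its midpoint stays close to that of the score interval.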
Figure 4. A Comparison of Coverage Probabilities for the Nominal 95% Wald, Adjusted Wald, and Exact Intervals.
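The dips in coverage of the score interval near p = 0 and 1, and the limiting value of its minimum coverage, can be reproduced numerically. The sketch below is illustrative only: the function names are hypothetical, the score interval is written in its standard (Wilson) form, and the normal quantiles come from Python's `statistics.NormalDist`.

```python
from math import comb, exp, sqrt
from statistics import NormalDist


def score(x, n, z):
    """Standard (Wilson) score interval."""
    center = (x + z * z / 2) / (n + z * z)
    half = z * sqrt(x * (n - x) / n + z * z / 4) / (n + z * z)
    return center - half, center + half


def coverage(interval, n, p, z):
    """Exact coverage probability at p: binomial mass of the x whose interval covers p."""
    total = 0.0
    for x in range(n + 1):
        lo, hi = interval(x, n, z)
        if lo <= p <= hi:
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


def limiting_min_coverage(level):
    """exp{-(1 + z^2/2 - z*sqrt(4 + z^2)/2)} for a two-sided nominal level."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return exp(-(1 + z * z / 2 - z * sqrt(4 + z * z) / 2))


z95 = NormalDist().inv_cdf(0.975)
r = score(1, 10, z95)[0]  # lower endpoint of the interval when X = 1, n = 10
print(round(coverage(score, 10, 0.999 * r, z95), 3))  # coverage just below r
for level in (0.90, 0.95, 0.99):
    print(level, round(limiting_min_coverage(level), 3))
```

Just below r only X = 0 yields an interval containing p, so the coverage there is P(X = 0), about .835 for n = 10; the limiting values come out near .800, .838, and .889 for the three nominal levels.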
Figure 5. A Comparison of Coverage Probabilities for the Nominal 95% Wald, Score, and Exact Intervals for a Poisson Mean.
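For a Poisson mean, as in Figure 5, the three intervals can be sketched as follows. Inverting z = (X - μ0)/sqrt(μ0) gives a quadratic in μ0 with roots X + z²/2 ± z·sqrt(X + z²/4), and inverting z = (X - μ0)/sqrt(X) gives X ± z·sqrt(X). The exact limits are found here by bisection on Poisson tail sums instead of chi-squared quantiles; that numerical route, the function names, the bracketing bound, and z = 1.96 are all assumptions of this sketch.

```python
from math import exp, sqrt


def poisson_cdf(x, mu):
    """P(Y <= x) for Y ~ Poisson(mu), by direct summation."""
    term = total = exp(-mu)
    for k in range(1, x + 1):
        term *= mu / k
        total += term
    return total


def solve(f, lo, hi, tol=1e-10):
    """Bisection root of a decreasing f with f(lo) > 0 > f(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def poisson_intervals(x, z=1.96, tail=0.025):
    """Wald, score, and exact intervals for a Poisson mean from one count x."""
    wald = (x - z * sqrt(x), x + z * sqrt(x))
    half = z * sqrt(x + z * z / 4)
    score = (x + z * z / 2 - half, x + z * z / 2 + half)
    bound = x + 10 * sqrt(x + 1) + 10  # generous bracket for the bisection
    # Upper limit: P(Y <= x | mu) = tail.
    upper = solve(lambda mu: poisson_cdf(x, mu) - tail, 0.0, bound)
    # Lower limit: P(Y >= x | mu) = tail, i.e. P(Y <= x - 1 | mu) = 1 - tail.
    lower = 0.0 if x == 0 else solve(
        lambda mu: poisson_cdf(x - 1, mu) - (1 - tail), 0.0, bound)
    return {"wald": wald, "score": score, "exact": (lower, upper)}


for name, (lo, hi) in poisson_intervals(0).items():
    print(f"{name:5s} ({lo:.3f}, {hi:.3f})")
```

For X = 0 the Wald interval is degenerate at 0, the score interval is (0, z²), and the exact upper limit is -log(.025), roughly 3.69, mirroring the binomial behavior at x = 0.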
never much less than) the nominal confidence level. Our evaluations agreed with this, and are also illustrated in Table 1. We feel this is a reasonable method to use, especially if one is concerned that p may be very close to 0 or 1. It is more complex computationally than the score and adjusted Wald intervals, but like those intervals it has the advantage of being shorter than the exact interval.

Yet another alternative method is a continuity-corrected version of the score interval, based on the normal continuity correction for the binomial. This interval approximates the Clopper-Pearson interval, however, and our evaluations and results in Vollset (1993, Fig. 2) suggest that it is often as conservative as the exact interval itself. Again, Table 1 illustrates, and we do not recommend this approach.

Finally, we mention two other methods that perform well. The confidence interval based on inverting the likelihood-ratio test is similar to the score interval in terms of how it compares with the exact interval, but it is more complex to construct. Not surprisingly, Bayesian confidence intervals with beta priors that are only weakly informative also perform well in a frequentist sense (see, e.g., Carlin and Louis 1996, pp. 117-123).

In deciding whether to use the score interval, some may be bothered by its poor coverage for values of p just below the lower boundary of the interval when X = 1 and just above the upper boundary of the interval when X = n - 1. One could then use an adapted version that replaces the lower endpoint by $-\log(1 - \alpha)/n$ when X = 1 and the upper endpoint by $1 + \log(1 - \alpha)/n$ when X = n - 1. (e.g., at $p = -\log(1 - \alpha)/n$, $P(X = 0) = [1 + \log(1 - \alpha)/n]^n \approx 1 - \alpha$.) This adaptation improves the minimum coverage considerably. For instance, the nominal 95% interval has minimum coverage probability converging to .895 for large n, which is the large-sample coverage probability at p just below the lower endpoint of the interval when X = 2.

5. CONCLUSION AND EXTENSIONS

The Clopper-Pearson interval has coverage probabilities bounded below by the nominal confidence level, but the typical coverage probability is much higher than that level. The score and adjusted Wald intervals can have coverage probabilities lower than the nominal confidence level, yet the typical coverage probability is close to that level. In forming a 95% confidence interval, is it better to use an approach that guarantees that the actual coverage probabilities are at least .95 yet typically achieves coverage probabilities of about .98 or .99, or an approach giving narrower intervals for which the actual coverage probability could be less than .95 but is usually quite close to .95? For most applications, we would prefer the latter. The score and adjusted Wald confidence intervals for p provide shorter intervals with actual coverage probability usually nearer the nominal confidence level. In particular, even though the score and adjusted Wald intervals leave something to be desired in terms of satisfying the usual technical definition of "95% confidence," the operational performance of those methods is better than the exact interval in terms of how most practitioners interpret that term.

Results similar to those in this article also hold in other discrete problems. For instance, similar comparisons apply for score, Wald, and exact confidence intervals for a Poisson parameter μ, based on an observation X from that distribution. Figure 5 illustrates, plotting the actual coverage probabilities when the nominal confidence level is .95. Here, the score interval for μ results from inverting the approximately normal test statistic $z = (X - \mu_0)/\sqrt{\mu_0}$, the Wald interval results from inverting $z = (X - \mu_0)/\sqrt{X}$, and the endpoints of the exact interval, $(1/2)(\chi^2_{2X,\,.025},\ \chi^2_{2(X+1),\,.975})$, result from equating tail sums of null Poisson probabilities to .025 (Garwood 1936; for n independent Poisson observations, $X_1, \ldots, X_n$, the same formulas apply if one lets $X = \sum X_i$ and $\mu = E(X) = nE(X_i)$). For another discrete example, see Mehta and Walsh (1992) for a comparison of exact with mid-P confidence intervals for odds ratios or for a common odds ratio in several 2 x 2 contingency tables.

Exact inference has an important place in statistical inference of discrete data, in particular for sparse contingency table problems for which large-sample chi-squared statistics are often unreliable. However, approximate results are sometimes more useful than exact results, because of the inherent conservativeness of exact methods.

[Received February 1997. Revised November 1997.]

REFERENCES

Agresti, A. (1996), An Introduction to Categorical Data Analysis, New York: Wiley.
Blyth, C. R., and Still, H. A. (1983), "Binomial Confidence Intervals," Journal of the American Statistical Association, 78, 108-116.
Böhning, D. (1994), "Better Approximate Confidence Intervals for a Binomial Parameter," Canadian Journal of Statistics, 22, 207-218.
Carlin, B. P., and Louis, T. A. (1996), Bayes and Empirical Bayes Methods for Data Analysis, London: Chapman and Hall.
Chen, H. (1990), "The Accuracy of Approximate Intervals for a Binomial Parameter," Journal of the American Statistical Association, 85, 514-518.
Clopper, C. J., and Pearson, E. S. (1934), "The Use of Confidence or Fiducial Limits Illustrated in the Case of the Binomial," Biometrika, 26, 404-413.
Duffy, D. E., and Santner, T. J. (1987), "Confidence Intervals for a Binomial Parameter Based on Multistage Tests," Biometrics, 43, 81-93.
Garwood, F. (1936), "Fiducial Limits for the Poisson Distribution," Biometrika, 28, 437-442.
Ghosh, B. K. (1979), "A Comparison of Some Approximate Confidence Intervals for the Binomial Parameter," Journal of the American Statistical Association, 74, 894-900.
Huwang, L. (1995), "A Note on the Accuracy of an Approximate Interval for the Binomial Parameter," Statistics & Probability Letters, 24, 177-180.
Jovanovic, B. D., and Levy, P. S. (1997), "A Look at the Rule of Three," The American Statistician, 51, 137-139.
Lancaster, H. O. (1961), "Significance Tests in Discrete Distributions," Journal of the American Statistical Association, 56, 223-234.
Laplace, P. S. (1812), Théorie Analytique des Probabilités, Paris: Courcier.
Leemis, L. M., and Trivedi, K. S. (1996), "A Comparison of Approximate Interval Estimators for the Bernoulli Parameter," The American Statistician, 50, 63-68.
Mehta, C. R., and Walsh, S. J. (1992), "Comparison of Exact, Mid-p, and Mantel-Haenszel Confidence Intervals for the Common Odds Ratio Across Several 2x2 Contingency Tables," The American Statistician,