
Bayes Factors: What They Are and What They Are Not

Author(s): Michael Lavine and Mark J. Schervish


Source: The American Statistician, Vol. 53, No. 2 (May, 1999), pp. 119-122
Published by: Taylor & Francis, Ltd. on behalf of the American Statistical Association
Stable URL: http://www.jstor.org/stable/2685729
Bayes Factors: What They Are and What They Are Not
Michael LAVINE and Mark J. SCHERVISH

Michael Lavine is Associate Professor, Institute of Statistics and Decision Sciences, Duke University, Durham, NC 27708-0251 (Email: michael@stat.duke.edu). Mark J. Schervish is Professor, Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213.
Bayes factors have been offered by Bayesians as alternatives to P values (or significance probabilities) for testing hypotheses and for quantifying the degree to which observed data support or conflict with a hypothesis. In an earlier article, Schervish showed how the interpretation of P values as measures of support suffers a certain logical flaw. In this article, we show how Bayes factors suffer that same flaw. We investigate the source of that problem and consider what are the appropriate interpretations of Bayes factors.

KEY WORDS: Measure of support; P values.

1. INTRODUCTION

Consider tosses of a coin known to be either fair, two-headed, or two-tailed. There are six nontrivial hypotheses about θ, the probability of heads:

    H1: θ = 1,   H2: θ = 1/2,   H3: θ = 0,
    H4: θ ≠ 1,   H5: θ ≠ 1/2,   H6: θ ≠ 0.

Jeffreys (1960) introduced a class of statistics for testing hypotheses that are now commonly called Bayes factors. The Bayes factor for comparing a hypothesis H to its complement, the alternative A, is the ratio of the posterior odds in favor of H to the prior odds in favor of H.

To make this more precise, let Ω be the parameter space and let Ω_H ⊂ Ω be a proper subset. Let μ be a probability measure over Ω and, for each θ ∈ Ω, let f_{X|Θ}(·|θ) be the density function (or probability mass function) for some observable X given Θ = θ. The predictive density of X given H: Θ ∈ Ω_H is f_H(x), equal to the average of f_{X|Θ}(x|θ) with respect to μ restricted to Ω_H. Similarly, the predictive density of X given A: Θ ∉ Ω_H is f_A(x), equal to the average of f_{X|Θ}(x|θ) with respect to μ restricted to Ω_A (the complement of Ω_H). That is,

    f_H(x) = \frac{\int_{\Omega_H} f_{X|\Theta}(x \mid \theta)\, d\mu(\theta)}{\mu(\Omega_H)}

and

    f_A(x) = \frac{\int_{\Omega_A} f_{X|\Theta}(x \mid \theta)\, d\mu(\theta)}{\mu(\Omega_A)}.

If p is the prior probability that H is true, that is, p = μ(Ω_H), then the posterior odds in favor of H is the ratio p f_H(x)/[(1 - p) f_A(x)]. The Bayes factor is the ratio f_H(x)/f_A(x).

Example 1. Consider four tosses of the coin mentioned earlier, and suppose they all land heads. Let μ be the prior over the parameter space Ω = {0, 1/2, 1}, where a point in Ω gives the probability of heads. If the hypothesis of interest is H2: θ = 1/2, then

    f_{H2}(x) = \frac{1}{16}   and   f_{H5}(x) = \frac{\mu(\{1\})}{\mu(\{0, 1\})}.

The Bayes factor in favor of H2 is

    \frac{f_{H2}(x)}{f_{H5}(x)} = \frac{\mu(\{0, 1\})}{16\,\mu(\{1\})}.
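To make the definitions concrete, here is a minimal Python sketch (ours, not part of the original article) that computes f_H(x), f_A(x), and the Bayes factor over a finite parameter space and reproduces Example 1; the prior weights μ({0}) = μ({1}) = 0.25 and μ({1/2}) = 0.5 are an arbitrary illustrative choice.

```python
# Illustrative sketch: predictive densities and the Bayes factor for Example 1.
lik = {0.0: 0.0, 0.5: 0.5 ** 4, 1.0: 1.0}     # P(4 heads in 4 tosses | theta)
mu = {0.0: 0.25, 0.5: 0.50, 1.0: 0.25}        # assumed prior, for illustration only

def predictive(subset):
    """f_H(x): the likelihood averaged over the prior restricted to `subset`."""
    mass = sum(mu[t] for t in subset)
    return sum(mu[t] * lik[t] for t in subset) / mass

def bayes_factor(subset):
    """f_H(x) / f_A(x), with A the complement of H in the parameter space."""
    return predictive(subset) / predictive(set(mu) - subset)

bf_H2 = bayes_factor({0.5})                    # H2: theta = 1/2
print(bf_H2)                                   # 0.125 = mu({0,1}) / (16 * mu({1}))
```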
Suppose that a Bayesian observes data X = x and tests a hypothesis H using a loss function that says the cost of type II error is some constant b over the alternative and the cost of type I error is constant over the hypothesis and is c × b. The posterior expected cost of rejecting H is then cb Pr(H is true | X = x), while the posterior expected cost of accepting H is b(1 - Pr(H is true | X = x)). The formal Bayes rule is to reject H if the cost of rejecting is smaller than the cost of accepting. This simplifies to rejecting H if its posterior probability is less than 1/[1 + c], which is equivalent to rejecting H if the posterior odds in its favor are less than 1/c. This, in turn, is equivalent to rejecting H if the Bayes factor in favor of H is less than some constant k implicitly determined by c and the prior odds.
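To see the chain of equivalences numerically (a sketch of our own, with arbitrary values of c, the prior odds, and the Bayes factor): since posterior odds = prior odds × Bayes factor, the implied threshold is k = 1/(c × prior odds), and the three forms of the rejection rule agree.

```python
# Hypothetical numbers chosen for illustration only.
c = 4.0            # type I error costs c times the type II error cost b
prior_odds = 0.5   # prior odds in favor of H, i.e., p / (1 - p)
bayes_factor = 0.3 # f_H(x) / f_A(x) for the observed data

posterior_odds = prior_odds * bayes_factor      # 0.15
posterior_prob = posterior_odds / (1.0 + posterior_odds)

k = (1.0 / c) / prior_odds   # Bayes-factor threshold implied by c and the prior odds

# The three statements of the formal Bayes rule coincide:
reject_by_prob = posterior_prob < 1.0 / (1.0 + c)
reject_by_odds = posterior_odds < 1.0 / c
reject_by_bf   = bayes_factor < k
assert reject_by_prob == reject_by_odds == reject_by_bf
print(reject_by_prob)   # True here: posterior odds 0.15 < 1/c = 0.25
```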
It would seem then that a Bayesian could decline to specify prior odds, interpret the Bayes factor as "the weight of evidence from the data in favour of the ... model" (O'Hagan 1994, p. 191); "a summary of the evidence provided by the data in favor of one scientific theory ... as opposed to another" (Kass and Raftery 1995, p. 777); or the "'odds for H0 to H1 that are given by the data'" (Berger 1985, p. 146), and test a hypothesis "objectively" by rejecting H if the Bayes factor is less than some constant k. In fact, Schervish (1995, p. 221) said "The advantage of calculating a Bayes factor over the posterior odds ... is that one need not state a prior odds ..." and then (p. 283) that Bayes factors are "ways to quantify the degree of support for a hypothesis in a data set." Of course, as these authors clarified, such an interpretation is not strictly justified. While the Bayes factor does not depend on the prior odds, it does depend on "how the prior mass is spread out over the two hypotheses" (Berger 1985, p. 146). Nonetheless, it sometimes happens that the Bayes factor "will be relatively insensitive to reasonable choices" (Berger 1985, p. 146), and then a common opinion would be that "such an interpretation is reasonable" (Berger 1985, p. 147).

We show, by example, that such informal use of Bayes factors suffers a certain logical flaw that is not suffered by using the posterior odds to measure support. The removal of the prior odds from the posterior odds to produce the Bayes factor has consequences that affect the interpretation of the resulting ratio.

2. BAYES FACTORS ARE NOT MONOTONE IN THE HYPOTHESIS

Example 2. Consider once again the four coin tosses that all came up heads, let the parameter space be Ω = {0, 1/2, 1} (as in Example 1), and define a prior distribution μ by

    μ({1}) = .01,   μ({1/2}) = .98,   and   μ({0}) = .01.

The six predictive probabilities are

    f_{H1}(x) = 1,   f_{H2}(x) = .0625,   f_{H3}(x) = 0,
    f_{H4}(x) = .0619,   f_{H5}(x) = .5,   f_{H6}(x) = .072,

and the six nontrivial Bayes factors are

    f_{H1}(x)/f_{H4}(x) = 16.16,   f_{H2}(x)/f_{H5}(x) = .125,   f_{H3}(x)/f_{H6}(x) = 0,

and their inverses. Suppose that we use the Bayes factors to test the corresponding hypotheses. That is, we reject a hypothesis if the Bayes factor in its favor is less than some fixed number k. If we choose k ∈ (.0619, .125), then we reject H4 because 1/16.16 ≈ .0619 < k but accept H2 because .125 > k. That is, we face the apparent contradiction of accepting θ = .5 but rejecting θ ∈ {0, .5}. This problem does not arise if we choose to test the hypotheses by rejecting when the posterior odds is less than some number k'. The posterior odds in favor of H2 is never more than the posterior odds in favor of H4.
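Example 2 can be reproduced in a few lines (again our own illustrative sketch, not code from the article); the same numbers also show why the posterior odds do not suffer the problem.

```python
# Reproduce Example 2: four heads, prior (.01, .98, .01) on theta in {0, 1/2, 1}.
lik = {0.0: 0.0, 0.5: 0.5 ** 4, 1.0: 1.0}
mu = {0.0: 0.01, 0.5: 0.98, 1.0: 0.01}

def bayes_factor(subset):
    comp = set(mu) - subset
    f_H = sum(mu[t] * lik[t] for t in subset) / sum(mu[t] for t in subset)
    f_A = sum(mu[t] * lik[t] for t in comp) / sum(mu[t] for t in comp)
    return f_H / f_A

bf_H2 = bayes_factor({0.5})              # 0.125
bf_H4 = bayes_factor({0.0, 0.5})         # ~0.0619

k = 0.10                                  # any k in (.0619, .125) exposes the problem
print("H2:", "reject" if bf_H2 < k else "accept")    # accept theta = .5
print("H4:", "reject" if bf_H4 < k else "accept")    # reject theta in {0, .5}

# Posterior odds, by contrast, respect the nesting: odds(H2) <= odds(H4).
print((0.98 / 0.02) * bf_H2, (0.99 / 0.01) * bf_H4)  # 6.125 and 6.125 (equal here)
```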
In Example 2, we were testing two hypotheses, H2 and H4, such that H2 implies H4. Gabriel (1969) introduced a criterion for simultaneous tests of nested hypotheses. The tests of H2 and H4 are coherent if rejecting H4 entails rejecting H2. One typical use of a measure of support for hypotheses is to reject those hypotheses (that we want to test) that have small measures of support. We can translate the coherence condition into a requirement for any measure of support for hypotheses. Since any support for H2 must a fortiori be support for H4, the support for H2 must be no greater than the support for H4. Using the Bayes factor as a measure of support violates the coherence condition. Schervish (1996) showed that using P values as measures of support also violates the coherence condition. Examples of coherent measures are the posterior probability, the posterior odds, and various forms of the likelihood ratio test statistic

    LR(H) = \frac{\sup_{\theta \in \Omega_H} f_{X|\Theta}(x \mid \theta)}{\sup_{\theta \in \Omega} f_{X|\Theta}(x \mid \theta)}   and   LR'(H) = \frac{\sup_{\theta \in \Omega_H} f_{X|\Theta}(x \mid \theta)}{\sup_{\theta \in \Omega_A} f_{X|\Theta}(x \mid \theta)}.   (1)
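For contrast, a short sketch (ours) of the two statistics in (1) for the four-heads data: because a supremum over a set can only grow as the set grows, while the supremum over the complement can only shrink, both statistics are monotone in the hypothesis and cannot produce the contradiction seen in Example 2.

```python
# LR and LR' from (1), for four heads in four tosses and Omega = {0, 1/2, 1}.
lik = {0.0: 0.0, 0.5: 0.5 ** 4, 1.0: 1.0}     # the sup over a finite set is a max

def LR(subset):
    return max(lik[t] for t in subset) / max(lik.values())

def LR_prime(subset):
    comp = set(lik) - subset
    return max(lik[t] for t in subset) / max(lik[t] for t in comp)

print(LR({0.5}), LR({0.0, 0.5}))              # 0.0625 0.0625: never smaller for the larger set
print(LR_prime({0.5}), LR_prime({0.0, 0.5}))  # 0.0625 0.0625: likewise
```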

The nonmonotonicity (incoherence) of Bayes factors is actually very general. Suppose that there are three nonempty, disjoint, and exhaustive hypotheses H1, H2, and H3 as in Examples 1 and 2. Let H4 be the complement of H1 (the union of H2 and H3) as in the examples, so that H2 implies H4. Straightforward algebra shows that if f_{H3}(x) < min{f_{H2}(x), f_{H1}(x)}, then the Bayes factor in favor of H4 will be smaller than the Bayes factor in favor of H2 regardless of the prior probabilities of the three hypotheses H1, H2, and H3. For instance, the nonmonotonicity will occur in Example 2 no matter what one chooses for the (strictly positive) prior distribution μ. What happens is that the Bayes factor penalizes H4 for containing additional parameter values (those in H3) that make the observed data less likely than all of the other hypotheses under consideration. An applied example of this phenomenon was encountered by Olson (1997), who was comparing three modes of inheritance in the species Astilbe biternata. All three modes are represented by simple hypotheses concerning the distribution of the observable data. One hypothesis, H1, is called tetrasomic inheritance, while the other two hypotheses, H2 and H3 (those which happen to have the largest and smallest likelihoods, respectively), together form a meaningful category, disomic inheritance. The Bayes factor in favor of H2 will be larger than the Bayes factor in favor of H2 ∪ H3 no matter what strictly positive prior one places over the three hypotheses because H3 has the smallest likelihood.

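A quick numerical check of this claim (our own sketch): for randomly drawn strictly positive priors over the three simple hypotheses, the Bayes factor for H4 = H2 ∪ H3 always comes out below the Bayes factor for H2 whenever the likelihood under H3 is the smallest. The likelihood values below are arbitrary apart from satisfying that condition.

```python
# Check: if f_H3(x) < min(f_H1(x), f_H2(x)), then BF(H4) < BF(H2)
# for every strictly positive prior over the three simple hypotheses.
import random

random.seed(0)
f1, f2, f3 = 1.0, 0.0625, 0.01     # likelihoods under H1, H2, H3 (f3 is smallest)

for _ in range(10_000):
    p1, p2, p3 = (random.random() for _ in range(3))
    s = p1 + p2 + p3
    p1, p2, p3 = p1 / s, p2 / s, p3 / s           # strictly positive prior
    bf_H2 = f2 / ((p1 * f1 + p3 * f3) / (p1 + p3))
    f_H4 = (p2 * f2 + p3 * f3) / (p2 + p3)         # predictive density under H4 = H2 u H3
    bf_H4 = f_H4 / f1
    assert bf_H4 < bf_H2                           # never fails
print("no counterexample found")
```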
3. BAYES FACTORS ARE MEASURES OF CHANGE IN SUPPORT

The fact that Bayes factors are not coherent as measures of support does not mean that they are not useful summaries. It only means that one must be careful how one interprets them. What the Bayes factor actually measures is the change in the odds in favor of the hypothesis when going from the prior to the posterior. In fact, Bernardo and Smith (1994, p. 390) said "Intuitively, the Bayes factor provides a measure of whether the data x have increased or decreased the odds on Hi relative to Hj." In terms of log-odds, the posterior log-odds equals the prior log-odds plus the logarithm of the Bayes factor. So, for example, if one were to use log-odds to measure support (a coherent measure), then the logarithm of the Bayes factor would measure how much the data change the support for the hypothesis.

Testing hypotheses by comparing Bayes factors to prespecified standard levels (like 3 or 1/3 to stand for 3-to-1 for or 1-to-3 against) is similar to confusing Pr(A|B) with Pr(B|A). In Example 2, even though the Bayes factor f_{H4}(x)/f_{H1}(x) = .0619 is small, the posterior odds Pr[H4|x]/Pr[H1|x] = (.99/.01) × .0619 ≈ 6.13 is large and implies Pr[H4|x] ≈ .86. The small Bayes factor says that the data will lower the probability of H4 a large amount relative to where it starts (.99), but it does not imply that H4 is unlikely.
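The arithmetic of this last point, as a small sketch of our own: posterior odds are prior odds times the Bayes factor, so a Bayes factor well below 1 can still leave the hypothesis highly probable.

```python
import math

prior_odds_H4 = 0.99 / 0.01       # H4 starts out very probable
bf_H4_vs_H1 = 0.0619              # the small Bayes factor from Example 2

post_odds = prior_odds_H4 * bf_H4_vs_H1        # ~6.13
post_prob = post_odds / (1.0 + post_odds)      # ~0.86

# Equivalent statement on the log-odds scale (a coherent measure of support):
# posterior log-odds = prior log-odds + log(Bayes factor).
assert math.isclose(math.log(post_odds),
                    math.log(prior_odds_H4) + math.log(bf_H4_vs_H1))
print(round(post_odds, 2), round(post_prob, 2))   # 6.13 0.86
```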
4. WHY COHERENCE?

Is coherence a compelling criterion to require of a measure of support? Aside from the heuristic justification given earlier, there is a decision theoretic justification. As before, we assume that a typical application of a measure of support will be to reject hypotheses that have low support. Hence, we will justify coherence as a criterion for simultaneous tests. Consider the most general loss function L that is conducive to hypothesis testing. That is, let the action space have two points, 0 and 1, where 0 means accept H and 1 means reject H, and let H: Θ ∈ Ω_H be the hypothesis. We assume that L(θ, 0) > L(θ, 1) for all θ ∉ Ω_H and L(θ, 0) < L(θ, 1) for all θ ∈ Ω_H. This says that error is more costly than correct decision, but otherwise places no restrictions on the loss. Now, suppose that we have two hypotheses, H1: Θ ∈ Ω_1 and H2: Θ ∈ Ω_2, with corresponding loss functions L1 and L2 of the above form. For the simultaneous testing problem, we use loss L(θ, (a_1, a_2)) = L1(θ, a_1) + L2(θ, a_2), where a_i is the action for testing Hi for i = 1, 2. We impose one other condition, namely that for those parameters that are in both hypotheses or in both alternatives, the costs of error be the same in both testing problems. In symbols, this means that for all θ ∈ (Ω_1 ∩ Ω_2) ∪ (Ω_1^c ∩ Ω_2^c), we have L1(θ, 0) - L1(θ, 1) = L2(θ, 0) - L2(θ, 1). Under these conditions, we can prove two simple results. For non-Bayesians, incoherent tests are inadmissible. For Bayesians, incoherent tests are not formal Bayes rules.

Table 1. Partitions of the Set of Possible x Values by Two Pairs of Tests

               φ_2 = 0   φ_2 = 1                 ψ_2 = 0   ψ_2 = 1
    φ_1 = 0       F         C        ψ_1 = 0        F         ∅
    φ_1 = 1       E         D        ψ_1 = 1      C ∪ E       D

Suppose that Ω_1 ⊂ Ω_2 and let φ_i be a test of Hi. That is, φ_i(x) = 1 means reject Hi and φ_i(x) = 0 means accept Hi. The sample space is divided into four parts C, D, E, and F according to whether (φ_1(x), φ_2(x)) is (0,1), (1,1), (1,0), or (0,0), respectively. See the left side of Table 1. In particular, C = {x : (φ_1(x), φ_2(x)) = (0, 1)} is the set where we make the incoherent decision to reject H2 while accepting H1. Create another pair (ψ_1, ψ_2) of tests such that, for i = 1, 2, ψ_i(x) = φ_i(x) for all x ∉ C and ψ_i(x) = φ_{3-i}(x) for all x ∈ C. That is, ψ = (ψ_1, ψ_2) switches the two decisions when incoherence occurs in φ = (φ_1, φ_2). Then the right side of Table 1 gives the sets where ψ_1 and ψ_2 take various pairs of values. Suppose that there exists θ ∈ Ω_2 \ Ω_1 with P_θ(C) > 0. We can now show that ψ dominates φ and that it has smaller posterior risk. The risk functions of the two pairs of tests are

    R(\theta, \phi) = L_1(\theta, 0) P_\theta(C \cup F) + L_2(\theta, 0) P_\theta(E \cup F)
                      + L_1(\theta, 1) P_\theta(D \cup E) + L_2(\theta, 1) P_\theta(C \cup D),

    R(\theta, \psi) = L_1(\theta, 0) P_\theta(F) + L_2(\theta, 0) P_\theta(C \cup E \cup F)
                      + L_1(\theta, 1) P_\theta(C \cup D \cup E) + L_2(\theta, 1) P_\theta(D).

If we subtract these two we get R(θ, φ) - R(θ, ψ) = P_θ(C) g(θ), where

    g(\theta) = L_1(\theta, 0) - L_2(\theta, 0) - L_1(\theta, 1) + L_2(\theta, 1).

Our assumptions imply that g(θ) > 0 for all θ ∈ Ω_2 \ Ω_1 and it is 0 for all other θ. Since P_θ(C) > 0 for some θ ∈ Ω_2 \ Ω_1, φ is inadmissible. From the Bayesian perspective, if x ∈ C, the posterior risk of φ is ∫ [L1(θ, 0) + L2(θ, 1)] dμ_{Θ|X}(θ|x), where μ_{Θ|X} is the posterior distribution. The posterior risk of ψ is ∫ [L1(θ, 1) + L2(θ, 0)] dμ_{Θ|X}(θ|x). The difference between these two posterior risks is easily seen to equal the integral of g(θ) with respect to the posterior distribution. If x ∉ C, then the two rules make the same decision; hence, they have the same posterior risk. So long as the posterior risks are finite and Ω_2 \ Ω_1 has positive posterior probability, φ cannot be a formal Bayes rule.
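To see the construction in action, here is a toy numerical check of our own, not from the article. It uses a modified coin problem with θ ∈ {0.1, 0.5, 0.9} (so that the incoherence set C has positive probability under a parameter in Ω_2 \ Ω_1), 0-1 losses (which satisfy the matching-cost condition), and Bayes-factor tests sharing the threshold k = 0.15; it builds ψ by switching the two decisions on C and verifies the identity R(θ, φ) - R(θ, ψ) = P_θ(C) g(θ) and the resulting dominance.

```python
# Toy check of the dominance argument (our own illustration, not the article's setup):
# theta in {0.1, 0.5, 0.9}; Omega1 = {0.5} is nested in Omega2 = {0.1, 0.5}.
from math import comb

thetas = [0.1, 0.5, 0.9]
prior = {0.1: 0.01, 0.5: 0.98, 0.9: 0.01}
O1, O2 = {0.5}, {0.1, 0.5}                      # Omega1 subset of Omega2
xs = range(5)                                    # number of heads in 4 tosses

def p(x, t):                                     # binomial(4, t) pmf
    return comb(4, x) * t**x * (1 - t)**(4 - x)

def bf(x, subset):                               # Bayes factor for H: theta in subset
    comp = [t for t in thetas if t not in subset]
    f_H = sum(prior[t] * p(x, t) for t in subset) / sum(prior[t] for t in subset)
    f_A = sum(prior[t] * p(x, t) for t in comp) / sum(prior[t] for t in comp)
    return f_H / f_A

k = 0.15
phi = {x: (int(bf(x, O1) < k), int(bf(x, O2) < k)) for x in xs}
C = [x for x in xs if phi[x] == (0, 1)]          # incoherent decisions (here C = [4])
psi = {x: (phi[x][1], phi[x][0]) if x in C else phi[x] for x in xs}

def loss(t, a, subset):                          # 0-1 loss: 1 for a wrong decision
    return int(a != (0 if t in subset else 1))

def risk(rule, t):
    return sum(p(x, t) * (loss(t, rule[x][0], O1) + loss(t, rule[x][1], O2))
               for x in xs)

for t in thetas:
    g = loss(t, 0, O1) - loss(t, 0, O2) - loss(t, 1, O1) + loss(t, 1, O2)
    diff = risk(phi, t) - risk(psi, t)
    print(t, round(diff, 6), round(sum(p(x, t) for x in C) * g, 6))
# The two printed columns agree for every theta, and the difference is > 0
# at theta = 0.1 (in Omega2 \ Omega1), so psi dominates the incoherent phi.
```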
5. DISCUSSION

Coherence is a property of tests of two or more nested hypotheses considered jointly, but we can gain some insight into it by considering a single test on its own. When comparing two hypotheses it is useful to rephrase the question as: How well, relative to each other, do the hypotheses explain the data? In the case of comparing two simple hypotheses, there is wide agreement on how this should be done. As Berger (1985, p. 146) pointed out, the Bayes factor is the same as the likelihood ratio LR' from (1) in this case. Also, in the case of two simple hypotheses, the P value is just the probability in the tail of one of the distributions beyond the observed likelihood ratio, hence it is a monotone function of the Bayes factor. So, the Bayes factor and the P value really can measure the support that the data offer for one simple hypothesis relative to another, and in a way that is acceptable to Bayesians and non-Bayesians alike. One should also note that coherence is not an issue in the case of two simple hypotheses because there do not exist two nonempty distinct nested hypotheses with nonempty complements. On the other hand, as we noted at the end of Section 3, just because the data increase the support for a hypothesis H relative to its complement does not necessarily make H more likely than its complement; it only makes H more likely than it was a priori.

When at least one of the hypotheses is composite, interpretations are not so simple. One might choose either to maximize, to sum, or to average over composite hypotheses. Users of the likelihood ratio statistic maximize: they find the value of θ within each hypothesis that best explains the data. Users of posterior probabilities sum: the posterior probability of a hypothesis is the sum (or integral) of the posterior probabilities of all the θ's within it. Users of Bayes factors average: the Bayes factor is the ratio of f_{X|Θ}(x|θ) averaged with respect to the conditional prior given each hypothesis. But averaging has at least two potential drawbacks. First, it requires a prior to average with respect to, and second, it penalizes a hypothesis for containing values with small likelihood. As we noted at the end of Section 2, interpreting the Bayes factor as a measure of support is incoherent because of the second drawback.
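The three conventions can be put side by side for the coin example (our illustration, using the Example 2 prior): maximizing and summing never rank the larger hypothesis H4 below H2, while averaging, which is what the Bayes factor does, can.

```python
# Maximize (LR'), sum (posterior probability), and average (Bayes factor)
# for H2 = {1/2} versus H4 = {0, 1/2}, given four heads and the Example 2 prior.
lik = {0.0: 0.0, 0.5: 0.5 ** 4, 1.0: 1.0}
mu = {0.0: 0.01, 0.5: 0.98, 1.0: 0.01}
marginal = sum(mu[t] * lik[t] for t in mu)

def summaries(subset):
    comp = set(mu) - subset
    maximize = max(lik[t] for t in subset) / max(lik[t] for t in comp)
    post_prob = sum(mu[t] * lik[t] for t in subset) / marginal
    f_H = sum(mu[t] * lik[t] for t in subset) / sum(mu[t] for t in subset)
    f_A = sum(mu[t] * lik[t] for t in comp) / sum(mu[t] for t in comp)
    return round(maximize, 4), round(post_prob, 4), round(f_H / f_A, 4)

print("H2:", summaries({0.5}))        # (0.0625, 0.8596, 0.125)
print("H4:", summaries({0.0, 0.5}))   # (0.0625, 0.8596, 0.0619)
# Only the averaged summary (the Bayes factor) is smaller for the larger hypothesis.
```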
[Received January 1997. Revised September 1997.]

REFERENCES

Berger, J. O. (1985), Statistical Decision Theory and Bayesian Analysis (2nd ed.), New York: Springer-Verlag.

Bernardo, J., and Smith, A. F. M. (1994), Bayesian Theory, New York: Wiley.

Gabriel, K. R. (1969), "Simultaneous Test Procedures-Some Theory of Multiple Comparisons," Annals of Mathematical Statistics, 40, 224-250.

Jeffreys, H. (1960), Theory of Probability (3rd ed.), Oxford: Clarendon Press.

Kass, R., and Raftery, A. (1995), "Bayes Factors," Journal of the American Statistical Association, 90, 773-795.

O'Hagan, A. (1994), Kendall's Advanced Theory of Statistics, Vol. 2B: Bayesian Inference, Cambridge: University Press.

Olson, M. (1997), "Application of Bayesian Analyses to Discriminate Between Disomic and Tetrasomic Inheritance in Astilbe biternata," Technical report, Duke University, Department of Botany.

Schervish, M. J. (1995), Theory of Statistics, New York: Springer-Verlag.

Schervish, M. J. (1996), "P-values: What They Are and What They Are Not," The American Statistician, 50, 203-206.
