2.4 Goodness of Fit Test: STAT 504

STAT 504
Analysis of Discrete Data
A goodness-of-fit test, in general, measures how well the observed data correspond to the fitted (assumed) model. We will use this concept throughout the course as a way of checking the model fit. As in linear regression, the goodness-of-fit test in essence compares the observed values to the expected (fitted or predicted) values.
A goodness-of-fit statistic tests the following hypothesis:
$H_0$: the model $M_0$ fits
vs.
$H_A$: the model $M_0$ does not fit (or, some other model $M_A$ fits)
Most often the observed data represent the fit of the saturated model, the most complex model possible with the given data. Thus, most often the alternative hypothesis ($H_A$) will represent the saturated model $M_A$, which fits perfectly because each observation has a separate parameter. Later in the course we will see that $M_A$ could be a model other than the saturated one. Let us now consider the simplest example of the goodness-of-fit test with categorical data.
In the setting for one-way tables, we measure how well an observed variable $X$ corresponds to a $\text{Mult}(n, \pi)$ model for some vector of cell probabilities, $\pi$. We will consider two cases:
1. when the vector $\pi$ is known, and
2. when the vector $\pi$ is unknown.
In other words, we assume that under the null hypothesis the data come from a $\text{Mult}(n, \pi)$ distribution, and we test whether that model fits against the fit of the saturated model. The rationale behind any model fitting is the assumption that a complex mechanism of data generation may be represented by a simpler model. The goodness-of-fit test is applied to corroborate our assumption.
Consider our Dice Example (/stat504/sites/onlinecourses.science.psu.edu.stat504/files/lesson02/dice_example.png) from the Introduction. We want to test the hypothesis that all six sides are equally probable; that is, we compare the observed frequencies to the assumed model: $X \sim \text{Mult}(n = 30, \pi_0 = (1/6, 1/6, 1/6, 1/6, 1/6, 1/6))$. You can think of this as simultaneously testing whether the probability in each cell is equal to a specified value, e.g.
$H_0$: $(\pi_1, \pi_2, \pi_3, \pi_4, \pi_5, \pi_6) = (1/6, 1/6, 1/6, 1/6, 1/6, 1/6)$
vs.
$H_A$: $(\pi_1, \pi_2, \pi_3, \pi_4, \pi_5, \pi_6) \ne (1/6, 1/6, 1/6, 1/6, 1/6, 1/6)$.
Most software packages already have built-in functions that will do this for you; see the next section for examples (https://onlinecourses.science.psu.edu/stat504/node/61) in SAS and R. Here is a step-by-step procedure to help you understand conceptually what this test does and what is going on behind these functions.
Step 1: If the vector $\pi$ is unknown, we need to estimate these unknown parameters and then proceed to Step 2. If the vector $\pi$ is known, proceed directly to Step 2.
Step 2: Calculate the estimated (fitted) cell probabilities $\hat{\pi}_j$ and the expected cell frequencies $E_j$ under $H_0$.
Step 3: Calculate the Pearson goodness-of-fit statistic, $X^2$, and/or the deviance statistic, $G^2$, and compare them to the appropriate chi-squared distributions to make a decision.
Step 4: If the decision is borderline or if the null hypothesis is rejected, further investigate which observations may be influential by looking, for example, at residuals (../node/62).
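The four steps above can be sketched in a few lines of R. This is a minimal illustration, not the course's own code: the counts in `observed` are hypothetical values for 30 rolls of a die (the actual dice data live in the linked image and are not reproduced here), and the variable names are chosen only for this example. Since $\pi$ is known under $H_0$, Step 1 is skipped.

```r
# Hypothetical counts for 30 rolls of a die (illustrative values only).
observed <- c(3, 7, 5, 10, 2, 3)
n <- sum(observed)
k <- length(observed)

# Steps 1-2: pi is known under H0, so the fitted probabilities are pi0
# and the expected counts are n * pi0.
pi0 <- rep(1 / 6, k)
expected <- n * pi0

# Step 3: Pearson and deviance statistics, each compared to a
# chi-squared distribution with k - 1 degrees of freedom.
X2 <- sum((observed - expected)^2 / expected)
G2 <- 2 * sum(observed * log(observed / expected))
df <- k - 1
p_X2 <- pchisq(X2, df, lower.tail = FALSE)
p_G2 <- pchisq(G2, df, lower.tail = FALSE)

# Step 4: Pearson residuals show which cells contribute most to X2.
pearson_resid <- (observed - expected) / sqrt(expected)

# Cross-check the Pearson statistic against R's built-in test.
builtin <- chisq.test(observed, p = pi0)

print(c(X2 = X2, G2 = G2, p_X2 = p_X2, p_G2 = p_G2))
print(round(pearson_resid, 2))
```

With these made-up counts neither statistic exceeds the 0.95 quantile of the chi-squared distribution with 5 degrees of freedom, so the fair-die model would not be rejected at the 0.05 level.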
Pearson and deviance test statistics
The Pearson goodness-of-fit statistic is
$$X^2 = \sum_{j=1}^{k} \frac{(X_j - n\hat{\pi}_j)^2}{n\hat{\pi}_j}$$
An easy way to remember it is
$$X^2 = \sum_{j} \frac{(O_j - E_j)^2}{E_j}$$
where $O_j = X_j$ is the observed count in cell $j$, and $E_j = E(X_j) = n\hat{\pi}_j$ is the expected count in cell $j$ under the assumption that the null hypothesis is true, i.e. the assumed model is a good one. Notice that $\hat{\pi}_j$ is the estimated (fitted) cell proportion $\pi_j$ under $H_0$.
The deviance statistic is
$$G^2 = 2 \sum_{j=1}^{k} X_j \log\left(\frac{X_j}{n\hat{\pi}_j}\right)$$
where "log" means natural logarithm. An easy way to remember it is
$$G^2 = 2 \sum_{j} O_j \log\left(\frac{O_j}{E_j}\right)$$
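Both formulas translate directly into small R helpers. The sketch below is only an illustration: the function names `pearson_X2` and `deviance_G2` are hypothetical, and the treatment of empty cells follows the usual convention that a cell with $O_j = 0$ contributes nothing to $G^2$ (taking $0 \log 0 = 0$), which the text above does not spell out.

```r
# Pearson statistic: sum over cells of (O - E)^2 / E.
pearson_X2 <- function(obs, prob) {
  expected <- sum(obs) * prob
  sum((obs - expected)^2 / expected)
}

# Deviance statistic: 2 * sum over cells of O * log(O / E).
# Assumption: cells with a zero count are skipped (0 * log 0 taken as 0),
# otherwise R would return NaN for those terms.
deviance_G2 <- function(obs, prob) {
  expected <- sum(obs) * prob
  pos <- obs > 0
  2 * sum(obs[pos] * log(obs[pos] / expected[pos]))
}

# Hypothetical two-cell table with one empty cell.
obs <- c(0, 10)
prob <- c(0.5, 0.5)
print(pearson_X2(obs, prob))   # (0 - 5)^2 / 5 + (10 - 5)^2 / 5 = 10
print(deviance_G2(obs, prob))  # 2 * 10 * log(10 / 5) = 20 * log(2)
```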
In some texts, $G^2$ is also called the likelihood-ratio test statistic, for comparing the likelihoods (http://onlinecourses.science.psu.edu/stat504/node/27) ($\ell_0$ and $\ell_1$) of two models, that is, comparing the loglikelihood under $H_0$ (i.e., the loglikelihood of the fitted model, $L_0$) and the loglikelihood under $H_A$ (i.e., the loglikelihood of the larger, less restricted, or saturated model, $L_1$): $G^2 = -2\log(\ell_0/\ell_1) = -2(L_0 - L_1)$. A common mistake in calculating $G^2$ is to leave out the factor of 2 at the front.
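The likelihood-ratio form can be checked numerically against the cell-by-cell formula. The sketch below assumes hypothetical counts and uses base R's `dmultinom` with `log = TRUE` for the multinomial loglikelihoods; the saturated model is fitted by the sample proportions. The multinomial coefficient is the same in both loglikelihoods, so it cancels in the difference.

```r
# Hypothetical counts (illustrative only); H0 is the fair-die model.
observed <- c(3, 7, 5, 10, 2, 3)
n <- sum(observed)
pi0 <- rep(1 / 6, length(observed))

# Saturated model: fitted probabilities are the sample proportions.
pi_sat <- observed / n

# Loglikelihoods under H0 (L0) and under the saturated model (L1).
L0 <- dmultinom(observed, prob = pi0, log = TRUE)
L1 <- dmultinom(observed, prob = pi_sat, log = TRUE)

# Likelihood-ratio form versus the direct cell-by-cell formula.
G2_lr <- -2 * (L0 - L1)
G2_direct <- 2 * sum(observed * log(observed / (n * pi0)))

print(c(G2_lr = G2_lr, G2_direct = G2_direct))
```

The two values agree up to floating-point error. Dropping the factor of 2 would halve the statistic and overstate the p-value.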
Note that $X^2$ and $G^2$ are both functions of the observed data $X$ and a vector of probabilities $\pi$. For this reason, we will sometimes write them as $X^2(x, \pi)$ and $G^2(x, \pi)$, respectively; when there is no ambiguity, however, we will simply use $X^2$ and $G^2$. We will be dealing with these statistics throughout the course in the analysis of 2-way and k-way tables, and when assessing the fit of log-linear and logistic regression models.
Testing the Goodness-of-Fit
$X^2$ and $G^2$ both measure how closely the model, in this case $\text{Mult}(n, \pi)$, "fits" the observed data.
If the sample proportions $p_j = X_j/n$ (i.e., the saturated model) are exactly equal to the model's $\pi_j$ for cells $j = 1, 2, \ldots, k$, then $O_j = E_j$ for all $j$, and both $X^2$ and $G^2$ will be zero. That is, the model fits perfectly.
If the sample proportions $p_j$ deviate from the $\hat{\pi}_j$'s computed under $H_0$, then $X^2$ and $G^2$ are both positive. Large values of $X^2$ and $G^2$ mean that the data do not agree well with the assumed/proposed model $M_0$.
How can we judge the sizes of $X^2$ and $G^2$?
The answer is provided by this result:
If $x$ is a realization of $X \sim \text{Mult}(n, \pi)$, then as $n$ becomes large, the sampling distributions of both $X^2(x, \pi)$ and $G^2(x, \pi)$ approach a chi-squared distribution (http://onlinecourses.science.psu.edu/stat504/node/23#chisquared) with df = $k - 1$, where $k$ = number of cells, $\chi^2_{k-1}$.
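A small simulation can make this large-sample result concrete. The sketch below is an illustration under assumed settings that do not come from the course: $n = 300$, six equally likely cells, 5000 replications, and an arbitrary seed. It draws multinomial samples under the null model and checks how often each statistic exceeds the 0.95 quantile of the chi-squared distribution with $k - 1$ degrees of freedom.

```r
set.seed(504)                       # arbitrary seed, for reproducibility only
n <- 300                            # assumed sample size
pi0 <- rep(1 / 6, 6)                # assumed null model
k <- length(pi0)
reps <- 5000                        # assumed number of replications

# Each column of sims is one multinomial sample of size n under H0.
sims <- rmultinom(reps, size = n, prob = pi0)
expected <- n * pi0                 # recycled down each column below

# Pearson and deviance statistics for every simulated sample.
X2 <- colSums((sims - expected)^2 / expected)
G2 <- 2 * colSums(ifelse(sims > 0, sims * log(sims / expected), 0))

# If the approximation is good, about 5% of the simulated statistics
# should exceed the 0.95 quantile, and their mean should be near k - 1.
crit <- qchisq(0.95, df = k - 1)
print(c(reject_X2 = mean(X2 > crit), reject_G2 = mean(G2 > crit)))
print(c(mean_X2 = mean(X2), mean_G2 = mean(G2), df = k - 1))
```

Rerunning with a much smaller `n` is a quick way to see the approximation degrade.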
This means that we can easily test a null hypothesis $H_0: \pi = \pi_0$ against the alternative $H_1: \pi \ne \pi_0$ for some prespecified vector $\pi_0$. An approximate $\alpha$-level test of $H_0$ versus $H_1$ is:
Reject $H_0$ if the computed $X^2(x, \pi_0)$ or $G^2(x, \pi_0)$ exceeds the theoretical value $\chi^2_{k-1}(1 - \alpha)$.
Here, $\chi^2_{k-1}(1 - \alpha)$ denotes the $(1 - \alpha)$th quantile of the $\chi^2_{k-1}$ distribution, the value for which the probability that a $\chi^2_{k-1}$ random variable is less than or equal to it is $1 - \alpha$. The p-value for this test is the area to the right of the computed $X^2$ or $G^2$ under the $\chi^2_{k-1}$ density curve. As a simple example, consider a chi-squared distribution with df = 10. Let's assume that a computed test statistic is $X^2 = 21$. For $\alpha = 0.05$, the theoretical value is 18.31, so $H_0$ is rejected.
(/stat504/sites/onlinecourses.science.psu.edu.stat504/files/lesson02/chisqdistributions.R)
Useful functions in SAS and R to remember for computing the p-values from the chi-square distribution are:
In R, p-value = 1 - pchisq(test statistic, df), e.g., 1 - pchisq(21, 10) = 0.021
In SAS, p-value = 1 - probchi(test statistic, df), e.g., 1 - probchi(21, 10) = 0.021
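For the df = 10 example, the critical value and the p-value can both be obtained in R as follows. This is a short sketch using base R's `qchisq` and `pchisq`; the `lower.tail = FALSE` form is an equivalent way of writing `1 - pchisq(...)` and is not taken from the course text.

```r
alpha <- 0.05
df <- 10
X2 <- 21                                         # computed test statistic

# Theoretical (critical) value: the (1 - alpha) quantile of chi-squared(df).
critical <- qchisq(1 - alpha, df)                # about 18.31

# p-value: area to the right of X2 under the chi-squared(df) density.
p_value <- pchisq(X2, df, lower.tail = FALSE)    # about 0.021
p_value_alt <- 1 - pchisq(X2, df)                # same value

reject <- X2 > critical                          # TRUE: reject H0 at the 0.05 level
print(c(critical = critical, p_value = p_value))
```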
You can quickly review the chi-squared distribution in Lesson 0 (https://onlinecourses.science.psu.edu/stat504/node/23), or check out http://www.statsoft.com/textbook/stathome.html and http://www.ruf.rice.edu/~lane/stat_sim/chisq_theor/index.html. The STATSOFT link also has brief reviews of many other statistical concepts and methods.
Here are a few more comments on this test.
When $n$ is large and the model is true, $X^2$ and $G^2$ tend to be approximately equal. For large samples, the results of the $X^2$ and $G^2$ tests will be essentially the same.
An old-fashioned rule of thumb is that the $\chi^2$ approximation for $X^2$ and $G^2$ works well provided that $n$ is large enough to have $E_j = n\pi_j \ge 5$ for every $j$. Nowadays, most agree that we can have $E_j < 5$ for some of the cells (say, 20% of them). Some of the $E_j$'s can be as small as 2, but none of them should fall below 1. If any $E_j$ does fall below 1, then the $\chi^2$ approximation isn't appropriate, and the test results are not reliable.
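The rule of thumb above is easy to check before running the test. The helper below is a hypothetical sketch: the name `check_expected`, its argument names, and the example probabilities are not from the course. The default thresholds simply encode the numbers quoted above (no expected count below 1, at most about 20% of cells below 5).

```r
# Hypothetical helper: checks expected counts against the rule of thumb.
check_expected <- function(n, prob, min_count = 1, small = 5, max_share = 0.20) {
  expected <- n * prob
  share_small <- mean(expected < small)   # fraction of cells with E_j < 5
  list(
    expected = expected,
    share_small = share_small,
    ok = all(expected >= min_count) && share_small <= max_share
  )
}

# Illustrative null probabilities for a six-cell table.
prob <- c(0.35, 0.25, 0.15, 0.10, 0.10, 0.05)

res_large <- check_expected(60, prob)   # expected counts 21, 15, 9, 6, 6, 3
res_small <- check_expected(15, prob)   # smallest expected count is 0.75

print(res_large$ok)   # one cell in six is below 5 and none is below 1
print(res_small$ok)   # a cell falls below 1, so the approximation is doubtful
```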
In practice, it's a good idea to compute both $X^2$ and $G^2$ to see if they lead to similar results. If the resulting p-values are close, then we can be fairly confident that the large-sample approximation is working well.
If it is apparent that one or more of the $E_j$'s are too small, we can sometimes get around the problem by collapsing or combining cells until all the $E_j$'s are large enough. But we can also perform a small-sample inference or exact inference. We will see more on this in Lesson 3 (https://onlinecourses.science.psu.edu/stat504/node/89). Please note that small-sample inference can be conservative for discrete distributions, that is, it may give a p-value larger than the true one (for more details see Agresti (2007), Sec. 1.4.3-1.4.5 and 2.6; Agresti (2013), Sec. 3.5, and for Bayesian inference Sec. 3.6).
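Both workarounds can be tried quickly in R. The sketch below uses hypothetical counts and null probabilities chosen so that some expected counts are small. Note that the second approach shown is a Monte Carlo p-value from `chisq.test(..., simulate.p.value = TRUE)`, which approximates the exact test by simulation and is not the exact inference treated in Lesson 3; the seed and the number of replicates `B` are arbitrary.

```r
# Hypothetical data: 30 observations in five cells (illustrative only).
observed <- c(8, 9, 6, 3, 4)
pi0 <- c(0.40, 0.30, 0.15, 0.10, 0.05)
expected <- sum(observed) * pi0          # 12, 9, 4.5, 3, 1.5

# Option 1: collapse the three sparse cells into one, then test as usual.
obs_collapsed <- c(observed[1:2], sum(observed[3:5]))
pi_collapsed <- c(pi0[1:2], sum(pi0[3:5]))
collapsed <- chisq.test(obs_collapsed, p = pi_collapsed)   # df = 2

# Option 2: keep all five cells and use a Monte Carlo p-value instead of
# the chi-squared approximation.
set.seed(1)                              # arbitrary seed
mc <- chisq.test(observed, p = pi0, simulate.p.value = TRUE, B = 10000)

print(sum(observed) * pi_collapsed)      # collapsed expected counts: 12, 9, 9
print(c(collapsed = collapsed$p.value, monte_carlo = mc$p.value))
```

Collapsing changes the hypothesis being tested (it no longer distinguishes the merged cells), so the two p-values are not expected to match.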
In most applications, we will reject the null hypothesis $X \sim \text{Mult}(n, \pi)$ for large values of $X^2$ or $G^2$. On rare occasions, however, we may want to reject the null hypothesis for unusually small values of $X^2$ or $G^2$. That is, we may want to define the p-value as $P(\chi^2_{k-1} \le X^2)$ or $P(\chi^2_{k-1} \le G^2)$. Very small values of $X^2$ or $G^2$ suggest that the model fits the data too well, i.e. the data may have been fabricated or altered in some way to fit the model closely. This is how R. A. Fisher figured out that some of Mendel's experimental data must have been fraudulent (e.g., see Agresti (2007), page 327; Agresti (2013), page 19).
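The left-tail p-value is simply the chi-squared distribution function evaluated at the statistic. The sketch below uses hypothetical counts that sit suspiciously close to the fair-die expectations; they are made up for illustration and are not Mendel's data.

```r
# Hypothetical counts for 30 rolls, almost exactly uniform (illustrative only).
observed <- c(5, 5, 5, 5, 4, 6)
expected <- sum(observed) * rep(1 / 6, 6)
df <- length(observed) - 1

X2 <- sum((observed - expected)^2 / expected)    # (1 + 1) / 5 = 0.4

# Usual right-tail p-value, and the left-tail version P(chi2_{k-1} <= X2)
# used when asking whether the fit is "too good".
p_right <- pchisq(X2, df, lower.tail = FALSE)
p_left <- pchisq(X2, df)

print(c(X2 = X2, p_left = p_left, p_right = p_right))
```

Here the left-tail p-value is below 0.01, meaning agreement this close to the model would rarely arise by chance from a fair die.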