
COMPLETE BUSINESS STATISTICS
by AMIR D. ACZEL & JAYAVEL SOUNDERPANDIAN
7th edition

Prepared by Lloyd Jaisingh, Morehead State University

Chapter 9
Analysis of Variance

McGraw-Hill/Irwin Copyright © 2009 by The McGraw-Hill Companies, Inc. All


9- 2

9 Analysis of Variance
• Using Statistics
• The Hypothesis Test of Analysis of Variance
• The Theory and Computations of ANOVA
• The ANOVA Table and Examples
• Further Analysis
• Models, Factors, and Designs
• Two-Way Analysis of Variance
• Blocking Designs
9- 3

9 LEARNING OBJECTIVES
After studying this chapter you should be able to:
• Explain the purpose of ANOVA
• Describe the model and computations behind ANOVA
• Explain the test statistic F
• Conduct a one-way ANOVA
• Report ANOVA results in an ANOVA table
• Apply a Tukey test for pairwise analysis
• Conduct a two-way ANOVA
• Explain blocking designs
• Apply templates to conduct one-way and two-way ANOVA
9- 4

9-1 Using Statistics

• ANOVA (ANalysis Of VAriance) is a statistical method for determining the existence of differences among several population means.
  ▪ ANOVA is designed to detect differences among means from populations subject to different treatments.
  ▪ ANOVA is a joint test: the equality of several population means is tested simultaneously or jointly.
  ▪ ANOVA tests for the equality of several population means by looking at two estimators of the population variance (hence, analysis of variance).
9- 5

9-2 The Hypothesis Test of Analysis of Variance

• In an analysis of variance:
  ▪ We have r independent random samples, each one corresponding to a population subject to a different treatment.
  ▪ We have:
    ❖ $n = n_1 + n_2 + n_3 + \cdots + n_r$ total observations.
    ❖ r sample means: $\bar{x}_1, \bar{x}_2, \bar{x}_3, \ldots, \bar{x}_r$
      ▪ These r sample means can be used to calculate an estimator of the population variance. If the population means are equal, we expect the variance among the sample means to be small.
    ❖ r sample variances: $s_1^2, s_2^2, s_3^2, \ldots, s_r^2$
      ▪ These sample variances can be used to find a pooled estimator of the population variance.
9- 6

9-2 The Hypothesis Test of Analysis of Variance (continued): Assumptions

• We assume independent random sampling from each of the r populations.
• We assume that the r populations under study:
  ▪ are normally distributed,
  ▪ with means $\mu_i$ that may or may not be equal,
  ▪ but with equal variances, $\sigma_i^2 = \sigma^2$.

[Figure: normal density curves for Population 1, Population 2, ..., Population r, with means $\mu_1, \mu_2, \ldots, \mu_r$]
9- 7

9-2 The Hypothesis Test of Analysis of Variance (continued)

The hypothesis test of analysis of variance:

$$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \cdots = \mu_r$$
$$H_1: \text{Not all } \mu_i \ (i = 1, \ldots, r) \text{ are equal}$$

The test statistic of analysis of variance:

$$F_{(r-1,\,n-r)} = \frac{\text{Estimate of variance based on means from } r \text{ samples}}{\text{Estimate of variance based on all sample observations}}$$

That is, the test statistic in an analysis of variance is based on the ratio of two estimators of a population variance, and is therefore based on the F distribution, with (r-1) degrees of freedom in the numerator and (n-r) degrees of freedom in the denominator.
9- 8

When the Null Hypothesis Is True

When the null hypothesis is true:

$$H_0: \mu_1 = \mu_2 = \mu_3$$

We would expect the sample means to be nearly equal, as in this illustration. And we would expect the variation among the sample means (between sample) to be small, relative to the variation found around the individual sample means (within sample).

[Figure: three samples whose sample means $\bar{x}$ are nearly equal]

If the null hypothesis is true, the numerator in the test statistic is expected to be small, relative to the denominator:

$$F_{(r-1,\,n-r)} = \frac{\text{Estimate of variance based on means from } r \text{ samples}}{\text{Estimate of variance based on all sample observations}}$$
9- 9

When the Null Hypothesis Is False

When the null hypothesis is false:
  ▪ $\mu_1$ is equal to $\mu_2$ but not to $\mu_3$,
  ▪ $\mu_2$ is equal to $\mu_3$ but not to $\mu_1$,
  ▪ $\mu_1$ is equal to $\mu_3$ but not to $\mu_2$, or
  ▪ $\mu_1$, $\mu_2$, and $\mu_3$ are all unequal.

[Figure: three samples whose sample means $\bar{x}$ are clearly separated]

In any of these situations, we would not expect the sample means to all be nearly equal. We would expect the variation among the sample means (between sample) to be large, relative to the variation around the individual sample means (within sample).

If the null hypothesis is false, the numerator in the test statistic is expected to be large, relative to the denominator:

$$F_{(r-1,\,n-r)} = \frac{\text{Estimate of variance based on means from } r \text{ samples}}{\text{Estimate of variance based on all sample observations}}$$
9- 10

The ANOVA Test Statistic for r = 4 Populations and n = 54 Total Sample Observations

• Suppose we have 4 populations, from each of which we draw an independent random sample, with $n_1 + n_2 + n_3 + n_4 = 54$. Then our test statistic is:

$$F_{(4-1,\,54-4)} = F_{(3,\,50)} = \frac{\text{Estimate of variance based on means from 4 samples}}{\text{Estimate of variance based on all 54 sample observations}}$$

[Figure: F distribution with 3 and 50 degrees of freedom; density f(F) plotted against F, with the right-tail area α = 0.05 beyond the critical point 2.79]

The nonrejection region (for α = 0.05) in this instance is F ≤ 2.79, and the rejection region is F > 2.79. If the test statistic is less than 2.79 we would not reject the null hypothesis, and we would have no evidence that the 4 population means differ. If the test statistic is greater than 2.79, we would reject the null hypothesis and conclude that the four population means are not all equal.
9- 11

Example 9-1

Randomly chosen groups of customers were served different types of coffee and asked to rate the coffee on a scale of 0 to 100: 21 were served pure Brazilian coffee, 20 were served pure Colombian coffee, and 22 were served pure African-grown coffee.

The resulting test statistic was F = 2.02.

$$H_0: \mu_1 = \mu_2 = \mu_3$$
$$H_1: \text{Not all three means are equal}$$

$n_1 = 21$, $n_2 = 20$, $n_3 = 22$, $n = 21 + 20 + 22 = 63$, $r = 3$

The critical point for α = 0.05 is:

$$F_{(r-1,\,n-r)} = F_{(3-1,\,63-3)} = F_{(2,\,60)} = 3.15$$

$$F = 2.02 < F_{(2,\,60)} = 3.15$$

[Figure: F distribution with 2 and 60 degrees of freedom; the test statistic 2.02 lies to the left of the critical point F(2,60) = 3.15, which cuts off a right-tail area of α = 0.05]

H0 cannot be rejected, and we cannot conclude that any of the population means differs significantly from the others.
9- 12

9-3 The Theory and the Computations of ANOVA: The Grand Mean

The grand mean, $\bar{\bar{x}}$, is the mean of all $n = n_1 + n_2 + n_3 + \cdots + n_r$ observations in all r samples.

The mean of sample i (i = 1, 2, 3, ..., r):

$$\bar{x}_i = \frac{\sum_{j=1}^{n_i} x_{ij}}{n_i}$$

The grand mean, the mean of all data points:

$$\bar{\bar{x}} = \frac{\sum_{i=1}^{r} \sum_{j=1}^{n_i} x_{ij}}{n} = \frac{\sum_{i=1}^{r} n_i \bar{x}_i}{n}$$

where $x_{ij}$ is the particular data point in position j within the sample from population i. The subscript i denotes the population, or treatment, and runs from 1 to r. The subscript j denotes the data point within the sample from population i; thus, j runs from 1 to $n_i$.
9- 13

Using the Grand Mean: Table 9-1

Treatment (i)        Sample point (j)    Value ($x_{ij}$)
i = 1  Triangle      1                   4
       Triangle      2                   5
       Triangle      3                   7
       Triangle      4                   8
       Mean of Triangles                 6
i = 2  Square        1                   10
       Square        2                   11
       Square        3                   12
       Square        4                   13
       Mean of Squares                   11.5
i = 3  Circle        1                   1
       Circle        2                   2
       Circle        3                   3
       Mean of Circles                   2
Grand mean of all data points            6.909

[Figure: the data points on an axis from 0 to 10 and beyond, with sample means $\bar{x}_1 = 6$, $\bar{x}_2 = 11.5$, $\bar{x}_3 = 2$ and grand mean $\bar{\bar{x}} = 6.909$. Legend: distance from data point to its sample mean; distance from sample mean to grand mean]

If the r population means are different (that is, at least two of the population means are not equal), then it is likely that the variation of the data points about their respective sample means (within sample variation) will be small relative to the variation of the r sample means about the grand mean (between sample variation).
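
The sample means and the grand mean reported in Table 9-1 can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the dictionary `samples` and the helper names `sample_means` and `grand_mean` are illustrative choices and do not come from the slides.

```python
# Table 9-1 data: one list of observations per treatment.
samples = {
    "Triangle": [4, 5, 7, 8],
    "Square": [10, 11, 12, 13],
    "Circle": [1, 2, 3],
}


def sample_means(groups):
    """Mean of each sample: sum of x_ij over j, divided by n_i."""
    return {name: sum(xs) / len(xs) for name, xs in groups.items()}


def grand_mean(groups):
    """Mean of all n data points pooled across the r samples."""
    n = sum(len(xs) for xs in groups.values())
    return sum(sum(xs) for xs in groups.values()) / n


means = sample_means(samples)
x_grand = grand_mean(samples)

# Equivalent weighted form: sum of n_i * mean_i, divided by n.
n_total = sum(len(xs) for xs in samples.values())
weighted = sum(len(xs) * means[name] for name, xs in samples.items()) / n_total

print(means)
print(round(x_grand, 3), round(weighted, 3))
```

Both forms of the grand mean agree because weighting each sample mean by its sample size recovers the sum of all observations.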
9- 14

The Theory and Computations of ANOVA: Error Deviation and Treatment Deviation

We define an error deviation as the difference between a data point and its sample mean. Errors are denoted by e, and we have:

$$e_{ij} = x_{ij} - \bar{x}_i$$

We define a treatment deviation as the deviation of a sample mean from the grand mean. Treatment deviations, $t_i$, are given by:

$$t_i = \bar{x}_i - \bar{\bar{x}}$$

The ANOVA principle says:
When the population means are not equal, the "average" error (within sample) is relatively small compared with the "average" treatment (between sample) deviation.
9- 15

The Theory and Computations of ANOVA: The Total Deviation

The total deviation ($Tot_{ij}$) is the difference between a data point ($x_{ij}$) and the grand mean ($\bar{\bar{x}}$):

$$Tot_{ij} = x_{ij} - \bar{\bar{x}}$$

For any data point $x_{ij}$:

$$Tot_{ij} = t_i + e_{ij}$$

That is:
Total Deviation = Treatment Deviation + Error Deviation

Consider data point $x_{24} = 13$ from Table 9-1. The mean of sample 2 is 11.5, and the grand mean is 6.909, so:

$$e_{24} = x_{24} - \bar{x}_2 = 13 - 11.5 = 1.5$$
$$t_2 = \bar{x}_2 - \bar{\bar{x}} = 11.5 - 6.909 = 4.591$$
$$Tot_{24} = t_2 + e_{24} = 4.591 + 1.5 = 6.091$$
or
$$Tot_{24} = x_{24} - \bar{\bar{x}} = 13 - 6.909 = 6.091$$

[Figure: on an axis from 0 to 10 and beyond, the point $x_{24} = 13$, the sample mean $\bar{x}_2 = 11.5$ and the grand mean $\bar{\bar{x}} = 6.909$, with the error deviation 1.5, the treatment deviation 4.591 and the total deviation 6.091 marked]
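
The worked example for $x_{24}$ can be checked numerically. The sketch below assumes the Table 9-1 values and uses a hypothetical helper, `deviations`, that returns the treatment, error and total deviations for any data point.

```python
# Table 9-1 data (values taken from the slides).
samples = [
    [4, 5, 7, 8],      # sample 1: triangles
    [10, 11, 12, 13],  # sample 2: squares
    [1, 2, 3],         # sample 3: circles
]


def deviations(groups, i, j):
    """Return (treatment, error, total) deviations for x_ij (1-based i and j)."""
    all_points = [x for g in groups for x in g]
    grand = sum(all_points) / len(all_points)
    group = groups[i - 1]
    group_mean = sum(group) / len(group)
    x = group[j - 1]
    t = group_mean - grand  # treatment deviation t_i
    e = x - group_mean      # error deviation e_ij
    tot = x - grand         # total deviation Tot_ij
    return t, e, tot


t_2, e_24, tot_24 = deviations(samples, 2, 4)
print(round(t_2, 3), round(e_24, 3), round(tot_24, 3))  # 4.591 1.5 6.091
```

The same function can be called for any other (i, j) pair to confirm that the total deviation always equals the sum of the other two.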
9- 16

The Theory and Computations of ANOVA: Squared Deviations

Total Deviation = Treatment Deviation + Error Deviation

The total deviation is the sum of the treatment deviation and the error deviation:

$$t_i + e_{ij} = (\bar{x}_i - \bar{\bar{x}}) + (x_{ij} - \bar{x}_i) = (x_{ij} - \bar{\bar{x}}) = Tot_{ij}$$

Notice that the sample mean term ($\bar{x}_i$) cancels out in the above addition, which simplifies the equation.

Squared Deviations

$$t_i^2 + e_{ij}^2 = (\bar{x}_i - \bar{\bar{x}})^2 + (x_{ij} - \bar{x}_i)^2$$
$$Tot_{ij}^2 = (x_{ij} - \bar{\bar{x}})^2$$
9- 17

The Theory and Computations of ANOVA: The Sum of Squares Principle

Sums of Squared Deviations

$$\sum_{i=1}^{r} \sum_{j=1}^{n_i} Tot_{ij}^2 = \sum_{i=1}^{r} n_i t_i^2 + \sum_{i=1}^{r} \sum_{j=1}^{n_i} e_{ij}^2$$

$$\sum_{i=1}^{r} \sum_{j=1}^{n_i} (x_{ij} - \bar{\bar{x}})^2 = \sum_{i=1}^{r} n_i (\bar{x}_i - \bar{\bar{x}})^2 + \sum_{i=1}^{r} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$$

$$SST = SSTR + SSE$$

The Sum of Squares Principle

The total sum of squares (SST) is the sum of two terms: the sum of squares for treatment (SSTR) and the sum of squares for error (SSE).

SST = SSTR + SSE
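
The sum of squares principle can be verified on the Table 9-1 data. The following is a minimal sketch under the assumption that each sample is stored as a plain list; the function name `sums_of_squares` is illustrative.

```python
# Table 9-1 data: triangles, squares, circles.
samples = [[4, 5, 7, 8], [10, 11, 12, 13], [1, 2, 3]]


def sums_of_squares(groups):
    """Return (SST, SSTR, SSE) for a one-way layout."""
    all_points = [x for g in groups for x in g]
    grand = sum(all_points) / len(all_points)
    means = [sum(g) / len(g) for g in groups]
    # Treatment: each squared treatment deviation is counted n_i times.
    sstr = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    # Error: squared deviation of each point from its own sample mean.
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    # Total: squared deviation of each point from the grand mean.
    sst = sum((x - grand) ** 2 for x in all_points)
    return sst, sstr, sse


sst, sstr, sse = sums_of_squares(samples)
print(round(sst, 3), round(sstr, 3), round(sse, 3))
```

SST is computed directly from the data rather than as SSTR + SSE, so the agreement of the two is a genuine check of the identity.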
9- 18

The Theory and Computations of ANOVA: Picturing The Sum of Squares Principle

[Figure: SST drawn as a whole, partitioned into SSTR and SSE]

SST measures the total variation in the data set, the variation of all individual data points from the grand mean.

SSTR measures the explained variation, the variation of individual sample means from the grand mean. It is that part of the variation that is possibly expected, or explained, because the data points are drawn from different populations. It's the variation between groups of data points.

SSE measures unexplained variation, the variation within each group that cannot be explained by possible differences between the groups.
9- 19

The Theory and Computations of ANOVA: Degrees of Freedom

The number of degrees of freedom associated with SST is (n - 1).
  n total observations in all r groups, less one degree of freedom lost with the calculation of the grand mean

The number of degrees of freedom associated with SSTR is (r - 1).
  r sample means, less one degree of freedom lost with the calculation of the grand mean

The number of degrees of freedom associated with SSE is (n - r).
  n total observations in all groups, less one degree of freedom lost with the calculation of the sample mean from each of r groups

The degrees of freedom are additive in the same way as are the sums of squares:
df(total) = df(treatment) + df(error)
(n - 1) = (r - 1) + (n - r)
9- 20

The Theory and Computations of ANOVA: The Mean Squares

Recall that the calculation of the sample variance involves the division of the sum of squared deviations from the sample mean by the number of degrees of freedom. This principle is applied as well to find the mean squared deviations within the analysis of variance.

Mean square treatment (MSTR): $MSTR = \frac{SSTR}{r - 1}$

Mean square error (MSE): $MSE = \frac{SSE}{n - r}$

Mean square total (MST): $MST = \frac{SST}{n - 1}$

(Note that the additive properties of sums of squares do not extend to the mean squares: MST ≠ MSTR + MSE.)
9- 21

The Theory and Computations of ANOVA: The Expected Mean Squares

$$E(MSE) = \sigma^2$$

and

$$E(MSTR) = \sigma^2 + \frac{\sum n_i (\mu_i - \mu)^2}{r - 1} \quad \begin{cases} = \sigma^2 & \text{when the null hypothesis is true} \\ > \sigma^2 & \text{when the null hypothesis is false} \end{cases}$$

where $\mu_i$ is the mean of population i and $\mu$ is the combined mean of all r populations.

That is, the expected mean square error (MSE) is simply the common population variance (remember the assumption of equal population variances), but the expected mean square treatment (MSTR) is the common population variance plus a term related to the variation of the individual population means around the grand population mean.

If the null hypothesis is true so that the population means are all equal, the second term in the E(MSTR) formulation is zero, and E(MSTR) is equal to the common population variance.
9- 22

Expected Mean Squares and the ANOVA Principle

When the null hypothesis of ANOVA is true and all r population means are equal, MSTR and MSE are two independent, unbiased estimators of the common population variance $\sigma^2$.

On the other hand, when the null hypothesis is false, then MSTR will tend to be larger than MSE.

So the ratio of MSTR and MSE can be used as an indicator of the equality or inequality of the r population means.

This ratio (MSTR/MSE) will tend to be near to 1 if the null hypothesis is true, and greater than 1 if the null hypothesis is false. The ANOVA test, finally, is a test of whether (MSTR/MSE) is equal to, or greater than, 1.
9- 23

The Theory and Computations of ANOVA: The F Statistic

Under the assumptions of ANOVA, the ratio (MSTR/MSE) possesses an F distribution with (r-1) degrees of freedom for the numerator and (n-r) degrees of freedom for the denominator when the null hypothesis is true.

The test statistic in analysis of variance:

$$F_{(r-1,\,n-r)} = \frac{MSTR}{MSE}$$
9- 24

9-4 The ANOVA Table and Examples

Treatment (i)   i   j   Value ($x_{ij}$)   $(x_{ij} - \bar{x}_i)$   $(x_{ij} - \bar{x}_i)^2$
Triangle        1   1   4                  -2                       4
Triangle        1   2   5                  -1                       1
Triangle        1   3   7                   1                       1
Triangle        1   4   8                   2                       4
Square          2   1   10                 -1.5                     2.25
Square          2   2   11                 -0.5                     0.25
Square          2   3   12                  0.5                     0.25
Square          2   4   13                  1.5                     2.25
Circle          3   1   1                  -1                       1
Circle          3   2   2                   0                       0
Circle          3   3   3                   1                       1
Total                   76                  0                       17

Treatment   $(\bar{x}_i - \bar{\bar{x}})$   $(\bar{x}_i - \bar{\bar{x}})^2$   $n_i (\bar{x}_i - \bar{\bar{x}})^2$
Triangle    -0.909                          0.826281                          3.305124
Square       4.591                          21.077281                         84.309124
Circle      -4.909                          24.098281                         72.294843
Total                                                                         159.909091

$$SSE = \sum_{i=1}^{r} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 = 17$$

$$SSTR = \sum_{i=1}^{r} n_i (\bar{x}_i - \bar{\bar{x}})^2 = 159.9$$

$$MSTR = \frac{SSTR}{r - 1} = \frac{159.9}{(3 - 1)} = 79.95$$

$$MSE = \frac{SSE}{n - r} = \frac{17}{8} = 2.125$$

$$F_{(2,\,8)} = \frac{MSTR}{MSE} = \frac{79.95}{2.125} = 37.62$$

Critical point (α = 0.01): 8.65

$H_0$ may be rejected at the 0.01 level of significance.
9- 25

ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square    F Ratio
Treatment             SSTR = 159.9     (r-1) = 2            MSTR = 79.95   37.62
Error                 SSE = 17.0       (n-r) = 8            MSE = 2.125
Total                 SST = 176.9      (n-1) = 10           MST = 17.69

[Figure: F distribution for 2 and 8 degrees of freedom; the critical point 8.65 cuts off a right-tail area of 0.01, and the computed test statistic 37.62 lies far beyond it]

The ANOVA Table summarizes the ANOVA calculations.

In this instance, since the test statistic is greater than the critical point for an α = 0.01 level of significance, the null hypothesis may be rejected, and we may conclude that the means for triangles, squares, and circles are not all equal.
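
The entries of this ANOVA table can be rebuilt from the raw data. The sketch below is one possible pure-Python arrangement; `one_way_anova` and the dictionary keys are hypothetical names. It does not look up the critical point 8.65, which is taken from an F table in the slides.

```python
# Table 9-1 data: triangles, squares, circles.
samples = [[4, 5, 7, 8], [10, 11, 12, 13], [1, 2, 3]]


def one_way_anova(groups):
    """Compute the entries of a one-way ANOVA table."""
    n = sum(len(g) for g in groups)  # total observations
    r = len(groups)                  # number of treatments
    grand = sum(x for g in groups for x in g) / n
    means = [sum(g) / len(g) for g in groups]
    sstr = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    sst = sstr + sse
    mstr = sstr / (r - 1)
    mse = sse / (n - r)
    return {
        "SSTR": sstr, "SSE": sse, "SST": sst,
        "df_treatment": r - 1, "df_error": n - r, "df_total": n - 1,
        "MSTR": mstr, "MSE": mse, "MST": sst / (n - 1),
        "F": mstr / mse,
    }


table = one_way_anova(samples)
for key, value in table.items():
    print(f"{key:>12}: {value:.3f}")
```

The F ratio computed this way is about 37.63; the slide's 37.62 comes from dividing the rounded value 79.95 by 2.125.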
9- 26

Template Output

[Figure: template output for the one-way ANOVA]

Decision: Reject the Null Hypothesis
9- 27

Minitab Output

[Figure: Minitab output for the one-way ANOVA]

Decision: Reject the Null Hypothesis
9- 28

Example 9-2: Club Med

Club Med has conducted a test to determine whether its Caribbean resorts are equally well liked by
vacationing club members. The analysis was based on a survey questionnaire (general satisfaction,
on a scale from 0 to 100) filled out by a random sample of 40 respondents from each of 5 resorts.
Resort            Mean Response ($\bar{x}_i$)
Guadeloupe        89
Martinique        75
Eleuthra          73
Paradise Island   91
St. Lucia         85

SST = 112564   SSE = 98356

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square     F Ratio
Treatment             SSTR = 14208     (r-1) = 4            MSTR = 3552     7.04
Error                 SSE = 98356      (n-r) = 195          MSE = 504.39
Total                 SST = 112564     (n-1) = 199          MST = 565.65

[Figure: F distribution with 4 and 200 degrees of freedom; the critical point 3.41 cuts off a right-tail area of 0.01, and the computed test statistic 7.04 lies beyond it]

The resultant F ratio is larger than the critical point for α = 0.01, so the null hypothesis may be rejected.
9- 29

Example 9-3: Job Involvement

Source of Variation   Sum of Squares    Degrees of Freedom   Mean Square     F Ratio
Treatment             SSTR = 879.3      (r-1) = 3            MSTR = 293.1    8.52
Error                 SSE = 18541.6     (n-r) = 539          MSE = 34.4
Total                 SST = 19420.9     (n-1) = 542          MST = 35.83

Given the total number of observations (n = 543), the number of groups (r = 4), the MSE (34.4), and the F ratio (8.52), the remainder of the ANOVA table can be completed. The critical point of the F distribution for α = 0.01 and (3, 400) degrees of freedom is 3.83. The test statistic in this example is much larger than this critical point, so the p value associated with this test statistic is less than 0.01, and the null hypothesis may be rejected.
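
Completing the table from n, r, MSE and the F ratio is a matter of running the ANOVA formulas backwards: MSTR = F × MSE, SSTR = MSTR × (r - 1), SSE = MSE × (n - r). A minimal sketch, with `complete_anova_table` as a hypothetical helper name:

```python
def complete_anova_table(n, r, mse, f_ratio):
    """Fill in a one-way ANOVA table from n, r, MSE and the F ratio."""
    df_treatment = r - 1
    df_error = n - r
    df_total = n - 1
    mstr = f_ratio * mse        # because F = MSTR / MSE
    sstr = mstr * df_treatment  # because MSTR = SSTR / (r - 1)
    sse = mse * df_error        # because MSE = SSE / (n - r)
    sst = sstr + sse            # sum of squares principle
    return {
        "SSTR": sstr, "SSE": sse, "SST": sst,
        "df_treatment": df_treatment, "df_error": df_error,
        "df_total": df_total,
        "MSTR": mstr, "MSE": mse, "MST": sst / df_total,
        "F": f_ratio,
    }


# Inputs quoted in Example 9-3.
table = complete_anova_table(n=543, r=4, mse=34.4, f_ratio=8.52)
for key, value in table.items():
    print(f"{key:>12}: {value:.2f}")
```

Small differences from the table in the slides (for example SSTR of about 879.26 rather than 879.3) are due to rounding of the published F ratio and MSE.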
9- 30

9-5 Further Analysis

[Figure: The ANOVA Diagram. Data → ANOVA → either "Do Not Reject H0" (stop) or "Reject H0" (further analysis: confidence intervals for population means, Tukey pairwise comparisons test)]

The sample means are unbiased estimators of the population means.

The mean square error (MSE) is an unbiased estimator of the common population variance.
9- 31

Confidence Intervals for Population Means

A (1 - α)100% confidence interval for $\mu_i$, the mean of population i:

$$\bar{x}_i \pm t_{\alpha/2} \sqrt{\frac{MSE}{n_i}}$$

where $t_{\alpha/2}$ is the value of the t distribution with (n - r) degrees of freedom that cuts off a right-tailed area of α/2.

For the Club Med example: SST = 112564, SSE = 98356, $n_i = 40$, n = (5)(40) = 200, MSE = 504.39

$$\bar{x}_i \pm t_{\alpha/2} \sqrt{\frac{MSE}{n_i}} = \bar{x}_i \pm 1.96 \sqrt{\frac{504.39}{40}} = \bar{x}_i \pm 6.96$$

Resort            Mean Response ($\bar{x}_i$)   Confidence interval
Guadeloupe        89                            89 ± 6.96 = [82.04, 95.96]
Martinique        75                            75 ± 6.96 = [68.04, 81.96]
Eleuthra          73                            73 ± 6.96 = [66.04, 79.96]
Paradise Island   91                            91 ± 6.96 = [84.04, 97.96]
St. Lucia         85                            85 ± 6.96 = [78.04, 91.96]
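
The interval arithmetic for the Club Med resorts can be reproduced as follows. This is a sketch that takes the t value 1.96 as given in the slides instead of computing it from the t distribution; `mean_confidence_interval` is an illustrative name.

```python
import math

# Club Med sample means (from the slides).
resort_means = {
    "Guadeloupe": 89,
    "Martinique": 75,
    "Eleuthra": 73,
    "Paradise Island": 91,
    "St. Lucia": 85,
}


def mean_confidence_interval(sample_mean, mse, n_i, t_value):
    """Interval sample_mean +/- t_value * sqrt(MSE / n_i)."""
    half_width = t_value * math.sqrt(mse / n_i)
    return sample_mean - half_width, sample_mean + half_width


# MSE = 504.39, n_i = 40 and t = 1.96 are the values quoted in the slides.
intervals = {
    name: mean_confidence_interval(mean, mse=504.39, n_i=40, t_value=1.96)
    for name, mean in resort_means.items()
}
for name, (low, high) in intervals.items():
    print(f"{name:<16} [{low:.2f}, {high:.2f}]")
```

Because every resort has the same sample size, the half-width is identical for all five intervals.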
9- 32

The Tukey Pairwise-Comparisons Test

The Tukey Pairwise Comparison test, or Honestly Significant Differences (HSD) test, allows us to compare every pair of population means with a single level of significance.

It is based on the studentized range distribution, q, with r and (n-r) degrees of freedom.

The critical point in a Tukey Pairwise Comparisons test is the Tukey Criterion:

$$T = q_\alpha \sqrt{\frac{MSE}{n_i}}$$

where $n_i$ is the smallest of the r sample sizes.

The test statistic is the absolute value of the difference between the appropriate sample means, and the null hypothesis is rejected if the test statistic is greater than the critical point of the Tukey Criterion.

Note that there are $\binom{r}{2} = \frac{r!}{2!\,(r-2)!}$ pairs of population means to compare. For example, if r = 3:

$H_0: \mu_1 = \mu_2$     $H_0: \mu_1 = \mu_3$     $H_0: \mu_2 = \mu_3$
$H_1: \mu_1 \ne \mu_2$   $H_1: \mu_1 \ne \mu_3$   $H_1: \mu_2 \ne \mu_3$
9- 33

The Tukey Pairwise Comparison Test: The Club Med Example

The test statistic for each pairwise test is the absolute difference between the appropriate sample means.

i   Resort         Mean
1   Guadeloupe     89
2   Martinique     75
3   Eleuthra       73
4   Paradise Is.   91
5   St. Lucia      85

The critical point $T_{0.05}$ for r = 5 and (n - r) = 195 degrees of freedom is:

$$T = q_\alpha \sqrt{\frac{MSE}{n_i}} = 3.86 \sqrt{\frac{504.4}{40}} = 13.7$$

Test    H0                  H1                    Comparison
I.      $\mu_1 = \mu_2$     $\mu_1 \ne \mu_2$     |89 - 75| = 14 > 13.7 *
II.     $\mu_1 = \mu_3$     $\mu_1 \ne \mu_3$     |89 - 73| = 16 > 13.7 *
III.    $\mu_1 = \mu_4$     $\mu_1 \ne \mu_4$     |89 - 91| = 2 < 13.7
IV.     $\mu_1 = \mu_5$     $\mu_1 \ne \mu_5$     |89 - 85| = 4 < 13.7
V.      $\mu_2 = \mu_3$     $\mu_2 \ne \mu_3$     |75 - 73| = 2 < 13.7
VI.     $\mu_2 = \mu_4$     $\mu_2 \ne \mu_4$     |75 - 91| = 16 > 13.7 *
VII.    $\mu_2 = \mu_5$     $\mu_2 \ne \mu_5$     |75 - 85| = 10 < 13.7
VIII.   $\mu_3 = \mu_4$     $\mu_3 \ne \mu_4$     |73 - 91| = 18 > 13.7 *
IX.     $\mu_3 = \mu_5$     $\mu_3 \ne \mu_5$     |73 - 85| = 12 < 13.7
X.      $\mu_4 = \mu_5$     $\mu_4 \ne \mu_5$     |91 - 85| = 6 < 13.7

Reject the null hypothesis if the absolute value of the difference between the sample means is greater than the critical value of T. (The hypotheses marked with * are rejected.)
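
The ten pairwise comparisons can be enumerated programmatically. The sketch below takes the studentized range value q = 3.86 and MSE = 504.4 from the slides as given (it does not compute q), and the names `tukey_criterion` and `rejected` are illustrative.

```python
import math
from itertools import combinations

# Club Med sample means, indexed by population number i.
means = {1: 89, 2: 75, 3: 73, 4: 91, 5: 85}


def tukey_criterion(q_alpha, mse, n_smallest):
    """Tukey criterion T = q_alpha * sqrt(MSE / n_i)."""
    return q_alpha * math.sqrt(mse / n_smallest)


# q = 3.86, MSE = 504.4 and n_i = 40 are the values quoted in the slides.
T = tukey_criterion(q_alpha=3.86, mse=504.4, n_smallest=40)

pairs = list(combinations(sorted(means), 2))  # all r-choose-2 pairs
rejected = [(i, j) for i, j in pairs if abs(means[i] - means[j]) > T]

print(f"T = {T:.1f}")
for i, j in pairs:
    diff = abs(means[i] - means[j])
    mark = "reject H0" if (i, j) in rejected else "do not reject H0"
    print(f"mu_{i} vs mu_{j}: |diff| = {diff:>2} -> {mark}")
```

With these inputs the rejected pairs are (1, 2), (1, 3), (2, 4) and (3, 4), matching the starred hypotheses.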
9- 34

Picturing the Results of a Tukey Pairwise Comparisons Test: The Club Med Example

We rejected the null hypotheses which compared the means of populations 1 and 2, 1 and 3, 2 and 4, and 3 and 4. On the other hand, we did not reject the null hypotheses of the equality of the means of populations 1 and 4, 1 and 5, 2 and 3, 2 and 5, 3 and 5, and 4 and 5.

[Figure: the population means in the order $\mu_3$, $\mu_2$, $\mu_5$, $\mu_1$, $\mu_4$, with bars joining the groups whose means may be equal]

The bars indicate the three groupings of populations with possibly equal means: 2 and 3; 2, 3, and 5; and 1, 4, and 5.
9- 35

Picturing the Results of a Tukey Pairwise Comparisons Test: The Club Med Example
9- 36

Picturing the Results of a Tukey Pairwise Comparisons Test: The Club Med Example

NOTE: Zero is not included in the intervals. Thus there is a significant difference between the means for A and B, A and C, and B and C.
9- 37

9-6 Models, Factors and Designs

• A statistical model is a set of equations and assumptions that capture the essential characteristics of a real-world situation.
  ▪ The one-factor ANOVA model:

$$x_{ij} = \mu_i + \varepsilon_{ij} = \mu + \alpha_i + \varepsilon_{ij}$$

    where $\varepsilon_{ij}$ is the error associated with the jth member of the ith population. The errors are assumed to be normally distributed with mean 0 and variance $\sigma^2$.
9- 38

9-6 Models, Factors and Designs (Continued)

A factor is a set of populations or treatments of a single kind. For example:
  ▪ One-factor models based on sets of resorts, types of airplanes, or kinds of sweaters
  ▪ Two-factor models based on firm and location
  ▪ Three-factor models based on color and shape and size of an ad.

• Fixed-Effects and Random Effects
  ▪ A fixed-effects model is one in which the levels of the factor under study (the treatments) are fixed in advance. Inference is valid only for the levels under study.
  ▪ A random-effects model is one in which the levels of the factor under study are randomly chosen from an entire population of levels (treatments). Inference is valid for the entire population of levels.
9- 39

Experimental Design

• A completely-randomized design is one in which the elements are assigned to treatments completely at random. That is, any element chosen for the study has an equal chance of being assigned to any treatment.
• In a blocking design, elements are assigned to treatments after first being collected into homogeneous groups.
  ▪ In a completely randomized block design, all members of each block (homogeneous group) are randomly assigned to the treatment levels.
  ▪ In a repeated measures design, each member of each block is assigned to all treatment levels.
9- 40

9-7 Two-Way Analysis of Variance

In a two-way ANOVA, the effects of two factors or treatments can be investigated simultaneously. Two-way ANOVA also permits the investigation of the effects of either factor alone and of the two factors together.
  ▪ The effect on the population mean that can be attributed to the levels of either factor alone is called a main effect.
  ▪ An interaction effect between two factors occurs if the total effect at some pair of levels of the two factors or treatments differs significantly from the simple addition of the two main effects. Factors that do not interact are called additive.

• Three questions answerable by two-way ANOVA:
  ▪ Are there any factor A main effects?
  ▪ Are there any factor B main effects?
  ▪ Are there any interaction effects between factors A and B?

• For example, we might investigate the effects on vacationers' ratings of resorts by looking at five different resorts (factor A) and four different resort attributes (factor B). In addition to the five main factor A treatment levels and the four main factor B treatment levels, there are (5*4=20) interaction treatment levels.
9- 41

The Two-Way ANOVA Model

• $x_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$
  ▪ where $\mu$ is the overall mean;
  ▪ $\alpha_i$ is the effect of level i (i = 1, ..., a) of factor A;
  ▪ $\beta_j$ is the effect of level j (j = 1, ..., b) of factor B;
  ▪ $(\alpha\beta)_{ij}$ is the interaction effect of levels i and j;
  ▪ $\varepsilon_{ijk}$ is the error associated with the kth data point from
    level i of factor A and level j of factor B.
  ▪ $\varepsilon_{ijk}$ is assumed to be distributed normally with mean zero
    and variance $\sigma^2$ for all i, j, and k.
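One way to get a feel for the model is to generate data from it. The following is a minimal sketch, assuming a balanced design with n observations per cell; the function name `simulate_two_way`, the parameter values, and the fixed seed are all illustrative assumptions.

```python
import random


def simulate_two_way(mu, alpha, beta, interaction, n, sigma, seed=0):
    """Draw n observations per cell from
    x[i][j][k] = mu + alpha[i] + beta[j] + interaction[i][j] + error,
    where the error is normal with mean zero and standard deviation sigma."""
    rng = random.Random(seed)
    data = []
    for i, a_i in enumerate(alpha):
        row = []
        for j, b_j in enumerate(beta):
            cell_mean = mu + a_i + b_j + interaction[i][j]
            row.append([cell_mean + rng.gauss(0.0, sigma) for _ in range(n)])
        data.append(row)
    return data


# Hypothetical parameters: 2 levels of A, 2 levels of B, 3 observations per cell.
sample = simulate_two_way(
    mu=50.0,
    alpha=[-5.0, 5.0],
    beta=[-4.0, 4.0],
    interaction=[[2.0, -2.0], [-2.0, 2.0]],
    n=3,
    sigma=2.0,
)
print(sample[1][1])  # three noisy observations scattered around 50 + 5 + 4 + 2 = 61
```

Setting `sigma=0.0` removes the error term, so every observation equals its cell mean exactly.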

Two-Way ANOVA Data Layout: Club Med Example

                                    Factor A: Resort
Factor B: Attribute   Guadeloupe   Martinique   Eleuthra   Paradise Island   St. Lucia
Friendship            n11          n21          n31        n41               n51
Sports                n12          n22          n32        n42               n52
Culture               n13          n23          n33        n43               n53
Excitement            n14          n24          n34        n44               n54

[Figure: Graphical Display of Effects. Rating (vertical axis) plotted against
resort (horizontal axis), with one line per attribute (Friendship, Excitement,
Sports, Culture). Annotation: Eleuthra/sports interaction, where the combined
effect is greater than the additive main effects.]

Hypothesis Tests in a Two-Way ANOVA


• Factor A main effects test:
  $H_0$: $\alpha_i = 0$ for all i = 1, 2, ..., a
  $H_1$: Not all $\alpha_i$ are 0

• Factor B main effects test:
  $H_0$: $\beta_j = 0$ for all j = 1, 2, ..., b
  $H_1$: Not all $\beta_j$ are 0

• Test for (AB) interactions:
  $H_0$: $(\alpha\beta)_{ij} = 0$ for all i = 1, 2, ..., a and j = 1, 2, ..., b
  $H_1$: Not all $(\alpha\beta)_{ij}$ are 0

Sums of Squares

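▪ In a two-way ANOVA:
  $x_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$
▪ SST = SSTR + SSE
▪ SST = SSA + SSB + SS(AB) + SSE

SST = SSTR + SSE:

$\sum\sum\sum (x_{ijk} - \bar{x})^2 = \sum\sum\sum (\bar{x}_{ij} - \bar{x})^2 + \sum\sum\sum (x_{ijk} - \bar{x}_{ij})^2$

SSTR = SSA + SSB + SS(AB):

$\sum\sum\sum (\bar{x}_{ij} - \bar{x})^2 = \sum\sum\sum (\bar{x}_i - \bar{x})^2 + \sum\sum\sum (\bar{x}_j - \bar{x})^2 + \sum\sum\sum (\bar{x}_{ij} - \bar{x}_i - \bar{x}_j + \bar{x})^2$

Here $\bar{x}_{ij}$ is the mean of cell (i, j), $\bar{x}_i$ and $\bar{x}_j$ are the
means for level i of factor A and level j of factor B, and $\bar{x}$ is the
grand mean. Each triple sum runs over i, j, and k.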
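The decomposition of the total sum of squares can be checked numerically. This is a minimal sketch for a balanced design (the same number of observations n in every cell); the function name and the small data set are hypothetical and serve only to show that the pieces add up.

```python
def mean(values):
    return sum(values) / len(values)


def two_way_sums_of_squares(data):
    """data[i][j] is the list of n observations in cell (i, j); balanced design assumed."""
    a, b, n = len(data), len(data[0]), len(data[0][0])
    cell = [[mean(data[i][j]) for j in range(b)] for i in range(a)]
    row = [mean(cell[i]) for i in range(a)]                          # factor A level means
    col = [mean([cell[i][j] for i in range(a)]) for j in range(b)]   # factor B level means
    grand = mean(row)
    cells = [(i, j) for i in range(a) for j in range(b)]
    ssa = b * n * sum((r - grand) ** 2 for r in row)
    ssb = a * n * sum((c - grand) ** 2 for c in col)
    ssab = n * sum((cell[i][j] - row[i] - col[j] + grand) ** 2 for i, j in cells)
    sse = sum((x - cell[i][j]) ** 2 for i, j in cells for x in data[i][j])
    sst = sum((x - grand) ** 2 for i, j in cells for x in data[i][j])
    return {"SSA": ssa, "SSB": ssb, "SS(AB)": ssab, "SSE": sse, "SST": sst}


# Hypothetical 2 x 2 layout with n = 2 observations per cell.
data = [[[9.0, 11.0], [13.0, 15.0]],
        [[15.0, 17.0], [27.0, 29.0]]]
ss = two_way_sums_of_squares(data)
print(ss)
# SSA + SSB + SS(AB) + SSE reproduces SST.
print(ss["SSA"] + ss["SSB"] + ss["SS(AB)"] + ss["SSE"], ss["SST"])
```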

The Two-Way ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square                      F Ratio
Factor A              SSA              a-1                  MSA = SSA/(a-1)                  F = MSA/MSE
Factor B              SSB              b-1                  MSB = SSB/(b-1)                  F = MSB/MSE
Interaction           SS(AB)           (a-1)(b-1)           MS(AB) = SS(AB)/[(a-1)(b-1)]     F = MS(AB)/MSE
Error                 SSE              ab(n-1)              MSE = SSE/[ab(n-1)]
Total                 SST              abn-1

A Main Effect Test: F(a-1, ab(n-1))
B Main Effect Test: F(b-1, ab(n-1))
(AB) Interaction Effect Test: F((a-1)(b-1), ab(n-1))
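As a quick consistency check on the table, the degrees of freedom of the four sources add up to the total, just as the sums of squares do. The short derivation below uses only the quantities a, b, and n already defined for the table.

```latex
\begin{align*}
(a-1) + (b-1) + (a-1)(b-1) &= (a-1) + (b-1) + (ab - a - b + 1) = ab - 1, \\
(ab - 1) + ab(n-1) &= ab - 1 + abn - ab = abn - 1 .
\end{align*}
```

The first line is the degrees of freedom for treatments (SSTR); adding the error degrees of freedom gives the total, abn-1.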



Example 9-4: Two-Way ANOVA (Location and Artist)

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F Ratio
Location              1824             2                    912           8.94
Artist                2230             2                    1115          10.93
Interaction           804              4                    201           1.97
Error                 8262             81                   102
Total                 13120            89

α = 0.01, F(2,81) = 4.88 ⇒ Both main effect null hypotheses are rejected.
α = 0.05, F(4,81) = 2.48 ⇒ The interaction effect null hypothesis is not rejected.
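The figures in Example 9-4 can be reproduced from the sums of squares alone. The sketch below is illustrative: the function name is an assumption, and the values a = 3, b = 3, n = 10 are inferred from the degrees of freedom in the table (2, 2, 4, and 81) rather than stated in the example. The critical values 4.88 and 2.48 are the ones quoted in the example.

```python
def two_way_anova_table(ssa, ssb, ssab, sse, a, b, n):
    """Return degrees of freedom, mean squares, and F ratios for a balanced two-way ANOVA."""
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "Error": a * b * (n - 1)}
    ss = {"A": ssa, "B": ssb, "AB": ssab, "Error": sse}
    ms = {source: ss[source] / df[source] for source in ss}
    f = {source: ms[source] / ms["Error"] for source in ("A", "B", "AB")}
    return df, ms, f


# Example 9-4: A = location, B = artist (a, b, n inferred from the df column).
df, ms, f = two_way_anova_table(1824, 2230, 804, 8262, a=3, b=3, n=10)

# Critical values as quoted in the example.
critical = {"A": 4.88, "B": 4.88, "AB": 2.48}
for source in ("A", "B", "AB"):
    decision = "reject H0" if f[source] > critical[source] else "do not reject H0"
    print(source, df[source], ms[source], round(f[source], 2), decision)
```

The printed F ratios round to 8.94, 10.93, and 1.97, matching the table, and only the two main effect tests exceed their critical values.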

Hypothesis Tests

[Figure, left panel: F distribution with 2 and 81 degrees of freedom, f(F)
plotted against F. The rejection region for α = 0.01 begins at F0.01 = 4.88;
the location test statistic (8.94) and the artist test statistic (10.93) both
fall in it.]

[Figure, right panel: F distribution with 4 and 81 degrees of freedom, f(F)
plotted against F. The rejection region for α = 0.05 begins at F0.05 = 2.48;
the interaction test statistic (1.97) falls outside it.]



Overall Significance Level and Tukey Method for Two-Way ANOVA

Kimball’s Inequality gives an upper limit on the true probability of at least
one Type I error in the three tests of a two-way analysis:

$\alpha \le 1 - (1 - \alpha_1)(1 - \alpha_2)(1 - \alpha_3)$

Tukey Criterion for factor A:

$T = q_\alpha \sqrt{\dfrac{MSE}{bn}}$

where the degrees of freedom of the q distribution are now a and ab(n-1).
Note that MSE is divided by bn.

Template for a Two-Way ANOVA



Extension of ANOVA to Three Factors

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square                              F Ratio
Factor A              SSA              a-1                  MSA = SSA/(a-1)                          F = MSA/MSE
Factor B              SSB              b-1                  MSB = SSB/(b-1)                          F = MSB/MSE
Factor C              SSC              c-1                  MSC = SSC/(c-1)                          F = MSC/MSE
Interaction (AB)      SS(AB)           (a-1)(b-1)           MS(AB) = SS(AB)/[(a-1)(b-1)]             F = MS(AB)/MSE
Interaction (AC)      SS(AC)           (a-1)(c-1)           MS(AC) = SS(AC)/[(a-1)(c-1)]             F = MS(AC)/MSE
Interaction (BC)      SS(BC)           (b-1)(c-1)           MS(BC) = SS(BC)/[(b-1)(c-1)]             F = MS(BC)/MSE
Interaction (ABC)     SS(ABC)          (a-1)(b-1)(c-1)      MS(ABC) = SS(ABC)/[(a-1)(b-1)(c-1)]      F = MS(ABC)/MSE
Error                 SSE              abc(n-1)             MSE = SSE/[abc(n-1)]
Total                 SST              abcn-1
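The degrees-of-freedom column of the three-factor table can be generated and checked mechanically. This is a small sketch with an assumed function name; the level counts in the usage line are hypothetical.

```python
def three_way_degrees_of_freedom(a, b, c, n):
    """Degrees of freedom for each source in a balanced three-factor ANOVA."""
    df = {
        "A": a - 1,
        "B": b - 1,
        "C": c - 1,
        "AB": (a - 1) * (b - 1),
        "AC": (a - 1) * (c - 1),
        "BC": (b - 1) * (c - 1),
        "ABC": (a - 1) * (b - 1) * (c - 1),
        "Error": a * b * c * (n - 1),
    }
    # The component degrees of freedom must add up to abcn - 1.
    df["Total"] = a * b * c * n - 1
    assert sum(v for k, v in df.items() if k != "Total") == df["Total"]
    return df


# Hypothetical design: 2 x 3 x 4 levels with 5 observations per cell.
print(three_way_degrees_of_freedom(2, 3, 4, 5))
```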

Two-Way ANOVA with One Observation per Cell
• The case of one data point in every cell presents a
problem in two-way ANOVA.
• With n = 1, the error term has ab(n – 1) = 0 degrees of
freedom, so MSE cannot be computed.
• What can be done?
• If we can assume that there are no interactions between
the two factors, then we can use SS(AB) and its
associated degrees of freedom (a – 1)(b – 1) in place of
SSE and its degrees of freedom.
• We can then conduct main effects tests using MS(AB)
in place of MSE.
• See the next slide for the ANOVA table.

Two-Way ANOVA with One Observation per Cell

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square                      F Ratio
Factor A              SSA              a-1                  MSA = SSA/(a-1)                  F = MSA/MS(AB)
Factor B              SSB              b-1                  MSB = SSB/(b-1)                  F = MSB/MS(AB)
“Error”               SS(AB)           (a – 1)(b – 1)       MS(AB) = SS(AB)/[(a-1)(b-1)]
Total                 SST              ab - 1
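The one-observation-per-cell analysis can be sketched as follows, assuming (as the method requires) that the two factors do not interact, so that MS(AB) can stand in for the error mean square. The function name and the 3 × 3 data set are hypothetical.

```python
def two_way_one_per_cell(x):
    """x[i][j] is the single observation for level i of factor A and level j of factor B.

    Assumes no interaction, so SS(AB) serves as the "error" sum of squares.
    """
    a, b = len(x), len(x[0])
    grand = sum(sum(row) for row in x) / (a * b)
    row = [sum(r) / b for r in x]
    col = [sum(x[i][j] for i in range(a)) / a for j in range(b)]
    ssa = b * sum((r - grand) ** 2 for r in row)
    ssb = a * sum((c - grand) ** 2 for c in col)
    ssab = sum((x[i][j] - row[i] - col[j] + grand) ** 2
               for i in range(a) for j in range(b))
    sst = sum((v - grand) ** 2 for r in x for v in r)
    ms_ab = ssab / ((a - 1) * (b - 1))
    return {
        "SSA": ssa, "SSB": ssb, "SS(AB)": ssab, "SST": sst,
        "F_A": (ssa / (a - 1)) / ms_ab,   # compare with F(a-1, (a-1)(b-1))
        "F_B": (ssb / (b - 1)) / ms_ab,   # compare with F(b-1, (a-1)(b-1))
    }


# Hypothetical ratings, one per cell.
ratings = [[10.0, 12.0, 17.0],
           [14.0, 15.0, 19.0],
           [15.0, 21.0, 21.0]]
result = two_way_one_per_cell(ratings)
print(result)
```

With only one observation per cell, SSA + SSB + SS(AB) accounts for all of SST, which is why no separate error term is left over.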

9-8 Blocking Designs

• A block is a homogeneous set of subjects, grouped to
minimize within-group differences.
• A completely randomized design is one in which the
elements are assigned to treatments completely at
random. That is, any element chosen for the study has an
equal chance of being assigned to any treatment.
• In a blocking design, elements are assigned to treatments
after first being collected into homogeneous groups.
  ▪ In a randomized complete block design, all members of each
    block (homogeneous group) are randomly assigned to the
    treatment levels.
  ▪ In a repeated measures design, each member of each block is
    assigned to all treatment levels.
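The random assignment step in a blocking design can be sketched in a few lines. This example assumes, for simplicity, that each block contains exactly as many subjects as there are treatments, so that every treatment appears once per block; the function name, subject labels, and seed are hypothetical.

```python
import random


def assign_within_blocks(blocks, treatments, seed=0):
    """Randomly assign the members of each block to the treatment levels.

    blocks: list of blocks, each a list of subject identifiers.
    Assumes each block has exactly len(treatments) members.
    """
    rng = random.Random(seed)
    assignment = {}
    for block in blocks:
        if len(block) != len(treatments):
            raise ValueError("each block must have one subject per treatment")
        order = list(treatments)
        rng.shuffle(order)            # fresh random order for every block
        for subject, treatment in zip(block, order):
            assignment[subject] = treatment
    return assignment


# Two hypothetical blocks of three similar subjects each.
blocks = [["s1", "s2", "s3"], ["s4", "s5", "s6"]]
plan = assign_within_blocks(blocks, ["T1", "T2", "T3"], seed=1)
print(plan)
```

Randomizing separately inside each block keeps the comparison among treatments within homogeneous groups, which is the point of blocking.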

Model for Randomized Complete Block Design

• $x_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$
  ▪ where $\mu$ is the overall mean;
  ▪ $\alpha_i$ is the effect of level i (i = 1, ..., a) of factor A;
  ▪ $\beta_j$ is the effect of block j (j = 1, ..., b);
  ▪ $\varepsilon_{ij}$ is the error associated with $x_{ij}$;
  ▪ $\varepsilon_{ij}$ is assumed to be distributed normally with mean zero
    and variance $\sigma^2$ for all i and j.

ANOVA Table for Blocking Designs: Example 9-5

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square                F Ratio
Blocks                SSBL             n-1                  MSBL = SSBL/(n-1)          F = MSBL/MSE
Treatments            SSTR             r-1                  MSTR = SSTR/(r-1)          F = MSTR/MSE
Error                 SSE              (n-1)(r-1)           MSE = SSE/[(n-1)(r-1)]
Total                 SST              nr-1

Source of Variation   Sum of Squares   df    Mean Square   F Ratio
Blocks                2750             39    70.51         0.69
Treatments            2640             2     1320          12.93
Error                 7960             78    102.05
Total                 13350            119

α = 0.01, F(2, 78) = 4.88 ⇒ Since 12.93 > 4.88, the null hypothesis of no
treatment effects is rejected.
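The numbers in Example 9-5 follow directly from the general blocking table. The sketch below is illustrative: the function name is an assumption, and n = 40 blocks and r = 3 treatments are inferred from the degrees of freedom shown (39 and 2) rather than stated on the slide.

```python
def block_design_table(ssbl, sstr, sse, n, r):
    """Mean squares and F ratios for a randomized complete block design
    with n blocks and r treatments."""
    df = {"Blocks": n - 1, "Treatments": r - 1, "Error": (n - 1) * (r - 1)}
    ms = {
        "Blocks": ssbl / df["Blocks"],
        "Treatments": sstr / df["Treatments"],
        "Error": sse / df["Error"],
    }
    f = {
        "Blocks": ms["Blocks"] / ms["Error"],
        "Treatments": ms["Treatments"] / ms["Error"],
    }
    return df, ms, f


# Example 9-5 (n and r inferred from the df column).
df_b, ms_b, f_b = block_design_table(2750, 2640, 7960, n=40, r=3)
for source in ("Blocks", "Treatments"):
    print(source, df_b[source], round(ms_b[source], 2), round(f_b[source], 2))
print("Error", df_b["Error"], round(ms_b["Error"], 2))

# Treatment test at alpha = 0.01 against the critical value quoted in the example.
print("reject H0" if f_b["Treatments"] > 4.88 else "do not reject H0")
```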



Template for the Randomized Complete Block Design

Two-Way ANOVA Using the Template for Problem 9-42

Two-Way ANOVA Using Minitab for Problem 9-42

Two-Way ANOVA Using Minitab for Problem 9-42
