Simple Recap
T-test
Assumptions of ANOVA
Each group is approximately normal
Standard deviations of each group are approximately equal
One-Way ANOVA
Partitions Total Variation
Total variation = variation due to treatment + variation due to random sampling
Variation due to treatment (Among Groups Variation):
Sum of Squares Among
Sum of Squares Between
Sum of Squares Treatment (SST)
Variation due to random sampling (Within Groups Variation), due to individual differences within groups:
Sum of Squares Within
Sum of Squares Error (SSE)
Total Variation
\[ SS(\text{Total}) = (Y_{11} - \bar{Y})^2 + (Y_{21} - \bar{Y})^2 + \cdots + (Y_{ij} - \bar{Y})^2 \]
[Figure: Response, Y for Group 1, Group 2, Group 3]
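Treatment Variation
\[ SSB\ (SS_{trt}) = n_1(\bar{Y}_1 - \bar{Y})^2 + n_2(\bar{Y}_2 - \bar{Y})^2 + \cdots + n_p(\bar{Y}_p - \bar{Y})^2 \]
[Figure: Response, Y for Group 1, Group 2, Group 3, marking the group means $\bar{Y}_1$, $\bar{Y}_2$, $\bar{Y}_3$ and the overall mean $\bar{Y}$]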
[Figure: Response, Y for Group 1, Group 2, Group 3, marking the group means $\bar{Y}_1$, $\bar{Y}_2$, $\bar{Y}_3$]
2. Degrees of Freedom
$\nu_1 = p - 1$
$\nu_2 = n - p$
p = # Populations, Groups, or Levels
n = Total Sample Size
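One-Way ANOVA
Summary Table
______________________________________________________________________________
Source of    Degrees of   Sum of                   Mean Square          F
Variation    Freedom      Squares                  (Variance)
______________________________________________________________________________
Treatment    p - 1        SSB                      MSB = SSB/(p - 1)    MSB/MSW
Error        n - p        SSW                      MSW = SSW/(n - p)
Total        n - 1        SS(Total) = SSB + SSW
______________________________________________________________________________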
Weekly sales by marketing strategy
Convenience   Quality   Price
529           804       672
658           630       531
793           774       443
514           717       596
663           679       602
719           604       502
711           620       659
606           697       689
461           706       675
529           615       512
498           492       691
663           719       733
604           787       698
495           699       776
485           572       561
557           523       572
353           584       469
557           634       581
542           580       679
614           624       532
H0: $\mu_1 = \mu_2 = \mu_3$
H1: At least two means differ

Group         Count   Sum     Average   Variance
Convenience   20      11551   577.55    10775.00
Quality       20      13060   653.00     7238.11
Price         20      12173   608.65     8670.24

ANOVA
Source of Variation   SS       df   MS      F      P-value   F crit
Between Groups        57512     2   28756   3.23   0.0468    3.16
Within Groups         506984   57    8894
Total                 564496   59
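The ANOVA table above can be reproduced from the weekly sales data with a few lines of plain Python. This is a minimal sketch rather than the course's own code: the function name `one_way_anova` and the dictionary layout are assumptions made for illustration, and the p-value is left out because it needs an F distribution routine that the standard library does not provide.

```python
# Weekly sales by marketing strategy, as listed in the data above.
sales = {
    "Convenience": [529, 658, 793, 514, 663, 719, 711, 606, 461, 529,
                    498, 663, 604, 495, 485, 557, 353, 557, 542, 614],
    "Quality": [804, 630, 774, 717, 679, 604, 620, 697, 706, 615,
                492, 719, 787, 699, 572, 523, 584, 634, 580, 624],
    "Price": [672, 531, 443, 596, 602, 502, 659, 689, 675, 512,
              691, 733, 698, 776, 561, 572, 469, 581, 679, 532],
}


def one_way_anova(groups):
    """Return SSB, SSW, degrees of freedom, mean squares and F.

    `groups` maps a group label to its list of observations
    (hypothetical helper, written for this illustration).
    """
    all_values = [y for ys in groups.values() for y in ys]
    n, p = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    means = {name: sum(ys) / len(ys) for name, ys in groups.items()}

    # Between-groups (treatment) sum of squares: n_l * (group mean - grand mean)^2
    ssb = sum(len(ys) * (means[name] - grand_mean) ** 2
              for name, ys in groups.items())
    # Within-groups (error) sum of squares: deviations from each group's own mean
    ssw = sum((y - means[name]) ** 2
              for name, ys in groups.items() for y in ys)

    msb, msw = ssb / (p - 1), ssw / (n - p)
    return {"SSB": ssb, "SSW": ssw, "df": (p - 1, n - p),
            "MSB": msb, "MSW": msw, "F": msb / msw}


result = one_way_anova(sales)
for key, value in result.items():
    print(key, value)
```

The computed F can then be compared with the critical value 3.16 reported in the table.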
/* Read (strategy, sale) pairs; @@ lets several pairs share one data line */
Data Sales;
input strategy $ sale @@;
cards;
C 529 Q 804 P 672
.;
run;
[Figure: panels plotting Mean response against Levels of factor A, with one line for Level 1 of factor B and one for Level 2 of factor B. Recoverable panel titles: "Difference between the levels of factor A, and difference between the levels of factor B; no interaction" and "Difference between the levels of factor A; no difference between the levels of factor B".]
[Figure: "Interaction". Mean response plotted against Levels of factor A for cells A1B1, A1B2, A2B1, A2B2; axis tick values are not recoverable.]
[Figure: two panels plotting Average Response for male and female across sv, lv, and nor; one panel is titled "No Interaction".]
Interaction
1. Occurs When Effects of One Factor Vary According to Levels of Other Factor
2. When Significant, Interpretation of Main Effects (A & B) Is Complicated
3. Can Be Detected in a Graph of Cell Means: Lines Cross or, More Generally, Are Not Parallel
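For a 2 x 2 layout, the "lines are not parallel" idea can be checked numerically as a difference of differences of cell means. The sketch below uses made-up cell means (both tables are hypothetical, not taken from the slides), and `interaction_contrast` is a name chosen here for illustration.

```python
def interaction_contrast(cell_means):
    """Difference of differences for a 2x2 table of cell means.

    cell_means[i][j] is the mean response at level i of factor A and
    level j of factor B. A value of 0 means the two lines in the plot
    of cell means are parallel (no interaction in the cell means).
    """
    (a1b1, a1b2), (a2b1, a2b2) = cell_means
    return (a2b1 - a1b1) - (a2b2 - a1b2)


# HYPOTHETICAL cell means: the effect of A is +6 at both levels of B.
parallel = [[10.0, 14.0], [16.0, 20.0]]
# HYPOTHETICAL cell means: the effect of A reverses sign across levels of B.
crossing = [[10.0, 20.0], [20.0, 10.0]]

print(interaction_contrast(parallel))
print(interaction_contrast(crossing))
```

A nonzero contrast in sample cell means only suggests interaction; whether it is significant is decided by the F test for interaction.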
Two-Way ANOVA
Total Variation Partitioning
Total Variation, SS(Total), splits into four parts:
Variation Due to Treatment A: SSA
Variation Due to Treatment B: SSB
Variation Due to Interaction: SS(AB)
Variation Due to Random Sampling: SSE
SS(Total) = SSA + SSB + SS(AB) + SSE
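For reference, the four components can be written out for a balanced design. This is a sketch under stated assumptions: $a$ and $b$ are the numbers of levels of factors A and B, $r = n/(ab)$ is the number of observations per cell (the symbol $r$ is introduced here and is not in the slides), $Y_{ijk}$ is observation $k$ in cell $(i, j)$, and a dot in a subscript means averaging over that index.

```latex
\begin{aligned}
SS(\text{Total}) &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{r} (Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot})^2 \\
SSA &= rb \sum_{i=1}^{a} (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 \\
SSB &= ra \sum_{j=1}^{b} (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 \\
SS(AB) &= r \sum_{i=1}^{a}\sum_{j=1}^{b} (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot})^2 \\
SSE &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{r} (Y_{ijk} - \bar{Y}_{ij\cdot})^2
\end{aligned}
```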
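Two-Way ANOVA
Null Hypotheses
1. No Difference in Means Due to Factor A
H0: $\mu_{1.} = \mu_{2.} = \cdots = \mu_{a.}$
Test statistic: F = MS(A)/MSE
Rejection region: $F > F_{\alpha,\, a-1,\, n-ab}$
The tests for factor B and for the interaction follow the same pattern:
Factor B: F = MS(B)/MSE = [SS(B)/(b-1)] / [SSE/(n-ab)], rejection region $F > F_{\alpha,\, b-1,\, n-ab}$
Interaction: F = MS(AB)/MSE, rejection region $F > F_{\alpha,\,(a-1)(b-1),\, n-ab}$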
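_______________________________________________________________________________________________________________
Source        Sums of    Deg. of       Mean squares                 F ratio     Reference distribution
              squares    freedom
_______________________________________________________________________________________________________________
Between A     SSA        K-1           MSA = SSA/(K-1)              MSA/MSE     $F_{K-1,\, KH(L-1)}$
Between B     SSB        H-1           MSB = SSB/(H-1)              MSB/MSE     $F_{H-1,\, KH(L-1)}$
Interaction   SSAB       (K-1)(H-1)    MSAB = SSAB/((K-1)(H-1))     MSAB/MSE    $F_{(K-1)(H-1),\, KH(L-1)}$
Error         SSE        KH(L-1)       MSE = SSE/(KH(L-1))
Total         SST        n-1
_______________________________________________________________________________________________________________
Here K and H appear to be the numbers of levels of factors A and B and L the number of observations per cell, so that n = KHL and the degrees of freedom add up to n-1.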
Convenience   Quality   Price
491           677       575
712           627       614
558           590       706
447           632       484
479           683       478
624           760       650
546           690       583
444           548       536
582           579       579
672           644       795
464           689       803
559           650       584
759           704       525
557           652       498
528           576       812
670           836       565
534           628       708
657           798       546
557           497       616
474           841       587
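$Y_{ijk}$: observation k at Level i of Factor A and Level j of Factor B
Test for interaction: F = MS(AB)/MSE
F = MS(Marketing*Media)/MSE = .09
$F_{critical} = F_{\alpha,\,(a-1)(b-1),\, n-ab} = F_{.05,\,(3-1)(2-1),\, 60-(3)(2)} = 3.17$ (p-value = .9171)
Since .09 < 3.17, the hypothesis of no interaction is not rejected.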
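The interaction F ratio can be recomputed from the Convenience / Quality / Price data listed earlier. This is a sketch under a loudly stated assumption: the slides do not say which observations belong to which media level, so the code assumes the first ten values listed for each strategy come from one media level and the last ten from the other. With that split the result agrees with the reported F of .09. The names `cells`, `data`, and `mean` are illustrative choices.

```python
# ASSUMPTION: for each strategy, the first ten listed values belong to one
# media level and the last ten to the other (not stated in the slides).
cells = {
    "Convenience": ([491, 712, 558, 447, 479, 624, 546, 444, 582, 672],
                    [464, 559, 759, 557, 528, 670, 534, 657, 557, 474]),
    "Quality": ([677, 627, 590, 632, 683, 760, 690, 548, 579, 644],
                [689, 650, 704, 652, 576, 836, 628, 798, 497, 841]),
    "Price": ([575, 614, 706, 484, 478, 650, 583, 536, 579, 795],
              [803, 584, 525, 498, 812, 565, 708, 546, 616, 587]),
}


def mean(xs):
    return sum(xs) / len(xs)


data = [list(pair) for pair in cells.values()]  # data[i][j] = observations in cell (i, j)
a, b, r = len(data), 2, 10                      # levels of A, levels of B, replicates per cell
n = a * b * r

cell_mean = [[mean(data[i][j]) for j in range(b)] for i in range(a)]
# Averaging cell means is valid here because the design is balanced.
row_mean = [mean(cell_mean[i]) for i in range(a)]
col_mean = [mean([cell_mean[i][j] for i in range(a)]) for j in range(b)]
grand = mean(row_mean)

ss_a = r * b * sum((m - grand) ** 2 for m in row_mean)
ss_b = r * a * sum((m - grand) ** 2 for m in col_mean)
ss_ab = r * sum((cell_mean[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
                for i in range(a) for j in range(b))
sse = sum((y - cell_mean[i][j]) ** 2
          for i in range(a) for j in range(b) for y in data[i][j])
ss_total = sum((y - grand) ** 2
               for i in range(a) for j in range(b) for y in data[i][j])

mse = sse / (n - a * b)
f_ab = (ss_ab / ((a - 1) * (b - 1))) / mse

print("SS(AB) =", ss_ab, "SSE =", sse)
print("F for interaction =", round(f_ab, 2))
```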
Multiple Comparisons
When the null hypothesis is rejected, it may be desirable to find which mean(s) is (are) different, and in what rank order.
Three commonly used statistical inference procedures:
Fisher's least significant difference (LSD) method
Bonferroni adjustment
Tukey's multiple comparison method
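As an illustration of the first two procedures, the sketch below applies Fisher's LSD to the three strategy means from the one-way ANOVA output (averages 577.55, 653.00, 608.65; MSW = 8894; 20 observations per group). The critical value `t_crit` is an assumption: it is an approximate table value for the upper 2.5% point of the t distribution with 57 degrees of freedom, hard-coded because the standard library has no t quantile function. The Bonferroni line only shows the adjusted per-comparison significance level.

```python
from itertools import combinations
from math import sqrt

# Summary statistics taken from the one-way ANOVA output for the sales data.
means = {"Convenience": 577.55, "Quality": 653.00, "Price": 608.65}
n_per_group = 20
msw = 8894          # within-groups mean square
alpha = 0.05

# ASSUMPTION: approximate t table value, upper 2.5% point with 57 d.f.
t_crit = 2.0025

# Fisher's LSD: two means differ if |difference| exceeds
# t * sqrt(MSW * (1/n_i + 1/n_j)).
lsd = t_crit * sqrt(msw * (1 / n_per_group + 1 / n_per_group))

significant = []
for g1, g2 in combinations(means, 2):
    diff = abs(means[g1] - means[g2])
    print(g1, "vs", g2, "difference", round(diff, 2), "LSD", round(lsd, 2))
    if diff > lsd:
        significant.append((g1, g2))

# Bonferroni adjustment: divide alpha by the number of pairwise comparisons.
n_comparisons = len(means) * (len(means) - 1) // 2
alpha_bonferroni = alpha / n_comparisons
print("Bonferroni per-comparison alpha:", alpha_bonferroni)
```

With the Bonferroni adjustment the same comparison would use a larger critical value (the t point for the adjusted level), so it is more conservative than LSD.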
Example 1
MANOVA
DV
Y1=
Y2=
IV
1=
2=
3=
4=
5=
Example 2
Example 3
Thus
ANOVA
Setting:
Group 1: $X_{11}, X_{12}, \ldots, X_{1n_1}$, i.i.d. $\sim N(\mu_1, \sigma^2)$
Group 2: $X_{21}, X_{22}, \ldots, X_{2n_2}$, i.i.d. $\sim N(\mu_2, \sigma^2)$
$\vdots$
Group g: $X_{g1}, X_{g2}, \ldots, X_{gn_g}$, i.i.d. $\sim N(\mu_g, \sigma^2)$
The random samples from different groups are independent.
Test: $H_0: \mu_1 = \mu_2 = \cdots = \mu_g$
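MANOVA
Setting:
Group 1: $\mathbf{X}_{11}, \mathbf{X}_{12}, \ldots, \mathbf{X}_{1n_1}$, i.i.d. $\sim N(\boldsymbol{\mu}_1, \boldsymbol{\Sigma})$
Group 2: $\mathbf{X}_{21}, \mathbf{X}_{22}, \ldots, \mathbf{X}_{2n_2}$, i.i.d. $\sim N(\boldsymbol{\mu}_2, \boldsymbol{\Sigma})$
$\vdots$
Group g: $\mathbf{X}_{g1}, \mathbf{X}_{g2}, \ldots, \mathbf{X}_{gn_g}$, i.i.d. $\sim N(\boldsymbol{\mu}_g, \boldsymbol{\Sigma})$
The random samples from different groups are independent.
Test: $H_0: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2 = \cdots = \boldsymbol{\mu}_g$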
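ANOVA
Under the ANOVA setting stated earlier (groups $l = 1, \ldots, g$, observations $j = 1, \ldots, n_l$), since
\[ \mu_l = \mu + (\mu_l - \mu), \]
each observation can be written as
\[ X_{lj} = \mu + (\mu_l - \mu) + e_{lj}, \qquad e_{lj} \sim N(0, \sigma^2), \]
that is, (overall mean) + (treatment effect) + (random error).
The sample analogue is
\[ x_{lj} = \bar{x} + (\bar{x}_l - \bar{x}) + (x_{lj} - \bar{x}_l), \]
that is, (observation) = (overall sample mean) + (estimated treatment effect) + (residual).
Squaring the deviation from the overall sample mean:
\[ (x_{lj} - \bar{x})^2 = (\bar{x}_l - \bar{x})^2 + (x_{lj} - \bar{x}_l)^2 + 2(\bar{x}_l - \bar{x})(x_{lj} - \bar{x}_l) \]
Summing over $j$, the cross term vanishes because $\sum_{j=1}^{n_l} (x_{lj} - \bar{x}_l) = 0$:
\[ \sum_{j=1}^{n_l} (x_{lj} - \bar{x})^2 = n_l(\bar{x}_l - \bar{x})^2 + \sum_{j=1}^{n_l} (x_{lj} - \bar{x}_l)^2 \]
Summing over $l$ (here SST denotes the total sum of squares):
\[ \underbrace{\sum_{l=1}^{g}\sum_{j=1}^{n_l} (x_{lj} - \bar{x})^2}_{SST} = \underbrace{\sum_{l=1}^{g} n_l(\bar{x}_l - \bar{x})^2}_{SSB} + \underbrace{\sum_{l=1}^{g}\sum_{j=1}^{n_l} (x_{lj} - \bar{x}_l)^2}_{SSW} \]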
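ANOVA Table
______________________________________________________________________________________
Source of      Sum of                                                    Degrees of
variation      squares (SS)                                              freedom (d.f.)
______________________________________________________________________________________
Treatments     $SSB = \sum_{l=1}^{g} n_l(\bar{x}_l - \bar{x})^2$                    $g - 1$
Residual       $SSW = \sum_{l=1}^{g}\sum_{j=1}^{n_l} (x_{lj} - \bar{x}_l)^2$        $\sum_{l=1}^{g} n_l - g$
______________________________________________________________________________________
Total          $SST = \sum_{l=1}^{g}\sum_{j=1}^{n_l} (x_{lj} - \bar{x})^2$          $\sum_{l=1}^{g} n_l - 1$
______________________________________________________________________________________
If
\[ F = \frac{SSB/(g-1)}{SSW/\left(\sum_{l=1}^{g} n_l - g\right)} > F_{g-1,\ \sum_{l=1}^{g} n_l - g}(\alpha), \]
reject $H_0$.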
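MANOVA
Under the MANOVA setting stated earlier, since
\[ \boldsymbol{\mu}_l = \boldsymbol{\mu} + (\boldsymbol{\mu}_l - \boldsymbol{\mu}), \]
each observation vector can be written as
\[ \mathbf{X}_{lj} = \boldsymbol{\mu} + (\boldsymbol{\mu}_l - \boldsymbol{\mu}) + \mathbf{e}_{lj}, \qquad \mathbf{e}_{lj} \sim N(\mathbf{0}, \boldsymbol{\Sigma}), \]
that is, (overall mean) + (treatment effect) + (random error).
The sample analogue is
\[ \mathbf{x}_{lj} = \bar{\mathbf{x}} + (\bar{\mathbf{x}}_l - \bar{\mathbf{x}}) + (\mathbf{x}_{lj} - \bar{\mathbf{x}}_l), \]
that is, (observation) = (overall sample mean) + (estimated treatment effect) + (residual).
Taking outer products and summing over $j$ and $l$ gives
\[ \underbrace{\sum_{l=1}^{g}\sum_{j=1}^{n_l} (\mathbf{x}_{lj} - \bar{\mathbf{x}})(\mathbf{x}_{lj} - \bar{\mathbf{x}})'}_{T} = \underbrace{\sum_{l=1}^{g} n_l(\bar{\mathbf{x}}_l - \bar{\mathbf{x}})(\bar{\mathbf{x}}_l - \bar{\mathbf{x}})'}_{B} + \underbrace{\sum_{l=1}^{g}\sum_{j=1}^{n_l} (\mathbf{x}_{lj} - \bar{\mathbf{x}}_l)(\mathbf{x}_{lj} - \bar{\mathbf{x}}_l)'}_{W} \]
T: Total sum of squares and cross products
B: Between (treatment) sum of squares and cross products
W: Within sum of squares and cross products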
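MANOVA Table
______________________________________________________________________________________________________________
Source of      Sum of                                                                                Degrees of
variation      squares (SS)                                                                          freedom (d.f.)
______________________________________________________________________________________________________________
Treatments     $B = \sum_{l=1}^{g} n_l(\bar{\mathbf{x}}_l - \bar{\mathbf{x}})(\bar{\mathbf{x}}_l - \bar{\mathbf{x}})'$                          $g - 1$
Residual       $W = \sum_{l=1}^{g}\sum_{j=1}^{n_l} (\mathbf{x}_{lj} - \bar{\mathbf{x}}_l)(\mathbf{x}_{lj} - \bar{\mathbf{x}}_l)'$              $\sum_{l=1}^{g} n_l - g$
______________________________________________________________________________________________________________
Total          $T = \sum_{l=1}^{g}\sum_{j=1}^{n_l} (\mathbf{x}_{lj} - \bar{\mathbf{x}})(\mathbf{x}_{lj} - \bar{\mathbf{x}})'$                  $\sum_{l=1}^{g} n_l - 1$
______________________________________________________________________________________________________________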
$W^{-1}B$.
The first such variate, which we will label Y1, is that linear combination of the original variables that has the property that SSB/SSW of Y1 is maximized. In other words, SSB/SSW is less for any other linear combination of the original variables. That is, a new dimension is defined in the original space in such a way that group differences are maximized on this dimension.
\[ \max_{a} \frac{\text{between-class variance of } Y}{\text{within-class variance of } Y} = \max_{a} \frac{a^T B a}{a^T W a}, \qquad Y = a^T X \]
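The maximization above can be checked numerically in the two-variable case, where $W^{-1}B$ is a 2 x 2 matrix and its eigenvalues have a closed form. The sketch below is an illustration only: the data are hypothetical (not from the slides), the helper names are made up, and plain Python is used instead of a linear algebra library. It builds B and W, finds the eigenvalues of $W^{-1}B$, and confirms that the leading eigenvector attains a ratio $a^T B a / a^T W a$ equal to the largest eigenvalue.

```python
from math import sqrt

# HYPOTHETICAL data: three groups, two variables per observation.
groups = [
    [(2.0, 3.0), (3.0, 5.0), (4.0, 4.0)],
    [(5.0, 6.0), (6.0, 8.0), (7.0, 7.0)],
    [(8.0, 4.0), (9.0, 6.0), (10.0, 5.0)],
]


def mean_vec(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def outer(d):
    """2x2 outer product d d' of a 2-vector d."""
    return [[d[0] * d[0], d[0] * d[1]], [d[1] * d[0], d[1] * d[1]]]


def add(m1, m2, weight=1.0):
    return [[m1[i][j] + weight * m2[i][j] for j in range(2)] for i in range(2)]


grand = mean_vec([p for g in groups for p in g])
B = [[0.0, 0.0], [0.0, 0.0]]   # between sum of squares and cross products
W = [[0.0, 0.0], [0.0, 0.0]]   # within sum of squares and cross products
for g in groups:
    m = mean_vec(g)
    B = add(B, outer((m[0] - grand[0], m[1] - grand[1])), weight=len(g))
    for p in g:
        W = add(W, outer((p[0] - m[0], p[1] - m[1])))

# M = W^{-1} B, using the closed-form inverse of a 2x2 matrix.
det_w = W[0][0] * W[1][1] - W[0][1] * W[1][0]
W_inv = [[W[1][1] / det_w, -W[0][1] / det_w],
         [-W[1][0] / det_w, W[0][0] / det_w]]
M = [[sum(W_inv[i][k] * B[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]

# Eigenvalues of a 2x2 matrix from its trace and determinant.
trace = M[0][0] + M[1][1]
det_m = M[0][0] * M[1][1] - M[0][1] * M[1][0]
root = sqrt(max(trace * trace - 4.0 * det_m, 0.0))
lam1, lam2 = (trace + root) / 2.0, (trace - root) / 2.0

# Eigenvector for lam1 (valid here because M[0][1] is nonzero for these data).
a = (M[0][1], lam1 - M[0][0])


def ratio(v):
    """Between-to-within ratio v'Bv / v'Wv for the combination Y = v'X."""
    def quad(A):
        return sum(v[i] * A[i][j] * v[j] for i in range(2) for j in range(2))
    return quad(B) / quad(W)


print("eigenvalues:", lam1, lam2)
print("ratio along leading eigenvector:", ratio(a))
print("ratio along each original variable:", ratio((1.0, 0.0)), ratio((0.0, 1.0)))
```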
Result:
Let $\lambda_1, \lambda_2, \ldots, \lambda_s > 0$ denote the $s$ nonzero eigenvalues of
$W^{-1}B$
and
Tests of Hypotheses:
(1) There is no significant main effect for education level (F(2, 58) = 1.685, p = .194, partial eta squared = .055) (red dots)
(2) There is no significant main effect for marital status (F(1, 58) = .441, p = .509, partial eta squared = .008) (green dots)
(3) There is a significant interaction effect of marital status and education level (F(2, 58) = 3.586, p = .034, partial eta squared = .110) (blue dots)
[Figure: plot of cell means (vertical axis ticks 4 to 7) against CollegeorNot (HighSchool, SomePostHigh, CollegeorMore), with separate lines for MarriedorNot (Married/Partner, NotMarried/Partner).]
MANOVA Example
[Figure: vectors of means on the three DVs (Y1, Y2, Y3), labeled MY1, MY2, MY3, for Regions South and Midwest.]
First we will look at the overall F test (over all three dependent variables). What we
are most interested in is a statistic called Wilks' lambda ($\Lambda$), and the F value
associated with that. In the case of our IV, REGION, Wilks' lambda is .465, and has
an associated F of 3.90, which is significant at p < .001. Thus the null hypothesis
was rejected.
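For orientation, the usual textbook definition of Wilks' lambda in terms of the MANOVA matrices is given below. This is the standard definition, not a formula quoted from the slides; $B$ and $W$ are the between and within sums of squares and cross products matrices, and $\lambda_1, \ldots, \lambda_s$ are the nonzero eigenvalues of $W^{-1}B$.

```latex
\Lambda = \frac{|W|}{|B + W|} = \prod_{i=1}^{s} \frac{1}{1 + \lambda_i}
```

Values of $\Lambda$ close to 0 indicate that between-group variation is large relative to within-group variation, which is evidence against the null hypothesis of equal mean vectors.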
Above is a portion of the output table reporting the ANOVA tests on the three
dependent variables, abortions per 1000, divorces per 1000, and % Christian
adherents. Note that only the F values for % Christian adherents and divorces per
1000 population are significant at your criterion of .017 (presumably .05/3, a
Bonferroni adjustment for three univariate tests). (Note: the MANOVA
procedure doesn't seem to let you set different p levels for the overall test and the
univariate tests, so the power here is higher than it would be if you did these tests
separately in an ANOVA procedure and set p to .017 before you did the tests.)
SAS Code
See page 314 in the textbook Applied Multivariate Statistical Analysis, by Richard A. Johnson and Dean W. Wichern, China Statistics Press, 2003.