
One-Way ANOVA*

*main source: Vernoy & Vernoy (1997)

Quick Reminder

The Independent-Samples t Test

The Correlated-Samples t Test

The Need for One-Way ANOVA

The General Idea About One-Way Analysis of Variance (ANOVA)
- The z test is applicable for testing hypotheses about any normally distributed data generated from a large n and involving one independent variable.
- t tests are applicable for testing hypotheses about data generated from:
    - one independent variable involving a single group of sample
    - one independent variable involving two independent groups of sample
    - one two-level independent variable involving two correlated groups of sample
- Strictly speaking, one-way ANOVA is an extended version of a t test; it lets you test one independent variable with 3 or more levels at a time.

Strong Notes:
- If you are to conduct hypothesis testing of a specific independent variable involving 3 or more levels, you cannot draw a directly transitive conclusion from multiple t tests among the different groups (of the kind a single t test permits for two groups)!
- Instead, you have to employ a one-way ANOVA (sometimes called the F test) for this purpose.
- As with t tests, the aim of one-way ANOVA is to determine whether the null hypothesis (H0) can be safely rejected.

About One-Way ANOVA
- ANOVA is an evaluation of the random differences between scores or subjects. In any research involving three or more groups, with each group containing several subjects, it is possible that any differences between the groups are due either to experimental manipulation or to chance differences between the subjects in the different groups.
- For example, the means of the three groups shown in Table 13.1 are all different from one another. The type 1 pedal arrangement has a mean of 2 errors, whereas the type 2 pedal arrangement has a mean of 3.8 errors and the type 3 pedal arrangement has a mean of 5.4 errors. These differences may be because increasing the separation between the pedals causes more errors, or because just by chance the people who tend to make more pedal errors were assigned to the type 3 pedal arrangement.
- If the H0 is true and the independent variable has no real effect, then the differences in the number of errors for the three pedal arrangements are due solely to chance differences in the drivers' abilities.

- To test whether differences among sample groups are due merely to chance, we can conduct an ANOVA.
- In an ANOVA, we use the data gathered from the samples to make two separate estimates of the variance (denoted by MSwg and MSbg) in the population. We arrive at these estimates using two very different methods, and then we compare the two estimates to see whether they are similar.

Comparing The Estimated Variances

- The key to a one-way ANOVA is comparing two separate estimates that are arrived at using two distinctly different methods, based on:
    a. the variance within each sample (MSwg)
    b. the differences between the means of the samples (MSbg)
- If both estimates are similar or exactly the same, then it stands to reason that the samples are probably from the same population and the H0 is true.
- If the estimates are very different (MSbg much larger than MSwg), then at least one of the samples probably comes from a population different from the other samples, so the H1 is probably true.

Essential Steps in Performing One-Way ANOVA

- Setting The Hypotheses
- Estimating MSs*
- Finding F Value*
- Finding Degree of Freedom*
- Finding Critical F Value*
- Drawing up a Conclusion
- Finding p Value (optional)*
- Conducting post hoc Test*

* Executable by SPSS

The Example:
You are to study unintended acceleration as a factor that causes road accidents. It is hypothesised that the distance between the brake and the accelerator pedals is a contributing factor in driver errors that cause road accidents. To test this hypothesis, you design an experiment in which subjects use one of three driving simulators, each with a different pedal arrangement. In the close-pedal arrangement, the distance between pedals is only 1 inch; in the moderate-pedal arrangement, it's 2 inches; and in the far-pedal arrangement, 3 inches.
You assign ten subjects to each condition and instruct them to drive their simulators for 4 hours. During these 4 hours, the number of errors made by each driver is recorded.

The Hypotheses
The null hypothesis:
H0 : There is no difference between the population means of any of the levels
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$
where k is the number of levels of the independent variable

The alternative hypothesis:
H1 : At least one of the sample means comes from a population different from that of the other sample means.

(Refer to Example)

H0 : There is no difference between the population means of any of the three levels
$H_0: \mu_1 = \mu_2 = \mu_3$
H1 : At least one of the three sample means comes from a population different from that of the other sample means.

Estimating The Mean Square (MS)

- The MS is the mean (the average) of the squared deviation scores used to calculate the variation.
- MS is equivalent to the variance estimate (est. $\sigma^2$) for the t tests
- There are two types of MS:
    - mean square within groups (MSwg)
    - mean square between groups (MSbg)

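Mean Square Within Group

$$MS_{wg} = \frac{SS_{wg}}{df_{wg}} = \frac{\sum_{m=1}^{k} \sum_{i=1}^{n_m} (X_{im} - \bar{X}_m)^2}{(n_1 - 1) + (n_2 - 1) + \dots + (n_k - 1)}$$

where
$\bar{X}_m$ is the mean for the mth group
$n_m$ is the number of observations for the mth group
k is the total number of different groups

Mean Square Between Group

$$MS_{bg} = \frac{SS_{bg}}{df_{bg}} = \frac{\sum_{m=1}^{k} n_m (\bar{X}_m - \bar{X}_G)^2}{k - 1}$$

where
$\bar{X}_m$ is the mean for the mth group
$\bar{X}_G$ is the grand mean of all observations
k is the total number of different groups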

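note: MSbg is not always bigger than MSwg; the two are expected to be about equal when H0 is true, and MSbg tends to exceed MSwg when H0 is false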

(Refer to Example)
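As a rough check on the two estimates, the sketch below recomputes MSbg from the three group means quoted earlier (2, 3.8 and 5.4 errors, ten drivers per group). The raw scores of Table 13.1 are not reproduced in this text, so the within-groups sum of squares (40.000) is taken from the SPSS output shown later in these slides. The function and variable names are illustrative only.

```python
# Group means and sizes as quoted in the slides (pedal types 1, 2, 3).
group_means = [2.0, 3.8, 5.4]
group_sizes = [10, 10, 10]

# Assumption: SSwg is copied from the SPSS output later in the slides,
# because the raw scores needed to compute it are not reproduced here.
ss_within = 40.000


def ss_between(means, sizes):
    """Sum of squares between groups: sum of n_m * (mean_m - grand_mean)^2."""
    total_n = sum(sizes)
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / total_n
    return sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))


k = len(group_means)
df_bg = k - 1                            # degrees of freedom between groups
df_wg = sum(n - 1 for n in group_sizes)  # degrees of freedom within groups

ms_bg = ss_between(group_means, group_sizes) / df_bg
ms_wg = ss_within / df_wg

print(f"MSbg = {ms_bg:.4f}")
print(f"MSwg = {ms_wg:.4f}")
```

Under these assumptions the two values agree with the mean squares reported in the SPSS table (28.9333 and 1.4815).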


Finding The F Value

- The F test is used to decide whether to reject the H0 or fail to reject H0 (for one independent variable with 3 or more levels at a time).
- It is done by comparing the MSbg and MSwg using the following formula:

$$F = \frac{MS_{bg}}{MS_{wg}}$$

- Observe that $F \ge 0$, because both mean squares are non-negative; F can fall below 1 when MSbg happens to be smaller than MSwg.
- If MSbg and MSwg are similar, then $F \approx 1$

(Refer to Example)

- In the example, $F = MS_{bg} / MS_{wg} = 28.930 / 1.481 = 19.534$
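The ratio can be sketched in a few lines. The helper name `f_ratio` is hypothetical, and the second call uses made-up mean squares purely to show that F drops below 1 when MSbg is the smaller estimate.

```python
def f_ratio(ms_between, ms_within):
    """Return F = MSbg / MSwg; MSwg must be positive."""
    if ms_within <= 0:
        raise ValueError("MSwg must be positive")
    return ms_between / ms_within


# Rounded mean squares quoted for the pedal example.
f_example = f_ratio(28.930, 1.481)
print(round(f_example, 3))

# Hypothetical values (not from the slides): MSbg smaller than MSwg.
f_small = f_ratio(1.2, 1.5)
print(f_small < 1)
```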

Finding Degrees of Freedom (df)

- There are two types of degrees of freedom:
    a. degrees of freedom for the mean square between groups (dfbg)
    b. degrees of freedom for the mean square within groups (dfwg)

$$df_{bg} = k - 1$$
$$df_{wg} = (n_1 - 1) + (n_2 - 1) + \dots + (n_k - 1)$$

where
k is the total number of different groups
$n_m$ is the number of observations for the mth group

(Refer to Example)

- In the example,
$df_{bg} = 3 - 1 = 2$
$df_{wg} = (10 - 1) + (10 - 1) + (10 - 1) = 27$
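The two formulas translate directly into code. This is a minimal sketch; `degrees_of_freedom` is an illustrative name, and the only input is the list of group sizes.

```python
def degrees_of_freedom(group_sizes):
    """Return (dfbg, dfwg) for a one-way ANOVA with the given group sizes."""
    k = len(group_sizes)
    df_bg = k - 1                            # k - 1
    df_wg = sum(n - 1 for n in group_sizes)  # (n1 - 1) + ... + (nk - 1)
    return df_bg, df_wg


# Pedal example: three groups of ten drivers each.
df_bg, df_wg = degrees_of_freedom([10, 10, 10])
df_total = df_bg + df_wg  # equals N - 1 for N = 30 observations
print(df_bg, df_wg, df_total)
```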

Drawing Up A Conclusion
The hypothesis testing for ANOVA is concluded in a similar way to that for t tests.
In ANOVA, we refer to Table F to identify the critical value of F (denoted by Fcv).
Table F is arranged so that we look for the computed degrees of freedom between groups (i.e. dfbg) in the row at the top of the table (denoted by dfN).
The corresponding degrees of freedom within groups (i.e. dfwg) are in the leftmost column of the table (denoted by dfD).

(Refer to Example)

In the example, with α = 0.05 (dfN = 2 and dfD = 27), the critical F value is

Fcv = 3.35

Conclusion:
Observe that, with α = 0.05, the F = 19.534 lies in the rejection region (as F > Fcv).
Thus, we can reject H0 and accept H1 (i.e. there is a difference between at least two of the pedal arrangements and the difference is significant)
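The Table F lookup can be cross-checked numerically. The sketch below relies on a closed form that holds only when dfN = 2, as in this example: the upper-tail probability of F is then (1 + 2F/dfD)^(-dfD/2), which can be solved for the critical value. The function name is hypothetical; for other values of dfN a statistics library (for instance SciPy's `scipy.stats.f`) would be the usual route.

```python
def critical_f_dfn2(alpha, df_d):
    """Critical F for dfN = 2 only.

    Solves (1 + 2 * F / dfD) ** (-dfD / 2) = alpha for F.
    """
    return (df_d / 2) * (alpha ** (-2 / df_d) - 1)


f_cv_05 = critical_f_dfn2(0.05, 27)
f_cv_01 = critical_f_dfn2(0.01, 27)
print(round(f_cv_05, 2), round(f_cv_01, 2))

# Decision rule for the pedal example: reject H0 when F exceeds Fcv.
f_value = 19.534
reject_h0 = f_value > f_cv_05
print(reject_h0)
```

Under this special case the computed values round to the tabled 3.35 (α = 0.05) and 5.49 (α = 0.01).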

Finding The p Value (optional)
(This is vital if you use SPSS to perform ANOVA)

Note that F is the ratio of two estimates, i.e.

$$F = \frac{MS_{bg}}{MS_{wg}}$$

with $F \ge 0$.
If the estimates are very different (i.e. the F value is large), then at least one of the samples probably comes from a population different from the other samples, so the H1 is true (i.e. we can safely reject H0)

Recall that, in the example, using the Table F with α = 0.05, dfN = 2 and dfD = 27, the critical value of F is Fcv = 3.35.
The computed F = 19.534 is large compared to Fcv (i.e. much larger than the 3.35 that we get from the Table F at α = 0.05).
Therefore, data like these would be very unlikely if H0 were true, which favours H1.

In this case, if p is the probability of obtaining an F at least this large when H0 is true, then p is small, with p < 0.05 (in fact, p is smaller than 0.01, as the Fcv with α = 0.01 is 5.49).

This tells us that p is very small (i.e. less than the prescribed α). Therefore we reject H0 and accept H1.
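For this particular example the p value can be approximated without a table. The sketch assumes dfN = 2, where the upper-tail probability of the F distribution has the closed form (1 + 2F/dfD)^(-dfD/2); it does not apply to other values of dfN, and the function name is illustrative.

```python
def p_value_dfn2(f_value, df_d):
    """Upper-tail probability P(F >= f_value), valid for dfN = 2 only."""
    return (1 + 2 * f_value / df_d) ** (-df_d / 2)


# Pedal example: F = 19.534 with dfN = 2 and dfD = 27.
p = p_value_dfn2(19.534, 27)
print(f"p = {p:.2e}")
print(p < 0.05, p < 0.01)

# Sanity check: at the tabled critical value the tail area is about alpha.
p_at_fcv = p_value_dfn2(3.35, 27)
print(round(p_at_fcv, 2))
```

The result is far below 0.01, which is consistent with SPSS reporting Sig. = 0.000 after rounding to three decimals.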

The Source Table

You will find a source table useful for drawing up a conclusion and reporting an ANOVA.
A source table displays the vital information of the ANOVA performed on the data: sums of squares, degrees of freedom, mean squares, F value and p.

Source Table (α = 0.05)

Source            SS         df         MS      F              p
Between groups    SSbg       dfbg       MSbg    MSbg / MSwg    < or > 0.05
Within groups     SSwg       dfwg       MSwg
Total             SStotal    dftotal

note: a. Reject H0 if p < 0.05
      b. The exact p value can be found if you have a very detailed Table F with various values of α (like the one adopted by SPSS)

Source Table (α = 0.01)

Source            SS         df         MS      F              p
Between groups    SSbg       dfbg       MSbg    MSbg / MSwg    < or > 0.01
Within groups     SSwg       dfwg       MSwg
Total             SStotal    dftotal

note: a. Reject H0 if p < 0.01
      b. The exact p value can be found if you have a very detailed Table F with various values of α (like the one adopted by SPSS)

(Refer to Example)

Source Table (α = 0.05)
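The filled-in table for the example is not reproduced in this text, so the sketch below rebuilds one from the sums of squares and degrees of freedom that appear in the SPSS output later in the slides (57.8667 and 40.000, with 2 and 27 degrees of freedom). The function names and column layout are illustrative, and the p column is left out because it requires the F distribution.

```python
def source_table(ss_bg, ss_wg, df_bg, df_wg):
    """Return source-table rows as (source, SS, df, MS, F) tuples."""
    ms_bg = ss_bg / df_bg
    ms_wg = ss_wg / df_wg
    return [
        ("Between groups", ss_bg, df_bg, ms_bg, ms_bg / ms_wg),
        ("Within groups", ss_wg, df_wg, ms_wg, None),
        ("Total", ss_bg + ss_wg, df_bg + df_wg, None, None),
    ]


def format_table(rows):
    """Render the rows as aligned plain text."""
    lines = [f"{'Source':<16}{'SS':>10}{'df':>5}{'MS':>10}{'F':>9}"]
    for source, ss, df, ms, f in rows:
        ms_text = f"{ms:10.4f}" if ms is not None else ""
        f_text = f"{f:9.3f}" if f is not None else ""
        lines.append(f"{source:<16}{ss:10.4f}{df:5d}{ms_text}{f_text}")
    return "\n".join(lines)


# Sums of squares and df as reported in the SPSS ANOVA table.
rows = source_table(57.8667, 40.000, 2, 27)
print(format_table(rows))
```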

Performing Post Hoc Test

The F test tells us whether we can safely reject the H0. In the case where we reject H0, it indicates that there is some difference between at least two and possibly more of the groups, but it does not reveal where that difference lies.
However, there are several tests that can do so. These are called post hoc tests. Post hoc is Latin for "after the fact". These tests are only conducted after you have determined that you have an F ratio that is significant.
The one we discuss next is called Tukey's HSD, which stands for Tukey's honestly significant difference.

Performing Specific Post Hoc Test - HSD

The honestly significant difference (HSD) is used to compare sample means when an analysis of variance leads to a significant F; it cannot be used when the F ratio is not significantly large.
It reveals how far apart the sample means must be in order for them to be significantly different.
We can compute the HSD by using the following formula:

$$HSD = q \sqrt{\frac{MS_{wg}}{n}}$$

where
MSwg is the mean square within groups
n is the number of subjects in each sample
q is the q value identified from Table Q

The value for q can be found in Table Q.

To find the value of q, you must enter Table Q with the number of samples in the analysis of variance (k) and the number of degrees of freedom within (dfwg). This is sometimes a problem because the table does not list all possible degrees of freedom.
If your number of degrees of freedom is not listed, you must find the value in the table that is closest to yours without going over it.
In the example, with an alpha level of .05, k = 3 and dfwg = 27, you will find no corresponding q value (as there is no listing for 27 degrees of freedom within groups!).
Therefore, you must find the value for the next lower number of degrees of freedom, which is 24. Look across to find

q = 3.53

With these values (i.e. MSwg = 1.481, n = 10, q = 3.53)

$$HSD = 3.53 \sqrt{\frac{1.481}{10}} \approx 3.53 \times 0.385 \approx 1.359$$

This HSD value tells us that any difference between means of 1.359 pedal errors or greater is significant.
Let's examine the differences between the means of the various pedal arrangement types. The means for types 1, 2, and 3, respectively, are 2 errors, 3.8 errors, and 5.4 errors. The difference between the means for types 1 and 2 is 1.8 errors, which is greater than the required 1.359. The difference between the means for types 1 and 3 is 3.4 errors, and the difference between types 2 and 3 is 1.6 errors, both of which are also greater than the HSD of 1.359.
Thus, in this experiment all three pedal arrangements are significantly different from one another. Based on these results, automobile designers should choose pedal arrangement 1 because drivers who used it made significantly fewer errors.
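The comparison above can be sketched as a short script. It uses only the values quoted in the slides (q = 3.53, MSwg = 1.481, n = 10 and the three group means); the function and variable names are illustrative. Without rounding the square root, the HSD comes out marginally below the 1.359 quoted above, which does not change any conclusion.

```python
import math
from itertools import combinations


def tukey_hsd(q, ms_within, n_per_group):
    """Honestly significant difference: q * sqrt(MSwg / n)."""
    return q * math.sqrt(ms_within / n_per_group)


hsd = tukey_hsd(q=3.53, ms_within=1.481, n_per_group=10)
print(f"HSD = {hsd:.3f}")

# Mean number of errors for each pedal arrangement, as quoted in the slides.
means = {"type 1": 2.0, "type 2": 3.8, "type 3": 5.4}

# A pair of means differs significantly when their gap is at least the HSD.
results = []
for (name_a, mean_a), (name_b, mean_b) in combinations(means.items(), 2):
    diff = abs(mean_a - mean_b)
    results.append((name_a, name_b, diff, diff >= hsd))

for name_a, name_b, diff, significant in results:
    verdict = "significant" if significant else "not significant"
    print(f"{name_a} vs {name_b}: difference = {diff:.1f} -> {verdict}")
```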

Performing One-Way ANOVA Tests Using SPSS

(SPSS data view with the columns Group and Type)

ONEWAY
Type BY Group
/MISSING ANALYSIS
/POSTHOC = TUKEY ALPHA(.05).

SPSS Generated One-Way ANOVA

ANOVA
Type
                  Sum of Squares    df    Mean Square    F         Sig.
Between Groups    57.8667           2     28.9333        19.530    0.000
Within Groups     40.000            27    1.4815
Total             97.8667           29

Sig. = p

How to Draw A Conclusion About The Test?

Method 1
Check the value of significance p; reject H0 if p < α

Method 2
Check the value of F; reject H0 if F falls in the rejection region (refer to Fcv identified from the Table F)

Conclusion:
Reject H0 (i.e. there is a significant difference between the means)

Post Hoc: Tukey's HSD

Multiple Comparisons
Dependent Variable: Type
Tukey HSD

                                                                    95% Confidence Interval
(I) Group   (J) Group   Mean Difference (I-J)   Std. Error   Sig.   Lower Bound   Upper Bound
Group 1     Group 2     -1.800*                 .544         .007   -3.15         -.45
            Group 3     -3.400*                 .544         .000   -4.75         -2.05
Group 2     Group 1     1.800*                  .544         .007   .45           3.15
            Group 3     -1.600*                 .544         .018   -2.95         -.25
Group 3     Group 1     3.400*                  .544         .000   2.05          4.75
            Group 2     1.600*                  .544         .018   .25           2.95

*. The mean difference is significant at the .05 level.

Conclusion:

There exists a significant difference for each pair!

Observe that p < α for:
Group 1 vs Group 2 (with p = 0.007)
Group 1 vs Group 3 (with p = 0.000)
Group 2 vs Group 3 (with p = 0.018)
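Two entries of the SPSS table can be cross-checked by hand. The sketch assumes, as the numbers suggest, that the Std. Error column is the standard error of a difference between two group means, sqrt(MSwg * (1/n_i + 1/n_j)), and it treats a 95% interval that excludes zero as equivalent to significance at the .05 level. Variable names are illustrative.

```python
import math

# Values copied from the SPSS tables in the slides.
ms_within = 1.4815   # Mean Square, Within Groups
n_per_group = 10     # subjects per pedal arrangement

# Assumption: Std. Error is the standard error of a difference between
# two group means, sqrt(MSwg * (1/n_i + 1/n_j)).
std_error = math.sqrt(ms_within * (1 / n_per_group + 1 / n_per_group))
print(f"Std. Error = {std_error:.3f}")

# (pair, lower bound, upper bound) for the three distinct comparisons.
intervals = [
    ("Group 1 vs Group 2", -3.15, -0.45),
    ("Group 1 vs Group 3", -4.75, -2.05),
    ("Group 2 vs Group 3", -2.95, -0.25),
]

# A pair differs significantly at the .05 level when its 95% interval
# does not contain zero.
excludes_zero = {pair: not (low <= 0 <= high) for pair, low, high in intervals}
for pair, flag in excludes_zero.items():
    print(f"{pair}: interval excludes 0 -> {flag}")
```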

The F Distribution
