
More than two groups:

ANOVA and Chi-square


ANOVA
(ANalysis Of VAriance)
• Idea: For two or more groups, test the difference between means, for quantitative, normally distributed variables.
• Just an extension of the t-test (an ANOVA with only two groups is mathematically equivalent to a t-test).
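The equivalence with the t-test can be checked numerically. The sketch below, in plain Python, computes the pooled-variance t statistic and the one-way F statistic for two small samples and compares t squared with F. The data values and the function names `pooled_t` and `one_way_f` are made up for illustration and do not come from the slides.

```python
from statistics import mean

def pooled_t(a, b):
    """Two-sample t statistic using the pooled (equal-variance) estimate."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / (pooled_var * (1 / na + 1 / nb)) ** 0.5

def one_way_f(groups):
    """F statistic: mean square between groups / mean square within groups."""
    all_obs = [x for g in groups for x in g]
    grand = mean(all_obs)
    ssb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_obs) - len(groups)
    return (ssb / df_between) / (ssw / df_within)

# Hypothetical measurements, chosen only for illustration
group_a = [5.1, 4.9, 6.2, 5.8, 5.5]
group_b = [6.8, 7.1, 6.4, 7.5, 6.9]

t = pooled_t(group_a, group_b)
f = one_way_f([group_a, group_b])
print(f"t^2 = {t ** 2:.6f}, F = {f:.6f}")
```

With two groups the two printed values agree up to floating-point rounding, which is what "mathematically equivalent" means here.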
One-Way Analysis of Variance
• Assumptions are the same as for the t-test:
  - Normally distributed outcome
  - Equal variances between the groups
  - Groups are independent
Hypotheses of One-Way ANOVA

$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots$
$H_1$: Not all of the population means are the same
ANOVA
• It’s like this: if I have three groups to compare:
  - I could do three pairwise t-tests, but this would increase my type I error.
  - So, instead, I want to look at the pairwise differences “all at once.”
  - To do this, I can recognize that variance is a statistic that lets me look at more than one difference at a time…
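To see how quickly the type I error grows, here is a rough calculation in Python. It assumes the three pairwise tests are independent, which is not strictly true because they share data, so treat the result as an approximation rather than an exact rate.

```python
from math import comb

alpha = 0.05          # type I error rate for each single test
k = 3                 # number of groups being compared
n_tests = comb(k, 2)  # number of pairwise comparisons among k groups

# Chance of at least one false positive across all tests,
# under the simplifying assumption that the tests are independent
familywise_error = 1 - (1 - alpha) ** n_tests
print(f"{n_tests} tests -> familywise error of about {familywise_error:.3f}")
```

With three tests at 0.05 each, the chance of at least one false positive is roughly 14% instead of 5%.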
The “F-test”
Is the difference in the means of the groups greater than the variability within groups?

$$F = \frac{\text{Variability between groups}}{\text{Variability within groups}}$$

Numerator: summarizes the mean differences between all groups at once.
Denominator: analogous to the pooled variance from a t-test.


The F-distribution
• A ratio of variances follows an F-distribution:

$$\frac{\sigma^2_{between}}{\sigma^2_{within}} \sim F_{n,m}$$

• The F-test tests the hypothesis that two variances are equal.
• F will be close to 1 if sample variances are equal.

$$H_0: \sigma^2_{between} = \sigma^2_{within}$$
$$H_a: \sigma^2_{between} \neq \sigma^2_{within}$$
ANOVA Table
Source of variation                 d.f.   Sum of squares   Mean Sum of Squares   F-statistic                  p-value
Between (k groups)                  k-1    SSB              SSB/(k-1)             [SSB/(k-1)] / [SSW/(nk-k)]   Go to F(k-1, nk-k) chart
Within (n individuals per group)    nk-k   SSW              s^2 = SSW/(nk-k)
Total variation                     nk-1   TSS

SSB: sum of squared deviations of group means from the grand mean (each weighted by n, the group size)
SSW: sum of squared deviations of observations from their group mean
TSS: sum of squared deviations of observations from the grand mean; TSS = SSB + SSW
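Written out in symbols, the table entries look as follows. The notation is an assumption added here for illustration: $y_{ij}$ is observation $j$ in group $i$, $\bar{y}_i$ is the mean of group $i$, and $\bar{y}$ is the grand mean, with $k$ groups of $n$ observations each.

```latex
\begin{aligned}
SSB &= n \sum_{i=1}^{k} (\bar{y}_{i} - \bar{y})^2 \\
SSW &= \sum_{i=1}^{k} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i})^2 \\
TSS &= \sum_{i=1}^{k} \sum_{j=1}^{n} (y_{ij} - \bar{y})^2 = SSB + SSW \\
F &= \frac{SSB/(k-1)}{SSW/(nk-k)} \sim F_{k-1,\,nk-k}
\end{aligned}
```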
Example
Heights in inches:
Treatment 1   Treatment 2   Treatment 3   Treatment 4
60            50            48            47
67            52            49            67
42            43            50            54
67            67            55            67
56            67            56            68
62            59            61            65
64            67            61            65
59            64            60            56
72            63            59            60
71            65            64            65
Example

Step 1) Calculate the sum of squares between groups:

Mean for group 1 = 62.0
Mean for group 2 = 59.7
Mean for group 3 = 56.3
Mean for group 4 = 61.4
Grand mean = 59.85

$SSB = [(62-59.85)^2 + (59.7-59.85)^2 + (56.3-59.85)^2 + (61.4-59.85)^2] \times n_{\text{per group}} = 19.65 \times 10 = 196.5$
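Step 1 can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library, and the variable names are chosen here for illustration. The data are the four treatment columns from the example table.

```python
from statistics import mean

# Heights (inches) from the example table, one list per treatment group
groups = [
    [60, 67, 42, 67, 56, 62, 64, 59, 72, 71],  # Treatment 1
    [50, 52, 43, 67, 67, 59, 67, 64, 63, 65],  # Treatment 2
    [48, 49, 50, 55, 56, 61, 61, 60, 59, 64],  # Treatment 3
    [47, 67, 54, 67, 68, 65, 65, 56, 60, 65],  # Treatment 4
]

group_means = [mean(g) for g in groups]
grand_mean = mean([x for g in groups for x in g])

# Squared deviation of each group mean from the grand mean,
# weighted by the number of observations in that group
ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

print("group means:", group_means)
print("grand mean:", grand_mean)
print("SSB:", round(ssb, 1))
```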
Example

Step 2) Calculate the sum of squares within groups:

$SSW = (60-62)^2 + (67-62)^2 + (42-62)^2 + (67-62)^2 + (56-62)^2 + (62-62)^2 + (64-62)^2 + (59-62)^2 + (72-62)^2 + (71-62)^2 + (50-59.7)^2 + (52-59.7)^2 + (43-59.7)^2 + (67-59.7)^2 + (67-59.7)^2 + (59-59.7)^2 + \cdots = 2060.6$ (sum of 40 squared deviations)
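The 40 squared deviations are tedious by hand but short in code. The sketch below (standard-library Python, names chosen for illustration) sums them group by group, so the contribution of each treatment is visible.

```python
from statistics import mean

# Heights (inches) from the example table, one list per treatment group
groups = [
    [60, 67, 42, 67, 56, 62, 64, 59, 72, 71],  # Treatment 1
    [50, 52, 43, 67, 67, 59, 67, 64, 63, 65],  # Treatment 2
    [48, 49, 50, 55, 56, 61, 61, 60, 59, 64],  # Treatment 3
    [47, 67, 54, 67, 68, 65, 65, 56, 60, 65],  # Treatment 4
]

# Sum of squared deviations of each observation from its own group mean
ss_per_group = []
for g in groups:
    m = mean(g)
    ss_per_group.append(sum((x - m) ** 2 for x in g))

ssw = sum(ss_per_group)
print("per-group sums of squares:", [round(s, 1) for s in ss_per_group])
print("SSW:", round(ssw, 1))
```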
Step 3) Fill in the ANOVA table

Source of variation   d.f.   Sum of squares   Mean Sum of Squares   F-statistic   p-value
Between               3      196.5            65.5                  1.14          .344
Within                36     2060.6           57.2
Total                 39     2257.1
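The remaining table entries follow from SSB, SSW, and the degrees of freedom. The sketch below fills them in with standard-library Python. The p-value needs the upper tail of the F-distribution, which the standard library does not provide, so the code uses a hand-written regularized incomplete beta function (continued-fraction evaluation). The helpers `_betacf`, `betai`, and `f_sf` are illustrative; in practice a statistics package would normally do this step.

```python
import math

def _betacf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # even step of the recurrence
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # odd step of the recurrence
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def betai(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b

def f_sf(f, df1, df2):
    """Upper-tail probability P(F >= f) for an F(df1, df2) distribution."""
    return betai(df2 / 2.0, df1 / 2.0, df2 / (df2 + df1 * f))

# Values from Steps 1 and 2 of the example
ssb, ssw = 196.5, 2060.6
k, n = 4, 10                      # 4 treatment groups, 10 people per group

df_between = k - 1                # 3
df_within = n * k - k             # 36
ms_between = ssb / df_between
ms_within = ssw / df_within
f_stat = ms_between / ms_within
p_value = f_sf(f_stat, df_between, df_within)

print(f"MS between = {ms_between:.1f}, MS within = {ms_within:.1f}")
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
```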

INTERPRETATION of ANOVA:
How much of the variance in height is explained by treatment group?
$R^2$ = “Coefficient of Determination” = SSB/TSS = 196.5/2257.1 = 9%
Coefficient of Determination

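$$R^2 = \frac{SSB}{SSB + SSE} = \frac{SSB}{SST}$$

Here SSE is the within-group (error) sum of squares, called SSW in the ANOVA table, and SST is the total sum of squares, called TSS.
The amount of variation in the outcome variable (dependent variable) that is explained by the predictor (independent variable).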
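As a quick arithmetic check, the coefficient of determination for the height example can be computed directly from the two sums of squares. This is a minimal Python sketch and the variable names are chosen for illustration.

```python
ssb = 196.5    # between-group sum of squares from the example
ssw = 2060.6   # within-group sum of squares from the example

tss = ssb + ssw            # total sum of squares
r_squared = ssb / tss      # share of total variation explained by treatment group

print(f"TSS = {tss:.1f}, R^2 = {r_squared:.3f}")
```

Treatment group explains a little under 9% of the variation in height.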
Step 1. Assumptions for the Test
• The group variable can be at any level of measurement, as long as it identifies groups.
• The level of measurement of the test variable is interval.
• The test variable is normally distributed in the population:
  - skewness and kurtosis are between -1.0 and +1.0, or
  - the number in each group is greater than 10 (central limit theorem).
• The variances (dispersion) of the groups are equal. The Levene test of equality of population variances is used to test this assumption.
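The skewness and kurtosis rule of thumb can be sketched in Python as follows. This uses simple moment-based formulas with no small-sample correction. SPSS reports bias-adjusted versions, so its numbers will differ somewhat, especially for small groups. The function names are illustrative.

```python
from statistics import mean

def skewness_kurtosis(values):
    """Moment-based skewness and excess kurtosis (no small-sample correction)."""
    n = len(values)
    m = mean(values)
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    m4 = sum((x - m) ** 4 for x in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0   # 0 for a normal distribution
    return skew, excess_kurtosis

def within_rule_of_thumb(values, limit=1.0):
    """True if both statistics fall between -limit and +limit."""
    skew, kurt = skewness_kurtosis(values)
    return abs(skew) <= limit and abs(kurt) <= limit

# Treatment 3 heights from the example table
treatment_3 = [48, 49, 50, 55, 56, 61, 61, 60, 59, 64]
skew, kurt = skewness_kurtosis(treatment_3)
print(f"skewness = {skew:.2f}, excess kurtosis = {kurt:.2f}")
print("within +/-1.0:", within_rule_of_thumb(treatment_3))
```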
Step 2. Hypotheses and alpha
• The research hypothesis is that the mean of at least one of the population groups is different from the means of the other groups.
• The null hypothesis is that the means of all of the population groups are equal.
• If we don’t have a specific reason for setting the level of significance to a specific probability, we can use the traditional social science benchmark of 0.05. This means we accept a 1-in-20 risk of rejecting the null hypothesis when it is actually true. The alpha level to use will be stated in the problems.
Step 3. Sampling distribution and test statistic
• In the ANOVA test, the probability is obtained from the F-distribution instead of the normal distribution.
• The test statistic is also referred to as the F-ratio or F-test because it follows the F-distribution.
Step 4. Computing the Test Statistic
• Conceptually, the test statistic is computed in a way similar to the independent-samples t-test. Both divide a measure of the differences between the group means by a measure of the variability within the groups.
• We identify the probability of the test statistic from the SPSS statistical output.
Step 5. Decision and Interpretation
• If the probability of the test statistic (the p-value) is less than or equal to the level of significance (alpha), we reject the null hypothesis and conclude that our data support the research hypothesis.
• If the probability of the test statistic is greater than the level of significance (alpha), we fail to reject the null hypothesis and conclude that our data do not support the research hypothesis.
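The decision rule is simple enough to state as code. In this small Python sketch the function name `decide` is made up for illustration. Note that the comparison is "less than or equal to", matching the rule above.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule for a test at significance level alpha."""
    if p_value <= alpha:
        return "reject H0"
    return "fail to reject H0"

# p-value from the height example (F = 1.14 with 3 and 36 d.f.)
print(decide(0.344))
```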
Interpreting Differences in Population Means
• If we fail to reject the null hypothesis, we can state that we found no differences among the means for the population groups for this characteristic. We do not say they are equal.
• If we reject the null hypothesis, we can conclude that the mean for at least one population group is different from the others.
ANOVA summary
• A statistically significant ANOVA (F-test) only tells you that at least two of the groups differ, but not which ones differ.
• Determining which groups differ (when it’s unclear) requires more sophisticated analyses to correct for the problem of multiple comparisons…
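One simple way to correct for multiple comparisons is the Bonferroni adjustment, which divides alpha by the number of comparisons. The slides do not name a specific method, so this Python sketch shows only one possible option, using the four treatment groups from the earlier example.

```python
from itertools import combinations

group_names = ["Treatment 1", "Treatment 2", "Treatment 3", "Treatment 4"]
alpha = 0.05

# Every unordered pair of groups is one follow-up comparison
pairs = list(combinations(group_names, 2))

# Bonferroni: each pairwise test is judged against alpha / number of tests
adjusted_alpha = alpha / len(pairs)

print(f"{len(pairs)} pairwise comparisons, each tested at alpha = {adjusted_alpha:.4f}")
```

Bonferroni is conservative, and other post hoc procedures trade off power and error control differently.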
ANOVA Test in SPSS (1)
• The next step is to examine the distribution of the dependent variable. You can check whether the dependent variable is normally distributed in: Analyze > Descriptive Statistics > Descriptives…

ANOVA Test in SPSS (2)
• After moving [age] into the “Variable(s):” box, click the “Options…” button to select the distribution statistics.

ANOVA Test in SPSS (3)
• Select “Kurtosis” and “Skewness” to examine whether [age] is normally distributed.
• Then click the “Continue” and “OK” buttons.

ANOVA Test in SPSS (4)
• Click Analyze > Compare Means > One-Way ANOVA... on the top menu.
• Transfer the dependent variable (Time) into the “Dependent List:” box and the independent variable (Course) into the “Factor:” box using the appropriate right-arrow buttons (or drag-and-drop the variables into the boxes).
• Click the Continue button.
One-way ANOVA in SPSS
ANOVA

VAR00001
                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups   52.431           2    26.215        2.765   .111
Within Groups    94.800           10   9.480
Total            147.231          12

Last column (Sig.): the p-value, the smallest value of α at which the null hypothesis is rejected.
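The SPSS output can be checked by hand from its own sums of squares and degrees of freedom. In this Python sketch the p-value uses a closed-form tail probability that holds only when the numerator has 2 degrees of freedom, as it does here. It is not a general F-distribution routine.

```python
# Values read from the SPSS ANOVA table
ss_between, df_between = 52.431, 2
ss_within, df_within = 94.800, 10

ms_between = ss_between / df_between   # Mean Square, Between Groups
ms_within = ss_within / df_within      # Mean Square, Within Groups
f_stat = ms_between / ms_within

# Upper-tail probability of F(2, m): (1 + 2F/m)^(-m/2).
# Valid only because the numerator d.f. is exactly 2.
assert df_between == 2
p_value = (1 + df_between * f_stat / df_within) ** (-df_within / 2)

print(f"F = {f_stat:.3f}, Sig. = {p_value:.3f}")
```

Since the p-value is above 0.05, the null hypothesis of equal means is not rejected at that level.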
