
Hypothesis Testing

Can you convert your data into meaningful information?


Do you know how to conduct experiments?
- Dr. O. -
Sample: mean/average $\bar{x}$, standard deviation $S$, variance $S^2$
Population: mean $\mu$, standard deviation $\sigma$, variance $\sigma^2$
Distributions: Normal (Z), t, Chi², F
Basic Concepts
Population Sample
Statistics
Inference
Parameters
Presumption: belief dictated by probability
Hypothesis Testing Map
Columns (design): Single Sample; Two Samples (Independent, Related); N-Samples (Independent, Related)
Rows (measurement scale):
Interval and Ratio: # of children, ... {integers}; {any real #}
Ordinal: level of pain, preference, ...
Nominal: eye color, sex, race
Two-Tailed Test:  $H_0: \mu = \mu_0$,  $H_1: \mu \neq \mu_0$
One-Tailed Test:  $H_0: \mu \geq \mu_0$,  $H_1: \mu < \mu_0$
One-Tailed Test:  $H_0: \mu \leq \mu_0$,  $H_1: \mu > \mu_0$
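Let $X_1, \ldots, X_n$ be a large (e.g., $n > 30$) sample from a population with mean $\mu$ and standard deviation $\sigma$. To test a null hypothesis of the form
$H_0: \mu = \mu_0$,  $H_0: \mu \geq \mu_0$,  or  $H_0: \mu \leq \mu_0$
Compute the z-score:
$$z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$$
If $\sigma$ is unknown it may be approximated by $S$:
$$t = \frac{\bar{X} - \mu_0}{S / \sqrt{n}}$$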
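The z-score and the three forms of the alternative hypothesis can be sketched in a few lines of Python. This is only an illustration: the function name `z_test`, its `alternative` argument, and the numbers in the usage lines are hypothetical and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alternative="two-sided"):
    """Return (z, p_value) for a large-sample test about a population mean."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if alternative == "two-sided":    # H1: mu != mu0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    elif alternative == "less":       # H1: mu < mu0
        p_value = std_normal.cdf(z)
    elif alternative == "greater":    # H1: mu > mu0
        p_value = 1 - std_normal.cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")
    return z, p_value


# Hypothetical numbers, chosen only to exercise the function.
z, p = z_test(sample_mean=52.0, mu0=50.0, sigma=8.0, n=64)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

When $\sigma$ is replaced by $S$ for a small sample, the same ratio is compared against the t distribution with $n - 1$ degrees of freedom instead of the normal distribution.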
Single Sample Case
EXERCISE #1
$H_0: \mu_D = 3\%$
$H_1: \mu_D \neq 3\%$
$$t_{n-1} = \frac{\bar{x} - \mu_0}{\sqrt{s^2 / n}}$$
$$s^2 = \frac{\sum x^2 - \left(\sum x\right)^2 / n}{n - 1} = 0.002618$$
SS
= 0.494
df = n − 1 = 10 − 1 = 9
EXERCISE #2
EXERCISE #3
df = 13
Many studies have shown that direct eye contact and even patterns that look like eyes are avoided by many animals. Some insects, such as moths, have even evolved large eye-spot patterns on their wings to help ward off predators. This research example, modeled after Scaife's (1976) study, examines how eye-spot patterns affect the behavior of moth-eating birds.
A sample of n = 16 insectivorous birds is selected. The animals are tested in a box that has two separate chambers (see Figure). The birds are free to roam from one chamber to another through a doorway in a partition. On the wall of one chamber, two large eye-spot patterns have been painted. The other chamber has plain walls. The birds are tested one at a time by placing them in the doorway in the center of the apparatus. Each animal is left in the box for 60 minutes, and the amount of time spent in the plain chamber is recorded. Suppose that the sample of n = 16 birds spent an average of $\bar{X}$ = 35 minutes in the plain side with SS = 1215. Can we conclude that eye-spot patterns have an effect on behavior? Note that while it is possible to predict a value for $\mu$, we have no information about the population standard deviation.
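The bird example can be worked through numerically as a rough check. The sketch below assumes a hypothesized mean of 30 minutes (half of the 60-minute session, matching the hypotheses stated for this exercise) and takes the two-tailed critical value 2.131 for df = 15 at the 5% level from a standard t table; the variable names are illustrative.

```python
from math import sqrt

n = 16               # number of birds
sample_mean = 35.0   # minutes spent in the plain chamber
ss = 1215.0          # sum of squared deviations (SS)
mu0 = 30.0           # assumed null value: no preference, half of 60 minutes

s2 = ss / (n - 1)                  # sample variance
std_error = sqrt(s2 / n)           # estimated standard error of the mean
t = (sample_mean - mu0) / std_error
df = n - 1

t_crit = 2.131                     # two-tailed, alpha = 0.05, df = 15 (t table)
reject_h0 = abs(t) > t_crit

print(f"s^2 = {s2:.1f}, SE = {std_error:.2f}, t({df}) = {t:.3f}, reject H0: {reject_h0}")
```

Under these assumptions the computed t exceeds the critical value, so the data would be taken as evidence that the eye-spot patterns affect behavior.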
EXERCISE #4
$H_0: \mu_D = 30$,  $H_1: \mu_D \neq 30$  (97.5% confidence)
$H_0: \mu_D \leq 30$,  $H_1: \mu_D > 30$  (95% confidence)
More Applications
Two Related Samples Case
$H_0: \mu_D = 0$
$H_1: \mu_D \neq 0$
$$t = \frac{\bar{D}}{S_D / \sqrt{n}}$$
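A minimal sketch of the related-samples statistic follows. The before/after values are hypothetical (the slide data did not survive extraction) and are used only to show how the differences D, their mean, and their standard deviation enter the formula.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical paired measurements for five sites (not from the slides).
before = [10, 12, 9, 11, 13]   # X1
after = [9, 10, 9, 8, 11]      # X2

d = [x2 - x1 for x1, x2 in zip(before, after)]  # D = X2 - X1 for each pair
n = len(d)
d_bar = mean(d)                 # mean difference
s_d = stdev(d)                  # sample standard deviation of the differences
t = d_bar / (s_d / sqrt(n))
df = n - 1

print(f"D = {d}, mean D = {d_bar:.2f}, S_D = {s_d:.3f}, t({df}) = {t:.3f}")
```

The statistic is then compared with the t distribution on n − 1 degrees of freedom, using one tail or two depending on the form of $H_1$.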
[Table: Cluster, Before, After, $X_2 - X_1$; per site: $X_2$, $X_1$]
$H_0: \mu_D \geq 0$
$H_1: \mu_D < 0$
df = 4, t = −2.13
EXERCISE #1
The environmental policy must be well known across all company levels. To certify ISO 14001, a company-wide training effort has been made to make sure that every cluster in the company needs less overtime dedicated to this matter.
[Table: Cluster, Before, After]
$H_0: \mu_D \geq 0$
$H_1: \mu_D < 0$
Note that we tested the mean = 0 already.
In-Class Work
Compare the mean numbers of days spent in hibernation by hedgehogs in two areas of the country using randomly selected samples of each. Suppose 10 hedgehogs in each area are captured, radio-tagged and released. Their movements are monitored throughout the winter to determine the number of days when they did not leave their nests.
Two Independent Samples Test ($\sigma$ comparable)
Two Independent Samples Test
df =
Exercise
97.5% confidence
More Applications
Goodness of fit test (Chi² test)
It is suggested that dispersal of seeds from the edge of a plantation of trees follows an inverse square law (i.e. the number of seeds travelling x m is proportional to 1/x²). We want to test this theory using an isolated stand of trees on an area of moorland. Seeds will be collected in four 0.25 m² plots at each of the five distances, on the downwind side of the plantation. The total numbers of seeds counted at each distance were:
Using the equation (theoretical or empirical)
0.04 0.0564
X 294
$\chi^2_3 = 9.168$
If P-score > $\alpha$ (= 0.05), there is insufficient evidence against $H_0$, so $H_0$ is retained.
If P-score ≤ $\alpha$ (= 0.05), we have significant evidence that the results are not consistent with $H_0$, so $H_1$ is accepted.
One-tail distribution: multiply by 2, P = 0.054
Critical value: 7.814
A pool of 100 individuals has been surveyed with respect to level of education (see table below). The population is expected to show the following trend: High school 60%, College degree 25%, and Master and PhD 15%. Is the sample representative of the population?

Education   High school   College   Graduate
Count       67            25        8
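One way to set up this goodness-of-fit calculation is sketched below. It assumes that the "Graduate" column corresponds to the "Master and PhD" group, and it takes the critical value 5.991 (df = 2, 5% level) from a standard chi-square table; the variable names are illustrative.

```python
# Observed counts from the survey table.
observed = {"High school": 67, "College": 25, "Graduate": 8}
# Expected population shares; "Graduate" is assumed to mean Master and PhD.
expected_share = {"High school": 0.60, "College": 0.25, "Graduate": 0.15}

n = sum(observed.values())  # 100 individuals
expected = {k: n * share for k, share in expected_share.items()}

# Chi-square statistic: sum over categories of (O - E)^2 / E.
chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
df = len(observed) - 1

chi2_crit = 5.991           # alpha = 0.05, df = 2 (chi-square table)
reject_h0 = chi2 > chi2_crit

print(f"chi2({df}) = {chi2:.3f}, critical = {chi2_crit}, reject H0: {reject_h0}")
```

With these inputs the statistic stays below the critical value, so the sample would not be judged inconsistent with the expected distribution at the 5% level.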
EXERCISE #1
Association between Two Factors (Chi² test)

Observed (rows: catalyst, columns: reactant)

Catalyst   Poor    Average   Rich    Total
A          25      12        7       44
B          6       12        2       20
C          3       3         19      25
Total      34      27        28      89

Expected

$$E = \frac{\text{Row total} \times \text{Column total}}{\text{Grand total}}$$

Catalyst   Poor    Average   Rich    Total
A          16.81   13.35     13.84   44
B          7.64    6.07      6.29    20
C          9.55    7.58      7.87    25
Total      34      27        28      89

Difference (O − E)

Catalyst   Poor    Average   Rich
A          8.19    -1.35     -6.84
B          -1.64   5.93      -4.29
C          -6.55   -4.58     11.13

(O − E)² / E

Catalyst   Poor    Average   Rich    Total
A          3.99    0.14      3.38    7.51
B          0.35    5.80      2.93    9.08
C          4.49    2.77      15.76   23.03
Total      8.84    8.71      22.07   39.62

$\chi^2_4 = 39.62$; critical value 9.487

If P-score > 0.05, there is insufficient evidence to conclude that there is an association between the two factors.
If P-score ≤ 0.05, there is significant evidence of an association between the two factors.

One-tail distribution: multiply by 2, P = 1.03E-07
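The expected counts and the chi-square total for the catalyst/reactant table can be reproduced with a short script. This is a sketch using the observed counts from the slides; the variable names are illustrative, and the critical value 9.487 (df = 4) is the one quoted above.

```python
# Observed counts: rows are catalysts A, B, C; columns are Poor, Average, Rich.
observed = [
    [25, 12, 7],
    [6, 12, 2],
    [3, 3, 19],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (O - E)^2 / E over all cells.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

chi2_crit = 9.487  # critical value quoted on the slide for df = 4
reject_h0 = chi2 > chi2_crit

print(f"chi2({df}) = {chi2:.2f}, reject H0 (no association): {reject_h0}")
```

The computed total agrees with the 39.62 shown in the table, which is well above the critical value.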
A scientist is concerned that his newly developed drug is affecting the proportion of sheep having stillborn lambs in his flock. He has records from before the drug was tested in the population, so he is able to compare them with current data.
EXERCISE #1
Two Independent Samples Test (check if $\sigma$ comparable)
Compare the mean numbers of days spent in hibernation by hedgehogs in two areas of the country, using randomly selected samples of each. Suppose 10 hedgehogs in each area are captured, radio-tagged and released. Their movements are monitored throughout the winter to determine the number of days when they did not leave their nests.
Variances: 33.344 and 32.233
These are one-tailed probabilities, so we need to multiply them by 2. Therefore P = 96%.
$F_{score} = 1.034$; critical value 4.0255
Two-Tailed Test:  $H_0: \sigma_1^2 = \sigma_2^2$,  $H_1: \sigma_1^2 \neq \sigma_2^2$
One-Tailed Test:  $H_0: \sigma_1^2 \geq \sigma_2^2$,  $H_1: \sigma_1^2 < \sigma_2^2$
One-Tailed Test:  $H_0: \sigma_1^2 \leq \sigma_2^2$,  $H_1: \sigma_1^2 > \sigma_2^2$

The hypothesis that the two standard deviations are equal is rejected if

Two-tailed:  $F_{score} > F_{(\alpha/2,\ N_1 - 1,\ N_2 - 1)}$
One-tailed ($H_1: \sigma_1^2 < \sigma_2^2$):  $F_{score} < F_{(1 - \alpha,\ N_1 - 1,\ N_2 - 1)}$
One-tailed ($H_1: \sigma_1^2 > \sigma_2^2$):  $F_{score} > F_{(\alpha,\ N_1 - 1,\ N_2 - 1)}$

$F_{score} = 1.034$
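The variance comparison for the hedgehog samples can be sketched as follows. It assumes that 33.344 and 32.233 are the two sample variances (their ratio matches the 1.034 shown), that each sample has 10 animals as in the problem statement, and that 4.0255 is the two-tailed critical value read from an F table; the variable names are illustrative.

```python
# Assumed sample variances of the two hedgehog samples (values from the slides).
var_a = 33.344
var_b = 32.233
n_a = n_b = 10                     # 10 hedgehogs captured in each area

# Put the larger variance in the numerator so that F >= 1.
f_score = max(var_a, var_b) / min(var_a, var_b)
df_num, df_den = n_a - 1, n_b - 1

f_crit = 4.0255                    # value shown on the slide, assumed F(alpha/2; 9, 9)
reject_equal_variances = f_score > f_crit

print(f"F({df_num}, {df_den}) = {f_score:.3f}, reject equal variances: {reject_equal_variances}")
```

Because the ratio is far below the critical value, the two variances would be treated as comparable, which is the condition for using the pooled two-sample t test.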
EXERCISE #1
3.912
1.34
ANOVA Test

One-Way ANOVA: one factor (FLUX) with levels A, B, C, D; the rows are replicates (sample values).

FLUX level   A   B   C   D
             x   x   x   x
             x   x   x   x
             x   x   x   x
             x   x   x   x
             x   x   x   x

Two-Way ANOVA: two factors, CATALYST (levels A, B, C, D) and REAGENT (levels 1, 2, 3).

CATALYST   REAGENT 1   REAGENT 2   REAGENT 3
A          x,x,x,x     x,x,x,x     x,x,x
B          x,x,x,x     x,x,x,x     x,x,x
C          x,x,x,x     x,x,x,x     x,x,x
D          x,x,x,x     x,x,x,x     x,x,x

Three-Way ANOVA: FACTOR A (levels 1, 2, 3, ..., a) in rows, FACTOR B (levels 1, 2, ..., b) in column blocks, and FACTOR C (levels 1, 2, 3, ..., c) within each block; each cell holds replicates x,x,x.

Factors (each with its levels): Ratio H2SO4/Sludge, Digestion Time, Leaching Time, Agitation RPM
One-Way ANOVA Test

ONE-FACTOR: FLUX (levels A, B, C); the rows are replicates (sample values).

FLUX level   A      B      C
             16     38     19
             13     21     14
             36     36     17
             29     39     15
             18     26     12
Mean         22.4   32     15.4    (grand mean 23.267)
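The F ratio for the flux data can be computed step by step. The sketch below uses the values from the table and assumes the standard one-way decomposition into between-group and within-group sums of squares; the variable names are illustrative.

```python
from statistics import mean

# Replicates for each flux level (values from the table).
groups = {
    "A": [16, 13, 36, 29, 18],
    "B": [38, 21, 36, 39, 26],
    "C": [19, 14, 17, 15, 12],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)

# Between-group sum of squares: group size times squared distance of each
# group mean from the grand mean.
ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
# Within-group sum of squares: squared distance of each value from its group mean.
ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(f"grand mean = {grand_mean:.3f}")
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```

The group means and grand mean match the last row of the table, and the resulting F is compared with the critical F for (2, 12) degrees of freedom at the chosen significance level.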
One-Way ANOVA Test
5.098
6.272
EXERCISE #1
F = 8.45
Two-Way ANOVA Test
$y = 20$, $A_i$, $B_i$
$\alpha = 5\%$ (0.05)
Three-Way ANOVA Test
Simplified Case
Similar to two-way ANOVA.
Computed F > F(5%). Therefore, we reject $H_0$ for AB, AC, and BC. In other words, these interactions do have an impact on the dependent variable.