
Hypothesis Test

Setting up and testing hypotheses is an essential part of statistical
inference. To formulate such a test, some theory has usually been put
forward, either because it is believed to be true or because it is to be
used as a basis for argument, but has not been proved. An example is the
claim that a new drug is better than the current drug for treatment of
the same symptoms.
In each problem considered, the question of interest is simplified into two
competing claims / hypotheses between which we have a choice: the null
hypothesis, denoted H0, against the alternative hypothesis, denoted H1.
These two competing claims / hypotheses are not, however, treated on an
equal basis: special consideration is given to the null hypothesis.
We have two common situations:
1. The experiment has been carried out in an attempt to disprove or
reject a particular hypothesis, the null hypothesis, thus we give that
one priority so it cannot be rejected unless the evidence against it is
sufficiently strong. For example,
H0: there is no difference in taste between coke and diet coke
against
H1: there is a difference.

2. If one of the two hypotheses is 'simpler' we give it priority so that a
more 'complicated' theory is not adopted unless there is sufficient
evidence against the simpler one. For example, it is 'simpler' to
claim that there is no difference in flavour between coke and diet
coke than it is to say that there is a difference.
The hypotheses are often statements about population parameters such as
expected value and variance; for example, H0 might be that the expected
height of ten-year-old boys in the Scottish population is not different
from that of ten-year-old girls. A hypothesis might also be a statement
about the distributional form of a characteristic of interest, for
example that the height of ten-year-old boys is normally distributed within
the Scottish population.
The outcome of a hypothesis test is "Reject H0 in favour of H1" or "Do
not reject H0".

Null Hypothesis

The null hypothesis, H0, represents a theory that has been put forward,
either because it is believed to be true or because it is to be used as a
basis for argument, but has not been proved. For example, in a clinical
trial of a new drug, the null hypothesis might be that the new drug is no
better, on average, than the current drug. We would write
H0: there is no difference between the two drugs on average.
We give special consideration to the null hypothesis. This is due to the
fact that the null hypothesis relates to the statement being tested,
whereas the alternative hypothesis relates to the statement to be
accepted if / when the null is rejected.
The final conclusion once the test has been carried out is always given in
terms of the null hypothesis. We either "Reject H0 in favour of H1" or "Do
not reject H0"; we never conclude "Reject H1", or even "Accept H1".
If we conclude "Do not reject H0", this does not necessarily mean that the
null hypothesis is true, it only suggests that there is not sufficient
evidence against H0 in favour of H1. Rejecting the null hypothesis then,
suggests that the alternative hypothesis may be true.
See also hypothesis test.

Alternative Hypothesis

The alternative hypothesis, H1, is a statement of what a statistical
hypothesis test is set up to establish. For example, in a clinical trial of a
new drug, the alternative hypothesis might be that the new drug has a
different effect, on average, compared to that of the current drug. We
would write
H1: the two drugs have different effects, on average.
The alternative hypothesis might also be that the new drug is better, on
average, than the current drug. In this case we would write
H1: the new drug is better than the current drug, on average.
The final conclusion once the test has been carried out is always given in
terms of the null hypothesis. We either "Reject H0 in favour of H1" or "Do
not reject H0". We never conclude "Reject H1", or even "Accept H1".
If we conclude "Do not reject H0", this does not necessarily mean that the
null hypothesis is true, it only suggests that there is not sufficient
evidence against H0 in favour of H1. Rejecting the null hypothesis then,
suggests that the alternative hypothesis may be true.

Simple Hypothesis

A simple hypothesis is a hypothesis which specifies the population
distribution completely.
Examples
1. H0: X ~ Bi(100, 1/2), i.e. p is specified
2. H0: X ~ N(5, 20), i.e. µ and σ² are specified
See also composite hypothesis.

Composite Hypothesis

A composite hypothesis is a hypothesis which does not specify the
population distribution completely.
Examples
1. X ~ Bi(100, p) and H1: p > 0.5
2. X ~ N(0, σ²) and H1: σ² unspecified
See also simple hypothesis.
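
The difference between the two kinds of hypothesis can be made concrete with a short sketch. It uses the binomial example from the entries above; the function name `binomial_upper_tail` and the cut-off of 60 successes are illustrative choices, not part of the glossary. Under the simple hypothesis the probability is a single number, whereas under the composite hypothesis it changes with whichever value of p is considered.

```python
from math import comb


def binomial_upper_tail(n, p, k):
    """Return P(X >= k) for X ~ Bi(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Simple hypothesis H0: X ~ Bi(100, 1/2). The distribution is fully
# specified, so the tail probability is one fixed number.
p_simple = binomial_upper_tail(100, 0.5, 60)

# Composite hypothesis H1: p > 0.5. The value of p is left open, so the
# same tail probability depends on which p is assumed (values chosen
# here purely for illustration).
p_composite = {p: binomial_upper_tail(100, p, 60) for p in (0.55, 0.6, 0.7)}

print(f"P(X >= 60 | p = 0.5) = {p_simple:.4f}")
for p, prob in p_composite.items():
    print(f"P(X >= 60 | p = {p}) = {prob:.4f}")
```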

Type I Error

In a hypothesis test, a type I error occurs when the null hypothesis is
rejected when it is in fact true; that is, H0 is wrongly rejected.
For example, in a clinical trial of a new drug, the null hypothesis might be
that the new drug is no better, on average, than the current drug; i.e.
H0: there is no difference between the two drugs on average.
A type I error would occur if we concluded that the two drugs produced
different effects when in fact there was no difference between them.
The following table gives a summary of possible results of any hypothesis
test:

                       Decision
  Truth    Reject H0         Don't reject H0
  H0       Type I Error      Right decision
  H1       Right decision    Type II Error

A type I error is often considered to be more serious, and therefore more
important to avoid, than a type II error. The hypothesis test procedure is
therefore adjusted so that there is a guaranteed 'low' probability of
rejecting the null hypothesis wrongly; this probability is never 0. This
probability of a type I error can be precisely computed as
P(type I error) = significance level = α
The exact probability of a type II error is generally unknown.
If we do not reject the null hypothesis, it may still be false (a type II error)
as the sample may not be big enough to identify the falseness of the null
hypothesis (especially if the truth is very close to hypothesis).
For any given set of data, type I and type II errors are inversely related;
the smaller the risk of one, the higher the risk of the other.
A type I error can also be referred to as an error of the first kind.
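
The two kinds of error can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: it uses a two-sided z-test with known standard deviation (a test not described in this glossary), and the sample size, the true means and the helper names `rejects_h0` and `rejection_rate` are all hypothetical. When H0 is true, every rejection is a type I error, so the rejection rate should be close to α. When H0 is false, every failure to reject is a type II error.

```python
import random
from statistics import NormalDist


def rejects_h0(sample, mu0, sigma, alpha):
    """Two-sided z-test (sigma assumed known): True if H0: mu = mu0 is rejected."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / n**0.5)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit


def rejection_rate(true_mu, mu0=0.0, sigma=1.0, n=20, alpha=0.05,
                   trials=4000, seed=1):
    """Fraction of simulated samples for which H0: mu = mu0 is rejected."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        rejected += rejects_h0(sample, mu0, sigma, alpha)
    return rejected / trials


# H0 is true (true mean equals mu0): rejections are type I errors.
type_i_rate = rejection_rate(true_mu=0.0)

# H0 is false (true mean is 0.5): non-rejections are type II errors.
type_ii_rate = 1 - rejection_rate(true_mu=0.5)

print(f"estimated P(type I error)  = {type_i_rate:.3f}")
print(f"estimated P(type II error) = {type_ii_rate:.3f}")
```

Rerunning with a smaller `alpha` lowers the first rate and raises the second, which is the inverse relationship between the two risks for a given set of data.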

Type II Error

In a hypothesis test, a type II error occurs when the null hypothesis H0, is
not rejected when it is in fact false. For example, in a clinical trial of a new
drug, the null hypothesis might be that the new drug is no better, on
average, than the current drug; i.e.
H0: there is no difference between the two drugs on average.
A type II error would occur if it was concluded that the two drugs
produced the same effect, i.e. there is no difference between the two
drugs on average, when in fact they produced different ones.
A type II error is frequently due to sample sizes being too small.
The probability of a type II error is generally unknown, but is symbolised
by β and written
P(type II error) = β
A type II error can also be referred to as an error of the second kind.
Compare type I error.
See also power.

Test Statistic

A test statistic is a quantity calculated from our sample of data. Its value is
used to decide whether or not the null hypothesis should be rejected in
our hypothesis test.
The choice of a test statistic will depend on the assumed probability
model and the hypotheses under question.

Critical Value(s)

The critical value(s) for a hypothesis test is a threshold to which the value
of the test statistic in a sample is compared to determine whether or not
the null hypothesis is rejected.
The critical value for any hypothesis test depends on the significance
level at which the test is carried out, and whether the test is one-sided or
two-sided.
See also critical region.
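
How the critical value depends on the significance level and on the sidedness of the test can be sketched as follows. The sketch assumes that the test statistic has a standard normal distribution when H0 is true (a z-test, which this glossary does not itself describe); the function name `z_critical_values` and the labels "two", "lower" and "upper" are illustrative.

```python
from statistics import NormalDist


def z_critical_values(alpha, sided):
    """Critical value(s) for a test statistic that is N(0, 1) under H0.

    sided: "two"   -> reject in both tails,
           "lower" -> reject for small values only,
           "upper" -> reject for large values only.
    """
    std_normal = NormalDist()
    if sided == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)
        return (-z, z)
    if sided == "lower":
        return (std_normal.inv_cdf(alpha),)
    if sided == "upper":
        return (std_normal.inv_cdf(1 - alpha),)
    raise ValueError("sided must be 'two', 'lower' or 'upper'")


for sided in ("two", "lower", "upper"):
    values = ", ".join(f"{v:.3f}" for v in z_critical_values(0.05, sided))
    print(f"alpha = 0.05, {sided}-sided: {values}")
```

At the 5% level this gives approximately ±1.96 for the two-sided test and 1.645 (or -1.645) for a one-sided test, so the same significance level leads to different thresholds.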

Critical Region

The critical region CR, or rejection region RR, is a set of values of the test
statistic for which the null hypothesis is rejected in a hypothesis test. That
is, the sample space for the test statistic is partitioned into two regions;
one region (the critical region) will lead us to reject the null hypothesis H0,
the other will not. So, if the observed value of the test statistic is a
member of the critical region, we conclude "Reject H0"; if it is not a
member of the critical region then we conclude "Do not reject H0".
See also critical value.
See also test statistic.

Significance Level

The significance level of a statistical hypothesis test is a fixed probability
of wrongly rejecting the null hypothesis H0, if it is in fact true.
It is the probability of a type I error and is set by the investigator in relation
to the consequences of such an error. That is, we want to make the
significance level as small as possible in order to protect the null
hypothesis and to prevent, as far as possible, the investigator from
inadvertently making false claims.
The significance level is usually denoted by α
Significance Level = P(type I error) = α
Usually, the significance level is chosen to be 0.05 (or equivalently, 5%).
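
The way the chosen significance level fixes the critical region, and hence the conclusion, can be sketched in a few lines. As an assumption for illustration only, the test statistic is taken to be standard normal under H0 and the test to be two-sided; the function names and the observed value 2.2 are hypothetical.

```python
from statistics import NormalDist


def in_critical_region(z, alpha):
    """True if z lies in the two-sided critical region at level alpha.

    Assumes the test statistic is N(0, 1) when H0 is true.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit


def conclusion(z, alpha):
    """State the outcome in terms of the null hypothesis."""
    return "Reject H0" if in_critical_region(z, alpha) else "Do not reject H0"


observed_z = 2.2  # hypothetical observed value of the test statistic
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: {conclusion(observed_z, alpha)}")
```

The same observed value leads to "Reject H0" at the 5% level but "Do not reject H0" at the 1% level, since a smaller significance level shrinks the critical region.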

P-Value

The probability value (p-value) of a statistical hypothesis test is the
probability of getting a value of the test statistic as extreme as or more
extreme than that observed by chance alone, if the null hypothesis H0, is
true.
It can be interpreted as the probability of wrongly rejecting the null
hypothesis, if it is in fact true, when the observed value is used as the
threshold for rejection.
It is equal to the significance level of the test for which we would only just
reject the null hypothesis. The p-value is compared with the actual
significance level of our test and, if it is smaller, the result is significant.
That is, if the null hypothesis were to be rejected at the 5% significance
level, this would be reported as "p < 0.05".
Small p-values suggest that the null hypothesis is unlikely to be true. The
smaller it is, the more convincing is the rejection of the null hypothesis. It
indicates the strength of evidence for say, rejecting the null hypothesis H0,
rather than simply concluding "Reject H0" or "Do not reject H0".

Power

The power of a statistical hypothesis test measures the test's ability to
reject the null hypothesis when it is actually false - that is, to make a
correct decision.
In other words, the power of a hypothesis test is the probability of not
committing a type II error. It is calculated by subtracting the probability of
a type II error from 1, usually expressed as:
Power = 1 - P(type II error) = 1 - β
The maximum power a test can have is 1, the minimum is 0. Ideally we
want a test to have high power, close to 1.
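
In simple settings the power can be computed directly rather than left unknown. The sketch below is an illustration under assumptions that are not part of the glossary: a lower-tailed z-test with known standard deviation, and a specific true mean `mu1` supplied by the user. The function name and the numbers (mean 50 under H0, true mean 49, standard deviation 2) are hypothetical.

```python
from statistics import NormalDist


def power_lower_tailed_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the z-test of H0: mu = mu0 against H1: mu < mu0.

    Assumes normal data with known sigma and a true mean of mu1.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(alpha)        # reject H0 when z < z_crit
    shift = (mu1 - mu0) / (sigma / n**0.5)    # mean of z when the true mean is mu1
    return std_normal.cdf(z_crit - shift)


power_25 = power_lower_tailed_z(mu0=50, mu1=49, sigma=2, n=25)
power_50 = power_lower_tailed_z(mu0=50, mu1=49, sigma=2, n=50)

print(f"n = 25: power = {power_25:.3f}, beta = {1 - power_25:.3f}")
print(f"n = 50: power = {power_50:.3f}, beta = {1 - power_50:.3f}")
```

With these assumed values the power is about 0.80 for 25 observations and rises with the larger sample, in line with the remark that type II errors are frequently due to sample sizes being too small.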

One-sided Test

A one-sided test is a statistical hypothesis test in which the values for
which we can reject the null hypothesis, H0 are located entirely in one tail
of the probability distribution.
In other words, the critical region for a one-sided test is the set of values
less than the critical value of the test, or the set of values greater than the
critical value of the test.
A one-sided test is also referred to as a one-tailed test of significance.
The choice between a one-sided and a two-sided test is determined by
the purpose of the investigation or prior reasons for using a one-sided
test.
Example
Suppose we wanted to test a manufacturer's claim that there are, on
average, 50 matches in a box. We could set up the following hypotheses
H0: µ = 50,
against
H1: µ < 50 or H1: µ > 50
Either of these two alternative hypotheses would lead to a one-sided test.
Presumably, we would want to test the null hypothesis against the first
alternative hypothesis since it would be useful to know if there is likely to
be less than 50 matches, on average, in a box (no one would complain if
they get the correct number of matches in a box or more).
Yet another alternative hypothesis could be tested against the same null,
leading this time to a two-sided test:
H0: µ = 50,
against
H1: µ not equal to 50
Here, nothing specific can be said about the average number of matches
in a box; only that, if we could reject the null hypothesis in our test, we
would know that the average number of matches in a box is likely to be
less than or greater than 50.
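
The matches example can be worked through numerically to show how the choice of alternative changes the p-value. Everything numeric here is hypothetical: the sample mean of 49.2 matches, the sample of 40 boxes and the known standard deviation of 2 are invented for illustration, and a z-test is assumed so that only the Python standard library is needed.

```python
from statistics import NormalDist


def z_statistic(sample_mean, mu0, sigma, n):
    """z-statistic for H0: mu = mu0 with known standard deviation sigma."""
    return (sample_mean - mu0) / (sigma / n**0.5)


def p_value(z, alternative):
    """p-value for a statistic that is N(0, 1) under H0.

    alternative: "less" (H1: mu < mu0), "greater" (H1: mu > mu0)
    or "two-sided" (H1: mu not equal to mu0).
    """
    std_normal = NormalDist()
    if alternative == "less":
        return std_normal.cdf(z)
    if alternative == "greater":
        return 1 - std_normal.cdf(z)
    if alternative == "two-sided":
        return 2 * std_normal.cdf(-abs(z))
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")


# Hypothetical data: 40 boxes with a mean of 49.2 matches, sigma assumed to be 2.
z = z_statistic(sample_mean=49.2, mu0=50, sigma=2.0, n=40)
p_less = p_value(z, "less")
p_greater = p_value(z, "greater")
p_two_sided = p_value(z, "two-sided")

print(f"z = {z:.3f}")
print(f"H1: mu < 50         p = {p_less:.4f}")
print(f"H1: mu > 50         p = {p_greater:.4f}")
print(f"H1: mu not equal 50 p = {p_two_sided:.4f}")
```

For these invented figures the one-sided p-value against H1: µ < 50 is well below 0.05, the two-sided p-value is exactly twice as large, and the test against H1: µ > 50 gives no evidence against H0 at all.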

Two-Sided Test

A two-sided test is a statistical hypothesis test in which the values for
which we can reject the null hypothesis, H0 are located in both tails of the
probability distribution.
In other words, the critical region for a two-sided test is the set of values
less than a first critical value of the test and the set of values greater than
a second critical value of the test.
A two-sided test is also referred to as a two-tailed test of significance.
The choice between a one-sided test and a two-sided test is determined
by the purpose of the investigation or prior reasons for using a one-sided
test.
Example
Suppose we wanted to test a manufacturer's claim that there are, on
average, 50 matches in a box. We could set up the following hypotheses
H0: µ = 50,
against
H1: µ < 50 or H1: µ > 50
Either of these two alternative hypotheses would lead to a one-sided test.
Presumably, we would want to test the null hypothesis against the first
alternative hypothesis since it would be useful to know if there is likely to
be less than 50 matches, on average, in a box (no one would complain if
they get the correct number of matches in a box or more).
Yet another alternative hypothesis could be tested against the same null,
leading this time to a two-sided test:
H0: µ = 50,
against
H1: µ not equal to 50
Here, nothing specific can be said about the average number of matches
in a box; only that, if we could reject the null hypothesis in our test, we
would know that the average number of matches in a box is likely to be
less than or greater than 50.

One Sample t-test

A one sample t-test is a hypothesis test for answering questions about the
mean where the data are a random sample of independent observations
from an underlying normal distribution N(µ, σ²), where σ² is unknown.
The null hypothesis for the one sample t-test is:
H0: µ = µ0, where µ0 is known.
That is, the sample has been drawn from a population of given mean and
unknown variance (which therefore has to be estimated from the sample).
This null hypothesis, H0 is tested against one of the following alternative
hypotheses, depending on the question posed:
H1: µ is not equal to µ0
H1: µ > µ0
H1: µ < µ0

Two Sample t-test

A two sample t-test is a hypothesis test for answering questions about the
mean where the data are collected from two random samples of
independent observations, each from an underlying normal distribution:
N(µ1, σ1²) and N(µ2, σ2²)
When carrying out a two sample t-test, it is usual to assume that the
variances for the two populations are equal, i.e.
σ1² = σ2²
The null hypothesis for the two sample t-test is:
H0: µ1 = µ2
That is, the two samples have both been drawn from the same
population. This null hypothesis is tested against one of the following
alternative hypotheses, depending on the question posed.
H1: µ1 is not equal to µ2
H1: µ1 > µ2
H1: µ1 < µ2
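
The test statistics behind the two t-tests can be sketched with the Python standard library. The glossary does not give the formulas, so the code uses the usual textbook forms as an assumption: the sample mean minus µ0 divided by its estimated standard error for one sample, and the pooled-variance statistic for two samples with equal variances. The function names and the data are hypothetical.

```python
from statistics import mean, stdev, variance


def one_sample_t(sample, mu0):
    """Return (t, degrees of freedom) for H0: mu = mu0.

    The unknown variance is estimated from the sample.
    """
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / n**0.5)
    return t, n - 1


def two_sample_pooled_t(sample1, sample2):
    """Return (t, degrees of freedom) for H0: mu1 = mu2.

    Assumes both populations are normal with equal variances, so the
    two sample variances are pooled into one estimate.
    """
    n1, n2 = len(sample1), len(sample2)
    pooled_var = ((n1 - 1) * variance(sample1)
                  + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    t = (mean(sample1) - mean(sample2)) / (pooled_var * (1 / n1 + 1 / n2)) ** 0.5
    return t, n1 + n2 - 2


# Hypothetical counts of matches in boxes from two production lines.
line_a = [49, 50, 48, 51, 47, 50, 49, 48, 50, 49]
line_b = [51, 50, 52, 49, 51, 50, 52, 51, 50, 51]

t_one, df_one = one_sample_t(line_a, mu0=50)
t_two, df_two = two_sample_pooled_t(line_a, line_b)
print(f"one sample: t = {t_one:.3f} on {df_one} degrees of freedom")
print(f"two sample: t = {t_two:.3f} on {df_two} degrees of freedom")
```

Turning these statistics into critical values or p-values needs the t distribution, which the standard library does not provide; a t table or a statistics package such as SciPy (`scipy.stats.ttest_1samp`, `scipy.stats.ttest_ind`) would be the usual route.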