
INTERVAL ESTIMATION AND HYPOTHESIS TESTING
Inferential Statistics
Making statements about a population by examining sample results.
Sample statistics (known) -> Inference -> Population parameters (unknown, but can be estimated from sample evidence)
[Figure: a sample drawn from a population.]


Sample Statistics as Estimators of Population Parameters
A population parameter is a numerical measure of a summary characteristic of a population.
A sample statistic is a numerical measure of a summary characteristic of a sample.
An estimator of a population parameter is a sample statistic used to estimate or predict the population parameter.
An estimate of a parameter is a particular numerical value of a sample statistic obtained through sampling.
A point estimate is a single value used as an estimate of a population parameter.
Inferential Statistics
Drawing conclusions and/or making decisions concerning a population based on sample results.
Estimation: e.g., estimate the population mean using the information derived from a sample.
Hypothesis testing: e.g., use sample evidence to test hypotheses about the population mean.
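Point and Interval Estimates
A point estimate is a single number.
A confidence interval is a range of values around the point estimate, constructed so that it contains the unknown parameter with a stated level of confidence.
Lower Confidence Limit < Point Estimate < Upper Confidence Limit
Width of confidence interval = Upper Confidence Limit - Lower Confidence Limit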
Confidence Level, $(1-\alpha)$
Suppose confidence level = 95%
Also written $(1-\alpha) = 0.95$
A relative frequency interpretation:
In repeated sampling, 95% of the confidence intervals constructed around the sample statistic will contain the unknown true parameter.
Reducing the Margin of Error
$ME = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$
The margin of error can be reduced if
the population standard deviation is lower ($\sigma \downarrow$)
the sample size is increased ($n \uparrow$)
the confidence level is decreased ($(1-\alpha) \downarrow$)
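The three ways of shrinking the margin of error can be checked numerically. The sketch below assumes a known population standard deviation and uses Python's standard-library `NormalDist` for the critical value; the helper name `margin_of_error` and all input numbers are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma: float, n: int, confidence: float = 0.95) -> float:
    """Return z_{alpha/2} * sigma / sqrt(n) for a mean with known sigma."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 critical value
    return z * sigma / sqrt(n)


# Hypothetical baseline: sigma = 10, n = 25, 95% confidence.
baseline = margin_of_error(sigma=10.0, n=25, confidence=0.95)
lower_sigma = margin_of_error(sigma=5.0, n=25, confidence=0.95)
larger_n = margin_of_error(sigma=10.0, n=100, confidence=0.95)
lower_confidence = margin_of_error(sigma=10.0, n=25, confidence=0.90)

print(f"baseline:         {baseline:.3f}")
print(f"lower sigma:      {lower_sigma:.3f}")
print(f"larger n:         {larger_n:.3f}")
print(f"lower confidence: {lower_confidence:.3f}")
```

Because n enters through its square root, quadrupling the sample size only halves the margin of error.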
Finding $z_{\alpha/2}$
Consider a 95% confidence interval: $1-\alpha = .95$, so $\alpha/2 = .025$ in each tail.
Z units: $z = -1.96$ (lower confidence limit), 0 (point estimate), $z = 1.96$ (upper confidence limit)
X units: Lower Confidence Limit, Point Estimate, Upper Confidence Limit
Find $z_{.025} = \pm 1.96$ from the standard normal distribution table
Common Levels of Confidence
Commonly used confidence levels are 90%, 95%, and 99%
Confidence   Confidence Coefficient,   $z_{\alpha/2}$
Level        $1-\alpha$                value

80%          .80                       1.28
90%          .90                       1.645
95%          .95                       1.96
98%          .98                       2.33
99%          .99                       2.58
99.8%        .998                      3.09
99.9%        .999                      3.29
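Critical values such as these can also be computed instead of read from a table. The following is a minimal sketch using the standard-library `NormalDist`; the function name `z_critical` is an illustrative choice, not something from the slides.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided critical value z_{alpha/2} for a given confidence level."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


levels = [0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999]
z_values = {level: z_critical(level) for level in levels}

# Print each confidence coefficient next to its critical value.
for level, z in z_values.items():
    print(f"{level:.3f}  {z:.3f}")
```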
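Intervals and Level of Confidence
Sampling Distribution of the Mean
Confidence intervals extend from $\bar{x} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$ to $\bar{x} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$
$100(1-\alpha)\%$ of intervals constructed contain $\mu$; $100(\alpha)\%$ do not.
[Figure: sampling distribution of the mean centered at $\mu_{\bar{x}} = \mu$, with area $1-\alpha$ in the middle and $\alpha/2$ in each tail; intervals around sample means $\bar{x}_1$ and $\bar{x}_2$ are drawn beneath it.]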
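The relative frequency reading of the confidence level can be illustrated by simulation. This sketch assumes a normal population with known standard deviation; the population values, sample size, number of trials, and seed are hypothetical, so the printed coverage is an estimate that should land near the nominal level rather than exactly on it.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and build the interval around its mean.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials


rate = coverage(mu=50.0, sigma=6.0, n=30, confidence=0.95, trials=2000)
print(f"share of intervals containing mu: {rate:.3f}")
```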
Z-value for Sampling Distribution of the Mean
Z-value for the sampling distribution of $\bar{X}$:
$Z = \dfrac{\bar{X}-\mu}{\sigma_{\bar{X}}} = \dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}}$
where: $\bar{X}$ = sample mean
$\mu$ = population mean
$\sigma$ = population standard deviation
n = sample size
Example 1
A large automotive-parts wholesaler needs an estimate of the
mean life it can expect from windshield wiper blades under
typical driving conditions
Already, management has determined that the standard
deviation of the population life is 6 months
Suppose we select a simple random sample of 100 wiper
blades, collect data on their useful lives, and obtain these
results:
$\sigma = 6$ months
$n = 100$
$\bar{x} = 21$ months
Give a 95% confidence interval for the true average life expectancy of wiper blades.
$\bar{x} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}} = 21 \pm 1.96\,\dfrac{6}{\sqrt{100}} = 21 \pm (1.96)(0.6) = 21 \pm 1.176 = [19.824,\ 22.176]$
Example 1
(continued)
Interpretation
We are 95% confident that the true
mean life of the population of wiper
blades is between 19.82 and 22.18
months
Although the true mean may or may not
be in this particular interval, 95% of
intervals formed in this manner will
contain the true mean
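The wiper-blade interval can be reproduced in a few lines. This is a sketch using the example's inputs (mean 21 months, population standard deviation 6 months, sample size 100); the function name `z_interval` is an illustrative assumption, and the critical value comes from the standard-library `NormalDist` rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for a mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin


# Inputs from the wiper-blade example.
lower, upper = z_interval(xbar=21.0, sigma=6.0, n=100, confidence=0.95)
print(f"[{lower:.3f}, {upper:.3f}]")
```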
Example using Excel
[Screenshot: the Excel function that is used.]
[Screenshot: the display shows the error value of 5387.75. Add this value to, and subtract it from, the sample mean to get the 95% confidence interval.]
Central Limit Theorem
As the sample size n gets large enough, the sampling distribution of the sample mean becomes almost normal, regardless of the shape of the population.

If the Population is not Normal
Sampling distribution properties:
Central tendency: $\mu_{\bar{x}} = \mu$
Variation: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
[Figure: population distribution versus the distribution of sample means, which becomes normal as n increases and is narrower for a larger sample size than for a smaller one.]
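The two sampling-distribution properties can be checked by simulation on a clearly non-normal population. The sketch below assumes an exponential population with mean 1 and standard deviation 1; that choice, along with the sample size, number of trials, and seed, is hypothetical and not taken from the slides.

```python
import random
from math import sqrt
from statistics import fmean, pstdev


def sample_means(n, trials, seed=0):
    """Means of `trials` samples of size n from an Exponential(1) population."""
    rng = random.Random(seed)
    return [fmean(rng.expovariate(1.0) for _ in range(n)) for _ in range(trials)]


n = 40
means = sample_means(n=n, trials=5000)
center = fmean(means)   # should be close to the population mean, 1
spread = pstdev(means)  # should be close to sigma / sqrt(n) = 1 / sqrt(40)

print(f"mean of sample means:   {center:.3f}")
print(f"std of sample means:    {spread:.3f}")
print(f"sigma / sqrt(n):        {1.0 / sqrt(n):.3f}")
```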

Student's t-distribution
t-distributions are bell-shaped and symmetric, but have fatter tails than the normal
[Figure: density curves centered at 0 for t (df = 5), t (df = 13), and the standard normal (t with df = $\infty$).]
Note: $t \to Z$ as n increases
t Table
        Right Tail Area
df      .10      .05      .025
1       3.078    6.314    12.706
2       1.886    2.920    4.303
3       1.638    2.353    3.182
The body of the table contains t values, not probabilities
Let: n = 3
df = n - 1 = 2
$\alpha = .10$
$\alpha/2 = .05$
so $t_{\alpha/2} = t_{.05} = 2.920$
t distribution values
With comparison to the Z value
Confidence   t           t           t           Z
Level        (10 d.f.)   (20 d.f.)   (30 d.f.)

.80          1.372       1.325       1.310       1.282
.90          1.812       1.725       1.697       1.645
.95          2.228       2.086       2.042       1.960
.99          3.169       2.845       2.750       2.576
Note: $t \to Z$ as n increases
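When the population standard deviation is unknown, the t value replaces z in the interval. Python's standard library has no t quantile function, so the sketch below simply looks up the tabulated values listed above; the dictionary `T_CRITICAL`, the function `t_interval`, and the sample inputs are hypothetical names and numbers used for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Two-sided critical values copied from the comparison table (df -> level -> t).
T_CRITICAL = {
    10: {0.80: 1.372, 0.90: 1.812, 0.95: 2.228, 0.99: 3.169},
    20: {0.80: 1.325, 0.90: 1.725, 0.95: 2.086, 0.99: 2.845},
    30: {0.80: 1.310, 0.90: 1.697, 0.95: 2.042, 0.99: 2.750},
}


def t_interval(xbar, s, n, confidence=0.95):
    """Confidence interval for a mean when sigma is unknown (df = n - 1)."""
    t = T_CRITICAL[n - 1][confidence]  # KeyError if df or level is not tabulated
    margin = t * s / sqrt(n)
    return xbar - margin, xbar + margin


z_95 = NormalDist().inv_cdf(0.975)
lower, upper = t_interval(xbar=21.0, s=6.0, n=11, confidence=0.95)
print(f"t interval (df = 10): [{lower:.3f}, {upper:.3f}]")
print(f"t(10) = {T_CRITICAL[10][0.95]}, t(30) = {T_CRITICAL[30][0.95]}, z = {z_95:.3f}")
```

For small samples the t interval is noticeably wider than a z interval built from the same numbers, and the gap closes as the degrees of freedom grow.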
HYPOTHESIS TESTING

The Null Hypothesis, $H_0$
At the beginning, assume that the null hypothesis is true (until evidence suggests otherwise)
Similar to the notion of innocent until proven guilty
Refers to the status quo
Always contains the $=$, $\le$, or $\ge$ sign
May or may not be rejected
The Alternative Hypothesis, $H_A$
Is the opposite of the null hypothesis
e.g., The average number of TV sets in U.S. homes is not equal to 3 ($H_1: \mu \ne 3$)
The assertion of all situations not covered by $H_0$
Challenges the status quo
Never contains the $=$, $\le$, or $\ge$ sign
Is generally the researcher's theory
$H_0$ and $H_1$ are:
Mutually exclusive: Only one can be true.
Exhaustive: Together they cover all possibilities, so one or the other must be true.
Reason for Rejecting $H_0$
[Figure: sampling distribution of $\bar{X}$ centered at $\mu = 50$ (if $H_0$ is true), with an observed sample mean of 20 far out in the lower tail.]
If it is unlikely that we would get a sample mean of this value (20) if in fact $\mu = 50$ were the population mean, then we reject the null hypothesis that $\mu = 50$.
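Level of Significance and the Rejection Region
Required level of significance = $\alpha$
Lower-tail test: $H_0: \mu \ge 3$, $H_1: \mu < 3$; rejection region of area $\alpha$ in the lower tail
Upper-tail test: $H_0: \mu \le 3$, $H_1: \mu > 3$; rejection region of area $\alpha$ in the upper tail
Two-tail test: $H_0: \mu = 3$, $H_1: \mu \ne 3$; rejection regions of area $\alpha/2$ in each tail
[Figure: three sampling distributions centered at 0 with the rejection region shaded; the boundary of each shaded region represents the critical value.]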
Level of Significance, $\alpha$
Defines rejection region of the sampling distribution
Is designated by $\alpha$ (level of significance)
Typical values are .01, .05, or .10
Is selected by the researcher at the beginning
Provides the critical value(s) of the test
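Decision Rule
$H_0: \mu \le \mu_0$
$H_1: \mu > \mu_0$
Reject $H_0$ if $z = \dfrac{\bar{x}-\mu_0}{s/\sqrt{n}} > z_{\alpha}$
Alternate rule: Reject $H_0$ if $\bar{X} > \mu_0 + z_{\alpha}\,\dfrac{s}{\sqrt{n}}$
[Figure: sampling distribution with area $\alpha$ in the upper tail. Z scale: 0 and the critical value $z_{\alpha}$; $\bar{x}$ scale: $\mu_0$ and $\mu_0 + z_{\alpha}\,s/\sqrt{n}$. Do not reject $H_0$ to the left of the critical value; reject $H_0$ to the right.]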
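The rule stated in z units and the alternate rule stated in units of the sample mean are two forms of the same comparison. The sketch below evaluates both for an upper-tail test with a large sample; the function name `upper_tail_test` and the input numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_test(xbar, mu0, s, n, alpha=0.05):
    """Apply the upper-tail decision rule in z units and in x-bar units."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)  # critical value on the z scale
    standard_error = s / sqrt(n)
    z = (xbar - mu0) / standard_error
    xbar_critical = mu0 + z_alpha * standard_error  # critical value on the x-bar scale
    reject_by_z = z > z_alpha
    reject_by_xbar = xbar > xbar_critical
    return z, xbar_critical, reject_by_z, reject_by_xbar


# Hypothetical data: H0: mu <= 3 against H1: mu > 3, with s = 1.2 and n = 64.
strong = upper_tail_test(xbar=3.4, mu0=3.0, s=1.2, n=64)
weak = upper_tail_test(xbar=3.1, mu0=3.0, s=1.2, n=64)
print("x-bar = 3.4:", strong)
print("x-bar = 3.1:", weak)
```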
Errors in Making Decisions
Type I Error
Rejecting a true null hypothesis
The probability of Type I Error is $\alpha$
Called level of significance of the test
Set by researcher in advance

Type II Error
Failing to reject a false null hypothesis
The probability of Type II Error is $\beta$

Outcomes and Probabilities

Jury Trial ($H_0$: Innocent)
             The Truth
Verdict      Innocent     Guilty
Innocent     Correct      Error
Guilty       Error        Correct

Hypothesis Test
                       The Truth
Decision               $H_0$ True                 $H_0$ False
Do Not Reject $H_0$    Correct ($1-\alpha$)       Type II Error ($\beta$)
Reject $H_0$           Type I Error ($\alpha$)    Correct: Power ($1-\beta$)
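The Type II error probability and the power depend on which alternative value of the mean is actually true. The sketch below computes them for an upper-tail z test, assuming a known population standard deviation and one specific alternative mean `mu1`; every numeric input and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_error_rates(mu0, mu1, sigma, n, alpha=0.05):
    """Return (alpha, beta, power) for H0: mu <= mu0 when the true mean is mu1."""
    standard_error = sigma / sqrt(n)
    # Reject H0 when the sample mean exceeds this critical value.
    xbar_critical = mu0 + NormalDist().inv_cdf(1.0 - alpha) * standard_error
    # Type II error: the sample mean falls below the critical value although mu = mu1.
    beta = NormalDist(mu1, standard_error).cdf(xbar_critical)
    return alpha, beta, 1.0 - beta


alpha, beta, power = upper_tail_error_rates(mu0=50.0, mu1=52.0, sigma=6.0, n=36)
print(f"alpha = {alpha:.2f}, beta = {beta:.4f}, power = {power:.4f}")
```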
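Two-Tail Tests
In some settings, the alternative hypothesis does not specify a unique direction
$H_0: \mu = 3$
$H_1: \mu \ne 3$
There are two critical values, defining the two regions of rejection
[Figure: sampling distribution with area $\alpha/2$ in each tail. $\bar{x}$ scale: 3 at the center; z scale: lower critical value $-z_{\alpha/2}$, 0, upper critical value $+z_{\alpha/2}$. Reject $H_0$ in either tail; do not reject $H_0$ between the critical values.]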
p-Value Approach to Testing
p-value: Probability of obtaining a test statistic at least as extreme ($\le$ or $\ge$) as the observed sample value, given $H_0$ is true
Also called observed level of significance
Smallest value of $\alpha$ for which $H_0$ can be rejected
p-Value Approach to Testing (continued)
Convert sample result (e.g., $\bar{x}$) to test statistic (e.g., z statistic)
Obtain the p-value
For an upper tail test:
$p\text{-value} = P\left(Z > \dfrac{\bar{x}-\mu_0}{s/\sqrt{n}},\ \text{given that } H_0 \text{ is true}\right) = P\left(Z > \dfrac{\bar{x}-\mu_0}{s/\sqrt{n}} \mid \mu = \mu_0\right)$
Decision rule: compare the p-value to $\alpha$
If p-value < $\alpha$, reject $H_0$
If p-value > $\alpha$, do not reject $H_0$
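The upper-tail p-value is a single tail area of the standard normal distribution. A minimal sketch, assuming a large sample so that the z statistic applies; the function name `upper_tail_p_value` and the sample numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_p_value(xbar, mu0, s, n):
    """Return the z statistic and P(Z > z) for an upper-tail test."""
    z = (xbar - mu0) / (s / sqrt(n))
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical data: x-bar = 52, mu0 = 50, s = 6, n = 36.
z, p_value = upper_tail_p_value(xbar=52.0, mu0=50.0, s=6.0, n=36)
alpha = 0.05
reject = p_value < alpha
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```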
The p-Value: Rules of Thumb
When the p-value is smaller than 0.01, the result is called very significant.
When the p-value is between 0.01 and 0.05, the result is called significant.
When the p-value is between 0.05 and 0.10, the result is considered by some as marginally significant (and by most as not significant).
When the p-value is greater than 0.10, the result is considered not significant.
Caution: Hypotheses are accepted, not proved
Suppose your theory is $\mu > 3$ ($H_A$)
Obtaining a sample mean greater than 3 is not sufficient to support your theory
It simply does not provide statistical evidence to reject it

Obtaining a sample mean significantly greater than 3 (the $H_0$ value) supports your theory, but does not prove it.
THEORIES CAN NEVER BE PROVED BY SAMPLE EVIDENCE.

Example 1
A company that delivers packages
within a large metropolitan area
claims that it takes an average of
28 minutes for a package to be
delivered from your door to the
destination. Suppose that you
want to carry out a hypothesis test
of this claim.
A random sample of 100 deliveries resulted in $\bar{x} = 31.5$ minutes and $s = 5$ minutes. Test the claim at the $\alpha = 0.05$ level.
$H_0: \mu = 28$
$H_1: \mu \ne 28$
$\alpha = 0.05$
$n = 100$
$\sigma$ is unknown, but n is large, so use a z statistic
Critical Value: $\pm z_{.025} = \pm 1.96$
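Example 1 (continued)
$H_0: \mu = 28$
$H_1: \mu \ne 28$
$z = \dfrac{\bar{x}-\mu_0}{s/\sqrt{n}} = \dfrac{31.50 - 28}{5/\sqrt{100}} = 7$
[Figure: standard normal curve with rejection regions of area $\alpha/2 = .025$ beyond $-z_{\alpha/2} = -1.96$ and $z_{\alpha/2} = 1.96$; the test statistic $z = 7$ falls in the upper rejection region.]
Reject $H_0$: sufficient evidence that true mean delivery time is different from 28 minutes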
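The delivery-time test can be reproduced numerically. The sketch below uses the example's inputs (sample mean 31.5, hypothesized mean 28, s = 5, n = 100) and reports both the critical-value decision and a two-sided p-value; the function name `two_tail_z_test` is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist


def two_tail_z_test(xbar, mu0, s, n, alpha=0.05):
    """Two-tail large-sample z test of H0: mu = mu0 against H1: mu != mu0."""
    std_normal = NormalDist()
    z = (xbar - mu0) / (s / sqrt(n))
    z_critical = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Two-sided p-value: area beyond |z| in both tails.
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    return z, z_critical, p_value, abs(z) > z_critical


# Inputs from the delivery-time example.
z, z_critical, p_value, reject = two_tail_z_test(xbar=31.5, mu0=28.0, s=5.0, n=100)
print(f"z = {z:.2f}, critical value = +/-{z_critical:.2f}, reject H0: {reject}")
```

A test statistic this far beyond the critical value gives a p-value that is effectively zero, which matches the decision to reject.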
