Statistics
A sample statistic is an estimate of a population
parameter
A sample estimate is subject to sampling error
The sampling distribution captures the variation of a
sample estimate around the true parameter value when
repeated samples are drawn from the population
Sampling distribution of the sample mean
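A random sample $X_1, X_2, \ldots, X_n$ is drawn from a population with mean $\mu = E(X_i)$ and variance $\sigma^2 = \mathrm{Var}(X_i)$.
The sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is a common estimate for the population mean $\mu$. What is the sampling distribution of $\bar{X}$?
It depends on the population distribution of $X_i$
For $X_i \sim$ i.i.d. $\mathrm{Bern}(p)$, $\sum_{i=1}^{n} X_i \sim \mathrm{Bin}(n, p)$, then $P(\bar{X} = x/n) = P\left(\sum_{i=1}^{n} X_i = x\right) = \binom{n}{x} p^x (1-p)^{n-x}$
For $X_i \sim$ i.i.d. $N(\mu, \sigma^2)$, $\bar{X} \sim N(\mu, \sigma^2/n)$
For $X_i \sim$ i.i.d. $\mathrm{Exp}(\lambda)$, $\sum_{i=1}^{n} X_i \sim \mathrm{Gamma}(\lambda, n)$ (rate $\lambda$, shape $n$), then $\bar{X} \sim \mathrm{Gamma}(n\lambda, n)$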
Central Limit Theorem
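For any arbitrary population distribution with finite $\mu = E(X_i)$ and $\sigma^2 = \mathrm{Var}(X_i)$, as sample size $n \to \infty$, the sampling distribution of $\bar{X}$ converges to a $N(\mu, \sigma^2/n)$, i.e. $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$ approximately.
Note for any $n$, $E(\bar{X}) = \mu$ and $\mathrm{Var}(\bar{X}) = \sigma^2/n$.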
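The statement above can be checked by simulation. The following is a minimal sketch, assuming an Exp(1) population (an illustrative choice with $\mu = 1$ and $\sigma^2 = 1$); the function names `sample_means` and `summarize` are hypothetical helpers, not part of the original material.

```python
import math
import random
import statistics

def sample_means(n, reps, rate=1.0, seed=0):
    """Draw `reps` samples of size n from Exp(rate); return the sample means."""
    rng = random.Random(seed)
    return [statistics.fmean([rng.expovariate(rate) for _ in range(n)])
            for _ in range(reps)]

def summarize(n, reps=20000, rate=1.0):
    """Return (mean of Xbar, variance of Xbar, fraction of |Z| <= 1.96)."""
    mu, sigma2 = 1.0 / rate, 1.0 / rate ** 2   # mean and variance of Exp(rate)
    means = sample_means(n, reps, rate)
    # Standardize each sample mean: Z = (Xbar - mu) / (sigma / sqrt(n))
    z = [(m - mu) / math.sqrt(sigma2 / n) for m in means]
    # Under N(0,1), about 95% of the values fall within +/- 1.96
    cover = sum(abs(v) <= 1.96 for v in z) / reps
    return statistics.fmean(means), statistics.variance(means), cover

if __name__ == "__main__":
    for n in (1, 5, 10, 100):
        m, v, c = summarize(n)
        print(f"n={n:3d} mean={m:.3f} var={v:.4f} "
              f"sigma^2/n={1 / n:.4f} P(|Z|<=1.96)={c:.3f}")
```

For every $n$ the simulated mean and variance of $\bar{X}$ should sit close to $\mu$ and $\sigma^2/n$, while the shape of the distribution only approaches the normal as $n$ grows.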
Empirical distributions of sample mean
[Figure: four density plots of the sample mean (x-axis: x, from 0 to 10; y-axis: Density) for a (2,1) population, with n = 1, n = 5, n = 10, n = 100]
Normal approximation to Binomial distribution
If $X \sim \mathrm{Bin}(n, p)$, then when $n$ is large
$X \approx N(np,\; np(1-p))$
[Figure: four density plots (y-axis: Density) of X~Bin(5,0.5), X~Bin(20,0.5), X~Bin(10,0.1) and X~Bin(100,0.1)]
A general rule of thumb
$np \ge 10$ and $n(1-p) \ge 10$
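As a small illustration, the rule of thumb can be applied to the four binomial cases plotted in this section. This is only a sketch; the helper name `normal_approx_ok` and the tiny floating-point tolerance are assumptions made for the example.

```python
def normal_approx_ok(n, p, threshold=10, tol=1e-9):
    """Rule of thumb: both np and n(1-p) should be at least `threshold`.

    `tol` guards against floating-point round-off in products such as 100 * 0.1.
    """
    return n * p >= threshold - tol and n * (1 - p) >= threshold - tol

# The four (n, p) pairs shown in the binomial plots
for n, p in [(5, 0.5), (20, 0.5), (10, 0.1), (100, 0.1)]:
    print(f"Bin({n}, {p}): np={n * p:g}, n(1-p)={n * (1 - p):g}, "
          f"rule satisfied: {normal_approx_ok(n, p)}")
```

Only Bin(20, 0.5) and Bin(100, 0.1) meet the rule, and both sit exactly on the boundary $np = 10$.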
Continuity Correction
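For $X \sim \mathrm{Bin}(n, p)$ and integer $x$:
$P(X \le x) \approx \Phi\left(\frac{x + 0.5 - np}{\sqrt{np(1-p)}}\right)$
$P(X \ge x) \approx 1 - \Phi\left(\frac{x - 0.5 - np}{\sqrt{np(1-p)}}\right)$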
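To see what the correction buys, the sketch below compares the exact binomial probability $P(X \le x)$ with the normal approximation, with and without the 0.5 shift. The case Bin(20, 0.5) with $x = 8$ is an arbitrary illustration, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def binom_cdf(x, n, p):
    """Exact P(X <= x) for X ~ Bin(n, p), summing the binomial pmf."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(0, x + 1))

def normal_cdf_approx(x, n, p, correction=True):
    """Normal approximation to P(X <= x), optionally with the +0.5 shift."""
    mu = n * p
    sd = math.sqrt(n * p * (1 - p))
    shift = 0.5 if correction else 0.0
    return NormalDist().cdf((x + shift - mu) / sd)

if __name__ == "__main__":
    n, p, x = 20, 0.5, 8   # illustrative values, not taken from the slides
    print("exact              :", round(binom_cdf(x, n, p), 4))
    print("normal, no shift   :", round(normal_cdf_approx(x, n, p, False), 4))
    print("normal, +0.5 shift :", round(normal_cdf_approx(x, n, p, True), 4))
```

In this case the exact value is roughly 0.252; the uncorrected approximation gives roughly 0.186, while the corrected one gives roughly 0.251.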
Sampling distribution of the sample variance
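The sample variance $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$ is often used to estimate the population variance $\sigma^2$. The sampling distribution of $S^2$ is also very dependent on the population distribution of $X_i$.
When $X_i \sim N(\mu, \sigma^2)$, it can be shown that
$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$
$S^2 \sim \frac{\sigma^2 \chi^2_{n-1}}{n-1}$
$E(S^2) = \sigma^2$ and $\mathrm{Var}(S^2) = \frac{2\sigma^4}{n-1}$
$\bar{X}$ and $S^2$ are statistically independent.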
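The two moment formulas for $S^2$ under normal sampling can be checked numerically. The following is a minimal sketch, assuming $n = 10$ and $\sigma = 2$ purely for illustration; `sample_variance` and `simulate_sample_variance` are hypothetical helper names.

```python
import random
import statistics

def sample_variance(xs):
    """S^2 with the n-1 divisor, as defined in the text."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def simulate_sample_variance(n, mu=0.0, sigma=1.0, reps=20000, seed=1):
    """Return the simulated mean and variance of S^2 over `reps` normal samples."""
    rng = random.Random(seed)
    s2 = [sample_variance([rng.gauss(mu, sigma) for _ in range(n)])
          for _ in range(reps)]
    return statistics.fmean(s2), statistics.variance(s2)

if __name__ == "__main__":
    n, sigma = 10, 2.0   # illustrative values
    mean_s2, var_s2 = simulate_sample_variance(n, sigma=sigma)
    print("E(S^2):   simulated", round(mean_s2, 3), " theory", sigma ** 2)
    print("Var(S^2): simulated", round(var_s2, 3),
          " theory", round(2 * sigma ** 4 / (n - 1), 3))
```

The agreement depends on normality: for a non-normal population $E(S^2) = \sigma^2$ still holds, but the variance formula generally does not.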
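$\chi^2_{\nu,\alpha}$ denotes the upper $\alpha$ critical point of a $\chi^2_\nu$ distribution (Table A.5)
Particularly, $Z^2 \sim \chi^2_1$, and $z^2_{\alpha/2} = \chi^2_{1,\alpha}$
Ex: Find (1) $\chi^2_{10,0.05}$ and $\chi^2_{10,0.95}$;
(2) a, b such that $P(a < \chi^2_8 < b) = 0.95$
[Figure: $\chi^2$ density with the critical points $\chi^2_{\nu,1-\alpha}$ and $\chi^2_{\nu,\alpha}$ marked; upper tail area = 2.5%, lower tail area = 2.5%]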
Ex: The variance among the repeat measurements is used
to quantify the precision of an instrument. Suppose the
advertised claim for the precision of one kind of
thermometer is $\sigma = 0.01$. If the sample variance $s^2$ is
much larger than $\sigma^2 = 0.01^2$, then this casts doubt on the
advertised claim. What is the threshold value of the sample
variance such that the probability of observing a value no
less than this cut-off value is no more than 5%?
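One way to set up this example is sketched below. It assumes $n$ normally distributed repeat measurements, so that $(n-1)S^2/\sigma^2 \sim \chi^2_{n-1}$ applies; the number of measurements $n$ is not given in the text above, so the cut-off $c$ is left in terms of $n$.

```latex
\begin{aligned}
P(S^2 \ge c)
  &= P\!\left(\frac{(n-1)S^2}{\sigma^2} \ge \frac{(n-1)c}{\sigma^2}\right)
   = P\!\left(\chi^2_{n-1} \ge \frac{(n-1)c}{\sigma^2}\right) \le 0.05 \\
\Rightarrow\quad \frac{(n-1)c}{\sigma^2} &= \chi^2_{n-1,\,0.05}
\quad\Rightarrow\quad
c = \frac{\sigma^2\,\chi^2_{n-1,\,0.05}}{n-1}
  = \frac{0.01^2\,\chi^2_{n-1,\,0.05}}{n-1}
\end{aligned}
```

Taking equality gives the smallest cut-off that keeps the tail probability at 5%; the critical point $\chi^2_{n-1,0.05}$ would then be read from Table A.5 once $n$ is known.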
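Student's t-distribution
For $X_i \sim N(\mu, \sigma^2)$, it is known that $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$. Then,
$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$
where a $t_\nu$ distribution has pdf
$f(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi\nu}\,\Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}, \quad -\infty < t < \infty$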
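Replacing $\sigma$ by $S$ makes the statistic more variable, which shows up as heavier tails. The sketch below illustrates this by simulation, assuming samples of size $n = 6$ from a standard normal population; the cutoff 1.96 is the usual two-sided 5% point of $N(0,1)$, and the function names are hypothetical.

```python
import math
import random

def t_statistic(sample, mu):
    """T = (Xbar - mu) / (S / sqrt(n)), using the n-1 divisor for S^2."""
    n = len(sample)
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    return (xbar - mu) / math.sqrt(s2 / n)

def tail_fractions(n=6, mu=0.0, sigma=1.0, cutoff=1.96, reps=20000, seed=3):
    """Fraction of simulated |Z| and |T| values exceeding `cutoff`."""
    rng = random.Random(seed)
    beyond_z = beyond_t = 0
    for _ in range(reps):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        z = (xbar - mu) / (sigma / math.sqrt(n))   # sigma known
        beyond_z += abs(z) > cutoff
        beyond_t += abs(t_statistic(xs, mu)) > cutoff   # sigma estimated by S
    return beyond_z / reps, beyond_t / reps

if __name__ == "__main__":
    fz, ft = tail_fractions()
    print("P(|Z| > 1.96) ~", round(fz, 3))
    print("P(|T| > 1.96) ~", round(ft, 3), "(T has n-1 = 5 degrees of freedom)")
```

With 5 degrees of freedom the fraction for $T$ comes out at roughly twice the 5% seen for $Z$, which is why t critical values are larger than the corresponding normal ones.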
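Student's t-distribution
Symmetric around zero
Bell-shaped
$t_\nu \to N(0,1)$ as $\nu \to \infty$
Upper $\alpha$ critical value: $t_{\nu,\alpha}$
$t_{\nu,1-\alpha} = -t_{\nu,\alpha}$ (Table A.4)
[Figure: t density with the critical points $t_{\nu,\alpha}$ and $t_{\nu,1-\alpha}$ marked]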
Ex: Find (1) $t_{10,0.1}$, $t_{5,0.95}$
(2) a such that $P(T_{10} < a) = 0.95$
Ex: A soft drink company uses a filling machine to fill cans.
Each 12 oz. can is to contain $\mu = 355$ milliliters of
beverage. The actual filling amount follows a normal
distribution with mean $\mu$ and variance $\sigma^2$.
(1) If $\sigma^2$ is known to be $0.5^2$ ml$^2$, then what is the probability
that the mean content of a six-pack of cans is less than
354.8 ml?
(2) If $\sigma^2$ is unknown and the sample variance of the contents of
a six-pack of cans is measured to be $0.6^2$ ml$^2$, what is the
minimum deviation $d$ of the sample mean of a six-pack
from $\mu$ such that the probability of observing a sample
mean at least $d$ away from $\mu$ is no more than 5%?
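A possible way to work this example numerically is sketched below. Part (1) uses $\bar{X} \sim N(\mu, \sigma^2/n)$ with $n = 6$; part (2) uses $T = (\bar{X} - \mu)/(S/\sqrt{n}) \sim t_5$ and reads the 5% as a two-sided probability, which is an interpretation of the question rather than something it states. The critical value $t_{5,0.025} = 2.571$ is taken from a standard t table, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

MU = 355.0   # target fill in ml, from the example
N = 6        # cans in a six-pack

def prob_mean_below(x, sigma, mu=MU, n=N):
    """P(Xbar < x) when sigma is known: Xbar ~ N(mu, sigma^2 / n)."""
    return NormalDist(mu, sigma / math.sqrt(n)).cdf(x)

def min_deviation(s, t_crit, n=N):
    """Smallest d with P(|Xbar - mu| >= d) <= alpha, via T = (Xbar - mu)/(S/sqrt(n))."""
    return t_crit * s / math.sqrt(n)

T_5_0025 = 2.571   # t_{5, 0.025} from a t table (assumes a two-sided 5% level)

if __name__ == "__main__":
    print("(1) P(Xbar < 354.8) =", round(prob_mean_below(354.8, 0.5), 4))
    print("(2) d =", round(min_deviation(0.6, T_5_0025), 3), "ml")
```

Under these assumptions part (1) comes out at about 0.164 and part (2) at about 0.63 ml.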
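Snedecor-Fisher's F-distribution
For independent $U \sim \chi^2_{\nu_1}$ and $V \sim \chi^2_{\nu_2}$
Let $W = \frac{U/\nu_1}{V/\nu_2}$
Then $W \sim F_{\nu_1,\nu_2}$
with pdf given by
$f(w) = \frac{\Gamma\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma\left(\frac{\nu_1}{2}\right)\Gamma\left(\frac{\nu_2}{2}\right)} \left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} w^{\nu_1/2 - 1} \left(1 + \frac{\nu_1}{\nu_2} w\right)^{-\frac{\nu_1+\nu_2}{2}}, \quad w > 0$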
F-distribution
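$F_{\nu_1,\nu_2} \sim \frac{1}{F_{\nu_2,\nu_1}}$
Upper $\alpha$ critical value: $f_{\nu_1,\nu_2,\alpha}$ (Table A.6)
Lower $\alpha$ critical value: $f_{\nu_1,\nu_2,1-\alpha} = \frac{1}{f_{\nu_2,\nu_1,\alpha}}$
$T_\nu^2 = \frac{Z^2/1}{\chi^2_\nu/\nu} \sim F_{1,\nu}$ and $t^2_{\nu,\alpha/2} = f_{1,\nu,\alpha}$
Ex: Find (1) $f_{10,10,0.05}$, $f_{10,10,0.95}$, $f_{5,10,0.1}$, $f_{5,10,0.9}$
(2) a & b such that $P(a < F_{8,12} < b) = 0.9$
[Figure: F density with the critical points $f_{\nu_1,\nu_2,\alpha}$ and $f_{\nu_1,\nu_2,1-\alpha}$ marked; lower tail area = $\alpha$]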
Ex: A company tests samples of a certain product made by
two different suppliers to determine whether the variability
in their products is different. Two samples of $n_1 = 9$ and
$n_2 = 13$ units are drawn from the products of the two
suppliers. A decision rule for declaring that the true variances of
the two suppliers are different is defined as
$\frac{s_1^2}{s_2^2} < c_1$ or $\frac{s_1^2}{s_2^2} > c_2$
for some $c_1 < 1$ and $c_2 > 1$. Determine the decision rule
such that
$P\left(\frac{s_1^2}{s_2^2} < c_1 \,\middle|\, \sigma_1^2 = \sigma_2^2\right) = P\left(\frac{s_1^2}{s_2^2} > c_2 \,\middle|\, \sigma_1^2 = \sigma_2^2\right) = 0.05$
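A sketch of how the two constants could be obtained is given below. It assumes both samples come from independent normal populations, which the example does not state explicitly; under that assumption the variance ratio follows an F-distribution with $n_1 - 1 = 8$ and $n_2 - 1 = 12$ degrees of freedom when $\sigma_1^2 = \sigma_2^2$.

```latex
\begin{aligned}
\sigma_1^2 = \sigma_2^2 \;\Rightarrow\;
\frac{S_1^2}{S_2^2} &= \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}
  \sim F_{n_1-1,\,n_2-1} = F_{8,12} \\
P\!\left(F_{8,12} > c_2\right) = 0.05
  \;&\Rightarrow\; c_2 = f_{8,12,\,0.05} \\
P\!\left(F_{8,12} < c_1\right) = 0.05
  \;&\Rightarrow\; c_1 = f_{8,12,\,0.95} = \frac{1}{f_{12,8,\,0.05}}
\end{aligned}
```

Both $f_{8,12,0.05}$ and $f_{12,8,0.05}$ are upper critical values, so they can be read directly from Table A.6; the lower one is obtained through the reciprocal relation.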
Sampling distribution of order statistics
Data: $X_1, X_2, \ldots, X_n \sim$ i.i.d. $f(x)$, a continuous distribution
Ordered data: $X_{(1)} < X_{(2)} < \cdots < X_{(n)}$
Consider sampling distributions of $X_{(1)} = X_{\min}$ and $X_{(n)} = X_{\max}$
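The sampling distributions of the extremes can be explored by simulation. The sketch below assumes a Uniform(0,1) population with $n = 5$ (an illustrative choice) and compares the empirical probabilities with the standard results $P(X_{\max} \le x) = F(x)^n$ and $P(X_{\min} \le x) = 1 - (1 - F(x))^n$, where $F$ is the population c.d.f.; these formulas are well-known facts quoted here for the comparison, and the function names are hypothetical.

```python
import random

def simulate_extremes(n, reps=20000, seed=2):
    """Return lists of sample minima and maxima from Uniform(0,1) samples of size n."""
    rng = random.Random(seed)
    mins, maxs = [], []
    for _ in range(reps):
        xs = [rng.random() for _ in range(n)]
        mins.append(min(xs))
        maxs.append(max(xs))
    return mins, maxs

def ecdf_at(values, x):
    """Empirical P(value <= x)."""
    return sum(v <= x for v in values) / len(values)

if __name__ == "__main__":
    n = 5
    mins, maxs = simulate_extremes(n)
    # For Uniform(0,1), F(x) = x on [0, 1]
    print("P(X_max <= 0.8): simulated", round(ecdf_at(maxs, 0.8), 3),
          " theory", round(0.8 ** n, 3))
    print("P(X_min <= 0.2): simulated", round(ecdf_at(mins, 0.2), 3),
          " theory", round(1 - 0.8 ** n, 3))
```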
Sampling distribution of the r-th order statistic
Let