
Lecture Notes 5

Statistical Models

(Chapter 6.) A statistical model $\mathcal{P}$ is a collection of probability distributions (or a collection of densities). An example of a nonparametric model is
$$\mathcal{P} = \left\{ p : \int (p''(x))^2 \, dx < \infty \right\}.$$

A parametric model has the form
$$\mathcal{P} = \Big\{ p(x; \theta) : \theta \in \Theta \Big\}$$
where $\Theta \subset \mathbb{R}^d$. An example is the set of Normal densities $\{ p(x; \theta) = (2\pi)^{-1/2} e^{-(x-\theta)^2/2} \}$.

For now, we focus on parametric models. Later we consider nonparametric models.

Statistics

Let $X_1, \ldots, X_n \sim p(x; \theta)$. Let $X^n \equiv (X_1, \ldots, X_n)$. Any function $T = T(X_1, \ldots, X_n)$ is itself a random variable, which we will call a statistic.
Some examples are:
1. order statistics: $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$,
2. sample mean: $\bar{X} = \frac{1}{n} \sum_i X_i$,
3. sample variance: $S^2 = \frac{1}{n-1} \sum_i (X_i - \bar{X})^2$,
4. sample median: the middle value of the order statistics,
5. sample minimum: $X_{(1)}$,
6. sample maximum: $X_{(n)}$.
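To make these concrete, here is a small sketch (Python with NumPy; the sample size, distribution, and seed are arbitrary illustrative choices, not part of the notes) computing each statistic for one simulated sample:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=11)   # a sample X_1, ..., X_n

order_stats = np.sort(x)                      # X_(1) <= ... <= X_(n)
sample_mean = x.mean()                        # (1/n) sum_i X_i
sample_var = x.var(ddof=1)                    # (1/(n-1)) sum_i (X_i - Xbar)^2
sample_median = np.median(x)                  # middle value of the order statistics
sample_min, sample_max = order_stats[0], order_stats[-1]

print(sample_mean, sample_var, sample_median, sample_min, sample_max)
```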

Often, we are interested in the distribution of T .


Example 1 If $X_1, \ldots, X_n \sim \Gamma(\alpha, \beta)$, then $\bar{X} \sim \Gamma(n\alpha, \beta/n)$.

Proof. The mgf is
$$M_{\bar{X}}(t) = E[e^{t\bar{X}}] = E\left[ e^{\sum_i X_i t/n} \right] = \prod_{i=1}^n E[e^{X_i (t/n)}] = [M_{X_1}(t/n)]^n = \left( \frac{1}{1 - \beta t/n} \right)^{n\alpha} = \left( \frac{1}{1 - (\beta/n) t} \right)^{n\alpha}.$$
This is the mgf of $\Gamma(n\alpha, \beta/n)$. $\square$
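A quick Monte Carlo check of Example 1, as a sketch (the parameter values are arbitrary; NumPy and SciPy parametrize the Gamma by shape and scale, matching the $\Gamma(\alpha, \beta)$ convention above):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, alpha, beta, reps = 5, 2.0, 3.0, 100_000

# X_i ~ Gamma(shape=alpha, scale=beta); compute Xbar in each of `reps` repetitions
xbar = rng.gamma(shape=alpha, scale=beta, size=(reps, n)).mean(axis=1)

# Compare simulated Xbar against Gamma(shape=n*alpha, scale=beta/n)
ks = stats.kstest(xbar, stats.gamma(a=n * alpha, scale=beta / n).cdf)
print(ks.statistic)   # close to 0 => the two distributions agree
```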


Example 2 If $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$ then $\bar{X} \sim N(\mu, \sigma^2/n)$.

Example 3 If $X_1, \ldots, X_n$ iid Cauchy(0,1),
$$p(x) = \frac{1}{\pi (1 + x^2)}$$
for $x \in \mathbb{R}$, then $\bar{X} \sim$ Cauchy(0,1).
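Examples 2 and 3 contrast sharply: the Normal sample mean concentrates as $n$ grows, while the Cauchy sample mean does not concentrate at all. A simulation sketch (settings are my own) illustrates this:

```python
import numpy as np

rng = np.random.default_rng(2)
for n in (10, 100, 1000):
    # sample means of n iid Cauchy(0,1) draws, repeated 2000 times
    xbar = rng.standard_cauchy(size=(2000, n)).mean(axis=1)
    q1, q3 = np.percentile(xbar, [25, 75])
    print(n, q3 - q1)   # IQR stays near 2, the IQR of Cauchy(0,1), for every n
```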


Example 4 If $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$ then
$$\frac{(n-1) S^2}{\sigma^2} \sim \chi^2_{n-1}.$$

The proof is based on the mgf.
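As a numerical sanity check of Example 4 (a sketch with arbitrary parameter values, comparing the simulated statistic against the $\chi^2_{n-1}$ cdf):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, mu, sigma, reps = 8, 1.0, 3.0, 100_000

x = rng.normal(mu, sigma, size=(reps, n))
stat = (n - 1) * x.var(axis=1, ddof=1) / sigma**2   # (n-1) S^2 / sigma^2

print(stats.kstest(stat, stats.chi2(df=n - 1).cdf).statistic)   # close to 0
```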


Example 5 Let $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ be the order statistics, which means that the sample $X_1, X_2, \ldots, X_n$ has been ordered from smallest to largest:
$$X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}.$$
Now,
$$\begin{aligned}
F_{X_{(k)}}(x) = P(X_{(k)} \le x) &= P(\text{at least } k \text{ of the } X_1, \ldots, X_n \le x) \\
&= \sum_{j=k}^n P(\text{exactly } j \text{ of the } X_1, \ldots, X_n \le x) \\
&= \sum_{j=k}^n \binom{n}{j} [F_X(x)]^j [1 - F_X(x)]^{n-j}.
\end{aligned}$$
Differentiate to find the pdf:
$$p_{X_{(k)}}(x) = \frac{n!}{(k-1)!\,(n-k)!}\, [F_X(x)]^{k-1}\, p(x)\, [1 - F_X(x)]^{n-k}.$$
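The pdf formula can be checked by simulation. For Uniform(0,1) samples, $F_X(x) = x$ and $p(x) = 1$, so the formula reduces to the Beta($k$, $n-k+1$) density; a sketch (my own choices of $n$ and $k$):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, k, reps = 5, 2, 200_000

# k-th order statistic of n Uniform(0,1) draws, in each of `reps` repetitions
xk = np.sort(rng.uniform(size=(reps, n)), axis=1)[:, k - 1]

# For Uniform(0,1) the order-statistic pdf is the Beta(k, n-k+1) density
print(stats.kstest(xk, stats.beta(k, n - k + 1).cdf).statistic)   # close to 0
```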

Sufficiency

We continue with parametric inference. In this section we discuss data reduction as a formal concept.

3.1 Sufficient Statistics

Suppose that $X_1, \ldots, X_n \sim p(x; \theta)$. $T$ is sufficient for $\theta$ if the conditional distribution of $X_1, \ldots, X_n \mid T$ does not depend on $\theta$. Thus, $p(x_1, \ldots, x_n \mid t; \theta) = p(x_1, \ldots, x_n \mid t)$.
Intuitively, this means that you can replace $X_1, \ldots, X_n$ with $T(X_1, \ldots, X_n)$ without losing information. (This is not quite true, as we'll see later. But for now, you can think of it this way.)

Example 6 $X_1, \ldots, X_n \sim$ Poisson($\theta$). Let $T = \sum_{i=1}^n X_i$. Then,
$$p_{X^n \mid T}(x^n \mid t) = P(X^n = x^n \mid T(X^n) = t) = \frac{P(X^n = x^n \text{ and } T = t)}{P(T = t)}.$$
But
$$P(X^n = x^n \text{ and } T = t) = \begin{cases} 0 & T(x_1, \ldots, x_n) \ne t \\ P(X_1 = x_1, \ldots, X_n = x_n) & T(x_1, \ldots, x_n) = t. \end{cases}$$
Hence,
$$P(X^n = x^n) = \prod_{i=1}^n \frac{e^{-\theta} \theta^{x_i}}{x_i!} = \frac{e^{-n\theta} \theta^{\sum_i x_i}}{\prod_i (x_i!)} = \frac{e^{-n\theta} \theta^t}{\prod_i (x_i!)}.$$
Now, $T(x^n) = \sum_i x_i = t$ and so
$$P(T = t) = \frac{e^{-n\theta} (n\theta)^t}{t!}$$
since $T \sim$ Poisson($n\theta$). Thus,
$$\frac{P(X^n = x^n)}{P(T = t)} = \frac{t!}{\prod_i (x_i!)\, n^t},$$
which does not depend on $\theta$. So $T = \sum_i X_i$ is a sufficient statistic for $\theta$. Other sufficient statistics are: $T = 3.7 \sum_i X_i$, $T = (\sum_i X_i, X_4)$, and $T(X_1, \ldots, X_n) = (X_1, \ldots, X_n)$.
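The defining property, that the conditional distribution given $T$ is free of $\theta$, can also be seen by simulation. Conditional on $T = t$, each $X_i$ is Binomial($t$, $1/n$); the following sketch (values arbitrary) estimates the conditional pmf of $X_1$ for two very different $\theta$:

```python
import numpy as np

rng = np.random.default_rng(5)

def cond_pmf_x1(theta, n=3, t=4, reps=400_000):
    """Empirical pmf of X_1 given T = sum_i X_i = t, for iid Poisson(theta) data."""
    x = rng.poisson(theta, size=(reps, n))
    x1 = x[x.sum(axis=1) == t, 0]          # keep only samples with T = t
    return np.bincount(x1, minlength=t + 1) / len(x1)

# The conditional pmf is the same (up to Monte Carlo noise) for different theta:
print(cond_pmf_x1(0.5))
print(cond_pmf_x1(3.0))   # both approximate the Binomial(t, 1/n) pmf
```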

3.2 Sufficient Partitions

It is better to describe sufficiency in terms of partitions of the sample space.


Example 7 Let $X_1, X_2, X_3 \sim$ Bernoulli($\theta$). Let $T = \sum_i X_i$.

x^n          t       p(x|t)
(0, 0, 0)    t = 0   1
(0, 0, 1)    t = 1   1/3
(0, 1, 0)    t = 1   1/3
(1, 0, 0)    t = 1   1/3
(0, 1, 1)    t = 2   1/3
(1, 0, 1)    t = 2   1/3
(1, 1, 0)    t = 2   1/3
(1, 1, 1)    t = 3   1

The sample space has 8 elements; $T$ takes 4 values, so the partition induced by $T$ has 4 elements.

1. A partition $B_1, \ldots, B_k$ is sufficient if $f(x \mid X \in B)$ does not depend on $\theta$.
2. A statistic $T$ induces a partition. For each $t$, $\{x : T(x) = t\}$ is one element of the partition. $T$ is sufficient if and only if the partition is sufficient.
3. Two statistics can generate the same partition: for example, $\sum_i X_i$ and $3 \sum_i X_i$ (see the sketch after this list).
4. If we split any element $B_i$ of a sufficient partition into smaller pieces, we get another sufficient partition.
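Point 3 is easy to verify by enumerating the induced partitions; a minimal sketch (the helper `partition` is my own construction):

```python
from itertools import product

def partition(T, points):
    """Group sample points into blocks by the value of the statistic T."""
    blocks = {}
    for x in points:
        blocks.setdefault(T(x), []).append(x)
    return sorted(sorted(block) for block in blocks.values())

pts = list(product([0, 1], repeat=3))      # sample space for three Bernoullis
T1 = lambda x: sum(x)
T2 = lambda x: 3 * sum(x)
print(partition(T1, pts) == partition(T2, pts))   # True: same partition
```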
Example 8 Let $X_1, X_2, X_3 \sim$ Bernoulli($\theta$). Then $T = X_1$ is not sufficient. Look at its partition:

x^n          t       p(x|t)
(0, 0, 0)    t = 0   (1-θ)^2
(0, 0, 1)    t = 0   θ(1-θ)
(0, 1, 0)    t = 0   θ(1-θ)
(0, 1, 1)    t = 0   θ^2
(1, 0, 0)    t = 1   (1-θ)^2
(1, 0, 1)    t = 1   θ(1-θ)
(1, 1, 0)    t = 1   θ(1-θ)
(1, 1, 1)    t = 1   θ^2

The sample space has 8 elements; the partition induced by $T$ has only 2 elements. Since $p(x \mid t)$ still depends on $\theta$, the partition is not sufficient.
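The same conclusion numerically, as a sketch (the helper and the two $\theta$ values are illustrative):

```python
def p_x_given_t(theta):
    """Conditional pmf of (X2, X3) given X1 = t, for X_i iid Bernoulli(theta).

    (X2, X3) is independent of X1, so this is p(x | t) for T = X1."""
    return {(x2, x3): theta**(x2 + x3) * (1 - theta)**(2 - x2 - x3)
            for x2 in (0, 1) for x3 in (0, 1)}

print(p_x_given_t(0.2))
print(p_x_given_t(0.8))   # different values => T = X1 is not sufficient
```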

3.3 The Factorization Theorem

Theorem 9 $T(X^n)$ is sufficient for $\theta$ if the joint pdf/pmf of $X^n$ can be factored as
$$p(x^n; \theta) = h(x^n)\, g(t; \theta).$$
Example 10 Let $X_1, \ldots, X_n \sim$ Poisson($\theta$). Then
$$p(x^n; \theta) = \frac{e^{-n\theta} \theta^{\sum_i x_i}}{\prod_i (x_i!)} = \frac{1}{\prod_i (x_i!)}\; e^{-n\theta} \theta^{\sum_i x_i}.$$

Example 11 $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$. Then
$$p(x^n; \mu, \sigma) = \left( \frac{1}{2\pi\sigma^2} \right)^{n/2} \exp\left( -\frac{\sum_i (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2}{2\sigma^2} \right).$$

(a) If $\sigma$ known:
$$p(x^n; \mu) = \underbrace{\left( \frac{1}{2\pi\sigma^2} \right)^{n/2} \exp\left( -\frac{\sum_i (x_i - \bar{x})^2}{2\sigma^2} \right)}_{h(x^n)}\; \underbrace{\exp\left( -\frac{n(\bar{x} - \mu)^2}{2\sigma^2} \right)}_{g(T(x^n); \mu)}.$$
Thus, $\bar{X}$ is sufficient for $\mu$.


(b) If $(\mu, \sigma^2)$ unknown, then $T = (\bar{X}, S^2)$ is sufficient. So is $T = (\sum_i X_i, \sum_i X_i^2)$.
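Since the Normal likelihood depends on the data only through $(\sum_i x_i, \sum_i x_i^2)$, two different samples sharing that statistic must have identical likelihoods at every $(\mu, \sigma)$. A sketch (the two samples are hand-picked to share $T$ without being permutations of each other):

```python
import numpy as np

def normal_loglik(x, mu, sigma):
    """Log-likelihood of an iid N(mu, sigma^2) sample."""
    return (-0.5 * len(x) * np.log(2 * np.pi * sigma**2)
            - np.sum((x - mu) ** 2) / (2 * sigma**2))

x = np.array([0.0, 3.0, 3.0])   # sum = 6, sum of squares = 18
y = np.array([1.0, 1.0, 4.0])   # same T = (6, 18), not a permutation of x

for mu, sigma in [(0.0, 1.0), (2.0, 0.5), (-1.0, 3.0)]:
    print(normal_loglik(x, mu, sigma) - normal_loglik(y, mu, sigma))   # always 0
```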

3.4 Minimal Sufficient Statistics (MSS)

We want the greatest reduction in dimension.


Example 12 $X_1, \ldots, X_n \sim N(0, \sigma^2)$. Some sufficient statistics are:
$$T(X_1, \ldots, X_n) = (X_1, \ldots, X_n)$$
$$T(X_1, \ldots, X_n) = (X_1^2, \ldots, X_n^2)$$
$$T(X_1, \ldots, X_n) = \left( \sum_{i=1}^m X_i^2,\; \sum_{i=m+1}^n X_i^2 \right)$$
$$T(X_1, \ldots, X_n) = \sum_{i=1}^n X_i^2.$$

$T$ is a Minimal Sufficient Statistic (MSS) if the following two statements are true:
1. $T$ is sufficient, and
2. if $U$ is any other sufficient statistic, then $T = g(U)$ for some function $g$.
In other words, $T$ generates the coarsest sufficient partition.
Suppose $U$ is sufficient and $T = H(U)$ is also sufficient. Then $T$ provides a greater reduction than $U$ unless $H$ is a one-to-one transformation, in which case $T$ and $U$ are equivalent.
Example 13 $X \sim N(0, \sigma^2)$. $X$ is sufficient. $|X|$ is sufficient. $|X|$ is MSS. So are $X^2$, $X^4$, $e^{X^2}$.

Example 14 Let $X_1, X_2, X_3 \sim$ Bernoulli($\theta$). Let $T = \sum_i X_i$ and let $U$ be the statistic tabulated below.

x^n          t       p(x|t)   u         p(x|u)
(0, 0, 0)    t = 0   1        u = 0     1
(0, 0, 1)    t = 1   1/3      u = 1     1/3
(0, 1, 0)    t = 1   1/3      u = 1     1/3
(1, 0, 0)    t = 1   1/3      u = 1     1/3
(0, 1, 1)    t = 2   1/3      u = 73    1/2
(1, 0, 1)    t = 2   1/3      u = 73    1/2
(1, 1, 0)    t = 2   1/3      u = 91    1
(1, 1, 1)    t = 3   1        u = 103   1

Note that U and T are both sufficient but U is not minimal.

3.5 How to Find a Minimal Sufficient Statistic

Theorem 15 Define
$$R(x^n, y^n; \theta) = \frac{p(y^n; \theta)}{p(x^n; \theta)}.$$
Suppose that $T$ has the following property:

$R(x^n, y^n; \theta)$ does not depend on $\theta$ if and only if $T(y^n) = T(x^n)$.

Then $T$ is a MSS.

Example 16 $Y_1, \ldots, Y_n$ iid Poisson($\theta$). Then
$$\frac{p(y^n; \theta)}{p(x^n; \theta)} = \frac{e^{-n\theta} \theta^{\sum_i y_i} / \prod_i (y_i!)}{e^{-n\theta} \theta^{\sum_i x_i} / \prod_i (x_i!)} = \theta^{\sum_i y_i - \sum_i x_i}\, \frac{\prod_i (x_i!)}{\prod_i (y_i!)},$$
which is independent of $\theta$ iff $\sum_i y_i = \sum_i x_i$. This implies that $T(Y^n) = \sum_i Y_i$ is a minimal
sufficient statistic for $\theta$.
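Theorem 15's ratio criterion is easy to probe numerically. A sketch for the Poisson case (sample values arbitrary; `gammaln(k + 1)` computes $\log k!$):

```python
import numpy as np
from scipy.special import gammaln

def poisson_logratio(x, y, theta):
    """log p(y; theta) - log p(x; theta) for iid Poisson(theta) samples."""
    return ((y.sum() - x.sum()) * np.log(theta)
            + gammaln(x + 1).sum() - gammaln(y + 1).sum())

x = np.array([1, 2, 3])
y = np.array([0, 2, 4])   # same sum as x
z = np.array([1, 1, 1])   # different sum

thetas = [0.5, 1.0, 4.0]
print([poisson_logratio(x, y, t) for t in thetas])   # constant in theta
print([poisson_logratio(x, z, t) for t in thetas])   # varies with theta
```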
The minimal sufficient statistic is not unique. But, the minimal sufficient partition is unique.
Example 17 Cauchy. Here
$$p(x; \theta) = \frac{1}{\pi (1 + (x - \theta)^2)}.$$
Then
$$\frac{p(y^n; \theta)}{p(x^n; \theta)} = \frac{\prod_{i=1}^n \{ 1 + (x_i - \theta)^2 \}}{\prod_{j=1}^n \{ 1 + (y_j - \theta)^2 \}}.$$
The ratio is a constant function of $\theta$ if $T(y^n) = T(x^n)$, where
$$T(Y^n) = (Y_{(1)}, \ldots, Y_{(n)})$$
is the vector of order statistics. It is technically harder to show that the ratio is constant only if $T(y^n) = T(x^n)$, but it can be done using theorems about polynomials. Having shown this, one can conclude that the
order statistics are the minimal sufficient statistic for $\theta$.
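A numerical sketch of the Cauchy ratio criterion (sample values arbitrary): permuting a sample leaves the log ratio identically zero, while changing even one value makes it vary with $\theta$.

```python
import numpy as np

def cauchy_logratio(x, y, theta):
    """log p(y; theta) - log p(x; theta) for iid Cauchy(theta, 1) samples."""
    return (np.log1p((x - theta) ** 2).sum()
            - np.log1p((y - theta) ** 2).sum())

x = np.array([0.3, -1.2, 2.5])
y = np.array([2.5, 0.3, -1.2])   # a permutation: same order statistics
z = np.array([0.3, -1.2, 2.6])   # different order statistics

thetas = [-2.0, 0.0, 1.5]
print([cauchy_logratio(x, y, t) for t in thetas])   # all exactly 0
print([cauchy_logratio(x, z, t) for t in thetas])   # changes with theta
```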

What Sufficiency Really Means

If T is sufficient, then T contains all the information you need from the data to compute the
likelihood function. It does not contain all the information in the data. We will define
the likelihood function in the next set of notes.
