
14. CLT, Part II: Independent but not identically distributed

Lehmann 2.7; Ferguson 5

To begin with, let's consider the example of the so-called Poisson-binomial distribution:

Example 14.1 Let $X_i \sim \text{Bernoulli}(p_i)$, with $X_1, X_2, \ldots$ independent. We would like to know whether
\[
\frac{\frac{1}{\sqrt{n}} \sum_{i=1}^n (X_i - p_i)}{\sqrt{\sum_{i=1}^n p_i(1-p_i)/n}} \xrightarrow{L} N(0,1).
\]

The answer is yes, as long as the $p_i$ are bounded away from 0 and 1. Later, we'll see how to
prove this.
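Before developing the theory, it is instructive to check this claim numerically. Below is a minimal simulation sketch (assuming Python with NumPy; the seed, the sample sizes, and the choice to draw the $p_i$ uniformly from $[0.1, 0.9]$ are arbitrary illustrative choices). Note that the standardized quantity above equals $\sum_{i=1}^n (X_i - p_i)/s_n$ with $s_n^2 = \sum_{i=1}^n p_i(1-p_i)$, which is what the code computes.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n, reps = 1000, 10000

# Success probabilities bounded away from 0 and 1.
p = rng.uniform(0.1, 0.9, size=n)
s = np.sqrt(np.sum(p * (1 - p)))        # s_n

# Each row of X is one realization of X_1, ..., X_n.
X = rng.random((reps, n)) < p
T = (X.sum(axis=1) - p.sum()) / s       # standardized Poisson-binomial sums

print(T.mean(), T.var())                # should be near 0 and 1
print(np.mean(T <= 0))                  # should be near Phi(0) = 0.5
\end{verbatim}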

The preceding example involved a CLT-like result for a sequence $X_1, X_2, \ldots$ that is independent but not
identically distributed. However, this is not general enough for our purposes. Consider Example 12.4, in
which we proved a CLT-like result for sums $X_n = \sum_{i=1}^n Y_{ni}$ of independent random variables that do not
form a single sequence. There is also the example of simple linear regression, already considered in Example
8.2:

Example 14.2 Recall the simple linear regression situation:
\[
Y_i = \beta_0 + \beta_1 z_i + \epsilon_i,
\]
where the $z_i$ are known constants and the $\epsilon_i$ are independent with mean 0 and variance $\sigma^2$. If
we define
\[
w_i = \frac{z_i - \bar z}{\sum_{j=1}^n (z_j - \bar z)^2} \quad\text{and}\quad v_i = \frac{1}{n} - \bar z\, w_i
\]
as before, then notice that $w_i$ and $v_i$ each depend on $n$, an implicit fact that we may make
explicit by introducing the new notations $w_i^{(n)}$ and $v_i^{(n)}$. Then the least squares estimators of
$\beta_0$ and $\beta_1$ are
\[
\hat\beta_{0n} = \sum_{i=1}^n v_i^{(n)} Y_i \quad\text{and}\quad \hat\beta_{1n} = \sum_{i=1}^n w_i^{(n)} Y_i,
\]
respectively. We would like to know about the asymptotic normality of $\hat\beta_{0n}$ and $\hat\beta_{1n}$.
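Since Example 8.2 shows that $E\,\hat\beta_{0n} = \beta_0$ and $E\,\hat\beta_{1n} = \beta_1$, the question is really one about standardized weighted sums of the $\epsilon_i$. As a quick supporting computation (not spelled out above, but immediate from the independence of the $\epsilon_i$ and the fact that $\sum_{i=1}^n w_i^{(n)} = 0$),
\[
\operatorname{Var} \hat\beta_{1n} = \sigma^2 \sum_{i=1}^n \bigl(w_i^{(n)}\bigr)^2 = \frac{\sigma^2}{\sum_{j=1}^n (z_j - \bar z)^2}
\quad\text{and}\quad
\operatorname{Var} \hat\beta_{0n} = \sigma^2 \sum_{i=1}^n \bigl(v_i^{(n)}\bigr)^2 = \sigma^2 \left( \frac{1}{n} + \frac{\bar z^2}{\sum_{j=1}^n (z_j - \bar z)^2} \right),
\]
which identifies the scaling needed for any normal limit.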

For examples like 12.4 and 14.2, we need the concept of a triangular array:

\[
\begin{array}{ccccl}
X_{11} & & & & \text{independent} \\
X_{21} & X_{22} & & & \text{independent} \\
X_{31} & X_{32} & X_{33} & & \text{independent} \\
\vdots & & & &
\end{array}
\]
In other words, we assume that for each $n$, the variables $X_{n1}, \ldots, X_{nn}$ are independent with $E\,X_{nk} = \mu_{nk}$ and
$\operatorname{Var} X_{nk} = \sigma_{nk}^2 < \infty$. We will now develop theory that tells when $\sum_{k=1}^n X_{nk}$ is asymptotically normal.

First, some notational definitions. Let $Y_{nk} = X_{nk} - \mu_{nk}$. Alternatively, we could simply assume that $\mu_{nk} = 0$,
as is the practice in many textbooks. However, by defining the $Y_{nk}$ separately, we emphasize the notion that
the following theory applies to triangular arrays of variables with mean zero.

Let $T_n = \sum_{k=1}^n Y_{nk}$ and $s_n^2 = \operatorname{Var} T_n = \sum_{k=1}^n \sigma_{nk}^2$. Obviously, $T_n/s_n$ has mean 0 and variance 1; our goal is
to give conditions under which
\[
\frac{T_n}{s_n} \xrightarrow{L} N(0,1). \tag{19}
\]

One condition we'll study that implies (19) is called the Lindeberg condition:
\[
\text{For every } \epsilon > 0, \qquad \frac{1}{s_n^2} \sum_{k=1}^n E\left[Y_{nk}^2\, I\{|Y_{nk}| \ge \epsilon s_n\}\right] \to 0 \text{ as } n \to \infty. \tag{20}
\]

Another condition that implies (19) is the Lyapunov condition:
\[
\text{There exists } \delta > 0 \text{ such that } \frac{1}{s_n^{2+\delta}} \sum_{k=1}^n E\,|Y_{nk}|^{2+\delta} \to 0 \text{ as } n \to \infty. \tag{21}
\]

Note that Lehmann, in Section 2.7, presents a weaker version of Lyapunov's condition that fixes $\delta = 1$.

Finally, we will give a technical condition that does not imply (19) but that will be discussed below. This
condition essentially says that no single random variable is dominant in its contribution to $s_n^2$:
\[
\frac{1}{s_n^2} \max_{k \le n} \sigma_{nk}^2 \to 0 \text{ as } n \to \infty. \tag{22}
\]

Now we can give two theorems that summarize the relationships among conditions (19), (20), (21), and (22).

Theorem 14.1 Lindeberg-Feller theorem: Condition (20) holds if and only if conditions (19) and
(22) hold.
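To see the role of condition (22) in this equivalence, consider a small example (an illustration added here, not part of the original statement): asymptotic normality alone does not force the Lindeberg condition. Let $Y_{n1} \sim N(0,1)$ and $Y_{nk} = 0$ for $2 \le k \le n$. Then $s_n^2 = 1$ and $T_n/s_n \sim N(0,1)$ exactly for every $n$, so (19) holds; but $\max_{k \le n} \sigma_{nk}^2/s_n^2 = 1$, so (22) fails, and Theorem 14.1 then forces (20) to fail as well. Indeed, $E[Y_{n1}^2 I\{|Y_{n1}| \ge \epsilon s_n\}]$ is a positive constant not depending on $n$.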

Theorem 14.2 The Lyapunov condition (21) implies the Lindeberg condition (20).

The proof of the Lindeberg-Feller theorem will not be presented here, but the proof of Theorem 14.2 is fairly
straightforward and is given as a problem at the end of this topic.

As an example of the power of the Lindeberg condition, we will prove the iid version of the Central Limit
Theorem, Theorem 12.1. First, however, we restate the dominated convergence theorem, originally discussed in
Problem 11.1, since we will often use this result in verifying the Lindeberg condition.

Theorem 14.3 Dominated convergence theorem: If $Y_n \xrightarrow{L} Y$ and $|Y_n| \le Z$ for a random variable $Z$
satisfying $E\,|Z| < \infty$, then $E\,Y_n \to E\,Y$.

Now we show that the CLT is implied by Theorem 14.1.

Example 14.3 Suppose that $X_1, X_2, \ldots$ are iid with $E\,X_i = \mu$ and $\operatorname{Var} X_i = \sigma^2 < \infty$. Then let
$Y_{nk} = X_k - \mu$ and $s_n^2 = n\sigma^2$ in the Lindeberg condition (20). Thus, we must verify that for
any $\epsilon > 0$,
\[
\frac{1}{n\sigma^2} \sum_{k=1}^n E\left[Y_{nk}^2\, I\{|Y_{nk}| \ge \epsilon\sigma\sqrt{n}\}\right] \to 0 \text{ as } n \to \infty. \tag{23}
\]
But $Y_{n1}, \ldots, Y_{nn}$ are identically distributed by assumption, so the left hand side of expression
(23) simplifies to
\[
\frac{1}{\sigma^2}\, E\left[Y_{11}^2\, I\{|Y_{11}| \ge \epsilon\sigma\sqrt{n}\}\right]. \tag{24}
\]

Since $E\,Y_{11}^2 < \infty$, the dominated convergence theorem 14.3 implies that expression (24) goes
to zero because clearly $I\{|Y_{11}| \ge \epsilon\sigma\sqrt{n}\} \xrightarrow{P} 0$. Thus, the Lindeberg condition is satisfied, which
implies that
\[
\frac{\sqrt{n}(\bar X_n - \mu)}{\sigma} = \frac{\sum_{k=1}^n Y_{nk}}{s_n} \xrightarrow{L} N(0,1).
\]

Note that the Lyapunov condition does not imply the Central Limit Theorem 12.1: In Problem 14.2, you
are asked to find an example of an iid sequence to which the CLT applies but which does not satisfy the
Lyapunov condition. The existence of such an example means, of course, that the converse of Theorem 14.2 is not true.

Example 14.4 Reconsider Example 14.1, in which $X_i \sim \text{Bernoulli}(p_i)$, with $X_1, X_2, \ldots$ independent,
and let $Y_{nk} = X_k - p_k$. Since $|Y_{nk}| \le 1$, observe that for any $\delta > 0$,
\[
p_k(1 - p_k) = E\,Y_{nk}^2 \ge E\,|Y_{nk}|^{2+\delta}.
\]
Therefore,
\[
\frac{1}{s_n^{2+\delta}} \sum_{k=1}^n E\,|Y_{nk}|^{2+\delta} \le \frac{1}{s_n^{2+\delta}} \sum_{k=1}^n \operatorname{Var} Y_{nk} = \frac{1}{s_n^{\delta}}.
\]
Therefore, if $s_n \to \infty$, i.e., if $s_n^2 = \sum_{k=1}^n p_k(1-p_k) \to \infty$ (which is clearly true if, say, the $p_k$ are bounded away from 0 and 1), the
Lyapunov condition is satisfied and so $\sum_{k=1}^n Y_{nk}/s_n \xrightarrow{L} N(0,1)$.

Example 14.5 Reconsider Example 12.4, in which we used the Berry-Esseen theorem (Theorem
12.2) to show that if $X_n \sim \text{binomial}(n, p_n)$, then
\[
\frac{X_n - np_n}{\sqrt{np_n(1-p_n)}} \xrightarrow{L} N(0,1) \tag{25}
\]
whenever $p_n(1-p_n)$ does not converge to zero. In order to use Theorem 14.1 to prove this
result, let $Y_{n1}, \ldots, Y_{nn}$ be iid with
\[
P(Y_{nk} = 1 - p_n) = 1 - P(Y_{nk} = -p_n) = p_n.
\]
Then with $X_n = np_n + \sum_{k=1}^n Y_{nk}$, we obtain $X_n \sim \text{binomial}(n, p_n)$ as specified. Furthermore,
$E\,Y_{nk} = 0$ and $\operatorname{Var} Y_{nk} = p_n(1-p_n)$, so the Lindeberg condition says that for any $\epsilon > 0$,
\[
\frac{1}{np_n(1-p_n)} \sum_{k=1}^n E\left[Y_{nk}^2\, I\left\{|Y_{nk}| \ge \epsilon\sqrt{np_n(1-p_n)}\right\}\right] \to 0. \tag{26}
\]
Since clearly $|Y_{nk}| \le 1$, the left hand side of expression (26) will be identically zero whenever
$\epsilon\sqrt{np_n(1-p_n)} > 1$. Thus, a sufficient condition for (25) to hold is that $np_n(1-p_n) \to \infty$. One
may show that this is also a necessary condition (this is Problem 14.3).
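A quick simulation makes the boundary between the two regimes concrete. The sketch below (assuming Python with NumPy; the two sequences $p_n = n^{-1/2}$ and $p_n = 2/n$ are illustrative choices) compares the standardized binomial to $N(0,1)$ when $np_n(1-p_n) \to \infty$ and when it stays bounded, the latter being the Poisson-limit regime where the standardized law remains discrete and skewed.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n, reps = 10000, 20000

for label, p in [("p_n = n^(-1/2)", n ** -0.5), ("p_n = 2/n", 2.0 / n)]:
    X = rng.binomial(n, p, size=reps)
    Z = (X - n * p) / np.sqrt(n * p * (1 - p))
    # Under a N(0,1) limit, P(Z <= 0) should be close to 1/2; in the
    # Poisson regime it stays visibly larger.
    print(label, "| n p (1-p) =", round(n * p * (1 - p), 1),
          "| P(Z <= 0) =", np.mean(Z <= 0))
\end{verbatim}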

Problems
Problem 14.1 Prove Theorem 14.2.

Problem 14.2 Give an example of an iid sequence to which the Central Limit Theorem 12.1 applies
but for which the Lyapunov condition is not satisfied.

Problem 14.3 In Example 14.5, it is shown that $np_n(1-p_n) \to \infty$ is a sufficient condition for (25)
to hold. Prove that it is also a necessary condition. You may assume that $p_n(1-p_n)$ is always
nonzero.

Problem 14.4 (a) Suppose that $X_1, X_2, \ldots$ are iid with $E\,X_i = \mu$ and $\operatorname{Var} X_i = \sigma^2 < \infty$. Let
$z_{n1}, \ldots, z_{nn}$ be constants satisfying
\[
\frac{\max_{k \le n} z_{nk}^2}{\sum_{j=1}^n z_{nj}^2} \to 0 \text{ as } n \to \infty.
\]
Let $T_n = \sum_{k=1}^n z_{nk} X_k$, and prove that $(T_n - E\,T_n)/\sqrt{\operatorname{Var} T_n} \xrightarrow{L} N(0,1)$.

(b) Reconsider Example 14.2, the simple linear regression case in which
\[
\hat\beta_{0n} = \sum_{i=1}^n v_i^{(n)} Y_i \quad\text{and}\quad \hat\beta_{1n} = \sum_{i=1}^n w_i^{(n)} Y_i,
\]
where
\[
w_i^{(n)} = \frac{z_i - \bar z}{\sum_{j=1}^n (z_j - \bar z)^2} \quad\text{and}\quad v_i^{(n)} = \frac{1}{n} - \bar z\, w_i^{(n)}
\]
for constants $z_1, z_2, \ldots$. Using part (a), state and prove sufficient conditions on the constants
$z_i$ that ensure the asymptotic normality of $\sqrt{n}(\hat\beta_{0n} - \beta_0)$ and $\sqrt{n}(\hat\beta_{1n} - \beta_1)$. You may assume
the results of Example 8.2, where it was shown that $E\,\hat\beta_{0n} = \beta_0$ and $E\,\hat\beta_{1n} = \beta_1$.

Problem 14.5 Give an example (with proof) of a sequence of independent random variables $Z_1, Z_2, \ldots$
with $E(Z_i) = 0$, $\operatorname{Var}(Z_i) = 1$ such that $\sqrt{n}\,\bar Z_n$ does not converge in distribution to $N(0,1)$.

Problem 14.6 Do Problem 8.8 on p. 131.

Problem 14.7 Let $(a_1, \ldots, a_n)$ be a random permutation of the integers $1, \ldots, n$. If $a_j < a_i$ for
some $i < j$, then the pair $(i, j)$ is said to form an inversion. Let $X_n$ be the total number of
inversions:
\[
X_n = \sum_{j=2}^n \sum_{i=1}^{j-1} I\{a_j < a_i\}.
\]
For example, if $n = 3$ and we consider the permutation $(3, 1, 2)$, there are 2 inversions since
$1 = a_2 < a_1 = 3$ and $2 = a_3 < a_1 = 3$. This problem asks you to find the asymptotic
distribution of $X_n$.

(a) Define $Y_1 = 0$ and for $j > 1$, let
\[
Y_j = \sum_{i=1}^{j-1} I\{a_j < a_i\}
\]
be the number of $a_i$ greater than $a_j$ to the left of $a_j$. Then the $Y_j$ are independent (you don't
have to show this; you may wish to think about why, though). Find $E(Y_j)$ and $\operatorname{Var} Y_j$.

(b) Clearly $X_n = Y_1 + Y_2 + \cdots + Y_n$. Prove that
\[
\frac{3\sqrt{n}}{2} \left( \frac{4X_n}{n^2} - 1 \right) \xrightarrow{L} N(0,1).
\]

(c) For $n = 10$, evaluate the distribution of inversions as follows. First, simulate 1000 per-
mutations on $\{1, 2, \ldots, 10\}$ and for each permutation, count the number of inversions. Plot a
histogram of these 1000 numbers. Use the results of the simulation to estimate $P(X_{10} > 24)$.
Second, estimate $P(X_{10} > 24)$ using a normal approximation. Can anyone find the integer $c$
such that $10!\,P(X_{10} > 24) = c$?
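One possible way to set up part (c) (a minimal sketch assuming Python with NumPy and SciPy; the seed is arbitrary, and the normal approximation shown is the crude one read directly off the limit in part (b), which centers at $n^2/4$ rather than at $E\,X_n$ and is therefore rough for $n = 10$):

\begin{verbatim}
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, reps = 10, 1000

def count_inversions(perm):
    # Number of pairs (i, j) with i < j and perm[j] < perm[i].
    return sum(int(perm[j] < perm[i]) for j in range(1, n) for i in range(j))

# rng.permutation(n) permutes 0, ..., n-1; its inversion count has the
# same distribution as that of a permutation of 1, ..., n.
counts = np.array([count_inversions(rng.permutation(n)) for _ in range(reps)])
# Plot a histogram of `counts` with, e.g., matplotlib's plt.hist.

print("simulation estimate of P(X_10 > 24):", np.mean(counts > 24))

# Crude normal approximation from the limit in part (b):
z = (3 * np.sqrt(n) / 2) * (4 * 24 / n ** 2 - 1)
print("normal approximation:", 1 - norm.cdf(z))
\end{verbatim}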

Problem 14.8 Suppose that $X_1, X_2, X_3$ is a sample of size 3 from a beta(2, 1) distribution.

(a) Find $P(X_1 + X_2 + X_3 \le 1)$ exactly.

(b) Find $P(X_1 + X_2 + X_3 \le 1)$ using a normal approximation derived from the central limit
theorem.

(c) Let $Z = I\{X_1 + X_2 + X_3 \le 1\}$. Approximate $E\,Z = P(X_1 + X_2 + X_3 \le 1)$ by $\bar Z =
\sum_{i=1}^{1000} Z_i/1000$, where the $Z_i$ are a simulated iid sample from the distribution of $Z$. In addition to
$\bar Z$, report the sample variance of the $Z_i$. (To think about: What is the theoretical value of $\operatorname{Var} \bar Z$?)

(d) Approximate $P(X_1 + X_2 + X_3 \le 3/2)$ using the normal approximation and the simulation
approach. (Note that the exact solution is a bit more cumbersome to compute in this case.)
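A minimal simulation sketch for parts (c) and (d) (assuming Python with NumPy and SciPy; beta(2,1) sampling is available directly in NumPy, and the moments used for the normal approximation, mean $2/3$ and variance $1/18$ per observation, follow from the standard beta formulas):

\begin{verbatim}
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
reps = 1000

X = rng.beta(2, 1, size=(reps, 3))   # each row is a sample X_1, X_2, X_3
S = X.sum(axis=1)

Z = (S <= 1).astype(float)           # part (c): indicators of the event
print("Zbar =", Z.mean(), "| sample Var Z =", Z.var(ddof=1))

print("simulation, part (d):", np.mean(S <= 1.5))
# Normal approximation for (d): S is approximately N(2, 1/6).
print("normal approx, part (d):", norm.cdf(1.5, loc=2, scale=(1 / 6) ** 0.5))
\end{verbatim}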

Problem 14.9 For each $n$, let $X_{n1}, \ldots, X_{nn}$ be independent with $X_{nk} \sim \text{Poisson}(k/n)$. Suppose
$Y_{n1}, \ldots, Y_{nn}$ are independent for each $n$ with $E(Y_{nk}) = 0$ and such that $Y_{nk}$ and $X_{nk}$ have the
same second and third central moments. Prove a central limit theorem involving $\sum_{k=1}^n Y_{nk}$.

Problem 14.10 Lindeberg and Lyapunov impose conditions on moments so that asymptotic nor-
mality occurs. However, it is possible to have asymptotic normality even if there are no moments
at all. Let $X_n$ assume the values $+1$ and $-1$ with probability $(1 - 2^{-n})/2$ each and the value
$2^k$ with probability $2^{-k}$ for $k > n$.

(a) Show that $E(X_n^j) = \infty$ for all positive integers $j$ and $n$.

(b) Show that $\sqrt{n}\,\bar X_n \xrightarrow{L} N(0,1)$.

Problem 14.11 Assume that elements (coupons) are drawn from a population of size $n$, randomly
and with replacement, until the number of distinct elements that have been sampled is $r_n$, where
$1 \le r_n \le n$. Let $S_n$ be the drawing on which this first happens. Suppose that $r_n/n \to \rho$, where
$0 < \rho < 1$.

(a) Suppose $k - 1$ distinct coupons have thus far entered the sample. Let $X_{nk}$ be the waiting
time until the next distinct one appears, so that
\[
S_n = \sum_{k=1}^{r_n} X_{nk}.
\]
Find the expectation and variance of $X_{nk}$.

(b) Let $m_n = E(S_n)$ and $\sigma_n^2 = \operatorname{Var}(S_n)$. Show that
\[
\frac{S_n - m_n}{\sigma_n} \xrightarrow{L} N(0,1).
\]
Tip: One approach is to apply Lyapunov's condition with $\delta = 2$. This involves demonstrating
an asymptotic expression for $\sigma_n^2$ and a bound on $E[X_{nk} - E(X_{nk})]^4$. There are several ways
to go about this.

