
Questions and Answers on Maximum Likelihood

L. Magee
Fall, 2008

1. Given:

- an observation-specific log likelihood function $\ell_i(\theta) = \ln f(y_i|x_i, \theta)$
- the log likelihood function $\ell(\theta|y, X) = \sum_{i=1}^n \ell_i(\theta)$
- a data set $(x_i, y_i)$, $i = 1, \ldots, n$
- a value for $\hat\theta$, the maximum likelihood estimator of the parameter vector $\theta$

briefly describe how you would compute
(a) the negative Hessian estimator of the variance of $\hat\theta$
(b) the outer product of gradient (OPG) estimator of the variance of $\hat\theta$
(c) a misspecification-consistent variance estimator that follows from interpreting the ML estimator as a method of moments estimator

2. The random variable $y$ has a probability density function
$$f(y) = \begin{cases} (1-\theta) + 2\theta y & \text{for } 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$$
for $-1 < \theta < 1$. There are $n$ observations $y_i$, $i = 1, \ldots, n$, drawn independently from this distribution.
(a) (i) Write the cumulative distribution function of $y$. (ii) Derive the expected value of $y$. (iii) Suggest a method of moments estimator for $\theta$ based on the sample mean $\bar y$.
(b) (i) Write the log likelihood function for $\theta$. (ii) Write the first-order condition for the ML estimator of $\theta$.

3. $y_1, \ldots, y_n$ are $n$ independent draws from an exponential distribution. The probability density function of each $y_i$ is $f(y_i|\theta) = \theta^{-1} \exp(-y_i/\theta)$, where $y_i > 0$ and $\theta > 0$. The exponential distribution has the property $E(y_i) = \theta$.
(a) Derive (i) the observation-specific log likelihood function $\ell_i(\theta)$ (ii) the log likelihood function $\ell(\theta)$ (iii) the maximum likelihood (ML) estimator of $\theta$, $\hat\theta$.

(b) Derive the following estimators of the variance of $\hat\theta$, showing their general formulas as part of your answer. (i) the negative Hessian variance estimator (ii) the Information matrix variance estimator (iii) the outer product of gradient (OPG) variance estimator (iv) the misspecification-consistent variance estimator that follows from interpreting the ML estimator as a method of moments estimator

4. Given observations on the scalar $x_i$, $i = 1, \ldots, n$, each $y_i$ is independently drawn according to the conditional pdf
$$f(y_i|x_i, \theta) = (\theta x_i)^{-1} \exp\!\left(-\frac{y_i}{\theta x_i}\right)$$

where $y_i > 0$, $x_i > 0$, and $\theta > 0$. $\theta$ is an unknown scalar parameter.
(a) Write the observation-specific log likelihood function $\ell_i(\theta)$
(b) Write the log likelihood function $\ell(\theta) = \sum_i \ell_i(\theta)$
(c) Derive the maximum likelihood (ML) estimator of $\theta$.
(d) In this model, $E(y_i|x_i, \theta) = \theta x_i$. Using this fact, suggest another consistent estimator of $\theta$ that is different from the ML estimator in (c). No explanation is required.

5. (16 marks: 4 for each part) Let $y_i$, $i = 1, \ldots, n$ be independently-observed non-negative integers drawn from a Poisson distribution
$$\mathrm{Prob}(y_i|\theta) = \frac{\theta^{y_i} e^{-\theta}}{y_i!}, \qquad y_i = 0, 1, 2, \ldots$$

The Poisson distribution has the property $E(y_i|\theta) = \theta$.


(Aside: $!$ is known as the factorial operator. $y_i!$, or $y_i$ factorial, is defined as $y_i! = 1 \times 2 \times \cdots \times (y_i - 1) \times y_i$. In the current question, this term serves as a normalizing constant, and has no effect on the derivations of the maximum likelihood estimator or its variance estimators, much like the $\sqrt{2\pi}$ term in the denominator of the normal pdf.)

(a) Write the observation-specific log likelihood function $\ell_i(\theta)$
(b) Write the log likelihood function $\ell(\theta) = \sum_i \ell_i(\theta)$
(c) Derive $\hat\theta$, the maximum likelihood (ML) estimator of $\theta$.
(d) Derive an estimator of the variance of $\hat\theta$ using any one of the four standard methods.

Answers

1. (a) the negative Hessian estimator: $V_a = \left(-\sum_{i=1}^n \frac{\partial^2 \ell_i}{\partial\theta \, \partial\theta'}\right)^{-1}$, evaluated at $\theta = \hat\theta$

(b) the OPG estimator: $V_b = \left(\sum_{i=1}^n \left(\frac{\partial \ell_i}{\partial\theta}\right)\left(\frac{\partial \ell_i}{\partial\theta}\right)'\right)^{-1}$, where the $\partial\ell_i/\partial\theta$'s are evaluated at $\theta = \hat\theta$

(c) misspecification-consistent estimator: given the definitions in (a) and (b), it can be written as $V_c = V_a V_b^{-1} V_a$, or
$$V_c = \left(-\sum_{i=1}^n \frac{\partial^2 \ell_i}{\partial\theta \, \partial\theta'}\right)^{-1} \left(\sum_{i=1}^n \left(\frac{\partial \ell_i}{\partial\theta}\right)\left(\frac{\partial \ell_i}{\partial\theta}\right)'\right) \left(-\sum_{i=1}^n \frac{\partial^2 \ell_i}{\partial\theta \, \partial\theta'}\right)^{-1}$$
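(Aside: as a numerical illustration, not part of the original answer, the sketch below computes all three estimators by finite differences. The interface is hypothetical: loglik_i(theta) is assumed to return the length-$n$ vector of $\ell_i(\theta)$ values, with the data held in a closure.)

    import numpy as np

    def variance_estimators(loglik_i, theta_hat, eps=1e-5):
        # loglik_i(theta): length-n array of observation-specific log likelihoods
        # theta_hat: ML estimate, 1-d array of length k (both are assumptions)
        k = len(theta_hat)

        def score_matrix(theta):
            # n x k matrix of scores d ell_i / d theta_j, by central differences
            cols = []
            for j in range(k):
                e = np.zeros(k)
                e[j] = eps
                cols.append((loglik_i(theta + e) - loglik_i(theta - e)) / (2 * eps))
            return np.column_stack(cols)

        G = score_matrix(theta_hat)

        # Hessian of the summed log likelihood, by differencing the summed score
        H = np.zeros((k, k))
        for j in range(k):
            e = np.zeros(k)
            e[j] = eps
            H[:, j] = (score_matrix(theta_hat + e).sum(axis=0)
                       - score_matrix(theta_hat - e).sum(axis=0)) / (2 * eps)

        Va = np.linalg.inv(-H)       # (a) negative Hessian estimator
        Vb = np.linalg.inv(G.T @ G)  # (b) OPG estimator
        Vc = Va @ (G.T @ G) @ Va     # (c) sandwich: Va Vb^{-1} Va
        return Va, Vb, Vc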
2. (a) (i) The probability density function $f(y) = 0$ when $y < 0$ and $f(y) = 0$ when $y > 1$. Therefore when $y < 0$, the cdf is $F(y) = \int_{-\infty}^{y} f(s) \, ds = 0$, and when $y > 1$, $F(y) = 1$. When $0 < y < 1$,
$$F(y) = \int_0^y ((1-\theta) + 2\theta s) \, ds = \left((1-\theta)s + \theta s^2\right)\Big|_{s=0}^{y} = (1-\theta)y + \theta y^2, \quad 0 < y < 1$$
(ii)
$$E(y) = \int_0^1 y f(y) \, dy = \int_0^1 y((1-\theta) + 2\theta y) \, dy = \left(\tfrac{1}{2}(1-\theta)y^2 + \tfrac{2}{3}\theta y^3\right)\Big|_{y=0}^{1} = \tfrac{1}{2}(1-\theta) + \tfrac{2}{3}\theta = \tfrac{1}{2} + \tfrac{1}{6}\theta$$
(iii) From (ii), $E(y) = \tfrac{1}{2} + \tfrac{1}{6}\theta$, which gives a population moment condition
$$E\left(y - (\tfrac{1}{2} + \tfrac{1}{6}\theta)\right) = 0$$
The sample moment condition is
$$n^{-1} \sum_{i=1}^n \left(y_i - (\tfrac{1}{2} + \tfrac{1}{6}\tilde\theta)\right) = 0$$
which can be written as $\bar y - \tfrac{1}{2} - \tfrac{1}{6}\tilde\theta = 0$, and the estimator is $\tilde\theta = 6\bar y - 3$.
(b) (i) $\ell(\theta) = \sum_{i=1}^n \ln((1-\theta) + 2\theta y_i)$.

(ii) There is no closed-form solution. The first-order condition is
$$\partial\ell(\theta)/\partial\theta = \sum_{i=1}^n \frac{2y_i - 1}{(1-\theta) + 2\theta y_i} = 0 \quad \text{at } \theta = \hat\theta$$
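(Aside: since the first-order condition has no closed form, $\hat\theta$ must be found numerically. A minimal sketch, assuming NumPy and SciPy are available; the simulated data are hypothetical and only illustrate the mechanics.)

    import numpy as np
    from scipy.optimize import brentq

    def score(theta, y):
        # d ell / d theta for the density f(y) = (1 - theta) + 2*theta*y
        return np.sum((2 * y - 1) / (1 - theta + 2 * theta * y))

    rng = np.random.default_rng(0)
    y = rng.uniform(size=500)  # placeholder sample on (0, 1); true theta = 0

    # The score is strictly decreasing in theta, so a sign-changing bracket
    # just inside (-1, 1) pins down the root.
    theta_hat = brentq(score, -1 + 1e-6, 1 - 1e-6, args=(y,))
    print(theta_hat)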

3. (a) (i) $\ell_i(\theta) = \ln f(y_i|\theta) = -\ln(\theta) - y_i/\theta$
(ii) $\ell(\theta) = \sum_{i=1}^n \ell_i(\theta) = -n\ln(\theta) - \sum_{i=1}^n y_i/\theta$
(iii) $\hat\theta$ is the value of $\theta$ that solves $\partial\ell/\partial\theta = 0$.
$$\partial\ell/\partial\theta = -\frac{n}{\theta} + \frac{\sum_{i=1}^n y_i}{\theta^2}$$
Therefore
$$\hat\theta = \frac{\sum_{i=1}^n y_i}{n} = \bar y$$
(b) (i)
$$\partial^2\ell/\partial\theta^2 = \frac{n}{\theta^2} - \frac{2\sum_{i=1}^n y_i}{\theta^3}$$
Evaluating this at $\hat\theta = \bar y$ and substituting $\sum_{i=1}^n y_i = n\bar y$ gives
$$\partial^2\ell(\hat\theta)/\partial\theta^2 = \frac{n}{\bar y^2} - \frac{2(n\bar y)}{\bar y^3} = -\frac{n}{\bar y^2}$$

The negative Hessian variance estimator is
$$V_1 = \left(-\partial^2\ell(\hat\theta)/\partial\theta^2\right)^{-1} = \frac{\bar y^2}{n}$$
(ii) The Information matrix is minus one times the expected value of the second derivative matrix derived in part (i). The exponential density assumption implies $E(y_i) = \theta$, so
$$E\left[\partial^2\ell/\partial\theta^2\right] = \frac{n}{\theta^2} - \frac{2\sum_{i=1}^n E(y_i)}{\theta^3} = \frac{n}{\theta^2} - \frac{2n\theta}{\theta^3} = -\frac{n}{\theta^2}$$
The Information matrix variance estimator is the inverse of the Information matrix, evaluated at $\hat\theta$:
$$V_2 = \left(\frac{n}{\hat\theta^2}\right)^{-1} = \frac{\bar y^2}{n}$$
(iii) Evaluate the gradient, or first derivative, of $\ell_i$ at $\hat\theta$:
$$\partial\ell_i(\theta)/\partial\theta = -\frac{1}{\theta} + \frac{y_i}{\theta^2} = \frac{y_i - \theta}{\theta^2} = \frac{y_i - \bar y}{\bar y^2} \quad \text{at } \theta = \hat\theta = \bar y$$
For notational convenience, use $\hat\sigma^2 = n^{-1}\sum_{i=1}^n (y_i - \bar y)^2$, even though there is no $\sigma^2$ parameter in the model. Then the OPG is
$$\sum_{i=1}^n \left(\frac{\partial\ell_i(\hat\theta)}{\partial\theta}\right)^2 = \frac{\sum_{i=1}^n (y_i - \bar y)^2}{\bar y^4} = \frac{n\hat\sigma^2}{\bar y^4}$$
The outer product of gradient (OPG) variance estimator is the inverse of this OPG:
$$V_3 = \frac{\bar y^4}{n\hat\sigma^2}$$
(Aside: $V_3$ has the odd feature that $\hat\sigma^2$ appears in the denominator rather than the numerator. But it turns out that for the exponential distribution, $\mathrm{Var}(y_i) = \theta^2$. Since $\mathrm{plim}(\hat\sigma^2) = \mathrm{Var}(y_i)$, then as $n \to \infty$, $\hat\sigma^2$ and $\bar y^2$ both converge to $\theta^2$. So as $n \to \infty$, $V_3$ becomes close to $\frac{\bar y^4}{n\bar y^2} = \frac{\bar y^2}{n}$, the same as $V_1$ and $V_2$. This equivalence depends on the assumption that $y_i$ has an exponential distribution.)
(iv)
$$V_4 = \left(-\frac{\partial^2\ell}{\partial\theta^2}\right)^{-1} \left(\sum_{i=1}^n \left(\frac{\partial\ell_i}{\partial\theta}\right)^2\right) \left(-\frac{\partial^2\ell}{\partial\theta^2}\right)^{-1} = V_1 (V_3)^{-1} V_1 = \left(\frac{\bar y^2}{n}\right)\left(\frac{n\hat\sigma^2}{\bar y^4}\right)\left(\frac{\bar y^2}{n}\right) = \frac{\hat\sigma^2}{n}$$
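(Aside: a short simulation, not part of the original answer, illustrates the agreement of the four estimators under the exponential assumption; the parameter value and sample size are hypothetical.)

    import numpy as np

    rng = np.random.default_rng(1)
    theta_true = 2.0
    y = rng.exponential(scale=theta_true, size=10_000)

    n = len(y)
    ybar = y.mean()
    sig2 = np.mean((y - ybar) ** 2)

    V1 = ybar**2 / n           # negative Hessian
    V2 = ybar**2 / n           # Information matrix (identical here)
    V3 = ybar**4 / (n * sig2)  # OPG
    V4 = sig2 / n              # misspecification-consistent (sandwich)
    print(V1, V2, V3, V4)      # all close to theta_true**2 / n = 4e-4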
4. (a) $\ell_i(\theta) = -\ln(\theta) - \ln(x_i) - \dfrac{y_i}{\theta x_i}$
(b) $\ell(\theta) = \sum_{i=1}^n \ell_i(\theta) = -n\ln(\theta) - \sum_{i=1}^n \ln(x_i) - \dfrac{1}{\theta}\sum_{i=1}^n \dfrac{y_i}{x_i}$
(c)
$$\partial\ell(\theta)/\partial\theta = -\frac{n}{\theta} + \frac{1}{\theta^2}\sum_{i=1}^n \frac{y_i}{x_i} = 0 \quad\text{when}\quad -n\theta + \sum_{i=1}^n \frac{y_i}{x_i} = 0$$
so
$$\hat\theta = n^{-1}\sum_{i=1}^n \frac{y_i}{x_i}$$
(d) Since $E(y_i|x_i, \theta) = \theta x_i$, then $E\,m(y_i, x_i, \theta) = 0$ where $m = y_i - \theta x_i$. This population moment condition leads to the sample moment condition $n^{-1}\sum_i (y_i - \tilde\theta x_i) = 0$. Solving for $\tilde\theta$ gives $\tilde\theta = \sum_i y_i / \sum_i x_i = \bar y / \bar x$. (Another choice of moment condition is $E\,x_i(y_i - \theta x_i) = 0$, which leads to OLS, $\tilde\theta = \sum_i x_i y_i / \sum_i x_i^2$.)
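(Aside: a quick hypothetical simulation, not part of the original answer, comparing the ML estimator from (c) with the two moment estimators from (d).)

    import numpy as np

    rng = np.random.default_rng(2)
    theta_true = 1.5
    x = rng.uniform(0.5, 2.0, size=5_000)
    y = rng.exponential(scale=theta_true * x)  # E(y_i | x_i) = theta * x_i

    theta_ml = np.mean(y / x)                  # ML estimator, part (c)
    theta_ratio = y.sum() / x.sum()            # moment estimator, part (d)
    theta_ols = (x * y).sum() / (x * x).sum()  # OLS through the origin
    print(theta_ml, theta_ratio, theta_ols)    # all near theta_true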

5. (a) $\ell_i(\theta) = \ln(\mathrm{Prob}(y_i|\theta)) = y_i \ln\theta - \theta - \ln(y_i!)$
(b) $\ell(\theta) = \sum_i \ell_i(\theta) = \left(\sum_i y_i\right)\ln\theta - n\theta - \sum_i \ln(y_i!)$
(c) $\hat\theta$ is the value of $\theta$ satisfying the first-order condition $\partial\ell/\partial\theta = 0$.
$$\partial\ell/\partial\theta = \frac{\sum_i y_i}{\theta} - n = 0 \quad\text{at } \theta = \hat\theta \qquad\Longrightarrow\qquad \hat\theta = \frac{\sum_i y_i}{n} = \bar y$$

(d) The negative Hessian variance estimator is
$$V(\hat\theta) = \left(-\frac{\partial^2\ell}{\partial\theta^2}\right)^{-1} \quad\text{evaluated at } \theta = \hat\theta, \quad\text{and}\quad \frac{\partial^2\ell}{\partial\theta^2} = -\frac{\sum_i y_i}{\theta^2},$$
therefore
$$V(\hat\theta) = \left(\frac{\sum_i y_i}{\hat\theta^2}\right)^{-1} = \frac{\hat\theta^2}{\sum_i y_i} = \frac{\hat\theta^2}{n\bar y} = \frac{\bar y}{n}$$
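(Aside: a minimal numerical check of (c) and (d), not part of the original answer, on hypothetical simulated Poisson data.)

    import numpy as np

    rng = np.random.default_rng(3)
    y = rng.poisson(lam=4.0, size=2_000)

    theta_hat = y.mean()          # ML estimator, part (c)
    var_hat = theta_hat / len(y)  # negative Hessian variance estimate, part (d)
    print(theta_hat, var_hat)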
