
Econometrics [EM2008/EM2Q05]

Lecture 4
Maximum likelihood and generalized least squares estimators

Irene Mammi

irene.mammi@unive.it

Academic Year 2018/2019

outline

▶ maximum likelihood and generalized least squares estimators
  ▶ maximum likelihood estimators (MLEs)
  ▶ ML estimation of the linear model
  ▶ ML-based tests: LR, Wald and LM
  ▶ generalized least squares

▶ References:
  ▶ Johnston, J. and J. DiNardo (1997), Econometric Methods, 4th Edition, McGraw-Hill, New York, Chapter 5.

MLEs in a nutshell

▶ let y′ = (y₁ y₂ · · · yₙ) be an n-vector of sample values, dependent on some k-vector of unknown parameters, θ′ = (θ₁ θ₂ · · · θₖ)
▶ let the joint density be written f(y; θ): this density may either indicate, for given θ, the probability of a set of sample outcomes, or it may be interpreted as a function of θ, conditional on a set of sample outcomes
▶ in the latter interpretation it is referred to as the likelihood function:

      Likelihood function = L(θ; y) = f(y; θ)

▶ maximizing the likelihood function wrt θ means finding a specific value, say θ̂, that maximizes the probability of obtaining the sample values actually observed
▶ θ̂ is the MLE of the unknown parameter vector θ

MLEs in a nutshell (cont.)

▶ in most cases, it is simpler to maximize the log of the likelihood function:

      ℓ = ln L

▶ then

      ∂ℓ/∂θ = (1/L) ∂L/∂θ

  and the θ̂ that maximizes ℓ will also maximize L
▶ the derivative of ℓ wrt θ is the score, s(θ; y)
▶ the MLE θ̂ is obtained by setting the score to zero, i.e. by finding the θ that solves

      s(θ; y) = ∂ℓ/∂θ = 0
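
As a concrete numerical illustration (not part of the original slides; the data and all names are hypothetical), the sketch below obtains the MLE of a normal mean and standard deviation by maximizing ℓ numerically, which can then be compared with the analytic solution:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=1.5, size=500)  # sample with true mu = 2, sigma = 1.5

def neg_loglik(theta, y):
    """Negative normal log-likelihood; theta = (mu, log_sigma) keeps sigma > 0."""
    mu, log_sigma = theta
    sigma2 = np.exp(2 * log_sigma)
    n = y.size
    return 0.5 * n * np.log(2 * np.pi * sigma2) + np.sum((y - mu) ** 2) / (2 * sigma2)

res = minimize(neg_loglik, x0=np.array([0.0, 0.0]), args=(y,))
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
print(mu_hat, sigma_hat)  # close to the analytic MLEs y.mean() and y.std(ddof=0)
```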

ML estimation of the linear model

▶ consider the linear model

      y = Xβ + u    with u ∼ N(0, σ²I)

▶ the multivariate normal density for u is

      f(u) = (2πσ²)^(−n/2) exp[−(1/(2σ²)) u′u]

▶ so that the multivariate density for y conditional on X is

      f(y|X) = f(u) |∂u/∂y|

  where |∂u/∂y| is the absolute value of the determinant of the n × n matrix of partial derivatives of u wrt y, which is here simply the identity matrix

ML estimation of the linear model (cont.)
▶ the log-likelihood function is

      ℓ = ln f(y|X) = ln f(u) = −(n/2) ln 2π − (n/2) ln σ² − (1/(2σ²)) u′u
        = −(n/2) ln 2π − (n/2) ln σ² − (1/(2σ²)) (y − Xβ)′(y − Xβ)

▶ the vector of unknown parameters, θ, has k + 1 elements, namely θ′ = (β′, σ²)
▶ taking partial derivatives gives

      ∂ℓ/∂β = −(1/σ²)(−X′y + X′Xβ)

      ∂ℓ/∂σ² = −n/(2σ²) + (1/(2σ⁴)) (y − Xβ)′(y − Xβ)

ML estimation of the linear model (cont.)

▶ setting these partial derivatives to zero gives the MLEs as

      β̂ = (X′X)⁻¹X′y

  and

      σ̂² = (y − Xβ̂)′(y − Xβ̂)/n

▶ the MLE β̂ is the OLS estimator, b, and σ̂² is e′e/n, with e being the OLS residuals
▶ the maximum of the likelihood function is L(β̂, σ̂²) = constant · (e′e)^(−n/2)
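
A minimal sketch (my own illustration, on simulated data) of these closed-form MLEs, confirming that β̂ coincides with OLS and that σ̂² divides e′e by n rather than n − k:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(scale=0.8, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # identical to the OLS estimator b
e = y - X @ beta_hat                          # OLS residuals
sigma2_hat = (e @ e) / n                      # ML variance estimate: e'e / n
print(beta_hat, sigma2_hat)
```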

ML-based tests

consider the general framework of linear hypotheses about β

      H₀ : Rβ = r

where R is a q × k (q < k) matrix of known constants and r a q × 1 known vector

Likelihood ratio (LR) test

▶ L(β̂, σ̂²) is the unrestricted maximum likelihood and can be expressed as a function of the unrestricted sum of squares, e′e
▶ the model may also be estimated in restricted form by maximizing the likelihood subject to the restrictions Rβ = r; denoting the resulting estimators as β̃ and σ̃², the maximum of the likelihood is L(β̃, σ̃²)
▶ if the restrictions are valid, we expect the restricted maximum to be close to the unrestricted maximum

ML-based tests (cont.)

▶ the likelihood ratio is defined as

      λ = L(β̃, σ̃²) / L(β̂, σ̂²)

▶ a generally applicable large-sample test is

      LR = −2 ln λ = 2[ln L(β̂, σ̂²) − ln L(β̃, σ̃²)] ∼ᵃ χ²(q)

  (∼ᵃ denoting the asymptotic distribution), which can alternatively be expressed in terms of the restricted and unrestricted residuals e∗ and e as

      LR = n(ln e∗′e∗ − ln e′e)

▶ the calculation of the LR statistic requires the fitting of both the restricted and the unrestricted model, as sketched below
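
A numerical sketch (mine, on hypothetical simulated data) of the LR statistic in its sum-of-squares form, testing H₀ that the two slope coefficients are zero:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)

def rss(X, y):
    """Residual sum of squares from an OLS fit."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return e @ e

rss_u = rss(X, y)           # unrestricted model: intercept plus two regressors
rss_r = rss(X[:, :1], y)    # restricted model under H0: intercept only
LR = n * (np.log(rss_r) - np.log(rss_u))  # refer to chi2(q) with q = 2
print(LR)
```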

ML-based tests (cont.)

Wald (W) test

▶ the Wald test only requires calculating the unrestricted β̂
▶ the vector (Rβ̂ − r) indicates the extent to which the unrestricted ML estimates fit the null hypothesis: a vector close to zero would support H₀
▶ under the null, Rβ̂ − r is asymptotically distributed as multivariate normal with zero mean and variance-covariance matrix R I⁻¹(β) R′, where I⁻¹(β) = σ²(X′X)⁻¹
▶ we have

      (Rβ̂ − r)′[R I⁻¹(β) R′]⁻¹(Rβ̂ − r) ∼ᵃ χ²(q)

▶ the asymptotic distribution still holds when σ² is replaced by σ̂² = e′e/n

ML-based tests (cont.)

▶ the resulting Wald statistic is

      W = (Rβ̂ − r)′[R(X′X)⁻¹R′]⁻¹(Rβ̂ − r) / σ̂² ∼ᵃ χ²(q)

  which can also be expressed as

      W = n(e∗′e∗ − e′e) / e′e ∼ᵃ χ²(q)
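
Continuing the same hypothetical simulation, a sketch of the Wald statistic computed directly from the unrestricted fit:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)

# H0: the two slopes are zero, i.e. R beta = r with q = 2 restrictions
R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
r = np.zeros(2)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_hat
sigma2_hat = (e @ e) / n                 # ML variance estimate e'e/n
d = R @ beta_hat - r
W = d @ np.linalg.solve(R @ np.linalg.inv(X.T @ X) @ R.T, d) / sigma2_hat
print(W)                                 # refer to chi2(2); W >= LR here
```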

ML-based tests (cont.)
Lagrange multiplier (LM) test

▶ the LM test, also known as the score test, is based on the score vector

      s(θ) = ∂ ln L/∂θ = ∂ℓ/∂θ

▶ the unrestricted estimator, θ̂, is found by solving s(θ̂) = 0; the score vector will in general not be zero when evaluated at θ̃, the restricted estimator
▶ if the restrictions are valid, the restricted maximum, ℓ(θ̃), should be close to the unrestricted maximum, ℓ(θ̂), and so the gradient of the former should be close to zero
▶ under the null hypothesis,

      LM = s′(θ̃) I⁻¹(θ̃) s(θ̃) ∼ᵃ χ²(q)

▶ notice that there is no need to compute the unrestricted estimator

ML-based tests (cont.)

▶ it can be shown that the LM statistic is

      LM = nR²

  where R² is the squared multiple correlation coefficient from the regression of e∗ on X
▶ the LM test can be implemented in two steps: first compute the restricted estimator θ̃ and obtain the residual vector e∗; then regress e∗ on X and refer nR² from this regression to χ²(q), as sketched below
▶ it can be shown that

      LM = n(e∗′e∗ − e′e) / e∗′e∗

▶ it can also be proved that W ≥ LR ≥ LM
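
A sketch (mine) of the two-step auxiliary-regression implementation, reusing the hypothetical design from the LR and Wald sketches:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)

# Step 1: restricted fit under H0 (intercept only), restricted residuals e*
b_r, *_ = np.linalg.lstsq(X[:, :1], y, rcond=None)
e_star = y - X[:, :1] @ b_r

# Step 2: regress e* on the full X and compute n * R^2
b_aux, *_ = np.linalg.lstsq(X, e_star, rcond=None)
fitted = X @ b_aux
R2 = 1 - np.sum((e_star - fitted) ** 2) / np.sum((e_star - e_star.mean()) ** 2)
LM = n * R2   # refer to chi2(q) with q = 2; numerically W >= LR >= LM
print(LM)
```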

ML-based tests (cont.)

Figure 1: ML-based tests

ML estimation with nonspherical disturbances

▶ consider the model

      y = Xβ + u    with u ∼ N(0, σ²Ω)

  where Ω is a positive definite matrix of order n, whose elements are assumed to be known
▶ e.g. assume

      var(uᵢ) = σᵢ² = σ²X₂ᵢ²,    i = 1, 2, . . . , n

  so that the error variance-covariance matrix is

      var(u) = σ² diag(X₂₁², X₂₂², · · · , X₂ₙ²)

ML estimation with nonspherical disturbances (cont.)
▶ the multivariate normal density for u is

      f(u) = (2π)^(−n/2) |σ²Ω|^(−1/2) exp[−(1/2) u′(σ²Ω)⁻¹u]

  which, noting that |σ²Ω| = σ²ⁿ|Ω|, can be rewritten as

      f(u) = (2π)^(−n/2) (σ²)^(−n/2) |Ω|^(−1/2) exp[−(1/(2σ²)) u′Ω⁻¹u]

▶ the log-likelihood is then

      ℓ = −(n/2) ln(2π) − (n/2) ln σ² − (1/2) ln|Ω| − (1/(2σ²)) (y − Xβ)′Ω⁻¹(y − Xβ)

▶ differentiating with respect to β and σ² and setting the partial derivatives to zero gives the ML estimators

      β̂ = (X′Ω⁻¹X)⁻¹X′Ω⁻¹y

  and

      σ̂² = (1/n)(y − Xβ̂)′Ω⁻¹(y − Xβ̂)
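
A sketch (my construction) of these estimators for the heteroskedastic example above, where Ω = diag(X₂₁², …, X₂ₙ²) is known:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x2 = rng.uniform(0.5, 3.0, size=n)
X = np.column_stack([np.ones(n), x2])
u = rng.normal(size=n) * 0.7 * x2             # var(u_i) = sigma^2 * x2_i^2
y = X @ np.array([1.0, 2.0]) + u

Omega_inv = np.diag(1.0 / x2 ** 2)            # Omega = diag(x2_i^2), assumed known
A = X.T @ Omega_inv @ X
beta_hat = np.linalg.solve(A, X.T @ Omega_inv @ y)
resid = y - X @ beta_hat
sigma2_hat = (resid @ Omega_inv @ resid) / n  # ML estimate of sigma^2
print(beta_hat, sigma2_hat)
```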

generalized least squares

▶ since Ω is positive definite, its inverse is positive definite; thus it is possible to find a nonsingular matrix P such that

      Ω⁻¹ = P′P

▶ substitution into the MLE formula gives

      β̂ = (X′P′PX)⁻¹X′P′Py = [(PX)′(PX)]⁻¹(PX)′(Py)

  which is exactly the vector of estimated coefficients that would be obtained from the OLS regression of the vector Py on the matrix PX
▶ to see this, premultiply the linear model by P and obtain

      y∗ = X∗β + u∗

  where y∗ = Py, X∗ = PX, and u∗ = Pu (see the sketch below)
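
For the diagonal Ω of the running example, one valid choice (an assumption of mine; any P with P′P = Ω⁻¹ works) is P = diag(1/X₂ᵢ), which simply rescales each observation:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x2 = rng.uniform(0.5, 3.0, size=n)
X = np.column_stack([np.ones(n), x2])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n) * 0.7 * x2

# P = diag(1/x2_i) satisfies P'P = Omega^{-1} for Omega = diag(x2_i^2)
P = np.diag(1.0 / x2)
X_star, y_star = P @ X, P @ y
b_gls, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)  # plain OLS on transformed data
print(b_gls)

# For a general dense Omega, a Cholesky factor of Omega^{-1} gives one valid P:
#   L = np.linalg.cholesky(np.linalg.inv(Omega)); P = L.T, so that P.T @ P = Omega^{-1}
```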

generalized least squares (cont.)
▶ since Ω = P⁻¹(P′)⁻¹, we have

      var(u∗) = E(Pu u′P′)
              = σ²PΩP′
              = σ²PP⁻¹(P′)⁻¹P′
              = σ²I

▶ the coefficient vector from the OLS regression of y∗ on X∗ is the generalized least squares (GLS) estimator:

      b_GLS = (X∗′X∗)⁻¹X∗′y∗ = (X′Ω⁻¹X)⁻¹X′Ω⁻¹y

▶ it follows directly that

      var(b_GLS) = σ²(X∗′X∗)⁻¹ = σ²(X′Ω⁻¹X)⁻¹
generalized least squares (cont.)

▶ an unbiased estimate of σ² is obtained as

      s² = (y∗ − X∗b_GLS)′(y∗ − X∗b_GLS)/(n − k)
         = [P(y − Xb_GLS)]′[P(y − Xb_GLS)]/(n − k)
         = (y − Xb_GLS)′Ω⁻¹(y − Xb_GLS)/(n − k)

▶ an exact finite-sample test of the linear restrictions

      H₀ : Rβ = r

  can be based on

      F = {(r − Rb_GLS)′[R(X′Ω⁻¹X)⁻¹R′]⁻¹(r − Rb_GLS)/q} / s²

  which follows the F(q, n − k) distribution under H₀
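
A closing sketch (mine) computing s² and this F statistic in the running heteroskedastic design, testing the single restriction H₀: β₂ = 2:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, k = 300, 2
x2 = rng.uniform(0.5, 3.0, size=n)
X = np.column_stack([np.ones(n), x2])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n) * 0.7 * x2

Omega_inv = np.diag(1.0 / x2 ** 2)
A = X.T @ Omega_inv @ X
b_gls = np.linalg.solve(A, X.T @ Omega_inv @ y)
resid = y - X @ b_gls
s2 = (resid @ Omega_inv @ resid) / (n - k)   # unbiased estimate of sigma^2

R, r_vec, q = np.array([[0.0, 1.0]]), np.array([2.0]), 1   # H0: beta_2 = 2
d = r_vec - R @ b_gls
F = (d @ np.linalg.solve(R @ np.linalg.inv(A) @ R.T, d) / q) / s2
p_value = stats.f.sf(F, q, n - k)            # exact F(q, n - k) under normality
print(F, p_value)
```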

