
Advanced Econometrics

Panel data econometrics and GMM estimation

Alban Thomas

MF 102, thomas@toulouse.inra.fr

Course objective: a consistent treatment of the impact of unobserved heterogeneity on model predictions and on restrictions imposed by theory.

Methods:

- Fixed Effects Least Squares

- Generalized Least Squares

- Instrumental Variables

- Maximum Likelihood estimation for Panel Data models

- Generalized Method of Moments for Panel Data

- Heteroskedasticity-consistent estimation

- Dynamic Panel Data models

- Simulation-based inference

- Nonparametric and Semiparametric estimation

Contents

Part I. Panel Data Models

1 Introduction
    1.1.2 Examples
    1.2 Analysis of variance
    1.3 Some definitions
2 The linear model
    2.1 Notation
    2.1.1 Model notation
    2.2.3 Comments
3 Extensions
    3.2.2 Typical heteroskedasticity
    3.3.1 Introduction
4 Augmented panel data models
    4.1 Introduction
    4.4 GLS estimator
    4.4.2 IV in a panel-data context
    4.4.4 Amemiya-MaCurdy and Breusch-Mizon-Schmidt estimators
    4.6 Example: Wage equation
    4.6.1 Model specification
5
    5.1 Motivation
    5.2.2 Instrumental-variable estimation
    5.3.2 An equivalent representation

Part II

6 The GMM estimator
    6.1.1 Moment conditions
    6.1.6 Comments
    6.2.3 A definition
    6.3.1 Consistency
    6.3.2 Asymptotic normality
7
    7.1 GMM estimation
    7.2.1 A simple estimator
    7.3.2 IV estimation
8
    8.1 Introduction
    8.2.1 Model assumptions
    8.3.1 Additional assumptions
    8.5.2 Mixed structure

Part III

9
    9.1.2 Logit model
    9.1.3 Probit model
    9.2.1 Sufficient statistics
    9.2.2 Conditional probabilities
    9.4.2 The IV estimator

Appendix 2. The two-way random effects model
Appendix 5. GMM estimation of static panel models
Appendix 8. A crash course in Gauss
Appendix 9. Example: The Gauss software

References

Part I

Panel Data Models

Chapter 1

Introduction

Panel data: sequential observations on a number of units (individuals, firms). Also called pooled cross-section time-series data.

1.1.1

A general formulation:

F(Y, X, Z, θ) = 0,

where Y is the outcome variable, X denotes observed individual characteristics, Z denotes other explanatory variables (public policies, etc.), and θ is the vector of parameters.

Linear model:

Y = β₀ + βₓ X + β_z Z + u.

The error term u captures individual characteristics, or features, not included in Z, and relates both to inter-individual differences and to omitted variables.

1.1.2 Examples

a) […] firms value those people more; […] (expected productivity) anyway, and firms value worker ability more.

b) More efficient firms enjoy more sales, and thus have more money for advertisement expenditures.

c) Firms with higher output are more regulated on average.

d) WAGE = β₀ + β₁ I(UNION) + β₂ Z. Firms react to higher wages imposed by unions by hiring higher-quality workers […]

1.1.3

[…] prices are difficult to use, because:

- Time-series: observations are serially related;
- Cross-sections: no information on adjustment dynamics; estimates may reflect inter-individual differences inherent in comparisons of individuals or firms.

With panel data, variations across individuals and across time periods are accounted for.

1.1.4

Panel data allow controlling for unobserved variables across individuals. This is critical in practice, and explains why panel data models are now so popular in micro- and macro-econometrics. The point is related to endogeneity and omitted-variables issues.

From the equilibrium condition, price equals marginal cost:

p = ∂c(Q)/∂Q = γA Q^(γ−1)   (Cobb-Douglas)
             = δ₀ + δ₁ Q     (Quadratic).

Cobb-Douglas case: log Q = [1/(γ−1)] (log p − log γA). From the equilibrium condition to an estimable equation: observations (Q_it, p_it), unobserved heterogeneity μ_i, firm i, period t:

log Q_it = [1/(γ−1)] (log p_it − log μ_i γA).

Identification issue: the estimable equation is

Q̃_it = a₀ + a₁ p̃_it + u_it,   i = 1, 2, …, N,  t = 1, 2, …, T,

where Q̃_it = log Q_it, p̃_it = log p_it, a₁ = 1/(γ−1), a₀ = −(log γA + E log μ_i)/(γ−1), and E u_it = 0.

The model is identified if E log μ_i = 0 (i.e., the μ_i have unit geometric mean); otherwise the estimate of A is biased when μ_i is overlooked and E log μ_i ≠ 0.

Empirical issue: possible correlation between the output price p_it and the efficiency term μ_i.

1.2 Analysis of variance

Consider the model

y_it = α_i + β_i x_it + u_it,   i = 1, 2, …, N,  t = 1, 2, …, T_i,

where x_it is scalar, and α_i and β_i are specific to individual i. T_i: number of time observations on individual i. Define

ȳ_i = (1/T_i) Σ_t y_it,   x̄_i = (1/T_i) Σ_t x_it,

Sxx_i = Σ_t (x_it − x̄_i)²,   Syy_i = Σ_t (y_it − ȳ_i)²,   Sxy_i = Σ_t (x_it − x̄_i)(y_it − ȳ_i),

for i = 1, 2, …, N. OLS applied to each individual i gives

β̂_i = Sxy_i / Sxx_i   and   α̂_i = ȳ_i − x̄_i β̂_i,

RSS_i = Syy_i − Sxy_i² / Sxx_i,

with (T_i − 2) degrees of freedom.

Consider now a restricted model with constant slopes and constant intercepts:

β_1 = β_2 = ⋯ = β_N (= β),   α_1 = α_2 = ⋯ = α_N (= α).

The pooled OLS estimates are

β̂ = [Σ_i Σ_t (x_it − x̄)(y_it − ȳ)] / [Σ_i Σ_t (x_it − x̄)²]

and α̂ = ȳ − x̄ β̂, where

ȳ = (1/Σ_i T_i) Σ_i Σ_t y_it,   x̄ = (1/Σ_i T_i) Σ_i Σ_t x_it,

with residual sum of squares

RSS = Σ_i Σ_t (y_it − ȳ)² − [Σ_i Σ_t (y_it − ȳ)(x_it − x̄)]² / [Σ_i Σ_t (x_it − x̄)²]

and Σ_i T_i − 2 degrees of freedom.

Estimating a separate (α_i, β_i) for every individual would require a great number of time observations. If unobserved heterogeneity is additive in the model, we might consider the following specification with constant slope and different intercepts:

y_it = α_i + β x_it + u_it.

Minimizing Σ_i Σ_t (y_it − α_i − β x_it)² with respect to α_i and β, we have

Σ_t (y_it − α_i − β x_it) = 0  for each i,   Σ_i Σ_t x_it (y_it − α_i − β x_it) = 0,

so that

α̂_i = ȳ_i − x̄_i β̂   and   β̂ = [Σ_i Σ_t x_it (y_it − ȳ_i)] / [Σ_i Σ_t x_it (x_it − x̄_i)].

The residual sum of squares now has Σ_i T_i − (N + 1) degrees of freedom.
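These estimators can be sketched in a few lines of numpy. The code below is illustrative (simulated data, not from the course): it computes the individual-by-individual OLS estimates β̂_i = Sxy_i/Sxx_i, and the constant-slope, different-intercepts estimator obtained by pooling the within-individual variation.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 4, 30
alpha = rng.normal(size=N)          # individual intercepts alpha_i
x = rng.normal(size=(N, T))
y = alpha[:, None] + 1.5 * x + 0.1 * rng.normal(size=(N, T))  # true slope 1.5

# Per-individual sums of squares and cross-products
xbar, ybar = x.mean(axis=1), y.mean(axis=1)
Sxx = ((x - xbar[:, None]) ** 2).sum(axis=1)
Sxy = ((x - xbar[:, None]) * (y - ybar[:, None])).sum(axis=1)

# Individual-by-individual OLS: beta_i = Sxy_i / Sxx_i, alpha_i = ybar_i - xbar_i * beta_i
beta_i = Sxy / Sxx
alpha_i = ybar - xbar * beta_i

# Constant slope, different intercepts: pool the within-individual variation
beta_fe = Sxy.sum() / Sxx.sum()
alpha_fe = ybar - xbar * beta_fe
```

The pooled slope β̂ combines all N within-individual variations and is therefore more precise than any single β̂_i.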

1.3 Some definitions

Typical panel: the number of units (individuals) N is large, and the number of time periods T is small.

Balanced panel: same number of periods for every unit (individual).

Rotating panel: a subset of individuals is replaced every period. Rotating panels can be balanced or unbalanced.

Pseudo panel: […]

Attrition: the probability that a unit remains in the sample decreases as the number of periods increases (non-response, moving, death, etc.).

Chapter 2

The linear model

2.1 Notation

y_it = x_it β + u_it,   i = 1, 2, …, N,  t = 1, 2, …, T,

where x_it is a 1 × K vector of regressors; y_it and the components of x_it are both time-varying and varying across individuals.

One-way error-component model: u_it = μ_i + ε_it. Two-way error-component model:

u_it = μ_i + λ_t + ε_it,

where μ_i is the time-invariant individual effect, λ_t is the time effect, and ε_it is the i.i.d. component. Then

E(y_it | x_it, μ_i) = x_it β + μ_i              for individual i, across periods,
E(y_it | x_it, λ_t) = x_it β + λ_t              for period t, across individuals,
E(y_it | x_it, μ_i, λ_t) = x_it β + μ_i + λ_t   for individual i and period t.

2.1.1 Model notation

In stacked form,

Y = Xβ + μ + λ + ε.

Convention: index t runs faster, index i runs slower, so that

Y = (y_11, …, y_1T, y_21, …, y_2T, …, y_N1, …, y_NT)′,

and X is the NT × K matrix whose row for observation (i, t) is (X_it^(1), …, X_it^(K)), stacked in the same order, with β = (β_1, …, β_K)′.

Individual-block form:

y_i = X_i β + μ_i + λ + ε_i,   i = 1, 2, …, N,

where y_i is T × 1 and X_i is T × K. Note: λ = (λ_1, λ_2, …, λ_T)′ and μ_i = (μ_i, μ_i, …, μ_i)′ are both T × 1.

2.1.2

e_T: T-vector of ones;

B = I_N ⊗ (1/T) e_T e′_T                        (Between-individual operator);
B̄ = (1/N) e_N e′_N ⊗ I_T                        (Between-period operator);
Q = I_NT − I_N ⊗ (1/T) e_T e′_T = I_NT − B      (Within-individual operator);
Q̄ = I_NT − (1/N) e_N e′_N ⊗ I_T = I_NT − B̄      (Within-period operator);
J = (1/NT) e_NT e′_NT                           (computes the full population mean).

These operators apply directly to a model without intercept (otherwise, use B − J to demean all variables). The B operator replaces each of the NT observations by the corresponding individual mean; Q removes individual means.

2.1.3

Q0 = Q; B 0 = B; Q2 = Q; B 2 = B; BQ = QB = 0;

Decomposition of the Q operator with N = T = 2:

02

3

1

1 0 0 0

B6 0 1 0 0 7

C

1

1

0

1

1

6

7

Cy

Qy = B

@4 0 0 1 0 5

0 1

2 1 1 A

0 0 0 1

0

1

2

30

1

1 1 0 0

y11

y11

B y12 C 1 6 1 1 0 0 7 B y12 C

6

7B

C

C

=B

@ y21 A 2 4 0 0 1 1 5 @ y21 A

0 0 1 1

y22

y22

0

1

0

1

y11

y11 + y12

B y12 C 1 B y11 + y12 C

C

B

C

=B

@ y21 A 2 @ y21 + y22 A

y22

y21 + y22

We will also use

QT = IT (1=T )eT e0T = IT BT : Within operator for a single

individual.

2.2

Terminology: the fixed-effects model does not mean that the individual effects are treated as nonrandom constants in the population; rather, estimation is conditional on the μ_i's in the sample.

2.2.1 The Frisch-Waugh theorem

Let E be the NT × N matrix of individual dummy variables: E = I_N ⊗ e_T, whose i-th column equals 1 for the T observations of individual i (i = 1, 2, …, N) and 0 elsewhere. The model can then be written by regressing Y on X and the dummies:

Y = Xβ + Eμ + ε = Wδ + u,

where W = [X, E] and δ = (β′, μ′)′.

Frisch-Waugh-Lovell theorem: the parameter estimates β̂ from δ̂ = (W′W)⁻¹W′Y are numerically identical to those of the regression on the transformed variables,

β̂ = (X*′X*)⁻¹ X*′Y*,   where
X* = [I − E(E′E)⁻¹E′] X = P_E X   and   Y* = [I − E(E′E)⁻¹E′] Y = P_E Y

(X and Y partialled out on E).

But E = I_N ⊗ e_T, so E′E = I_N ⊗ e′_T e_T = T I_N, and

P_E = I − E(E′E)⁻¹E′ = I − (1/T) E E′ = I − I_N ⊗ (1/T) e_T e′_T = Q.

Hence β̂ = (X′QX)⁻¹ (X′QY).

Equivalently, eliminate the individual effects by demeaning the variables:

y_it − (1/T) Σ_t y_it = [x_it − (1/T) Σ_t x_it] β + u_it − (1/T) Σ_t u_it,

or, in matrix form, Y − BY = (X − BX) β + u − Bu, i.e.

QY = QX β + Qu.

OLS on the transformed model gives the Within estimator:

β̂ = (X′QX)⁻¹ X′QY   and   Var(β̂) = σ²_ε (X′QX)⁻¹.
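A quick numerical check of this equivalence (illustrative numpy sketch on simulated data): the Within estimator (X′QX)⁻¹X′QY coincides with the coefficient on X in the least-squares regression of Y on [X, E].

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, K = 5, 6, 2
mu = rng.normal(size=N)                          # individual effects
beta = np.array([1.0, -0.5])
X = rng.normal(size=(N * T, K))                  # index t runs faster, i slower
y = X @ beta + np.repeat(mu, T) + 0.05 * rng.normal(size=N * T)

# Within operator Q = I_NT - I_N kron (1/T) e_T e_T'
Q = np.eye(N * T) - np.kron(np.eye(N), np.full((T, T), 1.0 / T))
b_within = np.linalg.solve(X.T @ Q @ X, X.T @ Q @ y)

# LSDV: regress y on W = [X, E], with E = I_N kron e_T (individual dummies)
E = np.kron(np.eye(N), np.ones((T, 1)))
delta = np.linalg.lstsq(np.hstack([X, E]), y, rcond=None)[0]
b_lsdv = delta[:K]
```

The two coefficient vectors agree to machine precision, as the Frisch-Waugh-Lovell theorem predicts; Q is symmetric and idempotent.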

2.2.2 Least-squares dummy variables (LSDV)

Stacking by individual,

y_i = x_i β + e_T μ_i + ε_i,   i = 1, 2, …, N,

with the usual assumptions on ε. OLS estimates of β and the μ_i obtain by minimizing

Σ_i ε′_i ε_i = Σ_i (y_i − μ_i e_T − x_i β)′ (y_i − μ_i e_T − x_i β),

which yields

μ̂_i = ȳ_i − x̄_i β̂,   i = 1, 2, …, N,

and, substituting into the partial derivative w.r.t. β,

β̂ = [Σ_{i,t} (x_it − x̄_i)(x_it − x̄_i)′]⁻¹ [Σ_{i,t} (x_it − x̄_i)(y_it − ȳ_i)].

This is the Within (or Least-Squares Dummy-Variable) estimator. β̂ is unbiased, and is consistent when N → ∞ or T → ∞, with

Var(β̂) = σ̂²_ε [Σ_i X′_i Q_T X_i]⁻¹,

where Q_T = I_T − (1/T) e_T e′_T.

2.2.3 Comments

- Coefficients associated with time-invariant regressors are not identified: the Within transformation removes everything that is constant within a unit, hence the name.

- The Between regression, BY = BXβ + Bμ + Bε, instead uses the variation between individual means of the model variables.

- Degrees of freedom: if the RSS of the Within regression is divided by NT − K (individual effects not included), the error variance is underestimated; in the model Y = Xβ + Eμ + ε, the RSS would be divided by N(T − 1) − K, so the naive variance estimate must be multiplied by (NT − K)/[N(T − 1) − K].

2.2.

31

Y

..........

..........

.

.

.

.

.

.

.

.

.

.

..........

..........

.

........

........

.

.

.

.

.

.

.

......

Within

Between

1...........

X

2.2.4 Poolability tests

As before, but now x_it is a K-vector and coefficients may differ across individuals.

1/ Individual regressions versus common slopes:

H0: β_1 = β_2 = ⋯ = β_N (= β)   (K(N − 1) constraints),

F = [(RRSS − URSS)/K(N − 1)] / [URSS/N(T − K − 1)] ∼ F(K(N − 1), N(T − K − 1)),

where URSS = Σ_i RSS_i with RSS_i = Syy_i − Sxy_i²/Sxx_i.

2/ Pooled OLS versus Within:

H0: μ_1 = ⋯ = μ_N (= μ),

F = [(RRSS − URSS)/(N − 1)] / [URSS/(NT − N − K)] ∼ F(N − 1, NT − N − K),

where RRSS is from pooled OLS and URSS from the Within (LSDV) regression.
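The first test can be sketched as follows (illustrative numpy code, simulated data generated under H0, with K = 1 regressor): URSS comes from the N separate regressions, RRSS from the common-slope, individual-intercepts model.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, K = 6, 20, 1
x = rng.normal(size=(N, T))
y = 0.5 + 2.0 * x + rng.normal(size=(N, T))      # H0 true: common slope

xbar, ybar = x.mean(axis=1, keepdims=True), y.mean(axis=1, keepdims=True)
Sxx = ((x - xbar) ** 2).sum(axis=1)
Syy = ((y - ybar) ** 2).sum(axis=1)
Sxy = ((x - xbar) * (y - ybar)).sum(axis=1)

# Unrestricted: one regression per individual
URSS = (Syy - Sxy ** 2 / Sxx).sum()

# Restricted: common slope, free intercepts (Within)
b_w = Sxy.sum() / Sxx.sum()
RRSS = (Syy - 2 * b_w * Sxy + b_w ** 2 * Sxx).sum()

df1 = K * (N - 1)
df2 = N * (T - K - 1)
F = ((RRSS - URSS) / df1) / (URSS / df2)         # ~ F(df1, df2) under H0
```

Since the restricted model is nested in the unrestricted one, RRSS ≥ URSS always holds, and under H0 the statistic fluctuates around 1.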

2.3 The random-effects model

2.3.1 Assumptions

The μ_i's are now treated as random draws, and inference is marginal (not conditional upon the μ_i's). Assumptions:

E(μ_i μ_j) = σ²_μ if i = j, 0 otherwise;
E(ε_it ε_js) = σ²_ε if i = j and t = s, 0 otherwise.

Hence cov(u_it, u_js) = σ²_μ + σ²_ε if i = j and t = s, and σ²_μ if i = j and t ≠ s.

Let

Σ_T = E(u_i u′_i) = σ²_μ e_T e′_T + σ²_ε I_T,

a T × T matrix with σ²_μ + σ²_ε on the diagonal and σ²_μ elsewhere. Then

Ω = E(uu′) = I_N ⊗ Σ_T = I_N ⊗ [σ²_μ (e_T e′_T) + σ²_ε I_T].

Since Q_T = I_T − B_T and B_T = (1/T) e_T e′_T, we have

Ω = I_N ⊗ [σ²_μ T B_T + σ²_ε (Q_T + B_T)] = T σ²_μ B + σ²_ε I_NT,

or equivalently:

Ω = σ²_ε Q + (T σ²_μ + σ²_ε) B.

2.3.2 GLS estimation

Y = Xβ + U,   with E(UU′) = Ω.

Given the variance components σ²_μ and σ²_ε, the GLS estimator and its covariance matrix are

β̂_GLS = (X′Ω⁻¹X)⁻¹ X′Ω⁻¹Y,   Var(β̂_GLS) = (X′Ω⁻¹X)⁻¹.

Computation of Ω⁻¹: based on the properties of Q and B (idempotent, orthogonal),

Ω^r = (σ²_ε)^r Q + (T σ²_μ + σ²_ε)^r B

for an arbitrary scalar r. In particular,

Ω⁻¹ = (1/σ²_ε) Q + [1/(T σ²_μ + σ²_ε)] B   and   Ω^(−1/2) = (1/σ_ε) Q + [1/(T σ²_μ + σ²_ε)^(1/2)] B.

We have

β̂_GLS = (X′Ω⁻¹X)⁻¹ X′Ω⁻¹Y = [X′(σ²_ε Ω⁻¹)X]⁻¹ [X′(σ²_ε Ω⁻¹)Y]
       = [X′(Q + (1/θ) B) X]⁻¹ [X′(Q + (1/θ) B) Y],

where θ = (T σ²_μ + σ²_ε)/σ²_ε = 1 + T σ²_μ/σ²_ε.

Equivalently, transform the model by σ_ε Ω^(−1/2) and use OLS: Y* = X*β + u*, where

Y* = σ_ε Ω^(−1/2) Y = [Q + θ^(−1/2) B] Y,
X* = σ_ε Ω^(−1/2) X = [Q + θ^(−1/2) B] X,

and in scalar form:

{y*_it} = y_it − (1 − 1/√θ) ȳ_i,
{x*_it} = x_it − (1 − 1/√θ) x̄_i.
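The quasi-demeaning form makes GLS a one-line transformation. Below is an illustrative numpy sketch on simulated data; the true variance components are assumed known here, whereas in practice they are estimated as in Section 2.3.6.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, K = 200, 5, 2
s2_mu, s2_eps = 1.0, 0.25                        # variance components
beta = np.array([1.0, 2.0])
X = rng.normal(size=(N * T, K))
u = np.repeat(rng.normal(scale=np.sqrt(s2_mu), size=N), T) \
    + rng.normal(scale=np.sqrt(s2_eps), size=N * T)
y = X @ beta + u

theta = 1.0 + T * s2_mu / s2_eps                 # theta = 1 + T sigma_mu^2 / sigma_eps^2
w = 1.0 - 1.0 / np.sqrt(theta)                   # quasi-demeaning weight

# y*_it = y_it - (1 - 1/sqrt(theta)) ybar_i, same for each column of X; then OLS
ybar = y.reshape(N, T).mean(axis=1)
Xbar = X.reshape(N, T, K).mean(axis=1)
y_star = y - w * np.repeat(ybar, T)
X_star = X - w * np.repeat(Xbar, T, axis=0)
b_gls = np.linalg.lstsq(X_star, y_star, rcond=None)[0]
```

With w = 0 this reduces to pooled OLS, and with w = 1 to the Within estimator, mirroring the limiting cases of θ.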

2.3.3 GLS as a weighted average

β̂_GLS = [X′QX + (1/θ) X′BX]⁻¹ [X′QY + (1/θ) X′BY].

With β̂_Within = (X′QX)⁻¹X′QY and β̂_Between = (X′BX)⁻¹X′BY,

β̂_GLS = S₁ β̂_Within + S₂ β̂_Between,

where S₁ = [X′QX + (1/θ) X′BX]⁻¹ X′QX and S₂ = [X′QX + (1/θ) X′BX]⁻¹ (1/θ) X′BX.

(i) β̂_GLS is a matrix-weighted average of the Within and Between estimators.
(ii) If T → ∞, then 1/θ → 0 and β̂_GLS → β̂_Within.
(iii) If 1/θ → 1 (σ²_μ → 0), then β̂_GLS → β̂_OLS, which weights Within and Between variation equally.
(iv) Var(β̂_Within) − Var(β̂_GLS) is a positive semi-definite matrix.
(v) If 1/θ → 0, then Var(β̂_Within) → Var(β̂_GLS).

2.3.4 Fixed or random effects?

If interest lies in the specific units in the sample, use conditional inference: fixed effects. Example: individuals are not selected at random, or all firms in a given industry are selected. If the units are selected randomly from a huge population (consumers), marginal (unconditional) inference is appropriate: random effects.

Relevant criteria:

- Sampling process: purely random or not;
- Number of units (countries, regions, households, …);
- Interchangeability of units;
- Endogeneity of x_it (see later).

2.3.4.2 Terminology

When fixed individual effects are considered, the Fixed-Effects (Within/LSDV) procedure is used; under random effects, the GLS (Generalized Least Squares) estimation procedure.

2.3.5 Example

Wage equation, N = 629 individuals and T = 6 periods of observations.

The GLS estimator is a weighted average of the Within and Between estimators, where the weight is the inverse of the corresponding variance. The Between estimator neglects the variation within individuals, and OLS gives equal weight to both Within and Between variations.

Note. If the model contains an intercept, we use the corresponding demeaning operators.

Table 2.1: Within and GLS estimates.

Variable           Within     GLS
Constant              —       0.8499
Age in [20,35]      0.0557    0.0393
Age in [35,45]      0.0351    0.0092
Age in [45,55]      0.0209   -0.0007
Age in [55,65]      0.0209   -0.0097
Age 65 and over    -0.0171   -0.0423
[…]                -0.0042   -0.0277
[…]                -0.0204   -0.0250
Self-employed      -0.2190   -0.2670
South              -0.1569   -0.0324
Rural              -0.0101   -0.1215

2.3.6 Estimation of the variances

σ̂²_ε = u′Qu / tr(Q) = Σ_i Σ_t (u_it − ū_i)² / [N(T − 1)]

and

σ̂²_ε + T σ̂²_μ = u′Bu / tr(B) = T Σ_i ū_i² / N,

because tr(Q) = N(T − 1) and tr(B) = N. The u_it's are unobserved, so the variances are computed from residuals û_it instead. Several choices:

1/ Use OLS residuals in place of the true u's;

2/ Amemiya (1971): use Within residuals; then

( √(NT) (σ̂²_ε − σ²_ε), √N (σ̂²_1 − σ²_1) )′ → N( 0, diag(2σ⁴_ε, 2σ⁴_1) ),

where σ²_1 = σ²_ε + T σ²_μ and σ̂²_μ = (σ̂²_1 − σ̂²_ε)/T;

3/ Mean square errors from the Within and Between regressions:

σ̂²_ε = [Y′QY − Y′QX(X′QX)⁻¹X′QY] / [N(T − 1) − K],
σ̂²_ε + T σ̂²_μ = [Y′BY − Y′BX(X′BX)⁻¹X′BY] / [N − K − 1]

(the Between regression uses the individual means BX, not the Within-transformed variables);

4/ Nerlove (1971): compute σ̂²_μ = [1/(N − 1)] Σ_i (μ̂_i − μ̂̄)², where the μ̂_i are from the LSDV regression.

Replacing σ²_μ and σ²_ε by consistent estimates yields Feasible GLS.
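A sketch of method 3/ with one regressor (illustrative numpy code on simulated data): the Within mean-square error estimates σ²_ε, and the scaled Between mean-square error estimates σ²_ε + Tσ²_μ, from which θ̂ for Feasible GLS follows.

```python
import numpy as np

rng = np.random.default_rng(4)
N, T, K = 500, 5, 1
s2_mu, s2_eps = 0.8, 0.4                          # true variance components
x = rng.normal(size=(N, T))
y = 1.0 * x + rng.normal(scale=np.sqrt(s2_mu), size=(N, 1)) \
            + rng.normal(scale=np.sqrt(s2_eps), size=(N, T))

# Within regression -> sigma_eps^2, df = N(T-1) - K
xd = x - x.mean(axis=1, keepdims=True)
yd = y - y.mean(axis=1, keepdims=True)
b_w = (xd * yd).sum() / (xd ** 2).sum()
s2_eps_hat = ((yd - b_w * xd) ** 2).sum() / (N * (T - 1) - K)

# Between regression on individual means -> sigma_eps^2 + T sigma_mu^2, df = N - K - 1
xb, yb = x.mean(axis=1), y.mean(axis=1)
xbc = xb - xb.mean()
b_b = (xbc * (yb - yb.mean())).sum() / (xbc ** 2).sum()
resid_b = (yb - yb.mean()) - b_b * xbc
s2_1_hat = T * (resid_b ** 2).sum() / (N - K - 1)

s2_mu_hat = (s2_1_hat - s2_eps_hat) / T
theta_hat = 1.0 + T * s2_mu_hat / s2_eps_hat      # feeds Feasible GLS
```

Both components are recovered close to their true values for moderate N.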

Chapter 3

Extensions

3.1 The Two-way panel data model

Error-component structure of the form:

u_it = μ_i + λ_t + ε_it,   i = 1, 2, …, N,  t = 1, 2, …, T,

or in matrix form

U = (I_N ⊗ e_T) μ + (e_N ⊗ I_T) λ + ε,

where μ = (μ_1, …, μ_N)′ and λ = (λ_1, …, λ_T)′.

3.1.1 Fixed effects: estimation and inference

A balanced panel on the N individuals over the period 1 → T.

3.1.1.1 Notation

Without an intercept, the two-way Within operator is

Q = I_N ⊗ I_T − I_N ⊗ (e_T e′_T / T) − (e_N e′_N / N) ⊗ I_T,

so that Qu = {u_it − ū_i − ū_t}_it. With the restrictions

Σ_i μ_i = 0   and   Σ_t λ_t = 0,

the fixed-effect estimates are

β̂ = (X′QX)⁻¹ X′QY,   μ̂_i = ȳ_i − x̄_i β̂,   λ̂_t = ȳ_t − x̄_t β̂.

If the model contains an intercept, the operator Q becomes

Q = I_N ⊗ I_T − I_N ⊗ (e_T e′_T / T) − (e_N e′_N / N) ⊗ I_T + (e_N e′_N / N) ⊗ (e_T e′_T / T),

so that Qu = {u_it − ū_i − ū_t + ū}_it, and the Within estimates are

β̂ = (X′QX)⁻¹ X′QY,
μ̂_i = (ȳ_i − ȳ) − (x̄_i − x̄) β̂,
λ̂_t = (ȳ_t − ȳ) − (x̄_t − x̄) β̂.
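The two-way Within transformation is just a double demeaning; a minimal numpy sketch on simulated data (illustrative, intercept-style normalization):

```python
import numpy as np

rng = np.random.default_rng(5)
N, T = 50, 8
mu = rng.normal(size=(N, 1))                     # individual effects
lam = rng.normal(size=(1, T))                    # time effects
x = rng.normal(size=(N, T))
y = 2.0 * x + mu + lam + 0.1 * rng.normal(size=(N, T))

def two_way_demean(a):
    # a_it - abar_i - abar_t + abar  (the Qu transformation with intercept)
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()

xq, yq = two_way_demean(x), two_way_demean(y)
b = (xq * yq).sum() / (xq ** 2).sum()

# Effects recovered as deviations from the overall mean
mu_hat = (y.mean(axis=1) - y.mean()) - b * (x.mean(axis=1) - x.mean())
lam_hat = (y.mean(axis=0) - y.mean()) - b * (x.mean(axis=0) - x.mean())
```

By construction, the recovered effects satisfy the restrictions Σ_i μ̂_i = 0 and Σ_t λ̂_t = 0.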

Testing for effects:

1/ H0: μ_1 = ⋯ = μ_N = λ_1 = ⋯ = λ_T = 0.

F = [(RRSS − URSS)/k₁] / [URSS/k₂] ∼ F(k₁, k₂),   k₁ = N + T − 2,   k₂ = (N − 1)(T − 1) − K,

where RRSS (restricted RSS) is from pooled OLS.

2/ H0: μ_1 = ⋯ = μ_N = 0 given λ_t ≠ 0, t ≤ T − 1.

F ∼ F(k₁, k₂),   k₁ = N − 1,   k₂ = (N − 1)(T − 1) − K,

with RRSS from the regression with time dummies only:

(y_it − ȳ_t) = (x_it − x̄_t) β + (u_it − ū_t).

3/ H0: λ_1 = ⋯ = λ_{T−1} = 0 given μ_i ≠ 0, i ≤ N − 1.

F ∼ F(k₁, k₂),   k₁ = T − 1,   k₂ = (N − 1)(T − 1) − K,

with RRSS from the Within regression as in the one-way model:

(y_it − ȳ_i) = (x_it − x̄_i) β + (u_it − ū_i).

3.1.2 Example

Estimation of a Cobb-Douglas production function:

log Q_it = β₀ + β₁ log Labor_it + β₂ log RealEstate_it + β₃ log Machinery_it + β₄ log Fertilizer_it + u_it.

Motivation for adding specific effects (into u_it): e.g., climatic conditions, identical across farms (λ_t).

Table 3.1: Cobb-Douglas production function estimates.

                     (I)             (II)        (III)
Assumption      μ_i = λ_t = 0      μ_i = 0      λ_t = 0
β₁ (Labor)          0.256           0.166        0.043
β₂ (Real estate)    0.135           0.230        0.199
β₃ (Machinery)      0.163           0.261        0.194
β₄ (Fertilizer)     0.349           0.311        0.289
Sum of β's          0.904           0.967        0.726
R²                  0.721           0.813        0.884

3.2.

43

Panel data: in the random-eect context, heteroskedasticity due

to panel data structure.

But variances

2

"2

and

are assumed

constant.

V ar(i) = i2

V ar("i) = i2

E ("it"is) 6= 0

Individual-specic heteroskedasticity

t 6= s

Typical heteroskedasticity

Serial correlation

3.2.1

or

i = 1; 2; : : : ; N;

= E (UU 0) = diag[i2]

(eT e0T ) + diag["2]

IT ;

where

diag["2] is N N .

We have

e e0

eT e0T

T T + diag["2]

IT

T

T

eT e0T

eT e0T

r

2

2

r

2

r

= diag[(T i + " ) ]

+ diag[(" ) ]

IT

:

T

T

Transformation of the heteroskedastic model:

multiply both sides by

"

1=2

"

= diag

2

(T i + "2)1=2

eT e0T

+ IN

IT

T

eT e0T

:

T

44

CHAPTER 3.

yit = yit

"

"

p

T i2 + "2

EXTENSIONS

!#

yi:

is individual-

specic:

i = (T i2 + "2)="2

and

yit = yit

Feasible GLS:

p1 yi:

i

2

2

2

2

Step 2. Noting

that V ar (uit ) = wi = i + " , estimate wi by

PT

1=(T

Compute

^ 2i = w^i2 ^ 2" ;

Form T

^ 2i + ^ 2" , ^i and compute y^it ; x^it;

Regress y

^it on x^it to get ^ .

1)

Step 3.

Step 4.

Step 5.

uit

t=1 (^

1; 2; : : : ; N

3.2.2

requires

T >> N .

w^i2; i =

3.2.2 'Typical' heteroskedasticity

Assumptions: Var(ε_it) = σ²_i, so

Ω = E(UU′) = diag[σ²_μ] ⊗ (e_T e′_T) + diag[σ²_i] ⊗ I_T
  = diag[T σ²_μ + σ²_i] ⊗ (e_T e′_T / T) + diag[σ²_i] ⊗ (I_T − e_T e′_T / T).

The transformed model uses

Ω^(−1/2) = diag[ 1/√(T σ²_μ + σ²_i) ] ⊗ (e_T e′_T / T) + diag[1/σ_i] ⊗ (I_T − e_T e′_T / T),

and Y* = Ω^(−1/2) Y has typical element

y*_it = (y_it − ȳ_i)/σ_i + ȳ_i/√(T σ²_μ + σ²_i) = (y_it − θ_i ȳ_i)/σ_i,

where θ_i = 1 − σ_i/√(T σ²_μ + σ²_i).

Estimation: E(u²_it) = w²_i = σ²_μ + σ²_i for all i, hence OLS residuals û_it can be used to estimate w²_i: ŵ²_i = [1/(T − 1)] Σ_t (û_it − û̄_i)². Within residuals ũ_it are then used to compute σ̂²_i = [1/(T − 1)] Σ_t (ũ_it − ũ̄_i)², so that a consistent estimate of σ²_μ is σ̂²_μ = (1/N) Σ_i (ŵ²_i − σ̂²_i).

3.3 Unbalanced panels

3.3.1 Introduction

Individual i is observed over T_i periods, and the total number of observations is now Σ_i T_i (instead of NT previously); the number of periods varies from one unit (individual) to another.

Examples:

- Consumers: may move, die or refuse to answer anymore;
- Workers: may become unemployed, …

Problem of attrition: the probability of a unit staying in the sample decreases as the number of periods increases.

3.3.2 The one-way unbalanced model

Consider the unbalanced model with T₁ = 3 and T₂ = 2:

(y_11, y_12, y_13, y_21, y_22)′ = (x_11, x_12, x_13, x_21, x_22)′ β
                                + (μ_1, μ_1, μ_1, μ_2, μ_2)′
                                + (ε_11, ε_12, ε_13, ε_21, ε_22)′.

To eliminate the μ_i, use the block-diagonal Within operator

Q = diag( I_3 − e_3 e′_3/3,  I_2 − e_2 e′_2/2 )

  = [  2/3  -1/3  -1/3    0     0
      -1/3   2/3  -1/3    0     0
      -1/3  -1/3   2/3    0     0
        0     0     0    1/2  -1/2
        0     0     0   -1/2   1/2 ],

and in general Q = diag(I_Ti − e_Ti e′_Ti / T_i). The model is then estimated by OLS on the Q-transformed variables, as in the balanced case.
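A minimal numpy sketch on simulated data (illustrative): the block-diagonal Q above is equivalent to demeaning each unit over its own T_i periods.

```python
import numpy as np

rng = np.random.default_rng(6)
Ti = [3, 2, 4, 5]                                 # unbalanced: T_i periods per unit
x_parts, y_parts, g_parts = [], [], []
for i, T in enumerate(Ti):
    mu_i = rng.normal()
    xi = rng.normal(size=T)
    x_parts.append(xi)
    y_parts.append(1.2 * xi + mu_i + 0.1 * rng.normal(size=T))
    g_parts += [i] * T
x, y, g = np.concatenate(x_parts), np.concatenate(y_parts), np.array(g_parts)

# Q = diag(I_Ti - e e'/T_i), built block by block
n = len(x)
Q = np.zeros((n, n))
pos = 0
for T in Ti:
    Q[pos:pos + T, pos:pos + T] = np.eye(T) - np.full((T, T), 1.0 / T)
    pos += T

b_q = (x @ Q @ y) / (x @ Q @ x)

# Same estimator via group demeaning
xd = x - np.array([x[g == gi].mean() for gi in g])
yd = y - np.array([y[g == gi].mean() for gi in g])
b_demean = (xd * yd).sum() / (xd ** 2).sum()
```

Q remains symmetric and idempotent, so the Within algebra of the balanced case carries over unchanged.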

3.3.3 The two-way unbalanced model

Let N_t denote the number of individuals present at time t, and n = Σ_t N_t the total number of observations. For each t, consider the (N_t × N) selection matrix D_t, obtained from I_N by keeping the rows of the individuals observed at t.

Example: observations (y_11, y_21, y_31), (y_12, y_32), (y_13, y_23), i.e. N = 3, T = 3 and n = 3 + 2 + 2 = 7:

D_1 = [ 1 0 0        D_2 = [ 1 0 0       D_3 = [ 1 0 0
        0 1 0                0 0 1 ],            0 1 0 ].
        0 0 1 ],

We thus have T matrices D_t constructed from I_3 above. Let Δ_1 = (D′_1, D′_2, …, D′_T)′, an (n × N) matrix of individual dummies, and Δ_2 = diag(D_t e_N), an (n × T) matrix of time dummies:

Δ_2 = [ D_1 e_N    0       …     0
          0      D_2 e_N   …     0
          ⋮         ⋮       ⋱     ⋮
          0         0      …   D_T e_N ].

The matrix Δ = [Δ_1, Δ_2] is n × (N + T). Then:

- Δ′_1 Δ_1 = diag(T_i) (number of periods of presence of unit i);
- Δ′_2 Δ_2 = diag(N_t) (number of individuals for period t);
- Δ′_2 Δ_1 is a (T × N) matrix of dummy variables for the presence in the sample of unit i at time t;
- Δ′_1 Y stacks the individual sums (here y_11 + y_12 + y_13, y_21 + y_23, y_31 + y_32), and Δ′_2 Y the period sums (y_11 + y_21 + y_31, y_12 + y_32, y_13 + y_23).

In the balanced case, we would have Δ_1 = (e_T ⊗ I_N) and Δ_2 = (I_T ⊗ e_N), and Δ would be NT × (N + T).

Define

Δ_N = Δ′_1 Δ_1    (N × N),      Δ_T = Δ′_2 Δ_2    (T × T),
Δ_NT = Δ′_2 Δ_1   (T × N),      Z = Δ_2 − Δ_1 Δ_N⁻¹ Δ′_NT   (n × T),
P = Δ_T − Δ_NT Δ_N⁻¹ Δ′_NT = Δ′_2 Z   (T × T).

The Within operator for such an unbalanced two-way panel is

Q = I_n − Δ_1 Δ_N⁻¹ Δ′_1 − Z P⁻ Z′,

where P⁻ is a generalized inverse of P, so that

QY = Y − Δ_1 Δ_N⁻¹ Ỹ_1 − Z λ̃,   with Ỹ_1 = Δ′_1 Y and λ̃ = P⁻ Z′ Y.

In the example above,

Δ_N = Δ_T = diag(3, 2, 2),   Δ_NT = [ 1 1 1
                                      1 0 1
                                      1 1 0 ],

P = [  1.6666  -0.8333  -0.8333
      -0.8333   1.1666  -0.3333
      -0.8333  -0.3333   1.1666 ],

and QY follows by applying the formula above, element by element, to the data vector Y.

Chapter 4

Augmented panel data models

What are augmented panel models? What are the implications for estimation? Special estimation techniques when GLS is not feasible.

4.1 Introduction

Consider the model

y_it = x_it β + z_i γ + μ_i + ε_it,

with x_it a 1 × K vector of time- and individual-varying regressors, and z_i a 1 × G vector of individual-specific (time-invariant) regressors.

Estimation method: premultiply by Q. Since BZ = Z (so that QZ = (I − B)Z = 0) and Qμ = 0,

QY = QXβ + (I − B)Zγ + Qμ + Qε = QXβ + Qε.

Only β is identifiable from this Within regression. A two-step method is feasible: obtain β̂ from the Within regression, then run

ȳ_i − x̄_i β̂ = μ_i + z_i γ + ε̄_i,   i = 1, 2, …, N,

to estimate the γ's.

4.2 Choice between Within and GLS

One choice criterion between Within and GLS: the presence of correlation between regressors and individual effects.

Recall: GLS is a consistent and efficient estimator provided the regressors are exogenous:

E(μ_i x_it) = 0   and   E(μ_i z_i) = 0   for all i, t.

Consider the non-augmented model y_it = x_it β + μ_i + ε_it. If x_it is endogenous in the sense E(μ_i x_it) ≠ 0, then GLS is not consistent:

β̂_GLS = β + (X′Ω⁻¹X)⁻¹ X′Ω⁻¹U
       = β + [X′(Q + (1/θ) B) X]⁻¹ [X′(Q + (1/θ) B) U],

where θ = 1 + T σ²_μ/σ²_ε, so that, in expectation,

X′(Q + (1/θ) B) U = X′Qε + X′(Bμ + Bε)/θ = 0 + X′Bμ/θ + 0 = X′μ/θ ≠ 0,

because E(X′ε) = 0 and Bμ = μ.

Within estimates are consistent because μ is filtered out, even when E(X′μ) ≠ 0 and/or E(Z′μ) ≠ 0. Three problems remain:

- γ is not identifiable from the Within regression;
- in the second step, ȳ_i − x̄_i β̂ = z_i γ + μ_i + ε̄_i, z_i may still be correlated with μ_i;
- no distinction is made between exogenous and endogenous x_it's.

4.3 The Hausman test

Null hypothesis: H0: E(X′μ) = E(Z′μ) = 0 (exogeneity).

                 Under H0                     Under the alternative
β̂_GLS           Consistent, efficient         Not consistent
β̂_Within        Consistent, not efficient     Consistent

The test compares the two estimates of the coefficients of the x_it's only:

HT = (β̂_Within − β̂_GLS)′ [Var(β̂_Within) − Var(β̂_GLS)]⁻¹ (β̂_Within − β̂_GLS) ∼ χ²(K) under H0.

Notes:

- The weighting matrix Var(β̂_Within) − Var(β̂_GLS) is positive semi-definite: GLS is efficient under H0. Recall that Var(β̂_Within) = σ²_ε (X′QX)⁻¹.
- The Within estimator is based on the condition E(X′QU) = 0, whereas GLS is based on E(X′Ω⁻¹U) = 0, i.e. E(X′QU) = 0 and E(X′BU) = 0: for GLS, we add orthogonality conditions on the between variation of X.

B ):

rank

4.4.1

Alternative method:

Instrumental-variable estimation.

In the

observations:

W is a N L matrix of instruments.

If K = L,

where

[W 0(Y

If L > K ,

[W 0(Y

X )] = 0

^ = (W 0X ) 1W 0Y

X )] = 0

(W 0Y ) = (W 0X )

(IV estimator)

L conditions on K

parameters)

(Y X )0W (W 0W ) 1W 0

X ) where PW = W (W 0W ) 1W 0

(Y

) ^ = (X 0PW0 X )

Note:

in general, instruments

1 (X 0 P Y ):

W

originate from or outside the

equation.
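A sketch of the over-identified case (illustrative numpy code, simulated data with one endogenous regressor and two outside instruments):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 2000
w1, w2 = rng.normal(size=n), rng.normal(size=n)   # instruments (L = 2)
v = rng.normal(size=n)                            # common shock -> endogeneity
x = w1 + 0.5 * w2 + v + rng.normal(size=n)        # K = 1 endogenous regressor
y = 2.0 * x + v + rng.normal(size=n)              # error correlated with x via v

X = x[:, None]
W = np.column_stack([w1, w2])

# L > K: beta = (X' P_W X)^-1 X' P_W Y with P_W = W (W'W)^-1 W'
PW = W @ np.linalg.solve(W.T @ W, W.T)
b_iv = np.linalg.solve(X.T @ PW @ X, X.T @ PW @ y)[0]

# OLS for comparison: biased here because E(x * error) != 0
b_ols = np.linalg.solve(X.T @ X, X.T @ y)[0]
```

The IV estimate recovers the true slope, while OLS is pushed upward by the correlation between x and the error through v.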

4.4.2 IV in a panel-data context

Two requirements: filter out the individual effects (or account for them through Ω), and find relevant instruments, not correlated with μ.

The Hausman-Taylor setup:

Y = X₁β₁ + X₂β₂ + Z₁γ₁ + Z₂γ₂ + μ + ε,

where
X₁:  N × K₁,  exogenous, varying across i and t;
X₂:  N × K₂,  endogenous, varying across i and t;
Z₁:  N × G₁,  exogenous, varying across i;
Z₂:  N × G₂,  endogenous, varying across i.

To account for the error structure, transform the data: let Y* = σ_ε Ω^(−1/2) Y and X* = σ_ε Ω^(−1/2) X (X here stacking all regressors). We have

β̂_IV = [X*′ P_W X*]⁻¹ [X*′ P_W Y*]
      = [X′ Ω^(−1/2) P_W Ω^(−1/2) X]⁻¹ [X′ Ω^(−1/2) P_W Ω^(−1/2) Y],

with the computation of Ω^(−1/2) as in Chapter 2.

4.4.3 Choice of instruments

Exogeneity assumptions:

E(X′₁ μ) = E(Z′₁ μ) = 0.

Obvious instruments are therefore X₁ and Z₁, but they are not sufficient because K₁ + G₁ < K₁ + K₂ + G₁ + G₂. Additional instruments must not be correlated with μ. Because μ is the source of endogeneity, every variable not correlated with μ is a valid instrument; the best valid instruments are also highly correlated with X₂ and Z₂.

QX₁ and QX₂ are valid instruments: E[(QX₁)′μ] = E[X′₁Qμ] = 0 since Qμ = 0. As for X₁,

E[X′₁ Ω⁻¹ U] = E[X′₁ (Q + (1/θ)B) U] = E[X′₁ B (Q + (1/θ)B) U] + E[X′₁ Q U] = 0,

since BQ = 0, BB = B, and X₁ is exogenous.

Identification condition: we have K₁ + K₂ + G₁ + G₂ parameters to estimate, using K₁ + K₁ + K₂ + G₁ instruments (K₁ + K₂ instruments in QX). The Hausman-Taylor identification condition is K₁ ≥ G₂.

4.4.4 Amemiya-MaCurdy and Breusch-Mizon-Schmidt estimators

If x_it is exogenous, we can use the conditions E(x_it μ_i) = 0 for all i and t, instead of E(x̄′_i μ_i) = 0. Construct X₁* as the matrix whose row for observation (i, t) is the full time path of individual i's exogenous regressors, repeated for every t:

X₁* = [ x_11 x_12 … x_1T    (i = 1, t = 1)
        x_11 x_12 … x_1T    (i = 1, t = 2)
        …
        x_N1 x_N2 … x_NT ]  (i = N, t = T).

The Hausman-Taylor instrument matrix is W = [QX, X₁, Z₁]; an equivalent estimator obtains by using the individual means of X₁.

Amemiya and MaCurdy: their instrument matrix uses X₁* instead, and yields an estimator at least as efficient as with the Hausman-Taylor matrix. We add TK₁ columns to the list of instruments, but since these overlap with columns already present, only (T − 1)K₁ instruments are effectively added. The identification condition is TK₁ ≥ G₂.

Breusch, Mizon and Schmidt: an even more efficient estimator, based on the additional conditions E[(Q_T X₂i)′μ_i] = 0; it is valid when the endogeneity in X₂ originates only in the individual effect (so that the within variation of X₂ is uncorrelated with μ). The instrument matrix adds (QX₂)*, where (QX₁)* and (QX₂)* are constructed as X₁* above. As before, we only add (T − 1)K₂ instruments, as (QX₂)* is not of full rank but of rank (T − 1)K₂. Identification condition: TK₁ + (T − 1)K₂ ≥ G₂.

Identication condition:

for IV estimators

Problem here: endogenous regressors may yield unconsistent estimates of variance components in

, in particular parameter .

Let

M1 = BY

where

BX ^ W = B

BX (X 0 QX ) 1X 0Q Y

= Z
+ + B BX (X 0 QX ) 1X 0Q ";

X = (X1jX2), Z = (Z1jZ2), and
= (
1;
2).

The last

suces to nd instruments for

The IV estimator of

is

Z2 in order to estimate .

62

CHAPTER 4.

(X1; Z1). Using parameter estimates ^ W and ^B , we form residwhere

uals

QX ^ W and u^B = BY

u^W = QY

BX ^ W

Z ^B :

These two vectors of residuals are used to compute variance composants as in standard Feasible GLS.

4.5.1

QX

and

QY .

Step 3. Estimate B by the IV procedure above.

Step 4. Compute 2 and "2 from u^W and u^B , and compute

^ = 1 + T ^ 2 =^ 2" .

(Q + B )Y = yit

(1

)yi.

matrix

W.

4.6 Example: Wage equation

4.6.1 Model specification

log w_it = X_1it β₁ + … + δ ED_i + μ_i + ε_it,

where w is the wage rate, X₁ contains additional variables (industry, occupation status, etc.), and ED is the educational level. The individual effect μ_i proxies the worker's ability (unobserved), union attachment, etc. The parameter of interest is the return to education, ∂w/∂ED.

Is ED endogenous? If ability μ conditions schooling,

ED = G[μ, X₂],

approximated by

ED = G[X₂, Z₂] + V,   where V = G[X₂, μ] − G[X₂, Z₂].

Two problems arise when estimating the first equation while overlooking the second one:

- endogeneity bias (correlation between μ and ED);
- measurement-error bias.

Sample used: Panel Study of Income Dynamics (PSID), University of Michigan; see Baltagi and Khanti-Akom (1990), Cornwell and Rupert (1988). Heads of households (males and females) aged between 18 and 65 in 1976, with a positive wage in private, nonfarm employment for the years 1976 to 1982.

4.7.1

WKS: number of weeks worked in the year;
EXP: working experience in years at the date of the sample;
OCC: dummy, 1 if blue-collar occupation;
IND: dummy, 1 if working in industry;
UNION: dummy, 1 if wage is covered by a union contract.

4.7.2

SOUTH: dummy, 1 if the head resides in the South;
SMSA: dummy, 1 if the head resides in a Standard Metropolitan Statistical Area;
MS: marital status dummy, 1 if head is married;
FEM: dummy, 1 if female;
BLK: dummy, 1 if head is black;
ED: number of years of education attained.

Individual-specific variables: ED, BLK and FEM.

Time-varying variables a priori correlated with the individual effects, X2: (EXP, EXP2, WKS, MS, UNION);
Time-varying variables a priori exogenous, X1: (OCC, SOUTH, SMSA, IND).

Augmented model. Individual-specific variables (the Zi's):
a priori endogenous, Z2: ED;
a priori exogenous, Z1: (BLK, FEM).


Table 4.1: Descriptive statistics.

Variable    Mean      Std. Dev.   Minimum   Maximum
LWAGE        6.6763    0.4615     4.6052     8.5370
EXP         19.8538   10.9664     1.0000    51.0000
WKS         46.8115    5.1291     5.0000    52.0000
OCC          0.5112    0.4999     0.0000     1.0000
IND          0.3954    0.4890     0.0000     1.0000
UNION        0.3640    0.4812     0.0000     1.0000
SOUTH        0.2903    0.4539     0.0000     1.0000
SMSA         0.6538    0.4758     0.0000     1.0000
MS           0.8144    0.3888     0.0000     1.0000
ED          12.8454    2.7880     4.0000    17.0000
FEM          0.1126    0.3161     0.0000     1.0000
BLK          0.0723    0.2590     0.0000     1.0000


Table 4.2: Exogenous regressors only.

Variable    Within               GLS
Constant                         0.0976 (0.0040)
OCC        -0.0696 (0.02323)   -0.0701 (0.02322)
SOUTH      -0.0052 (0.05833)   -0.0072 (0.05807)
SMSA       -0.1287 (0.03295)   -0.1275 (0.03290)
IND         0.0317 (0.02626)    0.0317 (0.02624)

Hausman test: χ²(4) = 0.551

Table 4.3: Endogenous regressors only.

Variable    Within                GLS
Constant                          0.0561 (0.0024)
EXPE        0.1136 (0.002467)    0.1133 (0.002466)
EXPE2      -0.0004 (0.000054)   -0.0004 (0.000054)
WKS         0.0008 (0.0005994)   0.0008 (0.0005994)
MS         -0.0322 (0.01893)    -0.0325 (0.01892)
UNION       0.0301 (0.01480)     0.0300 (0.01479)

Hausman test: χ²(5) = 24.94


Table 4.4: All regressors.

Variable    Within               GLS
Constant                         0.1866 (0.01189)
OCC        -0.0214 (0.01378)   -0.0243 (0.01367)
SOUTH      -0.0018 (0.03429)    0.0048 (0.03188)
SMSA       -0.0424 (0.01942)   -0.0468 (0.01891)
IND         0.0192 (0.01544)    0.0148 (0.01521)
EXPE        0.1132 (0.00247)    0.1084 (0.00243)
EXPE2      -0.0004 (0.00005)   -0.0004 (0.00005)
WKS         0.0008 (0.00059)    0.0008 (0.00059)
MS         -0.0297 (0.01898)   -0.0391 (0.01884)
UNION       0.0327 (0.01492)    0.0375 (0.01472)
FEM                            -0.1666 (0.12646)
BLK                            -0.2639 (0.15413)
ED                              0.1373 (0.01415)

Hausman test: χ²(9) = 495.3

Table 4.5: Hausman-Taylor (HT), Amemiya-MaCurdy (AM) and Breusch-Mizon-Schmidt (BMS) estimates.

Variable    HT                AM                BMS
Constant    0.1772 (0.017)    0.1781 (0.016)    0.1748 (0.016)
OCC        -0.0207 (0.013)   -0.0208 (0.013)   -0.0204 (0.013)
SOUTH       0.0074 (0.031)    0.0072 (0.031)    0.0077 (0.031)
SMSA       -0.0418 (0.018)   -0.0419 (0.018)   -0.0423 (0.018)
IND         0.0135 (0.015)    0.0136 (0.015)    0.0138 (0.015)
EXPE        0.1131 (0.002)    0.1129 (0.002)    0.1127 (0.002)
EXPE2      -0.0004 (0.005)   -0.0004 (0.000)   -0.0004 (0.000)
WKS         0.0008 (0.000)    0.0008 (0.000)    0.0008 (0.000)
MS         -0.0298 (0.018)   -0.0300 (0.018)   -0.0303 (0.018)
UNION       0.0327 (0.014)    0.0324 (0.014)    0.0326 (0.014)
FEM        -0.1309 (0.126)   -0.1320 (0.126)   -0.1337 (0.126)
BLK        -0.2857 (0.155)   -0.2859 (0.155)   -0.2793 (0.155)
ED          0.1379 (0.021)    0.1372 (0.020)    0.1417 (0.020)
Test

Chapter 5

Dynamic panel data models

5.1 Motivation

Usefulness of dynamic panel data models: they capture the adjustment dynamics of the variables of interest. In practice: estimate long-run elasticities and structural parameters from Euler equations.

5.1.1 Dynamic optimization problems

Consider the continuous-time problem

max_{q(0),...,q(T)} E ∫ e^{-rt} π(t) dt,   π(t) = p(t)q(t) - c[q(t), b(t)],   ḃ = G[b(t), q(t)],

where b(t) is the state variable (stock, capital, ...), q(t) is the control variable, r is the discount rate, and G(·) describes the evolution path of the state. In discrete time, with transition b_{t+1} = f(b_t, q_t), the problem is

max_{q_0,...,q_T} E { Σ_{t=0}^T (1+r)^{-t} π_t },

and we use the Bellman equation:

V_t(b_t) = max [ π_t + (1+r)^{-1} E_t V_{t+1}(b_{t+1}) ],

where V_t(b_t) is the value function of the problem at time t, and E_t is the conditional expectation operator at time t.

We use a) the envelope theorem (the evolution path at the optimum depends only on the state variable, as the control variable is already optimized); b) the first-order condition wrt. the control variable:

∂V_t/∂b_t = ∂π_t/∂b_t + (1+r)^{-1} (∂V_{t+1}/∂f)(∂f(b_t,q_t)/∂b_t),   (Envelope theorem)

∂π_t/∂q_t + (1+r)^{-1} (∂V_{t+1}/∂f)(∂f(b_t,q_t)/∂q_t) = 0.   (FOC)

From (FOC):

∂V_{t+1}/∂f = -(1+r) (∂π_t/∂q_t) / (∂f(b_t,q_t)/∂q_t).


Substituting into the envelope condition,

∂V_t/∂b_t = ∂π_t/∂b_t - (∂π_t/∂q_t)(∂f/∂b_t)/(∂f/∂q_t).

Now we lag the (FOC) by one period so that ∂V_t/∂f appears, and combine it with the expression above. Assume the transition function is linear, f(b_t, q_t) = a_1 q_t + a_2 b_t, so that ∂f/∂q_t = a_1 and ∂f/∂b_t = a_2. We have

∂π_t/∂q_t = [(1+r)/a_2] ∂π_{t-1}/∂q_{t-1} + (a_1/a_2) ∂π_t/∂b_t.

This is the Euler equation relating current and past marginal profits.

If, for instance, profit is linear-quadratic, with marginal profits ∂π_t/∂q_t = b_0 + b_1 q_t + b_2 b_t and ∂π_t/∂b_t = c_0 + c_1 q_t + c_2 b_t, the Euler equation becomes

b_0 + b_1 q_t + b_2 b_t = [(1+r)/a_2](b_0 + b_1 q_{t-1} + b_2 b_{t-1}) + (a_1/a_2)(c_0 + c_1 q_t + c_2 b_t),

which can be solved for q_t as a dynamic equation q_t = γ_0 + γ_1 q_{t-1} + γ_2 b_{t-1} + γ_3 b_t, where

γ_0 = (a_2 b_1 - a_1 c_1)^{-1} [b_0((1+r) - a_2) + a_1 c_0],
γ_1 = (a_2 b_1 - a_1 c_1)^{-1} [(1+r) b_1],
γ_2 = (a_2 b_1 - a_1 c_1)^{-1} [(1+r) b_2],
γ_3 = (a_2 b_1 - a_1 c_1)^{-1} [a_1 c_2 - a_2 b_2].

5.1.2 Example: a life-cycle consumption model


Consider a two-period consumption problem with budget constraint

c_t + A_t = y_t + A_{t-1}(1 + r_t),   t = 1, 2,

where c_t is consumption at time t, A_t is total assets, y_t is wage income, and r_t is the interest rate. Intertemporal utility is

U = u(c_1) + [1/(1+δ)] u(c_2),

where δ is the rate of time preference. At the optimum (by replacing the budget constraints in the utility function and optimizing wrt. A_1):

∂U/∂A_1 = (∂u/∂c_1)(∂c_1/∂A_1) + [1/(1+δ)](∂u/∂c_2)(∂c_2/∂A_1) = 0
⇔ ∂u/∂c_1 = [(1+r)/(1+δ)] ∂u/∂c_2.

This is the Euler equation. With isoelastic utility it reads

c_1^{-1/σ} = [(1+r)/(1+δ)] c_2^{-1/σ}.

With quadratic utility u(X) = -(1/2)(γ̄ - X)² and uncertain second-period income,

c_1 = γ̄ - [(1+r)/(1+δ)](γ̄ - E c_2),   so that   c_1 = E c_2 if r = δ.

Consumption then follows a random walk,

c_{t+1} = c_t + ε_{t+1},

where ε_{t+1} is i.i.d.; more general error-correction representations take the form

c_t = β_0 + β_1 y_t + λ_1(c_{t-1} - β_1 y_{t-1}) + λ_2(y_{t-1} - c_{t-1}) + ε_t.

5.1.3

Dynamic models describe the adjustment path of the variable of interest (consumption, capital stock, ...) between y_t and y_{t+1}, towards the equilibrium path y* = δx*/(1-γ).

5.1.3.1 Long-run elasticities

Dynamic models are helpful in computing long-run elasticities. Consider for example the dynamic consumption model

C̃_it = γ C̃_{i,t-1} + δ P̃_it + u_it,

where C̃_{i,t+j} and P̃_{i,t+j} respectively denote the logs of consumption and price. We have

by continuous substitution,

C̃_{i,t+j} = γ^{j+1} C̃_{i,t-1} + δ(P̃_{i,t+j} + γ P̃_{i,t+j-1} + ... + γ^j P̃_{it}) + u*_{i,t+j},

where u*_{i,t+j} = γ^j u_it + γ^{j-1} u_{i,t+1} + ... + γ u_{i,t+j-1} + u_{i,t+j}. Assume we want to compute the cumulated change in consumption following a sustained price change between time t and t+j:

∂C̃_{i,t+j}/∂P̃_it + ∂C̃_{i,t+j}/∂P̃_{i,t+1} + ... + ∂C̃_{i,t+j}/∂P̃_{i,t+j} = δ(γ^j + γ^{j-1} + ... + γ + 1).

The long-run elasticity is then

lim_{j→∞} Σ_{s=0}^{j} ∂C̃_{i,t+j}/∂P̃_{i,t+s} = lim_{j→∞} δ(γ^j + γ^{j-1} + ... + γ + 1) = δ/(1-γ).
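The geometric convergence of the cumulated response to δ/(1-γ) can be checked numerically; the parameter values below are illustrative only.

```python
import numpy as np

# Dynamic model C_t = gamma*C_{t-1} + delta*P_t + u_t: the cumulated
# response of C to a sustained unit change in log-price over j periods is
# delta*(1 + gamma + ... + gamma^j), converging to delta/(1 - gamma).
gamma, delta = 0.8, -0.5          # hypothetical short-run parameters
horizons = np.arange(200)
cumulated = delta * np.cumsum(gamma ** horizons)   # partial sums over j
long_run = delta / (1.0 - gamma)                   # long-run elasticity
```

Here the short-run (impact) elasticity is δ = -0.5, while the long-run elasticity is δ/(1-γ) = -2.5, five times larger.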

Consider the following Cobb-Douglas production model (in logs):

log Q_it = α log N_it + β log K_it + λ_t + (α_i + v_it),

where Q_it is output of firm i at time t, N_it is labor input, K_it is capital, λ_t is an aggregate time effect (technical change), and v_it is a productivity shock having an AR(1) representation, v_it = ρ v_{i,t-1} + ε_it, where ε_it is an i.i.d. innovation. Quasi-differencing (subtracting ρ times the lagged equation) gives

log Q_it = β_1 log N_it + β_2 log N_{i,t-1} + β_3 log K_it + β_4 log K_{i,t-1} + β_5 log Q_{i,t-1} + λ*_t + (α*_i + ω_it),

subject to the common-factor restrictions β_2 = -β_1 β_5 and β_4 = -β_3 β_5 (with β_5 = ρ).

Hence, equivalence between a static (short-run) model with serially-correlated productivity shocks, and a dynamic representation of production output.

Simple dynamic panel-data model:

y_it = γ y_{i,t-1} + α_i + ε_it,   |γ| < 1,

where the initial observations y_i0, i = 1, 2, ..., N are assumed known. We assume E(ε_it) = 0 ∀i,t; E(ε_it ε_js) = σ²_ε if i = j, t = s, and 0 otherwise; E(α_i ε_it) = 0 ∀i,t.

By continuous substitution:

y_it = ε_it + γ ε_{i,t-1} + γ² ε_{i,t-2} + ... + γ^{t-1} ε_i1 + [(1-γ^t)/(1-γ)] α_i + γ^t y_i0.

5.2.1 The within (LSDV) estimator

γ̂ = [Σ_{i=1}^N Σ_{t=1}^T (y_it - ȳ_i)(y_{i,t-1} - ȳ_{i,-1})] / [Σ_{i=1}^N Σ_{t=1}^T (y_{i,t-1} - ȳ_{i,-1})²],
α̂_i = ȳ_i - γ̂ ȳ_{i,-1},

where

ȳ_i = (1/T) Σ_{t=1}^T y_it,   ȳ_{i,-1} = (1/T) Σ_{t=1}^T y_{i,t-1},   ε̄_i = (1/T) Σ_{t=1}^T ε_it.

Also,

γ̂ = γ + [(1/NT) Σ_{i,t} (y_{i,t-1} - ȳ_{i,-1})(ε_it - ε̄_i)] / [(1/NT) Σ_{i,t} (y_{i,t-1} - ȳ_{i,-1})²].

This estimator exists if the denominator converges to a nonzero limit, and is consistent if the numerator converges to 0.

Numerator:

plim_{N→∞} (1/NT) Σ_{i,t} (y_{i,t-1} - ȳ_{i,-1})(ε_it - ε̄_i) = -plim_{N→∞} (1/N) Σ_i ȳ_{i,-1} ε̄_i,

because ε_it is uncorrelated with y_{i,t-1} for the remaining cross-terms. We use

ȳ_{i,-1} = (1/T) Σ_{t=1}^T y_{i,t-1} = [(1-γ^T)/(T(1-γ))] y_i0 + [(T-1) - Tγ + γ^T]/(T(1-γ)²) α_i
         + (1/T)[(1-γ^{T-1})/(1-γ)] ε_i1 + (1/T)[(1-γ^{T-2})/(1-γ)] ε_i2 + ... + (1/T) ε_{i,T-1}.

We then have

plim_{N→∞} (1/N) Σ_i ȳ_{i,-1} ε̄_i = (σ²_ε/T²) · [(T-1) - Tγ + γ^T]/(1-γ)².

In a similar manner, we show that plim (1/NT) Σ_{i,t}(y_{i,t-1} - ȳ_{i,-1})² converges to a finite positive limit, and combining the two,

plim_{N→∞} (γ̂ - γ) = -[(1+γ)/(T-1)] · [1 - (1-γ^T)/(T(1-γ))] / { 1 - [2γ/((1-γ)(T-1))] · [1 - (1-γ^T)/(T(1-γ))] } = O(1/T).

The within estimator of γ is therefore inconsistent when N → ∞ with T fixed: the within transformation (y_it - ȳ_i) = γ(y_{i,t-1} - ȳ_{i,-1}) + (ε_it - ε̄_i) introduces a correlation of order 1/T between the transformed regressor and the transformed error. The bias does not vanish when N is large, and it is sizable when T is small and γ is large.
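The downward (Nickell-type) bias of the within estimator is easy to reproduce by simulation; this is a minimal Monte Carlo sketch with hypothetical parameter values, not part of the original notes.

```python
import numpy as np

# Within (LSDV) bias in y_it = gamma*y_{i,t-1} + alpha_i + eps_it:
# for fixed T the estimator is biased downward, with bias O(1/T),
# even as N grows.
rng = np.random.default_rng(42)

def within_gamma(N, T, gamma):
    y = np.zeros((N, T + 1))
    alpha = rng.normal(size=N)
    for t in range(1, T + 1):
        y[:, t] = gamma * y[:, t - 1] + alpha + rng.normal(size=N)
    ylag, ycur = y[:, :-1], y[:, 1:]
    # within transformation: subtract individual means
    ylag_d = ylag - ylag.mean(axis=1, keepdims=True)
    ycur_d = ycur - ycur.mean(axis=1, keepdims=True)
    return (ylag_d * ycur_d).sum() / (ylag_d ** 2).sum()

gamma = 0.5
bias_T5 = within_gamma(3000, 5, gamma) - gamma    # short panel
bias_T40 = within_gamma(3000, 40, gamma) - gamma  # long panel
# bias_T5 is markedly negative; bias_T40 is much smaller in magnitude
```

Increasing N leaves the small-T bias essentially unchanged, while increasing T shrinks it, consistent with the O(1/T) result above.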


Table 5.1: Asymptotic bias of the within estimator, plim_{N→∞}(γ̂ - γ).

  γ     T     Bias      Percent
 0.2    6   -0.2063   -103.1693
        8   -0.1539    -76.9597
       10   -0.1226    -61.3139
       20   -0.0607    -30.3541
       40   -0.0302    -15.0913
 0.5    6   -0.2756    -55.1282
        8   -0.2049    -40.9769
       10   -0.1622    -32.4421
       20   -0.0785    -15.6977
       40   -0.0384     -7.6819
 0.7    6   -0.3307    -47.2392
        8   -0.2479    -35.4084
       10   -0.1966    -28.0912
       20   -0.0938    -13.3955
       40   -0.0449     -6.4114
 0.9    6   -0.3939    -43.7633
        8   -0.3017    -33.5179
       10   -0.2432    -27.0248
       20   -0.1196    -13.2934
       40   -0.0563     -6.2561


5.2.2 Instrumental-variable estimation

The within estimator is biased when T is fixed (small). Instead, first-difference the model to eliminate α_i:

y_it - y_{i,t-1} = γ(y_{i,t-1} - y_{i,t-2}) + (ε_it - ε_{i,t-1}).

In the model above, y_{i,t-1} is correlated by construction with ε_{i,t-1}! We need instruments that are uncorrelated with (ε_it - ε_{i,t-1}) but correlated with (y_{i,t-1} - y_{i,t-2}). The only possibility in a single-equation framework with no other explanatory variables: use lagged values of the dependent variable, since y_{i,t-2} is a function of ε_{i,t-2}, ..., ε_i1, α_i, y_i0 only. As lagged instruments, we can use either y_{i,t-2} or (y_{i,t-2} - y_{i,t-3}):

E[y_{i,t-2}(ε_it - ε_{i,t-1})] = E(ε_{i,t-2}ε_it) - E(ε_{i,t-2}ε_{i,t-1}) = 0,
E[(y_{i,t-2} - y_{i,t-3})(ε_it - ε_{i,t-1})] = E[ε_{i,t-2}(ε_it - ε_{i,t-1})] - E[ε_{i,t-3}(ε_it - ε_{i,t-1})] = 0,
E[y_{i,t-2}(y_{i,t-1} - y_{i,t-2})] = -E(ε²_{i,t-2}) = -σ²_ε ≠ 0,
E[(y_{i,t-2} - y_{i,t-3})(y_{i,t-1} - y_{i,t-2})] = -E(ε²_{i,t-2}) = -σ²_ε ≠ 0.

Instrumental-variable estimators that are consistent when N and/or T → ∞:

γ̂ = [Σ_{i=1}^N Σ_{t=3}^T (y_it - y_{i,t-1})(y_{i,t-2} - y_{i,t-3})] / [Σ_{i=1}^N Σ_{t=3}^T (y_{i,t-1} - y_{i,t-2})(y_{i,t-2} - y_{i,t-3})],

or

γ̂ = [Σ_{i=1}^N Σ_{t=3}^T (y_it - y_{i,t-1}) y_{i,t-2}] / [Σ_{i=1}^N Σ_{t=3}^T (y_{i,t-1} - y_{i,t-2}) y_{i,t-2}],

because the instruments are uncorrelated with current and future values of ε. Conclusion: IV on the differenced model yields consistent estimates even though the within estimator does not, because the within (Q) operator introduces errors ε̄_i correlated by construction with the transformed lagged regressor.
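The level-instrument version of this estimator can be sketched in a few lines; the data below are simulated and the implementation is a minimal illustration of the idea (Anderson-Hsiao type), not a full dynamic-panel package.

```python
import numpy as np

# First-difference the model to remove alpha_i, then instrument
# Delta y_{i,t-1} with the level y_{i,t-2}, which is uncorrelated
# with (eps_it - eps_{i,t-1}).
rng = np.random.default_rng(1)
N, T, gamma = 5000, 8, 0.5
y = np.zeros((N, T + 1))
alpha = rng.normal(size=N)
for t in range(1, T + 1):
    y[:, t] = gamma * y[:, t - 1] + alpha + rng.normal(size=N)

dy = np.diff(y, axis=1)          # dy[:, k] = y_{i,k+1} - y_{i,k}
num = den = 0.0
for k in range(1, T):            # equation: dy_{k+1} = gamma*dy_k + d eps
    z = y[:, k - 1]              # level instrument, dated two periods back
    num += (z * dy[:, k]).sum()
    den += (z * dy[:, k - 1]).sum()
gamma_iv = num / den
# gamma_iv is consistent for gamma as N grows, unlike the within estimator
```

The instrument is valid because y_{i,k-1} depends only on shocks dated k-1 and earlier, while the differenced error involves shocks dated k and k+1.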

IV estimation proceeds as follows.

Step 1. Take first differences of the model with exogenous regressors,

(y_it - y_{i,t-1}) = γ(y_{i,t-1} - y_{i,t-2}) + β(x_it - x_{i,t-1}) + ε_it - ε_{i,t-1},

and estimate γ, β with the IV procedure, using lagged y's and (x_it - x_{i,t-1}) as instruments.

Step 2. Substitute γ̂ and β̂ into the individual-means equation

ȳ_i - γ̂ ȳ_{i,-1} - x̄_i β̂ = z_i δ + α_i + ε̄_i,   i = 1, 2, ..., N,

and estimate δ by OLS.

Step 3. Estimate the variance components from the residuals:

σ̂²_α = (1/N) Σ_{i=1}^N [ȳ_i - γ̂ ȳ_{i,-1} - z_i δ̂ - x̄_i β̂]² - (1/T) σ̂²_ε.


The resulting estimators of σ²_α and σ²_ε are consistent only when T → ∞, but inconsistent when T is fixed and N → ∞.

We now treat y_{i,t-1} as if it were uncorrelated with the composite error, i.e. we consider the OLS estimator:

γ̂ = [Σ_{i=1}^N Σ_{t=1}^T y_it y_{i,t-1}] / [Σ_{i=1}^N Σ_{t=1}^T y²_{i,t-1}]
   = γ + [Σ_{i=1}^N Σ_{t=1}^T (α_i + ε_it) y_{i,t-1}] / [Σ_{i=1}^N Σ_{t=1}^T y²_{i,t-1}].

We show that

plim_{N→∞} (1/NT) Σ_{i,t} (α_i + ε_it) y_{i,t-1} = [(1-γ^T)/(T(1-γ))] Cov(y_i0, α_i) + σ²_α [(T-1) - Tγ + γ^T]/(T(1-γ)²) + ... ,

and

plim_{N→∞} (1/NT) Σ_{i,t} y²_{i,t-1} = [(1-γ^{2T})/(T(1-γ²))] plim (1/N) Σ_i y²_i0 + ... > 0,

where the omitted terms are functions of γ, T, σ²_α, σ²_ε and Cov(y_i0, α_i). As both limits are positive when α_i and y_{i,t-1} are positively correlated, OLS transfers part of the variance of the ignored individual effect to the autoregressive parameter, so the OLS estimator of γ is biased upward.

The asymptotic bias thus depends on the initial conditions y_i0 (constant, or generated by the same process as y_it).

5.3.2 An equivalent representation

Consider the model y_it = γ y_{i,t-1} + β x_it + δ z_i + α_i + ε_it with the following assumptions:

E(α_i x_it) = 0,  E(α_i z_i) = 0,  E(α_i ε_it) = 0,
E(α_i α_j) = σ²_α if i = j, 0 otherwise,
E(ε_it ε_js) = σ²_ε if i = j, t = s, 0 otherwise.

We can also write

y_it = w_it + η_i,

where η_i = α_i/(1-γ), E η_i = 0, Var(η_i) = σ²_η = σ²_α/(1-γ)², and the dynamic process w_it is free of the individual effect η_i.


5.3.3 Initial conditions

(A)  w_it = γ w_{i,t-1} + β x_it + δ z_i + ε_it,   y_it = w_it + η_i;
(B)  y_it = γ y_{i,t-1} + β x_it + δ z_i + α_i + ε_it.

In model (B), y_it is driven by the unobserved characteristics α_i inside the dynamics itself. In model (A), conditional on the exogenous x_it and z_i, the w_it are driven by identical processes with i.i.d. shocks ε_it, but the observed value y_it is shifted by the individual-specific effect η_i. Possible interpretation: w_it is a latent variable, y_it is observed.

Assumptions (or knowledge) on initial conditions may help to distinguish between both processes. Different cases:

1/ y_i0 fixed;
2/ y_i0 random;
  2.a/ y_i0 independent of α_i, with E(y_i0) = μ_y0 and Var(y_i0) = σ²_y0;
  2.b/ y_i0 correlated with α_i;
3/ w_i0 fixed;
4/ w_i0 random;
  4.a/ w_i0 random with common mean μ_w and variance σ²_ε/(1-γ²) (stationarity assumption);
  4.b/ w_i0 random with mean μ_w and arbitrary variance σ²_w0;
  4.c/ w_i0 random with mean μ_i0 and variance σ²_ε/(1-γ²) (stationarity assumption);
  4.d/ w_i0 random with mean μ_i0 and arbitrary variance σ²_w0.

See Appendix 4 for a derivation of Maximum Likelihood estimators in each case.

5.3.4 GLS estimation

When σ²_α and σ²_ε are known, the variance-covariance matrix can be inverted and yields the GLS estimator. When σ²_α and σ²_ε are unknown, a feasible GLS uses consistent estimators of the variance components. When T → ∞, GLS converges to the within estimator. When N → ∞ and T is fixed, GLS is inconsistent for the same reason as OLS: y_{i,t-1} is correlated with the individual effects.

5.3.5 Example: demand for natural gas

Balestra and Nerlove (1966) model the demand for natural gas in the US, including a/ the demand due to replacement of gas appliances, and b/ the demand due to increases in the stock of appliances.


Table 5.2: Properties of the MLE for dynamic panel data models

Case / Parameters                              N fixed, T → ∞   T fixed, N → ∞
Case 1: y_i0 fixed
  γ, β, σ²_ε                                   Consistent       Consistent
  δ, σ²_α                                      Inconsistent     Consistent
Case 2.a: y_i0 random, ind. of α_i
  γ, β, σ²_ε                                   Consistent       Consistent
  μ_y0, δ, σ²_α, σ²_y0                         Inconsistent     Consistent
Case 2.b: y_i0 random, correlated with α_i
  γ, β, σ²_ε                                   Consistent       Consistent
  μ_y0, δ, σ²_α, σ²_y0, Cov(y_i0, α_i)         Inconsistent     Consistent
Case 3: w_i0 fixed
  γ, β, σ²_ε                                   Consistent       Inconsistent
  w_i0, δ, σ²_α                                Inconsistent     Inconsistent
Case 4.a: w_i0 random, mean μ_w, variance σ²_ε/(1-γ²)
  γ, β, σ²_ε                                   Consistent       Consistent
  μ_w, δ, σ²_α                                 Inconsistent     Consistent
Case 4.b: w_i0 random, mean μ_w, variance σ²_w
  γ, β, σ²_ε                                   Consistent       Consistent
  μ_w, δ, σ²_α, σ²_w                           Inconsistent     Consistent
Case 4.c: w_i0 random, mean μ_i0, variance σ²_ε/(1-γ²)
  γ, β, σ²_ε                                   Consistent       Inconsistent
  μ_i0, δ, σ²_α                                Inconsistent     Inconsistent
Case 4.d: w_i0 random, mean μ_i0, variance σ²_w
  γ, β, σ²_ε                                   Consistent       Inconsistent
  μ_i0, δ, σ²_w                                Inconsistent     Inconsistent


Demand system:

G*_it = b_0 + b_1 P_it + b_2 F*_it,
F_it = a_0 + a_1 N_it + a_2 I_it,
F*_it = F_it - (1-r) F_{i,t-1},

where G*_it and G_it are respectively the new demand and the actual demand for gas at time t from unit i, r is the appliances depreciation rate, F*_it and F_it are respectively the new and actual demand for all types of fuel, N_it is total population, I_it is per-head income, and P_it is relative price of gas. Substituting, and using G*_it = G_it - (1-r)G_{i,t-1}, yields the estimable dynamic equation

G_it = β_0 + β_1 P_it + β_2 ΔN_it + β_3 N_{i,t-1} + β_4 I_it + β_5 I_{i,t-1} + β_6 G_{i,t-1},

where ΔN_it = N_it - N_{i,t-1}, β_5 is the coefficient on I_{i,t-1}, and β_6 = 1 - r. In accordance with the theory, the OLS estimate of the autoregressive parameter (here, β_6) is biased upward; the results below rest on the assumption that initial conditions are fixed.


Parameter          OLS                  Within               GLS
β0 (Intercept)     -3.650   (3.316)                          -4.091   (11.544)
β1 (P_it)          -0.0451* (0.027)     -0.2026   (0.0532)   -0.0879* (0.0468)
β2 (ΔN_it)          0.0174* (0.0093)    -0.0135   (0.0215)   -0.00122 (0.0190)
β3 (N_{i,t-1})      0.00111** (0.00041)  0.0327** (0.0046)    0.00360** (0.00129)
β4 (I_it)           0.0183** (0.0080)    0.0131   (0.0084)    0.0170** (0.0080)
β5 (I_{i,t-1})      0.00326 (0.00197)    0.0044   (0.0101)    0.00354 (0.00622)
β6 (G_{i,t-1})      1.010** (0.014)      0.6799** (0.0633)    0.9546** (0.0372)

Notes. N = 36, T = 11. Standard errors are in parentheses. (*) and (**): parameter significant at the 10% and 5% level respectively.


Part II

Generalized Method of Moments estimation

Chapter 6

The GMM estimator

The Generalized Method of Moments is an efficient way to obtain consistent parameter estimates under mild conditions on the model. It is very popular for estimating structural economic models, as it requires far fewer conditions on the model disturbances than Maximum Likelihood. Another important advantage: it is easy to obtain parameter estimates that are robust to heteroskedasticity of unknown form.

6.1.1 Moment conditions

Let f(x_i, θ) be a q × 1 vector of functions of the data x_i, from which one wishes to estimate a p × 1 parameter vector θ whose true value is θ_0. The notation is very general: x_i will typically include both dependent and explanatory variables. The moment conditions are

E[f(x_i, θ_0)] = 0.

6.1.2 Example: the linear model

y_i = x_i'β_0 + u_i,   i = 1, 2, ..., N,

where β_0 is the true parameter value and u_i is the error term. A common assumption is E(u_i|x_i) = 0, and hence E(x_i u_i) = 0. In terms of the definition above,

f[(x_i, y_i); β] = x_i(y_i - x_i'β),   so that   E(x_i u_i) = E[x_i(y_i - x_i'β_0)] = 0.

Note that here p = q: there are as many moment conditions as parameters to estimate. Alternatively, with a q × 1 vector of instruments z_i such that E(z_i u_i) = 0, the moment function is

f[(x_i, y_i, z_i); β] = z_i(y_i - x_i'β),

with q ≥ p: q moment conditions and p parameters to estimate.

6.1.3 Example: moments of a distribution

A sample x_1, ..., x_N is drawn from a distribution with parameters (a, b) with true values a_0 and b_0 (mean and standard deviation). The relationship between the parameters and the data is

E(x_i) = a_0,   E[x_i - E(x_i)]² = b²_0.

In our notation in the definition above, θ = (a, b) and

f(x_i; θ) = [ x_i - a ;  (x_i - a)² - b² ],

so that E[f(x_i; θ_0)] = 0.

6.1.4 The method of moments

6.1.4

E [xi

case where p = q (as many conditions as parameters), we could

solve E [f (xi; 0 )] = 0 for 0 . But E [f (:)] is unknown, whereas

function values f (xi; ) can be computed 8; 8i. Also, sample

moments of function f (:) can be computed:

How to estimate

N

1X

fN () =

f (x ; ):

N i=1 i

E (f ) close to

fN (population moments close to empirical moments), then ^N is

a convenient estimate for 0 , where f (^

N ) = 0.

0 = E [f (0)] fN (^N ) ) 0 ^N :

estimation to be valid: a)

E (f )

is adequately approximated by

94

Example: linear regression.

Sample moment conditions are

N

N

1X

1X

x u^ =

x (y

N i=1 i i N i=1 i i

and solving for

^ N

yields

^ N =

6.1.5

xi^ N ) = 0;

N

X

i=1

xix0i

! 1 N

X

i=1

xiyi:

etc.). Restriction: Mean of distribution is equal to the variance.

Assumption:

dependent variables

y1; y2; : : : ; yN

are distributed

1; 2; : : : ; N

respectively.

We assume the

i's

ri

r!

linear relationship:

log i = 0 +

p

X

j =1

j xij :

L=

Ni=1

exp(

yi i

i)

yi!

"

= exp

N

X

i=1

i + 0

N

X

i=1

yi

6.1.

p

X

j =1

N

X

i=1

xij yi

1

Ni=1yi!

95

T0 =

N

X

i=1

yi

Tj =

N

X

i=1

xij yi

j = 1; : : : ; p;

@i

= i

@0

If we set derivatives of

T0 =

N

X

i=1

^ i

and

@i

= xij i:

@j

Tj =

N

X

i=1

xij ^i

j = 1; : : : ; p

P

^i = exp(^ 0 + pj=1 ^ j xij ): Hence, we match sample moPN

Pp

^

ments T0 and Tj to theoretical moments

exp(

+

^ j xij )

0

i

=1

j

=1

PN

^ Pp ^

and Tj =

i=1 xij exp( 0 + j =1 j xij ) respectively.

We have p + 1 such matching conditions for p + 1 parameters.
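The p + 1 matching conditions can be solved numerically; this is a minimal sketch using Newton-Raphson on simulated (hypothetical) data, where the matching conditions coincide with the Poisson score equations.

```python
import numpy as np

# Solve T_j = sum_i x_ij * lambda_hat_i, lambda_i = exp(x_i'beta),
# by Newton-Raphson.
rng = np.random.default_rng(7)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + 2 covariates
beta_true = np.array([0.5, 0.3, -0.2])                      # hypothetical values
y = rng.poisson(np.exp(X @ beta_true))

beta = np.zeros(3)
for _ in range(50):
    lam = np.exp(X @ beta)
    score = X.T @ (y - lam)            # T_j - sum_i x_ij*lambda_i, j = 0..p
    hess = X.T @ (X * lam[:, None])    # Jacobian of the matching conditions
    step = np.linalg.solve(hess, score)
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:
        break
# at convergence the sample moments X'y and X'lambda(beta) coincide
```

At the solution the fitted moments reproduce the sufficient statistics exactly, which is the method-of-moments reading of Poisson maximum likelihood.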

6.1.6 Comments

The Method of Moments connects moment conditions and the usual estimation criteria. For Maximum Likelihood and Least Squares, we maximize (minimize) a criterion, e.g.

β̂ = arg min (1/N) Σ_i [y_i - f(x_i, β)]²   (LS).

We could also consider minimizing the IV criterion wrt. β:

β̂ = arg min (Y - Xβ)'Z(Z'Z)⁻¹Z'(Y - Xβ),

or start from the sample moment conditions

(1/N) Σ_i z_i û_i = (1/N) Σ_i z_i (y_i - x_i'β̂) = 0,

which, when Z and X have the same dimension, solve to

β̂ = ( Σ_i z_i'x_i )⁻¹ Σ_i z_i'y_i = (Z'X)⁻¹Z'Y;

or start from the ML first-order condition

(1/N) Σ_{i=1}^N ∂ log L(θ)/∂θ |_{θ=θ̂} = 0.

In each case we must ensure that we can replace population moments by sample moments, for the Method of Moments to work.

Ensure that we can replace population moments by sample moments, for the Method of Moments to work.

conditions than parameters) ?

choice of instruments) ?

valid

6.2.

97

6.2.1

Introduction

unknown parameters, therefore we cannot nd a vector ^

N satisfying fN ( ) = 0.

by dening

AN

0(1).

Important note: for the just-identied case, QN ( ) = 0

fN () = 0, but in the over-identied case, QN () > 0.

where

because

This fact is important for model checking (we will come to this

point later in the course).

6.2.2

Consider

ments), and

rank(W 0X ) = p.

Solving for

we have

are instru-

^ = (W 0X ) 1(W 0Y )

u(^ )0PW0 u(^ ) = Y

X (W 0X ) 1(W 0Y ) 0 W

(W 0W ) 1W 0

Y X (W 0X ) 1(W 0Y )

= Y 0PW Y + (W 0Y )0(W 0X ) 1X 0 PW X (W 0X ) 1(W 0Y )

98

= Y 0PW Y + (Y 0W )(W 0X ) 1(X 0W )(W 0W ) 1(W 0X )(W 0X ) 1

(W 0Y ) (Y 0W )(W 0X ) 1(X 0W )(W 0W ) 1(W 0Y )

(Y 0W )(W 0W ) 1(W 0X )(W 0X ) 1(W 0Y )

0

1 = (X 0 W ) 1 :

and because (W X )

u(^ )0PW0 u(^ ) = 2Y 0PW Y 2Y 0PW Y = 0:

6.2.3 A definition

Definition 1. Let x_i, i = 1, ..., N, denote the data, from which we wish to estimate a p × 1 vector of parameters θ whose true value is θ_0. Let E[f(x_i, θ_0)] = 0 be a set of q moment conditions, and f_N(θ) the corresponding set of sample moments. Define the criterion

Q_N(θ) = f_N(θ)'A_N f_N(θ),

where A_N is a stochastic, positive definite O(1) matrix. The GMM estimator of θ is

θ̂_N = arg min Q_N(θ).

6.2.4 Example: the linear model with q > p instruments

With q > p instruments satisfying E(z_i u_i) = E(z_i(y_i - x_i'β_0)) = 0, the sample moments are

f_N(β) = (1/N) Σ_{i=1}^N z_i(y_i - x_i'β) = (1/N)(Z'Y - Z'Xβ).

Take as weighting matrix

A_N = [ (1/N) Σ_{i=1}^N z_i'z_i ]⁻¹ = N(Z'Z)⁻¹,

assumed to converge (as N → ∞) to a constant matrix, so that

Q_N(β) = (1/N)(Z'Y - Z'Xβ)'(Z'Z)⁻¹(Z'Y - Z'Xβ).

Differentiating wrt. β gives the first-order conditions

∂Q_N(β)/∂β |_{β̂_N} = -(2/N) X'Z(Z'Z)⁻¹(Z'Y - Z'Xβ̂_N) = 0.

Solving for β̂_N, we have

β̂_N = [X'Z(Z'Z)⁻¹Z'X]⁻¹ X'Z(Z'Z)⁻¹Z'Y.

This expression is the IV (2SLS) formulation for the case where there are more instruments than parameters.
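The closed form above is easy to verify on simulated data; the design below (one endogenous regressor, two instruments) is purely illustrative.

```python
import numpy as np

# Over-identified IV/GMM estimator with A_N = N (Z'Z)^{-1}, i.e. 2SLS.
rng = np.random.default_rng(3)
N = 5000
Z = rng.normal(size=(N, 2))
u = rng.normal(size=N)
x = Z @ np.array([1.0, 0.5]) + 0.8 * u + rng.normal(size=N)  # corr(x,u) != 0
y = 2.0 * x + u
X = x[:, None]

W = np.linalg.inv(Z.T @ Z)
beta_gmm = np.linalg.solve(X.T @ Z @ W @ Z.T @ X, X.T @ Z @ W @ Z.T @ y)
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
# beta_gmm is consistent for 2.0; beta_ols is biased upward by endogeneity
```

Because x is correlated with u, OLS is inconsistent here, while the GMM/2SLS estimator exploits only the orthogonality of Z and u.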

We examine here key properties that any useful estimator should

verify: consistency (convergence to the true parameter value as

the sample size gets large) and asymptotic normality (to be able

to use the asymptotic distribution for statistical inference).


6.3.1 Consistency

Assumption set 1:

(i) Let g_i(θ) = E[f(x_i, θ)]. There exists θ_0 such that g_i(θ) = 0 ∀i ⇔ θ = θ_0 (identification).
(ii) For each component j = 1, 2, ..., q, f_Nj(θ) - g_Nj(θ) → 0 in probability, uniformly ∀θ ∈ Θ.
(iii)-(iv) There exists a non-random sequence of positive definite matrices Ā_N such that A_N - Ā_N → 0 in probability.

Theorem 1. Under (i)-(iv), θ̂_N is weakly consistent.

Note: uniform convergence in θ is stronger than pointwise convergence in probability on Θ. It means that sup_{θ∈Θ} |f_Nj(θ) - g_Nj(θ)| → 0 for j = 1, 2, ..., q, so that it remains true that f_Nj(θ_N) - g_Nj(θ_N) → 0 when θ_N is a sequence of points in Θ as N increases.

Sketch of proof. From (iii) and (iv), we can form a non-random sequence Q̄_N(θ) = g_N(θ)'Ā_N g_N(θ) such that Q_N(θ) - Q̄_N(θ) → 0 uniformly for θ ∈ Θ. From (i) and (ii), we have that Q̄_N(θ) = 0 ⇔ θ = θ_0, and Q̄_N(θ) > 0 otherwise. Therefore, since θ̂_N minimizes Q_N(θ), θ_0 minimizes Q̄_N(θ), and Q_N(θ) - Q̄_N(θ) → 0, this implies that θ̂_N → θ_0 in probability.

6.3.2 Asymptotic normality

Assumption set 2:

(v) The function f_N(θ) is continuously differentiable in a neighborhood of θ_0.
(vi) Let F_N(θ) = ∂f_N(θ)/∂θ. For any sequence θ*_N → θ_0 in probability,

F_N(θ*_N) - F̄_N → 0 in probability,

where F̄_N is a sequence of q × p non-random matrices of full column rank on Θ.
(vii) A central limit theorem applies to the sample moments:

V_N^{-1/2} √N f_N(θ_0) →d N(0, I_q),

where V_N = N Var[f_N(θ_0)] is a sequence of q × q non-random, positive definite matrices.

Theorem 2. θ̂_N has the following asymptotic distribution: √N(θ̂_N - θ_0) ~ N(0, Σ), where Σ is the p × p matrix

Σ = [F_N(θ̂_N)'A_N F_N(θ̂_N)]⁻¹ F_N(θ̂_N)'A_N V_N A_N F_N(θ̂_N) [F_N(θ̂_N)'A_N F_N(θ̂_N)]⁻¹.

Proof: a mean-value expansion of the sample moments around θ_0 gives

f_N(θ̂_N) = f_N(θ_0) + F_N(θ*_N)(θ̂_N - θ_0),   θ*_N ∈ [θ̂_N, θ_0].

Since θ̂_N is a consistent estimator (proved above), we know that θ*_N → θ_0. Premultiplying the expansion by F_N(θ̂_N)'A_N and using the first-order condition F_N(θ̂_N)'A_N f_N(θ̂_N) = 0:

√N(θ̂_N - θ_0) = -[F_N(θ̂_N)'A_N F_N(θ*_N)]⁻¹ F_N(θ̂_N)'A_N V_N^{1/2} · V_N^{-1/2}√N f_N(θ_0),

where V_N^{-1/2}√N f_N(θ_0) is N(0, I_q). Therefore E[√N(θ̂_N - θ_0)] = 0 and Var[√N(θ̂_N - θ_0)] = Σ. Note that F_N is q × p, therefore the variance-covariance matrix of the GMM estimator is p × p.

Optimality of GMM: what is the best weighting matrix A_N, i.e.

A_N^opt = arg min_{A_N} (F_N'A_N F_N)⁻¹ F_N'A_N V_N A_N F_N (F_N'A_N F_N)⁻¹ ?

Lemma 3. The matrix

(F_N'A_N F_N)⁻¹ F_N'A_N V_N A_N F_N (F_N'A_N F_N)⁻¹ - (F_N'V_N⁻¹F_N)⁻¹

is positive semi-definite. If we select A_N = V_N⁻¹, we get

(F_N'A_N F_N)⁻¹ F_N'A_N V_N A_N F_N (F_N'A_N F_N)⁻¹ - (F_N'V_N⁻¹F_N)⁻¹ = 0.

Hence, the best weighting matrix for GMM is the inverse of the variance-covariance matrix of the moment conditions. For this choice, the variance of GMM is simply

Σ = [F_N(θ̂_N)' V_N⁻¹ F_N(θ̂_N)]⁻¹,

and this defines the optimal GMM. But in general no condition is imposed on the distribution of the disturbances that would deliver V_N in closed form.

Solution: use a two-step estimation procedure.

Step 1. Find θ̂_1N using an initial matrix A_1N.
Step 2. Compute V̂_N from u(θ̂_1N) and find θ̂_2N such that

θ̂_2N = arg min u(θ)'Z (V̂_N)⁻¹ Z'u(θ).
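The two-step procedure can be sketched for the linear IV model with a heteroskedasticity-robust weight matrix; all data and parameter values below are simulated for illustration.

```python
import numpy as np

# Step 1 uses A_N = (Z'Z/N)^{-1}; step 2 reweights with the inverse of
# the robust moment covariance V_hat = (1/N) sum_i u_i^2 z_i z_i'.
rng = np.random.default_rng(5)
N = 4000
Z = rng.normal(size=(N, 3))
u = rng.normal(size=N) * (1.0 + 0.5 * Z[:, 0] ** 2)         # heteroskedastic
x = Z[:, 0] + 0.5 * Z[:, 1] + 0.5 * u + rng.normal(size=N)  # endogenous
y = 1.5 * x + u
X = x[:, None]

def gmm(X, Z, y, W):
    """Linear GMM: (G'WG)^{-1} G'W Z'y with G = Z'X."""
    G = Z.T @ X
    return np.linalg.solve(G.T @ W @ G, G.T @ W @ (Z.T @ y))

b1 = gmm(X, Z, y, np.linalg.inv(Z.T @ Z / N))    # step 1 (2SLS)
u1 = y - X @ b1
V_hat = Z.T @ (Z * (u1 ** 2)[:, None]) / N       # robust moment covariance
b2 = gmm(X, Z, y, np.linalg.inv(V_hat))          # step 2 (efficient GMM)
```

Both steps are consistent; the second step is asymptotically efficient within the class of estimators based on these moment conditions.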


Two alternative solutions:

Method 1 (iterated GMM): iterate the two steps, successively replacing θ̂_N and A_N until convergence.

Method 2 (continuously-updated GMM): acknowledge that A_N depends on θ, and solve

θ̂_N = arg min Q_N(θ) = f_N(θ)'A_N(θ) f_N(θ).

In practice, the construction of the variance-covariance matrix depends on the nature of the data: cross-sections, time series, or panel data (see the dedicated section below).

Advantage of GMM over many alternative estimation procedures: it is easy to provide statistical inference on model validity. In general, we will test for the validity of the moment conditions, also denoted orthogonality conditions.

Let f_N(θ) = (1/N) Σ_i f(x_i, θ), and let V_N be a consistent estimator of V = lim_{N→∞} Var[√N f_N(θ_0)]. The first-order condition associated with the minimization of Q_N(θ) is

∂Q_N(θ̂_N)/∂θ = F_N(θ̂_N)'V_N⁻¹ f_N(θ̂_N) = 0,

where F_N(θ̂_N) = ∂f_N(θ̂_N)/∂θ. If θ̂_N satisfies the FOC above, it must also satisfy

P̂ V_N^{-1/2} f_N(θ̂_N) = 0,   where M̂ = V_N^{-1/2} F_N(θ̂_N) and P̂ = M̂(M̂'M̂)⁻¹M̂',

so that the population analogue of the identifying restrictions is

P V^{-1/2} E[f(θ_0)] = 0,   where P = M(M'M)⁻¹M',   M = V^{-1/2} E[F_i(θ_0)],

and F_i(θ) = ∂f(x_i, θ)/∂θ. If M is of rank p, the projection matrix P sets only p linear combinations of the q × 1 vector E[f(x_i, θ_0)] to 0.

The identifying restrictions determine the asymptotic distribution of θ̂_N:

√N(θ̂_N - θ_0) = -(M'M)⁻¹M'V^{-1/2} √N f_N(θ_0) + o_p(1),

which implies √N(θ̂_N - θ_0) →d N(0, (M'M)⁻¹).

The basic way of testing for model validity is to use the over-identifying restrictions, (I_q - P)V^{-1/2}E[f(x_i, θ_0)] = 0, of which there are q - p. Q_N(θ̂_N) measures the extent to which the data satisfy the over-identifying restrictions. The asymptotic distribution of the sample moments is determined by the function of the data in the over-identifying restrictions:

√N V_N^{-1/2} f_N(θ̂_N) = (I_q - P̂)V^{-1/2}√N f_N(θ_0) + o_p(1) →d N(0, I_q - P),

because P̂ converges in probability to P. The two components are orthogonal:

Cov[√N(θ̂_N - θ_0), √N f_N(θ̂_N)] = -(M'M)⁻¹M'(I_q - P) = 0.

The null H_0: E[f(x_i, θ_0)] = 0 thus decomposes into identifying restrictions (H_0^I) and over-identifying restrictions (H_0^O):

H_0^I: P V^{-1/2} E[f(x_i, θ_0)] = 0,
H_0^O: (I_q - P)V^{-1/2} E[f(x_i, θ_0)] = 0.

H_0^I is automatically satisfied by the estimated sample moments, but H_0^O is not. The test statistic proposed by Hansen (1982) is

J_N = N Q_N(θ̂_N) →d χ²(q - p) under H_0.

J_N is asymptotically equivalent to

J_N ~ z_q'(I_q - P)'(I_q - P)z_q = z_q'(I_q - P)z_q,

where z_q ~ N(0, I_q).
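The J statistic drops out of the two-step estimation almost for free; this sketch evaluates it on simulated data with q = 3 valid instruments and p = 1 parameter, so that under H_0 the statistic is approximately χ²(2).

```python
import numpy as np

# Hansen's J statistic J_N = N * Q_N(theta_hat) at the two-step estimate.
rng = np.random.default_rng(11)
N = 5000
Z = rng.normal(size=(N, 3))
u = rng.normal(size=N)
x = Z @ np.array([1.0, 0.7, 0.4]) + 0.5 * u + rng.normal(size=N)
y = 1.0 * x + u
X = x[:, None]

G = Z.T @ X
W1 = np.linalg.inv(Z.T @ Z / N)
b1 = np.linalg.solve(G.T @ W1 @ G, G.T @ W1 @ (Z.T @ y))
u1 = y - X @ b1
W2 = np.linalg.inv(Z.T @ (Z * (u1 ** 2)[:, None]) / N)   # V_hat^{-1}
b2 = np.linalg.solve(G.T @ W2 @ G, G.T @ W2 @ (Z.T @ y))

f = Z.T @ (y - X @ b2) / N    # sample moments at the estimate
J = N * f @ W2 @ f            # J_N = N * f_N' V_hat^{-1} f_N
# with valid instruments, J is a draw from (approximately) chi2(q - p)
```

A large J relative to χ²(q - p) critical values would signal that at least one over-identifying restriction fails.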

We have seen above how to obtain the optimal GMM estimator,

by selecting for the weighting matrix the inverse of the covariance

matrix for the moment conditions. We now show how to obtain

an even more ecient GMM estimator, based on the best choice

for the instruments. We are looking for the optimal, asymptotic

variance minimizing choice of instruments.

Based on Newey 1993, Ecient estimation of models with conditional moment restrictions.

6.6.1 Conditional moment restrictions

Let z be the vector of observations (on all variables), and ρ(z, θ) the vector of residual functions satisfying the conditional moment restriction

E[ρ(z, θ_0) | x] = 0,

which implies the unconditional restrictions

E[A(x) ρ(z, θ_0)] = 0,

where x is a vector of conditioning variables, A(x) is an r × s matrix of functions of x, and θ_0 the true value of the parameters. Focus of the analysis here: choose A(x) to minimize the asymptotic variance of the GMM estimator. Let

D(x) = E[ ∂ρ(z, θ_0)/∂θ | x ],   Ω(x) = E[ ρ(z, θ_0)ρ(z, θ_0)' | x ].

The optimal instruments are

B(x) = C · D(x)'Ω(x)⁻¹,

where C is any nonsingular matrix, and the corresponding asymptotic variance bound is

Λ = { E[ D(x)'Ω(x)⁻¹D(x) ] }⁻¹.

Example: linear model with heteroskedasticity. In the model y = x'β_0 + ε, E(ε|x) = 0, we have D(x) = x'. Analogy with the weighted linear model: Ω(x)⁻¹ corrects for heteroskedasticity through Ω^{-1/2}(x), the derivatives ∂ρ(z, θ_0)/∂θ correspond to regressors, and the matrix D(x) is a function of x closely correlated with those derivatives.

To see why B(x) is optimal, let m_A = A(x)ρ(z, θ_0) and m_B = B(x)ρ(z, θ_0) denote the moment functions for an arbitrary choice A(x) and for B(x), so that

E(m_A m_A') = F_N'A_N V_N A_N F_N,   [E(m_B m_B')]⁻¹ = Λ.

Therefore the difference between the asymptotic variance based on A(x) and the bound can be written

(E[m_A m_B'])⁻¹ E[m_A m_A'] (E[m_B m_A'])⁻¹ - (E[m_B m_B'])⁻¹ = E[RR'],

where R = (E[m_A m_B'])⁻¹ m_A - (E[m_B m_B'])⁻¹ m_B. Since E[RR'] is positive semi-definite, B(x) minimizes the asymptotic variance.

6.6.2 Feasible optimal instruments

Suppose D(x) = D(x, γ_0) and Ω(x) = Ω(x, γ_0), where the functions D(., .) and Ω(., .) are known, and γ is a real parameter vector. Because D(x) and Ω(x) are then parametric functions of x, we could estimate γ_0 by running a linear regression of ∂ρ(z, θ̃)/∂θ and ρ(z, θ̃)ρ(z, θ̃)' on x. This gives B̂(x) = D(x, γ̂)'Ω(x, γ̂)⁻¹, and the resulting GMM estimator would be

θ̂ = arg min [ Σ_{i=1}^n B̂(x_i)ρ(z_i, θ) ]' [ Σ_{i=1}^n B̂(x_i)B̂(x_i)' ]⁻¹ [ Σ_{i=1}^n B̂(x_i)ρ(z_i, θ) ].

A potential problem arises if D(x, γ) and Ω(x, γ) are misspecified.

Consider

6.6.

where

111

h(:) is known.

Ex-

estimator at least as ecient as weighted least squares.

Drawback: estimator may not be consistent if the form of heteroskedasticity is misspecied.

Dene moment restrictions as

(z; ) = y

f (x; ) ; [y

f (x; )]2 h(x; ; ) 0 :

@f (x; )=@ 0

0

D(x) = D(x; 0); D(x; ) = @h(x; ; )=@ 0 @h(x; ; )=@0 ;

B (x) = D(x)0

(x) 1:

Empirical issue: when is incorporating additional moment condition yielding a more ecient estimator ?

Asymptotic variance of the heteroskedasticity-corrected least squares

estimator:

E ["2jx]

@f (x; 0)

@

@f (x; 0) 0 1

;

@

in E [D(x)0

(x) 1D(x)] 1.

E ["3jx] = 0, or

h(x; 0; 0) = h(x; 0).

Otherwise, the asymptotic variance of the heteroskedasticity-corrected

least squares estimator will be larger than the conditional moment

bound.

Corollary:

not depend on

x or !

h(x; ; ) and

(x) do

112

Needs specication of

(x),

Assume

E ["3jx] = 0;

^

^ ^ ), where ^ and ^ are initial estif (xi; )] =h(xi; ;

ance of

[yi

mators.

Estimated optimal instruments are then

"

^

0

^ x) = h(x; ; ^ )

D^ (x) = D(x; ^);

(

^ ^ )2 ;

0

^:h(x; ;

^ x) 1:

B^ (x) = D^ (x)0

(

6.6.3

Advantage:

avoid misspecication in

D(x; 0)

and

(x; 0)

in

Principle: estimate expectations that enter optimal instruments

nonparametrically (these expectations are conditional upon

x).

Simplest nonparametric estimator: nearest neighbor, or

NN

estimator.

constructed by averaging over the values of the dependent variable for observations where the conditional variable (x) is closest

to its evaluation value.

6.6.

113

Let σ̂_xl denote a measure of scale of the l-th component of x (standard deviation). x being of rank r, define the distance

||x_i - x_j||_n = [ Σ_{l=1}^r (x_il - x_jl)²/σ̂²_xl ]^{1/2},

which measures the distance between observations i and j, accounting for the multivariate nature of x. Choose an integer K, K ≤ n, and weights

ω_kK ≥ 0 for 1 ≤ k ≤ K,   ω_kK = 0 for k > K,   Σ_{k=1}^K ω_kK = 1.

For each observation i, sort the observations j ≠ i according to the distance above. Then assign the weight W_ij = ω_jK to the observation with the j-th smallest distance ||x_i - x_j||_n, and let W_ii = 0.

To compute the conditional expectation of y given x:
- select the set of the K (out of n) x_i's closest to the point x;
- compute the (weighted) mean of the y_i values corresponding to the x_i's chosen above.

Example: uniform weights ω_kK = 1/K, k ≤ K. The estimator is then

Ê(y|x) = Σ_{k=1}^K ω_kK y_k(x) = (1/K) Σ_{k=1}^K y_k(x),

where y_k(x) is the y_i whose x_i is k-th closest to x according to the distance measure defined above (y_1(x) is the closest, y_2(x) the second closest, and so on).
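With uniform weights the K-nearest-neighbour estimate is just a local average; this is a minimal sketch for scalar x on simulated data.

```python
import numpy as np

# k-NN estimate of E(y|x): average the y's of the K sample points whose
# x is closest to the evaluation point (uniform weights omega_kK = 1/K).
rng = np.random.default_rng(2)
n, K = 400, 25
x = rng.uniform(-2.0, 2.0, size=n)
y = np.sin(x) + 0.1 * rng.normal(size=n)   # hypothetical regression model

def knn_mean(x0, x, y, K):
    idx = np.argsort(np.abs(x - x0))[:K]   # indices of the K closest x_i's
    return y[idx].mean()

m_hat = knn_mean(1.0, x, y, K)
# m_hat is close to the true regression value sin(1.0)
```

Larger K smooths more (lower variance, higher bias), mirroring the bandwidth trade-off of kernel methods discussed below.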

Other possibility: triangular weights,

Ê(y|x) = Σ_{j=1}^n ω_j y_j(x),   ω_j^T = 2(K - j + 1)/[K(K+1)] for j ≤ K, 0 for j > K,

or quadratic weights:

ω_j^Q = 6[K² - (j-1)²]/[K(K+1)(4K-1)] for j ≤ K, 0 for j > K.

The nearest-neighbor estimator of the conditional covariance at x_i is

Ω̂(x_i) = Σ_{j=1}^n W_ij ρ(z_j, θ̂) ρ(z_j, θ̂)',

where observation i itself is excluded (W_ii = 0 in the weighting procedure). The estimator of D(x) is accordingly

D̂(x_i) = Σ_{j=1}^n W_ij ∂ρ(z_j, θ̂)/∂θ.

Some components of D(x) may be known in closed form, and depend only on x.

D(x, θ) has the same dimension as D(x), and its components are equal to those of D(x) that are known, and 0 otherwise. A combined estimator is

D̂(x_i) = D(x_i, θ̂) + Σ_{j=1}^n W_ij [ ∂ρ(z_j, θ̂)/∂θ - D(x_j, θ̂) ].

The estimated optimal instruments and variance bound are then

B̂(x_i) = D̂(x_i)'Ω̂(x_i)⁻¹,   Λ̂ = [ (1/n) Σ_{i=1}^n D̂(x_i)'Ω̂(x_i)⁻¹D̂(x_i) ]⁻¹.

6.6.4 Kernel estimators

We wish to estimate the conditional expectation at the point x, E(Y|X = x) = m(x), with

m(x) = ∫_{-∞}^{+∞} y [ f(y, x)/f_1(x) ] dy,

where f(y, x) is the joint density and f_1(x) the marginal density of x. A nonparametric alternative to k-NN consists in estimating the densities above nonparametrically, to construct m̂(x) = ∫ y [f̂(y, x)/f̂_1(x)] dy. Popular approach in practice: the Nadaraya-Watson estimator.

116

Let

The den-

sity function is

f (x) =

d

F (x + h=2) F (x h=2)

F (x) = lim

h!0

dx

h

:

h!0

h

For estimating f(x) based on observations x_1, ..., x_n, we consider h a function of n such that h → 0 when n → ∞. The probability above is then estimated by the proportion of observations falling in the interval (x - h/2, x + h/2):

f̂(x) = (1/nh) × #{ x_1, ..., x_n in (x - h/2, x + h/2) }
      = (1/nh) Σ_{i=1}^n 1I( -1/2 ≤ (x_i - x)/h ≤ 1/2 ),

an average over the interval (x - h/2, x + h/2) with midpoint x. The bandwidth h measures the degree to which the data are smoothed (averaged) in computing f̂(x). This first, naive nonparametric density estimator has been proposed by Fix and Hodges (1951), and obtains by averaging the x_i's falling in an interval around x.

in an interval around

117

xi h=2.

of weights, one can replace the indicator function by a positive kernel function denoted

K (:).

estimator is

n

n

X

X

x

x

1

1

i

K

=

K ( i) ;

f^(x) =

nh i=1

h

nh i=1

where the kernel function has the following properties:

Z 1

K ( )d = 1; K (

1) = K (1) = 0;

with a multivariate kernel

and

K (x)dx = 1:

n

1 X

z z

^

^

f (y; x) = f (z ) = q+1

K1 i

;

nh i=1

h

where

is a xed point.

Important issue: selection of the optimal bandwidth parameter,

h.

(A1) Observations

(A2) Kernel

118

R

R K (2 )d

(ii)

(

R K

2

(i)

(iii)

= 1,

)d = 2 6= 0,

K ( )d < 1.

in some neighborhood of

x.

h = hn ! 0 as n ! 1.

(A5) nhn ! 1 as n ! 1.

(A4)
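A minimal sketch of the estimator f̂(x) above, using a Gaussian kernel (an assumption for illustration; any kernel satisfying (A2) would do):

```python
import numpy as np

def kernel_density(x, data, h):
    """Kernel density estimate f_hat(x) = (1/nh) * sum_i K((x_i - x)/h),
    with a standard normal kernel K."""
    psi = (data - x) / h
    k = np.exp(-0.5 * psi**2) / np.sqrt(2.0 * np.pi)
    return k.sum() / (len(data) * h)

rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, size=2000)
h = 0.3
# the N(0,1) density at 0 is 1/sqrt(2*pi), about 0.399
print(kernel_density(0.0, data, h))
```

The choice h = 0.3 here is arbitrary; the next subsection shows how h should shrink with n.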

Under (A1)–(A5), we have the following approximations for the bias and variance of f̂:

    Bias[f̂(x)] = (h²/2) μ₂ f''(x),   var[f̂(x)] = (1/nh) f(x) ∫ K²(ψ) dψ.

Strategy for choosing h: minimize the Mean Integrated Squared Error (MISE):

    MISE = E ∫ [f̂(x) − f(x)]² dx,

whose asymptotic approximation is

    AMISE = (1/4) γ₁ h⁴ + γ₂ (nh)^{−1},

where

    γ₁ = μ₂² ∫ [f''(x)]² dx,   γ₂ = ∫ K²(ψ) dψ.

Since Bias² = O(h⁴) and var = O((nh)^{−1}), we have AMISE = max{O(h⁴), O((nh)^{−1})}. Hence, the only value of h for which both terms vanish at the same rate is

    h ∝ n^{−1/5},

for which AMISE = O(n^{−4/5}).
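The rate h ∝ n^{−1/5} leaves the proportionality constant to be chosen. A common plug-in is Silverman's rule of thumb for a Gaussian kernel, h = 1.06 σ̂ n^{−1/5} (a standard reference rule, not derived in the text above):

```python
import numpy as np

def silverman_bandwidth(data):
    """Rule-of-thumb bandwidth h = 1.06 * sigma_hat * n^(-1/5),
    AMISE-optimal when both f and K are Gaussian."""
    n = len(data)
    sigma = data.std(ddof=1)
    return 1.06 * sigma * n ** (-0.2)

rng = np.random.default_rng(1)
data = rng.normal(0.0, 2.0, size=1000)
print(silverman_bandwidth(data))
```

For data far from Gaussian (skewed, multimodal), this rule oversmooths and cross-validation is preferable.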

The estimator is

    m̂(x) = [ (nh^p)^{−1} Σ_{i=1}^n ∫_{−∞}^{+∞} y K₁((y_i − y)/h, (x_i − x)/h) dy ] / [ (nh^q)^{−1} Σ_{i=1}^n K((x_i − x)/h) ],

where K(·) and K₁(·,·) are q-variate and p-variate kernels respectively, and p = q + 1 (recall x has dimension q). Define ψ_i = h^{−1}(y_i − y), so that y = y_i − hψ_i. The numerator above becomes

    (nh^p)^{−1} Σ_{i=1}^n ∫ (y_i − hψ_i) K₁(ψ_i, (x_i − x)/h) h dψ_i
      = (1/n) Σ_i y_i h^{−q} ∫ K₁(ψ, (x_i − x)/h) dψ − (1/n) Σ_i h^{−q+1} ∫ ψ K₁(ψ, (x_i − x)/h) dψ,

and since the last term is zero for symmetric kernels, and ∫ K₁(ψ, u) dψ = K(u), we finally have

    numerator = (1/n) Σ_{i=1}^n y_i h^{−q} K((x_i − x)/h).

The Nadaraya-Watson estimator is therefore

    m̂(x) = [ Σ_{i=1}^n K((x_i − x)/h) ]^{−1} [ Σ_{i=1}^n K((x_i − x)/h) y_i ].

A related estimator combines kernel weights with nearest neighbors (the generalized nearest-neighbor, or G-NN, estimator).
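A minimal sketch of the Nadaraya-Watson formula above for scalar x, with a Gaussian kernel (an assumption; the normalizing constants of K cancel between numerator and denominator):

```python
import numpy as np

def nadaraya_watson(x, xdata, ydata, h):
    """m_hat(x) = sum_i K((x_i - x)/h) y_i / sum_i K((x_i - x)/h)."""
    k = np.exp(-0.5 * ((xdata - x) / h) ** 2)  # Gaussian kernel weights
    return (k * ydata).sum() / k.sum()

rng = np.random.default_rng(2)
x = rng.uniform(-2.0, 2.0, size=1500)
y = np.sin(x) + 0.1 * rng.normal(size=1500)
# regression function is sin(.), so m(0.5) is about 0.479
print(nadaraya_watson(0.5, x, y, h=0.2))
```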

The G-NN estimator is

    m̂(x) = Σ_{i=1}^n ω_i(x) y_i,

where

    ω_i(x) = K((x_i − x)/d) / Σ_{j=1}^n K((x_j − x)/d),

and d is the distance between x and its K-th nearest neighbor. One can show (Mack, 1981) that the bias and variance of this estimator have the same structure as for kernel estimators, with the smoothing now governed by the number of neighbors K. The optimal rate for estimation is K ∝ n^{4/(4+q)}.

Chapter 7

GMM estimators for time series models

7.1 GMM and Euler equation models

Lucas critique (1976): evaluations based on traditional dynamic simultaneous-equation models are flawed because parameters are assumed invariant across different policy regimes. Hence, the marginal response to a change in policy instruments estimated from such models is not what is to be expected from rational agents taking policy changes into account in their decision making.

Standard estimation procedures (MLE) are computationally burdensome when one introduces taste and technology parameters. GMM instead exploits the orthogonality conditions delivered by structural models to draw inference on these parameters.

7.1.1 A consumption-based asset pricing model

A representative agent chooses consumption and investment in a single asset to maximize discounted utility

    max E₀ [ Σ_{t=0}^∞ β^t U(C_t) ],

where expectations are conditional on the information set Ω_t, C_t is consumption, β is a constant discount factor, and U(·) is a strictly concave utility function.

Budget constraint:

    C_t + P_t Q_t ≤ R_t Q_{t−1} + W_t,

where P_t and Q_t are the price and quantity of the asset bought, R_t its gross return, and W_t labor income. The asset price is deflated by the price of the consumption good.

First-order condition:

    P_t U'(C_t) = E_t [ β R_{t+1} U'(C_{t+1}) ].

Equivalently,

    E_t [ β (R_{t+1}/P_t) U'(C_{t+1})/U'(C_t) − 1 ] = 0,

where U'(·) = ∂U/∂C.

Specification of the utility function (constant relative risk aversion):

    U(C_t) = C_t^γ / γ,   with γ < 1,

so that

    E_t [ β (R_{t+1}/P_t) (C_{t+1}/C_t)^{γ−1} − 1 ] = 0.   (7.1)

7.1.2 GMM estimation

A log-linearized version of (7.1) can be estimated by regression methods, using

    LW1,t+1 = log(R_{t+1}/P_t)   and   LW2,t+1 = log(C_{t+1}/C_t),

given Ω_t. Disadvantage: this requires distributional assumptions on the disturbances (typically joint log-normality).

The GMM approach instead starts from the unconditional implication of (7.1):

    E [ β (R_{t+1}/P_t)(C_{t+1}/C_t)^{γ−1} − 1 ] = 0.

Rational expectations hypothesis: agents use all available information at time t, Ω_t. If y_{t+1} ∉ Ω_t but z_t ∈ Ω_t, then E_t(y_{t+1} z_t) = [E_t(y_{t+1})] z_t. If E_t(y_{t+1}) = 0 then, by the Law of Iterated Expectations, we have E(y_{t+1} z_t) = 0, and the Euler equation implies

    E[ ε_{t+1}(β, γ) z_t ] = 0,   where ε_{t+1} = β (R_{t+1}/P_t)(C_{t+1}/C_t)^{γ−1} − 1,

and z_t is any variable in the information set at time t, e.g. C_{t−i}, R_{t−i}, P_{t−i}, i ≥ 0.

Notes.
- No distributional assumption is needed, and lagged variables are valid instruments even for endogenous variables in GMM.
- With only a constant and the interest rate as instruments, the model is just identified (for the two parameters β and γ).

rate, model is just identied (for

Consider estimation of a pure moving average MA(1) model

yt = "t + 0"t 1;

where

7.2.1

"t is an i.i.d.

(7.2)

A simple estimator

0 =

E (ytyt 1)

0

=

:

E (yt2)

1 + 02

^T =

we obtain estimator

^T

0 by sample estimator

PT

t=2 yt yt 1 ;

PT

2

t=2 yt

by solving

^2T

^T 1^T

1 = 0:

7.2.

125

not be veried in nite samples, especially if j0 j close to 1. We

may dene

~T =

and solution for

~T

8

<

0:5

if j

^T j < 0:5;

if

^T > 0:5;

if

^

: T

0:5

is

~T =

rived from

1 4^2T

:

2^T

02, whose expression can be de-

2

~ T =

:

2

~

1+

Consider now estimation in a GMM framework.

2

y

t yt 1

f (yt; ) =

;

yt2 2(1 + 2)

such that Ef (yt ; 0 ) = 0 (theoretical moment condition).

T

1X

f T ( ) =

f (y ; ) =

T t=1 t

(1=T ) Tt=1 yt2 2(1 + 2)

fT (^T = 0

~

^T = ~T = (T ; ~ 2T ).

same estimators as above:

yields the

126
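A sketch of the simple estimator above: simulate an MA(1), compute ρ̂_T, truncate it to [−0.5, 0.5], and solve the quadratic for θ̃_T:

```python
import numpy as np

def ma1_theta_hat(y):
    """Moment estimator of theta in y_t = eps_t + theta*eps_{t-1}:
    solve rho*theta^2 - theta + rho = 0 with rho the sample lag-1
    autocorrelation, truncated to [-0.5, 0.5] so a real root exists."""
    rho = (y[1:] * y[:-1]).sum() / (y**2).sum()
    rho = np.clip(rho, -0.5, 0.5)
    if rho == 0.0:
        return 0.0
    return (1.0 - np.sqrt(1.0 - 4.0 * rho**2)) / (2.0 * rho)

rng = np.random.default_rng(4)
T, theta0 = 5000, 0.5
eps = rng.normal(size=T + 1)
y = eps[1:] + theta0 * eps[:-1]
print(ma1_theta_hat(y))  # close to 0.5
```

Picking the root with the minus sign selects the invertible solution |θ̃| ≤ 1.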

Theorem 4  The estimators θ̂_T and σ̂_T² are consistent and asymptotically normal, with distribution

    √T (θ̂_T − θ₀, σ̂_T² − σ₀²)' →d N(0, Σ),

where

    Σ = (1 − θ₀²)^{−2} S + [ 0  0 ; 0  κ₄ ],

the off-diagonal elements of S are 2θ₀²σ₀³(2 + θ₀² + θ₀⁴), its lower-right element is 2σ₀⁴(1 − 2θ₀² + 3θ₀⁴ + 2θ₀⁶), and κ₄ is the fourth-order cumulant of ε_t.

Under the normality assumption, the asymptotic variance of the MLE of θ₀ is (1 − θ₀²): the moment estimator θ̂_T is inefficient relative to the MLE in general.

7.2.2 Durbin's estimator (1959)

The MA(1) defined by (7.2) is invertible when |θ₀| < 1, therefore it admits an AR(∞) representation:

    y_t = Σ_{j=1}^∞ π_j(θ₀) y_{t−j} + ε_t,   with π_j(θ) = −(−θ)^j,  j = 1, 2, …

Truncating the autoregression at lag K gives

    y_t = Σ_{j=1}^K π_j(θ₀) y_{t−j} + ε_{Kt},   (7.3)

where ε_{Kt} = ε_t + Σ_{j=K+1}^∞ π_j(θ₀) y_{t−j}, which can be viewed as the reduced-form model. The AR model captures the second-order properties of y_t.

Define the K-vector

    A_K(θ) = ( π₁(θ), …, π_K(θ) )',   with π_j(θ) = −(−θ)^j,

and let Â_K denote the K-vector of OLS estimators (π̂₁, …, π̂_K) in (7.3). For a given K, we define the minimum-distance estimator

    θ̂ = argmin_{θ∈Θ} [Â_K − A_K(θ)]' V_TK [Â_K − A_K(θ)],

where Θ = (−1, +1) and V_TK is a K × K weighting matrix.

7.2.3 A regression interpretation

We can write

    π_j(θ) = −θ π_{j−1}(θ),  j = 1, 2, …,   with π₀(θ) = −1.   (7.4)

This is an autoregression on the π's themselves: one can estimate θ₀ by an OLS regression of the estimates (π̂₁, …, π̂_K) on lagged values of themselves, i.e. on (π̂₀, π̂₁, …, π̂_{K−1}) with π̂₀ = −1. Since the regression coefficient in (7.4) is −θ, the estimator is

    θ̂_D = − Σ_{j=1}^K π̂_j π̂_{j−1} / Σ_{j=1}^K π̂_{j−1}².

This is Durbin's estimator. It corresponds to the minimum-distance estimator above with weighting matrix

    V_TK = B_K(θ)' B_K(θ),   where B_K(θ) = I_K + θ L_K,

and L_K is the K × K shift matrix with ones on the first subdiagonal and zeros elsewhere.

7.3 GMM estimation of ARMA models

To simplify exposition, we concentrate on the ARMA(1,1) case.

7.3.1 The model

The model is

    y_t = ρ₀ y_{t−1} + u_t,   where u_t = ε_t + θ₀ ε_{t−1},   (7.5)

and we assume |ρ₀| < 1, |θ₀| < 1 and ρ₀ + θ₀ ≠ 0 (no common root). OLS on (7.5) is inconsistent because y_{t−1} is correlated with u_t: from

    y_t = Σ_{j=0}^∞ ρ₀^j u_{t−j},   (7.6)

y_{t−1} contains u_{t−1}, hence ε_{t−1}, which also enters u_t.

7.3.2 IV estimation

(7.6) implies that E(u_t y_{t−j}) = 0 for all j ≥ 2. We can use these moment conditions to estimate ρ₀ consistently with an IV procedure. With the single instrument y_{t−2}, E f(y_t, ρ₀) = 0 where

    f(y_t, ρ) = (y_t − ρ y_{t−1}) y_{t−2},
    f̄_T(ρ) = (1/T) Σ_{t=3}^T (y_t − ρ y_{t−1}) y_{t−2},

and solving f̄_T(ρ̂_T) = 0 for ρ̂_T gives

    ρ̂_T = [ Σ_{t=3}^T y_{t−2} y_{t−1} ]^{−1} Σ_{t=3}^T y_{t−2} y_t.

Its asymptotic distribution is √T(ρ̂_T − ρ₀) →d N(0, Σ), where

    Σ = (1 − ρ₀²)(1 + 4θ₀² + 4ρ₀θ₀ + 4ρ₀θ₀³ + 2ρ₀²θ₀² + θ₀⁴) / [(1 + ρ₀θ₀)²(ρ₀ + θ₀)²].

Theorem 5  Under normality, the asymptotic distribution of the MLE of ρ₀ in the ARMA(1,1) model is

    √T(ρ̂_MLE − ρ₀) →d N( 0, (1 + ρ₀θ₀)²(1 − ρ₀²)/(ρ₀ + θ₀)² ).

Both variances explode when ρ₀ + θ₀ is close to 0 (near-common root: y_t is then close to white noise and ρ₀ is weakly identified). The MLE is more efficient than GMM, especially for large values of ρ₀ and θ₀.

The model becomes over-identified by including additional lags y_{t−j}, j = 2, 3, … as instruments. Using a single lag j yields

    ρ̂_Tj = [ Σ_{t=j+1}^T y_{t−1} y_{t−j} ]^{−1} Σ_{t=j+1}^T y_t y_{t−j},   for j ≥ 2.

Because |ρ₀| < 1, it follows that ρ̂_T (the case j = 2) is the most efficient of these single-instrument estimators: Dolado (1990) shows that the correlation between the instrument y_{t−j} and the explanatory variable y_{t−1} decreases (rapidly) with j.

Since all lags are valid instruments, we can stack the q-vector of conditions

    E(u_t y_{t−j}) = 0,   j = 2, …, q + 1,

where Y_{q,t−2} = (y_{t−2}, …, y_{t−q−1})'. The GMM estimator is

    ρ̂_Tq = [ (Σ_t y_{t−1} Y'_{q,t−2}) A_Tq (Σ_t Y_{q,t−2} y_{t−1}) ]^{−1} (Σ_t y_{t−1} Y'_{q,t−2}) A_Tq (Σ_t Y_{q,t−2} y_t),

sums running over t = q + 2, …, T, where A_Tq is a positive definite q × q weighting matrix. The asymptotic distribution of ρ̂_Tq is

    √T(ρ̂_Tq − ρ₀) →d N( 0, σ_ε² (R_q' A_q R_q)^{−1} R_q' A_q V_q A_q R_q (R_q' A_q R_q)^{−1} ),

where A_q = plim_{T→∞} A_Tq, V_q is the limiting variance of the sample moments, and R_q = plim (1/T) Σ_t Y_{q,t−2} y_{t−1}, with j-th element

    σ_ε² ρ₀^{j−1} (1 + ρ₀θ₀)(ρ₀ + θ₀) / (1 − ρ₀²).

The optimal choice for the weighting matrix being A_Tq = V_q^{−1}, we have

    √T(ρ̂_Tq − ρ₀) →d N( 0, σ_ε² (R_q' V_q^{−1} R_q)^{−1} ).

7.4 Estimation of the covariance matrix of sample moments

In the time-series framework, moment conditions are defined as E[f(x_t, φ₀)] = 0, and the covariance matrix to be estimated is

    V_T = T var[f̄_T(φ₀)] = (1/T) Σ_{t=1}^T Σ_{s=1}^T E[ f(x_t, φ₀) f(x_s, φ₀)' ].

This is the average of autocovariances for the process f(x_t, φ₀). Let f_t = f(x_t, φ₀) and rewrite V_T as a general autocovariance function:

    V_T = Σ_{j=−(T−1)}^{T−1} Γ_T(j),

where

    Γ_T(j) = (1/T) Σ_{t=j+1}^T E(f_t f'_{t−j}),   j ≥ 0;
    Γ_T(j) = (1/T) Σ_{t=−(j−1)}^T E(f_{t+j} f_t'),   j < 0.

7.4.1 Serially uncorrelated moments

Assume y_t = x_t'β + u_t, with f_t = x_t u_t and E(f_t) = E(x_t u_t) = 0. If

    E(u_t | u_{t−1}, x_t, u_{t−2}, x_{t−1}, …) = 0

and the residual is conditionally homoskedastic, we have

    V_T = Γ_T(0) = (1/T) Σ_t E(x_t u_t u_t x_t') = σ_u² E(x_t x_t'),

the standard OLS variance-covariance matrix. The estimator of V_T is

    V̂_T = σ̂_u² (1/T) Σ_t x_t x_t',

where

    σ̂_u² = (1/T) Σ_t û_t²,   û_t = y_t − x_t'β̂.

7.4.2 Heteroskedasticity

Without conditional homoskedasticity (but still with no serial correlation), the covariance matrix is

    V_T = Γ_T(0) = (1/T) Σ_t E(x_t u_t u_t x_t'),

estimated by

    V̂_T = (1/T) Σ_t x_t û_t² x_t'.

This is White's heteroskedasticity-consistent estimator.
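A sketch of White's estimator together with the resulting robust covariance matrix of the OLS estimator (the sandwich form (X'X)^{−1} (Σ_t x_t û_t² x_t') (X'X)^{−1} is the standard one):

```python
import numpy as np

def white_cov(X, y):
    """OLS with White's heteroskedasticity-consistent covariance:
    V_hat = (1/T) sum_t x_t u_t^2 x_t', sandwiched between (X'X)^-1 terms."""
    T = len(y)
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    u = y - X @ beta
    meat = (X * u[:, None] ** 2).T @ X / T        # V_hat
    bread = np.linalg.inv(X.T @ X)
    return beta, bread @ (T * meat) @ bread

rng = np.random.default_rng(5)
T = 500
X = np.column_stack([np.ones(T), rng.normal(size=T)])
u = rng.normal(size=T) * (1.0 + np.abs(X[:, 1]))  # heteroskedastic errors
y = X @ np.array([1.0, 2.0]) + u
beta, V = white_cov(X, y)
print(beta, np.sqrt(np.diag(V)))
```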

In a typical IV setup, where

    f_t = w_t (y_t − x_t'β),   w_t instruments,

we have

    V_T = (1/T) Σ_t E(u_t²) w_t w_t',

and the asymptotic covariance matrix of the 2SLS estimator would be the sandwich

    [X' P_W X]^{−1} X'W (W'W)^{−1} (T V̂_T) (W'W)^{−1} W'X [X' P_W X]^{−1},

where P_W = W(W'W)^{−1}W'.

7.4.3 Serially correlated moments of finite order

Assume the moment process has an MA(m) structure, so that Γ(j) = 0 for |j| > m:

    V_T = Σ_{j=−m}^m Γ_T(j),   estimated by   V̂_T = Σ_{j=−m}^m Γ̂_T(j),

where

    Γ̂_T(j) = (1/T) Σ_{t=j+1}^T x_t û_t x'_{t−j} û_{t−j},   j ≥ 0;
    Γ̂_T(j) = (1/T) Σ_{t=−(j−1)}^T x_{t+j} û_{t+j} x_t' û_t,   j < 0.

In many applications, the assumption of a known finite order m is too strong, and an obvious idea would be to construct an estimator V̂_MM from all sample autocovariances:

    V̂_MM = Σ_{j=−(T−1)}^{T−1} Γ̂_T(j),   Γ̂_T(j) = (1/T) Σ_{t=j+1}^T f̂_t f̂'_{t−j}, j ≥ 0 (and symmetrically for j < 0).

But:
- although V̂_MM may be asymptotically unbiased, it is not consistent in the mean squared error sense;
- when the f̂_t are regression residuals (which sum to zero by construction), the sum of all sample autocovariances is 0 for all T.

Why is the sample autocovariance matrix Γ̂_T(j) a poor estimate for arbitrary j, −T + 1 ≤ j ≤ T − 1? Suppose j = T − 2; then Γ̂_T(j) is based on only two cross-products and, being divided by T, tends to 0 as T → ∞ whatever the true autocovariance. Only autocovariances at lags small relative to T can be estimated consistently.

7.4.4 Truncation and mixing

Consistent estimation relies on the fact that the true autocovariance genuinely tends to 0 as the lag tends to ∞. This is the case for processes satisfying a mixing property.

Definition 2  Let Y and Z be bounded measurable functions, Y : R^{l+1} → R and Z : R^{l+1} → R. The sequence {y_t} is mixing if there exists a sequence of positive numbers {α_n}, converging to 0, such that

    | E[ Y(y_t, …, y_{t+l}) Z(y_{t+n}, …, y_{t+n+l}) ] − E[Y(·)] E[Z(·)] | < α_n.

We can then replace the sum in the definition of V_T by a truncated sum, in which autocovariances of order larger than p are eliminated. Using Γ̂_T(−j) = Γ̂_T(j)', we consider

    V̂_T = Γ̂_T(0) + Σ_{j=1}^p [ Γ̂_T(j) + Γ̂_T(j)' ].   (7.7)

The truncation lag p should go to infinity with T at some rate, typically p = o(T^{1/4}), so that all non-zero Γ_T(j)'s are consistently estimated. A drawback of (7.7) is that V̂_T may not be positive semidefinite. The solution of Newey and West (1987): multiply Γ̂_T(j) by Bartlett weights,

    V̂_T = Γ̂_T(0) + Σ_{j=1}^p [1 − j/(p+1)] [ Γ̂_T(j) + Γ̂_T(j)' ],

which scales the higher-order Γ̂_T(j)'s down towards 0 and guarantees a positive semidefinite estimate.
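A sketch of the Newey-West estimator for a vector moment series f_t (stored as the rows of F):

```python
import numpy as np

def newey_west(F, p):
    """V_hat = Gamma(0) + sum_{j=1}^p (1 - j/(p+1)) * (Gamma(j) + Gamma(j)'),
    with Gamma(j) = (1/T) sum_{t=j+1}^T f_t f_{t-j}'."""
    T = F.shape[0]
    V = F.T @ F / T                        # Gamma(0)
    for j in range(1, p + 1):
        G = F[j:].T @ F[:-j] / T           # Gamma(j)
        V += (1.0 - j / (p + 1.0)) * (G + G.T)
    return V

rng = np.random.default_rng(6)
e = rng.normal(size=(1001, 2))
F = e[1:] + 0.8 * e[:-1]                   # MA(1) moment series
V = newey_west(F, p=5)
print(V)
```

By construction the result is symmetric and positive semidefinite, unlike the truncated estimator (7.7).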

7.4.5 Kernel-based covariance matrix estimators

Both the truncated and the Newey-West estimators are special cases of kernel-based covariance matrix estimators. General form: a weighted average of sample autocovariance matrices,

    V̂_T = Σ_{s=−(T−1)}^{T−1} ω_s Γ̂_T(s),

where the sequence of weights {ω_s} is called the lag window. Strategy: choose a lag window such that {ω_s} approaches 1 rapidly enough to obtain asymptotic unbiasedness, and slowly enough to ensure that the variance converges to 0.

In practice, we concentrate on weights of the form

    ω_s = k(s/m_T),

where the function k(·) is the lag window generator and m_T a bandwidth parameter. These estimators behave as local smoothers of the autocovariances around lag 0. We assume

    k(0) = 1;   k(z) = k(−z) for all z ∈ R;   ∫_{−∞}^{+∞} |k(z)| dz < ∞;

and k(·) is continuous at 0 and "everywhere else" except at a finite number of points.

Note: when k(z) = 0 for |z| > 1, m_T reduces to p, the lag truncation parameter.

Let

    k_r = lim_{z→0} [1 − k(z)] / |z|^r,   r ∈ [0, ∞),

the characteristic exponent of k(·): k_r measures the smoothness of k at the origin. Consider finally the following measure of smoothness of the spectral density function in the neighborhood of 0:

    S^(r) = (2π)^{−1} Σ_{j=−∞}^{+∞} |j|^r Γ(j),

also denoted the generalized r-th derivative at 0 of the spectral density function:

    S_f(λ) = (2π)^{−1} Σ_{j=−∞}^{+∞} Γ(j) e^{−ijλ}.

When λ = 0, S^(0) = S_f(0) and V_T is equal to 2π S_f(0) (Hansen, 1982).

Define the asymptotic truncated Mean Squared Error:

    MSE_h = E min{ |vec(V̂_T − V_T)|, h },

where the truncation at h removes extreme deviations.

Theorem 6  We have:
(i) If m_T²/T → 0 then V̂_T − V_T →p 0.
(ii) If m_T → ∞ with m_T^{2r+1}/T → B ∈ (0, ∞) and ||S^(r)|| < ∞, then √(T/m_T)(V̂_T − V_T) = O_p(1).
(iii) The limit lim_{T→∞} lim_{h→∞} of the (scaled) truncated MSE is the sum of a squared-bias term, driven by k_r and S^(r), and a variance term

    ∫ k²(z) dz · tr[ (I + B_qq) S_f(0) ⊗ S_f(0) ],

where B_qq = Σ_i Σ_j (e_i e_j') ⊗ (e_j e_i') and e_i is a zero vector with 1 as the i-th element.

Comments:
- (i) establishes consistency of kernel covariance estimators for bandwidth sequences that grow at rate o(√T);
- (iii) gives the asymptotic truncated Mean Squared Error. For a kernel with characteristic exponent r, the asymptotic bias of element (j, j) is of order m_T^{−r} k_r 2π S^(r)_{j,j}, and the asymptotic variance is of order (m_T/T) ∫ k²(z) dz, proportional to S_{j,j}(0)².

According to the theorem, the bias decreases with r while the variance of these kernel estimators is increasing in r. Also, no kernel estimator with r > 2 can be positive semidefinite. Hence, we should restrict attention to estimators with r = 2, and the optimal choice of the scale parameter m_T according to the asymptotic MSE is m_T ∝ T^{1/(2r+1)}, i.e. T^{1/5} when r = 2.

Common lag window generators:

  Kernel     k(z)                                                        r    k_r
  Truncated  1 for |z| ≤ 1, 0 otherwise                                  —    0 for all finite r
  Bartlett   1 − |z| for |z| ≤ 1, 0 otherwise                            1    1
  Parzen     1 − 6z² + 6|z|³ for 0 ≤ |z| ≤ 1/2,                          2    6
             2(1 − |z|)³ for 1/2 ≤ |z| ≤ 1, 0 otherwise

7.4.6 Spectral interpretation

Define

    W(λ; m_T) = (2π)^{−1} Σ_{s=−(T−1)}^{T−1} ω_s e^{−isλ}.

This is also denoted the spectral window. The kernel covariance estimator is

    V̂_T = Σ_{s=−(T−1)}^{T−1} ω_s Γ̂_s = 2π ∫_{−π}^{π} I_T(λ) W(λ; m_T) dλ,

where

    I_T(λ) = (2π)^{−1} Σ_{s=−(T−1)}^{T−1} Γ̂_s e^{isλ}

is the periodogram, and W(·;·) is the averaging kernel: V̂_T is a smoothed periodogram estimate of 2π S_f(0).

Spectral estimators were once computationally burdensome, before FFT (Fast Fourier Transform) techniques became popular. Define the Fourier transform of f̂_t as

    ω(λ_p) = (2πT)^{−1/2} Σ_{t=1}^T f̂_t e^{iλ_p t}.

The periodogram matrix can be computed at the Fourier frequencies λ_p = 2πp/T, p = 1, 2, …, as I_T(λ_p) = ω(λ_p) ω(λ_p)*, and

    V̂_T = [2π/(2T − 1)] Σ_{p=−(T−1)}^{T−1} I_T(λ_p) W(λ_p; m_T).

Additional lag window generators, all with r = 2:

  Kernel              k(z)                                                      k_r
  Quadratic Spectral  [25/(12π²z²)] [ sin(6πz/5)/(6πz/5) − cos(6πz/5) ]         18π²/125
  Daniell             sin(πz)/(πz)                                              π²/6
  Tent                2(1 − cos z)/z²                                           1/12

Among windows with r = 2, the Quadratic Spectral window (see table) has optimality properties for the asymptotic MSE (see Andrews and Monahan, 1992).

Chapter 8

GMM estimators for dynamic panel data

8.1 Introduction

GMM estimation was introduced as an interesting alternative to Fixed-effects, Maximum-Likelihood or GLS estimation procedures. But its advantages are most obvious for estimating dynamic panel-data models.

The Anderson-Hsiao instrumental-variable procedure yields consistent estimates when the model is estimated after the first-difference transformation, which removes the individual effect. Two drawbacks:

a) in the IV procedure, the variance-covariance matrix is restricted;
b) only one instrument is used (either y_{i,t−2} or y_{i,t−2} − y_{i,t−3}).

Important paper: Arellano and Bond (Review of Economic Studies, 1991): a more robust procedure can be used (point a)) and more orthogonality conditions can be exploited (point b)).

8.2 The Arellano-Bond estimator

8.2.1 Model assumptions

Consider the dynamic model y_it = γ y_{i,t−1} + α_i + ε_it. First-differencing removes α_i, and lagged levels are valid instruments for the differenced equation:

    E(y_is Δu_it) = 0,   t = 2, 3, …, T,  s = 0, 1, …, t − 2,

where Δu_it = Δε_it = ε_it − ε_{i,t−1}. This is a set of T(T − 1)/2 orthogonality conditions (more conditions would be available if an exogenous regressor x was available).

These conditions require that the ε are not serially correlated, i.e., we must have E(ε_it ε_{i,t+s}) = 0 for s = −1, 1. Under weaker assumptions one can retain only the smaller set

    E(y_is Δu_it) = 0,   t = 3, …, T,  s = 0, 1, …, t − 3,

which gives (T − 1)(T − 2)/2 conditions.

Validity of the conditions follows by the continuous substitution seen before:

    y_it = ε_it + γ ε_{i,t−1} + γ² ε_{i,t−2} + ⋯ + γ^{t−1} ε_{i1} + [(1 − γ^t)/(1 − γ)] α_i + γ^t y_i0,

so that, for example,

    E(y_{i,t−2} Δu_it) = E( ε_{i,t−2} (ε_it − ε_{i,t−1}) ) = 0,

because by assumption E(α_i ε_it) = E(ε_it y_i0) = 0.

8.2.2 The instrument matrix

The instrument submatrix for unit i is block-diagonal, one block per differenced equation:

    W_i = diag[ (y_i0), (y_i0, y_i1), (y_i0, y_i1, y_i2), …, (y_i0, …, y_{i,T−2}) ],

that is,

    W_i = [ y_i0   0    0     0   0   0  …  0
            0    y_i0  y_i1   0   0   0  …  0
            0     0    0    y_i0 y_i1 y_i2 … 0
            ⋮                              ⋱
            0     0    0     0   0   0  …  (y_i0 … y_{i,T−2}) ],

so that

    W_i' Δu_i = ( Δu_i2 y_i0, Δu_i3 y_i0, Δu_i3 y_i1, Δu_i4 y_i0, Δu_i4 y_i1, Δu_i4 y_i2, …, Δu_iT y_i0, …, Δu_iT y_{i,T−2} )',

with Δu_it = (y_it − y_{i,t−1}) − γ(y_{i,t−1} − y_{i,t−2}), and E(W_i' Δu_i) = 0 collects all the moment conditions.

8.2.3 One-step and two-step estimators

The initial weighting matrix for GMM is based on the variance-covariance matrix of Δε (in the transformed model). If ε_it is homoskedastic, we have

    E(Δε_it²) = E[(ε_it − ε_{i,t−1})(ε_it − ε_{i,t−1})] = 2σ_ε²,
    E(Δε_it Δε_{i,t+1}) = E[(ε_it − ε_{i,t−1})(ε_{i,t+1} − ε_it)] = −σ_ε²,

so that for unit i the covariance matrix of Δε_i is proportional to the tridiagonal matrix

    H = [  2  −1   0  …  0
          −1   2  −1  …  0
           0  −1   2  …  ⋮
           ⋮          ⋱ −1
           0   …     −1  2 ],

of dimension (T − 1) × (T − 1). We can use the one-step weighting matrix

    A₁ = Σ_{i=1}^N W_i' H W_i,

and the one-step GMM estimator is

    γ̂₁ = [ Δy'_{−1} W A₁^{−1} W' Δy_{−1} ]^{−1} Δy'_{−1} W A₁^{−1} W' Δy,

where Δy and Δy_{−1} stack the differenced dependent variable and its lag across units, and W stacks the W_i. Given first-step residuals Δû_i = Δy_i − γ̂₁ Δy_{i,−1}, we can compute the second-stage (robust) weighting matrix as

    A₂ = Σ_{i=1}^N W_i' Δû_i Δû_i' W_i,

and the two-step estimator replaces A₁ by A₂ in the formula above.
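A sketch of the Arellano-Bond instrument matrix and one-step estimator for a pure AR(1) panel (balanced, no exogenous regressors; H and the GMM formulas as above; the simulated design is purely illustrative):

```python
import numpy as np

def ab_instruments(y):
    """W_i = diag[(y_i0), (y_i0,y_i1), ...] for one unit;
    y is the (T+1)-vector (y_i0, ..., y_iT)."""
    T = len(y) - 1
    blocks = [y[:t - 1] for t in range(2, T + 1)]   # instruments for eq. t = 2..T
    W = np.zeros((T - 1, sum(len(b) for b in blocks)))
    col = 0
    for row, b in enumerate(blocks):
        W[row, col:col + len(b)] = b
        col += len(b)
    return W

def ab_onestep(Y):
    """One-step Arellano-Bond estimator of gamma; Y is N x (T+1)."""
    T = Y.shape[1] - 1
    H = 2 * np.eye(T - 1) - np.eye(T - 1, k=1) - np.eye(T - 1, k=-1)
    A1, Zy, Zyl = 0.0, 0.0, 0.0
    for y in Y:
        dy = np.diff(y)                 # (Delta y_i1, ..., Delta y_iT)
        W = ab_instruments(y)
        A1 = A1 + W.T @ H @ W
        Zy = Zy + W.T @ dy[1:]          # dep. var of differenced eqs t=2..T
        Zyl = Zyl + W.T @ dy[:-1]       # its lag
    A1inv = np.linalg.pinv(A1)
    return (Zyl @ A1inv @ Zy) / (Zyl @ A1inv @ Zyl)

rng = np.random.default_rng(7)
N, T, g0 = 500, 6, 0.5
alpha = rng.normal(size=N)
Y = np.zeros((N, T + 1))
Y[:, 0] = alpha / (1 - g0) + rng.normal(size=N)
for t in range(1, T + 1):
    Y[:, t] = g0 * Y[:, t - 1] + alpha + rng.normal(size=N)
print(ab_onestep(Y))  # close to 0.5
```

The two-step version would recompute the weighting matrix from first-step residuals, as in A₂ above.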

8.3 The Ahn-Schmidt estimator

Ahn and Schmidt (1995) propose additional (nonlinear) moment conditions:

    E(u_iT Δu_it) = 0,   t = 2, 3, …, T − 1,

which adds T − 2 orthogonality conditions; in total we then have T(T − 1)/2 + (T − 2) conditions.

8.3.1 Additional assumptions

8.3.1.1 Homoskedasticity

If E(u_it²) is constant over t for all i, t = 1, 2, …, T, this adds T − 1 conditions. The full set can be written

    E(y_is Δu_it) = 0,   t = 2, …, T,  s = 0, …, t − 2;
    E(y_it u_{i,t+1} − y_{i,t+1} u_{i,t+2}) = 0,   t = 1, …, T − 2;
    E(ū_i u_{i,t+1}) = 0,   t = 1, …, T − 1,

where ū_i = (1/T) Σ_{t=1}^T u_it.

8.3.1.2 Stationarity

Under stationarity of the initial conditions, the entire set of T(T − 1)/2 + (2T − 2) conditions is now

    E(y_is Δu_it) = 0,   t = 2, …, T,  s = 0, …, t − 2;
    E(u_iT y_it) = 0,   t = 1, …, T − 1;
    E(u_it y_it − u_{i,t−1} y_{i,t−1}) = 0,   t = 2, …, T.

The Ahn-Schmidt estimator obtains by adding to the Arellano-Bond instrument matrix a further block for unit i, built from the Δy's and from ū_i arranged diagonally, so that the additional conditions above appear as the corresponding orthogonality products.

Testing the additional conditions. Let (W⁰, W¹) denote two nested sets of instruments, θ̂ and θ̂⁰ the GMM estimates based on (W⁰, W¹) and on W⁰ respectively, and J(θ̂) and J(θ̂⁰) the corresponding overidentification statistics. Then, under the null hypothesis that the additional instruments W¹ are valid,

    J(θ̂) − J(θ̂⁰) ∼ χ²(rank(W¹)).

8.4 The Blundell-Bond estimator

Blundell and Bond (1998) suggest using linear moment restrictions based on assumptions on the initial conditions. They propose conditions on the equation in levels,

    E(u_it Δy_{i,t−1}) = 0,   t = 3, 4, …, T,

with the addition of

    E(u_i3 Δy_i2) = 0.

This last condition, combined with the ones above, implies the Ahn-Schmidt (1995) nonlinear restrictions seen above.

The levels conditions rest on an assumption about the initial observations. Write, from the stationary solution of the model,

    y_i0 = α_i/(1 − γ) + ε_i0.

The deviation of y_i0 from α_i/(1 − γ) must not be correlated with the level of α_i/(1 − γ) itself.

The GMM estimator of Blundell and Bond combines the Arellano-Bond conditions (instrument blocks W_i for the differenced equations) with the conditions on the levels equation

    y_it = γ y_{i,t−1} + α_i + ε_it,

using the stacked (system) instrument matrix

    W_i⁺ = [ W_i   0     0     …  0
             0   Δy_i2  0     …  0
             0    0    Δy_i3  …  0
             ⋮                 ⋱
             0    0     0     …  Δy_{i,T−1} ].

8.5 Multiplicative individual effects

We consider here two generalizations to multiplicative individual-effects models.

8.5.1 Quasi-differencing (Holtz-Eakin et al.)

Consider y_it = γ y_{i,t−1} + x_it β + u_it with

    u_it = λ_t μ_i + ε_it,

where λ_t is a time-varying loading on the individual effect μ_i. Let us lag the equation one period, and define a new variable r_t = λ_t/λ_{t−1}. Subtracting from the first equation the second one premultiplied by r_t, we have

    y_it − r_t y_{i,t−1} = γ (y_{i,t−1} − r_t y_{i,t−2}) + (x_it − r_t x_{i,t−1}) β + ε_it − r_t ε_{i,t−1}.

This is a new, nonlinear equation with parameters to be estimated: γ, β, r_t, t = 2, 3, …, T. This transformation is called quasi-differencing.

GMM estimation is applicable as before (Arellano-Bond, Ahn-Schmidt or Blundell-Bond conditions), but the initial weighting matrix cannot be used anymore. Let ε*_it = ε_it − r_t ε_{i,t−1}. We have, under homoskedasticity,

    E(ε*_it²) = E[(ε_it − r_t ε_{i,t−1})(ε_it − r_t ε_{i,t−1})] = σ_ε²(1 + r_t²),
    E(ε*_it ε*_{i,t+1}) = E[(ε_it − r_t ε_{i,t−1})(ε_{i,t+1} − r_{t+1} ε_it)] = −r_{t+1} σ_ε².

Thus, the optimal initial weighting matrix would be built from the tridiagonal matrix

    [ 1 + r₁²   −r₂       0      …    0
      −r₂      1 + r₂²   −r₃     …    0
       0       −r₃       1 + r₃² …    ⋮
       ⋮                         ⋱   −r_T
       0        …               −r_T  1 + r_T² ].

When the r_t's are all equal to 1, quasi-differencing reduces to first-differencing and this matrix reduces to H, our earlier choice (see above). Also, as the model is nonlinear, we must minimize the GMM objective numerically (no closed-form solution).

our choice, see above. Also, as the model is nonlinear, we must

minimize the GMM numerically (no closed-form solution).

8.5.2

Mixed structure

Consider

where

i = 1; 2; :::; N t = 1; 2; :::; T;

uit = i + tvi + "it:

tvi

captures

We assume

E ("iti) = E ("itvi) = 0; E (yi0"it) = 0 8t; E (ivi) = v :

Consider the case where one of the following conditions holds:

t = s

Under condition (8.1),

= i + vi.

i

8t; s = 1; 2; : : : ; T;

v = v2 = 0:

let t =

8t; then uit = i + "it,

vi

(8.1)

(8.2)

where

E (u2it) =

2 + "2 and E (uituis) = 2 if t 6= s. Models uit = i + tvi + "it

150

and

v = 1; 1).

and

We have

uit = (1 t)i + "it

Then

vi

v = 1;

if v =

1:

if

uit = tvi + "it

with

t = (1 + t); vi = i:

When

tion yields:

uit

ui;t 1 = (t

t 1)vi + "it

"i;t 1;

with

vi:

s t 2:

uit

E [(uit

i .

rt"i;t 1;

We have

s t 2:

8.6.

151

To eliminate both eects

and

tvi,

it is necessary to use a

4yit

r~t4yi;t 1 = (4yi;t 1

r~t4yi;t 2) + 4"it

r~t4"i;t 1;

i = 1; 2; : : : ; N; t = 3; 4; : : : ; T , where

r~t = 4t=4t 1 = (t

t 1)=(t 1

t 2):

GMM estimators of the double-dierence model based on Quasidierencing rst and then First-dierencing residuals are not consistent when instruments include lagged dependent variables.

We would have in that case:

4 [("it

which depends on

4(rt"i;t

1)

i4rt;

i.

GMM procedures using instrument matrices from lagged dependent variables would yield consistent estimates only when the correct model transformation is performed.

8.6 Application: a dynamic wage equation

Consider the wage equation seen before, in a simpler, dynamic form:

    w_it = γ w_{i,t−1} + β₁ WKS_it + β₂ OCC_it + u_it,

where w_it is the wage rate, WKS_it weeks worked, and OCC_it an occupation indicator.

Three specifications of the error term are considered:

1. Standard case u_it = α_i + ε_it;
2. Multiplicative case u_it = λ_t v_i + ε_it;
3. Mixed case u_it = α_i + λ_t v_i + ε_it.

Case 1 is estimated by GMM after the First-difference transformation. In case 2, nonlinear GMM is applied after Quasi-differencing, with parameters γ, β and the ratios λ̃_t = λ_t/λ_{t−1}; case 3 is estimated after the Double-difference transformation. Instruments are lagged values of (WKS, OCC).

Table 8.1: First-difference GMM
  Parameter   Estimate   Std. error   t-stat.
  γ            0.9465     0.0126      74.83
  β₁           0.0022     0.0022       0.98
  β₂          -0.0848     0.0423      -2.00

Table 8.2: Quasi-difference GMM
  Parameter   Estimate   Std. error   t-stat.
  γ            0.9121     0.0218      41.72
  β₁           0.0150     0.0038       3.87
  β₂          -0.1014     0.1007      -1.00
  r₁          -0.5838     0.3856      -1.51
  r₂          -0.0871     0.0974      -0.89
  r₃           0.3294     0.0621       5.29
  r₄          -0.1842     0.1074      -1.71
  r₅           1.0401     0.5947       1.75

Table 8.3: Double-difference GMM
  Parameter   Estimate   Std. error   t-stat.
  γ            0.9211     0.0460      19.98
  β₁           0.0082     0.0014       5.79
  β₂          -0.0394     0.0322      -1.22
  r̃₁          -0.5272     0.2250      -2.34
  r̃₂          -0.1188     0.1029      -1.15
  r̃₃           0.2931     0.1009       2.90
  r̃₄          -0.0863     0.0399      -2.16

Part III

Discrete choice models

Chapter 9

Nonlinear panel data models

9.1 Brief review of binary discrete-choice models

Models with qualitative variables: binary choice and multinomial models. Brief survey of these models, for cross-section data and the binary case:

    y*_i = x_i β + u_i,  i = 1, 2, …, N;
    y_i = 1 if y*_i > 0;   y_i = 0 if y*_i ≤ 0;

x_i a 1 × K vector of regressors. The threshold 0 is arbitrary here, as E(y*_i) is unknown.

9.1.1 Linear probability model

If P(y_i = 1) = x_i β, fitted probabilities are not constrained to lie in [0, 1]. There are two possible values for the residual u_i: 1 − x_i β (when y_i = 1) or −x_i β (when y_i = 0). Hence

    Var(u_i) = P(y_i = 0)(−x_i β)² + P(y_i = 1)(1 − x_i β)²
             = (1 − x_i β)(x_i β)² + x_i β (1 − x_i β)²
             = x_i β (1 − x_i β)[x_i β + (1 − x_i β)]
             = x_i β (1 − x_i β):

the residual is intrinsically heteroskedastic.

9.1.2 Logit model

    P(y_i = 1) = Λ(x_i β) = exp(x_i β)/[1 + exp(x_i β)],
    P(y_i = 0) = 1 − Λ(x_i β) = 1/[1 + exp(x_i β)].

Density: λ(x_i β) = exp(x_i β)/[1 + exp(x_i β)]². In this case, Var(u_i) = π²/3.

9.1.3 Probit model

    P(y_i = 1) = Φ(x_i β/σ) = ∫_{−∞}^{x_i β/σ} (1/√(2π)) exp(−u²/2) du,
    P(y_i = 0) = 1 − Φ(x_i β/σ),

with φ(·) the standard normal density. The error u_i is N(0, σ²), and the scale parameter σ is not identified (only β/σ is): it is normalized to 1.

Estimation method: Maximum Likelihood:

    β̂ = argmax_β Π_{i=1}^N [P(y_i = 1)]^{y_i} [1 − P(y_i = 1)]^{1−y_i}
       = argmax_β Π_{i=1}^N F(q_i x_i β),   where q_i = 2y_i − 1,

using the symmetry of the logistic and normal distributions (F(−v) = 1 − F(v)).

Quantities of interest: a) the coefficients β; b) marginal effects (∂P(y_i = 1)/∂x_i).

9.2 Fixed-effects Logit models

With panel data, the model becomes

    P(y_it = 1) = P(ε_it < x_it β + α_i) = F(x_it β + α_i).

9.2.1 Sufficient statistics

Maximum Likelihood estimation: we have to estimate both β and α_i, i = 1, …, N, but the estimates of α_i and β are not independent for qualitative-choice models. When T is fixed, MLE estimates of α_i are not consistent and, consequently, the MLE of β is not consistent either. The individual effects α_i are denoted incidental parameters (their number increases with N).

Solution: the Neyman-Scott (1948) principle of estimation in the presence of incidental parameters. If there exists a sufficient statistic τ_i for α_i, i = 1, 2, …, N, then

    f(y_i | x_i, τ_i, β) = f(y_i | x_i, α_i, β) / g(τ_i | x_i, α_i, β),   for g(τ_i | x_i, α_i, β) > 0,

does not depend on α_i. A consistent estimator of β is then

    β̂ = argmax_β Π_{i=1}^N f(y_i | x_i, τ_i, β).

For the Logit model, the joint probability of y_i is

    P(y_i) = exp( α_i Σ_{t=1}^T y_it + Σ_{t=1}^T y_it x_it β ) / Π_{t=1}^T [1 + exp(x_it β + α_i)].

First-order conditions of the full log-likelihood, wrt. β:

    ∂log L/∂β = Σ_{i=1}^N Σ_{t=1}^T [ − exp(x_it β + α_i)/(1 + exp(x_it β + α_i)) + y_it ] x_it = 0,

and wrt. α_i:

    ∂log L/∂α_i = Σ_{t=1}^T [ − exp(x_it β + α_i)/(1 + exp(x_it β + α_i)) + y_it ] = 0,  i = 1, 2, …, N,

that is,

    Σ_{t=1}^T y_it = Σ_{t=1}^T exp(x_it β + α_i)/[1 + exp(x_it β + α_i)].

Hence, a sufficient statistic for α_i is τ_i = Σ_{t=1}^T y_it. The probability that Σ_t y_it = s is

    P( Σ_t y_it = s ) = [ exp(α_i s) / Π_t (1 + exp(x_it β + α_i)) ] Σ_{d∈B_i(s)} exp( Σ_{t=1}^T d_it x_it β ),

where B_i(s) is the set of 0/1 sequences d = (d_i1, …, d_iT) with Σ_t d_it = s, of which there are T!/[s!(T − s)!].

9.2.2 Conditional probabilities

The conditional probability of y_i given τ_i is

    P( y_i | Σ_t y_it ) = exp( Σ_{t=1}^T y_it x_it β ) / Σ_{d∈B_i} exp( Σ_{t=1}^T d_it x_it β ),

where B_i is the set of indices for individual i:

    B_i = { d = (d_i1, …, d_iT) : d_it ∈ {0, 1}, Σ_{t=1}^T d_it = Σ_{t=1}^T y_it }.

The α_i cancel between numerator and denominator. Groups for which Σ_t y_it = 0 or Σ_t y_it = T carry no information (the conditional probability is 1) and drop out of the conditional likelihood. For Σ_t y_it = s ∈ ]0, T[, there are T!/[s!(T − s)!] elements of B_i, corresponding to the distinct T-sequences of 0's and 1's with sum s.

Notes. To compute the above probability, we have to consider, for each s, all possible sequences of 0's and 1's. Example: if T = 4 and s = 2, we would have 6 cases, the rows of

    [ 1 1 0 0
      1 0 1 0
      1 0 0 1
      0 1 1 0
      0 1 0 1
      0 0 1 1 ],

and Σ_{d∈B_i} exp(Σ_t d_it x_it β) sums, over these 6 rows d, the products Π_t exp(d_it x_it β) formed from (exp(x_i1 β), exp(x_i2 β), exp(x_i3 β), exp(x_i4 β)).

9.2.3 Example: T = 2

Condition on y_i1 + y_i2 = 1. Let

    ω_i = 1 if (y_i1, y_i2) = (0, 1);   ω_i = 0 if (y_i1, y_i2) = (1, 0).

Then

    P(ω_i = 1 | y_i1 + y_i2 = 1) = P(ω_i = 1) / [P(ω_i = 0) + P(ω_i = 1)]
      = exp(α_i + x_i2 β) / [ exp(α_i + x_i1 β) + exp(α_i + x_i2 β) ]
      = exp[(x_i2 − x_i1)β] / ( 1 + exp[(x_i2 − x_i1)β] ) = Λ[(x_i2 − x_i1)β].

In that case, B = {i | y_i1 + y_i2 = 1} and the conditional log-likelihood is

    log L = Σ_{i∈B} { ω_i log Λ[(x_i2 − x_i1)β] + (1 − ω_i) log(1 − Λ[(x_i2 − x_i1)β]) }.

In practice, when T > 2, we have to consider alternative sets of observations for which Σ_t y_it is the same. Note that this formulation is a conditional Logit specification: regressors x depend on the alternative.
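A sketch of the T = 2 conditional logit above: only units switching between 0 and 1 contribute, and β maximizes the conditional log-likelihood in Δx_i = x_i2 − x_i1 (a crude grid search is used here instead of Newton's method, purely for transparency):

```python
import numpy as np

def cond_logit_T2(y1, y2, dx, beta_grid):
    """Conditional logit for T=2: over switchers, log L(b) =
    sum w*log(Lambda(dx*b)) + (1-w)*log(1-Lambda(dx*b)); grid search."""
    keep = (y1 + y2) == 1
    w, d = y2[keep], dx[keep]           # w = 1 if (0,1), 0 if (1,0)
    def ll(b):
        p = 1.0 / (1.0 + np.exp(-d * b))
        return (w * np.log(p) + (1 - w) * np.log(1 - p)).sum()
    lls = np.array([ll(b) for b in beta_grid])
    return beta_grid[lls.argmax()]

rng = np.random.default_rng(8)
N, beta0 = 4000, 1.0
alpha = rng.normal(size=N)              # fixed effects, never estimated
x1, x2 = rng.normal(size=N), rng.normal(size=N)
y1 = (x1 * beta0 + alpha + rng.logistic(size=N) > 0).astype(int)
y2 = (x2 * beta0 + alpha + rng.logistic(size=N) > 0).astype(int)
grid = np.linspace(0.0, 2.0, 201)
print(cond_logit_T2(y1, y2, x2 - x1, grid))  # close to 1.0
```

Note that β is recovered without ever estimating the α_i.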

9.3 Random-effects Probit models

One typically uses the Probit model in the random-effects case (easier to work with than the fixed-effects Probit). Let u_it = α_i + ε_it, where α_i is drawn from a N(0, σ_α²) distribution, independent of ε_it ∼ N(0, 1). We assume

    Var(α) = σ_α²;   Var(ε_it) = 1;   Corr(u_it, u_is) = ρ = σ_α²/(1 + σ_α²),  t ≠ s.

The contribution to the likelihood of unit i is L_i = P(y_i):

    L_i = ∫_{−∞}^{φ_i1 x_i1 β} ⋯ ∫_{−∞}^{φ_iT x_iT β} f(u_i1, …, u_iT) du_i1 … du_iT,

where φ_it = 2y_it − 1 and f(·) is the joint density of the elements of u_i. Conditioning on α_i, the components of u_i are independent, so we can write L_i as

    L_i = ∫_{−∞}^{+∞} [ Π_{t=1}^T F(u_it | α_i) ] f(α_i) dα_i,

and, after the change of variable t_i = α_i/(σ_α √2),

    L_i(y_i) = (1/√π) ∫_{−∞}^{+∞} e^{−t_i²} [ Π_{t=1}^T Φ( φ_it x_it β + φ_it t_i σ_α √2 ) ] dt_i,

which is now a one-dimensional integral that can be evaluated numerically (Gauss-Hermite integration procedure). Disadvantage of the method: it assumes a constant correlation (ρ) across periods.

164

models

We consider here estimation of binary-choice panel data models

with xed eects and possibly endogenous regressors.

In the model

i = 1; : : : ; N; t = 1; : : : ; T;

true if

x is strictly exogenous:

E ["itjxi1; : : : ; xiT ] = 0:

x is predetermined only:

E ["itjxi1; : : : ; xit] = 0;

and in this case we have to use IV estimation strategy, e.g., tting

using as instruments past values of

x.

some linearization of the model is performed.

9.4.1

Negative result of Chamberlain (1993): Even if the distribution of

"it

(e.g.,

it

is independent from

zi).

and

"it,

conditional on

x and

Assumptions

A.1. The conditional distribution of

Radon-Nikodym conditional density function

ft(itjxit; zi).

it (conditioning on xit and zi). The conditional distribution of

eit has support

et(xit; zi) and is denoted Fet(eitjxit; zi).

A.2. Let

t = r and t = s, the conditional distribution of it given xit and zi has support [Lt ; Kt ] with

1 Lt <

0 < Kt 1, and the support of xit eit is a subset of [Lt; Kt].

A.3. For 2 periods

A.4. Let

(i)

Then

(ii)

E (izi); zz ; xrz and xsz exist:

(iii) zz and (xrz xsz ) 0zz (xrz xsz )0

166

are nonsingular.

(xit; zi) can be correlated with it, but (A.1) rules out (xit; zi)

it;

i can be correlated with xit or zi, but (nuit; i) must be independent given (xit ; zi );

"it is uncorrelated with instruments zi;

(A.2) means that the conditional distribution of "it given (it; xit; zi)

does not depend on it ;

According to (A.3), it can take on any value that x0it + eit

as deterministic functions of

P rob(yitjxit; zi; it = k1) ! 0

and

fk1g and

! 1:

when the variance of

variable.

Theorem 7

Let

yit =

:

ft(itjxit; zi)

E (yit jxit; zi) = x0it + E (i + "itjxit; zi):

Proof. Let (dropping subscripts for clarity) s = s(x; e) =

x0

e.

We have

E (yjx; z ) = E

E [y

jx; z

f ( jx; z )

=

=

Z KZ

Z K

E [y

f ( jx; z )d

f ( jx; z )

Z K

e L

[1I( > s)

e wrt. :)

Note that

1I( > s)

+1I( > 0 s)] 1I(s 0) [1I(s > > 0) + 1I( > s > 0)] 1I(s > 0)

1I( > 0 s)1I(s 0)

= 1I(s > 0) [1I( > s > 0) 1I(s > > 0) 1I( > s > 0)]

+1I(s 0) [1I(0 s) + 1I( > 0 s)

= 1I(s 0)1I(0 > s)

,

=

E (yjx; z ) =

1I(s 0)

Z K

e L

Z 0

s

ddFe(ejx; z)

1 d

1I(s > 0)

Z s

1 d dFe(ejx; z )

= x0 + E (ejx; z )

QED:

sdFe(ejx; z )

168

9.4.2 The IV estimator

Differencing two periods t = r, s removes the individual-effect term E(α_i | ·). Under (A.4), with

    ψ_t = E(z_i ỹ_it),  t = r, s,

β is consistently estimated from

    β = [ (Σ_xrz − Σ_xsz) Σ_zz^{−1} (Σ_xrz − Σ_xsz)' ]^{−1} (Σ_xrz − Σ_xsz) Σ_zz^{−1} (ψ_r − ψ_s).

Indeed, from Theorem 7,

    E(ỹ_ir − ỹ_is | x, z) = (x_ir − x_is) β + E(ε_ir − ε_is | x, z).

Let Δx = x_ir − x_is and Δỹ = ỹ_ir − ỹ_is. The 2SLS estimator of the regression of Δỹ on Δx with instruments z will be

    β̂ = [ (Δx' z)(z'z)^{−1}(z' Δx) ]^{−1} (Δx' z)(z'z)^{−1} z' Δỹ.

Honoré and Lewbel show that

    √N (β̂ − β) →d N( 0, Γ Var(Q_i) Γ' ),

where Γ is the matrix appearing in the linear representation of β̂ (replaceable by its sample counterpart Γ̂), and

    Q̂_i = (z_i ỹ_ir − z_i ỹ_is) − z_i (x_ir − x_is) β̂.

Density estimation. For computing ỹ_it, we need an estimate of f_t, the conditional density of v_it given (x_it, z_i). Let w_it = (x_it, z_i) collect the conditioning variables and instrument variables, and u_it = (v_it, w_it) (a K + L + 1 vector). Let f̂(v_it, w_it) and f̂(w_it) respectively denote the estimated joint density function of v_it and the components of w_it, and the joint density associated to the components of w_it. These densities are estimated by the kernel method:

    f̂(v_it, w_it) = [NT h^{K+L+1}]^{−1} Σ_{j=1}^{NT} K_m( (u_it − u_j)/h ),
    f̂(w_it) = ∫ f̂(v_it, w_it) dv_it = [NT h^{K+L}]^{−1} Σ_{j=1}^{NT} K̄_m( (w_it − w_j)/h ),

where h is the window and K_m, K̄_m are kernels such that K̄_m(x) = ∫ K_m(x, y) dy and ∫ K̄_m(x) dx = 1. Then

    f̂_t(v_it | x_it, z_i) = f̂(v_it, w_it) / f̂(w_it).

General-purpose estimation technique for models with selection:

models with endogenous regime switching, Generalized Tobit models, etc.

Use of a particular, ecient simulator for multivariate normal

distributions: the GHK simulator (Geweke-Hajivassiliou-Keane,

Geweke 1991, Brsch-Supan and Hajivassiliou 1993, Keane 1994).

9.5.1

L=

where

g("j)f ()d;

f:rg

= (1; 2; : : : ; K )0 and "

are a

K -vector

(9.1)

and a

M-

general structural model dened by

straints

r.

g(:j:),

Notes:

In this model formulation, " is an implicit function of parameters and observed variables.

Function g(:j:) may depend in particular on the conditional distribution of

" given .

integration involving multiple probability distributions.

Idea of the GHK technique: construct a recursive algorithm to

approximate multiple integrals.

9.5.

171

Let

= var(),

decomposition):

D

Dene

B

=B

B

@

satisfying

DD0 =

(Choleski

d11 0 : : : 0

d21 d22 : : : 0 C

C

..

.

..

.

..

dK 1 dK 2 : : :

..

.

..

.

C:

A

variate. We have

L=

where

i(:):

K

Y

(9.2)

= (1; 2; : : : ; K ).

f : rg can be written

recursively as

1

1

1

1

r1; 2 (r2 d211); 3 (r3 d311 d322);

d11

d22

d33

1

: : : ; K

(r

d : : : dK;K 1K 1):

dKK K K 1 1

L=

Z 1

"

Z 1

1

r1 =d11 d22

(r2 d21 1 )

:::

K

Y

i=1

(i)Ai = (i)

1

1

(r

dii i

di11

i

: : : di;i 1i 1)

(9.3)

1

172

1

where Ai =

dii (ri

di11

i

, and

(:) is the

normal cumulative density function (CDF). The likelihood function above is now

L=

Z

A1

:::

"

Z

AK

K

Y

i=1

K

Y

i=1

1

(r

dii i

!

!

di11

: : : di;i 1i 1)

distribution is truncated normal is between 0 and 1. Let ui denote

a random variable on [0; 1]. We can then write

dom variables. The probability associated to any

ui =

(i)

1

d1ii (ri

d1ii (ri

di11

di11

: : : di;i 1i 1)

: : : di;i 1i 1)

; i = 1; 2; : : : ; K:

For example:

(1) (r1=d11)

,

1 = 1 [u1 (1 (r1=d11)) + (r1=d11)]

1 (r1=d11)

(2) (1=d22(r2 d21r1))

u2 =

1 (1=d22(r2 d21r1))

1

1

, 2 = 1 u2 1 d (r2 d211) + d (r2 d211) ;

22

22

where 1 is dened above.

For any i, we have the recursive formula:

1

i = 1 ui 1

(r d : : : di;i 1i 1)

dii i i1 1

u1 =

9.5.

173

1

+

(r

dii i

di11

: : : di;i 1i 1)

which depends on the sequence of uniform random variables (u1, ..., uK). The likelihood function now involves the random variables ui, i = 1, ..., K, and K integrals with constant bounds:

L = ∫₀¹ ... ∫₀¹ [ Π_{i=1}^{K} Φ( (1/dii)(ri − di1 ε1 − ... − di,i−1 εi−1) ) ] g(·|Dε) du1 du2 ... duK,

where g(·|Dε) is the conditional density appearing in the likelihood, evaluated at η = Dε. Since the ui's are easy to draw, the GHK simulator is:

- Draw S vectors u^s = (u1^s, ..., uK^s), s = 1, ..., S, of independent uniforms on [0, 1];
- Compute recursively (ε1^s, ..., εK^s) from u^s as above;
- Average out over the S draws to form the Simulated Likelihood:

L_S = (1/S) Σ_{s=1}^{S} [ Π_{i=1}^{K} Φ( (1/dii)(ri − di1 ε1^s − ... − di,i−1 εi−1^s) ) ] g(·|Dε^s).

Note: it is easy to generalize to a restriction set of the form a < η < b. Draws become

εi = Φ⁻¹[ ui ( Φ(Bi) − Φ(Ai) ) + Φ(Ai) ],

where Ai = (1/dii)(ai − di1 ε1 − ... − di,i−1 εi−1) and Bi = (1/dii)(bi − di1 ε1 − ... − di,i−1 εi−1). If we want to compute for example the probability of the event {a < η < b}, we form

Q(θ) = Q1 · Q2 · ... · QK,

where Qi = Φ[ (bi − di1 ε1 − ... − di,i−1 εi−1)/dii ] − Φ[ (ai − di1 ε1 − ... − di,i−1 εi−1)/dii ],

and average out over simulations.
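The recursion above is easy to code. Below is a minimal, illustrative Python sketch of the GHK simulator for P(η ≤ r) (the notes' own software examples use Gauss; all names here, and the bisection-based Φ⁻¹ helper, are my own assumptions, not part of the course material):

```python
import math, random

def Phi(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def Phi_inv(p, lo=-10.0, hi=10.0):
    """Inverse normal CDF by bisection (accurate enough for a sketch)."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ghk(r, D, S=200, seed=0):
    """Estimate P(eta <= r) for eta = D*eps, eps ~ N(0, I_K), D lower triangular."""
    rng = random.Random(seed)
    K = len(r)
    total = 0.0
    for _ in range(S):
        eps = [0.0] * K
        prob = 1.0
        for i in range(K):
            m = sum(D[i][j] * eps[j] for j in range(i))
            A = (r[i] - m) / D[i][i]      # conditional truncation bound
            p = Phi(A)
            prob *= p                      # product of conditional probabilities
            eps[i] = Phi_inv(rng.random() * p)  # eps_i truncated to (-inf, A]
        total += prob
    return total / S

# independent case: with a diagonal D the simulator reproduces Phi(r1)*Phi(r2)
p = ghk([0.5, -0.3], [[1.0, 0.0], [0.0, 1.0]], S=200)
```

With a diagonal D the bounds Ai do not depend on the draws, so every simulation returns exactly Φ(r1)Φ(r2); with nonzero off-diagonal terms the simulator averages the recursive conditional probabilities of (9.3).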

9.5.2 Example

"Unemployment and liquidity constraints". Let St and Et denote the unemployment and credit-constraint indicators respectively, defined from two latent variables y*1t and y*2t:

St = 1 if y*1t > 0,   St = 0 if y*1t ≤ 0;
Et = −1 if y*2t < γ−,   Et = 0 if γ− ≤ y*2t < γ+,   Et = +1 if y*2t ≥ γ+;

with

y*2 = α 1I(y*1 > 0) + x2 β2 + v2.

There are six possible regimes, as (S, E) ∈ {0, 1} × {−1, 0, 1}. Each regime restricts the latent indices: for example, (S, E) = (0, −1) corresponds to {x1β1 + v1 < 0} ∩ {x2β2 + v2 < γ−}, while (S, E) = (1, 0) corresponds to {x1β1 + v1 > 0} ∩ {γ− < α + x2β2 + v2 < γ+}, and so on. Every regime can therefore be written as a rectangle

a1 < v1 < b1,   a2 < v2 < b2,

with bounds such as (a1, b1) = (−∞, −x1β1) or (−x1β1, +∞) for the first equation, and (a2, b2) built from γ− − x2β2 − αS and γ+ − x2β2 − αS for the second. This is exactly the setting of the GHK example above, where v corresponds to η and (a, b) corresponds to the truncation bounds. In the panel data case, we would have T such regimes per individual and individual effects in both equations; conditional on the effects, the same recursive simulation applies, and the method allows for multivariate distributions of individual effects, possibly correlated across equations.

Appendix 1: Maximum Likelihood estimation of the Random-effect model

Under normality of α and ε, the log-likelihood is

log L = −(NT/2) log(2π) − (NT/2) log(σε²) − (N/2) log(ψ) − (1/(2σε²)) U′Λ⁻¹U,

where Λ = Ω/σε² = Q + ψB, with ψ = (σε² + Tσα²)/σε², and

|Ω| = (σε²)^{N(T−1)} (σε² + Tσα²)^N = (σε²)^{NT} ψ^N.

Concentrating out σε² yields

log L = −(NT/2) log(2π) − (NT/2) log[ d′(Q + ψ⁻¹B) d ] − (N/2) log(ψ) + constant,

where d = Y − Xβ̂.

Estimate of 1/ψ conditional on β:

1/ψ̂ = d′Qd / [ (T−1) d′Bd ] = Σi Σt (dit − d̄i)² / [ T(T−1) Σi (d̄i − d̄)² ].

Estimate of β conditional on 1/ψ:

β̂ = [ X′(Q + ψ⁻¹B) X ]⁻¹ X′(Q + ψ⁻¹B) Y.

Direct maximization of log L may stop at a local maximum. Breusch (1987) procedure: iterate between σ̂ε², 1/ψ̂ and β̂ until convergence. Breusch shows that 1/ψ̂ remains positive along the iterations and moves monotonically. Hence, if one run starts where 1/ψ̂ is positive and begins an increasing sequence, a second run starts where 1/ψ̂ is positive and begins a decreasing sequence, and the maximum of log L attained is the same, this is the true (global) maximum.
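The two conditional estimates above are cheap to compute, which is what makes the Breusch iteration attractive. As an illustration, the conditional estimate of 1/ψ is just a ratio of within to between residual variation (illustrative Python; the function name and data layout are my own assumptions):

```python
def inv_psi(d):
    """1/psi-hat = sum_it (d_it - dbar_i)^2 / [T(T-1) * sum_i (dbar_i - dbar)^2],
    with d an N x T array of residuals d = Y - X*beta-hat."""
    N, T = len(d), len(d[0])
    dbar_i = [sum(row) / T for row in d]          # individual means
    dbar = sum(dbar_i) / N                        # overall mean
    within = sum((d[i][t] - dbar_i[i]) ** 2 for i in range(N) for t in range(T))
    between = sum((m - dbar) ** 2 for m in dbar_i)
    return within / (T * (T - 1) * between)

r = inv_psi([[1.0, -1.0], [2.0, 0.0]])
```

For the toy residuals above, the within sum of squares is 4 and T(T−1)Σ(d̄i − d̄)² = 1, so 1/ψ̂ = 4.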

Appendix 2: The two-way error components model

A2.1 Assumptions and notation

Model: uit = αi + λt + εit. Assumptions:

E(αi) = E(λt) = E(εit) = 0,   E(αi λt) = E(αi εit) = E(λt εit) = 0,

and each component is i.i.d. and independent of the others across i and t. We have

E(uit ujs) = σα² + σλ² + σε²  if i = j, t = s;
             σα²              if i = j, t ≠ s;
             σλ²              if i ≠ j, t = s;
             0                otherwise.

Hence

Ω = σα² (IN ⊗ eT e′T) + σλ² (eN e′N ⊗ IT) + σε² (IN ⊗ IT)
  = Tσα² B + Nσλ² B̃ + σε² INT,

where B = IN ⊗ (eT e′T / T) and B̃ = (eN e′N / N) ⊗ IT.

A2.2 Feasible GLS estimation

We can write Ω = Σ_{j=1}^{4} λj Mj, with

λ1 = σε²,   λ2 = Tσα² + σε²,   λ3 = Nσλ² + σε²,   λ4 = Tσα² + Nσλ² + σε²,

and

M1 = (IN − eN e′N / N) ⊗ (IT − eT e′T / T),
M2 = (IN − eN e′N / N) ⊗ (eT e′T / T),
M3 = (eN e′N / N) ⊗ (IT − eT e′T / T),
M4 = (eN e′N / N) ⊗ (eT e′T / T).

We have Ω^r = Σ_{j=1}^{4} λj^r Mj for any power r, so that

σε Ω^{−1/2} = Σ_{j=1}^{4} (σε/√λj) Mj,

and the typical element of Y* = σε Ω^{−1/2} Y is

y*it = yit − θ1 ȳi − θ2 ȳt + θ3 ȳ,

with

θ1 = 1 − σε/√λ2,   θ2 = 1 − σε/√λ3,   θ3 = θ1 + θ2 + σε/√λ4 − 1.

Feasible GLS is then OLS of Y* on X*. An unbiased estimator of λj is U′Mj U / tr(Mj), j = 1, 2, 3. Because U is not observed, we use OLS or Within residuals. Asymptotic distribution of the variance-component estimates:

( √(NT)(σ̂ε² − σε²), √N(σ̂α² − σα²), √T(σ̂λ² − σλ²) )′ → N( 0, diag(2σε⁴, 2σα⁴, 2σλ⁴) ).
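As a numerical sanity check on the θ-transformation, here is a small pure-Python sketch (the function names are illustrative, not from the notes): it computes (θ1, θ2, θ3) from the variance components and applies y*it = yit − θ1ȳi − θ2ȳt + θ3ȳ. When σα² = σλ² = 0, all three θ's vanish and the transformation leaves the data unchanged, so GLS collapses to OLS.

```python
import math

def two_way_thetas(s2_eps, s2_alpha, s2_lambda, N, T):
    """theta_1..theta_3 from the spectral roots lambda_2..lambda_4."""
    lam2 = T * s2_alpha + s2_eps
    lam3 = N * s2_lambda + s2_eps
    lam4 = T * s2_alpha + N * s2_lambda + s2_eps
    se = math.sqrt(s2_eps)
    t1 = 1.0 - se / math.sqrt(lam2)
    t2 = 1.0 - se / math.sqrt(lam3)
    t3 = t1 + t2 + se / math.sqrt(lam4) - 1.0
    return t1, t2, t3

def transform(y, t1, t2, t3):
    """Apply y*_it = y_it - t1*ybar_i - t2*ybar_t + t3*ybar to an N x T array."""
    N, T = len(y), len(y[0])
    ybar_i = [sum(row) / T for row in y]
    ybar_t = [sum(y[i][t] for i in range(N)) / N for t in range(T)]
    ybar = sum(ybar_i) / N
    return [[y[i][t] - t1 * ybar_i[i] - t2 * ybar_t[t] + t3 * ybar
             for t in range(T)] for i in range(N)]

y = [[1.0, 2.0], [3.0, 5.0]]
t1, t2, t3 = two_way_thetas(1.0, 0.0, 0.0, N=2, T=2)  # no alpha or lambda effects
ystar = transform(y, t1, t2, t3)                       # identity transformation
```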

The three projections M1, M2, M3 define the Within, Between-individual and Between-period estimators.

M1 = (IN − eN e′N / N) ⊗ (IT − eT e′T / T): Within. Estimate of λ1:

λ̂1 = σ̂ε² = [ Y′M1Y − Y′M1X β̂Within ] / [ (N−1)(T−1) − K ].

M2 = (IN − eN e′N / N) ⊗ (eT e′T / T): Between-individual. Estimate of λ2:

λ̂2 = [ Y′M2Y − Y′M2X β̂BI ] / [ (N−1) − K ],

and we compute σ̂α² = (1/T)(λ̂2 − σ̂ε²).

M3 = (eN e′N / N) ⊗ (IT − eT e′T / T): Between-period. Estimate of λ3:

λ̂3 = [ Y′M3Y − Y′M3X β̂BP ] / [ (T−1) − K ],

and we compute σ̂λ² = (1/N)(λ̂3 − σ̂ε²).

The GLS estimator is

β̂GLS = [ (X′M1X)/λ1 + (X′M2X)/λ2 + (X′M3X)/λ3 ]⁻¹ [ (X′M1Y)/λ1 + (X′M2Y)/λ2 + (X′M3Y)/λ3 ],

and

Var(β̂GLS) = [ (X′M1X)/λ1 + (X′M2X)/λ2 + (X′M3X)/λ3 ]⁻¹.

The Within estimator is β̂Within = [X′M1X]⁻¹ X′M1Y, the Between-individual estimator is β̂BI = [X′M2X]⁻¹ X′M2Y, and the Between-period estimator is β̂BP = [X′M3X]⁻¹ X′M3Y. GLS is a matrix-weighted average of the three:

β̂GLS = W1 β̂Within + W2 β̂BI + W3 β̂BP,

with

W1 = [ X′M1X + (σε²/λ2) X′M2X + (σε²/λ3) X′M3X ]⁻¹ (X′M1X),
W2 = [ X′M1X + (σε²/λ2) X′M2X + (σε²/λ3) X′M3X ]⁻¹ (σε²/λ2)(X′M2X),
W3 = [ X′M1X + (σε²/λ2) X′M2X + (σε²/λ3) X′M3X ]⁻¹ (σε²/λ3)(X′M3X).

When T and N → ∞, β̂GLS → β̂Within. More generally, the weights depend on the ratios σε²/λ2 and σε²/λ3: the closer σε²/λ2 (resp. σε²/λ3) is to one, the larger the weight given to the Between-individual (resp. Between-period) variation.

Breusch-Pagan (1980): Lagrange Multiplier test statistic for H0: σα² = σλ² = 0, which requires estimation under the null only (i.e., OLS). The statistic is

LM = [ ∂log L(θ)/∂θ ]′ [ −E( ∂²log L(θ)/∂θ∂θ′ ) ]⁻¹ [ ∂log L(θ)/∂θ ],

where θ = (σα², σλ², σε²) and

log L(θ) = −(NT/2) log(2π) − (1/2) log|Ω| − (1/2) U′Ω⁻¹U.

Gradient of the log-likelihood:

∂log L(θ)/∂θi = −(1/2) tr( Ω⁻¹ ∂Ω/∂θi ) + (1/2) U′ Ω⁻¹ (∂Ω/∂θi) Ω⁻¹ U,   i = 1, 2, 3.

Because Ω = σα²(IN ⊗ eT e′T) + σλ²(eN e′N ⊗ IT) + σε²(IN ⊗ IT), we have

∂Ω/∂θi = IN ⊗ eT e′T   (i = 1, σα²);
          eN e′N ⊗ IT   (i = 2, σλ²);
          INT           (i = 3, σε²).

Hence, evaluated under H0 (Ω = σε² INT, with σ̂ε² = U′U/NT),

∂log L(θ)/∂θ = (NT/(2σε²)) [ U′(IN ⊗ eT e′T)U / U′U − 1 ;  U′(eN e′N ⊗ IT)U / U′U − 1 ;  0 ],

and the expected information under H0 is

−E( ∂²log L(θ)/∂θ∂θ′ ) = (1/(2σε⁴)) [ NT²  NT  NT ;  NT  N²T  NT ;  NT  NT  NT ].

Substituting, the statistic reduces to

LM = ( NT / (2(T−1)) ) [ U′(IN ⊗ eT e′T)U / U′U − 1 ]²
   + ( NT / (2(N−1)) ) [ U′(eN e′N ⊗ IT)U / U′U − 1 ]²,

and is distributed as a χ²(2) under H0. Important note: LM only requires the OLS residuals U.
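The statistic only needs the pooled-OLS residuals arranged as an N × T array; the two quadratic forms are just the sums of squared individual and period totals. A compact illustrative Python version (my own naming):

```python
def bp_lm(u):
    """Two-way Breusch-Pagan LM statistic from an N x T array of OLS residuals."""
    N, T = len(u), len(u[0])
    uu = sum(x * x for row in u for x in row)                 # U'U
    a1 = sum(sum(row) ** 2 for row in u) / uu                 # U'(I_N (x) e_T e_T')U / U'U
    a2 = sum(sum(u[i][t] for i in range(N)) ** 2
             for t in range(T)) / uu                          # U'(e_N e_N' (x) I_T)U / U'U
    return (N * T / (2.0 * (T - 1)) * (a1 - 1.0) ** 2
            + N * T / (2.0 * (N - 1)) * (a2 - 1.0) ** 2)

lm = bp_lm([[1.0, -1.0], [-1.0, 1.0]])
```

For the crafted residuals above, both individual and period sums are zero, so each bracket equals (0 − 1)² = 1 and LM = NT/(2(T−1)) + NT/(2(N−1)) = 4.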

Appendix 3: GLS estimation of the unbalanced one-way effects model

A3.1 Assumptions and notation

Consider for simplicity N = 2 units (or groups), observed over D1 and D1 + D2 periods respectively:

(D1 × 1)        Y1 = X1 β + U1,
((D1+D2) × 1)   Y2 = X2 β + U2,

where X1 and X2 are respectively D1 × K and (D1+D2) × K. The variance-covariance matrix of U = (U′1, U′2)′ is block-diagonal,

Ω = [ Ω1  0 ;  0  Ω2 ],   Ωj = σα² eTj e′Tj + σε² ITj,

with T1 = D1 and T2 = D1 + D2. For any power r,

Ωj^r = (Tj σα² + σε²)^r (eTj e′Tj / Tj) + (σε²)^r ( ITj − eTj e′Tj / Tj ).

If we denote wj = Tj σα² + σε², the transformation matrix for the unbalanced panel is

σε Ωj^{−1/2} = ITj − (1 − σε/√wj) (eTj e′Tj / Tj).

A3.2 Feasible GLS estimation

The typical element of Y*j = σε Ωj^{−1/2} Yj is

y*jt = yjt − θj ȳj,   θj = 1 − σε/√wj,   ȳj = (1/Tj) Σ_{t=1}^{Tj} yjt.

The same formula applies unit by unit for N > 2, because Ω is block-diagonal:

σε Ω^{−1/2} = diag{ ITi − θi (eTi e′Ti / Ti) },   i = 1, ..., N,

and β̂GLS = (X*′X*)⁻¹ X*′Y*, where X* = σε Ω^{−1/2} X and Y* = σε Ω^{−1/2} Y.

Amemiya (1971) suggests the following estimates for σε² and σα²:

σ̂ε² = Û′QÛ / ( Σi Ti − N − K ),

σ̂α² = [ Û′BÛ − ( N + tr[(X′QX)⁻¹X′BX] − tr[(X′QX)⁻¹X′(Jn/n)X] ) σ̂ε² ] / ( n − Σi Ti² / n ),   n = Σi Ti,

where Û are the Within residuals, Jn is a matrix of ones of dimension (Σi Ti) × (Σi Ti), and

B = diag{ eTi e′Ti / Ti }_{i=1,...,N},   Q = diag{ ITi − eTi e′Ti / Ti }_{i=1,...,N}.
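The practical content of the unbalanced transformation is that the quasi-demeaning coefficient θi = 1 − σε/√(Ti σα² + σε²) varies with the number of observations Ti of each unit. An illustrative Python sketch (assumed names):

```python
import math

def theta_i(T_i, s2_alpha, s2_eps):
    """theta_i = 1 - sigma_eps / sqrt(T_i * sigma_alpha^2 + sigma_eps^2)."""
    return 1.0 - math.sqrt(s2_eps) / math.sqrt(T_i * s2_alpha + s2_eps)

def quasi_demean(y_i, s2_alpha, s2_eps):
    """Apply y*_it = y_it - theta_i * ybar_i to one unit's T_i observations."""
    T_i = len(y_i)
    th = theta_i(T_i, s2_alpha, s2_eps)
    ybar = sum(y_i) / T_i
    return [y - th * ybar for y in y_i]

y1 = quasi_demean([1.0, 2.0, 3.0], 0.0, 1.0)        # no individual effect: theta = 0
y2 = quasi_demean([1.0, 2.0, 3.0, 4.0], 1.0, 1.0)   # theta = 1 - 1/sqrt(5)
```

With σα² = 0, θi = 0 for every unit and GLS reduces to pooled OLS, whatever the Ti.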

Appendix 4: Likelihood functions for dynamic panel data models

A4.1 Likelihood functions

Different likelihoods correspond to cases 1 to 4 above, depending on the treatment of the initial observations yi0. Assumption: normality of the αi and εit (and of yi0 where it is treated as random). Write uit = yit − ρ yi,t−1 − xit β − zi γ.

Case 1 (yi0 fixed, αi random):

L1 = (2π)^{−NT/2} (det VT)^{−N/2} exp[ −(1/2) Σ_{i=1}^{N} u′i VT⁻¹ ui ],

where VT = σε² IT + σα² eT e′T is the (T × T) variance-covariance matrix for unit i.

Case 2.b (yi0 random, correlated with αi): with yi0 ~ N(μ_{y0}, σ²_{y0}),

L2b = (2π σ²_{y0})^{−N/2} exp[ −(1/(2σ²_{y0})) Σ_{i=1}^{N} (yi0 − μ_{y0})² ]
     × (2π)^{−NT/2} (σε²)^{−N(T−1)/2} (σε² + Ta)^{−N/2}
     × exp{ −(1/(2σε²)) Σi Σ_{t=1}^{T} u*it² + [ a / (2σε²(σε² + Ta)) ] Σi ( Σt u*it )² },

where u*it = uit − (σ_{αy0}/σ²_{y0})(yi0 − μ_{y0}) and a = σα² − σ²_{αy0}/σ²_{y0} (the residual variance of αi after conditioning on yi0).

Case 3 (yi0 fixed, αi fixed):

L3 = (2π)^{−NT/2} (σε²)^{−NT/2} exp{ −(1/(2σε²)) Σi Σt [ yit − ρ yi,t−1 − xit β − zi γ − αi ]² }.

Case 4.a (yi0 random, drawn from the stationary distribution of the process, with mean μw): let wi0 = yi0 − μw. Then

L4a = (2π)^{−N(T+1)/2} |ΩT+1|^{−N/2} exp[ −(1/2) Σ_{i=1}^{N} v′i ΩT+1⁻¹ vi ],

where vi is the (T+1) vector vi = (yi0 − μw, yi1 − ρyi0 − xi1β − ziγ, ..., yiT − ρyi,T−1 − xiTβ − ziγ)′ and ΩT+1 is the (T+1) × (T+1) matrix

ΩT+1 = σε² [ 1/(1−ρ²)   0′T ;  0T   IT ] + σα² ( 1/(1−ρ) ; eT ) ( 1/(1−ρ) , e′T ).

Useful expressions:

|ΩT+1| = [ σε^{2T} / (1−ρ²) ] [ σε² + Tσα² + σα² (1+ρ)/(1−ρ) ]

and

ΩT+1⁻¹ = (1/σε²) { [ 1−ρ²   0′T ;  0T   IT ]
        − [ σα² / ( σε² + σα²( T + (1+ρ)/(1−ρ) ) ) ] ( 1+ρ ; eT ) ( 1+ρ , e′T ) }.

Case 4.b (wi0 random with mean μw and free variance σ²_{w0}): same as 4.a, but with ΩT+1 replaced by

VT+1 = [ σ²_{w0}   0′T ;  0T   σε² IT ] + σα² ( 1 ; eT ) ( 1 , e′T ).

Cases 4.c and 4.d: same as 4.a and 4.b respectively, but with the common mean μw replaced by individual-specific means μi0.

A4.2 Testing the assumptions on initial conditions (N large, T fixed)

Useful for checking maintained assumptions on initial conditions. Based on Likelihood Ratio (LR) statistics.

Case 1: yi0 fixed. Let L1 be the estimated log-likelihood under assumption H0: VT = σε² IT + σα² eT e′T, and L′1 the estimated log-likelihood with unrestricted VT (T(T+1)/2 components). Under H0, 2(L′1 − L1) is distributed as a χ²(T(T+1)/2 − 2).

Case 4.a: wi0 random with mean zero and the stationary variance. H0: ΩT+1 as defined in the likelihood for Case 4.a, vs. the alternative of an unrestricted variance-covariance matrix with (T+1)(T+2)/2 components, with log-likelihoods L4a and L′4a respectively. Under H0, 2(L′4a − L4a) is distributed as a χ²((T+1)(T+2)/2 − 2) (note: only two free parameters, σε² and σα², in the restricted ΩT+1, as ρ is already estimated).

Case 4.b: wi0 random with mean μw and free variance σ²_{w0}. Let L4b be the estimated log-likelihood with VT+1 for Case 4.b, and L′4a the unrestricted log-likelihood for Case 4.a (as above). Under H0 (the true model is 4.b), 2(L′4a − L4b) admits a χ²((T+1)(T+2)/2 − 3) distribution (3 free parameters in Case 4.b: σε², σα², σ²_{w0}).

Finally, Case 4.a can be tested against Case 4.b: under H0 (σ²_{w0} equals the stationary variance), 2(L4b − L4a) is distributed as a χ²(1).


Appendix 5: GMM estimation of panel data models

In the Instrumental-Variable context (Hausman-Taylor, Amemiya-MaCurdy, Breusch-Mizon-Schmidt), we assumed:

Ω = σε² INT + σα² (IN ⊗ eT e′T),   E(X′ε) = E(Z′ε) = 0,

i.e., homoskedasticity of both error components was assumed. Several cases arise:

1. Random or fixed effects (instruments correlated with α or not);
2. Heteroskedasticity (in α and/or ε).

For the panel data case, we can use the fact that several time observations are available for each unit. If heteroskedasticity of u is of the form E(u²it) = σi², with E(uit uis) = 0, t ≠ s, we have

VN = N Var f(x, θ) = N var( (1/N) Z′u ) = (1/N) E[Z′uu′Z] = (1/N) Z′[ diag{σi²} ⊗ IT ] Z,

where σi² can be estimated by σ̂i² = (1/T) Σ_{t=1}^{T} û²it. Hence, an optimal second-step estimate for VN is

V̂N = (1/N) Σ_{i=1}^{N} Z′i Ĥi Zi,   where Ĥi = σ̂i² IT.
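The second-step weighting matrix under unit-level heteroskedasticity is a variance-weighted sum of the per-unit instrument cross-products. An illustrative Python sketch (names assumed; Zi is a T × q instrument block per unit):

```python
def robust_vn(Z, U):
    """V_N = (1/N) * sum_i sigma_i^2 * Z_i' Z_i, with sigma_i^2 = (1/T) sum_t u_it^2.

    Z: list of N units, each a T x q list of instrument rows.
    U: list of N units, each a length-T list of first-step residuals."""
    N = len(Z)
    q = len(Z[0][0])
    T = len(U[0])
    V = [[0.0] * q for _ in range(q)]
    for i in range(N):
        s2 = sum(x * x for x in U[i]) / T          # sigma_i^2 estimate
        for a in range(q):
            for b in range(q):
                V[a][b] += s2 * sum(Z[i][t][a] * Z[i][t][b] for t in range(T)) / N
    return V

Z = [[[1.0], [1.0]], [[1.0], [1.0]]]   # a constant instrument for each of 2 units
U = [[1.0, 1.0], [2.0, 2.0]]           # residuals: sigma_1^2 = 1, sigma_2^2 = 4
V = robust_vn(Z, U)
```

Here Z′iZi = 2 for each unit, so V = (1·2 + 4·2)/2 = 5.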

A.5.1 GMM after the Within transformation

The Within operator QT = IT − eT e′T / T removes the individual effects. Consider a model with q orthogonality conditions E(W′i QT ui) = 0, where Wi is a T × q matrix of instruments. Because QT is a T × T symmetric idempotent matrix, the conditions above can be rewritten E[(QT Wi)′ ui] = 0, and the optimal weighting matrix AN is VN⁻¹ with

VN = N E[ (QW/N)′ uu′ (QW/N) ].

Hence, for GMM, it is equivalent to transform the model (by Q) or the instrument matrix. Under homoskedasticity,

VN = (1/N) (QW)′ [ σε² INT + Tσα² B ] (QW) = (1/N) (QW)′ [ σε² INT ] (QW) = (σε²/N) W′QW,

because QB = 0 and Q is idempotent. The optimal GMM estimator is then

θ̂N = arg min [ u(θ)′W / N ] ( W′QW / N )⁻¹ [ W′u(θ) / N ].

A.5.2 Random effects and strictly exogenous instruments

By definition, random effects are uncorrelated with the instruments, and strict exogeneity means the instruments are uncorrelated with ε at all leads and lags:

E(w′is uit) = 0   for s, t = 1, 2, ..., T.

Stack w°iT = (wi1, ..., wiT), a 1 × qT row vector, and set WSE,i = IT ⊗ w°iT. The moment conditions then read E(W′SE,i ui) = 0. Under these assumptions, GMM in 2SLS form and in 3SLS form are equivalent. Indeed,

Σ^{−1/2} WSE,i = Σ^{−1/2} (IT ⊗ w°iT) = (IT ⊗ w°iT)(Σ^{−1/2} ⊗ IqT) = WSE,i B,

where B = Σ^{−1/2} ⊗ IqT is non-singular and non-stochastic: filtering the instruments amounts to a non-singular linear transformation, which leaves the 2SLS estimator unchanged (see Theorem 8 below).

A.5.3 Instruments correlated with the individual effect

We now assume instruments are correlated with αi but (strictly) exogenous with respect to ε. To remove αi, use the first-difference operator LT of dimension T × (T−1):

LT = [  1   0  ...  0
       −1   1  ...  0
        0  −1  ...  0
        .   .   .   .
        0   0  ...  1
        0   0  ... −1 ],

so that L′T ui is the (T−1)-vector of first differences of ui. Note that QT = LT (L′T LT)⁻¹ L′T: first-differencing and the Within transformation span the same space, orthogonal to eT. If instruments are strictly exogenous,

E(Z′SE,i L′T ui) = E(Z′SE,i L′T εi) = 0,   where ZSE,i = IT−1 ⊗ w°iT,

so we can use ZSE,i as instruments for the first-differenced model.

ZSE;i as instruments.

In this case, we consider a

1q

vector of instruments

wit

such

that

There are T (T + 1)=2 such conditions: instruments are not correlated with future values of "it (and are not correlated with i ).

On the other hand, if instruments are weakly exogenous but are

correlated with

can be written

where

uis = uis

for

t = 1; 2; : : : ; T

1; t s;

ui;s 1.

triangular matrix that satises F F = IT , so that Cov (F ui ) =

195

0 ), we have the following Forward-Filter

Wi = (wi01; wi02; : : : ; wiT

estimator:

1

^ FF = X 0F 0 H (H 0H ) 1H 0F X

X 0F 0 H (H 0H ) 1H 0F Y ;

where

F = IN

F .

Wi

and lter

But FF transformation preserves the weak exogeneity of instruments

wit.

When

is large and

Wi .

plim N1

PN

0 0

i=1 HiF ui uiF Hi

1 PN H 0 F F 0 H

6= plimP

i

i=1 i

N

plim N1

N

0

i=1 Hi Hi :

A.5.5 More efficient GMM estimators

We now present alternative GMM estimators that may be more efficient than IV-HT, IV-AM or IV-BMS. Why: under the strict exogeneity assumption, we have many more moment conditions than HT, AM or BMS actually use. We first consider the case where we restrict Σ = σε² IT + σα² eT e′T; Section A.5.6 deals with an unrestricted Σ.

Model:

yi = Ri β + (eT ⊗ z′i) γ + ui ≡ Xi δ + ui,

where ui = (eT αi) + εi, Ri = (r′i1, r′i2, ..., r′iT)′ (a T × k matrix of time-varying regressors), and eT ⊗ z′i = [z′i; z′i; ...; z′i] (a T × g matrix of time-invariant regressors). Assume regressors may be correlated with αi but satisfy

E(di ⊗ εi) = 0,

where di stacks all the exogenous variables of unit i.

HT, AM and BMS instruments are of the form (QT Ri, eT ⊗ si):

sHT,i = (r̄1i, z1i);
sAM,i = (r1i1, r1i2, ..., r1iT, z1i);
sBMS,i = (sAM,i, r̃2i),   r̃2i = (r2i1 − r̄2i, ..., r2iT − r̄2i).

If the no-conditional-heteroskedasticity condition E(W′i ui u′i Wi) = E(W′i Σ Wi) holds, GMM using the same instruments Wi reduces to the corresponding IV estimator.

The strict exogeneity assumption implies

E[ (LT ⊗ di)′ ui ] = E( L′T ui ⊗ di ) = E[ L′T (eT αi + εi) ⊗ di ] = E( L′T εi ⊗ di ) = 0,

where LT ⊗ di is T × [(T−1)(kT + g)]. We can therefore use

WB,i = (LT ⊗ di, eT ⊗ si)   instead of   WA,i = (QT Ri, eT ⊗ si).

Number of additional instruments wrt. BMS:

rank(ZB,i) − rank(ZA,i) = (T−1)(kT + g) − k.

Other efficiency gains are possible when Σ is unrestricted.
A.5.6 GMM with unrestricted variance-covariance matrix

ZB;i satisfy the no conditional heteroskedasticity assumption, but the variance-covariance of u is unrestricted.

We assume instruments

form of the GMM estimator with unrestricted

ments

1ZA;i

using instru-

This is not

= E (Ri0 QT 1eT i):

But when BMS assumption is not true and with an unrestricted

, E (Ri0 QT 1eT i) 6= 0.

Q = 1

and we can show that

QT

for removing

i :

QeT = 0.

Therefore:

because

si);

for

si:

198

specied

moment conditions or by nding

0 0

moment conditions: V = E (Z uu Z ). In a linear model, u( ) =

Y X where , we can solve directly for ^ N :

^ GMM = X 0ZVN 1Z 0X 1 X 0ZVN 1Z 0Y:

where

u to be a) homoskedastic (V

is diago-

= E (uu0) = IN

, where

= "2INT + 2 (IN

eT e0T ) and = "2IT + 2 eT e0T :

In the 2SLS case (HT, AM, BMS), we premultiply the model in

vector form

Zi:

yi = Xi + ui

by

^ 2SLS = X0

1=2Z (Z 0Z ) 1Z 0

1=2X 1

X 0

1=2Z (Z 0Z ) 1Z 0

1=2Y :

1=2Zi as instruAn equivalent 2SLS estimator obtains by using

1

ments:

^ 2SLS = X0

1Z (Z 0

1Z ) 1Z 0

1X 1

X 0

1Z (Z 0

1Z ) 1Z 0

1Y :

2

^ 3SLS = X 0Z (Z 0

Z ) 1Z 0X 1 X 0Z (Z 0

Z ) 1Z 0Y :

199

GMM and 3SLS are equivalent if the following condition holds:

E (Zi0uiu0iZi) = E (Zi0Zi) 8i = 1; 2 : : : ; N;

because, as

! 1,

N

1 0

1X

plim Z

Z = plim

Zi0u^iu^0iZi = E (Zi0uiu0iZi) = V:

N

N i=1

This condition is denoted

When condition

No conditional heteroskedasticity.

E (Zi0uiu0iZi) = E (Zi0Zi)

Impossible to prove 3SLS is more or less ecient than 2SLS, but

there exists a condition for numerical equivalence between 2SLS

and 3SLS:

exists a non-singular, non-stochastic matrix B such that

1=2Z =

ZB .

Theorem 8

is

^

estimated from rst-stage N for GMM. It states that under this

1=2) does

condition, ltering (premultiplying instruments by

200APPENDIX 6.

inference

A.6.1 Heterogeneity and the linear property

In linear panel-data models, the residual consists of an heterogeneity factor

i and an i.i.d.

error term

"it:

uit = i + "it:

OLS (or, equivalently, ML) yield consistent but not ecient estimates if unobserved heterogeneity is omitted.

problem: dicult to compute the likelihood of nonlinear models

because of dependent observations for a given individual (

yit

is

not i.i.d.).

root and no individual eect

where

jij < 1, i independent from "it, "it is N (0; "2).

The

If the restricted model is estimated, under the following data generating process:

201

the OLS estimate of

is

N

1X

Cov(i; V ar(yi;t

P

^

i +

N i=1

1=n i V ar(yi;t

1))

1)

N

Covi(P

i; "2=(1 2i ))

1X

+

:

=

N i=1 i 1=n i "2=(1 2i )

average of the true i 's (the bias is positive).

If all

overestimates the

where

i are i.i.d.

E (i) = 0.

This is

the Maximum Likelihood estimate of

^ =

We have

^ T !1

!

"

"P

is

N PT

i=1 t=1 yit

NT

N

X

1

1

N i=1 i

# 1

# 1

N

1X

<

:

N i=1 i

Hence, the MLE of the misspecied model underestimates the average of individual parameters

i .

202APPENDIX 6.

In many cases, it is not possible to lter out the individual effect without very restrictive assumptions (e.g., Fixed-eect Logit,

Another possibility is to integrate

Basic idea: specify a density distribution for

conditional likelihood.

f~(yit; xit + i) with i v (; );

where is a distributional parameter, and the vector of param-

eters of interest.

The distribution of

f (yitjxit; ; ) =

Assume

exp(xit + i)yit

f~(yit; xit + i) =

exp[ exp(xit + i)]:

yit!

Change of variable: i = exp(i ), with probability distribution:

1=
1 exp(
=)

(;
) =

;

(
)1=
(1=
)

where

(:): Gamma distribution, and
> 0. Then it can be

shown that

f (yitjxit; ; ) =

:

(1= ) (yit + 1)[1 + ; exp(xit )]yit+1=

203

This is the

Probit with heterogeneity:

Assume

i v N (0; 2 ):

P rob[yit = 1jxit] =

where

(:):

1

(xit + )

d;

density function of

N (0; 1).

P rob[yi1 = 1; : : : ; yiT = 1] =

6=

T

Y

t=1

Z Y

T

1

(xit + )

d

t=1

P rob[yit = 1]:

In more complex

of the form

M (yitjxit; ; ) =

Purpose: approximate multiple integrals using Monte Carlo (simulation) techniques.
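For the probit case the integral is one-dimensional, so a crude Monte Carlo average over draws of the individual effect already illustrates the idea. A Python sketch (all names assumed, not from the notes):

```python
import math, random

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def sml_prob(xb, sigma_alpha, S=500, seed=0):
    """Simulated P(y=1|x) = (1/S) * sum_s Phi(x*beta + sigma_alpha * eta_s),
    with eta_s ~ N(0,1) draws replacing the integral over the individual effect."""
    rng = random.Random(seed)
    return sum(Phi(xb + sigma_alpha * rng.gauss(0.0, 1.0)) for _ in range(S)) / S

p0 = sml_prob(0.3, 0.0)   # no heterogeneity: collapses to Phi(0.3) exactly
p1 = sml_prob(0.3, 1.0)   # with heterogeneity: a genuine Monte Carlo average
```

When σα = 0 every draw contributes Φ(xβ), so the simulator is exact; otherwise its accuracy improves at rate 1/√S.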


A.6.3 Importance sampling

We can write

M(yit | xit; β, θ) = ∫ m(yit; xit β + α) [ π(α; θ) / π0(α; θ0) ] π0(α; θ0) dα,

where π0 is any density whose support contains that of π, with parameters θ0. We have, for individual i at time t:

M(yit | xit; β, θ) = E0[ m(yit; xit β + α) π(α; θ) / π0(α; θ0) ],

the expectation being taken with respect to π0. Density π0 is the importance sampling function. If we can draw αs, s = 1, ..., S, from distribution π0, we can approximate the above expectation by

(1/S) Σ_{s=1}^{S} m(yit; xit β + αi^s) π(αi^s; θ) / π0(αi^s; θ0).

Under (mild) regularity assumptions, the simulated expression converges to the expectation, using a weak Law of Large Numbers. Two issues in practice: the number of draws needed for a good approximation, and the choice of the importance sampling function (make sure the support of π0 covers the domain of integration). The asymptotic behavior of the resulting estimator depends on the estimation procedure.
205

Gouriroux and Monfort (J. of Econometrics, 1993): Simulated

GMM (SGMM) and Simulated Maximum Likelihood (SML).

For SGMM, when population moments are impossible to compute,

we replace

S

1X

E [f (yit; xit; i; ] = 0 by

[f (yit; xit; is; ] 0;

S s=1

or by

S

1X

(s ;
)

[f (yit; xit; is; ] 0 is 0:

S s=1

(i ;
)

s

MGMM

=

( N

X

S

1X

[f (yi; xi; is; ]0 Zi

S s=1

i=1

!)

T 1

N

X

S

1X

0

Zi

[f (yi; xi; is; )]

S s=1

i=1

Zi is a T L matrix of instruments. The SGMM is consistent and asymptotically normal when N tends to innity and S

where

is xed. This is because we can use the weak Law of Large Numbers for consistency of the simulator

1P f

s

S

towards

E f

and a

log L() =

where heterogeneity

f (yijxi; ).

N

X

i=1

206APPENDIX 6.

Then

S

1X

f~(yi; xi; is; );

S s=1

where

Ls() =

"

N

X

S

X

i: is ; s = 1; 2; : : : ; S

and

1

1

log

f~(y ; x ; s; ) :

N i=1

S s=1 i i i

N=S

may be necessary.

We use the Gouriroux and Monfort (1993) result. The SGMM

and SML criteria are of the form

"

GN () =

1

N

N

X

in

to

Two dierent simulated criteria can be used for GN ( ): whether

I

D

identical (GN ( )) or dierent sets (GN ( )) of simulation draws

207

are used for each individual:

"

N

S

1X

1X

I

GN ( ) =

yi; xi;

(yi; xi; s; )

N i

S s

"

GDN () =

1

N

N

X

i

yi; xi;

1

S

S

X

s

!#

;

!#

"

E yi; xi;

1

S

S

X

s

!#

(yi; xi; s; )

I

G(). Therefore ^ that maximizes (SML)

I

or minimizes (SGMM) GN ( ) is inconsistent.

GDN () converges to the non random scalar:

"

!#

S

1X

EE yi; xi;

(yi; xi; s; ) ;

S s

which is in general dierent from G( ). But if function is linear

D

D

wrt. E
(:), GN ( ) converges to G( ) and ^

is consistent.

Case 2. S and N ! 1.

Both

^I

and

^D

are consistent.

A.6.6 Example: Probit and Tobit with random effects

y*it = xit β + αi + εit, with

yit = 1 if y*it > 0,   yit = 0 if y*it ≤ 0   (Probit);
yit = y*it if y*it > 0,   yit = 0 if y*it ≤ 0   (Tobit);

where αi ~ N(0, σα²) and εit ~ N(0, σε²). Because the yit are dependent within individuals, the unconditional likelihood involves T-fold integrals. But we can consider the conditional likelihood functions of yi given xi and αi:

Probit:  f(yi | xi, αi; β) = Π_{yit=1} Φ(xit β + αi) × Π_{yit=0} [ 1 − Φ(xit β + αi) ];

Tobit:   f(yi | xi, αi; β) = Π_{yit>0} (1/σε) φ( (yit − xit β − αi)/σε ) × Π_{yit=0} [ 1 − Φ( (xit β + αi)/σε ) ];

and use these, averaged over draws of αi, as simulators.

* DYNTAB.SAS ;
* Uses datafile DYNTAB3.DAT ;
* Create library and file names ;
* Change directory information below ;
filename watfile 'd:/dea/panel/dyntab3.dat';

data wat;

infile watfile;

input id year conso price revenue precip ;

* Compute logs ;

lconso=log(conso); lprice=log(price);

lrevenue=log(revenue);

run;

* Descriptive statistics ;

proc means data=wat;run;

* OLS regression ;

proc reg data=wat;

model lconso = lprice lrevenue;

run;

* Model 1: One-way Fixed effects ;

* cs=116: number of cross-sectional units ;

proc tscsreg data=wat cs=116;

model lconso= lprice lrevenue /fixone ;

run;

* Model 2: Two-way Fixed effects ;

* option /fixtwo: Set two-way Fixed-effect ;

proc tscsreg data=wat cs=116;

model lconso= lprice lrevenue /fixtwo ;

run;

* Model 3: One-way Random effects ;

* option /ranone: Set one-way Random-effect ;

proc tscsreg data=wat cs=116;

model lconso= lprice lrevenue /ranone;

run;

* Model 4: Two-way Random effects ;

* option /rantwo Set Two-way Random-effect ;

proc tscsreg data=wat cs=116;

model lconso= lprice lrevenue /rantwo;

run;

* Model 5: One-way Random effects with AR(1) ;

* option /ranone parks rho Set One-way Random-effect ;

* and compute RHO: Ar(1) parameter ;

proc tscsreg data=wat cs=116;

model lconso= lprice lrevenue /ranone parks rho;

run;

* Compute parameter estimates on each cross section ;

proc sort data=wat;

by year;

proc reg data=wat;


model lconso= lprice lrevenue ;

by year;

run;

* Compute Within and Between estimates ;

* using the MEANS procedure ;

proc sort data=wat;

by id;

proc means data=wat noprint;

var lconso lprice lrevenue ;

by id;

output out=out1 mean=mconso mprice mrevenue ;

data out1;set out1;

keep id mconso mprice mrevenue ;

data wat;

merge wat out1;

by id;

data wat;set wat;

qconso=lconso-mconso; qprice=lprice-mprice;

qrevenue=lrevenue-mrevenue;

* Within regression ;

proc reg data=wat;

model qconso = qprice qrevenue ;

run;

* Between regression ;

proc reg data=wat;

model mconso = mprice mrevenue;

run;


MODEL 1. ONE-WAY FIXED EFFECTS

TSCSREG Procedure. Dependent variable: LCONSO.
Estimation method: FIXONE. Number of cross sections: 116. Time series length: 6.
SSE 2.578099, DFE 578, MSE 0.00446, Root MSE 0.066786, R-sq 0.9344.
F test for fixed effects: F(115, 578) = 58.3964, Prob > F = 0.0000.

Variable    Estimate     Std. Error   t (H0: param = 0)   Prob > |t|
INTERCEP     5.099257    0.366957      13.896              0.0001
LPRICE      -0.134245    0.018447      -7.278              0.0001
LREVENUE     0.024386    0.033223       0.734              0.4632

(The 115 cross-section dummies are omitted here; e.g. CS 1 = -0.455773 (0.039463), CS 115 = -0.240823 (0.039379).)

MODEL 2. TWO-WAY FIXED EFFECTS

Estimation method: FIXTWO. 116 cross sections, T = 6.
SSE 2.205671, DFE 573, MSE 0.003849, Root MSE 0.062043, R-sq 0.9439.
F test for fixed effects: F(120, 573) = 65.6530, Prob > F = 0.0000.

Variable    Estimate     Std. Error   t         Prob > |t|
INTERCEP     6.316873    0.396540      15.930    0.0001
LPRICE      -0.251061    0.034210      -7.339    0.0001
LREVENUE    -0.053316    0.033244      -1.604    0.1093

(Cross-section and time dummies omitted; e.g. TS 1 = -0.102087 (0.017883), TS 5 = -0.025528 (0.009992).)

MODEL 3. ONE-WAY RANDOM EFFECTS

Estimation method: RANONE. 116 cross sections, T = 6.
SSE 3.12498, DFE 693, MSE 0.004509, Root MSE 0.067152, R-sq 0.1087.
Variance component for cross sections: 0.043243. Variance component for error: 0.004460.
Hausman test: m = 14.4912, 2 degrees of freedom, Prob > m = 0.0007.

Variable    Estimate     Std. Error   t         Prob > |t|
INTERCEP     4.692305    0.354917      13.221    0.0001
LPRICE      -0.149074    0.017611      -8.465    0.0001
LREVENUE     0.053077    0.032306       1.643    0.1008

MODEL 4. TWO-WAY RANDOM EFFECTS

Estimation method: RANTWO. 116 cross sections, T = 6.
SSE 2.707154, DFE 693, MSE 0.003906, Root MSE 0.062501, R-sq 0.0907.
Variance components: cross sections 0.043638, time series 0.000746, error 0.003849.
Hausman test: m = 22.2377, 2 degrees of freedom, Prob > m = 0.0000.

Variable    Estimate     Std. Error   t         Prob > |t|
INTERCEP     5.674742    0.371984      15.255    0.0001
LPRICE      -0.225151    0.027604      -8.156    0.0001
LREVENUE    -0.018251    0.032401      -0.563    0.5734

WITHIN REGRESSION (OLS on within-transformed variables)

Analysis of variance: Model SS 0.31252 (2 df), Error SS 2.57810 (693 df), Total SS 2.89062 (695 df);
F = 42.003 (Prob > F = 0.0001), R-square 0.1081, Adj R-sq 0.1055, Root MSE 0.06099.

Variable    Estimate        Std. Error    t         Prob > |t|
INTERCEP    -5.28092E-17    0.00231195    -0.000    1.0000
QPRICE      -0.134245       0.01684666    -7.969    0.0001
QREVENUE     0.024386       0.03034107     0.804    0.4218

BETWEEN REGRESSION (OLS on individual means)

Analysis of variance: Model SS 7.13103 (2 df), Error SS 29.28684 (693 df), Total SS 36.41786 (695 df);
F = 84.369 (Prob > F = 0.0001), R-square 0.1958, Adj R-sq 0.1935, Root MSE 0.20557, Dep Mean 4.99481.

Variable    Estimate     Std. Error    t          Prob > |t|
INTERCEP    -0.176444    0.68091356    -0.259     0.7956
MPRICE      -0.259461    0.02278084   -11.389     0.0001
MREVENUE     0.494483    0.05958703     8.298     0.0001

Note that the Within slope estimates coincide with the one-way fixed-effects slopes of Model 1, as expected.

Introduction

Gauss is an interpreted computer language that is most conveniently run in interactive mode (global variables are kept in memory until one quits Gauss). It has a small built-in editor, useful for long jobs, or it can be used in command mode. When Gauss is first executed, you are inside the command mode, with the following prompt: [Gauss]. You can switch between the command mode and the edit mode using either tool bar (Windows bar at the bottom, Gauss bar on top). In command mode, you can edit any file (for example myprog.prg) by typing edit myprog.prg;. To run a program from edit mode, simply use the Run option on top, or press the F3 function key. You may save the program by pressing the F2 function key.

To declare a text file for output, use the syntax output file=myfile.out reset;. The reset option clears the file if it already exists! In a program, you can choose to have output written to the file or not (useful for inspecting results on the screen only): output on; and output off; toggle writing to the declared output file.

You can either work with data files in text format (Ascii) or with preexisting Gauss datasets. To load a text-format data file:

load x[1000,5]=mydata.dat;

or

n=100; t=10; nvar=5; load x[n*t,nvar]=mydata.dat;

To create a Gauss dataset (here named "mydata") from a matrix x, with a string vector of variable names (varnames):

call saved(x,"mydata",varnames);

Basic operators

In Gauss, most operators return a value that may be stored in a variable or printed to the screen. If no assignment command is given, the program will simply output the result to the screen. You don't have to specify the dimension of vectors or matrices if they are assigned a computed value; dimensions only need declaring when elements are to be modified afterwards or when using loops (see below). To create a vector with predetermined values:

x={1 2 3};

(a 1x3 row vector). Note: every Gauss statement ends with ;. A character vector of names is created with vnames={"a","b","c"};.

Here is a list of useful operators:

cols(x)         Returns the number of columns of x;
rows(x)         Returns the number of rows of x;
meanc(x)        Computes the column means of x;
stdc(x)         Computes the column standard deviations of x;
sqrt(x)         Computes the square root of the elements of x;
sumc(x)         Computes the column sums of x;
cumsumc(x)      Computes the cumulative column sums of x;
cdfn(x)         Returns the cumulative normal distribution Phi(x);
cdfchic(x,y)    Returns the complement to 1 of the chi-square cumulative distribution with y degrees of freedom, evaluated at x; useful for computing p-values of chi-square tests;
x'              Transposes matrix or vector x;
y=x1~x2, y=x1|x2   Concatenates x1 and x2 horizontally or vertically;
y=x[.,1]        Selects column 1 and all rows of matrix x;
y=x[1:10,.]     Selects rows 1 to 10 and all columns;
y=x[1:10,1:20]  Selects rows 1 to 10 and columns 1 to 20;
vec(x)          Creates a vector from a matrix by stacking all columns one after the other; vec(x) is NT x 1 if x is N x T;
diag(x)         Returns the main diagonal of matrix x (must be square);
reshape(x,n,t)  Reshapes matrix x into an n x t matrix;
a*b*c           Performs matrix multiplication (check the numbers of rows and columns!);
a.*b, a./b      Element-by-element multiplication and division;
inv(x)          Computes the inverse of x (for a positive definite matrix, use the faster invpd(x));
zeros(n,m)      Returns an n x m matrix of zeros;
ones(n,m)       Returns an n x m matrix of ones;
eye(n)          Returns an n x n identity matrix;
a.*.b           Computes the Kronecker product a ⊗ b.

Conditional operators and loops

Useful for testing and creating dummy variables. Operators: .eq, .ne, .lt, .le, .gt, .ge — equal to, not equal to, strictly less than, less than or equal to, strictly greater than, greater than or equal to.

Example: suppose you want to create an indicator variable equal to 1 when xi <= 50. Type y = x .le 50;, which creates an N x 1 vector y, with yi = 1 if xi <= 50 and 0 otherwise.

Example: you want to create a new variable w, equal to x if z < 0 and equal to y if z > 0:

w = x.*(z .lt 0) + y.*(z .gt 0);

Loops are not recommended because they produce lengthy computations, and vector operators should always be preferred. But in some cases they are necessary. Examples of loops are:

i=1;
do while i<=n;
  y[i]=x[i]+a;
  i=i+1;
endo;

or

i=1;
do until i>n;
  y[i]=x[i]+a;
  i=i+1;
endo;

Note: in the above examples, vector y must have been declared beforehand, for instance y=zeros(n,1);.

It is often useful to sort a matrix or select a subset of observations:

y=sortc(x,1)               Sorts the rows of matrix x, using the variable in column 1 as key;
y=selif(x, x[.,1] .eq 1)   Creates matrix y by keeping the rows of x whose column-1 value is equal to 1;
y=delif(x, x[.,1] .lt 0)   Creates matrix y by deleting from x the rows with negative values in column 1.

Creating procedures

Very useful to speed up repetitive tasks. The general syntax is

proc func(a);
  local toto;
  :::
  retp(toto);
endp;

This code declares toto as a local variable (not accessible outside procedure func) and returns a single argument toto. In some cases, it is necessary to have more than one argument returned:

proc (3)=func(a1,a2,a3);
  local toto1,toto2,toto3;
  :::
  retp(toto1,toto2,toto3);
endp;

This code declares 3 inputs and 3 outputs. In that case, we must use the following syntax when calling this procedure:

{b1,b2,b3}=func(a1,a2,a3);

Beware of the use of local variables; any variable used in the procedure must either be declared as local (its value is lost when one quits the procedure) or elsewhere in the program (it will then be a global variable). A possibility to avoid problems is to declare all variables as global at the start of the program, with the syntax clearg var1,var2,...;

Example: the following procedure computes the deviation from individual means (Within operator):

proc with(x);
  local toto;
  toto=reshape(x,n,t);
  toto=toto-meanc(toto');
  toto=reshape(toto,n*t,1);
  retp(toto);
endp;

Note that in this case, variables n and t must be declared as global. They can instead be passed as arguments to the procedure:

proc with(x,n,t);
  local toto;
  toto=reshape(x,n,t);
  retp(reshape(toto-meanc(toto'),n*t,1));
endp;

And if we wished to return both Between and Within transformations:

proc (2)=betwith(x,n,t);
  local toto;
  toto=reshape(meanc(reshape(x,n,t)').*.ones(t,1),n*t,1);
  retp(toto,x-toto);
endp;
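Outside Gauss, the Between and Within operators defined above are a few lines of NumPy; this sketch assumes, like the Gauss procedures, that observations are stacked by individual (n units with t periods each):

```python
import numpy as np

def between(x, n, t):
    """Replace each observation by its individual mean (Between transform)."""
    m = x.reshape(n, t).mean(axis=1)   # individual means, shape (n,)
    return np.repeat(m, t)             # each mean repeated t times -> (n*t,)

def within(x, n, t):
    """Deviation from individual means (Within transform)."""
    return x - between(x, n, t)
```

For example, with n=2 and t=3, within(np.arange(6.), 2, 3) returns the deviations from the two individual means 1 and 4.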

Some useful built-in procedures

These procedures take their data either from a Gauss dataset or from matrices in memory; if no dataset is used, a 0 is put in place of the Gauss dataset name.

call dstat(0,x)             Computes descriptive statistics on the variables in x;
call dstat("mydata",1|3)    Computes descriptive statistics on variables 1 and 3 of dataset mydata;
call ols(0,y,x)             Computes the OLS regression of y on x;

The optimization library optmum is loaded and called as follows:

library optmum;                       Loads the library;
{x, f, g, ret} = optmum(&func,x0);    Main command;

where x0 is the vector of starting values for the parameters and func is the procedure computing the criterion to be minimized. On output, x is the vector of parameters at the optimum, f is the value of the criterion, g is the gradient, and ret is a return code. The optmum procedure calls a user-defined procedure (here, func) that computes the criterion, depending on parameters (here, z):

proc func(z);
  local crit;
  :::;
  retp(crit);
endp;

Example: To estimate a nonlinear model by minimizing the residual sum of squares, where the model is yi = b0 + b1*b2*xi + ln(b1)*wi + ui:

library optmum;
x0={0.1, 0.1, 0.5};
{b, f, g, ret} = optmum(&func,x0);

proc func(z);
  local err,crit;
  /* y, x and w must be global variables, while err (the residual) is local */
  err=y-z[1]-z[2]*z[3]*x-ln(z[2])*w;
  /* Computes (1/N)*sum of squared residuals */
  crit=meanc(err.*err);
  retp(crit);
endp;

Note: b0 is z[1], b1 is z[2] and b2 is z[3].
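The same nonlinear least-squares problem can be set up with SciPy's general-purpose minimizer; the data below are simulated for illustration, and a lower bound keeps the second parameter positive so that its logarithm is defined:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.normal(size=200)
w = rng.normal(size=200)
b0, b1, b2 = 0.5, 1.2, 0.3   # illustrative "true" parameter values
y = b0 + b1 * b2 * x + np.log(b1) * w + 0.05 * rng.normal(size=200)

def crit(z):
    # (1/N) * sum of squared residuals, as in the Gauss criterion
    err = y - z[0] - z[1] * z[2] * x - np.log(z[1]) * w
    return np.mean(err ** 2)

# Same starting values as the Gauss example; z[1] bounded away from 0
res = minimize(crit, x0=[0.1, 0.1, 0.5], method="L-BFGS-B",
               bounds=[(None, None), (1e-6, None), (None, None)])
```

Here res.x plays the role of the parameter vector returned by optmum, and res.fun the criterion value at the optimum.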

/* DYNTAB.PRG 16 01 2001 Residential water use */
new; clear all;
library tscs,pgraph;
tscsset; graphset;
output on;
n=116; t=6;
load x[n*t,6]=d:/dea/panel/dyntab3.dat;
id=x[.,1];
year=x[.,2];
conso=ln(x[.,3]);
price=ln(x[.,4]);
revenue=ln(x[.,5]);
precip=ln(x[.,6]);
vnames="year"|"conso"|"price"|"revenue"|"precip"|"id";
call saved(year~conso~price~revenue~precip~id,"watfile",vnames);
y={ conso };
x={ price, revenue };
grp={ id };
__title="Water demand equation";
call tscs("watfile",y,x,grp);

SOFTWARE

=====================================================================
TSCS Version 3.1.2   1/17/01   3:51 pm
=====================================================================

Data Set: watfile

OLS DUMMY VARIABLE RESULTS

Dependent variable:      conso
Observations :           696
Number of Groups :       116
Degrees of freedom :     578
Residual SS :            2.578
Std error of est :       0.067
Total SS (corrected) :   2.891
F = 35.033 with 2,578 degrees of freedom   P-value = 0.000

Var        Coef.       Std. Coef.   Std. Error   t-Stat      P-Value
price      -0.134245   -0.347461    0.018447    -7.277506    0.000
revenue     0.024386    0.035045    0.033223     0.734009    0.463

Group Number   Dummy Variable   Standard Error
1              4.643484         0.365639
2              4.876781         0.370063
3              5.252595         0.369474
... ... ...
114            4.839490         0.365496
115            4.858434         0.359065
116            5.099257         0.366957

F(115, 578) = 58.3964   P-value: 0.0000

227

OLS ESTIMATE OF CONSTRAINED MODEL

Dependent variable:      conso
Observations :           696
Number of Groups :       116
Degrees of freedom :     693
R-squared :              0.172
Rbar-squared :           0.170
Residual SS :            32.532
Std error of est :       0.217
Total SS (corrected) :   39.308
F = 72.175 with 3,693 degrees of freedom   P-value = 0.000

Var        Coef.       Std. Coef.   Std. Error   t-Stat       P-Value
CONSTANT    1.164761                0.598014      1.947715    0.052
price      -0.249873   -0.406149    0.022153    -11.279345    0.000
revenue     0.376643    0.257121    0.052746      7.140637    0.000

TABLE OF R-SQUARED TERMS

R-squared (full model) :          0.934
R-squared (constrained model) :   0.172
Partial R-squared :               0.921

FULL, RESTRICTED, AND PARTIAL R-SQUARED TERMS, X VARIABLES ARE CONSTRAINED

R-squared (full model) :          0.934
R-squared (constrained model) :   0.926
Partial R-squared :               0.108

RANDOM EFFECTS (ERROR COMPONENTS) ESTIMATES

Dependent variable:      conso
Observations :           696
Number of Groups :       116
Degrees of freedom :     693
Residual SS :            3.135
Std error of est :       0.067
Total SS (corrected) :   3.517
F = 22047.870 with 3,693 degrees of freedom   P-value = 0.000

Std. errors of error terms:
Individual constant terms : 0.206
White noise error :         0.067

Var        Coef.       Std. Coef.   Std. Error   t-Stat       P-Value
CONSTANT    4.687235                0.355285     13.192903    0.000
price      -0.149316   -0.363264    0.017623     -8.472974    0.000
revenue     0.053560    0.071009    0.032338      1.656247    0.098

Group Number   Random Component
1              -0.346522
2              -0.121608
3               0.250638
4              -0.020350
5               0.128761
... ... ...
112             0.512636
113            -0.216224
114            -0.151243
115            -0.125587
116             0.104064

Null hypothesis: Individual error components do not exist.
Chi-squared statistic (1): 1367.1014   P-value: 0.0000
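The chi-squared statistic reported above tests the null of no individual error components. The standard Breusch-Pagan Lagrange multiplier form of such a test is computed from pooled OLS residuals as follows (TSCS's exact statistic may be implemented differently; this is a generic sketch):

```python
import numpy as np

def breusch_pagan_lm(e, n, t):
    """Breusch-Pagan LM statistic for H0: no individual error components.
    e: pooled OLS residuals of length n*t, stacked by individual."""
    E = e.reshape(n, t)
    ratio = (E.sum(axis=1) ** 2).sum() / (e ** 2).sum()
    return n * t / (2.0 * (t - 1)) * (ratio - 1.0) ** 2

# Residuals with a strong individual component yield a large statistic
rng = np.random.default_rng(0)
alpha = rng.normal(size=50).repeat(4)   # 50 individuals, 4 periods each
e = alpha + 0.1 * rng.normal(size=200)
lm = breusch_pagan_lm(e, 50, 4)         # far above the chi-squared(1) 5% value 3.84
```

Under H0 the statistic is chi-squared with 1 degree of freedom, which matches the "Chi-squared statistic (1)" line of the output.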


/* IV2.PRG  Instrumental variable estimation and GMM estimation
   Model: y(it) = X(it) beta + Z(i) gamma
   We use Hausman-Taylor, Amemiya-MaCurdy, Breusch-Mizon-Schmidt instruments,
   both for IV and GMM */
new; clear all;

/* You only need to change this block */
/* Define dimensions:
   N: number of units, T: number of time periods
   nvar: Nb. of variables to be read
   k1: Nb. of X1it, k2: Nb. of X2it, g1: Nb. of Z1i, g2: Nb. of Z2i
   kq = k1+k2, kb = k1+k2+g1+g2 */
n=595;
t=7;
nvar=13;
k1=4;
k2=5;
g1=2;
g2=1;
kq=k1+k2;
kb=k1+k2+g1+g2;
et=ones(t,1);
un=ones(n*t,1);
unb=ones(n,1);

/* Read data */
load x[n*t,nvar]=psid.dat;
output file=iv1.out reset;
expe=x[.,1];
expe2=x[.,2];
wks=x[.,3];
occ=x[.,4];
ind=x[.,5];
south=x[.,6];
smsa=x[.,7];
ms=x[.,8];
fem=x[.,9];
unioni=x[.,10];
edu=x[.,11];
blk=x[.,12];
lwage=x[.,13];

/* Define matrices X, Z and vector Y */
x1=occ~south~smsa~ind;
x2=expe~expe2~wks~ms~unioni;
z1=fem~blk;
z2=edu;
y=lwage;
x=x1~x2;
z=z1~z2;

/* You don't need to change anything after this */
/* Compute Between and Within transformations:
   Caution: keep that order for BX and QX: X,Z,Y */
qx=with(x~y);
bxz=bet(x~z~y);
by=bxz[.,cols(bxz)];
bxz=bxz[.,1:cols(bxz)-1];
qy=qx[.,cols(qx)];
qx=qx[.,1:cols(qx)-1];

/* Within regression and error term (uw) */
betaw=inv(qx'qx)*qx'qy;
uw=qy-qx*betaw;

/* Compute variance with instruments */
exob=un~bxz;
gamb=inv(exob'exob)*(exob'by);
ub=by-exob*gamb;
sigep=uw'uw/(n*(t-1)-kq);
sigq=sqrt(sigep*diag(inv(qx'qx)));
a=x1~z1;
di=by-bxz[.,1:kq]*betaw;
zz=un~z1~z2;
gamhatw=inv(zz'*a*inv(a'*a)*a'*zz)*zz'*a*inv(a'*a)*a'*di;
s2=(1/(n*t))*(by-bxz[.,1:kq]*betaw-zz*gamhatw)'*(by-bxz[.,1:kq]*betaw-zz*gamhatw);
sigal=s2-(1/t)*sigep;
theta=sqrt(sigep/(sigep+t*sigal));

/* GLS transformation and estimate
   Caution: keep the order 1,X1,X2,Z1,Z2 in matrix EXOG */
exog=gls(un~x1~x2~z1~z2~y);
yg=exog[.,cols(exog)];
exog=exog[.,1:cols(exog)-1];
betagls=inv(exog'exog)*(exog'yg);
siggls=sqrt(sigep*diag(inv(exog'exog)));

/* HT */
aht=un~qx~bet(x1)~z1;
betaht=inv(exog'*aht*inv(aht'*aht)*aht'*exog)*exog'*aht*inv(aht'*aht)*aht'*yg;
sight=sqrt(sigep*diag(inv(exog'*aht*inv(aht'*aht)*aht'*exog)));

/* AM */
x1s=tam(x1);
aam=un~qx~x1s~z1;
betaam=inv(exog'*aam*inv(aam'*aam)*aam'*exog);
betaam=betaam*exog'*aam*inv(aam'*aam)*aam'*yg;
sigam=sqrt(sigep*diag(inv(exog'*aam*inv(aam'*aam)*aam'*exog)));

/* BMS */
abms1=aam~tbms(with(x2));
/* This is the general form for the BMS instrument set; it should work in most
   cases. But with the application to PSID data, we must drop some variables,
   see below. This means you have to delete ABMS1 below for your application */
/* Remove abms1 just below: */
abms1=un~qx~bet(x1)~tbms(with(occ~south~smsa~ind~ms~wks~unioni))~z1;
betabms1=inv(exog'*abms1*inv(abms1'*abms1)*abms1'*exog)*exog'*abms1*inv(abms1'*abms1)*abms1'*yg;
sigbms1=sqrt(sigep*diag(inv(exog'*abms1*inv(abms1'*abms1)*abms1'*exog)));

/* Compute variance-covariance matrices */
varq=sigep*inv(qx'qx);
varg=sigep*inv(exog'*exog);
varht=sigep*inv(exog'*aht*inv(aht'*aht)*aht'*exog);
varam=sigep*inv(exog'*aam*inv(aam'*aam)*aam'*exog);
varbms1=sigep*inv(exog'*abms1*inv(abms1'*abms1)*abms1'*exog);
test1=(betagls[2:kq+1]-betaw)'*inv(varq-varg[2:kq+1,2:kq+1]);
test1=test1*(betagls[2:kq+1]-betaw);
test2=(betaht[2:kq+1]-betaw)'*inv(varq-varht[2:kq+1,2:kq+1])*(betaht[2:kq+1]-betaw);
test3=(betaht-betaam)'*inv(varht-varam)*(betaht-betaam);
test4=(betaam-betabms1)'*inv(varam-varbms1)*(betaam-betabms1);

output file=iv1.out reset;
output on;
"Within estimates ";
" Estimate   standard error   t-stat ";
betaw~sigq~(betaw./sigq);
"GLS estimates ";
"sigma(alpha), sigma(epsilon), theta (=(sig(ep)/(sig(ep)+t*sig(al)))^(1/2)) ";
sigal~sigep~theta;
" Estimate   standard error   t-stat ";
betagls~siggls~(betagls./siggls);
"HT estimates ";
" Estimate   standard error   t-stat ";
betaht~sight~(betaht./sight);
"AM estimates ";
" Estimate   standard error   t-stat ";
betaam~sigam~(betaam./sigam);
"BMS estimates ";
" Estimate   standard error   t-stat ";
betabms1~sigbms1~(betabms1./sigbms1);
"Hausman test statistics and p-value ";
"Within vs. GLS ";
test1~cdfchic(test1,kq);
"Within vs. HT ";
test2~cdfchic(test2,k1-g2);
"AM vs. HT ";
test3~cdfchic(test3,cols(aam)-cols(aht));
"BMS vs. AM ";
test4~cdfchic(test4,cols(abms1)-cols(aam));

/* GMM estimation */
{b1,se1,b2,se2,sar} = gmm(y,un~x1~x2~z1~z2,aht,1);
"GMM-HT estimates ";
" Estimate   standard error   t-stat ";
b2~se2~(b2./se2);
"Hansen test and p-value ";
sar~cdfchic(sar,cols(aht)-rows(b2));
{b1,se1,b2,se2,sar} = gmm(y,un~x1~x2~z1~z2,aam,1);
"GMM-AM estimates ";
" Estimate   standard error   t-stat ";
b2~se2~(b2./se2);
"Hansen test and p-value ";
sar~cdfchic(sar,cols(aam)-rows(b2));
{b1,se1,b2,se2,sar} = gmm(y,un~x1~x2~z1~z2,abms1,1);
"GMM-BMS estimates ";
" Estimate   standard error   t-stat ";
b2~se2~(b2./se2);
"Hansen test and p-value ";
sar~cdfchic(sar,cols(abms1)-rows(b2));
output off;

proc bet(w);
/* Compute BX from matrix w */
local i,term,betx;
term=reshape(w[.,1],n,t);
term=meanc(term').*.et;
term=reshape(term,n*t,1);
betx=term;
i=2;
do until i>cols(w);
  term=reshape(w[.,i],n,t);
  term=reshape(meanc(term').*.et,n*t,1);
  betx=betx~term;
  i=i+1;
endo;
retp(betx);
endp;

proc with(w);
/* Compute Within transformation for matrix w */
retp(w-bet(w));
endp;

proc gls(w);
/* GLS transformation */
local term;
term=w-(1-theta)*bet(w);
retp(term);
endp;

proc tam(w);
/* AM transformation, stacking time observations */
local i,term,xstar;
term=reshape(w[.,1],n,t).*.et;
xstar=term;
i=2;
do until i>cols(w);
  term=reshape(w[.,i],n,t).*.et;
  xstar=xstar~term;
  i=i+1;
endo;
retp(xstar);
endp;

proc tbms(w);
/* BMS transformation, stacking time observations but deleting last column */
local i,term,xstar;
term=reshape(w[.,1],n,t).*.et;
xstar=term[.,1:cols(term)-1];
i=2;
do until i>cols(w);
  term=reshape(w[.,i],n,t).*.et;
  xstar=xstar~term[.,1:cols(term)-1];
  i=i+1;
endo;
retp(xstar);
endp;

proc (5)=gmm(y,x,z,d);
local zx,w,w2,b,e,e2,b2,se,se2,sar2;
zx = z'x;
if d==1;
  w = invpd(inw(z));
else;
  w = invpd(z'z);
endif;
b = invpd(zx'w*zx)*zx'w*z'y;     /* First-step GMM estimate */
e = y-x*b;
w2 = ezw(e,z);
se = invpd(zx'w*zx)*zx'w*w2*w*zx*invpd(zx'w*zx);
w = invpd(w2);                   /* Optimal weight matrix */
se2 = invpd(zx'w*zx);
b2 = se2*zx'w*z'y;               /* Second-step (efficient) GMM estimate */
e2 = y-x*b2;
sar2 = e2'z*w*z'e2;              /* Hansen overidentification statistic */
retp(b,sqrt(diag(se)),b2,sqrt(diag(se2)),sar2);
endp;

proc ezw(e,z);
local k,ez,T;
T = rows(e)/N;
k = cols(z);
ez = reshape(e.*z,N,K*T)*(ones(T,1).*.eye(K));
retp(ez'ez);
endp;

proc inw(z);
local a,i,zi,zaz,T;
T = rows(z)/N;
a = eye(T);
zaz = 0;
i = 1;
do until i>N;
  zi = z[(i-1)*T+1:i*T,.];
  zaz = zaz + zi'a*zi;
  i = i+1;
endo;
retp(zaz);
endp;
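The two-step GMM logic of proc gmm above carries over directly to NumPy. This sketch uses a heteroskedasticity-robust moment covariance (the Gauss ezw version additionally sums moments within individuals) and simulated data with one endogenous regressor, all names and values being illustrative:

```python
import numpy as np

def gmm_iv(y, X, Z):
    """Two-step linear GMM: first step with W = (Z'Z)^-1,
    second step with the optimal weight from first-step residuals."""
    ZX, Zy = Z.T @ X, Z.T @ y
    W = np.linalg.inv(Z.T @ Z)
    b1 = np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ Zy)   # first-step estimate
    e = y - X @ b1
    S = (Z * e[:, None]).T @ (Z * e[:, None])            # robust moment covariance
    W2 = np.linalg.inv(S)
    V2 = np.linalg.inv(ZX.T @ W2 @ ZX)
    b2 = V2 @ ZX.T @ W2 @ Zy                             # efficient two-step estimate
    g = Z.T @ (y - X @ b2)
    J = g @ W2 @ g                                       # Hansen overidentification statistic
    return b2, np.sqrt(np.diag(V2)), J

rng = np.random.default_rng(1)
n = 500
z1, z2 = rng.normal(size=(2, n))
u = rng.normal(size=n)
x = z1 + 0.5 * z2 + 0.8 * u                  # endogenous regressor
y = 2.0 * x + u + 0.1 * rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z1, z2])
b2, se2, J = gmm_iv(y, X, Z)                 # b2[1] should be close to 2
```

With one overidentifying restriction, J is compared to a chi-squared(1) critical value, just as in the cdfchic calls of the main program.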

/* DPD1.PRG  Estimation of a dynamic panel data model by GMM.
   Method: Arellano-Bond */
/* Defines variables below as global */
clearg N,T,y,x,z,alpha,sco,hes,zgy,fake,mom,w;

/* Read data */
n=595; t=7; nvar=13;
load x[n*t,nvar]=d:/dea/panel/psid.dat;
lwage=x[.,13];
wks=x[.,3];
occ=x[.,4];
clear x;

/* Create a (NxT) matrix for dependent var. */
y=reshape(lwage,n,t);

/* Stack exogenous vars. */
x=wks~occ;

/* Set top=0 for instruments from lagged y's only;
   set top=1 to add instruments from X that are weakly exogenous and in level;
   set top=2 to add instruments from X that are strongly exogenous and
   in first-difference form */
top=2;

/* Set AR1 to 0 for the general case, and AR1 to 1
   for serially correlated epsilon's of order 1 (E(eps(i,t)*eps(i,t+1)) <> 0) */
ar1=1;

/* You don't need to change anything after this line */
/* Define identity matrix I(T-2) for AB and BB */
ddif = eye(T-2);

/* Construct AB instrument matrix Z.
   First component matrix: lagged Y's.
   Recall: if AR1=1, restriction when epsilon's are serially correlated
   of order 1 */
z = (y[.,1]).*.ddif[.,1];
j = 2;
do until j>cols(ddif);
  z = z~((y[.,1:j]).*.ddif[.,j]);
  j = j+1;
endo;
if ar1==1;
  z = (y[.,1]).*.ddif[.,1];
  j = 2;
  do until j>cols(ddif);
    z = z~((y[.,1:j-1]).*.ddif[.,j]);
    j = j+1;
  endo;
  z=z[.,2:cols(z)];
endif;

/* Second component matrix: Instruments from X */
/* Delete this block if you want only instruments from y's */
if top==1;
  /* Weakly exogenous X's, in level */
  toto=shapent(x[.,1]);
  z2 = (toto[.,1]).*.ddif[.,1];
  j = 2;
  do until j>cols(ddif);
    z2 = z2~((toto[.,1:j]).*.ddif[.,j]);
    j = j+1;
  endo;
  i=2;
  do until i>cols(x);
    toto=shapent(x[.,i]);
    z2 = z2~((toto[.,1]).*.ddif[.,1]);
    j = 2;
    do until j>cols(ddif);
      z2 = z2~((toto[.,1:j]).*.ddif[.,j]);
      j = j+1;
    endo;
    i=i+1;
  endo;
  z=z~z2;
endif;
if top==2;
  /* Strongly exogenous X's, in first-difference form */
  toto=shapent(x[.,1]);
  z2 = (toto[.,3]-toto[.,2]).*.ddif[.,1];
  j = 2;
  do until j>cols(ddif);
    z2 = z2~((toto[.,j]-toto[.,j-1]).*.ddif[.,j]);
    j = j+1;
  endo;
  i=2;
  do until i>cols(x);
    toto=shapent(x[.,i]);
    z2 = z2~((toto[.,3]-toto[.,2]).*.ddif[.,1]);
    j = 2;
    do until j>cols(ddif);
      z2 = z2~((toto[.,j]-toto[.,j-1]).*.ddif[.,j]);
      j = j+1;
    endo;
    i=i+1;
  endo;
  z=z~z2;
endif;

{b1,se1,b2,se2,sar} = gmm(vec((y[.,3:T]-y[.,2:T-1])'),
    vec((y[.,2:T-1]-y[.,1:T-2])')~trans(x),z,1);

output file = dpd1.out on;
"Arellano-Bond GMM estimates";
if top==0;
  "Instruments from lagged Y's only (TOP=0)";
endif;
if top==1;
  "Instruments from X are weakly exogenous and in level (TOP=1)";
endif;
if top==2;
  "Instruments from X are strongly exogenous and first-differenced (TOP=2)";
endif;
if ar1==1;
  "Restricted estimates: epsilon are serially correlated of order 1 (AR1=1)";
endif;
" Estimate   standard error   t-stat";
b2~se2~(b2./se2);
"Nb. of conditions (instruments) " cols(z);
"Nb. of parameters " rows(b2);
"Hansen specification test and p-value ";
sar~cdfchic(sar,cols(z)-rows(b2));
output off;
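The loops above build a block-diagonal instrument matrix, one block per individual. For a single individual, the standard Arellano-Bond block (all available lagged levels of y as instruments for each first-differenced equation) can be sketched as follows, with an illustrative T=5 series:

```python
import numpy as np

def ab_instruments(yi):
    """Arellano-Bond instrument block for one individual with T observations.
    Row s corresponds to the first-differenced equation for period t = s + 3
    and contains the levels y_1, ..., y_{t-2}; zeros elsewhere."""
    T = len(yi)
    ncols = (T - 2) * (T - 1) // 2      # 1 + 2 + ... + (T-2) instruments
    Z = np.zeros((T - 2, ncols))
    col = 0
    for s in range(T - 2):
        k = s + 1                       # number of available lagged levels
        Z[s, col:col + k] = yi[:k]
        col += k
    return Z

Z = ab_instruments(np.array([1., 2., 3., 4., 5.]))   # T=5: 3 equations, 6 instruments
```

Stacking these blocks diagonally across individuals reproduces the matrix z assembled by the Gauss loops (without the AR1 restriction).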

proc shapent(w);
/* Reshapes vector in NxT form */
retp(reshape(w,n,t));
endp;

proc trans(w);
/* Transforms matrix X in first-difference form */
local toto,i,xfd;
toto=reshape(w[.,1],n,t);
toto=vec((toto[.,3:T]-toto[.,2:T-1])');
xfd=toto;
i=2;
do until i>cols(w);
  toto=reshape(w[.,i],n,t);
  toto=vec((toto[.,3:T]-toto[.,2:T-1])');
  xfd=xfd~toto;
  i=i+1;
endo;
retp(xfd);
endp;

proc (2)=ls(y,x);
/* Computes OLS, returns White var-covar matrix */
local ixx,b,e,v;
ixx = invpd(x'x);
b = ixx*x'y;
e = y-x*b;
v = ixx*(ezw(e,x))*ixx;
retp(b,v);
endp;

proc ezw(e,z);
local k,ez,T;
T = rows(e)/N;
k = cols(z);
ez = reshape(e.*z,N,K*T)*(ones(T,1).*.eye(K));
retp(ez'ez);
endp;

proc inw(z);
local d,a,i,zi,zaz,T;
T = rows(z)/N;
d = zeros(T,1)~(eye(T-1)|zeros(1,T-1));
a = 2*eye(T) - (d + d');   /* Covariance structure of first-differenced errors */
zaz = 0;
i = 1;
do until i>N;
  zi = z[(i-1)*T+1:i*T,.];
  zaz = zaz + zi'a*zi;
  i = i+1;
endo;
retp(zaz);
endp;

proc (5)=gmm(y,x,z,d);
local zx,w,w2,b,e,e2,b2,se,se2,sar2;
zx = z'x;
if d==1;
  w = invpd(inw(z));
else;
  w = invpd(z'z);
endif;
b = invpd(zx'w*zx)*zx'w*z'y;     /* First-step GMM estimate */
e = y-x*b;
w2 = ezw(e,z);
se = invpd(zx'w*zx)*zx'w*w2*w*zx*invpd(zx'w*zx);
w = invpd(w2);                   /* Optimal weight matrix */
se2 = invpd(zx'w*zx);
b2 = se2*zx'w*z'y;               /* Second-step (efficient) GMM estimate */
e2 = y-x*b2;
sar2 = e2'z*w*z'e2;              /* Hansen overidentification statistic */
retp(b,sqrt(diag(se)),b2,sqrt(diag(se2)),sar2);
endp;


References

S.C. Ahn and P. Schmidt, Efficient Estimation of Models for Dynamic Panel Data, Journal of Econometrics, 68, 5-27, 1995.

S.C. Ahn and P. Schmidt, A Separability Result for GMM Estimation, with Applications to GLS Prediction and Conditional Moment Tests, Econometric Reviews, 14(1), 19-34, 1995.

S.C. Ahn and P. Schmidt, Efficient Estimation of Dynamic Panel Data Models: Alternative Assumptions and Simplified Estimation, Journal of Econometrics, 76, 309-321, 1997.

S.C. Ahn, Y.H. Lee and P. Schmidt, GMM Estimation of Linear Panel Data Models with Time-varying Individual Effects, Journal of Econometrics, 101, 219-255, 2001.

T. Amemiya, The estimation of the variances in a variance-components model, International Economic Review, 12, 1-13, 1971.

T. Amemiya and T.E. MaCurdy, Instrumental-Variable Estimation of an Error-Components Model, Econometrica, 54(4), 869-880, 1986.

E.B. Andersen, Conditional Inference and Models for Measuring, Mentalhygiejnisk Forlag, Copenhagen, 1973.

T.W. Anderson and C. Hsiao, Formulation and Estimation of Dynamic Models Using Panel Data, Journal of Econometrics, 18, 47-82, 1982.

D.W.K. Andrews, Heteroskedasticity and autocorrelation consistent covariance matrix estimation, Econometrica, 59, 817-858, 1991.

D.W.K. Andrews and J.C. Monahan, An improved heteroskedasticity and autocorrelation consistent covariance matrix estimator, Econometrica, 60, 953-966, 1992.

W. Antweiler, Nested Random Effects Estimation in Unbalanced Panel Data, Journal of Econometrics, 101, 295-313, 2001.

M. Arellano, Discrete choices with panel data, working paper 0101, CEMFI, 2001.

M. Arellano and S. Bond, Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations, Review of Economic Studies, 58, 277-297, 1991.

M. Arellano and O. Bover, Another Look at the Instrumental Variable Estimation of Error-Components Models, Journal of Econometrics, 68, 29-51, 1995.

J. Alvarez and M. Arellano, The Time Series and Cross Section Asymptotics of Dynamic Panel Data Estimators, CEMFI Working Paper No. 9808, 1998.

P. Balestra and M. Nerlove, Pooling cross-section and time-series data in the estimation of a dynamic model: the demand for natural gas, Econometrica, 34, 585-612, 1966.

B.H. Baltagi, Econometric Analysis of Panel Data, J. Wiley, 1995.

B.H. Baltagi and S. Khanti-Akom, On efficient estimation with panel data: an empirical comparison of instrumental variables estimators, Journal of Applied Econometrics, 5, 401-406, 1990.

B.H. Baltagi, Simultaneous equations with error components, Journal of Econometrics, 17, 189-200, 1981.

B.H. Baltagi, Specification issues, in The Econometrics of Panel Data: Handbook of Theory and Applications, chap. 9, L. Matyas and P. Sevestre eds., Kluwer Academic Publishers, Dordrecht, 196-205, 1992.

B.H. Baltagi, Panel data, Journal of Econometrics, 68, 1-268, 1995.

B.H. Baltagi, S.H. Song and B.C. Jung, The Unbalanced Nested Error Component Regression Model, Journal of Econometrics, 101, 357-381, 2001.

R. Blundell and S. Bond, GMM estimation with persistent panel data: An application to production functions, IFS working paper W99/4, 1999.

R. Blundell and S. Bond, Initial Conditions and Moment Restrictions in Dynamic Panel Data Models, Journal of Econometrics, 87, 115-143, 1998.

A. Börsch-Supan and V. Hajivassiliou, Smooth unbiased multivariate probability simulators for maximum likelihood estimation of limited dependent variables models, Cowles Foundation paper 960, Yale University, 1990.

T.S. Breusch, G.E. Mizon and P. Schmidt, Efficient Estimation Using Panel Data, Econometrica, 57(3), 695-700, 1989.

G. Chamberlain, Asymptotic Efficiency in Estimation with Conditional Moment Restrictions, Journal of Econometrics, 34, 305-334, 1987.

G. Chamberlain, Panel data, in Handbook of Econometrics, pp. 1247-1318, Z. Griliches and M. Intriligator eds., North-Holland, Amsterdam, 1984.

G. Chamberlain, Comment: Sequential Moment Restrictions in Panel Data, Journal of Business and Economic Statistics, 10, 20-26, 1992.

G. Chamberlain, Multivariate regression models for panel data, Journal of Econometrics, 18, 5-46, 1982.

E. Charlier, B. Melenberg and A. van Soest, Estimation of a censored regression panel data model using conditional moment restrictions efficiently, Journal of Econometrics, 95, 25-56, 2000.

C. Cornwell and P. Rupert, Efficient Estimation with Panel Data: An Empirical Comparison of Instrumental Variables Estimators, Journal of Applied Econometrics, 3, 149-155, 1988.

B. Crépon, F. Kramarz and A. Trognon, Parameters of Interest, Nuisance Parameters and Orthogonality Conditions. An Application to Autoregressive Error Component Models, Journal of Econometrics, 82, 135-156, 1997.

C. Cornwell, P. Schmidt and D. Wyhowski, Simultaneous equations and panel data, Journal of Econometrics, 51, 151-181, 1992.

G. Dionne, R. Gagné and C. Vanasse, Inferring technological parameters from incomplete panel data, Journal of Econometrics, 87, 303-327, 1998.

J. Dolado, Optimal instrumental variable estimator of the AR parameter of an ARMA(1,1) process, Econometric Theory, 6, 117-119.

B. Dormont, Introduction à l'Économétrie des Données de Panel, Editions du Centre National de la Recherche Scientifique, Paris, 1989.

E. Fix and J.L. Hodges, Discriminatory analysis, nonparametric estimation: consistent properties, Report No 4, USAF School of Aviation Medicine, Randolph Field, Texas, 1951.

J. Geweke, Bayesian inference in econometric models using Monte Carlo integration, Econometrica, 57, 1317-1339, 1989.

S. Girma, A quasi-differencing approach to dynamic modelling from a time series of independent cross-sections, Journal of Econometrics, 365-383, 2000.

R. Hall, Stochastic implications of the life cycle-permanent income hypothesis, Journal of Political Economy, 86, 971-987, 1978.

B.E. Hansen, Threshold Effects in Non-Dynamic Panels: Estimation, Testing, and Inference, Journal of Econometrics, 93, 345-368, 1999.

L.P. Hansen, Large sample properties of generalized method of moments estimators, Econometrica, 50, 1029-1054, 1982.

L.P. Hansen, A method of calculating bounds on the asymptotic covariance matrices of generalized method of moments estimators, Journal of Econometrics, 30, 203-238, 1985.

L.P. Hansen and T.J. Sargent, Instrumental variables procedures for estimating linear rational expectations models, Journal of Monetary Economics, 9, 263-296, 1982.

L.P. Hansen and K.J. Singleton, Generalized instrumental variable estimation of nonlinear rational expectations models, Econometrica, 50, 1269-1286, 1982.

L.P. Hansen, J.C. Heaton and A. Yaron, Finite-sample properties of some alternative GMM estimators, Journal of Business and Economic Statistics, 14, 262-280, 1996.

W. Härdle and J.S. Marron, Optimal bandwidth selection in nonparametric regression function estimation, Annals of Statistics, 13, 1465-1481, 1985.

R.D.F. Harris and E. Tzavalis, Inference for unit roots in dynamic panels where the time dimension is fixed, Journal of Econometrics, 91, 201-226, 1999.

J.A. Hausman, Specification Tests in Econometrics, Econometrica, 46(6), 1251-1271, 1978.

J.A. Hausman and W.E. Taylor, Panel Data and Unobservable Individual Effects, Econometrica, 49(6), 1377-1398, 1981.

J.J. Heckman and T.E. MaCurdy, A life-cycle model of female labor supply, Review of Economic Studies, 47, 47-74, 1980.

I. Hoch, Estimation of production function parameters combining time-series and cross-section data, Econometrica, 30, 34-53, 1962.

D. Holtz-Eakin, W. Newey and H. Rosen, Estimating Vector Autoregressions with Panel Data, Econometrica, 56, 1371-1395, 1988.

B.E. Honoré and A. Lewbel, Semiparametric binary choice panel data models without strictly exogenous regressors, working paper, Boston College, 2000.

C. Hsiao, Analysis of Panel Data, Cambridge University Press, 1986.

K.S. Im, S.C. Ahn, P. Schmidt and J.M. Wooldridge, Efficient estimation of panel data models with strictly exogenous explanatory variables, Journal of Econometrics, 93, 177-201, 1999.

G.W. Imbens, One-step estimators for over-identified generalized method of moments models, Review of Economic Studies, 64, 359-383.

J. Inkmann, Misspecified heteroskedasticity in the panel Probit model: A small sample comparison of GMM and SML estimators, Journal of Econometrics, 97, 227-259, 2000.

R.A. Judson and A.L. Owen, Estimating dynamic panel data models: A guide for macroeconomists, Economics Letters, 65, 9-15, 1999.

M.P. Keane and D.E. Runkle, On the estimation of panel-data models with serial correlation when instruments are not strictly exogenous, Journal of Business and Economic Statistics, 10, 1-9, 1992.

N.M. Kiefer, A Time Series-Cross Section Model with Fixed Effects with an Intertemporal Factor Structure, unpublished manuscript, Cornell University, 1980.

E. Kyriazidou, Estimation of a panel data sample selection model, Econometrica, 65, 1335-1364, 1997.

Y.H. Lee and P. Schmidt, A Production Frontier Model with Flexible Temporal Variation in Technical Inefficiency, in The Measurement of Productive Efficiency: Techniques and Applications, Oxford University Press, 1993.

L.A. Lillard and Y. Weiss, Components of Variation in Panel Earnings Data: American Scientists 1960-1970, Econometrica, 47, 437-454, 1979.

R. Lucas, Econometric policy evaluation: A critique, in The Phillips Curve and Labor Markets, K. Brunner (Ed.), Vol. 1, North-Holland, 1976.

Y.P. Mack, Local properties of k-NN regression estimates, SIAM Journal on Algebraic and Discrete Methods, 2, 311-323, 1981.

L. Matyas and P. Sevestre, The Econometrics of Panel Data: Handbook of Theory and Applications, Kluwer Academic Publishers, 1992.

P. Mazodier and A. Trognon, Heteroskedasticity and stratification in error components models, Annales de l'INSEE, 30-31, 451-482, 1978.

C. Meghir and F. Windmeijer, Moment Conditions for Dynamic Panel Data Models with Multiplicative Individual Effects in the Conditional Variance, IFS Working Paper Series No. W97/21, 1997.

R. Moffitt, Identification and estimation of dynamic models with a time series of repeated cross-sections, Journal of Econometrics, 59, 99-123, 1993.

M. Nerlove, A note on error components models, Econometrica, 39, 383-396, 1971.

W.K. Newey, Efficient estimation of models with conditional moment restrictions, in Handbook of Statistics, C.R. Rao and H.D. Vinod (Eds.), Vol. 11, Elsevier Science Publishers, 1993.

W.K. Newey, Efficient instrumental variables estimation of nonlinear models, Econometrica, 58, 809-837, 1990.

W.K. Newey and K.D. West, Automatic lag selection in covariance estimation, Review of Economic Studies, 61, 631-653, 1994.

W.K. Newey and K.D. West, Hypothesis testing with efficient method of moments estimation, International Economic Review, 28, 777-787, 1987.

W.K. Newey and K.D. West, A simple, positive definite, heteroskedasticity and autocorrelation consistent covariance matrix, Econometrica, 55, 703-708, 1987.

P. Schmidt, S.C. Ahn and D. Wyhowski, Comment: Sequential Moment Restrictions in Panel Data, Journal of Business and Economic Statistics, 10, 10-14, 1992.

C.J. Stone, Consistent nonparametric regression, Annals of Statistics, 5, 595-645, 1977.

P.A.V.B. Swamy and S.S. Arora, The exact finite sample properties of the estimators of coefficients in the error components regression models, Econometrica, 40, 261-275, 1972.

M. Verbeek and T.E. Nijman, Testing for selectivity bias in panel data models, International Economic Review, 33, 681-703, 1992.

M. Verbeek and T.E. Nijman, Minimum MSE estimation of a regression model with fixed effects and a series of cross-sections, Journal of Econometrics, 59, 125-136, 1993.

T.D. Wallace and A. Hussain, The use of error components models in combining cross-section and time-series data, Econometrica, 37, 55-72, 1969.

T.J. Wansbeek and A. Kapteyn, Estimation of the error components model with incomplete panels, Journal of Econometrics, 41, 341-361, 1989.

H. White, A heteroskedasticity consistent covariance matrix estimator and a direct test for heteroskedasticity, Econometrica, 48, 817-838, 1980.

H. White, Asymptotic Theory for Econometricians, Academic Press, Orlando, 1984.

J.M. Wooldridge, A framework for estimating dynamic, unobserved effects panel data models with possible feedback to future explanatory variables, Economics Letters, 68, 245-250, 2000.
