Econometrics Chapter 10 PPT Slides

Chapter 10
Regression with
Panel Data
Copyright 2015 Pearson, Inc. All rights reserved.
Outline
1. Panel Data: What and Why
2. Panel Data with Two Time Periods
3. The Fixed Effects Model
   1. What is it
   2. Potential issues
   3. Time fixed effects
4. The Random Effects Model
   1. What is it
   2. Potential issues
5. Choosing between the Fixed Effects and Random Effects Models
10-2
Panel Data: What and Why

(SW Section 10.1)
A panel dataset contains observations on multiple
entities (individuals, states, companies), where
each entity is observed at two or more points in
time.
Hypothetical examples:
Data on 420 California school districts in 1999 and again
in 2000, for 840 observations total.
Data on 50 U.S. states, each observed in 3 years,
for a total of 150 observations.
Data on 1000 individuals, in four different months, for
4000 observations total.
10-3
What Are Panel Data? (cont.)

There are four different kinds of variables that we encounter when
we use panel data:
1. Variables that can differ between individuals but don't change
over time:
e.g., gender, ethnicity, and race
2. Variables that change over time but are the same for all
individuals in a given time period:
e.g., the retail price index and the national unemployment rate
3. Variables that vary both over time and between individuals:
e.g., income and marital status
4. Trend variables that vary in predictable ways:
e.g., an individual's age
More in class practice and examples
10-5
Notation for panel data

A double subscript distinguishes entities (states) and time
periods (years):
$i$ = entity (state), $n$ = number of entities,
so $i = 1, \dots, n$
$t$ = time period (year), $T$ = number of time periods,
so $t = 1, \dots, T$
Data: Suppose we have 1 regressor. The data are:
$(X_{it}, Y_{it})$, $i = 1, \dots, n$, $t = 1, \dots, T$
10-6
Panel data notation, ctd.

Panel data with k regressors:
$(X_{1it}, X_{2it}, \dots, X_{kit}, Y_{it})$, $i = 1, \dots, n$, $t = 1, \dots, T$
n = number of entities (states)
T = number of time periods (years)
Some jargon
Another term for panel data is longitudinal data
balanced panel: no missing observations, that is, all
variables are observed for all entities (states) and all time
periods (years)
10-7
Why are panel data useful?

With panel data we can control for factors that:
Vary across entities but do not vary over time
Could cause omitted variable bias if they are
omitted
Are unobserved or unmeasured and therefore
cannot be included in the regression using multiple
regression
Here's the key idea:
If an omitted variable does not change over time,
then any changes in Y over time cannot be caused
by the omitted variable.
10-8
Example of a panel data set:

Traffic deaths and alcohol taxes
Observational unit: a year in a U.S. state
48 U.S. states, so n = # of entities = 48
7 years (1982, ..., 1988), so T = # of time periods = 7
Balanced panel, so total # observations = 7 × 48 = 336
Variables:
Traffic fatality rate (# traffic deaths in that state in that
year, per 10,000 state residents)
Tax on a case of beer
Other (legal driving age, drunk driving laws, etc.)
10-9
U.S. traffic death data for 1982:
Higher alcohol taxes, more traffic deaths?

10-10
Why might there be more traffic deaths in states that have higher alcohol taxes?
Other factors that determine the traffic fatality rate:
Quality (age) of automobiles
Quality of roads
Culture around drinking and driving
Density of cars on the road
10-11
These omitted factors could cause omitted variable bias.
Example #1: traffic density. Suppose:
I. High traffic density means more traffic deaths
II. (Western) states with lower traffic density have lower alcohol taxes
Then the two conditions for omitted variable bias are
satisfied. Specifically, high taxes could reflect high traffic
density, so the OLS coefficient would be biased positively
(high taxes, more deaths).
Panel data lets us eliminate omitted variable bias when the
omitted variables are constant over time within a given
state.
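The direction of the bias in Example #1 can be written out with the standard omitted variable bias formula. This is a sketch under the assumption that the true model is linear in the beer tax $X$ and the omitted density variable $Z$, and that the remaining error is uncorrelated with $X$; the symbols $\beta_2$ and $Z$ follow the notation used for the panel model later in these slides.

```latex
% True model:      Y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + u_i
% Estimated model: Y_i regressed on X_i only (Z_i omitted)
\hat{\beta}_1 \;\xrightarrow{\;p\;}\; \beta_1 + \beta_2 \,\frac{\operatorname{Cov}(X_i, Z_i)}{\operatorname{Var}(X_i)}
% Condition I  (density raises deaths):        \beta_2 > 0
% Condition II (density and taxes move together): \operatorname{Cov}(X_i, Z_i) > 0
% Both positive  =>  the second term is positive  =>  upward bias in \hat{\beta}_1
```

If either condition fails (that is, $\beta_2 = 0$ or the covariance is zero), the extra term vanishes and there is no omitted variable bias from $Z$.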
10-12
Example #2: Cultural attitudes towards drinking and driving:
(i) arguably are a determinant of traffic deaths; and
(ii) potentially are correlated with the beer tax.
Then the two conditions for omitted variable bias are
satisfied. Specifically, high taxes could pick up the effect
of cultural attitudes towards drinking, so the OLS coefficient
would be biased.
Panel data lets us eliminate omitted variable bias when the
omitted variables are constant over time within a given
state.
10-13
10-14
Panel Data with Two Time Periods

(SW Section 10.2)
Consider the panel data model,
$FatalityRate_{it} = \beta_0 + \beta_1 BeerTax_{it} + \beta_2 Z_i + u_{it}$
$Z_i$ is a factor that does not change over time (density), at
least during the years on which we have data.
Suppose $Z_i$ is not observed, so its omission could result
in omitted variable bias.
The effect of $Z_i$ can be eliminated using $T = 2$ years.
10-15
The key idea:

Any change in the fatality rate from 1982 to 1988 cannot be
caused by $Z_i$, because $Z_i$ (by assumption) does not change
between 1982 and 1988.
The math: consider fatality rates in 1988 and 1982:
$FatalityRate_{i1988} = \beta_0 + \beta_1 BeerTax_{i1988} + \beta_2 Z_i + u_{i1988}$
$FatalityRate_{i1982} = \beta_0 + \beta_1 BeerTax_{i1982} + \beta_2 Z_i + u_{i1982}$
Suppose $E(u_{it} | BeerTax_{it}, Z_i) = 0$.
Subtracting the 1982 equation from the 1988 equation (that is, calculating the
change) eliminates the effect of $Z_i$.
10-16

so
$FatalityRate_{i1988} - FatalityRate_{i1982} = \beta_1 (BeerTax_{i1988} - BeerTax_{i1982}) + (u_{i1988} - u_{i1982})$
The new error term, $(u_{i1988} - u_{i1982})$, is uncorrelated with
either $BeerTax_{i1988}$ or $BeerTax_{i1982}$.
This "difference" equation can be estimated by OLS, even
though $Z_i$ isn't observed.
The omitted variable $Z_i$ doesn't change, so it cannot be a
determinant of the change in $Y$.
This differences regression doesn't have an intercept: it was
eliminated by the subtraction step.
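The differencing argument can be checked on simulated data. The following is a minimal sketch meant to be run as a Stata do-file; it is not the slides' data set. All variable names (id, t, z, x, y) and parameter values are made up for illustration, with a true slope of -1 and a time-invariant factor z that is correlated with x.

```stata
* Sketch: first differencing removes a time-invariant omitted variable.
* Everything here is simulated; names and parameter values are assumptions.
clear
set seed 20150101
set obs 500                       // n = 500 entities
gen id = _n
gen z  = rnormal()                // unobserved, time-invariant factor Z_i
expand 2                          // T = 2 periods per entity
bysort id: gen t = _n
gen x = 0.8*z + rnormal()         // regressor correlated with Z_i
gen y = 2 - 1*x + 3*z + rnormal() // true beta_1 = -1, beta_2 = 3

* Pooled OLS that omits z: the slope is biased because Cov(x, z) > 0 and beta_2 > 0
regress y x

* Difference regression: z drops out, so the slope should be close to -1
xtset id t
regress D.y D.x
```

The second regression keeps an intercept, as the empirical difference regression in the traffic-deaths example does; in this simulation its true value is zero.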
10-17
Example: Traffic deaths and beer taxes

1982 data:
$\widehat{FatalityRate} = 2.01 + 0.15\,BeerTax$   (n = 48)
                          (.15)   (.13)
1988 data:
$\widehat{FatalityRate} = 1.86 + 0.44\,BeerTax$   (n = 48)
                          (.11)   (.13)
Difference regression (n = 48):
$\widehat{FR_{1988} - FR_{1982}} = -.072 - 1.04\,(BeerTax_{1988} - BeerTax_{1982})$
                                   (.065)  (.36)
Standard errors are in parentheses, in the order of the coefficients.
An intercept is included in this differences regression; it allows
for the mean change in FR to be nonzero (more on this later).
10-18
ΔFatalityRate v. ΔBeerTax:
Note that the intercept is nearly zero

10-19
10-20
The Fixed Effects Model
10-21
Fixed Effects
Fixed effects (FE) explores the relationship between the
independent variables and the dependent variable within an
entity (country, state, institution, etc.).
Each entity (state) has its own individual characteristics that
may or may not influence the dependent variable.
Why use FE? Because we believe that something within the
entity (state) may bias the estimated coefficients; we need to control for
this to get unbiased estimates. FE removes the
effect of those time-invariant characteristics, so we can assess the
net effect of the independent variables on the dependent variable.
10-22
Fixed Effects Regression

(SW Section 10.3)
What if you have more than 2 time periods (T > 2)?
$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + u_{it}$, $i = 1, \dots, n$, $t = 1, \dots, T$
We can rewrite this in two useful ways:
1. n-1 binary regressor regression model
2. Fixed Effects regression model
We first rewrite this in fixed effects form. Suppose
we have n = 3 states: California, Texas, and
Massachusetts.
10-23
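$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + u_{it}$, $i = 1, \dots, n$, $t = 1, \dots, T$
Population regression for California (that is, i = CA):
$Y_{CA,t} = \beta_0 + \beta_1 X_{CA,t} + \beta_2 Z_{CA} + u_{CA,t}$
$= (\beta_0 + \beta_2 Z_{CA}) + \beta_1 X_{CA,t} + u_{CA,t}$
or
$Y_{CA,t} = \alpha_{CA} + \beta_1 X_{CA,t} + u_{CA,t}$
$\alpha_{CA} = \beta_0 + \beta_2 Z_{CA}$ doesn't change over time
$\alpha_{CA}$ is the intercept for CA, and $\beta_1$ is the slope
The intercept is unique to CA, but the slope is the
same in all the states: parallel lines.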
10-24
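For TX:
$Y_{TX,t} = \beta_0 + \beta_1 X_{TX,t} + \beta_2 Z_{TX} + u_{TX,t}$
$= (\beta_0 + \beta_2 Z_{TX}) + \beta_1 X_{TX,t} + u_{TX,t}$
or
$Y_{TX,t} = \alpha_{TX} + \beta_1 X_{TX,t} + u_{TX,t}$, where $\alpha_{TX} = \beta_0 + \beta_2 Z_{TX}$
Collecting the lines for all three states:
$Y_{CA,t} = \alpha_{CA} + \beta_1 X_{CA,t} + u_{CA,t}$
$Y_{TX,t} = \alpha_{TX} + \beta_1 X_{TX,t} + u_{TX,t}$
$Y_{MA,t} = \alpha_{MA} + \beta_1 X_{MA,t} + u_{MA,t}$
or
$Y_{it} = \alpha_i + \beta_1 X_{it} + u_{it}$, $i$ = CA, TX, MA, $t = 1, \dots, T$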
10-25
The regression lines for each state in a picture
Recall that shifts in the intercept can be represented using
binary regressors
10-26
We now put this in binary regressor form:
$Y_{it} = \beta_0 + \gamma_{CA} DCA_i + \gamma_{TX} DTX_i + \beta_1 X_{it} + u_{it}$
$DCA_i$ = 1 if state is CA, = 0 otherwise
$DTX_i$ = 1 if state is TX, = 0 otherwise
leave out $DMA_i$ (why?)
10-27
Why no dummy for Alabama?
10-28
Summary: Two ways to write the fixed effects model
1. "n-1 binary regressor" form: good for initial intuition
$Y_{it} = \beta_0 + \beta_1 X_{it} + \gamma_2 D2_i + \dots + \gamma_n Dn_i + u_{it}$
where $D2_i$ = 1 for i = 2 (state #2), = 0 otherwise, etc.
2. "Fixed effects" form: what we really use in real life
$Y_{it} = \beta_1 X_{it} + \alpha_i + u_{it}$
$\alpha_i$ is called a "state fixed effect" or "state effect": it is
the constant (fixed) effect of being in state i
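Software usually estimates the fixed effects form without creating the dummies, by subtracting entity means (the "within" transformation that gives Stata's fixed-effects output its name). A short derivation, using the same symbols as the fixed effects form above, with bars denoting averages over the $T$ periods of entity $i$:

```latex
\begin{aligned}
Y_{it} &= \beta_1 X_{it} + \alpha_i + u_{it} \\
\bar{Y}_i &= \beta_1 \bar{X}_i + \alpha_i + \bar{u}_i
   \qquad \text{(average over } t = 1, \dots, T\text{)} \\
Y_{it} - \bar{Y}_i &= \beta_1 \,(X_{it} - \bar{X}_i) + (u_{it} - \bar{u}_i)
   \qquad \text{(subtract: } \alpha_i \text{ cancels)}
\end{aligned}
```

OLS on the demeaned variables gives the same estimate of $\beta_1$ as the regression with n-1 binary regressors; with $T = 2$ it also coincides with the difference regression when that regression is run without an intercept.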
10-29
Stata FE model: Method 1
. xi: reg mrall beertax i.state
(i.state is how we create a dummy variable for each state)
i.state           _Istate_1-56        (naturally coded; _Istate_1 omitted)

      Source |       SS       df       MS              Number of obs =     336
-------------+------------------------------           F( 48,   287) =   56.97
       Model |  9.8570e-07    48  2.0535e-08           Prob > F      =  0.0000
    Residual |  1.0345e-07   287  3.6047e-10           R-squared     =  0.9050
-------------+------------------------------           Adj R-squared =  0.8891
       Total |  1.0892e-06   335  3.2512e-09           Root MSE      =  1.9e-05

------------------------------------------------------------------------------
       mrall |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |  -.0000656   .0000188    -3.49   0.001    -.0001026   -.0000286
   _Istate_2 |  -.0000568   .0000267    -2.13   0.034    -.0001093   -4.29e-06
   _Istate_3 |  -.0000655   .0000219    -2.99   0.003    -.0001086   -.0000224
   _Istate_4 |  -.0001509   .0000304    -4.96   0.000    -.0002109    -.000091
         ... |
  _Istate_49 |  -.0001759   .0000294    -5.98   0.000    -.0002338   -.0001181
  _Istate_50 |  -.0000229   .0000313    -0.73   0.466    -.0000844    .0000387
       _cons |   .0003478   .0000313    11.10   0.000     .0002861    .0004094
------------------------------------------------------------------------------
10-30
Stata Model FE: Method 2

First let Stata know you are working with panel
data by defining the entity variable (state) and
time variable (year):
. xtset state year;
       panel variable:  state (strongly balanced)
        time variable:  year, 1982 to 1988
                delta:  1 unit
10-31
Stata Model FE: Method 2

. xtreg mrall beertax, fe

Fixed-effects (within) regression               Number of obs      =       336
Group variable: state                           Number of groups   =        48

R-sq:  within  = 0.0407                         Obs per group: min =         7
       between = 0.1101                                        avg =       7.0
       overall = 0.0934                                        max =         7

                                                F(1,287)           =     12.19
corr(u_i, Xb)  = -0.6885                        Prob > F           =    0.0006

------------------------------------------------------------------------------
       mrall |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |  -.0000656   .0000188    -3.49   0.001    -.0001026   -.0000286
       _cons |   .0002377   9.70e-06    24.51   0.000     .0002186    .0002568
-------------+----------------------------------------------------------------
     sigma_u |  .00007147
     sigma_e |  .00001899
         rho |  .93408484   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(47, 287) =    52.18             Prob > F = 0.0000

The panel data command xtreg with the option fe performs fixed effects regression
(fe means "use fixed effects regression").
The reported intercept is arbitrary, and the estimated individual effects are not
reported in the default output.
10-32
Do they give different results? Which one should I pick?
Let's compare the results side by side in a table
we create from scratch:
xi: reg mrall beertax i.state
estimates store xifeno
xtreg mrall beertax, fe
estimates store xtfeno
estimates table xifeno xtfeno, b(%7.4f) se(%7.4f) t(%7.4f) stats(N r2_a)
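The same comparison can be tried on simulated data. The sketch below is meant to be run as a Stata do-file and is an illustration, not the slides' data set: the variable names (state, year, a, x, y) and parameter values are made up. It uses factor-variable notation (i.state), which assumes Stata 11 or later, in place of the xi: prefix, and adds areg as a third way to absorb the state dummies.

```stata
* Sketch: dummy-variable regression, within estimator, and areg on one simulated panel.
* Names and parameter values are assumptions made for illustration.
clear
set seed 12345
set obs 48
gen state = _n
gen a = rnormal()                  // state effect alpha_i
expand 7
bysort state: gen year = 1981 + _n // years 1982 to 1988
gen x = 0.5*a + rnormal()          // regressor correlated with alpha_i
gen y = 1 - 0.6*x + a + rnormal()  // true slope = -0.6
xtset state year

regress y x i.state                // n-1 state dummies
estimates store lsdv
xtreg y x, fe                      // within (entity-demeaned) estimator
estimates store within
areg y x, absorb(state)            // dummies absorbed rather than listed
estimates store absorbed

* Slope and standard error on x should coincide; the adjusted R-squared should not
estimates table lsdv within absorbed, keep(x) b(%9.4f) se(%9.4f) stats(N r2_a)
```

The adjusted R-squared reported by xtreg, fe is expected to differ from the other two columns, because it is computed from the within (demeaned) regression rather than from the regression that counts the state dummies as regressors.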
10-33
estimates table xifeno xtfeno, b(%7.4f) se(%7.4f) t(%7.4f) stats(N r2_a)

----------------------------------
    Variable |  xifeno    xtfeno
-------------+--------------------
     beertax | -0.0001   -0.0001
             |  0.0000    0.0000
             | -3.4915   -3.4915
   _Istate_2 | -0.0001
             |  0.0000
             | -2.1290
         ... |
  _Istate_48 | -0.0001
             |  0.0000
             | -3.6363
  _Istate_49 | -0.0002
             |  0.0000
(output truncated)

Which adjusted R² is correct, then? The one from the xi: reg ... i.state
command.
Remember: we always get the correct adjusted R² by running the xi
command, NOT xtreg!
10-34
10-35
Potential Issues:
1. Heteroskedasticity across entities i
   What is it?
   Why is it a problem?
   How do we detect it?
   How do we fix it?
2. Serial correlation across years within an entity
   What is it?
   Why is it a problem?
   How do we detect it?
   How do we fix it?
10-36
Issue 1: Heteroskedasticity
1. How do we identify if we have it? Use the xttest3 command:
xtreg (your regression), fe
xttest3
Modified Wald test for groupwise heteroskedasticity in fixed effect
regression model
H0: sigma(i)^2 = sigma^2 for all i
chi2 (48) = 4826.21
Prob>chi2 = 0.0000
What does this tell us about our null (reject / cannot reject)?
Do we have heteroskedasticity?
2. How to correct for it?
xtreg (your regression), fe robust
10-37
Issue 2: Serial Correlation
What if, for an entity i, the errors are correlated across time?
Testing for serial correlation:
Run xtserial (your regression), output
H0: no serial correlation in the idiosyncratic errors
Reject the null: we have serial correlation in the idiosyncratic errors
How to correct for it?
xtregar (your regression), fe runs the FE model with AR(1) disturbances
10-38
What we know so far:

If we have HETEROSKEDASTICITY ONLY
We find it using the xttest3 command
We correct it via xtreg ..., fe robust
If we have SERIAL CORRELATION ONLY
We find it via the xtserial test
We correct it via the xtregar ..., fe command
But what if we have both?

10-39
Valid Standard Errors

HAC (heteroskedasticity- and autocorrelation-
consistent) Standard Errors:
SE that are valid even if our error term is
heteroskedastic and serially correlated within entity.
One type of HAC:
Clustered standard errors:
Allow for heteroskedasticity and for arbitrary
autocorrelation within entity, but assume errors
are uncorrelated across entities.
10-40
Clustered Standard Errors

Clustered standard errors estimate the
variance of $\hat{\beta}_1$ when the errors are:
i.i.d. across entities
but are potentially autocorrelated within an
entity.
10-41
Clustered SEs: Implementation in STATA

. xtreg vfrall beertax, fe vce(cluster state)

R-sq:  within  = 0.0407                         Number of obs      =       336
       between = 0.1101                         Number of groups   =        48
       overall = 0.0934                         Obs per group: min =         7
                                                               avg =       7.0
                                                               max =         7
                                                F(1,47)            =      5.05
corr(u_i, Xb)  = -0.6885                        Prob > F           =    0.0294

                                 (Std. Err. adjusted for 48 clusters in state)
------------------------------------------------------------------------------
             |               Robust
      vfrall |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |  -.6558736   .2918556    -2.25   0.029    -1.243011   -.0687358
       _cons |   2.377075   .1497966    15.87   0.000     2.075723    2.678427
------------------------------------------------------------------------------

vce(cluster state) says to use clustered standard errors, where the
clustering is at the state level (observations that have the same
value of the variable "state" are allowed to be correlated, but are
assumed to be uncorrelated if the value of "state" differs)
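To see why the choice of standard errors matters, the following sketch simulates a panel whose errors and regressor are both persistent within a state and then estimates the same fixed effects regression with conventional and with clustered standard errors. It is meant to be run as a Stata do-file; all variable names and the AR(1) coefficient of 0.8 are assumptions chosen for illustration.

```stata
* Sketch: conventional vs clustered SEs when errors are serially correlated within state.
* Simulated data; names and parameter values are assumptions.
clear
set seed 2015
set obs 48
gen state = _n
gen a = rnormal()                          // state effect
expand 7
bysort state: gen year = 1981 + _n

* AR(1) error within each state: u_t = 0.8*u_{t-1} + noise
bysort state (year): gen u = rnormal() if _n == 1
bysort state (year): replace u = 0.8*u[_n-1] + rnormal() if _n > 1

* Persistent regressor built the same way
bysort state (year): gen x = rnormal() if _n == 1
bysort state (year): replace x = 0.8*x[_n-1] + rnormal() if _n > 1

gen y = 1 - 0.6*x + a + u                  // true slope = -0.6
xtset state year

xtreg y x, fe                              // conventional SEs assume no serial correlation
xtreg y x, fe vce(cluster state)           // SEs valid under within-state correlation
```

The coefficient on x is identical in the two runs; only the standard error changes. With positively autocorrelated errors and regressors, the clustered standard error is typically the larger of the two, which is the same pattern as in the beer tax output (t = -3.49 without clustering, -2.25 with it).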
10-42
Clustered SEs: Implementation in STATA

. xtreg vfrall beertax, fe vce(cluster state)
(output identical to the previous slide)
Is the R² correct?
No. Remember that the correct adjusted R² can only be obtained by running the
xi command:
xi: reg vfrall beertax i.state
10-43
What if instead the residuals are correlated across
groups, NOT within entity?
Clustered standard errors won't work since they:
Allow for heteroskedasticity and for arbitrary
autocorrelation within entity, but assume errors are
uncorrelated across entities.
We use our regular approach: identify if the
problem exists and then correct it.
1. How do we identify if our residuals are indeed
correlated across groups?
2. How do we fix it?
10-44
Issue 3: What if the residuals are correlated across groups ?
1. How do we identify if our residuals are indeed
correlated across groups?
Pesaran CD (cross-sectional dependence) test
H0: residuals are not correlated across entities.
xtreg (your regression), fe
xtcsd, pesaran abs
2. How to correct for it?
If we reject the null (Prob < 0.05), use Driscoll and Kraay
standard errors:
xtscc (your regression), fe
10-45
Summary of issues and solutions

If we have HETEROSKEDASTICITY ONLY
We find it using the xttest3 test
We correct it via the robust option
If we have SERIAL CORRELATION ONLY
We find it via the xtserial test
We correct it via the xtregar ..., fe command
If we have both HETEROSKEDASTICITY & SERIAL CORRELATION:
Use xtreg ..., fe vce(cluster id)
____________________________________
If we have ERRORS correlated ACROSS entities
We find it using the Pesaran test
We correct it via the xtscc command
10-46
10-47
Regression with Time Fixed Effects

(SW Section 10.4)
An omitted variable might vary over time but not
across states:
Safer cars (air bags, etc.); changes in national
laws
These produce intercepts that change over time
Let $S_t$ denote the combined effect of variables
that change over time but not across states ("safer
cars").
The resulting population regression model is:
$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + \beta_3 S_t + u_{it}$
10-48
Estimation with both entity and time fixed effects
$Y_{it} = \beta_1 X_{it} + \alpha_i + \lambda_t + u_{it}$
Which are the entity fixed effects?
Which are the time fixed effects?
Please explain the subscripts.
Is that an error, or why don't we have a $\beta_0$?
10-49
First generate all the time binary variables:
. gen y83=(year==1983);
. gen y84=(year==1984);
. gen y85=(year==1985);
. gen y86=(year==1986);
. gen y87=(year==1987);
. gen y88=(year==1988);
. global yeardum "y83 y84 y85 y86 y87 y88";
. xtreg vfrall beertax $yeardum, fe vce(cluster state);

                                                Number of obs      =       336
                                                Number of groups   =        48
                                                Obs per group: min =         7
       between = 0.1101                                        avg =       7.0
       overall = 0.0876                                        max =         7
corr(u_i, Xb)  = -0.6781                        Prob > F           =    0.0009

------------------------------------------------------------------------------
             |               Robust
      vfrall |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |  -.6399799   .3570783    -1.79   0.080    -1.358329    .0783691
         y83 |  -.0799029   .0350861    -2.28   0.027    -.1504869   -.0093188
         y84 |  -.0724206   .0438809    -1.65   0.106    -.1606975    .0158564
         y85 |  -.1239763   .0460559    -2.69   0.010    -.2166288   -.0313238
         y86 |  -.0378645   .0570604    -0.66   0.510    -.1526552    .0769262
         y87 |  -.0509021   .0636084    -0.80   0.428    -.1788656    .0770615
         y88 |  -.0518038   .0644023    -0.80   0.425    -.1813645    .0777568
       _cons |    2.42847   .2016885    12.04   0.000     2.022725    2.834215
-------------+----------------------------------------------------------------
10-50
Are the time effects jointly statistically significant?
First Method:
. test $yeardum;
 ( 1)  y83 = 0
 ( 2)  y84 = 0
 ( 3)  y85 = 0
 ( 4)  y86 = 0
 ( 5)  y87 = 0
 ( 6)  y88 = 0
       F(  6,    47) =    4.22
            Prob > F =    0.0018
Yes
10-51
Do we need Time Fixed Effects?

Second method:
Joint test of whether the dummies for all years are equal to 0; if they
are, then no time fixed effects are needed.
testparm _Iyear*
Reject the null that all time coefficients are equal to zero,
so we do need the time fixed effects
10-52
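The _Iyear* variables tested by testparm only exist if the regression was run with the xi: prefix and i.year. The sketch below shows both that route and the factor-variable route (assuming Stata 11 or later) on simulated data. It is meant to be run as a do-file; the variable names and the size of the common time effect are assumptions made for illustration.

```stata
* Sketch: testing time fixed effects with xi-generated dummies and with factor variables.
* Simulated data; names and parameter values are assumptions.
clear
set seed 1988
set obs 48
gen state = _n
gen a = rnormal()                          // state effect
expand 7
bysort state: gen year = 1981 + _n
gen timefx = -0.05*(year - 1982)           // common time effect (safer cars, say)
gen x = 0.5*a + rnormal()
gen y = 2 - 0.6*x + a + timefx + rnormal()
xtset state year

* Older syntax: xi creates _Iyear_1983 ... _Iyear_1988 (1982 is the omitted year)
xi: xtreg y x i.year, fe vce(cluster state)
testparm _Iyear*

* Factor-variable syntax: no xi prefix and no generated dummies needed
xtreg y x i.year, fe vce(cluster state)
testparm i.year
```

Both testparm calls test the same null, that all year coefficients are zero, so they should report the same F statistic.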

1. The data set originally comes with one column listing the
drinking age for each state. How do you create the variables
Drinking Age 18, Drinking Age 19 and Drinking Age 20 for
models (4) and (5)?
2. Why would you create them in the first place?
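One possible way to build the drinking-age binaries is sketched below on a toy data set. The variable name mlda (minimum legal drinking age), the toy values, and the choice of one-year bins are all assumptions; check the actual variable name and coding in the course data set before reusing this.

```stata
* Sketch: turning one drinking-age column into three binary regressors.
* Toy data; the name mlda and the bin boundaries are assumptions.
clear
input state mlda
1 18
2 18.5
3 19
4 20
5 21
end

gen da18 = (mlda >= 18 & mlda < 19)   // 1 if drinking age is 18, 0 otherwise
gen da19 = (mlda >= 19 & mlda < 20)   // 1 if drinking age is 19, 0 otherwise
gen da20 = (mlda >= 20 & mlda < 21)   // 1 if drinking age is 20, 0 otherwise
list
```

A drinking age of 21 is left without its own binary, so it serves as the omitted (base) category, for the same reason one state dummy is left out of the binary regressor form of the fixed effects model.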
10-53

3. What is income in real values? How do we obtain those
relative to nominal values?
4. Why is real income per capita in log terms?
10-54
Under the LS assumptions for panel data:
The OLS fixed effects estimator $\hat{\beta}_1$ is unbiased,
consistent, and asymptotically normally distributed.
However, the usual OLS standard errors (both
homoskedasticity-only and heteroskedasticity-robust) will in general be wrong because they
assume that $u_{it}$ is serially uncorrelated.
In practice, the OLS standard errors often understate the
true sampling uncertainty: if $u_{it}$ is correlated over time,
you don't have as much information (as much random
variation) as you would if $u_{it}$ were uncorrelated.
This problem is solved by using "clustered" standard
errors.
10-55

1. What is the difference between each of the models?
2. Please interpret in words each of the coefficients in column (4).
3. How would you test if time effects are needed?
4. How can you tell this is a fixed effects model?
10-56
10-57
The Random Effects Model
10-58
The Random Effects Model

Recall that the fixed effects model is based on the
assumption that each cross-sectional unit has its
own intercept
The random effects model instead is based on the
assumption that the intercept for each cross-sectional unit is drawn from a distribution (that is
centered around a mean intercept)
Thus each intercept is a random draw from an
"intercept distribution" and therefore is independent of
the error term for any particular observation
Hence the term "random effects model"
10-59
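This was fixed effects, from a few slides ago.
Random effects would mean a different estimated slope: the lines stay parallel,
but the intercepts are treated as random draws rather than fixed parameters.
10-60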
The Random Effects Model (cont.)

Advantages of the random effects model:
1. more degrees of freedom than a fixed effects model
This is because rather than estimating an intercept for
virtually every cross-sectional unit, all we need to do is to
estimate the parameters that describe the distribution of
the intercepts.
2. Can now also estimate the coefficients on time-invariant explanatory
variables (like race or gender).
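A rough count for the traffic-deaths panel used in these slides illustrates the degrees-of-freedom point. The FE numbers match the earlier Stata output (residual df = 287, as in F(1,287)); the RE line is a simplified assumption that treats the intercept distribution as described by a mean and a variance.

```latex
\begin{aligned}
nT &= 48 \times 7 = 336 \ \text{observations} \\
\text{FE:}\quad & 48 \ \text{intercepts} + 1 \ \text{slope} = 49 \ \text{coefficients},
   \qquad 336 - 49 = 287 \ \text{residual degrees of freedom} \\
\text{RE:}\quad & 1 \ \text{mean intercept} + 1 \ \text{slope} = 2 \ \text{coefficients},
   \ \text{plus the variance of the intercepts}
\end{aligned}
```

The gap grows with the number of entities: every additional state costs the FE model one more intercept, while the RE model's parameter count stays the same.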
Disadvantages of the random effects model:
1. Most importantly, the random effects estimator requires us to
assume that $a_i$ (the entity-specific effect, written $\alpha_i$ in the fixed effects slides) is uncorrelated with
the independent variables, the Xs, if we're going to avoid
omitted variable bias.
This may be an overly strong assumption in many cases.
10-61
Random Effects
Random effects assumes that the group error term is not
correlated with the independent variables, which allows
time-invariant variables to play a role as explanatory
variables: you can include time-invariant variables (e.g.,
gender). In the fixed effects model these variables are
absorbed by the intercept.
However, you need to specify those individual
characteristics that may or may not influence the
independent variables. Problem: some variables may not be
available (e.g., culture), leading to omitted variable bias.
10-62
Remember our FE model results

. xtreg vfrall beertax, fe vce(cluster state)

                                                Number of obs      =       336
       between = 0.1101                         Number of groups   =        48
       overall = 0.0934                         Obs per group: min =         7
                                                               avg =       7.0
                                                               max =         7
                                                F(1,47)            =      5.05
corr(u_i, Xb)  = -0.6885                        Prob > F           =    0.0294

------------------------------------------------------------------------------
             |               Robust
      vfrall |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |  -.6558736   .2918556    -2.25   0.029    -1.243011   -.0687358
       _cons |   2.377075   .1497966    15.87   0.000     2.075723    2.678427
------------------------------------------------------------------------------
10-63
Running our RE model results

. xtreg vfrall beertax, re vce(cluster state)

Random-effects GLS regression                   Number of obs      =       336
                                                Number of groups   =        48

R-sq:  within  = 0.0407                         Obs per group: min =         7
       between = 0.1101                                        avg =       7.0
       overall = 0.0934                                        max =         7

                                                Wald chi2(1)       =      0.22
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.6373

------------------------------------------------------------------------------
             |               Robust
      vfrall |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |  -.0520158   .1103327    -0.47   0.637    -.2682638    .1642323
       _cons |   2.067141   .1212281    17.05   0.000     1.829539    2.304744
-------------+----------------------------------------------------------------
Are the coefficients different from the FE model? Why?
Can you still add in time fixed effects? Sure!
10-64
10-65
Random Effects Potential Issues
Everything we learned under fixed effects still applies:
1. Heteroskedasticity: quick review within your group:
a) what is it,
b) how do we check for it,
c) how do we correct for it
2. Serial Correlation: quick review within your group:
a) what is it,
3. Both Heteroskedasticity & Serial Correlation:
a) how do we correct for it
4. Cross-Sectional Dependence: quick review:
a) what is it,
10-66
Random Effects TIME Fixed Effects
Everything we learned under fixed effects still applies:
1. Can we still use Time Fixed Effects?
2. How would you introduce them into the
regression? Same as under Fixed Effects.
First generate all the time binary variables:
. gen y83=(year==1983);
. gen y84=(year==1984);
. gen y85=(year==1985);
. gen y86=(year==1986);
. gen y87=(year==1987);
. gen y88=(year==1988);
. global yeardum "y83 y84 y85 y86 y87 y88";
. xtreg vfrall beertax $yeardum, re vce(cluster state);
3. How would you check if Time FE are needed?
Same as under Fixed Effects.
. test $yeardum
10-67
10-68
Choosing Between Fixed and Random Effects
One key is the nature of the relationship between $a_i$ and the Xs:
If they're likely to be correlated, then it makes sense to use
the fixed effects model
If not, then it makes sense to use the random effects model
We can also use the Hausman test to examine whether there is
correlation between $a_i$ and X
Essentially, this procedure tests whether the regression
coefficients under the fixed effects and random effects models are
statistically different from each other
If they are different, then the fixed effects model is
preferred
If they are not different, then the random effects
model is preferred (or estimates of both the fixed effects and
random effects models are provided)
10-69
Choosing FE vs RE: Method 1: xtoverid

FE: the independent variables are uncorrelated with the idiosyncratic error
(orthogonality conditions), but could be correlated with the group-specific error
RE: additionally, the independent variables are uncorrelated with the group-specific
error (extra orthogonality conditions)
These additional orthogonality conditions are overidentifying
restrictions, which xtoverid tests:
xtoverid
H0: the independent variables are uncorrelated with the group-specific
error (the extra RE orthogonality conditions)
P-value < 0.05: reject the null, so the independent variables are correlated
with the group-specific error (use FE)
10-70
Choosing FE vs RE: Method 1: xtoverid

FE:
xtreg (your regression), fe vce(cluster i) runs the FE model
RE:
xtreg (your regression), re vce(cluster i) runs the RE model
xtoverid
H0: the independent variables are uncorrelated with the group-specific
error (the extra RE orthogonality conditions)

10-71
Choosing FE vs RE: Method 2: hausman

FE:
xtreg (your regression), fe vce(cluster i) runs the FE model
estimates store fe
RE:
xtreg (your regression), re vce(cluster i) runs the RE model
estimates store re
hausman fe re, sigmaless
H0: same as before (RE is preferred to FE)

10-72

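What is wrong with this Hausman Test?
xtreg (your regression), fe vce(cluster i)
estimates store FE
xtreg (your regression), re vce(cluster i)
estimates store RE
hausman FE RE, sigmaless
STATA RESULT:
ERROR: hausman cannot be used with vce(robust), vce(cluster cvar),
or p-weighted data
Correct:
xtreg (your regression), fe
estimates store FE
xtreg (your regression), re
estimates store RE
hausman FE RE, sigmaless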
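The full sequence, without robust or clustered standard errors, can be tried end to end on simulated data. The sketch below is meant to be run as a Stata do-file. The data are constructed so that the entity effect is correlated with the regressor, which is exactly the case in which the random effects assumption fails; all variable names and parameter values are assumptions made for illustration.

```stata
* Sketch: Hausman test workflow on a simulated panel where RE should be rejected.
* Names and parameter values are assumptions.
clear
set seed 777
set obs 48
gen state = _n
gen a = rnormal()                   // entity effect
expand 7
bysort state: gen year = 1981 + _n
gen x = 0.8*a + rnormal()           // x is correlated with the entity effect
gen y = 2 - 0.6*x + a + rnormal()   // true slope = -0.6
xtset state year

xtreg y x, fe                       // no vce(robust) or vce(cluster): hausman needs this
estimates store FE
xtreg y x, re
estimates store RE
hausman FE RE, sigmaless            // consistent estimator (FE) is listed first
```

Setting the coefficient on a in the line that generates x to zero makes the random effects assumption hold by construction, in which case the two slope estimates should be close and the test should usually fail to reject.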
10-73

We run the correct Hausman test and get the following
result. Should we choose FE or RE?

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |       FE           RE         Difference          S.E.
-------------+----------------------------------------------------------------
     beertax |   -.6558736    -.0520158       -.6038579        .1435348
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(1) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =       17.70
                Prob>chi2 =      0.0000
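With a single coefficient the Hausman statistic reduces to a squared ratio, so the reported value can be reproduced by hand from the numbers in the table:

```latex
\begin{aligned}
b - B &= -0.6558736 - (-0.0520158) \approx -0.60386 \\
\chi^2(1) &= \left( \frac{b - B}{\sqrt{V_b - V_B}} \right)^{2}
           = \left( \frac{-0.6038579}{0.1435348} \right)^{2}
           \approx (-4.207)^2 \approx 17.70
\end{aligned}
```

The 5% critical value of a chi-squared distribution with one degree of freedom is 3.84, so 17.70 rejects the null that the difference in coefficients is not systematic. The FE and RE estimates differ systematically, which points to the fixed effects model.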
10-74
