Regression with Panel Data
Outline
1. Panel Data: What and Why
2. Panel Data with Two Time Periods
3. The FIXED effects Model:
1. What is it
2. Potential issues
3. TIME Fixed Effects
2. Variables that change over time but are the same for all
individuals in a given time period:
e.g., the retail price index and the national unemployment rate
Variables:
Traffic fatality rate (# traffic deaths in that state in that
year, per 10,000 state residents)
Tax on a case of beer
Other (legal driving age, drunk driving laws, etc.)
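1982 data:
$$\widehat{FatalityRate} = \underset{(.15)}{2.01} + \underset{(.13)}{0.15}\,BeerTax \qquad (n = 48)$$
1988 data:
$$\widehat{FatalityRate} = \underset{(.11)}{1.86} + \underset{(.13)}{0.44}\,BeerTax \qquad (n = 48)$$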
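Differences regression:
$$\widehat{FR_{1988} - FR_{1982}} = \hat{\beta}_0 + \hat{\beta}_1\,(BeerTax_{1988} - BeerTax_{1982})$$
with standard errors (.065) for the intercept and (.36) for the slope.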
Including an intercept in this differences regression allows the mean change in FR to be nonzero (more on this later).
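Why differencing helps can be illustrated with a small numerical sketch. The numbers below are made up for illustration (they are not the traffic fatality data), and the helper `ols` is a hypothetical name introduced here. Each state's outcome equals an unobserved state effect plus a true slope of -1 times the regressor. Because the state effect is correlated with the regressor, a single-year regression gets the sign wrong, while the differences regression recovers the slope.

```python
# Illustrative sketch with made-up numbers (NOT the traffic fatality data).
# Outcome: y_it = alpha_i + beta * x_it, where alpha_i is an unobserved,
# time-invariant state effect. Differencing two periods removes alpha_i.

def ols(x, y):
    """Return (intercept, slope) of a one-regressor OLS fit."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope

beta = -1.0                        # hypothetical true effect
alpha = [3.0, 2.0, 1.0, 0.5]       # hypothetical state effects
x_first = [2.0, 1.5, 1.0, 0.5]     # first period; correlated with alpha
x_second = [2.5, 1.7, 1.6, 0.6]    # second period

y_first = [a + beta * x for a, x in zip(alpha, x_first)]
y_second = [a + beta * x for a, x in zip(alpha, x_second)]

# Single-period regression: the omitted state effect biases the slope.
_, slope_cross = ols(x_first, y_first)

# Differences regression: the state effect cancels out.
dx = [b - a for a, b in zip(x_first, x_second)]
dy = [b - a for a, b in zip(y_first, y_second)]
intercept_diff, slope_diff = ols(dx, dy)

print("single-period slope:", slope_cross)
print("differences slope:  ", slope_diff)
```

In this noise-free example the differences slope equals the true value up to rounding error, and the intercept of the differences regression is zero because nothing else changes between the two periods.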
FatalityRate v. BeerTax:
Outline
1. Panel Data: What and Why DONE
2. Panel Data with Two Time Periods DONE
3. The FIXED effects Model:
1. What is it
2. Potential issues
3. TIME Fixed Effects
Fixed Effects
Fixed-effects (FE) models explore the relationship between the
independent variables and the dependent variable within an
entity (country, state, institution, etc.).
Each entity (state) has its own individual characteristics that
may or may not influence the dependent variable.
Why use FE? Because we believe that something within the
entity (state) may bias the estimated effects; we need to control
for this to get unbiased estimates. FE therefore removes the
effect of those time-invariant characteristics, so we can assess
the net effect of the independent variables.
Copyright 2015 Pearson, Inc. All rights reserved.
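For TX:
$$Y_{TX,t} = \beta_0 + \beta_1 X_{TX,t} + \beta_2 Z_{TX} + u_{TX,t}$$
$$\phantom{Y_{TX,t}} = (\beta_0 + \beta_2 Z_{TX}) + \beta_1 X_{TX,t} + u_{TX,t}$$
or
$$Y_{TX,t} = \alpha_{TX} + \beta_1 X_{TX,t} + u_{TX,t}, \quad \text{where } \alpha_{TX} = \beta_0 + \beta_2 Z_{TX}, \quad t = 1, \dots, T$$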
where $D2_i = 1$ for state 2 and $0$ otherwise, etc.
i.state           _Istate_1-56        (naturally coded; _Istate_1 omitted)

      Source |       SS       df       MS              Number of obs =     336
-------------+------------------------------           F(48, 287)    =   56.97
       Model |  9.8570e-07    48  2.0535e-08           Prob > F      =  0.0000
    Residual |  1.0345e-07   287  3.6047e-10           R-squared     =  0.9050
-------------+------------------------------           Adj R-squared =  0.8891
       Total |  1.0892e-06   335  3.2512e-09           Root MSE      = 1.9e-05

------------------------------------------------------------------------------
       mrall |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |  -.0000656   .0000188    -3.49   0.001    -.0001026   -.0000286
   _Istate_2 |  -.0000568   .0000267    -2.13   0.034    -.0001093   -4.29e-06
   _Istate_3 |  -.0000655   .0000219    -2.99   0.003    -.0001086   -.0000224
   _Istate_4 |  -.0001509   .0000304    -4.96   0.000    -.0002109    -.000091
           . |
  _Istate_49 |  -.0001759   .0000294    -5.98   0.000    -.0002338   -.0001181
  _Istate_50 |  -.0000229   .0000313    -0.73   0.466    -.0000844    .0000387
       _cons |   .0003478   .0000313    11.10   0.000     .0002861    .0004094
------------------------------------------------------------------------------
                                                Number of obs      =       336
                                                Number of groups   =        48
                                                Obs per group: min =         7
                                                               avg =       7.0
                                                               max =         7
                                                F(1,287)           =     12.19
corr(u_i, Xb)  = -0.6885                        Prob > F           =    0.0006

------------------------------------------------------------------------------
       mrall |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |  -.0000656   .0000188    -3.49   0.001    -.0001026   -.0000286
       _cons |   .0002377   9.70e-06    24.51   0.000     .0002186    .0002568
-------------+----------------------------------------------------------------
     sigma_u |  .00007147
     sigma_e |  .00001899
         rho |  .93408484   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(47, 287) =    52.18             Prob > F = 0.0000
The panel data command xtreg with the option fe performs fixed effects regression
(the fe option means "use fixed effects regression").
The reported intercept is arbitrary, and the estimated individual effects are not
reported in the default output.
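The two outputs above report the same beertax coefficient because regressing on state dummies and regressing on entity-demeaned data are two routes to the same estimate. The sketch below illustrates this on a made-up three-entity panel (the data and the names `panel`, `beta_fe`, and `alpha` are hypothetical, introduced only for this example). It computes the slope from entity-demeaned data, backs out the implied entity intercepts, and then checks that the resulting residuals satisfy the normal equations of the dummy-variable regression: they sum to zero within each entity and are orthogonal to the regressor.

```python
# Made-up balanced panel: 3 entities, 3 periods (illustration only).
panel = {
    "A": {"x": [1.0, 2.0, 3.0], "y": [5.1, 4.0, 2.9]},
    "B": {"x": [2.0, 3.0, 5.0], "y": [7.2, 5.9, 4.1]},
    "C": {"x": [0.5, 1.0, 1.5], "y": [2.0, 1.6, 0.9]},
}

def mean(values):
    return sum(values) / len(values)

# Step 1: slope from entity-demeaned data (the "within" estimator).
num = 0.0
den = 0.0
for data in panel.values():
    xbar, ybar = mean(data["x"]), mean(data["y"])
    for x, y in zip(data["x"], data["y"]):
        num += (x - xbar) * (y - ybar)
        den += (x - xbar) ** 2
beta_fe = num / den

# Step 2: implied entity-specific intercepts.
alpha = {name: mean(d["y"]) - beta_fe * mean(d["x"]) for name, d in panel.items()}

# Step 3: residuals of the model y_it = alpha_i + beta * x_it.
resid = {
    name: [y - alpha[name] - beta_fe * x for x, y in zip(d["x"], d["y"])]
    for name, d in panel.items()
}

# Normal equations of the dummy-variable regression:
# residuals sum to zero within each entity ...
within_sums = {name: sum(r) for name, r in resid.items()}
# ... and are orthogonal to the regressor.
x_dot_resid = sum(
    x * e for name, d in panel.items() for x, e in zip(d["x"], resid[name])
)

print("within slope:", beta_fe)
print("entity intercepts:", alpha)
```

Because the normal equations hold, no other choice of slope and entity intercepts gives a smaller sum of squared residuals, so the dummy-variable regression would return the same slope.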
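Comparison of the two regressions (values rounded to four decimals):

    Variable |    Coef.   Std. Err.         t
-------------+---------------------------------
     beertax |               0.0000    -3.4915   (same in both regressions)
   _Istate_2 |  -0.0001      0.0000    -2.1290
  _Istate_48 |  -0.0001      0.0000    -3.6363
  _Istate_49 |  -0.0002      0.0000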
Outline
1. Panel Data: What and Why DONE
2. Panel Data with Two Time Periods DONE
3. The FIXED effects Model:
1. What is it DONE
2. Potential issues
3. TIME Fixed Effects
Potential Issues:
1. Heteroskedasticity across entities i
a) What is it?
b) Why is it a problem?
c) How do we detect it?
d) How do we fix it?
2. Serial correlation across years within an entity
a) What is it?
b) Why is it a problem?
c) How do we detect it?
d) How do we fix it?
Issue 1: Heteroskedasticity
Issue 2: Serial Correlation
What if, for an entity i, the errors are correlated across time?
Run xtserial (your regression), output
H0: no serial correlation in the idiosyncratic errors
If we reject the null, we have serial correlation in the idiosyncratic errors.
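A common remedy when errors are correlated over time within an entity is to cluster the standard errors by entity, which is what the vce(cluster state) option in the Stata commands in these slides requests. The sketch below shows the idea for a fixed effects regression with a single regressor. It is a simplified illustration on made-up data: the function name `within_slope_with_ses` is hypothetical, and the clustered formula omits the finite-sample correction that statistical packages typically apply, so its numbers would not match Stata exactly.

```python
import math

def within_slope_with_ses(panel):
    """panel maps entity -> (x_list, y_list).

    Returns (beta, se_conventional, se_clustered) for the entity-demeaned
    regression. The clustered SE has no finite-sample correction.
    """
    demeaned = {}
    for name, (xs, ys) in panel.items():
        xbar = sum(xs) / len(xs)
        ybar = sum(ys) / len(ys)
        demeaned[name] = ([x - xbar for x in xs], [y - ybar for y in ys])

    sxx = sum(x * x for xs, _ in demeaned.values() for x in xs)
    sxy = sum(x * y for xs, ys in demeaned.values() for x, y in zip(xs, ys))
    beta = sxy / sxx

    n_obs = sum(len(xs) for xs, _ in demeaned.values())
    n_entities = len(demeaned)
    ssr = 0.0   # sum of squared residuals
    meat = 0.0  # sum over entities of (sum_t x_it * e_it)^2
    for xs, ys in demeaned.values():
        resid = [y - beta * x for x, y in zip(xs, ys)]
        ssr += sum(e * e for e in resid)
        score = sum(x * e for x, e in zip(xs, resid))
        meat += score ** 2

    # Conventional SE: assumes homoskedastic, serially uncorrelated errors.
    se_conventional = math.sqrt(ssr / (n_obs - n_entities - 1) / sxx)
    # Clustered SE: allows arbitrary correlation within each entity.
    se_clustered = math.sqrt(meat) / sxx
    return beta, se_conventional, se_clustered

# Made-up panel: two entities, three periods each.
panel = {
    "state_1": ([0.0, 1.0, 2.0], [3.0, 2.0, 1.6]),
    "state_2": ([1.0, 2.0, 3.0], [5.0, 4.3, 3.0]),
}
beta, se_conv, se_clu = within_slope_with_ses(panel)
print(beta, se_conv, se_clu)
```

The point estimate is the same under both approaches; only the standard error, and therefore the test statistics and confidence intervals, changes.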
                                                Number of obs      =       336
                                                Number of groups   =        48
                                                Obs per group: min =         7
                                                               avg =       7.0
                                                               max =         7
                                                F(1,47)            =      5.05
corr(u_i, Xb)  = -0.6885                        Prob > F           =    0.0294
Is the R2 correct?
No. Remember that the correct adjusted R2 can only be obtained by running the
regression with the xi command.
Outline
1. Panel Data: What and Why DONE
2. Panel Data with Two Time Periods DONE
3. The FIXED effects Model:
1. What is it DONE
2. Potential issues DONE
3. TIME Fixed Effects
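#delimit ;
/* First generate all the time binary variables (1982 is the omitted base year) */
gen y83=(year==1983);
gen y84=(year==1984);
gen y85=(year==1985);
gen y86=(year==1986);
gen y87=(year==1987);
gen y88=(year==1988);
global yeardum "y83 y84 y85 y86 y87 y88";
/* Fixed effects regression with year dummies and state-clustered standard errors */
xtreg vfrall beertax $yeardum, fe vce(cluster state);
/* Joint test that all year dummies are zero */
test $yeardum;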
 ( 1)  y83 = 0
 ( 2)  y84 = 0
 ( 3)  y85 = 0
 ( 4)  y86 = 0
 ( 5)  y87 = 0
 ( 6)  y88 = 0

       F(  6,    47) =    4.22
            Prob > F =    0.0018
Yes: the year dummies are jointly significant (F = 4.22, p = 0.0018).
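What the year dummies accomplish can be illustrated with a small sketch. For a balanced panel, including both entity and time dummies gives the same slope as "two-way demeaning": subtracting the entity mean and the period mean from each observation and adding back the grand mean. The data below are made up (not the fatality data), all function names are hypothetical, and the equivalence as written assumes a balanced panel. The outcome contains an entity effect, a time effect, and a true slope of -1; entity demeaning alone leaves the time effect in place and misses the slope, while two-way demeaning recovers it.

```python
def mean(values):
    return sum(values) / len(values)

def slope(x_rows, y_rows):
    """OLS slope (no intercept) on data that have already been demeaned."""
    num = sum(x * y for xr, yr in zip(x_rows, y_rows) for x, y in zip(xr, yr))
    den = sum(x * x for xr in x_rows for x in xr)
    return num / den

def entity_demean(rows):
    """Subtract each entity's own mean (rows = entities, columns = periods)."""
    return [[v - mean(row) for v in row] for row in rows]

def two_way_demean(rows):
    """Subtract entity and period means, add back the grand mean (balanced panel)."""
    n_periods = len(rows[0])
    grand = mean([v for row in rows for v in row])
    row_means = [mean(row) for row in rows]
    col_means = [mean([row[t] for row in rows]) for t in range(n_periods)]
    return [
        [row[t] - row_means[i] - col_means[t] + grand for t in range(n_periods)]
        for i, row in enumerate(rows)
    ]

beta = -1.0                 # hypothetical true slope
alpha = [1.0, 2.0, 3.0]     # hypothetical entity effects
lam = [0.0, 0.5, 2.0]       # hypothetical time effects
x = [[1.0, 2.0, 4.0], [2.0, 2.0, 3.0], [0.0, 3.0, 3.0]]
y = [[alpha[i] + lam[t] + beta * x[i][t] for t in range(3)] for i in range(3)]

beta_entity_only = slope(entity_demean(x), entity_demean(y))
beta_two_way = slope(two_way_demean(x), two_way_demean(y))

print("entity effects only:     ", beta_entity_only)
print("entity and time effects: ", beta_two_way)
```

In this example the regressor trends upward along with the time effect, so leaving the time effect out biases the entity-only slope toward zero.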
2. Why would you create them in the first place?
Outline
1. Panel Data: What and Why DONE
2. Panel Data with Two Time Periods DONE
3. The FIXED effects Model:
1. What is it DONE
2. Potential issues DONE
3. TIME Fixed Effects DONE
Random Effects
Random effects assume that the group error term is not
correlated with the independent variables, which allows
time-invariant variables (e.g., gender) to play a role as
explanatory variables. In the fixed effects model these
variables are absorbed by the intercept.
However, you need to specify those individual
characteristics that may or may not influence the
independent variables. Problem: some variables may not be
available (e.g., culture), leading to omitted variable bias.
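The random effects assumption can be written compactly. The block below is a sketch of the standard textbook formulation rather than a formula taken from these slides. Here $\alpha_i$ is a mean-zero entity-specific error term (Stata's output labels it u_i), $u_{it}$ is the idiosyncratic error, and the key assumption is the zero correlation on the right, which Stata reports as "corr(u_i, X) = 0 (assumed)".

```latex
\[
Y_{it} = \beta_0 + \beta_1 X_{it} + \alpha_i + u_{it},
\qquad
\operatorname{corr}(\alpha_i, X_{it}) = 0
\]
```

Fixed effects regression does not impose this restriction: it allows $\alpha_i$ to be correlated with $X_{it}$.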
                                                Number of obs      =       336
                                                Number of groups   =        48
                                                Obs per group: min =         7
                                                               avg =       7.0
                                                               max =         7
                                                Wald chi2(1)       =      0.22
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.6373

                                 (Std. Err. adjusted for 48 clusters in state)
------------------------------------------------------------------------------
             |               Robust
      vfrall |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |  -.0520158   .1103327    -0.47   0.637    -.2682638    .1642323
       _cons |   2.067141   .1212281    17.05   0.000     1.829539    2.304744
-------------+----------------------------------------------------------------
Outline
1. Panel Data: What and Why DONE
2. Panel Data with Two Time Periods DONE
3. The FIXED effects Model:
1. What is it DONE
2. Potential issues DONE
3. TIME Fixed Effects DONE
4. Cross-Sectional Dependence (quick review):
a) what is it,
b) how do we check for it,
c) how do we correct for it
gen y83=(year==1983);
gen y84=(year==1984);
gen y85=(year==1985);
gen y86=(year==1986);
gen y87=(year==1987);
gen y88=(year==1988);
global yeardum "y83 y84 y85 y86 y87 y88";
xtreg vfrall beertax $yeardum, re vce(cluster state);
Ho: