Review Material

Xiaoxia Shi
December 14, 2010
1. OLS
(a) Bivariate regression:

        Y_i = β_0 + β_1 X_i + ε_i.                                      (1)

    Estimates:

        β̂_1 = Σ_{i=1}^n (X_i − X̄)(Y_i − Ȳ) / Σ_{i=1}^n (X_i − X̄)²,     (2)

        β̂_0 = Ȳ − β̂_1 X̄.                                               (3)

    Assumptions: linear form, random sampling, no multicollinearity,
    exogeneity (E(ε_i | X_i) = 0), homoskedasticity (Var(ε_i | X_i) = σ²).
    Under exogeneity, β̂_1 is unbiased: E(β̂_1 | X) = β_1.
    Under homoskedasticity and exogeneity, β̂_1 is BLUE (Gauss-Markov
    Theorem). Variance of β̂_1:

        Var(β̂_1 | X) = σ² / Σ_{i=1}^n (X_i − X̄)².                       (4)
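The slope and intercept formulas (2)-(3) can be checked directly in code. A minimal pure-Python sketch on simulated data (the design and the true values β_0 = 1, β_1 = 2 are made up for illustration):

```python
import random

random.seed(0)
n = 10_000
x = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 + 2.0 * xi + random.gauss(0, 1) for xi in x]  # true beta0=1, beta1=2

xbar, ybar = sum(x) / n, sum(y) / n

# Equation (2): slope = sum (X_i - Xbar)(Y_i - Ybar) / sum (X_i - Xbar)^2.
beta1_hat = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
             / sum((xi - xbar) ** 2 for xi in x))
# Equation (3): intercept = Ybar - beta1_hat * Xbar.
beta0_hat = ybar - beta1_hat * xbar

print(beta0_hat, beta1_hat)  # both close to the true (1, 2)
```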
(b) Multivariate regression:

        Y_i = β_0 + β_1 X_{1i} + β_2 X_{2i} + ... + β_k X_{ki} + ε_i.   (5)

    i. OLS estimator:

        β̂_j = Σ_{i=1}^n r̂_{ji} Y_i / Σ_{i=1}^n r̂_{ji}²,                (6)

    where r̂_{ji} is the residual from the regression of X_j on the rest of
    the regressors (including the constant).
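Formula (6) can be sketched as a two-step "partialling out" computation. The design below (Y = 1 + 2·X1 + 3·X2 + ε with correlated regressors) is made up for illustration; with two regressors, "the rest of the regressors" is just the constant and X2:

```python
import random

random.seed(1)
n = 20_000
x2 = [random.gauss(0, 1) for _ in range(n)]
x1 = [0.5 * v + random.gauss(0, 1) for v in x2]  # X1 correlated with X2
y = [1.0 + 2.0 * a + 3.0 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

def ols(x, y):
    """Bivariate OLS: return (intercept, slope)."""
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    b1 = (sum((a - xb) * (c - yb) for a, c in zip(x, y))
          / sum((a - xb) ** 2 for a in x))
    return yb - b1 * xb, b1

# r_hat: residuals from regressing X1 on the constant and X2.
a0, a1 = ols(x2, x1)
r = [xi - (a0 + a1 * zi) for xi, zi in zip(x1, x2)]

# Equation (6): beta1_hat = sum r_i * Y_i / sum r_i^2.
beta1_hat = sum(ri * yi for ri, yi in zip(r, y)) / sum(ri * ri for ri in r)
print(beta1_hat)  # close to the multivariate coefficient 2
```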
    ii. For a general k ∈ {1, 2, ...}: with the same assumptions as in the
    bivariate case, we get the same conclusions about β̂_1, β̂_2, ..., β̂_k.
    iii. Interpretation: now let X_i denote the vector (X_{1i}, ..., X_{ki})'.
    With E(ε_i | X_i) = 0, β̂_j is interpreted as the causal effect of X_{ji}
    on Y_i holding the other covariates constant. Without E(ε_i | X_i) = 0,
    β̂_j cannot be interpreted as a causal effect, but it can be interpreted
    as a correlation: the correlation between X_{ji} and Y_i controlling for
    the other covariates.
(c) Multicollinearity: one of the regressors (including the constant, if
    there is one) can be written as a linear function of the other
    regressors. One example: the dummy variable trap. If a set of dummy
    variables sums to 1 (which equals the constant term), one of the dummy
    variables has to be left out of the regression (the base group).
(d) Heteroskedasticity: Var(ε_i | X_i) = σ²(X_i). Consequences: the OLS
    estimator is no longer efficient, and STATA reports the "wrong standard
    error". The OLS estimator is still unbiased. We can use robust standard
    errors (which are justified in large samples).
    Testing for heteroskedasticity: visual inspection, the Breusch-Pagan
    test, and the White test.
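The Breusch-Pagan idea can be sketched as follows: regress the squared OLS residuals on X and use LM = n·R², which is asymptotically χ²(1) under homoskedasticity. The heteroskedastic design below is made up for illustration:

```python
import random

random.seed(2)
n = 5_000
x = [random.uniform(0.5, 2.0) for _ in range(n)]
# Heteroskedastic errors: the conditional sd of the error grows with X.
y = [1.0 + 2.0 * xi + random.gauss(0, xi) for xi in x]

def ols(x, y):
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    b1 = (sum((a - xb) * (c - yb) for a, c in zip(x, y))
          / sum((a - xb) ** 2 for a in x))
    return yb - b1 * xb, b1

b0, b1 = ols(x, y)
usq = [(yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)]  # squared residuals

# Auxiliary regression of u^2 on X; its R^2 gives the LM statistic.
g0, g1 = ols(x, usq)
ubar = sum(usq) / n
ss_tot = sum((u - ubar) ** 2 for u in usq)
ss_res = sum((u - (g0 + g1 * xi)) ** 2 for u, xi in zip(usq, x))
lm = n * (1 - ss_res / ss_tot)
print(lm)  # far above the 5% chi2(1) critical value 3.84: reject homoskedasticity
```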
(e) Weighted least squares: a way to get an estimator with smaller variance
    than the OLS estimator in the presence of heteroskedasticity.
(f) Endogeneity: E(ε_i | X_i) ≠ 0. The OLS estimator is biased. In the
    bivariate case:

        E(β̂_1 | X) = β_1 + Σ_i r_i E(ε_i | X) / Σ_i r_i²,              (7)

    where r_i = X_i − X̄. The direction of the bias is determined by the
    correlation between ε_i and X_i.
2. Testing (inference).
(a) t-test. In small samples, the t-statistic has a t-distribution with n − k
    degrees of freedom (the error term is assumed to be normally
    distributed). In large samples, the normal-error assumption can be
    dropped and the t-statistic has an asymptotic normal distribution.
(b) F-test: to test multiple restrictions, e.g. H_0: β_1 = 0 and β_2 = 0.
    Steps to compute the F-statistic.
(c) Test for heteroskedasticity.
(d) Test for autocorrelation.
3. Endogeneity, proxy variables, and 2SLS
(a) Endogeneity: omitted variables, measurement error, simultaneity.
    i. Omitted variable:

        Y_i = β_0 + β_1 X_{1i} + (β_2 X_{2i} + ε_i),  E(ε_i | X_{1i}) = 0, (8)

    in which X_{2i} is not observed. The direction of the bias is determined
    by

        cov(β_2 X_{2i} + ε_i, X_{1i}) = cov(β_2 X_{2i}, X_{1i})
                                      = β_2 cov(X_{2i}, X_{1i}).        (9)

    When β_2 ≠ 0 AND cov(X_{2i}, X_{1i}) ≠ 0, the OLS estimator of β_1 is
    biased.
    ii. Measurement error:

        Y_i = β_0 + β_1 X_i + ε_i,  E(ε_i | X_i) = 0,                  (10)

        X*_i = X_i + e_i,                                              (11)

    where X_i is not observed. If we use X*_i instead, we run the regression:

        Y_i = β_0 + β_1 X*_i + v_i,                                    (12)

    where v_i = ε_i − β_1 e_i.
    Classical measurement error assumptions: E(e_i | X_i) = 0, E(e_i ε_i) = 0,
    Var(e_i | X_i) = σ²_e.
    The direction of the bias is determined by

        cov(v_i, X*_i) = cov(ε_i − β_1 e_i, X_i + e_i)
                       = cov(ε_i, X_i) + cov(ε_i, e_i)
                         − β_1 cov(e_i, X_i) − β_1 cov(e_i, e_i)
                       = 0 + 0 − 0 − β_1 cov(e_i, e_i)
                       = −β_1 σ²_e,                                    (13)

    which always has the opposite sign of β_1. Thus the bias is always toward
    zero, and is called attenuation bias.
    Classical measurement error in the dependent variable does not cause bias
    or inconsistency.
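The attenuation result (13) is easy to see in a simulation. The values below are made up: β_1 = 2, Var(X) = 1, Var(e) = 1, so the OLS slope on X* should converge to β_1·Var(X)/(Var(X) + Var(e)) = 1:

```python
import random

random.seed(3)
n = 50_000
beta1 = 2.0
x = [random.gauss(0, 1) for _ in range(n)]        # true regressor, Var(X) = 1
xstar = [xi + random.gauss(0, 1) for xi in x]     # observed X* = X + e, Var(e) = 1
y = [beta1 * xi + random.gauss(0, 1) for xi in x]

# OLS of Y on the mismeasured X*.
xb, yb = sum(xstar) / n, sum(y) / n
b1 = (sum((a - xb) * (c - yb) for a, c in zip(xstar, y))
      / sum((a - xb) ** 2 for a in xstar))
print(b1)  # attenuated toward zero: near 1, not the true 2
```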
    iii. Simultaneity:

        Y_{1i} = β_0 + β_1 Y_{2i} + ε_i,
        Y_{2i} = α_0 + α_1 Y_{1i} + v_i,  cov(ε_i, v_i) = 0.           (14)

        ⇒ Y_{2i} = α_0 + α_1 (β_0 + β_1 Y_{2i} + ε_i) + v_i            (15)

        ⇒ Y_{2i} = [1 / (1 − α_1 β_1)] (α_0 + α_1 β_0 + α_1 ε_i + v_i) (16)

    Thus,

        cov(ε_i, Y_{2i}) = cov(ε_i, [1 / (1 − α_1 β_1)] (α_1 ε_i + v_i))
                         = α_1 Var(ε_i) / (1 − α_1 β_1).               (17)

    Typically, 1 − α_1 β_1 > 0 (stable system requirement). Then the sign of
    the bias in the OLS estimator β̂_1 is the same as the sign of α_1.
(b) Proxy variable: for endogeneity caused by omitted variables. A proxy
    variable should mimic the omitted variable: it should have a similar
    impact on Y as the omitted variable does, and it can be correlated with
    the other regressors.
    Examples: IQ score for ability; adding a time trend to the regression
    when a regressor and/or the dependent variable has a trend.
(c) 2SLS or IV.
    i. In a regression

        Y_i = β_0 + β_1 X_{1i} + β_2 X_{2i} + ... + β_K X_{Ki} + ε_i,  (18)

    if X_{Ki} is endogenous (and all the other regressors are exogenous), and
    if we have an instrumental variable Z_i that satisfies
    cov(X_{Ki}, Z_i) ≠ 0 and cov(Z_i, ε_i) = 0, then we can use the following
    regression:

        X_{Ki} = δ_0 + δ_1 Z_i + v_i,                                  (19)

    to decompose X_{Ki} into a part that is not correlated with ε_i,
    (δ_0 + δ_1 Z_i), and a part that is correlated with ε_i, (v_i). Then use
    the first part instead of X_{Ki} in the original regression.
    In practice, this is the 2SLS procedure: regress X_{Ki} on Z_i; save the
    predicted value X̂_{Ki}; use X̂_{Ki} instead of X_{Ki} in the original
    regression.
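The two-stage procedure above can be sketched in a few lines. The endogenous design below (X correlated with the error ε, Z shifting X but independent of ε, true β_1 = 2) is made up for illustration:

```python
import random

random.seed(4)
n = 50_000
z = [random.gauss(0, 1) for _ in range(n)]
eps = [random.gauss(0, 1) for _ in range(n)]
# X is endogenous: it loads on eps; Z moves X but is independent of eps.
x = [0.8 * zi + ei + random.gauss(0, 1) for zi, ei in zip(z, eps)]
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, eps)]

def ols(x, y):
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    b1 = (sum((a - xb) * (c - yb) for a, c in zip(x, y))
          / sum((a - xb) ** 2 for a in x))
    return yb - b1 * xb, b1

# First stage: regress X on Z and save the predicted values.
d0, d1 = ols(z, x)
xhat = [d0 + d1 * zi for zi in z]
# Second stage: regress Y on X-hat.
_, b_2sls = ols(xhat, y)
_, b_ols = ols(x, y)  # biased upward here, since cov(X, eps) > 0
print(b_ols, b_2sls)  # OLS is off; 2SLS is close to the true 2
```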
    ii. (Weak instrument) If cov(X_{Ki}, Z_i) is small, then δ_1 is small,
    and X̂_{Ki} = δ̂_0 + δ̂_1 Z_i is almost constant (little variation). Then
    in the regression

        Y_i = β_0 + β_1 X_{1i} + β_2 X_{2i} + ... + β_K X̂_{Ki} + w_i,

    the variance of β̂_K is very large.
    iii. (Fishy instrument) cov(Z_i, ε_i) ≠ 0. Then the "exogenous part" is
    not exogenous, and X̂_{Ki} is not asymptotically uncorrelated with w_i.
    So the 2SLS estimator is inconsistent.
    iv. Bivariate regression case:

        Y_i = β_0 + β_1 X_i + ε_i,  E(ε_i | X_i) ≠ 0.                  (20)

    Given a valid instrument Z_i, the IV estimator of β_1 is

        β̂_1^IV = Σ_i (Z_i − Z̄_n) Y_i / Σ_i (Z_i − Z̄_n) X_i.           (21)

    Assume Var(ε_i | Z_i) = σ²_ε. The asymptotic variance of β̂_1^IV is

        Avar(β̂_1^IV) = σ²_ε / (n σ²_x ρ²_xz).
    v. In practice, it is recommended to use all the exogenous regressors in
    the first-stage regression:

        X_{Ki} = δ_0 + δ_1 X_{1i} + ... + δ_{K−1} X_{(K−1)i} + δ_K Z_i + v_i. (22)

    vi. The 2SLS estimator is biased, because δ_0, δ_1 are unknown and can
    only be estimated. But it is consistent if cov(X_{Ki}, Z_i) ≠ 0 and
    cov(Z_i, ε_i) = 0.
4. Panel data
(a) Panel data and endogeneity. Panel data can be used to deal with endo-
geneity caused by unobserved time invariant preferences. By doing First
dierence or Fixed eect regression, we eliminate the latent time-invariant
preference from the error term.
For example, the individual xed ability can be dierenced out when
estimating return to education.
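The differencing idea can be sketched with a made-up two-period panel in which the fixed effect a_i is correlated with the regressor (so pooled OLS would be biased), with true β_1 = 2:

```python
import random

random.seed(5)
n, beta1 = 5_000, 2.0
num = den = 0.0
for _ in range(n):
    a = random.gauss(0, 1)                    # time-invariant heterogeneity
    x1 = a + random.gauss(0, 1)               # X correlated with a
    x2 = a + random.gauss(0, 1)
    y1 = beta1 * x1 + a + random.gauss(0, 1)  # a sits in the error term
    y2 = beta1 * x2 + a + random.gauss(0, 1)
    dx, dy = x2 - x1, y2 - y1                 # differencing removes a
    num += dx * dy
    den += dx * dx
b_fd = num / den  # no-intercept OLS on the differenced data
print(b_fd)  # close to the true 2
```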
(b) Panel data and serially correlated errors. Individual fixed
    (time-invariant) heterogeneity, if not differenced out, stays in the
    error term and causes serial correlation of the error term:

        ν_{it} = u_i + ε_{it},  cov(ν_{it}, ν_{is}) = Var(u_i) ≠ 0.    (23)

    Solutions: (1) use clustered standard errors;
    (2) use random effects:

        Y_{it} − λȲ_i = β_0 (1 − λ) + β_1 (X_{it} − λX̄_i) + (ν_{it} − λν̄_i), (24)

    where λ satisfies

        cov(ν_{it} − λν̄_i, ν_{is} − λν̄_i) = 0.                        (25)

    Here λ = 1 − sqrt(σ²_ε / (T σ²_u + σ²_ε)).
5. Treatment effects, experiments, and natural experiments
(a) How much does a new drug reduce the risk of heart attack? How much does a
    job training program increase a person's chance of getting a job? Etc.
    The effects of the new drug and of the job training program are treatment
    effects.
(b) Ideally, we want to randomly assign the treatment to some people and not
    to others. If all people comply and there is no spillover effect, we can
    estimate the treatment effect easily by comparing the treated and the
    untreated after treatment. (Note that we do the comparison by running a
    regression of the dependent variable on the "treatment" dummy. Running
    the regression gives us a test statistic at the same time.) This approach
    is called a randomized experiment (like what scientists do in their
    labs).
(c) Not-truly-random experiments and natural experiments. The treated group
    and the untreated (control) group are not necessarily similar a priori.
    With only one cross section of data, we cannot do anything. With panel
    data or pooled cross sections, DID estimation is possible. For DID to be
    valid, we need (a) no spillover effects and (b) the natural growth paths
    of the treatment group and the control group to be parallel.
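With the four group-period cell means in hand, the DID estimate is a double difference: the treatment group's change minus the control group's change (which estimates the common trend). The numbers below are made up for illustration:

```python
# Mean outcomes for the four group-period cells (hypothetical values).
treat_pre, treat_post = 10.0, 15.0  # treatment group, before/after
ctrl_pre, ctrl_post = 9.0, 11.0     # control group, before/after

# Subtract the control group's change (the common trend) from the
# treatment group's change.
did = (treat_post - treat_pre) - (ctrl_post - ctrl_pre)
print(did)  # 3.0
```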
6. Time series:
(a) Static models, FDL models.
(b) Strict exogeneity and unbiasedness.
(c) Contemporaneous exogeneity and consistency.
(d) Stationarity and weak dependence.
(e) Random walk.
7. Technical stuff:
(a) Change in percent(age) = (change / initial value) × 100%.
(b) Change in percentage points is the "change × 100" when the dependent
    variable is 0-1.
(c) Drawing the fitted line of Y on X controlling for other covariates.
    Step 1: run the regression

        Y_i = β_0 + β_1 X_i + γ · (other covariates) + ε_i.            (26)

    Step 2: draw the function ŷ = β̂_0 + β̂_1 X_i.
(d) The marginal effect of X on Y: if X is continuous,

        ∂Y / ∂X.

    Notice the nonlinearity and interaction terms in the regression.
    If X is binary:

        E(Y | X = 1) − E(Y | X = 0).

    Dummy dependent variables. LPM:

        Y = β_0 + β_1 X + ε  ⇒  Pr(Y = 1 | X) = β_0 + β_1 X,           (27)

        ∂Pr(Y = 1 | X) / ∂X = β_1.                                     (28)

    Nonlinear probability models:

        Pr(Y = 1 | X) = F(β_0 + β_1 X),                                (29)

        ∂Pr(Y = 1 | X) / ∂X = β_1 F'(β_0 + β_1 X).                     (30)
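As a concrete instance of (30), take F to be the logistic cdf, for which F'(t) = F(t)(1 − F(t)); the coefficients and the evaluation point below are hypothetical:

```python
import math

def F(t):
    """Logistic cdf."""
    return 1.0 / (1.0 + math.exp(-t))

b0, b1 = -0.5, 1.2   # hypothetical logit coefficients
x = 0.8              # point at which the marginal effect is evaluated

# Equation (30): beta1 * F'(b0 + b1*x), with F' = F * (1 - F) for the logit.
marginal = b1 * F(b0 + b1 * x) * (1.0 - F(b0 + b1 * x))
print(marginal)
```

Unlike the LPM in (27)-(28), the marginal effect here varies with x, so it is usually reported at a representative value or averaged over the sample.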