Jean Flemming
jean.c.flemming@gmail.com
Consider the linear regression model
\[
Y_i = \alpha + \beta X_i + U_i, \qquad i = 1, \dots, n,
\]
where $E[U_i \mid X_i] = 0$, $(X_i, Y_i)$ are i.i.d., $0 < E[X_i^4] < \infty$, $0 < E[U_i^4] < \infty$, and $\mathrm{Var}[U_i \mid X_i] = \sigma^2$.
Solution: (a) The OLS estimators of $\alpha$ and $\beta$ solve
\[
\min_{\alpha,\beta} Q_n(\alpha,\beta) = \min_{\alpha,\beta} \frac{1}{n}\sum_{i=1}^{n}(Y_i - \alpha - \beta X_i)^2.
\]
The first-order conditions are
\[
0 = \frac{1}{n}\sum_{i=1}^{n}(Y_i - \hat\alpha - \hat\beta X_i), \qquad
0 = \frac{1}{n}\sum_{i=1}^{n}X_i(Y_i - \hat\alpha - \hat\beta X_i).
\]
Solving the first condition for $\hat\alpha$ we get
\[
\hat\alpha = \frac{1}{n}\sum_{i=1}^{n}(Y_i - \hat\beta X_i) = \bar Y - \hat\beta \bar X,
\]
and substituting into the second condition:
\[
0 = \frac{1}{n}\left(\sum_{i=1}^{n}X_iY_i - \bar Y\sum_{i=1}^{n}X_i + \hat\beta\bar X\sum_{i=1}^{n}X_i - \hat\beta\sum_{i=1}^{n}X_iX_i\right),
\]
so that
\[
\hat\beta = \frac{\sum_{i=1}^{n}(X_i-\bar X)(Y_i-\bar Y)}{\sum_{i=1}^{n}(X_i-\bar X)^2}.
\]
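As a quick numerical sanity check, the closed-form expressions above can be compared against a generic least-squares solver on simulated data; the parameter values below are arbitrary:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)   # arbitrary alpha = 1, beta = 2

# Closed-form OLS from the first-order conditions above
beta_hat = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
alpha_hat = y.mean() - beta_hat * x.mean()

# Generic least-squares solver for comparison
coef, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), x]), y, rcond=None)
assert np.allclose([alpha_hat, beta_hat], coef)
\end{verbatim}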
(b) WTS: $E(\hat\beta) = \beta$. Substituting $Y_i = \alpha + \beta X_i + U_i$ (so that $\bar Y = \alpha + \beta\bar X + \bar U$) into the formula for $\hat\beta$,
\[
\hat\beta = \beta + \frac{\sum_{i=1}^{n}(X_i - \bar X)(U_i - \bar U)}{\sum_{i=1}^{n}(X_i - \bar X)^2},
\]
so
\[
E(\hat\beta) = E\left[\beta + \frac{\sum_{i=1}^{n}(X_i - \bar X)(U_i - \bar U)}{\sum_{i=1}^{n}(X_i - \bar X)^2}\right] = \beta,
\]
where the last step follows from $E[U_i \mid X_i] = 0$ and the LIE.
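A small Monte Carlo experiment makes the unbiasedness claim concrete: averaging $\hat\beta$ across many simulated samples should recover $\beta$ (the sample size, number of replications, and true parameters here are arbitrary):
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
beta, n, reps = 2.0, 50, 20_000

draws = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    y = beta * x + rng.normal(size=n)   # alpha = 0 for simplicity
    xc = x - x.mean()
    draws[r] = (xc * (y - y.mean())).sum() / (xc ** 2).sum()

print(draws.mean())   # close to beta = 2.0, up to Monte Carlo error
\end{verbatim}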
(c) Using properties of the conditional variance and the fact that $\mathrm{Var}[U_i \mid X_i] = \sigma^2$:
\[
\mathrm{Var}(\hat\beta \mid X_i)
= \mathrm{Var}\!\left[\frac{\sum_{i=1}^{n}(X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^{n}(X_i - \bar X)^2} \,\middle|\, X_i\right]
= \mathrm{Var}\!\left[\beta + \frac{\sum_{i=1}^{n}(X_i - \bar X)(U_i - \bar U)}{\sum_{i=1}^{n}(X_i - \bar X)^2} \,\middle|\, X_i\right]
\]
\[
= \frac{1}{\left[\sum_{i=1}^{n}(X_i - \bar X)^2\right]^2}\,
\mathrm{Var}\!\left[\sum_{i=1}^{n}(X_i - \bar X)(U_i - \bar U) \,\middle|\, X_i\right]
= \frac{\sigma^2}{\sum_{i=1}^{n}(X_i - \bar X)^2},
\]
where the last equality uses $\sum_{i=1}^{n}(X_i - \bar X)(U_i - \bar U) = \sum_{i=1}^{n}(X_i - \bar X)U_i$ together with the i.i.d. assumption.
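The conditional-variance formula can likewise be checked by holding one draw of $X_1,\dots,X_n$ fixed and redrawing only the errors; this sketch uses arbitrary values of $n$ and $\sigma$:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(2)
n, sigma, reps = 50, 1.5, 20_000

x = rng.normal(size=n)                  # fixed design across replications
xc = x - x.mean()
theory = sigma ** 2 / (xc ** 2).sum()   # sigma^2 / sum of (Xi - Xbar)^2

draws = np.empty(reps)
for r in range(reps):
    y = 2.0 * x + sigma * rng.normal(size=n)
    draws[r] = (xc * (y - y.mean())).sum() / (xc ** 2).sum()

print(draws.var(), theory)              # the two should be close
\end{verbatim}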
(d) Define $\tilde Y_i = Y_i - \bar Y$, $\tilde X_i = X_i - \bar X$, $\tilde U_i = U_i - \bar U$, and consider the following three linear regressions without a constant:
\[
\tilde Y_i = \beta_1 \tilde X_i + \epsilon_i, \qquad
Y_i = \beta_2 \tilde X_i + \epsilon_i, \qquad
\tilde Y_i = \beta_3 X_i + \epsilon_i.
\]
Without a constant, the regression coefficient is given by the following expressions for each regression:
\[
\hat\beta_1 = \frac{\sum_{i=1}^{n}\tilde Y_i \tilde X_i}{\sum_{i=1}^{n}\tilde X_i^2}, \qquad
\hat\beta_2 = \frac{\sum_{i=1}^{n}Y_i \tilde X_i}{\sum_{i=1}^{n}\tilde X_i^2}, \qquad
\hat\beta_3 = \frac{\sum_{i=1}^{n}\tilde Y_i X_i}{\sum_{i=1}^{n}X_i^2}.
\]
Expanding these expressions gives:
\[
\hat\beta_1 = \frac{\sum_{i=1}^{n}(Y_i-\bar Y)(X_i-\bar X)}{\sum_{i=1}^{n}(X_i-\bar X)^2} = \hat\beta,
\]
\[
\hat\beta_2 = \frac{\sum_{i=1}^{n}Y_i(X_i-\bar X)}{\sum_{i=1}^{n}(X_i-\bar X)^2}
= \frac{\sum_{i=1}^{n}Y_iX_i - \bar X\sum_{i=1}^{n}Y_i}{\sum_{i=1}^{n}(X_i-\bar X)^2}
= \frac{\sum_{i=1}^{n}Y_iX_i - n\bar X\bar Y}{\sum_{i=1}^{n}(X_i-\bar X)^2}
= \frac{\sum_{i=1}^{n}(Y_i-\bar Y)(X_i-\bar X)}{\sum_{i=1}^{n}(X_i-\bar X)^2} = \hat\beta,
\]
\[
\hat\beta_3 = \frac{\sum_{i=1}^{n}X_i(Y_i-\bar Y)}{\sum_{i=1}^{n}X_i^2}
= \frac{\sum_{i=1}^{n}(Y_i-\bar Y)(X_i-\bar X)}{\sum_{i=1}^{n}X_i^2}.
\]
Hence $\hat\beta_1 = \hat\beta_2 = \hat\beta$, while $\hat\beta_3$ has the uncentered sum $\sum_{i=1}^{n}X_i^2$ in the denominator and therefore differs from $\hat\beta$ unless $\bar X = 0$.
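The three no-constant regressions are easy to compare numerically; the following sketch, with arbitrary simulated data, confirms that $\hat\beta_1 = \hat\beta_2 = \hat\beta$ while $\hat\beta_3$ visibly differs once $\bar X \neq 0$:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
n = 200
x = 5.0 + rng.normal(size=n)            # nonzero mean so beta3 differs
y = 1.0 + 2.0 * x + rng.normal(size=n)
yt, xt = y - y.mean(), x - x.mean()     # the tilde variables

beta_hat = (xt * yt).sum() / (xt ** 2).sum()
beta1 = (yt * xt).sum() / (xt ** 2).sum()
beta2 = (y * xt).sum() / (xt ** 2).sum()
beta3 = (yt * x).sum() / (x ** 2).sum()

assert np.allclose([beta1, beta2], beta_hat)
print(beta3, beta_hat)                  # beta3 != beta_hat in general
\end{verbatim}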
Show that, in a regression where one observation of a regressor is missing, replacing the missing value with the mean of the complete observations has the same effect as adding a dummy variable for the incomplete observation.
Solution: Consider the $n \times 2$ matrix of regressors (including a constant)
\[
X = [1, Z].
\]
For simplicity and WLOG assume that the first element of $Z$ is missing. Adding a dummy for the incomplete observation, we can write the design matrix as
\[
X = [X_1, X_2],
\]
where $X_1 = [1, Z_o]$ is $n \times 2$, $X_2 = [1, 0, \dots, 0]'$ is $n \times 1$, and $Z_o = [0, z_2, \dots, z_n]'$ replaces the missing value $z_1$ with 0.
By the Frisch-Waugh theorem, in the regression
\[
Y = X\beta + U = X_1\beta_1 + X_2\beta_2 + U,
\]
\[
\hat\beta_1 = (X_1' M_2 X_1)^{-1}(X_1' M_2 Y),
\]
where
\[
M_2 = I_n - X_2(X_2'X_2)^{-1}X_2' =
\begin{pmatrix}
0 & 0 & \cdots & 0\\
0 & 1 & \ddots & \vdots\\
\vdots & \ddots & \ddots & 0\\
0 & \cdots & 0 & 1
\end{pmatrix},
\]
i.e. $I_n$ with the first 1 removed. Hence
\[
\hat\beta_1 = \left(X_1'X_1 - X_1'
\begin{pmatrix}
1 & 0 & \cdots & 0\\
0 & 0 & \ddots & \vdots\\
\vdots & \ddots & \ddots & 0\\
0 & \cdots & 0 & 0
\end{pmatrix} X_1\right)^{-1}
\left(X_1'Y - X_1'
\begin{pmatrix}
1 & 0 & \cdots & 0\\
0 & 0 & \ddots & \vdots\\
\vdots & \ddots & \ddots & 0\\
0 & \cdots & 0 & 0
\end{pmatrix} Y\right),
\]
which removes the first observation from every sum: $\hat\beta_1$ is exactly the OLS estimator computed on observations $2, \dots, n$, i.e. with the incomplete observation dropped.
Comparing the coefficients of determination of the full regression and of the regression on the complete observations,
\[
R^2 = \frac{\frac{1}{n}\sum_{i=1}^{n}(\hat Y_i - \bar Y)^2}{\frac{1}{n}\sum_{i=1}^{n}(Y_i - \bar Y)^2},
\qquad
\tilde R^2 = \frac{\frac{1}{n-1}\sum_{i=2}^{n}(\hat Y_i - \tilde Y)^2}{\frac{1}{n-1}\sum_{i=2}^{n}(Y_i - \tilde Y)^2},
\]
where $\tilde Y = \frac{1}{n-1}\sum_{i=2}^{n} Y_i$. Since $\bar Y \neq \tilde Y$, $R^2 \neq \tilde R^2$.
Now consider replacing the missing observation with the mean of the other $n-1$ observations:
\[
\tilde z_1 = \frac{1}{n-1}\sum_{i=2}^{n} z_i.
\]
Replacing the column $Z_o$ with deviations from this mean, we regress $Y$ on $\tilde X = [1, \tilde Z]$, where $\tilde z_i = z_i - \frac{1}{n-1}\sum_{j=2}^{n} z_j$. Again the first element of $\tilde Z$ is zero, since $\tilde z_1 = 0$, and the slope coefficient from the regression is given by
\[
\hat\beta_1 = (\tilde X_1'\tilde X_1)^{-1}(\tilde X_1'Y),
\]
which is the same as dropping the first observation as before. Thus the coefficients coincide with those from the previous cases.
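Both equivalences, the dummy-variable regression reproducing the complete-case slope and mean imputation doing the same, can be verified directly; the data below are simulated and the helper function is illustrative:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(4)
n = 40
z = rng.normal(size=n)
y = 1.0 + 2.0 * z + rng.normal(size=n)  # z[0] treated as missing below

def coefs(design, target):
    """OLS coefficient vector via least squares."""
    c, *_ = np.linalg.lstsq(design, target, rcond=None)
    return c

# (i) Drop the first observation
b_drop = coefs(np.column_stack([np.ones(n - 1), z[1:]]), y[1:])[1]

# (ii) Keep all n observations, impute the complete-case mean for z[0]
z_imp = z.copy()
z_imp[0] = z[1:].mean()
b_mean = coefs(np.column_stack([np.ones(n), z_imp]), y)[1]

# (iii) Keep all n observations, add a dummy for the first observation
dummy = np.zeros(n); dummy[0] = 1.0
z0 = z.copy(); z0[0] = 0.0
b_dummy = coefs(np.column_stack([np.ones(n), z0, dummy]), y)[1]

assert np.allclose([b_mean, b_dummy], b_drop)
\end{verbatim}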
Consider the linear model $Y_i = X_i'\beta + U_i$, $i = 1, \dots, n$.
Recall the hypotheses and the conclusions of the Gauss-Markov Theorem. Explain why each hypothesis is important and what "estimator," "linear," "unbiased," and "best" exactly mean. Then, for each of the following cases, state which hypothesis is violated (if any) and what the consequences on the conclusions are (if any):
(a) $E[(Y_i - X_i'\beta)^2 \mid X_i] = X_i\sigma^2$,
(b) $X_i = (1, X_{1i}, X_{2i}, X_{1i}X_{2i})^T$,
(c) $X_i = (1, X_{1i}, X_{2i}, X_{1i} - X_{2i})^T$,
(d) $U_i = Z_i + \epsilon_i$, with $E[Z_i \mid X_i] \neq 0$ and $E[\epsilon_i \mid X_i] = 0$,
(e) $U_i = Z_i + \epsilon_i$, with $E[Z_i \mid X_i] = 0$ and $E[\epsilon_i \mid X_i] = 0$.
Solution:
(a) Since $\mathrm{Var}(Y \mid X) \neq \sigma^2 I_n$, the errors are (conditionally) heteroskedastic. This implies that the OLS estimator, although still unbiased, is no longer efficient. Moreover, the classical estimator $\widehat{\mathrm{Var}}(\hat\beta) = s^2(X'X)^{-1}$ is biased for $\mathrm{Var}(\hat\beta)$, so we must be careful when doing inference.
(b) No hypothesis is violated here, since the columns of $X$ are linearly independent: the interaction $X_{1i}X_{2i}$ is not a linear combination of the other regressors.
(c) Since the last regressor is a linear combination of two other variables ($X_{1i} - X_{2i}$), the columns of $X$ are not linearly independent, so $X'X$ is not invertible.
(d) When $E[Z_i \mid X_i] \neq 0$ and $E[\epsilon_i \mid X_i] = 0$, using the LIE it can be shown that $E(\hat\beta) \neq \beta$, i.e. $\hat\beta$ is not unbiased.
(e) When $E[Z_i \mid X_i] = 0$ and $E[\epsilon_i \mid X_i] = 0$, $\hat\beta$ is unbiased; no hypothesis is violated here.
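For case (d), a short simulation with arbitrary parameter values shows the bias that arises when $E[Z_i \mid X_i] \neq 0$:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(5)
beta, n, reps = 1.0, 100, 5_000

draws = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    z = 0.5 * x + rng.normal(size=n)    # E[Z|X] = 0.5 X != 0
    u = z + rng.normal(size=n)          # U = Z + eps
    y = beta * x + u
    xc = x - x.mean()
    draws[r] = (xc * (y - y.mean())).sum() / (xc ** 2).sum()

print(draws.mean())   # roughly beta + 0.5 = 1.5, not beta = 1.0
\end{verbatim}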
For each of the following estimators $\hat\beta$ of $\beta$ (with $c$ a constant), determine whether it is consistent:
\[
\begin{array}{ccc}
\text{Estimator} & \text{expected value} & \text{variance} \\
1 & \beta & c/n \\
2 & \beta & c\,\frac{2n+1}{4n} \\
3 & \beta + \sqrt{c} & c\,\frac{2n+1}{4n^2} \\
4 & \beta + c/\sqrt{n} & c\,\frac{2n+1}{4n^2}
\end{array}
\]
Solution: In order for the estimator to be consistent we need that, for every $\epsilon > 0$,
\[
\lim_{n\to\infty} \Pr(|\hat\beta - \beta| \geq \epsilon) = 0.
\]
By Chebyshev's inequality, a sufficient condition is
\[
\lim_{n\to\infty} \mathrm{MSE}(\hat\beta) = \lim_{n\to\infty}\left\{\mathrm{Var}(\hat\beta) + [\mathrm{Bias}(\hat\beta)]^2\right\} = 0.
\]
Hence,
Estimator 1: $\lim_{n\to\infty} \mathrm{MSE}(\hat\beta) = \lim_{n\to\infty} \frac{c}{n} = 0$. Consistent.
Estimator 2: $\lim_{n\to\infty} \mathrm{MSE}(\hat\beta) = \lim_{n\to\infty} c\,\frac{2n+1}{4n} = \frac{c}{2}$. Inconsistent.
Estimator 3: $\lim_{n\to\infty} \mathrm{MSE}(\hat\beta) = \lim_{n\to\infty}\left(c + c\,\frac{2n+1}{4n^2}\right) = c$. Inconsistent.
Estimator 4: $\lim_{n\to\infty} \mathrm{MSE}(\hat\beta) = \lim_{n\to\infty}\left(\frac{c^2}{n} + c\,\frac{2n+1}{4n^2}\right) = 0$. Consistent.
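A quick numeric check of the four MSE expressions as $n$ grows, with an arbitrary value of $c$:
\begin{verbatim}
c = 2.0
for n in (10, 1_000, 100_000):
    mse = [c / n,                                        # estimator 1 -> 0
           c * (2 * n + 1) / (4 * n),                    # estimator 2 -> c/2
           c + c * (2 * n + 1) / (4 * n ** 2),           # estimator 3 -> c
           c ** 2 / n + c * (2 * n + 1) / (4 * n ** 2)]  # estimator 4 -> 0
    print(n, [round(m, 5) for m in mse])
\end{verbatim}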
5. Let $\hat U_i$ be the $i$th residual in the OLS regression of $Y_i$ on $X_i$, and let $U_i$ be the corresponding regression error. Show that if the OLS estimator is consistent for the population regression parameter, then $\mathrm{plim}\,(\hat U_i - U_i) = 0$.
Solution: Since $Y_i = \beta X_i + U_i$,
\[
\hat U_i = Y_i - \hat\beta X_i = U_i + (\beta - \hat\beta)X_i.
\]
If the OLS estimator is consistent, $\hat\beta \xrightarrow{p} \beta$ as $n \to \infty$, and therefore $\hat U_i \xrightarrow{p} U_i$.
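The convergence can be illustrated by tracking the largest gap between residuals and errors as the simulated sample grows (true $\beta$ below is arbitrary):
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(6)
beta = 2.0
for n in (10, 1_000, 100_000):
    x = rng.normal(size=n)
    u = rng.normal(size=n)
    y = beta * x + u
    beta_hat = (x * y).sum() / (x ** 2).sum()   # no-constant OLS, as in the model
    u_hat = y - beta_hat * x
    print(n, np.abs(u_hat - u).max())           # = |beta - beta_hat| * max|Xi| -> 0
\end{verbatim}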
6. Production data are available for a random sample of $n = 22$ firms. Let $Y_i$ and $X_i$ be, respectively, the log output and the log (units of) labor for firm $i$. Moreover, let
\[
\bar Y = 20, \quad \sum_{i=1}^{n}(Y_i - \bar Y)^2 = 100, \quad \bar X = 10, \quad \sum_{i=1}^{n}(X_i - \bar X)^2 = 60, \quad \text{and} \quad \sum_{i=1}^{n}(X_i - \bar X)(Y_i - \bar Y) = 30.
\]
(a) Estimate $\alpha$ and $\beta$ by OLS in the following model:
\[
Y_i = \alpha + \beta X_i + U_i, \qquad i = 1, \dots, n,
\]
where $E[X_iU_i] = 0$ and $U_i \sim N(0, \sigma^2)$; $(X_i, Y_i)$ are i.i.d. What is the interpretation of $\hat\beta$ in terms of labor and output?
(b) Compute the following sample moments:
\[
\sum_{i=1}^{n} Y_i^2, \qquad \sum_{i=1}^{n} X_i^2, \qquad \sum_{i=1}^{n} X_iY_i.
\]
Solution:
(a) The OLS estimates are
\[
\hat\beta = 30/60 = 0.5, \qquad \hat\alpha = 20 - 0.5 \times 10 = 15.
\]
$\hat\beta$ is the estimated elasticity of output with respect to labor, that is, the expected percentage change in output as a consequence of a one-percent change in units of labor.
(b)
\[
\sum (Y_i - \bar Y)^2 = 100 \;\Longrightarrow\; \sum Y_i^2 = 100 + n\bar Y^2 = 8900,
\]
\[
\sum (X_i - \bar X)^2 = 60 \;\Longrightarrow\; \sum X_i^2 = 60 + n\bar X^2 = 2260,
\]
\[
\sum (Y_i - \bar Y)(X_i - \bar X) = 30 \;\Longrightarrow\; \sum Y_iX_i = 30 + n\bar Y\bar X = 4430.
\]
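The arithmetic in (a) and (b) can be reproduced directly from the given summary statistics:
\begin{verbatim}
n = 22
y_bar, x_bar = 20.0, 10.0
syy, sxx, sxy = 100.0, 60.0, 30.0    # centered sums of squares / cross-products

beta_hat = sxy / sxx                  # 0.5
alpha_hat = y_bar - beta_hat * x_bar  # 15.0

sum_y2 = syy + n * y_bar ** 2         # 8900
sum_x2 = sxx + n * x_bar ** 2         # 2260
sum_xy = sxy + n * x_bar * y_bar      # 4430
print(beta_hat, alpha_hat, sum_y2, sum_x2, sum_xy)
\end{verbatim}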
(c)
\[
TSS = \sum_{i=1}^{n}(Y_i - \bar Y)^2 = 100,
\]
\[
ESS = \sum_{i=1}^{n}(\hat Y_i - \bar Y)^2 = n\,\widehat{\mathrm{var}}(\hat Y_i) = n\,\widehat{\mathrm{var}}(\hat\alpha + \hat\beta X_i) = \hat\beta^2 \times 60 = 15,
\]
\[
RSS = TSS - ESS = 85, \qquad R^2 = ESS/TSS = 0.15.
\]
(d) The error variance is estimated by $s^2 = \frac{1}{n-k}\sum \hat U_i^2 = 85/20 = 4.25$. The variance of $\hat\beta$ in this sample is then estimated by
\[
\widehat{\mathrm{var}}(\hat\beta) = \widehat{\mathrm{var}}\!\left(\beta + \frac{\sum_{i=1}^{n}(X_i - \bar X)(U_i - \bar U)}{\sum_{i=1}^{n}(X_i - \bar X)^2}\right) = \frac{s^2}{\sum_{i=1}^{n}(X_i - \bar X)^2} = \frac{4.25}{60} = 0.0708,
\]
so
\[
\widehat{s.e.}(\hat\beta) = \sqrt{0.0708} = 0.2661.
\]
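Continuing the same check for (c) and (d):
\begin{verbatim}
tss = 100.0
ess = 0.5 ** 2 * 60.0                # beta_hat^2 * sum of (Xi - Xbar)^2 = 15
rss = tss - ess                      # 85
r2 = ess / tss                       # 0.15

s2 = rss / (22 - 2)                  # s^2 = 4.25 with n - k = 20 dof
se_beta = (s2 / 60.0) ** 0.5         # ~0.2661
print(rss, r2, s2, se_beta)
\end{verbatim}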
Since
\[
\frac{\hat\beta - \beta}{\widehat{s.e.}(\hat\beta)} \sim t_{20},
\]
we have
\[
\Pr\!\left(-t_{\alpha/2,\,n-k} < \frac{\hat\beta - \beta}{\widehat{s.e.}(\hat\beta)} < t_{\alpha/2,\,n-k}\right) = 1 - \alpha
\;\Longrightarrow\; \beta \in \left[\hat\beta \pm t_{\alpha/2,\,n-k}\,\widehat{s.e.}(\hat\beta)\right] \text{ with probability } 1 - \alpha.
\]
When $\alpha = 5\%$, with 20 degrees of freedom, $t_{0.025,20} = 2.086$; so, in 95% of cases, the true $\beta \in [0.5 \pm 2.086 \times 0.2661] = [-0.055, 1.055]$.
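The confidence interval can be computed with a t quantile routine; this sketch assumes scipy is available:
\begin{verbatim}
from scipy import stats

beta_hat, se, dof = 0.5, 0.2661, 20
t_crit = stats.t.ppf(0.975, dof)                        # ~2.086
print(beta_hat - t_crit * se, beta_hat + t_crit * se)   # ~(-0.055, 1.055)
\end{verbatim}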
(e) We want to test the hypothesis $H_0: \beta = 1$ against $H_1: \beta \neq 1$. The t-statistic
\[
|t| = \left|\frac{0.5 - 1}{0.2661}\right| = 1.879
\]
has to be compared with the proper critical value, given the level of significance we choose: $t_{0.025,20} = 2.086$ and $t_{0.05,20} = 1.7247$. Hence we can reject $H_0$ at the 10% significance level but cannot reject it at the 5% level.
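The same test, including its two-sided p-value, under the same scipy assumption as above:
\begin{verbatim}
from scipy import stats

t_stat = abs((0.5 - 1.0) / 0.2661)    # ~1.879
p_value = 2 * stats.t.sf(t_stat, 20)  # two-sided p-value, ~0.075
print(t_stat, p_value)
# reject H0 at the 10% level (1.879 > 1.7247), not at 5% (1.879 < 2.086)
\end{verbatim}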