$$
Y_{t,1} = X_{t,1}\beta_1 + U_{t,1}, \qquad (1)
$$
$$
Y_{t,2} = X_{t,2}\beta_2 + U_{t,2}. \qquad (2)
$$
Because the two equations appear unrelated, we would think of estimating them
separately. (Hence the system is referred to as a system of seemingly unrelated
regression (SUR) equations.) Of course, if the two equations actually are unrelated,
then we should estimate them separately. But there may be a relation between the
equations, brought about by correlation between the two error terms. If the two
error terms are correlated, then we can obtain a more efficient estimator by
estimating the two equations jointly, as was shown by Zellner in 1962. The
intuition mirrors the intuition for a single equation with serial correlation:
if $U_{t,1}$ is correlated with $U_{t,2}$, then knowledge of $U_{t,2}$ helps us
predict $U_{t,1}$.
The formal assumption is that $U_t = (U_{t,1}, U_{t,2})'$ is an i.i.d. random
vector with mean zero and covariance matrix
$$
E(U_t U_t') = \begin{pmatrix} \omega_{11} & \omega_{12} \\ \omega_{12} & \omega_{22} \end{pmatrix}.
$$
Stacking the observations of the two equations, we can write the system as
$Y = X\beta + U$, where
$$
Y = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}, \quad
X = \begin{pmatrix} X_1 & 0 \\ 0 & X_2 \end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}, \quad
U = \begin{pmatrix} U_1 \\ U_2 \end{pmatrix}.
$$
The covariance matrix of the stacked error vector is
$$
\Omega = E(UU') = \begin{pmatrix} \omega_{11} I & \omega_{12} I \\ \omega_{12} I & \omega_{22} I \end{pmatrix},
$$
where $I$ is the $n \times n$ identity matrix. The SUR estimator is the GLS estimator
$$
B_S = (X'\Omega^{-1}X)^{-1} X'\Omega^{-1} Y,
$$
where
$$
\Omega^{-1} = \frac{1}{d}\begin{pmatrix} \omega_{22} I & -\omega_{12} I \\ -\omega_{12} I & \omega_{11} I \end{pmatrix},
\qquad d = \omega_{11}\omega_{22} - \omega_{12}^2.
$$
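As a concrete illustration, the stacked GLS computation above can be sketched in a few lines of NumPy. The function name and data layout are illustrative (not from the text), and the error covariance $\Sigma$ is taken as known, i.e. the infeasible case; the code exploits $\Omega = \Sigma \otimes I_n$, so $\Omega^{-1} = \Sigma^{-1} \otimes I_n$.

```python
import numpy as np

# Minimal sketch of the (infeasible) SUR/GLS estimator for the stacked
# two-equation system Y = X b + U, with Omega = Sigma kron I_n assumed known.
def sur_gls(x1, x2, y1, y2, sigma):
    """x1, x2: (n,) regressors; y1, y2: (n,) outcomes; sigma: 2x2 error covariance."""
    n = len(x1)
    # Stack the system: X is block-diagonal, Y is the stacked outcome vector.
    X = np.block([[x1[:, None], np.zeros((n, 1))],
                  [np.zeros((n, 1)), x2[:, None]]])
    Y = np.concatenate([y1, y2])
    # Omega = Sigma kron I_n, hence Omega^{-1} = Sigma^{-1} kron I_n.
    omega_inv = np.kron(np.linalg.inv(sigma), np.eye(n))
    # B_S = (X' Omega^{-1} X)^{-1} X' Omega^{-1} Y
    return np.linalg.solve(X.T @ omega_inv @ X, X.T @ omega_inv @ Y)
```

Forming the $2n \times 2n$ matrix explicitly is wasteful for large $n$ but keeps the sketch close to the formula.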
Direct calculation gives
$$
(X'\Omega^{-1}X)^{-1} = \frac{d}{C}\begin{pmatrix}
\omega_{11}\sum_{t=1}^n X_{t,2}^2 & \omega_{12}\sum_{t=1}^n X_{t,1}X_{t,2} \\
\omega_{12}\sum_{t=1}^n X_{t,1}X_{t,2} & \omega_{22}\sum_{t=1}^n X_{t,1}^2
\end{pmatrix},
$$
with $C = \omega_{11}\omega_{22}\sum_{t=1}^n X_{t,1}^2 \sum_{t=1}^n X_{t,2}^2 - \omega_{12}^2\big(\sum_{t=1}^n X_{t,1}X_{t,2}\big)^2$. Also
$$
X'\Omega^{-1}Y = \frac{1}{d}\begin{pmatrix}
\omega_{22}\sum_{t=1}^n X_{t,1}Y_{t,1} - \omega_{12}\sum_{t=1}^n X_{t,1}Y_{t,2} \\
\omega_{11}\sum_{t=1}^n X_{t,2}Y_{t,2} - \omega_{12}\sum_{t=1}^n X_{t,2}Y_{t,1}
\end{pmatrix}.
$$
The SUR estimator $B_S$ therefore equals
$$
B_S = \frac{1}{C}\begin{pmatrix}
\omega_{11}\sum X_{t,2}^2\,\big(\omega_{22}\sum X_{t,1}Y_{t,1} - \omega_{12}\sum X_{t,1}Y_{t,2}\big)
+ \omega_{12}\sum X_{t,1}X_{t,2}\,\big(\omega_{11}\sum X_{t,2}Y_{t,2} - \omega_{12}\sum X_{t,2}Y_{t,1}\big) \\
\omega_{12}\sum X_{t,1}X_{t,2}\,\big(\omega_{22}\sum X_{t,1}Y_{t,1} - \omega_{12}\sum X_{t,1}Y_{t,2}\big)
+ \omega_{22}\sum X_{t,1}^2\,\big(\omega_{11}\sum X_{t,2}Y_{t,2} - \omega_{12}\sum X_{t,2}Y_{t,1}\big)
\end{pmatrix}.
$$
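The closed-form expression can be checked numerically against the matrix formula $B_S = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y$. The following sketch uses arbitrary illustrative data and (co)variances, not values from the text:

```python
import numpy as np

# Check that the closed-form B_S matches the direct matrix computation.
rng = np.random.default_rng(1)
n = 40
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y1, y2 = rng.normal(size=n), rng.normal(size=n)
w11, w22, w12 = 2.0, 1.5, 0.8  # assumed error (co)variances, w11*w22 > w12**2

# Closed form: B_S = (1/C) * (the two numerators displayed above).
C = w11 * w22 * (x1 @ x1) * (x2 @ x2) - w12**2 * (x1 @ x2)**2
b1 = (w11 * (x2 @ x2) * (w22 * (x1 @ y1) - w12 * (x1 @ y2))
      + w12 * (x1 @ x2) * (w11 * (x2 @ y2) - w12 * (x2 @ y1))) / C
b2 = (w12 * (x1 @ x2) * (w22 * (x1 @ y1) - w12 * (x1 @ y2))
      + w22 * (x1 @ x1) * (w11 * (x2 @ y2) - w12 * (x2 @ y1))) / C

# Direct GLS on the stacked system.
X = np.block([[x1[:, None], np.zeros((n, 1))], [np.zeros((n, 1)), x2[:, None]]])
Y = np.concatenate([y1, y2])
omega_inv = np.kron(np.linalg.inv(np.array([[w11, w12], [w12, w22]])), np.eye(n))
b_direct = np.linalg.solve(X.T @ omega_inv @ X, X.T @ omega_inv @ Y)
```

The two routes agree to machine precision, which is a useful guard against sign errors when working through the algebra by hand.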
There are two cases in which simplification occurs. First, if $\omega_{12} = 0$,
then $C = \omega_{11}\omega_{22}\sum X_{t,1}^2 \sum X_{t,2}^2$ and
$$
B_S = \frac{1}{\omega_{11}\omega_{22}\sum X_{t,1}^2 \sum X_{t,2}^2}
\begin{pmatrix}
\omega_{11}\omega_{22}\sum X_{t,2}^2 \sum X_{t,1}Y_{t,1} \\
\omega_{11}\omega_{22}\sum X_{t,1}^2 \sum X_{t,2}Y_{t,2}
\end{pmatrix}
= \begin{pmatrix}
\dfrac{\sum X_{t,1}Y_{t,1}}{\sum X_{t,1}^2} \\[6pt]
\dfrac{\sum X_{t,2}Y_{t,2}}{\sum X_{t,2}^2}
\end{pmatrix}.
$$
If the errors are uncorrelated across equations, then there is no relation between
the equations and the SUR estimators are identical to the OLS estimators for each
equation separately.
Second, if $X_1 = X_2$ (so $X_{t,1} = X_{t,2} = X_t$ for all $t$), then
$C = (\sum X_t^2)^2(\omega_{11}\omega_{22} - \omega_{12}^2)$ and $B_S$ equals
$$
B_S = \frac{1}{(\sum X_t^2)^2(\omega_{11}\omega_{22} - \omega_{12}^2)}
\begin{pmatrix}
\omega_{11}\sum X_t^2\,\big(\omega_{22}\sum X_t Y_{t,1} - \omega_{12}\sum X_t Y_{t,2}\big)
+ \omega_{12}\sum X_t^2\,\big(\omega_{11}\sum X_t Y_{t,2} - \omega_{12}\sum X_t Y_{t,1}\big) \\
\omega_{12}\sum X_t^2\,\big(\omega_{22}\sum X_t Y_{t,1} - \omega_{12}\sum X_t Y_{t,2}\big)
+ \omega_{22}\sum X_t^2\,\big(\omega_{11}\sum X_t Y_{t,2} - \omega_{12}\sum X_t Y_{t,1}\big)
\end{pmatrix}
$$
$$
= \frac{1}{\sum X_t^2\,(\omega_{11}\omega_{22} - \omega_{12}^2)}
\begin{pmatrix}
(\omega_{11}\omega_{22} - \omega_{12}^2)\sum X_t Y_{t,1} \\
(\omega_{11}\omega_{22} - \omega_{12}^2)\sum X_t Y_{t,2}
\end{pmatrix}
= \begin{pmatrix}
\dfrac{\sum X_t Y_{t,1}}{\sum X_t^2} \\[6pt]
\dfrac{\sum X_t Y_{t,2}}{\sum X_t^2}
\end{pmatrix}.
$$
If the regressor is the same in each equation, then there is no additional information from the cross-equation correlation.
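Both simplifications are easy to confirm numerically. The sketch below (illustrative data and covariances, with a hypothetical helper `sur_gls` implementing the stacked GLS formula) checks that SUR coincides with equation-by-equation OLS when $\omega_{12}=0$ and when the regressors are identical:

```python
import numpy as np

def sur_gls(x1, x2, y1, y2, sigma):
    """(Infeasible) SUR/GLS on the stacked two-equation system with known sigma."""
    n = len(x1)
    X = np.block([[x1[:, None], np.zeros((n, 1))], [np.zeros((n, 1)), x2[:, None]]])
    Y = np.concatenate([y1, y2])
    omega_inv = np.kron(np.linalg.inv(sigma), np.eye(n))
    return np.linalg.solve(X.T @ omega_inv @ X, X.T @ omega_inv @ Y)

def ols_slope(x, y):
    # OLS slope for a single no-intercept regression: sum(x*y) / sum(x^2).
    return (x @ y) / (x @ x)

rng = np.random.default_rng(2)
x1, x2 = rng.normal(size=30), rng.normal(size=30)
y1, y2 = rng.normal(size=30), rng.normal(size=30)

# Case 1: omega_12 = 0 (diagonal sigma) -> SUR equals equation-by-equation OLS.
b_diag = sur_gls(x1, x2, y1, y2, np.diag([2.0, 1.5]))

# Case 2: identical regressors, correlated errors -> SUR still equals OLS.
sigma = np.array([[2.0, 0.8], [0.8, 1.5]])
b_same = sur_gls(x1, x1, y1, y2, sigma)
```

In case 2 the result holds for any invertible error covariance, which is exactly the point made in the text: identical regressors leave no cross-equation information to exploit.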
To understand why identical regressors should produce such a result, we study
the special case in which there is only one observation for each regression. The
formula for $B_S$ simplifies to
$$
B_S = \frac{1}{X_1^2 X_2^2(\omega_{11}\omega_{22} - \omega_{12}^2)}
\begin{pmatrix}
X_2^2(\omega_{11}\omega_{22} - \omega_{12}^2)\, X_1 Y_1 \\
X_1^2(\omega_{11}\omega_{22} - \omega_{12}^2)\, X_2 Y_2
\end{pmatrix}
= \begin{pmatrix}
\dfrac{w X_2^2\, X_1 Y_1}{w X_2^2\, X_1^2} \\[6pt]
\dfrac{w X_1^2\, X_2 Y_2}{w X_1^2\, X_2^2}
\end{pmatrix}
= \begin{pmatrix} Y_1 / X_1 \\ Y_2 / X_2 \end{pmatrix},
$$
with $w = \omega_{11}\omega_{22} - \omega_{12}^2$. Yet as there is only one observation for each equation, the
SUR estimator cannot give different weights to the observations within an equation,
and so it is identical to the OLS estimator, which gives equal weight to each observation
within an equation. For $n > 1$, the weights are not simply $wX_2^2$: the variation
in the weights depends on the variation between $X_1$ and $X_2$. With identical
regressors, the weight variation disappears and the SUR estimator is identical to
OLS.
In practice, $\Omega$ is unknown. The (feasible) implementation of the estimator is
achieved in two stages. In the first stage, estimate each equation by OLS, yielding
$B_1$, $B_2$, $U_1^P$ and $U_2^P$. We then estimate the elements of the joint error
covariance matrix by
$$
S_{i,j} = \frac{1}{n - K}\sum_{t=1}^n U_{t,i}^P U_{t,j}^P.
$$
The estimated covariance matrix is then used to construct the second-stage GLS
estimator.
In the above example, there are only two equations and $n$ observations for
each equation. For the first-stage error covariance matrix to be well estimated,
$n$ must be substantially larger than the number of equations.
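The two-stage procedure can be sketched as follows for the single-regressor, two-equation case; the function name is illustrative, and $K = 1$ (one coefficient per equation) is an assumption of this example:

```python
import numpy as np

# Sketch of the feasible two-stage SUR estimator: OLS residuals in stage 1
# estimate the error covariance, which feeds the stage-2 GLS computation.
def feasible_sur(x1, x2, y1, y2, K=1):
    n = len(x1)
    # Stage 1: equation-by-equation OLS and residuals.
    b1 = (x1 @ y1) / (x1 @ x1)
    b2 = (x2 @ y2) / (x2 @ x2)
    u1, u2 = y1 - b1 * x1, y2 - b2 * x2
    # Estimate the 2x2 error covariance matrix S_{i,j} from the residuals.
    U = np.column_stack([u1, u2])
    S = U.T @ U / (n - K)
    # Stage 2: GLS on the stacked system, with S in place of the unknown Sigma.
    X = np.block([[x1[:, None], np.zeros((n, 1))], [np.zeros((n, 1)), x2[:, None]]])
    Y = np.concatenate([y1, y2])
    omega_inv = np.kron(np.linalg.inv(S), np.eye(n))
    return np.linalg.solve(X.T @ omega_inv @ X, X.T @ omega_inv @ Y)
```

With identical regressors the feasible estimator also collapses to OLS, for any invertible estimate $S$, mirroring the analytical result above.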