Jin-Lung Lin
yt = β0 + β1 x1t + β2 x2t + єt,  t = 1, 2, …, T
E(єt) = 0, ∀t
V(єt) = σ², ∀t
E(єt єs) = 0, ∀t ≠ s
E(xit єt) = 0, i = 1, 2
2. The relation between yt , x1t , x2t is linear and no important causal variables are left out.
3. β1 measures the effect of a one-unit change in x1t on yt with x2t held fixed. Similarly, β2 measures the effect of a one-unit change in x2t on yt with x1t held fixed.
4. Note that

   β1 = ∂yt/∂x1t,  β2 = ∂yt/∂x2t,

   whereas the total derivative is

   dyt/dx1t = ∂yt/∂x1t + (∂yt/∂x2t)(dx2t/dx1t)
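The gap between the partial effect β1 and the total derivative can be seen numerically. A minimal sketch in Python; the coefficients and the mechanical link x2 = 0.5·x1 are made up purely for illustration:

```python
import numpy as np

# Hypothetical coefficients and a mechanical link x2t = 0.5 * x1t,
# chosen only to illustrate partial vs. total effects.
beta0, beta1, beta2 = 1.0, 2.0, 3.0
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = 0.5 * x1                      # dx2/dx1 = 0.5
y = beta0 + beta1 * x1 + beta2 * x2

# Total derivative: dy/dx1 = beta1 + beta2 * dx2/dx1 = 2 + 3 * 0.5 = 3.5.
# A simple regression of y on x1 alone recovers this total effect,
# not the partial effect beta1.
slope = np.polyfit(x1, y, 1)[0]
print(round(slope, 4))             # → 3.5
```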
2 Estimating the β's
Least squares estimates are derived by minimizing the sum of squared residuals:

Min SSE = Σ_{t=1}^{T} (yt − β0 − β1 x1t − β2 x2t)²
β̂1 = [Σ(x1t − x̄1)(yt − ȳ) · Σ(x2t − x̄2)² − Σ(x1t − x̄1)(x2t − x̄2) · Σ(x2t − x̄2)(yt − ȳ)]
     / [Σ(x1t − x̄1)² · Σ(x2t − x̄2)² − (Σ(x1t − x̄1)(x2t − x̄2))²]

β̂2 = [Σ(x1t − x̄1)² · Σ(x2t − x̄2)(yt − ȳ) − Σ(x1t − x̄1)(yt − ȳ) · Σ(x1t − x̄1)(x2t − x̄2)]
     / [Σ(x1t − x̄1)² · Σ(x2t − x̄2)² − (Σ(x1t − x̄1)(x2t − x̄2))²]

where all sums run over t = 1, …, T.
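The closed-form expressions above can be verified numerically. A sketch with simulated data (all numbers illustrative), checked against NumPy's general least-squares routine:

```python
import numpy as np

# Simulated two-regressor model; the true coefficients are arbitrary.
rng = np.random.default_rng(1)
T = 200
x1 = rng.normal(size=T)
x2 = 0.6 * x1 + rng.normal(size=T)          # correlated regressors
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(scale=0.5, size=T)

# Centered sums appearing in the closed-form formulas.
d1, d2, dy = x1 - x1.mean(), x2 - x2.mean(), y - y.mean()
S11, S22, S12 = d1 @ d1, d2 @ d2, d1 @ d2
S1y, S2y = d1 @ dy, d2 @ dy
den = S11 * S22 - S12 ** 2
b1 = (S1y * S22 - S12 * S2y) / den
b2 = (S11 * S2y - S1y * S12) / den

# Same estimates from the general least-squares solver.
X = np.column_stack([np.ones(T), x1, x2])
b_ls = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.allclose([b1, b2], b_ls[1:]))      # → True
```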
Thus,
yt = ŷt + є̂t
= β̂0 + β̂1 x1t + β̂2 x2t + є̂t
Note that we are decomposing yt into the sum of two terms, ŷt and є̂t. It is the orthogonality condition that makes the decomposition unique. To sum up, regressing yt on (x1t, x2t) is equivalent to projecting yt onto the space spanned by x1t and x2t.
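The orthogonality and projection interpretation can be checked directly: the least-squares residuals are orthogonal to every regressor, and hence to the fitted values. A minimal sketch with simulated data (coefficients illustrative):

```python
import numpy as np

# Simulated data; coefficients are illustrative.
rng = np.random.default_rng(2)
T = 100
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=T)

# Least-squares fit via the normal equations.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ beta_hat                  # projection of y onto span{1, x1, x2}
e_hat = y - y_hat                     # residual component

print(np.allclose(X.T @ e_hat, 0))    # → True: residuals ⊥ each regressor
print(np.allclose(y_hat @ e_hat, 0))  # → True: the decomposition is orthogonal
```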
5 General cases
Generally speaking,

Y = Xβ + є

where Y = (y1, y2, …, yT)′ and

    ⎡ 1  x11  x12  ⋯  x1k ⎤
X = ⎢ 1  x21  x22  ⋯  x2k ⎥
    ⎢ ⋮   ⋮    ⋮        ⋮  ⎥
    ⎣ 1  xT1  xT2  ⋯  xTk ⎦
β̂ = (X′X)⁻¹X′Y
where
      ⎡ T       Σ xt1      Σ xt2      ⋯  Σ xtk     ⎤
X′X = ⎢ Σ xt1   Σ xt1²     Σ xt1 xt2  ⋯  Σ xt1 xtk ⎥
      ⎢ ⋮        ⋮          ⋮             ⋮         ⎥
      ⎣ Σ xtk   Σ xtk xt1  Σ xtk xt2  ⋯  Σ xtk²    ⎦

where Σ denotes Σ_{t=1}^{T}.
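In matrix form the whole estimation is one line. A hedged sketch (simulated data; the coefficient vector is assumed for illustration) that solves the normal equations (X′X)β = X′Y instead of forming the inverse explicitly, which is numerically preferable:

```python
import numpy as np

# Simulated design with an intercept column and k = 3 regressors.
rng = np.random.default_rng(3)
T, k = 500, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, k))])
beta_true = np.array([0.5, 1.0, -2.0, 0.3])       # illustrative values
Y = X @ beta_true + rng.normal(size=T)

# beta-hat = (X'X)^{-1} X'Y, computed by solving (X'X) b = X'Y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(np.allclose(beta_hat, beta_true, atol=0.2))  # → True here
```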
6 Testing hypotheses on β
Typically all elements of X′X diverge to ∞ as T → ∞, so (X′X)⁻¹ converges to an all-zeros matrix. It is often the case that (1/T)(X′X) →p Σx (equivalently, T(X′X)⁻¹ →p Σx⁻¹) as T → ∞. It can also be shown that under normal errors the LSE β̂ is the MLE. Under
the null hypothesis H0: β = β0,

√T(β̂ − β0) →d N(0, σ²Σx⁻¹)

(1/σ̂²) T(β̂ − β0)′ (X′X/T) (β̂ − β0) →d χ²(k + 1)
Let the hypothesis be expressed as Rβ = r, where R is an m × (k + 1) matrix. Then

(1/σ̂²) T(Rβ̂ − r)′ (R(X′X/T)⁻¹R′)⁻¹ (Rβ̂ − r) →d χ²(m)
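A sketch of this Wald statistic on simulated data; the restriction β1 + β2 = 0 and all coefficients are made up, and under H0 the statistic is approximately χ²(m) with m = 1 here:

```python
import numpy as np

# Simulated model in which the restriction beta1 + beta2 = 0 holds.
rng = np.random.default_rng(4)
T = 400
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
y = X @ np.array([1.0, 2.0, -2.0]) + rng.normal(size=T)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / T               # error-variance estimate

R = np.array([[0.0, 1.0, 1.0]])              # one restriction: m = 1
r = np.array([0.0])
mid = np.linalg.inv(R @ np.linalg.inv(X.T @ X / T) @ R.T)
W = T * (R @ beta_hat - r) @ mid @ (R @ beta_hat - r) / sigma2_hat
print(round(float(W), 3))                    # compare with chi2(1) critical values
```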
In particular, if the hypothesis involves only a single parameter, i.e., R = (0, …, 0, 1, 0, …, 0) (a single element of R is 1 and the rest are zeros), it can be transformed into a t-test. For a hypothesis like βg+1 = ⋯ = βk = 0, an F test is also feasible. Let SSER and SSEC be the residual sums of squares under the null and the alternative hypothesis, respectively. Then

F = [(SSER − SSEC)/(k − g)] / [SSEC/(T − k − 1)] ∼ F(k − g, T − k − 1) under H0.
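A hedged sketch of the F test via restricted and unrestricted residual sums of squares; the data are simulated so that H0 (the last two slopes are zero) holds by construction:

```python
import numpy as np

# Simulated model where the last two slope coefficients are truly zero.
rng = np.random.default_rng(5)
T = 300
X = np.column_stack([np.ones(T), rng.normal(size=(T, 3))])   # k = 3 slopes
y = X @ np.array([1.0, 2.0, 0.0, 0.0]) + rng.normal(size=T)

def sse(Xm, ym):
    """Residual sum of squares from an OLS fit of ym on Xm."""
    b = np.linalg.solve(Xm.T @ Xm, Xm.T @ ym)
    e = ym - Xm @ b
    return e @ e

sse_c = sse(X, y)            # complete (unrestricted) model
sse_r = sse(X[:, :2], y)     # restricted model: drop the last two regressors

q = 2                        # number of restrictions
F = ((sse_r - sse_c) / q) / (sse_c / (T - X.shape[1]))
print(F >= 0)                # → True; under H0, F ~ F(q, T - k - 1)
```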
2. E(єt єs) = 0 can be violated in time-series regressions, so this assumption needs to be checked diagnostically.
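One simple diagnostic for the E(єt єs) = 0 assumption is the Durbin-Watson statistic computed from the residuals; values well below 2 signal positive serial correlation. A sketch with deliberately AR(1) errors (all parameters illustrative):

```python
import numpy as np

# Regression with AR(1) errors (rho = 0.8) to make the violation visible.
rng = np.random.default_rng(6)
T = 500
x = rng.normal(size=T)
eps = np.zeros(T)
for t in range(1, T):
    eps[t] = 0.8 * eps[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + eps

X = np.column_stack([np.ones(T), x])
e = y - X @ np.linalg.solve(X.T @ X, X.T @ y)   # OLS residuals

# Durbin-Watson statistic: approximately 2 * (1 - rho-hat).
dw = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
print(dw < 1.5)   # → True here: strong positive autocorrelation
```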
8 Empirical examples
Estimating the market β for the Taiwanese stock market