
January 26, 1999

Useful Equations for Linear Regression


Simple linear regression: one predictor (p = 1):
Model: E(y|x) = α + βx
E(y) = expectation or long-term average of y; | = conditional on
Alternate statement of model: y = α + βx + e, e normal with mean zero for
all x, var(e) = σ² = var(y|x)
Assumptions:
1. Linearity
2. σ² is constant, independent of x
3. Observations (es) are independent of each other
4. For proper statistical inference (CI, P-values), y (e) is normal conditional
on x
Verifying some of the assumptions:
1. In a scattergram the spread of y about the fitted line should be constant
as x increases
2. In a residual plot (d = y − ŷ vs. x) there are no systematic patterns (no
trend in central tendency, no change in spread of points with x)
Sample of size n: (x_1, y_1), (x_2, y_2), . . . , (x_n, y_n)
L_xx = Σ(x_i − x̄)²    L_xy = Σ(x_i − x̄)(y_i − ȳ)
β̂ = b = L_xy / L_xx    α̂ = a = ȳ − b x̄
ŷ = a + bx = Ê(y|x) = estimate of E(y|x) = estimate of y
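As a minimal sketch, the estimates above can be computed directly from a sample; the data here are made up for illustration:

```python
# Sketch of the least-squares formulas above; the data are made up.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n

# L_xx and L_xy as defined above
Lxx = sum((xi - xbar) ** 2 for xi in x)
Lxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

b = Lxy / Lxx        # slope estimate (beta-hat)
a = ybar - b * xbar  # intercept estimate (alpha-hat)

def yhat(x_new):
    """Estimate of E(y|x) at x_new."""
    return a + b * x_new
```

Note that the fitted line always passes through (x̄, ȳ), since a = ȳ − b x̄.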
SST = Σ(y_i − ȳ)²    MST = SST/(n − 1) = s²_y
SSR = Σ(ŷ_i − ȳ)²    MSR = SSR/p
SSE = Σ(y_i − ŷ_i)²    MSE = SSE/(n − p − 1) = s²_{y·x}
SST = SSR + SSE
F = MSR/MSE = (R²/p) / [(1 − R²)/(n − p − 1)] ∼ F_{p, n−p−1}
R² = SSR/SST    SSR/MSE ≈ χ²_p
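The sums-of-squares decomposition and the overall F statistic follow directly from the fitted values; a sketch for p = 1 with made-up data:

```python
# Sketch of the sums-of-squares decomposition and overall F (p = 1);
# the data are made up.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n, p = len(x), 1

xbar, ybar = sum(x) / n, sum(y) / n
b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
    sum((xi - xbar) ** 2 for xi in x)
a = ybar - b * xbar
yhat = [a + b * xi for xi in x]

SST = sum((yi - ybar) ** 2 for yi in y)          # total
SSR = sum((yh - ybar) ** 2 for yh in yhat)       # regression
SSE = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # error

MSR = SSR / p
MSE = SSE / (n - p - 1)
F = MSR / MSE        # compare to F with p and n - p - 1 d.f.
R2 = SSR / SST
# SST = SSR + SSE holds up to floating-point rounding
```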
(p = 1)  s.e.(b) = s_{y·x} / √L_xx    t = b / s.e.(b) ∼ t_{n−p−1}
1 − α two-sided CI for β: b ± t_{n−p−1, 1−α/2} s.e.(b)
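A sketch of the slope's standard error, t statistic, and CI for β, using made-up data and the tabulated critical value t_{3, 0.975} ≈ 3.182:

```python
import math

# Sketch: s.e.(b), t statistic, and a 95% CI for the slope (p = 1).
# Data are made up; t_crit is the tabulated value t_{3, 0.975}.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n, p = len(x), 1

xbar, ybar = sum(x) / n, sum(y) / n
Lxx = sum((xi - xbar) ** 2 for xi in x)
b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / Lxx
a = ybar - b * xbar

SSE = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
s_yx = math.sqrt(SSE / (n - p - 1))   # residual standard deviation

se_b = s_yx / math.sqrt(Lxx)
t = b / se_b                          # ~ t with n - p - 1 = 3 d.f.

t_crit = 3.182                        # t_{3, 0.975} from a t table
ci = (b - t_crit * se_b, b + t_crit * se_b)
```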
(p = 1)  s.e.(ŷ) = s_{y·x} √(1 + 1/n + (x − x̄)²/L_xx)
1 − α two-sided CI for y: ŷ ± t_{n−p−1, 1−α/2} s.e.(ŷ)
(p = 1)  s.e.(Ê(y|x)) = s_{y·x} √(1/n + (x − x̄)²/L_xx)
1 − α two-sided CI for E(y|x): ŷ ± t_{n−p−1, 1−α/2} s.e.(Ê(y|x))
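The two standard errors differ only in whether the extra 1 appears under the square root: it is present when predicting a new observation y, absent when estimating the mean E(y|x). A sketch with made-up data and the tabulated value t_{3, 0.975} ≈ 3.182:

```python
import math

# Sketch: prediction interval for y vs. confidence interval for E(y|x)
# at a new x (p = 1). Data are made up.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n, p = len(x), 1

xbar, ybar = sum(x) / n, sum(y) / n
Lxx = sum((xi - xbar) ** 2 for xi in x)
b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / Lxx
a = ybar - b * xbar
s_yx = math.sqrt(sum((yi - (a + b * xi)) ** 2
                     for xi, yi in zip(x, y)) / (n - p - 1))

x_new = 3.5
yhat = a + b * x_new
leverage = 1 / n + (x_new - xbar) ** 2 / Lxx

se_pred = s_yx * math.sqrt(1 + leverage)  # new observation y
se_mean = s_yx * math.sqrt(leverage)      # mean response E(y|x)

t_crit = 3.182                            # t_{3, 0.975} from a t table
pred_int = (yhat - t_crit * se_pred, yhat + t_crit * se_pred)
mean_ci = (yhat - t_crit * se_mean, yhat + t_crit * se_mean)
```

The prediction interval is always the wider of the two, since it must also cover the variability of a single new e.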
Multiple linear regression: p predictors, p > 1:
Model: E(y|x) = α + β_1 x_1 + β_2 x_2 + . . . + β_p x_p (equivalently,
y = α + β_1 x_1 + . . . + β_p x_p + e)
Interpretation of β_j: effect on y of increasing x_j by one unit, holding all other
x's constant
Assumptions: same as for p = 1 plus no interaction between the x's (x's act
additively; the effect of x_j does not depend on the other x's).
Verifying some of the assumptions:
1. When p = 2, x_1 is continuous, and x_2 is binary, the pattern of y vs. x_1,
with points identified by x_2, is two straight, parallel lines
2. In a residual plot (d = y − ŷ vs. ŷ) there are no systematic patterns (no
trend in central tendency, no change in spread of points with ŷ). The
same is true if one plots d vs. any of the x's.
3. Partial residual plots reveal the partial (adjusted) relationship between
a chosen x_j and y, controlling for all other x_i, i ≠ j, without assuming
linearity for x_j. In these plots, the following quantities appear on the axes:
y axis: residuals from predicting y from all predictors except x_j
x axis: residuals from predicting x_j from all predictors except x_j (y is
ignored)
When p > 1, least squares estimates are obtained using more complex formulas.
But just as in the case with p = 1, all of the coefficient estimates are
weighted combinations of the y's, Σ w_i y_i [when p = 1, the w_i for estimating
β are (x_i − x̄) / Σ(x_i − x̄)²].
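For p = 1 this weighted-combination form is easy to verify numerically (the data are made up):

```python
# Sketch: for p = 1 the slope estimate is a weighted sum of the y's,
# b = sum(w_i * y_i) with w_i = (x_i - xbar) / L_xx. Data are made up.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n
Lxx = sum((xi - xbar) ** 2 for xi in x)
w = [(xi - xbar) / Lxx for xi in x]   # the weights depend only on the x's

b_weighted = sum(wi * yi for wi, yi in zip(w, y))
b_direct = sum((xi - xbar) * (yi - ybar)
               for xi, yi in zip(x, y)) / Lxx
# The two agree because the weights sum to zero, so the ybar term drops out.
```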
Hypothesis tests with p > 1:
Overall F test tests H_0: β_1 = β_2 = . . . = β_p = 0 vs. the alternative
hypothesis that at least one of the β's ≠ 0.
To test whether an individual β_j = 0 the simplest approach is to compute
the t statistic, with n − p − 1 d.f.
Subsets of the s can be tested against zero if one knows the standard
errors of all of the estimated coecients and the correlations of each pair
of estimates. The formulas are daunting.
To test whether a subset of the s are all zero, a good approach is to com-
pare the model containing all of the predictors associated with the s of
interest with a submodel containing only the predictors not being tested
(i.e., the predictors being adjusted for). This tests whether the predictors
of interest add response information to the predictors being adjusted for.
If the goal is to test H_0: β_1 = β_2 = . . . = β_q = 0 regardless of the values
of β_{q+1}, . . . , β_p (i.e., adjusting for x_{q+1}, . . . , x_p), fit the full model with p
predictors, computing SSE_full or R²_full. Then fit the submodel omitting
x_1, . . . , x_q to obtain SSE_reduced or R²_reduced. Then compute the partial F
statistic
F = [(SSE_reduced − SSE_full)/q] / [SSE_full / (n − p − 1)]
= [(R²_full − R²_reduced)/q] / [(1 − R²_full) / (n − p − 1)] ∼ F_{q, n−p−1}
Note that SSE_reduced − SSE_full = SSR_full − SSR_reduced.
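A sketch of the partial F computation, assuming the two fits have already produced SSE and R² values (all numbers here are made up for illustration):

```python
# Sketch: partial F from a full model (p predictors) and a reduced
# model omitting the q tested predictors. All numbers are made up.
n, p, q = 20, 3, 2
SST = 100.0
SSE_full, SSE_reduced = 40.0, 55.0

# SSE form of the partial F statistic
F_sse = ((SSE_reduced - SSE_full) / q) / (SSE_full / (n - p - 1))

# Equivalent R-squared form
R2_full = 1 - SSE_full / SST
R2_reduced = 1 - SSE_reduced / SST
F_r2 = ((R2_full - R2_reduced) / q) / ((1 - R2_full) / (n - p - 1))
# Both are compared to the F distribution with q and n - p - 1 d.f.
```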
Notes about distributions:
If t ∼ t_b, t ≈ normal for large b and t² ≈ χ²_1, so [b / s.e.(b)]² ≈ χ²_1
If F ∼ F_{a,b}, aF ≈ χ²_a for large b
If F ∼ F_{1,b}, √F ∼ t_b
If t ∼ t_b, t² ∼ F_{1,b}
If z ∼ normal, z² ∼ χ²_1
y ∼ D means y is distributed as the distribution D
y ≈ D means that y is approximately distributed as D for large n
θ̂ means an estimate of θ
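The relationship t² ∼ F_{1,b} can be checked by construction, since a t_b variate is z / √(V/b) with z standard normal and V ∼ χ²_b, while an F_{1,b} variate is (z²/1) / (V/b):

```python
import random, math

# Sketch: the identity t^2 = F_{1,b}, checked by building both variates
# from the same underlying normals.
random.seed(0)
b = 7
z = random.gauss(0.0, 1.0)
# chi-squared with b d.f. = sum of b squared standard normals
V = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(b))

t = z / math.sqrt(V / b)       # a t variate with b d.f.
F = (z ** 2 / 1) / (V / b)     # an F variate with 1 and b d.f.
# t**2 equals F exactly, which is why sqrt(F_{1,b}) is distributed as |t_b|
```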