$y = X_1\beta_1 + X_2\beta_2 + \varepsilon$
Dr. Rachida Ouysse (ECON3208/ECON3291), School of Economics, UNSW
Review of Multiple Regression Model: continued
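OMITTED VARIABLES
(Free lunch?) Suppose $X_1'X_2 = 0$. Then the bias from omitting $X_2$ goes away. Interpretation: the omitted information is not "right"; it is irrelevant. The estimator $\tilde\beta_1$ (from the regression of $y$ on $X_1$ alone) is the same as $\hat\beta_1$ (from the regression of $y$ on both $X_1$ and $X_2$).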
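To see why orthogonality removes the bias, it may help to write out the expectation of the short-regression estimator. The sketch below assumes the standard setup, which these slides do not spell out here: the true model is $y = X_1\beta_1 + X_2\beta_2 + \varepsilon$ with $E[\varepsilon \mid X_1, X_2] = 0$, and $\tilde\beta_1$ comes from regressing $y$ on $X_1$ only.

```latex
\begin{align*}
\tilde\beta_1 &= (X_1'X_1)^{-1}X_1'y \\
              &= \beta_1 + (X_1'X_1)^{-1}X_1'X_2\,\beta_2 + (X_1'X_1)^{-1}X_1'\varepsilon, \\
E[\tilde\beta_1 \mid X_1, X_2] &= \beta_1 + (X_1'X_1)^{-1}X_1'X_2\,\beta_2 .
\end{align*}
```

The second term is the omitted-variable bias. It vanishes when $X_1'X_2 = 0$ (or when $\beta_2 = 0$).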
W. Ch. 3, page 99, shows that
$$V(\hat\beta_1) = \frac{\sigma^2}{SST_1\,(1 - R_1^2)}$$
where $SST_1$ is the total variation in $X_1$ and $R_1^2$ is the R-squared from the regression of $X_1$ on $X_2$. Furthermore,
$$V(\tilde\beta_1) = \frac{\sigma^2}{SST_1}$$
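Dividing one variance by the other gives a quick way to read the cost of including $X_2$. The value $R_1^2 = 0.75$ below is purely illustrative and is not taken from the lecture.

```latex
\[
\frac{V(\hat\beta_1)}{V(\tilde\beta_1)} = \frac{1}{1 - R_1^2} \ge 1,
\qquad \text{e.g. } R_1^2 = 0.75 \;\Rightarrow\; \frac{1}{1 - 0.75} = 4 .
\]
```

So $\tilde\beta_1$ never has the larger variance, and the two variances coincide when $R_1^2 = 0$, that is, when $X_1$ and $X_2$ are uncorrelated in the sample.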
WHAT AFFECTS THE VARIANCE OF OLS?
The variance of the OLS estimator of $\beta_j$, conditional on the sample values of the independent variables, is
$$V(\hat\beta_j) = \frac{\sigma^2}{SST_j\,(1 - R_j^2)} \qquad (1)$$
where $SST_j = \sum_{i=1}^{n} (X_{ij} - \bar X_j)^2$ is the total sample variation in $X_j$ and $R_j^2$ is the R-squared from the regression of $X_j$ on all other independent variables, including a constant term.
The larger $\sigma^2$, the larger the variance of the OLS estimator. More noise in the equation makes it harder to estimate the partial effect of any variable.
The larger the total variation in $X_j$, the smaller the variance of $\hat\beta_j$. One way to increase the in-sample variation of $X_j$ is to increase the sample size.
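The pieces of equation (1) can be computed one at a time in Stata and compared with the standard error that regress reports. The following is a minimal sketch, not part of the lecture's own log: it assumes the auto dataset shipped with Stata and an illustrative model of price on mpg and weight, and it replaces $\sigma^2$ with its usual estimate (the squared root MSE).

```stata
* Sketch: rebuild the standard error of the mpg coefficient from equation (1).
* Assumption: the illustrative model is price on mpg and weight.
sysuse auto, clear

* Estimate of sigma^2 and the standard error reported by regress
quietly regress price mpg weight
scalar sigma2 = e(rmse)^2
scalar se_mpg = _se[mpg]

* Total sample variation in mpg: SST_j = (n - 1) * sample variance
quietly summarize mpg
scalar sst_j = r(Var) * (r(N) - 1)

* R-squared from regressing mpg on the other regressor
quietly regress mpg weight
scalar r2_j = e(r2)

display "from equation (1): " sqrt(sigma2 / (sst_j * (1 - r2_j)))
display "from regress:      " se_mpg
```

The two displayed values should agree, since the reported standard error is the square root of equation (1) with the estimated $\sigma^2$ plugged in.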
MULTICOLLINEARITY
The variance of an estimated coefficient tends to be larger when other $X$'s in the model predict $X_j$ well. This shows up as a high $R_j^2$ in equation (1).
The standard error of prediction also tends to be larger when the model contains unnecessary or redundant $X$'s.
See W. pages 96-97 for a discussion.
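One common way to inspect $R_j^2$ for every regressor at once is the variance inflation factor, $VIF_j = 1/(1 - R_j^2)$, which is the factor by which collinearity scales up the variance in equation (1). A minimal sketch, assuming the auto dataset shipped with Stata and an illustrative model chosen only because weight and length are closely related:

```stata
* Sketch: variance inflation factors, VIF_j = 1 / (1 - R_j^2).
* Assumption: an illustrative model with deliberately related regressors.
sysuse auto, clear
regress price mpg weight length
estat vif
```

A large VIF for a variable means most of its sample variation is already explained by the other regressors.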
STATA ILLUSTRATIVE EXAMPLE: THE 1978 AUTOMOBILE DATASET
EXAMPLE: AUTOMOBILE DATASET
auto_out.txt
. describe
STATA ILLUSTRATIVE EXAMPLE: THE 1978 AUTOMOBILE DATASET
Regression diagnostics:
After estimating a model, we want to check the regression as a whole for normality of the residuals, omitted and unnecessary variables, and heteroskedasticity.
We also want to check individual variables for outliers, collinearity, and functional form.
Look at the residuals: in Stata, rvfplot plots the residuals against the fitted values.
Check the residuals for normality.
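The residual checks listed above might be run along the following lines. This is a sketch under assumptions: the model (price on mpg, weight and foreign) is illustrative rather than the one estimated in the lecture, and the particular normality checks (kdensity, qnorm, swilk) are one reasonable selection among several that Stata offers.

```stata
* Sketch of residual checks after a regression.
* Assumption: price on mpg, weight and foreign is an illustrative model only.
sysuse auto, clear
regress price mpg weight foreign

* Residuals against fitted values, with a reference line at zero
rvfplot, yline(0)

* Store the residuals in a new variable
predict double res, residuals

* Kernel density of the residuals with a normal density overlaid
kdensity res, normal

* Quantile-normal plot of the residuals
qnorm res

* Shapiro-Wilk test of normality
swilk res
```

A fan shape in the rvfplot output suggests heteroskedasticity, and curvature suggests a problem with the functional form.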
EXAMPLE: AUTOMOBILE DATASET
[Figure: plot of the residuals; axis labelled "Residuals", tick marks from -5000 to 10000.]
EXAMPLE: AUTOMOBILE DATASET
. imtest
Cameron & Trivedi's decomposition of IM-test
---------------------------------------------------
              Source |       chi2     df      p
---------------------+-----------------------------
  Heteroskedasticity |      13.43     10    0.2005
            Skewness |      12.08      4    0.0168
            Kurtosis |       1.16      1    0.2815
---------------------+-----------------------------
               Total |      26.67     15    0.0315
---------------------------------------------------
. ovtest
. hettest
Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of log_p
         chi2(1)      =     0.53
         Prob > chi2  =   0.4654
hettest performs three versions of the Breusch-Pagan (1979) and Cook-Weisberg (1983) test for heteroskedasticity. All three versions test the null hypothesis that $t = 0$ in $V(\varepsilon) = \sigma^2 \exp(zt)$. If varlist is not specified, the fitted values are used for $z$. In the output above, chi2(1) = 0.53 with p = 0.4654, so the null of constant variance is not rejected.
ovtest performs the Ramsey regression specification-error test (RESET) for omitted variables.
imtest performs an information matrix test for the regression model and an orthogonal decomposition into tests for heteroskedasticity, skewness, and kurtosis.
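In current Stata releases these tests are documented as estat postestimation subcommands; the bare names imtest, ovtest and hettest shown in the log are the older spellings. The sketch below uses the estat form. The regression command itself does not appear in this log, so the model line is reconstructed from the variable names in the output (log_p, forXmpg, _Iforeign_1) and should be read as an assumption, not as the lecturer's exact command.

```stata
* Sketch: specification tests with the estat syntax.
* Assumptions: log_p = ln(price), forXmpg = foreign * mpg, and the xi: prefix
* is what produced the _Iforeign_1 dummy seen in the log.
sysuse auto, clear
generate log_p   = ln(price)
generate forXmpg = foreign * mpg
xi: regress log_p weight mpg forXmpg i.foreign

* Information matrix test and its decomposition
estat imtest

* Ramsey RESET test for omitted variables
estat ovtest

* Breusch-Pagan / Cook-Weisberg test, fitted values used as z
estat hettest

* Version that drops the normality assumption
estat hettest, iid

* F-statistic version
estat hettest, fstat
```

The last three commands correspond to the three versions of the test mentioned above. A list of variables can be given after estat hettest to use them as $z$ instead of the fitted values.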
AUTOMOBILE EXAMPLE: ROBUST REGRESSION
If you detect possible problems with your initial regression, you can:
Check for mis-coded data
Divide your sample or eliminate some observations (like diesel cars)
Try adding more covariates if ovtest rejects the null of no omitted variables
Change the functional form of Y or of one of the regressors
Use robust regression
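"Robust regression" can be read in two ways in Stata, and the slides do not say which one is meant, so the sketch below shows both. The model is an assumption reconstructed from the variable names in the regression output (log_p, forXmpg); the generate lines are guesses at how those variables were built.

```stata
* Sketch: two readings of "robust regression" in Stata.
* Assumptions: log_p = ln(price) and forXmpg = foreign * mpg; neither
* definition is shown in the source.
sysuse auto, clear
generate log_p   = ln(price)
generate forXmpg = foreign * mpg

* (a) OLS coefficients with heteroskedasticity-robust standard errors
regress log_p weight mpg forXmpg foreign, vce(robust)

* (b) Robust regression that downweights outlying observations
rreg log_p weight mpg forXmpg foreign
```

Option (a) leaves the coefficients unchanged and only adjusts the standard errors, while (b) re-estimates the coefficients with less weight on outliers, so the two address different problems.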
ROBUST REGRESSION
AUTOMOBILE EXAMPLE: ROBUST REGRESSION
------------------------------------------------------------------------------
       log_p |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   .0006669   .0000848     7.87   0.000     .0004978    .0008361
         mpg |   .0485624   .0129485     3.75   0.000     .0227308    .0743939
     forXmpg |  -.0542761   .0126837    -4.28   0.000    -.0795795   -.0289728
 _Iforeign_1 |   1.892195   .3215827     5.88   0.000     1.250655    2.533735
       _cons |   5.397624   .5172239    10.44   0.000      4.36579    6.429457
------------------------------------------------------------------------------
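The interaction term changes how the mpg coefficient should be read. The following is a worked reading of the table, assuming (the output does not state it) that forXmpg is the product of the foreign dummy and mpg, and that log_p is the natural log of price.

```latex
\begin{align*}
\left.\frac{\partial \log p}{\partial\, \text{mpg}}\right|_{\text{domestic}} &= 0.0485624, \\
\left.\frac{\partial \log p}{\partial\, \text{mpg}}\right|_{\text{foreign}}  &= 0.0485624 - 0.0542761 = -0.0057137, \\
E[\log p \mid \text{foreign}] - E[\log p \mid \text{domestic}] &= 1.892195 - 0.0542761 \times \text{mpg}.
\end{align*}
```

Under these assumptions, one more mile per gallon is associated with roughly a 5% higher price for domestic cars and almost no change (about -0.6%) for foreign cars, holding weight fixed. The table does not give a standard error for the foreign slope; in Stata it could be obtained after the regression with lincom mpg + forXmpg.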