
Autocorrelation

There is serial correlation (i.e. autocorrelation) when either the dependent variable (AR
models) or the residual (MA models) shows correlation with its values in past periods.
This is a problem because standard errors (even heteroskedasticity-robust ones) are not
consistent, affecting statistical inference (i.e. hypothesis testing). In cross-section data, it
is relatively safe to assume

$$E(e_i e_j) = 0, \quad i \neq j$$

In words, the residuals are completely independent across observations. However, it is
usually the case that errors are not independent in time series: observations for the same
individual in different periods of time are usually correlated. There are many factors that
may cause this (auto)correlation:

a) Cyclical components
b) Omitted variable bias
c) Functional form misspecification
d) Data manipulation: secondary effect of smoothing or interpolation techniques in
compiling statistics

Consider the simplest bivariate model

$$y_t = \beta_0 + \beta_1 x_t + e_t$$

where the residuals $e_t$ follow a first-order autoregressive process

$$e_t = \rho e_{t-1} + u_t$$

and $u_t$ is an error term with no serial correlation. The parameter $\rho$ is the correlation
coefficient and indicates how strong the autocorrelation is. Substituting iteratively $r$ times,
the residuals can be expressed as a geometric series of the error terms $u_t$:

$$e_t = u_t + \rho u_{t-1} + \rho^2 u_{t-2} + \dots + \rho^r u_{t-r} + \rho^{r+1} e_{t-r-1} = \sum_{i=0}^{r} \rho^i u_{t-i} + \rho^{r+1} e_{t-r-1}$$

Since $-1 < \rho < 1$, the term $\rho^{r+1} e_{t-r-1}$ tends to zero and the last term can be dropped as $r \to \infty$:


$$e_t = \sum_{i=0}^{\infty} \rho^i u_{t-i}$$

where

$$E(e_t) = E\left(\sum_{i=0}^{\infty} \rho^i u_{t-i}\right) = \sum_{i=0}^{\infty} \rho^i E(u_{t-i}) = 0$$

and

$$V(e_t) = V\left(\sum_{i=0}^{\infty} \rho^i u_{t-i}\right) = \sum_{i=0}^{\infty} \rho^{2i} V(u_{t-i}) = \frac{\sigma_u^2}{1-\rho^2}$$

The last result for the variance of the residuals assumes $u_t$ is homoskedastic.
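As a quick numerical check, with $\rho = 0.8$ and $\sigma_u^2 = 1$,

$$V(e_t) = \frac{1}{1 - 0.8^2} = \frac{1}{0.36} \approx 2.78,$$

so fairly moderate autocorrelation almost triples the variance of the regression error relative to the variance of the underlying innovation.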

The OLS estimator $\hat{\beta}_1$ is unbiased and consistent, but the usual standard errors are not valid.
The variance of $\hat{\beta}_1$ looks like

$$V(\hat{\beta}_1) = \frac{\sigma_e^2}{\sum_t (x_t - \bar{x})^2} \left[ 1 + \frac{\sum_{i \neq j} (x_i - \bar{x})(x_j - \bar{x}) \rho^{|i-j|}}{\sum_t (x_t - \bar{x})^2} \right]$$

where $\sigma_e^2 = \sigma_u^2/(1-\rho^2)$ is the variance of $e_t$ derived above,

which is clearly not the simple variance formula for the case of homoskedastic, serially
uncorrelated errors. The standard errors are very likely to be underestimated, which will
lead to the usual t and F statistics not being valid.
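A minimal simulation sketch in EViews illustrates the point (the workfile and all names here are hypothetical, not part of sugar.wf1): a regression with positively autocorrelated AR(1) errors typically produces a Durbin-Watson statistic well below 2, previewing the test introduced below.

' sketch: simulate a bivariate regression with AR(1) errors
wfcreate u 200                 ' undated workfile, 200 observations
rndseed 12345
series x = nrnd                ' exogenous regressor
series u = nrnd                ' white-noise innovations
smpl @first @first
series e = u                   ' initialize the AR(1) error
smpl @first+1 @last
e = 0.8*e(-1) + u              ' recursive AR(1) with rho = 0.8
smpl @all
series y = 1 + 2*x + e
equation eqsim.ls y c x
scalar dw_sim = eqsim.@dw      ' expect a value well below 2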

Detecting Autocorrelation in the Error Term

Since the true population errors are not observed, all autocorrelation detection procedures
are based on the regression residuals, which are the sample estimates of the population
error terms.

Method 1: Graphical Inspection

The first method relies on graphical detection. This method is very important and
should always be considered the initial detection step. You run the regression, obtain the
residuals and plot them over time.
We work with the workfile sugar.wf1 (which is part of the data sets provided by Hill,
Griffiths and Judge, 2002) and estimate $\ln(A_t) = \beta_0 + \beta_1 \ln(P_t) + e_t$. Figure 1 presents the
graphs of the residuals (levels and first difference). The general pattern of the residuals
reflects some inertia, where positive (negative) errors tend to be followed by positive
(negative) errors.
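A minimal sketch of the commands behind this step (assuming sugar.wf1 is open; ehat is just a name chosen here to store the residuals — the commands actually used for Figure 1 appear at the end of these notes):

equation _reg01.ls log(a) c log(p)
genr ehat = resid     ' save the residuals before a later estimation overwrites resid
ehat.line             ' plot the residuals over time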

Method 2: Durbin-Watson Test

Then, you have more formal tests. The most commonly used is the Durbin-Watson test,
which is based on the AR(1) model for the error term $e_t$. Notice that this is
different from an AR(1) model for the dependent variable. Stock and Watson mostly work
with this second case, while here we introduce the AR(1) model into the bivariate regression
analysis. Consider a regression model with two equations:

$$y_t = \beta_0 + \beta_1 x_t + e_t$$

$$e_t = \rho e_{t-1} + u_t$$

This is a two-sided test of no serial correlation ($H_0: \rho = 0$, $H_1: \rho \neq 0$). EViews
automatically produces the Durbin-Watson (d) statistic, which is reported on the regression
output. Consider, again, $\ln(A_t) = \beta_0 + \beta_1 \ln(P_t) + e_t$:

Dependent Variable: LOG(A)
Method: Least Squares
Date: 12/04/08   Time: 23:58
Sample: 1 34
Included observations: 34

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           6.111328      0.168570     36.25397      0.0000
LOG(P)      0.970582      0.110629     8.773336      0.0000

R-squared            0.706345    Mean dependent var      4.707273
Adjusted R-squared   0.697168    S.D. dependent var      0.561094
S.E. of regression   0.308771    Akaike info criterion   0.544589
Sum squared resid    3.050865    Schwarz criterion       0.634375
Log likelihood      -7.258010    F-statistic             76.97143
Durbin-Watson stat   1.291242    Prob(F-statistic)       0.000000

You can find the d-statistic in the bottom left-hand corner of the regression output. This
statistic is defined as

$$d = \frac{\sum_{t=2}^{T} (\hat{e}_t - \hat{e}_{t-1})^2}{\sum_{t=1}^{T} \hat{e}_t^2} \approx 2(1 - \hat{\rho})$$

Therefore, no serial correlation is consistent with a d-statistic close to 2. When $d = 2$, the
correlation coefficient is zero and there is no evidence of autocorrelation in the error
term. If $d < 2$ ($d > 2$), there is evidence of positive (negative) autocorrelation. There is
perfect positive (negative) autocorrelation when $d = 0$ ($d = 4$).

EViews does not provide tables for the critical values or p-values of the Durbin-Watson
test. In our case, the d-statistic is 1.29, situated in the rejection area¹. You will not be
required to do hypothesis testing with the d-w statistic, but you need to know what it
is for.
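As a quick worked example using the approximation above, the implied correlation coefficient in the sugar regression is

$$\hat{\rho} \approx 1 - \frac{d}{2} = 1 - \frac{1.2912}{2} \approx 0.354,$$

which is close to the coefficient on the lagged residual (0.343) reported in the LM test output below.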

¹ Go to http://www.stanford.edu/~clint/bench/dwcrit.htm. For T=34 and 2 parameters, you have two critical
values at the 5% significance level: 1.39 (lower bound) and 1.51 (upper bound). A d-statistic below 1.39 is strong
evidence of positive serial correlation, while you do not reject the null of no correlation if the d-statistic is
above the upper bound (1.51). Values in the interval are mixed evidence, so the test is inconclusive.
Method 3: LM Test.

The Durbin-Watson test is, together with the ADF unit root test, the most commonly used test in time
series. However, it is important to know that it is not valid in many instances: if the
error distribution is not normal, or if the lagged dependent variable appears as a
regressor, it is not an appropriate test for autocorrelation. A test
that does not have these limitations is the Lagrange Multiplier test for
autocorrelation, or Breusch-Godfrey test.
Starting from the initial equation $y_t = \beta_0 + \beta_1 x_t + e_t$, the perturbation term $e_t$ is estimated
and its first lag is introduced in an auxiliary regression. The t-statistic of the lagged residual
gives you a test of the null of no serial correlation. EViews computes this statistic directly from the equation window, clicking on
View/Residual Tests/Serial Correlation LM Test. You should indicate the number of
lagged residuals to be included (in this case, only one). The final output should be similar to:

Breusch-Godfrey Serial Correlation LM Test:

F-statistic     4.022271    Probability   0.053699
Obs*R-squared   3.904864    Probability   0.048147

Test Equation:
Dependent Variable: RESID
Method: Least Squares
Date: 12/05/08   Time: 01:58

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           0.019701      0.161432     0.122040      0.9037
LOG(P)      0.011818      0.105912     0.111581      0.9119
RESID(-1)   0.343298      0.171173     2.005560      0.0537

R-squared            0.114849    Mean dependent var      1.01E-16
Adjusted R-squared   0.057742    S.D. dependent var      0.304057
S.E. of regression   0.295148    Akaike info criterion   0.481415
Sum squared resid    2.700477    Schwarz criterion       0.616094
Log likelihood      -5.184062    F-statistic             2.011135
Durbin-Watson stat   1.978183    Prob(F-statistic)       0.150928

Since the p-value of the F-statistic (0.0537) is higher than 5% but below 10%, rejecting or not rejecting the null
hypothesis of no serial correlation depends on the significance level selected. At 5%, we
cannot reject the hypothesis that there is no autocorrelation (note, however, that the
Obs*R-squared version of the statistic, with a p-value of 0.048, does reject at 5%). This result contradicts our
conclusion from the d-statistic. However, take into account that we can reject no serial
correlation at the 10% significance level.

One advantage of this test is that it can be generalized to higher orders of autocorrelation.
You should introduce further lags of the estimated residuals and use the F-statistic to
test the null hypothesis of no serial correlation.
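A minimal command-line sketch, assuming the equation _reg01 has already been estimated as above (auto is EViews' built-in serial correlation LM test view):

_reg01.auto(1)    ' Breusch-Godfrey test with one lagged residual
_reg01.auto(4)    ' higher-order test with four lagged residuals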
Figure 1: Residual Plot

Equation _reg01.ls LOG(A) C LOG(P)
' warning: this is time series, so no heteroskedasticity-robust s.e.
Genr ehat = resid      ' save the residuals
line ehat              ' residuals in levels
line d(ehat)           ' first difference of the residuals

[Two line graphs over the sample (observations 1-34): left panel RESID (residuals in levels), right panel D(RESID) (first difference of the residuals).]

The residual line plot shows some positive correlation while the first difference shows
negative correlation.

Code: EViews code for serial correlation

' This is the Durbin-Watson statistic
Equation _reg01.ls LOG(A) C LOG(P)
Scalar dstat = _reg01.@dw      ' a hyphen ("d-stat") is not a valid EViews name

' Lagrange Multiplier test
Equation _reg01.ls LOG(A) C LOG(P)
Genr ehat = resid                           ' save the residuals
Equation _reg02.ls ehat C LOG(P) ehat(-1)   ' auxiliary regression; the t-statistic on ehat(-1) tests rho = 0
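As a complementary sketch, the Obs*R-squared version of the LM statistic reported by EViews can be recovered from the auxiliary regression (@regobs and @r2 are the equation's number of observations and R-squared):

Scalar lm_stat = _reg02.@regobs * _reg02.@r2   ' distributed as chi-squared(1) under the null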
