# WESS Time Series Lectures

July, 2015

Alexander Karalis Isaac (Warwick), Time Series, July 2015

## Econometrics so far

We have studied the regression equation

yi = α + βxi + εi,   i = 1 … n

where i indicates an individual in a sample of n observations.

We have been interested in:
- The effect of x on y: dy/dx = β
- The predicted value of y, given x: E[yi | xi] = α + βxi
- The fit of the model, e.g. the R² statistic

Can we do the same with a pair of time series,

yt = α + βxt + εt ?

NO!
(Or rather, only under special circumstances, and such a regression is only …)

## Two problems in time series (1)

With data yt, xt the critical assumption E[εt | xt] = 0 is difficult to maintain. The y and x are often simultaneously determined in a system, 'the economy':

yt = αy + βy xt + εyt
xt = αx + βx yt + εxt

Consider the regression

yt = a + bxt + et
   = a + b(αx + βx yt + εxt) + et

The regression error et is an estimate of the yt-equation error εyt. It will be correlated with the regressor xt, as this regressor actually contains yt, and so contains εyt.

## Two problems in time series (2)

| param | estimate | (t-value) |
|---|---|---|
| a | 7.9 | (139) |
| b | 2.9 | (15.6) |
| R² | 0.75 | – |

yt is U.S. output. xt is mean land–sea temperature.

Now what do you think about the regression?

This is not a regression about climate change! It is called the spurious regression problem. It is easy to do crap regressions with time series data!
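The problem is easy to reproduce. A minimal numpy sketch (my own illustration, not the lecture data): regress one simulated random walk on another, completely independent, random walk, then repeat the regression in first differences.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500

# Two completely independent random walks
y = np.cumsum(rng.standard_normal(T))
x = np.cumsum(rng.standard_normal(T))

# OLS of y on a constant and x, in levels
X = np.column_stack([np.ones(T), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
r2_levels = 1 - (y - X @ b).var() / y.var()

# The same regression in first differences
dy, dx = np.diff(y), np.diff(x)
Xd = np.column_stack([np.ones(T - 1), dx])
bd, *_ = np.linalg.lstsq(Xd, dy, rcond=None)
r2_diff = 1 - (dy - Xd @ bd).var() / dy.var()

print(f"R2 in levels: {r2_levels:.3f}, R2 in differences: {r2_diff:.3f}")
```

Typically the levels regression shows a sizeable R² purely by chance, while the differenced regression shows essentially none.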

## Overview

In Part I our models will look like

yt = α + βyt−1 + εt   or   yt = α + βεt−1 + εt

so the explanatory variable is replaced by the previous value of the dependent variable, or by previous errors. Our primary interest will be prediction, E[yt | yt−1] = α + βyt−1.

In Part II we will learn how to estimate dynamic relationships

yt = α0 + α1 yt−1 + β0 xt + β1 xt−1 + εt

in a way that eliminates the two common problems with time series regressions.

## Time series data

Our sample data {yt}, {xt} refers to observations on the same unit in sequential time periods, t = 1, …, T. Periods may be years, quarters, months, weeks or days, depending on interest and data availability.

| years | quarters | months | weeks | days |
|---|---|---|---|---|
| | | finance | finance | finance |
| macro | macro | macro | | |
| growth | growth | | | |

## Time series data

There are three main types of time series data:
- Mean reverting ('stationary') series
- Series with a trend ('trend stationary')
- Series with permanent shocks ('integrated' series)

Part I of these notes deals with stationary series. Part II looks at testing for permanent shocks and dealing with integrated series.

## Mean reverting series

PIC

## Trend stationary series

PIC

## Integrated series

PIC

The data file 'macro vars.xls' contains lots of U.S. data series. GDP is a good example.

## Take logs

The first thing we don't like is the exponential shape. Our regressions are linear regressions, so let's transform the data to make it more linear.

## Logs

- Look at your data. For data in levels, taking logs is often the first step in applied work
- Not if the data is already in % changes, or an interest rate!
- Logarithms make percentage changes comparable by eye, which is often more relevant
- Recall g = (xt − xt−1)/xt−1 ⇒ 1 + g = xt/xt−1, so ln(xt) − ln(xt−1) = ln(1 + g) ≈ g for small g
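This approximation is easy to check numerically. A small sketch with an invented 2% growth step:

```python
import numpy as np

x_prev, x_curr = 100.0, 102.0                 # a 2% rise
g = (x_curr - x_prev) / x_prev                # exact growth rate: 0.02
log_diff = np.log(x_curr) - np.log(x_prev)    # ln(1 + g), roughly 0.0198

print(g, log_diff)
```

For small g the two are nearly identical; the gap grows with g (e.g. ln(1.5) ≈ 0.405 against g = 0.5).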

## Difference

If the variable appears to be trending, as with ln GDPt, a safe thing to do is take differences.

- Generate the difference of log GDP and plot it. What is the main difference compared to the previous plots?
- This now looks mean reverting. Later we will test formally whether a series is mean reverting or integrated, but don't forget it's always sensible to start by looking at your data.
- ∆yt = yt − yt−1 is the difference operator.

Consider the price data. Look at:
- levels
- log levels
- the difference of logs (inflation)
- the difference of inflation

Which would you be happy to consider mean reverting?

## Part I: Modelling stationary time series

So far I have talked about mean reverting series, but we can be more precise. We will model variables which are covariance stationary ('stationary' for short):
- The mean exists and does not depend on time: E[yt] = µ for all t (quick notation: ∀t)
- The variance exists and is independent of time: var(yt) = σy²
- The autocovariance cov(yt, yt−k) = σk² is independent of time; it depends only on k and not on t

## Modelling stationary time series: assumptions on errors

Any model of a stationary series imposes two key assumptions on the errors:
- A1: E[εt] = E[E[εt | yt−1]] = 0
- A2: E[εt εt−s] = 0 ∀s > 0

There are also some technical assumptions:
- A3: E[εt²] = var(εt) = σε²
- A4: yt and yt−j become independent as j gets large
- A5: Very large outliers are unlikely

These assumptions apply to the true model, and we have to replicate them in our statistical model.

## Discussion of assumptions

- A1 tells us that εt is unpredictable given information about yt−1, available before period t begins. In the regression context, we require that εt is unpredictable given all our r.h.s. variables. We use predetermined data yt−1, yt−2, εt−1, εt−2, …
- A2: In cross-sections, this is a second-order assumption determining the standard error of b. In time series it is a first-order assumption, determining the consistency of b. See exercise.
- A3 allows us to make calculations about variances, including confidence intervals around parameter estimates and forecasts. It is implied by stationarity.

## Discussion of assumptions

- A4 is a technical requirement for deriving the limiting behaviour of estimators. It replaces the i.i.d. assumption of cross-sectional data.
- A5 says that our models are not suitable for certain types of very high-frequency finance, but it's generally not a problem with macroeconomic data.

In applied work, spurious regressions and models with wrong/insufficient dynamics tend to violate A2, so checking it is key. Also check A2 if you are evaluating someone else's work!

## Stationary time series: AR(1) model

Our first time series model for stationary data:

yt = α + βyt−1 + εt   (1)

- This replaces independent explanatory data with the past value of the dependent variable
- It models the correlation between yt and its own past
- Stationarity requires |β| < 1; then the influence of past shocks dies away smoothly
- Estimate the model by OLS

## The AR(1) estimator

The AR(1) regression is like a standard OLS regression:

b = Σ_{t=2}^T (yt − ȳ)(yt−1 − ȳ) / Σ_{t=2}^T (yt−1 − ȳ)²
  = cov(yt, yt−1) / var(yt)   (in sample moments)

a = ȳ − bȳ  ⇒  ȳ = a / (1 − b)

## The AR(1) estimator

The variance of b follows standard OLS theory:

var(b) = σ̂² [ Σ_{t=2}^T (yt−1 − ȳ)² ]⁻¹ ≈ σ̂² / (T·var̂(yt))

where

σ̂² = (1/(T − 1 − k)) Σ_{t=2}^T ε̂t²

Note we lose an extra degree of freedom for every lag we include in the autoregression.

Confidence testing proceeds as usual, given |b| < 1:

τ = (b − b_{H0}) / SE(b) ∼ t_{α/2, DoF}

Expect a low R² compared to cross-sectional data.
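A quick numpy sketch (illustrative, with invented parameters α = 1, β = 0.5): simulate an AR(1), estimate a and b by OLS, and check the slides' identity ȳ = a/(1 − b).

```python
import numpy as np

rng = np.random.default_rng(1)
T, alpha, beta = 10_000, 1.0, 0.5

# Simulate a stationary AR(1): y_t = alpha + beta*y_{t-1} + eps_t
y = np.empty(T)
y[0] = alpha / (1 - beta)              # start at the unconditional mean
eps = rng.standard_normal(T)
for t in range(1, T):
    y[t] = alpha + beta * y[t - 1] + eps[t]

# OLS of y_t on a constant and y_{t-1}
X = np.column_stack([np.ones(T - 1), y[:-1]])
(a, b), *_ = np.linalg.lstsq(X, y[1:], rcond=None)

print(a, b, y.mean(), a / (1 - b))     # a, b near 1.0 and 0.5
```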

## Example

| param | estimate | t-value |
|---|---|---|
| a | | |
| b | | |
| R² | | – |

Plot the residuals of the regression. Do you think they meet A1–A3? We will look at formal tests for these assumptions below.

## General AR(p) model

- One lag of yt may not be enough: an omitted variable bias
- This shows up as E[εt εt−s] ≠ 0
- We find a model with enough lags to ensure E[εt εt−s] = 0 ∀s

Test for remaining serial correlation by regressing the residuals on their own lags:

ε̂t = b1 ε̂t−1 + · · · + bq ε̂t−q + νt
H0: b1 = b2 = · · · = bq = 0
HA: bi ≠ 0 for some i
τ = nR² ∼ χ²q

Inference on β̂i is as in standard multivariate OLS models.
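The χ²q statistic above is a Breusch–Godfrey-style LM test. A rough numpy sketch of the slide's simplified auxiliary regression (residuals on their own lags only; the full BG test also includes the original regressors):

```python
import numpy as np

def lm_serial_corr(resid, q):
    """n*R^2 from regressing resid_t on its own q lags (simplified LM test)."""
    T = len(resid)
    Y = resid[q:]
    X = np.column_stack([resid[q - i:T - i] for i in range(1, q + 1)])
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    r2 = 1 - (Y - X @ b).var() / Y.var()
    return len(Y) * r2

rng = np.random.default_rng(2)
T = 500
white = rng.standard_normal(T)         # serially uncorrelated "residuals"
ar = np.empty(T)
ar[0] = 0.0
e = rng.standard_normal(T)
for t in range(1, T):
    ar[t] = 0.9 * ar[t - 1] + e[t]     # strongly autocorrelated residuals

stat_white = lm_serial_corr(white, 4)
stat_ar = lm_serial_corr(ar, 4)
print(stat_white, stat_ar)             # compare with the chi2(4) 5% value, 9.49
```

The autocorrelated residuals produce a statistic far above the χ²(4) critical value, while white noise typically does not.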

## Model selection strategy

Should we begin small and add lags until A2 holds?

NO! Don't base your model selection algorithm on starting from models that don't make any statistical sense. Start big and eliminate insignificant regressors, to find the smallest model for which A2 still holds.

Quarterly example:
- Begin with p = 5
- Re-estimate excluding the insignificant longer lags
- Check E[εt εt−s] = 0, s = 1 … 4
- Repeat until the model contains only significant terms

D. Hendry's 'PcGets' software automates this.

## Notes on examples

You do some examples: ∆GDPt, ∆Const, ∆Invt, ∆Inft.

## The MA(q) process

- We noticed some series require very long AR models to capture all the conditional correlation of the yt series
- This costs degrees of freedom, making estimates and forecasts less accurate
- Is there a smaller model which could capture the dependency that AR models struggle with?
- This is the moving average process

## The MA(1) process

Equation:

yt = α + βεt−1 + εt

- Simple to analyse
- Stationary for any β value
- Harder to estimate: b determines {ε̂t}_{t=1}^T, but {ε̂t}_{t=1}^T is the regressor which determines b!
- Solution: take an MLE approach (as in Probit)

## MLE in the MA(1)

εt | yt−1 ∼ N(0, σε²)

f(yt | yt−1) = (1/(σε√(2π))) exp(−(yt − α − βεt−1)² / (2σε²))

l(α, β, σε²) = Σt ln f(yt | yt−1)

max l(·) w.r.t. α, β, σε²

Technically this is also conditional on ε0. A typical assumption is ε0 = E[εt] = 0, though there are other approaches. Inference follows the standard maximum likelihood procedure.
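Conditional on ε0 = 0, the errors can be recovered recursively via εt = yt − α − βεt−1, and maximising the Gaussian likelihood over β (for fixed α) is equivalent to minimising the sum of squared recovered errors. A sketch (my own illustration, with invented parameters) using a crude grid search rather than a proper optimiser:

```python
import numpy as np

rng = np.random.default_rng(3)
T, alpha, beta = 2_000, 0.0, 0.5

# Simulate an MA(1): y_t = alpha + beta*eps_{t-1} + eps_t
eps = rng.standard_normal(T + 1)
y = alpha + beta * eps[:-1] + eps[1:]

def css(b, y, a=0.0):
    """Conditional sum of squares, recovering errors with eps_0 = 0."""
    e, total = 0.0, 0.0
    for yt in y:
        e = yt - a - b * e          # eps_t = y_t - a - b*eps_{t-1}
        total += e * e
    return total

grid = np.linspace(-0.95, 0.95, 191)
b_hat = grid[np.argmin([css(b, y) for b in grid])]
print(b_hat)                        # close to the true beta = 0.5
```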


## Information criteria

The MLE approach suggests another tool for tackling model selection: minimise the expected information loss across potential models.

- AIC = −(2l(θ̂) − 2k): choose the model with the lowest AIC
- BIC = −(2l(θ̂) − k ln(T)): choose the model with the lowest BIC

Combine insights from significance tests and information criteria to choose a parsimonious model. Always check A2 holds! Information criteria are also relevant for AR models, which can be placed within MLE theory.
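The formulas are just arithmetic on the maximised log-likelihood l(θ̂) and the parameter count k. A toy check with invented numbers, where a bigger model buys slightly more likelihood with two extra parameters:

```python
import math

def aic(loglik, k):
    return -(2 * loglik - 2 * k)

def bic(loglik, k, T):
    return -(2 * loglik - k * math.log(T))

T = 100                                   # hypothetical sample size
small = {"loglik": -150.0, "k": 2}        # invented fitted models
big = {"loglik": -149.0, "k": 4}

print(aic(**small), aic(**big))           # 304.0 vs 306.0
print(bic(small["loglik"], small["k"], T), bic(big["loglik"], big["k"], T))
```

Both criteria prefer the small model here; note BIC's penalty k·ln(T) exceeds AIC's 2k once T > e² ≈ 7.4, so BIC selects more parsimonious models in realistic samples.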


## Notes on examples

You do some MA(q) examples: ∆GDPt, ∆Const, ∆Invt, ∆Inft.

## Model evaluation: forecast performance

If your job is forecasting, choose the model with the best forecasts!

- In-sample forecasts: the estimation period is 1 … T and we look at e.g. the 1-period-ahead forecast E[yt+1 | yt, θ̂T]. This is similar to in-sample fit, where we compare ŷt with yt, but now we are doing it 1 period ahead.
- Out-of-sample forecasts: the estimation period is 1 … N, and we look at 1-period-ahead forecasts E[yt+1 | yt, θ̂N] for t = N+1, N+2, N+3, etc., up to the final data point T. This is a tougher test, as none of the information in the forecast period contributed to the parameter estimation.

A simple criterion is minimum Mean Squared Error:

MSE = (1/N) Σ_{i=1}^N (ŷi − yi)²

where ŷi is the forecast and yi is the realisation.
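The criterion is one line of code. A sketch comparing two hypothetical forecast sets against the same invented realisations:

```python
import numpy as np

def mse(forecast, actual):
    forecast, actual = np.asarray(forecast), np.asarray(actual)
    return np.mean((forecast - actual) ** 2)

actual = [0.5, 0.7, 0.4, 0.6]       # invented realisations
model_a = [0.6, 0.6, 0.5, 0.5]      # off by 0.1 each period
model_b = [0.9, 0.3, 0.8, 0.2]      # off by 0.4 each period

print(mse(model_a, actual))         # ~0.01
print(mse(model_b, actual))         # ~0.16
```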

## Empirical Example: In-sample forecast comparisons

Compare 1-step-ahead forecasts from 4-lag and preferred AR, MA models.

| Variable | Model | MSE | Model | MSE |
|---|---|---|---|---|
| ∆GDP | AR(4) | | MA(4) | |
| | AR( ) | | MA( ) | |
| ∆Cons | AR(4) | | MA(4) | |
| | AR( ) | | MA( ) | |
| ∆Inv | AR(4) | | MA(4) | |
| | AR( ) | | MA( ) | |
| ∆Inf | AR(4) | | MA(4) | |
| | AR( ) | | MA( ) | |

## ARMA(p,q) models

- A1, A2, A3 apply for a well-specified model
- Estimation is by maximum likelihood
- Don't do large ARMAs; in practice an ARMA(2,1) is often a good approximation for macroeconomic time series

## Forecasts from an ARMA(2,1)

| Variable | MSE, k=1 | MSE, k=4 | MSE, k=8 |
|---|---|---|---|
| ∆GDP | | | |
| ∆Cons | | | |
| ∆Inv | | | |
| ∆Inf | | | |

Estimate the model to 2005. From 2003q1, produce static 1-period-ahead forecasts up to 2005, then dynamic 4- and 8-period-ahead forecasts, also from 2003q1. What happens to the MSE as the forecast horizon increases?

## Out of sample forecast example

Now estimate the model to 2007 and repeat the process, using dynamic out-of-sample forecasting up to 2011.

| Variable | MSE, k=1 | MSE, k=4 | MSE, k=8 |
|---|---|---|---|
| ∆GDP | | | |
| ∆Cons | | | |
| ∆Inv | | | |
| ∆Inf | | | |

This is the problem the BoE had (with a more sophisticated model) during the crisis. The Fed did less badly because its model updates the parameters, via the Kalman filter, when it makes an error. Beyond the scope of this course!

## More on forecast errors

We have used the MSFE to look at different models and the effect of different time horizons. Out-of-sample forecast errors are larger than in-sample, because the forecast error is really composed of two parts:

MSFE = E[(yT+1 − ŷT+1|T)²]
     = σε² + var[(a − α) + (b − β)yT]

The out-of-sample forecasts involve re-estimating the model, so they give an estimate of the likely performance of the model in real time.

## Deeper into time series: preliminaries

- What does the AR part actually measure?
- What does the MA part actually measure?
- Why is their combination sometimes more useful?

Think about the way the influence of past shocks εt−s decays over time. To go deeper into time series we need to brush up our maths! We will look at deriving the conditional and unconditional expectations, variances and autocovariances for simple time-series models.

## Conditional Expectations

The conditional expectation E[yt+1 | yt] follows from the conditional mean equation we write down in the AR(1) or MA(1) model.

AR(1):

E[yt+1 | yt] = E[α + βyt + εt+1 | yt]
            = α + βE[yt | yt] + E[εt+1 | yt]
            = α + βyt

MA(1):

E[yt+1 | yt] = E[α + βεt + εt+1 | yt]
            = α + βE[εt | yt]
            = α + βεt

AR(1), further ahead:

E[yt+2 | yt] = E[α + βyt+1 + εt+2 | yt]
            = α + βE[yt+1 | yt] + E[εt+2 | yt]
            = α + β(α + βyt)
            = α + βα + β²yt

E[yt+k | yt] = Σ_{i=0}^{k−1} βⁱα + βᵏyt

lim_{k→∞} E[yt+k | yt] = α / (1 − β)

MA(1):

E[yt+k | yt] = α   ∀k ≥ 2
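The AR(1) recursion and its limit are easy to verify numerically. A quick check with illustrative numbers α = 1, β = 0.5 and current value yt = 4:

```python
alpha, beta, y_t = 1.0, 0.5, 4.0

def ar1_forecast(k):
    """E[y_{t+k} | y_t] = sum_{i=0}^{k-1} beta**i * alpha + beta**k * y_t."""
    return sum(beta ** i * alpha for i in range(k)) + beta ** k * y_t

print(ar1_forecast(1))   # 3.0  (= alpha + beta*y_t)
print(ar1_forecast(2))   # 2.5  (= alpha + beta*alpha + beta^2 * y_t)
print(ar1_forecast(50))  # approaches alpha/(1 - beta) = 2.0
```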


## Unconditional Expectations

If we know the process, but have no observations, what is our best guess at a value yt? Our best guess is the unconditional mean implied by the process.

AR(1):

E[yt] = α + βE[yt]
⇒ E[yt] = α / (1 − β)

MA(1):

E[yt] = α + βE[εt−1] + E[εt] = α

## Uncertainty and variance: AR(1)

Conditional variance:

var(yt+1 | yt) = var(α + βyt + εt+1 | yt)
             = var(εt+1 | yt) = σε²
var(yt+2 | yt) = var(α + βyt+1 + εt+2 | yt)
             = β²var(yt+1 | yt) + var(εt+2 | yt)
             = (1 + β²)σε²
var(yt+k | yt) = Σ_{i=0}^{k−1} (β²)ⁱ σε²
⇒ lim_{k→∞} var(yt+k | yt) = σε² / (1 − β²)

## Uncertainty and variance: AR(1)

Unconditional variance:

σy² = var(yt) = var(α + βyt−1 + εt)
    = β²var(yt) + σε²
⇒ σy² = σε² / (1 − β²)

Compare this to the limit of the conditional variance: they coincide.

## Uncertainty and variance: MA(1)

Conditional variance:

var(yt+1 | yt) = var(α + βεt + εt+1 | yt) = σε²
var(yt+k | yt) = var(α + βεt+k−1 + εt+k | yt) = (1 + β²)σε²   ∀k ≥ 2

Unconditional variance:

var(yt) = (1 + β²)σε²

So the conditional variance of the MA(1) returns to the unconditional variance after 2 periods!

## Forecast error variance

We should include confidence intervals in our forecasts. Assume εt ∼ N(0, σε²). Then the 95% confidence interval for E[yt+k | yt] in the AR(1) is

ŷt+k|t ± 1.96 ( Σ_{i=0}^{k−1} (β²)ⁱ σε² )^{1/2}

In practice it is common to apply these formulas to forecasts generated with the estimates a, b, σ̂ε², ignoring the extra uncertainty created by estimating parameters.

## Forecasts with confidence intervals

PIC

## Deeper into time series: ACF

- An important property is the correlation between yt and yt−k
- The autocovariance function is the set of numbers cov(yt, yt−k) := σk²
- The sample estimator of the autocovariance function is σ̂k² = (1/(T − k − 1)) Σ_{t=k+1}^T ỹt ỹt−k, where ỹt = yt − ȳ
- The autocovariance function is normalised by the variance of y to give the autocorrelation function ACF(k):

ρk = cov(yt, yt−k) / var(yt)
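The sample estimator can be coded directly. A sketch (my own helper, normalising by the overall sum of squares rather than the slide's T − k − 1 factor), checked against a simulated AR(1), for which ρk = βᵏ:

```python
import numpy as np

def sample_acf(y, k):
    """rho_k estimate: sum (y_t - ybar)(y_{t-k} - ybar) / sum (y_t - ybar)^2."""
    yt = np.asarray(y, dtype=float) - np.mean(y)
    return np.sum(yt[k:] * yt[:-k]) / np.sum(yt ** 2)

rng = np.random.default_rng(4)
T, beta = 20_000, 0.7
y = np.empty(T)
y[0] = 0.0
e = rng.standard_normal(T)
for t in range(1, T):
    y[t] = beta * y[t - 1] + e[t]

print(sample_acf(y, 1), sample_acf(y, 2))   # roughly 0.7 and 0.49
```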


## ACF for various stationary models

PIC

## ACF: discussion

- The ACF shows us how long it takes for the influence of past shocks to die away, by measuring the correlation between yt and its own past values
- For stationary processes the ACF becomes statistically insignificant after a finite number of periods
- Stationary processes have finite memory: the influence of a shock is finite

PIC: growth ACF

## Deeper into time series: PACF

Clearly autoregressions can capture correlation between yt and its past, but how many lags do we need? If yt = α + β1yt−1 + β2yt−2 + εt, then we know from regression analysis that β2 is a measure of the conditional correlation between yt and yt−2, after accounting for the correlation explained by yt−1.

PACF(k) = cov(yt, yt−k | yt−1, yt−2, …, yt−k+1) / [ var(yt | yt−1, …, yt−k+1) · var(yt−k | yt−1, …, yt−k+1) ]^{1/2}

e.g. PACF(3) = …

## PACFs for stationary processes

PIC

## Memory in AR(1)

Let yt = βyt−1 + εt, i.e. put α = 0 ⇒ µ = 0.

cov(yt, yt−1) = E[(βyt−1 + εt)yt−1]
            = βE[yt−1²] = βσy²
⇒ corr(yt, yt−1) = β

cov(yt, yt−2) = E[(βyt−1 + εt)yt−2]
            = E[(β(βyt−2 + εt−1) + εt)yt−2]
            = E[β²yt−2² + βεt−1yt−2 + εt yt−2]
            = β²σy²
⇒ corr(yt, yt−2) = β²

## ACF for different AR(1) models

PIC

## PACF of AR models

In the AR(2) regression yt = α + β1yt−1 + β2yt−2 + εt, the coefficient on yt−2 measures the conditional correlation

cov(yt, yt−2 | yt−1) / [ var(yt | yt−1) · var(yt−2 | yt−1) ]^{1/2}

- PACF(k) = βk in AR(p) models
- So the PACF drops sharply to 0 after the final lagged term in the AR(p) model
- This is an alternative way to think about how many lags to include

## PACF: various AR models

PIC

What do you notice about ACF vs. PACF in AR models?

## ACF of MA(1)

cov(yt, yt−1) = E[(βεt−1 + εt)(βεt−2 + εt−1)]
            = βσε²
⇒ corr(yt, yt−1) = βσε² / ((1 + β²)σε²) = β/(1 + β²)

cov(yt, yt−k) = 0 for k ≥ 2, so ACF(k) = 0 ∀k ≥ 2

## PACF of MA(1)

To calculate the PACF directly is hard. Here's a neat trick. Assume |β| < 1 and notice that (setting α = 0) εt = yt − βεt−1. Substituting repeatedly:

yt = β(yt−1 − βεt−2) + εt
   = β(yt−1 − β(yt−2 − βεt−3)) + εt
   = βyt−1 − β²yt−2 + β³(yt−3 − βεt−4) + εt
   …
yt = Σ_{i=1}^{∞} (−1)^{i+1} βⁱ yt−i + εt

which is an AR(∞), and is well defined given |β| < 1. Using the earlier result, the PACF will decay geometrically as βⁱ declines to zero.

## Box–Jenkins model building method

Two famous statisticians suggested the ACF/PACF as a way of building time series regressions.

| | AR(p) | MA(q) | ARMA(p,q) |
|---|---|---|---|
| ACF | Decays smoothly | Chops off at q lags | Decays smoothly |
| PACF | Chops off at p lags | Decays smoothly | Decays smoothly |

- Inspection of the empirical ACF and PACF can suggest a sensible starting ARMA(p,q) model
- Then test down to a small model using significance tests and information criteria
- Always check A2 holds for your residuals

## Empirical P/ACF

PIC
∆ln GDP
PIC

## Empirical P/ACF

PIC
∆Inf
PIC

## Summary

We have dealt with finite memory processes where:
- ACF(k) → 0 as k → ∞
- PACF(k) → 0 as k → ∞
- E[yt] = µ ∀t
- var(yt) = σy² ∀t
- cov(yt, yt−k) depends only on k and not t

ARMA(p,q) models make decent forecasts for these series. But in economics, they are only approximate models. How do we deal with levels of series and model relationships between dynamic economic variables?

## PART II: Integrated processes

Processes with permanent shocks are called integrated processes.
- A simple example shows our ideas of µ and σ² are not compatible with permanent shocks
- The first problem is to decide if a series is integrated: Dickey–Fuller tests
- We then have a choice:
  - Difference the series to make it stationary
  - Look for cointegration between two or more integrated series

## Permanent shocks

Consider the random walk yt = yt−1 + εt, y0 = 0.

y1 = y0 + ε1 = ε1
y2 = y1 + ε2 = ε1 + ε2
…
yt = ε1 + ε2 + · · · + εt

var(yt) = var( Σ_{i=0}^{t−1} εt−i ) = tσε² → ∞ as t → ∞

## Permanent shocks

- Think about the random walk with drift yt = α + yt−1 + εt
- This is an AR(1) with β = 1
- Thus E[yt] = α/(1 − β) is undefined: the process has no unconditional mean
- Conditional forecasts still exist: E[yt+k | yt] = yt + kα

## Regressions with random walks

Generate two independent random walks and regress one on the other:

| param | estimate | (t-value) |
|---|---|---|
| α | | |
| β | | |
| R² | | – |

Breusch–Godfrey stat for serial correlation up to order 4: …

- This is typical of a spurious regression
- High R² combined with positive serial correlation is always a sign of a spurious regression
- Now regress ∆y on ∆x. Is there any relationship?

## Testing for unit roots: Dickey–Fuller test

- The best way to avoid spurious regressions is to do regressions with stationary series
- To determine stationarity, we need to test β = 1 in the process yt = α + βyt−1 + εt

∆yt = α + (β − 1)yt−1 + εt
    = α + ρyt−1 + εt

H0: ρ = 0 ⇒ there is a unit root
HA: ρ < 0 ⇒ no unit root

tDF = ρ̂ / SE(ρ̂)

The test statistic tDF follows the Dickey–Fuller distribution, which gives much more negative critical values than the standard normal.
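The DF regression is just OLS of ∆yt on a constant and yt−1. A numpy sketch (my own illustration) comparing the t-ratio for a simulated random walk with that for a clearly stationary AR(1); the critical values themselves come from DF tables, which this sketch does not reproduce:

```python
import numpy as np

def df_tstat(y):
    """t-ratio on y_{t-1} in: dy_t = alpha + rho*y_{t-1} + e_t."""
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    b, *_ = np.linalg.lstsq(X, dy, rcond=None)
    e = dy - X @ b
    s2 = e @ e / (len(dy) - 2)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return b[1] / se

rng = np.random.default_rng(5)
T = 1_000
walk = np.cumsum(rng.standard_normal(T))   # unit root: rho = 0
y_stat = np.empty(T)                       # stationary AR(1) with beta = 0.5
y_stat[0] = 0.0
e = rng.standard_normal(T)
for t in range(1, T):
    y_stat[t] = 0.5 * y_stat[t - 1] + e[t]

print(df_tstat(walk), df_tstat(y_stat))
```

The stationary series gives a hugely negative t-ratio; the random walk's is much closer to zero, and should be judged against the DF critical value (about −2.86 at 5% with a constant), not the normal.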


## Dickey–Fuller distribution

PIC

The DF distribution is sensitive to the specification of the test:
- Inclusion of an intercept
- Inclusion of a trend
- Number of lags
- Sample size

## The Augmented Dickey–Fuller test

It is essential that there is no serial correlation in the DF regression residuals. If necessary, add lagged differences of the dependent variable:

yt = α + β1 yt−1 + β2 yt−2 + εt
   = α + β1 yt−1 + β2 yt−1 − β2 yt−1 + β2 yt−2 + εt
   = α + (β1 + β2) yt−1 − β2 ∆yt−1 + εt
∆yt = α + (β1 + β2 − 1) yt−1 − β2 ∆yt−1 + εt
    = α + ρyt−1 − β2 ∆yt−1 + εt

## Dealing with trends

Include trends using the 'restricted trend' option if available. For g = γ/(1 − β):

yt = α + γt + βyt−1 + εt
∆yt = α + (β − 1)(yt−1 − gt) + εt
    = α + ρ(yt−1 − gt) + εt
⇒ ∆yt = α + εt if ρ = 0   (2)
⇒ yt = α + γt + βyt−1 + εt if ρ < 0   (3)

From (2), if the process has a unit root it is a random walk with drift. From (3), if the process does not have a unit root, it is trend stationary with |β| < 1.

## Dickey–Fuller Tables

## Notes on exercise

- The order of integration, d, written yt ∼ I(d), is the number of times a series must be differenced in order to make it stationary
- Determine the order of integration of Output, Consumption, Investment and Prices
- Do any series exhibit trend-stationary behaviour?

## Cointegration: random walks which tango!

So far we have dealt with integrated series by differencing to make them stationary and modelling their (univariate) stationary behaviour. There is an important case when we can work with two (or more) integrated series directly: when the series are cointegrated.
- Economic behaviour creates long-run (equilibrium) relationships between series, e.g. output and consumption, investment and output, house prices and earnings (?), stock prices and profits (?)
- The ratio of such series is a stationary series, even though the two series are I(1)!
- Variables which cointegrate in this way adjust to dynamic shocks in order to move back towards their equilibrium relationship

## Output and Consumption

Plots of series

## Cointegration: formal definition

If a linear combination of I(1) series is I(0), then the two series cointegrate:

xt ∼ I(1),  yt ∼ I(1)
yt − βxt ∼ I(0)

- The 'cointegrating vector' is the pair of values (1, −β) which (working in logs) gives the stationary ratio between the series
- Economic theory often suggests theoretical values for β, so it is interesting to see whether these hold in the data

## Common stochastic trends

Cointegration occurs when two series share a common stochastic trend, say Xt. Let X0 = 0 and

Xt = Xt−1 + εt  ⇒  Xt = Σ_{s=1}^{t} εs

yt = βXt + ỹt,   xt = Xt + x̃t

⇒ yt − βxt = βXt + ỹt − β(Xt + x̃t)
           = ỹt − βx̃t ∼ I(0)

The common stochastic trend has been cancelled out. The pair (1, −β) is called the cointegrating vector, as it gives the stationary linear combination of y and x.

## Output and Consumption

## Cointegration: long- and short-run relationships

If an economically meaningful equilibrium relationship exists:
- There must be dynamic adjustment in the short run in order to return the variables towards equilibrium levels when shocks push them apart
- Thus the long-run relationship makes predictions about the short run: the levels of the series this period help us predict changes in the series next period
- We can represent both the long-run and the short-run behaviour of cointegrated series through the error correction model

## Error correction model

We have seen an estimate of the cointegrating relationship between const and outputt:

ct = βyt + εt

Encouragingly, the residuals ε̂t from this relationship were stationary. But look at the Breusch–Godfrey stat (XXX): the above model is not dynamically well-specified; it does not meet A2.

A model with more general dynamics is

ct = β1 yt + β2 yt−1 + β3 ct−1 + εt   (4)

This allows for the response of ct to its own past, and to current and lagged values of yt.

## Error correction model

Although (4) is a more general dynamic specification, it consists of I(1) variables, yet the εt series should be I(0). With a bit of algebra we can rewrite the model entirely in terms of I(0) variables:

ct = β1 yt + β2 yt−1 + β3 ct−1 + εt
   = β1 yt − β1 yt−1 + β1 yt−1 + β2 yt−1 + β3 ct−1 + εt
   = β1 ∆yt + (β1 + β2) yt−1 + β3 ct−1 + εt
∆ct = β1 ∆yt + (β1 + β2) yt−1 + (β3 − 1) ct−1 + εt
    = β1 ∆yt + (β3 − 1) [ ct−1 − ((β1 + β2)/(1 − β3)) yt−1 ] + εt

The term in square brackets is the error correction (E.C.) term. ∆yt and ∆ct are I(0); provided there is cointegration, so are the error term and the equilibrium relationship in the brackets.
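The rearrangement is an identity, which a quick numerical check confirms (arbitrary invented parameter values and data):

```python
import numpy as np

rng = np.random.default_rng(6)
b1, b2, b3 = 0.4, 0.1, 0.7                 # arbitrary illustrative parameters
c_lag, y_now, y_lag = rng.standard_normal((3, 100))

# Original dynamic regression (4), without the error term
c_now = b1 * y_now + b2 * y_lag + b3 * c_lag

# ECM form: dc = b1*dy + (b3 - 1)*(c_{t-1} - ((b1 + b2)/(1 - b3))*y_{t-1})
dc = b1 * (y_now - y_lag) + (b3 - 1) * (c_lag - (b1 + b2) / (1 - b3) * y_lag)

print(np.allclose(c_now - c_lag, dc))      # True: the two forms are identical
```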


## Error correction model

∆ct = α1 ∆yt + α2 (ct−1 − βyt−1) + εt   (5)

- Cointegration imposes the restrictions α2 = (β3 − 1) and β = (β1 + β2)/(1 − β3)
- If there is a cointegrating relationship:
  - ct−1 − β̂yt−1 ∼ I(0), and ε̂t ∼ I(0)
  - α̂2 < 0
- The α̂2 < 0 requirement ensures that when ct is above its long-run level in period t − 1, it adjusts by falling in period t
- To estimate such a model, we need an estimate of ct−1 − β̂yt−1

## Estimation of the ECM

Engle and Granger (1987) propose a two-step procedure for estimating (5).

First, we need an estimate of the cointegrating vector. Regress:

ct = βyt + νt
⇒ ν̂t = ct − β̂yt

The ν̂t are our estimate of deviations from the long-run equilibrium relationship.

Second, we estimate by OLS

∆ct = α1 ∆yt + α2 ν̂t−1 + εt

We can recover estimates of the parameters of the original dynamic model (4) from the parameters of the estimated ECM: α̂1, α̂2 and ν̂t−1.

## Testing for cointegration: EG procedure

The two-step estimation approach suggests a method for testing whether two series are actually cointegrated.

Estimate the cointegrating relationship

ct = βyt + νt

Save the ν̂t series and perform an ADF test with no intercept:

∆ν̂t = ρν̂t−1 + Σ_{i=1}^{p−1} γi ∆ν̂t−i + ut

H0: ρ = 0 ⇒ ν̂t is I(1) and there is no cointegration
HA: ρ < 0 ⇒ ν̂t is I(0) and there may be cointegration

Critical values are MacKinnon's, which are more negative than the standard DF critical values.

## Testing for cointegration: EG procedure

…then estimate the ECM. Test that there is a significant, negative change in ct whenever ct−1 > β̂yt−1, in order to restore equilibrium:

H0: α2 = 0 ⇒ no error correction
HA: α2 < 0 ⇒ error correction is significant

τ = α̂2 / SE(α̂2) ∼ t_{0.05, DoF}

If the estimates pass these two tests, there is significant cointegration and the ECM can be used to estimate the dynamic model. If not, then work with differences, i.e. transform the two series to make them stationary.
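Both steps, and both checks, fit in a few lines of numpy. A sketch on simulated cointegrated data (invented true values: β = 0.8 with an i.i.d. equilibrium error, so the true adjustment coefficient is −1):

```python
import numpy as np

rng = np.random.default_rng(7)
T = 2_000

# Cointegrated pair: y_t is a random walk, c_t = 0.8*y_t + stationary error
y = np.cumsum(rng.standard_normal(T))
u = 0.5 * rng.standard_normal(T)
c = 0.8 * y + u

def ols(X, z):
    b, *_ = np.linalg.lstsq(X, z, rcond=None)
    return b

# Step 1: cointegrating regression c_t = beta*y_t + nu_t (no constant here)
beta_hat = ols(y.reshape(-1, 1), c)[0]
nu = c - beta_hat * y

# Step 2: ECM  dc_t = a1*dy_t + a2*nu_{t-1} + e_t
dc, dy = np.diff(c), np.diff(y)
a1, a2 = ols(np.column_stack([dy, nu[:-1]]), dc)

print(beta_hat, a1, a2)   # roughly 0.8, 0.8 and -1 (a2 < 0: error correction)
```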


## EG procedure: discussion

The Engle–Granger procedure works well with two variables, but there are drawbacks:
- The initial regression is misspecified; ν̂t is usually serially correlated
- The two-step approach introduces more variance than a dynamically well-specified one-step procedure
- Results, especially with more than two variables, are sensitive to which variable is taken as the left-hand-side variable
- With more than two variables there may be more than one cointegrating relationship, and EG will estimate a linear combination of these relationships, which has no real interpretation

These problems can be overcome by the Johansen procedure, which is a vector-based approach to estimating cointegrating equations.

## Empirical examples

- (ct, yt)
- (hpt, wt)
- (SPt, Dt)

## Forecast comparisons

- Estimate your error correction models on 1960–2000
- Estimate your preferred ARIMA on 1960–2000
- Produce 1-step and 4-step-ahead out-of-sample forecasts with each model for 2001–2006
- Compare the MSPE from each model

## Summary: Work stream for applied time series

Look at the data:
- Is the series trending over time?
- Is the trend exponential or linear?
- Is the series mean reverting?
- Would the series look mean reverting in most subsamples?
- Are there several variables that seem to exhibit the same random trend?

Take logs of exponentially increasing variables.

Begin Dickey–Fuller tests:
- Decide about the appropriate inclusion of trends and constants based on visual inspection and inspection of the DF regression results
- Include f + 1 lags in the initial DF specification and remove insignificant lags; check for serial correlation up to order f; ensure A2 is satisfied

Using the preferred specification of the DF tests, decide on the order of integration of the series.

## Summary: With the transformed stationary series

Build univariate ARMA models for forecasting:
- Inspect the ACF and PACF; decide on candidate AR, MA, ARMA specifications; test down by eliminating insignificant lags, minimizing AIC/BIC; ensure A2 is satisfied in the preferred model

Inspect forecast predictions vs. actual outcomes:
- Do the forecast error bounds include 95% of actual outcomes?
- Are the forecast errors close to uncorrelated?

Test robustness by performing an out-of-sample forecast exercise:
- You will need to reserve part of your sample, so you will lose some information from the estimation
- But you might find a model that performs better in practice, or at least understand more about how your model is likely to perform as new data comes in

## Summary: Modelling cointegrating series

- Plot the ratio of interest
- Engle–Granger Procedure Step I:
  - Estimate the cointegrating relationship with appropriate constant/trend inclusion
  - Save the residuals
  - Perform a Dickey–Fuller test on the residuals: no constant! MacKinnon p-values
  - H0: no cointegration. If you reject H0, go to…
- Engle–Granger Procedure Step II:
  - Estimate the ECM with appropriate lagged differences so that A2 holds
  - Test α̂2 < 0 by a standard t-test
  - H0: no cointegration (α2 = 0). If you reject H0…
- The ECM is the correct model. Recover the parameters of the restricted ARDL model with appropriate transformations
- Interpret the cointegrating relationship
- Make dynamic forecasts
The End!