
Box-Jenkins Methodology

BJ models use only current and past values of the time series to produce forecasts (no other independent variables)

Steps in Box-Jenkins Modeling


Prepare Raw Data -> Identify Model -> Estimate Parameters -> Model Good ?
Yes : Forecast
No : Revise the Model and re-estimate

Data Preparation
Data has to be transformed to stationarity before applying BJ technique. Stationarity consists of three parts.

Stationary in Variance
Fluctuation is constant over time. Detectable by scatter plot. Usually enforced by taking the natural log ($\log_e$) or the square root. Mathematically, $\mathrm{Var}(Y_t) = \sigma^2$.

Stationary in Mean

The series fluctuates about a fixed level. Detectable by scatter plot and ACF. Usually enforced by differencing a suitable number of times d. Mathematically, $E(Y_t) = \mu$.

Covariance Stationary

Not detectable by scatter plot. Mathematically, for any $k \ge 0$, $\mathrm{Cov}(Y_t, Y_{t-k})$ depends on k only.
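The effect of a log transform on stationarity in variance can be checked numerically. The sketch below is only an illustration under stated assumptions: it builds a synthetic series whose spread grows with t (the form $Y_t = t\,\varepsilon_t$ with $\varepsilon_t \sim N(5,1)$ is assumed), and it measures "fluctuation" by the standard deviation of first differences in each half of the sample. The helper name `half_spreads` and the seed are hypothetical choices.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Assumed synthetic series: Y_t = t * e_t with e_t ~ N(5, 1); spread grows with t.
n = 100
y = [t * random.gauss(5, 1) for t in range(1, n + 1)]


def half_spreads(series):
    """Std. dev. of first differences in the first and second half of a series."""
    diffs = [series[t] - series[t - 1] for t in range(1, len(series))]
    mid = len(diffs) // 2
    return statistics.stdev(diffs[:mid]), statistics.stdev(diffs[mid:])


raw_first, raw_second = half_spreads(y)
log_first, log_second = half_spreads([math.log(v) for v in y])

print("raw series, spread ratio (2nd half / 1st half):", raw_second / raw_first)
print("log series, spread ratio (2nd half / 1st half):", log_second / log_first)
```

A ratio well above 1 for the raw series and a ratio near 1 after taking logs is what a variance-stabilizing transform should produce; the exact figures depend on the random draw.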

Variance Stabilization
$Y_t = t\,\varepsilon_t$ where $\varepsilon_t \sim \text{iid } N(5, 1)$

[Figure: Before Taking Log. Y(t) plotted against t = 1 to 99; vertical axis from 0 to 700.]
[Figure: After Taking Log. Log(Y(t)) plotted against t = 1 to 99; vertical axis from 0 to 7.]

Backshift Operator : B

$B^k y_t = y_{t-k}$
$(1 - B)\,y_t = y_t - y_{t-1}$
$(1 - B)^2 y_t = y_t - 2y_{t-1} + y_{t-2}$
$(1 - B)^{-1} y_t = (1 + B + B^2 + \dots)\,y_t$
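The backshift identities can be tried on a short list of numbers. This is a minimal sketch with hypothetical helper names (`difference`, `integrate`) and made-up data: `difference` applies $(1 - B)^d$, and `integrate` undoes one difference by a running sum, which is the finite-sample counterpart of $(1 - B)^{-1} = 1 + B + B^2 + \dots$

```python
def difference(y, d=1):
    """Apply (1 - B)^d: difference the series d times (each pass drops one point)."""
    out = list(y)
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


def integrate(dy, first_value):
    """Undo one difference with a running sum, starting from a known first value."""
    out = [first_value]
    for step in dy:
        out.append(out[-1] + step)
    return out


y = [3, 5, 10, 12, 11]          # made-up data
d1 = difference(y, 1)           # y_t - y_{t-1}
d2 = difference(y, 2)           # (1 - B)^2 y_t
direct = [y[t] - 2 * y[t - 1] + y[t - 2] for t in range(2, len(y))]

print(d1)                       # first differences
print(d2 == direct)             # differencing twice matches y_t - 2y_{t-1} + y_{t-2}
print(integrate(d1, y[0]) == y) # the running sum recovers the original series
```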

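Autoregressive (AR) Models

Typical model :
$Y_t = 6 + 1.2Y_{t-1} - 0.8Y_{t-2} + \varepsilon_t$

General AR(p) model : (stationarity assumed)

$Y_t = \phi_0 + \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p} + \varepsilon_t$
$(1 - \phi_1 B - \dots - \phi_p B^p)\,Y_t = \phi_0 + \varepsilon_t$

where $1 - \phi_1 B - \dots - \phi_p B^p$ does not contain the factor $1 - B$.

Moving Average (MA) Models

Typical model :
$Y_t = \varepsilon_t + \varepsilon_{t-1} + 0.8\varepsilon_{t-2}$

General MA(q) model : (stationarity assumed)

$Y_t = \theta_0 + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$
$Y_t = \theta_0 + (1 + \theta_1 B + \dots + \theta_q B^q)\,\varepsilon_t$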
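The AR(p) condition above, that the AR polynomial must not contain the factor 1 - B, is easy to test: a polynomial in B has 1 - B as a factor exactly when it evaluates to zero at B = 1. The sketch below checks only this one condition (it is not a full stationarity test, since other roots also matter). The function names are hypothetical; the first coefficient set is the AR(2) example Y(t) = 0.5*Y(t-1) - 0.4*Y(t-2) + e(t) from the PACF slides, and the second is the expanded ARIMA(1,1,0) form from the Transportation case.

```python
def ar_polynomial_at_one(phi):
    """Evaluate 1 - phi_1*B - ... - phi_p*B^p at B = 1."""
    return 1.0 - sum(phi)


def contains_unit_factor(phi, tol=1e-9):
    """True if (1 - B) divides the AR polynomial, i.e. it vanishes at B = 1."""
    return abs(ar_polynomial_at_one(phi)) < tol


ar2_example = [0.5, -0.4]          # Y(t) = 0.5*Y(t-1) - 0.4*Y(t-2) + e(t)
differenced_form = [1.438, -0.438]  # Y(t) = 1.438*Y(t-1) - 0.438*Y(t-2) + e(t)

print(contains_unit_factor(ar2_example))       # no factor 1 - B
print(contains_unit_factor(differenced_form))  # factor 1 - B present: difference first
```

A series whose fitted polynomial does contain 1 - B should be differenced and modeled as ARIMA rather than as a plain AR model.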

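Autoregressive Moving Average (ARMA) Models

Typical model :
$Y_t = 0.2Y_{t-1} - 0.8Y_{t-2} + 0.1\varepsilon_{t-1} + \varepsilon_t$

General ARMA(p,q) model : (stationarity assumed)

$Y_t - \phi_1 Y_{t-1} - \dots - \phi_p Y_{t-p} = \theta_0 + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$
$(1 - \phi_1 B - \dots - \phi_p B^p)\,Y_t = \theta_0 + (1 + \theta_1 B + \dots + \theta_q B^q)\,\varepsilon_t$

where $1 - \phi_1 B - \dots - \phi_p B^p$ does not contain the factor $1 - B$.

Autoregressive Integrated Moving Average (ARIMA) Models

These are ARMA models fitted to data that need to be differenced to ensure stationarity in mean. General ARIMA(p,d,q) model :

$(1 - \phi_1 B - \dots - \phi_p B^p)\,(1 - B)^d\,Y_t = \theta_0 + (1 + \theta_1 B + \dots + \theta_q B^q)\,\varepsilon_t$

where $1 - \phi_1 B - \dots - \phi_p B^p$ does not contain the factor $1 - B$.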

ARIMA(p,d,q) Models
ARIMA(2,1,1) = ARMA(2,1) fitted to data differenced once
ARIMA(0,2,1) = MA(1) fitted to data differenced twice
ARIMA(1,0,1) = ARMA(1,1)
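To see what "ARMA fitted to differenced data" means for the original series, the left-hand side of an ARIMA model can be multiplied out into a single polynomial in B. The sketch below does this with plain coefficient lists; `poly_mul` and `arima_ar_side` are hypothetical helper names, and the value 0.438 is borrowed from the Transportation case later in the deck.

```python
def poly_mul(a, b):
    """Multiply two polynomials in B given as coefficient lists [c0, c1, c2, ...]."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def arima_ar_side(phi, d):
    """Expand (1 - phi_1 B - ... - phi_p B^p)(1 - B)^d into one polynomial in B."""
    poly = [1.0] + [-c for c in phi]
    for _ in range(d):
        poly = poly_mul(poly, [1.0, -1.0])  # multiply by (1 - B)
    return poly


# ARIMA(1,1,0) with phi_1 = 0.438: (1 - 0.438B)(1 - B)
print(arima_ar_side([0.438], 1))   # coefficients of 1, B, B^2

# ARIMA(0,2,1): the Y side is just (1 - B)^2
print(arima_ar_side([], 2))
```

Reading the expanded coefficients with their signs flipped gives the weights on $Y_{t-1}, Y_{t-2}, \dots$ when the model is solved for $Y_t$.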

Model Identification
First transform the data to stationarity by differencing a suitable number of times, taking logs, etc. Then choose those models (there may be more than one) whose (1) theoretical ACF most closely matches the sample ACF and (2) theoretical PACF most closely matches the sample PACF.

What is PACF ?
For given k, regress $Y_t$ against $Y_{t-1}, \dots, Y_{t-k}$ :

$\hat{Y}_t = a_0 + a_1 Y_{t-1} + \dots + a_{k-1} Y_{t-k+1} + b_k Y_{t-k}$

The lag-k partial autocorrelation coefficient (PAC) is the coefficient $b_k$ of $Y_{t-k}$.
It measures the strength of correlation between $Y_{t-k}$ and $Y_t$ when the effects of the other time lags 1, 2, ..., (k-1) are removed.
The collection of $b_k$ ($k \ge 1$) constitutes the PACF.

Plots for $Y_t = 0.7Y_{t-1} + \varepsilon_t$ ; $\varepsilon_t \sim \text{iid } N(0, 1)$

[Figure: time plot of Y(t) for t = 1 to 49.]
[Figure: sample ACF for lags 1 to 20, with upper and lower limits.]
[Figure: sample PACF for lags 1 to 20, with upper and lower limits.]
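The regression definition of the PACF can be applied directly. The sketch below is one possible implementation, not the routine used to produce the plots: it simulates the AR(1) series $Y_t = 0.7Y_{t-1} + \varepsilon_t$, then for each k fits the regression of $Y_t$ on a constant and k lags by ordinary least squares (normal equations solved by Gaussian elimination) and keeps the last coefficient $b_k$. The function names, sample size, and seed are assumptions.

```python
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def pacf_by_regression(y, k):
    """Lag-k PAC: coefficient b_k on Y_{t-k} when Y_t is regressed on 1, Y_{t-1}, ..., Y_{t-k}."""
    rows = [[1.0] + [y[t - j] for j in range(1, k + 1)] for t in range(k, len(y))]
    target = [y[t] for t in range(k, len(y))]
    p = k + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(rows, target)) for i in range(p)]
    return solve_linear(xtx, xty)[-1]


random.seed(1)
y = [0.0]
for _ in range(2000):                       # simulate Y_t = 0.7 Y_{t-1} + e_t
    y.append(0.7 * y[-1] + random.gauss(0, 1))

pacf = [pacf_by_regression(y, k) for k in range(1, 6)]
for k, value in enumerate(pacf, start=1):
    print(k, round(value, 3))
```

For an AR(1) series the lag-1 value should sit near the AR coefficient and the higher lags near zero, which is the "truncates after lag p" pattern used in model identification.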

Typical PACFs for AR Models

[Figures: PACF plot for each of the following models]
Y(t) = -0.7Y(t-1) + e(t)
Y(t) = 0.5*Y(t-1) - 0.4*Y(t-2) + e(t)
Y(t) = -0.5Y(t-1) + 0.4Y(t-2) + 0.3Y(t-3) + e(t)

Typical PACFs for MA Models

[Figures: PACF plot for each of the following models]
Y(t) = -0.7e(t-1) + e(t)
Y(t) = -0.9e(t-1) + 0.8e(t-2) + e(t)
Y(t) = -0.4e(t-1) + 0.5e(t-2) + 0.6e(t-3) + e(t)

AR(1) Model : Examples

AR(2) Model : Examples

MA(1) Model : Examples

MA(2) Model : Examples

ARMA(1,1) Model : Examples

ARMA(1,1) Model : Examples

Guidelines for Model Identification


MODEL        ACF                      PACF
AR(p)        Decays rapidly           Truncates after lag p
MA(q)        Truncates after lag q    Decays rapidly
ARMA(p,q)    Decays rapidly           Decays rapidly
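The MA(q) row of the guidelines can be illustrated with a simulated series. The sketch below simulates the MA(1) example Y(t) = -0.7e(t-1) + e(t), computes the sample ACF, and flags the lags that fall outside approximate limits of $\pm 2/\sqrt{n}$. That limit formula is an assumption (a common rule of thumb for roughly 95% bounds), not necessarily the one behind the "Upper Limit" and "Lower Limit" lines in the plots; the function name and seed are also hypothetical.

```python
import math
import random


def sample_acf(y, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag."""
    n = len(y)
    mean = sum(y) / n
    c0 = sum((v - mean) ** 2 for v in y)
    return [
        sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


random.seed(2)
n = 2000
e = [random.gauss(0, 1) for _ in range(n + 1)]
y = [e[t] - 0.7 * e[t - 1] for t in range(1, n + 1)]   # Y(t) = -0.7e(t-1) + e(t)

acf = sample_acf(y, 10)
limit = 2 / math.sqrt(n)                               # assumed rule-of-thumb limit
significant_lags = [k for k, r in enumerate(acf, start=1) if abs(r) > limit]

print([round(r, 3) for r in acf])
print("lags outside the limits:", significant_lags)
```

Lag 1 should stand out clearly while the remaining lags stay small; an occasional higher lag may still poke just outside the limits by chance, which is why the ACF and PACF are read together rather than lag by lag.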

In most cases, $0 \le p, d, q \le 2$ and $0 \le p + q \le 2$.

Case : S&P Monthly Closing

S&P Monthly Closing : Differenced Once

S&P Closing : One Step Ahead Forecast

The ARIMA(0,1,0) model is :

$(1 - B)\,Y_t = \varepsilon_t$
$Y_t - Y_{t-1} = \varepsilon_t$
$Y_t = Y_{t-1} + \varepsilon_t$

Forecast for t = 234 :

$\hat{Y}_{234} = 1482.37$
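A one-step-ahead forecast replaces the unknown future error by its expected value of zero, so for ARIMA(0,1,0) the forecast is simply the last observed value. The sketch below generalizes this to any model written as (expanded polynomial in B) $Y_t$ = (MA polynomial) $\varepsilon_t$ with no constant. The function name and argument layout are hypothetical, MA coefficients are assumed to enter with a plus sign, and the numbers for the second and third calls are taken from the forecast lines of the Transportation and Paper Towel cases later in the deck.

```python
def one_step_forecast(ar_side, history, ma_coefs=(), residuals=()):
    """One-step-ahead forecast for ar_side(B) Y_t = (1 + theta_1 B + ...) e_t.

    ar_side   : expanded coefficients [1, c1, c2, ...] of the polynomial applied to Y_t
    history   : observed values, most recent last
    ma_coefs  : [theta_1, theta_2, ...] (plus-sign convention, an assumption)
    residuals : past residuals, most recent last
    """
    ar_part = -sum(c * history[-i] for i, c in enumerate(ar_side[1:], start=1))
    ma_part = sum(th * residuals[-i] for i, th in enumerate(ma_coefs, start=1))
    return ar_part + ma_part


# ARIMA(0,1,0): (1 - B) Y_t = e_t, last observation 1482.37
sp = one_step_forecast([1.0, -1.0], [1482.37])

# ARIMA(1,1,0): (1 - 1.438B + 0.438B^2) Y_t = e_t
closing = one_step_forecast([1.0, -1.438, 0.438], [286.33, 288.57])

# ARIMA(0,1,1): (1 - B) Y_t = (1 + 0.351B) e_t, last residual 0.69
towel = one_step_forecast([1.0, -1.0], [15.65], ma_coefs=[0.351], residuals=[0.69])

print(round(sp, 2), round(closing, 2), round(towel, 2))
```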

Case : Transportation Daily Closing Index

Closing : Identify Model

Closing : ACF of Differenced Data

Closing : PACF of Differenced Data

Closing : SPSS

Closing : Choosing (p, d, q)

Closing : Error Measures

Closing : Residual ACF

Closing : Saving Residuals

Closing : Residuals Saved

Closing : Error Measures

Closing : Parameter Estimates

Closing : ACF of Residuals

Closing : Normality of Residuals

Closing : One Step Ahead Forecast


The ARIMA(1,1,0) model is :

$(1 - 0.438B)(1 - B)\,Y_t = \varepsilon_t$
$(1 - 1.438B + 0.438B^2)\,Y_t = \varepsilon_t$
$Y_t = 1.438Y_{t-1} - 0.438Y_{t-2} + \varepsilon_t$

Forecast for t = 66 :

$\hat{Y}_{66} = 1.438(288.57) - 0.438(286.33) = 289.55$

Case : Paper Towel Weekly Sales

Towel : Identify Model

Towel : Identify Model

Towel : Parameter Estimates (1)

Towel : Residual ACF (1)

Towel : Residual Saved (1)

Towel : Q-Q Plot (1)

Towel : Parameter Estimates (2)

Towel : Residual ACF (2)

Towel : Q-Q Plot (2)

Towel : Parameter Estimates (3)

Towel : One-Step Ahead Forecast

The ARIMA(0,1,1) model is :

$(1 - B)\,Y_t = (1 + 0.351B)\,\varepsilon_t$
$Y_t = Y_{t-1} + 0.351\varepsilon_{t-1} + \varepsilon_t$

Forecast for t = 121 :

$\hat{Y}_{121} = 15.65 + 0.351(0.69) = 15.89$

Steps in Model Building

1. Transform data to stationarity.
2. Based on the ACF and PACF, determine the values of p and q.
3. From the computer printout, determine whether ALL fitted parameter values are significant; if not, re-fit using other values of p and/or q.
4. Check whether the residuals appear random.
5. If there is more than one tentative model, choose the best one by considering their error measures.
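Choosing among tentative models by error measures can be sketched as follows. The deck does not say which measures the software reports, so MAE, RMSE, and MAPE are assumed here as common choices; the function name and all numbers are made up purely for illustration.

```python
import math


def error_measures(actual, forecast):
    """MAE, RMSE and MAPE (in percent) of one-step-ahead forecasts."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MAPE": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }


# Hypothetical observations and one-step forecasts from two tentative models.
actual = [10.0, 12.0, 11.0, 13.0]
candidates = {
    "A": [9.0, 12.5, 11.5, 12.0],
    "B": [10.5, 11.0, 12.5, 14.5],
}

scores = {name: error_measures(actual, fc) for name, fc in candidates.items()}
best = min(scores, key=lambda name: scores[name]["RMSE"])  # smallest RMSE wins

for name, s in scores.items():
    print(name, {k: round(v, 3) for k, v in s.items()})
print("preferred model:", best)
```

Ranking by a single measure is a simplification; when the measures disagree, the residual checks from the earlier steps should carry more weight.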
