
# Solutions to the Review Questions at the End of Chapter 5

1. Autoregressive models specify the current value of a series yt as a function of its previous p values and the current value of an error term, ut, while moving average models specify the current value of a series yt as a function of the current and previous q values of an error term, ut. AR and MA models have different characteristics in terms of the length of their memories, which has implications for the time it takes shocks to yt to die away, and for the shapes of their autocorrelation and partial autocorrelation functions.

2. ARMA models are of particular use for financial series due to their flexibility. They are fairly simple to estimate, can often produce reasonable forecasts, and most importantly, they require no knowledge of any structural variables that might be required for more traditional econometric analysis. When the data are available at high frequencies, we can still use ARMA models, while exogenous explanatory variables (e.g. macroeconomic variables, accounting ratios) may be unobservable at any more than monthly intervals at best.

3. Consider the three models

yt = yt-1 + ut   (1)

yt = 0.5 yt-1 + ut   (2)

yt = 0.8 ut-1 + ut   (3)
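The differing memories of these three processes can be illustrated by simulation; a minimal sketch, using the coefficients from equations (1)-(3) above (the sample size and seed are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 20_000
u = rng.standard_normal(T)

# (1) random walk, (2) AR(1) with coefficient 0.5, (3) MA(1) with coefficient 0.8
rw = np.cumsum(u)
ar = np.empty(T)
ar[0] = u[0]
for t in range(1, T):
    ar[t] = 0.5 * ar[t - 1] + u[t]
ma = u.copy()
ma[1:] += 0.8 * u[:-1]

def acf(y, k):
    """Sample autocorrelation of y at lag k."""
    return np.corrcoef(y[:-k], y[k:])[0, 1]

# AR(1): acf roughly halves at each lag; MA(1): acf is near zero beyond lag 1;
# random walk: the lag-1 acf is near one, because shocks never die away
print([round(acf(ar, k), 2) for k in (1, 2, 3)])
print([round(acf(ma, k), 2) for k in (1, 2)])
print(round(acf(rw, 1), 2))
```

With a sample this large, the sample acfs sit close to their theoretical values: about 0.5, 0.25, 0.125 for the AR(1), about 0.49 (= 0.8/(1 + 0.8²)) then zero for the MA(1), and essentially one for the random walk.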

(a) The first two models are roughly speaking AR(1) models, while the last is an MA(1). Strictly, since the first model is a random walk, it should be called an ARIMA(0,1,0) model, but it could still be viewed as a special case of an autoregressive model.

(b) We know that the theoretical acf of an MA(q) process will be zero after q lags, so the acf of the MA(1) will be zero at all lags after one. For an autoregressive process, the acf dies away gradually. It will die away fairly quickly for case (2), with each successive autocorrelation coefficient taking on a value equal to half that of the previous lag. For the first case, however, the acf will never die away, and in theory will always take on a value of one, whatever the lag.

Turning now to the pacf, the pacf for the first two models would have a large positive spike at lag 1, and no statistically significant pacfs at other lags. Again, the unit root process of (1) would have a pacf the same as that of a stationary AR process. The pacf for (3), the MA(1), will decline geometrically.

(c) Clearly the first equation (the random walk) is more likely to represent stock prices in practice. The discounted dividend model of share prices states that the current value of a share will be simply the discounted sum of all expected future dividends. If we assume that investors form their expectations about dividend payments rationally, then the current share price should


## Introductory Econometrics for Finance Chris Brooks 2008

embody all information that is known about the future of dividend payments, and hence today's price should only differ from yesterday's by the amount of unexpected news which influences dividend payments. Thus stock prices should follow a random walk. Note that we could apply a similar rational expectations and random walk model to many other kinds of financial series.

If the stock market really followed the process described by equations (2) or (3), then we could potentially make useful forecasts of the series using our model. In the latter case of the MA(1), we could only make one-step ahead forecasts since the memory of the model is only that length. In the case of equation (2), we could potentially make a lot of money by forming multiple step ahead forecasts and trading on the basis of these. Hence after a period, it is likely that other investors would spot this potential opportunity and hence the model would no longer be a useful description of the data.

(d) See the book for the algebra. This part of the question is really an extension of the others. Analysing the simplest case first, the MA(1), the memory of the process will only be one period, and therefore a given shock or innovation, ut, will only persist in the series (i.e. be reflected in yt) for one period. After that, the effect of a given shock would have completely worked through. For the case of the AR(1) given in equation (2), a given shock, ut, will persist indefinitely and will therefore influence the properties of yt for ever, but its effect upon yt will diminish exponentially as time goes on. In the first case, the series yt could be written as an infinite sum of past shocks, and therefore the effect of a given shock will persist indefinitely, and its effect will not diminish over time.

4. (a) Box and Jenkins were the first to consider ARMA modelling in this logical and coherent fashion.
Their methodology consists of 3 steps:

Identification: determining the appropriate order of the model using graphical procedures (e.g. plots of autocorrelation functions).

Estimation: of the parameters of the model of size given in the first stage. This can be done using least squares or maximum likelihood, depending on the model.

Diagnostic checking: this step is to ensure that the model actually estimated is adequate. B & J suggest two methods for achieving this:

- Overfitting, which involves deliberately fitting a model larger than that suggested in step 1 and testing the hypothesis that all the additional coefficients can jointly be set to zero.



- Residual diagnostics. If the model estimated is a good description of the data, there should be no further linear dependence in the residuals of the estimated model. Therefore, we could calculate the residuals from the estimated model, and use the Ljung-Box test on them, or calculate their acf. If either of these reveals evidence of additional structure, then we assume that the estimated model is not an adequate description of the data.

If the model appears to be adequate, then it can be used for policy analysis and for constructing forecasts. If it is not adequate, then we must go back to stage 1 and start again!

(b) The main problem with the B & J methodology is the inexactness of the identification stage. Autocorrelation functions and partial autocorrelations for actual data are very difficult to interpret accurately, rendering the whole procedure often little more than educated guesswork. A further problem concerns the diagnostic checking stage, which will only indicate when the proposed model is too small and would not inform on when the model proposed is too large.

(c) We could use Akaike's or Schwarz's Bayesian information criteria. Our objective would then be to fit the model order that minimises these. We can calculate the values of Akaike's (AIC) and Schwarz's Bayesian (SBIC) information criteria using the following respective formulae
AIC = ln(σ̂²) + 2k/T

SBIC = ln(σ̂²) + k ln(T)/T

where σ̂² is the estimated residual variance, k the number of parameters, and T the sample size.
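These formulae can be evaluated directly; a minimal sketch (the function names are mine, not the book's):

```python
import numpy as np

def aic(sigma2_hat, k, T):
    """Akaike's information criterion: ln(sigma^2) + 2k/T."""
    return np.log(sigma2_hat) + 2 * k / T

def sbic(sigma2_hat, k, T):
    """Schwarz's Bayesian information criterion: ln(sigma^2) + k ln(T)/T."""
    return np.log(sigma2_hat) + k * np.log(T) / T

# SBIC penalises extra parameters more heavily than AIC whenever ln(T) > 2,
# i.e. for any sample of more than about 8 observations
print(aic(1.0, 3, 100), sbic(1.0, 3, 100))
```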

The information criteria trade off an increase in the number of parameters, and therefore an increase in the penalty term, against a fall in the RSS, implying a closer fit of the model to the data.

5. The best way to check for stationarity is to express the model as a lag polynomial in yt:

yt = 0.803 yt-1 + 0.682 yt-2 + ut

Rewrite this as
(1 − 0.803 L − 0.682 L²) yt = ut

We want to find the roots of the lag polynomial (1 − 0.803 L − 0.682 L²) = 0 and determine whether they are greater than one in absolute value. It is easier (in my opinion) to rewrite this formula (by multiplying through by −1/0.682, using z for the characteristic equation and rearranging) as

z² + 1.177 z − 1.466 = 0



Using the standard formula for obtaining the roots of a quadratic equation,

z = [−1.177 ± √(1.177² + 4 × 1.466)] / 2 = 0.758 or −1.934
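As a numerical check, the roots can also be obtained with numpy (np.roots takes the coefficients from the highest power down):

```python
import numpy as np

# Characteristic equation z^2 + 1.177 z - 1.466 = 0, obtained from
# 1 - 0.803 L - 0.682 L^2 = 0 by multiplying through by -1/0.682
roots = np.sort(np.roots([1, 1.177, -1.466]).real)
print(roots)  # approximately [-1.93, 0.76]

# 0.758 lies inside the unit circle, which is what rules out stationarity here
print(abs(roots[1]) < 1)
```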

Since ALL the roots must be greater than one in absolute value for the model to be stationary, we conclude that the estimated model is not stationary in this case, since the root 0.758 lies inside the unit circle.

6. Using the formulae above, we end up with the following values for each criterion and for each model order (with an asterisk denoting the smallest value of the information criterion in each case).

| ARMA(p,q) model order | log(σ̂²) | AIC | SBIC |
| --- | --- | --- | --- |
| (0,0) | 0.932 | 0.942 | 0.944 |
| (1,0) | 0.864 | 0.884 | 0.887 |
| (0,1) | 0.902 | 0.922 | 0.925 |
| (1,1) | 0.836 | 0.866 | 0.870 |
| (2,1) | 0.801 | 0.841 | 0.847 |
| (1,2) | 0.821 | 0.861 | 0.867 |
| (2,2) | 0.789 | 0.839 | 0.846 |
| (3,2) | 0.773 | 0.833* | 0.842* |
| (2,3) | 0.782 | 0.842 | 0.851 |
| (3,3) | 0.764 | 0.834 | 0.844 |

The result is pretty clear: both SBIC and AIC say that the appropriate model is an ARMA(3,2).

7. We could still perform the Ljung-Box test on the residuals of the estimated models to see if there was any linear dependence left unaccounted for by our postulated models. Another test of the models' adequacy that we could use is to leave out some of the observations at the identification and estimation stage, and attempt to construct out-of-sample forecasts for these. For example, if we have 2000 observations, we may use only 1800 of them to identify and estimate the models, and leave the remaining 200 for construction of forecasts. We would then prefer the model that gave the most accurate forecasts.

8. This is not true in general. Yes, we do want to form a model which fits the data as well as possible. But in most financial series, there is a substantial amount of noise. This can be interpreted as a number of random events that are unlikely to be repeated in any forecastable way. We want to fit a model to the data which will be able to generalise. In other words, we want a model which fits to features of the data which will be replicated in future; we do not want to fit to sample-specific noise.



This is why we need the concept of parsimony: fitting the smallest possible model to the data. Otherwise we may get a great fit to the data in sample, but any use of the model for forecasts could yield terrible results. Another important point is that the larger the number of estimated parameters (i.e. the more variables we have), the smaller will be the number of degrees of freedom, and this will imply that coefficient standard errors will be larger than they would otherwise have been. This could lead to a loss of power in hypothesis tests, so that variables that would otherwise have been significant are now insignificant.

9. (a) We class an autocorrelation coefficient or partial autocorrelation coefficient as significant if it exceeds ±1.96 × 1/√T = ±0.196. Under this rule, the sample autocorrelation coefficients (sacfs) at lags 1 and 4 are significant, and the spacfs at lags 1, 2, 3, 4 and 5 are all significant. This clearly looks like the data are consistent with a first order moving average process, since all acfs after the first are insignificant (the significant lag 4 acf is a typical wrinkle that one might expect with real data and should probably be ignored), and the pacf has a slowly declining structure.

(b) The formula for the Ljung-Box Q* test is

Q* = T(T+2) Σ_{k=1}^{m} τ̂k² / (T − k)

using the standard notation. In this case, T = 100 and m = 3. The null hypothesis is H0: τ1 = 0 and τ2 = 0 and τ3 = 0. The test statistic is calculated as

Q* = 100 × 102 × [0.420²/(100−1) + 0.104²/(100−2) + 0.032²/(100−3)] = 19.41

The 5% and 1% critical values for a χ² distribution with 3 degrees of freedom are 7.81 and 11.3 respectively. Clearly, then, we would reject the null hypothesis that the first three autocorrelation coefficients are jointly not significantly different from zero.

10. (a) To solve this, we need the concept of a conditional expectation, e.g. Et-1(yt | yt-1, yt-2, ...). For example, in the context of an AR(1) model such as

yt = a0 + a1 yt-1 + ut

if we are now at time t−1, then (dropping the t−1 subscript on the expectations operator):


E(yt) = a0 + a1 yt-1

E(yt+1) = a0 + a1 E(yt) = a0 + a1 (a0 + a1 yt-1) = a0 + a0 a1 + a1² yt-1

E(yt+2) = a0 + a1 E(yt+1) = a0 + a1 (a0 + a1 E(yt)) = a0 + a0 a1 + a1² E(yt) = a0 + a0 a1 + a1² (a0 + a1 yt-1) = a0 + a0 a1 + a0 a1² + a1³ yt-1

etc.

To forecast an MA model, consider, e.g.

yt = ut + b1 ut-1

Then

E(yt | yt-1, yt-2, ...) = E(ut + b1 ut-1) = b1 ut-1

since E(ut) = 0 and ut-1 is known at time t−1. So the one-step ahead forecast is

ft-1,1 = b1 ut-1

But

E(yt+1 | yt-1, yt-2, ...) = E(ut+1 + b1 ut) = 0

since neither ut+1 nor ut is known at time t−1. Going back to the example above,

yt = 0.036 + 0.69 yt-1 + 0.42 ut-1 + ut

Suppose that we know yt-1, yt-2, ... and we are trying to forecast yt. Our forecast for t is given by E(yt | yt-1, yt-2, ...).



ft-1,2 = E(yt+1 | yt-1, yt-2, ...) = E(0.036 + 0.69 yt + 0.42 ut + ut+1). But we do not know yt or ut at time t−1, so we replace yt with our forecast of yt, which is ft-1,1, and note that the expectations of ut and ut+1 are zero. Thus

ft-1,2 = 0.036 + 0.69 ft-1,1 = 0.036 + 0.69 × 1.836 = 1.302

ft-1,3 = 0.036 + 0.69 ft-1,2 = 0.036 + 0.69 × 1.302 = 0.935

etc.
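This recursion can be mechanised in a couple of lines; a sketch, starting from the one-step forecast of 1.836 used above (carrying full precision gives 1.303 at the second step, where rounding at each stage gives the 1.302 in the hand calculation):

```python
# Multi-step forecasts from y_t = 0.036 + 0.69 y_{t-1} + 0.42 u_{t-1} + u_t:
# beyond one step ahead, only the autoregressive part matters
a0, a1 = 0.036, 0.69
f = [1.836]                      # one-step ahead forecast, f_{t-1,1}
for _ in range(2):
    f.append(a0 + a1 * f[-1])    # f_{t-1,2} and f_{t-1,3}
print([round(x, 3) for x in f])  # [1.836, 1.303, 0.935]
```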

(b) Given the forecasts and the actual values, it is very easy to calculate the MSE by plugging the numbers into the relevant formula, which in this case is

MSE = (1/N) Σ_{n=1}^{N} (xt-1+n − ft-1,n)²

if we are making N forecasts, numbered 1, 2, 3. The MSE is then given by

MSE = (1/3) [(−0.032 − 1.836)² + (0.961 − 1.302)² + (0.203 − 0.935)²] = (1/3) (3.489 + 0.116 + 0.536) = 1.380
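A quick numerical check of this calculation, using the actual values and forecasts above:

```python
# MSE of the ARMA forecasts over the three hold-out observations
actuals   = [-0.032, 0.961, 0.203]
forecasts = [1.836, 1.302, 0.935]
sq_errors = [(a - f) ** 2 for a, f in zip(actuals, forecasts)]
mse = sum(sq_errors) / len(sq_errors)

print(round(mse, 2))                            # 1.38
# share of the total squared error due to the first forecast
print(round(sq_errors[0] / sum(sq_errors), 2))  # 0.84
```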

Notice also that 84% of the total MSE is coming from the error in the first forecast. Thus error measures can be driven by one or two periods in which the model fits very badly. For example, if the forecast period includes a stock market crash, this can lead the mean squared error to be 100 times bigger than it would have been if the crash observations were not included. This point needs to be considered whenever forecasting models are evaluated. An idea of whether this is a problem in a given situation can be gained by plotting the forecast errors over time.

(c) This question is much simpler to answer than it looks! In fact, the inclusion of the smoothing coefficient is a red herring, i.e. a piece of misleading and useless information. The correct approach is to say that if we believe that the exponential smoothing model is appropriate, then all useful information will have already been used in the calculation of the current smoothed value (which will of course have used the smoothing coefficient in its calculation). Thus the three forecasts are all 0.0305.

(d) The solution is to work out the mean squared error for the exponential smoothing model. The calculation is


MSE = (1/3) [(−0.032 − 0.0305)² + (0.961 − 0.0305)² + (0.203 − 0.0305)²] = (1/3) (0.0039 + 0.8658 + 0.0298) = 0.2998

Therefore, we conclude that since the mean squared error is smaller for the exponential smoothing model than for the Box-Jenkins model, the former produces the more accurate forecasts. We should, however, bear in mind that the question of accuracy was determined using only 3 forecasts, which would be insufficient in a real application.

11. (a) The shapes of the acf and pacf are perhaps best summarised in a table:

| Process | acf | pacf |
| --- | --- | --- |
| White noise | No significant coefficients | No significant coefficients |
| AR(2) | Geometrically declining or damped sinusoid acf | First 2 pacf coefficients significant, all others insignificant |
| MA(1) | First acf coefficient significant, all others insignificant | Geometrically declining or damped sinusoid pacf |
| ARMA(2,1) | Geometrically declining or damped sinusoid acf | Geometrically declining or damped sinusoid pacf |

A couple of further points are worth noting. First, it is not possible to tell what the signs of the coefficients for the acf or pacf would be for the last three processes, since that would depend on the signs of the coefficients of the processes. Second, for mixed processes, the AR part dominates from the point of view of acf calculation, while the MA part dominates for pacf calculation.

(b) The important point here is to focus on the MA part of the model and to ignore the AR dynamics. The characteristic equation would be (1 + 0.42z) = 0. The root of this equation is −1/0.42 = −2.38, which lies outside the unit circle, and therefore the MA part of the model is invertible.

(c) Since no values for the series y or the lagged residuals are given, the answers should be stated in terms of y and of u. Assuming that information is available up to and including time t, the 1-step ahead forecast would be for time t+1, the 2-step ahead for time t+2 and so on. A useful first step would be to write the model out for y at times t+1, t+2, t+3, t+4:

A couple of further points are worth noting. /irst, it is not possi!le to tell what the signs of the coefficients for the acf or pacf would !e for the last three processes, since that would depend on the signs of the coefficients of the processes. ,econd, for mi ed processes, the AR part dominates from the point of view of acf calculation, while the MA part dominates for pacf calculation. %!& "he important point here is to focus on the MA part of the model and to ignore the AR dynamics. "he characteristic e#uation would !e %1@).42z& ? ) "he root of this e#uation is (1A).42 ? (2.'+, which lies outside the unit circle, and therefore the MA part of the model is inverti!le. %c& ,ince no values for the series y or the lagged residuals are given, the answers should !e stated in terms of y and of u. Assuming that information is availa!le up to and including time t, the 1(step ahead forecast would !e for time t@1, the 2(step ahead for time t@2 and so on. A useful first step would !e to write the model out for y at times t@1, t@2, t@', t@47
yt+1 = 0.036 + 0.69 yt + 0.42 ut + ut+1

yt+2 = 0.036 + 0.69 yt+1 + 0.42 ut+1 + ut+2

yt+3 = 0.036 + 0.69 yt+2 + 0.42 ut+2 + ut+3


yt+4 = 0.036 + 0.69 yt+3 + 0.42 ut+3 + ut+4

The 1-step ahead forecast would simply be the conditional expectation of y for time t+1 made at time t. Denoting the 1-step ahead forecast made at time t as ft,1, the 2-step ahead forecast made at time t as ft,2 and so on:
E " y t +1 y t ( y t 1 (!!!# = f t (1 = E t + y t +1 * = E t +0!03 + 0! ' y t + 0!%2u t + u t +1 * = 0!03 + 0! ' y t + 0!%2u t

since Et[ut+1] = 0. The 2-step ahead forecast would be given by

E(yt+2 | yt, yt-1, ...) = ft,2 = Et[yt+2] = Et[0.036 + 0.69 yt+1 + 0.42 ut+1 + ut+2] = 0.036 + 0.69 ft,1

since Et[ut+1] = 0 and Et[ut+2] = 0. Thus, beyond 1-step ahead, the MA(1) part of the model disappears from the forecast and only the autoregressive part remains. Although we do not know yt+1, its expected value is the 1-step ahead forecast that was made at the first stage, ft,1. The 3-step ahead forecast would be given by
E(yt+3 | yt, yt-1, ...) = ft,3 = Et[yt+3] = Et[0.036 + 0.69 yt+2 + 0.42 ut+2 + ut+3] = 0.036 + 0.69 ft,2

and the 4-step ahead by

E(yt+4 | yt, yt-1, ...) = ft,4 = Et[yt+4] = Et[0.036 + 0.69 yt+3 + 0.42 ut+3 + ut+4] = 0.036 + 0.69 ft,3

(d) A number of methods for aggregating the forecast errors to produce a single forecast evaluation measure were suggested in the paper by Makridakis and Hibon (1995), and some discussion is presented in the book. Any of the methods suggested there could be discussed. A good answer would present an expression for the evaluation measures, with any notation introduced being carefully defined, together with a discussion of why the measure takes the form that it does and what the advantages and disadvantages of its use are compared with other methods.

(e) Moving average and ARMA models cannot be estimated using OLS; they are usually estimated by maximum likelihood. Autoregressive models can be estimated using OLS or maximum likelihood. Pure autoregressive models contain only lagged values of observed quantities on the RHS, and therefore the lags of the dependent variable can be used just like any other regressors. However, in the context of MA and mixed models, the lagged values of the error term that occur on the RHS are not known a priori. Hence, these quantities are replaced by the residuals, which are not available until after the model has been estimated. But equally, these residuals are required in order to be able to estimate the model parameters. Maximum likelihood essentially works around this by calculating the values of the coefficients and the residuals at the same time. Maximum likelihood involves selecting the most likely values of the parameters given the actual data sample, and given an assumed statistical distribution for the errors. This technique will be discussed in greater detail in the section on volatility modelling in Chapter 8.
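The point that a pure AR model can be estimated by OLS is easy to demonstrate: regress yt on a constant and its own lag. A minimal sketch on simulated data (the true coefficients 0.2 and 0.6 and the sample size are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(1)
T = 5_000
y = np.zeros(T)
for t in range(1, T):                 # simulate y_t = 0.2 + 0.6 y_{t-1} + u_t
    y[t] = 0.2 + 0.6 * y[t - 1] + rng.standard_normal()

# OLS of y_t on [1, y_{t-1}]: the lag is directly observable, so it can be
# treated like any other regressor -- no iterative estimation is needed
X = np.column_stack([np.ones(T - 1), y[:-1]])
beta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
print(beta.round(2))  # close to the true values [0.2, 0.6]
```

The same trick is unavailable for MA terms, because the lagged errors on the RHS are unobserved, which is exactly why maximum likelihood is used there.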



12. (a) Some of the stylised differences between the typical characteristics of macroeconomic and financial data were presented in Chapter 1. In particular, one important difference is the frequency with which financial asset return time series and other quantities in finance can be recorded. This is of particular relevance for the models discussed in Chapter 5, since it is usually a requirement that all of the time-series data used in estimating a given model must be of the same frequency. Thus, if, for example, we wanted to build a model for forecasting hourly changes in exchange rates, it would be difficult to set up a structural model containing macroeconomic explanatory variables, since the macroeconomic variables are likely to be measured on a quarterly or at best monthly basis. This gives a motivation for using pure time-series approaches (e.g. ARMA models), rather than structural formulations with separate explanatory variables.

It is also often of particular interest to produce forecasts of financial variables in real time. Producing forecasts from pure time-series models is usually simply an exercise in iterating with conditional expectations. But producing forecasts from structural models is considerably more difficult, and would usually require the production of forecasts for the structural variables as well.

(b) A simple rule of thumb for determining whether autocorrelation coefficients and partial autocorrelation coefficients are statistically significant is to classify them as significant at the 5% level if they lie outside of ±1.96 × 1/√T, where T is the sample size. In this case, T = 500, so a particular coefficient would be deemed significant if it is larger than 0.088 or smaller than −0.088. On this basis, the autocorrelation coefficients at lags 1 and 5 and the partial autocorrelation coefficients at lags 1, 2, and 3 would be classed as significant. The formulae for the Box-Pierce and the Ljung-Box test statistics are respectively
Q = T Σ_{k=1}^{m} τ̂k²

and

Q* = T(T+2) Σ_{k=1}^{m} τ̂k² / (T − k).

In this instance, the statistics would be calculated respectively as

Q = 500 × [0.307² + (−0.013)² + 0.086² + 0.031² + (−0.197)²] = 70.79

and

Q* = 500 × 502 × [0.307²/(500−1) + (−0.013)²/(500−2) + 0.086²/(500−3) + 0.031²/(500−4) + (−0.197)²/(500−5)] = 71.39
The test statistics will both follow a χ² distribution with 5 degrees of freedom (the number of autocorrelation coefficients being used in the test). The critical values are 11.07 and 15.09 at 5% and 1% respectively. Clearly, the null hypothesis that the first 5 autocorrelation coefficients are jointly zero is resoundingly rejected.
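Both statistics can be reproduced directly from the five autocorrelation coefficients:

```python
# Box-Pierce and Ljung-Box statistics for the first five autocorrelations
T = 500
tau = [0.307, -0.013, 0.086, 0.031, -0.197]

Q = T * sum(t ** 2 for t in tau)
Q_star = T * (T + 2) * sum(t ** 2 / (T - k) for k, t in enumerate(tau, start=1))

# both far exceed the 5% chi-square(5) critical value of 11.07
print(round(Q, 2), round(Q_star, 2))  # 70.79 71.39
```

Note how close the two statistics are here: the small-sample correction in Q* matters little when T is as large as 500.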



(c) Setting aside the lag 5 autocorrelation coefficient, the pattern in the table is for the autocorrelation coefficient to be significant only at lag 1 and then to fall rapidly to values close to zero, while the partial autocorrelation coefficients appear to fall much more slowly as the lag length increases. These characteristics would lead us to think that an appropriate model for this series is an MA(1). Of course, the autocorrelation coefficient at lag 5 is an anomaly that does not fit in with the pattern of the rest of the coefficients. But such a result would be typical of a real data series (as opposed to a simulated data series that would have a much cleaner structure). This serves to illustrate that when econometrics is used for the analysis of real data, the data generating process was almost certainly not any of the models in the ARMA family. So all we are trying to do is to find a model that best describes the features of the data to hand. As one econometrician put it, all models are wrong, but some are useful!

(d) Forecasts from this ARMA model would be produced in the usual way. Using the same notation as above, and letting fz,1 denote the forecast for time z+1 made for x at time z, etc.:

Model A: MA(1)

fz,1 = 0.38 + 0.10 uz = 0.38 + 0.10 × (−0.02) = 0.378

fz,2 = fz,3 = fz,4 = 0.38

Note that the MA(1) model only has a memory of one period, so all forecasts further than one step ahead will be equal to the intercept.

Model B: AR(2)

x̂t = 0.63 + 0.17 xt-1 − 0.09 xt-2

fz,1 = 0.63 + 0.17 × 0.31 − 0.09 × 0.02 = 0.681

fz,2 = 0.63 + 0.17 × 0.681 − 0.09 × 0.31 = 0.718

fz,3 = 0.63 + 0.17 × 0.718 − 0.09 × 0.681 = 0.690

fz,4 = 0.63 + 0.17 × 0.690 − 0.09 × 0.718 = 0.683
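The AR(2) recursion is easily mechanised; a sketch using xz = 0.31 and xz-1 = 0.02 as above (carrying full precision gives 0.691 at the third step, where rounding at each stage gives the 0.690 in the hand calculation):

```python
# Iterated forecasts from Model B: x_t = 0.63 + 0.17 x_{t-1} - 0.09 x_{t-2}
b0, b1, b2 = 0.63, 0.17, -0.09
x = [0.02, 0.31]             # the last two observed values, x_{z-1} and x_z
for _ in range(4):
    x.append(b0 + b1 * x[-1] + b2 * x[-2])
print([round(f, 3) for f in x[2:]])  # [0.681, 0.718, 0.691, 0.683]
```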

(e) The methods are overfitting and residual diagnostics. Overfitting involves selecting a deliberately larger model than the proposed one, and examining the statistical significances of the additional parameters. If the additional parameters are statistically insignificant, then the originally postulated model is deemed acceptable. The larger model would usually involve the addition of one extra MA term and one extra AR term. Thus it would be sensible to try an ARMA(1,2) in the context of Model A, and an ARMA(3,1) in the context of Model B. Residual diagnostics would involve examining the acf and pacf of the residuals from the estimated model. If the residuals showed any "action", that is, if any of the acf or pacf coefficients showed statistical significance, this would suggest that the original model was inadequate. Residual diagnostics in the Box-Jenkins sense of the term involved only examining the acf and pacf, rather than the array of diagnostics considered in Chapter 4.



It is worth noting that these two model evaluation procedures would only indicate a model that was too small. If the model were too large, i.e. it had superfluous terms, these procedures would deem the model adequate.

(f) There are obviously several forecast accuracy measures that could be employed, including MSE, MAE, and the percentage of correct sign predictions. Assuming that MSE is used, the MSE for each model is
MSE(Model A) = (1/4) [(0.378 − 0.62)² + (0.38 − 0.19)² + (0.38 − (−0.32))² + (0.38 − 0.72)²] = 0.175

MSE(Model B) = (1/4) [(0.681 − 0.62)² + (0.718 − 0.19)² + (0.690 − (−0.32))² + (0.683 − 0.72)²] = 0.326
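The comparison can be checked numerically, with the actual values 0.62, 0.19, −0.32, 0.72 used in the calculations above:

```python
# MSE for each model over the four hold-out observations
actuals = [0.62, 0.19, -0.32, 0.72]
model_a = [0.378, 0.38, 0.38, 0.38]     # MA(1) forecasts
model_b = [0.681, 0.718, 0.690, 0.683]  # AR(2) forecasts

def mse(actual, forecast):
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

print(round(mse(actuals, model_a), 3))  # 0.175
print(round(mse(actuals, model_b), 3))  # 0.326
```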

Therefore, since the mean squared error for Model A is smaller, it would be concluded that the moving average model is the more accurate of the two in this case.
