
Exercises 3

Comments, Solutions
Shenzhen Graduate School 2015
James E. Gentle
These problems are similar to some in Chapter 2 of Tsay.
1. This problem involves a model, not data.
Assume that the simple returns on a certain monthly bond index follow an MA(1) model, R_t = A_t + θ_1 A_{t−1}, where {A_t} is a white noise process. Now, assume further that θ_1 = 0.2 and σ_A^2 = 0.000625.
(a) If A100 = 0.01, what is the 1-step-ahead forecast and the standard deviation of the forecast error?
(The forecast origin is t = 100.)
(b) Again, letting A100 = 0.01, what is the 2-step-ahead forecast and the standard deviation of the
forecast error?
(c) Compute the lag-1 and lag-2 autocorrelations of the process.
First of all, note that this is just an MA(1) process with a mean of 0. This kind of process has limited
memory of the past and the forecasts go to the mean very quickly.
(a) The 1-step-ahead forecast from t = 100, where A_100 = 0.01, is
R̂_100(1) = E(R_101 | F_100) = θ_1 A_100 = 0.002.
Remember F_100 represents all information from the past, that is, up to time t = 100.
The original statement of this exercise asked the question "what is its standard deviation?", implying the standard deviation of the forecast itself. I should have asked about the forecast error, because we are usually interested in the standard deviation of the forecast error. The standard deviation of the forecast depends on what the random variables are. In one interpretation of the original statement, there are no random variables, so the standard deviation would be 0. I accepted an answer about the standard deviation in almost any form.
The forecast error is
E_100(1) = R_101 − R̂_100(1) = A_101.
The variance of this is V(A_101) = σ_A^2 = 0.000625, so the standard deviation is just σ_A = 0.025.

(b) The 2-step-ahead forecast from t = 100, where A_100 = 0.01, is
R̂_100(2) = E(R_102 | F_100) = 0.
This is an illustration of the lack of memory of an MA process. An MA(q) process has no memory of what happened more than q steps back, so beyond q steps ahead the best forecast is just the mean of the process.
Again, my comments concerning my question about the standard deviation in the original version
of this assignment apply.
The forecast error is
E_100(2) = R_102 − R̂_100(2) = A_102 + θ_1 A_101.
The variance of this is (1 + θ_1^2)σ_A^2 = 0.00065, so the standard deviation is just 0.0255.

(c) We first compute the autocovariances, γ(1) and γ(2). This is a special case of the standard computations for an MA(1) process. We have γ_R(1) = θ_1 σ_A^2 and γ_R(2) = 0. Of course, γ_R(0) = (1 + θ_1^2)σ_A^2, so ρ_R(1) = θ_1 σ_A^2 / ((1 + θ_1^2)σ_A^2) = 0.192, and ρ(2) = 0.
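These numbers are easy to check numerically; here is a minimal base-R sketch of the arithmetic above (theta1, sigma2A, and A100 are just the values from the problem statement):

```r
# Arithmetic for the MA(1) forecasts and autocorrelations above.
theta1  <- 0.2       # MA coefficient
sigma2A <- 0.000625  # white-noise variance
A100    <- 0.01      # innovation at the forecast origin

f1  <- theta1 * A100                   # 1-step-ahead forecast: 0.002
sd1 <- sqrt(sigma2A)                   # sd of 1-step forecast error: 0.025

f2  <- 0                               # 2-step-ahead forecast: the mean
sd2 <- sqrt((1 + theta1^2) * sigma2A)  # sd of 2-step forecast error: about 0.0255

rho1 <- theta1 / (1 + theta1^2)        # lag-1 autocorrelation: about 0.192
rho2 <- 0                              # lag-2 autocorrelation
```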
2. Consider the seasonally adjusted monthly US unemployment rate from January 1948 to March 2009. (These data are in the file m-unrate.txt at http://faculty.chicagobooth.edu/ruey.tsay/teaching/fts3/ under Chapter 2, Exercises. This website is linked on the main website for this course. This file is from FRED at http://research.stlouisfed.org/fred2/. That website is also linked on the main website for this course, and it contains more up-to-date data, but I have had trouble accessing that site from here. If you can get to this site, don't use Tsay's data. Use the Fed data through April 2015.)

(a) Use the PACF and ACF with the given data to choose p and q in an ARMA(p, q) model, and
then fit that model. Summarize how you did this and give the chosen p and q and the fitted
parameters.
(b) Does your model indicate the existence of business cycles in this time series? Why do you answer
as you do?
(c) Use your fitted model (there's no single correct model) to provide 1- and 2-month-ahead forecasts. (These would be for April and May 2009, or for May and June 2015, depending on whether you use Tsay's data or the updated data.) What are the estimated standard deviations of your forecasts?
I will just use the data from Tsay's website. The R code is
m_unrate<-ts(read.table(
"http://faculty.chicagobooth.edu/ruey.tsay/teaching/fts3/m-unrate.txt",
head=TRUE)[,4],start=c(1948,1),frequency=12)
plot(m_unrate)
pacf(m_unrate)
acf(m_unrate)

(a) After running the code above, I got the plot below.
[Time series plot of m_unrate, 1950 to 2010]
I got the PACF
[Partial ACF of Series m_unrate]
and the ACF
[ACF of Series m_unrate]

There do appear to be cycles in the data, but their period is not constant; therefore, we probably
cannot fit a seasonal component.
The PACF and ACF are inconclusive. I might just try an ARMA(4,1) model.
fit <- arima(m_unrate,order=c(4,0,1))
fit
Coefficients:
         ar1      ar2      ar3      ar4      ma1  intercept
      1.4711  -0.2366  -0.1754  -0.0691  -0.4809     5.6807
s.e.  0.1021   0.1206   0.0726   0.0538   0.0969     0.3739

sigma^2 estimated as 0.03966:  log likelihood = 140.88,  aic = -267.76

(b) Cycles are always hard to identify, especially over a short term. (Actually, we can often "identify" cycles whether they are present or not!)
In economic time series we speak of cycles, which have various lengths; and we speak of seasons, which have fixed lengths usually related to some unit of the calendar, such as a day, week, quarter, or year. In other types of time series, particularly those generated by a physical process, cycles are often of primary interest. Those cycles are often called "periods", but the word "period" is also used to refer to the length of the cycle. Those periods are often of fixed length, and one period or cycle may be superimposed on another one, so that the overall appearance is somewhat irregular. Those cycles correspond to waves, and usually a transform from the time domain to the frequency domain is the best way to begin to analyze those kinds of time series.
From the raw time series plot, we can see cycles. If these were regular, that is, if they had a
more-or-less fixed period that we could relate to a regular time unit such as a month or a year,
we would try to model a seasonal effect. These cycles vary in length and they also do not seem
to correspond to any kind of monthly or annual time period.
The ACF and PACF will generally show some spikes corresponding to seasonal patterns or regular
cycles (periods), but they are not very useful unless the patterns are very regular.
In this particular example, the slightly larger values of the PACF at long lags are a possible
indication of business cycles; but I would call it inconclusive.
(c)
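For part (c), the forecasts and their standard errors come directly from predict applied to the arima fit. A sketch (a simulated ARMA series stands in for m_unrate so the example is self-contained; with the real data you would apply predict to the fit object above):

```r
# Sketch: 1- and 2-step-ahead forecasts from an arima fit.
# A simulated ARMA(2,1) series stands in for m_unrate.
set.seed(1)
x   <- arima.sim(model = list(ar = c(0.5, 0.2), ma = 0.3), n = 500)
fit <- arima(x, order = c(2, 0, 1))
fc  <- predict(fit, n.ahead = 2)
fc$pred  # the 1- and 2-month-ahead point forecasts
fc$se    # the estimated standard deviations of the forecast errors
```

The standard errors are the estimated standard deviations of the forecast errors, which is what the exercise asks for.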
3. Obtain the daily closing prices of IBM from January 1, 1970 to May 25, 2015, and compute the simple
returns. Compute the ACF for the first 100 lags of the series of absolute returns. Is there evidence
of long-range dependence? Why do you answer as you do?
Here's my code:
setwd("c:/Work/Lectures/Courses/HIT_Shenzhen")
source("financetools_funs.R")
IBM_d <- get.stock.price("IBM", start.date=c(1,1,1970), stop.date=c(5,25,2015),
full.table=FALSE, print.info=FALSE)
IBM_dr <- abs(diff(IBM_d)/IBM_d[-length(IBM_d)])  # simple returns divide by the previous price
acf(IBM_dr,lag.max=100)
I get this ACF:

0.0

0.2

0.4

ACF

0.6

0.8

1.0

Series IBM_dr

20

40

60
Lag

80

100

When the ACF does not die off faster than this, there is evidence of long-range correlation (or dependence). Dependence can show up as either a positive or a negative correlation; that is why we use the absolute returns.
What we see in the IBM example is not at all atypical. The returns of most stocks and indexes show
similar patterns.
Long-range memory is an important area of current research.
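The point about absolute returns can be illustrated with simulated data (a sketch, not the IBM data): an ARCH(1)-type series is serially uncorrelated, yet its absolute values are clearly autocorrelated.

```r
# Simulated ARCH(1)-type returns: uncorrelated levels, dependent magnitudes.
set.seed(42)
n  <- 5000
r  <- numeric(n)
s2 <- 1e-4                      # starting conditional variance
for (t in 1:n) {
  r[t] <- sqrt(s2) * rnorm(1)   # return with current conditional variance
  s2   <- 1e-5 + 0.5 * r[t]^2   # next variance depends on this return
}
acf(r,      lag.max = 100)      # close to zero at all lags
acf(abs(r), lag.max = 100)      # clearly positive at low lags
```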
4. Consider the weekly yields of Moody's Aaa seasoned bonds. These data from January 5, 1962 to April 10, 2009 are in the file w-Aaa.txt at Tsay's website under Chapter 2, Exercises. (The same comments as above apply. If you can get the FRED data, use May 1, 2015 as the ending date.)
(a) What are the sample mean, standard deviation, skewness, excess kurtosis, minimum, and maximum of the weekly yields? (You can use basicStats in R.)
(b) What is the p-value for a test of the null hypothesis that the skewness is 0? Would you reject the null hypothesis at the 0.05 level?
(c) What is the p-value for a test of the null hypothesis that the excess kurtosis is 0? Would you reject the null hypothesis at the 0.05 level?
(d) Build an ARMA time series model for the weekly yields of the Aaa bonds.
(d) Build an ARMA time series model for the weekly yields of the Aaa bonds.
Here's my code:
library(fBasics)
w_Aaa<-ts(read.table(
"http://faculty.chicagobooth.edu/ruey.tsay/teaching/fts3/w-Aaa.txt",
head=FALSE)[,4],start=c(1962,1),frequency=52)
mean(w_Aaa)
stdev(w_Aaa)
skewness(w_Aaa)
kurtosis(w_Aaa)
min(w_Aaa)
max(w_Aaa)
These commands yield all of the requested statistics, but I will do them again separately.
It's usually a good idea to plot the data. In this case, a histogram might give us a good picture.
(Remember, of course, a histogram is static; it does not have time information.)

300
200
100
0

Frequency

400

500

Histogram of w_Aaa

10
w_Aaa

12

14

16

The summary function in the base library gives some of the requested statistics:
summary(w_Aaa)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  4.190   5.985   7.540   7.830   8.930  15.850

The statistics based on higher order moments are obtained by the stdev function in the base library and the skewness and kurtosis functions in the fBasics library.
stdev(w_Aaa)
skewness(w_Aaa)
kurtosis(w_Aaa)
This yields
> stdev(w_Aaa)
[1] 2.418744
> skewness(w_Aaa)
[1] 0.857092
attr(,"method")
[1] "moment"
> kurtosis(w_Aaa)
[1] 0.5786054
attr(,"method")
[1] "excess"
The skewness indicates a positive tail. (We already knew this from the histogram.)
The excess kurtosis indicates that the data are more heavy-tailed than data from a normal distribution.
But we must interpret this in the context of a nonnegative, skewed distribution.
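Parts (b) and (c) asked for test p-values, which can be sketched from the statistics above: under the null hypothesis of normality, the sample skewness is approximately N(0, 6/T) and the sample excess kurtosis is approximately N(0, 24/T). The series length T = 2467 is an assumption here (it is the length implied by the 2465 residual degrees of freedom in the regression in problem 6):

```r
# Normality-based tests of zero skewness and zero excess kurtosis.
nT <- 2467       # assumed number of weekly observations
sk <- 0.857092   # sample skewness from above
ku <- 0.5786054  # sample excess kurtosis from above

z_sk <- sk / sqrt(6 / nT)             # skewness test statistic
p_sk <- 2 * (1 - pnorm(abs(z_sk)))    # two-sided p-value

z_ku <- ku / sqrt(24 / nT)            # excess-kurtosis test statistic
p_ku <- 2 * (1 - pnorm(abs(z_ku)))    # two-sided p-value

# Both statistics are far in the tails, so both null hypotheses
# (zero skewness, zero excess kurtosis) are rejected at the 0.05 level.
```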
5. Now consider the weekly yields of Moody's Baa seasoned bonds over the same period as in the previous question. (The data are in the file w-Baa.txt or at FRED.) For these data, do the same exercises as above for the Aaa bonds.
As before, here's my code:
w_Baa<-ts(read.table(
"http://faculty.chicagobooth.edu/ruey.tsay/teaching/fts3/w-Baa.txt",
head=FALSE)[,4],start=c(1962,1),frequency=52)
Here's the histogram:

300
200
0

100

Frequency

400

500

Histogram of w_Baa

10

12

14

16

18

w_Baa

The summary function in the base library gives some of the requested statistics:
> summary(w_Baa)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  4.780   6.990   8.350   8.847  10.200  17.290

The statistics based on higher order moments are obtained as before:


> stdev(w_Baa)
[1] 2.717073
> skewness(w_Baa)
[1] 0.9297785
attr(,"method")
[1] "moment"
> kurtosis(w_Baa)
[1] 0.760896
attr(,"method")
[1] "excess"
Both the skewness and the excess kurtosis indicate a distribution that is less normal-like than for
the Aaa bonds.
6. Now, consider a regression model with the yields of the Baa bonds fitted to the yields of the Aaa bonds (i.e., the Baa yield is the independent variable; the model is y_A = β_0 + β_1 y_B + ε). (You can use lm in R. The comments in the R help system for using lm with time series do not apply to this application.)
(a) What are your coefficient estimates and what is your estimate of the variance of the residuals?
(b) Now examine the residuals of this fitted model. Are they serially correlated?
This is a common kind of problem in finance. We have two time series that are probably related, and
we want to study their relationship.
One way to do this is by regression analysis. The problem here is that the residuals, that is, the random error terms in the regression equation, do not follow the usual assumption of zero correlation.

One way of addressing this problem is by vector ARMA methods, that is, by multivariate time series
analysis. The property of interest is called cointegration. We will not be able to discuss multivariate
analysis or cointegration in this course.
Let's just proceed to do the regression.
> fit<-lm(w_Aaa~w_Baa)
> summary(fit)
Call:
lm(formula = w_Aaa ~ w_Baa)
Residuals:
     Min       1Q   Median       3Q      Max
-2.46094 -0.14265  0.06717  0.20420  0.68515

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.030569   0.023025   1.328    0.184
w_Baa       0.881591   0.002488 354.350   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.3357 on 2465 degrees of freedom
Multiple R-squared: 0.9807,     Adjusted R-squared: 0.9807
F-statistic: 1.256e+05 on 1 and 2465 DF,  p-value: < 2.2e-16
For the model y_A = β_0 + β_1 y_B + ε, the estimate of the intercept β_0 is 0.031, and it is not significantly different from 0; the estimate of the slope β_1 is 0.882, and it is highly significantly different from 0.
Now, we want to examine the residuals:
res <- fit$residuals
plot(res)

1.0
0.5
0.5 0.0

res

1.5

2.0

2.5

Heres a plot of the residuals over time. What does that mean?

500

1000
Index

1500

2000

2500

Remember, these are the residuals over time. There was a period of time when the residuals were large
(negatively). The fact that the residuals were large, either positively or negatively during some period
just means that the fit during that period would be different from the overall fit. In general,
the fit of Aaa to Baa returns has changed over time.
I'll return to the consideration of the residuals' behavior with respect to time below, but first, ignoring the time series aspects, I want to show you something that we do in ordinary regression.
The plot above relates to the residuals within time, but in regression analysis, the relationship of the residuals to either the independent or the dependent variable is usually of more interest. Here's how we can get that (this is for a plot of residuals versus the independent variable). Note the use of order instead of sort.

1.5
1.0
0.5 0.0

0.5

res[order]

2.0

2.5

order <- order(w_Aaa)


plot(w_Aaa[order],res[order])

10

12

14

16

w_Aaa[order]

This shows that there are a lot of outliers, especially when the yields on the Baa bonds are low.
In a regression of one time series on another, an important question has to do with the correlations of
the residuals over time; hence we might return to the time domain, and compute the PACF and ACF.
pacf(res)
acf(res)

0.6
0.4
0.2
0.0

Partial ACF

0.8

1.0

Series res

10

15

20

25

30

Lag

The PACF indicates that the residuals may contain an AR component at lag 1.
Looking at the ACF, we see that it is dying very slowly.
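The slow decay can also be checked with a formal Ljung-Box test via R's Box.test. A sketch, using a simulated strongly autocorrelated AR(1) series in place of res (which requires the downloaded data):

```r
# Ljung-Box test for serial correlation in regression residuals.
# A simulated AR(1) series stands in for `res`.
set.seed(1)
res_sim <- arima.sim(model = list(ar = 0.9), n = 2467)
Box.test(res_sim, lag = 10, type = "Ljung-Box")
# A very small p-value indicates serially correlated residuals.
```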

0.4
0.0

0.2

ACF

0.6

0.8

1.0

Series res

10

15

20

25

30

Lag

These two series do not fit any simple univariate model. A vector autoregression model of order 1, that is, a VAR(1), might fit (indicated by the PACF).
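A VAR(1) can be sketched in base R by regressing each series on the lag-1 values of both; dedicated packages (e.g., vars) automate this, but none is assumed here, and simulated series stand in for w_Aaa and w_Baa:

```r
# Minimal VAR(1) sketch: each series regressed on lag-1 values of both.
# Simulated bivariate data stand in for the Aaa and Baa yield series.
set.seed(2)
n <- 300
y <- matrix(0, n, 2)
for (t in 2:n) {
  y[t, 1] <- 0.7 * y[t - 1, 1] + 0.1 * y[t - 1, 2] + rnorm(1)
  y[t, 2] <- 0.2 * y[t - 1, 1] + 0.6 * y[t - 1, 2] + rnorm(1)
}
y1 <- y[-1, 1]; y2 <- y[-1, 2]   # current values
l1 <- y[-n, 1]; l2 <- y[-n, 2]   # lag-1 values
fit1 <- lm(y1 ~ l1 + l2)         # first equation of the VAR(1)
fit2 <- lm(y2 ~ l1 + l2)         # second equation
coef(fit1); coef(fit2)
```

The estimated coefficients should be close to the matrix used in the simulation; with the real yield series, the same two regressions give the VAR(1) coefficient estimates.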
