0.0737
4.2652
0.2054Y
1.3608X
1.3608
=
0.2054
6.7023X
a,
0.3201X
6.7023
cx
the
The feedback given by these equations is not intentional but occurs in
the
during
made
machine tool itself, since no manual adjustments were
entire process.
To consider the implications of these models, we first note that for
but
Diameter 1 is slightly positive, whereas that for Diameter 2 is small
and
unstable
is
1
Diameter
process
for
the
that
indicates
negative. This
therefore out of control, whereas for Diameter 2 it is almost on the bound
therefore
ary of stability, but still stable (in fact, asymptotically stable), and
inspecting.
visually
by
confirmed
qualitatively
can
be
in control. These facts
drift
the series in Fig. 8.5. Measurements of Diameter 1 clearly show a
such
have
not
does
indicating out of control behavior, whereas Diameter 2
an unstable drift.
dt
DIameter 2
Similarly, for
+ 0.0737Y
c = 0.3201
to
so that the system equation for Diameter 1 corresponding to (8.4.3)
(8.4.5) are
dX
-+4.26
52X=Z-Y
a,
4.3389
0.0057
1
(xf= 13.57780.0737
a,
1
c + aa
cz, +
DIameter 1
310
Th
2
P
i
(X
1
1b4)
[Hint: >2
0 faster than
+ oc]
8.5 For each of the models in the illustrative applications given in Section
8.6,
(a) compute 01 using the approximations derived in Problem 8.4
and see how they tally with the estimated values of 01;
(b) what values of the exponential smoothing parameter K should
be employed if one decides to use EWMA for prediction?
PROBLEMS
8.1 Verify the expressions (8.1.6), (8.1.15ac), and (8.1.16ac). [Hint:
See Section 7.3.2.]
8.2 Find an expression for y(O) of an AM(2,1) model. For the same
is the variance larger or smaller compared to the A(2) model? Why?
Give physical explanation in addition to mathematical reasoning.
8.3 Verify the limits of 01 and cr given in Section 8.5.
8.4 Show that as
co, the value of P given by Eq. (8.1.15a) can be
approximated by
PROBLEMS 311
312
1
D
=
2P
1
D
2
(p.
(o
i.
1
)
2
D
)
2
D
)
2
(1XD(1 + X + 2PX
2PX
)
(1X)(1 + X + 1
8.6 If the conditions required for EWMA are nearly satisfied, what in
tervals of values of K should be employed for EWMA forecasting if
the error variance is not to exceed 4% above the optimal value and
the underlying model is (i) A(2), (ii) AM(2,1)?
8.7 Taking ct, = 5, 10, 15, give the approximate values of the feedback
parameters c and OLf, both positive, that will justify the EWMA model.
for the sampled process within the confidence limits 0.01 of the
parameter 4.
8.8 (a) Show that
where
and
21
b
a sin 2bA
a sin b cosh a
sinh a cos btk
where
N = sinli 2ax
1
2
N
1
D
2
D
313
For a first order system, both these limiting cases tend to random walk.
Simulated data from the random walk model
VX, = a,
thatis,
(1 B)X, = a,,
shown in Fig. 9.1 illustrates the constant trend. Note that although the
series tends to remain at the same level, this level changes randomly since.
the trend i stochastic in nature.
Somewhat hazy patches of constant trends at different levels can arise
Stochastic trends in the data arise when the roots are real. Let us consider
a few examples of these trends. When the Greens function G, tends to
1 is proportional
remain constant for all j, we get a constant trend. When G
as
well
as fr, we have a
varies
with
to j, we get a linear trend. When it
1
quadratic trend, and in general, we can have a polynomial trend.
The key to the analysis of trends and seasonality is to remember the fact
that the Greens function determines the overall pattern in the data. The
closer the roots are to the stability &oundary, the more vividly the Greens
function appears as the overall pattern in the data. Therefore trends and
seasonality patterns can be well represented by discrete models with roots
A, closed to one in absolute value and by continuous models with real parts
of the roots p., close to zero. The real roots represent constant, growth, or
decay trends, and the complex Eoots represent seasonality with the period
given by the imaginary part.
over the full model, then it can be taken as an adequate model and used
for forecasting, characterization, etc.
For the purpose of simplicity, we will first analyze the trend and
seasonality via the characteristic roots and the Greens function in Section
9.1, together with illustrations of simulated series. Real examples of mod
eling and analysis will then be given in Section 9.2, with data from business,
economics, hospital management, and transportation.
50
100
150
a,.
200
250
315
1.76,
1 = 1, A
2
0.76
42 = 0.76, A
If a parsimonious representation is desired, such roots close to one
from an adequate model found by the procedure of Chapter 4 can be
replaced by the exact value one. This is equivalent to introducing the
operator V = (1B) suggested by the adequate model. Such a procedure
was implicit when we used the model
VX, = (1
B)a,
1
0
for the IBM data. After reestimating 0,, and comparing with the ARMA(2,1)
model, this model was found to be adequate in Chapter 4 and was used
in Chapter 5 for forecasting. A similar procedure could also have been
followed for the papennaking data.
If a continuous model is used instead of the discrete one, it can provide
us wth the valuable information whether the trend arises as a result of
genuine limiting parameter values or simply because the sampling interval
Papermaking:
from higher order systems with one root (discrete) close to one in absolute
value and others relatively small. Two such examples can be seen from the
papermaking input data and the IBM data in Figs. 2.1 and 2.3. The
ARMA(2,1) models fitted to these in Tables 4.2 and 4.3 have the following
parameters and roots
IBM: 4, = 1.05,
= 0.05,
kl=1,
>2.5
6.5
is too small. This is clear from the limiting cases discussed in Section
s
for a first order system, which are also applicable to higher order system
with only one real root close to the limiting value one.
xt
200
0
200 1
-400
-600
-800
1000
-1200
1400
200
(1 + 1
5B)a
4
O.
.
I
150
I
100
V2X
I
50
lj),
)
0
(+V
,
e
2
+g
317
250
If w,,
is
very
small
so that both c 2
w and a
are small, the roots
0=
tend to
te value one and we get a linear trend by the operator
absolu
(1 2B + B
). If the roots are complex, the Greens function is a sine wave
2
With very small frequency so that it stretches out very far and gives the
appearance of linear trends in the data. Similar is the case in which the
roots are real since the exponentially decaying Greens function decays
2 +
t
3
1 + c
c
t + c
2
j
2
1 g
j + g
1
.
...
t
1
+ c,,
fl
g
j
+ 1
With 0
0.45, 02 = 0.45 and the variance of ass the same as that used
in Figs. 9.1 and 9.2. A quadratic or parabolic trend is clearly visible..
GQ)
very slowly and gives rise to a linear trend. Thus, in these cases the trends
are genuine in the sense that they arise from the limiting values of system
parameters.
On the other hand, since the exponents in the Greens function involve
be close to one because the sampling interval t is too
,, , the roots may
, a are not. This is like taking a large
1
parameters cr
the
even
though
small
amount of data on a very small part of a sine wave or exponentially decaying
Greens function of the second order system. This small part magnified in
the data then appears to be a straight line. Such a spurious trend induced
by the excessively small sampling interval can be detected and corrected
only with the help of more data or physical knowledge of the system.
319
(1 + 0.45B
=
200
-
150
50
0.45 B
)a,.
2
Comparison with Figs. 9.1 and 9.2 shows that the plot is relatively smoother.
This is explained by the scale factors. The range in Fig. 9.1 with the operator
(1B) is from 15 to +10; in Fig. 9.2 with the operator (1B)
2 it is
from 1400 to + 200; and in Fig. 9.3 with the operator (1 B) it is from
120,000 to 0. These ranges are a clear indication of increased instability
of the underlying system.
Note that the plot in Fig. 9.3 looks like a deterministic function al
though generated by a stochastic process. It is thus possible to approximate
deterministic functions by ARMA models over a finite amount of time.
Two or more roots close to one in absolute value give an indication that
such a deterministic component may be present in the data. One should
then consider the physical understanding of the system to decide whether
such a deterministic component is justified. If justified, the modeling meth
ods outlined later in Chapter 10 should be employed.
4)2
cos
V4+44)
2
1
(4)
)
i)
(V
r(cos w i sin w)
4)1
2V
/ zr\
x 2
(9.1.2a)
(9.1.1)
4)1 =
For a period of 12
and hence,
Thus,
1.34
2v
5
(I)
2ir
0.670 iO.456
11 years
and
(1
vB
X,
B
)
2
1.65B + 0.903B
)X,
2
(1 + 0.45B)a
1
(1 + 0.45B)a,
321
The reason why the periodic trend is not very regular is that P2 is not close
to 1 so that there is considerable damping.
To illustrate the seasonality of 12, two sets of data simulated by the
models
(1
are shown in Figs. 9.4 and 9.5. Both these models have the same period
of 12, but there is a slight damping in the second model since 4) < 1.
The seasonal patterns can be clearly discerned from these plots.
4)2
2 cos
2i
(1 iV)
(9.1.3)
that is,
=
1,
4,
2cos()
FIgure 9.5
20
10
10
20
100
150
100
150
200
200
250
250
(1 + O.45B)a,.
50
50
Figure 9.4
322
20
40
60
2
k
1
A
V+44,J
100
(9.1.5)
(1 + O.45B)a,.
80
323
Before
Modeling
and
xi
324
xt
)X
2
Figure 9.7 Simulated series from the model (1 +0.95B +0.9B
(1 + (L45B)a.
Roots
Table 9.1
Period
p
2
3
4
5
6
7
8
9
10
12
14
15
20
21
24
25
28
30
50
38
60
\p)
2
11.968B+B
2
11.984B+B
2
11.989B+B
12.000BB
2
2
1-1.956B+8
2
1+2B+B
2
1+B+B
2
1+B
2
10.618B+B
1B+B
2
11.2478+B
2
1l.414B+B
2
11.532B+B
2
11.618B+B
2
l1.732B+B
2
11.802B+B
11.827B+B
2
11.902B+B
2
11.911B+D
2
11.932B+B
2
11.937B+B
1-1.950B+B
2
2
B
1
14
+B
Operator
25
1, i
0.50010.866
1
0.30910.951
0.50010.866
0.62410.782
0.70710.707
0.766iO.643
0.809iO.588
0.86610.500
0.90110.434
0.91410.407
0.95110.309
0.95610.295
0.96610.259
0.96910.249
0.97510.222
0.97810.209
0.98410.178
0.992iO.126
0.99510.105
1.00, 1.00
Roots
2
1
0
0.618
1
1.247
1.414
1.532
1.618
1.732
1.802
1.827
1.902
1.911
1.932
1.937
1.950
1.956
1.968
1.984
1.989
2.000
same as that for the grinding wheel profile in Appendix I, with 4 = 0.76
and 4 = 0.21. If we simplify the data and make it stationary by
differencing it and fit a model to the differenced series based on the es
timated autocorrelation, we get an MA(3) model with
0.107,
03 = 0.185
02 = 0.357,
=
that has no relation to the actual AR(2) model whatsoever. Moreover, the
valuable information contained in the linear trend in the original data is
lost. (On the other hand, the methods of Chapter 10 recommended for
such data almost exactly estimate the linear trend and the AR(2) param
eters.)
No such simplification of the data to guess a model form is needed in
our modeling procedure described in Chapter 4. The procedure is robust
enough to model data with trends and seasonality. If the adequate model
naturally indicates trends by real roots and seasonailty by complex roots
close to one in absolute value, we can then introduce these operators
ensuring that the original series is not distorted.
0.76, 4
=
0.21.
Figure 9.9 shows the monthly averages of the weekly total investment in
the U.S. treasury securities by the large commercial banks in New York
When two or more roots are equal to or greater than one in absolute
value with or without seasonality and the data has a smooth trend or a
regular periodic behavior, the possibility of a deterministic component is
indicated. If the physical understanding of the underlying system justifies
such a deterministic component, one can use the methods discussed sub
sequently in Chapter 10.
xt
327
0.6275,
3
,
2
0.0269 ii .0536
City. Fig. 9.10 shows the corresponding monthly averages of money market
rates in weekly market yields of 3-month bills of U.S. Government secu
rities. In all, there are 132 data points each, from January1965 to December
1975. The second method of computing a,s, starting with a
1 =X
, was used
1
in estimation as recommended in Section 4.5.
(a) Money Rate
The results of modeling the money rate data by the procedure in Chapter
4 are given in Table 9.2. The ARMA(4,3) model is found to be adequate
and the autoregressive roots are
1 = 0.4582;
K
) = 0.9812;
= 0.004310.9653
None of these roots are close to the moving average roots:
9-.
25
50
xt
S
a-
5_.
t (Months)
75
100
3
K
4
K
0.0086
2 cos
I 2ir
125
indicateS
0.9812
Hence, none of the roots can be cancelled. The root )
B). The compleX
the possibility of a constant trend represented by (1
period since:
4 are nearly i that is a 4 monthly seasonal
K
pair of roots
by Eq. (9.1.5)
4)2 ?41
and
i
4
Table 9.2
Parameters
42
01
4)6
(6,5)
329
0.6940.404
(1 B)(1 + B
)
2
(1,3)
ARMA Order
(4,3)
0.5140.293
0.478 0.305
0.491 0.399
0.4190.386
(2,1)
0.941 0.734
0.0368 0.726
0.6810.260
0.2760.706
0.9130.311
1.0970.0278
0.8410.325
0.1061 0.126
1.1450.083
0.6840.539
3.9190.212
18.093
02
3.962 0.245
15.124
1.4580.764
1.255 1.24
0.809 1.42
0.0149 1.45
0.4941.14
0.456 0.549
0.2250.729
0.8560.675
0.1580.839
0.06700.751
3.864 0.726
15.046
0.6970.305
18.785
0.026911.0536
/
V1
7.70
03
04
05
Mean
Residual
sum of
squares
F
2
v
= 0.6275,
Compared with (4,3).
B)(.k )
1
4)
(1 0
B
1
)a,
B
3
B 0
2
0
which is also clear from the examination of Table 9.1. This 4 monthly
).
2
period can be represented by the operator (1 + B
) for the stochastic trend
2
Using the combined operator (1 B)(1 + B
and seasonality, we can reduce the autoregressive pirameters by 3 and fit
an ARMA(1 ,3) model to the resultant data, that is, the model now is
(1 B)(1 + B
)(1
2
The estimated parameter values for this model are shown in the last column
of Table 9.2. The residual sum of squares is only slightly higher than that
of the adequate (4,3) model. Therefore, the parsimonious model with trend
and seasonality operators may also be considered adequate.
Either the full ARMA(4,3) or the parsimonious ARMA(1 ,3) with the
operator (1 B)(1 + B
) can be used for forecasting by the procedure dis
2
cussed in Chapter 5. Both these models would give practically the same
forecasts and probability limits. The fact that with the operator (1- B)(1 + B
)
2
the Greens function fails to die out does not cause an appreciable change
in the error variances for short-term forecasts.
..
.0
1+
I-
U,
II
I,
0
0
C.-)
00
I-.
00
..
-.
C.)
I-.
00
00
I_
I-
a-. W Va
00 a o
CI)
00
1+
.0
0%
CI)
C--) a-
I C-.)
pp
1+ 1+
00 C.)
CI) .0
U
0%
VI
I.
CC
-a
11
Ca)
.0
1+
1+
(0
00 CIa
C,, C.)
00
+ 1+ 1+
0
Qoa-
0
C.)
C.)
Vi
a-
bob
tOO004Lia4
.0 0 C.) .00% C-.) 000 .0
0
OOWVa00%C
0%
C.)
0s
U,
.
1+ 1+ 1+ 1+
pppp
0% .0
00 W 00 Va
00
I,
p
0
00
.-
+i+L+l+I+I+I+I+I+I+I+H
I-.
1+ 1+
OO000%.C.0%
cia ) a..) . .00000 C.)
0 C-.)
%00%G0,W00Va
.0
a-.
0
pp
0% -
a..
tPrPPPPPPPb
I.
a-.
-)
00
a-.
cia .- .
%0 I-V,
I+I++
1+
%O
,-
III
0%-00
000s
cia
00
.0
cia
1+
-
1+
0%
..
-.
%0
,-.,
ft
II
a..)
a.
a
OEE
331
O.0051iO.9852
(O.96
2
g
2
)i + A(0.9852) cos
( 0.9201) + 5
1
g
(g
1
0.4582) + 2
(g
0.9812)i + A(0.9653) cos
j+
(i+)
The long-term behavior of the Greens function and hence that of the
overall series is determined by the autoregressive roots/parameters, whereas
the short-term behavior up to three lags is also affected by the moving
average roots/parameters.
In addition to the seasonality of four months, the series are dominated
by a-long-term factor leading to the near constant trend and is given by
(b) G,
1
G
(c) Interpretation
The real significance of the trend and seasonality analysis far exceeds the
parsimonious representation and the resultant simplicity of forecasting.
With the help of characteristic roots la.,, the trend and seasonality operators
discussed in this chapter, and the form of the Greens functions studied in
earlier chapters, we will try to briefly interpret the adequate ARMA(4,3)
model for the money rate and the investment data.
We know from Chapter 3 that the Greens function of the above
ARMA(4,3) model will consist of the sum of two exponentials correspond
ing to the two real roots X
1 and X
2 and a sine wave determined by the
complex conjugate pair k and )a... With the 4 monthly seasonality analyzed
above, the forms of the Greens function are shown in Fig. 9.lla, 9.llb
and may be taken as
,X
3
X
4
(b) Investment
The results of modeling the investment data are shown in Table 9.3. The
adequate model is ARMA(4,3) with autoregressive roots
= 0.9652;
2 = 0.9201.
k
332
20
g, (0.4582I
2 (0.9812)1
g
60
2i
A (0.9663)icos(--j+c)
2g (0.9812)
Gjg, (0A582P+
2
+A (0.9653)Jcos&j+j3)
.
the exponential (0.9812) for the money rate and (0.9652) for the invest
ment. This part of the Greens function effectively denotes high inertia or
sluggishness or long memory similar to the case of random walk discussed
in Section 2.2.6. Note that such inertia or sluggishness is more prominent
for the money rate than. for the investment since the exponential (0.9812)
dies slower than (0.9652). This shows that a part of the money rate has a
greater tendency to remain at a fixed level than the corresponding part of
the investment.
The third part of the Greens function leads to the fluctuating tendency
in the series due to the negative root. The root 0.9201 for the investment
1.2
1.0
0.5
fi
0.0!!
(a)
40
j(Months)
Figure 9.lla Greens function and its components for the money market rate
data, ARMA(4,3) model. (Compare with Fig. 3.7)
ci
kL.
1.0.1.
0.5
0.0
60
80
I
+4 (O.9852)1cos(j+)
. (0.9652)1
01)I+g
g9
1
G
2
(O
2
g (0.9652)1
(0.9201)1
20
40
j(Month)
(b)
Figure 9.llb Greens function and its components for the investment data,
ARMA(4,3) model. (Compare with Fig. 3.7)
333
334
STOCHASTIC TRENDS AND SEASONALITY
,
2
,g
1
precise in two ways. The first is to actually compute the coefficients g
A, and the phase angle 3 by the formulae given in Chapter 3, especially
(3.1.17b), (3.1.17c) and (3.1.26). These coefficients quantify the relative
influence of the three factors considered above and the effect of the moving
average parameters in the overall behavior of the series. The second is to
compute the variance components as discussed in Section 3.3.6. (For de
tails, see Appendix A 9.1 and Problem 9.6.)
Figure 9.12 shows a plot of the daily in-patient census at the University of
Wisconsin hospital, from April 21 to July 29, 1974. The data consisting of
100 observations are listed in Appendix I at the end of the book. The plot
clearly indicates a cyclic trend of a weekly or 7-day period as would be
expected. In other words, the observations that are seven days apart are
similar. According to available literature on modeling seasonal series, one
), that is, take the seventh difference of the
7
might use the operator (1 B
) is expected to remove the
7
data before modeling since the operator (1 B
nonstationarity that is present because of a 7-day period in the series.
) is inap
7
However, the results show that the use of the operator (1 B
propriate for modeling hospital census data.
s set to zero are given in Table
1
The results of modeling with initial a
9.4. Following the standard procedure, we find an ARMA(7,6) model
adequate for the hospital census data. The characteristic roots (poles and
zeros) of the ARMA(7,6) model are given in Table 9.5 and are also shown
).
7
in Fig. 9.13 where they are compared with the roots of the operator (1 B
) has seven roots of unity,
7
The polynomial (1 B
2irj
eT,
6
j 0, 1, 2,
evenly spaced (0 = 2/7 = 51.43) on the unit circle. It is seen from Fig.
9.13 that all the complex roots of the ARMA(7,6) model lie fairly close
7 = 0.745 is quite different
to the complex roots of unity, but the real root X
7
1 in the ARMA(7,6)
from the real root of unity. The absence of the root X
) is not appropriate for
7
model implies that the use of the operator (1 B
the hospital census data.
) is also clear from the
7
The inappropriateness of the operator (1 B
fact that if the ARMA(7,6) model with this operator was appropriate, then
we should have
4)14)2..--4)60, 4)7_i
and the actual values in Table 9.4 are nowhere near these. On the other
7 = 0.745 we note that
7 = 1 by A
hand, to replace the roots A
B
B
6
(1B)(1+B+
2
3
4
+
5
)
)
7
(1B
xt
0
.0
Therefore,
80
40
60
20
t(Days)
Figure 9.12
4)2
0.255,
4)7
0.745
335
100
(1+B+B = (1+0.255B+0.255B
B
2
(10.745B)
2
3
4
+
5
)
6
0.255B + 0.255B
3
4 + 0.255B
5 + 0.255B
6 0.745B
)
7
which gives
=
1 + X
iX
2
6.983
cos
2ir
0.624
\.V0.6242 + 0.7862
cos
336
CI)
CI)
C)
CO
Cd,
5)
0
00
i
Q r1
-4
I
I
c r5
-4
-4
c-j
NN
-4
I
I
r
N
C 00 .0 .0
CC
()
.0 00 C Q\ 0000
+1
S
CrC
-4
+1 +1 +1 +1 +1 +1 +1
r
00 t .0 .0 N
.0 N
r r4
SSSS
(4 1
r r r
.0
SS
.0
+1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1
00
SS
.
+1 +1 +1 +1 +1 +1 +1 +1
55
N O
00
-4
+1
r.C
-4
.0
I
I
-4
-4_I_
-4
.0
CrC
00
555
0000
+1 +1 +1
-4
+1 +1 +1 +1 +1
-4
r 00 Q
.0
00 t
Cr)
+1
IC)
-4
00
-4
-4
-4
00
-4
5(4
.0
(.1
N
0
C..)
I-)
IC)
COo
4-
+1
CO
.0
Cf.)
+1
+1
.0
O
-4
r .0
+1 +1 +1 +1 +1 +1
c\
+1 +1 +1 +1 +1 +1 +1
S
(4 N
S S
+1 +1
55
-4
CO Cfl
-4
+1 +1 +1 +1
N Q 00
+1 +1 +1 +1 +1 +1
-4
CrC 0000
N
crr
FIgure 9.13
Re(A)
lm)X)
1
i2r
rootsofunitye jO, 1
;
7
o pole(autoregressive root)
337
gkk
1
X
5
g
+ g
X +
2
+ g
X4
7
Similarly, the periods for the other two pairs (0.237iO.965) and
(0.89210.424) are found as 7/2 days and 7/3 days, respectively (Table
9.6). Thus, the roots (0.23710.965) and (0.89210.424) only reflect
the higher harmonics of the fundamental period of 7 days.
In order to see the contribution of these harmonics to the hospital
census data, the amplitude of these harmonics are computed using the
Greens function as discussed in Chapter 3. Since the ARMA(7,6) model
has seven characteristic roots, X, i = 1, 2,
7, the Greens function
for this model may be represented from Eq. (3.1.24) as
1
G
where the g,< are the strength of the mode represented by Xk. As all the
complex roots are close to 1 in absolute value (Table 9.6), the Greens
1+
00
+ 1+
VI C
1+ + 1+
-J
I..,
I-
I-
tJ
1+ 1+ 1+
op
.000
C VI
VI
. .0
-
-.
COP
Op
00 t) 0%
.0 W
,J -I .
I-.
8EE
JC00
1 0 VI VI 1%)
00 i-
1+ 1+ 1+
(.) 00
Vi 0
. 0 -.J
Vi
%0
0% 0%
00
1+ 1+ 1+
00
0%
00
1+
0%
oP P P
C 00 I-.
.) . 0
C C C
* J
00
C
VI
p pp
I-
0%
1+
VI
VI
I-
I.-.
4,
(0
339
2 A. cos (w,j+,)
+
k
7
g
3 cos (w
A
j + 33) + g
3
X
7
(9.2.3)
.03
4)6
=1.2351, 04
0.2822,
0.1981; 0
1.2982, 05
0.9051, 02
1.5275, 06
1.2571
0.9780,
n U!
.oiojoq
A1c06o
-AcLZcT
DLII
243
OS
AOcErI
X060 + XILZOI +
AZ86VI
Aqsouaspw
SOlIDS pD3Sfl[pC LIIUUOSUDS oqi azojaioqj
041 IDA0DJOjAJ
BE
AJXIVNOSVJS ONY
A
(
u
9
ILcrI +UcLlci +aZ86ZT icvi +go8L6o+gIco6o+ 1)
&E96o+sLOVoI +,9JOI +OL86O+q89660+OLO1 +)
9
(
q uo ! 6t7t7CO = L. pim
K686O + XW10I + J(
se owns 1
v Jo 03UJA 042 42!M
OD1j-i(3qRUOSnDS
+ A6HL0
xtowi + x066o +
AO8L60
9
ATLcVr
z6t7t7LiJ
8E8U
gzt100I
U6E611
eq iioos
SOlIDS 3IO4
10
soo wnwpdo
soarnosai juidsoq Jo uO3ez!I!1n 41 doo
03 I3DflpO13U!
etp !I0d jo pojjo otp u iosu o posn oq ospi uu
1pow u. puwr P3!P.zd otp oepowwoe lsoq o inps 1(oJcl
-wo pun luowdrnbo oip &ipsnpn U042 pun peo uoped oq &u2snaloJ
ioj uoTle.nstuitupe piidsoq oip o in;sn oq ue pow oioqn
oq
nn sots A
ip enp ip jo o
1
d
1
x!puQddV m posj
rnep Dip.
W01j (TaM) xDpu! !.Id nsoq pue (j) xpui
oti iidV 2 61
icq1uoui oip jo Slold 042 OqS cU6 P tt6 !l
!Jd iDurnsuoo
fl
(I
4)
it
it
c.
xt
50
100
150
t(Months)
200
250
II
3U0
C
Q
0
C
50
100
150
t(Months)
200
250
c
14
xt
300
342
343
Figure 9.16 shows the logged airliner passenger ticket sales from January
1949 to December 1960; the original data are listed in Appendix I. It should
be noted that the logged series is being used here simply to compare with
roots
a.
0
00
C.,,
1+
0
0
0op
1+ 1+ 1+
pop
.0
ow
0
1+
00 0
0
.0 C-
0
00%
1+1+
.00
Op
bo
0000
1+ 1+ 1+ 1+
p
00 .0. )
0% 3 3 I
0% W . o
Lh VI
.0 .0 0
0 VI 00
000
.
.000000000
C)
0
0% 0
C-
VI
C)
C) 0
0 C)
-.
I-.
-J
Op
1+ 1+
..
.0
,) 0%
op
VI .-3
00 C) 00
inbo
000
1+ 1+ 1+
%OV0%
0%
.
.0.
0
00p
C-
C-. C-
0% 0%
W C) 0
0pp
00% .
0% 0000
w
[4- 1+ 1+
00p
00
1+ 1+ 1+
00p
0 C) 0.
C) 00
W I- .000% 0
.0 I- .J 1) 0
C)
C) 0 C) 0
1+ 1+ 1+ 1+ 1+ 1+ 1+ 1+ 1+ 1+ 1+ 1+
0000 .0 0% VI 00 .0
1+
0
C.,,
0%
CO
CO
0.
CO
-I
-4
(0
00
(0
0.
tI
(4,3)
j 0.17831.0156
10.9508
t.72
(2,3)
345
0.9381
1.0157
1.0136
0.1787 iO.9542 0.176810.9487 0.180510.9492
0.1431iO.4670
0.8443
0.8962
0.8523
0.2138 10.9796 0.2140 iO.9795 0.2148 iO.9768
0.144010.5221
Order of
the
ARMA
Model
(6,5)
V (3,3)
t--
1.9460.180
0.9450.183
(2,1)
(4,3)
0.8790.235
93.500.653
93.530.652
0.8590.233
1.5240.194
1.1010.169
1.5030.157
0.9240.183
0.4330.244
0.5810.138
(1,0)
Table 9.8 The Characteristic Roots of the ARMA Models for the Consumer
Roots
A,
1.0158
0.9l13
A
,4
3
A
>.
0.8523
0.8018
+0.2140 10.9794
v,
21
v
3
0
,
4
V
V
Table 9.9
eters
1.0140.005
Param-
4
4>
4,
4)3
OC
93.480.664
03
Mean
22.355
22.684
0.71
23.153
2.05
Residual
sum of
squares
F
The adequate model is ARMA(1,0).
1.0139
(1,0)
0.8792
1.0131
0.9328
1.0132
0. 9332
0.21 12 10.9658
0.8805
0.2238 iO.9619
(4,3)
4
K>,>.
v,
v
,
2
>
Figure 9.16
120
80
Month
some of the existing models of the series and not in order to make the
modeling simple. If the modeling strategy of Chapter 4 is followed, the
original series is as easy to model as the logged one. The growth trend and
the unstable nature of the series do not require a transformation to make
the series stationary, as was illustrated by the CPI and WPI series in
Section 9.2.3.
Fitting ARMA(2n,2n 1) models for increasing order yields a signif
icantly reduced sum of squares until n = 7, that is up to the ARMA(14,13)
model. The estimation results of the ARMA(14,13) model are given in
Table 9.11. It is also found that for the ARMA(14,13) model, the parameter
is small (4,, = 0.0767); hence, the parameter
is dropped and
the ARMA(13,13) model is fitted to the data with the results given in Table
9.12. The residual autocorrejatjons and the 2/VN bounds are shown in
Fig. 9.17.
The 13 autoregressive and moving average characteristic roots of the
ARMA(13,13) model are listed in Table 9.13 together with the periods
corresponding to the complex conjugate pairs calculated by Eq. (9.1.1).
It is easy to see from the table that the first 12 estimated roots are practically
the same as the 12 roots of unity, the theoretical values of which are listed
in the left half of the table for comparison. The positions of these roots
are indicated on the unit circle as shown in Fig. 9.18 on page 350. The
angles and the complex number representations e of these roots are
given in columns 3 and 4 of Table 9.13.
C
0
5.
347
0.9929iO.0212
294.31
12.0560
5.9686
4.0161
3.0093
2.4096
1.22
29.86
60.32
89.64
119.63
149.42
Angle
0.27380.2160
0.12510.2070
1.10770.2310
0.46490.2170
0.01870.2700
0.19770.2780
0.27220.1980
0.13750.2240
0.05700.2800
0.01100.2300
0.090698.
...
For complex conjugate pairs, only one root a+ib is considered. For a full configuration
0.9539
0.0816
0.858910.5076
0.0063iO.9994
0.4951iO.8704
0.4953iO.8689
0.8683iO.4985
0.57240.2050
0.05780.02850
0. 1749 0.2240
Period
5.85240.6370
R
0
R,
I?,
R,
4
R
R,
6
R
Mean
Characteristic Roots
1.01610.0262
0.0767 0.0557
0.04470.0307
0.92480.0058
Position
4)13
05
06
07
08
09
0,
0,
012
0.11860.0291
03
04
0.11860.0082
0.07120.0029
0.08120.0119
0.09960.0133
0.06130.0339
03
0.12210.0123
02
0,
1.14520.0092
0.24760.0147
0.16060.0284
4)12 =
4,6
4)3
4,2
Parameters
(42
4)
0
CO
CO
4)
CO
>
4)
4)
0
+1
00
0C
C..0
00
+1
00
00
.0
00
Cr)
,-
0%
C)
C)
C4
Ci
Cr)
0
-C
C)
I
*1
.0
00
C)
CI
C
0
C)
I
+4
C)
+1
rI
.0
.0
00
+1
I
4!
+1
.0
.0
Cr)
+1
oc
.,
0.,
04
CO
0
CO
CO
0
CO
0
.4)
0
0.
I,
CO
C)
V
Cd)
44
CO
04
0
0
CO
0..
5.
6]
4)
0.
CII
+1
+1
0
qcr
GOC)
C)
-I
+1
.0
0
0
0
+I
00
.0
0
+1
Co
+1
-C
-C
+1
0%
+1
0
C
.0
GO
00
+1
Cr)
(-
.0
r
0%
00
0.
C
00%OO
Q
0:
00
+1
00
00
0%
+1
r)
-4
00
.-
00
oI
00
<
>
4)
>
CO
C)
C)
0.4
I-.
0
0
C)
0
0
349
Ill
Iii.
0%
II
ii
I iii.
OE
lapow
lIJ
iii
(L C)WV
IIJI
jo suO!Wt9nooOne
i.
OL
II
0
0
0
0
0
S!
II
=
=
4,
4
64
4
84,
64,
St
RTOO68WO
t1TO.O8L6OO
9TOOtiOOI
c9Oo89cOo
8coott6oo
9VZOO1Z6OO
czoocLcoo
9)
9
9,
=
= 014,
19cwoI8c6O
LOO1Z66o
TOCOo91Wo
LIZOO+EIOOO
ZOO9OSOO
I0.IUJCO1Q
s100WR.Id
OLWO tOV9
= 010
(cI)vNIv n ;o s
O61OSOtOO
OSOrOVIEOO
0W01L1600
OOVTOt61tO
61WOLtL1O
m,cFOZLEOO
O8SUO6OO
O6L1OWWO
o1cuoLooEo
oocuoztswo
ZOt718EO
OC9IO6IJ
ESL1600
snq
pow nbpi
ujpue
ioije purn
o
uo
U PP!00 oq 0UU3 sps
(zi9 i)
-qo oq iI!M qi (ii)viiv qi aroj rqi pirn. pow (t)viv n
jeq u&oqs oq &trnbpe jo
inq s ppotu (itt)viv
oq puq ioqo oq1 u opow (it)vwiv u IW& i q pio o pu
ioizdo Otfl
90
90
US
OS
pow (ii)viiv
(9.2.4)
--
Figure 9.18 Positions of the characteristic roots (Table 9.13) on the unit circle.
R3
Figure 9.19
10
20
Lag
30
40
50
1.000
0.500
0.000
0.500
)X, = )a
12
(1B
12
B
12
(10
Combining these two models gives the two-parameter model (9.2.4). After
fitting, they conclude this model to be adequate on the basis of residual
autocorrelations of a,s, (Fig. 9.19), although a number of them appear
rather large compared with their standard deviation 1/V7 =
0.09, and the twenty-third autocorrelation is about 2.5 times the standard
deviation.
The modeling of the airline data brings out several strong points of
the ARMA(n,n 1) modeling strategy. First of all, no trial and error was
involved in modeling. The adequate ARMA(13,13) model was arrived at
by a straightforward application of the modeling procedure given in Chap
ter 4. Even the parsimonious model (9.2.4) was clearly indicated by the
parameter values of the ARMA(13,13) model and its characteristic roots
given in Table 9.13. This shows that such parsimonious models are not
missed by the ARMA(n,n 1) strategy, but are naturally included in
them, as was emphasized in Chapter 2, and can be logically arrived at as
illustrated above.
Second, the ARMA(n,n 1) strategy not only provides us with simpler
models if we want, but also clearly informs us of the cost we have to pay
for the simplicity of such parsimonious models. For the airline data, we
can choose the two-parameter ARZvIA(13,13) model only at the cost of
nearly an 80% increase in the one step ahead forecast error variance. Note
that the one step ahead forecast error variance for the ARMA(13,13) model
is 0.000637 (0.091751144), whereas for the.two-parameter model, it is 0.001267
seasonality
APPENDIX A 9.1
CALCULATION OF THE STRENGTH g,,, AND AMPLITUDE A,
CORRESPONDING TO MODE Ak FOR THE ARMA(7,6) MODEL
1
G
=
gkXk
0,Ki
(Xkkk_l)(Xkjk+,).
olki
.
1,2,.
)
7
(k,k
7
,6 from Table 9.4 and kk, k
2
(XkXl)(Xk
)
-K
k=1,2
1,2,.
=
1
A
1
0,K
1
K
2
6
1
A
4
0
4
3
0
i
1
(A,X
)
3
)
4
(X,
(x
)(X,
X
A,)(
)
6
X1
(A
K
X A
)
7
Substituting 0., i
from Table 9.5
= O.1046+iO.1016
and g
2 is its complex conjugate
iO.1016
2 = 0.1046
g
Similarly, g
,g
4
, g
3
,g
5
, and g
6
7 can be calculated and are given in Table
9.6. The corresponding amplitudes are simply twice the absolute values of
)
2
gk by Eq. (3.1.16b). Thus, for (g, g
1 = 2 V.0.1o46)2 + (0.1016)2
A
an
135.8
[imaginary Part of g
1
Real Part of g
1
PROBLEMS
353
= 0.2916
Similarly, for (g
,g
3
) and (g
4
,g
5
) the amplitudes are computed as 0.0666
6
and 0.0640, respectively. The results of the amplitudes and their relative
ratios are given in Table 9.6.
The phase angles f3 can be calculated using Eq. (3.1. 17c). Thus, for
,2
1
(g
g
)
tan [:]
PROBLEMS
9.4 A set of daily data covering a period of two years gives an ARMA(4,1)
model as aquate. The autoregressive roots are
= 0.625 iO.780;
4 = 0.978 10.208
3
A
(a) Determine the seasonality.
(b) Suggest the possible seasonality of operators and the form of
the parsimonious model after using the seasonality operators.
9.5 An ARMA(7,1) model has characteristic roots Ks given by
2 = 0.999, K
0.62, K
3 = 0.998,
1
K
= 0.510.866,
,X = 0.5iO.866
6
K
7
(., Plot the characteristic roots on the complex plane.
What kind of trend and seasonalities are contained in the model?
Can the operator (1 BN) be used for the system? If no, explain
why not? If yes, what is the value of N?
(b)
(c)
(L
13)
(a)
+ A (0.9653)1 sin
9.6 The Greens function of the ARMA(4,3) model in Table 9.2 for the
money market rate may be nearly represented by
1 = g
G
(0.4582) + g
1
(O.9812)
2
355
10
356
DETERMINISTIC TRENDS AND SEASONAUTY
We will first consider the simplest nonstationarity that involves only a linear
trend in the meant. This implies that the data appear to be scattered around
a straight line. It was seen in Section 2.1.1 that when the data are hdependent, we can ttse the standard linear regression formulae to estimate
the parameters and their confidence intervals. In this section, we will con
sider a method of modeling a linear trend when the data are no longer
independent.
The method essentially consists of decomposing the model into tA1
partsdeterministic and stochastic, removing the linear deterministic part:
that causes the nonstationarity in the data, and modeling the remainiflg:
stochastic part by the methods of the preceding chapters. For the purpO;
of removing the deterministic trend, and also for obtaining the initial values
of the parameters of the final deterministic plus stochastic model, we wi11
first fit the deterministic model alone. In this preliminary fitting of tbe.
UNEAR TRENDS
357
deterministic part, we Will obtain the estimates by the least squares method
assuming uncorrelated errors. In particular, when the trend is linear, the
linear regression results of Section 2.1.1 can be used to estimate the pa
rameters and to obtain the residuals. The residuals can be examined to
gether with their sample autocorrelations to check that no other nonsta
tionary trends are present. The residuals can be modeled by the procedure
in Chapter 4 to fit an ARMA model, which roughly represents the sto
chastic model. Then, the parameters of the final deterministic plus sto
chastic model can be estimated together, by starting with the estimates
from the separate models as initial values.
22
20-
18
20
60
80
40
100
a,
U
C)
FIgure 10.1
0.10658 0.00986
a,
0,
Cycles
= 1.3263
The residuals , from the model (10.1.1) after using these estimated
values are plotted in Fig. 10.2. The residuals have a tendency to increase
once they start increasing and to decrease once they start decreasing This
behavior indicates a strong positive correlation in successive data points,
and therefore the residuals are not independent.
We will therefore try to model the residuals following the procedure
in Chapter 4. However, examining Figs. 10.1 and 10.2 for the original data
and the residuals respectively, we see that the residuals are very small in
numerical value compared to the data. Most of the residuals are less than
358
359
Lag
Figure 10.3 Crack length data: Autocorrelations of the residuals from the
deterministic model.
50
LINEAR TRENDS
360
0.2366
1.4668
0.1558
0.5870
0.6981
0.7507
0.388
0.669
0.512
0.454
0.555
0.377
0.470
0.15973
0.4157
(1,0)
0.18986
0.9299 0.079
1.320
0.18650
0.4070
0.4173 1.240
0.4940 1.140
(2,1)
Table 10.1
Parameters
03
03
Residual
Sum of
Squares
I.u
0.5
0.01
IIilI
I
10
20
Lag
I
30
40
50
C
C
0
0
C,
0.5
1.0
FIgure 10.4 Crack length data: Autocorrelations of the residuals of the AR(1).
model for the stochastic part.
LINEAR TRENDS
361
therefore seems reasonable to consider the residuals from the AR(1) model
as practically white noise and choose this model.
3t
i
13
i +
t
1
13
[
4
0
(
1
t1)]
Y,_
f3
a,
dX(t)
+ aX(t) = Z(t)
dt
For the purpose of estimation the discrete model may be written as
4
X
1
_ + a,
(10.1.4)
or
1,
0.1079
0.0024
t(14 + I34 + a,
$3
)
= 1
Y,_ + o(14) + 1
4,
Since the derivative of 1, with respect to either of the parameters
I3,
Or 4 is a function of the parameters itself, the reader may verify that Eq.
(10.1.4) is a nonlinear regression model in its conditional aspect. Therefore,
it is necessary to use the nonlinear least squares method to estimate its
parameters. Starting with the estimates of 13o, 13i, and 4 in the separate
models obtained above as the initial values, the nonlinear least squares
routine minimizing the sum of squares of a,s in the model (10.1.4) yields
the following estimates and 95% confidence intervals:
10.8896 0.0853
=
(4
(4
C
O.5
n 1!
0.5
1.0
10
I
.20
Lag
I
30
I 1I I
I
40
50
of
The deterministic part of the model, which accounts for more than 82%
of the variance of the crack length observation, represents the mode of
crack propagation as a function of the number of stress cycles applied to
the material. The model (10.1.2) shows that the mechanism of crack prop
agation can be represented by a simple linear model, at least over the range
stress cycles used in the data, that is, zero to 90 cycles. The crack length
is directly proportional to the number of stress cycles the material has
undergone.
A conjecture about the stochastic part of the model (10.1.2) can be
made, based on the fact that this part represents a very small percentage
of the actual observation. Let us assume that the stochastic part of the
model arises from the measuring instrument. The instrument cannot re
spond immediately and will involve some time lag or inertia giving rise to
instrument error or noise. We have seen in Chapter 6 that a system with
inertia can be represented by an A(1) system. if this conjecture is indeed
correct, then the preceding analysis gives us the time constant of the in-
FIgure 10.5 Crack length data: Autocorrelations of the residuals for the
deterministic plus stochastic model.
362
1
=
24.01
0.00195
0.04166X(t)
f3 = 10.89
+
Z(t),
(10.1.Sc)
We have seen in the last section that when the data clearly indicate the
nature of the deterministic part and the noise or the disturbance added to
Engineers, and scientists are often faced with the problem of instrument
calibration. The static constant bias in the instrument can be readily found
and in many cases easily corrected. However, the calibration of the dy
namics is more difficult. In general, one is forced to carry out the calibration
of the dynamics under laboratory conditions, using the instrument to meas
ure a fixed measurement. Such a calibration under laboratory conditions
may not hold under actual working conditions.
The above analysis provides a possible method for calibrating the
dynamics of the instrument under working conditions. If the above analysis
is repeated for more sets of data, one can get a fairly good idea of the
dynamic behavior of the instrument. Of course, the assumption inherent
in such a procedure, that the stochastic part in the data is due to the
measuring instrument alone, will have to be checked and confirmed by
looking into the physical circumstances and any additional information
available.
Instrument:
C(0)
a--
1
0.04166
363
strument as
EXPONENTIAL TRENDS
EXPONENTIAL TRENDS
41.0
365
Figure 10.6 shows a plot of 100 data points of the observed basis weights
in response to a step input in the stock flow rate of a papermaking process.
The numerical values of the data, taken at one-second intervals, are given
in Appendix I. A brief description of the papermaking process together
with a schematic diagram is presented in Section 11.11 The modeling and
analysis of this data provide a technique..of. system identification from on
line measurements as discussed in Pandit, Goh, and Wu (1976).
During the operation of the papermaking machine, there are variations
of paper basis weight not only in the machine direction but also across the
machine direction. The beta-ray gage recording the basis weight moves
back and forth across the machine direction, forming a zig-zag measure
ment path, as shown in Fig. 10.7, due to the movement of the paper in
the machine direction. Under steady, state conditions, when weight vari
ation in the cross-direction is negligible, the record looks similar to the
sketch shown in Fig. 10.8a, but if there is a significant cross-machine var
iation, for example, if the paper is thicker at the center, the measurements
exhibit a pattern such as the one shown in Fig. 10.8b.
In the particular papermaking machine under study, there was a strong
cross-directional variation superimposed as noise over the dynamic re
sponse. It is therefore important to characterize not only the dynamics but
also the noise. The dynamics is controlled by means of the stock flow rate
(time-wise adjustments), whereas the cross-directional variation can be
controlled by adjusting the openings on various sections of the headbox
a,
C
a,
10
20
30
40
50
60
Time(seconds)
70
80
90
100
Lriiiiiiii
0
Figure 10.6 Basis weight measurement data showing the machine response to
a step increase in stock flow.
0,
(b)
(a)
(Center of paper)
Time
I.
0)
Time
(10.2.1)
e) + e
y = A + g(1
where y, is the measured basis weight at time t seconds, g is the steady
state gain, r is the time constant in seconds, and e, represents the unknown
disturbance effects. Although a part of the disturbance may come from
the input, machine-directional variation, or the instrument, it is believed
that a major portion arises from the cross-directional variation.
In the preliminary analysis, primarily aimed at obtaining the initial
values for the final adequate model to be considered later, we assume
, g, and 7 by the nonlinear
0
to be uncorrelated and estimate the values A
Since the observed data shown in Fig. 10.6 is the result of a step increase
in the input stock flow, under the assumption of the first order dynamics
with superimposed noise or disturbance, the model for the basis weight
may be written as
367
Timer
Figure 10.9 Basis weight residuals from a first order deterministic model
(10.2.1).
1.9321 0.1661
27.1719 6.2593
Residual sum of squares: 4.0293
The plot of the residuals and their autocorrelations is shown in Figures
10.9 and 10.10. From both these plots it is clear that the residuals are
stationary and highly correlated; in fact, the plot in Fig. 10.10 shows a
decaying pattern characteristic of a stationary stochastic process. If the
model for the residual is AR(1), then the initial value for the parameter
0.755.
4 will be fi =
least squares method. The initial values of these parameters required for
the least squares computer routine minimizing the sum of squares of the
0 can be taken
ess are easily obtained from the data. The initial value of A
the
basis weight
Fig.
10.6,
In
39.
about
value
as the smallest observed
is the sum
which
40.6,
around
fluctuate
seconds
30
after
data
measurement
t
1.6.
At
=
39
40.6
as
taken
was
= ; the
g
therefore,
and
g;
0
of A
.37)g 40; therefore,
0 + (1
average response should reach the value A
27.
from the plot,
Starting with these initial values, the nonlinear least squares routine
gives the following estimates and 95% confidence intervals:
0 38.7586 0.1840
A
EXPONENTIAL TRENDS
1.0
0.5
0.0
0.5
h1.
I
1
iIIi
I
20
I1II
I
30
Ill
C
0
J
1.0
I
10
Lag
ih
IIII
I
40
1111111
50
=
=
0 + g(i
A
e;)
X,_ + a,
1
+
X,
(10.2.2)
X,
0
A
+
g(1
e) + 1
X_ + a,
(10.2.2a)
=
=
38.9974
0.4228
23.5606
0.1270
0.2682
1.7992
35.3912
0.7887
Taking the estimated values given above for the model (10.2.1) as the
initial values of A
, g, and T and the initial value of 4, as the one lag
0
autocorrelation of the residuals 4 =
= 0.755, the nonlinear least.
squares computer routine yields the following estimates and 95% confi
dence intervals:
0
A
g
=
=
0.4
0.2
00
XX
XX
X
XXX
Xx
I
20
X
X
xx x
x x
I
40
X,
X
Timee
I
60
EXPONENTIAL TRENDS
XX
XXC
I
80
100
369
.5
0.4
0
FIgure 10.11
dXQ)
XQ)
0
a
Z(t)
(10.2.3)
Since the sampling interval is one second, model (10.2.2) implies that the
differential equation for the stochastic part XQ) may. be written as [see Eq.
(6.4.7)]
0
A
g]
[B(t)
0
A
g]
(10.2,4)
With o =
ln(0.7887) = 0.2374. The deterministic part is
) =
1
ln(cb
the step response of the basis weight 8(t) measured along the path in Fig.
10.7. Its differential equation is (see Section 6.1)
if[B(t)
0
A
+
g
=
(38.9974+1.7992)
35.3912
0 = 38.9974
A
=
40.7966
4.2128
1.4915
0.9522
0.7360
0.0521
1 + a
_
5
4X
To explore this possibility, the sum of squares of ass from
0 + g(1
X.. + a,
4
Yt = A
44) + 1
corresponding to (10.2.2a) can be minimized to estimate the parameters.
, g, and 4. The results of estimation are as follows:
0
A
0 = 39.0486 0.2939
A
=
0.2374
which is much smaller than that of the machine-directional variation rep
resented by Eq. (10.2.4).
B(0)
B(co)
where
370
1.7944
1.6639
.
1.6639
1003
30
32
44
46-
48
50
52 -ic
54
56
0.2
xx
cc
I
0.4
0.8
1.0
Time
1.2
1.4
1.6
0.6
Deterministic plus
ARI1) model
(RSS: 32.49)
Deterministic model
(RSS: 58.05)
1.8
2.0
371
1
7.4
6.7, it is seen that the increase is significant even at
.,,(1,100)
0
Since F
a 1% level. Thus, the data gives strong evidence against the hypothesis of
the same time constant. We tentatively conclude that the model (10.2.2)
and the corresponding differential equations (10.2.3) and (10.2.4) provide
the correct representation of the system.
EXPONENTIAL TRENDS
22.575 0.599
31.347 0.539
1 = 0.6447 0.0520
Residual sum of squares: 58.05
0
A
A
1
(10.2.5)
e +
1
0 + A
A
s
Yt
0 is the final equilibrium
where (A
0 + A
) is the initial concentration, A
1
concentration, and r is the relaxation time constant. Finding the initial
values from the data as described before, the nonlinear least squares routitie
s gives the following results:
1
minimizing the sum of squares of e
-0.20
0.00
0.20
0.40
0.60
Xt
=
I.
Z_
4
1
liJ_.
46
+ a
1
Lag
8 1012141618202224262830
(10.26)
IIIIlI1III1II1I
---
=
y
+
0
e
1
+X
.A
A
The autocorrelations of the residual ess are shown in Fig. 10.13 and the
fit is shown by a dotted line in Fig. 10.12.
,A
0
Taking the estimated values as the initial values of A
, 7 and the
1
, the following deterministic plus stochastiC
1
first autocorrelation as that of
model was fitted:
C
0
-0.40
0
FIgure 10.13
0
A
A
1
24.109 1.50
32.198 1.24
EXPONENTIAL TRENDS
0.5248 0.0934
0.7863 0.1272
)
1
ln(4
=
ln(0.7863)
12.021
373
dC(t)
1
+
C(t)
dt
.i
dX(t)
+ aoX(t)
.1
III
I.
Lag
8Q
1
A
(10.2.8)
(10 2 7)
II
Z(t)
11111
10 12 14 16 18 20 22 24262830
I.
0.60
0.40
0.20
-0.20
0.40
2
y(t)
0 = y(t) = CQ) + XQ)
A
where .y(t) denotes the observed value at t.
C-
FIgure 10.14
1 + a
e TX,_
(10.2.10)
y,
TX,_ + a,
0 + A
A
e e 1
1
(10.2. 10a)
1 + X
A
e
0 + 1
A
23.8367 3.2984
375
5.02
6.85. Therefore, the data
(1,120)
0
and F
3.92
(1,120)
0
whereas F
does not have strong evidence to reject the model (10.2.10), which may
therefore be also considered as adequate. Statistically, it is difficult to
choose between the models (10.2.10) and (10.2.6). The choice will have
to be made on the basis of other evidence based on a physical understanding
of the process.
=
0.3953 0.1669
of squares: 34.19
sum
Residual
It is thus seen that the residual sum of squares of the model (10.2.10)
with three parameters is only slightly larger than that of the four-parameter
model (10.2.6). If the F-criterion is used on the basis of the local linear
hypothesis used in nonlinear least squares, we get
1
34.19
32.49
F
1004
32.49
1
A
1
2A
eC1t
X,
(10.3.1).
0
C
Cu
>
C
0
600
500
400
300
200
1000
20
40
60
100
120
140
80
Months
>4R,e +
sin
(jwt
4)
377
(10.3.2)
cos
2Rjer,t
cos (jcot)]
(10.3.3)
where C,
is the dominant frequency in radians per month for the major peaks clearly
visible in the residuals of the model (10.3.4) as also in the original data.
The initial estimates of the additional parameters B
, b
1
, and C
1
1 were
roughly obtained from the plot of the residuals in Fig. 10.16 as B
1 = 20,
1 = 0.7. The results of estimation with 95% confidence bounds
= 0.01,
are shown in the third column of Table 10.2. Note that the residual sum.
of squares is drastically reduced from 296,250 to 95,783 by inclusion of
only one period of 12 months. This is also clear from Figs. 10.15 and 10.16
that show that the yearly period can account for a large part of the variation
in the series. Also, the estimate of the exponent is significantly large and
its confidence interval does not include zero. This means that the amplitude
e + a,
1
R
(10.3.4)
Y = t
The initial guesses for the parameters R
1 can be obtained from
1 and r
the plot of the data by using the facts: e
0 = 1 and e
2.7. For example,
when t = 0, Yo
, the value of R
1
R
1 from Fig. 10.15 is approximately
130. Similarly, when t = 11r
,y
1
e = 130 (2.7) = 351 and Fig. 10.15
1
1=R
gives an average value oft = 100 months. This provides an initial estimate
of r
1 = lit
1 = 0.01. With these initial estimates, the nonlinear least squares
routine minimizing the sum of squares of ess in Eq. (10.3.4) yields the
results given in the second column of Table 10.2.
The residuals from the model (10.3.4) are shown in Fig. 10.16. The
exponential growth present in the original data has been removed and the
residuals only have periodic trends with increasing amplitude. Therefore,
it seems that a single exponential with real roots is enough and we need
not add more exponentials with real roots, unless warranted at a later stage
in the preliminary analysis.
20
40
60
80
Months
100
120
140
150
-100
-50
50
160
379
+ Vi
+ e
Bp [C
1 sin (fwt
q cos (jwt)]
(10.3.6)
.
2, 3,
until the reduction in the residual
for values of = 1 and i
sum of squares is insignificantly small and the statistical tests show the
adequacy of the model. The estimation results are shown in columns 4,5,
6 and 7 for the values of i = 2, 3, 4, and 5, respectively. Adding the sixth
period does not result in a significant improvement and therefore we stop
ati = 5.
The introduction of five periods in the model accounts for yearly, halfyearly, 4 monthly, quarterly, and 2-2/5 monthly periods. It is therefore
natural that periods smaller than 2-2/5 monthly may not be necessary.
Residuals from the model (10.3.6) with e = 1, i = 5 are shown in Fig. 10.17.
No further periodic trends are visible, which confirms that the introduction
of further periodic components is not necessary. The fit of this deterministic
t
Rp
of the yearly oscillations are growing as postulated by the model. The plot
in Fig. 10.16 again confirms this observation.
Since the improvement in the sum of squares is large, we successively
fit the models
C
0
>.
100
[mean
Table 10.2 Deterministic and Stochastic Models for International Airline Data
Model:
Y,
ERenst
1-1
(i,n,m)
Total No. of
Parameters
31
+ n +
is
Vi
280.3, variance
00
C,
14,392].
z
C? cos (jofl)] X,
i-I
,x,.,
O,,a,.,. + a,
(2,0,0)
(1,0,0)
(0,0,0)
(5,3,3)
(5,0,0)
(4,0,0)
(3,0,0)
23
17
14
11
in
(1)
Parameters
R,,
I,
130.8310.14
0.00950.0007
B,
(8)
(7)
(6)
(5)
(4)
(3)
(2)
130.633.51
130.633.71
130.434.03
131.075.85
0.00950.0005 0.00950.0003 0.00950.0002 0.00950.0002
15.774.132
15.784.36
16.164.83
14.836.56
0.01360.0039 0.01260.0027 0.01290.0025 0.01290.0024
0.4330.065
0.4330.069
0.4330.076
0.4260.106
9.314.708 9.3964.49
8.884.961
C,
82
127.480.45
130.623.36
0.00950.0003 0.00960.0007
15.894.063
15.793.96
0.01280.0022 0.01280.0020
0.4360.059
0.4340.063
9.5032.40
9.4144.30
0.01210.0050
2
b
2
C
83
0.9970.016
0.01150.0046
-0.9960.015
4.529 5.225
0.0095 0.0107
-0.431 0.340
3
C
4
B
4
b
4
C
85
5
C
4),
+2
0,
02
03
Residual
sum of
squares
0.01140.0038
0.9960.012
0.01130.0041
0.9960.012
0.01120.0022
0.9960.006
3.914 1.652
4.1874.50
4.2674.76
0.0103 0.0103 0.0105 0.0098 0.0113 0.0039
0.433 0.316 0.432 0.300 0.4110.122
5.3572.642
4.3795.328
4.3205.536
0.00720.0124 0.00710.0118 0.0060.0472
0.4.230.425
0.903 0. 191
0.9120.191
2.3831.785
2.2894.114
0.01210.0160 0.01150.0068
0.9940.025
0.9950.049
0.0870.163
0.1400.110
0.8920.112
0.7160.234
0.9550.227
0.1800.256
g
0
8,091
28,908
32,354
36,897
44,753
95,783
296,250
m
z
0
Ca
C,
C,
00
I-
Fi;ur 10.17
Tn.ternational
Months
20
40
80
100
Months after January 1949
60
120
140
160
383
1.0
0
I
10
I
20
hh
Lag
30
I
40
1111 jJIIJB
50
1.0
.5
384
e
1
1=1
c. sin
/2jirt
a_
O
1
+ X
a_
O
2
2lirt\
vi q cos i-i_,)]
RI +
20
-x
10
10
x
Xxx
x
xx
x
X
1
40
X
x
XX
XXX
20
x
x
60
XX
Xx
I
x
I
80
:,:scx
XXX
I
100
xx
I
120
XX
xl
XX
140
a_
0
3
(10.3.7)
160
The last column of Table 10.2 shows the results of minimizing the sum
of squares of a,s in the model (10.3.7), starting with the initial values that
are the estimated values of the separate models fitted earlier. It can be
seen from the table that the introduction of the stochastic part represented
by an ARMA(3,3) drastically reduces the sum of squares from 28,908 to
8,091. On the other hand, it can be verified that the introduction of one
more period in place of the stochastic part by making i = 6 reduces the
sum of squares only to 28,849. This strengthens our decision to stop at i
= 5 in the deterministic part. Since the preceding analysis of the residuals
from the deterministic part (10.3.6) had clearly indicated that the ARMA(3,3)
model is adequate, we conclude that the model 10.3.7 is the final adequate
model.
The residual t2,s from the model (10.3.7) are shown in Fig. 10.20.
i
20
-300
Months
Figure 10.20 Plot of residuals from the ARMA(3,3) and the deterministic
model.
1,0
0.5
-0.5
1
0
30
40
Lag
20
50
--
10
385
Figure 10.21
IB, I
K,
Figure 10.22
-100
600
100
(7)
100
(m)
600
bUU
6OU
100
mm==&
a)
I82Ieb2
600
1821
1e2
2
18
I?lflflflflJ
1
IR
18,1?
-100
100
100
100
100
Di
100
-100
600
I
387
40
20
20
40
eL2[
B
2
line) than in Fig. 10.23. The first two periodic terms without the expo
nential growth in their amplitude are further amplified and shown in
Figs. 10.25 and 10.26. For the first periodic term (the value of parameter
1
C
0.436 and the frequency w), the two components C
1 sin wE and
Vi C cos wE are plotted to yield a curve with the starting point at a
B,) IC,
100
50
50
100
388
sin t
Ccoswisin(we+
c,
f
1
c
IC,cos
C sinwt+
64i5:.150
> t
FIgure 10.25 Plot of the first periodic term without the exponential growth,
[sin (0)t+64.15)].
distance \/fzc from the datum line and showing a phase shift of 64.
For the second periodic term (with parameter C
2 = 0.996 and frequeacyj
20)), Fig. 10.26 shows the position of the starting point closer to the datufli
line, that is, at a distance V1
C showing a phase shift of 174.87.
flYt
This model reduces the unstable series to a stable one by three transfor
mations: the first is logarithmic, the second is the first differencing V j
C,
O.996
389
C, sin 2w:
2
C
T
.1...
cos2wt
FIgure 10.26 Plot of the second periodic term without the exponential growth,
5,
.5
4,
4,
4,
U
C
.5
Pa $h
ii,fq*.e
deterministic
Stochastic1500
FIgure 10.27
2 3
4
IIlIllIllIIII_._
._
36
24
12
4 5 6 7 8 9 1011 121314
Number of steps ahead
Number of steps ahead
80,.X
160
240
30Se
Muftiplicative
320
Deterministic
iii
11111111
400-
480
660
640
720
The model (10.3.8) has only two parameters, whereas (10.3.7) has 23.
If we are interested in the data analysis only for the purpose of prediction
or forecasting, is it worthwhile to increase the number of parameters by
21 and use the model (10.3.7) rather than (10.3.8)? To answer this question,
we have computed C step ahead forecast error variances assuming each of
these models to be true for C = ito 36, that is, one month to three years
ahead forecasts. The mean squared errors of prediction for the two models
are plotted in Fig. 10.27. In order to clearly see the difference in the mean
squared errors of the two models, the first 14 months are also plotted on
a large.scale. (The estimated parameters are assumed known for both as
usual.)
It can be seen from this plot that the prediction errors for the model
(10.3.8) are consistently larger for all lags. For one step ahead forecasts,
the mean squared error (o) is about 76% larger for (10.3.8) compared to
the model (10.3.7), that is, 99.06 versus 56.18; for = 12 or one year
ahead forecasts, it is larger by about 165% (498.91 versus 188.23). This
difference grows very rapidly for higher lags and for C = 36 or a three
year ahead forecast, the squared error by (10.3.8) is about 14 times as
large as that of (10.3.7). The other feature that is apparent from the plot
391
laW)]
0.091021144
0.000632
(280.3)2 {exp (0.000632)i)
= 49.677
Which compares favorably with the value 56.18 for the 23 parameter model (10.3.7).
o(1)
The graphs for the multiplicative model in Fig. 10.27 have been based on this expression.
Actually the series mean is increasing because of instabilitylnonstationasity, and the values
plotted in Fig. 10.27 would be underestimates. Also, c7 for the w, series has been taken as
0.182521144
0.00126, rather than the slightly larger unbiased estimate 0.18252/129 =
0.00141.
It is interesting to note that by a similar computation, for the ARMA(13,13) model of
Section 9.2.4,
is {exp
t It is somewhat difficult to compare the mean squared error of the model (10.3.8) for the
logged series with that of the model (10.3:7) for the orig&wJ series, especially since the series
is nonstationary. For illustrative purposes, we have converted the error variances for the
model (10.3.8) by assuming lognormal distribution for the original series with mean , =
280.3. Then the expression for the step ahead forecast error variance ai() for the original
Senes in terms of oW) for the logged series w, is given by
is a jump in the mean squared error value of the multiplicative model after
every 12 months. It can be shown that these jumps arise because of the
nature of the Greens function of the multiplicative model.t
This enormous difference in long-term prediction errors is inherent in
the structure and assumptions of the two models emphasized above. The
deterministic part of the model (10.3.7) can be predicted without error so
that its mean squared error for long-term forecasts is limited to the error
of its stochastic part. Since the stochastic part of the model (10.3.7) is
stable, its variance that is the upper bound for its long-term mean squared
prediction error is finite. For the airline data, the variance of the stochastic
part is (28908/144) = 200.75 (Table 10.2), and the mean squared prediction
error for arbitrarily large lags can never exceed this limit theoretically.
On the other hand, there is only one unstable model in (10.3.8) to
account for the entire series. Because of its instability, the variance for the
model (10.3.8) is theoretically infinite, and therefore, as the lag increases,
the long-term, mean squared prediction error tends to infinity, by its very
assumptions. Thus, in practice, one can expect the long-term prediction
errors for the model (10.3.8) to increase without any bound.
When the models (10.3.7) or (10.3.8) are used for forecasting long
periods, there is a possibility that the model parameters would change over
long periods of time. This would introduce additional error in forecasting
that was not considered above. This error could be larger for the model
(10.3.7) since it has a larger number of parameters. However, one would
generally update the parameter values as the new data comes in and thus
try to reduce this type of error. Also, we have taken the estimated param
...
y
=
e + X,
1
2A
(10.4.1)
Following Eq. (10.3.1), we may write a general model for a series nonsta-.
tionary in the mean as
(1
0282
D
b_
)
1
Z(t)
B)a,
1
0_
(10.4;3)
(10.4.2):c
...
(1 + bD
-.
393
mean series. In this case the data fluctuates around a fixed level. Such a
series is stationary and can be modeled by methods of Chapter 4.
When the exponents K, are real, very small, but nonzero, the model
represents polynomial trends. This is clear since the exponential function
is a limit of such a polynomial and with small exponents the higher order
terms are negligible. With real, large, and positive or negative exponents,
it can represent increasing or. decreasing exponential trends.
With complex conjugate exponents K
1 with negative real parts, the
model (10.4.1) can represent damped sine cosine trends, with zero real
parts exact sine cosines buried in noise and with positive real parts sine
cosine trends with growing amplitude. As explained in Section 10.3, it is
preferable to use the equivalent form (10.3.3) whenever periodic trends
are found in the data.
It can be seen that the exponential model (10.4.1) is a very powerful
device capable of representing correlated or uncorrelated data with a wide
variety of nonstationary trends. The corresponding differential equations
for the deterministic and stochastic parts can provide an insight into the
underlying system and its coefficients can be related to the system char
acteristics.
When very little is known about the underlying system, the model
(10.4.1) and the corresponding differential equations can help in conjec
turing theories of its behavior. When some conjectures are available, they
can be tested against the evidence of the model arrnved at from the data.
When the deterministic model for the system is fairly well-known, it can
be used in the model (10.4.1) replacing the sum of exponentials.
If the model (10.4.1) is used to represent the data from an inherently
nonlinear system, it provides the closest linear approximation over the
range of the data values. A further analysis with more data may reveal the
actual nonlinear representation of the system.
4395
The procedure of modeling for (10.4.1) is basically the same as that used
in the examples considered earlier. The dominant linear or exponential
trends are taken out first by least squares and the residuals are examined
for further trends. Such an examination of residuals can provide infor
mation about further trends and rough initial estimates of the correspond
ing parameters. This process is also continued for periodic trends until the
data appear stationary. Then, the ARMA/AM model may be fitted to the
stationary residuals. Finally, the combined deterministic plus stochastic
model is estimated simultaneously.
It is thus seen that unlike the procedure for stationary series, the
modeling procedure for nonstationary series does make use of the plot of
the residuals and no automatic routine is provided that directly yields
the final adequate model from the data. This difference is inherent in the
nature of the two classes: the class of stationary series is by nature a definite,
limited class amenable to mathematical analysis, on the other hand, the
class of nonstationary series is only negatively defined and therefore in
definite and unlimited. The lack of a positive definition and restriction
makes it difficult to analyze the modeling problem theroctically, as in the
case of the stationary series, and devise a routine modeling strategy ap
plicable in all cases.
However, once the kind of nonstationarity is discerned from the
data or the residuals, modeling can be executed without much difficulty.
For example, once we know from the data that it involves one or two
exponentials and/or a periodic component and its harmonics, a computer
routine can be written for modeling. The examples considered in this chap
ter illustrate the principal types of nonstationarity for which such a
routine may be written.
In the preliminary analysis, the procedure for modeling the determin
istic part of (10.4.1) involves successively adding the exponentials, one by
One. Such a gradual addition of two or three parameters at a time consid
erably reduces the problems and difficulties in estimation. Fairly accurate
values of the parameters of a single exponential or periodic component
can be obtained from the residuals, and therefore the convergence of the
minimization routine with respect to these parameters is quite fast. More
over, since the final values of. the remaining parameters in an earlier model
are used as the initial values, the minimization routine has to essentially
Work only with the additional two or three parameters.
396
(1
397
by minimizing the sum of squares of a. Note that this fitting is, strictly
speaking, not correct since a is actually correlated. The initial values of
1 are easily obtained from the graph by using the properties
, and K
1
AA
,
0
of the exponential function.
0.010495 0.00092
4.941 1.02
0
A
A
1
(2.1)
(8,7)
The CPI data residuals a, from this fitting are shown in Fig. 10.28. No
additional exponential or seasonal deterministic trends are visible, which
can be further confirmed by fitting double exponential and showing that
the residual sum of squares is not significantly improved.
We can now apply the procedure of Chapter 4 to the residuals in Fig.
10.28. The results of modeling are given in Table 10.3. (The second method
1 = X
1 has been used.) The ARMA(6,5)
of computing a,s starting with a
model is considered to be adequate at the 5% level. For the ARMA(6,5)
model (Table 10.4), the damped frequency of the root 0.3641 iO.8467
gives a period p = 3.18, whereas the undamped root 0.0094 iO.9993 is
almost the same as i, which is a 4 monthly seasonality by Table 9.1.
(1,0)
6.384
3.09
1.53
6.179
3.0120.102
1.2140085
1.4190.150 4.6120.321
5.8980.522
2.2500.179
1.2690.175 6.4550.636
5.3230.618
0.9840.147
0.8530.086 3.7500.478
2.4390.279
0.8590.085
1.9110.099
0.1830.076
1.2590.074 2.5840.162
3.491 0.193
1.1370.108
0.2820.075 3.5160.208
2.5720.192
0.9380.087
1.8990.132
0.9450.075
2.0030.025
1.0060.024
1.3170.081
0.4430.149
0.9400.150
0.8200.082
4.30
6.801
0.9480.054
9.63
7.408
0.2360.007
0.2390.007
1.2400.006
1,005*0.018
63
63
63
63
63
63
91
02
03
04
0-,
06
05
8.857
4,
Table 10.3 Models for CPI Residuals from First Order Exponential
F.
m
R:;1:
:
1
-4
:9
U p
,
4
. v
6
V
,v
2
v
3
6
.A
5
A
4
A4. A
,A
1
A
5
1.0053
(1,0)
0.9484
1.001510.0519
(2,1)
(6,5)
(8,7)
1.001110.0519
1.002310.0457 1.001610.0519
-0.343810.8344 -0.364110.8467 -0.371610.8405
0.009410.9993 0.008910.9984
0.8673i0.5132.
0.9491
0.9483
1.0077
-0.385710.9313 0.385910.9311 -0.385610.9332.
0.003010.9866 0.003210.9845
08635i0.5157
ARMA Model
(4,3)
Table 10.4 The Characteristic Roots of the Models Given in Table 10.3
398
7
399
Roots
Roots
Moving Average
Parameters
Stochastic
0.870.082
05:
1.00 0.085
1.0
0.5
An
Figure 10.29
0.5
i.e
0.01122 0.0022
2.155 1.05
10
20
Lag
30
40
TEITTE
1
A
4)6.
4s: 0.980.141
1.0206
: 89.782.55 4),: 1.240.083 0.387.iO.851 0,: 0.2190.075
0
A
1.00110.040 Oa:1.16 0.080 0.39910.928
A,: 3.9152.47 42.t.340.I45
0.00810.996 03: 1.25 0.059 0.001510.980
K,: 0.0120.003 4: 2.200.168
04:0.19 0.082
4)4:1.220.163
Parameters
Autoregressive
Deterministic
Parameteas
CPI Data
Table 10.5
400
DETERMINISTiC TRENDS AND SEASONALITY
50
100
150
200
250
The residuals from this model are shown in Fig. 10.30. No additional de
terministic trends are visible; hence the procedure of Chapter 4 can be
applied with the results in Tables 10.6 and 10.7. The ARMA(4,3) model
is found to be adequate. It is further verified from the fit of the ARMA(4,3)
model that since the added pair of roots (Table 10.7) gets almost cancelled
on both sides,
gives
it
only
marginal
a
improvement over the ARMA(2,1).
The estimation
results
for
the
combined model (10.4.1) for the WPI
with one exponential and K
0
0 with X, given by an ARMA(2,1) model
are given in Table 10.8; using ARMA(4,3) for X. reduces the residual sum
of squares to only 22.644. The residual autocorrelations are shown in
Fig. 10.31. The estimated values are quite close to those obtained from
the separate estimation, although the confidence intervals of the deter
ministic parameters are now wider.
Comparing the final results for the CPI and WPI in Tables 10.5 and
10.8, we can see that exponents K
1 are almost equal,,, the confidence in-
z
0,
FIgure 10.30
Table 10.6 Models for WPI Residuals from First Order Exponential
(1,0)
(2,1)
(4,3)
1.21270.004
0.63860.003
0.86180.004
1.98630.025
0.98850.025
1.00550.004
22.850
0.98720.026
24.274
3.01
4>1
26.082
3.66
0.05360.078
1.85610.079
0.04910.079
0.87120.080
4)4
01
02
03
Residual
sum of
F
squares
The adequate model is the ARMA(4.3).
401
tervals for the deterministic parameters overlap and the characteristic roots
of the ARMA(2,1) model for the WPI are also present in the ARMA(6,5)
of the CPI. Similarly, in the purely stochastic models in Section 9.2.3, the
only root 1.014 of the adequate AR(1) model for the WPI was also found
to be present in the ARMA(4,3) model for the CPI. Thus, the dynamics
of the CPI is that of the WPI plus some other modes given by the additional
characteristic roots not present in the WPI. This result is intuitively ap
pealing since the consumer prices will always be determined based on the
wholesale price, but will in addition be affected by factors relevant to retail
trading. In particular, the above analysis shows that the the CPI is signif
icantly affected by the approximately 3 monthly and 4 monthly seasonal
effects, whereas the WPI is not. Such a dynamic dependence of the CPI
on the WPI suggests that these two should be modeled together as multiple
series.
The question of whether one should use the stochastic models obtained
in Section 9.2.3 or the deterministic plus stochastic models in the present
section now arises. A purely statistical answer to this question can be given
Roots
0.9872
(1,0)
0.9932iO.0457
(2,1)
0.978310.0482
0.8255
0.951510.0513
(4,3)
1.0191iO.0736
1.0055
ARMA Models
Table 10.7 The Characteristic Roots of the ARMA Models for the WPI
>l 2
, v>
2
v
22.84a.
--
01: 1.0170.024
0.99iO.043
4: 1.990.028
4:0.990.028
Moving Average
Parameter
_1
.0
I
i
I
20
La
10
iii
I I
I [
I. .1
final
40
L..L.
30
0.5-
OC
Figure 10.31
80
C
C
0.5
1.1
model.
whereas F
(6,co) = 2.80. Therefore, the reduction of RSS by the deter
0
ininistic plus stochastic model is statistically significant at 1% level. How
ever, for the WPI series, the reduction in sum of squares with five addi
tional parameters is not significant. Therefore, the models of this section
are significantly better than those of Section 9.2.3 although the deter
ministic parameters in Table 10.5 have rather large confidence intervals.
From forecasting considerations, it can be seen that the models of this
on the basis of the F-criterion. For the CPI the adequate ARMA(4,3)
model in Table 9.7 has 8 parameters and RSS of 6.9118, whereas the
stochastic plus deterministic model in Table 10.5 has 14 parameters and
RSS of 5.8455. With 202 observations the F-test gives
5.8455
188
6.9118
x
F
= 5.716
A 92.463.77
:
0
: 2.283.52
1
A
: 0.0110.008
1
K
Deterministic
Parameters
Stochastic
Autoregressive
Roots
Parameters
Table 10.8 Combined Estimation of Stochastic and Deterministic Parts for the
WPI Data
i.o
section will reduce the one step ahead forecast error by about 16% (since
the one step ahead forecast, that is, the variance of the residuals o, re
duces from 0.0344 to 0.0289) and the reduction will be substantially
larger for long-team forecasts as illustrated in Section 10.3.5.
However, the decision in choosing the model should be made not only
on the basis of a statistical check, but also by taking into account the
economic theory regarding the price behavior. The presence of the deter
ministic part, supported by the F-criterion, should also be confirmed by
the knowledge of the price behavior. If there are theories supporting both
types of models, then one may use the above analysis to choose between
them.
404
Wheel
ace profile
(x
2
h)
+
(y
2
k)
2
R
(10.5.1)
k + YR
2
(x
2
h)
(10.5.2
Here, the actual profile height y can be decomposed into two parts. The
first part consists of the profile height due to the curvature and offset, y,
and the second part shows the actual profile variation, e, that is,
s
y=k+VRZ(xh)
+
2
VR2
(t
2
h)
405
(10.5.3)
Yr
Yr
V36
(t
2
h)
e,
(10.5.4)
For simplicity, let us take the initial value of R as equal to the wheel
radius, that is, R = 6 inches, and estimate the parameters h and k frbm
the model
0.138 0.707
0.6000
0.5321
0.4030.450
0.5037
(3,2)
0.5696
0.5310.371
0.9340.524
0.188 0.710 0.822 0.741 0.813 0.491
0.128 0.621 0.317 0.686 0.347 0.482
0.2330.576 0.637.0428
0.7830.424
(2,1)
10.9 Wood Surface ProfileModels for Residuals from the Circular Arc
Parameters
4>
(1)4
4>6
02
03
04
0,
Residential
Sum of
Squares
02:
0.16290.4819
4: 1.609 0.2920
j: 0.74130.3065
4: 0.11570.1281
0: 0.62730.2736
+ a,
a,_
0
1
a,_
0
2
(10.5.5)
For the airline data, compare the periods in the stochastic part as
well as the deterministic part in the model (10.3.7) and Table 10.2
with those obtained in Section 9.2.4. Give your interpretations.
10.2 (a) In Section 10.5.1, the exponentially increasing rate of the
Consumer Price Index (CPI) Data and the Wholesale Price
Index (WPI) Data are determined by the term Ae Compare
the increasing rates of these two sets of data from Ae for
small t and large t.
(b) Compare, by means of their characteristic roots,
(i)the ARMA(8,7) model in Table 10.3 with the ARMA(6,5)
model in Table 9.7 for the CPI data;
(ii)the ARMA(4,3) model in Table 10.6 with the ARMA(4,3)
model in Table 9.9 for the WPI. Are there any complex
conjugate roots giving almost the same periodicity? Ex
plain your results.
J) Using the CP1 or WPI data as an example, compare and contrast
between the stochastic model in Chapter 9 and the deterministic
PROBLEMS
for which the estimation results are given in Table 10.10. It is interesting
to note that the estimated value of the radius is 5.651 inches rather than
the nominal wheel radius of 6 inches. This estimate of the mean radius
could form a characterization of the wood profile under different conditions
of speed, feed, pressure, types of wood, etc. The dynamics. of the grinding
process will be reflected in the parameters and characteristic roots of the
ARMA(3,2) model, which can be used to quantify the pulping process.
(t--h)
2
y,=k+VR
+
X,
X, = 1
X,.. + 2
4
X,_ +
4
R: 5.651 3.630
k: 4.1664.988
h: 3.7730.0068
406
1
407
X,_
4
1
(1i
1
T
)
2
T
=
a,
e+
1
(r
2
2)
e2
&
1.6873,
42 =
Be [C sin wt + Vi
B = 10
b
0.05
C = 0.5
sketch YQ), showing correctly
(1) the initial value Y(0);
(2) the period of YQ);
(3) the exponentially decaying envelope.
1.5774
0.7499
13.5005,
Compute the one and two step ahead forecast error variances
for the combined model of (i) first order given by (10.2.2a)
and (ii) second order given above.
(b) Find the percentage improvement in the forecast error vari
ances from (i) to (ii)
(c) Use the F-test to show that the improvement from the first
order to the second order model is insignificant.
(d) Part (c) implies that the noise given by the stochastic part is
essentially represented by an AR(1) model. Why then are the
parameters 4 and O given above so large, and why do their
95% confidence intervals 0.7499 0.1434 and 0.9788
0.1609, respectively, not include zero?
(a)
10.5
X,
0 + g[1
A
This model, fitted to the basis weight data, gives the following
results:
0 = 38.9277, g = 1.6485, = 11.6708,
A
PROBLEMS
...
(1
B
1
0
B
2
0
.
.
0mB)(ft + a,)
10.7 As discussed in Section 10.4.2, when the exponents in the deterministic part are equal to those in the Greens function of the
stochastic part, special cases of the models (10.4.1 and 10.4.2) can
be written with fewer number of parameters; these can be checked
as illustrated for the first order case in Sections 10.2.4 and 10.2.6.
In the discrete time case, this special case of models (10.4.1) and
(10.4.2)
can
be
written
as
4,,,Bfl)y,
(1
4B
2
4)B
=
:
=
t0
1,
* 0
= 1,
= 0,
and the discrete step function
S,
(1
0,
t < 0
to initiate the trend from a desired origin.
(a) Show that the model in Section 10.2.4 can be represented by
this model with n = 1, m = 0 as:
4),B)y, = f, + a,
where
4>
4,2
..
4?
= A
8 + g(1
0
(Note that, for a finite geometric series, the sum:
1 +
(b) Show that the model (10.2.10) can also be represented by the
one given in (a) with 4> = e?; find the corresponding f,.
Show that further specializing to
(c)
(1
B)y, = 1
s,_ + a,
A
gives a straight line trend buried in random walk. What is the
slope of this straight line? What is the Y-intercept? How would
you modify the model if you want the Y-intercept (initial
condition) to be I?
(d) Show that the model in (c) is suggested by the results for the
Wholesale Price Index (WPI) in Table 10.8. Fmd the initial
values of A
0 and A
1 using Fig. 9.15 and the data at the end.
Compare this model with those in Chapters 9 and 10, and
discuss your preference.
(1
1 + a,,
A
=
B)y,
E(e,e,_k)
,
1
A
B)y, = e,,
=
PROBLEMS
(cr + A)
409
or
(1
E(s,)
gz(1
0,B)a,
>4)]
xl
(K
)/(X
0
1
A
)
2
,g
2 = 2
01)1
B)(1
1
x
B).
2
x
4). = (1
)
B
2
>1
(rg
1
1 )4)
4)
)
B
2
X, = (1
= A
1
0 + A
10.9 (a) Show that the stable ARMA(2,1) model with a constant added
to a,s:
B)(A + 1
1
B
0
s,_ + a,)
A
(1
)y = (1
2
4>B
0 + A
A
,
1
, Yi
0
or the simplified form, given Yo = A
)y, = (1
B
2
4)
B)(A + a,),
1
0
t
2
(1
is equivalent to the following special case of (10.4.1) and
(10.4.2):
(1
1
where, as usual g
1 = (A
) and (1
1
K
2
(K
(b)
(1
XB)VX,
=
(1
[(x
YtAO+AIL
)o
1
(x
0
B
1
)a,
2
(1x)
1)
(1__0\
+ix)tj+Xt
410
411
EPD: VX,
1 + 0.52,
a = 1.12
a, + 0.35a,...
Determine the parameters of all the components of deterministic
trends contained in these models. The data are listed at the end of
the book. Do you agree with these models? Can you improve them?
How?
10.12 Nelson (1973) has fitted the following models to the quarterly data
on the Gross National Product (GNP) and the actual Expenditures
of Producers Durables (EDP) in the United States in billions of
current dollars from 194701 through 1966-04 (80 observations);
cr = 22.66
0.62B)VX, = 2.69 + a,
GNP: (1
PROBLEMS
11
MULTIPLE SERIES:
OPTIMAL CONTROL AND
FORECASTING BY LEADING INDICATOR
In earlier chapters, we were concerned with a single series
of
data,
con
sidered as a realization of a stationary stochastic system. Such data
can
always be represented by a model of the ARMA fonu. The ARMA models
can be extended to vector systems to show that two or more sets of data,
treated as the realization of a vector
station
ary
stocha
stic
system
can
,
be
similarly represented by vector models. In this chapter, we will consid
er
simple systems relating two or more, that is, multiple series of data.
The thost prominent applications involving multiple sets of data arise
in control systems, although there are other important applications such
as economic forecasting, system simulation, etc. In this chapter we will be
primarily concerned with model
for
s
discret
system
e
s
and
develo
ping
strat
egies for their optimal forecasting and control in the sense of minim
um
mean squared erroi. The models employed here can be readily used in
other related applications.
The contro
system
l
s
can
be
formul
ated
in
two
equiva
lent
but
distinc
t
ways. The first one corresp
onds
to
the
classic
al
contro
theory
l
,
emplo
ys
block diagrams and transfer functions, and utilizes transform techniq
ues
such as Fourier, Laplace, or Z-transfonns.
In
this
formul
ation,
almost
an
exclusive emphasis is placed upon those specific control systems classif
ied
as servomechanisms. In addition, attention is directed in most cases to
systems of the single-inputsingle-output variety.
The secoiid formulation corresponds to modern control theory, em
ploys the state space technique,
and
utilize
metho
s
ds
such
mathem
as
atical
programming suited to computer calculation. The state space techniq
ue
essentially consists of treating all the inputs
and
output
s
(as
well
as
their
derivatives and differences if necessary) as the
states
of
the
system
and
forming a single vector with each of the states as its
elemen
ts.
This
for
mulation is particu
larly
suitabl
e
for
multip
le-inpu
tmu
ltiple-o
utput
sys
tems. The advantages are its comprehensiveness and the ease of theoret
ical
and computational treatm
even
ent
in
the
case
of
compl
icated
system
s.
Kalman filtering is a well-known example
of
this
formul
ation
of
moder
n
control theory. Both these formulations together with
the
import
ance of
transition from classical to modem control theory may be found
in most
control texts.
412
413
paper product
basis weight
Dryer
Calendar
Press
FIgure 11.1
(Output) X
2
Pope reel
Machine
chest
414
415
O.25B
1
1
X
,
2
a
0.7B)
(1
+ (1O.7B)
(11. 1. la)
Transfer
Noise
Function
In this form the model is called a transfer function model and is
often
schematically represented by a block diagram shown in Fig. 11.2. The
maintained at the given target value depending upon the grade of the paper
being made. We will denote the output by X
, which is the deviation of
2
the basis weight at time t from the target value. The aim of the control
system is to keep the output X
, at zero.
2
416
O.7B)
OuW
input
Figure 11.2 Block diagram for the transfer function model (11.1.la) without
feedback.
1
_
21
1.2X
2
_
21
O.35X
1
_
11
0.2X
417
the system are just considered as states and an equation (generally differ
ence or differential equation) relating ttlem is taken as the system equation.
The state variables may be observable (such as input, output, etc.) or
unobservable (such as disturbances or derivatives of input-output, etc.).
The ARMA or AM models for univariate time series discussed in
earlier chapters are actually state variable models. For example, the discrete
time AR(1) model relates the observed states X and X
_ with unobserved
1
state a; the continuous time A(1) model relates the unobserved states
dX(t)/dt and ZQ) with the observed state X(t). Therefore, a natural extension
of the ARMA models to control systems involving one or more inputs and
one output is a model such as (11.1.1). This model can always be converted
into a transfer function form such as (11.1. la), if desired. However, it will
be subsequently seen that the extended ARMA models such as (11.1.1)
are quite sufficient to deal with control systems, In fact, for both the
modeling and derivation of optimal control strategies, it is far simpler to
use the form (11.1.1) than the form (11.1.la).
Although every model such as (11.1.1) can be reduced to transfer
function form, there is a difference between such a reduced model and
one originally formulated in the transfer function form. It is seen in the
model (11.1. la) that both the transfer function and the noise part have the
same autoregressive operator (1 0.7B) . This may not necessarily be
the case for a transfer function model formulated a priori. When a transfer
function model is being formulated a priori, if it is known that the transfer
function and the noise have different modes of dynamics, then the transfer
function model may be of the form, say,
O.2B
1
,
2
a
(11.1.2a)
12,
=
107B
+ 105B
which is the same as the model
(10.7B)(10.SB)X = 0.2B(10.5B)X
,
2
1 + (1O.7B)a
,
2
that is,
2
X
O.1X_
1 (11.1.2)
_
2
, 0.7a
2
2 + a
Therefore, when converting an autoregressive moving average model (11.1.2)
into a transfer function model, one must cancel nearly equal roots to write
the model in its simplest form.
0.25 X
,_ + 0.7
1
La2rjI
1
11
Fx
a,=I
=
=
[+21
4,
r
)1
+22]
+121
1
0.8
0
[0.25 0.7]
+ a
,
2
(11.1.4):i
So far we have not considered the nature of the input Xi,. In the transfer
function form or the autoregressive moving average form as discussed
above, X
, can be any discrete input, either deterministic or stochastic.
1
The deterministic inputs commonly used are step, ramp, or sinusoidal. An
example of modeling with step input was presented in Sections 10.2.1 to
10.2.5. However, here we will be primarily concerned with stochastic or
random inputs. When the input is a stationary stochastic process or time
series, the input-output model can be formulated as a vector system, which
we will now discuss.
Suppose the input X
, is a stochastic process or time series and there
1
is no feedback, that is X
, does not depend on X
1
, at any t, then the input
2
, can be represented by an ARMA model involving dependence on its
1
X
past values only. For simplicity we will take this model to be also AR(1)
with, say, 4) = 0.8. Let the output-input model be (11.1.1). Then, the
complete control system may be represented by a pair of models
= 0.8 X,_
1
+ a
,
1
(11.1.4a)
418
419
2
a
x
2
1
(11.1.5)
(11.1.5a)
Figure 11.3 Block diagram for the transferTunction model (l1.1.5a) with
feedback.
12:_i
1
a
,
1
= (10.88)
+ (10.88)
0.25
1
X
_
1
12,
,
2
a
= (10.7B)
(10.7B)
then the corresponding vector model is
,_ + 0.1 12,_i + a
X
= 0.8 1
,
1
, = 0.25X,_ + 0.7 12,_i + a
2
X
,
2
420
MULTIPLE SERIES: OPTIMAL CONTROL AND FORECASTING
X,
= a,
which is the same as Eq. (11.1.4a) with the only difference being that the
value of 4 has changed from zero to 0.1. It is thus seen that the feedback
is easily incorporated in the vector ARMA models with no change in their.
form, but it changes the structure of the transfer function model consid
erably.
The ARV(1) model (11.1.4) can be developed by exactly the same
reasoning as in Chapter 2 applied to the vector X, rather than to the scalar
X,. When both the input and the output are discrete white noise sequences,
we get an ARV(O) model
a,
a,_
1
0
(11.1.8):.
Va
21
LYa
,
1
[Ya
2
Ya,
where y, and
denote the variances of a
, and a
1
,, and
2
and y
represent
their
cross-depend
or
ence
covariance.
It is important to
note
the
change
notation
of
for
the
variances of the
residuals. When y
= 0,
so
that
,
1
a
are uncorrelated or
and
,
2
a
Ya21
independent, we use r and
or
and
interchangeably; when
y,,
C72
they are correlated or dependent, we use Ya
j and Ya,
1
, exclusively to em-.
2
phasize their bivariate nature.
A thorough understanding of the ARMAV model structure and mod
421
cling procedure, etc. is beyond the scope of this book. However, the usual
assumption in control theory that the input noise (or the input itself) is
independent of the output noise implies that modeling and control can be
accomplished by considering the two models separately, as in Eqs. (11.1.4a)
and (11.1.5), ratherthan
together
in
their
vector
form
(11.1.6)
and
(11.1.7).
The assumption of independent noises, in turn, means that a
, is inde
1
pendent of a
,, so that the matrix Ya and the O matrices in (11.1.7) and
2
(11.1.8) are diagonal. This will be shown in the next section 11.1.6 by
relating ARMAV models with their simplified scalar forms used in this
chapter.
However, the assumption of independent noises is so common that
the reader can skip Section 11.1.6 and proceed to Section 11.2, keeping
in mind the assumption that a
,, a
1
,, etc. are assumed independent at the
2
same time. No continuity will be lost if all the subsequent transformations
to ARMAV models are skipped.
X,
= +rx,_
1
=
rA*
Va
ri
(11.1.6a)
A simpler form can be deduced from the general ARMAV model first by
a simple transformation and next by assuming that the input and output
noises are uncorrelated. This form of the vector autoregressive moving
average model can greatly simplify the estimation of the parameters of the
multiple input-one output systems.
The
input
output
and
noises
in
a
bivariate
autoregressiv
e moving av
erage model can be made uncorrelated by making
diagonal, that is,
,a = 0
1
E[a
]
21
For ifiustration, let us rewrite an ARV(1) model, using *s to emphasize
that ar, and a, are correlated
1
11
1
,
a
1
I1
P.12111
I I
1,li
(1116b)
I
I
I *
I + 1*
A.*IIX
r221J L 2
1J
L 2tJ
L 11
L It
where
and
are the elements of the autoregressive pa
4,
rameter matrix 4. Note that the first two subscripts of the parameter
elements designate the row and column of the matrix; the last subscript
denotes the subscript of the parameter matrix 4, that is, 1. The variance
covarance matrix
and ar,, y, is a symmetric matrix and with possibly
of
a,
nonzero
Transformation from correlated a, and a, to uncor
related a
, and a
1
, may be represented by
2
a,
a
0
4
(1i.1.9a)
j LzLj
4)1201 Faitl
l
o
120
4
L
)
.
iL
1]
01
Ian
oy
12
y,+4)
1]
i Lt1io
:i
+
2
4)u:u1r
01
*r)jl
2
a
1
E
(
1
: )(ai*
211
[\a
L 4)120
iLati
E+4)1+4)4)u
ij
4)120]
4)1201Fai,1(a*)F
[+o
[o
1
r
[o
1
r
E[1
iJ
01
j
o
2
+4)i
12
Thus, the
Ian J
2 =
Va,
(11.1.9c)
I
*
y:n)
*
*
Yajz
-)
( (
s/an
*
.
) (
=
Va,,
(11.1.9)
Van
matrix is
Van
:u
120
Substituting 4)120 =
(fl.1.9b)1 gives
Since y =
and
(11.1.9b)
, and
1
Since is a diagonal matrix representing the cross-dependence of a
a, as zero, the cross terms in Eq. (11.L9b) should be zero, that is,
= 0
+
Ii
or
[aj,1
Lz:i
422
:l2
4)1201111:1
12
Va
,
2
a
,_
11X
121
4)
1
%, we get
[ai,1
(1l.1.9d)
423
=
=
= 4)oa?
01 =
0N
:hEq. (11.1.7b) yields
= 1
4 +
_
X
X,_ + a,
2
and
_
1
a
a,
4)2
4)1 =
_
a
1
0
(11.1.7d)
(11.1.7c)
2 = X
X
1.
4
.
1
1 + 4),
,. + 2
2
X
1
a
,
Note that since each series includes only its own a, the model is just
a conditional regression and the conditional least squares estimates of the
parameters can be obtained separately as if by a univariate series rather
than a bivarjate series.
Next, let us consider an ARMAV(2,1) model:
**
X, = X,...
1 + 4)X,_
2 + a
a
0
1
_
(11.1.7a)
14)111
(11.1.o)
1 iLX2:i
4)rnJLX2I_lJ
211
4
L
LtJ
where 4)120 =
12 and 4)
(
fry)
m. 4), 4>, 4 are transformed parameters, with 0
4)
1
4
,a
, and a
1
, are now uncorrelated.
2
=+oa?, and a
1
Thus, the series X, and 121 can be expressed as
X, +
= _
11 + _
X
111
4)
1
21 + a
X
1
,
1
[o
Fi
, = a,
1
a
1
L0
and the uncorrelated a
, and 2
1
a
1 can be expressed as
[x
j[X
J
21
[j
,_
[x
k:?.I1 1
LJ i1iLX2,_1
[ )[:i
[]
(11.1.7e)
[1
L0
+
..
4),,X
2 + a2,
_
21
Ornaz,_i (11.L10a)
and therefore the estimation of
11
a
,
1
11 _
a
which does not include ,
the parameters can be performed separately by minimizing the sum of
21 alone. It is easy to see that Eq. (11.1.10a)
a
squares of the residuals s
has the same form as Eqs. (11.1.1) to (11.1.3). In this chapter,.we wiII
only be using these special forms of the ARMAV model. These may bc
thought of as extended ARMA models, as developed through Section.
21
a
, is assumed to be independent of .
1
11.1.3, keeping in mind that a
425
The modeling procedure for the simplified ARMAV models can be out
lined along the same lines as the ARMA(n,n 1) models in Section 4.5.
Step 1
Obtain the initial values of the parameters by the inverse function method
as given in Appendix A 11.1. The initial values also provide an indication
of the presence of initial delay or dead time. For example, if the initial
values of the parameters, say,
4 of one input-one output system
are small (confidence intervals include zero), then the model for the output
, could start with lag 3. However, it should be noted that the choice of
2
X
the initial delay L is tentative, and if there is doubt, the parameters should
be included in the final estimation.
Step 2
By the nonlinear least squares method fit the models for input and output
separately, in a form modified for lag, if necessary, starting from the initial
values in Step 1. Increase the value of n until the reduction in the sum of
squares is insignificant as discussed in Sections 4.4.2 and 4.4.3. (The first
method of setting a,
0, t = 1 to n and starting the computation of a,s
from t= n + 1 is preferable.)
Step 3
Drop the small parameter values to reduce the order and/or increase the
lag or dead time and refit the model. Check the refitted model against the
one in Step 2 and determine the adequate model.
N
1
_,
11
a,,a
y2,(_k)
115
Ya
(11.2.1)
Step 4
Compute the auto and cross-correlation estimates of a,,, ,
11 using the av
a
erage substracted residuals in the following
Ya,jk =
Yaqs
k = 0, 1, 2,.
(11.2.2)
P,k
VYaaoYa,jo
If the P4k
5 lie within the 2/ VN
t band (or the unified correlations
are less than two in magnitude), the series may be considered independent.
If not, repeat in Steps 2 and 3 with a greater number of parameters. If the
tThese are slightly conservative, actually one can use 1.96) VWZI.
386
38.8
39.0
39.2
39.4
xl,
10
20
30
Time (seconds)
Time (seconds)
40
50
.5
C
C
C.)
C,
0
0
U)
xl
427
0.3060
0.1950
0.1950
.50_58.538
0.00624
0.3060
0.696 0.195
0.2470.461
(1,0)
Nwnbr of observatjos
Residual
sum of
squares
4213
Parameters
0.00406
8.538
0.1950
1.1250.239
1.453 0.245
-0.5230.241
0.2750.457
0.593 0.471
Model Order
(2,1)
0.00339
3.140
0.1593
0.4540.422
0.416 0.378
0.5360.428
0.470 0.308
0.9230.251
0.5180.267
0.0290.325
1.3040.349
(3,2)
compared with F
(3,45) = 2.84. Comparing the (2,1) model with the
0
(3,2) gives F = 3.14. On the other hand, ARMA(4,3) model (not given
The improvement in the residual sum of squares from the first order to
second order is significant since
ever, the external inputs such as step or impulse are still commonly used
because of the difficulties of obtaining the dynamics from the normal op
erating data. An attempt by Goh (1973) to model this data by empirical
cross-correlation methods did not succeed because of large disturbance.
Modeling of this data by the procedure of Section 11.2.1 will now be
illustrated. We first fit a first order model to the output X
, series. As in
2
Step 1, the initial values of the parameters 4
m and 4 are computed
using the procedure illustrated in Appendix A 11.1. The initial values also
indicate that there is no significant delay between input and output, and
hence the lag was chosen as one. The nonlinear least squares method gives
the results for the output X and the input X, presented in Tables 11.1
and 11.2, respectively.
It is seen from Table 11.1 that the first order model fitted to the
output-input model is
= O.247X
1 + O.696X
_
11
2 + a
, y = 0.00624
2
l{3
X9O
1
66E00O =
tZXSIc.O
V +
X9IYO
XtStO= X
t
! X !
s.iuieied IoP!PP SU pP3W
-ltXZ6.O + XOLiO +
JO
1
x
OOEO
xZIo = IX
S lEtfl () S
Q3enb
S3S3 q3 SUS X 211dm ifl iOd
xEcco
zucvo
-p11
111,
_xEcoo
tVTt3
cioowo = - Z_SlvIcyJ +
_ZXZIOO :rX9flTJ + XISIO +
428
1;
4,
-4
.40 ()
ddoo
(4.0 In In
(40
+1 +1
OF
-4 (4
.4
-4
0%
-4
C
-4
+1 +1
_4 _I
0 c
F- F-
in
-4
.0-4
( (4
CO
I-.
In
+1 +1
rI
-I ,4
00
0 C )
+1 +1 +
0.00
0.000
+1 +1 +1 +
0
40.
II
0.0.0 00 .-0
00
(4
0
(40 l
(4 .0
+1 +1 +1
-4
000 .0 i
In (4 in 0
(40% (1
C:,
od
II
.000
0
0
.80
I-.
00
() 0%
(4 In fl 0% (4 .00% .0 (
-l 0 In In c (40% (i 4 0% (400
-4 (4 .0.0 .-4 4.0 .0 5 in 0
-I
00.
>
o o
in 00 In
in In In 0% 0 In
in
+1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +;
C -4 in
Cfl 0% * in
(I -4
In
00.00
-I
-4
in
C) In
.0
.0
0
000
0In
I.
(4
I
10
.1..
41
I
0
Lag,k
I
10
20
0.5
ii
I
10
ii.
II
I
i..
I
20
Lag.k
. II.-
II
30
40
20
-.0.5
I I
00
30
o.c
LI
20
Lag, k
B)
1
4
+211
1, 2, 3,.
2 +
B
1
4(1 + 4)
B + 4)
1
(1
from the deterministic models, which may be later refined based on the
(q
10
.11 II
0.5
I p
0.5
i.c
if
0.5
II I
I. I
0.5
1.0
4Th
4
+
4)
+
...)
(11.2.3)
431
1.7992 0.4228
35.3912 23.5606
0.7887 0.1270
_i
X
1
ct,
.terminjstjc
inputs,
using
function
since the sampling interval = 1. The gain for the discrete as well as the
Continuous system are the same and therefore by Eq. (11.2.3)
1.7992(10,9189) = 0.144
= g(14)
)
1
It is not difficult to see from the above expansion that the gain g for any
discrete transfer functions can be obtained by simply substituting B = 1.
Now recall from Section 10.2.2 that the adequate model for the step
response of the papermaking system was found to be
A
=
1
Y
+
0
g(1_e_())+X
= 121
CASTING
432 MULTIPLE SERIES: OPTIMAL CONTROL AND FORE
ed ARMA
pendix A 11.2, or from the operating data, using the extend
derstanding
modeling procedure of Section 11.2.1. From the point of view ofun
ng of Section
the dynamics as well as prediction and control, the modeli
ages:
11.2.1 from the operating data has the following advant
ed just for the
I The normal operation of the system need not be disturb
purpose of analysis.
and therefore would
2 The dynamics is obtained under realistic conditions
provide more precise prediction and control.
to detect changes in
3 By continuously monitoring the data it is possible
action when
the system over time and to take appropriate corrective
necessary.
difficulties in
Apart from these advantages, there are some inherent
scale of such
the deterministic input approach. It is difficult to decide the
te of the dynamics.
inputs that would provide an accurate and reliable estima
submerged in
If inputs that are too weak are used, the response may be
n. If inputs
the noise, making it difficult to estimate the transfer functio
signal to noise
that are too large are used to provide a sufficiently high
ics under
ratio, the test may yield a distorted picture of the actual dynam
may be added
actually smaller normal inputs. Moreover, the input noise
noise.
to the response and may inflate and distort the so-called output
example
These difficulties are somewhat illustrated by the papermaking
dynamics
considered above. The deterministic input indicates a first order
third order.
although the dynamics under operating conditions is second or
is inflated
The confidence intervals are relatively large and the variance of a,s
both cases of
from 0.0062 to 0.0166 comparing the first order model in
operating data and deterministic inputs respectively.
0.923]
0.1361
0.470]
0.553
0.093
0.470
0.013
0.518
0.052
0.023
0.923
[0.536
0.502
[0.416
[o.2
[0.454
0.012]
0.518]
] [0.416
0.12311
0.123][0.300
1 j [0.454
.I.IA.
O P1
[0
Fi
Ii
[0
iL0.536
0.123110.053
1
0.1511
*Trsfoalion to ARMAV
2 are estimated, the
1 and X
Once the separate adequate models for X
a) can be
corresponding bivariate ARMAV model in the form of (11.1.6
determined by back transformation [see Eq. (11.1.7c)]:
9l
[0
Ii
.1* A.IA.
P39O P3
0*
0.069
0.029
j[ 0
0.123110.591
A..1 A.
P0 lpo
[1
[0
[0.591
0
.A.1
Ii
[o
10.244
lii
0
0.029j[0
1[i
0
1.304J[o
0.093}[x
_
1
0.470] [X
1
_
2
0.123
0.123
0.0231111,_2
0.923] LX
_
2
[0.454
02
+ [0.5
[0.416
10.591
0.029j[a,_
0.0691[a_i
2][X
1
0.05
3
_
[x
1
518J
_
0.
2
3
[at]
[a]
0.013
[0.536
+
Yawo
jf
0.123110.00075
0.000417
0.003390
0
1
0.00339j[_O.123
0
1
433
(11.2.4)
0.123111.451
i
j[ 0
1a A.
P0 2PO
[i
[0
j
2
[x
11.451 0.339]
0
1.304]
and the ARMAV model in expanded form is written as
0.000801
[0.000417
11
[1.451
0.339][4,_2
0
a_
.
4j[
l
0
3
2
with the variance-covajiance matrix from Eq. (11.1.9) as
*
Ya
11.2.3
Two
InputsOne Output System With Dead Time:
Application to the Paper Pulping Digester
?he Simplest one-input-one..output example was considered in Section 11.2.2.
Since the data was adjusted for the initial delay or dead time of SO seconds
calculated from machine speed, the complications of estimating the dead
Slow
Chips
time outlined in Step 1 of Section 11.2.1 did not arise, in this section, we
will consider an example in which such dead time exists but its exact amount
is unknown, and moreover, in addition to the one input that is manipulated
for control purposes, there is one more input that is measured but not
manipulated. The data for this example are taken from a continuous pulp
ing digester in a paper mill.
The schematic diagram showing the continuous pulping digester proc
ess is given in Fig. 11.6. Wood chips are fed through a low pressure feeder
(at about 16 psig) to a steam vessel to drive off air to facilitate liquor
penetration. The chips are then mixed with circulating liquor and forced
into the digester by a high pressure (at about 160 psig) feeder. Recycling
white liquor for cooking is added and after about 45 minutes in the im
pregnation zone in the upper part of the digester, heat is added to raise
435
The data of these three series, taken from Goh (1973), are given in Ap
pendix I and plotted in Figure 11.7.
Physically, since the input temperature is a manipulable variable, it
should not be affected by other variables. The second input B/F ratio is
mainly related to adjustments for the chip level. Since these two inputs
are adjusted for different purposes, they need not be dependent on each
Other. Therefore, the two series
and X can be modeled separately by
60
20
40
60
Hours
80
100
25
X
55
120
437
II
II
(ii)
(b)
(c)
1
f
1 }
1
f{
13k
Id)
Figure 11 .8(ad)
(8)
HI
I H IH
12k
1
no dependence on
3:
X
0.
2.
:
1
XX
1.
3
X:
and X
.
3
These dead times will either be confirmed or rejected in the final
estimation. As explained earlier, the physical evidence indicates that not
only X
2 should not depend on other series. We will, however,
1 but also X
retain the dependence of X2f on X
1 and ,
31 with lags two and zero obtained
X
4)230.
(7)
aik
1,
IC
1.
rtl
23k
If)
439
Modeling
The results of modeling the X
, series with the above dead times are shown
3
in Table 11.3. The facts that for the adequate (2,1) model the values of
4)316 and 4
rn are not very small, and that their confidence intervals do not
include zero, confirm the choice of six lags for X
31 with X
1 and one lag
for X
,. The total time for the digester process is about eight hours;
2
3 with X
hence, the six-hour dead time between the temperature manipulation and
the resultant K number of the output pulp is physically reasonable.
Since X
1 does not depend upon the other series, it can be modeled
by the ARMA methods of Chapter 4 with the results shown in Table 11.4.
The adequate model turns out to be AR(1). The adequacy of the AR(1)
model is also confirmed by the confidence intervals for ARMA(2,1) and
ARMA(3,2) models in addition to the F-values.
Table 11.5 gives the results of modeling the B/F ratio Xi,. It is seen
that all the parameters, except 4) for the (1,0) model and 4
m and O
for the (2,1) model, are small and their confidence intervals include zero.
This indicates that the series
does not depend upon X
, and X
1
, as
3
conjectured by the physical evidence earlier. Hence, using the method of
above, to illustrate how the errors in initial lag estimation are corrected
in the final estimation and, also, how ARMA models emerge as special
cases of the extended ARMA or the ARMAV models.
(e)
T I
I r I
I lit
Figure 11 .8(ef)
aa,
ITTIlT I
I I I!
21k
0.286 0.220
0.2640.221
Model Order
(2,1)
0.105 0.161
0.1670.149
0.3180.160
(1,0)
0.024 0.135
(3,2)
248.1
1.935
230.1
(11.2.5)
alone,
0.282 0.205
0.0560.216
0.2370.220
0.1640.157
0.1800.162
0.3250.169
0.0050.308
0.3080.288
0.2510.238
0.7680.335
0.4300.321
8.178
0.6350.256
0.7640.116
326.9
1.2310.251
0.3560.212
,
3
Table 11.3 Modeling Paper Pulping Digester OutputK Number X
Parameters
+318
+3j7
+321
+332
333
0333
Residual
sum of
squares
F
The adequate model is (2,1).
4.01
178.4
0.7840.116
(1,0)
(3,2)
172.7
0.982
169.5
0.1240.721
0.1050.720
0.5620.432
0.5380.768
0.5870.646
1.774
0.1441.032
0.7841.036
0.0670.831
Model Order
(2,1)
+112
+113
0112
Residual
sum of
squares
F
Iall =
4599
0.0290.207
0.5570.159
0.1010.207
0.0140.190
(1,0)
(2,1)
2.391
420.5
0.0830231
0.0190.277
0.0110.276
0.0540.582
0.4250.309
0.0130.191
0.0310.208
0.4170.645
Model Order
441
4212
4231
+232
Residual sum
of squares
F
,
2
a
=
,
1
_
21
0.36a
4.01
_
1
,
2
0.167X
2
_
21
0.318X
,
311
a 0.635a
,
2+3
..
31
0.356X
= 2.41
7
_
11
0.264X
2
_
21
0.44X
(11.2.6)
at the top,
in the second row, and i
and P33k in the third row.
Pzia
31k,
3
These plots show that ,
11 ,
a
21 and a
a
, are neither autocorrelated nor cross3
correlated. Therefore, the basic assumption for the special formulation of
the ARMAV modeling is satisfied and the fitted models are entertained.
It should be noted that modeling all three series is necessaiy to check
the justification for using the special form of the ARMAV models and to
obtain a comprehensive knowledge of the system including feedback and
the interactions between the inputs. However, if one knows by prior mod
eling or physical knowledge that the special form of the ARMAV model
is justified and the model is to be used only for control purposes, it suffices
to get the model for the output alone rather than models for all the input
and output series. Furthermore, it will be seen in the next section that
when the dead time of the additional input
,
,
2
X
which
is
not
manipulated,
is greater than or equal to the dead time of the manipulated input ,
17 the
X
derivation of the optimal control equation does not require the models for
the inputs.
In summary, the adequate models for the series Xi,, ,
21 and X
X
, are
3
listed below:
, = 0.784X
1
X
,,
1
,+a
1
o, = 1.64
1
_
21
0.11X
1
_
31
1.231X
, = 0.286X
3
K
6
,
1
+
fr
01
0-
0-
!.
I,
-.
33
P
0
C,,
0
0
NLLSVo3OI
ii
;!-.
II
ONY
C.,
C
1)
0
I..,
I
I
I
I
I.
I
ij
I
I
$
I
1I
I
P22
. I
I
...,
I
I
-,
,oui.iioo ivvau.io
-.
Pi
P31
I
I
I
I
31dLfl
I,
11
P
I
I
ii
I
I
fi
443
F
I
L 0.286
+
]
6
_
1
o1fx
1c4tl
F1.64
I 0
L 0
[g
fo
J
36
0JLx
26
0IX
[aj
0
0
0
1[x
]
2
_
1
2
( X,_
2.410J
01
0.2001
(11.2.7)
Once the models are obtained from the data as discussed and illustrated
in Section 11.2, they can be used for prediction or forecasting and control.
model are low. Consequently, the effort involved in estimating the param
eters of the simplified form of the ARMAV model is far less than in the
case of a complete ARMAV model. In fact, it is easy to see that for the
ARMAV model, the number of parameters to be estimated increases con
siderably with the increase of the number of input-output series.
Note that because of the lag between X and X, the order of the
complete ARMAV model is (7,1), whereas the order of the models for
individual series when formulated as the special form of the ARMAV
0.200
0
4.027
1
0.635JL4_
7
_
3
OJLX
7
0NX_
7
_
1
ojfx
1
0 ]fa_
0.087 f a_
1
10.264 0
0
0
0.356JLX
_
3
J
0.061
f 0
10.022
0
0.417
0
0
0.451
0.318
0
0.024
F 0
0.011
L 0
0
0
1
0.083
,=I0
1
Lo 0
Thus, similar to the one input-one output models for papermaking data,
a combined three series ARMAV model can be obtained from three sep.
arate models by back transformation using 4 = 44, 42 =
= 4o4)6, r = l4i and or = 4&O4k from Eq. (11.l.7c):
] F 0.78
11
[x
0
0 1FXL_x
0.040 0.115 II X_
1
I Xzt I = I 0.019
0.167
J 1 0
3
Lx
1
l.23lJLX_
*Transformation to ARMAV
If we consider the results of the (2,1) model for illustrative purposes from
Table 11.5 for the X, series, the 4, matrix can be written as
OPTIMAL CONTROL
(11.3.1)
Let us now consider the problem of deriving the optimal control strategy
for a given model. The goal of the control effort is to keep the output at
the target value, which may be taken as zero, when the output is measured
from the target value as the zero. However, because of the noise and
disturbance in the system, it is seldom possible to maintain the output
exactly at zero level. The best one can do is to keep the deviations or
errors from the zero target values as small as possible. Since these devia
tions that form the output are random variables with zero mean, the meas
ure of their smallness is given by their variance. Therefore, the optimal
control is defined as those adjustments in the manipulable input values
that yield minimum variance or minimum mean squared error of the output.
We already know from Chapter 5 that the minimum mean squared
error forecast that yields minimum variance of.the prediction error can be
obtained by taking conditional expectation. Since there is always some lag
L I between the manipulated input and the output to be controlled, the
earliest time at which the input X
1 can affect the output is t + L. There-:
fore, the minimum mean squared error control strategy is to adjust the.
input K
,(L), is zero
2
, such that the forecast of Xz,+L made at time t, .t
1
(or target value). Thus, the optimal control strategy may be simply written...
as
=
,(L)
2
k
,(L)
2
e
it is clear that after implementing the control equation (11.3.1), the output
is given by
(11.3.2).,
(L)
2
X2V+L = e
OPTIMAL CONTROL
445
which has the smallest variance that can be achieved based on observations
at time t. This optimally controlled output, or output error or deviation
from the mean (target value), is of course the same as the L step ahead
forecast error, and by Chapter 3 [see Eq. (5.2.11)], it has a MA(L 1)
model.
In this section, the development of the optimal control equation ad
an assessment of its effectiveness are presented in four convenient steps:
For illustration purposes, four examples are chosen, namely: (i) The first
order model with lag 1, (ii) the second order model with lag 1, (iii) the
first order model with lag 2, and (iv) the second order model. with lag 2.
General formulae are given after the examples.
0.0062.
As the simplest case, let us consider the model (11.1.1), which is in fact a
first order model fitted to the papermaking data in Section 11.2, with
rounded-off parameter values
1 + a
...
21
,_ + 0.7X
1
0.25X
,
2
(11.3.3)
with y
,(1)
2
X
,
1
0.25X
,
2
0.7X
(11.3.4)
= 2.8X
2
Which gives the desired control equation.
,
1
VX
=
,
2
2.8VX
This equation states that if the output deviation increases by, say, 0.5, the:
input scale reading should be reduced by 1.4 from its preceding position.
This is intuitively meaningful. Since the paper thickness increases, we have.
to reduce the gate opening so that less stock flows in and eventually reduces,
the thickness measured by the basis weight. If the thickness decreases,
then the gate opening has to be enlarged as the equation indicates. (The
equation can be readily implemented by the P11) controller, see Appendix.
A 11.3.)
that is,
447
7a
0.0041.
,_
2
0.52X
1
_
21
0.52X
,
2
1.13a
,
2
,_ + a
2
1.13a
1
_ + 1.13a
1
,
2
,
2
, + 0.59X
2
1.45X
1 O.52X
_
11
(11.3.7)
(11.3.7a)
(11.3.6)
,
1
X
1 + 1
_
11
,_
2
1.86X
, + 2.11X
2
1.14X
that is,
Now, if the a
, in Eq. (11.3.7) is replaced by using Eq. (11.3.8), then
2
the control equation becomes
,_
2
0.52X
,_ + 1
0.59X
, = 0.32X
1
0.28X
, + 1
2
that is,
,
1
0.28X
with
OPTIMAL CONTROL
448
MULTIPLE SERIES: OPTIMAL CONTROL AND FORECASTING
Model
0.72t(1)
0.25 X
1 + 0.7S
,(1)
2
=
(11.3.lOa).
2t(2) = 0
that is,
,
1
O.25X
(11.3.1,):
(11.31i)
where .k
,(1) is the one step ahead forecast of the optimally controlled
2
output.
Unlike the case of initial delay L = 1 (Section 11;3.2), the substitution
of the control equation (11.3.10a in Eq. (11.3.9) does not give the opti
mally controlled output in a straightforward manner. Nevertheless, from:
Eq. (11.3.9)
,_ + 0.7X
0.25X
,
2
= 1
Hence, by substituting Eq. (11.3.11) into Eq. (11.3.lOa),
0.25X = 0.7(0.25X
,
1
1 + O.7X
_
11
,)
2
and the control equation is
,_
0.7X
,
2
1.96X
= 1
In the difference form, the control equation is
VX = 1
1
,_
0.7VX
,
2
1.96VX
=
=
(1
, +
2
.)a
1
0.25
,
2
a
11
X
2
_
0.7B
+ 1
0.7B
(1 0.7B + 0.7
B +
2
1 7B
2
X_
(I1.3i
then
21
e
(2)
OPTIMAL CONTROL
0.25
,
1
X
0.7B
21
t
(2)
(2)
2
21 + X
e
(2)
+
i + 0.7
or
=
(1 + 0.7B)a
21
0,
449
(11.3.13)
Since from the control equation the two step ahead forecast .k
(2)
21
the optimally controlled output is the two step ahead forecast error
X
2
2 = (2)
21
e
,
2
0.7a
=
21 + 1
a
2
that is,
ya
49
l.
)ya
2
(1+0.7
22
The output variance is, thus, increased by 49% for the dead time L=2
compared with the L =1 case. It will be shown later that as L increases,
the output variance increases and reaches to the same as that without any
Control.
2
_
21
0.52X
1 + a
_
21
1.13a
,
2
(11.3,14)
,
1
0.28X
_
1
0.59X
,
2
+ 1.45X
1
_
21
0.52X
)
2
1.13a
,
2
0.52X
21 = 0
0.52X
(11.3 .15a)
0.28
0.59B
1.45B + 0.52B
2
01
0.52
0.28
0.59B
1
X
1.45B + 0.52B
2
(2)
2
X
1.45
1.13 = 0.32
= 4
G + (1)2 = 1.45(0.32)
1
= 0.056
1 =
G
(2)
2
e
(1 + 0.32B)a
2
This confirms the previous conclusion that after control, the output b
comes a MA(1) model.
etc. Since k
(2) = 0 by the control strategy, the output after implementing
2
the control is e
(2), that is,
2
2
X
(1 + 2
B)a
1
G
(11.3.16)
where
.-
, + G
2
= a
2
X
+ G
2
a
1
a +
2
Thus,
+ 1
If X
(1) is substituted into Eq. (11.3.15a), we get
2
,
1
0.28X
0.59Xir_i + 1.45(0.28X
_
1
2
_
1
0.59X
+ 1.45k
(1)
21
450
1
I
0.32
V.3L
1
)
(1 + O.i2B)
,
2
t(1) = 0.32a
2
X
, =
2
a
21
451
1 +O.32B
2
X
,
2
0.52X
, = 1.45(0.32)X
1
0.59B)X
, + 0.52(1 + 0.32B)X
2
,
2
] = (1 + G + G
2
Var[X
...
..
G)y
2
GL_lB)a
(11.3.19)
(11.3.18)
as the
noise in
Moreover as
((
as
variance
Since it takes a longer time for the manipulated input to affect the output.
and
2 = (1 + G
X
B + G
1
B +
2
It should be noted that the model of the output after implementing the
Control equation is MA(L 1) if the lag is L, that is,
11.3.6
After simplification
1 +
_
11
, = l.79X
1
X
1 + 1.45
_
11
0.59X
(1 + 0.32B)(0.28
or
1
0.28X
,
2
X
(11.3.17)
(1 +O.32B)
Now, substituting Eq. (11.3.17) into Eq. (11.3.15a), the control equation
becomes
and
OPTIMAL CONTROL
452
MULTIPLE SERIES: OPTIMAL CONTROL AND FORECASTING
Hence, for large lags the optimal control is practically the same as no
control. For the papermaking process, this says that the longer it takes to
realize the effectof change in the stock flow on the output basis weight,
the larger the variance of the optimally controlled output will be. In other
words, the optimal control will deteriorate as the lag L increases. This is
understandable since nothing can be done about the disturbances that come
in the lag time. In fact, it is these disturbances that form the minimum
mean squared error of prediction or control and inevitably appear as the
output. No. further improvement is possible unless the system design is
altered to change the lag and/or model form including parameter values.
Step
Obtain the L step ahead forecast for the output using the conditional.
expectation procedure as developed in Chapter 5 and equate it to zero,
This gives the basis for developing the optimal control equation.
Step 2
Find the optimally controlled output by obtaining the L step ahead forecast
error e(L) from the orthogonal decomposition of the model using the
method of Chapter 5. In essence, this is a MA(L 1) model.
Step 3
Solve for the optimal control equation in terms of input and output aln
by using the optimally controlled output model and express the control;;
strategy in the difference form.
Step4
Obtain the variance of the optimally controlled output.
..
21LX1_L + 2iL+1X1:_L_1 + .
222
O
_
2
a
_
X
2
4
L >0
2
a
flX_fl
22ma2tm
(11.3.2O
The output model for one input-one output system with initial lag or
dead time L is
2
X
221
0
2
a
1
_
1
X_
4
1
(L)
2
k
. +
22L. +
+_,,
21
flX
22
+4
OPTIMAL CONTROL
21 1) + k
221
c1
(L
222
4
(
2
L 2)
.
O22La2
.
453
(11.3.21a)
+.
+.
= 421LX1t + 4
21L +
GL_lB-)aZ
(11.3.22)
with the a
, terms absent when L > m.
2
The optimally controlled output after implementing this control equa
tion is
= (1 + G
B+G
1
8 +
2
0mB)
+
0
B)(G
B
1
2
B
G
+G
BO
1
(1O
2
B
(11.3.23)
where G
1 is the Greens function of the ARMAQz,m) model for X
21 in
(11.3.20) and can be computed by comparing the coefficients of equal
powers of B in (See Section 3.1.5)
.)
Bp ...
1
(i
B
2
+ . .
.i
\i 1
j=1
GL_LB1_4) a
,
2
G,B)
B
1
( G._
L1
1,2,.
.,L
1,
(11.3.24)
2 (L D =
454
21L+2
2JL
22L+k
21L
+ .
i=O
. +
L-1
J1
22L+i
4
2
1L
21L
,
2
x
Bn_L)(1+GIB
,
2
(GB)X
BL_l)X
1
GL_
t
_k
+ . . . + 2
mL
B
1
(1+G
2
B
2L+1
B +
nL
+... + GLB)XlE
k=O
[
+
1+
2
+
+
z
2
L1fYa
(11.3.21)
It is left as an exercise for the reader to verify that the results of Sections
11.3.2 to 11.3.6 can be obtained from the above general results by proper
substitutions.
The variance of the optimally controlled output is obtained from Eq.
(11.3.22) as
V[X
]
2
Var{[1
B
1
B
2
}
2
+G
+G + . . . + GL_lBL_]a
(11.3.25)
x
(
2
e)
,
2
X
1,2, . .
L1
_
2
a
1
G
. . .
GL_la
_
2
Ll
3.0
6 + 0.O2X
_
1
0.10X
21
(327/109)
OPTIMAL CONTROL
,
3
31 + a
0.76X
455
(11.3.27)
,
3
X
11 + 2
O.10X
0.76k
r
(5)
O.02k
(
5) + 3
(11.3.28a)
To find 3(5) we have to first find the optimally controlled output, which
is the same as the six step ahead forecast error e
(6). This error is most
3
easily obtained from the Greens function form of the model (11.3.27),
written for X
6 as
3
0.10
0.02
(10.76B)
+ (10.76B)
=
4.01.
1
36
a
(11.3.29)
+ (10.76B)
Before we take the conditional expectation on this form to get (6),
31 we
e
have to substitute X
25 in terms of a
, using the adequate ARMA model
2
, from Eq. (11.2.5) as
2
forX
(1 0.11B 2
)X = (1 + 0.36B)a
0.44B
,
,
2
(11.3.30)
with
0.10
j.O
O.76X_
+
j=1
6
a
3
G
1
_
j=O
a
3
G
1
_
6
31
(11.3.29a)
456
6
3
G
,
2
a
1
_
21
\j=1
(2
Ba
3
G
,
2
21
6
3
G
,
3
a
1
_
31
.)(10.76B)
O.02(1+0.36B)
0, 1, 2,
(11.3.31)
(11.3.33)
(11.3.32)
0.02(1+0.36B
)
,_
2
a
1
(1 0.76B)(1 0.11B 0.44B
)
2
and hence,
(6)
3
e
1
where 3
G
21 is given by
0.02
21
X
1
_
(1 0.76B)
=
2
(10.11B
)
0.44B
B
(G
+
B+
32
2
G
G
1 +
2
3
and
G
3
31 = 0.76,
It follows from Eq. (11.3.32) that
G
3
21 = 0.02
G
3
22 = 0.02(0.76 + 0.11 + 0.36) = 0.025
323 = 0.02(0.76 x 0.11 + 0.44) + (0.76 + 0.11)0.025
G
= 0.029
G
3
24 = 0.029(0.76 + 0.11) + 0.025(0.76 x 0.11 + 0.44)
(11.3.31a)
Hence,
after
implementin
g
the
control equation, the optimally controlled
output, which is the same as e
,(6), may be written as
3
6 = 2
,
3
X
4
O.O25
,
O.O
a
2f+5 + 2
3
,
2
a + 0.029a
, + 0.026a
0.028a
21 + +
+ 2
31
2(0.76
_
6
)Ja
,
(11.3.28)
The terms k
,(5) and X
2
,(5) in the optimal control equation (11.3.28a)
3
may now be obtained from (11.3.30) and (11.3.31a). Thus, the optimal
control equation takes the form
,(5) + 7.6.3,(5)
2
, = 0.2k
1
X
where
and
1
.
2
0.11X
,
2
0.44X
2
_
71
0.025a
5
,
2
0.026a
_
1
,
2
O.36a
3
0.029a,,_
3
2(0.76
,
_,
)a
= 5,4, 3
OPTIMAL CONTROL
,x
2
(e) = ,
2
0.l
(1
L
) + 0.442,(2),
,
2
0.02a
, + (0.76)
2
0.026a
3
a
5
X,,
, + 0.44X
2
0.11X
211 + 0.36a
,
2
,X
2
(2) = (1)
21
0.11
k + 0.44X
,
2
X
(
2
1)
,
2
a
,X
3
(5)
4
,
2
0.028a
_
, = X
3
a
,
3
457
By carrying out the algebra similar to the one used in Section 11.3.7
in deriving Eq. (11.3.21), we can
express
the
control
equation
(11.3.28)
completely in terms of the present and the past values of the output X
,
3
and the inputs X
, and .
1
21 However, the resulting equation is quite cum
X
bersome. On the other hand, the recursive relations given above are corn
putationally equivalent, but much easier
to
deal
with.
These
equations
completely specify the amount of optimally
controlled
adjustment.
If nec
essary, Eq. (11.3.28) can be expressed in the difference VX
, by applying
1
the operator V on both sides as illustrated before in Sections 11.3.2 to
11.3.4.
(iv) Control Efficiency
(O.76)21)a33
Itfollows from Eq. (11.3.31a) that the (minimum) variance of the optimally
controlled output is
) = (0.022 + 0.0252 + 0.0292 + 0.0282 + O.O
31
Var(X
22
)Ya
262
+
x 100 = 30.17%
2.41.
0.17k2(5)
jO
4.92
9.81
100
49.85%
4.92
9.81
0.358214.01
+ 0.1612 + 0.1122)2.41
j=1
)
31
Var(X
2t(2)
,
2
0.lLt + 0.44X
(l)
21
2
+
,
2
0.44X
1
_ + 0.36a
(1) = 0.11X
2
X
2
0.36a
2
_
21
0.44X
1
_
2
0.11X
= K,.,
21
a
(5) 0.358a + 0.112a
3
t
_
31
1 + 0.l6la + O.ll2a
31 = 0.3O6a + O.358a,_
k
(4)
4
O.3O6a_
3
O.l66a,_
2
0.031a_
3 + 0.l7a_.,
= X
31
a
3
31
O.238a
2
0.366a_
1
_
31
0.59a
5
0.358a_
0.1l2a,_s
4
O.l6la,_
is
The (minimum) variance of the optimally controlled output
(4)
31
0.36k
+ 0.32k
(4) + 1.2323(5)
21
with
Pulping
11.3.9 Optimal Control of a Two-Input One-Output Paper
Digester
illustrative
The simple first order model was chosen in Section 11.3.8 for
(2,1) that
be
to
found
was
digester
the
purposes. The adequate model for
as
(11.2.6)
Eq.
or
11.3
may be written from Table
2
_
21
1 + O.32X
_
2
O.17X
7
,
1
= 0.29X
6 + O.26X
_
11
1 (11.3.34)
_
31
O.64a
3
,_ + a
3
O.36X
2
1
_
31
+ 1.23X
459
0,
1,, p_
=
-
q + 1
PI
<L
3,
Bii1(B)X + H(B)a,,
,q+2, q+J
L, and
(B)X
1
B-4
,
(11.3.35)
pi
2cIPJ(B)XftL_C,
(11.3.36)
Jk(B)Ak,
11
H,,(B)a
(11.3.37)
j=q+1,q+2,...,p1
Note that, for convenience of notation, lags , of X, relative to X,,, are
mcluded in 4
Jk(B).
+ 4P,L_,_l)kIl(1) + 4Pj(L_l)Xjl
[4,,,,k(L ,) + 4J.lft(L i)
dpj(Lj+1)jti
(B)X
cZ
,
1
jq + 1
+ 4
(L 1) + 4
k
1
(L 2) +
1
k
2
OPPLaPK OPPL+laP,_1
The conditional expectations of Xft, 1= q + 1, q + 2,.
pi required
in the above equation can be obtained by applying the standard rules to
input models that may be written in the operator form
1(L)
where, ,,
We can now generalize the results of the last two sections and derive the
optimal control equations for a multi-input--one-output system. To take
into account the different lags for the inputs, let us generalize Eq. (11.2.6)
for the output X,,, to the form
OPTIMAL CONTROL
(B)X
I
1
+
kq+i j=*
(11.3.38)
ot+L =
e_L(L)
=
L-1
kqi,=ek
2 2
G,,kJak_,
(11.3.39)..
11
H
,
1
(B)a
j
=
I)pp(B)Xp
q + 1, q + 2,..
H(qz)(g+2)(B)a(q2),
pi
GPkJBJ)akt
I?(q+2)p(B)
Hpp(B)a,,,
I.k
(;
(11.3.4O);
H(B)a,,
2 jk(B)Xkr
.
.
(B)X + k:8ekpk(B)xkt
B-I
,
1
(q+ i)(q+2)(B)
i)(q+ l)(B). 4
(q+2)(q+ i)(B)
4
P(q + i)(q+2)(B)
::
(g+ 1)(q+1)(B)
i(q+2)(q +
4
)(B)
2
BCI);(q+i)(B) B2(I)p(q+2)(B)
(q+2)(q+ l)(B)
::
kq+i
)(B) B+2I,,(;
1
Be+1cL;(q
)(B):
2
VarX,)
it
ILi
kq+i
\
GkJ)
OPTIMAL CONTROL
461
(11.3.41)
Equating equal powers of B for an ak, in Eq. (11.3.40) gives the G,,kJ
required in Eq. (11.3.39). Once we know G,,kJ, Eq. (11.3.39) yields the
(minimum) variance of the optimally controlled output
.
1
(Jk
21
1 + H(B)a
(B)X
21
B
O22L+aZ(_1
(11.3.43)
(11.3.42)
Special Case: p2
To illustrate the implications of the general equations derived above, con
sider their special case p=2 so that q = q=1. The Eq. (11.3.35) takes
the form
=
O22La2
t(B)
(B)
2
4
2
it
+ (B)
e,(L)
221
G
0
,
2
a
1
_
1
(11.3.45)
that are the same as Eqs. (11.3.20) and (11.3.21a). Since there is only one
input, Eq. (11.3.37) is not needed to get the conditional expectations.
Equation (11.3.38) can be obtained immediately from Eq. (11.3.42) as
(B)
2
H
2
2
a
( 11 3 44 )
21+L.
X,
1=i
L-1
(L
331
,
3
2
(11.3.48)
as
(11.3.5O)
(11.3.49L
(L) = 0
,X
3
,+
1
(B)X
31
k
.(L
1
4
,
+ 2
1)
(L 2)
2
.k
320
(
31
B1)a
3
G
.
3
31
3_
G
,
3
a
21
L1
H(B)
a =
,
3
33
4
(B)
3
X
+ . . .
(B)XZ,+L_ +
lI>
3
(L 2)
t
,
3
2
,_ ...
3
a
33L 1
0
,
1
(B)X
31
4)
1)
Case(i):L
2
Then q =2 and the control equation flows from Eq. (11.3.36) as
(L) = 0
3
X
Special Case: p=
3
Equation (11.3.46) is also clear from Eq. (11.3.44) and is the operator
equivalent of Eq. (11.3.23). The conditional expectation in Eq. (11.3.43)
can now be obtained in a straightforward way from Eq. (11.3.45) with
1)
1)..t2,(e
2
(L_
+ _
3L3r
0
3
O33L+1a3,_1
(11.3.51)
463
,a
3
G
1
_
21 + 2 31
a
G
1
2_
1=0
j2
(11.3.53)
f0
31
3_
G
,
3
a
21
a
1
2G
f=2
)
H
33
c(B)
(B
)
)4
33
[P(B
(B
24
(B)]
32
Be
(B)4
331
G
2 B
,
2
3 and a
which reduces to the two operator identities for a
433(B)f
cl(B)l
IcI(B)
(B)
32
[2
,
2
H(B)a
H(B)a
1
,
3
I(B)
BCI(B)
(11.3.54a)
(11.3.54)
2)
.t
(L
.
3
,4
+ 3
)X
(1)
(L_l
33
f
+ 33LX3r +.
+. .
1) +
1
331
4
3
k
+ (L
+ )
_e +...
X2
2
4>32(L
.-
OPTIMAL CONTROL
> GJBJ
Be2(B)H(B)
464
33
(
2
[4,
(B)
B)4
(11.3 .54b)
4211X1, +
= 4)
e 1),
2
t
1
ijZr(e 1 +
2
e 2 (11.4.2)
Equation (11.4.2) shows
that
except
for
lead
time
e
=
1,
forecasts
the
for the series X,, are also needed to compute the forecasts of .
21
X
465
(11.4.4)
(al.4.3)
Case
Let us consider X, as a white noise series; it then has the model
, = a
1
X
,
1
and
,
2
X
1
, + 4)
1
X
211
4)
0,
=
,4)
2
k
221
(e1),
.k
(
2
1)
(e)
2
x
1
3) +
4,
(
2
k
1
e2)
]
X
2
44jj
4)
,
2
k
121
(e 3)1
4)
(
2
[2
121
C 2)
(e3)
2
k
111
ct,
1 +.
,+
X
1
4)fjj
+
(11.4.5)
It is thus seen that the forecasts beyond the first step for C > 1 are not at
all affected by the leading indicator. This is obvious since the leading
indicator ,
11 being a white noise series, cannot provide any forecasts that
X
can be used in improving the forecasts of the required series X
, after lead
2
time one.
Case (ii)
If 11, also has an extended first order autoregressive model
, = 4)
1
X
_ + 4)lX2_j + a
1
X
111
1
then using the rules for conditional expectation,
4) + 4)121.2,(2)
(2)
11
.t
111
1)
2x
4
j
1
[4$
,
11 + 121
e 2)
2
k
+ 44jj
,}]
X
2
]
X
2
+ 4)
+4)
(
2
k
111
e 3)
422l12t(
1),
4)fjjXI, +
211
4)
(11.4.6)
(1)
2
k
1
(e)
21
.k
+ 4)2ll4)L2l[1z2)
+.
jo
__
1
a
11
G,(4
1
1 +
a
,
2
_)
(1 +B) _
(4
1
a
211
1
1 + 2
a
,
)
(1 + 1
2 +
B + 4B
,_ + a
a
211
.)(4
1
,)
2
(11.4.8)
(11.4.7)
Gi(4
,
1
a
211
+e_z + az,c...i)
a2:+c)
.,
a
1
G
,
1
11 + G(d
,_ + a2)
a
1
+ 2
(
G
,
1
a
211
...
+ +
+
.
+2
_
(
1
G+
1
a
211
4
1+)
1
,
2
a
_ +...
a,,)
G_
j
(
aj, a2,+i)
+ 1
+ _
11 +
a
11
G(4
1
+. .
(11.4.10);
(11.4.9)
+ a
,+e..,)
2
+ 3
,+e_ + a2,e2)
1
a
211
Gz(4
jo
, +
a
211
(4
1
,..
a
211
Gj(4
.
1
_
2
X
.
(1
B
21
,
1
X
_
)(4
1
4 + 2
a
,
)
Using 1
,=a for Case (i), we get
X
,
,
2
X
466
j
Errors and Probability Umits
467
.
.
,+ _
1
a
2
[4
(4
G
1
a
11
1
1+a
,)
2
+ 2
(
G
,
a
211
+
)
1
_
_
4
a +
+ 1
(+aa a_
G
+
2
_
2141
+a
1
a
11
[(4
)
1 + ,
2
(
G
,
a
11
+
1
_
4
)
a
+. . . + 21
Ge
(
1
a
211
+
1
.
)
2
+
4
1
a
_+1
a
G...
,
2
,
.1
(11.4.11)
(1 + G + G +
...
+ 4
)(4LiY
2
(11.4.12)
)
,
11
(4
2
X
1
+
1
l.9y)
2
1.96[V
,
(1))J
(e
(e)
2
k
1
that is,
+ -2)(411Yau +y
) + G_J]
4
1.96[(1+G+G + .
(11.4.13)
that is,
V[e
(
2
)]
Since a
, and a
1
, are assumed independent, the variances of the forecasts
2
(actually of the forecast errors) are given by
,(1)] =
2
V[e
,_ +a2,+c) + 1
a
1
(4
)
1
2
(
G
,
a
+_
4i +a+_
e,() (e)
21
,
2
=X
+2
e(l)
11.4.3 Forncast
y = 0.0062
Let us consider an example ofthe first order models for papermaking data
as discussed in Section 11.2 with rounded-off parameter values (Tables
11.1 and 11.2) for illustrative purposes Y
a y = 0.0021 (11.4.1
,
1
4)
0.10X + 0.19X
,
= 2
1 +
_
2
2 = 0.25X
X
.. 0.7X
1
1 + a
_
21
,
2
+0
2,_I
7
.
1
2,1
+ 0.7X_ +
22
0.25a
1
_
0.048
0.504
_ a
2
0.048X
,
2
= 0.6752 +
1
.
2
0.675X
.
0
&G
(11.4.15).
= (1
0.6758 2
0.04
)
1+2
X
8B 0.25a,_
a
O.675B 0.045B
) ,
2
(0.25
1
_j
a + a)
0.25
.
+
1
,
+
2
0.
12: = _
(
.
_
a
19
0.10
j
.
i
X
X
(1
12,
X,
(0.25a +
G
_
1
..
11
=
jo
where G
1 is the Greens function of the AR(2) model
that S,
= 0.675
G
1
&
0 = 1
G
=
1 =
G
2
G
1),
(11.4.16)
Using the
condit
ional
expect
of
ation
the
model
(11.4.14), we get tlW
expressions for the forecasts as (recall that X, and X
, are the values after7
2
subtracting the mean)
,(1) = 0.25X
2
k
, + 0.7X
1
:
2
0.252,( 1) + 0.712s(
or using Eq. (11.4.15),
(t
2
)
=
G_ 1
z(0.25a
)
2
1.96
(
))
[V(e
J
469
(11.4.17)
(10.6752+0.5042 +
(11.4.18)
+ G_
1
21 = (O.2Sajt+e_i+azte) + 2
e
()
O.675(O.25a
+
)
1
a
_
+
+ 2
G
(
1
)
0.2
j+
e_
5a
a + G_la2+,
and the 95% probability limits
on
the forecasts are
(a)1a
96
k
(
2
1) l.
where
,V[e
2
(e)J
For example, the 95% probability limits on two step ahead forecasts
are
X
(
2
2) 1.96[{0.0625(0.0021) + 0.0062) + 0.6752(0.0o62)]1
= k
,(2) 1.96V0.0091561 = 2(2) 0.1875
2
Figure 11.10 depicts the forecasts and the 95% probability limits for
lead time e = 1, 2,
, 7 made at origin t = 43. It
is
seen
that
the
probability limits of the forecasted values include all the actual observ
a
tions.
To illustrate the advant
age
of
using
leading
a
indicat
or in forecasting,
suppose that only the output series
is
availab
12,
le.
An
AR(1) model was
fitted to this data following the Chapter 2
metho
dology
.
, = 1
2
X
0.7168
,
2
_
X + a,,
with
042 = 0.00674
Table
11.6
shows
the
estima
ted
varianc
es
of
forecas
errors
t
made
with
and
without the leading indicator. As expected, the leading indicator improv
es
the accuracy of the forecasts since the error variances are smaller. The
improvement in the error variances is maximum at lag 2 and reduces for
large lags.
34
36
38
40
0.01358
0.01370
0.01376
0.01379
0.01386
7
8
9
10
0.01334
0.01287
0.01197
0.00674
0.01020
4
5
1
2
3
Lead Time
0.00610
0.00901
0.01060
0.01146
0.01193
0.01218
0.01232
0.01239
0.01243
0.01245
V[e(O]
Without Leading
With Leading
Indicator
Indicator
9.85
9.86
9.95
10.57
10.30
10.07
10.95
11.44
9.50
11.66
Improvement With
Leading Indicator
Percentage
Table 11.6 Comparison of Forecast Error Variances for First Order Models
series alone which can provide forecasts at least as good as, if not better
than, those provided by using the leading indicator. Sometimes, it may be
necessary to disregard the F-criterion and use a scalar ARMA model of
higher order although it does not provide a significant improvement in the
sum of squares at a given level of probability such as 95%.
32
471
..
2
a
0
2
1
0
2
2m2tm
0Ln
+ a
2
Let X
2 be the main series of interest and X
11 be the leading indicator. To
ensure that a
11 is independent of a,, the zero lag parameter may be included
either in the X
11 model as 4 [illustrated in Eq. (11.4.14)1 or in the
model as 4. For the sake of definiteness we will include it in the X
,
2
model as 4210; this implies that the series of interest is affected by the
leading indicator at the same time, but the feedback effect of the series
on the leading indicator requires at least one lag. Then the models are
, = 421LX1_L +
2
X
+
+ 4,X
_,,
11
+ 4X,_
1 + 4mX2t_z +
+
21
a
O
1
_
(11.4.19)
472
1
_
1
0
1
a
2
12
111
O
_
1
a
1 + a
_
a
1
O
,
1
1
,,,
2
. +... + 4,X
122
4
.
2
X_ + X
1
,
1
X
1 +
+ 1
_
1
X
111
4,
1
_.
1
x
1
+
41L1X1_L_i+ +
i..o
ci
, +
2
X
1
4
... + 0
22mt2tm
,
G,
_
2
a
21
1 +
_
2
a
O
1
(e 1) + t
1
k
f 2) +
222
+
(
2
42iLXit_L+
() =
2
e
(11.4.20)
(11.4.22)
(11.4.21)
For the forecasts at lead times e L, the model (11.4.20) is not needed
.
1
since the forecasts involve only the known present and past values of X
Thus,
+ _
2 +
1)
2
+
(
2
X
1
+ 6X2t_fl
2
4B
...
2
B
22
cf
+
. . .
2
0mB
GB
G,
ci
B
1
O,
4,,B)(1
(1
= y
(1
...
GmB
. .
...)
O,,,,Bm)
(11.4.23)
(11.4.24)
22
O
mBm)fi
(11.4. 19a)
1
21L1 B
OiimBm)tuit:
(11 .4.20a).
+ B
122
4
2
(+ziLB
111
0
B
221
0
B
=
d?,iB)Xi
2 =
B)X
2
4
2
4)iinBiXit
...
21 + (1
4
,
1
+ BjX
. . . +
2
B
1
d
2
where G
1 is the Greens function of the ARMA(n,m) model in (11.4.19)
and can be computed by comparing the coefficients of equal powers of B
in
(1
B
1
4
B
111
cf
>
(1
111
0
B
. . .
_...
. +
+. . .
4flB)
473
(11.4.25)
+ J,B)]X
,
2
Substituting for X
1 from Eq. (11.4.20a) in (11.4.19a), we get
2
B
12
B-c,
111
(14
ijB4mB
[(1
2
_(4BL+jL+jB
)(4
121
B
1
+4
2
B+4B
112
4
2
B B
= (1 11
+ (4ilLBL+diL+iBIi
(1
Jk(B)Xkt
,
1
= Hjj(B)a
j = 1, 2,
. .
(11.4.26)
k=i
ki
jkOZct
k1
2 2
1=I
o,jgaj,
aj,
(11.4.26a)
GPICIaIU+_J
(11.4.27)
The step ahead forecast error in the series of interest Xi,. is given
.
.
...
...
F,
kI
cP
(
B)
l(l)(B) 1
4,,(B)
jO
j)(B) H(B)a
4l(_j)(B) 1
(B)a
1
H
1
22
H
1
(_j)(B) (B)a,
2
(I)
(11.4.28),
ki j=O
1
G
(11.4.29)
cI
(B)
11
(B)
21
c1
(B)
11
(B)
21
t1
c1i
(B)
12
I(B)
(B)a
1
H
,
1
1
H(B)a
=
ki \jO
2 12
G,B)akJ
/
Var[e,()1
Equating equal powers of B for an ak, in Eq. (11.4.28) gives GkJ required
in (11.4.27). The variance of the step ahead forecast is then given by
cI
(
1
B) 2
4
(
B)
cti (B)
(B)
11
12
(B) 4(B)
21
cI
4,
(
1
(B)
2
B) t
c1
(B)
(B)
1
4
1 12
22
4
21 (B)
0
(B)
by
474
(B)(B)]
21
4
(11.4.30)
475
and
(B
...
...
B
1
(1O
B9X + (14
1
4
,
B
1
(A 11.1.1)
0zrnB
k
m
t
(A 11.1.2)
,
2
4B)X
...
...
+ _,,
11 +
X
21
4,
022rn2t_rn
where n and m are the autoregressive and moving average order, respec
tively. Using the B operator
B
111
(14
11 + (4+mB
4
1
B)X
.-.(1_
D_
DinP1t
VjflL
I
Urn
+ a,O
1
_
2
a
221
_
2
4
1
X
11
1 i+ 6
X_ +
1
_...
The initial values of the parameters for the ARMAV models and their
special cases can be obtained from their inverse function coefficients. A
procedure to get the initial guess values based on the inverse function
coefficients for the ARMA model parameters has been described in Chaptsr 4. In this section, the same approach is extended to find the initial
values of the parameters for the input and output series models. We will
use the bivariate model to illustrate the method
It follows from Section 11.1.6 that the bivariate model (one input-one
output system) may be written as
,+ f)uoXzt = 1
1
X
111
4
g
X
_ +X
X_
1
121 + 4ii2Xie_2 + 2
4
1
_
2
Appendix A 11.1
(B) (G
11
(B)H
21
I
B)
211
(B)H(B)
1
4
11
(G
B
i)
476
...
.
.
..
..
,
2
.)X
.)X,
.)X,
.
11
.)X
,,,B)X,
1
4>
]
1
)X
11
B)X
21
4)
]
21
.)X
0iimBi[(1 1B
2
B
122
B1
121
1
120
(1
B
>4
121
+ (1
+ (+1w4>121B
...
2
B
222
B1
221
(l1
= (1
2
B
212
B1
211
(A 11.1.4)
(A 11.1.3)
= (1
.
12218
(A 11.1.6)
11
.)X
(A 11.1.5)
and
2
B
12
B4
11
(4
(1
B)X,
22
4>
2
B
222
B4>
221
+ (14)
211
B
2
O
Bl
212
B_I
m)[(_K
= (102218
+
4>112
= 0111
0112
+111
01111111
0111111,_I
+
+ 122
011J 11121
1
0
1
1
1j_z
12
112
ij
121
(A 11.1.5b)
(A 11.1.5a)
and
=
120
mio + 1121
= 0
11112j1
+121
L1f12O
4)120
+121
4>211
4)212
4>211
221
21j
212
= 211
02211211
0221121)_I
2
0
2
1
21_i
21
221221 +
221
= 022,
112
0
2
B
2
0,B
Oi_ 11211
22j1221 +
OiimBm)liij = 0,
rnBm)1
2
O
211
2 = 0,
i=1,2
i=1,2
477
(A 11.1.6a)
(A 11.1.6b)
(A 11.1.7)
j with the assumption that 011j and 022, = 0 for j> m; and
j > n for the (n,m) model. In particular, for j
+221
4>221
and
for all
221
0
B
(1
and
(1
_
1
X
11
4)
1
4>121211
(A 11.1.8)
_
2
X
12
4)
1 + a
,
1
+flp
Z
1
rp
It is thus clear that the initial values of the parameters 4>s and 0s can
be computed from Eqs. (A 11.1.5) to (A.11.1.7) if the values of Ii,, and
max(n,m) are known. The values of the inverse
211, i= 1,2, j 1,2,
function coefficients for the series .7(, and X
, can be determined from the
2
parameters of the pure autoregressive model of order p. For example, let
us
autoregressive
consider
model of order p for the series X
the
,
1
X
1
1 = 4>
11 + 4>12112,_I
X
111
4)
1
, + _
2
X
20
Or
, = K
1
a
, +
1
+ 4)
2W2t-2 +
a,
a
1
0
_
1
11
11
X
212
4)
2
X + _
221
4>
_
1
X
21
4)
1
a
+ 4)
2
X...
122
1 + 4)
X_
221
2 + 1
X
120
4)
_
1
X
111
4)
2
_
1
X
112
1 + 4)
(A 11.1.10)
(A 11.1.9)
(2 +
m
1
122, +
Ifl
1212 +
4)212 = 212
4)211
0221
= 122
01111121
01111111
02211221
02211211
0221
1213 + 1223
4)112 = 412
= 1121 + 01111120
+ 0111
112 + 1122
1113 + 123
4)120 = 120
0111
(A iii)
(A 11111>
get the initial values. As in the case of AR models (see Chapter 4), oie
can also use more equations and obtain the estimates of initial values by
linear least squares.
In solving by the simultaneous equation method, some redundancy
arises since the 0s can be obtained from any one set of Eq. (A 11.1.7),
that is, equations either with Ij or 1,,. Rather than making an arbitrary
choice, a sensible method is to add the corresponding equations and then
solve the resultant m simultaneous equations. Substituting these in the first
n equations, one can recursively compute 4) as in the ARMA case.
I,,
0.2355,
213 =
0.0280
1212 =
0.2517,
0.3012,
121 =
211 =
0.0259
0.6125,
0.3093,
0.0195,
0.0387
0.0386
1
0.5228,
0.3298
0.752
1.5125
4)122 =
Im
0.5899
0.0666,
0.129,
0.3012,
4)121 =
4)
0.90,
0.028,
0.733,
a,, D
1
1
(3mDm
m<n
+ 13m_iDm_l +
a.)Y
(
2
t)
(1)
D +
1
a,,_
...
2
)
1
(D
)
(D
.
)
0
a
(D
..
(D,)
1)cY
Q
1
)
(A 11.2.2)
(A 11.2.1)
The procedure for obtaining the discrete transfer function for pulsed
inputs (that is, inputs that are assumed to be constant between the sampling
intervals) from the known transfer functions for the continuous time inputs
may be found in standard control theory books for deterministic sampled
data systems, see, for example, Kuo (1970). The relevant formulae in our
notation are given below.
Suppose that the differential equation relating the input Y
Q) with the
1
output Y
Q) is
2
Apndix A 11.2
Note that the method described here to obtain the initial values of the
parameters refers to the bivariate model of order (n,m). However, it is
easy to extend this method to a general multivariate model [(pi) inputs
and one output systems].
4)212
em
4)112 =
0111 =
0.0424,
0.2192,
411 =
0.1293,
ljo
479
For the papermaking data, the third order autoregressive model gives
2
D
1
cc
::
j..0
1u, (Dp.,)
130
1
=
T(D)
T(B)
=
B)
1
O
0
L
_CiXIB
i=0,1,...n;
(1
(A 11.2.3)
(A 11.2.5)
(A 11.2.4)
where
X=eI,
ci
(i)
C13,I4
(ILk
.!)
1, 2,.
(A 11.2.6)
(A 11.2.6a)
to (D
Pk
,,
Ck
g(1
e)
g(1
e)
22i
4
(1
1
4
B
1
)
1
D + c
0
ga
(
1
Y
0
t)
481
It is easily seen from Chapter 6 that the differential equation that gives
such a response is
(D + (
2
)
0
a
Y
t)
T(D)
D(D+c)
I1
1
B)[(
B
1
)
g(1
g(14
)
1
B
(1
Appendix A 11.3
K,,e(t) + K
IeQ)
dt + Kd
de(t)
(A 11.3.1)
where eQ) is the error or the deviation of the measured output variable
from the desired or target (set point) value. K,,, K,, and Kd are constants
that cai be adjusted on the controller, the first term being proportional
(P) to the error, the second and the third use integral (I) and derivative
(D) of the error. Hence the name P11) controller.
The liscrete form of equation (A 11.3.1) can be obtained by replacing
the integral by summation and the derivative by the difference. In our
usual notation of X
, and X
1
, for the deviations of the system input (or
2
,
1
X
K4
K
, + (lB)X2t +
2
K,X
, + K
2
K,X
1X + -VX,
1
Kd
(K +
1
_)x_
+ -X
,
2
(A 11.3.2)
(a)
(b)
,_
3
O.36X
2
2
_
21
,.. + O.32X
2
O.17X
1
,_
3
1.23X
1
,_ + a
3
O.64a
1
,
3
,_ + 0.26X
1
0.29X
6
7
_
11
,_ + a
2
0.36a
,
2
,_ + 2
2
0.11X
1
,_ 1
0.44X
(3),,.
,
2
X
PROBLEMS
Note the formal similarity of this equation with the control equation (11.3.15),
showing that equation (A 11.3.2) is a special case of the general one-input-.
0, one gets
one-output control equation (11.3.21). When K,
Kd
proportional onlyP-control as illustrated by Eq. (11.3.4) for the pa
permaking data. Similarly, when Kd
0 one gets proportional plus inte
gralPIcontrol, which is most prevalent in industry.
With the increasing use of digital computers or microprocessors, Eq.
(A 11.3.2) rather than (A 11.3.1) is used for process control. However,
such PID control assumes a specific form of the model and may lead to
suboptimal control when this model form is not appropriate.
,
1
X
or
483
[0.5
0.9]
10.4 o.ol
1]
o.i1
jl
[0.14
10.3
fo.i
[o.o
0.4
0.7
0.3]
0.21
,. + O.9X
1
0.25X
3
1
,
2
)k
,
2
+ 1
,... + a
2
0.3a
,
1
0.7X
,_
0.2X
2
Compute the initial values of the parameters of the input and output
series for this model. Assume the lag between the output-input is
one, L 1, and the system has feedback.
11.4 Verify Eq. (11.2.7) using the estimated values of the parameters of
adequate models for two input-one output paper pulping digester
data as given in Tables 11.3 to 11.5.
11.5 (a) Suppose you are a consultant to a chemical firm. If the manager
of that firm wishes to know about process control only from
the analysis and not from the engineering viewpoint, what is
the first step you would suggest?
(b) Suppose the management informs you that the output data of
the chemical process, when modeled, yield an MA(2) model.
Can you make any improvement in the sense of optimum con
trol? Explain your answer.
11.6 For a two input-one output ARMAV model with a given matrix,
derive the expressions for the matrices 4 and y,.
Suppose the models for a one input-one output system are obtained
as
,
1
0.25X,_ + a
Ii
1o[o
(c)
PROBLEMS
(a) Derive the optimal control equation using the minimum mean
squared error strategy.
(b) Obtain the optimally controlled output and its variance in terms
of error variance -y.
(c) If the lag in the above system is increased to four, compute
the percentage increase in the variation of the optimally con
trolled output.
11.8 The output X and the input X of a system are, related by the
following model:
1
_
21
0.2a
1
, + a,
2
0.3X
4 + 1.1X,_
_
1
O.5X
3
X
(a) Derive the optimal control equation.
(b) It is possible to reduce the dead time from three to two by
suitable design changes estimated to cost $1,000,000 (106) every
year. If the yearly savings from the reduction in the optimal
output variance is estimated as $100,000 (10) per one percent,
would you advise the management to make the design changes?
If yes, specify the yearly net savings; if no, specify the yearly.
net loss resulting from the changes.
11.9 Verify the results of Section 11.3.9.
11.10 In a certain economic system, it was decided to improve the fore
casts of the main series X by using the information from its leading.
indicator. The models fitted to two series are
= 1.0
,
1
, = 0.1X, + a
1
X
,
,
0
y
,
1.1X_
2 + 2
0.12X_
,_
2
X = 0.7X
0.3X + a,
3
_
11
y = 1.5
.
Suppose X
,_ 1.5, X, = 0.6,
1
X
=
, = 0.5, 1
1
,_ = 0.5, 2
X
= 0.8.
(a) Find the forecasts for the main series through = 5.
(b) Compute the corresponding forecast errors variances.
11.11 For the pulping digester data, assume that only 100 observationS.,.
are given. Using the models obtained in the text, find
(a) forecasts of the last 10 observations;
(b) the corresponding forecast errors;
(c) the corresponding 95% probability limits.
11.12 Consider Fig. 11.10. The forecasts seem to follow the series quite
well up to the first lag and then move away. Why does this happefl?J
Can you improve the forecasts? If yes, give the new forecaStS
probability limits and plot thcm as in Fig. 11.10 to show the iifl.
provement. If no, explain why not.
TABLE Al
TABLE A2
TABLE A3
TABLE A4
TABLE A5
TABLE A6
TABLE A7
TABLE AB
TABLE A9
APPENDIX
DATA LISTING
TABL
Al
E
0
University of Wisconsin Hospital In-patient Census April21
Through July 29, 1974 (Chapter 9).
TABLE All Monthly U.S. Consumer Price Index from 1953
Through
1970 (Chapter 9).
TABLE A12 Monthly U.S. Wholesale Price Index from 1953 Throug
h
1970 (Chapter 9).
TABLE Al 3 International Airline Passengers: Monthly Totals
(Thou
sands of Passengers) January 1949 through December 1960.
(Chapter 10).
TABLE Al 4 Coded Crack Length Data (Chapter 10).
TABLE A15 Wood Surface Profile (Chapter 10).
TABLE Al 6 Papermaking Machine Step Input Test Data
(Chapter 10).
485