6, JUNE 2001
I. INTRODUCTION
ITERATIVE learning control algorithms are reported and applied to fields spanning robotics [4]-[7] and [23]-[36]; for example, precision speed control of servomotors and its application to a VCR [1], cyclic production processes with application to extruders [2], and coil-to-coil control in rolling [3]. Several approaches for updating the control law with repeated trials on identical tasks, making use of stored preceding inputs and output errors, have been proposed and analyzed [4]-[38]. Several algorithms assume that the initial condition is fixed; that is, at each iteration the state is always initialized at the same point. For a learning procedure that automatically accomplishes this task, the reader may refer to [28] and [17]. For algorithms that employ more than one iteration of history data, refer to [15] and [16]. A majority of these methodologies are based on contraction-mapping requirements to develop sufficient conditions for convergence and/or boundedness of trajectories through repeated iterations. Consequently, a condition on the learning gain matrix $K$ is assumed in order to satisfy a necessary condition for contraction. Based on the nature of the control update law, conditions on $K$, such as $\|I - K\,CB\| < 1$ or $\|I - CB\,K\| < 1$, are imposed (where the matrix $CB$ is a function of the input/output coupling functions of the system).
Manuscript received September 1, 1999; revised April 25, 2000. Recommended by Associate Editor M. Polycarpou. This work was supported by the
University Research Council at the Lebanese American University.
The author is with Lebanese American University, Byblos 48328, Lebanon
(e-mail: ssaab@lau.edu.lb).
Publisher Item Identifier S 0018-9286(01)05129-7.
Consider the linear discrete-time system

$$x_k(t+1) = A\,x_k(t) + B\,u_k(t) + w_k(t), \qquad y_k(t) = C\,x_k(t) + v_k(t) \tag{1}$$

where, at iteration $k$ and time $t \in [0, T]$, $x_k(t) \in \mathbb{R}^n$ is the state, $u_k(t) \in \mathbb{R}^r$ is the input, $y_k(t) \in \mathbb{R}^m$ is the output, $w_k(t)$ is the state disturbance, and $v_k(t)$ is the measurement noise. The learning update is given by

$$u_{k+1}(t) = u_k(t) + K_k(t)\,e_k(t+1) \tag{2}$$

where $K_k(t) \in \mathbb{R}^{r \times m}$ is the learning control gain matrix and $e_k(t) = y_d(t) - y_k(t)$ is the output error, with $y_d(t)$ a realizable desired output trajectory. It is assumed that, for any realizable output trajectory and an appropriate initial condition $x_d(0)$, there exists a unique control input $u_d(t)$ generating the trajectory for the nominal plant. That is, the following difference equation is satisfied:

$$x_d(t+1) = A\,x_d(t) + B\,u_d(t), \qquad y_d(t) = C\,x_d(t). \tag{3}$$

Note that if $CB$ is full-column rank and $y_d(t)$ is a realizable output trajectory, then a unique input generating the output trajectory is given by $u_d(t) = [(CB)^T CB]^{-1}(CB)^T\,[\,y_d(t+1) - CA\,x_d(t)\,]$. Defining the state error $\delta x_k(t) = x_d(t) - x_k(t)$ and the input error $\delta u_k(t) = u_d(t) - u_k(t)$, the state error satisfies

$$\delta x_k(t+1) = A\,\delta x_k(t) + B\,\delta u_k(t) - w_k(t). \tag{4}$$

Subtracting (2) from $u_d(t)$, the input error at iteration $k+1$ is $\delta u_{k+1}(t) = \delta u_k(t) - K_k(t)\,e_k(t+1)$, where $e_k(t+1) = C\,\delta x_k(t+1) - v_k(t+1)$. Substituting the value of the state error in the previous equation, we have

$$\delta u_{k+1}(t) = \bigl[I - K_k(t)\,CB\bigr]\delta u_k(t) - K_k(t)\,CA\,\delta x_k(t) + K_k(t)\bigl[C\,w_k(t) + v_k(t+1)\bigr]. \tag{5}$$
Combining (4) and (5) in the two-dimensional Roesser model [39], we obtain

$$\begin{bmatrix} \delta x_k(t+1) \\ \delta u_{k+1}(t) \end{bmatrix} = \begin{bmatrix} A & B \\ -K_k(t)\,CA & I - K_k(t)\,CB \end{bmatrix} \begin{bmatrix} \delta x_k(t) \\ \delta u_k(t) \end{bmatrix} + \begin{bmatrix} -w_k(t) \\ K_k(t)\bigl[C\,w_k(t) + v_k(t+1)\bigr] \end{bmatrix}. \tag{6}$$

The following development is justified because the input vector of difference equation (6) has zero mean. Note that if the disturbances are nonstationary or colored noise, then the system can easily be augmented to cover this class of disturbances. Writing (6) in a compact form, we have

$$z_k(t+1) = \Phi_k(t)\,z_k(t) + \xi_k(t) \tag{7}$$

where $z_k(t) = [\delta x_k(t)^T \;\; \delta u_k(t)^T]^T$, $\Phi_k(t)$ is the block matrix of (6), and $\xi_k(t)$ collects the disturbance terms. Let

$$P_k(t) = E\bigl[z_k(t)\,z_k(t)^T\bigr] = \begin{bmatrix} P_{x,k}(t) & 0 \\ 0 & P_{u,k}(t) \end{bmatrix}$$

where $P_{x,k}(t) = E[\delta x_k(t)\,\delta x_k(t)^T]$ is an $n \times n$ matrix, $P_{u,k}(t) = E[\delta u_k(t)\,\delta u_k(t)^T]$ is an $r \times r$ matrix, and the zero submatrices are due to the zero cross-correlation between $\delta x_k(t)$ and $\delta u_k(t)$ (established below). As a consequence, $P_{x,k}(t)$ and $P_{u,k}(t)$ are nonnegative-semidefinite matrices. For compactness, we denote $Q(t) = E[w_k(t)\,w_k(t)^T]$ and $R(t) = E[v_k(t)\,v_k(t)^T]$, and we assume that $w_k(t)$ is uncorrelated with $v_k(t)$; then, at iterate $k$, $E[w_k(t)\,v_k(t)^T] = 0$. Taking the covariance of both sides of (7), we obtain

$$E\left\{\begin{bmatrix} \delta x_k(t+1) \\ \delta u_{k+1}(t) \end{bmatrix}\begin{bmatrix} \delta x_k(t+1) \\ \delta u_{k+1}(t) \end{bmatrix}^T\right\} = \Phi_k(t)\,P_k(t)\,\Phi_k(t)^T + E\bigl[\xi_k(t)\,\xi_k(t)^T\bigr]. \tag{8}$$

Expanding the left-hand terms of (8), we get expressions for $P_{x,k}(t+1)$ and $P_{u,k+1}(t)$. Consequently, the trace of the left-hand side of (8) is equivalent to the following:

$$\operatorname{trace}\bigl[P_{x,k}(t+1)\bigr] + \operatorname{trace}\bigl[P_{u,k+1}(t)\bigr]$$

where $P_{x,k}(t+1)$ is an $n \times n$ matrix and $P_{u,k+1}(t)$ is an $r \times r$ matrix. Next, we attempt to find a learning gain matrix $K_k(t)$ such that the trace of the error covariance matrix $P_{u,k+1}(t)$ is minimized. It is implicitly assumed that the input and state errors have zero mean, so it is proper to refer to $P_{u,k}(t)$ as a covariance matrix. Note that the mean-square error is used as the performance criterion because the trace is the summation of the error variances.
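To make the coupled error recursions concrete, here is a short simulation sketch of the state/input error propagation for one learning iteration. All matrices, the (fixed) gain, and the noise levels below are arbitrary placeholders, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# illustrative dimensions and matrices (arbitrary placeholders)
n, r, m, T = 3, 1, 2, 50
A = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.8, 0.2],
              [0.1, 0.0, 0.7]])
B = np.array([[0.0], [1.0], [1.0]])
C = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
K = 0.3 * np.ones((r, m))           # fixed learning gain, for illustration only

dx = np.zeros((T + 1, n))           # state error  delta x_k(t)
du = rng.normal(size=(T, r))        # input error  delta u_k(t) at iteration k
du_next = np.zeros_like(du)         # input error  delta u_{k+1}(t)

for t in range(T):
    w = 0.01 * rng.normal(size=n)   # state disturbance w_k(t)
    v = 0.01 * rng.normal(size=m)   # measurement noise v_k(t+1)
    # state error recursion: dx(t+1) = A dx(t) + B du(t) - w(t)
    dx[t + 1] = A @ dx[t] + B @ du[t] - w
    # output error: e_k(t+1) = C dx(t+1) - v(t+1)
    e = C @ dx[t + 1] - v
    # input error update across iterations: du_{k+1}(t) = du_k(t) - K e_k(t+1)
    du_next[t] = du[t] - K @ e
```

Running the loop once corresponds to one trial of the learning process; stacking the two recursions is exactly what the Roesser-model form packages into a single difference equation.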
Expanding $P_{u,k+1}(t)$ from (8), and anticipating that the cross terms vanish,

$$P_{u,k+1}(t) = \bigl[I - K_k(t)\,CB\bigr]P_{u,k}(t)\bigl[I - K_k(t)\,CB\bigr]^T + K_k(t)\bigl[CA\,P_{x,k}(t)(CA)^T + C\,Q(t)\,C^T + R(t+1)\bigr]K_k(t)^T. \tag{9}$$

Setting

$$\frac{\partial\,\operatorname{trace}\bigl[P_{u,k+1}(t)\bigr]}{\partial K_k(t)} = 0 \tag{10}$$

and solving for $K_k(t)$ yields the minimizing gain given below in (13). Next, we show that $E[\delta x_k(t)\,\delta u_k(t)^T] = 0$. This is accomplished by expressing the state error $\delta x_k(t)$ as a function of the initial state error $\delta x_k(0)$, the input errors $\delta u_k(j)$, and the disturbances $w_k(j)$ for $j < t$, and, similarly, the input error $\delta u_k(t)$ as a function of $\delta u_0(t)$, $\delta x_j(t)$, $w_j(t)$, and $v_j(t+1)$ for $j < k$, and then correlating $\delta x_k(t)$ and $\delta u_k(t)$. Iterating the argument of (4), we get

$$\delta x_k(t) = A^t\,\delta x_k(0) + \sum_{j=0}^{t-1} A^{t-1-j}\bigl[B\,\delta u_k(j) - w_k(j)\bigr] \tag{11}$$

and, iterating (5) with respect to the iteration index,

$$\delta u_k(t) = \prod_{j=0}^{k-1}\bigl[I - K_j(t)\,CB\bigr]\,\delta u_0(t) - \sum_{j=0}^{k-1}\Bigl\{\prod_{i=j+1}^{k-1}\bigl[I - K_i(t)\,CB\bigr]\Bigr\}K_j(t)\bigl[CA\,\delta x_j(t) - C\,w_j(t) - v_j(t+1)\bigr]. \tag{12}$$

Correlating the right-hand terms of (11) and (12), it can be readily concluded that $\delta x_k(0)$, $w_k(j)$, $w_j(t)$, and $v_j(t+1)$ are all mutually uncorrelated. In addition, these terms are also uncorrelated with $\delta u_0(t)$ and $\delta u_k(j)$, $j < t$. At this point, the correlation of (11) and (12) is equivalent to correlating $\sum_{j=0}^{t-1} A^{t-1-j} B\,\delta u_k(j)$ with the terms of (12) involving $\delta u_0(t)$ and $\delta x_j(t)$. Since the terms $\delta u_k(j)$, $j < t$, and $\delta u_0(t)$ cannot be represented as functions of one another, these terms are considered uncorrelated. Therefore, $E[\delta x_k(t)\,\delta u_k(t)^T] = 0$, and consequently, the cross terms in (8) vanish. Then, the expression of $P_{u,k+1}(t)$ is reduced to (9).
Taking the trace of (9),

$$\operatorname{trace}\bigl[P_{u,k+1}(t)\bigr] = \operatorname{trace}\Bigl\{\bigl[I - K_k(t)\,CB\bigr]P_{u,k}(t)\bigl[I - K_k(t)\,CB\bigr]^T\Bigr\} + \operatorname{trace}\bigl[K_k(t)\,H_k(t)\,K_k(t)^T\bigr] \tag{15}$$

where $H_k(t) = CA\,P_{x,k}(t)(CA)^T + C\,Q(t)\,C^T + R(t+1)$. Setting the gradient of (15) with respect to $K_k(t)$ to zero yields

$$K_k(t) = P_{u,k}(t)(CB)^T\bigl[CB\,P_{u,k}(t)(CB)^T + H_k(t)\bigr]^{-1} \tag{13}$$

and the input error covariance matrix becomes

$$P_{u,k+1}(t) = \bigl[I - K_k(t)\,CB\bigr]P_{u,k}(t)\bigl[I - K_k(t)\,CB\bigr]^T + K_k(t)\,H_k(t)\,K_k(t)^T. \tag{14}$$

Substitution of (13) into (15), equivalently into (14), yields

$$P_{u,k+1}(t) = \bigl[I - K_k(t)\,CB\bigr]P_{u,k}(t). \tag{16}$$

Claim 1: Assuming that $P_{u,k}(t)$ and $H_k(t)$ are symmetric positive definite, then

$$I - K_k(t)\,CB = \bigl[I + P_{u,k}(t)(CB)^T H_k(t)^{-1} CB\bigr]^{-1}. \tag{17}$$

Proof: Substituting (13) in $I - K_k(t)\,CB$, we get the left-hand side explicitly. One way to show the claimed equality is to multiply the left-hand side of (17) by the inverse of the right-hand side and show that the result is nothing but the identity matrix.
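As a numerical sanity check of this derivation, the following sketch (random placeholder matrices; `H` denotes the quantity $CA\,P_x(CA)^T + CQC^T + R$ assumed in this reconstruction) verifies the Claim 1 identity and the symmetry of the updated covariance:

```python
import numpy as np

rng = np.random.default_rng(1)
n, r, m = 3, 2, 3                    # m >= r so CB can be full-column rank

A = rng.normal(size=(n, n))
B = rng.normal(size=(n, r))
C = rng.normal(size=(m, n))
CB = C @ B                           # full-column rank for generic random data

def spd(k, scale=1.0):
    """Random symmetric positive-definite matrix."""
    M = rng.normal(size=(k, k))
    return scale * (M @ M.T) + np.eye(k)

Pu, Px = spd(r), spd(n)              # input / state error covariances
Q, R = spd(n, 0.1), spd(m, 0.1)      # disturbance / noise covariances

H = C @ A @ Px @ A.T @ C.T + C @ Q @ C.T + R
# learning gain of the form (13)
K = Pu @ CB.T @ np.linalg.inv(CB @ Pu @ CB.T + H)
# covariance update of the form (16)
Pu_next = (np.eye(r) - K @ CB) @ Pu

# Claim 1: I - K CB equals [I + Pu (CB)^T H^{-1} CB]^{-1}
lhs = np.eye(r) - K @ CB
rhs = np.linalg.inv(np.eye(r) + Pu @ CB.T @ np.linalg.inv(H) @ CB)
```

The check `np.allclose(lhs, rhs)` holds by the matrix inversion (Woodbury) lemma, and the updated covariance stays symmetric positive definite.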
IV. CONVERGENCE

In this section, we show that stochastic convergence is guaranteed in the presence of random disturbances if $CB$ is full-column rank. In particular, it is shown that the input error covariance matrix converges uniformly to zero in the presence of random disturbances (Theorem 3), whereas the state error covariance matrix converges uniformly to zero in the presence of biased measurement noise (Theorem 5). In the following, some useful results are first derived.

Proposition 1: If $CB$ is a full-column rank matrix, then $K_k(t) = 0$ if and only if $P_{u,k}(t) = 0$.

Proof: The sufficient condition is the trivial case [set $P_{u,k}(t) = 0$ in (13)]. Necessary condition: since $CB$ is full-column rank, $(CB)^T$ is a full-row rank matrix, and the gain (13) can be written as

$$K_k(t) = P_{u,k}(t)(CB)^T\bigl[CB\,P_{u,k}(t)(CB)^T + H_k(t)\bigr]^{-1}. \tag{18}$$

Using a well-known matrix inversion lemma [40], the inverse in (18) can be expanded as

$$\bigl[CB\,P_{u,k}(t)(CB)^T + H_k(t)\bigr]^{-1} = H_k(t)^{-1} - H_k(t)^{-1}CB\bigl[P_{u,k}(t)^{-1} + (CB)^T H_k(t)^{-1} CB\bigr]^{-1}(CB)^T H_k(t)^{-1}. \tag{19}$$

Substituting (19) into the right-hand side of (18), $K_k(t) = 0$ forces $P_{u,k}(t)(CB)^T = 0$; since $(CB)^T$ is full-row rank, it admits a right inverse, and therefore $P_{u,k}(t) = 0$.
where, again, $H_k(t) = CA\,P_{x,k}(t)(CA)^T + C\,Q(t)\,C^T + R(t+1)$. Since $H_k(t)$ is symmetric positive definite, the following holds.

Lemma 2: If $CB$ is full-column rank, then the learning algorithm, presented by (2), (13), and (16), guarantees that $P_{u,k}(t)$ is a symmetric positive-definite matrix for all $k \ge 0$ and $t \in [0, T]$. Moreover, the eigenvalues of $I - K_k(t)\,CB$ are positive and strictly less than one, i.e., $0 < \lambda_i\bigl(I - K_k(t)\,CB\bigr) < 1$ for all $i$, $k$, and $t$.

Proof: The proof proceeds by induction with respect to the iteration index $k$. By examining (14), since $P_{u,0}(t)$ is assumed to be a symmetric positive-definite matrix, $P_{u,1}(t)$ is symmetric and nonnegative definite. Define $D_k(t) = I + P_{u,k}(t)(CB)^T H_k(t)^{-1} CB$. Equation (17) implies that $I - K_k(t)\,CB = D_k(t)^{-1}$. Since $CB$ is full-column rank and $H_k(t)$ is symmetric positive definite, $(CB)^T H_k(t)^{-1} CB$ is symmetric positive definite. In addition, having $P_{u,k}(t)$ symmetric and positive definite implies that all eigenvalues of $P_{u,k}(t)(CB)^T H_k(t)^{-1} CB$ are positive. Therefore, the eigenvalues of $D_k(t)$ are strictly greater than one, which is equivalent to concluding that the eigenvalues of $I - K_k(t)\,CB = D_k(t)^{-1}$ are positive and strictly less than one. This implies that $I - K_k(t)\,CB$ is nonsingular. Equation (16) then implies that $P_{u,1}(t)$ is nonsingular; thus, $P_{u,1}(t)$ is a symmetric positive-definite matrix. We may now assume that $P_{u,k}(t)$ is symmetric positive definite. Equation (14) implies that $P_{u,k+1}(t)$ is symmetric and nonnegative definite. Using a similar argument as for $k = 0$ implies that $I - K_k(t)\,CB$, with strictly positive eigenvalues, is nonsingular, and consequently, $P_{u,k+1}(t)$ is symmetric and positive definite.

Remark: Since all the eigenvalues of $I - K_k(t)\,CB$ are strictly positive and strictly less than one for all $k$ and $t$ (Lemma 2), there exists a consistent norm $\|\cdot\|$ such that

$$\bigl\|I - K_k(t)\,CB\bigr\| \le \rho < 1 \quad \text{for all } k \text{ and } t. \tag{20}$$
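The eigenvalue claim of Lemma 2 is easy to probe numerically. The following is an illustration with random placeholder matrices (not a proof, and not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(2)
n, r, m = 4, 2, 3

A = rng.normal(size=(n, n))
B = rng.normal(size=(n, r))
C = rng.normal(size=(m, n))
CB = C @ B                          # full-column rank for generic random data

def spd(k, scale=1.0):
    """Random symmetric positive-definite matrix."""
    M = rng.normal(size=(k, k))
    return scale * (M @ M.T) + np.eye(k)

Pu, Px = spd(r), spd(n)
Q, R = spd(n, 0.1), spd(m, 0.1)
H = C @ A @ Px @ A.T @ C.T + C @ Q @ C.T + R

K = Pu @ CB.T @ np.linalg.inv(CB @ Pu @ CB.T + H)
ev = np.linalg.eigvals(np.eye(r) - K @ CB)
# per Lemma 2, these eigenvalues are real and lie strictly inside (0, 1)
```

Repeating the experiment with other seeds and dimensions (keeping $CB$ full-column rank) gives the same qualitative outcome.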
Theorem 3: If $CB$ is full-column rank, then the learning algorithm, presented by (2), (13), and (16), guarantees that $P_{u,k}(t)$ remains symmetric positive definite for all $k$. In addition, $P_{u,k}(t) \to 0$ and $K_k(t) \to 0$ uniformly in $t \in [0, T]$ as $k \to \infty$.

Proof: Since $P_{u,0}(t)$ is bounded in $t$, and $P_{u,k}(t)$ is nonincreasing in $k$ by (16) and Lemma 2, $P_{x,k}(t)$ is also bounded in $t$ and $k$. This can be seen by employing the state error covariance recursion

$$P_{x,k}(t+1) = A\,P_{x,k}(t)\,A^T + B\,P_{u,k}(t)\,B^T + Q(t). \tag{21}$$

Since $P_{x,k}(t)$ and $Q(t)$ are bounded, $H_k(t)$ is symmetric, positive definite, and bounded. Therefore, $(CB)^T H_k(t)^{-1} CB$ is symmetric, positive definite, and bounded away from zero. The latter, together with the Remark and (20), implies that the contraction factors in (16) are bounded away from one uniformly in $k$, and one may conclude that $P_{u,k}(t) \to 0$ uniformly in $t \in [0, T]$ as $k \to \infty$. Since $K_k(t)$ in (13) is continuous in $P_{u,k}(t)$ and vanishes with it (Proposition 1), $K_k(t) \to 0$ as well.
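A quick numerical illustration of Theorem 3's conclusion, with a placeholder system and the Kalman-filter-like gain and covariance recursions assumed in this reconstruction:

```python
import numpy as np

rng = np.random.default_rng(3)
n, r, m = 3, 1, 2
A = 0.5 * np.eye(n)
B = rng.normal(size=(n, r))
C = rng.normal(size=(m, n))
CB = C @ B                          # full-column rank (generic)

Pu = np.eye(r)                      # initial input error covariance
Px = np.eye(n)                      # state error covariance
Q = 0.01 * np.eye(n)
R = 0.01 * np.eye(m)

traces = []
for k in range(200):
    H = C @ A @ Px @ A.T @ C.T + C @ Q @ C.T + R
    K = Pu @ CB.T @ np.linalg.inv(CB @ Pu @ CB.T + H)
    Pu = (np.eye(r) - K @ CB) @ Pu              # update of the form (16)
    Px = A @ Px @ A.T + B @ Pu @ B.T + Q        # state covariance recursion
    traces.append(float(np.trace(Pu)))
```

The trace of the input error covariance decreases monotonically toward zero, mirroring the uniform convergence claimed by the theorem.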
Similarly, the remaining terms converge to zero. Therefore, the claimed uniform convergence follows.

Theorem 5: If $CB$ is a full-column rank matrix, then, in the absence of state disturbances and reinitialization errors (excluding biased measurement noise, provided the measurement noise covariance $R(t)$ is positive definite), the learning algorithm, presented by (2), (13), and (16), guarantees that $P_{u,k}(t) \to 0$. In addition, $K_k(t) \to 0$, and the state error covariance matrix $P_{x,k}(t) \to 0$ uniformly in $t \in [0, T]$ as $k \to \infty$.

Proof: Obviously, the results of the previous theorem still apply. Setting $w_k(t) = 0$ and $\delta x_k(0) = 0$, (11) is reduced to

$$\delta x_k(t) = \sum_{j=0}^{t-1} A^{t-1-j} B\,\delta u_k(j).$$

From the previous results, $P_{u,k}(t) \to 0$ uniformly in $t \in [0, T]$. Therefore, $P_{x,k}(t)$, a finite sum of terms weighted by the $P_{u,k}(j)$, also converges uniformly to zero. Again, the convergence $K_k(t) \to 0$ is straightforward. Consequently, uniform convergence for $P_{x,k}(t)$ follows.

V. APPLICATION

In this section, we go over the steps involved in the application of the proposed algorithm. Pseudocode is presented for both cases, where the term $CA\,P_{x,k}(t)(CA)^T$ is considered and where it is neglected. In addition, a numerical example is presented to illustrate the performance of the proposed algorithm.

A. Computer Algorithm

In order to apply the proposed learning control algorithm, the state error covariance matrix needs to be available. From Section III, one can extract the following recursion [cf. (21)]:

$$P_{x,k}(t+1) = A\,P_{x,k}(t)\,A^T + B\,P_{u,k}(t)\,B^T + Q(t) \tag{22}$$

initialized with

$$P_{x,k}(0) = E\bigl[\delta x_k(0)\,\delta x_k(0)^T\bigr]. \tag{23}$$

We assume that $A$, $B$, $C$, $Q(t)$, $R(t)$, $P_{u,0}(t)$, and $P_{x,k}(0)$ are all available. Starting with $k = 0$, the pseudocode is as follows:

Step 1) For $t = 0, 1, \ldots, T$:
a) apply $u_k(t)$ to the system described by (1), and find the output error $e_k(t+1)$;
b) employing (22), compute $P_{x,k}(t+1)$;
c) using (13), compute the learning gain $K_k(t)$;
d) using (2), update the control $u_{k+1}(t)$;
e) using (16), update $P_{u,k+1}(t)$.
Step 2) Set $k \leftarrow k+1$, and go to Step 1).

In the case where the term $CA\,P_{x,k}(t)(CA)^T$ may be neglected, the pseudocode is the same as the previous one, excluding item b) and using (24) instead of (13) to compute the learning gain $K_k(t)$.

B. Numerical Example

The example uses an integration period of ... second; $w_k(t)$, $v_k(t)$, and the initial input error are normally distributed white random processes with variances ..., ..., and 1, respectively. All the disturbances are assumed to be unbiased except for the measurement noise, whose mean is ... . The desired trajectories, with domain $t \in [0, 1]$ (a one-second interval), are the trajectories associated with ... . The initial input for the generic and proposed algorithms is $u_0(t) = \ldots$, and the initial input error covariance scalar (single input) is $P_{u,0}(t) = \ldots$.

Next, the generic algorithm requires the control gain $K$ to satisfy $|1 - K\,CB| < 1$. It is noted that if $K$ is chosen such that $|1 - K\,CB|$ is close to zero, then, in the iterative domain, a fast transient response is noticed, but at the cost of larger steady-state errors. However, if $K$ is chosen such that $|1 - K\,CB|$ is close to one, then smaller steady-state errors are noticed at the cost of a very slow transient response. Consequently, the generic gain $K$ is chosen such that $|1 - K\,CB| = \ldots$ for all $t$.

TABLE I
PERFORMANCE OF PROPOSED ALGORITHM
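The pseudocode steps above can be rendered as a minimal Python sketch. The plant, noise levels, and desired trajectory here are arbitrary stand-ins (not the paper's numerical example), and the gain/covariance formulas follow the Kalman-filter-like structure assumed in this reconstruction:

```python
import numpy as np

rng = np.random.default_rng(4)
n, r, m, T = 2, 1, 1, 20

A = np.array([[0.8, 0.1], [0.0, 0.5]])
B = np.array([[0.0], [1.0]])
C = np.array([[0.0, 1.0]])
CB = C @ B                                   # = [[1.0]], full-column rank
Q = 1e-4 * np.eye(n)                         # state disturbance covariance
R = 1e-4 * np.eye(m)                         # measurement noise covariance

# a realizable desired trajectory, generated by a known input u_d
u_d = np.sin(np.linspace(0.0, 2.0 * np.pi, T))[:, None]
x_d = np.zeros((T + 1, n))
for t in range(T):
    x_d[t + 1] = A @ x_d[t] + B @ u_d[t]
y_d = (C @ x_d[1:].T).T                      # desired outputs y_d(t+1)

u = np.zeros((T, r))                         # initial input u_0(t) = 0
Pu = [np.eye(r) for _ in range(T)]           # input error covariance per t
out_err = []                                 # mean |e_k| per iteration

for k in range(30):
    x = np.zeros(n)                          # fixed initial condition
    Px = np.zeros((n, n))                    # state error covariance
    abs_e = []
    for t in range(T):
        # Step 1a) apply u_k(t) and observe the output error e_k(t+1)
        w = rng.multivariate_normal(np.zeros(n), Q)
        v = rng.multivariate_normal(np.zeros(m), R)
        x_next = A @ x + B @ u[t] + w
        e = y_d[t] - (C @ x_next + v)
        # Step 1b) propagate the state error covariance
        Px_next = A @ Px @ A.T + B @ Pu[t] @ B.T + Q
        # Step 1c) learning gain (Kalman-filter-like form)
        H = C @ A @ Px @ A.T @ C.T + C @ Q @ C.T + R
        K = Pu[t] @ CB.T @ np.linalg.inv(CB @ Pu[t] @ CB.T + H)
        # Step 1d) control update; Step 1e) input covariance update
        u[t] = u[t] + K @ e
        Pu[t] = (np.eye(r) - K @ CB) @ Pu[t]
        x, Px = x_next, Px_next
        abs_e.append(float(np.abs(e).max()))
    out_err.append(float(np.mean(abs_e)))
```

After the learning iterations, the applied input tracks the trajectory-generating input up to the noise floor, and the recorded output error shrinks from its initial level.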
Performance: In order to quantify the statistical and deterministic size of the input and state errors after a fixed number of iterations, we use three different measures. For statistical evaluation, we measure the variance and the mean-square error of each variable. For deterministic examination, the infinity norm is measured. Since the convergence rate as the number of iterations increases is also of importance, the accumulation of each of these measures (variance, mean square, and infinity norm) is also computed. The superiority of the proposed algorithm can be detected by examining both Table I and Fig. 1. In Fig. 1, the solid and dashed lines represent the employment of the proposed and generic gains, respectively. The top plot of Fig. 2 shows the evolution of the input error variance, whereas the bottom plot shows the confirmation of Proposition 1. In addition, an examination of Table I and Fig. 2 (top) shows the close conformity between the computed input error variance and the estimated one generated by the proposed algorithm.
VI. CONCLUSION

This paper has formulated a computational algorithm for the learning gain matrix, which stochastically minimizes, in a least-squares sense, trajectory errors in the presence of random disturbances. It is shown that if $CB$ is full-column rank, then the input error covariance matrix converges uniformly to zero in the presence of random disturbances, and the state error covariance matrix converges uniformly to zero in the presence of biased measurement noise.
) with
the sampling period. For a small sampling
period, one may approximate
. There are several other mathematical applications
and the sampling period does not have to
where
be small, e.g., if the rows of the state matrix corresponding to
th output is equal to the associated row of the output coupling
, and only the second
matrix . For example, if
. Therefore, the
row of is equal to , then
learning gain matrix is reduced to
(24)
Note that, by using the learning gain of (24) [instead of (13)], all
the results in Sections IV and V still apply as presented. Consequently, the computer algorithm does not require the update of
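A sketch of a reduced gain of the form (24), which avoids propagating the state error covariance altogether. The matrix values are arbitrary placeholders, and the exact reduced form in the original may differ:

```python
import numpy as np

rng = np.random.default_rng(5)
n, r, m = 4, 2, 3
B = rng.normal(size=(n, r))
C = rng.normal(size=(m, n))
CB = C @ B                                   # full-column rank (generic)

Pu = np.eye(r)                               # current input error covariance
Q = 0.01 * np.eye(n)                         # state disturbance covariance
R = 0.01 * np.eye(m)                         # measurement noise covariance

# reduced gain: the state-error-covariance term is dropped,
# so no P_x recursion has to be maintained between time steps
K = Pu @ CB.T @ np.linalg.inv(CB @ Pu @ CB.T + C @ Q @ C.T + R)
Pu_next = (np.eye(r) - K @ CB) @ Pu
```

The covariance update retains its contraction property: the trace of the input error covariance still strictly decreases at each iteration.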
Consider the nonlinear system

$$x_k(t+1) = f\bigl(x_k(t), t\bigr) + B\,u_k(t) + w_k(t), \qquad y_k(t) = C\,x_k(t) + v_k(t) \tag{25}$$

where $f(\cdot, \cdot)$ is a vector-valued function with range in $\mathbb{R}^n$. Define $\delta f_k(t) = f(x_d(t), t) - f(x_k(t), t)$. We assume that the initial state $x_k(0)$ and a desired trajectory $y_d(t)$ are given and assumed to be realizable; that is, there exist bounded $x_d(t)$ and $u_d(t)$ such that $x_d(t+1) = f(x_d(t), t) + B\,u_d(t)$ and $y_d(t) = C\,x_d(t)$. The restrictions on the noise inputs $w_k(t)$ and $v_k(t)$ are as assumed in Section II. Again, we define the output error $e_k(t) = y_d(t) - y_k(t)$, the state error vector $\delta x_k(t) = x_d(t) - x_k(t)$, and the input error vector $\delta u_k(t) = u_d(t) - u_k(t)$.
It is assumed that the function $f(x, t)$ is uniformly globally Lipschitz in $x$ for $t \in [0, T]$. Borrowing the results from Section III, we find that the learning gain that minimizes the trace of $P_{u,k+1}(t)$ is similar to the one presented by (13). In particular, defining

$$H_k(t) = C\,E\bigl[\delta f_k(t)\,\delta f_k(t)^T\bigr]C^T + C\,Q(t)\,C^T + R(t+1)$$

then

$$K_k(t) = P_{u,k}(t)(CB)^T\bigl[CB\,P_{u,k}(t)(CB)^T + H_k(t)\bigr]^{-1}. \tag{26}$$

Consider next the system

$$x_k(t+1) = f\bigl(x_k(t), t\bigr) + B\,u_k(t) + \bar{w}(t) \tag{27}$$

where $\bar{w}(t)$ may represent a repetitive state disturbance. Let $g(x, t) = f(x, t) + \bar{w}(t)$; then $g(x, t)$ is also uniformly globally Lipschitz in $x$ for $t \in [0, T]$. The error equation corresponding to (27) is given by

$$\delta x_k(t+1) = \delta f_k(t) + B\,\delta u_k(t) \tag{28}$$

where $\delta f_k(t) = f(x_d(t), t) - f(x_k(t), t)$, and since $\delta f_k(t)$ is a function of $\delta x_k(t)$, the development of Section III carries over.
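As an illustration of the nonlinear case, the following sketch reuses the same gain structure with a Lipschitz-bound surrogate for the nonlinear error covariance. Everything here (the function `f`, the constant `c_lip`, the covariance propagation) is an assumed stand-in, not the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(6)
n, r, m, T = 2, 1, 1, 15

B = np.array([[0.0], [1.0]])
C = np.array([[0.0, 1.0]])
CB = C @ B                                   # = [[1.0]]
Q = 1e-4 * np.eye(n)
R = 1e-4 * np.eye(m)
c_lip = 1.0                                  # assumed Lipschitz constant of f

def f(x, t):
    # an illustrative globally Lipschitz nonlinearity (constant <= 1)
    return np.array([0.5 * np.tanh(x[0]) + 0.1 * x[1],
                     0.4 * np.sin(x[0]) + 0.5 * x[1]])

# realizable desired trajectory generated by a known input u_d
u_d = 0.5 * np.cos(np.linspace(0.0, np.pi, T))[:, None]
x_d = np.zeros((T + 1, n))
for t in range(T):
    x_d[t + 1] = f(x_d[t], t) + B @ u_d[t]
y_d = (C @ x_d[1:].T).T

u = np.zeros((T, r))
Pu = [np.eye(r) for _ in range(T)]
out_err = []

for k in range(40):
    x = np.zeros(n)
    Px = np.zeros((n, n))
    abs_e = []
    for t in range(T):
        w = rng.multivariate_normal(np.zeros(n), Q)
        v = rng.multivariate_normal(np.zeros(m), R)
        x_next = f(x, t) + B @ u[t] + w
        e = y_d[t] - (C @ x_next + v)
        # Lipschitz-bound surrogate c^2 * Px for the nonlinear error covariance
        H = (c_lip ** 2) * (C @ Px @ C.T) + C @ Q @ C.T + R
        K = Pu[t] @ CB.T @ np.linalg.inv(CB @ Pu[t] @ CB.T + H)
        u[t] = u[t] + K @ e
        Pu[t] = (np.eye(r) - K @ CB) @ Pu[t]
        Px = (c_lip ** 2) * Px + B @ Pu[t] @ B.T + Q   # conservative propagation
        x = x_next
        abs_e.append(float(np.abs(e).max()))
    out_err.append(float(np.mean(abs_e)))
```

The recorded output error decreases over the learning iterations despite the nonlinearity, consistent with the convergence statements that follow.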
Therefore, borrowing some of the results presented in Sections IV and V, one can conclude the following.

Theorem 6: Consider the system presented in (25). If $CB$ is full-column rank and $H_k(t)$ is nonsingular for all $k$ and $t$, then the learning algorithm, presented by (2), (26), and (16), guarantees that $P_{u,k}(t)$ remains symmetric positive definite. In addition, $P_{u,k}(t) \to 0$ and $K_k(t) \to 0$ uniformly in $t \in [0, T]$ as $k \to \infty$.

Note that a Lipschitz requirement on $f(x, t)$ is not necessary here. If, however, the function $f(x, t)$ is uniformly globally Lipschitz in $x$ for $t \in [0, T]$, then there exists a positive constant $c$ such that $\|\delta f_k(t)\| \le c\,\|\delta x_k(t)\|$.
Theorem 7: Consider the system presented in (25). If $CB$ is full-column rank and $f(x, t)$ is uniformly globally Lipschitz in $x$ for $t \in [0, T]$, then the learning algorithm, presented by (2), (26), and (16), guarantees that $P_{u,k}(t)$ remains symmetric positive definite. In addition, $P_{u,k}(t) \to 0$ and $K_k(t) \to 0$ uniformly in $t \in [0, T]$ as $k \to \infty$.

Proof: The proof is the same as the corresponding part of the proof of Theorem 3, except for the nonsingularity of $H_k(t)$. Since $f(x, t)$ is Lipschitz, there exists a positive constant $c$ such that $\|\delta f_k(t)\| \le c\,\|\delta x_k(t)\|$. Borrowing the results given in [14, Th. 1], the infinite norms of the state errors are bounded. Therefore, $H_k(t)$ is bounded for all $k$ and, since $R(t+1)$ is positive definite, is nonsingular.
Substituting $c^2\,P_{x,k}(t)$ in place of $E[\delta f_k(t)\,\delta f_k(t)^T]$ in the definition of the matrix $H_k(t)$ of (26), the following results are concluded.

Theorem 8: If $CB$ is a full-column rank matrix, then, for any positive constant $c$, the learning algorithm, presented by (2), (26), and (16), will generate a sequence of inputs such that the infinite norms of the input, output, and state errors given by (28) decrease exponentially in the iteration index $k$. In addition, $P_{u,k}(t) \to 0$.
Proof: Borrowing the results of [14, Corollary 2], the infinite norms of the input, output, and state errors given by (28) decrease exponentially in $k$. The results related to the input error covariance matrix can be concluded from the previous subsection.
REFERENCES

[1] Y.-H. Kim and I.-J. Ha, "A learning approach to precision speed control of servomotors and its application to a VCR," IEEE Trans. Contr. Syst. Technol., vol. 7, pp. 466-477, July 1999.
[2] M. Pandit and K.-H. Buchheit, "Optimizing iterative learning control of cyclic production process with application to extruders," IEEE Trans. Contr. Syst. Technol., vol. 7, pp. 382-390, May 1999.
[3] S. S. Garimella and K. Srinivasan, "Application of iterative learning control to coil-to-coil control in rolling," IEEE Trans. Contr. Syst. Technol., vol. 6, pp. 281-293, Mar. 1998.
[4] S. Arimoto, S. Kawamura, and F. Miyazaki, "Bettering operation of robots by learning," J. Robot. Syst., vol. 1, pp. 123-140, 1984.
[5] S. Arimoto, "Learning control theory for robotic motion," Int. J. Adaptive Control Signal Processing, vol. 4, pp. 544-564, 1990.
[6] S. Arimoto, T. Naniwa, and H. Suziki, "Robustness of P-type learning control theory with a forgetting factor for robotic motions," in Proc. 29th IEEE Conf. Decision Control, Honolulu, HI, Dec. 1990, pp. 2640-2645.
[7] S. Arimoto, S. Kawamura, and F. Miyazaki, "Convergence, stability, and robustness of learning control schemes for robot manipulators," in Int. Symp. Robot Manipulators: Modeling, Control, Education, Albuquerque, NM, 1986, pp. 307-316.
[8] G. Heinzinger, D. Fenwick, B. Paden, and F. Miyaziki, "Stability of learning control with disturbances and uncertain initial conditions," IEEE Trans. Automat. Contr., vol. 37, pp. 110-114, Jan. 1992.
[9] J. Hauser, "Learning control for a class of nonlinear systems," in Proc. 26th Conf. Decision Control, Los Angeles, CA, Dec. 1987, pp. 859-860.
[10] A. Hac, "Learning control in the presence of measurement noise," in Proc. American Control Conf., 1990, pp. 2846-2851.