8, AUGUST 2001
1333
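Samer S. Saab

Abstract: In an earlier paper, the learning gain for a D-type learning algorithm is derived based on minimizing the trace of the input error covariance matrix for linear time-varying systems. It is shown that, if the product of the input/output coupling matrices is full-column rank, then the input error covariance matrix converges uniformly to zero in the presence of uncorrelated random disturbances, whereas the state error covariance matrix converges uniformly to zero in the presence of measurement noise. However, in general, the proposed algorithm requires knowledge of the state matrix. In this note, it is shown that equivalent results can be achieved without the knowledge of the state matrix. Furthermore, the convergence rate of the input error covariance matrix is shown to be inversely proportional to the number of learning iterations.

Index Terms: Iterative learning control, stochastic control.

NOMENCLATURE

$K_k$ and $P_{u,k}$: Learning gain and the input error covariance matrices resulting from the optimal control algorithm presented in [1].
$\bar{K}_k$, $\bar{P}_{u,k}$: Learning gain and sequence of matrices (analogous to $P_{u,k}$) used to define the modified learning algorithm proposed in this note.
$\hat{P}_{u,k}$: Actual input error covariance matrices resulting from the proposed (modified) learning algorithm.

I. PRELIMINARY

Consider the discrete-time linear time-varying system

$$x(t+1,k) = A(t)x(t,k) + B(t)u(t,k) + w(t,k), \qquad y(t,k) = C(t)x(t,k) + v(t,k) \tag{1}$$

where $t \in [0, n_t]$; $x(t,k) \in \mathbb{R}^n$; $u(t,k) \in \mathbb{R}^p$; $w(t,k) \in \mathbb{R}^n$; $y(t,k) \in \mathbb{R}^q$; $v(t,k) \in \mathbb{R}^q$; and $A(t)$, $B(t)$, and $C(t)$ are matrices of appropriate dimensions. The learning update is given by

$$u(t,k+1) = u(t,k) + K(t,k)\,[e(t+1,k) - e(t,k)] \tag{2}$$

where $K(t,k)$ is the learning gain and $e(t,k) \triangleq y_d(t) - y(t,k)$ is the output error with respect to the desired output trajectory $y_d(t)$. It is assumed that there exists a unique control input $u_d(t) \in \mathbb{R}^p$ generating the trajectory for the nominal plant. That is, the following difference equation is satisfied:

$$x_d(t+1) = A(t)x_d(t) + B(t)u_d(t).$$

Define the state and the input error vectors as $\delta x(t,k) \triangleq x_d(t) - x(t,k)$ and $\delta u(t,k) \triangleq u_d(t) - u(t,k)$, respectively. It is assumed that the initial state error $\delta x(0,k)$, the initial input error $\delta u(t,0)$, the state disturbance $w(t,k)$, and the unbiased measurement error $v(t,k)$ are all modeled as zero-mean white Gaussian noise and are statistically independent. Define the input error and state error covariance matrices as $P_{u,k} \triangleq E[\delta u(t,k)\,\delta u(t,k)^T]$ and $P_{x,k} \triangleq E[\delta x(t,k)\,\delta x(t,k)^T]$, respectively, where $E$ is the expectation operator. It is shown in [1] that the learning gain, which minimizes the trace of the input error covariance matrix, is given by

$$K_k = P_{u,k}(C^+B)^T\left[(C^+B)P_{u,k}(C^+B)^T + (C - C^+A)P_{x,k}(C - C^+A)^T + C^+Q_t(C^+)^T + R_t + R_{t+1}\right]^{-1} \tag{4}$$

where the argument $t$ is dropped for compactness, $C^+ = C(t+1)$, and $Q_t$ and $R_t$ are the covariance matrices of $w(t,k)$ and $v(t,k)$, respectively. The corresponding input error covariance matrix satisfies

$$P_{u,k+1} = (I - K_kN)P_{u,k} \tag{5}$$

$$\phantom{P_{u,k+1}} = [I + P_{u,k}N^TS_{1,k}^{-1}N]^{-1}P_{u,k} \tag{6}$$

where

$$N \triangleq C^+B, \qquad S_{1,k} \triangleq (C - C^+A)P_{x,k}(C - C^+A)^T + S_2, \qquad S_2 \triangleq C^+Q_t(C^+)^T + R_t + R_{t+1}.$$

In the modified algorithm, the learning gain is taken as

$$\bar{K}_k = \bar{P}_{u,k}N^T[N\bar{P}_{u,k}N^T + S_2]^{-1} \tag{7}$$

and the associated sequence of matrices is updated as

$$\begin{aligned}
\bar{P}_{u,k+1} &= (I - \bar{K}_kN)\bar{P}_{u,k}(I - \bar{K}_kN)^T + \bar{K}_kS_2\bar{K}_k^T \\
&= \bar{P}_{u,k} - \bar{K}_kN\bar{P}_{u,k} - \bar{P}_{u,k}N^T\bar{K}_k^T + \bar{K}_k(N\bar{P}_{u,k}N^T + S_2)\bar{K}_k^T.
\end{aligned}$$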
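Note that this learning gain can no longer be claimed to be the optimal gain matrix. Substituting the value of $\bar{K}_k$ into the last equality, we get

$$\begin{aligned}
\bar{P}_{u,k+1} &= \bar{P}_{u,k} - \bar{P}_{u,k}N^T(N\bar{P}_{u,k}N^T + S_2)^{-1}N\bar{P}_{u,k} - \bar{P}_{u,k}N^T(N\bar{P}_{u,k}N^T + S_2)^{-1}N\bar{P}_{u,k} \\
&\quad + \bar{P}_{u,k}N^T(N\bar{P}_{u,k}N^T + S_2)^{-1}N\bar{P}_{u,k} \\
&= \bar{P}_{u,k} - \bar{P}_{u,k}N^T(N\bar{P}_{u,k}N^T + S_2)^{-1}N\bar{P}_{u,k} \\
&= (I - \bar{K}_kN)\bar{P}_{u,k} \\
&= [I + \bar{P}_{u,k}N^TS_2^{-1}N]^{-1}\bar{P}_{u,k}.
\end{aligned} \tag{8}$$

It is worthwhile noting that by eliminating $(C - C^+A)P_{x,k}(C - C^+A)^T$, the matrix $S_1$ becomes $S_2$, and consequently the results of $\bar{P}_{u,k+1}$, that is $\bar{P}_{u,k+1} = (I - \bar{K}_kN)\bar{P}_{u,k}$, and (8) are consistent with (5) and (6), respectively. Since the term $(C - C^+A)P_{x,k}(C - C^+A)^T$ is eliminated in the learning matrix $\bar{K}_k$ and the update of the error covariance matrix, then 1) knowledge of the state matrix is no longer needed, and 2) $\bar{P}_{u,k}$ no longer represents the true input error covariance matrix.

Theorem 2: If $N$ is full-column rank and $\bar{P}_{u,0}$ is a symmetric positive-definite matrix, then the learning algorithm, presented by (2), (7), and (8), guarantees that

$$\|\bar{P}_{u,k}\| < \frac{1}{k}c_1 \tag{10}$$

where $c_1 \triangleq 1/\lambda_{\min}[(N^TS_2^{-1}N)]$. Let $c_2 \triangleq c_1\|N\|\,\|S_2^{-1}\|$, then $\|\bar{K}(t,k)\| < (1/k)c_2$.

Proof: From (9),

$$\|\bar{P}_{u,k}\| = \|(\bar{P}_{u,0}^{-1} + kN^TS_2^{-1}N)^{-1}\| = 1/\lambda_{\min}[(\bar{P}_{u,0}^{-1} + kN^TS_2^{-1}N)].$$

Note that since $\bar{P}_{u,0}^{-1}$ is a symmetric positive-definite matrix, then

$$\lambda_{\min}[(\bar{P}_{u,0}^{-1} + kN^TS_2^{-1}N)] > \lambda_{\min}[(kN^TS_2^{-1}N)] = k\,\lambda_{\min}[(N^TS_2^{-1}N)].$$

Therefore

$$\|\bar{P}_{u,k}\| < \frac{1}{k}\,\frac{1}{\lambda_{\min}[(N^TS_2^{-1}N)]} = \frac{1}{k}c_1.$$

The bound on the gain follows from (7), since $\|\bar{K}_k\| \le \|\bar{P}_{u,k}\|\,\|N\|\,\|(N\bar{P}_{u,k}N^T + S_2)^{-1}\| \le \|\bar{P}_{u,k}\|\,\|N\|\,\|S_2^{-1}\| < (1/k)c_2$.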
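As a quick numerical illustration of the bounds in Theorem 2, the following sketch iterates (7) and (8) in the scalar case (p = q = 1), where N and S2 reduce to numbers and the norms reduce to absolute values. The function name `iterate_modified_gain` and all numerical values are assumptions chosen for illustration only; exact rational arithmetic is used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction


def iterate_modified_gain(p0, n, s2, iterations):
    """Iterate the scalar form of (7) and (8).

    Returns a list of (k, p_bar_k, gain_k) for k = 0, ..., iterations.
    """
    p_bar = p0
    rows = []
    for k in range(iterations + 1):
        gain = p_bar * n / (n * p_bar * n + s2)   # scalar version of (7)
        rows.append((k, p_bar, gain))
        p_bar = (1 - gain * n) * p_bar            # scalar version of (8)
    return rows


# Hypothetical values: N = 1/2, S2 = 3, initial value of the sequence = 4.
p0, n, s2 = Fraction(4), Fraction(1, 2), Fraction(3)
rows = iterate_modified_gain(p0, n, s2, 40)

c1 = s2 / (n * n)        # 1 / lambda_min(N^T S2^{-1} N) for scalars
c2 = c1 * abs(n) / s2    # c1 * |N| * |S2^{-1}|

# Check the bound (10) and the gain bound for every k >= 1.
p_bound_holds = all(p < c1 / k for k, p, _ in rows if k >= 1)
gain_bound_holds = all(abs(g) < c2 / k for k, _, g in rows if k >= 1)
print(p_bound_holds, gain_bound_holds)
```

In this scalar setting both checks succeed for every iteration tested, which is consistent with the 1/k decay stated in the theorem; the sketch says nothing about the matrix case beyond what the proof establishes.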
In the following, the convergence characteristics of $\bar{K}_k$ and $\bar{P}_{u,k}$ are shown to be equivalent to those of $K_k$ and $P_{u,k}$.
Theorem 1: If $N \triangleq C(t+1)B(t)$ is full-column rank, then the learning algorithm, presented by (2), (7), and (8), guarantees the following:
2) $\bar{P}_{u,k}$ is a symmetric positive-definite matrix $\forall k$, and $t \in [0, n_t]$;
3) the eigenvalues of $I - \bar{K}_kN$ are positive and strictly less than one; i.e., $0 < \lambda(I - \bar{K}_kN) < 1$ $\forall k$, and $t \in [0, n_t]$;
4) $\|\bar{P}_{u,k+1}\| < \|\bar{P}_{u,k}\|$ $\forall k$. In addition, $\bar{P}_{u,k} \to 0$ and $\bar{K}_k \to 0$ uniformly in $[0, n_t]$ as $k \to \infty$.
The proofs of Theorem 1 are identical to the proofs of their counterparts
presented in [1], and are thus omitted.
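To make the learning algorithm defined by (2), (7), and (8) concrete, the sketch below applies it to a hypothetical noise-free, time-invariant scalar plant with zero initial state error. The plant parameters, the function names `plant_outputs` and `run_learning`, and the use of the same initial value of the sequence in (8) for every t are all assumptions made for illustration. With zero initial state error, e(1,k) - e(0,k) equals N times the input error at t = 0, so that error is multiplied by the factor 1 - K N at each iteration.

```python
from fractions import Fraction


def plant_outputs(a, b, c, u):
    """Outputs y(0..T) of x(t+1) = a x(t) + b u(t), y(t) = c x(t), x(0) = 0."""
    x = Fraction(0)
    ys = [c * x]
    for u_t in u:
        x = a * x + b * u_t
        ys.append(c * x)
    return ys


def run_learning(a, b, c, u_desired, p0, s2, iterations):
    """Apply the learning law (2) with the modified gain (7) and update (8)."""
    n = c * b                                  # scalar N = C(t+1) B(t)
    y_desired = plant_outputs(a, b, c, u_desired)
    u = [Fraction(0)] * len(u_desired)         # initial input guess
    p_bar = p0
    input_errors, p_bars, factors = [], [], []
    for _ in range(iterations):
        input_errors.append([ud - uk for ud, uk in zip(u_desired, u)])
        p_bars.append(p_bar)
        y = plant_outputs(a, b, c, u)
        e = [yd - yk for yd, yk in zip(y_desired, y)]
        gain = p_bar * n / (n * p_bar * n + s2)            # (7)
        factors.append(1 - gain * n)                       # I - K N
        u = [u[t] + gain * (e[t + 1] - e[t]) for t in range(len(u))]  # (2)
        p_bar = (1 - gain * n) * p_bar                     # (8)
    return input_errors, p_bars, factors


# Hypothetical plant and design values.
u_desired = [Fraction(1), Fraction(-2), Fraction(3)]
errors, p_bars, factors = run_learning(
    Fraction(1, 2), Fraction(1), Fraction(2), u_desired,
    Fraction(1), Fraction(4), 25)

# Item 3) of Theorem 1 in the scalar case: 0 < 1 - K N < 1.
factors_in_unit_interval = all(0 < f < 1 for f in factors)
# The input error at t = 0 shrinks by the same factors as the sequence in (8).
first_error_tracks_p_bar = all(
    errors[k][0] == errors[0][0] * p_bars[k] / p_bars[0]
    for k in range(len(errors)))
print(factors_in_unit_interval, first_error_tracks_p_bar)
```

Only the behavior at t = 0 is asserted here, because the errors at later time steps are also driven by the state error and their transient is not captured by a single contraction factor.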
In what follows, we show that as $k \to \infty$, $\bar{P}_{u,k} \to 0$ (and consequently $\bar{K}_k \to 0$) uniformly in $[0, n_t]$ if and only if the true input error covariance matrix $\hat{P}_{u,k} \to 0$ uniformly in $[0, n_t]$. This fact underlines the main contribution of this manuscript. In addition, we show that the convergence is inversely proportional to the number of learning iterations.
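Claim 1: If $N$ is full-column rank, then the learning algorithm, presented by (2), (7), and (8), assures that

$$\bar{P}_{u,k} = [I + k\bar{P}_{u,0}N^TS_2^{-1}N]^{-1}\bar{P}_{u,0}. \tag{9}$$

Proof: The proof is by induction. For $k = 1$, (8) gives $\bar{P}_{u,1} = [I + \bar{P}_{u,0}M]^{-1}\bar{P}_{u,0}$, where $M \triangleq N^TS_2^{-1}N$. Assume that $\bar{P}_{u,k-1} = [I + (k-1)\bar{P}_{u,0}M]^{-1}\bar{P}_{u,0}$. Then, by (8),

$$\begin{aligned}
\bar{P}_{u,k} &= [I + \bar{P}_{u,k-1}M]^{-1}\bar{P}_{u,k-1} \\
&= \left\{I + [I + (k-1)\bar{P}_{u,0}M]^{-1}\bar{P}_{u,0}M\right\}^{-1}[I + (k-1)\bar{P}_{u,0}M]^{-1}\bar{P}_{u,0} \\
&= \left\{[I + (k-1)\bar{P}_{u,0}M]\left[I + [I + (k-1)\bar{P}_{u,0}M]^{-1}\bar{P}_{u,0}M\right]\right\}^{-1}\bar{P}_{u,0} \\
&= \{I + (k-1)\bar{P}_{u,0}M + \bar{P}_{u,0}M\}^{-1}\bar{P}_{u,0} \\
&= (I + k\bar{P}_{u,0}M)^{-1}\bar{P}_{u,0}.
\end{aligned}$$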
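The closed form (9) can be checked against the recursion (8) numerically. The following is a minimal sketch for the scalar case, assuming hypothetical values for the initial value of the sequence and for the scalar M; the function names are illustrative and exact rational arithmetic is used so that equality can be tested directly.

```python
from fractions import Fraction


def p_bar_by_recursion(p0, m, k):
    """Apply the scalar form of (8) k times, with m = N^T S2^{-1} N."""
    p_bar = p0
    for _ in range(k):
        p_bar = p_bar / (1 + p_bar * m)
    return p_bar


def p_bar_closed_form(p0, m, k):
    """Scalar form of the closed-form expression (9)."""
    return p0 / (1 + k * p0 * m)


# Hypothetical values for the initial value and for M.
p0, m = Fraction(5, 2), Fraction(3, 7)
agree = all(
    p_bar_by_recursion(p0, m, k) == p_bar_closed_form(p0, m, k)
    for k in range(25))
print(agree)
```

The agreement for every k tested mirrors the induction step of the proof: one more application of (8) adds one more multiple of M inside the inverse.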
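The actual input error covariance matrix resulting from the modified learning algorithm satisfies

$$\begin{aligned}
\hat{P}_{u,k+1} &\triangleq E[\delta u(t,k+1)\,\delta u(t,k+1)^T] \\
&= (I - \bar{K}_kN)\hat{P}_{u,k}(I - \bar{K}_kN)^T + \bar{K}_k\left((C - C^+A)P_{x,k}(C - C^+A)^T + S_2\right)\bar{K}_k^T
\end{aligned}$$

whereas

$$\bar{P}_{u,k+1} = (I - \bar{K}_kN)\bar{P}_{u,k} = (I - \bar{K}_kN)\bar{P}_{u,k}(I - \bar{K}_kN)^T + \bar{K}_kS_2\bar{K}_k^T.$$

Define $\Delta P_{u,k+1} \triangleq \hat{P}_{u,k+1} - \bar{P}_{u,k+1}$. Subtracting the last two equations gives

$$\Delta P_{u,k+1} = (I - \bar{K}_kN)\,\Delta P_{u,k}\,(I - \bar{K}_kN)^T + \bar{K}_kL_k\bar{K}_k^T$$

where $L_k \triangleq (C - C^+A)P_{x,k}(C - C^+A)^T$.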
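The relation between the actual covariance, the sequence generated by (8), and their difference can be illustrated with a scalar sketch. The state-dependent term (C - C+A) P_x (C - C+A)^T is not simulated here; it is replaced by a hypothetical nonnegative sequence `state_terms`, and the function name `propagate` and all numerical values are likewise assumptions for illustration.

```python
from fractions import Fraction


def propagate(p_hat0, p_bar0, n, s2, state_terms):
    """Propagate the actual covariance, the sequence in (8), and their gap.

    state_terms[k] stands in for the scalar value of
    (C - C+ A) P_x (C - C+ A)^T at iteration k (assumed nonnegative).
    """
    p_hat, p_bar = p_hat0, p_bar0
    delta = p_hat0 - p_bar0
    rows = []
    for l_k in state_terms:
        gain = p_bar * n / (n * p_bar * n + s2)           # modified gain (7)
        shrink = (1 - gain * n) ** 2
        p_hat = shrink * p_hat + gain * gain * (l_k + s2)  # actual covariance
        p_bar = shrink * p_bar + gain * gain * s2          # sequence in (8)
        delta = shrink * delta + gain * gain * l_k         # difference
        rows.append((p_hat, p_bar, delta))
    return rows


# Hypothetical values; both recursions start from the same initial value.
state_terms = [Fraction(1, i + 1) for i in range(15)]
rows = propagate(Fraction(2), Fraction(2), Fraction(1), Fraction(1), state_terms)

decomposition_holds = all(p_hat == p_bar + delta for p_hat, p_bar, delta in rows)
gap_nonnegative = all(delta >= 0 for _, _, delta in rows)
print(decomposition_holds, gap_nonnegative)
```

Because the same gain drives all three recursions, the gap only accumulates the nonnegative state-dependent terms, which is why it stays nonnegative when it starts at zero.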
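With $\Delta P_{u,0} = 0$, iterating the recursion for $\Delta P_{u,k+1}$ gives

$$\hat{P}_{u,k} = \bar{P}_{u,k} + \Delta P_{u,k} \tag{11}$$

where

$$\Delta P_{u,k} = \sum_{i=0}^{k-1}\left[\prod_{j=i+1}^{k-1}(I - \bar{K}_{k-j}N)\right]D_i\left[\prod_{j=i+1}^{k-1}(I - \bar{K}_{k-j}N)\right]^T$$

with $D_i \triangleq \bar{K}_iL_i\bar{K}_i^T$ and $\prod_{j=k}^{k-1}(I - \bar{K}_{k-j}N) \triangleq I$. Note that since $D_i$ is a symmetric positive-semidefinite matrix, $\Delta P_{u,k}$ is also a symmetric positive-semidefinite matrix. Employing (8), we have

$$\prod_{j=i+1}^{k-1}(I - \bar{K}_{k-j}N) = \bar{P}_{u,k-i}\bar{P}_{u,1}^{-1} = [I + (k-i)G]^{-1}\bar{P}_{u,0}\bar{P}_{u,1}^{-1}$$

where $G \triangleq \bar{P}_{u,0}N^TS_2^{-1}N$, and (9) is used to obtain the last equality. Note that since $\bar{P}_{u,0}$ and $N^TS_2^{-1}N$ are symmetric positive-definite matrices, the eigenvalues of $G$ are strictly positive, that is, $\lambda(G) > 0$. Substituting $\bar{P}_{u,1} = [I + G]^{-1}\bar{P}_{u,0}$ into the second equality of the last equation, we get

$$\prod_{j=i+1}^{k-1}(I - \bar{K}_{k-j}N) = [I + (k-i)G]^{-1}(I + G).$$

Since $\lambda(G) > 0$, the norm of this product is bounded by a constant multiple of $1/(k-i)$, while Theorem 2 bounds $\|\bar{K}_i\|$ by $(1/i)c_2$ for $i \ge 1$. The terms of the sum in (11) with $1 \le i \le k-1$ are therefore bounded by a constant multiple of $1/(i^2(k-i)^2)$, and the term with $i = 0$ by a constant multiple of $1/k^2$, so that there exist positive constants $c$ and $c_3$ with

$$\|\Delta P_{u,k}\| < \frac{1}{k}cc_3.$$

From (11),

$$\|\hat{P}_{u,k}\| \le \|\bar{P}_{u,k}\| + \|\Delta P_{u,k}\|$$

and, using (10),

$$\|\hat{P}_{u,k}\| < \frac{1}{k}c_1 + \frac{1}{k}cc_3 = \frac{1}{k}c_P$$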
where $c_P \triangleq c_1 + cc_3$. Consequently, as $k \to \infty$, $\hat{P}_{u,k} \to 0$. Conversely, (11) implies that $\|\hat{P}_{u,k}\| = \|\bar{P}_{u,k} + \Delta P_{u,k}\|$. Since $\bar{P}_{u,k}$ is a symmetric positive-definite matrix and $\Delta P_{u,k}$ is a symmetric positive-semidefinite matrix, then 1) if $\hat{P}_{u,k} \to 0$ as $k \to \infty$, then $\bar{P}_{u,k} \to 0$ and $\Delta P_{u,k} \to 0$ as $k \to \infty$, and 2) if $\|\hat{P}_{u,k}\| < c_P/k$, then there exist positive constants $c_{P1}$ and $c_{\Delta P1}$ such that $\|\bar{P}_{u,k}\| < (1/k)c_{P1}$ and $\|\Delta P_{u,k}\| < c_{\Delta P1}/k$ for all $k > 0$.

Remark: When applying the proposed algorithm, it is intuitive to initially set $\bar{P}_{u,0}$ to the estimate of $\hat{P}_{u,0}$. However, if this is not the case, then $\Delta P_{u,0} \neq 0$. Thus, $\Delta P_{u,k}$ in (11) becomes

$$\begin{aligned}
\Delta P_{u,k} &= \sum_{i=0}^{k-1}\left[\prod_{j=i+1}^{k-1}(I - \bar{K}_{k-j}N)\right]D_i\left[\prod_{j=i+1}^{k-1}(I - \bar{K}_{k-j}N)\right]^T \\
&\quad + \left[\prod_{j=0}^{k-1}(I - \bar{K}_{k-j-1}N)\right]\Delta P_{u,0}\left[\prod_{j=0}^{k-1}(I - \bar{K}_{k-j-1}N)\right]^T.
\end{aligned}$$

For instance, to guarantee that $\Delta P_{u,k}$ is a symmetric positive-semidefinite matrix, it may be assumed that $\Delta P_{u,0}$ is also a symmetric positive-semidefinite matrix. Applying an argument similar to the original proof leads to the desired results.

Theorem 4: If $N$ is a full-column rank matrix, then in the absence of state disturbance and reinitialization errors (excluding biased measurement noise), the learning algorithm, presented by (2), (7), and (8), guarantees that the input error covariance matrix $\hat{P}_{u,k} \to 0$, and the state error covariance matrix $P_{x,k} \triangleq E[\delta x(t,k)\,\delta x(t,k)^T] \to 0$ uniformly in $[0, n_t]$ as $k \to \infty$.
Proof: Theorem 1 implies that $\bar{P}_{u,k} \to 0$, and consequently Theorem 3 implies that $\hat{P}_{u,k} \to 0$. The rest of the proof is similar to the proof of its counterpart in [1], and is thus omitted.

ACKNOWLEDGMENT
The author would like to thank one of the earlier reviewers for constructive suggestions in improving this work.

REFERENCES
[1] S. S. Saab, "A discrete-time stochastic learning control algorithm," IEEE Trans. Automat. Contr., vol. 46, pp. 877-887, June 2001.
I. INTRODUCTION
It is widely recognized that an integral controller is inherently robust
in the face of model and controller parameter variations. The value of
integral control in achieving robust asymptotic regulation has recently
been exploited for nonlinear uncertain systems; see, e.g., [1]-[4], [9],
and [10], and the references therein. In [1] and [2], Freeman and Kokotovic propose a backstepping scheme for robust integral control of a
class of nonlinear systems with unknown nonlinearities. Global setpoint regulators with disturbance rejection property are constructed at
the price of assuming full-state information and the relative degree
being equal to the system order. Both assumptions in [1], [2] are relaxed by Khalil [10] by means of his high-gain observer techniques
complemented by the idea of saturating the controller outside a compact set of interest. Naturally, as a consequence of the worst-case
design, the results in [10] are of regional and semiglobal types.
The purpose of this note is to propose global regulation results for
a class of nonlinear systems with disturbances combining those in [1],
[2], [10], i.e., we do consider unmeasured zero-dynamics and uncertain nonlinearities. Both partial-state and output feedback control cases
will be investigated. The obtained results extend our previous results
Manuscript received November 7, 2000; revised March 22, 2001. Recommended by Associate Editor Z. Lin. This work was supported in part by the
National Science Foundation under Grants INT-9987317 and ECS-0093176.
Z.-P. Jiang is with the Department of Electrical and Computer Engineering,
Polytechnic University, Brooklyn, NY 11201 USA (e-mail: zjiang@control.poly.edu).
I. Mareels is with the Department of Electrical and Electronic Engineering,
Melbourne University, Parkville 3052 Victoria, Australia (e-mail: i.mareels@ee.mu.oz.au).
Publisher Item Identifier S 0018-9286(01)07685-1.