J. William Helton
University of California, San Diego
San Diego, California

Orlando Merino
University of Rhode Island
Kingston, Rhode Island

Philadelphia
Copyright ©1998 by the Society for Industrial and Applied Mathematics.
All rights reserved. Printed in the United States of America. No part of this book
may be reproduced, stored, or transmitted in any manner without the written
permission of the publisher. For information, write to the Society for Industrial
and Applied Mathematics, 3600 University City Science Center, Philadelphia, PA
19104-2688.
Typeset by TEXniques, Inc., Boston, MA and the Society for Industrial and
Applied Mathematics. Printed by Victor Graphics, Inc., Baltimore, MD.
Contents
Preface xiii
2 Internal Stability 11
2.1 Control and stability 11
2.2 Interpolation 13
2.3 Systems with a stable plant 15
2.4 Exercise 16
vi CONTENTS
4 Optimization 35
4.1 Review of concepts 35
4.2 Generating a performance function 36
4.3 Finding T with best performance 38
4.3.1 Example 39
4.4 Acceptable performance functions 40
4.5 Performance not of the circular type 43
4.6 Optimization 44
4.6.1 The optimization problem OPT 44
4.7 Internal stability and optimization 45
4.7.1 The optimization problem OPT 45
4.7.2 OPT with circular T 45
4.8 Exercises 46
II More on Design 63
6 Examples 65
6.1 Numerical practicalities 65
6.1.1 Sampling functions on the jω axis 66
6.1.2 Discontinuous functions 67
6.1.3 Vanishing radius function 68
6.1.4 Performance function incorrectly defined 69
6.2 Design example 1 69
6.2.1 Electro-mechanical and electrical models 69
6.2.2 Mathematical model 71
20 Proofs 209
20.1 The general theory 209
20.1.1 Size functions and their properties 209
20.1.2 Proof of Theorem 19.1.4 213
20.2 Proofs for H∞ Optima 214
20.2.1 Definitions and notation 214
20.2.2 Performance, Jacobians, and adjoints 215
20.2.3 Proof of the Matrix H∞ Optimization Theorem 17.1.1 217
VI Appendices 219
A History and Perspective 221
C Uncertainty 229
C.1 Introduction 229
C.2 Types of uncertainty 230
C.3 Dealing with uncertainty 231
C.4 A method to treat plant uncertainty 231
C.5 An example with quasi-circular T 232
C.6 Performance and uncertainty 233
C.7 Extensions 236
References 279
Index 289
Preface
xiv PREFACE
Optimal design
We shall always deal with a situation where part of a linear system is given
(called the plant in control theory). We want to find the additional part
(the designable part) so that the whole system meets certain requirements (see
Fig. 1).
This book presents a list of requirements that are sound from the physical
point of view yet simple mathematically. The list of requirements discussed
here is by no means complete; as more is understood about systems, more
requirements can be added to those in this book. It is our contention that the
framework is flexible enough to accommodate them. Several examples illustrate
the practical solution of design problems using steps I-III.
H∞ control optimizes the performance over all frequencies; it does not just
average (mean square) performance over frequencies. This approach is true to
the physical problem, whereas the older mean square optimization approaches
distort engineering specifications in order to produce a mathematically easy
problem. Fortunately in the last 15 years the mathematics of H∞ has become
powerful enough to solve engineering problems.
Software
This book also serves to introduce the control software package OPTDesign,
which runs under Mathematica. The book is independent of the software; how-
ever, the companionship of the software provides a richer experience. With
OPTDesign the reader can easily reproduce the calculations done here in the
solved examples, and try variations on them.
Recommended background
Parts I and II require knowledge of basics of engineering such as what a
frequency response function is. It can be learned well by a student who has had
a one-semester control course, maybe less. Popular control texts are [FPE86],
[Do81], and [O90].
Part III requires an introduction to complex variables and analytic functions.
About three weeks of an undergraduate complex variables course might suffice.
Part IV requires some knowledge of vector-valued function spaces, their du-
als, and properties.
Part V requires a little knowledge of Banach spaces.
Other references
Advanced books that present H∞ theory but with a different approach include
[F87], [GL95], [BGR90], [FF91], [GM90], [DFT92], [H87], [Dy89], [Kim97], and
[ZDG96]. All of these books are based on the same type of mathematics, namely,
on extensions of something called Nevanlinna-Pick, Nehari, and commutant
lifting theories. Another approach is [BB91]. There exist commercial software
packages that do H∞ control design, such as toolboxes available from The
MathWorks Inc. (Robust Control toolbox, LMI Control toolbox, μ-Analysis and
Synthesis toolbox, and QFT toolbox), Delight (Prof. Andre Tits, U. of Maryland),
and Qdes (Prof. S. Boyd, Stanford University). The mathematical core presented
in the later parts of our book and our software is fundamentally different
from these approaches.
Thanks
We thank Trent Walker, Julia Myers, Dave Schwartz, Jeff Allen, John Flattum,
Robert O'Barr, and David Ring for computer work and testing, and Joy Kirsch
for administration. Mark Stankus was a valuable source of computer expertise.
Thanks to Neola Crimmins and Zelinda Collins for typing endless scribbled
pages. Eric Rowell read a draft of the book and caught many errors. The index is
due to Jeremy Martin. Julia Myers read many drafts of the book, and she found
many typographical and grammatical errors, and cases of unclear exposition.
We thank Julia for her wonderful work. We are especially grateful to Prof. Fred
Bailey of the University of Minnesota, his student Brett Larson, and to Prof.
A. T. Shenton and research assistant Zia Shafiei of the University of Liverpool.
Finally, we would like to thank the Air Force Office of Scientific Research, the
National Science Foundation, and the Ford Motor Co. for partially supporting
the writing of this book through grants.
Code development
OPTDesign and its companion Anopt are based on algorithms due to Helton,
Merino, and Walker and were written in Mathematica by Orlando Merino, Julia
Myers and Trent Walker. Also, Daniel Lam, Robert O'Barr, Mark Paskowitz
and Mike Swafford contributed. Earlier versions of Anopt (Fortran, 1989) were
designed and written by Jim Bence, J. William Helton, Julia Myers, Orlando
Merino, Robert O'Barr, and David Ring. An even earlier Fortran package is
Approxih by David Schwartz and J. W. Helton (1985).
This chapter outlines our approach to solving design problems. Section 1.1
gives basic facts about rational functions, which are the single most important
type of functions in practical system design. The basic system and functions
considered in this book are introduced in section 1.2 and section 1.3. A control
design problem is presented in section 1.4. A description of the method used in
this book to solve control problems is presented in section 1.5.
The function F is called proper if d(F) ≥ 0 and strictly proper if d(F) > 0.
A rational function can be thought of as a function on the imaginary axis
F(jω) or as a function on the complex plane, where the notation F(s) is used.
A function F is said to be bounded (on the imaginary axis) if there exists a
constant M > 0 such that |F(jω)| ≤ M for all frequencies ω.
4 CHAPTER 1. SYSTEM DESIGN PROBLEMS
Fig. 1.1. Plot of |F(jω)| versus frequency ω, where F(s) = 1/((s − 0.5)² + 4).
The dotted line marks
for all s, are sometimes referred to as real (on the real axis) and are very impor-
tant in engineering. They arise as the Laplace transform of real-valued functions
of time. On the imaginary axis equality (1.1) becomes
All functions that appear in this book have this property. One can show that a
rational function is real if and only if the coefficients are real numbers.
A rational function is stable if every pole of the function has negative real
part and the relative degree is nonnegative.
The usual convention is to denote the set of all rational functions that are
stable and real by RH∞. Note that functions in RH∞ can be described as
proper rational functions with real coefficients and with no poles in the closed
right half-plane RHP.²
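As a quick numerical illustration of this definition (a sketch of ours, not from the book; coefficient lists are highest power first, and the tolerance is an assumption), membership in RH∞ can be tested by checking properness and pole locations. The function of Fig. 1.1 is bounded on the jω axis yet fails the test, since its poles lie in the RHP:

```python
import numpy as np

def in_RHinf(num, den, tol=1e-9):
    """Check whether F = num/den (coefficient lists, highest power first)
    is in RH-infinity: proper (relative degree d(F) >= 0), real
    coefficients, and no poles in the closed right half-plane."""
    num = np.trim_zeros(np.atleast_1d(num), 'f')
    den = np.trim_zeros(np.atleast_1d(den), 'f')
    if not (np.isrealobj(num) and np.isrealobj(den)):
        return False
    if len(num) > len(den):      # relative degree negative: not proper
        return False
    poles = np.roots(den)
    return bool(np.all(poles.real < -tol))  # poles in the open LHP only

# F(s) = 1/((s - 0.5)^2 + 4) from Fig. 1.1: bounded, but poles at 0.5 +/- 2j
print(in_RHinf([1.0], [1.0, -1.0, 4.25]))   # False
print(in_RHinf([1.0], [1.0, 3.0, 2.0]))     # True: poles at -1 and -2
```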
The functions P and C are proper, real, rational functions of s. They are
called the plant and the compensator, respectively. Figure 1.2 is called the
closed-loop system and denoted by S. We assume that the plant in Fig. 1.2
is given and cannot be modified, and that many choices of the compensator are
possible.
Besides P and C, there are other functions commonly associated with the
closed-loop system S, which we now derive. Begin with Fig. 1.2 to obtain the
equations
Note that the variable s has been suppressed in (1.5) to simplify notation. Other
key functions are
Now we see that it is possible to write C and every transfer function associated
to S in terms of T and the given part P. That is, to determine the closed-
loop system S one has only to pick a particular value for T. Under these
circumstances we refer to T as a designable transfer function.
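Assuming the standard unity-feedback relation T = PC/(1 + PC), the compensator is recovered from a chosen T as C = T/(P(1 − T)); this appears to be the role played by equation (1.6). A small sympy sketch, with illustrative choices of P and T that are ours, not the book's:

```python
import sympy as sp

s = sp.symbols('s')

# Given plant P and a chosen designable transfer function T, recover the
# compensator from T = PC/(1 + PC), i.e. C = T/(P*(1 - T)).
P = 1 / (s + 1)        # illustrative plant
T = 1 / (s + 2)        # illustrative designable closed-loop function

C = sp.simplify(T / (P * (1 - T)))
print(C)               # 1

# Sanity check: plugging C back into the feedback loop reproduces T.
assert sp.simplify(P*C/(1 + P*C) - T) == 0
```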
The designable transfer function cannot take an arbitrary form. There are
many properties that the physics or mathematics of the system dictate. There
is a set X of admissible functions to which the designable T must belong. For
example, X contains functions that are continuous on the imaginary axis and
uniformly bounded there. Clearly, the set X must be defined before the design
process begins. In this book the set X is a subset of RH∞.
The function T* is (among the closed-loop transfer functions that satisfy internal
stability requirements) the best, in terms of overall performance as measured
by the index. Optimization (minimizing the performance index) is treated in
Chapter 4, and a computer session is presented in Chapter 5.
If it is determined in step III that there exist solutions to Design, then the
optimal designable transfer function T* can be used to run simulations, can be
implemented physically, or, as will often be the case, can serve as a guide to
restating the problem with more stringent performance requirements. The new
Design problem is subsequently solved with steps I, II, and III.
If the result of step III is that no solutions to Design exist, the engineer is
confronted with two possible paths. One is to redefine the problem completely
by radically changing the specs. The other is to reexamine the requirements to
determine which of them can be relaxed. In this case, a problem is formulated
with the new set of requirements and then solved with steps I, II, and III.
Since in practice it is common to repeat steps I, II, and III several times in
the manner described above, we add another step to our list. With step IV one
can treat a sequence of Design problems.
The mathematics used here produce T* and an associated C*, via equation
(1.6), given by a set of values on the jω axis. In most instances the designer will
want to represent C* as a rational function. In some cases, it is desirable that
this rational function have low order. These two objectives can be accomplished
with the techniques provided by the subjects of system identification and model
reduction. While these subjects are not treated in this book,4 the software
package OPTDesign has functions for doing (stable or otherwise) rational fits
of data.
We finish this chapter with a caveat. It may be desirable in doing system
design to select compensators that are stable. For example, this would be the
case if the engineer wants to build and test the compensator as an independent
unit. In its current state, the theory and computational methods of H∞
optimization do not handle stability of compensators. Thus optimal compensators
C* produced by the methods used here are not necessarily stable (see [DFT92],
page 79).
4
The reader is referred to [SS90] and [GKL89] for theoretical aspects of system identification
and model reduction. See also [B92] and [Tr86].
1.6. EXERCISES 9
1.6 Exercises
1. Verify that the properties of being real, bounded, stable, and proper are
preserved under the following operations: addition of two rational func-
tions, multiplication of rational functions, and multiplication of a rational
function by a real number.
2. Prove that a rational function is real if and only if the coefficients can be
chosen to be real numbers.
3. Prove that a rational function F is bounded if and only if it is proper and
has no poles on the imaginary axis.
4. Use formula (1.6) in verifying the following relations:
in terms of T(jω).
Chapter 2
Internal Stability
One way to achieve this is to choose C that cancels RHP poles and zeros
of P. However, any RHP pole-zero cancellation in the product PC is highly
undesirable, because small uncertainties in the plant P or in the compensator's
construction lead to a radical change in the behavior of the closed-loop system
S. We now illustrate this point with an example.
CHAPTER 2. INTERNAL STABILITY
Example. For the given plant P(s) = 1/(s − 1), consider the compensators
C₁(s) = (s − 1)/(s + 1) and C₂(s) = 2. Observe that a RHP zero of the
compensator C₁ cancels a RHP pole of the plant when the product P(s)C₁(s)
is formed:
where δ is a real number with small absolute value. For this plant and the
compensator C₁, the closed-loop transfer function is
The function T_δ has an unstable pole for all small values of δ, except for δ = 0.
Let s_δ denote such a value. Figure 2.2 shows the plot of s_δ. Here
Fig. 2.2. Plot of the unstable pole s_δ of T_δ as a function of δ. The parameter δ
is on the horizontal axis. The point (0, 1) is omitted, since for δ = 0 there is no
unstable pole in T_δ.
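The fragility of the cancellation can be checked numerically. Taking the perturbed plant as P_δ(s) = 1/(s − 1 − δ) (our reading of the example; the perturbed-plant formula was lost in reproduction) with compensator C₁, the closed-loop poles are the roots of (s − 1 − δ)(s + 1) + (s − 1):

```python
import numpy as np

# Closed loop for P_delta(s) = 1/(s - 1 - delta) and C1(s) = (s - 1)/(s + 1):
# T_delta = L/(1 + L) with L = P_delta*C1, whose poles are the roots of
#   (s - 1 - delta)(s + 1) + (s - 1) = s^2 + (1 - delta)s - (2 + delta).
def unstable_pole(delta):
    roots = np.roots([1.0, 1.0 - delta, -(2.0 + delta)])
    return max(roots.real)   # the roots are real; the larger one is unstable

for d in [0.1, 0.01, 0.001]:
    print(d, unstable_pole(d))
# The unstable pole tends to 1 as delta -> 0, yet at delta = 0 the factor
# (s - 1) cancels and the closed loop is stable: T_0 = 1/(s + 2).
```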
2.2. INTERPOLATION 13
On the other hand, for the plant P_δ and compensator C₂ the closed-loop
transfer function is
DEFINITION. The system S is internally stable if the following are all satis-
fied.
ii. A RHP pole of the plant is not canceled by a RHP zero of the compensator.
iii. A RHP zero of the plant is not canceled by a RHP pole of the compensator.
Here RH∞ is the linear space of all proper rational functions that are stable
(poles off the closed RHP) and real (real coefficients).
The fundamental problem we treat in this chapter is as follows: given a plant
P in a certain class, find a description of all the internally stable systems S with
P as plant. The same problem is discussed in Chapter 7 for any plant P.
There are many ways in which one can present an answer to this problem,
all of which are mathematically equivalent. In this book we choose one that
consists of writing a formula that is (and must be) satisfied by all designable
closed-loop transfer functions T of these systems. The reasons for choosing this
particular approach are technical rather than physical: we have a procedure
available that uses this formula to give an answer to system design problems.
To derive formulas for internally stable systems we need to introduce the
concept of interpolation condition, which is presented in the next section.
2.2 Interpolation
Given s₀, a complex number, and f, a rational function, a relation of the form
The above claims form the first half of the following result.
PROPOSITION 2.2.1. Let S be a system with plant P and closed-loop transfer
function T.
2.3. SYSTEMS WITH A STABLE PLANT 15
where A and B are certain functions in RH∞ that can be determined from J,
and T₁ could be any function in RH∞. Thus as T₁ sweeps RH∞, we have T
sweeping all functions in RH∞ meeting J.
It is clear from this formula and from P being strictly proper that C is proper.
For C to have a pole at the same location s = s₀ as a RHP zero of P, one must
have 1 − P(s₀)T₁(s₀) = 0, which is impossible since T₁ has no pole at s₀ and P
is zero there.
Example. If the plant of an internally stable system is then
any stable closed-loop transfer function T has the form
where T₁ is some (or any) element of RH∞. We note that T₁ may not be strictly
proper. In fact, d(T₁) ≠ 0 forces d(C) = 0 upon the system.
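For a stable plant the parameterization above amounts to T = P·T₁ with T₁ free in RH∞, and the induced compensator is C = T₁/(1 − P·T₁). A sympy sketch with an illustrative stable, strictly proper plant and parameter of our own choosing:

```python
import sympy as sp

s = sp.symbols('s')

# Stable, strictly proper plant and free parameter T1 in RH-infinity
# (both chosen here only for illustration).
P  = 1 / (s + 1)
T1 = 2 / (s + 2)

T = sp.simplify(P * T1)               # candidate closed-loop transfer function
C = sp.simplify(T1 / (1 - P * T1))    # induced compensator

# Verify the feedback relation and stability of the closed loop.
assert sp.simplify(P*C/(1 + P*C) - T) == 0
den = sp.denom(sp.cancel(T))
poles = sp.solve(sp.Eq(den, 0), s)
assert all(sp.re(p) < 0 for p in poles)
print(poles)   # poles at -2 and -1: T is stable
```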
Note that we used the actual plant P in the formula (2.8). While this may
be convenient, many other formulas for T are possible. For example, we can
correctly write
Example. If now P(s) = (s² − 4)/(s⁴ + 2s² + 2), then P is stable and strictly
proper, so for the system to be internally stable the closed-loop transfer function
must have the form
2.4 Exercise
1. Consider the family of plants given by
Chapter 3

Frequency Domain
Performance Requirements
Frequency domain performance requirements have a convenient graphical inter-
pretation in terms of disks. Section 3.1 reviews basic concepts and introduces
disk inequalities. The most common performance requirements for control sys-
tems are given in section 3.2. Section 3.3 discusses disk inequalities arising from
performance requirements. The first-time reader may stop after reading this
section and jump to Chapter 4. The topics covered at this point are sufficient
to provide the basic tools to solve simple design problems, when used in con-
junction with material from Chapters 2, 4, and 5. The more interested reader
may continue with sections 3.4 and 3.5, which explain additional measures of
performance and the corresponding disk inequalities.
3.1 Introduction
3.1.1 The closed-loop system S
Consider the closed-loop system depicted in Fig. 3.1. In this system, P(s) and
C(s) are proper real rational functions of s. We take the plant P to be a given
rational function and use the closed-loop transfer function T to parameterize
the closed-loop systems S obtained with different compensators. Hence many
systems S with given plant P are possible by letting the designable transfer
function T take different values. Key functions are
18 CHAPTER 3. FREQUENCY DOMAIN PERFORMANCE
which T must satisfy. Here K and R are fixed functions that embody the desired
specs of the system. K is called the center of the disk (3.1) and R is called the
radius of (3.1). Disk inequalities are easy to plot as regions in 3-D space (see
Fig. 3.2) and correspond to the inside of a tubelike domain. If the frequency ω
is fixed, then a disk inequality is represented in the complex plane by a region
The process of piecing together disk inequalities will be discussed in sections 3.3
and 3.5. We shall see in Chapter 4 how to use disk inequalities to pose optimiza-
tion problems of the simplest kind discussed in this book. Many system design
problems can be solved by finding solutions to these optimization problems.
If two or more disk inequalities apply at the same frequency ω, then the
region S_ω is the intersection of the disks that correspond to individual con-
straints. Thus in this case, S_ω is not a disk.
We can easily compare the gain-phase margin m with the gain margin g and
the phase margin φ by looking at the Nyquist plot in Fig. 3.3. Typically m is
more conservative than either φ or g.
Fig. 3.3. The gain margin g, phase margin φ, and gain-phase margin m.
If a given stable system has m near 0, then it is close to the unstable case,
which is undesirable. If a_m denotes the largest value of 1/m considered to be
acceptable in inequality (3.4), then we can formulate a constraint in terms of
a_m as (see Fig. 3.4)
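Reading m as the distance from the Nyquist plot of L = PC to the point −1, and using 1 − T = 1/(1 + L), the constraint 1/m ≤ a_m is exactly the disk inequality |1 − T(jω)| ≤ a_m at every frequency. A grid check with an illustrative plant and compensator of our own:

```python
import numpy as np

# Gain-phase margin m: distance from the Nyquist plot of L = PC to -1,
# evaluated on a frequency grid (plant and compensator chosen for illustration).
def L(s):
    P = 1 / (s * (s + 1))    # illustrative plant
    C = 2.0                  # illustrative compensator
    return P * C

w = np.logspace(-2, 2, 4000)
Ljw = L(1j * w)
m = np.min(np.abs(1 + Ljw))
print(m)

# Since 1 - T = 1/(1 + L), the requirement 1/m <= a_m is the same as the
# disk inequality |1 - T(jw)| <= a_m for all w on the grid:
T = Ljw / (1 + Ljw)
assert np.isclose(np.max(np.abs(1 - T)), 1/m, rtol=1e-6)
```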
for some small a_tr. Thus good tracking is generated by requiring (3.10) to hold.1
It turns out that (3.10) is too stringent to be obtainable in a practical engi-
neering problem since the function T rolls off at high frequency. What is realistic
is to require the system to track low-frequency functions well, that is, to require
that T(jω) be close to 1 over some specified frequency range [−ω_tr, ω_tr]. Thus
we introduce one of the key constraints of control.
1
A standard theorem (the Plancherel theorem, [Yng88]) applied to (3.9) yields immediately
22 CHAPTER 3. FREQUENCY DOMAIN PERFORMANCE
3.2.3 Bandwidth
Bandwidth of a system is commonly defined as the frequency ω_b at which |T(jω)|
falls below a given constant times the low-frequency value of the input. This
constant is usually taken to be α_b = 1/√2. A particular bandwidth is required to
ensure that the system at high frequency is not upset by noise, plant uncertainty,
actuator sluggishness, etc.
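This definition is easy to evaluate numerically: sweep a frequency grid and find the first point where |T(jω)| drops below 1/√2 of its low-frequency value. A sketch with a first-order example of our own choosing:

```python
import numpy as np

# Bandwidth: first frequency at which |T(jw)| drops below 1/sqrt(2) times
# its low-frequency value (T below is an illustrative first-order example).
def T(s):
    return 1 / (s + 1)

w = np.logspace(-3, 3, 100000)
mag = np.abs(T(1j * w))
level = mag[0] / np.sqrt(2)        # low-frequency value times 1/sqrt(2)
wb = w[np.argmax(mag < level)]     # first grid frequency below the level
print(wb)                          # close to 1 for T = 1/(s + 1)
```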
that for T ∈ RH∞ the error satisfies a useful bound:
Consequently, if
then the normalized error in the left-hand side of (3.12) is no larger than c. Thus specs of the
form (3.13) guarantee that
for all u with finite energy. This is a very strong form of tracking.
3.2. MEASURES OF PERFORMANCE 23
Bandwidth Constraint
Given
we see that both T and PC roll off at the same rate. To obtain an inequality
useful for the design process, we must eliminate the compensator C from the
right-hand side of relation (3.17). To do this, use the fact that compensators
roll off at high frequency with the asymptotic form 2
2
Since behavior near j∞ is what is important in (3.18), a_r can be taken to be a weight
function of frequency that is not too close to 0 at any given frequency.
for some a_r and some n. At the outset of design, an engineer should specify n
and a_r for the class of compensators that is to be built.3 Once a_r and n are
specified, combine (3.17) and (3.18) to obtain the closed-loop roll-off constraint.
bad choice leads to problems that do not make sense from either the numerical
or the physical point of view. We illustrate this with examples below.
Example. Consider the problem Design for the plant P(s) = 1/(1 − s) and
the performance requirement
An obvious mistake here is the absence of constraints valid on the band 0 < ω <
1.0. It is easy to find functions T that satisfy (3.20) and that produce internally
stable systems S, but that have undesirable behavior at low frequency.
Example. Now consider Design for P(s) with performance requirements
For any physical system, T must roll off to 0 at very high frequency. This is not
enforced by the constraints (3.21). One way to remedy this is to require roll-off
on T, for example, with the constraint
The examples above illustrate two basic principles for choosing a set of fre-
quency domain performance requirements:
At each frequency ω, there must be a constraint in the set of requirements
that is active at this frequency.
Behavior of the system at very high frequency must be specified with a
roll-off constraint on T.
There are cases where performance measures not described in this section are
important. In particular, if the plant has a zero or a pole right on the ju axis
or near it, the compensator bound and plant bound constraints must be used to
set up the problem correctly. See section 3.4 for details on this.
Table 3.1.
Constraint                              Center K   Radius R
Tracking and disturbance rejection     1          0.25
Gain-phase                             1          2.0
Bandwidth                              0          0.707

|1 − T(jω)| < 0.25 for ω < 0.5 (tracking for unit step input),
|1 − T(jω)| < 2.0 for all ω (gain-phase margin),
|T(jω)| < 0.707 for ω > 3 (bandwidth).
Note that for ω fixed, the large disks from the gain-phase margin constraint
contain the smaller disks from the bandwidth and the tracking error constraints.
Thus the smaller disks define the region where T(jω) must lie for either low or
high frequencies ω. For midrange frequencies, the gain-phase margin constraint
is the only constraint that is applicable. We collect this information in Table
3.1. The notation there is the same as the notation in inequality (3.1). The
region defined by these constraints is drawn in Fig. 3.8.
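The envelope of Table 3.1 can be assembled pointwise: at each frequency, keep the disk that binds. A sketch of ours using the table's constants (the midband split is our reading of the text):

```python
# Center K and radius R at each frequency, assembled from Table 3.1:
#   tracking    |1 - T| <= 0.25  for w < 0.5   (K = 1, R = 0.25)
#   gain-phase  |1 - T| <= 2.0   for all w     (K = 1, R = 2.0)
#   bandwidth   |T|     <= 0.707 for w > 3     (K = 0, R = 0.707)
def envelope(w):
    if w < 0.5:
        return 1.0, 0.25     # the small tracking disk binds at low frequency
    elif w <= 3.0:
        return 1.0, 2.0      # only the gain-phase disk applies midband
    else:
        return 0.0, 0.707    # the bandwidth disk binds at high frequency

for w in [0.1, 1.0, 10.0]:
    K, R = envelope(w)
    print(w, K, R)
```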
The point is that at each frequency ω there is one and only one disk
3.4. MORE PERFORMANCE MEASURES 27
for some prescribed number a_c. In terms of the closed-loop system S, this
says that the "closed-loop compensator" has magnitude bounded by a_c. The
closed-loop compensator is what the compensator puts out in response to an
5
Also, the examples in Chapters 4 and 5 will be clear to this beginner.
input to the system. The reason that (3.23) is imposed on the system is that
if it is violated, then the closed-loop system can saturate, a small current into
the system can cause arcing at the output of C, or other problems may appear.
In [DS81] Doyle and Stein discuss drawbacks of high loop gain in somewhat
different terms. Their main concern is that (3.23) holds at high frequencies; low
frequencies are less important.
Let us analyze (3.23) in terms of T. Since PC = T/(1 − T), we obtain
Fig. 3.10. Region defined by the compensator bound constraint, if there is a zero
of P at s = jω_z.
The closed-loop plant is the output of the closed-loop system of Fig. 3.1 to an
input to the plant.
Inequality (3.26) is analyzed in a similar fashion as inequality (3.24). Recall
that internal stability of the system implies that T(jω) = 1 when P(jω) = ∞.
Inequality (3.26) is binding near the poles of P; it must hold whenever |ω − ω_p| <
η_p, where ω_p is any pole of P on the jω axis and η_p is a small positive number.
From this we get the plant bound constraint, illustrated in Fig. 3.11.
Fig. 3.11. Region defined by the plant bound constraint, if P has a pole at
s = jω_p.
For good behavior of the feedback system it is necessary that T_d→y be RHP-
stable. This is always the case with internally stable systems. More can be
required; for example, that there is a restriction on the size of T_d→y. The
following inequality gives a precise statement, where c is a given constant:
In many cases relation (3.30) has a counterpart in the frequency domain, which
can be obtained with the final value theorem (cf. [C44], page 191, or [LP61],
page 315). Suppose that u(t) is such that ℒ(u − y)(s) has no poles on the closed
RHP, except perhaps a simple pole at s = 0. The final value theorem says that
in this case
It follows from equation (3.35) and from T = 1 − S that T satisfies the interpo-
lation conditions
Let us now take m ≤ n + 1 and U(s) = 1/s^m. Because of (3.36), the function
By combining (3.36) with (3.38), and applying l'Hôpital's rule for limits, we
obtain the following proposition.
PROPOSITION 3.4.1. An internally stable system with type n plant produces
a steady-state error e_ss for input U(s) = 1/s^m given by
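The steady-state errors can be computed symbolically with the final value theorem: e_ss = lim_{s→0} s·E(s) with E = (1 − T)·U and 1 − T = 1/(1 + PC). A sympy sketch for a type-1 plant of our own choosing:

```python
import sympy as sp

s = sp.symbols('s')

# Type-1 example (plant and compensator are illustrative choices):
P = 1 / (s * (s + 1))
C = sp.Integer(1)
S_ = sp.simplify(1 / (1 + P * C))          # sensitivity, S = 1 - T

# Final value theorem: e_ss = lim_{s->0} s * S(s) * U(s).
e_step = sp.limit(s * S_ * (1/s),    s, 0)  # U(s) = 1/s   (m = 1)
e_ramp = sp.limit(s * S_ * (1/s**2), s, 0)  # U(s) = 1/s^2 (m = 2)
print(e_step, e_ramp)   # 0 1: zero step error, finite ramp error
```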
One can show that an internally stable system S with type n plant satisfies
with requirements
6
These are denoted by K_p, K_v, and K_a in most books.
3.5. A FULLY CONSTRAINED PROBLEM 33
Plant
Tracking and
disturbance rejection
Gain-phase
Bandwidth
Roll-off
The constraints a, b, and c are active at all frequencies 0 < ω < 1. Clearly
constraint c is contained in constraint a, so we restrict attention to a and b.
Solve the equation
to obtain the solution ω₀ ≈ 0.29. Thus b is more stringent than a on the
frequency band 0 < ω < 0.29, and b is less stringent than a on the band
0.29 < ω < 1. We see that c is the only constraint that applies on the band
1 < ω < 2. For ω > 2 it is clear that every T that satisfies d also satisfies c, so
it is enough to study constraints d and e on this band. We solve the equation
Fig. 3.13. Section of the envelope defined by the center K(jω) and radius R(jω)
in Table 3.2.
contained in the requirements. For example, two intersecting disks do not yield
a single disk.
Chapter 4
Optimization
In this chapter we discuss performance functions and how to use them as objec-
tive functions in optimization problems. These concepts are applied to system
design in examples. We begin with a review. The reader who is interested in
getting quickly to the point where examples can be solved by computer may
read sections 4.1-4.4 and then skip to Chapter 5.
or, equivalently,
The system S is internally stable if T has no RHP poles and there is no RHP
pole-zero cancellation in the product PC.
36 CHAPTER 4. OPTIMIZATION
1
The examples in this chapter were solved with the software package OPTDesign.
4.2. GENERATING A PERFORMANCE FUNCTION 37
We will build a "performance function" from (4.1) that can be used to solve
design problems. Begin by rewriting (4.1) in the following way:
Now for a given T calculate the largest value of the left-hand side of (4.2):
We now check if
Thus g(ω) has three critical points: ω = 0, ω = ±√13/3. The value g(0) = 1/4
is a minimum, and g(±√13/3) = 297/512 is a maximum. Also, g(ω) → 9/16 as
ω → ∞. A plot is most informative (see Fig. 4.2). Thus
is called optimal. We reserve the symbol γ* for the "optimal performance," that
is, the performance of the optimal designable transfer function:
T (recall that this includes internal stability). More precisely, we say that T* is
a solution to
4.3.1 Example
Physically the problem is to stabilize the plant and impose closed-loop gain-
phase margin and roll-off performance.
Solution. Observe that the plant is strictly proper and stable. At this point
we will not spend more time discussing internal stability, since the software we
intend to use for the optimization takes care of this automatically.
We first construct the center and radius functions for our problem. Recall
that the gain-phase margin constant m is the closest acceptable distance from
the function L = PC to the point −1; in other words,
Thus the frequency domain performance requirements are described by (see Fig.
4.3)
where
Fig. 4.3. The performance envelope defined by center and radius functions in
the example is outlined by the thick curves. The horizontal axis is frequency ω,
and the two functions are K(jω) + R(jω) and K(jω) − R(jω).
and
To give an idea of what the answer is, some results of calculations with the
software package OPTDesign are presented below. For more details see Chapter
5.
Calculations by computer give γ* = 0.09923, so there exist solutions to
Design. Here is a rational function that is a low-order approximation to the
optimal one:
2
The performance may turn out to be infinitely large at frequencies ω₀ such that the plant
has a zero at s = jω₀, or at infinite frequency. This should not be a problem, provided that
only T that give internally stable systems are considered.
4.4. ACCEPTABLE PERFORMANCE FUNCTIONS 41
Fig. 4.4. 3-D plot of the performance envelope and the solution T₀. Clearly T₀
satisfies the constraints.
The justification of rules 1 and 2 requires the results in Part III of this book,
so we refer the interested reader to it.
The following are examples of performance functions that are common in
the engineering literature. We assume that the underlying design problem has
internal stability requirements, and that the plant P is strictly proper and has
no poles or zeros on the jω axis. Also, W, W₁, and W₂ denote given rational
weight functions of frequency.
Fig. 4.7. The zeros and poles of the compensator (0.786826 + 2.36403s +
2.68265s² + 1.42052s³ + 0.315071s⁴)/(7.08474 + 9.97784s + 13.6385s² + 7.38526s³ +
s⁴) that corresponds to T₀ are indicated by "o" and "x," respectively. Note that
this compensator is stable. Stability of the compensator is not guaranteed a
priori by the design method of this book.
4.5. PERFORMANCE NOT OF THE CIRCULAR TYPE 43
where W₁ and W₂ are known weight functions.4 Clearly this constraint is not
a disk inequality. Indeed, the region
is not a disk. However, the inequality (4.17) can be used to define a performance
function valid at least for those ω satisfying ω₁ < ω < ω₂. For this we set, for
z any complex number,
3
When zeros or poles of the plant occur on the jω axis, weight functions have to be chosen
so that the weights in the terms W₁(jω)|T(jω)|² and W₂(jω)|1 − T(jω)|² have the zeros (resp.
poles) at the same values of ω as those of the plant, order included. The reason for this is the
internal stability requirement. See Chapter 7.
4
See [OZ93]. Also see section 6.5 in Chapter 6.
The number γ(T) represents the cost, or overall performance, associated with the designable transfer function T. It is called the performance index. Note that (4.18) can be written as
Observe that γ(T) < 1 means that T satisfies the performance requirements, with some slack. Thus we associate small values of γ(T) with good performance.
The simplest functions Γ arise when a single disk inequality applies at each frequency.
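For a circular performance, the index γ(T) is just the worst ratio of distance-to-center over radius across frequencies. A minimal numerical sketch (the grid, center, radius, and candidate T below are illustrative stand-ins, not the book's data):

```python
import numpy as np

def performance_index(T, k, r, grid):
    """Approximate gamma(T) = sup_w |T(jw) - k(w)| / r(w) on a grid.

    gamma(T) < 1 means T(jw) lies inside every disk with center k(w)
    and radius r(w), i.e., T meets the performance requirements.
    """
    return max(abs(T(1j * w) - k(w)) / r(w) for w in np.asarray(grid))

# Illustrative data: track up to w = 1 (center 1), roll off afterwards.
k = lambda w: 1.0 if w <= 1.0 else 0.0
r = lambda w: 0.3 if w <= 1.0 else 0.5
T = lambda s: 1.0 / (s + 1.0)        # candidate closed-loop function

gamma = performance_index(T, k, r, np.linspace(0.01, 10.0, 200))
print(gamma > 1)   # this crude T violates the envelope somewhere
```

Small values of the returned index correspond to good performance, exactly as in the text.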
4.6 Optimization
It is a fact that there is a formula, depending on the RHP poles and zeros of the given plant P, that gives all T ∈ I. Indeed, there exist functions A, B ∈ RH∞ such that
One approach to treating the design problem in practice is to use formulas (4.21) and (4.22) to account for internal stability. Depending on the specific approach and the tools available, to solve Design the formulas are either manipulated directly by the designer or handled automatically by the computer. In the latter case only the list of unstable zeros and poles of the plant is necessary for the software to build and use the necessary formulas.
PROPOSITION 4.6.1. If γ* < 1, then there exist solutions to Design, and if γ* > 1, then there are no solutions to Design.
Proof. By definition, the number γ* is the smallest value possible for γ so that the inequality
has solutions T ∈ I. Therefore γ* > 1 says that there are no T ∈ I that satisfy (4.20), i.e., that there are no solutions to Design. Similarly, γ* < 1 says that there is at least one function T ∈ I that satisfies (4.20), which implies that there exist solutions to Design.
4.8 Exercises
1. Is the Γ given below of quasi-circular type? Take
ANSWERS
1a. Yes.
b. No, the level curves are ellipses.
c. No, the level curves are lemniscates.
d. Yes, because linear fractional transformations take circles to circles.
Chapter 5

A Design Example with OPTDesign

5.1 Introduction
This chapter introduces the reader to practical design by showing how the ideas
learned so far feed into a computer implementation. It is written in a generic
tone, so that one need not know anything specific about computation to get a
concrete idea of how to do a control design. The best explanation of this subject
is to give an actual computer design session. Unfortunately, this usually entails
getting involved in many specialized details peculiar to a package. However, we
have developed a program whose use is close enough to conventional English
that anyone can read it without specialized knowledge.
Our program is called OPTDesign and runs under Mathematica. This permits any standard symbolic or numerical calculations to be done in a language that is quite compatible with standard mathematical notation.
and the requirements are tracking on the band 0 < ω < 0.3 with bound 1.5, gain-phase margin constant of 0.5, bandwidth 0 < ω < 2.0; and closed-loop roll-off with bound 2.5 on the band 4.0 < ω (see Table 5.1). We wish to design an internally stable system S that satisfies the given performance requirements.
The following is a list of inputs for a computer run. If you wanted to try
OPTDesign you would begin by loading OPTDesign into a Mathematica session.
<<OPTDesign`
Table 5.1. The constraints (tracking, gain-phase margin, bandwidth, and roll-off) and their frequency bands. [Tabular values lost in reproduction.]
We enter the center and radius functions k0 and r0 directly as step functions. For example, in Mathematica this is done using the Which[ ] command. This is a strange name for what most English speakers call "If."
wt = 0.3;
alphat = 1.5;
alphagpm = 2.0;
wb = 2.0;
alphab = 0.7;
wr = 4.0;
alphar = 2.5;
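For readers following along in another language, the same kind of step function is written with ordinary conditionals; a Python sketch with placeholder band edges and levels (the book's actual k0 and r0 are not reproduced here):

```python
def step_center(w, wt=0.3, wb=2.0):
    """Piecewise-constant center function, analogous to Mathematica's
    Which[Abs[w] <= wt, 1.0, Abs[w] <= wb, 0.7, True, 0.0].
    Band edges and levels are illustrative placeholders."""
    if abs(w) <= wt:       # tracking band: center at 1
        return 1.0
    elif abs(w) <= wb:     # bandwidth region: intermediate level
        return 0.7
    return 0.0             # beyond the bandwidth: center at 0

print(step_center(0.1), step_center(1.0), step_center(5.0))  # 1.0 0.7 0.0
```

A radius step function r0 would be built the same way, one constant level per frequency band.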
of how much distortion we are introducing into the problem by smoothing functions. We must keep in mind that the solution obtained with OPTDesign will be optimal with respect to the "smoothed" envelope. The following command produces the plot shown in Fig. 5.1.
EnvelopePlot[Radius->r0, Center->k0, FrequencyBand->{0., wr+1.}]
A 3-D plot of the requirements envelope (see Fig. 5.2) is produced with
EnvelopePlot3D[Radius->r0, Center->k0]
5.3). Note that in our example most of the gridpoints are in the band 0.1 < ω < 10.0, which is where the center and radius functions have their distinctive features. This is desirable.
EnvelopeLogPlot[Radius->r0, Center->k0,
FrequencyBand->{0., wr+1.}, Discrete->True];
If we do not like either of the plots in Figs. 5.1 and 5.3, then several input
parameters can be modified to rerun the problem and to obtain more satisfactory
plots. We leave this for a later section and proceed now with a run of the
program OPTDesign with the given data.
Diagnostics and progress reports are routinely printed to the screen as the
program OPTDesign runs:
Summary
Observe that the output parameter γ* (in the above output, gamma*) is less than 1. The column ned in the screen output is a measure of numerical error or noise. Flat and GrAlign are diagnostics. If they are near 0, the calculated solution is nearly optimal. We say more on this later in section 5.6.
We conclude that there are solutions to the (smoothed) Design problem and that the accuracy of the computation is acceptable. Furthermore, since γ* is much less than 1, we can tighten the performance requirements substantially and still get a solution. The calculated closed-loop transfer function, open loop, and compensator output are stored in the output variables T, L, and Co, respectively. We postpone discussing their format until subsection 5.5.1, where we show how to plot and manipulate them. Now we move on to the production of a compensator.
Error = 0.171731
[RationalModel output: a rational function whose denominator is (2. + s)(2. + s)(3.769 + 1. s); the numerator was lost in reproduction.]
with stable function T1rat(s). The proper rational functions A and B depend on the plant and incorporate automatically the internal stability requirements on T. You must run OPTDesign before you run RationalModel.
Recall that the compensator C and the closed-loop transfer functions are related by the formula
There is no point in writing out the formula for the output now, since most valuable to us will be a zero-and-pole plot of comp1. The following command produces the plot shown in Fig. 5.4.
PlotZP[comp1[s], s]
The figure suggests that comp1 has a pole-zero pair at s = 5.³ The standard Mathematica functions do not cancel terms with decimal notation. To do the cancellation we can use a function provided with OPTDesign:
² The Error number displayed by the RationalModel routine refers to the fit of T₁ and not to the fit of T.
³ One way to confirm this is
comp1zeros = s /. NSolve[Numerator[comp1]==0, s]
{-2., -2., -0.987364, 5.}
comp1poles = s /. NSolve[Denominator[comp1]==0, s]
{5., -6.60224, -1.04396 - 0.710493 I, -1.04396 + 0.710493 I}
Hence comp2(s) results from the simplification of comp1. We plot the poles and zeros of comp2 now (see Fig. 5.5).
PlotZP[comp2[s], s]
It is clear from the plots in Fig. 5.4 and Fig. 5.5 that the compensator comp2
does not cancel RHP poles or zeros of the plant, which is a necessary condition
for internal stability of the overall system S. Now we calculate the closed-loop
transfer function Trat2 that corresponds to the compensator comp2, and follow
this with a plot of the poles and zeros of Trat2 (see Fig. 5.6).
PlotZP[Trat2[s],s]
Note that since comp2 is proper and Trat2 is stable, for this choice of compen-
sator the system is internally stable.
For those who like old-fashioned 2-D plots, the following command produces
the plot in Fig. 5.8.
One usually wants to look at Bode plots to evaluate the design. These can be
obtained in several ways in most control packages. For example, in OPTDesign
the following commands produce the plots.
BodeMagnitude[Tw,s,{w,0.1,10},PlotLabel->"Magnitude Plot"];
Plots for the open loop L = PC can be produced in similar fashion by plotting
L = pcomp2.
One can see in the magnitude plot of the closed loop that most of the sample points are located in the frequency band 0.1 < ω < 10. One can judge this to be acceptable from the numerical standpoint, based on the fact that both the input functions k0 and r0 change mostly in this band. For details see section 5.8.

Fig. 5.9. Bode plot (magnitude) of the closed-loop transfer function Trat2.
also have this feature of working either on functions defined by formulas or on lists of data.
If one wants to plot or manipulate T by itself, then one must access the grid on the jω axis where your OPTDesign session is evaluating T, via
Grid[ ]
Fig. 5.10. Bode plot (phase) of the closed-loop transfer function Trat2.

The output is a list of numbers on the jω axis. Now we can plot T using
RHPListPlot3D[ T, PlotRange -> {{0,3},Automatic,Automatic}]
where the PlotRange option sets the range you will see.
One can put together a function's values with the grid on which it is defined, using
OPTDParametrize[T]
Note that you may achieve identical results with
Transpose[{Grid[], T}]
Indeed, this is how OPTDParametrize is defined.
It might well be useful to note that the plotting commands above have a very simple core plus a few embellishments to make the scales, to make labels, to put the −1 point into the Nyquist plot, etc. As an example we illustrate how one builds plotting routines such as Nyquist and RHPListPlot3D. The core of RHPListPlot3D[ T ] is
ScatterPlot3D[ Transpose[ {Grid[], Re[T], Im[T]} ] ]
The core of Nyquist[ L ] is
ListPlot[ Transpose[ {Re[L], Im[L]} ] ]
So far we have discussed functions presented as data sets. Now we mention that if you have a rational function Tr rather than a lot of discrete values, Discretize[Tr] gives a list of values of Tr on the ambient OPTDesign session grid.
The commands mentioned in this section and their output are discussed
further in Appendix G.
[Table: diagnostic names and their meanings.]
OPTDesign[p, Center->k0, Radius->r0, GridPoints->wpts];
2. The requirements envelope changes very fast over a frequency region. This may be the result of your way of setting up the envelope. One possibility is to change it to make it gentler, either by smoothing it or by redefining it. In this case one should verify that these changes preserve the physical requirements for the design. For a successful computer calculation, one must make sure that the chosen grid has many points in those bands, as we just described in item 1. The smoothing in OPTDesign runs is specified with the option Nsmth. The input Nsmth -> 1 indicates a small amount of automatic smoothing; more smoothing is achieved with Nsmth -> 5 or Nsmth -> 10. The input Nsmth -> 0 corresponds to no automatic smoothing. An example of a command is
OPTDesign[p, Center->k0, Radius->r0, Nsmth->5, GridPoints->wpts];
One may decide whether the rational function chosen to approximate data is
of good quality simply by displaying a plot of the absolute value of the difference
of the data and the function (evaluated at the frequency values specified by the
data) and studying the resulting profile.
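A sketch of that check in Python (the data and model here are illustrative; in practice the data come from the OPTDesign grid and the model from RationalModel):

```python
import numpy as np

def fit_error_profile(freqs, data, model):
    """Absolute difference between sampled frequency data and a rational
    model, evaluated at the same frequencies; plotting err against freqs
    gives the quality profile, and err.max() a single quality number."""
    err = np.array([abs(d - model(1j * w)) for w, d in zip(freqs, data)])
    return err, err.max()

# Illustrative check: sample T(s) = 1/(s+1) and "fit" it exactly.
freqs = np.linspace(0.1, 10.0, 50)
exact = lambda s: 1.0 / (s + 1.0)
data = [exact(1j * w) for w in freqs]
err, worst = fit_error_profile(freqs, data, exact)
print(worst)  # 0.0 here; a real fit leaves a small nonzero profile
```

One would then plot err against freqs and look for frequency bands where the approximation is poor.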
The function RationalModel uses an algorithm by L. N. Trefethen for doing Carathéodory–Fejér approximation. To use it, type
The grid should be of the type discussed in section 5.8 and further in section 6.1.1.
The OPTDesign package also contains a more powerful general-purpose routine for approximation of data called NewtonFit. See the documentation for NewtonFit in Appendix F.
5.10 Exercises
1. In the problem treated in this chapter, smooth the envelope a lot, say
Smoothing -> 50, and compare it to the unsmoothed envelope.
2. While fixing all other specs in section 5.2, make the gain-phase margin
as small as possible. (This is the Glover-McFarlane [GM90] approach to
design.) How much does smoothing affect the answer?
3. Same as problem (2), but vary the tracking error.
For example, the effect of this on RationalModel is that it may not work satisfactorily when
1. DegreeOfDenominator -> n is used as input with n > 4.
2. The numerical error diagnostic in OPTDesign is not small.
3. You least expect it. To be safe you must run with many different initializations.
Part II

More on Design
Chapter 6
Examples
• Points on the jω axis for sampling functions are badly chosen.
Fig. 6.1. An example of a 64-point grid on the ω axis produced with the OPTDesign package command Grid[Ngrid -> n, GridSpread -> b], which gives n points distributed around ±b. Practically all (positive) points lie between 0.1b and 10b.
At certain frequencies ω, the value of the performance Γ(ω, T(jω)) does not depend on T.
Fig. 6.2. An example of a 512-point grid on the ω axis produced with the OPTDesign package command Grid[Ngrid -> n, GridSpread -> b]. Most of the (positive) points lie between 0.01b and 100b.
Fig. 6.3. Plot of the function k₁(ω) obtained from k(ω) by using linear interpolation with transition region 1.5 < ω < 2.5. The following Mathematica commands produce k₁(ω): wa = 0.5; wb = 1.5; line1[w_] = InterpolatingPolynomial[{{wa, 1}, {wb, 0}}, w]; k1[w_] = Which[Abs[w] <= wa, 1, wa < Abs[w] <= wb, line1[Abs[w]], wb < Abs[w], 0];
Fig. 6.4. The plot of the function k₂(ω) obtained from k(ω) by using high-order interpolation with transition region 1.5 < ω < 2.5 and derivatives at the endpoints specified to be 0. The following Mathematica commands produce k₂(ω): wa = 0.5; wb = 1.5; poly1[w_] = InterpolatingPolynomial[{{wa, {1, 0}}, {wb, {0, 0}}}, w]; k2[w_] = Which[Abs[w] <= wa, 1, wa < Abs[w] <= wb, poly1[Abs[w]], wb < Abs[w], 0];
Other types of interpolation are possible. For example, one may do it with a
polynomial so that derivatives at the endpoints of the transition region have
certain values. See Figs. 6.3 and 6.4.
for some T₁ ∈ RH∞. If the system S satisfies the closed-loop roll-off inequality
That is, we obtain a problem where the variable is T₁ and the radius function is equal to a for all large ω, which is acceptable. Computer programs such as OPTDesign do this step automatically.
Figure 6.7 represents this problem with the mechanical parts replaced by their electrical equivalents, where
Take ω₁ as the output, and scale the frequency by 100 to obtain the following coefficient matrices, where a = 1/µ.
The parameter a varies between 1 and 10. The maximum load corresponds to a = 1. The plant transfer function from input u to output ω₁ is computed from the formula
We obtain
The parameter a accounts for the variable load in this example. Here a varies
between a = 10, which is the minimum-load case, and a = 1, which is the
maximum-load case. The nominal case corresponds to a = 5. The nominal
plant is
The nominal plant is stable (so are all the other plants Pa for all a ∈ [1, 10]).
The magnitude of the nominal plant is shown in Fig. 6.8.
According to Larson, these requirements should be met over the full range of load
perturbations. We will instead do the (much easier) problem of solving Design
for the nominal plant only. The problem for loads other than the nominal will
not be discussed here.
It is clear that the gain-phase margin constraint is the only one that is binding at midrange frequencies 1 < ω < 5. Also note that m = 0.5 ensures a gain margin of 6 dB (see Fig. 6.9).
Setting a closed-loop roll-off constraint. For many problems with a proper compensator one can use the plant P(s) to provide the profile of the envelope, by setting
In our case we set ωr = 10.0. A plot of the magnitude of the plant (Fig. 6.8) shows that it has a peak at a frequency close to our choice of ωr, so near this frequency |P(jω)| changes quickly. In light of this we do not use the plant in the closed-loop roll-off constraint. Instead, we shall state constraint (6.12) in terms of a function whose magnitude has a simple behavior at frequencies ω > ωr, and such that its roll-off corresponds to that of a rational function with relative degree 2. We choose
With respect to internal stability we merely note that there are no RHP poles or zeros of the plant P(s). Hence the internal stability of the system with plant P(s) (and proper compensator) is guaranteed by simply requiring that the closed-loop transfer function T(s) have relative degree 2.
A circular performance function
and
The center K(ω) and radius R(ω) have jump discontinuities (see Figs. 6.10 and 6.11).
The next thing to do is to remove the jump discontinuities from the requirements envelope. To do it, we define continuous functions K₁(ω) and R₁(ω) that approximate K(ω) and R(ω) except near the jump discontinuities, where K₁(ω) and R₁(ω) are defined as linear functions. See Figs. 6.12-6.15.
6.2.5 Optimization
We now have all the elements necessary to state and solve the design problem
by optimization of the performance. A run of the program OPTDesign with
Fig. 6.10. 2-D plot of the requirements envelope determined by the center function K(ω) and radius function R(ω).
Fig. 6.11. 3-D plot of the requirements envelope determined by the center function K(ω) and radius function R(ω).
Fig. 6.12. The radius functions R₁(ω) (thin) and R(ω) (thick) on three different frequency ranges.
Fig. 6.13. The center functions K₁(ω) (thin) and K(ω) (thick).
Fig. 6.14. 2-D plot of the requirements envelope corresponding to the center function K₁(ω) and radius function R₁(ω).
Fig. 6.15. 3-D plot of the requirements envelope corresponding to the center function K₁(ω) and radius function R₁(ω).
256 gridpoints and with heavy smoothing yields γ* = 0.618, so there exist solutions to the design problem with the modified envelope (with smoothing). The plots of the optimal closed-loop transfer function T* and the corresponding sensitivity function S = 1 − T* indicate that the sensitivity has magnitude −5 dB for frequency ω near 1, while in the original performance requirements it was specified that this magnitude had to be less than −10 dB. See Figs. 6.16-6.18. Thus we must take further action to solve the original design problem.
Fig. 6.19. The center function K₂(ω) values drop in magnitude to the right of ω = 1.3. Here K₂(ω) is shown as the thin line and K(ω) is shown as the thick line.
Fig. 6.20. The radius function R₂(ω) values are smaller than 0.25 for 0 < ω < 1.3. Here R₂(ω) is shown as the thin line and R(ω) is shown as the thick line.
The run of OPTDesign with the new envelope produces an acceptable closed-loop transfer function. A degree 6 rational approximation to the new optimal closed-loop transfer function is readily obtained with the function RationalModel. We have
Fig. 6.23. Bode magnitude plot of the sensitivity function S₂(s) = 1 − T₂(s).
Fig. 6.24. Plot of zeros and poles of the closed-loop function T₂(s).
Fig. 6.28. A step response function that satisfies the overshoot and settling time requirements when M = 0.1, ts = 15.0, and Δ = 0.02. In this figure, the step response curve cannot run into the shaded area without violating the constraints.
and
It is a fact that functions Tref(s) given by (6.22) come from systems with a type n = 1 plant (i.e., the plant has a simple pole at s = 0). Hence if settling time and even overshoot are important in a design problem, then the function Tref given in (6.22) may give useful information only if the plant of the system to be designed is of type n = 1.
It is easy to see that if T(s) is given by the formula below,² then T(s) has relative degree 1 and satisfies (6.25).
where T₁(s) is any bounded and stable function. We will pick Tref(s) so that it satisfies equation (6.26), but to make the task of choosing Tref manageable, we will consider only those T given by
where b > 0. The real parameters a, b, and c are chosen by trial and error so that
We need to find a closed-loop system S that is internally stable and meets the
performance requirements.
Fig. 6.29. Step response for a system with closed-loop function Tref.
Although there are many choices of a, b, and c, the value c = 0 gives a reasonable step response (Fig. 6.29). Hence we set
Due to the pole of the plant at s = 0, the plant bound constraint is binding at low frequencies. A plot of the closed-loop plant for the function Tref(s) is shown in Fig. 6.30, while the magnitudes of Tref(s) and of the sensitivity function S(s) = 1 − Tref(s) are depicted in Fig. 6.31. Figures 6.30 and 6.31 suggest a
Fig. 6.30. Plot of the plant bound function for the system when the closed-loop function is Tref.
center function k(jω) that is equal to 1 at low frequencies (up to ωp = 0.7, say) and then drops to be close to 0 after ωb = 2.0. We choose k to be piecewise linear, so that if l(ω) denotes a line interpolating (0.7, 1) and (2.0, 0), then
(ω < ωp), the radius equals ap/|p(jω)|, where ap = 0.9 is taken from Fig. 6.30. At frequency ωb = 2, Fig. 6.31 suggests a radius equal to 0.75. Finally, we pick the roll-off frequency to be ωr = 10.0. Figure 6.31 suggests a radius function with value 0.25 at the roll-off frequency ω = ωr = 10.0, so we set the radius to linearly interpolate these values. For higher frequencies, the radius is set to a multiple of the magnitude of the plant, |p(jω)|. Let ℓ₁(ω) (resp., ℓ₂(ω)) denote the linear interpolant between frequencies ω = 0.7 and ω = 2.0 (resp., ω = 2.0 and ω = 10.0). Then r(ω) is given by
See Figs. 6.32 and 6.33. Computer code for this example is given in Appendix
C and in the file appendixch6.nb.
A computer run with OPTDesign produces an optimal value γ* = 1.46, so there is no solution to the problem with the modified envelope requirements.
Fig. 6.37. Another view of the step response function for closed-loop T₁.
Fig. 6.39. A view of the zeros and poles of C₁(s) close to the origin.
The zeros and poles of C₂(s) are shown in Fig. 6.40, while a plot of the magnitudes of C₁(s) and C₂(s) is given in Fig. 6.41.
The closed-loop function T₂(s) that comes from the compensator C₂(s) is given by
Finally, the step response that corresponds to the choice of C₂(s) as compensator is shown in Fig. 6.42.
6.5 Performance for competing constraints
For simplicity and physical realism we shall assume that W₁(ω) is near zero or is zero at very high frequencies, that W₂(ω) is near zero or is zero at very low frequencies, and that W₂(ω) has magnitude comparable to |P(jω)|⁻¹ for large frequencies. In particular, we do not impose a relative degree requirement on the compensator.
The next two subsections describe practical approaches to solving (Prob1).
or
Fig. 6.43. Sublevel sets Sω arising from two circular performance requirements: |1 − T| ≤ 0.6 and |T| ≤ 0.8. The intersection of the two sets has corners in its boundary.
Note that Sω is the intersection of two disks, so it typically has "corners" in its boundary. As a consequence of this, the requirements envelope cannot be expressed in terms of center and radius functions like the examples we have encountered so far in this book.
Performance functions with corners in the boundary of the level sets are not differentiable and are difficult to treat numerically. In this section we shall describe two approaches to dealing with (Prob1) numerically, but first we restate (Prob1) as
Table 6.1 gives information about other output, and Figs. 6.45-6.47 display plots of relevant functions.
Conclusion. Solutions Tp for (Prob1) were obtained using performance functions Γp, for p = 2, 4, 8. The function T₈(s) gives the best overall weighted sensitivity and weighted magnitude (in more difficult examples its calculation
³ A square root in the formula of the performance function has been removed, since for optimization purposes such a power affects the optimal value of the performance, but the function that optimizes the performance does not change.
subject to
T internally stabilizes S.
Such z are called feasible. When |W₂(ω)z|² gets near 1, the value of Fε becomes quite large. Hence the logarithm in Fε heavily punishes z for being close to violating the inequality (6.45). The logarithmic term in Fε is called a barrier function. Also, by choosing a suitable ε one can manipulate the contribution of the barrier function to the value of Fε.
The algorithm for solving constrained optimization problems with barrier functions is presented now.
b1. Use T⁰ as initial guess to find T* that minimizes sup_ω Γε(ω, T(jω)) over all T that make S internally stable.
b2. Update ε ← ε/2, T⁰ ← T*.
b3. Stop if T⁰ satisfies a preset tolerance criterion; else repeat (b1)-(b3).
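A toy one-dimensional analogue of steps b1-b3 can be sketched in Python; the objective, constraint, and crude inner search below are illustrative stand-ins for the actual H∞ machinery:

```python
import math

def barrier_minimize(f, g, x0, eps=1.0, tol=1e-6, max_outer=40):
    """Minimize f(x) subject to g(x) < 1 by repeatedly minimizing the
    barrier function F_eps(x) = f(x) - eps*log(1 - g(x)) and halving
    eps, mimicking steps b1-b3."""
    def minimize_1d(F, x, step=0.1, iters=80):
        # crude coordinate search; a stand-in for the real inner solver
        for _ in range(iters):
            for s in (step, -step):
                while g(x + s) < 1 and F(x + s) < F(x):
                    x += s
            step /= 2
        return x

    x = x0
    for _ in range(max_outer):
        F = lambda t: f(t) - eps * math.log(1 - g(t))
        x_new = minimize_1d(F, x)            # b1: minimize the barrier fn
        if abs(x_new - x) < tol:             # b3: tolerance criterion
            return x_new
        x, eps = x_new, eps / 2              # b2: update eps and the guess
    return x

# minimize (x-2)^2 subject to x^2 < 1: constrained optimum at x = 1.
x_star = barrier_minimize(lambda x: (x - 2.0) ** 2, lambda x: x ** 2, 0.0)
print(round(x_star, 3))  # close to the constrained optimum x = 1
```

As ε shrinks, the barrier's influence fades and the iterates approach the boundary of the feasible region, just as the text describes.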
Example. Consider P(s) = …, W₁ = |P(jω)|, and W₂ = |P(jω)|⁻¹ in (Prob2). A computer run with 32 gridpoints and the barrier method initialized with ε = 1 and T⁰(s) = 0.1 + … produces
and the results shown in Table 6.2. Also see Fig. 6.48.
to higher weighted magnitude of the closed-loop function (but still acceptable), and leads to lower magnitude of the weighted sensitivity. A computer code template for doing runs similar to this one is given in Appendix C and in the file appendixch6.nb.
Chapter 7
Internal Stability II
Internal stability was introduced in Chapter 2. This chapter continues the discussion to obtain theorems that precisely characterize internally stable systems,
either in terms of zeros and poles of the plant or in terms of a formula for
the closed-loop transfer function. Section 7.1 develops the mathematical tools
for interpolation with rational functions. In section 7.2 a characterization of
internally stable systems is given in terms of interpolation conditions on the
closed-loop transfer function, when the plant has simple RHP zeros and poles.
The case of higher multiplicity is treated in section 7.3.
for some constants c₀, c₁, . . . , cₙ₋₁. The right-hand side of (7.3) is called Newton's representation for the function T(s).
Now we seek T ∈ RH∞ of the form (7.3) that satisfies INT. Set s = s₁, s = s₂, . . . , s = sₙ in (7.3) and combine with INT to obtain the system of equations
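Since the k-th basis term vanishes at the earlier nodes, the system (7.4) is lower triangular and can be solved by forward substitution. A sketch in Python, assuming the Newton-type basis c₀ + c₁(s − s₁)/(s − a) + c₂(s − s₁)(s − s₂)/(s − a)² + · · · with pole location a (the nodes and values below are illustrative):

```python
def newton_interpolant(nodes, values, a):
    """Solve for c0, ..., c_{n-1} in Newton's representation
       T(s) = c0 + c1 (s-s1)/(s-a) + c2 (s-s1)(s-s2)/(s-a)^2 + ...
    by forward substitution (the system is lower triangular), then
    return T as a callable."""
    n = len(nodes)
    c = []
    for k in range(n):
        s = nodes[k]
        acc = 0.0
        for j in range(k):              # contribution of the known c_j
            term = c[j]
            for i in range(j):
                term *= (s - nodes[i]) / (s - a)
            acc += term
        basis = 1.0                     # coefficient of c_k at s = nodes[k]
        for i in range(k):
            basis *= (s - nodes[i]) / (s - a)
        c.append((values[k] - acc) / basis)

    def T(s):
        total, prod = 0.0, 1.0
        for k in range(n):
            total += c[k] * prod
            prod *= (s - nodes[k]) / (s - a)
        return total
    return T

# Interpolate T(1) = 2, T(2) = 3, T(3) = 5 with pole location a = -1.
T = newton_interpolant([1.0, 2.0, 3.0], [2.0, 3.0, 5.0], a=-1.0)
print(T(1.0), T(2.0), T(3.0))  # the interpolation conditions hold
```

The diagonal entries (the `basis` values) are nonzero because the nodes are distinct and differ from a, which is why the solution exists and is unique, as Exercise 2 below asks you to prove.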
Example. Find T ∈ RH∞ with a = −1 as its only pole location, such that
where γ is any nonzero real constant. Observe that the function T₀ in (7.8) has no zeros in the RHP other than s₁, . . . , sₙ.
obtained from INT by setting the right-hand sides to 0. The set of conditions INT₀ plays a fundamental role in parameterizing all functions T ∈ RH∞ that satisfy INT.
THEOREM 7.1.1. Let a set INT of interpolation conditions be given. Also given are functions T₁, T₀ in RH∞ such that T₁ satisfies INT, T₀ satisfies INT₀, s₁, . . . , sₙ are the only zeros of T₀, and T₀ has relative degree 0. Then every T in RH∞ that satisfies INT has the form
Example. We will find a formula for all functions T ∈ RH∞ that satisfy
We can choose any negative number as the location of the pole of the rational
functions T0 and T\ in formula (7.9). We select a = —2; set
and
where
is in RH∞. This proves half of the lemma. Now suppose that the relations in (7.16) hold. If there is a pole-zero cancellation in PC at a point s = s₀ in the RHP, then either Q = C/(1 + PC) or PS = P/(1 + PC) has a pole at s = s₀, both of which contradict the hypothesis.
To illustrate the use of this lemma, we verify (7.16) for the system S₁ with plant P(s) = 1/(s − 1) and compensator C₁(s) = (s − 1)/(s + 1). In section 2.3 we saw that T₁(s) = 1/(s + 2), which belongs to RH∞. We also have
We see that the closed-loop plant PS₁ has a pole at s = 1. By Lemma 7.2.1 we can assert that the system S₁ is not internally stable.
Now consider the system S₂ with the same plant and compensator C₂(s) = 2. Recall that T₂(s) = 2/(s + 1), so
and
and suppose that all the RHP zeros of T₀ are listed in (7.23).
If T ∈ RH∞ is the closed-loop transfer function of an internally stable system S with plant P and d(C) = d_C, then there exist H ∈ RH∞ such that
Conversely, if (7.24) holds for some H ∈ RH∞, then the system S associated with P and T is internally stable, and the degree of its compensator is d(C) = d_C.
conditions
Furthermore, we suppose that all the RHP zeros of T₀ are listed in INT₀ʰ and that
A variation of the method we have been using works for higher-order interpolation problems as well. The difference here is that the formula for T now contains summands with numerators of the form 1, (s − s₁), (s − s₁)², . . .,
Conversely, if T is any function in RH∞ that satisfies INTʰ for some d_C ≥ 0, then the closed-loop system S associated with T is internally stable and its compensator has relative degree d_C.
We now state the result with the parameterization of internally stable systems.
THEOREM 7.3.3. Assume the hypotheses of Theorem 7.3.2, and let T₁, T₀ in RH∞ be such that T₁ satisfies INTʰ and T₀ satisfies INT₀ʰ. Furthermore, suppose that all the RHP zeros of T₀ are listed in INT₀ʰ and that T₀^(nᵢ)(pᵢ) ≠ 0 (i = 1, . . . , n), T₀^(mₗ)(zₗ) ≠ 0 (ℓ = 1, . . . , m).
If T ∈ RH∞ is the closed-loop transfer function of an internally stable system S with plant P and d(C) = d_C, then there exist H ∈ RH∞ such that
Conversely, if (7.32) holds for some H ∈ RH∞, then the system S associated with P and T is internally stable, and the degree of its compensator is d(C) = d_C.
7.4 Exercises
1. Find an interpolant for each set of conditions below.
a. pole location …
b. pole location …
c. pole location …
d. pole location …
2. Prove that the system of equations (7.4) does have a solution in c₀, . . . , cₙ₋₁ and that this solution is unique.
3. Prove that if the function T ∈ RH∞ satisfies T(s₁) = 0, . . . , T(sₙ) = 0, then either T is identically zero or degree{denominator(T)} ≥ n.
4. Prove that given INT with n data points, there exists a unique T ∈ RH∞ that satisfies INT, with the following property: if p and q are polynomials such that T = p/q, then degree(q) < n.
5. Find all interpolants satisfying the conditions stated in Exercises la-d.
6. Prove Theorem 7.1.1.
7. Find an interpolant for each set of conditions below.
a. pole location s = −1, relative degree d(T) = 2
b. pole location …
c. pole location …
d. Do problems a-c
Chapter 8

H∞ Optimization and Control

We denote by A_RHP the set of all functions f(s) that are real, bounded, and continuous in the closed right half-plane (RHP) and that are analytic in the open RHP. In particular, for any f(s) ∈ A_RHP the function f(jω) is a bounded, continuous function of ω that satisfies f(jω) = conj(f(−jω)). The set A_RHP is a (real) vector space with the usual properties of addition
case" is the frequency ω at which sup_ω Γ(ω, f(jω)) occurs. We minimize this over all f ∈ A_RHP. Thus we obtain exactly the problem OPT.
The theory of OPT is based in part on the analysis of the sublevel sets
For each desired level of performance c, one has target sets Sω(c) associated with each frequency ω. The objective is to find a function f with no poles in the RHP such that each f(jω) belongs to Sω(c). Any such f makes the performance of the overall system at least as good as c for all ω.
The transformation (8.2) maps the open RHP one-to-one and onto the open unit disk D, and maps the jω axis together with ω = ∞ one-to-one and onto the unit circle.
A fundamental property of this coordinate change is that functions remain analytic, bounded, and real: f(s) is analytic, bounded, and real for s ∈ RHP if and only if F(ζ) is analytic, bounded, and real for ζ ∈ D. The following table gives some pairs (s, ζ) that arise from this transformation.
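Assuming (8.2) is the usual Cayley transform ζ = (s − 1)/(s + 1) (the book's exact normalization may differ), the mapping properties can be checked numerically:

```python
def cayley(s):
    """Map the right half-plane into the unit disk: zeta = (s-1)/(s+1)."""
    return (s - 1) / (s + 1)

# Points in the open RHP land inside the unit disk...
for s in (1.0, 2 + 3j, 0.5 - 1j):
    assert abs(cayley(s)) < 1
# ...and points on the jw axis land on the unit circle.
for w in (0.0, 1.0, -4.5):
    assert abs(abs(cayley(1j * w)) - 1) < 1e-12

print(cayley(1.0))   # s = 1 maps to the center zeta = 0
```

The unit-circle assertion reflects the fact that |jω − 1| = |jω + 1| for every real ω.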
Note that f(s) has a pole in the LHP and a zero in the RHP. Correspondingly, F(ζ) has a pole outside the unit disk and a zero inside the unit disk.
² Mathematics literature treats functions on the disk rather than on the RHP. There are other reasons, e.g., our graphical test in section 9 is easier on the disk, Fourier transforms and their discrete versions are available and easy to implement on the computer, etc.
8.4. PERFORMANCE FUNCTIONS 121
Theory of the OPT problem (see [HMar90]) implies that the existence and
other properties of solutions to OPT are closely related to properties of the
sublevel sets of the performance function. Usually "well-behaved" sublevel sets
correspond to solutions one can consider "nice." A well-behaved sublevel set
has the following characteristics:
Boundedness: The set S_θ(c) is contained in some disk of finite radius.
Connectedness: Two points in S_θ(c) can be joined by an arc that lies
entirely in S_θ(c).
Simple connectedness: No "holes"; i.e., the set complementary to S_θ(c) is
connected.
Nonvanishing gradient: The gradient of Γ(e^{jθ}, z) in z = x + jy is not zero
at any z on the boundary of S_θ(c).
Other desirable properties are:
Convexity: Two points in S_θ(c) can be joined by a segment that lies entirely
in S_θ(c).
Smooth boundary: In particular, the boundary of S_θ(c) has no corners.
Smooth dependence on θ: In particular, small changes in θ lead to small
changes in S_θ(c).
Example 3. Suppose that two continuous functions k(e^{jθ}) and w(e^{jθ}) are
given and that w(e^{jθ}) > 0 for all θ. With these functions we build the performance
Example 4. Suppose that w₁(e^{jθ}) and w₂(e^{jθ}) are given positive functions,
and
Therefore S_θ(c) is a disk with center w₂(e^{jθ})/(w₁(e^{jθ}) + w₂(e^{jθ})) and radius
and Γ is quasi-circular.
then S_θ(c) is an ellipse centered at 1/e^{jθ} and with semiaxes that vary with θ.
Example 6. Set
Then S_θ(c) is a lemniscate centered at 1/e^{jθ}. For small c the sublevel sets are
convex, for c slightly smaller than 1 the sublevel sets are nonconvex, and for
c > 1 the sublevel sets are no longer connected.
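One can watch this change of topology numerically. Example 6's formula did not survive reproduction, so as a stand-in we use the classical lemniscate performance γ(z) = |z² − 1|, whose sublevel sets {|z² − 1| ≤ c} are two disjoint ovals for c < 1 and a single connected set for c > 1 (the threshold direction is reversed relative to the example in the text, but the phenomenon is the same). A grid-based component count:

```python
from collections import deque

def components(gamma, c, lo=-2.0, hi=2.0, n=81):
    """Count connected components of the sublevel set {z : gamma(z) <= c},
    sampled on an n-by-n grid over [lo,hi]^2, using 4-neighbor flood fill."""
    h = (hi - lo) / (n - 1)
    inside = [[gamma(complex(lo + i*h, lo + j*h)) <= c for j in range(n)]
              for i in range(n)]
    seen, count = [[False]*n for _ in range(n)], 0
    for i in range(n):
        for j in range(n):
            if inside[i][j] and not seen[i][j]:
                count += 1
                q = deque([(i, j)])
                seen[i][j] = True
                while q:
                    x, y = q.popleft()
                    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        u, v = x + dx, y + dy
                        if 0 <= u < n and 0 <= v < n and inside[u][v] and not seen[u][v]:
                            seen[u][v] = True
                            q.append((u, v))
    return count

lemniscate = lambda z: abs(z*z - 1.0)
```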
Chapter 9

Solutions to OPT
Partial derivatives with respect to the variables z and z̄ are defined by the
formulas
126 CHAPTER 9. SOLUTIONS TO OPT
The partial derivatives with respect to z and z̄ satisfy the standard differentiation
rules for functions of one real variable. Some examples are
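The defining formulas and the examples were lost in reproduction; they are the standard Wirtinger rules. For z = x + jy:

```latex
\frac{\partial}{\partial z} \;=\; \frac{1}{2}\left(\frac{\partial}{\partial x} - j\,\frac{\partial}{\partial y}\right),
\qquad
\frac{\partial}{\partial \bar z} \;=\; \frac{1}{2}\left(\frac{\partial}{\partial x} + j\,\frac{\partial}{\partial y}\right),
```

and, for example,

```latex
\frac{\partial z}{\partial z} = 1, \qquad
\frac{\partial \bar z}{\partial z} = 0, \qquad
\frac{\partial}{\partial z}\,|z|^2 = \bar z, \qquad
\frac{\partial}{\partial \bar z}\,|z|^2 = z,
```

with g analytic exactly when ∂g/∂z̄ = 0.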
Fig. 9.1. A sublevel set S_θ(c) = {z : g(z) ≤ c} of a real-valued function g(z),
and the gradient vector n at a point of the boundary of S_θ(c).
In particular we have
The following is a very nice geometric interpretation of (i) and (ii) of Theorem
9.3.1. Suppose that γ* is the optimal value of OPT. The sublevel sets
Fig. 9.2. The optimal performance surface. In the top figure the curve
(e^{jθ}, f*(e^{jθ})) lies on the boundary of the solid. As θ increases from 0 to 2π,
the dark strip representing normals to the sublevel sets makes one complete
clockwise revolution with respect to the θ axis, as indicated in the bottom figure.
Prove that for a suitable k₀, the function f(e^{jθ}) = k₀ e^{jθ} is a solution to OPT.
9.4. PROPERTIES OF SOLUTIONS 129
are basic to the study of qualitative properties of OPT. The standard assumption
that we make on sublevel sets is as follows.
SA (Standard assumption): S_θ(c) is connected and simply connected, and as
θ varies it is uniformly bounded and has area uniformly bounded away
from 0. Furthermore, the gradient of Γ(e^{jθ}, z) in z = x + jy is not zero
for any z on the boundary of S_θ(c).
tailored to control problems where the plant has no poles or zeros on the jω
axis except at ∞. This gives simple formulas that are illustrative of the general
situation.
Let d(P) denote the relative degree of the plant P, and let d(C) denote the
relative degree we shall require of the compensator C we are seeking. Then the
function T = PC(1 + PC)⁻¹ has relative degree d(C) + d(P). We now assume
that Γ(ω, z) has sublevel sets S_ω(c) that shrink to zero at the rate
as ω → ∞. To avoid pathology in S_ω we assume that Γ(ω, z) for large ω is
given by a rational function. This would be the case in a typical control design
problem.
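The additivity d(T) = d(C) + d(P) is easy to confirm symbolically: with P = p_n/p_d and C = c_n/c_d, the closed loop is T = p_n c_n/(p_d c_d + p_n c_n), and since d(P), d(C) > 0 the denominator degree equals deg p_d + deg c_d. A numeric check (the helper names are ours):

```python
import numpy as np

def rel_degree(num, den):
    """Relative degree = deg(den) - deg(num), coefficients highest power first."""
    return (len(np.trim_zeros(den, 'f')) - 1) - (len(np.trim_zeros(num, 'f')) - 1)

def closed_loop(pn, pd, cn, cd):
    """T = PC(1+PC)^{-1} as (num, den) polynomial coefficient arrays."""
    num = np.polymul(pn, cn)
    den = np.polyadd(np.polymul(pd, cd), num)
    return num, den

# P = 1/(s+1) has d(P) = 1; C = (s+1)/(s^2+s+1) has d(C) = 1; expect d(T) = 2.
tn, td = closed_loop([1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0, 1.0])
```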
Now we state a test for optimality that requires checking function values
only for positive ω.
THEOREM 9.5.1. Suppose Γ is C³ and satisfies the assumptions laid out in
this section. Suppose f* ∈ A_RHP produces ∂Γ/∂z̄(ω, f*(jω)) which never equals 0.
Then f* is a strict local minimizer for OPT_RHP if and only if
Moreover, if R is any subset of the complex plane C that does not intersect the
closed unit disk, then one can choose each r_ℓ to have its poles in R.
Proof. One can use a suitable (analytic) linear fractional transformation
v = l(ζ) to map the unit disk to the unit disk, and the set P to a set P₁ that
is a subset of the open disk D(1, 1) centered at 1 with radius 1.¹ This allows
¹ To approximate g on P by analytic functions is equivalent to approximating the function
g₁ = g ∘ l⁻¹ on P₁ by analytic functions of v: if f is a function analytic on the disk such that
132 CHAPTER 10. FACTS ABOUT ANALYTIC FUNCTIONS
us to assume that P is a set contained in the disk D(1, 1), and we make this
assumption in the rest of the proof.
As a first case consider a function g(ζ) = Σ_{ℓ=−n}^{n} c_ℓ ζ^ℓ, where n ≥ 0 is fixed.
The function g(ζ) can be expanded in a power series in ζ − 1 that is uniformly
and absolutely convergent on compact subsets of D(1, 1). By taking as many
terms as necessary in this power series, we can approximate g(e^{jθ}) by analytic
polynomials in e^{jθ}.
The same holds when g is a continuous function on P, since such functions
are approximable by expressions of the form Σ_{ℓ=−n}^{n} c_ℓ (e^{jθ})^ℓ by the Weierstrass
approximation theorem.
Therefore g is approximable by analytic functions on P. Finally, Runge's
theorem [R] guarantees that any analytic function on the disk that is continuous
on the closed unit disk can be uniformly approximated on the circle by rational
functions whose poles are in any specified set that does not intersect the closed
disk.
How badly behaved are the functions r_ℓ in Theorem 10.1.1 outside the arc
P? The answer is given by the following theorem.
THEOREM 10.1.2. Let P be a closed proper subset of the unit circle ∂D. If
g is any continuous function on P, then either g extends to a bounded analytic
function on the disk D or, for any sequence {f_n} ⊂ A such that
Since r is real-valued, we have that r(e^{jθ}) = conj(r(e^{jθ})), and from this it follows that
c_{−n} = conj(c_n). We assume without loss of generality that c_N = c_{−N} = 1. Now r(ζ)
is a rational function of ζ ∈ C, with poles located at ζ = 0 and no zeros on the
unit circle. If ζ₀ ∈ C is such that r(ζ₀) = 0, we have
We have proved that if ζ₀ is a zero of r, then 1/conj(ζ₀) is also a zero. Thus the
corresponding factor,
which (as the reader can easily prove), upon division, yields a polynomial exhibiting
the same type of symmetric arrangement of the coefficients. One can
carry this process on until r is completely factored in the following form:
Since |ζ| < 1 if and only if |1/conj(ζ)| > 1, we may label ζ₁, …, ζ_N as the roots of r
located in the unit disk so that 1/conj(ζ₁), …, 1/conj(ζ_N) are outside the unit disk. Now set
implies that r(e^{jθ}) = h(e^{jθ}) conj(h(e^{jθ})) = |h(e^{jθ})|². This proves the theorem in the case where
r is a polynomial.
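The factorization just described (Fejér–Riesz) can be carried out numerically: given r(e^{jθ}) = Σ_{ℓ=−N}^{N} c_ℓ e^{jℓθ}, real and nonvanishing on the circle, the roots of ζ^N r(ζ) pair up as (ζ_i, 1/conj(ζ_i)); collecting the roots inside the disk gives h with r = |h|² on the circle. A sketch, assuming no roots fall exactly on the circle:

```python
import numpy as np

def spectral_factor(c):
    """Fejer-Riesz: c = [c_{-N}, ..., c_0, ..., c_N] are the coefficients of a
    trigonometric polynomial r >= 0 on the circle with c_{-l} = conj(c_l) and
    no zeros on the circle. Return poly h (highest power first) with
    r(e^{jt}) = |h(e^{jt})|^2."""
    c = np.asarray(c, dtype=complex)
    q = c[::-1]                          # zeta^N r(zeta), highest power first
    roots = np.roots(q)
    inner = roots[np.abs(roots) < 1.0]   # one root from each pair (z, 1/conj z)
    h = np.poly(inner)                   # monic polynomial with those roots
    # fix the positive scale so |h|^2 matches r at theta = 0
    r1 = np.real(np.sum(c))
    return h * np.sqrt(r1) / abs(np.polyval(h, 1.0))

# r(e^{jt}) = 5 + 4 cos t = |1 + 2 e^{jt}|^2, coefficients (2, 5, 2)
h = spectral_factor([2.0, 5.0, 2.0])
```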
Also, since b has winding number 0 about 0 we must have v(0) = v(2π). So we
may think of v as a function on the unit circle, and as such it has an associated
Fourier series expansion
where (since v is real-valued) v_{−ℓ} = conj(v_ℓ). Now let u be the function on the circle
whose Fourier coefficients are given by u_ℓ = −j v_ℓ if ℓ < 0, u_ℓ = j v_ℓ if ℓ > 0, and
u₀ = 0. Then u(e^{jθ}) is real-valued. Set h(e^{jθ}) := u(e^{jθ}) + j v(e^{jθ}). A calculation
yields
for example, using Poisson's formula. Now a harmonic function v*(x, y) can be determined
so that g(x + jy) = v*(x, y) + j v(x, y) satisfies the Cauchy–Riemann equations on D. The
function h = e^g is analytic on the disk D, and its phase on the circle is v, i.e., the phase of b.
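The construction of u from v above is the classical harmonic-conjugate (Hilbert transform) multiplier, and it is one line with the FFT. A sketch with our own names: flip the Fourier coefficients of a real v by ±j and check that h = u + jv has (numerically) no negative-frequency content.

```python
import numpy as np

def analytic_completion(v):
    """Given samples of a real function v on the circle, return samples of
    h = u + j v, where u has Fourier coefficients u_l = j v_l (l > 0),
    u_l = -j v_l (l < 0), u_0 = 0; h then has no negative-frequency content."""
    n = len(v)
    V = np.fft.fft(v)
    k = np.fft.fftfreq(n, d=1.0/n)      # integer frequencies
    U = 1j * np.sign(k) * V             # the +-j multiplier, 0 at k = 0
    u = np.real(np.fft.ifft(U))
    return u + 1j * np.asarray(v)

theta = np.linspace(0, 2*np.pi, 64, endpoint=False)
h = analytic_completion(np.cos(theta))  # v = cos(theta) gives h = j e^{j theta}
```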
10.4. THE FUNDAMENTAL MISTAKE OF H°°-CONTROL 135
where z_r = the number of zeros of r inside the unit disk and p_r = the number of poles
of r inside the unit disk.
The price of making this mistake is high because, upon running an optimization
program with such specs as input, one finds that the closed-loop transfer func-
tion T has huge magnitude off of B. Amusingly enough, T will meet the specs
perfectly on B. In short, one obtains a solution so ridiculous that it is not even
a good basis for trial and error.
Why are the remarks we just made true? They are an immediate conse-
quence of Theorems 10.1.1 and 10.1.2; indeed, the remarks are just a rephrasing
of the mentioned results, only now in the context of control.
A wholly different consequence of Runge's theorem is that pole placement,
a valuable technique in control design, has a shaky foundation, which (in the
authors' opinion) has not been carefully analyzed. The basic idea behind pole
placement is to set the poles of the closed-loop transfer function at some prescribed
distance from the jω axis to achieve an objective, such as stabilizing
the system without making it too sluggish. While the technique is effective in
particular situations where there is a lot of human intervention and time for
trial and error, we maintain that the method is badly codified so that it cannot
be used algorithmically. This is a consequence of Runge's theorem (though not
of Theorem 10.1.1). Runge's theorem implies that any function in A_RHP and
smooth on the jω axis (even at ω = ∞) can be approximated by r_ℓ exactly as
in Theorem 10.1.1. That is, the hypotheses of this theorem are different, but
the conclusion is the same. This of course implies that if we have a proposed
design producing a closed-loop transfer function T₀ that is rational with poles
at locations {p_j}, then we can find T_ε such that sup_ω |T₀(jω) − T_ε(jω)| < ε and
with poles about any place we want. The number of poles of T_ε might be greater
than that of T₀. This and other constraints that one might impose in order to
make pole placement a systematic rather than a trial-and-error subject have not
been analyzed.
The moral of the pole placement story is different from the tale of the funda-
mental mistake. Here we are saying that pole placement can be a viable method,
but it is a ripe area for more research. In contrast, the fundamental mistake
should never be made.
Chapter 11

Proof of the Main Result
An important tool used in the proof of Theorem 9.3.1 is the Taylor expansion
of smooth, real-valued functions of z ∈ C in powers of z. This is computed for
performance functions in section 11.1. The actual proof of the theorem begins
with the necessity of conditions (i) and (ii) of the theorem, discussed in section
11.2 and section 11.3, respectively. The proof of the theorem is completed with
the discussion of sufficiency of (i) and (ii) in section 11.4.
To use this in the proof of the theorem, we consider the function G(z) =
Γ(e^{jθ}, f*(e^{jθ}) + z) (for now e^{jθ} is fixed), and for ease of notation we write
138 CHAPTER 11. PROOF OF THE MAIN RESULT
Γ imply that there exist positive constants c₀ and δ₀ such that the following
relation holds:
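The displayed relation did not survive reproduction; what the proof uses is the first-order Wirtinger expansion of G with a uniform quadratic remainder (our reconstruction of the standard estimate, with c₀ bounding the second-order terms on a disk of radius δ₀):

```latex
G(z) \;=\; G(0) \;+\; 2\,\mathrm{Re}\!\left(\frac{\partial G}{\partial z}(0)\,z\right) \;+\; R(z),
\qquad
|R(z)| \;\le\; c_0\,|z|^2 \quad \text{for } |z| \le \delta_0 .
```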
We use the Taylor expansion of Γ to see that for ε > 0 small enough,² the
¹ This uses only a small part of the strength of Theorem 10.1.1, which says that we could
choose h to approximate the whole function −a(e^{jθ}) on Γ^c. Also we could have used Theorem
10.3.1.
² A precise argument is to choose ε > 0 so small that c₀ ε sup_θ |h(e^{jθ})|² is dominated by
the negative first-order term. Combining this with inequality (11.5), we have
11.3. SOLUTIONS MUST SATISFY 139
Compare (11.10) with (11.6): the former is valid for all θ and the latter is valid
for θ ∈ Γ⁰. We can proceed with steps similar to those that follow (11.6), and
with a small enough ε we obtain
Before launching into heavy details we give the idea of the proof. Taylor
approximation and the flatness condition (i) of Theorem 9.3.1 imply that
provided h_n has no zeros on the circle. (If it does, then one can replace it by
a function without zeros on the circle that is a small perturbation of h_n.) This
says that the function a h_n must wind about 0, so in particular Re(a h_n) must
be positive for some e^{jθ₁}. Hence
Hence
Suppose for a moment that for each n there exists θ such that a(e^{jθ}) h_n(e^{jθ}) =
t_n > 0. Substituting in (11.15), we get
But since ‖h_n‖_∞ → 0, we have that t_n → 0, so (11.17)
cannot hold for all n.
To justify the existence of t_n as above, we note that the complex function
H_n(z) = z h_n(z) has a zero at z = 0. By complex function theory, the function
H_n maps onto some disk centered at 0, so for some θ we have H_n(e^{jθ}) > 0. Since
the function a(e^{jθ})/e^{jθ} is continuous and never 0, it follows that for some θ,
Chapter 12

Computer Solutions to OPT
One of the big payoffs of having results such as Theorem 9.3.1 available is that
one can develop optimality tests. These tests can be used as a measure
of how close a candidate solution to OPT is to being optimal. This topic is
presented in section 12.1. As we shall see in section 12.3, another major payoff
is that optimality conditions lead directly to computer algorithms. This requires
some basic definitions, which are given in section 12.2.
¹ Since the winding number of a function can take only integer values, small changes in a
function typically do not change its winding number with respect to 0. Condition II is
automatically satisfied for all functions f that are close (in the supremum norm) to a
solution f*!
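The winding number test itself is a few lines on a computer: sum the phase increments of a(e^{jθ}) around one revolution and divide by 2π. A sketch (names ours):

```python
import numpy as np

def winding_number(samples):
    """Winding number about 0 of a closed curve given by samples (no zeros),
    computed by summing principal-branch phase increments."""
    s = np.asarray(samples, dtype=complex)
    dphase = np.angle(np.roll(s, -1) / s)   # increments in (-pi, pi]
    return int(round(dphase.sum() / (2*np.pi)))

theta = np.linspace(0, 2*np.pi, 256, endpoint=False)
z = np.exp(1j*theta)
```

Sampling must be fine enough that each step's phase increment stays below π.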
142 CHAPTER 12. COMPUTER SOLUTIONS TO OPT
In our computer run the Flat diagnostic went from 0.99 down to 0.002 in four
iterations, while GrAlign was zero throughout the iteration (this is not unusual
for "scalar" cases).
The numbers in Table 12.1 were obtained with the program Anopt. Computer
code for generating the run is

<<Anopt`;
Anopt[g, 0.01];
where the coefficients a_ℓ are real numbers. On the other hand, if we now consider
f(e^{jθ}) as a function on the circle, it is a continuous function and thus has a
Fourier series expansion
where the Fourier coefficients f_ℓ are real numbers. The relationship between
(12.1) and (12.2) becomes clear when we formally substitute e^{jθ} for z in (12.1):
Now, by the theory of Fourier series we know that the representation (12.2) is
unique. If (12.3) is valid, then we must have that c_n = f_n for n = 0, 1, 2, …
and that c_n = 0 for n = −1, −2, …. To prove that (12.3) holds for functions
f ∈ A, one can consider first polynomials in z, for which this relation is obvious.
Then the general result follows from the fact that functions in A are limits of
polynomials.
Now suppose that f(e^{jθ}) is any continuous function on the circle. If
the Fourier expansion of f(e^{jθ}) has the form
we see that we can generate a function f(z) defined on the disk as a power series
in z by formally replacing e^{jθ} by z in (12.4). It is possible to show that the radius
of convergence is 1 (see [Hof62]). Thus, roughly speaking, functions defined by
relation (12.1) and functions defined by relation (12.2) can be identified.
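The identification can be watched numerically: sample f(z) = 1/(1 − z/2) = Σ (z/2)ⁿ on the circle and take the FFT; the nonnegative-frequency Fourier coefficients reproduce the Taylor coefficients 2⁻ⁿ and the negative-frequency ones vanish. (The particular function and sample size are our choices.)

```python
import numpy as np

n = 256
theta = 2*np.pi*np.arange(n)/n
z = np.exp(1j*theta)
f = 1.0/(1.0 - z/2.0)        # f in A: analytic on |z| < 1, continuous on |z| <= 1

coeffs = np.fft.fft(f)/n     # coeffs[l] ~ Fourier coefficient c_l for l >= 0
```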
If we drop from the definition of A the requirement that f(z) be continuous
on the closed unit disk, then we obtain a normed vector space called the Hardy
space H∞. It is obvious that some functions f(z) in H∞ have boundary values
f(e^{jθ}) (those in A, for example). But it is not clear, a priori, that one can talk
about boundary values of an arbitrary function f(z) in H∞. There is, however,
a way to define f(e^{jθ}) mathematically with the help of measure theory (see
[Hof62]).
If f(e^{jθ}) is a measurable function f : ∂D → C, the norms below are well
defined and give a nonnegative value or infinity:
The norms above are used to define the normed vector spaces
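The displays defining these norms and spaces are lost; the standard definitions (as in [Hof62]) read:

```latex
\|f\|_p = \left(\frac{1}{2\pi}\int_0^{2\pi} \bigl|f(e^{j\theta})\bigr|^p \, d\theta\right)^{1/p}
\quad (1 \le p < \infty),
\qquad
\|f\|_\infty = \operatorname*{ess\,sup}_\theta \bigl|f(e^{j\theta})\bigr|,
```

and

```latex
L^p = \bigl\{\, f : \partial \mathbb{D} \to \mathbb{C} \ \text{measurable} : \|f\|_p < \infty \,\bigr\},
\qquad
H^p = \bigl\{\, f \in L^p : \hat{f}(\ell) = 0 \ \text{for } \ell < 0 \,\bigr\}.
```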
Proof. Suppose that f* is such that conditions (i) and (ii) of Theorem
9.3.1 are satisfied. From (i) we immediately get the existence of the constant
³ Often x_{k+1} = x_k + t·h is used, where t > 0 is a parameter chosen according to a
prespecified criterion. The process of finding t is called linesearch.
i.e.,
in two unknowns, ψ and f. Because of the way ψ is normalized, one can write
At this point the basic idea is clear: to find solutions to the optimality
conditions (i) and (ii) of Theorem 9.3.1, it is enough to solve equation (12.18).
Thus we state the following.
12.3. NUMERICAL ALGORITHMS 147
Critical point problem. Find analytic functions f*, β* with 1 + 2 Re(e^{jθ} β*)
positive on the circle such that (f*, β*) is a solution to equation (12.18).
We have found that Newton's method applied to the critical point problem
is extremely effective.
A standard fact about Newton iteration is that when the linearizations T′
are uniformly invertible, the iterates have an excellent convergence property
called second-order convergence. This is analyzed in [HMW93], where the
linearizations are shown to be invertible and consequently our algorithm is second-order
convergent. It requires quite a bit of functional analysis to find the correct
setting and estimates. While this is well beyond the scope of our presentation here,
we give the formula for T′ in Chapter 15.
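In finite dimensions the same scheme is a few lines, and the second-order convergence is visible directly. A toy sketch (this is not the function-space operator T′ of the text, just Newton with a finite-difference Jacobian on a 2×2 system):

```python
import numpy as np

def newton(F, x, steps=8, eps=1e-7):
    """Newton iteration x <- x - J^{-1} F(x) with a forward-difference Jacobian."""
    for _ in range(steps):
        Fx = F(x)
        J = np.empty((len(x), len(x)))
        for k in range(len(x)):
            e = np.zeros(len(x))
            e[k] = eps
            J[:, k] = (F(x + e) - Fx) / eps
        x = x - np.linalg.solve(J, Fx)
    return x

# solve x^2 + y^2 = 4, x*y = 1
F = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[0]*v[1] - 1.0])
x = newton(F, np.array([2.0, 0.3]))
```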
The optimization problem of approximating, in the supremum norm, a
given continuous function on the circle by functions in A is a primitive version
of the problem OPT considered in this part of the book. As such it was treated
first by Nehari in 1957, who proved that the distance is the largest singular value
of a certain Hankel operator. Closely related work goes back to Carathéodory–Fejér
and Pick in the early 1900s. The history of work on this problem is vast; more
appears in Appendix A.
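Nehari's formula can be tried numerically: the distance from k to H∞ is the norm of the Hankel matrix built from the negative Fourier coefficients of k. For k(e^{jθ}) = 1/(e^{jθ} − a) with real |a| < 1, that Hankel matrix is the rank-one (a^{i+j}), whose norm is 1/(1 − a²). A sketch with our own conventions (truncation size m and sample count are ours):

```python
import numpy as np

def nehari_distance(k_samples, m=40):
    """dist(k, H^infty) ~ largest singular value of the Hankel matrix
    [c_{-(i+j+1)}]_{i,j<m} built from the negative Fourier coefficients of k."""
    n = len(k_samples)
    c = np.fft.fft(k_samples) / n            # c_{-l} sits at index n - l
    H = np.array([[c[(-(i + j + 1)) % n] for j in range(m)] for i in range(m)])
    return np.linalg.svd(H, compute_uv=False)[0]

a = 0.5
theta = 2*np.pi*np.arange(512)/512
k = 1.0/(np.exp(1j*theta) - a)       # negative coefficients c_{-l} = a^{l-1}
```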
The Newton-type algorithm in this section is from [HMW93]. Disk iteration
type algorithms for solving OPT and their origins will be discussed in Chapter
15.
Part IV

H∞ Theory: Vector Case
The control design methods presented in Parts I and II of this book extend
to multiple-input, multiple-output (MIMO) systems. The most challenging
problems that arise in MIMO design are optimization problems such as
OPT, but for MIMO design one optimizes over N-tuples¹ of analytic functions
f = (f₁, f₂, …, f_N) instead of over a single (scalar) analytic function. Fortunately
much is known about this optimization problem.
In this book we do not lay out MIMO control theory, but we do sketch the
theory that can be used to numerically solve the H°° optimization problem that
arises. That is the subject of the remainder of the book.
¹ These correspond to N designable single-input, single-output (SISO) subsystems.
Chapter 13

Many Analytic Functions
In this chapter we begin the study of H∞ optimization over N-tuples of analytic
functions f = (f₁, f₂, …, f_N), which generalizes the OPT theory of optimization
over a single scalar analytic function.
We describe two types of computer algorithms based on this theory. These
are disk iteration and the Newton algorithm, introduced in section 12.3. In this
chapter and in the remainder of this part we give more generality, detail, and
analysis. Variations (large ones) on these algorithms are implemented by various
authors in various programs (see the preface). One example is our program
Anopt which implements both algorithms and will be used for illustrations in
this chapter.
To do MIMO control one needs to be able to handle interpolation constraints
on matrix-valued analytic functions and produce parameterizations of the type
discussed for the scalar case in Chapter 7. Such parameterizations are well
understood in the MIMO case and reported in many references (e.g., [BGR90],
[FF91], [H87]). We do not discuss interpolation constraints for matrix-valued
analytic functions in this book, so for details on this we refer the reader to the
works cited above.
154 CHAPTER 13. MANY ANALYTIC FUNCTIONS
where \\Z\\N represents the euclidean norm on C ; that is, for z = (z\,..., ZN),
we have
The following is the input the user types in a computer session with the package
Anopt in order to solve OPT.

<<Anopt`;
Anopt[g, 0.000001];
In the session above the input is the performance "g" and the number 0.000001,
which is an error tolerance for stopping calculations. The output is shown in
Table 13.1.
13.3. OPTIMALITY CONDITIONS FOR SOLUTIONS TO OPT 155
II. There exist F = (F₁, …, F_N) with F_ℓ analytic on the unit disk D and
integrable boundary values F_ℓ(e^{jθ}), and there exists an integrable, positive-valued
function ψ(e^{jθ}) such that ∫ ψ dθ/2π = 1 and
Gradient
alignment
The reader might like to compare this to Theorem 9.3.1 when N = 1, but
the correspondence is not obvious because the gradient alignment condition
looks different than the condition on the winding number in Theorem 9.3.1.
Theorem 13.3.1 is written in what could be called primal-dual form. One can
apply Theorem 13.3.1 productively without understanding this connection, so
we move on and do the applications in this chapter.
Later, in section 14.4, we prove that the winding number and primal-dual
formulations are equivalent. Primal-dual refers to the fact that / is the unknown
in the original, i.e., "primal," problem and ψ has the interpretation of the
unknown in the "dual" problem; Part V is devoted to this subject.
Functions f⁰ ∈ A_N that satisfy the flatness and gradient alignment conditions
are good candidates for a local solution to OPT, but they may not be a
local solution. We illustrate this in the next section.
13.4 An example
We now give a performance function Γ and a function f⁰ that satisfies the
flatness and gradient alignment conditions but is not a local solution.
Set f⁰(e^{jθ}) := (0, 0). The performance function we consider is
We now prove that f⁰ satisfies the flatness and gradient alignment conditions
of Theorem 13.3.1.
The flatness condition holds since
13.5 COMPUTER DIAGNOSTICS FROM OPTIMALITY CONDITIONS 157
Hence,
Note that
so the gradient alignment condition holds with ψ(e^{jθ}) = 100 |e^{jθ} + 0.2|² and
F_ℓ(e^{jθ}) = e^{jθ}/(1 + 0.2 e^{jθ}), ℓ = 1, 2.
We now prove that f⁰ is not a directional minimizer for OPT (thus it is
not a local solution either). To do this, consider the function h⁰(e^{jθ}) = (1, −1).
Then, for t ∈ R we have
To test the gradient alignment condition directly one also needs the functions
ψ and F, which do not appear in the statement of the problem OPT. Thus
it is desirable to have a test of the gradient alignment condition that does not
It is easy to check on the computer when this should apply. Thus the conjecture
above allows one to check (for most performance functions) the gradient
alignment condition of Theorem 13.3.1 by testing whether each a_m/a₀ is analytic,
provided N > 1. When N = 1 this test does not apply and one must use the
winding number test of Chapter 12.
To implement the test for the gradient alignment condition, consider the
Fourier expansion of the functions
and form
which is calculated on the computer with the fast Fourier transform. At optimum
both Flat and GrAlign should equal zero to nearly machine precision.
The program Anopt prints these values out at each iteration to monitor how
the approximate solutions are progressing. If f is near the solution, we must
have Flat(f) ≈ 0 and GrAlign(f) ≈ 0. Computer output showing the Flat and
GrAlign tests was presented in section 13.2.
The Flat and GrAlign tests for optimality we have developed are used
in step 1. Another stopping criterion one may want to include in particular
implementations is "lack of sufficient progress" (however you want to define
this). A typical step 2 consists of replacing the performance function with a
"model," a new performance that is based on the original function but easier
to deal with. Step 3 is called linesearch. One obvious example of the linesearch
criterion is that t minimizes sup_θ Γ(e^{jθ}, f(e^{jθ}) + t h(e^{jθ})).
The two algorithms discussed later in this book (see Chapter 15) are Newton
iteration and disk iteration. Both are based on models of the performance
obtained by taking some terms from the Taylor expansion of Γ(e^{jθ}, z) about
z = the current guess f.
We now address what we have found to be the most effective means of eval-
uating the performance of a computer algorithm. The objective is to have a
theoretical test, which can be done with pencil and paper at the same time that
you are developing an algorithm. This is tremendously helpful in discovering
computer algorithms. The reason is that the algorithm itself is presented ana-
lytically, and having an analytic means of evaluation allows one to see how to
make changes that improve performance of the algorithm. What we recommend
as the leading indicator of success of an algorithm is actually very conventional.
It is the order of convergence, defined in the next paragraph.
A sequence of vectors x_n in a vector space with norm ‖·‖ that converges to
x* is said to be p-convergent if there exists C > 0 such that

‖x_{n+1} − x*‖ ≤ C ‖x_n − x*‖^p.   (EST)

The larger p is, the faster the sequence converges. An algorithm that generates
sequences will be called order p convergent provided the sequences it generates
are at least order p convergent. Here C is a number whose size influences the
error estimate (EST) less than p does. When p = 1 we need C < 1 to guarantee
improvement from one iteration to the next.
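Given the error sequence an algorithm produces, p can be estimated from (EST) by the ratio log e_{n+1}/log e_n once the errors are small. A sketch (the estimator and test sequences are our own illustration):

```python
import math

def estimated_order(errors):
    """Estimate p from the tail of an error sequence with e_{n+1} ~ C e_n^p,
    via the ratio log e_{n+1} / log e_n averaged over consecutive pairs."""
    e = [x for x in errors if 0 < x < 1e-2]   # use the asymptotic tail only
    return sum(math.log(b)/math.log(a) for a, b in zip(e, e[1:])) / (len(e) - 1)

quadratic = [1e-3]
for _ in range(3):
    quadratic.append(quadratic[-1]**2)        # e -> e^2: order 2
linear = [1e-3 * 0.5**n for n in range(12)]   # e -> e/2: order 1
```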
A summary of the performance of the algorithms presented in this book is as
follows:

Method            Case     Convergence
Disk iteration    N = 1    order 2 (in practice)
Disk iteration    N > 1    order 1
Newton iteration  all N    order 2
The main theorem in this chapter is from [HMW93] and strengthens the
theorem in [H86]. Independently a special case was proved and put to very
good use in pure mathematics by Lempert [Le86].
Chapter 14
Coordinate Descent
Approaches to OPT
on the quadrant {x > 0, y > 0}, which is a sloped trough whose bottom crease
is the line x = y. Start at (x₀, y₀) and descend in the x direction, that is,
follow the line (x, y₀), and stop at (x₁, y₀) = (y₀, y₀), the minimum in the
162 CHAPTER 14. COORDINATE DESCENT APPROACHES TO OPT
x direction. That point lies on the crease. Now do the y descent; you never get
off of the crease!1 (See Fig. 14.1.)
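The stall is easy to reproduce. The book's trough is not written out above; one concrete choice with the described geometry is f(x, y) = 2|x − y| + x + y, whose coordinate-wise minimizers are unique, land on the crease x = y, and then never move, although the infimum over the quadrant is 0 at the origin. A sketch:

```python
def f(x, y):
    # a sloped trough on {x >= 0, y >= 0} with bottom crease x = y
    return 2*abs(x - y) + x + y

def argmin_on_grid(g, lo=0.0, hi=5.0, n=501):
    """Exact minimizer of g over a uniform grid (unique for this f)."""
    pts = [lo + i*(hi - lo)/(n - 1) for i in range(n)]
    return min(pts, key=g)

x, y = 3.0, 1.0
for _ in range(10):           # alternate coordinate minimizations
    x = argmin_on_grid(lambda t: f(t, y))
    y = argmin_on_grid(lambda t: f(x, t))
```

After the first sweep the iterate sits at (y₀, y₀) = (1, 1) with value 2 and never improves, even though f(0, 0) = 0.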
We start by defining the H∞ coordinate descent (CD) algorithm, but for
simplicity of presentation we state only the case N = 2. The idea will be no
surprise.
Given Γ : ∂D × C² → R₊ and f⁰ = (f₁⁰, f₂⁰) ∈ A₂, an update f¹ ∈ A₂ is
obtained in the following way.
hold locally. Clearly any local solution to OPT is a coordinate descent solution
to OPT.
Next observe that CDS (1) is a standard OPT problem with N = 1, as is
CDS (2). If we invoke the N = 1 case of Theorem 13.3.1, we immediately get
the following.
THEOREM 14.2.2. Let Γ be of class C³ and f* = (f₁*, f₂*) in A₂ be such that
neither entry of (∂Γ/∂z̄₁(·, f*), ∂Γ/∂z̄₂(·, f*)) is ever equal to 0. The function f* is a
coordinate descent solution to OPT if and only if
I. The function
Flatness
II. There exist F = (F₁, F₂) in the Hardy space H¹ and two nonnegative measurable
functions ψ₁, ψ₂ of e^{jθ} ∈ ∂D such that ∫ ψ_ℓ dθ/2π = 1 and
Gradient
alignment
Example 2.
Thus
The converse follows from Theorem 10.3.1 applied to the function b = a e^{jnθ},
whose winding number is 0. That theorem tells us that there is an analytic phase
function F satisfying (14.1).
Immediately this gives the result that the winding number condition of Theorem
9.3.1 is
where p_ℓ and q_ℓ are coprime (have no common zeros) for ℓ = 1, …, N. The
following is shown in [H86].
PROPOSITION 14.4.2. The ψ = ψ_m condition for OPT over A_N is equivalent
to the statement:
The integer i(a), defined as the number of zeros that the greatest
common divisor of p₁, …, p_N has inside the disk minus the number
of zeros of the least common multiple of q₁, …, q_N, is strictly greater
than zero.
Roughly speaking, i is the number of common zeros minus the total number
of poles (inside the disk) of the a_ℓ. Now we need the following result.
THEOREM 14.4.3. When N = 1, we have that for generic Γ, if f* ∈ A is a
solution to OPT such that a(·) := ∂Γ/∂z̄(·, f(·)) is never 0, then wind(a; 0) = 1.
Combine Proposition 14.4.2 with Theorem 14.4.3 to obtain very strong conclusions.
For example, it leads us to suspect that generically, if the a_ℓ have no
common poles inside D, then all zeros of the a_ℓ are common.² This does not
prove that generically all zeros are common, because the generic N = 1 behavior
may not hold at an N > 1 optimum.
The coordinate descent presentation in this chapter is drawn mainly from
[HMer93a].
² To do the counting use
We summarize with
#common zeros(a_ℓ) ≥ N (#common zeros(a_ℓ) − 1).
This implies that #common zeros(a_ℓ) cannot be greater than 1 because of the strict inequality
and the fact that N > 1. Also the condition #common zeros(a_ℓ) > #poles(a_ℓ)
prevents it from equaling 0. Thus #common zeros(a_ℓ) = 1, which is what we want to prove.
Chapter 15

More Numerical Algorithms
This chapter is one of the more theoretical parts of the book, which is ironic
in that it derives the formulas that are used in our computer algorithms. The
authors are a bit iconoclastic in that they see developing computer algorithms
of this type as almost an exercise in pure mathematics. Indeed Appendix B
describes pure mathematics implications of this and previous chapters. A design
engineer can skip this chapter. An engineer who develops optimization methods
might find it interesting.
In this chapter we discuss in some detail two algorithms for solving OPT:
disk iteration and Newton iteration. Sections 15.2 and 15.3 present disk iteration.
Section 15.4 shows how our Newton-type method presented in Chapter
12 for one function f generalizes to N functions f₁, f₂, …, f_N. Section 15.5
compares numerical properties of the methods; they turn out to have complementary
advantages. The last section derives the formula, due to Nehari and
others, which is the core of our disk iteration method. As we shall see, this
formula follows directly from our main optimality theorem, 13.3.1.
15.1 Notation
We define for 1 ≤ p ≤ ∞ the spaces L^p_N whose elements are N-tuples of Lebesgue
measurable functions (f₁, …, f_N) and whose norms are defined in terms of the
euclidean norm ‖·‖_N on C^N by
168 CHAPTER 15. MORE NUMERICAL ALGORITHMS
where the Fourier coefficients f_ℓ are vectors in C^N. The space H^p_N consists of
f ∈ L^p_N that have f_ℓ = 0 for ℓ < 0. Functions f in H^p_N extend to an analytic
function F(z) on the unit disk D, so that f(e^{jθ}) is the nontangential limit of
F(z) for almost every θ. See [Hof62].
The orthogonal complement in L²_N of a subspace S is denoted S⊥. If k is
an N × M matrix-valued function on the unit circle, the Hankel operator with
symbol k is the operator H_k : H²_M → (H²_N)⊥ with action a ↦ P_{(H²_N)⊥}[ka]. The
H²_M → H²_N Toeplitz operator with symbol k is the operator T_k : H²_M → H²_N with action
Let H_k : H²_M → (H²_N)⊥ be the operator with action a ↦ P_{(H²_N)⊥}[ka]. This
type of operator is called a Hankel operator, and k is called its symbol. A
well-known result is the following.
THEOREM 15.2.1. If k is in L∞_N, then
If the symbol k is in H∞ + C, then f* is unique. Let H_k* be the adjoint of H_k,
and set r = dist(k, H∞_N); then
Nn², but it has special structure that can be exploited to reduce the operation
count in the inversion (with algorithms of the type found in [SK91]), possibly
reducing the order of the inversion to Nn². Also there exist algorithms for
inverting a matrix by iteration, which can be effective in this setting.
The power method is clearly efficient in many respects, and it is superior to
the Newton iteration in memory requirements and operation count. Obviously
it is also better when it comes to time actually used to produce an update in step
2 of the iteration scheme. However, there is a crucial limitation of the power
method: it is directly applicable only to very simple problems OPT, namely,
Nehari problems.
The properties of the power method make it an interesting tool for solving
OPT problems through approximating Γ by Taylor expansion. This Taylor
expansion is used to obtain a Nehari problem, or at least a problem close to it.
This has been attempted before in [HMer93b]. In section 15.3.2 the method of
disk iteration (see [HMer93b]) is described.
A first glance at the algorithm reveals that the Taylor expansion of Γ about
f, the current guess at the solution, is not necessarily of the form of a Nehari
problem. However, it can be solved by iterating Nehari problems [HMer93b].
Thus an implementation of the disk iteration algorithm that uses the power
method requires an inner loop to find a solution to step 2.¹ Another feature
of (15.7) is that the Taylor expansion it contains does not capture all second-order
information from most functions Γ. One expects that this would imply
first-order (bad) convergence for the algorithm. This is indeed true in the vector-valued
case N > 1, but it is surprising that (in practice) second-order (good)
convergence occurs in the scalar case N = 1 (see [HMer93b]). Thus disk iteration
gives second-order convergence in practice when N = 1, and no better than first-order
convergence when N > 1.
¹We note that there are many reasonable modifications of the expression (15.7) that preserve
intact terms of order up to 1 in h. In fact in [HMer93b] one was found that eliminates
the inner loop at the expense of all the second-order information.
15.4. THE NEWTON ITERATION ALGORITHM FOR OPTN 171
where
Thus our objective is to solve the critical point problem (15.9) with T given
by (15.10). Here χ is the identity function on the unit circle, namely, the
function χ(e^{jθ}) = e^{jθ}.
Now we turn to the solution of operator equations (e.g., equation (15.9))
with the iterative method of Newton. In terms of T′ we have the Newton step
for updating a given pair (f, β):

(f, β) ⟼ (f, β) − (T′_{(f,β)})^{−1} T(f, β).
Let a* ∈ H² be the outer spectral factor of ψ. From (15.14) we have that
F/a* ∈ H^∞ and hence that
To prove the reverse inequality, note that by (15.13) the following relation holds
for almost all θ:
and notice that by multiplying both sides of (15.20) by (k − f*)* and using
(15.13) one obtains
Table 15.4. Convergence to local optimum with disk iteration and smoothing.
The example shown in Table 15.3 is now run with automatic smoothing on
the current guess at the solution. This lowers the numerical error (ned) as the
iteration progresses and improves Flat, but GrAlign is practically unchanged.
Thus numerical noise is not the source of the lack of progress in the run in
Table 15.1.
Multiplying both sides of (15.22) by (k − f*)*, using (15.21), and taking conjugates
of both sides, we have
Table 15.5. Convergence to local optimum with disk iteration and special linesearch.
The example shown in Table 15.3 is now run with a mixed strategy in the
linesearch: if f is the current guess at the solution and h is the update direction
generated by the algorithm, then either GrAlign(f + th) or the supremum
of Γ(·, f + th) is minimized in t, depending on the progress with respect to
previous iterations. The progress improves to a clear linear rate of convergence.
Compare this with Table 15.1.
Now we must linearize equation (15.27) and obtain a tractable formula (this
we do next). Then to prove second-order convergence we must show that the
Jacobian is invertible (this is analyzed afterward).
where
Substituting (15.28) into (15.30) and dropping all terms that are not linear or
constant in δ or ε yields
where H_{M₂} is the Hankel operator with symbol M₂, applied to (15.32).
One can see in representation (15.33) that T′_{(f,β)} has the form of a "conjugate
of Toeplitz plus Hankel" operator (except for a small change in the range, which
turns out to be unimportant for our analysis). One consequence of this, as
pointed out to us by Ali Sayed, is that T′_{(f,β)} can be inverted numerically with
fast algorithms similar to those described in [SK91]. This enhances the practical
appeal of Newton's method applied to the equation T = 0.
for some δ and for all |z| < 1. One version (cf. [H86]) of a deep theorem mostly
due to L. Carleson, called the Corona theorem, is as follows.
THEOREM 15.7.2. For F ∈ H^∞_N,
the range of the Toeplitz operator T_F : H²_N → H² is H²
if and only if
T_{F₁}* T_{F₁} + ⋯ + T_{F_N}* T_{F_N} is invertible
if and only if
F is an invertible outer function.
We shall be dealing with functions F ∈ H^∞_N that never vanish on ∂D, and
for these the theorem roughly says that not all components of F can have a
common zero in D. Since having a common zero is a rare event, we see that the
outer condition is generically true provided N > 1.
We now present the main theorem of this section.
THEOREM 15.7.3. Let Γ be smooth and strictly plurisubharmonic. Let
f ∈ A_N be a smooth function that satisfies the gradient alignment condition
where the smooth φ : ∂D → C has |φ| = 1 and F ∈ H^∞_N is outer and smooth, if and
only if the subspace
is a closed proper subset of L². This subspace is closed if we add the assumption
that a is rational. By Lemma 14.4.1, factoring (15.35) is equivalent to
for some continuous uniformly positive ψ : ∂D → R₊, some outer function F ∈ H^∞_N,
and n ≥ 1. Even for nonrational a, if the representation in (15.39) is
true, then T_M is invertible if and only if n ≤ 1. For N = 1 this is equivalent to
stating that the winding number condition wind(a, 0) ≤ 1 holds.
Proof. Write the block Toeplitz operator T_M as
a block matrix with Toeplitz operator (of the appropriate dimension) entries.
Since R is a positive-definite-valued and uniformly invertible function, T_R is
invertible. Standard Schur complement arguments tell us that T_M is invertible
iff T_{χa}*(T_R)^{−1}T_{χa} is invertible. It is invertible iff (15.38) holds, since T_R is invertible.
Now we show for a rational function a that condition (15.38) fails iff condition
(15.39) holds. If the range of T_F is not onto H², then M = χaH² is a subspace of
L² that is not equal to L². Since a is rational and a(e^{jθ}) never vanishes, M is
closed. Thus the range condition (15.36) in Lemma 15.7.5 is true, so we get the
representation (15.37). That is, a has the form in condition (15.39). Even if a
is nonrational and (15.39) holds,
has range onto iff n ≤ 1 and T_F has range onto. This uses the fact that
T_{ψ^{−1}χ^{n−1}} is a scalar Toeplitz operator, so Coburn's lemma implies that the
range of T_{ψ^{−1}χ^{n−1}} is onto iff n − 1 = wind(ψ^{−1}χ^{n−1}; 0) ≤ 0. Also it uses
Theorem 15.7.2, which says T_F has range onto iff F is outer.
184 CHAPTER 16. MORE THEORY OF THE VECTOR OPT PROBLEM
III. For every nonzero
Conversely, if the flatness and gradient alignment conditions and III with strict
inequality hold, then f* is a directional minimizer.
The interested reader may check the reference [HMer93a] for the proofs and
details pertaining to this section.
We now observe that this theorem gives a practical optimality test when
N ≤ 2.
COROLLARY 16.1.2. When N = 1, condition III of Theorem 16.1.1 is
satisfied at any f ∈ A since A_f = {0}. Therefore, if f ∈ A satisfies that
∂Γ/∂z(·, f) is never 0 for any θ, and if it satisfies the flatness and gradient
alignment conditions, then f is a strict local optimizer of OPT.
When N = 2 we have a practical test given in the following theorem.
THEOREM 16.1.3. Let Γ(·, z) be a given performance function and let f* ∈
A₂. For ℓ = 1, 2, set a_ℓ(θ) = (∂Γ/∂z_ℓ)(e^{jθ}, f*(e^{jθ})) and let
Suppose that at least one of the functions a_ℓ is never zero on the circle, and
that it has winding number 1 about 0. If f* satisfies the flatness and gradient
alignment conditions, then relation III of Theorem 16.1.1 with strict inequality
is equivalent to the following statement.
Either (i) there exists θ₀ such that
or (ii) b^T B b is never zero on the circle, and n_b := wind(b^T B b) is either an odd
number or a number greater than 2.
If Conjecture 13.5.1 is true, then the hypothesis on the winding number of a_ℓ
is not too restrictive. The test in Theorem 16.1.3 is implemented in the software
package Anopt.
Open question: Find a practical test to check condition III when N > 2.
For those who want a practical test when N > 2 we include the next theorem,
which replaces III with a (stronger) condition that is easy to check. When III
is taken together with flatness and gradient alignment, they are sufficient (but
not necessary) to ensure local optimality.
THEOREM 16.1.4. Let Γ, f*, a, A, and B be as in Theorem 13.3.1. For
θ ∈ [0, 2π) let
16.2. AN EXAMPLE (CONTINUED) 185
Note that
and this implies that the gradient alignment condition is satisfied by f₁ with
and
Therefore condition (16.3) holds for all θ and all z, and this implies that f₁ is a
strict local optimizer.
The main difference between the N = 1 and N > 1 cases is that in higher
dimensions solutions may not be unique. We saw an example of nonuniqueness
earlier in section 13.4. Those steeped in the lore of several complex variables
will appreciate the fact that Γ in this example is strictly plurisubharmonic in z.
Our standard assumption on the OPT problem is:
16.3. PROPERTIES OF SOLUTIONS 187
(SA) Γ depends smoothly on θ, is real analytic in z (and in z̄), and has gradient
(∂Γ/∂z)(e^{jθ}, z) that never vanishes when Γ(e^{jθ}, z) = γ*. The sets S_θ(γ*) are
connected, simply connected, have nonempty interior, and are uniformly
bounded in θ.
While γ* may not be known in advance in a particular situation, one might verify
that all S_θ(γ) for a wide range of γ satisfy these conditions; this is because the
conditions are not very restrictive.
We now give a list of results. Definitions of the more specialized terms in
the theorems are given below.
The behavior of OPT depends heavily on the properties of the sublevel sets
of the performance function Γ, so now we list key ones. A strictly convex set is
a convex set with no line segment contained in its boundary. The unit ball in
C^N is strictly convex, while the unit ball in the space M_{mn} of m × n matrices
is not unless m = 1 or n = 1. Polynomially convex sets are a much broader
class of sets than convex sets. Convex sets are intersections of half spaces
{w : Re ℓ(w) < 0}, where ℓ is a linear function into the complex plane, while
polynomially convex sets are intersections of sets of the form {w : Re p(w) < 0},
where p is a polynomial.
16.3.1 Uniqueness
THEOREM 16.3.1. If S_θ(γ*) is strictly convex (uniformly in θ), then if an
H^∞_N solution f* to OPT exists, it is unique. Also f*(e^{jθ}) ∈ ∂S_θ for almost
every θ.
Recall that when N = 1, the solution to OPT is unique (section 9.4). This
gives us two theorems with very different conditions guaranteeing uniqueness.
A theorem of Vityaev unifies the two by saying roughly that if the S_θ miss being
strictly convex by at most one complex dimension, then any smooth solution to
the OPT problem is unique.
16.3.2 Existence
THEOREM 16.3.2. Suppose SA holds and that each S_θ is polynomially convex.
Then an H^∞_N solution f* to OPT exists. Moreover, if a sequence f_k ∈ H^∞_N
approximately solves OPT (in the sense sup_θ Γ(e^{jθ}, f_k(e^{jθ})) = γ_k with γ_k ↘ γ*),
then a subsequence that converges in the normal family sense has as its limit a
function f_∞ in H^∞_N that satisfies Γ(e^{jθ}, f_∞(e^{jθ})) ≤ γ* almost everywhere. Moreover,
if each S_θ is strictly convex and smooth, then f* is continuous.
The result on continuity of solutions f* is in [Sl90]. Recent deep work
on regularity of f* in a special OPT-type problem has been done in [Al96].
Extending these results may be fruitful open ground.
The examples and theorems of Chapter 16 are from [HMer93a]. More results
can be found in [H86] and in [HMar90]. Theorem 16.3.1 is due to Helton and
Howe [HH86]; see also Owens and Zames [OZ93]. For N = 1, its uniqueness is
found in [HMar90]. The difficult part of Theorem 16.3.2 is the smoothness of
f*. For strictly convex sublevel sets it is due to Slodkowski [Sl89], and its proof
is influenced by ideas of Lempert [Le86]. For N = 1 convexity is not needed; see
[HMar90] or [Sl90].
Part V
Semidefinite Programming
of the Vector OPT Problem
We show how the results we have presented on H∞ optimization can be seen
from the viewpoint of a subject called semidefinite programming. This has been
very successful in the classical area of linear programming and more recently
in finding matrices that satisfy a collection of positive definite linear matrix
inequalities (LMIs). All of this will be explained in the next three chapters.
Also we present more general results than the ones you have seen in the book
so far.
Chapter 17
Matrix H∞ Optimization
A pair (γ*, f*) is a local solution to MOPT if it solves the problem obtained
from MOPT by restricting the optimization of Γ to a neighborhood of f* inside
A_N.
Of course the MOPT problem is the OPT problem with performance function
Γ given by
The function Γ is not smooth. Therefore this is a badly behaved OPT problem,
and very little of our theory of optimality and few of our numerical algorithms
are applicable.
Many optimization problems can be rephrased as problems of the type
MOPT. Below we illustrate this by presenting an example involving the maximum
of two performance functions. Later, in section 19.2, we turn to linear
matrix inequalities.
194 CHAPTER 17. MATRIX H∞ OPTIMIZATION
or equivalently
Since
if
then there exists Ψ* ∈ L¹_{n×n} such that the triple (γ*, f*, Ψ*) satisfies
17.2. SCALAR PERFORMANCE MEASURES 195
Complementary
Gradient
Alignment
Recall that χ is the operator of multiplication by e^{jθ}. We may also write
χ(e^{jθ}) = e^{jθ}. Also, inequalities 4 and 5 of PDE+H∞ hold pointwise almost
everywhere in θ.
Remark 17.1.2. It can easily be shown that if Γ has diagonal or block diagonal
structure, then the function Ψ in Theorem 17.1.1 can be chosen to have
the same diagonal structure.
Example. Optimality conditions for competing performance functions. Let
Γ₁(e^{jθ}, z) and Γ₂(e^{jθ}, z) be scalar-valued performance functions of e^{jθ} and z ∈
C, and set
If
The main difference between this and Theorem 13.3.1 is in the flatness condition
(1), which we have now expressed in a seemingly complicated way. Indeed,
this is the way it arises in many proofs of Theorems 9.3.1 and 13.3.1 (though
not the ones we gave), and one observes that Ψ* does not vanish a.e., which
implies that γ* − Γ(·, f*) = 0, which is what we call flatness of performance
at optimum. We foreshadow more general theory by saying that conditions (1),
(4), and (5) are examples of what are called complementarity conditions, since
they say that the sets where the two functions
since the right side is not 0. Various ways of choosing Φ give different algorithms.
Such algorithms are usually called primal-dual interior point algorithms. The
term interior comes from the viewpoint that the set {f : γI − Γ(e^{jθ}, f(·)) > 0}
constitutes a feasible region, and when we fix γ > γ* and solve PDE+H∞ for
f* we are approaching f* from the interior of the feasible region.
We treat the gradient alignment condition exactly as we did in the scalar-valued
Γ case of section 12.3, except now we need to take traces of certain
198 CHAPTER 18. ALGORITHMS FOR H∞ OPTIMIZATION
so that it is equivalent to
We reemphasize that in this algorithm at each step we fix ε and then solve for f,
Ψ, and γ using Newton's method, with barriers or restrictions on the linesearch
imposing the positivity conditions Ψ > 0 and γI − Γ(·, f) > 0.
The theory mentioned in section 15.4 is strong evidence that such equations
have an invertible differential (almost always), and the fact that Newton's
method demonstrates second-order convergence in these cases corroborates this.
Standard language adapted to this problem is that when ε is fixed, the primal
and dual variables f_ε, Ψ_ε which we solve for lie on the central path. As we take ε
to 0 our updates follow the central path, which we hope leads to a local optimum
f*, Ψ*.
No ε or γ appears.
The reader interested in very preliminary comparisons from computer experiments
on 2 × 2 matrix-valued Γ should see [HMWPrep]. Developments in the
rapidly expanding field of semidefinite programming and primal-dual optimization
are reported in [LO96], [NN94], [VB96], [Wr97].
Chapter 19
Semidefinite Programming
versus Matrix H∞
Optimization
Semidefinite programming solves a great variety of engineering optimization
problems. Of particular importance are the linear matrix inequalities, which
now appear to occur in most branches of engineering [BEFB94], [SI95].
Our goals in this chapter are to present a succinct theory of semidefinite programming
and to show how the methods that we presented in previous chapters
actually correspond to a type of semidefinite programming theory and corresponding
algorithms.
Note that the definition of η* requires that {0} ≠ X_+ ≠ X. Some properties
of size functions are given in Proposition 20.1.1 in Chapter 20.
DEFINITION 19.1.2. Let X be a real Banach space with cone X_+ and size
function η*. We say that η* satisfies the Hahn-Banach condition if
(HBC)
19.1.2 Examples
Example 1. Consider X = R² with norm ‖(x, y)‖ = max{|x|, |y|}. Set
X_+ = {(x, y) : x ≥ 0, y ≥ 0}, and let I = (1, 1). Note that X′_+ = X_+ and
‖(w, v)‖_{X′} = |w| + |v|. Here η*(x, y) = max{x, y} is a size function.
denote the singular values of A. We define functions σ₁ and σ_max on the space
of n × m matrices as follows:
Here y* is the transpose of y, and tr denotes the trace function. Also one can
prove that
19.1. BACKGROUND ON SEMIDEFINITE PROGRAMMING 203
Let M_+ be the cone of nonnegative definite matrices, and let I be the
identity matrix. Clearly, M′_+ = M_+. Then
That is, η*(B) is the largest eigenvalue of B. One can show that η* satisfies
(HBC).
Let I be the constant function equal to the identity matrix, and let X_+ be
the cone of continuous functions on ∂D that take nonnegative definite matrix
values. Then
It can be shown that η* satisfies (HBC). The dual space of C_{n×n} consists of
functionals which can be represented as n × n self-adjoint matrix-valued bounded
Borel measures. More details about this example are given in section 20.2.1.
minimize
subject to
We shall refer to (PO) as the primal optimization problem and refer to γ and f
as the primal variables.
A pair (γ*, f*) is a local solution to (PO) if there is a neighborhood V of
(γ*, f*) such that the latter point is a solution to the problem obtained from
(PO) by adding the constraint (γ, f) ∈ V. Note that if F is affine in f, then
local solutions are also global. This is not true for general F.
Now we list optimality conditions and an associated primal-dual problem.
The latter is given in terms of the differential DF_f[·] of F at a given f ∈ F,
and its adjoint (DF_f)′ : X′ → F′.¹ Recall that the adjoint operator satisfies
and is defined by
minimize
subject to
We refer to Λ as a dual variable. A triple (γ*, f*, Λ*) is a local solution to (PDO)
if there is a neighborhood V of (γ*, f*, Λ*) such that the latter point is a solution
to the problem obtained from (PDO) by adding the constraint (γ, f, Λ) ∈ V.
The quantity
is called the duality gap. The content of our next result is that the duality gap
is zero at local solutions to (PDO).
PROPOSITION 19.1.3. If (γ*, f*, Λ*) is a local solution to (PDO), then
Now strict inequality in (19.12) does not hold, since otherwise γ* may be replaced
by a smaller γ₁ that still satisfies the inequality constraint, while reducing
the value of Λ, which is not possible. Therefore
Strict inequality in (19.14) does not hold, since otherwise one may choose Λ₁ ∈
X′_+ with unit norm such that
has been studied by several authors; see [VB96] for a beautiful and thorough
discussion, an extensive bibliography, and applications. It is the central problem
treated with linear matrix inequality techniques. The problem (19.16) is very
close to problem (PO) when X is the space of real symmetric matrices, X_+ is
the cone of positive semidefinite matrices in X, and f is an affine map from
R^K to X. Indeed, one can show that with the additional condition of at least
one of the coefficient matrices being positive definite, problem (19.16) is a special
case of (PO).
Now we give the optimality conditions (PDE+) for (19.17). First note that Λ is
a positive real number, which the normalization forces to be 1. Next note that
0 = DF_{f*} = ∇F(f*). This is the classical "gradient equal to zero" condition.
minimize
subject to
With the setup of Example 3 in section 19.1.2 it is not too hard to show that
Theorem 19.1.4 applies to local solutions to MOPT. See section 20.1.2. That
19.2. MATRIX AND OTHER OPTIMIZATION PROBLEMS 207
is, local solutions to MOPT must satisfy the relations (PDE+). With considerable
work (see section 20.2.3) one can then convert (PDE+) to the conditions
(PDE+H∞) in Theorem 17.1.1.
More information can be found in [HMWPrep].
Chapter 20
Proofs
This chapter is the most technical of the book. It requires a modest amount
of background in Banach spaces. The chapter is divided into two sections.
In section 20.1 basic concepts and results are discussed and Theorem 19.1.4 is
proved. In section 20.2 the general theory is applied to the case of H∞ optimization
and Theorem 17.1.1 is proved.
and
6. If x is in X_+, then
Proof. The existence of Λ₀ ∈ X′ such that equations (20.3) hold is a consequence
of the Hahn-Banach theorem. The only thing left to verify is that Λ₀ is
a nonnegative functional. From
we conclude that
we have that x₁ = η*(−x₀)I + x₀ is in X_+. By the first part of the proof there
exists Λ₀ ∈ X′_+ with unit norm such that ⟨Λ₀, x⟩ ≤ η*(x) for all x ∈ X and such
that
Therefore,
Proposition 20.1.1 implies that since η*(I) = 1 and x₀ has unit norm, we must
have that ⟨Λ₀, I⟩ ≤ 1; note that ⟨Λ₀, I⟩ < 1 is not possible, since otherwise
η*(−x₀) < 0 by (20.8), thus yielding x₀ ∈ X_+ — a contradiction. Therefore we
have
THEOREM 20.1.5. Let η* be a size function such that (HBC) holds. Let
x₀ be a nonzero element of X_+ and let M be a closed subspace of X such that
x₀ ∉ M and
so we conclude that
20.1. THE GENERAL THEORY 213
The first step in the proof is to show that the following relation holds:
and
Thus (i), (ii), and (iv) of (PDE+) are satisfied. Also, (iii) holds since we are
assuming for now that k = 0, while a local version of (v) is automatically
satisfied.
We have finished the proof except for the k wrinkle. To include k(f) in the
(PDO) formulas, rewrite (PO) as
Replace γ with γ̃ := γ + k(f) to get the problem we have been treating. This
immediately yields the conclusion of our theorem.
The space C′_{n×m} (the dual of C_{n×m}) consists of bounded Borel measures Λ.
These measures may be represented in the form
where
The statement of the lemma follows from combining (20.30) and (20.31).
20.2. PROOFS FOR H∞ OPTIMA 217
The last member of relation (20.32) defines a real linear functional, which by
the chain of equalities is the zero functional:
Since Λ* is in the dual cone, we have, by the discussion in section 20.2.1, that
Ψ* takes nonnegative definite values; that is, relation 4 of PDEH∞ holds. Also
by (20.24), and since Λ* has unit norm, we have that condition 3 of PDEH∞
holds.
Now we may rewrite condition 1 of PDE+ as
Since γ*I − Γ(·, f*) and Ψ take nonnegative definite values, the trace of their
product is a nonnegative real-valued function, and since the integral of the latter
is 0, we have
Appendix A
History and Perspective

While historians trace control to Archimedes' time and the theory of control to a
paper of James Clerk Maxwell on governors, the subject called classical control
began in the Second World War in labs in England and the United States that
designed radar-driven anti-aircraft guns. The beginnings of classical control are
summarized in a book [JNP47] by James, Nichols, and Phillips. According to
Ralph Phillips, a book that was very influential in their lab was Bode's famous
book on amplifiers, although Bode's book is not referenced in [JNP47]. The
primary technique that emerged was adjusting parameters in low order (e.g.,
degree 2) rational functions and checking that the graphs lie in certain regions.
Classical control dominated industrial practice for many years, even though it
could be taught only by example and its main technique was trial and error.
Much of the theory of control in the 1960s and 1970s focused on achieving
desired frequency domain performance as closely as possible in a mean-square-
error sense. We have nothing to add to the literature here and so do not give a
historical treatment.
The subject of optimizing worst-case error in the frequency domain along its
present lines started not with control but with circuits. One issue was to design
amplifiers with maximum gain over a given frequency band. Another was the
design of circuits with minimum broadband power loss. Indeed, H∞ control is a
subset of a broader subject, H∞ engineering, which focuses on worst-case design
in the frequency domain. In paradigm engineering problems this produces what
the mathematician calls an "interpolation problem" for analytic functions. The
techniques of Nevanlinna-Pick interpolation had their first serious introduction
into engineering in a SISO circuits paper by Youla and Saito [YS67] in the
mid-1960s. Further development waited until the mid-seventies, when Helton [H76],
[H78], [H81] applied interpolation and more general techniques from operator
theory to amplifier problems. Here the methods of commutant lifting [A63],
[NF70], [S67] and of Adamjan-Arov-Krein (AAK) [AAK68], [AAK72], [AAK78]
were used to solve MIMO optimization problems. The disk method used in this
context was first described in an engineering article on gain equalization [H81]
and followed the foundational theory published in the mathematics article [H78]. For
some discussion of this, see the articles [HMer91] and [HV97]. The main result
(Theorem 13.3.1) used here for MIMO control is taken from [HMer93a]. Earlier
versions are in the paper [H86], and independently in the pure mathematics
literature a special case concerning "Kobayashi extremals" is due to Lempert
[Le86]. Also interesting is recent work of Zames and Owens [OZ93] on convex
H∞ optimization. While [HH86] gave only qualitative properties of optima such
as flatness, a numerical approach was proposed in [OZ95].
Disk iteration algorithms were introduced in [H85] and studied in
[HMer93b]. The paper [HMW93] introduced the Newton computer algorithm
of Chapter 12. The reason Newton's method had not been successful before
on such an old problem is because the sup norm is not a smooth performance
function. Thus one needs a special approach to apply Newton or gradient-like
algorithms. This is the idea behind modern primal-dual methods as described
in Chapter 19. Modern interest in them dates to the work of Karmarkar in
the mid-1980s, and a rapid evolution brought them to the form one sees now.
Good references on the history of these and other methods are [Wr97], [VB96],
[LO96]; we refer the reader to them.
Appendix B
We call this a claim rather than a theorem because we use the term "almost
all locations" loosely. Also, the differentials are operators on spaces of functions,
Due to Nehari and Adamjan-Arov-Krein.
226 APPENDIX B. PURE MATHEMATICS AND H∞ OPTIMIZATION
and we did not say exactly what the spaces are. Below we indicate a proof for
H². Unfortunately, H² is the wrong space for analyzing Newton's method, since
a nonlinear map rarely maps H² into itself. The proof and precise formulation
of invertibility in the space of H∞ ∩ C∞ functions with a topology imposed by
a family of Sobolev norms is done in [HMW93].
Idea of Proof. First we prove condition 1. Here we shall denote the Nehari-commutant
lifting conditions
Possibly this is a natural type of mathematical result to seek for many function
theory problems.
What we are saying is hardly shocking from the viewpoint of optimization
theory. The mathematically challenging thing is that for the class of problems
we study, one does not immediately get the invertibility condition. Also, it
is a bit surprising that such basic principles of optimization theory were only
recently applied to the very old Nevanlinna-Pick-Nehari problem.
For perspective, and to make sure that the reader interprets this speculation
widely (wildly) enough, we mention some interesting work on optimization in
several complex variables (which takes an approach very different from ours).
This concerns interpolation and approximation problems on the polydisk. There
is striking work of Agler [Agunpub] that derives beautiful matrix inequality conditions
for checking if a particular N-P interpolation problem is solvable but for
a norm stronger than the sup norm. Agler's view, implemented successfully in
several problems, is that instead of doing sup norm optimization, one modifies
the norm in some way so that the answer to the problem can be gotten from
an eigenvalue problem or something similar. A very elegant result of this type,
due to Cotlar and Sadosky [CS94], converts the Nehari problem in the polydisk
for a BMO-type norm to an eigenvalue problem.
The suggestion here is that we liberalize our notion of what it means to
answer a question in function theory beyond insisting that an answer must have
the form of an eigenvalue problem. Often by principles of functional analysis it
is not too difficult to write down general optimality conditions. It seems that a
very reasonable pursuit for function theory problems from the pure mathematics
point of view is to find optimality conditions whose differential is invertible. This
is much more challenging. While this seems like a considerable loosening of the
eigenvalue condition, if we insist that our conditions use only explicit formulas,
then it may not be that liberal a condition. Only time will tell if these ideas
extend in many other directions.
There is a yet more liberal notion of solution looming on the horizon that
involves liberalizing DI. The invertibility requirement in DI is motivated by
finding well-behaved "primal-dual" algorithms. Primal-dual interior point algorithms
probably do not in fact require invertibility of the differential, but do
require invertibility of the differential compressed to certain cones in function
space. This is under investigation by many researchers in many contexts and is
very open at the moment.
Also, we mention that there are extremely pretty connections between the
theory of analytic disks developed in this book and ongoing work in several
complex variables. The article [HMer91] describes one line of connections, and
the paper [HV97] indicates some others. Readers in the area of several complex
variables might find these articles interesting.
Appendix C
Uncertainty
C.1 Introduction
So far we have considered the system S where the plant P is known. However in
most physical systems typically there is a certain amount of "plant uncertainty,"
i.e., lack of knowledge about the plant.
To explain what uncertainty is, we use the example of an airplane. We can
model the control of an airplane with a system of differential equations. These
equations are the result of an idealization of the airplane that invariably leaves
out some elements, either because we are not aware that they affect the airplane
or because we deem them relatively unimportant and omit them for simplicity.
The differential equations have a series of parameters that have physical
significance and whose values we can tell only approximately.
We may further simplify the model by linearizing the system of differential
equations. The model is surely not perfect; indeed at high frequencies it is
highly inaccurate. Also there are factors affecting the system, among which
are changes in the properties of the components or even malfunction of some
components.
A common approach to treating uncertainty when doing system design is to
assume that there is available a known reference or nominal plant P₀, and that
the true plant P of the system is, in the frequency domain description,
230 APPENDIX C. UNCERTAINTY
the resulting system satisfies the performance requirements originally set. This
must hold for any feasible perturbation of the plant. Sometimes this is referred
to as robust performance.
where we know that 0.5 < α < 0.7 and 1 < β < 2. In this case we know
something about P: it is a rational function with relative degree 1 and only one
pole. We also know a range for the parameters α and β.
Finally, one may have a mixed type of uncertainty, where there is a parametric
part and an unparametric part in ΔP. Either type of uncertainty can be
converted to the unparametric type; this is not discussed further.
C.3 Dealing with Uncertainty
Observe that whatever the true plant P(jω) is, we have that
This is the case where the uncertainty sets P_ω are disks about p = P_0(jω),
with (known) radius δ|P_0(jω)|:
It is easy to see that in this case our t in the expression (C.6) is given by (C.7).
Fig. C.2. The quantity Δ is selected to have the direction of steepest ascent for
the performance G(ω, ·), and so that P_0 + Δ is the point in P_ω that yields the
largest value d of the linear approximation to G(ω, ·) at P_0. The number d is
an approximation to the largest performance possible when P ranges in P_ω. We
set G*(ω, P_0) := d.
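The construction in the caption can be imitated numerically. A minimal sketch in Python, with a made-up performance G, nominal value P0, and uncertainty level δ (none of these come from the text; the point is only that the steepest-ascent choice of Δ makes d = G(P_0) + δ|P_0| |∇G| the largest value of the linearization on the disk):

```python
import cmath

# Hypothetical performance: G(p) = |k - p|^2 with a made-up target k.
k = 0.5 + 0.0j
def G(p):
    return abs(k - p) ** 2

def grad(Gf, p, h=1e-6):
    # Numerical gradient of the real-valued Gf at p, packed as a complex
    # number whose real/imag parts are dG/dx and dG/dy.
    gx = (Gf(p + h) - Gf(p - h)) / (2 * h)
    gy = (Gf(p + 1j * h) - Gf(p - 1j * h)) / (2 * h)
    return complex(gx, gy)

P0 = 1.0 + 0.5j          # made-up nominal plant value at one frequency
delta = 0.25             # made-up uncertainty level
rad = delta * abs(P0)    # radius of the uncertainty disk about P0

g = grad(G, P0)
Delta = rad * g / abs(g)             # steepest-ascent perturbation on the disk
d = G(P0) + rad * abs(g)             # largest value of the linearization: G*

# Brute force: d dominates the linearized performance over the disk boundary.
lin = lambda D: G(P0) + (g.conjugate() * D).real
worst = max(lin(rad * cmath.exp(1j * t / 100.0)) for t in range(629))
assert worst <= d + 1e-9 and d - worst < 1e-3
```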
Set
and
Then
where
Proof. Set Δ = δP_0 z, where z ranges in the unit disk in the complex plane,
i.e., |z| < 1. From this relation and from (C.15), (C.16), and (C.18) we get
The conclusion of the theorem follows directly from relation (C.19) and the
following lemma.
LEMMA C.6.2. If a and b are constants such that |b| > 1, then
maps the unit disk one-to-one and onto a disk D_1 in the plane. Note that the
set D_1 is not a half-plane since |b| > 1. If c_1 and r_1 denote the center and radius
of D_1, respectively, then it is clear that the largest modulus possible for a point
in D_1 is |c_1| + r_1. We now determine c_1 and r_1. Note that for all z = e^{iθ} we
have
A result from complex analysis says that rational transformations on the circle
that have constant modulus must have poles and zeros that come in conjugate
pairs. This means that
and
See Figs. C.3, C.4, and C.5 for a graphical comparison between the functions
G and G* that arise from equations (C.17) and (C.13).
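Lemma C.6.2 is easy to check numerically. The specific map of the lemma is not reproduced here; the sketch below uses a generic Möbius map (a + z)/(b + z) with made-up coefficients and |b| > 1, so the pole lies outside the closed unit disk and the image of the unit disk is a disk D_1 whose largest modulus is |c_1| + r_1:

```python
import cmath, math

def mobius(z, a, b):
    # Generic Mobius map with pole at -b; the coefficients are made up.
    return (a + z) / (b + z)

def circumcenter(p1, p2, p3):
    # Center of the circle through three (non-collinear) complex points.
    ax, ay, bx, by, cx, cy = p1.real, p1.imag, p2.real, p2.imag, p3.real, p3.imag
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return complex(ux, uy)

a, b = 0.3 + 0.1j, 1.5   # |b| > 1: the pole -b lies outside the closed unit disk
pts = [mobius(cmath.exp(2j * math.pi * t / 3600), a, b) for t in range(3600)]
c1 = circumcenter(pts[0], pts[1200], pts[2400])
r1 = abs(pts[0] - c1)

# The image of the unit circle is the circle |w - c1| = r1 ...
assert all(abs(abs(p - c1) - r1) < 1e-9 for p in pts)
# ... and the largest modulus attained on it is |c1| + r1, as in the lemma.
assert abs(max(abs(p) for p in pts) - (abs(c1) + r1)) < 1e-3
```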
Fig. C.3. Plot of the worst case uncertainty performance G that arises from
(C.17) and of the modified performance G* that arises from (C.13) as functions
of T_0 = x + iy. Here k = 0.5, δ = 0.25, and P_0 = 1. The range of x is (−1, 2)
and the range of y is (−1.5, 1.5).
Fig. C.4. Plot of the levels 0.1, 0.5, 1.0, and 2.0 for the difference G − G*.
The functions G and G* of T_0 = x + iy are produced by equations (C.17) and
(C.13) with k = 0.5, δ = 0.25, and P_0 = 1. The variable x is represented on the
horizontal axis.
C.7 Extensions
We now obtain a modified performance by using second-order terms from a
Taylor expansion of the performance. The basic formula we need is the order 2
expansion of G,
where
From the expression (C.25) we can produce several variations of the method
described in section C.4. To illustrate this we carry out the calculations for
functions G with the form
Fig. C.5. Plot of G and G* when T_0 ranges on the x axis. The functions G
and G* are produced by equations (C.17) and (C.13), for k = 0.5, δ = 0.25, and
P_0 = 1.
Note that expression (C.28) has a second-order term (in ΔP) that may be
negative for some T_0's.
A variation of (C.28) is obtained by dropping some of the second-order terms:
Appendix D
Computer Code for Example in Chapter 6
This appendix contains the computer code for the two sessions with the package
OPTDesign that were discussed in Chapter 6. The code can be easily modified
to treat other design problems. See the notebook appendixch6.nb.
<<OPTDesign`;
240 APPENDIX D. COMPUTER CODE FOR EXAMPLE IN CHAPTER 6
FigEnvelopePlot2D0 = EnvelopePlot[Radius->r01,Center->k01,
FrequencyBand->{0.01,12}];
FigEnvelopePlot3D0 = EnvelopePlot3D[Radius->r01,Center->k01,
FrequencyBand->{0.01,6}];
ra[w_]=InterpolatingPolynomial[{{0.0,0.1},{0.12,0.1}}, Abs[w]];
rb[w_]=InterpolatingPolynomial[{{0.12,0.1},{1,0.32}}, Abs[w]];
rc[w_]=InterpolatingPolynomial[{{1.0,0.32},{2.0,2.0}}, Abs[w]];
rd[w_]=InterpolatingPolynomial[{{2.0,2.0},{3.0,2.0}}, Abs[w]];
re[w_]=InterpolatingPolynomial[{{3.0,2.0},{5,0.1}}, Abs[w]];
rf[w_]=InterpolatingPolynomial[{{5.0,0.1},{10.0,0.01}}, Abs[w]];
FigCenter = Show[Plot[k1[w],{w,0,7}],Plot[k01[w],{w,0,7},
PlotStyle -> {{Thickness[0.02],GrayLevel[0.5]}}]];
plotr1a = Show[Plot[r1[w],{w,0,1}],Plot[r01[w],{w,0,1},
PlotStyle -> {{Thickness[0.02],GrayLevel[0.5]}}]];
plotr1b = Show[Plot[r1[w],{w,0,7}],Plot[r01[w],{w,0,7},
PlotStyle -> {{Thickness[0.02],GrayLevel[0.5]}}]];
plotr1c = Show[Plot[r1[w],{w,5,12}],Plot[r01[w],{w,5,12},
rb2[w_]=InterpolatingPolynomial[{{0.12,0.1},{1.3,0.25}}, Abs[w]];
rc2[w_]=InterpolatingPolynomial[{{1.3,0.25},{2.0,2.0}}, Abs[w]];
r2[w_] = Which[0.0 <= Abs[w] <= 0.1, ra[w],
0.1 < Abs[w] <= 1.3, rb2[w],
1.3 < Abs[w] <= 2.0, rc2[w],
2.0 < Abs[w] <= 3.0, rd[w],
3.0 < Abs[w] <= 5.0, re[w],
5.0 < Abs[w] <= 10.0, rf[w],
10.0 < Abs[w], (0.01/Abs[paux[I 10.]])*Abs[paux[I w]]];
k2[w_] = k1[w];
FigCenter2 = Show[Plot[k2[w],{w,0,7}],Plot[k01[w],{w,0,7},
PlotStyle -> {{Thickness[0.02],GrayLevel[0.5]}}]];
FigRadius2 = Show[Plot[r2[w],{w,0,7}],Plot[r01[w],{w,0,7},
PlotStyle -> {{Thickness[0.02],GrayLevel[0.5]}}]];
FigEnvelope2Plot2D =
EnvelopePlot[Radius->r2,Center->k2,FrequencyBand->{0.,3}];
FigEnvelope2Plot3D =
EnvelopePlot3D[Radius->r2,Center->k2,FrequencyBand->{0.0,3}];
FigTLowZP = PlotZP[TratLow[s],s];
FigCLowZP = PlotZP[CratLow[s],s];
<<OPTDesign`;
wp = 0.7; alphap = 0.9;
wb = 2.0; alphab = .75;
wr = 10.; alphar = 0.25/Abs[p[I wr]];
line1[w_] = InterpolatingPolynomial[
{{wp,alphap Abs[pinv[I wp]]}, {wb,alphab}},w];
line2[w_] = InterpolatingPolynomial[
{{wb,alphab},{wr,alphar Abs[p[I wr]]}},w];
line3[w_] = InterpolatingPolynomial[{{wp,1},{wb,0}},w];
EnvelopePlot[Radius->r,Center->k,FrequencyBand->{0.01,12}];
EnvelopePlot3D[Radius->r,Center->k,FrequencyBand->{0.01,12},PlotRange->All];
FigBodeMag = BodeMagnitude[T];
FigBodePha = BodePhase[T];
{num,den} = {Numerator[Crat1[s]],Denominator[Crat1[s]]};
zeros = s /. Solve[num==0,s];
poles = s /. Solve[den==0,s];
Crat2[s_] = Crat1[0] *
(1 - s/zeros[[4]])(1 - s/zeros[[5]])(1 - s/zeros[[6]])/
((1 - s/poles[[1]])(1 - s/poles[[2]])(1 - s/poles[[3]]));
FigCrat1Crat2 = Plot[{Abs[Crat1[I w]],Abs[Crat2[I w]]},{w,0,2}];
Trat2[s_] = Together[ p[s] Crat2[s]/(1 + p[s] Crat2[s])] //Chop;
FigCrat2ZP = PlotZP[Crat2[s],s];
step2[t_] = Chop[Simplify[InverseLaplaceTransform[Trat2[s]/s,s,t]]];
FigStepT2 = Plot[step2[t],{t,0,14}];
Appendix E
Downloading OPTDesign
and Anopt
Those who like doing things the hard way can download the packages OPTDe-
sign and Anopt through anonymous ftp.
Type
ftp anopt.ucsd.edu
When the remote system requests the account name, you reply
anonymous
When the system requests the password, type your email address:
myadress.edu
Then type
cd pub/anopt
Now you are in the correct directory. There are two types of files, .tar.gz
(for Unix) and .zip (for MSWindows). Pick the file of your favorite type with
the latest date, download it to your system, and uncompress it.
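On Unix the last step is `tar -xzf <file>`; the equivalent can also be done with Python's standard tarfile module. The sketch below is self-contained: it first builds a stand-in archive (the file name and package layout are invented) and then performs the uncompress step.

```python
import os, tarfile, tempfile

workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "optdesign.tar.gz")   # invented file name

# Stand-in for the downloaded archive, so the extraction step is runnable.
pkg = os.path.join(workdir, "OPTDesign")
os.makedirs(pkg)
with open(os.path.join(pkg, "OPTDesign.m"), "w") as f:
    f.write("(* package *)\n")
with tarfile.open(archive, "w:gz") as tar:
    tar.add(pkg, arcname="OPTDesign")

# The uncompress step itself, equivalent to `tar -xzf optdesign.tar.gz`.
dest = os.path.join(workdir, "unpacked")
with tarfile.open(archive, "r:gz") as tar:
    tar.extractall(dest)
assert os.path.exists(os.path.join(dest, "OPTDesign", "OPTDesign.m"))
```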
245
Appendix F
Anopt Notebook
F.1 Foreword
Anopt is a Mathematica package for solving diverse optimization problems over
spaces of functions analytic on the unit disk in the complex plane.
The software is useful for engineers doing worst-case design in the frequency
domain. This includes problems in control as well as broadband gain equalization
and matching. Another application is in the field of several complex
variables, where the program can be used to find analytic disks that are optimal
with respect to various criteria, e.g., Kobayashi metric calculations.
The main program Anopt[] is easy to use, even for those with little or no
computing experience. Anopt[] can be run at a very simple level or at a very
complex level.
The package Anopt was developed at the Laboratory for Mathematics and
Statistics at the University of California at San Diego, during the years 1989 to
1994, by J.W. Helton, Orlando Merino, Julia Myers, and Trent Walker. Financial
support came from the Air Force Office of Scientific Research, the National
Science Foundation, and the NSF-REU program at the San Diego Supercomputer
Center.
Send comments, questions, and information on bugs to anopt@math.ucsd.edu.
248 APPENDIX F. ANOPT NOTEBOOK
where the infimum is over f in A_N and the supremum is over e in the unit circle.
To type in the formula for the performance F(e, z) requires a translation to the
symbols that Anopt understands. To enter G in Mathematica, replace z by z[1]
and assign the result to a name, say g:
g = Abs[ 0.8 + (1/e + z[1])^2 ]^2
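For a quick sanity check outside Mathematica, the same performance can be evaluated in Python over a finite grid of the unit circle; the constant trial function f below is a made-up initial guess, not anything Anopt produces:

```python
import cmath, math

def F(e, z):
    # The performance from the text: |0.8 + (1/e + z)^2|^2.
    return abs(0.8 + (1 / e + z) ** 2) ** 2

N = 256
grid = [cmath.exp(2j * math.pi * k / N) for k in range(N)]

f = lambda e: 0.0 + 0.0j                 # made-up constant trial function
value = max(F(e, f(e)) for e in grid)    # the "Value" column for this guess

# For f = 0 the supremum over the circle is (0.8 + 1)^2 = 3.24, attained at e = 1.
assert abs(value - 3.24) < 1e-9
```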
F.3 Example 1: First Run
Anopt[g, 0.02]
Summary
gammastar = 1.010084881809756E+00
flat = 1.7866711005E-02
grAlign = 0E+00
ned = 5.35E-03
Output from Anopt. When Anopt runs, the screen output gives information
on how the run is progressing. In the screen output from the previous run
we see several columns. A brief explanation of their meaning follows.
It         Iteration number.
Value      sup_e F(e, f(e)), the value at the current iteration.
Step       Sup norm of the difference between the last two iterates f.
Flat       Flatness optimality diagnostic. It is zero at the solution.
GradAlign  Gradient alignment optimality diagnostic. It is zero at the solution.
ned        A measure of numerical noise in the calculations.
Sm.        Indicates whether smoothing takes place.
Grid       Number of points on the unit circle used for function evaluation.
In our run above, Flat went from 0.99 down to 0.018 in three iterations,
while GradAlign was zero throughout the iteration (this is not unusual for
scalar-valued examples). At the moment of stopping, the iteration was making
acceptable progress, but the (large) error tolerance we gave as input prevented
the program Anopt[] from obtaining more accurate results.
Solution. When the run is over you will have access to the calculated value
of the function f*, under the name Solution. This is a list of lists with the
following format:
Solution
where z1, z2, ..., zn are complex numbers. The kth entry of Solution (itself a
list) is Solution[[k]].
Dimensions[Solution]
{1, 32}
Thus in our example, only one scalar-valued analytic function was produced. It
consists of 32 values that are the result of sampling the function on a grid of 32
equally spaced points on the unit circle.
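The shape of Solution is easy to reproduce. A Python sketch that builds the same kind of object by sampling a made-up analytic function on a 32-point uniform grid of the unit circle:

```python
import cmath, math

N = 32
circle = [cmath.exp(2j * math.pi * k / N) for k in range(N)]  # uniform grid

f = lambda z: z ** 2                  # made-up analytic function to sample
solution = [[f(e) for e in circle]]   # one scalar-valued unknown: a 1 x 32 list

# Mirrors Dimensions[Solution] == {1, 32}.
assert (len(solution), len(solution[0])) == (1, 32)
```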
DiskListPlot[Solution]
You can produce plots or manipulate the output algebraically. Also, the
package Anopt.m comes with a function for displaying the solution in 3-D.
DiskListPlot3D[Solution]
F.4 Example 2: Vector-Valued Analytic Functions
Anopt[g, 0.001];
Summary
gammastar = 2.146918994180885E+00
flat = 3.6801434547E-06
grAlign = 5.9442841631E-04
ned = 5.3E-06
Stopping, and how good is the solution? The run stops when the
equalities
are satisfied, where tol is the error tolerance set by the user. At the solution
one must have that Flat = GradAlign = 0, so small Flat and small GradAlign
are a necessary condition for a calculated guess at the answer to be close to the
actual solution.
Sometimes there is difficulty in the calculations. Anopt[] has basically two
ways to deal with it: smoothing and grid size doubling. Both are performed
automatically if the internal algorithms of Anopt indicate that such action is
necessary. The user may suppress smoothing or grid doubling by specifying
options when Anopt is run. In the run above, smoothing occurred at the third
iteration, while the grid size remained constant at 32.
Note that it is clear from the expression above that the function f2 of the
calculated solution f = (f1, f2) is a multiple of e^10.
We can specify the initial guess at the solution as an expression in "e" (which
represents Exp[I theta]). For example, we set
The performance function we deal with here has only one complex variable
(N = 1); hence only one entry is required in f0. In the case N > 1 the different
entries of f0 have to be separated by commas.
The calling sequence to be used now takes as input the function f0 as initial
guess, besides the performance g and the tolerance tol. The sequence is Anopt[g,
tol, f0]. Many other inputs can be specified as options. Below we set the
maximum number of iterations to be 3:
F.5 Example 3: Specification of More Input
Anopt[g, .0001, f0, Iterations->3]
Summary
gammastar = 1.009352594124695E+00
flat = 2.08995166745E-02
grAlign = 0E+00
ned = 2.12E-03
Another valid way to specify the initial guess is a list of values obtained by
sampling the function on a grid of equally spaced points. To produce a discrete
version of f0 on a 32-point grid you can use a replacement rule. Note the braces
around the rational function:
Restarting the run. In the above, the process stopped because the limit
on the number of iterations was attained. If you want Anopt to proceed from
the last iteration above, you can restart the iteration. For this take as an initial
guess for the new run the last iterate of the previous run, which is stored in the
variable Solution.
Below we double the grid size of the answer we obtained in the run above
and assign it to f1. Then f1 is used to restart the iteration.
f1 = DoubleGrid[Solution];
Anopt[g, 0.00002, f1]
Summary
gammastar = 1.000005018135407E+00
flat = 1.291679907E-05
grAlign = 0E+00
ned = 3.1E-03
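DoubleGrid presumably resamples the boundary values on a grid of twice the size. For values whose Fourier expansion on the circle has low degree, this can be done exactly by zero-padding the discrete Fourier coefficients; a pure-Python sketch (the DFT is written out directly, z^3 is a made-up test function, and Anopt's actual implementation is not reproduced):

```python
import cmath, math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N)) / N
            for k in range(N)]

def double_grid(x):
    # Resample values on an N-point uniform circle grid (N even) to a 2N-point
    # grid by zero-padding the Fourier coefficients in the middle.
    N = len(x)
    c = dft(x)
    padded = c[:N // 2] + [0.0] * N + c[N // 2:]
    M = 2 * N
    return [sum(padded[k] * cmath.exp(2j * math.pi * k * n / M) for k in range(M))
            for n in range(M)]

# Made-up test: samples of z^3 on 8 points resample exactly to 16 points.
x = [cmath.exp(2j * math.pi * 3 * n / 8) for n in range(8)]
y = double_grid(x)
direct = [cmath.exp(2j * math.pi * 3 * n / 16) for n in range(16)]
assert max(abs(u - v) for u, v in zip(y, direct)) < 1e-9
```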
grad = ComplexD[g,z[1]] /.
{e -> edisc, z[1] -> Solution[[1]]};
DiskListPlot[grad, PlotJoined->True]
HEADING EXPLANATION
Usually the most important numbers on the screen are the Flat and GradAlign
diagnostics. As they approach zero, the current guess is expected to approach
the solution. Under mild hypotheses, the solution is unique in OPT problems
with only one (scalar-valued) analytic unknown function. When the number of
analytic unknown functions N > 1, solutions are not unique. There may be many
local solutions, which Anopt may find when run with different initial guesses.
The explanation of the output table will be expanded in section 1. A more
advanced user may set the options Diagnostics->1 and Diagnostics2->1.
This gives additional information on the run, which is stored automatically in
certain files.
Some Mathematica functions and notation
ComplexD[g, Conjugate[z]]   Complex derivative of g with respect to Conjugate[z].
NewtonInterpolant
Notebook
Introduction
Let R[s] be a function of the complex variable s, and let
be given sets of n complex numbers. The relation R[s1] = z1 is called the
interpolation condition, and several of these form a "set of interpolation conditions":
258 APPENDIX G. NEWTONINTERPOLANT NOTEBOOK
<<NewtonInterpolant`
rat2 = NewtonInterpolant[data,s,PoleLocation->-4]
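The "Newton" in the name refers to the divided-difference (Newton) form of the interpolant. A minimal polynomial version in Python (the rational interpolation and the PoleLocation handling of NewtonInterpolant[] are not reproduced; the data are made up):

```python
def newton_interpolant(xs, ys):
    # Divided-difference coefficients, computed in place.
    n = len(xs)
    coef = list(ys)
    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    def p(x):
        # Horner-style evaluation of the Newton form.
        acc = coef[-1]
        for i in range(n - 2, -1, -1):
            acc = acc * (x - xs[i]) + coef[i]
        return acc
    return p

# Interpolation conditions R[s_k] = z_k (made-up data).
xs, ys = [0.0, 1.0, 2.0], [1.0, 3.0, 2.0]
R = newton_interpolant(xs, ys)
assert all(abs(R(x) - y) < 1e-12 for x, y in zip(xs, ys))
```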
G.3 Specifying the Relative Degree
Functions with nonnegative relative degree are called proper, and if the relative
degree is positive the function is called strictly proper.
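With a rational function stored as numerator and denominator coefficient lists, the relative degree is just deg(denominator) − deg(numerator); a small sketch:

```python
def relative_degree(num, den):
    # num, den: coefficient lists, highest power first, leading coefficient nonzero.
    return (len(den) - 1) - (len(num) - 1)

# 1/(s + 4): relative degree 1, so strictly proper.
assert relative_degree([1.0], [1.0, 4.0]) == 1
# (s + 1)/(s + 2): relative degree 0, proper but not strictly proper.
assert relative_degree([1.0, 1.0], [1.0, 2.0]) == 0
```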
The user can specify a relative degree as input for NewtonInterpolant[].
Suppose that you want to determine an interpolant for
with pole location —4 and relative degree 2. This is how you can do it:
In[12]:=
rat2 = NewtonInterpolant[data,s,PoleLocation->-4,RelativeDegree->2]
Out[12]=
If some of the data you use to set up an interpolation problem is complex and
you want an answer that is real-rational, then you must include pairs of data
points that reflect property (RR). For example, if one interpolation condition is
Note: If one of the conditions is not given as input, NewtonInterpolant[] will give
as interpolant a non-real-rational function, since it will assume that you are willing to
rat4 = NewtonInterpolant[data,s]
Out[15]=
Out[17]=
NewtonFit Notebook
Introduction
This Mathematica notebook is the documentation for the package NewtonFit,
for treating nonlinear L2 data fitting problems. The main function is
NewtonFit[], which is an implementation of the Newton algorithm.
Acknowledgment. Discussions with Jim Easton were helpful.
An L2 approximation problem
The Mathematica function Fit[] can be used to treat (Prob) when the function
F(x, a1, ...) is linear in the a's. For the general problem (Prob) a more
powerful algorithm is necessary. One example of this is the classical Newton
method, implemented here as the function NewtonFit. Another is the well-known
Gauss-Newton algorithm. These are iterative procedures that, when
they converge, produce a local solution to (Prob). This is the best one can hope
for with such a general optimization problem. In practice, one way to go about
finding global solutions is to run the algorithm repeatedly with different initial
264 APPENDIX H. NEWTONFIT NOTEBOOK
guesses, with the hope that the global solution will be found by one of these
runs.
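That multistart strategy can be sketched in a few lines. Everything below is illustrative and not NewtonFit's actual algorithm: a one-parameter model 1/(x + a) fitted to synthetic noise-free data by a damped Gauss-Newton iteration, restarted from random initial guesses, keeping the run with the smallest residual:

```python
import random

xs = [0.5 * i for i in range(1, 9)]
true_a = 2.0
data = [1.0 / (x + true_a) for x in xs]      # synthetic, noise-free data

def residuals(a):
    return [1.0 / (x + a) - d for x, d in zip(xs, data)]

def sse(a):
    return sum(r * r for r in residuals(a))

def gauss_newton(a, iters=50):
    # One-parameter Gauss-Newton with step halving so the error never increases.
    for _ in range(iters):
        r = residuals(a)
        J = [-1.0 / (x + a) ** 2 for x in xs]        # d(model)/da
        step = sum(ri * Ji for ri, Ji in zip(r, J)) / sum(Ji * Ji for Ji in J)
        t = 1.0
        while t > 1e-12 and sse(a - t * step) > sse(a):
            t *= 0.5
        a -= t * step
    return a

random.seed(0)
starts = [random.uniform(0.5, 10.0) for _ in range(20)]
runs = [gauss_newton(a0) for a0 in starts]
best = min(runs, key=sse)                    # keep the best of all restarts
assert abs(best - true_a) < 1e-6
```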
The emphasis in these notes is on models F(x, a1, ..., am) that are rational
functions in x, with coefficients given by a1, a2, a3, ..., am. The algorithm
implemented here is very general in that, in principle, it solves (Prob) for F
rational and for many other functions not necessarily rational. The wide range
of problems that can be treated with this implementation is due to the fact that
Mathematica can do both symbolic and numerical calculations.
is sought.
Problem. Given datapoints (2), datavalues (3), and the model (4), find
parameters a[1], a[2], a[3], a[4] that minimize
SetDirectory["~/BOOK/CODEUPDATE"];
<<NewtonFit.m;
data = Transpose[{points,values}];
The model is set below in terms of parameters a[1], a[2], ... and the variable s.
initial1 = {-1.,5.,9.,4.};
The output of NewtonFit is assigned to the variable output1 below. This will
be helpful when manipulating the results of the call to NewtonFit[].
output1 = NewtonFit[data,model,s,a,Parameters->initial1]
0                0.754419         0.581507
1  4.05838       0.482129         0.313686
2  3.0786        0.0714609        0.0123889
3  2.09594       0.00824722       0.000601043
4  0.389809      0.000208885      0.000327769
5  0.0285536     1.66219 10^-6    0.000327117
6  0.0000899072  3.82929 10^-11   0.000327117
A plot is now produced that contains the (discrete) data being approximated
and the optimal function r[s].
Show[
ListPlot[Transpose[{Re[values],Im[values]}],
DisplayFunction->Identity],
ParametricPlot[{Re[r[I t]],Im[r[I t]]},{t,0,10.},
DisplayFunction->Identity],
DisplayFunction->$DisplayFunction];
H.2 Template for Many Runs
<<NewtonFit.m;
w = Table[0.01*(1.1)^i,{i,0,99}];
F[s_] = 1/(s^3 + 6. s^2 + 11. s + 6.);
values = F[I w];
points = I w;
data = Transpose[{points,values}];
model = (a[1] s + 1.)/(a[2] s^2 + a[3] s + a[4]);
Do[
  initial = 10. {Random[],Random[],Random[],Random[]} - 5.;
  output = NewtonFit[data,model,s,a,
    Parameters->initial];
  Save["output1",initial,output];
,{i,1,50}];
Exit
0                 0.00372873       0.000526309
1  0.165522       0.0000226568     0.000401831
2  0.00417434     2.57386 10^-8    0.000401817
3  3.43866 10^-7  3.06373 10^-10   0.000401817
4  0.             3.06373 10^-10   0.000401817
To plot the resulting function, just proceed as in the first example.
initial3 = {0.,.9,1.3,.17};
output3 = NewtonFit[data,model3,s,a,
Parameters->initial3]
0                 0.323703        0.00430163
1  0.112943       0.022005        0.00156777
2  0.00762118     0.0000177771    0.00154469
3  0.0000305235   1.18668 10^-9   0.00154469
4  4.53645 10^-11 1.08998 10^-9   0.00154469
Now the rational function obtained above is produced, and a plot is generated.
par3 = Parameters /. output3;
r3[s_] = model3 /. {a[j_] :> par3[[j]]}
Show[
ListPlot[Transpose[{Re[values],Im[values]}],
DisplayFunction->Identity],
Appendix I
OPTDesign Plots, Data, and Functions
Some users may want to manipulate the output of an OPTDesign run before
dealing with rational fits. We present examples of OPTDesign commands to
plot and manipulate T, L, Co, or other lists of data.
272 APPENDIX I. OPTDESIGN PLOTS, DATA, AND FUNCTIONS
The assumption is that there is a grid that is used for sampling all functions of
the OPTDesign session. To see it, type
In[18] := Grid[]
Out[18] = {∞, 10.1532, 5.02734, 3.29656, 2.41421, 1.87087, 1.49661, 1.2185, 1., 0.820679,
0.668179, 0.534511, 0.414214, 0.303347, 0.198912, 0.0984914, 0., -0.0984914,
-0.198912, -0.303347, -0.414214, -0.534511, -0.668179, -0.820679, -1.,
-1.2185, -1.49661, -1.87087, -2.41421, -3.29656, -5.02734, -10.1532}
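The grid is not arbitrary: up to ordering, the listed values match the images of 32 equally spaced points e^{iθ} on the unit circle under ω = tan(θ/2), the standard substitution that trades the unit circle for the jω axis (with θ = π going to ∞). This matching formula is our inference, not something stated by the package; a quick check:

```python
import math

N = 32
# theta_k = 2*pi*k/N on the circle; omega = tan(theta_k/2) = tan(pi*k/N),
# with k = N/2 mapping to the point at infinity.
omega = [math.tan(math.pi * k / N) for k in range(N) if k != N // 2]

assert abs(math.tan(math.pi * 1 / N) - 0.0984914) < 1e-6    # smallest grid value
assert abs(math.tan(math.pi * 8 / N) - 1.0) < 1e-12         # the value 1. in the list
assert abs(math.tan(math.pi * 15 / N) - 10.1532) < 1e-3     # largest finite value
```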
If you wish, you may produce a list of pairs of the form {w, T[I w]}. To do this,
type
Out[19] = {{∞, 0}, {10.1532, -0.0371551 + 0.0341142 I}, {5.02734, -0.124852 + 0.00349232 I},
{3.29656, -0.184928 - 0.129109 I}, {2.41421, -0.114989 - 0.306715 I},
{1.87087, 0.0780776 - 0.371381 I}, {1.49661, 0.212942 - 0.285464 I},
{1.2185, 0.236439 - 0.199328 I}, {1., 0.231823 - 0.160042 I},
{0.820679, 0.230235 - 0.141843 I}, {0.668179, 0.230142 - 0.133257 I},
{0.534511, 0.233164 - 0.136253 I}, {0.414214, 0.250685 - 0.152637 I},
{0.303347, 0.301352 - 0.162629 I}, {0.198912, 0.365523 - 0.131061 I},
{0.0984914, 0.40342 - 0.0669143 I}, {0., 0.41222 + 0. I},
{-0.0984914, 0.40342 + 0.0669143 I}, {-0.198912, 0.365523 + 0.131061 I},
{-0.303347, 0.301352 + 0.162629 I}, {-0.414214, 0.250685 + 0.152637 I},
{-0.534511, 0.233164 + 0.136253 I}, {-0.668179, 0.230142 + 0.133257 I},
{-0.820679, 0.230235 + 0.141843 I}, {-1., 0.231823 + 0.160042 I},
{-1.2185, 0.236439 + 0.199328 I}, {-1.49661, 0.212942 + 0.285464 I},
{-1.87087, 0.0780776 + 0.371381 I}, {-2.41421, -0.114989 + 0.306715 I},
{-3.29656, -0.184928 + 0.129109 I}, {-5.02734, -0.124852 - 0.00349232 I},
{-10.1532, -0.0371551 - 0.0341142 I}}
In[20] := Dimensions[Tpairs]
Out[20] = {32, 2}
I.2 Plots
Plotting the envelope and T simultaneously
In[21] := EnvelopePlot3D[Radius -> rO, Center -> kO, ClosedLoop -> T];
To plot in 3-D discrete functions of frequency (i.e., lists of values such as T),
use the command
In[26] := Nyquist[L];
For more on rational approximation with a different algorithm, see the
NewtonFit notebook (Appendix H).