
Classical Control
Using H∞ Methods

Theory, Optimization, and Design

J. William Helton
University of California
San Diego, California

Orlando Merino
University of Rhode Island
Kingston, Rhode Island

Society for Industrial and Applied Mathematics

Philadelphia
Copyright ©1998 by the Society for Industrial and Applied Mathematics.

10 9 8 7 6 5 4 3 2 1

All rights reserved. Printed in the United States of America. No part of this book
may be reproduced, stored, or transmitted in any manner without the written
permission of the publisher. For information, write to the Society for Industrial
and Applied Mathematics, 3600 University City Science Center, Philadelphia, PA
19104-2688.

Typeset by TEXniques, Inc., Boston, MA and the Society for Industrial and
Applied Mathematics. Printed by Victor Graphics, Inc., Baltimore, MD.

Library of Congress Catalog Card Number: 98-86612

is a registered trademark.
Contents

Preface xiii

I Short Design Course 1


1 A Method for Solving System Design Problems 3
1.1 Rational functions 3
1.2 The closed-loop system S 4
1.3 Designable transfer function 6
1.4 A system design problem 6
1.5 The method 7
1.6 Exercises 9

2 Internal Stability 11
2.1 Control and stability 11
2.2 Interpolation 13
2.3 Systems with a stable plant 15
2.4 Exercise 16

3 Frequency Domain Performance Requirements 17


3.1 Introduction 17
3.1.1 The closed-loop system S 17
3.1.2 Frequency domain performance requirements 18
3.1.3 Disk inequalities 18
3.2 Measures of performance 19
3.2.1 Gain-phase margin 19
3.2.2 Tracking error 21
3.2.3 Bandwidth 22
3.2.4 Closed-loop roll-off 23
3.2.5 Fundamental trade-offs 24
3.2.6 Choosing sets of performance requirements 24
3.3 Piecing together disk inequalities 25
3.4 More performance measures 27
3.4.1 Peak magnitude 27


3.4.2 Compensator bound 27


3.4.3 Plant bound 28
3.4.4 Disturbance rejection 29
3.4.5 More on tracking error 30
3.4.6 Tracking and type n plants 31
3.5 A fully constrained problem 32

4 Optimization 35
4.1 Review of concepts 35
4.2 Generating a performance function 36
4.3 Finding T with best performance 38
4.3.1 Example 39
4.4 Acceptable performance functions 40
4.5 Performance not of the circular type 43
4.6 Optimization 44
4.6.1 The optimization problem OPT 44
4.7 Internal stability and optimization 45
4.7.1 The optimization problem OPT 45
4.7.2 OPT with circular T 45
4.8 Exercises 46

5 A Design Example with OPTDesign 47


5.1 Introduction 47
5.2 The problem 47
5.3 Optimization with OPTDesign 50
5.4 Producing a rational compensator 52
5.5 How good is the answer? 54
5.5.1 More on plots and functions 56
5.6 Optimality diagnostics 58
5.7 Specifying compensator roll-off 58
5.8 Reducing the numerical error 59
5.9 Rational Fits 60
5.10 Exercises 61

II More on Design 63
6 Examples 65
6.1 Numerical practicalities 65
6.1.1 Sampling functions on the jω axis 66
6.1.2 Discontinuous functions 67
6.1.3 Vanishing radius function 68
6.1.4 Performance function incorrectly defined 69
6.2 Design example 1 69
6.2.1 Electro-mechanical and electrical models 69
6.2.2 Mathematical model 71
6.2.3 Statement of the problem 73


6.2.4 Reformulation of requirements 73
6.2.5 Optimization 74
6.2.6 Second modification of the envelope and optimization 78
6.3 Time domain performance requirements 81
6.3.1 Two common time domain requirements 83
6.3.2 A naive method 83
6.3.3 A refinement of the naive method 85
6.4 Design example 2 85
6.4.1 Statement of the problem 86
6.4.2 Translation of time domain requirements 86
6.5 Performance for competing constraints 93
6.5.1 Rounding corners of performance functions 94
6.5.2 Constrained optimization with a barrier method 97

7 Internal Stability II 101


7.1 Calculating interpolants 101
7.1.1 Calculating one interpolant 102
7.1.2 Parameterization of all interpolants 103
7.1.3 Interpolation with a relative degree condition 104
7.2 Plants with simple RHP zeros and poles 105
7.3 Parameterization: The general case 107
7.3.1 Higher-order interpolation 107
7.3.2 Plants with high-multiplicity RHP zeros and poles 109
7.4 Exercises 110

III H∞ Theory 113

8 H∞ Optimization and Control 117


8.1 The problem OPT 117
8.2 The fundamental H∞ problem of control 118
8.3 Change of coordinates 120
8.4 Performance functions 121

9 Solutions to OPT 125


9.1 Complex partial derivatives 125
9.2 Winding number 127
9.3 Main result 127
9.4 Properties of solutions 129
9.5 The OPTRHP optimality test 130
10 Facts about Analytic Functions 131


10.1 Any function on an arc 131
10.2 Spectral factorization 132
10.3 Analytic functions with prescribed phase 134
10.4 The fundamental mistake of H∞ control 135

11 Proof of the Main Result 137


11.1 Taylor expansions 137
11.2 Solutions make the performance flat 138
11.3 Solutions must satisfy 139
11.4 Flatness and winding number 139

12 Computer Solutions to OPT 141


12.1 Computer diagnostics 141
12.2 Spaces of functions 142
12.3 Numerical algorithms 144

IV H∞ Theory: Vector Case 149


13 Many Analytic Functions 153
13.1 The OPT problem 153
13.2 Solving OPT on the computer 154
13.3 Optimality conditions for solutions to OPT 155
13.4 An example 156
13.5 Computer diagnostics from optimality conditions 157
13.6 Algorithms for OPT 158

14 Coordinate Descent Approaches to OPT 161


14.1 An example in which the coordinate descent method fails 161
14.2 Coordinate descent over H∞ 162
14.3 Experimental evidence 164
14.4 Another perspective 164

15 More Numerical Algorithms 167


15.1 Notation 167
15.2 Nehari's problem 168
15.3 Disk iteration algorithms 169
15.3.1 The power method for solving Nehari's problem 169
15.3.2 The Algorithm 170
15.4 The Newton iteration algorithm for OPTN 171
15.5 Numerical comparison of the algorithms 171
15.6 Derivation of the Nehari solution 172
15.7 Theory of Newton iteration 176
15.7.1 The linearized optimality equations 177
15.7.2 Second-order convergence: Invertibility of T 178

16 More Theory of the Vector OPT Problem 183


16.1 A sufficient condition for optimality 183
16.2 An example (continued) 185
16.3 Properties of solutions 186
16.3.1 Uniqueness 187
16.3.2 Existence 187

V Semidefinite Programming vs. H∞ Optimization 189


17 Matrix H∞ Optimization 193
17.1 Optimality conditions for MOPT 194
17.2 The special case of scalar performance measures 195

18 Numerical Algorithms for H∞ Optimization 197


18.1 Interior point algorithm 198
18.2 Interior point algorithm 2 199
18.3 The "boundary case" 199

19 Semidefinite Programming versus Matrix H∞ Optimization 201


19.1 Background on Semidefinite programming 201
19.1.1 Basic Setup 201
19.1.2 Examples 202
19.1.3 The optimization problem 203
19.1.4 Main high-level theorem 205
19.1.5 High-level numerical algorithms 205
19.2 Matrix and other optimization problems 206
19.2.1 LMIs and related stories 206
19.2.2 Smooth optimization 206
19.2.3 Return to matrix H°° optimization 206

20 Proofs 209
20.1 The general theory 209
20.1.1 Size functions and their properties 209
20.1.2 Proof of Theorem 19.1.4 213
20.2 Proofs for H∞ Optima 214
20.2.1 Definitions and notation 214
20.2.2 Performance, Jacobians, and adjoints 215
20.2.3 Proof of the Matrix H∞ Optimization Theorem 17.1.1 217

VI Appendices 219
A History and Perspective 221

B Pure Mathematics and H∞ Optimization 225

C Uncertainty 229
C.1 Introduction 229
C.2 Types of uncertainty 230
C.3 Dealing with uncertainty 231
C.4 A method to treat plant uncertainty 231
C.5 An example with quasi-circular T 232
C.6 Performance and uncertainty 233
C.7 Extensions 236

D Computer Code for Example in Chapter 6 239


D.1 Computer code for design example 1 239
D.2 Computer code for design example 2 242

E Downloading OPTDesign and Anopt 245

F Anopt Notebook 247


F.1 Foreword 247
F.2 Optimizing in the sup norm: The problem OPT 248
F.3 Example 1: First run 248
F.4 Example 2: Vector-valued analytic functions 251
F.5 Example 3: Specification of more input 252
F.6 Quick reference for Anopt 255

G Newtonlnterpolant Notebook 257


G.1 First calculation of an interpolant 258
G.2 Specifying the pole location 258
G.3 Specifying of the relative degree 259
G.4 Complex numbers as data 259
G.5 Higher-order interpolation 260

H NewtonFit Notebook 263


H.1 First example 264
H.2 Template for many runs 267
H.3 Using a weight 267
H.4 Stable zeros and poles 268

I OPTDesign Plots, Data, and Functions 271


I.1 Functions and grids 271
I.2 Plots 273
I.3 Rational approximation and model reduction 276
References 279

Index 289
Preface

Purpose of this book


One of the main accomplishments of control in the 1980s was the development of
H∞ engineering. This book teaches control system design using H∞ techniques,
and H∞ theory motivated by control applications.
The first parts of the book are aimed at students of control at any level
who want an H°° supplement to their control course. The later parts of the
book treat the theory of H∞ control and should be valuable to theoreticians in
control and to research mathematicians. Even practitioners might benefit from
a light reading of this theory, while mathematicians who want a physical feel
for control should like the beginning of the book.
Since the book is highly modular the theory parts can be read independently
of the design parts, and vice versa.1 Thus the book should be useful for three
purposes.
Control systems design (Parts I and II). We believe that the approach pre-
sented here has three virtues. First, it is conceptually simple. Second, it corre-
sponds directly to what is often called "classical control," which is good simply
because of the widespread appeal of classical frequency domain methods. Fi-
nally, classical control has always been presented as trial and error applied to
specific cases; this book lays out a much more precise (almost axiomatic) ap-
proach. This has the tremendous advantage of converting an engineering prob-
lem to one that can be put directly into a mathematical optimization package.
While some trial and error may be needed, our approach greatly reduces it.
The student who learns Part I should have a good feel for how engineering
specs are encoded as precise mathematical constraints.
A theory for frequency domain CAD (Parts III, IV, and V). Part III gives
a theory of H∞ control that is no more difficult to understand than other ap-
proaches but applies to much more general problems. This approach uses a few
simple qualitative principles about analytic functions (which have broad engi-
neering value) to produce a seemingly qualitative picture of what properties
an optimal controller should have. Rather surprisingly, this physically appeal-
ing characterization of an optimal controller leads directly to highly effective
computer algorithms.
¹In fact, some readers might get by with the short version of this book [HMER98a], which
consists mainly of Parts I and II and some computational appendices.


Very little mathematical sophistication is needed to read Part III, although it
presents results of research done in the last decade. Part IV extends the theory
and algorithms for analytic functions presented in Part III to optimization over
many analytic functions. Part V continues this extension of our theory to the
many-input, many-output situation. Matrix inequalities and the semidefinite
programs used to solve them are quickly moving to the forefront of modern
design. Part V sketches this theory and shows how Parts III and IV relate to
semidefinite programming.
Research in mathematics and engineering. The theory is interesting in itself
from a mathematical perspective, and it has connections and applications to
fields such as functional analysis, operator theory, and (one and several) complex
variables. In engineering its connections with amplifier design and antenna
design are relatively undeveloped. A mathematician who turns directly to Parts
III-V should find them to be a readable account of recent developments. The
core of H∞ control is optimization of a performance measure over the family
of analytic functions. Parts III and IV integrate results that have appeared
in print since the early 1980s, and Part V approaches H∞ optimization from
the semidefinite programming perspective that very recently became fashionable.
While a powerful and unified theory is emerging, the area is new enough that
much remains to be done.

Optimal design
We shall always deal with a situation where part of a linear system is given
(called the plant in control theory). We want to find the additional part f
(the designable part) so that the whole system meets certain requirements (see
Fig. P.1).

Fig. P.1. The given and designable parts of the system.

A high-level description of the design method described in this book is:

I. List requirements carefully and find a mathematical representation for
them.
II. Obtain a performance function F from the given part of the system and
the requirements listed in I, in terms of the designable variable f. The
performance function is a function F of frequency ω and f. In other words,
the performance of the closed-loop system at frequency ω is F(jω, f(jω)).
The function F is actually a cost function; the smaller its value, the better
the design.

III. Optimize the performance to obtain an acceptable system, or conclude
that no such system exists with the given requirements.

This book presents a list of requirements that are sound from the physical
point of view yet simple mathematically. The list of requirements discussed
here is by no means complete; as more is understood about systems, more
requirements can be added to those in this book. It is our contention that the
framework is flexible enough to accommodate them. Several examples illustrate
the practical solution of design problems using steps I-III.
H°° control optimizes the performance over all frequencies; it does not just
average (mean square) performance over frequencies. This approach is true to
the physical problem, whereas the older mean square optimization approaches
distort engineering specifications in order to produce a mathematically easy
problem. Fortunately in the last 15 years the mathematics of H°° has become
powerful enough to solve engineering problems.

Software
This book also serves to introduce the control software package OPTDesign,
which runs under Mathematica. The book is independent of the software; how-
ever, the companionship of the software provides a richer experience. With
OPTDesign the reader can easily reproduce the calculations done here in the
solved examples, and try variations on them.

SISO and MIMO


This book emphasizes single-input, single-output (SISO) systems. However, the
methods used here can be generalized readily to multiple-input, multiple-output
(MIMO) systems. MIMO design problems are not supported by the current
version of the software package OPTDesign. However, the optimization routine
Anopt used by OPTDesign has the capability of solving many optimization
problems of the kind that arise in MIMO design.

Recommended background
Parts I and II require knowledge of basics of engineering such as what a
frequency response function is. It can be learned well by a student who has had
a one-semester control course, maybe less. Popular control texts are [FPE86],
[Do81], and [O90].
Part III requires an introduction to complex variables and analytic functions.
About three weeks of an undergraduate complex variables course might suffice.
Part IV requires some knowledge of vector-valued function spaces, their du-
als, and properties.
Part V requires a little knowledge of Banach spaces.

Other references
Advanced books that present H∞ theory but with a different approach include
[F87], [GL95], [BGR90], [FF91], [GM90], [DFT92], [H87], [Dy89], [Kim97], and
[ZDG96]. All of these books are based on the same type of mathematics, namely,
on extensions of something called Nevanlinna-Pick, Nehari, and commutant
lifting theories. Another approach is [BB91]. There exist commercial software
packages that do H∞ control design, such as toolboxes available from The Math-
Works Inc. (Robust Control toolbox, LMI Control toolbox, μ-Analysis and Syn-
thesis toolbox, and QFT toolbox), Delight (Prof. Andre Tits, U. of Maryland),
and Qdes (Prof. S. Boyd, Stanford University). The mathematical core pre-
sented in the later parts of our book and our software is fundamentally different
from these approaches.

Thanks
We thank Trent Walker, Julia Myers, Dave Schwartz, Jeff Allen, John Flattum,
Robert O'Barr, and David Ring for computer work and testing, and Joy Kirsch
for administration. Mark Stankus was a valuable source of computer expertise.
Thanks to Neola Crimmins and Zelinda Collins for typing endless scribbled
pages. Eric Rowell read a draft of the book and caught many errors. The index is
due to Jeremy Martin. Julia Myers read many drafts of the book, and she found
many typographical and grammatical errors, and cases of unclear exposition.
We thank Julia for her wonderful work. We are especially grateful to Prof. Fred
Bailey of the University of Minnesota, his student Brett Larson, and to Prof.
A. T. Shenton and research assistant Zia Shafiei of the University of Liverpool.
Finally, we would like to thank the Air Force Office of Scientific Research, the
National Science Foundation, and the Ford Motor Co. for partially supporting
the writing of this book through grants.

Code development
OPTDesign and its companion Anopt are based on algorithms due to Helton,
Merino, and Walker and were written in Mathematica by Orlando Merino, Julia
Myers and Trent Walker. Also, Daniel Lam, Robert O'Barr, Mark Paskowitz
and Mike Swafford contributed. Earlier versions of Anopt (Fortran, 1989) were
designed and written by Jim Bence, J. William Helton, Julia Myers, Orlando
Merino, Robert O'Barr, and David Ring. An even earlier Fortran package is
Approxih by David Schwartz and J. W. Helton (1985).

Downloading OPTDesign and Anopt


The software packages OPTDesign and Anopt can be obtained at the Anopt web
site http://anopt.ucsd.edu or by anonymous ftp as described in Appendix E.

J. William Helton Orlando Merino


Part I
Short Design Course
Chapter 1

A Method for Solving


System Design Problems

This chapter outlines our approach to solving design problems. Section 1.1
gives basic facts about rational functions, which are the single most important
type of functions in practical system design. The basic system and functions
considered in this book are introduced in section 1.2 and section 1.3. A control
design problem is presented in section 1.4. A description of the method used in
this book to solve control problems is presented in section 1.5.

1.1 Rational functions


If N(s) and D(s) are polynomials, then the rational function F(s) = N(s)/D(s) has
relative degree given by

    d(F) = (degree of D) − (degree of N).

The function F is called proper if d(F) ≥ 0 and strictly proper if d(F) > 0.
A rational function can be thought of as a function on the imaginary axis,
F(jω), or as a function on the complex plane, where the notation F(s) is used.
A function F is said to be bounded (on the imaginary axis) if there exists a
constant M > 0 such that

    |F(jω)| ≤ M for all real ω.

The smallest such M, denoted ||F||_∞, namely,

    ||F||_∞ = sup_ω |F(jω)|,

is called the supremum of |F(jω)| (see Fig. 1.1).

A rational function F is bounded if and only if it is proper and has no poles
on the imaginary axis; the proof of this is left as an exercise. The value ||F||_∞
is not always attained, as the example below illustrates.


Fig. 1.1. Plot of |F(jω)| versus frequency ω, where F(s) = 1/((s − 0.5)² + 4).
The dotted line marks ||F||_∞.

Example. The function F(s) = s²/(s² − 10) is bounded, since |F(jω)| < 1
for all real ω. However, for this example there is no single finite frequency ω₀
at which |F(jω₀)| = 1. In other words, the supremum is not attained.
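As a quick numerical illustration (ours, not from the book's software package), one can sample |F(jω)| on a frequency grid and watch the values creep toward 1 without reaching it:

```python
import numpy as np

# F(s) = s^2 / (s^2 - 10); on the imaginary axis |F(jw)| = w^2 / (w^2 + 10) < 1.
def F(s):
    return s**2 / (s**2 - 10)

w = np.logspace(-2, 6, 2000)      # grid of frequencies on the jw axis
mag = np.abs(F(1j * w))

# The supremum is 1, but no finite frequency attains it.
print(mag.max())                   # very close to, but strictly below, 1
```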
Functions F(s) that satisfy the equality

    F(s*) = F(s)*     (where * denotes complex conjugation)                (1.1)

for all s are sometimes referred to as real (on the real axis) and are very impor-
tant in engineering. They arise as the Laplace transform of real-valued functions
of time. On the imaginary axis, equality (1.1) becomes

    F(−jω) = F(jω)*.                                                        (1.2)

All functions that appear in this book have this property. One can show that a
rational function is real if and only if the coefficients are real numbers.
A rational function is stable if every pole of the function has negative real
part and the relative degree is nonnegative.
The usual convention is to denote the set of all rational functions that are
stable and real by RH∞. Note that functions in RH∞ can be described as
proper rational functions with real coefficients and with no poles in the closed
right half-plane (RHP).²
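Membership in RH∞ is easy to test numerically. The helper below is our own sketch (it is not an OPTDesign function); coefficients are listed highest power first, as NumPy expects:

```python
import numpy as np

def in_RHinf(num, den, tol=1e-9):
    """Check: proper (relative degree >= 0) and no poles with Re(s) >= 0."""
    rel_deg = (len(den) - 1) - (len(num) - 1)   # deg D - deg N
    poles = np.roots(den)
    return rel_deg >= 0 and bool(np.all(poles.real < -tol))

# 1/(s+1) is in RH-infinity; 1/(s-1) has a RHP pole; s^2/(s+1) is improper.
print(in_RHinf([1], [1, 1]))        # True
print(in_RHinf([1], [1, -1]))       # False
print(in_RHinf([1, 0, 0], [1, 1]))  # False
```

A real implementation would also need to handle repeated roots and poles exactly on the imaginary axis more carefully than a fixed tolerance allows.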

1.2 The closed-loop system S

The basic closed-loop system we consider is a linear, time-invariant, finite-
dimensional system in the frequency domain description. It is depicted in Fig.
1.2.

²Some of the concepts introduced here apply to nonrational functions as well. For example,
a continuous function F(jω) may be bounded or may satisfy (1.2).

Fig. 1.2. The closed-loop system S

The functions P and C are proper, real, rational functions of s. They are
called the plant and the compensator, respectively. Figure 1.2 is called the
closed-loop system and denoted by S. We assume that the plant in Fig. 1.2
is given and cannot be modified, and that many choices of the compensator are
possible.
Besides P and C, there are other functions commonly associated with the
closed-loop system S, which we now derive. Begin with Fig. 1.2 to obtain the
equations

    e = u − y,    y = PCe.                                                  (1.3)

Combine the two equations in (1.3) to obtain

    y = PC(u − y).

Solve for y(s) and obtain

    y = (PC/(1 + PC)) u.                                                    (1.4)

Note that the variable s has been suppressed in (1.5) to simplify notation. Other
key functions are

The sensitivity function

The tracking error for input

The closed-loop compensator

The closed-loop plant

The open-loop transfer function



1.3 Designable transfer function


Roughly speaking, to design a system one must choose all of the system's parts
that have not been given in advance. This is done according to some criteria or
requirements.
The most obvious way to select all the parts of system S is to specify C.
Doing this determines the closed-loop system S completely, in the sense that all
its functions T, L, S, Q, etc. can be written in terms of the given part P and
our selection for C.
In this book we will use another way to select a closed-loop system S. Think
of T as a variable.³ If T is specified, then one can calculate C by solving in
(1.5):

    C = T/(P(1 − T)).                                                       (1.6)

Now we see that it is possible to write C and every transfer function associated
to S in terms of T and the given part P. That is, to determine the closed-
loop system S one has only to pick a particular value for T. Under these
circumstances we refer to T as a designable transfer function.
The designable transfer function cannot take an arbitrary form. There are
many properties that the physics or mathematics of the system dictate. There
is a set X of admissible functions to which the designable T must belong. For
example, X contains functions that are continuous on the imaginary axis and
uniformly bounded there. Clearly, the set X must be defined before the design
process begins. In this book the set X is a subset of RH∞.
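The change of variables is easy to exercise numerically: pick a designable T, recover C = T/(P(1 − T)) as in equation (1.6), and rebuild the closed loop. The plant and the chosen T below are invented for illustration (this is our sketch, not an OPTDesign session):

```python
import numpy as np

P = lambda s: 1.0 / (s + 1)      # given plant (illustrative)
T = lambda s: 2.0 / (s + 2)      # chosen designable transfer function

# Equation (1.6): C = T / (P (1 - T))
C = lambda s: T(s) / (P(s) * (1 - T(s)))

s = 1j * np.linspace(0.1, 10, 200)
T_check = P(s) * C(s) / (1 + P(s) * C(s))   # rebuild T from P and C via (1.5)
assert np.allclose(T_check, T(s))            # round trip recovers the chosen T
print("round trip OK")
```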

1.4 A system design problem


We now formalize the generic problem we attempt to solve in this book.
The designer rejects or accepts a system by comparing the actual charac-
teristics of the system to a set of requirements that describes what is expected
from the system. The two main types of requirements we consider are internal
stability and performance.
The system S is internally stable if its closed-loop transfer function T is
stable and there is no cancellation of right-half-plane (unstable) poles and zeros
in the product PC. Internal stability is treated in Chapter 2.
In the simplest possible terms, a performance requirement is an inequality
in terms of the designable closed-loop transfer function T. An example is a
bound on |T(jω)| that must hold at each frequency ω.
Performance requirements are discussed in Chapter 3.


We state the design problem as follows:

³This has the advantage that design specifications are usually presented in terms of the
closed-loop transfer function T. It also has a mathematical advantage that is important to
the optimization algorithms underlying our method.

Design. Given a plant P, a set of performance require-
ments P, and a set of feasible functions X ⊂ RH∞,
determine if there exist T ∈ X that make the
closed-loop system S internally stable and satisfy
all the performance requirements in P. If the an-
swer is yes, find one such T.

1.5 The method


A solution to Design is found by the following sequence of steps.
I. Obtain a mathematical description of the performance and
internal stability requirements.
Performance requirements. Each performance requirement is an inequal-
ity in terms of the designable transfer function T. In Chapter 3 we present a
list of frequency domain performance requirements from which the designer can
choose to formulate problems.
Internal stability requirements. Designable transfer functions T corre-
sponding to internally stable systems can be parameterized by a formula (see
Chapter 2). The beginner who simply wants to run a design package does not
need to know much about internal stability in that many design packages do
this step automatically.

II. Obtain a performance function and a performance index from the
performance function.
The mathematical expressions for the performance requirements are combined
to form a performance function F(ω, T(jω)), a positive-valued function of T
and ω. In many cases the performance function is defined in such a way that a
designable T satisfies the requirements if and only if

    F(ω, T(jω)) ≤ 1 for all ω.

By considering the worst-case performance over all frequencies, a performance
index that depends on the closed-loop transfer function T is obtained:

    γ(T) = sup_ω F(ω, T(jω)).

The performance index is a single number that gives a measure of goodness of
a choice of T. In other words, it is a cost function: the smaller γ(T), the better
the choice of T. A procedure to form F from the requirements is discussed in
Chapters 3 and 4.
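As a sketch of this step (with an invented performance function of the circular type, not one taken from the book), the index can be approximated by maximizing F over a frequency grid:

```python
import numpy as np

# Invented example: F penalizes the distance of T(jw) from a frequency-dependent
# target disk with center k(w) and radius r(w). These choices are ours, purely
# for illustration; they are not OPTDesign's built-in performance functions.
k = lambda w: 1.0 / (1 + w**2)       # target center
r = lambda w: 0.5                     # target radius

T = lambda s: 1.0 / (s + 1)           # candidate designable transfer function

def performance_index(T, wmax=100.0, n=2000):
    w = np.linspace(0.0, wmax, n)
    F = np.abs(T(1j * w) - k(w)) / r(w)   # F(w, T(jw)) <= 1 means requirement met
    return F.max()                         # approximates gamma(T) = sup_w F

gamma = performance_index(T)
print(gamma)
```

A finer grid (or a search refined near the maximizing frequency) gives a better approximation to the true supremum.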

III. Minimize the performance index over all T that correspond to an
internally stable system S.
This is the step that requires a special-purpose computer optimization program.
Minimizing the performance index over all possible T produces two outputs:

A yes or no answer to the question of whether there exist closed-loop
systems that meet all the requirements.

An optimal designable transfer function T*.

The function T* is (among the closed-loop transfer functions that satisfy internal
stability requirements) the best, in terms of overall performance as measured
by the index. Optimization (minimizing the performance index) is treated in
Chapter 4, and a computer session is presented in Chapter 5.
If it is determined in step III that there exist solutions to Design, then the
optimal designable transfer function T* can be used to run simulations, can be
implemented physically, or, as will often be the case, can serve as a guide to
restating the problem with more stringent performance requirements. The new
Design problem is subsequently solved with steps I, II, and III.
If the result of step III is that no solutions to Design exist, the engineer is
confronted with two possible paths. One is to redefine the problem completely
by radically changing the specs. The other is to reexamine the requirements to
determine which of them can be relaxed. In this case, a problem is formulated
with the new set of requirements and then solved with steps I, II, and III.
Since in practice it is common to repeat steps I, II, and III several times in
the manner described above, we add another step to our list. With step IV one
can treat a sequence of Design problems.

IV. If many satisfactory closed-loop systems exist (or none exist),
tighten (or loosen) the specifications accordingly and go to
step I. Stop if this process cannot be carried further.

The mathematics used here produce T* and an associated C*, via equation
(1.6), given by a set of values on the jω axis. In most instances the designer will
want to represent C* as a rational function. In some cases, it is desirable that
this rational function have low order. These two objectives can be accomplished
with the techniques provided by the subjects of system identification and model
reduction. While these subjects are not treated in this book,4 the software
package OPTDesign has functions for doing (stable or otherwise) rational fits
of data.
We finish this chapter with a caveat. It may be desirable in doing system
design to select compensators that are stable. For example, this would be the
case if the engineer wants to build and test the compensator as an independent
unit. In its current state, the theory and computational methods of H∞ opti-
mization do not handle stability of compensators. Thus optimal compensators
C* produced by the methods used here are not necessarily stable (see [DFT92],
page 79).
⁴The reader is referred to [SS90] and [GKL89] for theoretical aspects of system identification
and model reduction. See also [B92] and [Tr86].

1.6 Exercises
1. Verify that the properties of being real, bounded, stable, and proper are
preserved under the following operations: addition of two rational func-
tions, multiplication of rational functions, and multiplication of a rational
function by a real number.
2. Prove that a rational function is real if and only if the coefficients can be
chosen to be real numbers.
3. Prove that a rational function F is bounded if and only if it is proper and
has no poles on the imaginary axis.
4. Use formula (1.6) in verifying the following relations:

5. Write all functions L, C, S, and Q in terms of P and T.


6. Consider a closed-loop system S where T is stable and no right-half-plane
(RHP) pole-zero cancellation occurs in PC. Prove that in this case
a. If s₀ is a zero of P in the RHP, then T(s₀) = 0.
b. If s₁ is a pole of P in the RHP, then T(s₁) = 1.
7. Rewrite the relation

in terms of T(jω).
Chapter 2

Internal Stability

This chapter is an introduction to internal stability of systems. The main
concept is discussed in section 2.1, where it is illustrated with examples. Internal
stability is closely connected to a mathematical topic called interpolation, and
that is the subject of section 2.2. We conclude the chapter with a section on
control systems for a given stable plant. The chapter gives the student a quick
overview of internal stability, as well as enough background to treat design prob-
lems of certain types. The discussion of internal stability in full generality is
left to the more advanced student and is presented in Chapter 7.

2.1 Control and stability


Consider the closed-loop system S depicted in Fig. 2.1, where P and C are
proper, real, rational functions. Given a plant P, by choosing an appropriate
compensator C one can obtain a transfer function T for the closed-loop system
that is stable, i.e., that has no poles in the closed right-half-plane (RHP).

Fig. 2.1. The closed-loop system S.

One way to achieve this is to choose C that cancels RHP poles and zeros
of P. However, any RHP pole-zero cancellation in the product PC is highly
undesirable, because small uncertainties in the plant P or in the compensator's
construction lead to a radical change in the behavior of the closed-loop system
S. We now illustrate this point with an example.


Example. For the given plant P(s) = 1/(s − 1), consider the compensators
C₁(s) = (s − 1)/(s + 1) and C₂(s) = 2. Observe that a RHP zero of the
compensator C₁ cancels a RHP pole of the plant when the product P(s)C₁(s)
is formed:

That the closed-loop transfer function T₁ associated with C₁ is stable is clear
from the following calculation:

Now consider a small perturbation of the plant corresponding to a change
in the location of the pole. Thus we now have the plant

where δ is a real number with small absolute value. For this plant and the
compensator C₁, the closed-loop transfer function is

The function T₁^δ has an unstable pole for all small values of δ, except for δ = 0.
Let s_δ denote this unstable pole. Figure 2.2 shows the plot of s_δ.

Fig. 2.2. Plot of the unstable pole s_δ of T₁^δ as a function of δ. The parameter δ
is on the horizontal axis. The point (0, 1) is omitted, since for δ = 0 there is no
unstable pole in T₁^δ.

On the other hand, for the plant P^δ and compensator C₂ the closed-loop
transfer function is

Note that T₂^δ is stable for all small δ, while T₁^δ is not.
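The contrast between the two compensators is easy to confirm numerically. The sketch below (Python with NumPy) assumes the perturbed plant is P^δ(s) = 1/(s − 1 − δ), which is consistent with the behavior of s_δ described above; the polynomial arithmetic is worked out in the comments.

```python
import numpy as np

def closed_loop_den(delta):
    # With P_delta(s) = 1/(s - 1 - delta) and C1(s) = (s - 1)/(s + 1),
    # clearing fractions in T = P*C1/(1 + P*C1) gives the denominator
    # (s - 1 - delta)(s + 1) + (s - 1) = s^2 + (1 - delta) s - (2 + delta).
    return np.array([1.0, 1.0 - delta, -(2.0 + delta)])

# At delta = 0 the RHP root s = 1 is canceled by the numerator zero at s = 1,
# but for delta != 0 it survives as a genuine unstable closed-loop pole s_delta.
for delta in (0.1, -0.1, 0.01):
    assert max(r.real for r in np.roots(closed_loop_den(delta))) > 0

# With C2 = 2 the closed-loop denominator is (s - 1 - delta) + 2 = s + 1 - delta:
# its only pole, s = delta - 1, stays in the LHP for all small |delta|.
for delta in (0.1, -0.1, 0.01):
    assert delta - 1.0 < 0
```

Running the same computation over a grid of δ values reproduces the branch of unstable poles plotted in Fig. 2.2.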


In this example we see that cancellation of a RHP pole of P with a zero
of C produces an undesirable system, since the latter becomes unstable under
small perturbations of the plant. A similar phenomenon occurs when a RHP
zero of the plant is canceled with a pole of the compensator (see Exercise 1).
Thus stability of T, sometimes referred to as external stability of the closed-
loop system, is not sufficient to guarantee a satisfactory closed-loop system S,
although it is a necessary ingredient in the practical design of systems.
The closed-loop transfer function describes external, or input-output, be-
havior of the system. A more satisfying concept of stability is that of internal
stability.

DEFINITION. The system S is internally stable if the following are all satis-
fied.

i. The closed-loop transfer function T is in RH∞.

ii. A RHP pole of the plant is not canceled by a RHP zero of the compensator.

iii. A RHP zero of the plant is not canceled by a RHP pole of the compensator.

Here RH∞ is the linear space of all proper rational functions that are stable
(poles off the closed RHP) and real (real coefficients).
The fundamental problem we treat in this chapter is as follows: given a plant
P in a certain class, find a description of all the internally stable systems S with
P as plant. The same problem is discussed in Chapter 7 for any plant P.
There are many ways in which one can present an answer to this problem,
all of which are mathematically equivalent. In this book we choose one that
consists of writing a formula that is (and must be) satisfied by all designable
closed-loop transfer functions T of these systems. The reasons for choosing this
particular approach are technical rather than physical: we have a procedure
available that uses this formula to give an answer to system design problems.
To derive formulas for internally stable systems we need to introduce the
concept of interpolation condition, which is presented in the next section.

2.2 Interpolation
Given a complex number s₀ and a rational function f, a relation of the form

is called an interpolation condition on f. Interpolation constraints on functions
in RH∞ play an important role in H∞ control. One of the main tools in solving
problems involving interpolation constraints is a formula for parameterizing all
RH∞ functions meeting interpolation conditions.
A case familiar to all is that of a polynomial f (rather than rational). A basic
result from algebra states that given a point s₀ in the complex plane and a
polynomial f, then

    f(s₀) = 0                                                         (2.4)

if and only if one can write

    f(s) = (s − s₀)h(s),

where h is some polynomial. This can be thought of as a formula or parame-
terization of all functions f satisfying (2.4).
To illustrate the role of interpolation in internal stability, we consider the
case of a system S with a plant P that has a first-order (simple) pole at s = s₀
in the RHP. In the discussion below the following interpolation condition is
important:

    T(s₀) = 1.                                                        (2.6)

CLAIM. If s₀ is a simple RHP pole of P, and T satisfies (2.6), then there is
no pole-zero cancellation in the product P(s)C(s) at s = s₀.

Proof. If there is cancellation, then P(s₀)C(s₀) is finite, and this implies
that

    T(s₀) = P(s₀)C(s₀) / (1 + P(s₀)C(s₀)) ≠ 1.

This proves the claim.


CLAIM. If s₀ is a simple RHP pole of P, and if there is no RHP pole-zero
cancellation in PC at s = s₀, then T satisfies (2.6).

Proof. If there is no cancellation in the product P(s)C(s) at s = s₀, then
P(s)C(s) has a pole of order n₀ > 0 at s = s₀. Let

Then ℓ is a complex number and

The above claims form the first half of the following result.
PROPOSITION 2.2.1. Let S be a system with plant P and closed-loop transfer
function T.

If s₀ is a simple RHP pole of P, then pole-zero cancellation in the product
PC does not occur at s = s₀ if and only if T(s₀) = 1.

If s₀ is a simple RHP zero of P, then pole-zero cancellation in the product
PC does not occur at s = s₀ if and only if T(s₀) = 0.
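Proposition 2.2.1 can be checked numerically on small examples. The plant/compensator pairs below are illustrative choices, not taken from the text (a minimal Python sketch):

```python
# P has a simple RHP pole at s0 = 1; C = 2 gives the internally stable
# closed loop T = PC/(1+PC) = 2/(s+1).
P = lambda s: 1.0 / (s - 1.0)
C = lambda s: 2.0
T = lambda s: P(s) * C(s) / (1.0 + P(s) * C(s))

s0 = 1.0 + 1e-9                 # evaluate just off the pole of P
assert abs(T(s0) - 1.0) < 1e-6  # T(s0) = 1 at a simple RHP pole of P

# P2(s) = (s-1)/(s+2) has a simple RHP zero at s0 = 1; with C = 1 the closed
# loop is T2 = P2/(1+P2) = (s-1)/(2s+1), and T2(1) = 0 as predicted.
T2 = lambda s: (s - 1.0) / (2.0 * s + 1.0)
assert abs(T2(1.0)) < 1e-12
```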

One consequence of Proposition 2.2.1 is that for certain systems, internal
stability is described in terms of a set J of interpolation conditions such as
T(s₀) = 0 or T(s₀) = 1. A complete discussion of this can be found in Chapter
7, but in this chapter we restrict our study to the case where the plant is stable.
We shall see in Chapter 7 that one can always write down a formula for all
functions T in RH∞ that satisfy given interpolation conditions. Such a formula
has the form

    T = A + B T₁,

where A and B are certain functions in RH∞ that can be determined from J,
and T₁ could be any function in RH∞. Thus as T₁ sweeps RH∞, we have T
sweeping all functions in RH∞ meeting J.

2.3 Systems with a stable plant


Systems S with a RHP-stable plant are a common type in practical control
systems. These are also simple enough to provide a good introduction to the
study of internal stability.
We now try to answer the question, given a stable plant P, what systems S
have P as plant and are internally stable?
To give an answer we will find a formula for the closed-loop transfer functions
T that come from such systems. Here is a restatement of the question above:
given a stable plant P, describe all possible closed-loop transfer functions T that
come from an internally stable system S with plant P. The answer we seek is
in the following theorem.
THEOREM 2.3.1. Let P be a strictly proper, RHP-stable plant for the system
S. Then the system S is internally stable if and only if there exists T₁ ∈ RH∞
such that T = PT₁.
Proof. If S is internally stable, then PC has the same zeros as P, including
multiplicity. Thus the zeros of T = PC/(1 + PC) contain the zeros of P, and
this implies that the function T₁ := T/P has no RHP poles. The function
T₁ is proper since the system S has proper C and T₁ = C/(1 + PC). Thus
we have that T₁ is an element of RH∞. This proves the "only if" side of the
theorem. Conversely, suppose that T = PT₁ for some T₁ ∈ RH∞. Note that
the compensator is given by

    C = T₁ / (1 − PT₁).

It is clear from this formula and from P being strictly proper that C is proper.
For C to have a pole at the same location s = s₀ as a RHP zero of P, one must
have 1 − P(s₀)T₁(s₀) = 0, which is impossible since T₁ has no pole at s₀ and P
is zero there.
Example. If the plant of an internally stable system is

then any stable closed-loop transfer function T has the form

where T₁ is some (or any) element of RH∞. We note that T₁ may not be strictly
proper. In fact, d(T₁) ≠ 0 forces d(C) = 0 upon the system.
Note that we used the actual plant P in the formula (2.8). While this may
be convenient, many other formulas for T are possible. For example, we can
correctly write

to describe all T from internally stable S.

Example. If now P(s) = (s² − 4)/(s⁴ + 2s² + 2), then P is stable and strictly
proper, so for the system to be internally stable the closed-loop transfer function
must have the form

for some T₁ ∈ RH∞.
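The parameterization of Theorem 2.3.1 can be exercised end to end: pick any T₁ in RH∞, form T = PT₁, recover the compensator C = T₁/(1 − PT₁), and confirm that the feedback formula reproduces T. A sketch, with an illustrative choice of P and T₁ (not the plants above):

```python
# Stable, strictly proper plant P and a free parameter T1 in RH-infinity.
P  = lambda s: 1.0 / (s + 1.0)
T1 = lambda s: 1.0 / (s + 2.0)

T = lambda s: P(s) * T1(s)                   # T = P*T1, as in Theorem 2.3.1
C = lambda s: T1(s) / (1.0 - P(s) * T1(s))   # compensator recovered from T1

# The recovered C reproduces T through the feedback formula T = PC/(1+PC).
for s in (0.5 + 1j, 2.0, 3.0j):
    L = P(s) * C(s)
    assert abs(L / (1.0 + L) - T(s)) < 1e-12
```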

2.4 Exercise
1. Consider the family of plants given by

where δ is a real constant. Show that

   a. For small |δ|, the compensator C₁(s) = 1/(s − 1) and plant P^δ yield
      a closed-loop transfer function T^δ that is stable if and only if δ = 0.

   b. For all small |δ|, the compensator C₂(s) = (s + 1)/(s + 3) and plant
      P^δ yield a stable closed-loop transfer function T^δ.
Chapter 3

Frequency Domain
Performance Requirements
Frequency domain performance requirements have a convenient graphical inter-
pretation in terms of disks. Section 3.1 reviews basic concepts and introduces
disk inequalities. The most common performance requirements for control sys-
tems are given in section 3.2. Section 3.3 discusses disk inequalities arising from
performance requirements. The first-time reader may stop after reading this
section and jump to Chapter 4. The topics covered at this point are sufficient
to provide the basic tools to solve simple design problems, when used in con-
junction with material from Chapters 2, 4, and 5. The more interested reader
may continue with sections 3.4 and 3.5, which explain additional measures of
performance and the corresponding disk inequalities.

3.1 Introduction
3.1.1 The closed-loop system S
Consider the closed-loop system depicted in Fig. 3.1. In this system, P(s) and
C(s) are proper real rational functions of s. We take the plant P to be a given
rational function and use the closed-loop transfer function T to parameterize
the closed-loop systems S obtained with different compensators. Hence many
systems S with given plant P are possible by letting the designable transfer
function T take different values. Key functions are

The closed-loop function T = PC/(1 + PC);

The sensitivity function S = 1/(1 + PC);

The tracking error for input U, E_u = (1 − T)U;

The closed-loop compensator CS = C/(1 + PC);


Fig. 3.1. The closed-loop system S.

The closed-loop plant PS = P/(1 + PC);

The open-loop transfer function L = PC.

3.1.2 Frequency domain performance requirements


Two types of requirements imposed on the system S are discussed in this book:
internal stability and performance. The internal stability requirement was stud-
ied in Chapter 2. Performance requirements are inequalities involving the func-
tions of the system. In this chapter we focus on performance requirements in
the frequency domain.
DEFINITION. A frequency domain performance requirement is an inequality
that the designable transfer function T must satisfy, and an interval in ω for
which the inequality is required to hold.

3.1.3 Disk inequalities


All of the frequency domain performance requirements treated in this chapter
can be written in the form of a disk inequality

    |K(jω) − T(jω)| ≤ R(jω),                                          (3.1)

which T must satisfy. Here K and R are fixed functions that embody the desired
specs of the system. K is called the center of the disk (3.1) and R is called the
radius of (3.1). Disk inequalities are easy to plot as regions in 3-D space (see
Fig. 3.2) and correspond to the inside of a tubelike domain. If the frequency ω
is fixed, then a disk inequality is represented in the complex plane by a region

    S_ω = {z : |K(jω) − z| ≤ R(jω)}

consisting of a solid disk.


In many cases the requirements give one disk inequality on one frequency
band, and other disk inequalities on other frequency bands. When the various
frequency bands are disjoint, it is possible to link these disk inequalities together
into a single one, valid for all frequencies:

Fig. 3.2. Tubelike region {(ω, z) : |K(jω) − z| = R(jω)} determined by a disk
inequality.

The process of piecing together disk inequalities will be discussed in sections 3.3
and 3.5. We shall see in Chapter 4 how to use disk inequalities to pose optimiza-
tion problems of the simplest kind discussed in this book. Many system design
problems can be solved by finding solutions to these optimization problems.
If two or more disk inequalities apply at the same frequency ω, then the
region S_ω is the intersection of the disks that correspond to the individual con-
straints. Thus in this case, S_ω is not a disk.
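At a fixed ω, testing a candidate value z = T(jω) against such requirements is just a point-in-disk test, and with simultaneous constraints z must pass every test. A minimal sketch (the centers, radii, and test points are illustrative numbers, not from the text):

```python
def in_disk(z, K, R):
    """Point-in-disk test: is z inside the closed disk |K - z| <= R?"""
    return abs(K - z) <= R

# Two constraints acting at the same frequency:
#   tracking:  |1 - z| <= 0.5     (disk centered at 1)
#   bandwidth: |z| <= 0.707       (disk centered at 0)
disks = [(1.0, 0.5), (0.0, 0.707)]

z = 0.6          # lies in the intersection: |1 - 0.6| = 0.4 and |0.6| = 0.6
assert all(in_disk(z, K, R) for K, R in disks)

z = 0.9          # tracking disk only: |0.9| > 0.707
assert in_disk(z, 1.0, 0.5) and not in_disk(z, 0.0, 0.707)
```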

3.2 Measures of performance


A set of basic frequency domain performance requirements are presented in this
section. More requirements are introduced later in the chapter; however, it is
possible to state physically sensible design problems with the requirements given
in this section alone.

3.2.1 Gain-phase margin


In classical control, the gain margin and phase margin are the chief measures
of how robustly stability is achieved. In order to formulate constraints in terms
of the closed-loop transfer function T, we use a composite measure of stabil-
ity, which seems at least as good as the gain margin and phase margin taken
together. Define the gain-phase margin of the system S to be

    m = min_ω |1 + P(jω)C(jω)|.                                       (3.3)

Graphically m is just the distance of the Nyquist plot of PC to the point −1 in
the complex plane. Simple algebra converts (3.3) to

    1/m = max_ω |1 − T(jω)|.                                          (3.4)

We can easily compare the gain-phase margin m with the gain margin g and
the phase margin φ by looking at the Nyquist plot in Fig. 3.3. Typically m is
more conservative than either φ or g.

Fig. 3.3. The gain margin g, phase margin φ, and gain-phase margin m.

If a given stable system has m near 0, then it is close to the unstable case,
which is undesirable. If α_m denotes the largest value of 1/m considered to be
acceptable in inequality (3.4), then we can formulate a constraint in terms of
α_m as (see Fig. 3.4)

Fancier notions are very natural. For example, if W is a given nonnegative
function of frequency, we can define

to be the weighted gain-phase margin. The corresponding constraint is derived
from this definition in the obvious way and produces α_m(jω) = W(jω)/m_W.

Gain-phase margin constraint

For a given α_m in the interval (0, 1),

    |1 − T(jω)| ≤ α_m for all ω.
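The two expressions for the margin can be compared numerically: the distance of the Nyquist plot of PC to −1 equals 1/max_ω |1 − T(jω)|, since 1 + PC = 1/(1 − T). A sketch with an illustrative first-order plant and proportional compensator:

```python
import numpy as np

P = lambda s: 1.0 / (s + 1.0)
C = lambda s: 2.0                      # illustrative proportional compensator

w = np.logspace(-3, 4, 20000)          # frequency grid
L = P(1j * w) * C(1j * w)              # open loop PC(jw)
T = L / (1.0 + L)                      # closed loop

m = np.min(np.abs(1.0 + L))            # distance of the Nyquist plot of PC to -1
# Since 1 + PC = 1/(1 - T), the same margin is 1/max|1 - T(jw)|:
assert abs(m - 1.0 / np.max(np.abs(1.0 - T))) < 1e-9
```

For this loop, |1 + PC(jω)| decreases monotonically toward 1 as ω grows, so the computed margin is essentially 1, i.e. max|1 − T| ≈ 1.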

Fig. 3.4. Region defined by the gain-phase margin constraint.

3.2.2 Tracking error


Fundamentally, tracking error is a time domain concept. The issue is to measure
how close the output y(t) of the closed-loop system is to a given input u(t).
Indeed, we want the error function

    e_u(t) = u(t) − y(t)

to be "small" for each function u in a big class of possible inputs. To use
frequency domain techniques, one must translate time domain criteria back to
the frequency domain. We do this below.
Consider system S with given time domain input u(t) and output y(t), and
let U(s) and Y(s) be the Laplace transforms of u(t) and y(t). Then Y(s) =
T(s)U(s), and the Laplace transform of e_u is

    E_u(s) = (1 − T(s))U(s).                                          (3.9)

For T ∈ RH∞, a condition that generates small e_u compared to the size of u is

    |1 − T(jω)| ≤ α_tr for all ω                                      (3.10)

for some small α_tr. Thus good tracking is generated by requiring (3.10) to hold.¹
It turns out that (3.10) is too stringent to be obtainable in a practical engi-
neering problem since the function T rolls off at high frequency. What is realistic
is to require the system to track low-frequency functions well, that is, to require
that T(jω) be close to 1 over some specified frequency range [−ω_tr, ω_tr]. Thus
we introduce one of the key constraints of control.
¹A standard theorem (the Plancherel theorem, [Yng88]) applied to (3.9) yields immediately

Tracking Error Constraint

For given α_tr > 0 and ω_tr > 0,

    |1 − T(jω)| ≤ α_tr for 0 ≤ ω ≤ ω_tr.

(See Fig. 3.5.)

Fig. 3.5. Region defined by the tracking error constraint.

3.2.3 Bandwidth
Bandwidth of a system is commonly defined as the frequency ω_b at which |T(jω)|
falls below a given constant times the low-frequency value of the input. This
constant usually is taken to be α_b = 1/√2 ≈ 0.707. A particular bandwidth is
required to ensure that the system at high frequency is not upset by noise, plant
uncertainty, actuator sluggishness, etc.
that for T ∈ RH∞ the error satisfies a useful bound:

    (∫₀^∞ |e_u(t)|² dt)^{1/2} / (∫₀^∞ |u(t)|² dt)^{1/2} ≤ sup_ω |1 − T(jω)|      (3.12)

for all functions u on [0, ∞) having finite energy ∫₀^∞ |u(t)|² dt. The inequality is sharp in
the sense that there are input functions that make this inequality as close as one desires to
equality. That is,

Consequently, if

    |1 − T(jω)| ≤ c for all ω,                                        (3.13)

then the normalized error in the left-hand side of (3.12) is no larger than c. Thus specs of the
form (3.13) guarantee that

for all u with finite energy. This is a very strong form of tracking.

Bandwidth Constraint

Given α_b ∈ (0, 1) and ω_b > 0,

    |T(jω)| ≤ α_b for ω ≥ ω_b.

(See Fig. 3.6.)

In practice a more refined constraint is required for very high frequency,
since it is desirable that T roll off to 0. This is discussed in section 3.2.4.

Fig. 3.6. Region defined by the bandwidth constraint.

3.2.4 Closed-loop roll-off


The open-loop transfer function L = PC of a given system rolls off at high
frequency, since the plant P and the compensator C are strictly proper. That
is,

From the relation

    T = PC / (1 + PC)                                                 (3.17)

we see that both T and PC roll off at the same rate. To obtain an inequality
useful for the design process, we must eliminate the compensator C from the
right-hand side of relation (3.17). To do this, use the fact that compensators
roll off at high frequency with the asymptotic form²

    |C(jω)| ≈ α_r / ω^n for large ω                                   (3.18)

²Since behavior near j∞ is what is important in (3.18), α_r can be taken to be a weight
function of frequency that is not too close to 0 at any given frequency.

for some α_r and some n. At the outset of design, an engineer should specify n
and α_r for the class of compensators that is to be built.³ Once α_r and n are
specified, combine (3.17) and (3.18) to obtain the closed-loop roll-off constraint.

Closed-loop roll-off constraint

For given α_r > 0, n ≥ 0, and ω_r > 0,

    |T(jω)| ≤ α_r |P(jω)| / ω^n for ω ≥ ω_r.                          (3.19)

(See Fig. 3.7.)

Fig. 3.7. Region defined by the closed-loop roll-off constraint.

3.2.5 Fundamental trade-offs


The most fundamental trade-off in control is between bandwidth constraints
and performance measures such as tracking or gain-phase margin, which dom-
inate considerations at low frequency. Bandwidth and roll-off are consistent
constraints, and tracking error competes strongly with them. Thus a basic
challenge in control is to get tracking over a broad enough frequency band, sub-
ject to the roll-off constraints dictated by actuator sluggishness and uncertainty
in plants, sensors, and the environment. In a particular engineering problem, to
obtain a precise feel for this trade-off requires the use of a computer program.

3.2.6 Choosing sets of performance requirements


The designer must choose a set of performance requirements in order to state
and then solve a design problem. This set should be selected with care, since a
³We thank L. Desoer for emphasizing this constraint to us.

bad choice leads to problems that do not make sense from either the numerical
or the physical point of view. We illustrate this with examples below.

Example. Consider the problem Design for the plant P(s) = 1/(1 − s) and
the performance requirement

An obvious mistake here is the absence of constraints valid on the band 0 < ω <
1.0. It is easy to find functions T that satisfy (3.20) and that produce internally
stable systems S, but that have undesirable behavior at low frequency.
Example. Now consider Design for P(s) with performance requirements

For any physical system, T must roll off to 0 at very high frequency. This is not
enforced by the constraints (3.21). One way to remedy this is to require roll-off
on T, for example, with the constraint

The examples above illustrate two basic principles for choosing a set of fre-
quency domain performance requirements:

   At each frequency ω, there must be a constraint in the set of requirements
   that is active at this frequency.

   Behavior of the system at very high frequency must be specified with a
   roll-off constraint on T.
There are cases where performance measures not described in this section are
important. In particular, if the plant has a zero or a pole right on the jω axis
or near it, the compensator bound and plant bound constraints must be used to
set up the problem correctly. See section 3.4 for details on this.

3.3 Piecing together disk inequalities


The performance measures introduced in section 3.2 can be expressed as disk
inequalities. In this section, we present an example in which several disk in-
equalities are put together and expressed as one disk inequality that is valid for
all frequencies. Being able to combine disk inequalities into a single one is a
necessary skill for students of control who want to solve system design problems
with the methods proposed in this book.
Example. Consider a case where the requirements are given by constraints
only on the tracking error, gain-phase margin, and bandwidth.4 We will formu-
late a single disk constraint from the given constraints.
Suppose the following constraints are given:
⁴This is not a physical example, since there is no closed-loop roll-off requirement.

Table 3.1. A design example with three constraints.

  Constraint                            Disk Inequality        Freq. Band     K(jω)   R(jω)
  Tracking and disturbance rejection    |1 − T(jω)| ≤ 0.25     ω ≤ 0.5        1       0.25
  Gain-phase                            |1 − T(jω)| ≤ 2.0      0.5 < ω < 3    1       2.0
  Bandwidth                             |T(jω)| ≤ 0.707        ω ≥ 3          0       0.707

|1 − T(jω)| ≤ 0.25 for ω ≤ 0.5 (tracking for unit step input),
|1 − T(jω)| ≤ 2.0 for all ω (gain-phase margin),
|T(jω)| ≤ 0.707 for ω ≥ 3 (bandwidth).

Note that for ω fixed, the large disks from the gain-phase margin constraint
contain the smaller disks from the bandwidth and the tracking error constraints.
Thus the smaller disks define the region where T(jω) must lie for either low or
high frequencies ω. For midrange frequencies, the gain-phase margin constraint
is the only constraint that is applicable. We collect this information in Table
3.1. The notation there is the same as the notation in inequality (3.1). The
region defined by these constraints is drawn in Fig. 3.8.

Fig. 3.8. The region defined by the constraints in Table 3.1.
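The envelope described by Table 3.1 is easy to encode as a piecewise center/radius function, after which any candidate T can be screened over a frequency grid. A sketch (the candidate T below is an illustrative choice that happens to satisfy all three constraints):

```python
import numpy as np

def envelope(w):
    """Center K(jw) and radius R(jw) of the single active disk at frequency w."""
    if w <= 0.5:
        return 1.0, 0.25      # tracking disk
    elif w < 3.0:
        return 1.0, 2.0       # gain-phase margin disk
    return 0.0, 0.707         # bandwidth disk

# Candidate closed-loop function: a first-order lag with bandwidth 2 rad/s.
T = lambda s: 1.0 / (0.5 * s + 1.0)

for w in np.linspace(0.0, 50.0, 5001):
    K, R = envelope(w)
    assert abs(K - T(1j * w)) <= R    # T(jw) stays inside the envelope
```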

The point is that at each frequency ω there is one and only one disk

in which T(jω) is constrained to lie. This is a mathematical description of


performance specifications that can easily be put in a computer program of the
kind described in this book.

3.4 More performance measures


In section 3.2 we showed how to convert the most basic specifications in a control
problem to precise performance measures (suitable for computer optimization).
A beginner who masters these alone will have a very good feel for the basics of
control.5 However, there are common control problems that cannot be formu-
lated correctly using only the constraints in section 3.2 — for example, if the
plant has a pole or a zero on the jω axis. In this section we give additional
performance measures, in particular ones that deal with zeros or poles of the
plant on the jω axis.

3.4.1 Peak magnitude


One common constraint is that the magnitude of the closed-loop transfer func-
tion T not become too large; if it does, then T is close to unstable. Also, it has
been seen in practical design (see [Ho63], page 190) that the peak magnitude of
T is related to a large overshoot in the step response function. In many cases a
peak value of 1.1-1.5 is acceptable, while in others this is too high. We now de-
fine the peak magnitude constraint; the corresponding graphical representation
is given in Fig. 3.9.

Peak magnitude constraint


For given

3.4.2 Compensator bound


Desoer and Gustafson proposed another very sensible criterion for good com-
pensation in [DG82]:

for some prescribed number ac. In terms of the closed-loop system S, this
says that the "closed-loop compensator" has magnitude bounded by ac. The
closed-loop compensator is what the compensator puts out in response to an
⁵Also, the examples in Chapters 4 and 5 will be clear to this beginner.

Fig. 3.9. Region defined by the peak magnitude constraint.

input to the system. The reason that (3.23) is imposed on the system is that
if it is violated, then the closed-loop system can saturate, a small current into
the system can cause arcing at the output of C, or other problems may appear.
In [DS81] Doyle and Stein discuss drawbacks of high loop gain in somewhat
different terms. Their main concern is that (3.23) holds at high frequencies; low
frequencies are less important.
Let us analyze (3.23) in terms of T. Since PC = T/(1 − T), we obtain

    |T(jω) / P(jω)| ≤ α_c.                                            (3.24)

Typically this constraint adds no serious restriction unless |P(jω)| is small (e.g., as
ω → ∞ or near zeros of P on the jω axis). Inequality (3.24) adds a restriction
that is binding near the zeros of P (ω = ∞ included). Thus the designer should
require (3.24) to hold whenever |ω − ω_z| < η, where jω_z is any zero of P on the
axis and η is a small positive number. Note that for large ω, inequality (3.24)
is contained in the closed-loop roll-off constraint (set n = 0 in (3.19)).

Compensator bound constraint

For given α_c > 0 and η > 0, and for any zero jω_z of P on the jω axis,

    |T(jω) / P(jω)| ≤ α_c for |ω − ω_z| < η.
(See Fig. 3.10.)
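How (3.24) behaves near a jω-axis zero of P can be seen in a small computation: since internal stability forces T to vanish where P does, the ratio T/P stays finite there and can be bounded. A sketch with an illustrative plant having a zero at ω_z = 0 (neither the plant nor the bound is from the text):

```python
import numpy as np

P = lambda s: s / (s + 1.0)      # plant with a zero on the jw axis at w_z = 0
T = lambda s: s / (s + 2.0)      # internally stable choice: T/P = (s+1)/(s+2) is in RH-infinity

alpha_c, eta = 2.0, 0.5          # illustrative bound and band half-width
for w in np.linspace(-eta, eta, 101):
    s = 1j * w
    # |C/(1+PC)| = |T/P|; use the analytically simplified ratio to avoid 0/0 at w = 0
    ratio = abs((s + 1.0) / (s + 2.0))
    assert ratio <= alpha_c
```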

3.4.3 Plant bound


Another constraint is that the "closed-loop plant" be bounded by a specified
value α_p. That is,

    |P(jω)(1 − T(jω))| ≤ α_p.                                         (3.26)

Fig. 3.10. Region defined by the compensator bound constraint, if there is a zero
of P at s = jω_z.

The closed-loop plant is the output of the closed-loop system of Fig. 3.1 to an
input to the plant.
Inequality (3.26) is analyzed in a similar fashion as inequality (3.24). Recall
that internal stability of the system implies that T(jω₀) = 1 when P(jω₀) = ∞.
Inequality (3.26) is binding near the poles of P; it must hold whenever |ω − ω_p| <
η_p, where jω_p is any pole of P on the jω axis and η_p is a small positive number.
From this we get the plant bound constraint, illustrated in Fig. 3.11.

Plant bound constraint

For given α_p > 0 and η_p > 0 and all poles jω_p of P on the jω axis,

    |P(jω)(1 − T(jω))| ≤ α_p for |ω − ω_p| < η_p.

3.4.4 Disturbance rejection


Systems are affected by disturbances, and one of the purposes of feedback control
is to minimize the effect of these disturbances. As an example, consider the
system given by an airplane, with input (established by the pilot) a certain
yaw rate u(t) and output the actual yaw rate of the airplane y(t). A gust
of wind produces yaw and thus affects the output of the system. A simple
way of modeling this situation is shown in Fig. 3.12, where the disturbance is
represented as an additive input to the system at the output of the plant. If
there is no other input to the system in Fig. 3.12, the transfer function from
disturbance d to plant output y is easily found to be

    T_{d→y} = 1 / (1 + PC) = 1 − T.

Fig. 3.11. Region defined by the plant bound constraint, if P has a pole at s = jω_p.

Fig. 3.12. Disturbance in a feedback system.

For good behavior of the feedback system it is necessary that T_{d→y} be RHP-
stable. This is always the case with internally stable systems. More can be
required — for example, that there is a restriction on the size of T_{d→y}. The
following inequality gives a precise statement, where c is a given constant:

    |1 − T(jω)| ≤ c for all ω.                                        (3.29)

We have already encountered inequalities similar to (3.29) before in the context
of the tracking error constraint, so at this point we refer the reader to the
discussions of the latter.

3.4.5 More on tracking error


Consider a system S with input u and output y. The steady-state error for
input u(t) is

    e_ss = lim_{t→∞} (u(t) − y(t)).                                   (3.30)

In many cases relation (3.30) has a counterpart in the frequency domain, which
can be obtained with the final value theorem (cf. [C44], page 191, or [LP61],
page 315). Suppose that u(t) is such that 𝓛(u − y)(s) has no poles on the closed
RHP, except perhaps a simple pole at s = 0. The final value theorem says that

in this case

    lim_{t→∞} (u − y)(t) = lim_{s→0} s 𝓛(u − y)(s).                  (3.31)

Thus we have from (3.30) that

Since Y(s) = T(s)U(s), it follows from (3.31) that

    e_ss = lim_{s→0} s (1 − T(s)) U(s).                               (3.32)

If α_tr denotes the largest acceptable value of e_ss in (3.32), then

If α_tr is not 0, one can write an inequality in the variable ω that guarantees
(3.33). We call this inequality the tracking error constraint for input U(s). It is
defined for those U(s) for which (1 − T(s))U(s) has no poles in the closed RHP,
except possibly for a simple pole at s = 0.

Tracking error constraint for input U


For given

3.4.6 Tracking and type n plants


We now discuss the case where the plant P has n pure integrations (a pole of
order n at s = 0). In this case we say that the plant is of "type n."
We assume that internal stability is a requirement, so RHP poles of the plant
P are not canceled by zeros of the compensator C. Since the plant is of type
n, the product PC also has a pole of order n at s = 0. Hence the sensitivity
function S = (1 + PC)⁻¹ has a zero of order n at s = 0. In particular, we have

    S(0) = S′(0) = ⋯ = S^(n−1)(0) = 0.                                (3.35)

It follows from equation (3.35) and from T = 1 − S that T satisfies the interpo-
lation conditions

    T(0) = 1,  T′(0) = ⋯ = T^(n−1)(0) = 0.                            (3.36)

Let us now take m ≤ n + 1 and U(s) = 1/s^m. Because of (3.36), the function

has no poles on the closed RHP, except possibly a pole of order 1 at s = 0.
Therefore the final value theorem applies, and one has

By combining (3.36) with (3.38), and applying l'Hôpital's rule for limits, we
obtain the following proposition.

PROPOSITION 3.4.1. An internally stable system with type n plant produces
a steady-state error e_ss for input U(s) = 1/s^m given by

    e_ss = lim_{s→0} (1 − T(s)) / s^{m−1}.                            (3.39)

Hence e_ss = 0 if 1 ≤ m ≤ n.


Proposition 3.4.1 implies that for internally stable systems with type n plant,
the most appropriate input U(s) = 1/s^m in the constraint (3.34) is U(s) =
1/s^{n+1}. Other inputs U that may be suitable are functions that are stable on
the closed RHP, except for a pole of order n + 1 at s = 0.
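Proposition 3.4.1 can be verified numerically for a type 1 plant: the steady-state error vanishes for a step input (m = 1 ≤ n) and is finite for a ramp (m = n + 1 = 2). A sketch, with an illustrative loop PC = K/(s(s + 1)) so that 1 − T = s(s + 1)/(s² + s + K); the choice of loop and gain are assumptions for the example:

```python
# Type 1 plant with C = K gives T(s) = K/(s^2 + s + K), hence
# 1 - T(s) = s(s + 1)/(s^2 + s + K).
K = 5.0
one_minus_T = lambda s: s * (s + 1.0) / (s ** 2 + s + K)

def e_ss(m, eps=1e-8):
    # e_ss = lim_{s->0} s (1 - T(s)) / s^m, evaluated numerically near s = 0
    return eps * one_minus_T(eps) / eps ** m

assert abs(e_ss(1)) < 1e-6            # step input: zero steady-state error
assert abs(e_ss(2) - 1.0 / K) < 1e-6  # ramp input: e_ss = 1/K
```

The ramp result identifies K as the velocity error constant of this loop, in line with the error-constant discussion that follows.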
Common time domain specifications on systems with type n plant are on
the position, velocity, and acceleration error constants K₀, K₁, and K₂,⁶ where

One can show that an internally stable system S with type n plant satisfies

3.5 A fully constrained problem


In section 3.3 we obtained a single disk inequality from performance require-
ments on bandwidth, tracking, and gain-phase margin. We consider here a
similar problem, but now with requirements on bandwidth, closed-loop roll-off,
plant bound, tracking, and gain-phase margin. Again, we will derive a single
disk inequality from the requirements.
The given plant is

with requirements

⁶These are denoted by K_p, K_v, and K_a in most books.

Table 3.2. A design example with five constraints.

  Constraint                            Disk Inequality    Freq. Band    K(jω)    R(jω)

  Plant

  Tracking and disturbance rejection

  Gain-phase

  Bandwidth

  Roll-off

The constraints a, b, and c are active at all frequencies 0 < ω < 1. Clearly
constraint c is contained in constraint a, so we restrict attention to a and b.
Solve the equation

to obtain the solution ω₀ ≈ 0.29. Thus b is more stringent than a on the
frequency band 0 < ω < 0.29, and b is less stringent than a on the band
0.29 < ω < 1. We see that c is the only constraint that applies on the band
1 < ω < 2. For ω > 2 it is clear that every T that satisfies d also satisfies c, so
it is enough to study constraints d and e on this band. We solve the equation

and obtain the solution ω₁ ≈ 2.41. Therefore constraint d is more stringent on
the band 2 < ω < 2.41, whereas e is more stringent on 2.41 < ω. We summarize
these findings in Table 3.2, which defines the center function K and the radius
function R at each frequency. See Fig. 3.13 for a graphical representation of the
functions K and R.
The format of Table 3.2 is convenient for representing constraints in a way
that is easy to read. Note that in this example we can write the constraints as
a single disk inequality. This is possible here because overlapping constraints
can be written as a single disk constraint. However, sometimes it is not possible
to define a radius and center functions that incorporate all the information

Fig. 3.13. Section of the envelope defined by the center K(jω) and radius R(jω)
in Table 3.2.

contained in the requirements. For example, two intersecting disks do not yield
a single disk.
Chapter 4

Optimization

In this chapter we discuss performance functions and how to use them as objec-
tive functions in optimization problems. These concepts are applied to system
design in examples. We begin with a review. The reader who is interested in
getting quickly to the point where examples can be solved by computer may
read sections 4.1-4.4 and then skip to Chapter 5.

4.1 Review of concepts


Consider again the closed-loop system S depicted in Fig. 4.1. Here P(s) and
C(s) are proper, real, rational functions of s. The function P(s) is given, and
the function C(s) depends on the choice of designable transfer function T(s).
The functions T(s) and C(s) are related by the formula

    T = PC / (1 + PC),

or, equivalently,

    C = (1/P) T / (1 - T).

The system S is internally stable if T has no RHP poles and there is no RHP
pole-zero cancellation in the product PC.

Fig. 4.1. The closed-loop system S.


Performance requirements are given inequalities to be satisfied by functions
associated with the system S. The inequalities may be stated in terms of time
or frequency. Frequency domain performance requirements were dealt with in
Chapter 3. Time domain performance requirements are briefly discussed in
Chapter 5 and treated in more detail in Chapter 6.
The basic control problem we consider is

Design  Given a plant P, a set of performance require-
ments P, and a set of feasible functions

    X = { T : T is the closed loop of an internally
          stable system S with plant P },

determine if there exist T ∈ X that satisfy all the
performance requirements in P. If the answer is
yes, find one such T.
The method used to solve Design is as follows:
I. Obtain a mathematical description of the performance and internal sta-
bility requirements (write down disk inequalities, and take notice of the
unstable poles and zeros of the plant).
II. Obtain a performance function (e.g., derive a formula for the radius and
center functions).
III. Find a feasible T that yields the best performance.
Often in practice steps I-III are followed by step IV.
IV. If many satisfactory T exist or none exist, tighten or loosen the specifica-
tions accordingly and go to step I. Stop if this process cannot be carried
further.
In this chapter we will emphasize steps II and III, especially when the per-
formance function is of the "circular" type (for a definition see section 4.2).
Step III requires the use of computer software, since numerical calculations are
involved.1 Step I was treated in Chapters 2 and 3, and step IV is illustrated
in Chapter 5. See Chapter 5 for a discussion of performance functions that are
not circular.

4.2 Generating a performance function


Let us suppose that we have one or more performance requirements that can be
expressed as disk inequalities. We consider in this section the case where all of
these can be put together as a single disk inequality, valid over all frequencies:

    |T(jω) - k(jω)| ≤ r(jω)  for all ω.    (4.1)

¹The examples in this chapter were solved with the software package OPTDesign.

We will build a "performance function" from (4.1) that can be used to solve
design problems. Begin by rewriting (4.1) in the following way:

    |T(jω) - k(jω)| / r(jω) ≤ 1.    (4.2)

Now for a given T calculate the largest value of the left-hand side of (4.2):

    γ(T) = sup over ω of |T(jω) - k(jω)| / r(jω).

If a particular T is available, then one can check if the original performance
requirements are satisfied. To do so one just calculates γ(T). It is not hard to
convince oneself that the two possible outcomes of calculating γ(T) are:

    γ(T) ≤ 1: T satisfies all the performance requirements;
    γ(T) > 1: T fails a performance requirement at some frequency.
Example. Suppose that the center k and radius r are given by

We now check if

satisfies the disk inequality (4.1):

After some calculations one obtains

Thus g(ω) has three critical points: ω = 0, ω = ±√13/3. The value g(0) = 1/4
is a minimum, and g(±√13/3) = 297/512 is a maximum. Also, g(ω) → 9/16 as
ω → ∞. A plot is most informative (see Fig. 4.2). Thus

Fig. 4.2. Plot of the function

We conclude that T does satisfy the disk inequality (4.1).


For designable T and given center and radius functions k and r, respectively,
the expression

    γ(T) = sup over ω of |T(jω) - k(jω)| / r(jω)    (4.8)

is called the circular performance function, or simply performance function. The
number γ(T) is the worst-case performance of T, but most of the time we will
simply call it the performance of T.
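In practice γ(T) is estimated by sampling: evaluate |T(jω) - k(jω)|/r(jω) on a frequency grid and take the maximum. A sketch with a made-up center, radius, and candidate T (these are our illustrative choices, not the k, r of the example above):

```python
import numpy as np

w = np.linspace(0.0, 50.0, 5001)   # frequency grid
s = 1j * w

k = np.ones_like(w)                # hypothetical center k(jw) = 1
r = 2.0 * np.ones_like(w)          # hypothetical radius r(jw) = 2
T = 5.0 / (s + 5.0)                # hypothetical stable candidate T

# Grid estimate of gamma(T) = sup_w |T(jw) - k(jw)| / r(jw)
gamma = np.max(np.abs(T - k) / r)
```

Here |T(jω) - 1|/2 = ω/(2√(ω² + 25)) climbs toward 1/2, so γ(T) < 1 and this T satisfies the (hypothetical) disk inequality with room to spare.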

4.3 Finding T with best performance


A feasible function T* with the property that

    γ(T*) ≤ γ(T)  for every feasible T

is called optimal. We reserve the symbol γ* for the "optimal performance," that
is, the performance of the optimal designable transfer function:

    γ* = γ(T*).
The optimal designable function T* is a precious object since there is no other T
that has better performance. By calculating the best performance possible, γ*,
the designer has conquered a peak. This vantage point offers information that
is crucial in making decisions on the next steps. If T* satisfies the constraints,
one can use T* as the choice of design or modify T* to get a low-order model
that still satisfies the constraints. If T* does not satisfy the constraints, then
it is immediately known that there is no T that satisfies the constraints! The
designer then can take other actions, such as to modify the "soft" constraints
and then do the calculation again.
The process of finding the optimal T* and the corresponding optimal perfor-
mance γ* is called optimization. The optimal T minimizes γ(T) over all possible

T (recall that this includes internal stability). More precisely, we say that T* is
a solution to

    min { γ(T) : T ∈ X }.

Because optimization is a complicated procedure, to carry it out one needs a
computer with the appropriate software. When computer algorithms and soft-
ware are not available to do the optimization, one has to resort to unsatisfactory
approaches, such as writing a formula for T as a rational function with given
order and unknown coefficients. Then one tries to get a good T by simply
trying lots of possible choices of coefficients. Such methods are not always
practical, and more important, the designer usually gets no indication of how
far from optimal the choice of T is.
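The unsatisfactory approach just mentioned can be made concrete. A sketch (Python with scipy, both our assumptions): fix a rational ansatz T = (a0 + a1·s)/(s² + b1·s + b0) and search over its coefficients to reduce the sampled performance. Note that nothing here certifies how far the result is from optimal, nor does the search enforce internal stability of the closed loop; that is exactly the objection raised above.

```python
import numpy as np
from scipy.optimize import minimize

w = np.linspace(0.0, 20.0, 400)
s = 1j * w
k = np.where(w < 1.0, 1.0, 0.0)            # hypothetical center
r = np.where(w < 1.0, 2.0, 1.0 / (1 + w))  # hypothetical radius

def gamma_of(coeffs):
    """Sampled performance of the ansatz T; stability is NOT checked."""
    a0, a1, b0, b1 = coeffs
    T = (a0 + a1 * s) / (s**2 + b1 * s + b0)
    return np.max(np.abs(T - k) / r)

# Blind coefficient search; the result comes with no optimality certificate.
res = minimize(gamma_of, x0=[1.0, 0.0, 1.0, 1.0], method="Nelder-Mead")
```

The search can only report the best value it stumbled on; unlike the H∞ machinery of this book, it gives no lower bound against which to judge it.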

4.3.1 Example
Physically the problem is to stabilize the plant and impose closed-loop gain-
phase margin and roll-off performance.

Problem. Solve Design for the plant P(s) = (s + 5)/(s + 1)², with performance
requirements gain-phase margin m = 0.5 and closed-loop roll-off |T(jω)| ≤
|P(jω)| for 1 ≤ ω.

Solution. Observe that the plant is strictly proper and stable. At this point
we will not spend more time discussing internal stability, since the software we
intend to use for the optimization takes care of this automatically.
We first construct the center and radius functions for our problem. Recall
that the gain-phase margin constant m is the closest acceptable distance from
the function L = PC to the point -1; in other words,

    |1 + P(jω)C(jω)| ≥ m  for all ω.

Since 1 - T = 1/(1 + PC), this is the same as the disk inequality |T(jω) - 1| ≤ 1/m.
The requirement on the roll-off can be written as

    |T(jω)| ≤ |P(jω)|,  1 ≤ ω.

Thus the frequency domain performance requirements are described by (see Fig.
4.3)

    |T(jω) - 1| ≤ 1/m,   0 ≤ ω < 1,
    |T(jω)| ≤ |P(jω)|,   1 ≤ ω.    (4.12)

The inequalities in (4.12) can be cast as a single disk inequality

    |T(jω) - K(jω)| ≤ R(jω)  for all ω,

where

Fig. 4.3. The performance envelope defined by center and radius functions in
the example is outlined by the thick curves. The horizontal axis is frequency ω,
and the two functions are K(jω) + R(jω) and K(jω) - R(jω).

    K(jω) = 1 for 0 ≤ ω < 1,    K(jω) = 0 for 1 ≤ ω,

and

    R(jω) = 1/m for 0 ≤ ω < 1,    R(jω) = |P(jω)| for 1 ≤ ω.
The optimal T* is the solution to

    min { γ(T) : T feasible }.

To give an idea of what the answer is, some results of calculations with the
software package OPTDesign are presented below. For more details see Chapter
5.
Calculations by computer give γ* = 0.09923, so there exist solutions to
Design. Here is a rational function that is a low-order approximation to the
optimal one:

See Figs. 4.4-4.7 for plots of the optimal T*.
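The feasibility claims here can be probed by brute force. A sketch (Python; our reading of the constraints encodes the envelope as center 1 with radius 1/m = 2 below ω = 1, and center 0 with radius |P(jω)| above; the trial compensator C = 1 is an arbitrary choice of ours, not from the text):

```python
import numpy as np

w = np.linspace(0.0, 100.0, 20001)
s = 1j * w
P = (s + 5.0) / (s + 1.0) ** 2

m = 0.5
K = np.where(w < 1.0, 1.0, 0.0)              # center: 1 below w = 1, then 0
R = np.where(w < 1.0, 1.0 / m, np.abs(P))    # radius: 1/m, then |P(jw)|

# Trial design C = 1 (arbitrary): T = PC/(1 + PC)
T = P / (1.0 + P)
gamma_trial = np.max(np.abs(T - K) / R)

# The open-loop choice T = 0 is also feasible here, since the plant is stable.
gamma_zero = np.max(np.abs(0.0 - K) / R)
```

The trial C = 1 turns out to violate the roll-off band (its performance peaks above 1 near ω ≈ 3.9), while T = 0 trivially achieves performance 0.5; both are far from the computed optimum γ* = 0.09923.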

4.4 Acceptable performance functions


There are two basic rules that have to be followed to define a useful performance
function.
RULE 1: The performance at any T and any frequency ω (including ω = ∞)
should be a well-defined real number.²

²The performance may turn out to be infinitely large at frequencies ω0 such that the plant
has a zero at s = jω0, or at infinite frequency. This should not be a problem, provided that
only T that give internally stable systems are considered.

Fig. 4.4. 3-D plot of the performance envelope and the solution T0. Clearly T0
satisfies the constraints.

Fig. 4.5. Magnitude of T0.

RULE 2: For any fixed frequency ω, the performance as a function of T
cannot be independent of T.

The justification of rules 1 and 2 requires the results in Part III of this book,
so we refer the interested reader to it.
The following are examples of performance functions that are common in
the engineering literature. We assume that the underlying design problem has
internal stability requirements, and that the plant P is strictly proper and has
no poles or zeros on the jω axis. Also, W, W1, and W2 denote given rational
weight functions of frequency.

1. If the system has a plant P with relative
degree p, then W must have relative degree -p (or larger) for the per-
formance to be well defined. If W does not, then Rule 2 is violated at
ω = ∞.

Fig. 4.6. Phase of T0.

Fig. 4.7. The zeros and poles of the compensator (0.786826 + 2.36403s +
2.68265s² + 1.42052s³ + 0.315071s⁴)/(7.08474 + 9.97784s + 13.6385s² + 7.38526s³ +
s⁴) that corresponds to T0 are indicated by "o" and "x," respectively. Note that
this compensator is stable. Stability of the compensator is not guaranteed a
priori by the design method of this book.

2. This is not a good performance, no
matter what W is. The reason is that since T rolls off at high frequency,
the performance at very high frequency is determined by W only. This
performance has no way of measuring how fast T rolls off or any relevant
high-frequency property of T.
3. The weight W2 should
be 0 at infinite frequency, so that the term with this weight does not
influence performance at very high frequency. The comment in the first
example also applies to the weight W1.
4. This continues
example 3. The relative degree of W2 should be -p/2 or larger, where p
is the relative degree of the plant.³

4.5 Performance not of the circular type


In this section we discuss performance functions more general than (4.8). Sup-
pose that we have a design problem where one of the constraints is

where W1 and W2 are known weight functions.⁴ Clearly this constraint is not
a disk inequality. Indeed, the region

is not a disk. However, the inequality (4.17) can be used to define a performance
function valid at least for those ω satisfying ω1 ≤ ω ≤ ω2. For this we set, for
z any complex number,

Thus a designable transfer function T satisfies the constraint at frequency ω if
and only if

    Γ(ω, T(jω)) ≤ 1.

The function Γ is called a performance function. Let us define for T ∈ RH∞
the number

    γ(T) = sup over ω of Γ(ω, T(jω)).

³When zeros or poles of the plant occur on the jω axis, weight functions have to be chosen
so that the weights in the terms W1(jω)|T(jω)|² and W2(jω)|1 - T(jω)|² have the zeros (resp.
poles) at the same values of ω as those of the plant, order included. The reason for this is the
internal stability requirement. See Chapter 7.
⁴See [OZ93]. Also see section 6.5 in Chapter 6.

The number γ(T) represents the cost, or overall performance, associated with
the designable transfer function T. It is called the performance index. Note
that (4.18) can be written as

Observe that γ(T) < 1 means that T satisfies the performance requirements,
with some slack. Thus we associate small values of γ(T) with good performance.
The simplest functions Γ arise when a single disk inequality applies at each
frequency.
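A non-circular performance function can be cooked up the same way as the circular one. As an illustration (entirely our assumption, since the displayed constraint (4.17) is not reproduced in this copy), suppose the requirement on a band were an annulus, W1(ω) ≤ |T(jω)| ≤ W2(ω); a Γ whose unit sublevel set is exactly that region is:

```python
def Gamma(w, z, W1, W2):
    """Hypothetical non-circular performance: Gamma <= 1 iff W1(w) <= |z| <= W2(w)."""
    return max(abs(z) / W2(w), W1(w) / abs(z))

# Hypothetical constant weights for a quick check.
W1 = lambda w: 0.5
W2 = lambda w: 2.0
```

As with the circular case, T satisfies the (hypothetical) constraint at ω exactly when Γ(ω, T(jω)) ≤ 1.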

4.6 Optimization
It is a fact that there is a formula, depending on the RHP poles and zeros of the
given plant P, that gives all T ∈ X. Indeed, there exist functions A, B ∈ RH∞
such that

    X = { T = A + B T1 : T1 ∈ RH∞ }.    (4.21)

For general P this formula is proved in Chapter 7, while for RHP-stable P we
already saw in Chapter 2 that

    X = { T = P Q : Q ∈ RH∞ }.    (4.22)

One approach to treating the design problem in practice is to use formulas (4.21)
and (4.22) to account for internal stability. Depending on the specific approach
and the tools available, to solve Design the formulas are either manipulated
directly by the designer or they are automatically handled by the computer. In
this case only the list of unstable zeros and poles of the plant is necessary for
the software to build and use the necessary formulas.

4.6.1 The optimization problem OPT1

Suppose that steps I and II in Section 4.1 have been completed; that is, there
is available a performance function Γ and a set X of admissible functions. To
find the "best" T possible, one must find the smallest possible value for the
performance index γ(T) for all T ∈ X. That is, one must solve the following
problem:

OPT1  Given Γ(ω, z), a positive-valued function of ω ∈ R
and of z ∈ C, and a set X of admissible functions,
find

    γ* = inf { γ(T) : T ∈ X },

and find an optimal T* if it exists.

In general OPT1 is a constrained problem, since one may have X ≠ RH∞.
By solving OPT1 we solve Design. This follows from Proposition 4.6.1.

PROPOSITION 4.6.1. If γ* < 1, then there exist solutions to Design, and if
γ* > 1, then there are no solutions to Design.
Proof. By definition, the number γ* is the smallest value possible for γ so
that the inequality

    γ(T) ≤ γ

has solutions T ∈ X. Therefore γ* > 1 says that there are no functions T that
satisfy (4.20), i.e., that there are no solutions to Design. Similarly, γ* < 1 says
that there is at least one function T ∈ X that satisfies (4.20), which implies that
there exist solutions to Design.

4.7 Internal stability and optimization


It is a very simple matter to transform OPT1 into an unconstrained optimization
problem OPT. This is desirable because powerful mathematical algorithms for
solving unconstrained problems are available. It is so easy that the computer can
do this in a manner that is transparent to the user, for example, as implemented
in the package OPTDesign. Occasionally the designer may want to do it directly
by hand (or may be forced to do it). Thus we include a discussion of this below.

4.7.1 The optimization problem OPT


To do the transformation of OPT1 into OPT we use the formula (4.21) in
combination with the performance Γ(ω, z). Suppose T is as in (4.21). Define a
new performance function as follows:

    Γ1(ω, z) = Γ(ω, A(jω) + B(jω) z).    (4.25)

Therefore OPT1 is equivalent to the following unconstrained optimization prob-
lem:

OPT  Given Γ1(ω, z), a positive-valued function of ω ∈ R
and of z ∈ C, find

    γ* = inf { sup over ω of Γ1(ω, T1(jω)) : T1 ∈ RH∞ },

and find an optimal T1* if it exists.

4.7.2 OPT with circular T


Note that if the function Γ is circular, then the function Γ1 in (4.25) is also
circular. To see this, combine the disk inequality (4.2) with relation (4.21) to
obtain the inequality

    |A(jω) + B(jω) T1(jω) - k(jω)| / r(jω) ≤ 1.

The resulting inequality is itself a disk inequality in the variable T1:

    |T1(jω) - k1(jω)| / r1(jω) ≤ 1,  with k1 = (k - A)/B and r1 = r/|B|.    (4.27)

Thus we can choose Γ1(ω, T1(jω)) to be the left-hand side of inequality (4.27)
to arrive at the unconstrained optimization problem OPT.
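The equivalence of the disk test in T and the disk test in T1 under the substitution T = A + B·T1 can be checked numerically at a fixed frequency. A sketch with hypothetical values of A, B, k, r and trial points T1 (all our choices):

```python
# At a fixed frequency, pick hypothetical values of A, B, k, r.
A, B = 0.2 + 0.1j, 1.5 - 0.3j     # from the parameterization T = A + B*T1
k, r = 1.0 + 0.0j, 2.0            # center and radius of the disk for T

k1 = (k - A) / B                  # center of the induced disk for T1
r1 = r / abs(B)                   # radius of the induced disk for T1

for T1 in (0.3 + 0.4j, -2.0 + 1.0j, 5.0 - 5.0j):
    T = A + B * T1
    in_disk_T  = abs(T - k) <= r          # disk inequality in T
    in_disk_T1 = abs(T1 - k1) <= r1       # disk inequality in T1
    assert in_disk_T == in_disk_T1        # the two tests always agree
```

The agreement is no accident: |A + B·T1 - k| = |B|·|T1 - (k - A)/B| exactly, which is why circularity survives the substitution.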

4.8 Exercises
1. Is the Γ given below of quasi-circular type? Take

2. Is γ with k and r given below nondegenerate?

ANSWERS

1. a. Yes.
   b. No, the level curves are ellipses.
   c. No, the level curves are lemniscates.
   d. Yes, because linear fractional transformations take circles to circles.

2. All Γ are degenerate, because
   a. Γ is not defined at all frequencies. This is very bad.
   b. Γ(j) = ∞. This causes numerical problems. You must change variables
      in function space (see Part III of this book).
Chapter 5

A Design Example with


OPTDesign

5.1 Introduction
This chapter introduces the reader to practical design by showing how the ideas
learned so far feed into a computer implementation. It is written in a generic
tone, so that one need not know anything specific about computation to get a
concrete idea of how to do a control design. The best explanation of this subject
is to give an actual computer design session. Unfortunately, this usually entails
getting involved in many specialized details peculiar to a package. However, we
have developed a program whose use is close enough to conventional English
that anyone can read it without specialized knowledge.
Our program is called OPTDesign and runs under Mathematica. This per-
mits any standard symbolic or numerical calculations to be done in a language
that is quite compatible with standard mathematical notation.

5.2 The problem


We now give the data for a specific problem. The plant is

    P(s) = (1 - s/5) / (1 + s/2)²,

and the requirements are tracking on the band 0 ≤ ω ≤ 0.3 with bound 1.5,
gain-phase margin constant of 0.5, bandwidth 0 ≤ ω ≤ 2.0, and closed-loop
roll-off with bound 2.5 on the band 4.0 ≤ ω (see Table 5.1). We wish to design
an internally stable system S that satisfies the given performance requirements.
The following is a list of inputs for a computer run. If you wanted to try
OPTDesign you would begin by loading OPTDesign into a Mathematica session.
<<OPTDesign`


Table 5.1. Performance requirements.

Constraint     Band

Tracking       0 ≤ ω < 0.3

Gain-phase     0.3 ≤ ω < 2.0

Bandwidth      2.0 ≤ ω < 4.0

Roll-off       4.0 ≤ ω

We enter the center and radius functions k0 and r0 directly as step functions.
For example, in Mathematica this is done using the Which[ ] command. This
is a strange name for what most English speakers call "If."

p[s_] = (1 - s/5)/(1 + s/2)^2;

wt = 0.3;
alphat = 1.5;
alphagpm = 2.0;
wb = 2.0;
alphab = 0.7;
wr = 4.0;
alphar = 2.5;

k0[w_] = Which[0 <= Abs[w] < wb, 1.0,
               wb <= Abs[w], 0.0];

r0[w_] = Which[0 <= Abs[w] < wt, alphat,
               wt <= Abs[w] < wb, alphagpm,
               wb <= Abs[w] < wr, alphab,
               wr <= Abs[w], alphar Abs[ p[I w] ]]

The requirements envelope that we defined above has jump discontinuities at
two locations on the semiaxis ω > 0. For sampling functions, a grid of points on
the jω axis has to be chosen. If the user does not specify a grid, then OPTDe-
sign chooses one with 128 points distributed around ω = 1.¹ Also, OPTDesign
automatically does a very small amount of smoothing of the discretized center
and radius functions (if desired, the user may specify the amount of smoothing).
The plot of the requirements envelope before and after smoothing gives an idea
of how much distortion we are introducing into the problem by smoothing func-
tions. We must keep in mind that the solution obtained with OPTDesign will
be optimal with respect to the "smoothed" envelope. The following command
produces the plot shown in Fig. 5.1.

EnvelopePlot[Radius->r0, Center->k0, FrequencyBand->{0., wr+1.}]

¹To change the default grid on the jω axis, use SetGrid[n, GridSpread -> b], where n is
an integer representing the number of gridpoints, and b is a positive number representing the
frequency around which the gridpoints are distributed.

Fig. 5.1. The original and smoothed requirements envelopes.

A 3-D plot of the requirements envelope (see Fig. 5.2) is produced with
EnvelopePlot3D[Radius->r0, Center->k0]

Fig. 5.2. 3-D plot of the requirements envelope.

A plot of the discrete profile of the requirements envelope can be used to
judge if there are enough frequency gridpoints in the band of interest (see Fig.
5.3). Note that in our example most of the gridpoints are in the band 0.1 <
ω < 10.0, which is where the center and radius functions have their distinctive
features. This is desirable.

EnvelopeLogPlot[Radius->r0, Center->k0,
    FrequencyBand->{0., wr+1.}, Discrete -> True];

If we do not like either of the plots in Figs. 5.1 and 5.3, then several input
parameters can be modified to rerun the problem and to obtain more satisfactory
plots. We leave this for a later section and proceed now with a run of the
program OPTDesign with the given data.

Fig. 5.3. Plot of the discrete profile of the requirements envelope.

5.3 Optimization with OPTDesign


Running OPTDesign for plant P and specified disks with radius r0 and center
k0 is done by entering

OPTDesign[p, Radius->r0, Center->k0]

The main things that are produced are

• A parameter γ* that tells us if (a smoothed version of) this disk-type
problem has a solution. If γ* is less than 1, then the problem has a
solution; indeed if γ* is much less than 1, then the constraints can be
tightened. But if γ* > 1, then the design problem has no solution and the
constraints must be loosened to obtain a problem that has a solution.

• The optimal closed-loop transfer function T.

Diagnostics and progress reports are routinely printed to the screen as the
program OPTDesign runs:

Parameterization for internal stability:

T = A + B * Tl, where Tl is RHP stable and

Processing performance function...


Sampling Radius and Center ...
Smoothing Radius and Center ...
Optimization routine Anopt now number-crunching

It  Current Value         Step      Optimality Tests       Error     Sm.  Grid
    gamma                           Flat       GrAlign     ned

 0  2.2485924730083E+00   N/A       8.9E-01    1.E+00      N/A       NON  128

 1  1.5969785364195E-01   1.8E+00   3.6E-03    0.E+00      1.5E-02   NON  128

Summary

Supremum of gamma: gamma* 1.596978536419496E-01


Optimality Test : Flat 3.5999393585E-03
Optimality Test : GrAlign 0
Error diagnostic : ned 1.54E-02

Calculating output functions


Resetting options
Done!

Observe that the output parameter γ* (in the above output, gamma*) is less
than 1. The column ned in the screen output is a measure of numerical error
or noise. Flat and GrAlign are diagnostics. If they are near 0, the calculated
solution is nearly optimal. We say more on this later in section 5.6.

We conclude that there are solutions to the (smoothed) Design problem and
that the accuracy of the computation is acceptable. Furthermore, since γ* is
much less than 1, we can tighten the performance requirements substantially and
still get a solution. The calculated closed-loop transfer function, open loop, and
compensator are stored in the output variables T, L, and Co, respectively.
We postpone discussing their format until subsection 5.5.1, where we show how
to plot and manipulate it. Now we move on to the production of a compensator.

5.4 Producing a rational compensator


The next step is to generate a rational compensator. We will first produce a
rational expression for the closed-loop transfer function and extract a rational
compensator from it.
Computer routines for doing rational approximation are not extremely re-
liable when the degree of the approximation is high (see section 5.9). The
rational approximation routine in our package appropriate here is RationalModel.
By default RationalModel finds a rational function that approximates the closed-
loop transfer function.²
Trat[s_] = RationalModel[s , DegreeOfDenominator->3]

Error = 0.171731

  -5 + s     (-4.67853 - 1.92117 s) (-5 + s)
 --------- + -------------------------------
         2           2
 (2. + s)    (2. + s)  (3.769 + 1. s)

RationalModel produces a rational function of the form

    Trat = A + B T1rat,

with stable rational function T1rat(s). The proper rational functions A and B depend
on the plant and incorporate automatically the internal stability requirements
on T. You must run OPTDesign before you run RationalModel.
Recall that the compensator C and the closed-loop transfer function T are
related by the formula

    C = (1/P) T / (1 - T).

Here we call the compensator compl.

compl[s_] = 1/p[s] Trat[s]/(1 - Trat[s]);

There is no point in writing out the formula for the output now, since most
valuable to us will be a zero-and-pole plot of compl. The following command
produces the plot shown in Fig. 5.4.

PlotZP[compl[s], s]

The figure suggests that compl has a pole-zero pair at s = 5.³ The standard
Mathematica functions do not cancel terms with decimal notation. To do the
cancellation we can use a function provided with OPTDesign:

²The Error number displayed by the RationalModel routine refers to the fit of T1 and not
to the fit of T.
³One way to confirm this is
complzeros = s /. NSolve[ Numerator[compl]==0, s ]
{-2., -2., -0.987364, 5.}
complpoles = s /. NSolve[ Denominator[compl]==0, s ]
{5, -6.60224, -1.04396 - 0.710493 I, -1.04396 + 0.710493 I}

Hence comp2(s) results from the simplification of compl. We plot the poles and
zeros of comp2 now (see Fig. 5.5).
PlotZP[comp2[s], s]

Fig. 5.4. Poles and zeros of compl.

Fig. 5.5. Poles and zeros of comp2.

It is clear from the plots in Fig. 5.4 and Fig. 5.5 that the compensator comp2
does not cancel RHP poles or zeros of the plant, which is a necessary condition
for internal stability of the overall system S. Now we calculate the closed-loop
transfer function Trat2 that corresponds to the compensator comp2, and follow
this with a plot of the poles and zeros of Trat2 (see Fig. 5.6).

PlotZP[Trat2[s], s]

Fig. 5.6. Poles and zeros of Trat2.

Note that since comp2 is proper and Trat2 is stable, for this choice of compen-
sator the system is internally stable.
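The pole-zero pair at s = 5 and the stability of the simplified compensator can be cross-checked outside Mathematica with ordinary root finding, using the numbers printed in the session above (a Python/numpy verification of ours, not part of OPTDesign):

```python
import numpy as np

# Trat as printed by RationalModel:
#   (s - 5)/(s + 2)^2 + (-4.67853 - 1.92117 s)(s - 5)/((s + 2)^2 (3.769 + s))
# Over the common denominator (s + 2)^2 (s + 3.769), its numerator is
num = np.polymul([1.0, -5.0],
                 np.polyadd([1.0, 3.769], [-1.92117, -4.67853]))
trat_zeros = np.sort(np.roots(num).real)   # expect -0.987364 and 5, as in footnote 3

# compl = (1/p) Trat/(1 - Trat) carries, before simplification, both a zero and
# a pole at s = 5 (the RHP zero of the plant); comp2 removes the pair.  The
# remaining printed compensator poles all lie in the open left half plane:
comp2_poles = np.array([-6.60224, -1.04396 - 0.710493j, -1.04396 + 0.710493j])
stable = bool(np.all(comp2_poles.real < 0))
```

The recovered zeros match the NSolve output in footnote 3, and the surviving poles confirm the stability of comp2 observed in Fig. 5.5.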

5.5 How good is the answer?


To verify that we have met the performance requirements we plot the closed-loop
transfer function and the performance envelope simultaneously. A 3-D picture
(shown in Fig. 5.7) of the requirements envelope together with the plot of Trat2
is produced with the following command.

For those who like old-fashioned 2-D plots, the following command produces
the plot in Fig. 5.8.

One usually wants to look at Bode plots to evaluate the design. These can be
obtained in several ways in most control packages. For example, in OPTDesign
the following commands produce the plots.

BodeMagnitude[ Trat2[s],s,{w,0.1,10},PlotLabel->"Magnitude Plot"];


BodePhase[ Trat2[s],s,{w,0.1,10},PlotLabel->"Phase Plot"];

Fig. 5.7. 3-D plot of Trat2 inside the requirements envelope.

Fig. 5.8. Plot of the original (nonsmoothed) performance of Trat2.

Instead of showing these we shall illustrate an important numerical point
by demonstrating Bode plots of functions on the jω axis gridpoints used by
the latest run of OPTDesign. This allows one to see if the grid captures the
frequency bands that are important for the problem. The Bode plots are shown
in Figs. 5.9 and 5.10. The commands are as follows.

Tw = Discretize[Trat2];

BodeMagnitude[Tw, s, {w,0.1,10}, PlotLabel->"Magnitude Plot"];

BodePhase[ Tw, s, {w,0.1,10}, PlotLabel->"Phase Plot"];

Plots for the open loop L = PC can be produced in similar fashion by plotting
L = p comp2.
One can see in the magnitude plot of the closed loop that most of the sample
points are located in the frequency band 0.1 < ω < 10. One can judge this to

Fig. 5.9. Bode plot (magnitude) of the closed-loop transfer function Trat2.

be acceptable from the numerical standpoint, based on the fact that both the
input functions k0 and r0 change mostly in this band. For details see section
5.8.

5.5.1 More on plots and functions


Some users may want to manipulate the output of an OPTDesign run before
dealing with rational fits. If you type T in a session after you run OPTDesign,
then Mathematica returns a list of complex numbers that are the values of T on
the grid. Now we give a brief discussion of commands to plot and manipulate
the closed loop T, the open loop L, or other lists of data. A list of some useful
commands is EnvelopePlot3D, BodeMagnitude, BodePhase, Nyquist, Nichols,
RHPListPlot3D, OPTDParametrize, Discretize, Grid.
To plot the list of data T and the requirements envelope simultaneously,
type

EnvelopePlot3D[ Radius -> r0, Center -> k0, ClosedLoop -> T ]

The important point is that the call is the same whether T is a rational
function or a list of data. The commands

BodeMagnitude[T], BodePhase[T], Nyquist[L] , Nichols[L]

also have this feature of working either on functions defined by formulas or lists
of data.
If one wants to plot or manipulate T by itself, then one must access the grid
on the jω axis where your OPTDesign session is evaluating T, via

Grid[ ]

Fig. 5.10. Bode plot (phase) of the closed-loop transfer function Trat2.

The output is a list of numbers on the jω axis. Now we can plot T using

RHPListPlot3D[ T, PlotRange -> {{0,3}, Automatic, Automatic} ]

where the PlotRange option sets the range you will see.
One can put together a function's values with the grid on which it is defined,
using

OPTDParametrize[T]

Note that you may achieve identical results with

Transpose[ {Grid[], T} ]

Indeed, this is how OPTDParametrize is defined.
It might well be useful to note that the plotting commands above have a
very simple core plus a few embellishments to make the scales, to make labels,
to put the -1 point into the Nyquist plot, etc. As an example we illustrate how
one builds plotting routines such as Nyquist and RHPListPlot3D. The core of
RHPListPlot3D[ T ] is

ScatterPlot3D[ Transpose[ {Grid[], Re[T], Im[T]} ] ]

The core of Nyquist[ L ] is

ListPlot[ Transpose[ {Re[L], Im[L]} ] ]
So far we have discussed functions presented as data sets. Now we mention
that if you have a rational function Tr rather than a lot of discrete values,
Discretize[Tr] gives a list of values of Tr on the ambient OPTDesign session
grid.
The commands mentioned in this section and their output are discussed
further in Appendix G.
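Because the "simple core" of these routines is just list manipulation, the same data handling is easy to reproduce in any environment. A sketch of the Transpose/Grid idiom in Python (the grid and sampled values are hypothetical):

```python
# Hypothetical grid on the jw axis and sampled closed-loop values.
grid = [0.1, 1.0, 10.0]
T = [0.9 + 0.1j, 0.5 - 0.4j, 0.05 - 0.02j]

# Core of OPTDParametrize[T]: Transpose[{Grid[], T}]
parametrized = list(zip(grid, T))

# Core of Nyquist[L]: ListPlot[Transpose[{Re[L], Im[L]}]]
nyquist_data = [(z.real, z.imag) for z in T]
```

Feeding nyquist_data to any 2-D list plotter reproduces the core of the Nyquist command; the embellishments (scales, labels, the -1 point) are cosmetic.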

5.6 Optimality diagnostics


If a computer run stops, does it mean that a solution has been obtained? How
good is the "calculated solution"? If garbage was produced, how do we know?
These questions can be answered with optimality diagnostics. These are one or
more indicators that tell directly or indirectly whether a "computed solution"
is close to an "actual solution."
The importance of having optimality diagnostics available for optimization
problems cannot be overemphasized. Any respectable optimization theory must
have them, and any serious optimization software must implement them.
The optimization theory behind the problem Design has two diagnostics
that are crucial: the Flatness diagnostic and the Gradient Alignment diagnostic.
These are nonnegative real numbers; when the design is a true theoretical optimum,
these two numbers are equal to zero. They could be implemented in most
H∞ packages we know, so they are in no way dependent on OPTDesign.
When run, the program OPTDesign "iterates"; that is, it produces a sequence
T1(0), T1(1), T1(2), ... of "guesses at the answer," which improve the performance
with each step or iteration. For each one of these iterations, the Flatness and
Gradient Alignment diagnostics are printed on the screen so that the user may
follow the progress of the computer run.
The Flatness and Gradient Alignment diagnostics are labeled Flat and
GrAlign. Other output diagnostics that are printed on the screen for each iter-
ation are the performance level attained at that iteration, γ, and an indicator
of numerical error, ned. Table 5.2 gives more details on this.

5.7 Specifying compensator roll-off


In the example above we obtained a compensator with relative degree 0; that
is, the compensator was proper but not strictly proper, so it does not roll off.
Recall from section 3.2.4 that if you want to specify the relative degree of the
compensator to be d(C) ≥ 1, set the roll-off constraint so that the radius func-
tion goes to 0 asymptotically with

    r0(ω) ≈ constant |P(jω)| / ω^d(C)

as ω → ∞, where d(P) is the relative degree of the plant P.


Here is an example where the relative degree of the compensator is specified
as d(C) = 2. Note that the roll-off of the function |p(jω)/ω²| equals d(P) + 2,
which is the appropriate roll-off of the radius. Observe that the degree of the
compensator enters in two places, in the definition of the radius function and in
the call to OPTDesign.

k0[w_] = Which[0 <= Abs[w] < wb, 1.0,
               wb <= Abs[w], 0.0];

r0[w_] = Which[0 <= Abs[w] < wt, alphat,
               wt <= Abs[w] < wb, alphagpm,
               wb <= Abs[w] < wr, alphab,
               wr <= Abs[w], alphar Abs[ p[I w]/w^2 ]
              ];

OPTDesign[p, Center->k0, Radius->r0, Cdecay -> 2];
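One can check numerically that the radius above really rolls off at the advertised rate: the plant p has relative degree d(P) = 1, so |p(jω)/ω²| should fall with log-log slope -(d(P) + d(C)) = -3 at high frequency. A quick Python check (an independent verification of ours, not part of the OPTDesign session):

```python
import numpy as np

def p(s):
    return (1 - s / 5) / (1 + s / 2) ** 2

w1, w2 = 1.0e3, 1.0e4                       # two large frequencies
r1 = abs(p(1j * w1) / w1**2)
r2 = abs(p(1j * w2) / w2**2)

# log-log slope between the two sample points; expect about -3
slope = (np.log10(r2) - np.log10(r1)) / (np.log10(w2) - np.log10(w1))
```

Asymptotically |p(jω)| behaves like 4/(5ω), so the weighted radius behaves like 4/(5ω³), confirming the slope of -3.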

Table 5.2. Optimality diagnostics.

γ: A positive number. It is the performance level attained at a current guess
at the answer. More precisely, it gives the worst-case performance value: if
Γ(ω, z) is the performance function and T(jω) is the current guess at the
answer, then γ = γ(T) = sup_ω Γ(ω, T(jω)).

Flat: A number between 0 and 1. One of the fundamental results in the theory
of H∞ optimization is the fact that in most problems, an optimal design
flattens the performance (see Chapter 9). That is, if T* is an optimal
function, a plot of Γ(ω, T*(jω)) vs. ω produces a horizontal line. The
diagnostic Flat measures the "nonflatness" of the current guess at the
answer. To give a hypothetical example, if the performance at the current
guess varies between 0.25 and 0.75 as ω goes from 0 to ∞, then in this case
the flatness diagnostic is Flat = 0.375. One interpretation is that the range
of the performance at the current guess is 37.5% of its maximum value.

GrAlign: A nonnegative number. This optimality diagnostic at the true answer
is equal to 0. It verifies that a certain winding number has the "appropriate
value." For details see sections 9.5 and 13.5.

ned: A positive number. When small it indicates that numerical calculations
contain little numerical error. Numerical trouble and what to do about it
are discussed in section 5.8.
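The Flat diagnostic can be illustrated with a small computation. The sketch below is Python, for illustration only (OPTDesign itself is a Mathematica package); the normalization (range divided by maximum) and the name flatness are our assumptions, not the package's internals.

```python
def flatness(perf_samples):
    """Flatness-style diagnostic: the range of the sampled performance
    values divided by their maximum.  The exact normalization used by
    OPTDesign is an assumption here."""
    hi, lo = max(perf_samples), min(perf_samples)
    return 0.0 if hi == 0.0 else (hi - lo) / hi

# An optimal H-infinity design flattens the performance, so samples of
# Gamma(w, T(jw)) from a nearly optimal T should give a value near 0.
print(flatness([0.61, 0.62, 0.61, 0.62]))
```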

5.8 Reducing the numerical error


In a nutshell, numerical error might be too large as a result of one or more of the
following causes: the requirements envelope changes too fast (discontinuities are
one common cause), the grid on the jω axis is badly chosen, or the
rational approximation to the solution to the problem is poor. In this section
we give some pointers as to how to proceed if numerical error arises.
1. The choice of gridpoints on the jω axis. The command Grid[Ngrid ->
   n, GridSpread -> b] produces n gridpoints on the ω axis, which
   accumulate near ω = b. For OPTDesign to perform well when solving a


design problem, it is necessary that a "sufficient" number of gridpoints
be produced in a frequency band where the center and radius functions
have their distinctive features. If this is a somewhat large band, then one
chooses b to be a midpoint of this band and also chooses n large enough
to populate the band of interest with a fair amount of points. What n is
good enough is impossible to say a priori, as one has to do a computer run
to get a "feel" for the adequacy of the chosen numbers n, b. For example,
to produce a grid with 256 points distributed around 3.5 and follow it with
an OPTDesign run, type the following commands.

wpts = Grid[Ngrid -> 256, GridSpread -> 3.5];

OPTDesign[p, Center -> k0, Radius -> r0, GridPoints -> wpts];
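For readers who want to experiment outside Mathematica, a grid of this general kind (n positive points spread logarithmically around b) can be mimicked as follows. This is a stand-in sketch, not part of OPTDesign: the two-decade spread on each side of b is only our reading of the captions of Figs. 6.1 and 6.2, and the true grid accumulates near b.

```python
import numpy as np

def make_grid(n, b, decades=2.0):
    """n positive frequency points spread logarithmically around b,
    from b*10**(-decades) to b*10**decades.  A rough stand-in for the
    grid that Grid[Ngrid -> n, GridSpread -> b] produces."""
    return b * np.logspace(-decades, decades, n)

wpts = make_grid(256, 3.5)
print(wpts[0], wpts[-1])
```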

2. The requirements envelope changes very fast over a frequency region. This
   may be the result of your way of setting up the envelope. One possibility
   is to change it to make it gentler, either by smoothing it or by redefining
   it. In this case one should verify that these changes preserve the physical
   requirements for the design. For a successful computer calculation, one
   must make sure that the chosen grid has many points in those bands, as we
   just described in item 1. The smoothing in OPTDesign runs is specified
   with the option Nsmth. The input Nsmth -> 1 indicates a small amount
   of automatic smoothing; more smoothing is achieved with Nsmth -> 5 or
   Nsmth -> 10. The input Nsmth -> 0 corresponds to no automatic smoothing.
   An example of a command is
An example of a command is

OPTDesign[p, Center -> k0, Radius -> r0, Nsmth -> 5, GridPoints -> wpts];

3. Agreement (or lack thereof) between functions as data on a grid and a


rational formula. This topic is important enough to deserve a separate
section; we refer the reader to section 5.9.

5.9 Rational fits


We have seen that the function RationalModel in the package OPTDesign pro-
duces rational approximations to data representing an optimal closed-loop trans-
fer function. Rational approximation of functions given as data may be done
poorly, or even may go completely wrong for several fundamental reasons (not
related to OPTDesign).4
This is because the problem is one of highly nonlinear optimization. As a consequence,
1. The larger the degree of the denominator, the more likely a rational fit is to converge
to a false optimal fit.
2. If data is not reasonably smooth, then fitting routines may give spurious results.
3. Rational fits are very sensitive to initialization.

One may decide whether the rational function chosen to approximate data is
of good quality simply by displaying a plot of the absolute value of the difference
of the data and the function (evaluated at the frequency values specified by the
data) and studying the resulting profile.
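A quality check of this sort is easy to script. The sketch below (Python, for illustration only; all names are ours) computes the error profile and its maximum for a fit given as a callable.

```python
def fit_error_profile(wpts, data, model):
    """|data_i - model(w_i)| at each gridpoint, plus the worst error:
    a simple figure of merit for a rational fit."""
    errs = [abs(d - model(w)) for w, d in zip(wpts, data)]
    return errs, max(errs)

# Hypothetical check: data sampled from 1/(1 + jw), "fitted" exactly,
# so the worst-case error is 0.
model = lambda w: 1.0 / (1.0 + 1j * w)
wpts = [0.0, 1.0, 10.0]
data = [model(w) for w in wpts]
errs, worst = fit_error_profile(wpts, data, model)
print(worst)   # 0.0
```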
The function RationalModel uses an algorithm by L. N. Trefethen for doing
Caratheodory-Fejer approximation. To use it, type

We warn the reader again that RationalModel is not a stand-alone rational


fit program and that it must be run in the course of OPTDesign. For a stand-
alone Caratheodory-Fejer rational fit on a stable function whose values on a grid
wpts are given in a variable F, one can use

The grid should be of the type discussed in section 5.8 and further
in section 6.1.1.
The OPTDesign package also contains a more powerful general-purpose rou-
tine for approximation of data called NewtonFit. See the documentation for
NewtonFit in Appendix F.

5.10 Exercises
1. In the problem treated in this chapter, smooth the envelope a lot, say
Smoothing -> 50, and compare it to the unsmoothed envelope.
2. While fixing all other specs in section 5.2, make the gain-phase margin
as small as possible. (This is the Glover-McFarlane [GM90] approach to
design.) How much does smoothing affect the answer?
3. Same as problem (2), but vary the tracking error.

For example, the effect of this on RationalModel is that it may not work satisfactorily when
1. DegreeOfDenominator -> n is used as input with n > 4.
2. The numerical error diagnostic in OPTDesign is not small.
3. You least expect it. To be safe you must run with many different initializations.
Part II
More on Design
Chapter 6

Examples

In this chapter we present examples that illustrate the theory developed in


previous chapters, as well as the use of some techniques for solving problems in
practice.
The design process involves extensive use of computers and algorithms, so
some time spent studying their efficient use is beneficial. We discuss briefly in
section 6.1 numerical issues that arise in a wide class of algorithms for solving
optimization problems. The first control example we solve, the design of a
control for a slide drive, is presented in section 6.2.
Many practical situations lead to control design problems with requirements
on the time domain functions associated with the system. To use frequency
domain methods, one must translate time domain information to the frequency
domain. We take a first look at time domain requirements in section 6.3. The
requirements for overshoot and settling time and a method for translating in-
formation to the frequency domain are introduced in this section. The method
is applied in our solution of a design problem in section 6.4.
Finally, in the last section of this chapter we discuss design problems that
involve competing constraints. By this we mean that two or more constraints
are active at the same frequencies, and improving a design with respect to one
of them brings a degradation of performance with respect to at least one of the
other ones. In section 6.5 we discuss ways to treat competing constraints with
our methods, and a design problem is solved using two practical approaches.

6.1 Numerical practicalities


There are several reasons for the numerical difficulties that may arise when
solving OPT by computer:

• Points on the jω axis for sampling functions are badly chosen.


Fig. 6.1. An example of a 64-point grid on the ω axis produced with the OPTDe-
sign package command Grid[Ngrid -> n, GridSpread -> b], which gives
n points distributed around ±b. Practically all (positive) points lie between 0.1b
and 10b.

• The performance function Γ is not continuous. In the circular performance
  case, this occurs when the radius function or the center function is not
  continuous.

• The value of the performance Γ(ω, T(jω)) is extremely large at some fre-
  quencies ω, regardless of the value of T. In circular performance problems,
  this is the case when the radius gets very close to 0 at one or more fre-
  quencies.

• At certain frequencies ω, the value of the performance Γ(ω, T(jω)) does not
  depend on T.

We discuss these issues below. The emphasis is on circular performance func-


tions, but the comments apply, for the most part, to other performance functions
as well.

6.1.1 Sampling functions on the jω axis


To solve OPT by computer one must choose a grid of points on the jω axis.
These points are needed for sampling functions. To have an adequate discrete
representation of functions, the grid must contain many points in the frequency
bands where the functions being sampled change a lot. For circular performance
functions, the center K and the radius R determine which choices of grid points
are appropriate. Figures 6.1 and 6.2 illustrate different selections of grids with
the software package OPTDesign.1
¹The main optimization program, Anopt, works with functions defined on the unit circle
instead of the imaginary axis. The linear fractional transformation s -> (b - s)/(b + s) is
used to map from the axis to the circle and vice versa. Here b is a constant that the user may set.

Fig. 6.2. An example of a 512-point grid on the ω axis produced with the OPT-
Design package command Grid[Ngrid -> n, GridSpread -> b]. Most of
the (positive) points lie between 0.01b and 100b.

6.1.2 Discontinuous functions


Depending on the way in which requirements are put together, the performance
function may not be continuous. For example, the center function may have
values that drop from 1 to 0 at a particular frequency. This is undesirable;
most algorithms for solving OPT have numerical problems when dealing with
discontinuous performance functions. Even if the optimal designable function
T could be computed, in many cases it wouldn't be acceptable because it is not
continuous.
To circumvent this problem in the circular performance case, it is neces-
sary to modify the radius and center functions to make them continuous and
even differentiable. Doing this changes the problem being solved, but it still
should produce good approximate solutions if handled with care. The trade-off
is closeness to the original function versus good numerical properties.
To illustrate a simple way to obtain a continuous function from a discontin-
uous one, consider the function

The function k(ω) has a jump discontinuity at ω = 2. We pick a "transition
region," say ωa < ω < ωb, that contains the location of the jump and interpolate
the points (ωa, k(ωa)) and (ωb, k(ωb)) with a linear function ℓ1(ω). Then we
define

Fig. 6.3. Plot of the function k1(ω) obtained from k(ω) by using linear
interpolation with transition region 1.5 < ω < 2.5. The following Math-
ematica commands produce k1(ω): wa = 0.5; wb = 1.5; line1[w_] =
InterpolatingPolynomial[{{wa, 1}, {wb, 0}}, w]; k1[w_] = Which[Abs[w] <=
wa, 1, wa < Abs[w] <= wb, line1[Abs[w]], wb < Abs[w], 0];

Fig. 6.4. The plot of the function k2(ω) obtained from k(ω) by us-
ing high-order interpolation with transition region 1.5 < ω < 2.5 and
derivatives at the endpoints specified to be 0. The following Mathe-
matica commands produce k2(ω): wa = 0.5; wb = 1.5; poly1[w_] =
InterpolatingPolynomial[{{wa, 1, 0}, {wb, 0, 0}}, w]; k2[w_] = Which[Abs[w]
<= wa, 1, wa < Abs[w] <= wb, poly1[Abs[w]], wb < Abs[w],
0];

Other types of interpolation are possible. For example, one may do it with a
polynomial so that derivatives at the endpoints of the transition region have
certain values. See Figs. 6.3 and 6.4.
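In Python the same linear-transition construction reads as follows (an illustrative sketch; the names are ours, not the package's):

```python
def smooth_jump(k, wa, wb):
    """Continuous version of k: unchanged outside [wa, wb]; inside it,
    the linear interpolation of (wa, k(wa)) and (wb, k(wb))."""
    def k1(w):
        w = abs(w)
        if w <= wa or w >= wb:
            return k(w)
        t = (w - wa) / (wb - wa)
        return (1.0 - t) * k(wa) + t * k(wb)
    return k1

# A center function that jumps from 1 to 0 at w = 2, smoothed over [1.5, 2.5].
k = lambda w: 1.0 if abs(w) < 2.0 else 0.0
k1 = smooth_jump(k, 1.5, 2.5)
print(k1(1.5), k1(2.0), k1(2.5))   # 1.0 0.5 0.0
```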

6.1.3 Vanishing radius function


In some cases, when the radius in circular performance problems vanishes, the
mathematics for solving OPT break down. However, a vanishing radius is not
a difficulty if the problem is set up correctly. Indeed, the radius function should
vanish at frequency ω0 whenever s = jω0 is a zero of the plant (including
infinite frequency). The full justification of this statement requires the theory
of Chapter 7, so we explain here only the case where the plant P(s) is stable.
Let P(s) be a strictly proper rational function with no RHP poles. By
Theorem 2.3.1 internally stable systems S with plant P(s) have a closed-loop
transfer function of the form

for some T1 ∈ RH∞. If the system S satisfies the closed-loop roll-off inequality

then the latter can be written in terms of T1 as

That is, we obtain a problem where the variable is T1 and the radius function
is equal to α for all large ω, which is acceptable. Computer programs such as
OPTDesign do this step automatically.

6.1.4 Performance function incorrectly defined


The performance function defined by the user should depend on the closed-loop
transfer function T at each frequency ω, including ω = ∞.
For example, consider a strictly proper plant P(s) (so that for internally
stable systems S, the function T rolls off at infinite frequency), and suppose
that we wish to use the magnitude of the sensitivity as our only performance
criterion, thus leaving out a requirement on the roll-off. We may write the
performance function as

Since T rolls off, the performance is equal to 1 at infinite frequency, regardless
of the choice of T. Thus this performance function is not adequate for H∞
optimization. For details on this see section 10.4 in Part 3 of this book.

6.2 Design example 1


The system is a slide drive, driven by a DC motor (see Fig. 6.5). This was
studied by Brett Larson in his master's thesis [La89], written under the guidance
of Prof. Fred Bailey. The following description appears on page 3 of the thesis:
An electrical voltage u(t) drives a DC motor with output speed ω1(t).
The output of the motor goes through the gear train which steps the
speed down to ω2(t). The gear train is connected to a table which
slides along a rail with the load on top.
The fundamental goal in this problem is to control the speed of the table very
precisely.

6.2.1 Electro-mechanical and electrical models


Ignoring the dynamics in the gear train, the problem can be represented by the
equivalent circuit in Fig. 6.6.

Fig. 6.5. Slide drive apparatus.

Fig. 6.6. Electro-mechanical model.

The following equations correspond to the DC motor.

The parameters for this problem are



Figure 6.7 represents this problem with the mechanical parts replaced by their
electrical equivalents, where

Fig. 6.7. Electrical model.

6.2.2 Mathematical model

Choosing the state vector as

we have that the system of state equations for the system is

Take ω1 as the output, and scale the frequency by 100 to obtain the following
coefficient matrices, where a = 1/JL.

The parameter a varies between 1 and 10. The maximum load corresponds to
a = 1. The plant transfer function from input u to output uj\ is computed from
the formula

We obtain

The parameter a accounts for the variable load in this example. Here a varies
between a = 10, which is the minimum-load case, and a = 1, which is the
maximum-load case. The nominal case corresponds to a = 5. The nominal
plant is

The nominal plant is stable (so are all the other plants Pa for all a ∈ [1, 10]).
The magnitude of the nominal plant is shown in Fig. 6.8.

Fig. 6.8. Magnitude of the nominal plant P(s) = P5(s).



6.2.3 Statement of the problem


The set of requirements given here is taken from [La89].

According to Larson, these requirements should be met over the full range of load
perturbations. We will instead do the (much easier) problem of solving Design
for the nominal plant only. The problem for loads other than the nominal will
not be discussed here.

6.2.4 Reformulation of requirements


Now we rewrite the requirements using the notation of Chapter 3. Observe
that the phase-margin constraint needs to be stated as an inequality in terms
of the closed-loop function T. Also, note that there is no closed-loop roll-off
constraint, so one must be supplied.
Setting a gain-phase margin constraint. To treat the phase margin
requirement (6.11c), we rewrite it as a gain-phase margin constraint with m =
0.5:

It is clear that the gain-phase margin constraint is the only one that is binding
at midrange frequencies 1 < ω < 5. Also note that m = 0.5 ensures a gain
margin of 6 dB (see Fig. 6.9).
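One standard way to see the 6 dB figure (stated here as an assumption about how the gain-phase margin constraint of Chapter 3 acts): if the constraint keeps the open-loop Nyquist locus at distance at least m from the critical point -1, the loop gain can grow by a factor of at least 1/(1 - m) before reaching -1. The sketch below just evaluates that bound.

```python
import math

def guaranteed_gain_margin_db(m):
    """Gain margin (in dB) implied by keeping the Nyquist locus at
    distance >= m from -1.  This is a standard bound, assumed here to
    be the mechanism behind the 6 dB statement in the text."""
    return 20.0 * math.log10(1.0 / (1.0 - m))

print(round(guaranteed_gain_margin_db(0.5), 2))   # 6.02
```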
Setting a closed-loop roll-off constraint. For many problems with
proper compensator one can use the plant P(s) to provide the profile of the
envelope, by setting in

In our case we set ωr = 10.0. A plot of the magnitude of the plant (Fig. 6.8)
shows that it has a peak at a frequency close to our choice of ωr, so near this
Fig. 6.9. Derivation of m in design example 1.



frequency |P(jω)| changes quickly. In light of this we do not use the plant in the
closed-loop roll-off constraint. Instead, we shall state constraint (6.12) in terms
of a function whose magnitude has a simple behavior at frequencies ω > ωr,
and such that its roll-off corresponds to that of a rational function with relative
degree 2. We choose

The complete set of performance requirements can be stated now:

With respect to internal stability we merely note that there are no RHP
poles or zeros of the plant P(s). Hence the internal stability of the system with
plant P(s) (and proper compensator) is guaranteed by simply requiring that
the closed-loop transfer function T(s) have relative degree 2.
A circular performance function

can be formed from the performance requirements by setting

and

The center K(ω) and radius R(ω) have jump discontinuities (see Figs. 6.10 and
6.11).
The next thing to do is to remove jump discontinuities from the requirements
envelope. To do it, we define continuous functions K1(ω) and R1(ω) that
approximate K(ω) and R(ω) except near jump discontinuities, where K1(ω)
and R1(ω) are defined as linear functions. See Figs. 6.12-6.15.

6.2.5 Optimization
We now have all the elements necessary to state and solve the design problem
by optimization of the performance. A run of the program OPTDesign with

Fig. 6.10. 2-D plot of the requirements envelope determined by the center func-
tion K(ω) and radius function R(ω).

Fig. 6.11. 3-D plot of the requirements envelope determined by the center func-
tion K(ω) and radius function R(ω).

Fig. 6.12. The radius functions R1(ω) (thin) and R(ω) (thick) on three different
frequency ranges.

Fig. 6.13. The center functions K1(ω) (thin) and K(ω) (thick).

Fig. 6.14. 2-D plot of the requirements envelope corresponding to the center
function K1(ω) and radius function R1(ω).

Fig. 6.15. 3-D plot of the requirements envelope corresponding to the center
function K1(ω) and radius function R1(ω).

256 gridpoints and with heavy smoothing yields γ* = 0.618, so there exist
solutions to the design problem with the modified envelope (with smoothing).
The plots of the optimal closed-loop transfer function T* and the corresponding
sensitivity function S = 1 - T* indicate that the sensitivity has magnitude -5
dB for frequency ω near 1, while in the original performance requirements it was
specified that this magnitude had to be less than -10 dB. See Figs. 6.16-6.18.
Thus we must take further action to solve the original design problem.

Fig. 6.16. Bode magnitude plot of T*.

6.2.6 Second modification of the envelope and optimization

Note that the function T* obtained in the previous run has satisfactory charac-
teristics at frequencies other than ω near 1. We now modify the requirements
envelope used in the previous run in a way that amounts to tightening the con-
straints in a small frequency band around the frequency ω = 1. See Figs. 6.19
and 6.20.

Fig. 6.17. Bode phase plot of T*.



Fig. 6.18. Bode magnitude plot of the sensitivity function S* = 1 - T*.

Fig. 6.19. The center function K2(ω) values drop in magnitude to the right of
ω = 1.3. Here K2(ω) is shown as the thin line and K(ω) is shown as the thick
line.

Fig. 6.20. The radius function R2(ω) values are smaller than 0.25 for 0 < ω <
1.3. Here R2(ω) is shown as the thin line and R(ω) is shown as the thick line.

The run of OPTDesign with the new envelope produces an acceptable closed-
loop transfer function. A degree 6 rational approximation to the new opti-
mal closed-loop transfer function is readily obtained with the function
RationalModel. We have

The compensator corresponding to T2(s) is

See Figs. 6.21-6.25.

Fig. 6.21. Bode magnitude plot of T2(s).

Fig. 6.22. Bode phase plot of T2(s).



Fig. 6.23. Bode magnitude plot of the sensitivity function S2(s) = 1 - T2(s).

Fig. 6.24. Plot of zeros and poles of the closed-loop function T2(s).

Conclusion. A numerical solution to the original design problem with


nominal plant has been found. Plots of the magnitude and phase of the (discrete)
optimal compensator are shown in Figs. 6.26 and 6.27.

6.3 Time domain performance requirements


Time domain requirements on the closed-loop system S must be translated to
the frequency domain in order to solve design problems with the methods used
in this book. In this section, two time domain requirements are introduced,
and a technique for treating them is given. The technique is applied later in a
second design problem (section 6.4).

Fig. 6.25. Plot of zeros and poles of the compensator C2(s).

Fig. 6.26. Magnitude of the optimal compensator.



Fig. 6.27. Phase of the optimal compensator.

6.3.1 Two common time domain requirements


Two common time domain requirements used for system design are the over-
shoot requirement and the settling time requirement. Both are defined in terms
of the step response curve c(t), which is the response of the system in the time
domain to an input that is a unit step (see Fig. 6.28). The step response function
has T(s)/s as Laplace transform, i.e.,

The requirements we present in this section are as follows.


Overshoot requirement Given

Settling time requirement Given

6.3.2 A naive method


A way to handle time domain requirements in the frequency domain is to use
a low-order rational function Tref as a reference. The idea is to choose, before
the design process begins, a function Tref that is the transfer function of a well-
understood system that meets the time domain requirements. Then frequency
domain requirements are established by using plots of the function Tref. The
hope is that systems that satisfy these frequency domain requirements will also
satisfy the original time domain requirements.
A choice of closed-loop reference function that is usually mentioned in control
books is

Fig. 6.28. A step response function that satisfies the overshoot and settling time
requirements when M = 0.1, ts = 15.0, and Δ = 0.02. In this figure, the step
response curve cannot run into the shaded area without violating the constraints.

This system is known to satisfy the relations (see [FPE86])

and

Hence Tref(s) is completely determined by specifying ts and Mp. Plots of the
functions Tref and 1 - Tref are then used to determine frequency domain parameters
(bandwidth, tracking, etc.).
The advantage of (6.22) is that because of its low order, it is very easy to
generate parameters ωn and ζ to suit one's needs. However, the strength of
(6.22) is also its weakness. This is a consequence of the initial value and final
value theorems [C44], [LP61], which (very roughly) have the following rule of
thumb as a corollary.

TIME DOMAIN vs. FREQUENCY DOMAIN RULE. For internally


stable systems with type n plant (n >= 0), high-frequency behavior of
the closed-loop transfer function determines the behavior of the step
response function near time zero, and very low frequency behavior
of the closed-loop transfer function determines the behavior of the
step response function for large times. The same is true for ramp
response when n > 1 and parabola response when n > 2.

It is a fact that functions Tref(s) given by (6.22) come from systems with
type n = 1 plant (i.e., it has a simple pole at s = 0). Hence if settling time
and even overshoot are important in a design problem, then the function Tref
given in (6.22) may give useful information only if the plant of the system to be
designed is of type n = 1.
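In their standard form, the second-order relations alluded to above are Mp = exp(-πζ/sqrt(1 - ζ²)) and ts ≈ 4.6/(ζωn) (a 1% settling criterion); we are assuming these are the [FPE86] relations the text refers to. Under that assumption, inverting them gives ζ and ωn directly from the specs:

```python
import math

def second_order_params(Mp, ts):
    """zeta and wn for the reference model (6.22) from overshoot Mp and
    settling time ts, assuming the standard relations
    Mp = exp(-pi*zeta/sqrt(1 - zeta^2)) and ts = 4.6/(zeta*wn)."""
    L = -math.log(Mp)
    zeta = L / math.sqrt(math.pi ** 2 + L ** 2)
    wn = 4.6 / (zeta * ts)
    return zeta, wn

zeta, wn = second_order_params(0.10, 15.0)
print(round(zeta, 3), round(wn, 3))   # 0.591 0.519
```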

6.3.3 A refinement of the naive method


We now describe a way of treating time domain requirements in design problems
where the plant of the system is of type n.
Suppose that we have a type n plant and that we wish to solve a design
problem so that in particular requirements on the step response and overshoot
are satisfied.
The internal stability requirement forces the following interpolation con-
straints on the closed-loop function T(s) (see section 3.4.6):

It is easy to see that if T(s) is given by the formula below,2 then T(s) has
relative degree 1 and satisfies (6.25).

where T1(s) is any bounded and stable function. We will pick Tref(s) so that it
satisfies equation (6.26), but to make the task of choosing Tref manageable, we
will consider only those T given by

where b > 0. The real parameters a, b, and c are chosen by trial and error so
that
that

satisfies overshoot and settling time requirements.


The formula for a family of reference functions given by (6.27) may be altered
to include other time domain considerations. For example, if one would like to
require that the step response have very little change for an initial time interval,
then one is forced to consider closed-loop functions with a relative degree that
is greater than or equal to that of the plant. Also, the magnitude of Tref(s) for
large frequencies is important. To obtain formulas for such families of functions
Tref (s) the reader may use the theory of interpolation developed in Chapter 7.

6.4 Design example 2


The following design problem is taken from [FPE86], p. 497. Design is treated
here with the naive method for converting time domain requirements described
in section 6.3.2.
²The reader interested in a derivation of this and similar formulas should refer to Chapter 7.

6.4.1 Statement of the problem


The problem is to design a satellite's attitude control. The plant is the rational
function

The requirements on the step response function c(t) are overshoot

and settling time

We need to find a closed-loop system S that is internally stable and meets the
performance requirements.

6.4.2 Translation of time domain requirements


The plant function p(s) has an order 2 pole at s = 0; that is, the plant is of
type n = 2. Since the behavior of the system at initial times is not an issue, to
choose a reference closed-loop function we may use formula (6.27).

Fig. 6.29. Step response for a system with closed-loop function Tref.

Although there are many choices of a, b, and c, the value c = 0 gives rea-
sonable step response (Fig. 6.29). Hence we set

Due to the pole of the plant at s = 0, the plant bound constraint is binding
at low frequencies. A plot of the plant bound function for the closed-loop
function Tref(s) is shown in Fig. 6.30, while the magnitudes of Tref(s) and of
the sensitivity function S(s) = 1 - Tref(s) are depicted in Fig. 6.31. Figures
6.30 and 6.31 suggest a

Fig. 6.30. Plot of the plant bound function for the system when the closed-loop
function is Tref.

center function k(ω) that is equal to 1 at low frequencies (up to ωp = 0.7, say)
and then drops to be close to 0 after ωb = 2.0. We choose k to be piecewise
linear, so that if ℓ(ω) denotes a line interpolating (0.7, 1) and (2.0, 0), then

The radius function has the following characteristics. At low frequencies

Fig. 6.31. Magnitudes of Tref(s) and S(s) = 1 - Tref(s).



Fig. 6.32. 2-D plot of the requirements envelope.

Fig. 6.33. 3-D plot of the requirements envelope.

(ω < ωp), the radius equals αp/|p(jω)|, where αp = 0.9 is taken from Fig. 6.30.
At frequency ωb = 2, Fig. 6.31 suggests a radius equal to 0.75. Finally, we pick
the roll-off frequency to be ωr = 10.0. Figure 6.31 suggests a radius function
with value 0.25 at the roll-off frequency ω = ωr = 10.0, so we set the radius to
linearly interpolate these values. For higher frequencies, the radius is set to a
multiple of the magnitude of the plant, |p(jω)|. Let ℓ1(ω) (resp., ℓ2(ω)) denote
the linear interpolant between frequencies ω = 0.7 and ω = 2.0 (resp., ω = 2.0
and ω = 10.0). Then r(ω) is given by

See Figs. 6.32 and 6.33. Computer code for this example is given in Appendix
C and in the file appendixch6.nb.
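For illustration, the piecewise-linear center function just described (1 up to ωp = 0.7, the line through (0.7, 1) and (2.0, 0), then 0) can be written in Python as follows; the book's actual code, in Mathematica, is in Appendix C.

```python
import numpy as np

wp, wb = 0.7, 2.0   # breakpoints taken from the text

def center(w):
    """Piecewise-linear center: 1 below wp, 0 above wb, and the line
    through (wp, 1) and (wb, 0) in between (np.interp clamps outside
    the breakpoint range)."""
    return float(np.interp(abs(w), [wp, wb], [1.0, 0.0]))

print(center(0.0), center(1.35), center(5.0))
```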
A computer run with OPTDesign produces an optimal value γ* = 1.46, so
there is no solution to the problem with the modified envelope requirements.

Fig. 6.34. Bode magnitude plot of T*.

Fig. 6.35. Bode phase plot of T*.

OPTDesign produces an optimal function T* anyway, that is, a function that


comes closest to meeting all the constraints. See Figs. 6.34 and 6.35 for the
Bode plots of T*. The function T* may be useful for design purposes, since
the correspondence between frequency domain and time domain requirements
is not explicitly known and something was lost in the (very rough) translation
we performed.
A rational approximation T1(s) of the (discrete) function T* is readily ob-
tained with the tools of the package OPTDesign. We get

The step response function step(t) = L⁻¹(T1(s)/s) is shown in Figs. 6.36 and
6.37. The compensator C1 that corresponds to T1(s) has high order. Its zeros

Fig. 6.36. Step response function for closed-loop T1.

Fig. 6.37. Another view of the step response function for closed-loop T1.

and poles are shown in Figs. 6.38 and 6.39.

The zeros and poles of C1(s) are

See Figs. 6.38 and 6.39.


Order reduction techniques may be applied to obtain a compensator with
lower order and satisfactory characteristics. Here we shall try something very
simple that yields a lower-order compensator that may be acceptable. By in-
spection, we locate pole-zero pairs of C1(s) that are close and proceed to cancel
them. Of course, this produces changes in the frequency response of C1(s), but
the hope is that these changes are not too large (we are partially justified in
expecting this because canceled pairs are "close" to a perfect cancellation). By
proceeding with cancellation as explained above, we obtain the compensator
C2(s) given by

Fig. 6.38. Plot of zeros and poles of C1(s).

Fig. 6.39. A view of the zeros and poles of C1(s) close to the origin.

The zeros and poles of C2(s) are shown in Fig. 6.40, while a plot of the magni-
tudes of C1(s) and C2(s) is given in Fig. 6.41.

Fig. 6.40. Zeros and poles of C2(s).

Fig. 6.41. Magnitudes of C1(s) and C2(s).

The closed-loop function T2(s) that comes from the compensator C2(s) is
given by

Finally, the step response that corresponds to the choice of C2(s) as compensator
is shown in Fig. 6.42.

Fig. 6.42. Step response corresponding to the closed-loop function T2(s).

Conclusion. We obtained an order 6 rational compensator so that the


corresponding closed-loop system has settling time of 9s and overshoot of 15%.
We also obtained an order 3 compensator with overshoot of 20% and settling
time of 10s. It is conceivable that the overshoot may be lowered, as there is
some room for improvement (by letting the settling time increase).

6.5 Performance for competing constraints


Until now we have dealt with design problems of the circular type; that is,
the performance envelope has the shape of a circle at each frequency point.
However, in practical design it is common to encounter problems with more
than one valid performance requirement on a single frequency band. We say
that such design problems have competing constraints.
To discuss the issues involved in solving design problems with competing
constraints we will focus on the reference problem (Prob1), stated below. While
(Prob1) is not completely general, it is all we need to understand in order to be
able to attack more difficult problems.

(Prob1) Given a plant P(s) and nonnegative functions
W1(ω) and W2(ω), find T ∈ RH∞ so that S is
internally stable and

W1(ω) |1 − T(jω)| ≤ 1   and   W2(ω) |T(jω)| ≤ 1   for all ω.

For simplicity and physical realism we shall assume that W1(ω) is near zero
or is zero at very high frequencies, that W2(ω) is near zero or is zero at very low
frequencies, and that W2(ω) has magnitude comparable to |P(jω)|^(-1) for large
frequencies. In particular, we do not impose a relative degree requirement on
the compensator.
The next two subsections describe practical approaches to solving (Probi).

6.5.1 Rounding corners of performance functions


If the frequency ω is a fixed real number, then the set S_ω of all possible complex
numbers T that satisfy both inequalities in (Prob1) is a subset of the complex
plane. The set S_ω (shown in Fig. 6.43) can be described in symbols as

S_ω = { z ∈ C : W1(ω) |1 − z| ≤ 1 and W2(ω) |z| ≤ 1 }.

Fig. 6.43. Sublevel sets S_ω arising from two circular performance requirements:
|1 − T| < 0.6 and |T| < 0.8. The intersection of the two sets has corners in its
boundary.

Note that S_ω is the intersection of two disks, so it typically has "corners"
in its boundary. As a consequence of this, the requirements envelope cannot
be expressed in terms of center and radius functions like the examples we have
encountered so far in this book.
Performance functions with corners in the boundary of the level sets are
not differentiable and are difficult to treat numerically. In this section we shall
describe two approaches to dealing with (Prob1) numerically, but first we restate
(Prob1) as

(Prob2) Given a plant P(s), nonnegative functions W1(ω)
and W2(ω), and the performance function

F(ω, z) = max { W1(ω) |1 − z|, W2(ω) |z| },     (6.42)

find T that minimizes sup_ω F(ω, T(jω))
over all T that make S internally stable.

Mathematically speaking, (Prob2) is what one must solve in order to solve
(Prob1). However, (Prob2) is difficult numerically. One practical approach to

solving (Prob2) is to compromise by replacing the performance function F in
(6.42) by a (numerically) better-behaved function,

F_p(ω, z) = (W1(ω) |1 − z|)^(2p) + (W2(ω) |z|)^(2p),

where p is an integer such as 2, 4, or 8. Pictorially, doing this corresponds to
"rounding the corners of the sublevel sets" to obtain a more gentle performance
function. For example, p = 2 rounds the corners of the sublevel sets of the
performance function (6.42) tremendously,³ while p = 8 does less rounding but
is more demanding numerically. See Fig. 6.44.
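The effect of the power p can be seen with a small numerical illustration of our own. It assumes the rounded performance has the form F_p = a^(2p) + b^(2p), with a = W1 |1 − z| and b = W2 |z|; taking the 2p-th root (the root that the footnote says was removed) shows the rounded performance approaching the true maximum of the two constraints as p grows.

```python
W1, W2 = 1.0, 1.0                        # constant weights, for illustration
z = 0.3 + 0.4j                           # a trial closed-loop value T(jw)
a, b = W1 * abs(1.0 - z), W2 * abs(z)    # the two competing requirements

exact = max(a, b)
gaps = []
for p in (2, 4, 8):
    Fp = a ** (2 * p) + b ** (2 * p)
    smooth = Fp ** (1.0 / (2 * p))       # restore the removed root to compare
    gaps.append(smooth - exact)
# the gap to the true max shrinks as p increases: less rounding
```

Larger p tracks the max more faithfully, at the price of steeper, more ill-conditioned powers in the optimization.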

Fig. 6.44. Plot of level sets S of F_p for several values of p.

Example. Consider W1 = |P(jω)| and W2 = |P(jω)|^(-1) in
(Prob1). Optimization with the performance function F_p, for p = 2, 4, 8, initial
guess T⁰(s) = 1, and 32 gridpoints yields (after model reduction) an optimal
function T_p(s), where

Table 6.1 gives information about other output, and Figs. 6.45-6.47 display
plots of relevant functions.
Conclusion. Solutions T_p for (Prob1) were obtained using performance
functions F_p, for p = 2, 4, 8. The function T_8(s) gives the best overall weighted
sensitivity and weighted magnitude (in more difficult examples its calculation

³A square root in the formula of the function F_p has been removed, since for optimization
purposes such a power affects the optimal value of the performance, but the function that
optimizes the performance does not change.

Table 6.1. Computer output: Rounding corners method.

Fig. 6.45. Plot of

Fig. 6.46. Plot of and

Fig. 6.47. Plot of and



could be complicated by the appearance of numerical noise in the computations).
Of the three computed answers, T_2(s) is the one that has the least satisfactory
overall weighted sensitivity and weighted magnitude, but its calculation entailed
few numerical difficulties. Finally, T_4(s) is an intermediate case. Computer
code for this example is given in Appendix C and in the file appendixch6.nb.

6.5.2 Constrained optimization with a barrier method


Our second approach to solving (Prob1) is to restate it as a constrained
optimization problem (Prob3), and then solve it using a barrier method that we shall
describe in this subsection.
We begin by recalling that (Prob1) has two performance requirements: a
weighted sensitivity constraint and a closed-loop magnitude constraint. In
section 6.5.1 we discussed how to solve (Prob1) by optimizing a performance
function defined in terms of both constraints. Now we state a problem of optimizing
a performance function defined in terms of the weighted sensitivity function,
over closed-loop functions T that satisfy the closed-loop magnitude constraint.
This problem is equivalent to (Prob1).

(Prob3) Given a plant P(s) and nonnegative functions
W1(ω) and W2(ω), minimize

sup_ω W1(ω) |1 − T(jω)|

subject to

W2(ω) |T(jω)| ≤ 1 for all ω,
T internally stabilizes S.

Our practical approach to solving (Prob3) is as follows. Let ε > 0 be a fixed
number, and define

F_ε(ω, z) = (W1(ω) |1 − z|)² − ε log( 1 − (W2(ω) |z|)² ).

Note that F_ε is defined only for those z that satisfy

(W2(ω) |z|)² < 1.     (6.45)

Such z are called feasible. When (W2(ω) |z|)² gets near 1 the value of F_ε becomes
quite large. Hence the logarithm in F_ε heavily punishes z for being close to
violating the inequality (6.45). The logarithmic term in F_ε is called the
barrier function. Also, by choosing a suitable ε one can manipulate the
contribution of the barrier function to the value of F_ε.
The algorithm for solving constrained optimization problems with barrier
functions is presented now.

Optimization with the barrier method. Given ε > 0 and a feasible
function T⁰:

b1. Use T⁰ as initial guess to find T* that minimizes sup_ω F_ε(ω, T(jω)) over
all T that make S internally stable.
b2. Decrease ε, and update T⁰ ← T*.
b3. Stop if T⁰ satisfies a preset tolerance criterion; else repeat (b1)-(b3).
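The loop (b1)-(b3) can be illustrated on a scalar toy problem of our own: minimizing (x − 2)² over the interval |x| < 1, instead of over closed-loop functions T. A crude grid search stands in for the H∞ optimization of step (b1), and halving ε stands in for step (b2); the exact schedule for shrinking ε is our choice here, not the book's.

```python
import math

def barrier_min(eps, n=200001):
    """Step (b1), toy version: grid search for the minimizer of
    (x - 2)^2 - eps * log(1 - x^2) on the strict interior of (-1, 1)."""
    best_x, best_v = 0.0, float("inf")
    for i in range(1, n):
        x = -1.0 + 2.0 * i / n          # strictly inside (-1, 1)
        v = (x - 2.0) ** 2 - eps * math.log(1.0 - x * x)
        if v < best_v:
            best_x, best_v = x, v
    return best_x

eps, xs = 1.0, []
for _ in range(4):                       # steps (b2)-(b3): shrink eps, re-solve
    xs.append(barrier_min(eps))
    eps /= 2.0
# the minimizers increase toward the constrained optimum x = 1,
# always staying feasible, as the barrier's influence is reduced
```

This mirrors the behavior reported in the example below: shrinking ε lets the solution push closer to the constraint boundary while improving the main objective.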
Example. Consider a plant P(s), with W1 = |P(jω)| and W2 = |P(jω)|^(-1), in
(Prob3). A computer run with 32 gridpoints and the barrier method initialized
with ε = 1 and T⁰(s) = 0.1 + ··· produces

and the results shown in Table 6.2. Also see Fig. 6.48.

Table 6.2. Computer run with the barrier method.

Fig. 6.48. Plot for the optimal T obtained with the barrier method.

Conclusion. The barrier method with 32 gridpoints was used to solve
(Prob1). A solution is already obtained with ε = 1. The weighted sensitivity
attains a maximum value of 0.59, and the weighted magnitude of the closed-loop
function attains a maximum value of 0.64.
For comparison another solution was obtained using ε = 0.1. In this case the
weighted sensitivity attains a maximum value of 0.46 and the weighted
magnitude of the closed-loop attains a maximum value of 0.96. Thus reducing ε leads
to higher weighted magnitude of the closed-loop function (but still acceptable),
and leads to lower magnitude of the weighted sensitivity. A computer code
template for doing runs similar to this one is given in Appendix C and in the file
appendixch6.nb.
Chapter 7

Internal Stability II

Internal stability was introduced in Chapter 2. This chapter continues the dis-
cussion to obtain theorems that precisely characterize internally stable systems,
either in terms of zeros and poles of the plant or in terms of a formula for
the closed-loop transfer function. Section 7.1 develops the mathematical tools
for interpolation with rational functions. In section 7.2 a characterization of
internally stable systems is given in terms of interpolation conditions on the
closed-loop transfer function, when the plant has simple RHP zeros and poles.
The case of higher multiplicity is treated in section 7.3.

7.1 Calculating interpolants


Systems S whose plant has RHP poles or RHP zeros have closed-loop transfer
functions that can be described by a formula. We shall see that this formula
depends on those RHP poles and RHP zeros of the plant and nothing else. To
derive the formula and related results, we concentrate first on the case where
all of these zeros and poles have multiplicity 1 (called simple poles and zeros).
Suppose that the following sets of complex numbers are given:

{s1, ..., sn} ⊂ RHP   (data points)     (7.1)

and

{v1, ..., vn}   (data values).     (7.2)

For T(s) a rational function, consider the set of equalities

INT:  T(si) = vi,  i = 1, ..., n,

called an interpolation condition on T. If T ∈ RH∞ satisfies INT, then T is
called an interpolant (for the data given in (7.1) and (7.2)).
The main objective of this section is to present ways to generate interpolants.
More precisely, given a set of equalities INT we show how to

Produce one interpolant.

Produce a parameterization (formula) of all interpolants.


First, we present an elementary way to produce one interpolant (section 7.1.1)
and to parameterize all interpolants (section 7.1.2). This is then generalized to
problems with an interpolation condition at infinity (section 7.1.3).
Interpolation with polynomials, rather than rational functions, is a standard
subject in numerical analysis (see [DC80]). The treatment here for rational
functions is conceptually the same, and proofs are similar.

7.1.1 Calculating one interpolant


Let INT and a real number a < 0 be given. We will show how to produce an
interpolant T in RH∞ such that all its poles are located at s = a.
Let T be a proper rational function of s with a single pole location at s = a.
One can think of T as being a sum of rational functions whose denominators
are of the form (s − a)^k, with k = 0, ..., n − 1. Since T is proper, for each factor
(s − s0) that appears in the numerator there is a corresponding factor (s − a)
in the denominator. Thus T must have the form

T(s) = c0 + c1 (s − s1)/(s − a) + c2 (s − s1)(s − s2)/(s − a)² + ··· + c_{n−1} (s − s1)···(s − s_{n−1})/(s − a)^{n−1}     (7.3)

for some constants c0, c1, ..., c_{n−1}. The right-hand side of (7.3) is called
Newton's representation for the function T(s).
Now we seek T ∈ RH∞ with the form (7.3) that satisfies INT. Set s = s1,
s = s2, ..., s = sn in (7.3) and combine with INT to obtain the system of
equations

v1 = c0,
v2 = c0 + c1 (s2 − s1)/(s2 − a),
⋮
vn = c0 + c1 (sn − s1)/(sn − a) + ··· + c_{n−1} (sn − s1)···(sn − s_{n−1})/(sn − a)^{n−1}.     (7.4)

This system of equations in the unknowns ck has a unique solution. It can be
solved easily by back substitution.
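The back-substitution step can be sketched in code. This is our own illustration, assuming the Newton representation (7.3) with denominators (s − a)^k; the data points and values used in the demonstration are hypothetical, not the book's example.

```python
def newton_eval(c, pts, a, s):
    """Evaluate T(s) = sum_k c[k] * prod_{i<k} (s - pts[i])/(s - a)."""
    total, basis = 0.0, 1.0
    for k, ck in enumerate(c):
        total += ck * basis
        basis *= (s - pts[k]) / (s - a)
    return total

def newton_coeffs(pts, vals, a):
    """Solve the triangular system (7.4) for c0, ..., c_{n-1}."""
    c = []
    for i, (s, v) in enumerate(zip(pts, vals)):
        partial = newton_eval(c, pts, a, s)     # uses c0, ..., c_{i-1}
        basis = 1.0
        for k in range(i):                      # coefficient of c_i at s
            basis *= (s - pts[k]) / (s - a)
        c.append((v - partial) / basis)
    return c

# hypothetical data: T(1) = 1, T(2) = 0.5, all poles at a = -1
pts, vals = [1.0, 2.0], [1.0, 0.5]
c = newton_coeffs(pts, vals, -1.0)
```

Since each ratio (s − pts[k])/(s − a) tends to 1 as |s| grows, the resulting interpolant is proper with all its poles at s = a, as required.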

Example. Find T ∈ RH∞ with a = −1 as its only pole location, such that

Solution. Note that n = 2 (the number of points in INT), so Newton's
representation for an interpolant is

T(s) = c0 + c1 (s − s1)/(s + 1).     (7.6)

Combining (7.5) and (7.6), we see that (7.3) becomes

and we obtain c0 = 1 and c1 = −6.

A special situation occurs when the interpolation values v1, ..., vn are all
zero. In this case, system (7.4) gives c0 = c1 = ··· = c_{n−1} = 0, so the function T
given by the Newton's representation method is the zero function. A nontrivial
interpolant for this case is easily obtained by setting

T0(s) = γ (s − s1)(s − s2)···(s − sn)/(s − a)^n,     (7.8)

where γ is any nonzero real constant. Observe that the function T0 in (7.8) has
no zeros in the RHP other than s1, ..., sn.

7.1.2 Parameterization of all interpolants


Given a set INT of interpolation conditions, by INT0 we denote the set of
homogeneous interpolation conditions

INT0:  T(si) = 0,  i = 1, ..., n,

obtained from INT by setting the right-hand sides to 0. The set of conditions
INT0 plays a fundamental role in parameterizing all functions T ∈ RH∞ that
satisfy INT.
THEOREM 7.1.1. Let a set INT of interpolation conditions be given. Also
given are functions T1, T0 in RH∞ such that T1 satisfies INT, T0 satisfies INT0,
s1, ..., sn are the only zeros of T0, and T0 has relative degree 0. Then every T
in RH∞ that satisfies INT has the form

T(s) = T1(s) + T0(s) F(s)     (7.9)

for some F in RH∞. Conversely, if a function T is given by the formula (7.9),
then T belongs to RH∞ and satisfies INT.
Proof. Let T be a function in RH∞ that satisfies INT. Then the difference
T(s) − T1(s) has zeros at the locations s = s1, ..., s = sn. Therefore the rational
function F(s) = (T(s) − T1(s))/T0(s) has no poles in the closed RHP and is
proper since T0 has relative degree 0. This says that F(s) belongs to RH∞,
and we have proved the first part of the theorem. To prove the second part,
suppose that T in RH∞ satisfies (7.9). By direct substitution of the values
s = s1, ..., sn in the formula (7.9) we see that T satisfies INT.
It follows from Theorem 7.1.1 that it is enough to find particular functions
T0 and T1, as in formula (7.9), to determine all RH∞ functions that satisfy
INT.

Example. We will find a formula for all functions T ∈ RH∞ that satisfy

We can choose any negative number as the location of the pole of the rational
functions T0 and T1 in formula (7.9). We select a = −2; set

and

By taking s = 1 and s = 3 in (7.12) and combining with (7.10), we obtain c0 = 2
and c1 = 5. We know from Theorem 7.1.1 that all functions T ∈ RH∞ that
satisfy (7.10) are of the form

where

7.1.3 Interpolation with a relative degree condition


If a rational function T is strictly proper, then as |s| gets large, |T(s)| approaches
0. One can think of T as having a zero at s = ∞. The order we associate with
this "zero" is the same number as the relative degree of T, d(T) (the difference
of degrees between denominator and numerator). One way to incorporate this
special zero in interpolation problems is to specify the relative degree of the
interpolant together with INT.
Given INT and an integer r > 0, we seek T in RH∞ with relative degree
d(T) = r that satisfies INT. To produce one interpolant, modify the
representation (7.3) to include the requirement on the relative degree of T:

To find c0, c1, ..., c_{n−1}, proceed as before by forming a triangular system of
equations and solving it.
A parameterization of all interpolants that satisfy both INT and the relative
degree condition d(T) = r is obtained after a small modification to Theorem
7.1.1. Details of the proof of Theorem 7.1.2 are left to the reader.
THEOREM 7.1.2. Assume the hypotheses of Theorem 7.1.1, with the
additional assumptions that d(T1) ≥ r and d(T0) = r. Then a rational function T
is a function in RH∞ that has relative degree r and satisfies INT if and only if
there exists a rational F ∈ RH∞ such that

T(s) = T1(s) + T0(s) F(s).

7.2 Plants with simple RHP zeros and poles


Internal stability can be expressed as the condition that certain functions
associated with S are bounded and have no poles in the closed RHP. This fact is
presented below as a lemma needed for the main theorems of this section, but
it is interesting in itself. In fact, it can be used as an alternative definition of
internal stability that generalizes to MIMO systems (see [BB91], [DFT92], and
[F87]).
LEMMA 7.2.1. The closed-loop system of Fig. 2.1 is internally stable if and
only if

T,   S = 1 − T,   Q = C S,   and   S P

belong to RH∞.     (7.16)

Proof. Key relations are S = 1 − T and T = QP = PQ. These imply that
T is in RH∞ if and only if S and PQ = QP are in RH∞. If the system is
internally stable, we only have to check that Q = CS and SP belong to RH∞.
Note that

at the poles s_p of C. Thus Q = CS is in RH∞. Similar reasoning implies that
SP is in RH∞. This proves half of the lemma. Now suppose that the relations in
(7.16) hold. If there is a pole-zero cancellation in PC at a point s = s0 in the
RHP, then either Q = C/(1 + PC) or SP = P/(1 + PC) has a pole at s = s0,
both of which contradict the hypothesis.
To illustrate the use of this lemma, we verify (7.16) for the system S1 with
plant P(s) = 1/(s − 1) and compensator C1(s) = (s − 1)/(s + 1). In section 2.3
we saw that T1(s) = 1/(s + 2), which belongs to RH∞. We also have

S P = P/(1 + P C1) = (s + 1)/((s − 1)(s + 2)).

We see that the function SP has a pole at s = 1. By Lemma 7.2.1 we
can assert that the system S1 is not internally stable.
Now consider the system S2 with the same plant and compensator C2(s) = 2.
Recall that T2(s) = 2/(s + 1), so

S = (s − 1)/(s + 1),   Q = 2(s − 1)/(s + 1),   S P = 1/(s + 1).

Therefore (7.16) is satisfied, and the system S2 is internally stable.
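The two verifications can be repeated numerically by evaluating SP = P/(1 + PC) just to the right of the plant pole s = 1. This is a sketch of our own (a true pole test would examine the rational functions symbolically): a hidden RHP pole shows up as a huge magnitude near s = 1.

```python
# plant and the two compensators from the text
P = lambda s: 1.0 / (s - 1.0)
C1 = lambda s: (s - 1.0) / (s + 1.0)
C2 = lambda s: 2.0

def SP(C, s):
    """The closed-loop function S*P = P / (1 + P*C)."""
    return P(s) / (1.0 + P(s) * C(s))

s = 1.0 + 1e-6          # just to the right of the plant pole s = 1
# |SP| blows up for C1 (pole-zero cancellation hides an RHP pole)
# but stays bounded for C2
big, small = abs(SP(C1, s)), abs(SP(C2, s))
```

The large value for C1 reflects the pole of SP at s = 1 computed above; the bounded value for C2 is consistent with SP = 1/(s + 1).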


The description of internal stability in terms of interpolation conditions is
given in the following theorem.

THEOREM 7.2.2. Consider the closed-loop system S, where the plant P is
given. Let pi (i = 1, ..., n) denote the poles and zi (i = 1, ..., m) denote the
zeros of the plant P in the closed RHP. Suppose that all unstable zeros and
poles of the plant are simple. If S is internally stable, then there exists an
integer dc ≥ 0 such that the closed-loop transfer function T = PC(1 + PC)^(-1)
must satisfy the interpolation conditions

T(zi) = 0 (i = 1, ..., m),   T(pi) = 1 (i = 1, ..., n),   d(T) − d(P) = dc.     (7.21)

Conversely, if T is any function in RH∞ satisfying (7.21) for some dc ≥ 0,
then the closed-loop system S associated with T and P is internally stable.
Proof. Suppose the system S is internally stable. Then P^(-1)T = C(1 +
PC)^(-1) = Q ∈ RH∞. Clearly, if P(zi) = 0, then zi is a pole of P^(-1). Thus
T(zi) = 0. Also SP = (1 + PC)^(-1)P ∈ RH∞, so poles of P are zeros of
(1 + PC)^(-1) = 1 − T. Then T(pi) = 1. We now set dc = d(C) in equation (7.21)
to finish the first part of the proof.
Now suppose that T ∈ RH∞ satisfies (7.21). The relation d(T) − d(P) =
dc ≥ 0 combined with (7.21) ensures that the compensator is proper. This fact,
and both P and T having the same RHP zeros, imply that Q = TP^(-1) is proper
and has no RHP poles, i.e., P^(-1)T = Q ∈ RH∞. Also S = 1 − T ∈ RH∞. Again,
(7.21) says that poles of P are zeros of 1 − T; thus (1 − T)P = SP ∈ RH∞.
Therefore S is internally stable.
Theorem 7.2.2 can be combined with Theorem 7.1.2 to obtain a formula
describing all possible internally stable systems S associated with a fixed plant.
The proof of Theorem 7.2.3 is left to the reader.
THEOREM 7.2.3. Let P be a proper, real, rational function, with simple
RHP zeros z1, ..., zn and RHP poles p1, ..., pm, and let dc be a nonnegative
integer. Let T0, T1 be RH∞ functions such that

T1(zi) = 0,   T1(pi) = 1,   d(T1) − d(P) = dc     (7.22)

and

T0(zi) = 0,   T0(pi) = 0,   d(T0) − d(P) = dc,     (7.23)

and suppose that all the RHP zeros of T0 are listed in (7.23).
If T ∈ RH∞ is the closed-loop transfer function of an internally stable
system S with plant P and d(C) = dc, then there exists H ∈ RH∞ such that

T(s) = T1(s) + T0(s) H(s).     (7.24)

Conversely, if (7.24) holds for some H ∈ RH∞, then the system S associated
with P and T is internally stable, and the degree of its compensator is d(C) = dc.

Example. Consider the plant P(s) = (s − 1)/(s − 3), and set dc = 0.
We now find a formula for all T that yield an internally stable system S with
d(C) = 0 and plant P(s) = (s − 1)/(s − 3).
The plant P has RHP zero s = 1 and RHP pole s = 3. Set T1(s) =
2(s − 1)/(s + 1) and T0(s) = (s − 1)(s − 3)/(s + 1)². Then T1 and T0 satisfy
(7.22) and (7.23), respectively, and by Theorem 7.2.3 we have

T(s) = 2(s − 1)/(s + 1) + [(s − 1)(s − 3)/(s + 1)²] H(s),   H ∈ RH∞.
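The formula can be spot-checked numerically: whichever function H is substituted, the resulting T satisfies the interpolation conditions T(1) = 0 and T(3) = 1 of Theorem 7.2.2. A small check of our own, with three sample stable choices of H (the particular H's are hypothetical):

```python
T1 = lambda s: 2.0 * (s - 1.0) / (s + 1.0)
T0 = lambda s: (s - 1.0) * (s - 3.0) / (s + 1.0) ** 2

# three sample parameters H with no RHP poles (hypothetical choices)
Hs = [lambda s: 0.0,
      lambda s: 1.0 / (s + 2.0),
      lambda s: (s - 5.0) / (s + 5.0)]

results = []
for H in Hs:
    T = lambda s, H=H: T1(s) + T0(s) * H(s)
    results.append((T(1.0), T(3.0)))   # should be (0, 1) for every H
```

Since T1 and T0 both vanish at the plant zero s = 1, and T0 vanishes at the plant pole s = 3 where T1 equals 1, the conditions hold for every choice of H.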

7.3 Parameterization: The general case


In this section, the results already obtained on parameterization of internally
stable systems are generalized to the case where the plant may have RHP zeros or
poles with multiplicity greater than 1. Results in this section are stated without
proof.

7.3.1 Higher-order interpolation


The interpolation conditions we now consider prescribe specific values of the
derivatives of the function at selected points. Suppose that such a set of
conditions, denoted INTh, on data points s1, ..., sn in the RHP¹
is given. As in previous sections, we consider the problems of producing one
interpolant T1 for INTh and of obtaining a parameterization of all interpolants
for INTh.
We use two RH∞ functions, T0 and T1. The assumptions on these are that
T1 satisfies INTh and that T0 satisfies the corresponding set of homogeneous

¹The inclusion of the condition {s1, ..., sn} ⊂ RHP is not required for interpolation.
However, it is essential for internal stability.

conditions, denoted INTh0.

Furthermore, we suppose that all the RHP zeros of T0 are listed in INTh0 and
that d(T0) = r.
A variation of the method we have been using works for higher-order
interpolation problems as well. The difference here is that the formula for T now
contains summands with numerators of the form 1, (s − s1), (s − s1)², ....

THEOREM 7.3.1. A function T is in RH∞, has relative degree r, and
satisfies INTh if and only if there exists H ∈ RH∞ such that

T(s) = T1(s) + T0(s) H(s).
Example. Find T ∈ RH∞ with relative degree d(T) = 0 such that

Solution. The expression for T is

We also need the derivative of T,

Setting s = 0 in (7.27) and in (7.28) gives c0 = 1 and c1 = 2. Now using s = 3
in (7.27) gives

Solving for c2 in (7.29) yields its value. Finally, setting s = 4 in (7.27) leads to

and this implies that


7.3.2 Plants with high-multiplicity RHP zeros and poles


An RHP pole or zero of the plant with multiplicity higher than 1 in an internally
stable system can be translated into interpolation conditions on derivatives of
T at that location. More precisely:
THEOREM 7.3.2. Consider the closed-loop system S. Let p_l (l = 1, ..., n)
denote the poles and z_l (l = 1, ..., m) denote the zeros of the plant P in the
closed RHP, so that these poles and zeros have multiplicities n_l (l = 1, ..., n)
and m_l (l = 1, ..., m), respectively. If S is internally stable, then for some
integer dc ≥ 0 the closed-loop transfer function T = PC(1 + PC)^(-1) satisfies the
interpolation conditions

INTh:  T^(k)(z_l) = 0   (k = 0, ..., m_l − 1;  l = 1, ..., m),
       T(p_l) = 1,  T^(k)(p_l) = 0   (k = 1, ..., n_l − 1;  l = 1, ..., n),
       d(T) − d(P) = dc.

Conversely, if T is any function in RH∞ that satisfies INTh for some dc ≥
0, then the closed-loop system S associated with T is internally stable and its
compensator has relative degree dc.
We now state the result with the parameterization of internally stable
systems.
THEOREM 7.3.3. Assume the hypotheses of Theorem 7.3.2, and let T1, T0
in RH∞ be such that T1 satisfies INTh and T0 satisfies INTh0. Furthermore,
suppose that all the RHP zeros of T0 are listed in INTh0 and that T0^(n_l)(p_l) ≠ 0
(l = 1, ..., n), T0^(m_l)(z_l) ≠ 0 (l = 1, ..., m).
If T ∈ RH∞ is the closed-loop transfer function of an internally stable
system S with plant P and d(C) = dc, then there exists H ∈ RH∞ such that

T(s) = T1(s) + T0(s) H(s).     (7.32)

Conversely, if (7.32) holds for some H ∈ RH∞, then the system S associated
with P and T is internally stable, and the degree of its compensator is d(C) = dc.

Example. Given a plant P(s) and dc = 1, the interpolation
conditions on any T ∈ RH∞ for internal stability of the system S are given by

If a = −1 is selected as the location of the poles of T, then the technique
described in section 7.3.1 yields the interpolant

Thus we have the parameterization

7.4 Exercises
1. Find an interpolant for each set of conditions below.

a. pole location
b. pole location
c. pole location
d. pole location
2. Prove that the system of equations (7.4) does have a solution in c0, ...,
c_{n−1} and that this solution is unique.
3. Prove that if the function T ∈ RH∞ satisfies T(s1) = 0, ..., T(sn) = 0,
then either T is identically zero or degree{denominator(T)} ≥ n.
4. Prove that given INT with n data points, there exists a unique T ∈ RH∞
that satisfies INT, with the following property: if p and q are polynomials
such that T = p/q, then degree(q) < n.
5. Find all interpolants satisfying the conditions stated in Exercises 1a-1d.
6. Prove Theorem 7.1.1.
7. Find an interpolant for each set of conditions below.
a. pole location s = −1, relative degree d(T) = 2
b. pole location
c. pole location

8. Prove the statements below.

a. If both T1, T2 ∈ RH∞ have relative degree r, then T1 + T2 has relative
degree ≥ r. Give an example where strict inequality occurs.
b. If T1, T2 ∈ RH∞ have relative degrees r1, r2, respectively, then the
product T1T2 has relative degree r1 + r2.
9. Find all functions T ∈ RH∞ that satisfy the conditions stated in Exercises
7a-7c.
10. Prove Theorem 7.1.2.
11. Find a formula for all closed-loop transfer functions T that correspond to
an internally stable system S, if the relative degree of the compensator is
0 and the plant P(s) is given by

12. Do Exercises 11a-11d, but with d(C) = 1.

13. Use the results in section 7.3 to find the closed-loop transfer functions T
that arise from internally stable systems with degree of compensator dc = 0
and plant

d. Do problems a-c.

14. Supply proofs for the results stated in section 7.3.

References and further reading


References that amplify our presentation of internal stability, interpolation, and
control (but with a different point of view) are [BGR90], [BB91] (gives a good
history), [DC80], [F87], [DFT92], [V85], [YJB76a], and [YJB76b].
Part III

H∞ Theory
In this part of the book we discuss the general optimization problem OPT,
which is the main mathematical problem arising in H∞ design. The discussion
is restricted to scalar-valued analytic functions. In control applications this
corresponds to the SISO case.
In Chapter 8 the mathematical problem OPT is formulated, and its relevance
to control is treated. We introduce performance functions and sublevel sets of
a performance function.
In Chapter 9 we give a test for determining if a given function is a solution
to OPT. This one simple characterization of optima is the key to understanding
OPT both conceptually and numerically. This test has a beautiful graphical
interpretation that is also easy to implement on a computer.
Chapter 10 contains background facts on analytic functions and lists three
fundamental properties of analytic functions, which every control engineer should
know.
Chapter 11 gives the proof of the characterization given in Chapter 9 of
solutions to OPT. The proof follows quickly from the three basic properties
listed in Chapter 10.
Chapter 12 concerns computations of solutions to OPT. We begin with com-
puter diagnostics, follow with a sample computer run, and then discuss an al-
gorithm for finding solutions.
Chapter 8

H∞ Optimization and Control

8.1 The problem OPT


Let X be a subset of the complex plane C, and let f : X → C be a continuous
function. The function f is said to be

Bounded (uniformly) if there exists m > 0 such that |f(s)| ≤ m for all
s ∈ X. The symbol sup_{s ∈ X} |f(s)| denotes the smallest such m.

Real (on the real axis) if X is symmetric about the real axis and f(s̄) =
f(s)* (the complex conjugate of f(s)) for every s ∈ X.

Analytic on X if X is open and for every point s0 in X there is a disk
Δ contained in X and centered at s0 such that f(s) has a power series
representation valid for s ∈ Δ,

f(s) = a0 + a1 (s − s0) + a2 (s − s0)² + ···.

We denote by A_RHP the set of all functions f(s) that are real, bounded,
and continuous in the closed right-half-plane (RHP) and that are analytic in
the open RHP. In particular, for any f(s) ∈ A_RHP the function f(jω) is a
bounded, continuous function of ω that satisfies f(−jω) = f(jω)*. The set
A_RHP is a (real) vector space with the usual operations of addition,

(f + g)(s) = f(s) + g(s),

and scalar multiplication,

(α f)(s) = α f(s),   α ∈ R.

It is well known from the theory of analytic functions that an f continuous on a
closed set and analytic on the interior of the set achieves its maximum modulus
on the boundary of the set. As a consequence of this we have that any f in
A_RHP satisfies

sup_{Re s ≥ 0} |f(s)| = sup_{ω ∈ R} |f(jω)|.
A nonnegative-valued function of ω ∈ R and z ∈ C is called a performance
function. We will use the notation Γ(ω, z) to denote a performance function. A
performance function Γ that satisfies

Γ(−ω, z*) = Γ(ω, z)

is said to be real symmetric. Performance functions that arise from
engineering applications are real symmetric; consequently all the performance functions
considered in this book are real symmetric. So from now on, by the word
"performance" what we really mean is "real symmetric performance."
The main optimization problem we consider is¹

OPT  Given a performance function Γ(ω, z), find γ* ≥ 0
and f* in A_RHP such that

γ* = sup_ω Γ(ω, f*(jω)) = min { sup_ω Γ(ω, f(jω)) : f ∈ A_RHP }.

We will refer to the function f* in the statement of OPT as a solution or
minimizer, and refer to γ* as the optimal value. Solutions f* to OPT do not
always exist in A_RHP; however, for many performance functions they do (see
section 9.4).

8.2 The fundamental H∞ problem of control


The OPT problem is central to the design of a system where specifications
are given in the frequency domain and where stability is a key issue. Suppose
that our objective is to design a linear, time-invariant system in the frequency
domain description, with stability a major consideration. Part of the system is
given, which we are forced to use (called the plant), and another part f we get
to choose (the designable part); see Fig. 8.1. The stipulation that the designable
part of the circuit be stable can be restated as f having no poles in the RHP.
Suppose also that to carry out the design (i.e., choose f) we use a given
performance function Γ. In many cases the performance takes the form of a
positive function of the frequency ω and of the designable part f at that frequency.
Thus we write Γ(ω, f(jω)) for the performance at frequency ω and for
designable function f. The performance is a cost function: big is bad. The "worst

¹As stated here, OPT corresponds to SISO system design. The MIMO case leads to OPT
but with functions f = (f1, ..., fN), with each fi ∈ A_RHP. See Chapter 13.

Fig. 8.1. The given and designable parts of the system.

case" is the frequency ω at which sup_{ω ∈ R} Γ(ω, f(jω)) occurs. We minimize this
over all f ∈ A_RHP. Thus we obtain exactly the problem OPT.
The theory of OPT is based in part on the analysis of the sublevel sets

S_ω(c) = { z ∈ C : Γ(ω, z) ≤ c }

of Γ. In fact there is a simple picture to think of in connection with a design
(see Fig. 8.2).

Fig. 8.2. The function f makes performance at least as good as c at ω = ω0.

For each desired level of performance c, one has target sets S_ω(c) associated
with each frequency ω. The objective is to find a function f with no poles in the
RHP such that each f(jω) belongs to S_ω(c). Any such f makes the performance
of the overall system at least as good as c for all ω.

8.3 Change of coordinates from RHP to the unit disk

Because it is more convenient,² we now change coordinates in the domains of
functions and use ζ in the unit disk D in the complex plane and e^(jθ) in the unit
circle ∂D, instead of s in the RHP and jω in the imaginary axis.
The change of coordinates can be accomplished with a linear fractional
transformation:

ζ = (s − 1)/(s + 1),   s = (1 + ζ)/(1 − ζ),     (8.2)

so that functions f(s) of s become functions F(ζ) of ζ.
The transformation (8.2) maps the open RHP one-to-one and onto the open
unit disk D, and maps the jω axis, with ω = ∞ included, one-to-one and onto the unit
circle.
A fundamental property of this coordinate change is that functions remain
analytic, bounded, and real: f(s) is analytic, bounded, and real for s ∈ RHP if
and only if F(ζ) is analytic, bounded, and real for ζ ∈ D. The following table
gives some pairs (s, ζ) that arise from this transformation.
Example. We perform the change of coordinates on the function f(s) given by

We have

Note that f(s) has a pole in the LHP and a zero in the RHP. Correspondingly,
F(ζ) has a pole outside the unit disk and a zero inside the unit disk.

²Mathematics literature treats functions on the disk rather than the RHP. There are other
reasons, e.g., our graphical test in section 9 is easier on the disk, Fourier transforms and their
discrete versions are available and easy to implement on the computer, etc.
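The mapping properties can be checked numerically. Here we take ζ = (s − 1)/(s + 1), one standard choice consistent with the properties stated above (the exact formula in (8.2) could differ by a sign convention):

```python
def to_disk(s):
    # zeta = (s - 1)/(s + 1); s is a Python complex (or float)
    return (s - 1.0) / (s + 1.0)

# points of the j*omega axis land on the unit circle |zeta| = 1
axis = [to_disk(1j * w) for w in (-10.0, -1.0, 0.0, 1.0, 10.0)]
# an RHP point lands inside the disk, an LHP point lands outside
inside, outside = to_disk(2.0 + 1.0j), to_disk(-2.0 + 1.0j)
```

For s = jω, the numerator jω − 1 and denominator jω + 1 have equal modulus, which is why the imaginary axis is carried onto the unit circle; points with Re s > 0 are closer to 1 than to −1 and so land strictly inside the disk.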

Instead of A_RHP we will use its "disk" counterpart, which we denote by A.
In other words, A is the set of all functions f(ζ) that are bounded, real, and
continuous on the closed unit disk, and analytic on the open unit disk. With
the change of coordinates, the main optimization problem becomes

OPT  Given a performance function Γ(e^(jθ), z),
find γ* ≥ 0 and f* in A such that

γ* = sup_θ Γ(e^(jθ), f*(e^(jθ))) = min { sup_θ Γ(e^(jθ), f(e^(jθ))) : f ∈ A }.
8.4 Performance functions and sublevel sets


With the coordinate change introduced in the previous section, the sublevel sets
of Γ(e^(jθ), z) now depend on θ:

S_θ(c) = { z ∈ C : Γ(e^(jθ), z) ≤ c }.

The theory of the OPT problem (see [HMar90]) implies that the existence and
other properties of solutions to OPT are closely related to properties of the
sublevel sets of the performance function. Usually "well-behaved" sublevel sets
correspond to solutions one can consider "nice." A well-behaved sublevel set
has the following characteristics:

Boundedness: The set S_θ(c) is contained in some disk with finite radius.
Connectedness: Two points in S_θ(c) can be joined by an arc that lies
totally in S_θ(c).
Simple connectedness: No "holes"; i.e., the set complementary to S_θ(c) is
connected.
Nonvanishing gradient: The gradient of Γ(e^(jθ), z) in z = x + jy is not zero
for any z in the boundary of S_θ(c).

Other desirable properties are:

Convexity: Two points in S_θ(c) can be joined by a segment that lies totally
in S_θ(c).
Smooth boundary: In particular, the boundary of S_θ(c) has no corners.
Smooth dependence on θ: In particular, small changes in θ lead to small
changes in S_θ(c).

Example 1. For the performance function Γ(e^{jθ}, z) = |1/(e^{jθ} + 0.5) − z|², the sublevel set S_θ(c) is a disk centered at 1/(e^{jθ} + 0.5) with radius √c.

Example 2. If Γ(e^{jθ}, z) = (2 + cos θ) |1/(e^{jθ} + 0.5) − z|², then the sublevel set S_θ(c) is a disk centered at 1/(e^{jθ} + 0.5) with radius √(c/(2 + cos θ)).
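The disk descriptions in Examples 1 and 2 can be verified numerically. The sketch below (a check we add for illustration, not from the text) samples random points z and confirms that Γ(e^{jθ}, z) ≤ c exactly when z lies in the stated disk.

```python
import numpy as np

rng = np.random.default_rng(0)
c = 0.7

def gamma(theta, z):
    # Performance function of Example 2.
    return (2 + np.cos(theta)) * np.abs(1/(np.exp(1j*theta) + 0.5) - z)**2

for theta in np.linspace(0, 2*np.pi, 7):
    center = 1/(np.exp(1j*theta) + 0.5)
    radius = np.sqrt(c/(2 + np.cos(theta)))
    z = rng.normal(size=50) + 1j*rng.normal(size=50)   # random sample points
    in_set = gamma(theta, z) <= c
    in_disk = np.abs(z - center) <= radius
    assert np.array_equal(in_set, in_disk)
```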

Example 3. Suppose that two continuous functions k(e^{jθ}) and w(e^{jθ}) are given and that w(e^{jθ}) > 0 for all θ. With these functions we build the performance

    Γ(e^{jθ}, z) = w(e^{jθ}) |k(e^{jθ}) − z|².

This performance function appears in control applications, and it is quite informative to study. The sublevel sets S_θ(c) of Γ are disks with center k(e^{jθ}) and radius (c/w(e^{jθ}))^{1/2}. Thus S_θ(c) is connected and simply connected, and it is bounded since the radius is finite. The sets S_θ(c) are also convex and have smooth boundary. Since k(e^{jθ}) and w(e^{jθ}) are continuous, S_θ(c) varies continuously with θ. If k(e^{jθ}) and w(e^{jθ}) are smooth, then S_θ(c) varies smoothly in θ for fixed c. Performance functions with sublevel sets that are disks are called quasi-circular. The following is an example.

Example 4. Suppose that w₁(e^{jθ}) and w₂(e^{jθ}) are given positive functions, and

    Γ(e^{jθ}, z) = w₁(e^{jθ}) |z|² + w₂(e^{jθ}) |1 − z|².        (8.7)

This performance function arises in the H∞ control literature, where it is called mixed sensitivity. It is quasi-circular, a fact that we now verify. Use the formula |a − b|² = |a|² − 2 Re(a conj(b)) + |b|² to rewrite (8.7) as

    Γ = (w₁ + w₂) |z|² − 2 w₂ Re(z) + w₂.

Collecting terms and completing the square, we get

    Γ = (w₁ + w₂) |z − w₂/(w₁ + w₂)|² + w₁w₂/(w₁ + w₂).

Therefore S_θ(c) is a disk with center w₂(e^{jθ})/(w₁(e^{jθ}) + w₂(e^{jθ})) and radius

    ( (c − w₁w₂/(w₁ + w₂)) / (w₁ + w₂) )^{1/2},

and Γ is quasi-circular.
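The completing-the-square identity behind quasi-circularity can be double-checked numerically; the sketch below compares the two sides of the identity for random z and positive weights w₁, w₂ (values chosen arbitrarily for illustration).

```python
import numpy as np

rng = np.random.default_rng(1)
w1, w2 = rng.uniform(0.1, 2.0, size=2)        # positive weights at one frequency
z = rng.normal(size=100) + 1j*rng.normal(size=100)

# Mixed sensitivity (8.7) ...
lhs = w1*np.abs(z)**2 + w2*np.abs(1 - z)**2
# ... equals the completed square plus a constant.
center = w2/(w1 + w2)
rhs = (w1 + w2)*np.abs(z - center)**2 + w1*w2/(w1 + w2)
assert np.allclose(lhs, rhs)
```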

Example 5. If w₁(e^{jθ}) and w₂(e^{jθ}) are given positive functions and

then S_θ(c) is an ellipse centered at 1/e^{jθ} and with semiaxes that vary with θ.

Example 6. Set

Then S_θ(c) is a lemniscate centered at 1/e^{jθ}. For small c the sublevel sets are convex, for c slightly smaller than 1 the sublevel sets are nonconvex, and for c > 1 the sublevel sets are no longer connected sets.

Example 7. Let w₁(e^{jθ}) and w₂(e^{jθ}) be given positive functions and

Then S_θ(c) is the intersection of two disks, centered at 1/e^{jθ} and at

with radii c/w₁(e^{jθ}) and c/w₂(e^{jθ}).

Example 8. Let w₁(e^{jθ}) and w₂(e^{jθ}) be given positive functions and

Then S_θ(c) is a rhombus centered at 1/e^{jθ}.


Chapter 9

Solutions to OPT

Suppose we are handed a candidate f for a solution to an OPT problem and asked if it is a solution. The main result in this chapter yields a test to answer this question. The result is given in section 9.3.

9.1 Complex partial derivatives

Since

    x = (z + z̄)/2,    y = (z − z̄)/(2j),

we can change the variables in a function g of x and y and write g as a function of the variables z and z̄. For example, g(x, y) = x² + y² becomes g = z z̄.

Partial derivatives with respect to the variables z and z̄ are defined by the formulas

    ∂/∂z = (1/2)(∂/∂x − j ∂/∂y),    ∂/∂z̄ = (1/2)(∂/∂x + j ∂/∂y).

The partial derivatives with respect to z and z̄ satisfy the standard differentiation rules for functions of one real variable. Some examples are

    ∂z/∂z = 1,    ∂z̄/∂z = 0,    ∂z/∂z̄ = 0,    ∂z̄/∂z̄ = 1,    ∂(z z̄)/∂z = z̄.

It is well known that a complex function g = u(x, y) + jv(x, y) is analytic as a function of z = x + jy if and only if u and v are harmonic functions that satisfy the Cauchy–Riemann equations (cf. [Ahl66], [Con78]). Also, it can be shown that the Cauchy–Riemann equations can be written as ∂g/∂z̄ = 0. Thus the latter equation is equivalent to saying that g is analytic.

A consequence of the definition of ∂/∂z̄ is that if g is a real-valued function of z, then ∂g/∂z̄ is precisely (up to the factor 2) the gradient of g in the real variables x and y. Thus we have (abusing notation)

    ∇g = ∂g/∂x + j ∂g/∂y = 2 ∂g/∂z̄.

See Fig. 9.1.

Fig. 9.1. A sublevel set S_θ(c) = {z : g(z) ≤ c} of a real-valued function g(z), and the gradient vector n at a point of the boundary of S_θ(c).

Although we write functions Γ(e^{jθ}, z) as dependent on z only, actually they also depend on z̄. For convenience the variable z̄ is not made explicit. For example,

    Γ(e^{jθ}, z) = |1/e^{jθ} − z|²

is a function of both z and z̄:

    Γ(e^{jθ}, z) = (1/e^{jθ} − z)(1/e^{−jθ} − z̄).

In particular we have

    (∂Γ/∂z̄)(e^{jθ}, z) = −(1/e^{jθ} − z),    (∂Γ/∂z)(e^{jθ}, z) = −(1/e^{−jθ} − z̄).
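Formulas like these can be confirmed with finite differences, using the definition ∂/∂z̄ = (1/2)(∂/∂x + j ∂/∂y). The sketch below does this for Γ(e^{jθ}, z) = |1/e^{jθ} − z|² at a few sample points.

```python
import numpy as np

def gamma(theta, z):
    return np.abs(1/np.exp(1j*theta) - z)**2

def dbar_fd(theta, z, h=1e-6):
    """Numerical d/dzbar = (1/2)(d/dx + j d/dy) via central differences."""
    gx = (gamma(theta, z + h) - gamma(theta, z - h)) / (2*h)
    gy = (gamma(theta, z + 1j*h) - gamma(theta, z - 1j*h)) / (2*h)
    return 0.5*(gx + 1j*gy)

for theta in [0.3, 1.0, 2.5]:
    for z in [0.0, 0.2 - 0.1j]:
        exact = -(1/np.exp(1j*theta) - z)
        assert abs(dbar_fd(theta, z) - exact) < 1e-5
```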

9.2 Winding number

Another concept we need to introduce is the winding number of a curve with respect to a point. Suppose that a : ∂D → C is continuous and never 0. The point a(e^{j0}) is real and nonzero by assumption. We pick a number such as 0 (if a(e^{j0}) > 0) or −π (if a(e^{j0}) < 0) and define "the argument of a(e^{j0})" to be this number. Then as θ varies from 0 to 2π there is always a well-defined phase (or angle) determined by a(e^{jθ}) and the positive real semiaxis. Furthermore we can choose the phase to vary continuously with θ, provided we allow angles to take any real value. This phase function is called a continuous argument (cf. [Con78]).

Since a(e^{j0}) = a(e^{j2π}), the increment in the value of any continuous argument from θ = 0 to θ = 2π is an integer multiple of 2π, say 2πn. The number n is called the winding number of a(·) with respect to 0 and is denoted wind(a; 0).
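For a curve sampled on a grid, the winding number can be computed by accumulating a continuous argument, exactly as in the definition above. A minimal sketch using phase unwrapping follows.

```python
import numpy as np

def winding_number(curve_vals):
    """Winding of a sampled closed curve (never 0) about the origin."""
    phases = np.unwrap(np.angle(curve_vals))    # a continuous argument
    return round((phases[-1] - phases[0]) / (2*np.pi))

theta = np.linspace(0, 2*np.pi, 2001)           # closed grid on [0, 2*pi]
assert winding_number(np.exp(2j*theta) + 0.5) == 2   # two counterclockwise turns
assert winding_number(3 + np.exp(1j*theta)) == 0     # origin not enclosed
assert winding_number(np.exp(-1j*theta)) == -1       # one clockwise turn
```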

9.3 Main result on the characterization of solutions

The characterization of scalar-valued solutions to OPT is as follows.

THEOREM 9.3.1. Suppose a given performance function Γ(e^{jθ}, z) is of class C³. Let f* be a function in A for which the function

    a(e^{jθ}) := (∂Γ/∂z)(e^{jθ}, f*(e^{jθ}))

never vanishes. Then f* is a strict local minimizer for OPT if and only if

(i) Γ(e^{jθ}, f*(e^{jθ})) is constant in θ.

(ii) wind(a; 0) ≥ 0.

The following is a very nice geometric interpretation of (i) and (ii) of Theorem 9.3.1. Suppose that γ* is the optimal value of OPT. The sublevel sets

    S_θ(γ*) = { z : Γ(e^{jθ}, z) ≤ γ* }

at c = γ* can be represented as a solid in 3-D as shown in Fig. 9.2. In this figure the variable θ is represented on one of the axes, and at each constant θ the plane perpendicular to the θ axis represents the complex plane C containing the sublevel set S_θ.

Condition (i) in Theorem 9.3.1 implies that f*(e^{jθ}) is always on the boundary of the set S_θ(γ*). Thus as θ varies from 0 to 2π, the pair (θ, f*(e^{jθ})) describes a curve on the boundary of the solid shown in Fig. 9.2. In other words, if the curve (e^{jθ}, f*(e^{jθ})) dips inside the solid or goes outside of it, then f* does not satisfy condition (i).

To see what (ii) means, consider the function a(e^{jθ}) = (∂Γ/∂z)(e^{jθ}, f*(e^{jθ})). At each θ its conjugate is an exterior normal to the set S_θ(γ*) ⊂ C at the point f*(e^{jθ}). As θ varies from 0 to 2π, this normal winds about the θ axis; condition (ii) of the theorem says that if f* is an optimizer, then the normal winds at least once in the clockwise sense.

Fig. 9.2. The optimal performance surface. In the top figure the curve (e^{jθ}, f*(e^{jθ})) lies on the boundary of the solid. As θ increases from 0 to 2π, the dark strip representing normals to the sublevel sets makes one complete clockwise revolution with respect to the θ axis, as indicated in the bottom figure.

Example 1. The constant function f(e^{jθ}) = 0 is a solution to OPT when the performance is Γ(e^{jθ}, z) = |1/e^{jθ} − z|². Indeed, Γ(e^{jθ}, 0) = 1 and a(e^{jθ}) = −e^{jθ}, so conditions (i) and (ii) in Theorem 9.3.1 are satisfied.

Exercise. Let α and β be positive constants. Consider the performance

Prove that for a suitable constant k₀, the function f(e^{jθ}) = k₀ e^{jθ} is a solution to OPT.

9.4 Properties of solutions

A knowledge of the fundamental qualitative properties of solutions is very useful to someone who is using a computer program to solve problems like OPT. Then, when the program produces odd answers, the user can have an idea of what is happening or at least eliminate some possibilities.

Three important questions from the qualitative point of view are:

  If a solution exists, is it unique?

  If a solution exists, is it smooth?

  Does OPT have a solution or merely an optimizing sequence?

In the case of scalar-valued functions f, the OPT problem is remarkably well behaved (even when the performance function Γ is not convex in z). Solutions exist in A and are unique! Solutions are smooth (infinitely differentiable) if the performance function Γ is smooth. A discussion of existence and uniqueness of solutions in the scalar case can be found in [HMar90], which is the main source of material in this section.

The sublevel sets

    S_θ(c) = { z : Γ(e^{jθ}, z) ≤ c }

are basic to the study of qualitative properties of OPT. The standard assumption that we make on sublevel sets is as follows.

SA (Standard assumption): S_θ(c) is connected and simply connected, and as θ varies it is uniformly bounded and has area uniformly bounded away from 0. Furthermore, the gradient of Γ(e^{jθ}, z) in z = x + jy is not zero for any z in the boundary of S_θ(c).

THEOREM 9.4.1. Let Γ(e^{jθ}, z) be at least once continuously differentiable with sublevel sets that satisfy SA. Then any local continuous solution f* to OPT is unique. Moreover, if Γ is infinitely differentiable, then f* is infinitely differentiable on the unit circle ∂D = { e^{jθ} : 0 ≤ θ < 2π }.

One consequence of Theorem 9.4.1 is that global minimizers and local minimizers of OPT are the same. In particular, if one calculates a solution using a method for finding local solutions, then the solution is in fact the global solution.

Also, one should not take smoothness entirely for granted. In the very natural and important problem of optimizing H∞ performance over stable compensators, one finds that the optimal closed-loop transfer function is almost never continuous. Indeed, requiring that the compensator be stable in addition to requiring that the closed-loop system be internally stable has an extremely strong effect on the nature of the optimal solution. This notion originated in circuit theory, where it is called the spiked gain principle [H81].

The results cited in this section extend to optimization problems where the performance is not necessarily real symmetric and the optimization is performed over analytic functions that are not necessarily "real," i.e., that need not satisfy f(e^{−jθ}) = conj(f(e^{jθ})).
130 CHAPTER 9. SOLUTIONS TO OPT

9.5 The OPT_RHP optimality test

We change coordinates and look briefly at functions analytic on the RHP rather than on the unit disk. Conversion of the main results of this chapter to the RHP is an exercise suited to the reader who wants to build complex-variable strength. We do not go through the derivation here but just list the RHP version of Theorem 9.3.1 characterizing solutions to the OPT problem.

We consider a RHP version of the OPT problem,

(OPT_RHP)

tailored to control problems where the plant has no poles or zeros on the jω axis except at ∞. This gives simple formulas that are illustrative of the general situation.

Let d(P) denote the relative degree of the plant P, and let d(C) denote the relative degree we shall require of the compensator C we are seeking. Then the function T = PC(I + PC)⁻¹ has relative degree d(C) + d(P). We now assume that Γ(ω, z) has sublevel sets S_ω(c) that shrink to zero at a rate determined by d(C) + d(P) as ω → ∞. To avoid pathology in S_ω we assume that Γ(ω, z) for large ω is given by a rational function. This would be the case in a typical control design problem.

Now we state a test for optimality that requires checking function values only for positive ω.
THEOREM 9.5.1. Suppose Γ is C³ and satisfies the assumptions laid out in this section. Suppose f* ∈ A_RHP produces a(jω) := (∂Γ/∂z)(ω, f*(jω)), which never equals 0. Then f* is a strict local minimizer for OPT_RHP if and only if

(i) Γ(ω, f*(jω)) is constant in ω.

(ii) d(C) + d(P) plus twice the number of counterclockwise turns of a(jω), as ω goes from 0 to ∞, is nonnegative.

Note that the number of counterclockwise turns in (ii) can be an integer or half of an integer.

For more detail and applications of this test to special (circular) classes of Γ, see [Lprep]. Theorem 9.3.1 comes from [H86], and Theorem 9.4.1 comes from [HMar90].
Chapter 10

Facts about Analytic Functions

This chapter presents three basic properties of analytic functions. We think they are so basic that most systems engineers should know them. The proofs, though nontrivial, are well known to specialists; however, we sketch them here. Section 10.1 concerns approximation of functions on an arc of the circle. Spectral factorization is discussed in section 10.2, and an analogous result on the phase of functions is presented in section 10.3. The chapter closes with a discussion in section 10.4 of an important design principle.

10.1 Any function on an arc is a small perturbation of a function analytic on the disk

THEOREM 10.1.1. Let P be a closed subset of the unit circle ∂D that does not equal the whole unit circle. If g is any continuous function on P, then there is a sequence of rational functions r_k ∈ A that approximate g uniformly on P; that is,

    sup_{e^{jθ} ∈ P} |g(e^{jθ}) − r_k(e^{jθ})| → 0  as  k → ∞.

Moreover, if R is any subset of the complex plane C that does not intersect the closed unit disk, then one can choose each r_k to have its poles in R.
Proof. One can use a suitable (analytic) linear fractional transformation v = l(ζ) to map the unit disk to the unit disk, and the set P to a set P₁ that is a subset of the open disk D(1, 1) centered at 1 with radius 1.¹ This allows

¹ To approximate g on P by analytic functions is equivalent to approximating the function g₁ = g ∘ l⁻¹ on P₁ by analytic functions of v: if f is a function analytic on the disk such that

us to assume that P is a set contained in the disk D(1, 1), and we make this assumption in the rest of the proof.

As a first case consider a function g(ζ) = Σ_{ℓ=−n}^{n} c_ℓ ζ^ℓ, where n ≥ 0 is fixed. The function g(ζ) can be expanded in a power series in ζ − 1 that is uniformly and absolutely convergent on compact subsets of D(1, 1). By taking as many terms as necessary in this power series, we can approximate g(e^{jθ}) by analytic polynomials in e^{jθ}.

The same holds when g is a continuous function on P, since such functions are approximable by expressions of the form Σ_{ℓ=−n}^{n} c_ℓ (e^{jθ})^ℓ by the Weierstrass approximation theorem.

Therefore g is approximable by analytic functions on P. Finally, Runge's theorem [R] guarantees that any analytic function on the disk that is continuous on the closed unit disk can be uniformly approximated on the circle by rational functions whose poles lie in any specified set that does not intersect the closed disk.
How badly behaved are the functions r_k in Theorem 10.1.1 outside the arc P? The answer is given by the following theorem.

THEOREM 10.1.2. Let P be a closed proper subset of the unit circle ∂D. If g is any continuous function on P, then either g extends to a bounded analytic function on the disk D or, for any sequence {f_n} ⊂ A such that

    sup_{e^{jθ} ∈ P} |g(e^{jθ}) − f_n(e^{jθ})| → 0,

the sequence must satisfy

    sup_{e^{jθ} ∈ ∂D} |f_n(e^{jθ})| → ∞.

Since the proof of this theorem requires more advanced mathematics than we use at this stage of the book, we provide only a sketch here. If sup_θ |f_n(e^{jθ})| ≤ k₀ for some constant k₀ and all n, then {f_n} is a normal family (cf. [Con78]). Therefore it has a subsequence that is uniformly convergent on compact subsets of the disk to a function f₀ ∈ H∞. Moreover, |f_n(e^{jθ}) − f₀(e^{jθ})| → 0 as n → ∞ for all e^{jθ} ∈ P. Then g(e^{jθ}) = f₀(e^{jθ}) for e^{jθ} ∈ P, so f₀ is an analytic extension of g to the disk.
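Theorems 10.1.1 and 10.1.2 together can be illustrated numerically. In the sketch below (an illustration we add here, with g(e^{jθ}) = e^{−jθ}, which admits no bounded analytic extension to the disk), analytic polynomials are least-squares fitted to g on an arc: the error on the arc shrinks as the degree grows, while the sup norm over the whole circle blows up.

```python
import numpy as np

arc = np.exp(1j*np.linspace(-np.pi/2, np.pi/2, 400))      # the set P (an arc)
circle = np.exp(1j*np.linspace(0, 2*np.pi, 1000))
g = np.conj(arc)                  # g(e^{j theta}) = e^{-j theta} on the arc

errs, peaks = [], []
for deg in [2, 8, 20]:
    V = np.vander(arc, deg + 1, increasing=True)          # basis 1, z, ..., z^deg
    coef, *_ = np.linalg.lstsq(V, g, rcond=None)          # least-squares fit on the arc
    errs.append(np.max(np.abs(V @ coef - g)))             # sup error on the arc
    peaks.append(np.max(np.abs(np.polyval(coef[::-1], circle))))  # sup over the circle

assert errs[-1] < errs[0]        # the fit on the arc improves...
assert peaks[-1] > peaks[0]      # ...while the sup norm over the circle blows up
```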

10.2 Spectral factorization: Analytic functions with prescribed amplitude

THEOREM 10.2.1. Given a differentiable function r(e^{jθ}) > 0, there exists a unique function h ∈ A that is never 0 on the closed disk D⁻ and that factors r in the sense

    r(e^{jθ}) = |h(e^{jθ})|².
¹ (cont.) then clearly f ∘ l is analytic and approximates g on P.

Moreover, if r is a polynomial or rational, then so is h.

Proof. We give a proof for the case where r is a polynomial. Suppose that r is given by

    r(e^{jθ}) = Σ_{n=−N}^{N} c_n e^{jnθ}.

Since r is real-valued, we have that r(e^{jθ}) = conj(r(e^{jθ})), and from this it follows that c_{−n} = conj(c_n). We assume without loss of generality that c_N = c_{−N} = 1. Now r(ζ) = Σ_{n=−N}^{N} c_n ζⁿ is a rational function of ζ ∈ C, with poles located at ζ = 0 and no zeros on the unit circle. If ζ₀ ∈ C is such that r(ζ₀) = 0, we have

    conj(r(1/conj(ζ₀))) = r(ζ₀) = 0.

We have proved that if ζ₀ is a zero of r, then ζ₀* := 1/conj(ζ₀) is also a zero. Thus the polynomial (ζ − ζ₀)(ζ − ζ₀*) is a factor of the numerator of ζ^N r(ζ), which (as the reader can easily prove), upon division, yields a polynomial exhibiting the same type of symmetric arrangement of the coefficients. One can carry this process on until r is completely factored in the following form:

    r(ζ) = ζ^{−N} Π_{i=1}^{N} (ζ − ζ_i)(ζ − ζ_i*).

Since |ζ| < 1 if and only if |ζ*| > 1, we may label ζ₁, ..., ζ_N as the roots of r located in the unit disk, so that ζ₁*, ..., ζ_N* are outside the unit disk. Now set

    h(ζ) = c Π_{i=1}^{N} (ζ − ζ_i),

where the constant c > 0 normalizes the product. Note that the relation

    e^{jθ} − ζ_i* = −(e^{jθ}/conj(ζ_i)) · conj(e^{jθ} − ζ_i)

implies that r(e^{jθ}) = |h(e^{jθ})|². This proves the theorem in the case where r is a polynomial.
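The root-splitting construction in the proof can be carried out numerically. The sketch below factors the trigonometric polynomial r(e^{jθ}) = 2.6 + 2 cos θ (an arbitrary example) by keeping the roots of ζ^N r(ζ) that lie inside the unit disk and fitting the positive constant.

```python
import numpy as np

# r(e^{j theta}) = e^{-j theta} + 2.6 + e^{j theta} > 0 on the circle.
c = np.array([1.0, 2.6, 1.0])            # Laurent coefficients of zeta^{-1..1}

# zeta^N r(zeta) is an ordinary polynomial (N = 1 here); split its roots.
roots = np.roots(c[::-1])                # numpy wants highest degree first
inside = roots[np.abs(roots) < 1]
h = np.poly(inside)                      # monic polynomial with the inside roots

theta = np.linspace(0, 2*np.pi, 200)
zeta = np.exp(1j*theta)
r_vals = np.polyval(c[::-1], zeta) / zeta          # r on the circle (real, positive)
h_vals = np.polyval(h, zeta)
scale = np.mean(r_vals.real / np.abs(h_vals)**2)   # positive constant factor
assert np.allclose(r_vals.real, scale*np.abs(h_vals)**2)
assert np.allclose(r_vals.imag, 0, atol=1e-12)
```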

10.3 Analytic functions with prescribed phase

THEOREM 10.3.1. Let b be a complex-valued, differentiable function of e^{jθ} that never equals 0. Then there is a smooth function h ∈ A whose phase equals the phase of b at every e^{jθ} if and only if wind(b; 0) = n ≥ 0. Furthermore, the function h has no zeros in the disk if and only if n = 0, and in this case h is unique up to a constant. We call this function h the (analytic) phase factor of b.
Proof. This may be beyond the skill level of many students, so not everyone will want to spend much time on it. Suppose that b is a given differentiable function on the circle with wind(b; 0) = n = 0. The argument v(θ) (i.e., the phase) of b(e^{jθ}) is a real-valued continuous function of θ on the interval [0, 2π] that satisfies

    b(e^{jθ}) = |b(e^{jθ})| e^{jv(θ)}.

Also, since b has winding number 0 about 0 we must have v(0) = v(2π). So we may think of v as a function on the unit circle, and as such it has an associated Fourier series expansion

    v(e^{jθ}) = Σ_{ℓ=−∞}^{∞} v_ℓ e^{jℓθ},

where (since v is real-valued) conj(v_ℓ) = v_{−ℓ}. Now let u be the function on the circle whose Fourier coefficients are given by u_ℓ = −j v_ℓ if ℓ < 0, u_ℓ = j v_ℓ if ℓ > 0, and u₀ = 0. Then u(e^{jθ}) is real-valued. Set g(e^{jθ}) := u(e^{jθ}) + j v(e^{jθ}). A calculation yields

    g(e^{jθ}) = j v₀ + 2j Σ_{ℓ=1}^{∞} v_ℓ e^{jℓθ}.        (10.4)

Since b is differentiable, v is differentiable too, and then the function u is a continuous function of e^{jθ} [Hof62]. Hence g is a continuous function on the circle. In the theory of Hardy spaces it is shown that (10.4) can be used to define a continuous extension of g to the closed unit disk that is analytic inside the disk. The Taylor coefficients in the expansion of this extension about 0 are precisely the Fourier coefficients of g. Finally, set h(ζ) := e^{g(ζ)}. Clearly h has no zeros in D.²


² Another approach to proving the theorem is to construct the solution of Laplace's equation on D with boundary values v, for example using Poisson's formula. Then a harmonic function v*(x, y) can be determined so that g(x + jy) = v*(x, y) + jv(x, y) satisfies the Cauchy–Riemann equations on D. The function h = e^g is analytic on the disk D, and its phase on the circle is v, i.e., the phase of b.

To prove uniqueness, note that if h₁ is another analytic function whose phase on the circle is the phase of b, then h₁/h is analytic on D since h has no zeros there. Also, this ratio is real-valued on the circle. But this is not possible unless the ratio is a constant.

Now, if n > 0, then consider the function b₁(e^{jθ}) = b(e^{jθ})/(e^{jθ})ⁿ. By the above discussion, there is a function h analytic on the disk whose phase on the circle is the phase of b₁. But then the phase of b(e^{jθ}) is equal to the phase of h(e^{jθ})(e^{jθ})ⁿ, which is analytic on the disk D.

We have just proved one direction of the theorem. The other direction is easy: if h is an analytic function with the same phase as the phase of b, then n = number of zeros of h inside the disk ≥ 0. In particular, if h has no zeros, then n = 0.
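The Fourier-coefficient construction in the proof is straightforward to implement with the FFT. The sketch below takes b(e^{jθ}) = 2 + e^{jθ} (winding number 0), builds g = u + jv from the continuous argument v as above, and checks that h = e^g is (numerically) analytic with the phase of b.

```python
import numpy as np

N = 1024
theta = 2*np.pi*np.arange(N)/N
b = 2 + np.exp(1j*theta)                 # never 0, winding number 0

v = np.unwrap(np.angle(b))               # continuous argument of b
V = np.fft.fft(v)/N                      # Fourier coefficients v_l
k = np.fft.fftfreq(N, 1/N)               # frequency index l
U = 1j*np.sign(k)*V                      # u_l = j v_l (l>0), -j v_l (l<0), u_0 = 0
u = np.real(np.fft.ifft(U*N))
g = u + 1j*v
h = np.exp(g)

# h has (numerically) no negative Fourier coefficients, so it is analytic...
H = np.fft.fft(h)/N
assert np.max(np.abs(H[k < 0])) < 1e-8
# ...and its phase agrees with the phase of b.
assert np.allclose(np.angle(h), np.angle(b))
```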
We finish the section by mentioning a basic result about rational functions (see [Con78]).

THEOREM 10.3.2. If r(ζ) is a rational function that has no poles or zeros on the unit circle |ζ| = 1, then

    wind(r; 0) = z_r − p_r,

where z_r = number of zeros of r inside the unit disk and p_r = number of poles of r inside the unit disk.

10.4 The fundamental mistake of H∞ control and other stories

This section is not essential to the proofs that follow in the rest of the book; however, the discussion can help reinforce the ideas in section 10.1. It also states a simple principle with which every control designer should be familiar.

Any frequency domain control problem that has been correctly translated into a math problem suitable for computer optimization has the property that at each frequency there is a constraint imposed by the specifications. People who violate this make the following error:

The fundamental mistake of H∞ control

"I am only interested in a frequency band B, so I will not put specs outside of the band B."

The price of making this mistake is high because, upon running an optimization program with such specs as input, one finds that the closed-loop transfer function T has huge magnitude off of B. Amusingly enough, T will meet the specs perfectly on B. In short, one obtains a solution so ridiculous that it is not even a good basis for trial and error.

Why are the remarks we just made true? They are an immediate consequence of Theorems 10.1.1 and 10.1.2; indeed, the remarks are just a rephrasing of those results, only now in the context of control.

A wholly different consequence of Runge's theorem is that pole placement, a valuable technique in control design, has a shaky foundation, which (in the authors' opinion) has not been carefully analyzed. The basic idea behind pole placement is to set the poles of the closed-loop transfer function at some prescribed distance from the jω axis to achieve an objective, such as stabilizing the system without making it too sluggish. While the technique is effective in particular situations where there is a lot of human intervention and time for trial and error, we maintain that the method is badly codified, so that it cannot be used algorithmically. This is a consequence of Runge's theorem (though not of Theorem 10.1.1). Runge's theorem implies that any function in A_RHP that is smooth on the jω axis (even at ω = ∞) can be approximated by r_k exactly as in Theorem 10.1.1. That is, the hypotheses of this theorem are different, but the conclusion is the same. This of course implies that if we have a proposed design producing a closed-loop transfer function T₀ that is rational with poles at locations {p_j}, then we can find T_ε such that sup_ω |T₀(jω) − T_ε(jω)| < ε and with poles about any place we want. The number of poles of T_ε might be greater than that of T₀. This and other constraints that one might impose in order to make pole placement a systematic rather than a trial-and-error subject have not been analyzed.

The moral of the pole placement story is different from the tale of the fundamental mistake. Here we are saying that pole placement can be a viable method, but it is a ripe area for more research. In contrast, the fundamental mistake should never be made.
Chapter 11

Proof of the Main Result

An important tool used in the proof of Theorem 9.3.1 is the Taylor expansion of smooth, real-valued functions of z ∈ C in powers of z and z̄. This is computed for performance functions in section 11.1. The actual proof of the theorem begins with the necessity of conditions (i) and (ii) of the theorem, discussed in section 11.2 and section 11.3, respectively. The proof of the theorem is completed with the discussion of sufficiency of (i) and (ii) in section 11.4.

11.1 Taylor expansions and performance functions

By using relation (9.2) the reader can verify that if G(z) is a real-valued and smooth function, then the following expansion holds:

    G(w + z) = G(w) + 2 Re( (∂G/∂z)(w) z ) + O(|z|²).        (11.1)

To use this in the proof of the theorem, we consider the function G(z) = Γ(e^{jθ}, f*(e^{jθ}) + z) (for now e^{jθ} is fixed), and for ease of notation we write

    a(e^{jθ}) := (∂Γ/∂z)(e^{jθ}, f*(e^{jθ})).        (11.2)

The Taylor expansion can now be restated as

    Γ(e^{jθ}, f*(e^{jθ}) + z) = Γ(e^{jθ}, f*(e^{jθ})) + 2 Re( a(e^{jθ}) z ) + O(|z|²).        (11.3)

The O term in (11.3) represents a function that goes to 0 at least as fast as a constant multiple of |z|² when z → 0. In other words, when |z| is small,

    |O(|z|²)| ≤ (const) |z|².        (11.4)

Of course the O term in equation (11.3) depends on θ, but it is possible to obtain an inequality that is valid for all e^{jθ}. The hypotheses on smoothness of

Γ imply that there exist positive constants c₀ and δ₀ such that the following relation holds:

    |Γ(e^{jθ}, f*(e^{jθ}) + z) − Γ(e^{jθ}, f*(e^{jθ})) − 2 Re( a(e^{jθ}) z )| ≤ c₀ |z|²  for all θ and all |z| ≤ δ₀.        (11.5)

We leave the proof of (11.5) to the reader.

11.2 Solutions make the performance flat

Suppose that f* is a solution to OPT for which Γ(e^{jθ}, f*(e^{jθ})) is not constant. By the assumptions, there is a θ interval τ on which Γ(e^{jθ}, f*(e^{jθ})) dips below γ* − δ₁ (see Fig. 11.1). To obtain a contradiction we will produce a function f ∈ A of the form f = f* + εh with the property that sup_θ Γ(e^{jθ}, f(e^{jθ})) < sup_θ Γ(e^{jθ}, f*(e^{jθ})). To construct and analyze f we shall use the set τ.

Fig. 11.1. The function g(e^{jθ}) dips below γ* − δ₁.

By Theorem 10.1.1 we can select h ∈ A with phase that approximates the phase of −a(e^{jθ}) on τᶜ, the complement of τ.¹ That is,

Thus, there exists a constant δ₂ > 0 such that

We use the Taylor expansion of Γ to see that for ε > 0 small enough,² the
¹ This uses only a small part of the strength of Theorem 10.1.1, which says that we could choose h to approximate the whole function −a(e^{jθ}) on τᶜ. Also, we could have used Theorem 10.3.1.

² A precise argument is to choose ε > 0 such that ε c₀ sup_θ |h(e^{jθ})|² < δ₂. Combining this with inequality (11.5), we have

function f = f* + εh satisfies

Of course εh might be very large on τ, and if ε is not small enough, then Γ(e^{jθ}, f(e^{jθ})) may be larger than γ* for θ ∈ τ. However, we can take ε so small that

Then Γ(e^{jθ}, f(e^{jθ})) is uniformly less than γ* for all θ, contradicting the assumption that f* is optimal.

11.3 Solutions must satisfy the winding number condition

We may now assume that Γ(e^{jθ}, f*(e^{jθ})) is constant = γ*. If m := wind(a; 0) < 0, then the function b(e^{jθ}) = (e^{jθ})^{−m} a(e^{jθ}) is never 0 and has winding number 0 about 0. Then by Theorem 10.3.1 there is a function h in A whose phase approximates the phase of −b, so that there exists a constant δ₃ > 0 such that

Compare (11.10) with (11.6): the former is valid for all θ and the latter is valid only for θ ∈ τᶜ. We can proceed with steps similar to those that follow (11.6), and with a small enough ε we obtain

This produces the contradiction.
11.4 Flatness and winding number conditions are sufficient

Suppose now that conditions (i) and (ii) of Theorem 9.3.1 hold, but that f* is not an optimizer. To obtain a contradiction we assume that {h_n}_{n∈N} is a sequence in A such that sup_θ |h_n(e^{jθ})| → 0 and

    sup_θ Γ(e^{jθ}, f*(e^{jθ}) + h_n(e^{jθ})) ≤ γ*.        (11.11)

Before launching into heavy details we give the idea of the proof. Taylor approximation and the flatness condition (i) of Theorem 9.3.1 imply that

The key point is that since h_n is analytic on the disk,

    wind(a h_n; 0) = wind(a; 0) + wind(h_n; 0) ≥ 0,

provided h_n has no zeros on the circle. (If it does, then one can replace it by a function without zeros on the circle that is a small perturbation of h_n.) This says that the function a h_n must wind about 0, so in particular Re(a h_n) must be positive for some e^{jθ₁}. Hence

But this contradicts the assumption (11.11).


Now we give a formal proof that includes the second-order term in the Taylor expansion. Since sup_θ |h_n(e^{jθ})| → 0 we may assume that sup_θ |h_n(e^{jθ})| < δ₀ without loss of generality, and we have from (11.5) that

Hence

Suppose for a moment that for each n there exists θ such that a(e^{jθ}) h_n(e^{jθ}) = t_n > 0. Substituting in (11.15), we get

But since ||h_n||_∞ → 0, we have that t_n → 0, so the left-hand side of (11.17) cannot hold for all n.

To justify the existence of t_n as above, we note that the complex function H_n(z) = z h_n(z) has a zero at z = 0. By complex function theory, the function H_n maps the disk onto a set containing some disk centered at 0, so for some θ we have H_n(e^{jθ}) > 0. Since the function a(e^{jθ})/e^{jθ} is continuous and never 0, it follows that for some θ,

This completes the proof of the theorem.


Chapter 12

Computer Solutions to OPT

One of the big payoffs of having results such as Theorem 9.3.1 available is that one can develop optimality tests that can be used as a measure of how close a candidate solution to OPT is to being optimal. This topic is presented in section 12.1. As we shall see in section 12.3, another major payoff is that optimality conditions lead directly to computer algorithms. This requires some basic definitions, which are given in section 12.2.

12.1 Computer diagnostics

An extremely valuable tool in computation, especially with iterative methods, is a collection of diagnostics that indicate whether or not the iteration is close to a solution. The diagnostics that over the years we have found effective for OPT are based directly on the general mathematical Theorem 9.3.1. This has the advantage that the diagnostics are not algorithm specific, so we recommend them to anyone producing H∞ software. Indeed we make heavy use of them in Anopt, our H∞ optimization software program.

Suppose that f ∈ A is a candidate solution to OPT. To test the flatness condition (i) of Theorem 9.3.1, we can calculate

To implement a test for condition (ii),¹ compute
¹ Since the winding number of a function can take only integer values, small changes in a function typically do not change its winding number with respect to 0. Condition (ii) is automatically satisfied for all functions f that are close (in the norm of the supremum) to a solution f*!


by evaluating the phase of / on a grid and taking differences of these values


on adjacent gridpoints. At the optimum Flat(f) = 0 and GrAlign(f) > 0.
Optimization programs for solving OPT should print Flat(f) and GrAlign(f)
at each iteration so that the user can monitor how the approximate solutions /
are progressing. Of course, if / is near the solution, we must have Flat(f) « 0
and GrAlign(f) > 0.
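A minimal version of such diagnostics is sketched below for the performance Γ(e^{jθ}, z) = |1/e^{jθ} − z|² of Example 1 in Chapter 9. The definitions here are our assumptions for illustration: Flat is taken as the relative variation of Γ over a grid, and GrAlign as the winding number of a gradient-type test function computed from phases on adjacent gridpoints; the exact formulas used by Anopt may differ.

```python
import numpy as np

N = 512
theta = 2*np.pi*np.arange(N)/N
e = np.exp(1j*theta)

def gamma(f):                    # performance of Example 1, Chapter 9
    return np.abs(1/e - f)**2

def a(f):                        # assumed gradient-type test function along f
    return -(np.conj(1/e) - np.conj(f))

def flat(f):
    """Relative variation of Gamma along the candidate (0 means perfectly flat)."""
    g = gamma(f)
    return (g.max() - g.min()) / g.max()

def gr_align(f):
    """Winding number of a(e^{j theta}) about 0, from unwrapped gridpoint phases."""
    vals = a(f)
    ph = np.unwrap(np.angle(np.append(vals, vals[0])))   # close the curve
    return round((ph[-1] - ph[0]) / (2*np.pi))

f = np.zeros(N)                  # candidate solution f = 0
assert flat(f) < 1e-12           # condition (i): Gamma is flat along f
assert gr_align(f) >= 0          # condition (ii): nonnegative winding
```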
To illustrate a practical set of diagnostics we solve OPT when Γ is given by

    Γ(e^{jθ}, z) = |0.8 + (1/e^{jθ} + z)²|².

A computer run produces the numbers shown in Table 12.1.

Table 12.1. Computer output.

In our computer run the Flat diagnostic went from 0.99 down to 0.002 in four iterations, while GrAlign was zero throughout the iteration (this is not unusual for "scalar" cases).

The numbers in Table 12.1 were obtained with the program Anopt. Computer code for generating the run is

    <<Anopt`;
    g = Abs[0.8 + (1/e + z[1])^2]^2;
    Anopt[g, 0.01];

The reader who wants more details may refer to Appendix D.

12.2 Spaces of functions

This section serves as background for section 12.3.

Functions f(z) in A have a power series expansion about 0 valid for any z in the unit disk:

    f(z) = Σ_{ℓ=0}^{∞} a_ℓ z^ℓ,        (12.1)

where the coefficients a_ℓ are real numbers. On the other hand, if we now consider f(e^{jθ}) as a function on the circle, it is a continuous function and thus has a Fourier series expansion

    f(e^{jθ}) = Σ_{ℓ=−∞}^{∞} f_ℓ e^{jℓθ},        (12.2)

where the Fourier coefficients f_ℓ are real numbers. The relationship between (12.1) and (12.2) becomes clear when we formally substitute e^{jθ} for z in (12.1):

    f(e^{jθ}) = Σ_{ℓ=0}^{∞} a_ℓ e^{jℓθ}.        (12.3)

Now, by the theory of Fourier series we know that the representation (12.2) is unique. If (12.3) is valid, then we must have that f_n = a_n for n = 0, 1, 2, ... and that f_n = 0 for n = −1, −2, .... To prove that (12.3) holds for functions f ∈ A, one can consider first polynomials in z, for which this relation is obvious. Then the general result follows from the fact that functions in A are limits of polynomials.
Now suppose that f(e^{jθ}) is any continuous function on the circle. If
the Fourier expansion of f(e^{jθ}) has the form

    f(e^{jθ}) = Σ_{n=0}^∞ f_n e^{jnθ},    (12.4)

we see that we can generate a function f(z) defined on the disk as a power series
in z by formally replacing e^{jθ} by z in (12.4). It is possible to show that the radius
of convergence is at least 1 (see [Hof62]). Thus, roughly speaking, functions defined by
relation (12.1) and functions defined by relation (12.2) can be identified.
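The identification can be seen numerically with the FFT. In this sketch (our own illustration) we take f(z) = 1/(1 - z/2), which belongs to A, and check that the Fourier coefficients computed from boundary values reproduce the Taylor coefficients (1/2)^n while the negative-index coefficients vanish:

```python
import numpy as np

n = 256
theta = 2 * np.pi * np.arange(n) / n
z = np.exp(1j * theta)

f = 1.0 / (1.0 - 0.5 * z)       # f(z) = sum_{k>=0} (1/2)^k z^k, analytic in |z| < 2

coeffs = np.fft.fft(f) / n      # coeffs[k] ~ k-th Fourier coefficient (index mod n)

# Nonnegative frequencies reproduce the Taylor coefficients (1/2)^k ...
print(np.allclose(coeffs[:5].real, [1, 0.5, 0.25, 0.125, 0.0625]))   # True
# ... and the negative frequencies (stored at the top of the FFT array) vanish.
print(np.max(np.abs(coeffs[-5:])) < 1e-12)                           # True
```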
If we drop from the definition of A the requirement that f(z) be continuous
on the closed unit disk, then we obtain a normed vector space called the Hardy
space H^∞. It is obvious that some functions f(z) in H^∞ have boundary values
f(e^{jθ}) (those in A, for example). But it is not clear, a priori, that one can talk
about boundary values of an arbitrary function f(z) in H^∞. There is, however,
a way to define f(e^{jθ}) mathematically with the help of measure theory (see
[Hof62]).
If f(e^{jθ}) is a measurable function f : ∂D → ℂ, the norms below are well
defined and give a nonnegative value or infinity:

    ‖f‖_p = ( (1/2π) ∫_0^{2π} |f(e^{jθ})|^p dθ )^{1/p},  1 ≤ p < ∞,
    ‖f‖_∞ = ess sup_θ |f(e^{jθ})|.

The norms above are used to define the normed vector spaces

    L^p = { f : ∂D → ℂ measurable : ‖f‖_p < ∞ },

which are Banach spaces.


The spaces that occur most often correspond to p = 1, 2, ∞. In particular,
the following relation holds:

    ‖f‖_1 ≤ ‖f‖_2 ≤ ‖f‖_∞.

Therefore L^∞ ⊂ L^2 ⊂ L^1.
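On a grid the nesting of the three norms is easy to check; the sample function below is an arbitrary choice of ours:

```python
import numpy as np

theta = 2 * np.pi * np.arange(4096) / 4096
f = np.abs(1.0 + 0.9 * np.exp(1j * theta))     # a sample function on the circle

# Riemann-sum approximations of the L^p norms (normalized measure d(theta)/(2 pi))
norm1 = np.mean(np.abs(f))
norm2 = np.sqrt(np.mean(np.abs(f) ** 2))
norm_inf = np.max(np.abs(f))

print(norm1 <= norm2 <= norm_inf)   # True: the nesting behind L^inf ⊂ L^2 ⊂ L^1
```

The inequalities hold for any function, since the measure is normalized (they follow from the Cauchy-Schwarz inequality).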


The space L^2 is particularly important because it is also a Hilbert space with
inner product

    ⟨f, g⟩ = (1/2π) ∫_0^{2π} f(e^{jθ}) \overline{g(e^{jθ})} dθ.
All functions in L^1 have an associated Fourier series expansion (12.2).²
This fact can be used to define the Hardy spaces H^p,

    H^p = { f ∈ L^p : f_n = 0 for n < 0 },  1 ≤ p ≤ ∞.

Again, functions in H^p can be thought of as functions defined and analytic on
the open disk by means of a Taylor series.
Since H^2 is a closed subspace of L^2, it has an orthogonal complement H^{2⊥}
in L^2. It is easy to check that

    H^{2⊥} = { f ∈ L^2 : f_n = 0 for n ≥ 0 }.

If f = Σ_n f_n e^{jnθ} is in L^2, we define the operators

    P^-[f] = Σ_{n<0} f_n e^{jnθ}   and   P_0^-[f] = Σ_{n≤0} f_n e^{jnθ}.

P^- is the orthogonal projection of L^2 onto H^{2⊥}.
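Both projections are one-liners once the Fourier coefficients are in hand: zero out the unwanted frequencies and transform back. A sketch of ours (the helper name and the test function are arbitrary):

```python
import numpy as np

def project(f_vals, keep):
    """Project samples of f onto the span of frequencies where keep(freq) is True."""
    n = len(f_vals)
    freqs = np.fft.fftfreq(n, d=1.0 / n)        # integer frequencies ..., -2, -1, 0, 1, ...
    coeffs = np.fft.fft(f_vals)
    coeffs[~keep(freqs)] = 0
    return np.fft.ifft(coeffs)

n = 64
theta = 2 * np.pi * np.arange(n) / n
z = np.exp(1j * theta)
f = 3 * z**-2 + 2 * z**-1 + 1 + 5 * z          # frequencies -2, -1, 0, 1

P  = lambda g: project(g, lambda k: k >= 0)    # orthogonal projection onto H^2
Pm = lambda g: project(g, lambda k: k < 0)     # projection onto the complement H^{2 perp}

print(np.allclose(P(f) + Pm(f), f))                # True: the two pieces recover f
print(np.allclose(Pm(f), 3 * z**-2 + 2 * z**-1))   # True: only the negative frequencies
```

The operator P_0^- of the text is obtained the same way with the predicate `k <= 0`.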

12.3 Numerical algorithms for solving OPT


In this book we shall describe two different types of algorithms, which we call
disk iteration and Newton iteration. Both are based directly on the optimality
conditions in Theorem 9.3.1. We first present the Newton iteration algorithm,
which is easier to describe. More will be said in Chapter 15.
As one might guess, special cases of the OPT problem were solved long before
the attack began on the general problem. In particular, solutions to problems
where the sublevel sets of Γ(e^{jθ}, z) in z are disks were given long ago by Nehari
and others and have been implemented on the computer within the last two
²But this expansion may fail to converge to f(e^{jθ}) at some points e^{jθ}.

decades. Thus an effective class of algorithms can be based on iterating solutions


to "disk" problems that approximate the true OPT problem. These are the
disk iteration algorithms referred to above. We postpone their description until
Chapter 15. This allows us to present and derive these algorithms for fairly
general OPT problems.
Now we begin a serious presentation of Newton iteration. Suppose that a
function T : ℝ → ℝ is given. In calculus, students learn that the points x ∈ ℝ
at which T attains a minimum satisfy T′(x) = 0 (i.e., x is a critical point) and
that one can determine excellent candidates for x by finding these critical points.
We will do exactly this in order to find minimizers to OPT, but in a setting
where the element x belongs to a function space and T is a function on this space.
Consequently, our strategy is to write down some equations T′(f) = 0 that
minimizers f* satisfy and then solve the equations with an iterative method.
The most common method for solving nonlinear equations T′(x) = 0 is
Newton's method. It is based on the following simple idea. Say x_k is your
current guess at the solution x*. To update, you want to find h so that T′(x_k +
h) = 0, but since you cannot solve for h exactly, you settle for solving for h in the
linearized equation

    T′(x_k) + T″(x_k) h = 0.    (12.11)

This equation is linear in h and so it is readily solvable for h provided that
the linear map T″(x_k) is invertible. So one solves (12.11) for h and then sets
x_{k+1} = x_k + h.³
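A minimal one-variable sketch of this critical-point Newton iteration (the model function T is an arbitrary choice of ours):

```python
def newton_critical_point(dT, d2T, x0, tol=1e-12, max_iter=50):
    """Newton iteration for a critical point: solve dT(x) = 0.

    At each step the linearized equation dT(x_k) + d2T(x_k) * h = 0
    is solved for the update h, and x_{k+1} = x_k + h.
    """
    x = x0
    for _ in range(max_iter):
        h = -dT(x) / d2T(x)      # the linearized equation, solved for h
        x = x + h
        if abs(h) < tol:
            break
    return x

# Find a critical point of T(x) = x**4 - 3*x**2 + x:
dT = lambda x: 4 * x**3 - 6 * x + 1
d2T = lambda x: 12 * x**2 - 6
x_star = newton_critical_point(dT, d2T, x0=-2.0)
print(abs(dT(x_star)) < 1e-10)   # True: x_star is a critical point of T
```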
One virtue of Newton's method is that it applies to a wide variety of situations,
even when the unknowns belong to function spaces. In particular, it
can be used to solve OPT by iteration, and this is the basis for the method
presented in this section. We treat the scalar case here, but most of what we
say also applies to the case of optimization over vector-valued functions f (see
Chapter 13).
The first thing we do now is to rewrite the winding number condition (ii) of
Theorem 9.3.1 in a more algebraic form. This makes it more suited to a Newton
method solution.
PROPOSITION 12.3.1. In addition to the hypotheses of Theorem 9.3.1, assume
that the function a(e^{jθ}) is smooth. Then conditions (i) and (ii) of
Theorem 9.3.1 are equivalent to the following: There exist γ ∈ ℝ, f ∈ A,
F ∈ A, and ψ a smooth, positive-valued function on the circle such that

    Γ(e^{jθ}, f(e^{jθ})) = γ   and   a(e^{jθ}) ψ(e^{jθ}) = e^{jθ} F(e^{jθ}).    (12.12)

Proof. Suppose that f* is such that conditions (i) and (ii) of Theorem
9.3.1 are satisfied. From (i) we immediately get the existence of the constant
³Often x_{k+1} = x_k + t·h is used, where t > 0 is a parameter chosen according to a
prespecified criterion. The process of finding t is called linesearch.

γ in (12.12). If wind(a, 0) = n, we have that the smooth function b(e^{jθ}) =
a(e^{jθ}) e^{-jnθ} has winding number 0 about 0. By Theorem 10.3.1 there exists an
analytic function F_1 such that phase(b) = phase(F_1). Another way to describe
this relation is

    b(e^{jθ}) ψ(e^{jθ}) = F_1(e^{jθ}),  where ψ := |F_1|/|b| is smooth and positive.

Thus by the definition of b we obtain

    a(e^{jθ}) e^{-jnθ} ψ(e^{jθ}) = F_1(e^{jθ}),

i.e.,

    a(e^{jθ}) ψ(e^{jθ}) = e^{jθ} (e^{jθ})^{n-1} F_1(e^{jθ}).

Now set F = (e^{jθ})^{n-1} F_1(e^{jθ}). Note that since by hypothesis
n ≥ 1, we have that F ∈ A. This proves half of the proposition. The other
direction is obvious.
We now try to solve (12.12). The unknowns in equation (12.12) are the
functions ψ, F, and f and the scalar γ. The difficulty with applying Newton's
method to equation (12.12) is that, in a sense, it has more unknown functions
(four) than constraints (two). This means that local solutions are not isolated;
rather, there is a whole continuum of them. Newton's method is well known for
its bad behavior in such circumstances, because in the linearization (12.11) the
differential T″(x_k) will not be invertible. Consequently, rather than working with
(12.12) directly, we modify it first to obtain a problem with only two unknowns.
This is the key step in our method of finding solutions to (12.12).
We proceed with formal calculations for a moment. Begin by eliminating the
indeterminacy in (12.12) by assuming ψ ∈ L¹ is such that (1/2π) ∫_0^{2π} ψ dθ = 1. Note
that P^-[Γ(·, f)] = 0 since Γ(·, f) is a constant function and that P_0^-[e^{jθ}F] = 0
(provided F ∈ H²). Thus we obtain the system

    P^-[Γ(·, f)] = 0,    P_0^-[ a(·, f) ψ ] = 0,    (12.16)

in two unknowns, ψ and f. Because of the way ψ is normalized, one can write

    ψ = 1 + 2 Re(e^{jθ} β),    (12.17)

where β is a scalar-valued function that is analytic on the disk. Now, substituting
(12.17) into (12.16) gives an operator equation of the form

    𝒯(f, β) = 0.    (12.18)

At this point the basic idea is clear: to find solutions to the optimality
conditions (i) and (ii) of Theorem 9.3.1, it is enough to solve equation (12.18).
Thus we state the following.

Critical point problem. Find analytic functions f*, β* with 1 + 2 Re(e^{jθ} β*)
positive on the circle such that (f*, β*) is a solution to equation (12.18).

We have found that Newton's method applied to the critical point problem
is extremely effective.

Newton's method. Given (f, β), solve the equation

    𝒯(f, β) + 𝒯′_{(f,β)}(ε, η) = 0

for ε, η and produce an update

    f ← f + ε,   β ← β + η.

A standard fact about Newton iteration is that when the linearizations 𝒯′_{(f,β)}
are uniformly invertible, the iterates have an excellent convergence property
called second-order convergence. This is analyzed in [HMW93], where the
linearizations are shown to be invertible and consequently our algorithm is second-order
convergent. It requires quite a bit of functional analysis to find the correct
setting and estimates. While it is well beyond the scope of our presentation here,
we give the formula for 𝒯′ in Chapter 15.
The optimization problem of approximating, in the norm of the supremum, a
given continuous function on the circle by functions in A is a primitive version
of the problem OPT considered in this part of the book. As such it was treated
first in 1949 by Nehari, who proved that the distance is the largest singular value
of a certain Hankel operator. Closely related work goes back to Caratheodory-
Fejer and Pick in the early 1900s. The history of work on this problem is vast;
an account appears in Appendix A.
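Nehari's formula can be exercised numerically: build a (truncated) Hankel matrix from the negative Fourier coefficients of the symbol and take its largest singular value. The sketch below is our own illustration; the symbol g(e^{jθ}) = 1/(e^{jθ} - 0.5) and the truncation order are arbitrary choices. For this g the Hankel matrix is rank one and the distance to H^∞ comes out to 4/3:

```python
import numpy as np

# Symbol g(e^{j theta}) = 1/(e^{j theta} - 0.5): it has one pole inside the disk,
# so it is genuinely far from H^infty; its negative Fourier coefficients are
# ghat(-k) = 0.5**(k-1) for k >= 1.
n = 512
theta = 2 * np.pi * np.arange(n) / n
z = np.exp(1j * theta)
g = 1.0 / (z - 0.5)

ghat = np.fft.fft(g) / n                          # Fourier coefficients (indices mod n)
neg = lambda k: ghat[-k]                          # ghat(-k) for k = 1, 2, ...

m = 40                                            # truncation order of the Hankel matrix
H = np.array([[neg(i + j + 1) for j in range(m)] for i in range(m)])

dist = np.linalg.svd(H, compute_uv=False)[0]      # largest singular value
print(round(float(dist), 4))                      # 1.3333, i.e. 4/3
```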
The Newton-type algorithm in this section is from [HMW93]. Disk iteration
type algorithms for solving OPT and their origins will be discussed in Chapter
15.
Part IV
H∞ Theory: Vector Case
The control design methods presented in Parts I and II of this book extend
to multiple-input, multiple-output (MIMO) systems. The most challenging
problems that arise in MIMO design are optimization problems such as
OPT, but for MIMO design one optimizes over N-tuples¹ of analytic functions
f = (f_1, f_2, ..., f_N) instead of over a single (scalar) analytic function. Fortunately
much is known about this optimization problem.

In this book we do not lay out MIMO control theory, but we do sketch the
theory that can be used to numerically solve the H∞ optimization problem that
arises. That is the subject of the remainder of the book.

¹These correspond to N designable single-input, single-output (SISO) subsystems.
Chapter 13

Many Analytic Functions:


Optimization for MIMO
System Design

In this chapter we begin the study of H∞ optimization over N-tuples of analytic
functions f = (f_1, f_2, ..., f_N), which generalizes the OPT theory of optimization
over a single scalar analytic function.
We describe two types of computer algorithms based on this theory. These
are disk iteration and the Newton algorithm, introduced in section 12.3. In this
chapter and in the remainder of this part we give more generality, detail, and
analysis. Variations (large ones) on these algorithms are implemented by various
authors in various programs (see the preface). One example is our program
Anopt which implements both algorithms and will be used for illustrations in
this chapter.
To do MIMO control one needs to be able to handle interpolation constraints
on matrix-valued analytic functions and produce parameterizations of the type

discussed for the scalar case in Chapter 7. Such parameterizations are well
understood in the MIMO case and reported in many references (e.g., [BGR90],
[FF91], [H87]). We do not discuss interpolation constraints for matrix-valued
analytic functions in this book, so for details on this we refer the reader to the
works cited above.

13.1 The OPT problem


We denote by A_N the (real) vector space of all N-tuples f = (f_1, ..., f_N), where
each f_ℓ is in A. The space A_N is a complete normed space (or Banach space),

equipped with the norm

    ‖f‖ = sup_θ ‖f(e^{jθ})‖_N,

where ‖z‖_N represents the euclidean norm on ℂ^N; that is, for z = (z_1, ..., z_N),
we have

    ‖z‖_N = ( |z_1|² + ··· + |z_N|² )^{1/2}.
A performance function is a nonnegative-valued function Γ on ∂D × ℂ^N. We
will use the notations Γ(e^{jθ}, z) and Γ(e^{jθ}, z_1, ..., z_N) to denote the same object.
The optimization problem OPT over the set A_N is

OPT  Given a performance function Γ(e^{jθ}, z), find
γ* ≥ 0 and f* in A_N (if it exists) such that

    γ* = inf_{f ∈ A_N} sup_θ Γ(e^{jθ}, f(e^{jθ})) = sup_θ Γ(e^{jθ}, f*(e^{jθ})).
The function f* in the statement of OPT is called a solution or minimizer.

The theoretical study of OPT with the most practical implications is the
characterization of local solutions. We already saw in the study of the scalar
case (a single analytic function f; cf. Chapter 12) how conditions for optimality
lead quickly to practical tests for optimality, and to algorithms. The situation
in the vector case (many analytic functions f_1, ..., f_N) is quite similar, and this
will become apparent in the following sections.

13.2 Solving OPT on the computer


The theory and algorithms of OPT are mature enough to permit the use of
computers to obtain approximate solutions in well-behaved cases as well as
diagnostics with few keystrokes.
Consider the performance function

    Γ(e^{jθ}, z_1, z_2) = | 1/(e^{jθ} + 0.5) - z_1 |² + ( 5 + Re(1/(2 + e^{jθ})) ) | e^{-jθ} + z_2 |² + 2 ( Im(z_1 z_2) )².

The following is the input the user types in a computer session with the package
Anopt in order to solve OPT.

    <<Anopt`

    g = Abs[1/(e + 0.5) - z[1]]^2 +
        (5 + Re[1/(2 + e)]) Abs[1/e + z[2]]^2 +
        2 Im[z[1] z[2]]^2;

    Anopt[g, 0.000001];

In the session above the input is the performance "g" and the number 0.000001,
which is an error tolerance for stopping calculations. The output is shown in
Table 13.1.

Table 13.1. Output of computer run.

The meanings of the column headings are as follows: "Iter" is the iteration
number, "Current Value" is the current value of sup_θ Γ(e^{jθ}, f(e^{jθ})), "Step"
is the distance between the current approximation f to a solution and the
previous one, "Flat" and "GrAlign" are optimality tests, "ned" is a diagnostic
of numerical error, "Sm." reports automatic smoothing, and "Grid" is the
current number of grid points used in calculations. The analytic function that
is the calculated solution is also an output, but it is not printed. See Appendix D.

13.3 Optimality conditions for solutions to OPT


As in the scalar case (Theorem 9.3.1), diagnostics and computer algorithms
in the vector case are based on a theorem characterizing solutions to OPT.
We now present a result that gives two necessary conditions for a solution.
These two conditions already give enough information to derive an algorithm
(see [HMW93]). Additional conditions that ensure sufficiency will be presented in
section 16.1.

So far we have mentioned "solutions f* in A_N to OPT" that are global
solutions. However, in the study of conditions for optimality one must consider
local solutions. This is needed because when N > 1 there is no uniqueness of
solutions.

A local solution f ∈ A_N is a solution to OPT in some neighborhood V ⊂ A_N
of f. It is a strict local solution if it is the only solution in some neighborhood.
A function h ∈ A_N is a descent direction relative to f ∈ A_N if for all t > 0
small enough,

    sup_θ Γ(e^{jθ}, f(e^{jθ}) + t h(e^{jθ})) < sup_θ Γ(e^{jθ}, f(e^{jθ})).

The function f is a directional solution or directional minimizer if there are
no descent directions relative to f. Of course, every strict local solution is a
directional solution.

Recall that Γ is a function of z where z = (z_1, ..., z_N), so we can take partial
derivatives such as ∂Γ/∂z_ℓ and ∂Γ/∂z̄_ℓ.
THEOREM 13.3.1. Let Γ be of class C³ and let f* = (f_1, ..., f_N) in A_N be
such that

    ( (∂Γ/∂z̄_1)(e^{jθ}, f*(e^{jθ})), ..., (∂Γ/∂z̄_N)(e^{jθ}, f*(e^{jθ})) )

is not the zero vector for any θ.

If the function f* is a local solution to OPT, then

I. Γ(e^{jθ}, f*(e^{jθ})) is a constant function of e^{jθ}.  (Flatness)

II. There exist F = (F_1, ..., F_N) with F_ℓ analytic on the unit disk D and
integrable boundary values F_ℓ(e^{jθ}), and there exists an integrable, positive-valued
function ψ(e^{jθ}) with (1/2π) ∫_0^{2π} ψ dθ = 1, such that

    (∂Γ/∂z̄_ℓ)(e^{jθ}, f*(e^{jθ})) ψ(e^{jθ}) = e^{jθ} F_ℓ(e^{jθ}),  ℓ = 1, ..., N.  (Gradient alignment)

The reader might like to compare this to Theorem 9.3.1 when N = 1, but
the correspondence is not obvious because the gradient alignment condition
looks different from the condition on the winding number in Theorem 9.3.1.
Theorem 13.3.1 is written in what could be called primal-dual form. One can
apply Theorem 13.3.1 productively without understanding this connection, so
we move on and do the applications in this chapter.

Later, in section 14.4, we prove that the winding number and primal-dual
formulations are equivalent. Primal-dual refers to the fact that f is the unknown
in the original, i.e., "primal," problem and ψ has the interpretation of the
unknown in the "dual" problem; Part V is devoted to this subject.

Functions f⁰ ∈ A_N that satisfy the flatness and gradient alignment conditions
are good candidates for a local solution to OPT, but they may not be a
local solution. We illustrate this in the next section.

13.4 An example
We now give a performance function Γ and a function f⁰ that satisfies the
flatness and gradient alignment conditions, but f⁰ is not a local solution.

Set f⁰(e^{jθ}) := (0, 0). The performance function we consider is

We now prove that f⁰ satisfies the flatness and gradient alignment conditions
of Theorem 13.3.1.

The flatness condition holds since

To check the gradient alignment condition we begin by calculating partial
derivatives:

Hence,

Note that

so the gradient alignment condition holds with ψ(e^{jθ}) = 100 |e^{jθ} + 0.2|² and
F_ℓ(e^{jθ}) = e^{jθ}/(1 + 0.2 e^{jθ}), ℓ = 1, 2.

We now prove that f⁰ is not a directional minimizer for OPT (thus it is
not a local solution either). To do this, consider the function h⁰(e^{jθ}) = (1, -1).
Then, for t ∈ ℝ we have

It is clear from (13.5) that

which is what we wanted to prove.

13.5 Computer diagnostics from optimality conditions

Diagnostics and computer algorithms for the vector case follow immediately
from the previous two sections and are a direct generalization of those for the
scalar case found in Chapter 12.

The flatness condition of Theorem 13.3.1 can easily be implemented on a
computer by calculating

To test the gradient alignment condition directly one also needs the functions
ψ and F, which do not appear in the statement of the problem OPT. Thus
it is desirable to have a test of the gradient alignment condition that does not

involve ψ and F. The following conjecture is very useful in obtaining such a
test.

CONJECTURE 13.5.1. Under the hypotheses of Theorem 13.3.1, for generic
Γ, the gradient alignment condition of Theorem 13.3.1 is equivalent to

    Set a_m := (∂Γ/∂z̄_m)(·, f*). Pick m_0 ∈ {1, ..., N} such that a_{m_0} is not
    the function 0. Then for each m ∈ {1, ..., N} the function a_m/a_{m_0} is
    analytic in the disk.

It is easy to check on the computer when this should apply. Thus the conjecture
above allows one to check (for most performance functions) the gradient
alignment condition of Theorem 13.3.1 by testing whether each a_m/a_{m_0} is analytic,
provided N > 1. When N = 1 this test does not apply and one must use the
winding number test of Chapter 12.
To implement the test for the gradient alignment condition, consider the
Fourier expansions of the functions a_m/a_{m_0} and form GrAlign(f) from the
magnitudes of their negative Fourier coefficients, which are calculated on the
computer with the fast Fourier transform. At an optimum both Flat and GrAlign
should equal zero to nearly machine precision. The program Anopt prints these
values out at each iteration to monitor how the approximate solutions are
progressing. If f is near the solution, we must have Flat(f) ≈ 0 and GrAlign(f) ≈ 0.
Computer output showing the Flat and GrAlign tests was presented in section 13.2.
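A sketch of such an implementation (ours, not Anopt's code; the names and thresholds are arbitrary). For each ratio a_m/a_{m_0} it sums the energy in the negative Fourier coefficients, which should be essentially zero when the gradient alignment condition holds:

```python
import numpy as np

def gr_align(a_list):
    """Vector-case gradient-alignment diagnostic (a sketch).

    a_list: samples of a_m at the current iterate, one row per m.  Tests whether
    each ratio a_m / a_{m0} is analytic by summing the energy in its negative
    Fourier coefficients; at a solution this total should be ~ 0.
    """
    a = np.asarray(a_list)
    m0 = np.argmax(np.max(np.abs(a), axis=1))     # a reference row that is not ~ 0
    n = a.shape[1]
    total = 0.0
    for row in a:
        ratio_hat = np.fft.fft(row / a[m0]) / n
        total += np.sum(np.abs(ratio_hat[n // 2:]) ** 2)   # negative-frequency bins
    return total

theta = 2 * np.pi * np.arange(256) / 256
z = np.exp(1j * theta)

aligned = [(1 + 0.3 * z) * z, (1 + 0.3 * z) * (2 + z)]   # every ratio is analytic
print(gr_align(aligned) < 1e-20)                          # True

misaligned = [z, np.conj(z)]                              # ratio z-bar/z = z^{-2}
print(gr_align(misaligned) > 0.5)                         # True
```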

13.6 Algorithms for OPT

The algorithms for OPT discussed in this book are instances of the following
general scheme for solving nonlinear problems iteratively.

General iteration scheme

1. Test f for optimality. If optimal, then stop.

2. Solve a subproblem to obtain an update direction h.

3. Perform a search on f + th with t > 0 to find a t = t_0 that satisfies a
preset criterion.

4. Set f = f + t_0 h, and go to step 1.

The Flat and GrAlign tests for optimality we have developed are used
in step 1. Another stopping criterion one may want to include in particular
implementations is "lack of sufficient progress" (however you want to define
this). A typical step 2 consists of replacing the performance function with a
"model," a new performance that is based on the original function but easier
to deal with. Step 3 is called linesearch. One obvious example of a linesearch
criterion is that t minimizes sup_θ Γ(e^{jθ}, f(e^{jθ}) + t h(e^{jθ})).
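A crude version of that linesearch criterion, minimizing the sup over a grid of t values; the toy performance function and the direction below are arbitrary choices of ours:

```python
import numpy as np

def linesearch(perf_sup, f, h, ts=np.linspace(0.0, 1.0, 101)):
    """Grid linesearch: pick the t minimizing sup_theta Gamma(., f + t h)."""
    vals = [perf_sup(f + t * h) for t in ts]
    return ts[int(np.argmin(vals))]

# A toy sup-type objective evaluated on a grid of circle points:
theta = 2 * np.pi * np.arange(256) / 256
z = np.exp(1j * theta)
perf_sup = lambda f: np.max(np.abs(1.0 / (z + 2.0) - f))   # sup |1/(e^{j t}+2) - f|

f0 = np.zeros_like(z)
h = np.full_like(z, 0.5)          # a candidate descent direction (constant function)
t0 = linesearch(perf_sup, f0, h)
print(perf_sup(f0 + t0 * h) <= perf_sup(f0))   # True: the step does not increase sup
```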
The two algorithms discussed later in this book (see Chapter 15) are Newton
iteration and disk iteration. Both are based on models of the performance
obtained by taking some terms of the Taylor expansion of Γ(e^{jθ}, z) about
z = f, the current guess.
We now address what we have found to be the most effective means of
evaluating the performance of a computer algorithm. The objective is to have a
theoretical test that can be done with pencil and paper at the same time that
you are developing an algorithm. This is tremendously helpful in discovering
computer algorithms. The reason is that the algorithm itself is presented
analytically, and having an analytic means of evaluation allows one to see how to
make changes that improve the performance of the algorithm. What we recommend
as the leading indicator of success of an algorithm is actually very conventional:
the order of convergence, defined in the next paragraph.
A sequence of vectors x_n in a vector space with norm ‖·‖ that converges to
x* is said to be order p convergent if there exists C > 0 such that

    ‖x_{n+1} - x*‖ ≤ C ‖x_n - x*‖^p.    (EST)

The larger p is, the faster the sequence converges. An algorithm that generates
sequences will be called order p convergent provided the sequences it generates
are at least order p convergent. Here C is a number whose size influences the
error estimate (EST) less than p does. When p = 1 we need C < 1 to guarantee
improvement from one iteration to the next.
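The exponent p can be estimated from three consecutive errors via p ≈ log(e_{k+1}/e_k) / log(e_k/e_{k-1}); a sketch of ours:

```python
import numpy as np

def estimated_order(e):
    """Estimate p in ||x_{n+1}-x*|| ~ C ||x_n-x*||^p from three consecutive errors."""
    e0, e1, e2 = e
    return np.log(e2 / e1) / np.log(e1 / e0)

# Newton's method on x**2 - 2 = 0: the error roughly squares at each step.
x, root, errs = 3.0, np.sqrt(2.0), []
for _ in range(5):
    x = x - (x * x - 2.0) / (2.0 * x)
    errs.append(abs(x - root))
print(estimated_order(errs[1:4]))               # close to 2: second-order convergence

# Plain halving e_k = (1/2)^k is order 1 (with C = 1/2):
print(estimated_order([0.25, 0.125, 0.0625]))   # 1.0: first-order convergence
```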
A summary of the performance of the algorithms presented in this book is as
follows:

    Method             Case     Convergence
    Disk iteration     N = 1    order 2
                       N > 1    order 1
    Newton iteration   all N    order 2
Some technical points:

Convergence of the disk iteration algorithms has not been established.
However, one can show that if an iteration converges, then the limit function f*
must satisfy the flatness and gradient alignment optimality conditions.
We conjecture that if disk iteration with a linesearch converges to f*, then
f* is a local solution of OPT. Then, using arguments in [HMer93b], one can
easily give estimates for the order of convergence of various disk iteration
algorithms. We reemphasize that one obtains this order-of-convergence
estimate without actually having proved that the iterates of the algorithm
converge.
The difficulty with establishing convergence is that in the function spaces
over which the optimization is performed, bounded sets are not necessarily
compact. In such spaces the distance between consecutive elements of a
sequence may decrease to 0 and the sequence may still not converge. In
summary, our experience indicates that formal order of convergence is
easier to analyze and of much more practical importance in predicting
what you see on a computer than actual convergence itself.
We have not specified which norm we are using in section 13.6. The details
are rather complicated and do not contribute to our exposition, so we leave
them out.
Newton iteration may converge to an f* that is not a local minimum, but
a sort of critical point. See [HMer91], [HMer93a].

The main theorem in this chapter is from [HMW93] and strengthens the
theorem in [H86]. Independently a special case was proved and put to very
good use in pure mathematics by Lempert [Le86].
Chapter 14

Coordinate Descent
Approaches to OPT

In standard ℝⁿ optimization of nonlinear functions it is a natural idea, and
fairly common in folklore, to reduce the original problem to a sequence of one-
dimensional problems through the technique one might call coordinate descent.
As we shall see, the analog of this technique is seriously flawed in our setting.
The explanation is an obvious consequence of condition II of Theorem 14.2.2.

The idea behind coordinate descent is easy to understand once it is described
in the very special case where one minimizes a real-valued function g on ℝ²:

    Start at (x_0, y_0) ∈ ℝ² and descend in the x direction; that is, follow
    the line (x, y_0) and stop at (x_1, y_0), the minimum of g(·, y_0) in
    the x direction. Now fix x_1 and do the y descent; that is, minimize
    g(x_1, y) to obtain (x_1, y_1). Now fix y_1 and do the x descent. Keep
    it up until the process stops (slows to a crawl) at (x*, y*).

14.1 An example in which the coordinate descent method fails

The failure of coordinate descent in H∞ optimization is very intuitive when
thought of graphically. It is easy to see for a performance function that is not
smooth that coordinate descent tends to stop in corners, possibly far short
of a local optimum. This is best illustrated in ℝ² by the problem of finding a
minimizer for the function

on the quadrant {x ≥ 0, y ≥ 0}, which is a sloped trough whose bottom crease
is the line x = y. Start at (x_0, y_0) and descend in the x direction; that is,
follow the line (x, y_0) and stop at (x_1, y_0) = (y_0, y_0), the minimum in the


x direction. That point lies on the crease. Now do the y descent; you never get
off the crease!¹ (See Fig. 14.1.)

Fig. 14.1. Trough showing the step from (x_0, y_0) to (x_1, y_0).
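The exact trough formula is not reproduced in this copy, so the sketch below (our own) uses a representative trough g(x, y) = 2|x - y| + (x + y)/2, which has the same geometry: its bottom crease is the line x = y and it slopes down toward the origin along the crease. Coordinate descent stalls on the crease after one sweep:

```python
def g(x, y):
    # A representative sloped trough on the quadrant x, y >= 0 whose bottom
    # crease is the line x = y (not necessarily the book's exact formula).
    return 2.0 * abs(x - y) + 0.5 * (x + y)

def argmin_1d(h, lo=0.0, hi=4.0, n=4001):
    # Exhaustive one-dimensional minimization on a grid.
    pts = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
    return min(pts, key=h)

# Coordinate descent from (x0, y0) = (3, 1):
x, y = 3.0, 1.0
for _ in range(10):
    x = argmin_1d(lambda s: g(s, y))   # descend in x: lands on the crease x = y
    y = argmin_1d(lambda s: g(x, s))   # descend in y: cannot leave the crease
print((x, y))          # (1.0, 1.0): stuck on the crease ...
print(g(x, y))         # ... with value 1.0, while the true infimum is g(0, 0) = 0
```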

This example is so dramatic because ℝ² is of low dimension relative to the
crease. H∞ optimization is comparatively mysterious because, while the sup
norm certainly is not smooth, so that OPT is a nonsmooth problem, it sits on
an infinite-dimensional space, and we cannot tell whether there are "many" creases
or not. Thus in practice hitting a crease might be rare.

For OPT problems, the flatness and gradient alignment optimality conditions
of Theorem 14.2.2 below strongly suggest that in fact hitting a crease is generic
behavior. Second, numerical experiments like those below overwhelmingly
support this.

14.2 Coordinate descent over H∞

This section presents a simple (practical) test, Theorem 14.2.2, to determine
when the natural coordinate descent algorithm for "solving" OPT stops.
Fortunately, this stopping criterion compares directly to Theorem 13.3.1, which
tells us when we are at a true optimum of OPT. As we shall see, the conditions for
coordinate descent to stop appear so much weaker than those for OPT to have a true
solution that we make the following conjecture.

CONJECTURE 14.2.1. For generic Γ, coordinate descent does not obtain a
true local optimum. In other words, with probability 1 coordinate descent gives
the wrong answer.

¹This is easy to check, and we leave the details to the reader. It is not necessarily true
for an arbitrary trough.

We start by defining the H∞ coordinate descent (CD) algorithm; for
simplicity of presentation we state the case N = 2 only. The idea will be no
surprise.

Given Γ : ∂D × ℂ² → ℝ₊ and f⁰ = (f⁰_1, f⁰_2) ∈ A_2, an update f¹ ∈ A_2 is
obtained in the following way.

    CD1:  f¹_1 minimizes g ↦ sup_θ Γ(e^{jθ}, g(e^{jθ}), f⁰_2(e^{jθ})) over g ∈ A,
    CD2:  f¹_2 minimizes g ↦ sup_θ Γ(e^{jθ}, f¹_1(e^{jθ}), g(e^{jθ})) over g ∈ A.

Start now at (f¹_1, f¹_2) and repeat.

To analyze the CD algorithm we write down the mathematical statement
of what it means for the algorithm to stop. We call (f_1*, f_2*) = f* ∈ A_2 a
coordinate descent solution to OPT provided

    CDS(1):  f_1* minimizes g ↦ sup_θ Γ(e^{jθ}, g(e^{jθ}), f_2*(e^{jθ})),
    CDS(2):  f_2* minimizes g ↦ sup_θ Γ(e^{jθ}, f_1*(e^{jθ}), g(e^{jθ}))

hold locally. Clearly any local solution to OPT is a coordinate descent solution
to OPT.
Next observe that CDS(1) is a standard OPT problem with N = 1, as is
CDS(2). If we invoke the N = 1 case of Theorem 13.3.1, we immediately get
the following.

THEOREM 14.2.2. Let Γ be of class C³ and f* = (f_1, f_2) in A_2 be such that
neither entry of ( (∂Γ/∂z̄_1)(·, f*), (∂Γ/∂z̄_2)(·, f*) ) is ever equal to 0. The function f* is a
coordinate descent solution to OPT if and only if

I. The function Γ(e^{jθ}, f*(e^{jθ})) is constant in e^{jθ}.  (Flatness)

II. There exist F = (F_1, F_2) in the Hardy space H¹_2 and two nonnegative
measurable functions ψ_1, ψ_2 of e^{jθ} ∈ ∂D such that (1/2π) ∫_0^{2π} ψ_ℓ dθ = 1 and

    (∂Γ/∂z̄_ℓ)(e^{jθ}, f*(e^{jθ})) ψ_ℓ(e^{jθ}) = e^{jθ} F_ℓ(e^{jθ}),  ℓ = 1, 2.  (Gradient alignment)
The key issue is: does CD give local solutions to OPT?

To answer this we compare condition II of Theorem 14.2.2, which asserts
that two seemingly unrelated functions ψ_1 and ψ_2 exist when f* is a coordinate
descent solution to OPT, with the condition that ψ_1 must equal ψ_2 required
of true local solutions to OPT (according to Theorem 13.3.1). Now, given two
functions ψ_1 and ψ_2, the probability that they are equal is 0. Thus if indeed
there is no hidden relationship between the functions ψ_1 and ψ_2, the conjecture
is true.

14.3 Experimental evidence


Further evidence that condition II of Theorem 13.3.1 and condition II of The-
orem 14.2.2 are worlds apart is supplied by simple computer experiments. The
first cases we tried are reported below.
Example 1. Let

When coordinate descent is initialized at f⁰(e^{jθ}) = (0, 0), it stops at f^c(e^{jθ}) =

with value 2.54423. This value is produced at step CD1. But
Theorem 13.3.1 can be used to prove that f^c is not a local optimizer for OPT.
In fact γ* = 2.09101 is the (unique) optimal value, attained at
Example 2. Let

The optimal value in OPT is γ* = 5.28606, while coordinate descent, initialized
at f⁰ = (0, 0), completes CD1 and then comes to a complete halt right at CD2,
giving the value 7.14102. Thus in this example there is approximately 35% error in
the answer.

14.4 Another perspective: Winding numbers versus primal-dual

The key gradient alignment condition of Theorem 14.2.2 is equivalent to
winding number conditions like those used in stating Theorem 9.3.1. We now
explain this and, as previously promised, explain how the "primal-dual" formulation
in Theorem 13.3.1 looks in terms of winding numbers, thereby connecting
it with Theorem 9.3.1.

The link between the primal-dual and winding number formulations comes
from the following lemma about scalar-valued functions.

LEMMA 14.4.1. Suppose a(·) is a smooth, complex-valued function of e^{jθ}
that never vanishes. Then wind(a, 0) ≥ n if and only if there is a smooth,
positive function ψ of e^{jθ} and a smooth function F in A satisfying

    a(e^{jθ}) ψ(e^{jθ}) = e^{jnθ} F(e^{jθ}).    (14.1)

Proof. Suppose that (14.1) holds. Since ψ is positive and real, wind(aψ, 0) =
wind(a, 0). Thus

    wind(a, 0) = wind(e^{jnθ} F, 0) = n + wind(F, 0) ≥ n.

The converse follows from Theorem 10.3.1 applied to the function b = a e^{-jnθ},
whose winding number is 0. That theorem tells us that there is an analytic phase
function F satisfying (14.1).
Immediately this gives the result that the winding number condition of
Theorem 9.3.1 is equivalent to

    a(e^{jθ}) ψ(e^{jθ}) = e^{jθ} F(e^{jθ}),

which is exactly the gradient alignment condition of Theorem 13.3.1 when
N = 1.

When N > 1 we analyze coordinate descent. Lemma 14.4.1 rephrases the
gradient alignment and flatness conditions of Theorem 14.2.2 as wind(a_ℓ, 0) ≥ 1
for each 1 ≤ ℓ ≤ N.
This leaves the ψ_ℓ = ψ_m condition unanalyzed. We begin analyzing it by
presenting a winding number viewpoint, valid for rational functions a(e^{jθ}). The
winding number of a nonvanishing rational function a on ∂D is

    wind(a, 0) = #zeros(a) - #poles(a)

inside D. Suppose the functions a_ℓ(e^{jθ}) := (∂Γ/∂z̄_ℓ)(·, f*(·)) are rational and
express

    a_ℓ = p_ℓ / q_ℓ,

where p_ℓ and q_ℓ are coprime (have no common zeros) for ℓ = 1, ..., N. The
following is shown in [H86].
PROPOSITION 14.4.2. The ψ_ℓ = ψ_m condition for OPT over A_N is equivalent
to the statement:

    The integer i(a), defined as the number of zeros that the greatest
    common divisor of p_1, ..., p_N has inside the disk minus the number
    of zeros of the least common multiple of q_1, ..., q_N, is strictly greater
    than zero.

Roughly speaking, i(a) is the number of common zeros minus the total number
of poles (inside the disk) of the a_ℓ. Now we need the following result.

THEOREM 14.4.3. When N = 1, we have that for generic Γ, if f* ∈ A is a
solution to OPT such that a(·) := (∂Γ/∂z̄)(·, f*(·)) is never 0, then wind(a, 0) = 1.

Combine Proposition 14.4.2 with Theorem 14.4.3 to obtain very strong
conclusions. For example, this leads us to suspect that generically, if the a_ℓ have no
common poles inside D, then all zeros of the a_ℓ are common.² This does not
prove that generically all zeros are common, because the generic N = 1 behavior
may not hold at an N > 1 optimum.

The coordinate descent presentation in this chapter is drawn mainly from
[HMer93a].

²To do the counting use

    wind(a_ℓ, 0) = #zeros(a_ℓ) - #poles(a_ℓ)

inside D. The last line follows from

We summarize with

    #common zeros(a_ℓ) > N (#common zeros(a_ℓ) - 1).

This implies that #common zeros(a_ℓ) cannot be greater than 1 because of the strict inequality
and the fact that N > 1. Also the condition #common zeros(a_ℓ) > #poles(a_ℓ)
prevents it from equaling 0. Thus #common zeros(a_ℓ) = 1, which is what we want to prove.
Chapter 15

More Numerical Algorithms

This chapter is one of the more theoretical parts of the book, which is ironic
in that it derives the formulas that are used in our computer algorithms. The
authors are a bit iconoclastic in that they see developing computer algorithms
of this type as almost an exercise in pure mathematics. Indeed Appendix B
describes pure mathematics implications of this and previous chapters. A design
engineer can skip this chapter. An engineer who develops optimization methods
might find it interesting.
In this chapter we discuss in some detail two algorithms for solving OPT:
disk iteration and Newton iteration. Sections 15.2 and 15.3 present disk iteration. Section 15.4 shows how our Newton-type method presented in Chapter
12 for one function f generalizes to N functions f₁, f₂, …, f_N. Section 15.5
compares numerical properties of the methods; they turn out to have complementary advantages. The last section derives the formula, due to Nehari and
others, which is the core of our disk iteration method. As we shall see, this
formula follows directly from our main optimality theorem, Theorem 13.3.1.

15.1 Notation
We define for 1 ≤ p ≤ ∞ the spaces L^p_N whose elements are N-tuples of Lebesgue
measurable functions (f₁, …, f_N) and whose norms are defined in terms of the
Euclidean norm ‖·‖_N on C^N by

‖f‖_{L^p_N} = ( (1/2π) ∫₀^{2π} ‖f(e^{jθ})‖_N^p dθ )^{1/p}

(with the usual essential supremum when p = ∞). Functions f ∈ L^p_N have an associated Fourier series expansion

f(e^{jθ}) ~ Σ_{ℓ=−∞}^{∞} f_ℓ e^{jℓθ},

where the Fourier coefficients f_ℓ are vectors in C^N. The space H^p_N consists of
f ∈ L^p_N that have f_ℓ = 0 for ℓ < 0. Functions f in H^p_N extend to an analytic
function F(z) on the unit disk D, so that f(e^{jθ}) is the nontangential limit of
F(z) for almost every θ. See [Hof62].
The orthogonal complement in L²_N of a subspace S is denoted S^⊥. If k is
an N × M-matrix-valued function on the unit circle, the Hankel operator with
symbol k is the operator H_k : H²_M → (H²_N)^⊥ with action a ↦ P_{(H²_N)^⊥}[ka]. The H²_M → H²_N
Toeplitz operator with symbol k is the operator T_k : H²_M → H²_N with action
a ↦ P_{H²_N}[ka].
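On a grid of n equally spaced points these projections and operator actions can be realized with the FFT. A sketch for scalar symbols (N = M = 1), assuming the usual convention that the nonnegative-frequency Fourier bins represent H²:

```python
import numpy as np

n = 256
theta = 2 * np.pi * np.arange(n) / n
z = np.exp(1j * theta)

def proj_H2(f):
    """Orthogonal projection onto H^2: keep Fourier coefficients of z^m, m >= 0."""
    c = np.fft.fft(f)
    c[n // 2:] = 0          # in np.fft.fft, bins n//2..n-1 hold negative frequencies
    return np.fft.ifft(c)

def hankel(k, a):
    """Action a -> P_{H^2 perp}[k a] of the Hankel operator with symbol k."""
    ka = k * a
    return ka - proj_H2(ka)

def toeplitz(k, a):
    """Action a -> P_{H^2}[k a] of the Toeplitz operator with symbol k."""
    return proj_H2(k * a)

a = 1.0 + z                                              # an element of H^2
print(np.allclose(hankel(z**2, a), 0))                   # analytic symbol: Hankel vanishes
print(np.allclose(toeplitz(np.conj(z), a), np.ones(n)))  # T_{z^{-1}}(1 + z) = 1
```

Each operator application costs a few FFTs, which is the source of the operation counts quoted later in this chapter.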
15.2 Nehari's problem: Perspective on OPT from a classical viewpoint
The math problems treated and used in this book long precede engineering
application. Some go back to the beginning of this century. A variant that is
close to the spirit of this book is the classical Nehari problem, which dates from
the 1950s.

Nehari's problem. Let k be a function in L^∞_N. Find its distance to H^∞_N.


Here the distance of k to H^∞_N means

dist(k, H^∞_N) := inf { ‖k − f‖_{L^∞_N} : f ∈ H^∞_N },     (15.2)

and a best H^∞ approximation to k is a function f* ∈ H^∞_N that realizes the
infimum in (15.2). Standard theory shows that f* exists. Moreover, if k ∈
C_N + H^∞_N, then the best approximation is unique (here C_N is the set of N-tuples
of continuous functions on the unit circle). We refer to the problem of calculating
dist(k, H^∞_N) and a minimizer f* ∈ H^∞_N as Nehari's problem. Nehari's problem
is also the simplest nontrivial OPT problem, with the performance

Let H_k : H² → (H²_N)^⊥ be the operator with action a ↦ P_{(H²_N)^⊥}[ka]. This
type of operator is called a Hankel operator, and k is called its symbol. A
well-known result is the following.
THEOREM 15.2.1. If k is in L^∞_N, then

dist(k, H^∞_N) = ‖H_k‖,     (15.4)
and there exists a best approximation f* ∈ H^∞_N. If a* is a function at which
H_k attains its norm, then

If the symbol k is in H^∞ + C, then f* is unique. Let H_k* be the adjoint of H_k,
and set r = dist(k, H^∞_N); then

H_k* H_k [a*] = r² a*.     (15.6)
Equation (15.6) says that a* is a function associated with r, the largest
singular value of the compact operator H_k (i.e., the norm of H_k). We will
refer to equation (15.6) as the Nehari-commutant lifting formula. For N > 1
this follows from commutant lifting arguments [A63], [S67], [NF70], [AAK68],
[AAK72], [AAK78].
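In the scalar case Theorem 15.2.1 can be checked numerically. Below, a power iteration on H_k*H_k (implemented with FFT projections) recovers r = ‖H_k‖ from a maximizing vector a*, and the candidate best approximation built from it exhibits the classical flatness |k − f*| ≡ r on the circle. The symbol k and the recipe f* = k − (H_k a*)/a* are illustrative assumptions of this sketch, not taken from the text:

```python
import numpy as np

n = 512
z = np.exp(2j * np.pi * np.arange(n) / n)

def proj_H2(f):
    c = np.fft.fft(f)
    c[n // 2:] = 0                       # zero the negative-frequency bins
    return np.fft.ifft(c)

k = 2.0 / z + 0.5 / z**2                 # symbol with a rank-2 Hankel operator

a = np.ones(n, dtype=complex)            # initial guess in H^2
for _ in range(300):                     # power iteration for H_k* H_k
    b = proj_H2(np.conj(k) * (k * a - proj_H2(k * a)))
    a = b / np.sqrt(np.mean(np.abs(b) ** 2))

Hka = k * a - proj_H2(k * a)             # H_k a
r = np.sqrt(np.mean(np.abs(Hka) ** 2))   # ~ ||H_k|| = dist(k, H^inf)

f_star = k - Hka / a                     # candidate best approximation
print(np.max(np.abs(np.abs(k - f_star) - r)))  # ~ 0: the error is flat
```

Here a* turns out proportional to 1 + c z with the zero outside the disk, so the division by a* is harmless.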

15.3 Disk iteration algorithms


The solution of Nehari's problem leads directly to numerical methods for solving
OPT problems where the sublevel sets are disks ("disk problems"). To solve
more general OPT problems we can successively approximate OPT by disk
problems and solve them. We call this disk iteration. This is described together
with experimental findings in [HMer93b]; here we sketch only the basic idea.

15.3.1 The power method for solving Nehari's problem


A common way to solve Nehari's problem is to solve (15.6) for a* as an eigenvalue problem and then solve for f* in (15.5). The power method is a natural
choice, since we are seeking the largest eigenvalue of the operator H_k* H_k and the
corresponding eigenvector. It is a well-known fact that, provided the multiplicity
of this eigenvalue is equal to 1, the power method converges linearly to the
solution.
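This linear rate is easy to see in a toy finite-dimensional run, where the error contracts by the eigenvalue ratio λ₂/λ₁ at each step (a generic illustration, not the Hankel setting):

```python
import numpy as np

A = np.diag([3.0, 2.0, 1.0])             # top eigenvalue 3 is simple
v = np.array([1.0, 1.0, 1.0])
target = np.array([1.0, 0.0, 0.0])       # dominant eigenvector

errors = []
for _ in range(30):
    v = A @ v
    v = v / np.linalg.norm(v)
    errors.append(np.linalg.norm(v - target))

print(errors[-1] / errors[-2])           # ratio approaches lambda_2 / lambda_1 = 2/3
```

When the top eigenvalue is not simple, the iteration still converges to the top eigenspace but the estimate of a single eigenvector need not settle.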
The power method. Given k and a guess a at a solution to (15.6), update
a to a₁ with

Suppose that the unit circle is discretized to n (a power of 2) equally spaced
points, so that one can calculate projections in 2n log₂(n) operations by using
a fast Fourier transform algorithm (we do not count additions). Then step P1
takes N(4n log₂(n) + 2n) operations, and step P2 takes Nn multiplications, one
square root, and Nn divisions. Thus the power method for solving Nehari's
problem is of order Nn log₂(n). Also, note that memory requirements are low,
since no matrices have to be stored at any moment.
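By way of contrast with the O(Nn log₂ n) power method, the Nehari subproblem can also be attacked by storing a matrix: a finite Hankel matrix built from the negative Fourier coefficients of k, whose largest singular value is ‖H_k‖. A sketch (symbol chosen for illustration):

```python
import numpy as np

n = 256
theta = 2 * np.pi * np.arange(n) / n
z = np.exp(1j * theta)

k = 2.0 / z + 0.5 / z**2 + 0.1 * z       # the analytic part does not affect H_k

c = np.fft.fft(k) / n                    # c[(-m) % n] is the coefficient of z^{-m}
m = 16
neg = [c[(-i) % n] for i in range(1, 2 * m + 1)]
H = np.array([[neg[i + j] for j in range(m)] for i in range(m)])  # H[i,j] = k_{-(i+j+1)}

print(np.linalg.svd(H, compute_uv=False)[0])  # ~ ||H_k|| = dist(k, H^inf)
```

The dense route costs O(m²) memory and an O(m³) SVD, which is exactly the trade-off the text alludes to.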
For comparison we mention that Newton iteration requires the inversion of
a matrix M that arises from a discretization of the operator T′ introduced in
sections 12.3 and 15.4. The matrix M has a number of entries that scale with

Nn², but it has special structure that can be exploited to reduce the operation
count in the inversion (with algorithms of the type found in [SK91]), possibly
reducing the order of the inversion to Nn². Also there exist algorithms for
inverting a matrix by iteration, which can be effective in this setting.
The power method is clearly efficient in many respects, and it is superior to
the Newton iteration in memory requirements and operation count. Obviously
it is also better when it comes to time actually used to produce an update in step
2 of the iteration scheme. However, there is a crucial limitation of the power
method: it is directly applicable only to very simple OPT problems, namely,
Nehari problems.
The properties of the power method make it an interesting tool for solving
OPT problems through approximating Γ by Taylor expansion. This Taylor
expansion is used to obtain a Nehari problem, or at least a problem close to it.
This has been attempted before in [HMer93b]. In section 15.3.2 the method of
disk iteration (see [HMer93b]) is described.

15.3.2 The algorithm


We now introduce the disk iteration algorithm.

Disk iteration. Given a current guess f ∈ A_N at the solution, step 2 of
the scheme is:

Find h = (h₁, h₂, …, h_N) ∈ A_N that minimizes

A first glance at the algorithm reveals that the Taylor expansion of Γ about
f, the current guess at the solution, is not necessarily of the form of a Nehari
problem. However, it can be solved by iterating Nehari problems [HMer93b].
Thus an implementation of the disk iteration algorithm that uses the power
method requires an inner loop to find a solution to step 2.¹ Another feature
of (15.7) is that the Taylor expansion it contains does not capture all second-order
information from most functions Γ. One expects that this would imply
first-order (bad) convergence for the algorithm. This is indeed true in the vector-valued
case N > 1, but it is surprising that (in practice) second-order (good)
convergence occurs in the scalar case N = 1 (see [HMer93b]). Thus disk iteration
gives second-order convergence in practice when N = 1, and no better than first-order
convergence when N > 1.
¹ We note that there are many reasonable modifications of the expression (15.7) that preserve intact terms of order up to 1 in h. In fact in [HMer93b] one was found that eliminates the inner loop at the expense of all the second-order information.

15.4 The Newton iteration algorithm for OPT_N


This section describes Newton's method of solving OPT when N > 1. We follow
closely the reference [HMW93]. Recall that in the scalar case (section 12.3) the
key was a set of critical point equations. The vector representation (C^N-valued
analytic functions) of the critical point equations is

where ψ = 1 + 2 Re(χβ) and β is a scalar-valued function that is analytic on
the disk. Thus we get the nonlinear equation

where

Thus our objective is to solve the critical point problem (15.9) with T given
by (15.10). Here χ is the identity function on the unit circle, namely, the
function χ(e^{jθ}) = e^{jθ}.
Now we turn to the solution of operator equations (e.g., equation (15.9))
with the iterative method of Newton. In terms of T′ we have the Newton step
for updating a given pair (f, β):

We refer to (15.11) as Newton iteration, and to repeated application of (15.11)
as the Newton algorithm.

A property that is required for Newton's method to converge well to a local
optimum (f*, β*) is that T′_{f*,β*} is invertible. We analyze this in section 15.7
and give strong theoretical evidence that T′_{f,β} is invertible for almost all (f, β).
Also there is a large body of experimental evidence that this is true, and the
next section gives some examples.
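The role that invertibility of T′ plays is the standard Newton one: with an invertible Jacobian, full steps give quadratic error decay. A generic finite-dimensional illustration (a toy map, not the operator T of this section):

```python
import numpy as np

def T(v):
    x, y = v
    return np.array([x**2 + y**2 - 4.0, x - y])   # root at (sqrt(2), sqrt(2))

def jacobian(v):
    x, y = v
    return np.array([[2 * x, 2 * y], [1.0, -1.0]])

v = np.array([2.0, 1.0])
root = np.array([np.sqrt(2.0), np.sqrt(2.0)])
errs = []
for _ in range(6):
    v = v - np.linalg.solve(jacobian(v), T(v))    # full Newton step, no linesearch
    errs.append(np.linalg.norm(v - root))

print(errs)  # each error is roughly the square of the previous one
```

If the Jacobian were singular along the way, `solve` would fail or the quadratic contraction would be lost, which is exactly why section 15.7 studies invertibility of T′.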

15.5 Numerical comparison of the algorithms


In this section we compare disk iteration to the Newton-type algorithm. We
illustrate the behavior of disk iteration when applied to the function

The (global) optimal value is γ* = 3800, attained at two different points in
function space: the constant functions f₁* = (30, −30) and f₂* = (−30, 30).
There are no other local solutions to OPT. Also, we saw in section 13.4 that
the function Γ has a "saddle point" value at 20,000 that is attained at the
constant function f₀* = (0, 0) (here Flat(f₀*) = 0 and GrAlign(f₀*) = 0, but
there is no local minimum here since condition II in Theorem 13.3.1 is not
satisfied).
A final observation is that the Newton iteration runs presented here do not
include a linesearch along the direction given by one step in the method; i.e., a
full step is taken each time. This is in contrast with the disk iteration method
(see [HMer93b]), which depends on the linesearch in a strong manner.
The discretization of the problem is carried out by sampling functions on a
grid of equally spaced points on the unit circle. In all the examples shown in
Tables 15.1-15.5 the number of samples is 256.

Table 15.1. Newton iteration: Convergence to local optimum.

The Newton iteration is initialized at f(e^{jθ}) = (29.6 + 0.1e^{jθ}, −30.4 −
0.0001e^{jθ} + 0.001(e^{jθ})²), which is near a local solution. Observe how the
diagnostics Flat and GrAlign tend to zero at an essentially quadratic rate.
The same holds for the true error ‖f_k − f*‖_{L^∞}.

15.6 Derivation of the Nehari solution


In this section we return to disk iteration and derive Theorem 15.2.1, which lies
at its core. Indeed, we derive relations (15.4), (15.5), and (15.6) that appear in
Theorem 15.2.1 directly from the optimality conditions in Theorem 13.3.1.

Recall that the Nehari problem is of the type OPT with performance as
in (15.3). To avoid technical issues we assume that k ∉ H^∞_N and that k is
continuous; hence there is a unique best H^∞ approximation f* to k. We also
assume that k − f* is bounded away from 0. Thus Γ and f* satisfy the hypotheses
of Theorem 13.3.1.
Set r = dist(k, H^∞_N). Then r > 0. The optimality conditions I and II of
Theorem 13.3.1 in this case say that there exist ψ measurable and positive on

Table 15.2. Newton iteration: Convergence to saddle point.

In this example the Newton iteration converges quadratically to the constant
function f*(e^{jθ}) = (0, 0), which is not a local minimizer. The iteration was
initialized with f(e^{jθ}) = (0.2 + 0.1e^{jθ}, 1.4 + 0.1e^{jθ} + 0.01(e^{jθ})²).

∂D and a function F ∈ H¹_N such that

Let a* ∈ H² be the outer spectral factor of ψ. From (15.14) we have that
F/a* ∈ H²_N and hence that

Combining (15.14) and (15.15), one obtains

This proves formula (15.5).
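For computations, the outer spectral factor of a uniformly positive weight ψ can be obtained as the exponential of the analytic completion of ½ log ψ (the standard log/conjugate-function recipe); a sketch with an illustrative weight:

```python
import numpy as np

n = 256
theta = 2 * np.pi * np.arange(n) / n

psi = 2.0 + np.cos(theta)                 # a uniformly positive weight on the circle

u = 0.5 * np.log(psi)                     # we want outer a with |a|^2 = psi
c = np.fft.fft(u) / n                     # Fourier coefficients of u (a real function)
c_an = np.zeros(n, dtype=complex)
c_an[0] = c[0]
c_an[1:n // 2] = 2 * c[1:n // 2]          # u + i*(conjugate function of u)
a = np.exp(np.fft.ifft(c_an) * n)         # outer: exponential of an H^2 function

print(np.max(np.abs(np.abs(a) ** 2 - psi)))   # ~ 0: |a|^2 reproduces psi
```

Because log ψ has rapidly decaying Fourier coefficients here, the truncation at the Nyquist bin is negligible.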


We now check the identity (15.4). We trivially have that

To prove the reverse inequality, note that by (15.13) the following relation holds
for almost all θ:

Combine (15.5) and (15.18) to obtain



Table 15.3. Convergence to local optimum with disk iteration.

Disk iteration with linesearch on the value of Γ, initialized at f(e^{jθ}) =
(29.6 + 0.1e^{jθ}, −30.4 − 0.0001e^{jθ} + 0.001(e^{jθ})²), gives very slow progress toward
the solution. Observe that the diagnostic GrAlign decreases at a much slower
rate than Flat. This is common behavior for this algorithm when optimizing
over A₂. Also note that the numerical error diagnostic remains relatively large
during the iteration. Compare this with Table 15.1, where the function Γ and
the initialization in f are the same as those in the present run, and a quadratic
convergence rate is achieved.

and this finishes the proof of the equality (15.4).


At this point we are in a position to derive the Nehari-commutant lifting
formula (15.6). Rewrite (15.14) as

and notice that by multiplying both sides of (15.20) by (k − f*)* and using
(15.13) one obtains

Now we reach a crossroads. The strategy behind our approach in section
15.4 was to project (15.20) onto H²_N. In this case it produces a Toeplitz
operator-type formula,

The strategy in the Nehari and commutant lifting approaches is to project
both terms in (15.20) onto (H²_N)^⊥. This yields

Table 15.4. Convergence to local optimum with disk iteration and smoothing.

The example shown in Table 15.3 is now run with automatic smoothing of
the current guess at the solution. This lowers the numerical error (ned) as the
iteration progresses and improves Flat, but GrAlign is practically unchanged.
Thus numerical noise is not the source of the lack of progress in the run in
Table 15.3.

Multiplying both sides of (15.22) by (k − f*)*, using (15.21), and taking conjugates
of both sides, we have

Taking the projection of both sides of (15.23) onto H² gives

Equation (15.24) is precisely the Nehari-commutant lifting formula (15.6).


We see that the Nehari-commutant lifting formula is remarkable in that
it puts together, in a single equality, both relations (15.13) and (15.14) while
eliminating f and F (our approach in section 15.3 eliminates ψ and F). It is
also notable that all this is accomplished with a formula that is linear in a. It
is unfortunate that these properties do not hold for performance functions Γ that are
more general than (15.3).

Table 15.5. Convergence to local optimum with disk iteration and special linesearch.

The example shown in Table 15.3 is now run with a mixed strategy in the
linesearch: if f is the current guess at the solution and h is the update direction
generated by the algorithm, then either GrAlign(f + th) or the supremum
of Γ(·, f + th) is minimized in t, depending on the progress with respect to
previous iterations. The progress improves to a clear linear rate of convergence.
Compare this with Table 15.1.

15.7 Theory of Newton iteration


Recall that our Newton iteration method applies to the optimality equations
of flatness and gradient alignment from Theorem 13.3.1, written as a collection
of equations

where ∫ψ = 1, as in section 15.4. To implement or to analyze Newton's method
we must compute the Jacobian (differential) of an operator T defined by the
left-hand side of equation (15.25). For convenience, a change of variables is
introduced,

where β is an analytic function. This change of variables has the advantage of
taking care of the normalization ∫ψ = 1. Thus, combining equations (15.25)
with (15.26), we obtain an equation of the form

Now we must linearize equation (15.27) and obtain a tractable formula (this
we do next). Then to prove second-order convergence we must show that the
Jacobian is invertible (this is analyzed afterward).

15.7.1 The linearized optimality equations


We now derive the formula for the Jacobian T′_{f,β} of T. The following Taylor
expansions will be useful.

where

To compute T' we consider the expression

Substituting (15.28) into (15.30) and dropping all terms that are not linear or
constant in δ or ε yields

Thus we have that the Jacobian of T is given by

An elegant way to write T' appears in the following proposition.


PROPOSITION 15.7.1. The operator T′_{f,β} in (15.32) can be written in the
form

where

T_{M₁} is a Toeplitz operator with symbol M₁ =

H_{M₂} is the Hankel operator with symbol M₂ =

C is the conjugation operator f ↦ f̄, I_N is the N × N identity matrix, and ψ =
1 + 2 Re χβ.

A proof of Proposition 15.7.1 follows from the relations

applied to (15.32).
One can see in the representation (15.33) that T′_{f,β} has the form of a "conjugate
of Toeplitz plus Hankel" operator (except for a small change in the range, which
turns out to be unimportant for our analysis). One consequence of this, as
pointed out to us by Ali Sayed, is that T′_{f,β} can be inverted numerically with
fast algorithms similar to those described in [SK91]. This enhances the practical
appeal of Newton's method applied to the equation T = 0.

15.7.2 Second-order convergence: Invertibility of T′


In this section we indicate how one predicts theoretically that the Newton iteration
algorithm in section 15.4 is second-order convergent. It is standard
that second-order convergence for Newton iterates "solving" T = 0 holds if the
Jacobian T′ of T is invertible. Thus invertibility of T′ is the key issue.

For Newton iteration it is appropriate to define T not on all of an H²-type
space but on a subspace of it consisting of smooth functions. Now one can
discuss spaces appropriate to our problem, and that is done in [HMW93], but
we avoid it here. Instead we shall analyze the easier but highly informative
problem of invertibility of T′ acting from H² to H². This gives one a very good
picture of the invertibility behavior appropriate to our situation.

One needs a solid background in functional analysis to understand this section;
indeed, one should know about compact operators, effects of compact perturbations
of operators on Hilbert space, and the basics of Toeplitz operators
and the Beurling-Lax theorem.
Define an invertible outer function F ∈ H^∞_N to be one whose component
functions satisfy

|F₁(z)|² + ⋯ + |F_N(z)|² ≥ δ > 0

for some δ and for all |z| < 1. One version (cf. [H86]) of a deep theorem, mostly
due to L. Carleson, called the Corona theorem, is as follows.
THEOREM 15.7.2. For F ∈ H^∞_N,
the range of the Toeplitz operator T_F : H²_N → H² is H²
if and only if
T_{F̄₁} T_{F₁} + ⋯ + T_{F̄_N} T_{F_N} is invertible
if and only if
F is an invertible outer function.
We shall be dealing with functions F ∈ H^∞_N that never vanish on ∂D, and
for these the theorem roughly says that not all components of F can have a
common zero in D. Since having a common zero is a rare event, we see that the
outer condition is generically true provided N > 1.
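Whether a given F is an invertible outer function in this sense can be probed numerically by sampling |F₁(z)|² + ⋯ + |F_N(z)|² over a grid in the closed disk; a crude check, not a proof (the two pairs below are illustrative):

```python
import numpy as np

r = np.linspace(0.0, 1.0, 201)
t = np.linspace(0.0, 2 * np.pi, 201)
R, T = np.meshgrid(r, t)
Z = R * np.exp(1j * T)                    # grid covering the closed unit disk

def min_sum_sq(*components):
    """Minimum over the grid of sum_l |F_l(z)|^2."""
    return sum(np.abs(F(Z)) ** 2 for F in components).min()

print(min_sum_sq(lambda w: w, lambda w: 1 - w / 2))  # bounded below: invertible outer
print(min_sum_sq(lambda w: w, lambda w: w**2))       # common zero at 0: infimum is 0
```

A grid check can only suggest the corona bound δ; certifying it requires the function-theoretic argument.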
We now present the main theorem of this section.
THEOREM 15.7.3. Let Γ be smooth and strictly plurisubharmonic. Let
f ∈ A_N be a smooth function that satisfies the gradient alignment condition

with 0 ≠ F ∈ H^∞_N and ψ a uniformly positive rational function. Represent ψ
as ψ = 1 + 2 Re(χβ) with β in H². Then the spectrum of the differential T′_{f,β}
does not have 0 as an accumulation point if and only if F is an outer function.
For N = 1, such a factoring exists if and only if the winding number condition
wind(a, 0) ≤ 1 holds.
The wind(a, 0) ≤ 1 condition for N = 1 is, by Theorem 14.4.3, true for
generic optima. The condition that F is an outer function has been mentioned
as generic in the class of analytic functions for N > 1, though it is not proved
to be generic among solutions to OPT problems. Thus if T′ is not invertible, 0
is an eigenvalue of T′, and the theorem says a small change in f, β will probably
move the eigenvalues of T′ to make it invertible. This strongly suggests (and in
numerical experiments one sees) that T′ is invertible "with probability 1."
To prove Theorem 15.7.3 we need a lemma.
LEMMA 15.7.4. If a : ∂D → C^N is a smooth function that never equals the 0
vector, then it has a phase-outer factorization, namely,

where the smooth φ : ∂D → C has |φ| = 1 and F ∈ H^∞_N is outer and smooth, if and
only if the subspace

is a closed proper subset of L². This subspace is closed if we add the assumption
that a is rational. By Lemma 14.4.1, the factoring (15.35) is equivalent to

where α is a scalar outer function, ψ is a smooth, strictly positive function
in L^∞, and wind(φ, 0) = n.
Proof. A good reference for the following arguments is [Doug72]. The set
M = {g = ah : h ∈ H²_N} is a closed invariant subspace of L² under the unilateral
shift. We can apply the Beurling-Lax theorem to get the representation

where |φ(e^{jθ})| = 1 for all θ, provided that

This is true because M is the range of a multiplication operator:

Since a never vanishes, {g = ah : h ∈ H²_N} = φH² is a closed set. Thus
{g = ah : h ∈ H²_N} = φH², or equivalently

This implies that φ̄a is invertible outer, and we denote it F. We need φ to be
smooth, and this is true because a is smooth. This requires a more technical
argument, such as that in [Goh64].

The last line of the theorem follows because Lemma 14.4.1, applied to
b := χ^{−n}φ, which has winding number 0 about 0, says there is a smooth α ∈ H^∞
with the same phase as χ^{−n}φ. That is, there is a uniformly positive function ψ
such that ψχ^{−n}φ = α. We have that wind(ψχ^{−n}φ, 0) = wind(α, 0) = 0, which,
since α is invertible on the unit circle and is in H^∞, implies that α must be
outer.
Proof of Theorem 15.7.3. The starting point is the formula for T′ in Proposition 15.7.1,
which has the form

where L := T_{M₁} with



and C can be shown to be a compact operator. It is easy to show that L is
self-adjoint. Consequently, by basic Fredholm theory, if L is invertible, then
T′ = L + C has spectrum (eigenvalues) that do not have 0 as an accumulation
point.
The next lemma tells us when L is invertible.

LEMMA 15.7.5. For M =

with R a uniformly strictly positive definite N × N matrix-valued function, and the smooth a ∈ L^∞_N satisfying
a(e^{jθ}) ≠ 0, we have that T_M is invertible if and only if

If a is rational, this fails if and only if

for some continuous uniformly positive ψ : ∂D → R₊, some outer function F ∈ H^∞_N,
and n > 1. Even for nonrational a, if the representation in (15.39) is
true, then T_M is invertible if and only if n ≤ 1. For N = 1 this is equivalent to
stating that the winding number condition wind(a, 0) ≤ 1 holds.
Proof. Write the block Toeplitz operator T_M as

a block matrix with Toeplitz operator (of the appropriate dimension) entries.
Since R is a positive definite-valued and uniformly invertible function, T_R is
invertible. Standard Schur complement arguments tell us that T_M is invertible
iff T_{χa}* (T_R)^{−1} T_{χa} is invertible, and the latter is invertible iff (15.38) holds, since T_R is invertible.
Now we show, for a rational function a, that condition (15.38) fails iff condition
(15.39) holds. If the range of T_F is not onto H², then M = χaH² is a subspace of
L² that is not equal to L². Since a is rational and a(e^{jθ}) never vanishes, M is
closed. Thus the range condition (15.36) in Lemma 15.7.4 is true, so we get the
representation (15.37). That is, a has the form in condition (15.39). Even if a
is nonrational and (15.39) holds,

has range onto iff n ≤ 1 and T_F has range onto. This uses the fact that
T_{ψ^{−1}χ^{n−1}} is a scalar Toeplitz operator, so Coburn's lemma implies that the
range of T_{ψ^{−1}χ^{n−1}} is onto iff n − 1 = wind(ψ^{−1}χ^{n−1}; 0) ≤ 0. Also it uses
Theorem 15.7.2, which says T_F has range onto iff F is outer.
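Coburn-type behavior is visible already in finite sections of scalar Toeplitz operators. With the convention T[i, j] = b_{i−j}, a symbol whose zero lies inside D (winding number 1) produces sections whose smallest singular value decays to 0, while a zero outside D (winding number 0) keeps the sections uniformly invertible. A numerical sketch with illustrative symbols:

```python
import numpy as np

def toeplitz_section(coeffs, m):
    """m x m section T[i, j] = b_{i-j} for the symbol b(z) = sum_d coeffs[d] z^d."""
    T = np.zeros((m, m))
    for d, c in coeffs.items():
        for i in range(m):
            if 0 <= i - d < m:
                T[i, i - d] = c
    return T

for m in (5, 10, 20):
    s_in = np.linalg.svd(toeplitz_section({0: -0.5, 1: 1.0}, m), compute_uv=False)[-1]
    s_out = np.linalg.svd(toeplitz_section({0: -2.0, 1: 1.0}, m), compute_uv=False)[-1]
    # b = z - 0.5 (zero inside D): s_in -> 0;  b = z - 2 (zero outside): s_out stays >= 1
    print(m, s_in, s_out)
```

The decay of `s_in` reflects the fact that the inverse of each finite section grows like 2^m, so the infinite operator is not invertible.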

Proof of Theorem 15.7.3 (continued). We can use Lemma 15.7.5 to obtain
Theorem 15.7.3, since its hypotheses guarantee the nonvanishing of a, and Γ
strictly plurisubharmonic means precisely that A is uniformly positive definite,

so R := ψA is also uniformly positive definite. The gradient alignment assumption
implies (15.39).
The history of solutions to the Nehari problem is no brief matter, especially
how the Nehari-commutant lifting theory intertwines with control. This is given
in Appendix A. Disk iteration methods were introduced in [BHM86]. They were
refined and analyzed in various papers (see [HMer93b]). The Newton iteration
and its analysis through invertibility of T′ come from [HMW93].
Chapter 16

More Theory of the Vector OPT Problem

A sufficient condition for optimality in the problem OPT is presented in this


chapter. The main result is given in section 16.1. An example is given in
section 16.2, and a survey of qualitative results on OPT from the literature is
the content of section 16.3.

16.1 A sufficient condition for optimality


We will use the following notation for the second-order partial derivatives:

representing the N × N matrix-valued function with (ℓ, k) entries given by

The next result states a sufficient condition for a function to be a solution to OPT.
THEOREM 16.1.1. Let Γ be of class C³ and f* = (f₁, …, f_N) in A_N be
such that

is not the vector 0 for any e^{jθ}.

If the function f* is a directional solution to OPT, then, in addition to the
flatness and gradient alignment conditions of Theorem 13.3.1, the following condition holds:


III. For every nonzero

Conversely, if the flatness and gradient alignment conditions and III with strict
inequality hold, then f* is a directional minimizer.
The interested reader may check the reference [HMer93a] for the proofs and
details pertaining to this section.
We now observe that this theorem gives a practical optimality test when
N ≤ 2.
COROLLARY 16.1.2. When N = 1, condition III of Theorem 16.1.1 is
satisfied at any f ∈ A since A_f = {0}. Therefore, if f ∈ A is such that ∂Γ/∂z̄(·, f)
is never 0 for any θ, and if it satisfies the flatness and gradient alignment
conditions, then f is a strict local optimizer of OPT.
When N = 2 we have a practical test given in the following theorem.
THEOREM 16.1.3. Let Γ(·, z) be a given performance function and let f* ∈
A₂. For ℓ = 1, 2, set a_ℓ(e^{jθ}) = (∂Γ/∂z̄_ℓ)(e^{jθ}, f*(e^{jθ})) and let

Suppose that at least one of the functions a_ℓ is never zero on the circle, and
that it has winding number 1 about 0. If f* satisfies the flatness and gradient
alignment conditions, then relation III of Theorem 16.1.1 with strict inequality
is equivalent to the following statement.
Either (i) there exists θ₀ such that

or (ii) bᵀBb is never zero on the circle, and n_b := wind(bᵀBb) is either an odd
number or a number greater than 2.
If Conjecture 13.5.1 is true, then the hypothesis on the winding number of a_ℓ
is not too restrictive. The test in Theorem 16.1.3 is implemented in the software
package Anopt.
Open question: Find a practical test to check condition III when N > 2.
For those who want a practical test when N > 2 we include the next theorem,
which replaces III with a (stronger) condition that is easy to check. When this
condition is taken together with flatness and gradient alignment, the three are sufficient (but
not necessary) to ensure local optimality.
THEOREM 16.1.4. Let Γ, f*, a, A, and B be as in Theorem 13.3.1. For
θ ∈ [0, 2π) let

Suppose, in addition to the flatness and gradient alignment conditions of Theorem 13.3.1,
that for each e^{jθ} the Hessian in z of Γ(e^{jθ}, ·) is strictly
positive definite when compressed to T_θ. That is, suppose that there is a δ > 0
such that for every z ∈ T_θ,

Then f* is a directional solution of OPT. Moreover, if the statement (16.3)
holds for some δ > 0 and all e^{jθ} ∈ ∂D and z ∈ C^N, then f* is a strict local
optimizer.
Someone working in several complex variables would look at T_θ as "the
complex tangent space of S_θ(γ*)," which is a very natural object. We have the
following corollary of Theorem 16.1.4.
COROLLARY 16.1.5. If f* satisfies I and II, and if the sublevel sets of Γ are
strictly convex, then f* is a strict local optimizer. Thus if for all θ the function
Γ(e^{jθ}, z) is strictly convex in z, then f* is a strict local optimizer.

16.2 An example (continued)


We continue our study of OPT for the performance function defined by relation
(13.3). We shall show that the constant functions f₁(e^{jθ}) = (30, −30) and
f₂(e^{jθ}) = (−30, 30) are strict local solutions. The calculations for the function f₂ are
almost identical to those for f₁, so we present only those corresponding to f₁.
Since

it is clear that f₁ satisfies the flatness condition. We now calculate first-order
derivatives in z at f₁(e^{jθ}). From equation (13.4) we have

Note that

and this implies that the gradient alignment condition is satisfied by f₁ with
To use Theorem 16.1.4 we need the second-order partial derivatives:



and

With the functions A and B defined as in Theorem 16.1.1, we have that

If θ is fixed, then clearly a(e^{jθ})ᵀz = 0 when z has the form

for some z₁ ∈ C. For such z we have that

Therefore condition (16.3) holds for all θ and all z, and this implies that f₁ is a
strict local optimizer.

16.3 Properties of solutions


In this section we return to basic qualitative questions that were addressed for
N = 1 in section 9.4. Now we treat N > 1. Do solutions in H^∞_N exist? Are
solutions unique? Are solutions smooth?
Again, S_θ(c) refers to the sublevel sets

The main difference between the N = 1 and N > 1 cases is that in higher
dimensions solutions may not be unique. We saw an example of nonuniqueness
earlier in section 13.4. Those steeped in the lore of several complex variables
will appreciate the fact that Γ in this example is strictly plurisubharmonic in z.
Our standard assumption on the OPT problem is:

(SA) Γ depends smoothly on θ, is real analytic in z (and in z̄), and has a gradient
in z that never vanishes when Γ(e^{jθ}, z) = γ*. The sets S_θ(γ*) are
connected, simply connected, have nonempty interior, and are uniformly
bounded in θ.
While γ* may not be known in advance in a particular situation, one might verify
that all S_θ(γ) for a wide range of γ satisfy these conditions; this is because the
conditions are not very restrictive.
We now give a list of results. Definitions of the more specialized terms in
the theorems are given below.
The behavior of OPT depends heavily on the properties of the sublevel sets
of the performance function Γ, so now we list key ones. A strictly convex set is
a convex set with no line segment contained in its boundary. The unit ball in
C^N is strictly convex, while the unit ball in the space M_{mn} of m × n matrices
is not, unless m = 1 or n = 1. Polynomially convex sets are a much broader
class of sets than convex sets. Convex sets are intersections of half spaces
{w : Re ℓ(w) < 0}, where ℓ is a linear function into the complex plane, while
polynomially convex sets are intersections of sets of the form {w : Re p(w) < 0},
where p is a polynomial.

16.3.1 Uniqueness
THEOREM 16.3.1. If S_θ(γ*) is strictly convex (uniformly in θ), then if an
H^∞_N solution f* to OPT exists, it is unique. Also f*(e^{jθ}) ∈ ∂S_θ for almost
every θ.
Recall that when N = 1, the solution to OPT is unique (section 9.4). This
gives us two theorems with very different conditions guaranteeing uniqueness.
A theorem of Vityaev unifies the two by saying roughly that if the S_θ miss being
strictly convex by at most one complex dimension, then any smooth solution to
the OPT problem is unique.

16.3.2 Existence
THEOREM 16.3.2. Suppose SA holds and that each S_θ is polynomially convex.
Then an H^∞_N solution f* to OPT exists. Moreover, if a sequence f_k ∈ H^∞_N
approximately solves OPT (in the sense sup_θ Γ(e^{jθ}, f_k(e^{jθ})) = γ_k with γ_k ↘ γ*),
then a subsequence that converges in the normal family sense has as its limit a function
f_∞ in H^∞_N that satisfies Γ(e^{jθ}, f_∞(e^{jθ})) ≤ γ* almost everywhere. Moreover,
if each S_θ is strictly convex and smooth, then f* is continuous.
The result on continuity of solutions f* is in [S190]. Recent deep work
on regularity of f* in a special OPT type problem has been done in [A196].
Extending these results may be fruitful open ground.
The examples and theorems of Chapter 16 are from [HMer93a]. More results
can be found in [H86] and in [HMar90]. Theorem 16.3.1 is due to Helton and
Howe [HH86]; see also Owens and Zames [OZ93]. For N = 1, its uniqueness is
found in [HMar90]. The difficult part of Theorem 16.3.2 is the smoothness of
f*. For strictly convex sublevel sets it is due to Slodkowski [S189], and its proof
is influenced by ideas of Lempert [Le86]. For N = 1 convexity is not needed; see
[HMar90] or [S190].
Part V

Semidefinite Programming
of the Vector OPT Problem
We show how the results we have presented on H∞ optimization can be seen
from the viewpoint of a subject called semidefinite programming. This approach
has been very successful in the classical area of linear programming and more
recently in finding matrices that satisfy a collection of positive definite linear
matrix inequalities (LMIs). All of this will be explained in the next three chapters.
Also we present more general results than the ones you have seen in the book
so far.

Chapter 17

Matrix H∞ Optimization


To this point this book has concentrated on the study of the OPT problem with
smooth performance. Now we focus on an optimization problem that generalizes
it. The idea is to allow a matrix-valued performance Γ. Surprisingly, this vastly
extends the number of physical problems that can be treated. We begin by
formally defining the class of performance functions we consider.
DEFINITION 17.0.1. A matrix performance function is a smooth function
Γ of e^{jθ} ∈ ∂D and z ∈ C^N that takes n × n nonnegative definite matrix values
and that is real symmetric; that is, Γ(e^{jθ}, z) = Γ(e^{-jθ}, z̄) for all θ and all z.
The basic problem of this part of the book is

MOPT Given an n × n matrix performance function Γ, find

f* ∈ A_N and γ* ∈ R such that

    γ* = sup_θ λ_max( Γ(e^{jθ}, f*(e^{jθ})) ) = inf_{f ∈ A_N} sup_θ λ_max( Γ(e^{jθ}, f(e^{jθ})) ).

A pair (γ*, f*) is a local solution to MOPT if it solves the problem obtained
from MOPT by restricting the optimization of Γ to a neighborhood of f* inside
A_N.
Of course the MOPT problem is the OPT problem with performance func-
tion Γ̃ given by

    Γ̃(e^{jθ}, z) = λ_max( Γ(e^{jθ}, z) ),

the largest eigenvalue of Γ(e^{jθ}, z). The function Γ̃ is not smooth. Therefore this
is a badly behaved OPT problem, and very little of our theory of optimality and
few of our numerical algorithms are applicable.
Many optimization problems can be rephrased as problems of the type
MOPT. Below we illustrate this by presenting an example involving the
maximum of two performance functions. Later, in section 19.2, we turn to linear
matrix inequalities.

Competing constraints. Consider the following problem: given two
scalar-valued performance functions Γ1 and Γ2, find minimizers to

    inf_{f ∈ A_N} sup_θ max{ Γ1(e^{jθ}, f(e^{jθ})), Γ2(e^{jθ}, f(e^{jθ})) }.    (17.3)

We now derive a MOPT problem. Set

    Γ(e^{jθ}, z) = diag( Γ1(e^{jθ}, z), Γ2(e^{jθ}, z) ).

Since

    max{ Γ1, Γ2 } = λ_max( diag(Γ1, Γ2) )

holds, the problem (17.3) can be rewritten as a MOPT problem. It should be
clear that the same idea applies to situations involving any finite number of
scalar- or matrix-valued functions Γ_k.
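The identity driving the rewriting — the maximum of two scalars is the largest eigenvalue of the diagonal matrix built from them — can be checked directly; the two scalar functions below are made-up stand-ins for Γ1 and Γ2 evaluated along a grid.

```python
import numpy as np

# Illustrative scalar performances (hypothetical, for the demonstration only).
g1 = lambda x: (x - 1.0) ** 2
g2 = lambda x: 0.5 * x ** 2 + 0.1

checks = []
for x in np.linspace(-2.0, 2.0, 9):
    G = np.diag([g1(x), g2(x)])            # diag(Gamma_1, Gamma_2)
    lam_max = np.linalg.eigvalsh(G)[-1]    # eigvalsh returns ascending order
    checks.append(np.isclose(lam_max, max(g1(x), g2(x))))
print(all(checks))
```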

17.1 Optimality conditions for MOPT


All we do in this section is indicate how our optimality conditions in Theorem
13.3.1 generalize to matrix-valued performance functions Γ. These optimal-
ity conditions follow from a much more general optimality theory sketched in
Chapter 19.
For Γ an n × n matrix performance function and for f ∈ A_N, define U(f) by

THEOREM 17.1.1. Let f* ∈ A_N be a local solution to MOPT. Set

If

then there exists Ψ* ∈ L¹_{n×n} such that the triple (γ*, f*, Ψ*) satisfies

Complementarity

Gradient
Alignment

Recall that χ is the operator of multiplication by e^{jθ}; we may also write
χ(e^{jθ}) = e^{jθ}. Also, the inequalities 4 and 5 of PDE+H∞ hold pointwise almost
everywhere in θ.
everywhere in 0.
Remark 17.1.2. It can be easily shown that if Γ has diagonal or block diag-
onal structure, then the function Ψ in Theorem 17.1.1 can be chosen to have
the same diagonal structure.
Example. Optimality conditions for competing performance functions. Let
Γ1(e^{jθ}, z) and Γ2(e^{jθ}, z) be scalar-valued performance functions of e^{jθ} and z ∈
C, and set

It is easy to verify that if Ψ* is a dual optimal function, then so is the function
diag(ψ1, ψ2) obtained from Ψ* by replacing all the off-diagonal terms by
0. The equations corresponding to PDE+H∞ with Φ = I can be written as
follows:

17.2 The special case of scalar performance measures

Now we show how the results we have just stated for MOPT correspond to the
work we have done on OPT in previous sections. We reemphasize that in the
OPT problem Γ is a scalar-valued rather than a matrix-valued function. We
start by specializing Theorem 17.1.1 to scalar-valued Γ. Then we show how it
is "equivalent to" Theorem 13.3.1, which characterizes local solutions to OPT.
THEOREM 17.2.1. Let f* ∈ A_N be a local solution to MOPT. Set

If

then there exists Ψ* such that the triple (γ*, f*, Ψ*) satisfies

The main difference between this and Theorem 13.3.1 is in the flatness condi-
tion (1), which we have now expressed in a seemingly complicated way. Indeed,
this is the way it arises in many proofs of Theorems 9.3.1 and 13.3.1 (though
not the ones we gave), and one observes that Ψ* does not vanish a.e., which
implies that γ* − Γ(·, f*) = 0, which is what we call flatness of performance
at optimum. We foreshadow more general theory by saying that conditions (1),
(4), and (5) are examples of what are called complementarity conditions, since
they say that the sets where the two functions

do not vanish are complementary.


Other discrepancies are minor. For example, in conditions (2) and (3) one
uses the fact that both Γ and Ψ* are 1 × 1 matrices to eliminate the trace in
the formulas in Theorem 17.1.1. The expression UΓU that appears in Theorem
17.1.1 satisfies

so the hypotheses of Theorems 17.1.1 and 17.2.1 are identical.


The results in this chapter were announced in [HMW95].
Chapter 18

Numerical Algorithms for
H∞ Optimization

Writing the flatness condition in complementarity form (condition (1) of The-
orem 17.1.1) suggests a slightly different class of numerical algorithms from the
Newton algorithms in section 15.4 — a class which, while sometimes slower, ap-
plies to more complicated optimization problems than we have covered. Also,
there is a strong unity between these methods and the popular ones that solve
LMIs. These issues are the subject of this chapter.
We produce several algorithms for solving the equations in PDE+H∞ for
the unknowns γ, f, and Ψ, which could be thought of as modifications (and
generalizations) of our Newton algorithm in section 15.4. When Γ is not scalar-
valued, the complementarity condition does not guarantee that γI − Γ is 0.
Thus the trick in our treatment of the Newton algorithm in section 15.4 of
eliminating γ by applying a projection to it does not work.
The algorithms presented below fix a large γ (or something closely related) in
the complementarity condition and then solve the resulting equations via New-
ton's method, subject to preserving positivity constraints. Then they change
γ and repeat the process. This process can be thought of as "relaxing" the
complementarity condition

since the right side is not 0. Various ways of choosing Φ give different algorithms.
Such algorithms are usually called primal-dual interior point algorithms. The
term interior comes from the viewpoint that the set {f : γI − Γ(e^{jθ}, f(e^{jθ})) > 0}
constitutes a feasible region, and when we fix γ > γ* and solve PDE+H∞ for
f* we are approaching f* from the interior of the feasible region.
We treat the gradient alignment condition exactly as we did in the scalar-
valued Γ case of section 12.3, except now we need to take traces of certain
matrices. To be specific, we wish to project

so that it is equivalent to

which is legitimate whenever Ψ is a function in L².

To make sense, this requires that dual variables (such as Ψ in Theorem 17.2.1)
be elements of L². We readily assume this, because stronger assumptions, such
as smoothness of various functions, are necessary in order to prove convergence
of the algorithms.
In light of the previous discussion, combining (18.1) with 2 and 3 of PDE+H∞,
we get the optimality equations we wish to solve:

We reemphasize that in this algorithm at each step we fix γ and then solve for f
and Ψ using Newton's method, with barriers or restrictions on the line search
imposing the positivity conditions Ψ > 0 and γI − Γ(·, f) > 0.
The theory mentioned in section 15.4 is strong evidence that such equa-
tions have an invertible differential (almost always), and the fact that Newton's
method demonstrates second order convergence in these cases corroborates this.

18.1 Interior point algorithm

Take Φ = I in PDE+H∞. Then condition (1) of PDE+H∞ becomes

Thus fixing a small ε > 0 is equivalent to just picking

in the complementarity condition. This strategy is proposed in [HO94] and
analyzed by the authors in [AHO96].
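At a single frozen frequency, the relaxed complementarity with Φ = I can be solved in closed form: Ψ = ε(γI − Γ)^{-1}. The sketch below (an arbitrary nonnegative definite sample standing in for Γ(e^{jθ}, f(e^{jθ}))) checks this relation and the positivity of the dual variable.

```python
import numpy as np

rng = np.random.default_rng(0)

# A sample n x n nonnegative definite "performance value" and a gamma that
# dominates its top eigenvalue (both hypothetical).
M = rng.standard_normal((3, 3))
Gamma = M @ M.T
gamma = np.linalg.eigvalsh(Gamma)[-1] + 0.5

eps = 1e-2
# (gamma I - Gamma) Psi = eps I  is solved by  Psi = eps (gamma I - Gamma)^{-1}.
Psi = eps * np.linalg.inv(gamma * np.eye(3) - Gamma)

residual = (gamma * np.eye(3) - Gamma) @ Psi - eps * np.eye(3)
print(np.max(np.abs(residual)))          # zero up to roundoff
print(np.linalg.eigvalsh(Psi)[0] > 0)    # the dual variable stays positive
```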
18.2 Interior point algorithm 2

Take Φ = γI. As we shall see, the distinctive feature of this algorithm is that the
dual variable Ψ is eliminated from the problem. We now describe some simple
algebra that will do this.
We may solve (1) of PDE+H∞ for Ψ and get

Substituting this into (2) of PDE+H∞ gives

Also, the normalization requirement in (3) of PDE+H∞ becomes the normaliza-
tion

Finally one solves (1Lε) and (2Lε) for γ, f.

There is an elegant way to reformulate equation (2Lε). The key part of
(2Lε) is

which we now observe is a logarithmic derivative. Since (1Lε) implies that γI − Γ
is a uniformly positive definite matrix function, we have that

Standard language adapted to this problem is that when ε is fixed, the primal
and dual variables f_ε, Ψ_ε which we solve for lie on the central path. As we take ε
to 0 our updates follow the central path, which we hope leads to a local optimum
f*, Ψ*.
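The logarithmic-derivative remark is the identity d/dt log det(γI − Γ(t)) = −tr[(γI − Γ(t))^{-1} Γ′(t)], valid while γI − Γ(t) stays positive definite. A finite-difference check on a made-up smooth family:

```python
import numpy as np

A = np.array([[1.0, 0.2], [0.2, 0.5]])
B = np.array([[0.1, 0.3], [0.3, 0.2]])
Gamma = lambda t: A + t * B              # so Gamma'(t) = B
gamma, I = 5.0, np.eye(2)                # gamma*I - Gamma(t) > 0 near t = 0

phi = lambda t: np.log(np.linalg.det(gamma * I - Gamma(t)))

h = 1e-6
numeric = (phi(h) - phi(-h)) / (2 * h)   # central difference
analytic = -np.trace(np.linalg.inv(gamma * I - Gamma(0.0)) @ B)
print(numeric, analytic)                 # the two values agree
```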

18.3 The "boundary case"

For completeness in this context, recall that our Newton algorithm solves
conditions (2) and (3) of PDE+H∞ and replaces (1) by

No ε or γ appears.
The reader interested in very preliminary comparisons from computer ex-
periments on 2 × 2 matrix-valued Γ should see [HMWPrep]. Developments in the
rapidly expanding field of semidefinite programming and primal-dual optimiza-
tion are reported in [LO96], [NN94], [VB96], [Wr97].
Chapter 19

Semidefinite Programming
versus Matrix H∞ Optimization
Semidefinite programming solves a great variety of engineering optimization
problems. Of particular importance are the linear matrix inequalities, which
now appear to occur in most branches of engineering [BEFB94], [SI95].
Our goals in this chapter are to present a succinct theory of semidefinite pro-
gramming and to show how the methods that we presented in previous chapters
actually correspond to a type of semidefinite programming theory and corre-
sponding algorithms.

19.1 Background on semidefinite programming


We operate at a very high level of abstraction, since this brings out the basic
issues and actually makes understanding easier. Ultimately we show how the
optimality theory presented so far fits in.
This section requires that the reader know the basics of Banach spaces, so it
is not for everyone. Those who skip this section should go to section 19.2.

19.1.1 Basic Setup


Let X be a real Banach space. A subset X+ of X is called a cone if it is
closed under addition and under multiplication by nonnegative scalars. If x and y are
elements of X, we say that x ≥ y provided x − y is an element of X+.
Let X′ be the dual space of X, that is, the set of bounded linear functionals
λ on X. The dual cone of a closed cone X+ is the set

    X′+ = { λ ∈ X′ : ⟨λ, x⟩ ≥ 0 for all x ∈ X+ }.
Note that X′+ is closed in the topology induced by the norm of X′.

DEFINITION 19.1.1. Let X+ be a closed cone in a Banach space X such that
X+ ≠ X. Let I be a fixed element of X+ such that I ≠ 0, and set

    η*(x) = inf { t ∈ R : tI − x ∈ X+ },  x ∈ X.

The function η* is called a size function (with respect to I) if

Note that the definition of η* requires that {0} ≠ X+ ≠ X. Some properties
of size functions are given in Proposition 20.1.1 in Chapter 20.
DEFINITION 19.1.2. Let X be a real Banach space with cone X+ and size
function η*. We say that η* satisfies the Hahn-Banach condition if

(HBC)    η*(x) = sup { ⟨λ, x⟩ : λ ∈ X′+, ‖λ‖_{X′} = 1 }  for every x ∈ X.

The supremum in (HBC) is always attained (see Corollary 20.1.4 in Chapter
20).

19.1.2 Examples
Example 1. Consider X = R² with norm ‖(x, y)‖ = max{|x|, |y|}. Set
X+ = {(x, y) : x ≥ 0, y ≥ 0}, and let I = (1, 1). Note that X′+ = X+ and
‖(w, v)‖_{X′} = |w| + |v|. Here η*(x, y) = max{x, y} is a size function.
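Reading the size function as η*(x) = inf{t ∈ R : tI − x ∈ X+} (the formula assumed here, consistent with this example and the next), a brute-force scan over t reproduces max{x, y}:

```python
import numpy as np

# X = R^2 with the max norm, X+ the nonnegative quadrant, I = (1, 1).
def eta_star(x, y):
    ts = np.linspace(-10.0, 10.0, 20001)          # grid of candidate t's
    feasible = ts[(ts - x >= 0) & (ts - y >= 0)]  # t*I - (x, y) lies in X+
    return feasible.min()

samples = [(3.0, -1.0), (-2.0, -0.5), (1.5, 1.5)]
results = [(eta_star(x, y), max(x, y)) for x, y in samples]
print(results)
```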

Example 2. If A is an n × m matrix, let

denote the singular values of A. We define functions σ1 and σ_max on the space
of n × m matrices as follows:

Let X = M be the self-adjoint n × n matrices with entries in C, equipped with
the norm σ_max. Then, since every element in M can be written as a linear
combination of n² self-adjoint matrices with real coefficients, one can easily
show that M′ = M (as a set) and X′+ = X+, via

Here y* is the transpose of y, and tr denotes the trace function. Also one can
prove that
Let M+ be the cone of the nonnegative definite matrices, and let I be the
identity matrix. Clearly, M′+ = M+. Then

That is, η*(B) is the largest eigenvalue of B. One can show that η* satisfies
(HBC).
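Both readings of η* on self-adjoint matrices can be checked against the largest eigenvalue: the infimum of t with tI − B nonnegative definite, and an (HBC)-style supremum of tr(Bλ) over nonnegative λ, represented here by rank-one matrices vvᵀ with |v| = 1 (unit trace norm). The test matrix is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
B = (B + B.T) / 2                         # an arbitrary self-adjoint matrix
lam_max = np.linalg.eigvalsh(B)[-1]

# eta*(B) = inf{ t : t*I - B nonnegative definite }, found by scanning t.
ts = np.linspace(lam_max - 1, lam_max + 1, 2001)
eta = min(t for t in ts
          if np.linalg.eigvalsh(t * np.eye(4) - B)[0] >= -1e-9)

# sup of tr(B v v^T) = v.B.v over unit vectors never exceeds lam_max,
# and the top eigenvector attains it.
vs = rng.standard_normal((200, 4))
vs /= np.linalg.norm(vs, axis=1, keepdims=True)
sup_val = max(v @ B @ v for v in vs)
w = np.linalg.eigh(B)[1][:, -1]           # unit top eigenvector
print(eta, sup_val, w @ B @ w, lam_max)
```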

Example 3. Let X = C_{n×n}, the set of continuous functions on the unit
circle ∂D of the complex plane C, with values in M, the set of n × n self-adjoint
matrices. (See the previous example.) Equip the space C_{n×n} with the norm

Let I be the constant function equal to the identity matrix, and let X+ be
the cone of continuous functions on ∂D that take nonnegative definite matrix
values. Then

It can be shown that η* satisfies (HBC). The dual space C′_{n×n} of C_{n×n} consists
of functionals that can be represented as n × n self-adjoint matrix-valued bounded
Borel measures. More details about this example are given in section 20.2.1.

19.1.3 The optimization problem


Let X be a Banach space, X+ be a closed cone in X, η* be a size function on
X with respect to I ∈ X+, and F be a real Banach space with dual F′.
Let ℱ : F → X be a C¹ (continuously Fréchet differentiable) map, and k be a
linear functional on F. A basic problem that we consider here is:

(PO)    minimize γ + k(f)

        subject to γI − ℱ(f) ∈ X+.

We shall refer to (PO) as the primal optimization problem and refer to γ and f
as the primal variables.
A pair (γ*, f*) is a local solution to (PO) if there is a neighborhood V of
(γ*, f*) such that the latter point is a solution to the problem obtained from
(PO) by adding the constraint (γ, f) ∈ V. Note that if ℱ is affine in f, then
local solutions are also global. This is not true for general ℱ.
Now we list optimality conditions and an associated primal-dual problem.
The latter is given in terms of the differential Dℱ_f[·] of ℱ at a given f ∈ F,
and its adjoint (Dℱ_f)′ : X′ → F′.¹ Recall that the adjoint operator is defined by

    ⟨(Dℱ_f)′ λ, h⟩ = ⟨λ, Dℱ_f[h]⟩  for all λ ∈ X′, h ∈ F.
The primal-dual optimization problem is

(PDO) Find γ* ∈ R, f* ∈ F, and λ* ∈ X′+ solving

minimize

subject to

We refer to λ as a dual variable. A triple (γ*, f*, λ*) is a local solution to (PDO)
if there is a neighborhood V of (γ*, f*, λ*) such that the latter point is a solution
to the problem obtained from (PDO) by adding the constraint (γ, f, λ) ∈ V.
The quantity

is called the duality gap. The content of our next result is that the duality gap
is zero at local solutions to (PDO).
is zero at local solutions to (PDO).
PROPOSITION 19.1.3. If (γ*, f*, λ*) is a local solution to (PDO), then

Proof. Since γ*I − ℱ(f*) ≥ 0, we have that

Now strict inequality in (19.12) does not hold, since otherwise γ* could be re-
placed by a smaller γ1 that still satisfies the inequality constraint while reducing
the value of the objective, which is not possible. Therefore

We also have, by the definition of η*,

¹ We use L′ to denote the adjoint of a linear operator L.


Strict inequality in (19.14) does not hold, since otherwise one may choose λ1 ∈
X′+ with unit norm such that

Such λ1 exists by Corollary 20.1.4. Replacing λ* by λ1 (or, if necessary, by a
suitable scaling of a convex combination of λ* and λ1) yields a smaller value of
the objective, thus contradicting the fact that (γ*, f*, λ*) is a local solution. Therefore
we arrive at the relation

Combine (19.13) with (19.15) to obtain the conclusion of the proposition.

Proposition 19.1.3 leads to the following system of equations in λ, f, and γ,
which must be satisfied by local solutions to (PDO):

Adding the inequality constraints

to (PDE) gives a set of relations we refer to as (PDE+).

19.1.4 Main high-level theorem

Our next result states that (PO) and (PDO) are closely related.
THEOREM 19.1.4. Suppose that η* satisfies (HBC). If (γ*, f*) is a local so-
lution to (PO) such that γ* > 0, then there exists λ* ∈ X′+ such that (γ*, f*, λ*)
satisfies (PDE+).
Theorem 19.1.4 implies that solving (PDE+) leads to excellent candidates
for solutions to (PO). Our algorithms are based on this. The proof of Theorem
19.1.4 will be given in section 20.1.2.

19.1.5 High-level numerical algorithms


There are conceptual numerical algorithms that can be based on the optimality
conditions (PDE). We do not list them, because our emphasis in this book is on
the special H∞ cases already discussed in Chapter 18.
19.2 Matrix and other optimization problems


In this section we describe areas where our theory applies.

19.2.1 LMIs and related stories


Let A0, A1, …, A_K be n × n real symmetric matrices, and let c ∈ R^K. For
f = (f1, …, f_K) ∈ R^K let ℱ(f) = A0 + A1 f1 + ⋯ + A_K f_K. The optimization
problem

has been studied by several authors; see [VB96] for a beautiful and thorough
discussion, extensive bibliography, and applications. It is the central problem
treated with linear matrix inequality techniques. The problem (19.16) is very
close to problem (PO) when X is the space of real symmetric matrices, X+ is
the cone of positive semidefinite matrices in X, and ℱ is an affine map from
R^K to X. Indeed, one can show that, with the additional condition that at least
one of the A_k is positive definite, problem (19.16) is a special case of
(PO).
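A minimal one-variable illustration, assuming the standard LMI form of (19.16) — minimize cᵀf subject to ℱ(f) nonnegative definite: with A1 = I the feasible set is a half-line, and for c > 0 the optimum sits where the smallest eigenvalue of A0 + f A1 hits zero.

```python
import numpy as np

A0 = np.array([[1.0, 2.0], [2.0, 1.0]])   # eigenvalues -1 and 3 (indefinite)
A1 = np.eye(2)                            # positive definite
lam_min = lambda f: np.linalg.eigvalsh(A0 + f * A1)[0]

# Feasible set {f : A0 + f*A1 >= 0} is the half-line f >= 1 here; minimizing
# c*f with c = 1 over a grid picks out its left endpoint.
fs = np.linspace(-5.0, 5.0, 10001)
feasible = fs[np.array([lam_min(f) >= -1e-12 for f in fs])]
f_star = feasible.min()
print(f_star)   # ≈ 1
```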

19.2.2 Smooth optimization


Traditional unconstrained optimization of a smooth function ℱ on R^n fits into
our present setting. To see this, take X = R, X+ = R+, ℱ : R^n → R, and
k(f) = 0. Then in (PO) the minimization of γ gives γ = ℱ(f), so (PO) becomes

Now we give the optimality conditions (PDE+) for (19.17). First note that λ is
a positive real number, which the normalization forces to be 1. Next note that
0 = (Dℱ_{f*})′ λ* = ∇ℱ(f*). This is the classical "gradient equal to zero" condition.

19.2.3 Return to matrix H∞ optimization

We now state the problem MOPT as a problem of the type (PO).

(PO-H∞) Find γ = γ* ∈ R, f = f* ∈ A_N solving

minimize

subject to

With the setup of Example 3 in section 19.1.2 it is not too hard to show that
Theorem 19.1.4 applies to local solutions to MOPT. See section 20.1.2. That
is, local solutions to MOPT must satisfy the relations (PDE+). With consider-
able work (see section 20.2.3) one can then convert (PDE+) to the conditions
(PDE+H∞) in Theorem 17.1.1.
More information can be found in [HMWPrep].
Chapter 20

Proofs

This chapter is the most technical of the book. It requires a modest amount
of background in Banach spaces. The chapter is divided into two sections.
In section 20.1 basic concepts and results are discussed and Theorem 19.1.4 is
proved. In section 20.2 the general theory is applied to the case of H∞ optimization
and Theorem 17.1.1 is proved.

20.1 The general theory


20.1.1 Size functions and their properties

PROPOSITION 20.1.1. Let η* be a size function with respect to I. Then

1. η* is convex and positive homogeneous.

2. η* is subadditive: for every x, y in X,

3. η* is Lipschitz: for every x, y in X,

and

5. If both x and −x are elements of X+,

6. If x is in X+, then

for every t ∈ R and x ∈ X.


Proof. The proof of Proposition 20.1.1 is straightforward. We present only
the proof of the convexity of η*. For t ∈ [0, 1] we have

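Several items of Proposition 20.1.1 are easy to sanity-check for the concrete size function of Example 2, η* = largest eigenvalue on self-adjoint matrices (random trials; a numerical check, of course, not a proof):

```python
import numpy as np

rng = np.random.default_rng(2)
eta = lambda M: np.linalg.eigvalsh(M)[-1]   # eta* = largest eigenvalue

def sym():
    A = rng.standard_normal((3, 3))
    return (A + A.T) / 2

ok = True
for _ in range(100):
    x, y = sym(), sym()
    ok &= eta(x + y) <= eta(x) + eta(y) + 1e-12             # subadditive
    ok &= np.isclose(eta(2.5 * x), 2.5 * eta(x))            # pos. homogeneous
    ok &= abs(eta(x) - eta(y)) <= np.linalg.norm(x - y, 2) + 1e-12  # Lipschitz
print(ok)
```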
PROPOSITION 20.1.2. Suppose η* satisfies (HBC). If λ ∈ X′+ is such that
there exists x ∈ X+ satisfying

then ⟨λ, I⟩ = ‖λ‖_{X′}.

Proof. We may assume that ‖λ‖_{X′} = 1. From ⟨λ, η*(x)I − x⟩ ≥ 0 we have
that

Hence ⟨λ, I⟩ ≥ 1. We also have that

THEOREM 20.1.3. Let p be a sublinear functional on a Banach space X;
that is, p is real-valued and

Let X+ be a closed cone in X such that

Then for every x0 ∈ X+, x0 ≠ 0, there exists λ0 ∈ X′+ such that


Proof. The existence of λ0 ∈ X′ such that equations (20.3) hold is a conse-
quence of the Hahn-Banach theorem. The only thing left to verify is that λ0 is
a nonnegative functional. From

we conclude that

If in particular x ∈ X+, then it follows from (20.4) and (20.2) that
COROLLARY 20.1.4. If η* is a size function that satisfies (HBC), then for
every x ∈ X there exists λ ∈ X′+ such that

That is, the supremum in (HBC) is attained.


Proof. First assume that x0 is a nonzero element in X+. Take p = η* in
Theorem 20.1.3 to obtain λ0 ∈ X′+ such that (20.3) holds. By (20.4),

Hence ‖λ0‖ ≤ 1. We also have that

by (6) of Proposition 20.1.1, so ‖λ0‖_{X′} = 1. This proves the corollary in the
case when x0 is a nonnegative element in X.
Now suppose that x0 is any element of X that is not in X+. Since

we have that x1 = η*(−x0)I + x0 is in X+. By the first part of the proof there
exists λ0 ∈ X′+ with unit norm such that ⟨λ0, x⟩ ≤ η*(x) for all x ∈ X and such
that

By (20.5) and (7) of Proposition 20.1.1 we have

Therefore,

Since the left-hand side of (20.7) is nonnegative, we have


Proposition 20.1.1 implies that since η*(I) = 1 and x0 has unit norm, we must
have that ⟨λ0, I⟩ ≤ 1; note that ⟨λ0, I⟩ < 1 is not possible, since otherwise
η*(−x0) < 0 by (20.8), thus yielding x0 ∈ X+ — a contradiction. Therefore we
have

Finally substitute (20.9) into (20.6) to obtain
THEOREM 20.1.5. Let η* be a size function such that (HBC) holds. Let
x0 be a nonzero element of X+ and M be a closed subspace of X such that
x0 ∉ M and

Then there exists λ ∈ X′+ satisfying

Proof. Consider the function η : X/M → R, where

The function η is convex. It is also finite valued, since by convexity, for η
to be bounded below it is sufficient to notice that η(x0) is finite. Note that
X+ + M is a nontrivial closed cone in X/M. By Theorem 20.1.3 there exists a
bounded linear functional Λ on X/M such that relations (20.3) hold with p = η.
Also, note that Λ is nonnegative on the (nontrivial) cone X+ + M. The linear
functional Λ induces in a natural way a nonnegative functional λ on X via

Note that ⟨λ, y⟩ = 0 for all y ∈ M. Clearly relations (20.3) hold with p = η and
λ0 = λ, since by hypothesis η*(x0) = η(x0). Therefore,

so ‖λ‖_{X′} ≤ 1 holds. But since

we conclude that

20.1.2 Proof of Theorem 19.1.4

First we prove the theorem with k(·) = 0, since a simple transformation gives
the result involving nontrivial k. Note that the optimal γ* in (PO), by the definition
of the size function η*, is equal to η*(ℱ(f*)). Therefore there exists ε > 0 such that f* and
γ* satisfy

The first step in the proof is to show that the following relation holds:

In order to prove (20.13), it suffices to show that

To prove (20.14), fix h ∈ F. The function φ(t) = η*(ℱ(f*) + t Dℱ_{f*}[h]) is
a convex function of t. If (20.14) is violated, φ(t) < φ(0) for t > 0, and it
follows that φ′₊(0), the right derivative of φ at t = 0, is negative. (The right
derivative exists because φ is convex.) Since by Proposition 20.1.1 η* is Lipschitz
continuous, we have

Then, for all t > 0 small enough,

The functions o1 and o2 in (20.15) satisfy

Hence there exists t > 0 such that

which is impossible. The proof of (20.14) is now complete. Relation (20.13) is
readily obtained from (20.14) by taking the infimum of both members of the
equality.
Since ℱ(f*) is not an element of the closure of Dℱ_{f*}[F], by Theorem 20.1.5
applied to (20.13) there exists a functional λ* ∈ X′+ with unit norm such that

and

Thus (i), (ii), and (iv) of (PDE+) are satisfied. Also, (iii) holds since we are
assuming for now that k = 0, while a local version of (v) is automatically
satisfied.
We have finished the proof except for the k wrinkle. To include k(f) in the
(PDO) formulas, rewrite (PO) as

Replace γ with γ̃ := γ + k(f) to get the problem we have been treating. This
immediately yields the conclusion of our theorem.

20.2 Proofs for H∞ Optima

20.2.1 Definitions and notation
While some of the following notation has already been introduced, we present it
here in order to have a more self-contained section. For p ∈ [1, ∞), let L^p = L^p_{1×1}
be the space of complex-valued Lebesgue measurable functions f on the unit
circle ∂D having finite pth norm ‖f‖_p = {∫ |f(e^{jθ})|^p}^{1/p}. For p = ∞, L^∞ is
the space of measurable functions f on ∂D such that ‖f‖_∞ = ess sup |f(e^{jθ})|
is finite. If g is a function on ∂D that takes matrix values, we say that g is in
(matrix-valued) L^p_{n×m} if the entries of g belong to (scalar) L^p. Functions g in
L^p_{n×m} have an associated Fourier series expansion

where g_ℓ is an n × m matrix, ℓ ∈ Z. We make the assumption (standard in
engineering but not in pure mathematics) that elements of L^p_{n×m} have Fourier
coefficients g_ℓ that are matrices with real entries only. The Hardy space
H^p_{n×m} consists of those g in L^p_{n×m} for which 0 = g_{−1} = g_{−2} = ⋯. Thus H^p,
L^p are real vector spaces.
By C_{n×m} we denote the subset of continuous functions in L^∞_{n×m}. We equip
C_{n×m} with the norm

while the norm we assume on L^1_{n×m} is


The space C′_{n×m} (the dual of C_{n×m}) consists of bounded Borel measures λ.
These measures may be represented in the form

where it is understood that the action of λ on an element g ∈ C_{n×m} is given by

If λ is absolutely continuous with respect to Lebesgue measure, then there exists
a function Ψ in L¹_{n×m} such that

The function Ψ is the density of λ (with respect to Lebesgue measure). With
(20.21), relation (20.20) can be rewritten as

The norms of λ and Ψ are related by

Moreover, if λ is in the dual cone to the continuous functions with n × n nonneg-
ative definite values and if λ is absolutely continuous with respect to Lebesgue
measure with density Ψ, then Ψ takes nonnegative definite values, and in this
case,
20.2.2 Performance, Jacobians, and adjoints

To simplify notation, the convention of writing Γ(e^{jθ}, z) instead of Γ(e^{jθ}, z, z̄)
will be used. If Γ is a performance function, then its (k, ℓ)th entry Γ_{kℓ} is a
scalar-valued function of e^{jθ} and of z = (z1, …, z_N)^t, which for e^{jθ} fixed has
differential in z given by

Then the differential of Γ for e^{jθ} fixed is

216 CHAPTER 20. PROOFS

where ∂Γ/∂z_j and ∂Γ/∂z̄_j are n × n matrices with entries ∂Γ_{kℓ}/∂z_j and
∂Γ_{kℓ}/∂z̄_j, respectively.
One can consider (with abuse of notation) the function Γ as an operator
mapping continuous functions to continuous functions:

The differential of the operator in (20.26) at a given f is a linear map

where

We now state a result that will be needed later.

LEMMA 20.2.1. Let λ be in C′_{n×n}, let Γ be a performance function, and let
f ∈ A_N. If λ ≥ 0, then

Proof. For h ∈ A_N we have

But for 1 ≤ j ≤ N we also have

The statement of the lemma follows from combining (20.30) and (20.31).

20.2.3 Proof of the Matrix H∞ Optimization Theorem 17.1.1

Let λ* ∈ C′_{n×n} be given by Theorem 19.1.4; that is, (γ*, f*, λ*) is a local solution
to (PDE+).
We now claim that λ* is absolutely continuous with respect to Lebesgue
measure. To see this, start from condition (iii) of (PDE+) and apply Lemma
20.2.1 to conclude that every h ∈ A_N satisfies

The last member of relation (20.32) defines a real linear functional, which by
the chain of equalities is the zero functional:

For j = 1, …, N, the C-valued Borel measure tr(∂Γ/∂z_j(·, f) dλ) annihilates
A_1. Hence it is absolutely continuous with respect to Lebesgue measure, and
its density φ_j belongs to L¹. Therefore, we may write

Relation (20.34) can be rewritten in the form

where

Finally, by (17.7) we may solve for dλ in (20.35) to get

Hence λ is absolutely continuous with respect to Lebesgue measure. Let Ψ be
the density function of λ. Note that substituting the resulting formula for dλ into (20.34)
gives relation (2) of PDE+H∞.

Since λ* is in the dual cone, we have, by the discussion in section 20.2.1, that
Ψ* takes nonnegative definite values; that is, relation 4 of PDE+H∞ holds. Also,
by (20.24), and since λ* has unit norm, we have that condition 3 of PDE+H∞
holds.
Now we may rewrite condition 1 of PDE+ as

Since γ*I − Γ(·, f*) and Ψ take nonnegative definite values, the trace of their
product is a nonnegative real-valued function, and since the integral of the latter
is 0, we have

It is an exercise in linear algebra to show that if A and B are n × n nonnegative
definite matrices such that tr(AB) = 0, then AB = 0. Applying this result to
equation (20.38) implies condition 1 of PDE+H∞.
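The linear-algebra exercise invoked at the end can be confirmed numerically: for nonnegative definite A and B, tr(AB) = ‖A^{1/2}B^{1/2}‖_F² ≥ 0, and it vanishes exactly when AB = 0. A sketch with matrices built to have orthogonal ranges:

```python
import numpy as np

rng = np.random.default_rng(3)

# PSD matrices supported on orthogonal subspaces: tr(AB) = 0 and AB = 0.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))     # orthonormal columns
A = Q[:, :2] @ np.diag([2.0, 1.0]) @ Q[:, :2].T      # PSD, rank 2
B = Q[:, 2:] @ np.diag([3.0, 0.5]) @ Q[:, 2:].T      # PSD, orthogonal range

print(np.trace(A @ B))          # 0 up to roundoff
print(np.allclose(A @ B, 0))    # the product itself vanishes

# And tr(AB) >= 0 for any nonnegative definite pair.
M, N = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
print(np.trace(M @ M.T @ N @ N.T) >= 0)
```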
Part VI
Appendices
Appendix A

History and Perspective

While historians trace control to Archimedes' time and the theory of control to a
paper of James Clerk Maxwell on governors, the subject called classical control
began in the Second World War in labs in England and the United States that
designed radar-driven anti-aircraft guns. The beginnings of classical control are
summarized in a book [JNP47] by James, Nichols, and Phillips. According to
Ralph Phillips, a book that was very influential in their lab was Bode's famous
book on amplifiers, although Bode's book is not referenced in [JNP47]. The
primary technique that emerged was adjusting parameters in low order (e.g.,
degree 2) rational functions and checking that the graphs lie in certain regions.
Classical control dominated industrial practice for many years, even though it
could be taught only by example and its main technique was trial and error.
Much of the theory of control in the 1960s and 1970s focused on achieving
desired frequency domain performance as closely as possible in a mean-square-
error sense. We have nothing to add to the literature here and so do not give a
historical treatment.
The subject of optimizing worst-case error in the frequency domain along its
present lines started not with control but with circuits. One issue was to design
amplifiers with maximum gain over a given frequency band. Another was the
design of circuits with minimum broadband power loss. Indeed, H∞ control is a
subset of a broader subject, H∞ engineering, which focuses on worst-case design
in the frequency domain. In paradigm engineering problems this produces what
the mathematician calls an "interpolation problem" for analytic functions. The
techniques of Nevanlinna-Pick interpolation had their first serious introduction
into engineering in a SISO circuits paper by Youla and Saito [YS67] in the
mid-1960s. Further development waited until the mid-seventies, when Helton [H76],
[H78], [H81] applied interpolation and more general techniques from operator
theory to amplifier problems. Here the methods of commutant lifting [A63],
[NF70], [S67] and of Adamjan-Arov-Krein (AAK) [AAK68], [AAK72], [AAK78]
were used to solve MIMO optimization problems. The disk method used in this
context was first described in an engineering article on gain equalization [H81]
and followed the foundational theory published in the mathematics article [H78]
three years earlier.


In the late 1970s G. Zames [Z79] began to marshal arguments indicating
that H°° rather than H2 was the physically proper setting for control. Zames
suggested on several occasions that these methods were the appropriate ones
for codifying classical control. These efforts yielded a mathematical problem
that Helton identified as an interpolation problem solvable by existing means
(see [ZF81]). In 1981 Zames and Francis [ZF83] used this to solve the resulting
single-input, single-output (SISO) problem. In 1982 Chang-Pearson [CP84]
and Francis-Helton-Zames [FHZ84] solved it for a many-input, many-output
(MIMO) system.
The pioneering work of Zames and Francis treated only sensitivity optimiza-
tion. In 1983 three independent efforts emphasized bandwidth constraints, for-
mulated the problem as a precise mathematics problem, and indicated effective
numerical methods for its solution: Doyle [D83], Helton [H83], and Kwaker-
naak [K83]. Helton treated the problem graphically, just as it is done in Part I
of this book, by approximating multiple-disk constraints with a single moving
disk. Kwakernaak gave an equivalent method, but it is expressed algebraically
in terms of two weight functions W₁ and W₂ on the sensitivity function 1 − T
and on T. Also, all of these papers described quantitative methods which were
soon implemented on computers. It was these papers that actually laid out
precisely the trade-off in control between performance at low frequency and
roll-off at higher frequency and how one solves the resulting mathematics prob-
lem. This is in perfect analogy with amplifier design where one wants large gain
over as wide a band as possible, producing the famous gain-bandwidth trade-off.
Rather remarkable in textbook treatments of control through the 1970s was the
absence (or cursory treatment) of bandwidth constraints. On the other hand,
many control practitioners of that period would say that bandwidth constraints
were important, but when they went to formulate them as a mathematical opti-
mization problem they did it by dropping all high-frequency constraints. Thus
they made what the authors like to call the fundamental mistake of H°° con-
trol (see section 10.4), and the concomitant confusion cost the methods much
credibility. As of the 1990s the control community's views have become more
realistic. This conceptual shift is one of the main accomplishments of H°° con-
trol. The moral of the story is that nothing stabilizes a nebulous philosophical
discussion like the ability to solve the math problems that arise. Mush dries up
quickly under this torch.
All of the traditional methods in H°° solved only optimization problems
where the specification sets Sω are disks. There is a beautiful and extensive
theory of this. Much of the mathematical theory came directly from existing
work in the study of operators on Hilbert space. However, an important accom-
plishment of theoretical engineers was to formulate the problem in a new set of
coordinates (called state-space coordinates) that are well suited to many prob-
lems. This extremely elegant theory is described in [BGR90], [F87], [GM90],
[Gl84], [ZDG96].
To describe the origins of state-space H°° engineering we must back up a bit.
Once the power of the commutant lifting-AAK techniques was demonstrated
on engineering problems, P. de Wilde played a valuable role by introducing them
to signal processing applications (see [deWVKTS]) and to others in engineering.
The state-space solutions of H°° optimization problems originated not in H°°
control, but in the area of model reduction. The AAK work, with a shift of
language, is a paper on model reduction (though not in state-space coordinates);
the paper by Bettayeb-Safonov-Silverman [BSS80] gives a state-space viewpoint for
SISO systems. Subsequently Glover [Gl84] gave the MIMO state-space theory of
AAK-type model reduction. Since the H°° control problem was already known
to be solvable by AAK, this quickly gave state-space solutions to the H°° control
problem. These state-space solutions were described first in 1984 by Doyle
[Dreport], which, though never published, was extremely influential. Earlier, in his
thesis (unpublished), he had given state-space H°° solutions based on converting
the geometric (now called behavioral by engineers) version of commutant lifting-
AAK due to Ball and Helton to state-space.
As basic as it is, the work on internal stability treated in Part I was done
fairly recently. In the famous article of Youla-Jabr-Bongiorno [YJB76a],
[YJB76b], the concept of internal stability was introduced and reduced to a
mathematical problem in which all H°° functions meet a given interpolation
constraint. The authors then solved the H2 control problem (rather than opti-
mizing over #°°, as is done in this book).
Independent of the rise of H°° control was the Horowitz approach [Ho63].
The relationship between his approach and ours is described in [BHMer94].
Another independent development was Tannenbaum's [T80] very clever use of
Nevanlinna-Pick interpolation in a control problem in 1980. Also appearing
early on the H°° stage was Kwakernaak's polynomial theory [K86]. Another
major development that dovetailed closely with the invention of H°° control was
a tractable theory of plant uncertainty. A good historical treatment appears in
[DFT92]. Another application of these techniques is to robust stabilization of
systems [Kim84].
The mathematical theory in this book (Part IV) is quite different from and
more general than the classical mathematical theory (Nevanlinna-Pick, Nehari),
which applies only to disk problems. In Chapter 15 we showed how the basic
result of Nehari for disks follows from our approach. Surprisingly, it is not
much more difficult to derive by our approach than it is classically. The reader
interested in connections of the theory developed here with other areas may
want to see [Dy89], [FF91], and [H87]. Work on the general theory for solving
the optimization problem OPT described in Part III and Part IV was started in
1981 by Helton and Howe, who worked on convex problems [HH86]. Since
then theoretical work has been done by J. W. Helton, S. Hui, D. Marshall,
O. Merino, Z. Slodkowski, A. Vityaev, K. Lenz, and E. Wegert in papers listed
in the bibliography. Theoretical work pertaining to numerical issues has been
continued by Helton, Merino, and T. Walker [HMer93a], [HMer93b], [HMW93].
Numerical testing and more theoretical work has been done by J. Bence, J.
W. Helton, O. Merino, J. Myers, D. Schwartz, and T. Walker. Interestingly,
the theory developed to solve the engineering problem OPT has strong ties
with ongoing work in the study of functions of several complex variables. For
some discussion of this, see the articles [HMer91] and [HV97]. The main result
(Theorem 13.3.1) used here for MIMO control is taken from [HMer93a]. Earlier
versions are in the paper [H86], and independently in the pure mathematics
literature a special case concerning "Kobayashi extremals" is due to Lempert
[Le86]. Also interesting is recent work of Zames and Owens [OZ93] on convex
H°° optimization. While [HH86] gave only qualitative properties of optima such
as flatness, a numerical approach was proposed in [OZ95].
Disk iteration algorithms were introduced in [H85] and studied in
[HMer93b]. The paper [HMW93] introduced the Newton computer algorithm
of Chapter 12. The reason Newton's method had not been successful before
on such an old problem is that the sup norm is not a smooth performance
function. Thus one needs a special approach to apply Newton or gradient-like
algorithms. This is the idea behind modern primal-dual methods as described
in Chapter 19. Modern interest in them dates to the work of Karmarkar in
the mid-1980s, and a rapid evolution brought them to the form one sees now.
Good references on the history of these and other methods are [Wr97], [VB96],
[LO96]; we refer the reader to them.
Appendix B

Pure Mathematics and H°° Optimization

We have taken the viewpoint of engineering and computation in several parts of
the book. Much of it could be looked at simply as mathematics. One implication
of this work is that it suggests a new class of problems in functional analysis
and operator theory, or in several complex variables if that is your viewpoint.
First we emphasize that a pure mathematician could read Part III and follow
up with Parts IV and V. One who dislikes computation might want
to read the parts of the book that describe numerical algorithms quickly, but
a slower reading would reveal that they actually could have been cast in much
more mainstream mathematical terms. Indeed, it is the viewpoint of numerics
that leads to an interesting class of questions in classical mathematical fields. To
describe this, first focus on a statement that summarizes what we have already
done to the old-paradigm Nehari problem.

Claim. For the Nehari problem of approximating a given function k ∈ L°°
by functions f ∈ H°°, the two different sets of optimality conditions for solutions f,

1. Equations¹ (15.5) and (15.6) in Theorem 15.2.1
2. The key optimality condition T(f, β) = 0 determined by equations (12.16)
and (12.18) in Chapter 12, or equations (15.8) and (15.9) in Chapter 15
each has the property that its associated differential is invertible at almost all
locations.

We call this a claim rather than a theorem because we use the term "almost
all locations" loosely. Also, the differentials are operators on spaces of functions,
   ¹Due to Nehari and Adamjan-Arov-Krein.
and we did not say exactly what the spaces are. Below we indicate a proof for
H2. Unfortunately, H2 is the wrong space for analyzing Newton's method since
a nonlinear map rarely maps H2 into itself. The proof and precise formulation
of invertibility in the space of H°° ∩ C°° functions with a topology imposed by
a family of Sobolev norms is done in [HMW93].

Idea of Proof. First we prove condition 1. Here we shall denote the Nehari-
commutant lifting conditions

(see (15.5) and (15.6) in Theorem 15.2.1) as a system of equations

where λ = r² > 0 and a ∈ H²_N. The differential

where R_k : L²_N → L²_N is defined by

Our goal is to show that it is invertible (on L²) for almost all choices of
N(a, λ). It is easy to check that the range is onto. Now R_k is a self-adjoint
operator on L²_N and λ is an eigenvalue of R_k which for almost all k has
multiplicity 1. Thus, the range of R_k − λ has codimension 1, so it is not invertible.
Fortunately, the map δ → δa has one-dimensional range in L²_N, and since a is
the null vector of R_k − λ it is perpendicular to range(R_k − λ). Therefore, the
range of N′(a, λ) equals L²_N. Since N′ is self-adjoint it must be invertible.
We now proceed to the proof of condition 2. The optimality conditions for
the OPT problem after projecting are denoted T(f, β) = 0 in Chapters 12 and
15. The objective is to prove that T′(f, β) is invertible for almost all f, β that are
solutions to OPT. This is exactly what Theorem 15.7.3 did.
The technique applied here makes sense for many optimization problems in
function theory. Find optimality conditions C for the problem. These are a set
of equations involving primal and dual variables. In our problem, and probably
in many others, the differential of C is not invertible. Find some way to modify
C and produce new optimality conditions Ĉ with invertible differential Ĉ′. Thus
we have the pure mathematics problem

Find optimality conditions with invertible differential.

Possibly this is a natural type of mathematical result to seek for many function
theory problems.
What we are saying is hardly shocking from the viewpoint of optimization
theory. The mathematically challenging thing is that for the class of problems
we study, one does not immediately get the invertibility condition. Also, it
is a bit surprising that such basic principles of optimization theory were only
recently applied to the very old Nevanlinna-Pick-Nehari problem.
For perspective, and to make sure that the reader interprets this speculation
widely (wildly) enough, we mention some interesting work on optimization in
several complex variables (which takes an approach very different from ours).
This concerns interpolation and approximation problems on the polydisk. There
is striking work of Agler [Agunpub] that derives beautiful matrix inequality con-
ditions for checking if a particular N-P interpolation problem is solvable but for
a norm stronger than the sup norm. Agler's view, implemented successfully in
several problems, is that instead of doing sup norm optimization, one modifies
the norm in some way so that the answer to the problem can be gotten from
an eigenvalue problem or something similar. A very elegant result of this type,
due to Cotlar and Sadosky [CS94], converts the Nehari problem in the polydisk
for a BMO-type norm to an eigenvalue problem.
The suggestion here is that we liberalize our notion of what it means to
answer a question in function theory beyond insisting that an answer must have
the form of an eigenvalue problem. Often by principles of functional analysis it
is not too difficult to write down general optimality conditions. It seems that a
very reasonable pursuit for function theory problems from the pure mathematics
point of view is to find optimality conditions whose differential is invertible. This
is much more challenging. While this seems like a considerable loosening of the
eigenvalue condition, if we insist that our conditions use only explicit formulas,
then it may not be that liberal a condition. Only time will tell if these ideas
extend in many other directions.
There is a yet more liberal notion of solution looming on the horizon that
involves liberalizing DI. The invertibility requirement in DI is motivated by
finding well-behaved "primal-dual" algorithms. Primal-dual interior point al-
gorithms probably do not in fact require invertibility of the differential, but do
require invertibility of the differential compressed to certain cones in function
space. This is under investigation by many researchers in many contexts and is
very open at the moment.
Also, we mention that there are extremely pretty connections between the
theory of analytic disks developed in this book and ongoing work in several
complex variables. The article [HMer91] describes one line of connections, and
the paper [HV97] indicates some others. Readers in the area of several complex
variables might find these articles interesting.
Appendix C

Uncertainty

C.1 Introduction
So far we have considered the system S where the plant P is known. However,
in most physical systems there is typically a certain amount of "plant uncertainty,"
i.e., lack of knowledge about the plant.
To explain what uncertainty is, we use the example of an airplane. We can
model the control of an airplane with a system of differential equations. These
equations are the result of an idealization of the airplane that invariably does
not include some elements, either because we are not aware that they affect the
airplane or because we deem them relatively unimportant and omit them for
simplicity. The differential equations have a series of parameters
that have physical significance and whose values we can tell only approximately.
We may further simplify the model by linearizing the system of differential
equations. The model is surely not perfect; indeed at high frequencies it is
highly inaccurate. Also there are factors affecting the system, among which
are changes in the properties of the components or even malfunction of some
components.
A common approach to treating uncertainty when doing system design is to
assume that there is available a known reference or nominal plant PQ , and that
the true plant P of the system is, in the frequency domain description,

where AQ is unknown. We may think of P as the result of perturbing PQ.


Uncertainty can have a devastating effect on a system. For example, a stable
system may cease to be stable after a perturbation of the plant. This definitely
is something that we do not want under any circumstances, so we shall require
that reasonable amounts of uncertainty should never make a system that we
have designed unstable. Thus we insist that all feasible plants make the system
internally stable. This is called robust stability.
Uncertainty also affects the performance of the system. Ideally, we would
like to design the system with the property that when the plant is perturbed,

the resulting system satisfies the performance requirements originally set. This
must hold for any feasible perturbation of the plant. Sometimes this is referred
to as robust performance.

C.2 Types of uncertainty


The first type of uncertainty arises when a set Pω of possible plant values at
each frequency ω is known, but no specific information on the nature of the
uncertainty is available. This is called unparametric uncertainty. Knowledge of
the uncertainty set Pω is the main input to the design procedure. Here is one
scenario in which the set Pω is determined: We carry out several experiments
to measure P(jω). This produces a (discrete) set of plant values P(jω). This
discrete set is approximated by a set Pω of possible plants as shown in Fig. C.1.

Fig. C.1. An uncertainty set Pω and measurements of P(jω).

The second type of plant uncertainty is called parametric uncertainty. It
arises when it is assumed that the plant P may vary according to certain
parameters ranging in some known sets. As an example, suppose that we are given
the plant

where we know that 0.5 < α < 0.7 and 1 < β < 2. In this case we know
something about P: it is a rational function with relative degree 1 and only one
pole. We also know a range for the parameters α and β.
Finally, one may have a mixed type of uncertainty, where there is a parametric
part and an unparametric part in ΔP. Either type of uncertainty can
be converted to the unparametric type; this is not discussed further.

C.3 Dealing with uncertainty: Worst-case analysis
Now we treat unparametric uncertainty. We assume that at frequency ω, there
is a known set Pω of possible plant values p = P₀(jω) + Δ. In the discussion
below we also assume that the compensator C is fixed, and the symbol C will
be suppressed from now on, so that many expressions that actually depend on C
will not show it explicitly.
We also have a given performance (or cost) function. If we know the exact
value of the plant P, then the performance is denoted by

The problem is that P(jω) is unknown, so we must use a mathematical trick to
produce a conservative estimate of the performance. We form a new performance
function G by taking the worst case possible for the plant value P(jω):

Observe that whatever the true plant P(jω) is, we have that

Thus when G(ω, P₀(jω)) is small, so too is G(ω, P(jω)).


The maximization in (C.3) is quite difficult to handle analytically, even with
functions G and sets Pω that are simple. Later we will calculate G explicitly for
certain cases, but for now we concentrate on developing a practical alternative.

C.4 A method to treat plant uncertainty


If the sets Pω are known, sometimes an explicit formula can be obtained for the
function G. However, this is almost always a very complicated calculation to
do (when it is possible!).
Here we present a method for treating uncertainty that is conceptually simple
(it also generalizes to the MIMO case) and is relatively easy to handle numerically.
It is based on the construction of a performance function G* that depends
on P₀ and not on P, but which nevertheless includes uncertainty information.
It is possibly the simplest compromise one might think of. Much more sophisticated
approaches exist. Probably the most effective is J. Doyle's μ-synthesis,
which is harder to describe. The exposition here is designed to encourage
the reader to try his or her own compromises.
The idea is simple: to form the modified performance G*, we use a Taylor
expansion of G(ω, P₀(jω) + Δ) about Δ = 0:

For simplicity, the variable ω is suppressed in (C.4). The function G* is formed
by dropping from the R.H.S. of expression (C.4) all terms in Δ with order > 1,
and by choosing Δ to have the direction of steepest ascent of G at p = P₀. That
is, Δ has the form

where t is chosen as the largest positive number such that P₀ + Δ belongs to
Pω (see Figure C.2). Thus

where t depends on ω, P₀, G, and the set Pω.


We shall see later that some simplification can be obtained by writing t as

in which case we obtain

This is the case where the uncertainty sets Pω are disks about p = P₀(jω),
with (known) radius δ|P₀(jω)|:

It is easy to see that in this case our t in the expression (C.6) is given by (C.7).
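The first-order recipe above is easy to exercise numerically. The sketch below is a hypothetical illustration, not code from the book or from the OPTDesign package: it picks a smooth illustrative performance G(p) = |k − pC/(1 + pC)| with fixed numbers k and C of our own choosing, takes a disk uncertainty set of radius t = δ|P₀| as in (C.7), and compares the modified performance G* = G(P₀) + t|∇G(P₀)| with a brute-force maximization of G over the disk:

```python
import cmath
import math

def G(p, k=1.0, C=2.0):
    """Illustrative performance (our choice): distance from the ideal
    value k to the closed-loop transfer function p*C/(1 + p*C)."""
    return abs(k - p * C / (1 + p * C))

def grad(f, p, h=1e-6):
    """Numerical gradient of a real-valued f at the complex point p,
    returned as the complex number Gx + i*Gy."""
    gx = (f(p + h) - f(p - h)) / (2 * h)
    gy = (f(p + 1j * h) - f(p - 1j * h)) / (2 * h)
    return complex(gx, gy)

p0 = 1.0 + 0.0j      # nominal plant value P0(jw) at one fixed frequency
delta = 0.05         # relative uncertainty level
t = delta * abs(p0)  # radius of the uncertainty disk, as in (C.7)

g0 = G(p0)
gstar = g0 + t * abs(grad(G, p0))  # first-order worst-case estimate G*

# brute-force worst case over the boundary of the uncertainty disk
gmax = max(G(p0 + t * cmath.exp(1j * 2 * math.pi * i / 2000))
           for i in range(2000))

print(round(g0, 4), round(gstar, 4), round(gmax, 4))
```

For this smooth G and a small δ, the first-order estimate lands within a fraction of a percent of the brute-force worst case, which is the point of the method: G* is cheap to evaluate and nearly as informative.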

C.5 An example with quasi-circular F


To illustrate the method we will do the calculations for a performance function
of the form

where C, W, and k are given functions of frequency ω. A simple calculation
yields

Substituting (C.11) into (C.8), we obtain



Fig. C.2. The quantity Δ is selected to have the direction of steepest ascent for
the performance G(ω, ·), and so that P₀ + Δ is the point in Pω that yields the
largest value d of the linear approximation to G(ω, ·) at P₀. The number d is
an approximation to the largest performance possible when P ranges in Pω. We
set G*(ω, P₀) := d.

Note that the change of variables

in (C.12) produces an elegant formula in terms of the closed-loop transfer
function T₀. Below we call F the resulting performance.

C.6 Performance and uncertainty


We now derive an explicit formula for the worst-case uncertainty performance G,
when the uncertainty set is a disk about P₀ and the given performance function
G is quasi-circular.
We will compare this formula to the modified performance G* obtained in
section C.4 and derive an estimate of the error for this type of performance
function.
THEOREM C.6.1. Fix a frequency ω and suppose that k, P₀, and C are
known complex numbers. Let δ be such that δ|P₀| < 1 and

Set

and

Then

where

Proof. Set Δ = δP₀z, where z ranges in the unit disk in the complex plane,
i.e., |z| ≤ 1. From this relation and from (C.15), (C.16), and (C.18) we get

The conclusion of the theorem follows directly from relation (C.19) and the
following lemma.
LEMMA C.6.2. If a and b are constants such that |b| > 1, then

Proof of lemma. The linear fractional transformation

maps the unit disk one-to-one and onto a disk D₁ in the plane. Note that the
set D₁ is not a half-plane since |b| > 1. If c₁ and r₁ denote the center and radius
of D₁, respectively, then it is clear that the largest modulus possible for a point
in D₁ is |c₁| + r₁. We now determine c₁ and r₁. Note that for all z = e^{iθ} we
have

Some algebra leads to

A result from complex analysis says that rational transformations on the circle
that have constant modulus must have pole and zeros that come in conjugate
pairs. This means that

Solving for c₁ in (C.23) we obtain

and

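The elided displays in the lemma and its proof can be sanity-checked numerically. The formulas below are our own reconstruction of the standard computation, not a quotation of the book's equations: for f(z) = (a + z)/(b + z) with |b| > 1, the image of the unit circle is the circle with center c₁ = (a·conj(b) − 1)/(|b|² − 1) and radius r₁ = |b − a|/(|b|² − 1), so the largest modulus over the closed unit disk is |c₁| + r₁.

```python
import cmath
import math

def max_modulus_formula(a, b):
    """Closed-form max of |(a+z)/(b+z)| over the closed unit disk (|b| > 1):
    the unit circle maps onto the circle with center c1 and radius r1."""
    d = abs(b) ** 2 - 1
    c1 = (a * b.conjugate() - 1) / d
    r1 = abs(b - a) / d
    return abs(c1) + r1

def max_modulus_brute(a, b, n=4000):
    """Brute-force max of |(a+z)/(b+z)| over the unit circle, where the
    maximum over the closed disk is attained (maximum principle)."""
    return max(abs((a + z) / (b + z))
               for z in (cmath.exp(2j * math.pi * i / n) for i in range(n)))

for a, b in [(0j, 2 + 0j), (1 + 1j, 2 - 1j), (0.5 - 0.3j, -1.5 + 1j)]:
    assert abs(max_modulus_formula(a, b) - max_modulus_brute(a, b)) < 1e-4
```

The brute-force maximum agrees with the closed form to grid accuracy for each test pair.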
Recall that the modified performance function G* is a practical approximation
to the actual worst-case uncertainty performance G. The following corollary
gives an estimate for the difference between the two functions.
COROLLARY C.6.3. Suppose that δ|T₀| < 1. Then

See Figs. C.3, C.4, and C.5 for a graphical comparison between the functions
G and G* that arise from equations (C.17) and (C.13).

Fig. C.3. Plot of the worst-case uncertainty performance G that arises from
(C.17) and of the modified performance G* that arises from (C.13) as functions
of T₀ = x + iy. Here k = 0.5, δ = 0.25, and P₀ = 1. The range of x is (−1, 2)
and the range of y is (−1.5, 1.5).

Fig. C.4. Plot of the levels 0.1, 0.5, 1.0, and 2.0 for the difference G − G*.
The functions G and G* of T₀ = x + iy are produced by equations (C.17) and
(C.13) with k = 0.5, δ = 0.25, and P₀ = 1. The variable x is represented on the
horizontal axis.

C.7 Extensions
We now obtain a modified performance by using second-order terms from a
Taylor expansion of the performance. The basic formula we need is the order 2
expansion of G,

where

From the expression (C.25) we can produce several variations of the method
described in section C.4. To illustrate this we carry out the calculations for
functions G with the form

Fig. C.5. Plot of G and G* when T₀ ranges on the x axis. The functions G
and G* are produced by equations (C.17) and (C.13), for k = 0.5, δ = 0.25, and
P₀ = 1.

For such functions we have

The expansion of G in this case is

If Δ is chosen as in (C.5) and (C.7), then after some simplification we obtain
the following modified performance function (now in terms of T₀ instead of P₀):

Note that expression (C.28) has a second-order term (in ΔP) that may be
negative for some T₀'s.
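The gain from keeping second-order terms can be seen numerically. The sketch below is a hypothetical illustration with a performance G(p) = |k − pC/(1 + pC)| of our own choosing (not the book's quasi-circular form): it compares the first-order estimate of section C.4 with the estimate that keeps the second-order Taylor term along the steepest-ascent direction:

```python
import math

def G(p, k=1.0, C=2.0):
    """Illustrative performance (our choice), standing in for the book's
    quasi-circular form."""
    return abs(k - p * C / (1 + p * C))

def directional(f, p0, u, h=1e-4):
    """First and second directional derivatives of f at p0 along the unit
    direction u, by central differences."""
    d1 = (f(p0 + h * u) - f(p0 - h * u)) / (2 * h)
    d2 = (f(p0 + h * u) - 2 * f(p0) + f(p0 - h * u)) / h ** 2
    return d1, d2

p0, t = 1.0 + 0.0j, 0.1

# steepest-ascent direction of G at p0 (numerical gradient)
gx = (G(p0 + 1e-6) - G(p0 - 1e-6)) / 2e-6
gy = (G(p0 + 1e-6j) - G(p0 - 1e-6j)) / 2e-6
u = complex(gx, gy) / math.hypot(gx, gy)

d1, d2 = directional(G, p0, u)
true_val = G(p0 + t * u)            # performance at the perturbed plant
first = G(p0) + t * d1              # first-order estimate, as in section C.4
second = first + 0.5 * t ** 2 * d2  # adds the second-order Taylor term

err1, err2 = abs(true_val - first), abs(true_val - second)
print(round(err1, 6), round(err2, 6))
```

For moderate t, the second-order estimate typically cuts the error by an order of magnitude, at the cost of one extra numerical second derivative per frequency.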
A variation of (C.28) is obtained by dropping some of the second-order terms:
Appendix D

Computer Code for Examples in Chapter 6

This appendix contains computer code of the two sessions with the package
OPTDesign, which were discussed in Chapter 6. The code can be easily modified
to treat other design problems. See the notebook appendixch6.nb.

D.1 Computer code for design example 1

Calculation of the plant. The general (uncertain) plant p_α(s) is calculated
directly from the matrices A, b, c, and d as plant(s, α) = c(sI − A)⁻¹b.
The nominal plant is plant(s, 5).

<<OPTDesign`;

A = {{-2, -0.4, 0., 0.},
     {1., -0.01, -5., 0.},
     {0., 10.0, 0., -10},
     {0., 0.0, alpha, -alpha}};
b = {2,0,0,0};
c = {0,1,0,0};
d = {0};

plant[alpha_,s_] = c.Inverse[s IdentityMatrix[4] - A].b;

p[s_] = plant[5, s];
plantplot = BodeMagnitude[p[s]];
paux[s_] = 1/(s+1)^2;

Radius and center for original list of requirements

r01[w_]=Which[0.0 <= Abs[w] <= 0.12, 0.1,
    0.1 < Abs[w] <= 1.0, 0.32,
    1.0 < Abs[w] <= 5.0, 2.0,
    5.0 < Abs[w] <= 10.0, 0.1,
    10.0 < Abs[w], (0.01/Abs[paux[I 10.]])*Abs[paux[I w]]];

k01[w_]=Which[0. <= Abs[w] <= 5.0, 1.,
    5. < Abs[w], 0.];

FigEnvelopePlot2D0 = EnvelopePlot[Radius->r01,Center->k01,
    FrequencyBand->{0.01,12}];

FigEnvelopePlot3D0 = EnvelopePlot3D[Radius->r01,Center->k01,
    FrequencyBand->{0.01,6}];

Modified center and radius functions. Lines are used to interpolate
certain points chosen beforehand. In particular, the center is set to decrease
linearly from 1 to 0 between frequencies ω = 1 and ω = 5. The function
InterpolatingPolynomial defines a line when only two interpolation points
are specified.

line0[w_] = InterpolatingPolynomial[{{1.,1},{5.,0}},Abs[w]];

k1[w_]=Which[0.0 <= Abs[w] <= 1.0, 1.,
    1.0 < Abs[w] <= 5.0, line0[w],
    5.0 < Abs[w], 0.];

ra[w_]=InterpolatingPolynomial[{{0.0,0.1},{0.12,0.1}}, Abs[w]];
rb[w_]=InterpolatingPolynomial[{{0.12,0.1},{1,0.32}}, Abs[w]];
rc[w_]=InterpolatingPolynomial[{{1.0,0.32},{2.0,2.0}}, Abs[w]];
rd[w_]=InterpolatingPolynomial[{{2.0,2.0},{3.0,2.0}}, Abs[w]];
re[w_]=InterpolatingPolynomial[{{3.0,2.0},{5,0.1}}, Abs[w]];
rf[w_]=InterpolatingPolynomial[{{5.0,0.1},{10.0,0.01}}, Abs[w]];

r1[w_]=Which[0.0 <= Abs[w] <= 0.1, ra[w],
    0.1 < Abs[w] <= 1.0, rb[w],
    1.0 < Abs[w] <= 2.0, rc[w],
    2.0 < Abs[w] <= 3.0, rd[w],
    3.0 < Abs[w] <= 5.0, re[w],
    5.0 < Abs[w] <= 10.0, rf[w],
    10.0 < Abs[w], (0.01/Abs[paux[I 10.]])*Abs[paux[I w]]];

FigCenter = Show[Plot[k1[w],{w,0,7}],Plot[k01[w],{w,0,7},
    PlotStyle -> {{Thickness[0.02],GrayLevel[0.5]}}]];
plotr1a = Show[Plot[r1[w],{w,0,1}],Plot[r01[w],{w,0,1},
    PlotStyle -> {{Thickness[0.02],GrayLevel[0.5]}}]];
plotr1b = Show[Plot[r1[w],{w,0,7}],Plot[r01[w],{w,0,7},
    PlotStyle -> {{Thickness[0.02],GrayLevel[0.5]}}]];
plotr1c = Show[Plot[r1[w],{w,5,12}],Plot[r01[w],{w,5,12},
    PlotStyle -> {{Thickness[0.02],GrayLevel[0.5]}}]];

FigRadius = Show[GraphicsArray[{plotr1a,plotr1b,plotr1c}]];
FigEnvelopePlot2D =
    EnvelopePlot[Radius->r1,Center->k1,FrequencyBand->{0.01,12}];
FigEnvelopePlot3D =
    EnvelopePlot3D[Radius->r1,Center->k1,FrequencyBand->{0.01,6}];

First optimization runs and plots

OPTDesign[p,Center->k1,Radius->r1,Ngrid -> 256,Nsmth -> 100];

FigBodeMagS = BodeMagnitude[1-T, PlotRange -> {-30,10}];
FigBodeMagT = BodeMagnitude[T,PlotRange -> {-40,10}];
FigBodePhaT = BodePhase[T];

Redefinition of the center and radius functions, and plots

rb2[w_]=InterpolatingPolynomial[{{0.12,0.1},{1.3,0.25}}, Abs[w]];
rc2[w_]=InterpolatingPolynomial[{{1.3,0.25},{2.0,2.0}}, Abs[w]];
r2[w_]=Which[0.0 <= Abs[w] <= 0.1, ra[w],
    0.1 < Abs[w] <= 1.3, rb2[w],
    1.3 < Abs[w] <= 2.0, rc2[w],
    2.0 < Abs[w] <= 3.0, rd[w],
    3.0 < Abs[w] <= 5.0, re[w],
    5.0 < Abs[w] <= 10.0, rf[w],
    10.0 < Abs[w], (0.01/Abs[paux[I 10.]])*Abs[paux[I w]]];

k2[w_]= k1[w];

FigCenter2 = Show[Plot[k2[w],{w,0,7}],Plot[k01[w],{w,0,7},
    PlotStyle -> {{Thickness[0.02],GrayLevel[0.5]}}]];
FigRadius2 = Show[Plot[r2[w],{w,0,7}],Plot[r01[w],{w,0,7},
    PlotStyle -> {{Thickness[0.02],GrayLevel[0.5]}}]];

FigEnvelope2Plot2D =
    EnvelopePlot[Radius->r2,Center->k2,FrequencyBand->{0.,3}];
FigEnvelope2Plot3D =
    EnvelopePlot3D[Radius->r2,Center->k2,FrequencyBand->{0.0,3}];

Second optimization run and rational approximation

OPTDesign[p,Center->k2,Radius->r2,Ngrid -> 256,Nsmth -> 100];

TratLow[s_] = RationalModel[s,DegreeOfDenominator -> 6];
CratLow[s_] = Together[1/p[s] TratLow[s]/(1-TratLow[s])];

FigBodeMagTLow = BodeMagnitude[TratLow[s],PlotRange -> {-60,10}];
FigBodePhaTLow = BodePhase[Sample[TratLow[s]]];

FigBodeMagSLow = BodeMagnitude[1-TratLow[s],PlotRange -> {-30,10}];
FigBodeMagCompLow = BodeMagnitude[CratLow[s]];
FigBodePhaCompLow = BodePhase[Sample[CratLow[s]]];

FigTLowZP = PlotZP[TratLow[s],s];
FigCLowZP = PlotZP[CratLow[s],s];

D.2 Computer code for design example 2

Setup of the problem

<<OPTDesign`;

p[s_] = 0.036 (s + 25)/(s^2 (s + 0.02 + I)(s + 0.02 - I));


pinv[s_] = 1/p[s];

wp = 0.7; alphap = 0.9;
wb = 2.0; alphab = 0.75;
wr = 10.; alphar = 0.25/Abs[p[I wr]];

line1[w_] = InterpolatingPolynomial[
    {{wp,alphap Abs[pinv[I wp]]}, {wb,alphab}},w];

line2[w_] = InterpolatingPolynomial[
    {{wb,alphab},{wr,alphar Abs[p[I wr]]}},w];

line3[w_] = InterpolatingPolynomial[{{wp,1},{wb,0}},w];

k[w_] = Which[0 <= Abs[w] <= wp, 1,
    wp < Abs[w] < wb, line3[Abs[w]],
    wb < Abs[w], 0];

r[w_] = Which[0 <= Abs[w] <= wp, alphap Abs[pinv[I w]],
    wp < Abs[w] <= wb, line1[Abs[w]],
    wb < Abs[w] <= wr, line2[Abs[w]],
    wr < Abs[w], alphar Abs[p[I w]]];

EnvelopePlot[Radius->r,Center->k,FrequencyBand->{0.01,12}];

EnvelopePlot3D[Radius->r,Center->k,FrequencyBand->{0.01,12},PlotRange->All];

Optimization run and model reduction

OPTDesign[p,Center->k,Radius->r,Ngrid->256, Nsmth -> 30];

FigBodeMag = BodeMagnitude[T];
FigBodePha = BodePhase[T];
D.2. COMPUTER CODE FOR DESIGN EXAMPLE 2 243

Trat[s_] = Rationa!Model[s,DegreeOfDenominator -> 7];


stepl [t_] = Chop[Simplify[InverseLaplaceTransform[Trat[s]/s,s,t]]];

FigStepTla = Plot[ stepl [t] ,{t,0,14},PlotRange -> All];


FigStepTlb = Plot[ stepl [t] ,{t,0,14}.PlotRange -> {0.9,1.2}];

Crat1[s_] = CancelZP[ pinv[s] Trat[s]/(1 - Trat[s]),s,s];


FigCrat1ZPa = PlotZP[Crat1[s],s];
FigCrat1ZPb = Show[FigCrat1ZPa,PlotRange -> {{-3,3},{-3,3}}];

{num,den} = {Numerator[Crat1[s]],Denominator[Crat1[s]]};
zeros = s /. Solve[num==0,s];
poles = s /. Solve[den==0,s];
Crat2[s_] = Crat1[0] *
(1 - s/zeros[[4]])(1 - s/zeros[[5]])(1 - s/zeros[[6]])/
( (1 - s/poles[[1]])(1 - s/poles[[2]])(1 - s/poles[[3]]));
FigCrat1Crat2 = Plot[{Abs[Crat1[I w]],Abs[Crat2[I w]]},{w,0,2}];
Trat2[s_] = Together[ p[s] Crat2[s]/(1 + p[s] Crat2[s])] //Chop;
FigCrat2ZP = PlotZP[Crat2[s],s];
step2[t_] = Chop[Simplify[InverseLaplaceTransform[Trat2[s]/s,s,t]]];
FigStepT2 = Plot[step2[t],{t,0,14}];
Appendix E

Downloading OPTDesign
and Anopt

If you are using a web browser, go to http://anopt.ucsd.edu and follow the


directions on the web page.

Those who like doing things the hard way can download the packages OPTDe-
sign and Anopt through anonymous ftp.

Type
ftp anopt.ucsd.edu
When the remote system requests the account name, you reply
anonymous
When the system requests the password, type your email address:
myadress.edu
Then type
cd pub/anopt

Now you are in the correct directory. There are two types of files, .tar.gz
(for Unix) and .zip (for MSWindows). Pick the file of your favorite type with
the latest date, download it to your system, and uncompress it.

Appendix F

Anopt Notebook

Anopt: A program for sup norm optimization


over spaces of analytic functions

J.W. HELTON, O. MERINO, J. MEYERS, AND T. WALKER


Lab. for Mathematics and Statistics, University of California, San Diego

F.1 Foreword
Anopt is a Mathematica package for solving diverse optimization problems over
spaces of functions analytic on the unit disk in the complex plane.
The software is useful for engineers doing worst-case design in the frequency
domain. This includes problems in control as well as broadband gain equal-
ization and matching. Another application is in the field of several complex
variables where the program can be used to find analytic disks that are optimal
with respect to various criteria, e.g., Kobayashi metric calculations.
The main program Anopt[] is easy to use, even for those with little or no
computing experience. Anopt[] can be run at a very simple level or at a very
complex level.
The package Anopt was developed at the Laboratory for Mathematics and
Statistics at the University of California at San Diego, during the years 1989 to
1994, by J.W. Helton, Orlando Merino, Julia Myers, and Trent Walker. Finan-
cial support came from the Air Force Office of Scientific Research, the National
Science Foundation, and the NSF-REU program at the San Diego Super Com-
puter Center.
Send comments, questions, and information on bugs to anopt@math.ucsd.edu.


F.2 Optimizing in the sup norm: The problem


OPT
The Mathematica program Anopt can be used to solve min-max problems over
functions f analytic on the unit disk in the complex plane. Anopt can do it
using several algorithms, but the default is a steepest descent-type of algorithm
called disk iteration. We begin by stating the main optimization problem to be
solved.
We shall use the following notation.
AN = the space of C^N-valued analytic functions on the unit disk D in C,
f = (f[1], f[2], ..., f[N]), which extend continuously to the closed disk.

F(e, z) = a smooth, positive-valued function of e in the unit circle.

The main optimization problem is


OPT Given F(e, z), find f* in AN such that

    sup_e F(e, f*(e)) = inf_f sup_e F(e, f(e)),

where the infimum is over f in AN and the supremum is over e in the unit circle.

F.3 Example 1: First run


The problem. Our introductory example consists of a performance func-
tion where there is a single complex variable. Consider the function G(e, z) =
Abs[0.8 + (1/e + z)^2]^2 as the objective or performance function. We want to
solve OPT for this function G(e, z) and calculate the optimal value and optimal
function f*.
Invoking Anopt[] the first time. Begin by entering Mathematica and then
loading the package Anopt.
<<Anopt.m
Anopt 3.0 ©Copyright 1991-97
J.W. Helton and O. Merino. All rights reserved.

To type in the formula for the performance F(e, z) requires a translation to the
symbols that Anopt understands. To enter G in Mathematica, replace z by z[1]
and assign the result to a name, say g:
g = Abs[ 0.8 + (1/e + z[1])^2 ]^2

Besides the performance function, Anopt[] needs as input an error tolerance.


Below we set the tolerance as 0.02.

Anopt[g, 0.02]

It  Current Value         Step     Optimality Tests    Error    Sm.  Grid
    gammastar                      flat      grAlign   ned
0   3.24E+00              N/A      9.9E-01   0 E+00    N/A      NON  32
1   1.3456E+00            4.E-01   4.8E-01   0 E+00    5.2E-04  NON  32
2   1.0242720431082E+00   1.4E-01  4.9E-02   0 E+00    2.4E-05  NON  32
3   1.0100848818098E+00   1.8E-02  1.8E-02   0 E+00    5.4E-03  NON  32

Summary

gammastar = 1.010084881809756E+00
flat = 1.7866711005E-02
grAlign = 0 E+00
ned = 5.35E-03

Output from Anopt. When Anopt runs, the screen output gives informa-
tion on how the run is progressing. In the screen output from the previous run
we see several columns. A brief explanation of their meaning follows.

It Iteration number
Value sup e F(e,/(e)), the value at current iteration.
Step sup norm of difference between last two iterates /.
Flat Flatness optimality diagnostic. It is zero at the solution.
GradAlign Gradient alignment optimality diagnostic. It is zero at the solution.
ned A measure of numerical noise in calculations.
Sm. Indicates whether smoothing takes place.
Grid Number of points on the unit circle used for function evaluation.

In our run above, Flat went from 0.99 down to 0.018 in three iterations,
while GradAlign was zero throughout the iteration (this is not unusual for
scalar-valued examples). At the moment of stopping, the iteration was making
acceptable progress, but the (large) error tolerance we gave as input prevented
the program Anopt[] from obtaining more accurate results.
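The iteration-0 value can be checked by hand: with the default initial guess f = 0, the objective is Abs[0.8 + (1/e)^2]^2, whose sup over the unit circle is 1.8^2 = 3.24, exactly the gammastar entry at It = 0. Here is a minimal Python sketch (ours, not part of Anopt) of how such a sup-norm objective is evaluated on a discrete circle grid:

```python
import numpy as np

def sup_performance(F, f_samples, ngrid):
    """Approximate sup over e on the unit circle of F(e, f(e)),
    given samples of f on an equally spaced grid."""
    e = np.exp(2j * np.pi * np.arange(ngrid) / ngrid)  # grid on the unit circle
    return max(F(ek, fk) for ek, fk in zip(e, f_samples))

# Example 1 performance with the default initial guess f = 0
F = lambda e, z: abs(0.8 + (1 / e + z) ** 2) ** 2
val = sup_performance(F, np.zeros(32), 32)   # 3.24, as in the It = 0 row
```

On the 32-point grid the maximum is attained at e = 1, matching the printed gammastar for iteration 0.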
Solution. When the run is over you will have access to the calculated value
of the function /*, under the name Solution. This is a list of lists with the
following format:

Solution

where x1, x2, ..., xn are complex numbers. The kth entry of Solution (itself a
list) is Solution[[k]].

Dimensions[Solution]
{1, 32}

Thus in our example, only one scalar-valued analytic function was produced. It
consists of 32 values that are the result of sampling the function on a grid of 32
equally spaced points in the unit circle.

Plotting solutions. We now plot the solution as a discrete curve on the


complex plane, using the command

DiskListPlot[Solution]

You can produce plots or manipulate the output algebraically. Also, the
package Anopt.m comes with a function for displaying the solution in 3-D.

DiskListPlot3D[Solution]

F.4 Example 2: The case of vector-valued


analytic functions
The problem. We consider now a problem of optimization over A2, which
is the space of pairs (f1, f2) of analytic functions. We want to find f* that solves
OPT for the performance function

    F(e, z) = Re[1/e + z1]^2 + 4 Im[1/e + z1]^2 + Re[1/e + z2]^2 + 0.3 Im[1/e + z2]^2.

Input. To translate this to the Mathematica language, proceed as before


and replace z1 by z[1], z2 by z[2], etc. The set of inputs is
g = Re[1/e + z[1]]^2 + 4 Im[1/e + z[1]]^2
  + Re[1/e + z[2]]^2 + 0.3 Im[1/e + z[2]]^2;

Anopt[g, 0.001];

It  Current Value         Step     Optimality Tests      Error    Sm.  Grid
    gammastar                      flat      grAlign     ned
0   4.3E+00               N/A      5.3E-01   7.2E-01D    N/A      NON  32
1   2.1545988017648E+00   7.8E-01  3.2E-06   5.5E-02D    3.2E-05  NON  32
2   2.1485374717714E+00   7.6E-02  1.5E-03   1.2E-02D    5.E-05   NON  32
3   2.1469189941809E+00   7.5E-03  3.7E-06   5.9E-04D    5.3E-06  OA5  32

Summary

gammastar = 2.146918994180885E+00
flat = 3.6801434547E-06
grAlign = 5.9442841631E-04
ned = 5.3E-06

Stopping, and how good is the solution? The run stops when the
inequalities

    Flat < tol,    GradAlign < tol

are satisfied, where tol is the error tolerance set by the user. At the solution
one must have Flat = GradAlign = 0, so small Flat and small GradAlign
are a necessary condition for a calculated guess at the answer to be close to the
actual solution.
Sometimes there is difficulty in the calculations. Anopt[] has basically two
ways to deal with it: smoothing and grid-size doubling. Both are performed
automatically if the internal algorithms of Anopt indicate that such action is
necessary. The user may suppress smoothing or grid doubling by specifying
options when Anopt is run. In the run above, smoothing occurred at the third
iteration, while the grid size remained constant at 32.

Fourier coefficients of the solution. Computing the (discrete) Fourier


coefficients of functions is useful for many purposes. The Mathematica com-
mand Fourier[list], where list = {x1, x2, ..., xn}, is an implementation of the
FFT algorithm and returns a list of the Fourier coefficients (times the numer-
ical factor Sqrt[n], where n = number of sample points). To illustrate the use
of Fourier[], we compute the Fourier coefficients of Solution[[2]] and store them
under the name "fc2":
fc2 = Fourier[ Solution[[2]] ]/Sqrt[32.];
The resulting coefficients are ordered according to the indices in {0, -1, -2, -3,
..., 3, 2, 1}. We see below that there is only one nontrivial Fourier coefficient in
the second entry of Solution:
Chop[ fc2 ]
{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-5.936871299232548*10^-9, 0, 2.481559157373434*10^-8, 0,
5.036991796409501*10^-9, 0, 1.020976750493356*10^-9,
0, 2.063555217982383*10^-10, 0, 0, 0, 0, 0, -0.6819202501387195}

Note that it is clear from the expression above that the function f2 of the
calculated solution f = (f1, f2) is a multiple of e.
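The same single-harmonic structure can be checked with NumPy; note that NumPy's FFT conventions differ from Mathematica's Fourier[] (frequencies are ordered {0, 1, ..., -2, -1}, and there is no 1/Sqrt[n] factor). The sample values below are chosen only for illustration:

```python
import numpy as np

n = 32
theta = 2 * np.pi * np.arange(n) / n
f = -0.68 * np.exp(1j * theta)     # a pure multiple of e = Exp[I theta]

# Dividing the forward FFT by n gives the discrete Fourier coefficients;
# a pure harmonic leaves a single nonzero entry (here at frequency index 1).
coeffs = np.fft.fft(f) / n
nonzero = np.flatnonzero(np.abs(coeffs) > 1e-12)
```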

F.5 Example 3: Specification of more input


Anopt is based upon an iterative scheme that generates a sequence of (discrete)
analytic functions f1, f2, f3, etc. This process begins with an initial guess f0,
which in examples 1 and 2 has been defaulted to the function zero, on a discrete
grid of 16 points. In this section you will learn how to start Anopt on an initial
guess f0 that you specify as a rational function sampled on the circle.

Statement of the problem: Example 1 revisited. Suppose we want


to solve example 1, only we want to start the iteration with the initial function
f(e) = (e^2 - 3e + 4)/(e + 3). We define the performance first.

We can specify the initial guess at the solution as an expression in "e" (which represents
Exp[I theta]). For example, we set

The performance function we deal with here has only one complex variable
(N = 1); hence only one entry is required in f0. In the case N > 1 the different
entries of f0 have to be separated by commas.
The calling sequence to be used now takes as input the function f0 as initial
guess, besides the performance g and the tolerance tol. The sequence is Anopt[g,
tol, f0]. Many other inputs can be specified as options. Below we set the
maximum number of iterations to be 3:

Anopt[g, .0001, f0, Iterations->3]

It  Current Value         Step     Optimality Tests    Error    Sm.   Grid
    gammastar                      flat      grAlign   ned
0   9.6039969948935E+01   N/A      9.3E-01   1.E+00    N/A      NON   32
1   1.9365155086078E+00   3.8E+00  9.1E-01   0 E+00    2.E-02   NON   32
2   1.1423579024226E+00   2.8E-01  2.4E-01   0 E+00    3.9E-04  5A10  32
3   1.0093525941247E+00   6.7E-02  2.1E-02   0 E+00    2.1E-03  NON   32

Summary

gammastar = 1.009352594124695E+00
flat = 2.08995166745E-02
grAlign = 0 E+00
ned = 2.12E-03

Another valid way to specify the initial guess is a list of values obtained by
sampling the function on a grid of equally spaced points. To produce a discrete
version of f0 on a 32-point grid you can use a replacement rule. Note the braces
around the rational function:

Restarting the run. In the above, the process stopped because the limit
on the number of iterations was attained. If you want Anopt to proceed from
the last iteration above, you can restart the iteration. For this, take as an initial
guess for the new run the last iterate of the previous run, which is stored in the
variable Solution.
Below we double the grid size of the answer we obtained in the run above
and assign it to f1. Then f1 is used to restart the iteration.
f1 = DoubleGrid[Solution];
Anopt[g,0.00002,f1]

It  Current Value         Step     Optimality Tests    Error    Sm.   Grid
    gammastar                      flat      grAlign   ned
0   1.0093525941247E+00   N/A      2.1E-02   0 E+00    N/A      NON   64
1   1.0005560979092E+00   5.7E-03  1.1E-03   0 E+00    2.3E-03  NON   64
2   1.0002780415569E+00   9.E-04   5.6E-04   0 E+00    1.2E-03  5A10  64
3   1.000033365362E+00    1.6E-04  6.7E-05   0 E+00    2.E-03   NON   64
4   1.0000050181354E+00   1.6E-05  1.3E-05   0 E+00    3.1E-03  NON   64

Summary

gammastar = 1.000005018135407E+00
flat = 1.291679907E-05
grAlign = 0 E+00
ned = 3.1E-03

The solution has the following structure.


Dimensions[Solution]
{1, 64}
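A standard way to double the grid of a smooth sampled function on the circle is trigonometric interpolation: zero-pad the FFT at the high frequencies. The Python sketch below illustrates the idea; whether DoubleGrid uses exactly this construction is our assumption, not a statement about the package:

```python
import numpy as np

def double_grid(samples):
    """Double the number of samples of a function on the unit circle by
    zero-padding its FFT (trigonometric interpolation). A sketch only;
    Nyquist-bin handling is ignored for simplicity."""
    n = len(samples)
    c = np.fft.fft(samples) / n
    # insert n zeros in the middle of the FFT layout (the high frequencies)
    padded = np.concatenate([c[:n // 2], np.zeros(n), c[n // 2:]])
    return np.fft.ifft(padded) * (2 * n)

theta = 2 * np.pi * np.arange(8) / 8
f = np.exp(1j * theta)     # samples of e on an 8-point grid
g = double_grid(f)         # 16 samples of the same function
```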
Manipulating solutions. As an example of how to manipulate output, we
plot below the discrete values of the function F(e, f*(e)). We then compare this
graph with the claim (from the theory) that gdisc(e) is constant. For this one
must substitute in the formula g = g(e, z[1]) the variable z[1] by Solution[[1]] and
e by its discrete values on a grid of equally spaced points. This is accomplished
using replacement rules as shown below. First we produce a discrete version of
Exp[I theta] for theta in [0, 2 Pi].
ngrid = 64;
edisc = Table[N[Exp[2 Pi I (i-1)/ngrid]],{i,1,ngrid}];
Now we use replacement rules.
gdisc = g /. {e -> edisc, z[1]->Solution[[1]]};
The plot is produced below. Here we specify a range:
ListPlot[gdisc,PlotRange -> {0,2}]

We conclude that gdisc is approximately constant. Let us find out how close to


constant:
Max[gdisc]-Min[gdisc]
0.0000129169
Combining algebra and plots. Now we try something a bit different.
Suppose you wish to compute and plot in 3-D the partial derivative of the
performance with respect to z, where z is our calculated solution. This requires
only two lines of input for the user. We also plot the result on the complex
plane for comparison. The function grad defined below is discrete; that is, it is
a list of values.

grad = ComplexD[g,z[1]] /.
    {e -> edisc, z[1]->Solution[[1]]};
DiskListPlot[grad,PlotJoined->True]

F.6 Quick reference for Anopt


The output table. After every iteration, eight numbers are placed on the
screen. Recall that each new function fk generated by Anopt corresponds to
a new line in the output table. The first line (iter = 0) corresponds to f0. Since
we did not specify f0 in the two examples above, it has been defaulted there to
the zero function. The columns of the table are explained below.

HEADING EXPLANATION

Iter Iterate number k.

SupGamma sup over e of G(e, f(e)), the value at the current f.

Step sup norm of the difference between the last two iterates f.

Flat, GradAlign Optimality tests. If f is near optimal,


then Flat and GradAlign are close to 0.

NED Numerical Error Diagnostic. If relatively large


it indicates numerical trouble, usually due to
slowly decaying Fourier coefficients.

Sm Automatic smoothing of fk.

Grid Number of samples on the unit circle, a power of 2,


increased by Anopt as needed.

Usually the most important numbers on the screen are the Flat and GradAlign
diagnostics. As they approach zero, the current guess is expected to approach
the solution. Under mild hypotheses, the solution is unique in OPT problems
with only one (scalar-valued) analytic unknown function. When the number of
analytic unknown functions N > 1, solutions are not unique. There may be many
local solutions, which Anopt may find when run with different initial guesses.
Explanation of the output table will be expanded in section 1. A more
advanced user may set the options Diagnostics->1 and Diagnostics2->1.
This gives additional information on the run, which is stored automatically in
certain files.
Some Mathematica functions and notation

Abs[z] Absolute value of z.


Conjugate[z] Conjugate of z.
z^n nth power of z.
Re[z] Real part of z.
Im[z] Imaginary part of z.
{x1,x2,x3} A list with entries x1,x2,x3.
D[g,z] Derivative of g with respect to z.
Fourier[f] Fast Fourier transform of f.

Some Anopt utilities

ComplexCoordinates [g] Applies transformation rules to g to write in terms of


the complex variables and their conjugates.

ComplexD[g,z] Complex derivative of g with respect to z.

ComplexD[g,Conjugate[z]] Complex derivative of g with respect to Conjugate[z].

DoubleGrid[f] Doubles the number of sample points


of a discrete function.

HalfGrid[f] Halves the number of sample points


of a discrete function.

DiskListPlot[f] Plots a list of complex numbers in the Re-Im plane.

DiskListPlot3D[f] Plots a list of complex numbers in 3-D space.


One of the axes corresponds to the parameter,
assumed to be theta in [0, 2 Pi].
Appendix G

NewtonInterpolant
Notebook

Interpolation with rational functions


using NewtonInterpolant

J.W. HELTON AND O. MERINO


Lab. for Mathematics and Statistics, University of California, San Diego

Introduction
Let R[s] be a function of the complex variable s, and let

    S = {s1, s2, ..., sn},    Z = {z1, z2, ..., zn}

be given sets of n complex numbers. The relation R[s1] = z1 is called an inter-
polation condition, and several of these form a "set of interpolation conditions":

    R[si] = zi,   i = 1, ..., n.                                        (INT)

A function R[] that satisfies (INT) is called an interpolant for (INT).
The function NewtonInterpolant[] produces a rational function, say R[s],
that satisfies a set of interpolation conditions specified by the user as input, and
has all its poles at a single location, typically a negative real number.
There are many rational functions that satisfy (INT). NewtonInterpolant[]
produces one that is the most economical in the sense that it is proper and the
denominator has the smallest degree possible.
NewtonInterpolant[] can also deal with more general problems, with inter-
polation conditions on the derivatives of the function.


G.1 First calculation of an interpolant


Begin by loading the package

<<NewtonInterpolant`

The problem is to find an interpolant R[s] for the interpolation conditions

    R[3] = 1,    R[1] = -2.

The answer is obtained with

data = { {3,1} , {1,-2} };


rat1 = NewtonInterpolant[data,s]

To verify that the rational function is in fact an interpolant is easy:
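For readers without the package, the divided-difference scheme behind Newton-style interpolation is easy to reproduce in the polynomial case; the Python sketch below interpolates the same data {{3,1},{1,-2}}. (NewtonInterpolant[] itself returns a proper rational function with poles at a chosen location; this polynomial sketch only illustrates the Newton scheme.)

```python
def newton_interpolant(points):
    """Polynomial interpolant through (x, y) pairs, built from Newton
    divided differences and evaluated in nested (Horner-like) form."""
    xs = [x for x, _ in points]
    coef = [y for _, y in points]
    # divided-difference table, computed in place
    for j in range(1, len(xs)):
        for i in range(len(xs) - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])

    def R(s):
        val = coef[-1]
        for k in range(len(xs) - 2, -1, -1):
            val = val * (s - xs[k]) + coef[k]
        return val

    return R

R = newton_interpolant([(3, 1), (1, -2)])   # satisfies R[3] = 1, R[1] = -2
```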

G.2 Specifying the pole location


In the example above the pole location of the interpolant is s0 = -1. You can
change this value by giving a pole location as input. For example, suppose you
want pole location -4 for the interpolant. This is what you type:

rat2 = NewtonInterpolant[data,s,PoleLocation->-4]

G.3 Specifying the relative degree


If R = N/D is a rational function with N as numerator and D as denominator,
the relative degree of R is the integer

    (degree of D) - (degree of N).

Functions with nonnegative relative degree are called proper, and if the relative
degree is positive the function is called strictly proper.
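With coefficient lists ordered from the highest power down, the relative degree is just a difference of list lengths; a trivial Python sketch:

```python
def relative_degree(num_coeffs, den_coeffs):
    """Relative degree = degree of denominator - degree of numerator,
    with coefficients listed from the highest power down."""
    return (len(den_coeffs) - 1) - (len(num_coeffs) - 1)

# R = (s + 1)/(s^2 + 4 s + 4): relative degree 1, hence strictly proper
rd = relative_degree([1, 1], [1, 4, 4])
```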
The user can specify a relative degree as input for NewtonInterpolant[].
Suppose that you want to determine an interpolant for

with pole location -4 and relative degree 2. This is how you can do it:
In[12]:=
rat2 = NewtonInterpolant[data,s,PoleLocation->-4,RelativeDegree->2]

Out[12]=

G.4 Complex numbers as data


Rational functions that appear in engineering have real coefficients. This cor-
responds in terms of zeros and poles to the statement that complex zeros and
complex poles come in conjugate pairs. These functions are called real-rational,
and they have the property

    R[Conjugate[s]] = Conjugate[R[s]].                                  (RR)

If some of the data you use to set up an interpolation problem is complex and
you want an answer that is real-rational, then you must include pairs of data
points that reflect property (RR). For example, if one interpolation condition is

    R[s1] = z1 with s1 not real,

in order to obtain a real-rational interpolant, your set of interpolation conditions


must include both

    R[s1] = z1   and   R[Conjugate[s1]] = Conjugate[z1].
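Property (RR) is easy to confirm numerically for any rational function with real coefficients; a quick Python check with an arbitrarily chosen real-rational R:

```python
def R(s):
    # a real-rational function: every coefficient is real
    return (s ** 2 - 3 * s + 4) / (s + 3)

s = 2 + 3j
lhs = R(s.conjugate())          # R[Conjugate[s]]
rhs = R(s).conjugate()          # Conjugate[R[s]]
```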
Note: If one of the conditions is not given as input, NewtonInterpolant[] will give
as interpolant a non-real-rational function, since it will assume that you are willing
to give up the condition of real-rational in order to produce the most "economical"
answer.

Problem. Find a real-rational function R such that R[I] = 0, R[2 + 3 I] =


4 - 2 I, R[1] = 3.

Solution. First expand the set of interpolation conditions to

and follow with


In[15]:=

data = { {I,0},{-I,0},{2 + 3 I,4 - 2 I},


{2 - 3 I,4 + 2 I},{1,3} };

rat4 = Newtonlnterpolant[data,s]

Out[15]=

G.5 Higher-order interpolation


More general interpolation conditions are produced when, besides the value of
the function at a point, values of the derivatives of this function at the point
are specified. NewtonInterpolant can handle this case too.
Problem. Find an interpolant R such that R[1] = -3, R[2] = 0, and
R'[2] = 1.
In[17]:=

data = { {l,-3} , {2,0,1} };


rat = NewtonInterpolant[ data, s]

Out[17]=

Problem. Find a real-rational interpolant R with relative degree 1 such


that R[2 I] = 0 and R'[2 I] = 1 - 3 I.

Solution. In this case it is necessary to consider pairs of complex data as


follows.
In[19]:=

data = { {2 I,0,1 - 3 I},{-2 I,0,1 + 3 I} };


NewtonInterpolant[ data ,s, RelativeDegree -> 1]
Out[19]=
Appendix H

NewtonFit Notebook

NewtonFit
Introduction
This Mathematica notebook is the documentation for the package Newton-
Fit, for treating nonlinear L2 data fitting problems. The main function is
NewtonFit[], which is an implementation of the Newton algorithm.
Acknowledgment. Discussions with Jim Easton were helpful.

An L2 approximation problem

The L2 optimization or approximation problem considered here is


(Prob) Given a function F(x, a1, a2, ..., am) and two sets
of n numbers each,
(data points) P = {x1, x2, x3, ..., xn},
(data values) V = {y1, y2, y3, ..., yn},
find (if it exists) an m-tuple (a1, a2, a3, ..., am)
that minimizes

    Sum[ Abs[ yi - F[xi, a1, ..., am] ]^2, {i,1,n} ].
The Mathematica function Fit[] can be used to treat (Prob) when the func-
tion F(x, a1, ...) is linear in the a's. For the general problem (Prob) a more
powerful algorithm is necessary. One example of this is the classical Newton
method, implemented here as the function NewtonFit. Another is the well-
known Gauss-Newton algorithm. These are iterative procedures that, when
they converge, produce a local solution to (Prob). This is the best one can hope
for in such a general optimization problem. In practice, one way to go about
finding global solutions is to run the algorithm repeatedly with different initial
guesses, with the hope that the global solution will be found by one of these
runs.
The emphasis in these notes is on models F(x, a1, ..., am) that are ratio-
nal functions in x, with coefficients given by a1, a2, a3, ..., am. The algorithm
implemented here is very general in that, in principle, it solves (Prob) for F
rational and for many other functions not necessarily rational. The wide range
of problems that can be treated with this implementation is due to the fact that
Mathematica can do both symbolic and numerical calculations.

What does NewtonFit[] do for you? The package NewtonFit.m contains


the function NewtonFit[]. You can

a. Find local solutions to standard nonlinear L2 data fitting problems for a


wide range of models, in particular, models that are rational functions of
given order.
b. Solve weighted nonlinear L2 data fitting problems.
c. Find rational approximations to the data, where the degree and the min-
imum number of stable zeros and poles is specified a priori.

Problems you may encounter when using NewtonFit with


rational models

1. NewtonFit will frequently fail to arrive at a local solution. The nature of


the problem of approximation with rational models is such that in many
cases there are many narrow and curved valleys in parameter space. This
produces a high degree of instability in local algorithms, such as Newton
and Gauss-Newton. A good initial guess is very important.
2. Even if a local solution is produced you can never be sure if it is a global
solution (unless the model is linear in the parameters, i.e., the classical
linear fit).

In practice, a partial cure for both problems 1 and 2 is to produce lots of


runs, initializing each time at a different location in parameter space. More sat-
isfactory would be to complement the local algorithms with different algorithms
having better global behavior. This is not done here, though, for lack of space.
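To make the flavor of these local algorithms concrete, here is a minimal Gauss-Newton sketch in Python on a toy exponential fit. The model and data are our own stand-ins, and NewtonFit itself takes a full Newton step using symbolic derivatives of the Mathematica model, so this is an illustration of the idea only:

```python
import numpy as np

def gauss_newton(resid, jac, a0, iters=20):
    """Gauss-Newton iteration for min_a Sum |resid(a)|^2: at each step,
    solve the linearized least-squares problem for the parameter update."""
    a = np.asarray(a0, dtype=float)
    for _ in range(iters):
        J = jac(a)
        r = resid(a)
        da, *_ = np.linalg.lstsq(J, -r, rcond=None)
        a = a + da
    return a

# Fit y = a1 Exp[a2 x] to noiseless synthetic data, true (a1, a2) = (2, -1.5)
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * x)
resid = lambda a: a[0] * np.exp(a[1] * x) - y
jac = lambda a: np.column_stack([np.exp(a[1] * x),
                                 a[0] * x * np.exp(a[1] * x)])
a_opt = gauss_newton(resid, jac, [1.8, -1.3])
```

Started close to the answer the iteration converges quickly; started far away it can wander or diverge, which is exactly the behavior described in item 1 above.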

H.1 First example


In this example, we sample a high-order rational function of "s" on the imaginary
axis "I w" and fit a low-order rational function to the data that we obtained.
Consider the function

A grid of points I*w[[i]] on the imaginary axis is given by

and corresponding values v[[i]] are generated as

Following Luus and Shafiei-Shenton, a "reduced" second-order model

is sought.

Problem. Given data points (2), data values (3), and the model (4), find
parameters a[1],a[2],a[3],a[4] that minimize

Load the package

SetDirectory["~/BOOK/CODEUPDATE"];
<<NewtonFit.m;

First we produce the gridpoints and grid values:

w = Table[0.01*(1.1)^i,{i,0,99}];


F[s_] := 1/(s^3 + 6. s^2 + 11. s + 6.);
values = F[I w];
points = I w;
The gridpoints s and gridvalues v are combined as a list of pairs (s,v) called
data below. This is one input for NewtonFit [] .

data = Transpose[{points,values}];

The model is set below in terms of parameters a[l], a[2],... and the variable s.

model = (a[1] s + 1.)/(a[2] s^2 + a[3] s + a[4]);

We now choose an initial set of parameters to start the iteration. If these


are not specified, the function NewtonFit[] defaults all of them to 0.

initial1 = {-1.,5.,9.,4.};

The output of NewtonFit is assigned to the variable output1 below. This will
be helpful when manipulating the results of the call to NewtonFit[].

output1 = NewtonFit[data,model,s,a,Parameters->initial1]

Iter  Step          grad            cost

0                   0.754419        0.581507
1     4.05838       0.482129        0.313686
2     3.0786        0.0714609       0.0123889
3     2.09594       0.00824722      0.000601043
4     0.389809      0.000208885     0.000327769
5     0.0285536     1.66219 10^-6   0.000327117
6     0.0000899072  3.82929 10^-11  0.000327117

{Parameters -> {-0.134521, 4.86549, 10.0684, 6.01588},
 SingValHessian -> {0.401005, 0.0583707, 0.00385517, 0.00126542},
 Cost -> 0.000327177, NormGradient -> 3.82929 10^-11,
 NIterations -> 6}

For an explanation of the diagnostics (for example, SingValHessian), type


?SingValHessian at the Mathematica prompt to get a description.
The results of the run above say that a local minimum has been found,
since the gradient of the objective function is close enough to 0, and the Hessian
has the correct signature. This is how you form a function with the optimal
parameters out of the model:

par = Parameters /. output1;


r[s_] = model /.{a[j_] :> par[[j]]}

A plot is now produced that contains the (discrete) data being approximated
and the optimal function r[s].

Show[
    ListPlot[Transpose[{Re[values],Im[values]}],
        DisplayFunction->Identity],
    ParametricPlot[{Re[r[I t]],Im[r[I t]]},{t,0,10.},
        DisplayFunction->Identity],
    DisplayFunction->$DisplayFunction];

H.2 Template for many runs


The example above is not typical in that the run produced satisfactory results
on the first trial. Approximation by rational functions is tricky business; usually
there are lots of local minima, and problems due to indefinite Hessians are not
uncommon. Below you will find an example of a set of commands for running
the program many times, starting the iteration at different random locations in
the space of parameters.
The example treated is the same as above. Initial sets of parameters are gen-
erated randomly with values a[i] in the interval (-5,5). The output of individual
runs is stored for later inspection in a file named output1.

<<NewtonFit.m;
w = Table[0.01*(1.1)^i,{i,0,99}];
F[s_] = 1/(s^3 + 6. s^2 + 11. s + 6.);
values = F[I w];
points = I w;
data = Transpose[{points,values}];
model = (a[1] s + 1.)/(a[2] s^2 + a[3] s + a[4]);
Do[
    initial = 10. {Random[],Random[],Random[],Random[]} - 5.;
    output = NewtonFit[data,model,s,a,
        Parameters->initial];
    Save["output1",initial,output];
    ,{i,1,50}];
Exit

H.3 Using a weight


It is possible to emphasize some data points over others by means of a weight
function as in

Sum[ weight[[i]]*( Abs[ v[[i]] - g[ I*w[[i]] ,a] ]^2 ), {i,1,n} ]


A weight can be specified as input for the function NewtonFit[].
Suppose that the example considered in section 1 gives an answer that we
reject as producing too large a deviation of the model from the data for values
of w[[i]], i = 25, ..., 45. To reduce the error at these 21 points a weight function
is defined as
wt = Join[Table[1.,{24}],Table[10.,{21}],Table[1.,{55}]];
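The effect of the weight is just to multiply each squared residual in the cost. A small Python sketch with the same 24/21/55 weight layout and made-up residuals (the 0.1 values are purely illustrative):

```python
import numpy as np

# same layout as wt above: emphasize the middle 21 sample points
wt = np.concatenate([np.ones(24), 10.0 * np.ones(21), np.ones(55)])

def weighted_cost(weight, v, g_vals):
    """Weighted L2 cost: Sum_i weight_i |v_i - g_i|^2."""
    return float(np.sum(weight * np.abs(v - g_vals) ** 2))

cost = weighted_cost(wt, np.zeros(100), 0.1 * np.ones(100))
```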
We can use the parameter values found in the previous run to initialize the
current one.

initial2 = Parameters /. output1;


output2 = NewtonFit[data,model,s,a,
    Parameters->initial2,Weight->wt]

Iter  Step           grad            cost

0                    0.00372873      0.000526309
1     0.165522       0.0000226568    0.000401831
2     0.00417434     2.57386 10^-8   0.000401817
3     3.43866 10^-7  3.06373 10^-10  0.000401817
4     0.             3.06373 10^-10  0.000401817

{Parameters -> {-0.135428, 4.71745, 10.136, 6.01399},
 SingValHessian -> {1.39787, 0.23132, 0.00153349, 0.00155875},
 Cost -> 0.000401817, NormGradient -> 3.06373 10^-10,
 NIterations -> 4}

To plot the resulting function, just proceed as in the first example.

H.4 Stable zeros and poles


A zero z of a rational function R[x] is stable if Re[z] < 0. A pole z of R[x] is
stable if Re[z] < 0. We now consider the problem of finding the set of optimal
parameters in (Prob) with a rational model, with the additional constraint that
the resulting rational function has (a) stable poles, (b) stable zeros, or
(c) stable poles and zeros. The key to solving this problem with the tools we
have is the following result.

PROPOSITION. Let P[x] be a monic polynomial with real coefficients. Then


P[x] has all its zeros in the open (closed) left half-plane if and only if, when P
is factored over the reals as a product of degree 1 and degree 2 monic polynomials,
the coefficients of each factor are positive (nonnegative).
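The proposition is easy to test numerically: build a polynomial as a product of degree 1 and degree 2 factors with positive coefficients and check its roots. A quick Python sketch with arbitrarily chosen factors:

```python
import numpy as np

# (s + 0.5)(s^2 + 0.3 s + 2): every factor has positive coefficients
p = np.polymul([1.0, 0.5], [1.0, 0.3, 2.0])
roots = np.roots(p)
stable = bool(np.all(roots.real < 0))   # all roots in the open left half-plane
```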

We can take advantage of this result because the function NewtonFit[]


calculates derivatives by symbolic differentiation. As an example, consider the
problem from section 1 with the additional constraint that all zeros and poles
of the resulting rational function are located in the left half-plane. Note that
the answer obtained in section 1 is not satisfactory, because it has a zero in the
RHP. Define as model
model3
Note that any choice of parameters in model3 produces rational functions with
zeros and poles off the right half-plane.
The set of initial parameters given below has been determined with a multiple
run of NewtonFit[] with several initial sets of parameters. This set in particular
produces a local solution, as shown below.

initial3 = {0.,.9,1.3,.17};
output3 = NewtonFit[data,model3,s,a,
    Parameters->initial3]

Iter  Step            grad           cost

0                     0.323703       0.00430163
1     0.112943        0.022005       0.00156777
2     0.00762118      0.0000177771   0.00154469
3     0.0000305235    1.18668 10^-9  0.00154469
4     4.53645 10^-11  1.08998 10^-9  0.00154469

{Parameters -> {0., 1.00936, 1.31714, 0.166463},
 SingValHessian -> {91.4903, 0.77512, 0.350922, 0.0313337},
 Cost -> 0.00154469, NormGradient -> 1.08998 10^-9,
 NIterations -> 4}

Now the rational function obtained above is produced, and a plot is gener-
ated.
par3 = Parameters /. output3;
r3[s_] = model3 /. {a[j_] :> par3[[j]]}

Show[
    ListPlot[Transpose[{Re[values],Im[values]}],
        DisplayFunction->Identity],
    ParametricPlot[{Re[r3[I t]],Im[r3[I t]]},{t,0,10.},
        DisplayFunction->Identity],
    DisplayFunction->$DisplayFunction];

References
Z. Shafiei and A. T. Shenton, Theory and Application of H-infinity Disk Method,
Report no. MES/ATS/BAE/002/90, Department of Mechanical Engineering,
University of Liverpool, U.K.
R. Luus, Optimization in model reduction, Int. J. Control, 32 (1980),
pp. 741-747.
P. Gill, W. Murray, and M. Wright, Practical optimization, Academic Press,
New York, 1986.
Appendix I

OPTDesign Plots, Data, and Functions

Some users may want to manipulate the output of an OPTDesign run before
dealing with rational fits. We present examples of OPTDesign commands to
plot and manipulate T, L, Co, or other lists of data.

Run the example from Chapter 5 before proceeding with the rest of the
notebook.

I.1 Functions and grids


Output functions of OPTDesign runs
A run of OPTDesign produces the calculated closed loop T, the open loop L,
and the compensator Co as lists. If you type, say, T in a session after you run
OPTDesign, then Mathematica returns a list of complex numbers that are the
values of the calculated closed-loop function on an ω-axis grid.
Note that the grid points on the ω axis do not appear explicitly when you ask
Mathematica to show T.
In[17] := T

Out[17] = {0, -0.0371551 + 0.0341142 I, -0.124852 + 0.00349232 I, -0.184928 - 0.129109 I,
 -0.114989 - 0.306715 I, 0.0780776 - 0.371381 I, 0.212942 - 0.285464 I,
 0.236439 - 0.199328 I, 0.231823 - 0.160042 I, 0.230235 - 0.141843 I, 0.230142 - 0.133257 I,
 0.233164 - 0.136253 I, 0.250685 - 0.152637 I, 0.301352 - 0.162629 I, 0.365523 - 0.131061 I,
 0.40342 - 0.0669143 I, 0.41222 + 0. I, 0.40342 + 0.0669143 I, 0.365523 + 0.131061 I,
 0.301352 + 0.162629 I, 0.250685 + 0.152637 I, 0.233164 + 0.136253 I, 0.230142 + 0.133257 I,
 0.230235 + 0.141843 I, 0.231823 + 0.160042 I, 0.236439 + 0.199328 I, 0.212942 + 0.285464 I,
 0.0780776 + 0.371381 I, -0.114989 + 0.306715 I, -0.184928 + 0.129109 I,
 -0.124852 - 0.00349232 I, -0.0371551 - 0.0341142 I}


What is the grid you are currently using in an OPTDesign session?

A single grid is used for sampling all functions in an OPTDesign session. To
see it, type

In[18] := Grid[]
Out[18] = {∞, 10.1532, 5.02734, 3.29656, 2.41421, 1.87087, 1.49661, 1.2185, 1., 0.820679,
 0.668179, 0.534511, 0.414214, 0.303347, 0.198912, 0.0984914, 0., -0.0984914,
 -0.198912, -0.303347, -0.414214, -0.534511, -0.668179, -0.820679, -1.,
 -1.2185, -1.49661, -1.87087, -2.41421, -3.29656, -5.02734, -10.1532}

Putting together the grid and the values of a function

If you wish, you may produce a list of pairs of the form {ω, T[Iω]}. To do this,
type

In[19] := Tpairs = OPTDParametrize[T]

Out[19] = {{∞, 0}, {10.1532, -0.0371551 + 0.0341142 I}, {5.02734, -0.124852 + 0.00349232 I},
 {3.29656, -0.184928 - 0.129109 I}, {2.41421, -0.114989 - 0.306715 I},
 {1.87087, 0.0780776 - 0.371381 I}, {1.49661, 0.212942 - 0.285464 I},
 {1.2185, 0.236439 - 0.199328 I}, {1., 0.231823 - 0.160042 I},
 {0.820679, 0.230235 - 0.141843 I}, {0.668179, 0.230142 - 0.133257 I},
 {0.534511, 0.233164 - 0.136253 I}, {0.414214, 0.250685 - 0.152637 I},
 {0.303347, 0.301352 - 0.162629 I}, {0.198912, 0.365523 - 0.131061 I},
 {0.0984914, 0.40342 - 0.0669143 I}, {0., 0.41222 + 0. I},
 {-0.0984914, 0.40342 + 0.0669143 I}, {-0.198912, 0.365523 + 0.131061 I},
 {-0.303347, 0.301352 + 0.162629 I}, {-0.414214, 0.250685 + 0.152637 I},
 {-0.534511, 0.233164 + 0.136253 I}, {-0.668179, 0.230142 + 0.133257 I},
 {-0.820679, 0.230235 + 0.141843 I}, {-1., 0.231823 + 0.160042 I},
 {-1.2185, 0.236439 + 0.199328 I}, {-1.49661, 0.212942 + 0.285464 I},
 {-1.87087, 0.0780776 + 0.371381 I}, {-2.41421, -0.114989 + 0.306715 I},
 {-3.29656, -0.184928 + 0.129109 I}, {-5.02734, -0.124852 - 0.00349232 I},
 {-10.1532, -0.0371551 - 0.0341142 I}}

Let's check the size of the list Tpairs.

In[20] := Dimensions[Tpairs]
Out[20] = {32, 2}
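In Python terms (purely illustrative — this is not OPTDesign source code), the pairing simply zips the grid with the sampled values, which is why the dimensions come out as {32, 2}:

```python
# Hypothetical stand-in for OPTDParametrize on the first few grid points.
grid  = [float("inf"), 10.1532, 5.02734]
Tvals = [0.0, -0.0371551 + 0.0341142j, -0.124852 + 0.00349232j]
Tpairs = [[w, t] for w, t in zip(grid, Tvals)]
print(len(Tpairs), len(Tpairs[0]))   # prints: 3 2
```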

I.2 Plots
Plotting the envelope and T simultaneously
In[21] := EnvelopePlot3D[Radius -> r0, Center -> k0, ClosedLoop -> T];

The points of T may be joined by typing


In[22] := EnvelopePlot3D[Radius -> r0, Center -> k0, ClosedLoop -> T, PlotJoined -> True];

Bode plots of the discrete function T


The commands BodeMagnitude[T] and BodePhase[T] take as input either func-
tions defined by formulas or lists of data.

In[23] := BodeMagnitude[T,FrequencyBand -> {0.01,10.}];

In[24] := BodePhase[T,FrequencyBand -> {0.1,10.}];

3-D list plot of grid values of T

To plot in 3-D discrete functions of frequency (i.e., lists of values such as T),
use the command

In[25] := RHPListPlot3D[T,PlotRange -> {{0,3},Automatic,Automatic}]

Nyquist plot definition and examples

The Nyquist plot is produced with

In[26] := Nyquist[L];

The points can be joined by typing the command



In[27] := Nyquist[L, PlotJoined -> True];

Nichols plot of grid values


Here is a Nichols plot of the discrete function L.
In[28] := Nichols[L, PlotJoined -> True];

I.3 Rational approximation and model reduction
From data on the grid to rational functions
To find a stable rational function that corresponds to a closed-loop function T
generated by OPTDesign (and that satisfies the internal stability requirements),
you may use

In[29] := Trat = RationalModel[s,DegreeOfDenominator -> 3]


Error = 0.115867

A general-purpose Caratheodory-Fejer approximation by stable rationals is
implemented as the function StableFit. In addition to the function values, you
must specify as input a grid in the standard OPTDesign format (which you can
generate outside of OPTDesign if you like) and the degree of the denominator.
Here we use the list T1 that is a by-product of an OPTDesign run.
In[30] := Dimensions[T1]
Out[30] = {32}
In[31] := wpts = Grid[];
In[32] := StableFit[T1,wpts,DegreeOfDenominator -> 3]
Error = 0.00965834

For more on rational approximation with a different algorithm, see the
NewtonFit notebook (Appendix H).
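RationalModel and StableFit are used as black boxes here. For readers curious about what fitting a rational function to frequency-response data involves, the sketch below — written in Python with NumPy, entirely separate from OPTDesign, and not the algorithm either command actually uses — shows the classical linearized least-squares step (Levy's method): solve T(iω) q(iω) ≈ p(iω) for the polynomial coefficients of p and a monic denominator q.

```python
import numpy as np

def levy_fit(w, T, num_deg, den_deg):
    """Linearized least-squares rational fit (Levy's method):
    find p, q with T(iw) * q(iw) ~ p(iw), q monic of degree den_deg."""
    s = 1j * np.asarray(w, dtype=float)
    T = np.asarray(T, dtype=complex)
    Vp = np.vander(s, num_deg + 1)           # columns s^n, ..., s, 1
    Vq = np.vander(s, den_deg + 1)[:, 1:]    # lower-order denominator columns
    A = np.hstack([Vp, -T[:, None] * Vq])
    b = T * s**den_deg                       # monic leading q term moved right
    # Solve over the reals by stacking real and imaginary parts.
    x, *_ = np.linalg.lstsq(np.vstack([A.real, A.imag]),
                            np.concatenate([b.real, b.imag]), rcond=None)
    p = x[:num_deg + 1]
    q = np.concatenate([[1.0], x[num_deg + 1:]])
    return p, q                              # highest-order coefficients first
```

With exact samples of T(s) = 1/(s + 1) and num_deg = 0, den_deg = 1, the solve recovers p = [1] and q = [1, 1]. Practical algorithms (Caratheodory-Fejer, NewtonFit) go well beyond this linearization, which becomes ill conditioned at high degrees.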

From a rational function to data on a grid

If you have a rational function Trat rather than a list of values, the following
command gives a list of values of Trat on the OPTDesign session grid.
In[33] := Tdisc = Discretize[Trat]

Out[33] = {-5.35865 × 10^-18, -0.0456089 + 0.0339105 I, -0.125632 - 0.0169765 I, -0.148543 - 0.13693 I,
 -0.0892608 - 0.252746 I, 0.0169445 - 0.316023 I, 0.126378 - 0.322324 I,
 0.213756 - 0.289865 I, 0.27219 - 0.239305 I, 0.304969 - 0.185478 I, 0.318879 - 0.136635 I,
 0.320734 - 0.0962158 I, 0.316079 - 0.0647563 I, 0.30898 - 0.0412528 I,
 0.302207 - 0.0239817 I, 0.29751 - 0.0109313 I, 0.295844 + 0. I, 0.29751 + 0.0109313 I,
 0.302207 + 0.0239817 I, 0.30898 + 0.0412528 I, 0.316079 + 0.0647563 I,
 0.320734 + 0.0962158 I, 0.318879 + 0.136635 I, 0.304969 + 0.185478 I, 0.27219 + 0.239305 I,
 0.213756 + 0.289865 I, 0.126378 + 0.322324 I, 0.0169445 + 0.316023 I,
 -0.0892608 + 0.252746 I, -0.148543 + 0.13693 I, -0.125632 + 0.0169765 I,
 -0.0456089 - 0.0339105 I}
References

[AAK68] V. M. ADAMJAN, D. Z. AROV, AND M. G. KREIN, Infinite
Hankel matrices and generalized problems of Caratheodory-
Fejer and F. Riesz, Functional Anal. Appl., 2 (1968), pp. 1-18.

[AAK72] V. M. ADAMJAN, D. Z. AROV, AND M. G. KREIN, Analytic
properties of Schmidt pairs for a Hankel operator and the gen-
eralized Schur-Takagi problem, Math. USSR-Sb., 15 (1972),
pp. 15-78.

[AAK78] V. M. ADAMJAN, D. Z. AROV, AND M. G. KREIN, Infinite
block Hankel matrices and related extension problems, Amer.
Math. Soc. Transl., 111 (1978), pp. 133-156.

[Agreport] J. AGLER, Interpolation, preprint, late 1980s.

[Ahl66] L. V. AHLFORS, Complex Analysis, 2nd ed., McGraw-Hill, New
York, 1966.

[AHO96] F. ALIZADEH, J.-P. A. HAEBERLY, AND M. L. OVERTON,


Primal-Dual Interior Point Methods for Semidefinite Program-
ming: Convergence Rates, Stability, and Numerical Results,
Technical report, Computer Science Department, Courant In-
stitute of Mathematical Sciences, New York University, New
York, 1996.

[Al96] H. ALEXANDER, Gromov's method and Bennequin's problem,
Invent. Math., 125 (1996), no. 1, pp. 135-148.

[A63] T. ANDO, On a pair of commuting contractions, Acta Sci.


Math. (Szeged), 24 (1963), pp. 88-90.

[BHMer94] F. N. BAILEY, J. W. HELTON, AND O. MERINO, Alternative


approaches in frequency domain design of single loop feedback
systems with plant uncertainty, Int. J. Robust and Nonlinear
Control, submitted.


[BGR90] J. A. BALL, I. GOHBERG, AND L. RODMAN, Interpolation of


rational matrix functions, Birkhauser-Verlag, Basel, 1990.

[B92] H. BEKE, Algorithm GKL; Documentation for Computer Code


GKL, Department of Electrical Engineering, University of Min-
nesota, 1992.

[BHM86] J. BENCE, J. W. HELTON, AND D. E. MARSHALL, Optimiza-


tion over H-infinity, Proc. Conference on Decision and Control,
Athens, Greece, December 1986.

[BSS80] M. BETTAYEB, M. G. SAFONOV, AND L. M. SILVERMAN, Op-
timal approximation of continuous time systems, Proc. IEEE,
Albuquerque, 1980, pp. 195-198.

[BB91] S. BOYD AND C. H. BARRATT, Linear controller design:
Limits of performance, Prentice Hall, Englewood Cliffs, NJ,
1991.
[BEFB94] S. BOYD, L. EL GHAOUI, E. FERON, AND V. BALAKRISHNAN,
Linear matrix inequalities in systems and control theory, SIAM
Publications, Philadelphia, 1994.

[Con78] J. B. CONWAY, Functions of one complex variable, Springer-


Verlag, New York, 1978.

[CP84] B. C. CHANG AND B. PEARSON JR., Optimal disturbance re-


duction in linear multivariable systems, IEEE Trans. Automat.
Control, AC-29 (1984), pp. 880-887. Tech report dated Oct.
82.
[C44] R. V. CHURCHILL, Modern operational mathematics in engi-
neering, McGraw-Hill, New York, 1944.

[CS94] M. COTLAR AND C. SADOSKY, Nehari and Nevanlinna-Pick
problems and holomorphic extensions in the polydisk in terms
of restricted BMO, J. Funct. Anal., 124 (1994), pp. 205-210.

[DC80] C. DE BOOR AND S. D. CONTE, Elementary Numerical Analysis:
an algorithmic approach, McGraw-Hill, New York, 1980.

[DG82] C. A. DESOER AND C. L. GUSTAFSON, Design of Multivariable
Feedback Systems with Simple Unstable Plant, Berkeley ERL
Memorandum M82/60.

[Dor90] P. DORATO AND R. K. YEDAVALLI, (eds.), Recent advances in


robust control, IEEE Press, New York, 1990.

[Do81] R.C. DORF, Modern control systems, Addison-Wesley, Read-


ing, MA, 1981.

[Doug72] R.G. DOUGLAS, Banach algebra techniques in operator theory,


Academic Press, New York, 1972.

[D83] J. C. DOYLE, Synthesis of robust controllers and filters, in


Proc. of 22nd IEEE Conference on Decision and Control, San
Antonio, Texas 1983.

[Dreport] J. C. DOYLE, Lecture notes in advances in multivariable con-


trol, Honeywell/ONR workshop, Minneapolis, 1984.

[DFT92] J. DOYLE, B.A. FRANCIS, AND A. TANNENBAUM, Feedback


control theory, Macmillan, New York, 1992.

[DGKF89] J. C. DOYLE, K. GLOVER, P. P. KHARGONEKAR, AND B.A.


FRANCIS, State-space solutions to standard H2 and H°° control
problems, IEEE Trans. Automat. Control, 34 (1989), pp. 831-
847.
[DS81] J. C. DOYLE AND G. STEIN, Multivariable feedback design:
Concepts for a classical modern synthesis, IEEE Trans. Au-
tomat. Control, AC-26 (1981), pp. 4-16.

[Dy89] H. DYM, J contractive matrix functions, reproducing kernel


Hilbert spaces and interpolation, CBMS Regional Conf. Ser.
in Math., 71 (1989).

[Fo91] C. FOIAS, B. FRANCIS, H. KWAKERNAAK, AND J. B. PEARSON,
Commutant lifting techniques for computing optimal H°° con-
trollers, in H°°-control Theory, Lecture Notes in Math., 1496
(1991).

[F91] C. FOIAS, B. FRANCIS, J. W. HELTON, H. KWAKERNAAK,


AND J.B. PEARSON, H°°-control theory, Lecture Notes in
Math., 1496 (1991).

[FF91] C. FOIAS AND A. E. FRAZHO, The commutant lifting approach to
interpolation problems, Birkhauser-Verlag, Basel, 1991.

[F87] B. FRANCIS, A course in H°° control theory, Springer-Verlag, New
York, 1987.

[FHZ84] B. A. FRANCIS, J. W. HELTON, AND G. ZAMES, H°°-
optimal feedback controllers for linear multivariable systems,
IEEE Trans. Automat. Control, AC-29 (1984), pp. 888-900.
Tech. report dated Sept. 1982.

[FOT96] C. FOIAS, H. OZBAY AND A. TANNENBAUM, Robust Control of


Infinite Dimensional Systems, Springer-Verlag, London, 1996.

[FPE86] G. F. FRANKLIN, J. D. POWELL AND A. EMAMI-NAEINI,


Feedback control of dynamic systems, Addison-Wesley, Read-
ing, MA, 1986.

[G81] J. GARNETT, Bounded analytic functions, Academic Press,


New York, 1981.

[GMW84] P. GILL, W. MURRAY, AND M. WRIGHT, Practical optimiza-


tion, Academic Press, London, 1984.

[Gl84] K. GLOVER, All optimal Hankel-norm approximations of lin-
ear multivariable systems and their L°°-error bounds, Int. J.
Control, 39 (1984), pp. 1115-1193.

[GM90] K. GLOVER AND D. C. MCFARLANE, Robust controller de-


sign using normalized coprime factor plant descriptions, Lec-
ture Notes in Control Inform. Sci., 138 (1990).

[Goh64] I. M. GOHBERG, A factorization problem in normed rings,


functions of isometric and symmetric operators, and singular
integral equations, Russian Math. Surveys, 19 (1964), pp. 63-
114.
[GL95] M. GREEN AND D. J. N. LIMEBEER, Linear robust control,
Prentice Hall, Englewood Cliffs, NJ, 1995.

[GKL89] G. GU, P. KHARGONEKAR, AND B. LEE, Approximation of
infinite-dimensional systems, IEEE Trans. Automat. Control,
34 (1989), no. 6.

[HO94] J.-P. A. HAEBERLY AND M. L. OVERTON, Optimizing eigen-
values of symmetric definite pencils, Proc. American Control
Conference, Baltimore, July 1994.

[H76] J. W. HELTON, Operator theory and broadband matching, an-


nounced in Proc. of Eleventh Annual Allerton Conference on
Circuits and Systems Theory, 1976.

[H78] J. W. HELTON, A mathematical view of broadband matching,


IEEE International Conference on Circuits and Systems Theory,
New York, 1978.

[H81] J. W. HELTON, Broadbanding gain equalization directly from


data, IEEE Trans. Circuits and Systems, CAS-28 (1981), no.
12, pp. 1125-1137.

[H82] J. W. HELTON, Non-Euclidean functional analysis and elec-


tronics, Bull. Amer. Math. Soc., 7 (1982), pp. 1-64.

[H83] J. W. HELTON, An H-infinity approach to control, IEEE Con-


ference on Decision Control, San Antonio, Texas, December
1983.

[H85] J. W. HELTON, Worst case analysis in the frequency domain:


The H-infinity approach to control, IEEE Trans. Automat.
Control, AC-30 (1985), no. 12, pp. 1154-1170.

[H86] J. W. HELTON, Optimization over spaces of analytic functions


and the Corona problem, J. Operator Theory, 15 (1986), no. 2,
pp. 359-375.

[H87] J. W. HELTON, Operator theory, analytic functions, matrices


and electrical engineering, CBMS Regional Conf. Ser. in Math.,
68 (1987).

[H89] J. W. HELTON, Optimal frequency domain design vs. an area


of several complex variables: Plenary address, Mathematical
Theory of Networks and Systems, 1989.

[HH86] J. W. HELTON AND R. HOWE, A bang-bang theorem for opti-


mization over spaces of analytic functions, J. Approx. Theory,
47 (1986), no. 2, pp. 101-121.

[HMar90] J. W. HELTON AND D. MARSHALL, Frequency domain design


and analytic selections, Indiana Univ. Math. J., 39 (1990),
no. 1, pp. 157-184.

[HMer90] J. W. HELTON AND O. MERINO, Numerical results in H°°


control, in Proc. American Control Conference, San Diego, Cal-
ifornia, 1990.

[HMer91] J. W. HELTON AND O. MERINO, Optimal analytic disks:


Several complex variables and complex geometry, Part 2, in
Proc. Sympos. Pure Math., Santa Cruz, California, 1991,
pp. 251-262.

[HMer93a] J. W. HELTON AND O. MERINO, Conditions for optimality
over H°°, SIAM J. Control Optim., 31 (1993), no. 6.

[HMer93b] J. W. HELTON AND O. MERINO, A novel approach to accel-
erating Newton's method for sup-norm optimization arising in
H°°-control, J. Optim. Theory Appl., 1993, pp. 553-578.

[HMer98a] J. W. HELTON AND O. MERINO, Classical control using H°°


methods: an introduction to design, SIAM, Philadelphia, 1998.

[HMW93] J. W. HELTON, O. MERINO, AND T. E. WALKER, Algorithms


for optimizing over analytic functions, Indiana Univ. Math. J.,
42 (1993), no.3.

[HMW95] J. W. HELTON, O. MERINO, AND T. E. WALKER, H°° op-
timization and semidefinite programming, Proc. Conference on
Decision and Control, New Orleans, Louisiana, December 1995.

[HMWprep] J. W. HELTON, O. MERINO, AND T. E. WALKER, Semidef-


inite programming and H°° optimization, J. Robust Nonlinear
Control, submitted.

[HS85] J. W. HELTON AND D. SCHWARTZ, A primer on the H°° disk
method in frequency domain design control, Documentation for
Fortran software, developed at Lab. for Math. and Statistics,
University of California, San Diego, 1985.

[HV97] J. W. HELTON AND A. VITYAEV, Analytic functions optimiz-
ing competing constraints, SIAM J. Math. Anal., 28 (1997),
pp. 749-767.
[Hof62] K. HOFFMAN, Banach spaces of analytic functions, Prentice
Hall, Englewood Cliffs, NJ, 1962.

[Ho63] I.M. HOROWITZ, Synthesis of feedback systems, Academic


Press, New York, 1963.

[Hui87] S. HUI, Qualitative properties of solutions to H°°-optimization
problems, J. Funct. Anal., 75 (1987), pp. 323-348.

[JNP47] H. M. JAMES, N. B. NICHOLS, AND R. S. PHILLIPS, Theory of
Servomechanisms, Radiation Lab. Series, vol. 25, McGraw-Hill,
New York, 1947.

[deWVKTS] P. DEWILDE, A. VIEIRA, AND T. KAILATH, On a generalized


Szego-Levinson realization algorithm for optimal linear predic-
tors based on a network synthesis approach, IEEE Trans. Circuit
Theory, Special Issue on Math. Foundations of Systems Theory,
25 (1978), pp. 663-675.

[Kim84] H. KIMURA, Robust stabilizability for a class of transfer func-


tions, IEEE Trans. Automat. Control, AC-29 (1984), pp. 788-
793.
[Kim97] H. KIMURA, Chain scattering approach to H°°-control,
Birkhauser, Boston, 1997.

[K83] H. KWAKERNAAK, Robustness optimization of linear feedback
systems, IEEE Conf. on Decision Control, San Antonio, Texas,
December 1983.
[K86] H. KWAKERNAAK, A polynomial approach to minimax fre-
quency domain optimization of multivariable systems, Int. J.
Control, 44 (1986), pp. 117-156.

[La89] B. LARSON, SISO Robust Controller Design via the H°° Method,
master's thesis, under the direction of Prof. F. Bailey, University
of Minnesota, April 1989.

[Le86] L. LEMPERT, Complex geometry in convex domains, in Proc.


International Congress of Mathematicians, 1986.

[LP61] W. R. LEPAGE, Complex variables and the Laplace transform


for engineers, McGraw-Hill, New York, 1961.

[Lprep] K. LENZ, Properties of Certain Optimal Weighted Mixed Sen-


sitivity Designs, manuscript.

[LO96] A. S. LEWIS AND M. L. OVERTON, Eigenvalue optimization,


Acta Numerica, 5 (1996), pp. 149-190.

[M88] O. MERINO, Optimization Over Spaces of Analytic Functions,


Thesis, University of California, San Diego, 1988.

[Mreport] O. MERINO, Optimizing real valued functionals on H1, manu-
script.

[NF70] B. Sz.-NAGY AND C. FOIAS, Harmonic analysis of operators


on Hilbert space, North Holland, Amsterdam, 1970.

[NN94] Y. E. NESTEROV AND A. S. NEMIROVSKII, Interior point poly-


nomial methods in convex programming, SIAM Publications,
Philadelphia, 1994.

[O90] K. OGATA, Modern control engineering, 2nd ed., Prentice Hall,


Englewood Cliffs, NJ, 1990.

[OZ93] J. G. OWEN AND G. ZAMES, Duality theory for MIMO ro-


bust disturbance rejection, IEEE Trans. Automat. Control, 38
(1993), no. 5.

[RR71] M. ROSENBLUM AND J. ROVNYAK, The factorization problem


for nonnegative operator valued functions, Bull. Amer. Math.
Soc., 77 (1971), pp. 287-318.

[RR85] M. ROSENBLUM AND J. ROVNYAK, Hardy classes and operator


theory, Oxford University Press, 1985.

[S67] D. SARASON, Generalized interpolation in H°°, Trans. Amer.


Math. Soc., 127 (1967) , pp. 179-203.

[SS68] R. SAUCEDO AND E. E. SCHIRING, Introduction to continu-
ous and digital control systems, MacMillan, New York, 1968.

[SS90] Z. SHAFIEI AND A. T. SHENTON, Theory and Application of


H°° Disk Method, Report no. MES/ATS/BAE/002/90, Depart-
ment of Mechanical Engineering, University of Liverpool, U.K.,
October 1990.
[SI95] R. E. SKELTON AND T. IWASAKI, Increased roles of linear
algebra in control education, IEEE Control Systems Magazine,
1995, pp. 76-90.

[Sl89] Z. SLODKOWSKI, Polynomial hulls in C2 and quasicircles, Ann.
Scuola Norm. Sup. Pisa Cl. Sci., 16 (1989), pp. 367-391.

[Sl90] Z. SLODKOWSKI, Polynomial hulls with convex fibers and com-
plex geodesics, J. Funct. Anal., 94 (1990), pp. 156-355.

[SIG98] R. E. SKELTON, T. IWASAKI, AND K. GRIGORIADIS, A unified al-
gebraic approach to linear control design, Taylor and Francis,
London, 1998.

[SK91] A. SAYED AND T. KAILATH, Fast algorithms for generalized


displacement structures, in Proc. Mathematical Theory of Net-
works and Systems, June 1991, Mita Press, Kobe, pp. 27-32.

[T80] A. TANNENBAUM, Feedback stabilization of plants with uncer-


tainty in the gain factor, Int. J. Control, 32 (1980), pp. 1-16.

[Tr86] L. N. TREFETHEN, Matlab Programs for CF Approximation,


Numerical Analysis Report 86-3, Dept. of Mathematics, MIT,
June 1986.
[VB96] L. VANDENBERGHE AND S. BOYD, Semidefinite programming,
SIAM Rev., 38 (1996), pp. 49-95.

[V85] M. VIDYASAGAR, Control systems synthesis: A factorization


approach, MIT Press, Cambridge, MA, 1985.

[Vitprep] A. VITYAEV, Uniqueness of solutions of an H°° optimization


problem in complex geometric convexity, J. Geom. Anal., to
appear.

[We92] E. WEGERT, Nonlinear boundary value problems for holomor-


phic functions and singular integral equations, Academic Verlag,
Berlin, 1992.

[Wi41] D. V. WIDDER, The Laplace transform, Princeton University
Press, Princeton, NJ, 1941.

[Wr97] S. WRIGHT, Primal-dual interior-point methods, SIAM Publi-


cations, Philadelphia, 1997.

[YJB76a] D. C. YOULA, H. A. JABR, AND J. J. BONGIORNO, Modern
Wiener-Hopf design of optimal controllers — Part I: The single-
input single-output case, IEEE Trans. Automat. Control, AC-
21 (1976), pp. 3-13.

[YJB76b] D. C. YOULA, H. A. JABR, AND J. J. BONGIORNO, Modern
Wiener-Hopf design of optimal controllers — Part II: The mul-
tivariable case, IEEE Trans. Automat. Control, AC-21 (1976),
pp. 319-338.

[YS67] D.C. YOULA AND M. SAITO, Interpolation with positive real


functions, J. Franklin Inst., 284 (1967), pp. 77-108.

[Yng88] N. YOUNG, An Introduction to Hilbert space, Cambridge Uni-


versity Press, New York, 1988.

[Z79] G. ZAMES, Optimal sensitivity and feedback: Weighted semi-


norms, approximate inverses, and plant invariant schemes,
Proc. Allerton Conf., 1979.

[ZF81] G. ZAMES AND B. FRANCIS, Feedback and minimax sensitiv-


ity, Advanced Group for Aerospace Research and Development,
NATO Lecture Notes, no. 117, Multivariable Analysis and De-
sign Techniques.

[ZF83] G. ZAMES AND B. FRANCIS, Feedback, minimax sensitivity,


and optimal robustness, IEEE Trans. Automat. Control, AC-
28 (1983), pp. 585-601.

[ZDG96] K. ZHOU, J. DOYLE, AND K. GLOVER, Robust and optimal


control, Prentice Hall, Englewood Cliffs, NJ, 1996.
Index
analytic function, 117, 120 coordinates, change of, in function
Anopt, 141, 142, 154, 158 space, 120
ARHP, 117, 121 critical point problem for optimality
assumption, standard (SA) equations, 147, 171
on sublevel sets, 129
on OPT, 186 degree, relative, 3, 104
density, 215
Banach space, 144, 201, 210 descent direction, 155
bandwidth, 22, 25, 32 Design, 7, 8, 36, 39, 44
constraint, 23 designable transfer function, 6, 35,
best H°° approximation, see local 43
solution, 168 diagnostics, 141
for Nehari's problem, 168 Flatness, 58
BodeMagnitude[], OPTDesign func- Gradient Alignment, 58
tion, 54, 56 optimality, 58
BodePhase[], OPTDesign function, output, 58
54, 56 directional solution, 155, 183, 184
bounded function, 3, 9, 117, 120 Discretize[], OPTDesign function,
56
disk inequality, 18, 25, 32, 36, 43
Cancel2[], OPTDesign function, 52 disk iteration, 159, 170
Cauchy-Riemann equations, 126, 134n dual cone, 201
central path, 199 dual variable, 204
circular performance function, 38 duality gap, 204
closed-loop
compensator, 5, 17, 27 EnvelopeLogPlot[], OPTDesign func-
function, 17 tion, 50
plant, 5, 18, 28 EnvelopePlot3D[], OPTDesign func-
roll-off, 23, 32 tion, 54, 56
constraint, 24 EnvelopePlot[], OPTDesign func-
system, 5, 17, 35, 105, 109 tion, 49
transfer function, 6, 101, 106, external stability, 13
109, 136
optimal, 50 factorization, spectral, 132
compensator, 5, 11, 23, 31, 105 feedback control, 29
bound constraint, 25, 28 final value theorem, 30, 32
complementarity conditions, 196 Flat, 51, 58, 141


Flatness diagnostic, 58 H^p, 144, 214


flatness of performance at optimum, H^p_N, 168
139, 141, 196 H^p_{n×m}, 214
Fourier expansion, 134, 143, 144, 158 H^∞_N, 174
frequency band, 18 Hahn-Banach condition (HBC), 202,
frequency domain performance re- 205, 210-212
quirement, 18 Hankel operator, 168, 178
function Hardy space, 134, 143
analytic, 117, 120 harmonic function, 134n
bounded, 3, 9, 117, 120 Hilbert space, 144
closed-loop, 17 H°° engineering, 221
harmonic, 134n
measurable, 143 INT, 101, 102-104
optimal, 38 INT0, 103
performance, 118 INTh, 107, 109
plurisubharmonic, 179, 186 INT%, 108
quasi-circular, 232, 233 internal stability, 6, 13, 15, 31, 35,
proper, 3, 9, 35, 106 105, 106, 109
rational, 3, 106 interpolant, 101, 102
real, 9, 106, 117, 120 interpolation condition, 31, 101, 106,
real rational, 4, 35 109
real symmetric, 118 invertible outer function, 179
sensitivity, 5 iteration, 58
size, 202, 209, 211, 212
stable, 4, 9 L^p, 144, 214
strictly proper, 3, 104 L^p_N, 167
transfer L^p_{n×m}, 214
closed-loop, 101, 106, 109, 136 Laplace transform, 4, 21
open loop, 18 Laplace's equation, 134n
open-loop, 5, 23 local solution, 155, 159
fundamental mistake of H°° Con- strict, 155
trol, 135 to PDO, 204
to PO, 203
gain margin, 19
gain-phase margin, 19, 25, 32, 39 matrix performance function, 193
constraint, 20 measurable function, 143
weighted, 20 MIMO, 118, 151, 153, 221-224, 231
γ*, 50 MIMO system, 105
gamma*, 51 minimizer, 118, see solution
global solution, 129, 155 mistake, fundamental, of H°° Con-
good performance, 44 trol, 135
gradient alignment, 141, 156-159, 162, mixed sensitivity, 122
165, 176, 179, 183-185 MOPT, 193
Gradient Alignment diagnostic, 58 μ-synthesis, 231
GrAlign, 51, 58, 141
Grid[], OPTDesign function, 56, 59 ned, 51, 58

Nehari solution, derivation, 172 peak magnitude, 27


Nehari's problem, 168, 169, 172 constraint, 27
power method for solving, 169 performance function, 7, 35-37, 38,
Nehari-commutant lifting formula, 40, 43, 118, 121, 154
174, 175 circular, 38
Nehari-commutant lifting formula, 169 plurisubharmonic, 179, 186
Newton algorithm, 171 quasi-circular, 122, 232, 233
Newton iteration, 159, 171 performance index, 7, 44
Newton's method, 145, 146 performance requirement, 6
applied to critical point prob- bandwidth constraint, 23
lem, 147 closed-loop roll-off constraint, 24
Newton's representation, 102, 103 frequency domain, 18
Nichols[], OPTDesign function, 56 margin, 19
Nyquist plot, 20 tracking error constraint, 22
Nyquist[], OPTDesign function, 56 plant, 5, 11, 23, 31, 36, 39, 105, 118,
factor, 134
open-loop transfer function, 5, 18, margin, 19
23 Plancherel theorem, 21n
OPT, 45, 115, 159, 168, 172, 183 plant, 5, 11, 23, 31, 36, 39, 105, 118,
vector case, 154 229
standard assumption (SA) on, bound constraint, 25, 29, 32
186 nominal, 229
OPTDesign, 40, 45 reference, 229
OPTDesign[], computer output, 50 PlotZP[], OPTDesign function, 52
OPTDParametrize[], OPTDesign func- plurisubharmonic performance func-
tion, 56 tion, 179, 186
OPTT, 44 PO, 203, 205, 206, 214
optimal closed-loop transfer function, Poisson's formula, 134n
50 pole placement, 136
optimal function, 38 pole-zero cancellation, 9, 11, 14, 35,
optimal performance, 38 105
optimal value, 118 power method for solving Nehari's
optimality diagnostic, 58 problem, 169
optimality test for OPTRHP, 130 primal variable, 203
optimization, 38 primal-dual form, 156
order of convergence, 159 proper function, 3, 9, 35, 106
outer function
invertible, 179
output diagnostic, 58 122, 232, 233

parametric uncertainty, 230 rational fit, 60


PDE, 205 rational function, 3, 106
PDE+, 195 RationalModel[], OPTDesign func-
PDE+H°°, 196, 198, 217 tion, 52
PDE+H™, 198 real function, 9, 106, 117, 120
PDO, 204, 205, 214 real rational function, 4, 35

real symmetric function, 118 transfer function, 11, 17


relative degree, 3, 104 closed-loop, 6, 101, 106, 109,
RH°°, 4, 13, 21, 102 136
RHP-stability, 15 optimal, 50
RHPListPlot3D[], OPTDesign func- designable, 6, 35, 43
tion, 56 open-loop, 5, 18, 23
robust performance, 230 type n plant, 31
robust stability, 229
roll-off U, 194
compensator, 58 uncertainty, 229
rate, 39 parametric, 230
Runge's theorem, 131, 132, 136 unparametric, 230, 231

sensitivity function, 5, 17 Weierstrass approximation theorem,


SetGrid[], OPTDesign function, 48n 132
singular values s1(A) ≥ s2(A) ≥ weight function, 43
..., 202 weighted gain-phase margin, 20
SISO, 115, 118, 151n, 221-223 Which[], Mathematica function, 48
size function, 202, 209, 211, 212 winding number, 127, 134, 139, 141n,
solution, 118, 154 146
directional, 155, 183, 184 worst-case performance, 38
global, 129, 155
yaw rate, 29
local, 155
strict, 155
to PDO, 204
to PO, 203
spectral factorization, 132
spiked gain principle, 129
stability
external, 13
internal, 6, 13, 15, 31, 35, 105,
106, 109, 229
robust, 229
stable function, 4, 9
strictly proper function, 3, 104
sublevel set, 119, 121, 127, 129
standard assumption (SA) on,
129
supremum, 3

Taylor expansion, 137, 138, 144, 231,


236
Toeplitz operator, 168, 178
tracking, 32
tracking error, 5, 17, 21, 25, 30
constraint, 22, 30, 31
