Edited by
Herbert Amann, Zürich University
Steven G. Krantz, Washington University, St. Louis
Shrawan Kumar, University of North Carolina at Chapel Hill
Jan Nekovář, Université Pierre et Marie Curie, Paris
Pavel Drábek
Jaroslav Milota
Birkhäuser
Basel · Boston · Berlin
Authors:
Pavel Drábek
Department of Mathematics
Faculty of Applied Sciences
University of West Bohemia in Pilsen
Univerzitní 8
Plzeň
Czech Republic
Jaroslav Milota
Department of Mathematical Analysis
Faculty of Mathematics and Physics
Charles University in Prague
Sokolovská 83
Praha
Czech Republic
Bibliographic information published by Die Deutsche Bibliothek
Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data is available in the Internet at <http://dnb.ddb.de>.
Birkhäuser Verlag AG, Basel · Boston · Berlin
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned,
specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction
on microfilms or in other ways, and storage in data banks. For any kind of use permission of the copyright
owner must be obtained.
Birkhäuser Verlag AG
Basel · Boston · Berlin
P.O. Box, Basel, Switzerland
Part of Springer Science+Business Media
Printed on acid-free paper produced from chlorine-free pulp. TCF
Printed in Germany
www.birkhauser.ch
Contents
Preface
1 Preliminaries
1.1 Elements of Linear Algebra
1.2 Normed Linear Spaces
6 Variational Methods
6.1 Local Extrema
6.2 Global Extrema
6.2A Ritz Method
6.2B Supersolutions, Subsolutions and Global Extrema
6.3 Relative Extrema and Lagrange Multipliers
6.3A Contractible Sets
6.3B Krasnoselski Potential Bifurcation Theorem
6.4 Mountain Pass Theorem
6.4A Pseudogradient Vector Fields in Banach Spaces
6.4B Lusternik–Schnirelmann Method
6.5 Saddle Point Theorem
6.5A Linking Theorem
Preface
Motto:
Real world problems are in essence nonlinear. Hence methods of nonlinear
analysis became important tools of modern mathematical modeling.
There are many books and monographs devoted to methods of nonlinear analysis
and their applications. Typically, such a book is either dedicated to a particular
topic and treats details which are difficult for a student to understand, or it deals
with an application to complicated nonlinear partial differential equations in which
a lot of technicalities are involved. In both cases it is very difficult for a student
to get oriented in this kind of material and to pick up the ideas underlying the
main tools for treating the problems in question. The purpose of this book is
to describe the basic methods of nonlinear analysis and to illustrate them on
simple examples. Our aim is to motivate each method considered, to explain it
in a general form but in the simplest possible abstract framework, and finally, to
show its application (typically to boundary value problems for elementary ordinary
or partial differential equations). To keep the text free of technical details and
make it accessible also to beginners, we did not formulate some key assertions and
illustrative examples in the most general form.
The exposition of the book is at two levels, visually differentiated by different
font sizes. The basic material is contained in the body of the seven chapters. The
more advanced material is contained in appendices to a number of sections and is
presented in a smaller font size. The basic material is independent of the advanced
material, is self-contained, and can be read by students new to the subject. It
should prepare an undergraduate student in mathematics to read scientific papers
in nonlinear analysis and to understand applications of the methods presented to
more complex problems.
Each chapter contains a number of exercises that should provoke the reader's
creativity and help develop his or her own style of approaching problems. However,
the exercises play an additional role. They carry some of the technical material
that was omitted in simplifying some of the basic proofs. They are thus an organic
part of the exposition for graduate students who already have experience with the
methods of nonlinear analysis and are interested in generalizations.
We have organized the material in this book as follows.
In Chapters 1–3, we introduce some necessary notions and basic assertions
from linear algebra (Section 1.1) and linear functional analysis (Sections 1.2–2.2),
and we also present some preliminaries concerning the Contraction Principle and
differential and integral calculus in normed linear spaces (Sections 2.3–3.2). In
this part of the text we give proofs of the results which are closely related to the
nonlinear part of the book. On the other hand, several very important statements
of linear functional analysis are left without proofs.
In Chapter 4, local properties of differentiable mappings are treated. In particular, it includes topics such as the Inverse Function Theorem, the Implicit Function Theorem together with the Rank Theorem and the notion of the differentiable
manifold. Results such as the Lyapunov–Schmidt Reduction and the Morse Theorem are used to prove the Local Bifurcation Theorem of Crandall and Rabinowitz.
Chapter 5 is devoted to the topological and monotonicity methods of nonlinear analysis. We focus on the Brouwer and Schauder Fixed Point Theorems, the
Sard Theorem and the analytic approach to the degree of a mapping, monotone
operators and the method of monotone iterations based on the notions of super- and subsolutions.
In Chapter 6, basic variational methods are presented. We start with local and global extrema and then continue with the method of Lagrange Multipliers with applications to eigenvalue problems (Courant–Fischer and Courant–Weinstein
Principles), the Mountain Pass Theorem and the Saddle Point Theorem. Abstract results from Chapters 4–6 are accompanied by examples of various
boundary value problems for ordinary differential equations. Since these applications are spread over a large number of pages, we add a brief account of examples
of boundary value problems for both ordinary and partial differential equations
together with the methods used at the end of the book. The reader will also find
there a short guide on the bifurcation results presented in the book.
Chapter 7 deals with several applications of the preceding methods to boundary value problems for elementary nonlinear partial differential equations. We
present and discuss the notions of classical and weak solutions and try to minimize the technical difficulties connected with the formulation of the problems. All
this material represents a self-contained introduction to the methods of nonlinear
analysis with simple applications to elementary differential equations.
More advanced material is presented in appendices which are attached to a
number of sections.
In particular, Appendix 3.2A explores an abstract Newton Method as an
application of the Contraction Principle and differential calculus in Banach spaces.
Appendices 4.3A to 4.3D are devoted to the analysis on manifolds (vector
fields, differential forms and integration on manifolds). The main results presented
in these appendices are an abstract version of the Stokes Theorem (and its applications) and the construction of the Brouwer degree by means of differential forms.
Some fixed point theorems for noncompact operators are presented in Appendix 5.1A. As an application of the Leray–Schauder degree theory we consider
global bifurcation theorems in Appendix 5.2A while Appendix 5.2B is devoted to
the generalization of the Leray–Schauder degree to generalized monotone mappings. Appendix 5.3A deals with the generalization of the theory of monotone
operators to a more general functional setting and to operators which are monotone only in the principal part. In Appendix 5.4A we give the proof of the famous
Krein–Rutman Theorem which itself falls within the linear theory but plays an
essential role in the study of nonlinear problems. Appendix 5.4B illustrates the
connection between the method of supersolutions and subsolutions and the topological degree.
The Ritz Method is presented in Appendix 6.2A as an application of an abstract variational principle. Appendix 6.2B illustrates the connection between the
method of supersolutions and subsolutions and the existence of global extrema.
Appendix 6.3A has an auxiliary character for establishing the potential bifurcation theorem in Appendix 6.3B. In Appendix 6.4A we generalize the so-called
Deformation Lemma and present the Mountain Pass Theorem in a more general
setting. The generalization of the Lagrange Multiplier Method is carried out in
Appendix 6.4B. Appendix 6.5A is dedicated to the generalization of the Saddle
Point Theorem.
Appendices 7.5A, 7.6A and 7.7A are devoted to applications of the degree of
generalized monotone mappings, the Leray–Lions Theorem and the Saddle Point
Theorem, respectively, to boundary value problems for elementary partial differential equations. This more advanced part contains several generalizations of the
methods presented in the basic part and the beginner in the subject who is reading
the book can skip it. On the other hand, there are (a few) places in appendices
where we have to refer to the basic text and the notions we refer to are contained in the forthcoming chapters or sections. This, however, corresponds to our
philosophy of two levels of the text and, in our opinion, does not impair the
smoothness of reading.
In order to make the text self-contained, we decided to comment on several
notions and statements in footnotes. To place the material from the footnotes
in the text could disturb a more advanced reader and make the exposition more
complicated. In order to emphasize the role of the statements in our exposition we
identify them as Theorem, Proposition, Lemma or Corollary. However, the reader
should be aware of the fact that this by no means expresses the importance of the
statement within the whole of mathematics. So, several times, we call important
theorems Propositions, Lemmas or Corollaries.
Although the book should primarily serve as a textbook for students on the
graduate level, it can be a suitable source for scientists and engineers who have
need of modern methods of nonlinear analysis.
At this point we would like to include a few words about our good friend,
colleague and mentor Svatopluk Fučík to whom we dedicate this book. His work in
the field of nonlinear analysis is well recognized and although he died in 1979 at the
age of 34, he ranks among the most important and gifted Czech mathematicians
of the 20th century.
We would like to thank Marie Benediktová and Jiří Benedikt for an excellent
typesetting of this book in LaTeX 2ε, excellent figures and illustrations as well as for
their valuable comments which improved the quality of the text. Our thanks belong also to Eva Fašangová, Gabriela Holubová, Eva Kaspříková and Petr Stehlík
for their careful reading of the manuscript and useful comments which have decreased the number of our mistakes and made this text more readable. Our special
thanks belong to Jiří Jarník for correction of our English, Ralph Chill and Herbert
Leinfelder for their improvements of the text and methodological advice.
Both authors appreciate the support of the research projects of the Ministry
of Education, Youth and Sports of the Czech Republic MSM 4977751301 and
MSM 0021620839.
Plzeň–Praha,
November 2006
Pavel Drábek
Jaroslav Milota
Chapter 1
Preliminaries
1.1 Elements of Linear Algebra
This section is rather brief since we suppose that the reader already has some
knowledge of linear algebra. Therefore, it should be viewed mainly as a source
of concepts and notation which will be frequently used in the sequel. There are
plenty of textbooks on this subject. As we are interested in applications to analysis,
we recommend the classical book by Halmos [64] to the reader looking for more detailed
information.
A decisive part of analysis concerns the study of various sets of numbers
such as $\mathbb{R}$, $\mathbb{C}$, $\mathbb{R}^M$, ..., sets of functions (continuous, integrable, differentiable),
and mappings between them. These sets usually allow some algebraic operations,
mainly addition and multiplication by scalars. We will denote the set of scalars by
$\Lambda$ and usually have in mind either the set of real numbers $\mathbb{R}$ or that of complex
numbers $\mathbb{C}$.
Definition 1.1.1. A set $X$ on which two operations, addition and multiplication
by scalars, are defined, is called a linear space over a field $\Lambda$ if the following
conditions are fulfilled:
(1) $X$ with respect to the operation $x, y \in X \mapsto x + y \in X$ forms a commutative
group with a unit element denoted by $o$ and the inverse element to $x \in X$
denoted by $-x$.
(2) The operation $a \in \Lambda$, $x \in X \mapsto ax \in X$ satisfies
(a) $a(bx) = (ab)x$, $a, b \in \Lambda$, $x \in X$,
(b) $1x = x$, $x \in X$, where $1$ is the multiplicative unit of the field $\Lambda$.
(3) For the two operations the distributive laws hold, i.e., for $a, b \in \Lambda$, $x, y \in X$,
we have
(a) $(a + b)x = ax + bx$,
(b) $a(x + y) = ax + ay$.
$$\sum_{i=1}^{n} a_i x_i = o \quad \Longrightarrow \quad a_1 = \dots = a_n = 0$$
is sometimes called a Hamel basis in order to emphasize the distinction from a Schauder or
orthonormal basis in a Banach space or a Hilbert space, respectively (see Section 1.2).
2 It can be viewed also as an axiom of set theory.
3 A binary relation $\preceq$ on $A \times A$ is called an ordering if
(1) $x \preceq x$ for all $x \in A$,
(2) if $x \preceq y$ and $y \preceq x$, then $x = y$,
(3) if $x \preceq y$ and $y \preceq z$, then $x \preceq z$.
We now return to
Proof of Theorem 1.1.3. Let $\mathcal{A}$ be a collection of all linearly independent subsets
of $X$ and define $A \preceq B$ for $A, B \in \mathcal{A}$ if $A$ is a subset of $B$. Choose $A \in \mathcal{A}$ ($\mathcal{A} \neq \emptyset$
since $X \neq \{o\}$) and let $M$ be a maximal element of $\mathcal{A}$, $A \subset M$, whose
existence is guaranteed by Zorn's Lemma (if $\mathcal{B}$ is a chain in $\mathcal{A}$, then $\sup \mathcal{B} = \bigcup_{B \in \mathcal{B}} B$). Then
$$\operatorname{Lin} M = X.$$
The proof of the latter part of Theorem 1.1.3 is more involved (the construction
of an injection of $A$ into $B$ is also based on the application of Zorn's Lemma) and
it is omitted.
Definition 1.1.5. Let $X$ be a linear space and let $A$ be a basis of $X$. Then the
cardinality of $A$ is called the dimension of $X$.
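Example 1.1.6.
(i) Assume that $A$ is a basis of a linear space $X$. Then there is a set $\Gamma$ (the so-called index set) and a bijection $\gamma \mapsto e_\gamma$ of $\Gamma$ onto $A$. We will also say
that $\{e_\gamma\}_{\gamma \in \Gamma}$ is a basis of $X$. For any $x \in X$ there is a finite subset $K \subset \Gamma$
and scalars $\{\alpha_\gamma\}_{\gamma \in K}$ such that
$$x = \sum_{\gamma \in K} \alpha_\gamma e_\gamma.$$
These scalars are uniquely determined and will be called the coordinates of
$x$ with respect to the basis $\{e_\gamma\}_{\gamma \in \Gamma}$.
(ii) The space $\mathbb{R}^M$ of real $M$-tuples with the usual operations is a real linear
space and the elements
$$e_k = (0, \dots, 0, 1, 0, \dots, 0), \quad k = 1, \dots, M,$$
(with $1$ in the $k$th position) form its standard basis.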
(iii) Similarly, $\mathbb{C}^M$ is the space of complex $M$-tuples and the set of elements
$e_1, \dots, e_M$ defined as above is the standard basis of $\mathbb{C}^M$. More generally, if $X$
is a real linear space and $iX$ is defined by $iX := \{ix : x \in X\}$ where $i^2 = -1$,
then
$$X_{\mathbb{C}} := X + iX \quad (= \{x + iy : x, y \in X\})$$
is the complexification of $X$. The equality $x + iy = o$ holds if and only if
$x = y = o$. The operations in $X_{\mathbb{C}}$ are defined as follows:
$$(x_1 + iy_1) + (x_2 + iy_2) := (x_1 + x_2) + i(y_1 + y_2), \quad x_1, x_2, y_1, y_2 \in X,$$
$$(a + ib)(x + iy) := (ax - by) + i(bx + ay), \quad a, b \in \mathbb{R},\ x, y \in X.$$
$x, y \in \mathbb{R}$.
to warn the reader that the statement in question is true only in linear spaces of
finite dimension.
Next we state a corollary of Theorem 1.1.3.
Corollary 1.1.8. Let $X$ be a linear space and let $Y$ be a subspace of $X$. Then there
exists a subspace $Z$ of $X$ with the following properties:
(i) for every $x \in X$ there are unique $y \in Y$, $z \in Z$ such that $x = y + z$;
(ii) $Y \cap Z = \{o\}$.
$$x = \sum_{\gamma \in K} \alpha_\gamma e_\gamma \quad (K \text{ finite})$$
and $\{f_1, \dots, f_N\}$
$$A e_j = \sum_{i=1}^{N} a_{ij} f_i, \quad j = 1, \dots, M, \tag{1.1.1}$$
($N$ rows and $M$ columns; the $j$th column consists of the coordinates of $Ae_j$).
This matrix $\mathcal{A}$ is called the matrix representation of the linear operator $A$
with respect to the bases $\{e_1, \dots, e_M\}$ and $\{f_1, \dots, f_N\}$.
On the other hand, if $\{e_1, \dots, e_M\}$ and $\{f_1, \dots, f_N\}$ are bases of $X$ and $Y$,
respectively, and $\mathcal{A}$ is an $N \times M$ matrix, then the formula (1.1.1) determines
a linear operator $A \in L(X, Y)$.
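The column-by-column rule behind (1.1.1) can be sketched in a few lines of Python. The operator used here (differentiation on polynomials of degree at most 3, with monomial bases) and all function names are assumptions chosen for illustration; they are not taken from the text.

```python
from fractions import Fraction

def differentiate(coeffs):
    """Derivative of the polynomial c0 + c1*t + c2*t**2 + ... given as [c0, c1, c2, ...]."""
    return [Fraction(k) * c for k, c in enumerate(coeffs)][1:]

def matrix_representation(op, dim_x, dim_y):
    """Matrix with dim_y rows and dim_x columns; column j holds the coordinates of op(e_j)."""
    columns = []
    for j in range(dim_x):
        e_j = [Fraction(int(i == j)) for i in range(dim_x)]
        image = op(e_j)
        image = image + [Fraction(0)] * (dim_y - len(image))  # pad to dim_y coordinates
        columns.append(image)
    # transpose the list of columns into a list of rows
    return [[columns[j][i] for j in range(dim_x)] for i in range(dim_y)]

def apply_matrix(matrix, coords):
    """Coordinates of the image of the vector with the given coordinates."""
    return [sum(a * x for a, x in zip(row, coords)) for row in matrix]

# X = polynomials of degree <= 3 (M = 4), Y = polynomials of degree <= 2 (N = 3)
D = matrix_representation(differentiate, 4, 3)
```

Applying `D` to the coordinates of a polynomial gives the same result as differentiating it directly, which is the sense in which the matrix and the formula (1.1.1) determine each other.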
(iii) If $A, B \in L(X, Y)$ have matrix representations $\mathcal{A}$ and $\mathcal{B}$ (with respect to the
same bases), then
$$\mathcal{A} + \mathcal{B} := (a_{ij} + b_{ij})_{\substack{i=1,\dots,N \\ j=1,\dots,M}}$$
and $A \in L(X, Y)$,
$$\Bigl( \sum_{i=1}^{N} b_{ki} a_{ij} \Bigr)_{\substack{k=1,\dots,P \\ j=1,\dots,M}}$$
is the matrix representation of $BA$. This product of operators is noncommutative in general, even in the case $X = Y = Z$.
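The remark on noncommutativity can be checked on the smallest possible case. The sketch below multiplies two $2 \times 2$ matrices in both orders; the matrices and the helper name `mat_mul` are assumptions made for this example.

```python
def mat_mul(b, a):
    """Matrix of the composition BA: entry (k, j) equals sum_i b[k][i] * a[i][j]."""
    return [[sum(b[k][i] * a[i][j] for i in range(len(a)))
             for j in range(len(a[0]))]
            for k in range(len(b))]

A = [[0, 1],
     [0, 0]]
B = [[0, 0],
     [1, 0]]

BA = mat_mul(B, A)   # first A, then B
AB = mat_mul(A, B)   # first B, then A
```

Here `BA` and `AB` differ, although both operators act on the same two-dimensional space.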
$$A^{-1} A = I_X$$
$j = 1, \dots, M$,
i.e.,
$$T e_j = \sum_{i=1}^{M} t_{ij} e_i, \quad j = 1, \dots, M.$$
For $x = \sum_{j=1}^{M} \eta_j f_j$ we have
$$x = \sum_{j=1}^{M} \eta_j \sum_{i=1}^{M} t_{ij} e_i = \sum_{i=1}^{M} \Bigl( \sum_{j=1}^{M} t_{ij} \eta_j \Bigr) e_i.$$
$$\xi_i = \sum_{j=1}^{M} t_{ij} \eta_j.$$
The second question can be answered by the same method but a certain
caution in computation is desirable. Write
$$A f_j = A \sum_{k=1}^{M} t_{kj} e_k = \sum_{k,i=1}^{M} t_{kj} a^{(E)}_{ik} e_i = \sum_{k=1}^{M} a^{(F)}_{kj} T e_k = \sum_{k,i=1}^{M} a^{(F)}_{kj} t_{ik} e_i.$$
(1.1.2)
Example 1.1.13.
(i) Let $X = Y \oplus Z$. Define
$$Px = y \quad \text{where } x = y + z,\ y \in Y,\ z \in Z.$$
Then $P$ is the so-called projection of $X$ onto $Y$ and has the following properties:
(a) $P^2 := PP = P$,
(b) $\operatorname{Ker} P = Z$.
It is easy to see that the properties (a), (b) determine uniquely the projection
$P$ and hence also the decomposition $X = Y \oplus Z$ ($Y = \operatorname{Im} P$).
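(ii) Let $Y$ be a subspace of $X$. For $x \in X$ put
$$[x] := x + Y = \{x + y : y \in Y\}.$$
$$\alpha [x] := [\alpha x] \quad \text{for } x, y \in X,\ \alpha \in \Lambda.$$
$$\kappa(x) := [x], \quad x \in X.$$
Then $\kappa$ (the so-called canonical embedding of $X$ onto $X/Y$) is a linear, surjective operator from $X$ onto $X/Y$, and $\operatorname{Ker} \kappa = Y$. If $x = y + z$ where $y \in Y$,
$z \in Z$ and $X = Y \oplus Z$, then the mapping $j : [x] \mapsto z$ is an isomorphism
of $X/Y$ onto $Z$. In particular, $X/Y$ and $Z$ have the same dimension. The
dimension of $X/Y$ is sometimes called the codimension of $Y$ ($\operatorname{codim} Y$) and
$$\dim X = \dim Y + \operatorname{codim} Y. \tag{1.1.3}$$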
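is commutative, i.e., $A = \hat{A} \circ \kappa$.
Figure 1.1.1. Commutative diagram with $\kappa : X \to X/\operatorname{Ker} A$, $\hat{A} : X/\operatorname{Ker} A \to Y$ and $A : X \to Y$.
Proof. The assertion is obvious but do not forget to prove that $\hat{A}$ is well defined.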
Corollary 1.1.15. Let $A \in L(X, Y)$. Then
$$\dim X = \dim \operatorname{Ker} A + \dim \operatorname{Im} A. \tag{1.1.4}$$
Proof. We have
$$\operatorname{codim} \operatorname{Ker} A = \dim X/\operatorname{Ker} A = \dim \operatorname{Im} \hat{A} = \dim \operatorname{Im} A$$
since $\hat{A}$ is an isomorphism of $X/\operatorname{Ker} A$ onto its image. Equality (1.1.4) follows
immediately from (1.1.3). If $A$ is injective, then
$$\dim X = \dim \operatorname{Im} A,$$
and this implies (only in the case of $X$ and $Y$ having the same finite dimension)
that $Y = \operatorname{Im} A$. If $\operatorname{Im} A = Y$, then (finite dimensions!)
$$\dim \operatorname{Ker} A = 0,$$
i.e., $A$ is injective.
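The dimension formula (1.1.4) can be observed numerically for a concrete matrix. The following Python sketch uses exact rational arithmetic and row reduction, which is a standard technique and not the argument of the text; the sample matrix and the function names are assumptions made for this illustration.

```python
from fractions import Fraction

def rref(matrix):
    """Return (reduced row echelon form, list of pivot columns) using exact arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots = []
    r = 0
    for c in range(len(rows[0])):
        pivot_row = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        pivot_value = rows[r][c]
        rows[r] = [x / pivot_value for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    return rows, pivots

def kernel_basis(matrix):
    """One kernel vector for every free (non-pivot) column."""
    reduced, pivots = rref(matrix)
    n_cols = len(matrix[0])
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[free] = Fraction(1)
        for row, p in zip(reduced, pivots):
            v[p] = -row[free]
        basis.append(v)
    return basis

def apply_matrix(matrix, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in matrix]

A = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [1, 0, 1, 0]]              # an operator from a 4-dimensional into a 3-dimensional space

_, pivot_columns = rref(A)
dim_image = len(pivot_columns)        # rank, i.e. dim Im A
dim_kernel = len(kernel_basis(A))     # dim Ker A
```

For this matrix both dimensions equal 2, and their sum is the number of columns, that is, the dimension of the domain.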
Example 1.1.16. Let $X$ be the space of bounded (real) sequences $l^\infty(\mathbb{N})$ and define
the right-shift
$$S_R : x = (x_1, \dots) \mapsto (0, x_1, x_2, \dots)$$
and the left-shift
$$S_L : x = (x_1, \dots) \mapsto (x_2, x_3, \dots).$$
Then $S_R$ is injective but not surjective and $S_L$ is surjective but not injective.
Moreover,
$$S_L S_R x = x \quad \text{for every } x \in X.$$
What is $S_R S_L$?
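The two shifts can be experimented with in Python by modelling a sequence as a function of the index $n = 1, 2, \dots$. This is only a sketch on one sample sequence; the representation of sequences as functions and all names below are assumptions made for the illustration.

```python
def shift_right(x):
    """S_R: (x1, x2, ...) -> (0, x1, x2, ...)."""
    return lambda n: 0 if n == 1 else x(n - 1)

def shift_left(x):
    """S_L: (x1, x2, ...) -> (x2, x3, ...)."""
    return lambda n: x(n + 1)

def head(x, count=6):
    """First `count` terms of the sequence x."""
    return [x(n) for n in range(1, count + 1)]

def alternating(n):
    """The bounded sequence (-1, 1, -1, 1, ...)."""
    return (-1) ** n

left_after_right = head(shift_left(shift_right(alternating)))
right_after_left = head(shift_right(shift_left(alternating)))
```

On this sample, `left_after_right` reproduces the original terms, while `right_after_left` differs from them in the first position only.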
The following special case of linear operators plays an important role both
in the theory of linear spaces and in applications.
Definition 1.1.17. Let $X$ be a linear space over a field $\Lambda$. A linear operator from
$X$ into $\Lambda$ is called a linear form. The linear space of all linear forms on $X$ is called
the (algebraic) dual space of $X$ and is denoted by $X^{\#}$.
Example 1.1.18.
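(i) Let $\{e_1, \dots, e_M\}$ be a basis of $X$, i.e., for every $x$ there is a unique $M$-tuple
$(\alpha_1, \dots, \alpha_M) \in \Lambda^M$ (coordinates of $x$) such that $x = \sum_{i=1}^{M} \alpha_i e_i$. The mapping
for an $x_0 \in X$.
and
$$\operatorname{Ker} f = Y.$$
if and only if
$$\bigcap_{i=1}^{n} \operatorname{Ker} f_i \subset \operatorname{Ker} g.$$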
Proof. The "only if" part is obvious. For the "if" part notice that the assertion
$g \in \operatorname{Lin}\{f_1, \dots, f_n\}$ can be interpreted as the existence of a linear form $\varphi \in (\Lambda^n)^{\#}$
such that
$$g = \varphi \circ F \quad \text{where } F(x) := (f_1(x), \dots, f_n(x)). \tag{1.1.5}$$
Let $\Lambda^n = \operatorname{Im} F \oplus Y$ (Corollary 1.1.8). If $\xi = \eta + \zeta$, $\eta = F(x)$, $\zeta \in Y$, then the
mapping $\varphi(\xi) = g(x)$ is a well-defined linear form (by assumption). This means
that (1.1.5) holds.
Definition 1.1.20. Let $A \in L(X, Y)$ and $g \in Y^{\#}$. Then the linear form $f(x) :=
g(Ax)$ is denoted by $f = A^{\#} g$ and $A^{\#}$ is called the adjoint operator to $A$.
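In coordinates, the definition $f(x) = g(Ax)$ can be tried out directly. The sketch below assumes $X = \mathbb{R}^3$, $Y = \mathbb{R}^2$ with their standard bases and identifies a linear form with its vector of coefficients; under these assumptions the coefficients of $A^{\#} g$ are obtained by applying the transposed matrix to those of $g$. The sample data and function names are hypothetical.

```python
def apply_matrix(matrix, x):
    """Coordinates of Ax."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in matrix]

def evaluate_form(coeffs, y):
    """Value of the linear form y -> sum_i coeffs[i] * y[i]."""
    return sum(c * yi for c, yi in zip(coeffs, y))

def adjoint_coeffs(matrix, g_coeffs):
    """Coefficients of f = A^# g, defined by f(x) = g(Ax); this is the transpose applied to g."""
    return [sum(matrix[i][j] * g_coeffs[i] for i in range(len(matrix)))
            for j in range(len(matrix[0]))]

A = [[1, 2, 0],
     [0, 1, 3]]            # an operator from R^3 into R^2
g = [4, 5]                 # the form g(y) = 4*y1 + 5*y2 on R^2
f = adjoint_coeffs(A, g)   # the form A^# g on R^3
x = [1, -1, 2]
```

Evaluating `f` at `x` and evaluating `g` at `Ax` give the same number, as the definition requires.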
Remark 1.1.21.
(i) $A^{\#} \in L(Y^{\#}, X^{\#})$.
(ii) If $A$ has a matrix representation $\mathcal{A} = (a_{ij})_{\substack{i=1,\dots,N \\ j=1,\dots,M}}$ with respect to bases
$$\sum_{j=1}^{M} a_{ij} x_j = b_i, \quad i = 1, \dots, N. \tag{1.1.6}$$
(1.1.7)
for $W \subset X^{\#}$.
Proposition 1.1.22.
(i) $(U^{\perp})_{\perp} = \operatorname{Lin} U$ for every $U \subset X$.
(ii) If $\dim X < \infty$, then $(W_{\perp})^{\perp} = \operatorname{Lin} W$ for all $W \subset X^{\#}$.
Proof. We include the proof because it contains a construction which should be
compared with an analogous one in Section 2.1 (see Proposition 2.1.27 and its
proof).
(i) We can assume $U$ to be a subspace of $X$ since $U^{\perp} = (\operatorname{Lin} U)^{\perp}$. The inclusion $U \subset (U^{\perp})_{\perp}$ is obvious. To prove the reverse let us suppose by contradiction
that there is an element $x_0 \in (U^{\perp})_{\perp} \setminus U$. By the method of proof of Theorem 1.1.3,
a subspace $Y$ of $X$ can be found such that
$$X = \operatorname{Lin}\{x_0\} \oplus Y \quad \text{and} \quad U \subset Y.$$
According to Example 1.1.18(ii) there exists $f \in X^{\#}$ with $\operatorname{Ker} f = Y$. In particular, $f \in U^{\perp}$ and $f(x_0) \neq 0$, which contradicts the choice of $x_0$.
(ii) This part follows from (i) by replacing $X$ by $X^{\#}$. To repeat the proof we
need that $(X^{\#})^{\#}$ could be identified with $X$. We note that this is possible because
$\dim X < \infty$.
belongs to $C$.
for all $x \in C$.
Proof. It needs a special tool for the treatment of convex sets and a considerably
more sophisticated extension procedure,5 and, therefore, it is omitted. See, e.g.,
Rockafellar [109, 11] where the interested reader can nd also applications to
convex optimization, and also Corollary 2.1.18.
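Theorem 1.1.25. For $A \in L(X, Y)$ we have
(i) $\operatorname{Im} A = (\operatorname{Ker} A^{\#})_{\perp}$,
(ii) $\operatorname{Im} A^{\#} = (\operatorname{Ker} A)^{\perp}$.
(iii) If, moreover, $\dim X = \dim Y < \infty$, then
$$\dim \operatorname{Ker} A = \dim \operatorname{Ker} A^{\#}. \tag{1.1.8}$$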
Proof. (i) It is straightforward to prove both the inclusions which lead to the
equality $(\operatorname{Im} A)^{\perp} = \operatorname{Ker} A^{\#}$. The result follows then from Proposition 1.1.22(i).
(ii) Let $Y = \operatorname{Im} A \oplus Z$ (Corollary 1.1.8). For $f \in (\operatorname{Ker} A)^{\perp}$ and $y = Ax + z$,
$z \in Z$, put $g(y) = f(x)$. This definition does not depend on a concrete choice
of $x$ since $f \in (\operatorname{Ker} A)^{\perp}$. This proves that $f = A^{\#} g$ and hence the inclusion
$(\operatorname{Ker} A)^{\perp} \subset \operatorname{Im} A^{\#}$ holds. The reverse inclusion is trivial.
(iii) Observe first that $(X/U)^{\#}$ is isomorphic to $U^{\perp}$ for any subspace $U$ of
$X$, namely,
$$(\Phi F)(x) := F([x]), \quad F \in (X/U)^{\#}$$
is the desired isomorphism. If $\dim X < \infty$, then $X/U$ is isomorphic to $(X/U)^{\#}$
(both spaces have the same dimension) and, therefore, $X/U$ is isomorphic to $U^{\perp}$.
Now, we apply this observation to $U = \operatorname{Ker} A$. We recall that $\operatorname{Im} A$ is isomorphic to $X/\operatorname{Ker} A$ (Proposition 1.1.14) and therefore to $(\operatorname{Ker} A)^{\perp}$. By (ii), $\operatorname{Im} A$ is
isomorphic to $\operatorname{Im} A^{\#}$. The equality (1.1.8) follows from Corollary 1.1.15.
5 See
Remark 1.1.26.
(i) Note that Theorem 1.1.25(i) is an existence result for the equation (1.1.6)
(or (1.1.7)) because it can be reformulated as follows:
The equation (1.1.6) has a solution for $b = (b_1, \dots, b_N)$ if and only
if
$$\sum_{i=1}^{N} b_i f_i = 0$$
Warning. In infinite dimensions the spectrum of a linear operator can contain also
other points than the eigenvalues and is defined in a different way (see page 56)!
Remark 1.1.28. It is obvious that the following statements are equivalent in a
finite dimensional complex space $X$:
$$\lambda \in \sigma(A) \iff \operatorname{Ker}(\lambda I - A) \neq \{o\} \iff \det(\lambda I - \mathcal{A}) = 0$$
where $\mathcal{A}$ is a representation of $A$.
Since $P(z) := \det(zI - \mathcal{A})$ is a polynomial (the so-called characteristic polynomial) of degree $M = \dim X$, the problem of finding $\sigma(A)$ is equivalent to solving
an algebraic equation (the so-called characteristic equation of $A$)
$$P(z) = 0. \tag{1.1.9}$$
(1.1.10)
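$$\begin{pmatrix} \mathcal{A}_Y & O \\ O & \mathcal{A}_Z \end{pmatrix}$$
$$\begin{pmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_M \end{pmatrix}$$
$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ is a representation of a linear operator $A \in L(\mathbb{C}^2)$
which has no one-dimensional reducing subspace. Hence $A$ has no diagonal
representation.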
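$$N_k := \operatorname{Ker}(\lambda I - A)^k.$$
It is obvious that $N_k \subset N_{k+1}$ and they cannot be all distinct. If $N_k = N_{k+1}$, then
$N_i = N_k$ for all $i > k$. Denote by $n(\lambda)$ the least such $k$ and set
$$N(\lambda) := \bigcup_{j=1}^{n(\lambda)} N_j = N_{n(\lambda)}, \qquad R(\lambda) := \operatorname{Im}(\lambda I - A)^{n(\lambda)}.$$
holds.
(1.1.11)
(ii) Denote by $A_N$ and $A_R$ the restrictions of $A$ respectively to $N(\lambda)$ and $R(\lambda)$.
Then
$$\sigma(A_N) = \{\lambda\}, \qquad \sigma(A_R) = \sigma(A) \setminus \{\lambda\}.$$
Moreover, the dimension of $N(\lambda)$ is equal to the multiplicity of the eigenvalue $\lambda$.
(iii) If $\sigma(A) = \{\lambda_1, \dots, \lambda_k\}$, then
$$X = N(\lambda_1) \oplus \dots \oplus N(\lambda_k). \tag{1.1.12}$$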
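Proof. (i) Since $R(\lambda) \cap N(\lambda) = \{o\}$ (by the definition of $n(\lambda)$) and $\dim X = \dim N(\lambda) + \dim R(\lambda)$
(Corollary 1.1.15), we deduce the decomposition (1.1.11). If
$$y = (\lambda I - A)^{n(\lambda)} x \in R(\lambda),$$
then
$$Ay = -(\lambda I - A)y + \lambda y = -(\lambda I - A)^{n(\lambda)} (\lambda I - A)x + \lambda y \in R(\lambda).$$
The $A$-invariance of $N(\lambda)$ is also clear.
(ii) Obviously, $\sigma(A_N) \subset \sigma(A)$. Let $\mu \in \sigma(A) \setminus \{\lambda\}$ and let $x$ be a corresponding eigenvector. By (1.1.11) we have $x = y + z$ where $y \in N(\lambda)$, $z \in R(\lambda)$.
Further,
$$o = (\mu I - A)x = (\lambda I - A)y + (\mu - \lambda)y + (\mu I - A)z.$$
By virtue of the uniqueness of the decomposition we have
$$(\lambda I - A)y = (\lambda - \mu)y.$$
This implies that
$$o = (\lambda I - A)^{n(\lambda)} y = (\lambda - \mu)(\lambda I - A)^{n(\lambda)-1} y,$$
and
$$x = z \in R(\lambda).$$
This shows that $\mu \notin \sigma(A_N)$ and $\mu \in \sigma(A_R)$. Since $N(\lambda) \cap R(\lambda) = \{o\}$ the
eigenvalue $\lambda$ does not belong to $\sigma(A_R)$.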
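The matrix representation $\mathcal{A}$ of $A$ with respect to the basis formed by joining
the bases of $N(\lambda)$ and $R(\lambda)$ has the block form
$$\mathcal{A} = \begin{pmatrix} \mathcal{A}_N & O \\ O & \mathcal{A}_R \end{pmatrix}.$$
It follows that
$$\det(zI - \mathcal{A}) = \det(zI - \mathcal{A}_N) \det(zI - \mathcal{A}_R)$$
and hence the characteristic polynomial of $A_N$ is $P_N(z) = (z - \lambda)^{m(\lambda)}$ where $m(\lambda)$
is the multiplicity of the eigenvalue $\lambda$ of $A$. Therefore $\dim N(\lambda) = m(\lambda)$.
(iii) This follows by induction with respect to the eigenvalues of $A$.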
For a polynomial $P(z) = a_n z^n + \dots + a_1 z + a_0$ and $A \in L(X)$ we put
$$P(A) = a_n A^n + \dots + a_1 A + a_0 I.$$
Corollary 1.1.32 (Hamilton–Cayley). Let $A \in L(X)$ and let $P$ be the characteristic
polynomial of $A$. Then $P(A) = O$.
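The statement $P(A) = O$ is easy to test on a small example. The Python sketch below does this for one $2 \times 2$ integer matrix, using the elementary fact that for a $2 \times 2$ matrix $\det(zI - \mathcal{A}) = z^2 - (\operatorname{tr} \mathcal{A}) z + \det \mathcal{A}$; the sample matrix and the function names are assumptions made for this illustration.

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def char_poly_2x2(a):
    """Coefficients [a0, a1, a2] of P(z) = det(zI - A) = a2*z^2 + a1*z + a0."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [det, -trace, 1]

def poly_at_matrix(coeffs, a):
    """P(A) = a_n A^n + ... + a_1 A + a_0 I, evaluated by Horner's scheme."""
    n = len(a)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, a)
        result = [[result[i][j] + c * identity[i][j] for j in range(n)] for i in range(n)]
    return result

A = [[2, 1],
     [-3, 5]]
P_of_A = poly_at_matrix(char_poly_2x2(A), A)
```

For this matrix the characteristic polynomial is $z^2 - 7z + 13$ and `P_of_A` is the zero matrix.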
Proof. Assume that $P$ has the form (1.1.10) and $x = x_1 + \dots + x_k$ is the decomposition given by (1.1.12). Since $m_k \ge n(\lambda_k)$,
$$(A - \lambda_k I)^{m_k} x = \sum_{j=1}^{k-1} (A - \lambda_k I)^{m_k} x_j + o.$$
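Lemma 1.1.33. Let $B \in L(X)$ be a nilpotent operator of order $n$. Then for any
$x \in X \setminus \operatorname{Ker} B^{n-1}$ the elements $x, Bx, \dots, B^{n-1}x$ are linearly independent and
the subspace
$$Y = \operatorname{Lin}\{x, Bx, \dots, B^{n-1}x\}$$
reduces $B$. The restriction $B_Y$ of $B$ to $Y$ has the representation
$$\begin{pmatrix} 0 & 1 & 0 & \dots & 0 \\ 0 & 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & 1 \\ 0 & 0 & 0 & \dots & 0 \end{pmatrix}$$
with respect to the basis $\{B^{n-1}x, \dots, x\}$. There exists a $B$-invariant direct complement of $Y$ and the restriction of $B$ to such a complement is nilpotent of order
$\le n$.
Proof. It is easy to see the linear independence of the elements $x, \dots, B^{n-1}x$.
Indeed, if
$$\sum_{j=0}^{n-1} \alpha_j B^j x = o,$$
then applying $B^{n-1}$ gives $\alpha_0 B^{n-1} x = o$, i.e.,
$$\alpha_0 = 0.$$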
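$$\mathcal{A}^F = \begin{pmatrix} A_1^{(1)} & & & & O \\ & \ddots & & & \\ & & A_1^{(l_1)} & & \\ & & & \ddots & \\ O & & & & A_k^{(l_k)} \end{pmatrix}$$
where the block matrices (the so-called Jordan cells) have the form
$$A_j^{(i)} = \begin{pmatrix} \lambda_j & 1 & & 0 \\ & \lambda_j & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda_j \end{pmatrix}, \quad i = 1, \dots, l_j,\ j = 1, \dots, k. \tag{1.1.13}$$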
Remark 1.1.35.
(i) We can also interpret Theorem 1.1.34 as follows. Let $\mathcal{A}^E$ be the representation
of $A$ with respect to the basis $E$. By Remark 1.1.12(iii), there is a regular
transformation matrix $\mathcal{T}$ such that (1.1.2) holds. The canonical matrix $\mathcal{A}^F$
may be viewed as a representation of a $B \in L(X)$ with respect to the basis
$E$. Denote by $T$ a linear operator represented in the basis $E$ by the matrix $\mathcal{T}$.
Then one has
$$B = T^{-1} A T. \tag{1.1.14}$$
(ii) Assume that $A \in L(X)$ where $X$ is a real linear space. The problem in the
application of Theorem 1.1.34 lies in the fact that the spectrum $\sigma(A) \subset \mathbb{R}$ is
not sufficient to guarantee the decomposition (1.1.12). This obstacle can be
overcome by the complexification $X_{\mathbb{C}}$ of $X$. Namely, $A$ is extendable to $X_{\mathbb{C}}$
by the formula
$$A_{\mathbb{C}}(x + iy) = Ax + iAy.$$
If $\lambda = \alpha + i\beta$, $\beta \neq 0$, is an eigenvalue of $A_{\mathbb{C}}$ with an eigenvector $u + iv$, then $u$
and $v$ are linearly independent in $X$ and the complex conjugate $\bar{\lambda}$ is also an
eigenvalue of $A_{\mathbb{C}}$ and $u - iv$ is the corresponding eigenvector. Moreover, both
$\lambda$ and $\bar{\lambda}$ have the same multiplicity. Rearranging the $A_{\mathbb{C}}$ canonical basis by
joining its parts which correspond to $\lambda$ and $\bar{\lambda}$ we obtain a basis of the real
7 An operator $B \in L(X)$ is said to be nilpotent if there is such an $n \in \mathbb{N}$ that $B^n = O$. The least
such integer $n$ is called the order of nilpotency.
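[Matrix display: a block matrix built from real $2 \times 2$ blocks, including identity blocks $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$; the remaining entries are not legible in this copy.]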
We omit simple computations which conrm these statements and leave them
to the reader.
The simple canonical form is convenient for solving a system of linear differential equations with real constant coefficients. Such a system can be written in
the form
$$\dot{x} := \frac{dx}{dt} = Ax, \quad A \in L(X). \tag{1.1.15}$$
If $X = \mathbb{R}^M$ and $\mathcal{A} = (a_{ij})$ is the representation of $A$ with respect to the
standard basis $e_1, \dots, e_M$, then (1.1.15) is an abstract formulation of the system
$$\dot{x}_i(t) = \sum_{j=1}^{M} a_{ij} x_j(t), \quad i = 1, \dots, M, \quad \text{where } x(t) = \sum_{i=1}^{M} x_i(t) e_i. \tag{1.1.16}$$
where $By := T^{-1} A T y$.
Theorem 1.1.34 says that $T$ can be chosen in such a way that the representation of
$B$ with respect to the standard basis is the Jordan Canonical Form of $A$. Having
this form it is easy to solve (1.1.16) (see Exercise 1.1.41).
Qualitative properties of solutions of (1.1.15) are often more interesting than
an involved formula for solutions. Therefore it would be convenient to generalize
the exponential function solving $\dot{x} = ax$ in $\mathbb{R}$ to $L(X)$. Similarly to the one-dimensional case we put
$$e^{tA} x := \sum_{n=0}^{\infty} \frac{t^n}{n!} A^n x$$
provided the series is convergent in $L(X)$. We postpone the question of convergence
of this series (see Exercise 2.1.34) and give instead an equivalent definition of a
function $f(A)$ for $A \in L(X)$ without any use of infinite series.
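Although the convergence question is postponed, the partial sums of the exponential series are easy to compute. The Python sketch below sums the first 30 terms for one $2 \times 2$ matrix and compares the result with the closed form that follows from $A^2 = -I$ for this particular matrix; the matrix, the number of terms and the function names are assumptions made for the illustration.

```python
import math

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def exp_partial_sum(a, t, terms=30):
    """Partial sum of sum_n t^n A^n / n! over n = 0, ..., terms - 1."""
    n = len(a)
    term = [[float(i == j) for j in range(n)] for i in range(n)]   # t^0 A^0 / 0! = I
    total = [row[:] for row in term]
    for k in range(1, terms):
        # turn t^(k-1) A^(k-1) / (k-1)! into t^k A^k / k!
        term = [[t / k * entry for entry in row] for row in mat_mul(term, a)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

A = [[0.0, 1.0],
     [-1.0, 0.0]]          # A^2 = -I, hence e^{tA} = cos(t) I + sin(t) A
t = 1.0
approx = exp_partial_sum(A, t)
exact = [[math.cos(t), math.sin(t)],
         [-math.sin(t), math.cos(t)]]
max_error = max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))
```

For this matrix the partial sum agrees with the rotation matrix up to rounding errors. The sketch says nothing about convergence in general.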
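First we will define $f(B)$ for $B \in L(\mathbb{C}^M)$ which has a representation in the
form
$$\mathcal{B} = \begin{pmatrix} \lambda & 1 & & 0 \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda \end{pmatrix}. \tag{1.1.17}$$
$$P(z) = \sum_{j=0}^{M-1} \frac{P^{(j)}(\lambda)}{j!} (z - \lambda)^j + (z - \lambda)^M R(z)$$
where $R$ is a polynomial, possibly equal to $0$. Since $z \mapsto (z - \lambda)^M$ is the characteristic polynomial of $B$, we have $(B - \lambda I)^M = O$ (by Corollary 1.1.32). This means
that
$$P(B) = \sum_{j=0}^{M-1} \frac{P^{(j)}(\lambda)}{j!} (B - \lambda I)^j. \tag{1.1.18}$$
This shows that we may define
$$f(B) := \sum_{j=0}^{M-1} \frac{f^{(j)}(\lambda)}{j!} (B - \lambda I)^j \tag{1.1.19}$$
weaker assumption on $f$ would be also sufficient but we do not try to obtain an unduly
general definition. See also Lemma 1.1.37 below.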
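$$\frac{1}{2\pi i} \int_{\gamma} f(w)(wI - B)^{-1} x \, dw = \sum_{j=0}^{M-1} \left[ \frac{1}{2\pi i} \int_{\gamma} \frac{f(w)}{(w - \lambda)^{j+1}} \, dw \right] (B - \lambda I)^j x.$$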
j=0
1
wz
j=0
Remark 1.1.39.
(i) A mapping $\Phi(f)$ can be computed either by Lemma 1.1.37 which is valid
also for a general $A$, or by the formula
$$f(A)x = \sum_{l=1}^{k} \sum_{j=0}^{m(\lambda_l)-1} \frac{f^{(j)}(\lambda_l)}{j!} (A - \lambda_l I)^j \Pi_l x$$
where $p = \sum_{\lambda \in \sigma(A),\ \lambda < 0} m(\lambda)$
for a matrix representation of $A$ with real entries. (The sum over the empty set is
defined to be zero.)
Hint. Notice that $\det \mathcal{A} = \lambda_1^{m_1} \cdots \lambda_k^{m_k}$.
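Exercise 1.1.41. Show that the formula (1.1.19) yields a matrix representation of
$e^{tB}$ in the form
$$\begin{pmatrix} e^{\lambda t} & t e^{\lambda t} & \dots & \frac{t^{M-1}}{(M-1)!} e^{\lambda t} \\ 0 & e^{\lambda t} & \dots & \frac{t^{M-2}}{(M-2)!} e^{\lambda t} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & e^{\lambda t} \end{pmatrix}$$
whenever $B$ has the representation (1.1.17) with respect to the same basis.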
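A numerical sanity check of this exercise (not a proof) can be sketched in Python. With $f(z) = e^{tz}$ one has $f^{(j)}(\lambda) = t^j e^{\lambda t}$, so the finite sum (1.1.19) can be evaluated directly for a Jordan cell with ones above the diagonal. The function names and the sample values of $\lambda$, $M$ and $t$ are assumptions made for the illustration.

```python
import math

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def exp_jordan_cell(lam, size, t):
    """e^{tB} for a Jordan cell B, as the finite sum of t^j e^{lam t} / j! * (B - lam I)^j."""
    nilpotent = [[1.0 if j == i + 1 else 0.0 for j in range(size)]
                 for i in range(size)]                                   # B - lam*I
    power = [[float(i == j) for j in range(size)] for i in range(size)]  # (B - lam*I)^0
    result = [[0.0] * size for _ in range(size)]
    for j in range(size):
        coeff = t ** j * math.exp(lam * t) / math.factorial(j)
        result = [[result[r][c] + coeff * power[r][c] for c in range(size)]
                  for r in range(size)]
        power = mat_mul(power, nilpotent)
    return result

E = exp_jordan_cell(-0.5, 3, 2.0)
```

For $\lambda = -0.5$, $M = 3$ and $t = 2$ the entries of `E` on the diagonal, the first and the second superdiagonal are $e^{-1}$, $2e^{-1}$ and $2e^{-1}$, in agreement with the upper triangular pattern of the exercise.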
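Exercise 1.1.42. Use the formula in Remark 1.1.39(i) to estimate $\|e^{tA}x\|_{\mathbb{C}^M}$ for
large positive $t$ and large negative $t$ in dependence on $\sigma(A)$.
Hint. Suppose that $\alpha < \operatorname{Re} \lambda < \beta$ for all $\lambda \in \sigma(A)$. Show that there are constants
$c_1$, $c_2$ such that
$$\|e^{tA}x\| \le \begin{cases} c_1 e^{\beta t} \|x\| & \text{for } t \ge 0, \\ c_2 e^{\alpha t} \|x\| & \text{for } t \le 0 \end{cases} \qquad \text{for all } x \in \mathbb{C}^M.$$
In particular, if $\beta < 0$, then all solutions of (1.1.15) tend to zero as $t \to +\infty$
(asymptotic stability).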
Exercise 1.1.43. Let $A \in L(\mathbb{C}^M)$ have a regular matrix representation. Show that
(i) all matrix representations of $A$ are regular;
(ii) there exists $B \in L(\mathbb{C}^M)$ such that
$$e^B = A.$$
Is this $B$ unique? How is $\sigma(B)$ related to $\sigma(A)$?
25
(1) $\varrho(x, y) = 0 \iff x = y$,
(2) $\forall x, y \in X \implies \varrho(x, y) = \varrho(y, x)$ (the so-called symmetry of the metric),
(3) $\forall x, y, z \in X \implies \varrho(x, z) \le \varrho(x, y) + \varrho(y, z)$ (the so-called triangle inequality).
If $\varrho$ is a metric on X, then
\[ B(x; r) := \{y \in X : \varrho(x, y) < r\} \]
is called the open ball centered at $x \in X$ with the radius $r > 0$. Open sets in a metric space are defined as subsets $G \subset X$ which have the following property:
for every $x \in G$ there is $\delta > 0$ such that $B(x; \delta) \subset G$.
It is easy to prove that a metric space with this definition of open sets is also a topological space. For the following notions and results see, e.g., Dieudonné [35].
A subset F of a topological space X is called a closed set if $X \setminus F$ is open. If $A \subset X$, then the intersection of all closed sets containing A is called the closure of A and is denoted by $\overline{A}$, i.e.,
\[
\overline{A} = \bigcap_{\substack{A \subset F \\ F \text{ is closed}}} F.
\]
G.
GA
G is open
is continuous at a X
(a finite subcovering).
Any subset A of a topological space X is itself a topological space with the collection of open sets $\{G \cap A : G \text{ open in } X\}$. A subset A of a topological space X is said to be compact in X if A is a compact topological space in this induced topology. Further, $A \subset X$ is said to be relatively compact provided $\overline{A}$ is compact.
In metric spaces we have the following characterization.
Proposition 1.2.1. Let X be a metric space. Then $A \subset X$ is relatively compact if and only if for any sequence $\{x_n\}_{n=1}^{\infty} \subset A$ there is a convergent subsequence.12
Beside this proposition, the importance of compactness in analysis is obvious from the next result which will be discussed more deeply in Section 6.2.
Proposition 1.2.2. Let X be either a compact topological space or a sequentially compact topological space and let f be a continuous real function on X. Then there exist a maximal and a minimal value of f, i.e., there are $x_1, x_2 \in X$ such that
\[ f(x_1) \le f(x) \le f(x_2) \quad \text{for all } x \in X. \]
12 A
Warning. These two notions of compactness are different in topological spaces. To be more precise: There is a compact topological space which is not sequentially compact and there is a sequentially compact topological space which is not compact!
which goes far beyond our present considerations. A sequence $\{x_n\}_{n=1}^{\infty}$ of elements of a metric space X is called a Cauchy sequence if for every $\varepsilon > 0$ there is $n_0 \in \mathbb{N}$ such that
\[ \varrho(x_m, x_n) < \varepsilon \quad \text{for all } m, n \ge n_0. \]
A metric space X is said to be complete if any Cauchy sequence in X is convergent (to an element of X). We will encounter complete spaces almost everywhere in the subsequent text.
Proposition 1.2.3. Let X be a complete metric space. Then $A \subset X$ is relatively compact if and only if for every $\varepsilon > 0$ there is a finite set $K \subset X$ (the so-called finite $\varepsilon$-net for A) such that
\[ \forall a \in A \ \exists x \in K : \ \varrho(a, x) < \varepsilon. \]
In other words, $A \subset \bigcup_{x \in K} B(x; \varepsilon)$.
\[ C(a) := \bigcup \{A \subset X : a \in A \text{ and } A \text{ is connected}\}. \]
Then C(a) is a connected set and it is called the component of the point a. If $a, b \in X$, $a \ne b$, then either $C(a) = C(b)$ or $C(a) \cap C(b) = \emptyset$.
Proposition 1.2.6. Let X be a connected space, let $f : X \to Y$ be continuous. Then f(X) is a connected subset of Y. In particular, if $\gamma : [0, 1] \to Y$ is continuous, $A \subset Y$, and $\gamma(0) \in A$, $\gamma(1) \notin A$, then there exists $t_0 \in [0, 1]$ such that $\gamma(t_0) \in \partial A$.
Proposition 1.2.7. Let X be a normed linear space and let G be an open subset of X. Then G is connected if and only if for any two points $a, b \in G$ there exists a continuous mapping $\gamma : [0, 1] \to G$ such that $\gamma(0) = a$, $\gamma(1) = b$. In particular, $\gamma$ can be chosen piecewise linear.
Now we are ready to start with the main subject of this section.
Definition 1.2.8. Let X be a real or complex linear space. A function $\|\cdot\|_X : X \to \mathbb{R}$ is called a norm on X if it has the following properties:
13 I.e.,
(1) $\|x\|_X = 0 \iff x = o$,
(2) $\|\alpha x\|_X = |\alpha|\,\|x\|_X$ for $\alpha \in \mathbb{R}$ or $\mathbb{C}$ and $x \in X$,
(3) $\|x + y\|_X \le \|x\|_X + \|y\|_X$ for $x, y \in X$ (the so-called triangle inequality).
If a linear space X is endowed with a norm, then X is called a normed linear space.
In the sequel we will drop the index of the norm whenever there is no danger of confusion.
It is obvious that $\varrho(x, y) := \|x - y\|$ is a metric on X. Therefore all metric notions and results are transmitted to normed linear spaces. If a normed linear space is complete in this metric, then it is called a Banach space. Any metric space can be embedded as a dense set into a complete metric space. For a normed linear space X we get a slightly stronger result:
There exists a Banach space $\tilde{X}$ (the so-called completion of X) and a linear injection $L : X \to \tilde{X}$ such that $\operatorname{Im} L$ is a dense subset of $\tilde{X}$ and
\[ \|x\|_X = \|L(x)\|_{\tilde{X}} \quad \text{for all } x \in X. \]
M
i=1
i=1
\[
\|(x_1, \dots, x_M)\|_1 := \sum_{i=1}^{M} |x_i|, \qquad
\|(x_1, \dots, x_M)\|_\infty := \max_{i=1,\dots,M} |x_i|, \qquad
\|(x_1, \dots, x_M)\|_2 := \Big(\sum_{i=1}^{M} |x_i|^2\Big)^{1/2}
\]
are norms on $\mathbb{R}^M$ (for indices 1, $\infty$ it is obvious; the triangle inequality for index 2 needs some effort, see also Proposition 1.2.30 below). These norms can be transmitted to X with help of $\Phi$, i.e.,
\[ \|x\|_\alpha := \|\Phi(x)\|_\alpha, \qquad \alpha = 1, 2, \infty. \]
Similar results are true also for a complex linear space X when $\mathbb{C}^M$ is used instead of $\mathbb{R}^M$. The space X is a Banach space with respect to any of the above norms.
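As a small numerical illustration (not part of the original text), the three norms on $\mathbb{R}^M$ can be evaluated for a sample vector. The helper names `norm1`, `norm2`, `norm_inf` are chosen here for illustration only; the chain of inequalities checked at the end, $\|x\|_\infty \le \|x\|_2 \le \|x\|_1 \le M\|x\|_\infty$, is a standard fact and is one concrete way to see that these norms are comparable.

```python
import math

def norm1(x):
    """Sum of absolute values of the coordinates."""
    return sum(abs(t) for t in x)

def norm2(x):
    """Euclidean norm: square root of the sum of squares."""
    return math.sqrt(sum(abs(t) ** 2 for t in x))

def norm_inf(x):
    """Maximum of absolute values of the coordinates."""
    return max(abs(t) for t in x)

x = [3.0, -4.0, 0.0]
M = len(x)

# Compare the three norms of the same vector.
print(norm1(x), norm2(x), norm_inf(x))

# The norms are mutually comparable with constants depending only on M.
print(norm_inf(x) <= norm2(x) <= norm1(x) <= M * norm_inf(x))
```

The constants in such comparisons depend on the dimension M, which is why the argument does not carry over to infinite dimensional spaces.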
We note that this result is true for any norm on X (see Corollary 1.2.11(i) below).
Proposition 1.2.10. Let X and Y be normed linear spaces and let A be a linear operator from X into Y. Then the following statements are equivalent:
(i) A is continuous on X;
(ii) A is continuous at $o \in X$;
(iii) there is a constant c such that the inequality
\[ \|Ax\|_Y \le c\|x\|_X \]
holds for all $x \in X$.
(1.2.1)
1
, c2 = M .) Such constants exist also for the norms 1 ,
(Here, e.g., c1 = M
$\|\cdot\|_\infty$. More generally, two norms on a linear space X are called equivalent if they satisfy such inequalities. In other words, two norms $\|\cdot\|$, $|||\cdot|||$ on a linear space X are equivalent if the identity map from $(X, \|\cdot\|)$ into $(X, |||\cdot|||)$ is continuous together with its inverse, i.e., it is an isomorphism.14
Corollary 1.2.11.
(i) Any two norms on a finite dimensional linear space X are equivalent. In particular, X is a Banach space.
(ii) Let X, Y be normed linear spaces and $\dim X < \infty$. Then $L(X, Y) = \mathcal{L}(X, Y)$, i.e., any linear operator from X into Y is continuous.
Proof. (i) Let $\Phi$ be as in Example 1.2.9 and consider $\mathbb{R}^M$ (or $\mathbb{C}^M$) equipped with the $\|\cdot\|_1$ norm. Then for $x = (x_1, \dots, x_M) \in \mathbb{R}^M$ we have
\[
\|\Phi^{-1}(x)\|_X = \Big\|\sum_{i=1}^{M} x_i f_i\Big\|_X \le \sum_{i=1}^{M} |x_i|\,\|f_i\|_X \le c \sum_{i=1}^{M} |x_i| = c\|x\|_1,
\]
and $\{u_n\}_{n=1}^{\infty} \subset X$ is a Cauchy sequence if and only if $\{\Phi(u_n)\}_{n=1}^{\infty}$ is a Cauchy sequence, X is a Banach space.
Figure 1.2.1. (Diagram: the spaces $X$ and $\mathbb{R}^M$ ($\mathbb{C}^M$) related by $\Phi^{-1}$.)
\[
\|Ax\|_Y \le \sum_{i=1}^{M} |x_i|\,\|Af_i\|_Y \le c \sum_{i=1}^{M} |x_i| = c\|x\|_1.
\]
\[
\varrho(f, g) := \sum_{n=1}^{\infty} \frac{1}{2^n}\,\frac{\|f - g\|_n}{1 + \|f - g\|_n} \tag{1.2.2}
\]
15 A basic example is $\mathbb{R}^M$ or $\mathbb{C}^M$. Another example is the set $\mathbb{N}$ of natural numbers with the discrete metric: $d(m, n) = 1$ if $m \ne n$ and $d(m, m) = 0$.
where
\[ \|f - g\|_n := \sup\{|f(x) - g(x)| : x \in T_n\}, \]
defines a metric on C(T) and the convergence of a sequence in this metric is actually the locally uniform convergence, i.e., uniform convergence on any compact subset of T. Since $\varrho$ is bounded it cannot be induced by any norm. Even more is true, namely, there is no norm on C(T) which generates the same system of open sets as the metric $\varrho$ does (provided T itself is not compact).
We now state two fundamental results concerning spaces of continuous functions. To formulate the first we need the concept of equicontinuity:
A family $\mathcal{F} \subset C(T)$ is said to be equicontinuous if for all $x \in T$ and $\varepsilon > 0$ there is a neighborhood U of x such that
\[ y \in U,\ f \in \mathcal{F} \implies |f(y) - f(x)| < \varepsilon. \]
Proof. The proof can be found, e.g., in Dugundji [43, XIII, 3] or Kelley [75, Chapter
7, Exercise T].
We note that Theorem 1.2.14 can be easily extended to the space of complex continuous functions. In this case, A is assumed to possess the following additional property:
If $f \in A$, then also $\overline{f} \in A$.18
The reader can ask why certain additional properties are needed for compactness in infinite dimensional spaces like C(T) in contrast to finite dimensional spaces. The following theorem explains not only this situation but also the technical difficulties which one meets in the calculus of variations (see Chapter 6).
Proposition 1.2.15 (F. Riesz). Let X be a normed linear space. Then the closed unit ball $B(o; 1) := \{x \in X : \|x\| \le 1\}$ is compact (in the norm topology) if and only if X has finite dimension.
Proof. Sufficiency is obvious (see Example 1.2.9 and Corollary 1.2.11(i)). It remains to prove necessity. We proceed by contradiction. Assume that $\dim X = \infty$. Choose $0 < \varepsilon < 1$ and suppose that we have $x_1, \dots, x_n \in B(o; 1)$ such that
\[ \|x_i - x_j\| > 1 - \varepsilon \quad \text{for } i \ne j. \]
We shall show that we can find another element $x_{n+1} \in B(o; 1)$ such that $\{x_1, \dots, x_{n+1}\}$ has the same property. Since $X_n = \operatorname{Lin}\{x_1, \dots, x_n\} \ne X$ there is $y \in X \setminus X_n$. Denote
\[ d := \inf\{\|y - x\| : x \in X_n\}. \]
Observe that $d > 0$ since $X_n$ is a closed subspace.19 By the definition of the greatest lower bound, there exists $\tilde{x} \in X_n$ such that
\[ d \le \|y - \tilde{x}\| < d(1 + \varepsilon). \]
For $x_{n+1} := \dfrac{y - \tilde{x}}{\|y - \tilde{x}\|}$ and any $x \in X_n$ we have
\[
\|x_{n+1} - x\| = \frac{1}{\|y - \tilde{x}\|}\,\big\|y - (\tilde{x} + \|y - \tilde{x}\|\,x)\big\| > \frac{1}{d(1 + \varepsilon)}\,d > 1 - \varepsilon.
\]
Thus an infinite sequence $\{x_n\}_{n=1}^{\infty} \subset B(o; 1)$ with no convergent subsequence has been constructed, which contradicts compactness of $B(o; 1)$.
Example 1.2.16 (spaces of integrable functions). Let $\Omega$ be a Lebesgue measurable subset of $\mathbb{R}^M$ and let dx denote the Lebesgue measure in $\mathbb{R}^M$.
18 If
19 Every
(1.2.3)
(1.2.4)
implies that $\mathcal{L}^p(\Omega)$ is a linear space. Observe that $\|\cdot\|_{\mathcal{L}^p(\Omega)}$ is not a norm since $\|f\|_{\mathcal{L}^p(\Omega)} = 0$ implies only $f = o$ almost everywhere (abbreviation: a.e.) in $\Omega$. Put
\[ N(\Omega) = \{f : \Omega \to \mathbb{C} : f = o \text{ a.e. in } \Omega\}. \]
Then N is a linear subspace of $\mathcal{L}^p$ and the factor space
\[ L^p(\Omega) := \mathcal{L}^p(\Omega) / N \]
is a normed linear space with the norm
\[ \|[f]\|_{L^p(\Omega)} = \|f\|_{\mathcal{L}^p(\Omega)}. \]
For the sake of simplicity we will use the notation f instead of the superfluous [f] for an element of $L^p(\Omega)$ and will call it simply a function. It is also convenient to introduce the space $L^\infty(\Omega)$ of all (classes of) essentially bounded measurable functions. We recall that f is said to be essentially bounded on $\Omega$ if there is a constant c such that
\[ |f(x)| \le c \quad \text{for a.e. } x \text{ in } \Omega. \]
The least possible c is denoted by $\|f\|_{L^\infty(\Omega)}$. Again $\|\cdot\|_{L^\infty(\Omega)}$ is a norm on $L^\infty(\Omega)$.
We mention another important inequality, the so-called Hölder inequality:
1
is
If 1 p and p is the conjugate exponent ( p1 + p1 = 1 where
(1.2.5)
g
For the sake of simplicity we will use in the sequel the notation $\|\cdot\|_p$ instead of $\|\cdot\|_{L^p(\Omega)}$.
{nk }k=1
gp (x) dx
1
2k
p
k=1
p
1
,
2k
k=1
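the Monotone Convergence Theorem21 gives that $g = \lim_{n \to \infty} g_n$ has a finite integral over $\Omega$ and therefore g is finite a.e. in $\Omega$. This means that
\[ \sum_{k=1}^{\infty} |f_{n_{k+1}}(x) - f_{n_k}(x)| \]
is a.e. convergent, and therefore $f(x) := \lim_{k \to \infty} f_{n_k}(x)$ exists a.e. in $\Omega$. By the Fatou Lemma22 we have
\[
\int_\Omega |f(x) - f_{n_k}(x)|\,dx \le \liminf_{l \to \infty} \int_\Omega |f_{n_l}(x) - f_{n_k}(x)|\,dx \le \frac{1}{2^{k-1}}.
\]
In particular,
\[ f \in L^1(\Omega) \quad \text{and} \quad \lim_{k \to \infty} \|f_{n_k} - f\|_1 = 0. \]
The rest of the proof is easy. Indeed, a Cauchy sequence which has a convergent subsequence is itself convergent.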
Remark 1.2.18. The proof shows that the following statement is true:
for a.e.
x .
gn (x) dx =
lim
n
22 The
g(x) dx.
The statement holds for lim sup with the reverse inequality for a sequence bounded
above by an integrable function.
Put here hl = fnl fnk .
Approximations of integrable functions by more regular functions, like continuous or differentiable ones, are often desirable.
Proposition 1.2.19 (Density Theorem). For any $p \in [1, \infty)$ the subset $C(\Omega) \cap L^p(\Omega)$ is dense in $L^p(\Omega)$.
Proof. It is based on the application of the Luzin Theorem.23 See also Proposition 1.2.21 below.
We now show another type of approximation which is more constructive and therefore often more convenient in applications. If f, g are measurable functions on $\mathbb{R}^M$, then we define their convolution $f * g$ as
\[ (f * g)(x) := \int_{\mathbb{R}^M} f(x - y)g(y)\,dy \tag{1.2.6} \]
for all $x \in \mathbb{R}^M$ for which the integral exists. We note that the properties of the convolution follow from the Fubini Theorem provided measurability of the function $(x, y) \mapsto f(x - y)g(y)$ is established. For details see, e.g., Folland [52], Gripenberg, Londen & Staffans [62, Chapters 2-4], and also Example 2.1.28. The following assertion is a basic result on convolutions.
Proposition 1.2.20. Let $f \in L^1(\mathbb{R}^M)$.
(i) If $g \in L^p(\mathbb{R}^M)$, $1 \le p \le \infty$, then
\[ f * g \in L^p(\mathbb{R}^M) \quad \text{and} \quad \|f * g\|_p \le \|f\|_1 \|g\|_p. \]
Lp (RM ), then x
(f g) = f x
a.e. in RM .
(iii) If g Lp (RM ) and x
i
i
i
(x) dx = 1 (the so(iv) If is a nonnegative, measurable function with
RM
Roughly speaking, the Luzin Theorem says that a bounded measurable function is continuous
with respect to sets, measures of which are arbitrarily close to the measure of provided the
latter is nite. For a more general formulation and the proof of the Luzin Theorem the reader
can consult, e.g., Rudin [113, 2.23].
Cm = .
m=1
f p (meas ) p p f p ,
f Lp ().
(1.2.7)
This means that the identity map of Lp () into Lp() is continuous. We will
denote this fact by Lp () Lp() and say that Lp () is continuously embedded
into Lp().
Warning. Simple examples show that this is not true if $\operatorname{meas} \Omega = \infty$!
The following assertion is an analogue of Theorem 1.2.13.
Proposition 1.2.23 (A.N. Kolmogorov). Let $\Omega$ be an open set in $\mathbb{R}^M$. Then $\mathcal{M} \subset L^p(\Omega)$, $p \in [1, \infty)$, is relatively compact if and only if the following conditions are satisfied:
(i) $\mathcal{M}$ is bounded in $L^p(\Omega)$,
(ii) > 0 > 0 f M:
(iii) > 0 > 0 f M:
{x:xRM }
f (x)p dx < .
Proof. For the proof based on Proposition 1.2.3 see Yosida [135, Chapter 10, 1].
Remark 1.2.24. All results from 1.2.16-1.2.23 also hold in spaces of sequences
p1
n=1
which can be regarded as $L^p(\mathbb{N})$ equipped with the counting measure ($\mu(A) = \operatorname{card} A$).
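Example 1.2.25 (spaces of differentiable functions). We can consider either classical derivatives (defined as limits of relative differences) or weak derivatives. We start with the former case.
Let $\alpha = (\alpha_1, \dots, \alpha_M)$ be a multiindex, i.e., $\alpha_i \in \mathbb{N} \cup \{0\}$, $i = 1, \dots, M$, and $|\alpha| := \alpha_1 + \cdots + \alpha_M$. For a function f on an open set $\Omega \subset \mathbb{R}^M$ we put
\[
D^\alpha f(x) := \frac{\partial^{|\alpha|} f(x)}{\partial x_1^{\alpha_1} \cdots \partial x_M^{\alpha_M}}
\]
and say that $f \in C^n(\Omega)$ if $D^\alpha f$ are continuous for all multiindices $\alpha$ for which $|\alpha| \le n$. We can use the metric $\varrho$ given by (1.2.2) to define
\[ \varrho_\alpha(f, g) := \varrho(D^\alpha f, D^\alpha g) \]
for a multiindex $\alpha$ and set
\[ \varrho_n(f, g) := \sum_{|\alpha| \le n} \varrho_\alpha(f, g). \]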
uniformly on
for all  n.
x + y
, then we set f (x + y) 0.
connection with this notation observe that for a relatively compact set all derivatives
D f ,  n 1, are uniformly continuous, and therefore continuously extendable to .
24 If
25 In
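The quantity
\[
\|f\|_{C^{0,\gamma}(\overline{\Omega})} := \sup_{x \in \Omega} |f(x)| + \sup_{\substack{x, y \in \Omega \\ x \ne y}} \frac{|f(x) - f(y)|}{\|x - y\|^{\gamma}}
\]
is a norm on the space $C^{0,\gamma}(\overline{\Omega})$ of Hölder continuous, bounded functions on $\Omega$. The space $C^{n,\gamma}(\overline{\Omega})$ is defined similarly.
We note that $C^{n,\gamma}(\overline{\Omega})$ is a Banach space with respect to the above norm (cf. Exercise 7.1.4).
Now we turn our attention to weak derivatives on an open set $\Omega \subset \mathbb{R}^M$. Let
\[ f \in L^1_{\mathrm{loc}}(\Omega) \]
(this means that $f \in L^1(K)$ for every compact subset $K \subset \Omega$), and let $\alpha$ be a multiindex. A function $g \in L^1_{\mathrm{loc}}(\Omega)$ is called the $\alpha$th weak derivative of f if
\[
\int_\Omega f(x) D^\alpha \varphi(x)\,dx = (-1)^{|\alpha|} \int_\Omega g(x)\varphi(x)\,dx \quad \text{for every } \varphi \in \mathcal{D}(\Omega). \tag{1.2.8}
\]
We will denote $g = D_w^\alpha f$.
Warning. Even in the one-dimensional case the ordinary derivative existing almost everywhere need not be the weak derivative!
For example, the Heaviside function
\[
H(x) = \begin{cases} 1, & x \ge 0, \\ 0, & x < 0, \end{cases}
\]
satisfies
\[ H'(x) = 0 \quad \text{for } x \in \mathbb{R} \setminus \{0\} \]
but the weak derivative does not exist. The distributional derivative of H 27 is the Dirac measure.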
26 If
W k,p () {f Lp () : derivatives Dw
f exist
Dw
f p .28
(1.2.9)
k
k 29
.
(i) If k < Np , then W k,p (RN ) Lp (RN ) for p1 = p1 N
(ii) If k =
(iii) If k >
N
p,
N
p,
then
W k,p (RN ) Lr (RN )
for all
r [p, )
for all
r 1.
and
N 30
p.
Proof. Proofs of these statements are quite involved and also have a long history. The interested reader can consult, e.g., Adams [2], Kufner, John & Fučík [82], Maz'ja [93], Stein [123, Chapters V, VI]. For a readable account of Sobolev spaces we recommend Evans [48, Chapter 5]. Spaces with fractional derivatives which extend the class of Sobolev spaces can also be defined; see, e.g., Triebel [128], [129].
Remark 1.2.27. The situation for an open set $\Omega$ with a nonempty boundary $\partial\Omega$ (in particular, for a bounded $\Omega$) is even more complicated because some techniques from harmonic analysis, like Fourier transform, are not available. One possibility is to extend $f \in W^{k,p}(\Omega)$ to a function $\tilde{f} \in W^{k,p}(\mathbb{R}^N)$. This is possible if the boundary $\partial\Omega$ possesses certain smoothness properties. To explain this more precisely we would need some facts about manifolds (see Section 4.3 and Appendix 4.3A). So we omit details and just state that Theorem 1.2.26 is true provided $\partial\Omega$ is locally Lipschitz (see Section 7.3 for details).
Theorem 1.2.28 (Rellich-Kondrachov). Let $\Omega$ be a bounded open set in $\mathbb{R}^N$ with a locally Lipschitz boundary, $k \in \mathbb{N}$, $p \in [1, \infty)$.
(i) Let k <
N
p
pN
.
N kp
(1.2.10)
N
p,
Now, we turn our attention to abstract spaces. Proposition 1.2.15 has pointed out the difference between finite dimensional spaces and (infinite dimensional) function spaces. Another difference between the finite and infinite dimension lies in the notion of a basis. It can be shown that any algebraic basis in an infinite dimensional Banach space X has to be uncountable, and therefore the representation of a point by its coordinates can hardly be of any use. This observation leads to the necessity of expressing an element of X by an infinite sum. A sequence
n en .
(1.2.11)
n=1
will use the notation for compact embeddings. An embedding of X into Y is compact
if a ball in X is relatively compact in Y .
31 We
There are several imperfections in this definition. Namely, there are separable32 Banach spaces which do not possess a Schauder basis. Moreover, the convergence of the sum in (1.2.11) can be understood in several nonequivalent meanings. These problems do not appear in a special class of spaces with an additional structure which is connected with the norm and allows measuring angles.
Definition 1.2.29. Let X be a real (or complex) linear space. A mapping $(\cdot, \cdot)_X : X \times X \to \mathbb{R}$ (or $\mathbb{C}$) is called a scalar product on X if the following conditions are satisfied:
(1) for any $y \in X$ the mapping $x \mapsto (x, y)_X$ is linear;
(2) $(x, y)_X = (y, x)_X$ for all $x, y \in X$ in the real case and $(x, y)_X = \overline{(y, x)_X}$ in the complex case;
(3) $(x, x)_X \ge 0$ for every $x \in X$ and $(x, x)_X = 0$ if and only if $x = o$.
Proposition 1.2.30. Let $(\cdot, \cdot)$ be a scalar product on a linear space X. Then
(i) the so-called Schwarz inequality
\[ |(x, y)|^2 \le (x, x)(y, y) \tag{1.2.12} \]
holds for all $x, y \in X$;
If X is a linear space with a scalar product we will always consider the norm on X induced by this scalar product. If X is complete with respect to this norm, then X is called a Hilbert space and will be usually denoted by H. We note that if X is not complete there exists a Hilbert space $\tilde{H}$ which is a completion of X.
32 If a space X has a Schauder basis, then X is separable. This is not a serious drawback since most function spaces used in analysis are separable.
33 Notice here a typical procedure with the norm induced by a scalar product, namely using the second power of the norm in calculation.
Example 1.2.31.
(i) $\mathbb{R}^M$ with the scalar product
\[
(x, y) = \sum_{i=1}^{M} \xi_i \eta_i, \qquad x = \sum_{i=1}^{M} \xi_i e_i, \quad y = \sum_{i=1}^{M} \eta_i e_i,
\]
(in the complex case). Similarly, for p = 2 the norm (1.2.9) is equivalent to the norm induced by the scalar product
\[ (f, g)_{W^{k,2}(\Omega)} = \sum_{|\alpha| \le k} (D_w^\alpha f, D_w^\alpha g)_{L^2(\Omega)}. \]
(iii) The sup norm on $BC(\Omega)$ is not induced by any scalar product. This can be seen from the parallelogram identity
\[ \|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2, \quad x, y \in X \tag{1.2.14} \]
\[ \frac{1}{4}\big(\|x + y\|^2 - \|x - y\|^2\big) \tag{1.2.15} \]
(polarization identity) has all properties of a scalar product, and the induced norm coincides with $\|\cdot\|$. It is not difficult to show that the sup norm does not satisfy (1.2.14). Even more is true, namely, the sup norm is not equivalent to any norm on $BC(\Omega)$ induced by a scalar product. Since $C[0, 1] \subset L^2(0, 1)$, the scalar product (1.2.13) is also a scalar product on C[0, 1]. But the space C[0, 1] is not complete in the $L^2$ norm and, therefore, the $L^2$ norm on C[0, 1] cannot be equivalent to the sup norm; only the inequality
\[ \|f\|_{L^2(0,1)} \le \|f\|_{C[0,1]} \]
holds. Observe that $L^2(0, 1)$ is a completion of C[0, 1] with respect to the integral norm given by (1.2.3).
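The failure of the parallelogram identity (1.2.14) for a sup norm can be checked on a toy case. The sketch below is an illustration under a simplifying assumption: functions on a two-point set are represented as pairs of numbers, so the sup norm becomes the maximum of two absolute values. The helper names `sup_norm`, `euclid_norm` and `parallelogram_defect` are hypothetical and chosen only for this example.

```python
import math

def sup_norm(v):
    """Sup norm of a function on a finite set, stored as a list of values."""
    return max(abs(t) for t in v)

def euclid_norm(v):
    """Norm induced by the standard scalar product."""
    return math.sqrt(sum(t * t for t in v))

def parallelogram_defect(norm, x, y):
    """Left-hand side minus right-hand side of the parallelogram identity."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return norm(s) ** 2 + norm(d) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2

x, y = [1.0, 0.0], [0.0, 1.0]

# Nonzero defect: the sup norm violates the identity for this pair.
print(parallelogram_defect(sup_norm, x, y))

# Defect is zero up to rounding for the norm induced by a scalar product.
print(parallelogram_defect(euclid_norm, x, y))
```

A single pair with nonzero defect is enough to conclude that the norm in question is not induced by any scalar product.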
The most useful concept in spaces with a scalar product is the following one.
0, = ,
1, = .
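\[
y_{k+1} := x_{k+1} - \sum_{j=1}^{k} (x_{k+1}, e_j)e_j, \qquad e_{k+1} := \frac{y_{k+1}}{\|y_{k+1}\|}.
\]
It is obvious that
\[ (e_j, e_{k+1}) = 0, \quad j = 1, \dots, k, \qquad \|e_{k+1}\| = 1 \]
and
\[ \operatorname{Lin}\{x_1, \dots, x_{k+1}\} = \operatorname{Lin}\{e_1, \dots, e_{k+1}\}. \]
This procedure is called the Schmidt orthogonalization. For any $x \in Y := \operatorname{Lin}\{x_1, \dots, x_n\}$ we have $x = \sum_{k=1}^{n} \alpha_k e_k$. Taking the scalar product with $e_j$, we get
\[ (x, e_j) = \sum_{k=1}^{n} \alpha_k (e_k, e_j) = \alpha_j, \]
and also
\[
\|x\|^2 = \Big(\sum_{j=1}^{n} (x, e_j)e_j,\ \sum_{k=1}^{n} (x, e_k)e_k\Big) = \sum_{k=1}^{n} |(x, e_k)|^2.
\]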
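The orthogonalization step and the resulting coordinate formulas can be tried out numerically. The following is a minimal sketch, assuming real vectors stored as Python lists and the standard scalar product on $\mathbb{R}^3$; the function names `dot` and `schmidt` are ours, not the book's.

```python
import math

def dot(u, v):
    """Standard scalar product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def schmidt(vectors):
    """Orthonormalize linearly independent real vectors, one at a time."""
    basis = []
    for x in vectors:
        # Subtract from x its components along the already built e_j.
        y = list(x)
        for e in basis:
            c = dot(x, e)
            y = [yi - c * ei for yi, ei in zip(y, e)]
        norm_y = math.sqrt(dot(y, y))
        if norm_y < 1e-12:
            raise ValueError("vectors are (numerically) linearly dependent")
        # Normalize to obtain the next orthonormal vector.
        basis.append([yi / norm_y for yi in y])
    return basis

e = schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])

# Coordinates of a sample x are the scalar products (x, e_k),
# and their squares sum up to the squared norm of x.
x = [2.0, -1.0, 3.0]
coords = [dot(x, ek) for ek in e]
print(sum(c * c for c in coords), dot(x, x))
```

For badly conditioned inputs this classical variant loses orthogonality in floating point; that is a numerical issue and does not affect the exact statement above.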
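\[
\begin{aligned}
\|y - x\|^2 &= \Big(y - \sum_{j=1}^{n} \alpha_j e_j,\ y - \sum_{j=1}^{n} \alpha_j e_j\Big) \\
&= \|y\|^2 - \sum_{j=1}^{n} \overline{\alpha_j}\,(y, e_j) - \sum_{j=1}^{n} \alpha_j \overline{(y, e_j)} + \sum_{j=1}^{n} |\alpha_j|^2 \\
&= \|y\|^2 + \sum_{j=1}^{n} |\alpha_j - (y, e_j)|^2 - \sum_{j=1}^{n} |(y, e_j)|^2 \\
&\ge \|y\|^2 - \sum_{j=1}^{n} |(y, e_j)|^2 = \Big\|y - \sum_{j=1}^{n} (y, e_j)e_j\Big\|^2.
\end{aligned} \tag{1.2.16}
\]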
Two consequences follow from this inequality. First, the best approximation of $y \in X$ by an element of Y is $P_n y := \sum_{j=1}^{n} (y, e_j)e_j$.34 Second,
\[ \sum_{j=1}^{n} |(y, e_j)|^2 \le \|y\|^2 \quad \text{for all } y \in X. \]
Since n is arbitrary (in an infinite dimensional space) we have obtained the so-called Bessel inequality:
\[ \sum_{n=1}^{\infty} |(y, e_n)|^2 \le \|y\|^2 \quad \text{for all } y \in X. \tag{1.2.17} \]
34 We note that this result, namely the linearity of the operator $P_n$ of the best approximation, is typical for spaces with scalar products. In a general normed linear space X and a finite dimensional subspace Y the best approximation of an arbitrary $x \in X$ by elements of Y exists (by a compactness argument) but a special property of the norm is needed for the uniqueness of the best approximation. Linearity of the best approximation on all subspaces of dimension 2 implies that the norm is induced by a scalar product. More details can be found in the monograph of Singer [120].
Proposition 1.2.33. Let X be a linear space with a scalar product, let X be separable.35 Then there exists an orthonormal basis in X.
Proof. Let $\{x_1, x_2, x_3, \dots\}$ be a dense set in X. Put
\[ Y_n = \operatorname{Lin}\{x_1, \dots, x_n\}, \qquad Y = \bigcup_{n=1}^{\infty} Y_n. \]
inequality (1.2.16),
n
x yn x
(x, ej )ej
.
j=1
(x, ej )ej .
j=1
j=1
continuous, we have
(x, ek ) = lim
n
n
j ej , ek = k .
j=1
In order to obtain some useful properties which guarantee that an orthonormal sequence is a basis we need to use completeness. We start with a general
approximation result.
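Theorem 1.2.34. Let H be a Hilbert space and let C be a closed convex subset of H. Then for any $x \in H$ there exists a unique $y \in C$ such that
\[ \|x - y\| = \inf\{\|x - z\| : z \in C\}. \tag{1.2.18} \]
Moreover, $y \in C$ satisfies (1.2.18) if and only if
\[ \operatorname{Re}(x - y, y - z) \ge 0 \quad \text{for all } z \in C. \tag{1.2.19} \]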
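A concrete instance may help to visualize the nearest point in a closed convex set. The sketch below is an illustration only, under the assumption that $H = \mathbb{R}^2$ with the standard scalar product and C is the closed unit disc, where the nearest point is obtained by radial scaling. It checks, on a finite grid of points $z \in C$, the inequality $(x - y, y - z) \ge 0$ used in the proof that follows, and that no sampled z is closer to x than y. The names `project_to_unit_disc` and `inner` are hypothetical.

```python
import math

def project_to_unit_disc(x):
    """Nearest point to x in the closed unit disc of the plane."""
    r = math.hypot(x[0], x[1])
    return list(x) if r <= 1.0 else [x[0] / r, x[1] / r]

def inner(u, v):
    """Standard scalar product in the plane."""
    return u[0] * v[0] + u[1] * v[1]

x = [3.0, 4.0]
y = project_to_unit_disc(x)

# Sample points z of the disc on a grid with step 0.1.
samples = [[i / 10, j / 10]
           for i in range(-10, 11)
           for j in range(-10, 11)
           if i * i + j * j <= 100]

# Smallest value of (x - y, y - z) over the sample; expected nonnegative.
x_minus_y = [x[0] - y[0], x[1] - y[1]]
worst = min(inner(x_minus_y, [y[0] - z[0], y[1] - z[1]]) for z in samples)

# Distance from x to y versus the smallest distance from x to a sample.
d = math.hypot(x[0] - y[0], x[1] - y[1])
closest_sample = min(math.hypot(x[0] - z[0], x[1] - z[1]) for z in samples)
print(y, worst >= -1e-12, d <= closest_sample + 1e-12)
```

A grid check is of course no proof; it only makes the geometric content of the characterization visible for one choice of C and x.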
Figure 1.2.2. (Diagram: the point $x$, its nearest point $y$ in the convex set $C$, a point $z \in C$, and the vectors $x - y$ and $y - z$.)
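Proof.
Step 1 (Existence). Denote the right-hand side in (1.2.18) by d. If d = 0, then $x \in C$ (C is closed) and y = x. Suppose that d > 0. Then there are $\{z_n\}_{n=1}^{\infty} \subset C$ such that
\[ d \le \|x - z_n\| < d + \frac{1}{n}. \]
By (1.2.14) we get
\[
\begin{aligned}
\|z_n - z_m\|^2 &= \|x - z_m - (x - z_n)\|^2 \\
&= 2\big(\|x - z_m\|^2 + \|x - z_n\|^2\big) - 4\Big\|x - \frac{z_n + z_m}{2}\Big\|^2 \\
&< 2\Big(d + \frac{1}{n}\Big)^2 + 2\Big(d + \frac{1}{m}\Big)^2 - 4d^2
\end{aligned}
\]
(notice that $\frac{z_n + z_m}{2} \in C$ since C is convex). This implies that $\{z_n\}_{n=1}^{\infty}$ is a Cauchy sequence, and therefore it is convergent to a $y \in C$. Obviously, $\|x - y\| = d$.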
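i.e.,
\[ t\|y - z\|^2 + 2\operatorname{Re}(x - y, y - z) \ge 0 \]
and taking the limit for $t \to 0+$, the inequality (1.2.19) follows. If (1.2.19) is satisfied, then
\[ \|x - z\|^2 = \|x - y + y - z\|^2 = \|x - y\|^2 + \|y - z\|^2 + 2\operatorname{Re}(x - y, y - z) \ge \|x - y\|^2, \]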
M M .
Moreover, if P denotes the projection to M given by this direct sum37 (the so-called orthogonal projection), then $P \in \mathcal{L}(H)$, $\|P\|_{\mathcal{L}(H)} = 1$ and
\[ (Px, y) = (x, Py) \quad \text{for all } x, y \in H. \]
for all w M.
(1.2.20)
i.e.,
P L(H) 1.
Example 1.1.13(i).
Corollary 1.2.36. Let H be a Hilbert space and let $\{e_n\}_{n=1}^{\infty}$ be an orthonormal sequence in H. Then the following statements are equivalent:
(i) $\{e_n\}_{n=1}^{\infty}$ is an orthonormal basis;
(ii) if $(x, e_n) = 0$ for all n, then x = o;
(iii) $\operatorname{Lin}\{e_1, e_2, \dots\}$ is dense in H;
(iv) the Parseval equality
\[ \|x\|^2 = \sum_{n=1}^{\infty} |(x, e_n)|^2 \tag{1.2.21} \]
holds for all $x \in H$.
Proof. The implication (i) $\Rightarrow$ (ii) is obvious and follows from the definition of the orthonormal basis.
The implication (ii) $\Rightarrow$ (iii): Denote $Y = \operatorname{Lin}\{e_1, e_2, \dots\}$. Assume that Y is not dense, i.e., $\overline{Y} \ne H$. By Corollary 1.2.35 there exists $x \in \overline{Y}^{\perp} \setminus \{o\}$. In particular, $(x, e_n) = 0$ for all n, a contradiction.
The implication (iii) $\Rightarrow$ (iv): The proof of Proposition 1.2.33 shows that the sequence
\[ s_n := \sum_{k=1}^{n} (x, e_k)e_k \quad \text{converges to } x \text{ for all } x \in H. \]
\[ \|x\|^2 = \sum_{k=1}^{n} |(x, e_k)|^2 + \|x - s_n\|^2. \]
k=1
(x y, en )2 = 0.
n=1
Remark 1.2.37. Let H be a Hilbert space and $\{e_n\}_{n=1}^{\infty}$ an orthonormal basis in H. The proof of the last implication shows that for an arbitrary sequence $\{\alpha_n\}_{n=1}^{\infty} \subset \mathbb{R}$ (or $\mathbb{C}$ depending on whether H is a real or complex space) for which $\sum_{n=1}^{\infty} |\alpha_n|^2$ is
n=1
t2
n t2
dn et
dtn
(the so-called Hermite polynomials) form (after normalization) an orthonormal basis in $L^2(\mathbb{R})$. For the proof and relevant results in harmonic analysis we recommend the classical book Kaczmarz & Steinhaus [70]. We note that
38 l2 (N)
+
f(n)eint
n=1
n 2 is
n=1
n n for x = {n }
n=1 ,
1.2.24).
where
1
f(n) = (f, en )L2 (,) =
2
f (t)eint dt
and the series is convergent in the $L^2$ norm for arbitrary $f \in L^2(-\pi, \pi)$. It is worth noting that the series is actually a.e. convergent to f but this by no means follows from the norm convergence. This result is due to L. Carleson and it is one of the most difficult and profound results in analysis.
there are many different orthonormal bases in $L^2$ spaces. We will present one general method of their construction in Theorem 2.2.16.
for all
x M.
n=k
for all
x H.
F (x) = F (x0 ).
Choose now = F (x0 ). If there is another g H such that F (x) = (x, g), x H,
then $0 = (x, f - g)$ for all $x \in H$, in particular, for $x = f - g$. Therefore f = g. By the Schwarz inequality (1.2.12) we obtain
\[ |F(x)| = |(x, f)| \le \|x\|\,\|f\|, \quad \text{i.e.,} \quad \|F\| \le \|f\|. \]
for each
x H,
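then A is invertible,
\[ A^{-1} \in \mathcal{L}(H) \quad \text{and} \quad \|A^{-1}\|_{\mathcal{L}(H)} \le \frac{1}{d}. \]
Proof. The existence of A follows from (i), (iii) and the Riesz Representation Theorem. The property (ii) yields the linearity of A. Since
\[ \|Ay\|^2 = (Ay, Ay) = B(Ay, y) \le c\|Ay\|\,\|y\|, \]
we have $\|Ay\| \le c\|y\|$, i.e., $A \in \mathcal{L}(H)$ and $\|A\|_{\mathcal{L}(H)} \le c$.
The property (iv) means that
\[ d\|y\|^2 \le B(y, y) = (y, Ay) \le \|y\|\,\|Ay\|, \]
i.e.,
\[ \|Ay\| \ge d\|y\| \quad \text{for all } y \in H. \tag{1.2.22} \]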
and
w = o.
1
.
d
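Hint. Use Proposition 1.2.3. Obviously, this statement is also a special case of Theorem 1.2.13.
Exercise 1.2.44. Let $\{e_n\}_{n=1}^{\infty}$ be an orthonormal basis in a Hilbert space H. Define
\[
f(x) =
\begin{cases}
n & \text{if } x = e_n, \\
n(1 - 2\|x - e_n\|) & \text{if } \|x - e_n\| < \frac{1}{2}, \\
0 & \text{otherwise.}
\end{cases}
\]
Show that f is a well-defined continuous functional on H which is not bounded on the closed unit ball.
Exercise 1.2.45. Let $\emptyset \ne M \subset X$ be a subset of a normed linear space X. For $x \in X$ set
\[ \operatorname{dist}(x, M) = \inf\{\|x - y\| : y \in M\}. \]
Prove that for any $x_1, x_2 \in X$ we have
\[ |\operatorname{dist}(x_1, M) - \operatorname{dist}(x_2, M)| \le \|x_1 - x_2\|. \]
Hint. Assume $\operatorname{dist}(x_1, M) \ge \operatorname{dist}(x_2, M)$. For any $\varepsilon > 0$ there exists $x_\varepsilon \in M$ such that
\[ \|x_2 - x_\varepsilon\| < \operatorname{dist}(x_2, M) + \varepsilon. \]
Use the triangle inequality for $\|x_1 - x_\varepsilon\|$.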
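Before proving the inequality of Exercise 1.2.45 it can be reassuring to test it numerically. The sketch below assumes the simplest setting, a finite set M of points in the Euclidean plane, so that the infimum is a minimum; the name `dist_to_set` and the sample data are made up for illustration.

```python
import math

def dist_to_set(x, points):
    """dist(x, M) for a finite nonempty set M of points in the plane."""
    return min(math.hypot(x[0] - y[0], x[1] - y[1]) for y in points)

M_set = [(0.0, 0.0), (2.0, 1.0), (-1.0, 3.0)]
x1, x2 = (1.0, 1.0), (-2.0, 0.5)

# Left-hand and right-hand side of the inequality from the exercise.
lhs = abs(dist_to_set(x1, M_set) - dist_to_set(x2, M_set))
rhs = math.hypot(x1[0] - x2[0], x1[1] - x2[1])
print(lhs, rhs, lhs <= rhs)
```

The check says that the distance function is Lipschitz continuous with constant 1, which is exactly what the exercise asks to prove in general.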
Exercise 1.2.46.40 Let $\Omega$ be a bounded open set in $\mathbb{R}^M$. For $p \in [1, \infty)$ and $k \in \mathbb{N}$ define $W_0^{k,p}(\Omega)$ to be the closure of $\mathcal{D}(\Omega)$ with respect to the $W^{k,p}(\Omega)$-norm (1.2.9).
(i) Prove that
\[ W_0^{k,p}(\Omega) \subset W^{k,p}(\Omega) \]
and $W_0^{k,p}(\Omega)$ need not be dense in $W^{k,p}(\Omega)$ (compare it with the statement of Theorem 1.2.28(iii); see also the Trace Theorem (Theorem 7.3.1)).
(ii) Prove the Poincaré inequality:
There exists a constant $c_p$ such that for all $u \in W_0^{1,p}(\Omega)$ the inequality
\[ \int_\Omega |u(x)|^p\,dx \le c_p \int_\Omega \|\nabla u(x)\|^p\,dx \ {}^{41} \]
holds.
40 Supplement
41 Finding
to Example 1.2.25.
the smallest possible value of the constant
dicult problem. See also
' cp is a much more
(
u
u
, . . . , x
x1
M
where
,
xi
i = 1, . . . , M , are
Hint. It suffices to prove the assertion for $u \in \mathcal{D}(\Omega)$. Consider first $\Omega = (0, 1)$ and use the Mean Value Theorem. Then suppose (without loss of generality)
p1
u(x)p dx
u(x) dx
p
p1
p1
u(x) dx
.
p
Exercise 1.2.47. Let $u \in W^{1,p}(0, 1)$, $1 \le p < \infty$. Prove that functions
\[ u^{+}(x) := \max\{u(x), 0\}, \qquad u^{-}(x) := \max\{-u(x), 0\} \]
also belong to $W^{1,p}(0, 1)$. We remark that the corresponding result is false for $W^{k,p}(0, 1)$, $k \ge 2$.
Chapter 2
m n0 ,
x X.
Proposition 2.1.2. Let X be a Banach space and $A \in \mathcal{L}(X)$. If $\|A\| < 1$, then the operator $I - A$ is continuously invertible and
\[ (I - A)^{-1} = \sum_{n=0}^{\infty} A^n \]
k
An .
n=0
Then
l
l
l
An
An
An < 1
Sl Sk =
n=k+1
n=k+1
for
l>k
n=k+1
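$\mathcal{L}(X)$-norm. Denote $B := \lim_{k \to \infty} S_k = \sum_{n=0}^{\infty} A^n$. We have
\[
(I - A)B = \lim_{k \to \infty} (I - A)\sum_{n=0}^{k} A^n = \lim_{k \to \infty} \Big(\sum_{n=0}^{k} A^n - \sum_{n=1}^{k+1} A^n\Big) = \lim_{k \to \infty} (I - A^{k+1}) = I
\]
and $B(I - A) = I$, i.e., $B = (I - A)^{-1}$.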
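The series for $(I - A)^{-1}$ can be observed numerically in finite dimension. The sketch below is a hedged illustration, assuming a small real matrix A whose maximal absolute row sum is less than 1, so that the corresponding operator norm is less than 1; the helper names `matmul`, `matadd`, `identity` and `neumann_partial_sum` are ours.

```python
def matmul(P, Q):
    """Product of two square matrices stored as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(P, Q):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(rp, rq)] for rp, rq in zip(P, Q)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_partial_sum(A, k):
    """Partial sum S_k = I + A + A^2 + ... + A^k."""
    S = identity(len(A))
    power = identity(len(A))
    for _ in range(k):
        power = matmul(power, A)
        S = matadd(S, power)
    return S

A = [[0.2, 0.1],
     [0.0, 0.3]]
S = neumann_partial_sum(A, 60)

# (I - A) S_k equals I - A^(k+1), which is close to I for large k.
I_minus_A = [[1.0 - 0.2, -0.1],
             [0.0, 1.0 - 0.3]]
product = matmul(I_minus_A, S)
print(product)
```

The error of the partial sum is governed by $\|A\|^{k+1}$, so convergence is geometric and becomes slow when the norm of A is close to 1.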
(A),
2 The
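Corollary 2.1.3. Let X be a complex Banach space and $A \in \mathcal{L}(X)$. Then $\varrho(A)$ is an open set and
\[ \{\lambda \in \mathbb{C} : |\lambda| > \|A\|\} \subset \varrho(A). \]
Proof. If $|\lambda| > \|A\|$, then $\big\|\frac{A}{\lambda}\big\| < 1$ and
\[
\lambda I - A = \lambda\Big(I - \frac{A}{\lambda}\Big), \qquad (\lambda I - A)^{-1} = \sum_{n=0}^{\infty} \frac{A^n}{\lambda^{n+1}}.\ {}^{3}
\]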
The next theorem together with Theorems 2.1.8 and 2.1.13 is one of the most significant results in linear functional analysis. For the proofs the interested reader can consult textbooks on functional analysis, e.g., Conway [28], Dunford & Schwartz [44], Rudin [112], Yosida [135].
Theorem 2.1.4 (Uniform Boundedness Principle). Let X be a Banach space and Y a normed linear space. If $\{A_\gamma\}_{\gamma \in \Gamma} \subset \mathcal{L}(X, Y)$ is such that the sets $\{\|A_\gamma x\|_Y : \gamma \in \Gamma\}$ are bounded for all $x \in X$, then $\{\|A_\gamma\|_{\mathcal{L}(X,Y)} : \gamma \in \Gamma\}$ is also bounded.
This result is the quintessence of several results on approximation of functions in classical analysis and can be used for modern proofs of such results. The following example is typical.
Example 2.1.5. There exists a periodic continuous function the Fourier series of which is divergent at zero.4 To see this we recall that the nth partial sum of the Fourier series of a function f at 0 is given by
\[
s_n(f)(0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} D_n(0 - t) f(t)\,dt \quad \text{where} \quad D_n(t) = \frac{\sin\big(n + \frac{1}{2}\big)t}{\sin\frac{t}{2}}, \quad 0 < |t| < \pi
\]
(the nth Dirichlet kernel). Since $\varphi_n : f \mapsto s_n(f)(0)$ are continuous linear forms on the space $C[-\pi, \pi]$, the sequence of their norms $\{\|\varphi_n\|_{\mathcal{L}(C[-\pi,\pi], \mathbb{R})}\}_{n=1}^{\infty}$ should be
3 This series actually converges for $\lambda$ such that $|\lambda| > r(A) := \sup\{|\mu| : \mu \in \sigma(A)\}$ but its proof is more involved. The quantity r(A) is called the spectral radius of A.
4 Even divergent at uncountably many points but always of measure zero. The set of such "bad" functions is dense in $C[-\pi, \pi]$.
As indicated in the previous example, Theorem 2.1.4 is essentially an approximation result. This is clearer from its next variant.
Corollary 2.1.6 (Banach-Steinhaus). Let X and Y be Banach spaces and let $\{A_n\}_{n=1}^{\infty} \subset \mathcal{L}(X, Y)$. Then the limits $\lim_{n \to \infty} A_n x$ exist for every $x \in X$ if and only if the following conditions are satisfied:
(i) There is a dense set $M \subset X$ such that $\lim_{n \to \infty} A_n x$ exists for each $x \in M$.
x, y X.
f (t)eint dt
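the nth Fourier coefficient of $f \in L^1(-\pi, \pi)$. Since $\hat{f}(n) \to 0$ for $|n| \to \infty$ for all trigonometric polynomials which are dense in $L^1(-\pi, \pi)$, we have
\[ \hat{f}(n) \to 0 \quad \text{for all } f \in L^1(-\pi, \pi) \]
(the so-called Riemann-Lebesgue Lemma). In other words, $A : f \mapsto \hat{f}(\cdot)$ is a continuous linear operator from $L^1(-\pi, \pi)$ into
\[
c_0(\mathbb{Z}) := \Big\{\{a_n\}_{n \in \mathbb{Z}} : \lim_{|n| \to \infty} |a_n| = 0\Big\}, \qquad \|\{a_n\}\|_{c_0(\mathbb{Z})} = \sup_n |a_n|.
\]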
for all f L1 (, ).
Dk (n) =
and
Dk L1 (,) log k,
0, n > k,
g
a contradiction.
\[ x_n \to x, \quad Ax_n \to y \]
implies that
\[ x \in \operatorname{Dom} A \quad \text{and} \quad Ax = y. \]
Dom A = {x X : x(t)
N = Ker P.
This statement should be compared with the Hilbert space case (Corollary 1.2.35).
An important special case is $\operatorname{codim} M < \infty$. By definition, this means that an algebraic direct complement N has a finite dimension ($\operatorname{codim} M := \dim N$) and therefore N is closed (Corollary 1.2.11(i)). If M is closed as well, then any projection onto M is continuous. We postpone the case of $\dim M < \infty$ to Remark 2.1.19.
We note that if X is a Banach space such that there exists a continuous projection P, $\|P\|_{\mathcal{L}(X)} \le 1$, onto every closed subspace of X, then X has an equivalent norm induced by the scalar product on X (see Kakutani [71]).
Now we turn our attention to the dual space $X^*$ of all continuous linear forms on a normed linear space X. In Section 1.1 we have seen the importance of linear forms. Namely, they allowed us to define an algebraic adjoint operator $A^\#$ and formulate Theorem 1.1.25. The dual space $X^*$ is even more important for a normed linear space X since another topology can be introduced on X with help of $X^*$ which in a certain sense has better properties (Theorem 2.1.25 below).
Surprisingly, the following basic result does not need any topology.
Theorem 2.1.13 (Hahn-Banach). Let X be a real linear space and let Y be a linear subspace of X. Assume that f is a linear form on Y which is dominated by a sublinear functional p.6 Then there exists $F \in X^\#$ such that
(i) $F(y) = f(y)$ for all $y \in Y$ (extension);
(ii) $F(x) \le p(x)$ for all $x \in X$ (dominance).
Proof. The proof is based on an extension of f to a subspace whose dimension is larger by 1 and such that this extension is dominated by the same p, and the use of Zorn's Lemma as an inductive argument, similarly as in the proof of Theorem 1.1.3.
Remark 2.1.14. If X is a complex linear space, then we need p to satisfy a stronger condition than (2) in footnote 6, namely
(2') $p(\alpha x) = |\alpha|\,p(x)$, $\alpha \in \mathbb{C}$, $x \in X$.
In this case p is called a seminorm.7 The dominance also has to be stronger:
\[ |f(x)| \le p(x). \]
The extension result follows from Theorem 2.1.13 by considering $\operatorname{Re} f$ and $\operatorname{Im} f$ and observing that $\operatorname{Re} f(ix) = -\operatorname{Im} f(x)$.
Corollary 2.1.15. Let X be a normed linear space and let Y be a linear subspace of X (not necessarily closed). If $f \in Y^*$, then there exists $F \in X^*$ such that
(i) $F(y) = f(y)$ for $y \in Y$;
(ii) $\|F\|_{X^*} = \|f\|_{Y^*}$.
Proof. Put $p(x) = \|f\|\,\|x\|$, $x \in X$, and apply Theorem 2.1.13 or Remark 2.1.14, respectively.
Corollary 2.1.16 (Dual Characterization of the Norm). Let X be a normed linear space. Then
\[ \|x\|_X = \max\{|f(x)| : f \in X^* \text{ with } \|f\|_{X^*} \le 1\}. \tag{2.1.1} \]
6
7 The difference between a norm and a seminorm is that a seminorm need not satisfy the condition: $p(x) = 0 \implies x = o$.
i.e.,
Remark 2.1.17.
(i) If $X$ is a Hilbert space, then the equality (2.1.1) can be obtained immediately
from the Riesz Representation Theorem (Theorem 1.2.40). This theorem can
often be used in Hilbert spaces instead of the Hahn-Banach Theorem.
(ii) A slightly weaker form of (2.1.1) is often used:
If $f(x) = 0$ for all $f \in X^*$, then $x = o$.
The equivalent assertion reads as follows:
$X^*$ separates points of $X$.
Corollary 2.1.18 (Separation Theorem). Let $X$ be a normed linear space and let
$C$ be a nonempty, closed, convex set. If $x_0 \notin C$, then there exists $F \in X^*$ such
that
$$\sup\{\operatorname{Re} F(x) : x \in C\} < \operatorname{Re} F(x_0). \tag{2.1.2}$$
Proof. It is sufficient to give the proof for a real space $X$ and under the additional
assumption $o \in C$. In particular, this assumption means that $x_0 \ne o$. We wish to
extend the form $f$ defined on $\operatorname{Lin}\{x_0\}$ by $f(\alpha x_0) = \alpha$, $\alpha \in \mathbb{R}$. To do that we need
a suitable dominating functional. Since $d := \operatorname{dist}(x_0, C) > 0$, there exists a convex
neighborhood of $C$ which does not contain $x_0$, e.g.,
$$K = \left\{ x + y : x \in C,\ \|y\| < \frac{d}{2} \right\}.$$
Put
$$p_K(z) := \inf\left\{ \alpha > 0 : \frac{z}{\alpha} \in K \right\} \quad \text{for } z \in X.^8$$
2
,
d
i.e.,
for
y <
F X .
d
.
2
$$F(x) \le 1 - \sup\left\{ F(y) : \|y\| < \frac{d}{2} \right\} < 1 = F(x_0) \quad \text{for } x \in C.$$
d
,
2
n
Fk (x)yk
k=1
$+\ \frac{1}{p'} = 1$) in the following sense. For any $F \in [L^p(\Omega)]^*$ there exists a unique
$\varphi \in L^{p'}(\Omega)$ such that
$$F(f) = \int_\Omega f(x)\varphi(x)\,dx$$
(A),
which is an $X$-valued function for every $x \in X$. Then for any $F \in X^*$, the complex
function $\varphi(\lambda) = F[(\lambda I - A)^{-1}x]$ is holomorphic in $\rho(A)$. For $|\lambda| > \|A\|$ we also
have
$$|\varphi(\lambda)| \le \|F\|_{X^*}\,\|(\lambda I - A)^{-1}\|_{\mathcal{L}(X)}\,\|x\|_X = \|F\|\,\|x\| \left\| \sum_{n=0}^{\infty} \frac{A^n}{\lambda^{n+1}} \right\| \le \|F\|\,\|x\| \sum_{n=0}^{\infty} \frac{\|A\|^n}{|\lambda|^{n+1}},$$
and so
$$\lim_{|\lambda| \to \infty} \varphi(\lambda) = 0.$$
If $\rho(A) = \mathbb{C}$, $\varphi$ would be identically zero (by the Liouville Theorem from the
complex functions theory). Since this should be true for all $F \in X^*$, we get
$(\lambda I - A)^{-1}x = o$ for all $x \in X$, a contradiction. Therefore, the spectrum $\sigma(A)$
is nonempty for each $A \in \mathcal{L}(X)$. This is a generalization of the existence of an
eigenvalue of a linear operator in a finite dimensional space and therefore also a
generalization of the Fundamental Theorem of Algebra (cf. page 15). It is worth
mentioning that the Jordan Canonical Form (Theorem 1.1.34) is based on this
result.
for every f X .
Proposition 2.1.22.
(i) (uniqueness) If $x_n \rightharpoonup x$ and $x_n \rightharpoonup y$, then $x = y$.
(ii) If $\lim_{n \to \infty} \|x_n - x\| = 0$, then $x_n \rightharpoonup x$.$^9$
is reflexive, see Yosida [135, Chapter V, §2]. Hilbert spaces, $L^p(\Omega)$-spaces and $W^{1,p}(\Omega)$-spaces
($1 < p < \infty$) are uniformly convex (for a Hilbert space this follows from the parallelogram
identity (1.2.14), for the other two cases see, e.g., Adams [2, Corollary 2.29 and Theorem 3.5]).
and
f (x) = x
1
1
f (xn )
f (x) = f (y) for any f X ,
xn
x
i.e., yn y.
a contradiction.
Remark 2.1.23. The weak convergence is the convergence in the weak topology. It
is convenient to define this topology by systems of neighborhoods of points. We say
that $U \subset X$ is a weak neighborhood of a point $x \in X$ if there are $f_1, \dots, f_n \in X^*$
such that
$$\{y \in X : |f_i(y) - f_i(x)| < 1 \text{ for } i = 1, \dots, n\} \subset U.$$
A subset $G \subset X$ is weakly open (i.e., open in the weak topology) provided it is
a weak neighborhood of each of its points. It is easy to see that a weakly open
set is also open in the norm topology. The converse is generally true only in finite
dimensional spaces.
11 It
(1)
(x1 , xn )
(k)
n=1
(1)
{xn }
n=1 such
n=1
for all x Y.
for
x X.
for x X.
Moreover,
lim (x, yk ) = lim (P x, yk ) = f (P x) = (x, y)
for all x X.
Remark 2.1.26. Weak convergence in a dual space X is more confusing since two
lim F (fn ) = F (f )
for every F X ;
for every x X.
Criteria for weak convergence in $L^p$ spaces can be found, e.g., in Dunford &
Schwartz [44, Chapter IV, §8].
The weak convergence in $X^*$ has obviously the same properties as that in
$X$. Because of the continuous embedding $\kappa : X \to X^{**}$ (see the proof of Proposition 2.1.22(iii)) the $w$-convergence implies the $w^*$-convergence. The converse is
true if $X$ is a reflexive space, i.e.,
$$\kappa(X) = X^{**}.$$
Since the $w^*$-topology is generally weaker than the $w$-topology there can exist
more $w^*$-compact sets than the $w$-compact ones. In fact, the following result (the
Alaoglu-Bourbaki Theorem, see Conway [28], Dunford & Schwartz [44], Fabian et
al. [49]) holds:
If $X$ is a normed linear space, then any closed ball in $X^*$ is $w^*$-compact. If, moreover, $X$ is separable, then the ball is also sequentially
$w^*$-compact.
For example, this theorem can be applied to balls in $L^p(\Omega)$, $1 < p \le \infty$.
In the rest of this section we will examine adjoint operators. Suppose that $X$
and $Y$ are normed linear spaces and $A \in \mathcal{L}(X, Y)$. If $g \in Y^*$, then
$$A^* g := g \circ A \in X^*.$$
The operator $A^* : Y^* \to X^*$ is obviously linear, and it is also continuous since
$$|A^* g(x)| = |g(Ax)| \le \|g\|_{Y^*}\|Ax\|_Y \le \|g\|_{Y^*}\|A\|_{\mathcal{L}(X,Y)}\|x\|_X.$$
If $H_1$, $H_2$ are Hilbert spaces and $A \in \mathcal{L}(H_1, H_2)$ we have another approach
to the definition of an adjoint operator, namely the one based on the Riesz Representation Theorem: For $y \in H_2$ the mapping $f : x \mapsto (Ax, y)_{H_2}$ is a continuous
linear form on $H_1$, and hence there is $z \in H_1$ for which $f(x) = (x, z)_{H_1}$. This $z$ is
uniquely determined by $y$, and we denote for a moment $z = A^+ y$, i.e.,
$$(Ax, y)_{H_2} = (x, A^+ y)_{H_1}.$$
for all x, y H.
Notice that the statement (iv) is not a sufficient condition for solvability of
the equation
Ax = y
since only the closure of Im A is characterized. There are
many operators the range
t
in C[0, 1] or in L (0, 1). It is not an easy task to decide whether an operator has
a closed range or not. The following statement is useful in applications.
If $X$, $Y$ are Banach spaces and $A \in \mathcal{L}(X, Y)$ is injective, then $\operatorname{Im} A$ is
closed if and only if there is a positive constant $c$ such that
$$\|Ax\| \ge c\|x\| \quad \text{for all } x \in X.$$
Sufficiency is easy, the necessity part follows from the Open Mapping Theorem.
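There is an important subclass of operators with a closed range, namely the
so-called Fredholm operators. An operator $A \in \mathcal{L}(X)$ is said to be Fredholm if
$$\dim \operatorname{Ker} A < \infty, \qquad \operatorname{Im} A \text{ is closed}, \qquad \text{and} \qquad \operatorname{codim} \operatorname{Im} A < \infty$$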
Ax(t) =
(2.1.3)
for 1 p .12
is a linear bounded operator from $L^p(\Omega)$ into $L^p(\Omega)$
12 Conditions on the kernel $k$ which guarantee that $A \in \mathcal{L}(L^p(\Omega), L^r(\Omega))$ can be found in Dunford
& Schwartz [44, Chapter VI, §11A].
To prove this assertion we have to show that $Ax(t)$ exists for a.a. $t \in \Omega$, is
measurable on $\Omega$ and belongs to $L^p(\Omega)$. For $1 \le p < \infty$, by the Hölder inequality,$^{13}$
we get for $\frac{1}{p'} = 1 - \frac{1}{p}$:
Ax(t)
1
p
1
p
1
p
Set
k(t, s)x(s) ds
p
k(t, s)x(s) ds
(t)
p
p1
p1
.
Since the measurable function (t, s) k(t, s)x(s)p can be approximated by step
bounded), the function t (t) is measurable on
functions (consider rst
The Fubini Theorem yields
.
p
p
(t) dt =
k(t, s)x(s) ds dt
k(t, s) dt x(s)p ds c2 xpLp () .
=
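The Fubini Theorem also yields (we identify $g \in [L^p(\Omega)]^*$, $1 \le p < \infty$, with a
function from $L^{p'}(\Omega)$, see Example 2.1.20(iii))
$$\int_\Omega Ax(t)\,g(t)\,dt = \int_\Omega \left( \int_\Omega k(t, s)x(s)\,ds \right) g(t)\,dt = \int_\Omega \left( \int_\Omega k(t, s)g(t)\,dt \right) x(s)\,ds = \int_\Omega (A^* g)(s)\,x(s)\,ds,$$
i.e.,
$$A^* g : s \mapsto \int_\Omega k(t, s)g(t)\,dt, \qquad g \in L^{p'}(\Omega).$$
We note that the adjoint operator to $A$ for $p = 2$ in the sense of the Riesz
Representation Theorem is of the form
$$A^+ g(s) = \int_\Omega \overline{k(t, s)}\,g(t)\,dt.$$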
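The duality relation between the integral operator and its adjoint can be checked numerically. The following is only a minimal sketch under stated assumptions: $\Omega = (0, 1)$, a real kernel `k` and test functions chosen purely for illustration, and a midpoint quadrature with `N` cells; none of these choices comes from the text. It compares $\int (Ax)\,g$ with $\int x\,(A^* g)$, where the adjoint integrates the kernel over its first variable.

```python
import math

N = 200                                   # number of quadrature cells (assumption)
h = 1.0 / N
nodes = [(i + 0.5) * h for i in range(N)]  # midpoints of the cells in (0, 1)

def k(t, s):
    # illustrative non-symmetric real kernel (hypothetical choice)
    return math.exp(-t) * (1.0 + s * s) + t * s

def apply_A(x_vals):
    # (Ax)(t_i) ~ h * sum_j k(t_i, s_j) x(s_j)
    return [h * sum(k(t, s) * xs for s, xs in zip(nodes, x_vals)) for t in nodes]

def apply_A_star(g_vals):
    # (A*g)(s_j) ~ h * sum_i k(t_i, s_j) g(t_i): integration over the first variable
    return [h * sum(k(t, s) * gt for t, gt in zip(nodes, g_vals)) for s in nodes]

def pairing(u_vals, v_vals):
    # discrete version of the duality pairing: integral of u * v over (0, 1)
    return h * sum(u * v for u, v in zip(u_vals, v_vals))

x_vals = [math.sin(3.0 * s) for s in nodes]
g_vals = [math.cos(2.0 * t) + t for t in nodes]

lhs = pairing(apply_A(x_vals), g_vals)       # integral of (Ax)(t) g(t) dt
rhs = pairing(x_vals, apply_A_star(g_vals))  # integral of x(s) (A*g)(s) ds
```

On the discrete level both numbers are the same double sum taken in a different order, which mirrors the use of the Fubini Theorem above; they agree up to rounding.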
Dom (A ) = D
This relation can be considered in various function spaces and also with different
domains. If we are interested in its adjoint we should have a good representation
of the dual space. This leads to an observation that spaces of integrable functions
would be more convenient than spaces of continuous functions. Therefore let $X =
L^p(0, 1)$, $1 \le p < \infty$ and $\operatorname{Dom} A = C^1[0, 1]$. Consider $A : \operatorname{Dom} A \subset X \to X$. We
wish to compute $A^*$. Assume $g \in \operatorname{Dom}(A^*) \subset L^{p'}(0, 1)$ and $A^* g = f$, i.e.,
$$g(Ax) = \int_0^1 \dot{x}(t)g(t)\,dt = \int_0^1 x(t)f(t)\,dt = A^* g(x) \quad \text{for all } x \in \operatorname{Dom} A.$$
In particular, for
x V = {x Dom A : x(1) = 0}
and
F (t) =
f (s) ds,
0
x(t)F
(t) dt =
x(t)F
(t) dt.
Since the restriction $A_V$ of $A$ to $V$ has a dense range in $L^p(0, 1)$ ($\operatorname{Im} A_V = C[0, 1]$),
we have $F + g = o$ in $L^{p'}(0, 1)$. This means that $g$ can be changed on a set of
measure zero to have $g$ absolutely continuous and
$$-\dot{g} = f \in L^{p'}(0, 1),$$
i.e.,
14 If you are not familiar with integration by parts for the Lebesgue integral (notice that $f \in
L^{p'}(0, 1) \subset L^1(0, 1)$), you can approximate $f$ by a continuous function to get a standard situation
for integration by parts.
f (t) dt = 0
0
and
Ker A = (Im A ) .
with
If the equation
$$Ax = \lambda x$$
15 The last equality should be proved. A deeper insight into these Sobolev spaces will be given
in Chapter 7, cf. also Exercise 1.2.46.
16 Notice that A is an extension of A and, moreover, the graph of A is the closure of the
graph of A (it is also said that A is the closure of A).
has a nonzero solution $w$ ($\in \operatorname{Dom} A$), then $\lambda$ is called an eigenvalue and $w$ a corresponding eigenfunction of $A$. Simple calculation shows that $\frac{k^2\pi^2}{(b-a)^2}$ are all eigenvalues of $A$,$^{17}$ and $\sin\frac{k\pi}{b-a}(t - a)$ are the corresponding eigenfunctions. Consider
now the boundary value problem
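Let $\varphi_1$, $\varphi_2$ be a fundamental system for the differential equation
$$-\ddot{x} - \lambda x = 0.$$
The Variation of Constants Formula shows that
$$x(t) = c_1\varphi_1(t) + c_2\varphi_2(t) + \int_a^t \frac{\varphi_1(s)\varphi_2(t) - \varphi_1(t)\varphi_2(s)}{W(s)}\,f(s)\,ds \tag{2.1.5}$$
is a solution to $-\ddot{x} - \lambda x = f$. Here $W$ is the Wronski determinant of $\varphi_1$, $\varphi_2$ (notice
that for this equation we always can choose $\varphi_1$, $\varphi_2$ such that $W \equiv 1$). We wish to
find constants $c_1$, $c_2$ such that $x$ given by (2.1.5) satisfies the boundary conditions
$x(a) = x(b) = 0$. The number $\lambda$ is not an eigenvalue if and only if
$$\det \begin{pmatrix} \varphi_1(a) & \varphi_2(a) \\ \varphi_1(b) & \varphi_2(b) \end{pmatrix} \ne 0.$$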
In this case the formula (2.1.5) shows that for any $f \in C[a, b]$ the problem (2.1.4)
has a unique solution in $\operatorname{Dom} A$ $^{18}$ which is called a classical solution. This means
that $\lambda \in \rho(A)$. Suppose now that $\lambda$ is an eigenvalue. Then we can take $\varphi_1$ as a
corresponding eigenfunction and get
$$x(a) = c_2\varphi_2(a), \quad \text{i.e.,} \quad c_2 = 0$$
(2.1.6)
since $\varphi_2(b) \ne 0$ (by the same argument as above). Notice that (2.1.6) is also a
necessary condition for solvability of (2.1.4).
We will return to this example in the next section (see Example 2.2.17).
Example 2.1.32. Linear differential operators of the second order with nonconstant coefficients are more complicated. To simplify our exposition we consider a
differential expression
$$Lx := -p_0\ddot{x} + p_1\dot{x} + p_2 x$$
17 The minus sign in the definition of $A$ is conventional; it is introduced to obtain positive
eigenvalues.
18 If $f \in L^p(a, b)$, then it is possible to show that the function $x = x(t)$ given by (2.1.5) belongs to
$W^{2,2}(a, b)$, $x(a) = x(b) = 0$, and the equation in (2.1.4) is satisfied a.e. in $(a, b)$. Such a solution
is called a strong solution.
Ax = Lx,
and consider
$$A : \operatorname{Dom} A \subset X \to X.$$
A solution of $Ax = f$ is therefore a strong solution of
$$Lx(t) = f(t), \quad t \in (a, b), \qquad x(a) = x(b) = 0.$$
It can be proved that $A$ is injective provided $p_2 > 0$ in $[a, b]$. (Assume by contradiction that $\operatorname{Ker} A \ne \{o\}$ and show that there is $x_0 \in \operatorname{Ker} A$ which has a negative
minimum at an interior point $c \in (a, b)$. Deduce that $Lx_0(c) < 0$.) The Variation
of Constants Formula shows that the operator $A$ is also surjective and $A^{-1}$ is an
integral operator
$$A^{-1}f(t) = \int_a^b G(t, s)f(s)\,ds \tag{2.1.7}$$
for
y D = Dom B.
The same integration as above shows that $B \subset A^*$. The proof of the equality
$A^* = B$ needs a more careful calculation.
The interested reader can consult the books Coddington & Levinson [27,
Chapter 9], Edmunds & Evans [46] or Dunford & Schwartz [45], in particular
Chapter XIII, for details and also for more complicated singular cases which are
important in applications, e.g., in Quantum Mechanics (the Schrödinger equation).
1
,
A1
A1
,
1 A1 B A
B 1 A1
A1 2
B A.
1 A1 B A
n n
t A
n!
n=0
= Ax(t)
and satises the initial condition (0) = x0 . (See also the end of Section 1.1, in
particular Exercise 1.1.41.)
Exercise 2.1.35. Let $K$ be a continuous real function on $[a, b] \times [a, b]$ and let
$h \in C[a, b]$ be fixed. Let
$$M = \max_{(t, \tau) \in [a,b] \times [a,b]} |K(t, \tau)|$$
1
.
M (b a)
Exercise 2.1.36. Let $\{x_n\}_{n=1}^\infty$, $\{y_n\}_{n=1}^\infty$ be sequences in a Hilbert space $H$ such
that $x_n \to x$, $y_n \rightharpoonup y$. Then
$$(x_n, y_n) \to (x, y).$$
Hint. Use Proposition 2.1.22(iii).
Exercise 2.1.37. Let $\{e_n\}_{n=1}^\infty$ be an orthonormal sequence in a Hilbert space. Show
that
$$e_n \rightharpoonup o.$$
Hint. Use the Bessel inequality (1.2.17).
Exercise 2.1.38. Prove assertion (iv) of Proposition 2.1.22 for a Hilbert space X.
Hint. Use the relation between the scalar product and the norm in X.
Exercise 2.1.39. Show that a convex set (in particular a subspace) of a normed
linear space is weakly closed if and only if it is closed in the norm topology.
Hint. Suppose by contradiction that $C$ is a norm-closed convex set which is not
weakly closed. Then there is $x_0 \in \overline{C}^{\,w} \setminus C$. Use the Separation Theorem (Corollary 2.1.18) to obtain a contradiction.
Exercise 2.1.40. Prove that actually
$$\|A^*\|_{\mathcal{L}(Y^*, X^*)} = \|A\|_{\mathcal{L}(X, Y)}.$$
Hint. The inequality $\|A^*\| \le \|A\|$ follows from the calculation after Remark 2.1.26.
For the converse inequality use the dual characterization of the norm $\|Ax\|$.
Example 2.2.3.
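(i) If $A \in \mathcal{L}(X, Y)$ and $\dim \operatorname{Im} A < \infty$ (the so-called operator of finite rank),
then $A \in \mathcal{C}(X, Y)$.
(iv) Assume that $Y$ is a Banach space and a sequence $\{A_n\}_{n=1}^\infty \subset \mathcal{C}(X, Y)$ converges to $A \in \mathcal{L}(X, Y)$ in the norm operator topology. Then $A \in \mathcal{C}(X, Y)$.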
Proof. The assertions (i) and (ii) are obvious.
To prove (iii) assume by contradiction that there is a subsequence $\{x_{n_k}\}_{k=1}^\infty$
such that
$$\|Ax_{n_k} - Ax\| \ge c > 0.$$
The sequence $\{x_n\}_{n=1}^\infty$ is bounded (Proposition 2.1.22(iii)), and hence there exist
a subsequence $\{x_{n_{k_l}}\}_{l=1}^\infty$ and $y \in Y$ such that
$$\|Ax_{n_{k_l}} - y\| \to 0.$$
Since
$$f(Ax_n) = A^* f(x_n) \to A^* f(x) = f(Ax) \quad \text{for every } f \in Y^*,$$
we have $y = Ax$, and hence a contradiction.
(iv) Let $B(o; 1)$ be the unit ball. By Proposition 1.2.3 it suffices to show
that for any $\varepsilon > 0$ there is a finite $\varepsilon$-net of $A(B(o; 1))$. We choose $n$ such that
$\|A_n - A\| < \frac{\varepsilon}{2}$, and a finite $\frac{\varepsilon}{2}$-net for $A_n(B(o; 1))$. By the triangle inequality, this
is the desired $\varepsilon$-net for $A(B(o; 1))$.
Example 2.2.5.
(i) Let $k$ be a continuous function on the Cartesian product $[a, b] \times [a, b]$. Then
the operator
$$Ax : t \in [a, b] \mapsto \int_a^b k(t, s)x(s)\,ds$$
is true under more general assumptions, e.g., if the interval [a, b] is replaced
by a compact
$$\sum_{k,n=1}^{\infty} |(Be_k, f_n)|^2 = \sum_{k=1}^{\infty} \|Be_k\|^2 = \sum_{n=1}^{\infty} \|B^* f_n\|^2.$$
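This shows that the quantity $n(B)$ depends only on $B$ and not on the particular choice of bases. Moreover, if $n(B) < \infty$, then $B \in \mathcal{C}(H)$. To see this
take $n_\varepsilon \in \mathbb{N}$ such that $\sum_{n=n_\varepsilon+1}^{\infty} \|B^* f_n\|^2 < \varepsilon$ and define
$$B_\varepsilon x = \sum_{n=1}^{n_\varepsilon} (Bx, f_n)f_n.$$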
n=n +1
B fn 2 x2 .
n=n +1
Now we give the second proof. Let $\{x_n\}_{n=1}^\infty$ be a bounded set in $L^2(\Omega)$. Since
$L^2(\Omega)$ as a Hilbert space is reflexive, there is a subsequence, denote it again
xn xL2 ()
k(t, s)2 ds
12
c
k(t, s)2 ds
12
,
n
(Ax, ek )ek .
k=1
k=n+1
Remark 2.2.7. The proof of the preceding proposition indicates that the result
holds also in a Banach space $X$ with a Schauder basis $\{e_n\}_{n=1}^\infty$ (see page 40).
The famous conjecture of S. Banach was that any separable Banach space has a
Schauder basis. The first counterexample was constructed by P. Enflo. He found
a compact operator in a separable Banach space which cannot be approximated
by operators of finite rank. We notice that separable Banach spaces of functions
like $C(\overline{\Omega})$, $L^p(\Omega)$, $W^{k,p}(\Omega)$ ($1 \le p < \infty$) have a Schauder basis.
One of our goals in this section is to generalize the Fredholm alternative (see
footnote 6 on page 14). As we have seen in Section 1.1 the notion of the adjoint
operator is very important.
Proposition 2.2.8 (Schauder). Let $X$, $Y$ be Banach spaces and assume that $A \in
\mathcal{L}(X, Y)$. Then $A$ is compact if and only if $A^*$ is compact.
Proof.
Step 1 (the "only if" part). Suppose that $A \in \mathcal{C}(X, Y)$ and $\{g_n\}_{n=1}^\infty \subset Y^*$,
$\|g_n\|_{Y^*} \le 1$. It is easy to verify the assumptions of the Arzelà-Ascoli Theorem
(Theorem 1.2.13) for the sequence of functions
$$g_n : K := \overline{A(B(o; 1))} \to \mathbb{R} \quad (\text{or } \mathbb{C})$$
($B(o; 1)$ is the unit ball in $X$). By this theorem there is a subsequence $\{g_{n_k}\}_{k=1}^\infty$
which is uniformly convergent on $K$. Since
$$|A^* g_{n_k}(x) - A^* g_{n_l}(x)| \le \sup_{y \in K} |g_{n_k}(y) - g_{n_l}(y)|$$
for x X
(2.2.1)
is scarcely ever well posed$^{20}$ as follows from the first part of the next theorem.
This is the reason why we are interested rather in equations of the type
$$x - Ax = y. \tag{2.2.2}$$
if and only if
Ker (I A) = {o};
and
Y Ker T.
i.e.,
X =Y Z
20 An
for each z Z,
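see page 70. Suppose by contradiction that such $c$ does not exist, i.e., there are
$z_n \in Z$ such that
$$\|z_n\| = 1 \quad \text{and} \quad \|Tz_n\| < \frac{1}{n}\|z_n\|.$$
Then one can find a subsequence $\{z_{n_k}\}_{k=1}^\infty$ for which $Az_{n_k}$ converges to a $y$. Since
$Tz_{n_k} \to o$, we have $\lim_{k \to \infty} z_{n_k} = y \in Z$. This means that
$$Ty = o, \quad \text{i.e.,} \quad y \in Y \cap Z, \quad \text{and thus} \quad y = o.$$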
1
.
2
is clear for $X$ being a Hilbert space, since $\operatorname{Im} T$ is closed and the orthogonal complement
$(\operatorname{Im} T)^\perp$ is equal to $\operatorname{Ker} T^*$ (Proposition 2.1.27(iv)). In a general Banach space we can use the
factor space $X|_{\operatorname{Im} T}$ which is algebraically isomorphic to a direct complement $W$ of $\operatorname{Im} T$ and
for $g \in (X|_{\operatorname{Im} T})^*$
put $f(x) = g([x])$. It remains to show that the correspondence $g \mapsto f$ is an
(isometric) isomorphism onto $(\operatorname{Im} T)^\perp = \operatorname{Ker} T^*$.
Denote
$$\dim \operatorname{Ker} T = n \quad \text{and} \quad \dim \operatorname{Ker} T^* = n^*.$$
i.e.,
Ker (I B) = {o}.
and
(I B)(Y ) = (Y ) = W,
i.e.,
Im (I B) = Im T + W = X,
a contradiction. This proves the inequality n n .
By interchanging T and T we similarly obtain n n.22
Remark 2.2.10. The proof of the following statement is similar to that of Lemma
1.1.31(i).
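If $A \in \mathcal{C}(X)$ and $1 \in \sigma(A)$, then there is $k \in \mathbb{N}$ such that
$$X = \operatorname{Ker}(I - A)^k \oplus \operatorname{Im}(I - A)^k.$$
Moreover, both the spaces on the right-hand side are $A$-invariant, and
$\dim \operatorname{Ker}(I - A)^k < \infty$.$^{23}$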
Remark 2.2.11. Theorem 2.2.9 can be generalized to operators $A \in \mathcal{L}(X)$ for
which there is $k \in \mathbb{N}$ such that $A^k \in \mathcal{C}(X)$.
Another way of generalization is connected with perturbations of Fredholm
operators. Notice that the statement (v) of Theorem 2.2.9 says that $I - A$ is a
Fredholm operator of index zero provided $A \in \mathcal{C}(X)$. The following theorem states
the stability of index.
Theorem 2.2.12. Let $X$, $Y$ be Banach spaces and let $A \in \mathcal{L}(X, Y)$ be a Fredholm
operator. Then
(i) if $B \in \mathcal{C}(X, Y)$, then $A + B$ is Fredholm and
$$\operatorname{ind} A = \operatorname{ind}(A + B); \tag{2.2.3}$$
23 This
Proof. The proofs and further results can be found, e.g., in Kato [73, §IV.5].
Corollary 2.2.13. Let $X$ be a complex Banach space and let $A \in \mathcal{C}(X)$. Then
(i) $\sigma(A) \setminus \{0\}$ is a countable set of eigenvalues of finite multiplicity;
(ii) if $\dim X = \infty$, then $0 \in \sigma(A)$, and if $\lambda$ is an accumulation point of $\sigma(A)$,
then $\lambda = 0$.
Proof. (i) If $\lambda \ne 0$, then $\lambda I - A = \lambda\left(I - \frac{A}{\lambda}\right)$
and Theorem 2.2.9 can be applied. In
particular, if such $\lambda$ belongs to $\sigma(A)$, then $\lambda$ is an eigenvalue of finite multiplicity.
It remains to show that for any $r > 0$ the set
$\{\lambda \in \sigma(A) : |\lambda| > r\}$ is finite.
Assume by way of contradiction that there is a sequence of mutually different
and
dist(yn+1 , Wn )
1
.
2
r
2
(2.2.4)
x(s) ds
on the space
L2 (0, 1).
This is a special class of operators which have been examined in Example 2.2.5(ii):
$$k(t, s) = \begin{cases} 1 & \text{for } 0 \le s \le t \le 1, \\ 0 & \text{for } 0 \le t < s \le 1. \end{cases}$$
Therefore $A \in \mathcal{C}(L^2(0, 1))$.
1
x,
x(0) = 0.
This implies that $x = o$ in $[0, 1]$. Since $\sigma(A)$ cannot be empty, $\sigma(A) = \{0\}$, and $0$
is no eigenvalue of $A$.
We notice that the same statement (with a more complicated proof) is valid
for any Volterra integral operator
$$Ax(t) = \int_0^t k(t - s)x(s)\,ds, \qquad x \in L^2(0, 1),$$
4
t
t
2
t2
Hence
t=
y
x
12
and
i.e.,
2
1
tx y = 0.
t
Ax2 xy
follows.
(the last equality follows from (i)). Let $\{x_n\}_{n=1}^\infty$ be a sequence such that
$$\|x_n\| = 1 \quad \text{and} \quad \lim_{n \to \infty} (Ax_n, x_n) = M.$$
Then
$$\limsup_{n \to \infty} \|Ax_n - Mx_n\|^2 = \limsup_{n \to \infty}\,[(Ax_n, Ax_n) - 2M(Ax_n, x_n) + M^2]$$
for every
x H.
and
I A = I A
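$$Ae_n = \lambda_n e_n \quad \text{and} \quad x = \sum_{n=1}^{\infty} (x, e_n)e_n,$$
then
$$Ax = \sum_{n=1}^{\infty} \lambda_n (x, e_n)e_n.$$
Proof. Let $\{\lambda_n\}_{n=1}^\infty$ be the sequence of all nonzero and pairwise distinct eigenvalues
of $A$. Choose an orthonormal basis $e_1^{(k)}, \dots, e_{n_k}^{(k)}$ of
$$N_k := \operatorname{Ker}(\lambda_k I - A).$$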
Remember that Nk Nk+1 (Proposition 2.2.15(v)). Let us align the collection
(k)
{e1 , . . . , e(k)
nk }
k
B=O
exists provided p,
q C[a, b] and p, q > 0 on [a, b]. Moreover, A1 is an integral
operator
$$A^{-1}f(t) = \int_a^b G(t, s)f(s)\,ds$$
24 A
where $G$ is the Green function of the differential expression. From the construction
of $G$ it follows that $G \in C([a, b] \times [a, b])$, in particular, $G \in L^2(a, b)$, and $G$ is a
real symmetric function ($G(t, s) = G(s, t)$), see, e.g., Walter [131].
By Example 2.2.5(ii), $A^{-1}$ is a compact, self-adjoint$^{26}$ operator in the real
space $L^2(a, b)$ and Theorem 2.2.16 can be applied to obtain an orthonormal basis
is an eigen
$$\sum_{n=1}^{\infty} \frac{1}{\lambda_n^2} = n^2(A^{-1}) < \infty.$$
However, this result is far from being optimal. We remark here that a variational
approach to an eigenvalue problem for compact, self-adjoint operators will be
briefly described in Section 6.3.
Consider now the equation
$$Ax = \lambda x + f \tag{2.2.5}$$
n=1
(n )(x, en ) =
(f, en ),
for n N.
n=1
$$x = \sum_{n=1}^{\infty} \frac{(f, e_n)}{\lambda_n - \lambda}\,e_n$$
26 We restrict our attention to a special differential operator $A$ in contrast to the general operator
from Example 2.1.32 in order to get a self-adjoint inverse $A^{-1}$.
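The eigenfunction expansion of a solution of (2.2.5) can be tried out on a model case. The sketch below is an illustration only and assumes the simplest operator $Ax = -\ddot{x}$ on $(0, \pi)$ with $x(0) = x(\pi) = 0$, for which $e_n(t) = \sqrt{2/\pi}\,\sin nt$ and $\lambda_n = n^2$; the value `LAM`, the right-hand side `f`, the truncation `N_MODES` and the midpoint quadrature are all hypothetical choices, not taken from the text.

```python
import math

LAM = 0.5            # spectral parameter, chosen not to be an eigenvalue (assumption)
N_MODES = 20         # truncation of the series (assumption)
N_QUAD = 400         # midpoint quadrature cells on (0, pi)
H = math.pi / N_QUAD
NODES = [(i + 0.5) * H for i in range(N_QUAD)]

def e(n, t):
    # orthonormal eigenfunctions of -x'' with x(0) = x(pi) = 0, eigenvalue n^2
    return math.sqrt(2.0 / math.pi) * math.sin(n * t)

def f(t):
    # illustrative right-hand side
    return math.sin(t) + 0.5 * math.sin(3.0 * t)

def inner(u, v):
    # L^2(0, pi) scalar product by the midpoint rule
    return H * sum(u(t) * v(t) for t in NODES)

# Fourier coefficients of the solution: (f, e_n) / (lambda_n - lambda)
coeffs = [inner(f, lambda t, n=n: e(n, t)) / (n * n - LAM)
          for n in range(1, N_MODES + 1)]

def x(t):
    return sum(c * e(n, t) for n, c in enumerate(coeffs, start=1))

def x_exact(t):
    # closed-form solution of -x'' = LAM * x + f for this particular f
    return math.sin(t) / (1.0 - LAM) + 0.5 * math.sin(3.0 * t) / (9.0 - LAM)

max_err = max(abs(x(t) - x_exact(t)) for t in NODES)
```

The series is well defined only because `LAM` differs from every $\lambda_n$; if it coincided with some $\lambda_n$, the corresponding coefficient $(f, e_n)$ would have to vanish for a solution to exist.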
k=0
If Ax =
n (x, en )en , then it is easy to verify properties (i)(v) for
1
wx ,
n=1
(f )x
f (n )(x, en )en .
n=1
(A) = {0} {n }
n=1 , then f C((A)) if and only if lim f (n ) = f (0).
n
= x()
= 0, 1 x()+1 x()
= x()
(periodic conditions).
Find Green functions, eigenvalues and eigenfunctions. What follows from the
HilbertSchmidt Theorem? Compare this result with that of Example 1.2.38(i).
Exercise 2.2.22. Define $e^{tA}$ for the operator $A$ from Exercise 2.2.21 (see Remark 2.2.18). Take $x \in \operatorname{Dom} A$ and show that the function
$$u(t, \xi) := \left(e^{tA}x\right)(\xi), \qquad t \ge 0,$$
is a solution to the heat equation
$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial \xi^2}$$
satisfying the initial condition $u(0, \xi) = x(\xi)$ and the boundary conditions given by
$u(t, \cdot) \in \operatorname{Dom} A$. Do not forget to define the notion of a solution.
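Exercise 2.2.23. Let $A$ be as in Example 2.2.17. Prove that
$$\operatorname{Dom} A = \left\{ x = \sum_{n=1}^{\infty} (x, e_n)e_n : \sum_{n=1}^{\infty} \lambda_n^2\,|(x, e_n)|^2 < \infty \right\}$$
and
$$Ax = \sum_{n=1}^{\infty} \lambda_n (x, e_n)e_n.$$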
for
f : X X.
(2.3.1)
The basic assertions in this section are fixed point theorems for contractive and
nonexpansive mappings. If $X$ is a linear space, (2.3.1) is equivalent to the equation
$$F(x) := a - f(x) + x = x.$$
The solution of this equation is called a fixed point of $F$. In the case that
$$f(x) = x - Ax \qquad (F(x) = Ax + a)$$
where $A \in \mathcal{L}(X)$, we succeeded in solving this equation in Section 2.1 (cf. Proposition 2.1.2) by applying the iteration process
$$x_0 = a, \qquad x_n = a + Ax_{n-1}.$$
This idea can be easily generalized to the following result which is often attributed
to S. Banach.
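Theorem 2.3.1 (Contraction Principle). Let $M$ be a complete metric space and let
$F : M \to M$ be a contraction, i.e., there is $q \in [0, 1)$ such that
$$\varrho(F(x), F(y)) \le q\,\varrho(x, y) \quad \text{for every } x, y \in M.$$
Then there exists a unique fixed point $\tilde{x}$ of $F$ in $M$. Moreover, if $x_0 \in M$ and
$$x_n = F(x_{n-1}),$$
then the sequence $\{x_n\}_{n=1}^\infty$ converges to $\tilde{x}$ and the estimates
$$\varrho(x_n, \tilde{x}) \le \frac{q^n}{1 - q}\,\varrho(x_1, x_0) \quad \text{(a priori estimate)}, \tag{2.3.2}$$
$$\varrho(x_n, \tilde{x}) \le \frac{q}{1 - q}\,\varrho(x_n, x_{n-1}) \quad \text{(a posteriori estimate)} \tag{2.3.3}$$
hold.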
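Proof. We prove that $\{x_n\}_{n=1}^\infty$ is a Cauchy sequence. Indeed, for $m > n$ we have
$$\begin{aligned}
\varrho(x_m, x_n) &\le \varrho(x_m, x_{m-1}) + \cdots + \varrho(x_{n+1}, x_n) \\
&= \varrho(F(x_{m-1}), F(x_{m-2})) + \cdots + \varrho(F(x_n), F(x_{n-1})) \\
&\le q\,[\varrho(x_{m-1}, x_{m-2}) + \cdots + \varrho(x_n, x_{n-1})] \\
&\le \cdots \le (q^{m-1} + \cdots + q^n)\,\varrho(x_1, x_0) \le \frac{q^n}{1 - q}\,\varrho(x_1, x_0).
\end{aligned}$$
Since $q < 1$, the right-hand side is arbitrarily small for sufficiently large $n$. The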
i.e., $\varrho(\tilde{x}, y) = 0$ ($q < 1$).
The a posteriori estimate also follows from the above estimate of $\varrho(x_m, x_n)$.
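The iteration and both error estimates of the Contraction Principle are easy to observe numerically. The following is a small sketch with an illustrative choice that does not come from the text: $M = [0, 1]$ with the usual distance and $F(x) = \cos x$, which maps $[0, 1]$ into itself and is a contraction there with $q = \sin 1 < 1$.

```python
import math

def F(x):
    # cos maps [0, 1] into [cos 1, 1], a subset of [0, 1]
    return math.cos(x)

q = math.sin(1.0)            # Lipschitz constant of cos on [0, 1]
x0 = 0.0
x1 = F(x0)

# successive approximations x_n = F(x_{n-1})
xs = [x0, x1]
for _ in range(60):
    xs.append(F(xs[-1]))

# reference value of the fixed point: iterate much longer
x_fixed = xs[-1]
for _ in range(500):
    x_fixed = F(x_fixed)

n = 20
err = abs(xs[n] - x_fixed)                              # true error of x_n
a_priori = q ** n / (1.0 - q) * abs(x1 - x0)            # bound of type (2.3.2)
a_posteriori = q / (1.0 - q) * abs(xs[n] - xs[n - 1])   # bound of type (2.3.3)
```

In this example the a posteriori bound is much sharper than the a priori one, because the actual contraction rate near the fixed point is smaller than the global constant `q`.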
The fixed point of $F$, the existence of which has been just established, often depends on a parameter. The following result is useful in investigating this
dependence.
Corollary 2.3.2. Let $M$ be a complete metric space and $A$ a topological space.
Assume that $F : A \times M \to M$ possesses the following properties:
(i) There is $q \in [0, 1)$ such that
$$\varrho(F(a, x), F(a, y)) \le q\,\varrho(x, y) \quad \text{for all } a \in A \text{ and } x, y \in M.$$
1
(F (a, (a)), F (b, (a))),
1q
i = 1, 2.
whenever t s < , xi y < ,
Then for any $(t_0, \xi_0) \in G$ there exists $\delta > 0$ such that the equation
$$\dot{x} = f(t, x) \tag{2.3.4}$$
has a unique solution on the interval $(t_0 - \delta, t_0 + \delta)$ satisfying the initial condition
$$x(t_0) = \xi_0. \tag{2.3.5}$$
Proof. First we rewrite the initial value problem (2.3.4), (2.3.5) into an equivalent
fixed point problem for an integral operator $F$ defined by
$$F(x) : t \mapsto \xi_0 + \int_{t_0}^{t} f(s, x(s))\,ds, \qquad t \in (t_0 - \delta, t_0 + \delta).^{28} \tag{2.3.6}$$
Put
M = {x C[t0 , t0 + ] : x(t) 0 1 t [t0 , t0 + ]} for a 1 .
Then
sup F (x(t)) 0 K,
tI
tI
where
I [t0 , t0 + ].
If we choose $\delta$ so small that $K\delta \le \delta_1$ and $L\delta \le \frac{1}{2}$, then $F$ maps $M$ into itself (the
first condition) and is a contraction with $q = \frac{1}{2}$ (the second condition). By the
Contraction Principle, $F$ has a unique fixed point $y$ in $M$ and this is a solution of
(2.3.4), (2.3.5) on the interval $(t_0 - \delta, t_0 + \delta)$. If $\tilde{x}$ is a solution of (2.3.4), (2.3.5) on
the interval $(t_0 - \delta, t_0 + \delta)$, then $\tilde{x} \in M$ (prove it!), i.e., $y = \tilde{x}$, and the uniqueness
follows.
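The successive approximations produced by the integral operator (2.3.6) can be computed on a grid. The sketch below is only an illustration under stated assumptions: the model problem $\dot{x} = x$, $x(0) = 1$ (so the Lipschitz constant is 1), the half-length 0.5 (so that the contraction constant is $\frac{1}{2}$), the right half of the interval only, and a cumulative trapezoidal rule for the integral. All names are hypothetical.

```python
import math

T0, XI0, DELTA = 0.0, 1.0, 0.5      # initial data and interval half-length (assumptions)
N = 1000                            # grid cells on [t0, t0 + delta]
H = DELTA / N
GRID = [T0 + i * H for i in range(N + 1)]

def f(t, x):
    # right-hand side of the model equation; Lipschitz constant L = 1
    return x

def picard_step(x_vals):
    # (F x)(t) = xi0 + integral from t0 to t of f(s, x(s)) ds,
    # approximated by the cumulative trapezoidal rule on the grid
    out = [XI0]
    acc = 0.0
    for i in range(1, N + 1):
        acc += 0.5 * H * (f(GRID[i - 1], x_vals[i - 1]) + f(GRID[i], x_vals[i]))
        out.append(XI0 + acc)
    return out

x_vals = [XI0] * (N + 1)            # start from the constant function xi0
sup_diffs = []                      # sup-norm distances between consecutive iterates
for _ in range(25):
    new_vals = picard_step(x_vals)
    sup_diffs.append(max(abs(a - b) for a, b in zip(new_vals, x_vals)))
    x_vals = new_vals

# distance to the exact solution exp(t) of the model problem
err = max(abs(x - math.exp(t)) for t, x in zip(GRID, x_vals))
```

The recorded distances `sup_diffs` shrink at least by the factor $\frac{1}{2}$ from one step to the next, in line with the contraction constant chosen in the proof; the remaining error `err` comes from the quadrature, not from the iteration.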
28 If
t0
f (s, x(s)) ds =
t
t0
t0
f (s, x(s)) ds = 0.
Remark 2.3.5. The mapping $F$ defined by (2.3.6) depends actually not only on
$x$ but also on $t_0$, $\xi_0$. By taking $\delta$ smaller we can prove that $F$ is also Lipschitz
continuous with respect to the initial conditions and Corollary 2.3.2 yields that
the solution $x(t; t_0, \xi_0)$ of (2.3.4), (2.3.5) is also Lipschitz continuous with respect
to the initial conditions.
Remark 2.3.6. If we apply Theorem 2.3.4 (i.e., the Contraction Principle) to a
system of linear differential equations
$$\dot{x} = A(t)x + g(t)$$
with a continuous matrix $A$ and a continuous vector function $g$ on an interval
$(a, b)$, we need an extra effort to prove that a solution exists on the whole interval
$(a, b)$. Namely, Theorem 2.3.4 gives only local existence, and in the continuation
process (take $(t_0 + \delta, x(t_0 + \delta))$ as a new initial condition) there is no a priori
evidence that $\delta$ could not be smaller and smaller.$^{29}$ It is therefore sometimes more
convenient not to refer to the Contraction Principle but to prove the convergence
of iterations directly. The following example demonstrates this approach.
Example 2.3.7. Let $k$ be a bounded measurable function on the set
$$M = \{(s, t) \in \mathbb{R}^2 : 0 \le s \le t \le 1\}.$$
Then for any $f \in L^1(0, 1)$ and $\lambda \ne 0$ there is a unique solution to the integral
equation
$$x(t) - \lambda \int_0^t k(t, s)x(s)\,ds = f(t). \tag{2.3.7}$$
Ax(t) =
xn = f + Axn1 .
1
Due to the completeness of L (0, 1) the sequence {xn }
n=1 is convergent in L (0, 1)
xn xn1 L1 (0,1) is convergent. We have
if and only if the sum
1
n=1
xn xn1 = A f
n
and
A f (t) =
where
k1 = k
and
kn (t, s) =
29 This
n=1
(t, s) M,
 t

kn (t, s)f (s) ds dt
0
0
1
1
n kn
n

f L1 (0,1) .
f (s)
kn (t, s) dt ds
n!
0
s
(t s)n1
,
(n 1)!
an
n!
($G$ is the Green function, see Example 2.1.32). Therefore, we are looking for a
continuous function $x$ which solves the integral equation
$$x(t) = \lambda \int_0^1 G(t, s)f(s, x(s))\,ds. \tag{2.3.9}$$
we have proved that C \ {0} (A), i.e., (A) {0}. Since A L(L1 (0, 1)),
(A) =
30 Actually,
Denote
F (, x)
For $x \in M$ we have
$$|f(s, x(s))| \le |f(s, 0)| + |f(s, x(s)) - f(s, 0)| \le K + L(r)r$$
where $K > 0$ is a constant such that $|f(s, 0)| \le K$, $s \in [0, 1]$, and
$$\|F(\lambda, x)\| \le \frac{|\lambda|}{8}\,(K + L(r)r).$$

L(r) < 1
8
and
K 1
.
8 1q
Then $F$ is also a contraction on $M$ with the constant $q$. We can conclude that for
a given $r$ there is $\lambda_0 > 0$ such that for $|\lambda| \le \lambda_0$ both the above conditions$^{31}$ are
satisfied and (2.3.9) has a solution.
Now we have to show that a continuous solution $x$ of (2.3.9) is actually a
classical solution of the boundary value problem (2.3.8). Since we know the explicit
form of the Green function $G$, it is obvious that $x(0) = x(1) = 0$ and it is also
easy to differentiate twice the right-hand side of (2.3.9) (taking into account that
$x$ is continuous).
We remark that we have not used all properties of the integral operator with
the kernel $G$. In particular, such an operator is compact (Example 2.2.5(i)) and
this property has not been used. This property will be significant in Section 5.1.
The a posteriori estimate (2.3.3) shows that the convergence of iterations
may be rather slow. It can be sometimes desirable to have faster convergence at
the expense of more restrictive assumptions. The classical Newton Method for
solving an equation
$$f(x) = 0, \qquad f : \mathbb{R} \to \mathbb{R},$$
is illustrated in Figure 2.3.1.
In order to generalize this method we need the notion of a derivative of
$f : X \to X$. This will be the main subject of the next chapter.
31 Notice that for a fixed $\lambda$ these conditions are antagonistic, namely the first requires small $r$
and the other large $r$. This situation is typical in applications of the Contraction Principle.
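The difference in speed between plain successive approximations and the Newton Method can be seen on a scalar example. The sketch below is an illustration under assumptions not taken from the text: the equation $x - \cos x = 0$, the starting point 1, and the stopping tolerance `tol` are all hypothetical choices. The Newton step replaces $f$ by its tangent at $x_n$ and takes the zero of the tangent as $x_{n+1}$.

```python
import math

def f(x):
    return x - math.cos(x)

def df(x):
    # derivative of f; it stays in [1, 2] near the root, so division is safe
    return 1.0 + math.sin(x)

def newton(x, tol=1e-14, max_iter=50):
    # x_{n+1} = x_n - f(x_n) / f'(x_n): zero of the tangent line at x_n
    for n in range(1, max_iter + 1):
        x_new = x - f(x) / df(x)
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    return x, max_iter

def simple_iteration(x, tol=1e-14, max_iter=500):
    # successive approximations x_{n+1} = cos(x_n) for the same problem
    for n in range(1, max_iter + 1):
        x_new = math.cos(x)
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    return x, max_iter

root_newton, steps_newton = newton(1.0)
root_simple, steps_simple = simple_iteration(1.0)
```

Both procedures approach the same solution, but the Newton Method needs only a handful of steps while the simple iteration needs many more; the price is that the derivative `df` must be available and nonzero near the solution.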
Figure 2.3.1. Newton Method: the graph $y = f(x)$ and its tangent line $y = f'(x_n)(x - x_n) + f(x_n)$ at $x_n$, which meets the $x$-axis at $x_{n+1}$.
for all x, y M.
for
x = (x1 , x2 , . . . ) M.
Then F is a nonexpansive map of the unit ball into itself without any xed point.
This example indicates that some special properties of the space are needed.
We formulate the following assertion in a Hilbert space and use the Hilbert structure essentially in its proof. The statement is true also in uniformly convex spaces
but the proof is more involved (see, e.g., Goebel [60]). Let us note an interesting
fact that the validity of Proposition 2.3.10 in a reflexive Banach space is an open
problem.
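Proposition 2.3.10 (Browder). Let $M$ be a bounded closed and convex set in a
Hilbert space $H$. Let $F$ be a nonexpansive mapping from $M$ into itself. Then
there is a fixed point of $F$ in $M$. Moreover, if
$$x_0 \in M, \qquad x_n = F(x_{n-1}) \qquad \text{and} \qquad y_n = \frac{1}{n} \sum_{k=0}^{n-1} x_k,$$
then $\{y_n\}_{n=1}^\infty$ converges weakly to a fixed point of $F$.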
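For a nonexpansive mapping the iterates themselves need not converge, while their arithmetic means may. A minimal sketch, with an example chosen only for illustration and not taken from the text: $M$ is the closed unit disc in $\mathbb{R}^2$ and $F$ is the rotation by a right angle, an isometry whose only fixed point is the origin. In finite dimension weak and norm convergence coincide, so the means can be monitored in norm.

```python
import math

def F(p):
    # rotation by 90 degrees: nonexpansive on the closed unit disc, fixed point (0, 0)
    return (-p[1], p[0])

def cesaro_means(x0, n_max):
    # returns y_1, ..., y_{n_max} with y_n = (x_0 + ... + x_{n-1}) / n,
    # together with the iterate x_{n_max}
    means = []
    x = x0
    sx, sy = 0.0, 0.0
    for n in range(1, n_max + 1):
        sx += x[0]
        sy += x[1]
        means.append((sx / n, sy / n))
        x = F(x)
    return means, x

means, x_last = cesaro_means((1.0, 0.0), 1001)
norm_last_iterate = math.hypot(*x_last)     # iterates stay on the unit circle
norm_last_mean = math.hypot(*means[-1])     # means approach the fixed point
```

The iterates cycle through four points of the unit circle and never converge, whereas the means tend to the fixed point at the rate $1/n$.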
Proof. The existence result is not difficult to prove.$^{32}$ So we will prove a more
interesting result which yields also a numerical method for finding a fixed point.
The proof consists of four steps, the last one is crucial and has a variational
character.
Step 1. Since $M$ is bounded, closed and convex, $y_n \in M$ and there is a subsequence
k=0
+ F (yn ) yn 2 + 2 Re(yn F (yn ), F (yn ) yn )
n1
1 k
F (x0 ) F (yn )2 F (yn ) yn 2 .
n
k=0
n1
1 k1
1
F
(x0 ) yn 2 + x0 F (yn )2
n
n
k=1
n1
1
n
F k (x0 ) yn 2
k=0
1
1
= x0 F (yn )2 F n1 (x0 ) yn 2 0
n
n
(all sequences belong to M, and hence they are bounded).
Step 3. The element $\tilde{x}$ is a fixed point of $F$. To see this, observe that the inequality
$$(z - F(z) - (y_{n_k} - F(y_{n_k})), z - y_{n_k}) = (z - y_{n_k}, z - y_{n_k}) - (F(z) - F(y_{n_k}), z - y_{n_k}) \ge \|z - y_{n_k}\|^2 - \|z - y_{n_k}\|^2 = 0$$
32 It is possible to assume that $o \in M$. For any $t \in (0, 1)$ the mapping $F_t(x) := tF(x)$ is a
contraction. Letting $t \to 1$ we obtain a sequence $\{x_n\}_{n=1}^\infty \subset M$ for which $x_n - F(x_n) \to o$.
Therefore it is sufficient to show that $(I - F)(M)$ is closed. This needs a trick which is typical for
monotone operators (Section 5.3). Notice that $I - F$ is monotone provided $F$ is nonexpansive.
holds for any $z \in M$. By Exercise 2.1.36 and Step 2, the limit of the left-hand side
is $(z - F(z), z - \tilde{x})$, i.e., the inequality
$$(z - F(z), z - \tilde{x}) \ge 0 \qquad (z \in M) \tag{2.3.10}$$
for any v H.
n1
1
v xk 2 + 2 Re(
x v, v yn ).
n
(2.3.11)
k=0
Let $v$ be a weak limit of a subsequence $\{y_{n_l}\}_{l=1}^\infty \subset \{y_n\}_{n=1}^\infty$, possibly different
from $\{y_{n_k}\}_{k=1}^\infty$. Then $v$ is a fixed point of $F$ by virtue of the previous steps. Set
$n = n_l$ and take the limit for $l \to \infty$ in (2.3.11). We finally obtain$^{33}$
(
x) (v)
x v2 ,
and v = x
follows. In particular, the limit of any weakly convergent subsequence
nl 1
1
x
n
l l j=0
that lim
xj 2 = lim x xn 2 .
n
x M,
(2.3.12)
$$x_n = F(x_{n-1}),$$
the sequence $\{x_n\}_{n=1}^\infty$ is convergent in $M$ to an $\tilde{x}$. Moreover, if the graph of $F$ is
closed in $M \times M$, then
$$F(\tilde{x}) = \tilde{x}.$$
Hint. Show that $\{V(x_n)\}_{n=1}^\infty$ is a decreasing sequence; this implies that $\{x_n\}_{n=1}^\infty$
is a Cauchy sequence.
Remark 2.3.13. The condition (2.3.12) is suitable for a vector-valued mapping
$F$ and plays an important role in game theory. For details see, e.g., Aubin &
Ekeland [10, Chapter VI].
Exercise 2.3.14. Let $M$ be a complete metric space and let $F : M \to M$. If there
is $n \in \mathbb{N}$ such that $F^n$ is a contraction, then $F$ has a unique fixed point in $M$.
Hint. Let $\tilde{x}$ be a fixed point of
$$G := F^n, \qquad \tilde{x} = \lim_{k \to \infty} G^k(x_0).$$
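A standard illustration of this exercise, given here as a sketch with an example not taken from the text: on $M = \mathbb{R}$ the mapping $F(x) = \cos x$ is not a contraction (its derivative has absolute value arbitrarily close to 1), but $G = F \circ F$ is, since $|G'(x)| = |\sin(\cos x)\,\sin x| \le \sin 1 < 1$. Iterating $G$ from an arbitrary starting point yields a fixed point of $G$, which turns out to be a fixed point of $F$ as well.

```python
import math

def F(x):
    return math.cos(x)

def G(x):
    # G = F o F; its Lipschitz constant on the real line is at most sin(1) < 1
    return F(F(x))

x = 10.0                      # arbitrary starting point in R (assumption)
for _ in range(200):
    x = G(x)                  # x = G^k(x0)

x_tilde = x
residual_G = abs(G(x_tilde) - x_tilde)   # x_tilde is (numerically) a fixed point of G
residual_F = abs(F(x_tilde) - x_tilde)   # and also a fixed point of F
```

The second residual reflects the argument behind the hint: $F(\tilde{x})$ is again a fixed point of $G$, so uniqueness for $G$ forces $F(\tilde{x}) = \tilde{x}$.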
Remark 2.3.15. The power $n \in \mathbb{N}$ need not be the same for all $x, y \in M$, i.e., if
there is $q \in [0, 1)$ and for every $x \in M$ there exists $n(x) \in \mathbb{N}$ such that
$$\varrho(F^{n(x)}(x), F^{n(x)}(y)) \le q\,\varrho(x, y) \quad \text{for all } y \in M,$$
then $F$ also has a unique fixed point (Sehgal [119]). The proof is similar to the
previous one.
Exercise 2.3.16 (Edelstein). Let $M$ be a compact metric space and let $F : M \to M$
satisfy the condition
$$\varrho(F(x), F(y)) < \varrho(x, y) \quad \text{for all } x, y \in M,\ x \ne y.$$
i.e.,
F (
x) = x.
F : x(t) tx(t).
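$$x(t) = \lambda \int_0^t k(t, s)x(s)\,ds$$
where $\lambda$ and $k$ are as in Example 2.3.7. Prove that $x = 0$ a.e. in $(0, 1)$.
Hint. First show that $x \in L^\infty(0, 1)$. From the equation we have
$$\|x\|_{L^\infty(0,t)} \le |\lambda|\,t\,\|k\|_{L^\infty(M)}\,\|x\|_{L^\infty(0,t)}, \qquad t \in (0, 1).$$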
1
.
N (b a)
K(t, , x( )) d + h(t)
a
(i) Modify the procedure from Example 2.3.7 to prove that the initial value
problem
$$\dot{x}(t) = A(t)x(t), \qquad x(\tau) = \xi,$$
has a unique solution which is defined on $(a, b)$.
(ii) Prove that the equation
$$\dot{x}(t) = A(t)x(t) \tag{2.3.13}$$
= A(t)x(t) + f (t)?
Hint. Use the Variation of Constant Formula and (iii).
Chapter 3
the inequality
$$\left\| \sum_{i=1}^{n} f(\tau_i)(t_i - t_{i-1}) - x \right\| < \varepsilon \tag{3.1.1}$$
is satisfied. Then $x$ is called the Riemann integral of $f$ over $[a, b]$ and it is denoted by
$$\int_a^b f(t)\,dt.$$
Proof. Since $f$ is continuous on the compact interval $[a, b]$, $f$ is uniformly continuous on it. Take an equidistant division $D_n = \{a = t_0^n < \cdots < t_n^n = b\}$ of the
interval $[a, b]$, i.e.,
$$t_i^n = a + \frac{i}{n}(b - a), \qquad i = 1, \dots, n,$$
Then
sn =
n
and
Dn  =
ba
.
n
i=1
easy to see, again by the uniform continuity of $f$, that condition (3.1.1) is satisfied
whenever $|D|$ is sufficiently small.
Since the Riemann integral is linear and the estimate$^1$
$$\left\| \int_a^b f(t)\,dt \right\| \le \int_a^b \|f(t)\|_X\,dt \le \|f\|_{C([a,b],X)}\,(b - a) \tag{3.1.2}$$
holds for each $f \in C([a, b], X)$, the integral is a linear continuous operator. Its
commutativity with linear operators is important.
Proposition 3.1.3. Let $X$, $Y$ be Banach spaces and let $f : [a, b] \to X$ be Riemann
integrable.
(i) If $A \in \mathcal{L}(X, Y)$, then $Af$ is also integrable and
$$A \int_a^b f(t)\,dt = \int_a^b Af(t)\,dt \tag{3.1.3}$$
holds.
With the help of this generalization of the Riemann integral we can also prove a basic result on the existence and uniqueness of a solution of a differential equation in a Banach space.
1 Here
Assume that
$$f : I \times G \to X$$
where $I$ is an open interval in $\mathbb{R}$, $G$ is an open subset of a Banach space $X$. By a solution of a differential equation
$$\dot{x} = f(t, x) \tag{3.1.4}$$
x(t)
lim
exists and
$$\dot{x}(t) = f(t, x(t)).$$
Theorem 3.1.4. Let $I$ be an open interval in $\mathbb{R}$ and let $G$ be an open subset of a Banach space $X$. Assume that $f : I \times G \to X$ is continuous and locally satisfies the Lipschitz condition with respect to the second variable, i.e., for every $s \in I$, $y \in G$ there are $\delta > 0$, $\hat{\delta} > 0$, $L > 0$ such that
$$\|f(t, x_1) - f(t, x_2)\| \le L \|x_1 - x_2\| \quad \text{whenever } |t - s| < \delta \text{ and } \|x_i - y\| < \hat{\delta},\ i = 1, 2.$$
Then for each $t_0 \in I$, $x_0 \in G$ there exists $h > 0$ such that the equation (3.1.4) has a unique solution on the interval $J = (t_0 - h, t_0 + h)$ which satisfies the initial condition
$$x(t_0) = x_0. \tag{3.1.5}$$
The proof of this theorem is based on the use of the Contraction Principle for the equivalent integral equation (see also the proof of Theorem 2.3.4)
$$x(t) = x_0 + \int_{t_0}^{t} f(s, x(s))\,ds, \quad t \in J, \tag{3.1.6}$$
where the integral is the Riemann integral. The equivalence of (3.1.4), (3.1.5) and (3.1.6) is established in the following lemma.
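The successive approximations behind the Contraction Principle can be sketched numerically for the integral equation (3.1.6). The following is only an illustration under stated assumptions: the scalar case $X = \mathbb{R}$, the hypothetical right-hand side $f(t, x) = x$ with $t_0 = 0$, $x_0 = 1$, $h = 0.5$, and the trapezoidal rule standing in for the Riemann integral. The function name `picard` is ours.

```python
import math


def picard(f, t0, x0, h, n_grid=200, n_iter=30):
    """Approximate the fixed point of x -> x0 + int_{t0}^{t} f(s, x(s)) ds on [t0, t0 + h]."""
    ts = [t0 + h * i / n_grid for i in range(n_grid + 1)]
    xs = [x0] * (n_grid + 1)          # initial guess: the constant function x0
    for _ in range(n_iter):
        vals = [f(t, x) for t, x in zip(ts, xs)]
        new_xs = [x0]
        acc = 0.0
        for i in range(1, n_grid + 1):
            # trapezoidal rule for the integral over [ts[i-1], ts[i]]
            acc += 0.5 * (vals[i - 1] + vals[i]) * (ts[i] - ts[i - 1])
            new_xs.append(x0 + acc)
        xs = new_xs
    return ts, xs


# Assumed test problem: x' = x, x(0) = 1, whose solution is exp(t).
ts, xs = picard(lambda t, x: x, 0.0, 1.0, 0.5)
error_at_end = abs(xs[-1] - math.exp(0.5))
```

With Lipschitz constant $L = 1$ and $h = 0.5$ the integral operator is a contraction in the sup norm, so the iterates should settle quickly; the remaining error comes from the quadrature.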
Lemma 3.1.5. Suppose that $f$ is continuous on $I \times G$ and $(t_0, x_0) \in I \times G$. Then a continuous function $x : J \to G$ is a solution of (3.1.4) on an interval $J \subset I$ and satisfies the condition (3.1.5) if and only if $t_0 \in J$ and $x$ solves on $J$ the integral equation (3.1.6).
2 Recall
that
t0
t0
g(s) ds =
t
Proof. Step 1. Assume first that $x$ is a solution of (3.1.4). Then $x$ as well as the mapping $t \in J \mapsto f(t, x(t))$ are continuous on $J$. Choose $\tau \in J$ and integrate both sides of (3.1.4) over the interval $[t_0, \tau]$ (or $[\tau, t_0]$). Notice that both sides are Riemann integrable (Theorem 3.1.2). Moreover,
$$\varphi\left( \int_{t_0}^{\tau} \dot{x}(t)\,dt \right) = \int_{t_0}^{\tau} \varphi(\dot{x}(t))\,dt = \int_{t_0}^{\tau} \frac{d}{dt}\varphi(x(t))\,dt = \varphi(x(\tau)) - \varphi(x_0)$$
for all $\varphi \in X^*$ (the last equality follows from the so-called Basic Theorem of Calculus). By the Hahn–Banach Theorem, in particular Remark 2.1.17(ii), we have
$$\int_{t_0}^{\tau} \dot{x}(t)\,dt = x(\tau) - x_0,$$
Proof of Theorem 3.1.4. Choose > 0, > 0 small enough and K > 0 such that
f (s, x1 ) K,
t0
$$\le Lh\|x_1 - x_2\| \le \frac{1}{2}\|x_1 - x_2\| \quad \text{for } x_1, x_2 \in M_h.$$
for
t J.
for
t (, 1 )
(by uniqueness). This allows us to dene the solution x = x(t) on the entire interval
(, ). Since
t
f (, x()) d t s sup f (, y)
x(t) x(s)
s
(,y)IX
for any < s < t < , the solution x is uniformly continuous on (, ) and,
therefore, continuously extendable at provided < (see Proposition 1.2.4).
The local Theorem 3.1.4 allows us to continue x as a solution beyond the value of
, a contradiction. Hence = . Similarly, we prove inf = .
Remark 3.1.7. Under the assumptions of Corollary 3.1.6 the solution $x$ depends continuously on the initial data. In order to formulate this result we denote by $x = x(\cdot; t_0, x_0)$ the solution of (3.1.4), (3.1.5) on the interval $I$. The continuous dependence now reads as follows:
For any compact interval $J \subset I$, $t_0 \in J$, and any $\varepsilon > 0$ there is $\delta > 0$ such that
$$\|x(t; t_0, x_0) - x(t; t_1, x_1)\| < \varepsilon \quad \text{for all } t \in J$$
s(t) = o
n
Mk .
k=1
k=1
there is a sequence $\{s_m\}_{m=1}^{\infty}$ of step functions which converges to $f$ a.e. and
$$\lim_{m \to \infty} \int_M \|f - s_m\|_X\,d\mu = 0. \tag{3.1.7}$$
3 The reader who is not acquainted with measure theory and the abstract Lebesgue integral can assume that $M$ is an open subset of $\mathbb{R}^N$, $\mu$ is a Lebesgue measure and $\Sigma$ is a collection of all Lebesgue measurable subsets of $M$.
$$\int_M f\,d\mu = \lim_{m \to \infty} \int_M s_m\,d\mu. \tag{3.1.8}$$
Remark 3.1.10. In order to show that this definition is correct we need to prove that the norm of a strongly measurable function is a measurable function (this is obvious) and, therefore, the condition (3.1.7) makes sense. From (3.1.7) it also immediately follows that the limit in (3.1.8) does not depend on any special choice of $\{s_n\}_{n=1}^{\infty}$.
The following statement offers a very useful criterion for Bochner integrability.
Proposition 3.1.11 (Bochner). Let $X$ be a Banach space and let $(M, \Sigma, \mu)$ be a measurable space. A strongly measurable vector function $f : M \to X$ is Bochner integrable if and only if the norm $\|f\|_X$ is Lebesgue integrable. Moreover,
$$\left\| \int_M f\,d\mu \right\|_X \le \int_M \|f\|_X\,d\mu. \tag{3.1.9}$$
Proof.
sn d
exists. Then
sn d lim
n
sn d = .
It is easy to see that does not depend on any special choice of {sn }n=1 from the
denition and, moreover,
f d
f sn d +
sn d,
i.e.,
f d .
M
Step 2. Suppose now that $\|f\|_X$ is Lebesgue integrable and that $\{s_n\}_{n=1}^{\infty}$ is a sequence of step functions from the definition of strong measurability. Put
o
otherwise.
Then n f a.e. and, by the Lebesgue Dominated Convergence Theorem,
1
n f d 0
since
n (t) f (t) 2 +
f (t).
n
M
It follows from the
independence of of the special choice of approximating step
f d. This proves the inequality (3.1.9).
functions that
M
t M,
is strongly measurable. Achieving this4 the rest of the proof is easy: By Proposition 3.1.11, g is Bochner integrable and
g d =
f d,
Af d Z
M
f d
4 We sketch the proof of this result: Let $\varphi \in Z^*$. According to the Hahn–Banach Theorem there is an extension $\tilde{\varphi} = (\varphi_1, \varphi_2)$ of $\varphi$ to $(X \times Y)^*$. Since $f$ and $Af$ are strongly measurable, we conclude that $t \mapsto \varphi(g(t)) = \varphi_1(f(t)) + \varphi_2(Af(t))$ is measurable. It can be also shown that there is $N \subset M$, $\mu(M \setminus N) = 0$ such that $g(N)$ is separable. The result now follows from the Pettis Theorem (see, e.g., Dunford & Schwartz [44, Chapter III, §6], Yosida [135]):
A function $g : M \to Z$ (Banach space) is strongly measurable if and only if the following two conditions are satisfied:
(or C)
is integrable (in this case the Bochner and the Lebesgue integrals coincide). This
shows that the Bochner integral is a restriction of any notion of a weak integral.
We now return to the functional calculus given for matrices (see Theorem 1.1.38). Let $B \in L(X)$ and let $H(\sigma(B))$ be a collection of holomorphic functions on a neighborhood of $\sigma(B)$ (this neighborhood can depend on a function). If $f \in H(\sigma(B))$, then there exists a positively oriented Jordan curve $\gamma$ such that $\sigma(B) \subset \operatorname{int} \gamma$ and $f$ is holomorphic on a neighborhood of $\overline{\operatorname{int} \gamma}$. Hence the integral
$$f(B)x \triangleq \frac{1}{2\pi i} \int_\gamma f(w)(wI - B)^{-1} x\,dw, \quad x \in X,$$
exists. Its properties are collected in the following assertions.
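For a matrix the contour integral above can be approximated directly. The sketch below is an illustration only: it assumes a hypothetical $2 \times 2$ matrix $B$ with eigenvalues $1$ and $3$, takes $\gamma$ to be the circle $|w| = 5$, and replaces the integral by the trapezoidal rule on equally spaced nodes. The helper names `resolvent` and `dunford` are ours.

```python
import cmath


def resolvent(B, w):
    """Return (wI - B)^{-1} for a 2x2 matrix B via the adjugate formula."""
    a, b = w - B[0][0], -B[0][1]
    c, d = -B[1][0], w - B[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def dunford(f, B, radius, n_nodes=64):
    """Approximate (1 / 2 pi i) * contour integral of f(w)(wI - B)^{-1} over |w| = radius."""
    total = [[0j, 0j], [0j, 0j]]
    for k in range(n_nodes):
        w = radius * cmath.exp(2j * cmath.pi * k / n_nodes)
        R = resolvent(B, w)
        # dw = i w d(theta), so the factor 1/(2 pi i) leaves the weight f(w) w / n_nodes
        weight = f(w) * w / n_nodes
        for i in range(2):
            for j in range(2):
                total[i][j] += weight * R[i][j]
    return total


B = [[1.0, 2.0], [0.0, 3.0]]              # assumed example matrix
square = dunford(lambda w: w * w, B, radius=5.0)
identity = dunford(lambda w: 1.0, B, radius=5.0)
```

For the polynomial $P(w) = w^2$ the result should agree with $B^2$, computed by hand as the matrix with rows $(1, 8)$ and $(0, 9)$, and the constant function $1$ should give the identity.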
Proposition 3.1.14 (Dunford Functional Calculus). Let $X$ be a complex Banach space and let $B \in L(X)$. There exists a unique linear mapping $\Phi : H(\sigma(B)) \to L(X)$ with the following properties:
(i) $\Phi(fg) = \Phi(f)\Phi(g) = \Phi(g)\Phi(f)$ for $f, g \in H(\sigma(B))$;
(ii) if $P(w) = \sum_{j=0}^{n} a_j w^j$, then $P(B) = \sum_{j=0}^{n} a_j B^j$;
(iii) if f (w) =
j=0
1
w
j=0
{fn }n=1
(B1 ) = (B) \ {0 }.
Put
0 {0 + rei : [0, 2]}
n
1 0
1
=
(I0 B0 )1 x0 d
2i 0 0 n=0 0
$$= \frac{x_0}{\lambda - \lambda_0} + \sum_{n=1}^{\infty} \frac{(-1)^n}{(\lambda - \lambda_0)^{n+1}} (\lambda_0 I_0 - B_0)^n x_0 \tag{3.1.11}$$
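$$(\lambda I_1 - B_1)^{-1} x_1 = \sum_{n=0}^{\infty} \frac{(\lambda - \lambda_0)^n}{n!} \frac{d^n}{d\lambda^n} (\lambda I_1 - B_1)^{-1} x_1 \Big|_{\lambda = \lambda_0} = \sum_{n=0}^{\infty} (-1)^n (\lambda - \lambda_0)^n (\lambda_0 I_1 - B_1)^{-(n+1)} x_1, \tag{3.1.12}$$
$x_1 \in X_1$, $|\lambda - \lambda_0| < r_0 \triangleq \|(\lambda_0 I_1 - B_1)^{-1}\|^{-1}$.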
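Proposition 3.1.15. If $\lambda_0$ is an isolated point of the spectrum $\sigma(B)$, $B \in L(X)$, then there exist operators $A_n \in L(X)$, $n \in \mathbb{Z}$, and $r > 0$ such that
$$(\lambda I - B)^{-1} x = \sum_{n=-\infty}^{+\infty} (\lambda - \lambda_0)^n A_n x \tag{3.1.13}$$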
for every
n>k
and
z Ak x = o,
then
Bz = 0 z.
On the other hand, if $\lambda_0$ is a nonzero eigenvalue of a compact operator $B$, then $\lambda_0$ is a pole of the resolvent of $B$, i.e., there is $k \in \mathbb{N}$ such that
$$A_{-n} = O \quad \text{for all } n > k.$$
Proof. Let $\lambda_0$ be an isolated point of $\sigma(B)$ and $B \in L(X)$. If $P_0$, $P_1$ are the above projections onto $X_0$, $X_1$, then
$$(\lambda I - B)^{-1} x = (\lambda I - B_0)^{-1} x_0 + (\lambda I - B_1)^{-1} x_1, \quad P_0 x = x_0, \quad P_1 x = x_1,$$
(I0 B0 )
k1
k
(1)n
An x
n
x0 =
(0 I0 B0 ) x0 =
n+1
(
)
(
0 )n
0
n=0
n=1
Exercise 3.1.16. Give details to confirm the resolvent formulae (3.1.11) and (3.1.12).
Hint. For (3.1.11) interchange the sum and the integral and use Proposition 3.1.14(ii). For (3.1.12) use the resolvent identity
$$(\lambda I - B)^{-1} - (\mu I - B)^{-1} = (\mu - \lambda)(\lambda I - B)^{-1} (\mu I - B)^{-1}$$
and induction.
Exercise 3.1.17. Compare the functional calculus from Proposition 3.1.14 with
that of Remark 2.2.18. More precisely, show that for a compact, selfadjoint operator B the functional calculus given in Remark 2.2.18 is an extension of that of
Proposition 3.1.14.
Exercise 3.1.18. Let $X$ be a Banach space. Assume that $f : [a, b] \to X$ has the Riemann integral over the interval $[a, b]$. Show that then the Bochner integral $\int_a^b f(t)\,dt$ also exists and the two integrals are equal. In particular, Proposition 3.1.3 is a special case of Proposition 3.1.12. However, the proof of Proposition 3.1.3(ii) is much simpler.
Exercise 3.1.19. Let $A : \operatorname{Dom} A \subset H \to H$ be a densely defined linear operator on a Hilbert space $H$. Assume that $A$ has a compact resolvent that is also self-adjoint.
(i) Extend the functional calculus (Remark 2.2.18 and Proposition 3.1.14) to such $A$. In particular, show that the formula for $\Phi(f)x$ still holds provided that
$$\sum_{n=1}^{\infty} |f(\lambda_n)|^2 |(x, e_n)|^2 < \infty$$
u(t) etA x0
= Ax(t),
x(0) = x0 .
(iii) Prove that
$$\int_0^t e^{sA} x\,ds \in \operatorname{Dom} A \quad \text{for all } x \in H$$
and
$$A \int_0^t e^{sA} x\,ds = e^{tA} x - x. \tag{3.1.14}$$
$$\dot{x}(t) = Ax(t) + g(t), \quad x(0) = x_0.$$
Show that
$$u(t) = e^{tA} x_0 + \int_0^t e^{(t-s)A} g(s)\,ds \quad \text{for } t \ge 0.$$
follows from the Contraction Principle. Such a solution is called a mild solution of the problem
$$\dot{x}(t) = Ax(t) + h(x(t)), \quad x(0) = x_0. \tag{3.1.15}$$
exists, then its value is called the derivative of $f$ at the point $a$ and in the direction $h$ (or directional derivative or Gâteaux variation) and is denoted by $\delta f(a; h)$. If $\delta f(a; h)$ exists for all $h \in X$ and the mapping $Df(a) : h \mapsto \delta f(a; h)$ is linear and continuous, then $Df(a)$ is called the Gâteaux derivative of $f$ at the point $a$.${}^5$
Remark 3.2.2. Simple examples of functions of two variables show that the directional derivative need not be linear in $h$ and not even the existence of $Df(a)$ guarantees the continuity of $f$ at the point $a$.
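One such example can be checked numerically. The sketch below assumes the function $f(x, y) = x^3 y / (x^6 + y^2)$ with $f(0, 0) = 0$, a choice of ours rather than one taken from the text: its difference quotients at the origin tend to $0$ in every direction, so the Gâteaux derivative at the origin is the zero map, while $f$ equals $1/2$ along the curve $y = x^3$ and is therefore not continuous there.

```python
def f(x, y):
    """Assumed example: all directional derivatives at the origin vanish, f is discontinuous there."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x ** 3 * y / (x ** 6 + y ** 2)


def directional_quotient(h, t):
    """Difference quotient (f(0 + t h) - f(0)) / t at the origin."""
    return (f(t * h[0], t * h[1]) - f(0.0, 0.0)) / t


# Quotients in the direction h = (1, 2) for t = 1e-1, ..., 1e-5.
quotients = [directional_quotient((1.0, 2.0), 10.0 ** (-k)) for k in range(1, 6)]
# Values of f along the curve y = x^3, approaching the origin.
along_curve = [f(s, s ** 3) for s in (1e-1, 1e-2, 1e-3)]
```

Here the quotient in the direction $(h_1, h_2)$ with $h_2 \ne 0$ equals $t h_1^3 h_2 / (t^4 h_1^6 + h_2^2)$, which is of order $t$.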
5 The terminology concerning Gâteaux differentiability is not fixed. Some authors do not assume linearity of $Df(a)$.
Example 3.2.3. Consider the standard bases $e_1^M, \dots, e_M^M$ and $e_1^N, \dots, e_N^N$ of $\mathbb{R}^M$ and $\mathbb{R}^N$, respectively. Then we can write $f : \mathbb{R}^M \to \mathbb{R}^N$ in the form
$$f(x) = \sum_{i=1}^{N} f^i(x)\,e_i^N.$$
It is easy to see that $\delta f(a; h)$ exists if and only if $\delta f^i(a; h)$ exists for all $i = 1, \dots, N$. In particular, for $h = e_j^M$, the directional derivative $\delta f^i(a; h)$ is nothing else than $\frac{\partial f^i}{\partial x_j}(a)$. This means that the Gâteaux derivative $Df(a)$ has the matrix representation of the form
$$\begin{pmatrix} \frac{\partial f^1}{\partial x_1}(a) & \dots & \frac{\partial f^1}{\partial x_M}(a) \\ \vdots & \ddots & \vdots \\ \frac{\partial f^N}{\partial x_1}(a) & \dots & \frac{\partial f^N}{\partial x_M}(a) \end{pmatrix}.$$
This matrix is called the Jacobi matrix of $f$ at the point $a$. If $M = N$, then its determinant is denoted by
$$\frac{\partial(f^1, \dots, f^M)}{\partial(x_1, \dots, x_M)} = J_f.$$
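The Jacobi matrix can be approximated column by column from difference quotients in the directions of the standard basis. The following sketch assumes a hypothetical map $f : \mathbb{R}^2 \to \mathbb{R}^2$, $f(x) = (x_1^2 x_2, \sin x_1 + x_2)$, and uses central differences; the function name `jacobian` and the step size are our choices.

```python
import math


def f(x):
    """Assumed example map from R^2 to R^2."""
    return [x[0] ** 2 * x[1], math.sin(x[0]) + x[1]]


def jacobian(func, a, step=1e-6):
    """Approximate the Jacobi matrix of func at a by central differences."""
    n_out = len(func(a))
    J = [[0.0] * len(a) for _ in range(n_out)]
    for j in range(len(a)):
        forward = list(a)
        backward = list(a)
        forward[j] += step      # a + step * e_j
        backward[j] -= step     # a - step * e_j
        f_plus, f_minus = func(forward), func(backward)
        for i in range(n_out):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * step)
    return J


a = [1.0, 2.0]
J = jacobian(f, a)
# Partial derivatives computed by hand for comparison.
exact = [[2.0 * a[0] * a[1], a[0] ** 2], [math.cos(a[0]), 1.0]]
max_error = max(abs(J[i][j] - exact[i][j]) for i in range(2) for j in range(2))
```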
simple reason for this observation comes from the fact that the directional derivative $\delta f(a; h)$ describes the behavior of the functional $f$ along the straight line $\{a + th : t \in \mathbb{R}\}$, i.e., the behavior of the real function $t \mapsto f(a + th)$ near zero.
Theorem 3.2.6 (Mean Value Theorem). Let $X$ be a normed linear space and $Y$ a Banach space. Let $f : X \to Y$ have the directional derivative at all points of the segment joining points $a, b \in X$ in the direction of this segment, i.e., $\delta f(a + t(b - a); b - a)$ exists for all $t \in [0, 1]$. If the mapping $t \mapsto \delta f(a + t(b - a); b - a)$ is continuous on $[0, 1]$, then
$$f(b) - f(a) = \int_0^1 \delta f(a + t(b - a); b - a)\,dt. \tag{3.2.1}$$
Proof. Take a $\varphi \in Y^*$ and denote
t [0, 1].
f (a + t(b a); b a) dt .
Since $\varphi \in Y^*$ has been chosen arbitrarily, the Hahn–Banach Theorem (in particular, Remark 2.1.17(ii)) implies the equality (3.2.1).
The following result oers another possible formulation.
Theorem 3.2.7 (Mean Value Theorem). Let $X$, $Y$ be normed linear spaces and let $f : X \to Y$. If for given $a, b \in X$ the directional derivative $\delta f(a + t(b - a); b - a)$ exists for all $t \in [0, 1]$, then
$$\|f(b) - f(a)\|_Y \le \sup_{t \in [0,1]} \|\delta f(a + t(b - a); b - a)\|_Y \tag{3.2.2}$$
and
$$\|f(b) - f(a) - \delta f(a; b - a)\|_Y \le \sup_{t \in [0,1]} \|\delta f(a + t(b - a); b - a) - \delta f(a; b - a)\|_Y. \tag{3.2.3}$$
Moreover, if $Df(a + t(b - a))$ exists for all $t \in [0, 1]$, then
$$\|f(b) - f(a)\|_Y \le \sup_{t \in [0,1]} \|Df(a + t(b - a))\|_{L(X,Y)} \|b - a\|_X. \tag{3.2.4}$$
Proof. An idea similar to the previous proof is used. By the dual characterization of the norm (Corollary 2.1.16) there is $\varphi \in Y^*$, $\|\varphi\| = 1$, such that
$$\|f(b) - f(a)\| = \varphi(f(b) - f(a)).$$
Define now
$$g(t) = \varphi(f(a + t(b - a))), \quad t \in [0, 1].$$
Then
$$g'(t) = \varphi(\delta f(a + t(b - a); b - a))$$
and, therefore, the function $g$ satisfies all assumptions of the classical Mean Value Theorem. Consequently, if $X$ is a real space, we get
$$\|f(b) - f(a)\| = g(1) - g(0) = g'(\vartheta) = \varphi(\delta f(a + \vartheta(b - a); b - a)) \le \|\delta f(a + \vartheta(b - a); b - a)\|$$
(see the next remark) and the assertion also follows. The proof of (3.2.3) is similar and (3.2.4) is an easy consequence of (3.2.2).
Remark 3.2.8. The Mean Value Theorem for functions from $\mathbb{R}$ to $\mathbb{R}$ is often stated in the following form:
There is $\vartheta \in (0, 1)$ such that
$$f(b) - f(a) = f'(a + \vartheta(b - a))(b - a)$$
provided $f$ is continuous on the interval $[a, b]$ and $f'(x)$ exists for every $x \in (a, b)$.
Warning. This equality does not hold even for $f : \mathbb{R} \to \mathbb{C}$ ($\cong \mathbb{R}^2$) (e.g., $f(x) = e^{ix}$, $a = 0$, $b = 2\pi$)!
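The warning can be checked numerically for $f(x) = e^{ix}$ on $[0, 2\pi]$. In the sketch below, which uses a midpoint rule with an arbitrarily chosen number of nodes, the increment $f(b) - f(a)$ vanishes and agrees with the integral form of the Mean Value Theorem, while $|f'|$ equals $1$ everywhere, so no single intermediate point can satisfy the classical equality.

```python
import cmath
import math


def f(x):
    return cmath.exp(1j * x)


def df(x):
    """Derivative of x -> exp(i x)."""
    return 1j * cmath.exp(1j * x)


a, b = 0.0, 2.0 * math.pi
increment = f(b) - f(a)

n = 1000
# Midpoint rule for the integral over [0, 1] of f'(a + t (b - a)) (b - a) dt.
integral = sum(df(a + (k + 0.5) / n * (b - a)) * (b - a) for k in range(n)) / n

# |f'(x)| sampled on [a, b]; it never comes close to zero.
smallest_derivative = min(abs(df(a + k / n * (b - a))) for k in range(n + 1))
```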
Example 3.2.9. Differentiability of the norm is connected with the properties of the corresponding space (see, e.g., Fabian et al. [49, Chapter 5]). As a simple example we will show the relation between the uniqueness of the supporting hyperplane at a given point $a \in X$, $\|a\| = 1$, and the Gâteaux differentiability of the norm at the point $a$. We recall that by Corollary 2.1.16 there is $\varphi \in X^*$, $\|\varphi\| = 1$, such that
$$\varphi(a) = \|a\| = 1 \quad \text{and} \quad \operatorname{Re} \varphi(x) \le 1$$
7 The
Put $f(x) = \|x\|$. Fix $h \in X$ and let $g(t) = \|a + th\|$, $t \in \mathbb{R}$. The function $g$ is a convex real function, and therefore there exist the right and the left derivatives at zero and $g'_-(0) \le g'_+(0)$. Further, we have
$$\frac{g(t) - g(0)}{t} \ge \frac{\varphi(a + th) - \varphi(a)}{t} = \varphi(h) \quad \text{for } t > 0.$$
In particular, $g'_+(0) \ge \varphi(h)$ and similarly $g'_-(0) \le \varphi(h)$. This means that $\varphi$ is uniquely determined provided the directional derivative of the norm exists at $a$ for all $h \in X$. In particular,
$$\delta f(a; h) = \varphi(h),$$
a + th a
g(t) g(0)
=
t
t
for
t > 0,
and therefore
(a + th) = 1 + t a + th.
The same inequality holds for t 0. As an easy consequence we get
(a + th) a + th,
(a) = 1.
= 1
where
Y = Lin{a, h}.
The HahnBanach Theorem yields an extension of which determines a supporting hyperplane. Since for a dierent we get a dierent there is no uniqueness of supporting hyperplanes at a and the duality mapping8 is not singlevalued
g
at a.
Similarly to partial derivatives, the Gâteaux derivative is also unsuitable for the Chain Rule for differentiability. We recommend that the reader construct examples of $f : \mathbb{R}^2 \to \mathbb{R}$, $g : \mathbb{R} \to \mathbb{R}^2$ such that $f(g)$ has no derivative at $0$ in spite of the fact that
$$Df(o) = 0, \quad g(0) = o,$$
Definition 3.2.10. Let $X$, $Y$ be normed linear spaces (both over the same scalar field). A mapping $f : X \to Y$ is said to be Fréchet differentiable at a point $a \in X$ if there exists $A \in L(X, Y)$ such that
$$\lim_{h \to o} \frac{\|f(a + h) - f(a) - Ah\|_Y}{\|h\|_X} = 0. \tag{3.2.5}$$
In this case $A$ is called the Fréchet derivative of $f$ at the point $a$ and is denoted by $f'(a)$.
Remark 3.2.11.
(i) If $f'(a)$ exists, then also $Df(a)$ exists. Moreover,
$$f'(a)h = Df(a)h \quad \text{for all } h \in X.$$
(ii) Suppose that a linear operator $A : X \to Y$ has the property (3.2.5). It is easy to see that $A$ is continuous if and only if $f$ is continuous at $a$.
(iii) A basic analytical approach to the investigation of nonlinear problems involves their approximation by simpler objects. Among them linear approximations are more appropriate from the local point of view. The classical notion of the derivative as the best local linear approximation is the most transparent confirmation of this phenomenon (e.g., the Fermat Theorem for local extremal points). The notion of Fréchet derivative is a genuine generalization to infinite dimensional spaces.
Theorem 3.2.12 (Chain Rule). Let $X$, $Y$, $Z$ be normed linear spaces and let there exist $\delta g(a; h)$ for $g : X \to Y$. If $g(a) = b$ and for $f : Y \to Z$ the Fréchet derivative $f'(b)$ exists, then
$$\delta(f \circ g)(a; h) = f'(b)[\delta g(a; h)].{}^{9} \tag{3.2.6}$$
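Proof. Choose $\varepsilon > 0$ and $h \in X$. By (3.2.5) there is $\eta > 0$ such that
$$\|f(b + k) - f(b) - f'(b)k\|_Z \le \varepsilon \|k\|_Y \quad \text{for } \|k\|_Y < \eta.$$
Put
$$\omega(t) \triangleq g(a + th) - g(a) - t\,\delta g(a; h).$$
By Definition 3.2.1, there is $\delta > 0$ such that $\|\omega(t)\|_Y \le \varepsilon |t|$ for $|t| < \delta$. For
$$k \triangleq g(a + th) - g(a) = g(a + th) - b$$
we have
$$\|k\|_Y \le |t| \|\delta g(a; h)\|_Y + \|\omega(t)\|_Y \le |t| [\|\delta g(a; h)\|_Y + \varepsilon].$$
We may choose $\delta$ so small that the right-hand side in this inequality is less than $\eta$ whenever $|t| < \delta$. Using all the information and $t\,\delta g(a; h) = k - \omega(t)$ we obtain
$$\left\| \frac{f(g(a + th)) - f(g(a))}{t} - f'(b)[\delta g(a; h)] \right\|_Z = \left\| \frac{f(b + k) - f(b) - f'(b)k}{t} + f'(b)\frac{\omega(t)}{t} \right\|_Z$$
$$\le \varepsilon \frac{\|k\|_Y}{|t|} + \|f'(b)\|_{L(Y,Z)} \frac{\|\omega(t)\|_Y}{|t|} \le \varepsilon [\varepsilon + \|\delta g(a; h)\|_Y + \|f'(b)\|_{L(Y,Z)}] \quad \text{for } 0 < |t| < \delta.$$
The formula (3.2.6) follows.
9 For more transparent notation we will often use the symbol $f \circ g$ instead of $f(g)$ for the composition of $f$ and $g$.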
Corollary 3.2.13. Let the hypotheses of Theorem 3.2.12 be satisfied. If, moreover, $Dg(a)$ exists, then also $D(f \circ g)(a)$ does exist and the analogue of (3.2.6) is true. A similar assertion is true for $(f \circ g)'(a)$ provided $g'(a)$ exists.
Proof. The assertion on $D(f \circ g)(a)$ follows from (3.2.6). The proof for $(f \circ g)'(a)$ is similar to that given above.
Corollary 3.2.14. Let $A \in L(Y, Z)$ and let $\delta f(a; h)$ exist for $f : X \to Y$. Then
$$\delta(Af)(a; h) = A\,\delta f(a; h)$$
and similarly for $D(Af)(a)$ and $(Af)'(a)$.
Proof. It is sufficient to show that
$$A'(y) = A \quad \text{for all } y \in Y,$$
(3.2.7)
D
f
(a
,
a
)h
1
1 2 1 t = o(t)
t
as t 0, and the result follows.
10 Then
Example 3.2.21. One of the most important nonlinear mappings is the so-called Nemytski operator which is sometimes also called the substitution (or superposition) operator. As the latter term indicates it arises by the substitution of a function $\varphi : G \subset \mathbb{R}^M \to \mathbb{R}$ into the function $f : G \times \mathbb{R} \to \mathbb{R}$. This leads to a new operator
$$F : \varphi \mapsto f(\cdot, \varphi(\cdot))$$
which acts on a space $X$ of functions $\varphi$. We wish to find conditions on $f$ for $F$ to be a mapping from $X$ into $X$ and to have some derivatives. We start with the case $X = C[0, 1]$.
It is clear that the continuity of $f$ on $[0, 1] \times \mathbb{R}$ is sufficient to guarantee that $F : X \to X$. Since $f$ is uniformly continuous on compact sets of the form
$$\{(x, y) \in [0, 1] \times \mathbb{R} : |y - \varphi(x)| \le 1\}$$
$F$ is also continuous on $X$.
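The substitution operator is easy to write down as a higher-order function. The sketch below assumes the hypothetical kernel $f(x, y) = x + \sin y$ and approximates the norm of $C[0, 1]$ by a maximum over a finite grid; the names `nemytski` and `sup_norm` are ours. Since $\sin$ is Lipschitz continuous with constant $1$, the distance of the images should not exceed the distance of the arguments.

```python
import math


def nemytski(f):
    """Return the operator phi -> f(., phi(.)) acting on functions on [0, 1]."""
    def F(phi):
        return lambda x: f(x, phi(x))
    return F


def sup_norm(u, n=1000):
    """Grid approximation of the sup norm on [0, 1]."""
    return max(abs(u(k / n)) for k in range(n + 1))


F = nemytski(lambda x, y: x + math.sin(y))     # assumed kernel f(x, y) = x + sin y

phi1 = lambda x: x * x
phi2 = lambda x: x * x + 0.01 * math.cos(5.0 * x)

gap_in = sup_norm(lambda x: phi1(x) - phi2(x))
gap_out = sup_norm(lambda x: F(phi1)(x) - F(phi2)(x))
```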
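Suppose now that the partial derivative $\frac{\partial f}{\partial y}$ is continuous on $[0, 1] \times \mathbb{R}$. For $\varphi, h \in X$ we have, by the classical Mean Value Theorem,
$$\frac{f(x, \varphi(x) + th(x)) - f(x, \varphi(x))}{t} = \frac{\partial f}{\partial y}(x, \varphi(x) + \vartheta(t, x)\,th(x))\,h(x)$$
for a $\vartheta(t, x) \in (0, 1)$ and
$$\sup_{x \in [0,1]} \left| \frac{f(x, \varphi(x) + th(x)) - f(x, \varphi(x))}{t} - \frac{\partial f}{\partial y}(x, \varphi(x))\,h(x) \right|$$
$$\le \sup_{0 \le \vartheta \le 1}\ \sup_{x \in [0,1]} \left| \frac{\partial f}{\partial y}(x, \varphi(x) + \vartheta th(x)) - \frac{\partial f}{\partial y}(x, \varphi(x)) \right| \|h\|_{C[0,1]}$$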
f
y
on compact sets).
f
(x, (x))h(x).
y
f
y
sup (x)ex 
x[0,)
and let $f(y) = \sin y$. Since $f$ is Lipschitz continuous with constant $1$ we obtain
$$\|F(\varphi_1) - F(\varphi_2)\| \le \|\varphi_1 - \varphi_2\|.$$
In particular, $F$ is a continuous mapping from $X$ into itself. But
$$\delta F(0; h) = \sin'(0)\,h, \quad h \in X,$$
x ,
is a measurable function on .
Proof. (i) Since a continuous function f (, y) is Lebesgue measurable, the assertion
is obvious.
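$$s(x) = \sum_{i=1}^{k} \alpha_i \chi_{\Omega_i}(x)$$
is a step function on $\Omega$, i.e., there are pairwise disjoint $\Omega_1, \dots, \Omega_k$ which are measurable,
$$\bigcup_{i=1}^{k} \Omega_i = \Omega \quad \text{and} \quad \chi_{\Omega_i}(x) = \begin{cases} 1, & x \in \Omega_i, \\ 0, & x \notin \Omega_i, \end{cases}$$
then
$$f(x, s(x)) = \sum_{i=1}^{k} f(x, \alpha_i)\,\chi_{\Omega_i}(x) \quad \text{for a.a. } x \in \Omega,$$
is measurable.
11 The lack of differentiability of the Nemytski operators in weighted spaces causes big problems in the use of the Implicit Function Theorem.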
for a.a. $x \in \Omega$ and all $y \in \mathbb{R}$. (3.2.8)
Then
(i) $F(\varphi) \in L^q(\Omega)$ for all $\varphi \in L^p(\Omega)$;${}^{12}$
(ii) $F$ is a continuous mapping from $L^p(\Omega)$ into $L^q(\Omega)$;
(iii) $F$ maps bounded sets in $L^p(\Omega)$ into bounded sets in $L^q(\Omega)$.
Proof. The proof of (i) is based on Proposition 3.2.23 and the use of the Minkowski inequality (Example 1.2.16) and it is straightforward.
The proof of (ii) is quite involved and its crucial step consists in the fact that $F$ maps sequences converging in measure into sequences with the same property. We omit details (see, e.g., Krasnoselski [78, §I.2] or Appell & Zabreiko [8]).
The property (iii) follows from the growth condition (3.2.8).
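Remark 3.2.25. The Carathéodory property can be generalized to functions $f : \Omega \times \mathbb{R}^M \to \mathbb{R}$. Proposition 3.2.23 and Theorem 3.2.24 hold similarly for
$$F(\varphi_1, \dots, \varphi_M)(x) \triangleq f(x, \varphi_1(x), \dots, \varphi_M(x)).$$
Remark 3.2.26. Let $\Omega \subset \mathbb{R}^N$ be an open subset of $\mathbb{R}^N$ and $f : \Omega \times \mathbb{R}^{N+1} \to \mathbb{R}$ satisfy the Carathéodory property. Assume, moreover, there exist $g \in L^q(\Omega)$ and $c \in \mathbb{R}$ such that
$$|f(x, y)| \le g(x) + c \sum_{i=0}^{N} |y_i|^{p/q}$$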
12 Actually,
f
(x, (x))h(x)
y
(3.2.9)
f
y (, ())
L (),
f
y
For a.a. x the function under the integral sign can be estimated by the Mean
Value Theorem (the formula (3.2.1)):
 f (x, (x) + th(x)) f (x, (x)) f
(x,
(x))h(x)
t
y
 f
1
f
for
()
g
(0, s) t
s=0
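is called the second directional derivative (in the directions $h$, $k$). Notice that generally $\delta^2 f(a; h, k) \ne \delta^2 f(a; k, h)$. (Find an example for $f : \mathbb{R}^2 \to \mathbb{R}$!) It is easy to see that for $f : \mathbb{R}^M \to \mathbb{R}$ we have
$$\delta^2 f(a; e_i, e_j) = \frac{\partial^2 f}{\partial x_i \partial x_j}(a)$$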
(See a similar assertion in Proposition 1.2.10 for a linear operator.) Denoting the space of all continuous bilinear operators from $X \times X$ into $Y$ by $B_2(X, Y)$ we see that the least possible constant $c$ in the above inequality is a norm on $B_2(X, Y)$. See also the important Proposition 2.1.7.
t [0, 1].
for all
h, k X.
Proof. Similarly to the proof of the classical result on mixed partial derivatives we express the difference
$$f(a + h + k) - f(a + h) - f(a + k) + f(a)$$
which is equal to $g_i(1) - g_i(0)$ for
$$g_1(t) \triangleq f(a + th + k) - f(a + th), \quad t \in [0, 1],$$
s [0, 1].
16 If for $n \in \mathbb{N}$ the $n$th directional derivative $\delta^n f(a; h, \dots, h)$ exists for all $h \in X$, then the mapping $h \mapsto f(a) + \sum_{k=1}^{n} \frac{1}{k!}\,\delta^k f(a; \underbrace{h, \dots, h}_{k\text{-times}})$ is called the Taylor polynomial of the degree $n$ of $f$ at the point $a$.
Since $f''(a)$ exists, both the mappings $f$ and $f'$ are defined on a neighborhood $\mathcal{U}$ of $a$. Elements $h$, $k$ are chosen so small that all variables belong to $\mathcal{U}$.
We can express
gi (1) gi (0) = Ai + gi (0)
and
g1 (0) = e1 (h, k) + f (a)(k, h),
g2 (0) = e2 (k, h) + f (a)(h, k),
(3.2.13)
(3.2.14)
Choose now > 0 and > 0 corresponding to the denition of f (a) such that
f (a + u)v f (a)v f (a)(u, v) uv for
Then every term on the righthand side of (3.2.14) is bounded by (h + k)2
provided h, k < . The same estimate holds for e1 (h, k) and similarly also
for g2 (t) g2 (0), e2 (k, h).
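By (3.2.13) we obtain
$$\|f''(a)(h, k) - f''(a)(k, h)\| \le \|A_1\| + \|A_2\| + \|e_1(h, k)\| + \|e_2(k, h)\| \le 8\varepsilon [\|h\| + \|k\|]^2 \tag{3.2.15}$$
provided $\|h\|, \|k\| < \delta$. Choose $h_0, k_0 \in X$ and put
$$h = \tau h_0, \quad k = \tau k_0.$$
For a sufficiently small $\tau$ the estimate (3.2.15) holds. Because of the bilinearity of $f''(a)$ we get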
f (a)(h0 , k0 ) f (a)(k0 , h0 ) 8[h0 2 + k0 2 ]
This completes the proof.
Remark 3.2.29.
(i) It is not difficult to see that the existence of $f''(a)$ implies the existence of $D^2 f(a)$ and the equality
$$f''(a)(h, k) = D^2 f(a)(h, k).$$
It is also possible to prove that the continuity of $D^2 f$ on an open set $G \subset X$ (as a mapping from $G$ into $B_2(X, Y)$) is equivalent to the continuity of $f''$ on $G$. In this case we write $f \in C^2(G)$.
(ii) If $X = \mathbb{R}^M$, $Y = \mathbb{R}$ and $D^2 f(a)$ exists for $f : \mathbb{R}^M \to \mathbb{R}$, then it is sufficient to know the values
$$D^2 f(a)(e_i, e_j), \quad i, j = 1, \dots, M,$$
to determine $D^2 f(a)$. This means that $D^2 f(a)$ (and also $f''(a)$) can be represented by the matrix (the so-called Hess matrix)
$$\left( \frac{\partial^2 f}{\partial x_i \partial x_j}(a) \right).$$
Exercise 3.2.30. Let $A \in L(X, Y)$ and $B \in B_2(X, Y)$. Compute $A'$, $B'$ and $B''$!
Exercise 3.2.31. Let $f : X \to Y$ be injective on an open set $G \subset X$. Denote $(f|_G)^{-1} = g$. Suppose that $f'(a)$ and $g'(b)$ exist for an $a \in G$, $f(a) = b$. Is it true that
$$g'(b) = [f'(a)]^{-1}?$$
(For conditions which guarantee the existence of $g'(b)$ see Section 4.1.)
Exercise 3.2.32. Put $\Phi(A) = A^{-1}$ for an invertible $A \in L(X, Y)$ (here $X$, $Y$ are Banach spaces). Show that
$$\Phi'(A)(H) = -A^{-1} H A^{-1},$$
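The formula for the derivative of the inversion can be sanity-checked in finite dimensions. The following sketch assumes hypothetical $2 \times 2$ matrices $A$ and $H$ and compares the difference quotient of $A \mapsto A^{-1}$ in the direction $H$ with $-A^{-1} H A^{-1}$; it is a numerical plausibility check, not a proof, and all helper names are ours.

```python
def inverse(A):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def multiply(P, Q):
    """Product of two 2x2 matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine(P, Q, scale):
    """Return P + scale * Q entrywise."""
    return [[P[i][j] + scale * Q[i][j] for j in range(2)] for i in range(2)]


A = [[2.0, 1.0], [1.0, 3.0]]        # assumed invertible matrix
H = [[0.0, 1.0], [-1.0, 0.5]]       # assumed direction
t = 1e-6

# Difference quotient ((A + tH)^{-1} - A^{-1}) / t.
difference = combine(inverse(combine(A, H, t)), inverse(A), -1.0)
quotient = [[entry / t for entry in row] for row in difference]

# Candidate derivative -A^{-1} H A^{-1}.
A_inv = inverse(A)
candidate = [[-entry for entry in row] for row in multiply(multiply(A_inv, H), A_inv)]

error = max(abs(quotient[i][j] - candidate[i][j]) for i in range(2) for j in range(2))
```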
be functionals dened on W
1,p
() (here u(x)
(2
N '
u(x)
xi
i=1
12
). Prove that
u(x)p2 (u(x), v(x)) dx,
g (u)v =
u(x)p2 u(x)v(x) dx.
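Hint. Let
$$\psi(t) = |t|^{p-2} t, \quad t \ne 0, \qquad \psi(0) = 0.$$
Then $\psi$ is continuous and $\frac{d}{dt}\left( \frac{1}{p} |t|^p \right) = \psi(t)$, $t \in \mathbb{R}$. Similarly, for $y \in \mathbb{R}^N$, $\|y\| = \left( \sum_{i=1}^{N} y_i^2 \right)^{1/2}$, set
$$\Psi(y) = \|y\|^{p-2} y, \quad y \ne o, \qquad \Psi(o) = o.$$
Then $\nabla\left( \frac{1}{p} \|y\|^p \right) = \Psi(y)$ for all $y \in \mathbb{R}^N$.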
Exercise 3.2.36. Find conditions on $k$ and $f$ for the so-called Hammerstein operator
$$H(\varphi)(t) = \int_a^b k(t, s) f(s, \varphi(s))\,ds$$
2
(s) ds
(ii) F ()(t) =
as
and
F : L2 (0, 1) R?
a, b, c L (0, 1).
where
F () =
C 1 [0, 1].
p
(the conjugate exponent) and F ()(x) =
If f (, 0) Lp () where p = p1
f (x, (x)), show the following facts:
(i) F maps Lp () into Lp ().
Hint. Integrate f
y and use the above estimate and Theorem 3.2.24.
p
(ii) F ()h : x f
y (x, (x))h(x) for all h L ().
Hint. Proceed similarly to the main text. Use the H
older inequality to show
p
p
q
(x,
(x))
maps
L
()
into
L
(),
q = p2
.
that Fy ()(x) f
y
Figure 3.2.1. (Graph of $f(x)$ with the line $f'(a)(x - a) + f(a)$ at the point $a$; marked points $x_1$, $x_2$, $y_1$.)
(3.2.17)
y0 = o.
(3.2.18)
which are exactly the iterations from Figure 3.2.1. If the sequence of iterations $\{y_n\}_{n=1}^{\infty}$ converges to $y$, then
$$f(a + y) = o$$
as follows from (3.2.16). Our goal is to show:
(A1) There is $\delta > 0$ such that $F$ maps $B(o; \delta)$ into itself and it is a contraction on this ball.
for
x, y B(a; ).
A1 (x)
Indeed, we can write
1 L
for
x B(a; ).
n=0
1
1 L
2
Lw x.
Step 3. We have
r(y)
L 2
and
for
y, z B(o; )
where
r(y) f (a) f (a + y) + A(a + y)y
(see (3.2.16)). Indeed, by Theorem 3.2.6, we get
r(y) =
0
and
r(y) r(z) = f (a + z) f (a + y) + A(a + y)y A(a + z)z
1
=
[A(a + t(z y)) A(a + y)](z y) dt + [A(a + y) A(a + z)]z.
0
y, z B(o; )
for a
q (0, 1).
Moreover,
F (y) F (y) F (o) + F (o) q + f (a) ,
provided f (a) is suciently small.
Step 5. We can now prove the assertion (A2). By (3.2.18) and Theorem 3.2.6,
$$f(x_n) = f(x_n) - f(x_{n-1}) - f'(x_{n-1})(x_n - x_{n-1}) = \int_0^1 [f'(x_{n-1} + t(x_n - x_{n-1})) - f'(x_{n-1})](x_n - x_{n-1})\,dt.$$
Hence
$$\|f(x_n)\| \le \frac{L}{2} \|x_n - x_{n-1}\|^2,$$
and also
Remark 3.2.43. The drawback of the iteration procedure (3.2.18) consists in the requirement to compute the inverse to the derivative at each step. This is the price for fast convergence. One can assume that by replacing $[f'(x)]^{-1}$ by the fixed inverse $[f'(a)]^{-1}$ we should avoid this disadvantage. This idea is also due to I. Newton. Conditions for convergence of these iterations were found by Kantorovich (see Kantorovich [72]). Serious problems appear when the derivative $f'(x)$ is injective but not continuously invertible.
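The trade-off between the two procedures can be seen already in one dimension. The sketch below assumes the hypothetical scalar equation $f(x) = x^2 - 2 = 0$ with starting point $1.5$; `newton` recomputes the derivative at each step, while `simplified_newton` keeps the derivative at the starting point fixed. The tolerance and the step limit are arbitrary choices of ours.

```python
import math


def f(x):
    return x * x - 2.0


def df(x):
    return 2.0 * x


def newton(x, tol=1e-12, max_steps=100):
    """Newton iterations: the derivative is re-evaluated at every step."""
    steps = 0
    while abs(f(x)) > tol and steps < max_steps:
        x -= f(x) / df(x)
        steps += 1
    return x, steps


def simplified_newton(x, tol=1e-12, max_steps=100):
    """Iterations with the derivative frozen at the starting point."""
    slope = df(x)
    steps = 0
    while abs(f(x)) > tol and steps < max_steps:
        x -= f(x) / slope
        steps += 1
    return x, steps


root_n, steps_n = newton(1.5)
root_s, steps_s = simplified_newton(1.5)
```

Both variants are expected to approach $\sqrt{2}$; the version with the frozen derivative converges only linearly and therefore needs more steps in this example.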
In applications, e.g., to nonlinear partial differential equations, we have many possibilities of the choice of Banach spaces $X$, $Y$ such that $f : X \to Y$ (see, e.g., Example 1.2.25 and Example 2.1.29). It can happen that
[f (x)]1 L(Y , X )
where
X X .
A > 0.
(ii) The same as in (i) for the Kantorovich approximations (Remark 3.2.43).
Chapter 4
Local Properties of
Differentiable Mappings
4.1 Inverse Function Theorem
In this section we are looking for conditions which allow us to invert a map $f : X \to Y$, especially $f : \mathbb{R}^M \to \mathbb{R}^N$. The simple case of a linear operator $f$ indicates that a reasonable assumption is that $M = N$.
Let us start with the simplest case $M = N = 1$. The well-known theorem says that if $f$ is continuous and strictly monotone on an open interval $I$, then $f$ is injective and $f(I)$ is an open interval $J$. Moreover, the inverse function $f^{-1}$ is continuous on $J$.
It is not clear how to generalize the monotonicity assumption to $\mathbb{R}^M$ (cf. Section 5.3), and without it the theorem is not true even in $\mathbb{R}$. Since the monotonicity of a differentiable function $f : \mathbb{R} \to \mathbb{R}$ is a consequence of the sign of the derivative of $f$, we take into consideration also $f'$. The example $f(x) = x^2$ where $f$ is not injective in any neighborhood of the origin shows that we have to assume $f'(x) \ne 0$. In fact, if $f$ is continuous on an open interval $I$, $f'(x)$ exists (possibly infinite) at all points of $I$, and $f'$ does not vanish at any point of $I$, then $f$ is injective (actually strictly monotone since $f'$ is either strictly positive or strictly negative in $I$), and $f^{-1}$ is continuous and differentiable on the open interval $f(I)$.
Therefore, we are looking for a generalization of the assumption $f'(x) \ne 0$ for maps $f : \mathbb{R}^M \to \mathbb{R}^M$. Since we are interested in a (unique) solution of the equation
$$f(x) = y,$$
the case of a linear function $f : \mathbb{R}^M \to \mathbb{R}^M$ (then $f'(x) = f$) suggests assuming that $f'(x)$ is either an injective or, equivalently because of the finite dimension, a surjective linear map. In both cases, $f'(x)$ is an isomorphism of $\mathbb{R}^M$ onto $\mathbb{R}^M$ (for the case of Banach spaces see Theorem 2.1.8).
or $g(z) = e^z$, $z \in \mathbb{C}$.
$g'(z) \ne 0$,
and $f(r, \cdot)$ is $2\pi$-periodic and $g$ is $2\pi i$-periodic, i.e., $f$ and $g$ are not injective.
Therefore, we cannot expect more than only local invertibility. The reason is simple: since the notion of derivative is a local one, we can deduce only local information from it.
After these preliminary considerations we can state the main theorem. Since there is no simplification in the case of finite dimension, we formulate it for general Banach spaces.
Theorem 4.1.1 (Local Inverse Function Theorem). Let $X$, $Y$ be Banach spaces, $G$ an open set in $X$, $f : X \to Y$ continuously differentiable on $G$. Let the derivative $f'(a)$ be an isomorphism of $X$ onto $Y$ for $a \in G$. Then there exist neighborhoods $\mathcal{U}$ of $a$, $\mathcal{V}$ of $f(a)$ such that $f$ is injective on $\mathcal{U}$, $f(\mathcal{U}) = \mathcal{V}$. If $g$ denotes the inverse to the restriction $f|_{\mathcal{U}}$, then $g \in C^1(\mathcal{V})$.
Proof. We will solve the equation
$$f(x) = y$$
for a fixed $y$ near the point $b = f(a)$ by the iteration process. To do that we have to rewrite the equation $f(x) = y$ as an equation in $X$. We denote by $A$ the inverse map $[f'(a)]^{-1} \in L(Y, X)$. Then
$$f(x) = y \iff F_y(x) \triangleq x - A[f(x) - y] = x. \tag{4.1.1}$$
The simplest condition for the convergence of iterations is given by the Contraction Principle (see Theorem 2.3.1). We have
$$\|F_y(x_1) - F_y(x_2)\| = \|x_1 - x_2 - A[f(x_1) - f(x_2)]\| \le \|A\| \|f(x_2) - f(x_1) - f'(a)(x_2 - x_1)\| \le \|A\| \sup_{\xi \in B(a;r)} \|f'(\xi) - f'(a)\| \|x_1 - x_2\|,$$
$x_1, x_2 \in B(a; r)$. (Here we have used the Mean Value Theorem (see formula (3.2.3)).) In other words, we can choose $r > 0$ so small that
$$\|F_y(x_1) - F_y(x_2)\| \le \frac{1}{2} \|x_1 - x_2\|. \tag{4.1.2}$$
Moreover, since $F_y(a) - a = A(y - b)$,
$$\|F_y(x) - a\| \le \|F_y(x) - F_y(a)\| + \|F_y(a) - a\| \le \frac{1}{2} \|x - a\| + \|A\| \|b - y\|.$$
If $\delta > 0$ is such that $\|A\| \delta \le \frac{r}{2}$, then $F_y(x) \in B(a; r)$ provided $x \in B(a; r)$, $y \in B(b; \delta)$. By the Contraction Principle, the equation (4.1.1) has a unique solution
in $B(a; r)$,
$x \triangleq g(y) \in B(a; r)$
$$\le \frac{1}{2} \|x_1 - x_2\| + \|A\| \|y_1 - y_2\|,$$
i.e.,
$$\|g(y_1) - g(y_2)\| \le 2 \|A\| \|y_1 - y_2\|. \tag{4.1.3}$$
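The iteration used in this proof, $x \mapsto x - A[f(x) - y]$ with the fixed operator $A = [f'(a)]^{-1}$, can be tried on a small example. The sketch below assumes the hypothetical map $f(x_1, x_2) = (x_1 + x_2^2, x_2 + x_1^2)$ on $\mathbb{R}^2$ with $a = o$, so that $f'(a)$ is the identity and $A$ is the identity as well; the target $y$ and the number of steps are arbitrary choices of ours.

```python
def f(p):
    """Assumed example map on R^2 whose derivative at the origin is the identity."""
    x, y = p
    return (x + y * y, y + x * x)


def solve_near_origin(target, steps=50):
    """Iterate p <- p - A[f(p) - target] with A = [f'(o)]^{-1} = identity."""
    p = (0.0, 0.0)
    for _ in range(steps):
        value = f(p)
        p = (p[0] - (value[0] - target[0]), p[1] - (value[1] - target[1]))
    return p


target = (0.1, 0.05)               # a point y close to b = f(o) = o
preimage = solve_near_origin(target)
residual = f(preimage)             # should reproduce the target
```

Close to the origin the derivative of the iteration map is small, which is what makes it a contraction there; for targets far from $b$ no convergence should be expected from this sketch.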
Put
h = g(y + k) g(y),
We have
k = f (x + h) f (x).
i.e.,
By the denition of the Frechet derivative, for any > 0 there is > 0 such that
f (x + h) f (x) f (x)h h
i.e.,
This also implies the continuity of $g'(y)$ since the inverse $[f'(x)]^{-1}$ depends continuously on $x$ (see Exercise 2.1.33).
To complete the proof it remains to put
V = B(b; )
and
Example 4.1.3. If $f \in C^k(G)$, $k \in \mathbb{N}$, then $g \in C^k(\mathcal{V})$. This follows easily from the formula
$$g'(y) = [f'(x)]^{-1}, \quad x = g(y),$$
the Chain Rule and Exercise 3.2.32.
Definition 4.1.4. Let $X$, $Y$ be Banach spaces. Then $f : X \to Y$ is called a diffeomorphism of $G \subset X$ (or a diffeomorphism of $G$ onto $H = f(G)$) if the following conditions are satisfied:
(1) $G$ is an open set in $X$, $f \in C^1(G)$,
(2) $f(G) = H$ is an open set in $Y$,
(3) $f$ is injective on $G$ and the inverse $g = (f|_G)^{-1}$ belongs to $C^1(H)$.
If, moreover, $f \in C^k(G)$ for some $k \in \mathbb{N}$, and (therefore) $g \in C^k(H)$, then $f$ is called a $C^k$-diffeomorphism.
A diffeomorphism in $\mathbb{R}^M$ can be viewed as a nonlinear generalization of a linear invertible operator $A : \mathbb{R}^M \to \mathbb{R}^M$. Such $A$ yields a linear transformation of coordinates
$$y = Ax.$$
If $\Phi$ is a diffeomorphism of $G$ onto $H$ and $a \in G$, we can suppose without loss of generality that
$$\Phi(a) = o$$
(if this is not true consider a new dieomorphism on G: (x)
y = r sin
y = r sin 1 cos 2 ,
z = r sin 2 ;
i = 1, . . . , M,
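have continuous partial derivatives (obvious) and their Jacobi matrix is regular. Equivalently, the determinant $J$ of the Jacobi matrix is nonzero at a point $(\tilde{r}, \tilde{\varphi}_1, \dots, \tilde{\varphi}_{M-1})$, $\Phi(\tilde{r}, \tilde{\varphi}_1, \dots, \tilde{\varphi}_{M-1}) = a$. Here
$$J = r^{M-1} \prod_{k=1}^{M-2} \cos^k \varphi_{k+1}, \quad M \ge 2.{}^{1}$$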
for all h X,
(4.1.4)
p
0
j=1
0
aj a1 ap ( 1).
for all h X
where X is a weaker norm than X . By this we mean that only the estimate
hX dhX
(e.g., X = C 1 [0, 1], hX = sup h(t) + sup h (t), hX = sup h(t)). Then
t[0,1]
t[0,1]
t[0,1]
for all
x X.
t [0, 1].
in particular,
y = (1) = f ((1)).
(4.1.5)
for all t1 , t2 [0, ), the mapping is uniformly continuous on the interval [0, ),
hence
lim (t) ()
t
exists (X is a complete space) and the equality (4.1.5) holds for all t [0, ].
Now we are ready to prove that = 1. Indeed, if < 1, then we can apply
Theorem 4.1.1 at the point () to obtain a contradiction with the denition of
.
Step 2. The map $f$ is injective. Suppose by contradiction that there are different $x_1, x_2 \in X$ for which
$$f(x_1) = f(x_2).$$
Put
y f (x2 ),
i (t) = f (i (t)),
t [0, 1], i = 1, 2.
Then
f (G(1, s)) = (1 s)f (x1 ) + sf (x2 ) = y
x = 0,
f (0) = 0.
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \quad \text{for } u \in C^2(G).$$
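$$\frac{\partial^2 v}{\partial r^2} + \frac{1}{r^2} \frac{\partial^2 v}{\partial \varphi^2} + \frac{1}{r} \frac{\partial v}{\partial r}$$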
2 u 2u
x2 , y 2 .
$$f(x, \varphi(x)) = 0$$
we have (formally by the Chain Rule)
$$\frac{\partial f}{\partial x}(a, b) + \frac{\partial f}{\partial y}(a, b)\,\varphi'(a) = 0, \tag{4.2.1}$$
i.e., $\varphi'(a) = -\frac{a}{b}$, since $\frac{\partial f}{\partial y}(a, b) = 2b \ne 0$.
In the latter case, where $(a, b) = (1, 0)$, we have $\frac{\partial f}{\partial y}(1, 0) = 0$, and $\varphi'(1)$ cannot be determined from (4.2.1). The tangent line to $M$ at the point $(1, 0)$ is parallel to the $y$-axis, which indicates some problems with determining a solution, i.e., the (implicit) function $\varphi$. The reader is invited to sketch a figure.
This discussion shows the importance of the assumption
$$\frac{\partial f}{\partial y}(a, b) \ne 0.$$
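The formula $\varphi'(a) = -a/b$ can be compared with a difference quotient. The sketch below assumes that the set $M$ is the unit circle, i.e., $f(x, y) = x^2 + y^2 - 1$ (which is consistent with $\frac{\partial f}{\partial y}(a, b) = 2b$ but is our reading of the example), and uses the upper branch $\varphi(x) = \sqrt{1 - x^2}$ at the arbitrarily chosen point $a = 0.6$.

```python
import math


def f(x, y):
    """Assumed defining function of the unit circle."""
    return x * x + y * y - 1.0


def phi(x):
    """Upper branch of the implicit function, valid for |x| < 1."""
    return math.sqrt(1.0 - x * x)


a = 0.6
b = phi(a)                      # the point (a, b) lies on the circle, b > 0

h = 1e-6
numeric_slope = (phi(a + h) - phi(a - h)) / (2.0 * h)   # central difference
formula_slope = -a / b                                  # slope from (4.2.1)
```

As $a$ approaches $1$ the value $b$ tends to $0$ and the slope $-a/b$ blows up, which matches the vertical tangent at $(1, 0)$.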
How can this assumption be generalized to $f : \mathbb{R}^{M+N} \to \mathbb{R}^N$? A brief inspection of the linear case leads to the observation that we can compute the unknowns $y_{M+1}, \dots, y_{M+N}$ from the equations
$$f_i(y_1, \dots, y_{M+N}) = \sum_{j=1}^{M+N} a_{ij} y_j = 0, \quad i = 1, \dots, N,$$
i=1,...,N
j=M+1,...,M +N
= 0.
fi
, and the condition on the regularity of the matrix
Nevertheless, aij = y
j
(aij )
means that the partial (Frechet) derivative of f (see Denii=1,...,N
j=M+1,...,M +N
148
i.e.,
x = ,
U = U,
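for every $x \in U$
and both the functions $f$ and $\varphi$ are differentiable, we get from the Chain Rule
$$f'_1(x, \varphi(x)) + f'_2(x, \varphi(x)) \circ \varphi'(x) = o,$$
and therefore
$$\varphi'(x) = -[f'_2(x, \varphi(x))]^{-1} \circ f'_1(x, \varphi(x)) \qquad \text{for } x \in U_1. \tag{4.2.2}$$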
k1 + + kl = n,
and
F C (C Cn ).
t I,
RN ,
C(I, RN ) whenever (, , ) H.
Since these partial Frechet derivatives are continuous, C 1 (H) (see Proposition 3.2.18). The crucial assumption of the Implicit Function Theorem is the continuous invertibility of 3 (t0 , x0 , (; t0 , x0 )) in the space C(I, RN ). Put
t
B(t) =
f2 (s, (s; t0 , x0 ))(s) ds,
C(I, RN ).
t0
(; t0 , x0 )
and
()
(; t0 , x0 )
t0
+
t0
RN .
(4.2.4)
(t0 ) = I.
(4.2.5)
g
(t(), ) = 0,
U(x0 ).
Figure 4.2.1.
RN .
U M.
For more details the interested reader can consult, e.g., Amann [4, Section 23].
We are often interested in the asymptotic behavior of solutions of a system of ordinary differential equations (linear or nonlinear), e.g., in the boundedness of solutions or in their convergence to some special solutions (constant, periodic, etc.). In the following example we briefly sketch a method which can be used.
Example 4.2.7. Consider the equation
$$\dot{x} = Ax + f \tag{4.2.6}$$
If we are interested in bounded solutions only on $\mathbb{R}^+ = [0, \infty)$, a similar computation shows that all such solutions for $f \in BC(\mathbb{R}^+, \mathbb{R}^N)$ are given by
$$x(t) = e^{tA}x^- + \int_0^t e^{(t-s)A} P^- f(s)\,ds - \int_t^{+\infty} e^{(t-s)A} P^+ f(s)\,ds \tag{4.2.8}$$
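Formula (4.2.8) can be tested on a small example. The sketch below assumes the diagonal matrix $A = \operatorname{diag}(-1, 1)$, so that $P^-$ and $P^+$ select the first and the second component, and the forcing $f(s) = (1, \cos s)$; both choices, the truncation of the infinite integral and the quadrature rule are illustrative assumptions.

```python
import math

def simpson(g, lo, hi, n=4000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3.0

def f(s):
    # Assumed bounded forcing term with values in R^2.
    return (1.0, math.cos(s))

def bounded_solution(t, x_minus, tail=40.0):
    # Stable component (A = -1): e^{tA} x^- + int_0^t e^{(t-s)A} P^- f(s) ds.
    stable = math.exp(-t) * x_minus + simpson(
        lambda s: math.exp(-(t - s)) * f(s)[0], 0.0, t)
    # Unstable component (A = +1): -int_t^infty e^{(t-s)A} P^+ f(s) ds,
    # with the infinite tail truncated at t + tail.
    unstable = -simpson(lambda s: math.exp(t - s) * f(s)[1], t, t + tail)
    return stable, unstable

t, x_minus = 2.0, 3.0
print(bounded_solution(t, x_minus))
# Closed forms for this example, for comparison:
print(1.0 + (x_minus - 1.0) * math.exp(-t), 0.5 * (math.sin(t) - math.cos(t)))
```

The unstable component stays bounded only because the integral runs over $[t, +\infty)$; integrating from $0$ to $t$ instead would produce an exponentially growing term.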
(4.2.9)
The interested reader can check this formula and also (4.2.8) as an exercise on the use of the Variation of Constants Formula.
Hint. Use the estimates $\|e^{tA}x\| \le c\,e^{-\alpha t}\|x\|$ for $x \in X^-$, $t > 0$, and $\|e^{tA}x\| \le c\,e^{\alpha t}\|x\|$ for $x \in X^+$, $t < 0$, where the positive constants $\alpha$, $c$ are independent of $t$ and $x$, $\alpha$ is such that $\sigma(A) \cap \{\lambda \in \mathbb{C} : |\operatorname{Re} \lambda| \le \alpha\} = \emptyset$ and $c$ depends on $\alpha$ only. These estimates follow from Functional Calculus (see Exercise 1.1.42) and they ensure that the integrals in (4.2.7) do exist.
Apply $P^+$ to both sides of the Variation of Constants Formula and send $t \to \infty$ to obtain
P + x(t) =
t
$g(y) = f(y)$ in a neighborhood of 0. For details see Hale [63, Sections III.6 and IV.3]. A solution in (4.2.8) depends on the parameter $x^-$, so we have the equation
(, )(t)
(t) etA
e(ts)A P g((s)) ds +
e(ts)A P + g((s)) ds = o
(o) = o,
i.e., $W^s_{loc}(x_0)$ is tangent to the stable manifold $X^-$ of the linear equation $\dot{x} = Ax$, see Figure 4.2.2.
Remark 4.2.8. It is sometimes convenient to define a solution of a nonlinear equation, in particular of a partial differential equation, in a more general way, without assuming that the solution has all the classical derivatives which appear in the equation (see Chapters 6 and 7). Actually, we have seen one such possibility in the reformulation of a differential equation as an integral equation
$$x = F(x)$$
where $F$ is given by the formula (2.3.6). Having a more general notion of solution, a natural question arises: Under which conditions is this solution smoother, in particular, is it a classical solution? Such results are known as regularity assertions. The Implicit Function Theorem can occasionally be used to prove such statements. See Theorem 6.1.14.
3 The so-called stable manifold $W^s(x_0)$ of the stationary point $x_0$ of the equation $\dot{x} = g(x)$ ($g(x_0) = o$) is defined as follows: Let $\varphi(\cdot, \xi)$ be a solution of this differential equation satisfying $\varphi(0, \xi) = \xi$. Then
$$W^s(x_0) := \{\xi : \lim_{t \to \infty} \varphi(t, \xi) = x_0\}$$
and a local stable manifold is defined by
$$W^s_{loc}(x_0) := \{\xi \in W^s(x_0) : \varphi(t, \xi) \in U \text{ for } t \ge 0\}$$
where $U$ is a neighborhood of $x_0$.
Notice the crucial assumption $\sigma(A) \cap i\mathbb{R} = \emptyset$ (i.e., $o$ is a so-called hyperbolic stationary point of the equation (4.2.9)) in the above argument. Figure 4.2.2 shows also the distinction between stable and local stable manifolds. It is worth mentioning that a similar approach cannot be used in the case $\sigma(A) \cap i\mathbb{R} \neq \emptyset$. Since there can exist eigenvalues on the imaginary axis of multiplicity greater than 1, we cannot expect a manifold consisting of bounded solutions. To get the so-called central manifold we are forced to solve a nonlinear version of the equations (4.2.7) in a weighted space instead of $BC(\mathbb{R}, \mathbb{R}^N)$. However, this problem is more difficult due to the lack of differentiability of the Nemytski operator (see footnote 11 on page 126). For details see, e.g., Chow, Li & Wang [25, Chapter 1] and references given there.
Figure 4.2.2. (The stable manifold $W^s(o)$, the local stable manifold $W^s_{loc}(o)$ and the subspace $X^+$ in $\mathbb{R}^N$.)
for (, ) V.
Prove that for any $x \in M$ there exists a unique $y(x) \in \mathbb{R}$ such that
$$f(x, y(x)) = 0$$
and, moreover, $y\colon x \mapsto y(x)$ is a continuous map from $M$ into $\mathbb{R}$.
Hint. Use the properties of real functions of one real variable.
Exercise 4.2.11. Let $M$ be a normed linear space, let $f$ be as in Exercise 4.2.10 and, moreover, $f \in C^k(M \times \mathbb{R})$ with some $k \in \mathbb{N}$. Then the implicit function $y = y(x)$ from Exercise 4.2.10 is of the class $C^k(M)$. Prove it!
Hint. Use Theorem 4.2.1.
Exercise 4.2.12. Give details which are omitted in Example 4.2.7.
Exercise 4.2.13. Let $A$ be a densely defined linear operator in a Hilbert space. Assume that $A$ has a compact self-adjoint resolvent. Extend the construction of the local stable manifold (Example 4.2.7) to the equation (4.2.6). See Exercise 3.1.19 for the properties of this equation.
Exercise 4.2.14. Assume that
f (x, y) =
ajk (x x0 )j (y y0 )k ,
x x0  < ,
y y0  < .
j,k=0
Moreover, let $a_{00} = 0$, $a_{01} \neq 0$. Apply the Implicit Function Theorem and show that the implicit function $y(x)$ is the sum of a power series in a neighborhood of $x_0$.
Note that for complex variables the result follows directly from the properties of holomorphic functions and Theorem 4.2.1. In the real case one has to prove that the formal power series for $y(x)$ has a positive radius of convergence.
The following proposition deals with the first non-singular case for the mapping $f\colon \mathbb{R}^M \to \mathbb{R}^N$, $M < N$. For the second one see Proposition 4.3.8.
Proposition 4.3.2. Let $f\colon \mathbb{R}^M \to \mathbb{R}^N$ be a differentiable map on an open set $G \subset \mathbb{R}^M$. Let $a \in G$ and let $f'(a)$ be injective. Let $Q$ be a (linear) projection of $\mathbb{R}^N$ onto $Y_1 := \operatorname{Im} f'(a)$. Then there exist neighborhoods $U$ of $a$, $V$ of $Qf(a)$ in $Y_1$, a diffeomorphism $\varphi$ of $U$ onto $V$ and a differentiable map $g\colon V \to \mathbb{R}^N$ such that
$$f = g \circ \varphi$$
(see Figure 4.3.1).
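Figure 4.3.1. (The neighborhood $U \subset G \subset \mathbb{R}^M$ of $a$, the decomposition $\mathbb{R}^N = Y_1 \oplus Y_2$ with $Y_1 = \operatorname{Im} f'(a)$, the projection $Q$, and the neighborhood $V$ of $Qf(a)$.)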
Proof. The proof is almost obvious from Figure 4.3.1. Put $\varphi = Q \circ f$. Then
$$\varphi'(a) = Qf'(a)$$
is an isomorphism of $\mathbb{R}^M$ onto $Y_1$. Since $\dim Y_1 = M$ is finite, $Y_1$ is a Banach space (as a closed subspace of the Banach space $\mathbb{R}^N$) and, by Theorem 4.1.1, $\varphi$ is a diffeomorphism of a neighborhood $U$ of $a$ onto a neighborhood (in $Y_1$) $V$ of $Qf(a)$. It suffices to put
$$g = f \circ \varphi^{-1}.$$
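The factorization $f = g \circ \varphi$ from the proof can be traced on a small example. The sketch below assumes the toy immersion $f(t) = (\sin t, t^2)$ of $\mathbb{R}$ into $\mathbb{R}^2$ at $a = 0$; this map and the helper names are illustrative assumptions.

```python
import math

def f(t):
    # Assumed toy immersion of R into R^2; f'(0) = (1, 0) is injective.
    return (math.sin(t), t * t)

def Q(y):
    # Projection of R^2 onto Y1 = Im f'(0), the first coordinate axis.
    return (y[0], 0.0)

def phi(t):
    # phi = Q o f maps U = (-pi/2, pi/2) diffeomorphically onto V = (-1, 1) x {0}.
    return Q(f(t))

def phi_inverse(v):
    return math.asin(v[0])

def g(v):
    # g = f o phi^{-1}, defined on V inside Y1.
    return f(phi_inverse(v))

points = (-1.0, -0.3, 0.0, 0.7, 1.2)
max_gap = max(abs(p - q) for t in points for p, q in zip(f(t), g(phi(t))))
print(max_gap)   # discrepancy between f and g o phi on the sample points
```

Here $g(v, 0) = (v, \arcsin^2 v)$ describes $f(U)$ as a graph over $Y_1$, which is the geometric content of the proposition.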
Remark 4.3.3.
(i) We have used the finite dimension of $Y \subset \mathbb{R}^N$ to ensure both the existence of a continuous linear projection $Q$ and the closedness of the range $\operatorname{Im} f'(a)$. If $f\colon X \to Y$, $X$, $Y$ are Banach spaces, then neither of these two conditions has to be satisfied. It follows from the proof that Proposition 4.3.2 holds under these two additional assumptions. We notice that these assumptions are superfluous provided $X$ has a finite dimension (see Remark 2.1.19).
(b) = o,
where h RM is such that
Qk = (a)h.
Moreover, y f (G) W if and only if there is an x G such that
y = f (x)
and
(I Q)(f (x)) = o.
(Qf ) (I Q)g(Qf ).
A L(RM , RN ).
(iv) A map $f$ which satisfies the assumptions of Proposition 4.3.2 at each point $a \in G$ is often called an immersion of $G$ into $\mathbb{R}^N$. An injective immersion which is also a homeomorphism of $G$ onto $f(G)$ (in the induced topology from $\mathbb{R}^N$) is called an embedding. Some examples of immersions which are not embeddings are shown in Figures 4.3.2 and 4.3.3. We note that we have already used the term embedding for an injective continuous linear operator.
Further examination of Proposition 4.3.2 leads to the following definition of a differentiable manifold. This notion is basic for differential geometry and global nonlinear analysis. In this textbook we will mostly use it for purposes of terminology only. Some basic facts on manifolds are given in Appendix 4.3A and will be used for developing the notion of degree (Appendix 4.3D).
Definition 4.3.4. A differentiable manifold of dimension $M$ and of the class $C^k$ is a subset $\mathcal{M}$ of $\mathbb{R}^N$ ($N \ge M$) with the following property:
For each $x \in \mathcal{M}$ there is a neighborhood $W$ of $x$ (in $\mathbb{R}^N$) and a $C^k$-diffeomorphism $\varphi$ of $W$ into $\mathbb{R}^N$ such that
$$\varphi(\mathcal{M} \cap W) = \{y = (y_1, \dots, y_N) \in \mathbb{R}^N : y_{M+1} = \dots = y_N = 0\} \cap \varphi(W).$$
A relative neighborhood $W \cap \mathcal{M}$ together with $\varphi$ is called a (local) chart at the point $x \in \mathcal{M}$. The first $M$ coordinates $(y_1, \dots, y_M)$ are called the local coordinates of $x$ on $\mathcal{M}$. The collection of all charts of $\mathcal{M}$ is called an atlas of $\mathcal{M}$.
Example 4.3.5.
(i) An open subset $G \subset \mathbb{R}^M$ is an $M$-dimensional differentiable manifold of the class $C^k$ for any $k \in \mathbb{N}$ (i.e., of the class $C^\infty$).
(ii) The graph of a function $f\colon \mathbb{R}^M \to \mathbb{R}$, $f \in C^k(G)$, $G$ an open subset of $\mathbb{R}^M$, is an $M$-dimensional differentiable manifold of the class $C^k$ in $\mathbb{R}^N$, $N \ge M + 1$.
(iii) Let
$$S^2 = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1\}$$
be the 2-dimensional sphere. Then $S^2$ is a 2-dimensional differentiable manifold of the class $C^\infty$ in $\mathbb{R}^N$, $N \ge 3$. Indeed, a chart for the upper open half-sphere can be constructed as follows: let
$$\varphi(x, y, z) = \left(x, y, z - \sqrt{1 - x^2 - y^2}\right), \qquad W = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 < 1,\ z > 0\}.$$
Then $\varphi$ is a diffeomorphism of $W$ into $\mathbb{R}^3$ and
$$\varphi(W \cap S^2) = \{(u, v, w) \in \mathbb{R}^3 : u^2 + v^2 < 1,\ w = 0\}.$$
We will see a more comfortable proof in Example 4.3.10.
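The chart of the upper half-sphere in (iii) is easy to test numerically. The following minimal sketch evaluates the chart on a few sample points of $W \cap S^2$ and checks its inverse on $W$; the function names and sample points are illustrative assumptions.

```python
import math

def chart(x, y, z):
    # The chart (x, y, z) -> (x, y, z - sqrt(1 - x^2 - y^2)) on W.
    return (x, y, z - math.sqrt(1.0 - x * x - y * y))

def chart_inverse(u, v, w):
    # Inverse map, which shows that the chart is a diffeomorphism onto its image.
    return (u, v, w + math.sqrt(1.0 - u * u - v * v))

def upper_sphere_point(x, y):
    return (x, y, math.sqrt(1.0 - x * x - y * y))

samples = [(0.0, 0.0), (0.3, -0.4), (-0.6, 0.5)]
images = [chart(*upper_sphere_point(x, y)) for x, y in samples]
print(images)   # the third coordinate vanishes for points of the half-sphere
```

Points of $W$ off the sphere, such as $(0.1, 0.2, 0.5)$, are mapped to points with a nonzero third coordinate, so the chart flattens exactly $W \cap S^2$.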
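Figure 4.3.4. (The decomposition $\mathbb{R}^M = X_1 \oplus X_2$ with $X_1 = \operatorname{Ker} f'(a)$, the neighborhoods $U$ of $o$ and $V$ of $a$, the level set $f(x) = f(a)$, and the map $A^{-1}$ from $\mathbb{R}^N$.)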
Let us denote
for any h X.
Remark 4.3.9.
(i) Proposition 4.3.8 together with its proof also holds for $f\colon X \to Y$, $X$, $Y$ Banach spaces, provided there exists a linear continuous projection $P$ of $X$ onto $\operatorname{Ker} f'(a)$. The continuity of $A^{-1}$ follows in this case from the Open Mapping Theorem (Theorem 2.1.8). The existence of such a projection $P$ can be shown in two important cases, namely, when $Y$ has finite dimension (and therefore $\operatorname{Ker} f'(a)$ has finite codimension, see Example 2.1.12) or $\operatorname{Ker} f'(a)$ has finite dimension (Remark 2.1.19).
(ii) Notice that $\varphi$ can be viewed as a local (nonlinear) transformation of coordinates in which $f$ is a linear map, namely
$$f(\varphi(y)) = f(a) + f'(a)y, \qquad y \in U.$$
This formula also shows that all points in $V$ are regular. Moreover, if $z$ is sufficiently close to $b = f(a)$, then
$$y = A^{-1}(z - b) \in U$$
and
$$f(\varphi(y)) = z.$$
This shows that $f(G)$ is an open set in $\mathbb{R}^N$ provided all points of $G$ are regular.
(iii) In the terms of differentiable manifolds (Definition 4.3.4) the statement of Proposition 4.3.8 can be formulated as follows:
If $f\colon \mathbb{R}^M \to \mathbb{R}^N$ is a differentiable map in an open set $G \subset \mathbb{R}^M$, $b \in \mathbb{R}^N$, then the set
$$\{x \in G : f(x) = b\}$$
is a differentiable manifold (either empty or of dimension $M - N$) provided $b$ is a regular value of $f$.
(iv) Proposition 4.3.8 imposes certain restrictions on the set
$$\{x \in \mathbb{R}^M : f(x) = f(a)\}.$$
In Figures 4.3.5–4.3.7 there are some cases in which $a$ is not a regular point (i.e., it is a critical point). The value $f(a)$ is critical in all cases.
Figure 4.3.5. Figure 4.3.6. Figure 4.3.7. (Level sets through a critical point $a$; in the last figure $a$ is a cusp.)
b = 0.
The assertions of the last two propositions are part of the following more general result.
Theorem 4.3.11 (Rank Theorem). Let $f\colon \mathbb{R}^M \to \mathbb{R}^N$ be a differentiable map on an open subset $G \subset \mathbb{R}^M$ and let the dimension of $\operatorname{Im} f'(x)$ be constant for $x \in G$ (and equal to $L \le N$). Then for any $a \in G$ there exist neighborhoods $U$ of $a$, $W$ of $b = f(a)$, cubes $C$ in $\mathbb{R}^M$, $D$ in $\mathbb{R}^N$ and diffeomorphisms $\Phi\colon C \to U$, $\Psi\colon W \to D$ such that the map $F$ defined by $F = \Psi \circ f \circ \Phi$ has the form
$$F(z_1, \dots, z_M) = (z_1, \dots, z_L, 0, \dots, 0) \qquad \text{for all } z = (z_1, \dots, z_M) \in C.$$
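A small example may help to see what the two diffeomorphisms do. The sketch below assumes the toy map $f(x, y) = (x + y, (x + y)^2)$ on $\mathbb{R}^2$, whose derivative has rank $L = 1$ everywhere; the map and the names `Phi`, `Psi` are illustrative assumptions, chosen by hand rather than produced by the proof.

```python
def f(x, y):
    # Assumed toy map R^2 -> R^2 whose derivative has rank 1 at every point.
    s = x + y
    return (s, s * s)

def Phi(z1, z2):
    # Diffeomorphism of the source: the new coordinate z1 equals x + y.
    return (z1 - z2, z2)

def Psi(w1, w2):
    # Diffeomorphism of the target that flattens the parabola w2 = w1^2.
    return (w1, w2 - w1 * w1)

def F(z1, z2):
    # F = Psi o f o Phi, expected to have the form (z1, 0).
    return Psi(*f(*Phi(z1, z2)))

for z in [(0.0, 0.0), (0.5, -1.25), (-2.0, 3.0)]:
    print(z, F(*z))
```

The same example shows two dependent functions: the second component of $f$ is a function of the first one.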
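Figure 4.3.8. (Upper part: the cubes $C \subset \mathbb{R}^M = \mathbb{R}^L \times \mathbb{R}^{M-L}$ and $D \subset \mathbb{R}^N = \mathbb{R}^L \times \mathbb{R}^{N-L}$ with the maps $T_C$, $T_D$. Lower part: $U$ with $X_1$ and $X_2 = \operatorname{Ker} f'(a)$ in $\mathbb{R}^M$, and $W$, $f(U)$, $f(U) \cap W$ with $Y_1 = \operatorname{Im} f'(a)$ and $Y_2$ in $\mathbb{R}^N$.)
Figure 4.3.9. ($X_1$, $f'(x)$, $\operatorname{Im} f'(x)$ and the isomorphism $A^{-1}Q$.)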
ui Xi , i = 1, 2,
for u U,
and define
$$g(u_1, u_2) = f(\varphi(u_1 + u_2)).$$
Now, we show that $g$ actually depends on the first variable only. To see this we compute the derivative of $g$ with respect to the second variable:
$$g'_2(u_1, u_2)h_2 = f'(\varphi(u))\,\varphi'(u)h_2.$$
for any h2 X2 .
for
4
(u1 , u2 ), (u1 , o) U.
This result is shown in Figure 4.3.8 by shaded areas. Put $g(u_1) := g(u_1, o)$.
We employ Proposition 4.3.2, in particular Remark 4.3.3(ii), to complete the proof. Replacing $f$ by $g$ there, we obtain a diffeomorphism of a neighborhood $W$ of $b = f(a)$ onto a neighborhood $\tilde{W}$ of $o \in \mathbb{R}^N$ such that
(I Q)(f (U) W) = o
(see the right lower corner of Figure 4.3.8). We get cubes $C$ and $D$ by diffeomorphisms $T_C$, $T_D$ in $\mathbb{R}^M$, $\mathbb{R}^N$, respectively, which transform non-Cartesian coordinates in $X_1 \oplus X_2$ or in $Y_1 \oplus Y_2$ into Cartesian coordinates in $\mathbb{R}^M = \mathbb{R}^L \times \mathbb{R}^{M-L}$ ($T_C(X_1) = \mathbb{R}^L$), or in $\mathbb{R}^N = \mathbb{R}^L \times \mathbb{R}^{N-L}$, respectively (see the upper part of Figure 4.3.8 and page 163).
Remark 4.3.12. The assertion of the Rank Theorem can be formulated in a slightly less informative way as follows:
Under the hypotheses of Theorem 4.3.11, $f(G)$ is a differentiable manifold of dimension $L$.
Definition 4.3.13. Functions $f_1, \dots, f_N\colon \mathbb{R}^M \to \mathbb{R}$ are said to be independent in an open set $G \subset \mathbb{R}^M$ if any point $x \in G$ is regular for $f = (f_1, \dots, f_N)$. In the other case, the functions are called dependent.
The following assertion explains the notions of dependent and independent functions.
Suppose the assumptions of the Rank Theorem are satisfied for
$$f = (f_1, \dots, f_L, f_{L+1}, \dots, f_N)\colon \mathbb{R}^M \to \mathbb{R}^N$$
where the functions $f_1, \dots, f_L$ are independent in a neighborhood of a point $a \in \mathbb{R}^M$. Then there is a smooth function $G\colon \mathbb{R}^L \to \mathbb{R}^{N-L}$ such that
$$(f_{L+1}(x), \dots, f_N(x)) = G(f_1(x), \dots, f_L(x))$$
for $x$ in a certain neighborhood of $a$.
4 In fact, the use of Theorem 3.2.7 requires the segment joining $(u_1, o)$ to $(u_1, u_2)$ to lie in $U$. Taking a smaller $U$ if necessary we can assume that $U$ is convex.
Notice that we have got a similar result at the end of the proof of Proposition 4.3.8 where we have considered only one fiber, namely $\{x : f(x) = f(a)\}$.
where
and
$$\frac{d}{dt} f(\varphi(t, a)) = 0$$
hold for $t \in I_a$.
It has been proved in the theory of ordinary differential equations that a system $\dot{x} = v(x)$ ($v\colon G \subset \mathbb{R}^M \to \mathbb{R}^M$ is smooth) has $M - 1$ independent first integrals $f_1, \dots, f_{M-1}$ in a neighborhood $U$ of any non-stationary point $a \in G$. A smooth function $g\colon U \to \mathbb{R}$ is a first integral if and only if $g, f_1, \dots, f_{M-1}$ are dependent on $U$.
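The defining property of a first integral can be observed on a simple system. The sketch below assumes the planar field $v(x) = (x_2, -x_1)$, whose flow is a rotation, and the candidate first integral $x_1^2 + x_2^2$; the example and the helper names are illustrative assumptions.

```python
import math

def v(x):
    # Assumed example field on R^2 (harmonic oscillator); here M = 2, M - 1 = 1.
    return (x[1], -x[0])

def flow(t, a):
    # Exact solution of x' = v(x) with x(0) = a (a rotation of the plane).
    c, s = math.cos(t), math.sin(t)
    return (c * a[0] + s * a[1], -s * a[0] + c * a[1])

def first_integral(x):
    return x[0] ** 2 + x[1] ** 2

def derivative_along_flow(t, a, h=1e-5):
    # Central difference of t -> first_integral(flow(t, a)).
    return (first_integral(flow(t + h, a)) - first_integral(flow(t - h, a))) / (2 * h)

a = (1.0, 0.5)
print([derivative_along_flow(t, a) for t in (0.0, 0.7, 2.0)])
```

The printed values vanish up to rounding, in agreement with the identity $\nabla f(x) \cdot v(x) = 2x_1x_2 - 2x_2x_1 = 0$ for this example.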
We remark that the knowledge of the first integrals reduces the original system. For example, if $f_1, \dots, f_{M-1}$ are independent first integrals in a neighborhood $U$ of a non-stationary point, then the transformation of coordinates
yi = fi (x),
i = 1, . . . , M 1,
yM = xM
i = 1, . . . , M 1,
y M = w(yM )
for a function w,
i = 1, . . . , M 1,
y M = 1.
For another interpretation and a generalization of the notion of the first integral see Exercise 4.3.26 and the end of Appendix 4.3A.
Remark 4.3.14. A result similar to the Rank Theorem holds also for a differentiable map $f\colon X \to Y$ where $X$, $Y$ are Banach spaces. The delicate question is the existence of continuous linear projections $P$ of $X$ (onto $\operatorname{Ker} f'(a)$) and $Q$ of $Y$ (onto $\operatorname{Im} f'(a)$). Such projections exist provided $f'(a)$ is a Fredholm operator, i.e., $\operatorname{Ker} f'(a)$ has finite dimension and $\operatorname{Im} f'(a)$ is a closed subspace of finite codimension in $Y$ (see page 70). Notice that the equation $f(x) = y$ can be solved by the following procedure which is often called the Lyapunov–Schmidt Reduction:
The equation
$$f(x) = y$$
is equivalent to the pair of equations
$$y_1 := Qy = Qf(x_1 + x_2), \qquad y_2 := (I - Q)y = (I - Q)f(x_1 + x_2),$$
where
$$x = x_1 + x_2, \qquad x_2 = Px.$$
Suppose that the first equation may be solved$^5$ for $x_1$ assuming $x_2$ to be fixed (looking at $x_2$ as a parameter). We obtain
$$x_1 = g(y_1, x_2).$$
The second equation is now an equation (it is called the bifurcation equation or the alternative problem) of the form
$$(I - Q)f(x_2 + g(y_1, x_2)) = y_2$$
for an unknown $x_2$.
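The two steps of the reduction can be followed on a finite dimensional toy problem. The sketch below assumes the map $f(u, v) = (u + u^3 + v^2,\ uv + 2v^3)$ on $\mathbb{R}^2$ at $a = (0, 0)$, so that $x_1 = u$, $x_2 = v$ and $Q$ is the projection onto the first coordinate; the map, Newton's iteration as the solver and the function names are illustrative assumptions.

```python
def f(u, v):
    # Assumed toy map R^2 -> R^2 with f(0, 0) = 0 and f'(0, 0) = [[1, 0], [0, 0]]:
    # Ker f'(0, 0) is the v-axis and Im f'(0, 0) is the first coordinate axis.
    return (u + u ** 3 + v * v, u * v + 2.0 * v ** 3)

def solve_first_equation(v, y1=0.0, steps=20):
    """Solve Q f(u, v) = y1 for u = g(y1, v) by Newton's iteration (v is a parameter)."""
    u = 0.0
    for _ in range(steps):
        residual = f(u, v)[0] - y1
        u -= residual / (1.0 + 3.0 * u * u)   # derivative of the first component in u
    return u

def bifurcation_function(v, y1=0.0):
    """(I - Q) f(g(y1, v), v): the left-hand side of the bifurcation equation."""
    return f(solve_first_equation(v, y1), v)[1]

for v in (-0.1, 0.0, 0.1):
    print(v, solve_first_equation(v), bifurcation_function(v))
```

For this example the bifurcation function behaves like $v^3$ near $0$, so for $y = 0$ the only small solution of $f(x) = 0$ is $x = (0, 0)$.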
i.e.,
$$F\colon X_2 = \operatorname{Ker} f'(a) \to \mathbb{R}$$
(see Example 4.3.20). Notice that $\dim X_2$ is finite for $f'(a)$ being a Fredholm map.
5 E.g., by the Implicit Function Theorem (Theorem 4.2.1) in the vicinity of a known solution $b = f(a)$ since $f'(a)$ is an isomorphism of $X_1$ onto $Y_1$ or, more generally, by an iteration process.
Example 4.3.15. As an application we will investigate the existence of a solution of the following boundary value problem for a system of ordinary differential equations
$$\dot{x}(t) = f(t, x(t)), \quad t \in (0, 1), \qquad x(0) = x(1). \tag{4.3.1}$$
We suppose (see Theorem 2.3.4) that $f$ together with its partial derivatives with respect to the variables $x = (x_1, \dots, x_N)$ are continuous on $[0, 1] \times \mathbb{R}^N$. We know that any solution starting at $t = 0$ satisfies the integral equation
$$x(t) - x(0) = \int_0^t f(s, x(s))\,ds$$
for all $t$ from the interval of its existence. This means that $x$ satisfies the boundary value problem (4.3.1) if and only if
$$G(x_0) := \int_0^1 f(s, x(s, x_0))\,ds = o.$$
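The condition $G(x_0) = o$ suggests a simple shooting procedure. The sketch below assumes the scalar right-hand side $f(t, x) = -x + \cos 2\pi t$, integrates the initial value problem by the classical Runge–Kutta scheme and finds the zero of $G$ by bisection; the right-hand side, the step count and the bracketing interval are illustrative assumptions.

```python
import math

def f(t, x):
    # Assumed scalar right-hand side (N = 1), 1-periodic in t.
    return -x + math.cos(2.0 * math.pi * t)

def solve_ivp_rk4(x0, steps=400):
    """Return x(1, x0) for x' = f(t, x), x(0) = x0, by the classical RK4 scheme."""
    h = 1.0 / steps
    t, x = 0.0, x0
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h * k1 / 2)
        k3 = f(t + h / 2, x + h * k2 / 2)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x

def G(x0):
    # G(x0) = int_0^1 f(s, x(s, x0)) ds = x(1, x0) - x0.
    return solve_ivp_rk4(x0) - x0

def bisect(func, lo, hi, tol=1e-12):
    # Plain bisection; assumes func(lo) and func(hi) have opposite signs.
    flo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = func(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x0_star = bisect(G, -1.0, 1.0)
print(x0_star, 1.0 / (1.0 + 4.0 * math.pi ** 2))
```

For this linear example the periodic solution is known in closed form, $x(t) = (\cos 2\pi t + 2\pi \sin 2\pi t)/(1 + 4\pi^2)$, which gives the second printed value as a reference for the initial condition.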
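for $G\colon \mathbb{R}^N \to \mathbb{R}^N$
$$\dot{x}(t) = \varepsilon f(t, x(t)), \quad t \in (0, 1), \qquad x(0) = x(1). \tag{4.3.2}$$
Notice that for $\varepsilon = 0$ any $N$-dimensional constant $a$ solves (4.3.2). To be able to use the abstract approach described above we rewrite (4.3.2) in an operator form. To do this we define Banach spaces
$$X = \{x \in C([0, 1], \mathbb{R}^N) : x(0) = x(1)\},$$
and operators $L, N\colon X \to Y$:
$$Lx\colon t \mapsto x(t) - x(0), \qquad N(x)\colon t \mapsto \int_0^t f(s, x(s))\,ds, \qquad t \in [0, 1]. \tag{4.3.3}$$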
h X.
for
Check this expression yourself, see also Example 3.2.21. This means that
$$G'_1(a, 0)h = Lh$$
is not injective and $X_2 := \operatorname{Ker} L$ consists of $N$-dimensional constant functions. Moreover,
$$Y_1 := \operatorname{Im} L = \{y \in Y : y(1) = y(0) = o\}.$$
There are continuous linear projections $P$, $Q$ onto closed subspaces $X_2$ and $Y_1$, respectively, given by
$$Px\colon t \mapsto x(0), \qquad Qy\colon t \mapsto y(t) - t\,y(1).$$
Y = Y1 Y2 ,
x1 X1 ,
a X2 ,
(4.3.4)
(4.3.5)
and
1 (a, 0) = o
0
f2 (s, a
) ds d = tc
To summarize the considerations of the previous example, we get the following conclusion.
Proposition 4.3.16. Let $f = (f^1, \dots, f^N)\colon [0, 1] \times \mathbb{R}^N \to \mathbb{R}^N$ be continuous and have continuous partial derivatives $\frac{\partial f^i}{\partial x_j}$ ($i, j = 1, \dots, N$). Let the function $f$ satisfy the conditions
$$\int_0^1 f(s, \tilde{a})\,ds = o, \qquad \det\left(\int_0^1 \frac{\partial f^i}{\partial x_j}(s, \tilde{a})\,ds\right) \neq 0$$
for a certain constant $\tilde{a} \in \mathbb{R}^N$. Then there exist $\delta > 0$ and a differentiable map $x(\cdot, \varepsilon)$, $|\varepsilon| < \delta$, such that $x(\cdot, 0) = \tilde{a}$ and the functions $x(\cdot, \varepsilon)$ satisfy the boundary value problem (4.3.2).
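As a quick check of how the two conditions are verified in practice, here is a worked scalar instance ($N = 1$). The right-hand side below is an assumed illustration and is not taken from the text.

```latex
% Assumed example: f(t, x) = 1 - x^2 + \sin(2\pi t), N = 1.
\begin{align*}
\int_0^1 f(s, \tilde a)\,ds
  &= 1 - \tilde a^{2} + \int_0^1 \sin(2\pi s)\,ds = 1 - \tilde a^{2}, \\
\int_0^1 \frac{\partial f}{\partial x}(s, \tilde a)\,ds
  &= -2\tilde a .
\end{align*}
% Both conditions hold for \tilde a = 1 and for \tilde a = -1:
% the first integral vanishes and the second one equals \mp 2 \neq 0.
```

Under these assumptions the proposition applies at each of the two constants $\tilde a = 1$ and $\tilde a = -1$, so for small values of the parameter the problem (4.3.2) has a solution close to each of them.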
Remark 4.3.17. Let us make some remarks on this result. If the function $f$ in (4.3.1) is 1-periodic in the variable $t$, then $x$ is a solution of (4.3.1) if and only if
$$\tilde{x}(t) = x(t - n), \qquad n = [t],\ t \in \mathbb{R},$$
is a 1-periodic solution of $\dot{x} = f(t, x)$. Only technical difficulties appear when one generalizes the approach just described to a more general equation
$$\dot{x}(t) = A(t)x + f(t, x)$$
with more general boundary conditions
$$Bx(0) - Cx(1) = o$$
($B$, $C$ are $N \times N$ matrices). Notice also that having a result for a system of differential equations we can investigate boundary value problems for second order equations. For example, we put
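$$x = \begin{pmatrix} y \\ \dot{y} \end{pmatrix}, \quad f(t, x) = \begin{pmatrix} 0 \\ g(t, y, \dot{y}) \end{pmatrix}, \quad B = \begin{pmatrix} b_1 & b_2 \\ 0 & 0 \end{pmatrix}, \quad C = \begin{pmatrix} 0 & 0 \\ c_1 & c_2 \end{pmatrix}$$
to rewrite the corresponding second order problem on $(0, 1)$ with the boundary conditions
$$b_1 y(0) + b_2 \dot{y}(0) = 0, \qquad c_1 y(1) + c_2 \dot{y}(1) = 0$$
as
$$\dot{x}(t) = \begin{pmatrix} 0 & 1 \\ a(t) & 0 \end{pmatrix} x(t) + f(t, x(t)), \quad t \in (0, 1), \qquad Bx(0) + Cx(1) = o.$$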
Many other examples of the use of the Implicit Function Theorem can be found
in Vejvoda et al. [130]. We will return to the problem (4.3.1) in Example 5.2.18.
We now turn to the study of the behavior of a differentiable function in the vicinity of a critical point. We recommend that the reader first considers the cases
$$f(x) = x^n, \quad n > 1,$$
and
$$f(x) = \sum_{i,j=1}^{n} a_{ij} x_i x_j, \qquad a_{ij} = a_{ji}.$$
Definition 4.3.18. Let $G$ be an open set in a Banach space $X$, $f\colon X \to \mathbb{R}$, $f \in C^2(G)$. A critical point $a \in G$ of $f$ is said to be non-degenerate if for any $h \in X$, $h \neq o$, the linear form $f''(a)(h, \cdot)$ does not vanish.
The following basic result holds also in a Hilbert space but its finite dimensional version is more transparent.
Theorem 4.3.19 (Morse). Let $G$ be an open set in $\mathbb{R}^M$, $f\colon \mathbb{R}^M \to \mathbb{R}$, $f \in C^2(G)$. Let $a \in G$ be a non-degenerate critical point of $f$. Then there exists a diffeomorphism $\varphi$ of a neighborhood $U$ of $a$ onto a neighborhood $V$ of $o \in \mathbb{R}^M$ such that for $x \in U$, $y = \varphi(x)$, the function $f$ can be expressed in the form
$$f(x) = f(a) + \frac{1}{2} \sum_{i=1}^{M} \lambda_i y_i^2.$$
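A concrete change of coordinates can make the statement more tangible. The sketch below assumes the function $f(x_1, x_2) = x_1^2(1 + x_1) - x_2^2$ with the non-degenerate critical point $a = (0, 0)$, and reads the coefficients in the normal form as the eigenvalues $2$ and $-2$ of $f''(a)$; the function, the explicit diffeomorphism and this reading of the coefficients are assumptions made for illustration.

```python
import math

def f(x1, x2):
    # Assumed example: a = (0, 0) is a non-degenerate critical point, f(a) = 0,
    # and f''(a) = diag(2, -2).
    return x1 * x1 * (1.0 + x1) - x2 * x2

def phi(x1, x2):
    # Local change of coordinates; its first component is increasing for x1 > -2/3.
    return (x1 * math.sqrt(1.0 + x1), x2)

def normal_form(y1, y2, lam=(2.0, -2.0)):
    # f(a) + (1/2) * sum_i lam_i * y_i^2 with f(a) = 0.
    return 0.5 * (lam[0] * y1 * y1 + lam[1] * y2 * y2)

for x in [(0.1, 0.2), (-0.3, 0.05), (0.25, -0.4)]:
    print(f(*x), normal_form(*phi(*x)))
```

The two printed columns agree, so in the new coordinates the cubic term has been absorbed and only the quadratic form remains.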
f (x) = f (a) +
B(x)
(the Riemann integral of a function with values in $\mathbb{R}^{M \times M}$). Note that we have $B(\cdot) \in F_S$. Our aim is to show that we can choose $C(\cdot) \in F$ such that
$$B(x) = C^*(x) J C(x)$$
where $J$ is the canonical form of $B(a) = \frac{1}{2} f''(a)$, i.e.,
J=
1
..
0
6
.
Here $C^*$ stands for the adjoint matrix to $C$, i.e., $C^* = (c_{ji})$ provided $C = (c_{ij})$.
The transformation of coordinates
$$y = C(x)(x - a)$$
then yields
1
i yi2 .
2 i=1
M
To achieve this goal we will use the Implicit Function Theorem (Theorem 4.2.1). We put
$$\Phi(B, C) = C^*(x) J C(x) - B(x)\colon F_S \times F \to F_S.$$
In particular,
$$\Phi(B(a), T) = T^* J T - B(a) = o,$$
6A
provided $T$ is a unitary matrix which transforms $B(a)$ into its canonical form $J$. Put $A := JT$. The partial differential of $\Phi$ with respect to the second variable has the form
$$\Phi'_2(B, C)M\colon x \mapsto M^*(x) J C(x) + C^*(x) J M(x), \qquad x \in G.$$
Then
$$\operatorname{Ker} \Phi'_2(B(a), T) = \{M \in F : M^*(\cdot)A + A^*M(\cdot) = o\}$$
and
Q : M
1
(M (A )1 M A)
2
1 1
J T S F1
2
2 (B(a), T )M = S FS .
and
for all x G.
To finish the proof we have to show that there is a neighborhood U of a such that
B(x) B(a)F <
for all x U.
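By the definition of $B$,
$$\|B(\cdot) - B(a)\|_F = \sup_{x \in G} \left\| \int_0^1 (1 - t)\,[f''(a + t(x - a)) - f''(a)]\,dt \right\|_{M \times M} \le \frac{1}{2} \sup_{x \in G} \|f''(x) - f''(a)\|_{M \times M}.$$
This means that we can find the desired neighborhood $U$.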
codim Im f (a) = 1,
h Ker f (a),
f (a) = 0
and
g (a2 ) = o.
(this can occur for $f\colon \mathbb{R}^{N+1} \to \mathbb{R}^N$) and the matrix of $F''(a_2)$ is regular, i.e., $a_2$ is a non-degenerate critical point of $F$, then after a suitable transformation of coordinates we get
1
F (x2 ) = (1 2 + 2 2 )
2
(the Morse Theorem) and the following conclusion:
If $\operatorname{sgn} \lambda_1 = \operatorname{sgn} \lambda_2$, then the equation (4.3.6) has an isolated solution $x = a$;
if $\operatorname{sgn} \lambda_1 \neq \operatorname{sgn} \lambda_2$, then there are two curves of solutions given by
2
2
= .
1
(4.3.8)
x0 = o.
The following result is a classical one (see Crandall & Rabinowitz [29]).
Theorem 4.3.22 (Local Bifurcation Theorem). Let $X$, $Y$ be Banach spaces, $f\colon \mathbb{R} \times X \to Y$ a twice continuously differentiable map on a neighborhood of $(0, o)$. Let $f$ satisfy the assumptions
(i) $f(\lambda, o) = o$ for all $\lambda \in (-\delta, \delta)$ for some $\delta > 0$,
(ii) $\dim \operatorname{Ker} f'_2(0, o) = \operatorname{codim} \operatorname{Im} f'_2(0, o) = 1$,
(iii) if $f'_2(0, o)x_0 = o$, $x_0 \neq o$, then $f''_{1,2}(0, o)(1, x_0) \notin \operatorname{Im} f'_2(0, o)$.
7 The
(0) = o,
(, x) U
for
x = t(x0 + (t))
for a certain t
(0, o)
R
U
Figure 4.3.10.
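The hypotheses of the Local Bifurcation Theorem can be inspected on the simplest scalar example. The sketch below assumes $X = Y = \mathbb{R}$ and $f(\lambda, x) = \lambda x - x^2$, and approximates the derivatives by finite differences; the example and the helper names are illustrative assumptions.

```python
def f(lam, x):
    # Assumed scalar example with X = Y = R; f(lam, 0) = 0 for all lam.
    return lam * x - x * x

def d2(lam, x, h=1e-6):
    # Partial derivative with respect to x (central difference).
    return (f(lam, x + h) - f(lam, x - h)) / (2 * h)

def d12(lam, x, h=1e-4):
    # Mixed second derivative with respect to (lam, x) (central differences).
    return (f(lam + h, x + h) - f(lam + h, x - h)
            - f(lam - h, x + h) + f(lam - h, x - h)) / (4 * h * h)

# The derivative in x vanishes at (0, 0): its kernel is R and its image is {0},
# so both have (co)dimension 1; the mixed derivative is nonzero, hence not in the image.
print(d2(0.0, 0.0), d12(0.0, 0.0))

# Besides the trivial branch x = 0 there is the branch x = lam through (0, 0).
print([f(lam, lam) for lam in (-0.2, 0.1, 0.3)])
```

In this example the two solution curves $x = 0$ and $x = \lambda$ cross at $(0, 0)$, which is the picture of a transcritical bifurcation.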
Proof. We will give two proofs. The first one, for the finite dimensional case when $X = Y = \mathbb{R}^M$, is based on the Morse Theorem. The second one, which is due to M. Crandall and P. Rabinowitz, is based on the Implicit Function Theorem and will be only sketched.
The rst proof. We choose Y = RM , = o, such that
y Im f2 (0, o)
if and only if
(y) = 0.
f1,1
(, o) = o
for
(, )
g2 (0, 0) = o
and also
F1,1
(0, 0) = 0.
Therefore, by (iii),
F1,2
(0, 0) = F2,1
(0, 0) = [f1,2
(0, o)(1, x0 ) + f2 (0, o)g1,2
(0, 0)]
= [f1,2
(0, o)(1, x0 )] = 0
since (z) = 0 for every z Im f2 (0, o). If we denote F2,2
(0, 0), we obtain
0
. This matrix has
the matrix representation of F (0, 0) in the form
eigenvalues of different signs. The rest of the proof follows by applying the Morse Theorem (see also Example 4.3.20).
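The second proof proceeds by using the Implicit Function Theorem for the function $\Phi\colon \mathbb{R} \times \mathbb{R} \times X_1 \to Y$ defined by
$$\Phi(\lambda, t, x_1) = \begin{cases} \dfrac{1}{t} f(\lambda, t(x_0 + x_1)) & \text{for } t \neq 0, \\[1ex] f'_2(\lambda, o)(x_0 + x_1) & \text{for } t = 0. \end{cases}$$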
x = g(, x)
are shown with an indication of their stability (s for stable, u for unstable).
9 A stationary point $a \in \mathbb{R}^M$ is called hyperbolic for the equation $\dot{x} = f(x)$ provided $f(a) = o$ and $\sigma(f'(a)) \cap i\mathbb{R} = \emptyset$. See also footnote 3 on page 154.
Figure 4.3.11. Transcritical bifurcation (branches through $(0, 0)$ marked s for stable and u for unstable).
+ max 
x(t),
xX = max x(t) + max x(t)
t[0,2]
t[0,2]
Y = {y C(R) : y is 2periodic},
It is easy to show that
t[0,2]
+ h
f2 (, o)h = h(t)
for
n N.
and
f1,2
(0, o)(1, c) = c
0
0
y Y.
u Ker A, v V,
where
h(, u, v) = [sin (u + v) (u + v)] + n2 [sin (u + v) (u + v)].
Because of this special form of h we will try to nd a solution of (4.3.9) in the
form x = (u + v). The equality
f ( + n2 , (u + v)) = 0
holds if and only if
Av + (u + v) + 2 g(, u, v) = 0
(4.3.10)
where
h(, u, v)
, = 0,
3
g(, u, v) =
2
n (u + v)3 , = 0.
6
For solving (4.3.10) we use the Lyapunov–Schmidt Reduction. According to it, the equation (4.3.10) is equivalent to the following pair of equations:
0 = Av + Q(u + v) + 2 Qg(, u, v)
0 = (I Q)(u + v) + (I Q)g(, u, v)
(= Av + v + 2 Qg(, u, v)),
(= u + (I Q)g(, u, v)).
By the Implicit Function Theorem, the rst equation has a unique solution v =
(, u) in a neighborhood of the point (0, u , o) for any u . We insert into the
bifurcation equation obtaining
(, u) = u + (I Q)g(, u, (, u)) = 0.
(4.3.11)
(, ).
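The reader is invited to generalize this procedure to obtain sufficient conditions for a bifurcation for the equation
$$f(\lambda, x) = o$$
assuming that
$$f \in C^2(U), \qquad f(\lambda, o) = o \ \text{ for } |\lambda - \lambda_0| < \delta, \qquad \dim \operatorname{Ker} f'_2(\lambda_0, o) = \operatorname{codim} \operatorname{Im} f'_2(\lambda_0, o) = 2$$
where $U$ is a neighborhood of $(\lambda_0, o)$.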
We notice that no uniqueness of the bifurcation branch was proved even in
our concrete example. Compare this with the assertion given in Theorem 4.3.22.
This is due to our special choice of the form of the bifurcation branch, namely
g
x = (u + v).
Example 4.3.25 (Application of Theorem 4.3.22). We will study the bifurcation points of the periodic problem
$$\ddot{x}(t) + \lambda x(t) + g(\lambda, t, x(t), \dot{x}(t)) = 0, \quad t \in (0, 2\pi), \qquad x(0) = x(2\pi), \quad \dot{x}(0) = \dot{x}(2\pi). \tag{4.3.12}$$
x(t) dt = 0 .
Since
F1,2
(0, o)1 = 1
and
1
/ Im F2 (0, o),
x(0) = o, x s (0) = o,
(0)
= 0.
= v(x(t))
(4.3.14)
i = 2, . . . , M.
for z = y + tv(a), y Y,
(ii) Deduce from (i) that the equation (4.3.14) has $M - 1$ independent first integrals in a neighborhood of a non-stationary point.
(iii) Is there any relation between the first integral of (4.3.14) and the linear partial differential equation
$$v_1(x) \frac{\partial u}{\partial x_1} + \dots + v_M(x) \frac{\partial u}{\partial x_M} = 0?$$
Exercise 4.3.27. Apply Theorem 4.3.22 to the (Dirichlet) boundary value problem
$$\ddot{x}(t) + \lambda x(t) + g(\lambda, t, x(t), \dot{x}(t)) = 0, \quad t \in (0, \pi), \qquad x(0) = x(\pi) = 0,$$
and show that every $(k^2, o)$, $k \in \mathbb{N}$, is a bifurcation point!
Exercise 4.3.28. Replace the Dirichlet boundary condition in Exercise 4.3.27 by the Neumann boundary condition
$$\dot{x}(0) = \dot{x}(\pi) = 0$$
and prove that every $(k^2, o)$, $k \in \mathbb{N} \cup \{0\}$, is a bifurcation point!
Exercise 4.3.29. Why cannot the approach used in Example 4.3.25 be applied to prove that the points $(k^2, o)$, $k \in \mathbb{N}$, are bifurcation points of (4.3.12) even if $k^2$ is an eigenvalue of the associated eigenvalue problem (4.3.13)? Can you modify the method from Example 4.3.24?
Exercise 4.3.30. Apply Theorem 4.3.22 to the boundary value problem
4
...
d x
(t) x(t) + g(, t, x(t), x(t),
x
(t) x (t)) = 0,
t (0, ),
dt4
x(0) = x
(0) = x() = x(),
and show that, under appropriate assumptions on g, every (k 2 , o), k N, is a
bifurcation point.
at a point
a RM
((0)
d
(0))
dt
to all smooth
a M . The tangent space Ta M is the collection of tangent vectors (0)
curves
a { : R RN : there is an open interval I 0
such that C 1 (I ), (I ) M , (0) = a}.
The method for computation of tangent vectors to a parametrized $M$-dimensional manifold $\mathcal{M} \subset \mathbb{R}^N$ is based on the use of local coordinates: Let $a \in \mathcal{M}$, let $\varphi$ be a diffeomorphism of a neighborhood $W \subset \mathbb{R}^N$ of the point $a$ into $\mathbb{R}^N$ (see Definition 4.3.4). If
$$V = W \cap \mathcal{M},$$
then $(V, \varphi)$ is a local chart of $\mathcal{M}$ at the point $a \in \mathcal{M}$. Denote the inverse of the restriction of $P\varphi$ to $V$ by $\psi$ where $Py = (y_1, \dots, y_M) \in \mathbb{R}^M$ for $y = (y_1, \dots, y_N) \in \mathbb{R}^N$. Then $\psi$ maps the neighborhood $U = P\varphi(V)$ of the point $b = P\varphi(a) \in \mathbb{R}^M$ into $\mathcal{M}$ and we will consider $\psi$ also as an embedding (see Remark 4.3.3(iv)) of $U$ into $\mathbb{R}^N$. We call $\psi$ the local parametrization of $V \subset \mathcal{M}$. The main reason for introducing $\psi$ is that $\psi$ can be differentiated, but $\varphi|_V$ cannot, and the whole $\varphi$ does not describe $\mathcal{M}$. See Figure 4.3.13.
Consider now a smooth curve a (see Figure 4.3.13). We can choose I so
small that (I ) V. Then
(t) = (P )((t))
Figure 4.3.13. Manifold
is a smooth curve in U RM . We have = and, consequently,
j (0) =
or, more briey,
M
j
(b) i (0), 10
yi
i=1
j = 1, . . . , N,
(0)
= (b)((0)).
Since also
(4.3.15)
(0)
= P (a)((0)),
at
b = (0) = (a).
b+t
1
i=1
Ta M = Im (b)
10 The Einstein summation convention is often used in differential geometry. According to it, the sum is taken with respect to all indices which appear simultaneously in upper and lower positions. For example, if $e_1, \dots, e_M$ is a basis in $\mathbb{R}^M$, then the coordinates of a point $x \in \mathbb{R}^M$ with respect to this basis should be denoted by $x^1, \dots, x^M$, since $x = \sum_{i=1}^{M} x^i e_i \equiv x^i e_i$ by this convention.
j=1
i=1
= (b)
(0)ei =
i (0)
.
(4.3.16)
(0)
= (b)((0))
y
i
i=1
i=1
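Example 4.3.32. Let us compute the tangent space to the 2-dimensional sphere
$$S^2 = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1\} \quad \text{at the point } a = \left(\tfrac{1}{2}, \tfrac{1}{2}, \tfrac{\sqrt{2}}{2}\right).$$
As local coordinates we choose the spherical coordinates
$$x = \cos\alpha \cos\beta, \quad y = \sin\alpha \cos\beta, \quad z = \sin\beta,$$
i.e., $\varphi(x, y, z) = (\alpha, \beta)$, $b = \left(\frac{\pi}{4}, \frac{\pi}{4}\right)$ and $\psi\left(\frac{\pi}{4}, \frac{\pi}{4}\right) = a$. Then
$$\frac{\partial}{\partial \alpha}\left(\tfrac{\pi}{4}, \tfrac{\pi}{4}\right) = \left(-\tfrac{1}{2}, \tfrac{1}{2}, 0\right), \qquad \frac{\partial}{\partial \beta}\left(\tfrac{\pi}{4}, \tfrac{\pi}{4}\right) = \left(-\tfrac{1}{2}, -\tfrac{1}{2}, \tfrac{\sqrt{2}}{2}\right)$$
is a basis of $T_a S^2$. Choosing a vector $v$ perpendicular to both $\frac{\partial}{\partial \alpha}$ and $\frac{\partial}{\partial \beta}$, e.g., $v = (1, 1, \sqrt{2})$, we get
$$T_a S^2 = \{(x, y, z) \in \mathbb{R}^3 : x + y + \sqrt{2}\,z = 0\}.$$
(If you have drawn a picture, you get a slightly better insight.)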
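The computation can be repeated numerically. The sketch below differentiates the spherical parametrization by central differences at $b = (\pi/4, \pi/4)$ and checks that both tangent vectors satisfy $x + y + \sqrt{2}\,z = 0$; the angle names `alpha`, `beta` and the helper functions are assumptions made for illustration.

```python
import math

def psi(alpha, beta):
    # Spherical parametrization of S^2 (the angle names are assumed here).
    return (math.cos(alpha) * math.cos(beta),
            math.sin(alpha) * math.cos(beta),
            math.sin(beta))

def partial(i, alpha, beta, h=1e-6):
    # Central-difference approximation of the i-th partial derivative of psi.
    step = (h, 0.0) if i == 0 else (0.0, h)
    p = psi(alpha + step[0], beta + step[1])
    m = psi(alpha - step[0], beta - step[1])
    return tuple((pc - mc) / (2.0 * h) for pc, mc in zip(p, m))

def plane(w):
    # Left-hand side of the equation x + y + sqrt(2) z = 0.
    return w[0] + w[1] + math.sqrt(2.0) * w[2]

b = (math.pi / 4, math.pi / 4)
a = psi(*b)
basis = [partial(0, *b), partial(1, *b)]
print(a)
print(basis, [plane(w) for w in basis])
```

Both basis vectors are also orthogonal to $a$ itself, as expected for a sphere centered at the origin.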
It was shown in Remark 4.3.9(iii) that a manifold can also be given implicitly, i.e., as the set of solutions of the equation
$$f(x) = o.$$
Proposition 4.3.33. Let $f\colon \mathbb{R}^M \to \mathbb{R}^N$ have continuous partial derivatives in an open set $G \subset \mathbb{R}^M$ and let $o$ be a regular value of $f$ (Definition 4.3.6). Then
$$\mathcal{M} = \{x \in G : f(x) = o\}$$
is an $(M - N)$-dimensional differentiable manifold provided $\mathcal{M}$ is not empty, and for $a \in \mathcal{M}$ the tangent space $T_a\mathcal{M}$ is equal to $\operatorname{Ker} f'(a)$.
Proof. The first part is exactly Proposition 4.3.8 and Remark 4.3.9(iii). If a map $\gamma\colon I \to \mathcal{M}$ is a smooth curve, $\gamma(0) = a$, then
$$f(\gamma(t)) = o \quad \text{for } t \in I$$
and
$$f'(a)\dot{\gamma}(0) = o, \quad \text{i.e.,} \quad \dot{\gamma}(0) \in \operatorname{Ker} f'(a).$$
Since $T_a\mathcal{M} \subset \operatorname{Ker} f'(a)$ and both the spaces have the same (finite) dimension (the assumption on regularity of $o$), we have
$$T_a\mathcal{M} = \operatorname{Ker} f'(a).$$
Since the same geometric object $\mathcal{M}$ can be viewed as a manifold with different local parametrizations or as solutions of different equations, we would like to know how the notion of the tangent space (and other notions to be introduced later on) depends on the way it is introduced. As the implicit definition of a manifold leads to local parametrizations (see the proof of Proposition 4.3.8) we can consider only the definition given by parametrizations. First of all we should say when two atlases of $\mathcal{M}$ define the same structure on $\mathcal{M}$.
Denition 4.3.34. Two C k atlases (V , )A , (V , )B of M are said to be equiva the mapping
lent if for every a M and any A, B for which a V V
= (P )
) onto (
is a C dieomorphism of ( )1 (V V
)1 (V V ) (see Figure 4.3.14).
k
Figure 4.3.14.
be two local charts at a point a M which belong to
)
Example 4.3.35. Let (V, ), (V,
where
= (b)( (b)(0)).
denoting y
(b)ei (as above) and z j
(b)ej , we get
i
the transformation rule for the tangent vectors
M
M
j
=
(b)
(b)ej =
(b)
,
i = 1, . . . , M.
(4.3.17)
yi
y
y
z
i
i
j
j=1
j=1
Figure 4.3.15.
where
= ,
and
d
(4.3.18)
(g )(0) =
(b)(G (b)(0)).
dt
We say that g pushes forward the tangent vector (0)
and which
which is denoted by g ((0))
' is(called a pushforward . In particular, g pushes
where
forward the tangent vector yi to g y
i
g
yi
=
M
Gj
(b)
;
y
z
i
j
j=1
=
(b)ej .
zj
(4.3.19)
for all
a .
d
(f )(0)
dt
for all
a .
= ij =
dy j
yi
0 for i
= j.
Remark 4.3.37.
(i) From the denition of df (a) it is obvious that df (a) (Ta M ) and its values can
be expressed in local coordinates as follows: If (, V) is a local chart at a M ,
F = f : U RM R, then
F
d
(f )(0) =
(b) i (0).
dt
yi
i=1
M
f = F
for
= a ,
i.e.,
In other words,
df (a) =
M
F
(b) dy i .
y
i
i=1
(4.3.20)
i = 1, . . . , M.
Observe that the formula (4.3.20) allows us to dene continuity of the mapping
x M df (x) (Tx M )
by the requirement that all F (corresponding to all charts of M ) have continuous
partial derivatives.11
at a point a M . Let
)
(ii) Suppose now that there are two local charts (V, ), (V,
f : M R be dierentiable at a M . Put
F f ,
F f ,
i.e.,
F = F
df (a) =
(b) dy =
(b)
(b) dy i
yi
zj
yi
i=1
i=1
j=1
M
M
M
F j
F
i
=
(b)
(b) dy =
(b) dz j .
z
y
z
j
i
j
j=1
i=1
j=1
(4.3.21)
The equality
dz j =
M
j
(b) dy i
yi
i=1
(4.3.22)
x V V
and
F = f .
yi
F
(F G)
Gj
(b) =
(b)
(b)
yi
zj
yi
j=1
g ( df (
.
a))
= df (
a) g
yi
yi
(4.3.23)
12 Notice the dierence between the transformation rules (4.3.17) and (4.3.22). The reader who is
acquainted with tensor analysis can realize that tangent vectors are transformed as contravariant
tensors and dierentials as covariant ones.
$(T_aM)^*$. This operation will play an important role in the definition of the degree (see Proposition 4.3.116).
We will return to pullback in Exercise 4.3.71.
Remark 4.3.39. The notion of the differentiable manifold can be generalized in such a way that it is not a priori assumed that $M$ is a subset of $\mathbb R^N$. In fact, we have needed in Definition 4.3.4 that $M$ has a topological structure (inherited from $\mathbb R^N$) such that the neighborhood $\mathcal V = \mathcal W\cap M$ is homeomorphic via $P\circ\varphi|_{\mathcal V}$ ($(P\circ\varphi|_{\mathcal V})^{-1} = \psi$) with $\mathcal U\subset\mathbb R^M$. A differential structure on $M$ is introduced with the help of differentiability properties of the mappings $\psi_{1,2}\triangleq\psi_2^{-1}\circ\psi_1$ for different neighborhoods $\mathcal V_1$, $\mathcal V_2$ of $M$ (cf. Definition 4.3.34). This is sufficient for correctness of the definition of the smooth function $f: M\to\mathbb R$. (It is smooth provided $f\circ\psi:\mathcal U\to\mathbb R$ is smooth for all $\psi$. Namely,
$$f(x) = (f\circ\psi_1)(y) = (f\circ\psi_2)[\psi_{1,2}(y)]\qquad\text{for}\quad x\in\mathcal V_1\cap\mathcal V_2.)$$
These considerations allow us to say that the topological space $M$^{13} is called the $M$-dimensional differentiable manifold if $M$ is locally homeomorphic to open sets of $\mathbb R^M$ in such a way that all composite mappings $\psi_{1,2}$ belong to the class $C^k$.^{14}
Remark 4.3.40. We can define an infinite dimensional differentiable (e.g., of the class $C^k$) manifold by replacing $\mathbb R^M$ by a Banach space $X$. As an important example consider a mapping $f\in C^k(X,\mathbb R)$, $k\ge 1$, and define
$$M = \{x\in X : f(x) = 1\}.$$
If $M\ne\emptyset$ and all its points are regular (i.e., $f'(x)\ne 0$ for all $x\in M$), then $M$ is an infinite dimensional manifold of the class $C^k$ (this follows from Proposition 4.3.8 and Remark 4.3.9(i)). Moreover, the tangent space $T_aM$ is equal to $\operatorname{Ker} f'(a)$ for $a\in M$. Indeed, the inclusion $T_aM\subset\operatorname{Ker} f'(a)$ can be proved as in Proposition 4.3.33. To get the reverse inclusion let $h\in\operatorname{Ker} f'(a)$ and let $\psi$ be the diffeomorphism from Proposition 4.3.8. For $\gamma(t)\triangleq\psi(th)$ there exists $\delta > 0$ such that $\gamma(t)\in M$ for $|t| < \delta$ and $\dot\gamma(0) = \psi'(o)h = h$ (cf. the proof of Proposition 4.3.8), i.e., $h\in T_aM$.
In order to define the differential $df(a)$ for $f: M\to\mathbb R$ we need a generalization of the notion of the tangent space $T_aM$ in this more general setting. For $a\in\mathcal V\subset M$ we define
$$\Gamma_a\triangleq\{\gamma:\mathbb R\to M : \exists\text{ open interval } I_\gamma\ni 0 : \gamma(0) = a\in\mathcal V,\ \psi^{-1}\circ\gamma\in C^1(I_\gamma),\ \text{where } \psi:\mathcal U\to\mathcal V \text{ is a local parametrization of } \mathcal V\}.$$
Similarly as above, $T_aM$ is the collection of all vectors $\frac{d}{dt}(\psi^{-1}\circ\gamma)\big|_{t=0}$ with a linear structure of $\mathbb R^M$. (Actually, $T_aM$ now coincides with $\mathbb R^M$. We remark that previously $T_aM$ was an $M$-dimensional subset of $\mathbb R^N$.) Further,
$$df(a): T_aM\to\mathbb R$$
is defined as
$$df(a)v\triangleq\frac{d}{dt}(f\circ\gamma)\Big|_{t=0} = \frac{d}{dt}(F\circ\xi)\Big|_{t=0},\qquad\text{where}\quad v = \dot\xi(o),\quad\xi = \psi^{-1}\circ\gamma,\quad F = f\circ\psi.$$
13 If $M = \bigcup_\alpha\mathcal V_\alpha$ is a set such that there are injective and surjective mappings $\psi_\alpha:\mathcal U_\alpha\to\mathcal V_\alpha$ where $\mathcal U_\alpha$ are open subsets of $\mathbb R^M$, then the sets $\mathcal V_\alpha$ form a subbase of a topology in $M$.
14 The deep theorem due to H. Whitney roughly says that a connected $M$-dimensional differentiable manifold can be embedded (Remark 4.3.3(iv)) into $\mathbb R^{2M+1}$ (see, e.g., Whitney [133, Chapter IV], Aubin [11, Theorem 1.22] or Sternberg [124, Theorem 2.4.4]). This means that our previous approach was not too restrictive.
Figure 4.3.16.
In Appendix 6.3B we consider the level sets of a function $\Phi: X\to\mathbb R$, $\Phi\in C^k$, $k\ge 1$. If 0 is the regular value of $\Phi'$ and $M = \{x\in X : \Phi(x) = \Phi(a)\}\ne\emptyset$, then $M$ is a $C^k$-differentiable manifold (with a parameter space $X_1 = \operatorname{Ker}\Phi'(a)$, see Remark 4.3.9(i) and the proof of Proposition 4.3.8). In this case $T_aM$ can be identified with $X_1$ (the analogue of Proposition 4.3.33).
The following notion is also useful in nonlinear analysis.
Definition 4.3.41. A vector field on a differentiable manifold $M$ is such a mapping $v: M\to TM$ that
$$v(x)\in T_xM\qquad\text{for all}\quad x\in M.$$
A vector field $v$ determines the differential equation
$$\dot x = v(x). \tag{4.3.24}$$
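$$\dot\gamma(t) = v(\gamma(t))\qquad\text{for all}\quad t\in I.$$
$$v(x) = \sum_{i=1}^{M}v^i(x)\frac{\partial}{\partial y_i}\qquad\text{for}\quad x\in\mathcal V,$$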
i = 1, . . . , M.
(4.3.25)
The local existence (and uniqueness) theorem for the system (4.3.25) can be used assuming that the vector field $v$ is continuous (and all partial derivatives $\frac{\partial v^i}{\partial y_j}$ are continuous). A standard continuation process then yields a solution $\gamma$ which is defined on a maximal time interval $I_a$ ($\gamma(0) = a$). It is well known that even very simple differential equations in $\mathbb R$ need not have any solution defined on the whole of $\mathbb R$ (e.g., $\dot x = x^2 + 1$). The situation is better in the case of a compact differentiable manifold $M$ (i.e., when $M$ is a compact subset of $\mathbb R^N$). If $v$ is continuous on $M$, $a\in M$, then there exists a solution $\gamma$ of (4.3.24), $\gamma(0) = a$, which is defined on the whole of $\mathbb R$. Because of the compactness of $M$ there is a finite number of charts $(\mathcal V_i,\varphi_i)$, $i = 1,\dots,k$, such that $M = \bigcup_{i=1}^{k}\mathcal V_i$. By
for all
x M.
where
x = (t)
and
(t) = ((t)).
The solutions of (4.3.24) are called the characteristics of this partial differential equation for an unknown function $F$.
We obtain a system of partial differential equations by considering a family of vector fields. Let $v_1,\dots,v_k$ be vector fields on a manifold $M$ and denote by
$$V(x) = \operatorname{Lin}\{v_1(x),\dots,v_k(x)\}$$
the subspace of $T_xM$. The first integral of the system $v_1,\dots,v_k$ (or the collection of subspaces $\{V(x)\}_{x\in M}$) is a function $f: M\to\mathbb R$ for which
$$df(x)(v_i(x)) = 0,\qquad i = 1,\dots,k,\quad x\in M, \tag{4.3.26}$$
or equivalently, $df$ annihilates all $V(x)$, $x\in M$, i.e., $L_wf = o$ for all vector fields $w$ such that $w(x)\in V(x)$. ($L_wf$ is the so-called Lie derivative; see Exercise 4.3.46.) From this formulation it is clear that we can suppose that the vector fields $v_1,\dots,v_k$ are linearly independent at each $x\in M$. Contrary to the case of one equation, the system (4.3.26) need not have a solution.
The following problem is similar to the preceding one: Let $G$ be an open subset of $\mathbb R^M$ and let $g = (g^1,\dots,g^M)$ be a smooth mapping from $G\times\mathbb R$ into $\mathbb R^M$. Since $\mathbb R^M$ can be also interpreted as the dual space to $\mathbb R^M$, the mapping $g$ determines a system of partial differential equations
$$u'(x) = g(x,u(x)),\qquad x\in G\subset\mathbb R^M. \tag{4.3.27}$$
Expressing the Fréchet derivative (i.e., the differential) $u'(x)$ in terms of partial derivatives we get the system
$$\frac{\partial u}{\partial x_i}(x) = g^i(x,u(x)),\qquad i = 1,\dots,M,\quad x\in G.$$
If this system has a solution $u$, then $u\in C^2(G)$ (since $g$ is supposed to be smooth), which implies a necessary condition for the existence of a solution given by mixed derivatives ($\frac{\partial^2u}{\partial x_i\partial x_j} = \frac{\partial^2u}{\partial x_j\partial x_i}$, $i,j = 1,\dots,M$):
$$\frac{\partial g^i}{\partial x_j} + \frac{\partial g^i}{\partial u}g^j = \frac{\partial g^j}{\partial x_i} + \frac{\partial g^j}{\partial u}g^i,\qquad i,j = 1,\dots,M. \tag{4.3.28}$$
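The necessary condition (4.3.28) can be checked numerically at a point. The following is only a minimal sketch under stated assumptions: the helper names `partial` and `integrability_defect` and the two sample systems are hypothetical illustrations, and the derivatives are approximated by central differences rather than computed exactly.

```python
def partial(fun, args, k, h=1e-5):
    """Central finite difference of fun with respect to its k-th argument."""
    up = list(args)
    dn = list(args)
    up[k] += h
    dn[k] -= h
    return (fun(*up) - fun(*dn)) / (2 * h)


def integrability_defect(g, x, u):
    """Largest violation of condition (4.3.28) at the point (x, u).

    g is a list of M functions g^i(x_1, ..., x_M, u).
    """
    M = len(g)
    args = list(x) + [u]
    worst = 0.0
    for i in range(M):
        for j in range(M):
            # d g^i / d x_j + (d g^i / d u) * g^j
            lhs = partial(g[i], args, j) + partial(g[i], args, M) * g[j](*args)
            # d g^j / d x_i + (d g^j / d u) * g^i
            rhs = partial(g[j], args, i) + partial(g[j], args, M) * g[i](*args)
            worst = max(worst, abs(lhs - rhs))
    return worst


# Sample system 1 (assumed for illustration): u_x1 = x2*u, u_x2 = x1*u,
# which is solved by u = c*exp(x1*x2), so (4.3.28) must hold.
g_good = [lambda x1, x2, u: x2 * u, lambda x1, x2, u: x1 * u]
# Sample system 2 (assumed for illustration): u_x1 = 0, u_x2 = x1,
# whose mixed derivatives disagree, so (4.3.28) fails.
g_bad = [lambda x1, x2, u: 0.0, lambda x1, x2, u: x1]

defect_good = integrability_defect(g_good, (0.3, -0.7), 1.2)
defect_bad = integrability_defect(g_bad, (0.3, -0.7), 1.2)
print(defect_good, defect_bad)
```

A vanishing defect at sampled points is of course only evidence for (4.3.28), not a proof that the system has a solution.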
It is a question how to formulate this integrability condition for the system (4.3.26).
The system (4.3.26) is said to be completely integrable in $M$ if for any $x\in M$ there is a submanifold $N(x)$ (the integral manifold) of $M$ containing $x$ such that
$$i_*(T_yN(x)) = V(y)\qquad\text{for all}\quad y\in N(x)$$
($i$ is the natural embedding of $N(x)$ into $M$). Notice that for one vector field $v\ne o$ (i.e., $\dim V(x) = 1$) the integral manifold is the integral curve of the system $\dot x = v(x)$ and, in general, it contains the integral curves of all equations
$$\dot x = v_i(x),\qquad i = 1,\dots,M.$$
The gluing of all integral curves need not be a manifold. A possible problem is shown in Remark 4.3.43 below.
The basic result on complete integrability is the next theorem (Frobenius Theorem) which we state without proof. This theorem is an important tool in differential geometry, and the reader can find its proof in textbooks on this subject, e.g., Aubin [11], Sternberg [124, §3.5]. To formulate the theorem in a compact form we need^{15} the notion
15 We will give another formulation of this theorem at the end of Appendix 4.3B.
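of the commutator $[v,w]$ of vector fields
$$v = \sum_{i=1}^{M}v^i\frac{\partial}{\partial y_i},\qquad w = \sum_{j=1}^{M}w^j\frac{\partial}{\partial y_j},$$
$$[v,w] = \sum_{j=1}^{M}\sum_{i=1}^{M}\Bigl(v^i\frac{\partial w^j}{\partial y_i} - w^i\frac{\partial v^j}{\partial y_i}\Bigr)\frac{\partial}{\partial y_j}. \tag{4.3.29}$$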
For another interpretation of this operation see Remark 4.3.43 below or Exercise 4.3.47.
Theorem 4.3.42 (Frobenius). Let $v_1,\dots,v_k$ be smooth vector fields on a manifold $M$. Then this system is completely integrable if and only if
$$[v_i(x),v_j(x)]\in V(x)\qquad\text{for all}\quad x\in M,\quad i,j = 1,\dots,k.$$
Remark 4.3.43. Suppose that two smooth vector fields $v$, $w$ are given on a (compact) manifold $M$ and let $\varphi_v$, $\varphi_w$ be the corresponding dynamical systems on $M$. There is no reason to expect that these systems commute, i.e.,
$$\varphi_v(t,\varphi_w(s,x)) = \varphi_w(s,\varphi_v(t,x)),$$
and it is not difficult to construct a counterexample which confirms that (see Figure 4.3.17).
Figure 4.3.17.
It can be shown that a necessary and sufficient condition for commutativity is that
$$[v,w] = o\qquad\text{(see (4.3.29))}.$$
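A counterexample of the kind mentioned in Remark 4.3.43 can be written down explicitly in $\mathbb R^2$. The sketch below is an illustration only: the fields $v = \partial/\partial x$ and $w = x\,\partial/\partial y$, their explicitly integrated flows, and the helper `bracket` (formula (4.3.29) with central differences) are assumptions chosen for this example and are not taken from the text.

```python
def flow_v(t, p):
    """Flow of v = d/dx on R^2: translation in the x-direction."""
    x, y = p
    return (x + t, y)


def flow_w(s, p):
    """Flow of w = x d/dy on R^2: shear in the y-direction with speed x."""
    x, y = p
    return (x, y + s * x)


def bracket(v, w, p, h=1e-5):
    """Components of [v, w] at p via formula (4.3.29), central differences."""
    n = len(p)

    def d(field, j, i):
        # Approximate d(field^j)/d(y_i) at p.
        up = list(p)
        dn = list(p)
        up[i] += h
        dn[i] -= h
        return (field(up)[j] - field(dn)[j]) / (2 * h)

    vp, wp = v(p), w(p)
    return [
        sum(vp[i] * d(w, j, i) - wp[i] * d(v, j, i) for i in range(n))
        for j in range(n)
    ]


v = lambda p: (1.0, 0.0)   # components of v = d/dx
w = lambda p: (0.0, p[0])  # components of w = x d/dy

p0 = (0.5, 0.25)
t, s = 0.3, 0.7
a = flow_v(t, flow_w(s, p0))   # first w for time s, then v for time t
b = flow_w(s, flow_v(t, p0))   # first v for time t, then w for time s
gap = (b[0] - a[0], b[1] - a[1])
lie = bracket(v, w, p0)
print(gap, lie)
```

For this particular pair the two end points differ by $s\,t$ in the $y$-direction, in agreement with the nonzero commutator $[v,w] = \partial/\partial y$.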
Exercise 4.3.44. We say that a function $f: M\to\mathbb R$ is an integral of the equation (4.3.24) if $f(\gamma(\cdot))$ is constant for any solution $\gamma$ of (4.3.24). If this is true only locally, $f$ is called a local integral. Suppose that $\dim M = 2$ and $(\mathcal V,\varphi)$ is a local chart of $M$, $f$ is the integral of (4.3.24) and $df(x)\ne 0$ for all $x\in\mathcal V$. Prove the following assertions.
(i) There is no stationary point of (4.3.24) in $\mathcal V$ ($a\in\mathcal V$ is a stationary point of (4.3.24) if $\gamma(t) = a$, $t\in\mathbb R$, is a solution of (4.3.24)).
F = f ,
where
+ h(z)
z1
z2
in these new local coordinates $z = (z_1,z_2)$ (use the transformation rule (4.3.17) and the fact that $f$ is an integral of (4.3.24)). Notice that $h(z)\ne 0$ for all $z\in\tilde{\mathcal U}$.
(iii) Put
H(z1 , z2 ) =
z2
0
z2
d
h(z1 , )
+1
1
2
i, j = 1, . . . , k,
on
U.
,
yi
i = 1, . . . , k,
in a neighborhood of a.
Hint. Cf. Exercises 4.3.44 and 4.3.26.
Exercise 4.3.46. Let $M$ be a differentiable manifold of the class $C^\infty$ and let $X$ be the set of all $C^\infty$-functions on $M$. Let $D: X\to X$ satisfy
(D1) $D$ is linear,
(D2) $D(fg) = gDf + fDg$ for all $f,g\in X$ (pointwise multiplication).
Show that there is a vector field $v$ on $M$ such that
$$L_vf(x)\triangleq df(x)(v(x)) = Df(x),\qquad x\in M,\quad f\in X.$$
Here $L_vf$ is the directional derivative (in the direction of the vector field $v$) which is also called the Lie derivative (cf. page 192).^{16}
Hint. Put
$$v = \sum_{i=1}^{M}a_i\frac{\partial}{\partial y_i}\qquad\text{where}\quad a_i = Dy_i.$$
Show that
$$Df - L_vf = o$$
holds for polynomials of degree $\le 1$ on $\mathcal U$. Then use the Taylor polynomial. It remains to extend this result from local charts to the whole manifold; use a partition of unity. See Definition 4.3.74 and Theorem 4.3.76.
The converse statement, i.e., the fact that $L_v$ satisfies (D1), (D2) for smooth $v$, is easy to prove. (Do that.) Is there any difference between the differential and the Lie derivative?
Exercise 4.3.47. Let $v$, $w$ be two smooth vector fields on a differentiable manifold. Define the vector field $[v,w]$, the so-called commutator (or Lie bracket) of $v$, $w$, by the formula
$$L_{[v,w]}f = L_v(L_wf) - L_w(L_vf)\qquad\text{for every}\quad f\in X$$
(see (4.3.29) and Exercise 4.3.46). Show that this definition is correct, i.e., $[v,w]$ is a vector field, and show that the Jacobi identity
$$[u,[v,w]] + [v,[w,u]] + [w,[u,v]] = o$$
holds for any smooth vector fields $u$, $v$, $w$.^{17}
$$f^i(e_j) = \delta^i_j = \begin{cases} 1, & i = j,\\ 0, & i\ne j.\end{cases}$$
M
f i (x)ei .
i=1
M
i1 ,...,ip =1
1i1 <<ip M
sgn f
i(1)
(x1 ) f
i(p)
(xp )
{1,...,p}
A(ei1 , . . . , eip )
=
1i1 <<ip M
In particular, if $p = M$, then
$$A(x_1,\dots,x_M) = \det\bigl(f^i(x_k)\bigr)_{i,k=1,\dots,M}\,A(e_1,\dots,e_M), \tag{4.3.30}$$
i.e., $\dim\Lambda_M(X) = 1$. Notice also that $\Lambda_p(X) = \{o\}$ for $p > M$.
(ii) Elements $x_1,\dots,x_p$ of $X$ are linearly dependent if and only if
$$A(x_1,\dots,x_p) = 0\qquad\text{for all}\quad A\in\Lambda_p(X).$$
Remark 4.3.51.
(i) The exterior product of three or more skew-symmetric forms is defined by induction and the associative law holds, i.e.,
$$A\wedge B\wedge C\triangleq(A\wedge B)\wedge C = A\wedge(B\wedge C).$$
(ii) The exterior product is not commutative. Namely,
$$B\wedge A = (-1)^{pq}A\wedge B\qquad\text{for}\quad A\in\Lambda_p(X),\quad B\in\Lambda_q(X).$$
Example 4.3.52.
(i) If $A$, $B$ are one-forms (i.e., linear forms), then
$$A\wedge B(x_1,x_2) = A(x_1)B(x_2) - A(x_2)B(x_1).$$
More generally, by induction,
$$A_1\wedge\dots\wedge A_n(x_1,\dots,x_n) = \det\bigl(A_i(x_j)\bigr)_{i,j=1,\dots,n}\qquad\text{for one-forms}\quad A_1,\dots,A_n.$$
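The determinant formula for the exterior product of one-forms can be verified on a small instance. The following sketch assumes the normalization used in the two-form case above (an alternating sum over all permutations without a factorial factor); the helper names and the sample forms and vectors in $\mathbb R^3$ are hypothetical.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation of 0..n-1, computed by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def one_form(coeffs):
    """Linear form x -> sum_i coeffs[i] * x[i]."""
    return lambda x: sum(c * xi for c, xi in zip(coeffs, x))


def wedge(forms):
    """A_1 ^ ... ^ A_n as an alternating sum over permutations of the arguments."""
    n = len(forms)

    def value(*vectors):
        total = 0
        for perm in permutations(range(n)):
            term = sign(perm)
            for k in range(n):
                term *= forms[k](vectors[perm[k]])
            total += term
        return total

    return value


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


forms = [one_form((1, 2, 0)), one_form((0, 1, 3)), one_form((2, 0, 1))]
xs = [(1, 0, 2), (0, 1, 1), (3, 1, 0)]

lhs = wedge(forms)(*xs)
rhs = det([[A(x) for x in xs] for A in forms])
print(lhs == rhs)
```

Integer data are used so that both sides can be compared exactly, and swapping two arguments changes the sign of the value, as skew-symmetry requires.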
(4.3.32)
1i1 ip M
M
(f )(y) i
dy ,
yi
i=1
(y) = x
(see (4.3.20)).
(ii) Let be a pdierential form in RN with the representation
(x) =
ai1 ,...,ip (x) df i1 df ip
1i1 <<ip N
(x)
,...,
ai1 ,...,ip (x) det f ik
.
=
yj1
yjp
yjl
k,l=1,...,p
1i <<i N
1
are local charts at the same point, then we have two represen )
(iii) If (V, ) and (V,
(x) =
gj1 ,...,jp dz j1 dz jp .
1j1 <<jp M
(Here $dy^1,\dots,dy^M$ is the basis of $(T_xM)^*$ with respect to the local chart $(\mathcal V,\varphi)$ and similarly $dz^1,\dots,dz^M$ for $(\tilde{\mathcal V},\tilde\varphi)$.) The Transformation Rule (4.3.22) yields a relation between the coefficients $f_{\dots}$ and $g_{\dots}$. This relation is simple for $M$-forms ($M = \dim M$), namely
j
((x)) dy 1 dy M
(x) = g(x) dz 1 dz M = g(x) det
yi
j
yi
(
(y)
local coordinates (see Figure 4.3.14). The determinant of the Jacobi matrix will be called the Jacobian and denoted by $J$ (Example 4.1.5). This Transformation Rule can be generalized to mappings between manifolds in a way similar to (4.3.19) and (4.3.23). If $g: M\to\tilde M$ is a smooth map and $\omega$ is a $p$-differential form on $\tilde M$, then the formula
$$g^*\omega(x)(v_1,\dots,v_p) = \omega(g(x))(g_*v_1,\dots,g_*v_p), \tag{4.3.33}$$
$x\in M$, $v_1,\dots,v_p\in T_xM$, where $g_*v_i$ is the pushforward of the tangent vector $v_i$ (see (4.3.19)), defines the pullback of $\omega$. To obtain a local representation of the type (4.3.33) we choose local coordinates at $x$, put $v_k = \frac{\partial}{\partial y_{j_k}}$ and use the Transformation Rule (4.3.19). However, the final formula is rather cumbersome and we will not need it with the exception of the case when $\dim M = \dim\tilde M = M$ and $\omega$ is an $M$-form,
$$\omega(z) = f(z)\,dz^1\wedge\dots\wedge dz^M.$$
Then
where $G$ is the local realization of $g$ (see Figure 4.3.15). An important special case is where $\psi$ is a coordinate mapping $\mathcal U\subset\mathbb R^M\to\mathcal V\subset M$. The next example shows how to compute the pullback of $(M-1)$-forms for small $M$. These formulae are often used in vector calculus; see also special cases of integration in Appendix 4.3C.
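Example 4.3.55.
(i) Let
$$\omega(x,y) = f(x,y)\,dx + g(x,y)\,dy$$
be a 1-form in $\mathbb R^2$ and $\gamma = (\gamma_1,\gamma_2): (a,b)\to\mathbb R^2$ a smooth curve. Then
$$\gamma^*\omega(t)\Bigl(\frac{\partial}{\partial t}\Bigr) = \omega(\gamma(t))\Bigl(\gamma_*\frac{\partial}{\partial t}\Bigr),$$
$$\gamma^*\omega(t) = f(\gamma_1(t),\gamma_2(t))\,\dot\gamma_1(t)\,dt + g(\gamma_1(t),\gamma_2(t))\,\dot\gamma_2(t)\,dt.$$
(ii) Let
$$\omega(x,y,z) = f(x,y,z)\,dy\wedge dz + g(x,y,z)\,dz\wedge dx + h(x,y,z)\,dx\wedge dy$$
be a 2-form in $\mathbb R^3$ and $\psi: (u,v)\in\mathcal U\subset\mathbb R^2\to\mathbb R^3$ a smooth parametrization of a surface $S$ in $\mathbb R^3$. Then
$$[\psi^*\omega(u,v)](e_1,e_2) = \omega(\psi(u,v))\bigl(\psi'(u,v)e_1,\psi'(u,v)e_2\bigr)$$
and, if
$$\frac{\partial\psi}{\partial u} = u_1\frac{\partial}{\partial x} + u_2\frac{\partial}{\partial y} + u_3\frac{\partial}{\partial z},\qquad\frac{\partial\psi}{\partial v} = v_1\frac{\partial}{\partial x} + v_2\frac{\partial}{\partial y} + v_3\frac{\partial}{\partial z}$$
(here $\frac{\partial}{\partial x}$ is actually the first vector of the standard basis in $\mathbb R^3$; see Remark 4.3.54(ii)), then
$$dy\wedge dz\Bigl(\frac{\partial\psi}{\partial u},\frac{\partial\psi}{\partial v}\Bigr) = u_2v_3 - u_3v_2,\quad\text{etc.},$$
$$[\psi^*\omega(u,v)](e_1,e_2) = \Bigl((f,g,h),\frac{\partial\psi}{\partial u}\times\frac{\partial\psi}{\partial v}\Bigr)_{\mathbb R^3},$$
where
$$\frac{\partial\psi}{\partial u}\times\frac{\partial\psi}{\partial v} = (u_2v_3 - u_3v_2,\ u_3v_1 - u_1v_3,\ u_1v_2 - u_2v_1).$$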
Remark 4.3.56. The reader can ask why it is necessary (or reasonable) to introduce differential forms even though vectors and vector fields have been defined. Actually, there is only a technical difference for one-forms, since $T_aM$ is isomorphic to its dual $(T_aM)^*$. For example, $df(a)\in(T_aM)^*$ and therefore it can be represented by a scalar product in $T_aM$. Since $T_aM$ is a linear subspace of $\mathbb R^N$ (for $M\subset\mathbb R^N$) we may define the scalar product in $T_aM$ as
$$(v,w)_{T_aM}\triangleq(v,w)_{\mathbb R^N}\qquad\text{for}\quad v,w\in T_aM.^{19}$$
In particular, this means that there is a vector $\nabla f(a)$, the so-called gradient of $f$, such that
$$df(a)(v) = (v,\nabla f(a))_{T_aM}.$$
If
$$df(a) = \sum_i f_i(a)\,dy_i,\qquad\text{then}\qquad f_i(a) = \Bigl(\frac{\partial}{\partial y_i},\nabla f(a)\Bigr)_{T_aM}.^{20}$$
The reason for distinguishing between differential forms and vector fields lies in the richer structure of the collection of all differential forms: there are operations like the exterior product and the exterior differential (Definition 4.3.57). Moreover, the differential forms
$$\omega_1 = f\,dx + g\,dy + h\,dz\qquad\text{and}\qquad\omega_2 = f\,dy\wedge dz + g\,dz\wedge dx + h\,dx\wedge dy$$
both correspond to the same vector field
$$F = f\frac{\partial}{\partial x} + g\frac{\partial}{\partial y} + h\frac{\partial}{\partial z}.$$
We will see in Appendix 4.3C that the integral of $\omega_1$ along a curve $\gamma$ can be interpreted as work done by the force field $F$ along $\gamma$, and the integral of $\omega_2$ along a surface $S$ has the meaning of the rate at which a fluid flow represented by the velocity field $F$ crosses $S$. Another reason consists in a simplification of various notions and results of classical vector analysis and differential geometry. Examples like orientation, elementary volume and the Stokes Theorem will be shown in Appendix 4.3C.
19 In
20 We
$$d\omega(x) = \sum_{1\le i_1<\dots<i_p\le M}df_{i_1\dots i_p}(x)\wedge dy^{i_1}\wedge\dots\wedge dy^{i_p}. \tag{4.3.34}$$
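Example 4.3.58.
(i) If $f: M\to\mathbb R$ is a differentiable function, i.e., a 0-form, then the differential $df$ given by Remark 4.3.54(i) is the same as that in (4.3.34).
(ii) Let
$$\omega(x) = f_1(x)\,dx_1 + f_2(x)\,dx_2 + f_3(x)\,dx_3$$
be a 1-form on an open set $G$ in $\mathbb R^3$ and $(x_1,x_2,x_3)$ the Cartesian coordinates of a point $x$. If $f_1$, $f_2$, $f_3$ are smooth functions on $G$, then
$$d\omega(x) = df_1(x)\wedge dx_1 + df_2(x)\wedge dx_2 + df_3(x)\wedge dx_3$$
$$= \underbrace{\frac{\partial f_1}{\partial x_1}\,dx_1\wedge dx_1}_{=0} + \frac{\partial f_1}{\partial x_2}\,dx_2\wedge dx_1 + \frac{\partial f_1}{\partial x_3}\,dx_3\wedge dx_1 + \cdots$$
$$= \Bigl(\frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2}\Bigr)dx_1\wedge dx_2 + \Bigl(\frac{\partial f_3}{\partial x_2} - \frac{\partial f_2}{\partial x_3}\Bigr)dx_2\wedge dx_3 + \Bigl(\frac{\partial f_1}{\partial x_3} - \frac{\partial f_3}{\partial x_1}\Bigr)dx_3\wedge dx_1.$$
The components
$$\Bigl(\frac{\partial f_3}{\partial x_2} - \frac{\partial f_2}{\partial x_3},\ \frac{\partial f_1}{\partial x_3} - \frac{\partial f_3}{\partial x_1},\ \frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2}\Bigr)$$
form the so-called curl of $v = (f_1,f_2,f_3)$ (notation $\operatorname{curl}v$ or $\nabla\times v$; for the cross product see Example 4.3.55(ii)).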
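The coefficients of $d\omega$ from Example 4.3.58(ii) can be approximated by finite differences. The sketch below is a hypothetical illustration: the function name `d_one_form`, the step size, and the two sample one-forms are assumptions and do not come from the text.

```python
def d_one_form(f, x, h=1e-5):
    """Coefficients of d(f1 dx1 + f2 dx2 + f3 dx3) at the point x.

    Returned in the order (dx1^dx2, dx2^dx3, dx3^dx1), with partial
    derivatives approximated by central differences.
    """
    def D(i, k):
        # Approximate d f_{i+1} / d x_{k+1} at x (indices are 0-based here).
        up = list(x)
        dn = list(x)
        up[k] += h
        dn[k] -= h
        return (f[i](*up) - f[i](*dn)) / (2 * h)

    return (D(1, 0) - D(0, 1), D(2, 1) - D(1, 2), D(0, 2) - D(2, 0))


# Sample form 1 (assumed): omega = -x2 dx1 + x1 dx2, so d(omega) = 2 dx1^dx2.
rotation = [
    lambda x1, x2, x3: -x2,
    lambda x1, x2, x3: x1,
    lambda x1, x2, x3: 0.0,
]
# Sample form 2 (assumed): omega = d(x1*x2*x3), an exact form, so d(omega) = 0.
exact = [
    lambda x1, x2, x3: x2 * x3,
    lambda x1, x2, x3: x1 * x3,
    lambda x1, x2, x3: x1 * x2,
]

point = (0.4, -1.1, 2.0)
coeff_rotation = d_one_form(rotation, point)
coeff_exact = d_one_form(exact, point)
print(coeff_rotation, coeff_exact)
```

The second sample also illustrates the identity $d(df) = 0$ for a smooth function $f$, up to the discretization error of the difference quotients.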
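Remark 4.3.59. By computing the differentials $df_{i_1\dots i_p}$ as in the previous example and rearranging the sum in (4.3.34) to get rid of the zero terms $dy^{i_1}\wedge\dots\wedge dy^{i_{p+1}}$ where two indices coincide, we obtain, e.g., for an $(M-1)$-form,
$$\omega(x) = \sum_{i=1}^{M}f_i(x)\,dy^1\wedge\dots\wedge\widehat{dy^i}\wedge\dots\wedge dy^M$$
(here $\widehat{dy^i}$ means that $dy^i$ is missing),
$$d\omega(x) = \sum_{i=1}^{M}(-1)^{i-1}\frac{\partial(f_i\circ\psi)}{\partial y_i}(\psi^{-1}(x))\,dy^1\wedge\dots\wedge dy^M.$$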
Example 4.3.58(i) leads to the following question: Are all one-differential forms differentials of smooth functions? In other words, has any (continuous) one-form $\omega$ a primitive function $f$, i.e., is there $f$ such that $df = \omega$? A short speculation on one-forms in $\mathbb R^2$ suggests obstacles caused by mixed partial derivatives. We investigate this problem in a more general way.
Proposition 4.3.60.
(i) Let $\omega$ be a differential form of the class $C^2$. Then
$$d^2\omega\triangleq d(d\omega) = 0.$$
(ii) Let $\omega$ and $\eta$ be a $p$-differential form and $q$-differential form, respectively, then
$$d(\omega\wedge\eta) = (d\omega\wedge\eta) + (-1)^p(\omega\wedge d\eta).$$
Proof. An easy proof is left to the reader. Notice however that the exchangeability of mixed partial derivatives of $C^2$-functions is the crucial point in the statement (i).
Definition 4.3.61. A differential form $\omega$ is said to be
(1) closed if $d\omega = 0$,
(2) exact if there is a differential form $\eta$ such that $\omega = d\eta$.
Remark 4.3.62. The concept of exact differential forms is a generalization of the classical notion of the potential of a mapping $f:\mathbb R^M\to\mathbb R^M$: A function $F: G\to\mathbb R$ is called a potential of $f$ in an open set $G\subset\mathbb R^M$ if
$$F'(x)h = (f(x),h)_{\mathbb R^M},\qquad x\in G,\quad h\in\mathbb R^M.$$
i, j = 1, . . . , M,
x G.
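The following example shows that this necessary condition is not sufficient.
Example 4.3.63. Let $G = \mathbb R^2\setminus\{(0,0)\}$ and let
$$\omega(x,y) = -\frac{y}{x^2+y^2}\,dx + \frac{x}{x^2+y^2}\,dy$$
be a 1-form in $G$. This form is closed in $G$. Suppose now that $\omega$ is exact, i.e., there is a function $f: G\to\mathbb R$ such that $df = \omega$, in particular,
$$\frac{\partial f}{\partial x} = -\frac{y}{x^2+y^2},\qquad\frac{\partial f}{\partial y} = \frac{x}{x^2+y^2}\qquad\text{in}\quad G.$$
Integrating, we obtain
$$f(x,y) = \begin{cases} -\arctan\dfrac{x}{y} + C(y) & \text{for } (x,y)\in G,\ y\ne 0,\\[2mm] \arctan\dfrac{y}{x} + D(x) & \text{for } (x,y)\in G,\ x\ne 0.\end{cases}$$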
$$F(z) = \frac{1}{z},\qquad z = x+iy,\qquad\operatorname{Re}F(x+iy) = \frac{x}{x^2+y^2},\qquad\operatorname{Im}F(x+iy) = -\frac{y}{x^2+y^2}.$$
$\frac{1}{z}$ for all $z\in\mathbb C\setminus\{0\}$ (consider $\int_{S^1}\frac{dz}{z}$). In the theory of functions of a complex variable a primitive function can be constructed by a curve integral. We will use the same approach in constructing a primitive form to a differential form. This is the main idea of the proof of the following basic result.
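The obstruction in Example 4.3.63 can be observed numerically: the integral of the closed form $\omega = \frac{-y\,dx + x\,dy}{x^2+y^2}$ along a loop around the origin does not vanish, which is impossible for an exact form. The sketch below is an illustration under stated assumptions; the function names, the midpoint discretization and the two sample loops are hypothetical choices.

```python
import math


def omega(x, y):
    """Coefficients (a, b) of omega = a dx + b dy from Example 4.3.63."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)


def integrate_closed(curve, n=2000):
    """Approximate the integral of omega along a closed curve t -> curve(t), t in [0, 1].

    Midpoint rule; the increments dx, dy over each subinterval are taken
    as differences of the curve at the subinterval end points.
    """
    total = 0.0
    dt = 1.0 / n
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = curve(t)
        x1, y1 = curve(t + 0.5 * dt)
        x0, y0 = curve(t - 0.5 * dt)
        a, b = omega(x, y)
        total += a * (x1 - x0) + b * (y1 - y0)
    return total


# Unit circle around the origin (winds once around the removed point).
circle = lambda t: (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))
# Unit circle centered at (3, 0), which does not enclose the origin.
shifted = lambda t: (3 + math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

around_origin = integrate_closed(circle)
away_from_origin = integrate_closed(shifted)
print(around_origin, away_from_origin)
```

The first value approximates $2\pi$, the second approximates $0$; only the loop that cannot be contracted to a point in $G$ detects the failure of exactness.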
Theorem 4.3.64 (H. Poincaré). Any closed differential form on a differentiable manifold is locally exact.
Proof. Let $\omega$ be a closed $p$-form on an $M$-dimensional manifold $M$ ($1\le p\le M$). We choose a local chart $(\mathcal V,\varphi)$ such that $P\circ\varphi(\mathcal V) = \mathcal U$ is an open ball in $\mathbb R^M$ with center at the origin. The pullback $\psi^*\omega$ (see Remark 4.3.54(iii)) ($\psi = (P\circ\varphi)^{-1}$) is a $p$-form in $\mathcal U$. We define a $(p-1)$-form $\eta$ on $\mathcal U$ by the formula
$$\eta(y)(v_1,\dots,v_{p-1}) = \int_0^1 t^{p-1}\,\psi^*\omega(ty)(y,v_1,\dots,v_{p-1})\,dt$$
for $y\in\mathcal U$, $v_1,\dots,v_{p-1}\in T_y\mathcal U$.^{21}
(i) the integral exists (this fact follows from the continuity of $t\mapsto\psi^*\omega(ty)$);
(ii) $\eta$ is a $(p-1)$-differential form on $\mathcal U$ (the skew-symmetry of $\eta$ follows from the same property of $\omega$);
(iii) $d\eta(y) = \psi^*\omega(y)$ for $y\in\mathcal U$.
Verification of the last statement is technically complicated. The case $p = 1$ is more transparent, and therefore we will give the computation only for this case. For the induction step the reader can consult, e.g., Sternberg [124, Theorem III.4.1], Cartan [21, Theorem II.3.2.12.1] or Taylor [127, Theorem 1.13.2].
Suppose that $\omega$ has in $\mathcal U$ the form
$$\tilde\omega(y)\triangleq(\psi^*\omega)(y) = \sum_{i=1}^{M}g_i(y)\,dy^i,\qquad y\in\mathcal U,$$
21 Here
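We have to verify that
$$\frac{\partial\eta}{\partial y_j}(y) = g_j(y),\qquad j = 1,\dots,M.$$
Differentiating $\eta(y) = \int_0^1\sum_{i=1}^{M}g_i(ty)\,y_i\,dt$ we get
$$\frac{\partial\eta}{\partial y_j}(y) = \int_0^1 g_j(ty)\,dt + \sum_{i=1}^{M}\int_0^1\frac{\partial g_i}{\partial y_j}(ty)\,t\,y_i\,dt = \int_0^1 g_j(ty)\,dt + \sum_{i=1}^{M}\int_0^1\frac{\partial g_j}{\partial y_i}(ty)\,t\,y_i\,dt.$$
For the last equality we have used the assumption $d\omega = 0$ and Exercise 4.3.71(iv):
$$d\tilde\omega = d(\psi^*\omega) = \psi^*(d\omega) = 0$$
and, consequently,
$$\frac{\partial g_j}{\partial y_i} = \frac{\partial g_i}{\partial y_j},\qquad i,j = 1,\dots,M.$$
Using integration by parts we get
$$\int_0^1 g_j(ty)\,dt = \bigl[t\,g_j(ty)\bigr]_{t=0}^{t=1} - \int_0^1 t\,\frac{d}{dt}g_j(ty)\,dt = g_j(y) - \sum_{i=1}^{M}\int_0^1 t\,\frac{\partial g_j}{\partial y_i}(ty)\,y_i\,dt.$$
If we put $f(x) = \eta(y)$ for $x = \psi(y)$, then
$$df = \omega.$$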
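The case $p = 1$ of the proof is constructive and can be tried out numerically: for a mapping $g$ with symmetric partial derivatives on a ball, the radial integral $\eta(y) = \int_0^1\sum_i g_i(ty)\,y_i\,dt$ should have gradient $g$. The following sketch assumes a concrete sample mapping $g$ (chosen so that $\sin(y_1y_2)$ is a potential) and uses Simpson's rule and central differences; all names are hypothetical.

```python
import math


def potential(g, y, n=400):
    """eta(y) = int_0^1 sum_i g_i(t*y) * y_i dt, by composite Simpson's rule (n even)."""
    def integrand(t):
        point = [t * yi for yi in y]
        return sum(gi(*point) * yi for gi, yi in zip(g, y))

    h = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3


def gradient(fun, y, h=1e-5):
    """Central-difference gradient of fun at y."""
    grad = []
    for k in range(len(y)):
        up = list(y)
        dn = list(y)
        up[k] += h
        dn[k] -= h
        grad.append((fun(up) - fun(dn)) / (2 * h))
    return grad


# Sample mapping (assumed): g = grad(sin(y1*y2)); it satisfies
# d g_1 / d y_2 = d g_2 / d y_1, i.e., the symmetry conditions.
g = [
    lambda y1, y2: y2 * math.cos(y1 * y2),
    lambda y1, y2: y1 * math.cos(y1 * y2),
]

y = (0.4, -0.9)
eta_value = potential(g, y)
eta_gradient = gradient(lambda p: potential(g, p), y)
g_value = [gi(*y) for gi in g]
print(eta_value, eta_gradient, g_value)
```

The construction uses only values of $g$ on the segment from the origin to $y$, which is why the domain is taken to be a ball centered at the origin.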
Remark 4.3.65. The proof of the case $p = 1$ shows that there exists a potential of a smooth mapping $g = (g^1,\dots,g^M):\mathcal U\to\mathbb R^M$ in a ball $\mathcal U$ provided the symmetry conditions
$$\frac{\partial g^j}{\partial y_i} = \frac{\partial g^i}{\partial y_j},\qquad i,j = 1,\dots,M,$$
hold. Example 4.3.63 suggests that certain topological properties of $\mathcal U$ are necessary if $\mathcal U$ is not a ball. In the proof of the previous theorem the potential $\eta$ was defined by the curve integral
$$\eta(y) = \int_0^1(g(\gamma(t)),\dot\gamma(t))_{\mathbb R^M}\,dt\triangleq\int_{\gamma_{o,y}}\tilde\omega\ {}^{22}$$
along the curve $\gamma_{o,y} = \{ty : t\in[0,1]\}$. The crucial point in the direct computation of the Fréchet derivative of $\eta$ is an estimate of the difference $\eta(y+h) - \eta(y)$. If the curve integral depends only on the initial and terminal points and not on the path which joins these points, then
$$\eta(y+h) - \eta(y) = \int_{\gamma_{y,y+h}}\tilde\omega = \int_0^1(g(y+th),h)_{\mathbb R^M}\,dt = (g(y),h)_{\mathbb R^M} + o(\|h\|_{\mathbb R^M})$$
22 The definition of the curve integral and of the integral of a differential form is given in the next Appendix 4.3C.
i.e.,
f Z p (M ).
f (x) dx.
0
It is easy to see that $\Phi(\omega) = 0$ if and only if $\omega$ is exact, and also that $\Phi$ maps $B^1(\mathbb R_{\mathbb Z})$ onto $\mathbb R$. This shows that $\Phi$ induces the isomorphism of $H^1(\mathbb R_{\mathbb Z})$ onto $\mathbb R$.
Now we explain the notion of a simply connected domain in another way which will be important in the sequel (e.g., in the degree theory, Appendix 4.3D).
Definition 4.3.67. Let $X$, $Y$ be metric (topological) spaces. Continuous maps $f,g: X\to Y$ are called homotopic if there exists a continuous map $H: X\times[0,1]\to Y$ such that
$$H(\cdot,0) = f(\cdot),\qquad H(\cdot,1) = g(\cdot).$$
Remark 4.3.68. The relation between two continuous maps to be homotopic is clearly an equivalence relation. The set of all continuous maps $C(X,Y)$ is therefore divided into disjoint classes of mutually homotopic maps. We denote the class containing $f$ by $[f]$.
Here we are using the homotopy concept mainly for curves. The reader can imagine that curves $\gamma_0,\gamma_1: [0,1]\to M$ are homotopic if $\gamma_0$ may be continuously deformed (in $M$!) into $\gamma_1$. A curve $\gamma$ is called null-homotopic if $\gamma$ is homotopic to a constant curve (point)
M
i dxi
i=1
(4.3.36)
for all
v T (x).
and
(y) = 0
for all
yS
j = 1, . . . , k,
(4.3.37)
Theorem 4.3.69 (Frobenius, the differential forms version). Let $\omega_1,\dots,\omega_k$ be smooth differential one-forms on a differentiable manifold $M$. The necessary and sufficient condition for the existence of an integral manifold in a vicinity of any point of $M$ is that
$$d\omega_i\wedge\omega_1\wedge\dots\wedge\omega_k = 0,\qquad i = 1,\dots,k.$$
Proof. The equivalence of Theorem 4.3.42 and Theorem 4.3.69 is not difficult to prove, and the case $k = 1$, $\dim M = 3$ is rather instructive. In this case a connection to the Poincaré Theorem 4.3.64 and its proof should also be recognized.
Exercise 4.3.70. Denote by $O(N)$ the set of all regular linear mappings $A:\mathbb R^N\to\mathbb R^N$ for which $A^{-1} = A^*$ and by $SO(N)$ the set
$$\{A\in O(N) : \det A = 1\}.$$
(i) As a subset of $\mathbb R^{N\times N}$ the set $O(N)$ is a differentiable manifold. Find its dimension.
Hint. Consider $A\mapsto A^*A$. The dimension is $\frac{N(N-1)}{2}$.
(ii) Show that $SO(N)$ is the component of $O(N)$ containing the identity.
(iii) Show that $A\in O(N)$ induces a mapping of $S^{N-1}$ into itself.
(iv) Let $\omega$ be a one-form on $S^2$ which is invariant under $SO(3)$, i.e.,
$$A^*\omega = \omega\qquad\text{for all}\quad A\in SO(3).$$
Prove that $\omega = 0$.
(v) Does a result analogous to (iv) hold for a two-form on $\mathbb R^3$?
Exercise 4.3.71. Prove the following properties of the pullback operation:
(i) $g^*(\omega\wedge\eta) = (g^*\omega)\wedge(g^*\eta)$,
(ii) $(h\circ g)^*\omega = g^*(h^*\omega)$,
(iii) $g^*(df) = d(f\circ g)$ for $f:\tilde M\to\mathbb R$,^{23}
(iv) $g^*(d\omega) = d(g^*\omega)$ where $d$ denotes the differential.
23 If
Exercise 4.3.72. Let $M$ be the open unit ball in $\mathbb R^2$ without its origin. Show that $H_1(M)$ is isomorphic to $\mathbb Z$. Is it also true in $\mathbb R^3$?
Exercise 4.3.73. Show that $H_1(S^1)$ is isomorphic to $\mathbb Z$.
Hint. Use an approach similar to that in Example 4.3.66(ii), and instead of the mapping $\Phi$ show that there is a lifting of a continuous closed curve $\gamma: [0,1]\to\mathbb R_{\mathbb Z}$, i.e., $\tilde\gamma: [0,1]\to\mathbb R$ continuous such that
$$i(\tilde\gamma) = \gamma,\qquad\tilde\gamma(0) = 0.$$
Now consider $\tilde\gamma(1)$ (actually this is the degree of $\gamma$; see Appendix 4.3D). For details see, e.g., Kosniowski [77, Chapter 16].
dt.
(4.3.38)
dt.
l()
a
We will return to the length and area of a nonlinear object later in this appendix.
The integral on the right-hand side of (4.3.38) is the Riemann integral and consequently it has reasonable properties. It could be generalized to some non-continuous functions (via the Lebesgue integral) and/or to certain non-smooth curves (piecewise smooth or with bounded variation via the Riemann-Stieltjes integral). Since we are not interested in these generalizations we always assume that all objects are as smooth as we need (manifolds at least of the class $C^1$, functions, vector fields, differential forms at least continuous).
The situation with integration of one-forms is different. Namely, differential forms are defined only on manifolds (recall that an open subset of $\mathbb R^M$ is also a manifold) and curves need not be manifolds (see Figure 4.3.2). There are two possibilities to avoid these obstacles: either to assume that $\gamma$ lies on a manifold where the one-form is defined or to restrict integration to curves which are themselves manifolds. We now examine the first possibility and postpone the other one to Definition 4.3.86.
For the definition of the integral of a one-form given in Remark 4.3.54 we have assumed that the whole curve lies in one chart to get the same representation of the form at all points of $\gamma$. If more charts are needed to cover the curve we have to be careful not to integrate over some parts of the curve several times. To eliminate this risk the following tool is very useful. In order to build it up we need a topological interlude.
Definition 4.3.74. Let $(\mathcal V_n,\varphi_n)_{n\in\mathbb N}$ be an atlas of a differentiable manifold $M$. Let $\{\lambda_n\}_{n\in\mathbb N}$ be a collection of smooth (often $C^\infty$) non-negative functions on $M$ which have the following properties:
(1) for all $n\in\mathbb N$ the support of $\lambda_n$ defined by
$$\operatorname{supp}\lambda_n\triangleq\overline{\{x\in M : \lambda_n(x)\ne 0\}}$$
is a compact subset of $\mathcal V_n$;
(2) $\sum_{n=1}^{\infty}\lambda_n(x) = 1$ for all $x\in M$.
$$\overline{G_n}\ \text{is compact},\qquad\text{and}\qquad M = \bigcup_{n=1}^{\infty}G_n.$$
For example, $G_n$ can be chosen as the intersection of $M$ with the open ball centered at $o$ with radius $n$. For the construction of a partition of unity the following topological device is convenient. We will need various types of balls so $B(a;r)$ will denote the open ball in $\mathbb R^M$ ($M = \dim M$).
Lemma 4.3.75. Let $\{\mathcal W_\alpha\}_{\alpha\in A}$ be an open covering of an $M$-dimensional manifold $M$ in $\mathbb R^N$. Then there is a countable open covering $\{\mathcal V_m\}_{m\in\mathbb N}$ of $M$ with the following properties:
(i) $\{\mathcal V_m\}_{m\in\mathbb N}$ is subordinate to $\{\mathcal W_\alpha\}_{\alpha\in A}$, i.e., for each $m\in\mathbb N$ there is an index $\alpha_m\in A$ such that $\mathcal V_m\subset\mathcal W_{\alpha_m}$;
(ii) there are smooth mappings $\psi_m: B(o;2)\to\mathcal V_m$ such that $\bigcup_{m=1}^{\infty}\psi_m(B(o;1)) = M$;
(iii) the collection $\{\mathcal V_m\}_{m\in\mathbb N}$ forms a locally finite system, i.e., any point $x\in M$ has a neighborhood which intersects only a finite number of $\{\mathcal V_m\}_{m\in\mathbb N}$.
Proof. Choose a sequence $\{G_n\}_{n=1}^{\infty}$ of open subsets of $M$ which has the property stated prior to this lemma. Put in addition $G_0 = G_{-1} = \emptyset$. The main idea behind the forthcoming construction is that the compact sets
$$K_n\triangleq\overline{G_n}\setminus G_{n-1},\qquad n\in\mathbb N,$$
24 A partition of unity is defined in topology in a more general way; see the corresponding textbooks, e.g., Dugundji [43, Chapter VIII].
25 We wish to point out that topological notions (like interior) are taken here with respect to the topology of $M$, i.e., $G\subset M$ is open provided there is an open set $H\subset\mathbb R^N$ such that $G = M\cap H$.
Figure 4.3.18.
$$\psi^n_x(z) = (P\circ\varphi_\alpha)^{-1}\Bigl(y + \frac{\varrho}{2}z\Bigr)\qquad\text{for}\quad z\in B(o;2).$$
With the help of these smooth maps $\psi^n_x$ we return back to the manifold by setting
$$\mathcal V^n_x = \psi^n_x(B(o;1))$$
(see Figure 4.3.18). Notice that $\mathcal V^n_x$ is open in $M$. Open sets $\{\mathcal V^n_x\}_{x\in H_n,\,\alpha\in A}$ cover the compact set $K_n$. We choose a finite subcovering
$$\mathcal V^n_{x_1},\dots,\mathcal V^n_{x_{k_n}}.$$
The collection $\{\mathcal V^n_{x_j}\}_{j=1,\dots,k_n,\ n\in\mathbb N}$ covers $M$, and
$$\{\psi^n_{x_j}(B(o;2))\}_{j=1,\dots,k_n,\ n\in\mathbb N}$$
is the desired locally finite countable system $\{\mathcal V_m\}_{m\in\mathbb N}$.
k (x) =
x Vk , x = k (y),
x M \ Vk .
(y),
0,
for
x k (B(o; 1)).
k=1
k=1
k (B(o; 1)) = M . It is
k=1
n (x)
k N,
n=1
x M.
n (x)(x),
n=1
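define the integral locally. If $\gamma$ is a smooth curve in $M$ and $\gamma(t)\in\mathcal V_n$, then $\dot\gamma(t)\in T_{\gamma(t)}M$ and it can be written in the form
$$\dot\gamma(t) = \sum_{i=1}^{M}\dot\gamma^i(t)\frac{\partial}{\partial y_i}.$$
If
$$\omega(x) = \sum_{i=1}^{M}f_i(x)\,dy^i,\qquad x\in\mathcal V_n,$$
then we define
$$\int_\gamma\omega = \sum_{n=1}^{\infty}\int_\gamma\lambda_n\omega\triangleq\sum_{n=1}^{\infty}\int_I\lambda_n(\gamma(t))\sum_{i=1}^{M}f_i(\gamma(t))\,\dot\gamma^i(t)\,dt \tag{4.3.39}$$
provided the integrals on the right-hand side exist and the sum is absolutely convergent.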
Remark 4.3.78.
(i) If $\gamma$ is a smooth curve defined on a compact interval $I = [a,b]$ and $\{\mathcal V_n\}_{n\in\mathbb N}$ is a locally finite covering of $M$, then $\gamma$ lies in a finite number of $\{\mathcal V_n\}_{n\in\mathbb N}$ only. If, moreover, the form $\omega$ is continuous, then the integrals in (4.3.39) exist and the sum is absolutely convergent, since it contains only finitely many nonzero terms. We require absolute convergence of the series because we do not want the value of the integral to depend on the arrangement of charts $(\mathcal V_n,\varphi_n)_{n\in\mathbb N}$ into a sequence.
(ii) It can be proved (do it as an exercise!) that the formula (4.3.39) does not depend on the choice of partition of unity. It should be also proved that the right-hand side in (4.3.39) is the same for all equivalent atlases on $M$ (see Definition 4.3.34). This follows from the transformation rules for tangent vectors (4.3.17) and for differential forms (4.3.22).
Remark 4.3.79. We can interpret the local coordinates $(f_1,\dots,f_M)$ of a one-form $\omega$ as the local coordinates of a vector field
$$F(x) = \sum_{i=1}^{M}f_i(x)\frac{\partial}{\partial x_i},\qquad x\in\mathcal V_n.$$
$\int_\gamma F$ expresses the work done by the vector field $F$ along the curve $\gamma$. The special cases $M = \mathbb R^2$, $\mathbb R^3$ are known from introductory courses in mechanics (see, e.g., Kittel, Knight & Ruderman [76, Chapter 5]).
Remark 4.3.80. Figures 4.3.2 and 4.3.3 show that a smooth curve need not be a differentiable manifold in $\mathbb R^N$. In order to avoid such cases it is sufficient to assume that the curve has a parametrization which is an embedding (Remark 4.3.3(iv)). If, moreover, $\gamma$ lies on a manifold $M$, then it is the so-called submanifold of $M$ in the following sense: A subset $P$ of a differentiable manifold $M$ is said to be a $P$-dimensional submanifold of $M$ if there is an atlas $(\mathcal V_n,\varphi_n)_{n\in\mathbb N}$ of $M$ such that
$$\varphi_n(x) = (y_1,\dots,y_P,0,\dots,0)\in\mathbb R^N\qquad\text{for all}\quad x\in\mathcal V_n\cap P.$$
The proof of Proposition 4.3.2 shows that the image of an embedding is a submanifold.
In order to integrate functions over a surface in $\mathbb R^3$, or more generally over a manifold, we need to generalize the notion of area of a parallelogram to non-flat domains. Let us recall here the definition of the multiple Riemann integral. The notion of (normalized) area or volume is based on the fact that the unit cube
$$C\triangleq\Bigl\{\sum_{i=1}^{M}x_ie_i : 0\le x_i\le 1\Bigr\}$$
($e_1,\dots,e_M$ is the standard basis in $\mathbb R^M$) has the $M$-dimensional volume (i.e., the Lebesgue measure) equal to 1.
Let $A$ be a parallelepiped in $\mathbb R^M$ spanned by vectors $v_1,\dots,v_M$, i.e.,
$$A\triangleq\Bigl\{\sum_{j=1}^{M}\alpha_jv_j : 0\le\alpha_j\le 1\Bigr\}.$$
$$\int_A 1\,dx.$$
This integral can be calculated with the help of the linear operator $T:\mathbb R^M\to\mathbb R^M$ which sends the vectors $e_1,\dots,e_M$ of the standard basis to $v_1,\dots,v_M$ ($Te_j = v_j = \sum_{i=1}^{M}t_{ij}e_i$).
(4.3.40)
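$$G(v_1,\dots,v_M) = \begin{pmatrix}(v_1,v_1)_{\mathbb R^N} & \cdots & (v_1,v_M)_{\mathbb R^N}\\ \vdots & \ddots & \vdots\\ (v_M,v_1)_{\mathbb R^N} & \cdots & (v_M,v_M)_{\mathbb R^N}\end{pmatrix},$$
then the formula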
(4.3.41)
generally: If $T$ is a (nonlinear) diffeomorphism which maps $C$ onto $A\triangleq T(C)$, then the formula $V(A)\triangleq\int_A 1\,dx = \int_C|\det T'(y)|\,dy$ holds for the Lebesgue measure $V(A)$ of $A$. The
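that the Gram matrix $G\bigl(\frac{\partial}{\partial y_1},\dots,\frac{\partial}{\partial y_M}\bigr)$ consists of scalar products of vectors $\frac{\partial}{\partial y_i}$, $\frac{\partial}{\partial y_j}$ in $\mathbb R^N$ (the differential structure of $M$ is inherited from $\mathbb R^N$).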
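Definition 4.3.81. Let $M$ be a differentiable manifold with an atlas $(\mathcal V_n,\varphi_n)_{n\in\mathbb N}$ and let
$$\mathcal U_n = P_n\circ\varphi_n(\mathcal V_n),\qquad\psi_n = (P_n\circ\varphi_n|_{\mathcal V_n})^{-1}.$$
$$\int_M f\,dV\triangleq\sum_{n=1}^{\infty}\int_{\mathcal U_n}(\lambda_nf)(\psi_n(y))\sqrt{\det G\Bigl(\frac{\partial}{\partial y_1},\dots,\frac{\partial}{\partial y_M}\Bigr)}\,dy_1\dots dy_M \tag{4.3.42}$$
provided the right-hand side exists and the sum is absolutely convergent.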
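Remark 4.3.82. It is possible to show that this definition does not depend on the partition of unity and on the choice of an atlas. The right-hand side in (4.3.42) exists whenever $f$ has compact support or, in particular, if $M$ is a compact manifold.
Example 4.3.83. Compute the surface area $V(S^2)$ of the unit sphere $S^2$ in $\mathbb R^3$.
It is obvious that the two-dimensional surface area of the Greenwich meridian $G$ is zero. The rest $S^2\setminus G$ is covered by one chart with
$$\psi(\varphi,\theta) = (\cos\varphi\cos\theta,\ \sin\varphi\cos\theta,\ \sin\theta),\qquad\varphi\in(0,2\pi),\quad\theta\in\Bigl(-\frac{\pi}{2},\frac{\pi}{2}\Bigr).$$
Since $n = 1$, $\mathcal U_1 = (0,2\pi)\times\bigl(-\frac{\pi}{2},\frac{\pi}{2}\bigr)$, $\lambda_1 = 1$ and $\det G\bigl(\frac{\partial}{\partial\varphi},\frac{\partial}{\partial\theta}\bigr) = \cos^2\theta$, we have
$$V(S^2) = \int_{S^2}1\,dS = \int_{(0,2\pi)\times(-\frac{\pi}{2},\frac{\pi}{2})}\cos\theta\,d\varphi\,d\theta = 4\pi.$$
(It is more common to denote the integration symbol in the two-dimensional case by $dS$ instead of $dV$.)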
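The computation in Example 4.3.83 can be reproduced numerically from formula (4.3.42) without knowing $\det G$ in closed form. The following sketch is only an illustration: it approximates the tangent vectors $\partial\psi/\partial\varphi$, $\partial\psi/\partial\theta$ by central differences, forms the $2\times 2$ Gram determinant, and integrates its square root by the midpoint rule; the function names and the grid size are assumptions.

```python
import math


def psi(phi, theta):
    """Parametrization of the unit sphere from Example 4.3.83."""
    return (
        math.cos(phi) * math.cos(theta),
        math.sin(phi) * math.cos(theta),
        math.sin(theta),
    )


def gram_det(phi, theta, h=1e-5):
    """Determinant of the Gram matrix of d(psi)/d(phi), d(psi)/d(theta)."""
    def tangent(k):
        up = [phi, theta]
        dn = [phi, theta]
        up[k] += h
        dn[k] -= h
        a, b = psi(*up), psi(*dn)
        return [(ai - bi) / (2 * h) for ai, bi in zip(a, b)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    t1, t2 = tangent(0), tangent(1)
    return dot(t1, t1) * dot(t2, t2) - dot(t1, t2) ** 2


def sphere_area(n=200):
    """Midpoint-rule approximation of the integral of sqrt(det G) over U_1."""
    dphi = 2 * math.pi / n
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            phi = (i + 0.5) * dphi
            theta = -math.pi / 2 + (j + 0.5) * dtheta
            # max(..., 0.0) guards against tiny negative rounding errors.
            total += math.sqrt(max(gram_det(phi, theta), 0.0)) * dphi * dtheta
    return total


area = sphere_area()
print(area, 4 * math.pi)
```

The same routine applies to any other smooth parametrization of a surface in $\mathbb R^3$ once `psi` and the parameter domain are replaced.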
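It follows from Definition 4.3.77 (see also Exercise 4.3.100) that the integral of a one-form $\omega$ along a curve $\gamma$ depends on the orientation of $\gamma$. Namely, if
$$\tilde\gamma(t) = \gamma(1 - t), \qquad t \in (0, 1),$$
then
$$\tilde\gamma'(t) = -\gamma'(1 - t),$$
and hence
$$\int_{\tilde\gamma} \omega = -\int_{\gamma} \omega.$$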
27 This scalar product leads to uniformly distributed mass or currents in physical applications, but it is sometimes unrealistic. To cover further applications in a more realistic manner we can consider different scalar products at different points of a manifold. Since any positive definite symmetric bilinear form in $\mathbb{R}^M \times \mathbb{R}^M$ determines a scalar product in $\mathbb{R}^M$, we can introduce a metric structure on a manifold $\mathcal{M}$ by a (smooth) mapping $g\colon x \in \mathcal{M} \mapsto g(x) \in S_2^+(T_x\mathcal{M})$ (positive definite symmetric bilinear forms on $T_x\mathcal{M}$). Such $g$ is called a Riemann metric on $\mathcal{M}$.
This dependence on an orientation is crucial for the generalization of the curve integral to an integral of a differential form over a manifold. What can the orientation of a manifold be? Let us start with simple examples like $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^3$. It is the common understanding that the standard (equivalently positive) orientation on $\mathbb{R}$ is from the left to the right, in $\mathbb{R}^2$ anticlockwise and by the right-thumb rule in $\mathbb{R}^3$. These slightly vague formulations can be made precise by taking fixed bases in $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^3$, e.g., the standard bases. Then all bases are divided into two disjoint classes according to the sign of the determinant of the transformation matrix $T$ which sends the fixed (e.g., standard) basis into a new one. We say that $\tilde e_1, \dots, \tilde e_M$ is a positive basis if $\det T > 0$. We want to remind the reader that
$$\det T = f^1 \wedge \dots \wedge f^M(\tilde e_1, \dots, \tilde e_M) \quad\text{if}\quad T e_i = \tilde e_i,\ i = 1, \dots, M,$$
and $f^1, \dots, f^M$ is the dual basis to $e_1, \dots, e_M$ (Example 4.3.52(i)). This indicates that the choice of a fixed nowhere-vanishing continuous $M$-form $\omega$ (i.e., $\omega(x) \neq 0$ for all $x \in \mathcal{M}$) on the $M$-dimensional manifold $\mathcal{M}$ makes it possible to introduce an orientation on $\mathcal{M}$. If $(V, \varphi)$ is a local chart at a point $x \in \mathcal{M}$, then the basis $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ of $T_x\mathcal{M}$ is said to be a positive basis of $T_x\mathcal{M}$ provided
$$\omega(x)\Big(\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}\Big) > 0.$$
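The classification of bases by the sign of $\det T$ can be tried out directly. The sketch below is an illustration under stated assumptions, not the authors' construction: the basis vectors are stored as rows (which does not change the determinant), the determinant is computed by Laplace expansion, and the names `det`, `is_positive_basis` and `cross` are hypothetical.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for small M)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def is_positive_basis(basis):
    """True if the basis has the same orientation as the standard basis."""
    d = det(basis)
    if d == 0:
        raise ValueError("the vectors are linearly dependent")
    return d > 0

def cross(a, b):
    """Cross product in R^3."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

standard = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
swapped = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]   # first two vectors exchanged
a, b = [1, 2, 0], [0, 1, 3]

print(is_positive_basis(standard))            # True
print(is_positive_basis(swapped))             # False
print(is_positive_basis([cross(a, b), a, b])) # True
```

Exchanging two basis vectors flips the orientation, and for independent $a, b \in \mathbb{R}^3$ the triple $(a \times b, a, b)$ is always positive because its determinant equals $|a \times b|^2$.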
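It can be proved that a continuous nonvanishing form $\omega$ exists on $\mathcal{M}$ if and only if there is an atlas $(V_n, \varphi_n)$ of $\mathcal{M}$ such that for any two charts $(V, \varphi)$, $(\tilde V, \tilde\varphi)$ of this atlas the transformation matrix $\big((P\tilde\varphi) \circ (P\varphi)^{-1}\big)'(y)$ (see (4.3.17)) has a positive determinant for all $y \in P\varphi(V \cap \tilde V)$ (provided $V \cap \tilde V \neq \emptyset$).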
Definition 4.3.84. A differentiable manifold $\mathcal{M}$ of dimension $M$ is said to be orientable if there exists a continuous nowhere-vanishing $M$-form $\omega$ on $\mathcal{M}$. If such a form is fixed, then $(\mathcal{M}, \omega)$ is called an oriented manifold.
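Example 4.3.85.
(i) Suppose that $\mathcal{M}$ is a two-dimensional orientable connected manifold in $\mathbb{R}^3$ (i.e., a surface) and $\omega$ is a nowhere-vanishing two-form on $\mathcal{M}$. The question is how these orientations of $T_x\mathcal{M}$ cohere with the natural orientation of $\mathbb{R}^3$. To find an answer we choose a point $x \in \mathcal{M}$ and local coordinates at $x$ such that $\frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_2}$ form a positive basis in $T_x\mathcal{M}$. It is obvious that there is a vector $n \in \mathbb{R}^3$ which is perpendicular to $T_x\mathcal{M} \subset \mathbb{R}^3$ and such that
$$n, \frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_2}$$
is a positive basis of $\mathbb{R}^3$. The vector $\frac{n}{\|n\|}$ is called a (unit) outer normal vector to $\mathcal{M}$; one can take
$$n = \frac{\partial}{\partial y_1} \times \frac{\partial}{\partial y_2}.$$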
Figure 4.3.19.
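Definition 4.3.86. Let $(\mathcal{M}, \omega)$ be an oriented $M$-dimensional differentiable manifold. Let $(V_n, \varphi_n)_{n \in \mathbb{N}}$ be an atlas of $\mathcal{M}$ for which the coordinate vectors $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ form a positive basis of $T_x\mathcal{M}$ for all $x \in V_n$ and all $n \in \mathbb{N}$. Let $\{\chi_n\}_{n \in \mathbb{N}}$ be a partition of unity subordinate to this atlas. If $\eta$ is a continuous $M$-form with the local representation
$$\eta(x) = f_n(x)\,dy^1 \wedge \dots \wedge dy^M, \qquad x \in V_n,$$
then we define
$$\int_{\mathcal{M}} \eta := \sum_{n=1}^{\infty} \int_{\mathcal{M}} \chi_n \eta := \sum_{n=1}^{\infty} \int_{U_n} \psi_n^*(\chi_n \eta) = \sum_{n=1}^{\infty} \int_{U_n} (\chi_n f_n)(\psi_n(y))\,dy_1 \dots dy_M\ {}^{28} \tag{4.3.43}$$
provided the right-hand side exists and the sum is absolutely convergent.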
Remark 4.3.87.
(i) If a form $\eta$ has compact support, in particular, if $\mathcal{M}$ is a compact set in $\mathbb{R}^N$, then the integral $\int_{\mathcal{M}} \eta$ exists.
28 For
M
(y) = g(y) df 1 df M a continuous M form
on a measurable set U R (cf. Re
g(y) df 1 df M =
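(ii) Definition 4.3.86 does not depend on the concrete choice of an atlas and on a partition of unity. If the coordinate vectors $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ determine a negative basis in $T_x\mathcal{M}$, then we change the order, e.g., to
$$\frac{\partial}{\partial y_2}, \frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_3}, \dots, \frac{\partial}{\partial y_M},$$
to get a positive basis. Notice that
$$\eta(x) = f(x)\,dy^1 \wedge \dots \wedge dy^M = -f(x)\,dy^2 \wedge dy^1 \wedge dy^3 \wedge \dots \wedge dy^M.$$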
(iii) Definition 4.3.86 is also independent of a transformation of coordinates in the following sense:
Let $g$ be a diffeomorphism of an oriented manifold $\mathcal{M}$ onto a manifold $\tilde{\mathcal{M}}$.$^{29}$ Then $g$ induces an orientation on $\tilde{\mathcal{M}}$$^{30}$ and with respect to this orientation the equality
=
M
(4.3.44)
,...,
dy 1 dy M
V = det G
y1
yM
where
, . . . , yM
y1
(4.3.42)) is given by
dV (see
M
V (M ) =
V .
M
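$$y = \sin\varphi\cos\theta, \qquad z = \sin\theta, \qquad (\varphi, \theta) \in U := (-\pi, \pi) \times \Big(-\frac{\pi}{6}, \frac{\pi}{2}\Big),$$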
29 We have not defined this notion yet, but it is almost evident how to generalize the well-known case $\mathbb{R}^M \to \mathbb{R}^M$. One has to overcome certain difficulties which are caused by the local definition of a manifold and the global notion of diffeomorphism.
30 Let $(V_n, \varphi_n)_{n \in \mathbb{N}}$ be an atlas of $\mathcal{M}$ such that the coordinate vectors $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ form a positive basis of $T_x\mathcal{M}$ for $x \in V_n$. Then $g_*\frac{\partial}{\partial y_1}, \dots, g_*\frac{\partial}{\partial y_M}$ determine a positive basis of $T_{g(x)}\tilde{\mathcal{M}}$.
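and the orientation such that $\frac{\partial}{\partial\varphi}, \frac{\partial}{\partial\theta}$ is a positive basis. We wish to compute the integral
$$\int_{\mathcal{M}} \omega.$$
The curve $\gamma = \big\{(-\cos\theta, 0, \sin\theta) : \theta \in \big(-\frac{\pi}{6}, \frac{\pi}{2}\big)\big\}$ has two-dimensional surface measure equal to zero, and therefore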
=
d d.
M
i.e.,
= cos3 cos4 d d.
(,)(
,
6 2)
cos3 cos4 d d = 0.
It is evident that the computation of $\int_{\mathcal{M}} \omega$ is complicated if several charts have to be used to cover the support of $\omega$. The main reason is that a partition of unity must be constructed and this is technically difficult. Because of that, we would like to have such useful tools as the Fubini Theorem and the Fundamental Theorem of Calculus. The latter theorem, i.e.,
$$\int_a^b f'(x)\,dx = f(b) - f(a),$$
can be interpreted in the manifold language as follows. The closed interval $[a, b]$ is a manifold $\mathcal{M}$ positively oriented from the left to the right. The one-form $f'(x)\,dx$ is the differential of the zero-form (i.e., the function $f$), and the (oriented) boundary $\partial\mathcal{M}$ of $\mathcal{M}$ consists of the points $b$, $a$ (in this order). The Fundamental Theorem of Calculus reduces the integral of the form $df(x) = f'(x)\,dx$ over $\mathcal{M}$ to an integral of $f$ over $\partial\mathcal{M}$. This observation is essential for the generalization to manifolds with boundaries. To do that we have to define the boundary of $\mathcal{M}$ first, and then to show how this boundary inherits the orientation of $\mathcal{M}$.
Definition 4.3.89. Let $\mathcal{N}$ be an $M$-dimensional differentiable manifold in $\mathbb{R}^N$. A closed subset $\mathcal{M}$ of $\mathcal{N}$ is said to be an $M$-dimensional differentiable manifold with boundary$^{31}$ if
$$\overline{\operatorname{int}\mathcal{M}} = \mathcal{M}$$
(the interior and closure are taken in the topology of $\mathcal{N}$) and for any point $x \in \mathcal{M}$ there is a chart $(V, \varphi)$ of an atlas of $\mathcal{N}$ such that either
(i) $V \subset \mathcal{M}$
or
(ii) $P\varphi(x) = (0, y_2, \dots, y_M)$ and $P\varphi(V \cap \mathcal{M}) = \{(y_1, \dots, y_M) \in \mathbb{R}^M : y_1 \le 0\} \cap P\varphi(V)$.
A point $x$ is called an interior point of $\mathcal{M}$ in the case (i). If $x$ is not an interior point, then $x$ is called a boundary point of $\mathcal{M}$. The collection of all boundary points is called the boundary of $\mathcal{M}$ and denoted by $\partial\mathcal{M}$ (see Figure 4.3.20).
31 This boundary can be an empty set (see, e.g., Remark 4.3.90(i)). If this boundary is nonempty, then the manifold $\mathcal{M}$ is not a differentiable manifold in the sense of Definition 4.3.4. See also Remark 4.3.90(ii).
Figure 4.3.20.
Remark 4.3.90.
(i) A manifold can have empty boundary. The sphere $S^2$ in $\mathbb{R}^3$ is an example of this fact. Such a manifold is also called a manifold without boundary.
(ii) If $\mathcal{M}$ is a manifold with nonempty boundary $\partial\mathcal{M}$, then the boundary $\partial\mathcal{M}$ is itself a differentiable manifold of dimension $M - 1$ in $\mathbb{R}^N$. An atlas is given by the restriction of the original atlas. We notice that $\partial\mathcal{M}$ has empty boundary, i.e., $\partial(\partial\mathcal{M}) = \emptyset$.
(iii) Let $\mathcal{M}$ be a manifold with boundary. The tangent space $T_a\mathcal{M}$ for an interior point $a \in \mathcal{M}$ is defined as in Appendix 4.3A and $T_a\mathcal{M} = T_a\mathcal{N}$. If $a \in \partial\mathcal{M}$, then we take all smooth curves in $\mathbb{R}^M_-$ ($\mathbb{R}^M_- = \{(y_1, \dots, y_M) \in \mathbb{R}^M : y_1 < 0\}$) through the point $b = P\varphi(a)$ and transfer them into $\mathbb{R}^N$ by applying $\psi = (P\varphi)^{-1}$.
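If $\gamma$ is a smooth curve with $\gamma(0) = b$ and
$$\gamma(t) \in \begin{cases} \mathbb{R}^M_- & \text{for } t < 0, \\ \mathbb{R}^M_+ & \text{for } t > 0, \end{cases}$$
then $\psi'(b)[\gamma'(0)]$ is the so-called outer vector to $\mathcal{M}$. The outer normal $n$ to $\mathcal{M}$ at the point $a \in \partial\mathcal{M}$ is the unit outer vector which is perpendicular in $\mathbb{R}^N$ to $T_a(\partial\mathcal{M})$ (see Figure 4.3.21).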
Figure 4.3.21.
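(iv) If $\mathcal{M} \subset \mathcal{N}$ is a differentiable manifold with boundary and $(\mathcal{N}, \omega)$ is an oriented manifold, then $\omega$ induces an orientation on $\partial\mathcal{M}$ as follows. Choose $a \in \partial\mathcal{M}$ and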
d
(t, b2 , . . . , bM )v1
dt
t=0
at the point a2 = (2, 2 ).
r
The completion to a positive basis in Ta R2 is shown in Figures 4.3.22 and 4.3.23.
Figure 4.3.22.
Figure 4.3.23.
(a)
= r(a) dr d ,
= r(a).
r
(ii) Let $B$ be the closed unit ball in $\mathbb{R}^3$. Then $B$ is a three-dimensional manifold (included in $\mathbb{R}^3$) with boundary $\partial B = S^2$ (the two-dimensional sphere). The standard orientation in $\mathbb{R}^3 = T_aB$ gives the orientation in $\operatorname{int} B$. The induced orientation on $\partial B$ is obtained by Remark 4.3.90(iv). Namely, we take a normal vector $n = \frac{\partial}{\partial r}$ at a point $a \in \partial B$ and independent vectors $v_2, v_3 \in T_a(\partial B)$ in the order given by the right-thumb rule for $(n, v_2, v_3)$ (see Figure 4.3.24).$^{32}$ Notice that the orientation on $B$ is given, e.g., by
$$\omega = dx \wedge dy \wedge dz = (r^2\cos\theta)\,dr \wedge d\varphi \wedge d\theta$$
in the spherical coordinates. Similarly,
$$\tilde\omega(v_2, v_3) = \omega\Big(\frac{\partial}{\partial r}, v_2, v_3\Big).$$
The next theorem is a basic result on differential forms and it is the promised generalization of the fundamental theorem of calculus.
Theorem 4.3.92 (Stokes, abstract version). Let $\mathcal{M}$ be an $M$-dimensional oriented manifold with boundary $\partial\mathcal{M}$. Let $\omega$ be a smooth $(M-1)$-differential form on $\mathcal{M}$ with compact support. Then
$$\int_{\mathcal{M}} d\omega = \int_{\partial\mathcal{M}} i^*\omega.\ {}^{33}$$
Figure 4.3.24.
Proof. Let (Vn , n )nN be an atlas of M such that the coordinate vectors given by
j=1
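Therefore, it is sufficient to prove the Stokes Theorem only for the case when $\operatorname{supp}\omega$ is contained in one coordinate neighborhood, say $V$. Suppose that $\omega$ has the representation
$$\omega(x) = \sum_{i=1}^{M} (-1)^{i-1} f_i(y)\,dy^1 \wedge \dots \wedge \widehat{dy^i} \wedge \dots \wedge dy^M, \qquad x = \psi(y) \in V.$$
Case 1 ($V \cap \partial\mathcal{M} = \emptyset$). Then $\int_{\partial\mathcal{M}} i^*\omega = 0$ and
$$\int_{\mathcal{M}} d\omega = \sum_{i=1}^{M} \int_U \frac{\partial f_i}{\partial y_i}(y)\,dy_1 \dots dy_M = 0$$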
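Case 2 ($V \cap \partial\mathcal{M} \neq \emptyset$). According to the definition of the boundary we can assume that $V$ and $P\varphi(V) = U$ have the form as in Figure 4.3.20. Then
$$i^*\omega(x) = f_1(y)\,dy^2 \wedge \dots \wedge dy^M \quad\text{for}\quad x = \psi(y) \in V \cap \partial\mathcal{M}$$
and
$$\int_{\partial\mathcal{M}} i^*\omega = \int_{\partial\mathcal{M}} f_1(y)\,dy^2 \wedge \dots \wedge dy^M = \int_{U \cap \mathbb{R}^{M-1}} f_1(0, y_2, \dots, y_M)\,dy_2 \dots dy_M.$$
On the other hand,
$$\int_{\mathcal{M}} d\omega = \sum_{i=1}^{M} \int_{U \cap \{y_1 \le 0\}} \frac{\partial f_i}{\partial y_i}\,dy_1 \dots dy_M = \int_{U \cap \mathbb{R}^{M-1}} \Big(\int_{-\infty}^{0} \frac{\partial f_1}{\partial y_1}(y)\,dy_1\Big) dy_2 \dots dy_M = \int_{U \cap \mathbb{R}^{M-1}} f_1(0, y_2, \dots, y_M)\,dy_2 \dots dy_M$$
since the integrals for $i = 2, \dots, M$ vanish because of the compact support of the restriction of $f$ to $U \cap \mathbb{R}^{M-1}$.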
Several special cases of the Stokes Theorem are worth mentioning. We say that a curve $\gamma\colon [a, b] \to \mathbb{R}^N$ is simple if
$$\gamma(t_1) = \gamma(t_2), \quad t_1 < t_2, \quad\text{implies}\quad a = t_1,\ b = t_2.$$
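Corollary 4.3.93 (Green). Let $\Omega$ be a bounded open subset of $\mathbb{R}^2$ the boundary of which is the image of a simple closed smooth curve $\gamma$ which is oriented so that $\Omega$ is on the left-hand side when we move along the curve. Let $F = (f, g)\colon \mathbb{R}^2 \to \mathbb{R}^2$ be a $C^1$-mapping in a neighborhood of the closure $\overline{\Omega}$. Then
$$\int_{\Omega} \Big(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Big)\,dx\,dy = \int_{\gamma} (f\,dx + g\,dy). \tag{4.3.45}$$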
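A quick numerical sanity check of the Green formula (4.3.45) can be made on the unit disc. The sketch below is only an illustration under assumptions not found in the original: the field $F = (f, g) = (-y^3, x^3)$, the anticlockwise parametrization $\gamma(t) = (\cos t, \sin t)$, and the midpoint-rule grids are all chosen for this example.

```python
import math

def f(x, y):
    return -y ** 3

def g(x, y):
    return x ** 3

def curl_integrand(x, y, h=1e-5):
    """Central-difference approximation of dg/dx - df/dy."""
    dg_dx = (g(x + h, y) - g(x - h, y)) / (2.0 * h)
    df_dy = (f(x, y + h) - f(x, y - h)) / (2.0 * h)
    return dg_dx - df_dy

def area_integral(n_r=200, n_t=200):
    """Left-hand side of (4.3.45) over the unit disc, in polar coordinates."""
    dr, dt = 1.0 / n_r, 2.0 * math.pi / n_t
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_t):
            t = (j + 0.5) * dt
            total += curl_integrand(r * math.cos(t), r * math.sin(t)) * r * dr * dt
    return total

def line_integral(n=2000):
    """Right-hand side of (4.3.45) along the anticlockwise unit circle."""
    dt = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        total += f(x, y) * (-math.sin(t)) * dt + g(x, y) * math.cos(t) * dt
    return total

lhs, rhs = area_integral(), line_integral()
print(lhs, rhs, 1.5 * math.pi)
```

For this particular field both sides equal $3\pi/2$; reversing the orientation of the circle would change the sign of the line integral only.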
y dx + x dy,
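where $n$ is the unit outer normal vector to $\partial\Omega$ and $(F, n)$ is the scalar product in $\mathbb{R}^3$. The integral on the right-hand side is defined in Definition 4.3.81.
Proof. Put
$$\omega = f^1\,dy \wedge dz + f^2\,dz \wedge dx + f^3\,dx \wedge dy$$
and choose local coordinates $(u, v)$ on $\partial\Omega$ such that $n, \frac{\partial}{\partial u}, \frac{\partial}{\partial v}$ forms a positive orthonormal basis of $\mathbb{R}^3$. Then
$$i^*\omega = \Big(F, \frac{\partial}{\partial u} \times \frac{\partial}{\partial v}\Big)\,du \wedge dv.$$
The cross product
$$\frac{\partial}{\partial u} \times \frac{\partial}{\partial v} = n$$
is the unit vector which is perpendicular in $\mathbb{R}^3$ to $\frac{\partial}{\partial u}, \frac{\partial}{\partial v}$.
So,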
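where $\gamma$ has the orientation induced from $\mathcal{M}$, the vector $\operatorname{curl} F$ is defined in Example 4.3.58(ii) and $n$ is the unit outer normal vector to $\mathcal{M}$ in $\mathbb{R}^3$ (if $\frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_2}$ is a positive basis at $a \in \mathcal{M}$, then $n$ is perpendicular in $\mathbb{R}^3$ to $T_a\mathcal{M}$ and $\big(n, \frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_2}\big)$ is a positive basis in $\mathbb{R}^3$).
Proof (a hint). Rewrite the abstract Stokes theorem for this special case using the corresponding definitions of integrals and Example 4.3.58(ii).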
Remark 4.3.97. Considering $F$ in Corollary 4.3.95 as a velocity field of a fluid flow (Remark 4.3.56) we can interpret the right-hand side in (4.3.46) as the amount of the fluid which flows out of a region $\Omega$ per unit time. In particular, if the divergence of $F$ ($\operatorname{div} F := \sum_{i=1}^{3} \frac{\partial f^i}{\partial x_i}$) vanishes everywhere in $\Omega$, then this amount is zero for any subregion.
From the mathematical point of view it is more interesting that this is a starting point for the generalization of basic differential operators to non-flat domains. We now briefly describe this procedure. Let $F$ be a vector field on a manifold $\mathcal{M}$ and let $\omega$ be an $M$-form. Define an $(M-1)$-form $F \lrcorner\, \omega$ by
$$(F \lrcorner\, \omega)(v_1, \dots, v_{M-1}) := \omega(F, v_1, \dots, v_{M-1}).$$
M
2f
x2i
i=1
where
$$\nabla f = \Big(\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_N}\Big)$$
is the gradient of $f$. Since the notion of the gradient has been defined for functions on manifolds (Remark 4.3.56), we are able to generalize the Laplacian to functions defined on a manifold $\mathcal{M}$:
$$\Delta_{\mathcal{M}} f = \operatorname{div}(\nabla f).$$
This operator $\Delta_{\mathcal{M}}$ is often called the Laplace–Beltrami operator. For more information on the significance of this operator the reader can consult, e.g., Chavel [23], Davies & Safarov [31], Robinson [108] or Rosenberg [110].
Remark 4.3.99. In weak formulations of boundary problems for elliptic differential equations (see Chapter 7) the following Green formula is frequently used:
Let a manifold $\mathcal{M}$ satisfy the assumptions of the abstract Stokes Theorem and let $f, g \in C^2(\mathcal{M})$. Then
M
(M f )g +
M
(f, g)Tx M =
M
(4.3.47)
which follows from the abstract Stokes Theorem. Another ingredient is the formula for the divergence of the product of a function and a vector field
$$\operatorname{div}(gF) = g \operatorname{div} F + (\nabla g, F).$$
This formula follows from the definition of the divergence by computation.
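The product formula for the divergence can be checked numerically in the flat case $\mathbb{R}^3$. The sketch below is an illustration only: the function $g$, the field $F$, the evaluation point and the finite-difference step are arbitrary choices made for this example, and the manifold version of the divergence is not implemented.

```python
def g(p):
    x, y, z = p
    return x * y + z * z

def F(p):
    x, y, z = p
    return (x * x, y * z, x + z)

def shifted(p, i, h):
    """Copy of the point p with its i-th coordinate moved by h."""
    q = list(p)
    q[i] += h
    return tuple(q)

def gradient(func, p, h=1e-5):
    return [(func(shifted(p, i, h)) - func(shifted(p, i, -h))) / (2.0 * h)
            for i in range(3)]

def divergence(field, p, h=1e-5):
    return sum((field(shifted(p, i, h))[i] - field(shifted(p, i, -h))[i]) / (2.0 * h)
               for i in range(3))

def product_field(p):
    """The vector field gF."""
    return tuple(g(p) * c for c in F(p))

p = (0.5, -1.2, 2.0)
lhs = divergence(product_field, p)
rhs = g(p) * divergence(F, p) + sum(a * b for a, b in zip(gradient(g, p), F(p)))
print(lhs, rhs)
```

At the chosen point the exact value of both sides is $22.1$, which can be confirmed by hand from $\operatorname{div} F = 2x + z + 1$ and $\nabla g = (y, x, 2z)$.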
Exercise 4.3.100. Prove that the definition of $\int_{\gamma} \omega$ does not depend on the choice of an atlas and a partition of unity. What can be said about the dependence on a parametrization of $\gamma$?
Exercise 4.3.101. Let $\omega$ be an exact one-form and $f$ its primitive function, i.e., $df = \omega$. Compute $\int_{\gamma} \omega$. What is the result if $\gamma$ is a closed curve?
Exercise 4.3.102. Let $\omega$ be a closed one-form in $\mathcal{M}$. Show that $\omega$ is exact if and only if $\int_{\gamma} \omega = 0$ for any smooth closed curve $\gamma$ in $\mathcal{M}$.
det G(v1 , . . . , vM )
det G(v1 , . . . , vM 1 )
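$$\frac{\partial^2 f}{\partial r^2} + \frac{1}{r}\frac{\partial f}{\partial r} + \frac{1}{r^2}\frac{\partial^2 f}{\partial\varphi^2};$$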
formulae are convenient if one is looking for a solution with some symmetries, i.e., invariant with respect to some group actions.
2 f
2f
1
+
+ 2
r 2
r r
r cos
2f
2f
f
+ cos 2 + sin
2
;
(iii) on S 2 ;
(iv) on the Riemann manifold (see footnote 27 on page 214).
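Exercise 4.3.110. Let $\mathcal{M}$ be a connected differentiable manifold.
(i) Prove that there exists a Riemann metric $g$ on $\mathcal{M}$.
Hint. $\mathcal{M}$ is embedded into $\mathbb{R}^N$.
(ii) Prove that any two points $x, y \in \mathcal{M}$ can be connected by a $C^1$-curve, i.e., there is
$$\gamma\colon [a, b] \to \mathcal{M}, \qquad \gamma \in C^1, \qquad \gamma(a) = x,\ \gamma(b) = y.$$
(iii) Define the length of $\gamma$ by
$$l_g(\gamma) := \int_a^b \sqrt{g(\gamma(t))(\gamma'(t), \gamma'(t))}\,dt.$$
Put
$$\rho_g(x, y) = \inf\{l_g(\gamma) : \gamma\colon [a, b] \to \mathcal{M},\ \gamma(a) = x,\ \gamma(b) = y\}$$
and show that $\rho_g$ is a metric on $\mathcal{M}$.
(iv) How is the topology on $\mathcal{M}$ given by $\rho_g$ related to the topology induced from $\mathbb{R}^N$?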
Proof (see, e.g., Milnor [95]). We will regard the sphere $S^2$ as a compactification of the complex plane and endow it with the structure of a differentiable manifold by two charts (see Example 4.3.5(iii)) given by the stereographic projections $\varphi_+$, $\varphi_-$ of $S^2 \setminus \{N\}$ where $N$ is the north pole, and $S^2 \setminus \{S\}$ where $S$ is the south pole, respectively, onto $\mathbb{R}^2$. See Figure 4.3.25.
Figure 4.3.25.
where
$$Q(z) = \frac{1}{\overline{P(\bar z^{-1})}}, \qquad Q(0) = 0$$
($\bar z$ is the complex conjugate to $z$). The calculation of $f_*$ (see (4.3.19)) shows that $x_0 \in S^2$ is a singular point of $f$ (i.e., $f_*(T_aS^2) \neq T_{f(a)}S^2$) if and only if
$$P'(z_0) = 0, \qquad z_0 = \varphi_+(x_0).$$
Since the last equation has only a finite number of solutions, the set $A$ of singular values of $f$ is finite. This is the main point where we have used that $P$ is a polynomial. Consider now a point $y \in S^2 \setminus A$, i.e., $y$ is a regular value of $f$. Then the set $f^{-1}(y)$ is finite (possibly empty) since a polynomial takes any value only finitely many times. Let $\nu(y)$ denote the number of points, i.e., the cardinality, of $f^{-1}(y)$. The main part of the proof is to show that $\nu$ is a constant function on $B := S^2 \setminus A$. To prove that we consider two kinds of regular values,
$$B_1 = \{y : f^{-1}(y) = \emptyset\}, \qquad B_2 = B \setminus B_1.$$
Since $B_1 = S^2 \setminus f(S^2)$ and $S^2$ is compact, $B_1$ is open in $S^2$. For $y \in B_2$ we have
$$f^{-1}(y) = \{x_1, \dots, x_k\}$$
and there are disjoint open neighborhoods $U(x_1), \dots, U(x_k)$ on which $f$ is a diffeomorphism (see Local Inverse Function Theorem 4.1.1). Put
$$V_i := f(U(x_i)).$$
It is easy to see that $\nu$ is constant on the set
$$\bigcap_{i=1}^{k} V_i \setminus f\Big(S^2 \setminus \bigcup_{i=1}^{k} U(x_i)\Big),$$
which is a neighborhood of the point $y$. Since $S^2$ is connected and $A$ is finite, the set $B$ is also connected. The function $\nu$, being locally constant, is constant on the whole of $B$. Moreover, $\nu$ cannot vanish on $B$, i.e., $B_1 = \emptyset$. This shows that $f$ actually maps $S^2$ onto $S^2$, i.e., $P\colon \mathbb{C} \to \mathbb{C}$ is surjective as well. In particular, there exists $z_0 \in \mathbb{C}$ such that $P(z_0) = 0$.
The following result generalizes the above fact that $f$ is surjective (see, e.g., Sternberg [124, Theorem 3.4.3]).
Proposition 4.3.112. Let $\mathcal{M}_1$, $\mathcal{M}_2$ be two oriented manifolds of the same dimension, and let $\mathcal{M}_2$ be a connected space. Suppose that $f\colon \mathcal{M}_1 \to \mathcal{M}_2$ is a proper$^{35}$ differentiable mapping such that its realization (see Figure 4.3.15) has a nonnegative Jacobian at any point. Then either the Jacobian vanishes everywhere or $f(\mathcal{M}_1) = \mathcal{M}_2$.
Remark 4.3.113. Write f : C C as
f (z) = g(x, y) + ih(x, y)
where
z = x + iy
and
g, h : R2 R.
g
h
=
y
x
hold for
z = x + iy G,
for all
a
.37
36 Not
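In particular, if
$$a = 0 \quad\text{and}\quad \gamma(t) = re^{it}, \qquad t \in [0, 2n\pi],\ n \in \mathbb{Z},$$
then
$$\operatorname{Ind}_{\gamma} 0 = n$$
and it can be interpreted as the number of revolutions of $\gamma$ and also as the increment of the argument along $\gamma$ divided by $2\pi$.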
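The interpretation of the index as the increment of the argument divided by $2\pi$ suggests a simple numerical procedure. The sketch below is a hedged illustration, not the book's method: the closed curve is sampled at `n` points, the small argument increments are summed, and the same routine applied to $f \circ \gamma$ counts the zeros of a holomorphic $f$ inside a circle (the argument principle). The names `winding_number` and `zeros_inside`, the sample size and the test polynomial are assumptions.

```python
import cmath
import math

def winding_number(curve, n=4000):
    """Index about 0 of the closed curve t -> curve(t), t in [0, 2*pi].

    Assumes the curve avoids 0 and is sampled finely enough that the
    argument changes by less than pi between consecutive samples.
    """
    total = 0.0
    previous = curve(0.0)
    for k in range(1, n + 1):
        current = curve(2.0 * math.pi * k / n)
        total += cmath.phase(current / previous)  # increment of the argument
        previous = current
    return round(total / (2.0 * math.pi))

def zeros_inside(f, radius):
    """Number of zeros (with multiplicity) of f in the disc |z| < radius."""
    return winding_number(lambda t: f(radius * cmath.exp(1j * t)))

def poly(z):
    # Zeros at 1, (-1 + sqrt(5))/2 and (-1 - sqrt(5))/2.
    return z ** 3 - 2.0 * z + 1.0

print(winding_number(lambda t: 2.0 * cmath.exp(3j * t)))   # 3 revolutions
print(winding_number(lambda t: 2.0 * cmath.exp(-2j * t)))  # -2 revolutions
print(zeros_inside(poly, 1.3), zeros_inside(poly, 2.0))
```

The curve $t \mapsto 2e^{3it}$ on $[0, 2\pi]$ is a reparametrization of $\gamma(t) = 2e^{it}$ on $[0, 6\pi]$, so its index about the origin is 3.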
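Proposition 4.3.114 (Rouché). Let $\gamma$ be a simple, closed, positively oriented $C^1$-curve in an open set $\Omega$, and let
$$G := \{z \in \Omega \setminus \gamma : \operatorname{Ind}_{\gamma} z \neq 0\}.$$
If $f$ is a holomorphic function in $\Omega$ for which $0 \notin f(\gamma)$, then the number $N_f(G)$ of solutions of the equation $f(z) = 0$ that belong to $G$ is equal to
$$N_f(G) = \frac{1}{2\pi i} \int_{\gamma} \frac{f'(z)}{f(z)}\,dz = \operatorname{Ind}_{f \circ \gamma} 0\ {}^{38} \tag{4.3.48}$$
provided the solutions are counted with their multiplicity.$^{39}$ If $g$ is another holomorphic function in $\Omega$ such that
$$|f(z) - g(z)| < |f(z)|, \qquad z \in \gamma, \tag{4.3.49}$$
then $N_g(G) = N_f(G)$.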
for
z = R
The connection between the winding number and the degree $\deg(f, \Omega, p)$ as the latter is defined in Definition 5.2.1 for a regular value $p$ is given by the following result.
For a holomorphic function $f$ and its regular value $p$ the degree $\deg(f, \Omega, p)$ is defined
38 This
39 A
with
$$(g, h)\colon \mathbb{R}^2 \to \mathbb{R}^2,$$
and let $A = \{z_1, \dots, z_n\}$ be the same as in the proof of Lemma 4.3.115. Assume that the solution $z_j$ is of multiplicity $m_j$, and denote
$$m := m_1 + \dots + m_n.$$
According to Proposition 4.3.114, the number of solutions (including multiplicity) of the equation $f(z) = q$ is also $m$ provided $|p - q| < d$. In this neighborhood of the point $p$ there exists a regular value of $f$. This follows either from the Sard Theorem (see Theorem 5.2.3) or, in this special case more easily, from the properties of holomorphic functions, see, e.g., Rudin [113, Theorem 10.32]. If the degree has the property of stability with respect to the point (Property (vi) or (viii) in Theorem 5.2.7 or Theorem 4.3.124), the equality (4.3.50) will also hold for singular values.
In order to motivate the next result, we note that the integral in (4.3.50) can be rewritten as an integral over $\Omega$ (by the Green Theorem). We avoid such uninteresting and intricate calculation, and put aside the special case of holomorphic functions. Instead we will consider the general case of mappings $\mathbb{R}^M \to \mathbb{R}^M$. For the rest of this appendix we will suppose that
(H) $f\colon \overline{\Omega} \to \mathbb{R}^M$ is continuous on $\overline{\Omega}$, $p \in \mathbb{R}^M \setminus f(\partial\Omega)$.
f =
sgn Jf (x)
(4.3.51)
x
f (x)=p
f dx =
and
(f (x))Jf (x) dx
Then
f = 0
since
f () = 0.
The righthand side of (4.3.51) also vanishes by denition ( = 0).
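that
$$f^{-1}(U) \subset \bigcup_{i=1}^{k} V_i$$
(why does such a $U$ exist?). Choose a smooth function $\mu$ with its support in $U$ and normalize $\mu$ so that $\int_U \mu(y)\,dy = 1$. Then
$$\int_{\Omega} f^*\omega = \sum_{i=1}^{k} \int_{V_i} f^*\omega = \sum_{i=1}^{k} \int_{V_i} \mu(f(x)) J_f(x)\,dx = \sum_{i=1}^{k} \operatorname{sgn} J_f(x_i) \int_{V_i} \mu(f(x)) |J_f(x)|\,dx = \sum_{i=1}^{k} \operatorname{sgn} J_f(x_i) \int_{f(V_i)} \mu(y)\,dy = \sum_{i=1}^{k} \operatorname{sgn} J_f(x_i).$$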
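For a regular value $p$ the sum of the signs of the Jacobian over the preimages of $p$ in (4.3.51) is easy to evaluate once the preimages are known. The sketch below illustrates this for three planar maps; it assumes that $\Omega$ is a disc containing the listed points, that the preimages of $p = (1, 0)$ are supplied by hand, and that the Jacobian is approximated by central differences. All names are hypothetical.

```python
def jacobian_det(f, x, y, h=1e-6):
    """Central-difference approximation of the Jacobian determinant of f at (x, y)."""
    fx_plus, fx_minus = f(x + h, y), f(x - h, y)
    fy_plus, fy_minus = f(x, y + h), f(x, y - h)
    a = (fx_plus[0] - fx_minus[0]) / (2.0 * h)
    b = (fy_plus[0] - fy_minus[0]) / (2.0 * h)
    c = (fx_plus[1] - fx_minus[1]) / (2.0 * h)
    d = (fy_plus[1] - fy_minus[1]) / (2.0 * h)
    return a * d - b * c

def degree_from_preimages(f, preimages, tol=1e-8):
    """Sum of sgn J_f over the given preimages of a regular value."""
    total = 0
    for (x, y) in preimages:
        j = jacobian_det(f, x, y)
        if abs(j) < tol:
            raise ValueError("the value is not regular at (%r, %r)" % (x, y))
        total += 1 if j > 0 else -1
    return total

def square(x, y):        # z -> z^2 written in real coordinates
    return (x * x - y * y, 2.0 * x * y)

def conj_square(x, y):   # z -> conjugate of z^2, orientation reversing
    return (x * x - y * y, -2.0 * x * y)

def fold(x, y):          # (x, y) -> (x^2, y), folds the plane along x = 0
    return (x * x, y)

preimages_of_p = [(1.0, 0.0), (-1.0, 0.0)]  # preimages of p = (1, 0) for all three maps
print(degree_from_preimages(square, preimages_of_p))       # 2
print(degree_from_preimages(conj_square, preimages_of_p))  # -2
print(degree_from_preimages(fold, preimages_of_p))         # 0
```

The fold map shows why signs matter: the equation has two solutions, but their contributions cancel, so the degree gives no existence information there.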
40 The proof based on a more explicit construction of the form is given by Mawhin [92]. There the reader can also find a different approach to the homotopy invariance property of the degree (see Lemma 4.3.117 below).
. Then
with supports in a cube Q R \ f () such that
1
f =
f
.
To be sure that this definition does not depend on the choice of the point $q$ we choose another regular value $\tilde q \in B$. It is obvious that a diffeomorphism $h$ of $B$ (onto $B$) can be constructed such that
$$h(q) = \tilde q.$$
Let
be a dierential form constructed in the proof of Proposition 4.3.116 supported on
q U B. Then by the denition of pullback
a cube U,
=
h
.
U
.
Q
This shows that the assumption on regularity of $p$ can be omitted in the definition of the degree.
Now we may drop the assumption on smoothness of $f$. However, this will require some further effort.
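Lemma 4.3.119. Let $\Omega$ be a bounded open set in $\mathbb{R}^M$ and let $H\colon [0, 1] \times \overline{\Omega} \to \mathbb{R}^M$ be such that the mapping $t \in [0, 1] \mapsto H(t, \cdot) \in C^1(\Omega, \mathbb{R}^M) \cap C(\overline{\Omega}, \mathbb{R}^M)$ is continuous. If $p \in \mathbb{R}^M \setminus H([0, 1] \times \partial\Omega)$, then
$$\deg(H(t, \cdot), \Omega, p)$$
is constant on the interval $[0, 1]$.
Proof. First note that $H$ is continuous as a mapping considered on $[0, 1] \times \overline{\Omega}$, and therefore $H([0, 1] \times \partial\Omega)$ is compact and there is an open cube
$$Q \subset \mathbb{R}^M \setminus H([0, 1] \times \partial\Omega)$$
which is a neighborhood of $p$. Choose a smooth $M$-form $\omega$ with its support in $Q$. By definition,
$$\deg(H(t, \cdot), \Omega, p) = \int_{\Omega} (H(t, \cdot))^*\omega.$$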
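Corollary 4.3.120. Suppose (H) and denote $d := \operatorname{dist}(p, f(\partial\Omega))$. If $g$, $h$ are mappings from $C^1(\Omega, \mathbb{R}^M) \cap C(\overline{\Omega}, \mathbb{R}^M)$ such that
$$\|f - g\|_{C(\overline{\Omega}, \mathbb{R}^M)} := \sup_{x \in \overline{\Omega}} |f(x) - g(x)| < d, \qquad \|f - h\|_{C(\overline{\Omega}, \mathbb{R}^M)} < d,$$
then
$$\deg(g, \Omega, p) = \deg(h, \Omega, p).$$
Proof. Put
$$H(t, x) := (1 - t)g(x) + th(x), \qquad t \in [0, 1],\ x \in \overline{\Omega}.$$
For $x \in \partial\Omega$ we have $|H(t, x) - f(x)| \le (1 - t)|g(x) - f(x)| + t|h(x) - f(x)| < d \le |f(x) - p|$, so $p \notin H([0, 1] \times \partial\Omega)$. The assertion now follows from Lemma 4.3.119.
The last step in our approach to the definition of the degree consists in the approximation of a continuous mapping by smooth ones.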
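$$\int_{\mathbb{R}^M} \omega(x)\,dx = 1.$$
Put
$$\omega_n(x) = n^M \omega(nx),$$
$$(\omega_n * g)(x) := \int_{\mathbb{R}^M} \omega_n(x - y) g(y)\,dy, \qquad x \in \mathbb{R}^M,$$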
+
RM \U (x)
The first integral is arbitrarily small for a sufficiently small neighborhood $U(x)$ of $x$ by the continuity of $g$ at $x$; the second integral is zero for sufficiently large $n$ since $\omega_n(x - \cdot)$ vanishes on $\mathbb{R}^M \setminus U(x)$ for such $n$.
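The smoothing by convolution can be observed numerically in one dimension. The sketch below rests on assumptions that are not in the text: the kernel is the standard bump $\exp(-1/(1 - s^2))$ on $(-1, 1)$ normalized numerically, the integral is replaced by a midpoint sum, and the substitution $y = x - s/n$ is used so that $(\omega_n * g)(x) = \int \omega(s)\, g(x - s/n)\,ds$. The names `bump_weights` and `mollify` are hypothetical.

```python
import math

def bump_weights(m=2000):
    """Midpoint nodes in (-1, 1) and normalized weights approximating omega(s) ds."""
    h = 2.0 / m
    nodes = [-1.0 + (k + 0.5) * h for k in range(m)]
    raw = [math.exp(-1.0 / (1.0 - s * s)) for s in nodes]
    total = sum(raw)
    return nodes, [r / total for r in raw]  # the weights sum to 1

NODES, WEIGHTS = bump_weights()

def mollify(g, x, n):
    """Approximation of (omega_n * g)(x) after the substitution y = x - s/n."""
    return sum(w * g(x - s / n) for s, w in zip(NODES, WEIGHTS))

# g(x) = |x| is continuous but not differentiable at 0; the smoothed values
# at x = 0 approach g(0) = 0 like 1/n.
errors = [abs(mollify(abs, 0.0, n)) for n in (1, 10, 100)]
print(errors)

# For a smooth g the convergence at a fixed point is faster.
print(abs(mollify(math.cos, 0.3, 100) - math.cos(0.3)))
```

Since the support of $\omega_n$ shrinks to $\{0\}$ while its integral stays equal to 1, only the values of $g$ near $x$ contribute for large $n$, which is the mechanism used in the estimate above.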
41 The
xX
xA
xX
This theorem permits generalization (nontrivial) to normal topological spaces (see, e.g., Dugundji [43, Section VII.5]). The proof for metric spaces is quite easy. Indeed, without loss of generality we can suppose that $1 \le g \le 2$. Put $G(x) = \inf_{y \in A} \frac{\rho(x, y) g(y)}{\operatorname{dist}(x, A)}$ for $x \in X \setminus A$, and $G(x) = g(x)$ for $x \in A$. It is not difficult to show that $G$ has all the required properties.
42 It is also said that $\omega_n$ converge to the Dirac measure (this is true in the sense of distributions) or that $\{\omega_n\}_{n=1}^{\infty}$ is the so-called approximate unit (the space $L^1(\mathbb{R}^M)$ with the convolution multiplication is a Banach algebra without a unit, and the convergence $\omega_n * g \to g$ takes place in the $L^1$-norm for all $g \in L^1(\mathbb{R}^M)$).
Figure 4.3.26.
We are now able to prove the main properties of the degree. See also Proposition 5.2.2 and Theorem 5.2.7. Notice that the following theorem is also true in the case of manifolds.
Theorem 4.3.124. There exists a mapping $\deg$ which sends any triple $(f, \Omega, p)$ satisfying (H) into $\mathbb{Z}$ and has the following properties:
(i) (normalization property) If $f$ is the identity map, $p \in \Omega$, then
$$\deg(f, \Omega, p) = 1.$$
(ii) (additivity property) If $\Omega_1$, $\Omega_2$ are disjoint open subsets of $\Omega$ and the point $p$ is such that $p \notin f(\overline{\Omega} \setminus (\Omega_1 \cup \Omega_2))$, then
$$\deg(f, \Omega, p) = \deg(f, \Omega_1, p) + \deg(f, \Omega_2, p).$$
(iii) (continuity property) Let $\{f_n, \Omega, p\}$, $n = 0, 1, \dots$, satisfy (H). If the sequence $\{f_n\}_{n=1}^{\infty}$ converges uniformly to $f_0$ on $\overline{\Omega}$, then
$$\lim_{n \to \infty} \deg(f_n, \Omega, p) = \deg(f_0, \Omega, p).$$
43 Because
This means that the degree is locally constant on $\mathbb{R}^M \setminus f(\partial\Omega)$, and therefore it is constant on every component of $\mathbb{R}^M \setminus f(\partial\Omega)$.
(ix) If the equation
$$g(f(x)) = p$$
has no solution in $\Omega$, then the left-hand side of (4.3.52) vanishes (by the solution property). For the same reason all products on the right-hand side are also equal to zero. Suppose therefore that
$$f(\Omega) \cap g^{-1}(p) \neq \emptyset.$$
There is exactly one unbounded component $U_0$ of $\mathbb{R}^M \setminus f(\partial\Omega)$. Regardless of whether $U_0 \cap g^{-1}(p) \neq \emptyset$ or not,
$$\deg(f, \Omega, U_0) = 0$$
(by (v) and (viii)). Since
g1 (p) f ()
k
deg (g f, i , p)
(4.3.53)
i=1
(g f ) .
deg (g f, , p) =
deg (g, Ui , p) =
Ui
g
= 0.
deg (g, Ui , p) =
(4.3.54)
g = 0.
Ui
Let
g (y) = (y) dy 1 dy M .
If does not vanish on Ui , then there are disjoint open sets Ui+ and Ui which carry the
i
i
= deg (f, i , Ui ) =
.
deg (f, i , Ui ) = deg (f, i , Ui ) =
+ (y) dy
(y) dy
Ui+
Ui
= deg (f, i , Ui )
+
i
(y) dy
f (g )
Ui+
Ui
Ui
i.e., (4.3.54) holds in this case as well. Summing up and using (4.3.53) we complete the
proof.
Remark 4.3.125. It can be proved (see Amann & Weiss [5]) that the properties (i)–(iv) determine the degree uniquely.
We will conclude this appendix with some topological applications of the degree. Applications to differential equations are shown in Section 5.2. The well-known and basic Jordan Separation Theorem asserts that a Jordan curve $\gamma$ divides the complex domain into two open components and exactly one of them is bounded (the interior domain of $\gamma$). This theorem has the following generalization.
Theorem 4.3.126 (Generalized Jordan Separation Theorem). Let $K$ be a compact set in $\mathbb{R}^M$ such that $\mathbb{R}^M \setminus K$ has a finite number, say $k$, of components. If $f$ is a continuous injection of $K$ into $\mathbb{R}^M$, then $\mathbb{R}^M \setminus f(K)$ has exactly $k$ components.
Proof (a sketch). Notice first that $f$ is actually a homeomorphism of $K$, i.e., $f^{-1}$ is continuous on the compact set $f(K)$. Applying the Tietze Theorem (see the footnote on page 236) to each coordinate $f^i$ of $f = (f^1, \dots, f^M)$ and $f^{-1}$ we conclude that there are continuous extensions $g$ and $h$ of $f$ and $f^{-1}$, respectively, which are defined in $\mathbb{R}^M$. Denote by $G_0, \dots, G_{k-1}$ the components of $\mathbb{R}^M \setminus K$ where $G_0$ is the unique unbounded component. Similarly, let $U_0, \dots, U_m$ ($m \in \mathbb{N} \cup \{\infty\}$) denote the components of $\mathbb{R}^M \setminus f(K)$ where $U_0$ is the unique unbounded component. The idea of the proof consists in the application of the multiplication property of the degree (Theorem 4.3.124 (ix)): Show that
$$\deg(h \circ g, G_j, p) = \delta_{ij} \quad\text{for}\quad p \in G_i,$$
$$\deg(g \circ h, U_i, q) = \delta_{ij} \quad\text{for}\quad q \in U_j,$$
$i = 1, \dots, m$, $j = 1, \dots, k - 1$.
Define matrices
$$A := (\deg(g, G_j, U_i))_{\substack{i = 1, \dots, m \\ j = 1, \dots, k-1}},$$
M
Proof (a sketch). Show first that it is sufficient to prove the assertion for the case when $G$ is an open ball $B$ and $f$ is continuous and injective on $\overline{B}$. According to Theorem 4.3.126, $\mathbb{R}^M \setminus f(\partial B)$ has exactly two components $U_0$, $U_1$. If $U_0$ denotes the unbounded component, then
$$\mathbb{R}^M \setminus f(\overline{B}) \subset U_0$$
(again by Theorem 4.3.126), i.e., $U_1 \subset f(B)$. To show the opposite inclusion recall that $f(B)$ is bounded and connected.
Remark 4.3.128. If $M < N$ and $f\colon \mathbb{R}^M \to \mathbb{R}^N$ is a continuous injection, then it can be proved (not too easily) that the complement of $f(\mathbb{R}^M)$ is a dense set in $\mathbb{R}^N$. We want to recall the famous Peano curve, i.e., a continuous (but not injective) map from the interval $[0, 1]$ onto the square $[0, 1] \times [0, 1]$.$^{44}$ This map can be used to construct a continuous surjection of $\mathbb{R}^M$ onto $\mathbb{R}^N$ for any $M < N$. The existence of such a surjection for $M \ge N$ is trivial.
44 This example (for the construction see, e.g., Dugundji [43, Section IV.4]) has played an important role in developing the notion of a curve.
As we have mentioned in Section 4.3, our main interest in the degree theory consists in applying it to solving equations, i.e., in using the solution property (Theorem 4.3.124 (v)). This requires computing the degree, which is by no means an easy task. Fortunately, we do not need to know the exact value of the degree. It will be sufficient to show that it is not equal to zero. For this purpose the following mapping property is very important.
Definition 4.3.129.
(1) A nonempty subset $A$ of a linear space $X$ is said to be symmetric if for every $x \in A$ we have $-x \in A$.
for each
x A.
in
There are several proofs of this important topological theorem. For the proof based
on algebraic machinery see, e.g., Dugundji [43, Section XVI.6]. Here we present the main
steps of an analytic proof which is taken from Schwartz [118] (see also Nirenberg [100]
or Rothe [111] or Krawcewicz & Wu [80]).
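Proof of Theorem 4.3.130. The assertion is obvious for $M = 1$, therefore, we assume that $M > 1$. The main idea of this proof is quite simple. First observe that $\deg(f, \Omega, o)$ does not depend on a continuous extension of $f$ from $\partial\Omega$ into $\overline{\Omega}$ (see Theorem 4.3.124(vii)). Since $o \in \Omega$, there is a small open ball $B := B(o; \varepsilon)$ inside $\Omega$, and there is a continuous mapping $g\colon \overline{\Omega} \to \mathbb{R}^M$ such that
$$g(x) = \begin{cases} f(x) & \text{for } x \in \partial\Omega, \\ x & \text{for } x \in \overline{B} \end{cases}$$
(by the Tietze Theorem). Part (ii) of Theorem 4.3.124 implies that
$$\deg(g, \Omega, o) = \deg(g, \Omega \setminus \overline{B}, o) + \deg(g, B, o) = \deg(g, \Omega \setminus \overline{B}, o) + 1.$$
If $g$ is constructed in such a way that $g \in C^1(\Omega \setminus \overline{B})$ and $o$ is a regular value of $g$, then
$$\deg(g, \Omega \setminus \overline{B}, o) = \sum_{x \in S} \operatorname{sgn} J_g(x) \quad\text{where}\quad S = \{x \in \Omega \setminus \overline{B} : g(x) = o\}.$$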
If, moreover, $g$ is odd, then $S$ is either the empty set or a symmetric set and $g'(x) = g'(-x)$, $\deg(g, \Omega \setminus \overline{B}, o)$ is an even integer, and the proof is completed. Unfortunately, it is not known whether such $g$ does exist. Therefore, we will show that all the required properties of $g$ are not actually needed. The demand for regularity of $o$ can be replaced by the assumption that $g$ does not vanish on a part of a hyperplane, for instance on
$$H := \{(x_1, \dots, x_{M-1}, 0) \in \overline{\Omega} \setminus B\}.$$
Indeed, in this case we have
$$\deg(g, \Omega \setminus \overline{B}, o) = \deg(g, H_+, o) + \deg(g, H_-, o)$$
where
Moreover,
deg (g, H+ , o) =
g =
H+
H
+
g = deg (g, H , o)
for all
x H {(x1 , . . . , xM ) D : xM = 0}.
f (x),
x D,
f(x),
x H,
g(x) =
g(x),
x H+ ,
g (x), x H ,
where g is the Tietze extension of f and f.
The existence of an extension f follows from the following slightly more general
assertion:
for some
c
.
2
Put
1 (x1 , . . . , xM ) = 1 (x1 , . . . , xN )
In particular, this means that
i.e., all points of 1 (U) = 1 (U R
Theorem (Theorem 5.2.3)
for
x = (x1 , . . . , xN , . . . , xM ) U RM N .
det
1 (x) = 0,
M N
meas 1 (U) = 0
(meas is the Lebesgue measure in RM ), and RM \ 1 (U) is dense.45 Therefore, there is a
point y0 RM \ 1 (U) such that y0 < 2 . Put 2 = 1 y0 . Then 2 (x)
= o for every
x L and
f 2 C(K) < .
Moreover,
2 (x)
c
2
for
x K.
We can assume that the last inequality holds for all x L. Otherwise, we multiply 2
c
by the function 2
()
x L.
y = (0, . . . , 0, 1) RM +1 ,
where
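45 In fact we do not need here the whole strength of the Sard Theorem. The following much weaker result is sufficient:
Let $G \subset \mathbb{R}^M$ be an open set and $\varphi\colon G \to \mathbb{R}^M$ a $C^1$-mapping on $G$. If $A \subset G$ has Lebesgue measure zero, so has $\varphi(A)$.
Indeed, every point of $A$ belongs to a ball $B \subset G$ on which $\varphi'$ is bounded. By the Mean Value Theorem, there is a constant $K$ such that
$$|\varphi(x) - \varphi(y)| \le K|x - y|, \qquad x, y \in B. \tag{$*$}$$
Since $\mathbb{R}^M$ is separable, the set $A$ can be covered by countably many balls $\{B_j\}_{j \in \mathbb{N}}$, i.e., $A = \bigcup_{j=1}^{\infty} (A \cap B_j)$, and $\varphi(A) = \bigcup_{j=1}^{\infty} \varphi(A \cap B_j)$. To complete the proof we show that $\operatorname{meas}(\varphi(A \cap B_j)) = 0$, $j \in \mathbb{N}$. So take $\varepsilon > 0$. Since $\operatorname{meas}(A \cap B_j) = 0$, there are countably many cubes (or balls) $\{Q_k\}_{k \in \mathbb{N}}$ with $A \cap B_j \subset \bigcup_{k=1}^{\infty} Q_k$, $\sum_{k=1}^{\infty} \operatorname{meas} Q_k < \varepsilon$, and the estimate ($*$) holds for all $Q_k$. Hence
$$\operatorname{meas}(\varphi(A \cap B_j)) \le \sum_{k=1}^{\infty} \operatorname{meas} \varphi(Q_k) \le c \sum_{k=1}^{\infty} \operatorname{meas} Q_k < c\varepsilon$$
where the constant $c$ depends only on the dimension $M$ and on $K$. Since $\varepsilon > 0$ is arbitrary, $\operatorname{meas}(\varphi(A \cap B_j)) = 0$ for all $j \in \mathbb{N}$. To obtain the result required in the proof above take
$$A = U \times \underbrace{\{0, \dots, 0\}}_{(M-N)\text{-tuple}}.$$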
n
ak xk , an
= 0, use the homotopy
k=0
on
: B(o; 1) RM +1 \ {o}.
(This is Borsuk's original formulation of Theorem 4.3.130.)
Exercise 4.3.137. Deduce the assertion of Exercise 4.3.134 from the Borsuk–Ulam Theorem (Corollary 4.3.132).
Exercise 4.3.138. Prove the following result due to Lusternik and Schnirelmann:
Let $F_1, \dots, F_{M+1}$ be closed sets which cover $S^M$. Then at least one $F_i$ contains a pair of antipodal points (i.e., $x, -x \in F_i$).
Hint. Let
(x) x,
x SM,
for
i = 1, . . . , M.
f i ((Fi )) = {1}
(this consequence of the Tietze Theorem is known in a normal topological space as the Urysohn Lemma). Put $f = (f^1, \dots, f^M)$ and apply the Borsuk–Ulam Theorem to obtain a point $x_0$. Show that
x0 FM +1 (FM +1 ).
Exercise 4.3.139. Prove the following complement of the above covering result of Lusternik and Schnirelmann:
There exist closed sets $F_1, \dots, F_{M+2}$ which cover $S^M$, and such that no $F_i$ contains a pair of antipodal points.
Hint. Proceed by induction with respect to $M$. The assertion is obviously true for $M = 1$. Let $M = 2$. Cover the equator of $S^2$ with three closed sets $E_1$, $E_2$ and $E_3$, with $E_j \cap (-E_j) = \emptyset$, $j = 1, 2, 3$. Then choose a latitude $L$ on the southern hemisphere and extend the cover of the equator to a covering $A_1$, $A_2$ and $A_3$ of the set of all points which lie to the north of $L$, including those of latitude $L$ (see Figure 4.3.27).
Figure 4.3.27.
Here $A_j$ consists of all great circle arcs from latitude $L$ to the north pole which contain a point of $E_j$. Finally, let $A_4$ consist of all points lying to the south of $L$, including those of latitude $L$. Then $A_1$, $A_2$, $A_3$ and $A_4$ is the desired covering of $S^2$. Continue the argument for $M \ge 3$.
Exercise 4.3.140. Prove the following Bread-Ham-Cheese Theorem:
If $B_1, \dots, B_M$ are bounded measurable sets in $\mathbb{R}^M$ with $M \ge 1$, then there is an $(M-1)$-dimensional plane which divides all the sets $B_j$ into two parts of the same measure.
This assertion can be reformulated in three dimensions as follows:
Suppose we have a sandwich of bread, ham and cheese with ham and cheese piled attractively but irregularly on the bread. Then the sandwich can be cut in two with one straight slash of a knife in such a way that each of two persons gets an identical share of bread, ham and cheese.
Hint. Let $M = 2$ and let $d \in S^1$ determine a direction in $\mathbb{R}^2$. Take a perpendicular line to $d$ and move it from $-\infty$ to $+\infty$ (see Figure 4.3.28).
Figure 4.3.28.
Take the first and the last perpendicular (they need not be distinct) which split the set $B_1$ into two parts of the same measure. The perpendicular $H_d$ which is halfway between these two has the equation
$$(x, d) = a(d)$$
where $a\colon \mathbb{R}^2 \to \mathbb{R}$ satisfies
$$a(-d) = -a(d).$$
In order to find $d$ for which $H_d$ also splits $B_2$ into two parts of the same measure, we set
$$f(d) := \operatorname{meas}\{x \in B_2 : (x, d) > a(d)\}.$$
Then $f\colon S^1 \to \mathbb{R}$ is continuous and
$$f(d) + f(-d) = \operatorname{meas} B_2.$$
By the Borsuk-Ulam Theorem, there is a point $d$ such that $f(d) = f(-d)$. Thus the corresponding $H_d$ divides $B_2$ into two parts of the same measure.
If $M \ge 3$, then construct functions $f_2, \dots, f_M$ corresponding to $B_2, \dots, B_M$.
Chapter 5
Topological and
Monotonicity Methods
5.1 Brouwer and Schauder Fixed Point Theorems
One of the most frequent problems in analysis, especially in its applications, consists in solving the equation
$$F(x) = y$$
where $F$ is a mapping from a Banach space $X$ into a Banach space $Y$.$^1$ Such an equation can be reduced to the equation $F(x) = o$, or, provided $X \subset Y$, to the equation
$$F(x) = x. \tag{5.1.1}$$
In this section we present two basic results on the solvability of (5.1.1) in a special case, namely, for a continuous mapping $F$ and a finite dimensional $X$, and a compact mapping $F$ in a general Banach space of infinite dimension: the Brouwer and the Schauder Fixed Point Theorems.
We start with the finite dimensional case. A brief inspection of $F\colon \mathbb{R} \to \mathbb{R}$ indicates that reasonable assumptions on $F$ are continuity on a closed interval $I$ and $F(I) \subset I$. Moreover, the interval $I$ should also be bounded. The Intermediate Value Theorem from Calculus applied to $g(x) = F(x) - x$ says that there is a solution of (5.1.1) in $I$ provided these assumptions are satisfied. Notice that these assumptions are too weak to say anything about the number of solutions. Having no appropriate ordering in $\mathbb{R}^2$, standard proofs of the above result fail in $\mathbb{R}^2$ and, therefore, a generalization is far from being simple.
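The one-dimensional argument can be turned into a small computation. The following is a minimal sketch, not part of the original text: it assumes a continuous $F$ with $F([a,b]) \subset [a,b]$ and applies bisection to $g(x) = F(x) - x$. The function name `fixed_point_by_bisection` and the test map $F = \cos$ on $[0, 1]$ are illustrative choices only.

```python
import math
from typing import Callable


def fixed_point_by_bisection(F: Callable[[float], float], a: float, b: float,
                             tol: float = 1e-12) -> float:
    """Locate a fixed point of a continuous F that maps [a, b] into itself."""
    def g(x: float) -> float:
        # F([a, b]) inside [a, b] forces g(a) >= 0 >= g(b).
        return F(x) - x

    lo, hi = a, b
    if g(lo) == 0.0:
        return lo
    if g(hi) == 0.0:
        return hi
    # Invariant of the loop: g(lo) >= 0 and g(hi) < 0.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) >= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# cos maps [0, 1] into [cos 1, 1], a subset of [0, 1].
x_star = fixed_point_by_bisection(math.cos, 0.0, 1.0)
```

The sketch relies on the ordering of the real line, which is exactly the ingredient that is missing in $\mathbb{R}^2$.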
1 Spaces $X$, $Y$ are assumed to have linear and topological structure since we discuss problems of analysis and not only of algebra or topology. Banach space structures are supposed mainly for simplification here, but sometimes they can be crucial.
for x = 1.
However, this seems to be impossible as our experience says that if the ball is continuously deformed (by $G$) onto the sphere, this ball has to be punctured. Nevertheless, a rigorous proof of this fact is far from being obvious. Such a proof could be based on introducing certain topological notions which are preserved under continuous deformation. If we show that some of these notions are different for the ball and its boundary, we obtain a contradiction with the existence of $G$. Algebraic topology is devoted to the study of topological invariants of an algebraic nature (homotopic groups, homological groups, etc.). However, such methods are beyond the scope of this book.
Instead we will give an analytic proof of the existence of a fixed point of a continuous mapping $F\colon B \to B$. This proof, which is due to Milnor [95], is based on the idea of approximating a "bad" nonlinear mapping $F$ by a simpler one. A smooth approximation is possible by the Weierstrass Approximation Theorem (see Theorem 1.2.14 and the discussion there).
Suppose, by contradiction, that $F$ has no fixed point in $B$. For any $\varepsilon > 0$ there are polynomials $P_1, P_2, \dots, P_N$ of $N$ variables, such that for $P = (P_1, \dots, P_N)$ we have
$$\sup_{\|x\| \le 1} \|F(x) - P(x)\| < \varepsilon$$
and also that $\tilde{P} := \frac{P}{1+\varepsilon}\colon B \to B$. Moreover, $\tilde{P}$ has no fixed points in $B$, either. This follows from the estimate
x
F (x) x F (x) P (x) + (1 + ) P (x) x +
1+
for a constant $c$ and all $x, y \in \operatorname{int} B$, the mapping $H(t, \cdot)$ is injective for small $t$ and hence it is also a diffeomorphism on the whole $\operatorname{int} B$. It remains to prove that
$$H(t, \cdot)(\operatorname{int} B) = \operatorname{int} B.$$
Notice that $\operatorname{int} B$ is a connected set and thus it is sufficient to show that
$$M := H(t, \cdot)(\operatorname{int} B)$$
is open and relatively closed in $\operatorname{int} B$. The former property is a consequence of the local continuous invertibility of $H(t, \cdot)$. To see the latter property we point out that
$$M = H(t, \cdot)(B) \cap \operatorname{int} B$$
($\|H(t, x)\| = 1$ for every $\|x\| = 1$). Since $B$ is compact, $H(t, \cdot)(B)$ is also compact and therefore closed, i.e., $M$ is relatively closed in $\operatorname{int} B$.
Having Lemma 5.1.1 we continue the proof of our main statement on the existence of a fixed point of $F$. To reach a contradiction with the assumption of nonexistence, we will use the substitution theorem for the Lebesgue integral: If $\operatorname{meas} A$ denotes the $N$-dimensional Lebesgue measure of $A \subset \mathbb{R}^N$, we have
$$\operatorname{meas}(\operatorname{int} B) = \int_{\operatorname{int} B} dx = \int_{\operatorname{int} B} \det H_2'(t, y)\, dy \tag{5.1.2}$$
G(int B)
xRN
It is not difficult to prove that this mapping is a homeomorphism of $K$ onto $B^N \cap X$. Since $X$ with the induced $\mathbb{R}^N$ norm is isomorphic to $\mathbb{R}^M$ (Corollary 1.2.11(i)), $K$ is also homeomorphic to $B^M$.
2 Notice
The first main result of this section is the following Brouwer Fixed Point Theorem.
Theorem 5.1.3 (Brouwer Fixed Point Theorem). Let $K$ be a nonempty, convex, closed and bounded subset of $\mathbb{R}^N$. Assume that $F\colon K \to K$ is continuous. Then $F$ has a fixed point in $K$.
Proof. If $K$ has exactly one point then the statement is obvious. In other cases choose a homeomorphism $\varphi$ of $B^M = B(o; 1) \subset \mathbb{R}^M$ onto $K$ (Lemma 5.1.2). According to the above discussion, the mapping $\varphi^{-1} \circ F \circ \varphi\colon B^M \to B^M$ has a fixed point $\tilde{x} \in B^M$. Then
$$F(\varphi(\tilde{x})) = \varphi(\tilde{x}) \in K.$$
for all i, j = 1, . . . , N.
i = 1, . . . , N.
N
xi ,
i=1
Ax
Ax1
for x D.
where = Ax1 .
Let us mention now the standard application of the Brouwer Fixed Point Theorem to the existence of periodic solutions of ordinary differential equations. The basic idea goes back to H. Poincaré: Denote by $x(\cdot; \xi)$ a solution of the initial value problem
$$\dot{x}(t) = f(t, x(t)), \qquad x(0) = \xi. \tag{5.1.3}$$
Assume that $f$ satisfies conditions which ensure the existence and uniqueness of a solution of (5.1.3) (see, e.g., Theorem 2.3.4) and, moreover, that $f(\cdot, x)$ is $T$-periodic. Then $x(\cdot; \xi)$ is a $T$-periodic solution of (5.1.3) if and only if $x(\cdot; \xi)$ is defined on the interval $[0, T]$ and
$$P\xi := x(T; \xi) = \xi$$
($P$ is called the Poincaré mapping). Since $x(\cdot; \xi)$ depends continuously on the initial condition $\xi$ (under reasonable assumptions on $f$, see Remark 2.3.5 and Example 4.2.5), the Poincaré mapping is continuous and its fixed points can be found with help of the Brouwer Fixed Point Theorem as the following example suggests.
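The Poincaré mapping can also be explored numerically. The sketch below is only an illustration under stated assumptions: the scalar right-hand side $f(t, x) = -x^3 + \sin 2\pi t$ with period $T = 1$ is a hypothetical choice, the flow is approximated by the classical Runge-Kutta scheme, and in one dimension the fixed point of $P$ is located by bisection. Since $f(t, -2) > 0 > f(t, 2)$, the mapping $P$ sends $[-2, 2]$ into itself.

```python
import math

T = 1.0  # period of the right-hand side (assumption of this sketch)


def f(t: float, x: float) -> float:
    """Hypothetical T-periodic right-hand side, f(t + T, x) = f(t, x)."""
    return -x ** 3 + math.sin(2.0 * math.pi * t)


def poincare(xi: float, steps: int = 2000) -> float:
    """Approximate P(xi) = x(T; xi) by the classical Runge-Kutta scheme."""
    h = T / steps
    t, x = 0.0, xi
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + 0.5 * h, x + 0.5 * h * k1)
        k3 = f(t + 0.5 * h, x + 0.5 * h * k2)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return x


def periodic_initial_value(lo: float = -2.0, hi: float = 2.0,
                           tol: float = 1e-10) -> float:
    """Bisection for P(xi) - xi = 0 on an interval mapped into itself by P."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if poincare(mid) - mid >= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


xi_star = periodic_initial_value()
```

The value `xi_star` is the initial condition of an (approximate) $1$-periodic solution. In higher dimensions bisection is no longer available, which is where the Brouwer Fixed Point Theorem is needed.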
Example 5.1.5. Assume in addition that there exists r > 0 such that
(x, f (t, x))RN 0
(5.1.4)
for all t R, x RN .3
(5.1.5)
such that $\Phi(s, s) = I$.$^4$ Then the solution of (5.1.3) satisfies the integral equation
$$x(t; \xi) = \Phi(t, 0)\xi + \int_0^t \Phi(t, s) g(s, x(s; \xi))\, ds \tag{5.1.6}$$
(the Variation of Constants Formula) whenever it exists on the interval $[0, t]$. The fundamental matrix $\Phi$ is continuous on $[0, T] \times [0, T]$ and therefore it is bounded:
(t, s)RN N K,
x(s; ) ds
0
(5.1.7)
whenever x(; ) is dened on the interval [0, t] [0, T ]. If the maximal interval of
the existence of the solution x(; ) is [0, ) with T , then the boundedness of
x(; ), (5.1.7) and the condition (H) imply that x(; ) is uniformly continuous on
[0, ), and therefore it can be extended to a larger interval (see Proposition 1.2.4
and cf. a similar idea in Corollary 3.1.6). This implies that x(; ) is dened on
[0, T ] (actually on R) and the mapping P is dened for all RN .
To apply the Brouwer Fixed Point Theorem (the problem is to show that P
maps a ball into itself) we assume that 1 is not the Floquet multiplier of the linear
equation (5.1.5), i.e., 1 ((T, 0)) or, equivalently, the equation (5.1.5) possesses
only the trivial T periodic solution. Then the equation P () = is equivalent to
the equation
F () [I (T, 0)]1 [x(T ; ) (T, 0)] = .
From (5.1.6) and (5.1.7) we obtain
F () [I (T, 0)]1 [Kb + K(L1 + L2 )]T = c1 () + c2
where c2 does not depend on . Choose small enough to satisfy c2 < 1. Keeping
such xed there is r > 0 such that c1 () + c2 r r. It follows that F maps the
ball B(o; r) RN of the radius r into itself and the Brouwer Fixed Point Theorem
g
yields a T periodic solution of (5.1.4).
The Brouwer Fixed Point Theorem is a very strong device for solving finite dimensional nonlinear equations. Unfortunately, it does not hold in infinite dimensions as the following example shows.
4 This means that $x\colon t \mapsto \Phi(t, s)\xi$ is a (unique) solution of the equation (5.1.5) which satisfies $x(s) = \xi$.
Example 5.1.7 (Kakutani). Let $H$ be a separable Hilbert space with an orthonormal basis $\{e_n\}_{n=1}^{\infty}$. Denote by $A \in L(H)$ the right shift given by
$$Ae_n = e_{n+1}, \quad \text{i.e.,} \quad A\left(\sum_{n=1}^{\infty} x_n e_n\right) = \sum_{n=1}^{\infty} x_n e_{n+1},$$
and
1
for
x 1.
n=1
xn = xn+1
The series
and
x1 = (1 x2 ) 2 .
n=1
0. Then
x1 = 1,
a contradiction.
Notice that in the previous example the apparently simple linear operator $A$ is perturbed by a nonlinear operator with a one-dimensional range. Continuous operators with the range in a finite dimensional subspace form an important special subclass of the so-called (nonlinear) compact operators.
Definition 5.1.8. Let $X$, $Y$ be normed linear spaces and let $M \subset X$. A mapping $F\colon M \to Y$ is called a compact operator on $M$ into $Y$ if $F$ is continuous on $M$ ($M$ being a metric space with the metric induced by the norm of $X$) and $F(M \cap K)$ is a relatively compact set in $Y$ for any bounded set $K \subset X$.
The set of all compact operators from $M$ into $Y$ is denoted by $\mathcal{C}(M, Y)$. If the range of $F \in \mathcal{C}(M, Y)$ is a subset of a finite dimensional subspace of $Y$, then we say that $F$ is a finite dimensional operator and write $F \in \mathcal{C}_f(M, Y)$.
We recall that linear compact operators have been investigated in Section 2.2.
Warning. In contrast to the linear case the continuity of a nonlinear operator $F$ is not a consequence of the fact that $F$ maps bounded sets onto relatively compact ones! A simple example can be constructed for $F\colon \mathbb{R} \to \mathbb{R}$.
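One possible example of this kind (our choice, the text leaves the construction to the reader) is a step function: its range is the finite set $\{0, 1\}$, so every image of a bounded set is relatively compact, yet the function jumps at the origin. A minimal sketch:

```python
def F(x: float) -> float:
    """Step function with range {0, 1}; it is discontinuous at x = 0."""
    return 1.0 if x >= 0.0 else 0.0


# The image of the bounded set [-5, 5] (sampled) lies in the finite set {0, 1},
# which is compact.
image = {F(k / 100.0) for k in range(-500, 501)}

# The values of F at 0 and just to the left of 0 differ by 1,
# so F is not continuous.
jump = F(0.0) - F(-1e-12)
```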
Our interest in compact operators arises from the observation that they are close to finite dimensional ones. The precise formulation follows.
Theorem 5.1.9. Let X be a normed linear space, Y a Banach space and let M be
a bounded subset of X.
(ii) If $\{F_n\}_{n=1}^{\infty} \subset \mathcal{C}(M, Y)$ and $\lim_{n\to\infty} F_n(x) = F(x)$ uniformly for $x \in M$, then $F \in \mathcal{C}(M, Y)$.
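Proof. (i) Since $\overline{F(M)}$ is compact there is a finite $\frac{1}{n}$-net $y_1, \dots, y_m \in F(M)$ of $F(M)$ (Proposition 1.2.3). The functions $\psi_k(x) = \max\{0, \frac{1}{n} - \|F(x) - y_k\|\}$ are continuous on $M$ and $\sum_{k=1}^{m} \psi_k(x) > 0$ for every $x \in M$. Therefore the functions
$$\varphi_k(x) := \frac{\psi_k(x)}{\sum_{k=1}^{m} \psi_k(x)}, \qquad k = 1, \dots, m,$$
are continuous on $M$ as well. Put
$$F_n(x) := \sum_{k=1}^{m} \varphi_k(x) y_k, \qquad x \in M.$$
Then
$$\|F(x) - F_n(x)\| = \left\|\sum_{k=1}^{m} \varphi_k(x)\,(F(x) - y_k)\right\| \le \sum_{k=1}^{m} \varphi_k(x) \|F(x) - y_k\| < \frac{1}{n}$$
for every $x \in M$.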
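The construction in part (i) is explicit enough to be tried on a computer. The following sketch is an illustration under our own assumptions: $M = [0, 1]$ is replaced by a finite sample, $F(x) = (\cos 3x, \sin 3x)$ is a hypothetical continuous map into $\mathbb{R}^2$, and the $\frac{1}{n}$-net is chosen greedily from the sampled image. The names `build_net` and `schauder_approx` are ours.

```python
import math


def F(x: float) -> tuple:
    """Hypothetical continuous map from M = [0, 1] into R^2."""
    return (math.cos(3.0 * x), math.sin(3.0 * x))


def dist(p: tuple, q: tuple) -> float:
    return math.hypot(p[0] - q[0], p[1] - q[1])


def build_net(points: list, eps: float) -> list:
    """Greedy eps-net: every listed point is closer than eps to a net point."""
    net = []
    for p in points:
        if all(dist(p, y) >= eps for y in net):
            net.append(p)
    return net


def schauder_approx(net: list, eps: float):
    """Return F_n(x) = sum_k phi_k(x) y_k with phi_k = psi_k / sum_j psi_j."""
    def F_n(x: float) -> tuple:
        Fx = F(x)
        psi = [max(0.0, eps - dist(Fx, y)) for y in net]
        total = sum(psi)  # positive whenever F(x) is eps-close to the net
        return tuple(sum(w * y[i] for w, y in zip(psi, net)) / total
                     for i in range(2))
    return F_n


n = 10
eps = 1.0 / n
samples = [k / 1000.0 for k in range(1001)]
net = build_net([F(x) for x in samples], eps)
F_n = schauder_approx(net, eps)
max_err = max(dist(F(x), F_n(x)) for x in samples)
```

On the sampled points the error `max_err` stays below $\frac{1}{n}$, and the range of `F_n` lies in the convex hull of the finitely many net points, in agreement with the estimate above.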
(ii) If we literally translate the classical proof for real functions to vector functions we see that $F$ is continuous on $M$. Let $n \in \mathbb{N}$ be such that
$$\sup_{x \in M} \|F(x) - F_n(x)\| < \varepsilon$$
and $y_1, \dots, y_k$ is an $\varepsilon$-net for $F_n(M)$. Then it is also a $2\varepsilon$-net for $F(M)$. Since $Y$ is a Banach space, Proposition 1.2.3 shows that $\overline{F(M)}$ is compact.
Remark 5.1.10. The assertion (i) of Theorem 5.1.9 obviously holds for linear compact operators, but generally we cannot guarantee linearity of the approximating
Proof. Let $\{F_n\}_{n=1}^{\infty}$ be the sequence constructed in the proof of Theorem 5.1.9(i). Denote
$$X_n := \operatorname{Lin}\{y_1, \dots, y_m\}.$$
Since $y_1, \dots, y_m \in F(K)$ and $K$ is convex, we have
$$F_n(K) \subset K \cap X_n.$$
The restriction of $F_n$ to $K \cap X_n$ satisfies the assumptions of the Brouwer Fixed Point Theorem and hence there is $x_n \in K \cap X_n$ such that
$$F_n(x_n) = x_n.$$
1
nk
and
F (x) = x.
Remark 5.1.12. The above proof of Theorem 5.1.11 is based on the approximation of $F$ by $F_n \in \mathcal{C}_f(K, X)$. The construction in the proof of Theorem 5.1.9(i) is surely not unique. We recommend that the reader think about a possible simplification when $F$ acts on a separable Hilbert space.
Another possibility occurs when $K$ is a compact convex set. We obtain a typical situation as soon as $X$ is a reflexive Banach space and $K$ is a closed, convex and bounded subset of $X$. Then $K$ is compact in the weak topology (Theorem 2.1.25)$^5$ and the continuity of $F\colon K \to K$ in the weak topology (it sends weakly convergent sequences into weakly convergent ones) is sufficient to justify application of Theorem 5.1.11.
A slightly more general statement was proved by A.N. Tikhonov (for a proof see, e.g., Dugundji [43, Appendix 1] and Deimling [34, 10.3]).
We now show how the Schauder Fixed Point Theorem can be applied to differential equations. To avoid technical details we restrict ourselves to ordinary differential equations. Their solutions are generally smooth, which suggests a relation to compact operators.
Proposition 5.1.13. Let $G$ be an open subset of $\mathbb{R}^{N+1}$ and let $f\colon G \to \mathbb{R}^N$ be continuous on $G$. Then for any $(t_0, x_0) \in G$ there exists $\delta > 0$ such that the equation
$$\dot{x} = f(t, x)$$
has a solution on the interval $(t_0 - \delta, t_0 + \delta)$ which satisfies the condition $x(t_0) = x_0$.
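Continuity of $f$ yields existence but not uniqueness. A standard illustration (our choice, it is not taken from the text) is $f(t, x) = \sqrt{|x|}$ with $x(0) = 0$: both $x \equiv 0$ and $x(t) = t^2/4$ satisfy the corresponding integral equation on $[0, 1]$. The sketch below checks this numerically with the trapezoidal rule; the helper `residual` is a hypothetical name.

```python
import math


def f(t: float, x: float) -> float:
    """Continuous, but not Lipschitz continuous in x near x = 0."""
    return math.sqrt(abs(x))


def residual(x, t_end: float = 1.0, steps: int = 1000) -> float:
    """Largest grid value of |x(t) - x(0) - int_0^t f(s, x(s)) ds|."""
    h = t_end / steps
    integral = 0.0
    worst = 0.0
    for k in range(1, steps + 1):
        s0, s1 = (k - 1) * h, k * h
        # Trapezoidal rule on [s0, s1].
        integral += 0.5 * h * (f(s0, x(s0)) + f(s1, x(s1)))
        worst = max(worst, abs(x(s1) - x(0.0) - integral))
    return worst


res_zero = residual(lambda t: 0.0)
res_parabola = residual(lambda t: t * t / 4.0)
```

Both residuals vanish up to rounding errors, so the initial value problem has at least two solutions.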
Proof. It has been shown in Lemma 3.1.5 that the initial value problem is equivalent to the integral equation
$$F(x)(t) := x_0 + \int_{t_0}^{t} f(s, x(s))\, ds = x(t) \tag{5.1.8}$$
also have to use the fact that a convex set which is closed in the norm topology is also
weakly closed (cf. Exercise 2.1.39).
for
(t, x) M.
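Then
$$\|F(x) - x_0\|_{C[t_0-\delta, t_0+\delta]} \le c\delta \le r$$
for
$$x \in K := \{y \in C[t_0 - \delta, t_0 + \delta] : \|y - x_0\|_{C[t_0-\delta, t_0+\delta]} \le r\}$$
provided $\delta$ is sufficiently small. This proves that $F(K) \subset K$. Since $f$ is also uniformly continuous on $M$, the operator $F$ is continuous on $K$ (the convergence on $K$ is the uniform convergence). Further, for $t, s \in [t_0 - \delta, t_0 + \delta]$, $t < s$, $x \in K$, we have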
s
F (x)(t) F (x)(s)RN
(5.1.10)
We have dealt with this problem already in Example 2.3.8. It has been proved there
that y is a solution of this problem if and only if it is continuous and satises the
integral equation (f is assumed to be continuous)
F y(t)
(5.1.11)
Since
$$\frac{d^2}{dt^2} F(y)(t) = f(t, y(t)),$$
$F$ maps the ball $B(o; R)$ in $C[0, 1]$ into the set of functions which have uniformly bounded second derivatives. Thus $F(B(o; R))$ is relatively compact in $C[0, 1]$ (see Theorem 1.2.13).
(ii) The operator $F$ is a composition of a linear integral operator and a Nemytski operator (see Example 3.2.21). The Nemytski operator
$$y \mapsto f(\cdot, y(\cdot))$$
is continuous from $C[0, 1]$ into itself, and the integral operator
$$K\colon x \mapsto \int_0^1 G(\cdot, s) x(s)\, ds$$
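then
$$\|F(y)\|_{C[0,1]} \le \left[a + b\|y\|_{C[0,1]}\right] \sup_{t\in[0,1]} \int_0^1 |G(t, s)|\, ds \le \frac{a + b\|y\|_{C[0,1]}}{8}.$$
Whenever $b < 8$ and $R \ge \frac{a}{8-b}$, the operator $F$ maps the ball $B(o; R)$ into itself, and the Schauder Fixed Point Theorem yields a solution.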
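The constant $\frac{1}{8}$ in the last estimate can be checked numerically. The sketch below assumes the usual Green function of $-\ddot{x} = h$, $x(0) = x(1) = 0$, namely $G(t, s) = s(1-t)$ for $s \le t$ and $G(t, s) = t(1-s)$ for $s \ge t$; this explicit form is our assumption, since the text only refers to Example 2.3.8 for it. For this $G$ one has $\int_0^1 G(t, s)\, ds = \frac{t(1-t)}{2}$, with maximum $\frac{1}{8}$ at $t = \frac{1}{2}$.

```python
def G(t: float, s: float) -> float:
    """Assumed Green function of -x'' = h with x(0) = x(1) = 0."""
    return s * (1.0 - t) if s <= t else t * (1.0 - s)


def integral_of_G(t: float, steps: int = 2000) -> float:
    """Midpoint rule for int_0^1 G(t, s) ds (G is nonnegative)."""
    h = 1.0 / steps
    return sum(G(t, (k + 0.5) * h) for k in range(steps)) * h


# Supremum over a grid of t values that contains t = 1/2.
sup_value = max(integral_of_G(j / 200.0) for j in range(201))
```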
Exercise 5.1.15. If $f$ has a sublinear growth in $y$, i.e., there is $\alpha \in [0, 1)$ such that
$$|f(s, y)| \le a + b|y|^{\alpha},$$
then no restriction on $b$ is needed. Prove this fact!
Exercise 5.1.16. Prove the Gronwall inequality:
Let $f$ be a nonnegative continuous function on an interval $[a, b]$ and let $A$, $B$ be nonnegative reals. Assume that
$$f(t) \le A + B \int_a^t f(s)\, ds, \qquad t \in [a, b].$$
Then
$$f(t) \le A e^{B(t-a)}, \qquad t \in [a, b]. \tag{5.1.12}$$
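Before proving the inequality it may help to test it on a concrete function. The following sketch uses the hypothetical data $A = 2$, $B = 3$, $[a, b] = [0, 1]$ and $f(t) = A(1 + B(t - a))$, for which the hypothesis holds because $A + B\int_a^t f(s)\, ds - f(t) = \frac{1}{2}AB^2(t-a)^2 \ge 0$. It is a numerical check, not a proof.

```python
import math

A, B, a, b = 2.0, 3.0, 0.0, 1.0  # hypothetical test data


def f(t: float) -> float:
    """Test function satisfying the integral hypothesis of the exercise."""
    return A * (1.0 + B * (t - a))


def hypothesis_gap(t: float, steps: int = 1000) -> float:
    """A + B * int_a^t f(s) ds - f(t), computed by the midpoint rule."""
    h = (t - a) / steps
    integral = sum(f(a + (k + 0.5) * h) for k in range(steps)) * h
    return A + B * integral - f(t)


grid = [a + (b - a) * k / 100 for k in range(101)]
hypothesis_ok = all(hypothesis_gap(t) >= -1e-12 for t in grid)
conclusion_ok = all(f(t) <= A * math.exp(B * (t - a)) + 1e-12 for t in grid)
```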
= Bf (t) Bg(t).
Remark 5.1.17. More general integral and differential inequalities can be investigated in a similar way. Let us mention x(t)
x int B,
n
Mi
and
i=1
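Then it follows from $M_i \subset M$ and from (iv) that $\alpha(M_i) \le \alpha(M)$, so $a \le \alpha(M)$. To prove the equality, choose $\varepsilon > 0$ and a covering $\{M_1^i, M_2^i, \dots, M_{m_i}^i\}$ of $M_i$ with
$$\operatorname{diam} M_j^i \le \alpha(M_i) + \varepsilon \le a + \varepsilon.$$
All of these $M_j^i$ form a covering of $M$, so that
$$\alpha(M) \le a + \varepsilon, \quad \text{i.e.,} \quad \alpha(M) \le a.$$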
= (1 , . . . , N ) R
N
.
i = 1, i 0 for all i
i=1
and
A()
N
i Mi
for all
i=1
7M
+ N {z = x + y : x M, y N }.
reader is invited to prove this equality.
8 The
= (1 , . . . , N ) .
N
i (Mi ) (M) + .
(5.1.13)
i=1
x=
N
i x i ,
y=
N
i=1
t [0, 1]
i yi ,
z = tx + (1 t)y
and
i=1
where , and xi , yi Mi for all i. The point z can be represented in the form
N
t i , for > 0,
i
i
i zi where i = ti + (1 t)i , zi = i xi + (1 i )yi , i =
z=
0,
for = 0.
i=1
which
N
(j)
(j) x
x
k=
i max i i  max xi 
i
i=1,...,N
i=1,...,N
k
i=1
A()
A()
m
(
'
A (j) + B(o; ).
j=1
m
'
(j)
+ B(o; )
j=1
m
((
' '
+ 2
A (j)
j=1
(M) + 3,
i.e., since > 0 is arbitrary,
(Co M) (M).
Example 5.1.24. Let B(o; 1) X be the open unit ball in a Banach space X. If dim X <
, then
(B(o; 1)) = (B(o; 1)) = (B(o; 1)) = 0
(see Proposition 1.2.3). On the other hand, if dim X = , then
(B(o; 1)) = (B(o; 1)) = (B(o; 1)) = 2.
(5.1.14)
The proof of this fact is not trivial. Since the diameter of B(o; 1) is equal to 2, we know
that (B(o; 1)) 2. In order to prove (5.1.14) we show that (B(o; 1)) 2. Assume
n
Mi
i=1
and the diameter of every Mi is strictly less than 2. We may take all Mi s to be closed.
Let Xn X be a subspace of X such that dim Xn = n. Then we have
B(o; 1) Xn =
n
(Mi Xn ).
i=1
In the next definition we will consider a special class of continuous and bounded operators.
Definition 5.1.25. Let $T\colon \mathcal{M} \subset X \to X$ be a bounded operator$^9$ from a Banach space $X$ into itself. The operator $T$ is called a $k$-set contraction if there is a number $k \ge 0$ such that
$$\alpha(T(M)) \le k\alpha(M)$$
for all bounded sets $M$ in $\mathcal{M}$.
The bounded operator $T$ is called condensing if
$$\alpha(T(M)) < \alpha(M)$$
for all bounded sets $M$ in $\mathcal{M}$ with $\alpha(M) > 0$.
Obviously, every $k$-set contraction for $0 \le k < 1$ is condensing. Every compact map $T$ is a $k$-set contraction with $k = 0$. A typical example of a $k$-set contraction with $0 \le k < 1$ is the following one.
Example 5.1.26. Let $K, C\colon D \subset X \to X$ be operators on a Banach space $X$. Let $K$ be $k$-contractive, i.e., there exists $k \in [0, 1)$ such that
$$\|K(x) - K(y)\| \le k\|x - y\| \quad \text{for all } x, y \in D, \tag{5.1.15}$$
Since m A and T (A) A, it follows that C A. This implies T (C) T (A). Obviously
T (A) C, i.e., T (C) C which means that C . So, A C. We have proved that
A = C. Now, (vii), (viii) and (ix) of Proposition 5.1.23 imply that
(A) = (C) = (T (A)).
(5.1.16)
Since T is condensing,
(A) = 0.
Since A is also closed, A is a compact set. The restriction of T to A is thus a compact operator. Consequently, the Schauder Fixed Point Theorem can be applied to the mapping
T : A A.
Corollary 5.1.28. Let $K, C\colon M \subset X \to X$ be operators in a Banach space $X$ such that $(K + C)(M) \subset M$, let $M$ be a nonempty, closed, bounded and convex set in $X$, let $K$ be $k$-contractive ($0 \le k < 1$) and $C$ compact. Then $K + C$ has a fixed point in $M$.
Proof. The proof follows immediately from Example 5.1.26 and Theorem 5.1.27.
The following assertion generalizes the existence part of Theorem 3.1.4 and follows from the previous Corollary 5.1.28, cf. the statement with the example on page 110. Let us consider the initial value problem
$$\dot{x} = f(t, x) + g(t, x), \qquad x(t_0) = x_0 \tag{5.1.17}$$
in a Banach space $Y$. For fixed positive numbers $a$ and $b$ define
$$R := [t_0 - a, t_0 + a] \times \{x \in Y : \|x - x_0\| \le b\}.$$
Proposition 5.1.29. Let us assume that
(i) the map $f\colon R \to Y$ is continuous and also Lipschitz continuous with respect to the second variable, i.e., there exists $L > 0$ such that
$$\|f(t, x) - f(t, y)\| \le L\|x - y\|$$
for all
that
= because M , and A =
because m A.
(iii) the sum $f + g$ is bounded, i.e., there exists $B > 0$ such that
$$\|f(t, y) + g(t, y)\| \le B \quad \text{for all } (t, y) \in R;$$
$$cL < 1, \qquad Bc \le b.$$
Then the problem (5.1.17) has a solution $x = x(t)$ defined on $(t_0 - c, t_0 + c)$.
Proof. It follows from Lemma 3.1.5 that the problem (5.1.17) is equivalent to the integral equation
$$x(t) = x_0 + \int_{t_0}^{t} \left[f(s, x(s)) + g(s, x(s))\right] ds. \tag{5.1.18}$$
Let
$$X = C([t_0 - c, t_0 + c], Y) \quad \text{and} \quad M = \{x \in X : \|x - x_0\|_X \le b\}.^{11}$$
x M,
K(x)(t) = x0 +
(5.1.19)
C(x)(t) =
t0
= A(t)x(t) +
0
for
(ti , s, xi ) U, i = 1, 2.
= A(t)x(t).
We put
F (s, x)
f (s, , x()) d,
s [0, T ],
x C([0, T ], RN )
and
G1 (x)(t) =
G2 (x)(t) =
It is not dicult to show that there are r > 0, > 0 small enough such that H maps
the set
.
Q(r, )
t (0, 1),
x(0) = x(1) = 0,
(5.1.21)
Figure 5.2.1.
Figure 5.2.2.
Figure 5.2.3.
the sign of the matrix representation of $G'(a)$ is the sign of its Jacobian $J_G(a)$. This idea leads to the following preparatory definition.
Definition 5.2.1. Let $\Omega$ be an open and bounded subset of $\mathbb{R}^N$ and let $F \in C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$. Assume that $y_0 \in \mathbb{R}^N \setminus F(\partial\Omega)$ and $y_0$ is a regular value of $F$.$^{12}$ Then we define the Brouwer degree of $F$ as$^{13}$
$$\deg(F, \Omega, y_0) = \sum_{x \in F_\Omega^{-1}(y_0)} \operatorname{sgn} J_F(x). \tag{5.2.1}$$
We point out that the sum in (5.2.1) is finite. Indeed, otherwise the set
$$F_\Omega^{-1}(y_0) := \{x \in \Omega : F(x) = y_0\}$$
has an accumulation point $\tilde{x} \in \overline{\Omega}$. By the continuity of $F$, $F(\tilde{x}) = y_0$, and since $y_0 \notin F(\partial\Omega)$, $\tilde{x} \in \Omega$. By the Local Inverse Function Theorem (Theorem 4.1.1), $F$ is injective in a neighborhood $U$ of $\tilde{x}$. But $U$ contains points of $F_\Omega^{-1}(y_0)$ different from $\tilde{x}$, a contradiction. Notice that in this argument we have used all assumptions of Definition 5.2.1.
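Formula (5.2.1) can be evaluated by hand in simple cases. The following sketch is an illustration with our own choice of maps: on the open unit disk $\Omega \subset \mathbb{R}^2 \cong \mathbb{C}$ it treats $F(z) = z^2$, i.e., $(x, y) \mapsto (x^2 - y^2, 2xy)$ with Jacobian $4(x^2 + y^2)$, and $F(z) = \bar{z}^2$, i.e., $(x, y) \mapsto (x^2 - y^2, -2xy)$ with Jacobian $-4(x^2 + y^2)$. Any $y_0$ with $0 < |y_0| < 1$ is a regular value which does not lie in $F(\partial\Omega)$.

```python
import cmath


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def square_roots(y0: complex) -> list:
    """Both solutions of w**2 = y0 for y0 != 0."""
    r = cmath.sqrt(y0)
    return [r, -r]


def degree_square(y0: complex) -> int:
    """deg(F, Omega, y0) for F(z) = z^2 on the open unit disk."""
    # Jacobian of (x, y) -> (x^2 - y^2, 2xy) equals 4(x^2 + y^2) = |2z|^2.
    return sum(sign(abs(2 * z) ** 2) for z in square_roots(y0) if abs(z) < 1)


def degree_conj_square(y0: complex) -> int:
    """deg(F, Omega, y0) for F(z) = conj(z)^2 on the open unit disk."""
    # conj(z)^2 = y0 holds exactly for z = conj(w) with w^2 = y0;
    # the Jacobian of (x, y) -> (x^2 - y^2, -2xy) equals -4(x^2 + y^2).
    preimages = [w.conjugate() for w in square_roots(y0)]
    return sum(sign(-abs(2 * z) ** 2) for z in preimages if abs(z) < 1)


y0 = 0.25 + 0.1j
deg_plus = degree_square(y0)
deg_minus = degree_conj_square(y0)
```

Both maps have two preimages of $y_0$ in $\Omega$; the orientation-preserving map contributes $+1$ twice, the orientation-reversing one $-1$ twice.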
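Proposition 5.2.2. Let $\Omega$ be an open bounded subset of $\mathbb{R}^N$. The degree defined in Definition 5.2.1 has the following properties ($I$ is the identity map):
(i) $\deg(I, \Omega, y_0) = 1$ if $y_0 \in \Omega$, and $\deg(I, \Omega, y_0) = 0$ if $y_0 \notin \overline{\Omega}$.
Suppose that $F \in C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$ and $y_0 \in \mathbb{R}^N \setminus F(\partial\Omega)$ is a regular value of $F$. Then
(ii) $\deg(F, \Omega, y_0) \in \mathbb{Z}$;
(iii) $\deg(F, \Omega, y_0) = \deg(F - y_0, \Omega, o)$;
(iv) if $\deg(F, \Omega, y_0) \ne 0$, then the equation
$$F(x) = y_0$$
has a solution in $\Omega$;
(v) if $\Omega_1$ is an open subset of $\Omega$ and $y_0 \notin F(\overline{\Omega} \setminus \Omega_1)$, then
$$\deg(F, \Omega, y_0) = \deg(F, \Omega_1, y_0).$$
More generally, if $\Omega_1, \dots, \Omega_k$ are pairwise disjoint open subsets of $\Omega$ and $y_0 \notin F\bigl(\overline{\Omega} \setminus \bigcup_{j=1}^{k} \Omega_j\bigr)$, then
$$\deg(F, \Omega, y_0) = \sum_{j=1}^{k} \deg(F, \Omega_j, y_0). \tag{5.2.2}$$
12 The definition of a regular value is given in Definition 4.3.6.
13 Here $\sum_{\emptyset} = 0$ as usual.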
holds.
(5.2.3)
is valid.
(5.2.4)
Uj > 0.
d inf F (x) y0 : x \
j=1
k
Uj , and if y
j=1
k
%
Vj
j=1
x , t (, 1 + ) for > 0.
(5.2.5)
and
xUj
15 Notice
for every x \
j=1
k
deg (H(t, ), Uj , y0 ).
j=1
We wish to prove that this degree is constant on the interval $[0, 1]$. We will study the set
$$M := \{(t, x) \in [0, 1] \times U_j : H(t, x) = y_0\}$$
with help of the Implicit Function Theorem at the point $(0, x_j)$. This is possible
since
H2 (t, x) F (x)L(RN ) = tF (x) G (x)L(RN ) t <
(t, x) (, 1 + ) Uj ,
1
,
c2
This estimate implies that [H2 (t, x)]1 exists (Exercise 2.1.33). The Implicit Function Theorem implies that M has the form
{(t, (t)) : t [0, )}
in a certain neighborhood of (0, xj ) (F (xj ) = y0 ) where C 1 ([0, ), RN ) and
1
(t)
H1 (t, (t))L(RN )
L(RN ) = [H2 (t, (t))]
c2
,
1 c2
One of our main goals is to show that the degree is homotopically invariant,
i.e., if for H given by (5.2.5) we have
y0 = H(t, x)
x ,
(5.2.6)
homotopy H(t, x) for which the degree is well dened is called an admissible homotopy.
1, then lim (t) = x
exists and x
Uj (see Proposition 1.2.4).
t
provided at least one side in (5.2.6) is defined. The problem in proving this property can be seen from Figures 5.2.2 and 5.2.3. Namely, if the dashed curve $G$ is moving up, then it is equal to $F$ in one instance, $o$ is not a regular value for $F$, and so $\deg(F, \Omega, o)$ is not yet defined. To overcome this obstacle we approximate a critical value by a regular one. Such approximation is based on the so-called Sard Theorem. Its special case stated below will be sufficient for our purposes.
Theorem 5.2.3 (Sard). Let $\Omega$ be an open subset of $\mathbb{R}^N$ and assume that $F \in C^1(\Omega, \mathbb{R}^N)$. Then the Lebesgue measure of the set of critical values of $F$ is zero.
Proof. Since $\mathbb{R}^N$ can be covered by a sequence of bounded open sets and a countable union of sets of measure zero has also measure zero, we can suppose that $\Omega$ is bounded. Choose now an open subset $G$ such that $\overline{G} \subset \Omega$. Let $S$ be the set of critical points of $F$ in $G$. By the same argument as above, it is sufficient to show that
$$\operatorname{meas}_N F(S) = 0$$
where $\operatorname{meas}_N$ is the Lebesgue measure in $\mathbb{R}^N$. Since $\overline{G}$ is compact,
$$d := \operatorname{dist}(\overline{G}, \mathbb{R}^N \setminus \Omega) > 0$$
and $\overline{G}$ can be covered by a finite number of closed cubes $C_1, \dots, C_k$ with sides parallel to the coordinate hyperplanes and edges of length $a$. If $a < \frac{d}{\sqrt{N}}$, then
k
Ci and
i=1
c
k
i=1
Ci
i = 1, . . . , k.
Choose one of these cubes and denote it by C. By the Mean Value Theorem
(Theorem 3.2.7),
F (y) F (x) cx y and F (y) Lx y (x y)x y, x, y C
where lim (r) = 0 (uniform continuity of F on the compact set C) and
r0+
cube
r =
a C which contains a critical point x. Denote the diameter of C by r (
(c
measN 1 Lx (C)
r )N 1
we have
and
cN 1 (
measN F (C)
r)
rN .
The number of such small cubes which contain critical points is mN at most.
Therefore,
r )
rN mN c1 (
r)
measN F (S C) cN 1 (
with a constant c1 independent of m. Since (
r ) 0 for m ,
measN F (S C) = 0.
Corollary 5.2.4. Under the hypotheses of Theorem 5.2.3 the set of regular values of $F\colon \Omega \to \mathbb{R}^N$ is dense in $\mathbb{R}^N$.
Proof. The complement of regular values, i.e., $F(\Omega \cap S)$, cannot have an interior point and zero Lebesgue measure simultaneously.
Remark 5.2.5. A more general Sard Theorem concerns $F\colon \Omega \subset \mathbb{R}^M \to \mathbb{R}^N$. It is surprising that the assertion $\operatorname{meas}_N F(S \cap C) = 0$ needs more smoothness of $F$, namely
$$F \in C^r(\Omega, \mathbb{R}^N) \quad \text{where} \quad r > \max\{0, M - N\}.$$
The proof is more involved (see, e.g., Hirsch [67, Chapter 3, Theorem 1.3] for the $C^{\infty}$ case and comments given there, or Sard [117], or Sternberg [124, Theorem II.3.1]). If $r \le \max\{0, M - N\}$, then there exists $F \in C^r(\Omega, \mathbb{R}^N)$ such that
$$\operatorname{int} F(\Omega \cap S) \ne \emptyset,$$
see Whitney [132]. The statement on the Lebesgue measure can be strengthened by considering the finer Hausdorff measure or dimension. The following result also holds:
If $F\colon \Omega \subset \mathbb{R}^M \to \mathbb{R}$ is analytic, then $F(\Omega \cap S)$ is even countable.
For more detail see, e.g., Fučík et al. [56, Chapter IV and Appendix IV]. There is also a generalization for mappings $F\colon \Omega \subset X \to Y$, $X$, $Y$ Banach spaces:
If $F'(x)$ is a Fredholm operator for all $x \in \Omega$ and $F$ is sufficiently smooth, then $F(\Omega \cap S)$ is nowhere dense in $Y$.
Sharper results can be proved for functionals (i.e., $Y = \mathbb{R}$) (see the book Fučík et al. [56] cited above).
Now we return to the degree deg (F, , y0 ) where y0 RN \ F () and it is
yn F ().
In particular, $\deg(F, \Omega, y_n)$ is well defined by Definition 5.2.1. Part (vi) of Proposition 5.2.2 allows us to presume that the sequence $\{\deg(F, \Omega, y_n)\}_{n=1}^{\infty}$ is eventually constant and does not depend on the choice of the sequence $\{y_n\}_{n=1}^{\infty}$ of regular values. To see this we need to extend Proposition 5.2.2(vi) to guarantee that $\deg(F, \Omega, y)$ is constant on any open connected set
$$G \subset \mathbb{R}^N \setminus F(\partial\Omega \cup S).$$
This can be done due to the fact that any two different points in an open connected subset of $\mathbb{R}^N$ can be connected by a smooth curve in this subset (see Proposition 1.2.7). We leave details to the reader. He or she should also be convinced that all statements of Proposition 5.2.2 are still valid for this more general definition of the degree.
For the definition of the degree $\deg(F, \Omega, y_0)$ it is not necessary to assume that $F \in C^1(\Omega, \mathbb{R}^N)$ since any $F \in C(\overline{\Omega}, \mathbb{R}^N)$ can be approximated by smooth mappings. This is a consequence of the Stone-Weierstrass Theorem (see Theorem 1.2.14).$^{18}$
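To show that $\deg(G, \Omega, y_0)$ is the same for all
$$G \in C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$$
which are close to $F$ in the $C(\overline{\Omega}, \mathbb{R}^N)$-norm we need the following extension of Proposition 5.2.2(vii).
Proposition 5.2.6. Let $\Omega$ be a bounded open subset of $\mathbb{R}^N$ and let $F$, $G$ be mappings from $C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$. Put
$$H(t, x) = (1 - t)F(x) + tG(x), \qquad t \in [0, 1], \quad x \in \overline{\Omega}.$$
Assume that
$$y_0 \in \mathbb{R}^N \setminus \{H(t, x) : t \in [0, 1],\ x \in \partial\Omega\}.$$
Then
$$\deg(F, \Omega, y_0) = \deg(G, \Omega, y_0). \tag{5.2.7}$$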
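In the plane the statement can be observed numerically. The sketch below rests on a standard fact which is not proved in this text and is used here as an assumption: for the open unit disk $\Omega \subset \mathbb{R}^2 \cong \mathbb{C}$ and $y_0 = o$, the degree equals the winding number of the curve $\theta \mapsto F(e^{i\theta})$ around the origin. The maps $F(z) = z^2$ and $G(z) = z^2 + 0.3\bar{z}$ are hypothetical examples; on $|z| = 1$ we have $|H(t, z)| \ge 1 - 0.3 > 0$, so the homotopy is admissible.

```python
import cmath
import math


def winding_number(curve, samples: int = 4000) -> int:
    """Winding number around 0 of the closed curve theta -> curve(theta)."""
    total = 0.0
    prev = curve(0.0)
    for k in range(1, samples + 1):
        cur = curve(2.0 * math.pi * k / samples)
        total += cmath.phase(cur / prev)  # small increment of the argument
        prev = cur
    return round(total / (2.0 * math.pi))


def F(z: complex) -> complex:
    return z * z


def G(z: complex) -> complex:
    return z * z + 0.3 * z.conjugate()


def H(t: float, z: complex) -> complex:
    """Linear homotopy between F and G."""
    return (1.0 - t) * F(z) + t * G(z)


degrees = [
    winding_number(lambda theta, t=t: H(t, cmath.exp(1j * theta)))
    for t in (0.0, 0.25, 0.5, 0.75, 1.0)
]
```

The computed value is the same for every sampled $t$, as (5.2.7) predicts.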
Proof. As has been stated above, Proposition 5.2.2(vii) holds for an arbitrary $y_0 \in \mathbb{R}^N \setminus F(\partial\Omega)$, in particular for $H(t, \cdot)$ and $t$ small. Put
$$t_0 = \sup\{t \in [0, 1] : \deg(H(t, \cdot), \Omega, y_0) = \deg(F, \Omega, y_0)\}.$$
By the same statement ($\deg(H(t, \cdot), \Omega, y_0)$ is defined for all $t \in [0, 1]$),
$$\deg(H(t_0, \cdot), \Omega, y_0) = \deg(H(t, \cdot), \Omega, y_0)$$
18 For
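Example 5.2.10. Let $B := B(o; 1)$ be the closed unit ball in $\mathbb{R}^N$ and let $A$ be a linear injective operator from $\mathbb{R}^N$ into $\mathbb{R}^N$.$^{20}$ Then
$$\deg(A, \operatorname{int} B, o) = (-1)^p \quad \text{where} \quad p = \sum_{\substack{\lambda \in \sigma(A) \\ \lambda < 0}} m(\lambda)$$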
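For a linear injective $A$ the only solution of $Ax = o$ is $x = o$, so Definition 5.2.1 gives $\deg(A, \operatorname{int} B, o) = \operatorname{sgn} \det A$. The sketch below compares this sign with $(-1)^p$ for a hypothetical upper triangular $3 \times 3$ matrix, whose eigenvalues are its diagonal entries; it is a sanity check on one example, not a proof.

```python
def det3(m: list) -> float:
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Upper triangular example: the eigenvalues are -2, 3 and -0.5.
A = [[-2.0, 1.0, 4.0],
     [0.0, 3.0, -1.0],
     [0.0, 0.0, -0.5]]

# p counts the negative eigenvalues (each one is simple here).
p = sum(1 for i in range(3) if A[i][i] < 0)

degree_from_jacobian = 1 if det3(A) > 0 else -1
degree_from_spectrum = (-1) ** p
```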
ball $B(x_0; r)$ (such that the equation $F(x) = o$ has a unique solution in this closed ball, namely $x_0$) and $F'(x_0)$ is injective. Under these hypotheses we have
$$\deg(F, B(x_0; r), o) = (-1)^p \quad \text{where} \quad p = \sum_{\substack{\lambda \in \sigma(F'(x_0)) \\ \lambda < 0}} m(\lambda). \tag{5.2.8}$$
The value $\deg(F, B(x_0; r), o)$ is also called the index of an isolated solution $x_0$ of the equation $F(x) = o$.
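Example 5.2.11. Let $\Omega$ be a bounded open subset of $\mathbb{R}^N$ and let $Y$ be a linear subspace of $\mathbb{R}^N$. Suppose that $f \in C(\overline{\Omega}, Y)$ is such that $o \notin f(\partial\Omega)$. Denote by $\pi$ a projection of $\mathbb{R}^N$ onto $Y$, by $\tilde{f}$ the restriction of $f$ onto $\overline{\Omega} \cap Y$ and $g(x) := f(x) + (I - \pi)x$. Then
$$\deg(g, \Omega, o) = \deg(\tilde{f}, \Omega \cap Y, o). \tag{5.2.9}$$
and
$$g^{-1}(o) = \tilde{f}^{-1}(o).$$
This means that both sides of (5.2.9) are defined. By the construction of the degree, it suffices to prove the equality under the following additional assumptions: $f \in C^1(\Omega)$ and $o$ is a regular value. Since
$$g'(y)(h + k) = f'(y)h + k \quad \text{for} \quad y \in \Omega \cap Y, \quad h \in Y, \quad k \in \operatorname{Ker} \pi,$$
we get
$$\det g'(y) = \det \tilde{f}'(y).^{22}$$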
g
Our next aim is to generalize the notion of the degree to infinite dimensional spaces. Since the Brouwer Theorem is a corollary of the homotopy invariance property of the degree and this theorem does not hold even in an infinite dimensional Hilbert space (Example 5.1.7) we cannot expect a meaningful generalization of the Brouwer degree which would be valid for all continuous mappings. Similarly as in the Schauder Fixed Point Theorem we restrict our attention to operators which are well approximated by finite dimensional ones, i.e., to compact operators.
One more remark is desirable. One of the main consequences of the notion of $\deg(F, \Omega, y_0)$ is a sufficient condition for the solvability of the equation $F(x) = y_0$ in the set $\Omega$. If $F$ is a compact operator, then $F(\Omega)$ is rather small in an infinite dimensional space. Therefore it is much better to solve either
$$x - F(x) - y_0 = o$$
22 The
for a suitable A.
n = X n ,
and
Gn (x) = x Fn (x)
for x n .
{deg (Gn , n , o)}n=1 is constant for large n, and its limit does not depend on
the choice of the approximating sequence {Gn }
n=1 .
Proof. For a given $\varepsilon > 0$ there is $n_0$ such that
$$\sup_{x \in \overline{\Omega}} \|F(x) - F_n(x)\| < \varepsilon \quad \text{for} \quad n \ge n_0.$$
k (x) = x Fk (x),
G
x X,
k = n, m.
t [0, 1], x X.
= ( X)
we have
For x X
$$\|H(t, x)\| = \|x - F(x) - t[F_m(x) - F(x)] - (1 - t)[F_n(x) - F(x)]\|$$
$$\ge \|x - F(x)\| - t\|F_m(x) - F(x)\| - (1 - t)\|F_n(x) - F(x)\| \ge d - 2\varepsilon > 0$$
for small $\varepsilon > 0$. By Theorem 5.2.7,
m , X,
o) = deg (G
n , X,
o).
deg (G
k = m, n,
as the limit of $\deg(G_n, \Omega_n, o)$ for any approximating sequence $\{G_n, \Omega_n\}_{n=1}^{\infty}$. This construction also shows that the Leray-Schauder degree inherits its properties from the Brouwer degree.
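Theorem 5.2.13. Let $\Omega$ be a bounded open subset of a Banach space $X$. There exists a mapping $\deg(I - F, \Omega, y_0)$ defined for all $F \in \mathcal{C}(\overline{\Omega}, X)$ and $y_0 \in X$ such that
$$x - F(x) \ne y_0 \quad \text{for all } x \in \partial\Omega.$$
This mapping has the following properties:
(i) $\deg(I, \Omega, y_0) = 1$ if $y_0 \in \Omega$, and $\deg(I, \Omega, y_0) = 0$ if $y_0 \notin \overline{\Omega}$.
(ii) $\deg(I - F, \Omega, y_0) = \deg(I - F - y_0, \Omega, o)$.
(iii) If $\deg(I - F, \Omega, y_0) \ne 0$, then the equation
$$x - F(x) = y_0$$
has a solution in $\Omega$.
(iv) If $\Omega_1, \dots, \Omega_k$ are pairwise disjoint open subsets of $\Omega$ and $x - F(x) \ne y_0$ for each $x \in \overline{\Omega} \setminus \bigcup_{j=1}^{k} \Omega_j$, then
$$\deg(I - F, \Omega, y_0) = \sum_{j=1}^{k} \deg(I - F, \Omega_j, y_0).$$
(v) If $F, G \in \mathcal{C}(\overline{\Omega}, X)$ and
$$\sup_{x \in \overline{\Omega}} \|F(x) - G(x)\|_X < \inf_{x \in \partial\Omega} \|x - F(x) - y_0\|_X,$$
then
$$\deg(I - F, \Omega, y_0) = \deg(I - G, \Omega, y_0).$$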
t [0, 1],
x ,
for every
and
t [0, 1],
$$F(z) = z.$$
But $z$ cannot belong to $\partial\Omega$, i.e., $z \in \Omega$.
Example 5.2.14 (Rothe's version of the Schauder Fixed Point Theorem). Assume that $F$ is a compact operator from the closed unit ball $B(o; 1)$ of a Banach space $X$ into $X$. If
$$F(\partial B(o; 1)) \subset B(o; 1),$$
then $F$ has a fixed point in $B(o; 1)$.
Indeed, suppose not and consider $H(t, x) := x - tF(x)$. By the homotopy invariance property,
$$\deg(I - F, B(o; 1), o) = \deg(I, B(o; 1), o) = 1,$$
a contradiction.
The next example shows that finding an a priori estimate need not be a trivial task.
Example 5.2.16. Consider the boundary value problem
$$\ddot{x}(t) = f(t, x(t), \dot{x}(t)), \quad t \in (0, 1), \qquad x(0) = x(1) = 0, \tag{5.2.10}$$
ds = x(t)
F (x)(t)
0
=
t0
 t
x(s)
ds yC[0,1],
and to get the estimate x(t)
+ sup 
x(t) 3yC[0,1].
t[0,1]
t[0,1]
t[0,1]
x M0 ,
y R.24
Suppose that (H1) and (H2) hold. Let there exist x X, x = o, and (0, 1]
such that
x = F (x).
24 The condition (H1) is sometimes called the sign condition and (H2) the Nagumo-type condition.
+ c2 ].
g(t) log[c1 (x(t))
Since
g(t)
= 2c1
x(t)
x(t)
x(t)f
= 2c1
,
2 +c
2+c
c1 (x(t))
c1 (x(t))
2
2
we obtain, by (H2),
g(t)
2c1 x(t).
Let
G = {t [0, 1] : x(t)
= 0}.
Then
G=
Jj
dsc1 0 + c2
j
j

 t
2c1 x(s)
c1 x 2 (t) + c2
4c1 M0 .
c2
x(t)
M1
c2 2c1 M0
e
,
c1
t [0, 1].
It is clear that the above stated procedure can be used for solving a more
general equation
$$Ax = F(x) \tag{5.2.11}$$
where $F \in \mathcal{C}(\overline{\Omega}, X)$
and $A$ is a linear operator with a bounded inverse. In that case (5.2.11) is equivalent
to
$$x = A^{-1}F(x)$$
with a compact operator $A^{-1}F$. More interesting questions arise for a noninvertible $A$. Since many differential operators (both ordinary and partial) are Fredholm
operators we will suppose that $A$ is a linear closed Fredholm operator25 of index
zero, and proceed as in Remark 4.3.14 with the exception that $A$ is not assumed
to be continuous.
We denote
$$Y_2 := \operatorname{Im} A, \qquad X_1 := \operatorname{Ker} A,$$
and choose topological complements $X_2$, $Y_1$ such that
$$X = X_1 \oplus X_2, \qquad Y = Y_1 \oplus Y_2.$$
These closed complements exist because $X_1$ has a finite dimension and $Y_2$ a finite
codimension (Example 2.1.12 and Remark 2.1.19). By the assumption on the
index of $A$ there is also a homeomorphism $\Lambda$ of $Y_1$ onto $X_1$. Denote by $P$ and
$Q$ the linear continuous projections onto $X_1$ and $Y_1$ with kernels $X_2$ and $Y_2$,
respectively. Then the restriction of $A$ to $X_2 \cap \operatorname{Dom} A$ is an injective operator with
a bounded inverse $B$.26
The equation (5.2.11) is equivalent to the pair of equations
$$A x_2 = (I - Q)F(x_1 + x_2), \qquad o = QF(x_1 + x_2), \qquad x_1 \in X_1, \ x_2 \in X_2 \cap \operatorname{Dom} A.$$
Here the second equation can equivalently be written as27
$$x_1 = x_1 + \Lambda QF(x_1 + x_2). \tag{5.2.12}$$
where
25 A linear closed operator $A$ is said to be Fredholm if $\dim \operatorname{Ker} A < \infty$, $\operatorname{Im} A$ is closed and
$\operatorname{codim} \operatorname{Im} A < \infty$. The index of such an operator is defined as $\operatorname{ind} A := \dim \operatorname{Ker} A - \operatorname{codim} \operatorname{Im} A$.
See the definition on page 70.
26 Indeed, $B$ is a closed operator (as an inverse to a closed operator) defined on the Banach space
$Y_2$. The continuity of $B$ follows now from the Closed Graph Theorem (Corollary 2.1.10). The
operator $B(I - Q)$ is called a generalized (or a right) inverse to $A$. It is characterized by the
following two properties:
(i) $AB(I - Q) = I - Q$ (the reason for calling it the right inverse);
(ii) $B(I - Q)Ax = (I - P)x$ for $x \in \operatorname{Dom} A$.
27 Since $F$ is nonlinear, the homeomorphism $\Lambda$ of $Y_1$ onto $X_1$ need not be taken as linear. It is actually only essential that $\Lambda$ maps
$Y_1$ into $X_1$ and $\Lambda^{-1}(o) = \{o\}$.
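[Figure 5.2.4. The decompositions $X = X_1 \oplus X_2$ with $X_1 = \operatorname{Ker} A$ and $Y = Y_1 \oplus Y_2$ with $Y_2 = \operatorname{Im} A$, together with the projections $P$, $Q$ and the inverse $B$.]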
(5.2.13)
(5.2.14)
= 0.
that injectivity of A is actually not needed hence the assumption (i) is not assumed for
Ax = x
x X C[0, 1].
onto Ker A
Qy =
y(s) ds
y(s) ds
x+
x R.29
f (t) dt < ,
f+ (t) dt > .
0
t [0, 1],
x < r,
x > r.
Conditions similar to (H) are called conditions of the Landesman-Lazer type. See, e.g., Landesman & Lazer [83], Fučík [53], Mawhin [91] or Drábek [39], and also Section 7.5.
f (t, x) dt <
f (t, x) dt >
0
for
x < r,
for
x > r.
It means that for any solution $x$ of (5.2.13), (5.2.14) there exists $t_0 \in [0, 1]$ such
that $|x(t_0)| \le r$. Since
x(t) = x(t0 ) +
we get
xC[0,1] M r + max {f C[0,1] , f+ C[0,1] }.
If we take $\Omega$ to be a ball $B(o; R)$ of radius $R > M$, then the condition (i) from
Theorem 5.2.17 is satisfied for $\lambda = 1$. The same is also true for the solution of
(0, 1).
Ax = F (x),
QF (x) =
f (t, x) dt
0
f (t, R) dt.
(5.2.16)
From the construction of the Brouwer degree it follows that we can assume that
(x)
f (t, x) dt
0
f+ (t) dt
0
is also a necessary condition for the solvability of (5.2.13), (5.2.14) under the assumption (H) (by (5.2.15)).
Remark 5.2.19. It is not necessary to consider only projections $P$, $Q$ onto the small
$\operatorname{Ker} A$ and a complement of $\operatorname{Im} A$. For example, suppose that there are projections
$\{Q_n\}$ converging to the identity in a certain sense, and
$$Q_n Q = Q$$
(e.g., $Q_n$ can be the partial sums of the Fourier series of the elements of $Y = C[0, 1]$
for the periodic problem). If we can take projections $P_n$ so that
$$A(\operatorname{Im}(I - P_n)) = \operatorname{Im}(I - Q_n),$$
then there is a chance to solve the first equation in (5.2.12) for a fixed $x_1$ by
the Contraction Principle even if $F$ is only locally Lipschitz. This idea belongs to
L. Cesari (see, e.g., his survey in Cesari [22]). Using this approach he proved the
existence of a $2\pi$-periodic solution of the equation
$$\ddot{x} + x^3 = \sin t.$$
Notice a significant difference in the sign of the nonlinear term here and in (H1)
in Example 5.2.16, and the fact that the growth of the nonlinear term is faster
here than in (H2).
At the end of this section we turn our attention to the bifurcations of solutions. As in Section 4.3 we consider the equation
$$f(\lambda, x) = o$$
where $f : \mathbb{R} \times X \to X$ is continuous on $J \times U$, $J$ is an open interval and $U$ is a
neighborhood of $o$ in a Banach space $X$. We suppose that
$$f(\lambda, o) = o \quad \text{for all } \lambda \in J$$
in
for all
J \ {0 }.
Put
$$i(\lambda) = \deg(f(\lambda, \cdot), U, o).$$
If
$$\lim_{\lambda \to \lambda_0-} i(\lambda) \ne \lim_{\lambda \to \lambda_0+} i(\lambda), \tag{5.2.17}$$
in V.
Proof. If $F'(x_0)$ is not compact, then one can find $\varepsilon_0 > 0$ and a sequence $\{y_n\}_{n=1}^{\infty} \subset X$ such that $\|y_n\| \le 1$ and
$$\|F'(x_0)y_k - F'(x_0)y_l\| \ge \varepsilon_0 \quad \text{for } k \ne l.$$
0
h
4
and
x0 + yk
for all k N.
( 1).
Then
F (x0 + yk ) F (x0 + yl )
F (x0 )( yk yl ) F (x0 + yk ) F (x0 ) F (x0 ) yk )
F (x0 + yl ) F (x0 ) F (x0 ) yl
But this means that F is not compact on , a contradiction.
0
.
2
t (0, 1],
y B(o; ),
y B(o; ).
F
(x
)
F
(x
)(ty)]
+
F
(x
)y
y
0
0
0
0
t
1
c
F (x0 )y y
t [F (x0 + ty) F (x0 ) F (x0 )(ty) cy 2 y
provided y is small enough (by the denition of F (x0 ) and the assumption on
I F (x0 )).
30 For
By the homotopy invariance property (in a more general setting see Exercise 5.2.29), deg (I H(t, ), B(o; ), o) is constant on the interval [0, 1]. In particular,
deg (I F, , o) = deg (I F (x0 ), B(o; ), o).
Put
X1
(F (x0 )) p=1
>1
As we have mentioned above, dim X1 = < . Moreover, there exists a topological complement X2 to X1 in X which is F (x0 )invariant (see the decomposition
(2.2.4)). This decomposition of X allows us to use the Product Formula for the
degree (Exercise 5.2.28) provided balls Bi Xi , i = 1, 2, are chosen such that
B1 B2 B(o; ). Hence we obtain
deg (I F (x0 ), B(o; ), o) = deg (F1 , B1 , o) deg (F2 , B2 , o)
where Fi denotes the restriction of I F (x0 ) to Xi , i = 1, 2. To compute
deg (F2 , B2 , o) we introduce the homotopy
H2 (t, y) = y tF (x0 )y,
t [0, 1],
y X2 .
Theorem 5.2.23 (Krasnoselski Local Bifurcation Theorem). Let $U$ be a neighborhood of $o$ in a Banach space $X$, and let
$$f(\lambda, x) = x - \lambda Ax - G(\lambda, x), \qquad \lambda \in J, \quad x \in U,$$
for all
J.
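Proof. Suppose that $(\lambda_0, o)$ is not a bifurcation point. Then there is a neighborhood $J \times V$ of $(\lambda_0, o)$ such that the equation
$$f(\lambda, x) = o$$
has for every $\lambda \in J$ a unique solution in $V$, namely $x = o$. We may assume that
$0 \notin J$, $\left\{\lambda : \frac{1}{\lambda} \in \sigma(A)\right\} \cap J = \{\lambda_0\}$, and also that $\lambda_0 > 0$ (for $\lambda_0 < 0$ consider
$\tilde{f}(\lambda, x) = f(-\lambda, x)$). The degree $\deg(f(\lambda, \cdot), V, o)$ is defined and it is given by the
Leray-Schauder Index Formula (Proposition 5.2.22):
$$\deg(f(\lambda, \cdot), V, o) = (-1)^{\beta} \quad \text{where} \quad \beta = \sum_{\substack{\mu \in \sigma(A) \cap \mathbb{R} \\ \lambda\mu > 1}} m(\mu) = \sum_{\substack{\mu \in \sigma(A) \cap \mathbb{R} \\ \mu > \frac{1}{\lambda}}} m(\mu).$$
Since $m\left(\frac{1}{\lambda_0}\right)$ is odd, the degree $\deg(f(\lambda, \cdot), V, o)$ changes sign at $\lambda_0$. A contradiction follows now from Proposition 5.2.20.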
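The sign change used in this proof can be illustrated in finite dimensions, where the degree of an invertible linear map $I - \lambda A$ at $o$ is the sign of its determinant. The sketch below is only an illustration under stated assumptions: the spectrum (pairs of a real eigenvalue and its algebraic multiplicity) is a hypothetical example, only $\lambda > 0$ is considered, and the function names are not taken from the text.

```python
def index_by_formula(lam, spectrum):
    """(-1)**beta, where beta is the total multiplicity of eigenvalues mu > 1/lam (lam > 0)."""
    beta = sum(m for mu, m in spectrum if mu > 1.0 / lam)
    return (-1) ** beta

def index_by_determinant(lam, spectrum):
    """Sign of det(I - lam*A) for a finite-dimensional A with the given real spectrum."""
    det = 1.0
    for mu, m in spectrum:
        det *= (1.0 - lam * mu) ** m
    return 1 if det > 0 else -1

# Hypothetical spectrum: (eigenvalue, algebraic multiplicity).
spectrum = [(2.0, 1), (0.5, 3), (0.25, 2), (-1.0, 2)]

# Sample lambda on both sides of 1/2 (odd m), 2 (odd m) and 4 (even m).
samples = (0.45, 0.55, 1.9, 2.1, 3.9, 4.1)
signs = {lam: index_by_formula(lam, spectrum) for lam in samples}
print(signs)
```

In this example the index should change sign when $\lambda$ crosses $1/2$ and $2$ (odd multiplicity) and stay the same when $\lambda$ crosses $4$ (even multiplicity), which is the mechanism behind Theorem 5.2.23.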
The Krasnoselski Theorem 5.2.23 is of a local nature and does not say anything about the global behavior of a branch of nontrivial solutions of the equation
$$f(\lambda, x) = o.$$
The so-called global bifurcation theorems describe these branches (see Appendix 5.2A). The interested reader can also consult, e.g., Rabinowitz [103, pages 11-36],
Ize [69], Nirenberg [100, Chapter 3], Krasnoselski & Zabreiko [79], Krawcewicz
& Wu [80]. There are also methods depending on other topological tools. See, e.g.,
Alexander [3, pages 457-483] or Fitzpatrick [50] and references given there.
Remark 5.2.24 (Comparison of Theorems 4.3.22 and 5.2.23). Let
value of A and
1
I A = 1.
dim Ker
0
1
0
be an eigen
f1,2
(0 , o) = A.
(5.2.19)
(5.2.20)
(5.2.21)
f1,2
(0 , o)(1, x0 ) Im (I 0 A).
(5.2.22)
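The compactness of $A$ (Theorem 5.2.23) implies that $I - \lambda_0 A$ is a Fredholm operator of index 0, hence (5.2.20) holds and also
$$\operatorname{codim} \operatorname{Im}(I - \lambda_0 A) = \dim \operatorname{Ker}(I - \lambda_0 A) = 1,$$
i.e., (5.2.21) follows. The last assumption is closely connected with the multiplicity
$m\left(\frac{1}{\lambda_0}\right)$ of $\frac{1}{\lambda_0}$ as follows from the assertion:
The assumption (5.2.22) is verified if and only if
$$m\left(\frac{1}{\lambda_0}\right) = \dim \bigcup_{k=1}^{\infty} \operatorname{Ker}(I - \lambda_0 A)^k = 1. \tag{5.2.23}$$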
i.e.,
1
0
> 1,
a contradiction.
Now, let us prove (5.2.22) $\Rightarrow$ (5.2.23). Take
$$w \in \operatorname{Ker}(I - \lambda_0 A)^2$$
and set
$$u = (I - \lambda_0 A)w.$$
0
u Im (I 0 A),
a
i.e.,
This proves
Ker (I 0 A)2 Ker (I 0 A).
w Ker A.
for any n N
(do it in detail!).
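Exercise 5.2.25. Prove the following assertion:
Let $f$, $A$ and $G$ be as in Theorem 5.2.23. Let $\lambda_0 \ne 0$ be a bifurcation
point of $f$. Then $\frac{1}{\lambda_0}$ is an eigenvalue of $A$.
Hint. If $\lambda_0$ is a bifurcation point of $f$, there are
$$o \ne x_n \to o, \qquad \lambda_n \to \lambda_0, \qquad f(\lambda_n, x_n) = o.$$
Set $v_n := \frac{x_n}{\|x_n\|}$. Then
$$v_n = \lambda_n A v_n + \frac{G(\lambda_n, x_n)}{\|x_n\|}.$$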
(5.2.24)
for any x .
for every
and
t [0, 1].
Then
$$\deg(I - h(t, \cdot), \Omega, y_0)$$
is constant with respect to $t \in [0, 1]$.
The following two exercises use an idea similar to that of Example 5.2.14.
Exercise 5.2.30. Let $H$ be a Hilbert space and $F$ a compact operator on a bounded
open set $\Omega \subset H$ into $H$. Assume that $o \in \Omega$ and
$$(F(x), x) \le \|x\|^2 \quad \text{for each } x \in \partial\Omega.$$
holds for
t [0, 1], x, y R.
Then the boundary value problem (5.2.10) has a solution. Prove that!
Hint. Proceed similarly to Example 5.2.16. Use the equation to estimate x
and compute x with help of a special form of the kernel G.
Exercise 5.2.33. Apply Theorem 5.2.23 to the Dirichlet boundary value problem
$$\ddot{x}(t) + \lambda x(t) + g(\lambda, t, x(t)) = 0, \quad t \in (0, \pi), \qquad x(0) = x(\pi) = 0,$$
and show that every point $(k^2, o)$, $k = 1, 2, \dots$, is a bifurcation point.
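The candidates $\lambda = k^2$ in Exercise 5.2.33 are the values for which the linearized problem $\ddot{x} + \lambda x = 0$, $x(0) = x(\pi) = 0$ has a nontrivial solution. As a numerical cross-check (not a proof, and not the method of the text), the following sketch locates them by shooting: it integrates the linear equation with $x(0) = 0$, $\dot{x}(0) = 1$ and finds the $\lambda$ for which $x(\pi) = 0$. The step count, the brackets $k^2 \pm 0.5$ and the function names are assumptions made for illustration.

```python
import math

def endpoint_value(lam, steps=2000):
    """Integrate x'' + lam*x = 0, x(0) = 0, x'(0) = 1 on [0, pi] by RK4; return x(pi)."""
    h = math.pi / steps
    x, v = 0.0, 1.0
    for _ in range(steps):
        k1x, k1v = v, -lam * x
        k2x, k2v = v + 0.5 * h * k1v, -lam * (x + 0.5 * h * k1x)
        k3x, k3v = v + 0.5 * h * k2v, -lam * (x + 0.5 * h * k2x)
        k4x, k4v = v + h * k3v, -lam * (x + h * k3x)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return x

def bisect(f, a, b, tol=1e-10):
    """Bisection for a root of f in [a, b], assuming f(a) and f(b) have opposite signs."""
    fa = f(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        fm = f(m)
        if fa * fm <= 0.0:
            b = m
        else:
            a, fa = m, fm
    return 0.5 * (a + b)

# Search near k**2 for k = 1, 2, 3 (the brackets are an assumption).
eigenvalues = [bisect(endpoint_value, k * k - 0.5, k * k + 0.5) for k in (1, 2, 3)]
print(eigenvalues)
```

The computed values should be close to 1, 4 and 9; each of these eigenvalues is simple, which is the odd-multiplicity situation required in Theorem 5.2.23.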
(5.2.25)
The following result is due to Rabinowitz [103, pp. 11-36], Rabinowitz [104].
Theorem 5.2.34 (Rabinowitz Global Bifurcation Theorem). Let $X$ be a Banach space,
$\Omega$ an open set in $\mathbb{R} \times X$, $(\lambda_0, o) \in \Omega$, $\lambda_0 \ne 0$. Let us assume:
$$A \text{ is a compact linear operator from } X \text{ into } X, \tag{5.2.26}$$
(5.2.27)
(5.2.28)
(5.2.29)
is an eigenvalue of A of odd
Proof. We shall follow the proof of Ize [69]. The idea is the following. We will assume
that $C$ is compact, and prove that it contains an even number of points described in (ii).
Since $C$ is compact, it contains only a finite number of points $(\lambda, o)$ where $\lambda \ne 0$ and $\frac{1}{\lambda}$
is an eigenvalue of the compact operator $A$ (see Figure 5.2.5). We shall denote them by
$$(\lambda_0, o), \dots, (\lambda_{k-1}, o).$$
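(see Figure 5.2.6). Indeed, let $U$ be a neighborhood of $C$ such that $\overline{U} \setminus C$ does not
contain any point $(\lambda, o)$, $\lambda \ne 0$, $\frac{1}{\lambda} \in \sigma(A)$. The set $K = \overline{U} \cap S$ is then compact,32 and
obviously $C \cap (\partial U \cap S) = \emptyset$. By Deimling [34, Lemma 29.1] there exist compact disjoint
sets $K_1, K_2 \subset K$ such that
$$K = K_1 \cup K_2, \qquad C \subset K_1, \qquad \partial U \cap S \subset K_2.$$
31 I.e.,
32 The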
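[Figure 5.2.5. The set $C$ and the point $(0, o)$.]
[Figure 5.2.6. The neighborhood $U$, the sets $K_1$ and $K_2$, the point $(0, o)$ and the sets $U_0, \dots, U_3$ in $\mathbb{R} \times X$.]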
(5.2.30)
Then obviously
fr (, x) = o
f (, x) = o
and
x = r.
(In other words, the function fr considers the solutions of f (, x) = o which belong
and the homotopy invariance
to the sphere x = r.) Then thanks to the choice of
property of the degree (Theorem 5.2.13(vi)), we conclude that
o)
deg (fr , ,
is well dened and independent of r > 0.
The rest of the proof consists in the calculation of this degree for sufficiently large
$r$ and for sufficiently small $r$.
implies that there exists C > 0 such
Step 1 (suciently large r). The boundedness of
that x < C for any (, x) . Then for r > C the equation
fr (, x) = o
and so, according to Theorem 5.2.13(iii), we have
has no solution in ,
o) = 0.
deg (fr , ,
Step 2 (suciently small r). For j = 0, 1, . . . , k 1 set
Uj (, r) {(, x) : x2 +  j 2 < r 2 + 2 },
and choose > 0 so small that the sets Uj (, ) are pairwise disjoint, all belong to ,
and do not contain (0, o) (see Figure 5.2.6).
We prove rst that there exists r > 0 (r ) such that
x Ax tG(, x)
= o
(5.2.31)
xn
yn n Ayn tn
G(n , xn )
= o.
xn
(5.2.33)
y Ay
= o,
a contradiction.
We shall write Uj = Uj (, r) for simplicity. It follows from Theorem 5.2.13(iv) that
o) =
deg (fr , ,
k1
j=0
(5.2.34)
Let j be xed. It follows from the choice of > 0 that for 0 <  j  we have
1
(A).
1
j
(5.2.35)
(5.2.36)
(5.2.37)
k1
ij .
j=0
mj odd
Since this degree is independent of r, it must be equal to zero (see Step 1 of this proof).
Hence there must be an even number of eigenvalues of odd algebraic multiplicity among
0 , . . . , k1 .
Now we prove an analogue of the Leray-Schauder Index Formula (see Proposition 5.2.22).
Lemma 5.2.35. Let $f_r$, $U_j$, $i_j^-$, $i_j^+$ be as above. Then
$$\deg(f_r, U_j, o) = i_j^- - i_j^+.$$
Proof. We will connect fr with a simpler mapping using a suitable homotopy. Let us
dene this homotopy in the following way:
t [0, 1]
ft,r : Uj R X : ft,r (, x) = (t , yt ),
t = t(x2 r 2 ) + (1 t)(2 ( j )2 ),
We prove that for any t [0, 1]
o
/ ft,r (Uj ).
yt = x Ax tG(, x).
Assume the contrary, i.e., there exist t [0, 1] and (, x) Uj such that
ft,r (, x) = o.
The fact that (, x) Uj implies
x2 + ( j )2 = r 2 + 2 .
At the same time, from
0 = t = t(x2 + ( j )2 ) t(r 2 + 2 ) + 2 ( j )2
we obtain = j , and so x = r. This together with yt = o contradicts (5.2.31).
The homotopy invariance property of the degree then implies
deg (fr , Uj , o) = deg (f0,r , Uj , o).
The mapping f0,r is now easier to deal with. Indeed, the point o has two preimages,
(j , o) and (j + , o), with respect to the mapping
f0,r (, x) = (2 ( j )2 , x Ax).
is injective:
At both points the Frechet dierential f0,r
f0,r
(, 0)(, u) = (2( j ), u Au).
Corollary 5.2.36. If $\Omega = \mathbb{R} \times X$ in Theorem 5.2.34, then the first possibility (i) reduces
to "$C$ is unbounded in $\mathbb{R} \times X$" and (ii) remains unchanged.
Proof. Let $(\lambda, x) \in C$. Then
$$x = \lambda Ax + G(\lambda, x).$$
This implies that if $C$ is bounded in $\mathbb{R} \times X$, it is also relatively compact because
$$T(\lambda, x) = \lambda Ax + G(\lambda, x)$$
is a compact operator. But $C$ is closed, and so it is compact. We have thus proved that
if $C$ is bounded, it is also compact.
We will now discuss the special case when $\frac{1}{\lambda_0}$ is an eigenvalue of $A$ the multiplicity of
which is equal to 1. If this is the case in Theorem 5.2.34 and $C$ is the component containing
the point $(\lambda_0, o)$, then $C$ consists of two connected sets $C^{\pm}$ which near $(\lambda_0, o)$ meet only
in $(\lambda_0, o)$. More precisely, the next assertion holds (see Deimling [34, Corollary 29.1]).
Corollary 5.2.37. Under the hypotheses of Theorem 5.2.34 suppose, in addition, that the
multiplicity of $\frac{1}{\lambda_0}$ is 1. Then the component $C$ containing $(\lambda_0, o)$ consists of two connected
sets $C^+$ and $C^-$, $C = C^+ \cup C^-$ such that
C + C B((0 , o); ) = {(0 , o)}
and
C B((0 , o); )
=
[Figure 5.2.7. The sets $C^+$ and $C^-$ near the point $o$, with the direction $v_0$ marked on both branches.]
The global properties of $C^{\pm}$ were studied by Dancer [30]. The main result of this
paper can be formulated as follows.
Theorem 5.2.38 (Dancer Global Bifurcation Theorem). The sets $C^+$ and $C^-$ are either
both unbounded, or
$$C^+ \cap C^- \ne \{(\lambda_0, o)\}.$$
Example 5.2.39 (Application of the Dancer Global Bifurcation Theorem). Let us consider
the Dirichlet boundary value problem
$$\ddot{x}(t) + \lambda x(t) = g(\lambda, t, x(t)), \quad t \in (0, \pi), \qquad x(0) = x(\pi) = 0. \tag{5.2.38}$$
We assume that $g = g(\lambda, t, s)$ is a continuous function from $[0, \pi] \times \mathbb{R} \times \mathbb{R}$ into $\mathbb{R}$ and,
given any bounded interval $I \subset \mathbb{R}$,
$$\lim_{s \to 0} \frac{g(\lambda, t, s)}{s} = 0 \tag{5.2.39}$$
t [0, ], R,
and so (5.2.38) has a trivial solution. In this example we discuss the existence and properties of nontrivial weak solutions of (5.2.38). Let $X := W_0^{1,2}(0, \pi)$ and define operators
$A, G : X \to X$ as follows:
(Ax, y) =
x(t)y(t) dt, (G(, x), y) =
g(, t, x(t))y(t) dt for any x, y X.
0
{(n2 , o)} C + C ,
xk
v0
xk
in
X.
G(k , xk )
xk
(5.2.41)
then yield that $v_k \to v_0$ even in $C^2[0, \pi]$. In particular, it means that for large enough $k$,
the functions $x_k$ share the nodal properties of $v_0$. More precisely, let
$$A^+ := \{(\lambda, x) \in C^+ : x \text{ has exactly } (n - 1) \text{ nodes in } (0, \pi) \text{ and } \dot{x}(0) > 0\},$$
$$A^- := \{(\lambda, x) \in C^- : x \text{ has exactly } (n - 1) \text{ nodes in } (0, \pi) \text{ and } \dot{x}(0) < 0\}.$$
Then there exists 0 > 0 such that
C B((n2 , o); ) = A
for any
0 0 .
In particular, A
= . We show that A is closed and open in C . Let us consider
C + , the case of C is similar. Recall that C + is a connected set with respect to the
x
topology induced by the topology on R X. For a given (,
) C + the convergence
x
) in this topology means that
(k , xk ) (,
in R
and
xk x
in X.
The above mentioned regularity result and the embedding X = W01,2 (0, ) C[0, ]
then imply that
in C 2 [0, ].
xk x
Let us assume that
(k , xk ) A+ ,
(k , xk )
= (, o),
x
(k , xk ) (,
) C + .
x
The fact xk x
in C 2 [0, ] then yields that (,
) A+ , i.e., A+ is closed in C + . On
+
2
)
= (n , o), then there exists > 0 such that
the other hand, if (, x
) A , (, x
x
C + B((,
); ) A+ ,
and xk x
for otherwise there would be k
in C 2 [0, ], (k , xk )
A+ , (k , xk )
+
+
+
C , a contradiction. Hence A is open in C .
We have just proved that $A^{\pm} = C^{\pm}$ and so the sets $C^+$ and $C^-$ do not have
any common point besides $(n^2, o)$. According to Theorem 5.2.38 both $C^+$ and $C^-$ are
unbounded in $\mathbb{R} \times X$.
Let us emphasize that this means that $C^{\pm}$ are unbounded either with respect to
$x$, or with respect to $\lambda$ (or with respect to both $x$ and $\lambda$!).
Some further properties of $g$ might provide more information about the sets $C^{\pm}$
(e.g., boundedness with respect to $x$ if there are a priori estimates for all solutions
and unboundedness with respect to $\lambda$; or vice versa, boundedness with respect to $\lambda$ and
unboundedness with respect to $x$).
Exercise 5.2.40. Consider the boundary value problem (5.2.38) and apply Theorem 5.2.34
to get conclusions about the bifurcation branches. Formulate further assumptions on $g$
which will imply unboundedness of the branches with respect to $x$ and $\lambda$, respectively.
Exercise 5.2.41. Consider the Neumann boundary value problem
$$\ddot{x}(t) + \lambda x(t) = g(\lambda, t, x(t)), \quad t \in (0, \pi), \qquad \dot{x}(0) = \dot{x}(\pi) = 0. \tag{5.2.42}$$
t (0, 2),
x(0)
= x(2).
(5.2.43)
(5.2.44)
The purpose of this appendix is to inform the reader about a possible method for extending the Leray-Schauder degree theory to mappings of the type (5.2.44). The following
definition is the key to the theory presented in this appendix.
Definition 5.2.44. The operator $T : X \to X^*$ is said to satisfy the $(S_+)$ condition if the
assumptions
$$u_n \rightharpoonup u_0 \quad \text{(weakly) in } X \qquad \text{and} \qquad \limsup_{n \to \infty} \langle T(u_n), u_n - u_0 \rangle \le 0$$
imply
$$u_n \to u_0 \quad \text{(strongly) in } X.$$
Remark 5.2.45. The topological degree for generalized monotone operators was independently introduced by Browder [19] and Skrypnik [121]. The notation $(S_+)$ is brought
from Browder [19] while the same condition is called $\alpha(X)$ in Skrypnik [121]. The $(S_+)$
condition is a kind of compactness condition and plays an essential role in the construction of the degree for $T : X \to X^*$. This construction is based on the Brouwer degree and
finite dimensional approximations as the construction of the Leray-Schauder degree, and
mappings satisfying the $(S_+)$ condition then play a similar role as compact perturbations
of the identity. The following assertion illustrates this fact. Its proof is a straightforward
consequence of Definition 5.2.44.
Lemma 5.2.46. Let $T : X \to X^*$ satisfy the $(S_+)$ condition and let $K : X \to X^*$ be a
compact operator. Then the sum $T + K : X \to X^*$ satisfies the $(S_+)$ condition.
The following assertion is an analogue of Theorem 5.2.13 and of Exercise 5.2.26.
Theorem 5.2.47 (I. V. Skrypnik [121]). Let $T : X \to X^*$ be a bounded and demicontinuous34 operator satisfying the $(S_+)$ condition. Let $D \subset X$ be an open, bounded and
33 Here and in the sequel we denote by $\langle f, u \rangle := f(u)$ the value of the linear form $f \in X^*$ for an
element $u \in X$. If $X$ is a Hilbert space, then according to the Riesz Representation Theorem,
$\langle f, x \rangle = (x, f)$.
34 We say that $T : X \to X^*$ is demicontinuous if $T$ maps strongly convergent sequences in $X$ to
weakly convergent sequences in $X^*$.
be valid.
Then
deg (T, D, o) = 1.
Let $u_0 \in X$ be an isolated solution of the equation
$$T(u) = o. \tag{5.2.45}$$
Similarly to the finite dimensional case (and in the case of the Leray-Schauder degree)
we define the index of an isolated solution $u_0$ as
$$i(T, u_0) = \lim_{r \to 0+} \deg(T, B(u_0; r), o).$$
n
i=1
i(T, ui )
holds.
The last assertion connects the properties of the functionals and the degree of their
Fréchet derivatives.
Proposition 5.2.50 (I. V. Skrypnik [121]). Assume that a real functional $F : X \to \mathbb{R}$ has
a local minimum at $u_0 \in X$ and its Fréchet derivative $F' : X \to X^*$ is a bounded and
demicontinuous mapping which satisfies the $(S_+)$ condition. Let, moreover, $u_0$ be an
isolated solution of $F'(u_0) = o$. Then
$$i(F', u_0) = 1.$$
Example 5.2.51. Let us consider the boundary value problem
p2
x(t))
g(x(t)) = f (t),
t (0, 1),
(x(t)
(5.2.46)
x(0) = x(1) = 0
p
,
p1
35
x (x
p2 x)
is the so-called one-dimensional $p$-Laplacian (or half-linear differential operator of the
second order). The parameter $\lambda \in \mathbb{R}$ for which there is a nontrivial weak solution (cf. Remark 5.3.10) $\varphi = \varphi(t)$ (i.e., not identically equal to zero in $(0, 1)$) of the problem
$$-\left(|\dot{x}(t)|^{p-2}\dot{x}(t)\right)^{\cdot} - \lambda |x(t)|^{p-2}x(t) = 0, \quad t \in (0, 1), \qquad x(0) = x(1) = 0 \tag{5.2.47}$$
is called an eigenvalue of the eigenvalue problem (5.2.47) and the function $\varphi$ an eigenfunction associated with the eigenvalue $\lambda$. It is known (see, e.g., Elbert [47]) that the problem (5.2.47) has a countable set of simple eigenvalues $0 < \lambda_1 < \lambda_2 < \cdots$, $\lim_{n \to \infty} \lambda_n = \infty$
(cf. Appendix 6.4B for the case $p > 2$) and the values of $\lambda_n$, $n = 1, 2, \dots$, can be explicitly
calculated in terms of $p$ and $\pi$. The eigenfunction $\varphi_n$ associated with $\lambda_n$ is continuously
differentiable and has exactly $n - 1$ zero points in $(0, 1)$. In particular, we can choose
$\varphi_1(t) > 0$, $t \in (0, 1)$. (See Elbert [47], Došlý [37], Drábek, Krejčí & Takáč [41] and references given there.) However, the concrete values of $\lambda_n$ are not important in this example.
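Let us assume that
$$\lim_{|s| \to \infty} \frac{g(s)}{|s|^{p-2}s} = \lambda \quad \text{where } \lambda \ne \lambda_n \text{ for any } n = 1, 2, \dots. \tag{5.2.48}$$
The problem (5.2.46) is then called a nonresonance problem (cf. Remark 7.5.5). Put
$X := W_0^{1,p}(0, 1)$ with the norm
$$\|x\| = \left(\int_0^1 |\dot{x}(t)|^p \, dt\right)^{\frac{1}{p}}.$$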
x(t)
y(t)
dt, 36
G(x), y =
g(x(t))y(t) dt,
0
35 The
36 By
sp2 s,
f (t)y(t) dt
x, y X.
for any
If we set
T = J + G,
then the operator equation
T (x) = f
(5.2.49)
x(t)
y(t)
dt
g(x(t))y(t) dt =
f (t)y(t) dt
0
(5.2.50)
holds for all y X. The function x X satisfying (5.2.50) is then a weak solution of
(5.2.46) (cf. Remark 5.3.10). It follows that the existence of a weak solution of (5.2.46)
is equivalent to the existence of a solution of the operator equation (5.2.49).
Our plan is to use the degree argument to prove the existence of a solution of
(5.2.49). First we sketch the properties of operators J and G. The operator J satises
$$\langle J(x), x \rangle = \|x\|^p.$$
(5.2.51)
= lim sup
n
x n (t)p dt
lim sup
n
9
x n p2 x n (t) x 0 (t)p2 x 0 (t) (x n (t) x 0 (t)) dt
1
p
x n (t)p dt
x 0 (t) dt
p
1
p
p1
x 0 (t)p dt
p1
1
x n (t)p dt
8
9
= lim sup xn p1 x0 p1 [xn x0 ] 0
x 0 (t) dt
p
where the last inequality follows from the fact that $s \mapsto s^{p-1}$ is strictly increasing on
$(0, \infty)$. Hence $\|x_n\| \to \|x_0\|$, and due to the uniform convexity of $X$ we have $x_n \to x_0$ in
$X$.38 The operator $J$ is also invertible and its inverse is continuous.39
The operator $G$ is compact. This follows immediately from the compact embedding
$X = W_0^{1,p}(0, 1) \hookrightarrow C[0, 1]$ and from the continuity of $g$ (the reader is invited to prove
it in detail). Hence, due to Lemma 5.2.46 the operator $T$ satisfies the $(S_+)$ condition.
37 I.e., $J(tx) = t^{p-1}J(x)$ for any $t > 0$, $x \in X$.
38 See Proposition 2.1.22(iv).
39 See Exercise 5.2.53.
x, y X.
for any
[0, 1], x X,
for
and show that there exists R > 0 (large enough) such that this homotopy is admissible
with respect to the ball B(o; R) X. The usual way to prove it relies on an indirect
argument. Assume by contradiction that for any k N there exists k [0, 1] and xk X,
xk k such that
Tk (xk ) = o,
i.e.,
p1
J(vk ) (1 k )
, denote vk
xk
xk
p1
(5.2.52)
G(xk )
f
k S(vk ) + (k 1)
= o.
xk p1
xk p1
(5.2.53)
Due to the reexivity of X and the compactness of the interval [0, 1], we may assume
that
and
k [0, 1].
vk v in X
Using the compactness of the embedding X Lp (0, 1), the facts that G and S are
continuous as operators from Lp (0, 1) into Lp (0, 1), and using the assumption (5.2.48)
we obtain
G(xk )
(1 )S(v)
xk p1
in
X ,
k S(vk ) S(v)
in
X ,
in
(1 k )
(k 1)
f
o
xk p1
as
(the reader is invited to justify all in detail!). Passing to the limit in (5.2.53) we thus get
J(vk ) (1 )S(v) + S(v)
i.e.,
in
vk J 1 (S(v))
in
as
k ,
X.
in
and
J(v) S(v) = o
in
X .
(5.2.54)
Since $\|v_k\| = 1$ for all $k = 1, 2, \dots$, we have $\|v\| = 1$, and so (5.2.54) contradicts the fact
that $\lambda$ is not an eigenvalue of (5.2.47). This proves that the homotopy is admissible
with respect to the ball $B(o; R)$ if $R$ is large. Applying Theorem 5.2.47(iii) we arrive at
$$\deg(J - G - f, B(o; R), o) = \deg(J - \lambda S, B(o; R), o),$$
but the value of the degree on the right-hand side is an odd number according to Theorem 5.2.47(ii). Hence
$$\deg(J - G - f, B(o; R), o) \ne 0,$$
and the existence of at least one solution $x \in X$ of (5.2.49) which satisfies $\|x\| < R$
follows from Theorem 5.2.47(i).
Remark 5.2.52. It is possible to solve the problem (5.2.46) by means of the Leray-Schauder
degree theory as well. In that case instead of solving the operator equation
$$J(x) - G(x) = f$$
one has to deal with
$$x = J^{-1}(f + G(x))$$
(cf. Exercise 5.2.54). Due to the properties of $J^{-1}$ (cf. Exercise 5.2.53) this approach is
more or less equivalent to that presented in Example 5.2.51. However, in more complicated applications (equations of higher order, partial differential equations, etc., see, e.g.,
Appendix 7.5A) the use of the degree presented in Theorem 5.2.47 can appear to be of
essential advantage!
Exercise 5.2.53. Let $J$ be an operator from Example 5.2.51. Prove that there exists an
inverse operator $J^{-1}$ which is bounded and continuous.
Hint. The strict monotonicity of $s \mapsto |s|^{p-2}s$ implies that
$$\langle J(u) - J(v), u - v \rangle > 0 \quad \text{for } u \ne v. \tag{5.2.55}$$
(cf. the proof of the $(S_+)$ condition in Example 5.2.51). The boundedness of $J^{-1}$ then
follows.
To prove that $J^{-1}$ is continuous proceed via contradiction. Suppose it is not, i.e.,
for a
> 0.
i.e.,
Exercise 5.2.54. Consider the boundary value problem (5.2.46) with $g$ satisfying (5.2.48).
Prove the existence of at least one weak solution of (5.2.46) using the Leray-Schauder
degree theory.
Hint. Prove that $J^{-1} \circ G$ is a compact operator from $X$ into itself and then use the
homotopy invariance property of the Leray-Schauder degree to prove that
$$x = J^{-1}(f + G(x))$$
has at least one solution in $X$. Compare your proof with the method presented in Example 5.2.51.
Exercise 5.2.55. Consider the problem
p2
x(t)
x(t)
t (0, 1),
x(0) = x(1) = 0,
(5.2.57)
y = f (x)
Figure 5.3.1.
If $f$ is replaced by an operator
$$T : H \to H$$
from a real Hilbert space $H$ (with a scalar product $(\cdot, \cdot)$ and the induced norm
$\|\cdot\|$) into itself and the same question is posed, then similar conditions appear to
be appropriate to prove that for any $h \in H$ the equation
$$T(u) = h$$
uH
T (u) =
( 0)
for any x, y R.
for any u, v H
(5.3.1)
for any u, v H.
(5.3.2)
(5.3.3)
(5.3.4)
Proof. The uniqueness of the solution is a direct consequence of the strict monotonicity of $T$. The existence of a solution to (5.3.4) for any $h \in H$ is proved in two
steps:
Step 1. Assume for a while that the assertion of the theorem holds if $T$ is continuous and strongly monotone. We prove this fact later, in Proposition 5.3.5. Since
$T_n : H \to H$, $n \in \mathbb{N}$, defined by
$$T_n : u \mapsto \frac{1}{n}u + T(u)$$
is strongly monotone (prove it!) for any $n \in \mathbb{N}$, we claim that given $h \in H$ there
exists $u_n \in H$ such that
$$T_n(u_n) = h. \tag{5.3.5}$$
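The regularization in Step 1 can be tried out in the simplest setting $H = \mathbb{R}$. The following is a minimal sketch under assumptions that are not part of the text: the operator is the hypothetical choice $T(u) = u^3$ (continuous and strictly monotone, but not strongly monotone near $0$), the right-hand side is $h = 8$, and each equation $T_n(u_n) = h$ is solved by bisection, which works here only because $T_n$ is an increasing real function.

```python
def T(u):
    # Hypothetical monotone operator on H = R (an assumption for illustration).
    return u ** 3

def T_n(u, n):
    # Regularized operator u -> u/n + T(u); it is strongly monotone with constant 1/n.
    return u / n + T(u)

def solve_increasing(f, h, lo=-1.0e3, hi=1.0e3, tol=1e-12):
    """Solve f(u) = h by bisection, assuming f is increasing and the root lies in [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < h:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

h = 8.0
approximations = [
    solve_increasing(lambda u, n=n: T_n(u, n), h)
    for n in (1, 10, 100, 1000, 10 ** 6)
]
print(approximations)
```

In this example the solutions $u_n$ stay bounded and approach the solution $u = 2$ of $T(u) = 8$, which mirrors the boundedness and limit passage carried out in the rest of the proof.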
Step 2. Let us prove that $\{u_n\}_{n=1}^{\infty}$ is a bounded sequence in $H$. Assume the
contrary, i.e., there exists a subsequence which will be denoted by $\{u_n\}_{n=1}^{\infty}$ again
such that
$$\lim_{n \to \infty} \|u_n\| = \infty.$$
1
un w.
nk k
According to (5.3.5),
T (unk ) h w.
{T (unk )}
k=1
u mk u 0 .
We prove that $T(u_0) = h$. Indeed, for any $v \in H$ and $k \in \mathbb{N}$ we have
$$(T(u_{m_k}) - T(v), u_{m_k} - v) \ge 0.$$
(5.3.6)
Passing to the limit for 0+ in (5.3.6) and using the continuity of T and of the
scalar product in H, we get
(h T (u0 ), w) 0
for any w H.
(5.3.7)
for any w H,
i.e.,
T (u0 ) = h.
Now, it remains to justify the assumption made in Step 1. For this purpose
we prove the following assertion.
Proposition 5.3.5. Let $H$ be a real Hilbert space and $S : H \to H$ a continuous and
strongly monotone operator. Then
$$S(H) = H.$$
Proof. The idea of the proof is easy. Since $H$ is a connected metric space, it is
enough to prove (see Lemmas 5.3.6 and 5.3.8) that $S(H)$ is both open and closed
in $H$. Then $S(H) = H$ because the only nonempty subset of $H$ which is both open
and closed is the entire space $H$.
First we prove that $S(H)$ is closed.
Lemma 5.3.6. Let $D$ be a closed set in $H$, let $S : D \to H$ be a continuous and
strongly monotone operator. Then $S(D)$ is a closed set in $H$.
we use that $x_n \rightharpoonup x$ and $y_n \to y$ imply $(x_n, y_n) \to (x, y)$. See Exercise 2.1.36.
Hence $\{u_n\}_{n=1}^{\infty}$ is a Cauchy sequence, and there exists $u_0 \in D$ such that
$$u_n \to u_0.$$
The continuity of $S$ implies that
$$S(u_n) \to S(u_0),$$
i.e.,
$$S(u_0) = h.$$
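To prove that $S(H)$ is an open set is more tricky. For this purpose we need
an auxiliary assertion about an extension of Lipschitz continuous operators.
Lemma 5.3.7. Let $D$ be a subset of a real Hilbert space $H$, let $V : D \to H$ be an
operator satisfying
$$\|V(u) - V(v)\| \le \|u - v\| \quad \text{for any } u, v \in D.$$
Then there exists an operator $W : H \to H$ such that
$$\|W(u) - W(v)\| \le \|u - v\| \quad \text{for any } u, v \in H \tag{5.3.8}$$
and, moreover,
$$W(u) = V(u) \quad \text{for any } u \in D.$$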
Proof. It follows from Zorn's Lemma (see Theorem 1.1.4) that there exists a maximal extension $W$ of the operator $V$, the domain of which satisfies
$$\operatorname{Dom} W \subset H, \qquad D \subset \operatorname{Dom} W,$$
and for any $u, v \in \operatorname{Dom} W$ the inequality (5.3.8) holds. Our aim is to prove
$$\operatorname{Dom} W = H.$$
Assume the contrary, i.e., there exists $u_0 \in H \setminus \operatorname{Dom} W$. In order to reach a
contradiction it is enough to prove the existence of $v_0 \in H$ such that
$$\|v_0 - W(u)\| \le \|u_0 - u\|. \tag{5.3.9}$$
Indeed, setting
$$\widetilde{W} : u \mapsto \begin{cases} v_0, & u = u_0, \\ W(u), & u \in \operatorname{Dom} W, \end{cases}$$
we obtain an operator
$$\widetilde{W} : \operatorname{Dom} W \cup \{u_0\} \to H$$
satisfying (5.3.8) for any $u, v \in \operatorname{Dom} W \cup \{u_0\}$. This will be a contradiction with
the maximality of the extension $W$. So, in the rest of the proof we concentrate on
the existence of $v_0$ satisfying (5.3.9).
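Let $B$ be a finite subset of $\operatorname{Dom} W$. Denote by $A_B$ the set of all $v_0 \in H$
satisfying (5.3.9) for any $u \in B$. Let $A$ denote the set of all $v_0 \in H$ satisfying
(5.3.9) for all $u \in \operatorname{Dom} W$. Let $\mathcal{B}_n$ be the system of all finite subsets $B$ of $\operatorname{Dom} W$
which belong to the closed ball $\{u \in H : \|u\| \le n\}$, $n \in \mathbb{N}$. Set
$$A_n = \bigcap_{B \in \mathcal{B}_n} A_B.$$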
Clearly, we have
$$A = \bigcap_{n=1}^{\infty} A_n, \qquad A_{n+1} \subset A_n \subset \cdots \subset A_1.$$
1jm
w W (uj )
.
u0 uj
1 j k,
(5.3.10)
k + 1 j m.
such that
w1 W (uj )
w0 W (uj )
<
= ,
u0 uj
u0 uj
w1 W (uj )
< ,
u0 uj
1 j k,
k + 1 j m,
(see Figure 5.3.2 for m = 5 and k = 2).44 Hence h(w1 ) < h(w0 ), a contradiction.
[Figure 5.3.2. The sets $U_1, \dots, U_5$ in $H_f$ (each $U_j$ is the intersection of $H_f$ with an open ball centered at $W(u_j)$), the points $W(u_1), \dots, W(u_5)$, $w_0$, $w_1$, $w_M$ and the convex hull $M$ of $\{W(u_1), W(u_2)\}$.]
k
cj W (uj ),
cj 0,
j=1
k
cj = 1.
j=1
cj zj = o,
zj 2 < zj 2 ,
1 j k.
(5.3.11)
j=1
+
,
%
wW (u )
other words,
Uj
= where Uj = w Hf :
u u j
< (see Figure 5.3.2).
0
j
1jm
%
Indeed,
Uj is a nonempty open subset of Hf and the latter (m k) inequalities in
k+1jm
%
(5.3.10) imply w0
Uj . Using the convexity and compactness of M, the reader is
k+1jm
%
invited to show that
Uj contains the segment {tw0 + (1 t)wM : 0 < t < 1} where
1jk
%
w0 wM = dist (w0 , M). Consequently,
Uj is a nonempty set, too, and w0 belongs to
44 In
1jk
its boundary.
(5.3.12)
k
cj cn (
zj , zn ) <
j,n=1
1 j, n k,
k
cj cn (zj , zn ).
j,n=1
However,
2
k
cj cn (zj , zn ) =
c
z
j j = 0,
j=1
j,n=1
k
i.e.,
2
k
cj cn (
zj , zn ) =
c
z
j j ,
j=1
j,n=1
k
2
k
c
z
j j < 0,
j=1
1
(v + K1 (v)).
2
i.e.,
ZR = S 1
and R Z1 (D).
The inclusion
Z1 (D) R
(5.3.13)
1
v S(u).
2
v1 = S(u1 ).
Furthermore,
(S(u1 ) u1 S(u) + u, u u1 )
= (S(u1 ) u1 v + u, t(v S(u))) + (v S(u), t(v S(u)))
(v S(u), t(v S(u))) = tv S(u)2 ,
and so
tv S(u)2 (S(u1 ) u1 S(u) + u, t(v S(u)))
tS(u1 ) u1 S(u) + uv S(u)
1
tv S(u)2 .
2
Since t > 0 this contradicts (5.3.14). This proves (5.3.13) and the proof is complete.
Let us point out that for operator equations with strongly monotone operators we obtain the continuous dependence of the solution on the right-hand
side.
Corollary 5.3.9. Let $H$ be a real Hilbert space and $T : H \to H$ a continuous and
strongly monotone operator. Then for any $h \in H$ the equation
$$T(u) = h$$
has a unique solution. Let $T(u_1) = h_1$ and $T(u_2) = h_2$. Then
$$\|u_1 - u_2\| \le \frac{1}{c}\|h_1 - h_2\|,$$
where $c > 0$ is the strong monotonicity constant of $T$.
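The estimate of Corollary 5.3.9 can be checked on a scalar example. The sketch below is illustrative only and rests on assumptions not made in the text: $H = \mathbb{R}$, the hypothetical operator $T(u) = 2u + \sin u$ (strongly monotone with $c = 1$ and Lipschitz with constant $3$), and a simple contraction iteration $u \leftarrow u - \varepsilon (T(u) - h)$ as the solver. This iteration needs the additional Lipschitz assumption and is not the open-and-closed argument used in the proof of Proposition 5.3.5.

```python
import math

C = 1.0    # strong monotonicity constant of the hypothetical T below
LIP = 3.0  # Lipschitz constant of the hypothetical T below

def T(u):
    # Hypothetical operator on H = R: T'(u) = 2 + cos(u) lies in [1, 3].
    return 2.0 * u + math.sin(u)

def solve(h, iterations=500):
    """Approximate the solution of T(u) = h by the iteration u <- u - eps*(T(u) - h)."""
    eps = C / LIP ** 2  # step size for which the iteration is a contraction
    u = 0.0
    for _ in range(iterations):
        u -= eps * (T(u) - h)
    return u

h1, h2 = 0.7, -1.9
u1, u2 = solve(h1), solve(h2)
print(abs(u1 - u2), abs(h1 - h2) / C)
```

The first printed number should not exceed the second one, in agreement with the estimate $\|u_1 - u_2\| \le \frac{1}{c}\|h_1 - h_2\|$.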
Remark 5.3.10. Let $h : [0, 1] \times \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ be a real function. Consider the boundary
value problem
$$-\ddot{x}(t) = h(t, x(t), \dot{x}(t)), \quad t \in (0, 1), \qquad x(0) = x(1) = 0. \tag{5.3.15}$$
Assume that $h$ is continuous and $x \in C^2[0, 1]$ is a solution of (5.3.15). Let us
multiply the equation in (5.3.15) by a function $y \in W_0^{1,2}(0, 1)$ and then integrate
the equation from 0 to 1. Applying the Integration by Parts Formula on the left-hand side, we obtain
$$\int_0^1 \dot{x}(t)\dot{y}(t) \, dt = \int_0^1 h(t, x(t), \dot{x}(t))y(t) \, dt. \tag{5.3.16}$$
This identity makes sense for a more general $x$ than that from $C^2[0, 1]$ and also
for a more general function $h$. We discuss this issue in Section 7.3 in detail.
If we assume that $h$ is such that the integral on the right-hand side of (5.3.16)
exists for any $x, y \in W_0^{1,2}(0, 1)$ (see the following Example 5.3.11), then the function $x \in W_0^{1,2}(0, 1)$ is called the weak solution of (5.3.15) if the integral identity
(5.3.16) holds for any $y \in W_0^{1,2}(0, 1)$.
Once we succeed in finding a weak solution of (5.3.15), a natural question
arises whether it belongs to a better space than $W_0^{1,2}(0, 1)$, e.g., the continuity
of the first and second derivatives of $x$ can be of interest. This is the so-called
regularity problem. It is a very delicate issue in the theory of partial differential
equations. On the other hand, for an ordinary differential equation, it is not. For
instance, if $h$ is a continuous function, independent of $\dot{x}$, and $x \in W_0^{1,2}(0, 1)$ is a
weak solution of (5.3.15), then $x \in C^2[0, 1]$ is a classical solution of (5.3.15), i.e.,
the equation in (5.3.15) holds pointwise in $(0, 1)$.
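Example 5.3.11. Let us consider the boundary value problem
$$-\ddot x(t) + g(x(t)) = f(t), \quad t \in (0,1), \qquad x(0) = x(1) = 0, \tag{5.3.17}$$
where $g$ is a continuous function on $\mathbb{R}$ and $f \in L^2(0,1)$. Put $H = W_0^{1,2}(0,1)$ and define operators $J, G : H \to H$ and an element $f^* \in H$ by
$$(J(x), y) = \int_0^1 \dot x(t)\,\dot y(t)\,dt, \qquad (G(x), y) = \int_0^1 g(x(t))\,y(t)\,dt, \qquad (f^*, y) = \int_0^1 f(t)\,y(t)\,dt$$
for any $x, y \in H$. Here the scalar product and the norm on $H$ are given by
$$(x, y) = \int_0^1 \dot x(t)\,\dot y(t)\,dt, \quad x, y \in H, \qquad \|x\| = \Bigl(\int_0^1 |\dot x(t)|^2\,dt\Bigr)^{1/2},$$
cf. Exercise 1.2.46. The reader is invited to prove that the operators $J$ and $G$ as well as the element $f^*$ are well defined and that $J$ is linear. Set
$$S = J + G.$$
Then the operator equation
$$S(x) = f^*$$
means that the integral identity
$$\int_0^1 \dot x(t)\,\dot y(t)\,dt + \int_0^1 g(x(t))\,y(t)\,dt = \int_0^1 f(t)\,y(t)\,dt \tag{5.3.18}$$
holds for all $y \in H$. This is the weak formulation of (5.3.17), and $x \in H$ satisfying (5.3.18) for any $y \in H$ is a weak solution of (5.3.17).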
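Let us prove that $S$ is a continuous operator. This fact follows from the continuity of $J$ and $G$. By the definition of $J$ and of the scalar product in $H$, $J$ is the identity on $H$, and so it is a continuous operator. Assume now that $x_n \to x$. The embedding of $H = W_0^{1,2}(0,1)$ into $C[0,1]$ (see Theorem 1.2.26) implies that $x_n \to x$ uniformly in $[0,1]$. It follows then from the continuity of $g$ on $\mathbb{R}$ that $g \circ x_n \to g \circ x$ uniformly in $[0,1]$ (justify this statement carefully!). Then using the Dual Characterization of the Norm and the Hölder inequality, we conclude that
$$\begin{aligned}
\|G(x_n) - G(x)\| &= \sup_{\|w\| \le 1} (G(x_n) - G(x), w) = \sup_{\|w\| \le 1} \Bigl| \int_0^1 [g(x_n(t)) - g(x(t))]\,w(t)\,dt \Bigr| \\
&\le \Bigl(\int_0^1 |g(x_n(t)) - g(x(t))|^2\,dt\Bigr)^{1/2} \sup_{\|w\| \le 1} \Bigl(\int_0^1 |w(t)|^2\,dt\Bigr)^{1/2} \\
&\le c\,\Bigl(\int_0^1 |g(x_n(t)) - g(x(t))|^2\,dt\Bigr)^{1/2} \to 0 \quad \text{as } n \to \infty,
\end{aligned}$$
where $c$ is the constant of the embedding of $H$ into $L^2(0,1)$.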
dt +
=
0
It follows from Corollary 5.3.9 that the problem (5.3.17) has a unique weak solution for any $f \in L^2(0,1)$.
If $f_n \to f$ in $L^2(0,1)$, then $f_n^* \to f^*$ strongly in $H$ (prove it in detail!), and according to Corollary 5.3.9 the corresponding weak solutions $x_n \in H$ satisfy
$$\|x_n - x\| \to 0.$$
In particular, this means that a weak solution of (5.3.17) depends continuously on the right-hand side $f \in L^2(0,1)$.
0
nt
n
2
for t 12 ,
for t > 12 .
for any x, y l2
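Hint. Assume the contrary, i.e., there exist $M > 0$ and a sequence $\{u_n\}_{n=1}^\infty$ such that
$$\|u_n\| \to +\infty \ \text{ as } n \to \infty \qquad \text{and} \qquad \|T(u_n)\| \le M.$$
Choose a subsequence of $\{u_n / \|u_n\|\}_{n=1}^\infty$ which is convergent to $w$. Since $T(\mathbb{R}^N) = \mathbb{R}^N$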
Prove that $T$ is an injective monotone operator, $T(\mathbb{R}^2) = \mathbb{R}^2$ and $T$ is not continuous. Can there exist an injective, monotone function $T : \mathbb{R} \to \mathbb{R}$ which is not continuous and $T(\mathbb{R}) = \mathbb{R}$?
Exercise 5.3.18. Let $H$ be a real Hilbert space and $T : H \to H$ a strongly monotone and Lipschitz continuous operator, i.e., there exist numbers $m > 0$, $M > 0$, $M > m$ such that
$$(T(u) - T(v), u - v) \ge m\,\|u - v\|^2, \qquad \|T(u) - T(v)\| \le M\,\|u - v\|$$
t (0, 1),
(5.3.19)
x(0)
= x(1)
= 0.
Prove that any weak solution x of (5.3.19) such that x C 2 [0, 1] satises the
equation in (5.3.19) and x(0)
= x(1)
= x(1)
= 0,
to have a unique weak solution.
Hint. Use Corollary 5.3.9.
T (u), u
= +,
u
(5.3.20)
T (u) = f
(5.3.21)
has at least one solution $u \in X$ for every $f \in X^*$. If, moreover, the inequality (5.3.20) is strict for all $u, v \in X$, $u \ne v$, then the equation (5.3.21) has precisely one solution $u \in X$ for every $f \in X^*$.
The second assertion is more general since the monotonicity condition (iv) is replaced by a set of weaker conditions.
Theorem 5.3.23 (Leray-Lions). Let $X$ be a reflexive real Banach space. Let $T : X \to X^*$ be an operator satisfying the conditions
(i) $T$ is bounded;
(ii) $T$ is demicontinuous;
(iii) $T$ is coercive.
Moreover, let there exist a bounded mapping : X X X such that
(iv) (u, u) = T (u) for every u X;
(v) for all u, w, h X and any sequence {tn }
n=1 of real numbers such that tn 0, we
have
(u + tn h, w) (u, w);
(vi) for all u, w X we have
(u, u) (w, u), u w 0
(the so-called condition of monotonicity in the principal part);
(vii) if un u and
lim (un , un ) (u, un ), un u = 0,
then we have
(w, un ) (w, u)
for arbitrary
w X;
T (u) = f
$$T_F(u) = j_F^*(T(u)).$$
This defines a mapping $T_F$ from the space $F$ into the space $F^*$ (see Figure 5.3.3).
Step 3. Since a continuous linear operator maps a weakly convergent sequence to a weakly convergent one (see Proposition 2.1.27(i)) and the weak convergence coincides with the strong convergence on the subspace $F$ of finite dimension (cf. Remark 2.1.23), it follows from (ii) that $T_F$ is continuous. Put
$$c(r) = \inf_{\substack{u \in X \\ \|u\| = r}} \frac{\langle T(u), u \rangle}{\|u\|}.$$
Figure 5.3.3. Diagram of the mappings $j_F$, $T$, $j_F^*$ and $T_F(u) = j_F^*(T(u))$.
i.e.,
$$\langle T_F(u), u \rangle = \langle T(u), j_F(u) \rangle = \langle T(u), u \rangle \ge c(\|u\|)\,\|u\|$$
holds for all $u \in F$.
(5.3.23)
(5.3.24)
lim c(un ) = .
F0 = {F : F0 F }.
We denote by UF0 the set of all elements u X which are solutions of the equation
(5.3.24) for a F F0 . Furthermore, let UF0 w be the weak closure of the set UF0 .46 Note
that UF0 w B(o; r0 ) for any F0 (cf. Exercise 2.1.39 and the fact that UF0 B(o; r0 )
for all F0 ). Let f be any nite subset of . Then
UF0 w
= .
F0 f
Indeed, let f = {F0i : dim F0i < , i = 1, . . . , n}. Then each of the sets UF i contains
0
n
F0i .
all solutions u X of the equation (5.3.24) in
i=1
Since $B(o; r_0)$ is a compact topological space with respect to the weak topology (notice that $X$ is reflexive), it follows from the result of Exercise 1.2.42 that
UF 0
= .
F0
u0
UF 0 w .
F0
In the next two steps we prove that u0 is the desired solution of (5.3.22).
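Step 7. Let $v \in X$. Choose $F_0$ such that $v \in F_0$, and let $F \supset F_0$. If $u_F \in F$ is a solution of the equation (5.3.24), then condition (iv) implies that
$$\begin{aligned}
0 &\le \langle T(v) - T(u_F), v - u_F \rangle = \langle T(v), v - u_F \rangle - \langle T(u_F), v - u_F \rangle \\
&= \langle T(v), v - u_F \rangle - \langle T(u_F), j_F(v - u_F) \rangle = \langle T(v), v - u_F \rangle - \langle T_F(u_F), v - u_F \rangle \\
&= \langle T(v), v - u_F \rangle.
\end{aligned}$$
Thus
$$\langle T(v), v - u \rangle \ge 0 \quad \text{for all } u \in U_{F_0}. \tag{5.3.25}$$
By the definition of weak topology (see Remark 2.1.23), (5.3.25) is valid even for arbitrary $u \in \overline{U_{F_0}}^{\,w}$. In particular, we then have
$$\langle T(v), v - u_0 \rangle \ge 0. \tag{5.3.26}$$
Step 8 (the Minty trick). Choose $w \in X$, $t > 0$, and put $v = u_0 + tw$ in the inequality (5.3.26). Then
$$0 \le \langle T(u_0 + tw), tw \rangle = t\,\langle T(u_0 + tw), w \rangle,$$
i.e., $\langle T(u_0 + tw), w \rangle \ge 0$. Letting $t \to 0+$ and using the demicontinuity of $T$, we obtain
$$\langle T(u_0), w \rangle \ge 0 \quad \text{for every } w \in X,$$
i.e.,
$$T(u_0) = o.$$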
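$$-\bigl(|\dot x(t)|^{p-2}\dot x(t)\bigr)^{\cdot} + g(x(t)) = f(t), \quad t \in (0,1), \qquad x(0) = x(1) = 0, \tag{5.3.29}$$
$$\|x\| = \Bigl(\int_0^1 |\dot x(t)|^p\,dt\Bigr)^{1/p},$$
$$T(x) = f^* \tag{5.3.30}$$
$$\int_0^1 |\dot x(t)|^{p-2}\dot x(t)\,\dot y(t)\,dt + \int_0^1 g(x(t))\,y(t)\,dt = \int_0^1 f(t)\,y(t)\,dt \tag{5.3.31}$$
holds for all $y \in X$. So, as in Example 5.2.51, to find a weak solution $x$ of (5.3.29) (i.e., $x$ satisfying (5.3.31)) is equivalent to finding a solution of (5.3.30). We have, by the Hölder inequality,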
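$$\begin{aligned}
\|J(x_n) - J(x)\| &= \sup_{\|y\| \le 1} |\langle J(x_n) - J(x), y \rangle| = \sup_{\|y\| \le 1} \Bigl| \int_0^1 \bigl(|\dot x_n(t)|^{p-2}\dot x_n(t) - |\dot x(t)|^{p-2}\dot x(t)\bigr)\,\dot y(t)\,dt \Bigr| \\
&\le \Bigl(\int_0^1 \bigl| |\dot x_n(t)|^{p-2}\dot x_n(t) - |\dot x(t)|^{p-2}\dot x(t) \bigr|^{\frac{p}{p-1}}\,dt\Bigr)^{\frac{p-1}{p}} \sup_{\|y\| \le 1} \Bigl(\int_0^1 |\dot y(t)|^p\,dt\Bigr)^{\frac{1}{p}} \\
&= \Bigl(\int_0^1 \bigl| |\dot x_n(t)|^{p-2}\dot x_n(t) - |\dot x(t)|^{p-2}\dot x(t) \bigr|^{\frac{p}{p-1}}\,dt\Bigr)^{\frac{p-1}{p}}.
\end{aligned} \tag{5.3.32}$$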
The last integral tends to zero as xn x 0 due to the continuity of the Nemytski
operator (x)(t) = (x(t))
y
1
sup
y
1
y(t) dt
p
p1
0
1
p
0
1
p
1
p
(5.3.33)
as $\|x_n - x\| \to 0$ (cf. Example 5.3.11 and the continuous embedding $W_0^{1,p}(0,1) \hookrightarrow C[0,1]$ for $p > 1$). It follows from (5.3.32) and (5.3.33) that $T$ is continuous and hence demicontinuous. The boundedness of $T$ follows from estimates similar to (5.3.32), (5.3.33) (the reader is invited to do it in detail!). We also have
1
8
9
p2
p2
x(t)
x(t)
y(t)
y(t)
(x(t)
y(t))
dt
T (x) T (y), x y =
0
+
0
p
x(t)
dt
p2
y(t)
y(t)
x(t)
dt
p2
x(t)
x(t)
y(t)
dt +
0
y(t)
dt
1
p
x(t)
dt
0
p1
= [x
x(t)
dt
p
y(t)
dt
0
0
1
p
1
p
x(t)
dt
p
p1
p1
p
y(t)
dt
+
1
p
y(t)
dt
0
p
x(t)
dt +
T (x), x =
0
g(x(t))x(t) dt
0
= xp +
[g(x(t)) g(0)](x(t) 0) dt +
0
i.e., $T$ is coercive. It follows then from Theorem 5.3.22 that there is a unique solution of (5.3.30) (which in turn is a unique weak solution of (5.3.29)).
The advantage of the Browder Theorem is more transparent in the case of partial differential equations when the embedding $W_0^{1,p}(\Omega) \hookrightarrow C(\overline{\Omega})$ does not hold in general, and so only the demicontinuity of $T$ can be proved.
An application of the more general Theorem 5.3.23 is postponed to the last chapter, Appendix 7.6A.
Exercise 5.3.25. Prove that the unique weak solution $x = x(t)$ of (5.3.29) belongs to $C^1[0,1]$, $|\dot x|^{p-2}\dot x$ is absolutely continuous and the equation
$$-\bigl(|\dot x(t)|^{p-2}\dot x(t)\bigr)^{\cdot} + g(x(t)) = f(t)$$
holds a.e. in $(0,1)$.
x(t)
+
(g(x( )) f ( ) d y(t)
dt = 0
0
(5.3.34)
Set
(g(x( )) f ( ) d .
p2
x(t)
+
M (t) = x(t)
a.e. in
(0, 1).
(5.3.35)
The assertion now follows from (5.3.35) as in the proof of Theorem 6.1.14.
Exercise 5.3.26. Prove the following assertion:
Let $T$ be a continuous mapping defined on a Banach space $X$ of finite dimension with values in $X^*$. Assume that there exists a real function $c = c(r)$, defined on the interval $(0, \infty)$, such that
$$\lim_{r \to \infty} c(r) = \infty \quad \text{and that} \quad \langle T(u), u \rangle \ge c(\|u\|)\,\|u\| \ \text{ holds for all } u \in X.$$
for
u B(o; R)
(5.3.36)
Apply the homotopy invariance property of the Brouwer degree and show that (5.3.36) implies that there exists $u_0 \in B(o; R)$ such that
$$F(u_0) = o, \quad \text{i.e.,} \quad T(u_0) = f.$$
x(t)
x(0) = x(1) = 0,
t (0, 1),
(5.3.37)
for any x, y R
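(cf. Section 5.3), we use the usual first semester calculus definition
$$\text{for any } x, y \in \mathbb{R} \text{ satisfying } x \le y \text{ we have } f(x) \le f(y). \tag{5.4.1}$$
$$\begin{aligned}
x \le y \quad &\text{provided} \quad y - x \in K, \\
x < y \quad &\text{provided} \quad x \le y \text{ and } x \ne y, \\
x \ll y \quad &\text{provided} \quad y - x \in \operatorname{int} K.
\end{aligned} \tag{5.4.2}$$
The set
$$[x, y] = \{z \in X : x \le z \le y\}$$
is called an order interval in $X$. Note that $x \ge y$ means $x - y \in K$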
We have
(1 , . . . , N ) (1 , . . . , N )
Example 5.4.6. The set $C$ in Figure 5.4.2 is a cone in $\mathbb{R}^2$ but it is not an order cone.
Figure 5.4.1. The order cone $K = \mathbb{R}^{2,+}$ in $\mathbb{R}^2$.
Figure 5.4.2. The cone $C$.
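$$\begin{aligned}
f \le g \quad &\text{if and only if} \quad f(x) \le g(x) \ \text{ for all } x \in \overline{\Omega}, \\
f \ll g \quad &\text{if and only if} \quad f(x) < g(x) \ \text{ for all } x \in \overline{\Omega}.
\end{aligned}$$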
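The following assertion summarizes the basic properties of ordering in a Banach space $X$.
Proposition 5.4.8. For all $u, x, x_n, y, y_n, z \in X$ and all $a, b \in \mathbb{R}$, we have
$$\begin{aligned}
&x \le x, \\
&x \le y \ \text{ and } \ y \le x \quad \text{imply} \quad x = y, \\
&x \le y \ \text{ and } \ y \le z \quad \text{imply} \quad x \le z.
\end{aligned}$$
Furthermore, we have
$$\begin{aligned}
&o \le x \le y \ \text{ and } \ 0 \le a \le b \quad \text{imply} \quad ax \le by, \\
&x \le y \ \text{ and } \ u \le z \quad \text{imply} \quad x + u \le y + z, \\
&x_n \le y_n \ \text{ for all } n \quad \text{implies} \quad \lim x_n \le \lim y_n
\end{aligned}$$
provided the limits exist. For the symbol $\ll$, the following implications hold:
$$\begin{aligned}
&x \ll y \ \text{ and } \ y \le z \quad \text{imply} \quad x \ll z, \\
&x \le y \ \text{ and } \ y \ll z \quad \text{imply} \quad x \ll z, \\
&x \ll y \ \text{ and } \ a > 0 \quad \text{imply} \quad ax \ll ay.
\end{aligned}$$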
Proof. Use (5.4.2) and the properties of $K$. For example, if $x_n \le y_n$ for all $n$, then $y_n - x_n \in K$. Since $K$ is closed and limits of $\{x_n\}_{n=1}^\infty$, $\{y_n\}_{n=1}^\infty$ exist, we conclude that
$$y - x \in K, \quad \text{i.e.,} \quad x \le y.$$
Definition 5.4.9. The order cone $K$ is called normal if there is a number $c > 0$ such that for all $x, y \in X$, $o \le x \le y$ we have
$$\|x\| \le c\,\|y\|.$$
Example 5.4.10. For $X = \mathbb{R}^N$, $K = \mathbb{R}^{N,+}$ is a normal order cone in $\mathbb{R}^N$. Similarly, $C^+(\overline{\Omega})$ is a normal order cone in $C(\overline{\Omega})$.
Lemma 5.4.11. If an order cone is normal, then every order interval $[x, y]$ is bounded in the norm.
Proof. If $x \le w \le y$, then $o \le w - x \le y - x$, and hence
$$\|w\| \le \|w - x\| + \|x\| \le c\,\|y - x\| + \|x\|.$$
Now we can introduce the definition of a monotone increasing operator between ordered Banach spaces.
Definition 5.4.12. Let $X$ and $Y$ be ordered Banach spaces. An operator $T : \operatorname{Dom} T \subset X \to Y$ is said to be monotone increasing if
$$x < y \quad \text{implies} \quad T(x) \le T(y)$$
implies
T (x) o.
Example 5.4.14. For a linear operator $T$, the concepts of (strictly, strongly) positive are the same as those of (strictly, strongly) monotone increasing. Indeed, let $T$ be positive, for example. Then we have the following sequence of implications:
$$x < y \implies o < y - x \implies o \le T(y - x) \implies o \le T(y) - T(x) \implies T(x) \le T(y).$$
and
$$v_{n+1} = T(v_n), \qquad n = 0, 1, 2, \dots. \tag{5.4.4}$$
[Figure: the ordered iterates $o$, $u_0$, $u_1$, $u_2$, $\dots$, $u$, $v$, $\dots$, $v_2$, $v_1$, $v_0$.]
The next definition is a basic definition of the existence theory for operator equations in ordered Banach spaces.
a supersolution) of (5.4.3). Then $\{u_n\}_{n=1}^\infty$ ($\{v_n\}_{n=1}^\infty$) in (5.4.4) converges if and only if this sequence is bounded above (below).50 In the case of convergence, the limit point $u$ is the smallest fixed point $u$ of $T$ with $u_0 \le u$ ($v$ is the greatest fixed point $v$ of $T$ with $v \le v_0$).51
Proof. We will consider the case of a subsolution. The case of a supersolution is very similar. Since $T$ is monotone increasing, we have the sequence of implications
$$u_0 \le T(u_0) = u_1 \implies u_1 \le u_2 \implies \cdots,$$
i.e., $u_0 \le u_1 \le u_2 \le \cdots$.
that $u_{n_k} \to u$. Since the sequence $\{u_n\}_{n=1}^\infty$ is monotone, all convergent subsequences have the same limit point. Therefore, the whole sequence $\{u_n\}_{n=1}^\infty$ converges to $u$ as well. Since $u_{n+1} = T(u_n)$, letting $n \to \infty$ shows that
$$u = T(u).$$
Let $w$ be a solution of (5.4.3) with $u_0 \le w$. Then $u_1 = T(u_0) \le T(w) = w$, etc., so that $u_n \le w$ for all $n$. Hence $u \le w$.
The intuitive meaning of the following assertion is demonstrated in Figure 5.4.3.
49 The terminology is not fixed. Instead of super- and subsolution the notions upper and lower solutions have been also used.
50 The set $M \subset X$ is bounded above if there is $m \in X$ such that $u \le m$ for any $u \in M$.
51 Note that the concepts of smallest and greatest are used in the usual sense, e.g., a smallest fixed point $u$ in $X$ is characterized by $u \le w$ for all other fixed points $w \ge u_0$.
Corollary 5.4.17 (Monotone Iterative Method). Let $X$ be a real Banach space with a normal order cone and $T : X \to X$. Assume that $u_0$ and $v_0$ are a subsolution and a supersolution of (5.4.3), respectively, and $u_0 \le v_0$. If $T$ is a compact monotone increasing operator on the order interval $[u_0, v_0]$, then both the iterative sequences $\{u_n\}_{n=1}^\infty$ and $\{v_n\}_{n=1}^\infty$ from (5.4.4) are defined, converge, and
$$u = \lim_{n \to \infty} u_n \quad \text{and} \quad v = \lim_{n \to \infty} v_n$$
are the smallest fixed point and the largest fixed point, respectively, of $T$ in $[u_0, v_0]$. Furthermore, we have the error estimates
$$u_n \le u \le v \le v_n \quad \text{for all } n = 0, 1, \dots$$
(Figure 5.4.3).
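The monotone iterative method can be tried out on the simplest ordered Banach space, the real line with the cone $[0, \infty)$. The sketch below is an illustration under assumptions: the map $T(u) = \sqrt{u + 2}$, the subsolution $u_0 = 0$ and the supersolution $v_0 = 4$ are hypothetical choices that do not come from the text. On $[0, 4]$ the map is increasing and its only fixed point is $2$.

```python
import math

def T(u):
    """Monotone increasing map on the order interval [0, 4] of the real line."""
    return math.sqrt(u + 2.0)

def iterate(start, steps):
    """Return [w_0, w_1, ..., w_steps] with w_{n+1} = T(w_n)."""
    seq = [start]
    for _ in range(steps):
        seq.append(T(seq[-1]))
    return seq

u0, v0 = 0.0, 4.0        # u0 <= T(u0) (subsolution), T(v0) <= v0 (supersolution)
us = iterate(u0, 40)     # increasing sequence u_n
vs = iterate(v0, 40)     # decreasing sequence v_n
fixed_point = 2.0        # 2 = sqrt(2 + 2) is the only fixed point of T in [0, 4]

for n in range(5):
    print(n, us[n], vs[n])
```

Each printed pair brackets the fixed point, which is the error estimate $u_n \le u \le v \le v_n$ in this one-dimensional setting (here $u = v = 2$).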
Proof. The relations $u_0 \le T(u_0)$, $T(v_0) \le v_0$ and $u_0 \le v_0$ together with the monotonicity of $T$ imply that $u_0 \le u_1 \le v_0$ and similarly that
$$u_n \le v_0 \quad \text{for all } n,$$
$$u(x) = \int_\Omega G(x, y)\,f(y, u(y))\,dy, \tag{5.4.5}$$
$$u = T(u),$$
Let us consider the normal order cone $C^+(\overline{\Omega})$ from Example 5.4.7. Then the operator $T : C(\overline{\Omega}) \to C(\overline{\Omega})$ is compact and monotone increasing.52
Considering subsolutions and supersolutions now means that we replace "$=$" by "$\le$" and "$\ge$", respectively, in the integral equation (5.4.5). Corollary 5.4.17 implies that if $u_0 \in C(\overline{\Omega})$ is a subsolution and $v_0 \in C(\overline{\Omega})$ is a supersolution with $u_0 \le v_0$ on $\overline{\Omega}$, then for $n \to \infty$ the iterative method
$$u_{n+1}(x) = \int_\Omega G(x, y)\,f(y, u_n(y))\,dy, \qquad n = 0, 1, 2, \dots,$$
52 The compactness of $T$ has been proved in Example 5.1.14; $f = f(y, u)$ increasing in $u$ immediately implies that $T$ is monotone increasing.
u0 (1) 0,
v0 (0) 0,
v0 (1) 0.53
(5.4.7)
We will show that $u_0$, $v_0$ are a subsolution and a supersolution, respectively, for the operator $w = T(z)$ defined by a solution of the problem
$$-\ddot w(t) + c\,w(t) = f(t, z(t)) + c\,z(t) \equiv F(t, z), \quad t \in (0,1), \qquad w(0) = w(1) = 0,$$
where $c > 0$ is chosen in such a way that
$$\frac{\partial f}{\partial s}(t, s) + c > 0 \quad \text{for } t \in [0,1] \text{ and } s \in I_0 = \Bigl[\min_{t \in [0,1]} u_0(t),\ \max_{t \in [0,1]} v_0(t)\Bigr].$$
Notice that $0 \in I_0$. The map $T$ is correctly defined because the Dirichlet problem
$$-\ddot w(t) + c\,w(t) = g(t), \quad t \in (0,1), \qquad w(0) = w(1) = 0 \tag{5.4.8}$$
has a unique solution for any fixed $g \in C[0,1]$. Then $T : C[0,1] \to C[0,1]$ is a compact operator. This follows from the fact that $T$ is composed from the Nemytski operator
$$N : z(t) \mapsto f(t, z(t)) + c\,z(t)$$
which is continuous, and a compact linear operator $A^{-1}$ where
$$A(w(t)) = -\ddot w(t) + c\,w(t),$$
(cf. Example 2.2.17), i.e.,
53 The functions $u_0$, $v_0$ are called a subsolution and supersolution of the boundary value problem (5.4.6), respectively.
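Putting
$$w = T(z_2) - T(z_1)$$
we get
$$-\ddot w(t) + c\,w(t) = f(t, z_2(t)) - f(t, z_1(t)) + c\,(z_2(t) - z_1(t)), \quad t \in (0,1), \qquad w(0) = w(1) = 0.$$
For $z_1 \le z_2$ with values in $I_0$, the choice of $c$ makes the right-hand side nonnegative, and so
$$-\ddot w(t) + c\,w(t) \ge 0, \qquad w(0) = w(1) = 0. \tag{5.4.9}$$
Assume that there is $t \in (0,1)$ such that $w(t) < 0$. Then there is $t_0 \in (0,1)$ such that
$$0 > w(t_0) = \min_{t \in [0,1]} w(t).$$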
i = 1, . . . , N.
Consider the order cone RN,+ from Example 5.4.5. Translate all the assumptions
and conclusions of Theorem 5.4.16 and Corollary 5.4.17 to this system.
54 The argument used to prove w(t) 0 in (0, 1) is a special version of the more general Maximum
Principle (see, e.g., Protter & Weinberger [102]). The monotonicity of T can be also shown by
proving that the Green function corresponding to the operator A (Example 2.2.17) is nonnegative.
t (0, 1),
x(0) = x(1) = 0.
(5.4.10)
(5.4.11)
y > o,
(5.4.12)
for all
x Kr ,
(5.4.13)
and which satises appropriate conditions. Furthermore, it is important to know a subsolution x0 , i.e.,
c > 0, x0 > o.
(5.4.14)
M (x0 ) cx0 ,
The general Minorant Principle:
If we know a subsolution of (5.4.11), then we can obtain a positive eigenvalue
with a positive eigenvector of (5.4.11),
is formulated precisely in the following two theorems.
Theorem 5.4.27 (Krasnoselski). Suppose that
(i) X is a real Banach space with an order cone K;
(ii) an operator T : Kr X X is compact and (5.4.13) holds;
(iii) a linear operator M : K X is positive, and there are an x0 > o and a positive
real number c such that (5.4.14) holds.
Then for every with 0 < < r the problem (5.4.11) has a positive solution (x, )
satisfying
x = .
Theorem 5.4.28 (Zeidler). Let us set
(x) = sup{t 0 : x tx0 }
for xed
x0 > o
and all
x K.
The conclusion of Theorem 5.4.27 still holds if we replace (iii) by the following condition:
(iii ) suppose that M : K X K is an operator, not necessarily linear, for which
there is an x0 > o and there are real numbers s with 0 < s 1 and c > 0 such
that
for all x Kr .
(5.4.15)
M (x) ((x))s cx0
Theorem 5.4.27 is a special case of Theorem 5.4.28. Indeed, since x (x)x0 for
x Kr , we have
M (x) (x)M (x0 ) (x)cx0
x Kr .
for all
n > 0,
where
Tn (x) T (x) +
xn > o,
xn =
x0
, n = 1, 2, . . . , 0 < r.
n
x
x
for
x
= o
For x K we set
S(x) = xTn (z(x)) +
and
z(o) = o.
( x)x0
.
n
(5.4.16)
xx0
( x)x0
x0
+
=
>o
n
n
n
for all
x K .
for all
x K .
It follows from the boundedness of S(K ) that there exists a number bn > 0 such that
0 < an S(x) bn
for all
x K .
(5.4.17)
By (5.4.17),
S(x)
S(x)
is well dened on K . Furthermore, the operator V : K K is compact on the closed,
bounded, and convex set K (why?). By Theorem 5.1.11 (the Schauder Fixed Point
Theorem) there is an xn K such that
V (x) =
xn = V (xn ).
Tn (xn )
In particular, xn = V (xn ) = , so z(xn ) = xn . Therefore xn = 2
S(x
, which
n )
n )
0 < a n b
n N.
for all
(5.4.18)
so
b sup n < .
nN
On the other hand, xn (xn )x0 implies that there exists such that
sup (xn ) < .
nN
o
as
n
.
Now
(5.4.16),
(xn ) as n , contradicting o < x0 (x
n)
(5.4.13) and (5.4.15) imply that
n xn = T (xn ) +
x0
x0
x0
M (xn ) +
.
n
n
n
((xn ))s c
,
n
so
n (xn )s1 c s1 c a.
Now, we pass to the limit n in (5.4.16). Using (5.4.18) and xn = , we can
i = 1, . . . , N,
(5.4.19)
N
ij j
xK
with
x r,
(5.4.20)
j=1
and i = 1, . . . , N . Assume that all the real numbers ij are nonnegative, and
that
N
ij > 0.
min
1iN
j=1
where
i =
N
ij j .
j=1
(5.4.21)
on a nite interval [a, b] with > 0. This time, for xed > 0 and r > 0, the key
condition (substituting (5.4.20)) is
f (s, x) x
for all
(5.4.22)
Applying Theorem 5.4.27 we have the following assertion (the Generalized Jentzsch Theorem):
Suppose $A : [a, b] \times [a, b] \to \mathbb{R}$ is continuous, nonnegative, and
$$\min_{t \in [a,b]} \int_a^b A(t, s)\,ds > 0.$$
Indeed, we write (5.4.21) as x = T (x) and apply Theorem 5.4.27 with X = C[a, b],
X + = C + [a, b], x0 (t) 1 and
M (x)(t)
and
Ker (I T )2 = Ker (I T ).
for all
x K X+.
and
x , x > 0
for a certain
x K.
always implies
x , x > 0.
Theorem 5.4.33 (Krein-Rutman). Let $X$ be a real Banach space with an order cone $K$ having nonempty interior. Then any linear, compact, and strongly positive operator $T : X \to X$ has the following properties:
(i) $T$ has exactly one eigenvector with $x > o$ and $\|x\| = 1$. The corresponding eigenvalue is $r(T)$ and it is algebraically simple. Furthermore, $x \gg o$.
(ii) If $\lambda \in \sigma(T)$, $\lambda \ne r(T)$, then $|\lambda| < r(T)$.
(iii) The dual operator $T^*$ has $r(T)$ as an algebraically simple eigenvalue with a strictly positive eigenvector $x^*$.
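In finite dimensions the statement reduces to the Perron-Frobenius situation and can be observed numerically. The following sketch is an illustration under assumptions: $X = \mathbb{R}^2$, $K = \mathbb{R}^{2,+}$, and the matrix `A` with positive entries is a hypothetical example (its eigenvalues are $4$ and $-1$). Power iteration is used here only as a convenient way to approximate the dominant eigenpair; it is not the method of the proof.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def power_iteration(A, steps=200):
    """Approximate the dominant eigenpair of a matrix with positive entries."""
    x = [1.0] * len(A)
    lam = 0.0
    for _ in range(steps):
        y = matvec(A, x)
        lam = max(abs(v) for v in y)     # eigenvalue estimate in the max norm
        x = [v / lam for v in y]         # renormalize so that max |x_i| = 1
    return lam, x

A = [[1.0, 2.0],
     [3.0, 2.0]]     # strongly positive on R^{2,+}; eigenvalues 4 and -1
lam, x = power_iteration(A)
print(lam, x)
```

The computed eigenvector has strictly positive components, and the second eigenvalue $-1$ has modulus smaller than the spectral radius $4$, in line with (i) and (ii).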
Remark 5.4.34. Recall that by the Riesz-Schauder theory (see Theorem 2.2.9), the spectrum of $T$ consists of at most countably many nonzero eigenvalues of finite multiplicity which can accumulate only at the origin, and $0 \in \sigma(T)$ whenever $\dim X = \infty$. The spectra of $T$ and $T^*$ coincide ($X$ is a real space).
Now, we will give proofs of Proposition 5.4.32 and Theorem 5.4.33.
Proof of Proposition 5.4.32. Let us consider $T$ on the complexification $X_{\mathbb{C}} = X + iX$. By the Riesz-Schauder theory (see Theorem 2.2.9), all of the nonzero points of the spectrum of $T$ consist of eigenvalues of finite multiplicity. The same holds for $T^*$. Note that
$$\sigma(T) \cap \{\lambda : |\lambda| = r(T)\} \ne \emptyset.$$
We consider the eigenvalues $\lambda$ of $T$ satisfying $|\lambda| = r(T)$, and distinguish three cases.
Case 1 ($\lambda_0 = r(T)$ is an eigenvalue). Our goal is to construct an $x > o$ and an $x^* > o$ such that
$$T(x) = \lambda_0 x \quad \text{and} \quad T^*(x^*) = \lambda_0 x^*.$$
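From footnote 3 on page 57 we have
$$(\lambda I - T)^{-1} u = \sum_{j=0}^{\infty} \frac{T^j u}{\lambda^{j+1}}$$
and, therefore,
$$(\lambda I - T)^{-1} u \ge o \quad \text{for } \lambda > r(T),\ \lambda > 0 \text{ and } u \ge o.$$
Since $T$ is compact, $\lambda_0 \ne 0$ is an eigenvalue of finite multiplicity (Remark 2.2.10) and in the Laurent series
$$(\lambda I - T)^{-1} = \sum_{n=-\infty}^{+\infty} (\lambda - \lambda_0)^n A_n, \tag{5.4.23}$$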
for all
k<n
and
Ak
= O
i.e.,
x > o.
u . Then
u K with u , x > 0. This is possible by Exercise 5.4.42. We set x = Tn
v o implies
x , v = u , Tn v 0
x , u = u , x > 0.
and
for
> 0.
By the Spectral Mapping Theorem (see Proposition 3.1.14(v)), all eigenvalues of T are
of the form + 2 where is an eigenvalue of T . One can check that 1 = 0 + 20 and
1 are the eigenvalues of T of greatest absolute value (the reader is asked to justify it!).
k
There is a sequence {k }
k=1 , k 0, such that Arg 1 is a rational multiple of 2 where
k
2
1 = 0 + k 0 (explain why!). According to Case 2, there is n N such that nk
1 > 0.
Since
n
lim nk
1 = 0 > 0,
k
we get a contradiction.
Before we prove Theorem 5.4.33 we need the following geometrical result.
Lemma 5.4.35. Let $X$ be a real Banach space with an order cone $K \equiv X^+$ containing an interior point. Let $u \gg o$. Then for every $v \notin K$ there is a uniquely determined number $\mu_u(v) > 0$ such that
(i) $0 \le \mu \le \mu_u(v)$ implies $u + \mu v \ge o$;
(ii) $\mu > \mu_u(v)$ implies $u + \mu v \notin K$.
In particular,
$$u + \mu v \gg o \ \text{ and } \ \mu > 0 \quad \text{imply} \quad \mu < \mu_u(v). \tag{5.4.24}$$
57 Any complex number $\lambda \ne 0$ can be written in the form $\lambda = |\lambda|\,e^{i \operatorname{Arg} \lambda}$.
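The number from Lemma 5.4.35 has an explicit form in the model case $X = \mathbb{R}^N$, $K = \mathbb{R}^{N,+}$. The sketch below is an illustration under assumptions: the symbol $\mu_u(v)$, the function name `mu` and the sample vectors are choices made here, not taken from the text. For $u \gg o$ and $v \notin K$, the ray $u + \mu v$ leaves the cone at the first value of $\mu$ for which some coordinate becomes negative.

```python
def mu(u, v):
    """Largest mu with u + mu*v >= o componentwise, for u >> o and v outside the cone."""
    if not all(ui > 0 for ui in u):
        raise ValueError("u must be an interior point of the cone")
    # Only coordinates with v_i < 0 can drive u_i + mu*v_i below zero.
    ratios = [ui / -vi for ui, vi in zip(u, v) if vi < 0]
    if not ratios:
        raise ValueError("v lies in the cone, so no finite bound exists")
    return min(ratios)

u = [2.0, 1.0]       # interior point of R^{2,+}
v = [1.0, -4.0]      # not in the cone
m = mu(u, v)
boundary_point = [ui + m * vi for ui, vi in zip(u, v)]
print(m, boundary_point)
```

For these data the ray meets the boundary of the cone at $u + \mu_u(v)\,v$, and any larger $\mu$ gives a point outside $K$, as in (i) and (ii).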
with
e>o
and
0 > 0.
0 e),
(5.4.25)
(5.4.26)
and choose
= e (x)
and
= x (e).
(5.4.27)
T (y) = x + y.
(5.4.28)
T (K)
We want to prove that
= {o}.
K
(5.4.29)
is an order cone in P and T : P P is strongly positive
Assume the opposite, then K
since T is strongly positive. According to Step 1, there exists a positive eigenvector e P
According to Step 2 we necessarily have
of T (and hence also of T ) such that e K.
e = e for a certain
= 0, R. But this fact combined with (5.4.28) implies that x
and y are linearly dependent, which is a contradiction, i.e., (5.4.29) is proved.
It now follows from (5.4.29) that no elements x + y with  +  > 0 belong
to K. In particular, x
K. Since int K
= implies that K is total, there exist nonzero
elements x int K and x int (K) such that
x = x x .58
There exists > 0 such that
T (x ) e.
Indeed, since e int K we nd > 0 large enough to satisfy e 1 T (x ) K.
58 Indeed, if v
0
x= u
v 0 .
int K, v0
= o and > 0 is small enough, then u = v0 + x int K, u
= o. Hence
So, we have
T (x) = T (x ) T (x ) T (x ) e,
i.e.,
e+
1
T (x) K.
= x + y + e,
(5.4.30)
Let A be the set of all elements of the form (5.4.30) which belong to K. We have just
shown that A =
f (0 ) = max f () M.
such that
It follows from the strict positivity of T that there exists > 0 such that
T (0 ) e.
Indeed, 0 K, 0
= o, implies T (0 ) int K. We then can nd > 0 small enough to
satisfy
T (0 ) e K.
0 e + (1 x + 1 y) o
where
1 x + 1 y = T (0 x + 0 y).
(5.4.31)
1 = 0 + 0 ,
(5.4.32)
Then
12 + 12 = (02 + 02 )( 2 + 2 ) = M 2 .
It follows from (5.4.31) that
1 = e + '
1
1
x+ '
0
1
1
y
0
1
0
2
1
0
2
=
M 2
,
(0 )2
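Step 5. We show that $\lambda_0$ is simple. Since $\dim \operatorname{Ker}(\lambda_0 I - T) = 1$ (Step 3), it is enough to prove
$$\operatorname{Ker}(\lambda_0 I - T)^2 = \operatorname{Ker}(\lambda_0 I - T).$$
Let
$$(\lambda_0 I - T)^2 (x) = o.$$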
By Step 2, this implies
(0 I T )x = e.
We want to show that = 0. Suppose
= 0. We may assume that > 0, for otherwise
we pass to x. Set 0 = 1
0 . Now x = 0 T (x + e) implies
x + e = 0 T (x + 2e)
and
x = 20 T 2 (x + 2e).
n
It follows by induction that x = n
0 T (x + ne), so
'
x(
x
n
for all
= n
e +
0T
n
n
n N.
(5.4.33)
e , x > 0
provided
x > o,
(5.4.34)
i.e., e is strictly positive. Indeed, let x > o. Then T (x) o and by Exercise 5.4.44,
e , T (x) > 0. So
0 e , x = T (e ), x = e , T (x) > 0.
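According to the Riesz-Schauder Theory (see Theorem 2.2.9),
$$\dim \operatorname{Ker}(\lambda_0 I - T) = \dim \operatorname{Ker}(\lambda_0 I - T^*)$$
which is equal to 1 by Steps 2 and 3. To prove that $\lambda_0$ is an algebraically simple eigenvalue of $T^*$ choose $x^* \in \operatorname{Ker}(\lambda_0 I - T^*)^2$. Let $y^* = \lambda_0 x^* - T^* x^*$. Then $y^* = \gamma e^*$ for a $\gamma \in \mathbb{R}$. For any $x > o$ we have
$$\gamma \langle e^*, x \rangle = \langle y^*, x \rangle = \langle x^*, \lambda_0 x - T x \rangle.$$
In particular, taking $x = e$ we obtain $\gamma = 0$, i.e., $y^* = o$ and $x^* \in \operatorname{Ker}(\lambda_0 I - T^*)$. This proves
$$\operatorname{Ker}(\lambda_0 I - T^*)^2 = \operatorname{Ker}(\lambda_0 I - T^*).$$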
This completes the proof of Theorem 5.4.33.
The authors want to point out that another proof of the Krein-Rutman Theorem can be found in, e.g., Takáč [126].
Corollary 5.4.36. Let X and T be as in Theorem 5.4.33. For every y > o, (5.4.12) has
exactly one solution x > o if > r(T ), and no such solution if r(T ). Moreover,
x T (x) = y
and
x > o, y > o
imply
(5.4.35)
has a unique solution for any y X. Since R : K K by the proof of Proposition 5.4.32,
hence y > o implies x > o. On the other hand, if r(T ) and there is a positive solution
x of (5.4.35) for y > o, then choosing e X as in Step 6 of the proof of Theorem 5.4.33
we arrive at
( r(T ))e , x = e , x T (x) = e , y > 0,
a contradiction. Finally, let x > o, y > o and
x T (x) = y
for a certain
R.
Then
( r(T ))e, x = e , x T (x) = e , y,
i.e.,
for all
x o,
then
r(S) r(T ).
If S(x) > T (x) for all x > o, then r(S) > r(T ).
Proof. Let
S(x) T (x)
for all
x o.
x(t) =
for all
t ,
(5.4.36)
x X,
In the next example we use some facts from the forthcoming Chapter 7. The reader who is not acquainted with the properties of the Laplace operator can skip this example or consider the one-dimensional case and replace the Laplace operator by the second derivative.
Example 5.4.40. Let us consider the eigenvalue problem for the Laplace operator subject to the homogeneous Dirichlet boundary conditions
$$-\Delta u(x) = \lambda u(x) \ \text{ in } \Omega, \qquad u(x) = 0 \ \text{ on } \partial\Omega, \tag{5.4.37}$$
where $\Omega$ is a bounded domain in $\mathbb{R}^N$ and $\partial\Omega$ is its boundary (cf. Remark 7.2.2). Then
(5.4.37) can be written in the form (5.4.36) with = 1 where A = A(t, s) is the Green
function associated with the Laplace equation with the homogeneous Dirichlet boundary
conditions. Since A is a positive continuous function A : R (see, e.g., Gilbarg &
Trudinger [59]), we can apply the result of Example 5.4.39.
Multiplying the equation in (5.4.37) by $u = u(x)$ ($u$ is a real function) and using the Green Formula (cf. footnote 7 on page 479), we find
$$\int_\Omega |\nabla u(x)|^2\,dx = \lambda \int_\Omega u^2(x)\,dx,$$
which shows that (5.4.37) has only positive real eigenvalues. It then follows from Example 5.4.39 (and hence from the Krein-Rutman Theorem) that (5.4.37) has the least eigenvalue $\lambda_1 > 0$ which is simple and which is the only eigenvalue of (5.4.37) having a positive eigenfunction $\varphi_1(x) > 0$, $x \in \Omega$.
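The one-dimensional case mentioned before the example can be explored numerically. The sketch below is an illustration under assumptions: $\Omega = (0,1)$, the Laplace operator is replaced by the second derivative, the grid size `n` is arbitrary, and the integral operator is discretized by a simple quadrature with the Green function $G(t,s) = \min(t,s)\,(1 - \max(t,s))$ of $-\ddot u = f$, $u(0) = u(1) = 0$. Power iteration on the positive matrix `A` then approximates the largest eigenvalue of the integral operator, whose reciprocal approximates the least eigenvalue $\pi^2$.

```python
import math

n = 50                       # number of interior grid points (an arbitrary choice)
h = 1.0 / (n + 1)
t = [(i + 1) * h for i in range(n)]

# Quadrature matrix of the Green operator: A[i][j] = h * G(t_i, t_j) > 0.
A = [[h * min(ti, sj) * (1.0 - max(ti, sj)) for sj in t] for ti in t]

def matvec(M, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

x = [1.0] * n
mu = 0.0
for _ in range(100):
    y = matvec(A, x)
    mu = max(y)              # entries stay positive because A has positive entries
    x = [v / mu for v in y]

lam1 = 1.0 / mu              # approximates the least eigenvalue pi**2 of -u'' = lam * u
print(lam1, math.pi ** 2)
```

The vector `x` approximates the first eigenfunction $\sin(\pi t)$ on the grid and has only positive entries, which is the discrete counterpart of $\varphi_1 > 0$.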
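Exercise 5.4.41. Show that if $\operatorname{int} K \ne \emptyset$, then $K$ is a total cone and construct an example of a cone which is not total.
Hint. If $y \in \operatorname{int} K$, then $y \pm \varepsilon x \in K$ for every $x \in X$ with $\varepsilon > 0$ sufficiently small. Thus $X = K - K$ because
$$x = \frac{(y + \varepsilon x) - (y - \varepsilon x)}{2\varepsilon}.$$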
Exercise 5.4.42. Show that for every $x \in K \setminus \{o\}$, there exists an $x^* \in X^*$ such that $\langle x^*, x \rangle > 0$.
Hint. Since $-x \notin K$ and $K$ is closed, $-x$ is an exterior point of $K$. Consequently, there is an open convex neighborhood $U$ of $-x$ which is disjoint from $K$. By the Separation Theorem for convex sets,59 there is an $x^* \in X^*$ with $x^*(K) \ge 0$ and $x^*(U) < 0$. Hence $\langle x^*, x \rangle > 0$.
Exercise 5.4.43. Show that if $K$ is total, then $K^*$ is an order cone on $X^*$.
Hint. $K \ne \{o\}$ implies $K^* \ne \{o\}$ by Exercise 5.4.42. Suppose $\pm x^* \in K^*$. We have to show that $x^* = o$. Indeed, $\langle \pm x^*, x \rangle \ge 0$ for all $x \in K$ implies $\langle \pm x^*, x \rangle \ge 0$ for all $x \in X$, because $K$ is total. Hence $x^* = o$.
Exercise 5.4.44. Let $x^* \in X^*$. Show that if $x^* > o$ (i.e., $x^* \ge o$ and $\langle x^*, y \rangle > 0$ for a $y > o$), then $\langle x^*, x \rangle > 0$ for all $x \in \operatorname{int} K$.
Hint. Suppose $\langle x^*, x \rangle = 0$ for an $x \in \operatorname{int} K$. Then $x - \varepsilon y \in K$ for sufficiently small $\varepsilon > 0$. Hence $\langle x^*, x - \varepsilon y \rangle \ge 0$, so $\langle x^*, y \rangle = 0$. This is a contradiction.
Exercise 5.4.45. Prove that the functional $v \mapsto \mu_u(v)$ from Lemma 5.4.35 is continuous.
Exercise 5.4.46. Apply the Krein-Rutman Theorem to the problems in Examples 2.1.32 and 2.2.17.
$$-\bigl(|\dot x(t)|^{p-2}\dot x(t)\bigr)^{\cdot} = f(t, x(t)), \quad t \in (0,1), \qquad x(0) = x(1) = 0, \tag{5.4.38}$$
as a model example. A special case of it was studied in Examples 5.2.51 and 5.3.24. However, in this appendix we work in different function spaces. Here $p > 1$ is a real number and $f : [0,1] \times \mathbb{R} \to \mathbb{R}$ is a function the properties of which will be specified later. By a solution of (5.4.38) we understand a function $x \in C^1[0,1]$ with $x(0) = x(1) = 0$ such that $|\dot x|^{p-2}\dot x$ is absolutely continuous and the equation in (5.4.38) holds a.e. in $(0,1)$. Clearly, the problem (5.4.38) formally coincides with (5.4.6) if $p = 2$.
Definition 5.4.47. A function $u_0 \in C^1[0,1]$ with $|\dot u_0|^{p-2}\dot u_0$ absolutely continuous is called a subsolution of (5.4.38) if
$$u_0(0) \le 0, \quad u_0(1) \le 0 \quad \text{and} \quad -\bigl(|\dot u_0(t)|^{p-2}\dot u_0(t)\bigr)^{\cdot} \le f(t, u_0(t)) \quad \text{for a.e. } t \in (0,1),$$
t (0, 1).
and
either
or
x(0) = y(0)
and
x(0)
< y(0),
= f (t, y(t)),
t (0, 1),
(x(t)
x(0) = x(1) = 0,
(5.4.39)
(5.4.40)
This condition is satisfied if, e.g., $f(t, x(t)) = h(t) - g(x(t))$ where $h \in L^\infty(0,1)$ and $g : \mathbb{R} \to \mathbb{R}$ is a continuous function (cf. Examples 5.2.51 and 5.3.24).
We prove that the operator $T$ is compact. To this purpose we express $T$ in the integral form. By the Rolle Theorem for any $x = T(y)$ there exists $t_x \in [0,1]$ such that $\dot x(t_x) = 0$, i.e.,
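$$\dot x(t) = \Bigl| \int_t^{t_x} f(\tau, y(\tau))\,d\tau \Bigr|^{p'-2} \int_t^{t_x} f(\tau, y(\tau))\,d\tau \tag{5.4.41}$$
and
$$x(t) = \int_0^t \Bigl| \int_\sigma^{t_x} f(\tau, y(\tau))\,d\tau \Bigr|^{p'-2} \int_\sigma^{t_x} f(\tau, y(\tau))\,d\tau\,d\sigma \tag{5.4.42}$$
where $p' = \dfrac{p}{p-1}$.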
$t_x$
(5.4.43)
60 Here $\dot x(0)$ and $\dot x(1)$ mean the derivative from the right and from the left, respectively.
from $C[0,1]$ into $C[0,1]$, and (5.4.41), (5.4.42) imply that $x_n \to x_0$ in $C_0^1[0,1]$ where $x_n = T(y_n)$, $x_0 = T(y_0)$, i.e., $T$ is continuous. Let $M \subset C_0^1[0,1]$ be a bounded set. To prove the compactness of $T$ we have to show that $T(M)$ is relatively compact. Let $\{x_n\}_{n=1}^\infty \subset T(M)$ be an arbitrary sequence, $x_n = T(y_n)$, $y_n \in M$. It follows from the compact embedding $C_0^1[0,1] \hookrightarrow C[0,1]$ (see Theorem 1.2.13) that there exists a subsequence $\{y_{n_k}\}_{k=1}^\infty \subset \{y_n\}_{n=1}^\infty$ which converges uniformly on $[0,1]$. But the continuity of the Nemytski operator (5.4.43) and (5.4.41), (5.4.42) imply that $\{x_{n_k}\}_{k=1}^\infty$ converges in $C_0^1[0,1]$, i.e., $T(M)$ is relatively compact. Hence the compactness of $T$ is proved.
The following assertion is referred to as a well-ordered case of supersolution and subsolution.
Theorem 5.4.49 (well-ordered case). Let $f$ be a Carathéodory function satisfying (5.4.40). Assume that $u_0$ and $v_0$ are a subsolution and a supersolution of (5.4.38), respectively, with $u_0 \le v_0$ (see Figure 5.4.4). Then the problem (5.4.38) has at least one solution $x$ satisfying
$$u_0 \le x \le v_0 \quad \text{in } [0,1].$$
If, moreover, u0 and v0 are strict and satisfy u0 v0 , then there exists R0 > 0 such
that for, all R > R0 ,
deg (I T, 1 , o) = 1
where
v0
u0
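$$\tilde f(t, y) = \begin{cases}
f(t, y) & \text{if } u_0(t) \le y \le v_0(t), \\
f(t, u_0(t)) & \text{if } y \le u_0(t), \\
f(t, v_0(t)) & \text{if } y \ge v_0(t).
\end{cases}$$
Every solution of
$$-\bigl(|\dot x(t)|^{p-2}\dot x(t)\bigr)^{\cdot} = \tilde f(t, x(t)), \quad t \in (0,1), \qquad x(0) = x(1) = 0, \tag{5.4.44}$$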
is a solution of (5.4.38). Indeed, assume that x solves (5.4.44) and x > v0 in an interval
I+ (0, 1) and x = v0 on I+ . Then
1
0
1
 dx(t) p2 dx(t) d
f (t, v0 (t))(x(t) v0 (t)) dt
(x(t) v0 (t)) dt =
 dt dt dt
0
(5.4.45)
where
(x(t) v0 (t)) =
x(t) v0 (t)
0
on
on
I+ ,
[0, 1] \ I+ .
(x(t)
v
(t))
dt
x(t)
v 0 (t)p2 v 0 (t))(x(t)
v 0 (t)) dt 0.
I+
t (0, 1).
t (0, 1).
x(t))
= f(t, y(t)),
t (0, 1),
x(0) = x(1) = 0
for $y \in C_0^1[0,1]$. Then $T : C_0^1[0,1] \to C_0^1[0,1]$ is compact62 and the solutions of (5.4.44) are in a one-to-one correspondence with the fixed points of $T$. The definition of $\tilde f$ ensures that there exists a constant $R_0 > 0$ such that for any $y \in C_0^1[0,1]$ we have
$$\|T(y)\|_{C_0^1[0,1]} < R_0 \tag{5.4.47}$$
(see (5.4.41), (5.4.42)). By the Schauder Fixed Point Theorem $T$ has a fixed point $x$ in $B(o; R_0)$, i.e., $x$ is a solution of (5.4.44). It follows from the above considerations that $u_0 \le x \le v_0$, and so $x$ is also a desired solution of (5.4.38).
The proof of the second part follows from the fact that due to (5.4.47), we can
construct an admissible homotopy
H(, ) I T ,
[0, 1],
61 Note
62 The
sp2 s
for a.e.
t (0, 1)
sR
(5.4.48)
and, moreover,
lim
s
f (t, s)
= 1 .63
sp2 s
(5.4.49)
v0
t0
u0
f (t, y)
fr (t, y) = (1 + r y)f (t, y)
0
63 Here
if
if
if
y < r,
r < y < r + 1,
y > r + 1.
Next we show that there is $K > 0$ such that for any $r > 0$ and for any possible solution of
$$-\bigl(|\dot x(t)|^{p-2}\dot x(t)\bigr)^{\cdot} = f_r(t, x(t)), \quad t \in (0,1), \qquad x(0) = x(1) = 0, \tag{5.4.50}$$
the following a priori estimate holds:
$$\|x\|_{C_0^1[0,1]} \le K. \tag{5.4.51}$$
To prove this fact we argue by contradiction, and thus we assume that for any k N
there are rk > 0, xk S solving
t (0, 1),
(x k (t)p2 x k (t)) = frk (t, xk (t)),
(5.4.52)
xk (0) = xk (1) = 0,
and satisfying xk k. Set yk
xxkk
and divide (5.4.50) by xk p1 to obtain
y (0) = y (1) = 0.
k
yk (t) =
p
p (y k (0)) +
frk (, xk ())
d
xk p1
(5.4.53)
d ,
t [0, 1],
(5.4.54)
in
for a
y C0 [0, 1].64
in
C01 [0, 1]
(note that without loss of generality we may also assume that $\{\dot y_k(0)\}_{k=1}^{\infty}$ forms a convergent sequence!). It follows from (5.4.54), (5.4.48), (5.4.49) and the Lebesgue Dominated Convergence Theorem that $y$ solves the problem
$$-\bigl(|\dot y(t)|^{p-2}\dot y(t)\bigr)^{\cdot} = \lambda_1 |y(t)|^{p-2}y(t),\quad t\in(0,1),\qquad y(0)=y(1)=0.$$
Since $\|y\| = 1$, it follows that $y$ is a nonzero multiple of the first eigenfunction $\varphi_1(t) > 0$ in $(0,1)$ (see Example 5.2.51). If $y>0$ in $(0,1)$, then we find that $x_k(t)\to\infty$ for any $t\in(0,1)$, which contradicts $x_k\in S$. Also $y<0$ in $(0,1)$ leads to a contradiction. Hence the a priori estimate (5.4.51) is proved.
64 This
Now choose
$$R > R_0 = \max\{K, \|u_0\|_{C[0,1]}, \|v_0\|_{C[0,1]}\} + 1$$
and consider (5.4.50) with $r=R$ and $x_k = x$, i.e.,
$$-\bigl(|\dot x(t)|^{p-2}\dot x(t)\bigr)^{\cdot} = f_R(t,x(t)),\quad t\in(0,1),\qquad x(0)=x(1)=0. \tag{5.4.55}$$
there exists > 0 such that x(t) < R 1 for t [t0 , t0 + ). But fR (t, x(t)) = 0 by
denition, so x(t) R 2 in (t0 , t0 + ]. Since this implies that x(t) R 2 in (t0 , 1],
we obtain a contradiction. The same argument applies to R + 2. Notice also that v0
and u0 .
Now, let us define $T_R : C_0^1[0,1]\to C_0^1[0,1]$ by
$$x := T_R(y)$$
where $x$ is a solution of the problem
$$-\bigl(|\dot x(t)|^{p-2}\dot x(t)\bigr)^{\cdot} = f_R(t,y(t)),\quad t\in(0,1),\qquad x(0)=x(1)=0,$$
and define the sets
S {x C01 [0, 1] : x },
Su0 {x C01 [0, 1] : u0 x }
and
=R+2
v0
t0
u0
= R 2
Figure 5.4.6.
By definition, $T_R$ and $T$ coincide in the ball $B(o;R)$. Applying Theorem 5.4.49 and Theorem 5.2.13(iv) we obtain
1 = deg (I TR , B(o; R) S , o)
= deg (I TR , B(o; R) Sv0 , o) + deg (I TR , B(o; R) Su0 , o)
+ deg (I TR , 2 , o) = 2 + deg (I TR , 2 , o),
Remark 5.4.51. There are several applications of Theorems 5.4.49 and 5.4.50. Generalizations of these results to the case of partial differential equations can also be found in the literature, see, e.g., Drábek, Girg & Manásevich [40].
In the next assertion we present one application of Theorems 5.4.49 and 5.4.50 which under suitable assumptions on $f$ yields the multiplicity of solutions of (5.4.38).
Theorem 5.4.52. Let $f$ be as in Theorem 5.4.50 and let $u_0^i$ and $v_0^i$, $i=1,2$, be subsolutions and supersolutions of (5.4.38), respectively, which satisfy
$$u_0^1 \le v_0^1,\qquad u_0^2\le v_0^2,$$
Figure 5.4.7.
Proof. It follows from Theorem 5.4.49 that there are solutions $x_i = x_i(t)$, $i=1,2$, of (5.4.38) which satisfy
$$u_0^1\le x_1\le v_0^1,\qquad u_0^2\le x_2\le v_0^2.$$
Now, let us apply Theorem 5.4.50 with a subsolution $u_0^2$ and a supersolution $v_0^1$. We get another solution $x_3 = x_3(t)$ of (5.4.38). Clearly, all $x_i$, $i=1,2,3$, are mutually different.
Exercise 5.4.53. Prove that $\Omega_1$ from Theorem 5.4.49 is an open set in $C_0^1[0,1]$.
Exercise 5.4.54. Formulate conditions on $f=f(t,x)$ which guarantee that the problem (5.4.38) has a pair of well-ordered supersolution and subsolution.
Exercise 5.4.55. Formulate conditions on $f=f(t,x)$ which guarantee that the problem (5.4.38) has a pair of non-well-ordered supersolution and subsolution.
Exercise 5.4.56. Formulate conditions on $f=f(t,x)$ which guarantee that the problem (5.4.38) has two pairs of supersolutions and subsolutions which satisfy the assumptions from Theorem 5.4.49.
Chapter 6
Variational Methods
6.1 Local Extrema
In this section we present necessary and/or sufficient conditions for local extrema of real functionals. The most famous ones are the Euler and Lagrange necessary conditions and the Lagrange sufficient condition. We also present the brachistochrone problem, one of the oldest problems in the calculus of variations, and discuss regularity of the point of a local extremum. The methods presented in this section are motivated by the equation
$$f'(x) = 0 \tag{6.1.1}$$
(F (x) F (a)).
If the inequalities are strict, we speak about a strict local minimum (strict local maximum). If the functional $F$ has a (strict) local minimum or (strict) local maximum at $a$, we say that it has a (strict) local extremum at $a$.
In Figure 6.1.1 the critical point $a$ is not a point of extremum of $F$.
Figure 6.1.1.
The fundamental assertion is the following Euler (or Fermat) Necessary Condition.
Proposition 6.1.2 (Euler Necessary Condition). Let $F : X\to\mathbb{R}$ have a local extremum at $a\in X$. If for $v\in X$ the derivative $\delta F(a;v)$ exists, then
$$\delta F(a;v) = 0.$$
Proof. Set
$$g(t) = F(a+tv),\qquad t\in\mathbb{R}.$$
Definition 6.1.3. If
$$\delta F(a;v) = 0\quad\text{for all } v\in X,$$
Denition 4.3.6.
for any
v X,
(6.1.4)
can assume that $U$ is convex. Then $D^2F(a+tv)$ exists and is continuous for all $t\in[0,1]$.
On the other hand, it is well known that (6.1.4) does not imply that $F$ has a local extremum at the point $a$. To check that this is the case we can apply Theorem 6.1.5. If $F$ has continuous second partial derivatives in a neighborhood of $a$, then we should investigate the quadratic form
$$D^2F(a)(v,v) = \sum_{i,j=1}^{N}\frac{\partial^2 F}{\partial x_i\partial x_j}(a)\,v_iv_j. \tag{6.1.5}$$
To prove that $F$ has, e.g., a local minimum at $a$, it is enough to show that there exists $\alpha>0$ such that for any $v\in\mathbb{R}^N$, $\|v\|=1$,
$$D^2F(a)(v,v)\ge\alpha. \tag{6.1.6}$$
(Here we have used the fact that the quadratic form is homogeneous.) Since we are in finite dimension, the unit sphere in $\mathbb{R}^N$ is a compact set. Then (6.1.6) holds with an $\alpha>0$ whenever
$$D^2F(a)(v,v) > 0\quad\text{for all } \|v\|=1.^3 \tag{6.1.7}$$
The reader is invited to justify that (6.1.7) implies (6.1.6) and to explain why this is not the case when $\mathbb{R}^N$ is replaced by a space of infinite dimension.
It follows from linear algebra$^4$ that for any quadratic form on $\mathbb{R}^N$ there exists a basis $\{u_1,\dots,u_N\}$ of $\mathbb{R}^N$ and numbers $\lambda_1,\dots,\lambda_N$ such that for any $v$ of the form
$$v = \sum_{i=1}^{N}\xi_iu_i$$
we have
$$D^2F(a)(v,v) = \sum_{i=1}^{N}\lambda_i\xi_i^2.$$
The inequality (6.1.7) holds if and only if all $\lambda_i$, $i=1,\dots,N$, are positive, and so according to Theorem 6.1.5 the function $F$ has a strict local minimum at $a$. If there is at least one positive and at least one negative number among $\lambda_i$, $i=1,\dots,N$, then according to Proposition 6.1.4 the function $F$ does not have a local extremum at $a$.
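The sign test described above can be tried out numerically. The following is a minimal sketch, not part of the original text: it assumes the Hessian matrix at the critical point is already known, computes its eigenvalues with cyclic Jacobi rotations (a technique chosen here only because it needs no external library), and reads off the conclusion from their signs. The function names and the tolerance `eps` are illustrative choices.

```python
import math


def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-24):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [list(row) for row in matrix]
    for _ in range(sweeps):
        # Sum of squares of the off-diagonal entries; zero means diagonal.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that annihilates the entry (p, q).
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # columns p and q of A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rows p and q of J^T (A J)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def classify_critical_point(hessian, eps=1e-10):
    """Second-order test at a critical point from the signs of the eigenvalues."""
    lam = symmetric_eigenvalues(hessian)
    if lam[0] > eps:
        return "strict local minimum"
    if lam[-1] < -eps:
        return "strict local maximum"
    if lam[0] < -eps and lam[-1] > eps:
        return "no local extremum"
    return "inconclusive"  # some eigenvalue is (numerically) zero


# Hessians of x^2 + xy + y^2 and x^2 + 3xy + y^2 at the origin.
print(classify_critical_point([[2.0, 1.0], [1.0, 2.0]]))
print(classify_critical_point([[2.0, 3.0], [3.0, 2.0]]))
```

A zero eigenvalue is reported as inconclusive because in that case the quadratic form alone does not decide the question.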
Before we give an application in an infinite dimensional space, we prove the following assertion for convex functionals.
3 Here we use the fact that a positive continuous function on a compact set achieves its minimal value which has to be positive.
4 See also Corollary 6.3.9. (Remember that $\frac{\partial^2F}{\partial x_i\partial x_j}(a) = \frac{\partial^2F}{\partial x_j\partial x_i}(a)$.)
and
F (o; v) = 0
for any v X
(i.e., o X is a critical point). Assume that F does not achieve the minimum
value over X at o X. Then there exists u X for which F (u) = < 0. The
convexity of F implies that
F (tu + (1 t)o) t
i.e.,
F (tu) F (o)
< 0.
t
(6.1.8)
The following result will be needed several times in the further text.
Lemma 6.1.9 (Fundamental Lemma in Calculus of Variations). Let $I$ be an open interval and $f\in L^1_{\mathrm{loc}}(I)$. If
$$\int_I f(x)\varphi'(x)\,dx = 0\quad\text{for any } \varphi\in\mathcal{D}(I),^5 \tag{6.1.9}$$
in the L1 (R)norm6
whenever y n1 , y + n1 J , by the assumption (6.1.9), (g n )(y) is constant
for all such y. The convergence of g n to g means that g is constant a.e. in J ,
i.e., f = const. a.e. in I.
One of the oldest problems in the calculus of variations is studied in detail in the following example.
Example 6.1.10 (Brachistochrone Problem). The problem is formulated as follows:
For two given points $A$ and $B$ in a vertical plane find a curve connecting $A$ and $B$ which is optimal among all other such curves in the following sense. The point $P$ of unit mass which starts from $A$ with zero velocity and moves along this curve only due to the gravitational force will reach the point $B$ in a minimal time.$^7$
In order to find a suitable mathematical model we shall assume that the points $A=(0,0)$ and $B=(a,b)$, $b\ge0$, are situated in a vertical plane with the coordinate system chosen as in Figure 6.1.2. The reader is invited to verify that such a position of $A$ and $B$ can be considered without loss of generality. We shall concentrate first only on curves which are graphs of nonnegative functions $y=u(x)$ which belong to the space $C^1[0,a]$.
The point $P$ moves according to Newton's Second Law. The resulting force is a composition of the gravitational force and the reaction force of the constraint (the point $P$ moves along the given curve). The resulting direction is given by the tangent line of the curve, see Figure 6.1.2.
Newton's Second Law says that for the velocity $v$ of the point $P$ the following identity holds:
$$m\dot v = F = mg\cos\varphi$$
(see Figure 6.1.2). Multiplying this identity by $v$ and taking into account that $\dot x = v\cos\varphi$, we obtain
$$\Bigl(\frac12v^2\Bigr)^{\cdot} = gv\cos\varphi = g\dot x,$$
i.e.,
$$\frac12v^2 = gx \tag{6.1.10}$$
(the Principle of Conservation of Energy).
Figure 6.1.2. The $x$-axis is oriented in the (downward) direction of the gravitational force.
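Since the point $P$ moves along the graph of $u=u(x)$, its trajectory $s=s(t)$ is given by
$$s(t) = \int_0^{x(t)}\sqrt{1+(u'(x))^2}\,dx.^8 \tag{6.1.11}$$
Hence
$$v(t) = \frac{ds(t)}{dt} = \frac{ds(t)}{dx}\frac{dx}{dt} = \sqrt{1+(u'(x(t)))^2}\,\dot x(t).$$
Using (6.1.10) and the strict monotonicity of $x$ we have
$$\frac{dt}{dx} = \sqrt{\frac{1+(u'(x))^2}{2gx}}.$$
Integrating over $[0,a]$, the time needed to reach $B$ along the graph of $u$ is
$$F(u) = \int_0^a\sqrt{\frac{1+(u'(x))^2}{2gx}}\,dx. \tag{6.1.12}$$
8 We use the formula for the length of a curve given by the graph of $u=u(x)$: $s = \int\sqrt{1+(u'(x))^2}\,dx$.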
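$$\tilde F(w) = F\Bigl(w+\frac ba x\Bigr) = \int_0^a\sqrt{\frac{1+\bigl(w'(x)+\frac ba\bigr)^2}{2gx}}\,dx$$
where
$$w\in C_0^1[0,a] := \{w\in C^1[0,a] : w(0)=w(a)=0\}.$$
We equip $C_0^1[0,a]$ with the norm
$$\|u\|_{C_0^1[0,a]} = \Bigl(\int_0^a|u'(x)|^2\,dx\Bigr)^{1/2}.$$
For a given $h\in C_0^1[0,a]$ we have (see Corollary 3.2.14 and Example 3.2.21)
$$\delta\tilde F(w;h) = \int_0^a\frac{w'(x)+\frac ba}{\sqrt{2gx\bigl[1+\bigl(w'(x)+\frac ba\bigr)^2\bigr]}}\,h'(x)\,dx.$$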
The Euler Necessary Condition (Proposition 6.1.2) for the original variable $u$ reads
$$\int_0^a\frac{u'(x)}{\sqrt{2gx[1+(u'(x))^2]}}\,h'(x)\,dx = 0\quad\text{for all } h\in C_0^1[0,a]. \tag{6.1.13}$$
Let us denote
$$M(x) = \frac{u'(x)}{\sqrt{2gx[1+(u'(x))^2]}},\qquad x\in(0,a).$$
Applying Lemma 6.1.9 we obtain that there is a constant $K\in\mathbb{R}$ such that $M(x)=K$ a.e. in $(0,a)$. However, the continuity of $M$ actually implies that
$$\frac{u'(x)}{\sqrt{2gx[1+(u'(x))^2]}} = K \tag{6.1.14}$$
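$$\Bigl(1-\frac{x}{2c}\Bigr)(u'(x))^2 = \frac{x}{2c},\qquad x\in[0,a]. \tag{6.1.15}$$
Hence $0\le\frac{x}{2c}<1$. After the change of variables $x = c(1-\cos\tau)$, $\tau\in[0,\tau_0]$ (here $\tau_0<\pi$ is such that $a = c(1-\cos\tau_0)$) we obtain
$$\frac{du}{d\tau} = \frac{du}{dx}\,c\sin\tau$$
and, by (6.1.15),
$$\Bigl(\frac{du}{d\tau}\Bigr)^2 = c^2(1-\cos\tau)^2.$$
Hence
$$u(\tau) = c(\tau-\sin\tau),\qquad \tau\in[0,\tau_0].$$
(Notice that the integration constant is zero since $u(0)=0$, and only the sign plus corresponds to our problem.) Hence the parametric equations of the graph of $u$ are given by
$$x = c(1-\cos\tau),\qquad y = c(\tau-\sin\tau),\qquad \tau\in[0,\tau_0].$$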
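$$\frac ba = \frac{\tau_0-\sin\tau_0}{1-\cos\tau_0},\qquad \tau_0\in(0,\pi). \tag{6.1.16}$$
Since the function
$$\tau\mapsto\frac{\tau-\sin\tau}{1-\cos\tau},\qquad \tau\in(0,\pi),$$
is strictly increasing with the supremum (over $(0,\pi)$) equal to $\frac\pi2$, we have that for $0\le\frac ba<\frac\pi2$ the functional $\tilde F$ has a unique critical point $v\in C_0^1[0,a]$ such that the graph of the function $u(x) = v(x)+\frac bax$ has parametric equations
$$x = a\,\frac{1-\cos\tau}{1-\cos\tau_0},\qquad y = a\,\frac{\tau-\sin\tau}{1-\cos\tau_0},\qquad \tau\in[0,\tau_0], \tag{6.1.17}$$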
Let us return to the case $\frac ba<\frac\pi2$. It still remains to show that the solution (6.1.17) is a global minimum of $\tilde F$ over $C_0^1[0,a]$. This follows from Proposition 6.1.8. Indeed, the function
$$z\mapsto\sqrt{1+z^2}$$
is convex in $\mathbb{R}$. This immediately implies that the functional $\tilde F$ is convex on $C_0^1[0,a]$ (the reader is invited to prove both facts in detail). Hence the unique critical point of $\tilde F$ in $C_0^1[0,a]$ must be the point of its global minimum.
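The minimality of the cycloid can be checked numerically on one instance. The sketch below is an illustration only: it evaluates the travel-time functional (6.1.12) by quadrature for the cycloid and for the straight segment joining the same end points. The value of `G`, the parameters `c` and `tau0`, and all function names are assumptions made for this example; the cycloid slope used in the code is $u'(x)=\sqrt{x/(2c-x)}$, which follows from $x=c(1-\cos\tau)$, $y=c(\tau-\sin\tau)$.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def travel_time(du, a, n=20000):
    """Approximate F(u) = int_0^a sqrt((1 + u'(x)^2) / (2 g x)) dx.

    The substitution x = s^2 removes the 1/sqrt(x) singularity at x = 0;
    the remaining integral over [0, sqrt(a)] is computed by the midpoint rule.
    """
    h = math.sqrt(a) / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += 2.0 * math.sqrt((1.0 + du(s * s) ** 2) / (2.0 * G))
    return total * h


def compare_cycloid_and_line(c=1.0, tau0=2.0):
    """Travel times along the cycloid and along the chord with equal end points."""
    a = c * (1.0 - math.cos(tau0))   # vertical drop (x-axis points downward)
    b = c * (tau0 - math.sin(tau0))  # horizontal distance
    t_cycloid = travel_time(lambda x: math.sqrt(x / (2.0 * c - x)), a)
    t_line = travel_time(lambda x: b / a, a)
    return t_cycloid, t_line


t_cycloid, t_line = compare_cycloid_and_line()
print(t_cycloid, t_line)
```

Along the cycloid the integral can also be evaluated in closed form as $\tau_0\sqrt{c/g}$, which gives an independent check of the quadrature.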
Let us now consider a more general situation. Namely, let
$$M = \{u\in C^1[a,b] : u(a)=u_1,\ u(b)=u_2\},$$
and let us introduce the functional
$$F(u) = \int_a^bf(x,u(x),u'(x))\,dx,\qquad u\in M,$$
where $f=f(x,y,z)$ is a function defined on $[a,b]\times\mathbb{R}^2$ with continuous second partial derivatives with respect to all its variables. This assumption will hold throughout the rest of this section. Applying the Euler Necessary Condition (Proposition 6.1.2) we get the following assertion.
Proposition 6.1.11. Let $u_0\in M$ be a local extremum of $F$ with respect to $M$. Then the function
$$x\mapsto\frac{\partial f}{\partial z}(x,u_0(x),u_0'(x)) \tag{6.1.18}$$
is continuously differentiable on $[a,b]$ and
$$\frac{\partial f}{\partial y}(x,u_0(x),u_0'(x)) - \frac{d}{dx}\frac{\partial f}{\partial z}(x,u_0(x),u_0'(x)) = 0 \tag{6.1.19}$$
for all $x\in[a,b]$.
Proof. Let us first assume $u_1=u_2=0$. Let $w\in C_0^1[a,b]$. Since
$$0 = \delta F(u_0;w) = \int_a^b\Bigl[\frac{\partial f}{\partial y}(x,u_0(x),u_0'(x))\,w(x) + \frac{\partial f}{\partial z}(x,u_0(x),u_0'(x))\,w'(x)\Bigr]dx,$$
we get, by integrating by parts,
$$\int_a^b\Bigl[\frac{\partial f}{\partial z}(x,u_0(x),u_0'(x)) - \int_a^x\frac{\partial f}{\partial y}(\xi,u_0(\xi),u_0'(\xi))\,d\xi\Bigr]w'(x)\,dx = 0. \tag{6.1.20}$$
Using Lemma 6.1.9 we get from (6.1.20) that there is $c\in\mathbb{R}$ such that
$$\frac{\partial f}{\partial z}(x,u_0(x),u_0'(x)) - \int_a^x\frac{\partial f}{\partial y}(\xi,u_0(\xi),u_0'(\xi))\,d\xi = c \tag{6.1.21}$$
for all $x\in[a,b]$. This equality shows that the function (6.1.18) is continuously differentiable and (6.1.19) holds for all $x\in[a,b]$.
In a general case we can consider $u - \frac{u_2-u_1}{b-a}(x-a) - u_1$ instead of $u$ and apply the previous result on the transformed functional.
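As an illustration of how the Euler equation (6.1.19) arises from the stationarity of $F$, the following sketch discretizes one concrete functional. The integrand $f(x,y,z)=\frac12z^2+\frac12y^2$ on $[0,1]$ with $u(0)=0$, $u(1)=1$ is a hypothetical example chosen here, not one from the text; its Euler equation is $u-u''=0$ with solution $u(x)=\sinh x/\sinh 1$. Setting the partial derivatives of the discretized functional (difference quotients for $u'$, trapezoidal rule for $u^2$) to zero gives a tridiagonal linear system, solved below by elimination.

```python
import math


def solve_discrete_euler(n=50, left=0.0, right=1.0):
    """Stationary point of the discretized functional on a uniform grid.

    Interior equations: -u[i-1] + (2 + h^2) u[i] - u[i+1] = 0,
    the discrete counterpart of u - u'' = 0.
    """
    h = 1.0 / n
    m = n - 1                      # number of interior unknowns
    diag = [2.0 + h * h] * m
    rhs = [0.0] * m
    rhs[0] += left                 # boundary values moved to the right-hand side
    rhs[-1] += right
    # Forward elimination; all off-diagonal entries equal -1.
    for i in range(1, m):
        w = -1.0 / diag[i - 1]
        diag[i] += w
        rhs[i] -= w * rhs[i - 1]
    # Back substitution.
    u = [0.0] * m
    u[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        u[i] = (rhs[i] + u[i + 1]) / diag[i]
    return [left] + u + [right]


values = solve_discrete_euler(50)
error = max(abs(values[i] - math.sinh(i / 50) / math.sinh(1.0)) for i in range(51))
print(error)
```

The printed number is the largest deviation of the discrete stationary point from the exact solution of the Euler equation at the grid points.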
Remark 6.1.12. Equation (6.1.19) is the Euler Equation of the functional $F$. Taking the formal derivative of the second term in (6.1.19) we obtain
$$\frac{d}{dx}\frac{\partial f}{\partial z}(x,u_0(x),u_0'(x)) = \frac{\partial^2f}{\partial x\partial z}(x,u_0(x),u_0'(x)) + \frac{\partial^2f}{\partial y\partial z}(x,u_0(x),u_0'(x))\,u_0'(x) + \frac{\partial^2f}{\partial z^2}(x,u_0(x),u_0'(x))\,u_0''(x).$$
Hence (6.1.19) indicates that $u_0''(x)$ should exist. This motivates the following assertion.
Theorem 6.1.13 (Regularity of the classical solution). Let $u_0\in M$ be a local extremum of $F$ with respect to $M$, and let $x_0\in(a,b)$ be such that
$$\frac{\partial^2f}{\partial z^2}(x_0,u_0(x_0),u_0'(x_0)) \ne 0.$$
Then there exists $\delta>0$ such that $u_0\in C^2(x_0-\delta,x_0+\delta)$.
Proof. For $x\in[a,b]$ and $z\in\mathbb{R}$ define a function $\Phi$ by
$$\Phi(x,z) = \frac{\partial f}{\partial z}(x,u_0(x),z) - \int_a^x\frac{\partial f}{\partial y}(\xi,u_0(\xi),u_0'(\xi))\,d\xi - c$$
where $c$ is the constant from the proof of Proposition 6.1.11. The Implicit Function Theorem (see Theorem 4.2.1) implies that there exist $\delta_1>0$, $\varepsilon>0$ with the following properties: for any $x\in(x_0-\delta_1,x_0+\delta_1)$ there exists a unique $z(x)\in(u_0'(x_0)-\varepsilon,u_0'(x_0)+\varepsilon)$ such that
$$\Phi(x,z(x)) = 0.$$
Moreover, $z\in C^1(x_0-\delta_1,x_0+\delta_1)$. The continuity of $u_0'$ and the uniqueness of $z$ imply the existence of $\delta\in(0,\delta_1)$ such that
$$u_0'(x) = z(x)\quad\text{for } x\in(x_0-\delta,x_0+\delta).$$
more convenient to work in the Sobolev space $W^{1,2}(a,b)$ and to look for extrema of $F$ on the set
$$N = \{u\in W^{1,2}(a,b) : u(a)=u_1,\ u(b)=u_2\}.$$
Notice that it is not obvious whether the functional $F$ is well defined on the set $N$. We have to assume that $f$ satisfies certain growth conditions (see Theorem 3.2.24 and Remark 3.2.25; the Carathéodory property is guaranteed by the continuity of $f$ and its derivatives).
In this case we have
Theorem 6.1.14 (Regularity of the weak solution). Let $h\in L^2(a,b)$, $c_1\ge0$ be such that for a.a. $x\in[a,b]$ and for all $(y,z)\in\mathbb{R}^2$,
$$|f(x,y,z)| \le h(x)+c_1(y^2+z^2), \tag{6.1.22}$$
$$\Bigl|\frac{\partial f}{\partial y}(x,y,z)\Bigr| \le h(x)+c_1(|y|+|z|), \tag{6.1.23}$$
$$\Bigl|\frac{\partial f}{\partial z}(x,y,z)\Bigr| \le h(x)+c_1(|y|+|z|). \tag{6.1.24}$$
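$$\varphi(x,z) = \frac{\partial f}{\partial z}(x,u_0(x),z).$$
Assume that $\frac{\partial\varphi}{\partial z}>0$ on $[a,b]\times\mathbb{R}$ and that for every fixed $x\in[a,b]$ the function $z\mapsto\varphi(x,z)$ maps $\mathbb{R}$ onto $\mathbb{R}$.
Then $u_0\in C^2[a,b]$.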
Proof. First, let us assume that $u_1=u_2=0$. Conditions (6.1.22)-(6.1.24) guarantee that $F$ is well defined on $W_0^{1,2}(a,b)$ and that $\delta F(u_0;v)$ exists for any $v\in W^{1,2}(a,b)$.$^{10}$ It follows from Proposition 6.1.2 that for any $w\in W_0^{1,2}(a,b)$,
$$\delta F(u_0;w) = \int_a^b\Bigl[\frac{\partial f}{\partial z}(x,u_0(x),u_0'(x))\,w'(x) + \frac{\partial f}{\partial y}(x,u_0(x),u_0'(x))\,w(x)\Bigr]dx = 0.$$
If we proceed literally as in the proof of Proposition 6.1.11 we arrive at (6.1.21) which now holds for a.a. $x\in[a,b]$. Since the function
$$g(x,z) = \varphi(x,z) - c - \int_a^x\frac{\partial f}{\partial y}(\xi,u_0(\xi),u_0'(\xi))\,d\xi$$
is continuous on $[a,b]\times\mathbb{R}$, by the assumptions on the function $\varphi$, for any $x\in[a,b]$ the equation
$$g(x,z) = 0$$
10 The
has a unique solution $z=z(x)$. Moreover, by the Implicit Function Theorem (see Remark 4.2.3, not Theorem 4.2.1!), the function $x\mapsto z(x)$ is continuous on $(a,b)$. It can be shown (Exercise 6.1.21) that it is continuous also at the end points $a$, $b$. So, it follows from (6.1.21) that
$$u_0'(x) = z(x)\quad\text{for a.a. } x\in[a,b],\qquad\text{i.e.,}\qquad u_0(x) = \int_a^xz(y)\,dy.$$
Hence $u_0\in C_0^1[a,b]$ and it is a local minimum of $F$ in the space $C_0^1[a,b]$. The assertion now follows from Theorem 6.1.13.
In the general case, we consider again
$$u - \frac{u_2-u_1}{b-a}(x-a) - u_1$$
3
,
,
.
2 2
2 2
4 4
2
Prove that F has a local maximum at 2
3 , 3 , a local minimum at
3 , 3 , and
there is no extremum at the critical point (0, 0). For the graph of F see Figure 6.1.3.
on the set
M=
11
Exercise 6.1.17. Use Theorem 6.1.5 to prove that the solution of the Euler equation (6.1.14) is a local minimum of $\tilde F$ from Example 6.1.10.
Hint. Show that
$$\delta^2\tilde F(v;h,h) \ge \frac{(2c-a)^{3/2}}{4c\sqrt{gca}}\int_0^a|h'(x)|^2\,dx.$$
Exercise 6.1.18. Prove that the functional
u(x)2 [1 u (x)2 ] dx
F (u) =
0
arctan nx
arctan n
does not have its global minimum over the set M from Exercise 6.1.19.
Hint. The corresponding Euler equation has no solution.
Exercise 6.1.21. Prove the following statement:
Let $g : [a,b]\times\mathbb{R}\to\mathbb{R}$ be a function and assume that for any $x\in[a,b]$ the equation $g(x,z)=0$ has a solution denoted by $z=z(x)$ (not necessarily unique). If
$$\frac{\partial g}{\partial z}(x,z) > 0\quad\text{on } [a,b]\times\mathbb{R},$$
then this solution is unique. If, moreover, $g$ and $\frac{\partial g}{\partial z}$ are continuous on $[a,b]\times\mathbb{R}$, then $z=z(x)$ is continuous on $[a,b]$ as well.
Hint. For the continuity of $z=z(x)$ use the Implicit Function Theorem in the form of Remark 4.2.3 and notice that usage of the Contraction Principle is also possible at the end points $a$, $b$.
R
F
b R
i.e., the minimum of $F$ over $[a,b]$ is at the point $x_0$ (see Figure 6.2.3). The proof of this fact is typical for this section. Assume that $\{x_n\}_{n=1}^{\infty}$ is a minimizing sequence for $F$ on $[a,b]$, i.e.,
$$F(x_n)\to\inf_{x\in[a,b]}F(x).^{12} \tag{6.2.1}$$
12 Note
Figure 6.2.3.
The compactness of $[a,b]$ implies that there is a subsequence $\{x_{n_k}\}_{k=1}^{\infty}\subset\{x_n\}_{n=1}^{\infty}$ and a point $x_0\in[a,b]$ such that
$$x_{n_k}\to x_0.$$
The continuity of $F$ then implies that
$$F(x_0) = \inf_{x\in[a,b]}F(x).$$
The reader should notice that a property weaker than the continuity of $F$ is sufficient to get this conclusion, namely
$$F(x_0) \le \liminf_{k\to\infty}F(x_{n_k}) \tag{6.2.2}$$
(cf. Definition 6.2.1 below). It follows now from (6.2.1) and (6.2.2) that
$$F(x_0) = \inf_{x\in[a,b]}F(x)$$
x(a,b)
(6.2.3)
x(a,b)
uD
inf
uint D
F (u).
377
F (a)
F (b)
F (x0 )
0
x0
Figure 6.2.4.
Unfortunately, the answer is no in general (see Exercise 6.2.23). The reason lies
in the fact that the compactness of the bounded and closed interval [a, b] is the
crucial property which plays the essential role in the proof. In fact, one can imitate
the proof above to get the following result:
Let F be a lower semicontinuous real functional on a compact set
K X. Then F has a minimum in K.
However, this assertion has practically no applications because compact subsets of
the innite dimensional Banach space X are too thin (see Proposition 1.2.15).
For instance, for any compact set K X we have
int K = .
Because of this fact we have to look for a dierent (weaker why?) topology
on X than that induced by the norm. We would like to nd a new topology on X
with respect to which any bounded (in the norm) set D X is relatively compact.
The lower semicontinuity of a functional F with respect to this topology will then
allow us to prove the above assertion with K substituted by a bounded and closed
set D with respect to this new topology. These problems gave an impulse for
the study of weak convergence introduced in Denition 2.1.21. The reader should
notice that we will discuss weak sequential continuity of functionals instead of weak
continuity (these are dierent concepts since weak topology is not metrizable in
general). The reason is quite practical: weak sequential (semi) continuity is easier
to prove for a concrete (e.g., integral) functional. In order to make the exposition
in this section as clear as possible we will restrict our attention to real Hilbert
spaces H. The reader should have in mind that the following notions can also be
dened in any Banach space.
Definition 6.2.1. Let $F : H\to\mathbb{R}$ be a functional, $M\subset H$. Then $F$ is said to be weakly sequentially lower semicontinuous at a point $u_0\in M$ relative to $M$ if for
L(u) = (u, v)
Hence
un u0
L(un ) L(u0 ), 13
implies
in particular,
Proof. Let
{un }n=1
{un }
n=1 M
and
uM
uM
i.e.,
F (u0 ) = inf F (u) > .
uM
13 If
Proof. The assumption $u_0\in\operatorname{int}M$ implies that $F$ attains also its local minimum at $u_0$. The assertion now follows from Proposition 6.1.2.
Example 6.2.6. Let us consider the boundary value problem for the second order ordinary differential equation
t (0, 1),
x =
x(t)
dt
2
12
.
dt +
x(t) dt
f (t)x(t) dt,
F (x) =
2 0
4 0
0
Then for x, y H we have
1
F (x; y) =
x(t)
y(t)
dt +
0
x (t)y(t) dt
3
x H.15
f (t)y(t) dt,
0
for an arbitrary y H,
a detailed discussion of the notion of a weak solution see Remark 5.3.10. Note that this
weak solution x0 minimizes the energy functional F , i.e., it corresponds to the state with the
minimal energy of the system.
15 This functional can represent the energy of a certain system. For this reason it is often called
the energy functional.
This implies
xn (t)4 dt
z(t)4 dt,
f (t)xn (t) dt
f (t)z(t) dt.
(6.2.5)
The weak sequential lower semicontinuity of the Hilbert norm (Example 6.2.2) implies
$$\liminf_{n\to\infty}\|x_n\|^2 \ge \|z\|^2. \tag{6.2.6}$$
(6.2.7)
Due to this fact we can estimate $F$ using the Hölder inequality as follows:
$$F(x) \ge \frac12\|x\|^2 - \|f\|_{L^2(0,1)}\|x\|_{L^2(0,1)} \ge \frac12\|x\|\bigl(\|x\| - 2\|f\|_{L^2(0,1)}\bigr). \tag{6.2.8}$$
u
16 This
x(s)
ds.
0
This notion together with Corollary 6.2.5 leads to the following global result.
Theorem 6.2.8. Let $F : H\to\mathbb{R}$ be a weakly sequentially lower semicontinuous and weakly coercive functional. Then $F$ is bounded below on $H$, and there exists $u_0\in H$ such that
$$F(u_0) = \min_{u\in H}F(u).$$
have
F (u) d.
Hence
inf F (u) = inf F (u).
uR
uH
i.e.,
u0 E(a).
If, moreover, $F$ is strictly convex, then $u_0$ is the unique point with this property.$^{19}$
Theorem 6.2.12. Let $F : H\to\mathbb{R}$ be continuous, convex and weakly coercive on $H$. Then $F$ is bounded below on $H$, and there exists $u_0\in H$ such that
$$F(u_0) = \inf_{u\in H}F(u).$$
19 The
i.e.,
uM
Assume that $L\ne0$ and $\|u_0\|<1$. Then there exists $t>1$ such that $\|tu_0\|=1$, i.e., $tu_0\in M$, and
$$L(tu_0) = tL(u_0) = t\|L\| > \sup_{u\in M}L(u),$$
a contradiction.
Note that this assertion can be proved directly using the Riesz Representation Theorem (Theorem 1.2.40).
Example 6.2.14. Let us consider the boundary value problem (6.2.4) and the energy functional
$$F(x) = \frac12\int_0^1|\dot x(t)|^2\,dt + \frac14\int_0^1|x(t)|^4\,dt - \int_0^1f(t)x(t)\,dt,\qquad x\in H := W_0^{1,2}(0,1),$$
associated with (6.2.4). We have actually proved in Example 6.2.6 that $F$ is weakly coercive on $H$. The continuity of $F$ on $H$ follows from the continuity of the norm in $H$, the continuity of the embedding $H = W_0^{1,2}(0,1)\hookrightarrow L^4(0,1)$ and from the continuity of the linear form
$$x\mapsto\int_0^1f(t)x(t)\,dt\quad\text{on } H$$
under the assumption $f\in L^2(0,1)$. The strict convexity of $F$ follows from the strict convexity of the real functions
$$t\mapsto t^2,\qquad t\mapsto t^4,$$
and the convexity of the linear form. We conclude (see Theorem 6.2.12) that there exists a unique $x_0\in H$ such that
$$F(x_0) = \min_{x\in H}F(x).$$
It follows then from Proposition 6.1.8 that $x_0$ is the unique weak solution of (6.2.4).
Remark 6.2.15. The reader should compare Examples 6.2.6 and 6.2.14. In the latter one we have used Theorem 6.2.12 which enables us to avoid verifying the assumption of the weak sequential lower semicontinuity of $F$. This might be a difficult task in general (it cannot always be done so easily by means of the compact embedding as in Example 6.2.6).
The reader should also notice that the continuity of $F$ without any additional assumptions does not imply the weak sequential lower semicontinuity of $F$ (see Exercise 6.2.31).
In the last part of this section we show another possibility for finding critical points of $F$ under the assumption that $F$ is differentiable. First we need two auxiliary assertions.
Lemma 6.2.16. Let $F$ be a functional defined on $H$ and $\nabla F$ its gradient.$^{20}$ Let $\nabla F : H\to H$ be a monotone operator. Then $F$ is weakly sequentially lower semicontinuous on $H$.
Proof. Let $u,v\in H$. According to the Mean Value Theorem applied to the real function $\varphi : s\mapsto F(v+s(u-v))$, $s\in[0,1]$, there exists $t\in(0,1)$ such that
$$F(u)-F(v) = (\nabla F(v+t(u-v)),u-v) = (\nabla F(v),u-v) + (\nabla F(v+t(u-v))-\nabla F(v),u-v) \ge (\nabla F(v),u) - (\nabla F(v),v).^{21} \tag{6.2.9}$$
Let $\{v_n\}_{n=1}^{\infty}$ be a sequence in $H$ such that $v_n\rightharpoonup v$ in $H$, i.e.,
$$(\nabla F(v),v_n)\to(\nabla F(v),v).$$
It follows from (6.2.9) that
$$\liminf_{n\to\infty}F(v_n) \ge F(v) - (\nabla F(v),v) + \lim_{n\to\infty}(\nabla F(v),v_n) = F(v).$$
20 Remember
dt
= F (o) +
(F (tu), tu)
t
u
u
u
d
F
,
u
u
u
u
The boundedness of F implies
m
sup
[0,r]
uH, u=o
F u < .
u
Consequently, we obtain
d
u
u
F
,
u
u
0
u
d
u
u
+
F
,
u
u
r
F (u) = F (o) +
F (o) rm + u r
for any u H,
u > r.
(6.2.10)
dt +
x(t)4 dt
f (t)x(t) dt.
F (x) =
2 0
4 0
0
Then
F (x)y =
x(t)
y(t)
dt +
0
x (t)y(t) dt
3
f (t)y(t) dt
0
is the Gâteaux derivative of $F$ in $H := W_0^{1,2}(0,1)$. We verify the assumptions of Theorem 6.2.20. Using the continuous embedding $H = W_0^{1,2}(0,1)\hookrightarrow C[0,1]$ (Theorem 1.2.26) we prove the boundedness (and even continuity!) of $\nabla F$ in the space $H$. Since $s\mapsto s^3$ is monotone we have
(F (x1 ) F (x2 ), x1 x2 ) =
+
0
x 1 (t) x 2 (t)2 dt
0
1
f (x)x(t) dt
x
x
x 0
xL2 (0,1)
.
x f L2 (0,1)
x
Using the inequality (6.2.7) we get
$$\frac{(\nabla F(x),x)}{\|x\|} \ge \|x\| - \|f\|_{L^2(0,1)},$$
i.e., $\nabla F$ is coercive. We conclude from Theorem 6.2.20 that
$$\nabla F(H) = H,$$
22 Compare
(6.2.11)
Exercise 6.2.23. Let {en }n=1 be an orthonormal basis in a Hilbert space H. Put
1
Dn = x H : x en
2
and dene a functional
x
for
f (x) =
2(n 1) 1
x en
for
x +
n
2
x
n=1
x Dn .
(x,y)
U (x, y) =
Dn ,
u
Exercise 6.2.26. Prove that the same conclusion as in Example 6.2.14 holds true also if $f\in L^1(0,1)$.
Hint. Use the embedding $W_0^{1,2}(0,1)\hookrightarrow L^{\infty}(0,1)$.
Exercise 6.2.27. Prove that the norm on $H$ and linear forms on $H$ are convex functionals.
Exercise 6.2.28. Prove that in Theorem 6.2.12 the weak coercivity of $F$ can be substituted by a weaker assumption:
For any $u\in H$ there exists $r>0$ such that for all $v\in H$, $\|v\|\ge r$, we have $F(v)>F(u)$.
Exercise 6.2.29. Let $M$ be an open convex subset of a real Hilbert space $H$, let $F : H\to\mathbb{R}$ be a functional such that for any $u\in M$ there exists the second Gâteaux derivative $D^2F(u)$. Prove that
$$\text{(a)}\iff\text{(b)}\iff\text{(c)}$$
where
(a) $D^2F(u)(h,h)\ge0$ for $u\in M$, $h\in H$;
(b) $(\nabla F(u)-\nabla F(v),u-v)\ge0$ for $u,v\in M$;
(c) $F$ is convex on $M$.
Hint. Use the Mean Value Theorem (see Theorem 3.2.6) as for real functions.
Exercise 6.2.30. Prove that for any $n\in\mathbb{N}$ and $f\in L^2(0,1)$ the boundary value problem
us mention the Galerkin Method, the Finite Elements Method, the Katchanov-Galerkin Method, etc., which are powerful tools in the numerical solution of differential equations.
Let $X$ be a real Banach space and $F$ a real functional defined on $X$. An element $u_0\in X$ satisfying
$$F(u_0) = \inf_{u\in X}F(u) \tag{6.2.12}$$
will be called a solution of the variational problem (6.2.12). We will discuss the Ritz Method which actually yields directly an algorithm for finding a solution of the variational problem. The basic idea of the Ritz Method is rather simple:
Instead of looking for the minimum of the functional $F$ on the entire space $X$, we look for its minimum on suitable subspaces of the space $X$ in which we know how to solve the variational problem.
Let us now formulate this idea precisely:
To every $n\in\mathbb{N}$, let a closed subspace $X_n$ of the space $X$ be assigned. The problem of finding an element $u_n\in X_n$ such that
$$F(u_n) = \inf_{u\in X_n}F(u) \tag{6.2.13}$$
holds is called the Ritz approximation of the problem (6.2.12) and the element $u_n\in X_n$ is called a solution of the problem (6.2.13).
The following two fundamental problems immediately present themselves:
(a) the problem of the existence and uniqueness of a solution of the problem (6.2.13);
(b) the relation between the solutions of the problems (6.2.12) and (6.2.13).
Problem (a) has already been solved by Theorem 6.2.12 in the framework of Hilbert spaces. It follows from Remark 6.2.22 that the same assertion can be proved in a reflexive Banach space $X$. Since a closed subspace $X_n$ of a reflexive Banach space $X$ is also a reflexive Banach space, we have the following assertion which follows directly from Theorem 6.2.12 and Remark 6.2.22.
Proposition 6.2.32. Let $X$ be a reflexive Banach space, and let a functional $F$ defined on the space $X$ be continuous, strictly convex and weakly coercive on $X$. Then each of the problems (6.2.12) and (6.2.13) has precisely one solution $u_0$ and $u_n$, respectively.
We now focus our effort on problem (b). We investigate under what condition
$$\lim_{n\to\infty}\|u_0-u_n\| = 0 \tag{6.2.14}$$
is true. If (6.2.14) is valid, then we say that the Ritz Method converges for the problem (6.2.12) and the solutions $u_n$ of the problems (6.2.13) approximate the solution of the problem (6.2.12) in the sense of the norm of the space $X$.
Proposition 6.2.33. Let $F$ be a continuous linear functional on a normed linear space $X$ and let $\{X_n\}_{n=1}^{\infty}$ be a sequence of closed subspaces of $X$ such that for every $v\in X$ there exist elements $v_n\in X_n$, $n\in\mathbb{N}$, such that
$$\lim_{n\to\infty}\|v-v_n\| = 0. \tag{6.2.15}$$
(6.2.16)
uX
Proof. Let {k }
k=1 be a sequence such that
k inf F (u).
uX
(k)
v (k) for n .
uX
uX
The assertion on the convergence of the Ritz Method for the problem (6.2.12) is the following proposition.
Proposition 6.2.34 (Ritz Method). Let $H$ be a real Hilbert space,$^{24}$ and let $F$ be a continuous functional on the space $H$ which has the second Gâteaux derivative $D^2F(u)\in B_2(H,\mathbb{R})$.$^{25}$ Assume, further, that there exists a constant $c>0$ such that for all $u,v\in H$ we have
$$D^2F(u)(v,v) \ge c\|v\|^2. \tag{6.2.17}$$
Let subspaces $H_n$ of the space $H$ satisfy condition (6.2.15). Then
(i) there exists precisely one solution $u_0\in H$ of problem (6.2.12);
(ii) for every $n\in\mathbb{N}$ there exists precisely one solution $u_n\in H_n$ of problem (6.2.13);
(iii) the Ritz Method converges for problem (6.2.12), i.e.,
$$\lim_{n\to\infty}\|u_0-u_n\| = 0.$$
24 We will state and prove Proposition 6.2.34 in the Hilbert space setting. The generalization to the Banach space setting can be obtained (cf. Remark 6.2.22). The reader can find details in specialized literature (see, e.g., Saaty [116]).
25 See Section 3.2.
(6.2.18)
Multiplying the first inequality by $t$, the second by $(1-t)$ and adding both of them, we obtain that $F$ is strictly convex on $H$ (and also on $H_n$ for arbitrary $n$).
The assertions (i) and (ii) now follow from Theorem 6.2.12.
It remains to prove assertion (iii). Let $u_0$ and $u_n$ be a solution of (6.2.12) and (6.2.13), respectively. Set $u := u_0$ and $v := u_n-u_0$ in (6.2.18). From (6.2.17) and (6.2.18) we obtain
$$F(u_n) \ge F(u_0) + DF(u_0)(u_n-u_0) + \frac c2\|u_n-u_0\|^2.$$
Since $u_0\in H$ is the minimum point for $F$ on $H$, it follows from Theorem 6.2.12 that
$$DF(u_0)(u_n-u_0) = 0,$$
i.e.,
$$F(u_n) \ge F(u_0) + \frac c2\|u_n-u_0\|^2 \tag{6.2.19}$$
holds for arbitrary $n\in\mathbb{N}$. On the other hand, due to Proposition 6.2.33, the elements $u_n$, $n\in\mathbb{N}$, constitute a minimizing sequence for $F$ on $H$, i.e.,
$$\lim_{n\to\infty}F(u_n) = \inf_{u\in H}F(u) = F(u_0). \tag{6.2.20}$$
So far, we have answered theoretically problems (a) and (b) formulated at the beginning of this appendix. However, from the point of view of practical (numerical) calculations the most interesting problems start right now. The most frequent and most important case in practice arises when the spaces $H_n$ are of finite dimension, e.g., $\dim H_n = N$. If $e_1,\dots,e_N$ is a basis of $H_n$ and
$$F_n(c_1,\dots,c_N) := F\Bigl(\sum_{i=1}^{N}c_ie_i\Bigr),$$
inf
Fn (c1 , . . . , cN ).
(6.2.21)
If the assumptions of Proposition 6.2.34 are satisfied, then the function $F_n$ is continuous, strictly convex on the space $\mathbb{R}^N$, satisfies
$$\lim_{\|c\|\to\infty}F_n(c) = \infty,$$
and then the vector $c$ is a solution of problem (6.2.21) if and only if all partial derivatives of the first order of the function $F_n$ vanish at $c$ (cf. Theorem 6.2.12). Thus the problem of finding a solution of problem (6.2.21) is equivalent to the problem of finding a solution of the system
$$\frac{\partial F_n(c_1,\dots,c_N)}{\partial c_1} = 0,\quad\dots,\quad\frac{\partial F_n(c_1,\dots,c_N)}{\partial c_N} = 0. \tag{6.2.22}$$
The system (6.2.22) is a system of $N$ algebraic equations which are generally nonlinear. However, note that if the functional $F$ is quadratic, then the system (6.2.22) is a system of linear algebraic equations.
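For a quadratic functional the system (6.2.22) can be assembled and solved exactly. The sketch below is a hypothetical illustration, not the authors' example: it takes $F(x)=\frac12\int_0^1|\dot x|^2\,dt-\int_0^1fx\,dt$ on $W_0^{1,2}(0,1)$ with a polynomial right-hand side $f$, uses the basis functions $e_i(t)=t^i(1-t)$, $i=1,\dots,n$, and solves the resulting linear system in exact rational arithmetic. The function name and the way $f$ is passed (a list of polynomial coefficients) are choices made for this sketch.

```python
from fractions import Fraction


def ritz_coefficients(f_coeffs, n):
    """Ritz coefficients c_1..c_n for F(x) = 1/2 int x'^2 dt - int f x dt on (0, 1).

    Basis: e_i(t) = t^i (1 - t), i = 1..n.
    Right-hand side: f(t) = sum_m f_coeffs[m] * t^m.
    """
    def stiffness(j, k):
        # int_0^1 e_j'(t) e_k'(t) dt with e_i'(t) = i t^(i-1) - (i+1) t^i
        return (Fraction(j * k, j + k - 1)
                - Fraction(j * (k + 1) + (j + 1) * k, j + k)
                + Fraction((j + 1) * (k + 1), j + k + 1))

    def load(j):
        # int_0^1 f(t) (t^j - t^(j+1)) dt
        return sum(Fraction(fm) * (Fraction(1, m + j + 1) - Fraction(1, m + j + 2))
                   for m, fm in enumerate(f_coeffs))

    # Augmented matrix of the linear system dF_n/dc_j = 0, j = 1..n.
    a = [[stiffness(j, k) for k in range(1, n + 1)] + [load(j)]
         for j in range(1, n + 1)]
    # Gauss-Jordan elimination; the matrix is symmetric positive definite,
    # so the pivots are nonzero and no row exchanges are needed.
    for col in range(n):
        pivot = a[col][col]
        a[col] = [entry / pivot for entry in a[col]]
        for row in range(n):
            if row != col:
                factor = a[row][col]
                a[row] = [x - factor * y for x, y in zip(a[row], a[col])]
    return [a[i][n] for i in range(n)]


# f = 1: the minimizer is x(t) = t (1 - t) / 2, which lies in the span of e_1.
print(ritz_coefficients([1], 3))
```

With $f(t)=t$ the minimizer is $(t-t^3)/6=\frac16e_1+\frac16e_2$, so the Ritz approximation is already exact for $n\ge2$; in general it only approximates the minimizer.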
Remark 6.2.35. We have not been concerned with the question which is fundamental from the practical point of view: How do we solve system (6.2.22) numerically? A vast literature dedicated to numerical methods deals with this problem. Just for an illustration we mention one minimization method. Choose arbitrarily a vector $c^0 = (c_1^0,\dots,c_N^0)\in\mathbb{R}^N$. Let us present an algorithm for the construction of a sequence $\{c^m\}_{m=1}^{\infty}$ which converges under appropriate assumptions on $f$ to the solution of system (6.2.22). If we know the vector $c^m = (c_1^m,\dots,c_N^m)\in\mathbb{R}^N$, we calculate the components of the vector $c^{m+1} = (c_1^{m+1},\dots,c_N^{m+1})\in\mathbb{R}^N$ as follows: Let the function
$$\xi\mapsto F_n(c_1^{m+1},\dots,c_{i-1}^{m+1},\xi,c_{i+1}^m,\dots,c_N^m)$$
where
$$0<\omega<2.$$
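A coordinate-wise relaxation of this kind can be sketched for the energy functional $F(x)=\frac12\int_0^1|\dot x|^2\,dt+\frac14\int_0^1|x|^4\,dt-\int_0^1fx\,dt$, whose Euler equation is $-\ddot x+x^3=f$ with $x(0)=x(1)=0$. The code below is an illustration under several assumptions that are not taken from the text: the subspace consists of continuous piecewise linear functions on a uniform grid, the two lower-order integrals are replaced by nodal (lumped) quadrature, each one-dimensional minimization is done by Newton's method, and the update rule $c_i\leftarrow c_i+\omega(\xi_i-c_i)$ is one common reading of a relaxation step with parameter $\omega$.

```python
import math


def minimize_energy_by_relaxation(f, n=16, omega=1.0, sweeps=5000, tol=1e-12):
    """Nodal values c[0..n] minimizing a discretized energy by relaxation.

    Discrete energy (an assumption of this sketch):
        sum_i (c[i+1] - c[i])^2 / (2 h) + h * sum_i (c[i]^4 / 4 - f(t_i) c[i]).
    """
    h = 1.0 / n
    fvals = [f(i * h) for i in range(n + 1)]
    c = [0.0] * (n + 1)  # c[0] = c[n] = 0 are the boundary values
    for _sweep in range(sweeps):
        largest_update = 0.0
        for i in range(1, n):
            # Minimize xi -> F_n(..., c[i-1], xi, c[i+1], ...) by Newton's method
            # applied to the derivative g of this one-dimensional function.
            xi = c[i]
            for _newton in range(50):
                g = (2.0 * xi - c[i - 1] - c[i + 1]) / h + h * xi ** 3 - h * fvals[i]
                dg = 2.0 / h + 3.0 * h * xi * xi
                step = g / dg
                xi -= step
                if abs(step) < 1e-15:
                    break
            new_value = c[i] + omega * (xi - c[i])  # relaxation step
            largest_update = max(largest_update, abs(new_value - c[i]))
            c[i] = new_value
        if largest_update < tol:
            break
    return c


def manufactured_rhs(t):
    """Right-hand side for which x(t) = sin(pi t) solves -x'' + x^3 = f."""
    return math.pi ** 2 * math.sin(math.pi * t) + math.sin(math.pi * t) ** 3


nodal = minimize_energy_by_relaxation(manufactured_rhs, n=16)
print(max(abs(nodal[i] - math.sin(math.pi * i / 16)) for i in range(17)))
```

With $\omega=1$ each step moves to the exact minimizer along one coordinate; other values in $(0,2)$ over- or under-shoot it and may change the number of sweeps needed.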
Let $\{e_i\}_{i=1}^{\infty}$ be a Schauder basis (see Section 1.2) of a Hilbert space $H$ (not necessarily orthonormal) and define the subspace $H_n$ as the set of all elements $u\in H$ which are of the form
$$u = c_1e_1+\dots+c_ne_n.$$
It follows from the definition of the Schauder basis that $\{H_n\}_{n=1}^{\infty}$ satisfies condition (6.2.15).
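Example 6.2.36. Let $H \triangleq W_0^{1,2}(0,1)$, $f \in L^1(0,1)$ and
\[ F(x) \triangleq \frac12 \int_0^1 |\dot x(t)|^2\,dt + \frac14 \int_0^1 |x(t)|^4\,dt - \int_0^1 f(t)x(t)\,dt, \qquad x \in H. \tag{6.2.23} \]
Then
\[
\begin{aligned}
DF(x)(y) &= \int_0^1 \dot x(t)\dot y(t)\,dt + \int_0^1 x^3(t)y(t)\,dt - \int_0^1 f(t)y(t)\,dt,\\
D^2F(x)(y,y) &= \int_0^1 |\dot y(t)|^2\,dt + 3\int_0^1 x^2(t)y^2(t)\,dt
\end{aligned}
\tag{6.2.24}
\]
and the assumptions of Proposition 6.2.34 are satisfied.$^{26}$ The sequence of functions $e_i$, $i = 1, 2, \dots$, which are defined by
\[ e_i(t) \triangleq t^i(1-t), \]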
constitutes a Schauder basis of the space H (see, e.g., Michlin [94]). Thus, if we construct
the subspaces Hn as above, the condition (6.2.15) will be satised. If we rewrite the
system (6.2.22) for this particular case, we obtain the system of nonlinear equations for
unknowns c1 , . . . , cn ,
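\[ \sum_{k=1}^{n} c_k \int_0^1 \frac{d}{dt}\big[t^k(1-t)\big]\,\frac{d}{dt}\big[t^j(1-t)\big]\,dt + \int_0^1 \Big(\sum_{k=1}^{n} c_k t^k(1-t)\Big)^{3} t^j(1-t)\,dt = \int_0^1 f(t)\,t^j(1-t)\,dt, \qquad j = 1, \dots, n. \tag{6.2.25} \]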
$^{26}$ [...] that we consider $\|x\| = \left(\int_0^1 |\dot x(t)|^2\,dt\right)^{1/2}$ as the norm on $H$.
numerical method. Below we indicate one possible choice of $H_n$ which is different from the previous one and which meets the above mentioned requirements.
Let $n \in \mathbb{N}$, and put $t_i = \frac{i}{n}$ for $i = 0, 1, \dots, n$ and $I_j = [t_j, t_{j+1}]$ for $j = 0, 1, \dots, n-1$. We define the spaces $H_n$ as follows:
$H_n$ is the set of functions $x = x(t)$ continuous on the interval $[0,1]$ which are linear on every interval $[t_i, t_{i+1}]$ and for which $x(0) = x(1) = 0$.
Let $e_i \in H_n$, $i = 1, \dots, n-1$, be functions such that
\[ e_i(t_j) = \begin{cases} 1 & \text{for } i = j,\\ 0 & \text{for } i \ne j, \end{cases} \qquad j = 0, \dots, n. \]
It is easily established that the set $\{e_i\}_{i=1}^{n-1}$ constitutes a basis of the space $H_n$ and that for all $y \in H_n$ we have
\[ y(t) = \sum_{j=1}^{n-1} y(t_j)\,e_j(t), \qquad t \in [0,1]. \]
The system (6.2.22) constructed for this basis will now be itself a system for the unknown values $x_n(t_j)$ of the solution of problem (6.2.13). The crucial point in this construction is the fact that the functions $e_i(t)$ vanish outside the interval $I_{i-1} \cup I_i = \left[\frac{i-1}{n}, \frac{i+1}{n}\right]$. We then have
\[ e_i(t)e_j(t) = \dot e_i(t)\dot e_j(t) = 0 \]
for $i, j = 1, \dots, n-1$, $|i - j| > 1$ at every point $t \in [0,1]$ (with the obvious exception, for derivatives, of the points $t_1, \dots, t_{n-1}$, which constitute a set of measure zero). Therefore, in each of the equations
\[ \sum_{i=1}^{n-1} c_i \int_0^1 \dot e_i(t)\dot e_j(t)\,dt + \int_0^1 \Big(\sum_{i=1}^{n-1} c_i e_i(t)\Big)^{3} e_j(t)\,dt = \int_0^1 f(t)e_j(t)\,dt, \tag{6.2.26} \]
$j = 1, \dots, n-1$, of system (6.2.22) only the unknowns $c_{j-1}$, $c_{j+1}$ appear apart from $c_j$.
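The tridiagonal structure just described can be sketched in code. The example below assumes a simplification that is not part of the text: the integrals containing the cubic term and $f$ are approximated by the trapezoidal rule on each $I_j$ (mass lumping), so that equation $j$ becomes $(2c_j - c_{j-1} - c_{j+1})/h + h c_j^3 = h f(t_j)$ with $h = 1/n$. The resulting nonlinear system is solved by Newton's method with a tridiagonal solve in every step; all function names are hypothetical.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are never used."""
    n = len(diag)
    d, r = diag[:], rhs[:]
    for i in range(1, n):
        m = lower[i] / d[i - 1]
        d[i] -= m * upper[i - 1]
        r[i] -= m * r[i - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x

def solve_hat_function_system(f, n, newton_steps=20, tol=1e-10):
    """Newton's method for the lumped version of (6.2.26) with piecewise linear e_i."""
    h = 1.0 / n
    m = n - 1                                   # unknowns c_1, ..., c_{n-1}
    nodes = [(j + 1) * h for j in range(m)]
    c = [0.0] * m
    for _ in range(newton_steps):
        res = []
        for j in range(m):
            left = c[j - 1] if j > 0 else 0.0
            right = c[j + 1] if j < m - 1 else 0.0
            # stiffness part uses only c_{j-1}, c_j, c_{j+1}
            res.append((2.0 * c[j] - left - right) / h + h * c[j] ** 3 - h * f(nodes[j]))
        if max(abs(v) for v in res) < tol:
            break
        diag = [2.0 / h + 3.0 * h * cj * cj for cj in c]   # tridiagonal Jacobian
        off = [-1.0 / h] * m
        delta = solve_tridiagonal(off, diag, off, [-v for v in res])
        c = [cj + dj for cj, dj in zip(c, delta)]
    return nodes, c

# manufactured example: x(t) = sin(pi t) solves -x'' + x^3 = f
f = lambda t: math.pi ** 2 * math.sin(math.pi * t) + math.sin(math.pi * t) ** 3
nodes, c = solve_hat_function_system(f, 64)
print(max(abs(cj - math.sin(math.pi * tj)) for tj, cj in zip(nodes, c)))
```

Because each equation couples only three neighbouring unknowns, every Newton step costs a number of operations proportional to $n$.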
If we compute a solution $c_1, \dots, c_{n-1}$ from these equations, and if we put
\[ u_n(t) = c_1 e_1(t) + \dots + c_{n-1} e_{n-1}(t), \qquad t \in [0,1], \]
[...]
By Proposition 6.2.34, it suffices to show that the spaces $H_n$ satisfy condition (6.2.15). Let $y \in H$ and $\varepsilon > 0$. We shall show that there exist $n \in \mathbb{N}$ and $y_n \in H_n$ such that
\[ \|y - y_n\| < \varepsilon. \tag{6.2.27} \]
Indeed, the set $\mathcal{D}(0,1)$ is dense in $H$ (see Exercise 1.2.46). Hence there exists $w \in \mathcal{D}(0,1)$ such that
\[ \|y - w\| < \frac{\varepsilon}{2}. \tag{6.2.28} \]
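[...] for all $i = 0, \dots, n$. [...]
\[ \|w - y_n\|^2 = \sum_{i=0}^{n-1} \int_{t_i}^{t_{i+1}} |\dot w(t) - \dot y_n(t)|^2\,dt \le \sum_{i=0}^{n-1} \max_{t \in [0,1]} |\ddot w(t)|^2\, \frac{1}{n^2}\,(t_{i+1} - t_i) = \frac{1}{n^2} \max_{t \in [0,1]} |\ddot w(t)|^2. \]
This implies that for sufficiently large $n \in \mathbb{N}$ we have
\[ \|w - y_n\| < \frac{\varepsilon}{2}. \tag{6.2.29} \]
The desired inequality (6.2.27) now follows from (6.2.28) and (6.2.29).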
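The bound $\|w - y_n\| \le \frac1n \max|\ddot w|$ for the piecewise linear interpolant can be checked numerically. The sketch below is an illustration only: it uses $w(t) = \sin \pi t$ (smooth and vanishing at the endpoints, though not compactly supported), a midpoint quadrature rule, and a hypothetical function name.

```python
import math

def interpolation_error(w, dw, n, samples_per_cell=50):
    """Approximate (integral of |w' - y_n'|^2)^(1/2), y_n = piecewise linear interpolant of w."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        a = i * h
        slope = (w(a + h) - w(a)) / h            # y_n' is constant on [t_i, t_{i+1}]
        sub = h / samples_per_cell
        for k in range(samples_per_cell):        # midpoint rule on the cell
            t = a + (k + 0.5) * sub
            total += (dw(t) - slope) ** 2 * sub
    return math.sqrt(total)

w = lambda t: math.sin(math.pi * t)
dw = lambda t: math.pi * math.cos(math.pi * t)
for n in (4, 8, 16, 32):
    # error next to the bound max|w''| / n = pi^2 / n
    print(n, interpolation_error(w, dw, n), math.pi ** 2 / n)
```

Doubling $n$ roughly halves the error, in agreement with the $1/n$ estimate.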
Remark 6.2.37.
(i) Let us point out that to get system (6.2.25) it was not essential that an equidistant
division of the interval [0, 1] has been selected. Nonetheless, the norm of the division
(i.e., the maximal distance between two consecutive points) must approach zero.
(ii) The spaces $H_n$ are the simplest which could be chosen for the given example. It is also possible to choose spaces of $C^1$ functions which are polynomials of higher degree on every interval $I_i$. For instance, one can choose
\[ H_n = \{ y \in C^1[0,1] : y(0) = y(1) = 0,\ y|_{I_i} \text{ is a polynomial of the third degree for all } i = 0, \dots, n-1 \}.^{27} \]
There exists a basis of this space whose dimension is $2n$ which consists of the functions $e_1, \dots, e_{n-1}, \varphi_0, \dots, \varphi_n$ such that
\[ e_i(t_j) = \begin{cases} 1 & \text{for } i = j,\\ 0 & \text{for } i \ne j, \end{cases} \qquad \dot e_i(t_j) = 0, \qquad i = 1, \dots, n-1,\ j = 0, \dots, n; \]
\[ \varphi_i(t_j) = 0, \qquad \dot\varphi_i(t_j) = \begin{cases} 1 & \text{for } i = j,\\ 0 & \text{for } i \ne j, \end{cases} \qquad i, j = 0, \dots, n, \]
[...]
\[ y(t) = \sum_{j=1}^{n-1} y(t_j)\,e_j(t) + \sum_{j=0}^{n} \dot y(t_j)\,\varphi_j(t), \qquad t \in [0,1]. \]
(iii) From the computational point of view the question of how rapidly the solutions $u_n$ of problem (6.2.13) converge to a solution of problem (6.2.12) is very important. This question is closely related to the regularity of solutions of equations. If, e.g., $f \in C^0[0,1]$, then $u_0 \in C^2[0,1]$ (cf. Proposition 6.1.11 and Theorem 6.1.13) and using this it can be proved that there exists a constant $c > 0$ such that for all $n \in \mathbb{N}$ we have
\[ \|u_0 - u_n\| \le \frac{c}{n}. \]
If, e.g., $u_0 \in C^4[0,1]$, then we even have
\[ \|u_0 - u_n\| \le \frac{c}{n^3}. \]
Figure 6.2.5.
Remark 6.2.38 (Finite Elements Method). Similarly to Example 6.2.36 we could proceed even in the case $H \triangleq W_0^{k,2}(\Omega)$, $\Omega \subset \mathbb{R}^N$, $\partial\Omega \in C^{0,1}$. The situation then corresponds to the boundary value problem for partial differential equations; see Chapter 7 for more details. Suppose that we can divide the set $\Omega$ into a finite number of open subsets $\Omega_i$, $i = 1, \dots, k$, such that their diameter
\[ \operatorname{diam} \Omega_i = \sup_{x, y \in \Omega_i} \|x - y\| < \frac{1}{n} \]
and
\[ \overline{\Omega} = \bigcup_{i=1}^{k} \overline{\Omega_i}, \qquad \Omega_i \cap \Omega_j = \emptyset \quad \text{for } i \ne j. \]
Each of the sets $\Omega_i$ is called a finite element. The space $H_n$ will consist of functions whose restrictions to $\Omega_i$ are smooth functions, for instance polynomials in $N$ variables, and satisfy certain conditions on the common boundary of the sets $\Omega_i$ and $\Omega_j$ ($i \ne j$). For simplicity and greater intuitive appeal we will consider $\Omega$ to be a polygon in $\mathbb{R}^2$ and for every $n \in \mathbb{N}$ we perform a triangulation $T_n$ of the set $\Omega$, i.e., we put
\[ \overline{\Omega} = \bigcup_{i=1}^{k} \overline{K_i}, \qquad \operatorname{diam} K_i \le \frac{1}{n}, \quad i = 1, \dots, k, \]
see Figure 6.2.6. Assume that precisely one of the following situations arises for the mutual position of triangles $K_i, K_j \in T_n$ ($i \ne j$):
(a) the closures of two distinct triangles have no common point;
(b) the closures of two distinct triangles have only one vertex in common;
(c) the closures of two distinct triangles have an entire side in common.
The spaces $H_n$ will be sets of continuous functions whose restrictions to $K_i$ are polynomials of the $k$th order. Below, we give examples of the spaces $H_n$ for the case $k = 1$ and $k = 3$. The continuity of a function $v \in H_n$ is ensured on the set $\overline{\Omega}$ by choosing the values of parameters (used for the construction of the function) to be equal at the common vertices. The reader will find more details in specialized literature on the Finite Elements Method (see, e.g., Brenner & Scott [16], Křížek & Neittaanmäki [81], Rektorys [107]).
Figure 6.2.6.
Example 6.2.39 ($k = 1$). Let $\Omega$ be a polygon in $\mathbb{R}^2$. Let $K$ be an open triangle with vertices $Q_1$, $Q_2$, $Q_3$. Let $P_1(K)$ be the set of all polynomials of the first degree defined on $K$, i.e., $P \in P_1(K)$ if
\[ P(x, y) = \alpha_0 + \alpha_1 x + \alpha_2 y, \qquad (x, y) \in K. \]
It is easily shown that any function $P(x, y) \in P_1(K)$ is uniquely determined by its values at the vertices $Q_1$, $Q_2$, $Q_3$. The values $P(Q_1)$, $P(Q_2)$, $P(Q_3)$ serve as parameters by means of which the function $P(x, y)$ is constructed.
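That three vertex values determine a first-degree polynomial on a nondegenerate triangle can be made concrete by solving the $2 \times 2$ system for the two slopes with Cramer's rule. The sketch below is illustrative; the function name and the coefficient names `a0`, `a1`, `a2` are chosen for this example.

```python
def p1_coefficients(vertices, values):
    """Return (a0, a1, a2) such that a0 + a1*x + a2*y equals values[i] at vertices[i]."""
    (x1, y1), (x2, y2), (x3, y3) = vertices
    v1, v2, v3 = values
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)   # twice the signed area
    if det == 0:
        raise ValueError("degenerate triangle: the vertices are collinear")
    a1 = ((v2 - v1) * (y3 - y1) - (v3 - v1) * (y2 - y1)) / det
    a2 = ((x2 - x1) * (v3 - v1) - (x3 - x1) * (v2 - v1)) / det
    a0 = v1 - a1 * x1 - a2 * y1
    return a0, a1, a2

# values of v(x, y) = 2 + 3x - y at the vertices of the reference triangle
print(p1_coefficients([(0, 0), (1, 0), (0, 1)], [2, 5, 1]))
```

The determinant vanishes exactly when the three vertices are collinear, which is the only case in which the vertex values fail to determine the polynomial.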
The function $P \in P_1(K)$ for which
\[ P(Q_i) = v(Q_i), \qquad i = 1, 2, 3, \]
is called the Lagrange interpolation of the function $v \in C(\overline{K})$. The function $P(x, y)$ constructed in this way is denoted by $\Pi_K v$. Clearly, $\Pi_K$ is a linear operator from the space $C(\overline{K})$ into $P_1(K)$ and
\[ \|v - \Pi_K v\|_{W^{1,2}(K)} \le c\,h_K \|v\|_{W^{2,2}(K)} \tag{6.2.30} \]
holds for arbitrary functions $v \in W^{2,2}(K)$ (here $h_K = \operatorname{diam} K$ and $c > 0$ is a constant independent of $v$ and $h_K$).$^{28}$ Define the space $H_n$ as follows:
\[ H_n \triangleq \{ v \in C(\overline{\Omega}) : v|_{K_i} \in P_1(K_i) \text{ for all } K_i \in T_n \}. \]
Obviously,
\[ H_n \subset \tilde H \triangleq W^{1,2}(\Omega). \]
Let $v \in W^{2,2}(\Omega)$. Construct a function $v_n \in H_n$ in the following way:
\[ v_n|_{K_i} = \Pi_{K_i} v. \]
[...] in the space $\tilde H$ (explain why!), we conclude that the spaces $H_n$, $n \in \mathbb{N}$, satisfy condition (6.2.15).
We can construct the basis functions $e_1, \dots, e_k$ of $H_n$ just as in Example 6.2.36. If $\{Q_i\}_{i=1}^{m}$ are all vertices of all triangles of the triangulation $T_n$, then
\[ e_i(Q_j) = \begin{cases} 1 & \text{for } i = j,\\ 0 & \text{for } i \ne j, \end{cases} \qquad j = 1, \dots, m. \]
Example 6.2.40 ($k = 3$). Let $K$ be an open triangle with vertices $Q_1$, $Q_2$, $Q_3$ and with the center of gravity $Q_0$. Let $P_3(K)$ be the set of polynomials of the third degree defined on $K$, i.e., $P \in P_3(K)$ if
\[ P(x, y) = \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \alpha_3 x^3 + \alpha_4 xy + \alpha_5 xy^2 + \alpha_6 x^2 y + \alpha_7 y + \alpha_8 y^2 + \alpha_9 y^3, \]
$(x, y) \in K$. A function $P(x, y) \in P_3(K)$ is uniquely determined by its values at the vertices and at the center of gravity and by the values of the first partial derivatives at the vertices of the triangle $K$. A function $\Pi_K v \in P_3(K)$ for which
\[ \Pi_K v(Q_i) = v(Q_i), \qquad i = 0, 1, 2, 3; \]
\[ \frac{\partial \Pi_K v(Q_i)}{\partial x} = \frac{\partial v(Q_i)}{\partial x}, \qquad \frac{\partial \Pi_K v(Q_i)}{\partial y} = \frac{\partial v(Q_i)}{\partial y}, \qquad i = 1, 2, 3, \]
is called the Hermite interpolation of the function $v \in C^1(\overline{K})$. Just as in the preceding example, the inequality
\[ \|v - \Pi_K v\|_{W^{3,2}(K)} \le c\,h_K \|v\|_{W^{4,2}(K)} \]
holds for $v \in W^{4,2}(K)$. If we put
\[ H_n \triangleq \{ v \in C^1(K) : v|_{K_i} \in P_3(K_i) \text{ for every triangle } K_i \in T_n \}, \]
then
\[ H_n \subset \tilde H \triangleq W^{3,2}(\Omega) \]
and the spaces $H_n$, $n \in \mathbb{N}$, again satisfy condition (6.2.15) since the set $W^{4,2}(\Omega)$ is dense in the space $\tilde H$.
Exercise 6.2.41. Apply the spaces $H_n$ described in Remark 6.2.37(ii) to Example 6.2.36.
[...] $x, h \in H$.$^{29}$ (6.2.32)
Then
\[ F(x) = \int_0^1 \left[ \frac12 |\dot x(t)|^2 + \int_0^{x(t)} f(t, s)\,ds \right] dt \]
is of the class $C^1(H, \mathbb{R})$ and its critical points correspond to weak solutions of (6.2.31). A regularity argument applied to (6.2.31) (similar to that from Theorem 6.1.13) implies that every weak solution is a classical solution in the sense that
\[ x \in C_0^2[0,1] \triangleq \{ x \in C^2[0,1] : x(0) = x(1) = 0 \} \]
and the equation in (6.2.31) holds at every point $t \in (0,1)$.
The link between the method of supersolutions and subsolutions on the one side and the method of finding the global minimizer on the other side is that the existence of a well-ordered pair of a subsolution and supersolution $u_0$ and $v_0$, respectively, implies that the functional $F$ has a minimum on the convex but noncompact set
\[ M = \{ x \in H : u_0(t) \le x(t) \le v_0(t) \text{ for all } t \in [0,1] \}. \]
This minimum then solves (6.2.31). Namely, we have the following assertion.
Theorem 6.2.42. Let $u_0$ and $v_0$ be a subsolution and supersolution of (6.2.31) such that
\[ u_0(t) \le v_0(t), \quad t \in [0,1], \qquad E \triangleq \{ (t, x) \in [0,1] \times \mathbb{R} : u_0(t) \le x \le v_0(t) \}, \]
and let $f : E \to \mathbb{R}$ be a continuous function. Then the functional $F$ has a global minimum on $M$, i.e., there exists $x_0 \in M$ such that
\[ F(x_0) = \min_{\substack{x \in H\\ u_0 \le x \le v_0}} F(x). \]
[...] $t \in (0,1)$, $x(0) = x(1) = 0$. (6.2.33)
Define the energy functional associated with this modified problem by
\[ F(x) = \int_0^1 \left[ \frac12 |\dot x(t)|^2 + \int_0^{x(t)} f(t, \gamma(t, s))\,ds \right] dt. \]
Then $F \in C^1(H, \mathbb{R})$ and its critical points correspond to the solutions of (6.2.33). It is easy to prove (the reader should do it as an exercise) that $F$ is weakly sequentially lower semicontinuous and weakly coercive. It then follows from Theorem 6.2.8 that $F$ has a global minimum on $H$ at $x_0 \in H$, $F'(x_0) = o$. This $x_0$ is a weak solution of (6.2.33) and it is regular, i.e., $x_0 \in C^2[0,1]$, by Theorem 6.1.14.
We shall show that $u_0(t) \le x_0(t) \le v_0(t)$. Indeed, assume by contradiction that
\[ \min_{t \in [0,1]} (x_0(t) - u_0(t)) < 0 \]
and define
\[ t_0 \triangleq \max \Big\{ t \in [0,1] : x_0(t) - u_0(t) = \min_{s \in [0,1]} (x_0(s) - u_0(s)) \Big\}. \]
From the definition of a subsolution $u_0$ and of $\gamma$ we obtain that $t_0 < 1$, and for $t \ge t_0$, $t$ close to $t_0$, we have
\[ \dot x_0(t) - \dot u_0(t) = \int_{t_0}^{t} [\ddot x_0(s) - \ddot u_0(s)]\,ds = \int_{t_0}^{t} [f(s, u_0(s)) - \ddot u_0(s)]\,ds \le 0. \]
This contradicts the definition of $t_0$. Hence $x_0(t) \ge u_0(t)$, $t \in [0,1]$. Similarly we prove $x_0(t) \le v_0(t)$, $t \in [0,1]$.
Notice that if $x$ is such that $u_0(t) \le x(t) \le v_0(t)$, then $\gamma(t, x(t)) = x(t)$, i.e., $x_0$ is a minimizer for $F$ on $M$ and $F'(x_0) = o$.
Example 6.2.43. Consider the problem
\[ \ddot x(t) = \lambda f(t, x(t)), \quad t \in (0,1), \qquad x(0) = x(1) = 0, \tag{6.2.34} \]
where $f$ is continuous on $[0,1] \times \mathbb{R}$, $f(t, 0) = 0$, $f(t, R) \ge 0$ for an $R > 0$ and there exists $w \in H \triangleq W_0^{1,2}(0,1)$, $0 \le w(t) \le R$, $t \in [0,1]$, such that
\[ \int_0^1 \int_0^{w(t)} f(t, s)\,ds\,dt < 0. \]
Then there exists $\lambda_0$ such that for all $\lambda \ge \lambda_0$, (6.2.34) has, besides the trivial solution, at least one nontrivial nonnegative solution.
Indeed, $u_0 \equiv 0$ is a subsolution and $v_0 \equiv R$ is a supersolution, and according to Theorem 6.2.42 there exists
\[ x_0 \in M \triangleq \{ x \in H : 0 \le x(t) \le R \} \]
which solves (6.2.34) and minimizes the energy functional $F$ on $M$. Moreover, taking $\lambda$ large enough, we have
\[ F(w) = \int_0^1 \left[ \frac12 |\dot w(t)|^2 + \lambda \int_0^{w(t)} f(t, s)\,ds \right] dt < 0, \]
and so
\[ F(x_0) = \min_{x \in M} F(x) \le F(w) < 0 = F(o). \]
Remark 6.2.44. The same results as in Theorem 6.2.42 and Example 6.2.43 hold if the continuity of $f$ is relaxed to $f \in \mathrm{CAR}([0,1] \times \mathbb{R})$ and for all $r > 0$ there exists $h \in L^1(0,1)$ such that for a.e. $t \in (0,1)$ and all $s \in \mathbb{R}$, $|s| \le r$, we have $|f(t, s)| \le h(t)$. The reader is invited to verify all the previous steps as an exercise.
The reader who wants to learn more is referred to De Coster & Habets [33] where also the relation between non-well-ordered supersolutions and subsolutions on the one hand and the minimax method on the other is discussed.
Exercise 6.2.45. How does the proof of Theorem 6.2.42 change if the homogeneous Dirichlet boundary conditions in (6.2.31) are replaced by the Neumann ones?
Exercise 6.2.46. Consider the problem
\[ \big( |\dot x(t)|^{p-2} \dot x(t) \big)^{\cdot} = f(t, x(t)), \quad t \in (0,1), \qquad x(0) = x(1) = 0, \]
where $p > 1$ and [...]
\[ F(x) = \int_0^1 \left[ \frac1p |\dot x(t)|^p + \int_0^{x(t)} f(t, s)\,ds \right] dt. \tag{6.2.35} \]
for all
t [0, 1].
(f (x) f (a))
for all x M U.
Proof. Proposition 4.3.8 and Remark 4.3.9(i) yield a diffeomorphism of a neighborhood $U$ of $o \in X$ onto a neighborhood $V$ of $a$ such that
(U Ker (a)) = M V,
(o) = a.
Remark 6.3.3.
(i) The main significance of Theorem 6.3.2 consists in reducing a (difficult) problem of finding the constrained extremal points to an easier task of finding the local ones for a function
\[ f - \sum_{i=1}^{N} \lambda_i \Phi_i \]
[...]
Notice first that all points of $M$ are regular. The necessary condition given by Theorem 6.3.2 for extremal points requires solving the following four equations:
\[ 2xy + y^2 - 2\lambda x = 0, \tag{6.3.2} \]
\[ x^2 + 2xy - 2\lambda y = 0, \tag{6.3.3} \]
\[ 2z - 2\lambda z = 0, \tag{6.3.4} \]
\[ x^2 + y^2 + z^2 = 1. \tag{6.3.5} \]
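Case 1 ($z = 0$). Subtracting (6.3.3) from (6.3.2) we get
\[ (x - y)(x + y + 2\lambda) = 0, \]
so either $x = y$ or $x + y = -2\lambda$. If $x = y$, then (6.3.5) yields
\[ x = y = \pm\frac{\sqrt2}{2}, \qquad \text{i.e.,} \qquad f\Big(\pm\frac{\sqrt2}{2}, \pm\frac{\sqrt2}{2}, 0\Big) = \pm\frac{\sqrt2}{2}. \]
If $x + y = -2\lambda$, then (6.3.2) and (6.3.5) give $xy = -\frac13$,
\[ x + y = \pm\frac{\sqrt3}{3} \qquad \text{and hence} \qquad f(x, y, 0) = xy(x + y) = \mp\frac{\sqrt3}{9}. \]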
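Case 2 ($\lambda = 1$). Again we have either $x = y$ or $x + y = -2$. Putting $x = y$ into (6.3.2) we find
\[ x = y = 0, \quad z = \pm 1 \qquad \text{or} \qquad x = y = \frac23, \quad z = \pm\frac13 \]
and
\[ f(0, 0, \pm 1) = 1, \qquad f\Big(\frac23, \frac23, \pm\frac13\Big) = \frac{19}{27}. \]
[...]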
\[ \max_{M} f = f(0, 0, \pm 1) = 1, \qquad \min_{M} f = f\Big(-\frac{\sqrt2}{2}, -\frac{\sqrt2}{2}, 0\Big) = -\frac{\sqrt2}{2}. \]
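The candidate points can be verified numerically. The sketch below assumes $f(x, y, z) = x^2 y + x y^2 + z^2$ on the unit sphere, which is the function consistent with equations (6.3.2)-(6.3.5) and with the values quoted above; the definition of $f$ itself is not legible in this part of the text, and the helper names are hypothetical.

```python
import math

def f(x, y, z):
    return x * x * y + x * y * y + z * z

def lagrange_residual(x, y, z, lam):
    """Largest absolute residual of (6.3.2)-(6.3.5) at (x, y, z) with multiplier lam."""
    return max(abs(2 * x * y + y * y - 2 * lam * x),
               abs(x * x + 2 * x * y - 2 * lam * y),
               abs(2 * z - 2 * lam * z),
               abs(x * x + y * y + z * z - 1.0))

def critical_points():
    """Candidate points (x, y, z, lam) from the two cases z = 0 and lam = 1."""
    pts = []
    s = math.sqrt(2.0) / 2.0
    for sign in (1.0, -1.0):
        pts.append((sign * s, sign * s, 0.0, 1.5 * sign * s))    # z = 0, x = y
        total = sign * math.sqrt(3.0) / 3.0                      # z = 0, x + y = -2*lam
        root = math.sqrt(total * total + 4.0 / 3.0)              # from xy = -1/3
        x, y = (total + root) / 2.0, (total - root) / 2.0
        pts.append((x, y, 0.0, -total / 2.0))
        pts.append((y, x, 0.0, -total / 2.0))
        pts.append((0.0, 0.0, sign, 1.0))                        # lam = 1
        pts.append((2.0 / 3.0, 2.0 / 3.0, sign / 3.0, 1.0))      # lam = 1
    return pts

values = [f(x, y, z) for x, y, z, _ in critical_points()]
print(max(values), min(values))
```

Since the unit sphere is compact, the largest and smallest of these critical values are the constrained maximum and minimum.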
If we were interested in local minima/maxima of $f$ with respect to $M$, we would need some sufficient conditions. Since we are able to reduce the problem of constrained minima/maxima to that of local ones (see the proof of Theorem 6.3.2), we might employ the sufficient condition which uses the second differential (Theorem 6.1.5). Cf. Exercise 6.3.17.
Example 6.3.5 (Existence of the principal eigenvalue). Let $p > 1$ be a real number, $X \triangleq W_0^{1,p}(0,1)$.$^{31}$ Consider the eigenvalue problem
\[ -\big( |\dot x(t)|^{p-2} \dot x(t) \big)^{\cdot} = \lambda |x(t)|^{p-2} x(t), \quad t \in (0,1), \qquad x(0) = x(1) = 0 \tag{6.3.6} \]
with a real parameter $\lambda$. This problem is linear for $p = 2$ and nonlinear for $p \ne 2$.
We say that $\lambda \in \mathbb{R}$ is an eigenvalue of (6.3.6) if there is a weak solution $x \in X$, $x \ne o$, of (6.3.6), i.e.,
\[ \int_0^1 |\dot x(t)|^{p-2} \dot x(t) \dot y(t)\,dt = \lambda \int_0^1 |x(t)|^{p-2} x(t) y(t)\,dt \tag{6.3.7} \]
holds for every $y \in X$. The corresponding $x$ is then called an eigenfunction associated with the eigenvalue $\lambda$.$^{32}$
$^{31}$ We [...] $\left( \int_0^1 |\dot x(t)|^p\,dt \right)^{1/p}$ [...]
$^{32}$ To see the analogue to the linear case the reader should notice that for $p = 2$ such a function $x$ is an eigenvector (Definition 1.1.27) and $\lambda$ is an eigenvalue of the linear operator $Bx = -\ddot x$, $\operatorname{Dom} B = \{ x \in W_0^{1,2}(0,1) : x(0) = x(1) = 0 \} \subset L^2(0,1)$. The identity (6.3.7) can be interpreted (for $p = 2$) as the operator equation $x = \lambda A x$ where $A$ is defined by the equality $(Ax, y)_{W_0^{1,2}(0,1)} = (x, y)_{L^2(0,1)}$. The eigenvalues of (6.3.6) are then reciprocal values of the eigenvalues of $A$.
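[...]
\[ \lambda = \frac{\int_0^1 |\dot x(t)|^p\,dt}{\int_0^1 |x(t)|^p\,dt}, \]
[...]
\[ \lambda_1 = \inf_{\substack{x \in X\\ x \ne o}} \frac{\int_0^1 |\dot x(t)|^p\,dt}{\int_0^1 |x(t)|^p\,dt}, \tag{6.3.8} \]
i.e.,
\[ \lambda_1 = \inf_{x \in X} \left\{ \int_0^1 |\dot x(t)|^p\,dt : \int_0^1 |x(t)|^p\,dt = 1 \right\} \]
is attained and use the Lagrange Multiplier Method to show that $\lambda_1$ is the least eigenvalue (principal eigenvalue) of (6.3.6). Let us prove that the infimum in (6.3.8) is achieved at an $x_1 \in X$ with
\[ \int_0^1 |x_1(t)|^p\,dt = 1. \]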
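The quotient in (6.3.8) is easy to evaluate for trial functions. The sketch below approximates it by a midpoint rule (the function name and the quadrature are choices made for this illustration) and compares, for $p = 2$, the eigenfunction $\sin \pi t$, for which the quotient equals the well-known first Dirichlet eigenvalue $\pi^2$ of $-\ddot x$ on $(0,1)$, with the polynomial trial function $t(1-t)$.

```python
import math

def rayleigh_quotient(x, dx, p, cells=2000):
    """Midpoint-rule approximation of int |x'|^p dt / int |x|^p dt over (0, 1)."""
    h = 1.0 / cells
    num = den = 0.0
    for k in range(cells):
        t = (k + 0.5) * h
        num += abs(dx(t)) ** p * h
        den += abs(x(t)) ** p * h
    return num / den

q_sin = rayleigh_quotient(lambda t: math.sin(math.pi * t),
                          lambda t: math.pi * math.cos(math.pi * t), 2)
q_poly = rayleigh_quotient(lambda t: t * (1.0 - t), lambda t: 1.0 - 2.0 * t, 2)
print(q_sin, q_poly)
```

Any admissible trial function gives an upper bound for $\lambda_1$; here the polynomial yields the value $10$, slightly above $\pi^2$.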
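[...]
\[ \int_0^1 |x_n(t)|^p\,dt = 1 \qquad \text{and} \qquad \int_0^1 |\dot x_n(t)|^p\,dt \to \lambda_1. \]
In particular, this means that the sequence $\{x_n\}_{n=1}^{\infty}$ is bounded in $X$. By the reflexivity of $X$ and the compact embedding $X = W_0^{1,p}(0,1) \subset\subset L^p(0,1)$ (see Theorem 1.2.28 and Exercise 1.2.46(i)) there exists a subsequence $\{x_{n_k}\}_{k=1}^{\infty}$ [...] such that $x_{n_k} \rightharpoonup x_1$ in $X$ and $x_{n_k} \to x_1$ in $L^p(0,1)$, i.e.,
\[ \int_0^1 |x_1(t)|^p\,dt = 1. \]
[...]
\[ \int_0^1 |\dot x_1(t)|^p\,dt = \lambda_1. \]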
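[...]
\[ f(x) = \int_0^1 |\dot x(t)|^p\,dt \qquad \text{and} \qquad g(x) = \int_0^1 |x(t)|^p\,dt - 1. \]
[...]
\[ f'(x_1)y = p \int_0^1 |\dot x_1(t)|^{p-2} \dot x_1(t) \dot y(t)\,dt, \qquad g'(x_1)y = p \int_0^1 |x_1(t)|^{p-2} x_1(t) y(t)\,dt \qquad \text{for any } y \in X \]
(cf. Exercise 3.2.35). Since $x_1 \ne o$, we also have $g'(x_1) \ne o$, and so the assumptions of Theorem 6.3.2 are fulfilled. Hence there exists $\lambda \in \mathbb{R}$ such that
\[ f'(x_1) = \lambda g'(x_1), \]
which is equivalent to
\[ \int_0^1 |\dot x_1(t)|^{p-2} \dot x_1(t) \dot y(t)\,dt = \lambda \int_0^1 |x_1(t)|^{p-2} x_1(t) y(t)\,dt \tag{6.3.9} \]
[...]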
N
i i (a) = o
with some
i R,
i = 1, . . . , N,
i=1
Krasnoselski [78, Chapter 6] the reader can find the proof of the assertion that the set contains a sequence of nonzero numbers $\lambda_n \ne 0$ such that $\lambda_n \to 0$. The same assertion for a Banach space $X$ can be found in Citlanadze [26], Browder [18], Fučík & Nečas [55]. Actually, the whole Chapter 6 of the lecture notes by Fučík et al. [56] is devoted to this problem. As for more recent references the reader can confer Zeidler [136] and the bibliography therein.
Let us emphasize that in all above results the authors prove that the cardinality of the set is equal to infinity. The question of when this set is countable is much more involved. Some partial results in this direction can be found in Fučík et al. [56]. The proofs are based on a stronger version of the Morse Theorem and go beyond the scope of this book.
Proposition 6.3.8. Let $H$ be an $N$-dimensional Hilbert space and let $A$ be a self-adjoint operator in $H$. Then $A$ has $N$ real eigenvalues $\lambda_1, \dots, \lambda_N$ (if they are counted with their multiplicities), and the corresponding eigenvectors $e_1, \dots, e_N$ form an orthonormal basis in $H$.
Proof. Consider two functions $f, \Phi_1 : H \to \mathbb{R}$ defined by
\[ f(x) = (Ax, x), \qquad \Phi_1(x) = (x, x) - 1, \qquad x \in H. \]
[...] for all $h \in H$, i.e.,
\[ A e_1 = \lambda_1 e_1. \]
2 R such that
and thus there are e2 M2 , 2 ,
2 (e2 )h = (2Ae2 22 e2
2 e1 , h) = 0
f (e2 )h 2 1 (e2 )h
2
(6.3.10)
aij xi xj =
N
i i2 ,
where
x = (x1 , . . . , xN ),
i=1
x=
N
i ei .
i=1
Remark 6.3.10. The procedure explored in the proof of Proposition 6.3.8 has a disadvantage, namely, to find the $k$th eigenvalue $\lambda_k$ it is necessary to know the first $k - 1$ eigenvectors $e_1, \dots, e_{k-1}$. Because of that it can be convenient to have another expression for $\lambda_k$. We will now prove that
\[ \lambda_k = \min_{y_1, \dots, y_{k-1}} \max \left\{ \frac{(Ax, x)}{\|x\|^2} : (x, y_1) = \dots = (x, y_{k-1}) = 0 \text{ and } x \ne o \right\} \tag{6.3.11} \]
provided $\dim H \ge k$. Expression (6.3.11) is called the Minimax Principle.
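The step-by-step procedure from the proof of Proposition 6.3.8 (maximize $(Ax, x)$ on the unit sphere, then on the unit sphere of the orthogonal complement of the eigenvectors already found, and so on) can be imitated numerically. The sketch below uses power iteration with projection as the maximizer; this is a substitute technique chosen for brevity, and it assumes a symmetric positive definite matrix and a starting vector that is not orthogonal to the sought eigenvectors. All names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def project_out(x, vectors):
    """Remove from x its components along the given orthonormal vectors."""
    for e in vectors:
        c = dot(x, e)
        x = [a - c * b for a, b in zip(x, e)]
    return x

def normalize(x):
    n = math.sqrt(dot(x, x))
    return [a / n for a in x]

def eigenpairs_by_constrained_maximization(A, iterations=500):
    """Eigenvalues in decreasing order, each found on the constraint set M_k."""
    n = len(A)
    values, vectors = [], []
    for _ in range(n):
        x = normalize(project_out([1.0 + 0.1 * i for i in range(n)], vectors))
        for _ in range(iterations):
            # stay orthogonal to e_1, ..., e_{k-1} and on the unit sphere
            x = normalize(project_out(matvec(A, x), vectors))
        values.append(dot(matvec(A, x), x))
        vectors.append(x)
    return values, vectors

values, _ = eigenpairs_by_constrained_maximization([[2.0, 1.0], [1.0, 2.0]])
print(values)
```

The loop makes the disadvantage mentioned in the remark visible: the $k$th eigenvalue is only accessible after the first $k - 1$ eigenvectors have been computed.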
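Let $e_1, \dots, e_k$ be eigenvectors corresponding to the first $k$ eigenvalues $\lambda_1 \ge \dots \ge \lambda_k$. Take $y_1, \dots, y_{k-1} \in H$ and let
\[ N = \{ x \ne o : (x, y_1) = \dots = (x, y_{k-1}) = 0 \}. \]
There is an $\tilde x \in N \cap \operatorname{Lin}\{e_1, \dots, e_k\}$, say $\tilde x = \sum_{i=1}^{k} \xi_i e_i$. A simple argument to see this consists in the observation that the linear operator $\mathbb{R}^k \to \mathbb{R}^{k-1}$ (or $\mathbb{C}^k \to \mathbb{C}^{k-1}$) given by
\[ (\xi_1, \dots, \xi_k) \mapsto \Big( \sum_{i=1}^{k} \xi_i (e_i, y_j) \Big)_{j = 1, \dots, k-1} \]
[...]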
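\[ (A\tilde x, \tilde x) = \Big( \sum_{i=1}^{k} \lambda_i \xi_i e_i, \sum_{j=1}^{k} \xi_j e_j \Big) = \sum_{i=1}^{k} \lambda_i |\xi_i|^2 \ge \lambda_k \sum_{i=1}^{k} |\xi_i|^2 = \lambda_k \|\tilde x\|^2. \]
This shows that the maximum in (6.3.11) (denoted by $m(y_1, \dots, y_{k-1})$) is not less than $\lambda_k$ and therefore
\[ \inf_{y_1, \dots, y_{k-1}} m(y_1, \dots, y_{k-1}) \ge \lambda_k, \]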
n 0
(n ),
k = 1, 2, . . . .34
Proof. Set
\[ F(u) = (Au, u), \qquad \Phi_1(u) = \|u\|^2 - 1 \qquad \text{for } u \in H, \]
and
\[ M_1 = \{ u \in H : \Phi_1(u) = 0 \}. \]
$^{33}$ A linear self-adjoint operator $A$ is said to be positive if $(Au, u) > 0$ for all $u \ne o$.
$^{34}$ The reader should compare this assertion and its proof with the Hilbert–Schmidt Theorem (Theorem 2.2.16).
that we can pass to a subsequence (denoted again as $\{u_n\}_{n=1}^{\infty}$) for which
\[ u_n \rightharpoonup e_1 \qquad \text{and} \qquad Au_n \to Ae_1 \quad \text{in } H \]
with an $e_1 \in H$. Then
\[ |(Au_n, u_n) - (Ae_1, e_1)| \le |(Au_n - Ae_1, u_n)| + |(Ae_1, u_n - e_1)| \to 0 \]
since both terms on the right-hand side approach zero. So
\[ F(e_1) = \sup \{ F(u) : u \in M_1 \}. \]
In particular, we have
\[ F(e_1) > 0 \qquad \text{and} \qquad e_1 \ne o. \]
Assume that $\|e_1\| < 1$. Then there exists $t > 1$ such that for $\tilde e_1 = t e_1$ we have $\|\tilde e_1\| = 1$, i.e., $\tilde e_1 \in M_1$. Also
\[ F(\tilde e_1) = (A(te_1), te_1) = t^2 (Ae_1, e_1) = t^2 F(e_1) > \sup \{ F(u) : u \in M_1 \}, \]
a contradiction. Hence
\[ \lambda_1 = F(e_1) = \max \{ F(u) : u \in M_1 \}. \]
Applying Theorem 6.3.2 we prove exactly as in Proposition 6.3.8 that $\lambda_1$ is an eigenvalue of $A$ and $e_1$ is the corresponding eigenvector. Now, we proceed by induction using
\[ M_n = \{ u \in H : \|u\| = 1 \text{ and } (u, e_1) = \dots = (u, e_{n-1}) = 0 \} \]
as above to get the sequence of eigenvalues
\[ \lambda_1 \ge \lambda_2 \ge \dots > 0 \tag{6.3.12} \]
(w, en ) = 0
and
for all n N.
Then
w
Mn ,
(Aw, w) n
and thus
for n = 1, 2, . . . .
n=1
for w =
n en = 0,
n=1
then
n n = n
for n = 1, 2, . . . .
Therefore n = 0 provided n = .
The min-max characterization of the $\lambda_n$'s follows as in the finite dimensional case (Remark 6.3.10).
Remark 6.3.13. It is remarkable that the Minimax Principle holds even without the assumption on the continuity of $A$ in the sense that
\[ \inf_{y_1, \dots, y_{k-1}} \dots \]
yields either the $k$th eigenvalue or an upper bound of the essential spectrum of a linear self-adjoint operator $A$ provided $A$ is bounded above. For details see, e.g., Reed & Simon [106].
There is also a dual characterization of the eigenvalues of $A$ called the Courant–Weinstein Variational Principle.
Theorem 6.3.14 (Courant–Weinstein Variational Principle). Let $H$ be a real separable Hilbert space, $A : H \to H$ a positive compact self-adjoint linear operator. Assume that the eigenvalues $\lambda_n$ of $A$ form a decreasing sequence
\[ \lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \dots \ge \lambda_n \ge \dots > 0, \qquad \lambda_n \to 0 \quad (n \to \infty) \]
(cf. Theorem 6.3.12), and the multiplicity of an eigenvalue indicates how many times this repeats in the above sequence. Then for any $n \in \mathbb{N}$,
\[ \lambda_n = \sup_{\substack{X \subset H\\ \dim X = n}} \ \min_{\substack{u \in X\\ \|u\| = 1}} (Au, u). \]
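Proof. Set
\[ \tilde\lambda_n = \sup_{\substack{X \subset H\\ \dim X = n}} \ \min_{\substack{u \in X\\ \|u\| = 1}} (Au, u). \]
Our aim is to prove $\tilde\lambda_n = \lambda_n$.
Step 1. We prove that $\tilde\lambda_n \ge \lambda_n$. Set
\[ X_0 = \operatorname{Lin}\{e_1, \dots, e_n\}. \]
Then $X_0$ is a linear subspace of $H$, $\dim X_0 = n$, and clearly
\[ \tilde\lambda_n \ge \min_{\substack{u \in X_0\\ \|u\| = 1}} (Au, u). \]
However, we can estimate the minimum of the quadratic form on the right-hand side in terms of $\lambda_n$. For $u \in X_0$, $\|u\| = 1$ we have
\[ u = \sum_{i=1}^{n} x_i e_i, \qquad \sum_{i=1}^{n} x_i^2 = 1. \]
Then
\[ (Au, u) = \Big( \sum_{i=1}^{n} x_i \lambda_i e_i, \sum_{j=1}^{n} x_j e_j \Big) = \sum_{i=1}^{n} \lambda_i x_i^2 \ge \lambda_n, \qquad \text{i.e.,} \qquad \tilde\lambda_n \ge \lambda_n. \]
Step 2. We prove $\tilde\lambda_n \le \lambda_n$. Set
\[ Y = \overline{\operatorname{Lin}\{e_i\}_{i=n}^{\infty}}. \]
Then $\operatorname{codim} Y = n - 1$. Let $X$ be an arbitrary linear subspace of $H$, $\dim X = n$. Then necessarily
\[ \dim (X \cap Y) > 0, \]
and the space $X \cap Y$ must contain an element $w \ne o$. We can assume $\|w\| = 1$. Since $w \in Y$, we have
\[ w = \sum_{i=n}^{\infty} x_i e_i, \qquad \sum_{i=n}^{\infty} x_i^2 = 1. \]
The estimate of the quadratic form $(Au, u)$ on the unit sphere in $X$ yields
\[ \min_{\substack{u \in X\\ \|u\| = 1}} (Au, u) \le (Aw, w) = \sum_{i=n}^{\infty} \lambda_i x_i^2 \le \lambda_n \sum_{i=n}^{\infty} x_i^2 = \lambda_n. \]
Since $X$ is arbitrary, the inequality $\tilde\lambda_n \le \lambda_n$ follows.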
Example 6.3.15 (Higher eigenvalues). Let $p = 2$ in (6.3.6), i.e., let us consider the eigenvalue problem
\[ \ddot x(t) + \lambda x(t) = 0, \quad t \in (0,1), \qquad x(0) = x(1) = 0. \tag{6.3.13} \]
The eigenvalues of the linear problem (6.3.13) can be calculated in an elementary way. On the other hand, if we set $H \triangleq W_0^{1,2}(0,1)$ and define a positive and compact operator $A : H \to H$ by
\[ (Ax, y)_{W_0^{1,2}(0,1)} = \int_0^1 x(t) y(t)\,dt,^{36} \]
[...]
\[ \sup_{\substack{X \subset H\\ \dim X = n}} \ \min_{\|x\| = 1} \int_0^1 |x(t)|^2\,dt. \]
The following two exercises show the relation between the local (global) extremum subject to a constraint and the local (global) extremum of the functional depending on a parameter (without the constraint).
Exercise 6.3.16. Prove the following assertion:
Let $f$, $\Phi$ be two real functionals defined on a real Hilbert space $H$. Let the functional $f - \lambda\Phi$ ($\lambda \in \mathbb{R}$) have a local (global) extremum at a point $x_0 \in H$. Then the functional $f$ has a local (global) extremum subject to the constraint $\{ x \in H : \Phi(x) = \Phi(x_0) \}$ at the point $x_0$.
Exercise 6.3.17. Prove the following assertion:
Let $f, \Phi : X \to \mathbb{R}$ satisfy the assumptions of Theorem 6.3.2 and let $x_0 \in X$, $\lambda \in \mathbb{R}$ be such that
\[ f'(x_0) - \lambda \Phi'(x_0) = 0. \]
Assume, moreover, that there exist $D^2 f(x_0; h, h)$, $D^2 \Phi(x_0; h, h)$. Then $x_0$ is a local minimum of $f - \lambda\Phi$ (without the constraint) provided the quadratic form
\[ h \mapsto D^2 f(x_0; h, h) - \lambda D^2 \Phi(x_0; h, h), \qquad h \in X, \]
is positive definite in $X$.
36 By
compactness of A follows.
dt +
x(t)2 dt c
x(t)
dt.
0
Exercise 6.3.19. Prove that for all $x \in W_0^{1,2}(0, \pi)$ the inequality
\[ \int_0^{\pi} |x(t)|^2\,dt \le \int_0^{\pi} |\dot x(t)|^2\,dt \]
holds true.
in
Y,
h(1, u) B.
f (1, x) = x0
x S1 .
for all
P1 f (t, x)
.
P1 f (t, x)
P1 x 0
P1 x0
in S1 .
Step 2. Let the unit sphere $S_1 \subset H_1$ be contractible to a point in $S_1$, i.e., there exists a continuous map $g : [0,1] \times S_1 \to S_1$ and a point $x_0 \in S_1$ such that
\[ g(0, x) = x, \qquad g(1, x) = x_0 \qquad \text{for all } x \in S_1. \]
[...]
\[ h : x \mapsto \begin{cases} -g\Big(1 - \|x\|, \dfrac{x}{\|x\|}\Big) & \text{for } x \ne o,\\[2mm] -x_0 & \text{for } x = o. \end{cases} \]
Then $h$ is continuous. Since $\dim H_1 < \infty$, the Brouwer Fixed Point Theorem (Theorem 5.1.3) implies that there exists $y \in B(o; 1) \subset H_1$ such that
\[ h(y) = y. \]
Since $h$ assumes only values from $S_1$, we have $y \in S_1$, $\|y\| = 1$. On the other hand,
\[ h(y) = -g(0, y) = -y, \]
which is a contradiction.
Lemma 6.3.23. Let F be a subset of R. If there exists x0 H1 , x0 = 1, such that
P1 (F) {y H1 : y = ax0 , a R} = ,
then F is contractible to a point in R.
37 I.e.,
x + 2tx0 [1 (x, x0 )]
x0 + 2(1 t)[x (x, x0 )x0 ]
for
for
8
9
t 0, 12 , x F,
1 9
t 2 , 1 , x F.
9
1
2
we have
P1 f (t, x) = 2t[1 (x, x0 )]x0 + P1 x,
for t
1
2
9
, 1 we have
P1 f (t, x) = [1 2(1 t)(x, x0 )]x0 + 2(1 t)P1 x.
(6.3.14)
Remark 6.3.25. Let us recall how to interpret the equality (6.3.14). The Fréchet derivative $F'(x)$ is a continuous linear operator from $H$ into $\mathbb{R}$. It follows from the Riesz Representation Theorem (see Theorem 1.2.40) that there is a unique point $z \triangleq z(x) \in H$ such that
\[ F'(x)y = (y, z), \qquad \|z\| = \|F'(x)\| \qquad \text{for any } y \in H. \]
In what follows we will identify $F'(x)$ with $z(x) \in H$ and study bifurcation points of the equation
\[ \lambda x - F'(x) = o. \tag{6.3.15} \]
The main objective of this appendix is to prove that (under the assumptions $F(o) = 0$, $F'(o) = o$ and some assumptions concerning the smoothness of $F$) every point $(\lambda, o)$ where $\lambda$ is a nonzero eigenvalue of $F''(o) : H \to H$ is a bifurcation point of (6.3.15).
(6.3.16)
F is compact on U(o),
(6.3.17)
(6.3.18)
F (o) = o,
F (o) = o.
(6.3.19)
Then (0 , o) where 0
= 0 is a bifurcation point of
x F (x) = o
(6.3.20)
x 0.
(6.3.21)
In this case the differential $F'(x)$ is perpendicular (recall that $F'(x) \in H$ in our interpretation) to the sphere $\partial B(o; r)$ at $x$. Then $x$ can be looked for as a limit of those points of the sphere $\partial B(o; r)$ at which the tangent projections (see (6.3.22) below and Figure 6.3.1) of $F'(x)$ converge to zero. More precisely, we have
Figure 6.3.1.
Lemma 6.3.28. For $z \in H$, $z \ne o$, set
\[ D(z) = F'(z) - \frac{(F'(z), z)}{(z, z)}\,z \tag{6.3.22} \]
($D(z)$ is the orthogonal projection of $F'(z)$ to the tangent space of $\partial B(o; \|z\|)$ at $z$$^{38}$). Let $y_n \in \partial B(o; r)$, $y_n \rightharpoonup x_0$, and let $F'$ be continuous, and
\[ \lim_{n \to \infty} F'(y_n) = y \ne o, \qquad \lim_{n \to \infty} D(y_n) = o. \tag{6.3.23} \]
Then $y_n \to x_0$, $y = F'(x_0)$, $x_0 \ne o$, and
\[ \lambda x_0 - F'(x_0) = o \qquad \text{where} \qquad \lambda = \frac{1}{r^2} (F'(x_0), x_0). \tag{6.3.24} \]
[...] and hence
\[ \frac{(F'(y_n), y_n)}{r^2}\,y_n \rightharpoonup \frac{(y, x_0)}{r^2}\,x_0. \]
At the same time, from the definition of $D(y_n)$ and (6.3.23) we have
\[ \frac{(F'(y_n), y_n)}{r^2}\,y_n = F'(y_n) - D(y_n) \to y. \]
Hence
\[ y = \frac{1}{r^2} (y, x_0)\,x_0. \]
Since $y \ne o$, we have $x_0 \ne o$ and also $(y, x_0) \ne 0$. The definition of $D(y_n)$ and the fact that $D(y_n) \to o$ yield
\[ y_n = r^2\,\frac{F'(y_n) - D(y_n)}{(F'(y_n), y_n)} \to r^2\,\frac{y}{(y, x_0)} = x_0, \]
[...] i.e.,
\[ F'(x_0) = \frac{(y, x_0)}{r^2}\,x_0 = \frac{(F'(x_0), x_0)}{r^2}\,x_0. \]
We will look for a curve on the sphere $\partial B(o; r)$ which starts at a fixed point $x$; the values of $F$ along this curve do not decrease, and after a finite time (even if large) we almost reach the critical point of $F$. In other words, we are looking for a curve $k = k(t, x)$, $t \in [0, \infty)$, $x \in \partial B(o; r)$ such that
\[ k(0, x) = x, \qquad k(t, x) \in \partial B(o; r), \tag{6.3.25} \]
i.e.,
\[ \|k(t, x)\|^2 = r^2. \]
[...]
\[ \Big( \frac{d}{dt} k(t, x), k(t, x) \Big) = 0 \qquad \text{for all } t \in (0, \infty). \tag{6.3.26} \]
The equality (6.3.26) states that for all $t \in (0, \infty)$ the element $\frac{d}{dt} k(t, x)$ is perpendicular to $k(t, x)$. This will be satisfied if we look for a solution of the initial value problem
\[ \frac{d}{dt} k(t, x) = D(k(t, x)), \qquad k(0, x) = x. \tag{6.3.27} \]
Let $k$ be a solution of the initial value problem (6.3.27). Then it has the following important properties:
(i) For any $t \in (0, \infty)$ we have
\[ \|k(t, x)\| = \|x\|. \]
(ii) For any $t \in (0, \infty)$ we have
\[ \frac{d}{dt} F(k(t, x)) = (F'(k(t, x)), D(k(t, x))) = \|D(k(t, x))\|^2 \ge 0. \]
In other words, the values of the functional $F$ increase along $k$ regardless of the choice of $x \in \partial B(o; r)$.
(iii) For any $t \in (0, \infty)$ we have
\[ F(k(t, x)) - F(x) = \int_0^t \|D(k(\tau, x))\|^2\,d\tau. \]
Since $F$ is bounded on $B(o; r)$ (by the Mean Value Theorem and (6.3.19)), there exists a sequence $\{t_i\}_{i=1}^{\infty} \subset (0, \infty)$ such that
\[ \lim_{i \to \infty} D(k(t_i, x)) = o.^{40} \]
(iv) Since $\{k(t_i, x)\}_{i=1}^{\infty}$ is bounded, we can select a weakly convergent subsequence.
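Properties (i)-(iii) can be observed on a finite dimensional model. The sketch below takes $F(x) = \frac12 (Ax, x)$ on $\mathbb{R}^2$, so that $F'(x) = Ax$, and integrates $\dot k = D(k)$ by the explicit Euler method. Both the model functional and the rescaling back to the sphere after each step (Euler steps drift slightly off the sphere) are assumptions of this illustration, not part of the argument above.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def tangent_projection(A, x):
    """D(x) = F'(x) - (F'(x), x)/(x, x) * x for the model F(x) = 0.5*(Ax, x)."""
    g = matvec(A, x)
    coef = dot(g, x) / dot(x, x)
    return [gi - coef * xi for gi, xi in zip(g, x)]

def flow_on_sphere(A, x, step=0.01, steps=5000):
    """Explicit Euler for k' = D(k), rescaled to the sphere of radius |x| after each step."""
    r = math.sqrt(dot(x, x))
    history = [0.5 * dot(matvec(A, x), x)]
    for _ in range(steps):
        d = tangent_projection(A, x)
        x = [xi + step * di for xi, di in zip(x, d)]
        scale = r / math.sqrt(dot(x, x))
        x = [scale * xi for xi in x]
        history.append(0.5 * dot(matvec(A, x), x))
    return x, history

A = [[2.0, 1.0], [1.0, 2.0]]
x_end, history = flow_on_sphere(A, [1.0, 0.0])
print(x_end, history[0], history[-1])
```

In this model the values of $F$ increase from $1$ towards $\frac32$, and the limit point is an eigenvector of $A$ for the largest eigenvalue $3$, i.e., a point where $D$ vanishes.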
Summarizing, we have
Lemma 6.3.29. For any $x \in \partial B(o; r)$ there exist a sequence $\{t_i\}_{i=1}^{\infty} \subset (0, \infty)$ and $x_0 \in H$ such that
\[ k(t_i, x) \rightharpoonup x_0, \tag{6.3.28} \]
\[ D(k(t_i, x)) \to o, \tag{6.3.29} \]
\[ \{F(k(t_i, x))\}_{i=1}^{\infty} \text{ is an increasing sequence.} \tag{6.3.30} \]
It follows from (6.3.28) and (6.3.17) that $F'(k(t_i, x)) \to y$. If we prove that $y \ne o$, then the assumptions of Lemma 6.3.28 are verified with $y_n = k(t_n, x)$, and so the existence of a solution $x_0$ of (6.3.20) with $\lambda$ described by (6.3.24) will be proved.
By an appropriate choice of the initial condition $x \in \partial B(o; r)$, we show that the above convergence takes place and that $\lambda$ given by (6.3.24) is sufficiently close to $\lambda_0$.
Recall that $A = F''(o)$ is a compact linear self-adjoint operator in the Hilbert space $H$ (see Proposition 5.2.21). Its spectrum consists of a countable set of real eigenvalues with one possible limit point $\lambda = 0$. We split the set of all eigenvalues to the parts $\lambda \ge \lambda_0$ and $\lambda < \lambda_0$, respectively. We denote by $H_1$ and $H_2$, respectively, the corresponding closed linear subspaces generated by the eigenvectors (see Theorem 2.2.16). Note that $\lambda_0 > 0$ implies that $\dim H_1 < \infty$. The eigenspace associated with $\lambda_0$ will be denoted by $H_0$. Let $P_1$, $P_2$ be the orthogonal projections of $H$ onto $H_1$, $H_2$, respectively (see Figure 6.3.2).
40 The
Figure 6.3.2.
Let us denote
\[ S_1 = \{ x \in H_1 : \|x\| = r \}. \]
Lemma 6.3.30. There exists $r_0 > 0$ such that $B(o; r_0) \subset U(o)$ (see (6.3.16)), and for all $0 < r < r_0$ we have
(i) there is no $t \in [0, \infty)$ for which the set $k(t, S_1)$ is contractible to a point (see Definition 6.3.20) in
\[ \mathcal{R} = \{ x \in H : P_1 x \ne o \}, \]
(ii) for any $t \in [0, \infty)$ there exists $x_t \in S_1$ such that
\[ P_1 k(t, x_t) \in H_0, \qquad \text{i.e.,} \qquad k(t, x_t) \in H_0 \oplus H_2. \]
Proof. Lemma 6.3.23 and (i) imply (ii) (see Exercise 6.3.33). Hence we prove only (i). According to Lemma 6.3.21 it is sufficient to prove that for any $t$ the set $S_1$ is contractible into $k(t, S_1)$ in $\mathcal{R}$. Indeed, according to Lemma 6.3.22 the set $S_1$ is not contractible to a point in $\mathcal{R}$. Since $k$ is a continuous function of both variables, it is sufficient to prove that it assumes only values from $\mathcal{R}$: we want to prove that
\[ P_1 k(t, x) \ne o, \qquad t \in [0, \infty), \quad x \in S_1. \]
We have
F (k(0, x)) = F (x)
1
(F (o)x, x) (x)x2
2
(6.3.31)
1
0 (x) x2
2
where (r) 0 as r 0 (see (6.3.19) and Proposition 3.2.27). Note that the last
inequality holds due to x H1 . Since F (k(t, x)) is increasing in t, we conclude herefrom
that
1
0 (r) r 2 .
F (k(t, x))
(6.3.32)
2
On the other hand, we have an estimate from above (we write k instead of k(t, x) for the
sake of brevity):
1
1
F (k) = (F (o)k, k) + F (k) (F (o)k, k)
2
2
1
1
(F (o)P1 k, P1 k) + (F (o)P2 k, P2 k) + (k)k2
2
2
(note that (F (o)P1 k, P2 k) = 0 due to H1 H2 ).
Then
2
r +
P1 k2 + (r)r 2 .
2
2
It follows from (6.3.32) and (6.3.33) that
F (k)
(6.3.33)
0 2
4
r
(r)r 2 .
for any
r r0
where
a = a(r0 ) > 0.
(6.3.34)
Proof of Theorem 6.3.26. Step 1. Let tn be an arbitrary sequence of positive numbers. Let xn be a point from S1 for which
P1 k(tn , xn ) H0
(its existence follows from (ii) of Lemma 6.3.30). Since S1 is compact, we can select a
strongly convergent subsequence (denoted again by {xn }
n=1 ) such that
.
lim xn = x
(6.3.35)
in
H,
We show that y
= o. Indeed, we have
(F (yi ), P1 yi ) (y, P1 x0 ).
Also, for all i N, we have the estimate
(F (yi ), P1 yi ) = (F (o)yi , P1 yi ) + (F (yi ) F (o)yi , P1 yi )
1
0 P1 yi 2 (yi )yi 2 0 ar 2
2
for all r small enough due to (6.3.34). This immediately implies
(y, P1 x0 )
= 0,
41 We
and so
y
= o, x0
= o.
1
(F (x0