
Graduate Texts in Mathematics 12

Managing Editors: P. R. Halmos


C. C. Moore

Richard Beals

Advanced
Mathematical
Analysis
Periodic Functions and Distributions,
Complex Analysis, Laplace Transform
and Applications

Springer Science+Business Media, LLC

Richard Beals
Professor of Mathematics
University of Chicago
Department of Mathematics
5734 University Avenue
Chicago, Illinois 60637

Managing Editors

P. R. Halmos
Indiana University
Department of Mathematics
Swain Hall East
Bloomington, Indiana 47401

C. C. Moore
University of California at Berkeley
Department of Mathematics
Berkeley, California 94720

AMS Subject Classification
46-01, 46S05, 46C05, 30-01, 43-01, 34-01, 35-01

Library of Congress Cataloging in Publication Data

Beals, Richard, 1938-
Advanced mathematical analysis.

(Graduate texts in mathematics, v. 12)
1. Mathematical analysis. I. Title. II. Series.
QA300.B4 515 73-6884

All rights reserved.

No part of this book may be translated or reproduced in any form
without written permission from Springer-Verlag.

1973 by Springer Science+Business Media New York


Originally published by Springer-Verlag New York Inc in 1973

ISBN 978-0-387-90066-7
ISBN 978-1-4684-9886-8 (eBook)
DOI 10.1007/978-1-4684-9886-8

to Nancy

PREFACE
Once upon a time students of mathematics and students of science or
engineering took the same courses in mathematical analysis beyond calculus.
Now it is common to separate "advanced mathematics for science and engineering" from what might be called "advanced mathematical analysis for
mathematicians." It seems to me both useful and timely to attempt a
reconciliation.
The separation between kinds of courses has unhealthy effects. Mathematics students reverse the historical development of analysis, learning the
unifying abstractions first and the examples later (if ever). Science students
learn the examples as taught generations ago, missing modern insights. A
choice between encountering Fourier series as a minor instance of the representation theory of Banach algebras, and encountering Fourier series in
isolation and developed in an ad hoc manner, is no choice at all.
It is easy to recognize these problems, but less easy to counter the legitimate pressures which have led to a separation. Modern mathematics has
broadened our perspectives by abstraction and bold generalization, while
developing techniques which can treat classical theories in a definitive way.
On the other hand, the applier of mathematics has continued to need a variety
of definite tools and has not had the time to acquire the broadest and most
definitive grasp: to learn necessary and sufficient conditions when simple
sufficient conditions will serve, or to learn the general framework encompassing different examples.
This book is based on two premises. First, the ideas and methods of the
theory of distributions lead to formulations of classical theories which are
satisfying and complete mathematically, and which at the same time provide
the most useful viewpoint for applications. Second, mathematics and science
students alike can profit from an approach which treats the particular in a
careful, complete, and modern way, and which treats the general as obtained
by abstraction for the purpose of illuminating the basic structure exemplified
in the particular. As an example, the basic $L^2$ theory of Fourier series can be
established quickly and with no mention of measure theory once $L^2(0, 2\pi)$ is
known to be complete. Here $L^2(0, 2\pi)$ is viewed as a subspace of the space of
periodic distributions and is shown to be a Hilbert space. This leads to a discussion of abstract Hilbert space and orthogonal expansions. It is easy to
derive necessary and sufficient conditions that a formal trigonometric series
be the Fourier series of a distribution, an $L^2$ distribution, or a smooth
function. This in turn facilitates a discussion of smooth solutions and distribution solutions of the wave and heat equations.
The book is organized as follows. The first two chapters provide background material which many readers may profitably skim or skip. Chapters
3, 4, and 5 treat periodic functions and distributions, Fourier series, and
applications. Included are convolution and approximation (including the
Weierstrass theorems), characterization of periodic distributions, elements of


Hilbert space theory, and the classical problems of mathematical physics. The
basic theory of functions of a complex variable is taken up in Chapter 6.
Chapter 7 treats the Laplace transform from a distribution-theoretic point of
view and includes applications to ordinary differential equations. Chapters 6
and 7 are virtually independent of the preceding three chapters; a quick
reading of sections 2, 3, and 5 of Chapter 3 may help motivate the procedure
of Chapter 7.
I am indebted to Max Jodeit and Paul Sally for lively discussions of what
and how analysts should learn, to Nancy for her support throughout, and
particularly to Fred Flowers for his excellent handling of the manuscript.
Richard Beals

TABLE OF CONTENTS

Chapter One    Basic concepts
1. Sets and functions
2. Real and complex numbers
3. Sequences of real and complex numbers
4. Series
5. Metric spaces
6. Compact sets
7. Vector spaces

Chapter Two    Continuous functions
1. Continuity, uniform continuity, and compactness
2. Integration of complex-valued functions
3. Differentiation of complex-valued functions
4. Sequences and series of functions
5. Differential equations and the exponential function
6. Trigonometric functions and the logarithm
7. Functions of two variables
8. Some infinitely differentiable functions

Chapter Three    Periodic functions and periodic distributions
1. Continuous periodic functions
2. Smooth periodic functions
3. Translation, convolution, and approximation
4. The Weierstrass approximation theorems
5. Periodic distributions
6. Determining the periodic distributions
7. Convolution of distributions
8. Summary of operations on periodic distributions

Chapter Four    Hilbert spaces and Fourier series
1. An inner product in $\mathscr{C}$, and the space $\mathscr{L}^2$
2. Hilbert space
3. Hilbert spaces of sequences
4. Orthonormal bases
5. Orthogonal expansions
6. Fourier series

Chapter Five    Applications of Fourier series
1. Fourier series of smooth periodic functions and periodic distributions
2. Fourier series, convolutions, and approximation
3. The heat equation: distribution solutions
4. The heat equation: classical solutions; derivation
5. The wave equation
6. Laplace's equation and the Dirichlet problem

Chapter Six    Complex analysis
1. Complex differentiation
2. Complex integration
3. The Cauchy integral formula
4. The local behavior of a holomorphic function
5. Isolated singularities
6. Rational functions; Laurent expansions; residues
7. Holomorphic functions in the unit disc

Chapter Seven    The Laplace transform
1. Introduction
2. The space $\mathscr{L}$
3. The space $\mathscr{L}'$
4. Characterization of distributions of type $\mathscr{L}'$
5. Laplace transforms of functions
6. Laplace transforms of distributions
7. Differential equations

Notes and bibliography

Notation index

Subject index

Advanced Mathematical Analysis

Chapter 1

Basic Concepts
1. Sets and functions
One feature of modern mathematics is the use of abstract concepts to
provide a language and a unifying framework for theories encompassing
numerous special cases and examples. Two important examples of such
concepts, that of "metric space" and that of "vector space," will be taken up
later in this chapter. In this section we discuss briefly the concepts, even more
basic, of "set" and of "function."
We assume that the intuitive notion of a "set" and of an "element" of a
set are familiar. A set is determined when its elements are specified in some
manner. The exact manner of specification is irrelevant, provided the elements
are the same. Thus

A = {3, 5, 7}
means that A is the set with three elements, the integers 3, 5, and 7. This is the
same as
A = {7, 3, 5},
or
A = {n | n is an odd positive integer between 2 and 8}
or
A = {2n + 1 | n = 1, 2, 3}.
In expressions such as the last two, the phrase after the vertical line is supposed to prescribe exactly what precedes the vertical line, thus prescribing
the set. It is convenient to allow repetitions; thus A above is also

{5, 3, 7, 3, 3},
still a set with three elements. If $x$ is an element of $A$ we write

$$x \in A \quad\text{or}\quad A \ni x.$$

If $x$ is not an element of $A$ we write

$$x \notin A \quad\text{or}\quad A \not\ni x.$$

The sets of all integers and of all positive integers are denoted by $\mathbb{Z}$ and
$\mathbb{Z}_+$ respectively:

$$\mathbb{Z} = \{0, 1, -1, 2, -2, 3, -3, \ldots\},$$
$$\mathbb{Z}_+ = \{1, 2, 3, 4, \ldots\}.$$
As usual the three dots ... indicate a presumed understanding about what
is omitted.
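
That a set is determined by its elements, and not by the manner of specification, can be checked mechanically. The following is a minimal sketch in Python, whose built-in set type also ignores order and repetition; the variable names are illustrative and do not come from the text.

```python
# Several descriptions of the same set A = {3, 5, 7}.
A_listed = {3, 5, 7}
A_reordered = {7, 3, 5}
# "n is an odd positive integer between 2 and 8"
A_by_property = {n for n in range(1, 20) if n % 2 == 1 and 2 < n < 8}
# "2n + 1 for n = 1, 2, 3"
A_by_formula = {2 * n + 1 for n in (1, 2, 3)}
# Repetitions are allowed and do not add elements.
A_with_repeats = {5, 3, 7, 3, 3}

all_descriptions = [A_listed, A_reordered, A_by_property, A_by_formula, A_with_repeats]
same_set = all(S == A_listed for S in all_descriptions)
element_count = len(A_with_repeats)
```

Here `same_set` records whether all five descriptions agree, and `element_count` is the number of elements of the set written with repetitions.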

Other matters of notation:

$\emptyset$ denotes the empty set (no elements).
$A \cup B$ denotes the union, $\{x \mid x \in A \text{ or } x \in B \text{ (or both)}\}$.
$A \cap B$ denotes the intersection, $\{x \mid x \in A \text{ and } x \in B\}$.

The union of $A_1, A_2, \ldots, A_m$ is denoted by

$$A_1 \cup A_2 \cup A_3 \cup \cdots \cup A_m \quad\text{or}\quad \bigcup_{j=1}^{m} A_j,$$

and the intersection by

$$A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_m \quad\text{or}\quad \bigcap_{j=1}^{m} A_j.$$

The union and the intersection of an infinite family of sets $A_1, A_2, \ldots$ indexed
by $\mathbb{Z}_+$ are denoted by

$$\bigcup_{j=1}^{\infty} A_j \quad\text{and}\quad \bigcap_{j=1}^{\infty} A_j.$$

More generally, suppose $J$ is a set, and suppose that for each $j \in J$ we are
given a set $A_j$. The union and intersection of all the $A_j$ are denoted by

$$\bigcup_{j \in J} A_j \quad\text{and}\quad \bigcap_{j \in J} A_j.$$

A set $A$ is a subset of a set $B$ if every element of $A$ is an element of $B$; we
write

$$A \subset B \quad\text{or}\quad B \supset A.$$

In particular, for any $A$ we have $\emptyset \subset A$. If $A \subset B$, the complement of $A$ in $B$
is the set of elements of $B$ not in $A$:

$$B \setminus A = \{x \mid x \in B,\ x \notin A\}.$$

Thus $C = B \setminus A$ is equivalent to the two conditions

$$A \cup C = B, \qquad A \cap C = \emptyset.$$

The product of two sets $A$ and $B$ is the set of ordered pairs $(x, y)$ where
$x \in A$ and $y \in B$; this is written $A \times B$. More generally, if $A_1, A_2, \ldots, A_n$ are
sets, then

$$A_1 \times A_2 \times \cdots \times A_n$$

is the set whose elements are all the ordered $n$-tuples $(x_1, x_2, \ldots, x_n)$, where
each $x_j \in A_j$. The product

$$A \times A \times \cdots \times A$$

of $n$ copies of $A$ is also written $A^n$.
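
The operations just introduced can be tried on small finite sets. The sketch below is only an illustration in Python; the particular sets `A` and `B` are assumptions chosen for the example, and `itertools.product` plays the role of the product set.

```python
from itertools import product

B = {1, 2, 3, 4, 5}
A = {1, 3}                       # a subset of B
C = B - A                        # the complement of A in B

# C = B \ A is equivalent to: A u C = B and A n C = empty set.
union_ok = (A | C) == B
disjoint_ok = (A & C) == set()

pairs = set(product(A, B))            # the product A x B, as ordered pairs
triples = set(product(A, repeat=3))   # the product A^3 of three copies of A
```

For finite sets the product $A \times B$ has as many elements as the product of the two counts, which is what `len(pairs)` reports.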
A function from a set $A$ to a set $B$ is an assignment, to each element of $A$,
of some unique element of $B$. We write

$$f: A \to B$$

for a function $f$ from $A$ to $B$. If $x \in A$, then $f(x)$ denotes the element of $B$
assigned by $f$ to the element $x$. The elements assigned by $f$ are often called
values. Thus a real-valued function on $A$ is a function $f: A \to \mathbb{R}$, $\mathbb{R}$ the set of
real numbers. A complex-valued function on $A$ is a function $f: A \to \mathbb{C}$, $\mathbb{C}$ the
set of complex numbers.
A function $f: A \to B$ is said to be 1-1 ("one-to-one") or injective if it
assigns distinct elements of $B$ to distinct elements of $A$: if $x, y \in A$ and
$x \ne y$, then $f(x) \ne f(y)$. A function $f: A \to B$ is said to be onto or surjective
if for each element $y \in B$, there is some $x \in A$ such that $f(x) = y$. A function
$f: A \to B$ which is both 1-1 and onto is said to be bijective.
If $f: A \to B$ and $g: B \to C$, the composition of $f$ and $g$ is the function
denoted by $g \circ f$:

$$g \circ f: A \to C, \qquad g \circ f(x) = g(f(x)) \quad\text{for all } x \in A.$$

If $f: A \to B$ is bijective, there is a unique inverse function $f^{-1}: B \to A$ with the
properties: $f^{-1} \circ f(x) = x$ for all $x \in A$; $f \circ f^{-1}(y) = y$ for all $y \in B$.
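Examples
Consider the functions $f: \mathbb{Z} \to \mathbb{Z}_+$, $g: \mathbb{Z} \to \mathbb{Z}$, $h: \mathbb{Z} \to \mathbb{Z}$, defined by

$$f(n) = n^2 + 1, \quad n \in \mathbb{Z},$$
$$g(n) = 2n, \quad n \in \mathbb{Z},$$
$$h(n) = 1 - n, \quad n \in \mathbb{Z}.$$

Then $f$ is neither 1-1 nor onto, $g$ is 1-1 but not onto, $h$ is bijective, $h^{-1}(n) = 1 - n$, and $f \circ h(n) = n^2 - 2n + 2$.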
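
The claims in these examples can be spot-checked numerically. The following Python sketch tests them on a finite window of integers; the window `range(-50, 51)` is an arbitrary assumption, and such a check is evidence for the claims rather than a proof of them.

```python
def f(n):
    return n * n + 1

def g(n):
    return 2 * n

def h(n):
    return 1 - n

window = range(-50, 51)   # a finite sample of the integers

f_not_injective = f(1) == f(-1)                        # two inputs, same value
f_misses_3 = all(f(n) != 3 for n in window)            # n^2 + 1 = 3 has no integer solution
g_injective_on_window = len({g(n) for n in window}) == len(window)
g_misses_1 = all(g(n) != 1 for n in window)            # 2n is never odd
h_is_own_inverse = all(h(h(n)) == n for n in window)   # h^{-1}(n) = 1 - n
composition_ok = all(f(h(n)) == n * n - 2 * n + 2 for n in window)
```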
A set $A$ is said to be finite if either $A = \emptyset$ or there is an $n \in \mathbb{Z}_+$ and a
bijective function $f$ from $A$ to the set $\{1, 2, \ldots, n\}$. The set $A$ is said to be
countable if there is a bijective $f: A \to \mathbb{Z}_+$. This is equivalent to requiring that
there be a bijective $g: \mathbb{Z}_+ \to A$ (since if such an $f$ exists, we can take $g = f^{-1}$;
if such a $g$ exists, take $f = g^{-1}$). The following elementary criterion is
convenient.
Proposition 1.1. If there is a surjective (onto) function $f: \mathbb{Z}_+ \to A$, then $A$
is either finite or countable.
Proof. Suppose $A$ is not finite. Define $g: \mathbb{Z}_+ \to \mathbb{Z}_+$ as follows. Let
$g(1) = 1$. Since $A$ is not finite, $A \ne \{f(1)\}$. Let $g(2)$ be the first integer $m$ such
that $f(m) \ne f(1)$. Having defined $g(1), g(2), \ldots, g(n)$, let $g(n + 1)$ be the first
integer $m$ such that $f(m) \notin \{f(g(1)), f(g(2)), \ldots, f(g(n))\}$; such an $m$ exists because
$f$ is onto and $A$ is not finite. The function $g$ defined
inductively on all of $\mathbb{Z}_+$ in this way has the property that $f \circ g: \mathbb{Z}_+ \to A$ is
bijective. In fact, it is 1-1 by the construction. It is onto because $f$ is onto and,
by the construction, for each $n$ the set $\{f(1), f(2), \ldots, f(n)\}$ is a subset of
$\{f \circ g(1), f \circ g(2), \ldots, f \circ g(n)\}$. $\square$
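
The construction in this proof amounts to keeping, for each value of $f$, the first index at which that value occurs. A minimal Python sketch of the idea follows; the generator `first_occurrences` and the sample surjection `f(n) = n // 2` (onto the nonnegative integers) are hypothetical names and choices made for illustration only.

```python
from itertools import count, islice

def first_occurrences(f):
    """Yield g(1), g(2), ...: the indices m at which f(m) is a value not seen before."""
    seen = set()
    for m in count(1):
        value = f(m)
        if value not in seen:
            seen.add(value)
            yield m

def f(n):
    # A surjection from the positive integers onto {0, 1, 2, ...} that is not 1-1.
    return n // 2

g_values = list(islice(first_occurrences(f), 5))   # g(1), ..., g(5)
composed = [f(m) for m in g_values]                # f(g(1)), ..., f(g(5))
```

The values in `composed` are pairwise distinct, which is the 1-1 property of $f \circ g$ on the portion computed. If the image of `f` were finite the generator would eventually stop producing new indices and loop forever, mirroring the hypothesis that $A$ is not finite.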
Corollary 1.2. If $B$ is countable and $A \subset B$, then $A$ is finite or countable.

Proof. If $A = \emptyset$, we are done. Otherwise, choose a function $f: \mathbb{Z}_+ \to B$
which is onto. Choose an element $x_0 \in A$. Define $g: \mathbb{Z}_+ \to A$ by: $g(n) = f(n)$
if $f(n) \in A$, $g(n) = x_0$ if $f(n) \notin A$. Then $g$ is onto, so $A$ is finite or countable. $\square$

Proposition 1.3. If $A_1, A_2, A_3, \ldots$ are finite or countable, then the sets

$$\bigcup_{j=1}^{n} A_j \quad\text{and}\quad \bigcup_{j=1}^{\infty} A_j$$

are finite or countable.
Proof. We shall prove only the second statement. If any of the $A_j$ are
empty, we may exclude them and renumber. Consider only the second case.
For each $A_j$ we can choose a surjective function $f_j: \mathbb{Z}_+ \to A_j$. Define $f: \mathbb{Z}_+ \to \bigcup_{j=1}^{\infty} A_j$
by $f(1) = f_1(1)$, $f(3) = f_1(2)$, $f(5) = f_1(3), \ldots$, $f(2) = f_2(1)$, $f(6) = f_2(2)$,
$f(10) = f_2(3), \ldots$, and in general $f(2^{j-1}(2k - 1)) = f_j(k)$, $j, k = 1, 2, 3, \ldots$.
Any $x \in \bigcup_{j=1}^{\infty} A_j$ is in some $A_j$, and therefore there is $k \in \mathbb{Z}_+$ such that
$f_j(k) = x$. Then $f(2^{j-1}(2k - 1)) = x$, so $f$ is onto. By Proposition 1.1, $\bigcup_{j=1}^{\infty} A_j$
is finite or countable. $\square$
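
The proof relies on the fact that every positive integer can be written in exactly one way as a power of 2 times an odd number, so that the indexing $(j, k) \mapsto 2^{j-1}(2k - 1)$ uses each positive integer once. A small Python sketch of this indexing is given below; the function names `encode` and `decode` are assumptions of the sketch.

```python
def encode(j, k):
    """Index assigned to the k-th element of the j-th set: 2^(j-1) * (2k - 1)."""
    return 2 ** (j - 1) * (2 * k - 1)

def decode(n):
    """Recover (j, k) from n by stripping factors of 2; inverse of encode."""
    j = 1
    while n % 2 == 0:
        n //= 2
        j += 1
    return j, (n + 1) // 2

# The values quoted in the proof: f(1), f(3), f(5) come from f_1; f(2), f(6), f(10) from f_2.
first_set = [encode(1, k) for k in (1, 2, 3)]
second_set = [encode(2, k) for k in (1, 2, 3)]
round_trip = all(encode(*decode(n)) == n for n in range(1, 1001))
```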
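Example
Let $\mathbb{Q}$ be the set of rational numbers: $\mathbb{Q} = \{m/n \mid m \in \mathbb{Z},\ n \in \mathbb{Z}_+\}$. This is
countable. In fact, let $A_n = \{j/n \mid j \in \mathbb{Z},\ -n^2 \le j \le n^2\}$. Then each $A_n$ is
finite, and $\mathbb{Q} = \bigcup_{n=1}^{\infty} A_n$.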

Proposition 1.4. If $A_1, A_2, \ldots, A_n$ are countable sets, then the product set
$A_1 \times A_2 \times \cdots \times A_n$ is countable.

Proof. Choose bijective functions $f_j: A_j \to \mathbb{Z}_+$, $j = 1, 2, \ldots, n$. For each
$m \in \mathbb{Z}_+$, let $B_m$ be the subset of the product set consisting of all $n$-tuples
$(x_1, x_2, \ldots, x_n)$ such that each $f_j(x_j) \le m$. Then $B_m$ is finite (it has $m^n$ elements) and the product set is the union of the sets $B_m$. Proposition 1.3 gives
the desired conclusion. $\square$

A sequence in a set $A$ is a collection of elements of $A$, not necessarily
distinct, indexed by some countable set $J$. Usually $J$ is taken to be $\mathbb{Z}_+$ or
$\mathbb{Z}_+ \cup \{0\}$, and we use the notations

$$(a_n)_{n=1}^{\infty} = (a_1, a_2, a_3, \ldots),$$
$$(a_n)_{n=0}^{\infty} = (a_0, a_1, a_2, \ldots).$$

Proposition 1.5. The set $S$ of all sequences in the set $\{0, 1\}$ is neither finite
nor countable.
Proof. Suppose $f: \mathbb{Z}_+ \to S$. We shall show that $f$ is not surjective. For each
$m \in \mathbb{Z}_+$, $f(m)$ is a sequence $(a_{n,m})_{n=1}^{\infty} = (a_{1,m}, a_{2,m}, \ldots)$, where each
$a_{n,m}$ is 0 or 1. Define a sequence $(a_n)_{n=1}^{\infty}$ by setting $a_n = 0$ if $a_{n,n} = 1$, $a_n = 1$
if $a_{n,n} = 0$. Then for each $m \in \mathbb{Z}_+$, $(a_n)_{n=1}^{\infty} \ne (a_{n,m})_{n=1}^{\infty} = f(m)$, since the two
sequences differ in the $m$th term. Thus $f$ is not surjective. $\square$
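
The diagonal construction in this proof can be written out for any explicitly given listing. In the Python sketch below a sequence in $\{0, 1\}$ is modeled as a function of $n$, and `listing` is a hypothetical enumeration chosen only for illustration; the check covers finitely many indices and is not a substitute for the proof.

```python
def diagonal_flip(listing):
    """Return the sequence a with a(n) = 1 - a_{n,n}, where listing(m) is the m-th sequence."""
    def a(n):
        return 1 - listing(n)(n)
    return a

def listing(m):
    # Hypothetical listing: the m-th sequence is 1 at multiples of m and 0 elsewhere.
    return lambda n: 1 if n % m == 0 else 0

a = diagonal_flip(listing)

# The flipped diagonal differs from the m-th listed sequence in the m-th term.
differs_from_each = all(a(m) != listing(m)(m) for m in range(1, 200))
```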

We introduce some more items of notation. The symbol $\Rightarrow$ means
"implies"; the symbol $\Leftarrow$ means "is implied by"; the symbol $\Leftrightarrow$ means "is
equivalent to."
Anticipating §2 somewhat, we introduce the notation for intervals in the
set $\mathbb{R}$ of real numbers. If $a, b \in \mathbb{R}$ and $a < b$, then

$$(a, b) = \{x \mid x \in \mathbb{R},\ a < x < b\},$$
$$(a, b] = \{x \mid x \in \mathbb{R},\ a < x \le b\},$$
$$[a, b) = \{x \mid x \in \mathbb{R},\ a \le x < b\},$$
$$[a, b] = \{x \mid x \in \mathbb{R},\ a \le x \le b\}.$$

Also,

$$(a, \infty) = \{x \mid x \in \mathbb{R},\ a < x\},$$
$$(-\infty, a] = \{x \mid x \in \mathbb{R},\ x \le a\}, \quad\text{etc.}$$

2. Real and complex numbers


We denote by $\mathbb{R}$ the set of all real numbers. The operations of addition
and multiplication can be thought of as functions from the product set
$\mathbb{R} \times \mathbb{R}$ to $\mathbb{R}$. Addition assigns to the ordered pair $(x, y)$ an element of $\mathbb{R}$
denoted by $x + y$; multiplication assigns an element of $\mathbb{R}$ denoted by $xy$.
The algebraic properties of these functions are familiar.

Axioms of addition
A1. $(x + y) + z = x + (y + z)$, for any $x, y, z \in \mathbb{R}$.
A2. $x + y = y + x$, for any $x, y \in \mathbb{R}$.
A3. There is an element 0 in $\mathbb{R}$ such that $x + 0 = x$ for every $x \in \mathbb{R}$.
A4. For each $x \in \mathbb{R}$ there is an element $-x \in \mathbb{R}$ such that $x + (-x) = 0$.

Note that the element 0 is unique. In fact, if $0'$ is an element such that
$x + 0' = x$ for every $x$, then

$$0' = 0' + 0 = 0 + 0' = 0.$$

Also, given $x$ the element $-x$ is unique. In fact, if $x + y = 0$, then

$$y = y + 0 = y + (x + (-x)) = (y + x) + (-x) = (x + y) + (-x) = 0 + (-x) = (-x) + 0 = -x.$$

This uniqueness implies $-(-x) = x$, since $(-x) + x = x + (-x) = 0$.

Axioms of multiplication
M1. $(xy)z = x(yz)$, for any $x, y, z \in \mathbb{R}$.
M2. $xy = yx$, for any $x, y \in \mathbb{R}$.
M3. There is an element $1 \ne 0$ in $\mathbb{R}$ such that $x1 = x$ for any $x \in \mathbb{R}$.
M4. For each $x \in \mathbb{R}$, $x \ne 0$, there is an element $x^{-1}$ in $\mathbb{R}$ such that
$xx^{-1} = 1$.

Note that 1 and $x^{-1}$ are unique. We leave the proofs as an exercise.

Distributive law
DL. $x(y + z) = xy + xz$, for any $x, y, z \in \mathbb{R}$.

Note that DL and M2 imply $(x + y)z = xz + yz$.

We can now readily deduce some other well-known facts. For example,

$$0x = (0 + 0)x = 0x + 0x,$$

so $0x = 0$. Then

$$x + (-1)x = 1x + (-1)x = (1 + (-1))x = 0x = 0,$$

so $(-1)x = -x$. Also,

$$(-x)y = ((-1)x)y = (-1)(xy) = -xy.$$

The axioms A1-A4, M1-M4, and DL do not determine $\mathbb{R}$. In fact there
is a set consisting of two elements, together with operations of addition and
multiplication, such that the axioms above are all satisfied: if we denote the
elements of the set by 0, 1, we can define addition and multiplication by

$$0 + 1 = 1 + 0 = 1, \qquad 0 + 0 = 1 + 1 = 0,$$
$$0 \cdot 0 = 1 \cdot 0 = 0 \cdot 1 = 0, \qquad 1 \cdot 1 = 1.$$
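
Since the two-element system is finite, the axioms A1-A4, M1-M4, and DL can be verified for it by exhausting all cases. The following Python sketch does this, under the assumption (consistent with the table above) that the operations are addition and multiplication modulo 2; the names `add` and `mul` are ours.

```python
from itertools import product

F2 = (0, 1)

def add(x, y):
    return (x + y) % 2     # in particular 1 + 1 = 0

def mul(x, y):
    return (x * y) % 2

pairs = list(product(F2, repeat=2))
triples = list(product(F2, repeat=3))

A1 = all(add(add(x, y), z) == add(x, add(y, z)) for x, y, z in triples)
A2 = all(add(x, y) == add(y, x) for x, y in pairs)
A3 = all(add(x, 0) == x for x in F2)
A4 = all(any(add(x, y) == 0 for y in F2) for x in F2)
M1 = all(mul(mul(x, y), z) == mul(x, mul(y, z)) for x, y, z in triples)
M2 = all(mul(x, y) == mul(y, x) for x, y in pairs)
M3 = all(mul(x, 1) == x for x in F2)
M4 = all(any(mul(x, y) == 1 for y in F2) for x in F2 if x != 0)
DL = all(mul(x, add(y, z)) == add(mul(x, y), mul(x, z)) for x, y, z in triples)

all_axioms_hold = all([A1, A2, A3, A4, M1, M2, M3, M4, DL])
```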
There is an additional familiar notion in $\mathbb{R}$, that of positivity, from which
one can derive the notion of an ordering of $\mathbb{R}$. We axiomatize this by introducing a subset $P \subset \mathbb{R}$, the set of "positive" elements.

Axioms of order
O1. If $x \in \mathbb{R}$, then exactly one of the following holds: $x \in P$, $x = 0$, or
$-x \in P$.
O2. If $x, y \in P$, then $x + y \in P$.
O3. If $x, y \in P$, then $xy \in P$.

It follows from these that if $x \ne 0$, then $x^2 \in P$. In fact if $x \in P$ then this
follows from O3, while if $-x \in P$, then $(-x)^2 \in P$, and $(-x)^2 = -(x(-x)) = -(-x^2) = x^2$. In particular, $1 = 1^2 \in P$.
We define $x < y$ if $y - x \in P$, $x > y$ if $y < x$. It follows that $x \in P \Leftrightarrow x > 0$. Also, if $x < y$ and $y < z$, then

$$z - x = (z - y) + (y - x) \in P,$$

so $x < z$. In terms of this order, we introduce the Archimedean axiom.

O4. If $x, y > 0$, then there is a positive integer $n$ such that $nx = x + x + \cdots + x$ is $> y$.

(One can think of this as saying that, given enough time, one can empty
a large bathtub with a small spoon.)

The axioms given so far still do not determine $\mathbb{R}$; they are all satisfied by
the subset $\mathbb{Q}$ of rational numbers. The following notions will make a distinction between these two sets.
A nonempty subset $A \subset \mathbb{R}$ is said to be bounded above if there is an $x \in \mathbb{R}$
such that every $y \in A$ satisfies $y \le x$ (as usual, $y \le x$ means $y < x$ or $y = x$).
Such a number $x$ is called an upper bound for $A$. Similarly, if there is an
$x \in \mathbb{R}$ such that every $y \in A$ satisfies $x \le y$, then $A$ is said to be bounded below
and $x$ is called a lower bound for $A$.
A number $x \in \mathbb{R}$ is said to be a least upper bound for a nonempty set
$A \subset \mathbb{R}$ if $x$ is an upper bound, and if every other upper bound $x'$ satisfies
$x' \ge x$. If such an $x$ exists it is clearly unique, and we write

$$x = \operatorname{lub} A.$$

Similarly, $x$ is a greatest lower bound for $A$ if it is a lower bound and if every
other lower bound $x'$ satisfies $x' \le x$. Such an $x$ is unique, and we write

$$x = \operatorname{glb} A.$$

The final axiom for $\mathbb{R}$ is called the completeness axiom.

O5. If $A$ is a nonempty subset of $\mathbb{R}$ which is bounded above, then $A$ has
a least upper bound.
Note that if $A \subset \mathbb{R}$ is bounded below, then the set $B = \{x \mid x \in \mathbb{R},\ -x \in A\}$
is bounded above. If $x = \operatorname{lub} B$, then $-x = \operatorname{glb} A$. Therefore O5 is equivalent to: a nonempty subset of $\mathbb{R}$ which is bounded below has a greatest lower
bound.

Theorem 2.1. $\mathbb{Q}$ does not satisfy the completeness axiom.

Proof. Recall that there is no rational $p/q$, $p, q \in \mathbb{Z}$, such that $(p/q)^2 = 2$:
in fact if there were, we could reduce to lowest terms and assume either $p$ or $q$
is odd. But $p^2 = 2q^2$ is even, so $p$ is even, so $p = 2m$, $m \in \mathbb{Z}$. Then $4m^2 = 2q^2$,
so $q^2 = 2m^2$ is even and $q$ is also even, a contradiction.
Let $A = \{x \mid x \in \mathbb{Q},\ x^2 < 2\}$. This is nonempty, since $0, 1 \in A$. It is
bounded above, since $x \ge 2$ implies $x^2 \ge 4$, so 2 is an upper bound. We shall
show that no $x \in \mathbb{Q}$ is a least upper bound for $A$.
If $x \le 0$, then $x < 1 \in A$, so $x$ is not an upper bound. Suppose $x > 0$ and
$x^2 < 2$. Suppose $h \in \mathbb{Q}$ and $0 < h < 1$. Then $x + h \in \mathbb{Q}$ and $x + h > x$.
Also, $(x + h)^2 = x^2 + 2xh + h^2 < x^2 + 2xh + h = x^2 + (2x + 1)h$. If we
choose $h > 0$ so small that $h < 1$ and $h < (2 - x^2)/(2x + 1)$, then $(x + h)^2 < 2$.
Then $x + h \in A$, and $x + h > x$, so $x$ is not an upper bound of $A$.
Finally, suppose $x \in \mathbb{Q}$, $x > 0$, and $x^2 > 2$. Suppose $h \in \mathbb{Q}$ and $0 < h < x$.
Then $x - h \in \mathbb{Q}$ and $x - h > 0$. Also, $(x - h)^2 = x^2 - 2xh + h^2 > x^2 - 2xh$. If we choose $h > 0$ so small that $h < 1$ and $h < (x^2 - 2)/2x$, then
$(x - h)^2 > 2$. It follows that if $y \in A$, then $y < x - h$. Thus $x - h$ is an
upper bound for $A$ less than $x$, and $x$ is not the least upper bound. Since no rational $x$ satisfies $x^2 = 2$, these cases exhaust $\mathbb{Q}$. $\square$
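
The two estimates in this proof are explicit enough to compute with. The Python sketch below uses exact rational arithmetic from the standard `fractions` module; the function names `step_up` and `step_down`, the starting values, and the choice of $h$ as half of the allowed bound are all assumptions made for the illustration.

```python
from fractions import Fraction

def step_up(x):
    """For rational x > 0 with x^2 < 2, return a rational x + h > x with (x + h)^2 < 2."""
    bound = (2 - x * x) / (2 * x + 1)
    h = min(Fraction(1), bound) / 2      # so 0 < h < 1 and h < (2 - x^2)/(2x + 1)
    return x + h

def step_down(x):
    """For rational x > 0 with x^2 > 2, return a rational x - h < x with (x - h)^2 > 2."""
    bound = (x * x - 2) / (2 * x)
    h = min(Fraction(1), bound) / 2      # so 0 < h < 1 and h < (x^2 - 2)/(2x)
    return x - h

lower = Fraction(7, 5)    # (7/5)^2 = 49/25 < 2, so 7/5 is in A
upper = Fraction(3, 2)    # (3/2)^2 = 9/4 > 2, so 3/2 is an upper bound for A
for _ in range(5):
    lower, upper = step_up(lower), step_down(upper)
```

After each pass `lower` is a larger element of $A$ and `upper` is a smaller rational upper bound, so neither can have been the least upper bound.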

We used the non-existence of a square root of 2 in $\mathbb{Q}$ to show that O5 does
not hold. We may turn the argument around to show, using O5, that there is
a real number $x > 0$ such that $x^2 = 2$. In fact, let $A = \{y \mid y \in \mathbb{R},\ y^2 < 2\}$.
The argument proving Theorem 2.1 proves the following: $A$ is bounded
above; its least upper bound $x$ is positive; if $x^2 < 2$ then $x$ would not be an
upper bound, while if $x^2 > 2$ then $x$ would not be the least upper bound.
Thus $x^2 = 2$.
Two important questions arise concerning the above axioms. Are the
axioms consistent, and satisfied by some set $\mathbb{R}$? Is the set of real numbers the
only set satisfying these axioms?
The consistency of the axioms and the existence of $\mathbb{R}$ can be demonstrated
(to the satisfaction of most mathematicians) by constructing $\mathbb{R}$, starting with
the rationals.
In one sense the axioms do not determine $\mathbb{R}$ uniquely. For example, let
$\mathbb{R}^0$ be the set of all symbols $x^0$, where $x$ is (the symbol for) a real number.
Define addition and multiplication of elements of $\mathbb{R}^0$ by

$$x^0 + y^0 = (x + y)^0, \qquad x^0 y^0 = (xy)^0.$$

Define $P^0$ by $x^0 \in P^0 \Leftrightarrow x \in P$. Then $\mathbb{R}^0$ satisfies the axioms above. This is
clearly fraudulent: $\mathbb{R}^0$ is just a copy of $\mathbb{R}$. It can be shown that any set with
addition, multiplication, and a subset of positive elements, which satisfies all
the axioms above, is just a copy of $\mathbb{R}$.
Starting from $\mathbb{R}$ we can construct the set $\mathbb{C}$ of complex numbers, without
simply postulating the existence of a "quantity" $i$ such that $i^2 = -1$. Let $\mathbb{C}^0$
be the product set $\mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$, whose elements are ordered pairs $(x, y)$ of
real numbers. Define addition and multiplication by

$$(x, y) + (x', y') = (x + x', y + y'),$$
$$(x, y)(x', y') = (xx' - yy', xy' + x'y).$$

It can be shown by straightforward calculations that $\mathbb{C}^0$ together with these
operations satisfies A1, A2, M1, M2, and DL. To verify the remaining
algebraic axioms, note that

$$(x, y) + (0, 0) = (x, y),$$
$$(x, y) + (-x, -y) = (0, 0),$$
$$(x, y)(1, 0) = (x, y),$$
$$(x, y)\bigl(x/(x^2 + y^2),\ -y/(x^2 + y^2)\bigr) = (1, 0) \quad\text{if } (x, y) \ne (0, 0).$$

If $x \in \mathbb{R}$, let $x^0$ denote the element $(x, 0) \in \mathbb{C}^0$. Let $i^0$ denote the element
$(0, 1)$. Then we have

$$(x, y) = (x, 0) + (0, y) = (x, 0) + (0, 1)(y, 0) = x^0 + i^0 y^0.$$

Also, $(i^0)^2 = (0, 1)(0, 1) = (-1, 0) = (-1)^0$. Thus we can write any element
of $\mathbb{C}^0$ uniquely as $x^0 + i^0 y^0$, $x, y \in \mathbb{R}$, where $(i^0)^2 = (-1)^0$. We now drop the
superscripts and write $x + iy$ for $x^0 + i^0 y^0$ and $\mathbb{C}$ for $\mathbb{C}^0$: this is legitimate,
since for elements of $\mathbb{R}$ the new operations coincide with the old: $x^0 + y^0 = (x + y)^0$, $x^0 y^0 = (xy)^0$. Often we shall denote elements of $\mathbb{C}$ by $z$ or $w$. When
we write $z = x + iy$, we shall understand that $x, y$ are real. They are called
the real part and the imaginary part of $z$, respectively:

$$z = x + iy, \qquad x = \operatorname{Re}(z), \qquad y = \operatorname{Im}(z).$$
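
The construction of $\mathbb{C}$ from ordered pairs can be carried out literally on a computer. The Python sketch below represents an element of the product set as a tuple and implements the two operations defined above; exact rationals from the `fractions` module are used so that the inverse formula can be checked without rounding. The function names and the sample pair `p` are assumptions of the sketch.

```python
from fractions import Fraction

def add(p, q):
    """(x, y) + (x', y') = (x + x', y + y')."""
    return (p[0] + q[0], p[1] + q[1])

def mul(p, q):
    """(x, y)(x', y') = (xx' - yy', xy' + x'y)."""
    (x, y), (u, v) = p, q
    return (x * u - y * v, x * v + u * y)

def inverse(p):
    """Multiplicative inverse (x/(x^2 + y^2), -y/(x^2 + y^2)); requires p != (0, 0)."""
    x, y = p
    d = x * x + y * y
    return (Fraction(x) / d, Fraction(-y) / d)

i0 = (0, 1)                              # the pair playing the role of i
p = (3, -4)                              # a sample pair, corresponding to 3 - 4i

i_squared = mul(i0, i0)                  # should be the pair (-1, 0)
p_times_inverse = mul(p, inverse(p))     # should be the pair (1, 0)
rebuilt = add((3, 0), mul(i0, (-4, 0)))  # x^0 + i^0 y^0 with x = 3, y = -4
```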

There is a very useful operation in $\mathbb{C}$, called complex conjugation, defined
by:

$$z^* = (x + iy)^* = x - iy.$$

Then $z^*$ is called the complex conjugate of $z$. It is readily checked that

$$(z + w)^* = z^* + w^*, \qquad (zw)^* = z^* w^*, \qquad (z^*)^* = z, \qquad z^* z = x^2 + y^2.$$

Thus $z^* z \ne 0$ if $z \ne 0$. Define the modulus of $z$, $|z|$, by

$$|z| = (z^* z)^{1/2} = (x^2 + y^2)^{1/2}, \qquad z = x + iy.$$

Then if $z \ne 0$,

$$1 = z^* z |z|^{-2} = z(z^* |z|^{-2}),$$

or

$$z^{-1} = z^* |z|^{-2}.$$

Adding and subtracting gives

$$z + z^* = 2x, \qquad z - z^* = 2iy \qquad\text{if } z = x + iy.$$

Thus

$$\operatorname{Re}(z) = \tfrac{1}{2}(z + z^*), \qquad \operatorname{Im}(z) = \tfrac{1}{2} i^{-1}(z - z^*).$$

The usual geometric representation of $\mathbb{C}$ is by a coordinatized plane:
$z = x + iy$ is represented by the point with coordinates $(x, y)$. Then by the
Pythagorean theorem, $|z|$ is the distance from (the point representing) $z$ to
the origin. More generally, $|z - w|$ is the distance from $z$ to $w$.

Exercises
1. There is a unique real number $x > 0$ such that $x^3 = 2$.
2. Show that $\operatorname{Re}(z + w) = \operatorname{Re}(z) + \operatorname{Re}(w)$, $\operatorname{Im}(z + w) = \operatorname{Im}(z) + \operatorname{Im}(w)$.
3. Suppose $z = x + iy$, $x, y \in \mathbb{R}$. Then

$$|x| \le |z|, \qquad |y| \le |z|, \qquad |z| \le |x| + |y|.$$

4. For any $z, w \in \mathbb{C}$,

$$|zw^*| = |z||w|.$$

5. For any $z, w \in \mathbb{C}$,

$$|z + w| \le |z| + |w|.$$

(Hint: $|z + w|^2 = (z + w)^*(z + w) = |z|^2 + 2\operatorname{Re}(zw^*) + |w|^2$; apply Exercises 3 and 4 to estimate $|\operatorname{Re}(zw^*)|$.)
6. The Archimedean axiom O4 can be deduced from the other axioms for
the real numbers. (Hint: use O5.)
7. If $a > 0$ and $n$ is a positive integer, there is a unique $b > 0$ such that
$b^n = a$.

3. Sequences of real and complex numbers


A sequence $(z_n)_{n=1}^{\infty}$ of complex numbers is said to converge to $z \in \mathbb{C}$ if for
each $\varepsilon > 0$, there is an integer $N$ such that $|z_n - z| < \varepsilon$ whenever $n \ge N$.
Geometrically, this says that for any circle with center $z$, the numbers $z_n$ all
lie inside the circle, except for possibly finitely many values of $n$. If this is the
case we write

$$z_n \to z, \quad\text{or}\quad \lim_{n \to \infty} z_n = z, \quad\text{or}\quad \lim z_n = z.$$

The number $z$ is called the limit of the sequence $(z_n)_{n=1}^{\infty}$. Note that the limit is
unique: suppose $z_n \to z$ and also $z_n \to w$. Given any $\varepsilon > 0$, we can take $n$ so
large that $|z_n - z| < \varepsilon$ and also $|z_n - w| < \varepsilon$. Then

$$|z - w| \le |z - z_n| + |z_n - w| < \varepsilon + \varepsilon = 2\varepsilon.$$

Since this is true for all $\varepsilon > 0$, necessarily $z = w$.
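
To make the roles of $\varepsilon$ and $N$ in the definition concrete, consider the sequence $z_n = (1 + 2i)/n$, which is our own example and not taken from the text. Here $|z_n - 0| = \sqrt{5}/n$, so an admissible $N$ can be written down directly. The Python sketch below computes it and checks the defining inequality over a finite range of indices only; the function names are hypothetical.

```python
import math

def z(n):
    # Example sequence z_n = (1 + 2i)/n, with limit 0.
    return complex(1, 2) / n

def threshold(eps):
    """An N such that |z_n - 0| < eps whenever n >= N, using |z_n| = sqrt(5)/n."""
    return math.floor(math.sqrt(5) / eps) + 1

eps = 1e-3
N = threshold(eps)

inside_from_N_on = all(abs(z(n) - 0) < eps for n in range(N, N + 1000))
just_before_fails = abs(z(N - 1) - 0) >= eps
```

Only finitely many terms, those with $n < N$, can lie outside the circle of radius $\varepsilon$ about the limit, which is the geometric statement made above.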


The following proposition collects some convenient facts about convergence.
Proposition 3.1. Suppose $(z_n)_{n=1}^{\infty}$ and $(w_n)_{n=1}^{\infty}$ are sequences in $\mathbb{C}$.

(a) $z_n \to z$ if and only if $z_n - z \to 0$.
(b) Let $z_n = x_n + iy_n$, $x_n, y_n$ real. Then $z_n \to z = x + iy$ if and only if
$x_n \to x$ and $y_n \to y$.
(c) If $z_n \to z$ and $w_n \to w$, then $z_n + w_n \to z + w$.
(d) If $z_n \to z$ and $w_n \to w$, then $z_n w_n \to zw$.
(e) If $z_n \to z \ne 0$, then there is an integer $M$ such that $z_n \ne 0$ if $n \ge M$.
Moreover $(z_n^{-1})_{n=M}^{\infty}$ converges to $z^{-1}$.


Proof. (a) This follows directly from the definition of convergence.

(b) By Exercise 3 of §2,

$$\tfrac{1}{2}|x_n - x| + \tfrac{1}{2}|y_n - y| \le |z_n - z| \le 2|x_n - x| + 2|y_n - y|.$$

It follows easily that $z_n - z \to 0$ if and only if $x_n - x \to 0$ and $y_n - y \to 0$.

(c) This follows easily from the inequality

$$|(z_n + w_n) - (z + w)| = |(z_n - z) + (w_n - w)| \le |z_n - z| + |w_n - w|.$$

(d) Choose $M$ so large that if $n \ge M$, then $|z_n - z| < 1$. Then for
$n \ge M$,

$$|z_n| = |(z_n - z) + z| < 1 + |z|.$$

Let $K = 1 + |w| + |z|$. Then for all $n \ge M$,

$$|z_n w_n - zw| = |z_n(w_n - w) + (z_n - z)w| \le |z_n||w_n - w| + |z_n - z||w| \le K(|w_n - w| + |z_n - z|).$$

Since $w_n - w \to 0$ and $z_n - z \to 0$, it follows that $z_n w_n - zw \to 0$.

(e) Take $M$ so large that $|z_n - z| \le \tfrac{1}{2}|z|$ when $n \ge M$. Then for $n \ge M$,

$$|z_n| = |z_n| + \tfrac{1}{2}|z| - \tfrac{1}{2}|z| \ge |z_n| + |z - z_n| - \tfrac{1}{2}|z| \ge |z_n + (z - z_n)| - \tfrac{1}{2}|z| = \tfrac{1}{2}|z|.$$

Therefore, $z_n \ne 0$. Also for $n \ge M$,

$$|z_n^{-1} - z^{-1}| = |(z - z_n) z^{-1} z_n^{-1}| \le |z - z_n||z|^{-1}(\tfrac{1}{2}|z|)^{-1} = K|z - z_n|,$$

where $K = 2|z|^{-2}$. Since $z - z_n \to 0$ we have $z_n^{-1} - z^{-1} \to 0$. $\square$

A sequence $(z_n)_{n=1}^{\infty}$ in $\mathbb{C}$ is said to be bounded if there is an $M \ge 0$ such that
$|z_n| \le M$ for all $n$; in other words, there is a fixed circle around the origin
which encloses all the $z_n$'s.
A sequence $(x_n)_{n=1}^{\infty}$ in $\mathbb{R}$ is said to be increasing if for each $n$, $x_n \le x_{n+1}$;
it is said to be decreasing if for each $n$, $x_n \ge x_{n+1}$.

Proposition 3.2. A bounded, increasing sequence in $\mathbb{R}$ converges. A bounded,
decreasing sequence in $\mathbb{R}$ converges.

Proof. Suppose $(x_n)_{n=1}^{\infty}$ is a bounded, increasing sequence. Then the set
$\{x_n \mid n = 1, 2, \ldots\}$ is bounded above. Let $x$ be its least upper bound. Given
$\varepsilon > 0$, $x - \varepsilon$ is not an upper bound, so there is an $N$ such that $x_N \ge x - \varepsilon$.
If $n \ge N$, then

$$x - \varepsilon \le x_N \le x_n \le x,$$

so $|x_n - x| \le \varepsilon$. Thus $x_n \to x$. The proof for a decreasing sequence is
similar. $\square$

If $A \subset \mathbb{R}$ is bounded above, the least upper bound of $A$ is often called the
supremum of $A$, written $\sup A$. Thus

$$\sup A = \operatorname{lub} A.$$

Similarly, the greatest lower bound of a set $B \subset \mathbb{R}$ which is bounded below
is also called the infimum of $B$, written $\inf B$:

$$\inf B = \operatorname{glb} B.$$

Suppose $(x_n)_{n=1}^{\infty}$ is a bounded sequence of reals. We shall associate with
this given sequence two other sequences, one increasing and the other
decreasing. For each $n$, let $A_n = \{x_n, x_{n+1}, x_{n+2}, \ldots\}$, and set

$$x_n' = \inf A_n, \qquad x_n'' = \sup A_n.$$

Now $A_n \supset A_{n+1}$, so any lower or upper bound for $A_n$ is a lower or upper
bound for $A_{n+1}$. Thus

$$x_n' \le x_{n+1}', \qquad x_{n+1}'' \le x_n''.$$

Choose $M$ so that $|x_n| \le M$, all $n$. Then $-M$ is a lower bound and $M$ an
upper bound for each $A_n$. Thus

$$-M \le x_n' \le x_n'' \le M, \quad\text{all } n. \tag{3.1}$$

We may apply Proposition 3.2 to the bounded increasing sequence $(x_n')_{n=1}^{\infty}$
and the bounded decreasing sequence $(x_n'')_{n=1}^{\infty}$ and conclude that both
converge. We define

$$\liminf x_n = \lim x_n', \qquad \limsup x_n = \lim x_n''.$$

These numbers are called the lower limit and the upper limit of the sequence
$(x_n)_{n=1}^{\infty}$, respectively. It follows from (3.1) that

$$-M \le \liminf x_n \le \limsup x_n \le M. \tag{3.2}$$
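
The two auxiliary sequences can be tabulated for a specific example. In the Python sketch below the sequence $x_n = (-1)^n(1 + 1/n)$ and the truncation length `HORIZON` are assumptions of ours; a program can only inspect finitely many terms, so each tail is cut off at `HORIZON`. For this particular sequence the infimum and supremum of each tail are attained within its first two terms, so the truncation does not change them.

```python
def x(n):
    # Example bounded sequence: oscillates between values near -1 and near +1.
    return (-1) ** n * (1 + 1 / n)

HORIZON = 2000   # truncation length for each tail (an assumption of this sketch)

def tail(n):
    """The terms x_n, x_{n+1}, ..., x_HORIZON standing in for the set A_n."""
    return [x(k) for k in range(n, HORIZON + 1)]

lower = [min(tail(n)) for n in range(1, 201)]   # x'_n  = inf A_n for n = 1..200
upper = [max(tail(n)) for n in range(1, 201)]   # x''_n = sup A_n for n = 1..200

lower_is_increasing = all(a <= b for a, b in zip(lower, lower[1:]))
upper_is_decreasing = all(a >= b for a, b in zip(upper, upper[1:]))
```

The lists `lower` and `upper` approach $-1$ and $1$ respectively, consistent with $\liminf x_n = -1$ and $\limsup x_n = 1$ for this example.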

A sequence $(z_n)_{n=1}^{\infty}$ in $\mathbb{C}$ is said to be a Cauchy sequence if for each $\varepsilon > 0$
there is an integer $N$ such that $|z_n - z_m| < \varepsilon$ whenever $n \ge N$ and $m \ge N$.
The following theorem is of fundamental importance.

Theorem 3.3. A sequence in $\mathbb{C}$ (or $\mathbb{R}$) converges if and only if it is a Cauchy
sequence.

Proof. Suppose first that $z_n \to z$. Given $\varepsilon > 0$, we can choose $N$ so that
$|z_n - z| < \tfrac{1}{2}\varepsilon$ if $n \ge N$. Then if $n, m \ge N$ we have

$$|z_n - z_m| \le |z_n - z| + |z - z_m| < \tfrac{1}{2}\varepsilon + \tfrac{1}{2}\varepsilon = \varepsilon.$$

Conversely, suppose $(z_n)_{n=1}^{\infty}$ is a Cauchy sequence. We consider first the
case of a real sequence $(x_n)_{n=1}^{\infty}$ which is a Cauchy sequence. The sequence
$(x_n)_{n=1}^{\infty}$ is bounded: in fact, choose $M$ so that $|x_n - x_m| < 1$ if $n, m \ge M$.
Then if $n \ge M$,

$$|x_n| \le |x_n - x_M| + |x_M| < 1 + |x_M|.$$

Let $K = \max\{|x_1|, |x_2|, \ldots, |x_{M-1}|, |x_M| + 1\}$. Then for any $n$, $|x_n| \le K$.
Now since the sequence is bounded, we can associate the sequences $(x_n')_{n=1}^{\infty}$
and $(x_n'')_{n=1}^{\infty}$ as above. Given $\varepsilon > 0$, choose $N$ so that $|x_n - x_m| < \varepsilon$ if
$n, m \ge N$. Now suppose $n \ge m \ge N$. It follows that

$$x_m - \varepsilon \le x_n \le x_m + \varepsilon, \qquad n \ge m \ge N.$$

By definition of $x_n'$ we also have, therefore,

$$x_m - \varepsilon \le x_n' \le x_m + \varepsilon, \qquad n \ge m \ge N.$$

Letting $x = \liminf x_n = \lim x_n'$, we have

$$x_m - \varepsilon \le x \le x_m + \varepsilon, \qquad m \ge N,$$

or $|x_m - x| \le \varepsilon$, $m \ge N$. Thus $x_n \to x$.
Now consider the case of a complex Cauchy sequence $(z_n)_{n=1}^{\infty}$. Let
$z_n = x_n + iy_n$, $x_n, y_n \in \mathbb{R}$. Since $|x_n - x_m| \le |z_n - z_m|$, $(x_n)_{n=1}^{\infty}$ is a Cauchy
sequence. Therefore $x_n \to x \in \mathbb{R}$. Similarly, $y_n \to y \in \mathbb{R}$. By Proposition
3.1(b), $z_n \to x + iy$. $\square$
The importance of this theorem lies partly in the fact that it gives a criterion for the existence of a limit in terms of the sequence itself. An immediately recognizable example is the sequence
3, 3.1, 3.14, 3.142, 3.1416, 3.14159, ... ,
where successive terms are to be computed (in principle) in some specified
way. This sequence can be shown to be a Cauchy sequence, so we know it has
a limit. Knowing this, we are free to give the limit a name, such as "$\pi$".
We conclude this section with a useful characterization of the upper and
lower limits of a bounded sequence.
Proposition 3.4. Suppose $(x_n)_{n=1}^\infty$ is a bounded sequence in $\mathbb{R}$. Then
$\liminf x_n$ is the unique number $x'$ such that
(i)' for any $\varepsilon > 0$, there is an $N$ such that $x_n > x' - \varepsilon$ whenever $n \ge N$,
(ii)' for any $\varepsilon > 0$ and any $N$, there is an $n \ge N$ such that $x_n < x' + \varepsilon$.

Similarly, $\limsup x_n$ is the unique number $x''$ with the properties
(i)'' for any $\varepsilon > 0$, there is an $N$ such that $x_n < x'' + \varepsilon$ whenever $n \ge N$,
(ii)'' for any $\varepsilon > 0$ and any $N$, there is an $n \ge N$ such that $x_n > x'' - \varepsilon$.

Proof. We shall prove only the assertion about $\liminf x_n$. First, let
$x_n' = \inf\{x_n, x_{n+1}, \dots\} = \inf A_n$ as above, and let $x' = \lim x_n' = \liminf x_n$.
Suppose $\varepsilon > 0$. Choose $N$ so that $x_N' > x' - \varepsilon$. Then $n \ge N$ implies
$x_n \ge x_N' > x' - \varepsilon$, so (i)' holds. Given $\varepsilon > 0$ and $N$, we have
$x_N' \le x' < x' + \tfrac12\varepsilon$. Therefore $x' + \tfrac12\varepsilon$ is not a lower bound
for $A_N$, so there is an $n \ge N$ such that $x_n \le x' + \tfrac12\varepsilon < x' + \varepsilon$.
Thus (ii)' holds.
Now suppose $x'$ is a number satisfying (i)' and (ii)'. From (i)' it follows
that $\inf A_n \ge x' - \varepsilon$ whenever $n \ge N$. Thus $\liminf x_n \ge x' - \varepsilon$, all
$\varepsilon$, so $\liminf x_n \ge x'$. From (ii)' it follows that for any $N$ and any $\varepsilon$,
$\inf A_N < x' + \varepsilon$. Thus for any $N$, $\inf A_N \le x'$, so $\liminf x_n \le x'$. We have
$\liminf x_n = x'$. □
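The two conditions of Proposition 3.4 can be checked numerically on a concrete sequence. The following is only a sketch: the sequence $x_n = (-1)^n(1 + 1/n)$, the cutoff `HORIZON`, and all function names are illustrative assumptions, and a finite cutoff can only suggest, not prove, statements about all $n$.

```python
from fractions import Fraction

HORIZON = 400  # assumption: a finite cutoff standing in for "all n"

def x(n):
    # Illustrative bounded sequence x_n = (-1)^n (1 + 1/n); its lim inf is -1.
    return (-1) ** n * (1 + Fraction(1, n))

terms = {n: x(n) for n in range(1, HORIZON + 1)}

def tail_inf(n):
    # inf A_n, with A_n truncated at the horizon.
    return min(terms[k] for k in range(n, HORIZON + 1))

lim_inf = Fraction(-1)
eps = Fraction(1, 100)

# (i)': every term from N = 101 on exceeds lim_inf - eps.
holds_i = all(terms[n] > lim_inf - eps for n in range(101, HORIZON + 1))

# (ii)': past each sampled N there is still a term below lim_inf + eps.
holds_ii = all(
    any(terms[n] < lim_inf + eps for n in range(N, HORIZON + 1))
    for N in (1, 50, 200, 399)
)
```

For this sequence the odd-indexed terms increase toward $-1$, so the truncated infimum agrees with the true $\inf A_n$ whenever the window still contains an odd index.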

Exercises
1. The sequence $(1/n)_{n=1}^\infty$ has limit 0. (Use the Archimedean axiom, §2.)
2. If $x_n > 0$ and $x_n \to 0$, then $x_n^{1/2} \to 0$.
3. If $a > 0$, then $a^{1/n} \to 1$ as $n \to \infty$. (Hint: if $a \ge 1$, let $a^{1/n} = 1 + x_n$.
By the binomial expansion, or by induction, $a = (1 + x_n)^n \ge 1 + nx_n$. Thus
$x_n < n^{-1}a \to 0$. If $a < 1$, then $a^{1/n} = (b^{1/n})^{-1}$ where $b = a^{-1} > 1$.)


4. $\lim n^{1/n} = 1$. (Hint: let $n^{1/n} = 1 + y_n$. For $n \ge 2$,
$n = (1 + y_n)^n \ge 1 + ny_n + \tfrac12 n(n-1)y_n^2 > \tfrac12 n(n-1)y_n^2$, so
$y_n^2 \le 2(n-1)^{-1} \to 0$. Thus $y_n \to 0$.)
5. If $z \in \mathbb{C}$ and $|z| < 1$, then $z^n \to 0$ as $n \to \infty$.
6. Suppose $(x_n)_{n=1}^\infty$ is a bounded real sequence. Show that $x_n \to x$ if and
only if $\liminf x_n = x = \limsup x_n$.
7. Prove the second part of Proposition 3.4.
8. Suppose $(x_n)_{n=1}^\infty$ and $(a_n)_{n=1}^\infty$ are two bounded real sequences such that
$a_n \to a > 0$. Then

$$\liminf a_n x_n = a \liminf x_n, \qquad \limsup a_n x_n = a \limsup x_n.$$

4. Series
Suppose $(z_n)_{n=1}^\infty$ is a sequence in $\mathbb{C}$. We associate to it a second sequence
$(s_n)_{n=1}^\infty$, where

$$s_n = \sum_{m=1}^{n} z_m = z_1 + z_2 + \cdots + z_n.$$

If $(s_n)_{n=1}^\infty$ converges to $s$, it is reasonable to consider $s$ as the infinite sum
$\sum_{n=1}^\infty z_n$. Whether $(s_n)_{n=1}^\infty$ converges or not, the formal symbol
$\sum_{n=1}^\infty z_n$ or $\sum z_n$ is called an infinite series, or simply a series. The number
$z_n$ is called the nth term of the series, $s_n$ is called the nth partial sum. If $s_n \to s$
we say that the series $\sum z_n$ converges and that its sum is $s$. This is written

$$s = \sum_{n=1}^{\infty} z_n. \tag{4.1}$$

(Of course if the sequence is indexed differently, e.g., $(z_n)_{n=0}^\infty$, we make the
corresponding changes in defining $s_n$ and in (4.1).) If the sequence $(s_n)_{n=1}^\infty$ does
not converge, the series $\sum z_n$ is said to diverge.
In particular, suppose $(x_n)_{n=1}^\infty$ is a real sequence, and suppose each $x_n \ge 0$.
Then the sequence $(s_n)_{n=1}^\infty$ of partial sums is clearly an increasing sequence.
Either it is bounded, so (by Proposition 3.2) convergent, or for each $M > 0$
there is an $N$ such that

$$\sum_{m=1}^{n} x_m > M \qquad \text{whenever } n \ge N.$$

In the first case we write

$$\sum_{n=1}^{\infty} x_n < \infty \tag{4.2}$$

and in the second case we write

$$\sum_{n=1}^{\infty} x_n = \infty. \tag{4.3}$$


Thus (4.2) $\Leftrightarrow$ $\sum x_n$ converges, (4.3) $\Leftrightarrow$ $\sum x_n$ diverges.

Examples
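1. Consider the series $\sum_{n=1}^\infty n^{-1}$. We claim $\sum_{n=1}^\infty n^{-1} = \infty$.
In fact (symbolically),

$$\begin{aligned}
\sum n^{-1} &= 1 + \tfrac12 + \tfrac13 + \tfrac14 + \tfrac15 + \cdots \\
&\ge 1 + \tfrac12 + (\tfrac14 + \tfrac14) + (\tfrac18 + \tfrac18 + \tfrac18 + \tfrac18) + \cdots \\
&= 1 + \tfrac12 + 2(\tfrac14) + 4(\tfrac18) + 8(\tfrac1{16}) + \cdots \\
&= 1 + \tfrac12 + \tfrac12 + \tfrac12 + \cdots = \infty.
\end{aligned}$$

2. $\sum_{n=1}^\infty n^{-2} < \infty$. In fact (symbolically),

$$\begin{aligned}
\sum n^{-2} &= 1 + (\tfrac12)^2 + (\tfrac13)^2 + \cdots + (\tfrac1n)^2 + \cdots \\
&\le 1 + (\tfrac12)^2 + (\tfrac12)^2 + (\tfrac14)^2 + (\tfrac14)^2 + (\tfrac14)^2 + (\tfrac14)^2 + \cdots \\
&= 1 + 2(\tfrac12)^2 + 4(\tfrac14)^2 + 8(\tfrac18)^2 + \cdots \\
&= 1 + \tfrac12 + \tfrac14 + \tfrac18 + \cdots = 2.
\end{aligned}$$

(We leave it to the reader to make the above rigorous by considering the
respective partial sums.)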
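The grouping arguments in the two examples translate into bounds on partial sums that can be spot-checked with exact arithmetic. The sketch below is illustrative only: the function names and the sampled indices are assumptions, and checking finitely many partial sums does not replace the rigorous argument left to the reader.

```python
from fractions import Fraction

def harmonic_partial(n):
    # s_n = 1 + 1/2 + ... + 1/n, computed exactly.
    return sum(Fraction(1, k) for k in range(1, n + 1))

def inverse_square_partial(n):
    # s_n = 1 + 1/4 + ... + 1/n^2, computed exactly.
    return sum(Fraction(1, k * k) for k in range(1, n + 1))

# Grouping in Example 1 suggests s_{2^k} >= 1 + k/2, so the sums are unbounded.
harmonic_bounds = [
    harmonic_partial(2 ** k) >= 1 + Fraction(k, 2) for k in range(0, 9)
]

# Grouping in Example 2 suggests every partial sum stays below 2.
square_bounded = all(inverse_square_partial(n) < 2 for n in (1, 10, 100, 500))
```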
How does one tell whether a series converges? The question is whether the
sequence $(s_n)_{n=1}^\infty$ of partial sums converges. Theorem 3.3 gives a necessary and
sufficient condition for convergence of this sequence: that it be a Cauchy
sequence. However, this only refines our original question to: how does one
tell whether a series has a sequence of partial sums which is a Cauchy
sequence? The five propositions below give some answers.
Proposition 4.1. If $\sum_{n=1}^\infty z_n$ converges, then $z_n \to 0$.

Proof. If $\sum z_n$ converges, then the sequence $(s_n)_{n=1}^\infty$ of partial sums is a
Cauchy sequence, so $s_n - s_{n-1} \to 0$. But $s_n - s_{n-1} = z_n$. □

Note that the converse is false: $1/n \to 0$ but $\sum 1/n$ diverges.

Proposition 4.2. If $|z| < 1$, then $\sum_{n=0}^\infty z^n$ converges; the sum is $(1 - z)^{-1}$.
If $|z| \ge 1$, then $\sum_{n=0}^\infty z^n$ diverges.

Proof. The nth partial sum is

$$s_n = 1 + z + z^2 + \cdots + z^{n-1}.$$

Then $s_n(1 - z) = 1 - z^n$, so $s_n = (1 - z^n)/(1 - z)$. If $|z| < 1$, then as
$n \to \infty$, $z^n \to 0$ (Exercise 5 of §3). Therefore $s_n \to (1 - z)^{-1}$. If $|z| \ge 1$, then
$|z^n| \ge 1$, and Proposition 4.1 shows divergence. □

The series $\sum_{n=0}^\infty z^n$ is called a geometric series.
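The identity $s_n(1 - z) = 1 - z^n$ used in the proof of Proposition 4.2 is easy to confirm on sample values. This is a minimal sketch; the helper name `geometric_partial` and the chosen values of $z$ are illustrative assumptions.

```python
from fractions import Fraction

def geometric_partial(z, n):
    # s_n = 1 + z + z^2 + ... + z^(n-1)
    total = 0
    power = 1
    for _ in range(n):
        total += power
        power *= z
    return total

# Exact check of s_n (1 - z) = 1 - z^n for a rational z.
q = Fraction(1, 3)
exact_identity = geometric_partial(q, 10) * (1 - q) == 1 - q ** 10

# Floating-point check for a complex z with |z| < 1.
z = 0.5 + 0.5j
s_50 = geometric_partial(z, 50)
closed_form = (1 - z ** 50) / (1 - z)
limit = 1 / (1 - z)
```

With $|z| = 2^{-1/2}$, the gap between `s_50` and `limit` is of size $|z|^{50}/|1 - z|$, which is already well below $10^{-6}$.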

Proposition 4.3. (Comparison test). Suppose $(z_n)_{n=1}^\infty$ is a sequence in $\mathbb{C}$
and $(a_n)_{n=1}^\infty$ a sequence in $\mathbb{R}$ with each $a_n \ge 0$. If there are constants $M$, $N$
such that

$$|z_n| \le Ma_n \qquad \text{whenever } n \ge N,$$


and if $\sum a_n$ converges, then $\sum z_n$ converges.

Proof. Let $s_n = \sum_{m=1}^{n} z_m$, $b_n = \sum_{m=1}^{n} a_m$. If $n > m \ge N$ then

$$|s_n - s_m| = \Big|\sum_{j=m+1}^{n} z_j\Big| \le \sum_{j=m+1}^{n} |z_j| \le M \sum_{j=m+1}^{n} a_j = M(b_n - b_m).$$

But $(b_n)_{n=1}^\infty$ is a Cauchy sequence, so this inequality implies that $(s_n)_{n=1}^\infty$ is
also a Cauchy sequence. □
Proposition 4.4. (Ratio test). Suppose $(z_n)_{n=1}^\infty$ is a sequence in $\mathbb{C}$ and
suppose $z_n \ne 0$, all $n$.
(a) If

$$\limsup |z_{n+1}/z_n| < 1,$$

then $\sum z_n$ converges.
(b) If

$$\liminf |z_{n+1}/z_n| > 1,$$

then $\sum z_n$ diverges.

Proof. (a) In this case, take $r$ so that $\limsup |z_{n+1}/z_n| < r < 1$. By
Proposition 3.4, there is an $N$ so that $|z_{n+1}/z_n| \le r$ whenever $n \ge N$. Thus if
$n > N$,

$$|z_n| \le r|z_{n-1}| \le r \cdot r|z_{n-2}| \le \cdots \le r^{n-N}|z_N| = Mr^n,$$

where $M = r^{-N}|z_N|$. Propositions 4.2 and 4.3 imply convergence.
(b) In this case, Proposition 3.4 implies that for some $N$, $|z_{n+1}/z_n| \ge 1$ if
$n \ge N$. Thus for $n > N$,

$$|z_n| \ge |z_{n-1}| \ge \cdots \ge |z_N| > 0.$$

We cannot have $z_n \to 0$, so Proposition 4.1 implies divergence. □

Corollary 4.5. If $z_n \ne 0$ for $n = 1, 2, \dots$ and if $\lim |z_{n+1}/z_n|$ exists,
then the series $\sum z_n$ converges if the limit is $< 1$ and diverges if the limit is $> 1$.
Note that for both the series $\sum 1/n$ and $\sum 1/n^2$, the limit in Corollary 4.5
equals 1. Thus either convergence or divergence is possible in this case.
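The inconclusive case can be seen directly by computing the ratios for the two series named above. The snippet is a small illustration with assumed helper names; it shows both ratio sequences approaching 1 even though one series diverges and the other converges.

```python
from fractions import Fraction

def ratio(term, n):
    # |z_{n+1} / z_n| for a sequence given as a function of n.
    return abs(term(n + 1) / term(n))

def harmonic(n):
    return Fraction(1, n)

def inverse_square(n):
    return Fraction(1, n * n)

sample = (10, 100, 1000)
ratios_harmonic = [ratio(harmonic, n) for n in sample]      # n / (n + 1)
ratios_square = [ratio(inverse_square, n) for n in sample]  # (n / (n + 1))^2
```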
Proposition 4.6. (Root test). Suppose $(z_n)_{n=1}^\infty$ is a sequence in $\mathbb{C}$.
(a) If

$$\limsup |z_n|^{1/n} < 1,$$

then $\sum z_n$ converges.
(b) If

$$\limsup |z_n|^{1/n} > 1,$$

then $\sum z_n$ diverges.


Proof. (a) In this case, take $r$ so that $\limsup |z_n|^{1/n} < r < 1$. By Proposition 3.4,
there is an $N$ so that $|z_n|^{1/n} \le r$ whenever $n \ge N$. Thus if $n \ge N$,
then $|z_n| \le r^n$. Propositions 4.2 and 4.3 imply convergence.
(b) In this case, Proposition 3.4 implies that $|z_n|^{1/n} \ge 1$ for infinitely
many values of $n$. Thus Proposition 4.1 implies divergence. □

Note the tacit assumption in the statement and proof that $(|z_n|^{1/n})_{n=1}^\infty$ is a
bounded sequence, so that the upper and lower limits exist. However, if this
sequence is not bounded, then in particular $|z_n| \ge 1$ for infinitely many
values of $n$, and Proposition 4.1 implies divergence.

Corollary 4.7. If $\lim |z_n|^{1/n}$ exists, then the series $\sum z_n$ converges if the
limit is $< 1$ and diverges if the limit is $> 1$.

Note that for both the series $\sum 1/n$ and $\sum 1/n^2$, the limit in Corollary 4.7
equals 1 (see Exercise 4 of §3). Thus either convergence or divergence is
possible in this case.
A particularly important class of series are the power series. If $(a_n)_{n=0}^\infty$ is
a sequence in $\mathbb{C}$ and $z_0$ a fixed element of $\mathbb{C}$, then the series

$$\sum_{n=0}^{\infty} a_n (z - z_0)^n \tag{4.2}$$

is the power series around $z_0$ with coefficients $(a_n)_{n=0}^\infty$. Here we use the
convention that $w^0 = 1$ for all $w \in \mathbb{C}$, including $w = 0$. Thus (4.2) is defined, as a
series, for each $z \in \mathbb{C}$. For $z = z_0$ it converges (with sum $a_0$), but for other
values of $z$ it may or may not converge.
Theorem 4.8. Consider the power series (4.2). Define $R$ by

$R = 0$ if $(|a_n|^{1/n})_{n=1}^\infty$ is not a bounded sequence,
$R = (\limsup |a_n|^{1/n})^{-1}$ if $\limsup |a_n|^{1/n} > 0$,
$R = \infty$ if $\limsup |a_n|^{1/n} = 0$.

Then the power series (4.2) converges if $|z - z_0| < R$, and diverges if
$|z - z_0| > R$.

Proof. We have

$$|a_n (z - z_0)^n|^{1/n} = |a_n|^{1/n} |z - z_0|. \tag{4.3}$$

Suppose $z \ne z_0$. If $(|a_n|^{1/n})_{n=1}^\infty$ is not a bounded sequence, then neither is
(4.3), and we have divergence. Otherwise the conclusions follow from (4.3)
and the root test, Proposition 4.6. □
The number R defined in the statement of Theorem 4.8 is called the radius
of convergence of the power series (4.2). It is the radius of the largest circle in
the complex plane inside which (4.2) converges.
Theorem 4.8 is quite satisfying from a theoretical point of view: the
radius of convergence is shown to exist and is (in principle) determined in all


cases. However, recognizing $\limsup |a_n|^{1/n}$ may be very difficult in practice.
The following is often helpful.

Theorem 4.9. Suppose $a_n \ne 0$ for $n \ge N$, and suppose $\lim |a_{n+1}/a_n|$ exists.
Then the radius of convergence $R$ of the power series (4.2) is given by

$R = (\lim |a_{n+1}/a_n|)^{-1}$ if $\lim |a_{n+1}/a_n| > 0$,
$R = \infty$ if $\lim |a_{n+1}/a_n| = 0$.

Proof. Apply Corollary 4.5 to the series (4.2), noting that if $z \ne z_0$ then

$$\left|\frac{a_{n+1}(z - z_0)^{n+1}}{a_n (z - z_0)^n}\right| = \left|\frac{a_{n+1}}{a_n}\right| |z - z_0|.$$
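As an illustration of Theorem 4.9, the coefficient ratios can be tabulated for a specific power series. The coefficients $a_n = n2^n$ below are a hypothetical example chosen for this sketch, not one taken from the text; the computed ratios $2(n+1)/n$ approach 2, which suggests $R = 1/2$.

```python
from fractions import Fraction

def a(n):
    # Hypothetical coefficients a_n = n * 2^n, nonzero for n >= 1.
    return Fraction(n * 2 ** n)

def coefficient_ratio(n):
    # |a_{n+1} / a_n|
    return abs(a(n + 1) / a(n))

sample = (1, 10, 100, 1000)
ratios = [coefficient_ratio(n) for n in sample]
radius_estimates = [1 / r for r in ratios]  # reciprocals approach R
```

A finite table only suggests the limit; here it can be read off from the closed form of the ratio.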

Exercises

1. If $\sum_{n=1}^\infty z_n$ converges with sum $s$ and $\sum_{n=1}^\infty w_n$ converges with sum $t$,
then $\sum_{n=1}^\infty (z_n + w_n)$ converges with sum $s + t$.
2. Suppose $\sum a_n$ and $\sum b_n$ each have all non-negative terms. If there are
constants $M > 0$ and $N$ such that $b_n \ge Ma_n$ whenever $n \ge N$, and if
$\sum a_n = \infty$, then $\sum b_n = \infty$.
3. Show that $\sum_{n=1}^\infty (n + 1)/(2n^2 + 1)$ diverges and $\sum_{n=1}^\infty (n + 1)/(2n^3 + 1)$
converges. (Hint: use Proposition 4.3 and Exercise 2, and compare these to
$\sum 1/n$, $\sum 1/n^2$.)
4. ($2^k$-Test). Suppose $a_1 \ge a_2 \ge \cdots \ge a_n \ge 0$, all $n$. Then
$\sum_{n=1}^\infty a_n < \infty \Leftrightarrow \sum_{k=1}^\infty 2^k a_{2^k} < \infty$. (Hint: use the methods used to
show divergence of $\sum 1/n$ and convergence of $\sum 1/n^2$.)
5. (Integral Test). Suppose $a_1 \ge a_2 \ge \cdots \ge a_n \ge 0$, all $n$. Suppose
$f: [1, \infty) \to \mathbb{R}$ is a continuous function such that $f(n) = a_n$, all $n$, and
$f(y) \le f(x)$ if $y \ge x$. Then $\sum_{n=1}^\infty a_n < \infty \Leftrightarrow \int_1^\infty f(x)\,dx < \infty$.
6. Suppose $p > 0$. The series $\sum_{n=1}^\infty n^{-p}$ converges if $p > 1$ and diverges
if $p \le 1$. (Use Exercise 4 or Exercise 5.)
7. The series $\sum_{n=2}^\infty n^{-1}(\log n)^{-2}$ converges; the series $\sum_{n=2}^\infty n^{-1}(\log n)^{-1}$
diverges.
8. The series $\sum_{n=0}^\infty z^n/n!$ converges for any $z \in \mathbb{C}$. (Here $0! = 1$,
$n! = n(n - 1)(n - 2) \cdots 1$.)
9. Determine the radius of convergence of

$$\sum_{n=0}^{\infty} n!\,z^n, \qquad \sum_{n=0}^{\infty} n!\,z^n/(2n)!.$$

10. (Alternating series). Suppose $|x_1| \ge |x_2| \ge \cdots \ge |x_n|$, all $n$, $x_n \ge 0$
if $n$ odd, $x_n \le 0$ if $n$ even, and $x_n \to 0$. Then $\sum x_n$ converges. (Hint: the partial
sums satisfy $s_2 \le s_4 \le s_6 \le \cdots \le s_5 \le s_3 \le s_1$.)
11. $\sum_{n=1}^\infty (-1)^n/n$ converges.


5. Metric spaces
A metric on a set $S$ is a function $d$ from the product set $S \times S$ to $\mathbb{R}$, with
the properties

D1. $d(x, x) = 0$, $d(x, y) > 0$, if $x, y \in S$, $x \ne y$.
D2. $d(x, y) = d(y, x)$, all $x, y \in S$.
D3. $d(x, z) \le d(x, y) + d(y, z)$, all $x, y, z \in S$.

We shall refer to $d(x, y)$ as the distance from $x$ to $y$. A metric space is a set $S$
together with a given metric $d$. The inequality D3 is called the triangle
inequality. The elements of $S$ are often called points.
As an example, take $S = \mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$, with

$$d((x, y), (x', y')) = [(x - x')^2 + (y - y')^2]^{1/2}. \tag{5.1}$$

If we coordinatize the Euclidean plane in the usual way, and if $(x, y)$, $(x', y')$
are the coordinates of points $P$ and $P'$ respectively, then (5.1) gives the length
of the line segment $PP'$ (Pythagorean theorem). In this example, D3 is the
analytic expression of the fact that the length of one side of a triangle is at
most the sum of the lengths of the other two sides. The same example in
different guise is obtained by letting $S = \mathbb{C}$ and taking

$$d(z, w) = |z - w| \tag{5.2}$$

as the metric. Then D3 is a consequence of Exercise 5 in §2.


Some other possible metrics on $\mathbb{R}^2$ are:

$d_1((x, y), (x', y')) = |x - x'| + |y - y'|$,
$d_2((x, y), (x', y')) = \max\{|x - x'|, |y - y'|\}$,
$d_3((x, y), (x', y')) = 0$ if $(x, y) = (x', y')$, and 1 otherwise.

Verification that the functions $d_1$, $d_2$, and $d_3$ satisfy the conditions D1, D2, D3
is left as an exercise. Note that $d_3$ works for any set $S$: if $x, y \in S$ we set
$d(x, y) = 1$ if $x \ne y$ and 0 if $x = y$.
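Before attempting a proof, the conditions D1, D2, D3 can be spot-checked on a finite set of points. The sketch below is an illustration only: the function names, the integer grid, and the floating-point tolerance are assumptions, and passing on a grid is evidence, not a verification for all of $\mathbb{R}^2$.

```python
import itertools
import math

def d_euclid(p, q):
    # The metric (5.1).
    return math.hypot(p[0] - q[0], p[1] - q[1])

def d1(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d2(p, q):
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def d3(p, q):
    return 0 if p == q else 1

def satisfies_axioms(d, points, tol=1e-12):
    for p, q in itertools.product(points, repeat=2):
        if p == q and d(p, q) != 0:          # D1, first part
            return False
        if p != q and not d(p, q) > 0:       # D1, second part
            return False
        if abs(d(p, q) - d(q, p)) > tol:     # D2
            return False
    for p, q, r in itertools.product(points, repeat=3):
        if d(p, r) > d(p, q) + d(q, r) + tol:  # D3
            return False
    return True

grid = [(x, y) for x in range(-2, 3) for y in range(-2, 3)]
results = {
    name: satisfies_axioms(d, grid)
    for name, d in [("euclid", d_euclid), ("d1", d1), ("d2", d2), ("d3", d3)]
}
```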
A still simpler example of a metric space is $\mathbb{R}$, with distance function $d$
given by

$$d(x, y) = |x - y|. \tag{5.3}$$

Again this coincides with the usual notion of the distance between two points
on the (coordinatized) line.
Another important example is $\mathbb{R}^n$, the space of ordered $n$-tuples
$x = (x_1, x_2, \dots, x_n)$ of elements of $\mathbb{R}$. There are various possible metrics on $\mathbb{R}^n$
like the metrics $d_1$, $d_2$, $d_3$ defined above for $\mathbb{R}^2$, but we shall consider here
only the generalization of the Euclidean distance in $\mathbb{R}^2$ and $\mathbb{R}^3$. If
$x = (x_1, x_2, \dots, x_n)$ and $y = (y_1, y_2, \dots, y_n)$ we set

$$d(x, y) = [(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_n - y_n)^2]^{1/2}. \tag{5.4}$$

When $n = 1$ we obtain $\mathbb{R}$ with the metric (5.3); when $n = 2$ we obtain $\mathbb{R}^2$
with the metric (5.1), in somewhat different notation. It is easy to verify that


$d$ given by (5.4) satisfies D1 and D2, but condition D3 is not so easy to verify.
For now we shall simply assert that $d$ satisfies D3; a proof will be given in a
more general setting in Chapter 4.
Often when the metric $d$ is understood, one refers to a set $S$ alone as a
metric space. For example, when we refer to $\mathbb{R}$, $\mathbb{C}$, or $\mathbb{R}^n$ as a metric space with
no indication what metric is taken, we mean the metric to be given by (5.3),
(5.2), or (5.4) respectively.
Suppose $(S, d)$ is a metric space and $T$ is a subset of $S$. We can consider
$T$ as a metric space by taking the distance function on $T \times T$ to be the
restriction of $d$ to $T \times T$.
The concept of metric space has been introduced to provide a uniform
treatment of such notions as distance, convergence, and limit which occur in
many contexts in analysis. Later we shall encounter metric spaces much more
exotic than $\mathbb{R}^n$ and $\mathbb{C}$.
Suppose $(S, d)$ is a metric space, $x$ is a point of $S$, and $r$ is a positive real
number. The ball of radius $r$ about $x$ is defined to be the subset of $S$ consisting
of all points in $S$ whose distance from $x$ is less than $r$:

$$B_r(x) = \{y \mid y \in S,\ d(x, y) < r\}.$$

Clearly $x \in B_r(x)$. If $0 < r < s$, then $B_r(x) \subset B_s(x)$.

Examples
When $S = \mathbb{R}$ (metric understood), $B_r(x)$ is the open interval $(x - r, x + r)$.
When $S = \mathbb{R}^2$ or $\mathbb{C}$, $B_r(z)$ is the open disc of radius $r$ centered at $z$. Here we
take the adjective "open" as understood; we shall see that the interval and
the disc in question are also open in the sense defined below.
A subset $A \subset S$ is said to be a neighborhood of the point $x \in S$ if $A$ contains
$B_r(x)$ for some $r > 0$. Roughly speaking, this says that $A$ contains all
points sufficiently close to $x$. In particular, if $A$ is a neighborhood of $x$ it
contains $x$ itself.
A subset $A \subset S$ is said to be open if it is a neighborhood of each of its
points. Note that the empty set is an open subset of $S$: since it has no points
(elements), it is a neighborhood of each one it has.

Example

Consider the interval $A = (0, 1] \subset \mathbb{R}$. This is a neighborhood of each of
its points except $x = 1$. In fact, if $0 < x < 1$, let $r = \min\{x, 1 - x\}$. Then
$A \supset B_r(x) = (x - r, x + r)$. However, for any $r > 0$, $B_r(1)$ contains $1 + \tfrac12 r$,
which is not in $A$.
We collect some useful facts about open sets in the following proposition.

Proposition 5.1. Suppose $(S, d)$ is a metric space.

(a) For any $x \in S$ and any $r > 0$, $B_r(x)$ is open.
(b) If $A_1, A_2, \dots, A_n$ are open subsets of $S$, then $\bigcap_{m=1}^{n} A_m$ is also open.


(c) If $(A_\beta)_{\beta \in B}$ is any collection of open subsets of $S$, then $\bigcup_{\beta \in B} A_\beta$ is also
open.

Proof. (a) Suppose $y \in B_r(x)$. We want to show that for some $s > 0$,
$B_s(y) \subset B_r(x)$. The triangle inequality makes this easy, for we can choose
$s = r - d(y, x)$. (Since $y \in B_r(x)$, $s$ is positive.) If $z \in B_s(y)$, then

$$d(z, x) \le d(z, y) + d(y, x) < s + d(y, x) = r.$$

Thus $z \in B_r(x)$.
(b) Suppose $x \in \bigcap_{m=1}^{n} A_m$. Since each $A_m$ is open, there is $r(m) > 0$ so that
$B_{r(m)}(x) \subset A_m$. Let $r = \min\{r(1), r(2), \dots, r(n)\}$. Then $r > 0$ and
$B_r(x) \subset B_{r(m)}(x) \subset A_m$, so $B_r(x) \subset \bigcap_{m=1}^{n} A_m$. (Why is it necessary here to
assume that $A_1, A_2, \dots$ is a finite collection of sets?)
(c) Suppose $x \in A = \bigcup_{\beta \in B} A_\beta$. Then for some particular $\beta$, $x \in A_\beta$. Since
$A_\beta$ is open, there is an $r > 0$ so that $B_r(x) \subset A_\beta \subset A$. Thus $A$ is open. □
Again suppose $(S, d)$ is a metric space and suppose $A \subset S$. A point $x \in S$
is said to be a limit point of $A$ if for every $r > 0$ there is a point of $A$ with
distance from $x$ less than $r$:

$$B_r(x) \cap A \ne \emptyset \qquad \text{if } r > 0.$$

In particular, if $x \in A$ then $x$ is a limit point of $A$. The set $A$ is said to be
closed if it contains each of its limit points. Note that the empty set is closed,
since it has no limit points.
Example

The interval $(0, 1] \subset \mathbb{R}$ has as its set of limit points the closed interval
$[0, 1]$. In fact if $0 < x \le 1$, then $x$ is certainly a limit point. If $x = 0$ and
$r > 0$, then $B_r(0) \cap (0, 1] = (-r, r) \cap (0, 1] \ne \emptyset$. If $x < 0$ and $r = |x|$, then
$B_r(x) \cap (0, 1] = \emptyset$, while if $x > 1$ and $r = x - 1$, then $B_r(x) \cap (0, 1] = \emptyset$.
Thus the interval $(0, 1]$ is neither open nor closed. The exact relationship
between open sets and closed sets is given in Proposition 5.3 below.

The following is the analogue for closed sets of Proposition 5.1.

Proposition 5.2. Suppose $(S, d)$ is a metric space.

(a) For any $x \in S$ and any $r > 0$, the closed ball $C = \{y \mid y \in S,\ d(x, y) \le r\}$
is a closed set.
(b) If $A_1, A_2, \dots, A_n$ are closed subsets of $S$, then $\bigcup_{m=1}^{n} A_m$ is closed.
(c) If $(A_\beta)_{\beta \in B}$ is any collection of closed subsets of $S$, then $\bigcap_{\beta \in B} A_\beta$ is closed.

Proof. (a) Suppose $z$ is a limit point of the set $C$. Given $\varepsilon > 0$, there is
a point $y \in B_\varepsilon(z) \cap C$. Then

$$d(z, x) \le d(z, y) + d(y, x) < \varepsilon + r.$$

Since this is true for every $\varepsilon > 0$, we must have $d(z, x) \le r$. Thus $z \in C$.


(b) Suppose $x \notin A = \bigcup_{m=1}^{n} A_m$. For each $m$, $x$ is not a limit point of $A_m$,
so there is $r(m) > 0$ such that $B_{r(m)}(x) \cap A_m = \emptyset$. Let

$$r = \min\{r(1), r(2), \dots, r(n)\}.$$

Then $B_r(x) \cap A_m = \emptyset$, all $m$, so $B_r(x) \cap A = \emptyset$. Thus $x$ is not a limit point
of $A$.
(c) Suppose $x$ is a limit point of $A = \bigcap_{\beta \in B} A_\beta$. For any $r > 0$,
$B_r(x) \cap A \ne \emptyset$. But $A \subset A_\beta$, so $B_r(x) \cap A_\beta \ne \emptyset$. Thus $x$ is a limit point of
$A_\beta$, so it is in $A_\beta$. This is true for each $\beta$, so $x \in A$. □

Proposition 5.3. Suppose $(S, d)$ is a metric space. A subset $A \subset S$ is open
if and only if its complement is closed.

Proof. Let $B$ be the complement of $A$. Suppose $B$ is closed, and suppose
$x \in A$. Then $x$ is not a limit point of $B$, so for some $r > 0$ we have
$B_r(x) \cap B = \emptyset$. Thus $B_r(x) \subset A$, and $A$ is a neighborhood of $x$.
Conversely, suppose $A$ is open and suppose $x \notin B$. Then $x \in A$, so for
some $r > 0$ we have $B_r(x) \subset A$. Then $B_r(x) \cap B = \emptyset$, and $x$ is not a limit
point of $B$. It follows that every limit point of $B$ is in $B$. □

The set of limit points of a subset $A \subset S$ is called the closure of $A$; we
shall denote it by $A^-$. We have $A \subset A^-$ and $A$ is closed if and only if $A = A^-$.
In the example above, we saw that the closure of $(0, 1] \subset \mathbb{R}$ is $[0, 1]$.
Suppose $A$, $B$ are subsets of $S$ and $A \subset B$. We say that $A$ is dense in $B$ if
$B \subset A^-$. In particular, $A$ is dense in $S$ if $A^- = S$. As an example, $\mathbb{Q}$ (the
rationals) is dense in $\mathbb{R}$. In fact, suppose $x \in \mathbb{R}$ and $r > 0$. Choose a positive
integer $n$ so large that $1/n < r$. There is a unique integer $m$ so that
$m/n \le x < (m + 1)/n$. Then $d(x, m/n) = x - m/n < (m + 1)/n - m/n = 1/n < r$,
so $m/n \in B_r(x)$. Thus $x \in \mathbb{Q}^-$.
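The density argument is constructive: it produces $n$ and then $m$ explicitly. The following sketch follows those two steps with exact arithmetic. The function name is an assumption, and for the sake of exactness the input $x$ is itself given as a `Fraction`, which is a simplification; the argument in the text applies to any real $x$.

```python
from fractions import Fraction
import math

def rational_in_ball(x, r):
    # Step 1: a positive integer n with 1/n < r.
    n = math.floor(1 / r) + 1
    # Step 2: the unique integer m with m/n <= x < (m + 1)/n.
    m = math.floor(x * n)
    return Fraction(m, n)

x = Fraction(355, 113)
r = Fraction(1, 1000)
q = rational_in_ball(x, r)  # lies within distance r of x
```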
A sequence $(x_n)_{n=1}^\infty$ in $S$ is said to converge to $x \in S$ if for each $\varepsilon > 0$ there
is an $N$ so that $d(x_n, x) < \varepsilon$ if $n \ge N$. The point $x$ is called the limit of the
sequence, and we write

$$\lim_{n \to \infty} x_n = x \qquad \text{or} \qquad x_n \to x.$$

When $S = \mathbb{R}$ or $\mathbb{C}$ (with the usual metric), this coincides with the definition
in §3. Again the limit, if any, is unique.
A sequence $(x_n)_{n=1}^\infty$ in $S$ is said to be a Cauchy sequence if for each $\varepsilon > 0$
there is an $N$ so that $d(x_n, x_m) < \varepsilon$ if $n, m \ge N$. Again when $S = \mathbb{R}$ or $\mathbb{C}$, this
coincides with the definition in §3.
The metric space $(S, d)$ is said to be complete if every Cauchy sequence in
$S$ converges to a point of $S$. As an example, Theorem 3.3 says precisely that
$\mathbb{R}$ and $\mathbb{C}$ are complete metric spaces with respect to the usual metrics.
Many processes in analysis produce sequences of numbers, functions,
etc., in various metric spaces. It is important to know when such sequences
converge. Knowing that the metric space in question is complete is a powerful
tool, since the condition that the sequence be a Cauchy sequence is then a


necessary and sufficient condition for convergence. We have already seen this
in our discussion of series, for example.

Note that $\mathbb{R}^n$ is complete. To see this note that in $\mathbb{R}^n$,

$$\max\{|x_j - y_j|,\ j = 1, \dots, n\} \le d(x, y) \le n \max\{|x_j - y_j|,\ j = 1, \dots, n\}.$$

It follows that a sequence of points in $\mathbb{R}^n$ converges if and only if each of the
$n$ corresponding sequences of coordinates converges in $\mathbb{R}$. Similarly, a sequence
of points in $\mathbb{R}^n$ is a Cauchy sequence if and only if each of the $n$
corresponding sequences of coordinates is a Cauchy sequence in $\mathbb{R}$. Thus
completeness of $\mathbb{R}^n$ follows from completeness of $\mathbb{R}$. (This is simply a
generalization of the argument showing $\mathbb{C}$ is complete.)
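The two-sided inequality between the Euclidean distance and the largest coordinate gap can be sanity-checked on a few sample pairs. This is only an illustration: the helper names, the sample points, and the floating-point tolerance are assumptions.

```python
import math

def euclid(x, y):
    # The metric (5.4).
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def max_gap(x, y):
    # max |x_j - y_j| over the coordinates.
    return max(abs(a - b) for a, b in zip(x, y))

def sandwich_holds(x, y, tol=1e-12):
    n = len(x)
    d = euclid(x, y)
    return max_gap(x, y) <= d + tol and d <= n * max_gap(x, y) + tol

pairs = [
    ((0, 0, 0), (1, 2, 2)),
    ((1, -1, 4, 2), (3, 5, -2, 2)),
    ((0.5,), (-1.5,)),
]
checks = [sandwich_holds(x, y) for x, y in pairs]
```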

Exercises
1. If $(S, d)$ is a metric space, $x \in S$, and $r \ge 0$, then

$$\{y \mid y \in S,\ d(y, x) > r\}$$

is an open subset of $S$.
2. The point $x$ is a limit point of a set $A \subset S$ if and only if there is a
sequence $(x_n)_{n=1}^\infty$ in $A$ such that $x_n \to x$.
3. If a sequence $(x_n)_{n=1}^\infty$ in a metric space converges to $x \in S$ and also
converges to $y \in S$, then $x = y$.
4. If a sequence converges, then it is a Cauchy sequence.
5. If $(S, d)$ is a complete metric space and $A \subset S$ is closed, then $(A, d)$ is
complete. Conversely, if $B \subset S$ and $(B, d)$ is complete, then $B$ is a closed
subset of $S$.
6. The interval $(0, 1)$ is open as a subset of $\mathbb{R}$, but not as a subset of $\mathbb{C}$.
7. Let $S = \mathbb{Q}$ (the rational numbers) and let $d(x, y) = |x - y|$, $x, y \in \mathbb{Q}$.
Show that $(S, d)$ is not complete.
8. The set of all elements $x = (x_1, x_2, \dots, x_n)$ in $\mathbb{R}^n$ such that each $x_j$ is
rational is a dense subset of $\mathbb{R}^n$.
9. Verify that $\mathbb{R}^n$ is complete.

6. Compact Sets
Suppose that $(S, d)$ is a metric space, and suppose $A$ is a subset of $S$. The
subset $A$ is said to be compact if it has the following property: suppose that
for each $x \in A$ there is given a neighborhood of $x$, denoted $N(x)$; then there
are finitely many points $x_1, x_2, \dots, x_n$ in $A$ such that $A$ is contained in the
union of $N(x_1), N(x_2), \dots, N(x_n)$. (Note that we are saying that this is true
for any choice of neighborhoods of points of $A$, though the selection of
points $x_1, x_2, \dots$ may depend on the selection of neighborhoods.) It is
obvious that any finite subset $A$ is compact.


Examples

1. The infinite interval $(0, \infty) \subset \mathbb{R}$ is not compact. For example, let
$N(x) = (x - 1, x + 1)$, $x \in (0, \infty)$. Clearly no finite collection of these
intervals of finite length can cover all of $(0, \infty)$.
2. Even the finite interval $(0, 1] \subset \mathbb{R}$ is not compact. To see this, let
$N(x) = (\tfrac12 x, 2)$, $x \in (0, 1]$. For any $x_1, x_2, \dots, x_n \in (0, 1]$, the union of the
intervals $N(x_j)$ will not contain $y$ if $y \le \tfrac12 \min\{x_1, x_2, \dots, x_n\}$.
3. The set $A = \{0\} \cup \{1, \tfrac12, \tfrac13, \tfrac14, \dots\} \subset \mathbb{R}$ is compact. In fact, suppose
for each $x \in A$ we are given a neighborhood $N(x)$. In particular, the neighborhood
$N(0)$ of 0 contains an interval $(-\varepsilon, \varepsilon)$. Let $M$ be a positive integer
larger than $1/\varepsilon$. Then $1/n \in N(0)$ for $n \ge M$, and it follows that
$A \subset N(0) \cup N(1) \cup N(\tfrac12) \cup \cdots \cup N(1/M)$.
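The choice of $M$ in Example 3 can be traced through on a concrete $\varepsilon$. The sketch below assumes only what the example uses: that $N(0)$ contains $(-\varepsilon, \varepsilon)$ and that each $N(x)$ contains $x$. The function names and the value $\varepsilon = 1/7$ are illustrative.

```python
from fractions import Fraction
import math

def finite_subcover_centers(eps):
    # M is a positive integer larger than 1/eps.
    M = math.floor(1 / eps) + 1
    # Centers of the chosen neighborhoods: 0, 1, 1/2, ..., 1/M.
    centers = [Fraction(0)] + [Fraction(1, n) for n in range(1, M + 1)]
    return M, centers

def covered(point, eps, centers):
    # N(0) is assumed to contain (-eps, eps); any other N(c) is only
    # assumed to contain its own center c.
    return abs(point) < eps or point in centers

eps = Fraction(1, 7)
M, centers = finite_subcover_centers(eps)
all_covered = covered(Fraction(0), eps, centers) and all(
    covered(Fraction(1, n), eps, centers) for n in range(1, 1001)
)
```

Only the first thousand points $1/n$ are tested here; the argument in the text handles all $n \ge M$ at once through $N(0)$.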
The first two examples illustrate general requirements which compact sets
must satisfy. A subset $A$ of $S$, when $(S, d)$ is a metric space, is said to be
bounded if there is a ball $B_r(x)$ containing $A$.

Proposition 6.1. Suppose $(S, d)$ is a metric space, $S \ne \emptyset$, and suppose
$A \subset S$ is compact. Then $A$ is closed and bounded.

Proof. Suppose $y \notin A$. We want to show that $y$ is not a limit point of $A$.
For any $x \in A$, let $N(x)$ be the ball of radius $\tfrac12 d(x, y)$ around $x$. By assumption,
there are $x_1, x_2, \dots, x_n \in A$ such that $A \subset \bigcup_{m=1}^{n} N(x_m)$. Let $r$ be the
minimum of the numbers $\tfrac12 d(x_1, y), \dots, \tfrac12 d(x_n, y)$. If $x \in A$, then for some $m$,
$d(x, x_m) < \tfrac12 d(x_m, y)$. But then

$$d(x_m, y) \le d(x_m, x) + d(x, y) < \tfrac12 d(x_m, y) + d(x, y),$$

so $d(x, y) > \tfrac12 d(x_m, y) \ge r$. Thus $B_r(y) \cap A = \emptyset$, and $y$ is not a limit point
of $A$.
Next, we want to show that $A$ is bounded. For each $x \in A$, let $N(x)$ be the
ball of radius 1 around $x$. Again, by assumption there are $x_1, x_2, \dots, x_n \in A$
such that $A \subset \bigcup_{m=1}^{n} N(x_m)$. Let

$$r = 1 + \max\{d(x_1, x_2), d(x_1, x_3), \dots, d(x_1, x_n)\}.$$

If $y \in A$ then for some $m$, $d(y, x_m) < 1$. Therefore
$d(y, x_1) \le d(y, x_m) + d(x_m, x_1) < 1 + d(x_m, x_1) \le r$, and we have $A \subset B_r(x_1)$. □

The converse of Proposition 6.1, that a closed, bounded subset of a metric
space is compact, is not true in general. It is a subtle but extremely important
fact that it is true in $\mathbb{R}^n$, however.

Theorem 6.2. (Heine-Borel Theorem). A subset of $\mathbb{R}^n$ or of $\mathbb{C}$ is compact
if and only if it is closed and bounded.

Proof. We have seen that in any metric space, if $A$ is compact it is
necessarily closed and bounded. Conversely, suppose $A \subset \mathbb{R}^n$ is closed and
bounded. Let us assume at first that $n = 1$. Since $A$ is bounded, it is contained


in some closed interval $[a, b]$. Suppose for each $x \in A$, we are given a neighborhood
$N(x)$ of $x$. We shall say that a closed subinterval of $[a, b]$ is nice if
there are points $x_1, x_2, \dots, x_m \in A$ such that $\bigcup_{j=1}^{m} N(x_j)$ contains the
intersection of the subinterval with $A$; we are trying to show that $[a, b]$ itself is
nice. Suppose it is not. Consider the two subintervals $[a, c]$ and $[c, b]$, where
$c = \tfrac12(a + b)$ is the midpoint of $[a, b]$. If both of these were nice, it would
follow that $[a, b]$ itself is nice. Therefore we must have one of them not nice;
denote its endpoints by $a_1$, $b_1$ and let $c_1 = \tfrac12(a_1 + b_1)$. Again, one of the
intervals $[a_1, c_1]$ and $[c_1, b_1]$ must not be nice; denote it by $[a_2, b_2]$. Continuing
in this way we get a sequence of intervals $[a_m, b_m]$, $m = 0, 1, 2, \dots$ such
that $[a_0, b_0] = [a, b]$, each $[a_m, b_m]$ is the left or right half of the interval
$[a_{m-1}, b_{m-1}]$, and each interval $[a_m, b_m]$ is not nice. It follows that
$a_0 \le a_1 \le \cdots \le a_m \le b_m \le \cdots \le b_1 \le b_0$ and $b_m - a_m = 2^{-m}(b_0 - a_0) \to 0$.
Therefore there is a point $x$ such that $a_m \to x$ and $b_m \to x$. Moreover, $a_m \le x \le b_m$,
for all $m$. We claim that $x \in A$; it is here that we use the assumption that $A$ is
closed. Since $[a_m, b_m]$ is not nice, it must contain points of $A$: otherwise
$A \cap [a_m, b_m] = \emptyset$ would be contained in any $\bigcup_{j=1}^{m} N(x_j)$. Let

$$x_m \in [a_m, b_m] \cap A.$$

Clearly $x_m \to x$, since $a_m \to x$ and $b_m \to x$. Since $A$ is closed, we get $x \in A$.
Now consider the neighborhood $N(x)$. This contains an interval
$(x - \varepsilon, x + \varepsilon)$. If we choose $m$ so large that $b_m - a_m < \varepsilon$, then since $a_m \le x \le b_m$
this implies $[a_m, b_m] \subset N(x)$. But this means that $[a_m, b_m]$ is nice. This
contradiction proves the theorem for the case $n = 1$.
The same method of proof works in $\mathbb{R}^n$, where instead of intervals we use
squares, cubes, or their higher dimensional analogues. For example, when
$n = 2$ we choose $M$ so large that $A$ is contained in the square with corners
$(\pm M, \pm M)$. If this square were not nice, the same would be true of one of
the four equal squares into which it can be divided, and so on. Continuing we
get a sequence of squares $S_0 \supset S_1 \supset S_2 \supset \cdots$, each of side $\tfrac12$ the length of
the preceding, each intersecting $A$, and each not nice. The intersection
$\bigcap_{m=0}^{\infty} S_m$ contains a single point $x$, and $x$ is in $A$. Then $N(x)$ contains $S_m$ for
large $m$, a contradiction. Since as metric space $\mathbb{C} = \mathbb{R}^2$, this also proves the
result for $\mathbb{C}$. □
Suppose $(x_n)_{n=1}^\infty$ is a sequence in a set $S$. A subsequence of this sequence is
a sequence of the form $(y_k)_{k=1}^\infty$, where for each $k$ there is a positive integer $n_k$
so that

$$n_1 < n_2 < \cdots < n_k < n_{k+1} < \cdots, \qquad y_k = x_{n_k}.$$

Thus, $(y_k)_{k=1}^\infty$ is just a selection of some (possibly all) of the $x_n$'s, taken in
order. As an example, if $(x_n)_{n=1}^\infty \subset \mathbb{R}$ has $x_n = (-1)^n/n$, and if we take
$n_k = 2k$, then $(x_n)_{n=1}^\infty = (-1, \tfrac12, -\tfrac13, \tfrac14, -\tfrac15, \dots)$ and
$(y_k)_{k=1}^\infty = (\tfrac12, \tfrac14, \tfrac16, \dots)$.
As a second example, let $(x_n)_{n=1}^\infty$ be an enumeration of the rationals. Then for
any real number $x$, there is a subsequence of $(x_n)_{n=1}^\infty$ which converges to $x$.
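The first example of a subsequence can be written out directly, with the index map $k \mapsto n_k = 2k$ made explicit. The function names are illustrative.

```python
from fractions import Fraction

def x(n):
    # x_n = (-1)^n / n
    return Fraction((-1) ** n, n)

def y(k):
    # Subsequence y_k = x_{n_k} with n_k = 2k.
    return x(2 * k)

first_x = [x(n) for n in range(1, 6)]
first_y = [y(k) for k in range(1, 4)]
```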


Suppose $(S, d)$ is a metric space. A set $A \subset S$ is said to be sequentially
compact if, given any sequence $(x_n)_{n=1}^\infty \subset A$, some subsequence converges
to a point of $A$.

Examples

1. Any finite set is sequentially compact. (Prove this.)
2. The interval $(0, \infty) \subset \mathbb{R}$ is not sequentially compact; in fact let
$x_n = n$. No subsequence of $(x_n)_{n=1}^\infty$ converges.
3. The bounded interval $(0, 1] \subset \mathbb{R}$ is not sequentially compact; in fact
let $x_n = 1/n$. Any subsequence of $(x_n)_{n=1}^\infty$ converges to 0, which is not in
$(0, 1]$.

Proposition 6.3. Suppose $(S, d)$ is a metric space, $S \ne \emptyset$, and suppose
$A \subset S$ is sequentially compact. Then $A$ is closed and bounded.

Proof. Suppose $x$ is a limit point of $A$. Choose $x_n \in B_{1/n}(x) \cap A$,
$n = 1, 2, 3, \dots$. Any subsequence of $(x_n)_{n=1}^\infty$ converges to $x$, since $x_n \to x$. It
follows (since by assumption some subsequence converges to a point of $A$)
that $x \in A$. Thus $A$ is closed.
Suppose $A$ were not bounded. Take $x \in S$ and choose $x_1 \in A$ such that
$x_1 \notin B_1(x)$. Let $r_1 = d(x, x_1) + 1$. By the triangle inequality, $B_1(x_1) \subset B_{r_1}(x)$.
Since $A$ is not bounded, there is $x_2 \in A$ such that $x_2 \notin B_{r_1}(x)$. Thus also
$d(x_1, x_2) \ge 1$. Let $r_2 = \max\{d(x, x_1), d(x, x_2)\} + 1$ and choose $x_3 \in A$ such
that $x_3 \notin B_{r_2}(x)$. Then $d(x_1, x_3) \ge 1$ and $d(x_2, x_3) \ge 1$. Continuing in this
way we can find a sequence $(x_n)_{n=1}^\infty \subset A$ such that $d(x_m, x_n) \ge 1$ if $m \ne n$.
Then no subsequence of this sequence can converge, and $A$ is not sequentially
compact. □
Theorem 6.4. (Bolzano-Weierstrass Theorem). A subset $A$ of $\mathbb{R}^n$ or of $\mathbb{C}$
is sequentially compact if and only if it is closed and bounded.

Proof. We have shown that $A$ sequentially compact implies $A$ closed and
bounded. Suppose $A$ is closed and bounded, let $(x_n)_{n=1}^\infty$ be a sequence in $A$,
and suppose first that $A \subset \mathbb{R}$ (the case of dimension 1).
Take an interval $[a, b]$ containing $A$. Let $c = \tfrac12(a + b)$. One (or both) of the
subintervals $[a, c]$ and $[c, b]$ must contain $x_n$ for infinitely many integers $n$;
denote such a subinterval by $[a_1, b_1]$, and consider $[a_1, c_1]$, $[c_1, b_1]$ where
$c_1 = \tfrac12(a_1 + b_1)$. Proceeding in this way we can find intervals $[a_m, b_m]$ with
the properties $[a_0, b_0] = [a, b]$, $[a_m, b_m] \subset [a_{m-1}, b_{m-1}]$,
$b_m - a_m = 2^{-m}(b_0 - a_0)$, and $[a_m, b_m]$ contains $x_n$ for infinitely many values of $n$. Then
there is a point $x$ such that $a_m \to x$, $b_m \to x$. We choose integers $n_1, n_2, \dots$
so that $x_{n_1} \in [a_1, b_1]$, $n_2 > n_1$ and $x_{n_2} \in [a_2, b_2]$, $n_3 > n_2$ and $x_{n_3} \in [a_3, b_3]$, etc.
Then this subsequence converges to $x$. Since $A$ is closed, $x \in A$.
The generalization of this proof to higher dimensions now follows as in
the proof of Theorem 6.2. □
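The bisection step in this proof can be imitated on a finite sample of a sequence. The sketch below is a heuristic stand-in and not the proof itself: "contains $x_n$ for infinitely many $n$" cannot be tested on finite data, so the code assumes the half holding more of the remaining sample terms plays that role. The sample $x_n = \sin n$, the number of steps, and all names are illustrative assumptions.

```python
import math

def bisection_subsequence(xs, a, b, steps):
    """xs[i] holds the sample term x_{i+1}; every term must lie in [a, b]."""
    candidates = list(range(len(xs)))  # remaining indices, in increasing order
    chosen = []
    for _ in range(steps):
        c = (a + b) / 2
        left = [i for i in candidates if xs[i] <= c]
        right = [i for i in candidates if xs[i] > c]
        # Finite stand-in for "infinitely many terms": keep the fuller half.
        if len(left) >= len(right):
            candidates, b = left, c
        else:
            candidates, a = right, c
        if not candidates:
            break
        chosen.append(candidates[0])   # smallest remaining index in this half
        candidates = candidates[1:]    # later picks have strictly larger index
    return chosen, (a, b)

xs = [math.sin(n) for n in range(1, 2001)]
chosen, (lo, hi) = bisection_subsequence(xs, -1.0, 1.0, 6)
```

By construction the $j$-th selected term and all later ones lie in a common interval of length $2 \cdot 2^{-j}$, which mirrors why the subsequence in the proof is Cauchy.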
Both the terminology and the facts proved suggest a close relationship
between compactness and sequential compactness. This relationship is made
precise in the exercises below.


Exercises
1. Suppose $(x_n)_{n=1}^\infty$ is a sequence in a metric space $(S, d)$ which converges
to $x \in S$. Let $A = \{x\} \cup \{x_n\}_{n=1}^\infty$. Then $A$ is compact and sequentially compact.
2. Let $\mathbb{Q}$, the rationals, have the usual metric. Let $A = \{x \mid x \in \mathbb{Q},\ x^2 < 2\}$.
Then $A$ is bounded, and is closed as a subset of $\mathbb{Q}$, but is not compact.
3. Suppose $A$ is a compact subset of a metric space $(S, d)$. Then $A$ is
sequentially compact. (Hint: otherwise there is a sequence $(x_n)_{n=1}^\infty$ in $A$ with
no subsequence converging to a point of $A$. It follows that for each $x \in A$,
there is an $r(x) > 0$ such that the ball $N(x) = B_{r(x)}(x)$ contains $x_n$ for only
finitely many values of $n$. Since $A$ is compact, this would imply that
$\{1, 2, 3, \dots\}$ is finite, a contradiction.)
4. A metric space is said to be separable if there is a dense subset which is
countable. If $(S, d)$ is separable and $A \subset S$ is sequentially compact, then $A$ is
compact. (Hint: suppose for each $x \in A$ we are given a neighborhood $N(x)$.
Let $\{x_1, x_2, x_3, \dots\}$ be a dense subset of $S$. For each $x \in A$ we can choose an
integer $m$ and a rational $r_m$ such that $x \in B_{r_m}(x_m) \subset N(x)$. The collection of
balls $B_{r_m}(x_m)$ so obtained is (finite or) countable; enumerate them as $C_1$,
$C_2, \dots$. Since each $C_j$ is contained in some $N(x)$, it is sufficient to show that
for some $n$, $\bigcup_{j=1}^{n} C_j \supset A$. If this were not the case, we could take $y_n \in A$,
$y_n \notin \bigcup_{j=1}^{n} C_j$, $n = 1, 2, \dots$. Applying the assumption of sequential
compactness to this sequence and noting how the $C_j$ were obtained, we get a
contradiction.)

7. Vector spaces
A vector space over $\mathbb{R}$ is a set $X$ in which there are an operation of addition
and an operation of multiplication by real numbers which satisfy certain
conditions. These abstract from the well-known operations with directed line
segments in Euclidean 3-space.
Specifically, we assume that there is a function from $X \times X$ to $X$, called
addition, which assigns to the ordered pair $(x, y) \in X \times X$ an element of $X$
denoted $x + y$. We assume

V1. $(x + y) + z = x + (y + z)$, all $x, y, z \in X$.
V2. $x + y = y + x$, all $x, y \in X$.
V3. There exists $0 \in X$ such that $x + 0 = x$, all $x \in X$.
V4. For all $x \in X$, there exists $-x \in X$ such that $x + (-x) = 0$.

We assume also that there is a function from $\mathbb{R} \times X$ to $X$, called scalar
multiplication, assigning to the ordered pair $(a, x) \in \mathbb{R} \times X$ an element of $X$
denoted $ax$. We assume

V5. $(ab)x = a(bx)$, all $a, b \in \mathbb{R}$, $x \in X$.
V6. $a(x + y) = ax + ay$, all $a \in \mathbb{R}$, $x, y \in X$.
V7. $(a + b)x = ax + bx$, all $a, b \in \mathbb{R}$, $x \in X$.
V8. $1x = x$, all $x \in X$.


Summarizing: a vector space over $\mathbb{R}$, or a real vector space, is a set $X$ with addition satisfying V1-V4 and scalar multiplication satisfying V5-V8. The elements of $X$ are called vectors and the elements of $\mathbb{R}$, in this context, are often called scalars.
Similarly, a vector space over $\mathbb{C}$, or a complex vector space, is a set $X$ together with addition satisfying V1-V4 and scalar multiplication defined from $\mathbb{C} \times X$ to $X$ and satisfying V5-V8. Here the scalars are, of course, complex numbers.

Examples
1. $\mathbb{R}$ is a vector space over $\mathbb{R}$, with addition as usual and the usual multiplication as scalar multiplication.
2. The set with one element $0$ is a vector space over $\mathbb{R}$ or $\mathbb{C}$ with $0 + 0 = 0$, $a0 = 0$, all $a$.
3. $\mathbb{R}^n$ is a vector space over $\mathbb{R}$ if we take addition and scalar multiplication as
\[ (x_1, x_2, \ldots, x_n) + (y_1, y_2, \ldots, y_n) = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n), \]
\[ a(x_1, x_2, \ldots, x_n) = (ax_1, ax_2, \ldots, ax_n). \]
4. $\mathbb{C}$ is a vector space over $\mathbb{R}$ or $\mathbb{C}$ with the usual addition and scalar multiplication.
5. Let $S$ be any set, and let $F(S; \mathbb{R})$ be the set whose elements are the functions from $S$ to $\mathbb{R}$. Define addition and scalar multiplication in $F(S; \mathbb{R})$ by
\[ (f + g)(s) = f(s) + g(s), \qquad s \in S, \]
\[ (af)(s) = af(s), \qquad s \in S,\ a \in \mathbb{R}. \]
Then $F(S; \mathbb{R})$ is a vector space over $\mathbb{R}$.
6. The set $F(S; \mathbb{C})$ of functions from $S$ to $\mathbb{C}$ can be made a complex vector space by defining addition and scalar multiplication as in 5.
7. Let $X$ be the set of all functions $f: \mathbb{R} \to \mathbb{R}$ which are polynomials, i.e., for some $a_0, a_1, \ldots, a_n \in \mathbb{R}$,
\[ f(x) = a_0 + a_1 x + \cdots + a_n x^n, \qquad \text{all } x \in \mathbb{R}. \]
With addition and scalar multiplication defined as in 5, this is a real vector space.
8. The set of polynomials with complex coefficients can be considered as a complex vector space.
Let us note two elementary facts valid in every vector space: the element $0$ of assumption V3 is unique, and for any $x \in X$, $0x = 0$. First, suppose $0' \in X$ has the property that $x + 0' = x$ for each $x \in X$. Then in particular $0' = 0' + 0 = 0 + 0' = 0$ (using V2 and V3). Next, if $x \in X$, then
\[ 0x = 0x + 0 = 0x + [0x + (-0x)] = [0x + 0x] + (-0x) = (0 + 0)x + (-0x) = 0x + (-0x) = 0. \]

Note also that the element $-x$ in V4 is unique. In fact if $x + y = 0$, then
\[ y = y + 0 = y + [x + (-x)] = [y + x] + (-x) = [x + y] + (-x) = 0 + (-x) = (-x) + 0 = -x. \]
This implies that $(-1)x = -x$, since
\[ x + (-1)x = [1 + (-1)]x = 0x = 0. \]

A non-empty subset $Y$ of a (complex) vector space $X$ is called a subspace of $X$ if it is closed with respect to the operations of addition and scalar multiplication. This means that if $x, y \in Y$ and $a \in \mathbb{C}$, then $x + y$ and $ay$ are in $Y$. If so, then $Y$ itself is a vector space over $\mathbb{C}$, with the operations inherited from $X$.
Examples

1. Any vector space is a subspace of itself.
2. The set $\{0\}$ is a subspace.
3. $\{x \mid x_n = 0\}$ is a subspace of $\mathbb{R}^n$.
4. In the previous set of examples, the space $X$ in example 7 is a subspace of $F(\mathbb{R}; \mathbb{R})$.
5. Let $X$ again be the space of polynomials with real coefficients. For each $n = 0, 1, 2, \ldots$, let $X_n$ be the subset of $X$ consisting of polynomials of degree $\le n$. Then each $X_n$ is a subspace of $X$. For $m \le n$, $X_m$ is a subspace of $X_n$.
Suppose $x_1, x_2, \ldots, x_n$ are elements of the vector space $X$. A linear combination of these vectors is any vector $x$ of the form
\[ x = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n, \]
where $a_1, a_2, \ldots, a_n$ are scalars.


Proposition 7.1. Let $S$ be a nonempty subset of the vector space $X$, and let $Y$ be the set of all linear combinations of elements of $S$. Then $Y$ is a subspace of $X$ and $Y \supset S$.
If $Z$ is any other subspace of $X$ which contains the set $S$, then $Z \supset Y$.
Proof. If $x, y \in Y$, then by definition they can be expressed as finite sums $x = \sum a_j x_j$, $y = \sum b_j y_j$, where each $x_j \in S$ and each $y_j \in S$. Then $ax = \sum (a a_j) x_j$ is a linear combination of the $x_j$'s, and $x + y$ is a linear combination of the $x_j$'s and the $y_j$'s. If $x \in S$, then $x = 1x \in Y$. Thus $Y$ is a subspace containing $S$.
Suppose $Z$ is another subspace of $X$ containing $S$. Suppose $x \in Y$. Then for some $x_1, x_2, \ldots, x_n \in S$, $x = \sum a_j x_j$. Since $Z$ is a subspace and $x_1, x_2, \ldots, x_n \in Z$, we have $a_1 x_1, a_2 x_2, \ldots, a_n x_n \in Z$. Moreover, $a_1 x_1 + a_2 x_2 \in Z$, so $(a_1 x_1 + a_2 x_2) + a_3 x_3 \in Z$. Continuing we eventually find that $x \in Z$. □

We can paraphrase Proposition 7.1 by saying that any subset $S$ of a vector space $X$ is contained in a unique smallest subspace $Y$. This subspace is called the span of $S$. We write $Y = \operatorname{span}(S)$. The set $S$ is said to span $Y$. Note that if $S$ is empty, the span is the subspace $\{0\}$.

Examples
Let $X$ be the space of all polynomials with real coefficients. Let $f_m$ be the polynomial defined by $f_m(x) = x^m$. Then $\operatorname{span}\{f_0, f_1, \ldots, f_n\}$ is the subspace $X_n$ of polynomials of degree $\le n$.

A linear combination $a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$ of the vectors $x_1, x_2, \ldots, x_n$ is said to be nontrivial if at least one of the coefficients $a_1, a_2, \ldots, a_n$ is not zero. The vectors $x_1, x_2, \ldots, x_n$ are said to be linearly dependent if some nontrivial linear combination of them is the zero vector. Otherwise they are said to be linearly independent. More generally, an arbitrary (possibly infinite) subset $S$ is said to be linearly dependent if some nontrivial linear combination of finitely many distinct elements of $S$ is the zero vector; otherwise $S$ is said to be linearly independent. (Note that with this definition, the empty set is linearly independent.)
Lemma 7.2. Vectors $x_1, x_2, \ldots, x_n$ in $X$, $n \ge 2$, are linearly dependent if and only if some $x_j$ is a linear combination of the others.

Proof. If $x_1, x_2, \ldots, x_n$ are linearly dependent, there are scalars $a_1, a_2, \ldots, a_n$, not all $0$, such that $\sum a_j x_j = 0$. Renumbering, we may suppose $a_1 \ne 0$. Then $x_1 = \sum_{j=2}^{n} (-a_1^{-1} a_j) x_j$.
Conversely, suppose $x_1$, say, is a linear combination $\sum_{j=2}^{n} b_j x_j$. Letting $a_1 = 1$, and $a_j = -b_j$ for $j \ge 2$, we have $\sum a_j x_j = 0$. □
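For vectors in $\mathbb{R}^n$ with rational entries, linear dependence can be tested mechanically. The following is a minimal sketch, not part of the text: it assumes the vectors are given as tuples of rationals and uses Gaussian elimination to count pivots. The function names `rank` and `linearly_dependent` are illustrative choices.

```python
from fractions import Fraction

def rank(vectors):
    """Number of pivots found by Gaussian elimination over the rationals."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # pivots found so far
    for col in range(ncols):
        # look for a row at or below position r with a nonzero entry in this column
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def linearly_dependent(vectors):
    """True if some nontrivial linear combination of the vectors is zero."""
    return rank(vectors) < len(vectors)

x1, x2 = (1, 0, 2), (0, 1, -1)
x3 = tuple(3 * a - 2 * b for a, b in zip(x1, x2))  # x3 = 3*x1 - 2*x2
print(linearly_dependent([x1, x2]))       # x1, x2 are independent
print(linearly_dependent([x1, x2, x3]))   # x3 is a combination of the others
```

Here $x_3$ is built as a linear combination of $x_1$ and $x_2$, so by Lemma 7.2 the three vectors are dependent, while $x_1, x_2$ alone are not.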
The vector space $X$ is said to be finite dimensional if there is a finite subset which spans $X$. Otherwise, $X$ is said to be infinite dimensional. A basis of a (finite-dimensional) space $X$ is an ordered finite subset $(x_1, x_2, \ldots, x_n)$ which is linearly independent and spans $X$.

Examples
1. $\mathbb{R}^n$ has basis vectors $(e_1, e_2, \ldots, e_n)$, where $e_1 = (1, 0, 0, \ldots, 0)$, $e_2 = (0, 1, 0, \ldots, 0)$, ..., $e_n = (0, 0, \ldots, 0, 1)$. This is called the standard basis in $\mathbb{R}^n$.
2. The set consisting of the single vector $1$ is a basis for $\mathbb{C}$ as a complex vector space, but not as a real vector space. The set $(1, i)$ is a basis for $\mathbb{C}$ as a real vector space, but is linearly dependent if $\mathbb{C}$ is considered as a complex vector space.
Theorem 7.3. A finite-dimensional vector space $X$ has a basis. Any two bases of $X$ have the same number of elements.

Proof. Let $\{x_1, x_2, \ldots, x_n\}$ span $X$. If these vectors are linearly independent then we may order this set in any way and have a basis. Otherwise if $n \ge 2$ we may use Lemma 7.2 and renumber, so that $x_n$ is a linear combination $\sum_{j=1}^{n-1} a_j x_j$. Since $\operatorname{span}\{x_1, x_2, \ldots, x_n\} = X$, any $x \in X$ is a linear combination
\[ x = \sum_{j=1}^{n} b_j x_j = \sum_{j=1}^{n-1} b_j x_j + b_n \Bigl( \sum_{j=1}^{n-1} a_j x_j \Bigr) = \sum_{j=1}^{n-1} (b_j + b_n a_j) x_j. \]
Thus $\operatorname{span}\{x_1, x_2, \ldots, x_{n-1}\} = X$. If these vectors are not linearly independent, we may renumber and argue as before to show that
\[ \operatorname{span}\{x_1, x_2, \ldots, x_{n-2}\} = X. \]
Eventually we reach a linearly independent subset which spans $X$, and thus get a basis, or else we reach a linearly dependent set $\{x_1\}$ spanning $X$ and consisting of one element. This implies $x_1 = 0$, so $X = \{0\}$, and the empty set is the basis.
Now suppose $(x_1, x_2, \ldots, x_n)$ and $(y_1, y_2, \ldots, y_m)$ are bases of $X$, and suppose $m \le n$. If $n = 0$, then $m = 0$. Otherwise $x_1 \ne 0$. The $y_j$'s span $X$, so $x_1 = \sum a_j y_j$. Renumbering, we may assume $a_1 \ne 0$. Then
\[ y_1 = a_1^{-1} x_1 - \sum_{j=2}^{m} a_1^{-1} a_j y_j. \]
Thus $y_1$ is a linear combination of $x_1, y_2, \ldots, y_m$. It follows easily that
\[ \operatorname{span}\{x_1, y_2, \ldots, y_m\} \supset \operatorname{span}\{y_1, y_2, \ldots, y_m\} = X. \]
If $m = 1$ this shows that $\operatorname{span}\{x_1\} = X$, and the linear independence of the $x_j$'s then implies $n = 1$. Otherwise $x_2 = b x_1 + \sum_{j=2}^{m} b_j y_j$. The independence of $x_1$ and $x_2$ implies some $b_j \ne 0$. Renumbering, we assume $b_2 \ne 0$. Then
\[ y_2 = b_2^{-1} \Bigl( x_2 - b x_1 - \sum_{j=3}^{m} b_j y_j \Bigr). \]
This implies that
\[ \operatorname{span}\{x_1, x_2, y_3, \ldots, y_m\} \supset \operatorname{span}\{x_1, y_2, \ldots, y_m\} = X. \]
Continuing in this way, we see that after the $y_j$'s are suitably renumbered, each set $\{x_1, x_2, \ldots, x_k, y_{k+1}, \ldots, y_m\}$ spans $X$, $k \le m$. In particular, taking $k = m$ we have that $\{x_1, x_2, \ldots, x_m\}$ spans $X$. Since the $x_j$'s were assumed linearly independent, we must have $n \le m$. Thus $n = m$. □

If $X$ has a basis with $n$ elements, $n = 0, 1, 2, \ldots$, then any basis has $n$ elements. The number $n$ is called the dimension of $X$. We write $n = \dim X$.
The argument used to prove Theorem 7.3 proves somewhat more.

Theorem 7.4. Suppose $X$ is a finite-dimensional vector space with dimension $n$. Any subset of $X$ which spans $X$ has at least $n$ elements. Any subset of $X$ which is linearly independent has at most $n$ elements. An ordered subset of $n$ elements which either spans $X$ or is linearly independent is a basis.


Suppose $(x_1, x_2, \ldots, x_n)$ is a basis of $X$. Then any $x \in X$ can be written as a linear combination $x = \sum a_j x_j$. The scalars $a_1, a_2, \ldots, a_n$ are unique; in fact if $x = \sum b_j x_j$, then
\[ 0 = x - x = \sum (a_j - b_j) x_j. \]
Since the $x_j$'s are linearly independent, each $a_j - b_j = 0$, i.e., $a_j = b_j$. Thus the equation $x = \sum a_j x_j$ associates to each $x \in X$ a unique ordered $n$-tuple $(a_1, a_2, \ldots, a_n)$ of scalars, called the coordinates of $x$ with respect to the basis $(x_1, \ldots, x_n)$. Note that if $x$ and $y$ correspond respectively to $(a_1, a_2, \ldots, a_n)$ and $(b_1, b_2, \ldots, b_n)$, then $ax$ corresponds to $(aa_1, aa_2, \ldots, aa_n)$ and $x + y$ corresponds to $(a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n)$. In other words, the basis $(x_1, x_2, \ldots, x_n)$ gives rise to a function from $X$ onto $\mathbb{R}^n$ or $\mathbb{C}^n$ which preserves the vector operations.
Suppose $X$ and $Y$ are vector spaces, either both real or both complex. A function $T: X \to Y$ is said to be linear if for all vectors $x, y \in X$ and all scalars $a$,
\[ T(ax) = aT(x), \qquad T(x + y) = T(x) + T(y). \]
A linear function is often called a linear operator or a linear transformation. A linear function $T: X \to \mathbb{R}$ (for $X$ a real vector space) or $T: X \to \mathbb{C}$ (for $X$ a complex vector space) is called a linear functional.

Examples
1. Suppose $X$ is a real vector space and $(x_1, x_2, \ldots, x_n)$ a basis. Let $T(\sum a_j x_j) = (a_1, a_2, \ldots, a_n)$. Then $T: X \to \mathbb{R}^n$ is a linear transformation.
2. Let $T(z) = z^*$, $z \in \mathbb{C}$. Then $T$ is a linear transformation of $\mathbb{C}$ into itself if $\mathbb{C}$ is considered as a real vector space, but is not linear if $\mathbb{C}$ is considered as a complex vector space.
3. Let $f_j: \mathbb{R}^n \to \mathbb{R}$ be defined by $f_j(x_1, x_2, \ldots, x_n) = x_j$. Then $f_j$ is a linear functional.
4. Let $X$ be the space of polynomials with real coefficients. The two functions $S$, $T$ defined below are linear transformations from $X$ to itself. If $f(x) = \sum_{j=0}^{n} a_j x^j$, then
\[ S(f)(x) = \sum_{j=0}^{n} (j + 1)^{-1} a_j x^{j+1}, \]
\[ T(f)(x) = \sum_{j=1}^{n} j a_j x^{j-1}. \]
Note that $T(S(f)) = f$, while $S(T(f)) = f$ if and only if $a_0 = 0$.
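The remark about $T(S(f))$ and $S(T(f))$ can be checked on coefficient lists. The sketch below is an illustration under stated assumptions, not part of the text: a polynomial $\sum a_j x^j$ is represented by the hypothetical list `[a_0, a_1, ..., a_n]`, exact rational arithmetic is used, and lists are compared as written, without trimming trailing zeros.

```python
from fractions import Fraction

def S(coeffs):
    """a_j x^j  ->  a_j/(j+1) x^(j+1): the new constant term is 0."""
    return [Fraction(0)] + [Fraction(a) / (j + 1) for j, a in enumerate(coeffs)]

def T(coeffs):
    """a_j x^j  ->  j a_j x^(j-1): the constant term a_0 is dropped."""
    shifted = [j * Fraction(a) for j, a in enumerate(coeffs)][1:]
    return shifted or [Fraction(0)]

f = [1, 2, 3]   # f(x) = 1 + 2x + 3x^2, with a_0 = 1
g = [0, 2, 3]   # g(x) = 2x + 3x^2,     with a_0 = 0

print(T(S(f)) == f)   # T undoes S
print(S(T(f)) == f)   # fails because a_0 != 0: the constant term is lost
print(S(T(g)) == g)   # holds because a_0 == 0
```

In this representation $S$ acts as termwise integration and $T$ as termwise differentiation, which is why the constant term is exactly what $S \circ T$ cannot recover.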
Exercises
1. If the linearly independent finite set $\{x_1, x_2, \ldots, x_n\}$ does not span $X$, then there is a vector $x_{n+1} \in X$ such that $\{x_1, x_2, \ldots, x_n, x_{n+1}\}$ is linearly independent.
2. If $X$ is finite-dimensional and $x_1, x_2, \ldots, x_n$ are linearly independent, then there is a basis of $X$ containing the vectors $x_1, x_2, \ldots, x_n$.
3. If $X$ is a finite-dimensional vector space and $Y \subset X$ is a subspace, then $Y$ is finite-dimensional. Moreover, $\dim Y \le \dim X$, and $\dim Y < \dim X$ unless $Y = X$.
4. If $Y$ and $Z$ are subspaces of the finite-dimensional vector space $X$ and $Y \cap Z = \{0\}$, then $\dim Y + \dim Z \le \dim X$.
5. Suppose $Y$ and $Z$ are subspaces of the finite-dimensional vector space $X$, and suppose $\operatorname{span}(Y \cup Z) = X$. Then $\dim Y + \dim Z \ge \dim X$.
6. If $Y$ is a subspace of the finite-dimensional vector space $X$, then there is a subspace $Z$ with the properties $Y \cap Z = \{0\}$; $\dim Y + \dim Z = \dim X$; any vector $x \in X$ can be expressed uniquely in the form $x = y + z$, where $y \in Y$, $z \in Z$. (Such subspaces are said to be complementary.)
7. Prove Theorem 7.4.
8. The polynomials $f_m(x) = x^m$, $0 \le m \le n$, are a basis for the vector space of polynomials of degree $\le n$.
9. The vector space of all polynomials is infinite dimensional.
10. If $S$ is a non-empty set, the vector space $F(S; \mathbb{R})$ of functions from $S$ to $\mathbb{R}$ is finite dimensional if and only if $S$ is finite.
11. If $X$ and $Y$ are vector spaces and $T: X \to Y$ is a linear transformation, then the sets
\[ N(T) = \{x \mid x \in X,\ T(x) = 0\}, \]
\[ R(T) = \{T(x) \mid x \in X\} \]
are subspaces of $X$ and $Y$, respectively. (They are called the null space or kernel of $T$, and the range of $T$, respectively.) $T$ is 1-1 if and only if $N(T) = \{0\}$.
12. If $X$ is finite dimensional, the subspaces $N(T)$ and $R(T)$ in problem 11 satisfy $\dim N(T) + \dim R(T) = \dim X$. In particular, if $\dim Y = \dim X$, then $T$ is 1-1 if and only if it is onto. (Hint: choose a basis for $N(T)$ and use problem 2 to extend to a basis for $X$. Then the images under $T$ of the basis elements not in $N(T)$ are a basis for $R(T)$.)

Chapter 2

Continuous Functions
1. Continuity, uniform continuity, and compactness

Suppose $(S, d)$ and $(S', d')$ are metric spaces. A function $f: S \to S'$ is said to be continuous at the point $x \in S$ if for each $\varepsilon > 0$ there is a $\delta > 0$ such that
\[ d'(f(x), f(y)) < \varepsilon \quad \text{if } d(x, y) < \delta. \]
In particular, if $S$ and $S'$ are subsets of $\mathbb{R}$ or of $\mathbb{C}$ (with the usual metrics) then the condition is
\[ |f(y) - f(x)| < \varepsilon \quad \text{if } |y - x| < \delta. \]
(This definition is equivalent to the following one, given in terms of convergence of sequences: $f$ is continuous at $x$ if $f(x_n) \to f(x)$ whenever $(x_n)_{n=1}^{\infty}$ is a sequence in $S$ which converges to $x$. The equivalence is left as an exercise.)
Recall that we can add, multiply, and take scalar multiples of functions with values in $\mathbb{C}$ (or $\mathbb{R}$): if $f, g: S \to \mathbb{C}$ and $a \in \mathbb{C}$, $x \in S$, then
\[ (f + g)(x) = f(x) + g(x), \]
\[ (af)(x) = af(x), \]
\[ (fg)(x) = f(x)g(x), \]
\[ (f/g)(x) = f(x)/g(x) \quad \text{if } g(x) \ne 0. \]

Proposition 1.1. Suppose $(S, d)$ is any metric space, and suppose $f, g: S \to \mathbb{C}$ are functions which are continuous at $x$. Then $f + g$, $af$, and $fg$ are continuous at $x$. If $g(x) \ne 0$, then $f/g$ is defined in a ball $B_r(x)$ and is continuous at $x$.

Proof. Continuity of $f + g$ and $af$ at $x$ follows from the definition of continuity and the inequalities
\[ |(f + g)(y) - (f + g)(x)| = |f(y) - f(x) + g(y) - g(x)| \le |f(y) - f(x)| + |g(y) - g(x)|, \]
\[ |(af)(y) - (af)(x)| = |a|\,|f(y) - f(x)|. \]
To show continuity of $fg$ at $x$ we choose $\delta_1 > 0$ so small that if $d(y, x) < \delta_1$ then $|f(y) - f(x)| < 1$. Let $M$ be the larger of $|g(x)|$ and $|f(x)| + 1$. Given $\varepsilon > 0$, choose $\delta > 0$, $\delta \le \delta_1$, so small that
\[ |f(y) - f(x)| < \varepsilon/2M, \qquad |g(y) - g(x)| < \varepsilon/2M \]
if $d(y, x) < \delta$. Then $d(y, x) < \delta$ implies
\[ |(fg)(y) - (fg)(x)| = |f(y)g(y) - f(x)g(x)| \]
\[ = |f(y)g(y) - f(y)g(x) + f(y)g(x) - f(x)g(x)| \]
\[ \le |f(y)|\,|g(y) - g(x)| + |f(y) - f(x)|\,|g(x)| \]
\[ \le |f(y)| \cdot \varepsilon/2M + M \cdot \varepsilon/2M \]
\[ = |f(y)| \cdot \varepsilon/2M + \varepsilon/2. \]
But also
\[ |f(y)| = |f(y) - f(x) + f(x)| \le |f(y) - f(x)| + |f(x)| < 1 + |f(x)| \le M, \]
so $|(fg)(y) - (fg)(x)| < \varepsilon$.
Finally, suppose $g(x) \ne 0$. Choose $r > 0$ so that $|g(y) - g(x)| < \tfrac{1}{2}|g(x)|$ if $d(y, x) < r$. Then if $d(y, x) < r$ we have
\[ |g(x)| = |g(y) + g(x) - g(y)| \le |g(y)| + \tfrac{1}{2}|g(x)|, \]
so $|g(y)| \ge \tfrac{1}{2}|g(x)| > 0$. Thus $1/g$ is defined on $B_r(x)$. Since the product of functions continuous at $x$ is continuous at $x$, we only need show that $1/g$ is continuous at $x$. But if $y \in B_r(x)$, then
\[ |1/g(y) - 1/g(x)| = |g(y) - g(x)| / |g(y)|\,|g(x)| \le K|g(y) - g(x)|, \]
where $K = 2/|g(x)|^2$. Since $g$ is continuous at $x$, it follows that $1/g$ is. □

A function $f: S \to S'$ is said to be continuous if it is continuous at each point of $S$.
The following is an immediate consequence of Proposition 1.1.

Corollary 1.2. Suppose $f, g: S \to \mathbb{C}$ are continuous. Then $f + g$, $af$, and $fg$ are also continuous. If $g(x) \ne 0$, all $x$, then $f/g$ is continuous.

A function $f: S \to S'$ is said to be uniformly continuous if for each $\varepsilon > 0$ there is a $\delta > 0$ such that
\[ d'(f(x), f(y)) < \varepsilon \quad \text{if } d(x, y) < \delta. \]
In particular, if $S, S' \subset \mathbb{C}$, then this condition reads
\[ |f(y) - f(x)| < \varepsilon \quad \text{if } |y - x| < \delta. \]
The distinction between continuity and uniform continuity is important. If $f$ is continuous, then for each $x$ and each $\varepsilon > 0$ there is a $\delta > 0$ such that the above condition holds; however, $\delta$ may depend on $x$. As an example, let $S = S' = \mathbb{R}$, $f(x) = x^2$. Then $|f(y) - f(x)| = |y^2 - x^2| = |y + x|\,|y - x|$. If $|x|$ is very large, then $|y - x|$ must be very small for $|f(y) - f(x)|$ to be less than $1$. Thus this function is not uniformly continuous. (However it is clear that any uniformly continuous function is continuous.)
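The dependence of $\delta$ on $x$ in this example can be made explicit. For $f(x) = x^2$ and $x \ge 0$, the worst case in $|y - x| < \delta$ is $y = x + \delta$, where $y^2 - x^2 = 2x\delta + \delta^2$; setting this equal to $\varepsilon$ gives $\delta = \sqrt{x^2 + \varepsilon} - x$. The sketch below (the function name `largest_delta` is an illustrative choice) tabulates this value for $\varepsilon = 1$.

```python
import math

def largest_delta(x, eps=1.0):
    """Largest delta such that |y - x| < delta implies |y*y - x*x| < eps."""
    x = abs(x)
    # sqrt(x^2 + eps) - x, rewritten to avoid cancellation when x is large
    return eps / (math.sqrt(x * x + eps) + x)

for x in (0, 1, 10, 100, 1000):
    print(x, largest_delta(x))
```

The printed values decrease toward $0$ as $x$ grows, so no single $\delta > 0$ works for every $x$, which is exactly the failure of uniform continuity described above.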
In one important case, continuity does imply uniform continuity.


Theorem 1.3. Suppose $(S, d)$ and $(S', d')$ are metric spaces and suppose $f: S \to S'$ is continuous. If $S$ is compact, then $f$ is uniformly continuous.

Proof. Given $\varepsilon > 0$, we know that for each $x \in S$ there is a number $\delta(x) > 0$ such that
\[ d'(f(x), f(y)) < \tfrac{1}{2}\varepsilon \quad \text{if } d(x, y) < 2\delta(x). \]
Let $N(x) = B_{\delta(x)}(x)$. By the definition of compactness, there are points $x_1, x_2, \ldots, x_n \in S$ such that $S \subset \bigcup N(x_j)$. Let $\delta = \min\{\delta(x_1), \delta(x_2), \ldots, \delta(x_n)\}$, and suppose $d(x, y) < \delta$. There is some $x_i$ such that $x \in N(x_i)$. Then
\[ d(x_i, x) < \delta(x_i) < 2\delta(x_i), \]
\[ d(x_i, y) \le d(x_i, x) + d(x, y) < \delta(x_i) + \delta \le 2\delta(x_i), \]
so
\[ d'(f(x), f(y)) \le d'(f(x), f(x_i)) + d'(f(x_i), f(y)) < \tfrac{1}{2}\varepsilon + \tfrac{1}{2}\varepsilon = \varepsilon. \]
□

There are other pleasant properties of continuous functions on compact sets. A function $f: S \to \mathbb{C}$ is said to be bounded if $f(S)$ is a bounded set in $\mathbb{C}$, i.e., there is an $M \ge 0$ such that
\[ |f(x)| \le M, \quad \text{all } x \in S. \]

Theorem 1.4. Suppose $(S, d)$ is a compact metric space and suppose $f: S \to \mathbb{C}$ is continuous. Then $f$ is bounded and there is a point $x_0 \in S$ such that
\[ |f(x_0)| = \sup\{|f(x)| \mid x \in S\}. \]
If $f(S) \subset \mathbb{R}$, then there are $x_+, x_- \in S$ such that
\[ f(x_+) = \sup\{f(x) \mid x \in S\}, \]
\[ f(x_-) = \inf\{f(x) \mid x \in S\}. \]

Proof. For each $x \in S$, there is a number $\delta(x) > 0$ such that $|f(y) - f(x)| < 1$ if $y \in B_{\delta(x)}(x) = N(x)$. Choose $x_1, \ldots, x_n$ such that $S \subset \bigcup N(x_j)$. If $x \in S$ then $x \in N(x_i)$ for some $i$, and
\[ |f(x)| \le |f(x_i)| + |f(x) - f(x_i)| < |f(x_i)| + 1. \]
Thus we can take $M = 1 + \max\{|f(x_1)|, \ldots, |f(x_n)|\}$ and we have shown that $f$ is bounded.
Let $a = \sup\{|f(x)| \mid x \in S\}$, and suppose $|f(x)| < a$, all $x \in S$. Then for each $x \in S$ there are numbers $a(x) < a$ and $\varepsilon(x) > 0$ such that $|f(y)| \le a(x)$ if $y \in B_{\varepsilon(x)}(x) = M(x)$. Choose $y_1, \ldots, y_m$ such that $S \subset \bigcup M(y_i)$, and let $a_1 = \max\{a(y_1), \ldots, a(y_m)\} < a$. If $x \in S$ then $x \in M(y_i)$ for some $i$, so
\[ |f(x)| \le a(y_i) \le a_1 < a. \]
This contradicts the assumption that $a = \sup\{|f(x)| \mid x \in S\}$. Thus there must be a point $x_0$ with $|f(x_0)| = a$.
The proof of the existence of $x_+$ and $x_-$ when $f$ is real-valued is similar, and we omit it. □


Both theorems above apply in particular to continuous functions defined on a closed bounded interval $[a, b] \subset \mathbb{R}$. We need one further fact about such functions when real-valued: they skip no values.

Theorem 1.5. (Intermediate Value Theorem). Suppose $f: [a, b] \to \mathbb{R}$ is continuous. Suppose either
\[ f(a) \le c \le f(b) \quad \text{or} \quad f(b) \le c \le f(a). \]
Then there is a point $x_0 \in [a, b]$ such that $f(x_0) = c$.

Proof. We consider only the case $f(a) \le c \le f(b)$. Consider the two intervals $[a, \tfrac{1}{2}(a + b)]$, $[\tfrac{1}{2}(a + b), b]$. For at least one of these, $c$ lies between the values of $f$ at the endpoints; denote this subinterval by $[a_1, b_1]$. Thus $f(a_1) \le c \le f(b_1)$. Continuing in this way we get a sequence of intervals $[a_n, b_n]$ with $[a_{n+1}, b_{n+1}] \subset [a_n, b_n]$, $b_{n+1} - a_{n+1} = \tfrac{1}{2}(b_n - a_n)$, and $f(a_n) \le c \le f(b_n)$. Then there is $x_0 \in [a, b]$ such that $a_n \to x_0$, $b_n \to x_0$. Thus
\[ f(x_0) = \lim f(a_n) \le c, \]
\[ f(x_0) = \lim f(b_n) \ge c. \]
Hence $f(x_0) = c$. □
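The halving argument in this proof is also a practical algorithm. The sketch below is an illustration, not part of the text: it assumes $f$ is continuous with $f(a) \le c \le f(b)$, works in floating point, and stops at a tolerance `tol`, which is an added parameter.

```python
def bisect(f, a, b, c, tol=1e-12):
    """Approximate x0 in [a, b] with f(x0) = c, assuming f(a) <= c <= f(b)."""
    if not f(a) <= c <= f(b):
        raise ValueError("need f(a) <= c <= f(b)")
    while b - a > tol:
        mid = 0.5 * (a + b)
        # keep the half on which c still lies between the endpoint values
        if f(mid) >= c:
            b = mid   # now f(a) <= c <= f(mid)
        else:
            a = mid   # now f(mid) < c <= f(b)
    return 0.5 * (a + b)

# approximate the positive square root of 2 as the solution of x*x = 2 on [0, 2]
root2 = bisect(lambda x: x * x, 0.0, 2.0, 2.0)
print(root2)
```

Each pass through the loop plays the role of one step $[a_n, b_n] \to [a_{n+1}, b_{n+1}]$ in the proof, and the returned value approximates the common limit $x_0$.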

Exercises
1. Prove the equivalence of the two definitions of continuity at a point.
2. Use Theorem 1.5 to give another proof of the existence of $\sqrt{2}$. Prove that any positive real number has a positive $n$th root, $n = 1, 2, \ldots$.
3. Suppose $f: S \to S'$, where $(S, d)$ and $(S', d')$ are metric spaces. Prove that the following are equivalent:
(a) $f$ is continuous;
(b) for each open set $A' \subset S'$, $f^{-1}(A')$ is open;
(c) for each closed set $A' \subset S'$, $f^{-1}(A')$ is closed.
4. Find continuous functions $f_j: (0, 1) \to \mathbb{R}$, $j = 1, 2, 3$, such that
$f_1$ is not bounded,
$f_2$ is bounded but not uniformly continuous,
$f_3$ is bounded but there are no points $x_+, x_- \in (0, 1)$ such that $f_3(x_+) = \sup\{f_3(x) \mid x \in (0, 1)\}$, $f_3(x_-) = \inf\{f_3(x) \mid x \in (0, 1)\}$.
5. Suppose $f: S \to S'$ is continuous and $S$ is compact. Prove that $f(S)$ is compact.
6. Use Exercise 5 and Theorem 6.2 of Chapter 1 to give another proof of Theorem 1.4.
7. Use Exercise 3 of Chapter 1, §6 to give a third proof of Theorem 1.4. (Hint: take $(x_n)_{n=1}^{\infty} \subset S$ such that $\lim |f(x_n)| = \sup\{|f(x)| \mid x \in S\}$, etc.)
8. Suppose $(S, d)$ is a metric space, $x \in S$, and $r > 0$. Show that there is a continuous function $f: S \to \mathbb{R}$ with the properties: $0 \le f(y) \le 1$, all $y \in S$, $f(y) = 0$ if $y \notin B_r(x)$, $f(x) = 1$. (Hint: take $f(y) = \max\{1 - r^{-1} d(y, x), 0\}$.)
9. Suppose $(S, d)$ is a metric space and suppose $S$ is not compact. Show that there is a continuous $f: S \to \mathbb{R}$ which is not bounded. (Hint: use Exercise 8.)


2. Integration of complex-valued functions


A partition of a closed interval $[a, b] \subset \mathbb{R}$ is a finite ordered set of points $P = (x_0, x_1, \ldots, x_n)$ with
\[ a = x_0 < x_1 < \cdots < x_n = b. \]
The mesh of the partition $P$ is the maximum length of the subintervals $[x_{i-1}, x_i]$:
\[ |P| = \max\{x_i - x_{i-1} \mid i = 1, 2, \ldots, n\}. \]
If $f: [a, b] \to \mathbb{C}$ is a bounded function and $P = (x_0, x_1, \ldots, x_n)$ is a partition of $[a, b]$, then the Riemann sum of $f$ associated with the partition $P$ is the number
\[ S(f; P) = \sum_{i=1}^{n} f(x_i)(x_i - x_{i-1}). \]
The function $f$ is said to be integrable (in the sense of Riemann) if there is a number $z \in \mathbb{C}$ such that
\[ \lim_{|P| \to 0} S(f; P) = z. \]
More precisely, we mean that for any $\varepsilon > 0$ there is a $\delta > 0$ such that
\[ |S(f; P) - z| < \varepsilon \quad \text{if } |P| < \delta. \tag{2.1} \]
If this is the case, the number $z$ is called the integral of $f$ on $[a, b]$ and denoted by
\[ \int_a^b f \quad \text{or} \quad \int_a^b f(x)\,dx. \]
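As an illustration of the definition, the sums $S(f; P)$ can be computed directly for a complex-valued integrand. The sketch below is not part of the text; it assumes uniform partitions (the helper `uniform_partition` is an illustrative name) and uses $f(x) = e^{ix}$ on $[0, \pi]$, whose integral is $2i$.

```python
import cmath

def riemann_sum(f, points):
    """S(f; P): sum of f(x_i) * (x_i - x_{i-1}) over P = (x_0, ..., x_n)."""
    return sum(f(points[i]) * (points[i] - points[i - 1])
               for i in range(1, len(points)))

def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length, so |P| = (b - a)/n."""
    return [a + (b - a) * i / n for i in range(n + 1)]

f = lambda x: cmath.exp(1j * x)   # complex-valued integrand

for n in (10, 100, 1000):
    P = uniform_partition(0.0, cmath.pi, n)
    print(n, riemann_sum(f, P))   # approaches 2i as the mesh shrinks
```

The real and imaginary parts of the printed sums approach $0$ and $2$ separately as the mesh decreases.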

If $f: [a, b] \to \mathbb{C}$ is bounded, suppose $|f(x)| \le M$, all $x \in [a, b]$. Then for any partition $P$ of $[a, b]$,
\[ |S(f; P)| \le \sum |f(x_i)|(x_i - x_{i-1}) \le M \sum (x_i - x_{i-1}) = M(b - a). \]
Therefore, if $f$ is integrable,
\[ \Bigl| \int_a^b f \Bigr| \le M(b - a), \qquad M = \sup\{|f(x)| \mid x \in [a, b]\}. \tag{2.2} \]

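Recall that $f: [a, b] \to \mathbb{C}$ is a sum $f = g + ih$ where $g$ and $h$ are real-valued functions. The functions $g$ and $h$ are called the real and imaginary parts of $f$ and are defined by
\[ g(x) = \operatorname{Re}(f(x)), \qquad h(x) = \operatorname{Im}(f(x)), \qquad x \in [a, b]. \]
We denote $g$ by $\operatorname{Re} f$ and $h$ by $\operatorname{Im} f$.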

Proposition 2.1. A bounded function $f: [a, b] \to \mathbb{C}$ is integrable if and only if the real and imaginary parts of $f$ are integrable. If so,
\[ \int_a^b f = \int_a^b \operatorname{Re} f + i \int_a^b \operatorname{Im} f. \]

Proof. Recall that if $z = x + iy$, $x, y \in \mathbb{R}$, then
\[ \tfrac{1}{2}|x| + \tfrac{1}{2}|y| \le |z| \le |x| + |y|. \tag{2.3} \]
If $P$ is any partition of $[a, b]$, then
\[ S(f; P) = S(\operatorname{Re} f; P) + i S(\operatorname{Im} f; P), \]
and $S(\operatorname{Re} f; P)$, $S(\operatorname{Im} f; P)$ are real. Let $z = x + iy$, $x, y$ real, and apply (2.3) to $S(f; P) - z$. We get
\[ \tfrac{1}{2}|S(\operatorname{Re} f; P) - x| + \tfrac{1}{2}|S(\operatorname{Im} f; P) - y| \le |S(f; P) - z| \le |S(\operatorname{Re} f; P) - x| + |S(\operatorname{Im} f; P) - y|. \]
Thus $S(f; P) \to z$ as $|P| \to 0$ if and only if $S(\operatorname{Re} f; P) \to x$ and $S(\operatorname{Im} f; P) \to y$ as $|P| \to 0$. □

Proposition 2.2. Suppose $f: [a, b] \to \mathbb{C}$ and $g: [a, b] \to \mathbb{C}$ are bounded integrable functions, and suppose $c \in \mathbb{C}$. Then $f + g$ and $cf$ are integrable, and
\[ \int_a^b (f + g) = \int_a^b f + \int_a^b g, \qquad \int_a^b cf = c \int_a^b f. \]

Proof. For any partition $P$ of $[a, b]$,
\[ S(f + g; P) = S(f; P) + S(g; P), \qquad S(cf; P) = cS(f; P). \]
The conclusions follow easily from these identities and the definition. □

Neither of these propositions identifies any integrable functions. We shall


see shortly that continuous functions are integrable. The following criterion
is useful in that connection.

Proposition 2.3. A bounded function $f: [a, b] \to \mathbb{C}$ is integrable if and only if for each $\varepsilon > 0$ there is a $\delta > 0$ such that
\[ |S(f; P) - S(f; Q)| < \varepsilon \quad \text{if } |P|, |Q| < \delta. \tag{2.4} \]

Proof. Suppose $f$ is integrable, and let $z = \int_a^b f$. For any $\varepsilon > 0$ there is a $\delta > 0$ such that $S(f; P)$ is in the disc of radius $\tfrac{1}{2}\varepsilon$ around $z$ if $|P| < \delta$. Then $|P| < \delta$, $|Q| < \delta$ implies $S(f; P)$, $S(f; Q)$ are at distance $< \varepsilon$.
Conversely, suppose for each $\varepsilon > 0$ there is a $\delta > 0$ such that (2.4) holds. Take partitions $P_n$ with $|P_n| < 1/n$, $n = 1, 2, 3, \ldots$, and let $z_n = S(f; P_n)$. It follows from our assumption that $(z_n)_{n=1}^{\infty}$ is a Cauchy sequence. Let $z$ be its limit. If $n$ is large, $|P_n|$ is small and $S(f; P_n)$ is close to $z$, and if $|Q|$ is small, $S(f; Q)$ is close to $S(f; P_n)$. Thus $S(f; Q) \to z$ as $|Q| \to 0$. □

Theorem 2.4. If $f: [a, b] \to \mathbb{C}$ is continuous, it is integrable.

Proof. We know by §1 that $f$ is bounded and uniformly continuous. Given $\varepsilon > 0$, choose $\delta > 0$ so that $|f(x) - f(y)| < \varepsilon$ if $|x - y| < \delta$. Suppose $P$, $Q$ are partitions of $[a, b]$ with $|P| < \delta$, $|Q| < \delta$. Suppose $P = (x_0, x_1, \ldots, x_n)$. Let $P'$ be a partition which includes all points of $P$ and of $Q$, $P' = (y_0, y_1, \ldots, y_m)$. We examine one summand of $S(f; P)$. Suppose
\[ x_{i-1} = y_{j-1} < y_j < \cdots < y_k = x_i. \]
Then
\[ \Bigl| f(x_i)(x_i - x_{i-1}) - \sum_{l=j}^{k} f(y_l)(y_l - y_{l-1}) \Bigr| = \Bigl| \sum_{l=j}^{k} (f(x_i) - f(y_l))(y_l - y_{l-1}) \Bigr| < \varepsilon \sum_{l=j}^{k} (y_l - y_{l-1}) = \varepsilon (x_i - x_{i-1}), \]
since each $|x_i - y_l| < \delta$. Adding, we get
\[ |S(f; P) - S(f; P')| < \varepsilon (b - a). \]
Similarly,
\[ |S(f; Q) - S(f; P')| < \varepsilon (b - a), \]
so
\[ |S(f; P) - S(f; Q)| < 2\varepsilon (b - a). \]
By Proposition 2.3, $f$ is integrable. □

We now want to consider the effect of integrating over subintervals.

Proposition 2.5. Suppose $a < b < c$ and $f: [a, c] \to \mathbb{C}$ is bounded. Then $f$ is integrable if and only if it is integrable as a function on $[a, b]$ and on $[b, c]$. If so, then
\[ \int_a^c f = \int_a^b f + \int_b^c f. \]

Proof. Suppose $f$ is integrable on $[a, b]$ and on $[b, c]$. Given $\varepsilon > 0$, choose $\delta > 0$ so that
\[ \Bigl| S(f; P) - \int_a^b f \Bigr| < \tfrac{1}{2}\varepsilon, \qquad \Bigl| S(f; Q) - \int_b^c f \Bigr| < \tfrac{1}{2}\varepsilon \tag{2.5} \]
if $P$, $Q$ are partitions of $[a, b]$, $[b, c]$ respectively, $|P| < \delta$, $|Q| < \delta$. Suppose $P'$ is a partition of $[a, c]$, $|P'| < \delta$. If $b$ is a point of $P'$, then $P'$ determines partitions $P$ of $[a, b]$ and $Q$ of $[b, c]$, $|P| < \delta$, $|Q| < \delta$. It follows from (2.5) that
\[ \Bigl| S(f; P') - \int_a^b f - \int_b^c f \Bigr| < \varepsilon. \tag{2.6} \]
If $b$ is not a point of $P'$, let $P''$ be the partition obtained by adjoining $b$. Then (2.6) holds with $P''$ in place of $P'$. Suppose $|f(x)| \le M$, all $x \in [a, c]$. The sums $S(f; P')$ and $S(f; P'')$ differ only in terms corresponding to the subinterval determined by $P'$ which contains $b$. It is easy to check, then, that
\[ |S(f; P') - S(f; P'')| < 2\delta M. \]
Thus
\[ \Bigl| S(f; P') - \int_a^b f - \int_b^c f \Bigr| < \varepsilon + 2\delta M \]
if $|P'| < \delta$. It follows that $f$ is integrable with integral $\int_a^b f + \int_b^c f$.
Conversely, suppose $f$ is not integrable on $[a, b]$ or $[b, c]$; say it is not integrable on $[a, b]$. Then there is an $\varepsilon > 0$ such that for any $\delta > 0$ there are partitions $P_1$, $P_2$ of $[a, b]$ with $|P_1| < \delta$, $|P_2| < \delta$, but
\[ |S(f; P_1) - S(f; P_2)| > \varepsilon. \]
Let $Q$ be a partition of $[b, c]$ with $|Q| < \delta$, and let $P_1'$, $P_2'$ be the partitions of $[a, c]$ containing all points of $Q$ and of $P_1$, $P_2$ respectively. Then $|P_1'| < \delta$, $|P_2'| < \delta$, and
\[ |S(f; P_1') - S(f; P_2')| = |S(f; P_1) - S(f; P_2)| > \varepsilon. \]
By Proposition 2.3, $f$ is not integrable. □

Suppose $f: [a_0, b_0] \to \mathbb{C}$ is integrable, and suppose $a, b \in [a_0, b_0]$. If $a < b$, then $f$ is integrable on $[a, b]$. (In fact $f$ is integrable on $[a_0, b]$, therefore on $[a, b]$, by two applications of Proposition 2.5.) If $b < a$, then $f$ is integrable on $[b, a]$ and we define
\[ \int_a^b f = -\int_b^a f. \]
We also define
\[ \int_a^a f = 0. \]
Then one can easily check, case by case, that for any $a, b, c \in [a_0, b_0]$,
\[ \int_a^c f = \int_a^b f + \int_b^c f. \tag{2.7} \]

It is convenient to extend the notion of the integral to certain unbounded functions and to certain functions on unbounded intervals; such integrals are called improper integrals. We give two examples, and leave the remaining cases to the reader.
Suppose $f: (a_0, b] \to \mathbb{C}$ is bounded and integrable on each subinterval $[a, b]$, $a_0 < a < b$. We set
\[ \int_{a_0}^{b} f = \lim_{a \to a_0} \int_a^b f \tag{2.8} \]
if the limit exists.
Suppose $f: [a, \infty) \to \mathbb{C}$ is bounded and integrable on each subinterval $[a, b]$, $a < b < \infty$. We set
\[ \int_a^{\infty} f = \lim_{b \to \infty} \int_a^b f \tag{2.9} \]
if the limit exists; this means that there is a $z \in \mathbb{C}$ such that for each $\varepsilon > 0$
\[ \Bigl| \int_a^b f - z \Bigr| < \varepsilon \quad \text{if } b \ge b(\varepsilon). \]


Exercises

1. Let $f: [0, 1] \to \mathbb{R}$ with $f(x) = 0$ if $x$ is irrational, $f(x) = 1$ if $x$ is rational. Show that $f$ is not integrable.
2. Let $f: [0, 1] \to \mathbb{R}$ be defined by: $f(0) = 0$, $f(x) = \sin(1/x)$ if $x \ne 0$. Sketch the graph. Show that $f$ is integrable.
3. Suppose $f, g: [a, b] \to \mathbb{C}$ are bounded, $f$ is integrable, and $g(x) = f(x)$ except on a finite set of points in $[a, b]$. Then $g$ is integrable and $\int_a^b g = \int_a^b f$.
4. Suppose $f: [a, b] \to \mathbb{C}$ is bounded and is continuous except at some finite set of points in $[a, b]$. Show that $f$ is integrable.
5. Suppose $f: [a, b] \to \mathbb{C}$ is continuous and $f(x) \ge 0$, all $x \in [a, b]$. Show that $\int_a^b f = 0$ implies $f(x) = 0$, all $x \in [a, b]$.
6. Suppose $f: [a, b] \to \mathbb{C}$ is bounded, integrable, and real-valued. Suppose
\[ \Bigl| \int_a^b f \Bigr| = M(b - a), \quad \text{where } M = \sup\{|f(x)| \mid x \in [a, b]\}. \]
Show that $f$ is constant.
7. Do Exercise 6 without the assumption that $f$ is real-valued.
8. Let $f: [0, 1] \to \mathbb{C}$ be defined by: $f(x) = 0$ if $x = 0$ or $x$ is irrational, $f(x) = 1/q$ if $x = p/q$, $p, q$ relatively prime positive integers. Show that $f$ is continuous at $x$ if and only if $x$ is zero or irrational. Show that $f$ is integrable and $\int_0^1 f = 0$.

3. Differentiation of complex-valued functions


Suppose $(a, b)$ is an open interval in $\mathbb{R}$ and that $f: (a, b) \to \mathbb{C}$. As in the case of a real-valued function, we say that the function $f$ is differentiable at the point $x \in (a, b)$ if the limit
\[ \lim_{y \to x} \frac{f(y) - f(x)}{y - x} \tag{3.1} \]
exists. More precisely, this means that there is a number $z \in \mathbb{C}$ such that for any $\varepsilon > 0$, there is a $\delta > 0$ with
\[ |(f(y) - f(x))(y - x)^{-1} - z| < \varepsilon \quad \text{if } 0 < |y - x| < \delta. \tag{3.2} \]
If so, the (unique) number $z$ is called the derivative of $f$ at $x$ and denoted variously by
\[ f'(x), \qquad Df(x), \qquad \text{or} \qquad \frac{df}{dx}(x). \]

Proposition 3.1. If $f: (a, b) \to \mathbb{C}$ is differentiable at $x \in (a, b)$ then $f$ is continuous at $x$.

Proof. Choose $\delta > 0$ so that (3.2) holds with $z = f'(x)$ and $\varepsilon = 1$. Then when $0 < |y - x| < \delta$ we have
\[ |f(y) - f(x)| \le |f(y) - f(x) - z(y - x)| + |z(y - x)| < (1 + |z|)|y - x|. \]
As $y \to x$, $f(y) \to f(x)$. □

Proposition 3.2. The function $f: (a, b) \to \mathbb{C}$ is differentiable at $x \in (a, b)$ if and only if the real and imaginary parts $g = \operatorname{Re} f$ and $h = \operatorname{Im} f$ are differentiable at $x$. If so, then
\[ f'(x) = g'(x) + ih'(x). \]

Proof. As in the proof of Proposition 2.1, the limit (3.1) exists if and only if the limits of the real and imaginary parts of this expression exist. If so, these are respectively $g'(x)$ and $h'(x)$. □

Proposition 3.3. Suppose $f: (a, b) \to \mathbb{C}$ and $g: (a, b) \to \mathbb{C}$ are differentiable at $x \in (a, b)$, and suppose $c \in \mathbb{C}$. Then the functions $f + g$, $cf$, and $fg$ are differentiable at $x$, and
\[ (f + g)'(x) = f'(x) + g'(x), \]
\[ (cf)'(x) = cf'(x), \]
\[ (fg)'(x) = f'(x)g(x) + f(x)g'(x). \]
If $g(x) \ne 0$ then $f/g$ is differentiable at $x$ and
\[ (f/g)'(x) = [f'(x)g(x) - f(x)g'(x)]\,g(x)^{-2}. \]

Proof. This can be proved by reducing it to the (presumed known) theorem for real-valued functions, using Proposition 3.2. An alternative is simply to repeat the proofs, which are no different in the complex case. We shall do this for the product, as an example. We have
\[ (fg)(y) - (fg)(x) = f(y)g(y) - f(x)g(x) = [f(y) - f(x)]g(y) + f(x)[g(y) - g(x)]. \]
Divide by $(y - x)$ and let $y \to x$. Since $g(y) \to g(x)$, the first term converges to $f'(x)g(x)$. The second converges to $f(x)g'(x)$. □
We recall the following theorem, which is only valid for real-valued
functions.
Theorem 3.4. (Mean Value Theorem). Suppose $f: [a, b] \to \mathbb{R}$ is continuous, and is differentiable at each point of $(a, b)$. Then there is a $c \in (a, b)$ such that
\[ f'(c) = [f(b) - f(a)](b - a)^{-1}. \]

Proof. Suppose first that $f(b) = f(a)$. By Theorem 1.4 there are points $c_+$ and $c_-$ in $[a, b]$ such that $f(c_+) \ge f(x)$, all $x \in [a, b]$ and $f(c_-) \le f(x)$, all $x \in [a, b]$. If $c_+$ and $c_-$ are both either $a$ or $b$, then $f$ is constant and $f'(c) = 0$, all $c \in (a, b)$. Otherwise, suppose $c_+ \in (a, b)$. It follows that $[f(y) - f(c_+)](y - c_+)^{-1}$ is $\ge 0$ if $y < c_+$ and $\le 0$ if $y > c_+$. Therefore the limit as $y \to c_+$ is zero. Similarly, if $c_- \ne a$ and $c_- \ne b$, then $f'(c_-) = 0$. Thus in this case $f'(c) = 0$ for some $c \in (a, b)$.
In the general case, let
\[ g(x) = f(x) - (x - a)[f(b) - f(a)](b - a)^{-1}. \]
Then $g(a) = f(a) = g(b)$. By what we have just proved, there is a $c \in (a, b)$ such that
\[ 0 = g'(c) = f'(c) - [f(b) - f(a)](b - a)^{-1}. \]
□

Corollary 3.5. Suppose $f: [a, b] \to \mathbb{C}$ is continuous, and is differentiable at each point of $(a, b)$. If $f'(x) = 0$ for each $x \in (a, b)$, then $f$ is constant.

Proof. Let $g$, $h$ be the real and imaginary parts of $f$. Then $g'(x) = h'(x) = 0$, $x \in (a, b)$. We want to show $g$, $h$ constant. If $[x, y] \subset [a, b]$, Theorem 3.4 applied to $g$, $h$ on $[x, y]$ implies $g(x) = g(y)$, $h(x) = h(y)$. □
Theorem 3.6. (Fundamental Theorem of Calculus). Suppose $f: [a, b] \to \mathbb{C}$ is continuous and suppose $c \in [a, b]$. The function $F: [a, b] \to \mathbb{C}$ defined by
\[ F(x) = \int_c^x f \]
is differentiable at each point $x$ of $(a, b)$ and
\[ F'(x) = f(x). \]

Proof. Let $g$ be the constant function $g(y) = f(x)$, $y \in (a, b)$. Given $\varepsilon > 0$, choose $\delta$ so small that $|f(y) - g(y)| = |f(y) - f(x)| < \varepsilon$ if $|y - x| < \delta$. Then
\[ F(y) - F(x) = \int_c^y f - \int_c^x f = \int_x^y f = \int_x^y g + \int_x^y (f - g) = f(x)(y - x) + \int_x^y (f - g). \tag{3.3} \]
If $|y - x| < \delta$, then
\[ \left| \int_x^y (f - g) \right| \le |y - x| \sup \{ |f(t) - f(x)| : |t - x| \le |y - x| \} < \varepsilon |y - x|. \]
Thus dividing (3.3) by $(y - x)$ we get
\[ \left| [F(y) - F(x)](y - x)^{-1} - f(x) \right| < \varepsilon. \] □
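The identity (3.3) says that the difference quotient of $F$ is the average of $f$ over the interval between $x$ and $y$. The following is a minimal numerical sketch of that statement, not part of the text: the sample function $f(t) = e^{it}$, the midpoint rule, and the names `integral` and `average_over` are all assumptions chosen for illustration.

```python
import cmath

def integral(f, lo, hi, steps=2000):
    """Composite midpoint rule for a complex-valued f on [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (k + 0.5) * width) for k in range(steps)) * width

def f(t):
    """Sample continuous function from R to C (an illustrative choice)."""
    return cmath.exp(1j * t)

def average_over(x, y):
    """[F(y) - F(x)] / (y - x), computed as in (3.3) from the integral over [x, y]."""
    return integral(f, x, y) / (y - x)

if __name__ == "__main__":
    x = 1.0
    for gap in (1e-1, 1e-2, 1e-3):
        # The distance to f(x) should shrink with the gap, as the proof predicts.
        print(gap, abs(average_over(x, x + gap) - f(x)))
```

Since $|f(t) - f(x)| \le |t - x|$ for this particular $f$, the printed distances should be bounded by the gap itself.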

Theorem 3.7. Suppose $f: [a, b] \to \mathbb{R}$ is continuous and differentiable at each point of $(a, b)$ and suppose $f'(x) > 0$, all $x \in (a, b)$. Then $f$ is strictly increasing on $(a, b)$. For each $y \in [f(a), f(b)]$ there is a unique point $x = g(y) \in [a, b]$ such that $f(x) = y$. The function $g = f^{-1}$ is differentiable at each point of $(f(a), f(b))$ and
\[ g'(y) = [f'(g(y))]^{-1}. \]

Proof. If $x, y \in [a, b]$ and $x < y$, application of Theorem 3.4 to $[x, y]$ shows that $f(x) < f(y)$. In particular, $f(a) < f(b)$. By Theorem 1.6, if $f(a) \le y \le f(b)$ there is $x \in [a, b]$ with $f(x) = y$. Since $f$ is strictly increasing, $x$ is unique. Letting $g = f^{-1}$ we note that $g$ is continuous. In fact, suppose $y \in (f(a), f(b))$ and $\varepsilon > 0$. Take $y'$, $y''$ such that
\[ f(a) \le y' < y < y'' \le f(b) \]
and $y'' - y \le \varepsilon$, $y - y' \le \varepsilon$. Let $x' = g(y')$, $x = g(y)$, $x'' = g(y'')$. Then $x' < x < x''$. Let $\delta = \min \{x'' - x,\ x - x'\}$. If $|x - w| < \delta$ then $w \in (x', x'')$, so $f(w) \in (y', y'')$, so $|f(w) - f(x)| = |f(w) - y| < \varepsilon$. Continuity at $f(a)$ and $f(b)$ is proved similarly.
Finally, let $x = g(y)$, $x' = g(y')$. Then
\[ \frac{g(y') - g(y)}{y' - y} = \frac{x' - x}{f(x') - f(x)}. \tag{3.3} \]
As $y' \to y$, we have shown that $x' \to x$. Thus (3.3) converges to $f'(x)^{-1} = [f'(g(y))]^{-1}$. □
Proposition 3.8. (Chain rule). Suppose $g: (a, b) \to \mathbb{R}$ is differentiable at $x$, and suppose $f: g((a, b)) \to \mathbb{C}$ is differentiable at $g(x)$. Then the composite function $f \circ g$ is differentiable at $x$ and
\[ (f \circ g)'(x) = f'(g(x))\,g'(x). \tag{3.4} \]

Proof. We have
\[ f \circ g(y) - f \circ g(x) = f(g(y)) - f(g(x)) = \frac{f(g(y)) - f(g(x))}{g(y) - g(x)} \cdot \frac{g(y) - g(x)}{y - x} \cdot (y - x) \tag{3.5} \]
if $g(y) \ne g(x)$. If $g'(x) \ne 0$ then $g(y) \ne g(x)$ if $y$ is close to $x$ and $y \ne x$. Taking the limit as $y \to x$ we get (3.4). Suppose $g'(x) = 0$. For each $y$ near $x$ either $g(y) = g(x)$, so $f \circ g(y) - f \circ g(x) = 0$, or (3.5) holds. In either case, $[f \circ g(y) - f \circ g(x)](y - x)^{-1}$ is close to zero for $y$ near $x$. □
Proposition 3.9. (Change of variables in integration). Suppose $g: [a, b] \to \mathbb{R}$ is continuous, and is differentiable at each point of $(a, b)$. Suppose $f: g([a, b]) \to \mathbb{C}$ is continuous. Then
\[ \int_{g(a)}^{g(b)} f = \int_a^b (f \circ g)\,g'. \]

Proof. Define $F: g([a, b]) \to \mathbb{C}$ and $G: [a, b] \to \mathbb{C}$ by
\[ F(y) = \int_{g(a)}^{y} f, \qquad G(x) = \int_a^x (f \circ g)\,g'. \]
We want to prove that $F(g(b)) = G(b)$; we shall in fact show that $F \circ g = G$ on $[a, b]$. Since $F \circ g(a) = G(a) = 0$, it suffices to prove that the derivatives are the same. But
\[ (F \circ g)' = (F' \circ g)\,g' = (f \circ g)\,g' = G'. \] □

A function $f: (a, b) \to \mathbb{C}$ is said to be differentiable if it is differentiable at each point of $(a, b)$. If $f$ is differentiable, then the derivative $f'$ is itself a function from $(a, b)$ to $\mathbb{C}$ which may (or may not) have a derivative $f''(x) = (f')'(x)$ at $x \in (a, b)$. This is called the second derivative of $f$ at $x$ and denoted also by
\[ D^2 f(x), \qquad \frac{d^2 f}{dx^2}(x). \]
Higher derivatives are defined similarly, by induction:
\[ f^{(0)}(x) = f(x), \qquad f^{(1)}(x) = f'(x), \qquad f^{(k+1)}(x) = (f^{(k)})'(x), \qquad k = 0, 1, 2, \ldots. \]
The function $f: (a, b) \to \mathbb{C}$ is said to be of class $C^k$, or $k$-times continuously differentiable, if each of the derivatives $f, f', \ldots, f^{(k)}$ is a continuous function on $(a, b)$. The function is said to be of class $C^\infty$, or infinitely differentiable, if $f^{(k)}$ is continuous on $(a, b)$ for every integer $k \ge 0$.

Exercises
1. Show that any polynomial is infinitely differentiable.
2. Show that the Mean Value Theorem is not true for complex-valued functions, in general, by finding a differentiable function $f$ such that $f(0) = 0 = f(1)$ but $f'(x) \ne 0$ for $0 < x < 1$.
3. State and prove a theorem analogous to Theorem 3.7 when $f'(x) < 0$, all $x \in (a, b)$.
4. Suppose $f$, $g$ are of class $C^k$ and $c \in \mathbb{C}$. Show that $f + g$, $cf$, and $fg$ are of class $C^k$.
5. Suppose $p$ is a polynomial with real coefficients. Show that between any two distinct real roots of $p$ there is a real root of $p'$.
6. Show that for any $k = 0, 1, 2, \ldots$ there is a function $f: \mathbb{R} \to \mathbb{R}$ which is of class $C^k$, such that $f(x) = 0$ if $x \le 0$, $f(x) > 0$ if $x > 0$. Is there a function of class $C^\infty$ having this property?
7. Prove the following extension of the mean value theorem: if $f$ and $g$ are continuous real-valued functions on $[a, b]$, and if the derivatives exist at each point of $(a, b)$, then there is $c \in (a, b)$ such that
\[ [f(b) - f(a)]\,g'(c) = [g(b) - g(a)]\,f'(c). \]
8. Prove L'Hôpital's rule: if $f$ and $g$ are as in Exercise 7 and if
\[ \lim_{x \to a} f'(x)[g'(x)]^{-1} \]
exists and $f(a) = g(a) = 0$, then
\[ \lim_{x \to a} f(x)[g(x)]^{-1} \]
exists and the two limits are equal.

4. Sequences and series of functions


Suppose that $S$ is any set and that $f: S \to \mathbb{C}$ is a bounded function. Let
\[ |f| = \sup \{ |f(x)| : x \in S \}. \]
A sequence of bounded functions $(f_n)_{n=1}^\infty$ from $S$ to $\mathbb{C}$ is said to converge uniformly to $f$ if
\[ \lim_{n \to \infty} |f_n - f| = 0. \]
This sequence $(f_n)_{n=1}^\infty$ is said to be a uniform Cauchy sequence if for each $\varepsilon > 0$ there is an integer $N$ so that
\[ |f_n - f_m| < \varepsilon \quad \text{if } n, m \ge N. \tag{4.1} \]
It is not difficult to show that if the sequence converges uniformly to a function $f$, then it is a uniform Cauchy sequence. The converse is also true.

Theorem 4.1. Suppose $(f_n)_{n=1}^\infty$ is a sequence of bounded functions from a set $S$ to $\mathbb{C}$ which is a uniform Cauchy sequence. Then there is a unique bounded function $f: S \to \mathbb{C}$ such that $(f_n)_{n=1}^\infty$ converges uniformly to $f$. If $S$ is a metric space and each $f_n$ is continuous, then $f$ is continuous.
Proof. For each $x \in S$, we have
\[ |f_n(x) - f_m(x)| \le |f_n - f_m|. \]
Therefore $(f_n(x))_{n=1}^\infty$ is a Cauchy sequence in $\mathbb{C}$. Denote the limit by $f(x)$. We want to show that $(f_n)_{n=1}^\infty$ converges uniformly to the function $f$ defined in this way. Given $\varepsilon > 0$, take $N$ so large that (4.1) holds. Then for a fixed $m \ge N$,
\[ |f_m(x) - f(x)| = \left| f_m(x) - \lim_{n \to \infty} f_n(x) \right| = \lim_{n \to \infty} |f_m(x) - f_n(x)| \le \varepsilon. \]
Thus $|f_m - f| \le \varepsilon$ if $m \ge N$, and $(f_n)_{n=1}^\infty$ converges uniformly to $f$. If the sequence also converged uniformly to $g$, then for each $x \in S$,
\[ |f_n(x) - g(x)| \le |f_n - g|, \]
so $f_n(x) \to g(x)$ as $n \to \infty$, and $g = f$. The function $f$ is bounded, since
\[ |f(x)| = |f(x) - f_m(x) + f_m(x)| \le \varepsilon + |f_m(x)| \le \varepsilon + |f_m|, \]
so $|f| \le \varepsilon + |f_m|$, if $m \ge N$.
Finally, suppose each $f_n$ is continuous on the metric space $S$. Suppose $x \in S$ and $\varepsilon > 0$. Choose $N$ as above. Choose $\delta > 0$ so small that
\[ |f_N(y) - f_N(x)| < \varepsilon \quad \text{if } d(y, x) < \delta. \]
Then
\[ |f(y) - f(x)| \le |f(y) - f_N(y)| + |f_N(y) - f_N(x)| + |f_N(x) - f(x)| \le |f - f_N| + |f_N(y) - f_N(x)| + |f - f_N| < 3\varepsilon \]
if $d(y, x) < \delta$. Thus $f$ is continuous. □

The usefulness of the notion of uniform convergence is indicated by the


next theorem and the example following.
Theorem 4.2. Suppose $(f_n)_{n=1}^\infty$ is a sequence of continuous complex-valued functions on the interval $[a, b]$, and suppose it converges uniformly to $f$. Then
\[ \int_a^b f = \lim_{n \to \infty} \int_a^b f_n. \]

Proof. By (2.2),
\[ \left| \int_a^b f_n - \int_a^b f \right| = \left| \int_a^b (f_n - f) \right| \le |f_n - f|\,|b - a|. \]
As $n \to \infty$, this $\to 0$. □

Example
For each positive integer $n$, let $f_n: [0, 1] \to \mathbb{R}$ be the function whose graph consists of the line segments joining the pairs of points $(0, 0)$, $((2n)^{-1}, 2n)$; $((2n)^{-1}, 2n)$, $(n^{-1}, 0)$; $(n^{-1}, 0)$, $(1, 0)$. Then $f_n$ is continuous, $f_n(x) \to 0$ as $n \to \infty$ for each $x \in [0, 1]$, but $\int_0^1 f_n = 1$, all $n$.
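The example can be checked numerically. The sketch below is an illustration only: the names `tent` and `integral_01` and the midpoint rule are assumptions, not part of the text. It evaluates the tent functions pointwise and integrates them over $[0, 1]$.

```python
def tent(n, x):
    """f_n from the example: rises linearly to 2n at x = 1/(2n), falls to 0 at x = 1/n."""
    peak = 1.0 / (2 * n)
    if x <= peak:
        return 4.0 * n * n * x               # slope 2n / (1/(2n)) = 4 n^2
    if x <= 1.0 / n:
        return 4.0 * n * n * (1.0 / n - x)   # symmetric descent back to zero
    return 0.0

def integral_01(g, steps=20000):
    """Composite midpoint rule on [0, 1]."""
    width = 1.0 / steps
    return sum(g((k + 0.5) * width) for k in range(steps)) * width

if __name__ == "__main__":
    for n in (1, 5, 50):
        # Pointwise value at a fixed x, and the integral over [0, 1].
        print(n, tent(n, 0.3), integral_01(lambda x: tent(n, x)))
```

The maximum of $f_n$ is $2n$, so $|f_n - 0|$ does not tend to zero and the convergence is not uniform, which is why Theorem 4.2 does not apply.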

Here we are interested particularly in sequences of functions which are partial sums of power series. Associated with the sequence $(a_n)_{n=0}^\infty$ in $\mathbb{C}$ and the point $z_0 \in \mathbb{C}$ is the series
\[ \sum_{n=0}^\infty a_n (z - z_0)^n, \qquad z \in \mathbb{C}. \tag{4.2} \]
Recall from §3 of Chapter 1 that there is a number $R$, $0 \le R \le \infty$, such that (4.2) converges when $|z - z_0| < R$ and diverges when $|z - z_0| > R$; $R$ is called the radius of convergence of (4.2). The partial sums
\[ f_n(z) = \sum_{m=0}^{n} a_m (z - z_0)^m \tag{4.3} \]
are continuous functions on $\mathbb{C}$ which converge at each point $z$ with $|z - z_0| < R$ to the function
\[ f(z) = \sum_{m=0}^{\infty} a_m (z - z_0)^m, \qquad |z - z_0| < R. \tag{4.4} \]

Theorem 4.3. Let $R$ be the radius of convergence of the power series (4.2). Then the function $f$ defined by (4.4) is a continuous function. Moreover, the functions $f_n$ defined by (4.3) converge to $f$ uniformly on each disc
\[ D_r = \{ z : |z - z_0| < r \}, \qquad 0 < r < R. \]

Proof. We prove uniform convergence first. Given $0 < r < R$, choose $s$ with $r < s < R$. Take $w$ with $|w - z_0| = s$. By assumption, $\sum a_n (w - z_0)^n$ converges. Therefore the terms of this series $\to 0$. It follows that there is a constant $M$ so that
\[ |a_n (w - z_0)^n| \le M, \qquad n = 0, 1, \ldots. \]
Since $|w - z_0| = s$, this means
\[ |a_n| \le M s^{-n}, \qquad n = 0, 1, \ldots. \tag{4.5} \]
Now suppose $z \in D_r$ and $m < n$. Then
\[ |f_n(z) - f_m(z)| = \left| \sum_{j=m+1}^{n} a_j (z - z_0)^j \right| \le \sum_{j=m+1}^{n} |a_j|\,|z - z_0|^j \le M \sum_{j=m+1}^{n} \delta^j = M\,\frac{\delta^{m+1} - \delta^{n+1}}{1 - \delta} \le \frac{M \delta^{m+1}}{1 - \delta}, \]
where $\delta = r/s < 1$. As $m \to \infty$ the final expression on the right $\to 0$, so $(f_n)_{n=0}^\infty$ is a uniform Cauchy sequence on $D_r$. It follows that it converges to $f$ uniformly on $D_r$ and that $f$ is continuous on $D_r$. Since this is true for each $r < R$, $f$ is continuous. □
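The tail estimate in this proof can be tried on a concrete series. The sketch below is illustrative only and makes several assumptions: it uses the geometric series ($a_n = 1$, $z_0 = 0$, $R = 1$, so that $M = 1$ works in (4.5) for any $s < 1$), samples only the circle $|z| = r$, and the names `partial_sum`, `sup_error_on_circle`, and `tail_bound` are hypothetical.

```python
import cmath
import math

def partial_sum(z, m):
    """f_m(z) = sum_{j=0}^{m} z^j, the geometric series with a_j = 1 and z_0 = 0."""
    return sum(z ** j for j in range(m + 1))

def sup_error_on_circle(r, m, samples=360):
    """Largest sampled value of |f(z) - f_m(z)| on |z| = r, where f(z) = 1 / (1 - z)."""
    worst = 0.0
    for k in range(samples):
        z = cmath.rect(r, 2 * math.pi * k / samples)
        worst = max(worst, abs(1 / (1 - z) - partial_sum(z, m)))
    return worst

def tail_bound(r, s, m, M=1.0):
    """The bound M * delta^(m+1) / (1 - delta) from the proof, with delta = r / s."""
    delta = r / s
    return M * delta ** (m + 1) / (1 - delta)

if __name__ == "__main__":
    r, s = 0.5, 0.9
    for m in (5, 10, 20):
        print(m, sup_error_on_circle(r, m), tail_bound(r, s, m))
```

Letting $n \to \infty$ in the proof's inequality gives $|f - f_m| \le M\delta^{m+1}/(1 - \delta)$ on $D_r$, so the sampled error should stay below the bound for every $m$.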
In particular, suppose $x_0 \in \mathbb{R}$. The power series
\[ \sum_{n=0}^\infty a_n (x - x_0)^n \tag{4.6} \]
defines a continuous function in the open interval $(x_0 - R,\ x_0 + R)$. Is this function differentiable?

Theorem 4.4. Suppose the power series (4.6) has radius of convergence $R$. Then the function $f$ defined by this series is differentiable, and
\[ f'(x) = \sum_{n=1}^\infty n a_n (x - x_0)^{n-1}, \qquad |x - x_0| < R. \tag{4.7} \]

Proof. To simplify notation we shall assume $x_0 = 0$. We claim first that the two series
\[ \sum_{n=1}^\infty n a_n x^{n-1}, \qquad \sum_{n=2}^\infty n^2 |a_n| r^{n-2} \tag{4.8} \]
converge uniformly for $|x| \le r < R$. Take $r < s < R$. Then (4.5) holds. It follows that
\[ \sum_{m=1}^\infty |m a_m x^{m-1}| \le M \sum_{m=1}^\infty m s^{-m} r^{m-1} = r^{-1} M \sum_{m=1}^\infty m \delta^m, \qquad \delta = r/s < 1. \]
Take $\varepsilon > 0$ so small that $(1 + \varepsilon)\delta < 1$. By Exercise 4 of Chapter 1, §3, $m \le (1 + \varepsilon)^m$ for all large $m$. Therefore there is a constant $M'$ so that
\[ m \le M'(1 + \varepsilon)^m, \qquad m = 1, 2, \ldots. \]
Then
\[ \sum_{m=1}^\infty |m a_m x^{m-1}| \le r^{-1} M M' \sum_{m=1}^\infty (1 + \varepsilon)^m \delta^m. \]
This last series converges, so the first series in (4.8) converges uniformly for $|x| \le r$. Similarly, $m^2 \le (1 + \varepsilon)^m$ for large $m$, and the second series in (4.8) converges uniformly for $|x| \le r$.
Let $g$ be the function defined by the first series in (4.8). Recall that we are taking $x_0$ to be $0$. We want to show that
\[ [f(y) - f(x)](y - x)^{-1} - g(x) \to 0 \quad \text{as } y \to x. \tag{4.9} \]
Assume $|x| < r$, $|y| < r$. Then the expression in (4.9) is
\[ \sum_{n=2}^\infty a_n [y^n - x^n - n x^{n-1}(y - x)](y - x)^{-1}. \tag{4.10} \]
Now
\[ y^n - x^n = (y - x)\,g_n(x, y), \]
where
\[ g_n(x, y) = y^{n-1} + x y^{n-2} + \cdots + x^{n-2} y + x^{n-1}. \]
Thus
\[ |g_n(x, y)| \le n r^{n-1} \quad \text{if } |x| \le r,\ |y| \le r. \]
Then
\[ y^n - x^n - n x^{n-1}(y - x) = (y - x)[y^{n-1} + x y^{n-2} + \cdots + x^{n-1} - n x^{n-1}] \]
\[ = (y - x)[(y^{n-1} - x^{n-1}) + (y^{n-2} - x^{n-2})x + \cdots + (y - x)x^{n-2} + x^{n-1} - x^{n-1}] \]
\[ = (y - x)^2 [g_{n-1}(x, y) + g_{n-2}(x, y)x + \cdots + g_1(x, y)x^{n-2}], \]
so
\[ |y^n - x^n - n x^{n-1}(y - x)| \le |y - x|^2 n^2 r^{n-2}. \]
It follows that for $|x| \le r$, $|y| \le r$ we have
\[ \left| [f(y) - f(x)](y - x)^{-1} - g(x) \right| \le \sum_{n=2}^\infty |a_n|\,|y - x|^2 n^2 r^{n-2} |y - x|^{-1} = |y - x| \sum_{n=2}^\infty n^2 |a_n| r^{n-2} = K |y - x|, \]
$K$ constant. Thus $f'(x) = g(x)$. □


Corollary 4.5. The function $f$ in Theorem 4.4 is infinitely differentiable, and
\[ f^{(k)}(x) = \sum_{n=k}^\infty n(n - 1)(n - 2) \cdots (n - k + 1)\,a_n (x - x_0)^{n-k}, \qquad |x - x_0| < R. \tag{4.11} \]

Proof. This follows from Theorem 4.4 by induction on $k$. □

In particular, if we take $x = x_0$ in (4.11) then all terms of the series except the first are zero and (4.11) becomes
\[ f^{(k)}(x_0) = k!\,a_k. \tag{4.12} \]
This means that the coefficients of the power series (4.6) are determined uniquely by the function $f$ (provided the radius of convergence is positive).
Exercises
1. Find the function defined for $|x| < 1$ by $f(x) = \sum_{n=1}^\infty x^n / n$. (Hint: $f(x) = \int_0^x f'$.)
2. Show that if $f$ is defined by (4.6), then
\[ \int_{x_0}^{x} f = \sum_{n=0}^\infty (n + 1)^{-1} a_n (x - x_0)^{n+1}. \]
3. Find the function defined for $|x| < 1$ by $f(x) = \sum_{n=1}^\infty n x^{n-1}$.
4. Suppose there is a sequence $(x_n)_{n=1}^\infty$ such that $|x_{n+1} - x_0| < |x_n - x_0|$, $x_n \to x_0$, and $f(x_n) = 0$ for each $n$, where $f$ is defined by (4.6). Show that $f(x) = 0$ for all $x$. (Hint: show that $a_0, a_1, a_2, \ldots$ are each zero.)

5. Differential equations and the exponential function


Rather than define the general notion of a "differential equation" here, we shall consider some particular examples. We begin with the problem of finding a continuously differentiable function $E: \mathbb{R} \to \mathbb{C}$ such that
\[ E(0) = 1, \qquad E'(x) = E(x), \quad x \in \mathbb{R}. \tag{5.1} \]
Suppose there were such a function $E$, and suppose it could be defined by a power series
\[ E(x) = \sum_{n=0}^\infty a_n x^n. \]
Then (5.1) and Theorem 4.4 imply
\[ \sum_{n=1}^\infty n a_n x^{n-1} = \sum_{n=0}^\infty a_n x^n, \]
or
\[ \sum_{n=0}^\infty (n + 1) a_{n+1} x^n = \sum_{n=0}^\infty a_n x^n. \]
Since the coefficients are uniquely determined, this implies
\[ a_{n+1} = a_n / (n + 1), \qquad n = 0, 1, 2, \ldots. \]
But $a_0 = E(0) = 1$, so inductively
\[ a_n = (n!)^{-1} = [n(n - 1)(n - 2) \cdots 1]^{-1}. \]
We have shown that if there is a solution of (5.1) defined by a power series, then it is given by
\[ E(x) = \sum_{n=0}^\infty (n!)^{-1} x^n. \tag{5.2} \]

The ratio test shows that (5.2) converges for all real or complex x, and
application of Theorem 4.4 shows that E is indeed a solution of (5.1). We
shall see that it is the only solution.
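The recurrence $a_{n+1} = a_n/(n+1)$ translates directly into a loop for the partial sums of (5.2). The following sketch is illustrative only: the name `E_partial` and the default of 40 terms are assumptions, and the comparison with the standard library's `exp` simply anticipates the identification of $E$ with the exponential made later in this section.

```python
import cmath
import math

def E_partial(z, terms=40):
    """Partial sum of (5.2): sum of z^n / n! for n = 0, ..., terms - 1."""
    total, term = 0.0, 1.0             # term holds z^n / n!, starting at n = 0
    for n in range(terms):
        total += term
        term = term * z / (n + 1)      # uses a_{n+1} = a_n / (n + 1)
    return total

if __name__ == "__main__":
    print(E_partial(1.0), math.exp(1.0))
    print(E_partial(2 + 3j), cmath.exp(2 + 3j))
```

Because the terms are eventually dominated by a geometric series (the ratio test), a modest number of terms already agrees with the library value to near machine precision for arguments of moderate size.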
Theorem 5.1. For each $a, c \in \mathbb{C}$ there is a unique continuously differentiable function $f: \mathbb{R} \to \mathbb{C}$ such that
\[ f(0) = c, \qquad f'(x) = a f(x), \quad x \in \mathbb{R}. \tag{5.3} \]
This function is
\[ f(x) = cE(ax) = c \sum_{n=0}^\infty (n!)^{-1} a^n x^n. \tag{5.4} \]

Proof. The function given by (5.4) can be found by the argument used to find $E$, and Theorem 4.4 shows that it is a solution of (5.3). To show uniqueness, suppose $f$ is any solution of (5.3), and let $g(x) = E(-ax)$, so that
\[ g(0) = 1, \qquad g'(x) = -a g(x). \]
Then $fg$ is differentiable and
\[ (fg)' = f'g + fg' = afg - afg = 0. \]
Therefore $fg$ is constant and this constant value is $f(0)g(0) = c$. If $c \ne 0$, this implies $fg$ is never zero, so $g$ is never zero. Then for any $c$,
\[ f(x) = c / g(x), \quad \text{all } x. \]
Thus $f$ is unique. □

This can be extended to more complicated problems.

Theorem 5.2. For each $a, c \in \mathbb{C}$ and each continuous function $h: \mathbb{R} \to \mathbb{C}$ there is a unique continuously differentiable function $f: \mathbb{R} \to \mathbb{C}$ such that
\[ f(0) = c, \qquad f'(x) = a f(x) + h(x), \quad x \in \mathbb{R}. \tag{5.5} \]

Proof. Let $f_0(x) = E(ax)$, $g(x) = E(-ax)$. As in the preceding proof, $f_0 g$ is constant, $\equiv 1$. Therefore neither function vanishes. Any solution $f$ of (5.5) can be written as
\[ f = f_1 f_0, \quad \text{where } f_1 = gf. \]
Then
\[ f' = f_1' f_0 + f_1 f_0' = f_1' f_0 + af, \]
so (5.5) holds if and only if
\[ f_1(0) = c, \qquad f_1'(x) f_0(x) = h(x). \]
These conditions are equivalent to
\[ f_1(x) = c + \int_0^x g h. \]
Thus the unique solution of (5.5) is given by
\[ f(x) = c f_0(x) + f_0(x) \int_0^x g h = cE(ax) + E(ax) \int_0^x E(-at) h(t)\,dt. \tag{5.6} \] □
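Formula (5.6) can be evaluated by numerical quadrature and compared with a solution found by hand. The sketch below is an illustration under stated assumptions: it takes $E$ to be the library exponential, uses the midpoint rule, picks the forcing term $h(t) = t$, and compares with the closed form $(c + a^{-2})e^{ax} - x/a - a^{-2}$, which one can verify directly satisfies (5.5) for this $h$ when $a \ne 0$. The function names are hypothetical.

```python
import cmath

def midpoint(g, lo, hi, steps=4000):
    """Composite midpoint rule for a complex-valued integrand."""
    width = (hi - lo) / steps
    return sum(g(lo + (k + 0.5) * width) for k in range(steps)) * width

def solve_first_order(a, c, h, x):
    """Evaluate (5.6): f(x) = c E(ax) + E(ax) * integral from 0 to x of E(-at) h(t) dt."""
    integral = midpoint(lambda t: cmath.exp(-a * t) * h(t), 0.0, x)
    return c * cmath.exp(a * x) + cmath.exp(a * x) * integral

def closed_form(a, c, x):
    """Hand-computed solution of f' = a f + t, f(0) = c (assumes a != 0)."""
    return (c + 1 / a ** 2) * cmath.exp(a * x) - x / a - 1 / a ** 2

if __name__ == "__main__":
    a, c = 2j, 1.0
    for x in (0.5, 1.0, 1.5):
        print(x, solve_first_order(a, c, lambda t: t, x), closed_form(a, c, x))
```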

Now we consider equations involving the second derivative as well.

Theorem 5.3. For any $b, c, d_0, d_1 \in \mathbb{C}$ and any continuous function $h: \mathbb{R} \to \mathbb{C}$ there is a unique $f: \mathbb{R} \to \mathbb{C}$ of class $C^2$ which satisfies
\[ f(0) = d_0, \qquad f'(0) = d_1, \tag{5.7} \]
\[ f''(x) + b f'(x) + c f(x) = h(x), \quad x \in \mathbb{R}. \tag{5.8} \]

Proof. To motivate the proof, we introduce two operations on functions of class $C^1$ from $\mathbb{R}$ to $\mathbb{C}$: given such a function $g$, let
\[ Dg = g', \qquad Ig = g. \]
If $g$ is of class $C^2$, let $D^2 g = g''$. Then (5.8) can be written
\[ D^2 f + bDf + cIf = h. \]
This suggests the polynomial $z^2 + bz + c$. We know there are roots $a_1$, $a_2$ of this polynomial such that
\[ z^2 + bz + c = (z - a_1)(z - a_2), \quad \text{all } z \in \mathbb{C}. \]
Thus
\[ b = -(a_1 + a_2), \qquad c = a_1 a_2. \]
From the properties of differentiation it follows that
\[ (D - a_1 I)[(D - a_2 I) f] = D^2 f - (a_1 + a_2) Df + a_2 a_1 f = f'' + b f' + c f. \]
Let
\[ g = (D - a_2 I) f = f' - a_2 f. \]
We have shown that $f$ is a solution of (5.8) if and only if
\[ (D - a_1 I) g = g' - a_1 g = h. \]
If also (5.7) holds, then $g(0) = f'(0) - a_2 f(0) = d_1 - a_2 d_0$. Thus $f$ is a solution of (5.7), (5.8) if and only if
\[ f(0) = d_0, \qquad f' - a_2 f = g, \tag{5.9} \]
where
\[ g(0) = d_1 - a_2 d_0, \qquad g' - a_1 g = h. \tag{5.10} \]
But (5.10) has a unique solution $g$, and once $g$ has been found then (5.9) has a unique solution. It follows that (5.7), (5.8) has a unique solution. □
Now we return to the function $E$,
\[ E(z) = \sum_{n=0}^\infty (n!)^{-1} z^n, \qquad z \in \mathbb{C}. \]
Define the real number $e$ by
\[ e = E(1) = \sum_{n=0}^\infty (n!)^{-1}. \tag{5.11} \]

Theorem 5.4. The function $E$ is a function from $\mathbb{R}$ to $\mathbb{R}$ of class $C^\infty$. Moreover,
(a) $E(x) > 0$, $x \in \mathbb{R}$,
(b) for each $y > 0$, there is a unique $x \in \mathbb{R}$ such that $E(x) = y$,
(c) $E(x + y) = E(x)E(y)$, all $x, y \in \mathbb{R}$,
(d) for any rational $r$, $E(r) = e^r$.

Proof. Since $E$ is defined by a power series, it is of class $C^\infty$. It is clearly real for $x$ real, and positive when $x \ge 0$. As above, $E(x)E(-x) = 1$, all $x$, so also $E(x) > 0$ when $x < 0$.
To prove (b), we wish to apply the Intermediate Value Theorem (Theorem 1.6). Taking the first two terms in the series shows (since $y > 0$) that $E(y) > 1 + y > y$. Also, $E(y^{-1}) > y^{-1}$, so
\[ E(-y^{-1}) = E(y^{-1})^{-1} < (y^{-1})^{-1} = y. \]
Thus there is $x \in (-y^{-1}, y)$ such that $E(x) = y$. Since $E' = E > 0$, $E$ is strictly increasing and $x$ is unique.
We have proved (c) when $x = -y$. Multiplying by $E(-x)E(-y)$, we want to show
\[ E(x + y)E(-x)E(-y) = 1, \quad \text{all } x, y \in \mathbb{R}. \]
Fix $y$. This equation holds when $x = 0$, and differentiation with respect to $x$ shows that the left side is constant.
Finally, repeated use of (c) shows that
\[ E(nx) = E(x)^n, \qquad n = 0, 1, 2, \ldots. \]
Thus
\[ e = E(1) = E(n/n) = E(1/n)^n, \qquad e^{1/n} = E(1/n), \qquad n = 1, 2, 3, \ldots, \]
\[ e^{m/n} = (e^{1/n})^m = E(1/n)^m = E(m/n). \] □

Because of (d) above and the continuity of $E$, it is customary to define arbitrary complex powers of $e$ by
\[ e^z = E(z) = \sum_{n=0}^\infty (n!)^{-1} z^n. \tag{5.12} \]
The notation
\[ e^z = \exp z \tag{5.13} \]
is also common.
We extend part of Theorem 5.4 to the complex exponential function. (Recall that $z^*$ denotes the complex conjugate of $z \in \mathbb{C}$.)

Theorem 5.5. For any complex numbers $z$ and $w$,
\[ E(z + w) = E(z)E(w), \qquad E(z^*) = E(z)^*. \tag{5.14} \]

Proof. The second assertion can be proved by examining the partial sums of the series. To prove the first assertion, recall that we showed in the proof of Theorem 5.1 that $E(zx)E(-zx)$ is constant, $x \in \mathbb{R}$. Therefore $E(z)E(-z) = E(0)^2 = 1$. We want to show
\[ E(z + w)E(-z)E(-w) = 1, \quad \text{all } z, w \in \mathbb{C}. \]
Let
\[ g(x) = E((z + w)x)\,E(-zx)\,E(-wx), \qquad x \in \mathbb{R}. \]
Differentiation shows $g$ is constant. But $g(0) = 1$. □

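The notation (5.12) and the identity (5.14) can be used to consolidate expressions for the solutions of the differential equations above. The unique solution of
\[ f(0) = c, \qquad f' = af + h \]
is
\[ f(x) = c e^{ax} + \int_0^x e^{a(x - t)} h(t)\,dt. \tag{5.15} \]
The unique solution of
\[ f(0) = d_0, \qquad f'(0) = d_1, \qquad f'' + b f' + c f = h \]
is given by
\[ f(x) = d_0 e^{a_2 x} + \int_0^x e^{a_2 (x - t)} g(t)\,dt, \tag{5.16} \]
where
\[ g(x) = (d_1 - a_2 d_0) e^{a_1 x} + \int_0^x e^{a_1 (x - t)} h(t)\,dt \]
and $a_1$, $a_2$ are the roots of $z^2 + bz + c$.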
Exercises
1. Find the solution of
\[ f(0) = 1, \qquad f'(0) = 0, \qquad f'' - 2f' + f = 0 \]
by the procedure in Theorem 5.3, and also by determining the coefficients in the power series expansion of $f$.
2. Let $f$, $g$ be the functions such that
\[ f(0) = 1, \qquad f'(0) = 0, \qquad f'' + b f' + c f = 0, \]
\[ g(0) = 0, \qquad g'(0) = 1, \qquad g'' + b g' + c g = 0. \]
Show that for any constants $d_1$, $d_2$ the function $h = d_1 f + d_2 g$ is a solution of
\[ h'' + b h' + c h = 0. \tag{*} \]
Show that conversely if $h$ is a solution of this equation then there are unique constants $d_1, d_2 \in \mathbb{C}$ such that $h = d_1 f + d_2 g$. (This shows that the set of solutions of (*) is a two-dimensional complex vector space, and $(f, g)$ is a basis.)
3. Suppose $h(x) = \sum_{n=0}^\infty d_n x^n$, the series converging for all $x$. Show that the solution of
\[ f(0) = 0 = f'(0), \qquad f'' + b f' + c f = h \]
is of the form $\sum_{n=0}^\infty a_n x^n$, where this series converges for all $x$. (Hint: determine the coefficients $a_0, a_1, \ldots$ inductively, and prove convergence.)
4. Suppose $z^3 + bz^2 + cz + d = (z - a_1)(z - a_2)(z - a_3)$, all $z \in \mathbb{C}$. Discuss the problem of finding a function $f$ such that
\[ f(0) = e_0, \qquad f'(0) = e_1, \qquad f''(0) = e_2, \qquad f''' + b f'' + c f' + d f = 0. \]

6. Trigonometric functions and the logarithm


In §5 the exponential function arose naturally from study of the differential equation $f' = f$. In this section we discuss solutions of one of the simplest equations involving the second derivative: $f'' + f = 0$.

Theorem 6.1. There are unique functions $S, C: \mathbb{R} \to \mathbb{C}$ of class $C^2$ such that
\[ S(0) = 0, \qquad S'(0) = 1, \qquad S'' + S = 0, \tag{6.1} \]
\[ C(0) = 1, \qquad C'(0) = 0, \qquad C'' + C = 0. \tag{6.2} \]

Proof. Existence and uniqueness of such functions is a consequence of Theorem 5.3. □

Let us obtain expressions for $S$ and $C$ using the method of Theorem 5.3. The roots of $z^2 + 1$ are $z = \pm i$. Therefore
\[ S(x) = \int_0^x e^{-i(x - t)} g(t)\,dt, \]
where
\[ g(x) = e^{ix}. \]
Thus
\[ S(x) = \int_0^x e^{-i(x - t)} e^{it}\,dt = e^{-ix} \int_0^x e^{2it}\,dt = e^{-ix} (2i)^{-1} e^{2it} \Big|_0^x = (2i)^{-1} e^{-ix} (e^{2ix} - 1), \]
\[ S(x) = \frac{1}{2i} (e^{ix} - e^{-ix}). \tag{6.3} \]
A similar calculation gives
\[ C(x) = \frac{1}{2} (e^{ix} + e^{-ix}). \tag{6.4} \]


(6.4)
Theorem 6.2. The functions $S$, $C$ defined by (6.3) and (6.4) are real-valued functions of class $C^\infty$ on $\mathbb{R}$. Moreover,
(a) $S' = C$, $C' = -S$,
(b) $S(x)^2 + C(x)^2 = 1$, all $x \in \mathbb{R}$,
(c) there is a smallest positive number $p$ such that $C(p) = 0$,
(d) if $p$ is the number in (c), then
\[ S(x + 4p) = S(x), \qquad C(x + 4p) = C(x), \quad \text{all } x \in \mathbb{R}. \]
Proof. Since the exponential function $\exp(ax)$ is of class $C^\infty$ as a function of $x$ for each $a \in \mathbb{C}$, $S$ and $C$ are of class $C^\infty$. Since $(\exp(ix))^* = \exp(-ix)$, $S$ and $C$ are real-valued. In fact,
\[ C(x) = \operatorname{Re}(e^{ix}), \qquad S(x) = \operatorname{Im}(e^{ix}), \]
so
\[ e^{ix} = C(x) + iS(x). \]
Differentiation of (6.3) and (6.4) shows $S' = C$, $C' = -S$. Differentiation of $S^2 + C^2$ shows that $S(x)^2 + C(x)^2$ is constant; the value for $x = 0$ is 1.
To prove (c), we suppose that $C(x) \ne 0$ for all $x > 0$. Since $C(0) = 1$ and $C$ is continuous, the Intermediate Value Theorem implies $C(x) > 0$, all $x > 0$. Since $S' = C$, $S$ is then strictly increasing for $x \ge 0$. In particular, $S(x) \ge S(1) > 0$, all $x \ge 1$. But then
\[ 0 < C(x) = C(1) + \int_1^x C'(t)\,dt = C(1) - \int_1^x S(t)\,dt \le C(1) - \int_1^x S(1)\,dt = C(1) - (x - 1)S(1), \qquad x \ge 1. \]
But for large $x$ the last expression is negative, a contradiction. Thus $C(x) = 0$ for some $x > 0$. Let $p = \inf \{ x \mid x > 0,\ C(x) = 0 \}$. Then $p \ge 0$. There is a sequence $(x_n)_{n=1}^\infty$ such that $C(x_n) = 0$, $p \le x_n \le p + 1/n$. Thus $C(p) = 0$; since $C(0) = 1$, $p > 0$, and $p$ is the smallest positive number at which $C$ vanishes.
To prove (d) we note that
\[ 1 = S(p)^2 + C(p)^2 = S(p)^2, \]
so $S(p) = \pm 1$. But $S(0) = 0$ and $S' = C$ is positive on $[0, p)$, so $S(p) > 0$. Thus $S(p) = 1$. Consider $S(x + p)$ as a function of $x$. It satisfies (6.2), and so by uniqueness we must have
\[ S(x + p) = C(x), \qquad x \in \mathbb{R}. \]
Similarly, $-C(x + p)$ considered as a function of $x$ satisfies (6.1), so
\[ C(x + p) = -S(x), \qquad x \in \mathbb{R}. \]
Then
\[ S(x + 4p) = C(x + 3p) = -S(x + 2p) = -C(x + p) = S(x), \]
\[ C(x + 4p) = -S(x + 3p) = -C(x + 2p) = S(x + p) = C(x). \] □

We define the positive number $\pi$ by
\[ \pi = 2p, \]
$p$ the number in (c), (d) of Theorem 6.2. We define the functions sine and cosine for all $z \in \mathbb{C}$ by
\[ \sin z = \frac{1}{2i} (e^{iz} - e^{-iz}) = \sum_{n=0}^\infty [(2n + 1)!]^{-1} (-1)^n z^{2n+1}, \tag{6.5} \]
\[ \cos z = \frac{1}{2} (e^{iz} + e^{-iz}) = \sum_{n=0}^\infty [(2n)!]^{-1} (-1)^n z^{2n}. \tag{6.6} \]

Note that because of the way we have defined 7r and the sine and cosine
functions, it is necessary to prove that they have the usual geometric significance.
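The definition $\pi = 2p$ can be turned into a small computation: locate the smallest positive zero $p$ of the cosine series by bisection. The sketch below is illustrative and rests on assumptions not made in the text: it takes for granted that $p$ is the only zero of $C$ in $[0, 2]$ (so that a sign change on that interval locates it), truncates the series (6.6) at 30 terms, and uses the hypothetical names `C` and `first_zero`.

```python
import math

def C(x, terms=30):
    """Partial sum of the cosine series (6.6) for real x."""
    total, term = 0.0, 1.0                    # term = (-1)^n x^(2n) / (2n)!
    for n in range(terms):
        total += term
        term = -term * x * x / ((2 * n + 1) * (2 * n + 2))
    return total

def first_zero(lo=0.0, hi=2.0, iterations=60):
    """Bisection on [lo, hi], assuming C(lo) > 0 > C(hi) and a single zero in between."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if C(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

if __name__ == "__main__":
    p = first_zero()
    print(2 * p, math.pi)   # the two values should agree closely
```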

Theorem 6.3. Let $\gamma: [0, 2\pi] \to \mathbb{R}^2$ be defined by
\[ \gamma(t) = (\cos t, \sin t). \]
Then $\gamma$ is a 1-1 mapping of $[0, 2\pi)$ onto the unit circle about the origin in $\mathbb{R}^2$. The length of the arc of this circle from $\gamma(0)$ to $\gamma(t)$ is $t$. In particular, the length of the unit circle is $2\pi$.

Proof. We know from Theorem 6.2 that $(\cos t)^2 + (\sin t)^2 = 1$, so $(\cos t, \sin t)$ lies on the unit circle. The discussion in the proof of Theorem 6.2 shows that on the interval $[0, \tfrac12\pi]$, $\cos t$ decreases strictly from 1 to 0 while $\sin t$ increases strictly from 0 to 1. Therefore, $\gamma$ maps $[0, \tfrac12\pi]$ into the portion of the circle lying in the quadrant $x \ge 0$, $y \ge 0$ in a 1-1 manner. Furthermore, suppose $0 \le x \le 1$, $0 \le y \le 1$, and $x^2 + y^2 = 1$. By the Intermediate Value Theorem and the continuity of cosine, there is a unique $t \in [0, \tfrac12\pi]$ such that $\cos t = x$. Then $\sin t \ge 0$, $(\sin t)^2 = 1 - x^2 = y^2$, and $y \ge 0$, so $\sin t = y$. Thus $\gamma$ maps $[0, \tfrac12\pi]$ onto the portion of the circle in question.
Since $\cos(t + \tfrac12\pi) = -\sin t$ and $\sin(t + \tfrac12\pi) = \cos t$, the cosine decreases from 0 to $-1$ and the sine decreases from 1 to 0 on $[\tfrac12\pi, \pi]$. As above, we find that $\gamma$ maps this interval 1-1 and onto the portion of the circle in the quadrant $x \le 0$, $y \ge 0$. Continuing in this way we see that $\gamma$ does indeed map $[0, 2\pi)$ 1-1 onto the unit circle.
The length of the curve $\gamma$ from $\gamma(0)$ to $\gamma(t)$ is usually defined to be the limit, if it exists, of the lengths of polygonal approximations. Specifically, suppose
\[ 0 = t_0 < t_1 < t_2 < \cdots < t_n = t. \]
The sum of the lengths of the line segments joining the points $\gamma(t_{i-1})$ and $\gamma(t_i)$, $i = 1, 2, \ldots, n$ is
\[ \sum_{i=1}^n [(\cos t_i - \cos t_{i-1})^2 + (\sin t_i - \sin t_{i-1})^2]^{1/2}. \tag{6.7} \]
By the Mean Value Theorem, there are $t_i'$ and $t_i''$ between $t_{i-1}$ and $t_i$ such that
\[ \cos t_i - \cos t_{i-1} = -\sin t_i'\,(t_i - t_{i-1}), \]
\[ \sin t_i - \sin t_{i-1} = \cos t_i''\,(t_i - t_{i-1}). \]
Therefore, the sum (6.7) is
\[ \sum_{i=1}^n [(\sin t_i')^2 + (\cos t_i'')^2]^{1/2} (t_i - t_{i-1}). \]
Since sine and cosine are continuous, hence uniformly continuous on $[0, t]$, and since $(\sin t)^2 + (\cos t)^2 = 1$, it is not hard to show that as the maximum length $|t_i - t_{i-1}| \to 0$, (6.7) approaches $t$. □
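The polygonal sums (6.7) are easy to compute for an evenly spaced partition. The following sketch is illustrative only: the equal spacing $t_i = it/n$, the use of the library sine and cosine, and the name `polygonal_length` are assumptions.

```python
import math

def polygonal_length(t, n):
    """The sum (6.7) for the evenly spaced partition t_i = i * t / n of [0, t]."""
    points = [(math.cos(i * t / n), math.sin(i * t / n)) for i in range(n + 1)]
    # Add up the lengths of the segments joining consecutive points on the circle.
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

if __name__ == "__main__":
    for n in (4, 16, 256, 4096):
        print(n, polygonal_length(2 * math.pi, n))
    print(2 * math.pi)
```

Each chord is shorter than the arc it spans, so the sums increase toward $t$ from below as the partition is refined.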
This theorem shows that sine, cosine, and $\pi$ as defined above do indeed have the usual interpretation. Next we consider them as functions from $\mathbb{C}$ to $\mathbb{C}$.

Theorem 6.4. The sine, cosine, and exponential functions have the following properties:
(a) $\exp(iz) = \cos z + i \sin z$, all $z \in \mathbb{C}$,
(b) $\sin(z + 2\pi) = \sin z$, $\cos(z + 2\pi) = \cos z$, $\exp(z + 2\pi i) = \exp z$, all $z \in \mathbb{C}$,
(c) if $w \in \mathbb{C}$ and $w \ne 0$, there is a $z \in \mathbb{C}$ such that $w = \exp(z)$. If also $w = \exp(z')$, then there is an integer $n$ such that $z' = z + 2n\pi i$.

Proof. The identity (a) follows from solving (6.5) and (6.6) for $\exp(iz)$. By Theorem 6.2 and the definition of $\pi$,
\[ \exp(2\pi i) = \cos 2\pi + i \sin 2\pi = 1. \]
Then since $\exp(z + w) = \exp z \exp w$ we get
\[ \exp(z + 2\pi i) = \exp z. \]
This identity and (6.5), (6.6) imply the rest of (b).
Suppose $w \in \mathbb{C}$, $w \ne 0$. Let $r = |w|$. If $x$, $y$ are real,
\[ |\exp(iy)|^2 = |\cos y + i \sin y|^2 = (\cos y)^2 + (\sin y)^2 = 1. \]
Therefore
\[ |\exp(x + iy)| = |\exp x \exp(iy)| = |\exp x| = \exp x. \]
To have $\exp(x + iy) = w$, then, we must have $\exp x = r$. By Theorem 5.4 there is a unique such $x \in \mathbb{R}$. We also want $\exp(iy) = r^{-1} w = a + bi$. Since $|r^{-1} w| = 1$, $a^2 + b^2 = 1$. By Theorem 6.3 there is a unique $y \in [0, 2\pi)$ such that $\cos y = a$, $\sin y = b$. Then $\exp(iy) = \cos y + i \sin y = a + ib$. We have shown that there are $x, y \in \mathbb{R}$ such that if $z = x + iy$,
\[ \exp z = \exp x \exp(iy) = r r^{-1} w = w. \]
Suppose $z' = x' + iy'$, $x'$, $y'$ real, and $\exp z' = w$. Then
\[ r = |w| = |\exp z'| = |\exp x'| = \exp x', \]
so $x' = x$. There is an integer $n$ such that
\[ 2n\pi \le y' - y < 2n\pi + 2\pi, \]
or
\[ y' = y + 2n\pi + h, \qquad h \in [0, 2\pi). \]
Since $\exp z' = \exp z$, we have
\[ \exp(iy) = \exp(iy') = \exp(i(y + 2n\pi + h)) = \exp(iy + ih), \]
so
\[ 1 = \exp(-iy) \exp(iy + ih) = \exp(ih). \]
Since $h \in [0, 2\pi)$, this implies $h = 0$. Thus $y' = y + 2n\pi$. □

The trigonometric functions tangent, secant, etc., are defined for complex values by
\[ \tan z = \sin z / \cos z, \qquad z \in \mathbb{C},\ \cos z \ne 0, \]
etc.
If $w, z \in \mathbb{C}$ and $w = \exp z$, then $z$ is said to be a logarithm of $w$,
\[ z = \log w. \]
Theorem 6.4 shows that any $w \ne 0$ has a logarithm; in fact it has infinitely many, whose imaginary parts differ by integral multiples of $2\pi$. Thus $\log w$ is not a function of $w$, in the usual sense. It can be considered a function by restricting the range of values of the imaginary part. For example, if $w \ne 0$ the $z$ such that $\exp z = w$, $\operatorname{Im} z \in [a, a + 2\pi)$ is unique, for any given choice of $a \in \mathbb{R}$.
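The choice of a branch by restricting $\operatorname{Im} z$ to $[a, a + 2\pi)$ can be sketched as follows. This is an illustration only: the function name `logarithm` is hypothetical, and it relies on the standard library's `cmath.phase`, which returns an angle in $[-\pi, \pi]$ that is then shifted into the requested interval. Floating-point rounding at the very ends of the interval is not handled.

```python
import cmath
import math

def logarithm(w, a=0.0):
    """The z with exp(z) = w and a <= Im z < a + 2*pi, for w != 0."""
    if w == 0:
        raise ValueError("0 has no logarithm")
    x = math.log(abs(w))                      # real part: the unique real x with exp(x) = |w|
    y = cmath.phase(w)                        # some angle with exp(iy) = w / |w|
    y = a + (y - a) % (2 * math.pi)           # shift by a multiple of 2*pi into [a, a + 2*pi)
    return complex(x, y)

if __name__ == "__main__":
    w = -1 + 1j
    for a in (0.0, -math.pi, 2 * math.pi):
        z = logarithm(w, a)
        print(a, z, cmath.exp(z))             # exp(z) should reproduce w for every branch
```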
If $x > 0$, it is customary to take for $\log x$ the unique real $y$ such that $\exp y = x$. Thus as a function from $(0, \infty)$ to $\mathbb{R}$, the logarithm is the inverse of the exponential function. Theorem 3.7 shows that it is differentiable, with
\[ \frac{d}{dx}(\log x) = \left( \frac{d}{dy} e^y \Big|_{y = \log x} \right)^{-1} = e^{-\log x} = x^{-1}. \]
Thus
\[ \log x = \int_1^x t^{-1}\,dt, \qquad x > 0. \tag{6.8} \]

Exercises
1. Prove the identities
\[ \sin(z + w) = \sin z \cos w + \cos z \sin w, \]
\[ \cos(z + w) = \cos z \cos w - \sin z \sin w \]
for all complex $z$, $w$. (Hint: use (6.5) and (6.6).)
2. Show that $\tan x$ is a strictly increasing function from $(-\tfrac12\pi, \tfrac12\pi)$ onto $\mathbb{R}$. Show that the inverse function $\tan^{-1} x$ satisfies
\[ \frac{d}{dx}(\tan^{-1} x) = (1 + x^2)^{-1}. \]
3. Show that
\[ \int_{-\infty}^{\infty} (1 + x^2)^{-1}\,dx = \pi. \]
4. Show that
\[ \log(1 + x) = \int_0^x (1 + t)^{-1}\,dt, \qquad -1 < x < \infty. \]
5. Show that
\[ \log(1 + x) = \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n} x^n, \qquad -1 < x < 1. \]
(Hint: use Exercise 4.)

7. Functions of two variables


Suppose $A$ is an open subset of $\mathbb{R}^2$, i.e., for each $(x_0, y_0) \in A$ there is an open disc with center $(x_0, y_0)$ contained in $A$:
\[ A \supset \{ (x, y) \mid (x - x_0)^2 + (y - y_0)^2 < r^2 \}, \quad \text{some } r > 0. \]
In particular, $A$ contains $(x, y_0)$ for each $x$ in the open interval $x_0 - r < x < x_0 + r$, and $A$ contains $(x_0, y)$ for each $y$ in the open interval $y_0 - r < y < y_0 + r$.
Suppose $f: A \to \mathbb{C}$. It makes sense to ask whether $f(x, y_0)$ is differentiable as a function of $x$ at $x_0$. If so, we denote the derivative by
\[ D_1 f(x_0, y_0) = \lim_{x \to x_0} [f(x, y_0) - f(x_0, y_0)](x - x_0)^{-1}. \]
Other common notations are
\[ \frac{\partial f}{\partial x}(x_0, y_0), \qquad f_x(x_0, y_0). \]
Similarly, if $f(x_0, y)$ is differentiable at $y_0$ as a function of $y$ we set
\[ D_2 f(x_0, y_0) = \lim_{y \to y_0} [f(x_0, y) - f(x_0, y_0)](y - y_0)^{-1}. \]
The derivatives $D_1 f$, $D_2 f$ are called the first order partial derivatives of $f$. The second order partial derivatives are the first order derivatives of the first order derivatives:
\[ D_1^2 f = D_1(D_1 f), \qquad D_2^2 f = D_2(D_2 f), \]
\[ D_1 D_2 f = D_1(D_2 f), \qquad D_2 D_1 f = D_2(D_1 f). \]
Other notations are
\[ \frac{\partial^2 f}{\partial y\,\partial x}, \quad \text{etc.} \]
Higher order partial derivatives are defined similarly. An $(n + 1)$-order partial derivative of $f$ is $D_1 g$ or $D_2 g$, where $g$ is an $n$-order partial derivative of $f$. The function $f: A \to \mathbb{C}$ is said to be $n$-times continuously differentiable, or of class $C^n$, if all the partial derivatives of $f$ of order $\le n$ exist and are continuous at each point of $A$. If this is true for every integer $n$, then $f$ is said to be infinitely differentiable, or of class $C^\infty$.

Theorem 7.1. (Equality of mixed partial derivatives). If I:

class C 2, then DIDd = D 2DIf.

A -l> C is 01

Proof. Suppose (a, b) EA. Choose r > 0 so small that A contains the
closed square with center (a, b), edges parallel to the coordinate axes, and
sides of length 2r. Thus (x, y) E A if
Ix - al ::5: rand

Iy -

bl

::5: r.

In this square we apply the fundamental theorem of calculus to I as a function


of x with y fixed, and conclude
I(x, y) =

Let g(y)

LX Dd(s, y) ds + I(a, y).

I(a, y). We claim that

(7.1)

Dd(x, y)

= LX D 2Dd(s, y) ds + g'(y).

If so, then differentiation with respect to x shows DIDd = D2DIf. To prove


(7.1) we consider
e- 1 (f(x, y

+ e)

- I(x, y)) -

[e- 1(Dd(s,y

IX D 2Dd(s, y) ds + e)

g'(y)

- Dd(s,y)) - D 2 Dd(s,y)]ds

[e- 1 (g(y

+ e)

- g(y)) - g'(y)].

The second term in brackets $\to 0$ as $\varepsilon \to 0$. If $f$ is real-valued, we may apply the Mean Value Theorem to the first term and conclude that for each $s$ and $y$ and for each small $\varepsilon$, there is a point $y' = y'(s, y, \varepsilon)$ between $y$ and $y + \varepsilon$ such that
$$\varepsilon^{-1}\bigl(D_1 f(s, y+\varepsilon) - D_1 f(s, y)\bigr) - D_2 D_1 f(s, y) = D_2 D_1 f(s, y') - D_2 D_1 f(s, y). \tag{7.2}$$
Now $|y' - y| < \varepsilon$, so $|(s, y') - (s, y)| < \varepsilon$. Since $D_2 D_1 f$ is uniformly continuous on the square $|x - a| \le r$, $|y - b| \le r$, it follows that the maximum value of (7.2) converges to zero as $\varepsilon \to 0$. This implies convergence to zero of the integral of (7.2) with respect to $s$, proving (7.1) when $f$ is real. In the general case, we look at the real and imaginary parts of $f$ separately. □
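As a purely numerical illustration of Theorem 7.1 (not part of the proof), the sketch below takes a hypothetical test function $f(x, y) = \sin(xy) + x^3 y^2$, whose first order partial derivatives are written out by hand, and differentiates each of them once more by a central difference. The function, the step size `h`, and the sample point are all assumptions chosen for the example.

```python
import math

# Hypothetical test function: f(x, y) = sin(x*y) + x**3 * y**2.
def d1f(x, y):
    """D_1 f, computed by hand."""
    return y * math.cos(x * y) + 3 * x**2 * y**2

def d2f(x, y):
    """D_2 f, computed by hand."""
    return x * math.cos(x * y) + 2 * x**3 * y

def central_diff(g, t, h=1e-5):
    """Central-difference approximation to g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)

def mixed_partials(x, y):
    """Return numerical approximations to (D_1 D_2 f, D_2 D_1 f) at (x, y)."""
    d1_d2f = central_diff(lambda s: d2f(s, y), x)  # differentiate D_2 f in x
    d2_d1f = central_diff(lambda t: d1f(x, t), y)  # differentiate D_1 f in y
    return d1_d2f, d2_d1f

x0, y0 = 0.7, -0.3
a, b = mixed_partials(x0, y0)
# Closed form of the common value, also computed by hand.
exact = math.cos(x0 * y0) - x0 * y0 * math.sin(x0 * y0) + 6 * x0**2 * y0
print(a, b, exact)
```

The two approximations should agree with each other and with the closed form up to the finite-difference error.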
Remarks. In the course of proving Theorem 7.1 we have, in effect, proved the following. If $f$ is a complex-valued function of class $C^1$ defined on a rectangle $|x - a| < r_1$, $|y - b| < r_2$, then the derivative with respect to $y$ of
$$\int_a^x f(s, y)\,ds$$
is
$$\int_a^x D_2 f(s, y)\,ds.$$
Similarly, the derivative with respect to $x$ of
$$\int_b^y f(x, t)\,dt$$
is
$$\int_b^y D_1 f(x, t)\,dt.$$
We need one more result of this sort: if $a = b$, $r_1 = r_2$, then
$$F(y) = \int_a^y f(s, y)\,ds$$
is defined for $|y - a| < r_1$. The derivative is
$$F'(y) = \int_a^y D_2 f(s, y)\,ds + f(y, y). \tag{7.3}$$
In fact
$$F(y + \varepsilon) - F(y) = \int_a^y [f(s, y + \varepsilon) - f(s, y)]\,ds + \int_y^{y+\varepsilon} f(s, y + \varepsilon)\,ds.$$
Divide by $\varepsilon$ and let $\varepsilon \to 0$. By the argument above, the first integral converges to
$$\int_a^y D_2 f(s, y)\,ds.$$
In the second integral, we are integrating a function whose values are very close to $f(y, y)$, over an interval of length $\varepsilon$. Then, dividing by $\varepsilon$, we see that the limit is $f(y, y)$.
We need two results on change of order of integration.

Theorem 7.2. Suppose $f$ is a continuous complex-valued function on the rectangle
$$A = \{(x, y) \mid a \le x \le b,\ c \le y \le d\}.$$
Then the functions
$$g(x) = \int_c^d f(x, t)\,dt, \qquad h(y) = \int_a^b f(s, y)\,ds$$
are continuous, and
$$\int_a^b g(x)\,dx = \int_c^d h(y)\,dy.$$

Proof. Since $f$ is uniformly continuous on $A$, the functions $g$ and $h$ are continuous. More generally,
$$\int_c^y f(s, t)\,dt, \qquad \int_a^x f(s, t)\,ds$$
are continuous functions of $s$ and of $t$ respectively. Define
$$F_1(x, y) = \int_a^x \Bigl\{\int_c^y f(s, t)\,dt\Bigr\}\,ds, \qquad F_2(x, y) = \int_c^y \Bigl\{\int_a^x f(s, t)\,ds\Bigr\}\,dt.$$
We want to show that $F_1(b, d) = F_2(b, d)$. The remarks preceding this theorem show that
$$D_2 F_1(x, y) = \int_a^x f(s, y)\,ds = D_2 F_2(x, y).$$
Therefore, $F_2 - F_1$ is constant along each vertical line segment in the rectangle $A$. Similarly, $D_1 F_1 = D_1 F_2$, so $F_2 - F_1$ is constant along each horizontal line segment. Since $F_1(a, c) = F_2(a, c) = 0$, $F_1 \equiv F_2$. □
The next theorem describes the analogous situation for integration over a triangle.

Theorem 7.3. Suppose $f$ is a continuous complex-valued function defined on the triangle
$$A = \{(x, y) \mid 0 \le x \le a,\ 0 \le y \le x\}.$$
Then
$$\int_0^a \Bigl\{\int_0^x f(x, y)\,dy\Bigr\}\,dx = \int_0^a \Bigl\{\int_y^a f(x, y)\,dx\Bigr\}\,dy.$$

Proof. Consider the two functions of $t$, $0 \le t \le a$, defined by
$$\int_0^t \Bigl\{\int_0^x f(x, y)\,dy\Bigr\}\,dx, \qquad \int_0^t \Bigl\{\int_y^t f(x, y)\,dx\Bigr\}\,dy.$$
By the remarks following Theorem 7.1, the derivatives of these functions with respect to $t$ are
$$\int_0^t f(t, y)\,dy, \qquad \int_0^t f(t, y)\,dy + 0;$$
the extra term in the second derivative is the inner integral taken over the degenerate interval from $t$ to $t$, which vanishes. Thus these functions differ by a constant. Since both are zero when $t = 0$, they are identical. □
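The following sketch checks Theorem 7.3 numerically for one example. The integrand $f(x, y) = \cos x \, e^y$, the value $a = 1.5$, and the midpoint rule with `n` subintervals are assumptions made for illustration; the closed form used for comparison was computed by hand for this particular $f$.

```python
import math

def midpoint(g, lo, hi, n=200):
    """Midpoint-rule approximation to the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def f(x, y):
    # Hypothetical continuous integrand on the triangle 0 <= y <= x <= a.
    return math.cos(x) * math.exp(y)

a = 1.5

# Integrate in y first (0 <= y <= x), then in x.
dy_first = midpoint(lambda x: midpoint(lambda y: f(x, y), 0.0, x), 0.0, a)

# Integrate in x first (y <= x <= a), then in y.
dx_first = midpoint(lambda y: midpoint(lambda x: f(x, y), y, a), 0.0, a)

# Closed form of the iterated integral for this particular f.
exact = math.exp(a) * (math.sin(a) + math.cos(a)) / 2 - 0.5 - math.sin(a)
print(dy_first, dx_first, exact)
```

Both orders of integration should give the same value up to the quadrature error.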
Finally we need to discuss polar coordinates. If $(x, y) \in \mathbb{R}^2$ and $(x, y) \ne (0, 0)$, let
$$r = (x^2 + y^2)^{1/2}.$$
Then
$$(r^{-1}x)^2 + (r^{-1}y)^2 = 1,$$
so there is a unique $\theta$, $0 \le \theta < 2\pi$, such that $\cos\theta = r^{-1}x$, $\sin\theta = r^{-1}y$. This means
$$x = r\cos\theta, \qquad y = r\sin\theta,$$
$$r = (x^2 + y^2)^{1/2}, \qquad \theta = \tan^{-1}(y/x).$$
Thus any point $p$ of the plane other than the origin is determined uniquely either by its Cartesian coordinates $(x, y)$ or by its polar coordinates $r, \theta$. A function defined on a subset of $\mathbb{R}^2$ can be expressed either as $f(x, y)$ or $g(r, \theta)$. These are related by
$$f(x, y) = g\bigl((x^2 + y^2)^{1/2}, \tan^{-1}(y/x)\bigr), \tag{7.4}$$
$$g(r, \theta) = f(r\cos\theta, r\sin\theta). \tag{7.5}$$

Theorem 7.4. Suppose $f$ is a continuous complex-valued function defined on the disc
$$D_R = \{(x, y) \mid x^2 + y^2 < R^2\}.$$
Suppose $g$ is related to $f$ by (7.5). Then
$$\int_{-R}^{R} \Bigl\{\int_{-(R^2 - y^2)^{1/2}}^{(R^2 - y^2)^{1/2}} f(x, y)\,dx\Bigr\}\,dy = \int_0^R \Bigl\{\int_0^{2\pi} g(r, \theta)\,r\,d\theta\Bigr\}\,dr.$$

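Proof. Look first at the quadrant $x \ge 0$, $y \ge 0$. For a fixed $y \ge 0$, if $x \ge 0$ then $x = (r^2 - y^2)^{1/2}$. Proposition 3.9 on coordinate changes gives
$$\int_0^{(R^2 - y^2)^{1/2}} f(x, y)\,dx = \int_y^R f\bigl((r^2 - y^2)^{1/2}, y\bigr)(r^2 - y^2)^{-1/2}\,r\,dr.$$
We integrate the integral on the right over $0 \le y \le R$, and use Theorem 7.3 to get
$$\int_0^R \Bigl\{\int_0^r f\bigl((r^2 - y^2)^{1/2}, y\bigr)(r^2 - y^2)^{-1/2}\,dy\Bigr\}\,r\,dr.$$
Let $y = r\sin\theta$, $\theta \in [0, \tfrac12\pi]$. Then $(r^2 - y^2)^{1/2} = r\cos\theta$. We may apply Proposition 3.9 to the preceding integral to obtain the integral of $g\,r$ over $0 \le \theta \le \tfrac12\pi$, $0 \le r \le R$. Similar arguments apply to the other three quadrants. □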
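A small numerical check of Theorem 7.4 can make the change of variables concrete. In this sketch the integrand $f(x, y) = x^2 + y$, the radius $R = 1$, and the midpoint rule are assumptions chosen for illustration; for this $f$ both iterated integrals equal $\pi/4$.

```python
import math

def midpoint(g, lo, hi, n=300):
    """Midpoint-rule approximation to the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def f(x, y):
    # Hypothetical continuous integrand on the disc.
    return x * x + y

def g(r, theta):
    # Relation (7.5): the same function in polar coordinates.
    return f(r * math.cos(theta), r * math.sin(theta))

R = 1.0

def half_width(y):
    """Half-length of the horizontal chord of the disc at height y."""
    return math.sqrt(max(R * R - y * y, 0.0))

# Left side: iterated integral in Cartesian coordinates.
cartesian = midpoint(
    lambda y: midpoint(lambda x: f(x, y), -half_width(y), half_width(y)),
    -R, R)

# Right side: iterated integral in polar coordinates, with the factor r.
polar = midpoint(
    lambda r: midpoint(lambda th: g(r, th) * r, 0.0, 2 * math.pi),
    0.0, R)

print(cartesian, polar, math.pi / 4)
```

Dropping the factor `r` in the polar integrand changes the answer, which is a quick way to see why it is needed.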


Exercises

1. Suppose $A \subset \mathbb{R}^2$ is open and suppose $g: A \to \mathbb{C}$ and $h: A \to \mathbb{C}$ are of class $C^1$. Show that a necessary condition for the existence of $f: A \to \mathbb{C}$ such that
$$D_1 f = g, \qquad D_2 f = h$$
is that $D_2 g = D_1 h$.

2. In Exercise 1, suppose $A$ is a disc $\{(x, y) \mid (x - x_0)^2 + (y - y_0)^2 < R^2\}$. Show that the condition $D_2 g = D_1 h$ is sufficient. (Hint: consider
$$f(x, y) = \int_{x_0}^{x} g(s, y)\,ds + \int_{y_0}^{y} h(x_0, t)\,dt.)$$

8. Some infinitely differentiable functions


In §4 it was shown that any power series with a positive radius of convergence defines an infinitely differentiable function where it converges:
$$f(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n.$$
We know
$$a_n = (n!)^{-1} f^{(n)}(x_0).$$
In particular, if all derivatives of $f$ are zero at $x_0$, then $f$ is identically zero. There are infinitely differentiable functions which do not have this property.

Proposition 8.1. There is an infinitely differentiable function $f: \mathbb{R} \to \mathbb{R}$ such that
$$f(x) = 0, \quad x \le 0,$$
$$f(x) > 0, \quad x > 0.$$

Proof. We define $f$ by
$$f(x) = 0, \quad x \le 0,$$
$$f(x) = \exp(-1/x), \quad x > 0.$$

Near any point $x \ne 0$, $f$ is the composition of two infinitely differentiable functions. Repeated use of the chain rule shows that $f$ is, therefore, infinitely differentiable except possibly at zero.

Let us show that $f$ is continuous at 0. If $y > 0$, then
$$e^y = \sum_{n=0}^{\infty} (n!)^{-1} y^n > (m!)^{-1} y^m, \qquad m = 0, 1, \ldots.$$
Thus if $x > 0$,
$$f(x) = \exp(-1/x) = \exp(1/x)^{-1} < m!\,(1/x)^{-m} = m!\,x^m,$$
$m = 0, 1, \ldots$. In particular, $f(x) \to 0$ as $x \to 0$.
It is easy to show by induction that for $x > 0$,
$$f^{(k)}(x) = p_k(x^{-1}) f(x),$$
where $p_k$ is a polynomial of degree $\le 2k$. Of course, this equation also holds for $x < 0$. Suppose we have shown that $f^{(k)}$ exists and is continuous at 0; then of course $f^{(k)}(0) = 0$. We have
$$\bigl|[f^{(k)}(x) - f^{(k)}(0)]x^{-1}\bigr| = |f^{(k)}(x) x^{-1}| = |x^{-1} p_k(x^{-1}) f(x)|. \tag{8.1}$$
Since $p_k$ is of degree $\le 2k$ and
$$|f(x)| \le (2k + 2)!\,x^{2k+2}, \qquad x > 0,$$
it follows that the right side of (8.1) converges to zero as $x \to 0$. Thus $f^{(k+1)}(0)$ exists and is zero. Similarly, $f^{(k+1)}(x) = p_{k+1}(x^{-1}) f(x) \to 0$ as $x \to 0$, so $f^{(k+1)}$ is continuous. □
Note that all derivatives of the preceding function vanish at zero, but f is
not identically zero. Therefore f does not have a convergent power series
expansion around zero.
Corollary 8.2. Suppose $a < b$. There is an infinitely differentiable function $g: \mathbb{R} \to \mathbb{R}$ such that
$$g(x) = 0, \quad x \notin (a, b),$$
$$g(x) > 0, \quad x \in (a, b).$$

Proof. Let $f$ be the function in Proposition 8.1 and let
$$g(x) = f(x - a) f(b - x).$$
This is positive where $x - a > 0$ and $b - x > 0$, and is zero elsewhere. It is clearly of class $C^\infty$. □
Corollary 8.3. Suppose $a < b$. There is an infinitely differentiable function $h: \mathbb{R} \to \mathbb{R}$ such that
$$h(x) = 0, \quad x \le a,$$
$$0 < h(x) < 1, \quad a < x < b,$$
$$h(x) = 1, \quad x \ge b.$$

Proof. Let $g$ be the function in Corollary 8.2 and let
$$h(x) = c \int_{-\infty}^{x} g(t)\,dt,$$
where $c > 0$ is chosen so that $h(b) = 1$. Then $h' = cg \ge 0$, $h$ is constant outside $(a, b)$, etc. □
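The three constructions of this section are easy to experiment with. The sketch below is an illustration only: the names `f`, `g`, `h` follow the text, while the default interval $(0, 1)$ and the midpoint-rule approximation of the integral defining $h$ are assumptions made for the example.

```python
import math

def f(x):
    """The function of Proposition 8.1: zero for x <= 0, exp(-1/x) for x > 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def g(x, a=0.0, b=1.0):
    """The bump of Corollary 8.2: positive exactly on (a, b)."""
    return f(x - a) * f(b - x)

def midpoint(func, lo, hi, n=2000):
    """Midpoint-rule approximation to the integral of func over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(func(lo + (i + 0.5) * step) for i in range(n))

def h(x, a=0.0, b=1.0):
    """Approximate smooth step of Corollary 8.3: 0 for x <= a, 1 for x >= b."""
    if x <= a:
        return 0.0
    total = midpoint(lambda t: g(t, a, b), a, b)          # equals 1/c
    part = midpoint(lambda t: g(t, a, b), a, min(x, b))   # g vanishes beyond b
    return part / total

for x in (-1.0, 0.0, 0.25, 0.5, 0.75, 1.0, 2.0):
    print(x, f(x), g(x), h(x))
```

Since $g$ is symmetric about the midpoint of $(a, b)$, the value of `h` at the midpoint should be close to one half.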

Chapter 3

Periodic Functions and Periodic Distributions


1. Continuous periodic functions

Suppose $u$ is a complex-valued function defined on the real line $\mathbb{R}$. The function $u$ is said to be periodic with period $a \ne 0$ if
$$u(x + a) = u(x)$$
for each $x \in \mathbb{R}$. If this is so, then also
$$u(x + 2a) = u((x + a) + a) = u(x + a) = u(x),$$
$$u(x - a) = u((x - a) + a) = u(x).$$
Thus $u$ is also periodic with period $2a$ and with period $-a$. More generally, $u$ is periodic with period $na$ for each integer $n$. If $u$ is periodic with period $a \ne 0$, then the function $v$,
$$v(x) = u(|a|x/2\pi),$$
is periodic with period $2\pi$. It is convenient to choose a fixed period for our study of periodic functions, and the period $2\pi$ is particularly convenient. From now on the statement "$u$ is periodic" will mean "$u$ is periodic with period $2\pi$." In this section we are concerned with continuous periodic functions. We denote the set of all continuous periodic functions from $\mathbb{R}$ to $\mathbb{C}$ by $\mathcal{C}$. This set includes, in particular, the functions
$$\sin nx, \qquad \cos nx, \qquad \exp(inx) = \cos nx + i\sin nx$$
for each integer $n$.


The set $\mathcal{C}$ can be considered a vector space in a natural way. We define the operations of addition and scalar multiplication by
$$(u + v)(x) = u(x) + v(x), \qquad u, v \in \mathcal{C},\ x \in \mathbb{R}; \tag{1}$$
$$(au)(x) = a\,u(x), \qquad u \in \mathcal{C},\ a \in \mathbb{C},\ x \in \mathbb{R}. \tag{2}$$
It is easily checked that the functions $u + v$ and $au$ are periodic. By Proposition 1.1 of Chapter 2, they are also continuous. Thus $u + v \in \mathcal{C}$, $au \in \mathcal{C}$. The axioms V1-V8 for a vector space are easily verified. We note also that there is a natural multiplication of elements of $\mathcal{C}$,
$$(uv)(x) = u(x)v(x), \qquad u, v \in \mathcal{C},\ x \in \mathbb{R}.$$

The set $\mathcal{C}$ may also be considered as a metric space. Since the interval $[0, 2\pi]$ is a compact set in $\mathbb{R}$ and since $u \in \mathcal{C}$ is continuous,
$$\sup_{x \in [0, 2\pi]} |u(x)| < \infty.$$
We define the norm of $u \in \mathcal{C}$ to be the real number $|u|$ where
$$|u| = \sup_{x \in \mathbb{R}} |u(x)| = \sup_{x \in [0, 2\pi]} |u(x)|. \tag{3}$$
The norm (3) has the following properties:
$$|u| \ge 0, \quad\text{and}\quad |u| = 0 \ \text{only if}\ u(x) = 0,\ \text{all } x; \tag{4}$$
$$|au| = |a||u|, \qquad a \in \mathbb{C},\ u \in \mathcal{C}; \tag{5}$$
$$|u + v| \le |u| + |v|, \qquad u, v \in \mathcal{C}. \tag{6}$$
The properties (4) and (5) are easily checked. As for (6), suppose $x \in \mathbb{R}$. Then
$$|(u + v)(x)| = |u(x) + v(x)| \le |u(x)| + |v(x)| \le |u| + |v|.$$
Since this is true for every $x \in \mathbb{R}$, (6) is true.


To make $\mathcal{C}$ a metric space, we set
$$d(u, v) = |u - v|. \tag{7}$$

Theorem 1.1. The set $\mathcal{C}$ of continuous periodic functions is a vector space with the operations defined by (1) and (2). The set $\mathcal{C}$ is also a metric space with respect to the metric $d$ defined by (7), and it is complete.

Proof. As we noted above, checking that $\mathcal{C}$ satisfies the axioms for a vector space is straightforward. The axioms for a metric space are also easily checked, using (4), (5), and (6). For example,
$$d(u, w) = |u - w| = |(u - v) + (v - w)| \le |u - v| + |v - w| = d(u, v) + d(v, w).$$
Finally, suppose $(u_n)_{n=1}^{\infty}$ is a Cauchy sequence of functions in $\mathcal{C}$. By Theorem 4.1 of Chapter 2, there is a continuous function $u: \mathbb{R} \to \mathbb{C}$ such that $|u_n - u| \to 0$. Clearly $u$ is periodic, so $u \in \mathcal{C}$ and $\mathcal{C}$ is complete. □
Sets which are simultaneously vector spaces and metric spaces of this sort are common enough and important enough to have been named and studied in the abstract. Suppose $X$ is a real or complex vector space. A norm on $X$ is a function assigning to each $u \in X$ a real number $|u|$, such that
$$|u| \ge 0, \quad\text{and}\quad |u| = 0 \ \text{implies}\ u = 0;$$
$$|au| = |a||u|, \qquad a \text{ scalar},\ u \in X;$$
$$|u + v| \le |u| + |v|, \qquad u, v \in X.$$
A normed linear space is a vector space $X$ together with a norm $|u|$. Associated to the norm is the metric
$$d(u, v) = |u - v|.$$
If the normed linear space is complete with respect to this metric, it is said to be a Banach space.

In this terminology, Theorem 1.1 has a very brief statement: $\mathcal{C}$ is a Banach space.

Suppose $X$ is a complex normed linear space. A linear functional $F: X \to \mathbb{C}$ is said to be bounded if there is a constant $c \ge 0$ such that
$$|F(u)| \le c|u|, \qquad \text{all } u \in X.$$

Proposition 1.2. A linear functional $F$ on a normed linear space $X$ is continuous if and only if it is bounded.

Proof. Suppose $F$ is bounded. Then
$$|F(u) - F(v)| = |F(u - v)| \le c|u - v|,$$
so $|F(u) - F(v)| < \varepsilon$ if $|u - v| < c^{-1}\varepsilon$.

Conversely, suppose $F$ is continuous. There is a $\delta > 0$ such that $|u| = |u - 0| \le \delta$ implies
$$|F(u)| = |F(u) - F(0)| \le 1.$$
For any $u \ne 0$, $u \in X$, the vector $v = \delta|u|^{-1}u$ has norm $\delta$. Therefore
$$|F(u)| = |F(\delta^{-1}|u|v)| = \delta^{-1}|u||F(v)| \le \delta^{-1}|u|,$$
and $F$ is bounded. □

It is important both in theory and practice to determine all the continuous linear functionals on a given space of functions. The reason is that
many problems, in theory and in practice, can be interpreted as problems
about existence or uniqueness of linear functionals satisfying given conditions. The examples below show that it is not obvious that there is any
way to give a unified description of all the continuous linear functionals on
$\mathcal{C}$. In fact one can give such a description (in terms of Riemann-Stieltjes
integrals, or integrals with respect to a bounded Borel measure), but we shall
not do this here. Instead we introduce a second useful space of periodic
functions and determine the continuous linear functionals on this second
space.

Exercises

1. Suppose $(a_n)_{n=-\infty}^{\infty}$ is a (two-sided) sequence of complex numbers such that
$$\sum_{n=-\infty}^{\infty} |a_n| < \infty;$$
here we take the infinite sum to be
$$\sum_{n=-\infty}^{\infty} |a_n| = \sum_{n=0}^{\infty} |a_n| + \sum_{n=1}^{\infty} |a_{-n}|.$$
Show that the function $u$ defined by
$$u(x) = \sum_{n=-\infty}^{\infty} a_n \exp(inx)$$
is continuous and periodic.

2. Suppose $u: \mathbb{R} \to \mathbb{C}$ is a continuous function and suppose there is a constant $M$ such that $u(x) = 0$ if $|x| \ge M$. Show that for any $x$ the series
$$v(x) = \sum_{n=-\infty}^{\infty} u(x + 2n\pi)$$
converges. Show that the function $v$ is in $\mathcal{C}$.


3. If $u \in \mathcal{C}$, define the real number $|u|'$ by
$$|u|' = (2\pi)^{-1} \int_0^{2\pi} |u(x)|\,dx.$$
Show that $|u|'$ is a norm on $\mathcal{C}$ and that $|u|' \le |u|$.


4. Suppose $d'$ is the metric associated with the norm $|u|'$ in Exercise 3. Show that $\mathcal{C}$ is not complete with respect to this metric. (Hint: take a sequence $(u_n)_{n=1}^{\infty}$ of functions in $\mathcal{C}$ such that
$$0 \le u_n(x) \le 1, \qquad x \in \mathbb{R},\ n = 1, 2, \ldots,$$
$$u_n(x) = 0, \qquad x \in [0, \pi/2 - 1/n] \cup [3\pi/2 + 1/n, 2\pi],$$
$$u_n(x) = 1, \qquad x \in [\pi/2, 3\pi/2].$$
Then $|u_n - u_m|' \to 0$ as $n, m \to \infty$. If $u \in \mathcal{C}$, there is an open interval $(\pi/2 - \delta, \pi/2 + \delta)$ on which either $|u(x)| > \tfrac13$ or $|u(x) - 1| > \tfrac13$. Show that $|u_n - u|' > \delta/6\pi$ for large values of $n$.)
5. Which of the following are bounded linear functionals on $\mathcal{C}$, with respect to the norm $|u|$?

(a) $F(u) = u(\pi/2)$,
(b) $F(u) = \int_0^{2\pi} \sin nx\; u(x)\,dx$,
(c) $F(u) = \int_0^{2\pi} (u(x))^2\,dx$,
(d) $F(u) = 17u(0) + \int_0^{2\pi} u(x)\,dx$,
(e) $F(u) = -3|u(0)|$.

6. Suppose $X$ is a normed linear space. Let $X'$ be the set of all bounded linear functionals on $X$. Then $X'$ is a vector space. For $F \in X'$, let
$$|F| = \sup\{|F(u)| \mid u \in X,\ |u| \le 1\}.$$
Show that $|F|$ is a norm on $X'$. Show that for any $u \in X$ and $F \in X'$,
$$|F(u)| \le |F||u|.$$
Show that $X'$ is a Banach space with respect to this norm.

2. Smooth periodic functions


Suppose $u: \mathbb{R} \to \mathbb{C}$ is a continuous periodic function, and suppose that the derivative
$$Du(x) = u'(x)$$
exists for each $x \in \mathbb{R}$. Then $Du$ is also periodic:
$$Du(x + 2\pi) = \lim_{h \to 0} h^{-1}[u(x + 2\pi + h) - u(x + 2\pi)] = \lim_{h \to 0} h^{-1}[u(x + h) - u(x)] = Du(x).$$
In particular, if $u$ is infinitely differentiable and periodic, then each derivative $Du, D^2u, \ldots, D^ku, \ldots$ is in $\mathcal{C}$.
We shall denote by $\mathcal{P}$ the subset of $\mathcal{C}$ which consists of all functions $u \in \mathcal{C}$ which are smooth, i.e., infinitely differentiable. Such a function will be called a smooth periodic function. If $u$ is in $\mathcal{P}$, then the derivatives $Du, D^2u, \ldots$ are also in $\mathcal{P}$.

The set $\mathcal{P}$ is a subspace of $\mathcal{C}$ in the sense of vector spaces, so it is itself a vector space. The function $|\sin x|$ is in $\mathcal{C}$ but not in $\mathcal{P}$, so $\mathcal{P} \ne \mathcal{C}$. We could consider $\mathcal{P}$ as a metric space with respect to the metric on $\mathcal{C}$ given in the previous section, but we shall see later that $\mathcal{P}$ is not complete with respect to that metric. To be able to consider $\mathcal{P}$ as a complete space we shall introduce a new notion of convergence for functions in $\mathcal{P}$.

A sequence of functions $(u_n)_{n=1}^{\infty} \subset \mathcal{P}$ is said to converge to $u \in \mathcal{P}$ in the sense of $\mathcal{P}$ if for each $k = 0, 1, 2, \ldots$,
$$|D^k u_n - D^k u| \to 0 \quad\text{as } n \to \infty.$$
(Here $D^0 u = u$.) We denote this by
$$u_n \to u \ (\mathcal{P}).$$
Thus $(u_n)_{n=1}^{\infty}$ converges to $u$ in the sense of $\mathcal{P}$ if and only if each derivative of $u_n$ converges uniformly to the corresponding derivative of $u$ as $n \to \infty$.

A sequence of functions $(u_n)_{n=1}^{\infty}$ is said to be a Cauchy sequence in the sense of $\mathcal{P}$ if for each $k = 0, 1, \ldots$, $(D^k u_n)_{n=1}^{\infty}$ is a Cauchy sequence in $\mathcal{C}$. Thus
$$|D^k u_n - D^k u_m| \to 0 \quad\text{as } n, m \to \infty$$
for each $k$.

When there is no danger of confusion we shall speak simply of "convergence" and of a "Cauchy sequence," without referring to "the sense of $\mathcal{P}$." The statement of the following theorem is to be understood in this way.
Theorem 2.1. The set $\mathcal{P}$ of all smooth periodic functions is a vector space. If $(u_n)_{n=1}^{\infty} \subset \mathcal{P}$ is a Cauchy sequence, then it converges to a function $u \in \mathcal{P}$.

Proof. As noted above, $\mathcal{P}$ is a subspace of the vector space $\mathcal{C}$: if $u, v \in \mathcal{P}$ and $a \in \mathbb{C}$, then $u + v \in \mathcal{P}$, $au \in \mathcal{P}$. Thus $\mathcal{P}$ is a vector space.

Suppose $(u_n)_{n=1}^{\infty}$ is a Cauchy sequence. For each $k$ the sequence of derivatives $(D^k u_n)_{n=1}^{\infty}$ is a Cauchy sequence in $\mathcal{C}$. Therefore it converges uniformly to a function $v_k \in \mathcal{C}$. For $k = 0, 1, 2, \ldots$,
$$D^k u_n(x) = D^k u_n(0) + \int_0^x D^{k+1} u_n(t)\,dt.$$
By Theorem 4.2 of Chapter 2,
$$v_k(x) = \lim_{n \to \infty} D^k u_n(x) = \lim_{n \to \infty} D^k u_n(0) + \lim_{n \to \infty} \int_0^x D^{k+1} u_n(t)\,dt = v_k(0) + \int_0^x v_{k+1}(t)\,dt.$$
Therefore $Dv_k = v_{k+1}$, all $k$. This means that if $u = v_0$, then $v_k = D^k u$ and $|D^k u_n - D^k u| \to 0$ as $n \to \infty$. Thus $u_n \to u$ (in the sense of $\mathcal{P}$). □
The remainder of this section is not necessary for the subsequent development. We show that there is no way of choosing a norm on $\mathcal{P}$ so that convergence as defined above is equivalent to convergence in the sense of the metric associated with the norm. However, there is a way of choosing a metric on $\mathcal{P}$ (not associated with a norm) such that convergence in the sense of $\mathcal{P}$ is equivalent to convergence in the sense of the metric. Finally, we introduce the abstract concept which is related to $\mathcal{P}$ in the way that the concept of "Banach space" is related to $\mathcal{C}$.
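Suppose there were a norm $|u|'$ on $\mathcal{P}$ such that a sequence $(u_n)_{n=1}^{\infty} \subset \mathcal{P}$ converges in the sense of $\mathcal{P}$ to $u \in \mathcal{P}$ if and only if
$$|u_n - u|' \to 0.$$
Then there would be a constant $M$ and an integer $N$ such that
$$|u|' \le M(|u| + |Du| + \cdots + |D^N u|), \qquad \text{all } u \in \mathcal{P}. \tag{1}$$
In fact, suppose (1) is false for every $M$, $N$. Then for each integer $n$ there would be a $u_n \in \mathcal{P}$ such that
$$|u_n|' > n(|u_n| + |Du_n| + \cdots + |D^n u_n|).$$
Let
$$v_n = (|u_n|')^{-1} u_n.$$
Then
$$|D^k v_n| < n^{-1} \quad\text{if } n \ge k,$$
so $v_n \to 0$ in the sense of $\mathcal{P}$. But $|v_n|' = 1$, all $n$. This shows that the norm $|u|'$ must satisfy (1) for some $M$, $N$. Now let
$$w_n(x) = n^{-N-1} \sin nx, \qquad n = 1, 2, \ldots.$$
Then $w_n \in \mathcal{P}$ and
$$|D^k w_n| = n^{k-N-1} \le n^{-1}, \qquad k \le N.$$
Thus by (1),
$$|w_n|' \le M(N + 1)n^{-1} \to 0.$$
But $|D^{N+1} w_n| = 1$, all $n$, so $(w_n)_{n=1}^{\infty}$ does not converge to 0 in the sense of $\mathcal{P}$. This contradicts our assumption about the norm $|u|'$.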

Although we cannot choose a metric on $\mathcal{P}$, associated with a norm, which gives the right notion of convergence, we can choose a metric as follows. Let
$$d'(u, v) = \sum_{k=0}^{\infty} 2^{-k-1} |D^k u - D^k v|\,[1 + |D^k u - D^k v|]^{-1}.$$
The term of this sum indexed by $k$ is non-negative and is smaller than $2^{-k-1}$. Thus
$$d'(u, v) < 1, \qquad u, v \in \mathcal{P}.$$
It is clear that
$$d'(u, v) \ge 0,$$
$$d'(u, v) = 0 \quad\text{implies}\quad u = v,$$
$$d'(u, v) = d'(v, u).$$
The triangle inequality is a little more difficult. Let
$$d(u, v) = |u - v|, \qquad d^*(u, v) = d(u, v)[1 + d(u, v)]^{-1}.$$
The reader may verify that
$$d^*(u, w) \le d^*(u, v) + d^*(v, w).$$
Then
$$d'(u, w) = \sum_{k=0}^{\infty} 2^{-k-1} d^*(D^k u, D^k w) \le d'(u, v) + d'(v, w).$$

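Theorem 2.2. A sequence of functions $(u_n)_{n=1}^{\infty} \subset \mathcal{P}$ is a Cauchy sequence in the sense of $\mathcal{P}$ if and only if it is a Cauchy sequence in the sense of the metric $d'$. Thus $(\mathcal{P}, d')$ is a complete metric space.

Proof. Suppose $(u_n)_{n=1}^{\infty}$ is a Cauchy sequence in the sense of $\mathcal{P}$. Suppose $\varepsilon > 0$ is given. Choose $k$ so large that $2^{-k} < \varepsilon$. Choose $N$ so large that if $m \ge N$ and $n \ge N$, then
$$|D^j u_m - D^j u_n| < \tfrac12\varepsilon, \qquad j = 0, 1, \ldots, k.$$
Then if $m, n \ge N$,
$$d'(u_m, u_n) = \sum_{j=0}^{\infty} 2^{-j-1} d^*(D^j u_m, D^j u_n) < \sum_{j=0}^{k} 2^{-j-1}(\tfrac12\varepsilon) + \sum_{j=k+1}^{\infty} 2^{-j-1} < \tfrac12\varepsilon + 2^{-k-1} < \tfrac12\varepsilon + \tfrac12\varepsilon.$$
Conversely, suppose $(u_n)_{n=1}^{\infty}$ is a Cauchy sequence in the sense of the metric $d'$. Given an integer $k \ge 0$ and an $\varepsilon > 0$, which we may take to be at most 1, choose $N$ so large that if $m, n \ge N$ then
$$d'(u_m, u_n) < 2^{-k-2}\varepsilon.$$
For $m, n \ge N$ this forces $d^*(D^k u_n, D^k u_m) < \tfrac12$, hence $|D^k u_n - D^k u_m| < 1$ and $\tfrac12|D^k u_n - D^k u_m| \le d^*(D^k u_n, D^k u_m)$. Therefore
$$|D^k u_n - D^k u_m| = 2^{k+2} \cdot 2^{-k-1} \cdot 2^{-1}|D^k u_n - D^k u_m| \le 2^{k+2}\, 2^{-k-1} d^*(D^k u_n, D^k u_m) \le 2^{k+2} d'(u_n, u_m) < \varepsilon.$$
Thus $(u_n)_{n=1}^{\infty}$ is a Cauchy sequence in the sense of $\mathcal{P}$.

The same argument shows that $d'(u_n, u) \to 0$ if and only if $u_n \to u$ $(\mathcal{P})$. Thus $(\mathcal{P}, d')$ is complete. □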
There is an important generalization of the concept of a Banach space, which includes spaces like $\mathcal{P}$. Let $X$ be a vector space over the real or complex numbers. A seminorm on $X$ is a function $u \mapsto |u|$ from $X$ to $\mathbb{R}$ such that
$$|u| \ge 0, \qquad |au| = |a||u|, \qquad |u + v| \le |u| + |v|.$$
(Thus a seminorm is a norm if and only if $|u| = 0$ implies $u = 0$.) Suppose there is given a sequence of seminorms on $X$, $|u|_1, |u|_2, \ldots$, with the property that
$$|u|_k = 0, \ \text{all } k, \quad\text{implies}\quad u = 0. \tag{2}$$
Then we may define a metric on $X$ by
$$d'(u, v) = \sum_{k=1}^{\infty} 2^{-k} |u - v|_k\,[1 + |u - v|_k]^{-1}.$$
If $X$ is complete with respect to the metric $d'$, it is said to be a Fréchet space. Note that
$$d'(u_n, v) \to 0 \quad\text{as } n \to \infty$$
is equivalent to
$$|u_n - v|_k \to 0 \quad\text{as } n \to \infty, \ \text{for all } k.$$
In particular, if we take $X = \mathcal{P}$ and
$$|u|_k = |D^{k-1} u|,$$
then $d'$ agrees with $d'$ as defined above. Thus Theorems 2.1 and 2.2 say that $\mathcal{P}$ is a Fréchet space.

Exercises

1. Which of the following are Cauchy sequences in the sense of $\mathcal{P}$?
$$u_n(x) = n^{-3} \cos nx,$$
$$v_n(x) = (n!)^{-1} \sin nx,$$
$$w_n(x) = \sum_{m=1}^{n} (m!)^{-1} \sin mx.$$

2. Suppose $X$ is a complex vector space with a sequence of seminorms $|u|_1, |u|_2, \ldots$, satisfying (2). Let $d'$ be the associated metric. Show that a linear functional $F: X \to \mathbb{C}$ is continuous if and only if there are a constant $M$ and an integer $N$ such that
$$|F(u)| \le M(|u|_1 + |u|_2 + \cdots + |u|_N), \qquad \text{all } u \in X.$$

3. Translation, convolution, and approximation


The aim of this section and the next is to show that the space $\mathcal{P}$ of smooth periodic functions is dense in the space $\mathcal{C}$ of continuous periodic functions; in other words, any continuous periodic function $u$ is the uniform limit of a sequence $(u_n)_{n=1}^{\infty}$ of smooth periodic functions. Even more important than this theorem is the method of proof, because we develop a systematic procedure for approximating functions by smooth functions.

The idea behind this procedure is that an average of translates of a function $u$ is smoother than $u$ itself, while if the translated functions are translated only slightly, the resulting functions are close to $u$. To illustrate this the reader is invited to graph the following functions from $\mathbb{R}$ to $\mathbb{R}$:
$$u_1(x) = |x|,$$
$$u_2(x) = \tfrac12|x - \varepsilon| + \tfrac12|x + \varepsilon|,$$
$$u_3(x) = \tfrac13|x - \varepsilon| + \tfrac13|x| + \tfrac13|x + \varepsilon|,$$
where $\varepsilon > 0$.

If $u \in \mathcal{C}$ and $t \in \mathbb{R}$, the translation of $u$ by $t$ is the function $T_t u$,
$$T_t u(x) = u(x - t), \qquad x \in \mathbb{R}.$$
Then $T_t u \in \mathcal{C}$. The graph of $T_t u$ is the graph of $u$ shifted $t$ units to the right (i.e., shifted $|t|$ units to the left, if $t < 0$). In these terms the functions above are
$$u_2 = \tfrac12 T_\varepsilon u_1 + \tfrac12 T_{-\varepsilon} u_1, \qquad u_3 = \tfrac13 T_\varepsilon u_1 + \tfrac13 u_1 + \tfrac13 T_{-\varepsilon} u_1.$$
More generally, one could consider weighted averages of the form
(1)
where
ao

+ a1 + ... + aT

= 1,

and most of the t k are near O. If

o~
and we set
then (1) becomes
(2)

to

<

t1

< ... <

tk

= 2'IT

The natural continuous analog of (2) is the symbolic integral
$$v = \int_0^{2\pi} b(t)\,T_t u\,dt,$$
defining the function
$$v(x) = \int_0^{2\pi} b(t)\,u(x - t)\,dt. \tag{3}$$
We wish to study integrals of the form (3). If $u, v \in \mathcal{C}$, the convolution of $u$ and $v$ is the function $u * v$ defined by
$$(u * v)(x) = \frac{1}{2\pi} \int_0^{2\pi} u(x - y)\,v(y)\,dy. \tag{4}$$

It follows readily from (4) that
$$|u * v| \le |u|\,\frac{1}{2\pi} \int_0^{2\pi} |v(x)|\,dx \le |u||v|. \tag{5}$$

Proposition 3.1. If $u, v \in \mathcal{C}$, then $u * v \in \mathcal{C}$. Moreover
$$u * v = v * u, \tag{6}$$
$$(au) * v = a(u * v), \qquad a \in \mathbb{C}, \tag{7}$$
$$(u + v) * w = u * w + v * w, \qquad w \in \mathcal{C}, \tag{8}$$
$$(u * v) * w = u * (v * w), \tag{9}$$
$$T_t(u * v) = (T_t u) * v = u * (T_t v). \tag{10}$$

Proof. We begin with part of (10).
$$T_t(u * v)(x) = (u * v)(x - t) = \frac{1}{2\pi} \int_0^{2\pi} u(x - t - y)\,v(y)\,dy = \frac{1}{2\pi} \int_0^{2\pi} T_t u(x - y)\,v(y)\,dy = ((T_t u) * v)(x). \tag{11}$$
Therefore,
$$|T_t(u * v) - u * v| = |(T_t u - u) * v| \le |T_t u - u||v|, \tag{12}$$
where we have assumed (7) and (8). Now $u$ is uniformly continuous on $[0, 2\pi]$ and is periodic; it follows easily that $u$ is uniformly continuous on $\mathbb{R}$. Therefore $|T_t u - u| \to 0$ as $t \to 0$. Then (12) implies continuity of $u * v$. Also,
$$T_{2\pi}(u * v) = (T_{2\pi} u) * v = u * v,$$
so $u * v$ is periodic.

The equality (6) follows from a change of variables in (4): let $y' = x - y$, and use the periodicity of $u$ and $v$. Equalities (7), (8), and (9) are easy computations. The last part of (10) follows from (11) and (6). □
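The identities of Proposition 3.1 have exact discrete counterparts, which gives a cheap way to experiment with them. In the sketch below, the grid size `N`, the sample functions, and the replacement of the integral (4) by an equally spaced Riemann sum are all assumptions made for illustration; the helper names `conv`, `shift`, and `sup` are not from the text.

```python
import math

N = 256                                   # number of grid points on [0, 2*pi)
xs = [2 * math.pi * j / N for j in range(N)]

def conv(u, v):
    """Riemann-sum analogue of (4): (1/N) * sum_k u[j - k] * v[k], indices mod N."""
    return [sum(u[(j - k) % N] * v[k] for k in range(N)) / N for j in range(N)]

def shift(w, m):
    """Grid analogue of the translation T_t with t = 2*pi*m/N."""
    return [w[(j - m) % N] for j in range(N)]

def sup(w):
    """Grid analogue of the norm |w|."""
    return max(abs(z) for z in w)

# Hypothetical samples of two continuous periodic functions.
u = [abs(math.sin(x)) for x in xs]
v = [math.cos(3 * x) + 0.5 for x in xs]

uv = conv(u, v)
vu = conv(v, u)
mean_abs_v = sum(abs(z) for z in v) / N   # analogue of (1/2pi) * integral of |v|

commutator = sup([a - b for a, b in zip(uv, vu)])                         # (6)
translation_gap = sup([a - b for a, b in
                       zip(conv(shift(u, 10), v), shift(uv, 10))])        # (10)
print(commutator, translation_gap, sup(uv), sup(u) * mean_abs_v)
```

The last two printed numbers illustrate the first inequality in (5) on the grid.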

Note that
$$t^{-1}[u(t + x) - u(x)] = t^{-1}[T_{-t}u - u](x).$$

Lemma 3.2. If $u \in \mathcal{P}$ then
$$|t^{-1}(T_{-t}u - u) - Du| \to 0 \quad\text{as } t \to 0.$$

Proof. By the Mean Value Theorem,
$$t^{-1}[T_{-t}u - u](x) - Du(x) = Du(y) - Du(x)$$
where $y = y(t, x)$ lies between $x$ and $x + t$. Since $Du$ is uniformly continuous, $t^{-1}(T_{-t}u - u)$ converges uniformly to $Du$ as $t \to 0$. □

Corollary 3.3. If $u \in \mathcal{P}$ then
$$t^{-1}(T_{-t}u - u) \to Du \ (\mathcal{P})$$
as $t \to 0$.

Proof. It is easy to see that
$$D^k(T_{-t}u) = T_{-t}(D^k u).$$
Then
$$D^k[t^{-1}(T_{-t}u - u)] = t^{-1}(T_{-t}D^k u - D^k u),$$
which converges uniformly to $D(D^k u) = D^k(Du)$. □

Proposition 3.4. If $u \in \mathcal{P}$ and $v \in \mathcal{C}$ then $u * v \in \mathcal{P}$ and
$$D^k(u * v) = (D^k u) * v, \qquad \text{all } k. \tag{13}$$

Proof. By Proposition 3.1,
$$t^{-1}[T_{-t}(u * v) - (u * v)] = [t^{-1}(T_{-t}u - u)] * v. \tag{14}$$
By Lemma 3.2 and (5), the expression on the right in (14) converges uniformly to $(Du) * v$ as $t \to 0$. Thus
$$D(u * v) = (Du) * v,$$
and $u * v$ has a continuous derivative. But $Du \in \mathcal{P}$, so we also have
$$D^2(u * v) = D((Du) * v) = (D^2 u) * v.$$
By induction, (13) holds for all $k$. □

Corollary 3.5. Suppose $(v_n)_1^{\infty} \subset \mathcal{C}$, $v \in \mathcal{C}$, and $|v_n - v| \to 0$. Then for each $u \in \mathcal{P}$,
$$u * v_n \to u * v \ (\mathcal{P}).$$

Proof. For each $k$,
$$D^k(u * v_n - u * v) = D^k u * (v_n - v).$$
It follows from this and (5) that
$$|D^k(u * v_n - u * v)| \to 0, \qquad \text{all } k.$$
□

Having established the general properties of convolution, let us return to the question of approximation. Suppose $u \in \mathcal{C}$. If $(\varphi_n)_{n=1}^{\infty}$ is a sequence of functions in $\mathcal{P}$ then each function
$$u_n = \varphi_n * u$$
is smooth. Thinking of $u_n$ as a weighted average of translates of $u$, we can expect $u_n$ to be close to $u$ if $\varphi_n$ has average value 1 and is concentrated near 0 and $2\pi$ (as a function on $[0, 2\pi]$).

A sequence $(\varphi_n)_{n=1}^{\infty} \subset \mathcal{C}$ is said to be an approximate identity if
(i) $\varphi_n(x) \ge 0$, all $n$, $x$;
(ii) $\frac{1}{2\pi} \int_0^{2\pi} \varphi_n(x)\,dx = 1$, all $n$;
(iii) for each $0 < \delta < \pi$,
$$\int_\delta^{2\pi - \delta} \varphi_n(x)\,dx \to 0 \quad\text{as } n \to \infty.$$

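Theorem 3.6. Suppose $(\varphi_n)_1^{\infty} \subset \mathcal{C}$ is an approximate identity. Then for each $u \in \mathcal{C}$,
$$|\varphi_n * u - u| \to 0 \quad\text{as } n \to \infty.$$
Moreover, if $u \in \mathcal{P}$ then
$$\varphi_n * u \to u \ (\mathcal{P}) \quad\text{as } n \to \infty.$$

Proof. Since $(2\pi)^{-1} \int_0^{2\pi} \varphi_n(y)\,dy = 1$ and $\varphi_n * u = u * \varphi_n$, we have
$$2\pi|(\varphi_n * u)(x) - u(x)| = \Bigl|\int_0^{2\pi} u(x - y)\varphi_n(y)\,dy - u(x)\int_0^{2\pi} \varphi_n(y)\,dy\Bigr|$$
$$= \Bigl|\int_0^{2\pi} [u(x - y) - u(x)]\varphi_n(y)\,dy\Bigr| \le \Bigl|\int_0^{\delta}\Bigr| + \Bigl|\int_\delta^{2\pi - \delta}\Bigr| + \Bigl|\int_{2\pi - \delta}^{2\pi}\Bigr|$$
$$\le \sup_{|s| \le \delta} |T_s u - u| \Bigl(\int_0^{\delta} \varphi_n + \int_{2\pi - \delta}^{2\pi} \varphi_n\Bigr) + 2|u| \int_\delta^{2\pi - \delta} \varphi_n$$
$$\le 2\pi \sup_{|s| \le \delta} |T_s u - u| + 2|u| \int_\delta^{2\pi - \delta} \varphi_n.$$
Given $\varepsilon > 0$, we first choose $\delta > 0$ so small that $\delta < \pi$ and
$$|T_s u - u| \le \tfrac12\varepsilon \quad\text{if } |s| \le \delta.$$
Then choose $N$ so large that
$$2|u| \int_\delta^{2\pi - \delta} \varphi_n < \pi\varepsilon, \qquad n \ge N.$$
If $n \ge N$, then
$$|\varphi_n * u - u| < \varepsilon.$$
This proves the first assertion. Now suppose $u \in \mathcal{P}$. For each $k$,
$$D^k(\varphi_n * u) = D^k(u * \varphi_n) = (D^k u) * \varphi_n = \varphi_n * (D^k u),$$
which converges uniformly to $D^k u$. Thus
$$\varphi_n * u \to u \ (\mathcal{P}). \qquad \square$$
In the next section we shall construct a sequence in $\mathcal{P}$ which is an approximate identity. It will follow, using Proposition 3.4 and Theorem 3.6, that $\mathcal{P}$ is dense in $\mathcal{C}$.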
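To see the first assertion of Theorem 3.6 at work, the sketch below smooths the continuous periodic function $u(x) = |\sin x|$ with kernels proportional to $(1 + \cos x)^n$ and records the sup-norm error on a grid. The choice of kernel, the function $u$, the grid size `N`, and the Riemann-sum form of the convolution are assumptions made for illustration only.

```python
import math

N = 256                                   # grid points on [0, 2*pi)
xs = [2 * math.pi * j / N for j in range(N)]

def kernel_samples(n):
    """Samples of c_n * (1 + cos x)**n, normalized so the grid average is 1."""
    raw = [(1 + math.cos(x)) ** n for x in xs]
    mean = sum(raw) / N                   # grid analogue of (1/2pi) * integral
    return [r / mean for r in raw]

def conv(u, v):
    """Riemann-sum analogue of the convolution (4) on the grid."""
    return [sum(u[(j - k) % N] * v[k] for k in range(N)) / N for j in range(N)]

# Hypothetical continuous periodic function with corners at multiples of pi.
u = [abs(math.sin(x)) for x in xs]

errors = []
for n in (4, 16, 64):
    smoothed = conv(kernel_samples(n), u)
    errors.append(max(abs(a - b) for a, b in zip(smoothed, u)))

print(errors)
```

The error shrinks slowly here because $u$ has corners; for a smooth $u$ one would expect faster decay.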

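Exercises

1. Let $e_k(x) = \exp(ikx)$, $k = 0, \pm 1, \pm 2, \ldots$. These functions are in $\mathcal{P}$ (see §6 of Chapter 2). Suppose $u \in \mathcal{C}$. Show that
$$e_k * u = a_k e_k,$$
where
$$a_k = \frac{1}{2\pi} \int_0^{2\pi} e^{-iky} u(y)\,dy.$$

2. Show that $e_k * e_k = e_k$, and $e_j * e_k = 0$ if $j \ne k$.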

4. The Weierstrass approximation theorems


A trigonometric polynomial is a function of the form
$$\varphi(x) = \sum_{k=-n}^{n} a_k \exp(ikx), \tag{1}$$
where the coefficients $a_k$ are in $\mathbb{C}$. The reason for the terminology is that for $k > 0$,
$$\exp(\pm ikx) = [\exp(\pm ix)]^k = (\cos x \pm i\sin x)^k.$$
Therefore any function of the form (1) can be written as a polynomial in the trigonometric functions $\cos x$ and $\sin x$. Conversely, recall that
$$\cos x = \tfrac12[\exp(ix) + \exp(-ix)], \qquad \sin x = \frac{1}{2i}[\exp(ix) - \exp(-ix)].$$
Therefore any polynomial in $\cos x$ and $\sin x$ can be written in the form (1).

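Lemma 4.1. There is a sequence $(\varphi_n)_1^{\infty}$ of trigonometric polynomials which is an approximate identity.

Proof. We want to choose a non-negative trigonometric polynomial $\varphi$ such that
$$\varphi(0) = \varphi(2\pi) = 1, \qquad \varphi(x) < 1 \ \text{for } 0 < x < 2\pi.$$
Then successive powers of $\varphi$ will take values at points near 0 and $2\pi$ which are relatively much greater than those taken at points between 0 and $2\pi$. We may take
$$\varphi(x) = \tfrac12(1 + \cos x)$$
and set
$$\varphi_n(x) = c_n(1 + \cos x)^n,$$
where $c_n$ is chosen so that
$$\int_0^{2\pi} \varphi_n(x)\,dx = 2\pi.$$
We need to show that for each $0 < \delta < \pi$,
$$\int_\delta^{2\pi - \delta} \varphi_n(x)\,dx \to 0 \quad\text{as } n \to \infty. \tag{2}$$
There is a number $r$, $0 < r < 1$, such that
$$1 + \cos x \le r(1 + \cos y) \tag{3}$$
if
$$x \in [\delta, 2\pi - \delta], \qquad y \in [0, \tfrac12\delta]. \tag{4}$$
Then (3) and (4) imply
$$\varphi_n(x) = c_n(1 + \cos x)^n \le r^n \varphi_n(y),$$
so
$$\tfrac12\delta\,\varphi_n(x) \le r^n \int_0^{\delta/2} \varphi_n(y)\,dy \le 2\pi r^n,$$
or
$$\varphi_n(x) \le 4\pi r^n/\delta, \qquad x \in [\delta, 2\pi - \delta].$$
Thus $\varphi_n \to 0$ uniformly on $[\delta, 2\pi - \delta]$. □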
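Condition (2) can be observed numerically for the kernels $\varphi_n(x) = c_n(1 + \cos x)^n$ of the proof. In this sketch the normalizing constant $c_n$ is found by quadrature rather than in closed form, and the value $\delta = 0.5$, the sample values of $n$, and the quadrature sizes are assumptions chosen for illustration.

```python
import math

def periodic_sum(g, m=4096):
    """Equally spaced Riemann sum for the integral of g over [0, 2*pi]."""
    step = 2 * math.pi / m
    return step * sum(g(j * step) for j in range(m))

def midpoint(g, lo, hi, m=4096):
    """Midpoint-rule approximation to the integral of g over [lo, hi]."""
    step = (hi - lo) / m
    return step * sum(g(lo + (i + 0.5) * step) for i in range(m))

def kernel(n):
    """Return phi_n = c_n * (1 + cos x)**n with integral 2*pi over [0, 2*pi]."""
    def raw(x):
        return (1 + math.cos(x)) ** n
    c_n = 2 * math.pi / periodic_sum(raw)
    return lambda x: c_n * raw(x)

def mass_away_from_zero(n, delta):
    """Approximate the integral of phi_n over [delta, 2*pi - delta]."""
    return midpoint(kernel(n), delta, 2 * math.pi - delta)

delta = 0.5
masses = [mass_away_from_zero(n, delta) for n in (1, 10, 50, 200)]
print(masses)
```

The printed values should decrease towards zero, while the total integral of each kernel stays equal to $2\pi$.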

Lemma 4.2. If $\varphi$ is a trigonometric polynomial and $u \in \mathcal{C}$, then $\varphi * u$ is a trigonometric polynomial.

Proof. This follows from Exercise 1 of the preceding section. □

Theorem 4.3. The trigonometric polynomials are dense in the space $\mathcal{C}$ of continuous periodic functions, and in the space $\mathcal{P}$ of smooth periodic functions. That is, if $u \in \mathcal{C}$ and $v \in \mathcal{P}$, there are sequences $(u_n)_1^{\infty}$ and $(v_n)_1^{\infty}$ of trigonometric polynomials such that
$$|u_n - u| \to 0$$
and
$$v_n \to v \ (\mathcal{P}).$$

Proof. Let $(\varphi_n)_1^{\infty}$ be a sequence of trigonometric polynomials which is an approximate identity, as in Lemma 4.1. Let
$$u_n = \varphi_n * u, \qquad v_n = \varphi_n * v.$$
By Lemma 4.2, the functions $u_n$ and $v_n$ are trigonometric polynomials. By Theorem 3.6, $u_n \to u$ uniformly and $v_n \to v$ in the sense of $\mathcal{P}$. □

Note that if $u$, $v$ are real-valued, then so are the sequences $(u_n)_1^{\infty}$, $(v_n)_1^{\infty}$ constructed here.

Corollary 4.4. $\mathcal{P}$ is dense in $\mathcal{C}$.

Theorem 4.3 is due to Weierstrass. There is a better-known approximation theorem, also due to Weierstrass, which can be deduced from Theorem 4.3.

Theorem 4.5. (Weierstrass polynomial approximation theorem). Let $u$ be a complex-valued continuous function defined on a closed interval $[a, b] \subset \mathbb{R}$. Then there is a sequence $(p_n)_1^{\infty}$ of polynomials which converges uniformly to $u$ on the interval $[a, b]$.

Proof. Suppose first that $[a, b] = [0, \pi]$. We can extend $u$ so that it is a function in $\mathcal{C}$; for example, let $u(-x) = u(x)$, $x \in [0, \pi]$, and take the unique periodic extension of this function. Then there is a sequence $(u_n)_1^{\infty}$ of trigonometric polynomials converging uniformly to $u$. Now the partial sums of the power series
$$\sum_{m=0}^{\infty} (m!)^{-1}(ikx)^m = \exp(ikx)$$
converge to $\exp(ikx)$ uniformly on $[0, \pi]$. Therefore for each $n$, we may replace the functions $\exp(ikx)$ in the expression of the form (1) for $u_n$ by partial sums, so as to obtain a polynomial $p_n$ with
$$|p_n(x) - u_n(x)| < n^{-1}, \qquad x \in [0, \pi].$$
Then $p_n \to u$ uniformly on $[0, \pi]$.

In the case of an arbitrary interval $[a, b]$, let
$$v(x) = u(a + (b - a)x/\pi), \qquad x \in [0, \pi].$$
Then $v$ is continuous on $[0, \pi]$, so there is a sequence $(q_n)_1^{\infty}$ of polynomials with $q_n \to v$ uniformly on $[0, \pi]$. Let
$$p_n(y) = q_n(\pi(y - a)/(b - a)), \qquad y \in [a, b].$$
Then $p_n$ is also a polynomial and $p_n \to u$ uniformly on $[a, b]$. □

Exercises

1. Suppose $u \in \mathcal{C}$ and suppose that for each integer $k$,
$$\int_0^{2\pi} u(x) \exp(ikx)\,dx = 0.$$
Show that $u = 0$.

2. Suppose $u: [a, b] \to \mathbb{C}$ is continuous, and for each integer $n \ge 0$,
$$\int_a^b u(x)\,x^n\,dx = 0.$$
Show that $u = 0$.

5. Periodic distributions
In general, a "distribution" is a continuous linear functional on some space of functions. A periodic distribution is a continuous linear functional on the space $\mathcal{P}$. Thus a periodic distribution is a mapping $F: \mathcal{P} \to \mathbb{C}$ such that
$$F(au) = aF(u), \qquad a \in \mathbb{C},\ u \in \mathcal{P};$$
$$F(u + v) = F(u) + F(v), \qquad u, v \in \mathcal{P};$$
$$F(u_n) \to F(u) \qquad \text{if } u_n \to u \ (\mathcal{P}).$$
If $v$ is a continuous periodic function, then we define a linear functional $F = F_v$ by
$$F_v(u) = \frac{1}{2\pi} \int_0^{2\pi} v(x)u(x)\,dx, \qquad u \in \mathcal{C}. \tag{1}$$
Then $F_v: \mathcal{C} \to \mathbb{C}$ is linear, and
$$|F_v(u)| \le |v||u|.$$
Therefore $F_v$ is continuous on $\mathcal{C}$. Its restriction to the subspace $\mathcal{P}$ is a periodic distribution. We say that a periodic distribution $F$ is a function if there is a $v \in \mathcal{C}$ such that $F = F_v$. If so, we may abuse notation and write $F = v$.

Note that different functions $v, w \in \mathcal{C}$ define different distributions. In fact, suppose $F_v = F_w$. Choose $(u_n)_1^{\infty} \subset \mathcal{P}$ such that $u_n \to w^* - v^*$ uniformly, where $w^*(x) = w(x)^*$, the complex conjugate. Then
$$0 = 2\pi(F_w(u_n) - F_v(u_n)) = \int_0^{2\pi} (w(x) - v(x))u_n(x)\,dx \to \int_0^{2\pi} |w(x) - v(x)|^2\,dx,$$
so $w = v$.


Not every periodic distribution is a function. For example, let $\delta: \mathcal{C} \to \mathbb{C}$ be defined by

$$\delta(u) = u(0), \qquad u \in \mathcal{C}. \tag{2}$$

Then the restriction of $\delta$ to $\mathcal{P}$ is a periodic distribution. It is called the $\delta$-distribution, or Dirac $\delta$-distribution. To see that it is not a function, let

$$u_n(x) = (\tfrac12 + \tfrac12\cos x)^n.$$

Then $\delta(u_n) = 1$, all $n$. But $u_n(x) \to 0$ uniformly for $x \in [\varepsilon, 2\pi - \varepsilon]$, any $\varepsilon > 0$. Also $0 \le u_n(x) \le 1$, all $x$, $n$. It follows from this that for any $v \in \mathcal{C}$, $F_v(u_n) \to 0$. Thus $\delta \ne F_v$.
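The argument above can be watched numerically. The following is a minimal sketch, not part of the original text: it approximates $F_v(u_n)$ by a midpoint sum for one hypothetical choice of $v$ (the function `v` below is an assumption made only for illustration) and compares it with $\delta(u_n)$, which stays equal to 1.

```python
import math

def u_n(n, x):
    # Test function from the text: u_n(x) = (1/2 + 1/2 cos x)^n.
    return (0.5 + 0.5 * math.cos(x)) ** n

def F(v, u, samples=20000):
    # Midpoint-rule approximation of F_v(u) = (1/2pi) * integral_0^{2pi} v(x) u(x) dx.
    h = 2.0 * math.pi / samples
    total = 0.0
    for j in range(samples):
        x = (j + 0.5) * h
        total += v(x) * u(x)
    return total * h / (2.0 * math.pi)

def delta(u):
    # delta(u) = u(0)
    return u(0.0)

def v(x):
    # Hypothetical continuous periodic function, chosen only for illustration.
    return 2.0 + math.cos(x)

for n in (1, 10, 100, 1000):
    u = lambda x, n=n: u_n(n, x)
    # delta(u_n) stays at 1 while F_v(u_n) shrinks as n grows.
    print(n, delta(u), F(v, u))
```

The printed values of $F_v(u_n)$ decrease toward 0 while $\delta(u_n)$ does not, which is the obstruction to $\delta = F_v$ for this particular $v$; the text's argument covers every $v \in \mathcal{C}$.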
The set of all periodic distributions is denoted by $\mathcal{P}'$. We consider $\mathcal{P}'$ as a vector space in the usual way: if $F, G \in \mathcal{P}'$, $u \in \mathcal{P}$, $a \in \mathbb{C}$, then

$$(F + G)(u) = F(u) + G(u), \qquad (aF)(u) = aF(u).$$

Note that if $v$, $w$ are continuous periodic functions, then

$$F_{v+w} = F_v + F_w, \qquad F_{av} = aF_v.$$

A sequence $(F_n)_1^\infty \subset \mathcal{P}'$ is said to converge to $F \in \mathcal{P}'$ in the sense of $\mathcal{P}'$ if

$$F_n(u) \to F(u), \qquad \text{all } u \in \mathcal{P}.$$

We denote convergence in the sense of $\mathcal{P}'$ by

$$F_n \to F\ (\mathcal{P}'),$$

or simply by

$$F_n \to F$$

when the sense of convergence is understood.


We want to define operations of complex conjugation, reversal, translation, and differentiation for periodic distributions. For any such operation there is a standard procedure for extending the operation from functions to distributions. For example, if $v \in \mathcal{C}$, the complex conjugate function $v^*$ is defined by

$$v^*(x) = v(x)^*.$$

Then

$$F_{v^*}(u) = \frac{1}{2\pi}\int_0^{2\pi} v(x)^*u(x)\,dx = \left(\frac{1}{2\pi}\int_0^{2\pi} v(x)u^*(x)\,dx\right)^* = (F_v(u^*))^*.$$

Then we define $F^*$ for an arbitrary $F \in \mathcal{P}'$ by

$$F^*(u) = F(u^*)^*, \qquad u \in \mathcal{P}. \tag{3}$$

Similarly, if $v \in \mathcal{C}$ we define the reversed function $\tilde v$ by

$$\tilde v(x) = v(-x).$$


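Then

$$F_{\tilde v}(u) = \frac{1}{2\pi}\int_0^{2\pi} v(-x)u(x)\,dx = \frac{1}{2\pi}\int_0^{2\pi} v(x)u(-x)\,dx = F_v(\tilde u).$$

We define $\tilde F$, $F \in \mathcal{P}'$, by

$$\tilde F(u) = F(\tilde u), \qquad u \in \mathcal{P}. \tag{4}$$

If $v \in \mathcal{C}$ and $t \in \mathbb{R}$, recall that the translate $T_tv$ is defined by

$$T_tv(x) = v(x - t).$$

Then

$$F_{T_tv}(u) = \frac{1}{2\pi}\int_0^{2\pi} T_tv(x)u(x)\,dx = \frac{1}{2\pi}\int_0^{2\pi} v(x - t)u(x)\,dx = \frac{1}{2\pi}\int_0^{2\pi} v(x)u(x + t)\,dx = F_v(T_{-t}u).$$

We define $T_tF$, $F \in \mathcal{P}'$, by

$$(T_tF)(u) = F(T_{-t}u), \qquad u \in \mathcal{P}. \tag{5}$$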

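If $v \in \mathcal{P}$ and $u \in \mathcal{P}$, then integration by parts gives

$$F_{Dv}(u) = \frac{1}{2\pi}\int_0^{2\pi} Dv(x)u(x)\,dx = -\frac{1}{2\pi}\int_0^{2\pi} v(x)Du(x)\,dx = -F_v(Du).$$

We define $DF$, $F \in \mathcal{P}'$, by

$$(DF)(u) = -F(Du), \qquad u \in \mathcal{P}. \tag{6}$$

Then inductively,

$$(D^kF)(u) = (-1)^kF(D^ku), \qquad u \in \mathcal{P}. \tag{7}$$

Each of the linear functionals so defined is a periodic distribution. For example, if $u_n \to u\ (\mathcal{P})$ then $Du_n \to Du\ (\mathcal{P})$. It follows that

$$(DF)(u_n) = -F(Du_n) \to -F(Du) = DF(u),$$

so the derivative $DF$ is continuous. Similarly, $F^*$, $\tilde F$, $T_tF$, and $D^kF$ are in $\mathcal{P}'$.
In particular, let us take $F = \delta$. Then

$$\delta = \delta^* = \tilde\delta, \tag{8}$$
$$T_t\delta(u) = u(t), \tag{9}$$
$$D^k\delta(u) = (-1)^kD^ku(0), \qquad u \in \mathcal{P}. \tag{10}$$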
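Definitions (5) and (6) translate directly into code if test functions are restricted to trigonometric polynomials. The sketch below is an illustration under that assumption, not the book's construction: a test function is stored as a dictionary of Fourier coefficients (a representation chosen here), a distribution is any Python callable on such dictionaries, and the helper names `D_dist` and `T_dist` are hypothetical.

```python
import cmath

# A trigonometric polynomial u(x) = sum_k c_k exp(ikx) is stored as {k: c_k}.

def evaluate(u, x):
    return sum(c * cmath.exp(1j * k * x) for k, c in u.items())

def D(u):
    # Derivative of a test function: multiply the k-th coefficient by ik.
    return {k: 1j * k * c for k, c in u.items()}

def T(t, u):
    # Translate of a test function: (T_t u)(x) = u(x - t).
    return {k: c * cmath.exp(-1j * k * t) for k, c in u.items()}

def delta(u):
    # delta(u) = u(0)
    return evaluate(u, 0.0)

def D_dist(F):
    # Equation (6): (DF)(u) = -F(Du).
    return lambda u: -F(D(u))

def T_dist(t, F):
    # Equation (5): (T_t F)(u) = F(T_{-t} u).
    return lambda u: F(T(-t, u))

u = {0: 1.0, 1: 2.0, -3: 0.5j}   # arbitrary sample test function
t = 0.7
print(T_dist(t, delta)(u), evaluate(u, t))               # both sides of (9)
print(D_dist(D_dist(delta))(u), evaluate(D(D(u)), 0.0))  # both sides of (10), k = 2
```

Each printed pair agrees up to rounding, matching (9) and (10) for this sample $u$.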

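Proposition 5.1. The operations in $\mathcal{P}'$ defined by equations (3), (4), (5), and (6) are continuous, in the sense that if $F_n \to F\ (\mathcal{P}')$ then

$$F_n^* \to F^*\ (\mathcal{P}'), \qquad \tilde F_n \to \tilde F\ (\mathcal{P}'), \qquad T_tF_n \to T_tF\ (\mathcal{P}'), \qquad DF_n \to DF\ (\mathcal{P}').$$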


Proof. Each of these assertions follows trivially from the definitions. For example, if $u \in \mathcal{P}$ then

$$(DF_n)(u) = -F_n(Du) \to -F(Du) = DF(u).$$

Thus $DF_n \to DF\ (\mathcal{P}')$, etc.

Recall that if $u \in \mathcal{P}$ then $Du$ is the limit of the "difference quotient"

$$t^{-1}(T_{-t}u - u).$$

Proposition 5.2. If $F \in \mathcal{P}'$, then

$$t^{-1}(T_{-t}F - F) \to DF\ (\mathcal{P}') \tag{11}$$

as $t \to 0$.
Proof. Suppose $u \in \mathcal{P}$. By definition,

$$t^{-1}(T_{-t}F - F)(u) = t^{-1}F(T_tu) - t^{-1}F(u) = -F(t^{-1}[u - T_tu]). \tag{12}$$

Now

$$t^{-1}[u - T_tu] = (-t)^{-1}[T_tu - u]. \tag{13}$$

An argument like that proving Lemma 3.2 and Corollary 3.3 shows that the expression in (13) converges to $Du$ in the sense of $\mathcal{P}$ as $t \to 0$. From this fact and (12) we get (11). □
As an example,

$$t^{-1}(T_{-t}\delta - \delta)(u) = t^{-1}[u(-t) - u(0)] \to -Du(0) = (D\delta)(u).$$

The real and imaginary parts of a function $v \in \mathcal{C}$ can be defined by

$$\operatorname{Re} v = \tfrac12(v + v^*), \qquad \operatorname{Im} v = \tfrac{1}{2i}(v - v^*).$$

Similarly, we define the real and imaginary parts of a periodic distribution $F$ by

$$\operatorname{Re} F = \tfrac12(F + F^*), \qquad \operatorname{Im} F = \tfrac{1}{2i}(F - F^*).$$

$F$ is said to be real if $F = F^*$. A function $v \in \mathcal{C}$ is said to be even if $v(x) = v(-x)$, all $x$; it is said to be odd if $v(x) = -v(-x)$, all $x$. These conditions may be written

$$v = \tilde v, \qquad v = -\tilde v.$$

Similarly, we say a periodic distribution $F$ is even if

$$F = \tilde F;$$

we say $F$ is odd if

$$F = -\tilde F.$$

Exercises
1. Which of the following define periodic distributions?

(a)
(b)
(c)
(d)
(e)
(f)
(g)

F(u)
F(u)
F(u)
F(u)
F(u)
F(u)
F(u)

= Du(1) - 3u(27T).
= J:" (u(xW dx.
=
=

J:" u(x) dx.

J: u(x)(1 + xf dx.

= - J:" D3U(x)lcos 2xl dx.


= L:7=o ajDiu(tj).
= L:i=o (j!)-l Dju(O).

2. Verify (8), (9), (10).
3. Express the distributions in parts (a) and (f) of Exercise 1 in terms of the $\delta$-distribution and its translates and derivatives.
4. Compute $DF$ when $F$ is the distribution in part (c) or (d) of Exercise 1.
5. Show that $\operatorname{Re} F$ and $\operatorname{Im} F$ are real. Show that $F = \operatorname{Re} F + i\operatorname{Im} F$.
6. Show that $F$ real and $u$ real, $u \in \mathcal{P}$, imply $F(u)$ is real.
7. Show that $F$ even and $u$ odd, $u \in \mathcal{P}$, imply $F(u) = 0$. Show that $F$ odd and $u$ even, $u \in \mathcal{P}$, imply $F(u) = 0$.
8. Show that any $F \in \mathcal{P}'$ can be written uniquely as $F = G + H$, where $G, H \in \mathcal{P}'$ and $G$ is even, $H$ is odd.
9. Suppose that $v \in \mathcal{C}$ is differentiable at each point of $\mathbb{R}$ and $Dv = w$ is in $\mathcal{C}$. Show that

$$D(F_v) = F_w;$$

in other words, if $F = v$, then $DF = Dv$.
10. Suppose $v$ is a continuous complex-valued function defined on the interval $[0, 2\pi]$ and that $Dv = w$ is continuous on $(0, 2\pi)$ and bounded. Define $F_v \in \mathcal{P}'$ by

$$F_v(u) = \frac{1}{2\pi}\int_0^{2\pi} v(x)u(x)\,dx, \qquad u \in \mathcal{P}.$$

Show that

$$DF_v(u) = \frac{1}{2\pi}(v(0) - v(2\pi))u(0) + \frac{1}{2\pi}\int_0^{2\pi} w(x)u(x)\,dx.$$

In other words,

$$DF_v = F_w + \frac{1}{2\pi}[v(0) - v(2\pi)]\delta.$$

11. Let $v(x) = |\sin \tfrac12 x|$. Compute

$$(D^2F_v)(u), \qquad u \in \mathcal{P}.$$


6. Determining the periodic distributions


We know that any continuous periodic function $v$ may be considered as a periodic distribution $F_v$. The derivatives $D^kF_v$ are also periodic distributions, though in general they are not (defined by) functions. It is natural to ask whether all periodic distributions are of the form $D^kF_v$, $v \in \mathcal{C}$. The answer is nearly yes.

Theorem 6.1. Suppose $F$ is a periodic distribution. Then there is an integer $k \ge 0$, a continuous periodic function $v$, and a constant function $f$, such that

$$F = D^kF_v + F_f. \tag{1}$$

The proof of this theorem will be given later in this section, after several other lemmas and theorems. First we need the notion of the order of a periodic distribution. A periodic distribution $F$ is said to be of order $k$ ($k$ an integer $\ge 0$) if there is a constant $c$ such that

$$|F(u)| \le c\{|u| + |Du| + \cdots + |D^ku|\}, \qquad \text{all } u \in \mathcal{P}.$$

For example, $\delta$ is of order 0. If $v \in \mathcal{C}$ then $D^kF_v$ is of order $k$. It is true, but not obvious, that any $F \in \mathcal{P}'$ is of order $k$ for some integer $k \ge 0$.

Theorem 6.2. If $F \in \mathcal{P}'$, then there is an integer $k \ge 0$ such that $F$ is of order $k$.
Proof. If $F$ is not of order $k$, there is a function $u_k \in \mathcal{P}$ such that

$$|F(u_k)| \ge (k + 1)\{|u_k| + |Du_k| + \cdots + |D^ku_k|\}.$$

Let

$$v_k = (k + 1)^{-1}\{|u_k| + |Du_k| + \cdots + |D^ku_k|\}^{-1}u_k.$$

Then we have

$$|F(v_k)| \ge 1, \tag{2}$$

while

$$|v_k| + |Dv_k| + \cdots + |D^kv_k| = (k + 1)^{-1}. \tag{3}$$

Suppose now that $F$ were not of order $k$ for any $k \ge 0$. Then we could find a sequence $(v_k)_{k=1}^\infty \subset \mathcal{P}$ satisfying (2) and (3) for each $k$. But (3) implies

$$v_k \to 0\ (\mathcal{P}).$$

Then (2) contradicts the continuity of $F$. Thus $F$ must be of order $k$, some $k$. □

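Lemma 6.3. Suppose $F \in \mathcal{P}'$ is of order 0. Then there is a unique continuous linear functional $F_1: \mathcal{C} \to \mathbb{C}$ such that

$$F_1(u) = F(u), \qquad \text{all } u \in \mathcal{P}.$$

Proof. By assumption there is a constant $c$ such that

$$|F(u)| \le c|u|, \qquad u \in \mathcal{P}.$$

If $u \in \mathcal{C}$, there is a sequence $(u_n)_{n=1}^\infty \subset \mathcal{P}$ such that $u_n \to u$ uniformly. Then

$$|F(u_n) - F(u_m)| \le c|u_n - u_m| \to 0,$$

so $(F(u_n))_{n=1}^\infty$ is a Cauchy sequence. Let

$$F_1(u) = \lim F(u_n). \tag{4}$$

We want to show that $F_1(u)$ is independent of the particular sequence used to approximate $u$. If $(v_n)_1^\infty \subset \mathcal{P}$ and $v_n \to u$ uniformly, then $|u_n - v_n| \to 0$, so

$$|F(u_n) - F(v_n)| \le c|u_n - v_n| \to 0.$$

Thus

$$\lim F(v_n) = \lim F(u_n).$$

The functional $F_1: \mathcal{C} \to \mathbb{C}$ defined by (4) is easily seen to be linear. It is continuous (= bounded), because

$$|F_1(u)| = \lim |F(u_n)| \le c\lim |u_n| = c|u|.$$

Conversely, suppose $F_2: \mathcal{C} \to \mathbb{C}$ is continuous and suppose $F_2(u) = F(u)$, all $u \in \mathcal{P}$. For any $u \in \mathcal{C}$, let $(u_n)_1^\infty \subset \mathcal{P}$ be such that $u_n \to u$ uniformly. Then

$$F_2(u) = \lim F_2(u_n) = \lim F(u_n) = F_1(u). \qquad \square$$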
(The remainder of this section is not needed subsequently.)
Lemma 6.4. Suppose $F \in \mathcal{P}'$ is of order 0, and suppose $F(w) = 0$ if $w$ is a constant function. Then there is a function $v \in \mathcal{C}$ such that

$$D^2F_v = F.$$

Proof. Let us suppose first that $F = F_f$, where $f \in \mathcal{C}$. We shall try to find a periodic function $v$ such that $D^2v = f$. Then we must have

$$Dv(x) = Dv(0) + \int_0^x f(t)\,dt = a + \int_0^x f(t)\,dt,$$

where $a$ is to be chosen so that $v$ is periodic. We may require $v(0) = 0$. Then

$$v(x) = \int_0^x Dv(t)\,dt = \int_0^x \left[a + \int_0^t f(s)\,ds\right]dt = ax + \int_0^x\!\!\int_0^t f(s)\,ds\,dt.$$

We use Theorem 7.3 of Chapter 2 to reverse the order of integration and get

$$v(x) = ax + \int_0^x f(s)(x - s)\,ds.$$

Let

$$(x - s)^+ = 0 \ \text{ if } x < s, \qquad (x - s)^+ = x - s \ \text{ if } x \ge s.$$

Then

$$v(x) = ax + \int_0^{2\pi} f(s)(x - s)^+\,ds. \tag{5}$$

By assumption on $f$,

$$\int_0^{2\pi} f(s)\,ds = 0.$$

Now we want to choose $a$ in (5) so that $v$ is periodic. This will be true if $v(2\pi) = 0$, i.e.,

$$0 = 2\pi a + \int_0^{2\pi} (2\pi - s)f(s)\,ds = 2\pi a - \int_0^{2\pi} sf(s)\,ds.$$

Thus

$$a = \frac{1}{2\pi}\int_0^{2\pi} sf(s)\,ds$$

and

$$v(x) = \frac{1}{2\pi}\int_0^{2\pi} f(s)[xs + 2\pi(x - s)^+]\,ds. \tag{6}$$
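Formula (6) can be sanity-checked numerically. The sketch below is an illustration only: it evaluates (6) by a midpoint rule for two sample choices of $f$ with zero mean, $f = \cos$ and $f = \sin$ (both chosen here, not taken from the text), and compares with the functions $1 - \cos x$ and $-\sin x$, which satisfy $D^2v = f$ and $v(0) = v(2\pi) = 0$. The helper names `integrate` and `v_from_f` are hypothetical.

```python
import math

def integrate(g, a, b, samples=20000):
    # Midpoint rule on [a, b].
    h = (b - a) / samples
    return h * sum(g(a + (j + 0.5) * h) for j in range(samples))

def v_from_f(f, x):
    # Formula (6): v(x) = (1/2pi) * integral_0^{2pi} f(s) [x s + 2pi (x - s)^+] ds,
    # evaluated for 0 <= x <= 2pi.
    kernel = lambda s: x * s + 2.0 * math.pi * max(x - s, 0.0)
    return integrate(lambda s: f(s) * kernel(s), 0.0, 2.0 * math.pi) / (2.0 * math.pi)

for x in (0.0, 1.0, 2.5, 2.0 * math.pi):
    # Left column: formula (6); right column: the closed form it should match.
    print(x, v_from_f(math.cos, x), 1.0 - math.cos(x))
    print(x, v_from_f(math.sin, x), -math.sin(x))
```

For $f = \sin$ the constant is $a = -1$, so $Dv(0) = -1$, consistent with $v(x) = -\sin x$.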

Now suppose only that $F \in \mathcal{P}'$ is of order 0 and that $F(w) = 0$ if $w$ is constant. Let $F_1$ be the extension of $F$ to a continuous linear functional on $\mathcal{C}$. Let

$$u_x(s) = xs + 2\pi(x - s)^+, \qquad 0 \le s \le 2\pi.$$

Then $u_x(0) = 2\pi x = u_x(2\pi)$. We can extend $u_x$ so that it is a continuous periodic function of $s$. Then (6) suggests that we define a function $v$ by

$$v(x) = F_1(u_x). \tag{7}$$
We want to show that $v \in \mathcal{C}$ and $D^2F_v = F$. It is easy to check that

Therefore

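$$|v(x) - v(y)| \le |F_1(u_x) - F_1(u_y)| \le c|u_x - u_y| \le 4\pi c|x - y|,$$
$$v(x + 2\pi) = F_1(u_{x+2\pi}) = F_1(u_x) = v(x).$$

Thus $v \in \mathcal{C}$. Let us compute $DF_v$. If $w \in \mathcal{P}$, then by Proposition 5.2

$$2\pi DF_v(w) = \lim_{t \to 0}\int_0^{2\pi} t^{-1}[v(x + t) - v(x)]w(x)\,dx.$$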


Approximate the integral by Riemann sums. These give expressions of the form

$$t^{-1}\sum [v(x_j + t) - v(x_j)]w(x_j)(x_j - x_{j-1}) = F_1\Bigl(\sum w(x_j)(x_j - x_{j-1})\,t^{-1}[u_{x_j+t} - u_{x_j}]\Bigr), \tag{8}$$

since $F_1$ is linear. As partitions $(x_0, x_1, \ldots, x_n)$ of $(0, 2\pi)$ are taken with smaller mesh, the functions on which $F_1$ acts in (8) converge uniformly to the function $g_t$. Here

$$g_t(s) = \int_0^{2\pi} t^{-1}[u_{x+t}(s) - u_x(s)]w(x)\,dx.$$

Now $|t^{-1}(u_{x+t} - u_x)| \le 4\pi$. For fixed $s \in (0, 2\pi)$, and $0 < x < s$,

$$t^{-1}(u_{x+t} - u_x) \to s \quad \text{as } t \to 0.$$

This convergence is uniform for $x$ in any closed subinterval of $(0, s)$. Similarly,

$$t^{-1}(u_{x+t} - u_x) \to s + 2\pi \quad \text{as } t \to 0,$$

uniformly for $x$ in any closed subinterval of $(s, 2\pi)$. It follows that

$$2\pi DF_v(w) = \lim_{t \to 0} t^{-1}[T_{-t}v - v](w) = F_1(g), \tag{9}$$

where

$$g(s) = s\int_0^{2\pi} w(x)\,dx + 2\pi\int_s^{2\pi} w(x)\,dx.$$

Then

$$2\pi(D^2F_v)(w) = -2\pi(DF_v)(Dw) = -F_1(h),$$

where

$$h(s) = s\int_0^{2\pi} Dw(x)\,dx + 2\pi\int_s^{2\pi} Dw(x)\,dx = 2\pi w(2\pi) - 2\pi w(s).$$

Since $F_1$ applied to a constant function gives zero, we have

$$2\pi(D^2F_v)(w) = -F_1(h) = 2\pi F_1(w) = 2\pi F(w).$$

Thus $D^2F_v = F$.

Lemma 6.5. Suppose $F \in \mathcal{P}'$ and suppose $F(w) = 0$ if $w$ is a constant function. Then there is a unique $G \in \mathcal{P}'$ such that $DG = F$ and $G(w) = 0$ if $w$ is a constant function. If $F$ is of order $k \ge 1$, then $G$ is of order $k - 1$.
Proof. If $u \in \mathcal{P}$, it is not necessarily the derivative of a periodic function. We can get a periodic function by setting

$$Su(x) = \int_0^x u(t)\,dt - \frac{x}{2\pi}\int_0^{2\pi} u(t)\,dt = \int_0^x u(t)\,dt - xF_e(u),$$

where

$$e(x) = 1, \qquad \text{all } x.$$

Then

$$D(Su) = u - F_e(u)e.$$

It follows that if $DG = F$ and $G(e) = 0$, then

$$G(u) = G(u - F_e(u)e) = G(D(Su)) = -DG(Su) = -F(Su). \tag{10}$$

Thus $G$ is unique. To prove existence, we use (10) to define $G$. Since $S: \mathcal{P} \to \mathcal{P}$ is linear, $G$ is linear. Also

$$|Su| \le 4\pi|u|, \qquad |D(Su)| \le 2|u|, \qquad |D^k(Su)| = |D^{k-1}u|, \quad k \ge 2.$$

Then if $u_n \to u\ (\mathcal{P})$ we have

$$G(u_n) = -F(Su_n) \to -F(Su) = G(u).$$

Thus $G \in \mathcal{P}'$. Also

$$DG(u) = -G(Du) = F(S(Du)) = F(u).$$

Finally, suppose $F$ is of order $k \ge 1$. Then

$$|G(u)| = |F(Su)| \le c\{|Su| + |DSu| + \cdots + |D^kSu|\} \le 5\pi c\{|u| + |Du| + \cdots + |D^{k-1}u|\},$$

and $G$ is of order $k - 1$.

Corollary 6.6. If $G \in \mathcal{P}'$ and $DG = 0$, then $G = F_f$, where $f$ is constant.
Proof. Again let $e(x) = 1$, all $x$. Let $f = G(e)e$, and

$$H = G - F_f.$$

Then $DH = 0$ and $H(e) = 0$. By Lemma 6.5 (uniqueness), $H = 0$. Thus $G = F_f$. □

Finally, we can prove Theorem 6.1. Suppose $F \in \mathcal{P}'$. Take an integer $k$ so large that $F$ is of order $k - 2 \ge 0$. Again, let $e(x) = 1$, all $x$, $f = F(e)e$, and

$$F_0 = F - F_f.$$

Then $F_0$ is of order $k - 2$ and $F_0(e) = 0$. By repeated applications of Lemma 6.5 we can find $F_1, F_2, \ldots, F_{k-2} \in \mathcal{P}'$ so that

$$DF_j = F_{j-1}, \qquad F_j(e) = 0,$$

and $F_j$ is of order $k - 2 - j$. Then $F_{k-2}$ is of order 0. By Lemma 6.4, there is a $v \in \mathcal{C}$ such that $D^2F_v = F_{k-2}$. Then

$$D^kF_v = D^{k-2}F_{k-2} = D^{k-3}F_{k-3} = \cdots = F_0 = F - F_f.$$


Exercises
1. To what extent are the functions $v$ and $f$ in Theorem 6.1 uniquely determined?
2. Find $v \in \mathcal{C}$ such that $D^2F_v = F$, where

$$F = \delta - T_\pi\delta,$$

i.e.,

$$F(u) = u(0) - u(\pi).$$

3. Find $v \in \mathcal{C}$ and a constant function $f$ such that

$$\delta = D^2F_v + F_f.$$

7. Convolution of distributions
Suppose $v \in \mathcal{C}$ and $u \in \mathcal{P}$. The convolution $v * u$ can be written as

$$(v * u)(x) = (u * v)(x) = \frac{1}{2\pi}\int_0^{2\pi} u(x - y)v(y)\,dy = \frac{1}{2\pi}\int_0^{2\pi} v(y)\tilde u(y - x)\,dy = F_v(T_x\tilde u);$$

here again $\tilde u(x) = u(-x)$. Because of this it is natural to define the convolution of a periodic distribution $F$ and a smooth periodic function $u$ by the formula

$$(F * u)(x) = F(T_x\tilde u). \tag{1}$$

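Proposition 7.1. If $F$ is a periodic distribution and $u$ is a smooth periodic function, then the function $F * u$ defined by (1) is a smooth periodic function. Moreover,

$$(aF) * u = a(F * u) = F * (au), \qquad F \in \mathcal{P}',\ u \in \mathcal{P},\ a \in \mathbb{C}; \tag{2}$$
$$(F + G) * u = F * u + G * u, \qquad F, G \in \mathcal{P}',\ u \in \mathcal{P}; \tag{3}$$
$$F * (u + v) = F * u + F * v, \qquad F \in \mathcal{P}',\ u, v \in \mathcal{P}; \tag{4}$$
$$T_t(F * u) = (T_tF) * u = F * (T_tu), \qquad F \in \mathcal{P}',\ u \in \mathcal{P}; \tag{5}$$
$$D(F * u) = (DF) * u = F * (Du), \qquad F \in \mathcal{P}',\ u \in \mathcal{P}. \tag{6}$$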

Proof. The identities (2)-(5) follow from the definition (1) by elementary manipulations. For example,

$$T_t(F * u)(x) = (F * u)(x - t) = F(T_{x-t}\tilde u) = F(T_{-t}T_x\tilde u) = (T_tF)(T_x\tilde u) = ((T_tF) * u)(x).$$

Also,

$$(F * (T_tu))(x) = F(T_x(T_tu)^\sim) = F(T_xT_{-t}\tilde u) = F(T_{x-t}\tilde u).$$

This proves (5). It follows that

$$(F * u)(x + 2\pi) = (F * (T_{-2\pi}u))(x) = (F * u)(x).$$

We know that

$$t^{-1}(T_{-t}F - F) \to DF\ (\mathcal{P}').$$

Therefore

$$t^{-1}[(F * u)(x + t) - (F * u)(x)] = t^{-1}[T_{-t}(F * u)(x) - (F * u)(x)] = t^{-1}[T_{-t}F - F](T_x\tilde u) \to DF(T_x\tilde u) = ((DF) * u)(x).$$

This shows that $F * u$ is differentiable at each point $x \in \mathbb{R}$, with derivative $(DF) * u(x)$. By induction, $D^k(F * u) = (D^kF) * u$. Thus $F * u \in \mathcal{P}$. Finally, using (5) again,

$$t^{-1}[(F * u)(x + t) - (F * u)(x)] = F * [t^{-1}(T_{-t}u - u)](x) = F(T_x[t^{-1}(T_{-t}u - u)]^\sim) \to F(T_x(Du)^\sim) = (F * (Du))(x).$$

By induction, $D^k(F * u) = F * (D^ku)$, all $k$. We leave the proofs of (2), (3), (4) as an exercise. □
As an example:

$$\delta * u = u, \qquad u \in \mathcal{P}. \tag{7}$$
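Definition (1) can be tried out on trigonometric polynomials. The following sketch is illustrative only: test functions are stored as dictionaries of Fourier coefficients (a representation assumed here, not taken from the text), and `reverse`, `convolve`, and `D_delta` are hypothetical helper names. It checks (7) and the instance $(D\delta) * u = Du$ of (6) at one point.

```python
import cmath

# A trigonometric polynomial u(x) = sum_k c_k exp(ikx) is stored as {k: c_k}.

def evaluate(u, x):
    return sum(c * cmath.exp(1j * k * x) for k, c in u.items())

def D(u):
    return {k: 1j * k * c for k, c in u.items()}

def T(t, u):
    # (T_t u)(x) = u(x - t)
    return {k: c * cmath.exp(-1j * k * t) for k, c in u.items()}

def reverse(u):
    # Reversed function u~(x) = u(-x): the coefficient c_k moves to index -k.
    return {-k: c for k, c in u.items()}

def convolve(F, u):
    # Definition (1): (F * u)(x) = F(T_x u~), returned as a function of x.
    return lambda x: F(T(x, reverse(u)))

def delta(u):
    return evaluate(u, 0.0)

def D_delta(u):
    # (D delta)(u) = -Du(0), from definition (6) of Section 5.
    return -delta(D(u))

u = {0: 1.0, 2: 1.0 - 1.0j, -1: 0.25}   # arbitrary sample test function
x = 1.3
print(convolve(delta, u)(x), evaluate(u, x))        # identity (7)
print(convolve(D_delta, u)(x), evaluate(D(u), x))   # (D delta) * u = Du
```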

In using (1) to define $F * u$, we departed from the procedure in §5, where operations on distributions were defined in terms of their actions on functions. Suppose $u \in \mathcal{C}$, $v \in \mathcal{C}$, $w \in \mathcal{P}$. Then

$$F_{v*u}(w) = F_{u*v}(w) = (2\pi)^{-2}\int_0^{2\pi}\!\!\int_0^{2\pi} u(x - y)v(y)w(x)\,dy\,dx = (2\pi)^{-2}\int_0^{2\pi} v(y)\left\{\int_0^{2\pi} \tilde u(y - x)w(x)\,dx\right\}dy = F_v(\tilde u * w). \tag{8}$$

This suggests that we could have defined $F * u$ as a distribution by letting it assign to $w \in \mathcal{P}$ the number $F(\tilde u * w)$. We shall see that this distribution corresponds to the function defined by (1).

Lemma 7.2. If $u \in \mathcal{C}$ and $v \in \mathcal{C}$, then $w = u * v$ is the uniform limit of the functions $w_n$, where

$$w_n = \sum_{m=1}^n n^{-1}v(2\pi m/n)T_{2\pi m/n}u. \tag{9}$$

Proof. Let $x_{mn} = 2\pi m/n$. Then it is easy to see that

$$2\pi(w_n(x) - w(x)) = \sum_{m=1}^n \int_{x_{m-1,n}}^{x_{mn}} [v(x_{mn})u(x - x_{mn}) - v(y)u(x - y)]\,dy.$$

Now $u$, $v$ are uniformly continuous and over the range of integration of the $m$-th summand,

$$|y - x_{mn}| < 2\pi/n.$$

Therefore $|w_n - w| \to 0$ as $n \to \infty$.
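The Riemann sums in (9) are easy to compute. As a rough illustration under assumptions made here (the sample functions `u` and `v` below are not from the text, and a very fine sum stands in for the exact convolution), the following sketch prints the largest deviation of $w_n$ from that reference over a few sample points.

```python
import math

def u(x):
    # Continuous, 2pi-periodic sample function with a corner at multiples of 2pi.
    return abs(math.sin(0.5 * x))

def v(x):
    # Smooth, 2pi-periodic sample function.
    return math.exp(math.cos(x))

def w_n(n, x):
    # Formula (9): w_n(x) = sum_{m=1}^{n} n^{-1} v(2 pi m / n) u(x - 2 pi m / n).
    step = 2.0 * math.pi / n
    return sum(v(m * step) * u(x - m * step) for m in range(1, n + 1)) / n

points = [0.0, 0.3, 1.0, 2.0, 3.0, 4.5, 6.0]
reference = [w_n(4096, x) for x in points]   # stands in for (u * v)(x)

def sup_error(n):
    # Largest deviation from the reference over the sample points.
    return max(abs(w_n(n, x) - r) for x, r in zip(points, reference))

for n in (8, 64, 512):
    print(n, sup_error(n))
```

The deviation shrinks as $n$ grows, in line with the uniform convergence asserted by the lemma; the sample points only probe the supremum and do not prove it.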

Corollary 7.3. If $u \in \mathcal{P}$ and $v \in \mathcal{C}$, then the functions $w_n$ given by (9) converge to $w = u * v$ in the sense of $\mathcal{P}$ as $n \to \infty$.
Proof. Since $D^k(T_xu) = T_x(D^ku)$, $D^kw_n$ is the corresponding sequence of functions for $(D^ku) * v = D^k(u * v)$. Therefore $D^kw_n \to D^kw$ uniformly as $n \to \infty$, for each $k$. □

Proposition 7.4. If $F \in \mathcal{P}'$ and $u, v \in \mathcal{P}$, then

$$F_f(v) = F(\tilde u * v),$$

where $f = F * u$.
Proof. Let $w = \tilde u * v$ and let $w_n$ be the corresponding function defined by (9), with $\tilde u$ replacing $u$. Then $w_n \to \tilde u * v\ (\mathcal{P})$, so

$$F(\tilde u * v) = \lim F(w_n).$$

But

$$F(w_n) = \sum_{m=1}^n n^{-1}v(2\pi m/n)F(T_{2\pi m/n}\tilde u) \to \frac{1}{2\pi}\int_0^{2\pi} v(x)f(x)\,dx = F_f(v).$$

We shall now define the convolution of two periodic distributions $F$, $G$ by

$$(F * G)(u) = F(\tilde G * u), \qquad u \in \mathcal{P}. \tag{10}$$

If $G = F_v$, $f = F * v$, then Proposition 7.4 shows that $F_f = F * G$. In general, we must verify that (10) defines a periodic distribution. Clearly $F * G: \mathcal{P} \to \mathbb{C}$ is linear. If $u_n \to u\ (\mathcal{P})$ and $G$ is of order $k$, then

$$|(\tilde G * u_n)(x) - (\tilde G * u)(x)| = |(\tilde G * (u_n - u))(x)| = |\tilde G(T_x(u_n - u)^\sim)|$$
$$\le c\{|T_x(u_n - u)| + |DT_x(u_n - u)| + \cdots + |D^kT_x(u_n - u)|\}$$
$$= c\{|u_n - u| + |D(u_n - u)| + \cdots + |D^k(u_n - u)|\}.$$

Thus $\tilde G * u_n \to \tilde G * u$ uniformly. Similarly, for each $j$, $D^j(\tilde G * u_n) = \tilde G * D^ju_n \to D^j(\tilde G * u)$ uniformly. Thus

$$\tilde G * u_n \to \tilde G * u\ (\mathcal{P}),$$

so

$$(F * G)(u_n) \to (F * G)(u).$$
This shows that $F * G \in \mathcal{P}'$. As an example,

$$F * \delta = F. \tag{11}$$

In the course of showing that $F * G$ is continuous, we have given an argument which proves the following.

Lemma 7.5. If $F \in \mathcal{P}'$, $(u_n)_1^\infty \subset \mathcal{P}$, and $u_n \to u\ (\mathcal{P})$, then $F * u_n \to F * u\ (\mathcal{P})$.
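Corollary 7.6. Suppose $(f_n)_1^\infty \subset \mathcal{P}$, $(g_n)_1^\infty \subset \mathcal{P}$, and set $F_n = F_{f_n}$, $G_n = F_{g_n}$. Suppose

$$F_n \to F\ (\mathcal{P}') \quad \text{and} \quad G_n \to G\ (\mathcal{P}').$$

Then

$$F_n * G \to F * G\ (\mathcal{P}')$$

and

$$F * G_n \to F * G\ (\mathcal{P}').$$

Proof. Suppose $u \in \mathcal{P}$. Then

$$(F_n * G)(u) = F_n(\tilde G * u) \to F(\tilde G * u) = (F * G)(u).$$

Also, $\tilde G_n * u \to \tilde G * u\ (\mathcal{P})$, so

$$(F * G_n)(u) = F(\tilde G_n * u) \to F(\tilde G * u) = (F * G)(u).$$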

We can now prove approximation theorems for periodic distributions analogous to those for functions.
Theorem 7.7. Suppose $(\varphi_n)_1^\infty \subset \mathcal{P}$ is an approximate identity, and suppose $F \in \mathcal{P}'$. Let $F_n = F_{f_n}$, where $f_n = F * \varphi_n$. Then $F_n \to F\ (\mathcal{P}')$.
In particular, there is a sequence $(f_n)_1^\infty$ of trigonometric polynomials such that $F_{f_n} \to F\ (\mathcal{P}')$.

Proof. We have, by Proposition 7.4,

$$F_n(u) = F(\tilde\varphi_n * u), \qquad u \in \mathcal{P}.$$

But $(\tilde\varphi_n)_{n=1}^\infty$ is also an approximate identity, so

$$\tilde\varphi_n * u \to u\ (\mathcal{P}).$$

Therefore

$$F_n(u) \to F(u).$$


If $(\varphi_n)_{n=1}^\infty$ is an approximate identity consisting of trigonometric polynomials, then the functions $f_n = F * \varphi_n$ are also trigonometric polynomials.
In fact, let
Then
But
so
Thus

is a trigonometric polynomial.
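One way to carry out this computation, using only definition (1) of this section, is sketched below. The notation $e_k(x) = \exp(ikx)$ and the coefficients $a_k$ are introduced here for illustration and may differ from the notation of the original displays.

```latex
\begin{align*}
e_k(x) &= \exp(ikx), \qquad \tilde e_k = e_{-k},\\
(T_x \tilde e_k)(y) &= \exp(-ik(y - x)) = \exp(ikx)\, e_{-k}(y),\\
(F * e_k)(x) &= F(T_x \tilde e_k) = F(e_{-k})\, e_k(x),\\
F * \Bigl(\sum_k a_k e_k\Bigr) &= \sum_k a_k F(e_{-k})\, e_k .
\end{align*}
```

Since the sum over $k$ is finite, the last line is again a trigonometric polynomial, with the $k$-th coefficient multiplied by the number $F(e_{-k})$.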

Finally, we prove the analog of Proposition 7.1 and Proposition 3.1.

Proposition 7.8. Suppose $F, G, H \in \mathcal{P}'$, $a \in \mathbb{C}$. Then

$$F * G = G * F, \tag{12}$$
$$(aF) * G = a(F * G) = F * (aG), \tag{13}$$
$$(F + G) * H = F * H + G * H, \tag{14}$$
$$(F * G) * H = F * (G * H), \tag{15}$$
$$T_t(F * G) = (T_tF) * G = F * (T_tG), \tag{16}$$
$$D^k(F * G) = (D^kF) * G = F * (D^kG). \tag{17}$$

Proof. All of these identities except (12) and (15) follow from the definitions by a sequence of elementary manipulations. As an example, we shall prove part of (16):

$$[T_t(F * G)](u) = F * G(T_{-t}u) = F(\tilde G * T_{-t}u) = F((T_{-t}\tilde G) * u) = F((T_tG)^\sim * u) = (F * T_tG)(u).$$

Here we used the identity

$$(T_tG)^\sim = T_{-t}\tilde G. \tag{18}$$

To prove (12) and (15) we use Theorem 7.7 and Corollary 7.6. First, suppose $G = F_g$, $g \in \mathcal{P}$. Take $(f_n)_1^\infty \subset \mathcal{P}$ such that

$$F_n = F_{f_n} \to F\ (\mathcal{P}').$$


Let

$$h_n = f_n * g, \qquad H_n = F_{h_n}.$$

It follows from (8), (10), and Corollary 7.6 that

$$H_n = F_n * G \to F * G\ (\mathcal{P}').$$

But

$$h_n = g * f_n,$$

so also

$$H_n = G * F_n \to G * F\ (\mathcal{P}').$$

Thus (12) is true when $G = F_g$, $g \in \mathcal{P}$. In the general case, take $(g_n)_1^\infty \subset \mathcal{P}$ so that

$$G_n = F_{g_n} \to G\ (\mathcal{P}').$$

Then, in the sense of $\mathcal{P}'$,

$$F * G = \lim F * G_n = \lim G_n * F = G * F.$$

The proof of (15) is similar. In the first place, (15) is true when

$$F = F_f, \qquad G = F_g, \qquad H = F_h,$$

since

$$F * (G * H) = F * F_{g*h} = F_{f*(g*h)} = F_{(f*g)*h} = (F * G) * H.$$

We then approximate an arbitrary $F$ by $F_{f_n}$ and get (15) when $G = F_g$, $H = F_h$. Then approximate $G$, $H$ successively to get (15) for all $F, G, H \in \mathcal{P}'$. The rest of the proof is left as an exercise.

Exercises
1. Prove the identities (2), (3), (4).
2. Prove the identities (7), (11).
3. Prove the identities (13), (14), (16), (17), (18) directly from the definitions.
4. Prove the identities in Exercise 3 by approximating the distributions
F, G, H by smooth periodic functions.

8. Summary of operations on periodic distributions


In this section we simply collect for reference the definitions and results concerning $\mathcal{P}'$. The space $\mathcal{P}$ is the set of infinitely differentiable periodic functions $u: \mathbb{R} \to \mathbb{C}$. We say

$$u_n \to u\ (\mathcal{P}) \quad \text{if} \quad |D^ku_n - D^ku| \to 0, \quad \text{all } k.$$

A periodic distribution is a mapping $F: \mathcal{P} \to \mathbb{C}$ with

$$F(u + v) = F(u) + F(v),$$
$$F(au) = aF(u),$$
$$F(u_n) \to F(u) \quad \text{if } u_n \to u\ (\mathcal{P}).$$

If $v \in \mathcal{C}$, the space of continuous periodic functions, then $F_v \in \mathcal{P}'$ is defined by

$$F_v(u) = \frac{1}{2\pi}\int_0^{2\pi} v(x)u(x)\,dx.$$

The $\delta$-distribution is defined by

$$\delta(u) = u(0).$$

The sum, scalar multiple, complex conjugate, reversal, and translation of distributions are defined by

$$(F + G)(u) = F(u) + G(u),$$
$$(aF)(u) = aF(u),$$
$$F^*(u) = (F(u^*))^* \qquad (u^*(x) = u(x)^*),$$
$$\tilde F(u) = F(\tilde u) \qquad (\tilde u(x) = u(-x)),$$
$$(T_tF)(u) = F(T_{-t}u) \qquad (T_tu(x) = u(x - t)).$$

Derivatives are defined by

$$(D^kF)(u) = (-1)^kF(D^ku).$$

We say

$$F_n \to F\ (\mathcal{P}')$$

if

$$F_n(u) \to F(u), \qquad \text{all } u \in \mathcal{P}.$$
In particular,

Then
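$$\delta = \delta^* = \tilde\delta, \qquad (T_t\delta)(u) = u(t), \qquad (D^k\delta)(u) = (-1)^kD^ku(0).$$

If $v \in \mathcal{C}$ then

$$(F_v)^* = F_{v^*}, \qquad (F_v)^\sim = F_{\tilde v}, \qquad T_t(F_v) = F_w, \ \text{where } w = T_tv.$$

If $v \in \mathcal{P}$,

$$D^k(F_v) = F_w, \ \text{where } w = D^kv.$$

The convolution $F * u$ is the function

$$(F * u)(x) = F(T_x\tilde u), \qquad u \in \mathcal{P}.$$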


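Then $F * u \in \mathcal{P}$. If $v \in \mathcal{C}$, then

$$F_v * u = v * u.$$

If $(\varphi_n)_1^\infty \subset \mathcal{P}$ is an approximate identity, then

$$F * \varphi_n \to F\ (\mathcal{P}').$$

More precisely,

$$F_{f_n} \to F\ (\mathcal{P}'), \quad \text{where } f_n = F * \varphi_n.$$

In particular,

$$\delta * u = u, \qquad u \in \mathcal{P}.$$

The convolution $F * G$ is the distribution

$$(F * G)(u) = F(\tilde G * u), \qquad u \in \mathcal{P}.$$

In particular if $v \in \mathcal{P}$ then

$$F * F_v = F_f, \quad \text{where } f = F * v.$$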
Clearly
If

$$F_n \to F\ (\mathcal{P}')$$

then

$$F_n * G \to F * G\ (\mathcal{P}').$$

The convolution of distributions satisfies

$$F * G = G * F,$$
$$(aF) * G = a(F * G) = F * (aG),$$
$$(F + G) * H = F * H + G * H,$$
$$(F * G) * H = F * (G * H),$$
$$T_t(F * G) = (T_tF) * G = F * (T_tG),$$
$$D^k(F * G) = (D^kF) * G = F * (D^kG).$$

A periodic distribution $F$ is real if $F = F^*$. Any $F \in \mathcal{P}'$ can be written uniquely as

$$F = G + iH, \qquad G, H \text{ real}.$$

In fact

$$G = \operatorname{Re} F = \tfrac12(F + F^*), \qquad H = \operatorname{Im} F = \tfrac{1}{2i}(F - F^*).$$

A periodic distribution $F$ is even if $F = \tilde F$ and odd if $F = -\tilde F$. Any $F \in \mathcal{P}'$ can be written uniquely as

$$F = G + H, \qquad G \text{ even},\ H \text{ odd}.$$

In fact

$$G = \tfrac12(F + \tilde F), \qquad H = \tfrac12(F - \tilde F).$$

The $\delta$-distribution is real and even.


A periodic distribution is of order $k$ if there is a constant $c$ such that

$$|F(u)| \le c(|u| + |Du| + \cdots + |D^ku|), \qquad \text{all } u \in \mathcal{P}.$$

Any $F \in \mathcal{P}'$ is of order $k$ for some $k$.

If v E <'C and f is constant,

+ Ff E f!)J'.
+ 2, k ~ 0, then there are v E <'C and con-

DkFv

Conversely, if F E f!)J' is of order k


stant function f such that

Chapter 4

Hilbert Spaces and Fourier Series


1. An inner product in $\mathcal{C}$, and the space $L^2$
Suppose $u$ and $v$ are in $\mathcal{C}$, the space of continuous complex-valued periodic functions. The inner product of $u$ and $v$ is the number $(u, v)$ defined by

$$(u, v) = \frac{1}{2\pi}\int_0^{2\pi} u(x)v(x)^*\,dx. \tag{1}$$

It is easy to verify the following properties of the inner product:

$$(au, v) = a(u, v) = (u, a^*v), \tag{2}$$
$$(u_1 + u_2, v) = (u_1, v) + (u_2, v), \tag{3}$$
$$(u, v_1 + v_2) = (u, v_1) + (u, v_2), \tag{4}$$
$$(v, u) = (u, v)^*, \tag{5}$$
$$(u, u) \ge 0, \qquad (u, u) = 0 \text{ only if } u = 0. \tag{6}$$

We define $\|u\|$ for $u \in \mathcal{C}$ by

$$\|u\| = (u, u)^{1/2} = \left(\frac{1}{2\pi}\int_0^{2\pi} |u(x)|^2\,dx\right)^{1/2}. \tag{7}$$

Lemma 1.1. If $u, v \in \mathcal{C}$, then

$$|(u, v)| \le \|u\|\,\|v\|. \tag{8}$$

Proof. If $v = 0$ then

$$(u, v) = (u, 0v) = 0(u, v) = 0,$$

and (8) is true. Suppose $v \ne 0$. Note that for any complex number $a$,

$$0 \le (u - av, u - av) = (u, u) - (av, u) - (u, av) + (av, av) = \|u\|^2 - a(u, v)^* - a^*(u, v) + |a|^2\|v\|^2. \tag{9}$$

Let

$$a = (u, v)\|v\|^{-2}.$$

Then (9) becomes

$$0 \le \|u\|^2 - 2|(u, v)|^2\|v\|^{-2} + |(u, v)|^2\|v\|^{-2} = \|u\|^2 - |(u, v)|^2\|v\|^{-2},$$

and this implies (8).

The inequality (8) is known as the Schwarz inequality. Note that only the properties (2)-(6) were used in the proof, and no other features of the inner product (1).
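The choice $a = (u, v)\|v\|^{-2}$ in the proof can be checked numerically. The sketch below is only an illustration: it approximates the inner product (1) by a midpoint sum for two sample functions chosen here ($u(x) = e^{ix} + 2$ and $v(x) = \cos x$, neither taken from the text) and confirms both the Schwarz inequality and the identity $\|u - av\|^2 = \|u\|^2 - |(u, v)|^2\|v\|^{-2}$ that (9) becomes.

```python
import cmath
import math

def inner(u, v, samples=4096):
    # Midpoint approximation of (u, v) = (1/2pi) * integral_0^{2pi} u(x) v(x)^* dx.
    h = 2.0 * math.pi / samples
    return sum(u((j + 0.5) * h) * complex(v((j + 0.5) * h)).conjugate()
               for j in range(samples)) / samples

def norm(u):
    return math.sqrt(inner(u, u).real)

u = lambda x: cmath.exp(1j * x) + 2.0   # sample function, an assumption of this sketch
v = lambda x: math.cos(x)               # sample function, an assumption of this sketch

uv = inner(u, v)
a = uv / norm(v) ** 2                   # the choice of a made in the proof
residual = lambda x: u(x) - a * v(x)    # the function u - av

print(abs(uv), norm(u) * norm(v))                                   # Schwarz: left <= right
print(norm(residual) ** 2, norm(u) ** 2 - abs(uv) ** 2 / norm(v) ** 2)  # the two sides of (9)
```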

Corollary 1.2. The function $u \to \|u\|$ is a norm on $\mathcal{C}$.
Proof. Recall that this means that $\|u\|$ satisfies

$$\|u\| \ge 0, \qquad \|u\| = 0 \text{ only if } u = 0, \tag{10}$$
$$\|au\| = |a|\,\|u\|, \qquad a \in \mathbb{C}, \tag{11}$$
$$\|u + v\| \le \|u\| + \|v\|. \tag{12}$$

Property (10) follows from (6) and property (11) follows from (2). To prove (12), we take the square and use the Schwarz inequality:

$$\|u + v\|^2 = (u + v, u + v) = \|u\|^2 + (u, v) + (v, u) + \|v\|^2 \le \|u\|^2 + 2\|u\|\,\|v\| + \|v\|^2 = (\|u\| + \|v\|)^2.$$

The new norm on $\mathcal{C}$ is dominated by the preceding norm:

$$\|u\| \le |u| = \sup\{|u(x)|\}. \tag{13}$$

It is important to note that $\mathcal{C}$ is not complete with respect to the metric associated with this new norm. For example, let $u_n: \mathbb{R} \to \mathbb{R}$ be the periodic function whose graph contains the line segments joining the pairs of points

(0,0),

(~7T - ~,J) ,

7T, 0),

G 0);
7T,

(27T, 0).

Then

so $(u_n)_1^\infty$ is a Cauchy sequence in the new metric. However, there is no $u \in \mathcal{C}$ such that

$$\|u_n - u\| \to 0.$$

In order to get a complete space which contains $\mathcal{C}$ with this inner product, we turn to the space of periodic distributions. Suppose $(u_n)_1^\infty \subset \mathcal{C}$ is a Cauchy sequence with respect to the metric induced by the norm $\|u\|$, i.e.

$$\|u_n - u_m\| \to 0 \quad \text{as } n, m \to \infty.$$


Let $(F_n)_1^\infty$ be the corresponding sequence of distributions:

$$F_n = F_{u_n}.$$

Thus if $v \in \mathcal{P}$,

$$F_n(v) = \frac{1}{2\pi}\int_0^{2\pi} u_n(x)v(x)\,dx = (v, u_n^*),$$

where again $u_n^*$ denotes the complex conjugate function. By the Schwarz inequality,

$$|F_n(v) - F_m(v)| = |(v, u_n^* - u_m^*)| \le \|v\|\,\|u_n - u_m\| \le |v|\,\|u_n - u_m\|.$$

Therefore $(F_n(v))_1^\infty$ has a limit. We define

$$F(v) = \lim F_n(v). \tag{14}$$

The functional $F: \mathcal{P} \to \mathbb{C}$ defined by (14) is clearly linear, since each $F_n$ is linear. In fact, $F$ is a periodic distribution. To see this, we take $N$ so large that

$$\|u_n - u_m\| < 1 \qquad \text{if } n, m \ge N.$$

Let

$$M = \max\{\|u_1\|, \|u_2\|, \ldots, \|u_{N-1}\|, \|u_N\| + 1\}.$$

Then for any $n \le N$,

$$\|u_n\| \le M,$$

while if $n > N$,

$$\|u_n\| = \|u_n - u_N + u_N\| \le \|u_n - u_N\| + \|u_N\| < 1 + \|u_N\| \le M.$$

Therefore

$$|F_n(v)| = |(v, u_n^*)| \le \|v\|\,\|u_n\| \le M|v|,$$

so

$$|F(v)| = \lim |F_n(v)| \le M|v|.$$

We have proved the following lemma.

Lemma 1.2. If $(u_n)_1^\infty \subset \mathcal{C}$ is a Cauchy sequence with respect to the norm $\|u\|$, then the corresponding sequence of distributions

$$F_n = F_{u_n}$$

converges in the sense of $\mathcal{P}'$ to a distribution $F$, which is of order 0.
It is important to know when two Cauchy sequences in $\mathcal{C}$ give rise to the same distribution.

Lemma 1.3. Suppose $(u_n)_1^\infty \subset \mathcal{C}$ and $(v_n)_1^\infty \subset \mathcal{C}$ are Cauchy sequences with respect to the norm $\|u\|$. Let $F_n = F_{u_n}$ and $G_n = F_{v_n}$ be the corresponding distributions, and let $F$, $G$ be the limits:

$$F_n \to F\ (\mathcal{P}') \quad \text{and} \quad G_n \to G\ (\mathcal{P}').$$


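Then $F = G$ if and only if

$$\|u_n - v_n\| \to 0.$$

Proof. Let $w_n = u_n - v_n$ and let $H_n = F_n - G_n = F_{w_n}$. We want to show

$$H_n \to 0\ (\mathcal{P}') \quad \text{if and only if} \quad \|w_n\| \to 0.$$

Suppose

$$\|w_n\| \to 0.$$

Then for any $u \in \mathcal{P}$,

$$|H_n(u)| = |(u, w_n^*)| \le \|u\|\,\|w_n\| \to 0.$$

Conversely, suppose

$$H_n \to 0\ (\mathcal{P}'). \tag{15}$$

Given $\varepsilon > 0$, take $N$ so large that $n, m \ge N$ implies

$$\|w_n - w_m\| \le \varepsilon. \tag{16}$$

Fix $m \ge N$. Then if $n \ge N$ we use (16) to get

$$\|w_m\|^2 = (w_m, w_m) = (w_m, w_m - w_n) + (w_m, w_n) = (w_m, w_m - w_n) + H_n(w_m^*)^* \le \varepsilon\|w_m\| + |H_n(w_m^*)|.$$

Letting $n \to \infty$, from (15) we get

$$\|w_m\|^2 \le \varepsilon\|w_m\|,$$

or

$$\|w_m\| \le \varepsilon, \qquad m \ge N.$$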

We define $L^2$ to be the set consisting of all periodic distributions $F$ with the property that there is a sequence $(u_n)_1^\infty \subset \mathcal{C}$ such that

$$\|u_n - u_m\| \to 0 \ \text{ as } n, m \to \infty, \qquad F_{u_n} \to F\ (\mathcal{P}').$$

If $(u_n)_1^\infty \subset \mathcal{C}$ is such a sequence, we say that it converges to $F$ in the sense of $L^2$ and write

$$u_n \to F\ (L^2).$$

Lemma 1.3 can be rephrased: if

$$u_n \to F\ (L^2), \qquad v_n \to G\ (L^2),$$

then

$$F = G \quad \text{if and only if} \quad \|u_n - v_n\| \to 0.$$

Clearly $L^2$ is a subspace of $\mathcal{P}'$ in the sense of vector spaces. In fact, if

$$u_n \to F\ (L^2) \quad \text{and} \quad v_n \to G\ (L^2),$$

then

$$u_n + v_n \to F + G\ (L^2).$$

We may extend the inner product on $\mathcal{C}$ to $L^2$ as follows. If

$$u_n \to F\ (L^2), \qquad v_n \to G\ (L^2),$$

let

$$(F, G) = \lim (u_n, v_n). \tag{17}$$

The existence of this limit is left as an exercise. Lemma 1.3 shows that the limit is independent of the particular sequences $(u_n)_1^\infty$ and $(v_n)_1^\infty$. That is, if also

$$u_n' \to F\ (L^2), \qquad v_n' \to G\ (L^2)$$

then

$$\lim (u_n', v_n') = \lim (u_n, v_n). \tag{18}$$

Theorem 1.3. The inner product in $L^2$ defined by (17) satisfies the identities (2), (3), (4), (5), (6). If we define

$$\|F\| = (F, F)^{1/2}, \tag{19}$$

then this is a norm on $L^2$. The space $L^2$ is complete with respect to this norm.
Proof. The fact that (2)-(6) hold is a consequence of (17) and (2)-(6) for functions. We also have the Schwarz inequality in $L^2$:

$$|(F, G)| \le \|F\|\,\|G\|.$$

It follows that $\|F\|$ is a norm.
Finally, suppose $(F_n)_1^\infty \subset L^2$ is a Cauchy sequence with respect to this norm. First, note that if

$$u_n \to F\ (L^2)$$

and $v \in \mathcal{C}$, and if we take $v_n = v$, all $n$, then

$$\|F - F_v\| = \lim \|u_n - v\|.$$

It follows that

$$\lim_{m \to \infty} \|F - F_{u_m}\| = 0,$$

i.e., $F$ can be approximated in $L^2$ by functions. Therefore, for each $n = 1, 2, \ldots$ we can find a function $v_n \in \mathcal{C}$ such that

$$\|F_n - F_{v_n}\| < n^{-1}.$$

Then

$$\|v_n - v_m\| = \|F_{v_n} - F_{v_m}\| \le \|F_{v_n} - F_n\| + \|F_n - F_m\| + \|F_m - F_{v_m}\| < \|F_n - F_m\| + n^{-1} + m^{-1} \to 0 \quad \text{as } n, m \to \infty.$$

Thus there is an $F \in L^2$ such that

$$v_n \to F\ (L^2).$$

But then

$$\|F_n - F\| \le \|F_n - F_{v_n}\| + \|F_{v_n} - F\| < n^{-1} + \|F_{v_n} - F\| \to 0.$$

Exercises

1. Carry out the proof that $\mathcal{C}$ is not complete with respect to the norm $\|u\|$.
2. Show that the limit in (17) exists.
3. Show that (18) is true.
4. Suppose $f: [0, 2\pi] \to \mathbb{R}$ is such that

$$f(x) = 1, \ x \in [a, b), \qquad f(x) = 0, \ x \notin [a, b).$$

Define

$$F(v) = \frac{1}{2\pi}\int_0^{2\pi} f(x)v(x)\,dx = \frac{1}{2\pi}\int_a^b v(x)\,dx, \qquad v \in \mathcal{P}.$$

Show that $F \in L^2$.
5. Suppose $f: [0, 2\pi] \to \mathbb{C}$ is constant on each subinterval $[x_{i-1}, x_i)$, where

$$0 = x_0 < x_1 < \cdots < x_n = 2\pi.$$

Define

$$F(v) = \frac{1}{2\pi}\int_0^{2\pi} f(x)v(x)\,dx.$$

Show that $F \in L^2$.
6. Show that $\delta$, the $\delta$-distribution, is not in $L^2$.
7. Show that if $F \in L^2$ there is a sequence $(u_n)_1^\infty$ of smooth periodic functions such that

$$u_n \to F\ (L^2).$$

8. Let $T_t$ denote translation. Show that if $u \in \mathcal{C}$ then

$$\|T_tu - T_su\| \to 0 \quad \text{as } t \to s.$$

If $F \in L^2$, show that

$$\|T_tF - T_sF\| \to 0 \quad \text{as } t \to s.$$

9. For any $F \in L^2$, show that

$$\|F\| = \sup\{|F(u)| : u \in \mathcal{P},\ \|u\| \le 1\}.$$


2. Hilbert space
In this section we consider an abstract version of the space $L^2$ of §1. This clarifies the nature of certain theorems. In addition, the abstract version describes other spaces which are obtained in very different ways.
Suppose $H$ is a vector space over the real or complex numbers. An inner product in $H$ is a function assigning to each ordered pair of elements $u, v \in H$ a real or complex number denoted by $(u, v)$, such that

$$(au, v) = a(u, v), \qquad a \text{ a scalar},$$
$$(u_1 + u_2, v) = (u_1, v) + (u_2, v),$$
$$(v, u) = (u, v)^* \qquad ((v, u) = (u, v) \text{ in the real case}),$$
$$(u, u) > 0 \qquad \text{if } u \ne 0.$$

The argument of Lemma 1.1 shows that

$$|(u, v)| \le \|u\|\,\|v\|, \tag{1}$$

where

$$\|u\| = (u, u)^{1/2}. \tag{2}$$

Then $\|u\|$ is a norm on $H$. If $H$ is complete with respect to the metric associated with this norm, then $H$ is said to be a Hilbert space. In particular, $L^2$ is a Hilbert space. Clearly any Hilbert space is a Banach space.
A more mundane example than $L^2$ is the finite-dimensional vector space $\mathbb{C}^N$ of $N$-tuples of complex numbers, with

$$(a, b) = \sum_{n=1}^N a_nb_n^*$$

when

$$a = (a_1, a_2, \ldots, a_N), \qquad b = (b_1, b_2, \ldots, b_N).$$

In this case the Schwarz inequality is

$$\Bigl|\sum_{n=1}^N a_nb_n^*\Bigr|^2 \le \sum_{n=1}^N |a_n|^2 \sum_{n=1}^N |b_n|^2.$$

Notice in particular that if we let

then

A still more mundane example is the plane 1R2, with

ab =

a1b1

+ a2b2

when a = (ab a2), b = (bb b2); here we use the dot to avoid confusing the
inner product with ordered pairs. It is worth noting that the law of cosines
of trigonometry can be written
(2)

ab = lallbl cos 0,

Hilbert spaces and Fourier series

110

where is the angle between the line segments from 0 to a and the line
segment from 0 to b. Therefore labl = lallbl if and only if the segments
lie on the same line. Similarly, a b = 0 if and only if the segments form a
right angle.
Elements u and v of a Hilbert space H are said to be orthogonal if the
inner product (u, v) is zero. In 1R2 this means that the corresponding line
segments are perpendicular. We write
u.lv
when u and v are orthogonal. More generally, U E H is said to be orthogonal
to the subset S c H if
all v E S.

u.l v,

If so we write
u.l S.
Ifu.l v then

(3)

Ilu

+ vl1 2 = (u + v, u + v) = (u, u) + (u, v) + (v, u) + (v, v)


= IIul1 2 + Ilv11 2

In $\mathbb{R}^2$, this is essentially the Pythagorean theorem, and we shall give the identity (3) that name in any case. Another simple identity with a classical geometric interpretation is the parallelogram law:
$$\|u - v\|^2 + \|u + v\|^2 = 2\|u\|^2 + 2\|v\|^2. \tag{4}$$
This follows immediately from the properties of the inner product. In $\mathbb{R}^2$ it says that the sum of the squares of the lengths of the diagonals of the parallelogram with vertices $0$, $u$, $v$, $u + v$ is equal to the sum of the squares of the lengths of the (four) sides.
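The Schwarz inequality (1), the Pythagorean identity (3), and the parallelogram law (4) can be checked numerically in $\mathbb{C}^3$. The following Python sketch is an illustration added here, not part of the original text; the helper names `inner`, `norm_sq`, `add`, `sub` and the sample vectors are assumptions chosen for the example.

```python
# Inner product on C^N as in the text: (a, b) = sum of a_n * conj(b_n).
def inner(a, b):
    return sum(x * y.conjugate() for x, y in zip(a, b))

def norm_sq(a):
    # (a, a) is real and nonnegative; .real drops the zero imaginary part.
    return inner(a, a).real

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

u = [1 + 0j, 1j, 2 + 0j]
v = [1 + 0j, -1j, 0j]        # chosen so that (u, v) = 0
w = [1j, 1 + 1j, -2 + 0j]    # an arbitrary third vector, not orthogonal to u

# u and v are orthogonal, so identity (3) should hold.
pythagoras_gap = norm_sq(add(u, v)) - (norm_sq(u) + norm_sq(v))

# The parallelogram law (4) holds for any pair, orthogonal or not.
parallelogram_gap = (norm_sq(sub(u, w)) + norm_sq(add(u, w))
                     - 2 * norm_sq(u) - 2 * norm_sq(w))

# Schwarz inequality (1), in squared form: |(u, w)|^2 <= ||u||^2 ||w||^2.
schwarz_ok = abs(inner(u, w)) ** 2 <= norm_sq(u) * norm_sq(w)
```

Both gaps should vanish up to rounding, and `schwarz_ok` should be true; replacing `v` by a vector that is not orthogonal to `u` makes `pythagoras_gap` nonzero while the parallelogram law continues to hold.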
When speaking of convergence in a Hilbert space, we shall always mean convergence with respect to the metric associated with the norm. Thus $u_n \to u$ means
$$\|u_n - u\| \to 0.$$

The Schwarz inequality shows that the inner product is a continuous function.
Lemma 2.1. If $(u_n)_1^\infty, (v_n)_1^\infty \subset H$ and
$$u_n \to u, \quad v_n \to v,$$
then
$$(u_n, v_n) \to (u, v).$$

Proof. Since the sequences converge, they are bounded. In particular, there is a constant $M$ such that $\|v_n\| \le M$, all $n$. Then
$$|(u_n, v_n) - (u, v)| = |(u_n - u, v_n) + (u, v_n - v)|$$
$$\le \|u_n - u\|\,\|v_n\| + \|u\|\,\|v_n - v\|$$
$$\le M\|u_n - u\| + \|u\|\,\|v_n - v\| \to 0.$$

Corollary 2.2. If $u \in H$, $S \subset H$, and $u \perp S$, then $u$ is orthogonal to the closure of $S$.

The general theory of Hilbert space essentially rests on the following two geometric lemmas.
Lemma 2.3. Suppose $H_1$ is a closed subspace of the Hilbert space $H$, and suppose $u \in H$. Then there is a unique $v \in H_1$ which is closest to $u$, in the sense that
$$\|u - v\| \le \|u - w\|, \quad \text{all } w \in H_1.$$

Proof. The set
$$\{\|u - w\| \mid w \in H_1\}$$
is bounded below by 0. Let $d$ be the greatest lower bound of this set. For each integer $n > 0$ there is an element $v_n \in H_1$ such that
$$\|u - v_n\| < d + n^{-1}.$$
If we can show that $(v_n)_1^\infty$ is a Cauchy sequence, then it has a limit $v \in H$. Since $H_1$ is closed, we would have $v \in H_1$ and $\|u - v\| = d$ as desired.
Geometrically the argument that $(v_n)_1^\infty$ is a Cauchy sequence is as follows. The midpoint $\tfrac12(v_n + v_m)$ of the line segment joining $v_n$ and $v_m$ lies in $H_1$, so it has distance $\ge d$ from $u$, by the definition of $d$. Therefore the square of the length of one diagonal of the parallelogram with vertices $u$, $v_n$, $v_m$, $v_n + v_m - u$ is nearly equal to the sum of squares of the lengths of the sides. It follows that the length $\|v_n - v_m\|$ of the other diagonal is small. Algebraically, we use (4) to get
$$0 \le \|v_n - v_m\|^2 = 2\|v_n - u\|^2 + 2\|v_m - u\|^2 - \|(v_n + v_m) - 2u\|^2$$
$$< 2(d + n^{-1})^2 + 2(d + m^{-1})^2 - 4\|\tfrac12(v_n + v_m) - u\|^2$$
$$\le 2(d + n^{-1})^2 + 2(d + m^{-1})^2 - 4d^2 \to 0.$$

To show uniqueness, suppose that $v$ and $w$ both are closest to $u$ in the above sense. Then another application of the parallelogram law gives
$$\|u - \tfrac12(v + w)\|^2 = \tfrac12\|u - v\|^2 + \tfrac12\|u - w\|^2 - \tfrac14\|v - w\|^2 = d^2 - \tfrac14\|v - w\|^2.$$
Since the left side is $\ge d^2$, we must have $v = w$.

As an example, take $H = \mathbb{R}^2$, $H_1$ a line through the origin. The unique point on this line closest to a given point $u$ is obtained as the intersection of $H_1$ and the line through $u$ perpendicular to $H_1$. This connection between perpendicularity (orthogonality) and the closest point is also true in the general case.
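The planar example can be made concrete. The sketch below is an added illustration, not taken from the text: it computes the closest point on a line through the origin by orthogonal projection, then checks that the residual is perpendicular to the line and that no sampled point of the line is closer. The function names and the sample point and direction are assumptions made for the example.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def closest_point_on_line(u, direction):
    # Orthogonal projection of u onto the line H1 = {t * direction : t real}.
    t = dot(u, direction) / dot(direction, direction)
    return (t * direction[0], t * direction[1])

def dist_sq(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

u = (3.0, 1.0)
d = (1.0, 2.0)                      # H1 is the line through 0 and d
v = closest_point_on_line(u, d)
residual = (u[0] - v[0], u[1] - v[1])

# Orthogonality of u - v to H1: the inner product with the direction vanishes.
orth = dot(residual, d)

# No sampled point t*d of H1 is closer to u than v is.
best = dist_sq(u, v)
samples = [-5 + 0.01 * k for k in range(1001)]
no_closer = all(dist_sq(u, (t * d[0], t * d[1])) >= best - 1e-12 for t in samples)
```

Here the projection is $v = (1, 2)$ and the residual is $u - v = (2, -1)$, which is perpendicular to the direction $(1, 2)$.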
Lemma 2.4. Under the hypotheses of Lemma 2.3, the element $v \in H_1$ is closest to $u$ if and only if
$$u - v \perp H_1.$$

Proof. First, suppose $v \in H_1$ is closest to $u$, and suppose $w \in H_1$. We want to show $(u - v, w) = 0$, and we may assume $w \ne 0$. Let $u_1 = u - v$. For any $a \in \mathbb{C}$, $v + aw \in H_1$. Therefore
$$\|u_1 - aw\|^2 = \|u - (v + aw)\|^2 \ge \|u - v\|^2 = \|u_1\|^2,$$
or
$$-a(w, u_1) - a^*(u_1, w) + |a|^2\|w\|^2 \ge 0. \tag{5}$$
Let
$$a = (u_1, w)\,\|w\|^{-2}.$$
Then (5) becomes
$$-|(u_1, w)|^2\,\|w\|^{-2} \ge 0.$$
Thus $(u_1, w) = 0$.
Conversely, suppose $u - v \perp H_1$, and suppose $w \in H_1$. Then $v - w \in H_1$, so
$$(u - v) \perp (v - w).$$
The Pythagorean theorem gives
$$\|u - w\|^2 = \|(u - v) + (v - w)\|^2 = \|u - v\|^2 + \|v - w\|^2 \ge \|u - v\|^2.$$

Corollary 2.5. Suppose $H_1$ is a closed subspace of a Hilbert space $H$. Then either $H_1 = H$, or there is a nonzero element $u \in H$ such that $u \perp H_1$.

Proof. If $H_1 \ne H$, take $u_0 \in H$, $u_0 \notin H_1$. Take $v_0 \in H_1$ such that $v_0$ is closest to $u_0$. Then $u = u_0 - v_0$ is nonzero and orthogonal to $H_1$. $\square$

As a first application of these results, we determine all the bounded linear functionals on $H$. The following theorem is one of several results known as the Riesz Representation Theorem.
Theorem 2.6. Suppose $H$ is a Hilbert space and suppose $v \in H$. The mapping $L_v: H \to \mathbb{C}$ (or $\mathbb{R}$) defined by
$$L_v(u) = (u, v), \quad u \in H,$$
is a bounded linear functional on $H$. Moreover, if $L$ is any bounded linear functional on $H$, then there is a unique $v \in H$ such that $L = L_v$.

Proof. Clearly $L_v$ is linear. By the Schwarz inequality
$$|L_v(u)| \le \|v\|\,\|u\|.$$
Thus $L_v$ is bounded.
Suppose $L$ is a bounded linear functional on $H$. If $L = 0$ we may take $v = 0$. Otherwise, let
$$H_1 = \{u \in H \mid L(u) = 0\}.$$

Then $H_1$ is a subspace of $H$, since $L$ is linear; $H_1$ is closed, since $L$ is continuous. Since $L \ne 0$, $H_1$ is not $H$. Take a nonzero $u \in H$ which is orthogonal to $H_1$, and let
$$v = \|u\|^{-2} L(u)^* u.$$
Then also $v \perp H_1$, so
$$L_v(w) = 0 = L(w), \quad w \in H_1.$$
Moreover,
$$L_v(u) = \|u\|^{-2} L(u)(u, u) = L(u).$$
If $w$ is any element of $H$,
$$w - L(u)^{-1}L(w)u \in H_1.$$
Thus any element of $H$ is of the form
$$au + w_1$$
for some $a \in \mathbb{C}$ (or $\mathbb{R}$) and $w_1 \in H_1$. It follows that $L_v = L$.
To show uniqueness, suppose $L_v = L_w$. Then
$$0 = L_v(v - w) - L_w(v - w) = \|v - w\|^2.$$
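In $\mathbb{C}^N$ the construction in this proof can be carried out explicitly. The sketch below is an added illustration under stated assumptions: the functional $L(x) = \sum c_n x_n$ with the hypothetical coefficients `c`, and the choice $u = (c_n^*)$ as a vector orthogonal to the null space of $L$. It forms $v = \|u\|^{-2} L(u)^* u$, with the complex conjugate on $L(u)$ that the complex case requires, and checks that $L(x) = (x, v)$ for a test vector.

```python
def inner(a, b):
    # (a, b) = sum of a_n * conj(b_n), as in the text.
    return sum(x * y.conjugate() for x, y in zip(a, b))

# A linear functional on C^3, given by hypothetical coefficients c_n.
c = [2 - 1j, 0.5j, -3 + 0j]

def L(x):
    return sum(cn * xn for cn, xn in zip(c, x))

# u = (conj(c_n)) satisfies L(x) = (x, u), so u is orthogonal to
# H1 = {x : L(x) = 0}.
u = [cn.conjugate() for cn in c]

# The representing vector from the proof: v = ||u||^{-2} * conj(L(u)) * u.
scale = L(u).conjugate() / inner(u, u).real
v = [scale * un for un in u]

x = [1 + 1j, -2 + 0j, 0.5 - 4j]
gap = abs(L(x) - inner(x, v))
```

In this finite-dimensional case the scale factor equals 1, so $v = u$; the general formula matters when $u$ is only known up to a scalar multiple.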

Exercises
1. Prove (2).
2. Suppose $F \in \mathscr{P}'$. Show that $F \in L^2$ if and only if there is a constant $c$ such that
$$|F(u)| \le c\|u\|, \quad \text{all } u \in \mathscr{P}.$$
3. Let $H$ be any Hilbert space and let $H_1$ be a closed subspace of $H$. Let
$$H_2 = \{u \in H \mid u \perp H_1\}.$$
Show that $H_2$ is a closed subspace of $H$. Show that for any $u \in H$ there are unique vectors $u_1 \in H_1$ and $u_2 \in H_2$ such that
$$u = u_1 + u_2.$$

3. Hilbert spaces of sequences


In this section we consider two infinite dimensional analogs of the finite dimensional complex Hilbert space $\mathbb{C}^N$. Recall that if
$$x = (a_1, a_2, \ldots, a_N) \in \mathbb{C}^N$$
then
$$\|x\|^2 = \sum_{n=1}^{N} |a_n|^2.$$
Let $l_+^2$ denote the set of all sequences
$$x = (a_n)_1^\infty \subset \mathbb{C}$$
such that
$$\sum_{n=1}^{\infty} |a_n|^2 < \infty. \tag{1}$$
If
$$x = (a_n)_1^\infty, \quad y = (b_n)_1^\infty \in l_+^2,$$
we set
$$(x, y) = \sum_{n=1}^{\infty} a_n b_n^*, \tag{2}$$
provided this series converges.

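Theorem 3.1. The space $l_+^2$ of complex sequences satisfying (1) is a Hilbert space with respect to the inner product (2).

Proof. Suppose
$$x = (a_n)_1^\infty \in l_+^2 \quad \text{and} \quad y = (b_n)_1^\infty \in l_+^2.$$
If $a \in \mathbb{C}$, then clearly
$$ax = (a a_n)_1^\infty \in l_+^2.$$
As for $x + y$, we have
$$\sum |a_n + b_n|^2 \le \sum (|a_n|^2 + 2|a_n b_n| + |b_n|^2) \le 2\sum |a_n|^2 + 2\sum |b_n|^2 < \infty.$$
Thus $l_+^2$ is a vector space. To show that the inner product (2) is defined for all $x, y \in l_+^2$, we use the inequality (1) from §2. For each $N$,
$$\sum_{n=1}^{N} |a_n b_n^*| \le \Bigl(\sum_{n=1}^{N} |a_n|^2\Bigr)^{1/2} \Bigl(\sum_{n=1}^{N} |b_n|^2\Bigr)^{1/2} \le \Bigl(\sum_{n=1}^{\infty} |a_n|^2\Bigr)^{1/2} \Bigl(\sum_{n=1}^{\infty} |b_n|^2\Bigr)^{1/2}.$$
Therefore
$$\sum_{n=1}^{\infty} |a_n b_n^*| < \infty$$
and (2) converges. It is easy to check that $(x, y)$ has the properties of an inner product. The only remaining question is whether $l_+^2$ is complete.
Suppose
$$x_m = (a_{m,n})_{n=1}^\infty \in l_+^2, \quad m = 1, 2, \ldots.$$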

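Suppose $(x_m)_1^\infty$ is a Cauchy sequence in the metric corresponding to the norm
$$\|x\| = (x, x)^{1/2}.$$
For each fixed $n$,
$$|a_{m,n} - a_{p,n}|^2 \le \|x_m - x_p\|^2 \to 0$$
as $m, p \to \infty$. Thus $(a_{m,n})_{m=1}^\infty$ is a Cauchy sequence in $\mathbb{C}$, and
$$a_{m,n} \to a_n \quad \text{as } m \to \infty.$$
Let $x = (a_n)_1^\infty$. We want to show that $x \in l_+^2$ and
$$\|x_m - x\| \to 0.$$
Since $(x_m)_1^\infty$ is a Cauchy sequence, it is bounded:
$$\|x_m\| \le K, \quad \text{all } m.$$
Therefore for any $N$,
$$\sum_{n=1}^{N} |a_n|^2 = \lim_{m\to\infty} \sum_{n=1}^{N} |a_{m,n}|^2 \le K^2,$$
so $x \in l_+^2$. Finally, given $\varepsilon > 0$ choose $M$ so large that $m, p \ge M$ implies
$$\|x_m - x_p\| < \varepsilon.$$
Then for any $N$ and any $m \ge M$,
$$\sum_{n=1}^{N} |a_{m,n} - a_n|^2 = \lim_{p\to\infty} \sum_{n=1}^{N} |a_{m,n} - a_{p,n}|^2 \le \varepsilon^2.$$
Thus
$$\|x_m - x\| \le \varepsilon \quad \text{if } m \ge M.$$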
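The role of the Schwarz inequality in the first part of this proof can be seen on truncated sequences. The sketch below is an added illustration; the two sample sequences $x = (1/n)$ and $y = (1/n^2)$ and the truncation length `N` are assumptions made for the example. The partial sums of (1) stay below $\pi^2/6$ for $x$, and the partial sum of (2) stays below the Schwarz bound.

```python
import math

N = 10000
a = [1.0 / n for n in range(1, N + 1)]          # x = (1/n)
b = [1.0 / (n * n) for n in range(1, N + 1)]    # y = (1/n^2)

sum_a_sq = sum(t * t for t in a)     # partial sum of (1) for x
sum_b_sq = sum(t * t for t in b)     # partial sum of (1) for y

# Partial sum of the inner product series (2); all terms are real here.
inner_partial = sum(s * t for s, t in zip(a, b))

# Schwarz bound for the truncated sequences.
schwarz_bound = math.sqrt(sum_a_sq) * math.sqrt(sum_b_sq)
```

Increasing `N` changes `sum_a_sq` only slightly, since the tail of $\sum n^{-2}$ is of size about $1/N$; by contrast the sequence $(n^{-1/2})$ would give partial sums of (1) that grow like $\log N$, so it is not in $l_+^2$.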

It is often convenient to work with sequences indexed by the integers, rather than by the positive integers; such sequences are called two-sided sequences. We use the notation
$$(a_n)_{-\infty}^{\infty} = (\ldots, a_{-2}, a_{-1}, a_0, a_1, a_2, \ldots).$$
Let $l^2$ denote the space of two-sided sequences
$$x = (a_n)_{-\infty}^{\infty} \subset \mathbb{C}$$
such that
$$\sum_{n=-\infty}^{\infty} |a_n|^2 < \infty. \tag{3}$$
Here a two-sided infinite sum is defined to be the limit
$$\sum_{n=-\infty}^{\infty} c_n = \lim_{M, N \to +\infty} \sum_{n=-M}^{N} c_n$$
if this limit exists. If
$$x = (a_n)_{-\infty}^{\infty}, \quad y = (b_n)_{-\infty}^{\infty} \in l^2,$$
we let
$$(x, y) = \sum_{n=-\infty}^{\infty} a_n b_n^*, \tag{4}$$
provided the series converges.

Theorem 3.2. The space $l^2$ of two-sided complex sequences satisfying (3) is a Hilbert space with respect to the inner product (4).

The proof of this theorem is very similar to the proof of Theorem 3.1.

Exercises
1. Prove Theorem 3.2.
2. Let $e_m \in l_+^2$ be the sequence
$$e_m = (a_{m,n})_{n=1}^\infty$$
with $a_{m,n} = 0$ if $m \ne n$, $a_{n,n} = 1$. Show that
(a) $\|e_m\| = 1$;
(b) $e_m \perp e_p$ if $m \ne p$;
(c) $(x, e_m) \to 0$ as $m \to \infty$, for each $x \in l_+^2$;
(d) the set of linear combinations of the elements $e_m$ is dense in $l_+^2$;
(e) if $x \in l_+^2$ there is a unique sequence $(b_n)_1^\infty \subset \mathbb{C}$ such that
$$\Bigl\|x - \sum_{n=1}^{N} b_n e_n\Bigr\| \to 0 \quad \text{as } N \to \infty.$$
3. Show that the unit ball in $l_+^2$, the set
$$B = \{x \in l_+^2 \mid \|x\| \le 1\},$$
is closed and bounded, but not compact.
4. Show that the set
$$C = \{x \in l_+^2 \mid x = (a_n)_1^\infty, \text{ each } |a_n| \le n^{-1}\}$$
is compact; $C$ is called the Hilbert cube.

4. Orthonormal bases
The Hilbert space $\mathbb{C}^N$ is a finite dimensional vector space. Therefore any element of $\mathbb{C}^N$ can be written uniquely as a linear combination of a given set of basis vectors. It follows that the inner product of two elements of $\mathbb{C}^N$ can be computed if we know the expression of each element as such a linear combination. Conversely, the inner product makes possible a very convenient way of expressing a given vector as a linear combination of basis vectors.
Specifically, let $e_n \in \mathbb{C}^N$ be the $N$-tuple
$$e_n = (0, 0, \ldots, 0, 1, 0, \ldots, 0),$$
where the 1 is in the $n$-th place. Then $\{e_1, e_2, \ldots, e_N\}$ is a basis for $\mathbb{C}^N$. Moreover it is clear that
$$(e_n, e_m) = 1 \ \text{ if } n = m, \qquad (e_n, e_m) = 0 \ \text{ if } n \ne m. \tag{1}$$
If $x = (a_1, a_2, \ldots, a_N) \in \mathbb{C}^N$ then the expression for $x$ as a linear combination of the basis vectors $e_n$ is
$$x = \sum_{n=1}^{N} a_n e_n. \tag{2}$$
Because of (1),
$$(x, e_n) = a_n.$$
Thus we may rewrite (2) as
$$x = \sum_{n=1}^{N} (x, e_n) e_n. \tag{3}$$
If $y = (b_1, b_2, \ldots, b_N)$, then
$$(x, y) = \sum_{n=1}^{N} a_n b_n^*.$$
Using (3) and the corresponding expression for $y$, we have
$$(x, y) = \sum_{n=1}^{N} (x, e_n)(y, e_n)^* = \sum_{n=1}^{N} (x, e_n)(e_n, y). \tag{4}$$
In particular,
$$\|x\|^2 = \sum_{n=1}^{N} |(x, e_n)|^2. \tag{5}$$

The aim of this section and the next is to carry this development over to a class of Hilbert spaces which are not finite dimensional. We look for infinite subsets $(e_n)_1^\infty$ with the properties (1), and try to write elements as convergent infinite sums analogous to (3).
A subset $S$ of a Hilbert space $H$ is said to be orthonormal if each $u \in S$ has norm 1, while
$$(u, v) = 0 \quad \text{if } u, v \in S,\ u \ne v.$$
The following procedure for producing orthonormal sets is called the Gram-Schmidt method.
Lemma 4.1. Suppose $\{u_1, u_2, \ldots\}$ is a finite or infinite set of elements of a Hilbert space $H$. Then there is a finite or infinite set $S = \{e_1, e_2, \ldots\}$ of elements of $H$ such that $S$ is orthonormal and such that each $u_n$ is in the subspace spanned by $\{e_1, e_2, \ldots, e_n\}$.

(If $S$ has $m$ elements, $m < \infty$, we interpret the statement as saying that $u_n \in \operatorname{span}\{e_1, \ldots, e_m\}$ when $n \ge m$.)

Proof. The proof is by induction. If each $u_i = 0$, we may take $S$ to be the empty set. Otherwise let $v_1$ be the first nonzero $u_i$, and let
$$e_1 = \|v_1\|^{-1} v_1.$$
Then $\{e_1\}$ is orthonormal and $u_1 \in \operatorname{span}\{e_1\}$. Suppose we have chosen $e_1, \ldots, e_m$ such that $\{e_1, \ldots, e_m\}$ is orthonormal and $u_1, \ldots, u_m \in \operatorname{span}\{e_1, \ldots, e_m\}$. If each $u_i \in \operatorname{span}\{e_1, \ldots, e_m\}$ we may stop. Otherwise choose the first $j$ such that $u_j$ is not in this subspace. Let
$$v_{m+1} = u_j - \sum_{n=1}^{m} (u_j, e_n) e_n.$$
Since $\{e_1, \ldots, e_m\}$ is orthonormal, it follows that
$$(v_{m+1}, e_n) = 0, \quad 1 \le n \le m.$$
Since $u_j \notin \operatorname{span}\{e_1, \ldots, e_m\}$, $v_{m+1} \ne 0$. Let
$$e_{m+1} = \|v_{m+1}\|^{-1} v_{m+1}.$$
Then $\{e_1, \ldots, e_{m+1}\}$ is orthonormal and $u_{m+1}$ is in the span. Continuing, we get the desired set $S$. $\square$

Note that completeness of H was not used. Thus Lemma 4.1 is valid in
any space with an inner product.
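The induction in the proof of Lemma 4.1 translates directly into a procedure. The following Python sketch is an added illustration for vectors in $\mathbb{C}^N$, not the author's code; the function names, the tolerance `tol` used to decide when a vector already lies in the span (a concession to floating-point rounding), and the sample vectors are all assumptions.

```python
import math

def inner(a, b):
    # (a, b) = sum of a_n * conj(b_n), as in the text.
    return sum(x * y.conjugate() for x, y in zip(a, b))

def norm(a):
    return math.sqrt(inner(a, a).real)

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list spanning the same subspace as `vectors`."""
    basis = []
    for u in vectors:
        # v = u - sum over chosen e_n of (u, e_n) e_n, as in the proof.
        v = list(u)
        for e in basis:
            coeff = inner(u, e)
            v = [vi - coeff * ei for vi, ei in zip(v, e)]
        length = norm(v)
        # If u already lies in the span (up to rounding), it adds nothing.
        if length > tol:
            basis.append([vi / length for vi in v])
    return basis

us = [
    [1 + 0j, 1j, 0j],
    [2 + 0j, 2j, 0j],        # a multiple of the first vector
    [1 + 0j, 0j, 1 + 0j],
    [0j, 1 + 0j, 0j],
]
S = gram_schmidt(us)
```

With these four inputs the second vector is skipped, so `S` has three elements. In floating-point arithmetic a numerically more stable variant subtracts each component from the running remainder `v` rather than from `u`; the version above follows the formula in the proof.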
An orthonormal basis for a Hilbert space $H$ is an orthonormal set $S \subset H$ such that $\operatorname{span}(S)$ is dense in $H$. This means that for any $u \in H$ and any $\varepsilon > 0$, there is a $v$, which is a linear combination of elements of $S$, such that $\|u - v\| < \varepsilon$.
A Hilbert space $H$ is said to be separable if there is a sequence $(u_n)_1^\infty \subset H$ which is dense in $H$. This means that for any $u \in H$ and any $\varepsilon > 0$, there is an $n$ such that $\|u - u_n\| < \varepsilon$.
Theorem 4.2. Suppose $H$ is a separable Hilbert space. Then $H$ has an orthonormal basis $S$, which is finite or countable.
Conversely, if $H$ is a Hilbert space which has a finite or countable orthonormal basis, then $H$ is separable.

Proof. Suppose $(u_n)_1^\infty$ is dense in $H$. By Lemma 4.1, there is a finite or countable orthonormal set $S = \{e_1, e_2, \ldots\}$ such that each $u_n$ is a linear combination of elements of $S$. Thus $S$ is an orthonormal basis.
Conversely, suppose $S$ is a finite or countable orthonormal basis for $H$. Suppose $H$ is a complex vector space. Let $T$ be the set of all elements of $H$ of the form
$$\sum_{n=1}^{N} a_n e_n,$$
where $N$ is arbitrary, the $e_n$ are in $S$, and the $a_n$ are complex numbers whose real and imaginary parts are rational. It is not difficult to show that $T$ is countable, so the elements of $T$ may be arranged in a sequence $(u_n)_1^\infty$. Any complex number is the limit of a sequence of complex numbers with rational real and imaginary parts. It follows that any linear combination of elements of $S$ is a limit of a sequence of elements in $T$. Since $S$ is assumed to be an orthonormal basis, this implies that $(u_n)_1^\infty$ is dense in $H$. $\square$
To complete Theorem 4.2, we want to know whether two orthonormal bases in a separable Hilbert space have the same number of elements.
Theorem 4.3. Suppose $H$ is a separable Hilbert space. If $\dim H = N < \infty$, then any orthonormal basis for $H$ is a basis for $H$ as a vector space, and therefore has $N$ elements.
If $H$ is not finite dimensional, then any orthonormal basis for $H$ is countable.

Proof. Suppose $\dim H = N < \infty$, and suppose $S \subset H$ is an orthonormal basis. If $e_1, \ldots, e_M$ are distinct elements of $S$ and
$$\sum_{n=1}^{M} a_n e_n = 0,$$
then
$$0 = \Bigl(\sum_{n=1}^{M} a_n e_n, e_m\Bigr) = a_m, \quad m = 1, \ldots, M.$$
Thus the elements of $S$ are linearly independent, so $S$ has $\le N$ elements. Let $S = \{e_1, e_2, \ldots, e_M\}$. We want to show that $S$ is a basis. Let $H_1$ be the subspace spanned by $S$. Given $u \in H$, let
$$u_1 = \sum_{n=1}^{M} (u, e_n) e_n.$$
Then
$$(u - u_1, e_m) = 0, \quad m = 1, \ldots, M.$$
It follows that $u - u_1$ is orthogonal to the subspace $H_1$. The argument used to prove Lemma 2.4 shows that $u_1$ is the element of $H_1$ closest to $u$. But by assumption on $S$, there are elements of $H_1$ arbitrarily close to $u$. Therefore $u = u_1 \in H_1$, and $S$ is a basis.
The argument just given shows that if $H$ has a finite orthonormal basis $S$, then $S$ is a basis in the vector space sense. Therefore if $H$ is not finite dimensional, any orthonormal basis is infinite. We want to show, therefore, that if $H$ is separable and not finite dimensional then any orthonormal basis is at most countable. Let $(u_n)_1^\infty$ be dense in $H$, and let $S$ be an orthonormal basis. For each element $e \in S$, there is an integer $n = n(e)$ such that
$$\|e - u_n\| < 2^{-1/2}.$$
Suppose $e, f \in S$ and $e \ne f$. Let $n = n(e)$, $p = n(f)$. Then
$$\|e - f\|^2 = \|e\|^2 - (e, f) - (f, e) + \|f\|^2 = 1 - 0 - 0 + 1 = 2,$$
so
$$\|u_n - u_p\| = \|u_n - e + e - f + f - u_p\|$$
$$\ge \|e - f\| - \|u_n - e + f - u_p\|$$
$$\ge \|e - f\| - \|u_n - e\| - \|f - u_p\|$$
$$> 2^{1/2} - 2^{-1/2} - 2^{-1/2} = 0.$$
Thus $u_n \ne u_p$. We have shown that $n(e) \ne n(f)$ if $e \ne f$, so the mapping $e \to n(e)$ is a 1-1 function from $S$ to a subset of the integers. It follows that $S$ is finite or countable.

Exercises
1. Let $X_1$ be the vector space of continuous complex-valued functions defined on the interval $[-1, 1]$. If $u, v \in X_1$, let an inner product $(u, v)_1$ be defined by
$$(u, v)_1 = \int_{-1}^{1} u(x) v(x)^*\,dx.$$
Let $u_n(x) = x^{n-1}$, $n = 1, 2, \ldots$. Carry out the Gram-Schmidt process of Lemma 4.1 to find polynomials $p_n$, $n = 1, 2, 3, 4$ such that $p_n$ is of degree $n - 1$, $p_n$ has real coefficients, the leading coefficient is positive, and
$$(p_n, p_m)_1 = 1 \ \text{ if } n = m, \qquad (p_n, p_m)_1 = 0 \ \text{ if } n \ne m.$$
These are the first four Legendre polynomials.
2. Let $X_2$ be the set of all continuous functions $u: \mathbb{R} \to \mathbb{C}$ such that
$$\int_0^\infty |u(x)|^2 e^{-x}\,dx < \infty.$$
Show that $X_2$ is a vector space. If $u, v \in X_2$, show that the integral
$$(u, v)_2 = \int_0^\infty u(x) v(x)^* e^{-x}\,dx$$
exists as an improper integral, and that this defines an inner product on $X_2$. Show that there are polynomials $p_n$, $n = 1, 2, 3, \ldots$ such that $p_n$ is of degree $n - 1$ and
$$(p_n, p_m)_2 = 1 \ \text{ if } n = m, \qquad (p_n, p_m)_2 = 0 \ \text{ if } n \ne m.$$
Determine the first few polynomials of such a sequence. Except for constant factors these are the Laguerre polynomials.
3. Show that there is a sequence $(p_n)_1^\infty$ of polynomials such that $p_n$ is of degree $n - 1$ and
$$\int_{-\infty}^{\infty} |p_n(x)|^2 e^{-(1/2)x^2}\,dx = 1, \qquad \int_{-\infty}^{\infty} p_n(x) p_m(x) e^{-(1/2)x^2}\,dx = 0 \ \text{ if } n \ne m.$$
Except for constant factors these are the Hermite polynomials.
4. Suppose $H$ is a finite dimensional complex Hilbert space, of dimension $N$. Show that there is a linear transformation $U$ from $H$ onto $\mathbb{C}^N$ such that
$$(Uu, Uv) = (u, v), \quad \text{all } u, v \in H.$$
(Hint: choose an orthonormal basis for $H$.)
5. Suppose $H$ and $H'$ are two complex Hilbert spaces of dimension $N < \infty$. Show that there is a linear transformation $U$ from $H$ onto $H'$ such that
$$(Uu, Uv) = (u, v), \quad \text{all } u, v \in H.$$
6. In the space $l^2$ of two-sided complex sequences, let $e_n$ be the sequence with entry 1 in the $n$th place and all other entries 0. Show that $(e_n)_{-\infty}^{\infty}$ is an orthonormal basis for $l^2$.
7. Show that there is a linear transformation $U$ from $l^2$ onto $l_+^2$ such that
$$(Ux, Uy) = (x, y), \quad \text{all } x, y \in l^2.$$
5. Orthogonal expansions
Suppose $H$ is a Hilbert space of dimension $N < \infty$. We know that $H$ has an orthonormal basis $\{e_1, e_2, \ldots, e_N\}$. Any element $u \in H$ is a linear combination
$$u = \sum_{n=1}^{N} a_n e_n, \tag{1}$$
and as in §4 we see that
$$a_n = (u, e_n). \tag{2}$$
It follows that if $u, v \in H$ then
$$(u, v) = \sum_{n=1}^{N} (u, e_n)(e_n, v). \tag{3}$$
In particular,
$$\|u\|^2 = \sum_{n=1}^{N} |(u, e_n)|^2. \tag{4}$$
The expression (1) for $u \in H$ with coefficients given by (2) is called the orthogonal expansion of $u$ with respect to the orthonormal basis $\{e_1, \ldots, e_N\}$. We are now in a position to carry (1)-(4) over to an infinite-dimensional separable Hilbert space.
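Theorem 5.1. Suppose $H$ is a Hilbert space with an orthonormal basis $(e_n)_1^\infty$. If $u \in H$, there is a unique sequence $(a_n)_1^\infty$ of scalars such that
$$u = \sum_{n=1}^{\infty} a_n e_n \tag{5}$$
in the sense that
$$\Bigl\|u - \sum_{n=1}^{N} a_n e_n\Bigr\| \to 0 \quad \text{as } N \to \infty. \tag{6}$$
The coefficients are given by
$$a_n = (u, e_n) \tag{7}$$
and they satisfy
$$\sum_{n=1}^{\infty} |a_n|^2 = \|u\|^2. \tag{8}$$
More generally, if
$$u = \sum_{n=1}^{\infty} a_n e_n \quad \text{and} \quad v = \sum_{n=1}^{\infty} b_n e_n \tag{9}$$
then
$$(u, v) = \sum_{n=1}^{\infty} a_n b_n^* = \sum_{n=1}^{\infty} (u, e_n)(e_n, v). \tag{10}$$
Conversely, suppose $(a_n)_1^\infty$ is a sequence of scalars with the property
$$\sum_{n=1}^{\infty} |a_n|^2 < \infty. \tag{11}$$
Then there is a unique element $u \in H$ such that (5) is true.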

Proof. First let us prove uniqueness. Suppose $(a_n)_1^\infty$ is a sequence of scalars such that (6) is true. Let
$$u_N = \sum_{n=1}^{N} a_n e_n. \tag{12}$$
Since the sequence $(e_n)_1^\infty$ is orthonormal,
$$(u_N, e_n) = a_n \quad \text{if } N \ge n.$$
Using Lemma 2.1 we get
$$a_n = \lim_{N\to\infty} (u_N, e_n) = (u, e_n).$$
Thus $(a_n)_1^\infty$ is unique.
To prove existence, set $a_n = (u, e_n)$. Define $u_N$ by (12). Then
$$(u - u_N, e_n) = 0, \quad 1 \le n \le N.$$
This implies that $u - u_N$ is orthogonal to the subspace $H_N$ spanned by $\{e_1, \ldots, e_N\}$. Now given $\varepsilon > 0$, there is a linear combination $v$ of the $e_n$ such that
$$\|u - v\| < \varepsilon.$$
Then there is an $N_0$ such that $v \in H_N$ when $N \ge N_0$. As in the proof of Lemma 2.4, the facts that $u_N \in H_N$ and that $u - u_N \perp H_N$ imply
$$\|u - u_N\| \le \|u - v\| < \varepsilon, \quad N \ge N_0.$$
Thus $u_N \to u$. Since the $e_n$ are orthonormal,
$$\|u_N\|^2 = (u_N, u_N) = \sum_{n=1}^{N} |a_n|^2.$$
Thus
$$\|u\|^2 = \lim \|u_N\|^2 = \sum_{n=1}^{\infty} |a_n|^2.$$
More generally, suppose $u$ and $v$ are given by (9). Let $u_N$ be defined by (12), and let $v_N$ be defined in a similar way. Then by Lemma 2.1,
$$(u, v) = \lim (u_N, v_N) = \lim \sum_{n=1}^{N} a_n b_n^* = \sum_{n=1}^{\infty} a_n b_n^*.$$
Finally, suppose $(a_n)_1^\infty$ is any sequence of scalars satisfying (11). Define $u_N$ by (12). All we need do is show that $(u_N)_1^\infty$ is a Cauchy sequence, since we can then let $u$ be its limit. But if $N > M$,
$$\|u_N - u_M\|^2 = (u_N - u_M, u_N - u_M) = \sum_{n=M+1}^{N} |a_n|^2. \tag{13}$$
Since (11) is true, the right side of (13) converges to zero as $M, N \to \infty$.

It is convenient to have the corresponding statement for a Hilbert space with an orthonormal basis indexed by all integers. The proof is essentially unchanged.

Theorem 5.2. Suppose $H$ is a Hilbert space with an orthonormal basis $(e_n)_{-\infty}^{\infty}$. If $u \in H$, there is a unique two-sided sequence $(a_n)_{-\infty}^{\infty}$ of scalars such that
$$u = \sum_{n=-\infty}^{\infty} a_n e_n \tag{14}$$
in the sense that
$$\Bigl\|u - \sum_{n=-N}^{N} a_n e_n\Bigr\| \to 0 \quad \text{as } N \to \infty. \tag{15}$$
The coefficients are given by
$$a_n = (u, e_n) \tag{16}$$
and they satisfy
$$\sum_{n=-\infty}^{\infty} |a_n|^2 = \|u\|^2. \tag{17}$$
More generally, if
$$u = \sum_{n=-\infty}^{\infty} a_n e_n \quad \text{and} \quad v = \sum_{n=-\infty}^{\infty} b_n e_n$$
then
$$(u, v) = \sum_{n=-\infty}^{\infty} a_n b_n^* = \sum_{n=-\infty}^{\infty} (u, e_n)(e_n, v). \tag{18}$$
Conversely, suppose $(a_n)_{-\infty}^{\infty}$ is a sequence of scalars with the property
$$\sum_{n=-\infty}^{\infty} |a_n|^2 < \infty.$$
Then there is a unique $u \in H$ such that (15) is true.

The equations (5), (6), or (14), (15) give the orthogonal expansion of $u$ with respect to the respective orthonormal bases. The identity (8) or (17) is called Bessel's equality. It implies Bessel's inequality:
$$\sum_{n=1}^{N} |(u, e_n)|^2 \le \|u\|^2 \tag{19}$$
or
$$\sum_{n=-N}^{N} |(u, e_n)|^2 \le \|u\|^2. \tag{20}$$
The identity (10) or (18) is called Parseval's identity. The coefficients $a_n$ given by (7) or (16) are often called the Fourier coefficients of $u$ with respect to the respective orthonormal basis.
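A finite-dimensional check makes these identities concrete. The sketch below is an added illustration, with an orthonormal basis of $\mathbb{C}^3$ and a vector `u` chosen arbitrarily for the example: it computes the coefficients $a_n = (u, e_n)$, rebuilds $u$ from them, and compares $\sum |a_n|^2$ with $\|u\|^2$, first over the whole basis (Bessel's equality) and then over only part of it (Bessel's inequality).

```python
import math

def inner(a, b):
    # (a, b) = sum of a_n * conj(b_n), as in the text.
    return sum(x * y.conjugate() for x, y in zip(a, b))

s = 1 / math.sqrt(2)
basis = [
    [s + 0j, s * 1j, 0j],
    [s + 0j, -s * 1j, 0j],
    [0j, 0j, 1 + 0j],
]                                   # an orthonormal basis of C^3
u = [2 - 1j, 3j, -1 + 4j]

coeffs = [inner(u, e) for e in basis]          # a_n = (u, e_n)

# Orthogonal expansion: u = sum of a_n e_n, component by component.
rebuilt = [sum(a * e[k] for a, e in zip(coeffs, basis)) for k in range(3)]

norm_sq_u = inner(u, u).real
equality_sum = sum(abs(a) ** 2 for a in coeffs)         # all three e_n
inequality_sum = sum(abs(a) ** 2 for a in coeffs[:2])   # only e_1, e_2
```

The full sum reproduces $\|u\|^2$ up to rounding, while the partial sum falls short by exactly $|(u, e_3)|^2$.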

Exercises
1. Suppose $H$ and $H_1$ are two infinite-dimensional separable complex Hilbert spaces. Use Theorems 4.2, 4.3, and 5.1 to show that there is a linear transformation $U$ from $H$ onto $H_1$ such that
$$(Uu, Uv) = (u, v), \quad \text{all } u, v \in H.$$
Show that $U$ is invertible and that
$$(U^{-1}u_1, U^{-1}v_1) = (u_1, v_1), \quad \text{all } u_1, v_1 \in H_1.$$
Such a transformation $U$ is called a unitary transformation, or a unitary equivalence.
2. Let $U: l_+^2 \to l_+^2$ be defined by
$$U((a_1, a_2, a_3, \ldots)) = (0, a_1, a_2, a_3, \ldots).$$
Show that $U$ is a 1-1 linear transformation such that
$$(Ux, Uy) = (x, y), \quad \text{all } x, y \in l_+^2.$$
Show that $U$ is not onto.


6. Fourier series
Let $L^2$ be the Hilbert space introduced in §1. Thus $L^2$ consists of each periodic distribution $F$ which is the limit, in the sense of $L^2$, of a sequence $(u_n)_1^\infty \subset \mathscr{C}$, i.e.,
$$\|u_n - u_m\| \to 0, \qquad F_{u_n} \to F \ (\mathscr{P}').$$
We can identify the space $\mathscr{C}$ of continuous periodic functions with a subspace of $L^2$ by identifying the function $u$ with the distribution $F_u$. Then
$$\|F_u\|^2 = \|u\|^2 = \frac{1}{2\pi} \int_0^{2\pi} |u(x)|^2\,dx.$$
In particular, we may consider the two-sided sequence of functions $e_n$,
$$e_n(x) = \exp(inx), \quad n = 0, \pm 1, \pm 2, \ldots$$
as elements of $L^2$.

Lemma 6.1. The sequence of functions $(e_n)_{-\infty}^{\infty}$, considered as elements of $L^2$, is an orthonormal basis for $L^2$.

Proof. Clearly $\|e_n\| = 1$. If $m \ne n$, then
$$\int_0^{2\pi} e_n(x) e_m(x)^*\,dx = \int_0^{2\pi} \exp(inx - imx)\,dx = [i(n - m)]^{-1} \exp(i(n - m)x)\Big|_0^{2\pi} = 0.$$
Thus $(e_n)_{-\infty}^{\infty}$ is an orthonormal set. Now suppose $F \in L^2$. Given $\varepsilon > 0$, there is a function $u \in \mathscr{C}$ such that
$$\|F_u - F\| < \varepsilon.$$
There is a linear combination $v = \sum a_n e_n$ such that
$$|v - u| < \varepsilon.$$
Then
$$\|F_v - F\| \le \|F_v - F_u\| + \|F_u - F\| = \|u - v\| + \|F_u - F\| < |v - u| + \varepsilon < 2\varepsilon.$$

Since $(e_n)_{-\infty}^{\infty}$ is an orthonormal basis, we may apply Theorem 5.2 and obtain orthogonal expansions for distributions in $L^2$.
Theorem 6.2. Suppose $F \in L^2$. There is a unique two-sided sequence $(a_n)_{-\infty}^{\infty} \subset \mathbb{C}$ such that
$$F = \sum_{n=-\infty}^{\infty} a_n \exp(inx) \tag{1}$$
in the sense that the functions
$$\sum_{n=-N}^{N} a_n e_n \to F \ (L^2) \quad \text{as } N \to \infty. \tag{2}$$
The coefficients are given by
$$a_n = F(e_{-n}) \tag{3}$$
and they satisfy
$$\sum_{-\infty}^{\infty} |a_n|^2 = \|F\|^2. \tag{4}$$
More generally, if
$$F = \sum_{-\infty}^{\infty} a_n \exp(inx) \quad \text{and} \quad G = \sum_{-\infty}^{\infty} b_n \exp(inx)$$
in the sense of (2), then
$$(F, G) = \sum_{-\infty}^{\infty} a_n b_n^* = \sum_{-\infty}^{\infty} F(e_{-n}) G^*(e_n). \tag{5}$$
Conversely, suppose $(a_n)_{-\infty}^{\infty} \subset \mathbb{C}$ and
$$\sum_{-\infty}^{\infty} |a_n|^2 < \infty.$$
Then there is a unique $F \in L^2$ such that (1) is true, in the sense of (2).

Proof. In view of Lemma 6.1 and Theorem 5.2, we only need to verify that
$$(F, e_n) = F(e_{-n}), \qquad (e_n, G) = G^*(e_n).$$
The second identity follows from the first and the definition of $G^*$. To prove the first, take $(u_m)_1^\infty \subset \mathscr{C}$,
$$u_m \to F \ (L^2).$$
Then
$$(F, F_{e_n}) = \lim (u_m, e_n) = \lim F_{u_m}(e_{-n}) = F(e_{-n}).$$

The $a_n$ in (3) are called the Fourier coefficients of the distribution $F$. The formal series on the right in (1) is called the Fourier series of $F$.
Suppose $F = F_u$ where $u \in \mathscr{C}$. Then (2) is equivalent to
$$\frac{1}{2\pi} \int_0^{2\pi} \Bigl|u(x) - \sum_{n=-N}^{N} a_n \exp(inx)\Bigr|^2 dx \to 0, \tag{6}$$
where
$$a_n = \frac{1}{2\pi} \int_0^{2\pi} u(x) \exp(-inx)\,dx. \tag{7}$$
In fact, (6) and (7) remain valid when $u$ is simply assumed to be an integrable function on $[0, 2\pi]$. In this case the $a_n$ are called the Fourier coefficients of the function $u$, and the formal series
$$\sum_{-\infty}^{\infty} a_n \exp(inx)$$
is called the Fourier series of the function $u$. The fact that (6) and (7) remain valid in this case is easily established, as follows. If $u: [0, 2\pi] \to \mathbb{C}$ is integrable, then again it defines a distribution $F_u$ by
$$F_u(v) = \frac{1}{2\pi} \int_0^{2\pi} u(x) v(x)\,dx.$$
Then an extension of the Schwarz inequality gives
$$|F_u(v)| \le \Bigl(\frac{1}{2\pi} \int_0^{2\pi} |u(x)|^2\,dx\Bigr)^{1/2} \|v\| = \|u\|\,\|v\|.$$
By Exercise 2 of §2, $F_u \in L^2$.
When $u \in \mathscr{C}$, it is tempting to interpret (1) for $F_u$ as
$$u(x) = \sum_{-\infty}^{\infty} a_n \exp(inx).$$
In general, however, the series on the right may diverge for some values of $x$, and it will certainly not converge uniformly without further restrictions on $u$. For uniform convergence it is sufficient to assume that $u$ has a continuous derivative.

Lemma 6.3. If $(a_n)_{-\infty}^{\infty} \subset \mathbb{C}$ has the property
$$\sum_{-\infty}^{\infty} |a_n| < \infty, \tag{8}$$
then the functions
$$u_N = \sum_{n=-N}^{N} a_n e_n$$
converge uniformly to a function $u \in \mathscr{C}$, and $(a_n)_{-\infty}^{\infty}$ is the sequence of Fourier coefficients of $u$.

Proof. Since $|e_n(x)| = |\exp(inx)| = 1$ for all $x \in \mathbb{R}$, it follows from (8) that the sequence of functions $(u_N)_1^\infty$ is a uniform Cauchy sequence. Therefore it converges uniformly to a function $u \in \mathscr{C}$. For $N \ge |m|$ we have
$$(u_N, e_m) = a_m.$$
Thus
$$a_m = \lim (u_N, e_m) = (u, e_m)$$
is the $m$th Fourier coefficient of $u$.

Theorem 6.4. If $u \in \mathscr{C}$ and $u$ is continuously differentiable, then the partial sums of the Fourier series of $u$ converge uniformly to $u$. Thus
$$u(x) = \sum_{-\infty}^{\infty} (u, e_n) \exp(inx), \quad \text{all } x \in \mathbb{R}.$$
Moreover, if $u \in \mathscr{P}$ then the partial sums converge to $u$ in the sense of $\mathscr{P}$.

Proof. Let $v = Du$. There is a relation between the Fourier coefficients of $v$ and those of $u$. In fact integration by parts gives
$$b_n = (v, e_n) = \frac{1}{2\pi} \int_0^{2\pi} Du(x) \exp(-inx)\,dx = \frac{in}{2\pi} \int_0^{2\pi} u(x) \exp(-inx)\,dx = in(u, e_n) = in\,a_n. \tag{9}$$
We can apply the Schwarz inequality for sequences to show $\sum |a_n| < \infty$. In fact
$$\sum_{-\infty}^{\infty} |a_n| = |a_0| + \sum_{n \ne 0} |n|^{-1} |b_n| \le |a_0| + \Bigl(2 \sum_{n=1}^{\infty} n^{-2}\Bigr)^{1/2} \Bigl(\sum |b_n|^2\Bigr)^{1/2} \le |a_0| + \Bigl(2 \sum_{n=1}^{\infty} n^{-2}\Bigr)^{1/2} \|Du\| < \infty.$$
By Lemma 6.3, the partial sums of the Fourier series of $u$ converge uniformly to $u$.
Now suppose $u \in \mathscr{P}$ and let
$$u_N = \sum_{n=-N}^{N} a_n e_n.$$
Then (9) shows that
$$Du_N = \sum_{n=-N}^{N} in\,a_n e_n = \sum_{n=-N}^{N} b_n e_n$$
is the $N$th partial sum of the Fourier series of $Du$. Similarly, $D^k u_N$ is the $N$th partial sum of the Fourier series of $D^k u$. Each $D^k u$ is continuously differentiable, so each $D^k u_N \to D^k u$ uniformly. $\square$
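The uniform convergence asserted in Theorem 6.4 can be observed numerically. The sketch below is an added illustration, not part of the text: the test function $u(x) = \exp(\cos x)$, the grid sizes, and the function names are assumptions, and the integral (7) is approximated by an equally spaced Riemann sum, which is very accurate for smooth periodic integrands.

```python
import cmath
import math

def fourier_coefficient(u, n, M=256):
    # a_n = (1/2pi) * integral over [0, 2pi] of u(x) exp(-inx) dx,
    # approximated by a Riemann sum on M equally spaced points.
    total = 0j
    for j in range(M):
        x = 2 * math.pi * j / M
        total += u(x) * cmath.exp(-1j * n * x)
    return total / M

def partial_sum(coeffs, N, x):
    # u_N(x) = sum over -N <= n <= N of a_n exp(inx).
    return sum(coeffs[n] * cmath.exp(1j * n * x) for n in range(-N, N + 1))

def u(x):
    return math.exp(math.cos(x))    # a smooth 2pi-periodic test function

def sup_error(N, samples=200):
    # Largest deviation |u_N(x) - u(x)| over a sample grid in [0, 2pi).
    coeffs = {n: fourier_coefficient(u, n) for n in range(-N, N + 1)}
    grid = [2 * math.pi * k / samples for k in range(samples)]
    return max(abs(partial_sum(coeffs, N, x) - u(x)) for x in grid)

errors = {N: sup_error(N) for N in (2, 4, 8)}
```

For this smooth function the sampled sup-norm error drops quickly as $N$ grows, in line with the rapid decay of the coefficients; a function with only one continuous derivative would still converge uniformly but far more slowly.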
Fourier series expansions are very commonly written in terms of sine and cosine functions, rather than the exponential function. This is particularly natural when the function $u$ or distribution $F$ is real. Suppose $F \in L^2$. Let
$$b_n = 2F(\cos nx), \qquad c_n = 2F(\sin nx). \tag{10}$$
Since
$$e_{-n}(x) = \cos nx - i \sin nx$$
we have
$$a_n = \tfrac12(b_n - i c_n).$$
Also
$$b_{-n} = b_n, \qquad c_{-n} = -c_n.$$
Thus for $n > 0$,
$$a_n e_n(x) + a_{-n} e_{-n}(x) = a_n(\cos nx + i \sin nx) + a_{-n}(\cos nx - i \sin nx) = b_n \cos nx + c_n \sin nx.$$
Then
$$\sum_{n=-N}^{N} a_n \exp(inx) = \tfrac12 b_0 + \sum_{n=1}^{N} (b_n \cos nx + c_n \sin nx).$$
The formal series
$$\tfrac12 b_0 + \sum_{n=1}^{\infty} (b_n \cos nx + c_n \sin nx) \tag{11}$$
is also called the Fourier series of $F$, and the coefficients $b_n$, $c_n$ given by (10) are also called the Fourier coefficients of $F$. If $F$ is real, then $b_n$, $c_n$ are real, and (11) is a series of real-valued functions of $x$. Theorems 6.2 and 6.4 may be restated using the series (11).
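The relation $a_n = \tfrac12(b_n - i c_n)$ can be checked on a trigonometric polynomial, for which an equally spaced Riemann sum evaluates the integrals exactly up to rounding. The sketch below is an added illustration; the test function $u(x) = 1 + 3\cos 2x - \sin x$, the grid size `M`, and the function names are assumptions made for the example.

```python
import cmath
import math

M = 64
xs = [2 * math.pi * j / M for j in range(M)]

def u(x):
    return 1 + 3 * math.cos(2 * x) - math.sin(x)

def F(v):
    # F_u(v) = (1/2pi) * integral of u(x) v(x) dx, via a Riemann sum.
    return sum(u(x) * v(x) for x in xs) / M

def a(n):
    return F(lambda x: cmath.exp(-1j * n * x))   # a_n = F(e_{-n})

def b(n):
    return 2 * F(lambda x: math.cos(n * x))      # b_n = 2F(cos nx)

def c(n):
    return 2 * F(lambda x: math.sin(n * x))      # c_n = 2F(sin nx)

# Differences between a_n and (b_n - i c_n)/2 for n = 0, 1, 2, 3.
gaps = [abs(a(n) - 0.5 * (b(n) - 1j * c(n))) for n in range(0, 4)]
```

For this $u$ the nonzero real coefficients are $b_0 = 2$, $b_2 = 3$, and $c_1 = -1$, so the series (11) reproduces $u$ term by term.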

Exercises
1. Find the Fourier coefficients of the following integrable functions on $[0, 2\pi]$:
(a) $u(x) = 0$, $x \in [0, \pi]$, $u(x) = 1$, $x \in (\pi, 2\pi]$.
(b) $u(x) = 0$, $x \in [0, \pi]$, $u(x) = x - \pi$, $x \in (\pi, 2\pi]$.
(c) $u(x) = |x - \pi|$.
(d) $u(x) = (x - \pi)^2$.
(e) $u(x) = x$.
(f) $u(x) = |\cos x|$.
2. Suppose $u \in \mathscr{C}$ and suppose $b_n$, $c_n$ are as in (10). Show that if $u$ is even then $c_n = 0$, all $n$. Show that if $u$ is odd, then $b_n = 0$, all $n$. (It is convenient to integrate over $[-\pi, \pi]$ instead of $[0, 2\pi]$.) Show that if $u$ is real then
$$\tfrac14 b_0^2 + \tfrac12 \sum_{n=1}^{\infty} (b_n^2 + c_n^2) = \frac{1}{2\pi} \int_0^{2\pi} u(x)^2\,dx.$$
3. Suppose $u \in \mathscr{C}$,
$$u_N = \sum_{n=-N}^{N} (u, e_n) e_n.$$
Show that
$$u_N(x) = \frac{1}{2\pi} \int_0^{2\pi} D_N(x - y) u(y)\,dy,$$
where
$$D_N(x) = \sin (N + \tfrac12)x / \sin \tfrac12 x.$$
The function $D_N$ is called the Dirichlet kernel. Thus
$$u_N = D_N * u.$$
4. Extend the result of Exercise 3 to the partial sums of the Fourier series of a distribution $F \in L^2$.
5. For $F \in \mathscr{P}'$, define
$$a_n = F(e_{-n}), \quad n = 0, \pm 1, \pm 2, \ldots.$$
Show that $F \in L^2$ if and only if
$$\sum_{-\infty}^{\infty} |a_n|^2 < \infty.$$
6. Suppose $F \in \mathscr{P}'$ and $DF \in L^2$. Show that $F = F_u$ for some continuous function $u$. (Hint: find the Fourier coefficients of $u$.)

Chapter 5

Applications of Fourier Series


1. Fourier series of smooth periodic functions
and of periodic distributions
If $u$ is a smooth periodic function with Fourier coefficients $(a_n)_{-\infty}^{\infty}$, then we know that the sequence $(a_n)_{-\infty}^{\infty}$ uniquely determines the function $u$; in fact, the partial sums
$$u_N(x) = \sum_{n=-N}^{N} a_n \exp(inx) \tag{1}$$
of the Fourier series converge to $u$ in the sense of $\mathscr{P}$. Therefore it makes sense to ask: what are necessary and sufficient conditions on a two-sided sequence $(a_n)_{-\infty}^{\infty} \subset \mathbb{C}$ that it be the sequence of Fourier coefficients of a function $u \in \mathscr{P}$? The question is not hard to answer.
A sequence $(a_n)_{-\infty}^{\infty} \subset \mathbb{C}$ is said to be of rapid decrease if for every $r > 0$ there is a constant $c = c(r)$ such that
$$|a_n| \le c|n|^{-r}, \quad \text{all } n \ne 0. \tag{2}$$

Theorem 1.1. A sequence $(a_n)_{-\infty}^{\infty} \subset \mathbb{C}$ is the sequence of Fourier coefficients of a function $u \in \mathscr{P}$ if and only if it is of rapid decrease.

Proof. Suppose first that $u \in \mathscr{P}$ has $(a_n)_{-\infty}^{\infty}$ as its sequence of Fourier coefficients. Given $r > 0$, take an integer $k \ge r$. In proving Theorem 6.4 of Chapter 4 we noted that $(in\,a_n)_{-\infty}^{\infty}$ is the sequence of Fourier coefficients of $Du$. It follows that
$$((in)^k a_n)_{-\infty}^{\infty}$$
is the sequence of Fourier coefficients of $D^k u$. But then
$$|n|^k |a_n| \le |D^k u|,$$
which gives (2) with $c = |D^k u|$.
Conversely, suppose $(a_n)_{-\infty}^{\infty} \subset \mathbb{C}$ is a sequence which is of rapid decrease. Define functions $u_N$ by (1). From (2) with $r = 2$ we deduce
$$\sum_{-\infty}^{\infty} |a_n| < \infty.$$
By Lemma 6.3 of Chapter 4, the functions (1) converge uniformly to $u \in \mathscr{C}$, and $(a_n)_{-\infty}^{\infty}$ are the Fourier coefficients of $u$. Also
$$D^k u_N = \sum_{n=-N}^{N} (in)^k a_n e_n,$$
and (2) with $r = k + 2$ implies
$$\sum_{-\infty}^{\infty} |n|^k |a_n| < \infty.$$
Thus each derivative $D^k u_N$ also converges uniformly as $N \to \infty$. Therefore $u \in \mathscr{P}$.
If $F$ is a periodic distribution which is in $L^2$, then we have defined its Fourier coefficients by
$$a_n = F(e_{-n}), \qquad e_{-n}(x) = \exp(-inx). \tag{3}$$
Since $e_{-n} \in \mathscr{P}$ the expression in (3) makes sense for any periodic distribution $F$, whether or not it is in $L^2$. Thus given any $F \in \mathscr{P}'$, we define its Fourier coefficients to be the sequence $(a_n)_{-\infty}^{\infty}$ defined by (3). We know that if all the $a_n$ are zero, then $F = 0$ (see Chapter 3, Theorem 4.3). Therefore, $F \in \mathscr{P}'$ is uniquely determined by its Fourier coefficients, and we may ask: what are necessary and sufficient conditions on a sequence $(a_n)_{-\infty}^{\infty} \subset \mathbb{C}$ that it be the sequence of Fourier coefficients of a periodic distribution $F$? Again, the answer is not difficult.
A sequence $(a_n)_{-\infty}^{\infty} \subset \mathbb{C}$ is said to be of slow growth if there are some positive constants $c$ and $r$ such that
$$|a_n| \le c|n|^r, \quad \text{all } n \ne 0. \tag{4}$$

Theorem 1.2. A sequence $(a_n)_{-\infty}^{\infty} \subset \mathbb{C}$ is the sequence of Fourier coefficients of a distribution $F \in \mathscr{P}'$ if and only if it is of slow growth.

Proof. Suppose first that $F \in \mathscr{P}'$ has $(a_n)_{-\infty}^{\infty}$ as its sequence of Fourier coefficients. Recall that for some integer $k$, $F$ is of order $k$. Thus for $u \in \mathscr{P}$
$$|F(u)| \le c(|u| + |Du| + \cdots + |D^k u|).$$
With $u = e_{-n}$, $n \ne 0$, this means
$$|a_n| = |F(u)| \le c(1 + |n| + \cdots + |n|^k) \le (k + 1)c|n|^k.$$
Thus $(a_n)_{-\infty}^{\infty}$ is of slow growth.
Conversely, suppose $(a_n)_{-\infty}^{\infty} \subset \mathbb{C}$ is a sequence which is of slow growth. Then there is an integer $k > 0$ such that
$$|a_n| \le c|n|^{k-2}, \quad n \ne 0. \tag{5}$$
Let $b_0 = 0$ and
$$b_n = (in)^{-k} a_n, \quad n \ne 0.$$
Let
$$v_N = \sum_{n=-N}^{N} b_n e_n.$$
From (5) we get
$$\sum_{-\infty}^{\infty} |b_n| < \infty.$$
Therefore $v_N$ converges uniformly to $v \in \mathscr{C}$. Let
$$F = D^k F_v + F_f,$$
where $f$ is the constant function $a_0$. This is a distribution; we claim that its Fourier coefficients are the $(a_n)_{-\infty}^{\infty}$. In fact, for $n \ne 0$,
$$F(e_{-n}) = D^k F_v(e_{-n}) = F_v((-D)^k e_{-n}) = (in)^k F_v(e_{-n}) = (in)^k b_n = a_n.$$
Also
$$F(e_0) = a_0. \qquad \square$$
In the course of the preceding proof we gave a second proof of the characterization theorem for periodic distributions, Theorem 6.1 of Chapter 3. In fact, the whole theory of $\mathscr{P}$ and $\mathscr{P}'$ in Chapter 3 can be derived from the point of view of Fourier series. We shall do much of such a derivation in this section and the next. An important feature of such a program is to express the action of $F \in \mathscr{P}'$ on $u \in \mathscr{P}$ in terms of the respective sequences of Fourier coefficients.
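Theorem 1.3. Suppose $F \in \mathcal{P}'$ has $(a_n)_{-\infty}^{\infty}$ as its sequence of Fourier coefficients,
and suppose $u \in \mathcal{P}$ has $(b_n)_{-\infty}^{\infty}$ as its sequence of Fourier coefficients. Then

(6)    $F(u) = \sum_{-\infty}^{\infty} a_n b_{-n} = \sum_{-\infty}^{\infty} a_{-n} b_n$.

Proof. Let

    $u_N(x) = \sum_{-N}^{N} b_n \exp(inx)$.

We know

    $u_N \to u$ $(\mathcal{P})$.

Therefore

    $F(u) = \lim F(u_N)$.

But

    $F(u_N) = \sum_{-N}^{N} b_n F(e_n) = \sum_{-N}^{N} a_{-n} b_n = \sum_{-N}^{N} a_n b_{-n}$.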

Implicit in the proof of Theorem 1.3 is the proof that the series in (6)
converges. A more direct proof uses the criteria in Theorems 1.1 and 1.2.
In fact,

    $|a_n| \le c|n|^r$,    $n \ne 0$,
    $|b_n| \le c'|n|^{-r-2}$,    $n \ne 0$,

and convergence follows.
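Formula (6) can be checked numerically in a simple case. The sketch below assumes the conventions used above, namely $a_n = F(e_{-n})$ and $DF(u) = -F(Du)$, and assumes that $\delta(u) = u(0)$; it takes $F = D\delta$, so that $a_n = in$ and $F(u) = -u'(0)$. The test function u and all names in the code are illustrative choices, not taken from the text.

```python
import cmath

# Hypothetical test function u(x) = sum_m b[m] exp(i m x), a trigonometric polynomial.
b = {-2: 0.5 - 1.0j, -1: 2.0, 0: 1.0, 1: -0.25j, 3: 0.75}

def u(x):
    return sum(c * cmath.exp(1j * m * x) for m, c in b.items())

def a(n):
    # Fourier coefficients of F = D(delta) under the assumed conventions: a_n = i n.
    return 1j * n

# Left side of (6): F(u) = -u'(0), with u'(0) estimated by a central difference.
h = 1e-5
lhs = -(u(h) - u(-h)) / (2 * h)

# Right side of (6): sum over n of a_n b_{-n}; only finitely many terms are nonzero.
rhs = sum(a(-m) * c for m, c in b.items())

print(abs(lhs - rhs))  # small; limited only by the finite-difference error
```

The left side is computed from values of u alone, so the agreement is a genuine check of (6) rather than a restatement of it.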

Corollary 1.4. If $F \in \mathcal{P}'$ has Fourier coefficients $(a_n)_{-\infty}^{\infty}$, and if $F_N$ is
the distribution defined by the function

    $\sum_{-N}^{N} a_n \exp(inx)$,

then

    $F_N \to F$ $(\mathcal{P}')$.

Proof. With $u$, $u_N$ as in Theorem 1.3,

    $F_N(u) = \sum_{-N}^{N} a_n b_{-n} = F(u_N) \to F(u)$.    □
Exercises
1. Compute the Fourier coefficients of $\delta$ and $D^k\delta$.

2. If $F \in \mathcal{P}'$ has Fourier coefficients $(a_n)_{-\infty}^{\infty}$, compute the Fourier coefficients of $F^*$, …

3. Give necessary and sufficient conditions on the Fourier coefficients $(a_n)_{-\infty}^{\infty}$
of a distribution F that F be real; or even; or odd.

4. Suppose $(\varphi_m)_1^{\infty} \subset \mathcal{C}$ is an approximate identity. Let

    $(a_{m,n})_{n=-\infty}^{\infty}$

be the sequence of Fourier coefficients of $\varphi_m$. Show that

    $|a_{m,n}| \le 1$,    $m \ge 1$,  $n = 0, \pm 1, \pm 2, \ldots$;
    $\lim_{m \to \infty} a_{m,n} = 1$,    all n.

2. Fourier series, convolutions, and approximation


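Recall that if $F, G \in \mathcal{P}'$ and $u \in \mathcal{P}$, the convolutions $F * u$ and $F * G$ are
defined by

    $F * u(x) = F(T_x \tilde{u})$,
    $(F * G)(u) = F(\tilde{G} * u)$.

We want to compute the Fourier coefficients of the convolutions in terms of
the Fourier coefficients of F, G, and u.
Theorem 2.1. Suppose $F \in \mathcal{P}'$ has Fourier coefficients $(a_n)_{-\infty}^{\infty}$; suppose
$G \in \mathcal{P}'$ has Fourier coefficients $(b_n)_{-\infty}^{\infty}$; and suppose $u \in \mathcal{P}$ has Fourier coefficients $(c_n)_{-\infty}^{\infty}$. Then $F * u$ has Fourier coefficients $(a_n c_n)_{-\infty}^{\infty}$ and $F * G$ has
Fourier coefficients $(a_n b_n)_{-\infty}^{\infty}$.

Proof. Note that

    $(T_x \tilde{e}_n)(y) = e_{-n}(y - x) = e_n(x) e_{-n}(y)$.

Therefore

    $F * e_n = a_n e_n$

and

    $F * \left(\sum_{-N}^{N} c_n e_n\right) = \sum_{-N}^{N} a_n c_n e_n$.

Taking the limit as $N \to \infty$, we find that $F * u$ has Fourier coefficients
$(a_n c_n)_{-\infty}^{\infty}$.

Now

    $(\tilde{G} * e_n)(x) = \tilde{G}(T_x \tilde{e}_n) = e_n(x)\tilde{G}(e_{-n}) = e_n(x) G(e_n) = b_{-n} e_n(x)$.

Therefore

    $(F * G)(e_{-n}) = F(\tilde{G} * e_{-n}) = b_n F(e_{-n}) = a_n b_n$.    □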
Using Theorem 2.1 and Theorems 1.1 and 1.2, we may easily give a
second proof that $F * u \in \mathcal{P}$. Similarly, if $u \in \mathcal{P}$ and $G = F_u$, then
in fact $F * G$ and $F * u$ have the same Fourier coefficients.
The approximation theorems of Chapter 3 may be proved using Theorem
2.1 and the following two general approximation theorems.
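The multiplication rule of Theorem 2.1 can be illustrated for two trigonometric polynomials. The sketch below assumes that, for functions, the convolution is normalized as $(f * u)(x) = \frac{1}{2\pi}\int_0^{2\pi} f(x - y)u(y)\,dy$, which is the normalization under which no factor of $2\pi$ appears in the theorem; the coefficient dictionaries and function names are hypothetical.

```python
import cmath
import math

def trig_poly(coeffs):
    """Return the function x -> sum_n coeffs[n] exp(i n x)."""
    return lambda x: sum(c * cmath.exp(1j * n * x) for n, c in coeffs.items())

def convolve_at(f, u, x, samples=64):
    """(1/2pi) * integral over [0, 2pi] of f(x - y) u(y) dy, by an equally spaced sum.

    For trigonometric polynomials of low degree this sum is exact up to rounding.
    """
    h = 2 * math.pi / samples
    return sum(f(x - k * h) * u(k * h) for k in range(samples)) / samples

a = {-1: 1.0, 0: 2.0, 2: 1j}        # coefficients of f (illustrative)
c = {-1: 0.5j, 2: 3.0, 3: 1.0}      # coefficients of u (illustrative)
f, u = trig_poly(a), trig_poly(c)

# Theorem 2.1 predicts coefficients a_n * c_n for the convolution.
product = {n: a[n] * c[n] for n in a if n in c}

x0 = 0.7
direct = convolve_at(f, u, x0)
from_coeffs = trig_poly(product)(x0)
print(abs(direct - from_coeffs))
```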

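Theorem 2.2. Suppose $(u_m)_1^{\infty} \subset \mathcal{P}$. Suppose the Fourier coefficients of
$u_m$ are

    $(a_{m,n})_{n=-\infty}^{\infty}$.

Suppose that for each r > 0 there is a constant c = c(r) such that

    $|a_{m,n}| \le c|n|^{-r}$,    all m, all $n \ne 0$.

Suppose, finally, that for each n,

    $a_{m,n} \to a_n$    as $m \to \infty$.

Then $(a_n)_{-\infty}^{\infty}$ is the sequence of Fourier coefficients of a function $u \in \mathcal{P}$. Moreover,

    $u_m \to u$ $(\mathcal{P})$.

Proof. The conditions imply that also

    $|a_n| \le c(r)|n|^{-r}$,    $n \ne 0$.

Thus $(a_n)_{-\infty}^{\infty}$ is the sequence of Fourier coefficients of a function $u \in \mathcal{P}$.
Given $\varepsilon > 0$, choose N so large that

    $c(2) \sum_{n=N}^{\infty} n^{-2} < \varepsilon$.

Choose M so large that $m \ge M$ implies

    $\sum_{-N}^{N} |a_{m,n} - a_n| < \varepsilon$.

It follows that if $m \ge M$ then

    $\sum_{-\infty}^{\infty} |a_{m,n} - a_n| < 5\varepsilon$.

Since

    $|u_m(x) - u(x)| \le \sum_{-\infty}^{\infty} |a_{m,n} - a_n|$,

this implies

    $|u_m - u| < 5\varepsilon$    if $m \ge M$.

Thus $u_m \to u$ uniformly. A similar argument shows each $D^k u_m \to D^k u$.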

Theorem 2.3. Suppose $(F_m)_1^{\infty} \subset \mathcal{P}'$. Suppose the Fourier coefficients of
$F_m$ are

    $(a_{m,n})_{n=-\infty}^{\infty}$.

Suppose that for some r > 0 there is a constant c such that

    $|a_{m,n}| \le c|n|^r$,    all m, all $n \ne 0$.

Suppose, finally, that for each n,

    $a_{m,n} \to a_n$    as $m \to \infty$.

Then $(a_n)_{-\infty}^{\infty}$ is the sequence of Fourier coefficients of a distribution $F \in \mathcal{P}'$.
Moreover,

    $F_m \to F$ $(\mathcal{P}')$.

Proof. The conditions imply that also

    $|a_n| \le c|n|^r$,    all $n \ne 0$.

Thus $(a_n)_{-\infty}^{\infty}$ is the sequence of Fourier coefficients of a distribution $F \in \mathcal{P}'$.
Take an integer $k \ge r + 2$, and let

    $b_{m,0} = b_0 = 0$,
    $b_{m,n} = (in)^{-k} a_{m,n}$,    $n \ne 0$,
    $b_n = (in)^{-k} a_n$,    $n \ne 0$.

As in the proof of Theorem 1.2, $(b_{m,n})_{n=-\infty}^{\infty}$ is the sequence of Fourier
coefficients of a function $v_m \in \mathcal{C}$, with

    $F_m = D^k F_{v_m} + a_{m,0} F_{e_0}$.

Similarly, $(b_n)_{-\infty}^{\infty}$ is the sequence of Fourier coefficients of $v \in \mathcal{C}$ with

    $F = D^k F_v + a_0 F_{e_0}$.

The hypotheses of the theorem imply

    $|b_{m,n}| \le c|n|^{-2}$,    $n \ne 0$,
    $b_{m,n} \to b_n$    as $m \to \infty$.

As in the proof of Theorem 2.2, these conditions imply that

    $v_m \to v$ uniformly as $m \to \infty$.

Also,

    $a_{m,0} \to a_0$.

It follows that $F_m \to F$ $(\mathcal{P}')$.    □

Exercises

1. Suppose $(\varphi_m)_1^{\infty} \subset \mathcal{C}$ is an approximate identity. Use the theorems of
this section and Exercise 4 of §1 to prove that:

    $\varphi_m * u \to u$ $(\mathcal{P})$    if $u \in \mathcal{P}$;
    $F * \varphi_m \to F$ $(\mathcal{P}')$    if $F \in \mathcal{P}'$.

2. State and prove a theorem for $L^2$ which is analogous to Theorem 2.2
for $\mathcal{P}$ and Theorem 2.3 for $\mathcal{P}'$.
3. Use the result of Exercise 2 to show that if $(\varphi_m)_1^{\infty} \subset \mathcal{C}$ is an approximate identity and $F \in L^2$, then

    $F * \varphi_m \to F$ $(L^2)$.

4. Prove the converse of Theorem 2.2: if $u_m \to u$ $(\mathcal{P})$ then the Fourier
coefficients satisfy the hypotheses of Theorem 2.2.

3. The heat equation: distribution solutions


Many physical processes are approximately described by a function u,
depending on time and on position in space, which satisfies a type of partial
differential equation called a heat equation or diffusion equation. The simplest
case is the following. Find u(x, t), a continuous function defined for $x \in [0, \pi]$
and for $t \ge 0$, satisfying the equation

(1)    $\frac{\partial}{\partial t} u(x, t) = K\left(\frac{\partial}{\partial x}\right)^2 u(x, t)$,    $x \in (0, \pi)$,  $t > 0$,

the initial condition

(2)    $u(x, 0) = g(x)$,    $x \in [0, \pi]$,

and one of the following two sets of boundary conditions:

(3)    $u(0, t) = u(\pi, t) = 0$,    $t \ge 0$;

(3)'   $\frac{\partial}{\partial x} u(0, t) = \frac{\partial}{\partial x} u(\pi, t) = 0$,    $t > 0$.

The function u describes the temperature distribution in a thin homogeneous metal rod of length $\pi$. The number x represents the distance of the
point P on the rod from one end of the rod, the number t represents the time,
and the number u(x, t) the temperature at the point P at time t. Equation (1)
expresses the assumption that the rod is in an insulating medium, with no
heat gained or lost except possibly at the ends. The constant K > 0 is proportional to the thermal conductivity of the metal, and we may assume units
are chosen so that K = 1. Equation (2) expresses the assumption that the
temperature distribution is known at time t = 0. Equation (3) expresses the
assumption that the ends of the rod are kept at the constant temperature 0,
while the alternative equation (3)' applies if the ends are assumed insulated.
Later we shall sketch the derivation of Equation (1) and indicate some other
physical processes it describes.
Let us convert the two problems (1), (2), (3) and (1), (2), (3)' into a single
problem for a function periodic in x. Note that if (2) and (3) are both to hold,
we should have

(4)    $g(0) = g(\pi) = 0$.

Let $g(-x) = -g(x)$, $x \in (-\pi, 0)$. Then g has a unique extension to all of
$\mathbb{R}$ which is odd and periodic (period $2\pi$). Because of (4) the resulting function
is still continuous. Suppose u were a function defined for all $x \in \mathbb{R}$ and $t \ge 0$,
periodic in x, and satisfying (1) for all $x \in \mathbb{R}$, t > 0. Then u(x, 0) = g(x) is
odd. If g is smooth, then also

    $\frac{\partial}{\partial t} u(x, 0) = D^2 g(x)$

is odd, and we might expect that u is odd as a function of x for each $t \ge 0$.
If this is so, then necessarily (3) is true.
Similarly, if (2) and (3)' are both to hold, we would expect that if g is
smooth then

    $Dg(0) = Dg(\pi) = 0$.

In this case, let $g(-x) = g(x)$, $x \in (-\pi, 0)$. Then g has a unique extension
to all of $\mathbb{R}$ which is even and periodic. If g is of class $C^1$ on $[0, \pi]$, the extension is of class $C^1$ on $\mathbb{R}$. Again, if u were a solution of (1), (2) which is
periodic in x for all t, we might expect u to be even for all t. Then (3)' is
necessarily true.
The above considerations suggest that we replace the two problems
above by the single problem: find u defined for $x \in \mathbb{R}$, $t \ge 0$, such that u is
periodic in x,

(5)    $\frac{\partial}{\partial t} u(x, t) = \left(\frac{\partial}{\partial x}\right)^2 u(x, t)$,    $x \in \mathbb{R}$,  $t > 0$,

(6)    $u(x, 0) = g(x)$,    $x \in \mathbb{R}$,

where $g \in \mathcal{C}$ is given.


It is convenient and useful to ask for solutions of an analogous problem
for periodic distributions. We formulate this more general problem as follows.
Suppose that to each t in some interval $(a, b) \subset \mathbb{R}$ we have assigned a distribution $F_t \in \mathcal{P}'$. If $s \in (a, b)$, $G \in \mathcal{P}'$, and

    $(t - s)^{-1}(F_t - F_s) \to G$ $(\mathcal{P}')$

as $t \to s$, it is natural to consider G as the derivative of $F_t$ with respect to t
at t = s. We do so, and write

    $G = \frac{d}{dt} F_t \Big|_{t=s}$.

Our formulation of the problem for distributions is as follows: given $G \in \mathcal{P}'$,
find distributions $F_t \in \mathcal{P}'$ for each t > 0, such that

(7)    $\frac{d}{dt} F_t \Big|_{t=s} = D^2 F_s$,    all s > 0,

(8)    $F_t \to G$ $(\mathcal{P}')$    as $t \to 0$.

Theorem 3.1. For each $G \in \mathcal{P}'$ there is a unique family $(F_t)_{t>0} \subset \mathcal{P}'$
such that (7) and (8) hold. For each t > 0, $F_t$ is a function $u_t(x) = u(x, t)$
which is infinitely differentiable in both variables and satisfies (5).

Proof. Let us prove uniqueness first. Suppose $(F_t)_{t>0}$ is a solution of
(7), (8). Each $F_t$ has Fourier coefficients $(a_n(t))_{-\infty}^{\infty}$,

    $a_n(t) = F_t(e_{-n})$,    t > 0,  $n = 0, \pm 1, \ldots$.

Then

    $(t - s)^{-1}[a_n(t) - a_n(s)] \to D^2 F_s(e_{-n}) = -n^2 a_n(s)$

as $t \to s$. In other words,

(9)    $D a_n(t) = -n^2 a_n(t)$,    t > 0.

As $t \to 0$ we have

(10)    $a_n(t) \to b_n = G(e_{-n})$.

The unique function $a_n(t)$, t > 0, which satisfies (9) and (10) is

(11)    $a_n(t) = b_n \exp(-n^2 t)$.

This shows that $a_n(t)$ are uniquely determined, so the distributions $F_t$ are
uniquely determined.
To show existence, we want to show that the $a_n(t)$ defined by (11) are the
Fourier coefficients of a smooth function for t > 0. Recall that if y > 0
then

    $e^y = \sum_{m=0}^{\infty} (m!)^{-1} y^m > (m!)^{-1} y^m$,

so

    $e^{-y} < m!\, y^{-m}$.

Therefore

    $|a_n(t)| \le |b_n|\, m!\, n^{-2m} t^{-m}$,    $n \ne 0$,  $m = 0, 1, 2, \ldots$.

But

    $|b_n| \le c|n|^k$,    $n \ne 0$,

for some c and k. Therefore

    $|a_n(t)| \le c\, m!\, |n|^{k-2m} t^{-m}$,    $n \ne 0$,  $m = 0, 1, 2, \ldots$.

It follows that for each t > 0, $(a_n(t))_{-\infty}^{\infty}$ is the sequence of Fourier coefficients of a function $u_t \in \mathcal{P}$. Then $u_t$ is the uniform limit of the functions

    $u_N(x, t) = \sum_{-N}^{N} a_n(t) \exp(inx) = \sum_{-N}^{N} b_n \exp(inx - n^2 t)$.

Then

    $\left(\frac{\partial}{\partial t}\right)^j \left(\frac{\partial}{\partial x}\right)^r u_N(x, t) = \sum_{-N}^{N} a_{n,j,r}(t) \exp(inx)$,    $a_{n,j,r}(t) = (-n^2)^j (in)^r a_n(t)$.

As above,

    $|a_{n,j,r}(t)| \le c\, m!\, |n|^{k+2j+r-2m} t^{-m}$,    $n \ne 0$,  $m = 0, 1, 2, \ldots$.

It follows that each partial derivative of $u_N$ converges as $N \to \infty$, uniformly
for $x \in \mathbb{R}$ and $t \ge \delta > 0$. Thus

    $u(x, t) = u_t(x)$

is smooth for $x \in \mathbb{R}$, t > 0. For each N, $u_N$ satisfies (5). Therefore u satisfies
(5).
Let $F_t$ be the distribution determined by $u_t$, t > 0. Thus the Fourier
coefficients of $F_t$ are the same as those of $u_t$.
It follows from (11) that

    $|a_n(t)| \le |b_n|$,    $a_n(t) \to b_n$ as $t \to 0$.

By Theorem 2.3, therefore, (8) is satisfied. Finally, each partial derivative
of u is bounded on the region $x \in \mathbb{R}$, $t \ge \delta > 0$. It follows from this and the
mean value theorem that

    $(t - s)^{-1}(u_t - u_s) \to \frac{\partial}{\partial t} u_s$

uniformly as $t \to s > 0$. Therefore (7) also holds.    □
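The recipe in the proof, multiplying the n-th Fourier coefficient of the initial data by $\exp(-n^2 t)$, is easy to try on a small example. The sketch below is only an illustration under stated assumptions: the initial function $g(x) = \sin x + \tfrac12 \sin 3x$ and the helper names are hypothetical choices, and the heat equation is checked by finite differences rather than exactly.

```python
import cmath
import math

def heat_solution(b, x, t):
    """Sum of b_n exp(inx - n^2 t) over the (finitely many) coefficients stored in b."""
    return sum(c * cmath.exp(1j * n * x - n * n * t) for n, c in b.items())

# Fourier coefficients of the illustrative initial data g(x) = sin x + 0.5 sin 3x.
b = {1: -0.5j, -1: 0.5j, 3: -0.25j, -3: 0.25j}

def closed_form(x, t):
    # Each mode sin(nx) simply decays like exp(-n^2 t).
    return math.exp(-t) * math.sin(x) + 0.5 * math.exp(-9 * t) * math.sin(3 * x)

def residual(x, t, h=1e-4):
    """Central-difference estimate of u_t - u_xx, which should be close to zero."""
    ut = (heat_solution(b, x, t + h) - heat_solution(b, x, t - h)) / (2 * h)
    uxx = (heat_solution(b, x + h, t) + heat_solution(b, x - h, t)
           - 2 * heat_solution(b, x, t)) / (h * h)
    return ut - uxx

print(abs(heat_solution(b, 1.0, 0.3) - closed_form(1.0, 0.3)))
print(abs(residual(1.0, 0.3)))
```

The high-frequency term is damped much faster than the low-frequency one, which is the mechanism behind the smoothness of $u_t$ for t > 0.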

Exercises
1. In Theorem 3.1, suppose G is even or odd. Show that each Ft is respectively even or odd. Show that if G is real, then each F t is real.
2. Discuss the behavior of Ft as t -+ 00.
3. Formulate the correct conditions on the function u if it represents the
temperature in a rod with one end insulated and the other held at constant
temperature.
4. In Theorem 3.1, suppose $G = F_w$, the distribution defined by a
function $w \in \mathcal{P}$. Show that the functions $u_t \to w$ $(\mathcal{P})$ as $t \to 0$.
5. Let

    $g_t(x) = \sum_{n=-\infty}^{\infty} \exp(-n^2 t + inx)$,    t > 0,  $x \in \mathbb{R}$.

Show that $g_t \in \mathcal{P}$. In Theorem 3.1, show that

    $u_t = G * g_t$.

6. With $g_t$ as in Exercise 5, show that

    $g_t * g_s = g_{t+s}$.

7. The backwards heat equation is the equation (1) considered for $t \le 0$,
with initial (or "final") condition (2). Is it reasonable to expect that solutions
for this problem will exist? Specifically, given $G \in \mathcal{P}'$, will there always be a
family of distributions $(F_t)_{t<0} \subset \mathcal{P}'$ such that

    $\frac{d}{dt} F_t \Big|_{t=s} = D^2 F_s$,    all s < 0,

    $F_t \to G$ $(\mathcal{P}')$    as $t \to 0$?

8. The Schrödinger equation (simplest form) is the equation

    $\frac{\partial u}{\partial t} = i\,\frac{\partial^2 u}{\partial x^2}$.

Consider the corresponding problem for periodic distributions: given
$G \in \mathcal{P}'$, find a family $(F_t)_{t>0}$ of periodic distributions such that

    $\frac{d}{dt} F_t \Big|_{t=s} = i D^2 F_s$,    all s > 0,

    $F_t \to G$ $(\mathcal{P}')$    as $t \to 0$.

Discuss the existence and uniqueness of solutions to this problem.


4. The heat equation: classical solutions; derivation


Let us return to the problem given at the beginning of the last section:
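find u(x, t), continuous for $x \in [0, \pi]$ and $t \ge 0$, and satisfying

(1)    $\frac{\partial}{\partial t} u(x, t) = \left(\frac{\partial}{\partial x}\right)^2 u(x, t)$,    $x \in (0, \pi)$,  $t > 0$;

(2)    $u(x, 0) = g(x)$,    where $g \in C[0, \pi]$ is given;

(3)    $u(0, t) = u(\pi, t) = 0$,    $t \ge 0$,

or

(3)'   $\frac{\partial}{\partial x} u(0, t) = \frac{\partial}{\partial x} u(\pi, t) = 0$,    $t > 0$.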

Such a function u is called a classical solution of the problem (1), (2), (3)
or (1), (2), (3)', in contrast to the distribution solution for the periodic
problem given by Theorem 3.1. In this section we complete the discussion
by showing that a classical solution exists and is unique, and that it is given
by the distribution solution. We consider the problem (1), (2), (3), and leave
the problem (1), (2), (3)' as an exercise.
Given $g \in C[0, \pi]$ with $g(0) = g(\pi) = 0$ (so that (3) is reasonable),
extend g to be an odd periodic function in $\mathcal{C}$, and let $G = F_g$. By Theorem 3.1,
there is a function $u(x, t) = u_t(x)$ which is smooth in x, t for $x \in \mathbb{R}$, t > 0,
satisfies (1) for all such x, t, and which converges to G in the sense of $\mathcal{P}'$ as
$t \to 0$. By Exercise 1 of §3, u is odd as a function of x for each t > 0. Since u
is also periodic as a function of x, this implies that (3) holds when t > 0.
If we knew that $u_t \to g$ uniformly as $t \to 0$, it would follow that the restriction
of u to $0 \le x \le \pi$ is a classical solution of (1), (2), (3). Note that this is true
when (the extension of) g is smooth: see Exercise 4 of §3. Everything else we
need to know follows from the maximum principle stated in the following
theorem.

Theorem 4.1. Suppose u is a real-valued classical solution of (1), (2).
Then for each T > 0, the maximum value of u(x, t) in the rectangle

    $0 \le x \le \pi$,    $0 \le t \le T$

is attained on one of the three edges t = 0, x = 0, or $x = \pi$.

Proof. Given $\varepsilon > 0$, let

    $v(x, t) = u(x, t) - \varepsilon t$.

It is easy to see that the maximum value of v is attained on one of the edges
in question. Otherwise, it would be attained at $(x_0, t_0)$, where

    $x_0 \in (0, \pi)$,    $t_0 > 0$.

For v to be maximal here, we must have

    $\frac{\partial}{\partial t} v(x_0, t_0) \ge 0$.

But then

    $\left(\frac{\partial}{\partial x}\right)^2 v(x_0, t_0) = \left(\frac{\partial}{\partial x}\right)^2 u(x_0, t_0) = \frac{\partial}{\partial t} u(x_0, t_0) = \frac{\partial}{\partial t} v(x_0, t_0) + \varepsilon \ge \varepsilon > 0$,

so $x_0$ cannot be a maximum for $v(x, t_0)$ on $[0, \pi]$.
Now u and v differ by at most $\varepsilon T$ on the rectangle. Therefore for any
(x, t) in the rectangle,

    $u(x, t) \le M + 2\varepsilon T$,

where M is the maximum of u for t = 0, x = 0, or $x = \pi$. Since this is true
for every $\varepsilon > 0$, the conclusion follows.    □
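The maximum principle can be observed on a grid for an explicit solution. The sketch below is an illustration only: the solution $u(x, t) = e^{-t}\sin x - 0.3\, e^{-4t}\sin 2x$, the rectangle height T = 2, and the grid sizes are hypothetical choices, and a grid comparison is of course not a proof.

```python
import math

def u(x, t):
    # An illustrative classical solution of u_t = u_xx with u(0, t) = u(pi, t) = 0.
    return math.exp(-t) * math.sin(x) - 0.3 * math.exp(-4 * t) * math.sin(2 * x)

T, nx, nt = 2.0, 200, 200
xs = [math.pi * i / nx for i in range(nx + 1)]
ts = [T * j / nt for j in range(nt + 1)]

# Grid maximum over the three edges t = 0, x = 0, x = pi.
edge_max = max([u(x, 0.0) for x in xs]
               + [u(0.0, t) for t in ts]
               + [u(math.pi, t) for t in ts])

# Grid maximum over the remaining grid points (0 < x < pi, 0 < t <= T).
inner_max = max(u(x, t) for x in xs[1:-1] for t in ts[1:])

print(edge_max, inner_max)
```

Up to grid resolution, the largest value found away from the three edges does not exceed the largest value found on them.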
Theorem 4.2. For each continuous g with $g(0) = g(\pi) = 0$, there is a
unique classical solution of problem (1), (2), (3).
Proof. Note that u is a classical solution with initial values given by g
if and only if the real and imaginary parts of u are classical solutions with
initial values given by the real and imaginary parts of g, respectively. Therefore we may assume that g and u are real-valued. Applying Theorem 4.1
we see that

    $u(x, t) \le |g|$,    all $x \in [0, \pi]$,  $t \ge 0$.

Applying Theorem 4.1 to $-u$, which is a solution with initial values given
by $-g$, we get

    $-u(x, t) \le |g|$,    all x, t.

Thus

    $|u(x, t)| \le |g|$,    all x, t.

This proves uniqueness.
To prove existence, let g be extended so as to be odd and periodic. Let
$(\varphi_m)_1^{\infty} \subset \mathcal{P}$ be an approximate identity. Let

    $g_m = \varphi_m * g \in \mathcal{P}$.

We can choose $\varphi_m$ to be even, so that $g_m$ is odd. Let $u_m$ be the distribution
solution given by Theorem 3.1 for $g_m$ as initial value, and let u be the distribution solution with g as initial value. Then we know $u_m(x, t) \to g_m(x)$
uniformly with respect to x as $t \to 0$, so we may consider $u_m$ as continuous
for $t \ge 0$, $x \in \mathbb{R}$. Moreover, since $g_m \to g$ uniformly as $m \to \infty$, it follows
that $u_m(x, t) \to u(x, t)$, at least in the sense of $\mathcal{P}'$, for each t > 0. On the
other hand, since $|g_m - g| \to 0$ we find that for $x \in [0, \pi]$

    $|u_m(x, t) - u_p(x, t)| \to 0$

as $m, p \to \infty$, uniformly in x and t, t > 0. Thus we must have $u_m \to u$
uniformly. It follows that u has a continuous extension to t = 0, and therefore that u (restricted to $x \in [0, \pi]$) is a classical solution.    □
A second proof of the uniform convergence of u to g as $t \to 0$ is sketched
in the exercises below.


The heat equation, (1), may be derived as follows. Again we consider a


homogeneous thin metal rod in an insulating medium. Imagine the rod
divided into sections of length $\varepsilon$, and suppose x is the coordinate of the
midpoint of one section. We consider only this and the two adjacent sections,
and approximate the temperature distribution at time t by assuming u to be
constant in each section. The rate of flow of heat from the section centered
at $x + \varepsilon$ to that centered at x is proportional to the temperature difference
$u(x + \varepsilon, t) - u(x, t)$, and inversely proportional to the distance $\varepsilon$. The
temperature in each section is the amount of heat divided by the volume, and
the volume is proportional to $\varepsilon$. Considering also the heat flow from the
section centered at $x - \varepsilon$ to that centered at x, we get an approximate expression for the rate of change of temperature at (x, t):

    $\varepsilon \frac{\partial}{\partial t} u(x, t) \approx K\varepsilon^{-1}[u(x + \varepsilon, t) - u(x, t) + u(x - \varepsilon, t) - u(x, t)]$

or

    $\frac{\partial}{\partial t} u(x, t) \approx K\varepsilon^{-2}[u(x + \varepsilon, t) + u(x - \varepsilon, t) - 2u(x, t)]$.

Consider the expression on the right as a function of $\varepsilon$ and let $\varepsilon \to 0$. Two
applications of L'Hôpital's rule give

(4)    $\frac{\partial}{\partial t} u(x, t) = K\left(\frac{\partial}{\partial x}\right)^2 u(x, t)$.
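The limit used in this derivation, that $\varepsilon^{-2}[u(x + \varepsilon) + u(x - \varepsilon) - 2u(x)]$ tends to the second derivative, can be watched numerically. The sketch below uses $u = \sin$ at a fixed time as a stand-in profile; the function name and the sample point are illustrative assumptions.

```python
import math

def second_difference(u, x, eps):
    """eps^{-2} [u(x + eps) + u(x - eps) - 2 u(x)], the bracketed expression with K = 1."""
    return (u(x + eps) + u(x - eps) - 2.0 * u(x)) / (eps * eps)

x0 = 0.8
exact = -math.sin(x0)   # the second derivative of sin at x0

# The error shrinks roughly like eps^2 as eps decreases.
errors = [abs(second_difference(math.sin, x0, eps) - exact)
          for eps in (1e-1, 1e-2, 1e-3)]
print(errors)
```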

Note that essentially the same reasoning applies to the following general
situation. A (relatively) narrow cylinder contains a large number of individual
objects which move rather randomly about. The random motion of each
object is assumed symmetric in direction (left or right is equally likely) and
essentially independent of position in the cylinder, past motion, or the
presence of the other objects. As examples one can picture diffusion of
molecules or dye in a tube of water kicked about by thermal motion of the
water molecules, or the late stages of a large cocktail party in a very long
narrow room. If u(x, t) represents the density of the objects near the point x
at time t, then equation (4) arises again. Boundary conditions like (3)'
correspond to the ends of the cylinder being closed, while those like (3)
correspond to having one way doors at the ends, to allow egress but not
ingress.

Exercises
1. Suppose $G = F_w$, $w \in \mathcal{C}$, and suppose $(u_t)_{t>0}$ is the family of functions
in Theorem 3.1. Show that if $w \ge 0$, then each $u_t$ is $\ge 0$.
2. As in Exercise 5 of §3, let

    $g_t(x) = \sum_{-\infty}^{\infty} \exp(-n^2 t + inx)$,    t > 0.

Show that

    $\frac{1}{2\pi}\int_0^{2\pi} g_t(x)\,dx = 1$.

Show that if $w \in \mathcal{P}$ and $w \ge 0$, then $g_t * w \ge 0$. Show, by using any approximate identity in $\mathcal{P}$, that $g_t \ge 0$.
3. Show that $(g_t)_{t>0}$ is an approximate identity as $t \to 0$, i.e., (in addition
to the conclusions of Exercise 2) for each $0 < \delta < \pi$,

    $\lim_{t \to 0} \int_{\delta}^{2\pi - \delta} g_t(x)\,dx = 0$.

(Hint: choose $w \in \mathcal{P}$ such that $w \ge 0$, $w(0) = 0$, and $w(x) = 1$ for $\delta \le x \le 2\pi - \delta$. Then consider $g_t * w(0)$.)
4. Use Exercise 3 to show that if $w \in \mathcal{C}$, $G = F_w$ and $(u_t)_{t>0}$ is the family
of functions given in Theorem 3.1, then $u_t \to w$ uniformly as $t \to 0$.
5. In a situation in which heat is being supplied to or drawn from a rod
with ends at a fixed temperature, one is led to the problem

    $\frac{\partial}{\partial t} u(x, t) = \left(\frac{\partial}{\partial x}\right)^2 u(x, t) + f(x, t)$,    $t > 0$,  $x \in (0, \pi)$,
    $u(x, 0) = g(x)$,
    $u(0, t) = u(\pi, t) = 0$.

Formulate and solve the corresponding problem for periodic distributions,
and discuss existence and uniqueness of classical solutions.

5. The wave equation


Another type of equation which is satisfied by the functions describing
many physical processes is the wave equation. The simplest example occurs
in connection with a vibrating string. Consider a taut string of length $\pi$
with endpoints fixed at the same height, and let u(x, t) denote the vertical
displacement of the string at the point with coordinate x, at time t. If there
are no external forces, the function u is (approximately) a solution of the
equation

(1)    $\frac{\partial^2 u}{\partial t^2} = c^2\,\frac{\partial^2 u}{\partial x^2}$,    $x \in (0, \pi)$,  $t > 0$.

Here c is a constant depending on the tension and properties of the string.
The condition that the endpoints be fixed is

(2)    $u(0, t) = u(\pi, t) = 0$,    $t \ge 0$.

To complete the determination of u it is enough to know the position and
velocity of each point of the string at time t = 0:

(3)    $u(x, 0) = g(x)$,    $\frac{\partial}{\partial t} u(x, 0) = h(x)$,    $x \in [0, \pi]$.

We shall discuss the derivation of (1) later in this section.


As in the case of the heat equation we begin by formulating a corresponding problem for periodic distributions and solving it. The conditions
(2) suggest that we extend g, h to be odd and periodic and look for a solution
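periodic in x. The procedure is essentially the same as in §3.
Theorem 5.1. For each $G, H \in \mathcal{P}'$ there is a unique family $(F_t)_{t>0} \subset \mathcal{P}'$
with the following properties:

    $\frac{d}{dt} F_t \Big|_{t=s}$ exists for each s > 0,

(4)    $\left(\frac{d}{dt}\right)^2 F_t \Big|_{t=s} = D^2 F_s$,    all s > 0,

(5)    $F_t \to G$ $(\mathcal{P}')$    as $t \to 0$,

(6)    $\frac{d}{dt} F_t \Big|_{t=s} \to H$ $(\mathcal{P}')$    as $s \to 0$.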

Proof. Let $(b_n)_{-\infty}^{\infty}$ be the sequence of Fourier coefficients of G, and let
$(c_n)_{-\infty}^{\infty}$ be the sequence of Fourier coefficients of H. If $(F_t)_{t>0}$ is such a family
of distributions, let the Fourier coefficients of $F_t$ be

    $(a_n(t))_{-\infty}^{\infty}$.

As in the proof of Theorem 3.1, conditions (4), (5), (6) imply that $a_n$ is
twice continuously differentiable for t > 0, $a_n$ and $Da_n$ have limits at t = 0,
and

    $D^2 a_n(t) = -n^2 a_n(t)$,    $a_n(0) = b_n$,    $Da_n(0) = c_n$.

The unique function satisfying these conditions (see §6 of Chapter 2) is

(7)    $a_n(t) = b_n \cos nt + n^{-1} c_n \sin nt$,    $n \ne 0$,

(8)    $a_0(t) = b_0 + c_0 t$.

Thus we have proved uniqueness. On the other hand, the functions (7), (8)
satisfy

(9)    $|a_n(t)| \le |b_n| + t|c_n|$,

(10)   $a_n(t) \to b_n$    as $t \to 0$,

(11)   $|Da_n(t)| \le |n||b_n| + |c_n|$,

(12)   $Da_n(t) \to c_n$    as $t \to 0$,    all n.

It follows from (9) and Theorem 1.2 that $(a_n(t))_{-\infty}^{\infty}$ is the sequence of Fourier
coefficients of a distribution $F_t$. It follows from (10) and Theorem 2.3 that
(5) is true. It follows from (11), the mean value theorem, and Theorem 1.2
that

    $\frac{d}{dt} F_t \Big|_{t=s}$

exists for s > 0 and has Fourier coefficients $(Da_n(s))_{-\infty}^{\infty}$. Then (12) gives (6).
Finally, it follows that

    $\left(\frac{d}{dt}\right)^2 F_t \Big|_{t=s}$

exists, s > 0. The choice of $a_n(t)$ implies that (4) holds.    □

Let us look more closely at the distribution $F_t$ in the case when G and H
are the distributions defined by functions g and h in $\mathcal{P}$. First, suppose h = 0.
The Fourier series for g converges:

    $g(x) = \sum_{-\infty}^{\infty} b_n \exp(inx)$.

Inequality (9) implies that the Fourier series for $F_t$ also converges; then $F_t$
is the distribution defined by the function $u_t \in \mathcal{P}$ where

    $u(x, t) = u_t(x) = \sum_{-\infty}^{\infty} b_n \cos nt \exp(inx)$
    $= \tfrac12 \sum_{-\infty}^{\infty} b_n [\exp(int) + \exp(-int)] \exp(inx)$
    $= \tfrac12 \sum_{-\infty}^{\infty} b_n \exp(in(x + t)) + \tfrac12 \sum_{-\infty}^{\infty} b_n \exp(in(x - t))$,

or

(13)    $u(x, t) = \tfrac12 g(x + t) + \tfrac12 g(x - t)$.

It is easily checked that u is a solution of

    $\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}$,    $u(x, 0) = g(x)$,    $\frac{\partial u}{\partial t}(x, 0) = 0$.

Next, consider the case g = 0. Again $(a_n(t))_{-\infty}^{\infty}$ is the sequence of Fourier
coefficients of a function $u_t \in \mathcal{P}$,

    $u(x, t) = u_t(x) = \sum_{n \ne 0} n^{-1} c_n \sin nt \exp(inx) + c_0 t$.

To get rid of the $n^{-1}$ factor, we differentiate:

    $\frac{\partial}{\partial x} u(x, t) = \sum_{-\infty}^{\infty} i c_n \sin nt \exp(inx)$
    $= \tfrac12 \sum_{-\infty}^{\infty} c_n [\exp(int) - \exp(-int)] \exp(inx)$
    $= \tfrac12 \sum_{-\infty}^{\infty} c_n \exp(in(x + t)) - \tfrac12 \sum_{-\infty}^{\infty} c_n \exp(in(x - t))$
    $= \tfrac12 h(x + t) - \tfrac12 h(x - t)$.

Integrating with respect to x gives

(13)'   $u(x, t) = \tfrac12 \int_{x-t}^{x+t} h(y)\,dy + a(t)$,

where a(t) is a constant to be determined so that u is periodic as a function
of x. But the periodicity of h implies that a(t) should be taken to be zero.
Then u so defined is a solution of

    $\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}$,    $u(x, 0) = 0$,    $\frac{\partial u}{\partial t}(x, 0) = h(x)$.
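Both closed forms, (13) and (13)' with a(t) = 0, can be compared against the Fourier series they came from. The sketch below is a numerical illustration under stated assumptions: the coefficient dictionaries for g and h are hypothetical trigonometric polynomials, and the integral in (13)' is approximated by the midpoint rule.

```python
import cmath
import math

def series(coeffs, x):
    """sum_n coeffs[n] exp(i n x) for finitely many nonzero coefficients."""
    return sum(c * cmath.exp(1j * n * x) for n, c in coeffs.items())

b = {-2: 0.3j, 0: 1.0, 1: 0.5, 2: -0.3j}     # illustrative coefficients of g
c = {-1: 1.0 + 1j, 0: 0.2, 3: -0.4}          # illustrative coefficients of h

def u_from_g(x, t):
    """sum b_n cos(nt) exp(inx): the case h = 0."""
    return sum(bn * math.cos(n * t) * cmath.exp(1j * n * x) for n, bn in b.items())

def u_from_h(x, t):
    """sum over n != 0 of n^{-1} c_n sin(nt) exp(inx), plus c_0 t: the case g = 0."""
    total = c.get(0, 0.0) * t
    total += sum(cn * math.sin(n * t) / n * cmath.exp(1j * n * x)
                 for n, cn in c.items() if n != 0)
    return total

def half_integral(x, t, steps=2000):
    """(1/2) * integral of h over [x - t, x + t], by the midpoint rule."""
    width = 2.0 * t / steps
    return 0.5 * width * sum(series(c, x - t + (k + 0.5) * width) for k in range(steps))

x0, t0 = 0.9, 0.4
gap_g = abs(u_from_g(x0, t0) - 0.5 * (series(b, x0 + t0) + series(b, x0 - t0)))
gap_h = abs(u_from_h(x0, t0) - half_integral(x0, t0))
print(gap_g, gap_h)
```

The first gap is at rounding level; the second is limited by the quadrature, which is why no additive term a(t) is needed to match the series.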

Equation (1) for the vibrating string may be derived as follows. Approximate the curve $u_t$ representing the displacement at time t by a polygonal line
joining the points ..., $(x - \varepsilon, u(x - \varepsilon, t))$, $(x, u(x, t))$, $(x + \varepsilon, u(x + \varepsilon, t))$, ....
The force on the string at the point x is due to the tension of the string. In
this approximation the force due to tension is directed along the line segments from $(x, u(x, t))$ to $(x \pm \varepsilon, u(x \pm \varepsilon, t))$. The net vertical component
of force is then proportional to the right side of

    $\left(\frac{\partial}{\partial t}\right)^2 u(x, t) \approx c^2 \varepsilon^{-2}[u(x + \varepsilon, t) + u(x - \varepsilon, t) - 2u(x, t)]$,

where c is constant. We take the limit as $\varepsilon \to 0$. Two applications of
L'Hôpital's rule to the right side considered as a function of $\varepsilon$ give (1).
The constant $c^2$ can be seen to be equal to

    $K T r^{-2}$,

where r is the diameter of the string or wire, T is the tension, and K is proportional to the density of the material. Let us suppose also that the length
of the string is $\pi l$ instead of $\pi$. The solution of the problem corresponding
to (1), (2), (3) when g is real and h = 0 can be shown to be of the form

(14)    $u(x, t) = \sum_{n=1}^{\infty} b_n \cos(cnt/l) \sin(nx/l)$.

A single term

    $b_n \cos(cnt/l) \sin(nx/l)$

represents a "standing wave", a sine curve with n maxima and minima in
the interval $[0, \pi l]$ and with height varying with time according to the term
$\cos(cnt/l)$. Thus the maximum height is $b_n$, and the original wave is repeated
after a time interval of length

    $2\pi l/cn$.

Thus the frequency for this term is

    $cn/2\pi l = n(KT)^{1/2}/2\pi l r$,

an integral multiple of the lowest frequency

(15)    $c/2\pi l = (KT)^{1/2}/2\pi l r$.

In hearing the response of a plucked string the ear performs a Fourier
analysis on the air vibrations corresponding to u(x, t). Only finitely many
terms of (14) represent frequencies low enough to be heard, so the series
(14) is heard as though it were a finite sum

(16)    $\sum_{n=1}^{N} b_n \cos(cnt/l) \sin(nx/l)$.

In general, if the basic frequency (15) is not too low (or high) it is heard as
the pitch of the string, and the coefficients $b_1, b_2, \ldots, b_N$ determine the
purity: a pure tone corresponds to all but one $b_n$ being zero. Formula (15)
shows that the pitch varies inversely with length and radius, and directly
with the square root of the tension.

Exercises
1. In Theorem 5.1, suppose DG and H are in $L^2$. Show that

    $DF_t$  and  $\frac{d}{dt} F_t$

are in $L^2$ for each t, and

    $\|DF_t\|^2 + \left\|\frac{d}{dt} F_t\right\|^2 = \|DG\|^2 + \|H\|^2$.

2. In the problem (1), (2), (3) with c = 1 suppose g, h, and u are smooth
functions. Show, by computing the derivative with respect to t, that

    $\int_0^{\pi} \left|\frac{\partial}{\partial x} u(x, t)\right|^2 dx + \int_0^{\pi} \left|\frac{\partial}{\partial t} u(x, t)\right|^2 dx$

is constant. (This expresses conservation of energy: the first term represents
potential energy, from the tension, while the second term represents kinetic
energy.)
3. Use Exercise 2 and the results of this section to prove a theorem about
existence and uniqueness of classical solutions of the problem (1), (2), (3).
4. Show that if H(f) = 0 when f is constant, then the solution $(F_t)_{t>0}$
in Theorem 5.1 can be written in the form

    $F_t = \tfrac12 (T_t G + T_{-t} G) + \tfrac12 (T_{-t} SH - T_t SH)$.

Here again $T_t$ denotes translation, while S is the operator from $\mathcal{P}'$ to $\mathcal{P}'$
defined by

    $SH(u) = H(v)$,

where

    $v(x) = \int_x^{2\pi} u(t)\,dt + \frac{x}{2\pi} \int_0^{2\pi} u(t)\,dt$.

This is the analog for distributions of formulas (13) and (13)'.

6. Laplace's equation and the Dirichlet problem


A third equation of mathematical and physical importance is Laplace's
equation. In two variables this is the equation
(1)    $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$.

A function u satisfying this equation is said to be harmonic. A typical
problem connected with this equation is the Dirichlet problem for the disc:
find a function continuous on the closed unit disc

    $\{(x, y) \mid x^2 + y^2 \le 1\}$

and equal on the boundary to a given function g:

(2)    $u(x, y) = g(x, y)$,    $x^2 + y^2 = 1$.

A physical situation leading to this problem is the following. Let u(x, y, t)


denote the temperature at time t at the point with coordinates (x, y) on a
metal disc of radius 1. Suppose the temperature at the edge of the disc is
fixed, though varying from point to point, while the interior of the disc is
insulated. Eventually thermal equilibrium will be reached: u will be independent of time. The resulting temperature distribution u is approximately a
solution of (1), (2).
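To solve this equation we express u and g by polar coordinates:

    $u = u(r, \theta)$,    $g = g(\theta)$,

where

    $x = r\cos\theta$,    $y = r\sin\theta$;    $\theta = \tan^{-1}(y x^{-1})$.

Then

    $\frac{\partial u}{\partial x} = \frac{\partial u}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial u}{\partial \theta}\frac{\partial \theta}{\partial x} = \cos\theta\,\frac{\partial u}{\partial r} - \frac{\sin\theta}{r}\,\frac{\partial u}{\partial \theta}$,

and similarly for $\partial u/\partial y$. An elementary but tedious calculation gives

    $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 u}{\partial \theta^2}$.

Thus we want to solve

(3)    $r^2 \frac{\partial^2 u}{\partial r^2} + r \frac{\partial u}{\partial r} + \frac{\partial^2 u}{\partial \theta^2} = 0$,    $0 \le r < 1$,  $\theta \in \mathbb{R}$,

(4)    $u(1, \theta) = g(\theta)$,    $\theta \in \mathbb{R}$,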

where u and g are periodic in $\theta$. We proceed formally. Suppose g has Fourier
coefficients $(b_n)_{-\infty}^{\infty}$ and $u_r(\theta) = u(r, \theta)$ has Fourier coefficients

    $(a_n(r))_{-\infty}^{\infty}$,    $0 < r < 1$.

Then (3) leads to the equation

(5)    $r^2 D^2 a_n(r) + r D a_n(r) - n^2 a_n(r) = 0$,    $0 < r < 1$.

From (4) we get

(6)    $a_n(1) = b_n$,

and since $u_0(\theta)$ is constant we want

(7)    $a_n(0) = 0$,    $n \ne 0$.

We look for a solution $a_n(r)$ of the form $b_n r^c$, where c = c(n) is a constant,
for each n. This will be a solution if and only if

    $c(c - 1) + c - n^2 = 0$,

or $c^2 = n^2$. Then (7) gives $c = |n|$. We are led to the formal solution

    $u(r, \theta) = \sum_{-\infty}^{\infty} b_n r^{|n|} \exp(in\theta)$.

Formally, this should be the convolution of g with the distribution whose
Fourier coefficients are

    $(r^{|n|})_{-\infty}^{\infty}$.

For r < 1 these are the Fourier coefficients of a function $P_r \in \mathcal{P}$. In fact,

(8)    $P_r(\theta) = \sum_{-\infty}^{\infty} r^{|n|} e^{in\theta} = 1 + \sum_{1}^{\infty} (re^{i\theta})^n + \sum_{1}^{\infty} (re^{-i\theta})^n$
       $= (1 - r^2)|1 - re^{i\theta}|^{-2} = (1 - r^2)(1 - 2r\cos\theta + r^2)^{-1}$.


Note that $P_r(\theta)$ is an approximate identity as $r \to 1$: the first expression on
the right in (8) shows

    $\frac{1}{2\pi}\int_0^{2\pi} P_r(\theta)\,d\theta = 1$

and the last expression shows that $P_r(\theta) \ge 0$ and

    $\lim_{r \to 1} \int_{\delta}^{2\pi - \delta} P_r(\theta)\,d\theta = 0$

for each $0 < \delta < \pi$. The function

    $P(r, \theta) = P_r(\theta)$

is called the Poisson kernel for the Dirichlet problem in the unit disc.
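The two ends of the chain of equalities in (8), and the normalization of the Poisson kernel, can be confirmed numerically. The sketch below is an illustration only; the truncation level, the sample values r = 0.6 and $\theta = 1.1$, and the function names are arbitrary choices.

```python
import cmath
import math

def poisson_series(r, theta, terms=200):
    """Partial sum of sum_n r^{|n|} exp(i n theta) over |n| <= terms."""
    return sum(r ** abs(n) * cmath.exp(1j * n * theta)
               for n in range(-terms, terms + 1))

def poisson_closed(r, theta):
    """(1 - r^2) / (1 - 2 r cos(theta) + r^2), the last expression in (8)."""
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(theta) + r * r)

r, theta = 0.6, 1.1
gap = abs(poisson_series(r, theta) - poisson_closed(r, theta))

# Mean value (1/2pi) * integral of P_r over one period, by an equally spaced sum.
samples = 256
mean = sum(poisson_closed(r, 2 * math.pi * k / samples) for k in range(samples)) / samples
print(gap, mean)
```

For r closer to 1 the series converges more slowly and the kernel concentrates near $\theta = 0$, which is the approximate-identity behaviour described above.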


Theorem 6.1. Suppose F is a periodic distribution. There is a unique
function $u(r, \theta)$ defined and smooth in the open unit disc, satisfying (3), and
such that the distributions defined by the functions $u_r(\theta) = u(r, \theta)$ converge
to F in the sense of $\mathcal{P}'$ as $r \to 1$. Moreover, if $F = F_g$ where $g \in \mathcal{C}$, then $u_r \to g$
uniformly as $r \to 1$. The functions $u_r$ are given by convolution with the Poisson
kernel:

    $u_r = F * P_r$.

Proof. Uniqueness was proved in the derivation above. Let $u_r = F * P_r$.
Since $P_r$ is an approximate identity, we do have $u_r \to F$ $(\mathcal{P}')$ as $r \to 1$, and
$u_r \to g$ uniformly if $F = F_g$, $g \in \mathcal{C}$. We must show that u is smooth and
satisfies (3). Note that when $F = F_g$, $g \in \mathcal{C}$, then explicitly

    $u(r, \theta) = \frac{1}{2\pi}\int_0^{2\pi} P(r, \theta - \varphi) g(\varphi)\,d\varphi$

and we may differentiate under the integral sign to prove that u is smooth.
Moreover, since $P(r, \theta)$ satisfies (3), so does u.
Finally, suppose u has merely a distribution F as its value on the boundary.
Note that if $0 \le r, s < 1$ then (by computing Fourier coefficients, for example)

    $P_r * P_s = P_{rs}$.

In particular, choose any R > 0, R < 1. It suffices to show that u is smooth
in the disc r < R and satisfies (3) there. But when r < R,

    $r = sR$,    $s = rR^{-1} < 1$,

and

    $u_r = F * P_r = F * (P_R * P_s) = (F * P_R) * P_s = u_R * P_s$.

Since $P_R \in \mathcal{P}$, $u_R$ is a smooth function of $\theta$. Then

    $u(r, \theta) = \frac{1}{2\pi}\int_0^{2\pi} P(rR^{-1}, \theta - \varphi) u_R(\varphi)\,d\varphi$,

$0 \le r < R$. Again, differentiation under the integral sign shows that u is
smooth and satisfies (3).    □

The preceding theorem leads to the remarkable result that a real-valued


harmonic function is (locally) the real part of a function defined by a convergent power series in z = x + iy.
Theorem 6.2. Suppose u is a harmonic real-valued function of class C2
defined in an open subset of 1R2 containing the point (xo, Yo). Then there is a
function f defined by a convergent power series:

Lo an(z 00

fez) =

zo)n,

Iz - zol

< e,

such that
u(x, y)

= Ref(x + iy)

Zo

= Xo + iyo,

Laplace's equation and the Dirichlet problem

153

when

Ix + iy - zol

<

e.

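Proof. Suppose first that $(x_0, y_0) = (0, 0)$ and that the set in which $u$ is defined contains the closed disc $x^2 + y^2 \le 1$. Let
$$g(\theta) = u(\cos\theta, \sin\theta), \qquad \theta \in \mathbb{R}.$$
Then $u$ is the unique solution of the Dirichlet problem in the unit disc with $g$ as value on the boundary. If $(b_n)_{-\infty}^{\infty}$ are the Fourier coefficients of $g$, then we know that in polar coordinates $u$ is given by
$$u(r, \theta) = \sum_{n=-\infty}^{\infty} r^{|n|} b_n \exp(in\theta).$$
Since $u$ is real, $g$ is real. Therefore $b_n = b_{-n}^*$ and the series is
$$b_0 + 2 \operatorname{Re} \sum_{n=1}^{\infty} b_n r^n \exp(in\theta).$$
Let $f$ be defined by
$$f(z) = \sum_{n=0}^{\infty} a_n z^n,$$
where
$$a_0 = b_0, \qquad a_n = 2 b_n, \quad n > 0.$$
Then
$$u(x, y) = \operatorname{Re}\Big(\sum_{n=0}^{\infty} a_n (re^{i\theta})^n\Big) = \operatorname{Re}(f(re^{i\theta})) = \operatorname{Re}(f(x + iy)), \qquad x^2 + y^2 < 1.$$
In the general case, assume that $u$ is defined on a set containing the closed disc of radius $\varepsilon$ centered at $(x_0, y_0)$, and let
$$u_1(x, y) = u(x_0 + \varepsilon x, y_0 + \varepsilon y).$$
Then $u_1$ is harmonic in a set containing the closed unit disc, so $u_1 = \operatorname{Re} f_1$ with $f_1$ defined by a power series in the unit disc. Then
$$u = \operatorname{Re} f,$$
where
$$f(z) = f_1(\varepsilon^{-1}(z - z_0))$$
is defined by a power series in the disc of radius $\varepsilon$ around $z_0 = x_0 + i y_0$. □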
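The construction in this proof can be tried numerically. The following Python sketch is an illustration only, not part of the text: it assumes the sample harmonic function u(x, y) = e^x cos y (chosen because it is the real part of e^z), approximates the Fourier coefficients b_n of its boundary values by an equally spaced Riemann sum, sets a_0 = b_0 and a_n = 2b_n, and compares Re f with u at an interior point. The helper names, the truncation order, and the number of sample points are hypothetical choices.

```python
import cmath
import math

def u(x, y):
    # Sample harmonic function (an assumption for this sketch): u = Re(exp(x + iy)).
    return math.exp(x) * math.cos(y)

def fourier_coefficient(n, samples=256):
    # b_n = (1/2pi) * integral over [0, 2pi] of g(theta) exp(-i n theta),
    # with g(theta) = u(cos theta, sin theta), by an equally spaced Riemann sum.
    total = 0j
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        total += u(math.cos(theta), math.sin(theta)) * cmath.exp(-1j * n * theta)
    return total / samples

def power_series_coefficients(order=20):
    # a_0 = b_0 and a_n = 2 b_n for n > 0, as in the proof.
    return [fourier_coefficient(0)] + [2 * fourier_coefficient(n) for n in range(1, order + 1)]

def f(z, coeffs):
    # Truncated power series sum of a_n z^n.
    return sum(a * z ** n for n, a in enumerate(coeffs))

coeffs = power_series_coefficients()
z = 0.3 + 0.4j
print(f(z, coeffs).real, u(z.real, z.imag))
```

For this particular u the computed a_n should be close to 1/n!, the Taylor coefficients of e^z, which gives an independent check on the recipe.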


Exercises

1. Prove the converse of Theorem 6.2: if
$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n, \qquad |z - z_0| < R,$$
and we let
$$u(x, y) = \operatorname{Re} f(x + iy), \qquad |x + iy - z_0| < R,$$
then $u$ is harmonic.
2. There is a maximum principle for harmonic functions analogous to the maximum principle for solutions of the heat equation discussed in §4.
(a) Show that if $u$ is of class $C^2$ on an open set $A$ in $\mathbb{R}^2$ and
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} > 0$$
at each point of $A$, then $u$ does not have a local maximum at any point of $A$.
(b) Suppose $u$ is of class $C^2$ and harmonic in an open disc in $\mathbb{R}^2$ and continuous on the closure of this disc. Show, by considering the functions
$$u_\varepsilon(x, y) = u(x, y) + \varepsilon x^2 + \varepsilon y^2,$$
that $u$ attains its maximum on the boundary of the disc.


3. Use the result of Exercise 2 to give a second proof of the uniqueness
of the solution of the Dirichlet problem for a continuous boundary function g.
4. Suppose $u$ is continuous on the closed disc $x^2 + y^2 \le R$ and harmonic in the open disc $x^2 + y^2 < R$. Give a formula for $u(x, y)$ (or $u(r, \theta)$) for $x^2 + y^2 < R$ in terms of the values of $u$ for $x^2 + y^2 = R$. Give formulas for the derivatives $\partial u/\partial r$ and $\partial u/\partial\theta$ also.
5. Suppose $u$ is defined on all of $\mathbb{R}^2$ and is harmonic. Use the result of Exercise 4 to show that if $u$ is bounded, then it is constant.

Chapter 6

Complex Analysis
1. Complex differentiation
Suppose $\Omega$ is an open subset of the complex plane $\mathbb{C}$. Recall that this means that for each $z_0 \in \Omega$ there is a $\delta > 0$ so that $\Omega$ contains the disc of radius $\delta$ around $z_0$:
$$z \in \Omega \quad \text{if } |z - z_0| < \delta.$$
A function $f: \Omega \to \mathbb{C}$ is said to be differentiable at $z \in \Omega$ if the limit
$$\lim_{w \to z} \frac{f(w) - f(z)}{w - z}$$
exists. If so, the limit is called the derivative of $f$ at $z$ and denoted $f'(z)$.
These definitions are formally the same as those given for functions defined on open subsets of $\mathbb{R}$, and the proofs of the three propositions below are also identical to the proofs for functions of a real variable.

Proposition 1.1. If $f: \Omega \to \mathbb{C}$ is differentiable at $z \in \Omega$, then it is continuous at $z$.

Proposition 1.2. Suppose $f: \Omega \to \mathbb{C}$, $g: \Omega \to \mathbb{C}$ and $a \in \mathbb{C}$. If $f$ and $g$ are differentiable at $z \in \Omega$, then so are the functions $af$, $f + g$, and $fg$:
$$(af)'(z) = a f'(z),$$
$$(f + g)'(z) = f'(z) + g'(z),$$
$$(fg)'(z) = f'(z) g(z) + f(z) g'(z).$$
If also $g(z) \ne 0$, then $f/g$ is differentiable at $z$ and
$$(f/g)'(z) = [f'(z) g(z) - f(z) g'(z)]\, g(z)^{-2}.$$

Proposition 1.3 (Chain rule). Suppose $f$ is differentiable at $z \in \mathbb{C}$ and $g$ is differentiable at $f(z)$. Then the composite function $g \circ f$ is differentiable at $z$ and
$$(g \circ f)'(z) = g'(f(z)) f'(z).$$

The proof of the following theorem is also identical to the proof of the
corresponding Theorem 4.4 of Chapter 2.
Theorem 1.4. Suppose $f$ is defined by a convergent power series:
$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n, \qquad |z - z_0| < R.$$
Then $f$ is differentiable at each point $z$ with $|z - z_0| < R$, and
$$f'(z) = \sum_{n=1}^{\infty} n a_n (z - z_0)^{n-1}.$$

In particular, the exponential and the sine and cosine functions are
differentiable as functions defined on C, and (exp z)' = exp z, (cos z)' =
- sin z, (sin z)' = cos z.
A remarkable fact about complex differentiation is that a converse of Theorem 1.4 is true: if $f$ is defined in the disc $|z - z_0| < R$ and differentiable at each point of this disc, then $f$ can be expressed as the sum of a power series which converges in the disc. We shall sketch one proof of this fact in the Exercises at the end of this section, and give a second proof in §3 and a third in §7 (under the additional hypothesis that the derivative is continuous). Here we want to give some indication why the hypothesis of differentiability is so much more powerful in the complex case than in the real case. Consider the function
$$f(z) = z^*, \quad \text{or} \quad f(x + iy) = x - iy.$$
Take $t \in \mathbb{R}$, $t \ne 0$. Then
$$t^{-1}[f(z + t) - f(z)] = 1,$$
$$(it)^{-1}[f(z + it) - f(z)] = -1,$$
so the ratio
$$(w - z)^{-1}[f(w) - f(z)]$$
depends on the direction of the line through $w$ and $z$, even in the limit as $w \to z$. Therefore this function $f$ is not differentiable at any point.
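The direction dependence of the difference quotient is easy to observe numerically. The Python sketch below is illustrative only (the function names and the step size are assumptions, not part of the text): it evaluates the quotient for the conjugation map with a real increment and with a purely imaginary increment, and does the same for the squaring map, whose complex derivative 2z exists.

```python
def difference_quotient(f, z, h):
    # (f(z + h) - f(z)) / h for a complex increment h.
    return (f(z + h) - f(z)) / h

def conj(z):
    # The map z -> z* from the text.
    return z.conjugate()

def square(z):
    return z * z

z = 1.0 + 2.0j
t = 1e-6
conj_real = difference_quotient(conj, z, t)          # increment t
conj_imag = difference_quotient(conj, z, 1j * t)     # increment it
square_real = difference_quotient(square, z, t)
square_imag = difference_quotient(square, z, 1j * t)
print(conj_real, conj_imag)
print(square_real, square_imag)
```

For conjugation the two quotients stay near 1 and -1 however small t is taken, while for the squaring map both approach 2z.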
Given $f: \Omega \to \mathbb{C}$, define functions $u$, $v$ by
$$u(x, y) = \operatorname{Re} f(x + iy) = \tfrac{1}{2} f(x + iy) + \tfrac{1}{2} (f(x + iy))^*,$$
$$v(x, y) = \operatorname{Im} f(x + iy) = \tfrac{i}{2} (f(x + iy))^* - \tfrac{i}{2} f(x + iy).$$
Thus
$$f(x + iy) = u(x, y) + i v(x, y).$$
We shall speak of $u$ and $v$ as the real and imaginary parts of $f$ and write
$$f = u + iv.$$
(This is slightly incorrect, since $f$ is being considered as a function of $z \in \Omega \subset \mathbb{C}$, while $u$ and $v$ are considered as functions of two real variables $x$, $y$.)

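Theorem 1.5. Suppose $\Omega \subset \mathbb{C}$ is open, $f: \Omega \to \mathbb{C}$, $f = u + iv$. Then if $f$ is differentiable at $z = x + iy \in \Omega$, the partial derivatives
$$\frac{\partial u}{\partial x}, \quad \frac{\partial u}{\partial y}, \quad \frac{\partial v}{\partial x}, \quad \frac{\partial v}{\partial y} \tag{1}$$
all exist at $(x, y)$ and satisfy
$$\frac{\partial}{\partial x} u(x, y) = \frac{\partial}{\partial y} v(x, y), \tag{2}$$
$$\frac{\partial}{\partial y} u(x, y) = -\frac{\partial}{\partial x} v(x, y). \tag{3}$$
Conversely, suppose that the partial derivatives (1) all exist and are continuous in an open set containing $(x, y)$ and satisfy (2), (3) at $(x, y)$. Then $f$ is differentiable at $z = x + iy$.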

The equations (2) and (3) are called the Cauchy-Riemann equations. They provide a precise analytical version of the requirement that the limit defining $f'(z)$ be independent of the direction of approach. Note that in the example $f(z) = z^*$ we have
$$\frac{\partial u}{\partial x} = 1, \qquad \frac{\partial u}{\partial y} = \frac{\partial v}{\partial x} = 0, \qquad \frac{\partial v}{\partial y} = -1.$$
Proof. Suppose $f$ is differentiable at $z = x + iy$. Then
$$\lim_{t \to 0} t^{-1}[f(z + t) - f(z)] = f'(z) = \lim_{t \to 0} (it)^{-1}[f(z + it) - f(z)]. \tag{4}$$
The left side of (4) is clearly
$$\frac{\partial}{\partial x} u(x, y) + i \frac{\partial}{\partial x} v(x, y),$$
while the right side is
$$-i \frac{\partial}{\partial y} u(x, y) + \frac{\partial}{\partial y} v(x, y).$$
Equating the real and imaginary parts of these two expressions, we get (2) and (3).
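Conversely, suppose the first partial derivatives of $u$ and $v$ exist and are continuous near $(x, y)$, and suppose (2) and (3) are true. Let
$$h = a + ib,$$
where $a$ and $b$ are real and near zero, $h \ne 0$. We apply the Mean Value Theorem to $u$ and $v$ to get
$$f(z + h) - f(z) = [f(z + a + ib) - f(z + a)] + [f(z + a) - f(z)]$$
$$= \frac{\partial u}{\partial y}(x + a, y + t_1 b)\, b + i \frac{\partial v}{\partial y}(x + a, y + t_2 b)\, b + \frac{\partial u}{\partial x}(x + t_3 a, y)\, a + i \frac{\partial v}{\partial x}(x + t_4 a, y)\, a,$$
where $0 < t_j < 1$, $j = 1, 2, 3, 4$. Because of (2) and (3),
$$\Big[\frac{\partial u}{\partial x}(x, y) + i \frac{\partial v}{\partial x}(x, y)\Big] h = \frac{\partial u}{\partial y}(x, y)\, b + i \frac{\partial v}{\partial y}(x, y)\, b + \frac{\partial u}{\partial x}(x, y)\, a + i \frac{\partial v}{\partial x}(x, y)\, a.$$
Therefore
$$h^{-1}[f(z + h) - f(z)] - \Big[\frac{\partial u}{\partial x}(x, y) + i \frac{\partial v}{\partial x}(x, y)\Big]$$
is a sum of four terms, each of modulus at most that of a difference such as
$$\frac{\partial u}{\partial y}(x + a, y + t_1 b) - \frac{\partial u}{\partial y}(x, y),$$
since $|a h^{-1}| \le 1$ and $|b h^{-1}| \le 1$. Since the partial derivatives were assumed continuous, these terms $\to 0$ as $h \to 0$. □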

A function $f: \Omega \to \mathbb{C}$, $\Omega$ open in $\mathbb{C}$, is said to be holomorphic in $\Omega$ if it is differentiable at each point of $\Omega$ and the derivative $f'$ is a continuous function. (Actually, the derivative is necessarily continuous if it exists at each point; later we shall indicate how this may be proved.) Theorem 1.5 has the following immediate consequence.

Corollary 1.6. $f: \Omega \to \mathbb{C}$ is holomorphic in $\Omega$ if and only if its real and imaginary parts, $u$ and $v$, are of class $C^1$ and satisfy the Cauchy-Riemann equations (2) and (3) at each point $(x, y)$ such that $x + iy \in \Omega$.

Locally, at least, a holomorphic function can be integrated.

Corollary 1.7. Suppose $g$ is holomorphic in a disc $|z - z_0| < R$. Then there is a function $f$, holomorphic for $|z - z_0| < R$, such that $f' = g$.
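As a rough numerical companion to Corollary 1.6, the following Python sketch estimates the four partial derivatives by central differences and measures how far the Cauchy-Riemann equations are from holding at a point. It is only an illustration under stated assumptions: the step size, the sample point, and the helper names are hypothetical, and a small residual at one point is evidence, not a proof, of holomorphy.

```python
import cmath

def partials(f, x, y, h=1e-6):
    # Central-difference approximations to u_x, u_y, v_x, v_y at (x, y),
    # where u = Re f and v = Im f.
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)
    return fx.real, fy.real, fx.imag, fy.imag

def cauchy_riemann_residual(f, x, y):
    ux, uy, vx, vy = partials(f, x, y)
    # Equations (2) and (3): u_x = v_y and u_y = -v_x.
    return abs(ux - vy) + abs(uy + vx)

residual_exp = cauchy_riemann_residual(cmath.exp, 0.5, -1.0)
residual_conj = cauchy_riemann_residual(lambda z: z.conjugate(), 0.5, -1.0)
print(residual_exp, residual_conj)
```

The residual for the exponential function is at the level of rounding error, while for conjugation it is close to 2, in agreement with the partial derivatives 1, 0, 0, -1 of that example.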

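Proof. Let $u$, $v$ be the real and imaginary parts of $g$. We want to determine real functions $q$, $r$ such that
$$f = q + ir$$
has derivative $g$. Because of Theorem 1.5 we can see that this will be true if and only if
$$\frac{\partial q}{\partial x} = \frac{\partial r}{\partial y} = u, \qquad \frac{\partial q}{\partial y} = -\frac{\partial r}{\partial x} = -v.$$
The condition that $u$, $-v$ be the partial derivatives of a function $q$ is (by Exercises 1 and 2 of §7, Chapter 2)
$$\frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}.$$
The condition that $v$ and $u$ be the partial derivatives of a function $r$ is
$$\frac{\partial v}{\partial y} = \frac{\partial u}{\partial x}.$$
Both conditions hold, since they are the Cauchy-Riemann equations for $g$. Thus there are functions $q$, $r$ with the desired properties. □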

Exercises
1. Let $f(x + iy) = x^2 + y^2$. Show that $f$ is differentiable only at $z = 0$.
2. Suppose $f: \mathbb{C} \to \mathbb{R}$. Show that $f$ is differentiable at every point if and only if $f$ is constant.
3. Let $f(0) = 0$ and

$x + iy \ne 0$.
Show that the first partial derivatives of $f$ exist at each point and are both zero at $x = y = 0$. Show that $f$ is not differentiable (in fact not continuous) at $z = 0$. Why does this not contradict Theorem 1.5?
4. Suppose $f$ is holomorphic in $\Omega$ and suppose the real and imaginary parts $u$, $v$ are of class $C^2$ in $\Omega$. Show that $u$ and $v$ are harmonic.
5. Suppose $f$ is as in Exercise 4, and suppose the disc $|z - z_0| \le R$ is contained in $\Omega$. Use Exercise 4 together with Theorem 6.2 of Chapter 5 to show that there is a power series $\sum a_n (z - z_0)^n$ converging to $f(z)$ for $|z - z_0| < R$.
6. Suppose $g$ is holomorphic in $\Omega$, and suppose $\Omega$ contains the disc $|z - z_0| \le R$. Let $f$ be such that $f'(z) = g(z)$ for $|z - z_0| \le R$ (using Corollary 1.7). Show that the real and imaginary parts of $f$ are of class $C^2$.
7. Use the results of Exercises 5 and 6, together with Theorem 1.4, to prove the following theorem.
If $g$ is holomorphic in $\Omega$ and $z_0 \in \Omega$, then there is a power series such that
$$g(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n, \qquad |z - z_0| < R,$$
for any $R$ such that $\Omega$ contains the disc of radius $R$ with center $z_0$.

2. Complex integration
Suppose $\Omega \subset \mathbb{C}$ is open. A curve in $\Omega$ is, by definition, a continuous function $\gamma$ from a closed interval $[a, b] \subset \mathbb{R}$ into $\Omega$. The curve $\gamma$ is said to be smooth if it is a function of class $C^1$ on the open interval $(a, b)$ and if the one-sided derivatives exist at the endpoints:
$(t - a)^{-1}[\gamma(t) - \gamma(a)]$ converges as $t \to a$, $t > a$;
$(t - b)^{-1}[\gamma(t) - \gamma(b)]$ converges as $t \to b$, $t < b$.

The curve $\gamma$ is said to be piecewise smooth if there are points $a_0, a_1, \ldots, a_r$ with
$$a = a_0 < a_1 < \cdots < a_r = b$$
such that the restriction of $\gamma$ to $[a_{j-1}, a_j]$ is a smooth curve, $1 \le j \le r$. An example is
$$\gamma(t) = z_0 + \varepsilon \exp(it), \qquad t \in [0, 2\pi];$$
then the image
$$\{\gamma(t) \mid t \in [0, 2\pi]\}$$
is the circle of radius $\varepsilon$ around $z_0$. This is a smooth curve. A second example is
$$\gamma(t) = t, \qquad t \in [0, 1],$$
$$\gamma(t) = 1 + i(t - 1), \qquad t \in (1, 2],$$
$$\gamma(t) = 1 + i - (t - 2), \qquad t \in (2, 3],$$
$$\gamma(t) = i - (t - 3)i, \qquad t \in (3, 4].$$
Here $\gamma$ is piecewise smooth and the image is a unit square.


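Suppose $\gamma: [a, b] \to \Omega$ is a curve and $f: \Omega \to \mathbb{C}$. The integral of $f$ over $\gamma$,
$$\int_\gamma f,$$
is defined to be the limit, as the mesh of the partition $P = (t_0, t_1, \ldots, t_n)$ of $[a, b]$ goes to zero, of
$$\sum_{j=1}^{n} f(\gamma(t_j))[\gamma(t_j) - \gamma(t_{j-1})].$$

Proposition 2.1. If $\gamma$ is a piecewise smooth curve in $\Omega$ and $f: \Omega \to \mathbb{C}$ is continuous, then the integral of $f$ over $\gamma$ exists and
$$\int_\gamma f = \int_a^b f(\gamma(t)) \gamma'(t)\, dt. \tag{1}$$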

Proof. The integral on the right exists, since the integrand is bounded and is continuous except possibly at finitely many points of $[a, b]$. To prove that (1) holds we assume first that $\gamma$ is smooth. Let $P = (t_0, t_1, \ldots, t_n)$ be a partition of $[a, b]$. Then
$$\sum f(\gamma(t_j))[\gamma(t_j) - \gamma(t_{j-1})] = \sum f(\gamma(t_j)) \gamma'(t_j)[t_j - t_{j-1}] + R, \tag{2}$$
where
$$R = \sum f(\gamma(t_j))[\gamma(t_j) - \gamma(t_{j-1}) - \gamma'(t_j)(t_j - t_{j-1})].$$
Applying the Mean Value Theorem to the real and imaginary parts of $\gamma$ on $[t_{j-1}, t_j]$ and using the continuity of $\gamma'$, we see that
$$R \to 0 \quad \text{as the mesh } |P| \to 0.$$
On the other hand, the sum on the right side of (2) is a Riemann sum for the integral on the right side of (1). Thus we have shown that the limit exists and (1) is true.
If $\gamma$ is only piecewise smooth, then the argument above breaks down on intervals containing points of discontinuity of $\gamma'$. However, the total contribution to the sums in (2) from such intervals is easily seen to be bounded in modulus by a constant times the mesh of $P$. Thus again the limit exists as $|P| \to 0$ and (1) is true. □
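Proposition 2.1 can be illustrated by computing both sides numerically. The Python sketch below is a hedged example, not part of the text: it assumes the unit circle as the curve and the continuous (non-holomorphic) function f(z) = z*, forms the defining sums over a uniform partition, and compares them with a midpoint-rule value of the integral in (1). All function names and the number of subdivisions are hypothetical.

```python
import cmath
import math

def riemann_sum_integral(f, gamma, a, b, n=20000):
    # Sum of f(gamma(t_j)) * (gamma(t_j) - gamma(t_{j-1})) over a uniform partition.
    total = 0j
    prev = gamma(a)
    for j in range(1, n + 1):
        cur = gamma(a + (b - a) * j / n)
        total += f(cur) * (cur - prev)
        prev = cur
    return total

def parametrized_integral(f, gamma, dgamma, a, b, n=20000):
    # Midpoint rule for the integral of f(gamma(t)) * gamma'(t) over [a, b].
    h = (b - a) / n
    return sum(f(gamma(a + (j + 0.5) * h)) * dgamma(a + (j + 0.5) * h) for j in range(n)) * h

def circle(t):
    return cmath.exp(1j * t)

def dcircle(t):
    return 1j * cmath.exp(1j * t)

def f(z):
    # Continuous but not holomorphic; its integral over the unit circle is 2*pi*i.
    return z.conjugate()

from_sums = riemann_sum_integral(f, circle, 0.0, 2 * math.pi)
from_formula = parametrized_integral(f, circle, dcircle, 0.0, 2 * math.pi)
print(from_sums, from_formula)
```

The sums converge only at a rate proportional to the mesh, so the first value agrees with 2πi to about three decimal places here, while the second is accurate to rounding error.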

Note that the integral $\int_\gamma f$ depends not only on the set of points
$$\{\gamma(t) \mid t \in [a, b]\}$$
but also on the "sense," or ordering, of them. For example, if
$$\gamma_1(t) = \exp(it), \qquad \gamma_2(t) = \exp(-it), \qquad t \in [0, 2\pi],$$
then the point sets are the same but
$$\int_{\gamma_2} f = -\int_{\gamma_1} f.$$
Furthermore, it matters how many times the point set is traced out by $\gamma$: if
$$\gamma_3(t) = \exp(int), \qquad t \in [0, 2\pi],$$
then
$$\int_{\gamma_3} f = n \int_{\gamma_1} f.$$
A curve $\gamma: [a, b] \to \Omega$ is said to be constant if $\gamma$ is a constant function. If so, then
$$\int_\gamma f = 0, \qquad \text{all } f.$$

A curve $\gamma: [a, b] \to \Omega$ is said to be closed if $\gamma(a) = \gamma(b)$. (All the examples given so far have been examples of closed curves.) Two closed curves $\gamma_0, \gamma_1: [a, b] \to \Omega$ are said to be homotopic in $\Omega$ if there is a continuous function
$$\Gamma: [a, b] \times [0, 1] \to \Omega$$
such that
$$\Gamma(t, 0) = \gamma_0(t), \qquad \Gamma(t, 1) = \gamma_1(t), \qquad \text{all } t \in [a, b],$$
$$\Gamma(a, s) = \Gamma(b, s), \qquad \text{all } s \in [0, 1].$$
The function $\Gamma$ is called a homotopy from $\gamma_0$ to $\gamma_1$. If $\Gamma$ is such a homotopy, let
$$\gamma_s(t) = \Gamma(t, s), \qquad s \in (0, 1).$$
Then each $\gamma_s$ is a closed curve, and we think of these as being a family of curves varying continuously from $\gamma_0$ to $\gamma_1$ within $\Omega$.

Theorem 2.2 (Cauchy's Theorem). Suppose $\Omega \subset \mathbb{C}$ is open and suppose $f$ is holomorphic in $\Omega$. Suppose $\gamma_0$ and $\gamma_1$ are two piecewise smooth closed curves in $\Omega$ which are homotopic in $\Omega$. Then
$$\int_{\gamma_0} f = \int_{\gamma_1} f.$$
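The content of Cauchy's Theorem, that the integral of a holomorphic function is the same over homotopic closed curves, can be checked on a concrete family. The Python sketch below is an illustration under assumptions of our own choosing: the region is the plane with 0 removed, f(z) = 1/z, and the hypothetical homotopy deforms the unit circle into an ellipse without passing through 0. Each integral is approximated by the midpoint rule applied to the integral of f(γ_s(t)) γ_s'(t).

```python
import math

def homotopy(t, s):
    # Hypothetical homotopy in the plane with 0 removed: the unit circle (s = 0)
    # is deformed into the ellipse 3 cos t + 2i sin t (s = 1).
    return (1 + 2 * s) * math.cos(t) + 1j * (1 + s) * math.sin(t)

def homotopy_dt(t, s):
    # Derivative of the curve gamma_s with respect to t.
    return -(1 + 2 * s) * math.sin(t) + 1j * (1 + s) * math.cos(t)

def integral_over_gamma_s(f, s, n=4000):
    # Midpoint rule for the integral of f(gamma_s(t)) * gamma_s'(t) over [0, 2*pi].
    h = 2 * math.pi / n
    return sum(f(homotopy((j + 0.5) * h, s)) * homotopy_dt((j + 0.5) * h, s)
               for j in range(n)) * h

def reciprocal(z):
    return 1 / z

values = [integral_over_gamma_s(reciprocal, s) for s in (0.0, 0.25, 0.5, 1.0)]
print(values)
```

All four values agree with 2πi to rounding error, even though the curves differ. The value is not 0 because none of these curves is homotopic to a constant curve once the origin is removed.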


The importance of this theorem can scarcely be overestimated. We shall first cite a special case of the theorem as a Corollary, and prove the special case.

Corollary 2.3. Suppose $\Omega$ is either a disc
$$\{z \mid |z - z_0| < R\}$$
or a rectangle. Suppose $f$ is holomorphic in $\Omega$ and $\gamma$ is any piecewise smooth closed curve in $\Omega$. Then
$$\int_\gamma f = 0.$$
Proof. It is easy to see that in this case $\gamma$ is homotopic to a constant curve $\gamma_0$, so that the conclusion follows from Cauchy's Theorem. However, let us give a different proof. By Corollary 1.7, or by the analogous result for a rectangle in place of a disc, there is a function $h$, holomorphic in $\Omega$, such that $h' = f$. But then
$$\int_\gamma f = \int_a^b f(\gamma(t)) \gamma'(t)\, dt = \int_a^b h'(\gamma(t)) \gamma'(t)\, dt = \int_a^b [h \circ \gamma]'(t)\, dt = h(\gamma(b)) - h(\gamma(a)) = 0,$$
since $\gamma(a) = \gamma(b)$. □

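Proof of Cauchy's Theorem. Let $\Gamma$ be a homotopy from $\gamma_0$ to $\gamma_1$ and let
$$\gamma_s(t) = \Gamma(t, s), \qquad t \in [a, b], \quad s \in (0, 1).$$
Assume for the moment that each curve $\gamma_s$ is piecewise smooth. We would like to show that the integral of $f$ over $\gamma_s$ is independent of $s$, $0 \le s \le 1$.
Assume first that $\Gamma$ is of class $C^2$ on the square $[a, b] \times (0, 1)$, that the first partial derivatives are uniformly bounded, and that
$$\gamma_s' \to \gamma_0' \quad \text{as } s \to 0,$$
$$\gamma_s' \to \gamma_1' \quad \text{as } s \to 1$$
uniformly on each interval $(c, d) \subset [a, b]$ on which $\gamma_0'$ or $\gamma_1'$, respectively, is continuous. These assumptions clearly imply that
$$F(s) = \int_{\gamma_s} f$$
is a continuous function of $s$, $s \in [0, 1]$. Furthermore under these assumptions we may apply results of §7 of Chapter 2 and differentiate under the integral sign when $s \in (0, 1)$ to get
$$F'(s) = \int_a^b \frac{\partial}{\partial s}\Big[f(\Gamma(t, s)) \frac{\partial}{\partial t}\Gamma(t, s)\Big]\, dt$$
$$= \int_a^b f'(\Gamma(t, s)) \frac{\partial}{\partial s}\Gamma(t, s) \frac{\partial}{\partial t}\Gamma(t, s)\, dt + \int_a^b f(\Gamma(t, s)) \frac{\partial^2}{\partial s\, \partial t}\Gamma(t, s)\, dt$$
$$= \int_a^b \frac{\partial}{\partial t}\Big[f(\Gamma(t, s)) \frac{\partial}{\partial s}\Gamma(t, s)\Big]\, dt$$
$$= f(\Gamma(b, s)) \frac{\partial}{\partial s}\Gamma(b, s) - f(\Gamma(a, s)) \frac{\partial}{\partial s}\Gamma(a, s) = 0$$
(since $\Gamma(b, s) = \Gamma(a, s)$). Thus $F(0) = F(1)$.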
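Finally, do not assume $\Gamma$ is differentiable. We may extend $\Gamma$ in a unique way so as to be periodic in the first variable with period $b - a$, and even and periodic in the second variable with period 2; then $\Gamma: \mathbb{R} \times \mathbb{R} \to \Omega$. Let $\varphi: \mathbb{R} \to \mathbb{R}$ be a smooth function such that
$$\varphi(x) \ge 0, \quad \text{all } x, \qquad \int \varphi(x)\, dx = 1, \qquad \varphi(x) = 0 \quad \text{if } |x| \ge 1.$$
Let $\varphi_n(x) = n\varphi(nx)$, $n = 1, 2, 3, \ldots$. Then
$$\int \varphi_n(x)\, dx = 1, \qquad \varphi_n(x) = 0 \quad \text{if } |x| \ge n^{-1}.$$
Let
$$\Gamma_n(t, s) = \iint \Gamma(t - x, s - y)\, \varphi_n(x) \varphi_n(y)\, dx\, dy, \qquad (t, s) \in \mathbb{R} \times \mathbb{R}.$$
Then (see the arguments in §3 of Chapter 3) $\Gamma_n$ is also periodic with the same periods as $\Gamma$, and $\Gamma_n \to \Gamma$ uniformly as $n \to \infty$. It follows from this (and the fact that $\Gamma(\mathbb{R} \times \mathbb{R})$ is a compact subset of $\Omega$) that $\Gamma_n(\mathbb{R} \times \mathbb{R}) \subset \Omega$ if $n$ is sufficiently large. Furthermore, the argument above shows that
$$\int_{\gamma_{0,n}} f = \int_{\gamma_{1,n}} f,$$
where
$$\gamma_{s,n}(t) = \Gamma_n(t, s), \qquad t \in [a, b], \quad s \in [0, 1].$$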


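All we need do to complete the proof is show that
$$\int_{\gamma_{s,n}} f \to \int_{\gamma_s} f \quad \text{as } n \to \infty, \qquad s = 0 \text{ or } 1. \tag{3}$$
Note that we may choose the homotopy $\Gamma$ so that $\gamma_s = \gamma_0$ for $0 \le s \le \tfrac{1}{3}$ and $\gamma_s = \gamma_1$ for $\tfrac{2}{3} \le s \le 1$; to see this, let $\Gamma_0$ be any homotopy from $\gamma_0$ to $\gamma_1$ in $\Omega$, and let
$$\Gamma(t, s) = \gamma_0(t), \qquad 0 \le s \le \tfrac{1}{3},$$
$$\Gamma(t, s) = \Gamma_0(t, 3s - 1), \qquad \tfrac{1}{3} < s < \tfrac{2}{3},$$
$$\Gamma(t, s) = \gamma_1(t), \qquad \tfrac{2}{3} \le s \le 1.$$
If $\Gamma$ has these properties, then when $n \ge 3$ we have
$$\gamma_{s,n}(t) = \int \gamma_s(t - x)\, \varphi_n(x)\, dx, \qquad s = 0 \text{ or } 1.$$
It follows that $\gamma_{s,n} \to \gamma_s$ uniformly, $s = 0$ or 1. It also follows, by differentiating with respect to $t$, that $\gamma_{s,n}'$ is uniformly bounded and
$$\gamma_{s,n}' \to \gamma_s'$$
on each interval of $[0, 1]$ where $\gamma_s'$ is continuous, $s = 0$ or 1. Therefore (3) is true. □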

Exercises

1. Suppose $\gamma_1: [a, b] \to \Omega$ and $\gamma_2: [c, d] \to \Omega$ are two piecewise smooth curves with the same image:
$$C = \{\gamma_1(t) \mid t \in [a, b]\} = \{\gamma_2(t) \mid t \in [c, d]\}.$$
Suppose these curves trace out the image in the same direction, i.e., if
$$s, t \in [a, b], \qquad s', t' \in [c, d]$$
and
$$\gamma_1(s) = \gamma_2(s'), \qquad \gamma_1(t) = \gamma_2(t')$$
and $s < t$, then $s' < t'$. Show that for any continuous $f: \Omega \to \mathbb{C}$ (not necessarily holomorphic),
$$\int_{\gamma_1} f = \int_{\gamma_2} f.$$
This justifies writing the integral as an integral over the point set $C$:
$$\int_{\gamma_1} f = \int_C f(z)\, dz,$$
where we tacitly assume a direction chosen on $C$.


2. Suppose $C$ in Exercise 1 is a circle of radius $R$ and suppose $|f(z)| \le M$, all $z \in C$. Show that
$$\Big|\int_C f(z)\, dz\Big| \le M \cdot 2\pi R.$$
3. Let $\gamma(t) = z_0 + \varepsilon e^{it}$, $t \in [0, 2\pi]$, where $\varepsilon > 0$. Show that
$$\int_\gamma (z - z_0)^{-1}\, dz = 2\pi i.$$
4. Let $\gamma_n(t) = \exp(int)$, $t \in [0, 2\pi]$, $n = 0, 1, 2, \ldots$. Show that
$$\int_{\gamma_n} z^{-1}\, dz = 2n\pi i.$$
5. Let $\Omega = \{z \in \mathbb{C} \mid z \ne 0\}$. Use the result of Exercise 4 to show that the curves $\gamma_n$ and $\gamma_m$ are not homotopic in $\Omega$ if $n \ne m$. Show that each $\gamma_n$ is homotopic to $\gamma_0$ in $\mathbb{C}$, however.
6. Use Exercise 4 to show that there is no function $f$, defined for all $z \ne 0$, such that $f'(z) = z^{-1}$, all $z \ne 0$. Compare this to Corollary 1.7.
7. Let $\Omega$ be a disc with a point removed:
$$\Omega = \{z \mid |z - z_0| < R,\ z \ne z_1\},$$
where $|z_1 - z_0| < R$. Let $\gamma_0$ and $\gamma_1$ be two circles in $\Omega$ enclosing $z_1$, say
$$\gamma_0(t) = z_1 + \varepsilon e^{it}, \qquad \gamma_1(t) = z_0 + r e^{it}, \qquad t \in [0, 2\pi].$$
Here $|z_0 - z_1| < r < R$ and $\varepsilon > 0$ is chosen so that $|\gamma_1(t) - z_1| > \varepsilon$, all $t$. Construct a homotopy from $\gamma_0$ to $\gamma_1$.
8. Suppose $\Omega$ contains the square
$$\{x + iy \mid 0 \le x, y \le 1\}.$$
Suppose $f$ is differentiable at each point of $\Omega$; here we do not assume that $f'$ is continuous. Let $C$ be the boundary of the square, with the counterclockwise direction. Show that
$$\int_C f(z)\, dz = 0.$$
This extension of Corollary 2.3 is due to Goursat. (Hint: for each integer $k > 0$, divide the square into $4^k$ smaller squares with edges of length $2^{-k}$. Let $C_{k,1}, \ldots, C_{k,4^k}$ be the boundaries of these smaller squares, with the counterclockwise direction. Then show that
$$\int_C f(z)\, dz = \sum_j \int_{C_{k,j}} f(z)\, dz.$$
It follows that if
$$\Big|\int_C f(z)\, dz\Big| = M > 0,$$
then for each $k$ there is a $j = j(k)$ such that
$$\Big|\int_{C_k} f(z)\, dz\Big| \ge 4^{-k} M,$$
where $C_k = C_{k,j}$. Let $z_k$ be the center of the square with boundary $C_k$. There is a subsequence $z_{k_n}$ of the sequence $z_k$ which converges to a point $z$ of the unit square. Now derive a contradiction as follows. Since $f$ is differentiable at $z$,
$$f(w) = f(z) + f'(z)(w - z) + r(w),$$
where
$$|r(w)(w - z)^{-1}| \to 0 \quad \text{as } w \to z.$$
Therefore for each $\varepsilon > 0$ there is a $\delta > 0$ such that if $C_0$ is the boundary of a square with sides of length $h$ lying in the disc $|w - z| < \delta$, then

3. The Cauchy integral formula


There are many approaches to the principal results of the theory of
holomorphic functions. The most elegant approach is through Cauchy's
Theorem and its chief consequence, the Cauchy integral formula. We begin
with a special case, which itself is adequate for most purposes.

Theorem 3.1. Suppose $f$ is holomorphic in an open set $\Omega$. Suppose $C$ is a circle or rectangle contained in $\Omega$ and such that all points enclosed by $C$ are in $\Omega$. Then if $w$ is enclosed by $C$,
$$f(w) = \frac{1}{2\pi i} \int_C f(z)(z - w)^{-1}\, dz. \tag{1}$$
(Here the integral is taken in the counterclockwise direction on $C$.)

Proof. Let $\gamma_0: [a, b] \to \Omega$ be a piecewise smooth closed curve whose image is $C$, traced once in the counterclockwise direction. Given any positive $\varepsilon$ which is so small that $C$ encloses the closed disk of radius $\varepsilon$ and center $w$, let $C_\varepsilon$ be the circle of radius $\varepsilon$ centered at $w$.


We can find a piecewise smooth curve $\gamma_1: [a, b] \to \Omega$ which traces out $C_\varepsilon$ once in the counterclockwise direction and is homotopic to $\gamma_0$ in the region $\Omega$ with the point $w$ removed. Granting this for the moment, let us derive (1). By Exercise 1 of §2, and Cauchy's Theorem applied to $g(z) = f(z)(z - w)^{-1}$, we have
$$\int_C f(z)(z - w)^{-1}\, dz = \int_{\gamma_0} g = \int_{\gamma_1} g = \int_{C_\varepsilon} f(z)(z - w)^{-1}\, dz,$$
so
$$\int_C f(z)(z - w)^{-1}\, dz = \int_{C_\varepsilon} f(w)(z - w)^{-1}\, dz - \int_{C_\varepsilon} [f(w) - f(z)](z - w)^{-1}\, dz. \tag{2}$$
By Exercise 3 of §2, the first integral on the right in (2) equals $2\pi i f(w)$. Since $f$ is differentiable, the integrand in the second integral on the right is bounded as $\varepsilon \to 0$. But the integration takes place over a curve of length $2\pi\varepsilon$, so this integral converges to zero as $\varepsilon \to 0$ (Exercise 2 of §2). Therefore (1) is true.
Finally, let us construct the curve $\gamma_1$ and the homotopy. For $t \in [a, b]$, let $\gamma_1(t)$ be the point at which $C_\varepsilon$ and the line segment joining $w$ to $\gamma_0(t)$ intersect. Then for $0 \le s \le 1$, let
$$\Gamma(t, s) = (1 - s)\gamma_0(t) + s\gamma_1(t).$$
It is easily checked that $\gamma_1$ and $\Gamma$ have the desired properties. □
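Formula (1) lends itself to a direct numerical check. The following Python sketch is illustrative only: it assumes f(z) = e^z and a circle of radius 2 about the origin, parametrizes the circle counterclockwise, and approximates the integral by the midpoint rule. The function name and parameters are hypothetical.

```python
import cmath
import math

def cauchy_integral(f, center, radius, w, n=2000):
    # Approximates (1 / (2*pi*i)) * integral over the circle |z - center| = radius,
    # traced counterclockwise, of f(z) / (z - w) dz by the midpoint rule in t.
    h = 2 * math.pi / n
    total = 0j
    for j in range(n):
        t = (j + 0.5) * h
        z = center + radius * cmath.exp(1j * t)
        dz = 1j * radius * cmath.exp(1j * t)   # derivative of the parametrization
        total += f(z) / (z - w) * dz
    return total * h / (2j * math.pi)

w_inside = 0.5 + 0.3j
inside_value = cauchy_integral(cmath.exp, 0, 2.0, w_inside)
outside_value = cauchy_integral(cmath.exp, 0, 2.0, 3.0 + 0j)
print(inside_value, cmath.exp(w_inside))
print(outside_value)
```

For a point w enclosed by the circle the result reproduces f(w). For a point outside the circle the integrand is holomorphic in a disc containing the circle, so the value is 0 by Corollary 2.3.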

The preceding proof applies to any situation in which a given curve $\gamma_1$ is homotopic to all small circles around $w \in \Omega$. Let us make this more precise. A closed curve $\gamma$ in an open set $\Omega$ is said to enclose the point $w \in \Omega$ within $\Omega$ if the following is true: there is a $\delta > 0$ such that if $0 < \varepsilon \le \delta$ then $\gamma$ is homotopic, in $\Omega$ with $w$ removed, to a piecewise smooth curve which traces out once, in a counterclockwise direction, the circle with radius $\varepsilon$ and center $w$. We have the following generalization of Theorem 3.1.

Theorem 3.2. Suppose $f$ is holomorphic in an open set $\Omega$. Suppose $\gamma$ is a piecewise smooth closed curve in $\Omega$ which encloses a point $w$ within $\Omega$. Then
$$f(w) = \frac{1}{2\pi i} \int_\gamma f(z)(z - w)^{-1}\, dz. \tag{3}$$
Equation (3) is the Cauchy integral formula, and equation (1) is essentially a special case of (3). (In section 7 we shall give another proof of (1) when $C$ is a circle.)
The Cauchy integral formula makes possible a second proof of the result of Exercise 5, §1: any holomorphic function can be represented locally as a power series.

Corollary 3.3. Suppose $f$ is holomorphic in $\Omega$, and suppose $\Omega$ contains the disc
$$\{z \mid |z - z_0| < R\}.$$
Then there is a unique power series such that
$$f(w) = \sum_{n=0}^{\infty} a_n (w - z_0)^n, \qquad \text{all } w \text{ with } |w - z_0| < R. \tag{4}$$
Proof. Suppose $0 < r < R$, and let $C$ be the circle of radius $r$ centered at $z_0$, with the counterclockwise direction. If $|w - z_0| < r$ and $z \in C$ then
$$|(w - z_0)(z - z_0)^{-1}| = |w - z_0|\, r^{-1} < 1.$$
We expand $(z - w)^{-1}$ in a power series:
$$z - w = z - z_0 - (w - z_0) = (z - z_0)[1 - (w - z_0)(z - z_0)^{-1}],$$
so
$$(z - w)^{-1} = \sum_{n=0}^{\infty} (w - z_0)^n (z - z_0)^{-n-1}.$$
The sequence of functions
$$g_N(z) = f(z) \sum_{n=0}^{N} (w - z_0)^n (z - z_0)^{-n-1}$$
converges uniformly for $z \in C$ to the function
$$f(z)(z - w)^{-1}.$$
Therefore we may substitute in equation (1) to get (4) with
$$a_n = \frac{1}{2\pi i} \int_C f(z)(z - z_0)^{-n-1}\, dz. \tag{5}$$
This argument shows that the series exists and converges for $|w - z_0| < r$. The series is unique, since repeated differentiation shows that
$$a_n = \frac{f^{(n)}(z_0)}{n!}. \tag{6}$$
Since $r < R$ was arbitrary, and since the series is unique, it follows that it converges for all $|w - z_0| < R$. □
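The coefficient formula (5) can be evaluated numerically and compared with known Taylor coefficients. The Python sketch below is only an illustration under our own assumptions: it takes f(z) = 1/(2 - z), which is holomorphic in the disc |z| < 2 and has coefficients a_n = 2^{-(n+1)} at z_0 = 0, and approximates the integral over the circle of radius 1 by an equally spaced sum. The function names and the sample count are hypothetical.

```python
import cmath
import math

def taylor_coefficient(f, z0, n, r=1.0, samples=512):
    # Approximates a_n = (1 / (2*pi*i)) * integral over |z - z0| = r of
    # f(z) * (z - z0)**(-n - 1) dz, using z = z0 + r*exp(i*t), dz = i*r*exp(i*t) dt.
    total = 0j
    for k in range(samples):
        t = 2 * math.pi * k / samples
        z = z0 + r * cmath.exp(1j * t)
        dz = 1j * r * cmath.exp(1j * t)
        total += f(z) * (z - z0) ** (-n - 1) * dz
    return total * (2 * math.pi / samples) / (2j * math.pi)

def sample_function(z):
    # Assumed example: holomorphic for |z| < 2, with a_n = 2**(-(n + 1)) at z0 = 0.
    return 1 / (2 - z)

computed = [taylor_coefficient(sample_function, 0, n) for n in range(6)]
expected = [2.0 ** (-(n + 1)) for n in range(6)]
print(computed)
print(expected)
```

Any radius r < 2 could be used in place of 1 here, which mirrors the remark in the proof that r < R is arbitrary and the coefficients do not depend on it.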
Note that our two expressions for $a_n$ can be combined to give
$$f^{(n)}(z_0) = \frac{n!}{2\pi i} \int_C f(z)(z - z_0)^{-n-1}\, dz. \tag{7}$$
This is a special case of the following generalization of the Cauchy integral formula.

Corollary 3.4. Suppose $f$ is holomorphic in an open set containing a circle or rectangle $C$ and all the points enclosed by $C$. If $w$ is enclosed by $C$ then the $n$th derivative of $f$ at $w$ is given by
$$f^{(n)}(w) = \frac{n!}{2\pi i} \int_C f(z)(z - w)^{-n-1}\, dz. \tag{8}$$
Proof. Let $C_1$ be a circle centered at $w$ and enclosed by $C$. Then by the Cauchy integral theorem, and the argument given in the proof of Theorem 3.1, we may replace $C$ by $C_1$ in (8). But in this case the formula reduces to the case given in (7). □

A function $f$ defined and holomorphic on the whole plane $\mathbb{C}$ is said to be entire. The following result is known as Liouville's Theorem.

Corollary 3.5. If $f$ is an entire function which is bounded, then $f$ is constant.

Proof. We are assuming that there is a constant $M$ such that $|f(z)| \le M$, all $z \in \mathbb{C}$. It is sufficient to show that $f' \equiv 0$. Given $w \in \mathbb{C}$ and $R > 0$, let $C$ be the circle with radius $R$ centered at $w$. Then
$$f'(w) = \frac{1}{2\pi i} \int_C f(z)(z - w)^{-2}\, dz,$$
so
$$|f'(w)| \le \frac{1}{2\pi} \cdot M \cdot R^{-2} \cdot 2\pi R = M R^{-1}.$$
Letting $R \to \infty$ we get $f'(w) = 0$. □

A surprising consequence of Liouville's Theorem is the "Fundamental Theorem of Algebra."

Corollary 3.6. Any nonconstant polynomial with complex coefficients has a complex root.

Proof. Suppose $p$ is such a polynomial. We may assume the leading coefficient is 1:
$$p(z) = z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0.$$
It is easy to show that there is an $R > 0$ such that
$$|p(z)| \ge \tfrac{1}{2} |z|^n \quad \text{if } |z| \ge R. \tag{9}$$
Now suppose $p$ has no roots: $p(z) \ne 0$, all $z \in \mathbb{C}$. Then $f(z) = p(z)^{-1}$ would be an entire function. Then $f$ would be bounded on the disc $|z| \le R$, and (9) shows that it would be bounded by $2R^{-n}$ for $|z| \ge R$. But then $f$ would be constant, a contradiction. □

Exercises
1. Verify the Cauchy Integral Formula in the form (1) by direct computation when $f(z) = e^z$ and $C$ is a circle.
2. Compute the power series expansion (4) in the following cases. (Hint: (6) is not always the simplest way to obtain the $a_n$.)
(a) $f(z) = \sin z$, $z_0 = \tfrac{1}{2}\pi$.
(b) $f(z) = e^z$, $z_0$ arbitrary.
(c) $f(z) = z^3 - 2z^2 + z + i$, $z_0 = -i$.
(d) $f(z) = (z - 1)(z^2 + 1)^{-1}$, $z_0 = 2$.
3. Derive equation (8) directly from (1) by differentiating.
4. Suppose $f$ is an entire function, and suppose there are constants $M$ and $n$ such that

all $z \in \mathbb{C}$.
Show that $f$ is a polynomial of degree $\le n$. Show that, conversely, any polynomial of degree $\le n$ satisfies such an inequality.
5. The Cauchy integral formula can be extended to more general situations, such as the case of a region bounded by more than one curve. For example, suppose $\Omega$ contains the annulus
$$A = \{z \mid r < |z - z_0| < R\},$$
and also the two circles
$$C_1 = \{z \mid |z - z_0| = r\}, \qquad C_2 = \{z \mid |z - z_0| = R\}$$
which bound $A$. Give $C_1$ and $C_2$ the counterclockwise direction. Then if $w \in A$, and $f$ is holomorphic in $\Omega$, show that
$$f(w) = \frac{1}{2\pi i} \int_{C_2} f(z)(z - w)^{-1}\, dz - \frac{1}{2\pi i} \int_{C_1} f(z)(z - w)^{-1}\, dz. \tag{*}$$
(Hint: choose $a \in \mathbb{C}$, $|a| = 1$ so that $w$ does not lie on the line segment $L = \{z_0 + ta \mid r \le t \le R\}$ joining $C_1$ to $C_2$. There is a curve $\gamma$ tracing out $L$, then $C_2$ in the counterclockwise direction, then $L$ in the reverse direction, then $C_1$ in the clockwise direction. This curve is homotopic in $\Omega$ to any small circle about $w$. Moreover, the integral of $f(z)(z - w)^{-1}$ over $\gamma$ equals the right side of (*), since the two integrations over $L$ are in the opposite directions and cancel each other.)
6. Extend the result of Exercise 5 to the following situation: $\Omega$ contains a circle or rectangle $C$, together with all points enclosed by $C$ except $z_1, z_2, \ldots, z_m$. Let $C_1, C_2, \ldots, C_m$ be circles around these points which do not intersect each other or $C$. If $f$ is holomorphic in $\Omega$ and $w \in \Omega$ is enclosed by $C$, then
$$f(w) = \frac{1}{2\pi i} \int_C f(z)(z - w)^{-1}\, dz - \sum_{j=1}^{m} \frac{1}{2\pi i} \int_{C_j} f(z)(z - w)^{-1}\, dz;$$
again, all integrals are taken in the counterclockwise direction.
7. Suppose $f: \Omega \to \mathbb{C}$ is merely assumed to be differentiable at each point of the open set $\Omega$, and suppose $\Omega$ contains a rectangle $C$ and all points enclosed by $C$. Suppose $w$ is enclosed by $C$. Modify the argument of Exercise 8, §2, to show that the integral in (1) remains unchanged if we replace $C$ by any rectangle enclosed by $C$ and enclosing $w$. Therefore show that (1) holds in this case also.
8. Use the result of Exercise 7 to show that if $f$ is differentiable at each point of an open set $\Omega$, the derivative is necessarily a continuous function.


4. The local behavior of a holomorphic function


In this section we investigate the qualitative behavior near a point $z_0 \in \mathbb{C}$ of a function which is holomorphic in a disc around $z_0$. If $f$ is not constant, then its qualitative behavior near $z_0$ is the same as that of a function of the form
$$a_0 + a_m (z - z_0)^m,$$
where $a_0$ and $a_m$ are constants, $a_m \ne 0$, and $m \ge 1$.

Lemma 4.1. Suppose $f$ is holomorphic in an open set $\Omega$ and $z_0 \in \Omega$. If $f$ is not constant near $z_0$, then
$$f(z) = a_0 + a_m (z - z_0)^m h(z), \tag{1}$$
where $a_0$ and $a_m$ are constants, $a_m \ne 0$, $m \ge 1$, and $h$ is holomorphic in $\Omega$ with $h(z_0) = 1$.

Proof. Near $z_0$, $f$ is given by a power series expansion
$$f(z) = a_0 + a_1 (z - z_0) + \cdots + a_n (z - z_0)^n + \cdots. \tag{2}$$
Let $m$ be the first integer $\ge 1$ such that $a_m \ne 0$. Then (2) gives (1) with
$$h(z) = \sum_{n=m}^{\infty} a_n a_m^{-1} (z - z_0)^{n-m}.$$
This function is holomorphic near $z_0$, and $h(z_0) = 1$. On the other hand, (1) defines a function $h$ in $\Omega$ except at $z_0$, and the function so defined is holomorphic. Thus there is a single such function holomorphic throughout $\Omega$. □
Our first theorem here is the Inverse Function Theorem for holomorphic functions.

Theorem 4.2. Suppose $f$ is holomorphic in an open set $\Omega$, and suppose $z_0 \in \Omega$, $f'(z_0) \ne 0$. Let $w_0 = f(z_0)$. Then there is an $\varepsilon_1 > 0$ and a holomorphic function $g$ defined on the disc $|w - w_0| < \varepsilon_1$ such that
$$g(\{w \mid |w - w_0| < \varepsilon_1\}) \text{ is open},$$
$$f(g(w)) = w \quad \text{if } |w - w_0| < \varepsilon_1.$$
In other words, $f$ takes an open set containing $z_0$ in a 1-1 way onto a disc about $w_0$, and the inverse function $g$ is holomorphic.
Proof. We begin by asking: Suppose the theorem were true. Can we derive a formula for $g$ in terms of $f$? The idea is to use the Cauchy integral formula for $g$, using a curve $\gamma$ around $w_0$ which is the image by $f$ of a curve around $z_0$, because then we may take advantage of the fact that $g(f(z)) = z$. To carry this out, let $\delta > 0$ be small enough that $\Omega$ contains the closed disc $|z - z_0| \le \delta$; later we shall further restrict $\delta$. Let
$$\gamma_0(t) = z_0 + \delta e^{it}, \qquad \gamma(t) = f(\gamma_0(t)), \qquad t \in [0, 2\pi],$$
and let $C$ be the circle of radius $\delta$ around $z_0$. Assuming the truth of the theorem and assuming that $\gamma$ enclosed $w_1$, we should have
$$g(w_1) = \frac{1}{2\pi i} \int_0^{2\pi} g(\gamma(t)) \gamma'(t)[\gamma(t) - w_1]^{-1}\, dt = \frac{1}{2\pi i} \int_0^{2\pi} \gamma_0(t) f'(\gamma_0(t)) \gamma_0'(t)[f(\gamma_0(t)) - w_1]^{-1}\, dt,$$
or
$$g(w_1) = \frac{1}{2\pi i} \int_C z f'(z)[f(z) - w_1]^{-1}\, dz. \tag{3}$$
Our aim now is to use (3) to define $g$ and show that it has the desired properties. First, note that (1) holds with $m = 1$. We may restrict $\delta$ still further, so that $|h(z)| \ge \tfrac{1}{2}$ for $z \in C$. This implies that $f(z) \ne w_0$ if $z \in C$. Then we may choose $\varepsilon > 0$ so that
$$f(z) \ne w_1 \quad \text{if } |w_1 - w_0| < \varepsilon \text{ and } z \in C.$$
With this choice of $\delta$ and $\varepsilon$, (3) defines a function $g$ on the disc $|w_1 - w_0| < \varepsilon$. This function is holomorphic; in fact it may be differentiated under the integral sign.
Suppose
$$|z_1 - z_0| < \delta, \qquad f(z_1) = w_1, \qquad \text{and} \quad |w_1 - w_0| < \varepsilon.$$
We can, and shall, assume that $\delta$ is chosen so small that $f'(z) \ne 0$ when $|z - z_0| < \delta$. Then
$$f(z) - w_1 = f(z) - f(z_1) = (z - z_1) k(z),$$
where $k$ is holomorphic in $\Omega$ and $k(z_1) = f'(z_1) \ne 0$. Therefore $k$ is nonzero in $\Omega$. We have
$$g(w_1) = \frac{1}{2\pi i} \int_C z f'(z) k(z)^{-1} (z - z_1)^{-1}\, dz.$$
But the right side is the Cauchy integral formula for
$$z_1 f'(z_1) k(z_1)^{-1} = z_1.$$
Thus $g(f(z_1)) = z_1$ for $z_1$ near $z_0$, and we have shown that $f$ is 1-1 near $z_0$. Also
$$1 = (g \circ f)'(z_0) = g'(w_0) f'(z_0),$$
so $g'(w_0) \ne 0$. Therefore $g$ is 1-1 near $w_0$. We may take $\varepsilon_0 > 0$ so small that $\varepsilon_0 \le \varepsilon$, and so that $g$ is 1-1 on the disc $|w - w_0| < \varepsilon_0$. We may also assume $\varepsilon_1 \le \varepsilon_0$ so small that
$$|w - w_0| < \varepsilon_1 \quad \text{implies} \quad |g(w) - z_0| < \delta$$
and
$$|f(g(w)) - w_0| < \varepsilon_0.$$
But then for $|w - w_0| < \varepsilon_1$ we have
$$g(f(g(w))) = g(w),$$
and since $g$ is 1-1 on $|w - w_0| < \varepsilon_0$ this implies
$$f(g(w)) = w.$$
□
As an example, consider the logarithm. Suppose Wo E C, WO "# O. We


know by results of 6 of Chapter 2 that there is a Zo E C such that eZo = Wo.
The derivative of e Z at Zo is eZo = Wo "# O. Therefore there is a unique way of
choosing the logarithm
Z = log w,

Z near zo, in such a way that it is a holomorphic function of w near Wo.


(In fact, we know that any two determinations of log w differ by an integral
multiple of 27Ti; therefore the choice of log w will be holomorphic in an
open set n if and only if it is continuous there.) By definition a branch of
the logarithm function in n is a choice Z = log w, WEn, such that Z is a
holomorphic function of w in n.
As a second example, consider the nth root, n a positive integer. If Wo "# 0,
choose a branch of log w holomorphic in a disc about Wo. Then if we set

1 /n

= exp

(~log w),

this is a holomorphic function of w near Wo and

(wl/n)n = exp

(n.~ log w) =

exp (log w) = w.

We refer to w1 / n as a branch of the nth root.


There are exactly $n$ branches of the $n$th root holomorphic near $w_0$. In fact, suppose

$$z_0^n = z_1^n.$$

Then for any choice of $\log z_0$ and $\log z_1$,

$$\exp(n \log z_0) = \exp(n \log z_1),$$

so

$$n \log z_0 = n \log z_1 + 2m\pi i,$$

some integer $m$. This implies

$$z_0 = \exp(\log z_0) = \cdots = z_1 \exp(2m\pi i n^{-1}). \tag{4}$$

Since

$$\exp(2m\pi i n^{-1}) = \exp(2m'\pi i n^{-1})$$

if and only if $(m - m')n^{-1}$ is an integer, we get all $n$ distinct $n$th roots of $w_0$ by letting $z_1$ be a fixed root and taking $m = 0, 1, \ldots, n - 1$ in (4).
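
The enumeration of roots just described can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `nth_roots` is hypothetical, and the fixed root is taken from the principal logarithm supplied by Python's `cmath.log`, which is only one of the possible determinations.

```python
import cmath
import math


def nth_roots(w0: complex, n: int) -> list[complex]:
    """Return the n distinct nth roots of a nonzero complex number w0."""
    if w0 == 0 or n < 1:
        raise ValueError("need w0 != 0 and n >= 1")
    # One fixed root z1 = exp((1/n) log w0), using the principal logarithm.
    z1 = cmath.exp(cmath.log(w0) / n)
    # The remaining roots are z1 * exp(2 m pi i / n) for m = 0, ..., n - 1.
    return [z1 * cmath.exp(2j * math.pi * m / n) for m in range(n)]


roots = nth_roots(-8 + 0j, 3)
for z in roots:
    print(z, z ** 3)
```

Each printed cube should agree with $-8$ up to rounding error, and the three roots are separated by rotations through the angle $2\pi/3$.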


We can now describe the behavior of a nonconstant holomorphic function near a point where the derivative vanishes.

Theorem 4.3. Suppose $f$ is holomorphic in an open set $\Omega$, and suppose $z_0 \in \Omega$. Suppose $f$ is not constant near $z_0$, and let $m$ be the first integer $\ge 1$ such that $f^{(m)}(z_0) \ne 0$. Let $w_0 = f(z_0)$. There are $\varepsilon > 0$ and $\delta > 0$ such that if

$$0 < |w - w_0| < \varepsilon$$

there are exactly $m$ distinct points $z$ such that

$$|z - z_0| < \delta \quad\text{and}\quad f(z) = w.$$

Proof. By Lemma 4.1,

$$f(z) = w_0 + (z - z_0)^m h(z),$$

where $h$ is holomorphic in $\Omega$ and $h(z_0) \ne 0$. Choose a branch $g$ of the $m$th root function which is holomorphic near $h(z_0)$. Then near $z_0$,

$$f(z) = w_0 + [(z - z_0)\,g(h(z))]^m = w_0 + k(z)^m,$$

where $k(z) = (z - z_0)\,g(h(z))$ is holomorphic near $z_0$. Then $k(z_0) = 0$, $k'(z_0) = g(h(z_0)) \ne 0$. For $z$ near $z_0$, $z \ne z_0$, we have

$$f(z) = w \quad\text{if and only if}\quad k(z) = (w - w_0)^{1/m}$$

for some determination of the $m$th root of $w - w_0$. We can apply Theorem 4.2 to $k$: there are $\varepsilon$, $\delta$ so that

$$k(z) = t$$

has a unique solution $z$ in the disc $|z - z_0| < \delta$ for each $t$ in the disc $|t| < \varepsilon^{1/m}$. But if

$$0 < |w - w_0| < \varepsilon$$

then $w - w_0$ has exactly $m$ $m$th roots $t$, all with $|t| < \varepsilon^{1/m}$. $\square$
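
A concrete instance of Theorem 4.3 may help. The sketch below is an illustration chosen here, not an example from the text: it takes $f(z) = \cos z$ at $z_0 = 0$, where $f'(0) = 0$ and $f''(0) \ne 0$, so $m = 2$ and $w_0 = 1$. For $w$ near $1$, $w \ne 1$, the two solutions near $0$ are $\pm\arccos w$; the helper name `preimages_near_zero` is hypothetical.

```python
import cmath


def preimages_near_zero(w: complex) -> tuple[complex, complex]:
    """Return the two solutions of cos z = w that lie near z = 0, for w near 1."""
    z = cmath.acos(w)  # principal value, small when w is close to 1
    return z, -z       # cos is even, so -z is the second solution


w = 1 + 0.01 + 0.02j
z_plus, z_minus = preimages_near_zero(w)
print(z_plus, cmath.cos(z_plus))
print(z_minus, cmath.cos(z_minus))
```

The two points are distinct because $\arccos w \ne 0$ when $w \ne 1$, in agreement with the count $m = 2$.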

The following corollary is called the open mapping property of holomorphic functions.

Corollary 4.4. If $\Omega$ is open and $f\colon \Omega \to \mathbb{C}$ is holomorphic and not constant near any point, then $f(\Omega)$ is open.

Proof. Suppose $w_0 \in f(\Omega)$. Then $w_0 = f(z_0)$, some $z_0 \in \Omega$. We want to show that there is an $\varepsilon > 0$ such that the disc $|w - w_0| < \varepsilon$ is contained in $f(\Omega)$. But this follows from Theorem 4.3. $\square$
Exercises
1. Use Corollary 4.4 to prove the Maximum Modulus Theorem: if $f$ is holomorphic and not constant in a disc $|z - z_0| < R$, then $g(z) = |f(z)|$ does not have a local maximum at $z_0$.
2. With $f$ as in Exercise 1, show that $g(z) = |f(z)|$ does not have a local minimum at $z_0$ unless $f(z_0) = 0$.
3. Use Exercise 2 to give another proof of the Fundamental Theorem of Algebra.
4. Suppose $z = \log w$. Show that $\operatorname{Re} z = \log |w|$.
5. Suppose $f$ is holomorphic near $z_0$ and $f(z_0) \ne 0$. Show that $\log |f(z)|$ is harmonic near $z_0$.
6. Use Exercise 5 and the maximum principle for harmonic functions to give another proof of the Maximum Modulus Theorem.
7. Use the Cauchy integral formula (for a circle with center $z_0$) to give still another proof of the Maximum Modulus Theorem.
8. Use the Maximum Modulus Theorem to prove Corollary 4.4. (Hint: let $w_0 = f(z_0)$ and let $C$ be a small circle around $z_0$ such that $f(z) \ne w_0$ if $z \in C$. Choose $\varepsilon > 0$ so that $|f(z) - w_0| \ge 2\varepsilon$ if $z \in C$. If $|w - w_0| < \varepsilon$, can $(f(z) - w)^{-1}$ be holomorphic inside $C$?)
9. A set $\Omega \subset \mathbb{C}$ is connected if for any points $z_0, z_1 \in \Omega$ there is a (continuous) curve $\gamma\colon [a, b] \to \Omega$ with $\gamma(a) = z_0$, $\gamma(b) = z_1$. Suppose $\Omega$ is open and connected, and suppose $f$ is holomorphic in $\Omega$. Show that if $f$ is identically zero in any nonempty open subset $\Omega_1 \subset \Omega$, then $f \equiv 0$ in $\Omega$.
10. Let $\Omega$ be the union of two disjoint open discs. Show that $\Omega$ is not connected.

§5. Isolated singularities

Suppose $f$ is a function holomorphic in an open set $\Omega$. A point $z_0$ is said to be an isolated singularity of $f$ if $z_0 \notin \Omega$ but every point sufficiently close to $z_0$ is in $\Omega$. Precisely, there is a $\delta > 0$ such that

$$z \in \Omega \quad\text{if } 0 < |z - z_0| < \delta.$$

For example, $0$ is an isolated singularity for $f(z) = z^{-n}$, $n$ a positive integer, and for $g(z) = \exp(1/z)$. On the other hand, according to the definition, $0$ is also an isolated singularity for the function $f$ which is defined by $f(z) = 1$, $z \ne 0$, and is not defined at $0$. This example shows that a singularity may occur through oversight: not assigning values to enough points. An isolated singularity $z_0$ for $f$ is said to be a removable singularity if $f$ can be defined at $z_0$ in such a way as to remain holomorphic.
Theorem 5.1. Suppose $z_0$ is an isolated singularity for the holomorphic function $f$. It is a removable singularity if and only if $f$ is bounded near $z_0$, i.e., there are constants $M$, $\delta > 0$ such that

$$|f(z)| \le M \quad\text{if } 0 < |z - z_0| < \delta. \tag{1}$$

Proof. Suppose $z_0$ is a removable singularity. Then $f$ has a limit at $z_0$, and it follows easily that (1) is true.

Conversely, suppose (1) is true. Choose $r$ with $0 < r < \delta$ and let $C$ be the circle with center $z_0$ and radius $r$. Given $w$ with $0 < |w - z_0| < r$, choose $\varepsilon$ so small that $0 < \varepsilon < |w - z_0|$, and let $C_\varepsilon$ be the circle with center $z_0$ and radius $\varepsilon$. By Exercise 5 of §3,

$$f(w) = \frac{1}{2\pi i}\int_C f(z)(z - w)^{-1}\,dz - \frac{1}{2\pi i}\int_{C_\varepsilon} f(z)(z - w)^{-1}\,dz.$$

Since $f$ is bounded on $C_\varepsilon$ independent of $\varepsilon$, the second integral goes to zero as $\varepsilon \to 0$. Thus

$$f(w) = \frac{1}{2\pi i}\int_C f(z)(z - w)^{-1}\,dz, \qquad 0 < |w - z_0| < r. \tag{2}$$

We may define $f(z_0)$ by (2) with $w = z_0$, and then (2) will hold for all $w$, $|w - z_0| < r$. The resulting function is then holomorphic. $\square$
An isolated singularity $z_0$ for a function $f$ is said to be a pole of order $n$ for $f$, where $n$ is an integer $\ge 1$, if $f$ is of the form

$$f(z) = (z - z_0)^{-n} g(z) \tag{3}$$

where $g$ is defined at $z_0$ and holomorphic near $z_0$, while $g(z_0) \ne 0$. A pole of order 1 is often called a simple pole.

Theorem 5.2. Suppose $z_0$ is an isolated singularity for the holomorphic function $f$. It is a pole of order $n$ if and only if the function

$$(z - z_0)^n f(z)$$

is bounded near $z_0$, while the function

$$(z - z_0)^{n-1} f(z)$$

is not.

Proof. It follows easily from the definition that if $z_0$ is a pole of order $n$ the asserted consequences are true.

Conversely, suppose $(z - z_0)^n f(z) = g(z)$ is bounded near $z_0$. Then $z_0$ is an isolated singularity of $g$, removable by Theorem 5.1, so we may extend $g$ to be defined at $z_0$ and holomorphic. We want to show that $g(z_0) \ne 0$ if $(z - z_0)^{n-1} f(z)$ is not bounded near $z_0$. But if $g(z_0) = 0$ then by Lemma 4.1,

$$g(z) = (z - z_0)^m h(z)$$

for some $m \ge 1$ and some $h$ holomorphic near $z_0$. But then $(z - z_0)^{n-1} f(z) = (z - z_0)^{m-1} h(z)$ is bounded near $z_0$. $\square$

An isolated singularity which is neither removable nor a pole (of any order) is called an essential singularity. Note that if $z_0$ is a pole or a removable singularity, then for some $a \in \mathbb{C}$ or $a = \infty$

$$f(z) \to a \quad\text{as } z \to z_0.$$

This is most emphatically not true near an essential singularity.

Theorem 5.3 (Casorati-Weierstrass). Suppose $z_0$ is an isolated singularity for the holomorphic function $f$. If $z_0$ is an essential singularity for $f$, then for any $\varepsilon > 0$ and any $a \in \mathbb{C}$ there is a $z$ such that

$$|z - z_0| < \varepsilon, \qquad |f(z) - a| < \varepsilon.$$

Proof. Suppose the conclusion is not true. Then for some $\varepsilon > 0$ and some $a \in \mathbb{C}$ we have

$$|f(z) - a| \ge \varepsilon \quad\text{where } 0 < |z - z_0| < \varepsilon.$$

Therefore $h(z) = (f(z) - a)^{-1}$ is bounded near $z_0$. It follows that $h$ can be extended so as to be defined at $z_0$ and holomorphic near $z_0$. Then for some $m \ge 0$,

$$h(z) = (z - z_0)^m k(z)$$

where $k$ is holomorphic near $z_0$ and $k(z_0) \ne 0$. We have

$$f(z) = a + (z - z_0)^{-m} k(z)^{-1}, \qquad 0 < |z - z_0| < \varepsilon.$$

Therefore $z_0$ is either a removable singularity or a pole for $f$. $\square$

Actually, much more is true. Picard proved that if $z_0$ is an isolated essential singularity for $f$, then for any $\varepsilon > 0$ and any $a \in \mathbb{C}$, with at most one exception, there is a $z$ such that $0 < |z - z_0| < \varepsilon$ and $f(z) = a$. An example is $f(z) = \exp(1/z)$, $z \ne 0$, which takes any value except zero in any disc around zero.
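
For $\exp(1/z)$ the points promised by Picard's theorem can be written down explicitly: if $a \ne 0$ then $z = (\log a + 2\pi i k)^{-1}$ satisfies $\exp(1/z) = a$ for every integer $k$, and $|z|$ is small when $k$ is large. The following sketch illustrates this; the function name `small_solution` and the particular choice of $k$ are assumptions made here for illustration.

```python
import cmath
import math


def small_solution(a: complex, eps: float) -> complex:
    """Return z with 0 < |z| < eps and exp(1/z) = a, for a != 0."""
    if a == 0:
        raise ValueError("exp(1/z) never takes the value 0")
    log_a = cmath.log(a)  # one determination of log a
    # Choose k so large that |log a + 2 pi i k| > 1/eps.
    k = math.ceil((1 / eps + abs(log_a)) / (2 * math.pi)) + 1
    return 1 / (log_a + 2j * math.pi * k)


a = 2 - 3j
eps = 1e-3
z = small_solution(a, eps)
print(abs(z), cmath.exp(1 / z))
```

The exceptional value is $a = 0$, which the function rejects.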
Isolated singularities occur naturally in operations with holomorphic functions. Suppose, for example, that $f$ is holomorphic in $\Omega$ and $z_0 \in \Omega$. If $f(z_0) \ne 0$, then we know that $f(z)^{-1}$ is holomorphic near $z_0$. The function $f$ is said to have a zero of order $n$ (or multiplicity $n$) at $z_0$, $n$ an integer $\ge 0$, if

$$f^{(k)}(z_0) = 0, \quad 0 \le k < n, \qquad f^{(n)}(z_0) \ne 0.$$

(In particular, $f$ has a zero of order zero at $z_0$ if $f(z_0) \ne 0$.) A zero of order one is called a simple zero.

Lemma 5.4. If $f$ is holomorphic near $z_0$ and has a zero of order $n$ at $z_0$, then $f(z)^{-1}$ has a pole of order $n$ at $z_0$.

Proof. By Lemma 4.1,

$$f(z) = (z - z_0)^n h(z), \tag{4}$$

where $h$ is holomorphic near $z_0$ and $h(z_0) \ne 0$. The desired conclusion follows. $\square$

For example, the function

$$\sec z = (\cos z)^{-1}$$

is holomorphic except at the zeros of $\cos z$, where it has poles (of order 1). The same is then true of

$$\tan z = \sin z\,(\cos z)^{-1} = \sin z \sec z.$$

It is convenient to assign the "value" $\infty$ at $z_0$ to a holomorphic function with a pole at $z_0$. Similarly, if $z_0$ is a removable singularity for $f$ we shall consider $f$ as being extended to take the appropriate value at $z_0$. With these conventions we may work with the following extension of the notion of holomorphic function.

Suppose $\Omega \subset \mathbb{C}$ is an open set. A function $f\colon \Omega \to \mathbb{C} \cup \{\infty\}$ is said to be meromorphic in $\Omega$ if for each point $z_0 \in \Omega$, either $z_0$ is a pole of $f$, or $f$ is holomorphic in the disc $|z - z_0| < \delta$ for some $\delta > 0$.

Theorem 5.5. Suppose $\Omega$ is open and $f$, $g$ are meromorphic in $\Omega$. Suppose $a \in \mathbb{C}$. Then the functions

$$af, \qquad f + g, \qquad fg$$

are meromorphic in $\Omega$. If $\Omega$ is connected and $g \not\equiv 0$ in $\Omega$, then $f/g$ is meromorphic in $\Omega$.

Proof. We leave all but the last statement as an exercise. Suppose $\Omega$ is connected, i.e., for each $z_0, z_1 \in \Omega$ there is a continuous curve $\gamma\colon [0, 1] \to \Omega$ with $\gamma(0) = z_0$, $\gamma(1) = z_1$. Given any $z_0 \in \Omega$, either $g(z)^{-1}$ is meromorphic near $z_0$ or $g$ vanishes in a disc around $z_0$. We want to show that the second alternative implies $g \equiv 0$ in all of $\Omega$.

Suppose $g$ vanishes identically near $z_0 \in \Omega$ and suppose $z_1$ is any other point of $\Omega$. Let $\gamma$ be a curve joining $z_0$ to $z_1$: $\gamma(0) = z_0$, $\gamma(1) = z_1$. Let $A$ be the subset of the interval $[0, 1]$ consisting of all those $t$ such that $g$ vanishes identically near $\gamma(t)$. Let $c = \operatorname{lub} A$. There is a sequence $(t_n)_1^\infty \subset A$ such that $t_n \to c$. If $g$ did not vanish identically near $\gamma(c)$ then either $g(\gamma(c)) \ne 0$, or $\gamma(c)$ is a zero of order $n$ for some $n$, or $\gamma(c)$ is a pole of order $n$ for some $n$. But then (see (3) and (4)) we could not have $g$ identically zero near $\gamma(t_n)$ for those $n$ so large that $\gamma(t_n)$ is very close to $\gamma(c)$. Thus $g$ vanishes identically near $\gamma(c)$. This means that $c = 1$, since otherwise $g$ vanishes identically near $\gamma(c + \varepsilon)$ for small $\varepsilon > 0$. Therefore $g(z_1) = g(\gamma(1)) = g(\gamma(c)) = 0$.

We know now that given $z_0 \in \Omega$, either $g(z_0) \ne 0$ or $z_0$ is either a zero or pole of order $n$ for $g$. It follows that either $g(z)^{-1}$ is holomorphic near $z_0$, or, by (4), that $z_0$ is a pole of order $n$ for $g(z)^{-1}$, or, by (3), that $z_0$ is a zero of order $n$ for $g(z)^{-1}$. Thus $g(z)^{-1}$ is meromorphic in $\Omega$, and so $f/g$ is also. $\square$

Exercises

1. Prove the rest of the assertions of Theorem 5.5.
2. Show that $\tan z$ is meromorphic on all of $\mathbb{C}$, with only simple poles. What about $(\tan z)^2$?
3. Determine all the functions $f$, meromorphic in all of $\mathbb{C}$, such that

$$|f(z) - \tan z| < 2$$

at each point $z$ which is not a pole either of $f$ or of $\tan z$.
4. Suppose $f$ has a pole of order $n$ at $z_0$. Show that there are an $R > 0$ and a $\delta > 0$ such that if $|w| > R$ then there are exactly $n$ points $z$ such that

$$f(z) = w, \qquad |z - z_0| < \delta.$$

(Hint: recall Theorem 4.3.)
5. A function $f$ is said to be defined near $\infty$ if there is an $R > 0$ such that

$$\{z \mid |z| > R\}$$

is in the domain of definition of $f$. The function $f$ is said to be holomorphic at $\infty$ if $0$ is a removable singularity for the function $g$ defined by

$$g(z) = f(1/z),$$

$1/z$ in the domain of $f$. Similarly, $\infty$ is said to be a zero or pole of order $n$ for $f$ if $0$ is a zero or pole of order $n$ for $g$. Discuss the status of $\infty$ for the following functions:
(a) $f(z) = z^n$, $n$ an integer.
(b) $f(z) = e^z$.
(c) $f(z) = \sin z$.
(d) $f(z) = \tan z$.
6. Suppose $z_0$ is an essential singularity for $f$, while $g$ is meromorphic near $z_0$ and not identically zero near $z_0$. Is $z_0$ an essential singularity for $fg$? What if, in addition, $z_0$ is not an essential singularity of $g$?

§6. Rational functions; Laurent expansions; residues

A rational function is the quotient of two polynomials:

$$f(z) = p(z)/q(z)$$

where $p$ and $q$ are polynomials, $q \not\equiv 0$. By Theorem 5.5, a rational function is meromorphic in the whole plane $\mathbb{C}$. (In fact it is also meromorphic at $\infty$; see Exercise 5 of §5.)

It is easy to see that sums, scalar multiples, and products of rational functions are rational functions. If $f$ and $g$ are rational functions and $g \not\equiv 0$, then $f/g$ is a rational function. In particular, any function of the form

$$f(z) = a_1(z - z_0)^{-1} + a_2(z - z_0)^{-2} + \cdots + a_n(z - z_0)^{-n} \tag{1}$$

is a rational function with a pole at $z_0$. We can write

$$f(z) = p((z - z_0)^{-1}) \tag{2}$$

where $p$ is a polynomial with $p(0) = 0$. It turns out that any rational function is the sum of a polynomial and rational functions of the form (1).
Theorem 6.1. Suppose $f$ is a rational function with poles at the distinct points $z_1, z_2, \ldots, z_m$ and no other poles. There are unique polynomials $p_0, p_1, \ldots, p_m$ such that $p_j(0) = 0$ if $j \ne 0$ and

$$f(z) = p_0(z) + p_1((z - z_1)^{-1}) + \cdots + p_m((z - z_m)^{-1}). \tag{3}$$

Proof. We induce on $m$. If $m = 0$, then $f$ has no poles. Thus $f$ is an entire function. We have $f = p/q$ where $p$ and $q$ are polynomials and $q \not\equiv 0$. Suppose $p$ is of degree $r$ and $q$ is of degree $s$. It is not hard to show that there is a constant $a$ such that

$$z^{s-r} f(z) \to a \quad\text{as } z \to \infty.$$

This and Exercise 4 of §3 imply that $f$ is a polynomial.

Now suppose the assertion of the theorem is true for rational functions with $m - 1$ distinct poles, and suppose $f$ has $m$ poles. Let $z_0$ be a pole of $f$, of order $r$. Then

$$f(z) = (z - z_0)^{-r} h(z),$$

where $h$ is holomorphic near $z_0$. Near $z_0$,

$$h(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n.$$

Therefore

$$f(z) = a_0(z - z_0)^{-r} + a_1(z - z_0)^{1-r} + \cdots + a_{r-1}(z - z_0)^{-1} + k(z),$$

where $k$ is holomorphic near $z_0$. Now $k$ is the difference of two rational functions, hence is rational. The function $g = f - k$ has no poles except at $z_0$, so the poles of $k$ are the poles of $f$ which differ from $z_0$. By the induction assumption, $k$ has a unique expression of the desired form. Therefore $f$ has an expression of the desired form.

Finally, we want to show that the expression (3) is unique. Suppose $p_0$ is of degree $k$. The coefficient $b_k$ of $z^k$ in $p_0$ can be computed by taking limits on both sides in (3):

$$\lim_{z \to \infty} z^{-k} f(z) = \lim_{z \to \infty} z^{-k} p_0(z) = b_k.$$

Therefore the coefficient of $z^{k-1}$ can be computed:

$$\lim_{z \to \infty} z^{1-k}[f(z) - b_k z^k] = b_{k-1}, \quad\text{etc.}$$

Continuing in this way, we determine all coefficients of $p_0$. Similarly, if $p_1$ is of degree $r$ then the coefficient of the highest power of $(z - z_1)^{-1}$ in (3) is

$$\lim_{z \to z_1} (z - z_1)^r f(z) = c_r;$$

the coefficient of the next power is

$$\lim_{z \to z_1} (z - z_1)^{r-1}[f(z) - c_r(z - z_1)^{-r}].$$

All coefficients may be computed successively in this way. $\square$

The expression (3) is called the partial fractions decomposition of the rational function $f$.
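
In the simplest case the limits in the uniqueness argument can be evaluated in closed form. The sketch below is an illustration under stated assumptions, not a general algorithm: it assumes $f = p/q$ with $q(z) = (z - z_1)\cdots(z - z_m)$ having distinct roots and $\deg p < m$, so that $p_0 = 0$ and each $p_j$ has the single coefficient $c_j = \lim_{z \to z_j}(z - z_j)f(z) = p(z_j)/\prod_{k \ne j}(z_j - z_k)$. The names `partial_fraction_coeffs`, `f`, and `decomposition` are hypothetical.

```python
def partial_fraction_coeffs(p, poles):
    """Coefficients c_j with p(z)/prod(z - z_j) = sum c_j/(z - z_j).

    Assumes the poles are distinct and deg p < len(poles).
    """
    coeffs = []
    for j, zj in enumerate(poles):
        denom = 1
        for k, zk in enumerate(poles):
            if k != j:
                denom *= (zj - zk)
        # c_j = lim (z - z_j) f(z) as z -> z_j
        coeffs.append(p(zj) / denom)
    return coeffs


poles = [1, -1, 2j]


def p(z):
    return 3 * z + 1


coeffs = partial_fraction_coeffs(p, poles)


def f(z):
    q = 1
    for zj in poles:
        q *= (z - zj)
    return p(z) / q


def decomposition(z):
    return sum(c / (z - zj) for c, zj in zip(coeffs, poles))


z_test = 0.3 + 0.7j
print(f(z_test), decomposition(z_test))
```

The two printed values should agree up to rounding error. Poles of higher order require the successive limits described in the proof.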

Let us note explicitly a point implicit in the preceding proof. If $f$ has a pole of order $r$ at $z_0$, then

$$f(z) = \sum_{n=-r}^{\infty} b_n (z - z_0)^n, \qquad 0 < |z - z_0| < \delta,$$

some $\delta > 0$. This generalization of the power series expansion of a holomorphic function is called a Laurent expansion. It is one case of a general result valid, in particular, near any isolated singularity.

Theorem 6.2. Suppose $f$ is holomorphic in the annulus

$$A = \{z \mid r < |z - z_0| < R\}.$$

Then there is a unique two-sided sequence $(a_n)_{-\infty}^{\infty} \subset \mathbb{C}$ such that

$$f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n, \qquad r < |z - z_0| < R. \tag{4}$$

Proof. Suppose $r < |z - z_0| < R$. Choose $r_1$, $R_1$ such that

$$r < r_1 < |z - z_0| < R_1 < R.$$

Let $C_1$ be the circle of radius $r_1$ and $C_2$ the circle of radius $R_1$ centered at $z_0$. By Exercise 5 of §3,

$$f(z) = \frac{1}{2\pi i}\int_{C_2} f(w)(w - z)^{-1}\,dw - \frac{1}{2\pi i}\int_{C_1} f(w)(w - z)^{-1}\,dw = f_2(z) + f_1(z).$$

Here $f_2$ and $f_1$ are defined by the respective integrals. We consider $f_2$ as being defined for $|z - z_0| < R_1$ and $f_1$ as being defined for $|z - z_0| > r_1$. Then $f_2$ is holomorphic and has the power series expansion

$$f_2(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n. \tag{5}$$

Moreover, by the Cauchy integral theorem we may increase $R_1$ without changing the values of $f_2$ on $|z - z_0| < R_1$; thus the series (5) converges for $|z - z_0| < R$.

The function $f_1$ is holomorphic for $|z - z_0| > r_1$. Again, by moving the circle $C_1$, we may extend $f_1$ to be holomorphic for $|z - z_0| > r$. To get an appropriate series expansion we proceed as in the proof of Corollary 3.3. When $|z - z_0| > r_1$ and $|w - z_0| = r_1$,

$$(w - z)^{-1} = ((w - z_0) - (z - z_0))^{-1} = -(z - z_0)^{-1}[1 - (w - z_0)(z - z_0)^{-1}]^{-1} = -\sum_{n=1}^{\infty} (w - z_0)^{n-1}(z - z_0)^{-n}.$$

The series converges uniformly for $w \in C_1$, so

$$f_1(z) = \sum_{n=1}^{\infty} a_{-n}(z - z_0)^{-n} \tag{6}$$

where

$$a_{-n} = \frac{1}{2\pi i}\int_{C_1} f(w)(w - z_0)^{n-1}\,dw.$$

Equations (5) and (6) together give (4).

Finally, we want to prove uniqueness. Suppose (4) is valid. Then the series

$$\sum_{n=-\infty}^{\infty} a_n (z - z_0)^n$$

converges for $r < |z - z_0| < R$, so it converges uniformly on any smaller annulus. It follows that if $C$ is any circle with center $z_0$, contained in $A$, then (4) may be integrated term by term over $C$. Since

$$\int_C (z - z_0)^n\,dz$$

is zero for $n \ne -1$ and $2\pi i$ for $n = -1$, this gives

$$a_{-1} = \frac{1}{2\pi i}\int_C f(z)\,dz. \tag{7}$$

More generally we may multiply $f$ by $(z - z_0)^{-m-1}$ and integrate to get

$$a_m = \frac{1}{2\pi i}\int_C f(z)(z - z_0)^{-m-1}\,dz, \tag{8}$$

all $m$. Thus the coefficients are uniquely determined. $\square$
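
The integral formula for the Laurent coefficients lends itself to numerical evaluation. Parametrizing the circle by $z - z_0 = \rho e^{i\theta}$ gives $dz = i(z - z_0)\,d\theta$, so $a_m = (2\pi)^{-1}\int_0^{2\pi} f(z)(z - z_0)^{-m}\,d\theta$, which the sketch below approximates by an equally spaced sum. The function name `laurent_coefficient`, the default radius, and the sample count are assumptions made for illustration; the test function $\exp(1/z)$ has $a_{-n} = 1/n!$ for $n \ge 0$ and $a_n = 0$ for $n > 0$.

```python
import cmath
import math


def laurent_coefficient(f, m: int, z0: complex = 0, radius: float = 1.0,
                        samples: int = 256) -> complex:
    """Approximate the Laurent coefficient a_m of f about z0 on a circle."""
    total = 0
    for k in range(samples):
        u = radius * cmath.exp(2j * math.pi * k / samples)  # u = z - z0
        total += f(z0 + u) * u ** (-m)
    # Average over the circle: (1/2 pi) * integral of f(z) (z - z0)^(-m) d(theta)
    return total / samples


def f(z):
    return cmath.exp(1 / z)


residue = laurent_coefficient(f, -1)
a_minus_2 = laurent_coefficient(f, -2)
a_plus_1 = laurent_coefficient(f, 1)
print(residue, a_minus_2, a_plus_1)
```

The circle must lie inside the annulus where $f$ is holomorphic; for a smooth periodic integrand the equally spaced sum converges rapidly.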

In particular, Theorem 6.2 applies when $f$ has an isolated singularity at $z_0$. In this case the Laurent expansion (4) is valid for

$$0 < |z - z_0| < R,$$

some $R > 0$. The coefficient $a_{-1}$ is called the residue of $f$ at $z_0$. Equation (7) determines the residue by evaluating an integral; reversing the viewpoint we may evaluate the integral if we can determine the residue. These observations are the basis for the "calculus of residues." The following theorem is sufficient for many applications.

Theorem 6.3. Suppose $C$ is a circle or rectangle. Suppose $f$ is holomorphic in an open set $\Omega$ containing $C$ and all points enclosed by $C$, except for isolated singularities at the points $z_1, z_2, \ldots, z_m$ enclosed by $C$. Suppose the residue of $f$ at $z_j$ is $b_j$. Then

$$\int_C f(z)\,dz = 2\pi i(b_1 + b_2 + \cdots + b_m). \tag{9}$$

Proof. Let $C_1, \ldots, C_m$ be nonoverlapping circles centered at $z_1, \ldots, z_m$ and enclosed by $C$. Then by (7),

$$\int_{C_j} f(z)\,dz = 2\pi i\, b_j.$$

Applying Exercise 6 of §3, we get (9). $\square$

If $f$ has a pole at $z_0$, the residue may be computed as follows. If $n$ is the order of the pole,

$$f(z) = (z - z_0)^{-n} h(z) = (z - z_0)^{-n}\sum_{m=0}^{\infty} b_m (z - z_0)^m,$$

so the residue at $z_0$ is

$$b_{n-1} = [(n - 1)!]^{-1} h^{(n-1)}(z_0).$$

In particular, at a simple pole $n = 1$ and the residue is $h(z_0)$.


Let us illustrate the use of the calculus of residues by an example. Suppose we want to compute

$$I = \int_0^{\infty} (1 + t^2)^{-1}\,dt$$

and have forgotten that $(1 + t^2)^{-1}$ is the derivative of $\tan^{-1} t$. Now, since the integrand is even,

$$2I = \int_{-\infty}^{\infty} (1 + t^2)^{-1}\,dt.$$

Let $C_R$, $R > 0$, be the square with vertices $\pm R$ and $\pm R + Ri$. Let $f(z) = (1 + z^2)^{-1}$. The integral of $f$ over the three sides of $C_R$ which do not lie on the real axis is easily seen to approach zero as $R \to \infty$. Therefore

$$\int_{-\infty}^{\infty} (1 + t^2)^{-1}\,dt = \lim_{R \to \infty}\int_{-R}^{R} (1 + t^2)^{-1}\,dt = \lim_{R \to \infty}\int_{C_R} f(z)\,dz.$$

For $R > 1$, $f$ is holomorphic inside $C_R$ except at $z = i$, where it has a simple pole. Since

$$f(z) = (z - i)^{-1}(z + i)^{-1},$$

the residue at $i$ is $(2i)^{-1}$. Therefore when $R > 1$,

$$\int_{C_R} f(z)\,dz = 2\pi i(2i)^{-1} = \pi.$$

We get $I = \tfrac{1}{2}\pi$ (which is $\tan^{-1}(+\infty) - \tan^{-1} 0$, as it should be).
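
The value of the contour integral in this example can be confirmed numerically. The following sketch integrates $f(z) = (1 + z^2)^{-1}$ counterclockwise around the square with vertices $\pm R$, $\pm R + Ri$ using a composite midpoint rule on each side; the helper names and the step count are assumptions made for illustration, and the quadrature is only approximate.

```python
import math


def f(z):
    return 1 / (1 + z * z)


def integrate_segment(func, start, end, steps=20000):
    """Composite midpoint rule for the integral of func along a straight segment."""
    dz = (end - start) / steps
    return sum(func(start + (k + 0.5) * dz) for k in range(steps)) * dz


def integrate_square(func, R, steps=20000):
    """Integrate func counterclockwise around the square with vertices +-R, +-R + Ri."""
    vertices = [-R, R, R + 1j * R, -R + 1j * R]
    total = 0
    for a, b in zip(vertices, vertices[1:] + vertices[:1]):
        total += integrate_segment(func, a, b, steps)
    return total


value = integrate_square(f, 3.0)
print(value, math.pi)
```

For any $R > 1$ the result should be close to $\pi$, since the only singularity enclosed is the simple pole at $i$.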


We conclude with some further remarks on evaluating integrals by this
method. Theorem 6.3 is easily shown to be valid for other curves C, such as
a semicircular arc together with the line segment joining its endpoints, or a
rectangle with a portion or portions replaced by semicircular arcs. The
method is of great utility, depending on the experience and ingenuity of the
user.


Exercises
1. Compute the partial fractions decomposition of

2. Find the Laurent expansion of $\exp(1/z)$ and $\sin(1/z)$ at $0$.


3. Compute the definite integrals
{'", (t

1)(t 3

2it2

fa'" t 2(t

t - 2i)-1 dt

+ 1)-1 dt.

S:

4. Show that
t- 1 sin t dt = Pr. (Hint: this is an even function, and
it is the imaginary part of J(z) = z- 1etz Integrate J over rectangles lying in
the half-plane 1m z ? 0, but with the segment - e < t < e replaced by a
semicircle of radius e in the same half-plane, and let the rectangles grow long
in proportion to their height.)
5. Show that a rational function is holomorphic at $\infty$ or has a pole at $\infty$.
6. Show that any function which is meromorphic in the whole plane and is holomorphic at $\infty$, or has a pole there, is a rational function.
7. Show that if $\operatorname{Re} z > 0$, the integral

$$\Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\,dt$$

exists and is a holomorphic function of $z$ for $\operatorname{Re} z > 0$. This is called the Gamma function.
8. Integrate by parts to show that

$$\Gamma(z + 1) = z\Gamma(z), \qquad \operatorname{Re} z > 0.$$

9. Define $\Gamma(z)$ for $-1 < \operatorname{Re} z \le 0$, $z \ne 0$, by

$$\Gamma(z) = z^{-1}\Gamma(z + 1).$$

Show that $\Gamma$ is meromorphic for $\operatorname{Re} z > -1$, with a simple pole at zero.
10. Use the procedure of Exercise 9 to extend $\Gamma$ so as to be meromorphic in the whole plane, with simple poles at $0, -1, -2, \ldots$.
11. Show inductively that the residue of $\Gamma$ at $-n$ is $(-1)^n (n!)^{-1}$.
12. Is $\Gamma$ a rational function?

§7. Holomorphic functions in the unit disc

In this section we discuss functions holomorphic in the unit disc $D = \{z \mid |z| < 1\}$ from the point of view of periodic functions and distributions. This point of view gives another way of deriving the basic facts about the local theory of holomorphic functions. It also serves to introduce certain spaces of holomorphic functions and of periodic distributions.

Note that if $f$ is defined and holomorphic for $|z - z_0| < R$, then setting

$$f_1(w) = f(z_0 + Rw)$$

we get a function $f_1$ holomorphic for $|w| < 1$. Similarly, if $f$ has an isolated singularity at $z_0$, we may transform it to a function with an isolated singularity at zero and holomorphic elsewhere in the unit disc. Since $f$ can be recovered from $f_1$ by

$$f(z) = f_1(R^{-1}(z - z_0)),$$

all the information about local behavior can be deduced from study of $f_1$ instead.
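Suppose $f$ is holomorphic in $D$. Then the function

$$g(r, \theta) = f(re^{i\theta})$$

is periodic as a function of $\theta$ for $0 \le r < 1$ and constant for $r = 0$. It is also differentiable, and the assumption that $f$ is holomorphic imposes a condition on the derivatives of $g$. In fact

$$h^{-1}[g(r + h, \theta) - g(r, \theta)] = h^{-1}[f(re^{i\theta} + he^{i\theta}) - f(re^{i\theta})] = e^{i\theta}(he^{i\theta})^{-1}[f(re^{i\theta} + he^{i\theta}) - f(re^{i\theta})].$$

Letting $h \to 0$ we get

$$\frac{\partial g}{\partial r} = e^{i\theta} f'(re^{i\theta}).$$

Similarly,

$$h^{-1}[g(r, \theta + h) - g(r, \theta)] = h^{-1}[f(re^{i(\theta + h)}) - f(re^{i\theta})] \sim f'(re^{i\theta})\,h^{-1}[re^{i(\theta + h)} - re^{i\theta}],$$

so

$$\frac{\partial g}{\partial \theta} = ire^{i\theta} f'(re^{i\theta}).$$

Combining these equations we get

$$ir\frac{\partial g}{\partial r} = \frac{\partial g}{\partial \theta}. \tag{1}$$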
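
The relation $ir\,\partial g/\partial r = \partial g/\partial \theta$ can be tested with finite differences. The sketch below is illustrative only: the function names and the step size `h` are assumptions, and central differences give the derivatives only approximately. A holomorphic function should give a residual near zero, while a non-holomorphic one such as $z \mapsto \bar{z}$ should not.

```python
import cmath


def g(f, r, theta):
    return f(r * cmath.exp(1j * theta))


def polar_cr_residual(f, r, theta, h=1e-5):
    """Return i r dg/dr - dg/dtheta, with derivatives by central differences."""
    dg_dr = (g(f, r + h, theta) - g(f, r - h, theta)) / (2 * h)
    dg_dtheta = (g(f, r, theta + h) - g(f, r, theta - h)) / (2 * h)
    return 1j * r * dg_dr - dg_dtheta


def holomorphic(z):
    return cmath.exp(z) + z ** 3


def not_holomorphic(z):
    return z.conjugate()


res_hol = polar_cr_residual(holomorphic, 0.6, 1.1)
res_non = polar_cr_residual(not_holomorphic, 0.6, 1.1)
print(abs(res_hol), abs(res_non))
```

For the conjugate function the residual equals $2ire^{-i\theta}$, of modulus $2r$.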

Now let $g_r(\theta) = g(r, \theta)$, $0 \le r < 1$. Since $g_r$ is continuous, periodic, and continuously differentiable as a function of $\theta$, it is the sum of its Fourier series:

$$g_r(\theta) = \sum_{n=-\infty}^{\infty} a_n(r) e^{in\theta}. \tag{2}$$

The coefficients $a_n(r)$ are given by

$$a_n(r) = \frac{1}{2\pi}\int_0^{2\pi} g(r, \theta) e^{-in\theta}\,d\theta.$$

It follows that $a_n(r)$ is continuous for $0 \le r < 1$ and differentiable for $0 < r < 1$, with derivative

$$a_n'(r) = \frac{1}{2\pi}\int_0^{2\pi} \frac{\partial g}{\partial r}(r, \theta) e^{-in\theta}\,d\theta.$$

Using (1) and integrating by parts we get

$$a_n'(r) = r^{-1} n\, a_n(r), \qquad n \ne 0, \tag{3}$$

and $a_0'(r) = 0$. Thus $a_0$ is constant. The equation (3) may be solved for $a_n$ as follows, $n \ne 0$. The real and imaginary parts of $a_n$ are each real solutions of

$$u'(r) = r^{-1} n\, u(r).$$

On any interval where $u(r) \ne 0$ this is equivalent to

$$\frac{d}{dr}\log|u(r)| = \frac{n}{r},$$

so on such an interval $u(r) = cr^n$, $c$ constant. Since $a_n$ is continuous on $[0, 1)$ and vanishes at $0$ if $n \ne 0$ (because $g_0$ is constant), we must have $a_n(r) = a_n r^n$, with $a_n$ constant and $a_n = 0$ if $n < 0$.

We have proved the following: if $f$ is holomorphic in $D$, then

$$f(re^{i\theta}) = \sum_{n=0}^{\infty} a_n r^n e^{in\theta}, \qquad 0 \le r < 1. \tag{4}$$

Thus

$$f(z) = \sum_{n=0}^{\infty} a_n z^n, \qquad |z| < 1.$$

Suppose $f$ is holomorphic in $D$ and defined and continuous on the closure $\{z \mid |z| \le 1\}$. Then the functions $a_n(r) = a_n r^n$ are also continuous at $r = 1$, and $a_n = a_n(1)$ is the $n$th Fourier coefficient of $g_1$. It follows that $g_r$ is a convolution:

$$g_r = g_1 * Q_r \tag{5}$$

where $Q_r$ is the periodic distribution with Fourier coefficients $b_n = r^n$, $n \ge 0$, and $b_n = 0$, $n < 0$. Then

$$Q_r(\theta) = \sum_{n=0}^{\infty} r^n e^{in\theta} = \sum_{n=0}^{\infty} (re^{i\theta})^n$$

or

$$Q_r(\theta) = (1 - re^{i\theta})^{-1}. \tag{6}$$

Equation (5) can be written

$$f(re^{i\theta}) = \frac{1}{2\pi}\int_0^{2\pi} f(e^{it})\,[1 - re^{i(\theta - t)}]^{-1}\,dt.$$

Setting $w = re^{i\theta}$ and $z = e^{it}$, we recover the Cauchy integral formula

$$f(w) = \frac{1}{2\pi i}\int_C f(z)(z - w)^{-1}\,dz, \tag{7}$$

where $C$ is the unit circle. Thus (5) may be regarded as a version of the Cauchy integral formula.

Note that $Q_r$ is a smooth periodic function when $0 \le r < 1$. Therefore (5) defines a function in the disc if $g_1$ is only assumed to be a periodic distribution. In terms of the Fourier coefficients, if $g_1$ has Fourier coefficients $(a_n)_{-\infty}^{\infty}$ then those of $g_r$ are $(a_n(r))_{-\infty}^{\infty}$ where $a_n(r) = 0$, $n < 0$, $a_n(r) = a_n r^n$, $n \ge 0$. These observations and the results of §§1 and 2 of Chapter 5 lead to the following theorem.

Theorem 7.1. Suppose $F$ is a periodic distribution with Fourier coefficients $(a_n)_{-\infty}^{\infty}$, where $a_n = 0$ for $n < 0$. Let $f$ be the function defined in the unit disc by

$$f(re^{i\theta}) = (F * Q_r)(\theta), \tag{8}$$

with $Q_r$ given by (6). Then $f$ is holomorphic in the unit disc and

$$f(z) = \sum_{n=0}^{\infty} a_n z^n, \qquad |z| < 1. \tag{9}$$

Moreover, $F$ is the boundary value of $f$, in the sense that the distributions $F_r$ defined by the functions $f_r(\theta) = f(re^{i\theta})$ converge to $F$ in the sense of $\mathcal{P}'$ as $r \to 1$.

Conversely, suppose $f$ is holomorphic in the unit disc. Then $f$ is given by a convergent power series (9). If the sequence $(a_n)_0^{\infty}$ is of slow growth, i.e., if there are constants $c$, $r$ such that

$$|a_n| \le c\,n^r, \qquad n > 0, \tag{10}$$

then there is a distribution $F$ such that (8) holds. If we require that the Fourier coefficients of $F$ with negative indices vanish, then $F$ is unique and is the boundary value of $f$ in the sense above.
Condition (10) is not necessarily satisfied by the coefficients of a power
series (9) converging in the disc. An example is

n > 0.
Thus the condition (10) specifies a subset of the set of all holomorphic functions in the disc. This set of holomorphic functions is a vector space. Theorem 7.1 shows that this space corresponds naturally to the subspace of $\mathcal{P}'$ consisting of distributions whose negative Fourier coefficients all vanish.

Recall that $F \in \mathcal{P}'$ is in the Hilbert space $L^2$ if and only if its Fourier coefficients $(a_n)_{-\infty}^{\infty}$ satisfy

$$\sum_{n=-\infty}^{\infty} |a_n|^2 < \infty. \tag{11}$$

For such distributions there is a result exactly like the preceding theorem.

Theorem 7.2. Suppose $F$ is a periodic distribution with Fourier coefficients $(a_n)_{-\infty}^{\infty}$, where $a_n = 0$ for $n < 0$. If $F \in L^2$, then $F$ is the boundary value of a function $f$,

$$f(z) = \sum_{n=0}^{\infty} a_n z^n, \qquad |z| < 1, \tag{12}$$

with

$$\sum_{n=0}^{\infty} |a_n|^2 < \infty. \tag{13}$$

The functions $f_r(\theta) = f(re^{i\theta})$ converge to $F$ in the sense of $L^2$ as $r \to 1$, and

$$\|F\|^2 = \sup_{0 \le r < 1} \frac{1}{2\pi}\int_0^{2\pi} |f(re^{i\theta})|^2\,d\theta.$$

Conversely, suppose $f$ is defined in the unit disc by (12). Suppose either that (13) is true or that

$$\sup_{0 \le r < 1} \int_0^{2\pi} |f(re^{i\theta})|^2\,d\theta < \infty. \tag{14}$$

Then both (13) and (14) hold, and the boundary value of $f$ is a distribution $F \in L^2$.
Proof. The first part of the theorem follows from Theorem 7.1 and the fact that (11) is a necessary condition for $F$ to be in $L^2$. The second part of the theorem is based on the identity

$$\frac{1}{2\pi}\int_0^{2\pi} |f(re^{i\theta})|^2\,d\theta = \sum_{n=0}^{\infty} |a_n|^2 r^{2n}, \qquad 0 \le r < 1, \tag{15}$$

which is true because the Fourier coefficients of $f_r$ are $a_n r^n$ for $n \ge 0$ and zero for $n < 0$. If (13) is true then (14) follows. Conversely, if (13) is false, then (15) shows that the integrals in (14) will increase to $\infty$ as $r \to 1$. Thus (13) and (14) are equivalent. By Theorem 7.1, if (13) holds then $f$ has a distribution $F$ as boundary value. The $a_n$ are the Fourier coefficients of $F$, so (13) implies $F \in L^2$. $\square$
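
The identity relating the mean square of $f$ on the circle of radius $r$ to the sum $\sum |a_n|^2 r^{2n}$ is easy to check numerically when $f$ is a polynomial. In the sketch below the coefficient list, the radius, and the function names are arbitrary illustrative choices; for a polynomial the equally spaced average over the circle is exact up to rounding once the number of samples exceeds the degree.

```python
import cmath
import math


def mean_square_on_circle(coeffs, r, samples=512):
    """Average of |f(r e^{i theta})|^2 over theta, for f(z) = sum coeffs[n] z^n."""
    total = 0.0
    for k in range(samples):
        z = r * cmath.exp(2j * math.pi * k / samples)
        fz = sum(a * z ** n for n, a in enumerate(coeffs))
        total += abs(fz) ** 2
    return total / samples


def coefficient_sum(coeffs, r):
    """The sum of |a_n|^2 r^(2n)."""
    return sum(abs(a) ** 2 * r ** (2 * n) for n, a in enumerate(coeffs))


coeffs = [1, 0.5j, -0.25, 2 + 1j]
r = 0.8
lhs = mean_square_on_circle(coeffs, r)
rhs = coefficient_sum(coeffs, r)
print(lhs, rhs)
```

As $r$ increases to $1$ the right side increases to $\sum |a_n|^2$, which is the mechanism behind the equivalence of (13) and (14).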
The set of holomorphic functions in the disc which satisfy (14) is a vector space which can be identified with the closed subspace of $L^2$ consisting of distributions whose negative Fourier coefficients are all zero. Looked at either way this is a Hilbert space, usually denoted by

$$H^2 \quad\text{or}\quad H^2(D).$$

Exercises

1. Verify that
f(z) =

L: n-r..zn
co

n=l

converges for

Izl

< I but that the coefficients do not satisfy (10).

2. Suppose $f$ is holomorphic in the punctured disc $0 < |z| < 1$. Carry out the analysis of this section for

$$g(r, \theta) = f(re^{i\theta})$$

to deduce:
(a) $f(z) = \sum_{n=-\infty}^{\infty} a_n z^n$, $0 < |z| < 1$.
(b) If $|f(z)| \le M|z|^{-m}$ for some $M$, $m$, then

$$f(z) = \sum_{n=-m}^{\infty} a_n z^n.$$

Chapter 7

The Laplace Transform

§1. Introduction

It is useful to be able to express a given function as a sum of functions of some specified type, for example as a sum of exponential functions. We have done this for smooth periodic functions: if $u \in \mathcal{P}$ then

$$u(x) = \sum_{n=-\infty}^{\infty} a_n e^{inx}, \tag{1}$$

where

$$a_n = \frac{1}{2\pi}\int_0^{2\pi} u(x) e^{-inx}\,dx. \tag{2}$$

Of course the particular exponential functions which occur here are precisely those which are periodic (period $2\pi$). If $u\colon \mathbb{R} \to \mathbb{C}$ is a function which is not periodic, then there is no such natural way to single out a sequence of exponential functions for a representation like (1). One might suspect that (1) would be replaced by a continuous sum, i.e., an integral. This suspicion is correct. To derive an appropriate formula we start with the analogue of (2). Let

$$g(z) = \int_{-\infty}^{\infty} u(t) e^{-zt}\,dt, \tag{3}$$

when the integral exists. (Of course it may not exist for any $z \in \mathbb{C}$ unless restrictions are placed on $u$.)

If we are interested in functions $u$ defined only on the half-line $[0, \infty)$, we may extend such a function to be zero on $(-\infty, 0]$. Then (3) for the extended function is equivalent to

$$g(z) = \int_0^{\infty} u(t) e^{-zt}\,dt. \tag{4}$$

If $u$ is bounded and continuous, then the integral (4) will exist for each $z \in \mathbb{C}$ which has positive real part. More generally, if $a \in \mathbb{R}$ and $e^{-at}u(t)$ is continuous and bounded for $t > 0$, then the integral (4) exists for $\operatorname{Re} z > a$. Moreover, the function

$$g = Lu \tag{5}$$

is holomorphic in this half-plane:

$$g'(z) = -\int_0^{\infty} t\,u(t) e^{-zt}\,dt, \qquad \operatorname{Re} z > a.$$

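In particular, let

$$u_w(t) = e^{wt}, \quad t > 0; \qquad u_w(t) = 0, \quad t \le 0, \qquad w \in \mathbb{C}.$$

Then for $\operatorname{Re} z > \operatorname{Re} w$,

$$Lu_w(z) = \int_0^{\infty} e^{(w - z)t}\,dt = (z - w)^{-1}.$$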
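
The transform of the exponential can be checked by direct quadrature. The sketch below approximates $\int_0^T e^{(w - z)t}\,dt$ by a midpoint rule and compares it with $(z - w)^{-1}$; the truncation point `T`, the step count, and the function name are assumptions made for illustration, and the comparison is only meaningful when $\operatorname{Re} z > \operatorname{Re} w$ so that the tail beyond $T$ is negligible.

```python
import cmath


def laplace_exponential(w: complex, z: complex, T: float = 40.0,
                        steps: int = 100000) -> complex:
    """Midpoint-rule approximation to the Laplace transform of e^{wt} at z."""
    if z.real <= w.real:
        raise ValueError("need Re z > Re w for convergence")
    h = T / steps
    return sum(cmath.exp((w - z) * (k + 0.5) * h) for k in range(steps)) * h


w = 0.5 + 2j
z = 2 + 1j
approx = laplace_exponential(w, z)
exact = 1 / (z - w)
print(approx, exact)
```

The agreement degrades as $\operatorname{Re} z$ approaches $\operatorname{Re} w$, where the integrand decays slowly and a larger `T` would be needed.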

The operator $L$ defined by (3) or (4) and (5) assigns, to certain functions on $\mathbb{R}$, functions holomorphic in half-planes in $\mathbb{C}$. This operator is clearly linear. We would like to invert it: given $g = Lu$, find $u$. Let us proceed formally, with no attention to convergence. Since $Lu$ is holomorphic in some half-plane $\operatorname{Re} z > a$, it is natural to invoke the Cauchy integral formula. Given $z$ with $\operatorname{Re} z > a$, choose $b$ such that

$$a < b < \operatorname{Re} z.$$

Let $C$ be the vertical line $\operatorname{Re} w = b$, traced in the upward direction. We consider $C$ as "enclosing" the half-plane $\operatorname{Re} w > b$, though traced in the wrong direction. A purely formal application of the Cauchy integral formula then gives

$$g(z) = Lu(z) = -\frac{1}{2\pi i}\int_C g(w)(w - z)^{-1}\,dw = \frac{1}{2\pi i}\int_C g(w)\,Lu_w(z)\,dw.$$

If $L$ has an inverse, then $L^{-1}$ is also linear. Then we might expect to be able to interchange $L^{-1}$ and integration in the preceding expression, to get

$$u(t) = \frac{1}{2\pi i}\int_C g(w) e^{wt}\,dw, \tag{6}$$

or

$$u(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} g(b + is) e^{(b + is)t}\,ds.$$

Thus (3) or (4) and (6) are our analogues of (2) and (1) for periodic functions.

It is convenient for applications to interpret (3) and (6) for an appropriate class of distributions $F$. Thus if $F$ is a continuous linear functional on a suitable space $\mathcal{L}$ of functions, we interpret (3) as

$$LF(z) = F(e_z), \qquad e_z(t) = e^{-zt}.$$

The space $\mathcal{L}$ will be chosen in such a way that each such continuous linear functional $F$ can be extended to act on all the functions $e_z$ for $z$ in some half-plane $\operatorname{Re} z > a$. Then the function $LF$ will be holomorphic in this half-plane. We shall characterize those functions $g$ such that $g = LF$ for some $F$, and give an appropriate version of the inversion formula (6).

The operator $L$ is called the Laplace transform. It is particularly useful in connection with ordinary differential equations. To see why this might be so, let $u' = Du$ be the derivative of a function $u$. Substitution of $u'$ for $u$ in (3) and integration by parts (formally) yield

$$[Lu'](z) = zLu(z). \tag{7}$$

More generally, suppose $p$ is a polynomial

$$p(z) = a_m z^m + a_{m-1} z^{m-1} + \cdots + a_1 z + a_0.$$

Let $p(D)$ denote the corresponding operator

$$p(D) = a_m D^m + a_{m-1} D^{m-1} + \cdots + a_1 D + a_0,$$

i.e.,

$$p(D)u(t) = a_m u^{(m)}(t) + a_{m-1} u^{(m-1)}(t) + \cdots + a_1 u'(t) + a_0 u(t).$$

Then formally

$$[Lp(D)u](z) = p(z)Lu(z). \tag{8}$$

Thus to solve the differential equation

$$p(D)u = v$$

we want

$$p(z)Lu(z) = Lv(z).$$

From (6), this becomes

$$u(t) = \frac{1}{2\pi i}\int_C e^{zt} p(z)^{-1} Lv(z)\,dz. \tag{9}$$

As we shall see, all these purely formal manipulations can be justified.

Exercises

1. Show that the inversion formula (6) is valid for the functions $u_w$, i.e., if $a > \operatorname{Re} w$ and $C = \{z \mid \operatorname{Re} z = a\}$ then

$$\frac{1}{2\pi i}\int_C (z - w)^{-1} e^{zt}\,dz = e^{wt}, \quad t > 0; \qquad = 0, \quad t < 0.$$

2. Suppose $u\colon [0, \infty) \to \mathbb{C}$ is bounded and continuous and suppose the derivative $u'$ exists and is bounded and continuous on $(0, \infty)$. Show that

$$\int_0^{\infty} e^{-zt} u'(t)\,dt = z\int_0^{\infty} e^{-zt} u(t)\,dt - u(0),$$

$\operatorname{Re} z > 0$. Does this conflict with (7)?

2. The space ℒ

Recall that a function u: ℝ → ℂ is said to be smooth if each derivative
D^k u exists and is continuous at each point of ℝ, k = 0, 1, 2, .... In this
section we shall be concerned with smooth functions u which have the
property that each derivative of u approaches 0 very rapidly to the right.
To be precise, let ℒ be the set of all smooth functions u: ℝ → ℂ such that
for every integer k ≥ 0 and every a ∈ ℝ,

    \lim_{t \to +\infty} e^{at} D^k u(t) = 0.    (1)

This is equivalent to the requirement that for each integer k ≥ 0, each a ∈ ℝ,
and each M ∈ ℝ the function

    e^{at} D^k u(t)    (2)

is bounded on the interval [M, ∞). In fact (1) implies that (2) is bounded on
every such interval. Conversely if (2) is bounded on [M, ∞) then (1) holds
when a is replaced by any smaller number a' < a.
It follows that if u ∈ ℒ, then for each k, a, M we have that

    |u|_{k,a,M} = \sup \{ e^{at} |D^k u(t)| : t \in [M, \infty) \}    (3)

is finite. Conversely, if (3) is finite for every integer k ≥ 0 and every a ∈ ℝ,
M ∈ ℝ, then u ∈ ℒ.
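
The quantity (3) can be explored numerically. The sketch below is only an illustration under stated assumptions: the helper name seminorm, the finite grid, and the truncation of [M, ∞) to [M, M + T] are choices made here and are not part of the text. For u(t) = exp(-t^2) with k = 0, a = 3, M = -1, the supremum of e^{3t - t^2} is attained at t = 3/2 and equals e^{9/4}, which the grid estimate should reproduce.

```python
import math

def seminorm(dk_u, a, M, T=30.0, n=30000):
    """Grid estimate of sup over [M, M + T] of e^{at} |D^k u(t)|; dk_u evaluates D^k u.

    Assumption: the part of [M, oo) beyond M + T does not affect the supremum.
    """
    h = T / n
    return max(math.exp(a * (M + j * h)) * abs(dk_u(M + j * h)) for j in range(n + 1))

def gauss(t):
    # u(t) = exp(-t^2)
    return math.exp(-t * t)

def d_gauss(t):
    # Du(t) = -2t exp(-t^2)
    return -2.0 * t * math.exp(-t * t)

n0 = seminorm(gauss, a=3.0, M=-1.0)     # estimate of |u|_{0,3,-1}
n1 = seminorm(d_gauss, a=3.0, M=-1.0)   # estimate of |u|_{1,3,-1}
print(n0, math.exp(2.25), n1)
```

Both values are finite, as they must be for a function in ℒ; by contrast, for e_z(t) = e^{-zt} the same estimate grows without bound in T as soon as a > Re z.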
The set of functions ℒ is a vector space: it is easily checked that if u, v ∈ ℒ
and b ∈ ℂ then bu and u + v are in ℒ. Moreover,

    |bu|_{k,a,M} = |b| \, |u|_{k,a,M},    (4)
    |u + v|_{k,a,M} \le |u|_{k,a,M} + |v|_{k,a,M}.    (5)

A sequence of functions (u_n)_1^∞ ⊂ ℒ is said to converge to u ∈ ℒ in the
sense of ℒ if for each k, a, M,

    |u_n - u|_{k,a,M} \to 0  as  n \to \infty.

If so, we write

    u_n → u (ℒ).

The sequence (u_n)_1^∞ ⊂ ℒ is said to be a Cauchy sequence in the sense of ℒ
if for each k, a, M,

    |u_n - u_m|_{k,a,M} \to 0  as  m, n \to \infty.

As usual, a convergent sequence in this sense is a Cauchy sequence in this
sense. The converse is also true.

Theorem 2.1. ℒ is a vector space. It is complete with respect to convergence as defined by the expressions (3); i.e., if (u_n)_1^∞ ⊂ ℒ is a Cauchy sequence
in the sense of ℒ, then there is a unique u ∈ ℒ such that u_n → u (ℒ).

Proof. Let (u_n)_1^∞ be a Cauchy sequence in the sense of ℒ. Taking (3)
with a = 0, we see that each sequence of derivatives (D^k u_n)_1^∞ is a uniform
Cauchy sequence on each interval [M, ∞). It follows, by Theorem 4.1 of
Chapter 2, that there is a unique smooth function u such that D^k u_n → D^k u
uniformly on each [M, ∞). Now let a be arbitrary. Since (e^{at} D^k u_n)_1^∞ is also
a uniform Cauchy sequence on [M, ∞) it follows that this sequence converges
uniformly to e^{at} D^k u. Thus u ∈ ℒ and u_n → u (ℒ). □
It follows immediately from the definition of ℒ that certain operations
on functions in ℒ give functions in ℒ. In particular, this is true of differentiation:

    Du ∈ ℒ  if  u ∈ ℒ;

translation:

    T_s u ∈ ℒ  if  u ∈ ℒ,  where  (T_s u)(t) = u(t - s);

and complex conjugation:

    u* ∈ ℒ  if  u ∈ ℒ,  where  u*(t) = u(t)*.
It follows that if u ∈ ℒ, so are the real and imaginary parts:

    Re u = \tfrac{1}{2}(u + u^*),    Im u = \tfrac{i}{2}(u^* - u).

Moreover, if u ∈ ℒ, so is the integral of u taken from the right:

    S_+ u(t) = -\int_t^{\infty} u(s) \, ds.

In fact, D S_+ u = u so

    |S_+ u|_{k,a,M} = |u|_{k-1,a,M},    k \ge 1.    (6)

For k = 0, note that for t ≥ M, a > 0,

    |S_+ u(t)| \le \int_t^{\infty} |u(s)| \, ds \le |u|_{0,a,M} \int_t^{\infty} e^{-as} \, ds
               = a^{-1} |u|_{0,a,M} e^{-at}.

Thus

    |S_+ u|_{0,a,M} \le a^{-1} |u|_{0,a,M},    a > 0.    (7)

The finiteness of |S_+ u|_{0,a,M} for a ≤ 0 follows from finiteness for any a > 0. Thus S_+ u ∈ ℒ.

Lemma 2.2. The operations of differentiation, translation, complex conjugation, and integration are continuous from ℒ to ℒ with respect to convergence in the sense of ℒ. Moreover, if u ∈ ℒ then the difference quotient

    s^{-1}[T_{-s} u - u] → Du (ℒ)

as s → 0.
Proof. These statements chiefly involve routine verifications. We shall
prove the final statement. Given an integer k ≥ 0, let v = D^k u. Suppose
t ≥ M and 0 < |s| ≤ 1. The Mean Value Theorem implies

    s^{-1}[T_{-s} v(t) - v(t)] = Dv(r)

where |t - r| < |s|. Then

    D^k\{ s^{-1}[T_{-s} u(t) - u(t)] \} - D^k Du(t) = Dv(r) - Dv(t)
                                                   = D^{k+1} u(r) - D^{k+1} u(t).
But

(8)

-M.

The left side of (8) converges to zero as s → 0, uniformly on bounded intervals. It follows that

    |s^{-1}[T_{-s} u - u] - Du|_{k,a',M} \to 0

as s → 0, for any a' < a. □

The functions e_z,

    e_z(t) = e^{-zt},    t ∈ ℝ,    (9)

are not in ℒ for any z ∈ ℂ. However, they may be approximated by functions
from ℒ in a suitable sense.
Lemma 2.3. Suppose Re z > a. There is a sequence (u_n)_1^∞ ⊂ ℒ such that

    |u_n - e_z|_{k,a,M} \to 0  as  n \to \infty

for each integer k ≥ 0 and each M ∈ ℝ.

Proof. Choose a smooth function φ such that φ(t) = 1 if t ≤ 1 and
φ(t) = 0 if t ≥ 2. (The existence of such a function is proved in §8 of Chapter
2.) Let

    φ_n(t) = φ(t/n),    u_n = φ_n e_z.

Then u_n is smooth and vanishes for t ≥ 2n, so u_n ∈ ℒ. We shall consider in
detail only the (typical) case k = 1.

    e^{at}(Du_n(t) - De_z(t)) = e^{at}(φ_n(t) De_z(t) + e_z(t) Dφ_n(t) - De_z(t))
                              = (1 - φ_n(t)) z e^{(a-z)t} + Dφ_n(t) e^{(a-z)t}.

Now both 1 - φ_n(t) and Dφ_n(t) are bounded independent of t and n, and
vanish for t ≤ n. Since Re z > a, it follows that

    |e^{at}(Du_n(t) - De_z(t))| \le c \exp(na - n \operatorname{Re} z),

c independent of n and t. Thus

    |u_n - e_z|_{1,a,M} \to 0  as  n \to \infty,

all M. The argument for other values of k is similar. □
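
The decay rate in this proof can be observed numerically in the simplest case k = 0, where e^{at}|u_n(t) - e_z(t)| = (1 - φ_n(t)) e^{(a - Re z)t}. The sketch below is an illustration only. The particular cutoff φ, built from exp(-1/x), is one standard construction and is an assumption here (the text only uses the existence of such a function); the name error_k0 and the grid over [M, 4n] are likewise choices made for the example.

```python
import math

def psi(x):
    # Smooth on the real line: exp(-1/x) for x > 0, and 0 for x <= 0.
    return math.exp(-1.0 / x) if x > 0 else 0.0

def phi(t):
    """Smooth cutoff with phi = 1 for t <= 1 and phi = 0 for t >= 2."""
    return psi(2.0 - t) / (psi(2.0 - t) + psi(t - 1.0))

def error_k0(n, z, a, M=0.0, pts=20000):
    """Grid estimate of |u_n - e_z|_{0,a,M} on [M, 4n], with u_n(t) = phi(t/n) e^{-zt}."""
    T = 4.0 * n
    worst = 0.0
    for j in range(pts + 1):
        t = M + (T - M) * j / pts
        worst = max(worst, (1.0 - phi(t / n)) * math.exp((a - z.real) * t))
    return worst

z, a = 2.0 + 1.0j, 1.0
errors = {n: error_k0(n, z, a) for n in (2, 4, 8)}
for n, err in errors.items():
    # Compare with the bound exp(n a - n Re z) from the proof (with c = 1 for k = 0).
    print(n, err, math.exp(n * (a - z.real)))
```

Each estimate stays below exp(na - n Re z), and the errors decrease as n grows, in line with the lemma.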


The following lemma relates the |u|_{k,a,M} for different values of the indices
k, a, M.

Lemma 2.4. Suppose k, k' are integers, and

    0 ≤ k ≤ k',    a ≤ a',    M ≥ M'.

Suppose also either that k = k' or that a' > 0. Then there is a constant c such
that

    |u|_{k,a,M} \le c \, |u|_{k',a',M'},    all u ∈ ℒ.    (10)

Proof. It is sufficient to prove (10) in all cases when two of the three
indices are the same. The case k = k', a = a' is trivial. The case k = k',
M = M' is straightforward. Thus, suppose a = a' > 0 and M = M'. Let
k' = k + j and set v = D^{k'} u. We may obtain D^k u from v by repeated integrations:

    D^k u = (S_+)^j v.

We use (7) repeatedly to get

    |u|_{k,a,M} = |D^k u|_{0,a,M} \le a^{-j} |v|_{0,a,M} = a^{-j} |D^{k'} u|_{0,a,M} = a^{-j} |u|_{k',a,M}. □

If u ∈ ℒ, we set

    |u|_k = |u|_{k,k,-k} = \sup \{ |e^{kt} D^k u(t)| : t \ge -k \}.

Then the following is an easy consequence of Lemma 2.4.

Corollary 2.5. Suppose (u_n)_1^∞ ⊂ ℒ. Then

    u_n → u (ℒ)

if and only if for each integer k ≥ 0,

    |u_n - u|_k \to 0  as  n \to \infty.
Exercises

1. Show that u(t) = exp(-t^2) is in ℒ.
2. Show that if u, v ∈ ℒ, then the product uv is in ℒ.
3. Show that u ∈ ℒ, z ∈ ℂ implies e_z u ∈ ℒ.
4. Complete the proof of Lemma 2.2.

3. The space ℒ'

A linear functional on the vector space ℒ is a function F: ℒ → ℂ such
that

    F(au) = aF(u),    F(u + v) = F(u) + F(v).

A linear functional F on ℒ is said to be continuous if

    u_n → u (ℒ)

implies

    F(u_n) → F(u).

The set of all continuous linear functionals on ℒ will be denoted ℒ'. An
element F ∈ ℒ' will be called a distribution of type ℒ', or simply a distribution.
An example is the δ-distribution defined by

    δ(u) = u(0).    (1)

A second class of examples is given by

    F(u) = \int_0^{\infty} e^{zt} u(t) \, dt,    z ∈ ℂ.    (2)

Suppose f: ℝ → ℂ is a continuous function such that for some a ∈ ℝ,
M ∈ ℝ,

    f(t) = 0,    t ≤ M,    (3)
    e^{-at} f(t) is bounded.    (4)

We may define F_f: ℒ → ℂ by

    F_f(u) = \int_{-\infty}^{\infty} f(t) u(t) \, dt.    (5)

In fact the integrand is continuous and vanishes for t ≤ M. If a' > a then
on [M, ∞) we have

    |f(t) u(t)| = |e^{-at} f(t)| \, |e^{(a-a')t}| \, |e^{a't} u(t)|
                \le c \, |u|_{0,a',M} \, e^{(a-a')t},

where c is a bound for |e^{-at} f(t)|. Therefore the integral (5) exists and

    |F_f(u)| \le c \, |u|_{0,a',M} \int_M^{\infty} e^{(a-a')t} \, dt.    (6)

It follows from (6) that F_f is continuous on ℒ, i.e., F_f ∈ ℒ'.
We say that F ∈ ℒ' is a function, or is defined by a function, if there is a
continuous function f satisfying (3) and (4), such that F = F_f.
Suppose F ∈ ℒ' is defined by f. The translates T_s f and the complex
conjugate function f* also define distributions. It is easy to check that if
g = T_s f, i.e., g(t) = f(t - s), t ∈ ℝ, then

    F_g(u) = F_f(T_{-s} u),    u ∈ ℒ.

Similarly,

    F_{f^*}(u) = F_f(u^*)^*,    u ∈ ℒ.

We shall define the translates and the complex conjugate of any arbitrary
F ∈ ℒ' by

    (T_s F)(u) = F(T_{-s} u),    u ∈ ℒ,    (7)
    F^*(u) = F(u^*)^*,    u ∈ ℒ.    (8)

Similarly, the real and imaginary parts of F ∈ ℒ' are defined by

    Re F = \tfrac{1}{2}(F + F^*),    (9)
    Im F = \tfrac{i}{2}(F^* - F).    (10)

We say that F is real if F = F*.


Suppose F ∈ ℒ' is defined by f and suppose the derivative Df exists, is
continuous, and satisfies (3) and (4). Integration by parts gives

    F_{Df}(u) = -F_f(Du),    u ∈ ℒ.

Therefore we shall define the derivative DF of an arbitrary F ∈ ℒ' by

    DF(u) = -F(Du),    u ∈ ℒ.    (11)

Generally, for any integer k ≥ 0 we define

    D^k F(u) = (-1)^k F(D^k u),    u ∈ ℒ.    (12)

Proposition 3.1. The set ℒ' is a vector space. If F ∈ ℒ', then the translates
T_s F, the complex conjugate F*, and the derivatives D^k F are in ℒ'.
Proof. All these statements follow easily from the definitions and the
continuity of the operations in ℒ; see Lemma 2.2. For example, D^k F: ℒ → ℂ
is certainly linear. If u_n → u (ℒ), then

    D^k u_n → D^k u (ℒ),

so

    D^k F(u_n) = (-1)^k F(D^k u_n) \to (-1)^k F(D^k u) = D^k F(u).

Thus D^k F is continuous. □
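
The definitions (1), (2), (11) and (12) translate directly into code if a distribution is modelled as a function acting on test functions. The following sketch is a toy model under stated assumptions, not the text's construction: a test function is represented by a Python callable v(t, k) returning its k-th derivative, only the single test function u(t) = exp(-t^2) is supplied (its derivatives come from the Hermite recurrence), and the integral in (2) is truncated to [0, T] and evaluated by Simpson's rule. The names hermite, delta, D and F_exp are hypothetical helpers.

```python
import cmath
import math

def hermite(k, t):
    """Physicists' Hermite polynomial H_k(t), via H_{j+1} = 2t H_j - 2j H_{j-1}."""
    h_prev, h = 0.0, 1.0
    for j in range(k):
        h_prev, h = h, 2.0 * t * h - 2.0 * j * h_prev
    return h

def u(t, k=0):
    """k-th derivative of the test function u(t) = exp(-t^2)."""
    return (-1) ** k * hermite(k, t) * math.exp(-t * t)

def delta(v):
    """The delta-distribution (1): delta(v) = v(0)."""
    return v(0.0)

def D(F):
    """Derivative of a distribution, following (11): (DF)(v) = -F(Dv)."""
    return lambda v: -F(lambda t, k=0: v(t, k + 1))

def F_exp(z, T=12.0, n=24000):
    """The functional (2), F(v) = int_0^oo e^{zt} v(t) dt, truncated to [0, T]."""
    def F(v):
        h = T / n
        total = v(0.0) + cmath.exp(z * T) * v(T)
        for j in range(1, n):
            t = j * h
            total += (4 if j % 2 else 2) * cmath.exp(z * t) * v(t)
        return total * h / 3
    return F

second = D(D(delta))(u)      # (D^2 delta)(u) = (-1)^2 u''(0) = u''(0)
z = 0.5 + 1.0j
F = F_exp(z)
gap = abs(D(F)(u) - (delta(u) + z * F(u)))
print(second, gap)
```

Here second agrees with u''(0) = -2, and gap is small, which is consistent with the identity DF = δ + zF of Exercise 2 below for this one test function; a numerical check of this kind is of course not a proof.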

A sequence (F_n)_1^∞ ⊂ ℒ' is said to converge to F ∈ ℒ' in the sense of ℒ'
if for each u ∈ ℒ

    F_n(u) → F(u)  as  n → ∞.

We denote this by

    F_n → F (ℒ').

The operations defined above are continuous with respect to this notion of
convergence.

Proposition 3.2. Suppose

    F_n → F (ℒ'),    G_n → G (ℒ'),

and suppose a ∈ ℂ, s ∈ ℝ. Then

    aF_n → aF (ℒ'),
    F_n + G_n → F + G (ℒ'),
    T_s F_n → T_s F (ℒ'),
    D^k F_n → D^k F (ℒ').

Moreover, the difference quotient

    s^{-1}[T_{-s} F - F] → DF (ℒ')

as s → 0.
Proof. All except the last statement follow immediately from the definitions. To prove the last statement we use Lemma 2.2:

    s^{-1}[T_{-s} F - F](u) = F(s^{-1}[T_s u - u]) \to F(-Du) = (DF)(u). □

The following theorem gives a very useful necessary and sufficient condition for a linear functional on ℒ to be continuous.
Theorem 3.3. Suppose F: ℒ → ℂ is linear. Then F is continuous if and
only if there are an integer k ≥ 0 and constants a, M, K ∈ ℝ such that

    |F(u)| \le K \, |u|_{k,a,M},    all u ∈ ℒ.    (13)

Proof. Suppose (13) is true. If u_n → u (ℒ) then

    |F(u_n) - F(u)| \le K \, |u_n - u|_{k,a,M} \to 0.

Thus F is continuous.
To prove the converse, suppose that (13) is not true for any k, a, M, K.
In particular, for each positive integer k we may find a v_k ∈ ℒ such that

    |F(v_k)| \ge k \, |v_k|_k = k \, |v_k|_{k,k,-k} > 0.

Let u_k ∈ ℒ be

    u_k = (k \, |v_k|_k)^{-1} v_k.

Then

    |u_k|_k = k^{-1},    |F(u_k)| \ge 1.

But Corollary 2.5 implies that

    u_k → 0 (ℒ).

Since F(u_k) does not converge to 0, F is not continuous. □

The support of a function u: ℝ → ℂ is defined as the smallest closed subset
A of ℝ such that u(t) = 0 for every t ∉ A. Another way of phrasing this is
that t is not in the support of u if and only if there is an ε > 0 such that u is
zero on the interval (t - ε, t + ε). The support of u is denoted

    supp (u).

Condition (3) on a function f can be written

    supp (f) ⊂ [M, ∞).

The support of a distribution F ∈ ℒ' can be defined similarly. A point
t ∈ ℝ is not in the support of F if and only if there is an ε > 0 such that

    F(u) = 0

whenever u ∈ ℒ and supp (u) ⊂ (t - ε, t + ε). We denote the support of F
also by

    supp (F).

Theorem 3.3 implies that any F ∈ ℒ' has support in a half line.
Corollary 3.4. If F ∈ ℒ', there is M ∈ ℝ such that

    supp (F) ⊂ [M, ∞).

Proof. Choose k, a, M such that (13) is true for some K. If u ∈ ℒ and

    supp (u) ⊂ (-∞, M),

then

    |u|_{k,a,M} = 0,

so F(u) = 0. Therefore each t < M is not in the support of F. □

Exercises

1. Compute the following in the case F = δ:

    T_s F(u),    F*(u),    Re F,    Im F.

2. Show that if F is given by (2), then

    DF = δ + zF.

3. Suppose F = D^j δ. For what constants k, a, M, K is (13) true?
4. Find supp (D^k δ).
5. Prove that if F is defined by a function f, then
supp (F) = supp (f).

4. Characterization of distributions of type ℒ'

If f: ℝ → ℂ is a function satisfying the two conditions (3) and (4) of §3
and if k is an integer ≥ 0, then

    D^k F_f    (1)

is a distribution of type ℒ'. In this section we shall prove that, conversely,
any F ∈ ℒ' is of the form (1) for some k and some function f. The proof
depends on two notions: the order of a distribution and the integral (from
the left) of a distribution.
A distribution F ∈ ℒ' is said to be of order k if (13) of §3 is true, i.e., if
for some real constants a, M, and K,

    |F(u)| \le K \, |u|_{k,a,M},    all u ∈ ℒ.    (2)

By Theorem 3.3, each F ∈ ℒ' is of order k for some k ≥ 0.


Suppose F ∈ ℒ' is defined by a function f. Let g be the integral of f from
the left:

    g(t) = \int_{-\infty}^{t} f(s) \, ds = \int_M^{t} f(s) \, ds,

where supp (f) ⊂ [M, ∞). If u ∈ ℒ and v = S_+ u, then Dv = u. Integration
by parts gives

    F_g(u) = \int_{-\infty}^{\infty} g(t) v'(t) \, dt = -\int_{-\infty}^{\infty} g'(t) v(t) \, dt = -F_f(S_+ u).

For an arbitrary F ∈ ℒ' we define the integral of F (from the left), denoted
S_- F, by

    S_- F(u) = -F(S_+ u),    u ∈ ℒ.

Proposition 4.1. If F ∈ ℒ' then the integral S_- F is in ℒ' and

    D(S_- F) = F = S_-(DF).    (3)

If F is of order k ≥ 1, then S_- F is of order k - 1.

Proof. Clearly S_- F is linear. The continuity follows from the definition
and the fact that S_+ is continuous in ℒ. The identity (3) is a matter of
manipulation:

    D(S_- F)(u) = -S_- F(Du) = F(S_+ Du) = F(u),    u ∈ ℒ,

and similarly for the other part.

To prove that every F ∈ ℒ' is of the form (1), we want to integrate F
enough times to get a distribution defined by a function. To motivate this,
we consider first a function f: ℝ → ℂ such that the first and second derivatives are
continuous and such that

    supp (f) ⊂ [M, ∞),

some M. Then by integrating twice and changing the order of integration
(see §7 of Chapter 2) we get

    f(t) = \int_{-\infty}^{t} Df(s) \, ds = \int_{-\infty}^{t} \int_{-\infty}^{s} D^2 f(r) \, dr \, ds
         = \int_{-\infty}^{t} \int_{r}^{t} D^2 f(r) \, ds \, dr
         = \int_{-\infty}^{t} (t - r) D^2 f(r) \, dr.

Let

    h(t) = |t|,  t ≤ 0,
         = 0,    t > 0.    (4)

Then our equation is

    f(t) = \int_{-\infty}^{\infty} h(r - t) D^2 f(r) \, dr

or

    f(t) = \int_{-\infty}^{\infty} D^2 f(r) \, T_t h(r) \, dr.    (5)

We would like to interpret (5) as the action of the distribution defined
by D^2 f on the function T_t h; however T_t h is not in ℒ. Nevertheless, h can be
approximated by elements of ℒ.
Lemma 4.2. Let h be defined by (4). There is a sequence (h_n)_1^∞ ⊂ ℒ such
that

    |h_n - h|_{0,a,M} \to 0  as  n \to \infty    (6)

for each a ∈ ℝ, M ∈ ℝ.

Proof. There is a smooth function φ: ℝ → ℝ such that 0 ≤ φ(t) ≤ 1 for
all t and φ(t) = 1, t ≤ -2, φ(t) = 0, t ≥ -1; see §8 of Chapter 2. Let

    h_n(t) = φ(nt) h(t) = φ_n(t) h(t).

Then h_n is smooth, since φ_n is zero in an interval around 0 and h is smooth
except at 0. Also h_n(t) = h(t) except in the interval (-2/n, 0), and

    |h_n(t) - h(t)| ≤ 2/n,    t ∈ (-2/n, 0).

Thus h_n ∈ ℒ and (6) is true. □
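
The bound |h_n(t) - h(t)| ≤ 2/n can be checked on a grid. This is a minimal sketch under assumptions: the cutoff φ is again built from exp(-1/x), which is one standard choice and not necessarily the function used in Chapter 2, the scaling φ_n(t) = φ(nt) is the one for which h_n = h outside (-2/n, 0), and the helper names are hypothetical.

```python
import math

def psi(x):
    # Smooth on the real line: exp(-1/x) for x > 0, and 0 for x <= 0.
    return math.exp(-1.0 / x) if x > 0 else 0.0

def phi(t):
    """Smooth cutoff with phi = 1 for t <= -2 and phi = 0 for t >= -1."""
    return psi(-1.0 - t) / (psi(-1.0 - t) + psi(t + 2.0))

def h(t):
    """The function (4): h(t) = |t| for t <= 0 and h(t) = 0 for t > 0."""
    return abs(t) if t <= 0 else 0.0

def h_n(n, t):
    """Smooth approximation h_n(t) = phi(n t) h(t)."""
    return phi(n * t) * h(t)

def sup_diff(n, pts=20000):
    """Grid estimate of sup |h_n - h| over [-3/n, 1/n], which contains (-2/n, 0)."""
    lo, hi = -3.0 / n, 1.0 / n
    grid = (lo + (hi - lo) * j / pts for j in range(pts + 1))
    return max(abs(h_n(n, t) - h(t)) for t in grid)

for n in (1, 5, 25):
    print(n, sup_diff(n), 2.0 / n)
```

Since h_n - h vanishes outside the bounded interval (-2/n, 0), multiplying by the weight e^{at} changes the bound only by a factor that stays bounded as n grows, which is why (6) holds for every a and M.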

Theorem 4.3. Suppose F ∈ ℒ' is of order k - 2, where k is an integer
≥ 2. Then there is a unique function f such that

    F = D^k(F_f).

Proof. Suppose first that k = 2, F is of order 0. Let h be the function
defined by (4) and let (h_n)_1^∞ ⊂ ℒ be as in Lemma 4.2. Choose a, M, K ∈ ℝ
such that

    |F(u)| \le K \, |u|_{0,a,M},    all u ∈ ℒ.    (7)

We may suppose a ≥ 0. For each s ∈ ℝ the translates T_s h_n also converge to T_s h:

    |T_s h_n - T_s h|_{0,a,M} \to 0  as  n \to \infty.    (8)

It follows from (7) and (8) that for each s ∈ ℝ

    F(T_s h_n) converges as n → ∞.

Let f(s) be the limit of this sequence. Then

    |f(s)| \le \lim_{n \to \infty} K \, |T_s h_n|_{0,a,M} = K \, |T_s h|_{0,a,M}.    (9)

But

    |T_s h|_{0,a,M} = 0                 if s ≤ M,    (10)
    |T_s h|_{0,a,M} \le (s - M) e^{as}  if s > M.    (11)

Thus

    supp (f) ⊂ [M, ∞)

and for any a' > a there is a constant c such that

    |f(s)| \le c \, e^{a's},    all s ∈ ℝ.

If f is continuous, it follows that f defines a distribution F_f ∈ ℒ'. Suppose
s < t. Taking limits we get

    |f(t) - f(s)| \le K \, |T_t h - T_s h|_{0,a,M}
                  \le K e^{at} (t - s).

Thus f is continuous.

Let f_n(s) = F(T_s h_n). Then f_n(s) = 0 if s ≤ M. For s > M,

    |f_n(s) - f(s)| \le K \, |T_s h_n - T_s h|_{0,a,M}
                    \le 2K e^{as} (s - M)/n.

Therefore if u ∈ ℒ,

    G_n(u) \to F_f(u)  as  n \to \infty,

where G_n is the distribution defined by f_n. Then

Let v_n be the function defined by

    v_n(t) = \int_{-\infty}^{\infty} T_s h_n(t) \, D^2 u(s) \, ds = \int_{-\infty}^{\infty} h_n(t - s) \, D^2 u(s) \, ds.

Since D^2 u ∈ ℒ, and h_n(t - s) = 0 if s ≤ t, the integral converges. Moreover,
it is not difficult to see that the integral is the limit of its Riemann sums

    v_{n,N}(t) = \frac{1}{N} \sum_{m=-N^2}^{N^2} h_n(t - m/N) \, D^2 u(m/N),

in the sense that

    |v_{n,N} - v_n|_{0,a,M} \to 0  as  N \to \infty.

In fact, for t

M,

:0;

i'"

It - sle- as ds =

~ e- at,

where c and c' are independent of t and N. Therefore


F(v n ) = lim
F(v n N)
N

Now let

    v(t) = \int_{-\infty}^{\infty} h(t - s) \, D^2 u(s) \, ds.

Then

    |v_n - v|_{0,a,M} \to 0  as  n \to \infty.

In fact, for t

Therefore

    D^2 F_f(u) = F_f(D^2 u) = \lim G_n(D^2 u) = \lim F(v_n) = F(v).

But

    v(t) = \int_t^{\infty} (s - t) D^2 u(s) \, ds = -\int_t^{\infty} Du(s) \, ds = u(t).

Thus D^2 F_f = F.

Now suppose F is of order k - 2 > 0. Let G = (S_-)^{k-2} F. Then G is of
order 0, and by what we have just shown, there is an f such that

    D^2 F_f = G.

But then

    D^k F_f = D^{k-2} G = F.

Finally, we must prove uniqueness. This is equivalent to showing that
DF = 0 implies F = 0. But F = S_-(DF), so this is the case. □

Exercises

1. Show that D^k δ is of order k but not of order k - 1.
2. Find the function f of Theorem 4.3 when F = δ. Compute DF_f = S_- δ.
3. Show that if supp (F) ⊂ [M, ∞), then supp (S_- F) ⊂ [M, ∞), and
conversely.
4. Suppose F ∈ ℒ' and the support of F consists of the single point 0.
Show that F is of the form

    F = \sum_{k=0}^{m} a_k D^k δ,

where the a_k's are constants.

5. Laplace transforms of functions

Suppose that f is a function which defines a distribution of type ℒ', i.e.,
f: ℝ → ℂ is continuous, and

    supp (f) ⊂ [M, ∞),    (1)
    |f(t)| \le K e^{at},    all t.    (2)

If z ∈ ℂ, let e_z be as before:

    e_z(t) = e^{-zt},    t ∈ ℝ.

If Re z = b > a then

    |f(t) e_z(t)| \le K e^{(a-b)t}.

Since f(t) = 0 for t < M, the integral

    \int_{-\infty}^{\infty} f(t) e_z(t) \, dt = \int_M^{\infty} f(t) e_z(t) \, dt    (3)

exists when Re z > a. The Laplace transform of the function f is the function
Lf defined by (3):

    Lf(z) = \int_{-\infty}^{\infty} e^{-zt} f(t) \, dt,    Re z > a.    (4)

Theorem 5.1. Suppose f: ℝ → ℂ is continuous and satisfies (1) and (2).
Then the Laplace transform Lf is holomorphic in the half plane

    {z | Re z > a}.

The derivative is

    (Lf)'(z) = -\int_{-\infty}^{\infty} e^{-zt} \, t f(t) \, dt.    (5)

The Laplace transform satisfies the estimate

    |Lf(z)| \le K (\operatorname{Re} z - a)^{-1} \exp(M(a - \operatorname{Re} z)),    Re z > a.    (6)

Proof.

    (w - z)^{-1}[Lf(w) - Lf(z)] = \int_M^{\infty} g(w, z, t) f(t) \, dt

where

    g(w, z, t) = (w - z)^{-1}[e^{-wt} - e^{-zt}].

Suppose Re z and Re w are ≥ b > a. Let

    h(s) = \exp\{[-(1 - s)z - sw]t\},    0 ≤ s ≤ 1.

Then

    g(w, z, t) = (w - z)^{-1}[h(1) - h(0)].

An application of the Mean Value Theorem to the real and imaginary parts
of h shows that

    |h(1) - h(0)| \le c \, |w - z| \, e^{-b't},    t ≥ 0

where b' = b if b > 0, b' = max {Re w, Re z} otherwise. Thus as w → z,

    |g(w, z, t) f(t)| \le c_1 e^{-εt},

where ε > 0. Moreover,

    g(w, z, t) f(t) \to -t e^{-zt} f(t)

as w → z, uniformly on each interval [M, N]. It follows that Lf is differentiable and that (5) is true.

The estimate (6) follows easily from (1) and (2):

    |Lf(z)| \le \int_M^{\infty} |f(t) e^{-zt}| \, dt
            \le K \int_M^{\infty} \exp t(a - \operatorname{Re} z) \, dt
            = K (\operatorname{Re} z - a)^{-1} \exp(M(a - \operatorname{Re} z)). □

We want next to invert the process: determine f, given Lf.
Theorem 5.2. Suppose f satisfies the conditions of Theorem 5.1, and let
g = Lf. Given b > max {a, 0}, let C be the line

    {z | Re z = b}.

Then f is the second derivative of the function F defined by

    F(t) = \frac{1}{2\pi i} \int_C e^{zt} z^{-2} g(z) \, dz.    (7)

Proof. By (6), g is bounded on the line C. Therefore the integral (7)
exists. Moreover, if

    g_N(z) = \int_{-N}^{N} e^{-zs} f(s) \, ds,

then the g_N are bounded uniformly on the line C and converge uniformly to
g. Thus F(t) is the limit as N → ∞ of F_N, where

    F_N(t) = \frac{1}{2\pi i} \int_C e^{zt} z^{-2} g_N(z) \, dz
           = \int_{-N}^{N} \Big\{ \frac{1}{2\pi i} \int_C e^{z(t-s)} z^{-2} \, dz \Big\} f(s) \, ds.

Let us consider the integral in braces. When s > t the integrand is holomorphic to the right of C and has modulus ≤ k|z|^{-2} for some constant k.
Let C_R be the curve consisting of the segment {Re z = b | |z - b| ≤ R} and
the semicircle {Re z > b | |z - b| = R}. Then the integral of

    e^{z(t-s)} z^{-2}

over C_R in the counterclockwise direction is zero, and the limit as R → ∞
is

    -\int_C e^{z(t-s)} z^{-2} \, dz.

Thus the integral in braces vanishes for s > t. When s < t, let C'_R be the
reflection of C_R about the line C. Then for R > b,

    \frac{1}{2\pi i} \int_{C'_R} e^{z(t-s)} z^{-2} \, dz = t - s.

Taking the limit as R → ∞ we get

    \frac{1}{2\pi i} \int_C e^{z(t-s)} z^{-2} \, dz = t - s,    s < t.

Thus

    F(t) = \int_{-\infty}^{t} (t - s) f(s) \, ds.

It follows that D^2 F = f. □
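
The key step of this proof is the value of the integral in braces: t - s when s < t and 0 when s > t. A rough numerical check is sketched below. It parametrizes the line C by z = b + iy, so that dz = i dy, truncates to |y| ≤ Y, and applies the trapezoid rule. The name kernel and the values of b, Y and the grid size are assumptions of the example; since the integrand decays only like |z|^{-2}, the truncation error is of order e^{bτ}/(πY), so only a few digits should be expected.

```python
import cmath
import math

def kernel(tau, b=1.0, Y=1000.0, n=200000):
    """Approximate (1/(2 pi i)) int_C e^{z tau} z^{-2} dz on the line Re z = b.

    With z = b + i y the integral becomes (1/(2 pi)) int e^{z tau} z^{-2} dy,
    truncated here to -Y <= y <= Y and evaluated by the trapezoid rule.
    """
    h = 2.0 * Y / n
    total = 0.0 + 0.0j
    for j in range(n + 1):
        z = complex(b, -Y + j * h)
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp(z * tau) / (z * z)
    return (total * h / (2.0 * math.pi)).real

k_pos = kernel(1.5)     # tau = t - s > 0: expected value close to tau
k_neg = kernel(-1.0)    # tau = t - s < 0: expected value close to 0
print(k_pos, k_neg)
```

The two cases correspond to closing the contour to the left, where the double pole at z = 0 contributes the residue τ, and to the right, where the integrand is holomorphic.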

We can get a partial converse of Theorem 5.1.

Theorem 5.3. Suppose g is holomorphic in the half plane

    {z | Re z > a}

and satisfies the inequality

    |g(z)| \le K (1 + |z|)^{-2} \exp(-M \operatorname{Re} z).    (8)

Then there is a unique continuous function f: ℝ → ℂ with the properties

    supp f ⊂ [M', ∞),    some M',    (9)
    |f(t)| \le K e^{bt},    all t, for some b,    (10)
    Lf(z) = g(z)    for Re z > b.    (11)

Moreover, we may take M' = M in (9) and any b > a in (10) and (11).

Proof. Choose b > a and let C(b) be the line {z | Re z = b}. Let

    f(t) = \frac{1}{2\pi i} \int_{C(b)} e^{zt} g(z) \, dz.    (12)

It follows from (8) that the integral exists and defines a continuous function.
It follows from (8) and an elementary contour integration argument that (12)
is independent of b, provided b > a. Moreover, (8) gives the estimate

    |f(t)| \le C e^{b(t-M)},    b > a,    (13)

where C is independent of b and t. This implies (10). If t < M we may take
b → +∞ in (13) and get

    f(t) = 0,    t < M.

Thus the Laplace transform of f can be defined for Re z > a. If Re w > a,
choose

    a < b < Re w.

Then

    Lf(w) = \int_M^{\infty} e^{-wt} \Big\{ \frac{1}{2\pi i} \int_{C(b)} e^{zt} g(z) \, dz \Big\} \, dt.

Since

    |e^{-wt} e^{zt} g(z)| \le c \, (1 + |z|)^{-2} \exp[-Mb + t(b - \operatorname{Re} w)]

for t ∈ ℝ and z ∈ C(b), we may interchange the order of integration. This gives

    Lf(w) = \frac{1}{2\pi i} \int_{C(b)} g(z) \Big\{ \int_M^{\infty} e^{(z-w)t} \, dt \Big\} \, dz
          = -\frac{1}{2\pi i} \int_{C(b)} g(z) e^{(z-w)M} (z - w)^{-1} \, dz.

In the half plane Re z > b we have

    |g(z) e^{(z-w)M}| \le c \, (1 + |z|)^{-2}.

Therefore a contour integration argument and the Cauchy integral formula
give

    Lf(w) = g(z) e^{(z-w)M} |_{z=w} = g(w).

Finally, we must show that f is unique. This is equivalent to showing that
Lf ≡ 0 implies f ≡ 0. But this follows from Theorem 5.2. □

Exercises

1. Let f(t) = 0, t < 0; f(t) = t e^{at}, t > 0. Compute Lf and verify that

    f(t) = \frac{1}{2\pi i} \int_C e^{zt} Lf(z) \, dz,

where C is an appropriate line.
2. Show that the Laplace transform of the translate of a function f
satisfies

    L(T_s f)(z) = e^{-zs} Lf(z).

3. Suppose both f and Df are functions satisfying (1) and (2). Show that

    L(Df)(z) = z Lf(z).

4. Compute Lf when

    f(t) = 0,  t < 0;    f(t) = t^n,  t > 0,

where n is a positive integer.
5. Suppose f satisfies (1) and (2), and let g = e_w f. Show that

    Lg(z) = Lf(z + w).

6. Compute Lf when

    f(t) = 0,  t < 0;    f(t) = e^{wt} t^n,  t > 0.

7. Compute Lf when

    f(t) = 0,  t < 0;    f(t) = \int_0^t \sin s \, ds,  t > 0.

6. Laplace transforms of distributions

Suppose F ∈ ℒ'. Theorem 3.3 states that there are constants k, a, M, K
such that

    |F(u)| \le K \, |u|_{k,a,M},    all u ∈ ℒ.    (1)

If Re z > a, then Lemma 2.3 states that there is a sequence (u_n)_1^∞ ⊂ ℒ such
that

    |u_n - e_z|_{k,a,M} \to 0  as  n \to \infty,    (2)

where e_z(t) = e^{-zt}. Now (1) and (2) imply that (F(u_n))_{n=1}^∞ is a Cauchy
sequence in ℂ. We shall define the Laplace transform LF by

    LF(z) = \lim_{n \to \infty} F(u_n).    (3)

In view of (2) we shall write, symbolically,

    LF(z) = F(e_z)    (4)

even though e_z ∉ ℒ. Note that if (v_n)_1^∞ ⊂ ℒ and

    |v_n - e_z|_{k,a,M} \to 0  as  n \to \infty

then

    |F(v_n) - F(u_n)| \le K \, |v_n - u_n|_{k,a,M} \to 0

as n → ∞. Thus LF(z) is independent of the particular sequence used to
approximate e_z.
Proposition 6.1. Suppose F, G ∈ ℒ' and b ∈ ℂ. Then on the common
domain of definition

    L(bF) = b \, LF;    (5)
    L(F + G) = LF + LG;    (6)
    L(T_s F)(z) = e^{-zs} LF(z);    (7)
    L(D^k F)(z) = z^k LF(z);    (8)
    L(S_- F)(z) = z^{-1} LF(z),    z ≠ 0.    (9)

Moreover,

    L(e_w F)(z) = LF(z + w),    (10)

where e_w F is the distribution defined by

    e_w F(u) = F(e_w u),    u ∈ ℒ.    (11)

If F is defined by a function f, then

    LF = Lf.    (12)

Proof. The identities (5) and (6) follow immediately from the definitions.
If (u_n)_1^∞ ⊂ ℒ satisfies (2) then the sequence (T_{-s} u_n)_1^∞ approximates T_{-s} e_z in
the same sense. But

    T_{-s} e_z = e^{-zs} e_z,

so

    L(T_s F)(z) = \lim T_s F(u_n) = \lim F(T_{-s} u_n)
                = e^{-zs} \lim F(u_n) = e^{-zs} LF(z).

This proves (7), and the proofs of (8), (9) and (10) are similar. Note that
u ∈ ℒ implies e_w u ∈ ℒ and

    u_n → u (ℒ)

implies

    e_w u_n → e_w u (ℒ).

Therefore (11) does define a distribution.
Finally, (12) follows from the definitions. □
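
In the case where F is defined by a function, so that LF = Lf by (12), the rules (7) and (9) can be tested numerically. The following sketch makes several assumptions for the sake of a concrete example: f(t) = sin t for t ≥ 0, its integral from the left g(t) = 1 - cos t, the shift s = 0.75, the point z, and the helper laplace, which integrates over a truncated interval starting at the left end of the support.

```python
import cmath
import math

def laplace(f, z, lo, T=80.0, n=80000):
    """Simpson approximation of int_lo^{lo+T} e^{-zt} f(t) dt (n even)."""
    h = T / n
    total = cmath.exp(-z * lo) * f(lo) + cmath.exp(-z * (lo + T)) * f(lo + T)
    for j in range(1, n):
        t = lo + j * h
        total += (4 if j % 2 else 2) * cmath.exp(-z * t) * f(t)
    return total * h / 3

def f(t):
    # Sample function with support in [0, oo).
    return math.sin(t) if t >= 0 else 0.0

def g(t):
    # Integral of f from the left: g(t) = int_0^t sin r dr = 1 - cos t.
    return 1.0 - math.cos(t) if t >= 0 else 0.0

s = 0.75
z = 1.0 + 2.0j

Lf = laplace(f, z, 0.0)
L_shift = laplace(lambda t: f(t - s), z, s)     # T_s f has support in [s, oo)
L_int = laplace(g, z, 0.0)

err_translate = abs(L_shift - cmath.exp(-z * s) * Lf)   # rule (7)
err_integral = abs(L_int - Lf / z)                      # rule (9)
print(err_translate, err_integral)
```

Both discrepancies are at the level of the quadrature error, as (7) and (9) predict for this example.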

We can now generalize Theorems 5.1 and 5.2 to distributions.

Theorem 6.2. Suppose F ∈ ℒ' and suppose F satisfies (1). Then the
Laplace transform LF is holomorphic in the half plane

    {z | Re z > a}.

Moreover,

    F = D^{k+2} F_f,    (13)

where f is the function defined by

    f(t) = \frac{1}{2\pi i} \int_C e^{zt} z^{-k-2} LF(z) \, dz.    (14)

Here C is the line {z | Re z = b}, where b > max {a, 0}.

Proof. We know by Theorem 4.3 that there is a function f such that

    F = D^{k+2} F_f.

It was shown in the proof of Theorem 4.3 that for any b > max {a, 0} there
is a constant c such that

    |f(t)| \le c \, e^{bt},    all t.

Therefore Lf is holomorphic for Re z > max {a, 0}. By Proposition 6.1,

    LF(z) = z^{k+2} Lf(z),    Re z > max {a, 0}.    (15)

Therefore LF is holomorphic in this half plane. This completes the proof
of the first statement in the case a ≥ 0. When a < 0, let G = e_a F. Then (1)
implies

Thus by the argument just given, LG is holomorphic for Re z > 0. Since
LF(z) = LG(z - a), LF is holomorphic for Re z > a.
Now let C be the line {z | Re z = b}, where b > max {a, 0}. Let f be the
function such that (13) is true. Then by Theorem 5.2 and equation (15), f is
the second derivative of the function

    g(t) = \frac{1}{2\pi i} \int_C e^{zt} z^{-k-4} LF(z) \, dz.    (16)

From the definition of LF it follows that on C

    |LF(z)| \le c \, (1 + |z|)^k.    (17)

Using (17) we may justify differentiating (16) twice under the integral sign
to get (14). □
Theorem 6.2 implies, in particular, that if LF ≡ 0 then F = 0.
Given a holomorphic function g, how can one tell whether it is the
Laplace transform of a distribution?

Theorem 6.3. Suppose g is holomorphic in a half plane

    {z | Re z > a}.

Then g is the Laplace transform of a distribution F ∈ ℒ' if and only if there
are constants k, a, M, K_1 such that

    |g(z)| \le K_1 (1 + |z|)^k \exp(-M \operatorname{Re} z),    Re z > a.    (18)

Proof. Suppose g = LF, where F ∈ ℒ'. Then there are k, a, M, K such
that (1) is true. Then Re z > a implies

    |g(z)| = |F(e_z)| \le K \, |e_z|_{k,a,M} \le K_1 (1 + |z|)^k \exp(-M \operatorname{Re} z),

where K_1 = K e^{aM}.
Conversely, suppose (18) is true. Take b > max {a, 0}. We may apply
Theorem 5.3 to

    h(z) = z^{-k-2} g(z)

to conclude that

    h = Lf,

where f is continuous,

    supp f ⊂ [M, ∞),    |f(t)| \le c \, e^{bt}.

Let F = D^{k+2} F_f. Then

    LF(z) = z^{k+2} h(z) = g(z),    Re z > b.

Since this is true whenever b > max {a, 0}, the proof is complete in the case
a ≥ 0.

When a < 0, let

    g_1(z) = g(z + a).

Then g_1 is holomorphic for Re z > 0 and satisfies an estimate of the form (18) there.
It follows that g_1 = LF_1 for Re z > 0, some F_1 ∈ ℒ'. Then

    g = LF,    F = e_{-a} F_1. □

Exercises

1. Compute the Laplace transforms of D^k δ, k = 0, 1, 2, ... and of T_s δ,
s ∈ ℝ.
2. Compute the Laplace transform of F when

    F(u) = \int_0^{\infty} e^{wt} u(t) \, dt.

7. Differential equations

In §§5, 6 of Chapter 2 we discussed differential equations of the form

    u'(x) + a u(x) = f(x),

and of the form

    u''(x) + b u'(x) + c u(x) = f(x).

In this section we turn to the theory and practice of solving general nth order
linear differential equations with constant coefficients:

    a_n u^{(n)}(x) + a_{n-1} u^{(n-1)}(x) + \cdots + a_1 u'(x) + a_0 u(x) = f(x),    (1)

where the a_j are complex constants. Using D to denote differentiation, and
understanding D^0 to be the identity operator, D^0 u = u, we may write (1) in
the form

    \sum_{k=0}^{n} a_k D^k u = f.    (1)'

Let p be the polynomial

    p(z) = \sum_{k=0}^{n} a_k z^k.

Then it is natural to denote by p(D) the operator

    p(D) = \sum_{k=0}^{n} a_k D^k.    (2)

Equation (1) becomes

    p(D)u = f.    (1)''

We shall assume that the polynomial is actually of degree n, that is
a_n ≠ 0.
Before discussing (1)'' for functions, let us look at the corresponding
problem for distributions: given H ∈ ℒ', find F ∈ ℒ' such that

    p(D)F = H.

Theorem 7.1. Suppose p is a polynomial of degree n > 0, and suppose
H ∈ ℒ'. Then there is a unique distribution F ∈ ℒ' such that

    p(D)F = H.    (3)

Proof. Distributions in ℒ' are uniquely determined by their Laplace
transforms. Therefore (3) is equivalent to

    L(p(D)F)(z) = LH(z),    Re z > a    (4)

for some a ∈ ℝ. But

    L(p(D)F)(z) = p(z) LF(z).

We may choose a so large that p(z) ≠ 0 if Re z ≥ a, and so that LH is holomorphic for Re z > a and satisfies the estimate given in Theorem 6.3. Then
we may define

    g(z) = p(z)^{-1} LH(z),    Re z > a.

Then g is holomorphic, and it too satisfies estimates

    |g(z)| \le K (1 + |z|)^k \exp(-M \operatorname{Re} z),    Re z > a.

Theorem 6.3 assures us that there is a unique F ∈ ℒ' such that LF = g, and
then (4) holds. □

The proof just given provides us, in principle, with a way to calculate F,
given H. Let us carry out the calculation formally, treating F and H as
though they were functions:

    F(t) = \frac{1}{2\pi i} \int_C e^{tz} LF(z) \, dz
         = \frac{1}{2\pi i} \int_C e^{tz} p(z)^{-1} LH(z) \, dz
         = \frac{1}{2\pi i} \int_C \int_{\mathbb{R}} e^{tz} p(z)^{-1} e^{-sz} H(s) \, ds \, dz
         = \int_{\mathbb{R}} \Big[ \frac{1}{2\pi i} \int_C e^{(t-s)z} p(z)^{-1} \, dz \Big] H(s) \, ds
         = \int_{\mathbb{R}} G(t - s) H(s) \, ds,

where

    G(t) = \frac{1}{2\pi i} \int_C e^{tz} p(z)^{-1} \, dz    (5)

and C is a line Re z = b > a.
We emphasize that the calculation was purely formal. Nevertheless the
integral (5) makes sense if p has degree ≥ 2, and defines a function G.
Equivalently,

    G(t) = \lim_{R \to \infty} \frac{1}{2\pi i} \int_{C_R} e^{tz} p(z)^{-1} \, dz,    (5)'

where C_R is the directed line segment from b - iR to b + iR, R > 0. We
shall show that the limit (5)' also exists when p has degree 1, except when
t = 0. The function defined by (5)' is called the Green's function for the
operator p(D) defined by the polynomial p. Our formal calculation suggests
that G plays a central role in solving differential equations. The following two
theorems provide some information about it.

Theorem 7.2. Suppose p is a polynomial of degree n ≥ 1; suppose
z_1, z_2, ..., z_r are the distinct roots of p, and suppose that z_j has multiplicity m_j.
Then (5)' defines a function G for all t ≠ 0. This function is a linear combination
of the functions g_{jk}, 1 ≤ j ≤ r, 0 ≤ k < m_j, where

    g_{jk}(t) = 0,                t < 0;
    g_{jk}(t) = t^k \exp(z_j t),  t > 0.

Proof. Suppose t < 0. Let D_R denote the rectangle with vertices b ± iR,
(b + R^{1/2}) ± iR. When Re z ≥ b,

    |e^{zt} p(z)^{-1}| \le c(t) (1 + |z|)^{-n} \exp t(\operatorname{Re} z - b),    (6)

where c(t) is independent of z. Now the line segment C_R is one side of the
rectangle D_R, and the estimates (6) show that the integral of e^{zt} p(z)^{-1} over
the other sides converges to 0 as R → ∞. On the other hand, the integral
over all of D_R vanishes, because the integrand is holomorphic inside D_R.
Thus the limit in (5)' exists and is 0 when t < 0 (and also when t = 0, if
n > 1).

Suppose t > 0. Let D_R now be the rectangle with vertices b ± iR,
(b - R^{1/2}) ± iR, with the counterclockwise direction, and suppose R is so
large that D_R contains all roots of p(z). Then the integral of e^{zt} p(z)^{-1} over
D_R is independent of R, and the integral over the sides other than C_R tends
to 0 as R → ∞. Thus again the limit in (5)' exists, and

    G(t) = \frac{1}{2\pi i} \int_{D_R} e^{zt} p(z)^{-1} \, dz,    t > 0.    (5)''

Now we may apply Theorem 6.3 of Chapter 6: G(t) is the sum of the residues
of the meromorphic function e^{zt} p(z)^{-1}. The point z_j is a pole of order m_j,
so near z_j we have a Laurent expansion

    p(z)^{-1} = \sum_{m \ge -m_j} b_m (z - z_j)^m.

Combining this with

    e^{zt} = e^{z_j t} \sum_{m \ge 0} (m!)^{-1} (z - z_j)^m t^m,

we see that the residue (the coefficient of (z - z_j)^{-1} in the Laurent expansion)
at z_j is a linear combination of

    t^k \exp(z_j t),    0 ≤ k < m_j;

moreover, it is the same linear combination whatever the value of t. □
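
When all roots of p are simple, the residue sum is explicit: the residue of e^{zt} p(z)^{-1} at a simple root r is e^{rt}/p'(r), so G(t) is the sum of these terms for t > 0. The sketch below implements only this special case (an assumption of the example; multiple roots would need the higher-order terms described in the proof). The function name green and the sample polynomial p(z) = z^2 + 3z + 2, with roots -1 and -2, are choices made here.

```python
import cmath

def green(roots, lead, t, k=0):
    """D^k G(t) for p(z) = lead * prod_j (z - roots[j]), assuming distinct roots.

    For t > 0 this is sum_j r_j^k e^{r_j t} / p'(r_j); G vanishes for t < 0.
    Calling with t = 0 returns the one-sided limit D^k G(0+).
    """
    if t < 0:
        return 0.0
    total = 0.0
    for j, r in enumerate(roots):
        dp = lead                      # p'(r) = lead * prod_{i != j} (r - roots[i])
        for i, q in enumerate(roots):
            if i != j:
                dp *= r - q
        total += r ** k * cmath.exp(r * t) / dp
    return total

roots, lead = [-1.0, -2.0], 1.0        # p(z) = z^2 + 3z + 2
G0 = green(roots, lead, 0.0)           # G(0+)
DG0 = green(roots, lead, 0.0, k=1)     # DG(0+)
t = 0.8
# p(D)G(t) = D^2 G + 3 DG + 2 G, evaluated at a sample point t > 0.
residual = green(roots, lead, t, 2) + 3 * green(roots, lead, t, 1) + 2 * green(roots, lead, t)
print(G0, DG0, residual)
```

For this polynomial the formula reduces to G(t) = e^{-t} - e^{-2t} for t > 0, and the printed values show G(0+) = 0, DG(0+) = 1 and p(D)G(t) = 0 up to rounding.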

Suppose f is a complex-valued function defined on an interval (a, b).
We write

    f(a+) = \lim_{t \to a} f(t)

when the limit on the right exists as t approaches a from the right.
We take the Green's function for p(D) to be 0 at t = 0; when n > 1 this
agrees with (5)'.

Theorem 7.3. Let p be a polynomial of degree n > 0, with leading coefficient a_n ≠ 0. Let G be the Green's function for p(D). Then G is the unique
function from ℝ to ℂ having the following properties:

    all derivatives D^k G exist and are continuous when t ≠ 0;    (7)
    the derivatives D^k G exist and are continuous at 0 when k ≤ n - 2;    (8)
    G(t) = 0,    t ≤ 0;    (9)
    p(D)G(t) = 0,    t > 0;    (10)
    a_n D^{n-1} G(0+) = 1.    (11)

Proof. We know that G is a linear combination of functions satisfying
(7) and (9), so G does also. When t > 0 we may differentiate (5)'' and get

    D^k G(t) = \frac{1}{2\pi i} \int_{D_R} z^k e^{zt} p(z)^{-1} \, dz.    (12)

Thus

    D^k G(0+) = \frac{1}{2\pi i} \int_{D_R} z^k p(z)^{-1} \, dz.

We may replace D_R by a very large circle centered at the origin and conclude
that

    D^k G(0+) = 0,    k ≤ n - 2.

Therefore (8) is true. Let us apply the same argument when k = n - 1.
Over the large circle the integrand is close to

    z^{n-1} (a_n z^n)^{-1} = a_n^{-1} z^{-1},

so

    D^{n-1} G(0+) = a_n^{-1}.

Finally, (12) gives

    p(D)G(t) = \frac{1}{2\pi i} \int_{D_R} e^{zt} \, dz = 0,    t > 0.

Now we must show that G is uniquely determined by the properties
(7)-(11). Suppose G_1 also satisfies (7)-(11), and let f = G - G_1. Then f
satisfies (7)-(10); moreover D^{n-1} f(0+) = 0. We may factor

    p(z) = a_n (z - z_1)(z - z_2) \cdots (z - z_n),

where we do not assume that the z_j are distinct. Let f_0 = f, and let

    f_k = (D - z_k) f_{k-1},    k > 0.

Then each f_k is a linear combination of D^j f, 0 ≤ j ≤ k, so

    f_k(0) = 0,    k ≤ n - 1.

Moreover,

    f_n = (D - z_n)(D - z_{n-1}) \cdots (D - z_1) f
        = a_n^{-1} p(D) f = 0.

Thus

    f_{n-1}(0) = 0,    D f_{n-1} - z_n f_{n-1} = f_n = 0,

so f_{n-1} = 0. Then

    f_{n-2}(0) = 0,    D f_{n-2} - z_{n-1} f_{n-2} = 0,

so f_{n-2} = 0. (We are using Theorem 5.1 of Chapter 2). Inductively, each
f_k = 0, k ≤ n, so f = 0 and G = G_1. □
Now let us return to differential equations for functions.
Theorem 7.4. Suppose $p$ is a polynomial of degree $n > 0$, and suppose $f: [0, \infty) \to \mathbb{C}$ is a continuous function. Then there is a unique solution $u: [0, \infty) \to \mathbb{C}$ to the problem

$$p(D)u(t) = f(t), \qquad t > 0; \tag{13}$$

$$D^j u(0+) = 0, \qquad 0 \le j \le n - 1. \tag{14}$$

This solution $u$ is given by

$$u(t) = \int_0^t G(t - s)f(s)\,ds, \tag{15}$$

where $G$ is the Green's function for the operator $p(D)$.


Proof. We use properties (7)-(11) of $G$. Let $u$ be given by (15) for $t \ge 0$. Then successive differentiations yield

$$Du(t) = G(0+)f(t) + \int_0^t DG(t - s)f(s)\,ds = \int_0^t DG(t - s)f(s)\,ds, \;\ldots, \tag{16}$$

$$D^k u(t) = \int_0^t D^k G(t - s)f(s)\,ds, \qquad k \le n - 1, \tag{17}$$

$$D^n u(t) = a_n^{-1}f(t) + \int_0^t D^n G(t - s)f(s)\,ds. \tag{18}$$

Thus

$$p(D)u(t) = f(t) + \int_0^t p(D)G(t - s)f(s)\,ds = f(t).$$

Moreover, (17) implies (14). Thus $u$ is a solution. The uniqueness of $u$ is proved in the same way as uniqueness of $G$. □
We conclude with a number of remarks.
1. The problem (13)-(14) as a problem for distributions: Let us define $f(t) = 0$ for $t < 0$. If $f$ does not grow too fast, i.e., if for some $a \in \mathbb{R}$

$$e^{-at}f(t) \text{ is bounded},$$

then we may define a distribution $H \in \mathcal{L}'$ by

$$H(v) = \int_{-\infty}^{\infty} f(t)v(t)\,dt, \qquad v \in \mathcal{L}.$$


Suppose $u$ is the solution of (13)-(14). Then it can be shown that $u$ defines a distribution $F$, and

$$p(D)F = H. \tag{19}$$

Thus we have returned to the case of Theorem 7.1.


2. If the problem (13)-(14) is reduced to (19), then the proof of Theorem 7.1 shows that the solution may be found by determining its Laplace transform. Since there are extensive tables of Laplace transforms, this is of practical as well as theoretical interest. It should be noted that Laplace transform tables list functions which are considered to be defined only for $t \ge 0$; then the Laplace transform of such a function $f$ is taken to be

$$Lf(z) = \int_0^\infty e^{-zt}f(t)\,dt.$$

In the context of this chapter, this amounts to setting $f(t) = 0$ for $t < 0$ and considering the distribution determined by $f$, exactly as in Remark 1.
3. Let us consider an example of the situation described in Remark 2. A table of Laplace transforms may read, in part,

    $f$          $Lf$
    $\sin t$     $(z^2 + 1)^{-1}$
    $\sinh t$    $(z^2 - 1)^{-1}$

(As noted in Remark 2, the function $\sin t$ in the table is considered only for $t \ge 0$, or is extended to vanish for $t < 0$.)
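The two table entries can be checked numerically. The following is a minimal sketch, not part of the original text: it approximates $Lf(z) = \int_0^\infty e^{-zt}f(t)\,dt$ by the composite Simpson rule on a truncated interval $[0, T]$. The name `laplace_numeric`, the cutoff `T`, and the step count are illustrative assumptions; the sample point $z = 2$ is chosen so that both integrals converge (the entry for $\sinh t$ needs $\operatorname{Re} z > 1$).

```python
import math

def laplace_numeric(f, z, T=40.0, steps=40000):
    """Approximate the integral of exp(-z*t) * f(t) over [0, T] by Simpson's rule.

    T and steps are illustrative choices; steps must be even.
    """
    h = T / steps
    g = lambda t: math.exp(-z * t) * f(t)
    total = g(0.0) + g(T)
    for i in range(1, steps):
        # Simpson weights: 4 at odd nodes, 2 at even interior nodes.
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3

z = 2.0
sin_transform = laplace_numeric(math.sin, z)    # compare with (z^2 + 1)^(-1)
sinh_transform = laplace_numeric(math.sinh, z)  # compare with (z^2 - 1)^(-1)
print(sin_transform, 1 / (z**2 + 1))
print(sinh_transform, 1 / (z**2 - 1))
```

The truncation is harmless here because both integrands decay exponentially at the chosen $z$; for $z$ close to the boundary of the region of convergence a larger cutoff would be needed.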
Now suppose we wish to solve:

$$u''(t) - u(t) - \sin t = 0, \qquad t > 0; \tag{20}$$

$$u(0) = u'(0) = 0. \tag{21}$$

Let $p(z) = z^2 - 1$. Our problem is

$$p(D)u = \sin t, \quad t > 0; \qquad u(0) = Du(0) = 0.$$
The solution $u$ is the function whose Laplace transform is

$$p(z)^{-1}L(\sin t)(z) = (z^2 - 1)^{-1}(z^2 + 1)^{-1}.$$

But

$$(z^2 - 1)^{-1}(z^2 + 1)^{-1} = \tfrac{1}{2}(z^2 - 1)^{-1} - \tfrac{1}{2}(z^2 + 1)^{-1}.$$

Therefore the solution to (20)-(21) is

$$u(t) = \tfrac{1}{2}\sinh t - \tfrac{1}{2}\sin t, \qquad t \ge 0.$$
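As a cross-check of this example against formula (15), the sketch below evaluates the convolution $\int_0^t G(t - s)\sin s\,ds$ numerically and compares it with the closed form found from the table. It is an illustration only. It uses the fact that for $p(z) = z^2 - 1$ the Green's function is $G(t) = \sinh t$ for $t > 0$ (this satisfies (9)-(11) with $n = 2$, $a_2 = 1$). The function names and the step count are assumptions made for the example.

```python
import math

def green(t):
    """Green's function for p(D) = D^2 - 1: sinh t for t > 0, and 0 for t <= 0."""
    return math.sinh(t) if t > 0 else 0.0

def solve_by_convolution(f, t, steps=2000):
    """Approximate u(t) = integral over [0, t] of G(t - s) f(s) ds (formula (15)).

    Uses the composite Simpson rule; steps is an illustrative choice and must be even.
    """
    if t <= 0:
        return 0.0
    h = t / steps
    total = green(t) * f(0.0) + green(0.0) * f(t)
    for i in range(1, steps):
        s = i * h
        total += (4 if i % 2 else 2) * green(t - s) * f(s)
    return total * h / 3

def closed_form(t):
    """The solution of (20)-(21) obtained from the Laplace transform table."""
    return 0.5 * math.sinh(t) - 0.5 * math.sin(t)

for t in (0.5, 1.0, 2.0):
    print(t, solve_by_convolution(math.sin, t), closed_form(t))
```

The two columns printed for each $t$ should agree to within the quadrature error, which illustrates that the table method of Remark 3 and the Green's function formula of Theorem 7.4 describe the same solution.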


4. In cases where the above method fails, either because the given function $f$ grows too fast to have a Laplace transform or because the function $Lu$ cannot be located in a table, one may wish to compute the Green's function $G$ and use (15). The Green's function may be computed explicitly if the roots of the polynomial $p$ are known (of course (5)'' gives us $G$ in principle). In fact, suppose the roots are $z_1, z_2, \ldots, z_r$ with multiplicities $m_1, m_2, \ldots, m_r$. We know that $G$, for $t > 0$, is a linear combination of the $n$ functions

$$t^k \exp(z_j t), \qquad 0 \le k < m_j, \quad 1 \le j \le r.$$

Thus

$$G(t) = \sum_{j=1}^{r}\sum_{k=0}^{m_j - 1} c_{jk}\,t^k \exp(z_j t), \qquad t > 0,$$

where we must determine the constants $c_{jk}$. The conditions (8) and (11) give $n$ independent linear equations for these $n$ constants. In fact,

$$G(0+) = \sum_j c_{j0},$$

$$DG(0+) = \sum_j z_j c_{j0} + \sum_j c_{j1},$$

etc.
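The linear system described in Remark 4 can be sketched in code for the special case in which all roots are simple, so that only the constants $c_j = c_{j0}$ occur and $D^kG(0+) = \sum_j z_j^k c_j$. This restriction to simple roots, the function names, and the small Gauss-Jordan solver are assumptions made for illustration; repeated roots would require the additional unknowns $c_{jk}$ with $k \ge 1$.

```python
import cmath

def green_coefficients(roots, leading_coeff=1.0):
    """Solve conditions (8) and (11) for c_1, ..., c_n when all n roots are simple.

    For G(t) = sum_j c_j exp(z_j t), t > 0, we have D^k G(0+) = sum_j z_j**k c_j,
    so the system is: sum_j z_j**k c_j = 0 for k <= n - 2, and
    leading_coeff * sum_j z_j**(n-1) c_j = 1.
    """
    n = len(roots)
    # Augmented matrix [A | b]; row k encodes the condition on D^k G(0+).
    rows = [[complex(z) ** k for z in roots] + [0j] for k in range(n)]
    rows[n - 1][n] = 1.0 / leading_coeff
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [rows[j][n] / rows[j][j] for j in range(n)]

def green_value(roots, coeffs, t):
    """Evaluate G(t): the exponential sum for t > 0, and 0 for t <= 0 as in (9)."""
    if t <= 0:
        return 0j
    return sum(c * cmath.exp(z * t) for c, z in zip(coeffs, roots))

# p(z) = z^2 - 1 has the simple roots 1 and -1.
coeffs = green_coefficients([1, -1])
print(coeffs)
print(green_value([1, -1], coeffs, 1.0))
```

For the roots $1$ and $-1$ the system gives $c = (\tfrac12, -\tfrac12)$, that is, $G(t) = \sinh t$ for $t > 0$, which is the Green's function implicit in Remark 3.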
5. The more general problem

$$p(D)u(t) = f(t), \qquad t > 0; \tag{22}$$

$$D^k u(0+) = b_k, \qquad 0 \le k < n \tag{23}$$

may be reduced to (13)-(14). Two ways of doing this are given in the exercises.
6. The formal calculation after Theorem 7.1 led to a formula

$$F(t) = \int G(t - s)H(s)\,ds$$

which it is natural to interpret as a convolution (see Chapter 3). A brief sketch of such a development is given in the exercises.

Exercises
1. Compute the Green's function for the operator $p(D)$ in each of the following cases:

$p(z) = z^2 - 4z - 5$
$p(z) = z^2 - 4z + 4$
$p(z) = z^3 + 2z^2 - z$
$p(z) = z^3 - 3z + 2$.


2. Solve for $u$:

$$u''(t) - 4u'(t) + 4u(t) = e^t, \quad t > 0, \qquad u(0+) = u'(0+) = 0.$$

3. Solve for $u$:

$$u'''(t) - 3u'(t) + 2u(t) = t^2 - \cos t, \quad t > 0, \qquad u(0+) = u'(0+) = u''(0+) = 0.$$

4. Let $u_0: (0, \infty) \to \mathbb{C}$ be given by

$$u_0(t) = \sum_{k=0}^{n-1} (k!)^{-1} b_k t^k.$$

Show that

$$D^k u_0(0+) = b_k, \qquad 0 \le k \le n - 1.$$

5. Suppose $u_0: (0, \infty) \to \mathbb{R}$ is such that

$$D^k u_0(0+) = b_k, \qquad 0 \le k \le n - 1.$$

Show that $u$ is a solution of (22)-(23) if and only if $u = u_0 + u_1$, where $u_1$ is the solution of

$$p(D)u_1(t) = f(t) - p(D)u_0(t), \qquad t > 0,$$

$$D^k u_1(0+) = 0, \qquad 0 \le k \le n - 1.$$
6. Show that problem (22)-(23) has a unique solution.
7. Show that the solution of

$$p(D)u(t) = 0, \qquad t > 0,$$

$$D^k u(0+) = 0, \qquad 0 \le k < j \text{ and } j < k \le n - 1,$$

$$D^j u(0+) = 1$$

is a linear combination of functions $t^k \exp zt$.
8. Show that any solution of

$$p(D)u(t) = 0, \qquad t > 0$$

is a linear combination of the functions $t^k \exp zt$, where $z$ is a root of $p(D)$ with multiplicity greater than $k$, and conversely.
9. Suppose $u: (0, \infty) \to \mathbb{C}$ is smooth and suppose $D^k u(0+)$ exists for each $k$. Suppose also that each $D^k u$ defines a distribution $F_k$ by

$$F_k(v) = \int_0^\infty D^k u(t)\,v(t)\,dt, \qquad v \in \mathcal{L}.$$

Show that

$$DF_0 = F_1 + u(0+)\delta,$$

and in general

$$D^k F_0 = F_k + \sum_{j=0}^{k-1} D^j u(0+)\,D^{k-1-j}\delta.$$

10. In Exercise 9 let $u(t) = G(t)$, $t > 0$, where $G$ is the Green's function for $p(D)$. Show that

$$p(D)F_0 = \delta.$$

11. Use Exercise 9 to interpret the problem (22)-(23) as a problem of finding a distribution (when the function $f$ defines a distribution in $\mathcal{L}'$). Discuss the solution of the problem.
12. Use Exercise 11 to give another derivation of the result of Exercise 5.
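13. Again let

$$\tilde u(t) = u(-t), \qquad T_s u(t) = u(t - s).$$

If $F \in \mathcal{L}'$ and $u \in \mathcal{L}$, set

$$\tilde F * u(t) = F(T_{-t}u).$$

(a) Suppose $F = F_v$, where $v: \mathbb{R} \to \mathbb{C}$ is continuous, $v(t) = 0$ for $t \le -M$, and $e^{-at}v(t)$ is bounded. Show that for each $u \in \mathcal{L}$ the convolution integral

$$\tilde v * u(t) = \int \tilde v(t - s)u(s)\,ds$$

exists and equals $\tilde F * u(t)$.
(b) Show that for each $F \in \mathcal{L}'$ and $u \in \mathcal{L}$, the function $\tilde F * u$ is in $\mathcal{L}$.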
14. If $F, H \in \mathcal{L}'$, set

$$(F * H)(u) = F(\tilde H * u), \qquad u \in \mathcal{L}.$$

Show that $F * H \in \mathcal{L}'$.
15. Compute $(D^k\delta)^{\sim} * u$, $u \in \mathcal{L}$. Compute $(D^k\delta) * F$, $F \in \mathcal{L}'$.
16. Show that

$$L(F * H) = L(F)L(H).$$

17. Let $G$ be the distribution determined by the Green's function for $p(D)$. Show that

$$LG = p(z)^{-1}.$$

18. Show that the solution of

$$p(D)F = H$$

is

$$F = G * H,$$

where $G$ is as in Exercise 17.

NOTES AND BIBLIOGRAPHY


Chapters 1 and 2. The book by Kaplansky [9] is a very readable source of
further material on set theory and metric spaces. The classical book by Whittaker
and Watson [25] and the more modern one by Rudin [18] treat the real and
complex number systems, compactness and continuity, and the topics of Chapter
2. Vector spaces, linear functionals, and linear transformations are the subject of
any linear algebra text, such as Halmos [7]. Infinite sequences and series may be
pursued further in the books of Knopp [10], [11]. More problems (and theorems)
in analysis are to be found in the classic by Polya and Szego [15].
Chapters 3, 4, and 5. The Weierstrass theorems (and the technique of approximation by convolution with an approximate identity) are classical. A direct proof
of the polynomial approximation theorem and a statement and proof of Stone's
generalization may be found in Rudin [18].
The general theory of distributions (or "generalized functions") is due to
Laurent Schwartz, and is expounded in his book [20]. The little book by Lighthill
[12] discusses periodic distributions and Fourier series. Other references for distribution theory and applications are the books of Bremermann [2], Liverman [13],
Schwartz [21], and Zemanian [27].
Banach spaces, Frechet spaces, and generalizations are treated in books on
functional analysis: that by Yosida [26] is comprehensive; the treatise by Dunford
and Schwartz [4] is exhaustive; the sprightly text by Reed and Simon [16] is
oriented toward mathematical physics. Good sources for Hilbert space theory in
particular are the books by Halmos [6], [8] and by Riesz and Sz.-Nagy [17].
The classical $L^2$-theory of Fourier series treats $L^2(0, 2\pi)$ as a space of functions
rather than as a space of distributions, and requires Lebesgue integration. Chapters 11 through 13 of Titchmarsh [24] contain a concise development of Lebesgue
integration and the $L^2$-theory. A more leisurely account is in Sz.-Nagy [14]. The
treatise by Zygmund [28] is comprehensive.
Chapter 6. The material in §§1-6 is standard. The classic text by Titchmarsh
[24] and that by Ahlfors [1] are good general sources. The book by Rudin [19]
also treats the boundary behavior of functions in the disc, related to the material
in §7.
Chapter 7. The Laplace transform is the principal subject of most books on
"operational mathematics" and "transform methods." Doetsch [3] is a comprehensive classical treatise. Distribution-theoretic points of view are presented in
the books of Bremermann [2], Erdelyi [5], Liverman [13], and Schwartz [21].
Bibliography

1. AHLFORS, L. V.: Complex Analysis, 2nd ed. New York: McGraw-Hill 1966.
2. BREMERMANN, H. J.: Distributions, Complex Variables, and Fourier Transforms. Reading, Mass.: Addison-Wesley 1966.
3. DOETSCH, G.: Handbuch der Laplace-Transformation, 3 vols. Basel: Birkhäuser 1950-1956.
4. DUNFORD, N., and SCHWARTZ, J. T.: Linear Operators, 3 vols. New York: Wiley-Interscience 1958-1971.
5. ERDELYI, A.: Operational Calculus and Generalized Functions. New York: Holt, Rinehart & Winston 1962.
6. HALMOS, P. R.: Introduction to Hilbert Space, 2nd ed. New York: Chelsea 1957.
7. HALMOS, P. R.: Finite Dimensional Vector Spaces. Princeton: Van Nostrand 1958.
8. HALMOS, P. R.: A Hilbert Space Problem Book. Princeton: Van Nostrand 1967.
9. KAPLANSKY, I.: Set Theory and Metric Spaces. Boston: Allyn & Bacon 1972.
10. KNOPP, K.: Theory and Applications of Infinite Series. London and Glasgow: Blackie & Son 1928.
11. KNOPP, K.: Infinite Sequences and Series. New York: Dover 1956.
12. LIGHTHILL, M. J.: Introduction to Fourier Analysis and Generalized Functions. New York: Cambridge University Press 1958.
13. LIVERMAN, T. P. G.: Generalized Functions and Direct Operational Methods. Englewood Cliffs, N.J.: Prentice-Hall 1964.
14. Sz.-NAGY, B.: Introduction to Real Functions and Orthogonal Expansions. New York: Oxford University Press 1965.
15. POLYA, G., and SZEGO, G.: Problems and Theorems in Analysis. Berlin-Heidelberg-New York: Springer 1972.
16. REED, M., and SIMON, B.: Methods of Modern Mathematical Physics, vol. 1. New York: Academic Press 1972.
17. RIESZ, F., and Sz.-NAGY, B.: Functional Analysis. New York: Ungar 1955.
18. RUDIN, W.: Principles of Mathematical Analysis, 2nd ed. New York: McGraw-Hill 1964.
19. RUDIN, W.: Real and Complex Analysis. New York: McGraw-Hill 1966.
20. SCHWARTZ, L.: Théorie des Distributions, 2nd ed. Paris: Hermann 1966.
21. SCHWARTZ, L.: Mathematics for the Physical Sciences. Paris: Hermann; Reading, Mass.: Addison-Wesley 1966.
22. SOBOLEV, S. L.: Applications of Functional Analysis in Mathematical Physics. Providence: Amer. Math. Soc. 1963.
23. SOBOLEV, S. L.: Partial Differential Equations of Mathematical Physics. Oxford and New York: Pergamon 1964.
24. TITCHMARSH, E. C.: The Theory of Functions, 2nd ed. London: Oxford University Press 1939.
25. WHITTAKER, E. T., and WATSON, G. N.: A Course of Modern Analysis, 4th ed. London: Cambridge University Press 1969.
26. YOSIDA, K.: Functional Analysis, 2nd ed. Berlin-Heidelberg-New York: Springer 1968.
27. ZEMANIAN, A. H.: Distribution Theory and Transform Analysis. New York: McGraw-Hill 1965.
28. ZYGMUND, A.: Trigonometric Series, 2nd ed. London: Cambridge University Press 1968.

NOTATION INDEX

$\mathbb{C}$    complex numbers, 8
$\mathbb{Q}$    rational numbers, 4
$\mathbb{R}$    real numbers, 5
$\mathbb{Z}$    integers, 1
$\mathbb{Z}_+$    positive integers, 1
$L^2$    Hilbert space of periodic distributions, 106
$\mathcal{C}$    continuous periodic functions, 69
$\mathcal{L}$    smooth functions of fast decrease at $+\infty$, 193
$\mathcal{L}'$    distributions acting on $\mathcal{L}$, 197
$\mathcal{P}$    smooth periodic functions, 73
$\mathcal{P}'$    periodic distributions, 84
$D$    differentiation operator, 72, 86, 198
$L$    Laplace transform operator, 192, 206, 210
$u * v$, $F * u$, $F * G$    convolution, 78, 94, 96, 100, 101, 222
$u_n \to u$ $(\mathcal{L})$    193
$u_n \to u$ $(\mathcal{P})$    73
$F_n \to F$ $(L^2)$    106
$F_n \to F$ $(\mathcal{L}')$    199
$F_n \to F$ $(\mathcal{P}')$    85, 100

SUBJECT INDEX
approximate identity, 80

ball, in metric space, 20
Banach space, 70
basis, 30
Bessel's equality, inequality, 124
Bolzano-Weierstrass theorem, 26
bounded function, 36
- linear functional, 71
- sequence, 11
- set, 7, 24
branch of logarithm, 173

Casorati-Weierstrass theorem, 177
Cauchy integral formula, 166, 187
Cauchy-Riemann equations, 157
Cauchy sequence, in a metric space, 22
- in $\mathcal{L}$, 193
- in $\mathcal{P}$, 73
- of numbers, 12
- uniform, 47
Cauchy's theorem, 161
chain rule, 45, 155
change of variables in integration, 45
characterization of distributions in $\mathcal{L}'$, 203
- of periodic distributions, 89, 102
class $C^k$, $C^\infty$, 46
closed set, 21
closure, 22
compact set, 23
- in $\mathbb{R}^n$, $\mathbb{C}$, 24
comparison test, 15
complement, of set, 2
complementary subspace, 33
complete metric space, 22
completeness axiom for real numbers, 7
completeness of $\mathcal{C}$, 70
- of $L^2$, 107
- of $\mathbb{R}$, $\mathbb{C}$, 12
- of $\mathbb{R}^n$, 23
complex conjugate of complex number, 9
- of distribution in $\mathcal{L}'$, 198
- of function, 38
- of periodic distribution, 85, 100
composition of functions, 3
connected set, 175
continuous function, 35
continuity, at a point, 34
- uniform, 35
convergence, in a metric space, 22
- in Hilbert space, 110
- in $\mathcal{L}'$, 199
- in $L^2$, 106
- in $\mathcal{P}$, 73
- in $\mathcal{P}'$, 85, 100
- of numerical sequences, 10
- of series, 14
convolution, in $\mathcal{L}'$, 222
- in $\mathcal{P}'$, 96, 101
- of functions, 78
- of functions with periodic distributions, 94, 100
coordinates of vector, 32
countable set, 3
curve, 159
- smooth, piecewise smooth, 159

$\delta$-distribution, 85, 197
dense set, 22
derivative, of distribution in $\mathcal{L}'$, 198
- of function, 42, 155
- of periodic distribution, 86, 100
differentiable function, 42, 155
differential equations, first order and second order, 51-56
- higher order, 213-222
diffusion equation, 137
- derivation, 144
dimension, of vector space, 31
Dirac $\delta$-distribution, 85, 197
Dirichlet kernel, 130
Dirichlet problem, 150
distribution, of type $\mathcal{L}'$, 197
- periodic, 84, 100
divergence of series, 14


essential singularity, 176
even function, 87
even periodic distribution, 87, 101

finite dimensional vector space, 30
Fourier coefficients, 124
- in $L^2$, 126, 129
- of a convolution, 134
- of periodic distributions, 132
Fourier series, 126, 129
Frechet space, 76
function, 2
- bounded, 36
- class $C^k$, $C^\infty$, 46
- complex-valued, 3
- continuous, 35
- differentiable, 42, 155
- holomorphic, 158
- injective, 3
- infinitely differentiable, 46
- integrable, 38
- meromorphic, 178
- 1-1, 3
- onto, 3
- periodic, 69
- rational, 179
- real-valued, 3
- smooth, 73
- surjective, 3
- uniformly continuous, 35
fundamental theorem of algebra, 169
fundamental theorem of calculus, 44

gamma function, 184
geometric series, 15
glb, 7
Goursat's theorem, 165
Gram-Schmidt method, 117
greatest lower bound, 7
Green's function, 215

$H^p$, 188
harmonic function, 150
heat equation, 137
- derivation, 144
Heine-Borel theorem, 24
Hermite polynomials, 120
Hilbert cube, 116
Hilbert space, 109
holomorphic function, 158
homotopy, 161

imaginary part, of complex number, 9
- of distribution in $\mathcal{L}'$, 198
- of function, 38
- of periodic distribution, 87, 101
improper integral, 41
independence, linear, 30
inf, infimum, 11
infinite dimensional vector space, 30
infinitely differentiable function, 46
inner product, 103, 109
integrable function, 38
integral, 38
- improper, 41
- of distribution in $\mathcal{L}'$, 201
intermediate value theorem, 37
intersection, 2
interval, 5
inverse function, 3
inverse function theorem, for holomorphic functions, 171
isolated singularity, 175

kernel, of linear transformation, 33

Laguerre polynomials, 120
Laplace transform, of distribution, 210
- of function, 192, 206
Laplace's equation, 150
Laurent expansion, 181
least upper bound, 7
Legendre polynomials, 120
L'Hôpital's rule, 47
lim inf, lim sup, 12
limit of sequence, 10, 22
limit point, 21
linear combination, 29
- nontrivial, 30
linear functional, 32
- bounded, 71
linear independence, 30
linear operator, linear transformation, 32

Liouville's theorem, 169
logarithm, 61, 173
lower bound, 7
lower limit, 12
lub,7
maximum modulus theorem, 174
maximum principle, for harmonic
functions, 154
- for heat equation, 142
mean value theorem, 43
meromorphic function, 178
mesh, of partition, 38
metric, metric space, 19
modulus, 9
neighborhood, 20
norm, normed linear space, 70
null space, 33
odd function, 87
odd periodic distribution, 88, 101
open mapping property, 174
open set, 20
order, of distribution in $\mathcal{L}'$, 201
- of periodic distribution, 89, 102
- of pole, 177
- of zero, 177
orthogonal expansion, 121, 124
orthogonal vectors, 110
orthonormal set, orthonormal basis,
117
parallelogram law, 110
Parseval's identity, 124
partial fractions decomposition, 180
partial sum, of series, 14
partition, 38
period, 69
periodic distribution, 84, 100
periodic function, 69
Poisson kernel, 151
polar coordinates, 66
pole, simple pole, 176
power series, 17
product, of sets, 2
Pythagorean theorem, in Hilbert
space, 110

radius of convergence, 17
rapid decrease, 131
ratio test, 16
rational function, 179
rational number, 4
real part, of complex number, 9
- of distribution in $\mathcal{L}'$, 198
- of function, 38
- of periodic distribution, 87, 101
real distribution in $\mathcal{L}'$, 198
real periodic distribution, 87, 101
removable singularity, 175
residue, 182
Riemann sum, 38
Riesz representation theorem, 112
root test, 16

scalar, 28
scalar multiplication, 27
Schrödinger equation, 141
Schwarz inequality, 103, 109
seminorm, 76
separable, 27
sequence, 4
sequentially compact set, 26
series, 14
simple pole, simple zero, 177
singularity, essential, 176
- isolated, 175
- removable, 175
slow growth, 132
smooth function, 73
span, 30
standard basis, 30
subset, 2
subsequence, 24
subspace, 29
sup, 11
support, 200
supremum, 11

translate, of distribution in $\mathcal{L}'$, 198
- of periodic distribution, 86, 100
- of function, 77
triangle inequality, 19
trigonometric polynomial, 81

uniform Cauchy sequence of functions, 47
uniform continuity, 35
uniform convergence, 47
union, 2
unitary transformation, 124
upper bound, 7
upper limit, 12

vector, vector space, 28

wave equation, 145
- derivation, 148
Weierstrass approximation theorem, 82
Weierstrass polynomial approximation theorem, 83

zero, of holomorphic function, 177

Graduate Texts in Mathematics

Soft and hard cover editions are available for each volume.

For information
A student approaching mathematical research is often discouraged by the sheer
volume of the literature and the long history of the subject, even when the actual
problems are readily understandable. Graduate Texts in Mathematics is intended
to bridge the gap between passive study and creative understanding; it offers introductions on a suitably advanced level to areas of current research. These introductions are neither complete surveys, nor brief accounts of the latest results only.
They are textbooks carefully designed as teaching aids; the purpose of the authors
is, in every case, to highlight the characteristic features of the theory.
Graduate Texts in Mathematics can serve as the basis for advanced courses.
They can be either the main or subsidiary sources for seminars, and they can be
used for private study. Their guiding principle is to convince the student that
mathematics is a living science.

Vol. 1   TAKEUTI/ZARING: Introduction to Axiomatic Set Theory. vii, 250 pages. 1971.
Vol. 2   OXTOBY: Measure and Category. viii, 95 pages. 1971.
Vol. 3   SCHAEFER: Topological Vector Spaces. xi, 294 pages. 1971.
Vol. 4   HILTON/STAMMBACH: A Course in Homological Algebra. ix, 338 pages. 1971.
Vol. 5   MAC LANE: Categories for the Working Mathematician. ix, 262 pages. 1972.
Vol. 6   HUGHES/PIPER: Projective Planes. xii, 291 pages. 1973.
Vol. 7   SERRE: A Course in Arithmetic. x, 115 pages. 1973.
Vol. 8   TAKEUTI/ZARING: Axiomatic Set Theory. viii, 238 pages. 1973.
Vol. 9   HUMPHREYS: Introduction to Lie Algebras and Representation Theory. xiv, 169 pages. 1972.
Vol. 10  COHEN: A Course in Simple-Homotopy Theory. xii, 114 pages. 1973.
Vol. 11  CONWAY: Functions of One Complex Variable. xiv, 314 pages. 1973.

In preparation

Vol. 12  BEALS: Advanced Mathematical Analysis. xii, 248 pages. Tentative publication date: November, 1973.
Vol. 13  ANDERSON/FULLER: Rings and Categories of Modules. xiv, 370 pages approximately. Tentative publication date: October, 1973.
Vol. 14  GOLUBITSKY/GUILLEMIN: Stable Mappings and Their Regularities. xii, 224 pages approximately. Tentative publication date: October, 1973.
Vol. 15  BERBERIAN: Lectures in Functional Analysis and Operator Theory. xii, 368 pages approximately. Tentative publication date: January, 1973.
Vol. 16  WINTER: The Structure of Fields. xii, 320 pages approximately. Tentative publication date: January, 1973.