
ANALYSIS TOOLS WITH APPLICATIONS

BRUCE K. DRIVER

Date: April 10, 2003. File: anal.tex.

Department of Mathematics, 0112.


University of California, San Diego .
La Jolla, CA 92093-0112 .
Abstract. These are lecture notes from Math 240.
Things to do:
0) Exhibit a non-measurable null set and a non-Borel measurable Riemann integrable function.
1) Weak convergence on metric spaces. See Durrett, Stochastic calculus, Chapter 8 for example. Also see Stroock's book on this point, chapter 3. See Problems 3.1.18-3.1.20.
2) Infinite product measures using the Carathéodory extension theorem in the general case of products of arbitrary probability spaces. See Stroock's book on probability from an analytic point of view.
3) Do enough on topological vector spaces to cover what is needed for the section on distributions; this includes the Banach-Steinhaus theorem and the open mapping theorem in the context of Fréchet spaces. See Rudin's functional analysis and lens notes.
4) Add manifold basics including Stokes' theorem and partitions of unity. See file Partitn.tex in the 257af94 directory. Also add facts about smooth measures on manifolds; see the last chapter of bookall.tex for this material.
5) Also basic ODE facts, i.e. flows of vector fields.
6) Put in some complex variables.
7) Bochner Integrals. (See Gaussian.tex for a discussion and problems below.)
8) Add in an implicit function theorem proof of existence for ODEs via Joel Robbin's method, see PDE notes.
9) Manifold theory including Sard's theorem (see p. 538 of Taylor Volume I and references), Stokes' Theorem, perhaps a little PDE on manifolds.
10) Put in more PDE stuff, especially by Hilbert space methods. See file zpde.tex in this directory.
11) Add some functional analysis, including the spectral theorem. See Taylor volume 2.
12) Perhaps some probability theory including stochastic integration. See course.tex from 257af94 and other files on disk. For Kolmogorov continuity criteria see course.tex from 257af94 as well. Also see Gaussian.tex in 289aW98 for the construction of Wiener measures.
13) There are some typed notes on partitions of unity called partitn.tex, from the PDE course; other notes from that course may be useful. For more ODE stuff see pdenote2.tex from directory 231a-f96. These notes also contain quadratic form notes and compact and Fredholm operator notes.
14) Move Hölder spaces much earlier in the text as illustrations of compactness theorems.
15) Use the proof in Loomis of Tychonoff's theorem, see p. 11.
16) Perhaps the π-λ theorem should go in section 4 when discussing the generation of σ-algebras.
Major breakdown thoughts:
I: Real Analysis
II: Topology
III: Complex Variables
IV: Distribution Theory, PDE 1
V: Functional analysis and PDE 2. (Sobolev Spaces)
VI: Probability Theory
VII: Manifold Theory and PDE 3.
Contents
1. Introduction 1
2. Limits, sums, and other basics 1
2.1. Set Operations 1
2.2. Limits, Limsups, and Liminfs 2
2.3. Sums of positive functions 3
2.4. Sums of complex functions 6
2.5. Iterated sums 9
2.6. ℓ^p spaces, Minkowski and Hölder Inequalities 11
2.7. Exercises 15
3. Metric, Banach and Topological Spaces 18
3.1. Basic metric space notions 18
3.2. Continuity 20
3.3. Basic Topological Notions 21
3.4. Completeness 27
3.5. Compactness in Metric Spaces 29
3.6. Compactness in Function Spaces 34
3.7. Bounded Linear Operators Basics 36
3.8. Inverting Elements in L(X) and Linear ODE 40
3.9. Supplement: Sums in Banach Spaces 42
3.10. Word of Caution 43
3.11. Exercises 45
4. The Riemann Integral 48
4.1. The Fundamental Theorem of Calculus 51
4.2. Exercises 53
5. Ordinary Differential Equations in a Banach Space 55
5.1. Examples 55
5.2. Linear Ordinary Differential Equations 57
5.3. Uniqueness Theorem and Continuous Dependence on Initial Data 60
5.4. Local Existence (Non-Linear ODE) 61
5.5. Global Properties 63
5.6. Semi-Group Properties of time independent flows 68
5.7. Exercises 70
6. Algebras, σ-Algebras and Measurability 75
6.1. Introduction: What are measures and why measurable sets 75
6.2. The problem with Lebesgue measure 76
6.3. Algebras and σ-algebras 78
6.4. Continuous and Measurable Functions 84
6.5. Topologies and σ-Algebras Generated by Functions 87
6.6. Product Spaces 89
6.7. Exercises 95
7. Measures and Integration 97
7.1. Example of Measures 99
7.2. Integrals of Simple functions 101
7.3. Integrals of positive functions 103
7.4. Integrals of Complex Valued Functions 110
7.5. Measurability on Complete Measure Spaces 117
7.6. Comparison of the Lebesgue and the Riemann Integral 118
7.7. Appendix: Bochner Integral 121
7.8. Bochner Integrals 124
7.9. Exercises 126
8. Fubini's Theorem 129
8.1. Measure Theoretic Arguments 129
8.2. Fubini-Tonelli's Theorem and Product Measure 135
8.3. Lebesgue measure on R^d 141
8.4. Polar Coordinates and Surface Measure 144
8.5. Regularity of Measures 147
8.6. Exercises 151
9. L^p-spaces 153
9.1. Jensen's Inequality 156
9.2. Modes of Convergence 159
9.3. Completeness of L^p spaces 162
9.4. Converse of Hölder's Inequality 166
9.5. Uniform Integrability 171
9.6. Exercises 176
10. Locally Compact Hausdorff Spaces 178
10.1. Locally compact form of Urysohn Metrization Theorem 184
10.2. Partitions of Unity 186
10.3. C_0(X) and the Alexandrov Compactification 190
10.4. More on Separation Axioms: Normal Spaces 191
10.5. Exercises 194
11. Approximation Theorems and Convolutions 197
11.1. Convolution and Young's Inequalities 201
11.2. Classical Weierstrass Approximation Theorem 208
11.3. Stone-Weierstrass Theorem 213
11.4. Locally Compact Version of Stone-Weierstrass Theorem 216
11.5. Dynkin's Multiplicative System Theorem 217
11.6. Exercises 218
12. Hilbert Spaces 222
12.1. Hilbert Spaces Basics 222
12.2. Hilbert Space Basis 230
12.3. Fourier Series Considerations 232
12.4. Weak Convergence 235
12.5. Supplement 1: Converse of the Parallelogram Law 238
12.6. Supplement 2. Non-complete inner product spaces 240
12.7. Supplement 3: Conditional Expectation 241
12.8. Exercises 244
12.9. Fourier Series Exercises 246
12.10. Dirichlet Problems on D 250
13. Construction of Measures 253
13.1. Finitely Additive Measures and Associated Integrals 253
13.2. The Daniell-Stone Construction Theorem 257
13.3. Extensions of premeasures to measures I 261
13.4. Riesz Representation Theorem 263
13.5. Metric space regularity results revisited 269
13.6. Measure on Products of Metric spaces 270
13.7. Measures on general innite product spaces 272
13.8. Extensions of premeasures to measures II 274
13.9. Supplement: Generalizations of Theorem 13.35 to R^n 277
13.10. Exercises 279
14. Daniell Integral Proofs 282
14.1. Extension of Integrals 282
14.2. The Structure of L^1(I) 289
14.3. Relationship to Measure Theory 290
15. Complex Measures, Radon-Nikodym Theorem and the Dual of L^p 296
15.1. Radon-Nikodym Theorem I 297
15.2. Signed Measures 302
15.3. Complex Measures II 307
15.4. Absolute Continuity on an Algebra 310
15.5. Dual Spaces and the Complex Riesz Theorem 312
15.6. Exercises 314
16. Lebesgue Dierentiation and the Fundamental Theorem of Calculus 316
16.1. A Covering Lemma and Averaging Operators 316
16.2. Maximal Functions 317
16.3. Lebesgue Set 319
16.4. The Fundamental Theorem of Calculus 322
16.5. Alternative method to the Fundamental Theorem of Calculus 330
16.6. Examples: 332
16.7. Exercises 333
17. More Point Set Topology 335
17.1. Connectedness 335
17.2. Product Spaces 337
17.3. Tychonoff's Theorem 339
17.4. Baire Category Theorem 341
17.5. Baire Category Theorem 341
17.6. Exercises 346
18. Banach Spaces II 348
18.1. Applications to Fourier Series 353
18.2. Hahn Banach Theorem 355
18.3. Weak and Strong Topologies 359
18.4. Weak Convergence Results 360
18.5. Supplement: Quotient spaces, adjoints, and more reflexivity 364
18.6. Exercises 368
19. Weak and Strong Derivatives 371
19.1. Basic Definitions and Properties 371
19.2. The connection of Weak and pointwise derivatives 382
19.3. Exercises 387
20. Fourier Transform 389
20.1. Fourier Transform 390
20.2. Schwartz Test Functions 392
20.3. Fourier Inversion Formula 394
20.4. Summary of Basic Properties of F and F^{-1} 397
20.5. Fourier Transforms of Measures and Bochner's Theorem 397
20.6. Supplement: Heisenberg Uncertainty Principle 400
21. Constant Coefficient Partial Differential Equations 405
21.1. Elliptic Regularity 416
21.2. Exercises 420
22. L^2-Sobolev spaces on R^n 421
22.1. Sobolev Spaces 421
22.2. Examples 429
22.3. Summary of operations on H 431
22.4. Application to Dierential Equations 433
23. Sobolev Spaces 436
23.1. Mollifications 437
23.2. Difference quotients 442
23.3. Application to regularity 443
23.4. Sobolev Spaces on Compact Manifolds 444
23.5. Trace Theorems 447
23.6. Extension Theorems 451
23.7. Exercises 453
24. Hölder Spaces 455
24.1. Exercises 460
25. Sobolev Inequalities 461
25.1. Gagliardo-Nirenberg-Sobolev Inequality 461
25.2. Morrey's Inequality 465
25.3. Rademacher's Theorem 470
25.4. Sobolev Embedding Theorems Summary 470
25.5. Other Theorems along these lines 471
25.6. Exercises 472
26. Banach Spaces III: Calculus 473
26.1. The Differential 473
26.2. Product and Chain Rules 474
26.3. Partial Derivatives 476
26.4. Smooth Dependence of ODEs on Initial Conditions 477
26.5. Higher Order Derivatives 479
26.6. Contraction Mapping Principle 482
26.7. Inverse and Implicit Function Theorems 484
26.8. More on the Inverse Function Theorem 487
26.9. Applications 490
26.10. Exercises 492
27. Proof of the Change of Variable Theorem 494
27.1. Appendix: Other Approaches to proving Theorem 27.1 498
27.2. Sard's Theorem 499
27.3. Co-Area Formula 503
27.4. Stokes' Theorem 503
28. Complex Differentiable Functions 504
28.1. Basic Facts About Complex Numbers 504
28.2. The complex derivative 504
28.3. Contour integrals 509
28.4. Weak characterizations of H(Ω) 515
28.5. Summary of Results 519
28.6. Exercises 520
28.7. Problems from Rudin 522
29. Littlewood-Paley Theory 523
30. Elementary Distribution Theory 529
30.1. Distributions on U ⊂_o R^n 529
30.2. Other classes of test functions 536
30.3. Compactly supported distributions 541
30.4. Tempered Distributions and the Fourier Transform 543
30.5. Appendix: Topology on C_c^∞(U) 553
31. Convolutions involving distributions 557
31.1. Tensor Product of Distributions 557
31.2. Elliptic Regularity 565
31.3. Appendix: Old Proof of Theorem 31.4 567
32. Pseudo-Differential Operators on Euclidean space 571
32.1. Symbols and their operators 572
32.2. A more general symbol class 574
32.3. Schwartz Kernel Approach 583
32.4. Pseudo Differential Operators 588
33. Elliptic pseudo-differential operators on R^d 600
34. Pseudo-differential operators on Compact Manifolds 604
35. Sobolev Spaces on M 608
35.1. Alternate Definition of H^k for integer k 612
35.2. Scaled Spaces 614
35.3. General Properties of "Scaled space" 615
36. Compact and Fredholm Operators and the Spectral Theorem 618
36.1. Compact Operators 618
36.2. Hilbert Schmidt Operators 620
36.3. The Spectral Theorem for Self Adjoint Compact Operators 623
36.4. Structure of Compact Operators 627
36.5. Fredholm Operators 628
36.6. Tensor Product Spaces 633
37. Unbounded operators and quadratic forms 639
37.1. Unbounded operator basics 639
37.2. Lax-Milgram Methods 640
37.3. Closed, symmetric, semi-bounded quadratic forms and self-adjoint operators 642
37.4. Construction of positive self-adjoint operators 646
37.5. Applications to partial differential equations 647
38. More Complex Variables: The Index 649
38.1. Unique Lifting Theorem 650
38.2. Path Lifting Property 650
39. Residue Theorem 657
39.1. Residue Theorem 658
39.2. Open Mapping Theorem 660
39.3. Applications of Residue Theorem 661
39.4. Isolated Singularity Theory 663
40. Conformal Equivalence 664
41. Find All Conformal Homeomorphisms of V → U 666
41.1. Sketch of Proof of Riemann Mapping Theorem 667
42. Radon Measures and C_0(X)^* 675
42.1. More Regularity Results 677
42.2. The Riesz Representation Theorem 680
42.3. The dual of C_0(X) 683
42.4. Special case of Riesz Theorem on [0, 1] 687
42.5. Applications 688
42.6. The General Riesz Representation by Daniell Integrals 690
42.7. Regularity Results 692
43. The Flow of Vector Fields on Manifolds 698
Appendix A. Multinomial Theorems and Calculus Results 701
A.1. Multinomial Theorems and Product Rules 701
A.2. Taylor's Theorem 702
Appendix B. Zorn's Lemma and the Hausdorff Maximal Principle 706
Appendix C. Carathéodory's Method of Constructing Measures 710
C.1. Outer Measures 710
C.2. Carathéodory's Construction Theorem 712
C.3. Regularity results revisited 715
C.4. Construction of measures on a simple product space. 717
Appendix D. Nets 719
Appendix E. Infinite Dimensional Gaussian Measures 722
E.1. Finite Dimensional Examples and Results 723
E.2. Basic Infinite Dimensional Results 726
E.3. Gaussian Measure for ℓ^2 730
E.4. Classical Wiener Measure 734
E.5. Basic Properties of Wiener Measure 738
E.6. The Cameron-Martin Space and Theorem 740
E.7. Cameron-Martin Theorem 742
E.8. Exercises 746
Appendix F. Solutions to Selected Exercises 747
F.1. Section 2 Solutions 747
F.2. Section 3 Solutions 748
F.3. Section 5 Solutions 754
F.4. Section 6 Solutions 755
F.5. Section 7 Solutions 758
F.6. Section 8 Solutions 764
F.7. Section 9 Solutions 770
F.8. Section 10 Solutions 777
F.9. Section 11 Solutions 780
F.10. Section 12 Solutions 783
F.11. Section 13 Solutions 788
F.12. Section 14 Solutions 791
F.13. Section 15 Solutions 791
F.14. Section 16 Solutions 799
F.15. Section 17 Solutions 802
F.16. Section 18 Solutions 804
F.17. Size of ℓ^2 spaces 820
F.18. Bochner Integral Problems from chapter 5 of first edition 821
F.19. Section 19 Solutions 823
F.20. Section 20 Solutions 825
F.21. Section 21 Solutions 829
F.22. Section 24 Solutions 830
F.23. Section 26 Solutions 831
F.24. Section 42 Solutions 833
F.25. Problems from Folland Sec. 7 833
F.26. Folland Chapter 2 problems 836
F.27. Folland Chapter 4 problems 836
Appendix G. Old Stuff 844
G.1. Section 2 844
G.2. Section 3 847
G.3. Compactness on metric spaces 847
G.4. Compact Sets in R^n 849
G.5. Section 4 851
G.6. Section 5 852
G.7. Section 6: 854
G.8. Section 8 855
G.9. Section 9 866
G.10. Section 10 866
G.11. Section 11 870
G.12. Section 12 871
G.13. Section 13 876
G.14. Section 14 881
G.15. Section 15 old Stuff 883
G.16. Signed measures 883
G.17. The Total Variation on an Algebra by B. 887
G.18. The Total Variation on an Algebra by Z. 888
G.19. Old parts of Section 16 890
G.20. Old Absolute Continuity 890
G.21. Appendix: Absolute Continuity on an algebra by Z. (Delete?) 890
G.22. Other Hahn Decomposition Proofs 890
G.23. Old Dual to L^p spaces 892
G.24. Section G.15 894
G.25. Section 16.4 894
G.26. Section 17 896
G.27. Old Urysohn's Metrization Theorem 899
G.28. Section 18 901
G.29. Section 19 905
G.30. Section 20 907
G.31. Old Section 21 908
G.32. Old Section 27 909
G.33. Old Section 37 910
G.34. Old Section A 914
G.35. Old Section E.4 915
Appendix H. Record of Problems Graded 916
H.1. 240A F01 916
H.2. 240B W02 916
References 916
1. Introduction
Not written as of yet. Topics to mention.
(1) A better and more general integral.
(a) Convergence Theorems
(b) Integration over diverse collection of sets. (See probability theory.)
(c) Integration relative to different weights or densities including singular
weights.
(d) Characterization of dual spaces.
(e) Completeness.
(2) Infinite dimensional linear algebra.
(3) ODE and PDE.
(4) Harmonic and Fourier Analysis.
(5) Probability Theory
2. Limits, sums, and other basics
2.1. Set Operations. Suppose that $X$ is a set. Let $\mathcal{P}(X)$ or $2^X$ denote the power set of $X$, that is, elements of $\mathcal{P}(X) = 2^X$ are subsets of $X$. For $A \in 2^X$ let
$$A^c = X \setminus A = \{x \in X : x \notin A\}$$
and more generally if $A, B \subset X$ let
$$B \setminus A = \{x \in B : x \notin A\}.$$
We also define the symmetric difference of $A$ and $B$ by
$$A \triangle B = (B \setminus A) \cup (A \setminus B).$$
As usual if $\{A_\alpha\}_{\alpha\in I}$ is an indexed collection of subsets of $X$ we define the union and the intersection of this collection by
$$\bigcup_{\alpha\in I} A_\alpha := \{x \in X : \exists\,\alpha\in I \ni x \in A_\alpha\} \quad\text{and}\quad \bigcap_{\alpha\in I} A_\alpha := \{x \in X : x \in A_\alpha \ \forall\,\alpha\in I\}.$$
Notation 2.1. We will also write $\coprod_{\alpha\in I} A_\alpha$ for $\bigcup_{\alpha\in I} A_\alpha$ in the case that $\{A_\alpha\}_{\alpha\in I}$ are pairwise disjoint, i.e. $A_\alpha \cap A_\beta = \emptyset$ if $\alpha \neq \beta$.
Notice that $\cap$ is closely related to $\forall$ and $\cup$ is closely related to $\exists$. For example let $\{A_n\}_{n=1}^\infty$ be a sequence of subsets from $X$ and define
$$\{A_n \text{ i.o.}\} := \{x \in X : \#\{n : x \in A_n\} = \infty\} \quad\text{and}\quad \{A_n \text{ a.a.}\} := \{x \in X : x \in A_n \text{ for all } n \text{ sufficiently large}\}.$$
(One should read $\{A_n \text{ i.o.}\}$ as $A_n$ infinitely often and $\{A_n \text{ a.a.}\}$ as $A_n$ almost always.) Then $x \in \{A_n \text{ i.o.}\}$ iff $\forall\,N\in\mathbb{N}\ \exists\, n\geq N \ni x\in A_n$ which may be written as
$$\{A_n \text{ i.o.}\} = \bigcap_{N=1}^\infty \bigcup_{n\geq N} A_n.$$
Similarly, $x \in \{A_n \text{ a.a.}\}$ iff $\exists\,N\in\mathbb{N} \ni \forall\,n\geq N,\ x\in A_n$ which may be written as
$$\{A_n \text{ a.a.}\} = \bigcup_{N=1}^\infty \bigcap_{n\geq N} A_n.$$
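For example, if $A_n = [0,1]$ when $n$ is even and $A_n = [1,2]$ when $n$ is odd, then every point of $[0,2]$ lies in infinitely many of the $A_n$ while only $x = 1$ lies in all but finitely many of them, so $\{A_n \text{ i.o.}\} = [0,2]$ and $\{A_n \text{ a.a.}\} = \{1\}$.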
2.2. Limits, Limsups, and Liminfs.

Notation 2.2. The extended real numbers is the set $\bar{\mathbb{R}} := \mathbb{R}\cup\{\pm\infty\}$, i.e. it is $\mathbb{R}$ with two new points called $\infty$ and $-\infty$. We use the following conventions: $\pm\infty\cdot 0 = 0$, $\pm\infty + a = \pm\infty$ for any $a\in\mathbb{R}$, $\infty + \infty = \infty$ and $-\infty - \infty = -\infty$, while $\infty - \infty$ is not defined.
If $\Lambda \subset \bar{\mathbb{R}}$ we will let $\sup\Lambda$ and $\inf\Lambda$ denote the least upper bound and greatest lower bound of $\Lambda$ respectively. We will also use the following convention: if $\Lambda = \emptyset$, then $\sup\emptyset = -\infty$ and $\inf\emptyset = +\infty$.
Notation 2.3. Suppose that $\{x_n\}_{n=1}^\infty \subset \bar{\mathbb{R}}$ is a sequence of numbers. Then
$$\liminf_{n\to\infty} x_n = \lim_{n\to\infty} \inf\{x_k : k \geq n\} \qquad (2.1)$$
$$\limsup_{n\to\infty} x_n = \lim_{n\to\infty} \sup\{x_k : k \geq n\}. \qquad (2.2)$$
We will also write $\underline{\lim}$ for $\liminf$ and $\overline{\lim}$ for $\limsup$.
Remark 2.4. Notice that if $a_n := \inf\{x_k : k \geq n\}$ and $b_n := \sup\{x_k : k \geq n\}$, then $\{a_n\}$ is an increasing sequence while $\{b_n\}$ is a decreasing sequence. Therefore the limits in Eq. (2.1) and Eq. (2.2) always exist and
$$\liminf_{n\to\infty} x_n = \sup_n \inf\{x_k : k \geq n\} \quad\text{and}\quad \limsup_{n\to\infty} x_n = \inf_n \sup\{x_k : k \geq n\}.$$
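For example, if $x_n = (-1)^n(1 + 1/n)$ then $\inf\{x_k : k \geq n\} \uparrow -1$ and $\sup\{x_k : k \geq n\} \downarrow 1$, so $\liminf_{n\to\infty} x_n = -1$ and $\limsup_{n\to\infty} x_n = 1$ even though $\lim_{n\to\infty} x_n$ does not exist.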
The following proposition contains some basic properties of liminfs and limsups.

Proposition 2.5. Let $\{a_n\}_{n=1}^\infty$ and $\{b_n\}_{n=1}^\infty$ be two sequences of real numbers. Then
(1) $\liminf_{n\to\infty} a_n \leq \limsup_{n\to\infty} a_n$ and $\lim_{n\to\infty} a_n$ exists in $\bar{\mathbb{R}}$ iff $\liminf_{n\to\infty} a_n = \limsup_{n\to\infty} a_n \in \bar{\mathbb{R}}$.
(2) There is a subsequence $\{a_{n_k}\}_{k=1}^\infty$ of $\{a_n\}_{n=1}^\infty$ such that $\lim_{k\to\infty} a_{n_k} = \limsup_{n\to\infty} a_n$.
(3)
$$\limsup_{n\to\infty}(a_n + b_n) \leq \limsup_{n\to\infty} a_n + \limsup_{n\to\infty} b_n \qquad (2.3)$$
whenever the right side of this equation is not of the form $\infty - \infty$.
(4) If $a_n \geq 0$ and $b_n \geq 0$ for all $n\in\mathbb{N}$, then
$$\limsup_{n\to\infty}(a_n b_n) \leq \limsup_{n\to\infty} a_n \cdot \limsup_{n\to\infty} b_n, \qquad (2.4)$$
provided the right hand side of (2.4) is not of the form $0\cdot\infty$ or $\infty\cdot 0$.
Proof. We will only prove part 1. and leave the rest as an exercise to the reader. We begin by noticing that
$$\inf\{a_k : k \geq n\} \leq \sup\{a_k : k \geq n\} \quad \forall\, n$$
so that
$$\liminf_{n\to\infty} a_n \leq \limsup_{n\to\infty} a_n.$$
Now suppose that $\liminf_{n\to\infty} a_n = \limsup_{n\to\infty} a_n = a \in \mathbb{R}$. Then for all $\epsilon > 0$, there is an integer $N$ such that
$$a - \epsilon \leq \inf\{a_k : k \geq N\} \leq \sup\{a_k : k \geq N\} \leq a + \epsilon,$$
i.e.
$$a - \epsilon \leq a_k \leq a + \epsilon \text{ for all } k \geq N.$$
Hence by the definition of the limit, $\lim_{k\to\infty} a_k = a$.
If $\liminf_{n\to\infty} a_n = \infty$, then we know for all $M \in (0,\infty)$ there is an integer $N$ such that
$$M \leq \inf\{a_k : k \geq N\}$$
and hence $\lim_{n\to\infty} a_n = \infty$. The case where $\limsup_{n\to\infty} a_n = -\infty$ is handled similarly.
Conversely, suppose that $\lim_{n\to\infty} a_n = A \in \bar{\mathbb{R}}$ exists. If $A \in \mathbb{R}$, then for every $\epsilon > 0$ there exists $N(\epsilon)\in\mathbb{N}$ such that $|A - a_n| \leq \epsilon$ for all $n \geq N(\epsilon)$, i.e.
$$A - \epsilon \leq a_n \leq A + \epsilon \text{ for all } n \geq N(\epsilon).$$
From this we learn that
$$A - \epsilon \leq \liminf_{n\to\infty} a_n \leq \limsup_{n\to\infty} a_n \leq A + \epsilon.$$
Since $\epsilon > 0$ is arbitrary, it follows that
$$A \leq \liminf_{n\to\infty} a_n \leq \limsup_{n\to\infty} a_n \leq A,$$
i.e. that $A = \liminf_{n\to\infty} a_n = \limsup_{n\to\infty} a_n$.
If $A = \infty$, then for all $M > 0$ there exists $N(M)$ such that $a_n \geq M$ for all $n \geq N(M)$. This shows that
$$\liminf_{n\to\infty} a_n \geq M$$
and since $M$ is arbitrary it follows that
$$\infty \leq \liminf_{n\to\infty} a_n \leq \limsup_{n\to\infty} a_n.$$
The proof is similar if $A = -\infty$ as well.
2.3. Sums of positive functions. In this and the next few sections, let $X$ and $Y$ be two sets. We will write $\alpha \subset\subset X$ to denote that $\alpha$ is a finite subset of $X$.

Definition 2.6. Suppose that $a : X \to [0,\infty]$ is a function and $F \subset X$ is a subset, then
$$\sum_F a = \sum_{x\in F} a(x) = \sup\Big\{\sum_{x\in\alpha} a(x) : \alpha \subset\subset F\Big\}.$$
Remark 2.7. Suppose that $X = \mathbb{N} = \{1,2,3,\dots\}$, then
$$\sum_{\mathbb{N}} a = \sum_{n=1}^\infty a(n) := \lim_{N\to\infty} \sum_{n=1}^N a(n).$$
Indeed for all $N$, $\sum_{n=1}^N a(n) \leq \sum_{\mathbb{N}} a$, and thus passing to the limit we learn that
$$\sum_{n=1}^\infty a(n) \leq \sum_{\mathbb{N}} a.$$
Conversely, if $\alpha \subset\subset \mathbb{N}$, then for all $N$ large enough so that $\alpha \subset \{1,2,\dots,N\}$, we have $\sum_\alpha a \leq \sum_{n=1}^N a(n)$ which upon passing to the limit implies that
$$\sum_\alpha a \leq \sum_{n=1}^\infty a(n)$$
and hence by taking the supremum over $\alpha$ we learn that
$$\sum_{\mathbb{N}} a \leq \sum_{n=1}^\infty a(n).$$
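For instance, with $a(n) = 2^{-n}$ every finite partial sum is at most $1$ and $\sum_{n=1}^N 2^{-n} = 1 - 2^{-N} \uparrow 1$, so $\sum_{\mathbb{N}} a = 1$.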
Remark 2.8. Suppose that $\sum_X a < \infty$, then $\{x\in X : a(x) > 0\}$ is at most countable. To see this first notice that for any $\epsilon > 0$, the set $\{x : a(x) \geq \epsilon\}$ must be finite for otherwise $\sum_X a = \infty$. Thus
$$\{x\in X : a(x) > 0\} = \bigcup_{k=1}^\infty \{x : a(x) \geq 1/k\}$$
which shows that $\{x\in X : a(x) > 0\}$ is a countable union of finite sets and thus countable.
Lemma 2.9. Suppose that $a, b : X \to [0,\infty]$ are two functions, then
$$\sum_X (a+b) = \sum_X a + \sum_X b \quad\text{and}\quad \sum_X \lambda a = \lambda \sum_X a$$
for all $\lambda \geq 0$.

Proof. I will only prove the first assertion, the second being easy. Let $\alpha \subset\subset X$ be a finite set, then
$$\sum_\alpha (a+b) = \sum_\alpha a + \sum_\alpha b \leq \sum_X a + \sum_X b$$
which after taking sups over $\alpha$ shows that
$$\sum_X (a+b) \leq \sum_X a + \sum_X b.$$
Similarly, if $\alpha, \beta \subset\subset X$, then
$$\sum_\alpha a + \sum_\beta b \leq \sum_{\alpha\cup\beta} a + \sum_{\alpha\cup\beta} b = \sum_{\alpha\cup\beta} (a+b) \leq \sum_X (a+b).$$
Taking sups over $\alpha$ and then $\beta$ shows that
$$\sum_X a + \sum_X b \leq \sum_X (a+b).$$
Lemma 2.10. Let $X$ and $Y$ be sets, $R \subset X\times Y$ and suppose that $a : R \to \bar{\mathbb{R}}$ is a function. Let $_xR := \{y\in Y : (x,y)\in R\}$ and $R_y := \{x\in X : (x,y)\in R\}$. Then
$$\sup_{(x,y)\in R} a(x,y) = \sup_{x\in X}\sup_{y\in {}_xR} a(x,y) = \sup_{y\in Y}\sup_{x\in R_y} a(x,y) \quad\text{and}\quad \inf_{(x,y)\in R} a(x,y) = \inf_{x\in X}\inf_{y\in {}_xR} a(x,y) = \inf_{y\in Y}\inf_{x\in R_y} a(x,y).$$
(Recall the conventions: $\sup\emptyset = -\infty$ and $\inf\emptyset = +\infty$.)

Proof. Let $M = \sup_{(x,y)\in R} a(x,y)$ and $N_x := \sup_{y\in {}_xR} a(x,y)$. Then $a(x,y) \leq M$ for all $(x,y)\in R$ implies $N_x = \sup_{y\in {}_xR} a(x,y) \leq M$ and therefore that
$$\sup_{x\in X}\sup_{y\in {}_xR} a(x,y) = \sup_{x\in X} N_x \leq M. \qquad (2.5)$$
Similarly for any $(x,y)\in R$,
$$a(x,y) \leq N_x \leq \sup_{x\in X} N_x = \sup_{x\in X}\sup_{y\in {}_xR} a(x,y)$$
and therefore
$$M = \sup_{(x,y)\in R} a(x,y) \leq \sup_{x\in X}\sup_{y\in {}_xR} a(x,y). \qquad (2.6)$$
Equations (2.5) and (2.6) show that
$$\sup_{(x,y)\in R} a(x,y) = \sup_{x\in X}\sup_{y\in {}_xR} a(x,y).$$
The assertions involving infimums are proved analogously or follow from what we have just proved applied to the function $-a$.
Figure 1. The $x$ and $y$ slices of a set $R \subset X\times Y$.
Theorem 2.11 (Monotone Convergence Theorem for Sums). Suppose that $f_n : X \to [0,\infty]$ is an increasing sequence of functions and
$$f(x) := \lim_{n\to\infty} f_n(x) = \sup_n f_n(x).$$
Then
$$\lim_{n\to\infty} \sum_X f_n = \sum_X f.$$
Proof. We will give two proofs. For the first proof, let $\mathcal{P}_f(X) = \{A \subset X : A \subset\subset X\}$. Then
$$\lim_{n\to\infty}\sum_X f_n = \sup_n \sum_X f_n = \sup_n \sup_{\alpha\in\mathcal{P}_f(X)} \sum_\alpha f_n = \sup_{\alpha\in\mathcal{P}_f(X)} \sup_n \sum_\alpha f_n = \sup_{\alpha\in\mathcal{P}_f(X)} \lim_{n\to\infty} \sum_\alpha f_n = \sup_{\alpha\in\mathcal{P}_f(X)} \sum_\alpha \lim_{n\to\infty} f_n = \sup_{\alpha\in\mathcal{P}_f(X)} \sum_\alpha f = \sum_X f.$$
(Second Proof.) Let $S_n = \sum_X f_n$ and $S = \sum_X f$. Since $f_n \leq f_m \leq f$ for all $n \leq m$, it follows that
$$S_n \leq S_m \leq S$$
which shows that $\lim_{n\to\infty} S_n$ exists and is less than $S$, i.e.
$$A := \lim_{n\to\infty} \sum_X f_n \leq \sum_X f. \qquad (2.7)$$
Noting that $\sum_\alpha f_n \leq \sum_X f_n = S_n \leq A$ for all $\alpha \subset\subset X$ and in particular,
$$\sum_\alpha f_n \leq A \text{ for all } n \text{ and } \alpha \subset\subset X.$$
Letting $n$ tend to infinity in this equation shows that
$$\sum_\alpha f \leq A \text{ for all } \alpha \subset\subset X$$
and then taking the sup over all $\alpha \subset\subset X$ gives
$$\sum_X f \leq A = \lim_{n\to\infty} \sum_X f_n \qquad (2.8)$$
which combined with Eq. (2.7) proves the theorem.
Lemma 2.12 (Fatou's Lemma for Sums). Suppose that $f_n : X \to [0,\infty]$ is a sequence of functions, then
$$\sum_X \liminf_{n\to\infty} f_n \leq \liminf_{n\to\infty} \sum_X f_n.$$

Proof. Define $g_k := \inf_{n\geq k} f_n$ so that $g_k \uparrow \liminf_{n\to\infty} f_n$ as $k \to \infty$. Since $g_k \leq f_n$ for all $n \geq k$,
$$\sum_X g_k \leq \sum_X f_n \text{ for all } n \geq k$$
and therefore
$$\sum_X g_k \leq \liminf_{n\to\infty} \sum_X f_n \text{ for all } k.$$
We may now use the monotone convergence theorem to let $k \to \infty$ to find
$$\sum_X \liminf_{n\to\infty} f_n = \sum_X \lim_{k\to\infty} g_k \overset{\text{MCT}}{=} \lim_{k\to\infty} \sum_X g_k \leq \liminf_{n\to\infty} \sum_X f_n.$$
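The inequality may be strict: on $X = \mathbb{N}$ let $f_n = 1_{\{n\}}$, so $f_n(x) = 1$ if $x = n$ and $0$ otherwise. Then $\liminf_{n\to\infty} f_n \equiv 0$ while $\sum_X f_n = 1$ for every $n$, so the left side is $0$ and the right side is $1$.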
Remark 2.13. If $A = \sum_X a < \infty$, then for all $\epsilon > 0$ there exists $\alpha_\epsilon \subset\subset X$ such that
$$A \geq \sum_\alpha a \geq A - \epsilon$$
for all $\alpha \subset\subset X$ containing $\alpha_\epsilon$, or equivalently,
$$\Big|A - \sum_\alpha a\Big| \leq \epsilon \qquad (2.9)$$
for all $\alpha \subset\subset X$ containing $\alpha_\epsilon$. Indeed, choose $\alpha_\epsilon$ so that $\sum_{\alpha_\epsilon} a \geq A - \epsilon$.
2.4. Sums of complex functions.

Definition 2.14. Suppose that $a : X \to \mathbb{C}$ is a function, we say that
$$\sum_X a = \sum_{x\in X} a(x)$$
exists and is equal to $A \in \mathbb{C}$, if for all $\epsilon > 0$ there is a finite subset $\alpha_\epsilon \subset X$ such that for all $\alpha \subset\subset X$ containing $\alpha_\epsilon$ we have
$$\Big|A - \sum_\alpha a\Big| \leq \epsilon.$$
The following lemma is left as an exercise to the reader.

Lemma 2.15. Suppose that $a, b : X \to \mathbb{C}$ are two functions such that $\sum_X a$ and $\sum_X b$ exist, then $\sum_X (a + \lambda b)$ exists for all $\lambda \in \mathbb{C}$ and
$$\sum_X (a + \lambda b) = \sum_X a + \lambda \sum_X b.$$
Definition 2.16 (Summable). We call a function $a : X \to \mathbb{C}$ summable if
$$\sum_X |a| < \infty.$$
Proposition 2.17. Let $a : X \to \mathbb{C}$ be a function, then $\sum_X a$ exists iff $\sum_X |a| < \infty$, i.e. iff $a$ is summable.

Proof. If $\sum_X |a| < \infty$, then $\sum_X (\operatorname{Re} a)^\pm < \infty$ and $\sum_X (\operatorname{Im} a)^\pm < \infty$ and hence by Remark 2.13 these sums exist in the sense of Definition 2.14. Therefore by Lemma 2.15, $\sum_X a$ exists and
$$\sum_X a = \sum_X (\operatorname{Re} a)^+ - \sum_X (\operatorname{Re} a)^- + i\Big(\sum_X (\operatorname{Im} a)^+ - \sum_X (\operatorname{Im} a)^-\Big).$$
Conversely, if $\sum_X |a| = \infty$ then, because $|a| \leq |\operatorname{Re} a| + |\operatorname{Im} a|$, we must have
$$\sum_X |\operatorname{Re} a| = \infty \quad\text{or}\quad \sum_X |\operatorname{Im} a| = \infty.$$
Thus it suffices to consider the case where $a : X \to \mathbb{R}$ is a real function. Write $a = a^+ - a^-$ where
$$a^+(x) = \max(a(x), 0) \quad\text{and}\quad a^-(x) = \max(-a(x), 0). \qquad (2.10)$$
Then $|a| = a^+ + a^-$ and
$$\infty = \sum_X |a| = \sum_X a^+ + \sum_X a^-$$
which shows that either $\sum_X a^+ = \infty$ or $\sum_X a^- = \infty$. Suppose, without loss of generality, that $\sum_X a^+ = \infty$. Let $X_0 := \{x\in X : a(x) \geq 0\}$, then we know that $\sum_{X_0} a = \infty$ which means there are finite subsets $\alpha_n \subset X_0 \subset X$ such that $\sum_{\alpha_n} a \geq n$ for all $n$. Thus if $\alpha \subset\subset X$ is any finite set, it follows that $\lim_{n\to\infty} \sum_{\alpha\cup\alpha_n} a = \infty$, and therefore $\sum_X a$ can not exist as a number in $\mathbb{R}$.
Remark 2.18. Suppose that $X = \mathbb{N}$ and $a : \mathbb{N} \to \mathbb{C}$ is a sequence, then it is not necessarily true that
$$\sum_{n=1}^\infty a(n) = \sum_{n\in\mathbb{N}} a(n). \qquad (2.11)$$
This is because
$$\sum_{n=1}^\infty a(n) = \lim_{N\to\infty} \sum_{n=1}^N a(n)$$
depends on the ordering of the sequence $a$ whereas $\sum_{n\in\mathbb{N}} a(n)$ does not. For example, take $a(n) = (-1)^n/n$; then $\sum_{n\in\mathbb{N}} |a(n)| = \infty$, i.e. $\sum_{n\in\mathbb{N}} a(n)$ does not exist, while $\sum_{n=1}^\infty a(n)$ does exist. On the other hand, if
$$\sum_{n\in\mathbb{N}} |a(n)| = \sum_{n=1}^\infty |a(n)| < \infty$$
then Eq. (2.11) is valid.
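In fact, by Riemann's rearrangement theorem the conditionally convergent series $\sum_{n=1}^\infty (-1)^n/n = -\ln 2$ can be rearranged so as to converge to any prescribed value in $\bar{\mathbb{R}}$, which is exactly why the order-independent sum $\sum_{n\in\mathbb{N}} a(n)$ cannot exist in this case.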
Theorem 2.19 (Dominated Convergence Theorem for Sums). Suppose that $f_n : X \to \mathbb{C}$ is a sequence of functions on $X$ such that $f(x) = \lim_{n\to\infty} f_n(x) \in \mathbb{C}$ exists for all $x\in X$. Further assume there is a dominating function $g : X \to [0,\infty)$ such that
$$|f_n(x)| \leq g(x) \text{ for all } x\in X \text{ and } n\in\mathbb{N} \qquad (2.12)$$
and that $g$ is summable. Then
$$\lim_{n\to\infty} \sum_{x\in X} f_n(x) = \sum_{x\in X} f(x). \qquad (2.13)$$
Proof. Notice that $|f| = \lim |f_n| \leq g$ so that $f$ is summable. By considering the real and imaginary parts of $f$ separately, it suffices to prove the theorem in the case where $f$ is real. By Fatou's Lemma,
$$\sum_X (g \pm f) = \sum_X \liminf_{n\to\infty} (g \pm f_n) \leq \liminf_{n\to\infty} \sum_X (g \pm f_n) = \sum_X g + \liminf_{n\to\infty}\Big(\pm \sum_X f_n\Big).$$
Since $\liminf_{n\to\infty}(-a_n) = -\limsup_{n\to\infty} a_n$, we have shown
$$\sum_X g \pm \sum_X f \leq \sum_X g + \begin{cases} \liminf_{n\to\infty} \sum_X f_n \\ -\limsup_{n\to\infty} \sum_X f_n \end{cases}$$
and therefore
$$\limsup_{n\to\infty} \sum_X f_n \leq \sum_X f \leq \liminf_{n\to\infty} \sum_X f_n.$$
This shows that $\lim_{n\to\infty} \sum_X f_n$ exists and is equal to $\sum_X f$.
Proof. (Second Proof.) Passing to the limit in Eq. (2.12) shows that $|f| \leq g$ and in particular that $f$ is summable. Given $\epsilon > 0$, let $\alpha \subset\subset X$ be such that
$$\sum_{X\setminus\alpha} g \leq \epsilon.$$
Then for $\beta \subset\subset X$ such that $\alpha \subset \beta$,
$$\Big|\sum_\beta f - \sum_\beta f_n\Big| = \Big|\sum_\beta (f - f_n)\Big| \leq \sum_\beta |f - f_n| = \sum_\alpha |f - f_n| + \sum_{\beta\setminus\alpha} |f - f_n| \leq \sum_\alpha |f - f_n| + 2\sum_{\beta\setminus\alpha} g \leq \sum_\alpha |f - f_n| + 2\epsilon,$$
and hence
$$\Big|\sum_\beta f - \sum_\beta f_n\Big| \leq \sum_\alpha |f - f_n| + 2\epsilon.$$
Since this last equation is true for all such $\beta \subset\subset X$, we learn that
$$\Big|\sum_X f - \sum_X f_n\Big| \leq \sum_\alpha |f - f_n| + 2\epsilon$$
which then implies that
$$\limsup_{n\to\infty} \Big|\sum_X f - \sum_X f_n\Big| \leq \limsup_{n\to\infty} \sum_\alpha |f - f_n| + 2\epsilon = 2\epsilon.$$
Because $\epsilon > 0$ is arbitrary we conclude that
$$\limsup_{n\to\infty} \Big|\sum_X f - \sum_X f_n\Big| = 0,$$
which is the same as Eq. (2.13).
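Some dominating function is genuinely needed here: on $X = \mathbb{N}$ the sequence $f_n = 1_{\{n\}}$ converges to $0$ pointwise while $\sum_X f_n = 1$ for all $n$, so Eq. (2.13) fails; indeed any $g$ with $|f_n| \leq g$ for all $n$ satisfies $g \geq 1$ everywhere and hence is not summable.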
2.5. Iterated sums. Let $X$ and $Y$ be two sets. The proof of the following lemma is left to the reader.

Lemma 2.20. Suppose that $a : X \to \mathbb{C}$ is a function and $F \subset X$ is a subset such that $a(x) = 0$ for all $x \notin F$. Show that $\sum_F a$ exists iff $\sum_X a$ exists, and if the sums exist then
$$\sum_X a = \sum_F a.$$
Theorem 2.21 (Tonelli's Theorem for Sums). Suppose that $a : X\times Y \to [0,\infty]$, then
$$\sum_{X\times Y} a = \sum_X \sum_Y a = \sum_Y \sum_X a.$$
Proof. It suffices to show, by symmetry, that
$$\sum_{X\times Y} a = \sum_X \sum_Y a.$$
Let $\Lambda \subset\subset X\times Y$. Then for any $\alpha \subset\subset X$ and $\beta \subset\subset Y$ such that $\Lambda \subset \alpha\times\beta$, we have
$$\sum_\Lambda a \leq \sum_{\alpha\times\beta} a = \sum_\alpha \sum_\beta a \leq \sum_\alpha \sum_Y a \leq \sum_X \sum_Y a,$$
i.e. $\sum_\Lambda a \leq \sum_X \sum_Y a$. Taking the sup over $\Lambda$ in this last equation shows
$$\sum_{X\times Y} a \leq \sum_X \sum_Y a.$$
We must now show the opposite inequality. If $\sum_{X\times Y} a = \infty$ we are done, so we now assume that $a$ is summable. By Remark 2.8, there is a countable set $\{(x'_n, y'_n)\}_{n=1}^\infty \subset X\times Y$ off of which $a$ is identically $0$.
Let $\{y_n\}_{n=1}^\infty$ be an enumeration of $\{y'_n\}_{n=1}^\infty$; then since $a(x,y) = 0$ if $y \notin \{y_n\}_{n=1}^\infty$, $\sum_{y\in Y} a(x,y) = \sum_{n=1}^\infty a(x, y_n)$ for all $x\in X$. Hence
$$\sum_{x\in X}\sum_{y\in Y} a(x,y) = \sum_{x\in X}\sum_{n=1}^\infty a(x, y_n) = \sum_{x\in X}\lim_{N\to\infty}\sum_{n=1}^N a(x, y_n) = \lim_{N\to\infty}\sum_{x\in X}\sum_{n=1}^N a(x, y_n), \qquad (2.14)$$
wherein the last equality we have used the monotone convergence theorem with $F_N(x) := \sum_{n=1}^N a(x, y_n)$. If $\alpha \subset\subset X$, then
$$\sum_{x\in\alpha}\sum_{n=1}^N a(x, y_n) = \sum_{\alpha\times\{y_n\}_{n=1}^N} a \leq \sum_{X\times Y} a$$
and therefore,
$$\lim_{N\to\infty}\sum_{x\in X}\sum_{n=1}^N a(x, y_n) \leq \sum_{X\times Y} a. \qquad (2.15)$$
Hence it follows from Eqs. (2.14) and (2.15) that
$$\sum_{x\in X}\sum_{y\in Y} a(x,y) \leq \sum_{X\times Y} a \qquad (2.16)$$
as desired.
Alternative proof of Eq. (2.16). Let $A = \{x'_n : n\in\mathbb{N}\}$ and let $\{x_n\}_{n=1}^\infty$ be an enumeration of $A$. Then for $x \notin A$, $a(x,y) = 0$ for all $y\in Y$.
Given $\epsilon > 0$, let $\delta : X \to [0,\infty)$ be a function such that $\sum_X \delta = \epsilon$ and $\delta(x) > 0$ for $x\in A$. (For example we may define $\delta$ by $\delta(x_n) = \epsilon/2^n$ for all $n$ and $\delta(x) = 0$ if $x \notin A$.) For each $x\in X$, let $\beta_x \subset\subset Y$ be a finite set such that
$$\sum_{y\in Y} a(x,y) \leq \sum_{y\in\beta_x} a(x,y) + \delta(x).$$
Then
$$\sum_X \sum_Y a \leq \sum_{x\in X}\sum_{y\in\beta_x} a(x,y) + \sum_{x\in X}\delta(x) = \sum_{x\in X}\sum_{y\in\beta_x} a(x,y) + \epsilon = \sup_{\alpha\subset\subset X}\sum_{x\in\alpha}\sum_{y\in\beta_x} a(x,y) + \epsilon \leq \sum_{X\times Y} a + \epsilon, \qquad (2.17)$$
wherein the last inequality we have used
$$\sum_{x\in\alpha}\sum_{y\in\beta_x} a(x,y) = \sum_{\Lambda_\alpha} a \leq \sum_{X\times Y} a$$
with $\Lambda_\alpha := \{(x,y)\in X\times Y : x\in\alpha \text{ and } y\in\beta_x\} \subset\subset X\times Y$. Since $\epsilon > 0$ is arbitrary in Eq. (2.17), the proof is complete.
Theorem 2.22 (Fubini's Theorem for Sums). Now suppose that $a : X\times Y \to \mathbb{C}$ is a summable function, i.e. by Theorem 2.21 any one of the following equivalent conditions hold:
(1) $\sum_{X\times Y} |a| < \infty$,
(2) $\sum_X \sum_Y |a| < \infty$, or
(3) $\sum_Y \sum_X |a| < \infty$.
Then
$$\sum_{X\times Y} a = \sum_X \sum_Y a = \sum_Y \sum_X a.$$

Proof. If $a : X\times Y \to \mathbb{R}$ is real valued the theorem follows by applying Theorem 2.21 to $a^\pm$, the positive and negative parts of $a$. The general result holds for complex valued functions $a$ by applying the real version just proved to the real and imaginary parts of $a$.
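Summability cannot be dropped here. For example, on $X = Y = \mathbb{N}$ let $a(x,y) = 1$ if $x = y$, $a(x,y) = -1$ if $x = y+1$ and $a(x,y) = 0$ otherwise. Then $\sum_X \sum_Y a = 1$ while $\sum_Y \sum_X a = 0$; of course $\sum_{X\times Y} |a| = \infty$ in this example.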
2.6. ℓ^p spaces, Minkowski and Hölder Inequalities. In this subsection, let $\mu : X \to (0,\infty]$ be a given function. Let $\mathbb{F}$ denote either $\mathbb{C}$ or $\mathbb{R}$. For $p \in (0,\infty)$ and $f : X \to \mathbb{F}$, let
$$\|f\|_p \equiv \Big(\sum_{x\in X} |f(x)|^p \mu(x)\Big)^{1/p}$$
and for $p = \infty$ let
$$\|f\|_\infty = \sup\{|f(x)| : x\in X\}.$$
Also, for $p > 0$, let
$$\ell^p(\mu) = \{f : X \to \mathbb{F} : \|f\|_p < \infty\}.$$
In the case where $\mu(x) = 1$ for all $x\in X$ we will simply write $\ell^p(X)$ for $\ell^p(\mu)$.
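As a quick numerical illustration (a minimal sketch for finitely supported $f$, not part of the development above), the weighted norm can be computed directly from its definition:

    # Minimal sketch: ||f||_p = (sum_x |f(x)|^p * mu(x))^(1/p) for finitely supported f,
    # with mu a strictly positive weight; f and mu are given as dictionaries.
    def lp_norm(f, mu, p):
        if p == float("inf"):
            return max(abs(v) for v in f.values()) if f else 0.0
        return sum(abs(v) ** p * mu[x] for x, v in f.items()) ** (1.0 / p)

    f = {1: 3.0, 2: -4.0}      # f supported on {1, 2}
    mu = {1: 1.0, 2: 1.0}      # unit weights, i.e. the space l^p(X)
    print(lp_norm(f, mu, 2))   # 5.0, the Euclidean length of (3, -4)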
Definition 2.23. A norm on a vector space $L$ is a function $\|\cdot\| : L \to [0,\infty)$ such that
(1) (Homogeneity) $\|\lambda f\| = |\lambda|\,\|f\|$ for all $\lambda\in\mathbb{F}$ and $f\in L$.
(2) (Triangle inequality) $\|f+g\| \leq \|f\| + \|g\|$ for all $f, g\in L$.
(3) (Positive definite) $\|f\| = 0$ implies $f = 0$.
A pair $(L, \|\cdot\|)$ where $L$ is a vector space and $\|\cdot\|$ is a norm on $L$ is called a normed vector space.
The rest of this section is devoted to the proof of the following theorem.

Theorem 2.24. For $p \in [1,\infty]$, $(\ell^p(\mu), \|\cdot\|_p)$ is a normed vector space.

Proof. The only difficulty is the proof of the triangle inequality which is the content of Minkowski's Inequality proved in Theorem 2.30 below.
2.6.1. Some inequalities.

Proposition 2.25. Let $f : [0,\infty) \to [0,\infty)$ be a continuous strictly increasing function such that $f(0) = 0$ (for simplicity) and $\lim_{s\to\infty} f(s) = \infty$. Let $g = f^{-1}$ and for $s, t \geq 0$ let
$$F(s) = \int_0^s f(s')\,ds' \quad\text{and}\quad G(t) = \int_0^t g(t')\,dt'.$$
Then for all $s, t \geq 0$,
$$st \leq F(s) + G(t)$$
and equality holds iff $t = f(s)$.
Proof. Let
$$A_s := \{(\sigma,\tau) : 0 \leq \tau \leq f(\sigma) \text{ for } 0 \leq \sigma \leq s\} \quad\text{and}\quad B_t := \{(\sigma,\tau) : 0 \leq \sigma \leq g(\tau) \text{ for } 0 \leq \tau \leq t\},$$
then as one sees from Figure 2, $[0,s]\times[0,t] \subset A_s \cup B_t$. (In the figure: $s = 3$, $t = 1$, $A_3$ is the region under $t = f(s)$ for $0 \leq s \leq 3$ and $B_1$ is the region to the left of the curve $s = g(t)$ for $0 \leq t \leq 1$.) Hence if $m$ denotes the area of a region in the plane, then
$$st = m([0,s]\times[0,t]) \leq m(A_s) + m(B_t) = F(s) + G(t).$$
As it stands, this proof is a bit on the intuitive side. However, it will become rigorous if one takes $m$ to be Lebesgue measure on the plane, which will be introduced later.
We can also give a calculus proof of this theorem under the additional assumption that $f$ is $C^1$. (This restricted version of the theorem is all we need in this section.) To do this fix $t \geq 0$ and let
$$h(s) = st - F(s) = \int_0^s (t - f(\sigma))\,d\sigma.$$
If $\sigma > g(t) = f^{-1}(t)$, then $t - f(\sigma) < 0$ and hence if $s > g(t)$, we have
$$h(s) = \int_0^s (t - f(\sigma))\,d\sigma = \int_0^{g(t)} (t - f(\sigma))\,d\sigma + \int_{g(t)}^s (t - f(\sigma))\,d\sigma \leq \int_0^{g(t)} (t - f(\sigma))\,d\sigma = h(g(t)).$$
Combining this with $h(0) = 0$ we see that $h(s)$ takes its maximum at some point $s \in (0, g(t)]$ and hence at a point where $0 = h'(s) = t - f(s)$. The only solution to this equation is $s = g(t)$ and we have thus shown
$$st - F(s) = h(s) \leq \int_0^{g(t)} (t - f(\sigma))\,d\sigma = h(g(t))$$
with equality when $s = g(t)$. To finish the proof we must show $\int_0^{g(t)} (t - f(\sigma))\,d\sigma = G(t)$. This is verified by making the change of variables $\sigma = g(\tau)$ and then integrating by parts as follows:
$$\int_0^{g(t)} (t - f(\sigma))\,d\sigma = \int_0^t (t - f(g(\tau)))\,g'(\tau)\,d\tau = \int_0^t (t - \tau)\,g'(\tau)\,d\tau = \int_0^t g(\tau)\,d\tau = G(t).$$
Definition 2.26. The conjugate exponent $q \in [1,\infty]$ to $p \in [1,\infty]$ is $q := \frac{p}{p-1}$ with the convention that $q = \infty$ if $p = 1$. Notice that $q$ is characterized by any of the following identities:
$$\frac{1}{p} + \frac{1}{q} = 1, \quad 1 + \frac{q}{p} = q, \quad p - \frac{p}{q} = 1 \quad\text{and}\quad q(p-1) = p. \qquad (2.18)$$
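For example, $p = 3$ has conjugate exponent $q = 3/2$: indeed $\frac{1}{3} + \frac{2}{3} = 1$ and $q(p-1) = \frac{3}{2}\cdot 2 = 3 = p$, while $p = 2$ is its own conjugate.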
Figure 2. A picture proof of Proposition 2.25.
Lemma 2.27. Let $p \in (1,\infty)$ and $q := \frac{p}{p-1} \in (1,\infty)$ be the conjugate exponent. Then
$$st \leq \frac{s^q}{q} + \frac{t^p}{p} \text{ for all } s, t \geq 0$$
with equality if and only if $s^q = t^p$.

Proof. Let $F(s) = \frac{s^p}{p}$ for $p > 1$. Then $f(s) = s^{p-1} = t$ and $g(t) = t^{\frac{1}{p-1}} = t^{q-1}$, wherein we have used $q - 1 = p/(p-1) - 1 = 1/(p-1)$. Therefore $G(t) = t^q/q$ and hence by Proposition 2.25,
$$st \leq \frac{s^p}{p} + \frac{t^q}{q}$$
with equality iff $t = s^{p-1}$.
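As a numerical check, with $p = q = 2$, $s = 4$ and $t = 2$ one gets $st = 8 \leq \frac{s^2}{2} + \frac{t^2}{2} = 8 + 2 = 10$, and equality holds exactly when $s = t$ (e.g. $s = t = 2$ gives $4 = 2 + 2$).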
Theorem 2.28 (Hölder's inequality). Let $p, q \in [1,\infty]$ be conjugate exponents. For all $f, g : X \to \mathbb{F}$,
$$\|fg\|_1 \leq \|f\|_p \cdot \|g\|_q. \qquad (2.19)$$
If $p \in (1,\infty)$, then equality holds in Eq. (2.19) iff
$$\Big(\frac{|f|}{\|f\|_p}\Big)^p = \Big(\frac{|g|}{\|g\|_q}\Big)^q.$$

Proof. The proof of Eq. (2.19) for $p \in \{1,\infty\}$ is easy and will be left to the reader. The cases where $\|f\|_p = 0$ or $\infty$ or $\|g\|_q = 0$ or $\infty$ are easily dealt with and are also left to the reader. So we will assume that $p \in (1,\infty)$ and $0 < \|f\|_p, \|g\|_q < \infty$. Letting $s = |f|/\|f\|_p$ and $t = |g|/\|g\|_q$ in Lemma 2.27 implies
$$\frac{|fg|}{\|f\|_p \|g\|_q} \leq \frac{1}{p}\frac{|f|^p}{\|f\|_p^p} + \frac{1}{q}\frac{|g|^q}{\|g\|_q^q}.$$
Multiplying this equation by $\mu$ and then summing gives
$$\frac{\|fg\|_1}{\|f\|_p \|g\|_q} \leq \frac{1}{p} + \frac{1}{q} = 1$$
with equality iff
$$\frac{|g|}{\|g\|_q} = \frac{|f|^{p-1}}{\|f\|_p^{p-1}} \iff \frac{|g|}{\|g\|_q} = \frac{|f|^{p/q}}{\|f\|_p^{p/q}} \iff |g|^q \|f\|_p^p = \|g\|_q^q |f|^p.$$
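The special case $p = q = 2$ is the Cauchy-Schwarz inequality: $\sum_{x\in X} |f(x)g(x)|\,\mu(x) \leq \big(\sum_{x\in X} |f(x)|^2 \mu(x)\big)^{1/2}\big(\sum_{x\in X} |g(x)|^2 \mu(x)\big)^{1/2}$.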
Definition 2.29. For a complex number $\lambda \in \mathbb{C}$, let
$$\operatorname{sgn}(\lambda) = \begin{cases} \frac{\lambda}{|\lambda|} & \text{if } \lambda \neq 0, \\ 0 & \text{if } \lambda = 0. \end{cases}$$
Theorem 2.30 (Minkowski's Inequality). If $1 \leq p \leq \infty$ and $f, g \in \ell^p(\mu)$ then
$$\|f+g\|_p \leq \|f\|_p + \|g\|_p,$$
with equality iff
$$\operatorname{sgn}(f) = \operatorname{sgn}(g) \text{ when } p = 1 \quad\text{and}\quad f = cg \text{ for some } c > 0 \text{ when } p \in (1,\infty).$$

Proof. For $p = 1$,
$$\|f+g\|_1 = \sum_X |f+g|\,\mu \leq \sum_X (|f|+|g|)\,\mu = \sum_X |f|\,\mu + \sum_X |g|\,\mu$$
with equality iff
$$|f| + |g| = |f+g| \iff \operatorname{sgn}(f) = \operatorname{sgn}(g).$$
For $p = \infty$,
$$\|f+g\|_\infty = \sup_X |f+g| \leq \sup_X (|f|+|g|) \leq \sup_X |f| + \sup_X |g| = \|f\|_\infty + \|g\|_\infty.$$
Now assume that $p \in (1,\infty)$. Since
$$|f+g|^p \leq (2\max(|f|,|g|))^p = 2^p \max(|f|^p,|g|^p) \leq 2^p(|f|^p + |g|^p)$$
it follows that
$$\|f+g\|_p^p \leq 2^p\big(\|f\|_p^p + \|g\|_p^p\big) < \infty.$$
The theorem is easily verified if $\|f+g\|_p = 0$, so we may assume $\|f+g\|_p > 0$. Now
$$|f+g|^p = |f+g|\,|f+g|^{p-1} \leq (|f|+|g|)\,|f+g|^{p-1} \qquad (2.20)$$
with equality iff $\operatorname{sgn}(f) = \operatorname{sgn}(g)$. Multiplying Eq. (2.20) by $\mu$ and then summing and applying Hölder's inequality gives
$$\sum_X |f+g|^p\,\mu \leq \sum_X |f|\,|f+g|^{p-1}\,\mu + \sum_X |g|\,|f+g|^{p-1}\,\mu \leq (\|f\|_p + \|g\|_p)\,\big\||f+g|^{p-1}\big\|_q \qquad (2.21)$$
with equality iff
$$\Big(\frac{|f|}{\|f\|_p}\Big)^p = \Big(\frac{|f+g|^{p-1}}{\||f+g|^{p-1}\|_q}\Big)^q = \Big(\frac{|g|}{\|g\|_p}\Big)^p \quad\text{and}\quad \operatorname{sgn}(f) = \operatorname{sgn}(g).$$
By Eq. (2.18), $q(p-1) = p$, and hence
$$\big\||f+g|^{p-1}\big\|_q^q = \sum_X (|f+g|^{p-1})^q\,\mu = \sum_X |f+g|^p\,\mu. \qquad (2.22)$$
Combining Eqs. (2.21) and (2.22) implies
$$\|f+g\|_p^p \leq \|f\|_p \|f+g\|_p^{p/q} + \|g\|_p \|f+g\|_p^{p/q} \qquad (2.23)$$
with equality iff
$$\operatorname{sgn}(f) = \operatorname{sgn}(g) \quad\text{and}\quad \Big(\frac{|f|}{\|f\|_p}\Big)^p = \frac{|f+g|^p}{\|f+g\|_p^p} = \Big(\frac{|g|}{\|g\|_p}\Big)^p. \qquad (2.24)$$
Solving for $\|f+g\|_p$ in Eq. (2.23) with the aid of Eq. (2.18) shows that $\|f+g\|_p \leq \|f\|_p + \|g\|_p$ with equality iff Eq. (2.24) holds, which happens iff $f = cg$ with $c > 0$.
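The restriction $p \geq 1$ matters: for $0 < p < 1$ the triangle inequality fails. For example, on $X = \{1,2\}$ with $\mu \equiv 1$, $f = (1,0)$ and $g = (0,1)$, one has $\|f+g\|_{1/2} = (1+1)^2 = 4$ while $\|f\|_{1/2} + \|g\|_{1/2} = 2$.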
2.7. Exercises.
2.7.1. Set Theory. Let $f : X \to Y$ be a function and $\{A_i\}_{i\in I}$ be an indexed family of subsets of $Y$; verify the following assertions.

Exercise 2.1. $\big(\bigcup_{i\in I} A_i\big)^c = \bigcap_{i\in I} A_i^c$.

Exercise 2.2. Suppose that $B \subset Y$, show that $B \setminus (\bigcup_{i\in I} A_i) = \bigcap_{i\in I}(B \setminus A_i)$.

Exercise 2.3. $f^{-1}(\bigcup_{i\in I} A_i) = \bigcup_{i\in I} f^{-1}(A_i)$.

Exercise 2.4. $f^{-1}(\bigcap_{i\in I} A_i) = \bigcap_{i\in I} f^{-1}(A_i)$.

Exercise 2.5. Find a counterexample which shows that $f(C \cap D) = f(C) \cap f(D)$ need not hold.

Exercise 2.6. Now suppose for each $n \in \mathbb{N} \equiv \{1,2,\dots\}$ that $f_n : X \to \mathbb{R}$ is a function. Let
$$D \equiv \{x\in X : \lim_{n\to\infty} f_n(x) = +\infty\};$$
show that
$$D = \bigcap_{M=1}^\infty \bigcup_{N=1}^\infty \bigcap_{n\geq N} \{x\in X : f_n(x) \geq M\}. \qquad (2.25)$$

Exercise 2.7. Let $f_n : X \to \mathbb{R}$ be as in the last problem. Let
$$C \equiv \{x\in X : \lim_{n\to\infty} f_n(x) \text{ exists in } \mathbb{R}\}.$$
Find an expression for $C$ similar to the expression for $D$ in (2.25). (Hint: use the Cauchy criteria for convergence.)
2.7.2. Limit Problems.
Exercise 2.8. Prove Lemma 2.15.
Exercise 2.9. Prove Lemma 2.20.
Let $\{a_n\}_{n=1}^\infty$ and $\{b_n\}_{n=1}^\infty$ be two sequences of real numbers.
Exercise 2.10. Show $\liminf_{n\to\infty}(-a_n) = -\limsup_{n\to\infty} a_n$.

Exercise 2.11. Suppose that $\limsup_{n\to\infty} a_n = M \in \bar{\mathbb{R}}$, show that there is a subsequence $\{a_{n_k}\}_{k=1}^\infty$ of $\{a_n\}_{n=1}^\infty$ such that $\lim_{k\to\infty} a_{n_k} = M$.
Exercise 2.12. Show that
$$\limsup_{n\to\infty}(a_n + b_n) \leq \limsup_{n\to\infty} a_n + \limsup_{n\to\infty} b_n \qquad (2.26)$$
provided that the right side of Eq. (2.26) is well defined, i.e. no $\infty - \infty$ or $-\infty + \infty$ type expressions. (It is OK to have $\infty + \infty = \infty$ or $-\infty - \infty = -\infty$, etc.)

Exercise 2.13. Suppose that $a_n \geq 0$ and $b_n \geq 0$ for all $n\in\mathbb{N}$. Show
$$\limsup_{n\to\infty}(a_n b_n) \leq \limsup_{n\to\infty} a_n \cdot \limsup_{n\to\infty} b_n, \qquad (2.27)$$
provided the right hand side of (2.27) is not of the form $0\cdot\infty$ or $\infty\cdot 0$.
2.7.3. Dominated Convergence Theorem Problems.

Notation 2.31. For $u_0 \in \mathbb{R}^n$ and $\delta > 0$, let $B_{u_0}(\delta) := \{x\in\mathbb{R}^n : |x - u_0| < \delta\}$ be the ball in $\mathbb{R}^n$ centered at $u_0$ with radius $\delta$.

Exercise 2.14. Suppose $U \subset \mathbb{R}^n$ is a set and $u_0 \in U$ is a point such that $U \cap (B_{u_0}(\delta)\setminus\{u_0\}) \neq \emptyset$ for all $\delta > 0$. Let $G : U\setminus\{u_0\} \to \mathbb{C}$ be a function on $U\setminus\{u_0\}$. Show that $\lim_{u\to u_0} G(u)$ exists and is equal to $\lambda \in \mathbb{C}$,[1] iff for all sequences $\{u_n\}_{n=1}^\infty \subset U\setminus\{u_0\}$ which converge to $u_0$ (i.e. $\lim_{n\to\infty} u_n = u_0$) we have $\lim_{n\to\infty} G(u_n) = \lambda$.
Exercise 2.15. Suppose that $Y$ is a set, $U \subset \mathbb{R}^n$ is a set, and $f : U\times Y \to \mathbb{C}$ is a function satisfying:
(1) For each $y\in Y$, the function $u \in U \to f(u,y)$ is continuous on $U$.[2]
(2) There is a summable function $g : Y \to [0,\infty)$ such that
$$|f(u,y)| \leq g(y) \text{ for all } y\in Y \text{ and } u\in U.$$
Show that
$$F(u) := \sum_{y\in Y} f(u,y) \qquad (2.28)$$
is a continuous function for $u\in U$.

Exercise 2.16. Suppose that $Y$ is a set, $J = (a,b) \subset \mathbb{R}$ is an interval, and $f : J\times Y \to \mathbb{C}$ is a function satisfying:
(1) For each $y\in Y$, the function $u \to f(u,y)$ is differentiable on $J$,
(2) There is a summable function $g : Y \to [0,\infty)$ such that
$$\Big|\frac{\partial}{\partial u} f(u,y)\Big| \leq g(y) \text{ for all } y\in Y,$$
(3) There is a $u_0 \in J$ such that $\sum_{y\in Y} |f(u_0,y)| < \infty$.
Show:
a) for all $u\in J$ that $\sum_{y\in Y} |f(u,y)| < \infty$.
b) Let $F(u) := \sum_{y\in Y} f(u,y)$, show $F$ is differentiable on $J$ and that
$$F'(u) = \sum_{y\in Y} \frac{\partial}{\partial u} f(u,y).$$
(Hint: Use the mean value theorem.)

[1] More explicitly, $\lim_{u\to u_0} G(u) = \lambda$ means for every $\epsilon > 0$ there exists a $\delta > 0$ such that $|G(u) - \lambda| < \epsilon$ whenever $u \in U \cap (B_{u_0}(\delta)\setminus\{u_0\})$.
[2] To say $g := f(\cdot, y)$ is continuous on $U$ means that $g : U \to \mathbb{C}$ is continuous relative to the metric on $\mathbb{R}^n$ restricted to $U$.
Exercise 2.17 (Differentiation of Power Series). Suppose $R > 0$ and $\{a_n\}_{n=0}^\infty$ is a sequence of complex numbers such that $\sum_{n=0}^\infty |a_n| r^n < \infty$ for all $r \in (0,R)$. Show, using Exercise 2.16, that $f(x) := \sum_{n=0}^\infty a_n x^n$ is continuously differentiable for $x \in (-R,R)$ and
$$f'(x) = \sum_{n=0}^\infty n a_n x^{n-1} = \sum_{n=1}^\infty n a_n x^{n-1}.$$
Exercise 2.18. Let $\{a_n\}_{n=-\infty}^\infty$ be a summable sequence of complex numbers, i.e. $\sum_{n=-\infty}^\infty |a_n| < \infty$. For $t \geq 0$ and $x \in \mathbb{R}$, define
$$F(t,x) = \sum_{n=-\infty}^\infty a_n e^{-tn^2} e^{inx},$$
where as usual $e^{ix} = \cos(x) + i\sin(x)$. Prove the following facts about $F$:
(1) $F(t,x)$ is continuous for $(t,x) \in [0,\infty)\times\mathbb{R}$. Hint: Let $Y = \mathbb{Z}$ and $u = (t,x)$ and use Exercise 2.15.
(2) $\partial F(t,x)/\partial t$, $\partial F(t,x)/\partial x$ and $\partial^2 F(t,x)/\partial x^2$ exist for $t > 0$ and $x \in \mathbb{R}$. Hint: Let $Y = \mathbb{Z}$ and $u = t$ for computing $\partial F(t,x)/\partial t$ and $u = x$ for computing $\partial F(t,x)/\partial x$ and $\partial^2 F(t,x)/\partial x^2$. See Exercise 2.16.
(3) $F$ satisfies the heat equation, namely
$$\partial F(t,x)/\partial t = \partial^2 F(t,x)/\partial x^2 \text{ for } t > 0 \text{ and } x \in \mathbb{R}.$$
2.7.4. Inequalities.

Exercise 2.19. Generalize Proposition 2.25 as follows. Let $a \in [-\infty, 0]$ and $f : \mathbb{R}\cap[a,\infty) \to [0,\infty)$ be a continuous strictly increasing function such that $\lim_{s\to\infty} f(s) = \infty$, $f(a) = 0$ if $a > -\infty$ or $\lim_{s\to-\infty} f(s) = 0$ if $a = -\infty$. Also let $g = f^{-1}$, $b = f(0) \geq 0$,
$$F(s) = \int_0^s f(s')\,ds' \quad\text{and}\quad G(t) = \int_0^t g(t')\,dt'.$$
Then for all $s, t \geq 0$,
$$st \leq F(s) + G(t \vee b) \leq F(s) + G(t)$$
and equality holds iff $t = f(s)$. In particular, taking $f(s) = e^s$, prove Young's inequality stating
$$st \leq e^s + (t\vee 1)\ln(t\vee 1) - (t\vee 1) \leq e^s + t\ln t - t.$$
Hint: Refer to the following pictures.
Figure 3. Comparing areas when $t \geq b$ goes the same way as in the text.
Figure 4. When $t \leq b$, notice that $g(t) \leq 0$ and hence $G(t) \leq 0$. Also notice that $G(t)$ is no longer needed to estimate $st$.
3. Metric, Banach and Topological Spaces
3.1. Basic metric space notions.
Definition 3.1. A function $d : X\times X \to [0,\infty)$ is called a metric if
(1) (Symmetry) $d(x,y) = d(y,x)$ for all $x, y \in X$,
(2) (Non-degenerate) $d(x,y) = 0$ if and only if $x = y \in X$,
(3) (Triangle inequality) $d(x,z) \leq d(x,y) + d(y,z)$ for all $x, y, z \in X$.
As primary examples, any normed space $(X, \|\cdot\|)$ is a metric space with $d(x,y) := \|x - y\|$. Thus the space $\ell^p(\mu)$ is a metric space for all $p \in [1,\infty]$. Also any subset of a metric space is a metric space. For example a surface $\Sigma$ in $\mathbb{R}^3$ is a metric space with the distance between two points on $\Sigma$ being the usual distance in $\mathbb{R}^3$.

Definition 3.2. Let $(X,d)$ be a metric space. The open ball $B(x,\delta) \subset X$ centered at $x\in X$ with radius $\delta > 0$ is the set
$$B(x,\delta) := \{y\in X : d(x,y) < \delta\}.$$
We will often also write $B(x,\delta)$ as $B_x(\delta)$. We also define the closed ball centered at $x\in X$ with radius $\delta > 0$ as the set $C_x(\delta) := \{y\in X : d(x,y) \leq \delta\}$.
Definition 3.3. A sequence $\{x_n\}_{n=1}^\infty$ in a metric space $(X,d)$ is said to be convergent if there exists a point $x\in X$ such that $\lim_{n\to\infty} d(x, x_n) = 0$. In this case we write $\lim_{n\to\infty} x_n = x$ or $x_n \to x$ as $n\to\infty$.

Exercise 3.1. Show that $x$ in Definition 3.3 is necessarily unique.
Definition 3.4. A set $F \subset X$ is closed iff every convergent sequence $\{x_n\}_{n=1}^\infty$ which is contained in $F$ has its limit back in $F$. A set $V \subset X$ is open iff $V^c$ is closed. We will write $F @ X$ to indicate that $F$ is a closed subset of $X$ and $V \subset_o X$ to indicate that $V$ is an open subset of $X$. We also let $\tau_d$ denote the collection of open subsets of $X$ relative to the metric $d$.
Exercise 3.2. Let $\mathcal{F}$ be a collection of closed subsets of $X$; show $\bigcap\mathcal{F} := \bigcap_{F\in\mathcal{F}} F$ is closed. Also show that finite unions of closed sets are closed, i.e. if $\{F_k\}_{k=1}^n$ are closed sets then $\bigcup_{k=1}^n F_k$ is closed. (By taking complements, this shows that the collection of open sets, $\tau_d$, is closed under finite intersections and arbitrary unions.)
The following continuity facts of the metric d will be used frequently in the
remainder of this book.
Lemma 3.5. For any non-empty subset $A \subset X$, let $d_A(x) \equiv \inf\{d(x,a) \mid a\in A\}$, then
$$|d_A(x) - d_A(y)| \leq d(x,y) \quad \forall\, x, y \in X. \qquad (3.1)$$
Moreover the set $F_\epsilon \equiv \{x\in X \mid d_A(x) \geq \epsilon\}$ is closed in $X$.

Proof. Let $a\in A$ and $x, y\in X$, then
$$d(x,a) \leq d(x,y) + d(y,a).$$
Taking the inf over $a$ in the above equation shows that
$$d_A(x) \leq d(x,y) + d_A(y) \quad \forall\, x, y \in X.$$
Therefore, $d_A(x) - d_A(y) \leq d(x,y)$ and by interchanging $x$ and $y$ we also have that $d_A(y) - d_A(x) \leq d(x,y)$, which implies Eq. (3.1). Now suppose that $\{x_n\}_{n=1}^\infty \subset F_\epsilon$ is a convergent sequence and $x = \lim_{n\to\infty} x_n \in X$. By Eq. (3.1),
$$\epsilon - d_A(x) \leq d_A(x_n) - d_A(x) \leq d(x, x_n) \to 0 \text{ as } n \to \infty,$$
so that $\epsilon \leq d_A(x)$. This shows that $x \in F_\epsilon$ and hence $F_\epsilon$ is closed.
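A small numerical sketch (illustrative only, using a finite point set $A$ in the plane and the Euclidean metric) of the distance function $d_A$ and the Lipschitz bound (3.1):

    import math

    def dist(p, q):
        # Euclidean metric on R^2
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def d_A(x, A):
        # distance from the point x to the finite set A
        return min(dist(x, a) for a in A)

    A = [(0.0, 0.0), (2.0, 1.0)]
    x, y = (3.0, 3.0), (-1.0, 0.5)
    # Lipschitz estimate (3.1): |d_A(x) - d_A(y)| <= d(x, y)
    print(abs(d_A(x, A) - d_A(y, A)) <= dist(x, y))   # True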
Corollary 3.6. The function $d$ satisfies
$$|d(x,y) - d(x',y')| \leq d(y,y') + d(x,x')$$
and in particular $d : X\times X \to [0,\infty)$ is continuous.
Proof. By Lemma 3.5 for single point sets and the triangle inequality for the absolute value of real numbers,
$$|d(x,y) - d(x',y')| \leq |d(x,y) - d(x,y')| + |d(x,y') - d(x',y')| \leq d(y,y') + d(x,x').$$
Exercise 3.3. Show that $V \subset X$ is open iff for every $x\in V$ there is a $\delta > 0$ such that $B_x(\delta) \subset V$. In particular show $B_x(\delta)$ is open for all $x\in X$ and $\delta > 0$.
Lemma 3.7. Let $A$ be a closed subset of $X$ and $F_\epsilon @ X$ be defined as in Lemma 3.5. Then $F_\epsilon \uparrow A^c$ as $\epsilon \downarrow 0$.

Proof. It is clear that $d_A(x) = 0$ for $x\in A$ so that $F_\epsilon \subset A^c$ for each $\epsilon > 0$ and hence $\bigcup_{\epsilon>0} F_\epsilon \subset A^c$. Now suppose that $x \in A^c \subset_o X$. By Exercise 3.3 there exists an $\epsilon > 0$ such that $B_x(\epsilon) \subset A^c$, i.e. $d(x,y) \geq \epsilon$ for all $y\in A$. Hence $x\in F_\epsilon$ and we have shown that $A^c \subset \bigcup_{\epsilon>0} F_\epsilon$. Finally it is clear that $F_\epsilon \subset F_{\epsilon'}$ whenever $\epsilon' \leq \epsilon$.
Definition 3.8. Given a set $A$ contained in a metric space $X$, let $\bar{A} \subset X$ be the closure of $A$ defined by
$$\bar{A} := \{x\in X : \exists\,\{x_n\}\subset A \ni x = \lim_{n\to\infty} x_n\}.$$
That is to say $\bar{A}$ contains all limit points of $A$.

Exercise 3.4. Given $A \subset X$, show $\bar{A}$ is a closed set and in fact
$$\bar{A} = \bigcap\{F : A \subset F \subset X \text{ with } F \text{ closed}\}. \qquad (3.2)$$
That is to say $\bar{A}$ is the smallest closed set containing $A$.
3.2. Continuity. Suppose that $(X,d)$ and $(Y,\rho)$ are two metric spaces and $f : X \to Y$ is a function.

Definition 3.9. A function $f : X \to Y$ is continuous at $x\in X$ if for all $\epsilon > 0$ there is a $\delta > 0$ such that
$$\rho(f(x), f(x')) < \epsilon \text{ provided that } d(x,x') < \delta.$$
The function $f$ is said to be continuous if $f$ is continuous at all points $x\in X$.
The following lemma gives three other ways to characterize continuous functions.

Lemma 3.10 (Continuity Lemma). Suppose that $(X,\rho)$ and $(Y,d)$ are two metric spaces and $f : X \to Y$ is a function. Then the following are equivalent:
(1) $f$ is continuous.
(2) $f^{-1}(V) \in \tau_\rho$ for all $V \in \tau_d$, i.e. $f^{-1}(V)$ is open in $X$ if $V$ is open in $Y$.
(3) $f^{-1}(C)$ is closed in $X$ if $C$ is closed in $Y$.
(4) For all convergent sequences $\{x_n\} \subset X$, $\{f(x_n)\}$ is convergent in $Y$ and
$$\lim_{n\to\infty} f(x_n) = f\big(\lim_{n\to\infty} x_n\big).$$
Proof. 1. ⟹ 2. For all $x\in X$ and $\epsilon > 0$ there exists $\delta > 0$ such that $d(f(x), f(x')) < \epsilon$ if $\rho(x,x') < \delta$, i.e.
$$B_x(\delta) \subset f^{-1}(B_{f(x)}(\epsilon)).$$
So if $V \subset_o Y$ and $x \in f^{-1}(V)$ we may choose $\epsilon > 0$ such that $B_{f(x)}(\epsilon) \subset V$; then
$$B_x(\delta) \subset f^{-1}(B_{f(x)}(\epsilon)) \subset f^{-1}(V)$$
showing that $f^{-1}(V)$ is open.
2. ⟹ 1. Let $\epsilon > 0$ and $x\in X$, then, since $f^{-1}(B_{f(x)}(\epsilon)) \subset_o X$, there exists $\delta > 0$ such that $B_x(\delta) \subset f^{-1}(B_{f(x)}(\epsilon))$, i.e. if $\rho(x,x') < \delta$ then $d(f(x'), f(x)) < \epsilon$.
2. ⟹ 3. If $C$ is closed in $Y$, then $C^c \subset_o Y$ and hence $f^{-1}(C^c) \subset_o X$. Since $f^{-1}(C^c) = (f^{-1}(C))^c$, this shows that $f^{-1}(C)$ is the complement of an open set and hence closed. Similarly one shows that 3. ⟹ 2.
1. ⟹ 4. If $f$ is continuous and $x_n \to x$ in $X$, let $\epsilon > 0$ and choose $\delta > 0$ such that $d(f(x), f(x')) < \epsilon$ when $\rho(x,x') < \delta$. There exists an $N > 0$ such that $\rho(x, x_n) < \delta$ for all $n \geq N$ and therefore $d(f(x), f(x_n)) < \epsilon$ for all $n \geq N$. That is to say $\lim_{n\to\infty} f(x_n) = f(x)$ as $n\to\infty$.
4. ⟹ 1. We will show that not 1. implies not 4. Not 1. implies there exists $\epsilon > 0$, a point $x\in X$ and a sequence $\{x_n\}_{n=1}^\infty \subset X$ such that $d(f(x), f(x_n)) \geq \epsilon$ while $\rho(x, x_n) < \frac{1}{n}$. Clearly this sequence $\{x_n\}$ violates 4.
There is of course a local version of this lemma. To state this lemma, we will use the following terminology.

Definition 3.11. Let $X$ be a metric space and $x\in X$. A subset $A \subset X$ is a neighborhood of $x$ if there exists an open set $V \subset_o X$ such that $x \in V \subset A$. We will say that $A \subset X$ is an open neighborhood of $x$ if $A$ is open and $x\in A$.
Lemma 3.12 (Local Continuity Lemma). Suppose that $(X,\rho)$ and $(Y,d)$ are two metric spaces and $f : X \to Y$ is a function. Then the following are equivalent:
(1) $f$ is continuous at $x\in X$.
(2) For all neighborhoods $A \subset Y$ of $f(x)$, $f^{-1}(A)$ is a neighborhood of $x\in X$.
(3) For all sequences $\{x_n\} \subset X$ such that $x = \lim_{n\to\infty} x_n$, $\{f(x_n)\}$ is convergent in $Y$ and
$$\lim_{n\to\infty} f(x_n) = f\big(\lim_{n\to\infty} x_n\big).$$
The proof of this lemma is similar to Lemma 3.10 and so will be omitted.
Example 3.13. The function $d_A$ defined in Lemma 3.5 is continuous for each $A \subset X$. In particular, if $A = \{x\}$, it follows that $y\in X \to d(y,x)$ is continuous for each $x\in X$.
Exercise 3.5. Show the closed ball $C_x(\delta) := \{y\in X : d(x,y) \leq \delta\}$ is a closed subset of $X$.
3.3. Basic Topological Notions. Using the metric space results above as motivation we will axiomatize the notion of being an open set to more general settings.

Definition 3.14. A collection of subsets $\tau$ of $X$ is a topology if
(1) $\emptyset, X \in \tau$,
(2) $\tau$ is closed under arbitrary unions, i.e. if $V_\alpha \in \tau$ for $\alpha \in I$ then $\bigcup_{\alpha\in I} V_\alpha \in \tau$,
(3) $\tau$ is closed under finite intersections, i.e. if $V_1, \dots, V_n \in \tau$ then $V_1 \cap \dots \cap V_n \in \tau$.
A pair $(X,\tau)$ where $\tau$ is a topology on $X$ will be called a topological space.
Notation 3.15. The subsets $V \subset X$ which are in $\tau$ are called open sets and we will abbreviate this by writing $V \subset_o X$, and those sets $F \subset X$ such that $F^c \in \tau$ are called closed sets. We will write $F @ X$ if $F$ is a closed subset of $X$.
Example 3.16. (1) Let $(X,d)$ be a metric space; we write $\tau_d$ for the collection of $d$-open sets in $X$. We have already seen that $\tau_d$ is a topology, see Exercise 3.2.
(2) Let $X$ be any set, then $\tau = \mathcal{P}(X)$ is a topology. In this topology all subsets of $X$ are both open and closed. At the opposite extreme we have the trivial topology, $\tau = \{\emptyset, X\}$. In this topology only the empty set and $X$ are open (closed).
(3) Let $X = \{1,2,3\}$, then $\tau = \{\emptyset, X, \{2,3\}\}$ is a topology on $X$ which does not come from a metric.
(4) Again let $X = \{1,2,3\}$. Then $\tau = \{\{1\}, \{2,3\}, \emptyset, X\}$ is a topology, and the sets $X$, $\{1\}$, $\{2,3\}$, $\emptyset$ are open and closed. The sets $\{1,2\}$ and $\{1,3\}$ are neither open nor closed.

Figure 5. A topology.
Definition 3.17. Let $(X,\tau)$ be a topological space, $A \subset X$ and $i_A : A \to X$ be the inclusion map, i.e. $i_A(a) = a$ for all $a\in A$. Define
$$\tau_A = i_A^{-1}(\tau) = \{A \cap V : V \in \tau\},$$
the so-called relative topology on $A$.
Notice that the closed sets in $Y$ relative to $\tau_Y$ are precisely those sets of the form $C \cap Y$ where $C$ is closed in $X$. Indeed, $B \subset Y$ is closed iff $Y\setminus B = Y \cap V$ for some $V \in \tau$, which is equivalent to $B = Y\setminus(Y\cap V) = Y \cap V^c$ for some $V \in \tau$.
Exercise 3.6. Show the relative topology is a topology on $A$. Also show if $(X,d)$ is a metric space and $\tau = \tau_d$ is the topology coming from $d$, then $(\tau_d)_A$ is the topology induced by making $A$ into a metric space using the metric $d|_{A\times A}$.
Notation 3.18 (Neighborhoods of $x$). An open neighborhood of a point $x\in X$ is an open set $V \subset X$ such that $x\in V$. Let $\tau_x = \{V \in \tau : x\in V\}$ denote the collection of open neighborhoods of $x$. A collection $\eta \subset \tau_x$ is called a neighborhood base at $x\in X$ if for all $V \in \tau_x$ there exists $W \in \eta$ such that $W \subset V$.
The notation $\tau_x$ should not be confused with
$$\tau_{\{x\}} := i_{\{x\}}^{-1}(\tau) = \{\{x\}\cap V : V \in \tau\} = \{\emptyset, \{x\}\}.$$
When $(X,d)$ is a metric space, a typical example of a neighborhood base for $x$ is $\eta = \{B_x(\delta) : \delta \in D\}$ where $D$ is any dense subset of $(0,1]$.
Definition 3.19. Let $(X,\tau)$ be a topological space and $A$ be a subset of $X$.
(1) The closure of $A$ is the smallest closed set $\bar{A}$ containing $A$, i.e.
$$\bar{A} := \bigcap\{F : A \subset F @ X\}.$$
(Because of Exercise 3.4 this is consistent with Definition 3.8 for the closure of a set in a metric space.)
(2) The interior of $A$ is the largest open set $A^o$ contained in $A$, i.e.
$$A^o = \bigcup\{V \in \tau : V \subset A\}.$$
(3) The accumulation points of $A$ is the set
$$\operatorname{acc}(A) = \{x\in X : V \cap A\setminus\{x\} \neq \emptyset \text{ for all } V \in \tau_x\}.$$
(4) The boundary of $A$ is the set $\partial A := \bar{A}\setminus A^o$.
(5) $A$ is a neighborhood of a point $x\in X$ if $x\in A^o$. This is equivalent to requiring there to be an open neighborhood $V$ of $x\in X$ such that $V \subset A$.
Remark 3.20. The relationships between the interior and the closure of a set are:
$$(A^o)^c = \bigcap\{V^c : V \in \tau \text{ and } V \subset A\} = \bigcap\{C : C \text{ is closed and } C \supset A^c\} = \overline{A^c}$$
and similarly, $(\bar{A})^c = (A^c)^o$. Hence the boundary of $A$ may be written as
$$\partial A \equiv \bar{A}\setminus A^o = \bar{A} \cap (A^o)^c = \bar{A} \cap \overline{A^c}, \qquad (3.3)$$
which is to say $\partial A$ consists of the points in both the closure of $A$ and the closure of $A^c$.
Proposition 3.21. Let $A \subset X$ and $x\in X$.
(1) If $V \subset_o X$ and $A \cap V = \emptyset$ then $\bar{A} \cap V = \emptyset$.
(2) $x \in \bar{A}$ iff $V \cap A \neq \emptyset$ for all $V \in \tau_x$.
(3) $x \in \partial A$ iff $V \cap A \neq \emptyset$ and $V \cap A^c \neq \emptyset$ for all $V \in \tau_x$.
(4) $\bar{A} = A \cup \operatorname{acc}(A)$.

Proof. 1. Since $A \cap V = \emptyset$, $A \subset V^c$ and since $V^c$ is closed, $\bar{A} \subset V^c$. That is to say $\bar{A} \cap V = \emptyset$.
2. By Remark 3.20,[3] $\bar{A} = ((A^c)^o)^c$ so $x \in \bar{A}$ iff $x \notin (A^c)^o$ which happens iff $V \nsubseteq A^c$ for all $V \in \tau_x$, i.e. iff $V \cap A \neq \emptyset$ for all $V \in \tau_x$.
3. This assertion easily follows from Item 2. and Eq. (3.3).
4. Item 4. is an easy consequence of the definition of $\operatorname{acc}(A)$ and Item 2.
Lemma 3.22. Let $A \subset Y \subset X$, $\bar{A}^Y$ denote the closure of $A$ in $Y$ with its relative topology and $\bar{A} = \bar{A}^X$ be the closure of $A$ in $X$, then $\bar{A}^Y = \bar{A}^X \cap Y$.

Proof. Using the comments after Definition 3.17,
$$\bar{A}^Y = \bigcap\{B @ Y : A \subset B\} = \bigcap\{C \cap Y : A \subset C @ X\} = Y \cap \big(\bigcap\{C : A \subset C @ X\}\big) = Y \cap \bar{A}^X.$$
Alternative proof. Let $x\in Y$, then $x \in \bar{A}^Y$ iff for all $V \in \tau_x^Y$, $V \cap A \neq \emptyset$. This happens iff for all $U \in \tau_x^X$, $U \cap Y \cap A = U \cap A \neq \emptyset$, which happens iff $x \in \bar{A}^X$. That is to say $\bar{A}^Y = \bar{A}^X \cap Y$.
[3] Here is another direct proof of item 2. which goes by showing $x \notin \bar{A}$ iff there exists $V \in \tau_x$ such that $V \cap A = \emptyset$. If $x \notin \bar{A}$ then $V = (\bar{A})^c \in \tau_x$ and $V \cap A \subset V \cap \bar{A} = \emptyset$. Conversely if there exists $V \in \tau_x$ such that $V \cap A = \emptyset$ then by Item 1. $\bar{A} \cap V = \emptyset$.
Definition 3.23. Let $(X,\tau)$ be a topological space and $A \subset X$. We say a subset $\mathcal{U} \subset \tau$ is an open cover of $A$ if $A \subset \bigcup\mathcal{U}$. The set $A$ is said to be compact if every open cover of $A$ has a finite sub-cover, i.e. if $\mathcal{U}$ is an open cover of $A$ there exists $\mathcal{U}_0 \subset\subset \mathcal{U}$ such that $\mathcal{U}_0$ is a cover of $A$. (We will write $A @@ X$ to denote that $A \subset X$ and $A$ is compact.) A subset $A \subset X$ is precompact if $\bar{A}$ is compact.
Proposition 3.24. Suppose that $K \subset X$ is a compact set and $F \subset K$ is a closed subset. Then $F$ is compact. If $\{K_i\}_{i=1}^n$ is a finite collection of compact subsets of $X$ then $K = \bigcup_{i=1}^n K_i$ is also a compact subset of $X$.

Proof. Let $\mathcal{U} \subset \tau$ be an open cover of $F$, then $\mathcal{U} \cup \{F^c\}$ is an open cover of $K$. The cover $\mathcal{U} \cup \{F^c\}$ of $K$ has a finite subcover which we denote by $\mathcal{U}_0 \cup \{F^c\}$ where $\mathcal{U}_0 \subset\subset \mathcal{U}$. Since $F \cap F^c = \emptyset$, it follows that $\mathcal{U}_0$ is the desired subcover of $F$.
For the second assertion suppose $\mathcal{U} \subset \tau$ is an open cover of $K$. Then $\mathcal{U}$ covers each compact set $K_i$ and therefore there exists a finite subset $\mathcal{U}_i \subset\subset \mathcal{U}$ for each $i$ such that $K_i \subset \bigcup\mathcal{U}_i$. Then $\mathcal{U}_0 := \bigcup_{i=1}^n \mathcal{U}_i$ is a finite cover of $K$.
Definition 3.25. We say a collection $\mathcal{F}$ of closed subsets of a topological space $(X,\tau)$ has the finite intersection property if $\bigcap\mathcal{F}_0 \neq \emptyset$ for all $\mathcal{F}_0 \subset\subset \mathcal{F}$.
The notion of compactness may be expressed in terms of closed sets as follows.

Proposition 3.26. A topological space $X$ is compact iff every family of closed sets $\mathcal{F} \subset \mathcal{P}(X)$ with the finite intersection property satisfies $\bigcap\mathcal{F} \neq \emptyset$.

Proof. ($\Rightarrow$) Suppose that $X$ is compact and $\mathcal{F} \subset \mathcal{P}(X)$ is a collection of closed sets such that $\bigcap\mathcal{F} = \emptyset$. Let
$$\mathcal{U} = \mathcal{F}^c := \{C^c : C \in \mathcal{F}\},$$
then $\mathcal{U}$ is a cover of $X$ and hence has a finite subcover, $\mathcal{U}_0$. Let $\mathcal{F}_0 = \mathcal{U}_0^c \subset\subset \mathcal{F}$, then $\bigcap\mathcal{F}_0 = \emptyset$ so that $\mathcal{F}$ does not have the finite intersection property.
($\Leftarrow$) If $X$ is not compact, there exists an open cover $\mathcal{U}$ of $X$ with no finite subcover. Let $\mathcal{F} = \mathcal{U}^c$, then $\mathcal{F}$ is a collection of closed sets with the finite intersection property while $\bigcap\mathcal{F} = \emptyset$.
Exercise 3.7. Let $(X,\tau)$ be a topological space. Show that $A \subset X$ is compact iff $(A, \tau_A)$ is a compact topological space.
Definition 3.27. Let $(X,\tau)$ be a topological space. A sequence $\{x_n\}_{n=1}^\infty \subset X$ converges to a point $x\in X$ if for all $V \in \tau_x$, $x_n \in V$ almost always (abbreviated a.a.), i.e. $\#(\{n : x_n \notin V\}) < \infty$. We will write $x_n \to x$ as $n\to\infty$ or $\lim_{n\to\infty} x_n = x$ when $x_n$ converges to $x$.
Example 3.28. Let $Y = \{1,2,3\}$ and $\tau = \{Y, \emptyset, \{1,2\}, \{2,3\}, \{2\}\}$ and $y_n = 2$ for all $n$. Then $y_n \to y$ for every $y\in Y$. So limits need not be unique!
Definition 3.29. Let $(X,\tau_X)$ and $(Y,\tau_Y)$ be topological spaces. A function $f : X \to Y$ is continuous if $f^{-1}(\tau_Y) \subset \tau_X$. We will also say that $f$ is $\tau_X/\tau_Y$-continuous or $(\tau_X,\tau_Y)$-continuous. We also say that $f$ is continuous at a point $x\in X$ if for every open neighborhood $V$ of $f(x)$ there is an open neighborhood $U$ of $x$ such that $U \subset f^{-1}(V)$. See Figure 6.
Figure 6. Checking that a function is continuous at $x\in X$.

Definition 3.30. A map $f : X \to Y$ between topological spaces is called a homeomorphism provided that $f$ is bijective, $f$ is continuous and $f^{-1} : Y \to X$ is continuous. If there exists $f : X \to Y$ which is a homeomorphism, we say that $X$ and $Y$ are homeomorphic. (As topological spaces $X$ and $Y$ are essentially the same.)
Exercise 3.8. Show $f : X \to Y$ is continuous iff $f$ is continuous at all points $x\in X$.

Exercise 3.9. Show $f : X \to Y$ is continuous iff $f^{-1}(C)$ is closed in $X$ for all closed subsets $C$ of $Y$.

Exercise 3.10. Suppose $f : X \to Y$ is continuous and $K \subset X$ is compact; then $f(K)$ is a compact subset of $Y$.
Exercise 3.11 (Dini's Theorem). Let $X$ be a compact topological space and $f_n : X \to [0,\infty)$ be a sequence of continuous functions such that $f_n(x) \downarrow 0$ as $n\to\infty$ for each $x\in X$. Show that in fact $f_n \downarrow 0$ uniformly in $x$, i.e. $\sup_{x\in X} f_n(x) \downarrow 0$ as $n\to\infty$. Hint: Given $\epsilon > 0$, consider the open sets $V_n := \{x\in X : f_n(x) < \epsilon\}$.
Definition 3.31 (First Countable). A topological space, $(X,\tau)$, is first countable iff every point $x\in X$ has a countable neighborhood base. (All metric spaces are first countable.)
When $\tau$ is first countable, we may formulate many topological notions in terms of sequences.
Proposition 3.32. If $f : X \to Y$ is continuous at $x\in X$ and $\lim_{n\to\infty} x_n = x \in X$, then $\lim_{n\to\infty} f(x_n) = f(x) \in Y$. Moreover, if there exists a countable neighborhood base of $x\in X$, then $f$ is continuous at $x$ iff $\lim_{n\to\infty} f(x_n) = f(x)$ for all sequences $\{x_n\}_{n=1}^\infty \subset X$ such that $x_n \to x$ as $n\to\infty$.

Proof. If $f : X \to Y$ is continuous and $W \in \tau_Y$ is a neighborhood of $f(x) \in Y$, then there exists a neighborhood $V$ of $x\in X$ such that $f(V) \subset W$. Since $x_n \to x$, $x_n \in V$ a.a. and therefore $f(x_n) \in f(V) \subset W$ a.a., i.e. $f(x_n) \to f(x)$ as $n\to\infty$.
Conversely suppose that $\{W_n\}_{n=1}^\infty$ is a countable neighborhood base at $x$ and $\lim_{n\to\infty} f(x_n) = f(x)$ for all sequences $\{x_n\}_{n=1}^\infty \subset X$ such that $x_n \to x$. By replacing $W_n$ by $W_1 \cap \dots \cap W_n$ if necessary, we may assume that $\{W_n\}_{n=1}^\infty$ is a decreasing sequence of sets. If $f$ were not continuous at $x$ then there exists $V \in \tau_{f(x)}$ such that $x \notin (f^{-1}(V))^o$. Therefore, $W_n$ is not a subset of $f^{-1}(V)$ for any $n$. Hence for each $n$, we may choose $x_n \in W_n \setminus f^{-1}(V)$. This sequence then has the property that $x_n \to x$ as $n\to\infty$ while $f(x_n) \notin V$ for all $n$ and hence $\lim_{n\to\infty} f(x_n) \neq f(x)$.
Lemma 3.33. Suppose there exists $\{x_n\}_{n=1}^\infty \subset A$ such that $x_n \to x$; then $x \in \bar{A}$. Conversely if $(X,\tau)$ is a first countable space (like a metric space) then if $x \in \bar{A}$ there exists $\{x_n\}_{n=1}^\infty \subset A$ such that $x_n \to x$.

Proof. Suppose $\{x_n\}_{n=1}^\infty \subset A$ and $x_n \to x \in X$. Since $(\bar{A})^c$ is an open set, if $x \in (\bar{A})^c$ then $x_n \in (\bar{A})^c \subset A^c$ a.a., contradicting the assumption that $\{x_n\}_{n=1}^\infty \subset A$. Hence $x \in \bar{A}$.
For the converse we now assume that $(X,\tau)$ is first countable and that $\{V_n\}_{n=1}^\infty$ is a countable neighborhood base at $x$ such that $V_1 \supset V_2 \supset V_3 \supset \dots$. By Proposition 3.21, $x \in \bar{A}$ iff $V \cap A \neq \emptyset$ for all $V \in \tau_x$. Hence $x \in \bar{A}$ implies there exists $x_n \in V_n \cap A$ for all $n$. It is now easily seen that $x_n \to x$ as $n\to\infty$.
Definition 3.34 (Support). Let $f : X \to Y$ be a function from a topological space $(X,\tau_X)$ to a vector space $Y$. Then we define the support of $f$ by
$$\mathrm{supp}(f) := \overline{\{x \in X : f(x) \ne 0\}},$$
a closed subset of $X$.

Example 3.35. For example, let $f(x) = \sin(x)\,1_{[0,4\pi]}(x) \in \mathbb{R}$; then
$$\{f \ne 0\} = (0,4\pi) \setminus \{\pi, 2\pi, 3\pi\}$$
and therefore $\mathrm{supp}(f) = [0,4\pi]$.

Notation 3.36. If $X$ and $Y$ are two topological spaces, let $C(X,Y)$ denote the continuous functions from $X$ to $Y$. If $Y$ is a Banach space, let
$$BC(X,Y) := \{f \in C(X,Y) : \sup_{x\in X}\|f(x)\|_Y < \infty\}$$
and
$$C_c(X,Y) := \{f \in C(X,Y) : \mathrm{supp}(f) \text{ is compact}\}.$$
If $Y = \mathbb{R}$ or $\mathbb{C}$ we will simply write $C(X)$, $BC(X)$ and $C_c(X)$ for $C(X,Y)$, $BC(X,Y)$ and $C_c(X,Y)$ respectively.
The next result is included for completeness but will not be used in the sequel
so may be omitted.
Lemma 3.37. Suppose that $f : X \to Y$ is a map between topological spaces. Then the following are equivalent:
(1) $f$ is continuous.
(2) $f(\bar{A}) \subset \overline{f(A)}$ for all $A \subset X$.
(3) $\overline{f^{-1}(B)} \subset f^{-1}(\bar{B})$ for all $B \subset Y$.

Proof. If $f$ is continuous, then $f^{-1}\big(\overline{f(A)}\big)$ is closed and, since $A \subset f^{-1}(f(A)) \subset f^{-1}\big(\overline{f(A)}\big)$, it follows that $\bar{A} \subset f^{-1}\big(\overline{f(A)}\big)$. From this equation we learn that $f(\bar{A}) \subset \overline{f(A)}$, so that (1) implies (2). Now assume (2); then for $B \subset Y$ (taking $A = f^{-1}(B)$) we have
$$f\big(\overline{f^{-1}(B)}\big) \subset \overline{f(f^{-1}(B))} \subset \bar{B}$$
and therefore
(3.4) $\overline{f^{-1}(B)} \subset f^{-1}(\bar{B})$.
This shows that (2) implies (3). Finally, if Eq. (3.4) holds for all $B$, then when $B$ is closed it shows that
$$\overline{f^{-1}(B)} \subset f^{-1}(\bar{B}) = f^{-1}(B) \subset \overline{f^{-1}(B)},$$
which shows that
$$\overline{f^{-1}(B)} = f^{-1}(B).$$
Therefore $f^{-1}(B)$ is closed whenever $B$ is closed, which implies that $f$ is continuous.
3.4. Completeness.
Definition 3.38 (Cauchy sequences). A sequence $\{x_n\}_{n=1}^\infty$ in a metric space $(X,d)$ is Cauchy provided that
$$\lim_{m,n\to\infty} d(x_n, x_m) = 0.$$

Exercise 3.12. Show that convergent sequences are always Cauchy sequences. The converse is not always true. For example, let $X = \mathbb{Q}$ be the set of rational numbers and $d(x,y) = |x-y|$. Choose a sequence $\{x_n\}_{n=1}^\infty \subset \mathbb{Q}$ which converges to $\sqrt{2} \in \mathbb{R}$; then $\{x_n\}_{n=1}^\infty$ is $(\mathbb{Q},d)$-Cauchy but not $(\mathbb{Q},d)$-convergent. The sequence does converge in $\mathbb{R}$ however.

Definition 3.39. A metric space $(X,d)$ is complete if all Cauchy sequences are convergent sequences.

Exercise 3.13. Let $(X,d)$ be a complete metric space. Let $A \subset X$ be a subset of $X$ viewed as a metric space using $d|_{A\times A}$. Show that $(A, d|_{A\times A})$ is complete iff $A$ is a closed subset of $X$.
Definition 3.40. If $(X, \|\cdot\|)$ is a normed vector space, then we say $\{x_n\}_{n=1}^\infty \subset X$ is a Cauchy sequence if $\lim_{m,n\to\infty}\|x_m - x_n\| = 0$. The normed vector space is a Banach space if it is complete, i.e. if every Cauchy sequence $\{x_n\}_{n=1}^\infty \subset X$ is convergent, where $\{x_n\}_{n=1}^\infty \subset X$ is convergent iff there exists $x \in X$ such that $\lim_{n\to\infty}\|x_n - x\| = 0$. As usual we will abbreviate this last statement by writing $\lim_{n\to\infty} x_n = x$.

Lemma 3.41. Suppose that $X$ is a set; then the bounded functions $\ell^\infty(X)$ on $X$ form a Banach space with the norm
$$\|f\| = \|f\|_\infty = \sup_{x\in X}|f(x)|.$$
Moreover, if $X$ is a topological space, the set $BC(X) \subset \ell^\infty(X) = B(X)$ is a closed subspace of $\ell^\infty(X)$ and hence is also a Banach space.
Proof. Let $\{f_n\}_{n=1}^\infty \subset \ell^\infty(X)$ be a Cauchy sequence. Since for any $x \in X$ we have
(3.5) $|f_n(x) - f_m(x)| \le \|f_n - f_m\|_\infty$,
$\{f_n(x)\}_{n=1}^\infty \subset \mathbb{F}$ is a Cauchy sequence of numbers. Because $\mathbb{F}$ ($\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$) is complete, $f(x) := \lim_{n\to\infty} f_n(x)$ exists for all $x \in X$. Passing to the limit $n \to \infty$ in Eq. (3.5) implies
$$|f(x) - f_m(x)| \le \limsup_{n\to\infty}\|f_n - f_m\|_\infty$$
and taking the supremum over $x \in X$ of this inequality implies
$$\|f - f_m\|_\infty \le \limsup_{n\to\infty}\|f_n - f_m\|_\infty \to 0 \text{ as } m \to \infty,$$
showing $f_m \to f$ in $\ell^\infty(X)$.

For the second assertion, suppose that $\{f_n\}_{n=1}^\infty \subset BC(X) \subset \ell^\infty(X)$ and $f_n \to f \in \ell^\infty(X)$. We must show that $f \in BC(X)$, i.e. that $f$ is continuous. To this end let $x, y \in X$; then
$$|f(x) - f(y)| \le |f(x) - f_n(x)| + |f_n(x) - f_n(y)| + |f_n(y) - f(y)| \le 2\|f - f_n\|_\infty + |f_n(x) - f_n(y)|.$$
Thus if $\epsilon > 0$, we may choose $n$ large so that $2\|f - f_n\|_\infty < \epsilon/2$, and then for this $n$ there exists an open neighborhood $V_x$ of $x \in X$ such that $|f_n(x) - f_n(y)| < \epsilon/2$ for $y \in V_x$. Thus $|f(x) - f(y)| < \epsilon$ for $y \in V_x$, showing the limiting function $f$ is continuous.

Remark 3.42. Let $X$ be a set, $Y$ be a Banach space and $\ell^\infty(X,Y)$ denote the bounded functions $f : X \to Y$ equipped with the norm $\|f\| = \|f\|_\infty = \sup_{x\in X}\|f(x)\|_Y$. If $X$ is a topological space, let $BC(X,Y)$ denote those $f \in \ell^\infty(X,Y)$ which are continuous. The same proof used in Lemma 3.41 shows that $\ell^\infty(X,Y)$ is a Banach space and that $BC(X,Y)$ is a closed subspace of $\ell^\infty(X,Y)$.
Theorem 3.43 (Completeness of $\ell^p(\mu)$). Let $X$ be a set and $\mu : X \to (0,\infty]$ be a given function. Then for any $p \in [1,\infty]$, $(\ell^p(\mu), \|\cdot\|_p)$ is a Banach space.

Proof. We have already proved this for $p = \infty$ in Lemma 3.41, so we now assume that $p \in [1,\infty)$. Let $\{f_n\}_{n=1}^\infty \subset \ell^p(\mu)$ be a Cauchy sequence. Since for any $x \in X$,
$$|f_n(x) - f_m(x)| \le \mu(x)^{-1/p}\,\|f_n - f_m\|_p \to 0 \text{ as } m,n \to \infty,$$
it follows that $\{f_n(x)\}_{n=1}^\infty$ is a Cauchy sequence of numbers and $f(x) := \lim_{n\to\infty} f_n(x)$ exists for all $x \in X$. By Fatou's Lemma,
$$\|f_n - f\|_p^p = \sum_X \mu\,\liminf_{m\to\infty}|f_n - f_m|^p \le \liminf_{m\to\infty}\sum_X \mu\,|f_n - f_m|^p = \liminf_{m\to\infty}\|f_n - f_m\|_p^p \to 0 \text{ as } n \to \infty.$$
This then shows that $f = (f - f_n) + f_n \in \ell^p(\mu)$ (being the sum of two $\ell^p$ functions) and that $f_n \xrightarrow{\ \ell^p\ } f$.
Example 3.44. Here are a couple of examples of complete metric spaces.
(1) $X = \mathbb{R}$ and $d(x,y) = |x - y|$.
(2) $X = \mathbb{R}^n$ and $d(x,y) = \|x - y\|_2 = \sqrt{\sum_{i=1}^n (x_i - y_i)^2}$.
(3) $X = \ell^p(\mu)$ for $p \in [1,\infty]$ and any weight function $\mu$.
(4) $X = C([0,1],\mathbb{R})$, the space of continuous functions from $[0,1]$ to $\mathbb{R}$, and $d(f,g) := \max_{t\in[0,1]}|f(t) - g(t)|$. This is a special case of Lemma 3.41.
(5) Here is a typical example of a non-complete metric space. Let $X = C([0,1],\mathbb{R})$ and
$$d(f,g) := \int_0^1 |f(t) - g(t)|\,dt.$$
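To see concretely how item (5) fails, one can check numerically that continuous ramps approximating a step function form a Cauchy sequence in this integral metric even though their only candidate limit is discontinuous. The following sketch in Python is only an illustration; the grid size and the particular ramp functions are arbitrary choices, not taken from the text.

    import numpy as np

    t = np.linspace(0.0, 1.0, 20001)             # fine grid on [0, 1]

    def ramp(n):
        # continuous, piecewise linear approximation to the step function 1_{[1/2, 1]}
        return np.clip(n * (t - 0.5), 0.0, 1.0)

    def d1(f, g):
        # d(f, g) = int_0^1 |f - g| dt, approximated with the trapezoid rule
        return np.trapz(np.abs(f - g), t)

    for n, m in [(10, 20), (20, 40), (40, 80), (80, 160)]:
        print(n, m, d1(ramp(n), ramp(m)))        # distances decay like 1/n, so the sequence is Cauchy

The pointwise limit is the discontinuous step function, so no continuous limit exists and $(X,d)$ is not complete.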
3.5. Compactness in Metric Spaces. Let $(X,\rho)$ be a metric space and let
$$B'_x(\epsilon) = B_x(\epsilon)\setminus\{x\}.$$

Definition 3.45. A point $x \in X$ is an accumulation point of a subset $E \subset X$ if $\emptyset \ne E \cap V \setminus \{x\}$ for all open sets $V \subset X$ containing $x$.

Let us start with the following elementary lemma which is left as an exercise to the reader.

Lemma 3.46. Let $E \subset X$ be a subset of a metric space $(X,\rho)$. Then the following are equivalent:
(1) $x \in X$ is an accumulation point of $E$.
(2) $B'_x(\epsilon) \cap E \ne \emptyset$ for all $\epsilon > 0$.
(3) $B_x(\epsilon) \cap E$ is an infinite set for all $\epsilon > 0$.
(4) There exists $\{x_n\}_{n=1}^\infty \subset E\setminus\{x\}$ with $\lim_{n\to\infty} x_n = x$.

Definition 3.47. A metric space $(X,\rho)$ is $\epsilon$-bounded ($\epsilon > 0$) provided there exists a finite cover of $X$ by balls of radius $\epsilon$. The metric space is totally bounded if it is $\epsilon$-bounded for all $\epsilon > 0$.

Theorem 3.48. Let $X$ be a metric space. The following are equivalent:
(a) $X$ is compact.
(b) Every infinite subset of $X$ has an accumulation point.
(c) $X$ is totally bounded and complete.
Proof. The proof will consist of showing that a b c a.
(a b) We will show that not b not a. Suppose there exists E X, such
that #(E) = and E has no accumulation points. Then for all x X there exists

x
> 0 such that V
x
:= B
x
(
x
) satises (V
x
\ {x}) E = . Clearly V = {V
x
}
xX
is
a cover of X, yet V has no nite sub cover. Indeed, for each x X, V
x
E consists
of at most one point, therefore if X,
x
V
x
can only contain a nite number
of points from E, in particular X 6=
x
V
x
. (See Figure 7.)
Figure 7. The construction of an open cover with no nite sub-cover.
(b c) To show X is complete, let {x
n
}

n=1
X be a sequence and
E := {x
n
: n N} . If #(E) < , then {x
n
}

n=1
has a subsequence {x
n
k
} which
is constant and hence convergent. If E is an innite set it has an accumulation
point by assumption and hence Lemma 3.46 implies that {x
n
} has a convergence
subsequence.
We now show that X is totally bounded. Let > 0 be given and choose x
1
X. If
possible choose x
2
X such that d(x
2
, x
1
) , then if possible choose x
3
X such
that d(x
3
, {x
1
, x
2
}) and continue inductively choosing points {x
j
}
n
j=1
X
such that d(x
n
, {x
1
, . . . , x
n1
}) . This process must terminate, for otherwise
we could choose E = {x
j
}

j=1
and innite number of distinct points such that
d(x
j
, {x
1
, . . . , x
j1
}) for all j = 2, 3, 4, . . . . Since for all x X the B
x
(/3) E
can contain at most one point, no point x X is an accumulation point of E. (See
Figure 8.)
Figure 8. Constructing a set with out an accumulation point.
(c a) For sake of contradiction, assume there exists a cover an open cover
V = {V

}
A
of X with no nite subcover. Since X is totally bounded for each
n N there exists
n
X such that
X =
[
x
n
B
x
(1/n)
[
x
n
C
x
(1/n).
Choose x
1

1
such that no nite subset of V covers K
1
:= C
x
1
(1). Since K
1
=

x
2
K
1
C
x
(1/2), there exists x
2

2
such that K
2
:= K
1
C
x
2
(1/2) can not be
covered by a nite subset of V. Continuing this way inductively, we construct sets
K
n
= K
n1
C
x
n
(1/n) with x
n

n
such no K
n
can be covered by a nite subset
of V. Now choose y
n
K
n
for each n. Since {K
n
}

n=1
is a decreasing sequence of
closed sets such that diam(K
n
) 2/n, it follows that {y
n
} is a Cauchy and hence
convergent with
y = lim
n
y
n

m=1
K
m
.
Since V is a cover of X, there exists V V such that x V. Since K
n
{y} and
diam(K
n
) 0, it now follows that K
n
V for some n large. But this violates the
assertion that K
n
can not be covered by a nite subset of V.(See Figure 9.)
Remark 3.49. Let $X$ be a topological space and $Y$ be a Banach space. By combining Exercise 3.10 and Theorem 3.48 it follows that $C_c(X,Y) \subset BC(X,Y)$.
Corollary 3.50. Let X be a metric space then X is compact i all sequences
{x
n
} X have convergent subsequences.
Proof. Suppose X is compact and {x
n
} X.
Figure 9. Nested Sequence of cubes.
(1) If #({x
n
: n = 1, 2, . . . }) < then choose x X such that x
n
= x i.o.
and let {n
k
} {n} such that x
n
k
= x for all k. Then x
n
k
x
(2) If #({x
n
: n = 1, 2, . . . }) = . We know E = {x
n
} has an accumulation
point {x}, hence there exists x
n
k
x.
Conversely if E is an innite set let {x
n
}

n=1
E be a sequence of distinct
elements of E. We may, by passing to a subsequence, assume x
n
x X as
n . Now x X is an accumulation point of E by Theorem 3.48 and hence X
is compact.
Corollary 3.51. Compact subsets of $\mathbb{R}^n$ are the closed and bounded sets.
Proof. If K is closed and bounded then K is complete (being the closed subset
of a complete space) and K is contained in [M, M]
n
for some positive integer M.
For > 0, let

= Z
n
[M, M]
n
:= {x : x Z
n
and |x
i
| M for i = 1, 2, . . . , n}.
We will show, by choosing > 0 suciently small, that
(3.6) K [M, M]
n

x

B(x, )
which shows that K is totally bounded. Hence by Theorem 3.48, K is compact.
Suppose that y [M, M]
n
, then there exists x

such that |y
i
x
i
| for
i = 1, 2, . . . , n. Hence
d
2
(x, y) =
n
X
i=1
(y
i
x
i
)
2
n
2
which shows that d(x, y)

n. Hence if choose < /

n we have shows that


d(x, y) < , i.e. Eq. (3.6) holds.
Example 3.52. Let X =
p
(N) with p [1, ) and X such that (k) 0 for
all k N. The set
K := {x X : |x(k)| (k) for all k N}
is compact. To prove this, let {x


n
}

n=1
K be a sequence. By compactness of
closed bounded sets in C, for each k N there is a subsequence of {x
n
(k)}

n=1
C
which is convergent. By Cantors diagonalization trick, we may choose a subse-
quence {y
n
}

n=1
of {x
n
}

n=1
such that y(k) := lim
n
y
n
(k) exists for all k N.
4
Since |y
n
(k)| (k) for all n it follows that |y(k)| (k), i.e. y K. Finally
lim
n
ky y
n
k
p
p
= lim
n

X
k=1
|y(k) y
n
(k)|
p
=

X
k=1
lim
n
|y(k) y
n
(k)|
p
= 0
where we have used the Dominated convergence theorem. (Note |y(k) y
n
(k)|
p

2
p

p
(k) and
p
is summable.) Therefore y
n
y and we are done.
Alternatively, we can prove K is compact by showing that K is closed and totally
bounded. It is simple to show K is closed, for if {x
n
}

n=1
K is a convergent
sequence in X, x := lim
n
x
n
, then |x(k)| lim
n
|x
n
(k)| (k) for all k N.
This shows that x K and hence K is closed. To see that K is totally bounded, let
> 0 and choose N such that
P

k=N+1
|(k)|
p

1/p
< . Since
Q
N
k=1
C
(k)
(0) C
N
is closed and bounded, it is compact. Therefore there exists a nite subset
Q
N
k=1
C
(k)
(0) such that
N
Y
k=1
C
(k)
(0)
z
B
N
z
()
where B
N
z
() is the open ball centered at z C
N
relative to the
p
({1, 2, 3, . . . , N})
norm. For each z , let z X be dened by z(k) = z(k) if k N and z(k) = 0
for k N + 1. I now claim that
(3.7) K
z
B
z
(2)
which, when veried, shows K is totally bounced. To verify Eq. (3.7), let x K
and write x = u + v where u(k) = x(k) for k N and u(k) = 0 for k < N. Then
by construction u B
z
() for some z and
kvk
p


X
k=N+1
|(k)|
p
!
1/p
< .
So we have
kx zk
p
= ku +v zk
p
ku zk
p
+kvk
p
< 2.
Exercise 3.14 (Extreme value theorem). Let (X, ) be a compact topological space
and f : X R be a continuous function. Show < inf f supf < and
4
The argument is as follows. Let {n
1
j
}

j=1
be a subsequence of N ={n}

n=1
such that
lim
j
x
n
1
j
(1) exists. Now choose a subsequence {n
2
j
}

j=1
of {n
1
j
}

j=1
such that lim
j
x
n
2
j
(2)
exists and similalry {n
3
j
}

j=1
of {n
2
j
}

j=1
such that lim
j
x
n
3
j
(3) exists. Continue on this way
inductively to get
{n}

n=1
{n
1
j
}

j=1
{n
2
j
}

j=1
{n
3
j
}

j=1
. . .
such that lim
j
x
n
k
j
(k) exists for all k N. Let m
j
:= n
j
j
so that eventually {m
j
}

j=1
is a
subsequnce of {n
k
j
}

j=1
for all k. Therefore, we may take y
j
:= x
m
j
.
there exists a, b X such that f(a) = inf f and f(b) = supf.
5
Hint: use Exercise
3.10 and Corollary 3.51.
Exercise 3.15 (Uniform Continuity). Let $(X,d)$ be a compact metric space, $(Y,\rho)$ be a metric space and $f : X \to Y$ be a continuous function. Show that $f$ is uniformly continuous, i.e. if $\epsilon > 0$ there exists $\delta > 0$ such that $\rho(f(y), f(x)) < \epsilon$ if $x, y \in X$ with $d(x,y) < \delta$. Hint: I think the easiest proof is by using a sequence argument.

Definition 3.53. Let $L$ be a vector space. We say that two norms, $|\cdot|$ and $\|\cdot\|$, on $L$ are equivalent if there exist constants $\alpha, \beta \in (0,\infty)$ such that
$$\|f\| \le \alpha|f| \quad\text{and}\quad |f| \le \beta\|f\| \quad\text{for all } f \in L.$$

Lemma 3.54. Let $L$ be a finite dimensional vector space. Then any two norms $|\cdot|$ and $\|\cdot\|$ on $L$ are equivalent. (This is typically not true for norms on infinite dimensional spaces.)
Proof. Let {f
i
}
n
i=1
be a basis for L and dene a new norm on L by

n
X
i=1
a
i
f
i

n
X
i=1
|a
i
| for a
i
F.
By the triangle inequality of the norm || , we nd

n
X
i=1
a
i
f
i

n
X
i=1
|a
i
| |f
i
| M
n
X
i=1
|a
i
| = M

n
X
i=1
a
i
f
i

1
where M = max
i
|f
i
| . Thus we have
|f| M kfk
1
for all f L. This inequality shows that || is continuous relative to kk
1
. Now
let S := {f L : kfk
1
= 1} , a compact subset of L relative to kk
1
. Therefore by
Exercise 3.14 there exists f
0
S such that
m = inf {|f| : f S} = |f
0
| > 0.
Hence given 0 6= f L, then
f
kfk
1
S so that
m

f
kfk
1

= |f|
1
kfk
1
or equivalently
kfk
1

1
m
|f| .
This shows that || and kk
1
are equivalent norms. Similarly one shows that kk and
kk
1
are equivalent and hence so are || and kk .
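Lemma 3.54 guarantees that equivalence constants exist but does not exhibit them. On a concrete finite dimensional space they can be estimated by sampling; the Python sketch below is only an illustration (the dimension and sample size are arbitrary choices) comparing the $\ell^1$ and $\ell^\infty$ norms on $\mathbb{R}^5$, for which the sharp constants are $1$ and $n = 5$.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5
    x = rng.standard_normal((100000, n))                           # random nonzero vectors in R^n

    ratio = np.sum(np.abs(x), axis=1) / np.max(np.abs(x), axis=1)  # |x|_1 / |x|_inf

    # |x|_inf <= |x|_1 <= n |x|_inf, so every ratio lies in [1, n];
    # the sampled extremes approach (but need not reach) 1 and n
    print(ratio.min(), ratio.max())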
Definition 3.55. A subset $D$ of a topological space $X$ is dense if $\bar{D} = X$. A topological space is said to be separable if it contains a countable dense subset $D$.
Example 3.56. The following are examples of countable dense sets.
5
Here is a proof if X is a metric space. Let {x
n
}

n=1
X be a sequence such that f(x
n
) supf.
By compactness of X we may assume, by passing to a subsequence if necessary that x
n
b X
as n . By continuity of f, f(b) = supf.
(1) The rational number Q are dense in R equipped with the usual topology.
(2) More generally, Q
d
is a countable dense subset of R
d
for any d N.
(3) Even more generally, for any function : N (0, ),
p
() is separable for
all 1 p < . For example, let F be a countable dense set, then
D := {x
p
() : x
i
for all i and #{j : x
j
6= 0} < }.
The set can be taken to be Q if F = R or Q+iQ if F = C.
(4) If (X, ) is a metric space which is separable then every subset Y X is
also separable in the induced topology.
To prove 4. above, let A = {x
n
}

n=1
X be a countable dense subset of X.
Let (x, Y ) = inf{(x, y) : y Y } be the distance from x to Y . Recall that
(, Y ) : X [0, ) is continuous. Let
n
= (x
n
, Y ) 0 and for each n let
y
n
B
x
n
(
1
n
) Y if
n
= 0 otherwise choose y
n
B
x
n
(2
n
) Y. Then if y Y and
> 0 we may choose n N such that (y, x
n
)
n
< /3 and
1
n
< /3. If
n
> 0,
(y
n
, x
n
) 2
n
< 2/3 and if
n
= 0, (y
n
, x
n
) < /3 and therefore
(y, y
n
) (y, x
n
) +(x
n
, y
n
) < .
This shows that B {y
n
}

n=1
is a countable dense subset of Y.
Lemma 3.57. Any compact metric space (X, d) is separable.
Proof. To each integer n, there exists
n
X such that X =
x
n
B(x, 1/n).
Let D :=

n=1

n
a countable subset of X. Moreover, it is clear by construction
that

D = X.
3.6. Compactness in Function Spaces. In this section, let $(X,\tau)$ be a topological space.

Definition 3.58. Let $\mathcal{F} \subset C(X)$.
(1) $\mathcal{F}$ is equicontinuous at $x \in X$ iff for all $\epsilon > 0$ there exists $U \in \tau_x$ such that $|f(y) - f(x)| < \epsilon$ for all $y \in U$ and $f \in \mathcal{F}$.
(2) $\mathcal{F}$ is equicontinuous if $\mathcal{F}$ is equicontinuous at all points $x \in X$.
(3) $\mathcal{F}$ is pointwise bounded if $\sup\{|f(x)| : f \in \mathcal{F}\} < \infty$ for all $x \in X$.

Theorem 3.59 (Ascoli-Arzela Theorem). Let $(X,\tau)$ be a compact topological space and $\mathcal{F} \subset C(X)$. Then $\mathcal{F}$ is precompact in $C(X)$ iff $\mathcal{F}$ is equicontinuous and pointwise bounded.
Proof. () Since C(X) B(X) is a complete metric space, we must show F
is totally bounded. Let > 0 be given. By equicontinuity there exists V
x

x
for
all x X such that |f(y) f(x)| < /2 if y V
x
and f F. Since X is compact
we may choose X such that X =
x
V
x
. We have now decomposed X
into blocks {V
x
}
x
such that each f F is constant to within on V
x
. Since
sup{|f(x)| : x and f F} < , it is now evident that
M sup{|f(x)| : x X and f F} sup{|f(x)| : x and f F} + < .
Let D {k/2 : k Z} [M, M]. If f F and D

(i.e. : D is a
function) is chosen so that |(x) f(x)| /2 for all x , then
|f(y) (x)| |f(y) f(x)| +|f(x) (x)| < x and y V
x
.
From this it follows that F =
S
F

: D

where, for D

,
F

{f F : |f(y) (x)| < for y V


x
and x }.
Let :=

D

: F

6=

and for each choose f

F. For f F

,
x and y V
x
we have
|f(y) f

(y)| |f(y) (x))| +|(x) f

(y)| < 2.
So kf f

k < 2 for all f F

showing that F

B
f

(2). Therefore,
F =

B
f

(2)
and because > 0 was arbitrary we have shown that F is totally bounded.
() Since kk : C(X) [0, ) is a continuous function on C(X) it is bounded
on any compact subset F C(X). This shows that sup{kfk : f F} < which
clearly implies that F is pointwise bounded.
6
Suppose F were not equicontinuous
at some point x X that is to say there exists > 0 such that for all V
x
,
sup
yV
sup
fF
|f(y) f(x)| > .
7
Equivalently said, to each V
x
we may choose
(3.8) f
V
F and x
V
V such that |f
V
(x) f
V
(x
V
)| .
Set C
V
= {f
W
: W
x
and W V }
kk

F and notice for any V


x
that

V V
C
V
C
V
6= ,
so that {C
V
}
V

x
F has the nite intersection property.
8
Since F is compact,
it follows that there exists some
f
\
V
x
C
V
6= .
Since f is continuous, there exists V
x
such that |f(x) f(y)| < /3 for all
y V. Because f C
V
, there exists W V such that kf f
W
k < /3. We now
arrive at a contradiction;
|f
W
(x) f
W
(x
W
)| |f
W
(x) f(x)| +|f(x) f(x
W
)| +|f(x
W
) f
W
(x
W
)|
< /3 +/3 +/3 = .
6
One could also prove that F is pointwise bounded by considering the continuous evaluation
maps e
x
: C(X) R given by e
x
(f) = f(x) for all x X.
7
If X is rst countable we could nish the proof with the following argument. Let {V
n
}

n=1
be a neighborhood base at x such that V
1
V
2
V
3
. . . . By the assumption that F is not
equicontinuous at x, there exist f
n
F and x
n
V
n
such that |f
n
(x) f
n
(x
n
)| n. Since
F is a compact metric space by passing to a subsequence if necessary we may assume that f
n
converges uniformly to some f F. Because x
n
x as n we learn that
|f
n
(x) f
n
(x
n
)| |f
n
(x) f(x)| + |f(x) f(x
n
)| + |f(x
n
) f
n
(x
n
)|
2kf
n
fk + |f(x) f(x
n
)| 0 as n
which is a contradiction.
8
If we are willing to use Nets described in Appendix D below we could nish the proof as
follows. Since F is compact, the net {f
V
}
V
x
F has a cluster point f F C(X). Choose a
subnet {g

}
A
of {f
V
}
V
X
such that g

f uniformly. Then, since x


V
x implies x
V

x,
we may conclude from Eq. (3.8) that
|g

(x) g

(x
V

)| |g(x) g(x)| = 0
which is a contradiction.
3.7. Bounded Linear Operators Basics.

Definition 3.60. Let $X$ and $Y$ be normed spaces and $T : X \to Y$ be a linear map. Then $T$ is said to be bounded provided there exists $C < \infty$ such that $\|T(x)\| \le C\|x\|_X$ for all $x \in X$. We denote the best constant by $\|T\|$, i.e.
$$\|T\| = \sup_{x\ne 0}\frac{\|T(x)\|}{\|x\|} = \sup\{\|T(x)\| : \|x\| = 1\}.$$
The number $\|T\|$ is called the operator norm of $T$.

Proposition 3.61. Suppose that $X$ and $Y$ are normed spaces and $T : X \to Y$ is a linear map. Then the following are equivalent:
(a) $T$ is continuous.
(b) $T$ is continuous at $0$.
(c) $T$ is bounded.

Proof. (a) $\Rightarrow$ (b) is trivial. (b) $\Rightarrow$ (c): If $T$ is continuous at $0$ then there exists $\delta > 0$ such that $\|T(x)\| \le 1$ if $\|x\| \le \delta$. Therefore for any $x \in X$, $\|T(\delta x/\|x\|)\| \le 1$, which implies that $\|T(x)\| \le \frac{1}{\delta}\|x\|$ and hence $\|T\| \le \frac{1}{\delta} < \infty$. (c) $\Rightarrow$ (a): Let $x \in X$ and $\epsilon > 0$ be given. Then
$$\|T(y) - T(x)\| = \|T(y - x)\| \le \|T\|\,\|y - x\| < \epsilon$$
provided $\|y - x\| < \epsilon/\|T\|$.

In the examples to follow all integrals are the standard Riemann integrals; see Section 4 below for the definition and the basic properties of the Riemann integral.
Example 3.62. Suppose that K : [0, 1] [0, 1] C is a continuous function. For
f C([0, 1]), let
Tf(x) =
Z
1
0
K(x, y)f(y)dy.
Since
|Tf(x) Tf(z)|
Z
1
0
|K(x, y) K(z, y)| |f(y)| dy
kfk

max
y
|K(x, y) K(z, y)| (3.9)
and the latter expression tends to 0 as x z by uniform continuity of K. Therefore
Tf C([0, 1]) and by the linearity of the Riemann integral, T : C([0, 1]) C([0, 1])
is a linear map. Moreover,
|Tf(x)|
Z
1
0
|K(x, y)| |f(y)| dy
Z
1
0
|K(x, y)| dy kfk

Akfk

where
(3.10) A := sup
x[0,1]
Z
1
0
|K(x, y)| dy < .
This shows kTk A < and therefore T is bounded. We may in fact show
kTk = A. To do this let x
0
[0, 1] be such that
sup
x[0,1]
Z
1
0
|K(x, y)| dy =
Z
1
0
|K(x
0
, y)| dy.
Such an x
0
can be found since, using a similar argument to that in Eq. (3.9),
x
R
1
0
|K(x, y)| dy is continuous. Given > 0, let
f

(y) :=
K(x
0
, y)
q
+|K(x
0
, y)|
2
and notice that lim
0
kf

= 1 and
kTf

|Tf

(x
0
)| = Tf

(x
0
) =
Z
1
0
|K(x
0
, y)|
2
q
+|K(x
0
, y)|
2
dy.
Therefore,
kTk lim
0
1
kf

Z
1
0
|K(x
0
, y)|
2
q
+|K(x
0
, y)|
2
dy
= lim
0
Z
1
0
|K(x
0
, y)|
2
q
+|K(x
0
, y)|
2
dy = A
since
0 |K(x
0
, y)|
|K(x
0
, y)|
2
q
+|K(x
0
, y)|
2
=
|K(x
0
, y)|
q
+|K(x
0
, y)|
2

q
+|K(x
0
, y)|
2
|K(x
0
, y)|

q
+|K(x
0
, y)|
2
|K(x
0
, y)|
and the latter expression tends to zero uniformly in y as 0.
We may also consider other norms on C([0, 1]). Let (for now) L
1
([0, 1]) denote
C([0, 1]) with the norm
kfk
1
=
Z
1
0
|f(x)| dx,
then T : L
1
([0, 1], dm) C([0, 1]) is bounded as well. Indeed, let M =
sup{|K(x, y)| : x, y [0, 1]} , then
|(Tf)(x)|
Z
1
0
|K(x, y)f(y)| dy M kfk
1
which shows kTfk

M kfk
1
and hence,
kTk
L
1
C
max {|K(x, y)| : x, y [0, 1]} < .
We can in fact show that kTk = M as follows. Let (x
0
, y
0
) [0, 1]
2
satisfying
|K(x
0
, y
0
)| = M. Then given > 0, there exists a neighborhood U = I J of
(x
0
, y
0
) such that |K(x, y) K(x
0
, y
0
)| < for all (x, y) U. Let f C
c
(I, [0, ))
such that
R
1
0
f(x)dx = 1. Choose C such that || = 1 and K(x
0
, y
0
) = M,
then
|(Tf)(x
0
)| =

Z
1
0
K(x
0
, y)f(y)dy

Z
I
K(x
0
, y)f(y)dy

Re
Z
I
K(x
0
, y)f(y)dy
Z
I
(M ) f(y)dy = (M ) kfk
L
1
and hence
kTfk
C
(M ) kfk
L
1
showing that kTk M . Since > 0 is arbitrary, we learn that kTk M and
hence kTk = M.
One may also view $T$ as a map $T : C([0,1]) \to L^1([0,1])$, in which case one may show
$$\|T\|_{C\to L^1} \le \int_0^1 \max_y |K(x,y)|\,dx < \infty.$$
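The identity $\|T\| = A = \sup_x \int_0^1 |K(x,y)|\,dy$ established in the example is easy to test numerically on a discretized kernel. The following Python sketch is only an illustration: the kernel, the grid, and the midpoint quadrature are arbitrary choices, and the near-extremal $f$ plays the role of the mollified sign function $f_\epsilon$ from the text.

    import numpy as np

    m = 400
    y = (np.arange(m) + 0.5) / m                       # midpoint grid on [0, 1]
    K = np.cos(3 * np.outer(y, y)) + y[:, None]        # a sample continuous kernel K(x, y)

    A = np.max(np.sum(np.abs(K), axis=1)) / m          # sup_x int_0^1 |K(x, y)| dy (midpoint rule)

    i0 = int(np.argmax(np.sum(np.abs(K), axis=1)))     # row x_0 where the sup is attained
    f = np.sign(K[i0])                                 # nearly extremal f with ||f||_inf = 1
    print(A, np.max(np.abs(K @ f)) / m)                # ||T f||_inf essentially equals A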
For the next three exercises, let $X = \mathbb{R}^n$ and $Y = \mathbb{R}^m$ and $T : X \to Y$ be a linear transformation, so that $T$ is given by matrix multiplication by an $m \times n$ matrix. Let us identify the linear transformation $T$ with this matrix.

Exercise 3.16. Assume the norms on $X$ and $Y$ are the $\ell^1$-norms, i.e. for $x \in \mathbb{R}^n$, $\|x\| = \sum_{j=1}^n |x_j|$. Then the operator norm of $T$ is given by
$$\|T\| = \max_{1\le j\le n}\sum_{i=1}^m |T_{ij}|.$$

Exercise 3.17. Assume the norms on $X$ and $Y$ are the $\ell^\infty$-norms, i.e. for $x \in \mathbb{R}^n$, $\|x\| = \max_{1\le j\le n}|x_j|$. Then the operator norm of $T$ is given by
$$\|T\| = \max_{1\le i\le m}\sum_{j=1}^n |T_{ij}|.$$

Exercise 3.18. Assume the norms on $X$ and $Y$ are the $\ell^2$-norms, i.e. for $x \in \mathbb{R}^n$, $\|x\|^2 = \sum_{j=1}^n x_j^2$. Show $\|T\|^2$ is the largest eigenvalue of the matrix $T^{\mathrm{tr}}T : \mathbb{R}^n \to \mathbb{R}^n$.

Exercise 3.19. If $X$ is a finite dimensional normed space then all linear maps are bounded.
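A quick numerical sanity check of the formulas in Exercises 3.16-3.18 (a Python illustration with an arbitrary random matrix, not part of the text): the maximal column sum, maximal row sum, and largest singular value each dominate the sampled ratios $\|Tx\|/\|x\|$ in the corresponding norm, and the sampled ratios approach these values from below.

    import numpy as np

    rng = np.random.default_rng(2)
    T = rng.standard_normal((4, 6))                    # an m x n = 4 x 6 matrix

    norm_1   = np.max(np.sum(np.abs(T), axis=0))       # Exercise 3.16: max column sum
    norm_inf = np.max(np.sum(np.abs(T), axis=1))       # Exercise 3.17: max row sum
    norm_2   = np.linalg.norm(T, 2)                    # Exercise 3.18: sqrt of largest eigenvalue of T^tr T

    x = rng.standard_normal((100000, 6))
    Tx = x @ T.T
    print(norm_1,   np.max(np.sum(np.abs(Tx), 1) / np.sum(np.abs(x), 1)))
    print(norm_inf, np.max(np.max(np.abs(Tx), 1) / np.max(np.abs(x), 1)))
    print(norm_2,   np.max(np.linalg.norm(Tx, axis=1) / np.linalg.norm(x, axis=1)))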
Notation 3.63. Let $L(X,Y)$ denote the bounded linear operators from $X$ to $Y$. If $Y = \mathbb{F}$ we write $X^*$ for $L(X,\mathbb{F})$ and call $X^*$ the (continuous) dual space to $X$.
Lemma 3.64. Let X, Y be normed spaces, then the operator norm kk on L(X, Y )
is a norm. Moreover if Z is another normed space and T : X Y and S : Y Z
are linear maps, then kSTk kSkkTk, where ST := S T.
Proof. As usual, the main point in checking the operator norm is a norm is
to verify the triangle inequality, the other axioms being easy to check. If A, B
L(X, Y ) then the triangle inequality is veried as follows:
kA+Bk = sup
x6=0
kAx +Bxk
kxk
sup
x6=0
kAxk +kBxk
kxk
sup
x6=0
kAxk
kxk
+ sup
x6=0
kBxk
kxk
= kAk +kBk .
For the second assertion, we have for x X, that
kSTxk kSkkTxk kSkkTkkxk.
From this inequality and the denition of kSTk, it follows that kSTk kSkkTk.
Proposition 3.65. Suppose that X is a normed vector space and Y is a Banach
space. Then (L(X, Y ), k k
op
) is a Banach space. In particular the dual space X

is always a Banach space.


We will use the following characterization of a Banach space in the proof of this
proposition.
Theorem 3.66. A normed space (X, k k) is a Banach space i for every sequence
{x
n
}

n=1
such that

P
n=1
kx
n
k < then lim
N
N
P
n=1
x
n
= S exists in X (that is to
say every absolutely convergent series is a convergent series in X). As usual we
will denote S by

P
n=1
x
n
.
Proof. ()If X is complete and

P
n=1
kx
n
k < then sequence S
N

N
P
n=1
x
n
for
N N is Cauchy because (for N > M)
kS
N
S
M
k
N
X
n=M+1
kx
n
k 0 as M, N .
Therefore S =

P
n=1
x
n
:= lim
N
N
P
n=1
x
n
exists in X.
(=) Suppose that {x
n
}

n=1
is a Cauchy sequence and let {y
k
= x
n
k
}

k=1
be a
subsequence of {x
n
}

n=1
such that

P
n=1
ky
n+1
y
n
k < . By assumption
y
N+1
y
1
=
N
X
n=1
(y
n+1
y
n
) S =

X
n=1
(y
n+1
y
n
) X as N .
This shows that lim
N
y
N
exists and is equal to x := y
1
+ S. Since {x
n
}

n=1
is
Cauchy,
kx x
n
k kx y
k
k +ky
k
x
n
k 0 as k, n
showing that lim
n
x
n
exists and is equal to x.
Proof. (Proof of Proposition 3.65.) We must show (L(X, Y ), k k
op
) is complete.
Suppose that T
n
L(X, Y ) is a sequence of operators such that

P
n=1
kT
n
k < .
Then

X
n=1
kT
n
xk

X
n=1
kT
n
k kxk <
and therefore by the completeness of Y, Sx :=

P
n=1
T
n
x = lim
N
S
N
x exists in
Y, where S
N
:=
N
P
n=1
T
n
. The reader should check that S : X Y so dened in
linear. Since,
kSxk = lim
N
kS
N
xk lim
N
N
X
n=1
kT
n
xk

X
n=1
kT
n
k kxk ,
S is bounded and
(3.11) kSk

X
n=1
kT
n
k.
Similarly,
kSx S
M
xk = lim
N
kS
N
x S
M
xk lim
N
N
X
n=M+1
kT
n
k kxk =

X
n=M+1
kT
n
k kxk
and therefore,
kS S
M
k

X
n=M
kT
n
k 0 as M .
Of course we did not actually need to use Theorem 3.66 in the proof. Here is
another proof. Let {T
n
}

n=1
be a Cauchy sequence in L(X, Y ). Then for each x X,
kT
n
x T
m
xk kT
n
T
m
k kxk 0 as m, n
showing {T
n
x}

n=1
is Cauchy in Y. Using the completeness of Y, there exists an
element Tx Y such that
lim
n
kT
n
x Txk = 0.
It is a simple matter to show T : X Y is a linear map. Moreover,
kTx T
n
xk kTx T
m
xk +kT
m
x T
n
xk kTx T
m
xk +kT
m
T
n
k kxk
and therefore
kTx T
n
xk lim sup
m
(kTx T
m
xk +kT
m
T
n
k kxk) = kxklim sup
m
kT
m
T
n
k .
Hence
kT T
n
k lim sup
m
kT
m
T
n
k 0 as n .
Thus we have shown that T
n
T in L(X, Y ) as desired.
3.8. Inverting Elements in $L(X)$ and Linear ODE.

Definition 3.67. A linear map $T : X \to Y$ is an isometry if $\|Tx\|_Y = \|x\|_X$ for all $x \in X$. $T$ is said to be invertible if $T$ is a bijection and $T^{-1}$ is bounded.

Notation 3.68. We will write $GL(X,Y)$ for those $T \in L(X,Y)$ which are invertible. If $X = Y$ we simply write $L(X)$ and $GL(X)$ for $L(X,X)$ and $GL(X,X)$ respectively.

Proposition 3.69. Suppose $X$ is a Banach space and $\Lambda \in L(X) \equiv L(X,X)$ satisfies $\sum_{n=0}^\infty \|\Lambda^n\| < \infty$. Then $I - \Lambda$ is invertible and
$$(I - \Lambda)^{-1} = \text{``}\tfrac{1}{I-\Lambda}\text{''} = \sum_{n=0}^\infty \Lambda^n \quad\text{and}\quad \big\|(I-\Lambda)^{-1}\big\| \le \sum_{n=0}^\infty \|\Lambda^n\|.$$
In particular if $\|\Lambda\| < 1$ then the above formula holds and
$$\big\|(I-\Lambda)^{-1}\big\| \le \frac{1}{1 - \|\Lambda\|}.$$
Proof. Since L(X) is a Banach space and

P
n=0
k
n
k < , it follows from Theo-
rem 3.66 that
S := lim
N
S
N
:= lim
N
N
X
n=0

n
exists in L(X). Moreover, by Exercise 3.38 below,
(I ) S = (I ) lim
N
S
N
= lim
N
(I ) S
N
= lim
N
(I )
N
X
n=0

n
= lim
N
(I
N+1
) = I
and similarly S (I ) = I. This shows that (I )
1
exists and is equal to S.
Moreover, (I )
1
is bounded because

(I )
1

= kSk

X
n=0
k
n
k.
If we further assume kk < 1, then k
n
k kk
n
and

X
n=0
k
n
k

X
n=0
kk
n

1
1 kk
< .
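The proposition is constructive: the partial sums $S_N = \sum_{n=0}^N \Lambda^n$ already approximate $(I-\Lambda)^{-1}$ with a geometric error. Here is a small Python sketch for $X = \mathbb{R}^4$ (the matrix and the contraction factor $0.9$ are arbitrary illustrative choices, not part of the text).

    import numpy as np

    rng = np.random.default_rng(3)
    L = rng.standard_normal((4, 4))
    L *= 0.9 / np.linalg.norm(L, 2)                    # rescale so that ||Lambda|| = 0.9 < 1

    I = np.eye(4)
    S, term = np.zeros((4, 4)), I.copy()
    for n in range(200):                               # S = sum_{n=0}^{199} Lambda^n
        S += term
        term = term @ L

    exact = np.linalg.inv(I - L)
    print(np.max(np.abs(S - exact)))                   # tiny: the geometric series converges to (I - Lambda)^{-1}
    print(np.linalg.norm(exact, 2), 1 / (1 - 0.9))     # and ||(I - Lambda)^{-1}|| <= 1/(1 - ||Lambda||)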
Corollary 3.70. Let X and Y be Banach spaces. Then GL(X, Y ) is an open
(possibly empty) subset of L(X, Y ). More specically, if A GL(X, Y ) and B
L(X, Y ) satises
(3.12) kB Ak < kA
1
k
1
then B GL(X, Y )
(3.13) B
1
=

X
n=0

I
X
A
1
B

n
A
1
L(Y, X)
and

B
1

kA
1
k
1
1 kA
1
k kABk
.
Proof. Let A and B be as above, then
B = A(AB) = A

I
X
A
1
(AB))

= A(I
X
)
where : X X is given by
:= A
1
(AB) = I
X
A
1
B.
Now
kk =

A
1
(AB))

kA
1
k kABk < kA
1
kkA
1
k
1
= 1.
Therefore I is invertible and hence so is B (being the product of invertible
elements) with
B
1
= (I )
1
A
1
=

I
X
A
1
(AB))

1
A
1
.
For the last assertion we have,

B
1

(I
X
)
1

kA
1
k kA
1
k
1
1 kk
kA
1
k
1
1 kA
1
k kABk
.
For an application of these results to linear ordinary dierential equations, see


Section 5.2.
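Corollary 3.70 can be tested the same way (again a Python illustration with arbitrary matrices, not taken from the text): perturb an invertible $A$ by less than $\|A^{-1}\|^{-1}$ in operator norm and compare $\|B^{-1}\|$ with the stated bound $\|A^{-1}\|/(1 - \|A^{-1}\|\,\|A - B\|)$.

    import numpy as np

    rng = np.random.default_rng(4)
    A = np.eye(3) + 0.3 * rng.standard_normal((3, 3))          # invertible for this seed
    Ainv_norm = np.linalg.norm(np.linalg.inv(A), 2)

    E = rng.standard_normal((3, 3))
    E *= 0.5 / (Ainv_norm * np.linalg.norm(E, 2))              # ||A - B|| = 0.5/||A^{-1}|| < ||A^{-1}||^{-1}
    B = A - E

    bound = Ainv_norm / (1 - Ainv_norm * np.linalg.norm(A - B, 2))
    print(np.linalg.norm(np.linalg.inv(B), 2), bound)          # the first number never exceeds the second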
3.9. Supplement: Sums in Banach Spaces.
Denition 3.71. Suppose that X is a normed space and {v

X : A} is a
given collection of vectors in X. We say that s =
P
A
v

X if for all > 0


there exists a nite set

A such that

s
P

< for all A


such that

. (Unlike the case of real valued sums, this does not imply that
P

kv

k < . See Proposition 12.19 below, from which one may manufacture
counter-examples to this false premise.)
Lemma 3.72. (1) When X is a Banach space,
P
A
v

exists in X i for all


> 0 there exists

A such that

< for all A \

.
Also if
P
A
v

exists in X then { A : v
a
6= 0} is at most countable. (2) If
s =
P
A
v

X exists and T : X Y is a bounded linear map between normed


spaces, then
P
A
Tv

exists in Y and
Ts = T
X
A
v

=
X
A
Tv

.
Proof. (1) Suppose that s =
P
A
v

exists and > 0. Let

A be as in
Denition 3.71. Then for A\

+
X

+ < 2.
Conversely, suppose for all > 0 there exists

A such that

<
for all A \

. Let
n
:=
n
k=1

1/k
A and set s
n
:=
P

n
v

. Then for
m > n,
ks
m
s
n
k =

m
\
n
v

1/n 0 as m, n .
Therefore {s
n
}

n=1
is Cauchy and hence convergent in X. Let s := lim
n
s
n
, then
for A such that
n
, we have

s
X

ks s
n
k +

X
\
n
v

ks s
n
k +
1
n
.
Since the right member of this equation goes to zero as n , it follows that
P
A
v

exists and is equal to s.


Let :=

n=1

n
a countable subset of A. Then for / , {} A\
n
for all
n and hence
kv

k =

X
{}
v

1/n 0 as n .
Therefore v

= 0 for all A\ .
(2) Let

be as in Denition 3.71 and A such that

. Then

Ts
X

Tv

kTk

s
X

< kTk
which shows that
P

Tv

exists and is equal to Ts.


3.10. Word of Caution.
Example 3.73. Let (X, d) be a metric space. It is always true that B
x
() C
x
()
since C
x
() is a closed set containing B
x
(). However, it is not always true that
B
x
() = C
x
(). For example let X = {1, 2} and d(1, 2) = 1, then B
1
(1) = {1} ,
B
1
(1) = {1} while C
1
(1) = X. For another counter example, take
X =

(x, y) R
2
: x = 0 or x = 1

with the usually Euclidean metric coming from the plane. Then
B
(0,0)
(1) =

(0, y) R
2
: |y| < 1

,
B
(0,0)
(1) =

(0, y) R
2
: |y| 1

, while
C
(0,0)
(1) = B
(0,0)
(1) {(0, 1)} .
In spite of the above examples, Lemmas 3.74 and 3.75 below shows that for
certain metric spaces of interest it is true that B
x
() = C
x
().
Lemma 3.74. Suppose that (X, ||) is a normed vector space and d is the metric
on X dened by d(x, y) = |x y| . Then
B
x
() = C
x
() and
B
x
() = {y X : d(x, y) = }.
Proof. We must show that C := C
x
() B
x
() =:

B. For y C, let v = y x,
then
|v| = |y x| = d(x, y) .
Let
n
= 1 1/n so that
n
1 as n . Let y
n
= x +
n
v, then d(x, y
n
) =

n
d(x, y) < , so that y
n
B
x
() and d(y, y
n
) = 1
n
0 as n . This shows
that y
n
y as n and hence that y

B.
3.10.1. Riemannian Metrics. This subsection is not completely self contained and
may safely be skipped.
Lemma 3.75. Suppose that X is a Riemannian (or sub-Riemannian) manifold
and d is the metric on X dened by
d(x, y) = inf {() : (0) = x and (1) = y}
where () is the length of the curve . We dene () = if is not piecewise
smooth.
Then
B
x
() = C
x
() and
B
x
() = {y X : d(x, y) = }.
Figure 10. An almost length minimizing curve joining x to y.
Proof. Let C := C
x
() B
x
() =:

B. We will show that C

B by showing

B
c
C
c
. Suppose that y

B
c
and choose > 0 such that B
y
()

B = . In
particular this implies that
B
y
() B
x
() = .
We will nish the proof by showing that d(x, y) + > and hence that y C
c
.
This will be accomplished by showing: if d(x, y) < + then B
y
() B
x
() 6= .
If d(x, y) < max(, ) then either x B
y
() or y B
x
(). In either case B
y
()
B
x
() 6= . Hence we may assume that max(, ) d(x, y) < +. Let > 0 be a
number such that
max(, ) d(x, y) < < +
and choose a curve from x to y such that () < . Also choose 0 <
0
< such
that 0 <
0
< which can be done since < . Let k(t) = d(y, (t)) a
continuous function on [0, 1] and therefore k([0, 1]) R is a connected set which
contains 0 and d(x, y). Therefore there exists t
0
[0, 1] such that d(y, (t
0
)) =
k(t
0
) =
0
. Let z = (t
0
) B
y
() then
d(x, z) (|
[0,t
0
]
) = () (|
[t
0
,1]
) < d(z, y) =
0
<
and therefore z B
x
() B
x
() 6= .
Remark 3.76. Suppose again that X is a Riemannian (or sub-Riemannian) manifold
and
d(x, y) = inf {() : (0) = x and (1) = y} .
Let be a curve from x to y and let = () d(x, y). Then for all 0 u < v 1,
d((u), (v)) (|
[u,v]
) +.
So if is within of a length minimizing curve from x to y that |
[u,v]
is within
of a length minimizing curve from (u) to (v). In particular if d(x, y) = ()
then d((u), (v)) = (|
[u,v]
) for all 0 u < v 1, i.e. if is a length minimizing
curve from x to y that |
[u,v]
is a length minimizing curve from (u) to (v).
To prove these assertions notice that
d(x, y) + = () = (|
[0,u]
) +(|
[u,v]
) +(|
[v,1]
)
d(x, (u)) +(|
[u,v]
) +d((v), y)
and therefore
(|
[u,v]
) d(x, y) + d(x, (u)) d((v), y)
d((u), (v)) +.
3.11. Exercises.
Exercise 3.20. Prove Lemma 3.46.
Exercise 3.21. Let X = C([0, 1], R) and for f X, let
kfk
1
:=
Z
1
0
|f(t)| dt.
Show that (X, kk
1
) is normed space and show by example that this space is not
complete.
Exercise 3.22. Let (X, d) be a metric space. Suppose that {x
n
}

n=1
X is a
sequence and set
n
:= d(x
n
, x
n+1
). Show that for m > n that
d(x
n
, x
m
)
m1
X
k=n

k


X
k=n

k
.
Conclude from this that if

X
k=1

k
=

X
n=1
d(x
n
, x
n+1
) <
then {x
n
}

n=1
is Cauchy. Moreover, show that if {x
n
}

n=1
is a convergent sequence
and x = lim
n
x
n
then
d(x, x
n
)

X
k=n

k
.
Exercise 3.23. Show that (X, d) is a complete metric space i every sequence
{x
n
}

n=1
X such that
P

n=1
d(x
n
, x
n+1
) < is a convergent sequence in X. You
may nd it useful to prove the following statements in the course of the proof.
(1) If {x
n
} is Cauchy sequence, then there is a subsequence y
j
x
n
j
such that
P

j=1
d(y
j+1
, y
j
) < .
(2) If {x
n
}

n=1
is Cauchy and there exists a subsequence y
j
x
n
j
of {x
n
} such
that x = lim
j
y
j
exists, then lim
n
x
n
also exists and is equal to x.
Exercise 3.24. Suppose that f : [0, ) [0, ) is a C
2
function such that
f(0) = 0, f
0
> 0 and f
00
0 and (X, ) is a metric space. Show that d(x, y) =
f((x, y)) is a metric on X. In particular show that
d(x, y)
(x, y)
1 +(x, y)
is a metric on X. (Hint: use calculus to verify that f(a + b) f(a) + f(b) for all
a, b [0, ).)
Exercise 3.25. Let d : C(R) C(R) [0, ) be dened by
d(f, g) =

X
n=1
2
n
kf gk
n
1 +kf gk
n
,
where kfk
n
sup{|f(x)| : |x| n} = max{|f(x)| : |x| n}.
(1) Show that d is a metric on C(R).
(2) Show that a sequence {f
n
}

n=1
C(R) converges to f C(R) as n
i f
n
converges to f uniformly on compact subsets of R.
(3) Show that (C(R), d) is a complete metric space.
Exercise 3.26. Let {(X


n
, d
n
)}

n=1
be a sequence of metric spaces, X :=
Q

n=1
X
n
,
and for x = (x(n))

n=1
and y = (y(n))

n=1
in X let
d(x, y) =

X
n=1
2
n
d
n
(x(n), y(n))
1 +d
n
(x(n), y(n))
.
Show: 1) (X, d) is a metric space, 2) a sequence {x
k
}

k=1
X converges to x X
i x
k
(n) x(n) X
n
as k for every n = 1, 2, . . . , and 3) X is complete if
X
n
is complete for all n.
Exercise 3.27 (Tychonos Theorem). Let us continue the notation of the previous
problem. Further assume that the spaces X
n
are compact for all n. Show (X, d) is
compact. Hint: Either use Cantors method to show every sequence {x
m
}

m=1
X
has a convergent subsequence or alternatively show (X, d) is complete and totally
bounded.
Exercise 3.28. Let (X
i
, d
i
) for i = 1, . . . , n be a nite collection of metric spaces
and for 1 p and x = (x
1
, x
2
, . . . , x
n
) and y = (y
1
, . . . , y
n
) in X :=
Q
n
i=1
X
i
,
let

p
(x, y) =

(
P
n
i=1
[d
i
(x
i
, y
i
)]
p
)
1/p
if p 6=
max
i
d
i
(x
i
, y
i
) if p =
.
(1) Show (X,
p
) is a metric space for p [1, ]. Hint: Minkowskis inequality.
(2) Show that all of the metric {
p
: 1 p } are equivalent, i.e. for any
p, q [1, ] there exists constants c, C < such that

p
(x, y) C
q
(x, y) and
q
(x, y) c
p
(x, y) for all x, y X.
Hint: This can be done with explicit estimates or more simply using
Lemma 3.54.
(3) Show that the topologies associated to the metrics
p
are the same for all
p [1, ].
Exercise 3.29. Let C be a closed proper subset of R
n
and x R
n
\C. Show there
exists a y C such that d(x, y) = d
C
(x).
Exercise 3.30. Let F = R in this problem and A
2
(N) be dened by
A = {x
2
(N) : x(n) 1 + 1/n for some n N}
=

n=1
{x
2
(N) : x(n) 1 + 1/n}.
Show A is a closed subset of
2
(N) with the property that d
A
(0) = 1 while there
is no y A such that d
A
(y) = 1. (Remember that in general an innite union of
closed sets need not be closed.)
3.11.1. Banach Space Problems.
Exercise 3.31. Show that all nite dimensional normed vector spaces (L, kk) are
necessarily complete. Also show that closed and bounded sets (relative to the given
norm) are compact.
Exercise 3.32. Let (X, kk) be a normed space over F (R or C). Show the map
(, x, y) F X X x +y X
is continuous relative to the topology on F X X dened by the norm
k(, x, y)k
FXX
:= || +kxk +kyk .
(See Exercise 3.28 for more on the metric associated to this norm.) Also show that
kk : X [0, ) is continuous.
Exercise 3.33. Let p [1, ] and X be an innite set. Show the closed unit ball
in
p
(X) is not compact.
Exercise 3.34. Let X = N and for p, q [1, ) let kk
p
denote the
p
(N) norm.
Show kk
p
and kk
q
are inequivalent norms for p 6= q by showing
sup
f6=0
kfk
p
kfk
q
= if p < q.
Exercise 3.35. Folland Problem 5.5. Closure of subspaces are subspaces.
Exercise 3.36. Folland Problem 5.9. Showing C
k
([0, 1]) is a Banach space.
Exercise 3.37. Folland Problem 5.11. Showing Holder spaces are Banach spaces.
Exercise 3.38. Let X, Y and Z be normed spaces. Prove the maps
(S, x) L(X, Y ) X Sx Y
and
(S, T) L(X, Y ) L(Y, Z) ST L(X, Z)
are continuous relative to the norms
k(S, x)k
L(X,Y )X
:= kSk
L(X,Y )
+kxk
X
and
k(S, T)k
L(X,Y )L(Y,Z)
:= kSk
L(X,Y )
+kTk
L(Y,Z)
on L(X, Y ) X and L(X, Y ) L(Y, Z) respectively.
3.11.2. Ascoli-Arzela Theorem Problems.
Exercise 3.39. Let T (0, ) and F C([0, T]) be a family of functions such
that:
(1)

f(t) exists for all t (0, T) and f F.
(2) sup
fF
|f(0)| < and
(3) M := sup
fF
sup
t(0,T)


f(t)

< .
Show F is precompact in the Banach space C([0, T]) equipped with the norm
kfk

= sup
t[0,T]
|f(t)| .
Exercise 3.40. Folland Problem 4.63.
Exercise 3.41. Folland Problem 4.64.
3.11.3. General Topological Space Problems.
Exercise 3.42. Give an example of continuous map, f : X Y, and a compact
subset K of Y such that f
1
(K) is not compact.
Exercise 3.43. Let V be an open subset of R. Show V may be written as a disjoint
union of open intervals J
n
= (a
n
, b
n
), where a
n
, b
n
R{} for n = 1, 2, < N
with N = possible.
4. The Riemann Integral


In this short chapter, the Riemann integral for Banach space valued functions
is dened and developed. Our exposition will be brief, since the Lebesgue integral
and the Bochner Lebesgue integral will subsume the content of this chapter. The
following simple Bounded Linear Transformation theorem will often be used here
and in the sequel to dene linear transformations.
Theorem 4.1 (B. L. T. Theorem). Suppose that $Z$ is a normed space, $X$ is a Banach space, and $S \subset Z$ is a dense linear subspace of $Z$. If $T : S \to X$ is a bounded linear transformation (i.e. there exists $C < \infty$ such that $\|Tz\| \le C\|z\|$ for all $z \in S$), then $T$ has a unique extension to an element $\bar{T} \in L(Z,X)$ and this extension still satisfies $\|\bar{T}z\| \le C\|z\|$ for all $z \in \bar{S}$.
Exercise 4.1. Prove Theorem 4.1.
For the remainder of the chapter, let $[a,b]$ be a fixed compact interval and $X$ be a Banach space. The collection $S = S([a,b],X)$ of step functions, $f : [a,b] \to X$, consists of those functions $f$ which may be written in the form
(4.1) $f(t) = x_0\,1_{[a,t_1]}(t) + \sum_{i=1}^{n-1} x_i\,1_{(t_i, t_{i+1}]}(t),$
where $\pi = \{a = t_0 < t_1 < \dots < t_n = b\}$ is a partition of $[a,b]$ and $x_i \in X$. For $f$ as in Eq. (4.1), let
(4.2) $I(f) \equiv \sum_{i=0}^{n-1} (t_{i+1} - t_i)\,x_i \in X.$

Exercise 4.2. Show that $I(f)$ is well defined, independent of how $f$ is represented as a step function. (Hint: show that adding a point to a partition $\pi$ of $[a,b]$ does not change the right side of Eq. (4.2).) Also verify that $I : S \to X$ is a linear operator.
Proposition 4.2 (Riemann Integral). The linear function $I : S \to X$ extends uniquely to a continuous linear operator $\bar{I}$ from $\bar{S}$ (the closure of the step functions inside of $\ell^\infty([a,b],X)$) to $X$, and this operator satisfies
(4.3) $\|\bar{I}(f)\| \le (b - a)\,\|f\|_\infty$ for all $f \in \bar{S}$.
Furthermore, $C([a,b],X) \subset \bar{S} \subset \ell^\infty([a,b],X)$ and for $f \in C([a,b],X)$, $\bar{I}(f)$ may be computed as
(4.4) $\bar{I}(f) = \lim_{|\pi|\to 0}\sum_{i=0}^{n-1} f(c_i^\pi)\,(t_{i+1} - t_i)$
where $\pi = \{a = t_0 < t_1 < \dots < t_n = b\}$ denotes a partition of $[a,b]$, $|\pi| = \max\{|t_{i+1} - t_i| : i = 0,\dots,n-1\}$ is the mesh size of $\pi$, and $c_i^\pi$ may be chosen arbitrarily inside $[t_i, t_{i+1}]$.
Proof. Taking the norm of Eq. (4.2) and using the triangle inequality shows,
(4.5) kI(f)k
n1
X
i=0
(t
i+1
t
i
)kx
i
k
n1
X
i=0
(t
i+1
t
i
)kfk

(b a)kfk

.
The existence of

I satisfying Eq. (4.3) is a consequence of Theorem 4.1.
For f C([a, b], X), {a = t
0
< t
1
< < t
n
= b} a partition of [a, b], and
c

i
[t
i
, t
i+1
] for i = 0, 1, 2 . . . , n 1, let
f

(t) f(c
0
)
0
1
[t
0
,t
1
]
(t) +
n1
X
i=1
f(c

i
)1
(t
i
,t
i+1
]
(t).
Then I(f

) =
P
n1
i=0
f(c

i
)(t
i+1
t
i
) so to nish the proof of Eq. (4.4) and that
C([a, b], X)

S, it suces to observe that lim
||0
kf f

= 0 because f is
uniformly continuous on [a, b].
If f
n
S and f

S such that lim
n
kf f
n
k

= 0, then for a < b,


then 1
[,]
f
n
S and lim
n

1
[,]
f 1
[,]
f
n

= 0. This shows 1
[,]
f

S
whenever f

S.
Notation 4.3. For f

S and a b we will write denote

I(1
[,]
f) by
R

f(t) dt or
R
[,]
f(t)dt. Also following the usual convention, if a b, we
will let
Z

f(t) dt =

I(1
[,]
f) =
Z

f(t) dt.
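Eq. (4.4) says the Banach space valued Riemann integral is the mesh limit of ordinary Riemann sums, now with vector values. A quick Python illustration for an $\mathbb{R}^2$-valued integrand (the integrand and the midpoint tags are arbitrary choices, not part of the text), compared against the integral computed by hand:

    import numpy as np

    def f(t):
        # an R^2 valued (i.e. Banach space valued) integrand on [0, 1]
        return np.array([np.cos(t), t ** 2])

    exact = np.array([np.sin(1.0), 1.0 / 3.0])          # int_0^1 f(t) dt computed by hand

    for n in [10, 100, 1000, 10000]:
        t = np.linspace(0.0, 1.0, n + 1)
        c = 0.5 * (t[:-1] + t[1:])                      # tags c_i chosen as midpoints
        riemann = sum(f(ci) * (t[i + 1] - t[i]) for i, ci in enumerate(c))
        print(n, np.max(np.abs(riemann - exact)))       # error -> 0 as the mesh |pi| -> 0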
The next Lemma, whose proof is left to the reader (Exercise 4.4) contains some
of the many familiar properties of the Riemann integral.
Lemma 4.4. For f

S([a, b], X) and , , [a, b], the Riemann integral satises:
(1)

f(t) dt

( ) sup{kf(t)k : t } .
(2)
R

f(t) dt =
R

f(t) dt +
R

f(t) dt.
(3) The function G(t) :=
R
t
a
f()d is continuous on [a, b].
(4) If Y is another Banach space and T L(X, Y ), then Tf

S([a, b], Y ) and
T

f(t)dt
!
=
Z

Tf(t)dt.
(5) The function t kf(t)k
X
is in

S([a, b], R) and

Z
b
a
f(t) dt

Z
b
a
kf(t)k dt.
(6) If f, g

S([a, b], R) and f g, then
Z
b
a
f(t)dt
Z
b
a
g(t)dt.
Theorem 4.5 (Baby Fubini Theorem). Let a, b, c, d R and f(s, t) X be a
continuous function of (s, t) for s between a and b and t between c and d. Then the
maps t
R
b
a
f(s, t)ds X and s
R
d
c
f(s, t)dt are continuous and
(4.6)
Z
d
c
"
Z
b
a
f(s, t)ds
#
dt =
Z
b
a
"
Z
d
c
f(s, t)dt
#
ds.
Proof. With out loss of generality we may assume a < b and c < d. By uniform
continuity of f, Exercise 3.15,
sup
ctd
kf(s, t) f(s
0
, t)k 0 as s s
0
and so by Lemma 4.4
Z
d
c
f(s, t)dt
Z
d
c
f(s
0
, t)dt as s s
0
showing the continuity of s
R
d
c
f(s, t)dt. The other continuity assertion is proved
similarly.
Now let
= {a s
0
< s
1
< < s
m
= b} and
0
= {c t
0
< t
1
< < t
n
= d}
be partitions of [a, b] and [c, d] respectively. For s [a, b] let s

= s
i
if s (s
i
, s
i+1
]
and i 1 and s

= s
0
= a if s [s
0
, s
1
]. Dene t

0 for t [c, d] analogously. Then


Z
b
a
"
Z
d
c
f(s, t)dt
#
ds =
Z
b
a
"
Z
d
c
f(s, t

0 )dt
#
ds +
Z
b
a

0 (s)ds
=
Z
b
a
"
Z
d
c
f(s

, t

0 )dt
#
ds +
,
0 +
Z
b
a

0 (s)ds
where

0 (s) =
Z
d
c
f(s, t)dt
Z
d
c
f(s, t

0 )dt
and

,
0 =
Z
b
a
"
Z
d
c
{f(s, t

0 ) f(s

, t

0 )} dt
#
ds.
The uniform continuity of f and the estimates
sup
s[a,b]
k

0 (s)k sup
s[a,b]
Z
d
c
kf(s, t) f(s, t

0 )k dt
(d c) sup{kf(s, t) f(s, t

0 )k : (s, t) Q}
and
k
,
0 k
Z
b
a
"
Z
d
c
kf(s, t

0 ) f(s

, t

0 )k dt
#
ds
(b a)(d c) sup{kf(s, t) f(s, t

0 )k : (s, t) Q}
allow us to conclude that
Z
b
a
"
Z
d
c
f(s, t)dt
#
ds
Z
b
a
"
Z
d
c
f(s

, t

0 )dt
#
ds 0 as || +|
0
| 0.
By symmetry (or an analogous argument),
Z
d
c
"
Z
b
a
f(s, t)ds
#
dt
Z
d
c
"
Z
b
a
f(s

, t

0 )ds
#
dt 0 as || +|
0
| 0.
This completes the proof since
Z
b
a
"
Z
d
c
f(s

, t

0 )dt
#
ds =
X
0i<m,0j<n
f(s
i
, t
j
)(s
i+1
s
i
)(t
j+1
t
j
)
=
Z
d
c
"
Z
b
a
f(s

, t

0 )ds
#
dt.
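The mechanism behind Theorem 4.5 is that both iterated integrals are approximated by one and the same double Riemann sum. The Python sketch below (arbitrary continuous integrand and grid, an illustration only) shows this in miniature: the two iterated midpoint sums agree exactly, and both converge to the common value of the iterated integrals as the mesh shrinks.

    import numpy as np

    a, b, c, d, n = 0.0, 1.0, 0.0, 2.0, 400
    s = (np.arange(n) + 0.5) * (b - a) / n + a               # midpoints in [a, b]
    t = (np.arange(n) + 0.5) * (d - c) / n + c               # midpoints in [c, d]
    F = np.exp(-np.outer(s, t)) * np.sin(np.add.outer(s, t)) # f(s, t) sampled on the grid

    ds, dt = (b - a) / n, (d - c) / n
    int_ds_then_dt = np.sum(np.sum(F * ds, axis=0) * dt)     # int_c^d [ int_a^b f ds ] dt
    int_dt_then_ds = np.sum(np.sum(F * dt, axis=1) * ds)     # int_a^b [ int_c^d f dt ] ds
    print(int_ds_then_dt, int_dt_then_ds)                    # identical: finite sums may be exchanged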
4.1. The Fundamental Theorem of Calculus. Our next goal is to show that
our Riemann integral interacts well with dierentiation, namely the fundamental
theorem of calculus holds. Before doing this we will need a couple of basic denitions
and results.
Denition 4.6. Let (a, b) R. A function f : (a, b) X is dierentiable at
t (a, b) i L := lim
h0
f(t+h)f(t)
h
exists in X. The limit L, if it exists, will be
denoted by

f(t) or
df
dt
(t). We also say that f C
1
((a, b) X) if f is dierentiable
at all points t (a, b) and

f C((a, b) X).
Proposition 4.7. Suppose that f : [a, b] X is a continuous function such that

f(t) exists and is equal to zero for t (a, b). Then f is constant.
Proof. Let > 0 and (a, b) be given. (We will later let 0 and a.) By
the denition of the derivative, for all (a, b) there exists

> 0 such that


(4.7) kf(t) f()k =

f(t) f()

f()(t )

|t | if |t | <

.
Let
(4.8) A = {t [, b] : kf(t) f()k (t )}
and t
0
be the least upper bound for A. We will now use a standard argument called
continuous induction to show t
0
= b.
Eq. (4.7) with = shows t
0
> and a simple continuity argument shows
t
0
A, i.e.
(4.9) kf(t
0
) f()k (t
0
)
For the sake of contradiction, suppose that t
0
< b. By Eqs. (4.7) and (4.9),
kf(t) f()k kf(t) f(t
0
)k +kf(t
0
) f()k (t
0
) +(t t
0
) = (t )
for 0 t t
0
<
t
0
which violates the denition of t
0
being an upper bound. Thus
we have shown Eq. (4.8) holds for all t [, b]. Since > 0 and > a were
arbitrary we may conclude, using the continuity of f, that kf(t) f(a)k = 0 for all
t [a, b].
Remark 4.8. The usual real variable proof of Proposition 4.7 makes use Rolles
theorem which in turn uses the extreme value theorem. This latter theorem is not
available to vector valued functions. However with the aid of the Hahn Banach
Theorem 18.16 and Lemma 4.4, it is possible to reduce the proof of Proposition 4.7
and the proof of the Fundamental Theorem of Calculus 4.9 to the real valued case,
see Exercise 18.12.
Theorem 4.9 (Fundamental Theorem of Calculus). Suppose that f C([a, b], X),
Then
(1)
d
dt
R
t
a
f() d = f(t) for all t (a, b).
(2) Now assume that F C([a, b], X), F is continuously dierentiable on (a, b),
and

F extends to a continuous function on [a, b] which is still denoted by

F. Then
Z
b
a

F(t) dt = F(b) F(a).


Proof. Let h > 0 be a small number and consider
k
Z
t+h
a
f()d
Z
t
a
f()d f(t)hk = k
Z
t+h
t
(f() f(t)) dk

Z
t+h
t
k(f() f(t))k d
h(h),
where (h) max
[t,t+h]
k(f() f(t))k. Combining this with a similar computa-
tion when h < 0 shows, for all h R suciently small, that
k
Z
t+h
a
f()d
Z
t
a
f()d f(t)hk |h|(h),
where now (h) max
[t|h|,t+|h|]
k(f()f(t))k. By continuity of f at t, (h) 0
and hence
d
dt
R
t
a
f() d exists and is equal to f(t).
For the second item, set G(t)
R
t
a

F() d F(t). Then G is continuous by
Lemma 4.4 and

G(t) = 0 for all t (a, b) by item 1. An application of Proposition
4.7 shows G is a constant and in particular G(b) = G(a), i.e.
R
b
a

F() d F(b) =
F(a).
Corollary 4.10 (Mean Value Inequality). Suppose that f : [a, b] X is a con-
tinuous function such that

f(t) exists for t (a, b) and

f extends to a continuous
function on [a, b]. Then
(4.10) kf(b) f(a)k
Z
b
a
k

f(t)kdt (b a)

.
Proof. By the fundamental theorem of calculus, f(b) f(a) =
R
b
a

f(t)dt and
then by Lemma 4.4,
kf(b) f(a)k =

Z
b
a

f(t)dt

Z
b
a
k

f(t)kdt
Z
b
a

dt = (b a)

.
Proposition 4.11 (Equality of Mixed Partial Derivatives). Let Q = (a, b) (c, d)
be an open rectangle in R
2
and f C(Q, X). Assume that

t
f(s, t),

s
f(s, t) and

t

s
f(s, t) exists and are continuous for (s, t) Q, then

s

t
f(s, t) exists for
(s, t) Q and
(4.11)

s

t
f(s, t) =

t

s
f(s, t) for (s, t) Q.
Proof. Fix (s
0
, t
0
) Q. By two applications of Theorem 4.9,
f(s, t) = f(s
t
0
, t) +
Z
s
s
0

f(, t)d
= f(s
0
, t) +
Z
s
s
0

f(, t
0
)d +
Z
s
s
0
d
Z
t
t
0
d

f(, ) (4.12)
and then by Fubinis Theorem 4.5 we learn
f(s, t) = f(s
0
, t) +
Z
s
s
0

f(, t
0
)d +
Z
t
t
0
d
Z
s
s
0
d

f(, ).
Dierentiating this equation in t and then in s (again using two more applications
of Theorem 4.9) shows Eq. (4.11) holds.
4.2. Exercises.
Exercise 4.3. Let

([a, b], X) {f : [a, b] X : kfk

sup
t[a,b]
kf(t)k < }.
Show that (

([a, b], X), k k

) is a complete Banach space.


Exercise 4.4. Prove Lemma 4.4.
Exercise 4.5. Using Lemma 4.4, show f = (f
1
, . . . , f
n
)

S([a, b], R
n
) i f
i

S([a, b], R) for i = 1, 2, . . . , n and


Z
b
a
f(t)dt =

Z
b
a
f
1
(t)dt, . . . ,
Z
b
a
f
n
(t)dt
!
.
Exercise 4.6. Give another proof of Proposition 4.11 which does not use Fubinis
Theorem 4.5 as follows.
(1) By a simple translation argument we may assume (0, 0) Q and we are
trying to prove Eq. (4.11) holds at (s, t) = (0, 0).
(2) Let h(s, t) :=

t

s
f(s, t) and
G(s, t) :=
Z
s
0
d
Z
t
0
dh(, )
so that Eq. (4.12) states
f(s, t) = f(0, t) +
Z
s
0

f(, t
0
)d +G(s, t)
and dierentiating this equation at t = 0 shows
(4.13)

t
f(s, 0) =

t
f(0, 0) +

t
G(s, 0).
Now show using the denition of the derivative that
(4.14)

t
G(s, 0) =
Z
s
0
dh(, 0).
Hint: Consider
G(s, t) t
Z
s
0
dh(, 0) =
Z
s
0
d
Z
t
0
d [h(, ) h(, 0)] .
(3) Now dierentiate Eq. (4.13) in s using Theorem 4.9 to nish the proof.
Exercise 4.7. Give another proof of Eq. (4.6) in Theorem 4.5 based on Proposition
4.11. To do this let t
0
(c, d) and s
0
(a, b) and dene
G(s, t) :=
Z
t
t
0
d
Z
s
s
0
df(, )
Show G satises the hypothesis of Proposition 4.11 which combined with two ap-
plications of the fundamental theorem of calculus implies

t

s
G(s, t) =

s

t
G(s, t) = f(s, t).
Use two more applications of the fundamental theorem of calculus along with the
observation that G = 0 if t = t
0
or s = s
0
to conclude
(4.15) G(s, t) =
Z
s
s
0
d
Z
t
t
0
d

G(, ) =
Z
s
s
0
d
Z
t
t
0
d

f(, ).
Finally let s = b and t = d in Eq. (4.15) and then let s
0
a and t
0
c to prove Eq.
(4.6).
5. Ordinary Differential Equations in a Banach Space

Let $X$ be a Banach space, $U \subset_o X$, $J = (a,b) \ni 0$ and $Z \in C(J\times U, X)$; here $Z$ is to be interpreted as a time dependent vector field on $U \subset X$. In this section we will consider the ordinary differential equation (ODE for short)
(5.1) $\dot{y}(t) = Z(t, y(t))$ with $y(0) = x \in U$.
The reader should check that any solution $y \in C^1(J,U)$ to Eq. (5.1) gives a solution $y \in C(J,U)$ to the integral equation
(5.2) $y(t) = x + \int_0^t Z(\tau, y(\tau))\,d\tau$
and conversely, if $y \in C(J,U)$ solves Eq. (5.2), then $y \in C^1(J,U)$ and $y$ solves Eq. (5.1).

Remark 5.1. For notational simplicity we have assumed that the initial condition for the ODE in Eq. (5.1) is taken at $t = 0$. There is no loss in generality in doing this, since $\tilde{y}$ solves
$$\frac{d\tilde{y}}{dt}(t) = \tilde{Z}(t, \tilde{y}(t)) \text{ with } \tilde{y}(t_0) = x \in U$$
iff $y(t) := \tilde{y}(t + t_0)$ solves Eq. (5.1) with $Z(t,x) = \tilde{Z}(t + t_0, x)$.
5.1. Examples. Let $X = \mathbb{R}$, $Z(x) = x^n$ with $n \in \mathbb{N}$ and consider the ordinary differential equation
(5.3) $\dot{y}(t) = Z(y(t)) = y^n(t)$ with $y(0) = x \in \mathbb{R}$.
If $y$ solves Eq. (5.3) with $x \ne 0$, then $y(t)$ is not zero for $t$ near $0$. Therefore, up to the first time $y$ possibly hits $0$, we must have
$$t = \int_0^t \frac{\dot{y}(\tau)}{y(\tau)^n}\,d\tau = \int_x^{y(t)} u^{-n}\,du = \begin{cases} \dfrac{[y(t)]^{1-n} - x^{1-n}}{1-n} & \text{if } n > 1 \\[1ex] \ln\left|\dfrac{y(t)}{x}\right| & \text{if } n = 1 \end{cases}$$
and solving these equations for $y(t)$ implies
(5.4) $y(t) = y(t,x) = \begin{cases} \dfrac{x}{\sqrt[n-1]{1 - (n-1)t\,x^{n-1}}} & \text{if } n > 1 \\ e^t x & \text{if } n = 1. \end{cases}$
The reader should verify by direct calculation that $y(t,x)$ defined above does indeed solve Eq. (5.3). The above argument shows that these are the only possible solutions to the equations in (5.3).

Notice that when $n = 1$ the solution exists for all time, while for $n > 1$ we must require
$$1 - (n-1)t\,x^{n-1} > 0,$$
or equivalently that
$$t < \frac{1}{(n-1)x^{n-1}} \text{ if } x^{n-1} > 0 \quad\text{and}\quad t > -\frac{1}{(n-1)|x|^{n-1}} \text{ if } x^{n-1} < 0.$$
Moreover for n > 1, y(t, x) blows up as t approaches the value for which 1 (n
1)tx
n1
= 0. The reader should also observe that, at least for s and t close to 0,
(5.5) y(t, y(s, x)) = y(t +s, x)
for each of the solutions above. Indeed, if n = 1 Eq. (5.5) is equivalent to the well
know identity, e
t
e
s
= e
t+s
and for n > 1,
y(t, y(s, x)) =
y(s, x)
n1
p
1 (n 1)ty(s, x)
n1
=
x
n1

1(n1)sx
n1
n1
s
1 (n 1)t

x
n1

1(n1)sx
n1

n1
=
x
n1

1(n1)sx
n1
n1
q
1 (n 1)t
x
n1
1(n1)sx
n1
=
x
n1
p
1 (n 1)sx
n1
(n 1)tx
n1
=
x
n1
p
1 (n 1)(s +t)x
n1
= y(t +s, x).
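Both the closed form (5.4) and the flow identity (5.5) can be confirmed numerically. The Python sketch below (with the arbitrary choices $n = 3$, $x = 1/2$, and a crude Euler step; an illustration only) integrates the ODE and compares it with the formula, and then checks $y(t, y(s,x)) = y(t+s, x)$ directly from the formula.

    import numpy as np

    n_pow, x0 = 3, 0.5

    def y_exact(t, x, n=n_pow):
        # Eq. (5.4) for n > 1, valid while 1 - (n-1) t x^(n-1) > 0 (here up to t* = 2)
        return x / (1.0 - (n - 1) * t * x ** (n - 1)) ** (1.0 / (n - 1))

    h, y, t = 1e-5, x0, 0.0                    # forward Euler for y' = y^n, y(0) = x0
    while t < 1.0:
        y += h * y ** n_pow
        t += h
    print(y, y_exact(1.0, x0))                 # Euler value vs. the closed form at t = 1

    s, t = 0.3, 0.5
    print(y_exact(t, y_exact(s, x0)), y_exact(t + s, x0))   # flow property (5.5)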
Now suppose Z(x) = |x|

with 0 < < 1 and we now consider the ordinary


dierential equation
(5.6) y(t) = Z(y(t)) = |y(t)|

with y(0) = x R.
Working as above we nd, if x 6= 0 that
t =
Z
t
0
y()
|y(t)|

d =
Z
y(t)
0
|u|

du =
[y(t)]
1
x
1
1
,
where u
1
:= |u|
1
sgn(u). Since sgn(y(t)) = sgn(x) the previous equation im-
plies
sgn(x)(1 )t = sgn(x)
h
sgn(y(t)) |y(t)|
1
sgn(x) |x|
1
i
= |y(t)|
1
|x|
1
and therefore,
(5.7) y(t, x) = sgn(x)

|x|
1
+ sgn(x)(1 )t
1
1
is uniquely determined by this formula until the rst time t where |x|
1
+sgn(x)(1
)t = 0. As before y(t) = 0 is a solution to Eq. (5.6), however it is far from being
the unique solution. For example letting x 0 in Eq. (5.7) gives a function
y(t, 0+) = ((1 )t)
1
1
which solves Eq. (5.6) for t > 0. Moreover if we dene
y(t) :=

((1 )t)
1
1
if t > 0
0 if t 0
,
(for example if = 1/2 then y(t) =
1
4
t
2
1
t0
) then the reader may easily check y
also solve Eq. (5.6). Furthermore, y
a
(t) := y(t a) also solves Eq. (5.6) for all
a 0, see Figure 11 below.
[Figure 11. Three different solutions to the ODE $\dot{y}(t) = |y(t)|^{1/2}$ with $y(0) = 0$.]
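The non-uniqueness is also easy to confirm numerically for $\alpha = 1/2$ (a Python illustration; the shifts $a = 0, 1, 3$ mirror the three curves in Figure 11): each $y_a$ below vanishes up to time $a$ and equals $((t-a)/2)^2$ afterwards, and a finite-difference check shows $\dot{y} - |y|^{1/2}$ is numerically zero for all of them.

    import numpy as np

    t = np.linspace(-2.0, 8.0, 100001)

    def y_shift(a):
        # y_a(t) = ((t - a)/2)^2 for t > a and 0 otherwise; solves ydot = |y|^(1/2), y(0) = 0 when a >= 0
        return np.where(t > a, ((t - a) / 2.0) ** 2, 0.0)

    for a in [0.0, 1.0, 3.0]:
        y = y_shift(a)
        residual = np.gradient(y, t) - np.sqrt(np.abs(y))
        print(a, np.max(np.abs(residual)))     # only finite-difference error remains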
With these examples in mind, let us now go to the general theory starting with
linear ODEs.
5.2. Linear Ordinary Dierential Equations. Consider the linear dierential
equation
(5.8) y(t) = A(t)y(t) where y(0) = x X.
Here A C(J L(X)) and y C
1
(J X). This equation may be written in its
equivalent (as the reader should verify) integral form, namely we are looking for
y C(J, X) such that
(5.9) y(t) = x +
Z
t
0
A()y()d.
In what follows, we will abuse notation and use kk to denote the operator norm
on L(X) associated to kk on X we will also x J = (a, b) 3 0 and let kk

:=
max
tJ
k(t)k for BC(J, X) or BC(J, L(X)).
Notation 5.2. For t R and n N, let

n
(t) =

{(
1
, . . . ,
n
) R
n
: 0
1

n
t} if t 0
{(
1
, . . . ,
n
) R
n
: t
n

1
0} if t 0
and also write d = d
1
. . . d
n
and
Z

n
(t)
f(
1
, . . .
n
)d : = (1)
n1
t<0
Z
t
0
d
n
Z

n
0
d
n1
. . .
Z

2
0
d
1
f(
1
, . . .
n
).
Lemma 5.3. Suppose that C (R, R) , then
(5.10) (1)
n1
t<0
Z

n
(t)
(
1
) . . . (
n
)d =
1
n!
Z
t
0
()d

n
.
Proof. Let (t) :=


R
t
0
()d. The proof will go by induction on n. The case
n = 1 is easily veried since
(1)
11
t<0
Z

1
(t)
(
1
)d
1
=
Z
t
0
()d = (t).
Now assume the truth of Eq. (5.10) for n 1 for some n 2, then
(1)
n1
t<0
Z

n
(t)
(
1
) . . . (
n
)d=
Z
t
0
d
n
Z

n
0
d
n1
. . .
Z

2
0
d
1
(
1
) . . . (
n
)
=
Z
t
0
d
n

n1
(
n
)
(n 1)!
(
n
) =
Z
t
0
d
n

n1
(
n
)
(n 1)!

(
n
)
=
Z
(t)
0
u
n1
(n 1)!
du =

n
(t)
n!
,
wherein we made the change of variables, u = (
n
), in the second to last equality.
Remark 5.4. Eq. (5.10) is equivalent to
$$\int_{\Delta_n(t)}\psi(\tau_1)\dots\psi(\tau_n)\,d\tau=\frac1{n!}\Big(\int_{\Delta_1(t)}\psi(\tau)\,d\tau\Big)^n$$
and another way to understand this equality is to view $\int_{\Delta_n(t)}\psi(\tau_1)\dots\psi(\tau_n)\,d\tau$ as a multiple integral (see Section 8 below) rather than an iterated integral. Indeed, taking $t>0$ for simplicity and letting $S_n$ be the permutation group on $\{1,2,\dots,n\}$ we have
$$[0,t]^n=\bigcup_{\sigma\in S_n}\{(\tau_1,\dots,\tau_n)\in\mathbb R^n:0\leq\tau_{\sigma1}\leq\dots\leq\tau_{\sigma n}\leq t\}$$
with the union being essentially disjoint. Therefore, making a change of variables and using the fact that $\psi(\tau_1)\dots\psi(\tau_n)$ is invariant under permutations, we find
$$\Big(\int_0^t\psi(\tau)\,d\tau\Big)^n=\int_{[0,t]^n}\psi(\tau_1)\dots\psi(\tau_n)\,d\tau
=\sum_{\sigma\in S_n}\int_{\{(\tau_1,\dots,\tau_n)\in\mathbb R^n:\,0\leq\tau_{\sigma1}\leq\dots\leq\tau_{\sigma n}\leq t\}}\psi(\tau_1)\dots\psi(\tau_n)\,d\tau$$
$$=\sum_{\sigma\in S_n}\int_{\{(s_1,\dots,s_n)\in\mathbb R^n:\,0\leq s_1\leq\dots\leq s_n\leq t\}}\psi(s_{\sigma^{-1}1})\dots\psi(s_{\sigma^{-1}n})\,ds
=\sum_{\sigma\in S_n}\int_{\{(s_1,\dots,s_n)\in\mathbb R^n:\,0\leq s_1\leq\dots\leq s_n\leq t\}}\psi(s_1)\dots\psi(s_n)\,ds$$
$$=n!\int_{\Delta_n(t)}\psi(\tau_1)\dots\psi(\tau_n)\,d\tau.$$
Theorem 5.5. Let $\phi\in BC(J,X),$ then the integral equation
(5.11) $y(t)=\phi(t)+\int_0^tA(\tau)y(\tau)\,d\tau$
has a unique solution given by
(5.12) $\displaystyle y(t)=\phi(t)+\sum_{n=1}^\infty(-1)^{n\cdot1_{t<0}}\int_{\Delta_n(t)}A(\tau_n)\dots A(\tau_1)\phi(\tau_1)\,d\tau$
and this solution satisfies the bound
$$\|y\|_\infty\leq\|\phi\|_\infty\,e^{\int_J\|A(\tau)\|\,d\tau}.$$
Proof. Define $\Lambda:BC(J,X)\to BC(J,X)$ by
$$(\Lambda y)(t)=\int_0^tA(\tau)y(\tau)\,d\tau.$$
Then $y$ solves Eq. (5.9) iff $y=\phi+\Lambda y$ or equivalently iff $(I-\Lambda)y=\phi.$ An induction argument shows
$$(\Lambda^n\phi)(t)=\int_0^td\tau_n\,A(\tau_n)(\Lambda^{n-1}\phi)(\tau_n)
=\int_0^td\tau_n\int_0^{\tau_n}d\tau_{n-1}\,A(\tau_n)A(\tau_{n-1})(\Lambda^{n-2}\phi)(\tau_{n-1})$$
$$\vdots$$
$$=\int_0^td\tau_n\int_0^{\tau_n}d\tau_{n-1}\dots\int_0^{\tau_2}d\tau_1\,A(\tau_n)\dots A(\tau_1)\phi(\tau_1)
=(-1)^{n\cdot1_{t<0}}\int_{\Delta_n(t)}A(\tau_n)\dots A(\tau_1)\phi(\tau_1)\,d\tau.$$
Taking norms of this equation and using the triangle inequality along with Lemma 5.3 gives,
$$\|(\Lambda^n\phi)(t)\|\leq\|\phi\|_\infty\int_{\Delta_n(t)}\|A(\tau_n)\|\dots\|A(\tau_1)\|\,d\tau
\leq\|\phi\|_\infty\,\frac1{n!}\Big(\int_{\Delta_1(t)}\|A(\tau)\|\,d\tau\Big)^n
\leq\|\phi\|_\infty\,\frac1{n!}\Big(\int_J\|A(\tau)\|\,d\tau\Big)^n.$$
Therefore,
(5.13) $\displaystyle\|\Lambda^n\|_{op}\leq\frac1{n!}\Big(\int_J\|A(\tau)\|\,d\tau\Big)^n$
and
$$\sum_{n=0}^\infty\|\Lambda^n\|_{op}\leq e^{\int_J\|A(\tau)\|\,d\tau}<\infty$$
where $\|\cdot\|_{op}$ denotes the operator norm on $L(BC(J,X)).$ An application of Proposition 3.69 now shows $(I-\Lambda)^{-1}=\sum_{n=0}^\infty\Lambda^n$ exists and
$$\big\|(I-\Lambda)^{-1}\big\|_{op}\leq e^{\int_J\|A(\tau)\|\,d\tau}.$$
It is now only a matter of working through the notation to see that these assertions prove the theorem.
Corollary 5.6. Suppose that $A\in L(X)$ is independent of time, then the solution to
$$\dot y(t)=Ay(t)\ \text{ with }\ y(0)=x$$
is given by $y(t)=e^{tA}x$ where
(5.14) $\displaystyle e^{tA}=\sum_{n=0}^\infty\frac{t^n}{n!}A^n.$
Proof. This is a simple consequence of Eq. (5.12) and Lemma 5.3 with $\psi\equiv1.$
We also have the following converse to this corollary whose proof is outlined in
Exercise 5.11 below.
Theorem 5.7. Suppose that $T_t\in L(X)$ for $t\geq0$ satisfies
(1) (Semi-group property.) $T_0=\operatorname{Id}_X$ and $T_tT_s=T_{t+s}$ for all $s,t\geq0.$
(2) (Norm Continuity) $t\to T_t$ is continuous at $0,$ i.e. $\|T_t-I\|_{L(X)}\to0$ as $t\downarrow0.$
Then there exists $A\in L(X)$ such that $T_t=e^{tA}$ where $e^{tA}$ is defined in Eq. (5.14).
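For matrices, the series in Eq. (5.14) converges fast enough that its partial sums are easy to compare with a reference implementation. The following sketch (an illustration, not part of the notes) checks a truncated sum against `scipy.linalg.expm`; the matrix, time and truncation level are arbitrary choices, and the availability of SciPy is assumed.

```python
# Partial sums of sum_n t^n A^n / n! (Eq. 5.14) versus scipy's matrix exponential.
import numpy as np
from scipy.linalg import expm

def exp_series(A, t, terms=25):
    out = np.zeros_like(A, dtype=float)
    term = np.eye(A.shape[0])          # current term t^n A^n / n!, starting at n = 0
    for n in range(terms):
        out += term
        term = term @ A * t / (n + 1)  # update to the next term of the series
    return out

if __name__ == "__main__":
    A = np.array([[0.0, 1.0], [-2.0, -3.0]])
    t = 1.5
    print(np.linalg.norm(exp_series(A, t) - expm(t * A)))   # ~1e-15
```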
5.3. Uniqueness Theorem and Continuous Dependence on Initial Data.
Lemma 5.8 (Gronwall's Lemma). Suppose that $f,\varepsilon,$ and $k$ are non-negative functions of a real variable $t$ such that
(5.15) $f(t)\leq\varepsilon(t)+\Big|\int_0^tk(\tau)f(\tau)\,d\tau\Big|.$
Then
(5.16) $f(t)\leq\varepsilon(t)+\Big|\int_0^tk(\tau)\varepsilon(\tau)e^{|\int_\tau^tk(s)\,ds|}\,d\tau\Big|,$
and in particular if $\varepsilon$ and $k$ are constants we find that
(5.17) $f(t)\leq\varepsilon e^{k|t|}.$
Proof. I will only prove the case $t\geq0.$ The case $t\leq0$ can be derived by applying the $t\geq0$ case to $\tilde f(t)=f(-t),$ $\tilde k(t)=k(-t)$ and $\tilde\varepsilon(t)=\varepsilon(-t).$
Set $F(t)=\int_0^tk(\tau)f(\tau)\,d\tau.$ Then by (5.15),
$$\dot F=kf\leq k\varepsilon+kF.$$
Hence,
$$\frac d{dt}\Big(e^{-\int_0^tk(s)\,ds}F\Big)=e^{-\int_0^tk(s)\,ds}(\dot F-kF)\leq k\varepsilon\,e^{-\int_0^tk(s)\,ds}.$$
Integrating this last inequality from $0$ to $t$ and then solving for $F$ yields:
$$F(t)\leq e^{\int_0^tk(s)\,ds}\int_0^td\tau\,k(\tau)\varepsilon(\tau)e^{-\int_0^\tau k(s)\,ds}=\int_0^td\tau\,k(\tau)\varepsilon(\tau)e^{\int_\tau^tk(s)\,ds}.$$
But by the definition of $F$ we have that
$$f\leq\varepsilon+F,$$
and hence the last two displayed equations imply (5.16). Equation (5.17) follows from (5.16) by a simple integration.
Corollary 5.9 (Continuous Dependence on Initial Data). Let $U\subset_oX,$ $0\in(a,b)$ and $Z:(a,b)\times U\to X$ be a continuous function which is $K$-Lipschitz on $U,$ i.e. $\|Z(t,x)-Z(t,x')\|\leq K\|x-x'\|$ for all $x$ and $x'$ in $U.$ Suppose $y_1,y_2:(a,b)\to U$ solve
(5.18) $\dfrac{dy_i(t)}{dt}=Z(t,y_i(t))$ with $y_i(0)=x_i$ for $i=1,2.$
Then
(5.19) $\|y_2(t)-y_1(t)\|\leq\|x_2-x_1\|e^{K|t|}$ for $t\in(a,b)$
and in particular, there is at most one solution to Eq. (5.1) under the above Lipschitz assumption on $Z.$
Proof. Let $f(t)\equiv\|y_2(t)-y_1(t)\|.$ Then by the fundamental theorem of calculus,
$$f(t)=\Big\|y_2(0)-y_1(0)+\int_0^t(\dot y_2(\tau)-\dot y_1(\tau))\,d\tau\Big\|
\leq f(0)+\Big|\int_0^t\|Z(\tau,y_2(\tau))-Z(\tau,y_1(\tau))\|\,d\tau\Big|
=\|x_2-x_1\|+K\Big|\int_0^tf(\tau)\,d\tau\Big|.$$
Therefore by Gronwall's inequality we have,
$$\|y_2(t)-y_1(t)\|=f(t)\leq\|x_2-x_1\|e^{K|t|}.$$
5.4. Local Existence (Non-Linear ODE). We now show the existence of solutions to Eq. (5.1) under a Lipschitz condition on $Z.$ Another existence theorem is given in Exercise 7.9.
Theorem 5.10 (Local Existence). Let $T>0,$ $J=(-T,T),$ $x_0\in X,$ $r>0$ and
$$C(x_0,r):=\{x\in X:\|x-x_0\|\leq r\}$$
be the closed $r$-ball centered at $x_0\in X.$ Assume
(5.20) $M=\sup\{\|Z(t,x)\|:(t,x)\in J\times C(x_0,r)\}<\infty$
and there exists $K<\infty$ such that
(5.21) $\|Z(t,x)-Z(t,y)\|\leq K\|x-y\|$ for all $x,y\in C(x_0,r)$ and $t\in J.$
Let $T_0<\min\{r/M,T\}$ and $J_0:=(-T_0,T_0),$ then for each $x\in B(x_0,r-MT_0)$ there exists a unique solution $y(t)=y(t,x)$ to Eq. (5.2) in $C(J_0,C(x_0,r)).$ Moreover $y(t,x)$ is jointly continuous in $(t,x),$ $y(t,x)$ is differentiable in $t,$ $\dot y(t,x)$ is jointly continuous for all $(t,x)\in J_0\times B(x_0,r-MT_0)$ and satisfies Eq. (5.1).
Proof. The uniqueness assertion has already been proved in Corollary 5.9. To prove existence, let $C_r:=C(x_0,r),$ $Y:=C(J_0,C(x_0,r))$ and
(5.22) $S_x(y)(t):=x+\int_0^tZ(\tau,y(\tau))\,d\tau.$
With this notation, Eq. (5.2) becomes $y=S_x(y),$ i.e. we are looking for a fixed point of $S_x.$ If $y\in Y,$ then
$$\|S_x(y)(t)-x_0\|\leq\|x-x_0\|+\Big|\int_0^t\|Z(\tau,y(\tau))\|\,d\tau\Big|\leq\|x-x_0\|+M|t|
\leq\|x-x_0\|+MT_0\leq r-MT_0+MT_0=r,$$
showing $S_x(Y)\subset Y$ for all $x\in B(x_0,r-MT_0).$ Moreover if $y,z\in Y,$
(5.23) $\displaystyle\|S_x(y)(t)-S_x(z)(t)\|=\Big\|\int_0^t[Z(\tau,y(\tau))-Z(\tau,z(\tau))]\,d\tau\Big\|\leq\Big|\int_0^t\|Z(\tau,y(\tau))-Z(\tau,z(\tau))\|\,d\tau\Big|\leq K\Big|\int_0^t\|y(\tau)-z(\tau)\|\,d\tau\Big|.$
Let $y_0(t,x)=x$ and $y_n(\cdot,x)\in Y$ be defined inductively by
(5.24) $y_n(\cdot,x):=S_x(y_{n-1}(\cdot,x))=x+\int_0^tZ(\tau,y_{n-1}(\tau,x))\,d\tau.$
Using the estimate in Eq. (5.23) repeatedly we find
$$\|y_{n+1}(t)-y_n(t)\|\leq K\Big|\int_0^t\|y_n(\tau)-y_{n-1}(\tau)\|\,d\tau\Big|
\leq K^2\Big|\int_0^tdt_1\Big|\int_0^{t_1}dt_2\,\|y_{n-1}(t_2)-y_{n-2}(t_2)\|\Big|\Big|\leq\dots$$
$$\leq K^n\Big|\int_0^tdt_1\Big|\int_0^{t_1}dt_2\dots\Big|\int_0^{t_{n-1}}dt_n\,\|y_1(t_n)-y_0(t_n)\|\Big|\dots\Big|\Big|
\leq K^n\,\|y_1(\cdot,x)-y_0(\cdot,x)\|_\infty\int_{\Delta_n(t)}d\tau$$
(5.25) $\displaystyle=\frac{K^n|t|^n}{n!}\,\|y_1(\cdot,x)-y_0(\cdot,x)\|_\infty\leq2r\,\frac{K^n|t|^n}{n!},$
wherein we have also made use of Lemma 5.3. Combining this estimate with
$$\|y_1(t,x)-y_0(t,x)\|=\Big\|\int_0^tZ(\tau,x)\,d\tau\Big\|\leq\Big|\int_0^t\|Z(\tau,x)\|\,d\tau\Big|\leq M_0,$$
where
$$M_0=\max\Big\{\int_0^{T_0}\|Z(\tau,x)\|\,d\tau,\ \int_{-T_0}^0\|Z(\tau,x)\|\,d\tau\Big\}\leq MT_0,$$
shows
$$\|y_{n+1}(t,x)-y_n(t,x)\|\leq M_0\,\frac{K^n|t|^n}{n!}\leq M_0\,\frac{K^nT_0^n}{n!}$$
and this implies
$$\sum_{n=0}^\infty\|y_{n+1}(\cdot,x)-y_n(\cdot,x)\|_{\infty,J_0}\leq\sum_{n=0}^\infty M_0\,\frac{K^nT_0^n}{n!}=M_0e^{KT_0}<\infty$$
where
$$\|y_{n+1}(\cdot,x)-y_n(\cdot,x)\|_{\infty,J_0}:=\sup\{\|y_{n+1}(t,x)-y_n(t,x)\|:t\in J_0\}.$$
So $y(t,x):=\lim_{n\to\infty}y_n(t,x)$ exists uniformly for $t\in J_0$ and using Eq. (5.21) we also have
$$\sup\{\|Z(t,y(t))-Z(t,y_{n-1}(t))\|:t\in J_0\}\leq K\,\|y(\cdot,x)-y_{n-1}(\cdot,x)\|_{\infty,J_0}\to0\ \text{ as }n\to\infty.$$
Now passing to the limit in Eq. (5.24) shows $y$ solves Eq. (5.2). From this equation it follows that $y(t,x)$ is differentiable in $t$ and $y$ satisfies Eq. (5.1).
The continuity of $y(t,x)$ follows from Corollary 5.9 and the mean value inequality (Corollary 4.10):
$$\|y(t,x)-y(t_0,x_0)\|\leq\|y(t,x)-y(t,x_0)\|+\|y(t,x_0)-y(t_0,x_0)\|
=\|y(t,x)-y(t,x_0)\|+\Big\|\int_{t_0}^tZ(\tau,y(\tau,x_0))\,d\tau\Big\|$$
$$\leq\|y(t,x)-y(t,x_0)\|+\Big|\int_{t_0}^t\|Z(\tau,y(\tau,x_0))\|\,d\tau\Big|$$
(5.26) $\displaystyle\leq\|x-x_0\|e^{KT}+\Big|\int_{t_0}^t\|Z(\tau,y(\tau,x_0))\|\,d\tau\Big|\leq\|x-x_0\|e^{KT}+M|t-t_0|.$
The continuity of $\dot y(t,x)$ is now a consequence of Eq. (5.1) and the continuity of $y$ and $Z.$
Corollary 5.11. Let $J=(a,b)\ni0$ and suppose $Z\in C(J\times X,X)$ satisfies
(5.27) $\|Z(t,x)-Z(t,y)\|\leq K\|x-y\|$ for all $x,y\in X$ and $t\in J.$
Then for all $x\in X,$ there is a unique solution $y(t,x)$ (for $t\in J$) to Eq. (5.1). Moreover $y(t,x)$ and $\dot y(t,x)$ are jointly continuous in $(t,x).$
Proof. Let $J_0=(a_0,b_0)\ni0$ be a precompact subinterval of $J$ and $Y:=BC(J_0,X).$ By compactness, $M:=\sup_{t\in\bar J_0}\|Z(t,0)\|<\infty,$ which combined with Eq. (5.27) implies
$$\sup_{t\in\bar J_0}\|Z(t,x)\|\leq M+K\|x\|\ \text{ for all }x\in X.$$
Using this estimate and Lemma 4.4 one easily shows $S_x(Y)\subset Y$ for all $x\in X.$ The proof of Theorem 5.10 now goes through without any further change.
5.5. Global Properties.
Definition 5.12 (Local Lipschitz Functions). Let $U\subset_oX,$ $J$ be an open interval and $Z\in C(J\times U,X).$ The function $Z$ is said to be locally Lipschitz in $x$ if for all $x\in U$ and all compact intervals $I\subset J$ there exists $K=K(x,I)<\infty$ and $\varepsilon=\varepsilon(x,I)>0$ such that $B(x,\varepsilon(x,I))\subset U$ and
(5.28) $\|Z(t,x_1)-Z(t,x_0)\|\leq K(x,I)\|x_1-x_0\|$ for all $x_0,x_1\in B(x,\varepsilon(x,I))$ and $t\in I.$
For the rest of this section, we will assume $J$ is an open interval containing $0,$ $U$ is an open subset of $X$ and $Z\in C(J\times U,X)$ is a locally Lipschitz function.
Lemma 5.13. Let $Z\in C(J\times U,X)$ be a locally Lipschitz function in $x,$ let $E$ be a compact subset of $U$ and $I$ be a compact subset of $J.$ Then there exists $\varepsilon>0$ such that $Z(t,x)$ is bounded for $(t,x)\in I\times E_\varepsilon$ and $Z(t,x)$ is $K$-Lipschitz on $E_\varepsilon$ for all $t\in I,$ where
$$E_\varepsilon:=\{x\in U:\operatorname{dist}(x,E)<\varepsilon\}.$$
Proof. Let $\varepsilon(x,I)$ and $K(x,I)$ be as in Definition 5.12 above. Since $E$ is compact, there exists a finite subset $\Lambda\subset E$ such that $E\subset V:=\bigcup_{x\in\Lambda}B(x,\varepsilon(x,I)/2).$ If $y\in V,$ there exists $x\in\Lambda$ such that $\|y-x\|<\varepsilon(x,I)/2$ and therefore
$$\|Z(t,y)\|\leq\|Z(t,x)\|+K(x,I)\|y-x\|\leq\|Z(t,x)\|+K(x,I)\varepsilon(x,I)/2
\leq\sup_{x\in\Lambda,\,t\in I}\{\|Z(t,x)\|+K(x,I)\varepsilon(x,I)/2\}=:M<\infty.$$
This shows $Z$ is bounded on $I\times V.$
Let
$$\varepsilon:=d(E,V^c)\wedge\tfrac12\min_{x\in\Lambda}\varepsilon(x,I)$$
and notice that $\varepsilon>0$ since $E$ is compact, $V^c$ is closed and $E\cap V^c=\emptyset.$ If $y,z\in E_\varepsilon$ and $\|y-z\|<\varepsilon,$ then as before there exists $x\in\Lambda$ such that $\|y-x\|<\varepsilon(x,I)/2.$ Therefore
$$\|z-x\|\leq\|z-y\|+\|y-x\|<\varepsilon+\varepsilon(x,I)/2\leq\varepsilon(x,I)$$
and since $y,z\in B(x,\varepsilon(x,I)),$ it follows that
$$\|Z(t,y)-Z(t,z)\|\leq K(x,I)\|y-z\|\leq K_0\|y-z\|$$
where $K_0:=\max_{x\in\Lambda}K(x,I)<\infty.$ On the other hand if $y,z\in E_\varepsilon$ and $\|y-z\|\geq\varepsilon,$ then
$$\|Z(t,y)-Z(t,z)\|\leq2M\leq\frac{2M}{\varepsilon}\|y-z\|.$$
Thus if we let $K:=\max\{2M/\varepsilon,K_0\},$ we have shown
$$\|Z(t,y)-Z(t,z)\|\leq K\|y-z\|\ \text{ for all }y,z\in E_\varepsilon\text{ and }t\in I.$$
Proposition 5.14 (Maximal Solutions). Let $Z\in C(J\times U,X)$ be a locally Lipschitz function in $x$ and let $x\in U$ be fixed. Then there is an interval $J_x=(a(x),b(x))$ with $a\in[-\infty,0)$ and $b\in(0,\infty]$ and a $C^1$-function $y:J_x\to U$ with the following properties:
(1) $y$ solves the ODE in Eq. (5.1).
(2) If $\tilde y:\tilde J=(\tilde a,\tilde b)\to U$ is another solution of Eq. (5.1) (we assume that $0\in\tilde J$) then $\tilde J\subset J_x$ and $\tilde y=y|_{\tilde J}.$
The function $y:J_x\to U$ is called the maximal solution to Eq. (5.1).
Proof. Suppose that $y_i:J_i=(a_i,b_i)\to U,$ $i=1,2,$ are two solutions to Eq. (5.1). We will start by showing that $y_1=y_2$ on $J_1\cap J_2.$ To do this,$^{9}$ let $J_0=(a_0,b_0)$ be chosen so that $0\in J_0\subset\subset J_1\cap J_2,$ and let $E:=y_1(\bar J_0)\cup y_2(\bar J_0),$ a compact subset of $X.$ Choose $\varepsilon>0$ as in Lemma 5.13 so that $Z$ is Lipschitz on $E_\varepsilon.$ Then $y_1|_{J_0},y_2|_{J_0}:J_0\to E_\varepsilon$ both solve Eq. (5.1) and therefore are equal by Corollary 5.9. Since $J_0=(a_0,b_0)$ was chosen arbitrarily so that $[a_0,b_0]\subset J_1\cap J_2,$ we may conclude that $y_1=y_2$ on $J_1\cap J_2.$
Let $(y_\alpha,J_\alpha=(a_\alpha,b_\alpha))_{\alpha\in A}$ denote the possible solutions to (5.1) such that $0\in J_\alpha.$ Define $J_x=\bigcup_\alpha J_\alpha$ and set $y=y_\alpha$ on $J_\alpha.$ We have just checked that $y$ is well defined and the reader may easily check that this function $y:J_x\to U$ satisfies all the conclusions of the theorem.

$^{9}$Here is an alternate proof of the uniqueness. Let
$$T\equiv\sup\{t\in[0,\min\{b_1,b_2\}):y_1=y_2\text{ on }[0,t]\}.$$
($T$ is the first positive time after which $y_1$ and $y_2$ disagree.) Suppose, for the sake of contradiction, that $T<\min\{b_1,b_2\}.$ Notice that $y_1(T)=y_2(T)=:x_0.$ Applying the local uniqueness theorem to $y_1(\cdot-T)$ and $y_2(\cdot-T),$ thought of as functions from $(-\delta,\delta)\to B(x_0,\varepsilon(x_0))$ for some $\delta$ sufficiently small, we learn that $y_1(\cdot-T)=y_2(\cdot-T)$ on $(-\delta,\delta).$ But this shows that $y_1=y_2$ on $[0,T+\delta)$ which contradicts the definition of $T.$ Hence we must have $T=\min\{b_1,b_2\},$ i.e. $y_1=y_2$ on $J_1\cap J_2\cap[0,\infty).$ A similar argument shows that $y_1=y_2$ on $J_1\cap J_2\cap(-\infty,0]$ as well.
Notation 5.15. For each $x\in U,$ let $J_x=(a(x),b(x))$ be the maximal interval on which Eq. (5.1) may be solved, see Proposition 5.14. Set $\mathcal D(Z)\equiv\bigcup_{x\in U}(J_x\times\{x\})\subset J\times U$ and let $\phi:\mathcal D(Z)\to U$ be defined by $\phi(t,x)=y(t)$ where $y$ is the maximal solution to Eq. (5.1). (So for each $x\in U,$ $\phi(\cdot,x)$ is the maximal solution to Eq. (5.1).)
Proposition 5.16. Let $Z\in C(J\times U,X)$ be a locally Lipschitz function in $x$ and $y:J_x=(a(x),b(x))\to U$ be the maximal solution to Eq. (5.1). If $b(x)<b,$ then either $\limsup_{t\uparrow b(x)}\|Z(t,y(t))\|=\infty$ or $y(b(x)-)\equiv\lim_{t\uparrow b(x)}y(t)$ exists and $y(b(x)-)\notin U.$ Similarly, if $a(x)>a,$ then either $\limsup_{t\downarrow a(x)}\|y(t)\|=\infty$ or $y(a(x)+)\equiv\lim_{t\downarrow a(x)}y(t)$ exists and $y(a(x)+)\notin U.$
Proof. Suppose that $b(x)<b$ and $M\equiv\limsup_{t\uparrow b(x)}\|Z(t,y(t))\|<\infty.$ Then there is a $b_0\in(0,b(x))$ such that $\|Z(t,y(t))\|\leq2M$ for all $t\in(b_0,b(x)).$ Thus, by the usual fundamental theorem of calculus argument,
$$\|y(t)-y(t_0)\|\leq\Big|\int_{t_0}^t\|Z(\tau,y(\tau))\|\,d\tau\Big|\leq2M|t-t_0|$$
for all $t,t_0\in(b_0,b(x)).$ From this it is easy to conclude that $y(b(x)-)=\lim_{t\uparrow b(x)}y(t)$ exists. If $y(b(x)-)\in U,$ by the local existence Theorem 5.10, there exists $\delta>0$ and $w\in C^1((b(x)-\delta,b(x)+\delta),U)$ such that
$$\dot w(t)=Z(t,w(t))\ \text{ and }\ w(b(x))=y(b(x)-).$$
Now define $\tilde y:(a(x),b(x)+\delta)\to U$ by
$$\tilde y(t)=\begin{cases}y(t)&\text{if }t\in J_x\\ w(t)&\text{if }t\in[b(x),b(x)+\delta).\end{cases}$$
The reader may now easily show $\tilde y$ solves the integral Eq. (5.2) and hence also solves Eq. (5.1) for $t\in(a(x),b(x)+\delta).$$^{10}$ But this violates the maximality of $y$ and hence we must have that $y(b(x)-)\notin U.$ The assertions for $t$ near $a(x)$ are proved similarly.
Example 5.17. Let $X=\mathbb R^2,$ $J=\mathbb R,$ $U=\{(x,y)\in\mathbb R^2:0<r<1\}$ where $r^2=x^2+y^2,$ and
$$Z(x,y)=\frac1r(x,y)+\frac1{1-r^2}(-y,x).$$
Then the unique solution $(x(t),y(t))$ to
$$\frac d{dt}(x(t),y(t))=Z(x(t),y(t))\ \text{ with }\ (x(0),y(0))=\left(\tfrac12,0\right)$$
is given by
$$(x(t),y(t))=\left(t+\tfrac12\right)\left(\cos\Big(\frac1{1/2-t}\Big),\ \sin\Big(\frac1{1/2-t}\Big)\right)$$
for $t\in J_{(1/2,0)}=(-\infty,1/2).$ Notice that $\|Z(x(t),y(t))\|\to\infty$ as $t\uparrow1/2$ and $\operatorname{dist}((x(t),y(t)),U^c)\to0$ as $t\uparrow1/2.$

$^{10}$See the argument in Proposition 5.19 for a slightly different method of extending $\tilde y$ which avoids the use of the integral equation (5.2).
Example 5.18. (Not worked out completely.) Let $X=U=\ell^2,$ and let $\psi\in C^\infty(\mathbb R^2)$ be a smooth function such that $\psi=1$ in a neighborhood of the line segment joining $(1,0)$ to $(0,1)$ and supported within the $1/10$-neighborhood of this segment. Choose $a_n\uparrow\infty$ and $b_n\uparrow\infty$ and define
(5.29) $\displaystyle Z(x)=\sum_{n=1}^\infty a_n\,\psi(b_n(x_n,x_{n+1}))\,(e_{n+1}-e_n).$
For any $x\in\ell^2,$ only a finite number of terms are non-zero in the above sum in a neighborhood of $x.$ Therefore $Z:\ell^2\to\ell^2$ is a smooth and hence locally Lipschitz vector field. Let $(y(t),J=(a,b))$ denote the maximal solution to
$$\dot y(t)=Z(y(t))\ \text{ with }\ y(0)=e_1.$$
Then if the $a_n$ and $b_n$ are chosen appropriately, $b<\infty$ and there will exist $t_n\uparrow b$ such that $y(t_n)$ is approximately $e_n$ for all $n.$ So again $y(t_n)$ does not have a limit yet $\sup_{t\in[0,b)}\|y(t)\|<\infty.$ The idea is that $Z$ is constructed to blow the particle from $e_1$ to $e_2$ to $e_3$ to $e_4$ etc. etc. with the time it takes to travel from $e_n$ to $e_{n+1}$ being on order $1/2^n.$ The vector field in Eq. (5.29) is a first approximation at such a vector field; it may have to be adjusted a little more to provide an honest example. In this example, we are having problems because $y(t)$ is "going off in dimensions."
Here is another version of Proposition 5.16 which is more useful when $\dim(X)<\infty.$
Proposition 5.19. Let $Z\in C(J\times U,X)$ be a locally Lipschitz function in $x$ and $y:J_x=(a(x),b(x))\to U$ be the maximal solution to Eq. (5.1).
(1) If $b(x)<b,$ then for every compact subset $K\subset U$ there exists $T_K<b(x)$ such that $y(t)\notin K$ for all $t\in[T_K,b(x)).$
(2) When $\dim(X)<\infty,$ we may write this condition as: if $b(x)<b,$ then either $\limsup_{t\uparrow b(x)}\|y(t)\|=\infty$ or $\liminf_{t\uparrow b(x)}\operatorname{dist}(y(t),U^c)=0.$
Proof. 1) Suppose that $b(x)<b$ and, for the sake of contradiction, there exists a compact set $K\subset U$ and $t_n\uparrow b(x)$ such that $y(t_n)\in K$ for all $n.$ Since $K$ is compact, by passing to a subsequence if necessary, we may assume $y_\infty:=\lim_{n\to\infty}y(t_n)$ exists in $K\subset U.$ By the local existence Theorem 5.10, there exists $T_0>0$ and $\varepsilon>0$ such that for each $x_0\in B(y_\infty,\varepsilon)$ there exists a unique solution $w(\cdot,x_0)\in C^1((-T_0,T_0),U)$ solving
$$\dot w(t,x_0)=Z(t,w(t,x_0))\ \text{ and }\ w(0,x_0)=x_0.$$
Now choose $n$ sufficiently large so that $t_n\in(b(x)-T_0/2,b(x))$ and $y(t_n)\in B(y_\infty,\varepsilon).$ Define $\tilde y:(a(x),b(x)+T_0/2)\to U$ by
$$\tilde y(t)=\begin{cases}y(t)&\text{if }t\in J_x\\ w(t-t_n,y(t_n))&\text{if }t\in(t_n-T_0,b(x)+T_0/2)\subset(t_n-T_0,t_n+T_0).\end{cases}$$
By uniqueness of solutions to ODEs $\tilde y$ is well defined, $\tilde y\in C^1((a(x),b(x)+T_0/2),X)$ and $\tilde y$ solves the ODE in Eq. (5.1). But this violates the maximality of $y.$
2) For each $n\in\mathbb N$ let
$$K_n:=\{x\in U:\|x\|\leq n\ \text{ and }\ \operatorname{dist}(x,U^c)\geq1/n\}.$$
Then $K_n\subset U$ and each $K_n$ is a closed bounded set and hence compact if $\dim(X)<\infty.$ Therefore if $b(x)<b,$ by item 1., there exists $T_n\in[0,b(x))$ such that $y(t)\notin K_n$ for all $t\in[T_n,b(x)),$ or equivalently $\|y(t)\|>n$ or $\operatorname{dist}(y(t),U^c)<1/n$ for all $t\in[T_n,b(x)).$
Remark 5.20. In general it is not true that the functions $a$ and $b$ are continuous. For example, let $U$ be the region in $\mathbb R^2$ described in polar coordinates by $r>0$ and $0<\theta<3\pi/4$ and $Z(x,y)=(0,-1)$ as in Figure 12 below. Then $b(x,y)=y$ for all $x,y>0$ while $b(x,y)=\infty$ for all $x<0$ and $y\in\mathbb R,$ which shows $b$ is discontinuous. On the other hand notice that
$$\{b>t\}=\{x<0\}\cup\{(x,y):x\geq0,\ y>t\}$$
is an open set for all $t>0.$
Figure 12. An example of a vector field for which $b(x)$ is discontinuous. This is given in the top left hand corner of the figure. The map $\psi$ would allow the reader to find an example on $\mathbb R^2$ if so desired. Some calculations show that $Z$ transferred to $\mathbb R^2$ by the map $\psi$ is given by
$$\tilde Z(x,y)=e^{x}\left(\sin\Big(\frac{3\pi}8+\frac34\tan^{-1}(y)\Big),\ \cos\Big(\frac{3\pi}8+\frac34\tan^{-1}(y)\Big)\right).$$
Theorem 5.21 (Global Continuity). Let $Z\in C(J\times U,X)$ be a locally Lipschitz function in $x.$ Then $\mathcal D(Z)$ is an open subset of $J\times U$ and the functions $\phi:\mathcal D(Z)\to U$ and $\dot\phi:\mathcal D(Z)\to U$ are continuous. More precisely, for all $x_0\in U$ and all open intervals $J_0$ such that $0\in J_0\subset\subset J_{x_0}$ there exists $\delta=\delta(x_0,J_0,Z)>0$ and $C=C(x_0,J_0,Z)<\infty$ such that for all $x\in B(x_0,\delta),$ $J_0\subset J_x$ and
(5.30) $\|\phi(\cdot,x)-\phi(\cdot,x_0)\|_{BC(J_0,U)}\leq C\,\|x-x_0\|.$
Proof. Let $|J_0|=b_0-a_0,$ $I=\bar J_0$ and $E:=y(\bar J_0),$ a compact subset of $U,$ where $y:=\phi(\cdot,x_0)$ is the maximal solution through $x_0,$ and let $\varepsilon>0$ and $K<\infty$ be given as in Lemma 5.13, i.e. $K$ is the Lipschitz constant for $Z$ on $E_\varepsilon.$ Also recall the notation: $\Delta_1(t)=[0,t]$ if $t>0$ and $\Delta_1(t)=[t,0]$ if $t<0.$
Suppose that $x\in E_\varepsilon,$ then by Corollary 5.9,
(5.31) $\|\phi(t,x)-\phi(t,x_0)\|\leq\|x-x_0\|e^{K|t|}\leq\|x-x_0\|e^{K|J_0|}$
for all $t\in J_0\cap J_x$ such that $\phi(\Delta_1(t),x)\subset E_\varepsilon.$ Letting $\delta:=\varepsilon e^{-K|J_0|}/2,$ and assuming $x\in B(x_0,\delta),$ the previous equation implies
$$\|\phi(t,x)-\phi(t,x_0)\|\leq\varepsilon/2<\varepsilon\ \text{ for all }t\in J_0\cap J_x\text{ such that }\phi(\Delta_1(t),x)\subset E_\varepsilon.$$
This estimate further shows that $\phi(t,x)$ remains bounded and strictly away from the boundary of $U$ for all such $t.$ Therefore, it follows from Proposition 5.14 and "continuous induction" (see the argument in the proof of Proposition 4.7) that $J_0\subset J_x$ and Eq. (5.31) is valid for all $t\in J_0.$ This proves Eq. (5.30) with $C:=e^{K|J_0|}.$
Suppose that $(t_0,x_0)\in\mathcal D(Z)$ and let $0\in J_0\subset\subset J_{x_0}$ such that $t_0\in J_0,$ and let $\delta$ be as above. Then we have just shown $J_0\times B(x_0,\delta)\subset\mathcal D(Z),$ which proves $\mathcal D(Z)$ is open. Furthermore, since the evaluation map
$$(t_0,y)\in J_0\times BC(J_0,U)\stackrel{e}{\longrightarrow}y(t_0)\in X$$
is continuous (as the reader should check) it follows that $\phi=e\circ(x\to\phi(\cdot,x)):J_0\times B(x_0,\delta)\to U$ is also continuous, being the composition of continuous maps. The continuity of $\dot\phi(t_0,x)$ is a consequence of the continuity of $\phi$ and the differential equation (5.1).
Alternatively using Eq. (5.2),
$$\|\phi(t_0,x)-\phi(t,x_0)\|\leq\|\phi(t_0,x)-\phi(t_0,x_0)\|+\|\phi(t_0,x_0)-\phi(t,x_0)\|
\leq C\|x-x_0\|+\Big|\int_t^{t_0}\|Z(\tau,\phi(\tau,x_0))\|\,d\tau\Big|
\leq C\|x-x_0\|+M|t_0-t|$$
where $C$ is the constant in Eq. (5.30) and $M=\sup_{\tau\in J_0}\|Z(\tau,\phi(\tau,x_0))\|<\infty.$ This clearly shows $\phi$ is continuous.
5.6. Semi-Group Properties of time independent flows. To end this chapter we investigate the semi-group property of the flow associated to the vector field $Z.$ It will be convenient to introduce the following suggestive notation. For $(t,x)\in\mathcal D(Z),$ set $e^{tZ}(x)=\phi(t,x).$ So the path $t\to e^{tZ}(x)$ is the maximal solution to
$$\frac d{dt}e^{tZ}(x)=Z(e^{tZ}(x))\ \text{ with }\ e^{0Z}(x)=x.$$
This exponential notation will be justified shortly. It is convenient to have the following conventions.
Notation 5.22. We write $f:X\to X$ to mean a function defined on some open subset $D(f)\subset X.$ The open set $D(f)$ will be called the domain of $f.$ Given two functions $f:X\to X$ and $g:X\to X$ with domains $D(f)$ and $D(g)$ respectively, we define the composite function $f\circ g:X\to X$ to be the function with domain
$$D(f\circ g)=\{x\in X:x\in D(g)\text{ and }g(x)\in D(f)\}=g^{-1}(D(f))$$
given by the rule $f\circ g(x)=f(g(x))$ for all $x\in D(f\circ g).$ We now write $f=g$ iff $D(f)=D(g)$ and $f(x)=g(x)$ for all $x\in D(f)=D(g).$ We will also write $f\subset g$ iff $D(f)\subset D(g)$ and $g|_{D(f)}=f.$
Theorem 5.23. For fixed $t\in\mathbb R$ we consider $e^{tZ}$ as a function from $X$ to $X$ with domain $D(e^{tZ})=\{x\in U:(t,x)\in\mathcal D(Z)\},$ where $D(\phi)=\mathcal D(Z)\subset\mathbb R\times U$ and $\mathcal D(Z)$ and $\phi$ are defined in Notation 5.15. Conclusions:
(1) If $t,s\in\mathbb R$ and $ts\geq0,$ then $e^{tZ}\circ e^{sZ}=e^{(t+s)Z}.$
(2) If $t\in\mathbb R,$ then $e^{tZ}\circ e^{-tZ}=\operatorname{Id}_{D(e^{-tZ})}.$
(3) For arbitrary $t,s\in\mathbb R,$ $e^{tZ}\circ e^{sZ}\subset e^{(t+s)Z}.$
Proof. Item 1. For simplicity assume that $t,s\geq0.$ The case $t,s\leq0$ is left to the reader. Suppose that $x\in D(e^{tZ}\circ e^{sZ}).$ Then by assumption $x\in D(e^{sZ})$ and $e^{sZ}(x)\in D(e^{tZ}).$ Define the path $y(\tau)$ via:
$$y(\tau)=\begin{cases}e^{\tau Z}(x)&\text{if }0\leq\tau\leq s\\ e^{(\tau-s)Z}(e^{sZ}(x))&\text{if }s\leq\tau\leq t+s.\end{cases}$$
It is easy to check that $y$ solves $\dot y(\tau)=Z(y(\tau))$ with $y(0)=x.$ But since $e^{\tau Z}(x)$ is the maximal solution we must have that $x\in D(e^{(t+s)Z})$ and $y(t+s)=e^{(t+s)Z}(x).$ That is $e^{(t+s)Z}(x)=e^{tZ}\circ e^{sZ}(x).$ Hence we have shown that $e^{tZ}\circ e^{sZ}\subset e^{(t+s)Z}.$
To finish the proof of item 1. it suffices to show that $D(e^{(t+s)Z})\subset D(e^{tZ}\circ e^{sZ}).$ Take $x\in D(e^{(t+s)Z}),$ then clearly $x\in D(e^{sZ}).$ Set $y(\tau)=e^{(\tau+s)Z}(x)$ defined for $0\leq\tau\leq t.$ Then $y$ solves
$$\dot y(\tau)=Z(y(\tau))\ \text{ with }\ y(0)=e^{sZ}(x).$$
But since $\tau\to e^{\tau Z}(e^{sZ}(x))$ is the maximal solution to the above initial value problem we must have that $y(\tau)=e^{\tau Z}(e^{sZ}(x)),$ and in particular at $\tau=t,$ $e^{(t+s)Z}(x)=e^{tZ}(e^{sZ}(x)).$ This shows that $x\in D(e^{tZ}\circ e^{sZ})$ and in fact $e^{(t+s)Z}\subset e^{tZ}\circ e^{sZ}.$
Item 2. Let $x\in D(e^{-tZ}),$ and again assume for simplicity that $t\geq0.$ Set $y(\tau)=e^{(\tau-t)Z}(x)$ defined for $0\leq\tau\leq t.$ Notice that $y(0)=e^{-tZ}(x)$ and $\dot y(\tau)=Z(y(\tau)).$ This shows that $y(\tau)=e^{\tau Z}(e^{-tZ}(x))$ and in particular that $x\in D(e^{tZ}\circ e^{-tZ})$ and $e^{tZ}\circ e^{-tZ}(x)=x.$ This proves item 2.
Item 3. I will only consider the case that $s<0$ and $t+s\geq0,$ the other cases are handled similarly. Write $u$ for $t+s,$ so that $t=-s+u.$ We know that $e^{tZ}=e^{uZ}\circ e^{-sZ}$ by item 1. Therefore
$$e^{tZ}\circ e^{sZ}=(e^{uZ}\circ e^{-sZ})\circ e^{sZ}.$$
Notice, in general, one has $(f\circ g)\circ h=f\circ(g\circ h)$ (you prove). Hence, the above displayed equation and item 2. imply that
$$e^{tZ}\circ e^{sZ}=e^{uZ}\circ(e^{-sZ}\circ e^{sZ})=e^{(t+s)Z}\circ I_{D(e^{sZ})}\subset e^{(t+s)Z}.$$
The following result is a trivial but conceptually illuminating partial converse to Theorem 5.23.
Proposition 5.24 (Flows and Complete Vector Fields). Suppose $U\subset_oX,$ $\phi\in C(\mathbb R\times U,U)$ and $\phi_t(x)=\phi(t,x).$ Suppose $\phi$ satisfies:
(1) $\phi_0=I_U,$
(2) $\phi_t\circ\phi_s=\phi_{t+s}$ for all $t,s\in\mathbb R,$ and
(3) $Z(x):=\dot\phi(0,x)$ exists for all $x\in U$ and $Z\in C(U,X)$ is locally Lipschitz.
Then $\phi_t=e^{tZ}.$
Proof. Let $x\in U$ and $y(t)\equiv\phi_t(x).$ Then using Item 2.,
$$\dot y(t)=\frac d{ds}\Big|_0y(t+s)=\frac d{ds}\Big|_0\phi_{(t+s)}(x)=\frac d{ds}\Big|_0\phi_s\circ\phi_t(x)=Z(y(t)).$$
Since $y(0)=x$ by Item 1. and $Z$ is locally Lipschitz by Item 3., we know by uniqueness of solutions to ODEs (Corollary 5.9) that $\phi_t(x)=y(t)=e^{tZ}(x).$
5.7. Exercises.
Exercise 5.1. Find a vector field $Z$ such that $e^{(t+s)Z}$ is not contained in $e^{tZ}\circ e^{sZ}.$
Definition 5.25. A locally Lipschitz function $Z:U\subset_oX\to X$ is said to be a complete vector field if $\mathcal D(Z)=\mathbb R\times U.$ That is, for any $x\in U,$ $t\to e^{tZ}(x)$ is defined for all $t\in\mathbb R.$
Exercise 5.2. Suppose that $Z:X\to X$ is a locally Lipschitz function. Assume there is a constant $C>0$ such that
$$\|Z(x)\|\leq C(1+\|x\|)\ \text{ for all }x\in X.$$
Then $Z$ is complete. Hint: use Gronwall's Lemma 5.8 and Proposition 5.16.
Exercise 5.3. Suppose $y$ is a solution to $\dot y(t)=|y(t)|^{1/2}$ with $y(0)=0.$ Show there exist $a,b\in[0,\infty]$ such that
$$y(t)=\begin{cases}\tfrac14(t-b)^2&\text{if }t\geq b\\ 0&\text{if }-a<t<b\\ -\tfrac14(t+a)^2&\text{if }t\leq-a.\end{cases}$$
Exercise 5.4. Using the fact that the solutions to Eq. (5.3) are never $0$ if $x\neq0,$ show that $y(t)=0$ is the only solution to Eq. (5.3) with $y(0)=0.$
Exercise 5.5. Suppose that $A\in L(X).$ Show directly that:
(1) $e^{tA},$ defined in Eq. (5.14), is convergent in $L(X)$ when equipped with the operator norm.
(2) $e^{tA}$ is differentiable in $t$ and that $\frac d{dt}e^{tA}=Ae^{tA}.$
Exercise 5.6. Suppose that $A\in L(X)$ and $v\in X$ is an eigenvector of $A$ with eigenvalue $\lambda,$ i.e. that $Av=\lambda v.$ Show $e^{tA}v=e^{t\lambda}v.$ Also show that if $X=\mathbb R^n$ and $A$ is a diagonalizable $n\times n$ matrix with
$$A=SDS^{-1}\ \text{ with }\ D=\operatorname{diag}(\lambda_1,\dots,\lambda_n)$$
then $e^{tA}=Se^{tD}S^{-1}$ where $e^{tD}=\operatorname{diag}(e^{t\lambda_1},\dots,e^{t\lambda_n}).$
Exercise 5.7. Suppose that $A,B\in L(X)$ and $[A,B]\equiv AB-BA=0.$ Show that $e^{(A+B)}=e^Ae^B.$
Exercise 5.8. Suppose $A\in C(\mathbb R,L(X))$ satisfies $[A(t),A(s)]=0$ for all $s,t\in\mathbb R.$ Show
$$y(t):=e^{\left(\int_0^tA(\tau)\,d\tau\right)}x$$
is the unique solution to $\dot y(t)=A(t)y(t)$ with $y(0)=x.$
Exercise 5.9. Compute $e^{tA}$ when
$$A=\begin{pmatrix}0&1\\-1&0\end{pmatrix}$$
and use the result to prove the formula
$$\cos(s+t)=\cos s\cos t-\sin s\sin t.$$
Hint: Sum the series and use $e^{tA}e^{sA}=e^{(t+s)A}.$
Exercise 5.10. Compute $e^{tA}$ when
$$A=\begin{pmatrix}0&a&b\\0&0&c\\0&0&0\end{pmatrix}$$
with $a,b,c\in\mathbb R.$ Use your result to compute $e^{t(\lambda I+A)}$ where $\lambda\in\mathbb R$ and $I$ is the $3\times3$ identity matrix. Hint: Sum the series.
Exercise 5.11. Prove Theorem 5.7 using the following outline.
(1) First show $t\in[0,\infty)\to T_t\in L(X)$ is continuous.
(2) For $\varepsilon>0,$ let $S_\varepsilon:=\frac1\varepsilon\int_0^\varepsilon T_\tau\,d\tau\in L(X).$ Show $S_\varepsilon\to I$ as $\varepsilon\downarrow0$ and conclude from this that $S_\varepsilon$ is invertible when $\varepsilon>0$ is sufficiently small. For the remainder of the proof fix such a small $\varepsilon>0.$
(3) Show
$$T_tS_\varepsilon=\frac1\varepsilon\int_t^{t+\varepsilon}T_\tau\,d\tau$$
and conclude from this that
$$\lim_{t\downarrow0}t^{-1}(T_t-I)S_\varepsilon=\frac1\varepsilon(T_\varepsilon-\operatorname{Id}_X).$$
(4) Using the fact that $S_\varepsilon$ is invertible, conclude $A=\lim_{t\downarrow0}t^{-1}(T_t-I)$ exists in $L(X)$ and that
$$A=\frac1\varepsilon(T_\varepsilon-I)S_\varepsilon^{-1}.$$
(5) Now show, using the semigroup property and step 4., that $\frac d{dt}T_t=AT_t$ for all $t>0.$
(6) Using step 5, show $\frac d{dt}\big(e^{-tA}T_t\big)=0$ for all $t>0$ and therefore $e^{-tA}T_t=e^{-0A}T_0=I.$
Exercise 5.12 (Higher Order ODE). Let $X$ be a Banach space, $U\subset_oX^n$ and $f\in C(J\times U,X)$ be a locally Lipschitz function in $x=(x_1,\dots,x_n).$ Show the $n^{\text{th}}$ order ordinary differential equation,
(5.32) $y^{(n)}(t)=f(t,y(t),\dot y(t),\dots,y^{(n-1)}(t))$ with $y^{(k)}(0)=y_0^k$ for $k=0,1,2,\dots,n-1,$
where $(y_0^0,\dots,y_0^{n-1})$ is given in $U,$ has a unique solution for small $t\in J.$ Hint: let $\mathbf y(t)=\big(y(t),\dot y(t),\dots,y^{(n-1)}(t)\big)$ and rewrite Eq. (5.32) as a first order ODE of the form
$$\dot{\mathbf y}(t)=Z(t,\mathbf y(t))\ \text{ with }\ \mathbf y(0)=(y_0^0,\dots,y_0^{n-1}).$$
Exercise 5.13. Use the results of Exercises 5.10 and 5.12 to solve
$$\ddot y(t)-2\dot y(t)+y(t)=0\ \text{ with }\ y(0)=a\ \text{ and }\ \dot y(0)=b.$$
Hint: The $2\times2$ matrix associated to this system, $A,$ has only one eigenvalue $1$ and may be written as $A=I+B$ where $B^2=0.$
Exercise 5.14. Suppose that $A:\mathbb R\to L(X)$ is a continuous function and $U,V:\mathbb R\to L(X)$ are the unique solutions to the linear differential equations
$$\dot V(t)=A(t)V(t)\ \text{ with }\ V(0)=I$$
and
(5.33) $\dot U(t)=-U(t)A(t)$ with $U(0)=I.$
Prove that $V(t)$ is invertible and that $V^{-1}(t)=U(t).$ Hint: 1) show $\frac d{dt}[U(t)V(t)]=0$ (which is sufficient if $\dim(X)<\infty$) and 2) show $y(t):=V(t)U(t)$ solves a linear ordinary differential equation that has $y\equiv I$ as an obvious solution. Then use the uniqueness of solutions to ODEs. (The fact that $U(t)$ must be defined as in Eq. (5.33) is the content of Exercise 26.2 below.)
Exercise 5.15 (Duhamel's Principle I). Suppose that $A:\mathbb R\to L(X)$ is a continuous function and $V:\mathbb R\to L(X)$ is the unique solution to the linear differential equation in Eq. (26.36). Let $x\in X$ and $h\in C(\mathbb R,X)$ be given. Show that the unique solution to the differential equation:
(5.34) $\dot y(t)=A(t)y(t)+h(t)$ with $y(0)=x$
is given by
(5.35) $y(t)=V(t)x+V(t)\int_0^tV(\tau)^{-1}h(\tau)\,d\tau.$
Hint: compute $\frac d{dt}[V^{-1}(t)y(t)]$ when $y$ solves Eq. (5.34).
Exercise 5.16 (Duhamel's Principle II). Suppose that $A:\mathbb R\to L(X)$ is a continuous function and $V:\mathbb R\to L(X)$ is the unique solution to the linear differential equation in Eq. (26.36). Let $W_0\in L(X)$ and $H\in C(\mathbb R,L(X))$ be given. Show that the unique solution to the differential equation:
(5.36) $\dot W(t)=A(t)W(t)+H(t)$ with $W(0)=W_0$
is given by
(5.37) $W(t)=V(t)W_0+V(t)\int_0^tV(\tau)^{-1}H(\tau)\,d\tau.$
Exercise 5.17 (Non-Homogeneous ODE). Suppose that $U\subset_oX$ is open and $Z:\mathbb R\times U\to X$ is a continuous function. Let $J=(a,b)$ be an interval and $t_0\in J.$ Suppose that $y\in C^1(J,U)$ is a solution to the non-homogeneous differential equation:
(5.38) $\dot y(t)=Z(t,y(t))$ with $y(t_0)=x\in U.$
Define $Y\in C^1(J-t_0,\mathbb R\times U)$ by $Y(t)\equiv(t+t_0,y(t+t_0)).$ Show that $Y$ solves the homogeneous differential equation
(5.39) $\dot Y(t)=\tilde Z(Y(t))$ with $Y(0)=(t_0,x),$
where $\tilde Z(t,x)\equiv(1,Z(t,x)).$ Conversely, suppose that $Y\in C^1(J-t_0,\mathbb R\times U)$ is a solution to Eq. (5.39). Show that $Y(t)=(t+t_0,y(t+t_0))$ for some $y\in C^1(J,U)$ satisfying Eq. (5.38). (In this way the theory of non-homogeneous ODEs may be reduced to the theory of homogeneous ODEs.)
Exercise 5.18 (Differential Equations with Parameters). Let $W$ be another Banach space, $U\times V\subset_oX\times W$ and $Z\in C(U\times V,X)$ be a locally Lipschitz function on $U\times V.$ For each $(x,w)\in U\times V,$ let $t\in J_{x,w}\to\phi(t,x,w)$ denote the maximal solution to the ODE
(5.40) $\dot y(t)=Z(y(t),w)$ with $y(0)=x.$
Prove
(5.41) $D:=\{(t,x,w)\in\mathbb R\times U\times V:t\in J_{x,w}\}$
is open in $\mathbb R\times U\times V$ and $\phi$ and $\dot\phi$ are continuous functions on $D.$
Hint: If $y(t)$ solves the differential equation in (5.40), then $v(t)\equiv(y(t),w)$ solves the differential equation,
(5.42) $\dot v(t)=\tilde Z(v(t))$ with $v(0)=(x,w),$
where $\tilde Z(x,w)\equiv(Z(x,w),0)\in X\times W,$ and let $\tilde\phi(t,(x,w)):=v(t).$ Now apply Theorem 5.21 to the differential equation (5.42).
Exercise 5.19 (Abstract Wave Equation). For $A\in L(X)$ and $t\in\mathbb R,$ let
$$\cos(tA):=\sum_{n=0}^\infty\frac{(-1)^n}{(2n)!}t^{2n}A^{2n}\quad\text{and}\quad
\frac{\sin(tA)}A:=\sum_{n=0}^\infty\frac{(-1)^n}{(2n+1)!}t^{2n+1}A^{2n}.$$
Show that the unique solution $y\in C^2(\mathbb R,X)$ to
(5.43) $\ddot y(t)+A^2y(t)=0$ with $y(0)=y_0$ and $\dot y(0)=\dot y_0\in X$
is given by
$$y(t)=\cos(tA)y_0+\frac{\sin(tA)}A\,\dot y_0.$$
Remark 5.26. Exercise 5.19 can be done by direct verification. Alternatively and more instructively, rewrite Eq. (5.43) as a first order ODE using Exercise 5.12. In doing so you will be led to compute $e^{tB}$ where $B\in L(X\times X)$ is given by
$$B=\begin{pmatrix}0&I\\-A^2&0\end{pmatrix},$$
where we are writing elements of $X\times X$ as column vectors, $\begin{pmatrix}x_1\\x_2\end{pmatrix}.$ You should then show
$$e^{tB}=\begin{pmatrix}\cos(tA)&\frac{\sin(tA)}A\\-A\sin(tA)&\cos(tA)\end{pmatrix}$$
where
$$A\sin(tA):=\sum_{n=0}^\infty\frac{(-1)^n}{(2n+1)!}t^{2n+1}A^{2(n+1)}.$$
Exercise 5.20 (Duhamel's Principle for the Abstract Wave Equation). Continue the notation in Exercise 5.19, but now consider the ODE,
(5.44) $\ddot y(t)+A^2y(t)=f(t)$ with $y(0)=y_0$ and $\dot y(0)=\dot y_0\in X$
where $f\in C(\mathbb R,X).$ Show the unique solution to Eq. (5.44) is given by
(5.45) $y(t)=\cos(tA)y_0+\dfrac{\sin(tA)}A\,\dot y_0+\displaystyle\int_0^t\frac{\sin((t-\tau)A)}A\,f(\tau)\,d\tau.$
Hint: Again this could be proved by direct calculation. However it is more instructive to deduce Eq. (5.45) from Exercise 5.15 and the comments in Remark 5.26.
6. Algebras, σ-Algebras and Measurability
6.1. Introduction: What are measures and why "measurable" sets.
Definition 6.1 (Preliminary). Suppose that $X$ is a set and $\mathcal P(X)$ denotes the collection of all subsets of $X.$ A measure $\mu$ on $X$ is a function $\mu:\mathcal P(X)\to[0,\infty]$ such that
(1) $\mu(\emptyset)=0$
(2) If $\{A_i\}_{i=1}^N$ is a finite ($N<\infty$) or countable ($N=\infty$) collection of subsets of $X$ which are pair-wise disjoint (i.e. $A_i\cap A_j=\emptyset$ if $i\neq j$) then
$$\mu\big(\cup_{i=1}^NA_i\big)=\sum_{i=1}^N\mu(A_i).$$
Example 6.2. Suppose that $X$ is any set and $x\in X$ is a point. For $A\subset X,$ let
$$\delta_x(A)=\begin{cases}1&\text{if }x\in A\\0&\text{otherwise.}\end{cases}$$
Then $\mu=\delta_x$ is a measure on $X$ called the Dirac delta function at $x.$
Example 6.3. Suppose that $\mu$ is a measure on $X$ and $\lambda>0,$ then $\lambda\cdot\mu$ is also a measure on $X.$ Moreover, if $\{\mu_\alpha\}_{\alpha\in J}$ are all measures on $X,$ then $\mu=\sum_{\alpha\in J}\mu_\alpha,$ i.e.
$$\mu(A)=\sum_{\alpha\in J}\mu_\alpha(A)\ \text{ for all }A\subset X,$$
is a measure on $X.$ (See Section 2 for the meaning of this sum.) To prove this we must show that $\mu$ is countably additive. Suppose that $\{A_i\}_{i=1}^\infty$ is a collection of pair-wise disjoint subsets of $X,$ then
$$\sum_{i=1}^\infty\mu(A_i)=\sum_{i=1}^\infty\sum_{\alpha\in J}\mu_\alpha(A_i)
=\sum_{\alpha\in J}\sum_{i=1}^\infty\mu_\alpha(A_i)
=\sum_{\alpha\in J}\mu_\alpha\big(\cup_{i=1}^\infty A_i\big)
=\mu\big(\cup_{i=1}^\infty A_i\big),$$
wherein the second equality we used Theorem 2.21 (to interchange the order of summation) and in the third we used the fact that each $\mu_\alpha$ is a measure.
Example 6.4. Suppose that $X$ is a set and $\lambda:X\to[0,\infty]$ is a function. Then
$$\mu:=\sum_{x\in X}\lambda(x)\delta_x$$
is a measure; explicitly,
$$\mu(A)=\sum_{x\in A}\lambda(x)$$
for all $A\subset X.$
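For finite sets the weighted counting measure of Example 6.4 is something one can compute directly, and additivity over disjoint sets is just regrouping a sum of non-negative terms. The sketch below is a toy illustration (not in the text); the weight function chosen is arbitrary.

```python
# mu(A) = sum_{x in A} lambda(x) on finite subsets, as in Example 6.4.
def mu(A, lam):
    """Measure of a finite subset A for the weight function lam."""
    return sum(lam(x) for x in A)

if __name__ == "__main__":
    lam = lambda x: 1.0 / (x * x) if x > 0 else 0.0
    A1, A2 = {1, 3, 5}, {2, 4}
    # A1 and A2 are disjoint, so mu(A1 u A2) = mu(A1) + mu(A2).
    print(mu(A1 | A2, lam), mu(A1, lam) + mu(A2, lam))
```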
6.2. The problem with Lebesgue "measure".
Question 1. Does there exist a measure $\mu:\mathcal P(\mathbb R)\to[0,\infty]$ such that
(1) $\mu([a,b))=(b-a)$ for all $a<b$ and
(2) (Translation invariant) $\mu(A+x)=\mu(A)$ for all $x\in\mathbb R$? (Here $A+x:=\{y+x:y\in A\}\subset\mathbb R.$)
The answer is no, which we now demonstrate. In fact the answer is no even if we replace (1) by the condition that $0<\mu((0,1])<\infty.$
Let us identify $[0,1)$ with the unit circle $S^1:=\{z\in\mathbb C:|z|=1\}$ by the map $\phi(t)=e^{i2\pi t}\in S^1$ for $t\in[0,1).$ Using this identification we may use $\mu$ to define a function $\nu$ on $\mathcal P(S^1)$ by $\nu(\phi(A))=\mu(A)$ for all $A\subset[0,1).$ This new function is a measure on $S^1$ with the property that $0<\nu((0,1])<\infty.$ For $z\in S^1$ and $N\subset S^1$ let
(6.1) $zN:=\{zn\in S^1:n\in N\},$
that is to say $e^{i\theta}N$ is $N$ rotated counter clockwise by angle $\theta.$ We now claim that $\nu$ is invariant under these rotations, i.e.
(6.2) $\nu(zN)=\nu(N)$
for all $z\in S^1$ and $N\subset S^1.$ To verify this, write $N=\phi(A)$ and $z=\phi(t)$ for some $t\in[0,1)$ and $A\subset[0,1).$ Then
$$\phi(t)\phi(A)=\phi(t+A\ \mathrm{mod}\ 1)$$
where for $A\subset[0,1)$ and $t\in[0,1),$
$$t+A\ \mathrm{mod}\ 1=\{a+t\ \mathrm{mod}\ 1\in[0,1):a\in A\}
=\big(t+(A\cap\{a<1-t\})\big)\cup\big((t-1)+(A\cap\{a\geq1-t\})\big).$$
Thus
$$\nu(\phi(t)\phi(A))=\mu(t+A\ \mathrm{mod}\ 1)
=\mu\big(\big(t+(A\cap\{a<1-t\})\big)\cup\big((t-1)+(A\cap\{a\geq1-t\})\big)\big)$$
$$=\mu\big(t+(A\cap\{a<1-t\})\big)+\mu\big((t-1)+(A\cap\{a\geq1-t\})\big)
=\mu(A\cap\{a<1-t\})+\mu(A\cap\{a\geq1-t\})$$
$$=\mu\big((A\cap\{a<1-t\})\cup(A\cap\{a\geq1-t\})\big)=\mu(A)=\nu(\phi(A)).$$
Therefore it suffices to prove that there is no finite measure $\nu$ on $S^1$ such that Eq. (6.2) holds. To do this we will construct a "non-measurable" set $N=\phi(A)$ for some $A\subset[0,1).$
To do this let
$$R:=\{z=e^{i2\pi t}:t\in\mathbb Q\}=\{z=e^{i2\pi t}:t\in[0,1)\cap\mathbb Q\},$$
a countable subgroup of $S^1.$ As above $R$ acts on $S^1$ by rotations and divides $S^1$ up into equivalence classes, where $z,w\in S^1$ are equivalent if $z=rw$ for some $r\in R.$ Choose (using the axiom of choice) one representative point $n$ from each of these equivalence classes and let $N\subset S^1$ be the set of these representative points. Then every point $z\in S^1$ may be uniquely written as $z=nr$ with $n\in N$ and $r\in R.$ That is to say
(6.3) $S^1=\coprod_{r\in R}(rN)$
where $\coprod$ is used to denote the union of pair-wise disjoint sets. By Eqs. (6.2) and (6.3),
$$\nu(S^1)=\sum_{r\in R}\nu(rN)=\sum_{r\in R}\nu(N).$$
The right member of this equation is either $0$ or $\infty$: $0$ if $\nu(N)=0$ and $\infty$ if $\nu(N)>0.$ In either case it is not equal to $\nu(S^1)\in(0,\infty).$ Thus we have reached the desired contradiction.
Proof. (Second proof of Answer to Question 1) For $N\subset[0,1)$ and $\alpha\in[0,1),$ let
$$N^\alpha=N+\alpha\ \mathrm{mod}\ 1=\{a+\alpha\ \mathrm{mod}\ 1\in[0,1):a\in N\}
=\big(\alpha+(N\cap\{a<1-\alpha\})\big)\cup\big((\alpha-1)+(N\cap\{a\geq1-\alpha\})\big).$$
If $\mu$ is a measure satisfying the properties of the Question we would have
$$\mu(N^\alpha)=\mu\big(\alpha+(N\cap\{a<1-\alpha\})\big)+\mu\big((\alpha-1)+(N\cap\{a\geq1-\alpha\})\big)
=\mu(N\cap\{a<1-\alpha\})+\mu(N\cap\{a\geq1-\alpha\})$$
(6.4) $=\mu\big((N\cap\{a<1-\alpha\})\cup(N\cap\{a\geq1-\alpha\})\big)=\mu(N).$
We will now construct a bad set $N$ which coupled with Eq. (6.4) will lead to a contradiction.
Set
$$Q_x\equiv\{x+r\in\mathbb R:r\in\mathbb Q\}=x+\mathbb Q.$$
Notice that $Q_x\cap Q_y\neq\emptyset$ implies that $Q_x=Q_y.$ Let $\mathcal O=\{Q_x:x\in\mathbb R\}$ — the orbit space of the $\mathbb Q$-action. For all $A\in\mathcal O$ choose $f(A)\in[0,1/3)\cap A.$$^{12}$ Define $N=f(\mathcal O).$ Then observe:
(1) $f(A)=f(B)$ implies that $A\cap B\neq\emptyset$ which implies that $A=B,$ so that $f$ is injective.
(2) $\mathcal O=\{Q_n:n\in N\}.$
Let $R$ be the countable set,
$$R\equiv\mathbb Q\cap[0,1).$$
We now claim that
(6.5) $N^r\cap N^s=\emptyset$ if $r\neq s,$ and
(6.6) $[0,1)=\cup_{r\in R}N^r.$
Indeed, if $x\in N^r\cap N^s\neq\emptyset$ then $x=r+n\ \mathrm{mod}\ 1$ and $x=s+n'\ \mathrm{mod}\ 1,$ so $n-n'\in\mathbb Q,$ i.e. $Q_n=Q_{n'}.$ That is to say, $n=f(Q_n)=f(Q_{n'})=n'$ and hence that $s=r\ \mathrm{mod}\ 1,$ but $s,r\in[0,1)$ implies that $s=r.$ Furthermore, if $x\in[0,1)$ and $n:=f(Q_x),$ then $x-n=r\in\mathbb Q$ and $x\in N^{r\ \mathrm{mod}\ 1}.$

$^{12}$We have used the Axiom of choice here, i.e. $\prod_{A\in\mathcal O}(A\cap[0,1/3))\neq\emptyset.$
Now that we have constructed $N,$ we are ready for the contradiction. By Equations (6.4)–(6.6) we find
$$1=\mu([0,1))=\sum_{r\in R}\mu(N^r)=\sum_{r\in R}\mu(N)
=\begin{cases}\infty&\text{if }\mu(N)>0\\0&\text{if }\mu(N)=0,\end{cases}$$
which is certainly inconsistent. Incidentally we have just produced an example of a so-called "non measurable" set.
Because of this example and our desire to have a measure on $\mathbb R$ satisfying the properties in Question 1, we need to modify our definition of a measure. We will give up on trying to measure all subsets $A\subset\mathbb R,$ i.e. we will only try to define $\mu$ on a smaller collection of "measurable" sets. Such collections will be called σ-algebras which we now introduce. The formal definition of a measure appears in Definition 7.1 of Section 7 below.
6.3. Algebras and σ-algebras.
Definition 6.5. A collection of subsets $\mathcal A$ of $X$ is an Algebra if
(1) $\emptyset,X\in\mathcal A$
(2) $A\in\mathcal A$ implies that $A^c\in\mathcal A$
(3) $\mathcal A$ is closed under finite unions, i.e. if $A_1,\dots,A_n\in\mathcal A$ then $A_1\cup\dots\cup A_n\in\mathcal A.$
In view of conditions 1. and 2., 3. is equivalent to
3′. $\mathcal A$ is closed under finite intersections.
Definition 6.6. A collection of subsets $\mathcal M$ of $X$ is a σ-algebra (σ-field) if $\mathcal M$ is an algebra which is also closed under countable unions, i.e. if $\{A_i\}_{i=1}^\infty\subset\mathcal M,$ then $\cup_{i=1}^\infty A_i\in\mathcal M.$ (Notice that since $\mathcal M$ is also closed under taking complements, $\mathcal M$ is also closed under taking countable intersections.) A pair $(X,\mathcal M),$ where $X$ is a set and $\mathcal M$ is a σ-algebra on $X,$ is called a measurable space.
The reader should compare these definitions with that of a topology, see Definition 3.14. Recall that the elements of a topology are called open sets. Analogously, we will often refer to elements of an algebra $\mathcal A$ or a σ-algebra $\mathcal M$ as measurable sets.
Example 6.7. Here are some examples.
(1) $\tau=\mathcal M=\mathcal P(X)$ in which case all subsets of $X$ are open, closed, and measurable.
(2) Let $X=\{1,2,3\},$ then $\tau=\{\emptyset,X,\{2,3\}\}$ is a topology on $X$ which is not an algebra.
(3) $\tau=\mathcal A=\{\{1\},\{2,3\},\emptyset,X\}$ is a topology, an algebra, and a σ-algebra on $X.$ The sets $X,\{1\},\{2,3\},\emptyset$ are open and closed. The sets $\{1,2\}$ and $\{1,3\}$ are neither open nor closed and are not measurable.
Proposition 6.8. Let $\mathcal E$ be any collection of subsets of $X.$ Then there exists a unique smallest topology $\tau(\mathcal E),$ algebra $\mathcal A(\mathcal E)$ and σ-algebra $\sigma(\mathcal E)$ which contains $\mathcal E.$
Proof. Note $\mathcal P(X)$ is a topology and an algebra and a σ-algebra and $\mathcal E\subset\mathcal P(X),$ so $\mathcal E$ is always a subset of a topology, algebra, and σ-algebra. One may now easily check that
$$\tau(\mathcal E)\equiv\bigcap\{\tau:\tau\text{ is a topology and }\mathcal E\subset\tau\}$$
is a topology which is clearly the smallest topology containing $\mathcal E.$ The analogous construction works for the other cases as well.
We may give explicit descriptions of $\tau(\mathcal E)$ and $\mathcal A(\mathcal E).$ However $\sigma(\mathcal E)$ typically does not admit a simple concrete description.
Proposition 6.9. Let $X$ be a set and $\mathcal E\subset\mathcal P(X).$ For simplicity of notation, assume that $X,\emptyset\in\mathcal E$ (otherwise adjoin them to $\mathcal E$ if necessary) and let $\mathcal E^c\equiv\{A^c:A\in\mathcal E\}$ and $\mathcal E_c=\mathcal E\cup\{X,\emptyset\}\cup\mathcal E^c.$ Then $\tau(\mathcal E)=\tau$ and $\mathcal A(\mathcal E)=\mathcal A$ where
(6.7) $\tau:=\{$arbitrary unions of finite intersections of elements from $\mathcal E\}$
and
(6.8) $\mathcal A:=\{$finite unions of finite intersections of elements from $\mathcal E_c\}.$
Proof. From the definition of a topology and an algebra, it is clear that $\mathcal E\subset\tau\subset\tau(\mathcal E)$ and $\mathcal E\subset\mathcal A\subset\mathcal A(\mathcal E).$ Hence to finish the proof it suffices to show $\tau$ is a topology and $\mathcal A$ is an algebra. The proofs of these assertions are routine except for possibly showing that $\tau$ is closed under taking finite intersections and $\mathcal A$ is closed under complementation.
To check $\mathcal A$ is closed under complementation, let $Z\in\mathcal A$ be expressed as
$$Z=\bigcup_{i=1}^N\bigcap_{j=1}^KA_{ij}$$
where $A_{ij}\in\mathcal E_c.$ Therefore, writing $B_{ij}=A_{ij}^c\in\mathcal E_c,$ we find that
$$Z^c=\bigcap_{i=1}^N\bigcup_{j=1}^KB_{ij}=\bigcup_{j_1,\dots,j_N=1}^K\big(B_{1j_1}\cap B_{2j_2}\cap\dots\cap B_{Nj_N}\big)\in\mathcal A$$
wherein we have used the fact that $B_{1j_1}\cap B_{2j_2}\cap\dots\cap B_{Nj_N}$ is a finite intersection of sets from $\mathcal E_c.$
To show $\tau$ is closed under finite intersections it suffices to show for $V,W\in\tau$ that $V\cap W\in\tau.$ Write
$$V=\bigcup_{\alpha\in A}V_\alpha\quad\text{and}\quad W=\bigcup_{\beta\in B}W_\beta$$
where $V_\alpha$ and $W_\beta$ are sets which are finite intersections of elements from $\mathcal E.$ Then
$$V\cap W=\Big(\bigcup_{\alpha\in A}V_\alpha\Big)\cap\Big(\bigcup_{\beta\in B}W_\beta\Big)=\bigcup_{(\alpha,\beta)\in A\times B}V_\alpha\cap W_\beta$$
since for each $(\alpha,\beta)\in A\times B,$ $V_\alpha\cap W_\beta$ is still a finite intersection of elements from $\mathcal E.$
Remark 6.10. One might think that in general $\sigma(\mathcal E)$ may be described as the countable unions of countable intersections of sets in $\mathcal E_c.$ However this is false, since if
$$Z=\bigcup_{i=1}^\infty\bigcap_{j=1}^\infty A_{ij}$$
with $A_{ij}\in\mathcal E_c,$ then
$$Z^c=\bigcup_{j_1=1,j_2=1,\dots,j_N=1,\dots}^\infty\Big(\bigcap_{\ell=1}^\infty A^c_{\ell,j_\ell}\Big)$$
which is now an uncountable union. Thus the above description is not correct. In general it is complicated to explicitly describe $\sigma(\mathcal E),$ see Proposition 1.23 on page 39 of Folland for details.
Exercise 6.1. Let $\tau$ be a topology on a set $X$ and $\mathcal A=\mathcal A(\tau)$ be the algebra generated by $\tau.$ Show $\mathcal A$ is the collection of subsets of $X$ which may be written as finite unions of sets of the form $F\cap V$ where $F$ is closed and $V$ is open.
The following notion will be useful in the sequel.
Definition 6.11. A set $\mathcal E\subset\mathcal P(X)$ is said to be an elementary family or elementary class provided that
• $\emptyset\in\mathcal E$
• $\mathcal E$ is closed under finite intersections
• if $E\in\mathcal E,$ then $E^c$ is a finite disjoint union of sets from $\mathcal E.$ (In particular $X=\emptyset^c$ is a disjoint union of elements from $\mathcal E.$)
Proposition 6.12. Suppose $\mathcal E\subset\mathcal P(X)$ is an elementary family, then $\mathcal A=\mathcal A(\mathcal E)$ consists of sets which may be written as finite disjoint unions of sets from $\mathcal E.$
Proof. This could be proved making use of Proposition 6.9. However it is easier to give a direct proof.
Let $\mathcal A$ denote the collection of sets which may be written as finite disjoint unions of sets from $\mathcal E.$ Clearly $\mathcal E\subset\mathcal A\subset\mathcal A(\mathcal E)$ so it suffices to show $\mathcal A$ is an algebra since $\mathcal A(\mathcal E)$ is the smallest algebra containing $\mathcal E.$
By the properties of $\mathcal E,$ we know that $\emptyset,X\in\mathcal A.$ Now suppose that $A_i=\coprod_{F\in\Lambda_i}F\in\mathcal A$ where, for $i=1,2,\dots,n,$ $\Lambda_i$ is a finite collection of disjoint sets from $\mathcal E.$ Then
$$\bigcap_{i=1}^nA_i=\bigcap_{i=1}^n\Big(\coprod_{F\in\Lambda_i}F\Big)=\bigcup_{(F_1,F_2,\dots,F_n)\in\Lambda_1\times\dots\times\Lambda_n}(F_1\cap F_2\cap\dots\cap F_n)$$
and this is a disjoint (you check) union of elements from $\mathcal E.$ Therefore $\mathcal A$ is closed under finite intersections. Similarly, if $A=\coprod_{F\in\Lambda}F$ with $\Lambda$ being a finite collection of disjoint sets from $\mathcal E,$ then $A^c=\bigcap_{F\in\Lambda}F^c.$ Since by assumption $F^c\in\mathcal A$ for $F\in\mathcal E$ and $\mathcal A$ is closed under finite intersections, it follows that $A^c\in\mathcal A.$
Exercise 6.2. Let $\mathcal A\subset\mathcal P(X)$ and $\mathcal B\subset\mathcal P(Y)$ be elementary families. Show the collection
$$\mathcal E=\mathcal A\times\mathcal B=\{A\times B:A\in\mathcal A\text{ and }B\in\mathcal B\}$$
is also an elementary family.
The analogous notion of elementary class $\mathcal E$ for topologies is a basis $\mathcal V$ defined below.
Definition 6.13. Let $(X,\tau)$ be a topological space. We say that $\mathcal S\subset\tau$ is a sub-basis for the topology $\tau$ iff $\tau=\tau(\mathcal S)$ and $X=\cup\mathcal S:=\cup_{V\in\mathcal S}V.$ We say $\mathcal V\subset\tau$ is a basis for the topology $\tau$ iff $\mathcal V$ is a sub-basis with the property that every element $V\in\tau$ may be written as
$$V=\cup\{B\in\mathcal V:B\subset V\}.$$
Figure 13. Fitting balls in the intersection.
Exercise 6.3. Suppose that $\mathcal S$ is a sub-basis for a topology $\tau$ on a set $X.$ Show $\mathcal V:=\mathcal S_f,$ consisting of finite intersections of elements from $\mathcal S,$ is a basis for $\tau.$ Moreover, $\mathcal S$ is itself a basis for $\tau$ iff
$$V_1\cap V_2=\cup\{S\in\mathcal S:S\subset V_1\cap V_2\}$$
for every pair of sets $V_1,V_2\in\mathcal S.$
Remark 6.14. Let $(X,d)$ be a metric space, then $\mathcal E=\{B_x(\delta):x\in X\text{ and }\delta>0\}$ is a basis for $\tau_d$ — the topology associated to the metric $d.$ This is the content of Exercise 3.3.
Let us check directly that $\mathcal E$ is a basis for a topology. Suppose that $x,y\in X$ and $\varepsilon,\delta>0.$ If $z\in B(x,\delta)\cap B(y,\varepsilon),$ then
(6.9) $B(z,\alpha)\subset B(x,\delta)\cap B(y,\varepsilon)$
where $\alpha=\min\{\delta-d(x,z),\ \varepsilon-d(y,z)\},$ see Figure 13. This is a formal consequence of the triangle inequality. For example let us show that $B(z,\alpha)\subset B(x,\delta).$ By the definition of $\alpha,$ we have that $\alpha\leq\delta-d(x,z)$ or that $d(x,z)\leq\delta-\alpha.$ Hence if $w\in B(z,\alpha),$ then
$$d(x,w)\leq d(x,z)+d(z,w)\leq\delta-\alpha+d(z,w)<\delta-\alpha+\alpha=\delta$$
which shows that $w\in B(x,\delta).$ Similarly we show that $w\in B(y,\varepsilon)$ as well.
Owing to Exercise 6.3, this shows $\mathcal E$ is a basis for a topology. We do not need to use Exercise 6.3 here since in fact Equation (6.9) may be generalized to finite intersections of balls. Namely if $x_i\in X,$ $\delta_i>0$ and $z\in\cap_{i=1}^nB(x_i,\delta_i),$ then
(6.10) $B(z,\alpha)\subset\cap_{i=1}^nB(x_i,\delta_i)$
where now $\alpha:=\min\{\delta_i-d(x_i,z):i=1,2,\dots,n\}.$ By Eq. (6.10) it follows that any finite intersection of open balls may be written as a union of open balls.
Example 6.15. Suppose $X=\{1,2,3\}$ and $\mathcal E=\{\emptyset,X,\{1,2\},\{1,3\}\},$ see Figure 14 below. Then
$$\tau(\mathcal E)=\{\emptyset,X,\{1\},\{1,2\},\{1,3\}\},$$
$$\mathcal A(\mathcal E)=\sigma(\mathcal E)=\mathcal P(X).$$
Figure 14. A collection of subsets.
Definition 6.16. Let $X$ be a set. We say that a family of sets $\mathcal F\subset\mathcal P(X)$ is a partition of $X$ if $X$ is the disjoint union of the sets in $\mathcal F.$
Example 6.17. Let $X$ be a set and $\mathcal E=\{A_1,\dots,A_n\}$ where $A_1,\dots,A_n$ is a partition of $X.$ In this case
$$\mathcal A(\mathcal E)=\sigma(\mathcal E)=\tau(\mathcal E)=\{\cup_{i\in\Lambda}A_i:\Lambda\subset\{1,2,\dots,n\}\}$$
where $\cup_{i\in\Lambda}A_i:=\emptyset$ when $\Lambda=\emptyset.$ Notice that
$$\#\mathcal A(\mathcal E)=\#(\mathcal P(\{1,2,\dots,n\}))=2^n.$$
Proposition 6.18. Suppose that $\mathcal M\subset\mathcal P(X)$ is a σ-algebra and $\mathcal M$ is at most a countable set. Then there exists a unique finite partition $\mathcal F$ of $X$ such that $\mathcal F\subset\mathcal M$ and every element $A\in\mathcal M$ is of the form
(6.11) $A=\cup\{\alpha\in\mathcal F:\alpha\subset A\}.$
In particular $\mathcal M$ is actually a finite set.
Proof. For each $x\in X$ let
$$A_x=\Big(\bigcap_{x\in A\in\mathcal M}A\Big)\in\mathcal M.$$
That is, $A_x$ is the smallest set in $\mathcal M$ which contains $x.$ Suppose that $C=A_x\cap A_y$ is non-empty. If $x\notin C$ then $x\in A_x\setminus C\in\mathcal M$ and hence $A_x\subset A_x\setminus C$ which shows that $A_x\cap C=\emptyset,$ which is a contradiction. Hence $x\in C$ and similarly $y\in C,$ therefore $A_x\subset C=A_x\cap A_y$ and $A_y\subset C=A_x\cap A_y$ which shows that $A_x=A_y.$ Therefore, $\mathcal F=\{A_x:x\in X\}$ is a partition of $X$ (which is necessarily countable) and Eq. (6.11) holds for all $A\in\mathcal M.$ Let $\mathcal F=\{P_n\}_{n=1}^N$ where for the moment we allow $N=\infty.$ If $N=\infty,$ then $\mathcal M$ is in one to one correspondence with $\{0,1\}^{\mathbb N}.$ Indeed to each $a\in\{0,1\}^{\mathbb N},$ let $A_a\in\mathcal M$ be defined by
$$A_a=\cup\{P_n:a_n=1\}.$$
This shows that $\mathcal M$ is uncountable since $\{0,1\}^{\mathbb N}$ is uncountable; think of the base two expansion of numbers in $[0,1]$ for example. Thus any countable σ-algebra is necessarily finite. This finishes the proof modulo the uniqueness assertion which is left as an exercise to the reader.
Example 6.19. Let $X=\mathbb R$ and
$$\mathcal E=\{(a,\infty):a\in\mathbb R\}\cup\{\mathbb R,\emptyset\}=\{(a,\infty)\subset\mathbb R:a\in\bar{\mathbb R}\}\subset\mathcal P(\mathbb R).$$
Notice that $\mathcal E_f=\mathcal E$ and that $\mathcal E$ is closed under unions, which shows that $\tau(\mathcal E)=\mathcal E,$ i.e. $\mathcal E$ is already a topology. Since $(a,\infty)^c=(-\infty,a]$ we find that $\mathcal E_c=\{(a,\infty),(-\infty,a],\ -\infty\leq a<\infty\}\cup\{\mathbb R,\emptyset\}.$ Noting that
$$(a,\infty)\cap(-\infty,b]=(a,b]$$
it is easy to verify that the algebra $\mathcal A(\mathcal E)$ generated by $\mathcal E$ may be described as those sets which are finite disjoint unions of sets from the following list
$$\tilde{\mathcal E}:=\{(a,b]\cap\mathbb R:a,b\in\bar{\mathbb R}\}.$$
(This follows from Proposition 6.12 and the fact that $\tilde{\mathcal E}$ is an elementary family of subsets of $\mathbb R.$) The σ-algebra, $\sigma(\mathcal E),$ generated by $\mathcal E$ is very complicated. Here are some sets in $\sigma(\mathcal E)$ — most of which are not in $\mathcal A(\mathcal E).$
(a) $(a,b)=\bigcup_{n=1}^\infty(a,b-\frac1n]\in\sigma(\mathcal E).$
(b) All of the standard open subsets of $\mathbb R$ are in $\sigma(\mathcal E).$
(c) $\{x\}=\bigcap_n(x-\frac1n,x]\in\sigma(\mathcal E)$
(d) $[a,b]=\{a\}\cup(a,b]\in\sigma(\mathcal E)$
(e) Any countable subset of $\mathbb R$ is in $\sigma(\mathcal E).$
Remark 6.20. In the above example, one may replace $\mathcal E$ by $\mathcal E=\{(a,\infty):a\in\mathbb Q\}\cup\{\mathbb R,\emptyset\},$ in which case $\mathcal A(\mathcal E)$ may be described as those sets which are finite disjoint unions of sets from the following list
$$\{(a,\infty),(-\infty,a],(a,b]:a,b\in\mathbb Q\}\cup\{\emptyset,\mathbb R\}.$$
This shows that $\mathcal A(\mathcal E)$ is a countable set — a fact we will use later on.
Definition 6.21. A topological space, $(X,\tau),$ is second countable if there exists a countable base $\mathcal V$ for $\tau,$ i.e. $\mathcal V\subset\tau$ is a countable set such that for every $W\in\tau,$
$$W=\cup\{V:V\in\mathcal V\text{ such that }V\subset W\}.$$
Exercise 6.4. Suppose $\mathcal E\subset\mathcal P(X)$ is a countable collection of subsets of $X,$ then $\tau=\tau(\mathcal E)$ is a second countable topology on $X.$
Proposition 6.22. Every separable metric space, $(X,\rho),$ is second countable.
Proof. Let $\{x_n\}_{n=1}^\infty$ be a countable dense subset of $X.$ Let $\mathcal V\equiv\{X,\emptyset\}\cup\bigcup_{m,n=1}^\infty\{B_{x_n}(r_m)\}\subset\tau,$ where $\{r_m\}_{m=1}^\infty$ is dense in $(0,\infty).$ Then $\mathcal V$ is a countable base for $\tau.$ To see this let $V\subset X$ be open and $x\in V.$ Choose $\varepsilon>0$ such that $B_x(\varepsilon)\subset V$ and then choose $x_n\in B_x(\varepsilon/3).$ Choose $r_m$ near $\varepsilon/3$ such that $\rho(x,x_n)<r_m<\varepsilon/3$ so that $x\in B_{x_n}(r_m)\subset V.$ This shows
$$V=\bigcup\{B_{x_n}(r_m):B_{x_n}(r_m)\subset V\}.$$
Notation 6.23. For a general topological space $(X,\tau),$ the Borel σ-algebra is the σ-algebra $\mathcal B_X=\sigma(\tau).$ We will use $\mathcal B_{\mathbb R}$ to denote the Borel σ-algebra on $\mathbb R.$
Proposition 6.24. If $\tau$ is a second countable topology on $X$ and $\mathcal E\subset\mathcal P(X)$ is a countable set such that $\tau=\tau(\mathcal E),$ then $\mathcal B_X:=\sigma(\tau)=\sigma(\mathcal E),$ i.e. $\sigma(\tau(\mathcal E))=\sigma(\mathcal E).$
Proof. Let $\mathcal E_f$ denote the collection of subsets of $X$ which are finite intersections of elements from $\mathcal E$ along with $X$ and $\emptyset.$ Notice that $\mathcal E_f$ is still countable (you prove). A set $Z$ is in $\tau(\mathcal E)$ iff $Z$ is an arbitrary union of sets from $\mathcal E_f.$ Therefore $Z=\bigcup_{A\in\mathcal F}A$ for some subset $\mathcal F\subset\mathcal E_f$ which is necessarily countable. Since $\mathcal E_f\subset\sigma(\mathcal E)$ and $\sigma(\mathcal E)$ is closed under countable unions it follows that $Z\in\sigma(\mathcal E)$ and hence that $\tau(\mathcal E)\subset\sigma(\mathcal E).$ For the last assertion, since $\mathcal E\subset\tau(\mathcal E)\subset\sigma(\mathcal E)$ it follows that $\sigma(\mathcal E)\subset\sigma(\tau(\mathcal E))\subset\sigma(\mathcal E).$
Exercise 6.5. Verify the following identities
$$\mathcal B_{\mathbb R}=\sigma(\{(a,\infty):a\in\mathbb R\})=\sigma(\{(a,\infty):a\in\mathbb Q\})=\sigma(\{[a,\infty):a\in\mathbb Q\}).$$
6.4. Continuous and Measurable Functions. Our notion of a measurable function will be analogous to that for a continuous function. For motivational purposes, suppose $(X,\mathcal M,\mu)$ is a measure space and $f:X\to\mathbb R_+.$ Roughly speaking, in the next section we are going to define $\int_Xf\,d\mu$ by
$$\int_Xf\,d\mu=\lim_{\mathrm{mesh}\to0}\sum_{0<a_1<a_2<a_3<\dots}a_i\,\mu\big(f^{-1}((a_i,a_{i+1}])\big).$$
For this to make sense we will need to require $f^{-1}((a,b])\in\mathcal M$ for all $a<b.$ Because of Lemma 6.30 below, this last condition is equivalent to the condition
$$f^{-1}(\mathcal B_{\mathbb R})\subset\mathcal M,$$
where we are using the following notation.
Notation 6.25. If $f:X\to Y$ is a function and $\mathcal E\subset\mathcal P(Y)$ let
$$f^{-1}\mathcal E\equiv f^{-1}(\mathcal E)\equiv\{f^{-1}(E)\,|\,E\in\mathcal E\}.$$
If $\mathcal G\subset\mathcal P(X),$ let
$$f_*\mathcal G\equiv\{A\in\mathcal P(Y)\,|\,f^{-1}(A)\in\mathcal G\}.$$
Exercise 6.6. Show $f^{-1}\mathcal E$ and $f_*\mathcal G$ are σ-algebras (topologies) provided $\mathcal E$ and $\mathcal G$ are σ-algebras (topologies).
Definition 6.26. Let $(X,\mathcal M)$ and $(Y,\mathcal F)$ be measurable (topological) spaces. A function $f:X\to Y$ is measurable (continuous) if $f^{-1}(\mathcal F)\subset\mathcal M.$ We will also say that $f$ is $\mathcal M/\mathcal F$-measurable (continuous) or $(\mathcal M,\mathcal F)$-measurable (continuous).
Example 6.27 (Characteristic Functions). Let $(X,\mathcal M)$ be a measurable space and $A\subset X.$ We define the characteristic function $1_A:X\to\mathbb R$ by
$$1_A(x)=\begin{cases}1&\text{if }x\in A\\0&\text{if }x\notin A.\end{cases}$$
If $A\in\mathcal M,$ then $1_A$ is $(\mathcal M,\mathcal P(\mathbb R))$-measurable because $1_A^{-1}(W)$ is either $\emptyset,$ $X,$ $A$ or $A^c$ for any $W\subset\mathbb R.$ Conversely, if $\mathcal F$ is any σ-algebra on $\mathbb R$ containing a set $W\subset\mathbb R$ such that $1\in W$ and $0\in W^c,$ then $A\in\mathcal M$ if $1_A$ is $(\mathcal M,\mathcal F)$-measurable. This is because $A=1_A^{-1}(W)\in\mathcal M.$
Remark 6.28. Let $f:X\to Y$ be a function. Given a σ-algebra (topology) $\mathcal F\subset\mathcal P(Y),$ the σ-algebra (topology) $\mathcal M:=f^{-1}(\mathcal F)$ is the smallest σ-algebra (topology) on $X$ such that $f$ is $(\mathcal M,\mathcal F)$-measurable (continuous). Similarly, if $\mathcal M$ is a σ-algebra (topology) on $X$ then $\mathcal F=f_*\mathcal M$ is the largest σ-algebra (topology) on $Y$ such that $f$ is $(\mathcal M,\mathcal F)$-measurable (continuous).
Lemma 6.29. Suppose that $(X,\mathcal M),$ $(Y,\mathcal F)$ and $(Z,\mathcal G)$ are measurable (topological) spaces. If $f:(X,\mathcal M)\to(Y,\mathcal F)$ and $g:(Y,\mathcal F)\to(Z,\mathcal G)$ are measurable (continuous) functions then $g\circ f:(X,\mathcal M)\to(Z,\mathcal G)$ is measurable (continuous) as well.
Proof. This is easy since by assumption $g^{-1}(\mathcal G)\subset\mathcal F$ and $f^{-1}(\mathcal F)\subset\mathcal M$ so that
$$(g\circ f)^{-1}(\mathcal G)=f^{-1}\big(g^{-1}(\mathcal G)\big)\subset f^{-1}(\mathcal F)\subset\mathcal M.$$
Lemma 6.30. Suppose that $f:X\to Y$ is a function and $\mathcal E\subset\mathcal P(Y),$ then
(6.12) $\sigma\big(f^{-1}(\mathcal E)\big)=f^{-1}\big(\sigma(\mathcal E)\big)$ and
(6.13) $\tau\big(f^{-1}(\mathcal E)\big)=f^{-1}\big(\tau(\mathcal E)\big).$
Moreover, if $\mathcal F=\sigma(\mathcal E)$ (or $\mathcal F=\tau(\mathcal E)$) and $\mathcal M$ is a σ-algebra (topology) on $X,$ then $f$ is $(\mathcal M,\mathcal F)$-measurable (continuous) iff $f^{-1}(\mathcal E)\subset\mathcal M.$
Proof. We will prove Eq. (6.12); the proof of Eq. (6.13) is analogous. Since $\mathcal E\subset\sigma(\mathcal E),$ $f^{-1}(\mathcal E)\subset f^{-1}(\sigma(\mathcal E))$ and therefore, (because $f^{-1}(\sigma(\mathcal E))$ is a σ-algebra)
$$\mathcal G:=\sigma\big(f^{-1}(\mathcal E)\big)\subset f^{-1}\big(\sigma(\mathcal E)\big)$$
which proves half of Eq. (6.12). For the reverse inclusion notice that
$$f_*\mathcal G=\{B\subset Y:f^{-1}(B)\in\mathcal G\}$$
is a σ-algebra which contains $\mathcal E$ and thus $\sigma(\mathcal E)\subset f_*\mathcal G.$ Hence if $B\in\sigma(\mathcal E)$ we know that $f^{-1}(B)\in\mathcal G,$ i.e. $f^{-1}(\sigma(\mathcal E))\subset\mathcal G.$ The last assertion of the Lemma is an easy consequence of Eqs. (6.12) and (6.13). For example, if $f^{-1}\mathcal E\subset\mathcal M,$ then $f^{-1}\sigma(\mathcal E)=\sigma\big(f^{-1}\mathcal E\big)\subset\mathcal M$ which shows $f$ is $(\mathcal M,\mathcal F)$-measurable.
Definition 6.31. A function $f:X\to Y$ between two topological spaces is Borel measurable if $f^{-1}(\mathcal B_Y)\subset\mathcal B_X.$
Proposition 6.32. Let $X$ and $Y$ be two topological spaces and $f:X\to Y$ be a continuous function. Then $f$ is Borel measurable.
Proof. Using Lemma 6.30 and $\mathcal B_Y=\sigma(\tau_Y),$
$$f^{-1}(\mathcal B_Y)=f^{-1}(\sigma(\tau_Y))=\sigma\big(f^{-1}(\tau_Y)\big)\subset\sigma(\tau_X)=\mathcal B_X.$$
Corollary 6.33. Suppose that $(X,\mathcal M)$ is a measurable space. Then $f:X\to\mathbb R$ is $(\mathcal M,\mathcal B_{\mathbb R})$-measurable iff $f^{-1}((a,\infty))\in\mathcal M$ for all $a\in\mathbb R$ iff $f^{-1}((a,\infty))\in\mathcal M$ for all $a\in\mathbb Q$ iff $f^{-1}((-\infty,a])\in\mathcal M$ for all $a\in\mathbb R,$ etc. Similarly, if $(X,\mathcal M)$ is a topological space, then $f:X\to\mathbb R$ is $(\mathcal M,\tau_{\mathbb R})$-continuous iff $f^{-1}((a,b))\in\mathcal M$ for all $-\infty<a<b<\infty$ iff $f^{-1}((a,\infty))\in\mathcal M$ and $f^{-1}((-\infty,b))\in\mathcal M$ for all $a,b\in\mathbb Q.$ (We are using $\tau_{\mathbb R}$ to denote the standard topology on $\mathbb R$ induced by the metric $d(x,y)=|x-y|.$)
Proof. This is an exercise (Exercise 6.7) in using Lemma 6.30.
We will often deal with functions $f:X\to\bar{\mathbb R}=\mathbb R\cup\{\pm\infty\}.$ Let
(6.14) $\mathcal B_{\bar{\mathbb R}}:=\sigma(\{[a,\infty]:a\in\mathbb R\}).$
The following Corollary of Lemma 6.30 is a direct analogue of Corollary 6.33.
Corollary 6.34. $f:X\to\bar{\mathbb R}$ is $(\mathcal M,\mathcal B_{\bar{\mathbb R}})$-measurable iff $f^{-1}((a,\infty])\in\mathcal M$ for all $a\in\mathbb R$ iff $f^{-1}((-\infty,a])\in\mathcal M$ for all $a\in\mathbb R,$ etc.
Proposition 6.35. Let $\mathcal B_{\bar{\mathbb R}}$ and $\mathcal B_{\mathbb R}$ be as above, then
(6.15) $\mathcal B_{\bar{\mathbb R}}=\{A\subset\bar{\mathbb R}:A\cap\mathbb R\in\mathcal B_{\mathbb R}\}.$
In particular $\{\infty\},\{-\infty\}\in\mathcal B_{\bar{\mathbb R}}$ and $\mathcal B_{\mathbb R}\subset\mathcal B_{\bar{\mathbb R}}.$
Proof. Let us first observe that
$$\{-\infty\}=\bigcap_{n=1}^\infty[-\infty,-n)=\bigcap_{n=1}^\infty[-n,\infty]^c\in\mathcal B_{\bar{\mathbb R}},\qquad
\{\infty\}=\bigcap_{n=1}^\infty[n,\infty]\in\mathcal B_{\bar{\mathbb R}}\quad\text{and}\quad\mathbb R=\bar{\mathbb R}\setminus\{\pm\infty\}\in\mathcal B_{\bar{\mathbb R}}.$$
Letting $i:\mathbb R\to\bar{\mathbb R}$ be the inclusion map,
$$i^{-1}(\mathcal B_{\bar{\mathbb R}})=\sigma\big(i^{-1}(\{[a,\infty]:a\in\mathbb R\})\big)=\sigma\big(\{i^{-1}([a,\infty]):a\in\mathbb R\}\big)
=\sigma\big(\{[a,\infty]\cap\mathbb R:a\in\mathbb R\}\big)=\sigma\big(\{[a,\infty):a\in\mathbb R\}\big)=\mathcal B_{\mathbb R}.$$
Thus we have shown
$$\mathcal B_{\mathbb R}=i^{-1}(\mathcal B_{\bar{\mathbb R}})=\{A\cap\mathbb R:A\in\mathcal B_{\bar{\mathbb R}}\}.$$
This implies:
(1) $A\in\mathcal B_{\bar{\mathbb R}}\Longrightarrow A\cap\mathbb R\in\mathcal B_{\mathbb R}$ and
(2) if $A\subset\bar{\mathbb R}$ is such that $A\cap\mathbb R\in\mathcal B_{\mathbb R}$ there exists $B\in\mathcal B_{\bar{\mathbb R}}$ such that $A\cap\mathbb R=B\cap\mathbb R.$ Because $A\triangle B\subset\{\pm\infty\}$ and $\{\infty\},\{-\infty\}\in\mathcal B_{\bar{\mathbb R}},$ we may conclude that $A\in\mathcal B_{\bar{\mathbb R}}$ as well.
This proves Eq. (6.15).
Proposition 6.36 (Closure under sups, infs and limits). Suppose that $(X,\mathcal M)$ is a measurable space and $f_j:(X,\mathcal M)\to\mathbb R$ is a sequence of $\mathcal M/\mathcal B_{\mathbb R}$-measurable functions. Then
$$\sup_jf_j,\quad\inf_jf_j,\quad\limsup_{j\to\infty}f_j\quad\text{and}\quad\liminf_{j\to\infty}f_j$$
are all $\mathcal M/\mathcal B_{\bar{\mathbb R}}$-measurable functions. (Note that this result is in general false when $(X,\mathcal M)$ is a topological space and measurable is replaced by continuous in the statement.)
Proof. Define $g_+(x):=\sup_jf_j(x),$ then
$$\{x:g_+(x)\leq a\}=\{x:f_j(x)\leq a\ \forall j\}=\bigcap_j\{x:f_j(x)\leq a\}\in\mathcal M$$
so that $g_+$ is measurable. Similarly if $g_-(x)=\inf_jf_j(x)$ then
$$\{x:g_-(x)\geq a\}=\bigcap_j\{x:f_j(x)\geq a\}\in\mathcal M.$$
Since
$$\limsup_{j\to\infty}f_j=\inf_n\sup\{f_j:j\geq n\}\quad\text{and}\quad\liminf_{j\to\infty}f_j=\sup_n\inf\{f_j:j\geq n\}$$
we are done by what we have already proved.
6.4.1. More general pointwise limits.
Lemma 6.37. Suppose that $(X,\mathcal M)$ is a measurable space, $(Y,d)$ is a metric space and $f_j:X\to Y$ is $(\mathcal M,\mathcal B_Y)$-measurable for all $j.$ Also assume that for each $x\in X,$ $f(x)=\lim_{n\to\infty}f_n(x)$ exists. Then $f:X\to Y$ is also $(\mathcal M,\mathcal B_Y)$-measurable.
Proof. Let $V\in\tau_d$ and $W_m:=\{y\in Y:d_{V^c}(y)>1/m\}$ for $m=1,2,\dots.$ Then $W_m\in\tau_d,$
$$W_m\subset\bar W_m\subset\{y\in Y:d_{V^c}(y)\geq1/m\}\subset V$$
for all $m$ and $W_m\uparrow V$ as $m\to\infty.$ The proof will be completed by verifying the identity,
$$f^{-1}(V)=\bigcup_{m=1}^\infty\bigcup_{N=1}^\infty\bigcap_{n\geq N}f_n^{-1}(W_m)\in\mathcal M.$$
If $x\in f^{-1}(V)$ then $f(x)\in V$ and hence $f(x)\in W_m$ for some $m.$ Since $f_n(x)\to f(x),$ $f_n(x)\in W_m$ for almost all $n.$ That is $x\in\bigcup_{m=1}^\infty\bigcup_{N=1}^\infty\bigcap_{n\geq N}f_n^{-1}(W_m).$ Conversely when $x\in\bigcup_{m=1}^\infty\bigcup_{N=1}^\infty\bigcap_{n\geq N}f_n^{-1}(W_m)$ there exists an $m$ such that $f_n(x)\in W_m\subset\bar W_m$ for almost all $n.$ Since $f_n(x)\to f(x)\in\bar W_m\subset V,$ it follows that $x\in f^{-1}(V).$
Remark 6.38. In the previous Lemma 6.37 it is possible to let $(Y,\tau)$ be any topological space which has the "regularity" property that if $V\in\tau$ there exists $W_m\in\tau$ such that $W_m\subset\bar W_m\subset V$ and $V=\bigcup_{m=1}^\infty W_m.$ Moreover, some extra condition is necessary on the topology $\tau$ in order for Lemma 6.37 to be correct. For example if $Y=\{1,2,3\}$ and $\tau=\{Y,\emptyset,\{1,2\},\{2,3\},\{2\}\}$ as in Example 3.28 and $X=\{a,b\}$ with the trivial σ-algebra, let $f_j(a)=f_j(b)=2$ for all $j;$ then $f_j$ is constant and hence measurable. Let $f(a)=1$ and $f(b)=2,$ then $f_j\to f$ as $j\to\infty$ with $f$ being non-measurable. Notice that the Borel σ-algebra on $Y$ is $\mathcal P(Y).$
6.5. Topologies and σ-Algebras Generated by Functions.
Definition 6.39. Let $\mathcal E\subset\mathcal P(X)$ be a collection of sets, $A\subset X,$ $i_A:A\to X$ be the inclusion map ($i_A(x)=x$ for all $x\in A$), and
$$\mathcal E_A=i_A^{-1}(\mathcal E)=\{A\cap E:E\in\mathcal E\}.$$
When $\mathcal E=\tau$ is a topology or $\mathcal E=\mathcal M$ is a σ-algebra we call $\tau_A$ the relative topology and $\mathcal M_A$ the relative σ-algebra on $A.$
Proposition 6.40. Suppose that $A\subset X,$ $\mathcal M\subset\mathcal P(X)$ is a σ-algebra and $\tau\subset\mathcal P(X)$ is a topology, then $\mathcal M_A\subset\mathcal P(A)$ is a σ-algebra and $\tau_A\subset\mathcal P(A)$ is a topology. Moreover if $\mathcal E\subset\mathcal P(X)$ is such that $\mathcal M=\sigma(\mathcal E)$ ($\tau=\tau(\mathcal E)$) then $\mathcal M_A=\sigma(\mathcal E_A)$ ($\tau_A=\tau(\mathcal E_A)$).
Proof. The first assertion is Exercise 6.6 and the second assertion is a consequence of Lemma 6.30. Indeed,
$$\mathcal M_A=i_A^{-1}(\mathcal M)=i_A^{-1}(\sigma(\mathcal E))=\sigma\big(i_A^{-1}(\mathcal E)\big)=\sigma(\mathcal E_A)$$
and similarly
$$\tau_A=i_A^{-1}(\tau)=i_A^{-1}(\tau(\mathcal E))=\tau\big(i_A^{-1}(\mathcal E)\big)=\tau(\mathcal E_A).$$
Example 6.41. Suppose that $(X,d)$ is a metric space and $A\subset X$ is a set. Let $\tau=\tau_d$ and $d_A:=d|_{A\times A}$ be the metric $d$ restricted to $A.$ Then $\tau_A=\tau_{d_A},$ i.e. the relative topology, $\tau_A,$ of $\tau_d$ on $A$ is the same as the topology induced by the restriction of the metric $d$ to $A.$ Indeed, if $V\in\tau_A$ there exists $W\in\tau$ such that $V=W\cap A.$ Therefore for all $x\in V$ there exists $\varepsilon>0$ such that $B_x(\varepsilon)\subset W$ and hence $B_x(\varepsilon)\cap A\subset V.$ Since $B_x(\varepsilon)\cap A=B^{d_A}_x(\varepsilon)$ is a $d_A$-ball in $A,$ this shows $V$ is $d_A$-open, i.e. $\tau_A\subset\tau_{d_A}.$ Conversely, if $V\in\tau_{d_A},$ then for each $x\in V$ there exists $\varepsilon_x>0$ such that $B^{d_A}_x(\varepsilon_x)=B_x(\varepsilon_x)\cap A\subset V.$ Therefore $V=A\cap W$ with $W:=\bigcup_{x\in V}B_x(\varepsilon_x).$ This shows $\tau_{d_A}\subset\tau_A.$
Definition 6.42. Let $A\subset X,$ $f:A\to\mathbb C$ be a function, $\mathcal M\subset\mathcal P(X)$ be a σ-algebra and $\tau\subset\mathcal P(X)$ be a topology; then we say that $f|_A$ is measurable (continuous) if $f|_A$ is $\mathcal M_A$-measurable ($\tau_A$-continuous).
Proposition 6.43. Let A X, f : X C be a function, M P(X) be a
algebra and P(X) be a topology. If f is M measurable ( continuous) then
f|
A
is M
A
measurable (
A
continuous). Moreover if A
n
M (A
n
) such that
X =

n=1
A
n
and f|A
n
is M
A
n
measurable (
A
n
continuous) for all n, then f is
M measurable ( continuous).
Proof. Notice that i
A
is (M
A
, M) measurable (
A
, ) continuous) hence
f|
A
= f i
A
is M
A
measurable (
A
continuous). Let B C be a Borel set and
consider
f
1
(B) =

n=1

f
1
(B) A
n

n=1
f|
1
A
n
(B).
If A M (A ), then it is easy to check that
M
A
= {B M: B A} M and

A
= {B : B A} .
The second assertion is now an easy consequence of the previous three equations.
Denition 6.44. Let X and A be sets, and suppose for A we are give a
measurable (topological) space (Y

, F

) and a function f

: X Y

. We will write
(f

: A) ((f

: A)) for the smallest -algebra (topology) on X such that


each f

is measurable (continuous), i.e.


(f

: A) = (

f
1

(F

)) and
(f

: A) = (

f
1

(F

)).
Proposition 6.45. Assuming the notation in Denition 6.44 and additionally let
(Z, M) be a measurable (topological) space and g : Z X be a function. Then g
is (M, (f

: A)) measurable ((M, (f

: A)) continuous) i f

g is
(M, F

)measurable (continuous) for all A.


Proof. () If g is (M, (f

: A)) measurable, then the composition f

g
is (M, F

) measurable by Lemma 6.29.


() Let
G = (f

: A) =

A
f
1

(F

.
If f

g is (M, F

) measurable for all , then


g
1
f
1

(F

) M A
and therefore
g
1

A
f
1

(F

=
A
g
1
f
1

(F

) M.
Hence
g
1
(G) = g
1

A
f
1

(F

= (g
1

A
f
1

(F

M
ANALYSIS TOOLS WITH APPLICATIONS 89
which shows that g is (M, G) measurable.
The topological case is proved in the same way.
6.6. Product Spaces. In this section we consider product topologies and
algebras. We will start with a nite number of factors rst and then later mention
what happens for an innite number of factors.
6.6.1. Products with a Finite Number of Factors. Let {X
i
}
n
i=1
be a collection of sets,
X := X
1
X
2
X
n
and
i
: X X
i
be the projection map (x
1
, x
2
, . . . , x
n
) =
x
i
for each 1 i n. Let us also suppose that
i
is a topology on X
i
and M
i
is a
algebra on X
i
for each i.
Notation 6.46. Let E
i
P(X
i
) be a collection of subsets of X
i
for i = 1, 2, . . . , n
we will write, by abuse of notation, E
1
E
2
E
n
for the collection of subsets
of X
1
X
n
of the form A
1
A
2
A
n
with A
i
E
i
for all i. That is we
are identifying (A
1
, A
2
, . . . , A
n
) with A
1
A
2
A
n
.
Denition 6.47. The product topology on X, denoted by
1

2

n
is
the smallest topology on X so that each map
i
: X X
i
is continuous. Similarly,
the product algebra on X, denoted by M
1
M
2
M
n
, is the smallest
algebra on X so that each map
i
: X X
i
is measurable.
Remark 6.48. The product topology may also be described as the smallest topology
containing sets from
1

n
, i.e.

2

n
= (
1

n
).
Indeed,

2

n
= (
1
,
2
, . . . ,
n
)
= (

n
i=1

1
i
(V
i
) : V
i

i
for i = 1, 2, . . . , n

)
= ({V
1
V
2
V
n
: V
i

i
for i = 1, 2, . . . , n}).
Similarly,
M
1
M
2
M
n
= (M
1
M
2
M
n
).
Furthermore if B
i

i
is a basis for the topology
i
for each i, then B
1
B
n
is
a basis for
1

2

n
. Indeed,
1

n
is closed under nite intersections
and generates
1

2

n
, therefore
1

n
is a basis for the product
topology. Hence for W
1

2

n
and x = (x
1
, . . . , x
n
) W, there exists
V
1
V
2
V
n

1

n
such that
x V
1
V
2
V
n
W.
Since B
i
is a basis for
i
, we may now choose U
i
B
i
such that x
i
U
i
V
i
for
each i. Thus
x U
1
U
2
U
n
W
and we have shown W may be written as a union of sets from B
1
B
n
. Since
B
1
B
n

1

n

1

2

n
,
this shows B
1
B
n
is a basis for
1

2

n
.
90 BRUCE K. DRIVER

Lemma 6.49. Let (X


i
, d
i
) for i = 1, . . . , n be metric spaces, X := X
1
X
n
and for x = (x
1
, x
2
, . . . , x
n
) and y = (y
1
, y
2
, . . . , y
n
) in X let
(6.16) d(x, y) =
n
X
i=1
d
i
(x
i
, y
i
).
Then the topology,
d
, associated to the metric d is the product topology on X, i.e.

d
=
d
1

d
2

d
n
.
Proof. Let (x, y) = max{d
i
(x
i
, y
i
) : i = 1, 2, . . . , n}. Then is equivalent to d
and hence

=
d
. Moreover if > 0 and x = (x
1
, x
2
, . . . , x
n
) X, then
B

x
() = B
d
1
x
1
() B
d
n
x
n
().
By Remark 6.14,
E := {B

x
() : x X and > 0}
is a basis for

and by Remark 6.48 E is also a basis for


d
1

d
2

d
n
.
Therefore,

d
1

d
2

d
n
= (E) =

=
d
.
Remark 6.50. Let (Z, M) be a measurable (topological) space, then by Proposition
6.45, a function f : Z X is measurable (continuous) i
i
f : Z X
i
is
(M, M
i
) measurable ((,
i
) continuous) for i = 1, 2, . . . , n. So if we write
f(z) = (f
1
(z), f
2
(z), . . . , f
n
(z)) X
1
X
2
X
n
,
then f : Z X is measurable (continuous) i f
i
: Z X
i
is measurable (continu-
ous) for all i.
Theorem 6.51. For i = 1, 2, . . . , n, let E
i
P(X
i
) be a collection of subsets of X
i
such that X
i
E
i
and M
i
= (E
i
) (or
i
= (E
i
)) for i = 1, 2, . . . , n, then
M
1
M
2
M
n
= (E
1
E
2
E
n
) and

2

n
= (E
1
E
2
E
n
).
Written out more explicitly, these equations state
((E
1
) (E
2
) (E
n
)) = (E
1
E
2
E
n
) and (6.17)
((E
1
) (E
2
) (E
n
)) = (E
1
E
2
E
n
). (6.18)
Moreover if {(X
i
,
i
)}
n
i=1
is a sequence of second countable topological spaces, =

2

n
is the product topology on X = X
1
X
n
, then
B
X
:= (
1

2

n
) = (B
X
1
B
X
n
) =: B
X
1
B
X
n
.
That is to say the Borel algebra and the product algebra on X are the same.
Proof. We will prove Eq. (6.17). The proof of Eq. (6.18) is completely analo-
gous. Let us rst do the case of two factors. Since
E
1
E
2
(E
1
) (E
2
)
it follows that
(E
1
E
2
) ((E
1
) (E
2
)) = (
1
,
2
).
ANALYSIS TOOLS WITH APPLICATIONS 91
To prove the reverse inequality it suces to show
i
: X
1
X
2
X
i
is (E
1
E
2
)
M
i
= (E
i
) measurable for i = 1, 2. To prove this suppose that E E
1
, then

1
1
(E) = E X
2
E
1
E
2
(E
1
E
2
)
wherein we have used the fact that X
2
E
2
. Similarly, for E E
2
we have

1
2
(E) = X
1
E E
1
E
2
(E
1
E
2
) .
This proves the desired measurability, and hence
(
1
,
2
) (E
1
E
2
) (
1
,
2
).
To prove the last assertion we may assume each E
i
is countable for i = 1, 2. Since
E
1
E
2
is countable, a couple of applications of Proposition 6.24 along with the
rst two assertions of the theorems gives
(
1

2
) = ( (
1

2
)) = ( ((E
1
) (E
2
))) = ( (E
1
E
2
))
= (E
1
E
2
) = ((E
1
) (E
2
)) = (M
1
M
2
) = M
1
M
2
.
The proof for n factors works the same way. Indeed,
E
1
E
2
E
n
(E
1
) (E
2
) (E
n
)
implies
(E
1
E
2
E
n
) ((E
1
) (E
2
) (E
n
)) = (
1
, . . . ,
n
)
and for E E
i
,

1
i
(E) = X
1
X
2
X
i1
E X
i+1
X
n
E
1
E
2
E
n
(E
1
E
2
E
n
) .
This show
i
is (E
1
E
2
E
n
) M
i
= (E
i
) measurable and therefore,
(
1
, . . . ,
n
) (E
1
E
2
E
n
) (
1
, . . . ,
n
).
If the E
i
are countable, then
(
1

2

n
) = ( (
1

2

n
))
= ( ((E
1
) (E
2
) (E
n
)))
= ( (E
1
E
2
E
n
))
= (E
1
E
2
E
n
)
= ((E
1
) (E
2
) (E
n
))
= (M
1
M
2
M
n
)
= M
1
M
2
M
n
.
Remark 6.52. One can not relax the assumption that X
i
E
i
in Theorem 6.51.
For example, if X
1
= X
2
= {1, 2} and E
1
= E
2
= {{1}} , then (E
1
E
2
) =
{, X
1
X
2
, {(1, 1)}} while ((E
1
) (E
2
)) = P(X
1
X
2
).
Proposition 6.53. If (X
i
, d
i
) are separable metric spaces for i = 1, . . . , n, then
B
X
1
B
X
n
= B
(X
1
X
n
)
where B
X
i
is the Borel algebra on X
i
and B
(X
1
X
n
)
is the Borel algebra
on X
1
X
n
equipped with the product topology.
92 BRUCE K. DRIVER

Proof. This follows directly from Proposition 6.22 and Theorem 6.51.
Because all norms on nite dimensional spaces are equivalent, the usual Euclid-
ean norm on R
m
R
n
is equivalent to the product norm dened by
k(x, y)k
R
m
R
n
= kxk
R
m
+kyk
R
n
.
Hence by Lemma 6.49, the Euclidean topology on R
m+n
is the same as the product
topology on R
m+n

= R
m
R
n
Here we are identifying R
m
R
n
with R
m+n
by the
map
(x, y) R
m
R
n
(x
1
, . . . , x
m
, y
1
, . . . , y
n
) R
m+n
.
Proposition 6.53 and these comments leads to the following corollaries.
Corollary 6.54. After identifying R
m
R
n
with R
m+n
as above and letting B
R
n
denote the Borel algebra on R
n
, we have
B
R
m+n = B
R
n B
R
m and B
R
n =
ntimes
z }| {
B
R
B
R
.
Corollary 6.55. If (X, M) is a measurable space, then
f = (f
1
, f
2
, . . . , f
n
) : X R
n
is (M, B
R
n) measurable i f
i
: X R is (M, B
R
) measurable for each i. In
particular, a function f : X C is (M, B
C
) measurable i Re f and Imf are
(M, B
R
) measurable.
Corollary 6.56. Let (X, M) be a measurable space and f, g : X C be (M, B
C
)
measurable functions. Then f g and f g are also (M, B
C
) measurable.
Proof. Dene F : X C C, A

: C C C and M : C C C by
F(x) = (f(x), g(x)), A

(w, z) = w z and M(w, z) = wz. Then A

and M are
continuous and hence (B
C
2, B
C
) measurable. Also F is (M, B
C
B
C
) = (M, B
C
2)
measurable since
1
F = f and
2
F = g are (M, B
C
) measurable. Therefore
A

F = f g and M F = f g, being the composition of measurable functions,


are also measurable.
Lemma 6.57. Let C, (X, M) be a measurable space and f : X C be a
(M, B
C
) measurable function. Then
F(x) :=

1
f(x)
if f(x) 6= 0
if f(x) = 0
is measurable.
Proof. Dene i : C C by
i(z) =

1
z
if z 6= 0
if z = 0.
For any open set V C we have
i
1
(V ) = i
1
(V \ {0}) i
1
(V {0})
Because i is continuous except at z = 0, i
1
(V \ {0}) is an open set and hence
in B
C
. Moreover, i
1
(V {0}) B
C
since i
1
(V {0}) is either the empty set or
the one point set {} . Therefore i
1
(
C
) B
C
and hence i
1
(B
C
) = i
1
((
C
)) =
(i
1
(
C
)) B
C
which shows that i is Borel measurable. Since F = i f is the
composition of measurable functions, F is also measurable.
ANALYSIS TOOLS WITH APPLICATIONS 93
6.6.2. General Product spaces .
Denition 6.58. Suppose(X

, M

)
A
is a collection of measurable spaces and
let X be the product space
X =
Y
A
X

and

: X X

be the canonical projection maps. Then the product algebra,


N

, is dened by
O
A
M

: A) =

(M

)
!
.
Similarly if (X

, M

)
A
is a collection of topological spaces, the product topology
N

, is dened by
O
A
M

: A) =

(M

)
!
.
Remark 6.59. Let (Z, M) be a measurable (topological) space and

X =
Y
A
X

,
O
A
M

!
be as in Denition 6.58. By Proposition 6.45, a function f : Z X is measurable
(continuous) i

f is (M, M

) measurable (continuous) for all A.


Proposition 6.60. Suppose that (X

, M

)
A
is a collection of measurable (topo-
logical) spaces and E

generates M

for each A, then


(6.19)
A
M

(E

(E

Moreover, suppose that A is either nite or countably innite, X

for each
A, and M

= (E

) for each A. Then the product algebra satises


(6.20)
O
A
M

=
(
Y
A
E

: E

for all A
)!
.
Similarly if A is nite and M

= (E

), then the product topology satises


(6.21)
O
A
M

=
(
Y
A
E

: E

for all A
)!
.
Proof. We will prove Eq. (6.19) in the measure theoretic case since a similar
proof works in the topological category. Since
S

(E

(M

), it follows
that
F :=

(E

)
!

(M

)
!
=
O

.
Conversely,
F (
1

(E

)) =
1

((E

)) =
1

(M

)
holds for all implies that
[

(M

) F
94 BRUCE K. DRIVER

and hence that


N

F.
We now prove Eq. (6.20). Since we are assuming that X

for each A,
we see that
[

(E

)
(
Y
A
E

: E

for all A
)
and therefore by Eq. (6.19)
O
A
M

(E

)
!

(
Y
A
E

: E

for all A
)!
.
This last statement is true independent as to whether A is countable or not. For
the reverse inclusion it suces to notice that since A is countable,
Y
A
E

=
A

(E

)
O
A
M

and hence

(
Y
A
E

: E

for all A
)!

O
A
M

.
Here is a generalization of Theorem 6.51 to the case of countable number of factors.
Proposition 6.61. Let {X

}
A
be a sequence of sets where A is at most count-
able. Suppose for each A we are given a countable set E

P(X

). Let

= (E

) be the topology on X

generated by E

and X be the product space


Q
A
X

with equipped with the product topology :=


A
(E

). Then the Borel


algebra B
X
= () is the same as the product algebra:
B
X
=
A
B
X

,
where B
X

= ((E

)) = (E

) for all A.
Proof. By Proposition 6.60, the topology may be described as the smallest
topology containing E =
A

(E

). Now E is the countable union of countable


sets so is still countable. Therefore by Proposition 6.24 and Proposition 6.60 we
have
B
X
= () = ((E)) = (E) =
A
(E

) =
A
(

) =
A
B
X

.
Lemma 6.62. Suppose that (Y, F) is a measurable space and F : X Y is a
map. Then to every ((F), B
R
) measurable function, H from X

R, there is a
(F, B
R
) measurable function h : Y

R such that H = h F.
Proof. First suppose that H = 1
A
where A (F) = F
1
(B
R
). Let J B
R
such that A = F
1
(J) then 1
A
= 1
F
1
(J)
= 1
J
F and hence the Lemma is valid
in this case with h = 1
J
. More generally if H =
P
a
i
1
A
i
is a simple function, then
there exists J
i
B
R
such that 1
A
i
= 1
J
i
F and hence H = hF with h :=
P
a
i
1
J
i
a simple function on

R.
ANALYSIS TOOLS WITH APPLICATIONS 95
For general ((F), B
R
) measurable function, H, from X

R, choose simple
functions H
n
converging to H. Let h
n
be simple functions on

R such that H
n
=
h
n
F. Then it follows that
H = lim
n
H
n
= limsup
n
H
n
= limsup
n
h
n
F = h F
where h := limsup
n
h
n
a measurable function from Y to

R.
The following is an immediate corollary of Proposition 6.45 and Lemma 6.62.
Corollary 6.63. Let X and A be sets, and suppose for A we are give a
measurable space (Y

, F

) and a function f

: X Y

. Let Y :=
Q
A
Y

, F :=

A
F

be the product algebra on Y and M := (f

: A) be the smallest
-algebra on X such that each f

is measurable. Then the function F : X Y


dened by [F(x)]

:= f

(x) for each A is (M, F) measurable and a function


H : X

R is (M, B
R
) measurable i there exists a (F, B
R
) measurable function
h from Y to

R such that H = h F.
6.7. Exercises.
Exercise 6.7. Prove Corollary 6.33. Hint: See Exercise 6.5.
Exercise 6.8. Folland, Problem 1.5 on p.24. If M is the algebra generated by
E P(X), then M is the union of the algebras generated by countable subsets
F E.
Exercise 6.9. Let (X, M) be a measure space and f
n
: X F be a sequence of
measurable functions on X. Show that {x : lim
n
f
n
(x) exists} M.
Exercise 6.10. Show that every monotone function f : R R is (B
R
, B
R
) mea-
surable.
Exercise 6.11. Folland problem 2.6 on p. 48.
Exercise 6.12. Suppose that X is a set, {(Y

) : A} is a family of topological
spaces and f

: X Y

is a given function for all A. Assuming that S

is a sub-basis for the topology

for each A, show S :=


A
f
1

(S

) is a
sub-basis for the topology := (f

: A).
Notation 6.64. Let X be a set and p := {p
n
}

n=0
be a family of semi-metrics on
X, i.e. p
n
: X X [0, ) are functions satisfying the assumptions of metric
except for the assertion that p
n
(x, y) = 0 implies x = y. Further assume that
p
n
(x, y) p
n+1
(x, y) for all n and if p
n
(x, y) = 0 for all n N then x = y. Given
n N and x X let
B
n
(x, ) := {y X : p
n
(x, y) < } .
We will write (p) form the smallest topology on X such that p
n
(x, ) : X [0, )
is continuous for all n N and x X, i.e. (p) := (p
n
(x) : n N and x X).
Exercise 6.13. Using Notation 6.64, show that collection of balls,
B := {B
n
(x, ) : n N, x X and > 0} ,
forms a basis for the topology (p). Hint: Use Exercise 6.12 to show B is a sub-
basis for the topology (p) and then use Exercise 6.3 to show B is in fact a basis
for the topology (p).
96 BRUCE K. DRIVER

Exercise 6.14. Using the notation in 6.64, let


d(x, y) =

X
n=0
2
n
p
n
(x, y)
1 +p
n
(x, y)
.
Show d is a metric on X and
d
= (p). Conclude that a sequence {x
k
}

k=1
X
converges to x X i
lim
k
p
n
(x
k
, x) = 0 for all n N.
Exercise 6.15. Let {(X
n
, d
n
)}

n=1
be a sequence of metric spaces, X :=
Q

n=1
X
n
,
and for x = (x(n))

n=1
and y = (y(n))

n=1
in X let
d(x, y) =

X
n=1
2
n
d
n
(x(n), y(n))
1 +d
n
(x(n), y(n))
.
(See Exercise 3.26.) Moreover, let
i
: X X
i
be the projection maps, show

d
=

n=1

d
i
:= ({
i
: i N}).
That is show the d metric topology is the same as the product topology on X.
ANALYSIS TOOLS WITH APPLICATIONS 97
7. Measures and Integration
Denition 7.1. A measure on a measurable space (X, M) is a function :
M[0, ] such that
(1) () = 0 and
(2) (Finite Additivity) If {A
i
}
n
i=1
M are pairwise disjoint, i.e. A
i
A
j
=
when i 6= j, then
(
n
[
i=1
A
i
) =
n
X
i=1
(A
i
).
(3) (Continuity) If A
n
M and A
n
A, then (A
n
) (A).
We call a triple (X, M, ), where (X, M) is a measurable space and : M
[0, ] is a measure, a measure space.
Remark 7.2. Properties 2) and 3) in Denition 7.1 are equivalent to the following
condition. If {A
i
}

i=1
M are pairwise disjoint then
(7.1) (

[
i=1
A
i
) =

X
i=1
(A
i
).
To prove this suppose that Properties 2) and 3) in Denition 7.1 and {A
i
}

i=1
M
are pairwise disjoint. Let B
n
:=
n
S
i=1
A
i
B :=

S
i=1
A
i
, so that
(B)
(3)
= lim
n
(B
n
)
(2)
= lim
n
n
X
i=1
(A
i
) =

X
i=1
(A
i
).
Conversely, if Eq. (7.1) holds we may take A
j
= for all j n to see that Property
2) of Denition 7.1 holds. Also if A
n
A, let B
n
:= A
n
\ A
n1
. Then {B
n
}

n=1
are
pairwise disjoint, A
n
=
n
j=1
B
j
and A =

j=1
B
j
. So if Eq. (7.1) holds we have
(A) =

j=1
B
j

=

X
j=1
(B
j
)
= lim
n
n
X
j=1
(B
j
) = lim
n
(
n
j=1
B
j
) = lim
n
(A
n
).
Proposition 7.3 (Basic properties of measures). Suppose that (X, M, ) is a mea-
sure space and E, F M and {E
j
}

j=1
M, then :
(1) (E) (F) if E F.
(2) (E
j
)
P
(E
j
).
(3) If (E
1
) < and E
j
E, i.e. E
1
E
2
E
3
. . . and E =
j
E
j
, then
(E
j
) (E) as j .
Proof.
(1) Since F = E (F \ E),
(F) = (E) +(F \ E) (E).
(2) Let
e
E
j
= E
j
\ (E
1
E
j1
) so that the

E
j
s are pair-wise disjoint and
E =
e
E
j
. Since

E
j
E
j
it follows from Remark 7.2 and part (1), that
(E) =
X
(
e
E
j
)
X
(E
j
).
98 BRUCE K. DRIVER

N
F
A
Figure 15. Completing a algebra.
(3) Dene D
i
E
1
\ E
i
then D
i
E
1
\ E which implies that
(E
1
) (E) = lim
i
(D
i
) = (E
1
) lim
i
(E
i
)
which shows that lim
i
(E
i
) = (E).
Denition 7.4. A set E X is a null set if E M and (E) = 0. If P is some
property which is either true or false for each x X, we will use the terminology
P a.e. (to be read P almost everywhere) to mean
E := {x X : P is false for x}
is a null set. For example if f and g are two measurable functions on (X, M, ),
f = g a.e. means that (f 6= g) = 0.
Denition 7.5. A measure space (X, M, ) is complete if every subset of a null
set is in M, i.e. for all F X such that F E M with (E) = 0 implies that
F M.
Proposition 7.6. Let (X, M, ) be a measure space. Set
N {N X : F M 3 N F and (F) = 0}
and

M= {A N : A M, N M},
see Fig. 15. Then

M is a -algebra. Dene (AN) = (A), then is the unique
measure on

M which extends .
Proof. Clearly X,

M.
Let A M and N N and choose F M such that N F and (F) = 0.
Since N
c
= (F \ N) F
c
,
(A N)
c
= A
c
N
c
= A
c
(F \ N F
c
) = [A
c
(F \ N)] [A
c
F
c
]
where [A
c
(F \N)] N and [A
c
F
c
] M. Thus

Mis closed under complements.
If A
i
M and N
i
F
i
M such that (F
i
) = 0 then (A
i
N
i
) = (A
i
)
(N
i
)

M since A
i
M and N
i
F
i
and (F
i
)
P
(F
i
) = 0. Therefore,

M is a -algebra.
ANALYSIS TOOLS WITH APPLICATIONS 99
Suppose AN
1
= BN
2
with A, B M and N
1
, N
2
, N. Then A AN
1

A N
1
F
2
= B F
2
which shows that
(A) (B) +(F
2
) = (B).
Similarly, we show that (B) (A) so that (A) = (B) and hence (AN) :=
(A) is well dened. It is left as an exercise to show is a measure, i.e. that it is
countable additive.
Many theorems in the sequel will require some control on the size of a measure
. The relevant notion for our purposes (and most purposes) is that of a nite
measure dened next.
Denition 7.7. Suppose X is a set, E M P(X) and : M [0, ] is
a function. The function is nite on E if there exists E
n
E such that
(E
n
) < and X =
n=1
E
n
. If M is a algebra and is a measure on M
which is nite on M we will say (X, M, ) is a -nite measure space.
The reader should check that if is a nitely additive measure on an algebra,
M, then is nite on M i there exists X
n
M such that X
n
X and
(X
n
) < .
7.1. Example of Measures. Most algebras and -additive measures are
somewhat dicult to describe and dene. However, one special case is fairly easy
to understand. Namely suppose that F P(X) is a countable or nite partition of
X and M P(X) is the algebra which consists of the collection of sets A X
such that
(7.2) A = { F : A} .
It is easily seen that M is a algebra.
Any measure : M [0, ] is determined uniquely by its values on F. Con-
versely, if we are given any function : F [0, ] we may dene, for A M,
(A) =
X
F3A
() =
X
F
()1
A
where 1
A
is one if A and zero otherwise. We may check that is a measure
on M. Indeed, if A =
`

i=1
A
i
and F, then A i A
i
for one and hence
exactly one A
i
. Therefore 1
A
=
P

i=1
1
A
i
and hence
(A) =
X
F
()1
A
=
X
F
()

X
i=1
1
A
i
=

X
i=1
X
F
()1
A
i
=

X
i=1
(A
i
)
as desired. Thus we have shown that there is a one to one correspondence between
measures on M and functions : F [0, ].
We will leave the issue of constructing measures until Sections 13 and 14. How-
ever, let us point out that interesting measures do exist. The following theorem
may be found in Theorem 13.35 or see Section 13.8.1.
Theorem 7.8. To every right continuous non-decreasing function F : R R there
exists a unique measure
F
on B
R
such that
(7.3)
F
((a, b]) = F(b) F(a) < a b <
100 BRUCE K. DRIVER

Moreover, if A B
R
then

F
(A) = inf
(

X
i=1
(F(b
i
) F(a
i
)) : A

i=1
(a
i
, b
i
]
)
(7.4)
= inf
(

X
i=1
(F(b
i
) F(a
i
)) : A

a
i=1
(a
i
, b
i
]
)
. (7.5)
In fact the map F
F
is a one to one correspondence between right continuous
functions F with F(0) = 0 on one hand and measures on B
R
such that (J) <
on any bounded set J B
R
on the other.
Example 7.9. The most important special case of Theorem 7.8 is when F(x) = x,
in which case we write m for
F
. The measure m is called Lebesgue measure.
Theorem 7.10. Lebesgue measure m is invariant under translations, i.e. for B
B
R
and x R,
(7.6) m(x +B) = m(B).
Moreover, m is the unique measure on B
R
such that m((0, 1]) = 1 and Eq. (7.6)
holds for B B
R
and x R. Moreover, m has the scaling property
(7.7) m(B) = || m(B)
where R, B B
R
and B := {x : x B}.
Proof. Let m
x
(B) := m(x+B), then one easily shows that m
x
is a measure on
B
R
such that m
x
((a, b]) = b a for all a < b. Therefore, m
x
= m by the uniqueness
assertion in Theorem 7.8.
For the converse, suppose that m is translation invariant and m((0, 1]) = 1.
Given n N, we have
(0, 1] =
n
k=1
(
k 1
n
,
k
n
] =
n
k=1

k 1
n
+ (0,
1
n
]

.
Therefore,
1 = m((0, 1]) =
n
X
k=1
m

k 1
n
+ (0,
1
n
]

=
n
X
k=1
m((0,
1
n
]) = n m((0,
1
n
]).
That is to say
m((0,
1
n
]) = 1/n.
Similarly, m((0,
l
n
]) = l/n for all l, n N and therefore by the translation invariance
of m,
m((a, b]) = b a for all a, b Q with a < b.
Finally for a, b R such that a < b, choose a
n
, b
n
Q such that b
n
b and a
n
a,
then (a
n
, b
n
] (a, b] and thus
m((a, b]) = lim
n
m((a
n
, b
n
]) = lim
n
(b
n
a
n
) = b a,
i.e. m is Lebesgue measure.
ANALYSIS TOOLS WITH APPLICATIONS 101
To prove Eq. (7.7) we may assume that 6= 0 since this case is trivial to prove.
Now let m

(B) := ||
1
m(B). It is easily checked that m

is again a measure on
B
R
which satises
m

((a, b]) =
1
m((a, b]) =
1
(b a) = b a
if > 0 and
m

((a, b]) = ||
1
m([b, a)) = ||
1
(b a) = b a
if < 0. Hence m

= m.
We are now going to develope integration theory relative to a measure. The
integral dened in the case for Lebesgue measure, m, will be an extension of the
standard Riemann integral on R.
7.2. Integrals of Simple functions. Let (X, M, ) be a xed measure space in
this section.
Denition 7.11. A function : X F is a simple function if is M B
R
measurable and (X) is a nite set. Any such simple functions can be written as
(7.8) =
n
X
i=1

i
1
A
i
with A
i
M and
i
F.
Indeed, let
1
,
2
, . . . ,
n
be an enumeration of the range of and A
i
=
1
({
i
}).
Also note that Eq. (7.8) may be written more intrinsically as
=
X
yF
y1

1
({y})
.
The next theorem shows that simple functions are pointwise dense in the space
of measurable functions.
Theorem 7.12 (Approximation Theorem). Let f : X [0, ] be measurable and
dene

n
(x)
2
2n
1
X
k=0
k
2
n
1
f
1
(
(
k
2
n
,
k+1
2
n
]
)
(x) + 2
n
1
f
1
((2
n
,])
(x)
=
2
2n
1
X
k=0
k
2
n
1
{
k
2
n
<f
k+1
2
n }
(x) + 2
n
1
{f>2
n
}
(x)
then
n
f for all n,
n
(x) f(x) for all x X and
n
f uniformly on the sets
X
M
:= {x X : f(x) M} with M < . Moreover, if f : X C is a measurable
function, then there exists simple functions
n
such that lim
n

n
(x) = f(x) for
all x and |
n
| |f| as n .
Proof. It is clear by construction that
n
(x) f(x) for all x and that 0
f(x)
n
(x) 2
n
if x X
2
n. From this it follows that
n
(x) f(x) for all x X
and
n
f uniformly on bounded sets.
Also notice that
(
k
2
n
,
k + 1
2
n
] = (
2k
2
n+1
,
2k + 2
2
n+1
] = (
2k
2
n+1
,
2k + 1
2
n+1
] (
2k + 1
2
n+1
,
2k + 2
2
n+1
]
102 BRUCE K. DRIVER

and for x f
1

(
2k
2
n+1
,
2k+1
2
n+1
]

,
n
(x) =
n+1
(x) =
2k
2
n+1
and for x
f
1

(
2k+1
2
n+1
,
2k+2
2
n+1
]

,
n
(x) =
2k
2
n+1
<
2k+1
2
n+1
=
n+1
(x). Similarly
(2
n
, ] = (2
n
, 2
n+1
] (2
n+1
, ],
so for x f
1
((2
n+1
, ])
n
(x) = 2
n
< 2
n+1
=
n+1
(x) and for x
f
1
((2
n
, 2
n+1
]),
n+1
(x) 2
n
=
n
(x). Therefore
n

n+1
for all n and we
have completed the proof of the rst assertion.
For the second assertion, rst assume that f : X R is a measurable function
and choose

n
to be simple functions such that

n
f

as n and dene

n
=
+
n

n
. Then
|
n
| =
+
n
+

n

+
n+1
+

n+1
= |
n+1
|
and clearly |
n
| =
+
n
+

n
f
+
+f

= |f| and
n
=
+
n

n
f
+
f

= f as
n .
Now suppose that f : X C is measurable. We may now choose simple
function u
n
and v
n
such that |u
n
| |Re f| , |v
n
| |Imf| , u
n
Re f and v
n
Imf
as n . Let
n
= u
n
+iv
n
, then
|
n
|
2
= u
2
n
+v
2
n
|Re f|
2
+|Imf|
2
= |f|
2
and
n
= u
n
+iv
n
Re f +i Imf = f as n .
We are now ready to dene the Lebesgue integral. We will start by integrating
simple functions and then proceed to general measurable functions.
Denition 7.13. Let F = C or [0, ) and suppose that : X F is a simple
function. If F = C assume further that (
1
({y})) < for all y 6= 0 in C. For
such functions , dene I

() by
I

() =
X
yF
y(
1
({y})).
Proposition 7.14. Let F and and be two simple functions, then I

satises:
(1)
(7.9) I

() = I

().
(2)
I

( +) = I

() +I

().
(3) If and are non-negative simple functions such that then
I

() I

().
Proof. Let us write { = y} for the set
1
({y}) X and ( = y) for
({ = y}) = (
1
({y})) so that
I

() =
X
yC
y( = y).
ANALYSIS TOOLS WITH APPLICATIONS 103
We will also write { = a, = b} for
1
({a})
1
({b}). This notation is more
intuitive for the purposes of this proof. Suppose that F then
I

() =
X
yF
y ( = y) =
X
yF
y ( = y/)
=
X
zF
z ( = z) = I

()
provided that 6= 0. The case = 0 is clear, so we have proved 1.
Suppose that and are two simple functions, then
I

( +) =
X
zF
z ( + = z)
=
X
zF
z (
wF
{ = w, = z w})
=
X
zF
z
X
wF
( = w, = z w)
=
X
z,wF
(z +w)( = w, = z)
=
X
zF
z ( = z) +
X
wF
w ( = w)
= I

() +I

().
which proves 2.
For 3. if and are non-negative simple functions such that
I

() =
X
a0
a( = a) =
X
a,b0
a( = a, = b)

X
a,b0
b( = a, = b) =
X
b0
b( = b) = I

(),
wherein the third inequality we have used { = a, = b} = if a > b.
7.3. Integrals of positive functions.
Denition 7.15. Let L
+
= {f : X [0, ] : f is measurable}. Dene
Z
X
fd = sup{I

() : is simple and f} .
Because of item 3. of Proposition 7.14, if is a non-negative simple function,
R
X
d = I

() so that
R
X
is an extension of I

. We say the f L
+
is integrable
if
R
X
fd < .
Remark 7.16. Notice that we still have the monotonicity property: 0 f g then
Z
X
fd = sup{I

() : is simple and f}
sup{I

() : is simple and g}
Z
X
g.
Similarly if c > 0,
Z
X
cfd = c
Z
X
fd.
104 BRUCE K. DRIVER

Also notice that if f is integrable, then ({f = }) = 0.


Lemma 7.17. Let X be a set and : X [0, ] be a function, let =
P
xX
(x)
x
on M = P(X), i.e.
(A) =
X
xA
(x).
If f : X [0, ] is a function (which is necessarily measurable), then
Z
X
fd =
X
X
f.
Proof. Suppose that : X [0, ] is a simple function, then =
P
z[0,]
z1

1
({z})
and
X
X
=
X
xX
(x)
X
z[0,]
z1

1
({z})
(x) =
X
z[0,]
z
X
xX
(x)1

1
({z})
(x)
=
X
z[0,]
z(
1
({z})) =
Z
X
d.
So if : X [0, ) is a simple function such that f, then
Z
X
d =
X
X

X
X
f.
Taking the sup over in this last equation then shows that
Z
X
fd
X
X
f.
For the reverse inequality, let X be a nite set and N (0, ). Set
f
N
(x) = min{N, f(x)} and let
N,
be the simple function given by
N,
(x) :=
1

(x)f
N
(x). Because
N,
(x) f(x),
X

f
N
=
X
X

N,
=
Z
X

N,
d
Z
X
fd.
Since f
N
f as N , we may let N in this last equation to concluded
that
X

f
Z
X
fd
and since is arbitrary we learn that
X
X
f
Z
X
fd.
Theorem 7.18 (Monotone Convergence Theorem). Suppose f
n
L
+
is a sequence
of functions such that f
n
f (f is necessarily in L
+
) then
Z
f
n

Z
f as n .
ANALYSIS TOOLS WITH APPLICATIONS 105
Proof. Since f
n
f
m
f, for all n m < ,
Z
f
n

Z
f
m

Z
f
from which if follows
R
f
n
is increasing in n and
(7.10) lim
n
Z
f
n

Z
f.
For the opposite inequality, let be a simple function such that 0 f and
let (0, 1). By Proposition 7.14,
(7.11)
Z
f
n

Z
1
E
n
f
n

Z
E
n
=
Z
E
n
.
Write =
P

i
1
B
i
with
i
> 0 and B
i
M, then
lim
n
Z
E
n
= lim
n
X

i
Z
E
n
1
B
i
=
X

i
(E
n
B
i
) =
X

i
lim
n
(E
n
B
i
)
=
X

i
(B
i
) =
Z
.
Using this we may let n in Eq. (7.11) to conclude
lim
n
Z
f
n
lim
n
Z
E
n
=
Z
X
.
Because this equation holds for all simple functions 0 f, form the denition
of
R
f we have lim
n
R
f
n

R
f. Since (0, 1) is arbitrary, lim
n
R
f
n

R
f
which combined with Eq. (7.10) proves the theorem.
The following simple lemma will be use often in the sequel.
Lemma 7.19 (Chebyshevs Inequality). Suppose that f 0 is a measurable func-
tion, then for any > 0,
(7.12) (f )
1

Z
X
fd.
In particular if
R
X
fd < then (f = ) = 0 (i.e. f < a.e.) and the set
{f > 0} is nite.
Proof. Since 1
{f}
1
{f}
1

f
1

f,
(f ) =
Z
X
1
{f}
d
Z
X
1
{f}
1

fd
1

Z
X
fd.
If M :=
R
X
fd < , then
(f = ) (f n)
M
n
0 as n
and {f 1/n} {f > 0} with (f 1/n) nM < for all n.
Corollary 7.20. If f
n
L
+
is a sequence of functions then
Z
X
n
f
n
=
X
n
Z
f
n
.
In particular, if
P
n
R
f
n
< then
P
n
f
n
< a.e.
106 BRUCE K. DRIVER

Proof. First o we show that


Z
(f
1
+f
2
) =
Z
f
1
+
Z
f
2
by choosing non-negative simple function
n
and
n
such that
n
f
1
and
n
f
2
.
Then (
n
+
n
) is simple as well and (
n
+
n
) (f
1
+ f
2
) so by the monotone
convergence theorem,
Z
(f
1
+f
2
) = lim
n
Z
(
n
+
n
) = lim
n
Z

n
+
Z

n

= lim
n
Z

n
+ lim
n
Z

n
=
Z
f
1
+
Z
f
2
.
Now to the general case. Let g
N

N
P
n=1
f
n
and g =

P
1
f
n
, then g
N
g and so again
by monotone convergence theorem and the additivity just proved,

X
n=1
Z
f
n
:= lim
N
N
X
n=1
Z
f
n
= lim
N
Z
N
X
n=1
f
n
= lim
N
Z
g
N
=
Z
g =

X
n=1
Z
f
n
.
Remark 7.21. It is in the proof of this corollary (i.e. the linearity of the integral)
that we really make use of the assumption that all of our functions are measurable.
In fact the denition
R
fd makes sense for all functions f : X [0, ] not just
measurable functions. Moreover the monotone convergence theorem holds in this
generality with no change in the proof. However, in the proof of Corollary 7.20, we
use the approximation Theorem 7.12 which relies heavily on the measurability of
the functions to be approximated.
The following Lemma and the next Corollary are simple applications of Corollary
7.20.
Lemma 7.22 (First Borell-Carnteli- Lemma.). Let (X, M, ) be a measure space,
A
n
M, and set
{A
n
i.o.} = {x X : x A
n
for innitely many ns} =

\
N=1
[
nN
A
n
.
If
P

n=1
(A
n
) < then ({A
n
i.o.}) = 0.
Proof. (First Proof.) Let us rst observe that
{A
n
i.o.} =
(
x X :

X
n=1
1
A
n
(x) =
)
.
Hence if
P

n=1
(A
n
) < then
>

X
n=1
(A
n
) =

X
n=1
Z
X
1
A
n
d =
Z
X

X
n=1
1
A
n
d
implies that

P
n=1
1
A
n
(x) < for - a.e. x. That is to say ({A
n
i.o.}) = 0.
ANALYSIS TOOLS WITH APPLICATIONS 107
(Second Proof.) Of course we may give a strictly measure theoretic proof of this
fact:
(A
n
i.o.) = lim
N

_
_
[
nN
A
n
_
_
lim
N
X
nN
(A
n
)
and the last limit is zero since
P

n=1
(A
n
) < .
Corollary 7.23. Suppose that (X, M, ) is a measure space and {A
n
}

n=1
M is
a collection of sets such that (A
i
A
j
) = 0 for all i 6= j, then
(

n=1
A
n
) =

X
n=1
(A
n
).
Proof. Since
(

n=1
A
n
) =
Z
X
1

n=1
A
n
d and

X
n=1
(A
n
) =
Z
X

X
n=1
1
A
n
d
it suces to show
(7.13)

X
n=1
1
A
n
= 1

n=1
A
n
a.e.
Now
P

n=1
1
A
n
1

n=1
A
n
and
P

n=1
1
A
n
(x) 6= 1

n=1
A
n
(x) i x A
i
A
j
for some
i 6= j, that is
(
x :

X
n=1
1
A
n
(x) 6= 1

n=1
A
n
(x)
)
=
i<j
A
i
A
j
and the later set has measure 0 being the countable union of sets of measure zero.
This proves Eq. (7.13) and hence the corollary.
Example 7.24. Suppose < a < b < , f C([a, b], [0, )) and m be
Lebesgue measure on R. Also let
k
= {a = a
k
0
< a
k
1
< < a
k
n
k
= b} be a
sequence of rening partitions (i.e.
k

k+1
for all k) such that
mesh(
k
) := max{

a
k
j
a
k+1
j1

: j = 1, . . . , n
k
} 0 as k .
For each k, let
f
k
(x) = f(a)1
{a}
+
n
k
1
X
l=0
min

f(x) : a
k
l
x a
k
l+1

1
(a
k
l
,a
k
l+1
]
(x)
108 BRUCE K. DRIVER

then f
k
f as k and so by the monotone convergence theorem,
Z
b
a
fdm :=
Z
[a,b]
fdm = lim
k
Z
b
a
f
k
dm
= lim
k
n
k
X
l=0
min

f(x) : a
k
l
x a
k
l+1

(a
k
l
, a
k
l+1
]

=
Z
b
a
f(x)dx.
The latter integral being the Riemann integral.
We can use the above result to integrate some non-Riemann integrable functions:
Example 7.25. For all > 0,
R

0
e
x
dm(x) =
1
and
R
R
1
1+x
2
dm(x) = .
The proof of these equations are similar. By the monotone convergence theorem,
Example 7.24 and the fundamental theorem of calculus for Riemann integrals (or
see Theorem 7.40 below),
Z

0
e
x
dm(x) = lim
N
Z
N
0
e
x
dm(x) = lim
N
Z
N
0
e
x
dx
= lim
N
1

e
x
|
N
0
=
1
and
Z
R
1
1 +x
2
dm(x) = lim
N
Z
N
N
1
1 +x
2
dm(x) = lim
N
Z
N
N
1
1 +x
2
dx
= tan
1
(N) tan
1
(N) = .
Let us also consider the functions x
p
,
Z
(0,1]
1
x
p
dm(x) = lim
n
Z
1
0
1
(
1
n
,1]
(x)
1
x
p
dm(x)
= lim
n
Z
1
1
n
1
x
p
dx = lim
n
x
p+1
1 p

1
1/n
=

1
1p
if p < 1
if p > 1
If p = 1 we nd
Z
(0,1]
1
x
p
dm(x) = lim
n
Z
1
1
n
1
x
dx = lim
n
ln(x)|
1
1/n
= .
Example 7.26. Let {r
n
}

n=1
be an enumeration of the points in Q [0, 1] and
dene
f(x) =

X
n=1
2
n
1
p
|x r
n
|
with the convention that
1
p
|x r
n
|
= 5 if x = r
n
.
ANALYSIS TOOLS WITH APPLICATIONS 109
Since, By Theorem 7.40,
Z
1
0
1
p
|x r
n
|
dx =
Z
1
r
n
1

x r
n
dx +
Z
r
n
0
1

r
n
x
dx
= 2

x r
n
|
1
r
n
2

r
n
x|
r
n
0
= 2

1 r
n

r
n

4,
we nd
Z
[0,1]
f(x)dm(x) =

X
n=1
2
n
Z
[0,1]
1
p
|x r
n
|
dx

X
n=1
2
n
4 = 4 < .
In particular, m(f = ) = 0, i.e. that f < for almost every x [0, 1] and this
implies that

X
n=1
2
n
1
p
|x r
n
|
< for a.e. x [0, 1].
This result is somewhat surprising since the singularities of the summands form a
dense subset of [0, 1].
Proposition 7.27. Suppose that f 0 is a measurable function. Then
R
X
fd = 0
i f = 0 a.e. Also if f, g 0 are measurable functions such that f g a.e. then
R
fd
R
gd. In particular if f = g a.e. then
R
fd =
R
gd.
Proof. If f = 0 a.e. and f is a simple function then = 0 a.e. This implies
that (
1
({y})) = 0 for all y > 0 and hence
R
X
d = 0 and therefore
R
X
fd = 0.
Conversely, if
R
fd = 0, then by Chebyshevs Inequality (Lemma 7.19),
(f 1/n) n
Z
fd = 0 for all n.
Therefore, (f > 0)
P

n=1
(f 1/n) = 0, i.e. f = 0 a.e.
For the second assertion let E be the exceptional set where g > f, i.e. E := {x
X : g(x) > f(x)}. By assumption E is a null set and 1
E
c f 1
E
cg everywhere.
Because g = 1
E
cg + 1
E
g and 1
E
g = 0 a.e.,
Z
gd =
Z
1
E
cgd +
Z
1
E
gd =
Z
1
E
c gd
and similarly
R
fd =
R
1
E
cfd. Since 1
E
cf 1
E
cg everywhere,
Z
fd =
Z
1
E
c fd
Z
1
E
c gd =
Z
gd.
Corollary 7.28. Suppose that {f
n
} is a sequence of non-negative functions and f
is a measurable function such that f
n
f o a null set, then
Z
f
n

Z
f as n .
Proof. Let E X be a null set such that f
n
1
E
c f1
E
c as n . Then by
the monotone convergence theorem and Proposition 7.27,
Z
f
n
=
Z
f
n
1
E
c
Z
f1
E
c =
Z
f as n .
110 BRUCE K. DRIVER

Lemma 7.29 (Fatous Lemma). If f


n
: X [0, ] is a sequence of measurable
functions then
Z
liminf
n
f
n
liminf
n
Z
f
n
Proof. Dene g
k
inf
nk
f
n
so that g
k
liminf
n
f
n
as k . Since g
k
f
n
for all k n,
Z
g
k

Z
f
n
for all n k
and therefore
Z
g
k
lim inf
n
Z
f
n
for all k.
We may now use the monotone convergence theorem to let k to nd
Z
lim inf
n
f
n
=
Z
lim
k
g
k
MCT
= lim
k
Z
g
k
lim inf
n
Z
f
n
.
7.4. Integrals of Complex Valued Functions.
Denition 7.30. A measurable function f : X

R is integrable if f
+
f1
{f0}
and f

= f 1
{f0}
are integrable. We write L
1
for the space of integrable
functions. For f L
1
, let
Z
fd =
Z
f
+
d
Z
f

d
Convention: If f, g : X

R are two measurable functions, let f + g denote
the collection of measurable functions h : X

R such that h(x) = f(x) + g(x)
whenever f(x) + g(x) is well dened, i.e. is not of the form or + .
We use a similar convention for f g. Notice that if f, g L
1
and h
1
, h
2
f +g,
then h
1
= h
2
a.e. because |f| < and |g| < a.e.
Remark 7.31. Since
f

|f| f
+
+f

,
a measurable function f is integrable i
R
|f| d < . If f, g L
1
and f = g a.e.
then f

= g

a.e. and so it follows from Proposition 7.27 that


R
fd =
R
gd. In
particular if f, g L
1
we may dene
Z
X
(f +g) d =
Z
X
hd
where h is any element of f +g.
Proposition 7.32. The map
f L
1

Z
X
fd R
is linear and has the monotonicity property:
R
fd
R
gd for all f, g L
1
such
that f g a.e.
ANALYSIS TOOLS WITH APPLICATIONS 111
Proof. Let f, g L
1
and a, b R. By modifying f and g on a null set, we may
assume that f, g are real valued functions. We have af +bg L
1
because
|af +bg| |a||f| +|b| |g| L
1
.
If a < 0, then
(af)
+
= af

and (af)

= af
+
so that
Z
af = a
Z
f

+a
Z
f
+
= a(
Z
f
+

Z
f

) = a
Z
f.
A similar calculation works for a > 0 and the case a = 0 is trivial so we have shown
that
Z
af = a
Z
f.
Now set h = f +g. Since h = h
+
h

,
h
+
h

= f
+
f

+g
+
g

or
h
+
+f

+g

= h

+f
+
+g
+
.
Therefore,
Z
h
+
+
Z
f

+
Z
g

=
Z
h

+
Z
f
+
+
Z
g
+
and hence
Z
h =
Z
h
+

Z
h

=
Z
f
+
+
Z
g
+

Z
f

Z
g

=
Z
f +
Z
g.
Finally if f
+
f

= f g = g
+
g

then f
+
+ g

g
+
+ f

which implies
that
Z
f
+
+
Z
g


Z
g
+
+
Z
f

or equivalently that
Z
f =
Z
f
+

Z
f


Z
g
+

Z
g

=
Z
g.
The monotonicity property is also a consequence of the linearity of the integral, the
fact that f g a.e. implies 0 g f a.e. and Proposition 7.27.
Denition 7.33. A measurable function f : X C is integrable if
R
X
|f| d < ,
again we write f L
1
. Because, max (|Re f| , |Imf|) |f|

2 max (|Re f| , |Imf|) ,


R
|f| d < i
Z
|Re f| d +
Z
|Imf| d < .
For f L
1
dene
Z
f d =
Z
Re f d +i
Z
Imf d.
It is routine to show the integral is still linear on the complex L
1
(prove!).
Proposition 7.34. Suppose that f L
1
, then

Z
X
fd

Z
X
|f|d.
112 BRUCE K. DRIVER

Proof. Start by writing


R
X
f d = Re
i
. Then using the monotonicity in
Proposition 7.27,

Z
X
fd

= R = e
i
Z
X
fd =
Z
X
e
i
fd
=
Z
X
Re

e
i
f

d
Z
X

Re

e
i
f

d
Z
X
|f| d.
Proposition 7.35. f, g L
1
, then
(1) The set {f 6= 0} is -nite, in fact {|f|
1
n
} {f 6= 0} and (|f|
1
n
) <
for all n.
(2) The following are equivalent
(a)
R
E
f =
R
E
g for all E M
(b)
R
X
|f g| = 0
(c) f = g a.e.
Proof. 1. By Chebyshevs inequality, Lemma 7.19,
(|f|
1
n
) n
Z
X
|f|d <
for all n.
2. (a) = (c) Notice that
Z
E
f =
Z
E
g
Z
E
(f g) = 0
for all E M. Taking E = {Re(f g) > 0} and using 1
E
Re(f g) 0, we learn
that
0 = Re
Z
E
(f g)d =
Z
1
E
Re(f g) =1
E
Re(f g) = 0 a.e.
This implies that 1
E
= 0 a.e. which happens i
({Re(f g) > 0}) = (E) = 0.
Similar (Re(f g) < 0) = 0 so that Re(f g) = 0 a.e. Similarly, Im(f g) = 0
a.e and hence f g = 0 a.e., i.e. f = g a.e.
(c) = (b) is clear and so is (b) = (a) since

Z
E
f
Z
E
g

Z
|f g| = 0.
Denition 7.36. Let (X, M, ) be a measure space and L
1
() = L
1
(X, M, )
denote the set of L
1
functions modulo the equivalence relation; f g i f = g a.e.
We make this into a normed space using the norm
kf gk
L
1
=
Z
|f g| d
and into a metric space using
1
(f, g) = kf gk
L
1
.
ANALYSIS TOOLS WITH APPLICATIONS 113
Remark 7.37. More generally we may dene L
p
() = L
p
(X, M, ) for p [1, )
as the set of measurable functions f such that
Z
X
|f|
p
d <
modulo the equivalence relation; f g i f = g a.e.
We will see in Section 9 that
kfk
L
p
=
Z
|f|
p
d

1/p
for f L
p
()
is a norm and (L
p
(), kk
L
p
) is a Banach space in this norm.
Theorem 7.38 (Dominated Convergence Theorem). Suppose f
n
, g
n
, g L
1
, f
n

f a.e., |f
n
| g
n
L
1
, g
n
g a.e. and
R
X
g
n
d
R
X
gd. Then f L
1
and
Z
X
fd = lim
h
Z
X
f
n
d.
(In most typical applications of this theorem g
n
= g L
1
for all n.)
Proof. Notice that |f| = lim
n
|f
n
| lim
n
|g
n
| g a.e. so that f L
1
.
By considering the real and imaginary parts of f separately, it suces to prove the
theorem in the case where f is real. By Fatous Lemma,
Z
X
(g f)d =
Z
X
liminf
n
(g
n
f
n
) d liminf
n
Z
X
(g
n
f
n
) d
= lim
n
Z
X
g
n
d + liminf
n

Z
X
f
n
d

=
Z
X
gd + liminf
n

Z
X
f
n
d

Since liminf
n
(a
n
) = limsup
n
a
n
, we have shown,
Z
X
gd
Z
X
fd
Z
X
gd +

liminf
n
R
X
f
n
d
limsup
n
R
X
f
n
d
and therefore
limsup
n
Z
X
f
n
d
Z
X
fd liminf
n
Z
X
f
n
d.
This shows that lim
n
R
X
f
n
d exists and is equal to
R
X
fd.
Corollary 7.39. Let {f
n
}

n=1
L
1
be a sequence such that
P

n=1
kf
n
k
L
1
< ,
then
P

n=1
f
n
is convergent a.e. and
Z
X


X
n=1
f
n
!
d =

X
n=1
Z
X
f
n
d.
Proof. The condition
P

n=1
kf
n
k
L
1
< is equivalent to
P

n=1
|f
n
| L
1
. Hence
P

n=1
f
n
is almost everywhere convergent and if S
N
:=
P
N
n=1
f
n
, then
|S
N
|
N
X
n=1
|f
n
|

X
n=1
|f
n
| L
1
.
114 BRUCE K. DRIVER

So by the dominated convergence theorem,


Z
X


X
n=1
f
n
!
d =
Z
X
lim
N
S
N
d = lim
N
Z
X
S
N
d
= lim
N
N
X
n=1
Z
X
f
n
d =

X
n=1
Z
X
f
n
d.
Theorem 7.40 (The Fundamental Theorem of Calculus). Suppose < a < b <
, f C((a, b), R)L
1
((a, b), m) and F(x) :=
R
x
a
f(y)dm(y). Then
(1) F C([a, b], R) C
1
((a, b), R).
(2) F
0
(x) = f(x) for all x (a, b).
(3) If G C([a, b], R) C
1
((a, b), R) is an anti-derivative of f on (a, b) (i.e.
f = G
0
|
(a,b)
) then
Z
b
a
f(x)dm(x) = G(b) G(a).
Proof. Since F(x) :=
R
R
1
(a,x)
(y)f(y)dm(y), lim
xz
1
(a,x)
(y) = 1
(a,z)
(y) for m
a.e. y and

1
(a,x)
(y)f(y)

1
(a,b)
(y) |f(y)| is an L
1
function, it follows from
the dominated convergence Theorem 7.38 that F is continuous on [a, b]. Simple
manipulations show,

F(x +h) F(x)


h
f(x)

=
1
|h|
_
_
_

R
x+h
x
[f(y) f(x)] dm(y)

if h > 0

R
x
x+h
[f(y) f(x)] dm(y)

if h < 0

1
|h|
(
R
x+h
x
|f(y) f(x)| dm(y) if h > 0
R
x
x+h
|f(y) f(x)| dm(y) if h < 0
sup{|f(y) f(x)| : y [x |h| , x +|h|]}
and the latter expression, by the continuity of f, goes to zero as h 0 . This shows
F
0
= f on (a, b).
For the converse direction, we have by assumption that G
0
(x) = F
0
(x) for x
(a, b). Therefore by the mean value theorem, F G = C for some constant C. Hence
Z
b
a
f(x)dm(x) = F(b) = F(b) F(a) = (G(b) +C) (G(a) +C) = G(b) G(a).
Example 7.41. The following limit holds,
lim
n
Z
n
0
(1
x
n
)
n
dm(x) = 1.
Let f
n
(x) = (1
x
n
)
n
1
[0,n]
(x) and notice that lim
n
f
n
(x) = e
x
. We will now
show
0 f
n
(x) e
x
for all x 0.
It suces to consider x [0, n]. Let g(x) = e
x
f
n
(x), then for x (0, n),
d
dx
lng(x) = 1 +n
1
(1
x
n
)
(
1
n
) = 1
1
(1
x
n
)
0
ANALYSIS TOOLS WITH APPLICATIONS 115
which shows that lng(x) and hence g(x) is decreasing on [0, n]. Therefore g(x)
g(0) = 1, i.e.
0 f
n
(x) e
x
.
From Example 7.25, we know
Z

0
e
x
dm(x) = 1 < ,
so that e
x
is an integrable function on [0, ). Hence by the dominated convergence
theorem,
lim
n
Z
n
0
(1
x
n
)
n
dm(x) = lim
n
Z

0
f
n
(x)dm(x)
=
Z

0
lim
n
f
n
(x)dm(x) =
Z

0
e
x
dm(x) = 1.
Example 7.42 (Integration of Power Series). Suppose R > 0 and {a
n
}

n=0
is a
sequence of complex numbers such that
P

n=0
|a
n
| r
n
< for all r (0, R). Then
Z


X
n=0
a
n
x
n
!
dm(x) =

X
n=0
a
n
Z

x
n
dm(x) =

X
n=0
a
n

n+1

n+1
n + 1
for all R < < < R. Indeed this follows from Corollary 7.39 since

X
n=0
Z

|a
n
| |x|
n
dm(x)

X
n=0

Z
||
0
|a
n
| |x|
n
dm(x) +
Z
||
0
|a
n
| |x|
n
dm(x)
!


X
n=0
|a
n
|
||
n+1
+||
n+1
n + 1
2r

X
n=0
|a
n
| r
n
<
where r = max(|| , ||).
Corollary 7.43 (Dierentiation Under the Integral). Suppose that J R is an
open interval and f : J X C is a function such that
(1) x f(t, x) is measurable for each t J.
(2) f(t
0
, ) L
1
() for some t
0
J.
(3)
f
t
(t, x) exists for all (t, x).
(4) There is a function g L
1
such that

f
t
(t, )

g L
1
for each t J.
Then f(t, ) L
1
() for all t J (i.e.
R
|f(t, x)| d(x) < ), t
R
X
f(t, x)d(x) is a dierentiable function on J and
d
dt
Z
X
f(t, x)d(x) =
Z
X
f
t
(t, x)d(x).
Proof. (The proof is essentially the same as for sums.) By considering the real
and imaginary parts of f separately, we may assume that f is real. Also notice that
f
t
(t, x) = lim
n
n(f(t +n
1
, x) f(t, x))
and therefore, for x
f
t
(t, x) is a sequential limit of measurable functions and
hence is measurable for all t J. By the mean value theorem,
(7.14) |f(t, x) f(t
0
, x)| g(x) |t t
0
| for all t J
116 BRUCE K. DRIVER

and hence
|f(t, x)| |f(t, x) f(t
0
, x)| +|f(t
0
, x)| g(x) |t t
0
| +|f(t
0
, x)| .
This shows f(t, ) L
1
() for all t J. Let G(t) :=
R
X
f(t, x)d(x), then
G(t) G(t
0
)
t t
0
=
Z
X
f(t, x) f(t
0
, x)
t t
0
d(x).
By assumption,
lim
tt
0
f(t, x) f(t
0
, x)
t t
0
=
f
t
(t, x) for all x X
and by Eq. (7.14),

f(t, x) f(t
0
, x)
t t
0

g(x) for all t J and x X.


Therefore, we may apply the dominated convergence theorem to conclude
lim
n
G(t
n
) G(t
0
)
t
n
t
0
= lim
n
Z
X
f(t
n
, x) f(t
0
, x)
t
n
t
0
d(x)
=
Z
X
lim
n
f(t
n
, x) f(t
0
, x)
t
n
t
0
d(x) =
Z
X
f
t
(t
0
, x)d(x)
for all sequences t
n
J \ {t
0
} such that t
n
t
0
. Therefore,

G(t
0
) =
lim
tt
0
G(t)G(t
0
)
tt
0
exists and

G(t
0
) =
Z
X
f
t
(t
0
, x)d(x).
Example 7.44. Recall from Example 7.25 that

1
=
Z
[0,)
e
x
dm(x) for all > 0.
Let > 0. For 2 > 0 and n N there exists C
n
() < such that
0

d
d

n
e
x
= x
n
e
x
C()e
x
.
Using this fact, Corollary 7.43 and induction gives
n!
n1
=

d
d

1
=
Z
[0,)

d
d

n
e
x
dm(x) =
Z
[0,)
x
n
e
x
dm(x).
That is n! =
n
R
[0,)
x
n
e
x
dm(x). Recall that
(t) :=
Z
[0,)
x
t1
e
x
dx for t > 0.
(The reader should check that (t) < for all t > 0.) We have just shown that
(n + 1) = n! for all n N.
ANALYSIS TOOLS WITH APPLICATIONS 117
Remark 7.45. Corollary 7.43 may be generalized by allowing the hypothesis to hold
for x X\E where E Mis a xed null set, i.e. E must be independent of t. Con-
sider what happens if we formally apply Corollary 7.43 to g(t) :=
R

0
1
xt
dm(x),
g(t) =
d
dt
Z

0
1
xt
dm(x)
?
=
Z

0

t
1
xt
dm(x).
The last integral is zero since

t
1
xt
= 0 unless t = x in which case it is not
dened. On the other hand g(t) = t so that g(t) = 1. (The reader should decide
which hypothesis of Corollary 7.43 has been violated in this example.)
7.5. Measurability on Complete Measure Spaces. In this subsection we will
discuss a couple of measurability results concerning completions of measure spaces.
Proposition 7.46. Suppose that (X, M, ) is a complete measure space
13
and
f : X R is measurable.
(1) If g : X R is a function such that f(x) = g(x) for a.e. x, then g is
measurable.
(2) If f
n
: X R are measurable and f : X R is a function such that
lim
n
f
n
= f, - a.e., then f is measurable as well.
Proof. 1. Let E = {x : f(x) 6= g(x)} which is assumed to be in M and
(E) = 0. Then g = 1
E
c f + 1
E
g since f = g on E
c
. Now 1
E
c f is measurable so g
will be measurable if we show 1
E
g is measurable. For this consider,
(7.15) (1
E
g)
1
(A) =

E
c
(1
E
g)
1
(A\ {0}) if 0 A
(1
E
g)
1
(A) if 0 / A
Since (1
E
g)
1
(B) E if 0 / B and (E) = 0, it follow by completeness of M that
(1
E
g)
1
(B) M if 0 / B. Therefore Eq. (7.15) shows that 1
E
g is measurable.
2. Let E = {x : lim
n
f
n
(x) 6= f(x)} by assumption E M and (E) = 0. Since
g 1
E
f = lim
n
1
E
cf
n
, g is measurable. Because f = g on E
c
and (E) = 0,
f = g a.e. so by part 1. f is also measurable.
The above results are in general false if (X, M, ) is not complete. For example,
let X = {0, 1, 2} M = {{0}, {1, 2}, X, } and =
0
Take g(0) = 0, g(1) =
1, g(2) = 2, then g = 0 a.e. yet g is not measurable.
Lemma 7.47. Suppose that (X, M, ) is a measure space and

M is the completion
of M relative to and is the extension of to

M. Then a function f : X R
is (

M, B = B
R
) measurable i there exists a function g : X R that is (M, B)
measurable such E = {x : f(x) 6= g(x)}

M and (E) = 0, i.e. f(x) = g(x) for
a.e. x. Moreover for such a pair f and g, f L
1
( ) i g L
1
() and in which
case
Z
X
fd =
Z
X
gd.
Proof. Suppose rst that such a function g exists so that (E) = 0. Since
g is also (

M, B) measurable, we see from Proposition 7.46 that f is (

M, B)
measurable.
13
Recall this means that if N X is a set such that N A M and (A) = 0, then N M
as well.
118 BRUCE K. DRIVER

Conversely if f is (

M, B) measurable, by considering f

we may assume that


f 0. Choose (

M, B) measurable simple function
n
0 such that
n
f as
n . Writing

n
=
X
a
k
1
A
k
with A
k


M, we may choose B
k
M such that B
k
A
k
and (A
k
\ B
k
) = 0.
Letting

n
:=
X
a
k
1
B
k
we have produced a (M, B) measurable simple function

n
0 such that E
n
:=
{
n
6=

n
} has zero measure. Since (
n
E
n
)
P
n
(E
n
) , there exists F M
such that
n
E
n
F and (F) = 0. It now follows that
1
F

n
= 1
F

n
g := 1
F
f as n .
This shows that g = 1
F
f is (M, B) measurable and that {f 6= g} F has
measure zero.
Since f = g, a.e.,
R
X
fd =
R
X
gd so to prove Eq. (7.16) it suces to prove
(7.16)
Z
X
gd =
Z
X
gd.
Because = on M, Eq. (7.16) is easily veried for non-negative M measurable
simple functions. Then by the monotone convergence theorem and the approxi-
mation Theorem 7.12 it holds for all M measurable functions g : X [0, ].
The rest of the assertions follow in the standard way by considering (Re g)

and
(Img)

.
7.6. Comparison of the Lebesgue and the Riemann Integral. For the rest
of this chapter, let < a < b < and f : [a, b] R be a bounded function. A
partition of [a, b] is a nite subset [a, b] containing {a, b}. To each partition
(7.17) = {a = t
0
< t
1
< < t
n
= b}
of [a, b] let
mesh() := max{|t
j
t
j1
| : j = 1, . . . , n},
M
j
= sup{f(x) : t
j
x t
j1
}, m
j
= inf{f(x) : t
j
x t
j1
}
G

= f(a)1
{a}
+
n
X
1
M
j
1
(t
j1
,t
j
]
, g

= f(a)1
{a}
+
n
X
1
m
j
1
(t
j1
,t
j
]
and
S

f =
X
M
j
(t
j
t
j1
) and s

f =
X
m
j
(t
j
t
j1
).
Notice that
S

f =
Z
b
a
G

dm and s

f =
Z
b
a
g

dm.
The upper and lower Riemann integrals are dened respectively by
Z
b
a
f(x)dx = inf

f and
Z
a
b
f(x)dx = sup

f.
ANALYSIS TOOLS WITH APPLICATIONS 119
Denition 7.48. The function f is Riemann integrable i
R
b
a
f =
R
b
a
f and
which case the Riemann integral
R
b
a
f is dened to be the common value:
Z
b
a
f(x)dx =
Z
b
a
f(x)dx =
Z
b
a
f(x)dx.
The proof of the following Lemma is left as an exercise to the reader.
Lemma 7.49. If
0
and are two partitions of [a, b] and
0
then
G

0 f g

0 g

and
S

f S

0 f s

0 f s

f.
There exists an increasing sequence of partitions {
k
}

k=1
such that mesh(
k
) 0
and
S

k
f
Z
b
a
f and s

k
f
Z
b
a
f as k .
If we let
(7.18) G lim
k
G

k
and g lim
k
g

k
then by the dominated convergence theorem,
Z
[a,b]
gdm = lim
k
Z
[a,b]
g

k
= lim
k
s

k
f =
Z
b
a
f(x)dx (7.19)
and
Z
[a,b]
Gdm = lim
k
Z
[a,b]
G

k
= lim
k
S

k
f =
Z
b
a
f(x)dx. (7.20)
Notation 7.50. For x [a, b], let
H(x) = limsup
yx
f(y) = lim
0
sup{f(y) : |y x| , y [a, b]} and
h(x) liminf
yx
f(y) = lim
0
inf {f(y) : |y x| , y [a, b]}.
Lemma 7.51. The functions H, h : [a, b] R satisfy:
(1) h(x) f(x) H(x) for all x [a, b] and h(x) = H(x) i f is continuous
at x.
(2) If {
k
}

k=1
is any increasing sequence of partitions such that mesh(
k
) 0
and G and g are dened as in Eq. (7.18), then
(7.21) G(x) = H(x) f(x) h(x) = g(x) x / :=

k=1

k
.
(Note is a countable set.)
(3) H and h are Borel measurable.
Proof. Let G
k
G

k
G and g
k
g

k
g.
(1) It is clear that h(x) f(x) H(x) for all x and H(x) = h(x) i lim
yx
f(y)
exists and is equal to f(x). That is H(x) = h(x) i f is continuous at x.
120 BRUCE K. DRIVER

(2) For x / ,
G
k
(x) H(x) f(x) h(x) g
k
(x) k
and letting k in this equation implies
(7.22) G(x) H(x) f(x) h(x) g(x) x / .
Moreover, given > 0 and x / ,
sup{f(y) : |y x| , y [a, b]} G
k
(x)
for all k large enough, since eventually G
k
(x) is the supremum of f(y)
over some interval contained in [x , x +]. Again letting k implies
sup
|yx|
f(y) G(x) and therefore, that
H(x) = limsup
yx
f(y) G(x)
for all x / . Combining this equation with Eq. (7.22) then implies H(x) =
G(x) if x / . A similar argument shows that h(x) = g(x) if x / and
hence Eq. (7.21) is proved.
(3) The functions G and g are limits of measurable functions and hence mea-
surable. Since H = G and h = g except possibly on the countable set ,
both H and h are also Borel measurable. (You justify this statement.)
Theorem 7.52. Let f : [a, b] R be a bounded function. Then
(7.23)
Z
b
a
f =
Z
[a,b]
Hdm and
Z
b
a
f =
Z
[a,b]
hdm
and the following statements are equivalent:
(1) H(x) = h(x) for m -a.e. x,
(2) the set
E := {x [a, b] : f is disconituous at x}
is an m null set.
(3) f is Riemann integrable.
If f is Riemann integrable then f is Lebesgue measurable
14
, i.e. f is L/B
measurable where L is the Lebesgue algebra and B is the Borel algebra on
[a, b]. Moreover if we let m denote the completion of m, then
(7.24)
Z
[a,b]
Hdm =
Z
b
a
f(x)dx =
Z
[a,b]
fd m =
Z
[a,b]
hdm.
Proof. Let {
k
}

k=1
be an increasing sequence of partitions of [a, b] as described
in Lemma 7.49 and let G and g be dened as in Lemma 7.51. Since m() = 0,
H = G a.e., Eq. (7.23) is a consequence of Eqs. (7.19) and (7.20). From Eq. (7.23),
f is Riemann integrable i
Z
[a,b]
Hdm =
Z
[a,b]
hdm
and because h f H this happens i h(x) = H(x) for m - a.e. x. Since
E = {x : H(x) 6= h(x)}, this last condition is equivalent to E being a m null
14
f need not be Borel measurable.
ANALYSIS TOOLS WITH APPLICATIONS 121
set. In light of these results and Eq. (7.21), the remaining assertions including Eq.
(7.24) are now consequences of Lemma 7.47.
Notation 7.53. In view of this theorem we will often write
R
b
a
f(x)dx for
R
b
a
fdm.
7.7. Appendix: Bochner Integral. In this appendix we will discuss how to
dene integrals of functions taking values in a Banach space. The resulting integral
will be called the Bochner integral. In this section, let (, F, ) be a probability
space and X be a separable Banach space.
Remark 7.54. Recall that we have already seen in this case that the Borel eld
B = B(X) on X is the same as the eld ( (X

)) which is generated by X


the continuous linear functionals on X. As a consequence F : X is F/B(X)
measurable i F : R is F/B(R) measurable for all X

.
Lemma 7.55. Let 1 p < and L
p
(; X) denote the space of measurable func-
tions F : X such that
R

kFk
p
d < . For F L
p
(; X), dene
kFk
L
p =
_
_
Z

kFk
p
X
d
_
_
1
p
.
Then after identifying function F L
p
(; X) which agree modulo sets of mea-
sure zero, (L
p
(; X), k k
L
p) becomes a Banach space.
Proof. It is easily checked that k k
L
p is a norm, for example,
kF +Gk
L
p =
_
_
Z

kF +Gk
p
X
d
_
_
1
p

_
_
Z

(kFk
X
+kGk
X
)
p
d
_
_
1
p
kFk
L
p +kGk
L
p.
So the main point is to check completeness of the space. For this suppose {F
n
}

1

L
p
= L
p
(; X) such that

P
n=1
kF
n+1
F
n
k
L
p < and dene F
0
0. Since kFk
L
1
kFk
L
p it follows that
Z

X
n=1
kF
n+1
F
n
k
X
d

X
n=1
kF
n+1
F
n
k
L
1 <
and therefore that

P
n=1
kF
n+1
F
n
k
X
< on as set
0
such that (
0
) = 1.
Since X is complete, we know

P
n=0
(F
n+1
(x) F
n
(x)) exists in X for all x
0
so
we may dene F : X by
F
_
_
_

P
n=0
(F
n+1
F
n
) X on
0
0 on
c
0
.
Then on
0
,
F F
N
=

X
n=N+1
(F
n+1
F
n
) = lim
M
M
X
n=N+1
(F
n+1
F
n
).
122 BRUCE K. DRIVER

So
kF F
N
k
X


X
n=N+1
kF
n+1
F
n
k
X
= lim
M
M
X
nN+1
kF
n+1
F
n
k
X
and therefore by Fatous Lemma and Minikowskis inequality,
kF F
N
k
L
p

lim
M
inf
M
X
N+1
kF
n+1
F
n
k
X

L
p
lim
M
inf

M
X
N+1
|F
n+1
F
n
|

L
p
lim
M
inf
M
X
N+1
kF
n+1
F
n
k
L
p =

X
N+1
kF
n+1
F
n
k
L
p 0 as N .
Therefore F L
p
and lim
N
F
N
= F in L
p
.
Denition 7.56. A measurable function F : X is said to be a simple function
provided that F() is a nite set. Let S denote the collection of simple functions.
For F S set
I(F)
X
xX
x(F
1
({x})) =
X
xX
x({F = x}) =
X
xF()
x({F = x}).
Proposition 7.57. The map I : S X is linear and satises for all F S,
(7.25) kI(F)k
X

Z

kFkd
and
(7.26) (I(F)) =
Z
X
F d X

.
Proof. If 0 6= c R and F S, then
I(cF) =
X
xX
x(cF = x) =
X
xX
x

F =
x
c

=
X
yX
cy (F = y) = cI(F)
and if c = 0, I(0F) = 0 = 0I(F). If F, G S,
I(F +G) =
X
x
x(F +G = x)
=
X
x
x
X
y+z=x
(F = y, G = z)
=
X
y,z
(y +z)(F = y, G = z)
=
X
y
y(F = y) +
X
z
z(G = z) = I(F) +I(G).
Equation (7.25) is a consequence of the following computation:
kI(F)k
X
= k
X
xX
x(F = x)k
X
xX
kxk(F = x) =
Z

kFkd
ANALYSIS TOOLS WITH APPLICATIONS 123
and Eq. (7.26) follows from:
(I(F)) = (
X
xX
x({F = x}))
=
X
xX
(x)({F = x}) =
Z
X
F d.
Proposition 7.58. The set of simple functions, $\mathcal{S}$, is dense in $L^p(\mu; X)$ for all $p \in [1, \infty)$.

Proof. By the assumption that $X$ is separable, there is a countable dense set $D = \{x_n\}_{n=1}^\infty \subset X$. Given $\epsilon > 0$ and $n \in \mathbb{N}$ set
$$V_n^\epsilon = B(x_n, \epsilon) \setminus \left( \bigcup_{i=1}^{n-1} B(x_i, \epsilon) \right)$$
where by convention $V_1^\epsilon = B(x_1, \epsilon)$. Then $X = \coprod_{i=1}^\infty V_i^\epsilon$, a disjoint union. For $F \in L^p(\mu; X)$ let
$$F^\epsilon = \sum_{n=1}^\infty x_n 1_{F^{-1}(V_n^\epsilon)}$$
and notice that $\|F - F^\epsilon\|_X \le \epsilon$ on $\Omega$ and therefore $\|F - F^\epsilon\|_{L^p} \le \epsilon$. In particular this shows that
$$\|F^\epsilon\|_{L^p} \le \|F - F^\epsilon\|_{L^p} + \|F\|_{L^p} \le \epsilon + \|F\|_{L^p} < \infty$$
so that $F^\epsilon \in L^p(\mu; X)$. Since
$$\infty > \|F^\epsilon\|_{L^p}^p = \sum_{n=1}^\infty \|x_n\|^p\, \mu(F^{-1}(V_n^\epsilon)),$$
there exists $N$ such that $\sum_{n=N+1}^\infty \|x_n\|^p\, \mu(F^{-1}(V_n^\epsilon)) \le \epsilon^p$ and hence
$$\left\| F - \sum_{n=1}^N x_n 1_{F^{-1}(V_n^\epsilon)} \right\|_{L^p} \le \|F - F^\epsilon\|_{L^p} + \left\| F^\epsilon - \sum_{n=1}^N x_n 1_{F^{-1}(V_n^\epsilon)} \right\|_{L^p} \le \epsilon + \left\| \sum_{n=N+1}^\infty x_n 1_{F^{-1}(V_n^\epsilon)} \right\|_{L^p} = \epsilon + \left( \sum_{n=N+1}^\infty \|x_n\|^p\, \mu(F^{-1}(V_n^\epsilon)) \right)^{1/p} \le \epsilon + \epsilon = 2\epsilon.$$
Since $\sum_{n=1}^N x_n 1_{F^{-1}(V_n^\epsilon)} \in \mathcal{S}$ and $\epsilon > 0$ is arbitrary, the last estimate proves the proposition.
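The construction in the proof of Proposition 7.58 is easy to carry out concretely. The following Python sketch is purely illustrative: the finite-dimensional space $\mathbb{R}^2$ with the Euclidean norm stands in for the separable Banach space $X$, a fine grid plays the role of the dense set $\{x_n\}$, and a finite sample of $\Omega = [0,1]$ replaces the probability space; all of these are assumptions made only to have something to run.

```python
import numpy as np

# Minimal sketch of Proposition 7.58: approximate an X-valued function by a
# simple function built from the disjoint cells
#   V_n^eps = B(x_n, eps) \ (B(x_1, eps) u ... u B(x_{n-1}, eps)).
eps = 0.05
g = np.arange(-1.5, 1.5 + 1e-9, 0.04)            # grid spacing < eps / sqrt(2)
xs, ys = np.meshgrid(g, g)
dense = np.column_stack([xs.ravel(), ys.ravel()])  # enumeration x_1, x_2, ...

def F(t):                                          # an X-valued function on [0, 1]
    return np.column_stack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)])

t = np.linspace(0.0, 1.0, 400)
vals = F(t)

# Index of the cell V_n^eps containing F(omega): the first n with ||F(omega) - x_n|| < eps.
dists = np.linalg.norm(vals[:, None, :] - dense[None, :, :], axis=-1)
first_n = np.argmax(dists < eps, axis=1)
F_eps = dense[first_n]                             # the simple function F^eps

print("sup ||F - F^eps||_X =", np.linalg.norm(vals - F_eps, axis=-1).max(), "<= eps =", eps)
```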
Theorem 7.59. There is a unique continuous linear map $\bar{I} : L^1(\Omega, \mathcal{F}, \mu; X) \to X$ such that $\bar{I}|_{\mathcal{S}} = I$ where $I$ is defined in Definition 7.56. Moreover, for all $F \in L^1(\Omega, \mathcal{F}, \mu; X)$,
$$\|\bar{I}(F)\|_X \le \int_\Omega \|F\|\, d\mu \tag{7.27}$$
and $\bar{I}(F)$ is the unique element in $X$ such that
$$\phi(\bar{I}(F)) = \int_\Omega \phi \circ F\, d\mu \quad \forall\, \phi \in X^*. \tag{7.28}$$
The map $\bar{I}(F)$ will be denoted suggestively by $\int_\Omega F\, d\mu$ so that Eq. (7.28) may be written as
$$\phi\!\left( \int_\Omega F\, d\mu \right) = \int_\Omega \phi \circ F\, d\mu \quad \forall\, \phi \in X^*.$$

Proof. The existence of a continuous linear map $\bar{I} : L^1(\Omega, \mathcal{F}, \mu; X) \to X$ such that $\bar{I}|_{\mathcal{S}} = I$ and Eq. (7.27) holds follows from Propositions 7.57 and 7.58 and the bounded linear transformation Theorem 4.1. If $\phi \in X^*$ and $F \in L^1(\Omega, \mathcal{F}, \mu; X)$, choose $F_n \in \mathcal{S}$ such that $F_n \to F$ in $L^1(\Omega, \mathcal{F}, \mu; X)$ as $n \to \infty$. Then $\bar{I}(F) = \lim_{n\to\infty} I(F_n)$ and hence by Eq. (7.26),
$$\phi(\bar{I}(F)) = \phi\!\left( \lim_{n\to\infty} I(F_n) \right) = \lim_{n\to\infty} \phi(I(F_n)) = \lim_{n\to\infty} \int_\Omega \phi \circ F_n\, d\mu.$$
This proves Eq. (7.28) since
$$\left| \int_\Omega (\phi \circ F - \phi \circ F_n)\, d\mu \right| \le \int_\Omega |\phi \circ F - \phi \circ F_n|\, d\mu \le \int_\Omega \|\phi\|_{X^*} \|F - F_n\|_X\, d\mu = \|\phi\|_{X^*} \|F - F_n\|_{L^1} \to 0 \text{ as } n \to \infty.$$
The fact that $\bar{I}(F)$ is determined by Eq. (7.28) is a consequence of the Hahn-Banach theorem.

Remark 7.60. The separability assumption on $X$ may be relaxed by assuming that $F : \Omega \to X$ has separable essential range. In this case we may still define $\int_\Omega F\, d\mu$ by applying the above formalism with $X$ replaced by the separable Banach space $X_0 := \operatorname{essran}_\mu(F)$. For example if $\Omega$ is a compact topological space and $F : \Omega \to X$ is a continuous map, then $\int_\Omega F\, d\mu$ is always defined.
7.8. Bochner Integrals.

7.8.1. Bochner Integral Problems From Folland.

#15. Let $f, g \in L^1_Y$ and $c \in \mathbb{C}$; then $\|(f + cg)(x)\| \le \|f(x)\| + |c|\, \|g(x)\|$ for all $x \in X$. Integrating over $x$ gives $\|f + cg\|_1 \le \|f\|_1 + |c|\, \|g\|_1 < \infty$. Hence $f, g \in L_Y$ and $c \in \mathbb{C}$ imply $f + cg \in L_Y$, so that $L_Y$ is a vector subspace of all functions from $X \to Y$. (By the way, $L_Y$ is a vector space since the map $(y_1, y_2) \to y_1 + cy_2$ from $Y \times Y \to Y$ is continuous and therefore $f + cg = \phi \circ (f, g)$ is a composition of measurable functions.) It is clear that $F_Y$ is a linear space. Moreover if $f = \sum_{j=1}^n y_j \chi_{E_j}$ with $\mu(E_j) < \infty$ then $\|f(x)\| \le \sum_{j=1}^n \|y_j\| \chi_{E_j}(x)$ so $\|f\|_{L^1} \le \sum_{j=1}^n \|y_j\|\, \mu(E_j) < \infty$. So $F_Y \subset L^1_Y$. It is easily checked that $\|\cdot\|_1$ is a seminorm with the property
$$\|f\|_1 = 0 \iff \int \|f(x)\|\, d\mu(x) = 0 \iff \|f(x)\| = 0 \text{ a.e.} \iff f(x) = 0 \text{ a.e.}$$
Hence $\|\cdot\|_1$ is a norm on $L^1_Y / (\text{null functions})$.
#16. Let
$$B_n^\epsilon = \{ y \in Y : \|y - y_n\| < \epsilon \|y_n\| \}$$
where $\{y_n\}_{n=1}^\infty$ is a countable dense subset of $Y$. Let $0 \ne y \in Y$ and choose $\{y_{n_k}\} \subset \{y_n\}$ such that $y_{n_k} \to y$ as $k \to \infty$. Then $\|y - y_{n_k}\| \to 0$ while $\|y_{n_k}\| \to \|y\| \ne 0$ as $k \to \infty$. Hence eventually $\|y - y_{n_k}\| < \epsilon \|y_{n_k}\|$ for $k$ sufficiently large, i.e. $y \in B_{n_k}^\epsilon$ for all $k$ sufficiently large. Thus $Y \setminus \{0\} \subset \bigcup_{n=1}^\infty B_n^\epsilon$. Also $Y \setminus \{0\} = \bigcup_{n=1}^\infty B_n^\epsilon$ if $\epsilon < 1$, since $\|0 - y_n\| < \epsilon \|y_n\|$ can not happen.
#17. Let $f \in L^1_Y$ and $1 > \epsilon > 0$, with $B_n^\epsilon$ as in problem 16. Define $A_n^\epsilon \equiv B_n^\epsilon \setminus (B_1^\epsilon \cup \cdots \cup B_{n-1}^\epsilon)$ and $E_n^\epsilon \equiv f^{-1}(A_n^\epsilon)$ and set
$$g_\epsilon \equiv \sum_{n=1}^\infty y_n \chi_{E_n^\epsilon} = \sum_{n=1}^\infty y_n \chi_{A_n^\epsilon} \circ f.$$
Suppose $x \in E_n^\epsilon$; then $\|f(x) - g_\epsilon(x)\| = \|y_n - f(x)\| < \epsilon \|y_n\|$. Now $\|y_n\| \le \|y_n - f(x)\| + \|f(x)\| < \epsilon \|y_n\| + \|f(x)\|$. Therefore $\|y_n\| < \frac{\|f(x)\|}{1-\epsilon}$. So $\|f(x) - g_\epsilon(x)\| < \frac{\epsilon}{1-\epsilon} \|f(x)\|$ for $x \in E_n^\epsilon$. Since $n$ is arbitrary it follows by problem 16 that $\|f(x) - g_\epsilon(x)\| < \frac{\epsilon}{1-\epsilon} \|f(x)\|$ for all $x \notin f^{-1}(\{0\})$. Since $\epsilon < 1$, by the end of problem 16 we know $0 \notin A_n^\epsilon$ for any $n$, so $g_\epsilon(x) = 0$ if $f(x) = 0$. Hence $\|f(x) - g_\epsilon(x)\| < \frac{\epsilon}{1-\epsilon} \|f(x)\|$ holds for all $x \in X$. This implies $\|f - g_\epsilon\|_1 \le \frac{\epsilon}{1-\epsilon} \|f\|_1 \to 0$ as $\epsilon \to 0$. Also we see $\|g_\epsilon\|_1 \le \|f\|_1 + \|f - g_\epsilon\|_1 < \infty$, so $\sum_{n=1}^\infty \|y_n\|\, \mu(E_n^\epsilon) = \|g_\epsilon\|_1 < \infty$. Choose $N(\epsilon) \in \{1, 2, 3, \ldots\}$ such that $\sum_{n=N(\epsilon)+1}^\infty \|y_n\|\, \mu(E_n^\epsilon) < \epsilon$. Set $f_\epsilon(x) = \sum_{n=1}^{N(\epsilon)} y_n \chi_{E_n^\epsilon}(x)$. Then
$$\|f - f_\epsilon\|_1 \le \|f - g_\epsilon\|_1 + \|g_\epsilon - f_\epsilon\|_1 \le \frac{\epsilon}{1-\epsilon} \|f\|_1 + \sum_{n=N(\epsilon)+1}^\infty \|y_n\|\, \mu(E_n^\epsilon) \le \epsilon \left( 1 + \frac{\|f\|_1}{1-\epsilon} \right) \to 0 \text{ as } \epsilon \to 0.$$
Finally $f_\epsilon \in F_Y$ so we are done.
#18. Define $\int : F_Y \to Y$ by $\int_X f(x)\, d\mu(x) = \sum_{y \in Y} y\, \mu(f^{-1}(\{y\}))$. Just as in the real variable case done in class, one shows that $\int : F_Y \to Y$ is linear. For $f \in L^1_Y$ choose $f_n \in F_Y$ such that $\|f - f_n\|_1 \to 0$ as $n \to \infty$. Then $\|f_n - f_m\|_1 \to 0$ as $m, n \to \infty$. Now $f_n - f_m \in F_Y$ and for any $f \in F_Y$,
$$\left\| \int_X f\, d\mu \right\| \le \sum_{y \in Y} \|y\|\, \mu(f^{-1}(\{y\})) = \int_X \|f\|\, d\mu.$$
Therefore $\| \int_X f_n\, d\mu - \int_X f_m\, d\mu \| \le \|f_n - f_m\|_1 \to 0$ as $m, n \to \infty$. Hence $\lim_{n\to\infty} \int_X f_n\, d\mu$ exists in $Y$. Set $\int_X f\, d\mu = \lim_{n\to\infty} \int_X f_n\, d\mu$.

Claim 1. $\int_X f\, d\mu$ is well defined. Indeed, if $g_n \in F_Y$ are such that $\|f - g_n\|_1 \to 0$ as $n \to \infty$, then $\|f_n - g_n\|_1 \to 0$ as $n \to \infty$ also, and $\| \int_X f_n\, d\mu - \int_X g_n\, d\mu \| \le \|f_n - g_n\|_1 \to 0$ as $n \to \infty$. So $\lim_{n\to\infty} \int_X g_n\, d\mu = \lim_{n\to\infty} \int_X f_n\, d\mu$.

Finally:
$$\left\| \int_X f\, d\mu \right\| = \lim_{n\to\infty} \left\| \int_X f_n\, d\mu \right\| \le \limsup_{n\to\infty} \|f_n\|_1 = \|f\|_1.$$
#19 (D.C.T.). Suppose $\{f_n\} \subset L^1_Y$ and $f \in L^1_Y$ are such that there exists $g \in L^1(d\mu)$ with $\|f_n(x)\| \le g(x)$ a.e. for all $n$ and $f_n(x) \to f(x)$ a.e. Then
$$\left\| \int f - \int f_n \right\| \le \int \|f - f_n\|\, d\mu \to 0 \text{ as } n \to \infty$$
by the real variable dominated convergence theorem.
7.9. Exercises.

Exercise 7.1. Let $\mu$ be a measure on an algebra $\mathcal{A} \subset \mathcal{P}(X)$, then $\mu(A) + \mu(B) = \mu(A \cup B) + \mu(A \cap B)$ for all $A, B \in \mathcal{A}$.

Exercise 7.2. Problem 12 on p. 27 of Folland. Let $(X, \mathcal{M}, \mu)$ be a finite measure space and for $A, B \in \mathcal{M}$ let $\rho(A, B) = \mu(A \triangle B)$ where $A \triangle B = (A \setminus B) \cup (B \setminus A)$. Define $A \sim B$ iff $\mu(A \triangle B) = 0$. Show $\sim$ is an equivalence relation, $\rho$ is a metric on $\mathcal{M}/\!\sim$ and $\mu(A) = \mu(B)$ if $A \sim B$. Also show that $\mu : (\mathcal{M}/\!\sim) \to [0, \infty)$ is a continuous function relative to the metric $\rho$.

Exercise 7.3. Suppose that $\mu_n : \mathcal{M} \to [0, \infty]$ are measures on $\mathcal{M}$ for $n \in \mathbb{N}$. Also suppose that $\mu_n(A)$ is increasing in $n$ for all $A \in \mathcal{M}$. Prove that $\mu : \mathcal{M} \to [0, \infty]$ defined by $\mu(A) := \lim_{n\to\infty} \mu_n(A)$ is also a measure.

Exercise 7.4. Now suppose that $\Lambda$ is some index set and for each $\lambda \in \Lambda$, $\mu_\lambda : \mathcal{M} \to [0, \infty]$ is a measure on $\mathcal{M}$. Define $\mu : \mathcal{M} \to [0, \infty]$ by $\mu(A) = \sum_{\lambda \in \Lambda} \mu_\lambda(A)$ for each $A \in \mathcal{M}$. Show that $\mu$ is also a measure.

Exercise 7.5. Let $(X, \mathcal{M}, \mu)$ be a measure space and $\rho : X \to [0, \infty]$ be a measurable function. For $A \in \mathcal{M}$, set $\nu(A) := \int_A \rho\, d\mu$.
(1) Show $\nu : \mathcal{M} \to [0, \infty]$ is a measure.
(2) Let $f : X \to [0, \infty]$ be a measurable function, show
$$\int_X f\, d\nu = \int_X f\rho\, d\mu. \tag{7.29}$$
Hint: first prove the relationship for characteristic functions, then for simple functions, and then for general positive measurable functions.
(3) Show that $f \in L^1(\nu)$ iff $f\rho \in L^1(\mu)$ and if $f \in L^1(\nu)$ then Eq. (7.29) still holds.

Notation 7.61. It is customary to informally describe $\nu$ defined in Exercise 7.5 by writing $d\nu = \rho\, d\mu$.

Exercise 7.6. Let $(X, \mathcal{M}, \mu)$ be a measure space, $(Y, \mathcal{F})$ be a measurable space and $f : X \to Y$ be a measurable map. Define a function $\nu : \mathcal{F} \to [0, \infty]$ by $\nu(A) := \mu(f^{-1}(A))$ for all $A \in \mathcal{F}$.
(1) Show $\nu$ is a measure. (We will write $\nu = f_*\mu$ or $\nu = \mu \circ f^{-1}$.)
(2) Show
$$\int_Y g\, d\nu = \int_X (g \circ f)\, d\mu \tag{7.30}$$
for all measurable functions $g : Y \to [0, \infty]$. Hint: see the hint from Exercise 7.5.
(3) Show $g \in L^1(\nu)$ iff $g \circ f \in L^1(\mu)$ and that Eq. (7.30) holds for all $g \in L^1(\nu)$.
Exercise 7.7. Let $F : \mathbb{R} \to \mathbb{R}$ be a $C^1$-function such that $F'(x) > 0$ for all $x \in \mathbb{R}$ and $\lim_{x\to\pm\infty} F(x) = \pm\infty$. (Notice that $F$ is strictly increasing so that $F^{-1} : \mathbb{R} \to \mathbb{R}$ exists and moreover, by the implicit function theorem, that $F^{-1}$ is a $C^1$ function.) Let $m$ be Lebesgue measure on $\mathcal{B}_{\mathbb{R}}$ and
$$\nu(A) = m(F(A)) = m\!\left( (F^{-1})^{-1}(A) \right) = \left( F^{-1}_* m \right)(A)$$
for all $A \in \mathcal{B}_{\mathbb{R}}$. Show $d\nu = F'\, dm$. Use this result to prove the change of variable formula,
$$\int_{\mathbb{R}} h \circ F \cdot F'\, dm = \int_{\mathbb{R}} h\, dm \tag{7.31}$$
which is valid for all Borel measurable functions $h : \mathbb{R} \to [0, \infty]$.

Hint: Start by showing $d\nu = F'\, dm$ on sets of the form $A = (a, b]$ with $a, b \in \mathbb{R}$ and $a < b$. Then use the uniqueness assertions in Theorem 7.8 to conclude $d\nu = F'\, dm$ on all of $\mathcal{B}_{\mathbb{R}}$. To prove Eq. (7.31) apply Exercise 7.6 with $g = h \circ F$ and $f = F^{-1}$.
Exercise 7.8. Let $(X, \mathcal{M}, \mu)$ be a measure space and $\{A_n\}_{n=1}^\infty \subset \mathcal{M}$, show
$$\mu(\{A_n \text{ a.a.}\}) \le \liminf_{n\to\infty} \mu(A_n)$$
and if $\mu(\cup_{m\ge n} A_m) < \infty$ for some $n$, then
$$\mu(\{A_n \text{ i.o.}\}) \ge \limsup_{n\to\infty} \mu(A_n).$$
Exercise 7.9 (Peano's Existence Theorem). Suppose $Z : \mathbb{R} \times \mathbb{R}^d \to \mathbb{R}^d$ is a bounded continuous function. Then for each $T < \infty$ (footnote 15: Using Corollary 10.12 below, we may in fact allow $T = \infty$.) there exists a solution to the differential equation
$$\dot{x}(t) = Z(t, x(t)) \text{ for } 0 \le t \le T \text{ with } x(0) = x_0. \tag{7.32}$$
Do this by filling in the following outline for the proof.
(1) Given $\epsilon > 0$, show there exists a unique function $x_\epsilon \in C([-\epsilon, \infty) \to \mathbb{R}^d)$ such that $x_\epsilon(t) \equiv x_0$ for $-\epsilon \le t \le 0$ and
$$x_\epsilon(t) = x_0 + \int_0^t Z(\tau, x_\epsilon(\tau - \epsilon))\, d\tau \text{ for all } t \ge 0. \tag{7.33}$$
Here
$$\int_0^t Z(\tau, x_\epsilon(\tau - \epsilon))\, d\tau = \left( \int_0^t Z_1(\tau, x_\epsilon(\tau - \epsilon))\, d\tau, \ldots, \int_0^t Z_d(\tau, x_\epsilon(\tau - \epsilon))\, d\tau \right)$$
where $Z = (Z_1, \ldots, Z_d)$ and the integrals are either the Lebesgue or the Riemann integral since they are equal on continuous functions. Hint: For $t \in [0, \epsilon]$, it follows from Eq. (7.33) that
$$x_\epsilon(t) = x_0 + \int_0^t Z(\tau, x_0)\, d\tau.$$
Now that $x_\epsilon(t)$ is known for $t \in [-\epsilon, \epsilon]$ it can be found by integration for $t \in [-\epsilon, 2\epsilon]$. The process can be repeated. (A numerical sketch of this step is given after this exercise.)
(2) Then use Exercise 3.39 to show there exists $\{\epsilon_k\}_{k=1}^\infty \subset (0, \infty)$ such that $\lim_{k\to\infty} \epsilon_k = 0$ and $x_{\epsilon_k}$ converges to some $x \in C([0, T])$ (relative to the sup-norm: $\|x\|_\infty = \sup_{t\in[0,T]} |x(t)|$) as $k \to \infty$.
(3) Pass to the limit in Eq. (7.33) with $\epsilon$ replaced by $\epsilon_k$ to show $x$ satisfies
$$x(t) = x_0 + \int_0^t Z(\tau, x(\tau))\, d\tau \quad \forall\, t \in [0, T].$$
(4) Conclude from this that $\dot{x}(t)$ exists for $t \in (0, T)$ and that $x$ solves Eq. (7.32).
(5) Apply what you have just proved to the ODE,
$$\dot{y}(t) = -Z(-t, y(t)) \text{ for } 0 \le t \le T \text{ with } y(0) = x_0.$$
Then extend $x(t)$ above to $[-T, T]$ by setting $x(t) = y(-t)$ if $t \in [-T, 0]$. Show $x$ so defined solves Eq. (7.32) for $t \in (-T, T)$.
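The sketch referred to in step (1) above is given here. It is only illustrative: the particular vector field $Z$, the trapezoid-rule quadrature, and the step sizes are ad hoc assumptions, not part of the exercise; the point is simply that the delayed equation (7.33) can be marched forward one interval of length $\epsilon$ at a time, and that the solutions for two different $\epsilon$'s are close.

```python
import numpy as np

# Solve x_eps(t) = x0 + int_0^t Z(tau, x_eps(tau - eps)) dtau, with x_eps = x0 for t <= 0,
# by marching forward; the delayed value is always already known from earlier steps.
def Z(t, x):                       # a bounded continuous vector field on R x R (example choice)
    return np.cos(t) - np.sin(x)

def delayed_solution(eps, T=2.0, x0=0.0, h=1e-3):
    n_hist = int(round(eps / h))   # number of grid steps making up the delay
    t = np.arange(0.0, T + h, h)
    x = np.full(len(t), x0)
    integral = 0.0
    for i in range(1, len(t)):
        x_del_prev = x[i - 1 - n_hist] if i - 1 - n_hist >= 0 else x0
        x_del_curr = x[i - n_hist] if i - n_hist >= 0 else x0
        integral += 0.5 * h * (Z(t[i - 1], x_del_prev) + Z(t[i], x_del_curr))
        x[i] = x0 + integral
    return t, x

t, x_coarse = delayed_solution(eps=0.2)
_, x_fine = delayed_solution(eps=0.02)
print("sup |x_0.2 - x_0.02| on [0, 2]:", np.abs(x_coarse - x_fine).max())  # shrinks as eps -> 0
```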
Exercise 7.10. Folland 2.12 on p. 52.

Exercise 7.11. Folland 2.13 on p. 52.

Exercise 7.12. Folland 2.14 on p. 52.

Exercise 7.13. Give examples of measurable functions $\{f_n\}$ on $\mathbb{R}$ such that $f_n$ decreases to 0 uniformly yet $\int f_n\, dm = \infty$ for all $n$. Also give an example of a sequence of measurable functions $\{g_n\}$ on $[0, 1]$ such that $g_n \to 0$ while $\int g_n\, dm = 1$ for all $n$.

Exercise 7.14. Folland 2.19 on p. 59.

Exercise 7.15. Suppose $\{a_n\}_{n=-\infty}^\infty \subset \mathbb{C}$ is a summable sequence (i.e. $\sum_{n=-\infty}^\infty |a_n| < \infty$), then $f(\theta) := \sum_{n=-\infty}^\infty a_n e^{in\theta}$ is a continuous function for $\theta \in \mathbb{R}$ and
$$a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) e^{-in\theta}\, d\theta.$$
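A quick numerical illustration of the coefficient formula in Exercise 7.15 is given below. The summable sequence $a_n = 1/(1+n^2)$, the truncation of the sum at $|n| \le 100$, and the trapezoid quadrature are all assumptions made only for the example.

```python
import numpy as np

# Form f(theta) = sum_n a_n e^{i n theta} for a summable sequence and recover a_3 via
#   a_n = (1 / 2 pi) int_{-pi}^{pi} f(theta) e^{-i n theta} d theta.
ns = np.arange(-100, 101)
a = 1.0 / (1.0 + ns.astype(float) ** 2)

theta = np.linspace(-np.pi, np.pi, 4001)
f = (a[:, None] * np.exp(1j * ns[:, None] * theta[None, :])).sum(axis=0)

n0 = 3
recovered = np.trapz(f * np.exp(-1j * n0 * theta), theta) / (2 * np.pi)
print(recovered.real, "vs", 1.0 / (1.0 + n0 ** 2))
```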
Exercise 7.16. Folland 2.26 on p. 59.
Exercise 7.17. Folland 2.28 on p. 59.
Exercise 7.18. Folland 2.31b on p. 60.
8. Fubini's Theorem

This next example gives a "real world" example of the fact that it is not always possible to interchange order of integration.

Example 8.1. Consider
$$\int_0^1 dy \int_1^\infty dx\, (e^{-xy} - 2e^{-2xy}) = \int_0^1 dy \left[ \frac{-e^{-xy} + e^{-2xy}}{y} \right]_{x=1}^{x=\infty} = \int_0^1 \frac{e^{-y} - e^{-2y}}{y}\, dy = \int_0^1 e^{-y} \left[ \frac{1 - e^{-y}}{y} \right] dy \in (0, \infty).$$
Note well that $\frac{1-e^{-y}}{y}$ has no singularity at 0. On the other hand
$$\int_1^\infty dx \int_0^1 dy\, (e^{-xy} - 2e^{-2xy}) = \int_1^\infty dx \left[ \frac{-e^{-xy} + e^{-2xy}}{x} \right]_{y=0}^{y=1} = \int_1^\infty \frac{e^{-2x} - e^{-x}}{x}\, dx = -\int_1^\infty e^{-x} \left[ \frac{1 - e^{-x}}{x} \right] dx \in (-\infty, 0).$$
Moral: $\int dx \int dy\, f(x, y) = \int dy \int dx\, f(x, y)$ is not always true.
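The sign mismatch in Example 8.1 can be checked numerically. In the sketch below the inner integrals are taken in the closed forms derived in the example; the grids and the truncation of the $x$-integral at $x = 60$ are assumptions made only for the computation.

```python
import numpy as np

# Two iterated integrals of f(x, y) = exp(-x y) - 2 exp(-2 x y) over (1, inf) x (0, 1):
# the dy-then-dx value is positive, the dx-then-dy value is negative.
y = np.linspace(1e-8, 1.0, 200001)
first = np.trapz(np.exp(-y) * (1.0 - np.exp(-y)) / y, y)       # int dy int dx f

x = np.linspace(1.0, 60.0, 200001)                              # tail beyond 60 is negligible
second = -np.trapz(np.exp(-x) * (1.0 - np.exp(-x)) / x, x)      # int dx int dy f

print("int dy int dx f =", first, "> 0")
print("int dx int dy f =", second, "< 0")
```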
In the remainder of this section we will let $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ be fixed measure spaces. Our main goals are to show:
(1) There exists a unique measure $\mu \otimes \nu$ on $\mathcal{M} \otimes \mathcal{N}$ such that $\mu \otimes \nu(A \times B) = \mu(A)\nu(B)$ for all $A \in \mathcal{M}$ and $B \in \mathcal{N}$ and
(2) For all $f : X \times Y \to [0, \infty]$ which are $\mathcal{M} \otimes \mathcal{N}$ measurable,
$$\int_{X\times Y} f\, d(\mu \otimes \nu) = \int_X d\mu(x) \int_Y d\nu(y)\, f(x, y) = \int_Y d\nu(y) \int_X d\mu(x)\, f(x, y).$$
Before proving such assertions, we will need a few more technical measure theoretic arguments which are of independent interest.
8.1. Measure Theoretic Arguments.

Definition 8.2. Let $\mathcal{C} \subset \mathcal{P}(X)$ be a collection of sets. We say:
(1) $\mathcal{C}$ is a monotone class if it is closed under countable increasing unions and countable decreasing intersections,
(2) $\mathcal{C}$ is a $\pi$-class if it is closed under finite intersections and
(3) $\mathcal{C}$ is a $\lambda$-class if $\mathcal{C}$ satisfies the following properties:
(a) $X \in \mathcal{C}$
(b) If $A, B \in \mathcal{C}$ and $A \cap B = \emptyset$, then $A \cup B \in \mathcal{C}$. (Closed under disjoint unions.)
(c) If $A, B \in \mathcal{C}$ and $B \subset A$, then $A \setminus B \in \mathcal{C}$. (Closed under proper differences.)
(d) If $A_n \in \mathcal{C}$ and $A_n \uparrow A$, then $A \in \mathcal{C}$. (Closed under countable increasing unions.)
(4) We will say $\mathcal{C}$ is a $\lambda_0$-class if $\mathcal{C}$ satisfies conditions a) - c) but not necessarily d).

Remark 8.3. Notice that every $\lambda$-class is also a monotone class.

(The reader wishing to shortcut this section may jump to Theorem 8.7 where he/she should then only read the second proof.)
Lemma 8.4 (Monotone Class Theorem). Suppose $\mathcal{A} \subset \mathcal{P}(X)$ is an algebra and $\mathcal{C}$ is the smallest monotone class containing $\mathcal{A}$. Then $\mathcal{C} = \sigma(\mathcal{A})$.

Proof. For $C \in \mathcal{C}$ let
$$\mathcal{C}(C) = \{ B \in \mathcal{C} : C \cap B,\ C \cap B^c,\ B \cap C^c \in \mathcal{C} \},$$
then $\mathcal{C}(C)$ is a monotone class. Indeed, if $B_n \in \mathcal{C}(C)$ and $B_n \uparrow B$, then $B_n^c \downarrow B^c$ and so
$$\mathcal{C} \ni C \cap B_n \uparrow C \cap B, \qquad \mathcal{C} \ni C \cap B_n^c \downarrow C \cap B^c \qquad \text{and} \qquad \mathcal{C} \ni B_n \cap C^c \uparrow B \cap C^c.$$
Since $\mathcal{C}$ is a monotone class, it follows that $C \cap B,\ C \cap B^c,\ B \cap C^c \in \mathcal{C}$, i.e. $B \in \mathcal{C}(C)$. This shows that $\mathcal{C}(C)$ is closed under increasing limits and a similar argument shows that $\mathcal{C}(C)$ is closed under decreasing limits. Thus we have shown that $\mathcal{C}(C)$ is a monotone class for all $C \in \mathcal{C}$.

If $A \in \mathcal{A} \subset \mathcal{C}$, then $A \cap B,\ A \cap B^c,\ B \cap A^c \in \mathcal{A} \subset \mathcal{C}$ for all $B \in \mathcal{A}$ and hence it follows that $\mathcal{A} \subset \mathcal{C}(A) \subset \mathcal{C}$. Since $\mathcal{C}$ is the smallest monotone class containing $\mathcal{A}$ and $\mathcal{C}(A)$ is a monotone class containing $\mathcal{A}$, we conclude that $\mathcal{C}(A) = \mathcal{C}$ for any $A \in \mathcal{A}$.

Let $B \in \mathcal{C}$ and notice that $A \in \mathcal{C}(B)$ happens iff $B \in \mathcal{C}(A)$. This observation and the fact that $\mathcal{C}(A) = \mathcal{C}$ for all $A \in \mathcal{A}$ implies $\mathcal{A} \subset \mathcal{C}(B) \subset \mathcal{C}$ for all $B \in \mathcal{C}$. Again since $\mathcal{C}$ is the smallest monotone class containing $\mathcal{A}$ and $\mathcal{C}(B)$ is a monotone class we conclude that $\mathcal{C}(B) = \mathcal{C}$ for all $B \in \mathcal{C}$. That is to say, if $A, B \in \mathcal{C}$ then $A \in \mathcal{C} = \mathcal{C}(B)$ and hence $A \cap B,\ A \cap B^c,\ A^c \cap B \in \mathcal{C}$. So $\mathcal{C}$ is closed under complements (since $X \in \mathcal{A} \subset \mathcal{C}$) and finite intersections and increasing unions from which it easily follows that $\mathcal{C}$ is a $\sigma$-algebra.

Let $\mathcal{E} \subset \mathcal{P}(X \times Y)$ be given by
$$\mathcal{E} = \mathcal{M} \times \mathcal{N} = \{ A \times B : A \in \mathcal{M},\ B \in \mathcal{N} \}$$
and recall from Exercise 6.2 that $\mathcal{E}$ is an elementary family. Hence the algebra $\mathcal{A} = \mathcal{A}(\mathcal{E})$ generated by $\mathcal{E}$ consists of sets which may be written as disjoint unions of sets from $\mathcal{E}$.
Theorem 8.5 (Uniqueness). Suppose that $\mathcal{E} \subset \mathcal{P}(X)$ is an elementary class and $\mathcal{M} = \sigma(\mathcal{E})$ (the $\sigma$-algebra generated by $\mathcal{E}$). If $\mu$ and $\nu$ are two measures on $\mathcal{M}$ which are $\sigma$-finite on $\mathcal{E}$ and such that $\mu = \nu$ on $\mathcal{E}$, then $\mu = \nu$ on $\mathcal{M}$.

Proof. Let $\mathcal{A} := \mathcal{A}(\mathcal{E})$ be the algebra generated by $\mathcal{E}$. Since every element of $\mathcal{A}$ is a disjoint union of elements from $\mathcal{E}$, it is clear that $\mu = \nu$ on $\mathcal{A}$. Henceforth we may assume that $\mathcal{E} = \mathcal{A}$. We begin first with the special case where $\mu(X) < \infty$ and hence $\nu(X) = \mu(X) < \infty$. Let
$$\mathcal{C} = \{ A \in \mathcal{M} : \mu(A) = \nu(A) \}.$$
The reader may easily check that $\mathcal{C}$ is a monotone class. Since $\mathcal{A} \subset \mathcal{C}$, the monotone class lemma asserts that $\mathcal{M} = \sigma(\mathcal{A}) \subset \mathcal{C} \subset \mathcal{M}$, showing that $\mathcal{C} = \mathcal{M}$ and hence that $\mu = \nu$ on $\mathcal{M}$.

For the $\sigma$-finite case, let $X_n \in \mathcal{A}$ be sets such that $\mu(X_n) = \nu(X_n) < \infty$ and $X_n \uparrow X$ as $n \to \infty$. For $n \in \mathbb{N}$, let
$$\mu_n(A) := \mu(A \cap X_n) \quad \text{and} \quad \nu_n(A) = \nu(A \cap X_n) \tag{8.1}$$
for all $A \in \mathcal{M}$. Then one easily checks that $\mu_n$ and $\nu_n$ are finite measures on $\mathcal{M}$ such that $\mu_n = \nu_n$ on $\mathcal{A}$. Therefore, by what we have just proved, $\mu_n = \nu_n$ on $\mathcal{M}$. Hence for all $A \in \mathcal{M}$, using the continuity of measures,
$$\mu(A) = \lim_{n\to\infty} \mu(A \cap X_n) = \lim_{n\to\infty} \nu(A \cap X_n) = \nu(A).$$
Lemma 8.6. If $\mathcal{D}$ is a $\lambda_0$-class which contains a $\pi$-class, $\mathcal{C}$, then $\mathcal{D}$ contains $\mathcal{A}(\mathcal{C})$, the algebra generated by $\mathcal{C}$.

Proof. We will give two proofs of this lemma. The first proof is constructive and makes use of Proposition 6.9 which tells how to construct $\mathcal{A}(\mathcal{C})$ from $\mathcal{C}$. The key to the first proof is the following claim which will be proved by induction.

Claim. Let $\tilde{\mathcal{C}}_0 = \mathcal{C}$ and $\tilde{\mathcal{C}}_n$ denote the collection of subsets of $X$ of the form
$$A_1^c \cap \cdots \cap A_n^c \cap B = B \setminus A_1 \setminus A_2 \setminus \cdots \setminus A_n \tag{8.2}$$
with $A_i \in \mathcal{C}$ and $B \in \mathcal{C} \cup \{X\}$. Then $\tilde{\mathcal{C}}_n \subset \mathcal{D}$ for all $n$, i.e. $\tilde{\mathcal{C}} := \cup_{n=0}^\infty \tilde{\mathcal{C}}_n \subset \mathcal{D}$.

By assumption $\tilde{\mathcal{C}}_0 \subset \mathcal{D}$ and when $n = 1$,
$$B \setminus A_1 = B \setminus (A_1 \cap B) \in \mathcal{D}$$
when $A_1, B \in \mathcal{C} \subset \mathcal{D}$ since $A_1 \cap B \in \mathcal{C} \subset \mathcal{D}$. Therefore, $\tilde{\mathcal{C}}_1 \subset \mathcal{D}$. For the induction step, let $B \in \mathcal{C} \cup \{X\}$ and $A_i \in \mathcal{C} \cup \{X\}$ and let $E_n$ denote the set in Eq. (8.2). We now assume $\tilde{\mathcal{C}}_n \subset \mathcal{D}$ and wish to show $E_{n+1} \in \mathcal{D}$, where
$$E_{n+1} = E_n \setminus A_{n+1} = E_n \setminus (A_{n+1} \cap E_n).$$
Because
$$A_{n+1} \cap E_n = A_1^c \cap \cdots \cap A_n^c \cap (B \cap A_{n+1}) \in \tilde{\mathcal{C}}_n \subset \mathcal{D}$$
and $(A_{n+1} \cap E_n) \subset E_n \in \tilde{\mathcal{C}}_n \subset \mathcal{D}$, we have $E_{n+1} \in \mathcal{D}$ as well. This finishes the proof of the claim.

Notice that $\tilde{\mathcal{C}}$ is still a multiplicative class and from Proposition 6.9 (using the fact that $\mathcal{C}$ is a multiplicative class), $\mathcal{A}(\mathcal{C})$ consists of finite unions of elements from $\tilde{\mathcal{C}}$. By applying the claim to $\tilde{\mathcal{C}}$, $A_1^c \cap \cdots \cap A_n^c \in \mathcal{D}$ for all $A_i \in \tilde{\mathcal{C}}$ and hence
$$A_1 \cup \cdots \cup A_n = (A_1^c \cap \cdots \cap A_n^c)^c \in \mathcal{D}.$$
Thus we have shown $\mathcal{A}(\mathcal{C}) \subset \mathcal{D}$ which completes the proof.

(Second Proof.) Without loss of generality, we may assume that $\mathcal{D}$ is the smallest $\lambda_0$-class containing $\mathcal{C}$, for if not just replace $\mathcal{D}$ by the intersection of all $\lambda_0$-classes containing $\mathcal{C}$. Let
$$\mathcal{D}_1 := \{ A \in \mathcal{D} : A \cap C \in \mathcal{D}\ \forall\, C \in \mathcal{C} \}.$$
Then $\mathcal{C} \subset \mathcal{D}_1$ and $\mathcal{D}_1$ is also a $\lambda_0$-class as we now check. a) $X \in \mathcal{D}_1$. b) If $A, B \in \mathcal{D}_1$ with $A \cap B = \emptyset$, then $(A \cup B) \cap C = (A \cap C) \coprod (B \cap C) \in \mathcal{D}$ for all $C \in \mathcal{C}$. c) If $A, B \in \mathcal{D}_1$ with $B \subset A$, then $(A \setminus B) \cap C = A \cap C \setminus (B \cap C) \in \mathcal{D}$ for all $C \in \mathcal{C}$. Since $\mathcal{C} \subset \mathcal{D}_1 \subset \mathcal{D}$ and $\mathcal{D}$ is the smallest $\lambda_0$-class containing $\mathcal{C}$ it follows that $\mathcal{D}_1 = \mathcal{D}$. From this we conclude that if $A \in \mathcal{D}$ and $B \in \mathcal{C}$ then $A \cap B \in \mathcal{D}$.

Let
$$\mathcal{D}_2 := \{ A \in \mathcal{D} : A \cap D \in \mathcal{D}\ \forall\, D \in \mathcal{D} \}.$$
Then $\mathcal{D}_2$ is a $\lambda_0$-class (as you should check) which, by the above paragraph, contains $\mathcal{C}$. As above this implies that $\mathcal{D} = \mathcal{D}_2$, i.e. we have shown that $\mathcal{D}$ is closed under finite intersections. Since $\lambda_0$-classes are closed under complementation, $\mathcal{D}$ is an algebra and hence $\mathcal{A}(\mathcal{C}) \subset \mathcal{D}$. In fact $\mathcal{D} = \mathcal{A}(\mathcal{C})$.
algebra and hence A(C) D. In fact D = A(C).
This Lemma along with the monotone class theorem immediately implies
Dynkins very useful theorem.
Theorem 8.7 ( Theorem). If D is a class which contains a contains a
-class, C, then (C) D.
Proof. Since D is a
0
class, Lemma 8.6 implies that A(C) D and so by
Remark 8.3 and Lemma 8.4, (C) D. Let us pause to give a second stand-alone
proof of this Theorem.
(Second Proof.) With out loss of generality, we may assume that D is the
smallest class containing C for if not just replace D by the intersection of all
classes containing C. Let
D
1
:= {A D : A C D C C}.
Then C D
1
and D
1
is also a class because as we now check. a) X D
1
. b)
If A, B D
1
with A B = , then (A B) C = (A C)
`
(B C) D for all
C C. c) If A, B D
1
with B A, then (A\ B) C = A C \ (B C) D for
all C C. d) If A
n
D
1
and A
n
A as n , then A
n
C D for all C D
and hence A
n
C A C D. Since C D
1
D and D is the smallest class
containing C it follows that D
1
= D. From this we conclude that if A D and
B C then A B D.
Let
D
2
:= {A D : A D D D D}.
Then D
2
is a class (as you should check) which, by the above paragraph, contains
C. As above this implies that D = D
2
, i.e. we have shown that D is closed under
nite intersections.
Since classes are closed under complementation, D is an algebra which is
closed under increasing unions and hence is closed under arbitrary countable unions,
i.e. D is a algebra. Since C D we must have (C) D and in fact (C) = D.
Using this theorem we may strengthen Theorem 8.5 to the following.

Theorem 8.8 (Uniqueness). Suppose that $\mathcal{C} \subset \mathcal{P}(X)$ is a $\pi$-class such that $\mathcal{M} = \sigma(\mathcal{C})$. If $\mu$ and $\nu$ are two measures on $\mathcal{M}$ and there exists $X_n \in \mathcal{C}$ such that $X_n \uparrow X$ and $\mu(X_n) = \nu(X_n) < \infty$ for each $n$, then $\mu = \nu$ on $\mathcal{M}$.

Proof. As in the proof of Theorem 8.5, it suffices to consider the case where $\mu$ and $\nu$ are finite measures such that $\mu(X) = \nu(X) < \infty$. In this case the reader may easily verify from the basic properties of measures that
$$\mathcal{D} = \{ A \in \mathcal{M} : \mu(A) = \nu(A) \}$$
is a $\lambda$-class. By assumption $\mathcal{C} \subset \mathcal{D}$ and hence by the $\pi$ - $\lambda$ theorem, $\mathcal{D}$ contains $\mathcal{M} = \sigma(\mathcal{C})$.

As an immediate consequence we have the following corollaries.

Corollary 8.9. Suppose that $(X, \tau)$ is a topological space, $\mathcal{B}_X = \sigma(\tau)$ is the Borel $\sigma$-algebra on $X$ and $\mu$ and $\nu$ are two measures on $\mathcal{B}_X$ which are $\sigma$-finite on $\tau$. If $\mu = \nu$ on $\tau$ then $\mu = \nu$ on $\mathcal{B}_X$, i.e. $\mu \equiv \nu$.

Corollary 8.10. Suppose that $\mu$ and $\nu$ are two measures on $\mathcal{B}_{\mathbb{R}^n}$ which are finite on bounded sets and such that $\mu(A) = \nu(A)$ for all sets $A$ of the form
$$A = (a, b] = (a_1, b_1] \times \cdots \times (a_n, b_n]$$
with $a, b \in \mathbb{R}^n$ and $a \le b$, i.e. $a_i \le b_i$ for all $i$. Then $\mu = \nu$ on $\mathcal{B}_{\mathbb{R}^n}$.
To end this section we wish to reformulate the $\pi$ - $\lambda$ theorem in a function theoretic setting.

Definition 8.11 (Bounded Convergence). Let $X$ be a set. We say that a sequence of functions $f_n$ from $X$ to $\mathbb{R}$ or $\mathbb{C}$ converges boundedly to a function $f$ if $\lim_{n\to\infty} f_n(x) = f(x)$ for all $x \in X$ and
$$\sup\{ |f_n(x)| : x \in X \text{ and } n = 1, 2, \ldots \} < \infty.$$

Theorem 8.12. Let $X$ be a set and $H$ be a subspace of $B(X, \mathbb{R})$, the space of bounded real valued functions on $X$. Assume:
(1) $1 \in H$, i.e. the constant functions are in $H$ and
(2) $H$ is closed under bounded convergence, i.e. if $\{f_n\}_{n=1}^\infty \subset H$ and $f_n \to f$ boundedly then $f \in H$.
If $\mathcal{C} \subset \mathcal{P}(X)$ is a multiplicative class such that $1_A \in H$ for all $A \in \mathcal{C}$, then $H$ contains all bounded $\sigma(\mathcal{C})$ measurable functions.

Proof. Let $\mathcal{D} := \{ A \subset X : 1_A \in H \}$. Then by assumption $\mathcal{C} \subset \mathcal{D}$ and since $1 \in H$ we know $X \in \mathcal{D}$. If $A, B \in \mathcal{D}$ are disjoint then $1_{A\cup B} = 1_A + 1_B \in H$ so that $A \cup B \in \mathcal{D}$, and if $A, B \in \mathcal{D}$ and $A \subset B$, then $1_{B\setminus A} = 1_B - 1_A \in H$. Finally if $A_n \in \mathcal{D}$ and $A_n \uparrow A$ as $n \to \infty$ then $1_{A_n} \to 1_A$ boundedly so $1_A \in H$ and hence $A \in \mathcal{D}$. So $\mathcal{D}$ is a $\lambda$-class containing $\mathcal{C}$ and hence $\mathcal{D}$ contains $\sigma(\mathcal{C})$. From this it follows that $H$ contains $1_A$ for all $A \in \sigma(\mathcal{C})$ and hence all $\sigma(\mathcal{C})$ measurable simple functions by linearity. The proof is now complete with an application of the approximation Theorem 7.12 along with the assumption that $H$ is closed under bounded convergence.
Corollary 8.13. Suppose that $(X, d)$ is a metric space, $\mathcal{B}_X = \sigma(\tau_d)$ is the Borel $\sigma$-algebra on $X$ and $H$ is a subspace of $B(X, \mathbb{R})$ such that $BC(X, \mathbb{R}) \subset H$ ($BC(X, \mathbb{R})$ is the space of bounded continuous functions on $X$) and $H$ is closed under bounded convergence. Then $H$ contains all bounded $\mathcal{B}_X$-measurable real valued functions on $X$. (This may be paraphrased as follows. The smallest vector space of bounded functions which is closed under bounded convergence and contains $BC(X, \mathbb{R})$ is the space of bounded $\mathcal{B}_X$-measurable real valued functions on $X$.)

Proof. Let $V \in \tau_d$ be an open subset of $X$ and for $n \in \mathbb{N}$ let
$$f_n(x) := \min(n \cdot d_{V^c}(x),\, 1) \quad \text{for all } x \in X.$$
Notice that $f_n = \phi_n \circ d_{V^c}$ where $\phi_n(t) = \min(nt, 1)$, which is continuous, and hence $f_n \in BC(X, \mathbb{R})$ for all $n$. Furthermore, $f_n$ converges boundedly to $1_V$ as $n \to \infty$ and therefore $1_V \in H$ for all $V \in \tau_d$. Since $\tau_d$ is a $\pi$-class the corollary follows by an application of Theorem 8.12.
Here is a basic application of this corollary.

Proposition 8.14. Suppose that $(X, d)$ is a metric space, $\mu$ and $\nu$ are two measures on $\mathcal{B}_X = \sigma(\tau_d)$ which are finite on bounded measurable subsets of $X$ and
$$\int_X f\, d\mu = \int_X f\, d\nu \tag{8.3}$$
for all $f \in BC_b(X, \mathbb{R})$ where
$$BC_b(X, \mathbb{R}) = \{ f \in BC(X, \mathbb{R}) : \operatorname{supp}(f) \text{ is bounded} \}.$$
Then $\mu \equiv \nu$.

Proof. To prove this fix an $o \in X$ and let
$$\psi_R(x) = \left( [R + 1 - d(x, o)] \wedge 1 \right) \vee 0$$
so that $\psi_R \in BC_b(X, [0, 1])$, $\operatorname{supp}(\psi_R) \subset B(o, R+2)$ and $\psi_R \uparrow 1$ as $R \to \infty$. Let $H_R$ denote the space of bounded measurable functions $f$ such that
$$\int_X \psi_R f\, d\mu = \int_X \psi_R f\, d\nu. \tag{8.4}$$
Then $H_R$ is closed under bounded convergence and, because of Eq. (8.3), contains $BC(X, \mathbb{R})$. Therefore by Corollary 8.13, $H_R$ contains all bounded measurable functions on $X$. Take $f = 1_A$ in Eq. (8.4) with $A \in \mathcal{B}_X$, and then use the monotone convergence theorem to let $R \to \infty$. The result is $\mu(A) = \nu(A)$ for all $A \in \mathcal{B}_X$.
Corollary 8.15. Let $(X, d)$ be a metric space, $\mathcal{B}_X = \sigma(\tau_d)$ be the Borel $\sigma$-algebra and $\mu : \mathcal{B}_X \to [0, \infty]$ be a measure such that $\mu(K) < \infty$ when $K$ is a compact subset of $X$. Assume further there exist compact sets $K_k \subset X$ such that $K_k^o \uparrow X$. Suppose that $H$ is a subspace of $B(X, \mathbb{R})$ such that $C_c(X, \mathbb{R}) \subset H$ ($C_c(X, \mathbb{R})$ is the space of continuous functions with compact support) and $H$ is closed under bounded convergence. Then $H$ contains all bounded $\mathcal{B}_X$-measurable real valued functions on $X$.

Proof. Let $k$ and $n$ be positive integers and set
$$\psi_{n,k}(x) = \min(1,\, n \cdot d_{(K_k^o)^c}(x)).$$
Then $\psi_{n,k} \in C_c(X, \mathbb{R})$ and $\{\psi_{n,k} \ne 0\} \subset K_k^o$. Let $H_{n,k}$ denote those bounded $\mathcal{B}_X$-measurable functions, $f : X \to \mathbb{R}$, such that $\psi_{n,k} f \in H$. It is easily seen that $H_{n,k}$ is closed under bounded convergence and that $H_{n,k}$ contains $BC(X, \mathbb{R})$, and therefore by Corollary 8.13, $\psi_{n,k} f \in H$ for all bounded measurable functions $f : X \to \mathbb{R}$. Since $\psi_{n,k} f \to 1_{K_k^o} f$ boundedly as $n \to \infty$, $1_{K_k^o} f \in H$ for all $k$, and similarly $1_{K_k^o} f \to f$ boundedly as $k \to \infty$ and therefore $f \in H$.
Here is another version of Proposition 8.14.

Proposition 8.16. Suppose that $(X, d)$ is a metric space, $\mu$ and $\nu$ are two measures on $\mathcal{B}_X = \sigma(\tau_d)$ which are both finite on compact sets. Further assume there exist compact sets $K_k \subset X$ such that $K_k^o \uparrow X$. If
$$\int_X f\, d\mu = \int_X f\, d\nu \tag{8.5}$$
for all $f \in C_c(X, \mathbb{R})$ then $\mu \equiv \nu$.

Proof. Let $\psi_{n,k}$ be defined as in the proof of Corollary 8.15 and let $H_{n,k}$ denote those bounded $\mathcal{B}_X$-measurable functions, $f : X \to \mathbb{R}$, such that
$$\int_X f \psi_{n,k}\, d\mu = \int_X f \psi_{n,k}\, d\nu.$$
By assumption $BC(X, \mathbb{R}) \subset H_{n,k}$ and one easily checks that $H_{n,k}$ is closed under bounded convergence. Therefore, by Corollary 8.13, $H_{n,k}$ contains all bounded measurable functions. In particular for $A \in \mathcal{B}_X$,
$$\int_X 1_A \psi_{n,k}\, d\mu = \int_X 1_A \psi_{n,k}\, d\nu.$$
Letting $n \to \infty$ in this equation, using the dominated convergence theorem, one shows
$$\int_X 1_A 1_{K_k^o}\, d\mu = \int_X 1_A 1_{K_k^o}\, d\nu$$
holds for all $k$. Finally using the monotone convergence theorem we may let $k \to \infty$ to conclude
$$\mu(A) = \int_X 1_A\, d\mu = \int_X 1_A\, d\nu = \nu(A)$$
for all $A \in \mathcal{B}_X$.
8.2. Fubini-Tonelli's Theorem and Product Measure. Recall that $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are fixed measure spaces.

Notation 8.17. Suppose that $f : X \to \mathbb{C}$ and $g : Y \to \mathbb{C}$ are functions, let $f \otimes g$ denote the function on $X \times Y$ given by
$$f \otimes g(x, y) = f(x)g(y).$$
Notice that if $f, g$ are measurable, then $f \otimes g$ is $(\mathcal{M} \otimes \mathcal{N}, \mathcal{B}_{\mathbb{C}})$ measurable. To prove this let $F(x, y) = f(x)$ and $G(x, y) = g(y)$ so that $f \otimes g = F \cdot G$ will be measurable provided that $F$ and $G$ are measurable. Now $F = f \circ \pi_1$ where $\pi_1 : X \times Y \to X$ is the projection map. This shows that $F$ is the composition of measurable functions and hence measurable. Similarly one shows that $G$ is measurable.
Theorem 8.18. Suppose $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are $\sigma$-finite measure spaces and $f$ is a nonnegative $(\mathcal{M} \otimes \mathcal{N}, \mathcal{B}_{\bar{\mathbb{R}}})$ measurable function, then for each $y \in Y$,
$$x \to f(x, y) \text{ is } \mathcal{M} \text{ -- } \mathcal{B}_{[0,\infty]} \text{ measurable}, \tag{8.6}$$
for each $x \in X$,
$$y \to f(x, y) \text{ is } \mathcal{N} \text{ -- } \mathcal{B}_{[0,\infty]} \text{ measurable}, \tag{8.7}$$
$$x \to \int_Y f(x, y)\, d\nu(y) \text{ is } \mathcal{M} \text{ -- } \mathcal{B}_{[0,\infty]} \text{ measurable}, \tag{8.8}$$
$$y \to \int_X f(x, y)\, d\mu(x) \text{ is } \mathcal{N} \text{ -- } \mathcal{B}_{[0,\infty]} \text{ measurable}, \tag{8.9}$$
and
$$\int_X d\mu(x) \int_Y d\nu(y)\, f(x, y) = \int_Y d\nu(y) \int_X d\mu(x)\, f(x, y). \tag{8.10}$$

Proof. Suppose that $E = A \times B \in \mathcal{E} := \mathcal{M} \times \mathcal{N}$ and $f = 1_E$. Then
$$f(x, y) = 1_{A\times B}(x, y) = 1_A(x)\, 1_B(y)$$
and one sees that Eqs. (8.6) and (8.7) hold. Moreover
$$\int_Y f(x, y)\, d\nu(y) = \int_Y 1_A(x)\, 1_B(y)\, d\nu(y) = 1_A(x)\, \nu(B),$$
so that Eq. (8.8) holds and we have
$$\int_X d\mu(x) \int_Y d\nu(y)\, f(x, y) = \nu(B)\mu(A). \tag{8.11}$$
Similarly,
$$\int_X f(x, y)\, d\mu(x) = \mu(A)\, 1_B(y) \quad \text{and} \quad \int_Y d\nu(y) \int_X d\mu(x)\, f(x, y) = \nu(B)\mu(A)$$
from which it follows that Eqs. (8.9) and (8.10) hold in this case as well.

For the moment let us further assume that $\mu(X) < \infty$ and $\nu(Y) < \infty$ and let $H$ be the collection of all bounded $(\mathcal{M} \otimes \mathcal{N}, \mathcal{B}_{\mathbb{R}})$ measurable functions on $X \times Y$ such that Eqs. (8.6) - (8.10) hold. Using the fact that measurable functions are closed under pointwise limits and the dominated convergence theorem (the dominating function always being a constant), one easily shows that $H$ is closed under bounded convergence. Since we have just verified that $1_E \in H$ for all $E$ in the $\pi$-class, $\mathcal{E}$, it follows that $H$ is the space of all bounded $(\mathcal{M} \otimes \mathcal{N}, \mathcal{B}_{\mathbb{R}})$ measurable functions on $X \times Y$. Finally if $f : X \times Y \to [0, \infty]$ is a $(\mathcal{M} \otimes \mathcal{N}, \mathcal{B}_{\bar{\mathbb{R}}})$ measurable function, let $f_M = M \wedge f$ so that $f_M \uparrow f$ as $M \to \infty$ and Eqs. (8.6) - (8.10) hold with $f$ replaced by $f_M$ for all $M \in \mathbb{N}$. Repeated use of the monotone convergence theorem allows us to pass to the limit $M \to \infty$ in these equations to deduce the theorem in the case $\mu$ and $\nu$ are finite measures.

For the $\sigma$-finite case, choose $X_n \in \mathcal{M}$, $Y_n \in \mathcal{N}$ such that $X_n \uparrow X$, $Y_n \uparrow Y$, $\mu(X_n) < \infty$ and $\nu(Y_n) < \infty$ for all $m, n \in \mathbb{N}$. Then define $\mu_m(A) = \mu(X_m \cap A)$ and $\nu_n(B) = \nu(Y_n \cap B)$ for all $A \in \mathcal{M}$ and $B \in \mathcal{N}$, or equivalently $d\mu_m = 1_{X_m}\, d\mu$ and $d\nu_n = 1_{Y_n}\, d\nu$. By what we have just proved, Eqs. (8.6) - (8.10) hold with $\mu$ replaced by $\mu_m$ and $\nu$ by $\nu_n$ for all $(\mathcal{M} \otimes \mathcal{N}, \mathcal{B}_{\bar{\mathbb{R}}})$ measurable functions, $f : X \times Y \to [0, \infty]$. The validity of Eqs. (8.6) - (8.10) then follows by passing to the limits $m \to \infty$ and then $n \to \infty$, using the monotone convergence theorem again to conclude
$$\int_X f\, d\mu_m = \int_X f 1_{X_m}\, d\mu \uparrow \int_X f\, d\mu \text{ as } m \to \infty$$
and
$$\int_Y g\, d\nu_n = \int_Y g 1_{Y_n}\, d\nu \uparrow \int_Y g\, d\nu \text{ as } n \to \infty$$
for all $f \in L^+(X, \mathcal{M})$ and $g \in L^+(Y, \mathcal{N})$.
Corollary 8.19. Suppose $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are $\sigma$-finite measure spaces. Then there exists a unique measure $\pi$ on $\mathcal{M} \otimes \mathcal{N}$ such that $\pi(A \times B) = \mu(A)\nu(B)$ for all $A \in \mathcal{M}$ and $B \in \mathcal{N}$. Moreover $\pi$ is given by
$$\pi(E) = \int_X d\mu(x) \int_Y d\nu(y)\, 1_E(x, y) = \int_Y d\nu(y) \int_X d\mu(x)\, 1_E(x, y) \tag{8.12}$$
for all $E \in \mathcal{M} \otimes \mathcal{N}$ and $\pi$ is $\sigma$-finite.

Notation 8.20. The measure $\pi$ is called the product measure of $\mu$ and $\nu$ and will be denoted by $\mu \otimes \nu$.

Proof. Notice that any measure $\pi$ such that $\pi(A \times B) = \mu(A)\nu(B)$ for all $A \in \mathcal{M}$ and $B \in \mathcal{N}$ is necessarily $\sigma$-finite. Indeed, let $X_n \in \mathcal{M}$ and $Y_n \in \mathcal{N}$ be chosen so that $\mu(X_n) < \infty$, $\nu(Y_n) < \infty$, $X_n \uparrow X$ and $Y_n \uparrow Y$; then $X_n \times Y_n \in \mathcal{M} \otimes \mathcal{N}$, $X_n \times Y_n \uparrow X \times Y$ and $\pi(X_n \times Y_n) < \infty$ for all $n$. The uniqueness assertion is a consequence of either Theorem 8.5 or Theorem 8.8 with $\mathcal{E} = \mathcal{M} \times \mathcal{N}$. For the existence, it suffices to observe, using the monotone convergence theorem, that $\pi$ defined in Eq. (8.12) is a measure on $\mathcal{M} \otimes \mathcal{N}$. Moreover this measure satisfies $\pi(A \times B) = \mu(A)\nu(B)$ for all $A \in \mathcal{M}$ and $B \in \mathcal{N}$ from Eq. (8.11).
Theorem 8.21 (Tonelli's Theorem). Suppose $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are $\sigma$-finite measure spaces and $\pi = \mu \otimes \nu$ is the product measure on $\mathcal{M} \otimes \mathcal{N}$. If $f \in L^+(X \times Y, \mathcal{M} \otimes \mathcal{N})$, then $f(\cdot, y) \in L^+(X, \mathcal{M})$ for all $y \in Y$, $f(x, \cdot) \in L^+(Y, \mathcal{N})$ for all $x \in X$,
$$\int_Y f(\cdot, y)\, d\nu(y) \in L^+(X, \mathcal{M}), \qquad \int_X f(x, \cdot)\, d\mu(x) \in L^+(Y, \mathcal{N})$$
and
$$\int_{X\times Y} f\, d\pi = \int_X d\mu(x) \int_Y d\nu(y)\, f(x, y) \tag{8.13}$$
$$\qquad = \int_Y d\nu(y) \int_X d\mu(x)\, f(x, y). \tag{8.14}$$

Proof. By Theorem 8.18 and Corollary 8.19, the theorem holds when $f = 1_E$ with $E \in \mathcal{M} \otimes \mathcal{N}$. Using the linearity of all of the statements, the theorem is also true for non-negative simple functions. Then using the monotone convergence theorem repeatedly along with Theorem 7.12, one deduces the theorem for general $f \in L^+(X \times Y, \mathcal{M} \otimes \mathcal{N})$.
Theorem 8.22 (Fubini's Theorem). Suppose $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are $\sigma$-finite measure spaces and $\pi = \mu \otimes \nu$ be the product measure on $\mathcal{M} \otimes \mathcal{N}$. If $f \in L^1(\pi)$ then for $\mu$ a.e. $x$, $f(x, \cdot) \in L^1(\nu)$ and for $\nu$ a.e. $y$, $f(\cdot, y) \in L^1(\mu)$. Moreover,
$$g(x) = \int_Y f(x, y)\, d\nu(y) \quad \text{and} \quad h(y) = \int_X f(x, y)\, d\mu(x)$$
are in $L^1(\mu)$ and $L^1(\nu)$ respectively and Eq. (8.14) holds.

Proof. If $f \in L^1(X \times Y) \cap L^+$ then by Eq. (8.13),
$$\int_X \left( \int_Y f(x, y)\, d\nu(y) \right) d\mu(x) < \infty$$
so $\int_Y f(x, y)\, d\nu(y) < \infty$ for $\mu$ a.e. $x$, i.e. for $\mu$ a.e. $x$, $f(x, \cdot) \in L^1(\nu)$. Similarly for $\nu$ a.e. $y$, $f(\cdot, y) \in L^1(\mu)$. Let $f$ be a real valued function in $L^1(X \times Y)$ and let $f = f_+ - f_-$. Apply the results just proved to $f_\pm$ to conclude $f_\pm(x, \cdot) \in L^1(\nu)$ for $\mu$ a.e. $x$ and that
$$\int_Y f_\pm(\cdot, y)\, d\nu(y) \in L^1(\mu).$$
Therefore for $\mu$ a.e. $x$,
$$f(x, \cdot) = f_+(x, \cdot) - f_-(x, \cdot) \in L^1(\nu)$$
and
$$x \to \int f(x, y)\, d\nu(y) = \int f_+(x, \cdot)\, d\nu(y) - \int f_-(x, \cdot)\, d\nu(y)$$
is a $\mu$-almost everywhere defined function such that $\int f(\cdot, y)\, d\nu(y) \in L^1(\mu)$. Because
$$\int f_\pm(x, y)\, d(\mu \otimes \nu) = \int d\mu(x) \int d\nu(y)\, f_\pm(x, y),$$
$$\int f\, d(\mu \otimes \nu) = \int f_+\, d(\mu \otimes \nu) - \int f_-\, d(\mu \otimes \nu) = \int d\mu \int d\nu\, f_+ - \int d\mu \int d\nu\, f_- = \int d\mu \left( \int f_+\, d\nu - \int f_-\, d\nu \right) = \int d\mu \int d\nu\, (f_+ - f_-) = \int d\mu \int d\nu\, f.$$
The proof that
$$\int f\, d(\mu \otimes \nu) = \int d\nu(y) \int d\mu(x)\, f(x, y)$$
is analogous. As usual the complex case follows by applying the real results just proved to the real and imaginary parts of $f$.
Notation 8.23. Given $E \subset X \times Y$ and $x \in X$, let
$$\,_xE := \{ y \in Y : (x, y) \in E \}.$$
Similarly if $y \in Y$ is given let
$$E_y := \{ x \in X : (x, y) \in E \}.$$
If $f : X \times Y \to \mathbb{C}$ is a function let $f_x = f(x, \cdot)$ and $f^y := f(\cdot, y)$ so that $f_x : Y \to \mathbb{C}$ and $f^y : X \to \mathbb{C}$.
Theorem 8.24. Suppose $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are complete $\sigma$-finite measure spaces. Let $(X \times Y, \mathcal{L}, \lambda)$ be the completion of $(X \times Y, \mathcal{M} \otimes \mathcal{N}, \mu \otimes \nu)$. If $f$ is $\mathcal{L}$-measurable and (a) $f \ge 0$ or (b) $f \in L^1(\lambda)$, then $f_x$ is $\mathcal{N}$-measurable for $\mu$ a.e. $x$ and $f^y$ is $\mathcal{M}$-measurable for $\nu$ a.e. $y$ and in case (b) $f_x \in L^1(\nu)$ and $f^y \in L^1(\mu)$ for $\mu$ a.e. $x$ and $\nu$ a.e. $y$ respectively. Moreover,
$$x \to \int f_x\, d\nu \quad \text{and} \quad y \to \int f^y\, d\mu$$
are measurable and
$$\int f\, d\lambda = \int d\nu \int d\mu\, f = \int d\mu \int d\nu\, f.$$

Proof. If $E \in \mathcal{M} \otimes \mathcal{N}$ is a $\mu \otimes \nu$ null set ($(\mu \otimes \nu)(E) = 0$), then
$$0 = (\mu \otimes \nu)(E) = \int_X \nu(\,_xE)\, d\mu(x) = \int_Y \mu(E_y)\, d\nu(y).$$
This shows that
$$\mu(\{ x : \nu(\,_xE) \ne 0 \}) = 0 \quad \text{and} \quad \nu(\{ y : \mu(E_y) \ne 0 \}) = 0,$$
i.e. $\nu(\,_xE) = 0$ for $\mu$ a.e. $x$ and $\mu(E_y) = 0$ for $\nu$ a.e. $y$.

If $h$ is $\mathcal{L}$ measurable and $h = 0$ for $\lambda$-a.e., then there exists $E \in \mathcal{M} \otimes \mathcal{N}$ such that $\{(x, y) : h(x, y) \ne 0\} \subset E$ and $(\mu \otimes \nu)(E) = 0$. Therefore $|h(x, y)| \le 1_E(x, y)$ and $(\mu \otimes \nu)(E) = 0$. Since
$$\{ h_x \ne 0 \} = \{ y \in Y : h(x, y) \ne 0 \} \subset \,_xE \quad \text{and} \quad \{ h^y \ne 0 \} = \{ x \in X : h(x, y) \ne 0 \} \subset E_y$$
we learn that for $\mu$ a.e. $x$ and $\nu$ a.e. $y$ that $\{h_x \ne 0\} \in \mathcal{N}$, $\{h^y \ne 0\} \in \mathcal{M}$, $\nu(\{h_x \ne 0\}) = 0$ and $\mu(\{h^y \ne 0\}) = 0$. This implies that for $\mu$ a.e. $x$, $\int h(x, y)\, d\nu(y)$ exists and equals 0, and for $\nu$ a.e. $y$, $\int h(x, y)\, d\mu(x)$ exists and equals 0. Therefore
$$0 = \int h\, d\lambda = \int \left( \int h\, d\nu \right) d\mu = \int \left( \int h\, d\mu \right) d\nu.$$
For general $f \in L^1(\lambda)$, we may choose $g \in L^1(\mathcal{M} \otimes \mathcal{N}, \mu \otimes \nu)$ such that $f(x, y) = g(x, y)$ for $\lambda$-a.e. $(x, y)$. Define $h \equiv f - g$. Then $h = 0$, $\lambda$-a.e. Hence by what we have just proved and Theorem 8.21, $f = g + h$ has the following properties:
(1) For $\mu$ a.e. $x$, $y \to f(x, y) = g(x, y) + h(x, y)$ is in $L^1(\nu)$ and
$$\int f(x, y)\, d\nu(y) = \int g(x, y)\, d\nu(y).$$
(2) For $\nu$ a.e. $y$, $x \to f(x, y) = g(x, y) + h(x, y)$ is in $L^1(\mu)$ and
$$\int f(x, y)\, d\mu(x) = \int g(x, y)\, d\mu(x).$$
From these assertions and Theorem 8.21, it follows that
$$\int d\mu(x) \int d\nu(y)\, f(x, y) = \int d\mu(x) \int d\nu(y)\, g(x, y) = \int d\nu(y) \int d\mu(x)\, g(x, y) = \int g(x, y)\, d(\mu \otimes \nu)(x, y) = \int f(x, y)\, d\lambda(x, y)$$
and similarly one shows
$$\int d\nu(y) \int d\mu(x)\, f(x, y) = \int f(x, y)\, d\lambda(x, y).$$
The previous theorems have obvious generalizations to products of any finite number of $\sigma$-finite measure spaces. For example the following theorem holds.

Theorem 8.25. Suppose $\{(X_i, \mathcal{M}_i, \mu_i)\}_{i=1}^n$ are $\sigma$-finite measure spaces and $X := X_1 \times \cdots \times X_n$. Then there exists a unique measure, $\pi$, on $(X, \mathcal{M}_1 \otimes \cdots \otimes \mathcal{M}_n)$ such that $\pi(A_1 \times \cdots \times A_n) = \mu_1(A_1) \cdots \mu_n(A_n)$ for all $A_i \in \mathcal{M}_i$. (This measure and its completion will be denoted by $\mu_1 \otimes \cdots \otimes \mu_n$.) If $f : X \to [0, \infty]$ is a measurable function then
$$\int_X f\, d\pi = \prod_{i=1}^n \int_{X_{\sigma(i)}} d\mu_{\sigma(i)}(x_{\sigma(i)})\; f(x_1, \ldots, x_n)$$
where $\sigma$ is any permutation of $\{1, 2, \ldots, n\}$ and the product of integrals denotes the corresponding iterated integral. This equation also holds for any $f \in L^1(X, \pi)$ and moreover, $f \in L^1(X, \pi)$ iff
$$\prod_{i=1}^n \int_{X_{\sigma(i)}} d\mu_{\sigma(i)}(x_{\sigma(i)})\; |f(x_1, \ldots, x_n)| < \infty$$
for some (and hence all) permutations, $\sigma$.

This theorem can be proved by the same methods as in the two factor case. Alternatively, one can use induction on $n$, see Exercise 8.6.
Example 8.26. We have
$$\int_0^\infty \frac{\sin x}{x}\, e^{-\Lambda x}\, dx = \frac{\pi}{2} - \arctan \Lambda \quad \text{for all } \Lambda > 0 \tag{8.15}$$
and for $\Lambda, M \in [0, \infty)$,
$$\left| \int_0^M \frac{\sin x}{x}\, e^{-\Lambda x}\, dx - \frac{\pi}{2} + \arctan \Lambda \right| \le C\, \frac{e^{-M\Lambda}}{M} \tag{8.16}$$
where $C = \max_{x\ge 0} \frac{1+x}{1+x^2} = \frac{1}{2\sqrt{2}-2} \cong 1.2$. In particular,
$$\lim_{M\to\infty} \int_0^M \frac{\sin x}{x}\, dx = \pi/2. \tag{8.17}$$
To verify these assertions, first notice that by the fundamental theorem of calculus,
$$|\sin x| = \left| \int_0^x \cos y\, dy \right| \le \left| \int_0^x |\cos y|\, dy \right| \le \left| \int_0^x 1\, dy \right| = |x|$$
so $\left| \frac{\sin x}{x} \right| \le 1$ for all $x \ne 0$. Making use of the identity
$$\int_0^\infty e^{-tx}\, dt = 1/x$$
and Fubini's theorem,
$$\int_0^M \frac{\sin x}{x}\, e^{-\Lambda x}\, dx = \int_0^M dx\, \sin x\, e^{-\Lambda x} \int_0^\infty e^{-tx}\, dt = \int_0^\infty dt \int_0^M dx\, \sin x\, e^{-(\Lambda+t)x}$$
$$= \int_0^\infty \frac{1 - (\cos M + (\Lambda+t)\sin M)\, e^{-M(\Lambda+t)}}{(\Lambda+t)^2 + 1}\, dt$$
$$= \int_0^\infty \frac{1}{(\Lambda+t)^2 + 1}\, dt - \int_0^\infty \frac{\cos M + (\Lambda+t)\sin M}{(\Lambda+t)^2 + 1}\, e^{-M(\Lambda+t)}\, dt$$
$$= \frac{\pi}{2} - \arctan \Lambda - \epsilon(M, \Lambda) \tag{8.18}$$
where
$$\epsilon(M, \Lambda) = \int_0^\infty \frac{\cos M + (\Lambda+t)\sin M}{(\Lambda+t)^2 + 1}\, e^{-M(\Lambda+t)}\, dt.$$
Since
$$\left| \frac{\cos M + (\Lambda+t)\sin M}{(\Lambda+t)^2 + 1} \right| \le \frac{1 + (\Lambda+t)}{(\Lambda+t)^2 + 1} \le C,$$
$$|\epsilon(M, \Lambda)| \le C \int_0^\infty e^{-M(\Lambda+t)}\, dt = C\, \frac{e^{-M\Lambda}}{M}.$$
This estimate along with Eq. (8.18) proves Eq. (8.16), from which Eq. (8.17) follows by taking $\Lambda \to 0$ and Eq. (8.15) follows (using the dominated convergence theorem again) by letting $M \to \infty$.
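Equations (8.15) and (8.17) are easy to check numerically. The sketch below is illustrative only; the grids and truncation points (80 and 400) are ad hoc assumptions chosen so the neglected tails are tiny.

```python
import numpy as np

# Check Eq. (8.15): int_0^inf (sin x / x) e^{-Lambda x} dx = pi/2 - arctan(Lambda).
x = np.linspace(1e-12, 80.0, 800001)
sinc = np.sin(x) / x
for Lam in (0.5, 1.0, 3.0):
    val = np.trapz(sinc * np.exp(-Lam * x), x)
    print("Lambda =", Lam, ":", val, "vs", np.pi / 2 - np.arctan(Lam))

# Check Eq. (8.17): int_0^M (sin x / x) dx -> pi/2; with M = 400 the error is O(1/M).
xM = np.linspace(1e-12, 400.0, 4000001)
print(np.trapz(np.sin(xM) / xM, xM), "vs pi/2 =", np.pi / 2)
```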
8.3. Lebesgue measure on $\mathbb{R}^d$.

Notation 8.27. Let
$$m^d := \underbrace{m \otimes \cdots \otimes m}_{d\ \text{times}} \ \text{ on } \ \mathcal{B}_{\mathbb{R}^d} = \underbrace{\mathcal{B}_{\mathbb{R}} \otimes \cdots \otimes \mathcal{B}_{\mathbb{R}}}_{d\ \text{times}}$$
be the $d$-fold product of Lebesgue measure $m$ on $\mathcal{B}_{\mathbb{R}}$. We will also use $m^d$ to denote its completion and let $\mathcal{L}_d$ be the completion of $\mathcal{B}_{\mathbb{R}^d}$ relative to $m^d$. A subset $A \in \mathcal{L}_d$ is called a Lebesgue measurable set and $m^d$ is called $d$-dimensional Lebesgue measure, or just Lebesgue measure for short.

Definition 8.28. A function $f : \mathbb{R}^d \to \mathbb{R}$ is Lebesgue measurable if $f^{-1}(\mathcal{B}_{\mathbb{R}}) \subset \mathcal{L}_d$.
Theorem 8.29. Lebesgue measure $m^d$ is translation invariant. Moreover $m^d$ is the unique translation invariant measure on $\mathcal{B}_{\mathbb{R}^d}$ such that $m^d((0, 1]^d) = 1$.

Proof. Let $A = J_1 \times \cdots \times J_d$ with $J_i \in \mathcal{B}_{\mathbb{R}}$ and $x \in \mathbb{R}^d$. Then
$$x + A = (x_1 + J_1) \times (x_2 + J_2) \times \cdots \times (x_d + J_d)$$
and therefore by translation invariance of $m$ on $\mathcal{B}_{\mathbb{R}}$ we find that
$$m^d(x + A) = m(x_1 + J_1) \cdots m(x_d + J_d) = m(J_1) \cdots m(J_d) = m^d(A)$$
and hence $m^d(x + A) = m^d(A)$ for all $A \in \mathcal{B}_{\mathbb{R}^d}$ by Corollary 8.10. From this fact we see that the measures $m^d(x + \cdot)$ and $m^d(\cdot)$ have the same null sets. Using this it is easily seen that $m(x + A) = m(A)$ for all $A \in \mathcal{L}_d$. The proof of the second assertion is Exercise 8.7.

Notation 8.30. I will often be sloppy in the sequel and write $m$ for $m^d$ and $dx$ for $dm(x) = dm^d(x)$. Hopefully the reader will understand the meaning from the context.
The following change of variable theorem is an important tool in using Lebesgue measure.

Theorem 8.31 (Change of Variables Theorem). Let $\Omega \subset_o \mathbb{R}^d$ be an open set and $T : \Omega \to T(\Omega) \subset_o \mathbb{R}^d$ be a $C^1$ diffeomorphism (footnote 16: That is, $T : \Omega \to T(\Omega) \subset_o \mathbb{R}^d$ is a continuously differentiable bijection and the inverse map $T^{-1} : T(\Omega) \to \Omega$ is also continuously differentiable.). Then for any Borel measurable function, $f : T(\Omega) \to [0, \infty]$,
$$\int_\Omega f \circ T\; |\det T'|\; dm = \int_{T(\Omega)} f\, dm, \tag{8.19}$$
where $T'(x)$ is the linear transformation on $\mathbb{R}^d$ defined by $T'(x)v := \frac{d}{dt}\big|_0 T(x + tv)$. Alternatively, the $ij$-matrix entry of $T'(x)$ is given by $T'(x)_{ij} = \partial T_j(x)/\partial x_i$ where $T(x) = (T_1(x), \ldots, T_d(x))$.

We will postpone the full proof of this theorem until Section 27. However we will give here the proof in the case that $T$ is linear. The following elementary remark will be used in the proof.

Remark 8.32. Suppose that
$$\Omega \xrightarrow{\ T\ } T(\Omega) \xrightarrow{\ S\ } S(T(\Omega))$$
are two $C^1$ diffeomorphisms and Theorem 8.31 holds for $T$ and $S$ separately; then it holds for the composition $S \circ T$. Indeed
$$\int_\Omega f \circ S \circ T\; |\det (S \circ T)'|\; dm = \int_\Omega f \circ S \circ T\; |\det (S' \circ T)\, T'|\; dm = \int_\Omega (|\det S'|\, f \circ S) \circ T\; |\det T'|\; dm = \int_{T(\Omega)} |\det S'|\, f \circ S\; dm = \int_{S(T(\Omega))} f\, dm.$$
Theorem 8.33. Suppose $T \in GL(d, \mathbb{R}) = GL(\mathbb{R}^d)$, the space of $d \times d$ invertible matrices.
(1) If $f : \mathbb{R}^d \to \mathbb{R}$ is Borel measurable then so is $f \circ T$ and if $f \ge 0$ or $f \in L^1$ then
$$\int_{\mathbb{R}^d} f(y)\, dy = |\det T| \int_{\mathbb{R}^d} f \circ T(x)\, dx. \tag{8.20}$$
(2) If $E \in \mathcal{L}_d$ then $T(E) \in \mathcal{L}_d$ and $m(T(E)) = |\det T|\, m(E)$.

Proof. Since $f$ is Borel measurable and $T : \mathbb{R}^d \to \mathbb{R}^d$ is continuous and hence Borel measurable, $f \circ T$ is also Borel measurable. We now break the proof of Eq. (8.20) into a number of cases. In each case we make use of Tonelli's theorem and the basic properties of one dimensional Lebesgue measure.
(1) Suppose that $i < k$ and
$$T(x_1, x_2, \ldots, x_d) = (x_1, \ldots, x_{i-1}, x_k, x_{i+1}, \ldots, x_{k-1}, x_i, x_{k+1}, \ldots, x_d);$$
then by Tonelli's theorem,
$$\int_{\mathbb{R}^d} f \circ T(x_1, \ldots, x_d)\, dm = \int_{\mathbb{R}^d} f(x_1, \ldots, x_k, \ldots, x_i, \ldots, x_d)\, dx_1 \ldots dx_d = \int_{\mathbb{R}^d} f(x_1, \ldots, x_d)\, dx_1 \ldots dx_d$$
which proves Eq. (8.20) in this case since $|\det T| = 1$.
(2) Suppose that $c \in \mathbb{R}$ and $T(x_1, \ldots, x_k, \ldots, x_d) = (x_1, \ldots, cx_k, \ldots, x_d)$, then
$$\int_{\mathbb{R}^d} f \circ T(x_1, \ldots, x_d)\, dm = \int_{\mathbb{R}^d} f(x_1, \ldots, cx_k, \ldots, x_d)\, dx_1 \ldots dx_k \ldots dx_d = |c|^{-1} \int_{\mathbb{R}^d} f(x_1, \ldots, x_d)\, dx_1 \ldots dx_d = |\det T|^{-1} \int_{\mathbb{R}^d} f\, dm$$
which again proves Eq. (8.20) in this case.
(3) Suppose that
$$T(x_1, x_2, \ldots, x_d) = (x_1, \ldots, \overset{i^{\text{th}}\ \text{spot}}{x_i + cx_k}, \ldots, x_k, \ldots, x_d).$$
Then
$$\int_{\mathbb{R}^d} f \circ T(x_1, \ldots, x_d)\, dm = \int_{\mathbb{R}^d} f(x_1, \ldots, x_i + cx_k, \ldots, x_k, \ldots, x_d)\, dx_1 \ldots dx_i \ldots dx_k \ldots dx_d = \int_{\mathbb{R}^d} f(x_1, \ldots, x_i, \ldots, x_k, \ldots, x_d)\, dx_1 \ldots dx_i \ldots dx_k \ldots dx_d = \int_{\mathbb{R}^d} f(x_1, \ldots, x_d)\, dx_1 \ldots dx_d$$
where in the second equality we did the $x_i$ integral first and used translation invariance of Lebesgue measure. Again this proves Eq. (8.20) in this case since $\det(T) = 1$.
Since every invertible matrix is a product of matrices of the type occurring in steps 1. - 3. above, it follows by Remark 8.32 that Eq. (8.20) holds in general. For the second assertion, let $E \in \mathcal{B}_{\mathbb{R}^d}$ and take $f = 1_E$ in Eq. (8.20) to find
$$|\det T|\, m(T^{-1}(E)) = |\det T| \int_{\mathbb{R}^d} 1_{T^{-1}(E)}\, dm = |\det T| \int_{\mathbb{R}^d} 1_E \circ T\, dm = \int_{\mathbb{R}^d} 1_E\, dm = m(E).$$
Replacing $T$ by $T^{-1}$ in this equation shows that
$$m(T(E)) = |\det T|\, m(E)$$
for all $E \in \mathcal{B}_{\mathbb{R}^d}$. In particular this shows that $m \circ T$ and $m$ have the same null sets and therefore the completion of $\mathcal{B}_{\mathbb{R}^d}$ is $\mathcal{L}_d$ for both measures. Using Proposition 7.6 one now easily shows
$$m(T(E)) = |\det T|\, m(E) \quad \forall\, E \in \mathcal{L}_d.$$
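Equation (8.20) is easy to test numerically for a concrete linear map. In the sketch below the particular matrix $T$, the Gaussian integrand $f(y) = e^{-|y|^2}$ on $\mathbb{R}^2$ (whose integral is $\pi$), and the uniform grid are assumptions made only for the example.

```python
import numpy as np

# Compare int f(y) dy with |det T| * int f(Tx) dx on R^2 for f(y) = exp(-|y|^2).
T = np.array([[2.0, 1.0],
              [0.5, 1.5]])
detT = abs(np.linalg.det(T))

g = np.linspace(-8.0, 8.0, 1601)
h = g[1] - g[0]
X, Y = np.meshgrid(g, g)
pts = np.stack([X, Y], axis=-1)

lhs = np.sum(np.exp(-(X**2 + Y**2))) * h * h            # int f(y) dy  (= pi exactly)

Tx = pts @ T.T                                          # T applied at every grid point
rhs = detT * np.sum(np.exp(-np.sum(Tx**2, axis=-1))) * h * h

print(lhs, rhs, np.pi)                                  # all three agree closely
```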
8.4. Polar Coordinates and Surface Measure. Let
$$S^{d-1} = \left\{ x \in \mathbb{R}^d : |x|^2 := \sum_{i=1}^d x_i^2 = 1 \right\}$$
be the unit sphere in $\mathbb{R}^d$. Let $\Phi : \mathbb{R}^d \setminus \{0\} \to (0, \infty) \times S^{d-1}$ and $\Phi^{-1}$ be the inverse map given by
$$\Phi(x) := \left( |x|, \frac{x}{|x|} \right) \quad \text{and} \quad \Phi^{-1}(r, \omega) = r\omega \tag{8.21}$$
respectively. Since $\Phi$ and $\Phi^{-1}$ are continuous, they are Borel measurable.

Consider the measure $\Phi_* m$ on $\mathcal{B}_{(0,\infty)} \otimes \mathcal{B}_{S^{d-1}}$ given by
$$\Phi_* m(A) := m\!\left( \Phi^{-1}(A) \right)$$
for all $A \in \mathcal{B}_{(0,\infty)} \otimes \mathcal{B}_{S^{d-1}}$. For $E \in \mathcal{B}_{S^{d-1}}$ and $a > 0$, let
$$E_a := \{ r\omega : r \in (0, a] \text{ and } \omega \in E \} = \Phi^{-1}((0, a] \times E) \in \mathcal{B}_{\mathbb{R}^d}.$$
Noting that $E_a = aE_1$, we have for $0 < a < b$, $E \in \mathcal{B}_{S^{d-1}}$ and $A = (a, b] \times E$ that
$$\Phi^{-1}(A) = \{ r\omega : r \in (a, b] \text{ and } \omega \in E \} \tag{8.22}$$
$$= bE_1 \setminus aE_1. \tag{8.23}$$
Therefore,
$$(\Phi_* m)((a, b] \times E) = m(bE_1 \setminus aE_1) = m(bE_1) - m(aE_1) = b^d m(E_1) - a^d m(E_1) = d \cdot m(E_1) \int_a^b r^{d-1}\, dr. \tag{8.24}$$
Let $\rho$ denote the unique measure on $\mathcal{B}_{(0,\infty)}$ such that
$$\rho(J) = \int_J r^{d-1}\, dr \tag{8.25}$$
for all $J \in \mathcal{B}_{(0,\infty)}$, i.e. $d\rho(r) = r^{d-1}\, dr$.

Definition 8.34. For $E \in \mathcal{B}_{S^{d-1}}$, let $\sigma(E) := d \cdot m(E_1)$. We call $\sigma$ the surface measure on $S^{d-1}$.

It is easy to check that $\sigma$ is a measure. Indeed if $E \in \mathcal{B}_{S^{d-1}}$, then $E_1 = \Phi^{-1}((0, 1] \times E) \in \mathcal{B}_{\mathbb{R}^d}$ so that $m(E_1)$ is well defined. Moreover if $E = \coprod_{i=1}^\infty E_i$, then $E_1 = \coprod_{i=1}^\infty (E_i)_1$ and
$$\sigma(E) = d \cdot m(E_1) = \sum_{i=1}^\infty d \cdot m((E_i)_1) = \sum_{i=1}^\infty \sigma(E_i).$$
The intuition behind this definition is as follows. If $E \subset S^{d-1}$ is a set and $\epsilon > 0$ is a small number, then the volume of
$$(1, 1+\epsilon] \cdot E = \{ r\omega : r \in (1, 1+\epsilon] \text{ and } \omega \in E \}$$
should be approximately given by $m((1, 1+\epsilon] \cdot E) \cong \epsilon\, \sigma(E)$, see Figure 16 below. On the other hand
$$m((1, 1+\epsilon] \cdot E) = m(E_{1+\epsilon} \setminus E_1) = \left( (1+\epsilon)^d - 1 \right) m(E_1).$$

[Figure 16. Motivating the definition of surface measure for a sphere.]

Therefore we expect the area of $E$ should be given by
$$\sigma(E) = \lim_{\epsilon \downarrow 0} \frac{\left( (1+\epsilon)^d - 1 \right) m(E_1)}{\epsilon} = d \cdot m(E_1).$$
According to these definitions and Eq. (8.24) we have shown that
$$\Phi_* m((a, b] \times E) = \rho((a, b]) \cdot \sigma(E). \tag{8.26}$$
Let
$$\mathcal{E} = \{ (a, b] \times E : 0 < a < b,\ E \in \mathcal{B}_{S^{d-1}} \},$$
then $\mathcal{E}$ is an elementary class. Since $\sigma(\mathcal{E}) = \mathcal{B}_{(0,\infty)} \otimes \mathcal{B}_{S^{d-1}}$, we conclude from Eq. (8.26) that
$$\Phi_* m = \rho \otimes \sigma$$
and this implies the following theorem.
Theorem 8.35. If $f : \mathbb{R}^d \to [0, \infty]$ is a $(\mathcal{B}_{\mathbb{R}^d}, \mathcal{B})$-measurable function then
$$\int_{\mathbb{R}^d} f(x)\, dm(x) = \int_{[0,\infty)\times S^{d-1}} f(r\,\omega)\, r^{d-1}\, dr\, d\sigma(\omega). \tag{8.27}$$
Let us now work out some integrals using Eq. (8.27).
Lemma 8.36. Let $a > 0$ and
$$I_d(a) := \int_{\mathbb{R}^d} e^{-a|x|^2}\, dm(x).$$
Then $I_d(a) = (\pi/a)^{d/2}$.

Proof. By Tonelli's theorem and induction,
$$I_d(a) = \int_{\mathbb{R}^{d-1}\times\mathbb{R}} e^{-a|y|^2} e^{-at^2}\, m_{d-1}(dy)\, dt = I_{d-1}(a)\, I_1(a) = I_1^d(a). \tag{8.28}$$
So it suffices to compute:
$$I_2(a) = \int_{\mathbb{R}^2} e^{-a|x|^2}\, dm(x) = \int_{\mathbb{R}^2\setminus\{0\}} e^{-a(x_1^2 + x_2^2)}\, dx_1\, dx_2.$$
We now make the change of variables,
$$x_1 = r\cos\theta \quad \text{and} \quad x_2 = r\sin\theta \quad \text{for } 0 < r < \infty \text{ and } 0 < \theta < 2\pi.$$
In vector form this transform is
$$x = T(r, \theta) = \begin{pmatrix} r\cos\theta \\ r\sin\theta \end{pmatrix}$$
and the differential and the Jacobian determinant are given by
$$T'(r, \theta) = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} \quad \text{and} \quad \det T'(r, \theta) = r\cos^2\theta + r\sin^2\theta = r.$$
Notice that $T : (0, \infty) \times (0, 2\pi) \to \mathbb{R}^2 \setminus \ell$ where $\ell$ is the ray, $\ell := \{(x, 0) : x \ge 0\}$, which is an $m^2$-null set. Hence by Tonelli's theorem and the change of variable theorem, for any Borel measurable function $f : \mathbb{R}^2 \to [0, \infty]$ we have
$$\int_{\mathbb{R}^2} f(x)\, dx = \int_0^{2\pi} \int_0^\infty f(r\cos\theta, r\sin\theta)\, r\, dr\, d\theta.$$
In particular,
$$I_2(a) = \int_0^\infty dr\, r \int_0^{2\pi} d\theta\, e^{-ar^2} = 2\pi \int_0^\infty r e^{-ar^2}\, dr = 2\pi \lim_{M\to\infty} \int_0^M r e^{-ar^2}\, dr = 2\pi \lim_{M\to\infty} \left. \frac{e^{-ar^2}}{-2a} \right|_0^M = \frac{2\pi}{2a} = \pi/a.$$
This shows that $I_2(a) = \pi/a$ and the result now follows from Eq. (8.28).
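A rough Monte Carlo check of Lemma 8.36 is given below. The sampling cube $[-L, L]^d$, the sample size, and the particular values of $a$ and $d$ are assumptions for illustration; the estimate agrees with $(\pi/a)^{d/2}$ to roughly a percent.

```python
import numpy as np

# Estimate I_d(a) = int_{R^d} exp(-a |x|^2) dm(x) by uniform sampling of a cube
# that carries essentially all of the mass, and compare with (pi / a)^{d/2}.
rng = np.random.default_rng(1)
a, d, L, N = 2.0, 3, 5.0, 2_000_000
x = rng.uniform(-L, L, size=(N, d))
vol_cube = (2 * L) ** d
estimate = vol_cube * np.mean(np.exp(-a * np.sum(x**2, axis=1)))
print(estimate, "vs", (np.pi / a) ** (d / 2))
```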
Corollary 8.37. The surface area $\sigma(S^{d-1})$ of the unit sphere $S^{d-1} \subset \mathbb{R}^d$ is
$$\sigma(S^{d-1}) = \frac{2\pi^{d/2}}{\Gamma(d/2)} \tag{8.29}$$
where $\Gamma$ is the gamma function given by
$$\Gamma(x) := \int_0^\infty u^{x-1} e^{-u}\, du. \tag{8.30}$$
Moreover, $\Gamma(1/2) = \sqrt{\pi}$, $\Gamma(1) = 1$ and $\Gamma(x+1) = x\Gamma(x)$ for $x > 0$.

Proof. We may alternatively compute $I_d(1) = \pi^{d/2}$ using Theorem 8.35;
$$I_d(1) = \int_0^\infty dr\, r^{d-1} e^{-r^2} \int_{S^{d-1}} d\sigma = \sigma(S^{d-1}) \int_0^\infty r^{d-1} e^{-r^2}\, dr.$$
We simplify this last integral by making the change of variables $u = r^2$ so that $r = u^{1/2}$ and $dr = \frac{1}{2} u^{-1/2}\, du$. The result is
$$\int_0^\infty r^{d-1} e^{-r^2}\, dr = \int_0^\infty u^{\frac{d-1}{2}} e^{-u}\, \frac{1}{2} u^{-1/2}\, du = \frac{1}{2} \int_0^\infty u^{\frac{d}{2}-1} e^{-u}\, du = \frac{1}{2}\Gamma(d/2). \tag{8.31}$$
Collecting these observations implies that
$$\pi^{d/2} = I_d(1) = \frac{1}{2}\, \sigma(S^{d-1})\, \Gamma(d/2)$$
which proves Eq. (8.29).

The computation of $\Gamma(1)$ is easy and is left to the reader. By Eq. (8.31),
$$\Gamma(1/2) = 2 \int_0^\infty e^{-r^2}\, dr = \int_{-\infty}^\infty e^{-r^2}\, dr = I_1(1) = \sqrt{\pi}.$$
The relation $\Gamma(x+1) = x\Gamma(x)$ is the consequence of the following integration by parts:
$$\Gamma(x+1) = \int_0^\infty e^{-u}\, u^{x+1}\, \frac{du}{u} = \int_0^\infty u^x \left( -\frac{d}{du} e^{-u} \right) du = x \int_0^\infty u^{x-1} e^{-u}\, du = x\,\Gamma(x).$$
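Since Definition 8.34 gives $\sigma(S^{d-1}) = d \cdot m(B(0,1))$, Eq. (8.29) can be checked by estimating the volume of the unit ball. The Monte Carlo sketch below is illustrative only; the sample size and dimensions are arbitrary choices.

```python
import numpy as np
from math import gamma, pi

# sigma(S^{d-1}) = d * m(unit ball); estimate the ball volume by sampling [-1, 1]^d
# and compare d * volume with 2 * pi^{d/2} / Gamma(d/2).
rng = np.random.default_rng(2)
for d in (2, 3, 4):
    N = 2_000_000
    x = rng.uniform(-1.0, 1.0, size=(N, d))
    ball_vol = (2.0 ** d) * np.mean(np.sum(x**2, axis=1) <= 1.0)
    print(d, d * ball_vol, "vs", 2 * pi ** (d / 2) / gamma(d / 2))
```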
8.5. Regularity of Measures.

Definition 8.38. Suppose that $\mathcal{E}$ is a collection of subsets of $X$, let $\mathcal{E}_\sigma$ denote the collection of subsets of $X$ which are finite or countable unions of sets from $\mathcal{E}$. Similarly let $\mathcal{E}_\delta$ denote the collection of subsets of $X$ which are finite or countable intersections of sets from $\mathcal{E}$. We also write $\mathcal{E}_{\sigma\delta} = (\mathcal{E}_\sigma)_\delta$ and $\mathcal{E}_{\delta\sigma} = (\mathcal{E}_\delta)_\sigma$, etc.

Remark 8.39. Notice that if $\mathcal{A}$ is an algebra and $C = \cup_i C_i$ and $D = \cup_j D_j$ with $C_i, D_j \in \mathcal{A}$, then
$$C \cap D = \cup_{i,j} (C_i \cap D_j) \in \mathcal{A}_\sigma$$
so that $\mathcal{A}_\sigma$ is closed under finite intersections.

The following theorem shows how to recover a measure $\mu$ on $\sigma(\mathcal{A})$ from its values on an algebra $\mathcal{A}$.
Theorem 8.40 (Regularity Theorem). Let $\mathcal{A} \subset \mathcal{P}(X)$ be an algebra of sets, $\mathcal{M} = \sigma(\mathcal{A})$ and $\mu : \mathcal{M} \to [0, \infty]$ be a measure on $\mathcal{M}$ which is $\sigma$-finite on $\mathcal{A}$. Then for all $A \in \mathcal{M}$,
$$\mu(A) = \inf \{ \mu(B) : A \subset B \in \mathcal{A}_\sigma \}. \tag{8.32}$$
Moreover, if $A \in \mathcal{M}$ and $\epsilon > 0$ are given, then there exists $B \in \mathcal{A}_\sigma$ such that $A \subset B$ and $\mu(B \setminus A) \le \epsilon$.

Proof. For $A \subset X$, define
$$\mu^*(A) = \inf \{ \mu(B) : A \subset B \in \mathcal{A}_\sigma \}.$$
We are trying to show $\mu^* = \mu$ on $\mathcal{M}$. We will begin by first assuming that $\mu$ is a finite measure, i.e. $\mu(X) < \infty$.

Let
$$\mathcal{F} = \{ B \in \mathcal{M} : \mu^*(B) = \mu(B) \} = \{ B \in \mathcal{M} : \mu^*(B) \le \mu(B) \}.$$
It is clear that $\mathcal{A} \subset \mathcal{F}$, so the finite case will be finished by showing $\mathcal{F}$ is a monotone class. Suppose $B_n \in \mathcal{F}$, $B_n \uparrow B$ as $n \to \infty$ and let $\epsilon > 0$ be given. Since $\mu^*(B_n) = \mu(B_n)$ there exists $A_n \in \mathcal{A}_\sigma$ such that $B_n \subset A_n$ and $\mu(A_n) \le \mu(B_n) + \epsilon 2^{-n}$, i.e.
$$\mu(A_n \setminus B_n) \le \epsilon 2^{-n}.$$
Let $A = \cup_n A_n \in \mathcal{A}_\sigma$, then $B \subset A$ and
$$\mu(A \setminus B) = \mu(\cup_n (A_n \setminus B)) \le \sum_{n=1}^\infty \mu(A_n \setminus B) \le \sum_{n=1}^\infty \mu(A_n \setminus B_n) \le \sum_{n=1}^\infty \epsilon 2^{-n} = \epsilon.$$
Therefore,
$$\mu^*(B) \le \mu(A) \le \mu(B) + \epsilon$$
and since $\epsilon > 0$ was arbitrary it follows that $B \in \mathcal{F}$.

Now suppose that $B_n \in \mathcal{F}$ and $B_n \downarrow B$ as $n \to \infty$ so that
$$\mu(B_n) \downarrow \mu(B) \text{ as } n \to \infty.$$
As above choose $A_n \in \mathcal{A}_\sigma$ such that $B_n \subset A_n$ and
$$0 \le \mu(A_n) - \mu(B_n) = \mu(A_n \setminus B_n) \le \epsilon 2^{-n}.$$
Combining the previous two equations shows that $\lim_{n\to\infty} \mu(A_n) = \mu(B)$. Since $\mu^*(B) \le \mu(A_n)$ for all $n$, we conclude that $\mu^*(B) \le \mu(B)$, i.e. that $B \in \mathcal{F}$.

Since $\mathcal{F}$ is a monotone class containing the algebra $\mathcal{A}$, the monotone class theorem asserts that
$$\mathcal{M} = \sigma(\mathcal{A}) \subset \mathcal{F} \subset \mathcal{M}$$
showing that $\mathcal{F} = \mathcal{M}$ and hence that $\mu^* = \mu$ on $\mathcal{M}$.

For the $\sigma$-finite case, let $X_n \in \mathcal{A}$ be sets such that $\mu(X_n) < \infty$ and $X_n \uparrow X$ as $n \to \infty$. Let $\mu_n$ be the finite measure on $\mathcal{M}$ defined by $\mu_n(A) := \mu(A \cap X_n)$ for all $A \in \mathcal{M}$. Suppose that $\epsilon > 0$ and $A \in \mathcal{M}$ are given. By what we have just proved, for all $A \in \mathcal{M}$, there exists $B_n \in \mathcal{A}_\sigma$ such that $A \subset B_n$ and
$$\mu((B_n \cap X_n) \setminus (A \cap X_n)) = \mu_n(B_n \setminus A) \le \epsilon 2^{-n}.$$
Notice that since $X_n \in \mathcal{A}_\sigma$, $B_n \cap X_n \in \mathcal{A}_\sigma$ and
$$B := \cup_{n=1}^\infty (B_n \cap X_n) \in \mathcal{A}_\sigma.$$
Moreover, $A \subset B$ and
$$\mu(B \setminus A) \le \sum_{n=1}^\infty \mu((B_n \cap X_n) \setminus A) \le \sum_{n=1}^\infty \mu((B_n \cap X_n) \setminus (A \cap X_n)) \le \sum_{n=1}^\infty \epsilon 2^{-n} = \epsilon.$$
Since this implies that
$$\mu(A) \le \mu(B) \le \mu(A) + \epsilon$$
and $\epsilon > 0$ is arbitrary, this equation shows that Eq. (8.32) holds.
Corollary 8.41. Let $\mathcal{A} \subset \mathcal{P}(X)$ be an algebra of sets, $\mathcal{M} = \sigma(\mathcal{A})$ and $\mu : \mathcal{M} \to [0, \infty]$ be a measure on $\mathcal{M}$ which is $\sigma$-finite on $\mathcal{A}$. Then for all $A \in \mathcal{M}$ and $\epsilon > 0$ there exists $B \in \mathcal{A}_\delta$ such that $B \subset A$ and
$$\mu(A \setminus B) < \epsilon.$$
Furthermore, for any $B \in \mathcal{M}$ there exists $A \in \mathcal{A}_{\delta\sigma}$ and $C \in \mathcal{A}_{\sigma\delta}$ such that $A \subset B \subset C$ and $\mu(C \setminus A) = 0$.

Proof. By Theorem 8.40, there exists $C \in \mathcal{A}_\sigma$ such that $A^c \subset C$ and $\mu(C \setminus A^c) \le \epsilon$. Let $B = C^c \subset A$ and notice that $B \in \mathcal{A}_\delta$ and that $C \setminus A^c = B^c \cap A = A \setminus B$, so that
$$\mu(A \setminus B) = \mu(C \setminus A^c) \le \epsilon.$$
Finally, given $B \in \mathcal{M}$, we may choose $A_n \in \mathcal{A}_\delta$ and $C_n \in \mathcal{A}_\sigma$ such that $A_n \subset B \subset C_n$ and $\mu(C_n \setminus B) \le 1/n$ and $\mu(B \setminus A_n) \le 1/n$. By replacing $A_N$ by $\cup_{n=1}^N A_n$ and $C_N$ by $\cap_{n=1}^N C_n$, we may assume that $A_n \uparrow$ and $C_n \downarrow$ as $n$ increases. Let $A = \cup A_n \in \mathcal{A}_{\delta\sigma}$ and $C = \cap C_n \in \mathcal{A}_{\sigma\delta}$, then $A \subset B \subset C$ and
$$\mu(C \setminus A) = \mu(C \setminus B) + \mu(B \setminus A) \le \mu(C_n \setminus B) + \mu(B \setminus A_n) \le 2/n \to 0 \text{ as } n \to \infty.$$
Corollary 8.42. Let $\mathcal{A} \subset \mathcal{P}(X)$ be an algebra of sets, $\mathcal{M} = \sigma(\mathcal{A})$ and $\mu : \mathcal{M} \to [0, \infty]$ be a measure on $\mathcal{M}$ which is $\sigma$-finite on $\mathcal{A}$. Then for every $B \in \mathcal{M}$ such that $\mu(B) < \infty$ and $\epsilon > 0$ there exists $D \in \mathcal{A}$ such that $\mu(B \triangle D) < \epsilon$.

Proof. By Corollary 8.41, there exists $C \in \mathcal{A}_\sigma$ such that $B \subset C$ and $\mu(C \setminus B) < \epsilon$. Now write $C = \cup_{n=1}^\infty C_n$ with $C_n \in \mathcal{A}$ for each $n$. By replacing $C_n$ by $\cup_{k=1}^n C_k \in \mathcal{A}$ if necessary, we may assume that $C_n \uparrow C$ as $n \to \infty$. Since $C_n \setminus B \uparrow C \setminus B$ and $B \setminus C_n \downarrow B \setminus C = \emptyset$ as $n \to \infty$ and $\mu(B \setminus C_1) \le \mu(B) < \infty$, we know that
$$\lim_{n\to\infty} \mu(C_n \setminus B) = \mu(C \setminus B) < \epsilon \quad \text{and} \quad \lim_{n\to\infty} \mu(B \setminus C_n) = \mu(B \setminus C) = 0.$$
Hence for $n$ sufficiently large,
$$\mu(B \triangle C_n) = \mu(C_n \setminus B) + \mu(B \setminus C_n) < \epsilon.$$
Hence we are done by taking $D = C_n \in \mathcal{A}$ for an $n$ sufficiently large.
Remark 8.43. We have to assume that $\mu(B) < \infty$ as the following example shows. Let $X = \mathbb{R}$, $\mathcal{M} = \mathcal{B}$, $\mu = m$, $\mathcal{A}$ be the algebra generated by half open intervals of the form $(a, b]$, and $B = \cup_{n=1}^\infty (2n, 2n+1]$. It is easily checked that for every $D \in \mathcal{A}$, $m(B \triangle D) = \infty$.
For Exercises 8.1 - 8.3 let $\tau \subset \mathcal{P}(X)$ be a topology, $\mathcal{M} = \sigma(\tau)$ and $\mu : \mathcal{M} \to [0, \infty)$ be a finite measure, i.e. $\mu(X) < \infty$.

Exercise 8.1. Let
$$\mathcal{F} := \{ A \in \mathcal{M} : \mu(A) = \inf \{ \mu(V) : A \subset V \in \tau \} \}. \tag{8.33}$$
(1) Show $\mathcal{F}$ may be described as the collection of sets $A \in \mathcal{M}$ such that for all $\epsilon > 0$ there exists $V \in \tau$ such that $A \subset V$ and $\mu(V \setminus A) < \epsilon$.
(2) Show $\mathcal{F}$ is a monotone class.

Exercise 8.2. Give an example of a topology $\tau$ on $X = \{1, 2\}$ and a measure $\mu$ on $\mathcal{M} = \sigma(\tau)$ such that $\mathcal{F}$ defined in Eq. (8.33) is not $\mathcal{M}$.

Exercise 8.3. Suppose now $\tau \subset \mathcal{P}(X)$ is a topology with the property that to every closed set $C \subset X$, there exists $V_n \in \tau$ such that $V_n \downarrow C$ as $n \to \infty$. Let $\mathcal{A} = \mathcal{A}(\tau)$ be the algebra generated by $\tau$.
(1) With the aid of Exercise 6.1, show that $\mathcal{A} \subset \mathcal{F}$. Therefore by Exercise 8.1 and the monotone class theorem, $\mathcal{F} = \mathcal{M}$, i.e.
$$\mu(A) = \inf \{ \mu(V) : A \subset V \in \tau \}.$$
(Hint: Recall the structure of $\mathcal{A}$ from Exercise 6.1.)
(2) Show this result is equivalent to the following statement: for every $\epsilon > 0$ and $A \in \mathcal{M}$ there exist a closed set $C$ and an open set $V$ such that $C \subset A \subset V$ and $\mu(V \setminus C) < \epsilon$. (Hint: Apply part 1. to both $A$ and $A^c$.)

Exercise 8.4 (Generalization to the $\sigma$-finite case). Let $\tau \subset \mathcal{P}(X)$ be a topology with the property that to every closed set $F \subset X$, there exists $V_n \in \tau$ such that $V_n \downarrow F$ as $n \to \infty$. Also let $\mathcal{M} = \sigma(\tau)$ and $\mu : \mathcal{M} \to [0, \infty]$ be a measure which is $\sigma$-finite on $\tau$.
(1) Show that for all $\epsilon > 0$ and $A \in \mathcal{M}$ there exist an open set $V \in \tau$ and a closed set $F$ such that $F \subset A \subset V$ and $\mu(V \setminus F) \le \epsilon$.
(2) Let $F_\sigma$ denote the collection of subsets of $X$ which may be written as a countable union of closed sets. Use item 1. to show for all $B \in \mathcal{M}$, there exists $C \in \tau_\delta$ ($\tau_\delta$ is customarily written as $G_\delta$) and $A \in F_\sigma$ such that $A \subset B \subset C$ and $\mu(C \setminus A) = 0$.

Exercise 8.5 (Metric Space Examples). Suppose that $(X, d)$ is a metric space and $\tau_d$ is the topology of $d$-open subsets of $X$. To each set $F \subset X$ and $\epsilon > 0$ let
$$F_\epsilon = \{ x \in X : d_F(x) < \epsilon \} = \cup_{x\in F} B_x(\epsilon) \in \tau_d.$$
Show that if $F$ is closed, then $F_\epsilon \downarrow F$ as $\epsilon \downarrow 0$ and in particular $V_n := F_{1/n} \in \tau_d$ are open sets decreasing to $F$. Therefore the results of Exercises 8.3 and 8.4 apply to measures on metric spaces with the Borel $\sigma$-algebra, $\mathcal{B} = \sigma(\tau_d)$.
Corollary 8.44. Let $X \subset \mathbb{R}^n$ be an open set and $\mathcal{B} = \mathcal{B}_X$ be the Borel $\sigma$-algebra on $X$ equipped with the standard topology induced by open balls with respect to the Euclidean distance. Suppose that $\mu : \mathcal{B} \to [0, \infty]$ is a measure such that $\mu(K) < \infty$ whenever $K$ is a compact set.
(1) Then for all $A \in \mathcal{B}$ and $\epsilon > 0$ there exist a closed set $F$ and an open set $V$ such that $F \subset A \subset V$ and $\mu(V \setminus F) < \epsilon$.
(2) If $\mu(A) < \infty$, the set $F$ in item 1. may be chosen to be compact.
(3) For all $A \in \mathcal{B}$ we may compute $\mu(A)$ using
$$\mu(A) = \inf\{ \mu(V) : A \subset V \text{ and } V \text{ is open} \} \tag{8.34}$$
$$\qquad = \sup\{ \mu(K) : K \subset A \text{ and } K \text{ is compact} \}. \tag{8.35}$$

Proof. For $k \in \mathbb{N}$, let
$$K_k := \{ x \in X : |x| \le k \text{ and } d_{X^c}(x) \ge 1/k \}. \tag{8.36}$$
Then $K_k$ is a closed and bounded subset of $\mathbb{R}^n$ and hence compact. Moreover $K_k^o \uparrow X$ as $k \to \infty$ since (footnote 17: In fact this is an equality, but we will not need this here.)
$$\{ x \in X : |x| < k \text{ and } d_{X^c}(x) > 1/k \} \subset K_k^o$$
and $\{ x \in X : |x| < k \text{ and } d_{X^c}(x) > 1/k \} \uparrow X$ as $k \to \infty$. This shows $\mu$ is $\sigma$-finite on $\tau_X$ and Item 1. follows from Exercises 8.4 and 8.5.

If $\mu(A) < \infty$ and $F \subset A \subset V$ as in item 1., then $K_k \cap F \uparrow F$ as $k \to \infty$ and therefore, since $\mu(V) < \infty$, $\mu(V \setminus K_k \cap F) \downarrow \mu(V \setminus F)$ as $k \to \infty$. Hence by choosing $k$ sufficiently large, $\mu(V \setminus K_k \cap F) < \epsilon$ and we may replace $F$ by the compact set $F \cap K_k$ and item 1. still holds. This proves item 2.

Item 3. Item 1. easily implies that Eq. (8.34) holds and item 2. implies Eq. (8.35) holds when $\mu(A) < \infty$. So we need only check Eq. (8.35) when $\mu(A) = \infty$. By Item 1. there is a closed set $F \subset A$ such that $\mu(A \setminus F) < 1$ and in particular $\mu(F) = \infty$. Since $K_n \cap F \uparrow F$, and $K_n \cap F$ is compact, it follows that the right side of Eq. (8.35) is infinite and hence equal to $\mu(A)$.
8.6. Exercises.

Exercise 8.6. Let $(X_j, \mathcal{M}_j, \mu_j)$ for $j = 1, 2, 3$ be $\sigma$-finite measure spaces. Let $F : (X_1 \times X_2) \times X_3 \to X_1 \times X_2 \times X_3$ be defined by
$$F((x_1, x_2), x_3) = (x_1, x_2, x_3).$$
(1) Show $F$ is $((\mathcal{M}_1 \otimes \mathcal{M}_2) \otimes \mathcal{M}_3, \mathcal{M}_1 \otimes \mathcal{M}_2 \otimes \mathcal{M}_3)$ measurable and $F^{-1}$ is $(\mathcal{M}_1 \otimes \mathcal{M}_2 \otimes \mathcal{M}_3, (\mathcal{M}_1 \otimes \mathcal{M}_2) \otimes \mathcal{M}_3)$ measurable. That is
$$F : ((X_1 \times X_2) \times X_3, (\mathcal{M}_1 \otimes \mathcal{M}_2) \otimes \mathcal{M}_3) \to (X_1 \times X_2 \times X_3, \mathcal{M}_1 \otimes \mathcal{M}_2 \otimes \mathcal{M}_3)$$
is a "measure theoretic isomorphism."
(2) Let $\pi := F_*[(\mu_1 \otimes \mu_2) \otimes \mu_3]$, i.e. $\pi(A) = [(\mu_1 \otimes \mu_2) \otimes \mu_3](F^{-1}(A))$ for all $A \in \mathcal{M}_1 \otimes \mathcal{M}_2 \otimes \mathcal{M}_3$. Then $\pi$ is the unique measure on $\mathcal{M}_1 \otimes \mathcal{M}_2 \otimes \mathcal{M}_3$ such that
$$\pi(A_1 \times A_2 \times A_3) = \mu_1(A_1)\mu_2(A_2)\mu_3(A_3)$$
for all $A_i \in \mathcal{M}_i$. We will write $\pi := \mu_1 \otimes \mu_2 \otimes \mu_3$.
(3) Let $f : X_1 \times X_2 \times X_3 \to [0, \infty]$ be a $(\mathcal{M}_1 \otimes \mathcal{M}_2 \otimes \mathcal{M}_3, \mathcal{B}_{\bar{\mathbb{R}}})$ measurable function. Verify that the identity,
$$\int_{X_1\times X_2\times X_3} f\, d\pi = \int_{X_3} \int_{X_2} \int_{X_1} f(x_1, x_2, x_3)\, d\mu_1(x_1)\, d\mu_2(x_2)\, d\mu_3(x_3),$$
makes sense and is correct. Also show the identity holds for any one of the six possible orderings of the iterated integrals.

Exercise 8.7. Prove the second assertion of Theorem 8.29. That is, show $m^d$ is the unique translation invariant measure on $\mathcal{B}_{\mathbb{R}^d}$ such that $m^d((0, 1]^d) = 1$. Hint: Look at the proof of Theorem 7.10.

Exercise 8.8. (Part of Folland Problem 2.46 on p. 69.) Let $X = [0, 1]$, $\mathcal{M} = \mathcal{B}_{[0,1]}$ be the Borel $\sigma$-field on $X$, $m$ be Lebesgue measure on $[0, 1]$ and $\nu$ be counting measure, $\nu(A) = \#(A)$. Finally let $D = \{(x, x) \in X^2 : x \in X\}$ be the diagonal in $X^2$. Show
$$\int_X \int_X 1_D(x, y)\, d\nu(y)\, dm(x) \ne \int_X \int_X 1_D(x, y)\, dm(x)\, d\nu(y)$$
by explicitly computing both sides of this equation.

Exercise 8.9. Folland Problem 2.48 on p. 69. (Fubini problem.)

Exercise 8.10. Folland Problem 2.50 on p. 69. (Note the $\mathcal{M} \times \mathcal{B}_{\mathbb{R}}$ should be $\mathcal{M} \otimes \mathcal{B}_{\mathbb{R}}$ in this problem.)

Exercise 8.11. Folland Problem 2.55 on p. 77. (Explicit integrations.)

Exercise 8.12. Folland Problem 2.56 on p. 77. Let $f \in L^1((0, a), dm)$, $g(x) = \int_x^a \frac{f(t)}{t}\, dt$ for $x \in (0, a)$, show $g \in L^1((0, a), dm)$ and
$$\int_0^a g(x)\, dx = \int_0^a f(t)\, dt.$$
Exercise 8.13. Show $\int_{\mathbb{R}} \left| \frac{\sin x}{x} \right| dm(x) = \infty$. So $\frac{\sin x}{x} \notin L^1([0, \infty), m)$ and $\int_0^\infty \frac{\sin x}{x}\, dm(x)$ is not defined as a Lebesgue integral.

Exercise 8.14. Folland Problem 2.57 on p. 77.

Exercise 8.15. Folland Problem 2.58 on p. 77.

Exercise 8.16. Folland Problem 2.60 on p. 77. Properties of the $\Gamma$ function.

Exercise 8.17. Folland Problem 2.61 on p. 77. Fractional integration.

Exercise 8.18. Folland Problem 2.62 on p. 80. Rotation invariance of surface measure on $S^{n-1}$.

Exercise 8.19. Folland Problem 2.64 on p. 80. On the integrability of $|x|^a |\log|x||^b$ for $x$ near 0 and $x$ near $\infty$ in $\mathbb{R}^n$.
9. $L^p$-spaces

Let $(X, \mathcal{M}, \mu)$ be a measure space and for $0 < p < \infty$ and a measurable function $f : X \to \mathbb{C}$ let
$$\|f\|_p \equiv \left( \int |f|^p\, d\mu \right)^{1/p}. \tag{9.1}$$
When $p = \infty$, let
$$\|f\|_\infty = \inf \{ a \ge 0 : \mu(|f| > a) = 0 \}. \tag{9.2}$$
For $0 < p \le \infty$, let
$$L^p(X, \mathcal{M}, \mu) = \{ f : X \to \mathbb{C} : f \text{ is measurable and } \|f\|_p < \infty \}/\!\sim$$
where $f \sim g$ iff $f = g$ a.e. Notice that $\|f - g\|_p = 0$ iff $f \sim g$ and if $f \sim g$ then $\|f\|_p = \|g\|_p$. In general we will (by abuse of notation) use $f$ to denote both the function $f$ and the equivalence class containing $f$.

Remark 9.1. Suppose that $\|f\|_\infty \le M$, then for all $a > M$, $\mu(|f| > a) = 0$ and therefore $\mu(|f| > M) = \lim_{n\to\infty} \mu(|f| > M + 1/n) = 0$, i.e. $|f(x)| \le M$ for $\mu$-a.e. $x$. Conversely, if $|f| \le M$ a.e. and $a > M$ then $\mu(|f| > a) = 0$ and hence $\|f\|_\infty \le M$. This leads to the identity:
$$\|f\|_\infty = \inf \{ a \ge 0 : |f(x)| \le a \text{ for } \mu\text{-a.e. } x \}.$$
Theorem 9.2 (Hölder's inequality). Suppose that $1 \le p \le \infty$ and $q := \frac{p}{p-1}$, or equivalently $p^{-1} + q^{-1} = 1$. If $f$ and $g$ are measurable functions then
$$\|fg\|_1 \le \|f\|_p \cdot \|g\|_q. \tag{9.3}$$
Assuming $p \in (1, \infty)$ and $\|f\|_p \cdot \|g\|_q < \infty$, equality holds in Eq. (9.3) iff $|f|^p$ and $|g|^q$ are linearly dependent as elements of $L^1$. If we further assume that $\|f\|_p$ and $\|g\|_q$ are positive then equality holds in Eq. (9.3) iff
$$|g|^q\, \|f\|_p^p = \|g\|_q^q\, |f|^p \text{ a.e.} \tag{9.4}$$

Proof. The cases where $\|f\|_q = 0$ or $\infty$ or $\|g\|_p = 0$ or $\infty$ are easy to deal with and are left to the reader. So we will now assume that $0 < \|f\|_q, \|g\|_p < \infty$. Let $s = |f|/\|f\|_p$ and $t = |g|/\|g\|_q$; then Lemma 2.27 implies
$$\frac{|fg|}{\|f\|_p \|g\|_q} \le \frac{1}{p}\, \frac{|f|^p}{\|f\|_p^p} + \frac{1}{q}\, \frac{|g|^q}{\|g\|_q^q} \tag{9.5}$$
with equality iff $|g/\|g\|_q| = |f|^{p-1}/\|f\|_p^{(p-1)} = |f|^{p/q}/\|f\|_p^{p/q}$, i.e. $|g|^q \|f\|_p^p = \|g\|_q^q |f|^p$. Integrating Eq. (9.5) implies
$$\frac{\|fg\|_1}{\|f\|_p \|g\|_q} \le \frac{1}{p} + \frac{1}{q} = 1$$
with equality iff Eq. (9.4) holds. The proof is finished since it is easily checked that equality holds in Eq. (9.3) when $|f|^p = c|g|^q$ or $|g|^q = c|f|^p$ for some constant $c$.
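Hölder's inequality and its equality case (9.4) can be illustrated on a finite measure space. In the sketch below, counting measure on $\{1, \ldots, 1000\}$ and the random vectors are assumptions made only for the example.

```python
import numpy as np

# ||fg||_1 <= ||f||_p ||g||_q with 1/p + 1/q = 1; equality when |g|^q is a multiple of |f|^p.
rng = np.random.default_rng(3)
p = 3.0
q = p / (p - 1)
f = rng.standard_normal(1000)
g = rng.standard_normal(1000)

lhs = np.sum(np.abs(f * g))
rhs = np.sum(np.abs(f) ** p) ** (1 / p) * np.sum(np.abs(g) ** q) ** (1 / q)
print(lhs, "<=", rhs)

g_eq = np.abs(f) ** (p / q)           # then |g_eq|^q = |f|^p, the equality case (9.4)
print(np.sum(np.abs(f * g_eq)), "=",
      np.sum(np.abs(f) ** p) ** (1 / p) * np.sum(g_eq ** q) ** (1 / q))
```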
The following corollary is an easy extension of Hölder's inequality.

Corollary 9.3. Suppose that $f_i : X \to \mathbb{C}$ are measurable functions for $i = 1, \ldots, n$ and $p_1, \ldots, p_n$ and $r$ are positive numbers such that $\sum_{i=1}^n p_i^{-1} = r^{-1}$, then
$$\left\| \prod_{i=1}^n f_i \right\|_r \le \prod_{i=1}^n \|f_i\|_{p_i} \quad \text{where} \quad \sum_{i=1}^n p_i^{-1} = r^{-1}.$$

Proof. To prove this inequality, start with $n = 2$; then for any $p \in [1, \infty]$,
$$\|fg\|_r^r = \int f^r g^r\, d\mu \le \|f^r\|_p\, \|g^r\|_{p^*}$$
where $p^* = \frac{p}{p-1}$ is the conjugate exponent. Let $p_1 = pr$ and $p_2 = p^* r$ so that $p_1^{-1} + p_2^{-1} = r^{-1}$ as desired. Then the previous equation states that
$$\|fg\|_r \le \|f\|_{p_1}\, \|g\|_{p_2}$$
as desired. The general case is now proved by induction. Indeed,
$$\left\| \prod_{i=1}^{n+1} f_i \right\|_r = \left\| \prod_{i=1}^n f_i \cdot f_{n+1} \right\|_r \le \left\| \prod_{i=1}^n f_i \right\|_q \|f_{n+1}\|_{p_{n+1}}$$
where $q^{-1} + p_{n+1}^{-1} = r^{-1}$. Since $\sum_{i=1}^n p_i^{-1} = q^{-1}$, we may now use the induction hypothesis to conclude
$$\left\| \prod_{i=1}^n f_i \right\|_q \le \prod_{i=1}^n \|f_i\|_{p_i},$$
which combined with the previous displayed equation proves the generalized form of Hölder's inequality.
Theorem 9.4 (Minkowski's Inequality). If $1 \le p \le \infty$ and $f, g \in L^p$ then
$$\|f + g\|_p \le \|f\|_p + \|g\|_p. \tag{9.6}$$
Moreover if $p < \infty$, then equality holds in this inequality iff
$$\operatorname{sgn}(f) = \operatorname{sgn}(g) \text{ when } p = 1 \text{ and } f = cg \text{ or } g = cf \text{ for some } c > 0 \text{ when } p > 1.$$

Proof. When $p = \infty$, $|f| \le \|f\|_\infty$ a.e. and $|g| \le \|g\|_\infty$ a.e. so that $|f + g| \le |f| + |g| \le \|f\|_\infty + \|g\|_\infty$ a.e. and therefore
$$\|f + g\|_\infty \le \|f\|_\infty + \|g\|_\infty.$$
When $p < \infty$,
$$|f + g|^p \le (2\max(|f|, |g|))^p = 2^p \max(|f|^p, |g|^p) \le 2^p (|f|^p + |g|^p),$$
so $\|f + g\|_p^p \le 2^p \left( \|f\|_p^p + \|g\|_p^p \right) < \infty$.
In case $p = 1$,
$$\|f + g\|_1 = \int_X |f + g|\, d\mu \le \int_X |f|\, d\mu + \int_X |g|\, d\mu$$
with equality iff $|f| + |g| = |f + g|$ a.e. which happens iff $\operatorname{sgn}(f) = \operatorname{sgn}(g)$ a.e.
In case $p \in (1, \infty)$, we may assume $\|f + g\|_p$, $\|f\|_p$ and $\|g\|_p$ are all positive since otherwise the theorem is easily verified. Now
$$|f + g|^p = |f + g|\, |f + g|^{p-1} \le (|f| + |g|)\, |f + g|^{p-1}$$
with equality iff $\operatorname{sgn}(f) = \operatorname{sgn}(g)$. Integrating this equation and applying Hölder's inequality with $q = p/(p-1)$ gives
$$\int_X |f + g|^p\, d\mu \le \int_X |f|\, |f + g|^{p-1}\, d\mu + \int_X |g|\, |f + g|^{p-1}\, d\mu \le (\|f\|_p + \|g\|_p)\, \| |f + g|^{p-1} \|_q \tag{9.7}$$
with equality iff
$$\operatorname{sgn}(f) = \operatorname{sgn}(g) \text{ and } \left( \frac{|f|}{\|f\|_p} \right)^p = \frac{|f + g|^p}{\|f + g\|_p^p} = \left( \frac{|g|}{\|g\|_p} \right)^p \text{ a.e.} \tag{9.8}$$
Therefore
$$\| |f + g|^{p-1} \|_q^q = \int_X (|f + g|^{p-1})^q\, d\mu = \int_X |f + g|^p\, d\mu. \tag{9.9}$$
Combining Eqs. (9.7) and (9.9) implies
$$\|f + g\|_p^p \le \|f\|_p\, \|f + g\|_p^{p/q} + \|g\|_p\, \|f + g\|_p^{p/q} \tag{9.10}$$
with equality iff Eq. (9.8) holds, which happens iff $f = cg$ a.e. with $c > 0$. Solving for $\|f + g\|_p$ in Eq. (9.10) gives Eq. (9.6).
in Eq. (9.10) gives Eq. (9.6).
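Not part of the original notes: a small numerical check of the triangle inequality (9.6) for several exponents, again with counting measure and assuming NumPy.

    import numpy as np

    rng = np.random.default_rng(1)
    f, g = rng.standard_normal(500), rng.standard_normal(500)
    for p in (1.0, 1.5, 2.0, 4.0):
        norm = lambda h: np.sum(np.abs(h) ** p) ** (1 / p)   # ||.||_p, counting measure
        assert norm(f + g) <= norm(f) + norm(g) + 1e-12      # Minkowski (9.6)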
The next theorem gives another example of using Hölder's inequality.
Theorem 9.5. Suppose that $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are $\sigma$-finite measure spaces, $p \in [1, \infty]$, $q = p/(p-1)$ and $k : X \times Y \to \mathbb{C}$ is a $\mathcal{M} \otimes \mathcal{N}$ measurable function. Assume there exist finite constants $C_1$ and $C_2$ such that
$$\int_X |k(x, y)| \, d\mu(x) \le C_1 \ \text{ for } \nu\text{-a.e. } y \quad \text{and} \quad \int_Y |k(x, y)| \, d\nu(y) \le C_2 \ \text{ for } \mu\text{-a.e. } x.$$
If $f \in L^p(\nu)$, then
$$\int_Y |k(x, y) f(y)| \, d\nu(y) < \infty \ \text{ for } \mu\text{-a.e. } x,$$
$x \mapsto Kf(x) := \int_Y k(x, y) f(y)\, d\nu(y)$ is in $L^p(\mu)$ and
(9.11) $\quad \|Kf\|_{L^p(\mu)} \le C_1^{1/p} C_2^{1/q} \|f\|_{L^p(\nu)}.$
Proof. Suppose $p \in (1, \infty)$ to begin with and let $q = p/(p-1)$, then by Hölder's inequality,
$$\int_Y |k(x, y) f(y)| \, d\nu(y) = \int_Y |k(x, y)|^{1/q} |k(x, y)|^{1/p} |f(y)| \, d\nu(y)$$
$$\le \left(\int_Y |k(x, y)| \, d\nu(y)\right)^{1/q} \left(\int_Y |k(x, y)|\, |f(y)|^p \, d\nu(y)\right)^{1/p} \le C_2^{1/q} \left(\int_Y |k(x, y)|\, |f(y)|^p \, d\nu(y)\right)^{1/p}.$$
Therefore, using Tonelli's theorem,
$$\Big\| \int_Y |k(\cdot, y) f(y)| \, d\nu(y) \Big\|_p^p \le C_2^{p/q} \int_X d\mu(x) \int_Y d\nu(y)\, |k(x, y)|\, |f(y)|^p = C_2^{p/q} \int_Y d\nu(y)\, |f(y)|^p \int_X d\mu(x)\, |k(x, y)| \le C_2^{p/q} C_1 \int_Y d\nu(y)\, |f(y)|^p = C_2^{p/q} C_1 \|f\|_p^p.$$
From this it follows that $x \mapsto Kf(x) := \int_Y k(x, y) f(y)\, d\nu(y)$ is in $L^p(\mu)$ and that Eq. (9.11) holds.
Similarly, if $p = \infty$,
$$\int_Y |k(x, y) f(y)| \, d\nu(y) \le \|f\|_\infty \int_Y |k(x, y)| \, d\nu(y) \le C_2 \|f\|_\infty \ \text{ for } \mu\text{-a.e. } x,$$
so that $\|Kf\|_{L^\infty(\mu)} \le C_2 \|f\|_{L^\infty(\nu)}$. If $p = 1$, then
$$\int_X d\mu(x) \int_Y d\nu(y)\, |k(x, y) f(y)| = \int_Y d\nu(y)\, |f(y)| \int_X d\mu(x)\, |k(x, y)| \le C_1 \int_Y d\nu(y)\, |f(y)|$$
which shows $\|Kf\|_{L^1(\mu)} \le C_1 \|f\|_{L^1(\nu)}$.
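Not part of the original notes: a discrete illustration of Theorem 9.5, with $\mu$ and $\nu$ counting measures on finite sets so that $k$ is a matrix, $Kf$ is a matrix-vector product, and $C_1$, $C_2$ are the largest column and row sums of $|k|$; NumPy assumed, names are mine.

    import numpy as np

    rng = np.random.default_rng(2)
    k = rng.random((30, 40))               # kernel k(x, y) >= 0, x in 30 points, y in 40
    C1 = np.abs(k).sum(axis=0).max()       # sup over y of the column sums (int over x)
    C2 = np.abs(k).sum(axis=1).max()       # sup over x of the row sums (int over y)
    p = 2.5
    q = p / (p - 1.0)

    f = rng.standard_normal(40)
    Kf = k @ f                             # (Kf)(x) = sum_y k(x, y) f(y)
    lhs = np.sum(np.abs(Kf) ** p) ** (1 / p)
    rhs = C1 ** (1 / p) * C2 ** (1 / q) * np.sum(np.abs(f) ** p) ** (1 / p)
    assert lhs <= rhs + 1e-9               # the bound (9.11)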
9.1. Jensen's Inequality.
Definition 9.6. A function $\varphi : (a, b) \to \mathbb{R}$ is convex if for all $a < x_0 < x_1 < b$ and $t \in [0, 1]$, $\varphi(x_t) \le t\varphi(x_1) + (1 - t)\varphi(x_0)$ where $x_t = t x_1 + (1 - t) x_0$.
The following Proposition is clearly motivated by Figure 17.
[Figure 17. A convex function along with two chords, corresponding to $x_0 = -2$, $x_1 = 4$ and $x_0 = -5$, $x_1 = -2$.]
Proposition 9.7. Suppose $\varphi : (a, b) \to \mathbb{R}$ is a convex function, then
(1) For all $u, v, w, z \in (a, b)$ such that $u < z$, $w \in [u, z)$ and $v \in (u, z]$,
(9.12) $\quad \dfrac{\varphi(v) - \varphi(u)}{v - u} \le \dfrac{\varphi(z) - \varphi(w)}{z - w}.$
(2) For each $c \in (a, b)$, the right and left sided derivatives $\varphi'_\pm(c)$ exist in $\mathbb{R}$ and if $a < u < v < b$, then $\varphi'_+(u) \le \varphi'_-(v) \le \varphi'_+(v)$.
(3) The function $\varphi$ is continuous.
(4) For all $t \in (a, b)$ and $\beta \in [\varphi'_-(t), \varphi'_+(t)]$, $\varphi(x) \ge \varphi(t) + \beta(x - t)$ for all $x \in (a, b)$. In particular,
$$\varphi(x) \ge \varphi(t) + \varphi'_-(t)(x - t) \quad \text{for all } x, t \in (a, b).$$
Proof. 1a) Suppose first that $u < v = w < z$, in which case Eq. (9.12) is equivalent to
$$(\varphi(v) - \varphi(u))(z - v) \le (\varphi(z) - \varphi(v))(v - u)$$
which after solving for $\varphi(v)$ is equivalent to the following equation holding:
$$\varphi(v) \le \varphi(z)\frac{v - u}{z - u} + \varphi(u)\frac{z - v}{z - u}.$$
But this last equation states that $\varphi(v) \le \varphi(z)t + \varphi(u)(1 - t)$ where $t = \frac{v - u}{z - u}$ and $v = tz + (1 - t)u$ and hence is valid by the definition of $\varphi$ being convex.
1b) Now assume $u = w < v < z$, in which case Eq. (9.12) is equivalent to
$$(\varphi(v) - \varphi(u))(z - u) \le (\varphi(z) - \varphi(u))(v - u)$$
which after solving for $\varphi(v)$ is equivalent to
$$\varphi(v)(z - u) \le \varphi(z)(v - u) + \varphi(u)(z - v)$$
which is equivalent to
$$\varphi(v) \le \varphi(z)\frac{v - u}{z - u} + \varphi(u)\frac{z - v}{z - u}.$$
Again this equation is valid by the convexity of $\varphi$.
1c) $u < w < v = z$, in which case Eq. (9.12) is equivalent to
$$(\varphi(z) - \varphi(u))(z - w) \le (\varphi(z) - \varphi(w))(z - u)$$
and this is equivalent to the inequality,
$$\varphi(w) \le \varphi(z)\frac{w - u}{z - u} + \varphi(u)\frac{z - w}{z - u}$$
which again is true by the convexity of $\varphi$.
1) General case. If $u < w < v < z$, then by 1a)-1c)
$$\frac{\varphi(z) - \varphi(w)}{z - w} \ge \frac{\varphi(v) - \varphi(w)}{v - w} \ge \frac{\varphi(v) - \varphi(u)}{v - u}$$
and if $u < v < w < z$
$$\frac{\varphi(z) - \varphi(w)}{z - w} \ge \frac{\varphi(w) - \varphi(v)}{w - v} \ge \frac{\varphi(w) - \varphi(u)}{w - u}.$$
We have now taken care of all possible cases.
2) On the set $a < w < z < b$, Eq. (9.12) shows that $(\varphi(z) - \varphi(w))/(z - w)$ is a decreasing function in $w$ and an increasing function in $z$ and therefore $\varphi'_\pm(x)$ exists for all $x \in (a, b)$. Also from Eq. (9.12) we learn that
(9.13) $\quad \varphi'_+(u) \le \dfrac{\varphi(z) - \varphi(w)}{z - w} \quad \text{for all } a < u < w < z < b,$
(9.14) $\quad \dfrac{\varphi(v) - \varphi(u)}{v - u} \le \varphi'_-(z) \quad \text{for all } a < u < v < z < b,$
and letting $w \uparrow z$ in the first equation also implies that
$$\varphi'_+(u) \le \varphi'_-(z) \quad \text{for all } a < u < z < b.$$
The inequality, $\varphi'_-(z) \le \varphi'_+(z)$, is also an easy consequence of Eq. (9.12).
3) Since $\varphi(x)$ has both left and right finite derivatives, it follows that $\varphi$ is continuous. (For an alternative proof, see Rudin.)
4) Given $t$, let $\beta \in [\varphi'_-(t), \varphi'_+(t)]$, then by Eqs. (9.13) and (9.14),
$$\frac{\varphi(t) - \varphi(u)}{t - u} \le \varphi'_-(t) \le \beta \le \varphi'_+(t) \le \frac{\varphi(z) - \varphi(t)}{z - t}$$
for all $a < u < t < z < b$. Item 4. now follows.
Corollary 9.8. Suppose $\varphi : (a, b) \to \mathbb{R}$ is differentiable, then $\varphi$ is convex iff $\varphi'$ is non-decreasing. In particular if $\varphi \in C^2(a, b)$ then $\varphi$ is convex iff $\varphi'' \ge 0$.
Proof. By Proposition 9.7, if $\varphi$ is convex then $\varphi'$ is non-decreasing. Conversely if $\varphi'$ is increasing then by the mean value theorem,
$$\frac{\varphi(x_1) - \varphi(c)}{x_1 - c} = \varphi'(\xi_1) \ \text{ for some } \xi_1 \in (c, x_1)$$
and
$$\frac{\varphi(c) - \varphi(x_0)}{c - x_0} = \varphi'(\xi_2) \ \text{ for some } \xi_2 \in (x_0, c).$$
Hence
$$\frac{\varphi(x_1) - \varphi(c)}{x_1 - c} \ge \frac{\varphi(c) - \varphi(x_0)}{c - x_0}$$
for all $x_0 < c < x_1$. Solving this inequality for $\varphi(c)$ gives
$$\varphi(c) \le \frac{c - x_0}{x_1 - x_0}\varphi(x_1) + \frac{x_1 - c}{x_1 - x_0}\varphi(x_0)$$
showing $\varphi$ is convex.
Example 9.9. The functions $e^x$ and $-\log x$ are convex and $x^p$ (for $x \ge 0$) is convex iff $p \ge 1$.
Theorem 9.10 (Jensen's Inequality). Suppose that $(X, \mathcal{M}, \mu)$ is a probability space, i.e. $\mu$ is a positive measure and $\mu(X) = 1$. Also suppose that $f \in L^1(\mu)$, $f : X \to (a, b)$, and $\varphi : (a, b) \to \mathbb{R}$ is a convex function. Then
$$\varphi\Big(\int_X f \, d\mu\Big) \le \int_X \varphi(f) \, d\mu$$
where if $\varphi \circ f \notin L^1(\mu)$, then $\varphi \circ f$ is integrable in the extended sense and $\int_X \varphi(f)\, d\mu = \infty$.
Proof. Let $t = \int_X f \, d\mu \in (a, b)$ and let $\beta \in \mathbb{R}$ be such that $\varphi(s) - \varphi(t) \ge \beta(s - t)$ for all $s \in (a, b)$. Then integrating the inequality, $\varphi(f) - \varphi(t) \ge \beta(f - t)$, implies that
$$0 \le \int_X \varphi(f)\, d\mu - \varphi(t) = \int_X \varphi(f)\, d\mu - \varphi\Big(\int_X f\, d\mu\Big).$$
Moreover, if $\varphi(f)$ is not integrable, then $\varphi(f) \ge \varphi(t) + \beta(f - t)$ which shows that the negative part of $\varphi(f)$ is integrable. Therefore, $\int_X \varphi(f)\, d\mu = \infty$ in this case.
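Not part of the original notes: a numerical check of Jensen's inequality on a finite probability space, with the convex function $\varphi = \exp$; NumPy assumed.

    import numpy as np

    rng = np.random.default_rng(3)
    w = rng.random(50); w /= w.sum()       # probability weights, mu(X) = 1
    f = rng.standard_normal(50)            # an integrable "random variable" f
    phi = np.exp                           # a convex function

    assert phi(np.dot(w, f)) <= np.dot(w, phi(f)) + 1e-12   # Jensen's inequality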
Example 9.11. The convex functions in Example 9.9 lead to the following inequalities,
(9.15) $\quad \exp\Big(\int_X f \, d\mu\Big) \le \int_X e^f \, d\mu,$
$$\int_X \log|f| \, d\mu \le \log\Big(\int_X |f| \, d\mu\Big)$$
and for $p \ge 1$,
$$\Big|\int_X f \, d\mu\Big| \le \int_X |f| \, d\mu \le \Big(\int_X |f|^p \, d\mu\Big)^{1/p}.$$
The last equation may also easily be derived using Hölder's inequality. As a special case of the first equation, we get another proof of Lemma 2.27. Indeed, more generally, suppose $p_i, s_i > 0$ for $i = 1, 2, \dots, n$ and $\sum_{i=1}^n \frac{1}{p_i} = 1$, then
(9.16) $\quad s_1 \cdots s_n = e^{\sum_{i=1}^n \ln s_i} = e^{\sum_{i=1}^n \frac{1}{p_i} \ln s_i^{p_i}} \le \sum_{i=1}^n \frac{1}{p_i} e^{\ln s_i^{p_i}} = \sum_{i=1}^n \frac{s_i^{p_i}}{p_i}$
where the inequality follows from Eq. (9.15) with $\mu = \sum_{i=1}^n \frac{1}{p_i} \delta_{s_i}$. Of course Eq. (9.16) may also be proved directly using the convexity of the exponential function.
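Not part of the original notes: a numerical check of the generalized Young inequality (9.16) with random positive $s_i$ and exponents satisfying $\sum 1/p_i = 1$; NumPy assumed.

    import numpy as np

    rng = np.random.default_rng(4)
    n = 4
    a = rng.random(n) + 0.1
    p = a.sum() / a                        # then sum(1/p_i) = 1 exactly
    s = rng.random(n) * 3 + 0.1            # positive numbers s_i

    lhs = np.prod(s)
    rhs = np.sum(s ** p / p)
    assert lhs <= rhs + 1e-12              # Eq. (9.16)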
9.2. Modes of Convergence. As usual let $(X, \mathcal{M}, \mu)$ be a fixed measure space and let $\{f_n\}$ be a sequence of measurable functions on $X$. Also let $f : X \to \mathbb{C}$ be a measurable function. We have the following notions of convergence and Cauchy sequences.
Definition 9.12. (1) $f_n \to f$ a.e. if there is a set $E \in \mathcal{M}$ such that $\mu(E^c) = 0$ and $\lim_{n\to\infty} 1_E f_n = 1_E f$.
(2) $f_n \to f$ in $\mu$-measure if $\lim_{n\to\infty} \mu(|f_n - f| > \varepsilon) = 0$ for all $\varepsilon > 0$. We will abbreviate this by saying $f_n \to f$ in $L^0$ or by $f_n \xrightarrow{\mu} f$.
(3) $f_n \to f$ in $L^p$ iff $f \in L^p$ and $f_n \in L^p$ for all $n$, and $\lim_{n\to\infty} \int |f_n - f|^p d\mu = 0$.
Definition 9.13. (1) $\{f_n\}$ is a.e. Cauchy if there is a set $E \in \mathcal{M}$ such that $\mu(E^c) = 0$ and $\{1_E f_n\}$ is a pointwise Cauchy sequence.
(2) $\{f_n\}$ is Cauchy in $\mu$-measure (or $L^0$-Cauchy) if $\lim_{m,n\to\infty} \mu(|f_n - f_m| > \varepsilon) = 0$ for all $\varepsilon > 0$.
(3) $\{f_n\}$ is Cauchy in $L^p$ if $\lim_{m,n\to\infty} \int |f_n - f_m|^p d\mu = 0$.
Lemma 9.14 (Chebyshev's inequality again). Let $p \in [1, \infty)$ and $f \in L^p$, then
$$\mu(|f| \ge \varepsilon) \le \frac{1}{\varepsilon^p} \|f\|_p^p \quad \text{for all } \varepsilon > 0.$$
In particular if $\{f_n\} \subset L^p$ is $L^p$-convergent (Cauchy) then $\{f_n\}$ is also convergent (Cauchy) in measure.
Proof. By Chebyshev's inequality (7.12),
$$\mu(|f| \ge \varepsilon) = \mu(|f|^p \ge \varepsilon^p) \le \frac{1}{\varepsilon^p} \int_X |f|^p d\mu = \frac{1}{\varepsilon^p} \|f\|_p^p$$
and therefore if $\{f_n\}$ is $L^p$-Cauchy, then
$$\mu(|f_n - f_m| \ge \varepsilon) \le \frac{1}{\varepsilon^p} \|f_n - f_m\|_p^p \to 0 \ \text{ as } m, n \to \infty,$$
showing $\{f_n\}$ is $L^0$-Cauchy. A similar argument holds for the $L^p$-convergent case.
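Not part of the original notes: a quick check of the Chebyshev bound of Lemma 9.14 on a finite probability space; NumPy assumed.

    import numpy as np

    rng = np.random.default_rng(5)
    f = rng.standard_normal(10_000)        # f on a 10_000-point space
    mu = np.full(f.size, 1.0 / f.size)     # uniform probability weights
    p, eps = 2.0, 0.5

    lhs = mu[np.abs(f) >= eps].sum()                      # mu(|f| >= eps)
    rhs = (np.abs(f) ** p * mu).sum() / eps ** p          # ||f||_p^p / eps^p
    assert lhs <= rhs + 1e-12              # Chebyshev's inequality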
Lemma 9.15. Suppose $a_n \in \mathbb{C}$ and $|a_{n+1} - a_n| \le \varepsilon_n$ and $\sum_{n=1}^\infty \varepsilon_n < \infty$. Then $\lim_{n\to\infty} a_n = a \in \mathbb{C}$ exists and $|a - a_n| \le \delta_n \equiv \sum_{k=n}^\infty \varepsilon_k$.
[Figure 18. Modes of convergence examples. In picture 1: $f_n \to 0$ a.e., $f_n \nrightarrow 0$ in $L^1$, $f_n \xrightarrow{\mu} 0$. In picture 2: $f_n \to 0$ a.e., $f_n \nrightarrow 0$ in $L^1$, $f_n \overset{\mu}{\nrightarrow} 0$. In picture 3: $f_n \to 0$ a.e., $f_n \xrightarrow{\mu} 0$ but $f_n \nrightarrow 0$ in $L^1$. In picture 4: $f_n \to 0$ in $L^1$, $f_n \nrightarrow 0$ a.e., and $f_n \xrightarrow{\mu} 0$.]
Proof. Let $m > n$, then
(9.17) $\quad |a_m - a_n| = \Big|\sum_{k=n}^{m-1} (a_{k+1} - a_k)\Big| \le \sum_{k=n}^{m-1} |a_{k+1} - a_k| \le \sum_{k=n}^{\infty} \varepsilon_k = \delta_n.$
So $|a_m - a_n| \le \delta_{\min(m,n)} \to 0$ as $m, n \to \infty$, i.e. $\{a_n\}$ is Cauchy. Let $m \to \infty$ in (9.17) to find $|a - a_n| \le \delta_n$.
Theorem 9.16. Suppose $\{f_n\}$ is $L^0$-Cauchy. Then there exists a subsequence $g_j = f_{n_j}$ of $\{f_n\}$ such that $\lim g_j =: f$ exists a.e. and $f_n \xrightarrow{\mu} f$ as $n \to \infty$. Moreover if $g$ is a measurable function such that $f_n \xrightarrow{\mu} g$ as $n \to \infty$, then $f = g$ a.e.
Proof. Let $\varepsilon_n > 0$ be such that $\sum_{n=1}^\infty \varepsilon_n < \infty$ ($\varepsilon_n = 2^{-n}$ would do) and set $\delta_n = \sum_{k=n}^\infty \varepsilon_k$. Choose $g_j = f_{n_j}$ such that $\{n_j\}$ is a subsequence of $\mathbb{N}$ and
$$\mu(\{|g_{j+1} - g_j| > \varepsilon_j\}) \le \varepsilon_j.$$
Let $E_j = \{|g_{j+1} - g_j| > \varepsilon_j\}$,
$$F_N = \bigcup_{j=N}^{\infty} E_j = \bigcup_{j=N}^{\infty} \{|g_{j+1} - g_j| > \varepsilon_j\}$$
and
$$E \equiv \bigcap_{N=1}^{\infty} F_N = \bigcap_{N=1}^{\infty} \bigcup_{j=N}^{\infty} E_j = \{|g_{j+1} - g_j| > \varepsilon_j \text{ i.o.}\}.$$
Then $\mu(E) = 0$ since
$$\mu(E) \le \sum_{j=N}^{\infty} \mu(E_j) \le \sum_{j=N}^{\infty} \varepsilon_j = \delta_N \to 0 \ \text{ as } N \to \infty.$$
For $x \notin F_N$, $|g_{j+1}(x) - g_j(x)| \le \varepsilon_j$ for all $j \ge N$ and by Lemma 9.15, $f(x) = \lim_{j\to\infty} g_j(x)$ exists and $|f(x) - g_j(x)| \le \delta_j$ for all $j \ge N$. Therefore, $\lim_{j\to\infty} g_j(x) = f(x)$ exists for all $x \notin E$. Moreover, $\{x : |f(x) - g_j(x)| > \delta_j\} \subset F_j$ for all $j \ge N$ and hence
$$\mu(|f - g_j| > \delta_j) \le \mu(F_j) \le \delta_j \to 0 \ \text{ as } j \to \infty.$$
Therefore $g_j \xrightarrow{\mu} f$ as $j \to \infty$.
Since
$$\{|f_n - f| > \varepsilon\} = \{|f - g_j + g_j - f_n| > \varepsilon\} \subset \{|f - g_j| > \varepsilon/2\} \cup \{|g_j - f_n| > \varepsilon/2\},$$
$$\mu(\{|f_n - f| > \varepsilon\}) \le \mu(\{|f - g_j| > \varepsilon/2\}) + \mu(|g_j - f_n| > \varepsilon/2)$$
and
$$\mu(\{|f_n - f| > \varepsilon\}) \le \limsup_{j\to\infty} \mu(|g_j - f_n| > \varepsilon/2) \to 0 \ \text{ as } n \to \infty.$$
If also $f_n \xrightarrow{\mu} g$ as $n \to \infty$, then arguing as above
$$\mu(|f - g| > \varepsilon) \le \mu(\{|f - f_n| > \varepsilon/2\}) + \mu(|g - f_n| > \varepsilon/2) \to 0 \ \text{ as } n \to \infty.$$
Hence
$$\mu(|f - g| > 0) = \mu\Big(\bigcup_{n=1}^{\infty} \{|f - g| > \tfrac{1}{n}\}\Big) \le \sum_{n=1}^{\infty} \mu(|f - g| > \tfrac{1}{n}) = 0,$$
i.e. $f = g$ a.e.
Corollary 9.17 (Dominated Convergence Theorem). Suppose $\{f_n\}$, $\{g_n\}$, and $g$ are in $L^1$ and $f \in L^0$ are functions such that
$$|f_n| \le g_n \text{ a.e.}, \quad f_n \xrightarrow{\mu} f, \quad g_n \xrightarrow{\mu} g, \quad \text{and} \quad \int g_n \to \int g \ \text{ as } n \to \infty.$$
Then $f \in L^1$ and $\lim_{n\to\infty} \|f - f_n\|_1 = 0$, i.e. $f_n \to f$ in $L^1$. In particular $\lim_{n\to\infty} \int f_n = \int f$.
Proof. First notice that $|f| \le g$ a.e. and hence $f \in L^1$ since $g \in L^1$. To see that $|f| \le g$, use Theorem 9.16 to find subsequences $\{f_{n_k}\}$ and $\{g_{n_k}\}$ of $\{f_n\}$ and $\{g_n\}$ respectively which are almost everywhere convergent. Then
$$|f| = \lim_{k\to\infty} |f_{n_k}| \le \lim_{k\to\infty} g_{n_k} = g \ \text{ a.e.}$$
If (for sake of contradiction) $\lim_{n\to\infty} \|f - f_n\|_1 \ne 0$ there exists $\varepsilon > 0$ and a subsequence $\{f_{n_k}\}$ of $\{f_n\}$ such that
(9.18) $\quad \int |f - f_{n_k}| \ge \varepsilon \ \text{ for all } k.$
Using Theorem 9.16 again, we may assume (by passing to a further subsequence if necessary) that $f_{n_k} \to f$ and $g_{n_k} \to g$ almost everywhere. Noting, $|f - f_{n_k}| \le g + g_{n_k} \to 2g$ and $\int (g + g_{n_k}) \to \int 2g$, an application of the dominated convergence Theorem 7.38 implies $\lim_{k\to\infty} \int |f - f_{n_k}| = 0$ which contradicts Eq. (9.18).
Exercise 9.1 (Fatou's Lemma). If $f_n \ge 0$ and $f_n \to f$ in measure, then $\int f \le \liminf_{n\to\infty} \int f_n$.
Theorem 9.18 (Egorov's Theorem). Suppose $\mu(X) < \infty$ and $f_n \to f$ a.e. Then for all $\varepsilon > 0$ there exists $E \in \mathcal{M}$ such that $\mu(E) < \varepsilon$ and $f_n \to f$ uniformly on $E^c$. In particular $f_n \xrightarrow{\mu} f$ as $n \to \infty$.
Proof. Let $f_n \to f$ a.e. Then $\mu(\{|f_n - f| > \frac{1}{k} \text{ i.o. } n\}) = 0$ for all $k > 0$, i.e.
$$\lim_{N\to\infty} \mu\Big(\bigcup_{n \ge N} \{|f_n - f| > \tfrac{1}{k}\}\Big) = \mu\Big(\bigcap_{N=1}^{\infty} \bigcup_{n \ge N} \{|f_n - f| > \tfrac{1}{k}\}\Big) = 0.$$
Let $E_k := \bigcup_{n \ge N_k} \{|f_n - f| > \frac{1}{k}\}$ and choose an increasing sequence $\{N_k\}_{k=1}^\infty$ such that $\mu(E_k) < \varepsilon 2^{-k}$ for all $k$. Setting $E := \bigcup E_k$, $\mu(E) < \sum_k \varepsilon 2^{-k} = \varepsilon$ and if $x \notin E$, then $|f_n - f| \le \frac{1}{k}$ for all $n \ge N_k$ and all $k$. That is $f_n \to f$ uniformly on $E^c$.
Exercise 9.2. Show that Egorov's Theorem remains valid when the assumption $\mu(X) < \infty$ is replaced by the assumption that $|f_n| \le g \in L^1$ for all $n$.
9.3. Completeness of $L^p$ spaces.
Theorem 9.19. Let $\|\cdot\|_\infty$ be as defined in Eq. (9.2), then $(L^\infty(X, \mathcal{M}, \mu), \|\cdot\|_\infty)$ is a Banach space. A sequence $\{f_n\}_{n=1}^\infty \subset L^\infty$ converges to $f \in L^\infty$ iff there exists $E \in \mathcal{M}$ such that $\mu(E) = 0$ and $f_n \to f$ uniformly on $E^c$. Moreover, bounded simple functions are dense in $L^\infty$.
Proof. By Minkowski's Theorem 9.4, $\|\cdot\|_\infty$ satisfies the triangle inequality. The reader may easily check the remaining conditions that ensure $\|\cdot\|_\infty$ is a norm.
Suppose that $\{f_n\}_{n=1}^\infty \subset L^\infty$ is a sequence such that $f_n \to f \in L^\infty$, i.e. $\|f - f_n\|_\infty \to 0$ as $n \to \infty$. Then for all $k \in \mathbb{N}$, there exists $N_k < \infty$ such that
$$\mu\big(|f - f_n| > k^{-1}\big) = 0 \ \text{ for all } n \ge N_k.$$
Let
$$E = \bigcup_{k=1}^{\infty} \bigcup_{n \ge N_k} \{|f - f_n| > k^{-1}\}.$$
Then $\mu(E) = 0$ and for $x \in E^c$, $|f(x) - f_n(x)| \le k^{-1}$ for all $n \ge N_k$. This shows that $f_n \to f$ uniformly on $E^c$. Conversely, if there exists $E \in \mathcal{M}$ such that $\mu(E) = 0$ and $f_n \to f$ uniformly on $E^c$, then for any $\varepsilon > 0$,
$$\mu(|f - f_n| \ge \varepsilon) = \mu(\{|f - f_n| \ge \varepsilon\} \cap E^c) = 0$$
for all $n$ sufficiently large. That is to say $\limsup_{n\to\infty} \|f - f_n\|_\infty \le \varepsilon$ for all $\varepsilon > 0$.
The density of simple functions follows from the approximation Theorem 7.12.
So the last item to prove is the completeness of $L^\infty$, for which we will use Theorem 3.66. Suppose that $\{f_n\}_{n=1}^\infty \subset L^\infty$ is a sequence such that $\sum_{n=1}^\infty \|f_n\|_\infty < \infty$. Let $M_n := \|f_n\|_\infty$, $E_n := \{|f_n| > M_n\}$, and $E := \bigcup_{n=1}^\infty E_n$ so that $\mu(E) = 0$. Then
$$\sum_{n=1}^{\infty} \sup_{x \in E^c} |f_n(x)| \le \sum_{n=1}^{\infty} M_n < \infty$$
which shows that $S_N(x) = \sum_{n=1}^N f_n(x)$ converges uniformly to $S(x) := \sum_{n=1}^\infty f_n(x)$ on $E^c$, i.e. $\lim_{N\to\infty} \|S - S_N\|_\infty = 0$.
Alternatively, suppose $\varepsilon_{m,n} := \|f_m - f_n\|_\infty \to 0$ as $m, n \to \infty$. Let $E_{m,n} = \{|f_n - f_m| > \varepsilon_{m,n}\}$ and $E := \bigcup E_{m,n}$, then $\mu(E) = 0$ and $\|f_m - f_n\|_{E^c, u} = \varepsilon_{m,n} \to 0$ as $m, n \to \infty$. Therefore, $f := \lim_{n\to\infty} f_n$ exists on $E^c$ and the limit is uniform on $E^c$. Letting $f = \limsup_{n\to\infty} f_n$, it then follows that $\|f_m - f\|_\infty \to 0$ as $m \to \infty$.
Theorem 9.20 (Completeness of $L^p(\mu)$). For $1 \le p \le \infty$, $L^p(\mu)$ equipped with the $L^p$ norm, $\|\cdot\|_p$ (see Eq. (9.1)), is a Banach space.
Proof. By Minkowski's Theorem 9.4, $\|\cdot\|_p$ satisfies the triangle inequality. As above the reader may easily check the remaining conditions that ensure $\|\cdot\|_p$ is a norm. So we are left to prove the completeness of $L^p(\mu)$ for $1 \le p < \infty$, the case $p = \infty$ being done in Theorem 9.19. Let $\{f_n\} \subset L^p(\mu)$ be a Cauchy sequence. By Chebyshev's inequality (Lemma 9.14), $\{f_n\}$ is $L^0$-Cauchy (i.e. Cauchy in measure) and by Theorem 9.16 there exists a subsequence $\{g_j\}$ of $\{f_n\}$ such that $g_j \to f$ a.e. By Fatou's Lemma,
$$\|g_j - f\|_p^p = \int \liminf_{k\to\infty} |g_j - g_k|^p \, d\mu \le \liminf_{k\to\infty} \int |g_j - g_k|^p \, d\mu = \liminf_{k\to\infty} \|g_j - g_k\|_p^p \to 0 \ \text{ as } j \to \infty.$$
In particular, $\|f\|_p \le \|g_j - f\|_p + \|g_j\|_p < \infty$ so that $f \in L^p$ and $g_j \xrightarrow{L^p} f$. The proof is finished because,
$$\|f_n - f\|_p \le \|f_n - g_j\|_p + \|g_j - f\|_p \to 0 \ \text{ as } j, n \to \infty.$$
The $L^p(\mu)$ norm controls two types of behaviors of $f$, namely the behavior at infinity and the behavior of local singularities. So in particular, if $f$ blows up at a point $x_0 \in X$, then locally near $x_0$ it is harder for $f$ to be in $L^p(\mu)$ as $p$ increases. On the other hand a function $f \in L^p(\mu)$ is allowed to decay at infinity slower and slower as $p$ increases. With these insights in mind, we should not in general expect $L^p(\mu) \subset L^q(\mu)$ or $L^q(\mu) \subset L^p(\mu)$. However, there are two notable exceptions. (1) If $\mu(X) < \infty$, then there is no behavior at infinity to worry about and $L^q(\mu) \subset L^p(\mu)$ for all $q \ge p$ as is shown in Corollary 9.21 below. (2) If $\mu$ is counting measure, i.e. $\mu(A) = \#(A)$, then all functions in $L^p(\mu)$ for any $p$ can not blow up on a set of positive measure, so there are no local singularities. In this case $L^p(\mu) \subset L^q(\mu)$ for all $q \ge p$, see Corollary 9.25 below.
Corollary 9.21. If $\mu(X) < \infty$, then $L^q(\mu) \subset L^p(\mu)$ for all $0 < p < q \le \infty$ and the inclusion map is bounded.
Proof. Choose $a \in [1, \infty]$ such that
$$\frac{1}{p} = \frac{1}{a} + \frac{1}{q}, \quad \text{i.e. } a = \frac{pq}{q - p}.$$
Then by Corollary 9.3,
$$\|f\|_p = \|f \cdot 1\|_p \le \|f\|_q \cdot \|1\|_a = \mu(X)^{1/a} \|f\|_q = \mu(X)^{(\frac{1}{p} - \frac{1}{q})} \|f\|_q.$$
The reader may easily check this final formula is correct even when $q = \infty$ provided we interpret $1/p - 1/\infty$ to be $1/p$.
Proposition 9.22. Suppose that $0 < p < q < r \le \infty$, then $L^q \subset L^p + L^r$, i.e. every function $f \in L^q$ may be written as $f = g + h$ with $g \in L^p$ and $h \in L^r$. For $1 \le p < r \le \infty$ and $f \in L^p + L^r$ let
$$\|f\| := \inf\{\|g\|_p + \|h\|_r : f = g + h\}.$$
Then $(L^p + L^r, \|\cdot\|)$ is a Banach space and the inclusion map from $L^q$ to $L^p + L^r$ is bounded; in fact $\|f\| \le 2\|f\|_q$ for all $f \in L^q$.
Proof. Let $M > 0$, then the local singularities of $f$ are contained in the set $E := \{|f| > M\}$ and the behavior of $f$ at infinity is solely determined by $f$ on $E^c$. Hence let $g = f 1_E$ and $h = f 1_{E^c}$ so that $f = g + h$. By our earlier discussion we expect that $g \in L^p$ and $h \in L^r$ and this is the case since,
$$\|g\|_p^p = \big\|f 1_{|f|>M}\big\|_p^p = \int |f|^p 1_{|f|>M} \, d\mu = M^p \int \Big|\frac{f}{M}\Big|^p 1_{|f|>M} \, d\mu \le M^p \int \Big|\frac{f}{M}\Big|^q 1_{|f|>M} \, d\mu \le M^{p-q} \|f\|_q^q < \infty$$
and
$$\|h\|_r^r = \big\|f 1_{|f|\le M}\big\|_r^r = \int |f|^r 1_{|f|\le M} \, d\mu = M^r \int \Big|\frac{f}{M}\Big|^r 1_{|f|\le M} \, d\mu \le M^r \int \Big|\frac{f}{M}\Big|^q 1_{|f|\le M} \, d\mu \le M^{r-q} \|f\|_q^q < \infty.$$
Moreover this shows
$$\|f\| \le M^{1-q/p} \|f\|_q^{q/p} + M^{1-q/r} \|f\|_q^{q/r}.$$
Taking $M = \lambda \|f\|_q$ then gives
$$\|f\| \le \big(\lambda^{1-q/p} + \lambda^{1-q/r}\big) \|f\|_q$$
and then taking $\lambda = 1$ shows $\|f\| \le 2\|f\|_q$. The proof that $(L^p + L^r, \|\cdot\|)$ is a Banach space is left as Exercise 9.7 to the reader.
Corollary 9.23. Suppose that $0 < p < q < r \le \infty$, then $L^p \cap L^r \subset L^q$ and
(9.19) $\quad \|f\|_q \le \|f\|_p^\lambda \, \|f\|_r^{1-\lambda}$
where $\lambda \in (0, 1)$ is determined so that
$$\frac{1}{q} = \frac{\lambda}{p} + \frac{1-\lambda}{r} \quad \text{with } \lambda = p/q \text{ if } r = \infty.$$
Further assume $1 \le p < q < r \le \infty$, and for $f \in L^p \cap L^r$ let
$$\|f\| := \|f\|_p + \|f\|_r.$$
Then $(L^p \cap L^r, \|\cdot\|)$ is a Banach space and the inclusion map of $L^p \cap L^r$ into $L^q$ is bounded, in fact
(9.20) $\quad \|f\|_q \le \max\big(\lambda^{-1}, (1-\lambda)^{-1}\big) \big(\|f\|_p + \|f\|_r\big),$
where
$$\lambda = \frac{\frac{1}{q} - \frac{1}{r}}{\frac{1}{p} - \frac{1}{r}} = \frac{p(r - q)}{q(r - p)}.$$
The heuristic explanation of this corollary is that if $f \in L^p \cap L^r$, then $f$ has local singularities no worse than an $L^r$ function and behavior at infinity no worse than an $L^p$ function. Hence $f \in L^q$ for any $q$ between $p$ and $r$.
Proof. Let $\lambda$ be determined as above, $a = p/\lambda$ and $b = r/(1-\lambda)$, then by Corollary 9.3,
$$\|f\|_q = \big\| |f|^\lambda |f|^{1-\lambda} \big\|_q \le \big\| |f|^\lambda \big\|_a \, \big\| |f|^{1-\lambda} \big\|_b = \|f\|_p^\lambda \, \|f\|_r^{1-\lambda}.$$
It is easily checked that $\|\cdot\|$ is a norm on $L^p \cap L^r$. To show this space is complete, suppose that $\{f_n\} \subset L^p \cap L^r$ is a $\|\cdot\|$-Cauchy sequence. Then $\{f_n\}$ is both $L^p$ and $L^r$ Cauchy. Hence there exist $f \in L^p$ and $g \in L^r$ such that $\lim_{n\to\infty} \|f - f_n\|_p = 0$ and $\lim_{n\to\infty} \|g - f_n\|_r = 0$. By Chebyshev's inequality (Lemma 9.14) $f_n \to f$ and $f_n \to g$ in measure and therefore by Theorem 9.16, $f = g$ a.e. It now is clear that $\lim_{n\to\infty} \|f - f_n\| = 0$. The estimate in Eq. (9.20) is left as Exercise 9.6 to the reader.
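Not part of the original notes: a numerical check of the interpolation inequality (9.19) on a finite measure space with random weights; NumPy assumed.

    import numpy as np

    rng = np.random.default_rng(6)
    mu = rng.random(200)                   # weights of a finite measure
    f = rng.standard_normal(200) * 5
    p, q, r = 1.5, 3.0, 6.0
    lam = (1 / q - 1 / r) / (1 / p - 1 / r)        # so 1/q = lam/p + (1 - lam)/r

    def norm(s):
        return np.sum(np.abs(f) ** s * mu) ** (1 / s)

    assert norm(q) <= norm(p) ** lam * norm(r) ** (1 - lam) + 1e-9   # Eq. (9.19)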
Remark 9.24. Let $p = p_1$, $r = p_0$ and for $\lambda \in (0, 1)$ let $p_\lambda$ be defined by
(9.21) $\quad \dfrac{1}{p_\lambda} = \dfrac{1-\lambda}{p_0} + \dfrac{\lambda}{p_1}.$
Combining Proposition 9.22 and Corollary 9.23 gives
$$L^{p_0} \cap L^{p_1} \subset L^{p_\lambda} \subset L^{p_0} + L^{p_1}$$
and Eq. (9.19) becomes
$$\|f\|_{p_\lambda} \le \|f\|_{p_0}^{1-\lambda} \, \|f\|_{p_1}^{\lambda}.$$
Corollary 9.25. Suppose now that $\mu$ is counting measure on $X$. Then $L^p(\mu) \subset L^q(\mu)$ for all $0 < p < q \le \infty$ and $\|f\|_q \le \|f\|_p$.
Proof. Suppose that $0 < p < q = \infty$, then
$$\|f\|_\infty^p = \sup\{|f(x)|^p : x \in X\} \le \sum_{x \in X} |f(x)|^p = \|f\|_p^p,$$
i.e. $\|f\|_\infty \le \|f\|_p$ for all $0 < p < \infty$. For $0 < p \le q \le \infty$, apply Corollary 9.23 with $r = \infty$ to find
$$\|f\|_q \le \|f\|_p^{p/q} \, \|f\|_\infty^{1-p/q} \le \|f\|_p^{p/q} \, \|f\|_p^{1-p/q} = \|f\|_p.$$
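Not part of the original notes: a counting-measure check of Corollary 9.25, i.e. that $\|f\|_q \le \|f\|_p$ for $p \le q$; NumPy assumed.

    import numpy as np

    rng = np.random.default_rng(7)
    f = rng.standard_normal(100)
    p, q = 1.2, 4.0
    lp = np.sum(np.abs(f) ** p) ** (1 / p)
    lq = np.sum(np.abs(f) ** q) ** (1 / q)
    assert lq <= lp + 1e-12                # ||f||_q <= ||f||_p for counting measure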
9.3.1. Summary:
(1) Since $\mu(|f| > \varepsilon) \le \varepsilon^{-p} \|f\|_p^p$ it follows that $L^p$ convergence implies $L^0$ convergence.
(2) $L^0$ convergence implies almost everywhere convergence for some subsequence.
(3) If $\mu(X) < \infty$, then $L^q \subset L^p$ for all $p \le q$; in fact
$$\|f\|_p \le [\mu(X)]^{(\frac{1}{p} - \frac{1}{q})} \|f\|_q,$$
i.e. $L^q$ convergence implies $L^p$ convergence.
(4) $L^{p_0} \cap L^{p_1} \subset L^{p_\lambda} \subset L^{p_0} + L^{p_1}$ where $\frac{1}{p_\lambda} = \frac{1-\lambda}{p_0} + \frac{\lambda}{p_1}$.
(5) $\ell^p \subset \ell^q$ if $p \le q$. In fact $\|f\|_q \le \|f\|_p$ in this case. To prove this write
$$\frac{1}{q} = \frac{\lambda}{p} + \frac{1-\lambda}{\infty},$$
then using $\|f\|_\infty \le \|f\|_p$ for all $p$,
$$\|f\|_q \le \|f\|_p^\lambda \, \|f\|_\infty^{1-\lambda} \le \|f\|_p^\lambda \, \|f\|_p^{1-\lambda} = \|f\|_p.$$
(6) If $\mu(X) < \infty$ then almost everywhere convergence implies $L^0$ convergence.
9.4. Converse of Hölder's Inequality. Throughout this section we assume $(X, \mathcal{M}, \mu)$ is a $\sigma$-finite measure space, $q \in [1, \infty]$ and $p \in [1, \infty]$ are conjugate exponents, i.e. $p^{-1} + q^{-1} = 1$. For $g \in L^q$, let $\phi_g \in (L^p)^*$ be given by
(9.22) $\quad \phi_g(f) = \int g f \, d\mu.$
By Hölder's inequality
(9.23) $\quad |\phi_g(f)| \le \int |gf| \, d\mu \le \|g\|_q \|f\|_p$
which implies that
(9.24) $\quad \|\phi_g\|_{(L^p)^*} := \sup\{|\phi_g(f)| : \|f\|_p = 1\} \le \|g\|_q.$
Proposition 9.26 (Converse of Hölder's Inequality). Let $(X, \mathcal{M}, \mu)$ be a $\sigma$-finite measure space and $1 \le p \le \infty$ as above. For all $g \in L^q$,
(9.25) $\quad \|g\|_q = \|\phi_g\|_{(L^p)^*} := \sup\{|\phi_g(f)| : \|f\|_p = 1\}$
and for any measurable function $g : X \to \mathbb{C}$,
(9.26) $\quad \|g\|_q = \sup\Big\{\int_X |g| f \, d\mu : \|f\|_p = 1 \text{ and } f \ge 0\Big\}.$
Proof. We begin by proving Eq. (9.25). Assume first that $q < \infty$ so $p > 1$. Then
$$|\phi_g(f)| = \Big|\int gf \, d\mu\Big| \le \int |gf| \, d\mu \le \|g\|_q \|f\|_p$$
and equality occurs in the first inequality when $\operatorname{sgn}(gf)$ is constant a.e. while equality in the second occurs, by Theorem 9.2, when $|f|^p = c|g|^q$ for some constant $c > 0$. So let $f := \operatorname{sgn}(g) |g|^{q/p}$ which for $p = \infty$ is to be interpreted as $f = \operatorname{sgn}(g)$, i.e. $|g|^{q/\infty} \equiv 1$.
When $p = \infty$,
$$|\phi_g(f)| = \int_X g \, \operatorname{sgn}(g)\, d\mu = \|g\|_{L^1(\mu)} = \|g\|_1 \|f\|_\infty$$
which shows that $\|\phi_g\|_{(L^\infty)^*} \ge \|g\|_1$. If $p < \infty$, then
$$\|f\|_p^p = \int |f|^p = \int |g|^q = \|g\|_q^q$$
while
$$\phi_g(f) = \int gf \, d\mu = \int |g|\,|g|^{q/p} \, d\mu = \int |g|^q \, d\mu = \|g\|_q^q.$$
Hence
$$\frac{|\phi_g(f)|}{\|f\|_p} = \frac{\|g\|_q^q}{\|g\|_q^{q/p}} = \|g\|_q^{q(1 - \frac{1}{p})} = \|g\|_q.$$
This shows that $\|\phi_g\| \ge \|g\|_q$ which combined with Eq. (9.24) implies Eq. (9.25).
The last case to consider is $p = 1$ and $q = \infty$. Let $M := \|g\|_\infty$ and choose $X_n \in \mathcal{M}$ such that $X_n \uparrow X$ as $n \to \infty$ and $\mu(X_n) < \infty$ for all $n$. For any $\varepsilon > 0$, $\mu(|g| \ge M - \varepsilon) > 0$ and $X_n \cap \{|g| \ge M - \varepsilon\} \uparrow \{|g| \ge M - \varepsilon\}$. Therefore, $\mu(X_n \cap \{|g| \ge M - \varepsilon\}) > 0$ for $n$ sufficiently large. Let
$$f = \operatorname{sgn}(g)\, 1_{X_n \cap \{|g| \ge M - \varepsilon\}},$$
then
$$\|f\|_1 = \mu(X_n \cap \{|g| \ge M - \varepsilon\}) \in (0, \infty)$$
and
$$|\phi_g(f)| = \int_{X_n \cap \{|g| \ge M - \varepsilon\}} \operatorname{sgn}(g)\, g \, d\mu = \int_{X_n \cap \{|g| \ge M - \varepsilon\}} |g| \, d\mu \ge (M - \varepsilon)\, \mu(X_n \cap \{|g| \ge M - \varepsilon\}) = (M - \varepsilon) \|f\|_1.$$
Since $\varepsilon > 0$ is arbitrary, it follows from this equation that $\|\phi_g\|_{(L^1)^*} \ge M = \|g\|_\infty$.
We now will prove Eq. (9.26). The key new point is that we no longer are assuming that $g \in L^q$. Let $M(g)$ denote the right member in Eq. (9.26) and set $g_n := 1_{X_n \cap \{|g| \le n\}}\, g$. Then $|g_n| \uparrow |g|$ as $n \to \infty$ and it is clear that $M(g_n)$ is increasing in $n$. Therefore using Lemma 2.10 and the monotone convergence theorem,
$$\lim_{n\to\infty} M(g_n) = \sup_n M(g_n) = \sup_n \sup\Big\{\int_X |g_n| f \, d\mu : \|f\|_p = 1 \text{ and } f \ge 0\Big\}$$
$$= \sup\Big\{\sup_n \int_X |g_n| f \, d\mu : \|f\|_p = 1 \text{ and } f \ge 0\Big\} = \sup\Big\{\lim_{n\to\infty} \int_X |g_n| f \, d\mu : \|f\|_p = 1 \text{ and } f \ge 0\Big\}$$
$$= \sup\Big\{\int_X |g| f \, d\mu : \|f\|_p = 1 \text{ and } f \ge 0\Big\} = M(g).$$
Since $g_n \in L^q$ for all $n$ and $M(g_n) = \|\phi_{g_n}\|_{(L^p)^*}$ (as you should verify), it follows from Eq. (9.25) that $M(g_n) = \|g_n\|_q$. When $q < \infty$, by the monotone convergence theorem, and when $q = \infty$, directly from the definitions, one learns that $\lim_{n\to\infty} \|g_n\|_q = \|g\|_q$. Combining this fact with $\lim_{n\to\infty} M(g_n) = M(g)$ just proved shows $M(g) = \|g\|_q$.
As an application we can derive a sweeping generalization of Minkowski's inequality. (See Reed and Simon, Vol. II, Appendix IX.4 for a more thorough discussion of complex interpolation theory.)
Theorem 9.27 (Minkowski's Inequality for Integrals). Let $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ be $\sigma$-finite measure spaces and $1 \le p \le \infty$. If $f$ is a $\mathcal{M} \otimes \mathcal{N}$ measurable function, then $y \mapsto \|f(\cdot, y)\|_{L^p(\mu)}$ is measurable and
(1) if $f$ is a positive $\mathcal{M} \otimes \mathcal{N}$ measurable function, then
(9.27) $\quad \Big\|\int_Y f(\cdot, y)\, d\nu(y)\Big\|_{L^p(\mu)} \le \int_Y \|f(\cdot, y)\|_{L^p(\mu)}\, d\nu(y).$
(2) If $f : X \times Y \to \mathbb{C}$ is a $\mathcal{M} \otimes \mathcal{N}$ measurable function and $\int_Y \|f(\cdot, y)\|_{L^p(\mu)}\, d\nu(y) < \infty$ then
(a) for $\mu$-a.e. $x$, $f(x, \cdot) \in L^1(\nu)$,
(b) the $\mu$-a.e. defined function, $x \mapsto \int_Y f(x, y)\, d\nu(y)$, is in $L^p(\mu)$ and
(c) the bound in Eq. (9.27) holds.
Proof. For $p \in [1, \infty]$, let $F_p(y) := \|f(\cdot, y)\|_{L^p(\mu)}$. If $p \in [1, \infty)$,
$$F_p(y) = \|f(\cdot, y)\|_{L^p(\mu)} = \Big(\int_X |f(x, y)|^p \, d\mu(x)\Big)^{1/p}$$
is a measurable function on $Y$ by Fubini's theorem. To see that $F_\infty$ is measurable, let $X_n \in \mathcal{M}$ be such that $X_n \uparrow X$ and $\mu(X_n) < \infty$ for all $n$. Then by Exercise 9.5,
$$F_\infty(y) = \lim_{n\to\infty} \lim_{p\to\infty} \|f(\cdot, y) 1_{X_n}\|_{L^p(\mu)}$$
which shows that $F_\infty$ is $(Y, \mathcal{N})$ measurable as well. This shows that the integral on the right side of Eq. (9.27) is well defined.
Now suppose that $f \ge 0$, $q = p/(p-1)$ and $g \in L^q(\mu)$ is such that $g \ge 0$ and $\|g\|_{L^q(\mu)} = 1$. Then by Tonelli's theorem and Hölder's inequality,
$$\int_X \Big(\int_Y f(x, y)\, d\nu(y)\Big) g(x)\, d\mu(x) = \int_Y d\nu(y) \int_X d\mu(x)\, f(x, y) g(x) \le \|g\|_{L^q(\mu)} \int_Y \|f(\cdot, y)\|_{L^p(\mu)}\, d\nu(y) = \int_Y \|f(\cdot, y)\|_{L^p(\mu)}\, d\nu(y).$$
Therefore by Proposition 9.26,
$$\Big\|\int_Y f(\cdot, y)\, d\nu(y)\Big\|_{L^p(\mu)} = \sup\Big\{\int_X \Big(\int_Y f(x, y)\, d\nu(y)\Big) g(x)\, d\mu(x) : \|g\|_{L^q(\mu)} = 1 \text{ and } g \ge 0\Big\} \le \int_Y \|f(\cdot, y)\|_{L^p(\mu)}\, d\nu(y)$$
proving Eq. (9.27) in this case.
Now let $f : X \times Y \to \mathbb{C}$ be as in item 2) of the theorem. Applying the first part of the theorem to $|f|$ shows
$$\int_Y |f(x, y)|\, d\nu(y) < \infty \ \text{ for } \mu\text{-a.e. } x,$$
i.e. $f(x, \cdot) \in L^1(\nu)$ for $\mu$-a.e. $x$. Since $\big|\int_Y f(x, y)\, d\nu(y)\big| \le \int_Y |f(x, y)|\, d\nu(y)$ it follows by item 1) that
$$\Big\|\int_Y f(\cdot, y)\, d\nu(y)\Big\|_{L^p(\mu)} \le \Big\|\int_Y |f(\cdot, y)|\, d\nu(y)\Big\|_{L^p(\mu)} \le \int_Y \|f(\cdot, y)\|_{L^p(\mu)}\, d\nu(y).$$
Hence the function, $x \in X \mapsto \int_Y f(x, y)\, d\nu(y)$, is in $L^p(\mu)$ and the bound in Eq. (9.27) holds.
Here is an application of Minkowski's inequality for integrals.
Theorem 9.28 (Theorem 6.20 in Folland). Suppose that $k : (0, \infty) \times (0, \infty) \to \mathbb{C}$ is a measurable function such that $k$ is homogeneous of degree $-1$, i.e. $k(\lambda x, \lambda y) = \lambda^{-1} k(x, y)$ for all $\lambda > 0$. If
$$C_p := \int_0^\infty |k(x, 1)|\, x^{-1/p}\, dx < \infty$$
for some $p \in [1, \infty]$, then for $f \in L^p((0, \infty), m)$, $k(x, \cdot) f(\cdot) \in L^1((0, \infty), m)$ for $m$-a.e. $x$. Moreover, the $m$-a.e. defined function
(9.28) $\quad (Kf)(x) = \int_0^\infty k(x, y) f(y)\, dy$
is in $L^p((0, \infty), m)$ and
$$\|Kf\|_{L^p((0,\infty),m)} \le C_p \|f\|_{L^p((0,\infty),m)}.$$
Proof. By the homogeneity of $k$, $k(x, y) = y^{-1} k(x/y, 1)$. Hence
$$\int_0^\infty |k(x, y) f(y)|\, dy = \int_0^\infty x^{-1} |k(1, y/x) f(y)|\, dy = \int_0^\infty x^{-1} |k(1, z) f(xz)|\, x\, dz = \int_0^\infty |k(1, z) f(xz)|\, dz.$$
Since
$$\|f(\cdot\, z)\|_{L^p((0,\infty),m)}^p = \int_0^\infty |f(yz)|^p\, dy = \int_0^\infty |f(x)|^p\, \frac{dx}{z},$$
$$\|f(\cdot\, z)\|_{L^p((0,\infty),m)} = z^{-1/p} \|f\|_{L^p((0,\infty),m)}.$$
Using Minkowski's inequality for integrals then shows
$$\Big\|\int_0^\infty |k(\cdot, y) f(y)|\, dy\Big\|_{L^p((0,\infty),m)} \le \int_0^\infty |k(1, z)|\, \|f(\cdot\, z)\|_{L^p((0,\infty),m)}\, dz = \|f\|_{L^p((0,\infty),m)} \int_0^\infty |k(1, z)|\, z^{-1/p}\, dz = C_p \|f\|_{L^p((0,\infty),m)} < \infty.$$
This shows that $Kf$ in Eq. (9.28) is well defined for $m$-a.e. $x$. The proof is finished by observing
$$\|Kf\|_{L^p((0,\infty),m)} \le \Big\|\int_0^\infty |k(\cdot, y) f(y)|\, dy\Big\|_{L^p((0,\infty),m)} \le C_p \|f\|_{L^p((0,\infty),m)}$$
for all $f \in L^p((0, \infty), m)$.
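Not part of the original notes: for the symmetric kernel $k(x,y) = 1/(x+y)$, which is homogeneous of degree $-1$, the constant of Theorem 9.28 evaluates to $\int_0^\infty x^{-1/p}/(1+x)\,dx = \pi/\sin(\pi/p)$ by the Beta integral. The sketch below checks this numerically, assuming SciPy is available.

    import numpy as np
    from scipy.integrate import quad

    p = 3.0
    integrand = lambda x: x ** (-1.0 / p) / (1.0 + x)
    I1, _ = quad(integrand, 0.0, 1.0)       # integrable singularity at 0
    I2, _ = quad(integrand, 1.0, np.inf)    # tail decays like x^(-1 - 1/p)
    Cp = I1 + I2
    assert abs(Cp - np.pi / np.sin(np.pi / p)) < 1e-6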
The following theorem is a strengthening of Proposition 9.26 which will be used (actually maybe not) in Theorem G.49 below. (WHERE IS THIS THEOREM USED?)
Theorem 9.29 (Converse of Hölder's Inequality II). Assume that $(X, \mathcal{M}, \mu)$ is a $\sigma$-finite measure space, $q, p \in [1, \infty]$ are conjugate exponents and let $\mathcal{S}_f$ denote the set of simple functions $\phi$ on $X$ such that $\mu(\phi \ne 0) < \infty$. For $g : X \to \mathbb{C}$ measurable such that $\phi g \in L^1$ for all $\phi \in \mathcal{S}_f$,${}^{18}$ let
(9.29) $\quad M_q(g) = \sup\Big\{\Big|\int_X \phi g\, d\mu\Big| : \phi \in \mathcal{S}_f \text{ with } \|\phi\|_p = 1\Big\}.$
If $M_q(g) < \infty$ then $g \in L^q$ and $M_q(g) = \|g\|_q$.
Proof. Let $X_n \in \mathcal{M}$ be sets such that $\mu(X_n) < \infty$ and $X_n \uparrow X$ as $n \to \infty$. Suppose that $q = 1$ and hence $p = \infty$. Choose simple functions $\phi_n$ on $X$ such that $|\phi_n| \le 1$ and $\operatorname{sgn}(g) = \lim_{n\to\infty} \phi_n$ in the pointwise sense. Then $1_{X_m} \phi_n \in \mathcal{S}_f$ and therefore
$$\Big|\int_X 1_{X_m} \phi_n g\, d\mu\Big| \le M_q(g)$$
for all $m, n$. By assumption $1_{X_m} g \in L^1(\mu)$ and therefore by the dominated convergence theorem we may let $n \to \infty$ in this equation to find
$$\int_X 1_{X_m} |g|\, d\mu \le M_q(g)$$
for all $m$. The monotone convergence theorem then implies that
$$\int_X |g|\, d\mu = \lim_{m\to\infty} \int_X 1_{X_m} |g|\, d\mu \le M_q(g)$$
showing $g \in L^1(\mu)$ and $\|g\|_1 \le M_q(g)$. Since Hölder's inequality implies that $M_q(g) \le \|g\|_1$, we have proved the theorem in case $q = 1$.
For $q > 1$, we will begin by assuming that $g \in L^q(\mu)$. Since $p \in [1, \infty)$ we know that $\mathcal{S}_f$ is a dense subspace of $L^p(\mu)$ and therefore, using the fact that $\phi_g$ is continuous on $L^p(\mu)$,
$$M_q(g) = \sup\Big\{\Big|\int_X \phi g\, d\mu\Big| : \phi \in L^p(\mu) \text{ with } \|\phi\|_p = 1\Big\} = \|g\|_q$$
where the last equality follows by Proposition 9.26.
So it remains to show that if $\phi g \in L^1$ for all $\phi \in \mathcal{S}_f$ and $M_q(g) < \infty$ then $g \in L^q(\mu)$. For $n \in \mathbb{N}$, let $g_n \equiv 1_{X_n} 1_{|g| \le n}\, g$. Then $g_n \in L^q(\mu)$, in fact $\|g_n\|_q \le n\, \mu(X_n)^{1/q} < \infty$. So by the previous paragraph,
$$\|g_n\|_q = M_q(g_n) = \sup\Big\{\Big|\int_X \phi\, 1_{X_n} 1_{|g| \le n}\, g\, d\mu\Big| : \phi \in L^p(\mu) \text{ with } \|\phi\|_p = 1\Big\} \le M_q(g)\, \big\|\phi\, 1_{X_n} 1_{|g| \le n}\big\|_p \le M_q(g) \cdot 1 = M_q(g)$$
wherein the second to last inequality we have made use of the definition of $M_q(g)$ and the fact that $\phi\, 1_{X_n} 1_{|g| \le n} \in \mathcal{S}_f$. If $q \in (1, \infty)$, an application of the monotone convergence theorem (or Fatou's Lemma) along with the continuity of the norm, $\|\cdot\|_p$, implies
$$\|g\|_q = \lim_{n\to\infty} \|g_n\|_q \le M_q(g) < \infty.$$
If $q = \infty$, then $\|g_n\|_\infty \le M_q(g) < \infty$ for all $n$ implies $|g_n| \le M_q(g)$ a.e. which then implies that $|g| \le M_q(g)$ a.e. since $|g| = \lim_{n\to\infty} |g_n|$. That is $g \in L^\infty(\mu)$ and $\|g\|_\infty \le M_q(g)$.
${}^{18}$ This is equivalent to requiring $1_A g \in L^1(\mu)$ for all $A \in \mathcal{M}$ such that $\mu(A) < \infty$.
9.5. Uniform Integrability. This section will address the question as to what extra conditions are needed in order that an $L^0$-convergent sequence is $L^p$-convergent.
Notation 9.30. For $f \in L^1(\mu)$ and $E \in \mathcal{M}$, let
$$\mu(f : E) := \int_E f\, d\mu$$
and more generally if $A, B \in \mathcal{M}$ let
$$\mu(f : A, B) := \int_{A \cap B} f\, d\mu.$$
Lemma 9.31. Suppose $g \in L^1(\mu)$, then for any $\varepsilon > 0$ there exists $\delta > 0$ such that $\mu(|g| : E) < \varepsilon$ whenever $\mu(E) < \delta$.
Proof. If the Lemma is false, there would exist $\varepsilon > 0$ and sets $E_n$ such that $\mu(E_n) \to 0$ while $\mu(|g| : E_n) \ge \varepsilon$ for all $n$. Since $|1_{E_n} g| \le |g| \in L^1$ and for any $\delta \in (0, 1)$, $\mu(1_{E_n} |g| > \delta) \le \mu(E_n) \to 0$ as $n \to \infty$, the dominated convergence theorem of Corollary 9.17 implies $\lim_{n\to\infty} \mu(|g| : E_n) = 0$. This contradicts $\mu(|g| : E_n) \ge \varepsilon$ for all $n$ and the proof is complete.
Suppose that $\{f_n\}_{n=1}^\infty$ is a sequence of measurable functions which converge in $L^1(\mu)$ to a function $f$. Then for $E \in \mathcal{M}$ and $n \in \mathbb{N}$,
$$|\mu(f_n : E)| \le |\mu(f - f_n : E)| + |\mu(f : E)| \le \|f - f_n\|_1 + |\mu(f : E)|.$$
Let $\varepsilon_N := \sup_{n > N} \|f - f_n\|_1$, then $\varepsilon_N \downarrow 0$ as $N \to \infty$ and
(9.30) $\quad \sup_n |\mu(f_n : E)| \le \sup_{n \le N} |\mu(f_n : E)| \vee (\varepsilon_N + |\mu(f : E)|) \le \varepsilon_N + \mu(g_N : E),$
where $g_N = |f| + \sum_{n=1}^N |f_n| \in L^1$. From Lemma 9.31 and Eq. (9.30) one easily concludes,
(9.31) $\quad \forall\, \varepsilon > 0\ \ \exists\, \delta > 0 \ \ni\ \sup_n |\mu(f_n : E)| < \varepsilon \ \text{ when } \mu(E) < \delta.$
Definition 9.32. Functions $\{f_n\}_{n=1}^\infty \subset L^1(\mu)$ satisfying Eq. (9.31) are said to be uniformly integrable.
Remark 9.33. Let $\{f_n\}$ be real functions satisfying Eq. (9.31), $E$ be a set where $\mu(E) < \delta$ and $E_n = E \cap \{f_n \ge 0\}$. Then $\mu(E_n) < \delta$ so that $\mu(f_n^+ : E) = \mu(f_n : E_n) < \varepsilon$ and similarly $\mu(f_n^- : E) < \varepsilon$. Therefore if Eq. (9.31) holds then
(9.32) $\quad \sup_n \mu(|f_n| : E) < 2\varepsilon \ \text{ when } \mu(E) < \delta.$
Similar arguments work for the complex case by looking at the real and imaginary parts of $f_n$. Therefore $\{f_n\}_{n=1}^\infty \subset L^1(\mu)$ is uniformly integrable iff
(9.33) $\quad \forall\, \varepsilon > 0\ \ \exists\, \delta > 0 \ \ni\ \sup_n \mu(|f_n| : E) < \varepsilon \ \text{ when } \mu(E) < \delta.$
Lemma 9.34. Assume that $\mu(X) < \infty$; then $\{f_n\}$ is uniformly bounded in $L^1(\mu)$ (i.e. $K = \sup_n \|f_n\|_1 < \infty$) and $\{f_n\}$ is uniformly integrable iff
(9.34) $\quad \lim_{M\to\infty} \sup_n \mu(|f_n| : |f_n| \ge M) = 0.$
Proof. Since $\{f_n\}$ is uniformly bounded in $L^1(\mu)$, $\mu(|f_n| \ge M) \le K/M$. So if (9.33) holds and $\varepsilon > 0$ is given, we may choose $M$ sufficiently large so that $\mu(|f_n| \ge M) < \delta(\varepsilon)$ for all $n$ and therefore,
$$\sup_n \mu(|f_n| : |f_n| \ge M) \le \varepsilon.$$
Since $\varepsilon$ is arbitrary, we conclude that Eq. (9.34) must hold.
Conversely, suppose that Eq. (9.34) holds; then automatically $K = \sup_n \mu(|f_n|) < \infty$ because
$$\mu(|f_n|) = \mu(|f_n| : |f_n| \ge M) + \mu(|f_n| : |f_n| < M) \le \sup_n \mu(|f_n| : |f_n| \ge M) + M \mu(X) < \infty.$$
Moreover,
$$\mu(|f_n| : E) = \mu(|f_n| : |f_n| \ge M, E) + \mu(|f_n| : |f_n| < M, E) \le \sup_n \mu(|f_n| : |f_n| \ge M) + M \mu(E).$$
So given $\varepsilon > 0$ choose $M$ so large that $\sup_n \mu(|f_n| : |f_n| \ge M) < \varepsilon/2$ and then take $\delta = \varepsilon/(2M)$.
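Not part of the original notes: a toy illustration of Eq. (9.34). For $f_n = n \cdot 1_{[0, 1/n]}$ on $([0,1], m)$ one has $\mu(|f_n|) = 1$ for all $n$, yet the tail masses in (9.34) do not go to zero, so the family is not uniformly integrable.

    # f_n = n on [0, 1/n] and 0 elsewhere; mu = Lebesgue measure on [0, 1]
    def tail_mass(n, M):
        # mu(|f_n| : |f_n| >= M) = n * (1/n) = 1 if n >= M, else 0
        return 1.0 if n >= M else 0.0

    for M in (10, 100, 1000):
        sup_tail = max(tail_mass(n, M) for n in range(1, 10_000))
        print(M, sup_tail)                 # stays at 1, so Eq. (9.34) fails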
Remark 9.35. It is not in general true that if $\{f_n\} \subset L^1(\mu)$ is uniformly integrable then $\sup_n \mu(|f_n|) < \infty$. For example take $X = \{*\}$ and $\mu(\{*\}) = 1$. Let $f_n(*) = n$. Since for $\delta < 1$ a set $E \subset X$ such that $\mu(E) < \delta$ is in fact the empty set, we see that Eq. (9.32) holds in this example. However, for finite measure spaces without atoms, for every $\delta > 0$ we may find a finite partition of $X$ by sets $\{E_\ell\}_{\ell=1}^k$ with $\mu(E_\ell) < \delta$. Then if Eq. (9.32) holds with $2\varepsilon = 1$, then
$$\mu(|f_n|) = \sum_{\ell=1}^k \mu(|f_n| : E_\ell) \le k$$
showing that $\mu(|f_n|) \le k$ for all $n$.
The following lemma gives a concrete necessary and sufficient condition for verifying that a collection of functions is uniformly bounded and uniformly integrable.
Lemma 9.36. Suppose that $\mu(X) < \infty$, and $\Lambda \subset L^0(X)$ is a collection of functions.
(1) If there exists a non-decreasing function $\varphi : \mathbb{R}_+ \to \mathbb{R}_+$ such that $\lim_{x\to\infty} \varphi(x)/x = \infty$ and
(9.35) $\quad K := \sup_{f \in \Lambda} \mu(\varphi(|f|)) < \infty$
then
(9.36) $\quad \lim_{M\to\infty} \sup_{f \in \Lambda} \mu\big(|f|\, 1_{|f| \ge M}\big) = 0.$
(2) Conversely if Eq. (9.36) holds, there exists a non-decreasing continuous function $\varphi : \mathbb{R}_+ \to \mathbb{R}_+$ such that $\varphi(0) = 0$, $\lim_{x\to\infty} \varphi(x)/x = \infty$ and Eq. (9.35) is valid.
Proof. 1. Let $\varphi$ be as in item 1. above and set $\varepsilon_M := \sup_{x \ge M} \frac{x}{\varphi(x)} \to 0$ as $M \to \infty$ by assumption. Then for $f \in \Lambda$
$$\mu(|f| : |f| \ge M) = \mu\Big(\frac{|f|}{\varphi(|f|)}\,\varphi(|f|) : |f| \ge M\Big) \le \varepsilon_M\, \mu(\varphi(|f|) : |f| \ge M) \le \varepsilon_M\, \mu(\varphi(|f|)) \le K \varepsilon_M$$
and hence
$$\lim_{M\to\infty} \sup_{f \in \Lambda} \mu\big(|f|\, 1_{|f| \ge M}\big) \le \lim_{M\to\infty} K \varepsilon_M = 0.$$
2. By assumption, $\varepsilon_M := \sup_{f \in \Lambda} \mu\big(|f|\, 1_{|f| \ge M}\big) \to 0$ as $M \to \infty$. Therefore we may choose $M_n \uparrow \infty$ such that
$$\sum_{n=0}^{\infty} (n + 1)\, \varepsilon_{M_n} < \infty$$
where by convention $M_0 := 0$. Now define $\varphi$ so that $\varphi(0) = 0$ and
$$\varphi'(x) = \sum_{n=0}^{\infty} (n + 1)\, 1_{(M_n, M_{n+1}]}(x),$$
i.e.
$$\varphi(x) = \int_0^x \varphi'(y)\, dy = \sum_{n=0}^{\infty} (n + 1)\,(x \wedge M_{n+1} - x \wedge M_n).$$
By construction $\varphi$ is continuous, $\varphi(0) = 0$, $\varphi'(x)$ is increasing (so $\varphi$ is convex) and $\varphi'(x) \ge (n + 1)$ for $x \ge M_n$. In particular, for $x \ge M_n$,
$$\varphi(x) \ge \varphi(M_n) + (n + 1)(x - M_n),$$
so that $\liminf_{x\to\infty} \varphi(x)/x \ge n + 1$ for every $n$, from which we conclude $\lim_{x\to\infty} \varphi(x)/x = \infty$. We also have $\varphi'(x) \le (n + 1)$ on $[0, M_{n+1}]$ and therefore
$$\varphi(x) \le (n + 1)x \ \text{ for } x \le M_{n+1}.$$
So for $f \in \Lambda$,
$$\mu(\varphi(|f|)) = \sum_{n=0}^{\infty} \mu\big(\varphi(|f|)\, 1_{(M_n, M_{n+1}]}(|f|)\big) \le \sum_{n=0}^{\infty} (n + 1)\, \mu\big(|f|\, 1_{(M_n, M_{n+1}]}(|f|)\big) \le \sum_{n=0}^{\infty} (n + 1)\, \mu\big(|f|\, 1_{|f| \ge M_n}\big) \le \sum_{n=0}^{\infty} (n + 1)\, \varepsilon_{M_n}$$
and hence
$$\sup_{f \in \Lambda} \mu(\varphi(|f|)) \le \sum_{n=0}^{\infty} (n + 1)\, \varepsilon_{M_n} < \infty.$$
Theorem 9.37 (Vitali Convergence Theorem). (Folland 6.15) Suppose that $1 \le p < \infty$. A sequence $\{f_n\} \subset L^p$ is Cauchy iff
(1) $\{f_n\}$ is $L^0$-Cauchy,
(2) $\{|f_n|^p\}$ is uniformly integrable, and
(3) for all $\varepsilon > 0$, there exists a set $E \in \mathcal{M}$ such that $\mu(E) < \infty$ and $\int_{E^c} |f_n|^p\, d\mu < \varepsilon$ for all $n$. (This condition is vacuous when $\mu(X) < \infty$.)
Proof. ($\Leftarrow$) Suppose $\{f_n\} \subset L^p$ is Cauchy. Then (1) $\{f_n\}$ is $L^0$-Cauchy by Lemma 9.14. (2) By completeness of $L^p$, there exists $f \in L^p$ such that $\|f_n - f\|_p \to 0$ as $n \to \infty$. By the mean value theorem,
$$\big||f|^p - |f_n|^p\big| \le p\,(\max(|f|, |f_n|))^{p-1} \big||f| - |f_n|\big| \le p\,(|f| + |f_n|)^{p-1} \big||f| - |f_n|\big|$$
and therefore by Hölder's inequality,
$$\int \big||f|^p - |f_n|^p\big|\, d\mu \le p \int (|f| + |f_n|)^{p-1} \big||f| - |f_n|\big|\, d\mu \le p \int (|f| + |f_n|)^{p-1} |f - f_n|\, d\mu$$
$$\le p\, \|f - f_n\|_p\, \big\|(|f| + |f_n|)^{p-1}\big\|_q = p\, \big\||f| + |f_n|\big\|_p^{p/q}\, \|f - f_n\|_p \le p\,(\|f\|_p + \|f_n\|_p)^{p/q}\, \|f - f_n\|_p$$
where $q := p/(p-1)$. This shows that $\int \big||f|^p - |f_n|^p\big|\, d\mu \to 0$ as $n \to \infty$.${}^{19}$ By the remarks prior to Definition 9.32, $\{|f_n|^p\}$ is uniformly integrable.
[${}^{19}$ Here is an alternative proof. Let $h_n \equiv \big||f_n|^p - |f|^p\big| \le |f_n|^p + |f|^p =: g_n \in L^1$ and $g \equiv 2|f|^p$. Then $g_n \xrightarrow{\mu} g$, $h_n \xrightarrow{\mu} 0$ and $\int g_n \to \int g$. Therefore by the dominated convergence theorem in Corollary 9.17, $\lim_{n\to\infty} \int h_n\, d\mu = 0$.]
To verify (3), for $M > 0$ and $n \in \mathbb{N}$ let $E_M = \{|f| \ge M\}$ and $E_M(n) = \{|f_n| \ge M\}$. Then $\mu(E_M) \le \frac{1}{M^p}\|f\|_p^p < \infty$ and by the dominated convergence theorem,
$$\int_{E_M^c} |f|^p\, d\mu = \int |f|^p\, 1_{|f| < M}\, d\mu \to 0 \ \text{ as } M \to 0.$$
Moreover,
(9.37) $\quad \big\|f_n 1_{E_M^c}\big\|_p \le \big\|f 1_{E_M^c}\big\|_p + \big\|(f_n - f) 1_{E_M^c}\big\|_p \le \big\|f 1_{E_M^c}\big\|_p + \|f_n - f\|_p.$
So given $\varepsilon > 0$, choose $N$ sufficiently large such that for all $n \ge N$, $\|f - f_n\|_p^p < \varepsilon$. Then choose $M$ sufficiently small such that $\int_{E_M^c} |f|^p\, d\mu < \varepsilon$ and $\int_{E_M(n)^c} |f_n|^p\, d\mu < \varepsilon$ for all $n = 1, 2, \dots, N - 1$. Letting $E \equiv E_M \cup E_M(1) \cup \dots \cup E_M(N-1)$, we have $\mu(E) < \infty$,
$$\int_{E^c} |f_n|^p\, d\mu < \varepsilon \ \text{ for } n \le N - 1$$
and by Eq. (9.37)
$$\int_{E^c} |f_n|^p\, d\mu < \big(\varepsilon^{1/p} + \varepsilon^{1/p}\big)^p \le 2^p \varepsilon \ \text{ for } n \ge N.$$
Therefore we have found $E \in \mathcal{M}$ such that $\mu(E) < \infty$ and
$$\sup_n \int_{E^c} |f_n|^p\, d\mu \le 2^p \varepsilon$$
which verifies (3) since $\varepsilon > 0$ was arbitrary.
($\Rightarrow$) Now suppose $\{f_n\} \subset L^p$ satisfies conditions (1)-(3). Let $\varepsilon > 0$, $E$ be as in (3) and
$$A_{mn} \equiv \{x \in E : |f_m(x) - f_n(x)| \ge \varepsilon\}.$$
Then
$$\|(f_n - f_m)\, 1_{E^c}\|_p \le \|f_n 1_{E^c}\|_p + \|f_m 1_{E^c}\|_p < 2\varepsilon^{1/p}$$
and
(9.38) $\quad \|f_n - f_m\|_p \le \|(f_n - f_m) 1_{E^c}\|_p + \|(f_n - f_m) 1_{E \setminus A_{mn}}\|_p + \|(f_n - f_m) 1_{A_{mn}}\|_p \le \|(f_n - f_m) 1_{E \setminus A_{mn}}\|_p + \|(f_n - f_m) 1_{A_{mn}}\|_p + 2\varepsilon^{1/p}.$
Using properties (1) and (3) and $1_{E \cap \{|f_m - f_n| < \varepsilon\}} |f_m - f_n|^p \le \varepsilon^p 1_E \in L^1$, the dominated convergence theorem in Corollary 9.17 implies
$$\|(f_n - f_m)\, 1_{E \setminus A_{mn}}\|_p^p = \int 1_{E \cap \{|f_m - f_n| < \varepsilon\}} |f_m - f_n|^p\, d\mu \xrightarrow[m,n\to\infty]{} 0,$$
which combined with Eq. (9.38) implies
$$\limsup_{m,n\to\infty} \|f_n - f_m\|_p \le \limsup_{m,n\to\infty} \|(f_n - f_m) 1_{A_{mn}}\|_p + 2\varepsilon^{1/p}.$$
Finally
$$\|(f_n - f_m) 1_{A_{mn}}\|_p \le \|f_n 1_{A_{mn}}\|_p + \|f_m 1_{A_{mn}}\|_p \le 2\delta(\varepsilon)$$
where
$$\delta(\varepsilon) \equiv \sup_n \sup\{\|f_n 1_E\|_p : E \in \mathcal{M} \ni \mu(E) \le \varepsilon\}.$$
By property (2), $\delta(\varepsilon) \to 0$ as $\varepsilon \to 0$. Therefore
$$\limsup_{m,n\to\infty} \|f_n - f_m\|_p \le 2\varepsilon^{1/p} + 0 + 2\delta(\varepsilon) \to 0 \ \text{ as } \varepsilon \to 0$$
and therefore $\{f_n\}$ is $L^p$-Cauchy.
Here is another version of Vitali's Convergence Theorem.
Theorem 9.38 (Vitali Convergence Theorem). (This is problem 9 on p. 133 in Rudin.) Assume that $\mu(X) < \infty$, $\{f_n\}$ is uniformly integrable, $f_n \to f$ a.e. and $|f| < \infty$ a.e., then $f \in L^1(\mu)$ and $f_n \to f$ in $L^1(\mu)$.
Proof. Let $\varepsilon > 0$ be given and choose $\delta > 0$ as in Eq. (9.32). Now use Egorov's Theorem 9.18 to choose a set $E$ such that $\{f_n\}$ converges uniformly on $E^c$ and $\mu(E) < \delta$. By uniform convergence on $E^c$, there is an integer $N < \infty$ such that $|f_n - f_m| \le 1$ on $E^c$ for all $m, n \ge N$. Letting $m \to \infty$, we learn that
$$|f_N - f| \le 1 \ \text{ on } E^c.$$
Therefore $|f| \le |f_N| + 1$ on $E^c$ and hence
$$\mu(|f|) = \mu(|f| : E^c) + \mu(|f| : E) \le \mu(|f_N|) + \mu(X) + \mu(|f| : E).$$
Now by Fatou's lemma,
$$\mu(|f| : E) \le \liminf_{n\to\infty} \mu(|f_n| : E) \le 2\varepsilon < \infty$$
by Eq. (9.32). This shows that $f \in L^1$. Finally
$$\mu(|f - f_n|) = \mu(|f - f_n| : E^c) + \mu(|f - f_n| : E) \le \mu(|f - f_n| : E^c) + \mu(|f| + |f_n| : E) \le \mu(|f - f_n| : E^c) + 4\varepsilon$$
and so by the dominated convergence theorem we learn that
$$\limsup_{n\to\infty} \mu(|f - f_n|) \le 4\varepsilon.$$
Since $\varepsilon > 0$ was arbitrary this completes the proof.
Theorem 9.39 (Vitali again). Suppose that $f_n \to f$ in $\mu$-measure and Eq. (9.34) holds, then $f_n \to f$ in $L^1$.
Proof. This could of course be proved using 9.38 after passing to subsequences to get $\{f_n\}$ to converge a.s. However I wish to give another proof.
First off, by Fatou's lemma, $f \in L^1(\mu)$. Now let
$$\varphi_K(x) = x\, 1_{|x| \le K} + K\, 1_{|x| > K}.$$
Then $\varphi_K(f_n) \xrightarrow{\mu} \varphi_K(f)$ because $|\varphi_K(f) - \varphi_K(f_n)| \le |f - f_n|$ and since
$$|f - f_n| \le |f - \varphi_K(f)| + |\varphi_K(f) - \varphi_K(f_n)| + |\varphi_K(f_n) - f_n|$$
we have that
$$\mu|f - f_n| \le \mu|f - \varphi_K(f)| + \mu|\varphi_K(f) - \varphi_K(f_n)| + \mu|\varphi_K(f_n) - f_n| = \mu(|f| : |f| \ge K) + \mu|\varphi_K(f) - \varphi_K(f_n)| + \mu(|f_n| : |f_n| \ge K).$$
Therefore by the dominated convergence theorem
$$\limsup_{n\to\infty} \mu|f - f_n| \le \mu(|f| : |f| \ge K) + \limsup_{n\to\infty} \mu(|f_n| : |f_n| \ge K).$$
This last expression goes to zero as $K \to \infty$ by uniform integrability.
9.6. Exercises.
Definition 9.40. The essential range of $f$, $\operatorname{essran}(f)$, consists of those $\lambda \in \mathbb{C}$ such that $\mu(|f - \lambda| < \varepsilon) > 0$ for all $\varepsilon > 0$.
Definition 9.41. Let $(X, \tau)$ be a topological space and $\mu$ be a measure on $\mathcal{B}_X = \sigma(\tau)$. The support of $\mu$, $\operatorname{supp}(\mu)$, consists of those $x \in X$ such that $\mu(V) > 0$ for all open neighborhoods, $V$, of $x$.
Exercise 9.3. Let $(X, \tau)$ be a second countable topological space and $\mu$ be a measure on $\mathcal{B}_X$, the Borel $\sigma$-algebra on $X$. Show
(1) $\operatorname{supp}(\mu)$ is a closed set. (This is true on all topological spaces.)
(2) $\mu(X \setminus \operatorname{supp}(\mu)) = 0$ and use this to conclude that $W := X \setminus \operatorname{supp}(\mu)$ is the largest open set in $X$ such that $\mu(W) = 0$. Hint: let $\mathcal{U} \subset \tau$ be a countable base for the topology $\tau$. Show that $W$ may be written as a union of elements $V \in \mathcal{U}$ with the property that $\mu(V) = 0$.
Exercise 9.4. Prove the following facts about $\operatorname{essran}(f)$.
(1) Let $\nu = f_*\mu := \mu \circ f^{-1}$, a Borel measure on $\mathbb{C}$. Show $\operatorname{essran}(f) = \operatorname{supp}(\nu)$.
(2) $\operatorname{essran}(f)$ is a closed set and $f(x) \in \operatorname{essran}(f)$ for almost every $x$, i.e. $\mu(f \notin \operatorname{essran}(f)) = 0$.
(3) If $F \subset \mathbb{C}$ is a closed set such that $f(x) \in F$ for almost every $x$ then $\operatorname{essran}(f) \subset F$. So $\operatorname{essran}(f)$ is the smallest closed set $F$ such that $f(x) \in F$ for almost every $x$.
(4) $\|f\|_\infty = \sup\{|\lambda| : \lambda \in \operatorname{essran}(f)\}$.
Exercise 9.5. Let $f \in L^p \cap L^\infty$ for some $p < \infty$. Show $\|f\|_\infty = \lim_{q\to\infty} \|f\|_q$. If we further assume $\mu(X) < \infty$, show $\|f\|_\infty = \lim_{q\to\infty} \|f\|_q$ for all measurable functions $f : X \to \mathbb{C}$. In particular, $f \in L^\infty$ iff $\lim_{q\to\infty} \|f\|_q < \infty$.
Exercise 9.6. Prove Eq. (9.20) in Corollary 9.23. (Part of Folland 6.3 on p. 186.) Hint: Use Lemma 2.27 applied to the right side of Eq. (9.19).
Exercise 9.7. Complete the proof of Proposition 9.22 by showing $(L^p + L^r, \|\cdot\|)$ is a Banach space. (Part of Folland 6.4 on p. 186.)
Exercise 9.8. Folland 6.5 on p. 186.
Exercise 9.9. Folland 6.6 on p. 186.
Exercise 9.10. Folland 6.9 on p. 186.
Exercise 9.11. Folland 6.10 on p. 186. Use the strong form of Theorem 7.38.
Exercise 9.12. Let $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ be $\sigma$-finite measure spaces, $f \in L^2(\nu)$ and $k \in L^2(\mu \otimes \nu)$. Show
$$\int |k(x, y) f(y)|\, d\nu(y) < \infty \ \text{ for } \mu\text{-a.e. } x.$$
Let $Kf(x) := \int_Y k(x, y) f(y)\, d\nu(y)$ when the integral is defined. Show $Kf \in L^2(\mu)$ and $K : L^2(\nu) \to L^2(\mu)$ is a bounded operator with $\|K\|_{op} \le \|k\|_{L^2(\mu \otimes \nu)}$.
Exercise 9.13. Folland 6.27 on p. 196.
Exercise 9.14. Folland 2.32 on p. 63.
Exercise 9.15. Folland 2.38 on p. 63.
10. Locally Compact Hausdorff Spaces
In this section $X$ will always be a topological space with topology $\tau$. We are now interested in restrictions on $\tau$ in order to insure there are plenty of continuous functions. One such restriction is to assume $\tau = \tau_d$ is the topology induced from a metric on $X$. The following two results show that $(X, \tau_d)$ has lots of continuous functions. Recall for $A \subset X$, $d_A(x) = \inf\{d(x, y) : y \in A\}$.
Lemma 10.1 (Urysohn's Lemma for Metric Spaces). Let $(X, d)$ be a metric space, $V \subset_o X$ and $F \,@\, X$ such that $F \subset V$. Then
(10.1) $\quad f(x) = \dfrac{d_{V^c}(x)}{d_F(x) + d_{V^c}(x)} \quad \text{for } x \in X$
defines a continuous function, $f : X \to [0, 1]$, such that $f(x) = 1$ for $x \in F$ and $f(x) = 0$ if $x \notin V$. (This may also be stated as follows. Let $A$ ($A = F$) and $B$ ($B = V^c$) be two disjoint closed subsets of $X$, then there exists $f \in C(X, [0, 1])$ such that $f = 1$ on $A$ and $f = 0$ on $B$.)
Proof. By Lemma 3.5, $d_F$ and $d_{V^c}$ are continuous functions on $X$. Since $F$ and $V^c$ are closed, $d_F(x) > 0$ if $x \notin F$ and $d_{V^c}(x) > 0$ if $x \in V$. Since $F \cap V^c = \emptyset$, $d_F(x) + d_{V^c}(x) > 0$ for all $x$ and $(d_F + d_{V^c})^{-1}$ is continuous as well. The remaining assertions about $f$ are all easy to verify.
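Not part of the original notes: a direct implementation of formula (10.1) in the plane, with $F$ a closed ball inside an open ball $V$; the distance functions are exact for balls and the names are mine.

    import numpy as np

    F_center, F_rad = np.array([0.0, 0.0]), 1.0      # F = closed ball of radius 1
    V_center, V_rad = np.array([0.0, 0.0]), 2.0      # V = open ball of radius 2, F in V

    def d_F(x):                                       # dist(x, F), exact for a ball
        return max(np.linalg.norm(x - F_center) - F_rad, 0.0)

    def d_Vc(x):                                      # dist(x, V^c), exact for a ball
        return max(V_rad - np.linalg.norm(x - V_center), 0.0)

    def f(x):                                         # Urysohn function of Eq. (10.1)
        return d_Vc(x) / (d_F(x) + d_Vc(x))

    assert f(np.array([0.5, 0.0])) == 1.0             # f = 1 on F
    assert f(np.array([3.0, 0.0])) == 0.0             # f = 0 off V
    assert 0.0 < f(np.array([1.5, 0.0])) < 1.0        # f interpolates in between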
Theorem 10.2 (Metric Space Tietze Extension Theorem). Let $(X, d)$ be a metric space, $D$ be a closed subset of $X$, $-\infty < a < b < \infty$ and $f \in C(D, [a, b])$. (Here we are viewing $D$ as a topological space with the relative topology, $\tau_D$, see Definition 3.17.) Then there exists $F \in C(X, [a, b])$ such that $F|_D = f$.
Proof.
(1) By scaling and translation (i.e. by replacing $f$ by $\frac{f - a}{b - a}$), it suffices to prove Theorem 10.2 with $a = 0$ and $b = 1$.
(2) Suppose $\alpha \in (0, 1]$ and $f : D \to [0, \alpha]$ is a continuous function. Let $A := f^{-1}([0, \frac{\alpha}{3}])$ and $B := f^{-1}([\frac{2\alpha}{3}, 1])$. By Lemma 10.1 there exists a function $\tilde g \in C(X, [0, 1])$ such that $\tilde g = 0$ on $A$ and $\tilde g = 1$ on $B$. Letting $g := \frac{\alpha}{3} \tilde g$, we have $g \in C(X, [0, \alpha/3])$ such that $g = 0$ on $A$ and $g = \alpha/3$ on $B$. Further notice that
$$0 \le f(x) - g(x) \le \tfrac{2}{3}\alpha \quad \text{for all } x \in D.$$
(3) Now suppose $f : D \to [0, 1]$ is a continuous function as in step 1. Let $g_1 \in C(X, [0, 1/3])$ be as in step 2. with $\alpha = 1$ and let $f_1 := f - g_1|_D \in C(D, [0, 2/3])$. Apply step 2. with $\alpha = 2/3$ and $f = f_1$ to find $g_2 \in C(X, [0, \frac{1}{3}\frac{2}{3}])$ such that $f_2 := f - (g_1 + g_2)|_D \in C(D, [0, (\frac{2}{3})^2])$. Continue this way inductively to find $g_n \in C(X, [0, \frac{1}{3}(\frac{2}{3})^{n-1}])$ such that
(10.2) $\quad f - \sum_{n=1}^N g_n\Big|_D =: f_N \in C\big(D, [0, (\tfrac{2}{3})^N]\big).$
(4) Define $F := \sum_{n=1}^\infty g_n$. Since
$$\sum_{n=1}^{\infty} \|g_n\|_u \le \sum_{n=1}^{\infty} \frac{1}{3}\Big(\frac{2}{3}\Big)^{n-1} = \frac{1}{3}\cdot\frac{1}{1 - \frac{2}{3}} = 1,$$
the series defining $F$ is uniformly convergent so $F \in C(X, [0, 1])$. Passing to the limit in Eq. (10.2) shows $f = F|_D$.
The main thrust of this section is to study locally compact (and $\sigma$-compact) Hausdorff spaces as defined below. We will see again that this class of topological spaces has an ample supply of continuous functions. We will start out with the notion of a Hausdorff topology. The following example shows a pathology which occurs when there are not enough open sets in a topology.
Example 10.3. Let $X = \{1, 2, 3\}$ and $\tau = \{X, \emptyset, \{1, 2\}, \{2, 3\}, \{2\}\}$ and $x_n = 2$ for all $n$. Then $x_n \to x$ for every $x \in X$!
Definition 10.4 (Hausdorff Topology). A topological space, $(X, \tau)$, is Hausdorff if for each pair of distinct points, $x, y \in X$, there exist disjoint open neighborhoods, $U$ and $V$, of $x$ and $y$ respectively. (Metric spaces are typical examples of Hausdorff spaces.)
Remark 10.5. When $\tau$ is Hausdorff the pathologies appearing in Example 10.3 do not occur. Indeed if $x_n \to x \in X$ and $y \in X \setminus \{x\}$ we may choose $V \in \tau_x$ and $W \in \tau_y$ such that $V \cap W = \emptyset$. Then $x_n \in V$ a.a. implies $x_n \notin W$ for all but a finite number of $n$ and hence $x_n \nrightarrow y$, so limits are unique.
Proposition 10.6. Suppose that $(X, \tau)$ is a Hausdorff space, $K \,@@\, X$ and $x \in K^c$. Then there exist $U, V \in \tau$ such that $U \cap V = \emptyset$, $x \in U$ and $K \subset V$. In particular $K$ is closed. (So compact subsets of Hausdorff topological spaces are closed.) More generally if $K$ and $F$ are two disjoint compact subsets of $X$, there exist disjoint open sets $U, V \in \tau$ such that $K \subset V$ and $F \subset U$.
Proof. Because $X$ is Hausdorff, for all $y \in K$ there exist $V_y \in \tau_y$ and $U_y \in \tau_x$ such that $V_y \cap U_y = \emptyset$. The cover $\{V_y\}_{y \in K}$ of $K$ has a finite subcover, $\{V_y\}_{y \in \Lambda}$ for some $\Lambda \subset\subset K$. Let $V = \bigcup_{y \in \Lambda} V_y$ and $U = \bigcap_{y \in \Lambda} U_y$, then $U, V \in \tau$ satisfy $x \in U$, $K \subset V$ and $U \cap V = \emptyset$. This shows that $K^c$ is open and hence that $K$ is closed.
Suppose that $K$ and $F$ are two disjoint compact subsets of $X$. For each $x \in F$ there exist disjoint open sets $U_x$ and $V_x$ such that $K \subset V_x$ and $x \in U_x$. Since $\{U_x\}_{x \in F}$ is an open cover of $F$, there exists a finite subset $\Lambda$ of $F$ such that $F \subset U := \bigcup_{x \in \Lambda} U_x$. The proof is completed by defining $V := \bigcap_{x \in \Lambda} V_x$.
Exercise 10.1. Show any finite set $X$ admits exactly one Hausdorff topology $\tau$.
Exercise 10.2. Let $(X, \tau)$ and $(Y, \tau_Y)$ be topological spaces.
(1) Show $\tau$ is Hausdorff iff $\Delta := \{(x, x) : x \in X\}$ is closed in $X \times X$ equipped with the product topology $\tau \otimes \tau$.
(2) Suppose $\tau$ is Hausdorff and $f, g : Y \to X$ are continuous maps. If $\overline{\{f = g\}} = Y$ then $f = g$. Hint: make use of the map $f \times g : Y \to X \times X$ defined by $(f \times g)(y) = (f(y), g(y))$.
Exercise 10.3. Give an example of a topological space which has a non-closed compact subset.
Proposition 10.7. Suppose that $X$ is a compact topological space, $Y$ is a Hausdorff topological space, and $f : X \to Y$ is a continuous bijection; then $f$ is a homeomorphism, i.e. $f^{-1} : Y \to X$ is continuous as well.
Proof. Since closed subsets of compact sets are compact, continuous images of compact subsets are compact and compact subsets of Hausdorff spaces are closed, it follows that $(f^{-1})^{-1}(C) = f(C)$ is closed in $Y$ for all closed subsets $C$ of $X$. Thus $f^{-1}$ is continuous.
Definition 10.8 (Local and $\sigma$-compactness). Let $(X, \tau)$ be a topological space.
(1) $(X, \tau)$ is locally compact if for all $x \in X$ there exists an open neighborhood $V \subset X$ of $x$ such that $\bar V$ is compact. (Alternatively, in light of Definition 3.19, this is equivalent to requiring that to each $x \in X$ there exists a compact neighborhood $N_x$ of $x$.)
(2) $(X, \tau)$ is $\sigma$-compact if there exist compact sets $K_n \subset X$ such that $X = \bigcup_{n=1}^\infty K_n$. (Notice that we may assume, by replacing $K_n$ by $K_1 \cup K_2 \cup \dots \cup K_n$ if necessary, that $K_n \uparrow X$.)
Example 10.9. Any open subset $X \subset \mathbb{R}^n$ is a locally compact and $\sigma$-compact metric space (and hence Hausdorff). The proof of local compactness is easy and is left to the reader. To see that $X$ is $\sigma$-compact, for $k \in \mathbb{N}$, let
$$K_k := \{x \in X : |x| \le k \text{ and } d_{X^c}(x) \ge 1/k\}.$$
Then $K_k$ is a closed and bounded subset of $\mathbb{R}^n$ and hence compact. Moreover $K_k^o \uparrow X$ as $k \to \infty$ since${}^{20}$
$$K_k^o \supset \{x \in X : |x| < k \text{ and } d_{X^c}(x) > 1/k\} \uparrow X \ \text{ as } k \to \infty.$$
[${}^{20}$ In fact this is an equality, but we will not need this here.]
Exercise 10.4. Every separable locally compact metric space is $\sigma$-compact. Hint: Let $\{x_n\}_{n=1}^\infty \subset X$ be a countable dense subset of $X$ and define
$$\varepsilon_n = \tfrac{1}{2}\sup\{\varepsilon > 0 : C_{x_n}(\varepsilon) \text{ is compact}\} \wedge 1.$$
Exercise 10.5. Every $\sigma$-compact metric space is separable. Therefore a locally compact metric space is separable iff it is $\sigma$-compact.
Exercise 10.6. Suppose that $(X, d)$ is a metric space and $U \subset X$ is an open subset.
(1) If $X$ is locally compact then $(U, d)$ is locally compact.
(2) If $X$ is $\sigma$-compact then $(U, d)$ is $\sigma$-compact. Hint: Mimic Example 10.9, replacing $C_0(k)$ by compact sets $K_k \,@@\, X$ such that $K_k \uparrow X$.
Lemma 10.10. Let $(X, \tau)$ be a locally compact and $\sigma$-compact topological space. Then there exist compact sets $K_n \uparrow X$ such that $K_n \subset K_{n+1}^o \subset K_{n+1}$ for all $n$.
Proof. Suppose that $C \subset X$ is a compact set. For each $x \in C$ let $V_x \subset_o X$ be an open neighborhood of $x$ such that $\bar V_x$ is compact. Then $C \subset \bigcup_{x \in C} V_x$ so there exists $\Lambda \subset\subset C$ such that
$$C \subset \bigcup_{x \in \Lambda} V_x \subset \bigcup_{x \in \Lambda} \bar V_x =: K.$$
Then $K$ is a compact set, being a finite union of compact subsets of $X$, and $C \subset \bigcup_{x \in \Lambda} V_x \subset K^o$.
Now let $C_n \subset X$ be compact sets such that $C_n \uparrow X$ as $n \to \infty$. Let $K_1 = C_1$ and then choose a compact set $K_2$ such that $K_1 \cup C_2 \subset K_2^o$. Similarly, choose a compact set $K_3$ such that $K_2 \cup C_3 \subset K_3^o$ and continue inductively to find compact sets $K_n$ such that $K_n \cup C_{n+1} \subset K_{n+1}^o$ for all $n$. Then $\{K_n\}_{n=1}^\infty$ is the desired sequence.
Remark 10.11. Lemma 10.10 may also be stated as saying there exist precompact open sets $\{G_n\}_{n=1}^\infty$ such that $G_n \subset \bar G_n \subset G_{n+1}$ for all $n$ and $G_n \uparrow X$ as $n \to \infty$. Indeed if $\{G_n\}_{n=1}^\infty$ are as above, let $K_n := \bar G_n$ and if $\{K_n\}_{n=1}^\infty$ are as in Lemma 10.10, let $G_n := K_n^o$.
The following result is a Corollary of Lemma 10.10 and Theorem 3.59.
Corollary 10.12 (Locally compact form of Ascoli-Arzela Theorem). Let $(X, \tau)$ be a locally compact and $\sigma$-compact topological space and $\{f_m\} \subset C(X)$ be a pointwise bounded sequence of functions such that $\{f_m|_K\}$ is equicontinuous for any compact subset $K \subset X$. Then there exists a subsequence $\{m_n\} \subset \{m\}$ such that $\{g_n := f_{m_n}\}_{n=1}^\infty \subset C(X)$ is a sequence which is uniformly convergent on compact subsets of $X$.
Proof. Let $\{K_n\}_{n=1}^\infty$ be the compact subsets of $X$ constructed in Lemma 10.10. We may now apply Theorem 3.59 repeatedly to find a nested family of subsequences
$$\{f_m\} \supset \{g_m^1\} \supset \{g_m^2\} \supset \{g_m^3\} \supset \dots$$
such that the sequence $\{g_m^n\}_{m=1}^\infty \subset C(X)$ is uniformly convergent on $K_n$. Using Cantor's trick, define the subsequence $\{h_n\}$ of $\{f_m\}$ by $h_n \equiv g_n^n$. Then $\{h_n\}$ is uniformly convergent on $K_l$ for each $l \in \mathbb{N}$. Now if $K \subset X$ is an arbitrary compact set, there exists $l < \infty$ such that $K \subset K_l^o \subset K_l$ and therefore $\{h_n\}$ is uniformly convergent on $K$ as well.
The next two results show that locally compact Hausdorff spaces have plenty of open sets and plenty of continuous functions.
Proposition 10.13. Suppose $X$ is a locally compact Hausdorff space, $U \subset_o X$ and $K \,@@\, U$. Then there exists $V \subset_o X$ such that $K \subset V \subset \bar V \subset U \subset X$ and $\bar V$ is compact.
Proof. By local compactness, for all $x \in K$, there exists $U_x \in \tau_x$ such that $\bar U_x$ is compact. Since $K$ is compact, there exists $\Lambda \subset\subset K$ such that $\{U_x\}_{x \in \Lambda}$ is a cover of $K$. The set $O = U \cap (\bigcup_{x \in \Lambda} U_x)$ is an open set such that $K \subset O \subset U$ and $O$ is precompact since $\bar O$ is a closed subset of the compact set $\bigcup_{x \in \Lambda} \bar U_x$. ($\bigcup_{x \in \Lambda} \bar U_x$ is compact because it is a finite union of compact sets.) So by replacing $U$ by $O$ if necessary, we may assume that $\bar U$ is compact.
Since $\bar U$ is compact and $\partial U = \bar U \cap U^c$ is a closed subset of $\bar U$, $\partial U$ is compact. Because $\partial U \subset U^c$, it follows that $\partial U \cap K = \emptyset$, so by Proposition 10.6, there exist disjoint open sets $V$ and $W$ such that $K \subset V$ and $\partial U \subset W$. By replacing $V$ by $V \cap U$ if necessary we may further assume that $K \subset V \subset U$, see Figure 19. Because $\bar U \cap W^c$ is a closed set containing $V$ and $U^c \cap \bar U \cap W^c = \partial U \cap W^c = \emptyset$,
$$\bar V \subset \bar U \cap W^c = U \cap W^c \subset U \subset \bar U.$$
Since $\bar U$ is compact it follows that $\bar V$ is compact and the proof is complete.
Exercise 10.7. Give a simpler proof of Proposition 10.13 under the additional assumption that $X$ is a metric space. Hint: show for each $x \in K$ there exists $V_x := B_x(\varepsilon_x)$ with $\varepsilon_x > 0$ such that $B_x(\varepsilon_x) \subset C_x(\varepsilon_x) \subset U$ with $C_x(\varepsilon_x)$ being compact. Recall that $C_x(\varepsilon)$ is the closed ball of radius $\varepsilon$ about $x$.
Definition 10.14. Let $U$ be an open subset of a topological space $(X, \tau)$. We will write $f \prec U$ to mean a function $f \in C_c(X, [0, 1])$ such that $\operatorname{supp}(f) := \overline{\{f \ne 0\}} \subset U$.
[Figure 19. The construction of V.]
Lemma 10.15 (Locally Compact Version of Urysohn's Lemma). Let $X$ be a locally compact Hausdorff space and $K \,@@\, U \subset_o X$. Then there exists $f \prec U$ such that $f = 1$ on $K$. In particular, if $K$ is compact and $C$ is closed in $X$ such that $K \cap C = \emptyset$, there exists $f \in C_c(X, [0, 1])$ such that $f = 1$ on $K$ and $f = 0$ on $C$.
Proof. For notational ease later it is more convenient to construct $g := 1 - f$ rather than $f$. To motivate the proof, suppose $g \in C(X, [0, 1])$ is such that $g = 0$ on $K$ and $g = 1$ on $U^c$. For $r > 0$, let $U_r = \{g < r\}$. Then for $0 < r < s \le 1$, $U_r \subset \{g \le r\} \subset U_s$ and since $\{g \le r\}$ is closed this implies
$$K \subset U_r \subset \bar U_r \subset \{g \le r\} \subset U_s \subset U.$$
Therefore associated to the function $g$ is the collection of open sets $\{U_r\}_{r>0} \subset \tau$ with the property that $K \subset U_r \subset \bar U_r \subset U_s \subset U$ for all $0 < r < s \le 1$ and $U_r = X$ if $r > 1$. Finally let us notice that we may recover the function $g$ from the sequence $\{U_r\}_{r>0}$ by the formula
(10.3) $\quad g(x) = \inf\{r > 0 : x \in U_r\}.$
The idea of the proof to follow is to turn these remarks around and define $g$ by Eq. (10.3).
Step 1. (Construction of the $U_r$.) Let
$$D \equiv \{k 2^{-n} : k = 1, 2, \dots, 2^n,\ n = 1, 2, \dots\}$$
be the dyadic rationals in $(0, 1]$. Use Proposition 10.13 to find a precompact open set $U_1$ such that $K \subset U_1 \subset \bar U_1 \subset U$. Apply Proposition 10.13 again to construct an open set $U_{1/2}$ such that
$$K \subset U_{1/2} \subset \bar U_{1/2} \subset U_1$$
and similarly use Proposition 10.13 to find open sets $U_{1/4}, U_{3/4} \subset_o X$ such that
$$K \subset U_{1/4} \subset \bar U_{1/4} \subset U_{1/2} \subset \bar U_{1/2} \subset U_{3/4} \subset \bar U_{3/4} \subset U_1.$$
Likewise there exist open sets $U_{1/8}, U_{3/8}, U_{5/8}, U_{7/8}$ such that
$$K \subset U_{1/8} \subset \bar U_{1/8} \subset U_{1/4} \subset \bar U_{1/4} \subset U_{3/8} \subset \bar U_{3/8} \subset U_{1/2} \subset \bar U_{1/2} \subset U_{5/8} \subset \bar U_{5/8} \subset U_{3/4} \subset \bar U_{3/4} \subset U_{7/8} \subset \bar U_{7/8} \subset U_1.$$
Continuing this way inductively, one shows there exist precompact open sets $\{U_r\}_{r \in D}$ such that
$$K \subset U_r \subset \bar U_r \subset U_s \subset U_1 \subset \bar U_1 \subset U$$
for all $r, s \in D$ with $0 < r < s \le 1$.
Step 2. Let $U_r \equiv X$ if $r > 1$ and define
$$g(x) = \inf\{r \in D \cup (1, \infty) : x \in U_r\},$$
see Figure 20. [Figure 20. Determining $g$ from $\{U_r\}$.] Then $g(x) \in [0, 1]$ for all $x \in X$, $g(x) = 0$ for $x \in K$ since $x \in K \subset U_r$ for all $r \in D$. If $x \in U_1^c$, then $x \notin U_r$ for all $r \in D$ and hence $g(x) = 1$. Therefore $f := 1 - g$ is a function such that $f = 1$ on $K$ and $\{f \ne 0\} = \{g \ne 1\} \subset U_1 \subset \bar U_1 \subset U$ so that $\operatorname{supp}(f) = \overline{\{f \ne 0\}} \subset \bar U_1 \subset U$ is a compact subset of $U$. Thus it only remains to show $f$, or equivalently $g$, is continuous.
Since $\mathcal{E} = \{(\alpha, \infty), (-\infty, \alpha) : \alpha \in \mathbb{R}\}$ generates the standard topology on $\mathbb{R}$, to prove $g$ is continuous it suffices to show $\{g < \alpha\}$ and $\{g > \alpha\}$ are open sets for all $\alpha \in \mathbb{R}$. But $g(x) < \alpha$ iff there exists $r \in D \cup (1, \infty)$ with $r < \alpha$ such that $x \in U_r$. Therefore
$$\{g < \alpha\} = \bigcup\{U_r : r \in D \cup (1, \infty) \ni r < \alpha\}$$
which is open in $X$. If $\alpha \ge 1$, $\{g > \alpha\} = \emptyset$ and if $\alpha < 0$, $\{g > \alpha\} = X$. If $\alpha \in (0, 1)$, then $g(x) > \alpha$ iff there exists $r \in D$ such that $r > \alpha$ and $x \notin U_r$. Now if $r > \alpha$ and $x \notin U_r$ then for $s \in D \cap (\alpha, r)$, $x \notin \bar U_s \subset U_r$. Thus we have shown that
$$\{g > \alpha\} = \bigcup\big\{\big(\bar U_s\big)^c : s \in D \ni s > \alpha\big\}$$
which is again an open subset of $X$.
Exercise 10.8. Give a simpler proof of Lemma 10.15 under the additional assumption that $X$ is a metric space.
Theorem 10.16 (Locally Compact Tietze Extension Theorem). Let $(X, \tau)$ be a locally compact Hausdorff space, $K \,@@\, U \subset_o X$, $f \in C(K, \mathbb{R})$, $a = \min f(K)$ and $b = \max f(K)$. Then there exists $F \in C(X, [a, b])$ such that $F|_K = f$. Moreover given $c \in [a, b]$, $F$ can be chosen so that $\operatorname{supp}(F - c) = \overline{\{F \ne c\}} \subset U$.
The proof of this theorem is similar to Theorem 10.2 and will be left to the reader, see Exercise 10.11.
Lemma 10.17. Suppose that $(X, \tau)$ is a locally compact second countable Hausdorff space. (For example any separable locally compact metric space and in particular any open subset of $\mathbb{R}^n$.) Then:
(1) every open subset $U \subset X$ is $\sigma$-compact.
(2) If $F \subset X$ is a closed set, there exist open sets $V_n \subset X$ such that $V_n \downarrow F$ as $n \to \infty$.
(3) To each open set $U \subset X$ there exist $f_n \prec U$ such that $\lim_{n\to\infty} f_n = 1_U$.
(4) The $\sigma$-algebra generated by $C_c(X)$ is the Borel $\sigma$-algebra, $\mathcal{B}_X$.
Proof.
(1) Let $U$ be an open subset of $X$, $\mathcal{V}$ be a countable base for $\tau$ and
$$\mathcal{V}^U := \{W \in \mathcal{V} : \bar W \subset U \text{ and } \bar W \text{ is compact}\}.$$
For each $x \in U$, by Proposition 10.13, there exists an open neighborhood $V$ of $x$ such that $\bar V \subset U$ and $\bar V$ is compact. Since $\mathcal{V}$ is a base for the topology $\tau$, there exists $W \in \mathcal{V}$ such that $x \in W \subset V$. Because $\bar W \subset \bar V$, it follows that $\bar W$ is compact and hence $W \in \mathcal{V}^U$. As $x \in U$ was arbitrary, $U = \bigcup \mathcal{V}^U$.
Let $\{W_n\}_{n=1}^\infty$ be an enumeration of $\mathcal{V}^U$ and set $K_n := \bigcup_{k=1}^n \bar W_k$. Then $K_n \uparrow U$ as $n \to \infty$ and $K_n$ is compact for each $n$.
(2) Let $\{K_n\}_{n=1}^\infty$ be compact subsets of $F^c$ such that $K_n \uparrow F^c$ as $n \to \infty$ and set $V_n := K_n^c = X \setminus K_n$. Then $V_n \downarrow F$ and by Proposition 10.6, $V_n$ is open for each $n$.
(3) Let $U \subset X$ be an open set and $\{K_n\}_{n=1}^\infty$ be compact subsets of $U$ such that $K_n \uparrow U$. By Lemma 10.15, there exist $f_n \prec U$ such that $f_n = 1$ on $K_n$. These functions satisfy $1_U = \lim_{n\to\infty} f_n$.
(4) By Item 3., $1_U$ is $\sigma(C_c(X, \mathbb{R}))$-measurable for all $U \in \tau$. Hence $\tau \subset \sigma(C_c(X, \mathbb{R}))$ and therefore $\mathcal{B}_X = \sigma(\tau) \subset \sigma(C_c(X, \mathbb{R}))$. The converse inclusion always holds since continuous functions are always Borel measurable.
Corollary 10.18. Suppose that $(X, \tau)$ is a second countable locally compact Hausdorff space, $\mathcal{B}_X = \sigma(\tau)$ is the Borel $\sigma$-algebra on $X$ and $\mathbb{H}$ is a subspace of $B(X, \mathbb{R})$ which is closed under bounded convergence and contains $C_c(X, \mathbb{R})$. Then $\mathbb{H}$ contains all bounded $\mathcal{B}_X$-measurable real valued functions on $X$.
Proof. Since $\mathbb{H}$ is closed under bounded convergence and $C_c(X, \mathbb{R}) \subset \mathbb{H}$, it follows by Item 3. of Lemma 10.17 that $1_U \in \mathbb{H}$ for all $U \in \tau$. Since $\tau$ is a $\pi$-class the corollary follows by an application of Theorem 8.12.
10.1. Locally compact form of Urysohn Metrization Theorem.
Notation 10.19. Let $Q := [0, 1]^{\mathbb{N}}$ denote the (infinite dimensional) unit cube in $\mathbb{R}^{\mathbb{N}}$. For $a, b \in Q$ let
(10.4) $\quad d(a, b) := \sum_{n=1}^{\infty} \dfrac{1}{2^n} |a_n - b_n|.$
The metric introduced in Exercise 3.27 would be defined, in this context, as $\tilde d(a, b) := \sum_{n=1}^\infty \frac{1}{2^n} \frac{|a_n - b_n|}{1 + |a_n - b_n|}$. Since $1 \le 1 + |a_n - b_n| \le 2$, it follows that $\tilde d \le d \le 2\tilde d$. So the metrics $d$ and $\tilde d$ are equivalent and in particular the topologies induced by $d$ and $\tilde d$ are the same. By Exercise 6.15, the $d$-topology on $Q$ is the same as the product topology and by Exercise 3.27, $(Q, d)$ is a compact metric space.
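Not part of the original notes: a truncated evaluation of the metric (10.4) and of the equivalent metric $\tilde d$, illustrating $\tilde d \le d \le 2\tilde d$; NumPy assumed, truncation level is my choice.

    import numpy as np

    rng = np.random.default_rng(8)
    N = 50                                  # truncation level; the dropped tail is <= 2^-N
    a, b = rng.random(N), rng.random(N)     # first N coordinates of two points of Q
    w = 0.5 ** np.arange(1, N + 1)

    d = np.sum(w * np.abs(a - b))                               # Eq. (10.4), truncated
    d_tilde = np.sum(w * np.abs(a - b) / (1 + np.abs(a - b)))   # Exercise 3.27 metric
    assert d_tilde <= d <= 2 * d_tilde + 1e-12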
Theorem 10.20 (Urysohn Metrization Theorem). Every second countable locally compact Hausdorff space, $(X, \tau)$, is metrizable, i.e. there is a metric $\rho$ on $X$ such that $\tau = \tau_\rho$. Moreover, $\rho$ may be chosen so that $X$ is isometric to a subset $Q_0 \subset Q$ equipped with the metric $d$ in Eq. (10.4). In this metric $X$ is totally bounded and hence the completion of $X$ (which is isometric to $\bar Q_0 \subset Q$) is compact.
Proof. Let $\mathcal{B}$ be a countable base for $\tau$ and set
$$\Gamma \equiv \{(U, V) \in \mathcal{B} \times \mathcal{B} : \bar U \subset V \text{ and } \bar U \text{ is compact}\}.$$
To each $O \in \tau$ and $x \in O$ there exists $(U, V) \in \Gamma$ such that $x \in U \subset V \subset O$. Indeed, since $\mathcal{B}$ is a basis for $\tau$, there exists $V \in \mathcal{B}$ such that $x \in V \subset O$. Now apply Proposition 10.13 to find $U' \subset_o X$ such that $x \in U' \subset \bar U' \subset V$ with $\bar U'$ being compact. Since $\mathcal{B}$ is a basis for $\tau$, there exists $U \in \mathcal{B}$ such that $x \in U \subset U'$ and since $\bar U \subset \bar U'$, $\bar U$ is compact so $(U, V) \in \Gamma$. In particular this shows that $\mathcal{B}_0 := \{U \in \mathcal{B} : (U, V) \in \Gamma \text{ for some } V \in \mathcal{B}\}$ is still a base for $\tau$.
If $\Gamma$ is a finite set, then $\mathcal{B}_0$ is finite and $\tau$ only has a finite number of elements as well. Since $(X, \tau)$ is Hausdorff, it follows that $X$ is a finite set. Letting $\{x_n\}_{n=1}^N$ be an enumeration of $X$, define $T : X \to Q$ by $T(x_n) = e_n$ for $n = 1, 2, \dots, N$ where $e_n = (0, 0, \dots, 0, 1, 0, \dots)$, with the $1$ occurring in the $n$-th spot. Then $\rho(x, y) := d(T(x), T(y))$ for $x, y \in X$ is the desired metric. So we may now assume that $\Gamma$ is an infinite set and let $\{(U_n, V_n)\}_{n=1}^\infty$ be an enumeration of $\Gamma$.
By Urysohn's Lemma 10.15 there exists $f_{U,V} \in C(X, [0, 1])$ such that $f_{U,V} = 0$ on $\bar U$ and $f_{U,V} = 1$ on $V^c$. Let $\mathcal{F} \equiv \{f_{U,V} : (U, V) \in \Gamma\}$ and set $f_n := f_{U_n, V_n}$, an enumeration of $\mathcal{F}$. We will now show that
$$\rho(x, y) := \sum_{n=1}^{\infty} \frac{1}{2^n} |f_n(x) - f_n(y)|$$
is the desired metric on $X$. The proof will involve a number of steps.
(1) ($\rho$ is a metric on $X$.) It is routine to show $\rho$ satisfies the triangle inequality and $\rho$ is symmetric. If $x, y \in X$ are distinct points then there exists $(U_{n_0}, V_{n_0}) \in \Gamma$ such that $x \in U_{n_0}$ and $V_{n_0} \subset O := \{y\}^c$. Since $f_{n_0}(x) = 0$ and $f_{n_0}(y) = 1$, it follows that $\rho(x, y) \ge 2^{-n_0} > 0$.
(2) (Let $\tau_0 = \tau(f_n : n \in \mathbb{N})$, then $\tau = \tau_0 = \tau_\rho$.) As usual we have $\tau_0 \subset \tau$. Since, for each $x \in X$, $y \mapsto \rho(x, y)$ is $\tau_0$-continuous (being the uniformly convergent sum of continuous functions), it follows that $B_x(\varepsilon) := \{y \in X : \rho(x, y) < \varepsilon\} \in \tau_0$ for all $x \in X$ and $\varepsilon > 0$. Thus $\tau_\rho \subset \tau_0 \subset \tau$.
Suppose that $O \in \tau$ and $x \in O$. Let $(U_{n_0}, V_{n_0}) \in \Gamma$ be such that $x \in U_{n_0}$ and $V_{n_0} \subset O$. Then $f_{n_0}(x) = 0$ and $f_{n_0} = 1$ on $O^c$. Therefore if $y \in X$ and $f_{n_0}(y) < 1$, then $y \in O$ so $x \in \{f_{n_0} < 1\} \subset O$. This shows that $O$ may be written as a union of elements from $\tau_0$ and therefore $O \in \tau_0$. So $\tau \subset \tau_0$ and hence $\tau = \tau_0$. Moreover, if $y \in B_x(2^{-n_0})$ then $2^{-n_0} > \rho(x, y) \ge 2^{-n_0} f_{n_0}(y)$ and therefore $x \in B_x(2^{-n_0}) \subset \{f_{n_0} < 1\} \subset O$. This shows $O$ is $\tau_\rho$-open and hence $\tau_0 \subset \tau_\rho$.
(3) ($X$ is isometric to some $Q_0 \subset Q$.) Let $T : X \to Q$ be defined by $T(x) = (f_1(x), f_2(x), \dots, f_n(x), \dots)$. Then $T$ is an isometry by the very definitions of $d$ and $\rho$ and therefore $X$ is isometric to $Q_0 := T(X)$. Since $Q_0$ is a subset of the compact metric space $(Q, d)$, $Q_0$ is totally bounded and therefore $X$ is totally bounded.
10.2. Partitions of Unity.

Definition 10.21. Let $(X,\tau)$ be a topological space and $X_0 \subset X$ be a set. A collection of sets $\{B_\alpha\}_{\alpha \in A} \subset 2^X$ is locally finite on $X_0$ if for all $x \in X_0$, there is an open neighborhood $N_x \in \tau$ of $x$ such that $\#\{\alpha \in A : B_\alpha \cap N_x \neq \emptyset\} < \infty$.

Lemma 10.22. Let $(X,\tau)$ be a locally compact Hausdorff space.
(1) A subset $E \subset X$ is closed iff $E \cap K$ is closed for all $K \subset\subset X$.
(2) Let $\{C_\alpha\}_{\alpha \in A}$ be a locally finite collection of closed subsets of $X$, then $C = \cup_{\alpha \in A} C_\alpha$ is closed in $X$. (Recall that in general closed sets are only closed under finite unions.)

Proof. Item 1. Since compact subsets of Hausdorff spaces are closed, $E \cap K$ is closed if $E$ is closed and $K$ is compact. Now suppose that $E \cap K$ is closed for all compact subsets $K \subset X$ and let $x \in E^c$. Since $X$ is locally compact, there exists a precompact open neighborhood, $V$, of $x$. (Footnote 21: If $X$ were a metric space we could finish the proof as follows. If there does not exist an open neighborhood of $x$ which is disjoint from $E$, then there would exist $x_n \in E$ such that $x_n \to x$. Since $E \cap \bar V$ is closed and $x_n \in E \cap \bar V$ for all large $n$, it follows (see Exercise 3.4) that $x \in E \cap \bar V$ and in particular that $x \in E$. But we chose $x \in E^c$.) By assumption $E \cap \bar V$ is closed so $x \in \bigl(E \cap \bar V\bigr)^c$ -- an open subset of $X$. By Proposition 10.13 there exists an open set $U$ such that $x \in U \subset \bar U \subset \bigl(E \cap \bar V\bigr)^c$, see Figure 21. Let $W := U \cap V$. Since
$$W \cap E = U \cap V \cap E \subset U \cap \bar V \cap E = \emptyset,$$
and $W$ is an open neighborhood of $x$ and $x \in E^c$ was arbitrary, we have shown $E^c$ is open hence $E$ is closed.

Figure 21. Showing $E^c$ is open.

Item 2. Let $K$ be a compact subset of $X$ and for each $x \in K$ let $N_x$ be an open neighborhood of $x$ such that $\#\{\alpha \in A : C_\alpha \cap N_x \neq \emptyset\} < \infty$. Since $K$ is compact, there exists a finite subset $\Lambda \subset K$ such that $K \subset \cup_{x \in \Lambda} N_x$. Letting $\Lambda_0 := \{\alpha \in A : C_\alpha \cap K \neq \emptyset\}$, then
$$\#(\Lambda_0) \le \sum_{x \in \Lambda} \#\{\alpha \in A : C_\alpha \cap N_x \neq \emptyset\} < \infty$$
and hence $K \cap (\cup_{\alpha \in A} C_\alpha) = K \cap (\cup_{\alpha \in \Lambda_0} C_\alpha)$. The set $\cup_{\alpha \in \Lambda_0} C_\alpha$ is a finite union of closed sets and hence closed. Therefore, $K \cap (\cup_{\alpha \in A} C_\alpha)$ is closed and by Item (1) it follows that $\cup_{\alpha \in A} C_\alpha$ is closed as well.

Definition 10.23. Suppose that $\mathcal{U}$ is an open cover of $X_0 \subset X$. A collection $\{\phi_i\}_{i=1}^{N} \subset C(X,[0,1])$ ($N = \infty$ is allowed here) is a partition of unity on $X_0$ subordinate to the cover $\mathcal{U}$ if:
(1) for all $i$ there is a $U \in \mathcal{U}$ such that $\operatorname{supp}(\phi_i) \subset U$,
(2) the collection of sets, $\{\operatorname{supp}(\phi_i)\}_{i=1}^{N}$, is locally finite on $X_0$, and
(3) $\sum_{i=1}^{N} \phi_i = 1$ on $X_0$. (Notice by (2), that for each $x \in X_0$ there is a neighborhood $N_x$ such that $\phi_i|_{N_x}$ is not identically zero for only a finite number of terms. So the sum is well defined and we say the sum is locally finite.)

Proposition 10.24 (Partitions of Unity: The Compact Case). Suppose that $X$ is a locally compact Hausdorff space, $K \subset X$ is a compact set and $\mathcal{U} = \{U_j\}_{j=1}^{n}$ is an open cover of $K$. Then there exists a partition of unity $\{h_j\}_{j=1}^{n}$ of $K$ such that $h_j \prec U_j$ for all $j = 1,2,\dots,n$.

Proof. For all $x \in K$ choose a precompact open neighborhood, $V_x$, of $x$ such that $\bar V_x \subset U_j$ for some $j$. Since $K$ is compact, there exists a finite subset, $\Lambda$, of $K$ such that $K \subset \bigcup_{x \in \Lambda} V_x$. Let
$$F_j = \cup\bigl\{\bar V_x : x \in \Lambda \text{ and } \bar V_x \subset U_j\bigr\}.$$
Then $F_j$ is compact, $F_j \subset U_j$ for all $j$, and $K \subset \cup_{j=1}^{n} F_j$. By Urysohn's Lemma 10.15 there exists $f_j \prec U_j$ such that $f_j = 1$ on $F_j$. We will now give two methods to finish the proof.

Method 1. Let $h_1 = f_1$, $h_2 = f_2(1 - h_1) = f_2(1 - f_1)$,
$$h_3 = f_3(1 - h_1 - h_2) = f_3\bigl(1 - f_1 - (1 - f_1)f_2\bigr) = f_3(1 - f_1)(1 - f_2)$$
and continue on inductively to define
$$(10.5)\qquad h_k = (1 - h_1 - \dots - h_{k-1}) f_k = f_k \prod_{j=1}^{k-1} (1 - f_j),\quad k = 2,3,\dots,n$$
and to show
$$(10.6)\qquad (1 - h_1 - \dots - h_n) = \prod_{j=1}^{n} (1 - f_j).$$
From these equations it clearly follows that $h_j \in C_c(X,[0,1])$ and that $\operatorname{supp}(h_j) \subset \operatorname{supp}(f_j) \subset U_j$, i.e. $h_j \prec U_j$. Since $\prod_{j=1}^{n} (1 - f_j) = 0$ on $K$, $\sum_{j=1}^{n} h_j = 1$ on $K$ and $\{h_j\}_{j=1}^{n}$ is the desired partition of unity.
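The telescoping identity in Eq. (10.6) is purely algebraic: it holds for any numbers $f_1,\dots,f_n \in [0,1]$, which is all that Method 1 uses. The following informal Python sketch simply verifies the identity on random inputs (the function and variable names are illustrative only):

```python
import random

def partition_from(f):
    """Given values f = [f_1, ..., f_n], return the h_k of Eq. (10.5)
    together with prod_j (1 - f_j)."""
    h, prod = [], 1.0
    for fk in f:
        h.append(fk * prod)   # h_k = f_k (1 - f_1) ... (1 - f_{k-1})
        prod *= (1.0 - fk)    # running product prod_{j <= k} (1 - f_j)
    return h, prod

for _ in range(1000):
    f = [random.random() for _ in range(random.randint(1, 10))]
    h, prod = partition_from(f)
    # Eq. (10.6):  1 - sum_j h_j = prod_j (1 - f_j)
    assert abs((1.0 - sum(h)) - prod) < 1e-12
print("Eq. (10.6) verified on random samples")
```

In particular, if some $f_j = 1$ (as happens on $K$, where $\prod_j(1-f_j)=0$), the identity forces $\sum_j h_j = 1$.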
Method 2. Let $g := \sum_{j=1}^{n} f_j \in C_c(X)$. Then $g \ge 1$ on $K$ and hence $K \subset \{g > \tfrac12\}$. Choose $\phi \in C_c(X,[0,1])$ such that $\phi = 1$ on $K$ and $\operatorname{supp}(\phi) \subset \{g > \tfrac12\}$ and define $f_0 := 1 - \phi$. Then $f_0 = 0$ on $K$, $f_0 = 1$ if $g \le \tfrac12$ and therefore,
$$f_0 + f_1 + \dots + f_n = f_0 + g > 0$$
on $X$. The desired partition of unity may be constructed as
$$h_j(x) = \frac{f_j(x)}{f_0(x) + \dots + f_n(x)}.$$
Indeed $\operatorname{supp}(h_j) = \operatorname{supp}(f_j) \subset U_j$, $h_j \in C_c(X,[0,1])$ and on $K$,
$$h_1 + \dots + h_n = \frac{f_1 + \dots + f_n}{f_0 + f_1 + \dots + f_n} = \frac{f_1 + \dots + f_n}{f_1 + \dots + f_n} = 1.$$
Proposition 10.25. Let $(X,\tau)$ be a locally compact and $\sigma$-compact Hausdorff space. Suppose that $\mathcal{U} \subset \tau$ is an open cover of $X$. Then we may construct two locally finite open covers $\mathcal{V} = \{V_i\}_{i=1}^{N}$ and $\mathcal{W} = \{W_i\}_{i=1}^{N}$ of $X$ ($N = \infty$ is allowed here) such that:
(1) $W_i \subset \bar W_i \subset V_i \subset \bar V_i$ and $\bar V_i$ is compact for all $i$.
(2) For each $i$ there exist $U \in \mathcal{U}$ such that $\bar V_i \subset U$.

Proof. By Remark 10.11, there exists an open cover $\mathcal{G} = \{G_n\}_{n=1}^{\infty}$ of $X$ such that $G_n \subset \bar G_n \subset G_{n+1}$. Then $X = \cup_{k=1}^{\infty} (\bar G_k \setminus G_{k-1})$, where by convention $G_{-1} = G_0 = \emptyset$. For the moment fix $k \ge 1$. For each $x \in \bar G_k \setminus G_{k-1}$, let $U_x \in \mathcal{U}$ be chosen so that $x \in U_x$ and by Proposition 10.13 choose an open neighborhood $N_x$ of $x$ such that $\bar N_x \subset U_x \cap (G_{k+1} \setminus \bar G_{k-2})$, see Figure 22 below. Since $\{N_x\}_{x \in \bar G_k \setminus G_{k-1}}$ is an open cover of the compact set $\bar G_k \setminus G_{k-1}$, there exists a finite subset $\Gamma_k \subset \{N_x\}_{x \in \bar G_k \setminus G_{k-1}}$ which also covers $\bar G_k \setminus G_{k-1}$. By construction, for each $W \in \Gamma_k$, there is a $U \in \mathcal{U}$ such that $\bar W \subset U \cap (G_{k+1} \setminus \bar G_{k-2})$. Apply Proposition 10.13 one more time to find, for each $W \in \Gamma_k$, an open set $V_W$ such that $\bar W \subset V_W \subset \bar V_W \subset U \cap (G_{k+1} \setminus \bar G_{k-2})$.

Figure 22. Constructing the $\{W_i\}_{i=1}^{N}$.

We now choose an enumeration $\{W_i\}_{i=1}^{N}$ of the countable open cover $\cup_{k=1}^{\infty} \Gamma_k$ of $X$ and define $V_i := V_{W_i}$. Then the collections $\{W_i\}_{i=1}^{N}$ and $\{V_i\}_{i=1}^{N}$ are easily checked to satisfy all the conclusions of the proposition. In particular notice that for each $k$ the set of $i$'s such that $V_i \cap G_k \neq \emptyset$ is finite.
Theorem 10.26 (Partitions of Unity in locally and $\sigma$-compact spaces). Let $(X,\tau)$ be a locally compact and $\sigma$-compact Hausdorff space and $\mathcal{U} \subset \tau$ be an open cover of $X$. Then there exists a partition of unity $\{h_i\}_{i=1}^{N}$ ($N = \infty$ is allowed here) subordinate to the cover $\mathcal{U}$ such that $\operatorname{supp}(h_i)$ is compact for all $i$.

Proof. Let $\mathcal{V} = \{V_i\}_{i=1}^{N}$ and $\mathcal{W} = \{W_i\}_{i=1}^{N}$ be open covers of $X$ with the properties described in Proposition 10.25. By Urysohn's Lemma 10.15, there exists $f_i \prec V_i$ such that $f_i = 1$ on $\bar W_i$ for each $i$.

As in the proof of Proposition 10.24 there are two methods to finish the proof.
Method 1. Define $h_1 = f_1$ and $h_j$ by Eq. (10.5) for all other $j$. Then as in Eq. (10.6),
$$1 - \sum_{j=1}^{N} h_j = \prod_{j=1}^{N} (1 - f_j) = 0$$
since for $x \in X$, $f_j(x) = 1$ for some $j$. As in the proof of Proposition 10.24, it is easily checked that $\{h_i\}_{i=1}^{N}$ is the desired partition of unity.
Method 2. Let $f := \sum_{i=1}^{N} f_i$, a locally finite sum, so that $f \in C(X)$. Since $\{W_i\}_{i=1}^{\infty}$ is a cover of $X$, $f \ge 1$ on $X$ so that $1/f \in C(X)$ as well. The functions $h_i := f_i/f$ for $i = 1,2,\dots,N$ give the desired partition of unity.
Corollary 10.27. Let $(X,\tau)$ be a locally compact and $\sigma$-compact Hausdorff space and $\mathcal{U} = \{U_\alpha\}_{\alpha \in A} \subset \tau$ be an open cover of $X$. Then there exists a partition of unity $\{h_\alpha\}_{\alpha \in A}$ subordinate to the cover $\mathcal{U}$ such that $\operatorname{supp}(h_\alpha) \subset U_\alpha$ for all $\alpha \in A$. (Notice that we do not assert that $h_\alpha$ has compact support. However if $\bar U_\alpha$ is compact then $\operatorname{supp}(h_\alpha)$ will be compact.)

Proof. By the $\sigma$-compactness of $X$, we may choose a countable subset, $\{\alpha_i\}_{i<N}$ ($N = \infty$ allowed here), of $A$ such that $\{U_i := U_{\alpha_i}\}_{i<N}$ is still an open cover of $X$. Let $\{g_j\}_{j<N}$ be a partition of unity subordinate to the cover $\{U_i\}_{i<N}$ as in Theorem 10.26. Define $\tilde\Gamma_k := \{j : \operatorname{supp}(g_j) \subset U_k\}$ and $\Gamma_k := \tilde\Gamma_k \setminus \bigl(\cup_{j=1}^{k-1} \tilde\Gamma_j\bigr)$, where by convention $\tilde\Gamma_0 = \emptyset$. Then
$$\{i \in \mathbb{N} : i < N\} = \bigcup_{k=1}^{\infty} \tilde\Gamma_k = \coprod_{k=1}^{\infty} \Gamma_k.$$
If $\Gamma_k = \emptyset$ let $h_k := 0$, otherwise let $h_k := \sum_{j \in \Gamma_k} g_j$, a locally finite sum. Then $\sum_{k=1}^{\infty} h_k = \sum_{j=1}^{N} g_j = 1$ and the sum $\sum_{k=1}^{\infty} h_k$ is still locally finite. (Why?) Now for $\alpha = \alpha_k \in \{\alpha_i\}_{i=1}^{N}$, let $h_\alpha := h_k$ and for $\alpha \notin \{\alpha_i\}_{i=1}^{N}$ let $h_\alpha := 0$. Since
$$\{h_k \neq 0\} = \cup_{j \in \Gamma_k} \{g_j \neq 0\} \subset \cup_{j \in \Gamma_k} \operatorname{supp}(g_j) \subset U_k$$
and, by Item 2. of Lemma 10.22, $\cup_{j \in \Gamma_k} \operatorname{supp}(g_j)$ is closed, we see that
$$\operatorname{supp}(h_k) = \overline{\{h_k \neq 0\}} \subset \cup_{j \in \Gamma_k} \operatorname{supp}(g_j) \subset U_k.$$
Therefore $\{h_\alpha\}_{\alpha \in A}$ is the desired partition of unity.

Corollary 10.28. Let $(X,\tau)$ be a locally compact and $\sigma$-compact Hausdorff space and $A, B$ be disjoint closed subsets of $X$. Then there exists $f \in C(X,[0,1])$ such that $f = 1$ on $A$ and $f = 0$ on $B$. In fact $f$ can be chosen so that $\operatorname{supp}(f) \subset B^c$.

Proof. Let $U_1 = A^c$ and $U_2 = B^c$, then $\{U_1, U_2\}$ is an open cover of $X$. By Corollary 10.27 there exist $h_1, h_2 \in C(X,[0,1])$ such that $\operatorname{supp}(h_i) \subset U_i$ for $i = 1,2$ and $h_1 + h_2 = 1$ on $X$. The function $f = h_2$ satisfies the desired properties.
10.3. $C_0(X)$ and the Alexandrov Compactification.

Definition 10.29. Let $(X,\tau)$ be a topological space. A continuous function $f : X \to \mathbb{C}$ is said to vanish at infinity if $\{|f| \ge \varepsilon\}$ is compact in $X$ for all $\varepsilon > 0$. The functions, $f \in C(X)$, vanishing at infinity will be denoted by $C_0(X)$.

Proposition 10.30. Let $X$ be a topological space, $BC(X)$ be the space of bounded continuous functions on $X$ with the supremum norm topology. Then
(1) $C_0(X)$ is a closed subspace of $BC(X)$.
(2) If we further assume that $X$ is a locally compact Hausdorff space, then $C_0(X) = \overline{C_c(X)}$.

Proof.
(1) If $f \in C_0(X)$, $K_1 := \{|f| \ge 1\}$ is a compact subset of $X$ and therefore $f(K_1)$ is a compact and hence bounded subset of $\mathbb{C}$ and so $M := \sup_{x \in K_1} |f(x)| < \infty$. Therefore $\|f\|_u \le M \vee 1 < \infty$ showing $f \in BC(X)$. Now suppose $f_n \in C_0(X)$ and $f_n \to f$ in $BC(X)$. Let $\varepsilon > 0$ be given and choose $n$ sufficiently large so that $\|f - f_n\|_u \le \varepsilon/2$. Since
$$|f| \le |f_n| + |f - f_n| \le |f_n| + \|f - f_n\|_u \le |f_n| + \varepsilon/2,$$
$$\{|f| \ge \varepsilon\} \subset \{|f_n| + \varepsilon/2 \ge \varepsilon\} = \{|f_n| \ge \varepsilon/2\}.$$
Because $\{|f| \ge \varepsilon\}$ is a closed subset of the compact set $\{|f_n| \ge \varepsilon/2\}$, $\{|f| \ge \varepsilon\}$ is compact and we have shown $f \in C_0(X)$.
(2) Since $C_0(X)$ is a closed subspace of $BC(X)$ and $C_c(X) \subset C_0(X)$, we always have $\overline{C_c(X)} \subset C_0(X)$. Now suppose that $f \in C_0(X)$ and let $K_n := \{|f| \ge \tfrac1n\} \subset\subset X$. By Lemma 10.15 we may choose $\phi_n \in C_c(X,[0,1])$ such that $\phi_n \equiv 1$ on $K_n$. Define $f_n := \phi_n f \in C_c(X)$. Then
$$\|f - f_n\|_u = \|(1 - \phi_n)f\|_u \le \tfrac1n \to 0 \text{ as } n \to \infty.$$
This shows that $f \in \overline{C_c(X)}$.

Proposition 10.31 (Alexandrov Compactification). Suppose that $(X,\tau)$ is a non-compact locally compact Hausdorff space. Let $X^* := X \cup \{\infty\}$, where $\{\infty\}$ is a new symbol not in $X$. The collection of sets,
$$\tau^* = \tau \cup \{X^* \setminus K : K \subset\subset X\} \subset \mathcal{P}(X^*),$$
is a topology on $X^*$ and $(X^*, \tau^*)$ is a compact Hausdorff space. Moreover $f \in C(X)$ extends continuously to $X^*$ iff $f = g + c$ with $g \in C_0(X)$ and $c \in \mathbb{C}$, in which case the extension is given by $f(\infty) = c$.

Proof. 1. ($\tau^*$ is a topology.) Let $\mathcal{F} := \{F \subset X^* : X^* \setminus F \in \tau^*\}$, i.e. $F \in \mathcal{F}$ iff $F$ is a compact subset of $X$ or $F = F_0 \cup \{\infty\}$ with $F_0$ being a closed subset of $X$. Since the finite union of compact (closed) subsets is compact (closed), it is easily seen that $\mathcal{F}$ is closed under finite unions. Because arbitrary intersections of closed subsets of $X$ are closed and closed subsets of compact subsets of $X$ are compact, it is also easily checked that $\mathcal{F}$ is closed under arbitrary intersections. Therefore $\mathcal{F}$ satisfies the axioms of the closed subsets associated to a topology and hence $\tau^*$ is a topology.
2. ($(X^*,\tau^*)$ is a Hausdorff space.) It suffices to show any point $x \in X$ can be separated from $\infty$. To do this use Proposition 10.13 to find an open precompact neighborhood, $U$, of $x$. Then $U$ and $V := X^* \setminus \bar U$ are disjoint open subsets of $X^*$ such that $x \in U$ and $\infty \in V$.
3. ($(X^*,\tau^*)$ is compact.) Suppose that $\mathcal{U} \subset \tau^*$ is an open cover of $X^*$. Since $\mathcal{U}$ covers $\infty$, there exists a compact set $K \subset X$ such that $X^* \setminus K \in \mathcal{U}$. Clearly $X$ is covered by $\mathcal{U}_0 := \{V \setminus \{\infty\} : V \in \mathcal{U}\}$ and by the definition of $\tau^*$ (or using that $(X^*,\tau^*)$ is Hausdorff), $\mathcal{U}_0$ is an open cover of $X$. In particular $\mathcal{U}_0$ is an open cover of $K$ and since $K$ is compact there exists $\Lambda \subset\subset \mathcal{U}$ such that $K \subset \cup\{V \setminus \{\infty\} : V \in \Lambda\}$. It is now easily checked that $\Lambda \cup \{X^* \setminus K\} \subset \mathcal{U}$ is a finite subcover of $X^*$.
4. (Continuous functions on $C(X^*)$ statements.) Let $i : X \to X^*$ be the inclusion map. Then $i$ is continuous and open, i.e. $i(V)$ is open in $X^*$ for all $V$ open in $X$. If $f \in C(X^*)$, then $g := f|_X - f(\infty) = f \circ i - f(\infty)$ is continuous on $X$. Moreover, for all $\varepsilon > 0$ there exists an open neighborhood $V \in \tau^*$ of $\infty$ such that
$$|g(x)| = |f(x) - f(\infty)| < \varepsilon \text{ for all } x \in V.$$
Since $V$ is an open neighborhood of $\infty$, there exists a compact subset, $K \subset X$, such that $V = X^* \setminus K$. By the previous equation we see that $\{x \in X : |g(x)| \ge \varepsilon\} \subset K$, so $\{|g| \ge \varepsilon\}$ is compact and we have shown $g$ vanishes at infinity.
Conversely if $g \in C_0(X)$, extend $g$ to $X^*$ by setting $g(\infty) = 0$. Given $\varepsilon > 0$, the set $K = \{|g| \ge \varepsilon\}$ is compact, hence $X^* \setminus K$ is open in $X^*$. Since $|g| < \varepsilon$ on $X^* \setminus K$ we have shown that $g$ is continuous at $\infty$. Since $g$ is also continuous at all points in $X$ it follows that $g$ is continuous on $X^*$. Now if $f = g + c$ with $c \in \mathbb{C}$ and $g \in C_0(X)$, it follows by what we just proved that defining $f(\infty) = c$ extends $f$ to a continuous function on $X^*$.
10.4. More on Separation Axioms: Normal Spaces. (The reader may skip to Definition 10.34 if he/she wishes. The following material will not be used in the rest of the book.)

Definition 10.32 ($T_0$ -- $T_2$ Separation Axioms). Let $(X,\tau)$ be a topological space. The topology $\tau$ is said to be:
(1) $T_0$ if for $x \neq y$ in $X$ there exists $V \in \tau$ such that $x \in V$ and $y \notin V$ or $V \in \tau$ such that $y \in V$ but $x \notin V$.
(2) $T_1$ if for every $x, y \in X$ with $x \neq y$ there exists $V \in \tau$ such that $x \in V$ and $y \notin V$. Equivalently, $\tau$ is $T_1$ iff all one point subsets of $X$ are closed. (Footnote 22: If one point subsets are closed and $x \neq y$ in $X$ then $V := \{x\}^c$ is an open set containing $y$ but not $x$. Conversely if $\tau$ is $T_1$ and $x \in X$ there exists $V_y \in \tau$ such that $y \in V_y$ and $x \notin V_y$ for all $y \neq x$. Therefore, $\{x\}^c = \cup_{y \neq x} V_y \in \tau$.)
(3) $T_2$ if it is Hausdorff.

Note $T_2$ implies $T_1$ which implies $T_0$. The topology in Example 10.3 is $T_0$ but not $T_1$. If $X$ is a finite set and $\tau$ is a $T_1$-topology on $X$ then $\tau = 2^X$. To prove this let $x \in X$ be fixed. Then for every $y \neq x$ in $X$ there exists $V_y \in \tau$ such that $x \in V_y$ while $y \notin V_y$. Thus $\{x\} = \cap_{y \neq x} V_y \in \tau$ showing $\tau$ contains all one point subsets of $X$ and therefore all subsets of $X$. So we have to look to infinite sets for an example of a $T_1$-topology which is not $T_2$.

Example 10.33. Let $X$ be any infinite set and let $\tau = \{A \subset X : \#(A^c) < \infty\} \cup \{\emptyset\}$ -- the so called cofinite topology. This topology is $T_1$ because if $x \neq y$ in $X$, then $V = \{x\}^c \in \tau$ with $x \notin V$ while $y \in V$. This topology however is not $T_2$. Indeed if $U, V \in \tau$ are open sets such that $x \in U$, $y \in V$ and $U \cap V = \emptyset$ then $U \subset V^c$. But this implies $\#(U) < \infty$ which is impossible unless $U = \emptyset$, which is impossible since $x \in U$.

The uniqueness of limits of sequences which occurs for Hausdorff topologies (see Remark 10.5) need not occur for $T_1$ spaces. For example, let $X = \mathbb{N}$ and $\tau$ be the cofinite topology on $X$ as in Example 10.33. Then $x_n = n$ is a sequence in $X$ such that $x_n \to x$ as $n \to \infty$ for all $x \in \mathbb{N}$. For the most part we will avoid these pathologies in the future by only considering Hausdorff topologies.

Definition 10.34 (Normal Spaces: $T_4$ Separation Axiom). A topological space $(X,\tau)$ is said to be normal or $T_4$ if:
(1) $X$ is Hausdorff and
(2) for any two closed disjoint subsets $A, B \subset X$ there exist disjoint open sets $V, W \subset X$ such that $A \subset V$ and $B \subset W$.

Example 10.35. By Lemma 10.1 and Corollary 10.28 it follows that metric spaces and locally compact and $\sigma$-compact Hausdorff spaces (in particular compact Hausdorff spaces) are normal. Indeed, in each case if $A, B$ are disjoint closed subsets of $X$, there exists $f \in C(X,[0,1])$ such that $f = 1$ on $A$ and $f = 0$ on $B$. Now let $U = \{f > \tfrac12\}$ and $V = \{f < \tfrac12\}$.
Remark 10.36. A topological space, $(X,\tau)$, is normal iff for any $C \subset W \subset X$ with $C$ being closed and $W$ being open there exists an open set $U \subset_o X$ such that
$$C \subset U \subset \bar U \subset W.$$
To prove this first suppose $X$ is normal. Since $W^c$ is closed and $C \cap W^c = \emptyset$, there exist disjoint open sets $U$ and $V$ such that $C \subset U$ and $W^c \subset V$. Therefore $C \subset U \subset V^c \subset W$ and since $V^c$ is closed, $C \subset U \subset \bar U \subset V^c \subset W$.
For the converse direction suppose $A$ and $B$ are disjoint closed subsets of $X$. Then $A \subset B^c$ and $B^c$ is open, and so by assumption there exists $U \subset_o X$ such that $A \subset U \subset \bar U \subset B^c$ and by the same token there exists $W \subset_o X$ such that $\bar U \subset W \subset \bar W \subset B^c$. Taking complements of the last expression implies
$$B \subset \bar W^c \subset W^c \subset \bar U^c.$$
Let $V = \bar W^c$. Then $A \subset U \subset_o X$, $B \subset V \subset_o X$ and $U \cap V \subset U \cap W^c = \emptyset$.
Theorem 10.37 (Urysohn's Lemma for Normal Spaces). Let $X$ be a normal space. Assume $A, B$ are disjoint closed subsets of $X$. Then there exists $f \in C(X,[0,1])$ such that $f = 0$ on $A$ and $f = 1$ on $B$.

Proof. To make the notation match Lemma 10.15, let $U = A^c$ and $K = B$. Then $K \subset U$ and it suffices to produce a function $f \in C(X,[0,1])$ such that $f = 1$ on $K$ and $\operatorname{supp}(f) \subset U$. The proof is now identical to that for Lemma 10.15 except we now use Remark 10.36 in place of Proposition 10.13.

Theorem 10.38 (Tietze Extension Theorem). Let $(X,\tau)$ be a normal space, $D$ be a closed subset of $X$, $-\infty < a < b < \infty$ and $f \in C(D,[a,b])$. Then there exists $F \in C(X,[a,b])$ such that $F|_D = f$.

Proof. The proof is identical to that of Theorem 10.2 except we now use Theorem 10.37 in place of Lemma 10.1.

Corollary 10.39. Suppose that $X$ is a normal topological space, $D \subset X$ is closed and $f \in C(D,\mathbb{R})$. Then there exists $F \in C(X)$ such that $F|_D = f$.

Proof. Let $g = \arctan(f) \in C(D, (-\tfrac{\pi}{2}, \tfrac{\pi}{2}))$. Then by the Tietze extension theorem, there exists $G \in C(X, [-\tfrac{\pi}{2}, \tfrac{\pi}{2}])$ such that $G|_D = g$. Let $B := G^{-1}(\{-\tfrac{\pi}{2}, \tfrac{\pi}{2}\})$ -- a closed subset of $X$; then $B \cap D = \emptyset$. By Urysohn's lemma (Theorem 10.37) there exists $h \in C(X,[0,1])$ such that $h \equiv 1$ on $D$ and $h = 0$ on $B$, and in particular $hG \in C(X, (-\tfrac{\pi}{2}, \tfrac{\pi}{2}))$ and $(hG)|_D = g$. The function $F := \tan(hG) \in C(X)$ is an extension of $f$.
Theorem 10.40 (Urysohn Metrization Theorem). Every second countable normal space, $(X,\tau)$, is metrizable, i.e. there is a metric $\rho$ on $X$ such that $\tau = \tau_\rho$. Moreover, $\rho$ may be chosen so that $X$ is isometric to a subset $Q_0 \subset Q$ equipped with the metric $d$ in Eq. (10.4). In this metric $X$ is totally bounded and hence the completion of $X$ (which is isometric to $\bar Q_0 \subset Q$) is compact.

Proof. Let $\mathcal{B}$ be a countable base for $\tau$ and set
$$\Gamma := \{(U,V) \in \mathcal{B} \times \mathcal{B} \mid \bar U \subset V\}.$$
To each $O \in \tau$ and $x \in O$ there exist $(U,V) \in \Gamma$ such that $x \in U \subset V \subset O$. Indeed, since $\mathcal{B}$ is a basis for $\tau$, there exists $V \in \mathcal{B}$ such that $x \in V \subset O$. Because $\{x\} \cap V^c = \emptyset$, there exist disjoint open sets $\tilde U$ and $W$ such that $x \in \tilde U$, $V^c \subset W$ and $\tilde U \cap W = \emptyset$. Choose $U \in \mathcal{B}$ such that $x \in U \subset \tilde U$. Since $U \subset \tilde U \subset W^c$, $\bar U \subset W^c \subset V$ and hence $(U,V) \in \Gamma$. See Figure 23 below. In particular this shows that $\{U \in \mathcal{B} : (U,V) \in \Gamma \text{ for some } V \in \mathcal{B}\}$ is still a base for $\tau$.

Figure 23. Constructing $(U,V) \in \Gamma$.

If $\Gamma$ is a finite set, the previous comment shows that $\tau$ only has a finite number of elements as well. Since $(X,\tau)$ is Hausdorff, it follows that $X$ is a finite set. Letting $\{x_n\}_{n=1}^{N}$ be an enumeration of $X$, define $T : X \to Q$ by $T(x_n) = e_n$ for $n = 1,2,\dots,N$ where $e_n = (0,0,\dots,0,1,0,\dots)$, with the $1$ occurring in the $n$-th spot. Then $\rho(x,y) := d(T(x),T(y))$ for $x,y \in X$ is the desired metric. So we may now assume that $\Gamma$ is an infinite set and let $\{(U_n,V_n)\}_{n=1}^{\infty}$ be an enumeration of $\Gamma$.

By Urysohn's Lemma (Theorem 10.37) there exists $f_{U,V} \in C(X,[0,1])$ such that $f_{U,V} = 0$ on $\bar U$ and $f_{U,V} = 1$ on $V^c$. Let $\mathcal{F} := \{f_{U,V} \mid (U,V) \in \Gamma\}$ and set $f_n := f_{U_n,V_n}$ -- an enumeration of $\mathcal{F}$. We will now show that
$$\rho(x,y) := \sum_{n=1}^{\infty} \frac{1}{2^n}\,|f_n(x) - f_n(y)|$$
is the desired metric on $X$. The proof will involve a number of steps.

(1) ($\rho$ is a metric on $X$.) It is routine to show $\rho$ satisfies the triangle inequality and $\rho$ is symmetric. If $x,y \in X$ are distinct points then there exists $(U_{n_0},V_{n_0}) \in \Gamma$ such that $x \in U_{n_0}$ and $V_{n_0} \subset O := \{y\}^c$. Since $f_{n_0}(x) = 0$ and $f_{n_0}(y) = 1$, it follows that $\rho(x,y) \ge 2^{-n_0} > 0$.

(2) (Let $\tau_0 = \tau(f_n : n \in \mathbb{N})$, then $\tau = \tau_0 = \tau_\rho$.) As usual we have $\tau_0 \subset \tau$. Since, for each $x \in X$, $y \mapsto \rho(x,y)$ is $\tau_0$-continuous (being the uniformly convergent sum of continuous functions), it follows that $B_x(\varepsilon) := \{y \in X : \rho(x,y) < \varepsilon\} \in \tau_0$ for all $x \in X$ and $\varepsilon > 0$. Thus $\tau_\rho \subset \tau_0 \subset \tau$. Suppose that $O \in \tau$ and $x \in O$. Let $(U_{n_0},V_{n_0}) \in \Gamma$ be such that $x \in U_{n_0}$ and $V_{n_0} \subset O$. Then $f_{n_0}(x) = 0$ and $f_{n_0} = 1$ on $O^c$. Therefore if $y \in X$ and $f_{n_0}(y) < 1$, then $y \in O$ so $x \in \{f_{n_0} < 1\} \subset O$. This shows that $O$ may be written as a union of elements from $\tau_0$ and therefore $O \in \tau_0$. So $\tau \subset \tau_0$ and hence $\tau = \tau_0$. Moreover, if $y \in B_x(2^{-n_0})$ then $2^{-n_0} > \rho(x,y) \ge 2^{-n_0} f_{n_0}(y)$ and therefore $x \in B_x(2^{-n_0}) \subset \{f_{n_0} < 1\} \subset O$. This shows $O$ is $\rho$-open and hence $\tau \subset \tau_\rho$.

(3) ($X$ is isometric to some $Q_0 \subset Q$.) Let $T : X \to Q$ be defined by $T(x) = (f_1(x), f_2(x), \dots, f_n(x), \dots)$. Then $T$ is an isometry by the very definitions of $d$ and $\rho$ and therefore $X$ is isometric to $Q_0 := T(X)$. Since $Q_0$ is a subset of the compact metric space $(Q,d)$, $Q_0$ is totally bounded and therefore $X$ is totally bounded.
10.5. Exercises.

Exercise 10.9. Let $(X,\tau)$ be a topological space, $A \subset X$, $i_A : A \to X$ be the inclusion map and $\tau_A := i_A^{-1}(\tau)$ be the relative topology on $A$. Verify $\tau_A = \{A \cap V : V \in \tau\}$ and show $C \subset A$ is closed in $(A,\tau_A)$ iff there exists a closed set $F \subset X$ such that $C = A \cap F$. (If you get stuck, see the remarks after Definition 3.17 where this has already been proved.)

Exercise 10.10. Let $(X,\tau)$ and $(Y,\tau')$ be topological spaces, $f : X \to Y$ be a function, $\mathcal{U}$ be an open cover of $X$ and $\{F_j\}_{j=1}^{n}$ be a finite cover of $X$ by closed sets.
(1) If $A \subset X$ is any set and $f : X \to Y$ is $(\tau,\tau')$-continuous then $f|_A : A \to Y$ is $(\tau_A,\tau')$-continuous.
(2) Show $f : X \to Y$ is $(\tau,\tau')$-continuous iff $f|_U : U \to Y$ is $(\tau_U,\tau')$-continuous for all $U \in \mathcal{U}$.
(3) Show $f : X \to Y$ is $(\tau,\tau')$-continuous iff $f|_{F_j} : F_j \to Y$ is $(\tau_{F_j},\tau')$-continuous for all $j = 1,2,\dots,n$.
(4) (A baby form of the Tietze extension Theorem.) Suppose $V \in \tau$ and $f : V \to \mathbb{C}$ is a continuous function such that $\operatorname{supp}(f) \subset V$, then $F : X \to \mathbb{C}$ defined by
$$F(x) = \begin{cases} f(x) & \text{if } x \in V \\ 0 & \text{otherwise} \end{cases}$$
is continuous.

Exercise 10.11. Prove Theorem 10.16. Hints:
(1) By Proposition 10.13, there exists a precompact open set $V$ such that $K \subset V \subset \bar V \subset U$. Now suppose that $f : K \to [0,\alpha]$ is continuous with $\alpha \in (0,1]$ and let $A := f^{-1}([0,\tfrac{\alpha}{3}])$ and $B := f^{-1}([\tfrac{2\alpha}{3},\alpha])$. Appeal to Lemma 10.15 to find a function $g \in C(X,[0,\tfrac{\alpha}{3}])$ such that $g = \tfrac{\alpha}{3}$ on $B$ and $\operatorname{supp}(g) \subset V \setminus A$.
(2) Now follow the argument in the proof of Theorem 10.2 to construct $F \in C(X,[a,b])$ such that $F|_K = f$.
(3) For $c \in [a,b]$, choose $\phi \prec U$ such that $\phi = 1$ on $K$ and replace $F$ by $F_c := \phi F + (1-\phi)c$.

Exercise 10.12 (Stereographic Projection). Let $X = \mathbb{R}^n$, $X^* := X \cup \{\infty\}$ be the one point compactification of $X$, $S^n := \{y \in \mathbb{R}^{n+1} : |y| = 1\}$ be the unit sphere in $\mathbb{R}^{n+1}$ and $N = (0,\dots,0,1) \in \mathbb{R}^{n+1}$. Define $f : S^n \to X^*$ by $f(N) = \infty$, and for $y \in S^n \setminus \{N\}$ let $f(y) = b \in \mathbb{R}^n$ be the unique point such that $(b,0)$ is on the line containing $N$ and $y$, see Figure 24 below. Find a formula for $f$ and show $f : S^n \to X^*$ is a homeomorphism. (So the one point compactification of $\mathbb{R}^n$ is homeomorphic to the $n$-sphere.)

Figure 24. Stereographic projection and the one point compactification of $\mathbb{R}^n$.

Exercise 10.13. Let $(X,\tau)$ be a locally compact Hausdorff space. Show $(X,\tau)$ is separable iff $(X^*,\tau^*)$ is separable.

Exercise 10.14. Show by example that there exists a locally compact metric space $(X,d)$ such that the one point compactification, $(X^* := X \cup \{\infty\}, \tau^*)$, is not metrizable. Hint: use Exercise 10.13.

Exercise 10.15. Suppose $(X,d)$ is a locally compact and $\sigma$-compact metric space. Show the one point compactification, $(X^* := X \cup \{\infty\}, \tau^*)$, is metrizable.
11. Approximation Theorems and Convolutions

Let $(X,\mathcal{M},\mu)$ be a measure space and $\mathcal{A} \subset \mathcal{M}$ an algebra.

Notation 11.1. Let $\mathbb{S}_f(\mathcal{A},\mu)$ denote those simple functions $\phi : X \to \mathbb{C}$ such that $\phi^{-1}(\{\lambda\}) \in \mathcal{A}$ for all $\lambda \in \mathbb{C}$ and $\mu(\phi \neq 0) < \infty$.

For $\phi \in \mathbb{S}_f(\mathcal{A},\mu)$ and $p \in [1,\infty)$, $|\phi|^p = \sum_{z \neq 0} |z|^p 1_{\{\phi = z\}}$ and hence
$$\int |\phi|^p \, d\mu = \sum_{z \neq 0} |z|^p \, \mu(\phi = z) < \infty$$
so that $\mathbb{S}_f(\mathcal{A},\mu) \subset L^p(\mu)$.

Lemma 11.2 (Simple Functions are Dense). The simple functions, $\mathbb{S}_f(\mathcal{M},\mu)$, form a dense subspace of $L^p(\mu)$ for all $1 \le p < \infty$.

Proof. Let $\{\phi_n\}_{n=1}^{\infty}$ be the simple functions in the approximation Theorem 7.12. Since $|\phi_n| \le |f|$ for all $n$, $\phi_n \in \mathbb{S}_f(\mathcal{M},\mu)$ (verify!) and
$$|f - \phi_n|^p \le (|f| + |\phi_n|)^p \le 2^p |f|^p \in L^1.$$
Therefore, by the dominated convergence theorem,
$$\lim_{n\to\infty} \int |f - \phi_n|^p \, d\mu = \int \lim_{n\to\infty} |f - \phi_n|^p \, d\mu = 0.$$

Theorem 11.3 (Separable Algebras implies Separability of $L^p$-Spaces). Suppose $1 \le p < \infty$ and $\mathcal{A} \subset \mathcal{M}$ is an algebra such that $\sigma(\mathcal{A}) = \mathcal{M}$ and $\mu$ is $\sigma$-finite on $\mathcal{A}$. Then $\mathbb{S}_f(\mathcal{A},\mu)$ is dense in $L^p(\mu)$. Moreover, if $\mathcal{A}$ is countable, then $L^p(\mu)$ is separable and
$$\mathcal{D} = \Bigl\{\sum a_j 1_{A_j} : a_j \in \mathbb{Q} + i\mathbb{Q},\ A_j \in \mathcal{A} \text{ with } \mu(A_j) < \infty\Bigr\}$$
is a countable dense subset.

Proof. First Proof. Let $X_k \in \mathcal{A}$ be sets such that $\mu(X_k) < \infty$ and $X_k \uparrow X$ as $k \to \infty$. For $k \in \mathbb{N}$ let $\mathcal{H}_k$ denote those bounded $\mathcal{M}$-measurable functions, $f$, on $X$ such that $1_{X_k} f \in \overline{\mathbb{S}_f(\mathcal{A},\mu)}^{L^p(\mu)}$. It is easily seen that $\mathcal{H}_k$ is a vector space closed under bounded convergence and this subspace contains $1_A$ for all $A \in \mathcal{A}$. Therefore by Theorem 8.12, $\mathcal{H}_k$ is the set of all bounded $\mathcal{M}$-measurable functions on $X$. For $f \in L^p(\mu)$, the dominated convergence theorem implies $1_{X_k \cap \{|f| \le k\}}\, f \to f$ in $L^p(\mu)$ as $k \to \infty$. We have just proved $1_{X_k \cap \{|f| \le k\}}\, f \in \overline{\mathbb{S}_f(\mathcal{A},\mu)}^{L^p(\mu)}$ for all $k$ and hence it follows that $f \in \overline{\mathbb{S}_f(\mathcal{A},\mu)}^{L^p(\mu)}$. The last assertion of the theorem is a consequence of the easily verified fact that $\mathcal{D}$ is dense in $\mathbb{S}_f(\mathcal{A},\mu)$ relative to the $L^p(\mu)$-norm.
Second Proof. Given $\varepsilon > 0$, by Corollary 8.42, for all $E \in \mathcal{M}$ such that $\mu(E) < \infty$, there exists $A \in \mathcal{A}$ such that $\mu(E \triangle A) < \varepsilon$. Therefore
$$(11.1)\qquad \int |1_E - 1_A|^p \, d\mu = \mu(E \triangle A) < \varepsilon.$$
This equation shows that any simple function in $\mathbb{S}_f(\mathcal{M},\mu)$ may be approximated arbitrarily well by an element from $\mathcal{D}$ and hence $\mathcal{D}$ is also dense in $L^p(\mu)$.
Corollary 11.4 (Riemann Lebesgue Lemma). Suppose that $f \in L^1(\mathbb{R},m)$, then
$$\lim_{|\lambda|\to\infty} \int_{\mathbb{R}} f(x) e^{i\lambda x} \, dm(x) = 0.$$

Proof. Let $\mathcal{A}$ denote the algebra on $\mathbb{R}$ generated by the half open intervals, i.e. $\mathcal{A}$ consists of sets of the form
$$\coprod_{k=1}^{n} (a_k, b_k] \cap \mathbb{R}$$
where $a_k, b_k \in \bar{\mathbb{R}}$. By Theorem 11.3, given $\varepsilon > 0$ there exists $\phi = \sum_{k=1}^{n} c_k 1_{(a_k,b_k]}$ with $a_k, b_k \in \mathbb{R}$ such that
$$\int_{\mathbb{R}} |f - \phi| \, dm < \varepsilon.$$
Notice that
$$\int_{\mathbb{R}} \phi(x) e^{i\lambda x} \, dm(x) = \int_{\mathbb{R}} \sum_{k=1}^{n} c_k 1_{(a_k,b_k]}(x) e^{i\lambda x} \, dm(x) = \sum_{k=1}^{n} c_k \int_{a_k}^{b_k} e^{i\lambda x} \, dm(x)$$
$$= \sum_{k=1}^{n} c_k \frac{1}{i\lambda} e^{i\lambda x}\Big|_{a_k}^{b_k} = \frac{1}{i\lambda} \sum_{k=1}^{n} c_k \bigl(e^{i\lambda b_k} - e^{i\lambda a_k}\bigr) \to 0 \text{ as } |\lambda| \to \infty.$$
Combining these two equations with
$$\Bigl|\int_{\mathbb{R}} f(x)e^{i\lambda x} \, dm(x)\Bigr| \le \Bigl|\int_{\mathbb{R}} (f(x) - \phi(x)) e^{i\lambda x} \, dm(x)\Bigr| + \Bigl|\int_{\mathbb{R}} \phi(x)e^{i\lambda x} \, dm(x)\Bigr|$$
$$\le \int_{\mathbb{R}} |f - \phi| \, dm + \Bigl|\int_{\mathbb{R}} \phi(x)e^{i\lambda x} \, dm(x)\Bigr| \le \varepsilon + \Bigl|\int_{\mathbb{R}} \phi(x)e^{i\lambda x} \, dm(x)\Bigr|$$
we learn that
$$\limsup_{|\lambda|\to\infty} \Bigl|\int_{\mathbb{R}} f(x)e^{i\lambda x} \, dm(x)\Bigr| \le \varepsilon + \limsup_{|\lambda|\to\infty} \Bigl|\int_{\mathbb{R}} \phi(x)e^{i\lambda x} \, dm(x)\Bigr| = \varepsilon.$$
Since $\varepsilon > 0$ is arbitrary, we have proven the lemma.
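To see the decay concretely, here is an informal Python sketch (with an arbitrarily chosen test function) that evaluates the oscillatory integral for the single step function $f = 1_{(0,1]}$, for which $\int_{\mathbb{R}} f(x)e^{i\lambda x}\,dm(x) = (e^{i\lambda}-1)/(i\lambda)$ in closed form, and prints its modulus for increasing $|\lambda|$:

```python
import cmath

def fourier_of_step(lam):
    """Closed form of int_0^1 e^{i lam x} dx = (e^{i lam} - 1) / (i lam)."""
    return (cmath.exp(1j * lam) - 1.0) / (1j * lam)

for lam in [1.0, 10.0, 100.0, 1000.0, 10000.0]:
    print(f"lambda = {lam:8.0f}   |integral| = {abs(fourier_of_step(lam)):.6f}")
# Each modulus is bounded by 2/lambda, the same estimate used for phi in the proof above.
```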
Theorem 11.5 (Continuous Functions are Dense). Let $(X,d)$ be a metric space, $\tau_d$ be the topology on $X$ generated by $d$ and $\mathcal{B}_X = \sigma(\tau_d)$ be the Borel $\sigma$-algebra. Suppose $\mu : \mathcal{B}_X \to [0,\infty]$ is a measure which is $\sigma$-finite on $\tau_d$ and let $BC_f(X)$ denote the bounded continuous functions on $X$ such that $\mu(f \neq 0) < \infty$. Then $BC_f(X)$ is a dense subspace of $L^p(\mu)$ for any $p \in [1,\infty)$.

Proof. First Proof. Let $X_k \in \tau_d$ be open sets such that $X_k \uparrow X$ and $\mu(X_k) < \infty$. Let $k$ and $n$ be positive integers and set
$$\psi_{n,k}(x) = \min\bigl(1, n\, d_{X_k^c}(x)\bigr) = \phi_n\bigl(d_{X_k^c}(x)\bigr),$$
and notice that $\psi_{n,k} \to 1_{d_{X_k^c} > 0} = 1_{X_k}$ as $n \to \infty$, see Figure 25 below. Then $\psi_{n,k} \in BC_f(X)$ and $\{\psi_{n,k} \neq 0\} \subset X_k$. Let $\mathcal{H}$ denote those bounded $\mathcal{M}$-measurable functions, $f : X \to \mathbb{R}$, such that $\psi_{n,k} f \in \overline{BC_f(X)}^{L^p(\mu)}$. It is easily seen that $\mathcal{H}$ is a vector space closed under bounded convergence and this subspace contains $BC(X,\mathbb{R})$. By Corollary 8.13, $\mathcal{H}$ is the set of all bounded real valued $\mathcal{M}$-measurable functions on $X$, i.e. $\psi_{n,k} f \in \overline{BC_f(X)}^{L^p(\mu)}$ for all bounded measurable $f$ and $n,k \in \mathbb{N}$. Let $f$ be a bounded measurable function; by the dominated convergence theorem, $\psi_{n,k} f \to 1_{X_k} f$ in $L^p(\mu)$ as $n \to \infty$, therefore $1_{X_k} f \in \overline{BC_f(X)}^{L^p(\mu)}$. It now follows as in the first proof of Theorem 11.3 that $\overline{BC_f(X)}^{L^p(\mu)} = L^p(\mu)$.

Figure 25. The plot of $\phi_n$ for $n = 1, 2$, and $4$. Notice that $\phi_n \to 1_{(0,\infty)}$.

Second Proof. Since $\mathbb{S}_f(\mathcal{M},\mu)$ is dense in $L^p(\mu)$ it suffices to show any $\phi \in \mathbb{S}_f(\mathcal{M},\mu)$ may be well approximated by $f \in BC_f(X)$. Moreover, to prove this it suffices to show for $A \in \mathcal{M}$ with $\mu(A) < \infty$ that $1_A$ may be well approximated by an $f \in BC_f(X)$. By Exercises 8.4 and 8.5, for any $\varepsilon > 0$ there exists a closed set $F$ and an open set $V$ such that $F \subset A \subset V$ and $\mu(V \setminus F) < \varepsilon$. (Notice that $\mu(V) < \mu(A) + \varepsilon < \infty$.) Let $f$ be as in Eq. (10.1), then $f \in BC_f(X)$ and since $|1_A - f| \le 1_{V \setminus F}$,
$$(11.2)\qquad \int |1_A - f|^p \, d\mu \le \int 1_{V \setminus F} \, d\mu = \mu(V \setminus F) \le \varepsilon$$
or equivalently
$$\|1_A - f\| \le \varepsilon^{1/p}.$$
Since $\varepsilon > 0$ is arbitrary, we have shown that $1_A$ can be approximated in $L^p(\mu)$ arbitrarily well by functions from $BC_f(X)$.
Proposition 11.6. Let $(X,\tau)$ be a second countable locally compact Hausdorff space, $\mathcal{B}_X = \sigma(\tau)$ be the Borel $\sigma$-algebra and $\mu : \mathcal{B}_X \to [0,\infty]$ be a measure such that $\mu(K) < \infty$ when $K$ is a compact subset of $X$. Then $C_c(X)$ (the space of continuous functions with compact support) is dense in $L^p(\mu)$ for all $p \in [1,\infty)$.

Proof. First Proof. Let $\{K_k\}_{k=1}^{\infty}$ be a sequence of compact sets as in Lemma 10.10 and set $X_k = K_k^o$. Using Item 3. of Lemma 10.17, there exist $\{\psi_{n,k}\}_{n=1}^{\infty} \subset C_c(X)$ such that $\operatorname{supp}(\psi_{n,k}) \subset X_k$ and $\lim_{n\to\infty} \psi_{n,k} = 1_{X_k}$. As in the first proof of Theorem 11.5, let $\mathcal{H}$ denote those bounded $\mathcal{B}_X$-measurable functions, $f : X \to \mathbb{R}$, such that $\psi_{n,k} f \in \overline{C_c(X)}^{L^p(\mu)}$. It is easily seen that $\mathcal{H}$ is a vector space closed under bounded convergence and this subspace contains $BC(X,\mathbb{R})$. By Corollary 10.18, $\mathcal{H}$ is the set of all bounded real valued $\mathcal{B}_X$-measurable functions on $X$, i.e. $\psi_{n,k} f \in \overline{C_c(X)}^{L^p(\mu)}$ for all bounded measurable $f$ and $n,k \in \mathbb{N}$. Let $f$ be a bounded measurable function; by the dominated convergence theorem, $\psi_{n,k} f \to 1_{X_k} f$ in $L^p(\mu)$ as $n \to \infty$, therefore $1_{X_k} f \in \overline{C_c(X)}^{L^p(\mu)}$. It now follows as in the first proof of Theorem 11.3 that $\overline{C_c(X)}^{L^p(\mu)} = L^p(\mu)$.
Second Proof. Following the second proof of Theorem 11.5, let $A \in \mathcal{M}$ with $\mu(A) < \infty$. Since $\lim_{k\to\infty} \|1_{A \cap K_k^o} - 1_A\|_p = 0$, it suffices to assume $A \subset K_k^o$ for some $k$. Given $\varepsilon > 0$, by Item 2. of Lemma 10.17 and Exercise 8.4 there exist a closed set $F$ and an open set $V$ such that $F \subset A \subset V$ and $\mu(V \setminus F) < \varepsilon$. Replacing $V$ by $V \cap K_k^o$ we may assume that $V \subset K_k^o \subset K_k$. The function $f$ defined in Eq. (10.1) is now in $C_c(X)$. The remainder of the proof now follows as in the second proof of Theorem 11.5.
Lemma 11.7. Let $(X,\tau)$ be a second countable locally compact Hausdorff space, $\mathcal{B}_X = \sigma(\tau)$ be the Borel $\sigma$-algebra and $\mu : \mathcal{B}_X \to [0,\infty]$ be a measure such that $\mu(K) < \infty$ when $K$ is a compact subset of $X$. If $h \in L^1_{loc}(\mu)$ is a function such that
$$(11.3)\qquad \int_X f h \, d\mu = 0 \text{ for all } f \in C_c(X)$$
then $h(x) = 0$ for $\mu$-a.e. $x$.

Proof. First Proof. Let $d\nu(x) = |h(x)| \, d\mu(x)$, then $\nu$ is a measure on $X$ such that $\nu(K) < \infty$ for all compact subsets $K \subset X$ and hence $C_c(X)$ is dense in $L^1(\nu)$ by Proposition 11.6. Notice that
$$(11.4)\qquad \int_X f\,\overline{\operatorname{sgn}(h)} \, d\nu = \int_X f h \, d\mu = 0 \text{ for all } f \in C_c(X).$$
Let $\{K_k\}_{k=1}^{\infty}$ be a sequence of compact sets such that $K_k \uparrow X$ as in Lemma 10.10. Then $1_{K_k}\operatorname{sgn}(h) \in L^1(\nu)$ and therefore there exist $f_m \in C_c(X)$ such that $f_m \to 1_{K_k}\operatorname{sgn}(h)$ in $L^1(\nu)$. So by Eq. (11.4),
$$\nu(K_k) = \int_X 1_{K_k} \, d\nu = \lim_{m\to\infty} \int_X f_m \overline{\operatorname{sgn}(h)} \, d\nu = 0.$$
Since $K_k \uparrow X$ as $k \to \infty$, $0 = \nu(X) = \int_X |h| \, d\mu$, i.e. $h(x) = 0$ for $\mu$-a.e. $x$.
Second Proof. Let $K_k$ be as above and use Lemma 10.15 to find $\chi_k \in C_c(X,[0,1])$ such that $\chi_k = 1$ on $K_k$. Let $\mathcal{H}$ denote the set of bounded measurable real valued functions on $X$ such that $\int_X \chi_k f h \, d\mu = 0$. Then it is easily checked that $\mathcal{H}$ is a linear subspace closed under bounded convergence which contains $C_c(X)$. Therefore by Corollary 10.18, $0 = \int_X \chi_k f h \, d\mu$ for all bounded measurable functions $f : X \to \mathbb{R}$ and then by linearity for all bounded measurable functions $f : X \to \mathbb{C}$. Taking $f = \overline{\operatorname{sgn}(h)}$ then implies
$$0 = \int_X \chi_k |h| \, d\mu \ge \int_{K_k} |h| \, d\mu$$
and hence by the monotone convergence theorem,
$$0 = \lim_{k\to\infty} \int_{K_k} |h| \, d\mu = \int_X |h| \, d\mu.$$

Corollary 11.8. Suppose $X \subset \mathbb{R}^n$ is an open set, $\mathcal{B}_X$ is the Borel $\sigma$-algebra on $X$ and $\mu$ is a measure on $(X,\mathcal{B}_X)$ which is finite on compact sets. Then $C_c(X)$ is dense in $L^p(\mu)$ for all $p \in [1,\infty)$.
11.1. Convolution and Young's Inequalities.

Definition 11.9. Let $f, g : \mathbb{R}^n \to \mathbb{C}$ be measurable functions. We define
$$f * g(x) = \int_{\mathbb{R}^n} f(x-y) g(y) \, dy$$
whenever the integral is defined, i.e. either $f(x-\cdot)g(\cdot) \in L^1(\mathbb{R}^n,m)$ or $f(x-\cdot)g(\cdot) \ge 0$. Notice that the condition that $f(x-\cdot)g(\cdot) \in L^1(\mathbb{R}^n,m)$ is equivalent to writing $|f| * |g|(x) < \infty$.

Notation 11.10. Given a multi-index $\alpha \in \mathbb{Z}_+^n$, let $|\alpha| = \alpha_1 + \dots + \alpha_n$,
$$x^\alpha := \prod_{j=1}^{n} x_j^{\alpha_j}, \quad\text{and}\quad \partial_x^\alpha = \Bigl(\frac{\partial}{\partial x}\Bigr)^{\alpha} := \prod_{j=1}^{n} \Bigl(\frac{\partial}{\partial x_j}\Bigr)^{\alpha_j}.$$

Remark 11.11 (The Significance of Convolution). Suppose that $L = \sum_{|\alpha| \le k} a_\alpha \partial^\alpha$ is a constant coefficient differential operator and suppose that we can solve (uniquely) the equation $Lu = g$ in the form
$$u(x) = Kg(x) := \int_{\mathbb{R}^n} k(x,y) g(y) \, dy$$
where $k(x,y)$ is an integral kernel. (This is a natural sort of assumption since, in view of the fundamental theorem of calculus, integration is the inverse operation to differentiation.) Since $\tau_z L = L \tau_z$ for all $z \in \mathbb{R}^n$ (this is another way to characterize constant coefficient differential operators) and $L^{-1} = K$ we should have $\tau_z K = K \tau_z$. Writing out this equation then says
$$\int_{\mathbb{R}^n} k(x-z,y) g(y) \, dy = (Kg)(x-z) = \tau_z Kg(x) = (K\tau_z g)(x) = \int_{\mathbb{R}^n} k(x,y) g(y-z) \, dy = \int_{\mathbb{R}^n} k(x,y+z) g(y) \, dy.$$
Since $g$ is arbitrary we conclude that $k(x-z,y) = k(x,y+z)$. Taking $y = 0$ then gives
$$k(x,z) = k(x-z,0) =: \rho(x-z).$$
We thus find that $Kg = \rho * g$. Hence we expect the convolution operation to appear naturally when solving constant coefficient partial differential equations. More about this point later.

The following proposition is an easy consequence of Minkowski's inequality for integrals, Theorem 9.27.

Proposition 11.12. Suppose $p \in [1,\infty]$, $f \in L^1$ and $g \in L^p$, then $f * g(x)$ exists for almost every $x$, $f * g \in L^p$ and
$$\|f * g\|_p \le \|f\|_1 \|g\|_p.$$

For $z \in \mathbb{R}^n$ and $f : \mathbb{R}^n \to \mathbb{C}$, let $\tau_z f : \mathbb{R}^n \to \mathbb{C}$ be defined by $\tau_z f(x) = f(x-z)$.

Proposition 11.13. Suppose that $p \in [1,\infty)$, then $\tau_z : L^p \to L^p$ is an isometric isomorphism and for $f \in L^p$, $z \in \mathbb{R}^n \mapsto \tau_z f \in L^p$ is continuous.
Proof. The assertion that $\tau_z : L^p \to L^p$ is an isometric isomorphism follows from translation invariance of Lebesgue measure and the fact that $\tau_{-z} \circ \tau_z = \mathrm{id}$. For the continuity assertion, observe that
$$\|\tau_z f - \tau_y f\|_p = \|\tau_{-y}(\tau_z f - \tau_y f)\|_p = \|\tau_{z-y} f - f\|_p$$
from which it follows that it is enough to show $\tau_z f \to f$ in $L^p$ as $z \to 0 \in \mathbb{R}^n$. When $f \in C_c(\mathbb{R}^n)$, $\tau_z f \to f$ uniformly and since $K := \cup_{|z| \le 1} \operatorname{supp}(\tau_z f)$ is compact, it follows by the dominated convergence theorem that $\tau_z f \to f$ in $L^p$ as $z \to 0 \in \mathbb{R}^n$. For general $g \in L^p$ and $f \in C_c(\mathbb{R}^n)$,
$$\|\tau_z g - g\|_p \le \|\tau_z g - \tau_z f\|_p + \|\tau_z f - f\|_p + \|f - g\|_p = \|\tau_z f - f\|_p + 2\|f - g\|_p$$
and thus
$$\limsup_{z\to 0} \|\tau_z g - g\|_p \le \limsup_{z\to 0} \|\tau_z f - f\|_p + 2\|f - g\|_p = 2\|f - g\|_p.$$
Because $C_c(\mathbb{R}^n)$ is dense in $L^p$, the term $\|f - g\|_p$ may be made as small as we please.
Definition 11.14. Suppose that $(X,\tau)$ is a topological space and $\mu$ is a measure on $\mathcal{B}_X = \sigma(\tau)$. For a measurable function $f : X \to \mathbb{C}$ we define the essential support of $f$ by
$$(11.5)\qquad \operatorname{supp}_\mu(f) = \{x \in X : \mu(\{y \in V : f(y) \neq 0\}) > 0 \text{ for all neighborhoods } V \text{ of } x\}.$$
It is not hard to show that if $\operatorname{supp}(\mu) = X$ (see Definition 9.41) and $f \in C(X)$ then $\operatorname{supp}_\mu(f) = \operatorname{supp}(f) := \overline{\{f \neq 0\}}$, see Exercise 11.5.

Lemma 11.15. Suppose $(X,\tau)$ is second countable, $f : X \to \mathbb{C}$ is a measurable function and $\mu$ is a measure on $\mathcal{B}_X$. Then $X \setminus \operatorname{supp}_\mu(f)$ may be described as the largest open set $W$ such that $f 1_W(x) = 0$ for $\mu$-a.e. $x$. Equivalently put, $C := \operatorname{supp}_\mu(f)$ is the smallest closed subset of $X$ such that $f = f 1_C$ a.e.

Proof. To verify that the two descriptions of $\operatorname{supp}_\mu(f)$ are equivalent, suppose $\operatorname{supp}_\mu(f)$ is defined as in Eq. (11.5) and $W := X \setminus \operatorname{supp}_\mu(f)$. Then
$$W = \{x \in X : \mu(\{y \in V : f(y) \neq 0\}) = 0 \text{ for some neighborhood } V \text{ of } x\}$$
$$= \cup\{V \subset_o X : \mu(f 1_V \neq 0) = 0\} = \cup\{V \subset_o X : f 1_V = 0 \text{ for } \mu\text{-a.e.}\}.$$
So to finish the argument it suffices to show $\mu(f 1_W \neq 0) = 0$. To do this let $\mathcal{U}$ be a countable base for $\tau$ and set
$$\mathcal{U}_f := \{V \in \mathcal{U} : f 1_V = 0 \text{ a.e.}\}.$$
Then it is easily seen that $W = \cup\,\mathcal{U}_f$ and since $\mathcal{U}_f$ is countable, $\mu(f 1_W \neq 0) \le \sum_{V \in \mathcal{U}_f} \mu(f 1_V \neq 0) = 0$.

Lemma 11.16. Suppose $f, g, h : \mathbb{R}^n \to \mathbb{C}$ are measurable functions and assume that $x$ is a point in $\mathbb{R}^n$ such that $|f| * |g|(x) < \infty$ and $|f| * (|g| * |h|)(x) < \infty$, then
(1) $f * g(x) = g * f(x)$
(2) $f * (g * h)(x) = (f * g) * h(x)$
(3) If $z \in \mathbb{R}^n$ and $\tau_z(|f| * |g|)(x) = |f| * |g|(x-z) < \infty$, then
$$\tau_z(f * g)(x) = \tau_z f * g(x) = f * \tau_z g(x)$$
(4) If $x \notin \operatorname{supp}_m(f) + \operatorname{supp}_m(g)$ then $f * g(x) = 0$ and in particular, $\operatorname{supp}_m(f * g) \subset \overline{\operatorname{supp}_m(f) + \operatorname{supp}_m(g)}$ where in defining $\operatorname{supp}_m(f * g)$ we will use the convention that "$f * g(x) \neq 0$" when $|f| * |g|(x) = \infty$.

Proof. For item 1.,
$$|f| * |g|(x) = \int_{\mathbb{R}^n} |f|(x-y)|g|(y) \, dy = \int_{\mathbb{R}^n} |f|(y)|g|(x-y) \, dy = |g| * |f|(x)$$
where in the second equality we made use of the fact that Lebesgue measure is invariant under the transformation $y \to x - y$. Similar computations prove all of the remaining assertions of the first three items of the lemma.
Item 4. Since $f * g(x) = \tilde f * \tilde g(x)$ if $f = \tilde f$ and $g = \tilde g$ a.e., we may, by replacing $f$ by $f 1_{\operatorname{supp}_m(f)}$ and $g$ by $g 1_{\operatorname{supp}_m(g)}$ if necessary, assume that $\{f \neq 0\} \subset \operatorname{supp}_m(f)$ and $\{g \neq 0\} \subset \operatorname{supp}_m(g)$. So if $x \notin (\operatorname{supp}_m(f) + \operatorname{supp}_m(g))$ then $x \notin (\{f \neq 0\} + \{g \neq 0\})$ and for all $y \in \mathbb{R}^n$, either $x - y \notin \{f \neq 0\}$ or $y \notin \{g \neq 0\}$. That is to say either $x - y \in \{f = 0\}$ or $y \in \{g = 0\}$ and hence $f(x-y)g(y) = 0$ for all $y$ and therefore $f * g(x) = 0$. This shows that $f * g = 0$ on $\mathbb{R}^n \setminus \overline{\operatorname{supp}_m(f) + \operatorname{supp}_m(g)}$ and therefore
$$\mathbb{R}^n \setminus \overline{\operatorname{supp}_m(f) + \operatorname{supp}_m(g)} \subset \mathbb{R}^n \setminus \operatorname{supp}_m(f * g),$$
i.e. $\operatorname{supp}_m(f * g) \subset \overline{\operatorname{supp}_m(f) + \operatorname{supp}_m(g)}$.
Remark 11.17. Let $A, B$ be closed sets of $\mathbb{R}^n$; it is not necessarily true that $A + B$ is still closed. For example, take
$$A = \{(x,y) : x > 0 \text{ and } y \ge 1/x\} \quad\text{and}\quad B = \{(x,y) : x < 0 \text{ and } y \ge 1/|x|\},$$
then every point of $A + B$ has a positive $y$-component and hence $0$ is not in $A + B$. On the other hand, for $x > 0$ we have $(x, 1/x) + (-x, 1/x) = (0, 2/x) \in A + B$ for all $x$ and hence $0 \in \overline{A+B}$, showing $A + B$ is not closed. Nevertheless if one of the sets $A$ or $B$ is compact, then $A + B$ is closed again. Indeed, if $A$ is compact and $x_n = a_n + b_n \in A + B$ and $x_n \to x \in \mathbb{R}^n$, then by passing to a subsequence if necessary we may assume $\lim_{n\to\infty} a_n = a \in A$ exists. In this case
$$\lim_{n\to\infty} b_n = \lim_{n\to\infty} (x_n - a_n) = x - a \in B$$
exists as well, showing $x = a + b \in A + B$.
Proposition 11.18. Suppose that $p, q \in [1,\infty]$ and $p$ and $q$ are conjugate exponents, $f \in L^p$ and $g \in L^q$, then $f * g \in BC(\mathbb{R}^n)$, $\|f * g\|_u \le \|f\|_p \|g\|_q$ and if $p, q \in (1,\infty)$ then $f * g \in C_0(\mathbb{R}^n)$.

Proof. The existence of $f * g(x)$ and the estimate $|f * g|(x) \le \|f\|_p \|g\|_q$ for all $x \in \mathbb{R}^n$ is a simple consequence of Hölder's inequality and the translation invariance of Lebesgue measure. In particular this shows $\|f * g\|_u \le \|f\|_p \|g\|_q$. By relabeling $p$ and $q$ if necessary we may assume that $p \in [1,\infty)$. Since
$$\|\tau_z(f * g) - f * g\|_u = \|\tau_z f * g - f * g\|_u \le \|\tau_z f - f\|_p \|g\|_q \to 0 \text{ as } z \to 0$$
it follows that $f * g$ is uniformly continuous. Finally if $p, q \in (1,\infty)$, we learn from Lemma 11.16 and what we have just proved that $f_m * g_m \in C_c(\mathbb{R}^n)$ where $f_m := f 1_{B(0,m) \cap \{|f| \le m\}}$ and $g_m := g 1_{B(0,m) \cap \{|g| \le m\}}$. Moreover,
$$\|f * g - f_m * g_m\|_u \le \|f * g - f_m * g\|_u + \|f_m * g - f_m * g_m\|_u$$
$$\le \|f - f_m\|_p \|g\|_q + \|f_m\|_p \|g - g_m\|_q \le \|f - f_m\|_p \|g\|_q + \|f\|_p \|g - g_m\|_q \to 0 \text{ as } m \to \infty$$
showing, with the aid of Proposition 10.30, $f * g \in C_0(\mathbb{R}^n)$.
(11.6)
1
p
+
1
q
= 1 +
1
r
.
If f L
p
and g L
q
then |f| |g| (x) < for m a.e. x and
(11.7) kf gk
r
kfk
p
kgk
q
.
In particular L
1
is closed under convolution. (The space (L
1
, ) is an example of a
Banach algebra without unit.)
Remark 11.20. Before going to the formal proof, let us rst understand Eq. (11.6)
by the following scaling argument. For > 0, let f

(x) := f(x), then after a few


simple change of variables we nd
kf

k
p
=
1/p
kfk and (f g)

= f

.
Therefore if Eq. (11.7) holds for some p, q, r [1, ], we would also have
kf gk
r
=
1/r
k(f g)

k
r

1/r
kf

k
p
kg

k
q
=
(1+1/r1/p1/q)
kfk
p
kgk
q
for all > 0. This is only possible if Eq. (11.6) holds.
Proof. Let $\alpha, \beta \in [0,1]$ and $p_1, p_2 \in [1,\infty]$ satisfy $p_1^{-1} + p_2^{-1} + r^{-1} = 1$. Then by Hölder's inequality, Corollary 9.3,
$$|f * g(x)| = \Bigl|\int f(x-y)g(y)\,dy\Bigr| \le \int |f(x-y)|^{1-\alpha}|g(y)|^{1-\beta}\,|f(x-y)|^{\alpha}|g(y)|^{\beta}\,dy$$
$$\le \Bigl(\int |f(x-y)|^{(1-\alpha)r}|g(y)|^{(1-\beta)r}\,dy\Bigr)^{1/r} \Bigl(\int |f(x-y)|^{\alpha p_1}\,dy\Bigr)^{1/p_1} \Bigl(\int |g(y)|^{\beta p_2}\,dy\Bigr)^{1/p_2}$$
$$= \Bigl(\int |f(x-y)|^{(1-\alpha)r}|g(y)|^{(1-\beta)r}\,dy\Bigr)^{1/r} \|f\|_{\alpha p_1}^{\alpha} \|g\|_{\beta p_2}^{\beta}.$$
Taking the $r$-th power of this equation and integrating on $x$ gives
$$\|f * g\|_r^r \le \Bigl(\int\!\!\int |f(x-y)|^{(1-\alpha)r}|g(y)|^{(1-\beta)r}\,dy\,dx\Bigr)\, \|f\|_{\alpha p_1}^{\alpha r} \|g\|_{\beta p_2}^{\beta r} = \|f\|_{(1-\alpha)r}^{(1-\alpha)r} \|g\|_{(1-\beta)r}^{(1-\beta)r} \|f\|_{\alpha p_1}^{\alpha r} \|g\|_{\beta p_2}^{\beta r}. \qquad (11.8)$$
Let us now suppose $(1-\alpha)r = \alpha p_1$ and $(1-\beta)r = \beta p_2$, in which case Eq. (11.8) becomes
$$\|f * g\|_r^r \le \|f\|_{\alpha p_1}^{r} \|g\|_{\beta p_2}^{r}$$
which is Eq. (11.7) with
$$(11.9)\qquad p := (1-\alpha)r = \alpha p_1 \quad\text{and}\quad q := (1-\beta)r = \beta p_2.$$
So to finish the proof, it suffices to show $p$ and $q$ are arbitrary indices in $[1,\infty]$ satisfying $p^{-1} + q^{-1} = 1 + r^{-1}$.
If $\alpha, \beta, p_1, p_2$ satisfy the relations above, then
$$\alpha = \frac{r}{r + p_1} \quad\text{and}\quad \beta = \frac{r}{r + p_2}$$
and
$$\frac1p + \frac1q = \frac{1}{p_1}\frac{r + p_1}{r} + \frac{1}{p_2}\frac{r + p_2}{r} = \frac{1}{p_1} + \frac{1}{p_2} + \frac{2}{r} = 1 + \frac1r.$$
Conversely, if $p, q, r$ satisfy Eq. (11.6), then let $\alpha$ and $\beta$ satisfy $p = (1-\alpha)r$ and $q = (1-\beta)r$, i.e.
$$\alpha := \frac{r-p}{r} = 1 - \frac{p}{r} \le 1 \quad\text{and}\quad \beta := \frac{r-q}{r} = 1 - \frac{q}{r} \le 1.$$
From Eq. (11.6), $\alpha = p(1 - \tfrac1q) \ge 0$ and $\beta = q(1 - \tfrac1p) \ge 0$, so that $\alpha, \beta \in [0,1]$. We then define $p_1 := p/\alpha$ and $p_2 := q/\beta$, then
$$\frac{1}{p_1} + \frac{1}{p_2} + \frac1r = \frac{\beta}{q} + \frac{\alpha}{p} + \frac1r = \Bigl(\frac1q - \frac1r\Bigr) + \Bigl(\frac1p - \frac1r\Bigr) + \frac1r = 1$$
as desired.
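As an informal numerical sanity check on the exponents in Eq. (11.6), the sketch below (with arbitrarily chosen $f$, $g$ and grid) forms a Riemann-sum approximation of $f*g$ with `numpy.convolve` and compares $\|f*g\|_r$ with $\|f\|_p\|g\|_q$ for a few admissible triples $(p,q,r)$; the discrete Young inequality scales the same way as the continuous one, so the comparison is meaningful despite the discretization.

```python
import numpy as np

dx = 1e-3
x = np.arange(-2, 2, dx)
f = np.exp(-x ** 2) * (np.abs(x) <= 1)      # an arbitrary compactly supported f
g = np.maximum(1 - np.abs(x), 0.0)          # an arbitrary g in L^q (triangle function)

def lp_norm(h, p):
    return (np.sum(np.abs(h) ** p) * dx) ** (1.0 / p)

conv = np.convolve(f, g) * dx               # Riemann-sum approximation of f*g

for p, q in [(1.0, 1.0), (1.0, 2.0), (1.5, 2.0), (2.0, 1.2)]:
    r = 1.0 / (1.0 / p + 1.0 / q - 1.0)     # Eq. (11.6): 1/p + 1/q = 1 + 1/r (finite r here)
    lhs = lp_norm(conv, r)
    rhs = lp_norm(f, p) * lp_norm(g, q)
    print(f"p={p}, q={q}, r={r:.2f}:  ||f*g||_r = {lhs:.4f},  ||f||_p ||g||_q = {rhs:.4f}")
```

In every case the first printed number is bounded by the second, in agreement with Eq. (11.7).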
Theorem 11.21 (Approximate $\delta$-functions). Let $p \in [1,\infty]$, $\phi \in L^1(\mathbb{R}^n)$, $a := \int_{\mathbb{R}^n} \phi(x)\,dx$, and for $t > 0$ let $\phi_t(x) = t^{-n}\phi(x/t)$. Then
(1) If $f \in L^p$ with $p < \infty$ then $\phi_t * f \to af$ in $L^p$ as $t \downarrow 0$.
(2) If $f \in BC(\mathbb{R}^n)$ and $f$ is uniformly continuous then $\|\phi_t * f - af\|_{\infty} \to 0$ as $t \downarrow 0$.
(3) If $f \in L^{\infty}$ and $f$ is continuous on $U \subset_o \mathbb{R}^n$ then $\phi_t * f \to af$ uniformly on compact subsets of $U$ as $t \downarrow 0$.

Proof. Making the change of variables $y = tz$ implies
$$\phi_t * f(x) = \int_{\mathbb{R}^n} f(x-y)\phi_t(y)\,dy = \int_{\mathbb{R}^n} f(x-tz)\phi(z)\,dz$$
so that
$$\phi_t * f(x) - af(x) = \int_{\mathbb{R}^n} [f(x-tz) - f(x)]\,\phi(z)\,dz = \int_{\mathbb{R}^n} [\tau_{tz}f(x) - f(x)]\,\phi(z)\,dz. \qquad (11.10)$$
Hence by Minkowski's inequality for integrals (Theorem 9.27), Proposition 11.13 and the dominated convergence theorem,
$$\|\phi_t * f - af\|_p \le \int_{\mathbb{R}^n} \|\tau_{tz}f - f\|_p\,|\phi(z)|\,dz \to 0 \text{ as } t \downarrow 0.$$
Item 2. is proved similarly. Indeed, from Eq. (11.10)
$$\|\phi_t * f - af\|_{\infty} \le \int_{\mathbb{R}^n} \|\tau_{tz}f - f\|_{\infty}\,|\phi(z)|\,dz$$
which again tends to zero by the dominated convergence theorem because $\lim_{t\downarrow 0}\|\tau_{tz}f - f\|_{\infty} = 0$ uniformly in $z$ by the uniform continuity of $f$.
Item 3. Let $B_R = B(0,R)$ be a large ball in $\mathbb{R}^n$ and $K \subset\subset U$, then
$$\sup_{x \in K} |\phi_t * f(x) - af(x)| \le \Bigl|\int_{B_R} [f(x-tz) - f(x)]\phi(z)\,dz\Bigr| + \Bigl|\int_{B_R^c} [f(x-tz) - f(x)]\phi(z)\,dz\Bigr|$$
$$\le \int_{B_R} |\phi(z)|\,dz \cdot \sup_{x \in K, z \in B_R} |f(x-tz) - f(x)| + 2\|f\|_{\infty}\int_{B_R^c} |\phi(z)|\,dz$$
$$\le \|\phi\|_1 \cdot \sup_{x \in K, z \in B_R} |f(x-tz) - f(x)| + 2\|f\|_{\infty}\int_{|z|>R} |\phi(z)|\,dz$$
so that using the uniform continuity of $f$ on compact subsets of $U$,
$$\limsup_{t\downarrow 0}\, \sup_{x \in K} |\phi_t * f(x) - af(x)| \le 2\|f\|_{\infty}\int_{|z|>R} |\phi(z)|\,dz \to 0 \text{ as } R \to \infty.$$
See Theorem 8.15 of Folland for a statement about almost everywhere convergence.
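For a concrete feel for Item 2, the following one-dimensional Python sketch (with an arbitrary Gaussian choice of $\phi$, so $a = 1$, and an arbitrary uniformly continuous $f$) approximates $\phi_t * f$ by a Riemann sum and prints the sup-distance to $f$ as $t \downarrow 0$:

```python
import numpy as np

dx = 1e-3
x = np.arange(-5, 5, dx)
f = np.abs(np.sin(x))                            # a bounded, uniformly continuous test f

def mollify(f, t):
    """Riemann-sum approximation of (phi_t * f)(x) with phi(x) a standard Gaussian."""
    phi_t = np.exp(-(x / t) ** 2 / 2) / (t * np.sqrt(2 * np.pi))   # phi_t(x) = t^{-1} phi(x/t)
    return np.convolve(f, phi_t, mode="same") * dx

for t in [1.0, 0.3, 0.1, 0.03, 0.01]:
    err = np.max(np.abs(mollify(f, t) - f)[1000:-1000])  # ignore edge effects of the finite grid
    print(f"t = {t:5.2f}   sup|phi_t*f - f| ~ {err:.4f}")
```

The printed errors decrease with $t$, as the theorem predicts; the residual error comes from the discretization and the truncation of the grid, not from the analysis.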
Exercise 11.1. Let
$$f(t) = \begin{cases} e^{-1/t} & \text{if } t > 0 \\ 0 & \text{if } t \le 0. \end{cases}$$
Show $f \in C^{\infty}(\mathbb{R},[0,1])$.

Lemma 11.22. There exists $\phi \in C_c^{\infty}(\mathbb{R}^n,[0,\infty))$ such that $\phi(0) > 0$, $\operatorname{supp}(\phi) \subset \bar B(0,1)$ and $\int_{\mathbb{R}^n} \phi(x)\,dx = 1$.

Proof. Define $h(t) = f(1-t)f(t+1)$ where $f$ is as in Exercise 11.1. Then $h \in C_c^{\infty}(\mathbb{R},[0,1])$, $\operatorname{supp}(h) \subset [-1,1]$ and $h(0) = e^{-2} > 0$. Define $c = \int_{\mathbb{R}^n} h(|x|^2)\,dx$. Then $\phi(x) = c^{-1}h(|x|^2)$ is the desired function.
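A direct numerical transcription of this construction in one dimension (an informal sketch; the grid and tolerances are arbitrary) builds $f$, $h$ and $\phi = c^{-1}h(|x|^2)$ and checks the normalization, positivity at $0$ and the support condition:

```python
import numpy as np

def f(t):
    """f(t) = exp(-1/t) for t > 0 and 0 for t <= 0 (Exercise 11.1)."""
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    pos = t > 0
    out[pos] = np.exp(-1.0 / t[pos])
    return out

def h(t):
    """h(t) = f(1 - t) f(t + 1): smooth, supported in [-1, 1], h(0) = e^{-2} > 0."""
    return f(1.0 - t) * f(t + 1.0)

dx = 1e-4
x = np.arange(-1.5, 1.5, dx)
c = np.sum(h(x ** 2)) * dx          # c = int h(|x|^2) dx  (here n = 1)
phi = h(x ** 2) / c                 # phi: nonnegative, supported in [-1, 1], integral 1

print("integral of phi ~", np.sum(phi) * dx)                       # ~ 1
print("phi(0) =", h(np.array([0.0]))[0] / c, "> 0")
print("support inside [-1, 1]:", bool(np.all(phi[np.abs(x) > 1] == 0)))
```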
Definition 11.23. Let $X \subset \mathbb{R}^n$ be an open set. A Radon measure on $\mathcal{B}_X$ is a measure which is finite on compact subsets of $X$. For a Radon measure $\mu$, we let $L^1_{loc}(\mu)$ consist of those measurable functions $f : X \to \mathbb{C}$ such that $\int_K |f|\,d\mu < \infty$ for all compact subsets $K \subset X$.

The reader is asked to prove the following proposition in Exercise 11.6 below.

Proposition 11.24. Suppose that $f \in L^1_{loc}(\mathbb{R}^n,m)$ and $\phi \in C_c^1(\mathbb{R}^n)$, then $f * \phi \in C^1(\mathbb{R}^n)$ and $\partial_i(f * \phi) = f * \partial_i\phi$. Moreover if $\phi \in C_c^{\infty}(\mathbb{R}^n)$ then $f * \phi \in C^{\infty}(\mathbb{R}^n)$.

Corollary 11.25 ($C^{\infty}$ Urysohn's Lemma). Given $K \subset\subset U \subset_o \mathbb{R}^n$, there exists $f \in C_c^{\infty}(\mathbb{R}^n,[0,1])$ such that $\operatorname{supp}(f) \subset U$ and $f = 1$ on $K$.

Proof. Let $\phi$ be as in Lemma 11.22, $\phi_t(x) = t^{-n}\phi(x/t)$ be as in Theorem 11.21, $d$ be the standard metric on $\mathbb{R}^n$ and $\varepsilon = d(K,U^c)$. Since $K$ is compact and $U^c$ is closed, $\varepsilon > 0$. Let $V_\delta = \{x \in \mathbb{R}^n : d(x,K) < \delta\}$ and $f = \phi_{\varepsilon/3} * 1_{V_{\varepsilon/3}}$, then
$$\operatorname{supp}(f) \subset \operatorname{supp}(\phi_{\varepsilon/3}) + V_{\varepsilon/3} \subset \bar V_{2\varepsilon/3} \subset U.$$
Since $\bar V_{2\varepsilon/3}$ is closed and bounded, $f \in C_c^{\infty}(U)$ and for $x \in K$,
$$f(x) = \int_{\mathbb{R}^n} 1_{d(y,K)<\varepsilon/3}\,\phi_{\varepsilon/3}(x-y)\,dy = \int_{\mathbb{R}^n} \phi_{\varepsilon/3}(x-y)\,dy = 1.$$
The proof will be finished after the reader (easily) verifies $0 \le f \le 1$.
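Here is a one-dimensional numerical rendering of the construction $f = \phi_{\varepsilon/3} * 1_{V_{\varepsilon/3}}$ (an informal sketch with a hypothetical $K = [-\tfrac12,\tfrac12]$ and $U = (-1,1)$, reusing the bump of Lemma 11.22); it produces a smooth function equal to $1$ on $K$ with support inside $U$:

```python
import numpy as np

dx = 1e-3
x = np.arange(-3, 3, dx)

def raw_bump(u):
    """h(u^2) = exp(-1/(1-u^2)) exp(-1/(1+u^2)) for |u| < 1 and 0 otherwise (Lemma 11.22, n = 1)."""
    out = np.zeros_like(u)
    inside = np.abs(u) < 1
    out[inside] = np.exp(-1.0 / (1.0 - u[inside] ** 2) - 1.0 / (1.0 + u[inside] ** 2))
    return out

K = (-0.5, 0.5)          # hypothetical compact set K
U = (-1.0, 1.0)          # hypothetical open set U, so eps = d(K, U^c) = 1/2
t = 0.5 / 3.0            # eps / 3

phi_t = raw_bump(x / t)
phi_t /= np.sum(phi_t) * dx                                        # phi_t = t^{-1} phi(x/t), integral 1
indicator_V = ((x > K[0] - t) & (x < K[1] + t)).astype(float)      # 1_{V_{eps/3}}
f = np.convolve(indicator_V, phi_t, mode="same") * dx

print("f = 1 on K:", bool(np.allclose(f[(x >= K[0]) & (x <= K[1])], 1.0)))
print("supp(f) inside U:", bool(np.all(f[(x <= U[0]) | (x >= U[1])] == 0)))
```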
Here is an application of this corollary whose proof is left to the reader, Exercise 11.7.

Lemma 11.26 (Integration by Parts). Suppose $f$ and $g$ are measurable functions on $\mathbb{R}^n$ such that $t \mapsto f(x_1,\dots,x_{i-1},t,x_{i+1},\dots,x_n)$ and $t \mapsto g(x_1,\dots,x_{i-1},t,x_{i+1},\dots,x_n)$ are continuously differentiable functions on $\mathbb{R}$ for each fixed $x = (x_1,\dots,x_n) \in \mathbb{R}^n$. Moreover assume $f \cdot g$, $\frac{\partial f}{\partial x_i} \cdot g$ and $f \cdot \frac{\partial g}{\partial x_i}$ are in $L^1(\mathbb{R}^n,m)$. Then
$$\int_{\mathbb{R}^n} \frac{\partial f}{\partial x_i}\, g\, dm = -\int_{\mathbb{R}^n} f\, \frac{\partial g}{\partial x_i}\, dm.$$
With this result we may give another proof of the Riemann Lebesgue Lemma.

Lemma 11.27. For $f \in L^1(\mathbb{R}^n,m)$ let
$$\hat f(\xi) := (2\pi)^{-n/2}\int_{\mathbb{R}^n} f(x)\,e^{-i\xi\cdot x}\,dm(x)$$
be the Fourier transform of $f$. Then $\hat f \in C_0(\mathbb{R}^n)$ and $\|\hat f\|_u \le (2\pi)^{-n/2}\|f\|_1$. (The choice of the normalization factor, $(2\pi)^{-n/2}$, in $\hat f$ is for later convenience.)

Proof. The fact that $\hat f$ is continuous is a simple application of the dominated convergence theorem. Moreover,
$$|\hat f(\xi)| \le (2\pi)^{-n/2}\int |f(x)|\,dm(x) \le (2\pi)^{-n/2}\|f\|_1$$
so it only remains to see that $\hat f(\xi) \to 0$ as $|\xi| \to \infty$.
First suppose that $f \in C_c^{\infty}(\mathbb{R}^n)$ and let $\Delta = \sum_{j=1}^{n}\frac{\partial^2}{\partial x_j^2}$ be the Laplacian on $\mathbb{R}^n$. Notice that $\frac{\partial}{\partial x_j} e^{-i\xi\cdot x} = -i\xi_j e^{-i\xi\cdot x}$ and $\Delta e^{-i\xi\cdot x} = -|\xi|^2 e^{-i\xi\cdot x}$. Using Lemma 11.26 repeatedly,
$$\int \Delta^k f(x)\, e^{-i\xi\cdot x}\,dm(x) = \int f(x)\,\Delta_x^k e^{-i\xi\cdot x}\,dm(x) = (-|\xi|^2)^k \int f(x)\,e^{-i\xi\cdot x}\,dm(x) = (-1)^k (2\pi)^{n/2}|\xi|^{2k}\,\hat f(\xi)$$
for any $k \in \mathbb{N}$. Hence $(2\pi)^{n/2}|\hat f(\xi)| \le |\xi|^{-2k}\|\Delta^k f\|_1 \to 0$ as $|\xi| \to \infty$ and $\hat f \in C_0(\mathbb{R}^n)$. Suppose that $f \in L^1(m)$ and $f_k \in C_c^{\infty}(\mathbb{R}^n)$ is a sequence such that $\lim_{k\to\infty}\|f - f_k\|_1 = 0$, then $\lim_{k\to\infty}\|\hat f - \hat f_k\|_u = 0$. Hence $\hat f \in C_0(\mathbb{R}^n)$ by an application of Proposition 10.30.
Corollary 11.28. Let $X \subset \mathbb{R}^n$ be an open set and $\mu$ be a Radon measure on $\mathcal{B}_X$.
(1) Then $C_c^{\infty}(X)$ is dense in $L^p(\mu)$ for all $1 \le p < \infty$.
(2) If $h \in L^1_{loc}(\mu)$ satisfies
$$(11.11)\qquad \int_X f h\,d\mu = 0 \text{ for all } f \in C_c^{\infty}(X)$$
then $h(x) = 0$ for $\mu$-a.e. $x$.

Proof. Let $f \in C_c(X)$, $\phi$ be as in Lemma 11.22, $\phi_t$ be as in Theorem 11.21 and set $\psi_t := \phi_t * (f 1_X)$. Then by Proposition 11.24 $\psi_t \in C^{\infty}(X)$ and by Lemma 11.16 there exists a compact set $K \subset X$ such that $\operatorname{supp}(\psi_t) \subset K$ for all $t$ sufficiently small. By Theorem 11.21, $\psi_t \to f$ uniformly on $X$ as $t \downarrow 0$.
(1) The dominated convergence theorem (with dominating function being $\|f\|_{\infty} 1_K$) shows $\psi_t \to f$ in $L^p(\mu)$ as $t \downarrow 0$. This proves Item 1., since Proposition 11.6 guarantees that $C_c(X)$ is dense in $L^p(\mu)$.
(2) Keeping the same notation as above, the dominated convergence theorem (with dominating function being $\|f\|_{\infty}|h|1_K$) implies
$$0 = \lim_{t\downarrow 0}\int_X \psi_t h\,d\mu = \int_X \lim_{t\downarrow 0}\psi_t\, h\,d\mu = \int_X f h\,d\mu.$$
The proof is now finished by an application of Lemma 11.7.
11.1.1. Smooth Partitions of Unity. We have the following smooth variants of Proposition 10.24, Theorem 10.26 and Corollary 10.27. The proofs of these results are the same as their continuous counterparts. One simply uses the smooth version of Urysohn's Lemma of Corollary 11.25 in place of Lemma 10.15.

Proposition 11.29 (Smooth Partitions of Unity for Compacts). Suppose that $X$ is an open subset of $\mathbb{R}^n$, $K \subset X$ is a compact set and $\mathcal{U} = \{U_j\}_{j=1}^{n}$ is an open cover of $K$. Then there exists a smooth (i.e. $h_j \in C^{\infty}(X,[0,1])$) partition of unity $\{h_j\}_{j=1}^{n}$ of $K$ such that $h_j \prec U_j$ for all $j = 1,2,\dots,n$.

Theorem 11.30 (Locally Compact Partitions of Unity). Suppose that $X$ is an open subset of $\mathbb{R}^n$ and $\mathcal{U}$ is an open cover of $X$. Then there exists a smooth partition of unity $\{h_i\}_{i=1}^{N}$ ($N = \infty$ is allowed here) subordinate to the cover $\mathcal{U}$ such that $\operatorname{supp}(h_i)$ is compact for all $i$.

Corollary 11.31. Suppose that $X$ is an open subset of $\mathbb{R}^n$ and $\mathcal{U} = \{U_\alpha\}_{\alpha \in A}$ is an open cover of $X$. Then there exists a smooth partition of unity $\{h_\alpha\}_{\alpha \in A}$ subordinate to the cover $\mathcal{U}$ such that $\operatorname{supp}(h_\alpha) \subset U_\alpha$ for all $\alpha \in A$. Moreover if $\bar U_\alpha$ is compact for each $\alpha \in A$ we may choose $h_\alpha$ so that $h_\alpha \prec U_\alpha$.
11.2. Classical Weierstrass Approximation Theorem. Let $\mathbb{Z}_+ := \mathbb{N} \cup \{0\}$.

Notation 11.32. For $x \in \mathbb{R}^d$ and $\alpha \in \mathbb{Z}_+^d$ let $x^\alpha = \prod_{i=1}^{d} x_i^{\alpha_i}$ and $|\alpha| = \sum_{i=1}^{d}\alpha_i$. A polynomial on $\mathbb{R}^d$ is a function $p : \mathbb{R}^d \to \mathbb{C}$ of the form
$$p(x) = \sum_{\alpha : |\alpha| \le N} p_\alpha x^\alpha \quad\text{with } p_\alpha \in \mathbb{C} \text{ and } N \in \mathbb{Z}_+.$$
If $p_\alpha \neq 0$ for some $\alpha$ such that $|\alpha| = N$, then we define $\deg(p) := N$ to be the degree of $p$. The function $p$ has a natural extension to $z \in \mathbb{C}^d$, namely $p(z) = \sum_{\alpha : |\alpha| \le N} p_\alpha z^\alpha$ where $z^\alpha = \prod_{i=1}^{d} z_i^{\alpha_i}$.

Remark 11.33. The mapping $(x,y) \in \mathbb{R}^d \times \mathbb{R}^d \mapsto z = x + iy \in \mathbb{C}^d$ is an isomorphism of vector spaces. Letting $\bar z = x - iy$ as usual, we have $x = \frac{z + \bar z}{2}$ and $y = \frac{z - \bar z}{2i}$. Therefore under this identification any polynomial $p(x,y)$ on $\mathbb{R}^d \times \mathbb{R}^d$ may be written as a polynomial $q$ in $(z,\bar z)$, namely
$$q(z,\bar z) = p\Bigl(\frac{z + \bar z}{2}, \frac{z - \bar z}{2i}\Bigr).$$
Conversely a polynomial $q$ in $(z,\bar z)$ may be thought of as a polynomial $p$ in $(x,y)$, namely $p(x,y) = q(x+iy, x-iy)$.
Theorem 11.34 (Weierstrass Approximation Theorem). Let $a, b \in \mathbb{R}^d$ with $a \le b$ (i.e. $a_i \le b_i$ for $i = 1,2,\dots,d$) and set $[a,b] := [a_1,b_1] \times \dots \times [a_d,b_d]$. Then for $f \in C([a,b],\mathbb{C})$ there exist polynomials $p_n$ on $\mathbb{R}^d$ such that $p_n \to f$ uniformly on $[a,b]$.

We will give two proofs of this theorem below. The first proof is based on the weak law of large numbers, while the second is based on using a certain sequence of approximate $\delta$-functions.

Corollary 11.35. Suppose that $K \subset \mathbb{R}^d$ is a compact set and $f \in C(K,\mathbb{C})$. Then there exist polynomials $p_n$ on $\mathbb{R}^d$ such that $p_n \to f$ uniformly on $K$.

Proof. Choose $a, b \in \mathbb{R}^d$ such that $a \le b$ and $K \subset (a,b) := (a_1,b_1) \times \dots \times (a_d,b_d)$. Let $\tilde f : K \cup (a,b)^c \to \mathbb{C}$ be the continuous function defined by $\tilde f|_K = f$ and $\tilde f|_{(a,b)^c} \equiv 0$. Then by the Tietze extension Theorem (either of Theorems 10.2 or 10.16 will do) there exists $F \in C(\mathbb{R}^d,\mathbb{C})$ such that $\tilde f = F|_{K \cup (a,b)^c}$. Apply the Weierstrass Approximation Theorem 11.34 to $F|_{[a,b]}$ to find polynomials $p_n$ on $\mathbb{R}^d$ such that $p_n \to F$ uniformly on $[a,b]$. Clearly we also have $p_n \to f$ uniformly on $K$.

Corollary 11.36 (Complex Weierstrass Approximation Theorem). Suppose that $K \subset \mathbb{C}^d$ is a compact set and $f \in C(K,\mathbb{C})$. Then there exist polynomials $p_n(z,\bar z)$ for $z \in \mathbb{C}^d$ such that $\sup_{z \in K}|p_n(z,\bar z) - f(z)| \to 0$ as $n \to \infty$.

Proof. This is an immediate consequence of Remark 11.33 and Corollary 11.35.

Example 11.37. Let $K = S^1 = \{z \in \mathbb{C} : |z| = 1\}$ and $\mathcal{A}$ be the set of polynomials in $(z,\bar z)$ restricted to $S^1$. Then $\mathcal{A}$ is dense in $C(S^1)$. (Footnote 23: Note that it is easy to extend $f \in C(S^1)$ to a function $F \in C(\mathbb{C})$ by setting $F(z) = |z|\,f(z/|z|)$ for $z \neq 0$ and $F(0) = 0$. So this special case does not require the Tietze extension theorem.) Since $\bar z = z^{-1}$ on $S^1$, we have shown polynomials in $z$ and $z^{-1}$ are dense in $C(S^1)$. This example generalizes in an obvious way to $K = (S^1)^d \subset \mathbb{C}^d$.
11.2.1. First proof of the Weierstrass Approximation Theorem 11.34.

Proof. Let $0 := (0,0,\dots,0)$ and $1 := (1,1,\dots,1)$. By considering the real and imaginary parts of $f$ separately, it suffices to assume $f$ is real valued. By replacing $f$ by $g(x) = f(a_1 + x_1(b_1 - a_1), \dots, a_d + x_d(b_d - a_d))$ for $x \in [0,1]$, it suffices to prove the theorem for $f \in C([0,1])$.
For $x \in [0,1]$, let $\mu_x$ be the measure on $\{0,1\}$ such that $\mu_x(\{0\}) = 1 - x$ and $\mu_x(\{1\}) = x$. Then
$$(11.12)\qquad \int_{\{0,1\}} y\,d\mu_x(y) = 0\cdot(1-x) + 1\cdot x = x \quad\text{and}$$
$$(11.13)\qquad \int_{\{0,1\}} (y-x)^2\,d\mu_x(y) = x^2(1-x) + (1-x)^2 x = x(1-x).$$
For $x \in [0,1]$ let $\mu_x = \mu_{x_1}\otimes\dots\otimes\mu_{x_d}$ be the product of $\mu_{x_1},\dots,\mu_{x_d}$ on $\Omega := \{0,1\}^d$. Alternatively the measure $\mu_x$ may be described by
$$(11.14)\qquad \mu_x(\{\varepsilon\}) = \prod_{i=1}^{d} (1-x_i)^{1-\varepsilon_i} x_i^{\varepsilon_i}$$
for $\varepsilon \in \Omega$. Notice that $\mu_x(\{\varepsilon\})$ is a degree $d$ polynomial in $x$ for each $\varepsilon \in \Omega$. For $n \in \mathbb{N}$ and $x \in [0,1]$, let $\mu_x^n$ denote the $n$-fold product of $\mu_x$ with itself on $\Omega^n$, $X_i(\omega) = \omega_i \in \Omega \subset \mathbb{R}^d$ for $\omega \in \Omega^n$ and let
$$S_n = (S_n^1,\dots,S_n^d) := (X_1 + X_2 + \dots + X_n)/n,$$
so $S_n : \Omega^n \to \mathbb{R}^d$. The reader is asked to verify (Exercise 11.2) that
$$(11.15)\qquad \int_{\Omega^n} S_n\,d\mu_x^n = \Bigl(\int_{\Omega^n} S_n^1\,d\mu_x^n,\dots,\int_{\Omega^n} S_n^d\,d\mu_x^n\Bigr) = (x_1,\dots,x_d) = x$$
and
$$(11.16)\qquad \int_{\Omega^n} |S_n - x|^2\,d\mu_x^n = \frac1n\sum_{i=1}^{d} x_i(1-x_i) \le \frac{d}{n}.$$
From these equations it follows that $S_n$ is concentrating near $x$ as $n \to \infty$, a manifestation of the law of large numbers. Therefore it is reasonable to expect
$$(11.17)\qquad p_n(x) := \int_{\Omega^n} f(S_n)\,d\mu_x^n$$
should approach $f(x)$ as $n \to \infty$.
Let $\varepsilon > 0$ be given, $M = \sup\{|f(x)| : x \in [0,1]\}$ and
$$\delta_\varepsilon = \sup\{|f(y) - f(x)| : x,y \in [0,1] \text{ and } |y-x| \le \varepsilon\}.$$
By uniform continuity of $f$ on $[0,1]$, $\lim_{\varepsilon\downarrow 0}\delta_\varepsilon = 0$. Using these definitions and the fact that $\mu_x^n(\Omega^n) = 1$,
$$|f(x) - p_n(x)| = \Bigl|\int_{\Omega^n} (f(x) - f(S_n))\,d\mu_x^n\Bigr| \le \int_{\Omega^n} |f(x) - f(S_n)|\,d\mu_x^n$$
$$\le \int_{\{|S_n - x| > \varepsilon\}} |f(x) - f(S_n)|\,d\mu_x^n + \int_{\{|S_n - x| \le \varepsilon\}} |f(x) - f(S_n)|\,d\mu_x^n$$
$$\le 2M\,\mu_x^n(|S_n - x| > \varepsilon) + \delta_\varepsilon. \qquad (11.18)$$
By Chebyshev's inequality,
$$\mu_x^n(|S_n - x| > \varepsilon) \le \frac{1}{\varepsilon^2}\int_{\Omega^n} (S_n - x)^2\,d\mu_x^n = \frac{d}{n\varepsilon^2},$$
and therefore, Eq. (11.18) yields the estimate
$$\|f - p_n\|_u \le \frac{2dM}{n\varepsilon^2} + \delta_\varepsilon$$
and hence
$$\limsup_{n\to\infty}\|f - p_n\|_u \le \delta_\varepsilon \to 0 \text{ as } \varepsilon \downarrow 0.$$
This completes the proof since, using Eq. (11.14),
$$p_n(x) = \sum_{\omega \in \Omega^n} f(S_n(\omega))\,\mu_x^n(\{\omega\}) = \sum_{\omega \in \Omega^n} f(S_n(\omega))\prod_{i=1}^{n}\mu_x(\{\omega_i\})$$
is an $nd$-degree polynomial in $x \in \mathbb{R}^d$.
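For $d = 1$ the polynomial of Eq. (11.17) is the classical Bernstein polynomial $p_n(x) = \sum_{k=0}^{n} f(k/n)\binom{n}{k}x^k(1-x)^{n-k}$, since $\mu_x^n(S_n = k/n) = \binom{n}{k}x^k(1-x)^{n-k}$. The following informal Python sketch (with an arbitrary, non-smooth test function) computes it and prints the uniform error, whose decay is consistent with the $\tfrac{2dM}{n\varepsilon^2} + \delta_\varepsilon$ bound above:

```python
import numpy as np
from math import comb

def bernstein(f, n, x):
    """p_n(x) = sum_k f(k/n) C(n,k) x^k (1-x)^(n-k)  -- Eq. (11.17) for d = 1."""
    k = np.arange(n + 1)
    weights = np.array([comb(n, kk) for kk in k], dtype=float)
    x = np.asarray(x)[:, None]
    return (f(k / n) * weights * x ** k * (1 - x) ** (n - k)).sum(axis=1)

f = lambda x: np.abs(x - 0.5)            # an arbitrary continuous (non-smooth) test function
xs = np.linspace(0, 1, 1001)
for n in [5, 20, 80, 320]:
    err = np.max(np.abs(bernstein(f, n, xs) - f(xs)))
    print(f"n = {n:4d}   sup_[0,1] |f - p_n| = {err:.4f}")
```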
Exercise 11.2. Verify Eqs. (11.15) and (11.16). This is most easily done using Eqs. (11.12) and (11.13) and Fubini's theorem repeatedly. (Of course Fubini's theorem is overkill here since these are only finite sums after all. Nevertheless it is convenient to use this formulation.)
11.2.2. Second proof of the Weierstrass Approximation Theorem 11.34. For the second proof we will first need two lemmas.

Lemma 11.38 (Approximate $\delta$-sequences). Suppose that $\{Q_n\}_{n=1}^{\infty}$ is a sequence of positive functions on $\mathbb{R}^d$ such that
$$(11.19)\qquad \int_{\mathbb{R}^d} Q_n(x)\,dx = 1 \quad\text{and}$$
$$(11.20)\qquad \lim_{n\to\infty}\int_{|x|\ge\varepsilon} Q_n(x)\,dx = 0 \text{ for all } \varepsilon > 0.$$
For $f \in BC(\mathbb{R}^d)$, $Q_n * f$ converges to $f$ uniformly on compact subsets of $\mathbb{R}^d$.

Proof. Let $x \in \mathbb{R}^d$, then because of Eq. (11.19),
$$|Q_n * f(x) - f(x)| = \Bigl|\int_{\mathbb{R}^d} Q_n(y)\bigl(f(x-y) - f(x)\bigr)\,dy\Bigr| \le \int_{\mathbb{R}^d} Q_n(y)\,|f(x-y) - f(x)|\,dy.$$
Let $M = \sup\{|f(x)| : x \in \mathbb{R}^d\}$ and $\varepsilon > 0$, then by the above and Eq. (11.19),
$$|Q_n * f(x) - f(x)| \le \int_{|y|\le\varepsilon} Q_n(y)\,|f(x-y) - f(x)|\,dy + \int_{|y|>\varepsilon} Q_n(y)\,|f(x-y) - f(x)|\,dy$$
$$\le \sup_{|z|\le\varepsilon}|f(x+z) - f(x)| + 2M\int_{|y|>\varepsilon} Q_n(y)\,dy.$$
Let $K$ be a compact subset of $\mathbb{R}^d$, then
$$\sup_{x\in K}|Q_n * f(x) - f(x)| \le \sup_{|z|\le\varepsilon,\,x\in K}|f(x+z) - f(x)| + 2M\int_{|y|>\varepsilon} Q_n(y)\,dy$$
and hence by Eq. (11.20),
$$\limsup_{n\to\infty}\,\sup_{x\in K}|Q_n * f(x) - f(x)| \le \sup_{|z|\le\varepsilon,\,x\in K}|f(x+z) - f(x)|.$$
This finishes the proof since the right member of this equation tends to $0$ as $\varepsilon \downarrow 0$ by uniform continuity of $f$ on compact subsets of $\mathbb{R}^d$.

Let $q_n : \mathbb{R} \to [0,\infty)$ be defined by
$$(11.21)\qquad q_n(x) := \frac{1}{c_n}(1 - x^2)^n 1_{|x|\le 1} \quad\text{where}\quad c_n := \int_{-1}^{1}(1-x^2)^n\,dx.$$
Figure 26 displays the key features of the functions $q_n$.
Define
$$(11.22)\qquad Q_n : \mathbb{R}^d \to [0,\infty) \quad\text{by}\quad Q_n(x) = q_n(x_1)\cdots q_n(x_d).$$

Lemma 11.39. The sequence $\{Q_n\}_{n=1}^{\infty}$ is an approximate $\delta$-sequence, i.e. the $Q_n$ satisfy Eqs. (11.19) and (11.20).

Figure 26. A plot of $q_1$, $q_{50}$, and $q_{100}$. The most peaked curve is $q_{100}$ and the least is $q_1$. The total area under each of these curves is one.

Proof. The fact that $Q_n$ integrates to one is an easy consequence of Tonelli's theorem and the definition of $c_n$. Since all norms on $\mathbb{R}^d$ are equivalent, we may assume that $|x| = \max\{|x_i| : i = 1,2,\dots,d\}$ when proving Eq. (11.20). With this norm
$$\{x \in \mathbb{R}^d : |x| \ge \varepsilon\} = \cup_{i=1}^{d}\{x \in \mathbb{R}^d : |x_i| \ge \varepsilon\}$$
and therefore by Tonelli's theorem and the definition of $c_n$,
$$\int_{\{|x|\ge\varepsilon\}} Q_n(x)\,dx \le \sum_{i=1}^{d}\int_{\{|x_i|\ge\varepsilon\}} Q_n(x)\,dx = d\int_{\{x\in\mathbb{R}:|x|\ge\varepsilon\}} q_n(x)\,dx.$$
Since
$$\int_{|x|\ge\varepsilon} q_n(x)\,dx = \frac{2\int_\varepsilon^1 (1-x^2)^n\,dx}{2\int_0^\varepsilon (1-x^2)^n\,dx + 2\int_\varepsilon^1 (1-x^2)^n\,dx} \le \frac{\int_\varepsilon^1 \frac{x}{\varepsilon}(1-x^2)^n\,dx}{\int_0^\varepsilon \frac{x}{\varepsilon}(1-x^2)^n\,dx} = \frac{-(1-x^2)^{n+1}\big|_\varepsilon^1}{-(1-x^2)^{n+1}\big|_0^\varepsilon} = \frac{(1-\varepsilon^2)^{n+1}}{1 - (1-\varepsilon^2)^{n+1}} \to 0 \text{ as } n \to \infty,$$
the proof is complete.
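An informal numerical check of the tail estimate (a Python sketch with arbitrary grid and an arbitrary $\varepsilon$): compute $c_n$ and $\int_{|x|\ge\varepsilon}q_n\,dx$ by Riemann sums and compare with the bound $(1-\varepsilon^2)^{n+1}/\bigl(1-(1-\varepsilon^2)^{n+1}\bigr)$ obtained in the proof.

```python
import numpy as np

dx = 1e-5
x = np.arange(-1, 1, dx)
eps = 0.25

for n in [1, 10, 50, 100, 400]:
    w = (1 - x ** 2) ** n
    c_n = np.sum(w) * dx                              # c_n = int_{-1}^{1} (1 - x^2)^n dx
    tail = np.sum(w[np.abs(x) >= eps]) * dx / c_n     # int_{|x| >= eps} q_n(x) dx
    bound = (1 - eps ** 2) ** (n + 1) / (1 - (1 - eps ** 2) ** (n + 1))
    print(f"n = {n:4d}   tail mass = {tail:.2e}   bound from the proof = {bound:.2e}")
```

The computed tail mass stays below the bound and both tend to zero as $n$ grows, which is exactly Eq. (11.20).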
We will now prove Corollary 11.35, which clearly implies Theorem 11.34.

Proof (of Corollary 11.35). As in the beginning of the proof already given for Corollary 11.35, we may assume that $K = [a,b]$ for some $a \le b$ and $f = F|_K$ where $F \in C(\mathbb{R}^d,\mathbb{C})$ is a function such that $F|_{K^c} \equiv 0$. Moreover, by replacing $F(x)$ by $G(x) = F(a_1 + x_1(b_1 - a_1),\dots,a_d + x_d(b_d - a_d))$ for $x \in \mathbb{R}^d$ we may further assume $K = [0,1]$.
Let $Q_n(x)$ be defined as in Eq. (11.22). Then by Lemmas 11.39 and 11.38, $p_n(x) := (Q_n * F)(x) \to F(x)$ uniformly for $x \in [0,1]$ as $n \to \infty$. So to finish the proof it only remains to show $p_n(x)$ is a polynomial when $x \in [0,1]$. For $x \in [0,1]$,
$$p_n(x) = \int_{\mathbb{R}^d} Q_n(x-y)F(y)\,dy = \int_{[0,1]} F(y)\prod_{i=1}^{d}\Bigl[c_n^{-1}\bigl(1-(x_i-y_i)^2\bigr)^n 1_{|x_i-y_i|\le 1}\Bigr]\,dy = \int_{[0,1]} F(y)\prod_{i=1}^{d}\Bigl[c_n^{-1}\bigl(1-(x_i-y_i)^2\bigr)^n\Bigr]\,dy,$$
where the indicators may be dropped because $|x_i - y_i| \le 1$ when $x, y \in [0,1]$. Since the product in the above integrand is a polynomial in $(x,y) \in \mathbb{R}^d \times \mathbb{R}^d$, it follows easily that $p_n(x)$ is polynomial in $x$.
11.3. Stone-Weierstrass Theorem. We now wish to generalize Theorem 11.34 to more general topological spaces. We will first need some definitions.

Definition 11.40. Let $X$ be a topological space and $\mathcal{A} \subset C(X) = C(X,\mathbb{R})$ or $C(X,\mathbb{C})$ be a collection of functions. Then
(1) $\mathcal{A}$ is said to separate points if for all distinct points $x, y \in X$ there exists $f \in \mathcal{A}$ such that $f(x) \neq f(y)$.
(2) $\mathcal{A}$ is an algebra if $\mathcal{A}$ is a vector subspace of $C(X)$ which is closed under pointwise multiplication.
(3) $\mathcal{A}$ is called a lattice if $f \vee g := \max(f,g)$ and $f \wedge g := \min(f,g) \in \mathcal{A}$ for all $f, g \in \mathcal{A}$.
(4) $\mathcal{A} \subset C(X)$ is closed under conjugation if $\bar f \in \mathcal{A}$ whenever $f \in \mathcal{A}$. (Footnote 24: This is of course no restriction when $C(X) = C(X,\mathbb{R})$.)

Remark 11.41. If $X$ is a topological space such that $C(X,\mathbb{R})$ separates points then $X$ is Hausdorff. Indeed if $x, y \in X$ and $f \in C(X,\mathbb{R})$ such that $f(x) \neq f(y)$, then $f^{-1}(I)$ and $f^{-1}(J)$ are disjoint open sets containing $x$ and $y$ respectively when $I$ and $J$ are disjoint intervals containing $f(x)$ and $f(y)$ respectively.

Lemma 11.42. If $\mathcal{A} \subset C(X,\mathbb{R})$ is a closed algebra then $|f| \in \mathcal{A}$ for all $f \in \mathcal{A}$ and $\mathcal{A}$ is a lattice.

Proof. Let $f \in \mathcal{A}$ and let $M = \sup_{x\in X}|f(x)|$. Using Theorem 11.34 or Exercise 11.8, there are polynomials $p_n(t)$ such that
$$\lim_{n\to\infty}\sup_{|t|\le M}\bigl||t| - p_n(t)\bigr| = 0.$$
By replacing $p_n$ by $p_n - p_n(0)$ if necessary we may assume that $p_n(0) = 0$. Since $\mathcal{A}$ is an algebra, it follows that $f_n = p_n(f) \in \mathcal{A}$ and $|f| \in \mathcal{A}$, because $|f|$ is the uniform limit of the $f_n$'s. Since
$$f \vee g = \tfrac12(f + g + |f - g|) \quad\text{and}\quad f \wedge g = \tfrac12(f + g - |f - g|),$$
we have shown $\mathcal{A}$ is a lattice.
214 BRUCE K. DRIVER

Lemma 11.43. Let A C(X, R) be an algebra which separates points and x, y X


be distinct points such that
(11.23) f, g A 3 f(x) 6= 0 and g(y) 6= 0.
Then
(11.24) V := {(f(x), f(y)) : f A}= R
2
.
Proof. It is clear that V is a non-zero subspace of R
2.
If dim(V ) = 1, then V =
span(a, b) with a 6= 0 and b 6= 0 by the assumption in Eq. (11.23). Since (a, b) =
(f(x), f(y)) for some f Aand f
2
A, it follows that (a
2
, b
2
) = (f
2
(x), f
2
(y)) V
as well. Since dimV = 1, (a, b) and (a
2
, b
2
) are linearly dependent and therefore
0 = det

a a
2
b b
2

= ab
2
ba
2
= ab(b a)
which implies that a = b. But this the implies that f(x) = f(y) for all f A,
violating the assumption that A separates points. Therefore we conclude that
dim(V ) = 2, i.e. V = R
2
.
Theorem 11.44 (Stone-Weierstrass Theorem). ppose X is a compact Hausdor
space and A C(X, R) is a closed subalgebra which separates points. For x X
let
A
x
{f(x) : f A} and
I
x
= {f C(X, R) : f(x) = 0}.
Then either one of the following two cases hold.
(1) A
x
= R for all x X, i.e. for all x X there exists f A such that
f(x) 6= 0.
25
(2) There exists a unique point x
0
X such that A
x
0
= {0} .
Moreover in case (1) A = C(X, R) and in case (2) A = I
x
0
= {f C(X, R) :
f(x
0
) = 0}.
Proof. If there exists x
0
such that A
x
0
= {0} (x
0
is unique since A separates
points) then A I
x
0
. If such an x
0
exists let C = I
x
0
and if A
x
= R for all x, set
C = C(X, R). Let f C, then by Lemma 11.43, for all x, y X such that x 6= y
there exists g
xy
A such that f = g
xy
on {x, y}.
26
The basic idea of the proof is
contained in the following identity,
(11.25) f(z) = inf
xX
sup
yX
g
xy
(z) for all z X.
To prove this identity, let g
x
:= sup
yX
g
xy
and notice that g
x
f since g
xy
(y) =
f(y) for all y X. Moreover, g
x
(x) = f(x) for all x X since g
xy
(x) = f(x) for all
x. Therefore,
inf
xX
sup
yX
g
xy
= inf
xX
g
x
= f.
The rest of the proof is devoted to replacing the inf and the sup above by min and
max over nite sets at the expense of Eq. (11.25) becoming only an approximate
identity.
25
If A contains the constant function 1, then this hypothesis holds.
26
If A
x
0
= {0} and x = x
0
or y = x
0
, then g
xy
exists merely by the fact that A separates
points.
ANALYSIS TOOLS WITH APPLICATIONS 215
Claim 2. Given > 0 and x X there exists g
x
A such that g
x
(x) = f(x) and
f < g
x
+ on X.
To prove the claim, let V
y
be an open neighborhood of y such that |f g
xy
| <
on V
y
so in particular f < + g
xy
on V
y
. By compactness, there exists X
such that X =
S
y
V
y
. Set
g
x
(z) = max{g
xy
(z) : y },
then for any y , f < + g
xy
< + g
x
on V
y
and therefore f < + g
x
on X.
Moreover, by construction f(x) = g
x
(x), see Figure 27 below.
Figure 27. Constructing the funtions g
x
.
We now will nish the proof of the theorem. For each x X, let U
x
be a
neighborhood of x such that |f g
x
| < on U
x
. Choose X such that
X =
S
x
U
x
and dene
g = min{g
x
: x } A.
Then f < g + on X and for x , g
x
< f + on U
x
and hence g < f + on U
x
.
Since X =
S
x
U
x
, we conclude
f < g + and g < f + on X,
i.e. |f g| < on X. Since > 0 is arbitrary it follows that f

A = A.
Theorem 11.45 (Complex Stone-Weierstrass Theorem). Let X be a compact
Hausdor space. Suppose A C(X, C) is closed in the uniform topology, sep-
arates points, and is closed under conjugation. Then either A = C(X, C) or
A = I
C
x
0
:= {f C(X, C) : f(x
0
) = 0} for some x
0
X.
Proof. Since
Re f =
f +

f
2
and Imf =
f

f
2i
,
216 BRUCE K. DRIVER

Re f and Imf are both in A. Therefore


A
R
= {Re f, Imf : f A}
is a real sub-algebra of C(X, R) which separates points. Therefore either A
R
=
C(X, R) or A
R
= I
x
0
C(X, R) for some x
0
and hence A = C(X, C) or I
C
x
0
respectively.
As an easy application, Theorems 11.44 and 11.45 imply Corollaries 11.35 and
11.36 respectively.
Corollary 11.46. Suppose that X is a compact subset of R
n
and is a nite
measure on (X, B
X
), then polynomials are dense in L
p
(X, ) for all 1 p < .
Proof. Consider X to be a metric space with usual metric induced from R
n
.
Then X is a locally compact separable metric space and therefore C
c
(X, C) =
C(X, C) is dense in L
p
() for all p [1, ). Since, by the dominated convergence
theorem, uniform convergence implies L
p
() convergence, it follows from the
Stone - Weierstrass theorem that polynomials are also dense in L
p
().
Here are a couple of more applications.
Example 11.47. Let f C([a, b]) be a positive function which is injective. Then
functions of the form
P
N
k=1
a
k
f
k
with a
k
C and N N are dense in C([a, b]).
For example if a = 1 and b = 2, then one may take f(x) = x

for any 6= 0, or
f(x) = e
x
, etc.
Exercise 11.3. Let (X, d) be a separable compact metric space. Show that C(X)
is also separable. Hint: Let E X be a countable dense set and then consider the
algebra, A C(X), generated by {d(x, )}
xE
.
11.4. Locally Compact Version of Stone-Weierstrass Theorem.
Theorem 11.48. Let X be non-compact locally compact Hausdor space. If A is
a closed subalgebra of C
0
(X, R) which separates points. Then either A = C
0
(X, R)
or there exists x
0
X such that A = {f C
0
(X, R) : f(x
0
) = 0}.
Proof. There are two cases to consider.
Case 1. There is no point x
0
X such that A {f C
0
(X, R) : f(x
0
) = 0}.
In this case let X

= X {} be the one point compactication of X. Because of


Proposition 10.31 to each f A there exists a unique extension

f C(X

, R)
such that f =

f|
X
and moreover this extension is given by

f() = 0. Let
e
A := {

f C(X

, R) : f A}. Then
e
A is a closed (you check) sub-algebra
of C(X

, R) which separates points. An application of Theorem 11.44 implies


e
A = {F C(X

, R) 3F() = 0} and therefore by Proposition 10.31 A = {F|


X
:
F
e
A} = C
0
(X, R).
Case 2. There exists x
0
X such A {f C
0
(X, R) : f(x
0
) = 0}. In this
case let Y := X \ {x
0
} and A
Y
:= {f|
Y
: f A} . Since X is locally compact,
one easily checks A
Y
C
0
(Y, R) is a closed subalgebra which separates points.
By Case 1. it follows that A
Y
= C
0
(Y, R). So if f C
0
(X, R) and f(x
0
) = 0,
f|
Y
C
0
(Y, R) =A
Y
, i.e. there exists g A such that g|
Y
= f|
Y
. Since g(x
0
) =
f(x
0
) = 0, it follows that f = g A and therefore A = {f C
0
(X, R) : f(x
0
) = 0}.
Example 11.49. Let X = [0, ), > 0 be xed, A be the algebra generated by
t e
t
. So the general element f A is of the form f(t) = p(e
t
), where p(x)
ANALYSIS TOOLS WITH APPLICATIONS 217
is a polynomial. Since A C
0
(X, R) separates points and e
t
A is pointwise
positive,

A = C
0
(X, R).
As an application of this example, we will show that the Laplace transform is
injective.
Theorem 11.50. For f L
1
([0, ), dx), the Laplace transform of f is dened by
Lf()
Z

0
e
x
f(x)dx for all > 0.
If Lf() 0 then f(x) = 0 for m -a.e. x.
Proof. Suppose that f L
1
([0, ), dx) such that Lf() 0. Let g
C
0
([0, ), R) and > 0 be given. Choose {a

}
>0
such that #({ > 0 : a

6= 0}) <
and
|g(x)
X
>0
a

e
x
| < for all x 0.
Then

Z

0
g(x)f(x)dx

Z

0

g(x)
X
>0
a

e
x
!
f(x)dx

Z

0

g(x)
X
>0
a

e
x

|f(x)| dx kfk
1
.
Since > 0 is arbitrary, it follows that
R

0
g(x)f(x)dx = 0 for all g C
0
([0, ), R).
The proof is nished by an application of Lemma 11.7.
11.5. Dynkins Multiplicative System Theorem. This section is devoted to
an extension of Theorem 8.12 based on the Weierstrass approximation theorem. In
this section X is a set.
Denition 11.51 (Multiplicative System). A collection of real valued functions Q
on a set X is a multiplicative system provided f g Q whenever f, g Q.
Theorem 11.52 (Dynkins Multiplicative System Theorem). Let H be a linear sub-
space of B(X, R) which contains the constant functions and is closed under bounded
convergence. If Q H is multiplicative system, then H contains all bounded real
valued (Q)-measurable functions.
Theorem 11.53 (Complex Multiplicative System Theorem). Let H be a complex
linear subspace of B(X, C) such that: 1 H, H is closed under complex conjugation,
and H is closed under bounded convergence. If Q H is multiplicative system
which is closed under conjugation, then H contains all bounded complex valued
(Q)-measurable functions.
Proof. Let F be R or C. Let C be the family of all sets of the form:
(11.26) B := {x X : f
1
(x) R
1
, . . . , f
m
(x) R
m
}
where m = 1, 2, . . . , and for k = 1, 2, . . . , m, f
k
Q and R
k
is an open interval if
F = R or R
k
is an open rectangle in C if F = C. The family C is easily seen to be
a system such that (Q) = (C). So By Theorem 8.12, to nish the proof it
suces to show 1
B
H for all B C.
218 BRUCE K. DRIVER

It is easy to construct, for each k, a uniformly bounded sequence of continuous


functions

k
n

n=1
on F converging to the characteristic function 1
R
k
. By Weier-
strass theorem, there exists polynomials p
k
m
(x) such that

p
k
n
(x)
k
n
(x)

1/n
for |x| k
k
k

in the real case and polynomials p


k
m
(z, z) in z and z such that

p
k
n
(z, z)
k
n
(z)

1/n for |z| k


k
k

in the complex case. The functions


F
n
:=p
1
n
(f
1
)p
2
n
(f
2
) . . . p
m
n
(f
m
) (real case)
F
n
:=p
1
n
(f
1

f
1
)p
2
n
(f
2
,

f
2
) . . . p
m
n
(f
m
,

f
m
) (complex case)
on X are uniformly bounded, belong to H and converge pointwise to 1
B
as n ,
where B is the set in Eq. (11.26). Thus 1
B
H and the proof is complete.
Remark 11.54. Given any collection of bounded real valued functions F on X,
let H(F) be the subspace of B(X, R) generated by F, i.e. H(F) is the smallest
subspace of B(X, R) which is closed under bounded convergence and contains F.
With this notation, Theorem 11.52 may be stated as follows. If F is a multiplicative
system then H(F) = B
(F)
(X, R) the space of bounded (F) measurable real
valued functions on X.
11.6. Exercises.
Exercise 11.4. Let (X, ) be a topological space, a measure on B
X
= () and
f : X C be a measurable function. Letting be the measure, d = |f| d, show
supp() = supp

(f), where supp() is dened in Denition 9.41).


Exercise 11.5. Let (X, ) be a topological space, a measure on B
X
= () such
that supp() = X (see Denition 9.41). Show supp

(f) = supp(f) = {f 6= 0} for


all f C(X).
Exercise 11.6. Prove Proposition 11.24 by appealing to Corollary 7.43.
Exercise 11.7 (Integration by Parts). Suppose that (x, y) R R
n1
f(x, y)
C and (x, y) R R
n1
g(x, y) C are measurable functions such that for each
xed y R
n1
, x f(x, y) and x g(x, y) are continuously dierentiable. Also
assume f g,
x
f g and f
x
g are integrable relative to Lebesgue measure on
R R
n1
, where
x
f(x, y) :=
d
dt
f(x +t, y)|
t=0
. Show
(11.27)
Z
RR
n1

x
f(x, y) g(x, y)dxdy =
Z
RR
n1
f(x, y)
x
g(x, y)dxdy.
(Note: this result and Fubinis theorem proves Lemma 11.26.)
Hints: Let C

c
(R) be a function which is 1 in a neighborhood of 0 R and
set

(x) = (x). First verify Eq. (11.27) with f(x, y) replaced by

(x)f(x, y) by
doing the x integral rst. Then use the dominated convergence theorem to prove
Eq. (11.27) by passing to the limit, 0.
Exercise 11.8. Let M < , show there are polynomials p
n
(t) such that
lim
n
sup
|t|M
||t| p
n
(t)| = 0
as follows. Let f(t) =

1 t for |t| 1. By Taylors theorem with integral re-
mainder (see Eq. A.15 of Appendix A) or by analytic function theory, there are
ANALYSIS TOOLS WITH APPLICATIONS 219
constants
27

n
> 0 for n N such that

1 x = 1
P

n=1

n
x
n
for all |x| < 1.
Use this to prove
P

n=1

n
= 1 and therefore q
m
(x) := 1
P
m
n=1

n
x
n
lim
m
sup
|x|1
|

1 x q
m
(x)| = 0.
Let 1 x = t
2
/M
2
, i.e. x = 1 t
2
/M
2
, then
lim
m
sup
|t|M

|t|
M
q
m
(1 t
2
/M
2
)

= 0
so that p
m
(t) := Mq
m
(1 t
2
/M
2
) are the desired polynomials.
Exercise 11.9. Given a continuous function f : R C which is 2 -periodic and
> 0. Show there exists a trigonometric polynomial, p() =
n
P
n=N

n
e
in
, such that
|f() P()| < for all R. Hint: show that there exists a unique function
F C(S
1
) such that f() = F(e
i
) for all R.
Remark 11.55. Exercise 11.9 generalizes to 2 periodic functions on R
d
, i.e. func-
tions such that f(+2e
i
) = f() for all i = 1, 2, . . . , d where {e
i
}
d
i=1
is the standard
basis for R
d
. A trigonometric polynomial p() is a function of R
d
of the form
p() =
X
n

n
e
in
where is a nite subset of Z
d
. The assertion is again that these trigonometric
polynomials are dense in the 2 periodic functions relative to the supremum
norm.
Exercise 11.10. Let be a nite measure on B
R
d, then D := span{e
ix
: R
d
}
is a dense subspace of L
p
() for all 1 p < . Hints: By Proposition 11.6, C
c
(R
d
)
is a dense subspace of L
p
(). For f C
c
(R
d
) and N N, let
f
N
(x) :=
X
nZ
d
f(x + 2Nn).
Show f
N
BC(R
d
) and x f
N
(Nx) is 2 periodic, so by Exercise 11.9, x
f
N
(Nx) can be approximated uniformly by trigonometric polynomials. Use this
fact to conclude that f
N


D
L
p
()
. After this show f
N
f in L
p
().
Exercise 11.11. Suppose that and are two nite measures on R
d
such that
(11.28)
Z
R
d
e
ix
d(x) =
Z
R
d
e
ix
d(x)
for all R
d
. Show = .
Hint: Perhaps the easiest way to do this is to use Exercise 11.10 with the
measure being replaced by +. Alternatively, use the method of proof of Exercise
11.9 to show Eq. (11.28) implies
R
R
d
fd(x) =
R
R
d
fd(x) for all f C
c
(R
d
).
Exercise 11.12. Again let be a nite measure on B
R
d. Further assume that
C
M
:=
R
R
d
e
M|x|
d(x) < for all M (0, ). Let P(R
d
) be the space of
polynomials, (x) =
P
||N

with

C, on R
d
. (Notice that |(x)|
p

C(, p, M)e
M|x|
, so that P(R
d
) L
p
() for all 1 p < .) Show P(R
d
) is dense
in L
p
() for all 1 p < . Here is a possible outline.
27
In fact
n
:=
(2n3)!!
2
n
n!
, but this is not needed.
220 BRUCE K. DRIVER

Outline: For R
d
and n N let f
n

(x) = ( x)
n
/n!
(1) Use calculus to verify sup
t0
t

e
Mt
= (/M)

for all 0 where


(0/M)
0
:= 1. Use this estimate along with the identity
| x|
pn
||
pn
|x|
pn
=

|x|
pn
e
M|x|

||
pn
e
M|x|
to nd an estimate on kf
n

k
p
.
(2) Use your estimate on kf
n

k
p
to show
P

n=0
kf
n

k
p
< and conclude
lim
N

e
i()

N
X
n=0
f
n

p
= 0.
(3) Now nish by appealing to Exercise 11.10.
Exercise 11.13. Again let be a nite measure on B
R
d but now assume there
exists an > 0 such that C :=
R
R
d
e
|x|
d(x) < . Also let q > 1 and h L
q
()
be a function such that
R
R
d
h(x)x

d(x) = 0 for all N


d
0
. (As mentioned in
Exercise 11.13, P(R
d
) L
p
() for all 1 p < , so x h(x)x

is in L
1
().)
Show h(x) = 0 for a.e. x using the following outline.
Outline: For R
d
and n N let f

n
(x) = ( x)
n
/n! and let p = q/(q 1)
be the conjugate exponent to q.
(1) Use calculus to verify sup
t0
t

e
t
= (/)

for all 0 where


(0/)
0
:= 1. Use this estimate along with the identity
| x|
pn
||
pn
|x|
pn
=

|x|
pn
e
|x|

||
pn
e
|x|
to nd an estimate on

p
.
(2) Use your estimate on

p
to show there exists > 0 such that
P

n=0

p
< when || and conclude for || that e
ix
= L
p
()
P

n=0
f

n
(x). Conclude from this that
Z
R
d
h(x)e
ix
d(x) = 0 when || .
(3) Let R
d
(|| not necessarily small) and set g(t) :=
R
R
d
e
itx
h(x)d(x)
for t R. Show g C

(R) and
g
(n)
(t) =
Z
R
d
(i x)
n
e
itx
h(x)d(x) for all n N.
(4) Let T = sup{ 0 : g|
[0,]
0}. By Step 2., T . If T < , then
0 = g
(n)
(T) =
Z
R
d
(i x)
n
e
iTx
h(x)d(x) for all n N.
Use Step 3. with h replaced by e
iTx
h(x) to conclude
g(T +t) =
Z
R
d
e
i(T+t)x
h(x)d(x) = 0 for all t / || .
This violates the denition of T and therefore T = and in particular we
may take T = 1 to learn
Z
R
d
h(x)e
ix
d(x) = 0 for all R
d
.
ANALYSIS TOOLS WITH APPLICATIONS 221
(5) Use Exercise 11.10 to conclude that
Z
R
d
h(x)g(x)d(x) = 0
for all g L
p
(). Now choose g judiciously to nish the proof.
222 BRUCE K. DRIVER

12. Hilbert Spaces


12.1. Hilbert Spaces Basics.
Denition 12.1. Let H be a complex vector space. An inner product on H is a
function, h, i : H H C, such that
(1) hax +by, zi = ahx, zi +bhy, zi i.e. x hx, zi is linear.
(2) hx, yi = hy, xi.
(3) kxk
2
hx, xi 0 with equality kxk
2
= 0 i x = 0.
Notice that combining properties (1) and (2) that x hz, xi is anti-linear for
xed z H, i.e.
hz, ax +byi = ahz, xi +

bhz, yi.
We will often nd the following formula useful:
kx +yk
2
= hx +y, x +yi = kxk
2
+kyk
2
+hx, yi +hy, xi
= kxk
2
+kyk
2
+ 2Rehx, yi (12.1)
Theorem 12.2 (Schwarz Inequality). Let (H, h, i) be an inner product space, then
for all x, y H
|hx, yi| kxkkyk
and equality holds i x and y are linearly dependent.
Proof. If y = 0, the result holds trivially. So assume that y 6= 0. First o notice
that if x = y for some C, then hx, yi = kyk
2
and hence
|hx, yi| = || kyk
2
= kxkkyk.
Moreover, in this case :=
hx,yi
kyk
2
.
Now suppose that x H is arbitrary, let z x kyk
2
hx, yiy. (So z is the
orthogonal projection of x onto y, see Figure 28.) Then
Figure 28. The picture behind the proof.
0 kzk
2
=

x
hx, yi
kyk
2
y

2
= kxk
2
+
|hx, yi|
2
kyk
4
kyk
2
2Rehx,
hx, yi
kyk
2
yi
= kxk
2

|hx, yi|
2
kyk
2
from which it follows that 0 kyk
2
kxk
2
|hx, yi|
2
with equality i z = 0 or
equivalently i x = kyk
2
hx, yiy.
ANALYSIS TOOLS WITH APPLICATIONS 223
Corollary 12.3. Let (H, h, i) be an inner product space and kxk :=
p
hx, xi. Then
k k is a norm on H. Moreover h, i is continuous on H H, where H is viewed as
the normed space (H, kk).
Proof. The only non-trivial thing to verify that kk is a norm is the triangle
inequality:
kx +yk
2
= kxk
2
+kyk
2
+ 2Rehx, yi kxk
2
+kyk
2
+ 2kxk kyk
= (kxk +kyk)
2
where we have made use of Schwarzs inequality. Taking the square root of this
inequality shows kx +yk kxk +kyk. For the continuity assertion:
|hx, yi hx
0
, y
0
i| = |hx x
0
, yi +hx
0
, y y
0
i|
kykkx x
0
k +kx
0
kky y
0
k
kykkx x
0
k + (kxk +kx x
0
k) ky y
0
k
= kykkx x
0
k +kxkky y
0
k +kx x
0
kky y
0
k
from which it follows that h, i is continuous.
Denition 12.4. Let (H, h, i) be an inner product space, we say x, y H are
orthogonal and write x y i hx, yi = 0. More generally if A H is a set,
x H is orthogonal to A and write x A i hx, yi = 0 for all y A. Let
A

= {x H : x A} be the set of vectors orthogonal to A. We also say that a


set S H is orthogonal if x y for all x, y S such that x 6= y. If S further
satises, kxk = 1 for all x S, then S is said to be orthonormal.
Proposition 12.5. Let (H, h, i) be an inner product space then
(1) (Parallelogram Law)
(12.2) kx +yk
2
+kx yk
2
= 2kxk
2
+ 2kyk
2
for all x, y H.
(2) (Pythagorean Theorem) If S H is a nite orthonormal set, then
(12.3) k
X
xS
xk
2
=
X
xS
kxk
2
.
(3) If A H is a set, then A

is a closed linear subspace of H.


Remark 12.6. See Proposition 12.40 in the appendix below for the converse of
the parallelogram law.
Proof. I will assume that H is a complex Hilbert space, the real case being
easier. Items 1. and 2. are proved by the following elementary computations:
kx +yk
2
+kx yk
2
= kxk
2
+kyk
2
+ 2Rehx, yi +kxk
2
+kyk
2
2Rehx, yi
= 2kxk
2
+ 2kyk
2
,
and
k
X
xS
xk
2
= h
X
xS
x,
X
yS
yi =
X
x,yS
hx, yi
=
X
xS
hx, xi =
X
xS
kxk
2
.
224 BRUCE K. DRIVER

Item 3. is a consequence of the continuity of h, i and the fact that


A

=
xA
ker(h, xi)
where ker(h, xi) = {y H : hy, xi = 0} a closed subspace of H.
Denition 12.7. A Hilbert space is an inner product space (H, h, i) such that
the induced Hilbertian norm is complete.
Example 12.8. Let (X, M, ) be a measure space then H := L
2
(X, M, ) with
inner product
(f, g) =
Z
X
f gd
is a Hilbert space. In Exercise 12.6 you will show every Hilbert space H is equiv-
alent to a Hilbert space of this form.
Denition 12.9. A subset C of a vector space X is said to be convex if for all
x, y C the line segment [x, y] := {tx + (1 t)y : 0 t 1} joining x to y is
contained in C as well. (Notice that any vector subspace of X is convex.)
Theorem 12.10. Suppose that H is a Hilbert space and M H be a closed convex
subset of H. Then for any x H there exists a unique y M such that
kx yk = d(x, M) = inf
zM
kx zk.
Moreover, if M is a vector subspace of H, then the point y may also be characterized
as the unique point in M such that (x y) M.
Proof. By replacing M by M x := {mx : m M} we may assume x = 0.
Let := d(0, M) = inf
mM
kmk and y, z M, see Figure 29.
Figure 29. The geometry of convex sets.
By the parallelogram law and the convexity of M,
(12.4) 2kyk
2
+2kzk
2
= ky+zk
2
+kyzk
2
= 4k
y +z
2
||
2
+kyzk
2
4
2
+kyzk
2
.
Hence if kyk = kzk = , then 2
2
+ 2
2
4
2
+ ky zk
2
, so that ky zk
2
= 0.
Therefore, if a minimizer for d(0, )|
M
exists, it is unique.
ANALYSIS TOOLS WITH APPLICATIONS 225
Existence. Let y
n
M be chosen such that ky
n
k =
n
d(0, M). Taking
y = y
m
and z = y
n
in Eq. (12.4) shows 2
2
m
+ 2
2
n
4
2
+ky
n
y
m
k
2
. Passing to
the limit m, n in this equation implies,
2
2
+ 2
2
4
2
+ limsup
m,n
ky
n
y
m
k
2
.
Therefore {y
n
}

n=1
is Cauchy and hence convergent. Because M is closed, y :=
lim
n
y
n
M and because kk is continuous,
kyk = lim
n
ky
n
k = = d(0, M).
So y is the desired point in M which is closest to 0.
Now for the second assertion we further assume that M is a closed subspace of
H and x H. Let y M be the closest point in M to x. Then for w M, the
function
g(t) kx (y +tw)k
2
= kx yk
2
2tRehx y, wi +t
2
kwk
2
has a minimum at t = 0. Therefore 0 = g
0
(0) = 2Rehx y, wi. Since w M is
arbitrary, this implies that (x y) M. Finally suppose y M is any point such
that (x y) M. Then for z M, by Pythagoreans theorem,
kx zk
2
= kx y +y zk
2
= kx yk
2
+ky zk
2
kx yk
2
which shows d(x, M)
2
kx yk
2
. That is to say y is the point in M closest to x.
Denition 12.11. Suppose that A : H H is a bounded operator. The adjoint
of A, denote A

, is the unique operator A

: H H such that hAx, yi = hx, A

yi.
(The proof that A

exists and is unique will be given in Proposition 12.16 below.)


A bounded operator A : H H is self - adjoint or Hermitian if A = A

.
Denition 12.12. Let H be a Hilbert space and M H be a closed subspace.
The orthogonal projection of H onto M is the function P
M
: H H such that for
x H, P
M
(x) is the unique element in M such that (x P
M
(x)) M.
Proposition 12.13. Let H be a Hilbert space and M H be a closed subspace.
The orthogonal projection P
M
satises:
(1) P
M
is linear (and hence we will write P
M
x rather than P
M
(x).
(2) P
2
M
= P
M
(P
M
is a projection).
(3) P

M
= P
M
, (P
M
is self-adjoint).
(4) Ran(P
M
) = M and ker(P
M
) = M

.
Proof.
(1) Let x
1
, x
2
H and F, then P
M
x
1
+P
M
x
2
M and
P
M
x
1
+P
M
x
2
(x
1
+x
2
) = [P
M
x
1
x
1
+(P
M
x
2
x
2
)] M

showing P
M
x
1
+P
M
x
2
= P
M
(x
1
+x
2
), i.e. P
M
is linear.
(2) Obviously Ran(P
M
) = M and P
M
x = x for all x M. Therefore P
2
M
=
P
M
.
226 BRUCE K. DRIVER

(3) Let x, y H, then since (x P


M
x) and (y P
M
y) are in M

,
hP
M
x, yi = hP
M
x, P
M
y +y P
M
yi
= hP
M
x, P
M
yi
= hP
M
x + (x P
M
), P
M
yi
= hx, P
M
yi.
(4) It is clear that Ran(P
M
) M. Moreover, if x M, then P
M
x = x implies
that Ran(P
M
) = M. Now x ker(P
M
) i P
M
x = 0 i x = x 0 M

.
Corollary 12.14. Suppose that M H is a proper closed subspace of a Hilbert
space H, then H = M M

.
Proof. Given x H, let y = P
M
x so that x y M

. Then x = y +(x y)
M +M

. If x M M

, then x x, i.e. kxk


2
= hx, xi = 0. So M M

= {0} .
Proposition 12.15 (Riesz Theorem). Let H

be the dual space of H (Notation


3.63). The map
(12.5) z H
j
h, zi H

is a conjugate linear isometric isomorphism.


Proof. The map j is conjugate linear by the axioms of the inner products.
Moreover, for x, z H,
|hx, zi| kxk kzk for all x H
with equality when x = z. This implies that kjzk
H

= kh, zik
H

= kzk . Therefore
j is isometric and this shows that j is injective. To nish the proof we must show
that j is surjective. So let f H

which we assume with out loss of generality is


non-zero. Then M = ker(f) a closed proper subspace of H. Since, by Corollary
12.14, H = M M

, f : H/M

= M

F is a linear isomorphism. This


shows that dim(M

) = 1 and hence H = M Fx
0
where x
0
M

\ {0} .
28
Choose z = x
0
M

such that f(x


0
) = hx
0
, zi. (So =

f(x
0
)/ kx
0
k
2
.) Then for
x = m+x
0
with m M and F,
f(x) = f(x
0
) = hx
0
, zi = hx
0
, zi = hm+x
0
, zi = hx, zi
which shows that f = jz.
Proposition 12.16 (Adjoints). Let H and K be Hilbert spaces and A : H K
be a bounded operator. Then there exists a unique bounded operator A

: K H
such that
(12.6) hAx, yi
K
= hx, A

yi
H
for all x H and y K.
Moreover (A+B)

= A

+

B

, A

:= (A

= A, kA

k = kAk and kA

Ak =
kAk
2
for all A, B L(H, K) and C.
28
Alternatively, choose x
0
M

\{0} such that f(x


0
) = 1. For x M

we have f(xx
0
) = 0
provided that := f(x). Therefore x x
0
M M

= {0} , i.e. x = x
0
. This again shows
that M

is spanned by x
0
.
ANALYSIS TOOLS WITH APPLICATIONS 227
Proof. For each y K, then map x hAx, yi
K
is in H

and therefore there


exists by Proposition 12.15 a unique vector z H such that
hAx, yi
K
= hx, zi
H
for all x H.
This shows there is a unique map A

: K H such that hAx, yi


K
= hx, A

(y)i
H
for all x H and y K. To nish the proof, we need only show A

is linear and
bounded. To see A

is linear, let y
1
, y
2
K and C, then for any x H,
hAx, y
1
+y
2
i
K
= hAx, y
1
i
K
+

hAx, y
2
i
K
= hx, A

(y
1
)i
K
+

hx, A

(y
2
)i
K
= hx, A

(y
1
) +A

(y
2
)i
K
and by the uniqueness of A

(y
1
+y
2
) we nd
A

(y
1
+y
2
) = A

(y
1
) +A

(y
2
).
This shows A

is linear and so we will now write A

y instead of A

(y). Since
hA

y, xi
H
= hx, A

yi
H
= hAx, yi
K
= hy, Axi
K
it follows that A

= A. he assertion that (A+B)

= A

+

B

is left to the
reader, see Exercise 12.1.
The following arguments prove the assertions about norms of A and A

:
kA

k = sup
kK:kkk=1
kA

kk = sup
kK:kkk=1
sup
hH:khk=1
|hA

k, hi|
= sup
hH:khk=1
sup
kK:kkk=1
|hk, Ahi| = sup
hH:khk=1
kAhk = kAk ,
kA

Ak kA

k kAk = kAk
2
and
kAk
2
= sup
hH:khk=1
|hAh, Ahi| = sup
hH:khk=1
|hh, A

Ahi|
sup
hH:khk=1
kA

Ahk = kA

Ak .
Exercise 12.1. Let H, K, M be Hilbert space, A, B L(H, K), C L(K, M) and
C. Show (A+B)

= A

+

B

and (CA)

= A

L(M, H).
Exercise 12.2. Let H = C
n
and K = C
m
equipped with the usual inner products,
i.e. hz, wi
H
= z w for z, w H. Let A be an mn matrix thought of as a linear
operator from H to K. Show the matrix associated to A

: K H is the conjugate
transpose of A.
Exercise 12.3. Let K : L
2
() L
2
() be the operator dened in Exercise 9.12.
Show K

: L
2
() L
2
() is the operator given by
K

g(y) =
Z
X

k(x, y)g(x)d(x).
Denition 12.17. {u

}
A
H is an orthonormal set if u

for all 6=
and ku

k = 1.
228 BRUCE K. DRIVER

Proposition 12.18 (Bessels Inequality). Let {u

}
A
be an orthonormal set, then
(12.7)
X
A
|hx, u

i|
2
kxk
2
for all x H.
In particular the set { A : hx, u

i 6= 0} is at most countable for all x H.


Proof. Let A be any nite set. Then
0 kx
X

hx, u

iu

k
2
= kxk
2
2Re
X

hx, u

i hu

, xi +
X

|hx, u

i|
2
= kxk
2

|hx, u

i|
2
showing that
X

|hx, u

i|
2
kxk
2
.
Taking the supremum of this equation of A then proves Eq. (12.7).
Proposition 12.19. Suppose A H is an orthogonal set. Then s =
P
vA
v
exists in H i
P
vA
kvk
2
< . (In particular A must be at most a countable set.)
Moreover, if
P
vA
kvk
2
< , then
(1) ksk
2
=
P
vA
kvk
2
and
(2) hs, xi =
P
vA
hv, xi for all x H.
Similarly if {v
n
}

n=1
is an orthogonal set, then s =

P
n=1
v
n
exists in H i

P
n=1
kv
n
k
2
< . In particular if

P
n=1
v
n
exists, then it is independent of rearrange-
ments of {v
n
}

n=1
.
Proof. Suppose s =
P
vA
v exists. Then there exists A such that
X
v
kvk
2
=

X
v
v

2
1
for all A\ ,wherein the rst inequality we have used Pythagoreans theorem.
Taking the supremum over such shows that
P
vA\
kvk
2
1 and therefore
X
vA
kvk
2
1 +
X
v
kvk
2
< .
Conversely, suppose that
P
vA
kvk
2
< . Then for all > 0 there exists

A
such that if A\

,
(12.8)

X
v
v

2
=
X
v
kvk
2
<
2
.
Hence by Lemma 3.72,
P
vA
v exists.
For item 1, let

be as above and set s

:=
P
v

v. Then
|ksk ks

k| ks s

k <
and by Eq. (12.8),
0
X
vA
kvk
2
ks

k
2
=
X
v/

kvk
2

2
.
ANALYSIS TOOLS WITH APPLICATIONS 229
Letting 0 we deduce from the previous two equations that ks

k ksk and
ks

k
2

P
vA
kvk
2
as 0 and therefore ksk
2
=
P
vA
kvk
2
.
Item 2. is a special case of Lemma 3.72.
For the nal assertion, let s
N

N
P
n=1
v
n
and suppose that lim
N
s
N
= s exists
in H and in particular {s
N
}

N=1
is Cauchy. So for N > M.
N
X
n=M+1
kv
n
k
2
= ks
N
s
M
k
2
0 as M, N
which shows that

P
n=1
kv
n
k
2
is convergent, i.e.

P
n=1
kv
n
k
2
< .
Remark: We could use the last result to prove Item 1. Indeed, if
P
vA
kvk
2
<
, then A is countable and so we may writer A = {v
n
}

n=1
. Then s = lim
N
s
N
with s
N
as above. Since the norm kk is continuous on H, we have
ksk
2
= lim
N
ks
N
k
2
= lim
N

N
X
n=1
v
n

2
= lim
N
N
X
n=1
kv
n
k
2
=

X
n=1
kv
n
k
2
=
X
vA
kvk
2
.
Corollary 12.20. Suppose H is a Hilbert space, H is an orthonormal set and
M = span . Then
P
M
x =
X
u
hx, uiu, (12.9)
X
u
|hx, ui|
2
= kP
M
xk
2
and (12.10)
X
u
hx, uihu, yi = hP
M
x, yi (12.11)
for all x, y H.
Proof. By Bessels inequality,
P
u
|hx, ui|
2
kxk
2
for all x H and hence
by Proposition 12.18, Px :=
P
u
hx, uiu exists in H and for all x, y H,
(12.12) hPx, yi =
X
u
hhx, uiu, yi =
X
u
hx, uihu, yi.
Taking y in Eq. (12.12) gives hPx, yi = hx, yi, i.e. that hx Px, yi = 0
for all y . So (x Px) span and by continuity we also have (x Px)
M = span . Since Px is also in M, it follows from the denition of P
M
that
Px = P
M
x proving Eq. (12.9). Equations (12.10) and (12.11) now follow from
(12.12), Proposition 12.19 and the fact that hP
M
x, yi = hP
2
M
x, yi = hP
M
x, P
M
yi
for all x, y H.
12.2. Hilbert Space Basis.
Denition 12.21 (Basis). Let H be a Hilbert space. A basis of H is a maximal
orthonormal subset H.
Proposition 12.22. Every Hilbert space has an orthonormal basis.
230 BRUCE K. DRIVER

Proof. Let F be the collection of all orthonormal subsets of H ordered by


inclusion. If F is linearly ordered then is an upper bound. By Zorns
Lemma (see Theorem B.7) there exists a maximal element F.
An orthonormal set H is said to be complete if

= {0} . That is to say


if hx, ui = 0 for all u then x = 0.
Lemma 12.23. Let be an orthonormal subset of H then the following are equiv-
alent:
(1) is a basis,
(2) is complete and
(3) span = H.
Proof. If is not complete, then there exists a unit vector x

\ {0} .
The set {x} is an orthonormal set properly containing , so is not maximal.
Conversely, if is not maximal, there exists an orthonormal set
1
H such that
&
1
. Then if x
1
\ , we have hx, ui = 0 for all u showing is not
complete. This proves the equivalence of (1) and (2). If is not complete and
x

\ {0} , then span x

which is a proper subspace of H. Conversely


if span is a proper subspace of H,

= span

is a non-trivial subspace by
Corollary 12.14 and is not complete. This shows that (2) and (3) are equivalent.
Theorem 12.24. Let H be an orthonormal set. Then the following are
equivalent:
(1) is complete or equivalently a basis.
(2) x =
P
u
hx, uiu for all x H.
(3) hx, yi =
P
u
hx, ui hu, yi for all x, y H.
(4) kxk
2
=
P
u
|hx, ui|
2
for all x H.
Proof. Let M = span and P = P
M
.
(1) (2) By Corollary 12.20,
P
u
hx, uiu = P
M
x. Therefore
x
X
u
hx, uiu = x P
M
x M

= {0} .
(2) (3) is a consequence of Proposition 12.19.
(3) (4) is obvious, just take y = x.
(4) (1) If x

, then by 4), kxk = 0, i.e. x = 0. This shows that is


complete.
Proposition 12.25. A Hilbert space H is separable i H has a countable ortho-
normal basis H. Moreover, if H is separable, all orthonormal bases of H are
countable.
Proof. Let D H be a countable dense set D = {u
n
}

n=1
. By Gram-Schmidt
process there exists = {v
n
}

n=1
an orthonormal set such that span{v
n
: n =
1, 2 . . . , N} span{u
n
: n = 1, 2 . . . , N}. So if hx, v
n
i = 0 for all n then hx, u
n
i = 0
for all n. Since D H is dense we may choose {w
k
} D such that x = lim
k
w
k
and therefore hx, xi = lim
k
hx, w
k
i = 0. That is to say x = 0 and is complete.
ANALYSIS TOOLS WITH APPLICATIONS 231
Conversely if H is a countable orthonormal basis, then the countable set
D =
_
_
_
X
u
a
u
u : a
u
Q+iQ : #{u : a
u
6= 0} <
_
_
_
is dense in H.
Finally let = {u
n
}

n=1
be an orthonormal basis and
1
H be another ortho-
normal basis. Then the sets
B
n
= {v
1
: hv, u
n
i 6= 0}
are countable for each n N and hence B :=

S
n=1
B
n
is a countable subset of
1
.
Suppose there exists v
1
\ B, then hv, u
n
i = 0 for all n and since = {u
n
}

n=1
is an orthonormal basis, this implies v = 0 which is impossible since kvk = 1.
Therefore
1
\ B = and hence
1
= B is countable.
Denition 12.26. A linear map U : H K is an isometry if kUxk
K
= kxk
H
for all x H and U is unitary if U is also surjective.
Exercise 12.4. Let U : H K be a linear map, show the following are equivalent:
(1) U : H K is an isometry,
(2) hUx, Ux
0
i
K
= hx, x
0
i
H
for all x, x
0
H, (see Eq. (12.21) below)
(3) U

U = id
H
.
Exercise 12.5. Let U : H K be a linear map, show the following are equivalent:
(1) U : H K is unitary
(2) U

U = id
H
and UU

= id
K
.
(3) U is invertible and U
1
= U

.
Exercise 12.6. Let H be a Hilbert space. Use Theorem 12.24 to show there exists
a set X and a unitary map U : H
2
(X). Moreover, if H is separable and
dim(H) = , then X can be taken to be N so that H is unitarily equivalent to

2
=
2
(N).
Remark 12.27. Suppose that {u
n
}

n=1
is a total subset of H, i.e. span{u
n
} = H.
Let {v
n
}

n=1
be the vectors found by performing Gram-Schmidt on the set {u
n
}

n=1
.
Then {v
n
}

n=1
is an orthonormal basis for H.
Example 12.28. (1) Let H = L
2
([, ], dm) = L
2
((, ), dm) and
e
n
() =
1

2
e
in
for n Z. Simple computations show := {e
n
}
nZ
is an
orthonormal set. We now claim that is an orthonormal basis. To see this
recall that C
c
((, )) is dense in L
2
((, ), dm). Any f C
c
((, ))
may be extended to be a continuous 2 periodic function on R and hence
by Exercise 11.9), f may uniformly (and hence in L
2
) be approximated by
a trigonometric polynomial. Therefore is a total orthonormal set, i.e.
is an orthonormal basis. The expansion of f in this basis is the well known
Fourier series expansion of f.
(2) Let H = L
2
([1, 1], dm) and A := {1, x, x
2
, x
3
. . . }. Then A is total in
H by the Stone-Weierstrass theorem and a similar argument as in the rst
example or directly from Exercise 11.12. The result of doing Gram-Schmidt
on this set gives an orthonormal basis of H consisting of the Legendre
Polynomials.
232 BRUCE K. DRIVER

(3) Let H = L
2
(R, e

1
2
x
2
dx).Exercise 11.12 implies A := {1, x, x
2
, x
3
. . . } is
total in H and the result of doing Gram-Schmidt on A now gives an ortho-
normal basis for H consisting of Hermite Polynomials.
Remark 12.29 (An Interesting Phenomena). Let H = L
2
([1, 1], dm) and B :=
{1, x
3
, x
6
, x
9
, . . . }. Then again A is total in H by the same argument as in item 2.
Example 12.28. This is true even though B is a proper subset of A. Notice that A
is an algebraic basis for the polynomials on [1, 1] while B is not! The following
computations may help relieve some of the readers anxiety. Let f L
2
([1, 1], dm),
then, making the change of variables x = y
1/3
, shows that
(12.13)
Z
1
1
|f(x)|
2
dx =
Z
1
1

f(y
1/3
)

2
1
3
y
2/3
dy =
Z
1
1

f(y
1/3
)

2
d(y)
where d(y) =
1
3
y
2/3
dy. Since ([1, 1]) = m([1, 1]) = 2, is a nite mea-
sure on [1, 1] and hence by Exercise 11.12 A := {1, x, x
2
, x
3
. . . } is a total in
L
2
([1, 1], d). In particular for any > 0 there exists a polynomial p(y) such that
Z
1
1

f(y
1/3
) p(y)

2
d(y) <
2
.
However, by Eq. (12.13) we have

2
>
Z
1
1

f(y
1/3
) p(y)

2
d(y) =
Z
1
1

f(x) p(x
3
)

2
dx.
Alternatively, if f C([1, 1]), then g(y) = f(y
1/3
) is back in C([1, 1]). There-
fore for any > 0, there exists a polynomial p(y) such that
> kg pk
u
= sup{|g(y) p(y)| : y [1, 1]}
= sup

g(x
3
) p(x
3
)

: x [1, 1]

= sup

f(x) p(x
3
)

: x [1, 1]

.
This gives another proof the polynomials in x
3
are dense in C([1, 1]) and hence
in L
2
([1, 1]).
12.3. Fourier Series Considerations. (BRUCE: This needs work and some stu
from Section 18.1 should be moved to here.) In this section we will examine item
1. of Example 12.28 in more detail. In the process we will give a direct and
constructive proof of the result in Exercise 11.9.
For C, let d
n
() :=
P
n
k=n

k
. Since d
n
() d
n
() =
n+1

n
,
d
n
() :=
n
X
k=n

k
=

n+1

n
1
with the convention that

n+1

n
1
|
=1
= lim
1

n+1

n
1
= 2n + 1 =
n
X
k=n
1
k
.
Writing = e
i
, we nd
D
n
() := d
n
(e
i
) =
e
i(n+1)
e
in
e
i
1
=
e
i(n+1/2)
e
i(n+1/2)
e
i/2
e
i/2
=
sin(n +
1
2
)
sin
1
2

.
ANALYSIS TOOLS WITH APPLICATIONS 233
Denition 12.30. The function
(12.14) D
n
() :=
sin(n +
1
2
)
sin
1
2

=
n
X
k=n
e
ik
is called the Dirichlet kernel.
By the L
2
theory of the Fourier series (or other methods) one may shows that
D
n

0
as n when acting on smooth periodic functions of . However this
kernel is not positive. In order to get a positive approximate function sequence,
we might try squaring D
n
to nd
D
2
n
() =
sin
2
(n +
1
2
)
sin
2 1
2

=
"
n
X
k=n

k
#
2
=
n
X
k,l=n

l
=
n
X
k,l=n

k+l
=
2n
X
m=2n
n
X
k,l=n
1
k+l=m,k,l[n,n]

m
=
2n
X
m=2n
n
X
k=n
1
|mk|n

m
=
2n
X
m=2n
[n + 1 +n |m|]
m
=
2n
X
m=2n
[2n + 1 |m|]
m
=
2n
X
m=2n
[2n + 1 |m|] e
im
.
In particular this implies
(12.15)
1
2n + 1
sin
2
(n +
1
2
)
sin
2 1
2

=
2n
X
m=2n

1
|m|
2n + 1

e
im
.
We will show in Lemma 12.32 below that Eq. (12.15) is valid for n
1
2
N.
Denition 12.31. The function
(12.16) K
n
() :=
1
n + 1
sin
2
(
n+1
2
)
sin
2 1
2

is called the Fejr kernel.


Lemma 12.32. The Fejr kernel K
n
satises:
(1)
(12.17) K
n
() :=
n
X
m=n

1
|m|
n + 1

e
im
.
(2) K
n
() 0.
(3)
1
2
R

K
n
()d = 1
(4) sup
||
K
n
() 0 as n for all > 0, see Figure 12.3
(5) For any continuous 2 periodic function f on R,
K
n
f() =
1
2
Z

K
n
( )f()d
=
n
X
m=n

1
|m|
n + 1

1
2
Z

e
im
f()d

e
im
(12.18)
and K
n
f() f() uniformly in as n .
234 BRUCE K. DRIVER

2.5 1.25 0 -1.25 -2.5


12.5
10
7.5
5
2.5
0
x
y
x
y
Plots of K
n
() for n = 2, 7 and 13.
Proof. 1. Using
sin
2
1
2
=

e
i/2
e
i/2
2i

2
=
2 +e
i
e
i
4
=
2 e
i
e
i
4
we nd
4 (n + 1) sin
2
1
2

n
X
m=n

1
|m|
n + 1

e
im
=

2 e
i
e
i

X
1
|m|n
[n + 1 |m|] e
im
=
X

21
|m|n
[n + 1 |m|] 1
|m1|n
[n + 1 |m1|]
1
|m+1|n
[n + 1 |m+ 1|]

e
im
=
X
m{0,n1,n+1}

21
|m|n
[n + 1 |m|] 1
|m1|n
[n + 1 |m1|]
1
|m+1|n
[n + 1 |m+ 1|]

e
im
= 2 e
i(n+1)
e
i(n+1)
= 4 sin
2
(
n + 1
2
)
which veries item 1.
2.- 4. Clearly K
n
() 0 being the square of a function and item 3. follows by
integrating the formula in Eq. (12.17). Item 4. is elementary to check and is clearly
indicated in Figure 12.3.
5. Items 2-4 show that K
n
() has the classic properties of an approximate
function when acting on 2 periodic functions. Hence it is standard that
K
n
f() f() uniformly in as n . Eq. (12.18) is a consequence of the
simple computation,
K
n
f() =
1
2
Z

K
n
( )f()d
=
n
X
m=n

1
|m|
n + 1

1
2
Z

e
im
f()d

e
im
.
12.4. Weak Convergence. Suppose H is an innite dimensional Hilbert space
and {x
n
}

n=1
is an orthonormal subset of H. Then, by Eq. (12.1), kx
n
x
m
k
2
= 2
ANALYSIS TOOLS WITH APPLICATIONS 235
for all m 6= n and in particular, {x
n
}

n=1
has no convergent subsequences. From
this we conclude that C := {x H : kxk 1} , the closed unit ball in H, is not
compact. To overcome this problems it is sometimes useful to introduce a weaker
topology on X having the property that C is compact.
Denition 12.33. Let (X, kk) be a Banach space and X

be its continuous dual.


The weak topology,
w
, on X is the topology generated by X

. If {x
n
}

n=1
X
is a sequence we will write x
n
w
x as n to mean that x
n
x in the weak
topology.
Because
w
= (X

)
kk
:= ({kx k : x X} , it is harder for a function
f : X F to be continuous in the
w
topology than in the norm topology,
kk
.
In particular if : X F is a linear functional which is
w
continuous, then is

kk
continuous and hence X

.
Proposition 12.34. Let {x
n
}

n=1
X be a sequence, then x
n
w
x X as n
i (x) = lim
n
(x
n
) for all X

.
Proof. By denition of
w
, we have x
n
w
x X i for all X

and > 0
there exists an N N such that |(x) (x
n
)| < for all n N and .
This later condition is easily seen to be equivalent to (x) = lim
n
(x
n
) for all
X

.
The topological space (X,
w
) is still Hausdor, however to prove this one needs
to make use of the Hahn Banach Theorem 18.16 below. For the moment we will
concentrate on the special case where X = H is a Hilbert space in which case
H

= {
z
:= h, zi : z H} , see Propositions 12.15. If x, y H and z := y x 6= 0,
then
0 < := kzk
2
=
z
(z) =
z
(y)
z
(x).
Thus V
x
:= {w H : |
z
(x)
z
(w)| < /2} and V
y
:= {w H : |
z
(y)
z
(w)| < /2}
are disjoint sets from
w
which contain x and y respectively. This shows that (H,
w
)
is a Hausdor space. In particular, this shows that weak limits are unique if they
exist.
Remark 12.35. Suppose that H is an innite dimensional Hilbert space {x
n
}

n=1
is
an orthonormal subset of H. Then Bessels inequality (Proposition 12.18) implies
x
n
w
0 H as n . This points out the fact that if x
n
w
x H as n , it
is no longer necessarily true that kxk = lim
n
kx
n
k . However we do always have
kxk liminf
n
kx
n
k because,
kxk
2
= lim
n
hx
n
, xi liminf
n
[kx
n
k kxk] = kxk liminf
n
kx
n
k .
Proposition 12.36. Let H be a Hilbert space, H be an orthonormal basis for
H and {x
n
}

n=1
H be a bounded sequence, then the following are equivalent:
(1) x
n
w
x H as n .
(2) hx, yi = lim
n
hx
n
, yi for all y H.
(3) hx, yi = lim
n
hx
n
, yi for all y .
Moreover, if c
y
:= lim
n
hx
n
, yi exists for all y , then
P
y
|c
y
|
2
< and
x
n
w
x :=
P
y
c
y
y H as n .
Proof. 1. = 2. This is a consequence of Propositions 12.15 and 12.34. 2. =
3. is trivial.
236 BRUCE K. DRIVER

3. = 1. Let M := sup
n
kx
n
k and H
0
denote the algebraic span of . Then for
y H and z H
0
,
|hx x
n
, yi| |hx x
n
, zi| +|hx x
n
, y zi| |hx x
n
, zi| + 2M ky zk .
Passing to the limit in this equation implies limsup
n
|hx x
n
, yi| 2M ky zk
which shows limsup
n
|hx x
n
, yi| = 0 since H
0
is dense in H.
To prove the last assertion, let . Then by Bessels inequality (Proposition
12.18),
X
y
|c
y
|
2
= lim
n
X
y
|hx
n
, yi|
2
liminf
n
kx
n
k
2
M
2
.
Since was arbitrary, we conclude that
P
y
|c
y
|
2
M < and hence we
may dene x :=
P
y
c
y
y. By construction we have
hx, yi = c
y
= lim
n
hx
n
, yi for all y
and hence x
n
w
x H as n by what we have just proved.
Theorem 12.37. Suppose that {x
n
}

n=1
H is a bounded sequence. Then there
exists a subsequence y
k
:= x
n
k
of {x
n
}

n=1
and x X such that y
k
w
x as k .
Proof. This is a consequence of Proposition 12.36 and a Cantors diagonalization
argument which is left to the reader, see Exercise 12.14.
Theorem 12.38 (Alaoglus Theorem for Hilbert Spaces). Suppose that H is a
separable Hilbert space, C := {x H : kxk 1} is the closed unit ball in H and
{e
n
}

n=1
is an orthonormal basis for H. Then
(12.19) (x, y) :=

X
n=1
1
2
n
|hx y, e
n
i|
denes a metric on C which is compatible with the weak topology on C,
C
:=
(
w
)
C
= {V C : V
w
} . Moreover (C, ) is a compact metric space.
Proof. The routine check that is a metric is left to the reader. Let

be
the topology on C induced by . For any y H and n N, the map x H
hx y, e
n
i = hx, e
n
i hy, e
n
i is
w
continuous and since the sum in Eq. (12.19) is
uniformly convergent for x, y C, it follows that x (x, y) is
C
continuous.
This implies the open balls relative to are contained in
C
and therefore

C
. For the converse inclusion, let z H, x
z
(x) = hz, xi be an element of
H

, and for N N let z


N
:=
P
N
n=1
hz, e
n
ie
n
. Then
z
N
=
P
N
n=1
hz, e
n
i
e
n
is
continuous, being a nite linear combination of the
e
n
which are easily seen to be
continuous. Because z
N
z as N it follows that
sup
xC
|
z
(x)
z
N
(x)| = kz z
N
k 0 as N .
Therefore
z
|
C
is continuous as well and hence
C
= (
z
|
C
: z H)

.
The last assertion follows directly from Theorem 12.37 and the fact that sequen-
tial compactness is equivalent to compactness for metric spaces.
Theorem 12.39 (Weak and Strong Dierentiability). Suppose that f L
2
(R
n
)
and v R
n
\ {0} . Then the following are equivalent:
ANALYSIS TOOLS WITH APPLICATIONS 237
(1) There exists {t
n
}

n=1
R\ {0} such that lim
n
t
n
= 0 and
sup
n

f( +t
n
v) f()
t
n

2
< .
(2) There exists g L
2
(R
n
) such that hf,
v
i = hg, i for all C

c
(R
n
).
(3) There exists g L
2
(R
n
) and f
n
C

c
(R
n
) such that f
n
L
2
f and
v
f
n
L
2
g
as n .
(4) There exists g L
2
such that
f( +tv) f()
t
L
2
g as t 0.
(See Theorem 19.18 for the L
p
generalization of this theorem.)
Proof. 1. = 2. We may assume, using Theorem 12.37 and passing to a
subsequence if necessary, that
f(+t
n
v)f()
t
n
w
g for some g L
2
(R
n
). Now for
C

c
(R
n
),
hg, i = lim
n
h
f( +t
n
v) f()
t
n
, i = lim
n
hf,
( t
n
v) ()
t
n
i
= hf, lim
n
( t
n
v) ()
t
n
i = hf,
v
i,
wherein we have used the translation invariance of Lebesgue measure and the dom-
inated convergence theorem.
2. = 3. Let C

c
(R
n
, R) such that
R
R
n
(x)dx = 1 and let
m
(x) =
m
n
(mx), then by Proposition 11.24, h
m
:=
m
f C

(R
n
) for all m and

v
h
m
(x) =
v

m
f(x) =
Z
R
n

v

m
(x y)f(y)dy = hf,
v
[
m
(x )]i
= hg,
m
(x )i =
m
g(x).
By Theorem 11.21, h
m
f L
2
(R
n
) and
v
h
m
=
m
g g in L
2
(R
n
) as
m . This shows 3. holds except for the fact that h
m
need not have compact
support. To x this let C

c
(R
n
, [0, 1]) such that = 1 in a neighborhood of 0
and let

(x) = (x) and (


v
)

(x) := (
v
) (x). Then

v
(

h
m
) =
v

h
m
+

v
h
m
= (
v
)

h
m
+

v
h
m
so that

h
m
h
m
in L
2
and
v
(

h
m
)
v
h
m
in L
2
as 0. Let f
m
=

m
h
m
where
m
is chosen to be greater than zero but small enough so that
k

m
h
m
h
m
k
2
+k
v
(

m
h
m
)
v
h
m
k
2
< 1/m.
Then f
m
C

c
(R
n
), f
m
f and
v
f
m
g in L
2
as m .
3. = 4. By the fundamental theorem of calculus

tv
f
m
(x) f
m
(x)
t
=
f
m
(x +tv) f
m
(x)
t
=
1
t
Z
1
0
d
ds
f
m
(x +stv)ds =
Z
1
0
(
v
f
m
) (x +stv)ds. (12.20)
Let
G
t
(x) :=
Z
1
0

stv
g(x)ds =
Z
1
0
g(x +stv)ds
238 BRUCE K. DRIVER

which is dened for almost every x and is in L


2
(R
n
) by Minkowskis inequality for
integrals, Theorem 9.27. Therefore

tv
f
m
(x) f
m
(x)
t
G
t
(x) =
Z
1
0
[(
v
f
m
) (x +stv) g(x +stv)] ds
and hence again by Minkowskis inequality for integrals,

tv
f
m
f
m
t
G
t

Z
1
0
k
stv
(
v
f
m
)
stv
gk
2
ds =
Z
1
0
k
v
f
m
gk
2
ds.
Letting m in this equation implies (
tv
f f) /t = G
t
a.e. Finally one more
application of Minkowskis inequality for integrals implies,

tv
f f
t
g

2
= kG
t
gk
2
=

Z
1
0
(
stv
g g) ds

Z
1
0
k
stv
g gk
2
ds.
By the dominated convergence theorem and Proposition 11.13, the latter term tends
to 0 as t 0 and this proves 4. The proof is now complete since 4. =1. is trivial.
12.5. Supplement 1: Converse of the Parallelogram Law.
Proposition 12.40 (Parallelogram Law Converse). If (X, kk) is a normed space
such that Eq. (12.2) holds for all x, y X, then there exists a unique inner product
on h, i such that kxk :=
p
hx, xi for all x X. In this case we say that kk is a
Hilbertian norm.
Proof. If kk is going to come from an inner product h, i, it follows from Eq.
(12.1) that
2Rehx, yi = kx +yk
2
kxk
2
kyk
2
and
2Rehx, yi = kx yk
2
kxk
2
kyk
2
.
Subtracting these two equations gives the polarization identity,
4Rehx, yi = kx +yk
2
kx yk
2
.
Replacing y by iy in this equation then implies that
4Imhx, yi = kx +iyk
2
kx iyk
2
from which we nd
(12.21) hx, yi =
1
4
X
G
kx +yk
2
where G = {1, i} a cyclic subgroup of S
1
C. Hence if h, i is going to exists
we must dene it by Eq. (12.21).
Notice that
hx, xi =
1
4
X
G
kx +xk
2
= kxk
2
+ikx +ixk
2
ikx ixk
2
= kxk
2
+i

1 +i|
2

kxk
2
i

1 i|
2

kxk
2
= kxk
2
.
ANALYSIS TOOLS WITH APPLICATIONS 239
So to nish the proof of (4) we must show that hx, yi in Eq. (12.21) is an inner
product. Since
4hy, xi =
X
G
ky +xk
2
=
X
G
k (y +x) k
2
=
X
G
ky +
2
xk
2
= ky +xk
2
+k y +xk
2
+ikiy xk
2
ik iy xk
2
= kx +yk
2
+kx yk
2
+ikx iyk
2
ikx +iyk
2
= 4hx, yi
it suces to show x hx, yi is linear for all y H. (The rest of this proof may
safely be skipped by the reader.) For this we will need to derive an identity from
Eq. (12.2). To do this we make use of Eq. (12.2) three times to nd
kx +y +zk
2
= kx +y zk
2
+ 2kx +yk
2
+ 2kzk
2
= kx y zk
2
2kx zk
2
2kyk
2
+ 2kx +yk
2
+ 2kzk
2
= ky +z xk
2
2kx zk
2
2kyk
2
+ 2kx +yk
2
+ 2kzk
2
= ky +z +xk
2
+ 2ky +zk
2
+ 2kxk
2
2kx zk
2
2kyk
2
+ 2kx +yk
2
+ 2kzk
2
.
Solving this equation for kx +y +zk
2
gives
(12.22) kx +y +zk
2
= ky +zk
2
+kx +yk
2
kx zk
2
+kxk
2
+kzk
2
kyk
2
.
Using Eq. (12.22), for x, y, z H,
4 Rehx +z, yi = kx +z +yk
2
kx +z yk
2
= ky +zk
2
+kx +yk
2
kx zk
2
+kxk
2
+kzk
2
kyk
2

kz yk
2
+kx yk
2
kx zk
2
+kxk
2
+kzk
2
kyk
2

= kz +yk
2
kz yk
2
+kx +yk
2
kx yk
2
= 4 Rehx, yi + 4 Rehz, yi. (12.23)
Now suppose that G, then since || = 1,
4hx, yi =
1
4
X
G
kx +yk
2
=
1
4
X
G
kx +
1
yk
2
=
1
4
X
G
kx +yk
2
= 4hx, yi (12.24)
where in the third inequality, the substitution was made in the sum. So Eq.
(12.24) says hix, yi = ihix, yi and hx, yi = hx, yi. Therefore
Imhx, yi = Re (ihx, yi) = Rehix, yi
which combined with Eq. (12.23) shows
Imhx +z, yi = Rehix iz, yi = Rehix, yi + Rehiz, yi
= Imhx, yi + Imhz, yi
and therefore (again in combination with Eq. (12.23)),
hx +z, yi = hx, yi +hz, yi for all x, y H.
240 BRUCE K. DRIVER

Because of this equation and Eq. (12.24) to nish the proof that x hx, yi is
linear, it suces to show hx, yi = hx, yi for all > 0. Now if = m N, then
hmx, yi = hx + (m1)x, yi = hx, yi +h(m1)x, yi
so that by induction hmx, yi = mhx, yi. Replacing x by x/m then shows that
hx, yi = mhm
1
x, yi so that hm
1
x, yi = m
1
hx, yi and so if m, n N, we nd
h
n
m
x, yi = nh
1
m
x, yi =
n
m
hx, yi
so that hx, yi = hx, yi for all > 0 and Q. By continuity, it now follows that
hx, yi = hx, yi for all > 0.
12.6. Supplement 2. Non-complete inner product spaces. Part of Theorem
12.24 goes through when H is a not necessarily complete inner product space. We
have the following proposition.
Proposition 12.41. Let (H, h, i) be a not necessarily complete inner product space
and H be an orthonormal set. Then the following two conditions are equivalent:
(1) x =
P
u
hx, uiu for all x H.
(2) kxk
2
=
P
u
|hx, ui|
2
for all x H.
Moreover, either of these two conditions implies that H is a maximal ortho-
normal set. However H being a maximal orthonormal set is not sucient to
conditions for 1) and 2) hold!
Proof. As in the proof of Theorem 12.24, 1) implies 2). For 2) implies 1) let
and consider

x
X
u
hx, uiu

2
= kxk
2
2
X
u
|hx, ui|
2
+
X
u
|hx, ui|
2
= kxk
2

X
u
|hx, ui|
2
.
Since kxk
2
=
P
u
|hx, ui|
2
, it follows that for every > 0 there exists

such
that for all such that

x
X
u
hx, uiu

2
= kxk
2

X
u
|hx, ui|
2
<
showing that x =
P
u
hx, uiu.
Suppose x = (x
1
, x
2
, . . . , x
n
, . . . )

. If 2) is valid then kxk


2
= 0, i.e. x = 0. So
is maximal. Let us now construct a counter example to prove the last assertion.
Take H = Span{e
i
}

i=1

2
and let u
n
= e
1
(n+1)e
n+1
for n = 1, 2 . . . . Apply-
ing Gramn-Schmidt to { u
n
}

n=1
we construct an orthonormal set = {u
n
}

n=1
H.
I now claim that H is maximal. Indeed if x = (x
1
, x
2
, . . . , x
n
, . . . )

then
x u
n
for all n, i.e.
0 = (x, u
n
) = x
1
(n + 1)x
n+1
.
Therefore x
n+1
= (n + 1)
1
x
1
for all n. Since x Span{e
i
}

i=1
, x
N
= 0 for some
N suciently large and therefore x
1
= 0 which in turn implies that x
n
= 0 for all
ANALYSIS TOOLS WITH APPLICATIONS 241
n. So x = 0 and hence is maximal in H. On the other hand, is not maximal
in
2
. In fact the above argument shows that

in
2
is given by the span of v =
(1,
1
2
,
1
3
,
1
4
,
1
5
, . . . ). Let P be the orthogonal projection of
2
onto the Span() = v

.
Then

X
i=1
hx, u
n
iu
n
= Px = x
hx, vi
kvk
2
v,
so that

P
i=1
hx, u
n
iu
n
= x i x Span() = v


2
. For example if x =
(1, 0, 0, . . . ) H (or more generally for x = e
i
for any i), x / v

and hence

P
i=1
hx, u
n
iu
n
6= x.
12.7. Supplement 3: Conditional Expectation. In this section let (, F, P)
be a probability space, i.e. (, F, P) is a measure space and P() = 1. Let G F
be a sub sigma algebra of F and write f G
b
if f : C is bounded and f is
(G, B
C
) measurable. In this section we will write
Ef :=
Z

fdP.
Denition 12.42 (Conditional Expectation). Let E
G
: L
2
(, F, P) L
2
(, G, P)
denote orthogonal projection of L
2
(, F, P) onto the closed subspace L
2
(, G, P).
For f L
2
(, G, P), we say that E
G
f L
2
(, F, P) is the conditional expecta-
tion of f.
Theorem 12.43. Let (, F, P) and G F be as above and f, g L
2
(, F, P).
(1) If f 0, P a.e. then E
G
f 0, P a.e.
(2) If f g, P a.e. there E
G
f E
G
g, P a.e.
(3) |E
G
f| E
G
|f|, P a.e.
(4) kE
G
fk
L
1 kfk
L
1 for all f L
2
. So by the B.L.T. Theorem 4.1, E
G
extends
uniquely to a bounded linear map from L
1
(, F, P) to L
1
(, G, P) which we
will still denote by E
G
.
(5) If f L
1
(, F, P) then F = E
G
f L
1
(, G, P) i
E(Fh) = E(fh) for all h G
b
.
(6) If g G
b
and f L
1
(, F, P), then E
G
(gf) = g E
G
f, P a.e.
Proof. By the denition of orthogonal projection for h G
b
,
E(fh) = E(f E
G
h) = E(E
G
f h).
So if f, h 0 then 0 E(fh) E(E
G
f h) and since this holds for all h 0 in G
b
,
E
G
f 0, P a.e. This proves (1). Item (2) follows by applying item (1). to f g.
If f is real, f |f| and so by Item (2), E
G
f E
G
|f|, i.e. |E
G
f| E
G
|f|, P
a.e. For complex f, let h 0 be a bounded and G measurable function. Then
E[|E
G
f| h] = E
h
E
G
f sgn(E
G
f)h
i
= E
h
f sgn(E
G
f)h
i
E[|f| h] = E[E
G
|f| h] .
Since h is arbitrary, it follows that |E
G
f| E
G
|f| , P a.e. Integrating this
inequality implies
kE
G
fk
L
1 E|E
G
f| E[E
G
|f| 1] = E[|f|] = kfk
L
1.
242 BRUCE K. DRIVER

Item (5). Suppose f L


1
(, F, P) and h G
b
. Let f
n
L
2
(, F, P) be a
sequence of functions such that f
n
f in L
1
(, F, P). Then
E(E
G
f h) = E( lim
n
E
G
f
n
h) = lim
n
E(E
G
f
n
h)
= lim
n
E(f
n
h) = E(f h). (12.25)
This equation uniquely determines E
G
, for if F L
1
(, G, P) also satises E(Fh) =
E(f h) for all h G
b
, then taking h = sgn(F E
G
f) in Eq. (12.25) gives
0 = E((F E
G
f) h) = E(|F E
G
f|).
This shows F = E
G
f, P a.e. Item (6) is now an easy consequence of this charac-
terization, since if h G
b
,
E[(gE
G
f) h] = E[E
G
f hg] = E[f hg] = E[gf h] = E[E
G
(gf) h] .
Thus E
G
(gf) = g E
G
f, P a.e.
Proposition 12.44. If G
0
G
1
F. Then
(12.26) E
G
0
E
G
1
= E
G
1
E
G
0
= E
G
0
.
Proof. Equation (12.26) holds on L
2
(, F, P) by the basic properties of or-
thogonal projections. It then hold on L
1
(, F, P) by continuity and the density of
L
2
(, F, P) in L
1
(, F, P).
Example 12.45. Suppose that (X, M, ) and (Y, N, ) are two nite measure
spaces. Let = X Y, F = M N and P(dx, dy) = (x, y)(dx)(dy) where
L
1
(, F, ) is a positive function such that
R
XY
d ( ) = 1. Let

X
: X be the projection map,
X
(x, y) = x, and
G := (
X
) =
1
X
(M) = {AY : A M} .
Then f : R is G measurable i f = F
X
for some function F : X R
which is N measurable, see Lemma 6.62. For f L
1
(, F, P), we will now show
E
G
f = F
X
where
F(x) =
1
(x)
1
(0,)
( (x))
Z
Y
f(x, y)(x, y)(dy),
(x) :=
R
Y
(x, y)(dy). (By convention,
R
Y
f(x, y)(x, y)(dy) := 0 if
R
Y
|f(x, y)| (x, y)(dy) =
.)
By Tonellis theorem, the set
E := {x X : (x) = }

x X :
Z
Y
|f(x, y)| (x, y)(dy) =

is a null set. Since


E[|F
X
|] =
Z
X
d(x)
Z
Y
d(y) |F(x)| (x, y) =
Z
X
d(x) |F(x)| (x)
=
Z
X
d(x)

Z
Y
(dy)f(x, y)(x, y)

Z
X
d(x)
Z
Y
(dy) |f(x, y)| (x, y) < ,
ANALYSIS TOOLS WITH APPLICATIONS 243
F
X
L
1
(, G, P). Let h = H
X
be a bounded G measurable function, then
E[F
X
h] =
Z
X
d(x)
Z
Y
d(y)F(x)H(x)(x, y)
=
Z
X
d(x)F(x)H(x) (x)
=
Z
X
d(x)H(x)
Z
Y
(dy)f(x, y)(x, y)
= E[hf]
and hence E
G
f = F
X
as claimed.
This example shows that conditional expectation is a generalization of the notion
of performing integration over a partial subset of the variables in the integrand.
Whereas to compute the expectation, one should integrate over all of the variables.
See also Exercise 12.8 to gain more intuition about conditional expectations.
Theorem 12.46 (Jensen's inequality). Let (Ω,F,P) be a probability space and φ : R → R be a convex function. Assume f ∈ L¹(Ω,F,P;R) is a function such that (for simplicity) φ(f) ∈ L¹(Ω,F,P;R), then φ(E_G f) ≤ E_G[φ(f)], P-a.e.
Proof. Let us first assume that φ is C¹ and f is bounded. In this case
(12.27)  φ(x) − φ(x_0) ≥ φ′(x_0)(x − x_0) for all x_0, x ∈ R.
Taking x_0 = E_G f and x = f in this inequality implies
φ(f) − φ(E_G f) ≥ φ′(E_G f)(f − E_G f)
and then applying E_G to this inequality gives
E_G[φ(f)] − φ(E_G f) = E_G[φ(f) − φ(E_G f)] ≥ E_G[φ′(E_G f)(f − E_G f)] = φ′(E_G f)(E_G f − E_G E_G f) = 0.
The same proof works for general φ, one need only use Proposition 9.7 to replace Eq. (12.27) by
φ(x) − φ(x_0) ≥ φ′_−(x_0)(x − x_0) for all x_0, x ∈ R
where φ′_−(x_0) is the left hand derivative of φ at x_0.
If f is not bounded, apply what we have just proved to f^M = f 1_{|f|≤M}, to find
(12.28)  E_G[φ(f^M)] ≥ φ(E_G f^M).
Since E_G : L¹(Ω,F,P;R) → L¹(Ω,F,P;R) is a bounded operator and f^M → f and φ(f^M) → φ(f) in L¹(Ω,F,P;R) as M → ∞, there exists {M_k}_{k=1}^∞ such that M_k ↑ ∞ and f^{M_k} → f and φ(f^{M_k}) → φ(f), P-a.e. So passing to the limit in Eq. (12.28) shows E_G[φ(f)] ≥ φ(E_G f), P-a.e.
12.8. Exercises.
Exercise 12.7. Let (X,M,μ) be a measure space and H := L²(X,M,μ). Given f ∈ L^∞(μ) let M_f : H → H be the multiplication operator defined by M_f g = fg. Show M_f² = M_f iff there exists A ∈ M such that f = 1_A a.e.
Exercise 12.8. Suppose (Ω,F,P) is a probability space and A := {A_i}_{i=1}^∞ ⊂ F is a partition of Ω. (Recall this means Ω = ∐_{i=1}^∞ A_i.) Let G be the σ-algebra generated by A. Show:
(1) B ∈ G iff B = ∪_{i∈Λ} A_i for some Λ ⊂ N.
(2) g : Ω → R is G-measurable iff g = Σ_{i=1}^∞ λ_i 1_{A_i} for some λ_i ∈ R.
(3) For f ∈ L¹(Ω,F,P), let E(f|A_i) := E[1_{A_i} f]/P(A_i) if P(A_i) ≠ 0 and E(f|A_i) = 0 otherwise. Show
E_G f = Σ_{i=1}^∞ E(f|A_i) 1_{A_i}.
Exercise 12.9. Folland 5.60 on p. 177.
Exercise 12.10. Folland 5.61 on p. 178 about orthonormal basis on product spaces.
Exercise 12.11. Folland 5.67 on p. 178 regarding the mean ergodic theorem.
Exercise 12.12 (Haar Basis). In this problem, let L² denote L²([0,1],m) with the standard inner product,
ψ(x) = 1_{[0,1/2)}(x) − 1_{[1/2,1)}(x)
and for k, j ∈ N_0 := N ∪ {0} with 0 ≤ j < 2^k let
ψ_{kj}(x) := 2^{k/2} ψ(2^k x − j).
The following pictures show the graphs of ψ_{0,0}, ψ_{1,0}, ψ_{1,1}, ψ_{2,0}, ψ_{2,1}, ψ_{2,2} and ψ_{2,3} respectively.
[Figure: plots of ψ_{0,0}, ψ_{1,0}, ψ_{1,1}, ψ_{2,0}, ψ_{2,1}, ψ_{2,2} and ψ_{2,3} on [0,1].]
(1) Show β := {1} ∪ {ψ_{kj} : 0 ≤ k and 0 ≤ j < 2^k} is an orthonormal set, where 1 denotes the constant function 1.
(2) For n ∈ N, let M_n := span({1} ∪ {ψ_{kj} : 0 ≤ k < n and 0 ≤ j < 2^k}). Show
M_n = span({1_{[j2^{−n},(j+1)2^{−n})} : 0 ≤ j < 2^n}).
(3) Show ∪_{n=1}^∞ M_n is a dense subspace of L² and therefore β is an orthonormal basis for L². Hint: see Theorem 11.3.
(4) For f ∈ L², let
H_n f := ⟨f,1⟩1 + Σ_{k=0}^{n−1} Σ_{j=0}^{2^k−1} ⟨f,ψ_{kj}⟩ψ_{kj}.
Show (compare with Exercise 12.8)
H_n f = Σ_{j=0}^{2^n−1} (2^n ∫_{j2^{−n}}^{(j+1)2^{−n}} f(x)dx) 1_{[j2^{−n},(j+1)2^{−n})}
and use this to show ‖f − H_n f‖_u → 0 as n → ∞ for all f ∈ C([0,1]).
Exercise 12.13. Let O(n) be the orthogonal group consisting of n × n real orthogonal matrices O, i.e. O^{tr}O = I. For O ∈ O(n) and f ∈ L²(R^n) let U_O f(x) = f(O^{−1}x). Show
(1) U_O f is well defined, namely if f = g a.e. then U_O f = U_O g a.e.
(2) U_O : L²(R^n) → L²(R^n) is unitary and satisfies U_{O_1}U_{O_2} = U_{O_1O_2} for all O_1, O_2 ∈ O(n). That is to say the map O ∈ O(n) → U(L²(R^n)) — the unitary operators on L²(R^n) — is a group homomorphism, i.e. a unitary representation of O(n).
(3) For each f ∈ L²(R^n), the map O ∈ O(n) → U_O f ∈ L²(R^n) is continuous. Take the topology on O(n) to be that inherited from the Euclidean topology on the vector space of all n × n matrices. Hint: see the proof of Proposition 11.13.
Exercise 12.14. Prove Theorem 12.37. Hint: Let H_0 := span{x_n : n ∈ N}, a separable Hilbert subspace of H. Let {λ_m}_{m=1}^∞ ⊂ H_0 be an orthonormal basis and use Cantor's diagonalization argument to find a subsequence y_k := x_{n_k} such that c_m := lim_{k→∞}⟨y_k, λ_m⟩ exists for all m ∈ N. Finish the proof by appealing to Proposition 12.36.
Exercise 12.15. Suppose that {x_n}_{n=1}^∞ ⊂ H and x_n → x ∈ H weakly as n → ∞. Show x_n → x as n → ∞ (i.e. lim_{n→∞}‖x − x_n‖ = 0) iff lim_{n→∞}‖x_n‖ = ‖x‖.
Exercise 12.16. Show the vector space operations of X are continuous in the weak topology. More explicitly show
(1) (x,y) ∈ X × X → x + y ∈ X is (τ_w ⊗ τ_w, τ_w)-continuous and
(2) (λ,x) ∈ F × X → λx ∈ X is (τ_F ⊗ τ_w, τ_w)-continuous.
Exercise 12.17. Euclidean group representation and its infinitesimal generators including momentum and angular momentum operators.
Exercise 12.18. Spherical Harmonics.
Exercise 12.19. The gradient and the Laplacian in spherical coordinates.
Exercise 12.20. Legendre polynomials.
Exercise 12.21. In this problem you are asked to show there is no reasonable notion of Lebesgue measure on an infinite dimensional Hilbert space. To be more precise, suppose H is an infinite dimensional Hilbert space and m is a countably additive measure on B_H which is invariant under translations and satisfies m(B_0(ε)) > 0 for all ε > 0. Show m(V) = ∞ for all non-empty open subsets V ⊂ H.
12.9. Fourier Series Exercises.
Notation 12.47. Let C^k_per(R^d) denote the 2π-periodic functions in C^k(R^d),
C^k_per(R^d) := {f ∈ C^k(R^d) : f(x + 2πe_i) = f(x) for all x ∈ R^d and i = 1,2,...,d}.
Also let ⟨·,·⟩ denote the inner product on the Hilbert space H := L²([−π,π]^d) given by
⟨f,g⟩ := (1/(2π))^d ∫_{[−π,π]^d} f(x) ḡ(x) dx.
Recall that {χ_k(x) := e^{ik·x} : k ∈ Z^d} is an orthonormal basis for H; in particular for f ∈ H,
(12.29)  f = Σ_{k∈Z^d} ⟨f,χ_k⟩ χ_k
where the convergence takes place in L²([−π,π]^d). For f ∈ L¹([−π,π]^d), we will write f̂(k) for the Fourier coefficient,
(12.30)  f̂(k) := ⟨f,χ_k⟩ = (1/(2π))^d ∫_{[−π,π]^d} f(x)e^{−ik·x} dx.
Lemma 12.48. Let s > 0, then the following are equivalent,
(12.31)  Σ_{k∈Z^d} 1/(1+|k|)^s < ∞,  Σ_{k∈Z^d} 1/(1+|k|²)^{s/2} < ∞  and  s > d.
Proof. Let Q := (0,1]^d and k ∈ Z^d. For x = k + y ∈ (k + Q),
2 + |k| = 2 + |x − y| ≤ 2 + |x| + |y| ≤ 3 + |x|  and
2 + |k| = 2 + |x − y| ≥ 2 + |x| − |y| ≥ |x| + 1
and therefore for s > 0,
1/(3+|x|)^s ≤ 1/(2+|k|)^s ≤ 1/(1+|x|)^s.
Thus we have shown
1/(3+|x|)^s ≤ Σ_{k∈Z^d} (1/(2+|k|)^s) 1_{Q+k}(x) ≤ 1/(1+|x|)^s  for all x ∈ R^d.
Integrating this equation then shows
∫_{R^d} dx/(3+|x|)^s ≤ Σ_{k∈Z^d} 1/(2+|k|)^s ≤ ∫_{R^d} dx/(1+|x|)^s
from which we conclude that
(12.32)  Σ_{k∈Z^d} 1/(2+|k|)^s < ∞  iff  s > d.
Because the functions 1 + t, 2 + t, and √(1+t²) all behave like t as t → ∞, the sums in Eq. (12.31) may be compared with the one in Eq. (12.32) to finish the proof.
Exercise 12.22 (Riemann Lebesgue Lemma for Fourier Series). Show for f ∈ L¹([−π,π]^d) that f̂ ∈ c_0(Z^d), i.e. f̂ : Z^d → C and lim_{k→∞} f̂(k) = 0. Hint: If f ∈ H, this follows from Bessel's inequality. Now use a density argument.
Exercise 12.23. Suppose f ∈ L¹([−π,π]^d) is a function such that f̂ ∈ ℓ¹(Z^d) and set
g(x) := Σ_{k∈Z^d} f̂(k)e^{ik·x}  (pointwise).
(1) Show g ∈ C_per(R^d).
(2) Show g(x) = f(x) for m-a.e. x in [−π,π]^d. Hint: Show ĝ(k) = f̂(k) and then use approximation arguments to show
∫_{[−π,π]^d} f(x)h(x)dx = ∫_{[−π,π]^d} g(x)h(x)dx  for all h ∈ C([−π,π]^d).
(3) Conclude that f ∈ L¹([−π,π]^d) ∩ L^∞([−π,π]^d) and in particular f ∈ L^p([−π,π]^d) for all p ∈ [1,∞].
Exercise 12.24. Suppose m ∈ N_0, α is a multi-index such that |α| ≤ 2m and f ∈ C^{2m}_per(R^d).^{29}
(1) Using integration by parts, show
(ik)^α f̂(k) = ⟨∂^α f, χ_k⟩.
Note: This equality implies
|f̂(k)| ≤ (1/|k^α|) ‖∂^α f‖_H ≤ (1/|k^α|) ‖∂^α f‖_u.
(2) Now let Δf = Σ_{i=1}^d ∂²f/∂x_i². Working as in part 1) show
(12.33)  ⟨(1 − Δ)^m f, χ_k⟩ = (1 + |k|²)^m f̂(k).
^{29} We view C_per(R) as a subspace of H by identifying f ∈ C_per(R) with f|_{[−π,π]} ∈ H.
Remark 12.49. Suppose that m is an even integer, α is a multi-index and f ∈ C^{m+|α|}_per(R^d), then
(Σ_{k∈Z^d} |k^α| |f̂(k)|)² = (Σ_{k∈Z^d} |⟨∂^α f, χ_k⟩| (1+|k|²)^{m/2} (1+|k|²)^{−m/2})²
= (Σ_{k∈Z^d} |⟨(1−Δ)^{m/2}∂^α f, χ_k⟩| (1+|k|²)^{−m/2})²
≤ Σ_{k∈Z^d} |⟨(1−Δ)^{m/2}∂^α f, χ_k⟩|² · Σ_{k∈Z^d} (1+|k|²)^{−m}
= C_m ‖(1−Δ)^{m/2}∂^α f‖²_H
where C_m := Σ_{k∈Z^d} (1+|k|²)^{−m} < ∞ iff m > d/2. So the smoother f is the faster f̂ decays at infinity. The next problem is the converse of this assertion and hence smoothness of f corresponds to decay of f̂ at infinity and vice versa.
Exercise 12.25. Suppose s ∈ R and {c_k ∈ C : k ∈ Z^d} are coefficients such that
Σ_{k∈Z^d} |c_k|²(1+|k|²)^s < ∞.
Show if s > d/2 + m, the function f defined by
f(x) = Σ_{k∈Z^d} c_k e^{ik·x}
is in C^m_per(R^d). Hint: Work as in the above remark to show
Σ_{k∈Z^d} |c_k| |k^α| < ∞ for all |α| ≤ m.
Exercise 12.26 (Poisson Summation Formula). Let F ∈ L¹(R^d),
E := {x ∈ R^d : Σ_{k∈Z^d} |F(x + 2πk)| = ∞}
and set
F̂(k) := (2π)^{−d/2} ∫_{R^d} F(x)e^{−ik·x} dx.
Further assume F̂ ∈ ℓ¹(Z^d).
(1) Show m(E) = 0 and E + 2πk = E for all k ∈ Z^d. Hint: Compute ∫_{[−π,π]^d} Σ_{k∈Z^d} |F(x + 2πk)| dx.
(2) Let
f(x) := Σ_{k∈Z^d} F(x + 2πk) for x ∉ E and f(x) := 0 if x ∈ E.
Show f ∈ L¹([−π,π]^d) and f̂(k) = (2π)^{−d/2} F̂(k).
(3) Using item 2) and the assumptions on F, show f ∈ L¹([−π,π]^d) ∩ L^∞([−π,π]^d) and
f(x) = Σ_{k∈Z^d} f̂(k)e^{ik·x} = Σ_{k∈Z^d} (2π)^{−d/2} F̂(k)e^{ik·x} for m-a.e. x,
i.e.
(12.34)  Σ_{k∈Z^d} F(x + 2πk) = (2π)^{−d/2} Σ_{k∈Z^d} F̂(k)e^{ik·x} for m-a.e. x.
(4) Suppose we now assume that F ∈ C(R^d) and F satisfies 1) |F(x)| ≤ C(1 + |x|)^{−s} for some s > d and C < ∞ and 2) F̂ ∈ ℓ¹(Z^d), then show Eq. (12.34) holds for all x ∈ R^d and in particular
Σ_{k∈Z^d} F(2πk) = (2π)^{−d/2} Σ_{k∈Z^d} F̂(k).
For simplicity, in the remaining problems we will assume that d = 1.
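A quick numerical sanity check of Eq. (12.34) in d = 1 (my own example, not part of the exercise): for the Gaussian F(x) = e^{−x²/(2t)} one has F̂(ω) = √t e^{−tω²/2} in closed form, so both sides of the Poisson summation formula are rapidly convergent sums.

# Hedged numerical check of the Poisson summation formula (12.34) for d = 1 with
# F(x) = exp(-x^2 / (2 t)); then F_hat(k) = sqrt(t) * exp(-t k^2 / 2).
import numpy as np

t, x, K = 0.7, 1.3, 50                       # K truncates both (rapidly convergent) sums
k = np.arange(-K, K + 1)

lhs = np.sum(np.exp(-(x + 2 * np.pi * k) ** 2 / (2 * t)))
F_hat = np.sqrt(t) * np.exp(-t * k**2 / 2)
rhs = (2 * np.pi) ** (-0.5) * np.sum(F_hat * np.exp(1j * k * x)).real

print(lhs, rhs)                              # the two sides agree to machine precision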
Exercise 12.27 (Heat Equation 1.). Let (t,x) ∈ [0,∞) × R → u(t,x) be a continuous function such that u(t,·) ∈ C_per(R) for all t ≥ 0, u̇ := u_t, u_x, and u_{xx} exist and are continuous when t > 0. Further assume that u satisfies the heat equation u̇ = ½u_{xx}. Let û(t,k) := ⟨u(t,·),χ_k⟩ for k ∈ Z. Show for t > 0 and k ∈ Z that û(t,k) is differentiable in t and (d/dt)û(t,k) = −k²û(t,k)/2. Use this result to show
(12.35)  u(t,x) = Σ_{k∈Z} e^{−(t/2)k²} f̂(k)e^{ikx}
where f(x) := u(0,x) and as above
f̂(k) = ⟨f,χ_k⟩ = (1/2π) ∫_{−π}^{π} f(y)e^{−iky} dy.
Notice from Eq. (12.35) that (t,x) → u(t,x) is C^∞ for t > 0.
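Equation (12.35) translates directly into a numerical scheme: compute the Fourier coefficients of f, damp the k-th coefficient by e^{−tk²/2}, and resum. A minimal sketch (using the FFT as a stand-in for the coefficients ⟨f,χ_k⟩; all choices here are mine):

# Hedged numerical sketch of Eq. (12.35): periodic heat flow u_t = u_xx / 2 by
# damping Fourier coefficients.  The initial datum f is an arbitrary choice of mine.
import numpy as np

N = 256
x = np.linspace(-np.pi, np.pi, N, endpoint=False)
f = np.abs(x)                                 # continuous 2*pi-periodic initial data

k = np.fft.fftfreq(N, d=1.0 / N)              # integer frequencies
c = np.fft.fft(f) / N                         # discrete analogue of <f, chi_k>

def u(t):
    return np.real(np.fft.ifft(np.exp(-t * k**2 / 2) * c) * N)

for t in (0.0, 0.1, 1.0, 10.0):
    print(t, u(t).max() - u(t).min())         # the solution flattens toward the mean of f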
Exercise 12.28 (Heat Equation 2.). Let q_t(x) := (1/2π) Σ_{k∈Z} e^{−(t/2)k²} e^{ikx}. Show that Eq. (12.35) may be rewritten as
u(t,x) = ∫_{−π}^{π} q_t(x − y)f(y)dy
and
q_t(x) = Σ_{k∈Z} p_t(x + 2πk)
where p_t(x) := (1/√(2πt)) e^{−x²/(2t)}. Also show u(t,x) may be written as
u(t,x) = p_t ∗ f(x) := ∫_R p_t(x − y)f(y)dy.
Hint: To show q_t(x) = Σ_{k∈Z} p_t(x + 2πk), use the Poisson summation formula along with the Gaussian integration formula
p̂_t(ω) = (1/√(2π)) ∫_R p_t(x)e^{−iωx}dx = (1/√(2π)) e^{−(t/2)ω²}.
Exercise 12.29 (Wave Equation). Let u ∈ C²(R × R) be such that u(t,·) ∈ C_per(R) for all t ∈ R. Further assume that u solves the wave equation, u_{tt} = u_{xx}. Let f(x) := u(0,x) and g(x) := u̇(0,x). Show û(t,k) := ⟨u(t,·),χ_k⟩ for k ∈ Z is twice continuously differentiable in t and (d²/dt²)û(t,k) = −k²û(t,k). Use this result to show
(12.36)  u(t,x) = Σ_{k∈Z} (f̂(k) cos(kt) + ĝ(k) sin(kt)/k) e^{ikx}
with the sum converging absolutely. Also show that u(t,x) may be written as
(12.37)  u(t,x) = ½[f(x + t) + f(x − t)] + ½ ∫_{−t}^{t} g(x + τ)dτ.
Hint: To show Eq. (12.36) implies (12.37) use
cos kt = (e^{ikt} + e^{−ikt})/2,  sin kt = (e^{ikt} − e^{−ikt})/(2i)
and
(e^{ik(x+t)} − e^{ik(x−t)})/(ik) = ∫_{−t}^{t} e^{ik(x+τ)}dτ.
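For a concrete instance of the equivalence of Eqs. (12.36) and (12.37), one can take data with only a few Fourier modes so that the series is exact after finitely many terms. The sketch below (data chosen by me for convenience) compares the two formulas at a few points.

# Hedged numerical comparison of the Fourier solution (12.36) and d'Alembert's
# formula (12.37) for u_tt = u_xx with my own finitely-supported Fourier data.
import numpy as np

f = lambda x: np.cos(x) + 0.3 * np.cos(3 * x)      # u(0, x)
g = lambda x: np.sin(2 * x)                        # u_t(0, x)

def u_series(t, x):
    # Only k = +-1, +-3 contribute from f and k = +-2 from g, so (12.36) is exact here.
    return (np.cos(t) * np.cos(x) + 0.3 * np.cos(3 * t) * np.cos(3 * x)
            + np.sin(2 * t) / 2 * np.sin(2 * x))

def u_dalembert(t, x, n=20001):
    tau = np.linspace(-t, t, n)
    return 0.5 * (f(x + t) + f(x - t)) + 0.5 * np.trapz(g(x + tau), tau)

for (t, x) in [(0.3, 0.7), (1.5, -2.0), (4.0, 2.5)]:
    print(u_series(t, x), u_dalembert(t, x))       # the two formulas agree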
12.10. Dirichlet Problems on D.
Exercise 12.30 (Worked Example). Let D := {z ∈ C : |z| < 1} be the open unit disk in C ≅ R², where we write z = x + iy = re^{iθ} in the usual way. Also let Δ = ∂²/∂x² + ∂²/∂y² and recall that Δ may be computed in polar coordinates by the formula,
Δu = r^{−1}∂_r(r ∂_r u) + (1/r²) ∂²_θ u.
Suppose that u ∈ C(D̄) ∩ C²(D) and Δu(z) = 0 for z ∈ D. Let g = u|_{∂D} and
ĝ(k) := (1/2π) ∫_{−π}^{π} g(e^{iθ})e^{−ikθ}dθ.
(We are identifying S¹ = ∂D := {z ∈ D̄ : |z| = 1} with [−π,π]/(π ∼ −π) by the map θ ∈ [−π,π] → e^{iθ} ∈ S¹.) Let
(12.38)  û(r,k) := (1/2π) ∫_{−π}^{π} u(re^{iθ})e^{−ikθ}dθ
then:
(1) û(r,k) satisfies the ordinary differential equation
r^{−1}∂_r(r ∂_r û(r,k)) = (1/r²) k² û(r,k)  for r ∈ (0,1).
(2) Recall the general solution to
(12.39)  r ∂_r(r ∂_r y(r)) = k² y(r)
may be found by trying solutions of the form y(r) = r^α, which then implies α² = k² or α = ±k. From this one sees that û(r,k) may be written as û(r,k) = A_k r^{|k|} + B_k r^{−|k|} for some constants A_k and B_k when k ≠ 0. If k = 0, the solution to Eq. (12.39) is gotten by simple integration and the result is û(r,0) = A_0 + B_0 ln r. Since û(r,k) is bounded near the origin for each k, it follows that B_k = 0 for all k ∈ Z.
(3) So we have shown
A_k r^{|k|} = û(r,k) = (1/2π) ∫_{−π}^{π} u(re^{iθ})e^{−ikθ}dθ
and letting r ↑ 1 in this equation implies
A_k = (1/2π) ∫_{−π}^{π} u(e^{iθ})e^{−ikθ}dθ = ĝ(k).
Therefore,
(12.40)  u(re^{iθ}) = Σ_{k∈Z} ĝ(k) r^{|k|} e^{ikθ}
for r < 1, or equivalently,
u(z) = Σ_{k∈N_0} ĝ(k) z^k + Σ_{k∈N} ĝ(−k) z̄^k.
(4) Inserting the formula for ĝ(k) into Eq. (12.40) gives
u(re^{iθ}) = (1/2π) ∫_{−π}^{π} (Σ_{k∈Z} r^{|k|} e^{ik(θ−α)}) u(e^{iα})dα  for all r < 1.
Now by simple geometric series considerations we find, setting δ = θ − α, that
Σ_{k∈Z} r^{|k|} e^{ikδ} = Σ_{k=0}^{∞} r^k e^{ikδ} + Σ_{k=0}^{∞} r^k e^{−ikδ} − 1 = 2 Re Σ_{k=0}^{∞} r^k e^{ikδ} − 1
= Re[2/(1 − re^{iδ}) − 1] = Re[(1 + re^{iδ})/(1 − re^{iδ})]
= Re[(1 + re^{iδ})(1 − re^{−iδ})/|1 − re^{iδ}|²]
(12.41)  = Re[(1 − r² + 2ir sin δ)/(1 − 2r cos δ + r²)]
= (1 − r²)/(1 − 2r cos δ + r²).
Putting this altogether we have shown
u(re^{iθ}) = (1/2π) ∫_{−π}^{π} P_r(θ − α) u(e^{iα})dα =: P_r ∗ u(e^{iθ})
(12.42)  = (1/2π) Re ∫_{−π}^{π} (1 + re^{i(θ−α)})/(1 − re^{i(θ−α)}) u(e^{iα})dα
where
P_r(δ) := (1 − r²)/(1 − 2r cos δ + r²)
is the so called Poisson kernel. (The fact that (1/2π) ∫_{−π}^{π} P_r(δ)dδ = 1 follows from the fact that
(1/2π) ∫_{−π}^{π} P_r(δ)dδ = Re (1/2π) ∫_{−π}^{π} Σ_{k∈Z} r^{|k|} e^{ikδ} dδ = Re (1/2π) Σ_{k∈Z} ∫_{−π}^{π} r^{|k|} e^{ikδ} dδ = 1.)
Writing z = re^{iθ}, Eq. (12.42) may be rewritten as
u(z) = (1/2π) Re ∫_{−π}^{π} (1 + ze^{−iα})/(1 − ze^{−iα}) u(e^{iα})dα
which shows u = Re F where
F(z) := (1/2π) ∫_{−π}^{π} (1 + ze^{−iα})/(1 − ze^{−iα}) u(e^{iα})dα.
Moreover it follows from Eq. (12.41) that
Im F(re^{iθ}) = (1/π) ∫_{−π}^{π} (r sin(θ − α))/(1 − 2r cos(θ − α) + r²) u(e^{iα})dα =: Q_r ∗ u(e^{iθ})
where
Q_r(δ) := (r sin δ)/(1 − 2r cos δ + r²).
From these remarks it follows that v := Q_r ∗ u(e^{iθ}) is the harmonic conjugate of u and P̃_r = Q_r.
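The Poisson integral formula (12.42) is easy to test against a known harmonic function such as Re(e^z). The sketch below (test function and quadrature are my own choices) recovers an interior value of u from its boundary values.

# Hedged numerical sketch of the Poisson integral (12.42): recover a harmonic
# function inside D from its boundary values.  Test function chosen by me.
import numpy as np

def poisson_kernel(r, delta):
    return (1 - r**2) / (1 - 2 * r * np.cos(delta) + r**2)

alpha = np.linspace(-np.pi, np.pi, 4000, endpoint=False)
g = np.exp(np.cos(alpha)) * np.cos(np.sin(alpha))     # boundary values of Re(e^z)

r, theta = 0.6, 1.1
u_poisson = np.mean(poisson_kernel(r, theta - alpha) * g)   # (1/2pi) * integral
z = r * np.exp(1j * theta)
u_exact = np.real(np.exp(z))                                 # Re(e^z) is harmonic on D
print(u_poisson, u_exact)                                    # agree up to quadrature error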
Exercise 12.31. Show Σ_{k=1}^{∞} k^{−2} = π²/6, by taking f(x) = x on [−π,π] and computing ‖f‖²₂ directly and then in terms of the Fourier coefficients f̂ of f.
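For orientation (worked out by me, not stated in the exercise): ‖f‖²₂ = (1/2π)∫_{−π}^{π} x² dx = π²/3 while |f̂(k)|² = 1/k² for k ≠ 0 and f̂(0) = 0, so Parseval gives 2Σ_{k≥1} k^{−2} = π²/3. A one-line numerical confirmation:

# Numerical confirmation (my own) of sum_{k>=1} 1/k^2 = pi^2/6 from Exercise 12.31.
import numpy as np

K = 100_000
print(np.sum(1.0 / np.arange(1, K + 1) ** 2), np.pi**2 / 6)   # 1.64492..., 1.64493...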
13. Construction of Measures
Now that we have developed integration theory relative to a measure on a σ-algebra, it is time to show how to construct the measures that we have been using. This is a bit technical because there tends to be no explicit description of the general element of the typical σ-algebra. On the other hand, we do know how to explicitly describe algebras which are generated by some class of sets E ⊂ P(X). Therefore, we might try to define measures on σ(E) by their restrictions to A(E). Theorem 8.5 shows this is a plausible method.
So the strategy of this section is as follows: 1) construct finitely additive measures on an algebra, 2) construct "integrals" associated to such finitely additive measures, 3) extend these integrals (Daniell's method) when possible to a larger class of functions, 4) construct a measure from the extended integral (Daniell-Stone construction theorem).
13.1. Finitely Additive Measures and Associated Integrals.
Definition 13.1. Suppose that E ⊂ P(X) is a collection of subsets of a set X and μ : E → [0,∞] is a function. Then
(1) μ is additive on E if μ(E) = Σ_{i=1}^{n} μ(E_i) whenever E = ∐_{i=1}^{n} E_i ∈ E with E_i ∈ E for i = 1,2,...,n < ∞.
(2) μ is σ-additive (or countably additive) on E if Item 1. holds even when n = ∞.
(3) μ is subadditive on E if μ(E) ≤ Σ_{i=1}^{n} μ(E_i) whenever E = ∐_{i=1}^{n} E_i ∈ E with E_i ∈ E and n ∈ N ∪ {∞}.
(4) μ is σ-finite on E if there exist E_n ∈ E such that X = ∪_n E_n and μ(E_n) < ∞.
The reader should check that if E = A is an algebra and μ is additive on A, then μ is σ-finite on A iff there exist X_n ∈ A such that X_n ↑ X and μ(X_n) < ∞ for all n.
Proposition 13.2. Suppose E ⊂ P(X) is an elementary family (see Definition 6.11) and A = A(E) is the algebra generated by E. Then every additive function μ : E → [0,∞] extends uniquely to an additive measure (which we still denote by μ) on A.
Proof. Since by Proposition 6.12, every element A ∈ A is of the form A = ∐_i E_i with E_i ∈ E, it is clear that if μ extends to a measure the extension is unique and must be given by
(13.1)  μ(A) = Σ_i μ(E_i).
To prove the existence of the extension, the main point is to show that defining μ(A) by Eq. (13.1) is well defined, i.e. if we also have A = ∐_j F_j with F_j ∈ E, then we must show
(13.2)  Σ_i μ(E_i) = Σ_j μ(F_j).
But E_i = ∐_j (E_i ∩ F_j) and the property that μ is additive on E implies μ(E_i) = Σ_j μ(E_i ∩ F_j) and hence
Σ_i μ(E_i) = Σ_i Σ_j μ(E_i ∩ F_j) = Σ_{i,j} μ(E_i ∩ F_j).
By symmetry or an analogous argument,
Σ_j μ(F_j) = Σ_{i,j} μ(E_i ∩ F_j)
which combined with the previous equation shows that Eq. (13.2) holds. It is now easy to verify that μ extended to A as in Eq. (13.1) is an additive measure on A.
Proposition 13.3. Let X = R and E be the elementary class
E = {(a,b] ∩ R : −∞ ≤ a ≤ b ≤ ∞},
and A = A(E) be the algebra of disjoint unions of elements from E. Suppose that μ_0 : A → [0,∞] is an additive measure such that μ_0((a,b]) < ∞ for all −∞ < a < b < ∞. Then there is a unique increasing function F : R̄ → R̄ such that F(0) = 0, F^{−1}({−∞}) ⊂ {−∞}, F^{−1}({∞}) ⊂ {∞} and
(13.3)  μ_0((a,b] ∩ R) = F(b) − F(a)  for all a ≤ b in R̄.
Conversely, given an increasing function F : R̄ → R̄ such that F^{−1}({−∞}) ⊂ {−∞}, F^{−1}({∞}) ⊂ {∞} there is a unique measure μ_0 = μ_0^F on A such that the relation in Eq. (13.3) holds.
So the finitely additive measures μ_0 on A(E) which are finite on bounded sets are in one to one correspondence with increasing functions F : R̄ → R̄ such that F(0) = 0, F^{−1}({−∞}) ⊂ {−∞}, F^{−1}({∞}) ⊂ {∞}.
Proof. If F is going to exist, then
μ_0((0,b] ∩ R) = F(b) − F(0) = F(b) if b ∈ [0,∞],
μ_0((a,0]) = F(0) − F(a) = −F(a) if a ∈ [−∞,0]
from which we learn
F(x) = −μ_0((x,0]) if x ≤ 0  and  F(x) = μ_0((0,x] ∩ R) if x ≥ 0.
Moreover, one easily checks using the additivity of μ_0 that Eq. (13.3) holds for this F.
Conversely, suppose F : R̄ → R̄ is an increasing function such that F^{−1}({−∞}) ⊂ {−∞}, F^{−1}({∞}) ⊂ {∞}. Define μ_0 on E using the formula in Eq. (13.3). I claim that μ_0 is additive on E and hence has a unique extension to A which will finish the argument. Suppose that
(a,b] = ∐_{i=1}^{n} (a_i,b_i].
By reordering (a_i,b_i] if necessary, we may assume that
a = a_1 < b_1 = a_2 < b_2 = a_3 < ··· < a_n < b_n = b.
Therefore,
μ_0((a,b]) = F(b) − F(a) = Σ_{i=1}^{n} [F(b_i) − F(a_i)] = Σ_{i=1}^{n} μ_0((a_i,b_i])
as desired.
13.1.1. Integrals associated to finitely additive measures.
Definition 13.4. Let μ be a finitely additive measure on an algebra A ⊂ P(X), S = S_f(A,μ) be the collection of simple functions defined in Notation 11.1 and for f ∈ S define the integral I(f) = I_μ(f) by
(13.4)  I_μ(f) = Σ_{y∈R} y μ(f = y).
The same proof used for Proposition 7.14 shows I_μ : S → R is linear and positive, i.e. I(f) ≥ 0 if f ≥ 0. Taking absolute values of Eq. (13.4) gives
(13.5)  |I(f)| ≤ Σ_{y∈R} |y| μ(f = y) ≤ ‖f‖_∞ μ(f ≠ 0)
where ‖f‖_∞ = sup_{x∈X} |f(x)|. For A ∈ A, let S_A := {f ∈ S : {f ≠ 0} ⊂ A}. The estimate in Eq. (13.5) implies
(13.6)  |I(f)| ≤ μ(A) ‖f‖_∞ for all f ∈ S_A.
The B.L.T. Theorem 4.1 then implies that I has a unique extension I_A to S̄_A ⊂ B(X) for any A ∈ A such that μ(A) < ∞. The extension I_A is still positive. Indeed, let f ∈ S̄_A with f ≥ 0 and let f_n ∈ S_A be a sequence such that ‖f − f_n‖_∞ → 0 as n → ∞. Then f_n ∨ 0 ∈ S_A and
‖f − f_n ∨ 0‖_∞ ≤ ‖f − f_n‖_∞ → 0 as n → ∞.
Therefore, I_A(f) = lim_{n→∞} I_A(f_n ∨ 0) ≥ 0.
Suppose that A, B ∈ A are sets such that μ(A) + μ(B) < ∞, then S_A ∪ S_B ⊂ S_{A∪B} and so S̄_A ∪ S̄_B ⊂ S̄_{A∪B}. Therefore I_A(f) = I_{A∪B}(f) = I_B(f) for all f ∈ S̄_A ∩ S̄_B.
The next proposition summarizes these remarks.
Proposition 13.5. Let (A, μ, I = I_μ) be as in Definition 13.4, then we may extend I to
S̃ := ∪ {S̄_A : A ∈ A with μ(A) < ∞}
by defining I(f) = I_A(f) when f ∈ S̄_A with μ(A) < ∞. Moreover this extension is still positive.
Notation 13.6. Suppose X = R, A = A(E), F and μ_0 are as in Proposition 13.3. For f ∈ S̃, we will write I(f) as ∫_{−∞}^{∞} f dF or ∫_{−∞}^{∞} f(x)dF(x) and refer to ∫_{−∞}^{∞} f dF as the Riemann-Stieltjes integral of f relative to F.
Lemma 13.7. Using the notation above, the map f ∈ S̃ → ∫_{−∞}^{∞} f dF is linear, positive and satisfies the estimate
(13.7)  |∫_{−∞}^{∞} f dF| ≤ (F(b) − F(a)) ‖f‖_∞
if supp(f) ⊂ (a,b). Moreover C_c(R,R) ⊂ S̃.
Proof. The only new point of the lemma is to prove C_c(R,R) ⊂ S̃; the remaining assertions follow directly from Proposition 13.5. The fact that C_c(R,R) ⊂ S̃ has essentially already been done in Example 7.24. In more detail, let f ∈ C_c(R,R) and choose a < b such that supp(f) ⊂ (a,b). Then define f_k ∈ S as in Example 7.24, i.e.
f_k(x) = Σ_{l=0}^{n_k−1} min{f(x) : a^k_l ≤ x ≤ a^k_{l+1}} 1_{(a^k_l, a^k_{l+1}]}(x)
where π_k = {a = a^k_0 < a^k_1 < ··· < a^k_{n_k} = b}, for k = 1,2,3,..., is a sequence of refining partitions such that mesh(π_k) → 0 as k → ∞. Since supp(f) is compact and f is continuous, f is uniformly continuous on R. Therefore ‖f − f_k‖_∞ → 0 as k → ∞, showing f ∈ S̃. Incidentally, for f ∈ C_c(R,R), it follows that
(13.8)  ∫_{−∞}^{∞} f dF = lim_{k→∞} Σ_{l=0}^{n_k−1} min{f(x) : a^k_l ≤ x ≤ a^k_{l+1}} [F(a^k_{l+1}) − F(a^k_l)].
The most important special case of a Riemann-Stieltjes integral is when F(x) = x in which case ∫_{−∞}^{∞} f(x)dF(x) = ∫_{−∞}^{∞} f(x)dx is the ordinary Riemann integral. The following Exercise is an abstraction of Lemma 13.7.
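Before turning to that exercise, here is a small numerical illustration (my own; the choice F(x) = x³ and the integrand are arbitrary) of the Riemann-Stieltjes sums in Eq. (13.8): for a smooth increasing F they converge to ∫ f dF = ∫ f(x)F′(x)dx.

# Hedged numerical sketch of the Riemann-Stieltjes sums in Eq. (13.8) with the
# increasing function F(x) = x^3, so that dF = 3 x^2 dx on [a, b].  (Choices are mine.)
import numpy as np

F = lambda x: x**3
f = lambda x: np.cos(x)
a, b = -1.0, 1.0

def stieltjes_sum(n):
    pts = np.linspace(a, b, n + 1)
    total = 0.0
    for l in range(n):
        sub = np.linspace(pts[l], pts[l + 1], 20)          # fine subgrid to estimate min f
        total += f(sub).min() * (F(pts[l + 1]) - F(pts[l]))
    return total

xs = np.linspace(a, b, 100_001)
exact = np.trapz(f(xs) * 3 * xs**2, xs)                    # int_a^b f(x) F'(x) dx
for n in (10, 100, 1000):
    print(n, stieltjes_sum(n), exact)                      # the sums converge to the integral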
Exercise 13.1. Continue the notation of Definition 13.4 and Proposition 13.5. Further assume that X is a metric space, there exist open sets X_n ⊂_o X such that X_n ↑ X and for each n ∈ N and δ > 0 there exists a finite collection of sets {A_i}_{i=1}^{k} ⊂ A such that diam(A_i) < δ, μ(A_i) < ∞ and X_n ⊂ ∪_{i=1}^{k} A_i. Then C_c(X,R) ⊂ S̃ and so I is well defined on C_c(X,R).
Proposition 13.8. Suppose that (X,τ) is a locally compact Hausdorff space and I is a positive linear functional on C_c(X,R). Then for each compact subset K ⊂ X there is a constant C_K < ∞ such that |I(f)| ≤ C_K ‖f‖_∞ for all f ∈ C_c(X,R) with supp(f) ⊂ K. Moreover, if f_n ∈ C_c(X,[0,∞)) and f_n ↓ 0 (pointwise) as n → ∞, then I(f_n) ↓ 0 as n → ∞.
Proof. Let f ∈ C_c(X,R) with supp(f) ⊂ K. By Lemma 10.15 there exists ψ_K ≺ X such that ψ_K = 1 on K. Since ‖f‖_∞ ψ_K ± f ≥ 0,
0 ≤ I(‖f‖_∞ ψ_K ± f) = ‖f‖_∞ I(ψ_K) ± I(f)
from which it follows that |I(f)| ≤ I(ψ_K) ‖f‖_∞. So the first assertion holds with C_K = I(ψ_K) < ∞.
Now suppose that f_n ∈ C_c(X,[0,∞)) and f_n ↓ 0 as n → ∞. Let K = supp(f_1) and notice that supp(f_n) ⊂ K for all n. By Dini's Theorem (see Exercise 3.11), ‖f_n‖_∞ ↓ 0 as n → ∞ and hence
0 ≤ I(f_n) ≤ C_K ‖f_n‖_∞ ↓ 0 as n → ∞.
This result applies to the Riemann-Stieltjes integral in Lemma 13.7 restricted to C_c(R,R). However it is not generally true in this case that I(f_n) → 0 for all f_n ∈ S such that f_n ↓ 0. Proposition 13.10 below addresses this question.
Definition 13.9. A countably additive function μ on an algebra A ⊂ 2^X is called a premeasure.
As for measures (see Remark 7.2 and Proposition 7.3), one easily shows that if μ is a premeasure on A, {A_n}_{n=1}^{∞} ⊂ A and A_n ↑ A ∈ A, then μ(A_n) ↑ μ(A) as n → ∞, and if μ(A_1) < ∞ and A_n ↓ ∅ then μ(A_n) ↓ 0 as n → ∞. Now suppose that μ in Proposition 13.3 were a premeasure on A(E). Letting A_n = (a,b_n] with b_n ↓ b as n → ∞ we learn,
F(b_n) − F(a) = μ((a,b_n]) ↓ μ((a,b]) = F(b) − F(a)
from which it follows that lim_{y↓b} F(y) = F(b), i.e. F is right continuous. We will see below that in fact μ is a premeasure on A(E) iff F is right continuous.
Proposition 13.10. Let (A, μ, S = S_f(A,μ), I = I_μ) be as in Definition 13.4. If μ is a premeasure on A, then
(13.9)  f_n ∈ S with f_n ↓ 0 ⟹ I(f_n) ↓ 0 as n → ∞.
Proof. Let ε > 0 be given. Then
f_n = f_n 1_{f_n > εf_1} + f_n 1_{f_n ≤ εf_1} ≤ f_1 1_{f_n > εf_1} + εf_1,
I(f_n) ≤ I(f_1 1_{f_n > εf_1}) + εI(f_1) = Σ_{a>0} a μ(f_1 = a, f_n > εa) + εI(f_1),
and hence
(13.10)  limsup_{n→∞} I(f_n) ≤ Σ_{a>0} a limsup_{n→∞} μ(f_1 = a, f_n > εa) + εI(f_1).
Because, for a > 0,
A ∋ {f_1 = a, f_n > εa} := {f_1 = a} ∩ {f_n > εa} ↓ ∅ as n → ∞
and μ(f_1 = a) < ∞, limsup_{n→∞} μ(f_1 = a, f_n > εa) = 0. Combining this with Eq. (13.10) and making use of the fact that ε > 0 is arbitrary we learn
limsup_{n→∞} I(f_n) = 0.
13.2. The Daniell-Stone Construction Theorem.
Definition 13.11. A vector subspace S of real valued functions on a set X is a lattice if it is closed under the lattice operations: f ∨ g = max(f,g) and f ∧ g = min(f,g).
Remark 13.12. Notice that a lattice S is closed under the absolute value operation since |f| = f ∨ 0 − f ∧ 0. Furthermore if S is a vector space of real valued functions, to show that S is a lattice it suffices to show f⁺ = f ∨ 0 ∈ S for all f ∈ S. This is because
|f| = f⁺ + (−f)⁺,
f ∨ g = ½(f + g + |f − g|) and
f ∧ g = ½(f + g − |f − g|).
Notation 13.13. Given a collection of extended real valued functions C on X, let C⁺ := {f ∈ C : f ≥ 0} denote the subset of positive functions f ∈ C.
Definition 13.14. A linear functional I on S is said to be positive (i.e. non-negative) if I(f) ≥ 0 for all f ∈ S⁺. (This is equivalent to the statement that I(f) ≤ I(g) if f, g ∈ S and f ≤ g.)
Definition 13.15 (Property (D)). A non-negative linear functional I on S is said to be continuous under monotone limits if I(f_n) ↓ 0 for all {f_n}_{n=1}^{∞} ⊂ S⁺ satisfying (pointwise) f_n ↓ 0. A positive linear functional on S satisfying property (D) is called a Daniell integral on S. We will also write S as D(I), the domain of I.
Example 13.16. Let (X,τ) be a locally compact Hausdorff space and I be a positive linear functional on S := C_c(X,R). It is easily checked that S is a lattice and Proposition 13.8 shows I is automatically a Daniell integral. In particular if X = R and F is an increasing function on R, then the corresponding Riemann-Stieltjes integral restricted to S := C_c(R,R) (f ∈ C_c(R,R) → ∫_R f dF) is a Daniell integral.
Example 13.17. Let (A, μ, S = S_f(A,μ), I = I_μ) be as in Definition 13.4. It is easily checked that S is a lattice. Proposition 13.10 guarantees that I is a Daniell integral on S when μ is a premeasure on A.
Lemma 13.18. Let I be a non-negative linear functional on a lattice S. Then property (D) is equivalent to either of the following two properties:
D₁: If φ, φ_n ∈ S satisfy φ_n ≤ φ_{n+1} for all n and φ ≤ lim_{n→∞} φ_n, then I(φ) ≤ lim_{n→∞} I(φ_n).
D₂: If u_j ∈ S⁺ and φ ∈ S is such that φ ≤ Σ_{j=1}^{∞} u_j, then I(φ) ≤ Σ_{j=1}^{∞} I(u_j).
Proof. (D) ⟹ (D₁) Let φ, φ_n ∈ S be as in D₁. Then φ ∧ φ_n ↑ φ and (φ − φ ∧ φ_n) ↓ 0 which implies
I(φ) − I(φ ∧ φ_n) = I(φ − φ ∧ φ_n) ↓ 0.
Hence
I(φ) = lim_{n→∞} I(φ ∧ φ_n) ≤ lim_{n→∞} I(φ_n).
(D₁) ⟹ (D₂) Apply (D₁) with φ_n = Σ_{j=1}^{n} u_j.
(D₂) ⟹ (D) Suppose φ_n ∈ S with φ_n ↓ 0 and let u_n = φ_n − φ_{n+1}. Then Σ_{n=1}^{N} u_n = φ_1 − φ_{N+1} ↑ φ_1 and hence
I(φ_1) ≤ Σ_{n=1}^{∞} I(u_n) = lim_{N→∞} Σ_{n=1}^{N} I(u_n) = lim_{N→∞} I(φ_1 − φ_{N+1}) = I(φ_1) − lim_{N→∞} I(φ_{N+1})
from which it follows that lim_{N→∞} I(φ_{N+1}) ≤ 0. Since I(φ_{N+1}) ≥ 0 for all N we conclude that lim_{N→∞} I(φ_{N+1}) = 0.
In the remainder of this section, S will denote a lattice of bounded real valued functions on a set X and I : S → R will be a Daniell integral on S.
Lemma 13.19. Suppose that {f_n}, {g_n} ⊂ S.
(1) If f_n ↑ f and g_n ↑ g with f, g : X → (−∞,∞] such that f ≤ g, then
(13.11)  lim_{n→∞} I(f_n) ≤ lim_{n→∞} I(g_n).
(2) If f_n ↓ f and g_n ↓ g with f, g : X → [−∞,∞) such that f ≤ g, then Eq. (13.11) still holds.
In particular, in either case if f = g, then lim_{n→∞} I(f_n) = lim_{n→∞} I(g_n).
Proof.
(1) Fix n ∈ N, then g_k ∧ f_n ↑ f_n as k → ∞ and g_k ∧ f_n ≤ g_k and hence
I(f_n) = lim_{k→∞} I(g_k ∧ f_n) ≤ lim_{k→∞} I(g_k).
Passing to the limit n → ∞ in this equation proves Eq. (13.11).
(2) Since −f_n ↑ (−f) and −g_n ↑ (−g) and −g ≤ −f, what we just proved shows
−lim_{n→∞} I(g_n) = lim_{n→∞} I(−g_n) ≤ lim_{n→∞} I(−f_n) = −lim_{n→∞} I(f_n)
which is equivalent to Eq. (13.11).
Definition 13.20. Let
S↑ = {f : X → (−∞,∞] : ∃ f_n ∈ S such that f_n ↑ f}
and for f ∈ S↑ let I(f) = lim_{n→∞} I(f_n) ∈ (−∞,∞].
Lemma 13.19 shows this extension of I to S↑ is well defined and positive, i.e. I(f) ≤ I(g) if f ≤ g.
Definition 13.21. Let S↓ = {f : X → [−∞,∞) : ∃ f_n ∈ S such that f_n ↓ f} and define I(f) = lim_{n→∞} I(f_n) on S↓.
Exercise 13.2. Show S↓ = −S↑ and for f ∈ S↓ that I(f) = −I(−f).
We are now in a position to state the main construction theorem. The theorem
we state here is not as general as possible but it will suce for our present purposes.
See Section 14 for a more general version and the full proof.
Theorem 13.22 (Daniell-Stone). Let S be a lattice of bounded functions on a set
X such that 1 S and let I be a Daniel integral on S. Further assume there
exists S

such that I() < and (x) > 0 for all x X. Then there exists a
unique measure on M := (S) such that
(13.12) I(f) =
Z
X
fd for all f S.
Moreover, for all g L
1
(X, M, ),
(13.13) sup{I(f) : S

3 f g} =
Z
X
gd = inf {I(h) : g h S

} .
Proof. Only a sketch of the proof will be given here. Full details may be found
in Section 14 below.
Existence. For g : X

R, dene

I(g) := inf{I(h) : g h S

},
I(g) := sup{I(f) : S

3 f g}
and set
L
1
(I) := {g : X

R :

I(g) = I(g) R}.
For g L
1
(I), let

I(g) =

I(g) = I(g). Then, as shown in Proposition 14.10, L
1
(I)
is a extended vector space and

I : L
1
(I) R is linear as dened in Denition
14.1 below. By Proposition 14.6, if f S

with I(f) < then f L


1
(I).
Moreover,

I obeys the monotone convergence theorem, Fatous lemma, and the
260 BRUCE K. DRIVER

dominated convergence theorem, see Theorem 14.11, Lemma 14.12 and Theorem
14.15 respectively.
Let
R :=

A X : 1
A
f L
1
(I) for all f S

and for A Rset (A) :=



I(1
A
). It can then be shown: 1) Ris a algebra (Lemma
14.23) containing (S) (Lemma 14.24), is a measure on R (Lemma 14.25), and
that Eq. (13.12) holds. In fact it is shown in Theorem 14.28 and Proposition 14.29
below that L
1
(X, M, ) L
1
(I) and

I(g) =
Z
X
gd for all g L
1
(X, M, ).
The assertion in Eq. (13.13) is a consequence of the denition of L
1
(I) and

I and
this last equation.
Uniqueness. Suppose that is another measure on (S) such that
I(f) =
Z
X
fd for all f S.
By the monotone convergence theorem and the denition of I on S

,
I(f) =
Z
X
fd for all f S

.
Therefore if A (S) R,
(A) =

I(1
A
) = inf{I(h) : 1
A
h S

}
= inf{
Z
X
hd : 1
A
h S

}
Z
X
1
A
d = (A)
which shows . If A (S) R with (A) < , then, by Remark 14.22 below,
1
A
L
1
(I) and therefore
(A) =

I(1
A
) =

I(1
A
) = I(1
A
) = sup{I(f) : S

3 f 1
A
}
= sup{
Z
X
fd : S

3 f 1
A
} (A).
Hence (A) (A) for all A (S) and (A) = (A) when (A) < .
To prove (A) = (A) for all A (S), let X
n
:= { 1/n} (S). Since
1
X
n
n,
(X
n
) =
Z
X
1
X
n
d
Z
X
nd = nI() < .
Since > 0 on X, X
n
X and therefore by continuity of and ,
(A) = lim
n
(A X
n
) = lim
n
(A X
n
) = (A)
for all A (S).
The rest of this chapter is devoted to applications of the Daniell Stone con-
struction theorem.
Remark 13.23. To check the hypothesis in Theorem 13.22 that there exists S

such that I() < and (x) > 0 for all x X, it suces to nd
n
S
+
such
that
P

n=1

n
> 0 on X. To see this let M
n
:= max (k
n
k
u
, I(
n
) , 1) and dene
:=
P

n=1
1
M
n
2
n

n
, then S

, 0 < 1 and I() 1 < .


13.3. Extensions of premeasures to measures I. In this section let X be a set, A be a subalgebra of 2^X and μ_0 : A → [0,∞] be a premeasure on A.
Definition 13.24. Let E be a collection of subsets of X, and let E_σ denote the collection of subsets of X which are finite or countable unions of sets from E. Similarly let E_δ denote the collection of subsets of X which are finite or countable intersections of sets from E. We also write E_{σδ} = (E_σ)_δ and E_{δσ} = (E_δ)_σ, etc.
Remark 13.25. Let μ_0 be a premeasure on an algebra A. Any A = ∪_{n=1}^{∞} A′_n ∈ A_σ with A′_n ∈ A may be written as A = ∐_{n=1}^{∞} A_n, with A_n ∈ A, by setting A_n := A′_n \ (A′_1 ∪ ··· ∪ A′_{n−1}). If we also have A = ∐_{n=1}^{∞} B_n with B_n ∈ A, then A_n = ∐_{k=1}^{∞} (A_n ∩ B_k) and therefore because μ_0 is a premeasure,
μ_0(A_n) = Σ_{k=1}^{∞} μ_0(A_n ∩ B_k).
Summing this equation on n shows,
Σ_{n=1}^{∞} μ_0(A_n) = Σ_{n=1}^{∞} Σ_{k=1}^{∞} μ_0(A_n ∩ B_k).
By symmetry (i.e. the same argument with the A's and B's interchanged) and Fubini's theorem for sums,
Σ_{k=1}^{∞} μ_0(B_k) = Σ_{k=1}^{∞} Σ_{n=1}^{∞} μ_0(A_n ∩ B_k) = Σ_{n=1}^{∞} Σ_{k=1}^{∞} μ_0(A_n ∩ B_k)
and hence Σ_{n=1}^{∞} μ_0(A_n) = Σ_{k=1}^{∞} μ_0(B_k). Therefore we may extend μ_0 to A_σ by setting
μ_0(A) := Σ_{n=1}^{∞} μ_0(A_n)
if A = ∐_{n=1}^{∞} A_n, with A_n ∈ A. In future we will tacitly assume this extension has been made.
Theorem 13.26. Let X be a set, A be a subalgebra of 2^X and μ_0 be a premeasure on A which is σ-finite on A, i.e. there exist X_n ∈ A such that μ_0(X_n) < ∞ and X_n ↑ X as n → ∞. Then μ_0 has a unique extension to a measure, μ, on M := σ(A). Moreover, if A ∈ M and ε > 0 is given, there exists B ∈ A_σ such that A ⊂ B and μ(B \ A) < ε. In particular,
(13.14)  μ(A) = inf{μ_0(B) : A ⊂ B ∈ A_σ}
(13.15)  = inf{Σ_{n=1}^{∞} μ_0(A_n) : A ⊂ ∐_{n=1}^{∞} A_n with A_n ∈ A}.
Proof. Let (A, μ_0, I = I_{μ_0}) be as in Definition 13.4. As mentioned in Example 13.17, I is a Daniell integral on the lattice S = S_f(A,μ_0). It is clear that 1 ∧ φ ∈ S for all φ ∈ S. Since 1_{X_n} ∈ S⁺ and Σ_{n=1}^{∞} 1_{X_n} > 0 on X, by Remark 13.23 there exists χ ∈ S↑ such that I(χ) < ∞ and χ > 0. So the hypotheses of Theorem 13.22 hold and hence there exists a unique measure μ on M such that I(f) = ∫_X f dμ for all f ∈ S. Taking f = 1_A with A ∈ A and μ_0(A) < ∞ shows μ(A) = μ_0(A). For general A ∈ A, we have
μ(A) = lim_{n→∞} μ(A ∩ X_n) = lim_{n→∞} μ_0(A ∩ X_n) = μ_0(A).
The fact that μ is the only extension of μ_0 to M follows from Theorem 8.5 or Theorem 8.8. It can also be proved using Theorem 13.22. Indeed, if ν is another measure on M such that ν = μ on A, then I_ν = I on S. Therefore by the uniqueness assertion in Theorem 13.22, μ = ν on M.
By Eq. (13.13), for A ∈ M,
μ(A) = Ī(1_A) = inf{I(f) : f ∈ S↑ with 1_A ≤ f} = inf{∫_X f dμ : f ∈ S↑ with 1_A ≤ f}.
For the moment suppose μ(A) < ∞ and ε > 0 is given. Choose f ∈ S↑ such that 1_A ≤ f and
(13.16)  ∫_X f dμ = I(f) < μ(A) + ε.
Let f_n ∈ S be a sequence such that f_n ↑ f as n → ∞ and for α ∈ (0,1) set
B_α := {f > α} = ∪_{n=1}^{∞} {f_n > α} ∈ A_σ.
Then A ⊂ {f ≥ 1} ⊂ B_α and by Chebyshev's inequality,
μ(B_α) ≤ α^{−1} ∫_X f dμ = α^{−1} I(f)
which combined with Eq. (13.16) implies μ(B_α) < μ(A) + ε for all α sufficiently close to 1. For such α we then have A ⊂ B_α and μ(B_α \ A) = μ(B_α) − μ(A) < ε.
For general A ∈ M, choose X_n ↑ X with X_n ∈ A. Then there exist B_n ∈ A_σ such that μ(B_n \ (A ∩ X_n)) < ε2^{−n}. Define B := ∪_{n=1}^{∞} B_n ∈ A_σ. Then
μ(B \ A) = μ(∪_{n=1}^{∞}(B_n \ A)) ≤ Σ_{n=1}^{∞} μ(B_n \ A) ≤ Σ_{n=1}^{∞} μ(B_n \ (A ∩ X_n)) < ε.
Eq. (13.14) is an easy consequence of this result and the fact that μ(B) = μ_0(B).
Corollary 13.27 (Regularity of σ-finite measures). Let A ⊂ P(X) be an algebra of sets, M = σ(A) and μ : M → [0,∞] be a measure on M which is σ-finite on A. Then
(1) For all A ∈ M,
(13.17)  μ(A) = inf{μ(B) : A ⊂ B ∈ A_σ}.
(2) If A ∈ M and ε > 0 are given, there exists B ∈ A_σ such that A ⊂ B and μ(B \ A) < ε.
(3) For all A ∈ M and ε > 0 there exists B ∈ A_δ such that B ⊂ A and μ(A \ B) < ε.
(4) For any B ∈ M there exist A ∈ A_{δσ} and C ∈ A_{σδ} such that A ⊂ B ⊂ C and μ(C \ A) = 0.
(5) The linear space S := S_f(A,μ) is dense in L^p(μ) for all p ∈ [1,∞); briefly put, the closure of S_f(A,μ) in L^p(μ) is L^p(μ).
Proof. Items 1. and 2. follow by applying Theorem 13.26 to μ_0 = μ|_A. Items 3. and 4. follow from Items 1. and 2. as in the proof of Corollary 8.41 above.
Item 5. This has already been proved in Theorem 11.3 but we will give yet another proof here. When p = 1 and g ∈ L¹(μ;R), there exists, by Eq. (13.13), h ∈ S↑ such that g ≤ h and ‖h − g‖₁ = ∫_X (h − g)dμ < ε. Let {h_n}_{n=1}^{∞} ⊂ S be chosen so that h_n ↑ h as n → ∞. Then by the dominated convergence theorem, ‖h_n − g‖₁ → ‖h − g‖₁ < ε as n → ∞. Therefore for n large we have h_n ∈ S with ‖h_n − g‖₁ < ε. Since ε > 0 is arbitrary this shows that S_f(A,μ) is dense in L¹(μ).
Now suppose p > 1, g ∈ L^p(μ;R) and X_n ∈ A are sets such that X_n ↑ X and μ(X_n) < ∞. By the dominated convergence theorem, 1_{X_n}[(g ∧ n) ∨ (−n)] → g in L^p(μ) as n → ∞, so it suffices to consider g ∈ L^p(μ;R) with {g ≠ 0} ⊂ X_n and |g| ≤ n for some large n ∈ N. By Hölder's inequality, such a g is in L¹(μ). So if ε > 0, by the p = 1 case, we may find h ∈ S such that ‖h − g‖₁ < ε. By replacing h by (h ∧ n) ∨ (−n) ∈ S, we may assume h is bounded by n as well and hence
‖h − g‖_p^p = ∫_X |h − g|^p dμ = ∫_X |h − g|^{p−1}|h − g| dμ ≤ (2n)^{p−1} ∫_X |h − g| dμ < (2n)^{p−1} ε.
Since ε > 0 was arbitrary, this shows S is dense in L^p(μ;R).
Remark 13.28. If we drop the σ-finiteness assumption on μ_0 we may lose the uniqueness assertion in Theorem 13.26. For example, let X = R, B_R and A be the algebra generated by E := {(a,b] ∩ R : −∞ ≤ a ≤ b ≤ ∞}. Recall B_R = σ(E). Let D ⊂ R be a countable dense set and define μ_D(A) := #(D ∩ A). Then μ_D(A) = ∞ for all A ∈ A such that A ≠ ∅. So if D′ ⊂ R is another countable dense subset of R, μ_{D′} = μ_D on A while μ_D ≠ μ_{D′} on B_R. Also notice that μ_D is σ-finite on B_R but not on A.
It is now possible to use Theorem 13.26 to give a proof of Theorem 7.8, see subsection 13.8 below. However rather than do this now let us give another application of Theorem 13.26 based on Example 13.16 and use the result to prove Theorem 7.8.
13.4. Riesz Representation Theorem.
Definition 13.29. Given a second countable locally compact Hausdorff space (X,τ), let M⁺ denote the collection of positive measures, μ, on B_X := σ(τ) with the property that μ(K) < ∞ for all compact subsets K ⊂ X. Such a measure μ will be called a Radon measure on X. For μ ∈ M⁺ and f ∈ C_c(X,R) let I_μ(f) := ∫_X f dμ.
Theorem 13.30 (Riesz Representation Theorem). Let (X,τ) be a second countable^{30} locally compact Hausdorff space. Then the map μ → I_μ taking M⁺ to positive linear functionals on C_c(X,R) is bijective. Moreover every measure μ ∈ M⁺ has the following properties:
(1) For all ε > 0 and B ∈ B_X, there exist F ⊂ B ⊂ U such that U is open and F is closed and μ(U \ F) < ε. If μ(B) < ∞, F may be taken to be a compact subset of X.
(2) For all B ∈ B_X there exist A ∈ F_σ and C ∈ τ_δ (τ_δ is more conventionally written as G_δ) such that A ⊂ B ⊂ C and μ(C \ A) = 0.
(3) For all B ∈ B_X,
(13.18)  μ(B) = inf{μ(U) : B ⊂ U and U is open}
(13.19)  = sup{μ(K) : K ⊂ B and K is compact}.
(4) For all open subsets U ⊂ X,
(13.20)  μ(U) = sup{∫_X f dμ : f ≺ U} = sup{I_μ(f) : f ≺ U}.
(5) For all compact subsets K ⊂ X,
(13.21)  μ(K) = inf{I_μ(f) : 1_K ≤ f ≺ X}.
(6) If ‖I_μ‖ denotes the dual norm on C_c(X,R)*, then ‖I_μ‖ = μ(X). In particular I_μ is bounded iff μ(X) < ∞.
(7) C_c(X,R) is dense in L^p(μ;R) for all 1 ≤ p < ∞.
^{30} The second countability is assumed here in order to avoid certain technical issues. Recall from Lemma 10.17 that under these assumptions, σ(S) = B_X. Also recall from Urysohn's metrization theorem that X is metrizable. We will later remove the second countability assumption.
Proof. First notice that I_μ is a positive linear functional on S := C_c(X,R) for all μ ∈ M⁺ and S is a lattice such that 1 ∧ f ∈ S for all f ∈ S. Example 13.16 shows that any positive linear functional, I, on S := C_c(X,R) is a Daniell integral on S. By Lemma 10.10, there exist compact sets K_n ⊂ X such that K_n ↑ X. By Urysohn's lemma, there exist φ_n ≺ X such that φ_n = 1 on K_n. Since φ_n ∈ S⁺ and Σ_{n=1}^{∞} φ_n > 0 on X it follows from Remark 13.23 that there exists χ ∈ S↑ such that χ > 0 on X and I(χ) < ∞. So the hypotheses of the Daniell-Stone Theorem 13.22 hold and hence there exists a unique measure μ on σ(S) = B_X (Lemma 10.17) such that I = I_μ. Hence the map μ → I_μ taking M⁺ to positive linear functionals on C_c(X,R) is bijective. We will now prove the remaining seven assertions of the theorem.
(1) Suppose ε > 0 and B ∈ B_X satisfies μ(B) < ∞. Then 1_B ∈ L¹(μ) so there exist functions f_n ∈ C_c(X,R) such that f_n ↑ f, 1_B ≤ f, and
(13.22)  ∫_X f dμ = I(f) < μ(B) + ε.
Let α ∈ (0,1) and U_α := {f > α} = ∪_{n=1}^{∞} {f_n > α} ∈ τ. Since 1_B ≤ f, B ⊂ {f ≥ 1} ⊂ U_α and by Chebyshev's inequality, μ(U_α) ≤ α^{−1} ∫_X f dμ = α^{−1} I(f). Combining this estimate with Eq. (13.22) shows μ(U_α \ B) = μ(U_α) − μ(B) < ε for α sufficiently close to 1.
For general B ∈ B_X, by what we have just proved, there exist open sets U_n ⊂ X such that B ∩ K_n ⊂ U_n and μ(U_n \ (B ∩ K_n)) < ε2^{−n} for all n. Let U = ∪_{n=1}^{∞} U_n, then B ⊂ U and
μ(U \ B) = μ(∪_{n=1}^{∞}(U_n \ B)) ≤ Σ_{n=1}^{∞} μ(U_n \ B) ≤ Σ_{n=1}^{∞} μ(U_n \ (B ∩ K_n)) ≤ Σ_{n=1}^{∞} ε2^{−n} = ε.
Applying this result to B^c shows there exists a closed set F ⊂ X such that B^c ⊂ F^c and
μ(B \ F) = μ(F^c \ B^c) < ε.
So we have produced F ⊂ B ⊂ U such that μ(U \ F) = μ(U \ B) + μ(B \ F) < 2ε.
If μ(B) < ∞, using B \ (K_n ∩ F) ↓ B \ F as n → ∞, we may choose n sufficiently large so that μ(B \ (K_n ∩ F)) < ε. Hence we may replace F by the compact set F ∩ K_n if necessary.
(2) Choose F_n ⊂ B ⊂ U_n such that F_n is closed, U_n is open and μ(U_n \ F_n) < 1/n. Let A := ∪_n F_n ∈ F_σ and C := ∩_n U_n ∈ τ_δ. Then A ⊂ B ⊂ C and
μ(C \ A) ≤ μ(U_n \ F_n) < 1/n → 0 as n → ∞.
(3) From Item 1, one easily concludes that
μ(B) = inf{μ(U) : B ⊂ U ⊂_o X}
for all B ∈ B_X and
μ(B) = sup{μ(K) : K ⊂⊂ B}
for all B ∈ B_X with μ(B) < ∞. So now suppose B ∈ B_X and μ(B) = ∞. Using the notation at the end of the proof of Item 1., we have μ(F) = ∞ and μ(F ∩ K_n) ↑ ∞ as n → ∞. This shows sup{μ(K) : K ⊂⊂ B} = ∞ = μ(B) as desired.
(4) For U ⊂_o X, let
ν(U) := sup{I_μ(f) : f ≺ U}.
It is evident that ν(U) ≤ μ(U) because f ≺ U implies f ≤ 1_U. Let K be a compact subset of U. By Urysohn's Lemma 10.15, there exists f ≺ U such that f = 1 on K. Therefore,
(13.23)  μ(K) ≤ ∫_X f dμ ≤ ν(U)
and we have
(13.24)  μ(K) ≤ ν(U) ≤ μ(U) for all U ⊂_o X and K ⊂⊂ U.
By Item 3.,
μ(U) = sup{μ(K) : K ⊂⊂ U} ≤ ν(U) ≤ μ(U)
which shows that ν(U) = μ(U), i.e. Eq. (13.20) holds.
(5) Now suppose K is a compact subset of X. From Eq. (13.23),
μ(K) ≤ inf{I_μ(f) : 1_K ≤ f ≺ X} ≤ μ(U)
for any open subset U such that K ⊂ U. Consequently by Eq. (13.18),
μ(K) ≤ inf{I_μ(f) : 1_K ≤ f ≺ X} ≤ inf{μ(U) : K ⊂ U ⊂_o X} = μ(K)
which proves Eq. (13.21).
(6) For f ∈ C_c(X,R),
(13.25)  |I_μ(f)| ≤ ∫_X |f| dμ ≤ ‖f‖_u μ(supp(f)) ≤ ‖f‖_u μ(X)
which shows ‖I_μ‖ ≤ μ(X). Let K ⊂⊂ X and f ≺ X such that f = 1 on K. By Eq. (13.23),
μ(K) ≤ ∫_X f dμ = I_μ(f) ≤ ‖I_μ‖ ‖f‖_u = ‖I_μ‖
and therefore,
μ(X) = sup{μ(K) : K ⊂⊂ X} ≤ ‖I_μ‖.
(7) This has already been proved by two methods in Proposition 11.6 but we will give yet another proof here. When p = 1 and g ∈ L¹(μ;R), there exists, by Eq. (13.13), h ∈ S↑ = C_c(X,R)↑ such that g ≤ h and ‖h − g‖₁ = ∫_X (h − g)dμ < ε. Let {h_n}_{n=1}^{∞} ⊂ S = C_c(X,R) be chosen so that h_n ↑ h as n → ∞. Then by the dominated convergence theorem (notice that |h_n| ≤ |h_1| + |h|), ‖h_n − g‖₁ → ‖h − g‖₁ < ε as n → ∞. Therefore for n large we have h_n ∈ C_c(X,R) with ‖h_n − g‖₁ < ε. Since ε > 0 is arbitrary this shows that C_c(X,R) is dense in L¹(μ).
Now suppose p > 1, g ∈ L^p(μ;R) and {K_n}_{n=1}^{∞} are as above. By the dominated convergence theorem, 1_{K_n}(g ∧ n) ∨ (−n) → g in L^p(μ) as n → ∞, so it suffices to consider g ∈ L^p(μ;R) with supp(g) ⊂ K_n and |g| ≤ n for some large n ∈ N. By Hölder's inequality, such a g is in L¹(μ). So if ε > 0, by the p = 1 case, there exists h ∈ S such that ‖h − g‖₁ < ε. By replacing h by (h ∧ n) ∨ (−n) ∈ S, we may assume h is bounded by n in which case
‖h − g‖_p^p = ∫_X |h − g|^p dμ = ∫_X |h − g|^{p−1}|h − g| dμ ≤ (2n)^{p−1} ∫_X |h − g| dμ < (2n)^{p−1} ε.
Since ε > 0 was arbitrary, this shows S is dense in L^p(μ;R).
Remark 13.31. We may give a direct proof of the fact that μ → I_μ is injective. Indeed, suppose μ, ν ∈ M⁺ satisfy I_μ(f) = I_ν(f) for all f ∈ C_c(X,R). By Proposition 11.6, if A ∈ B_X is a set such that μ(A) + ν(A) < ∞, there exist f_n ∈ C_c(X,R) such that f_n → 1_A in L¹(μ + ν). Since f_n → 1_A in L¹(μ) and L¹(ν),
μ(A) = lim_{n→∞} I_μ(f_n) = lim_{n→∞} I_ν(f_n) = ν(A).
For general A ∈ B_X, choose compact subsets K_n ⊂ X such that K_n ↑ X. Then
μ(A) = lim_{n→∞} μ(A ∩ K_n) = lim_{n→∞} ν(A ∩ K_n) = ν(A)
showing μ = ν. Therefore the map μ → I_μ is injective.
Theorem 13.32 (Lusin's Theorem). Suppose (X,τ) is a locally compact and second countable Hausdorff space, B_X is the Borel σ-algebra on X, and μ is a measure on (X,B_X) which is finite on compact sets of X. Also let ε > 0 be given. If f : X → C is a measurable function such that μ(f ≠ 0) < ∞, there exists a compact set K ⊂ {f ≠ 0} such that f|_K is continuous and μ({f ≠ 0} \ K) < ε. Moreover there exists φ ∈ C_c(X) such that μ(f ≠ φ) < ε and if f is bounded the function φ may be chosen so that ‖φ‖_u ≤ ‖f‖_u := sup_{x∈X} |f(x)|.
Proof. Suppose first that f is bounded, in which case
∫_X |f| dμ ≤ ‖f‖_∞ μ(f ≠ 0) < ∞.
By Proposition 11.6 or Item 7. of Theorem 13.30, there exist f_n ∈ C_c(X) such that f_n → f in L¹(μ) as n → ∞. By passing to a subsequence if necessary, we may assume ‖f − f_n‖₁ < εn^{−1}2^{−n} for all n and thus μ(|f − f_n| > n^{−1}) < ε2^{−n} for all n. Let E := ∪_{n=1}^{∞} {|f − f_n| > n^{−1}}, so that μ(E) < ε. On E^c, |f − f_n| ≤ 1/n, i.e. f_n → f uniformly on E^c and hence f|_{E^c} is continuous.
Let A := {f ≠ 0} \ E. By Theorem 13.30 (or see Exercises 8.4 and 8.5) there exist a compact set K and an open set V such that K ⊂ A ⊂ V and μ(V \ K) < ε. Notice that
μ({f ≠ 0} \ K) ≤ μ(A \ K) + μ(E) < 2ε.
By the Tietze extension Theorem 10.16, there exists F ∈ C(X) such that f = F|_K. By Urysohn's Lemma 10.15 there exists ψ ≺ V such that ψ = 1 on K. So letting φ = ψF ∈ C_c(X), we have φ = f on K, ‖φ‖_u ≤ ‖f‖_u and since {φ ≠ f} ⊂ E ∪ (V \ K), μ(φ ≠ f) < 3ε. This proves the assertions in the theorem when f is bounded.
Suppose that f : X → C is (possibly) unbounded. By Lemmas 10.17 and 10.10, there exist compact sets {K_N}_{N=1}^{∞} of X such that K_N ↑ X. Hence B_N := K_N ∩ {0 < |f| ≤ N} ↑ {f ≠ 0} as N → ∞. Therefore if ε > 0 is given there exists an N such that μ({f ≠ 0} \ B_N) < ε. We now apply what we have just proved to 1_{B_N} f to find a compact set K ⊂ {1_{B_N} f ≠ 0}, an open set V ⊂ X and φ ∈ C_c(V) ⊂ C_c(X) such that μ(V \ K) < ε, μ({1_{B_N} f ≠ 0} \ K) < ε and φ = f on K. The proof is now complete since
{φ ≠ f} ⊂ ({f ≠ 0} \ B_N) ∪ ({1_{B_N} f ≠ 0} \ K) ∪ (V \ K)
so that μ(φ ≠ f) < 3ε.
To illustrate Theorem 13.32, suppose that X = (0,1), μ = m is Lebesgue measure and f = 1_{(0,1)∩Q}. Then Lusin's theorem asserts for any ε > 0 there exists a compact set K ⊂ (0,1) such that m((0,1) \ K) < ε and f|_K is continuous. To see this directly, let {r_n}_{n=1}^{∞} be an enumeration of the rationals in (0,1),
J_n = (r_n − ε2^{−n}, r_n + ε2^{−n}) ∩ (0,1) and W = ∪_{n=1}^{∞} J_n.
Then W is an open subset of X and μ(W) < ε. Therefore K_n := [1/n, 1 − 1/n] \ W is a compact subset of X and m(X \ K_n) ≤ 2/n + μ(W). Taking n sufficiently large we have m(X \ K_n) < ε and f|_{K_n} ≡ 0 is continuous.
13.4.1. The Riemann-Stieltjes-Lebesgue Integral.
Notation 13.33. Given an increasing function F : R → R, let F(x−) = lim_{y↑x} F(y), F(x+) = lim_{y↓x} F(y) and F(±∞) = lim_{x→±∞} F(x) ∈ R̄. Since F is increasing all of these limits exist.
Theorem 13.34. Let F : R → R be increasing and define G(x) = F(x+). Then
(1) The function G is increasing and right continuous.
(2) For x ∈ R, G(x) = lim_{y↓x} F(y−).
(3) The set {x ∈ R : F(x+) > F(x−)} is countable and, moreover, for each N > 0,
(13.26)  Σ_{x∈(−N,N]} [F(x+) − F(x−)] ≤ F(N) − F(−N) < ∞.
Proof.
(1) The following observation shows G is increasing: if x < y then
(13.27)  F(x−) ≤ F(x) ≤ F(x+) = G(x) ≤ F(y−) ≤ F(y) ≤ F(y+) = G(y).
Since G is increasing, G(x) ≤ G(x+). If y > x then G(x+) ≤ F(y) and hence G(x+) ≤ F(x+) = G(x), i.e. G(x+) = G(x).
(2) Since G(x) ≤ F(y−) ≤ F(y) for all y > x, it follows that
G(x) ≤ lim_{y↓x} F(y−) ≤ lim_{y↓x} F(y) = G(x)
showing G(x) = lim_{y↓x} F(y−).
(3) By Eq. (13.27), if x ≠ y then
(F(x−), F(x+)] ∩ (F(y−), F(y+)] = ∅.
Therefore, {(F(x−), F(x+)]}_{x∈R} are disjoint, possibly empty, intervals in R. Let N ∈ N and α ⊂ (−N,N) be a finite set, then
∐_{x∈α} (F(x−), F(x+)] ⊂ (F(−N), F(N)]
and therefore,
Σ_{x∈α} [F(x+) − F(x−)] ≤ F(N) − F(−N) < ∞.
Since this is true for all finite α ⊂ (−N,N], Eq. (13.26) holds. Eq. (13.26) shows
Γ_N := {x ∈ (−N,N) : F(x+) − F(x−) > 0}
is countable and hence so is
Γ := {x ∈ R : F(x+) − F(x−) > 0} = ∪_{N=1}^{∞} Γ_N.
Theorem 13.35. If F : R → R is an increasing function, there exists a unique measure μ = μ_F on B_R such that
(13.28)  ∫_{−∞}^{∞} f dF = ∫_R f dμ for all f ∈ C_c(R,R),
where ∫_{−∞}^{∞} f dF is as in Notation 13.6 above. This measure may also be characterized as the unique measure on B_R such that
(13.29)  μ((a,b]) = F(b+) − F(a+) for all −∞ < a < b < ∞.
Moreover, if A ∈ B_R then
μ_F(A) = inf{Σ_{i=1}^{∞} (F(b_i+) − F(a_i+)) : A ⊂ ∪_{i=1}^{∞} (a_i,b_i]}
= inf{Σ_{i=1}^{∞} (F(b_i+) − F(a_i+)) : A ⊂ ∐_{i=1}^{∞} (a_i,b_i]}.
Proof. An application of Theorem 13.30 implies there exists a unique measure μ on B_R such that Eq. (13.28) is valid. Let −∞ < a < b < ∞, ε > 0 be small and φ_ε(x) be the function defined in Figure 30, i.e. φ_ε is one on [a + 2ε, b + ε], linearly interpolates to zero on [b + ε, b + 2ε] and on [a + ε, a + 2ε] and is zero on (a + ε, b + 2ε)^c.
[Figure 30. The function φ_ε used to compute μ((a,b]).]
Since φ_ε → 1_{(a,b]} it follows by the dominated convergence theorem that
(13.30)  μ((a,b]) = lim_{ε↓0} ∫_R φ_ε dμ = lim_{ε↓0} ∫_R φ_ε dF.
On the other hand we have 1_{(a+2ε, b+ε]} ≤ φ_ε ≤ 1_{(a+ε, b+2ε]} and therefore,
F(b + ε) − F(a + 2ε) = ∫_R 1_{(a+2ε, b+ε]} dF ≤ ∫_R φ_ε dF ≤ ∫_R 1_{(a+ε, b+2ε)} dF = F(b + 2ε) − F(a + ε).
Letting ε ↓ 0 in this equation and using Eq. (13.30) shows
F(b+) − F(a+) ≤ μ((a,b]) ≤ F(b+) − F(a+).
The last assertion in the theorem is now a consequence of Corollary 13.27.
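A concrete instance of Eq. (13.29) (my own example, not the text's): for the right continuous increasing function F(x) = x + 1_{[0,∞)}(x), the measure μ_F is Lebesgue measure plus a unit point mass at 0, and a half-open interval picks up the atom exactly when 0 ∈ (a,b].

# Hedged numerical illustration of Eq. (13.29) for F(x) = x + 1_{[0, infinity)}(x):
# mu_F = Lebesgue measure + unit atom at 0, and mu_F((a, b]) = F(b) - F(a).
import numpy as np

F = lambda x: x + (x >= 0)          # right continuous, increasing, jump of size 1 at 0

def mu_F(a, b):                     # mass of the half-open interval (a, b]
    return F(b) - F(a)

print(mu_F(-1.0, 1.0))              # 3.0 = length 2 plus the atom at 0
print(mu_F(0.0, 1.0))               # 1.0: the atom at 0 is NOT in (0, 1]
print(mu_F(-1.0, 0.0))              # 2.0: the atom at 0 IS in (-1, 0]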
Corollary 13.36. The positive linear functionals on C_c(R,R) are in one to one correspondence with right continuous non-decreasing functions F such that F(0) = 0.
13.5. Metric space regularity results revisited.
Proposition 13.37. Let (X,d) be a metric space and μ be a measure on M = B_X which is σ-finite on τ := τ_d.
(1) For all ε > 0 and B ∈ M there exist an open set V ∈ τ and a closed set F such that F ⊂ B ⊂ V and μ(V \ F) ≤ ε.
(2) For all B ∈ M, there exist A ∈ F_σ and C ∈ G_δ such that A ⊂ B ⊂ C and μ(C \ A) = 0. Here F_σ denotes the collection of subsets of X which may be written as a countable union of closed sets and G_δ is the collection of subsets of X which may be written as a countable intersection of open sets.
(3) The space BC_f(X) of bounded continuous functions f on X such that μ(f ≠ 0) < ∞ is dense in L^p(μ).
Proof. Let S := BC_f(X), I(f) := ∫_X f dμ for f ∈ S, and let X_n ∈ τ be chosen so that μ(X_n) < ∞ and X_n ↑ X as n → ∞. Then 1 ∧ f ∈ S for all f ∈ S and if φ_n := 1 ∧ (n d_{X_n^c}) ∈ S⁺, then φ_n ↑ 1 as n → ∞ and so by Remark 13.23 there exists χ ∈ S↑ such that χ > 0 on X and I(χ) < ∞. Similarly if V ∈ τ, the function g_n := 1 ∧ (n d_{(X_n∩V)^c}) ∈ S and g_n ↑ 1_V as n → ∞, showing σ(S) = B_X. If f_n ∈ S⁺ and f_n ↓ 0 as n → ∞, it follows by the dominated convergence theorem that I(f_n) ↓ 0 as n → ∞. So the hypotheses of the Daniell-Stone Theorem 13.22 hold and hence μ is the unique measure on B_X such that I = I_μ and for B ∈ B_X,
μ(B) = Ī(1_B) = inf{I(f) : f ∈ S↑ with 1_B ≤ f} = inf{∫_X f dμ : f ∈ S↑ with 1_B ≤ f}.
Suppose ε > 0 and B ∈ B_X are given. There exist f_n ∈ BC_f(X) such that f_n ↑ f, 1_B ≤ f, and μ(f) < μ(B) + ε. The condition 1_B ≤ f implies 1_B ≤ 1_{f≥1} ≤ f and hence that
(13.31)  μ(B) ≤ μ(f ≥ 1) ≤ μ(f) < μ(B) + ε.
Moreover, letting V_m := ∪_{n=1}^{∞} {f_n > 1 − 1/m} ∈ τ_d, we have V_m ⊃ {f ≥ 1} ⊃ B and μ(V_m) ↓ μ(f ≥ 1) ≥ μ(B) as m → ∞. Combining this observation with Eq. (13.31), we may choose m sufficiently large so that B ⊂ V_m and
μ(V_m \ B) = μ(V_m) − μ(B) < ε.
Hence there exists V ∈ τ such that B ⊂ V and μ(V \ B) < ε. Applying this result to B^c shows there exists a closed set F ⊂ X such that B^c ⊂ F^c and
μ(B \ F) = μ(F^c \ B^c) < ε.
So we have produced F ⊂ B ⊂ V such that μ(V \ F) = μ(V \ B) + μ(B \ F) < 2ε.
The second assertion is an easy consequence of the first and the third follows in similar manner to any of the proofs of Item 7. in Theorem 13.30.
13.6. Measures on Products of Metric spaces. Let {(X_n,d_n)}_{n∈N} be a sequence of compact metric spaces, for N ∈ N let X^N := Π_{n=1}^{N} X_n and π_N : X → X^N be the projection map π_N(x) = x|_{{1,2,...,N}}. Recall from Exercise 3.27 and Exercise 6.15 that there is a metric d on X := Π_{n∈N} X_n such that τ_d = ⊗_{n=1}^{∞} τ_{d_n} (= the product topology on X, generated by {π_n : n ∈ N}) and X is compact in this topology. Also recall that compact metric spaces are second countable, Exercise 10.5.
Proposition 13.38. Continuing the notation above, suppose that {μ_N}_{N∈N} are given probability measures^{31} on B_N := B_{X^N} satisfying the compatibility conditions, (π_N)_* μ_M = μ_N for all N ≤ M. Then there exists a unique measure μ on B_X = σ(τ_d) = σ(π_n : n ∈ N) such that (π_N)_* μ = μ_N for all N ∈ N, i.e.
(13.32)  ∫_X f(π_N(x)) dμ(x) = ∫_{X^N} f(y) dμ_N(y)
for all N ∈ N and f : X^N → R bounded and measurable.
^{31} A typical example of such measures, μ_N, is to set μ_N := ν_1 ⊗ ··· ⊗ ν_N where ν_n is a probability measure on B_{X_n} for each n ∈ N.
Proof. An application of the Stone-Weierstrass Theorem 11.44 shows that
D = {f ∈ C(X) : f = F∘π_N with F ∈ C(X^N) and N ∈ N}
is dense in C(X). For f = F∘π_N ∈ D let
I(f) = ∫_{X^N} F(x) dμ_N(x).
Let us verify that I is well defined. Suppose that f may also be expressed as f = G∘π_M with M ∈ N and G ∈ C(X^M). By interchanging M and N if necessary we may assume M ≥ N. By the compatibility assumption,
∫_{X^M} G(z) dμ_M(z) = ∫_{X^M} F∘π_N(x) dμ_M(x) = ∫_{X^N} F d[(π_N)_* μ_M] = ∫_{X^N} F dμ_N.
Since |I(f)| ≤ ‖f‖_∞, the B.L.T. Theorem 4.1 allows us to extend I uniquely to a continuous linear functional on C(X) which we still denote by I. Because I was positive on D, it is easy to check that I is positive on C(X) as well. So by the Riesz Theorem 13.30, there exists a probability measure μ on B_X such that I(f) = ∫_X f dμ for all f ∈ C(X). By the definition of I it now follows that
∫_{X^N} F d[(π_N)_* μ] = ∫_X F∘π_N dμ = I(F∘π_N) = ∫_{X^N} F dμ_N
for all F ∈ C(X^N) and N ∈ N. It now follows from Theorem 11.44 and the uniqueness assertion in the Riesz theorem 13.30 (applied with X replaced by X^N) that (π_N)_* μ = μ_N.
Corollary 13.39. Keeping the same assumptions as in Proposition 13.38, further assume, for each n ∈ N, there exists a measurable set Y_n ⊂ X_n such that μ_N(Y^N) = 1 with Y^N := Y_1 × ··· × Y_N. Then μ(Y) = 1 where Y := Π_{i=1}^{∞} Y_i ⊂ X.
Proof. Since Y = ∩_{N=1}^{∞} π_N^{−1}(Y^N), we have X \ Y = ∪_{N=1}^{∞} π_N^{−1}(X^N \ Y^N) and therefore,
μ(X \ Y) ≤ Σ_{N=1}^{∞} μ(π_N^{−1}(X^N \ Y^N)) = Σ_{N=1}^{∞} μ_N(X^N \ Y^N) = 0.
Corollary 13.40. Suppose that {μ_n}_{n∈N} are probability measures on B_{R^d} for all n ∈ N, X := (R^d)^N and B := ⊗_{n=1}^{∞} B_{R^d}. Then there exists a unique measure μ on (X,B) such that
(13.33)  ∫_X f(x_1,x_2,...,x_N) dμ(x) = ∫_{(R^d)^N} f(x_1,x_2,...,x_N) dμ_1(x_1)···dμ_N(x_N)
for all N ∈ N and bounded measurable functions f : (R^d)^N → R.
Proof. Let (R^d)* denote the Alexandrov compactification of R^d. Recall from Exercise 10.12 that (R^d)* is homeomorphic to S^d and hence (R^d)* is a compact metric space. (Alternatively see Exercise 10.15.) Let ν_n := i_* μ_n = μ_n ∘ i^{−1} where i : R^d → (R^d)* is the inclusion map. Then ν_n is a probability measure on B_{(R^d)*} such that ν_n({∞}) = 0. An application of Proposition 13.38 and Corollary 13.39 completes the proof.
Exercise 13.3. Extend Corollary 13.40 to construct arbitrary (not necessarily countable) products of R^d.
13.7. Measures on general infinite product spaces. In this section we drop the topological assumptions used in the last section.
Proposition 13.41. Let {(X_α, M_α, μ_α)}_{α∈A} be a collection of probability spaces, that is μ_α(X_α) = 1 for all α ∈ A. Let X := Π_{α∈A} X_α, M := σ(π_α : α ∈ A), and for Λ ⊂⊂ A (a finite subset) let X_Λ := Π_{α∈Λ} X_α and π_Λ : X → X_Λ be the projection map π_Λ(x) = x|_Λ, and let μ_Λ := ⊗_{α∈Λ} μ_α be the product measure on M_Λ := ⊗_{α∈Λ} M_α. Then there exists a unique measure μ on M such that (π_Λ)_* μ = μ_Λ for all Λ ⊂⊂ A, i.e. if f : X_Λ → R is a bounded measurable function then
(13.34)  ∫_X f(π_Λ(x)) dμ(x) = ∫_{X_Λ} f(y) dμ_Λ(y).
Proof. Let S denote the collection of functions f : X → R such that there exist Λ ⊂⊂ A and a bounded measurable function F : X_Λ → R such that f = F∘π_Λ. For f = F∘π_Λ ∈ S, let I(f) = ∫_{X_Λ} F dμ_Λ.
Let us verify that I is well defined. Suppose that f may also be expressed as f = G∘π_Γ with Γ ⊂⊂ A and G : X_Γ → R bounded and measurable. By replacing Γ by Γ ∪ Λ if necessary, we may assume that Λ ⊂ Γ. Making use of Fubini's theorem we learn
∫_{X_Γ} G(z) dμ_Γ(z) = ∫_{X_Λ × X_{Γ\Λ}} F(x) dμ_Λ(x) dμ_{Γ\Λ}(y)
= ∫_{X_Λ} F(x) dμ_Λ(x) · ∫_{X_{Γ\Λ}} dμ_{Γ\Λ}(y) = μ_{Γ\Λ}(X_{Γ\Λ}) ∫_{X_Λ} F(x) dμ_Λ(x) = ∫_{X_Λ} F(x) dμ_Λ(x),
wherein we have used the fact that μ_{Γ\Λ}(X_{Γ\Λ}) = 1 since μ_α(X_α) = 1 for all α ∈ A. It is now easy to check that I is a positive linear functional on the lattice S. We will now show that I is a Daniell integral.
Suppose that f_n ∈ S⁺ is a decreasing sequence such that inf_n I(f_n) = ε > 0. We need to show f := lim_{n→∞} f_n is not identically zero. As in the proof that I is well defined, there exist Λ_n ⊂⊂ A and bounded measurable functions F_n : X_{Λ_n} → [0,∞) such that Λ_n is increasing in n and f_n = F_n∘π_{Λ_n} for each n. For k ≤ n, let F^k_n : X_{Λ_k} → [0,∞) be the bounded measurable function
F^k_n(x) = ∫_{X_{Λ_n\Λ_k}} F_n(x × y) dμ_{Λ_n\Λ_k}(y)
where x × y ∈ X_{Λ_n} is defined by (x × y)(α) = x(α) if α ∈ Λ_k and (x × y)(α) = y(α) for α ∈ Λ_n \ Λ_k. By convention we set F^n_n = F_n. Since f_n is decreasing it follows that F^k_{n+1} ≤ F^k_n for all k and n ≥ k and therefore F^k := lim_{n→∞} F^k_n exists. By Fubini's theorem,
F^k_n(x) = ∫_{X_{Λ_{k+1}\Λ_k}} F^{k+1}_n(x × y) dμ_{Λ_{k+1}\Λ_k}(y) when k + 1 ≤ n
and hence letting n → ∞ in this equation shows
(13.35)  F^k(x) = ∫_{X_{Λ_{k+1}\Λ_k}} F^{k+1}(x × y) dμ_{Λ_{k+1}\Λ_k}(y)
for all k. Now
∫_{X_{Λ_1}} F^1(x) dμ_{Λ_1}(x) = lim_{n→∞} ∫_{X_{Λ_1}} F^1_n(x) dμ_{Λ_1}(x) = lim_{n→∞} I(f_n) = ε > 0
so there exists
x_1 ∈ X_{Λ_1} such that F^1(x_1) ≥ ε.
From Eq. (13.35) with k = 1 and x = x_1 it follows that
ε ≤ ∫_{X_{Λ_2\Λ_1}} F^2(x_1 × y) dμ_{Λ_2\Λ_1}(y)
and hence there exists
x_2 ∈ X_{Λ_2\Λ_1} such that F^2(x_1 × x_2) ≥ ε.
Working this way inductively using Eq. (13.35) implies there exist
x_i ∈ X_{Λ_i\Λ_{i−1}} such that F^n(x_1 × x_2 × ··· × x_n) ≥ ε
for all n. Now F^k_n ≥ F^k for all n ≥ k and in particular for k = n, thus
(13.36)  F_n(x_1 × x_2 × ··· × x_n) = F^n_n(x_1 × x_2 × ··· × x_n) ≥ F^n(x_1 × x_2 × ··· × x_n) ≥ ε
for all n. Let x ∈ X be any point such that
π_{Λ_n}(x) = x_1 × x_2 × ··· × x_n
for all n. From Eq. (13.36) it follows that
f_n(x) = F_n∘π_{Λ_n}(x) = F_n(x_1 × x_2 × ··· × x_n) ≥ ε
for all n and therefore f(x) := lim_{n→∞} f_n(x) ≥ ε, showing f is not zero.
Therefore, I is a Daniell integral and there exists, by the Daniell-Stone Theorem 13.22, a unique measure μ on (X, σ(S) = M) such that
I(f) = ∫_X f dμ for all f ∈ S.
Taking f = 1_A∘π_Λ with A ∈ M_Λ in this equation implies
μ_Λ(A) = I(f) = μ(π_Λ^{−1}(A))
and the result is proved.
Remark 13.42. (The notion of a kernel needs more explanation here.) The above theorem may be jazzed up as follows. Let {(X_α, M_α)}_{α∈A} be a collection of measurable spaces. Suppose for each pair Λ ⊂ Γ ⊂⊂ A there is a kernel μ_{Λ,Γ}(x,dy) for x ∈ X_Λ and y ∈ X_{Γ\Λ} such that if Λ ⊂ Γ ⊂ K ⊂⊂ A then
μ_{Λ,K}(x, dy dz) = μ_{Λ,Γ}(x,dy) μ_{Γ,K}(x × y, dz).
Then there exists a unique measure μ on M such that
∫_X f(π_Λ(x)) dμ(x) = ∫_{X_Λ} f(y) dμ_{∅,Λ}(y)
for all Λ ⊂⊂ A and f : X_Λ → R bounded and measurable. To prove this assertion, just use the proof of Proposition 13.41 replacing μ_{Γ\Λ}(dy) by μ_{Λ,Γ}(x,dy) everywhere in the proof.
13.8. Extensions of premeasures to measures II.

Proposition 13.43. Suppose that A ⊂ P(X) is an algebra of sets and μ : A → [0, ∞] is a finitely additive measure on A. Then if A, A_i ∈ A and A = ∐_{i=1}^∞ A_i we have

(13.37)   ∑_{i=1}^∞ μ(A_i) ≤ μ(A).

Proof. Since

A = (∐_{i=1}^N A_i) ∪ (A \ ∪_{i=1}^N A_i)

we find using the finite additivity of μ that

μ(A) = ∑_{i=1}^N μ(A_i) + μ(A \ ∪_{i=1}^N A_i) ≥ ∑_{i=1}^N μ(A_i).

Letting N → ∞ in this last expression shows that ∑_{i=1}^∞ μ(A_i) ≤ μ(A).
Because of Proposition 13.43, in order to prove that μ is a premeasure on A, it suffices to show μ is subadditive on A, namely

(13.38)   μ(A) ≤ ∑_{i=1}^∞ μ(A_i)

whenever A = ∐_{i=1}^∞ A_i with A ∈ A and each {A_i}_{i=1}^∞ ⊂ A.
Proposition 13.44. Suppose that E ⊂ P(X) is an elementary family (see Definition 6.11), A = A(E) and μ : A → [0, ∞] is an additive measure. Then the following are equivalent:

(1) μ is a premeasure on A.
(2) μ is subadditive on E, i.e. whenever E ∈ E is of the form E = ∐_{i=1}^∞ E_i with E_i ∈ E then

(13.39)   μ(E) ≤ ∑_{i=1}^∞ μ(E_i).
Proof. Item 1. trivially implies item 2. For the converse, it suffices to show, by Proposition 13.43, that if A = ∐_{n=1}^∞ A_n with A ∈ A and each A_n ∈ A then Eq. (13.38) holds. To prove this, write A = ∐_{j=1}^N E_j with E_j ∈ E and A_n = ∐_{i=1}^{N_n} E_{n,i} with E_{n,i} ∈ E. Then

E_j = A ∩ E_j = ∐_{n=1}^∞ A_n ∩ E_j = ∐_{n=1}^∞ ∐_{i=1}^{N_n} E_{n,i} ∩ E_j

which is a countable union and hence by assumption,

μ(E_j) ≤ ∑_{n=1}^∞ ∑_{i=1}^{N_n} μ(E_{n,i} ∩ E_j).

Summing this equation on j and using the additivity of μ shows that

μ(A) = ∑_{j=1}^N μ(E_j) ≤ ∑_{j=1}^N ∑_{n=1}^∞ ∑_{i=1}^{N_n} μ(E_{n,i} ∩ E_j) = ∑_{n=1}^∞ ∑_{i=1}^{N_n} ∑_{j=1}^N μ(E_{n,i} ∩ E_j)
   = ∑_{n=1}^∞ ∑_{i=1}^{N_n} μ(E_{n,i}) = ∑_{n=1}^∞ μ(A_n)

as desired.
The following theorem summarizes the results of Proposition 13.3, Proposition 13.44 and Theorem 13.26 above.

Theorem 13.45. Suppose that E ⊂ P(X) is an elementary family and μ_0 : E → [0, ∞] is a function.

(1) If μ_0 is additive on E, then μ_0 has a unique extension to a finitely additive measure μ_0 on A = A(E).
(2) If we further assume that μ_0 is countably subadditive on E, then μ_0 is a premeasure on A.
(3) If we further assume that μ_0 is σ-finite on E, then there exists a unique measure μ on σ(E) such that μ|_E = μ_0. Moreover, for A ∈ σ(E),

μ(A) = inf{μ_0(B) : A ⊂ B ∈ A_σ}
     = inf{∑_{n=1}^∞ μ_0(E_n) : A ⊂ ∐_{n=1}^∞ E_n with E_n ∈ E}.
13.8.1. Radon measures on (ℝ, B_ℝ) Revisited. Here we will use Theorem 13.45 to give another proof of Theorem 7.8. The main point is to show that to each right continuous function F : ℝ → ℝ there exists a unique measure μ_F such that μ_F((a, b]) = F(b) − F(a) for all −∞ < a ≤ b < ∞. We begin by extending F to a function from ℝ̄ → ℝ̄ by defining F(±∞) := lim_{x→±∞} F(x). As above let E = {(a, b] ∩ ℝ : −∞ ≤ a ≤ b ≤ ∞} and set μ_0((a, b]) = F(b) − F(a) for all a, b ∈ ℝ̄ with a ≤ b. The proof will be finished by Theorem 13.45 if we can show that μ_0 is sub-additive on E.
First suppose that −∞ < a < b < ∞, J = (a, b], J_n = (a_n, b_n] such that J = ∐_{n=1}^∞ J_n. We wish to show

(13.40)   μ_0(J) ≤ ∑_{i=1}^∞ μ_0(J_i).

To do this choose numbers ã > a, b̃_n > b_n and set Ĩ = (ã, b] ⊂ J, J̃_n = (a_n, b̃_n] ⊃ J_n and J̃_n^o = (a_n, b̃_n). Since Ĩ⁻ (the closure of Ĩ) is compact and Ĩ⁻ ⊂ J ⊂ ∪_{n=1}^∞ J̃_n^o there exists N < ∞ such that

Ĩ ⊂ Ĩ⁻ ⊂ ∪_{n=1}^N J̃_n^o ⊂ ∪_{n=1}^N J̃_n.

Hence by finite sub-additivity of μ_0,

F(b) − F(ã) = μ_0(Ĩ) ≤ ∑_{n=1}^N μ_0(J̃_n) ≤ ∑_{n=1}^∞ μ_0(J̃_n).

Using the right continuity of F and letting ã ↓ a in the above inequality shows that

μ_0((a, b]) = F(b) − F(a) ≤ ∑_{n=1}^∞ μ_0(J̃_n)
   = ∑_{n=1}^∞ μ_0(J_n) + ∑_{n=1}^∞ μ_0(J̃_n \ J_n)   (13.41)

Given ε > 0 we may use the right continuity of F to choose b̃_n so that

μ_0(J̃_n \ J_n) = F(b̃_n) − F(b_n) ≤ ε 2^{−n} for all n.

Using this in Eq. (13.41) shows

μ_0(J) = μ_0((a, b]) ≤ ∑_{n=1}^∞ μ_0(J_n) + ε

and since ε > 0 is arbitrary we have verified Eq. (13.40).
We have now done the hard work. We still have to check the cases where a = −∞ or b = ∞ or both. For example, suppose that b = ∞ so that

J = (a, ∞) = ∐_{n=1}^∞ J_n

with J_n = (a_n, b_n] ∩ ℝ. Then let I_M := (a, M], and notice that

I_M = J ∩ I_M = ∐_{n=1}^∞ J_n ∩ I_M.

So by what we have already proved,

F(M) − F(a) = μ_0(I_M) ≤ ∑_{n=1}^∞ μ_0(J_n ∩ I_M) ≤ ∑_{n=1}^∞ μ_0(J_n).

Now let M → ∞ in this last inequality to find that

μ_0((a, ∞)) = F(∞) − F(a) ≤ ∑_{n=1}^∞ μ_0(J_n).

The other cases where a = −∞ and b ∈ ℝ and a = −∞ and b = ∞ are handled similarly.
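For orientation, here are two concrete instances of the construction just carried out; they are not part of the original text. If F(x) = x, then μ_F((a, b]) = b − a, so μ_F is Lebesgue measure m. If F = 1_{[0,∞)}, then μ_F((a, b]) = 1 exactly when 0 ∈ (a, b], so μ_F = δ_0, the unit point mass at 0. In general μ_F({x}) = F(x) − F(x−), so the atoms of μ_F correspond precisely to the jumps of F.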
13.9. Supplement: Generalizations of Theorem 13.35 to ℝⁿ.

Theorem 13.46. Let A ⊂ P(X) and B ⊂ P(Y) be algebras. Suppose that

μ : A × B → ℂ

is a function such that for each A ∈ A, the function

B ∈ B → μ(A × B) ∈ ℂ

is an additive measure on B and for each B ∈ B, the function

A ∈ A → μ(A × B) ∈ ℂ

is an additive measure on A. Then μ extends uniquely to an additive measure on the product algebra C generated by A × B.
Proof. The collection

E = A × B = {A × B : A ∈ A and B ∈ B}

is an elementary family, see Exercise 6.2. Therefore, it suffices to show μ is additive on E. To check this suppose that A × B ∈ E and

A × B = ∐_{k=1}^n (A_k × B_k)

with A_k × B_k ∈ E. We wish to show

μ(A × B) = ∑_{k=1}^n μ(A_k × B_k).

For this consider the finite algebras A_0 ⊂ P(A) and B_0 ⊂ P(B) generated by {A_k}_{k=1}^n and {B_k}_{k=1}^n respectively. Let F ⊂ A_0 and G ⊂ B_0 be partitions of A and B respectively as found in Proposition 6.18. Then for each k we may write

A_k = ∐_{α∈F, α⊂A_k} α  and  B_k = ∐_{β∈G, β⊂B_k} β.

Therefore,

μ(A_k × B_k) = μ(A_k × ∪_{β⊂B_k} β) = ∑_{β⊂B_k} μ(A_k × β)
   = ∑_{β⊂B_k} μ((∪_{α⊂A_k} α) × β) = ∑_{α⊂A_k, β⊂B_k} μ(α × β)

so that

∑_k μ(A_k × B_k) = ∑_k ∑_{α⊂A_k, β⊂B_k} μ(α × β) = ∑_{α⊂A, β⊂B} μ(α × β)
   = ∑_{β⊂B} μ(A × β) = μ(A × B)

as desired.
Proposition 13.47. Suppose that A ⊂ P(X) is an algebra and for each t ∈ ℝ, μ_t : A → ℂ is a finitely additive measure. Let Y = (u, v] ⊂ ℝ be a finite interval and B ⊂ P(Y) denote the algebra generated by E := {(a, b] : (a, b] ⊂ Y}. Then there is a unique additive measure μ on C, the algebra generated by A × B, such that

μ(A × (a, b]) = μ_b(A) − μ_a(A) for all (a, b] ∈ E and A ∈ A.

Proof. By Proposition 13.3, for each A ∈ A, the function (a, b] → μ(A × (a, b]) extends to a unique measure on B which we continue to denote by μ. Now if B ∈ B, then B = ∐_k I_k with I_k ∈ E, then

μ(A × B) = ∑_k μ(A × I_k)

from which we learn that A → μ(A × B) is still finitely additive. The proof is complete with an application of Theorem 13.46.
For a, b ∈ ℝⁿ, write a < b (a ≤ b) if a_i < b_i (a_i ≤ b_i) for all i. For a < b, let (a, b] denote the half open rectangle:

(a, b] = (a_1, b_1] × (a_2, b_2] × ··· × (a_n, b_n],

E = {(a, b] : a < b} ∪ {ℝⁿ}

and A(ℝⁿ) ⊂ P(ℝⁿ) denote the algebra generated by E. Suppose that F : ℝⁿ → ℂ is a function, we wish to define a finitely additive complex valued measure μ_F on A(ℝⁿ) associated to F. Intuitively the definition is to be

μ_F((a, b]) = ∫_{(a,b]} F(dt_1, dt_2, ..., dt_n)
   = ∫_{(a,b]} (∂_1 ∂_2 ··· ∂_n F)(t_1, t_2, ..., t_n) dt_1 dt_2 ··· dt_n
   = ∫_{(ã,b̃]} (∂_1 ∂_2 ··· ∂_{n−1} F)(t_1, t_2, ..., t_n)|_{t_n=a_n}^{t_n=b_n} dt_1 dt_2 ··· dt_{n−1},

where

(ã, b̃] = (a_1, b_1] × (a_2, b_2] × ··· × (a_{n−1}, b_{n−1}].

Using this expression as motivation we are led to define μ_F by induction on n. For n = 1, let

μ_F((a, b]) = F(b) − F(a)

and then inductively using

μ_F((a, b]) = μ_{F(·,t)}((ã, b̃])|_{t=a_n}^{t=b_n}.
Proposition 13.48. The function μ_F extends uniquely to an additive function on A(ℝⁿ). Moreover,

(13.42)   μ_F((a, b]) = ∑_{Λ⊂S} (−1)^{|Λ|} F(a_Λ × b_{Λᶜ})

where S = {1, 2, ..., n} and

(a_Λ × b_{Λᶜ})(i) = a(i) if i ∈ Λ, and b(i) if i ∉ Λ.

Proof. Both statements of the proof will be by induction. For n = 1 we have μ_F((a, b]) = F(b) − F(a) so that Eq. (13.42) holds and we have already seen that μ_F extends to an additive measure on A(ℝ). For general n, notice that A(ℝⁿ) = A(ℝⁿ⁻¹) ⊗ A(ℝ). For t ∈ ℝ and A ∈ A(ℝⁿ⁻¹), let

μ_t(A) = μ_{F(·,t)}(A)

where μ_{F(·,t)} is defined by the induction hypothesis. Then

μ_F(A × (a, b]) = μ_b(A) − μ_a(A)

and by Proposition 13.47, μ_F has a unique extension to A(ℝⁿ⁻¹) ⊗ A(ℝ) as a finitely additive measure.

For n = 1, Eq. (13.42) says that

μ_F((a, b]) = F(b) − F(a)

where the first term corresponds to Λ = ∅ and the second to Λ = {1}. This agrees with the definition of μ_F for n = 1. Now for the induction step. Let T = {1, 2, ..., n−1} and suppose that a, b ∈ ℝⁿ, then

μ_F((a, b]) = μ_{F(·,t)}((ã, b̃])|_{t=a_n}^{t=b_n}
   = ∑_{Λ⊂T} (−1)^{|Λ|} F(ã_Λ × b̃_{Λᶜ}, t)|_{t=a_n}^{t=b_n}
   = ∑_{Λ⊂T} (−1)^{|Λ|} F(ã_Λ × b̃_{Λᶜ}, b_n) − ∑_{Λ⊂T} (−1)^{|Λ|} F(ã_Λ × b̃_{Λᶜ}, a_n)
   = ∑_{Λ⊂S : n∉Λ} (−1)^{|Λ|} F(a_Λ × b_{Λᶜ}) + ∑_{Λ⊂S : n∈Λ} (−1)^{|Λ|} F(a_Λ × b_{Λᶜ})
   = ∑_{Λ⊂S} (−1)^{|Λ|} F(a_Λ × b_{Λᶜ})

as desired.
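For n = 2 (a worked instance added here for orientation, not in the original text), Eq. (13.42) is the familiar inclusion–exclusion expression for the mass a distribution function assigns to a rectangle:

μ_F((a, b]) = F(b_1, b_2) − F(a_1, b_2) − F(b_1, a_2) + F(a_1, a_2),

the four terms corresponding to Λ = ∅, {1}, {2} and {1, 2} respectively. Taking F(t_1, t_2) = t_1 t_2 gives μ_F((a, b]) = (b_1 − a_1)(b_2 − a_2), i.e. Lebesgue measure on ℝ².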
13.10. Exercises.

Exercise 13.4. Let (X, A, μ) be as in Definition 13.4 and Proposition 13.5, Y be a Banach space and S(Y) := S_f(X, A, μ; Y) be the collection of functions f : X → Y such that #(f(X)) < ∞, f^{−1}({y}) ∈ A for all y ∈ Y and μ(f ≠ 0) < ∞. We may define a linear functional I : S(Y) → Y by

I(f) = ∑_{y∈Y} y μ(f = y).

Verify the following statements.

(1) Let ‖f‖_∞ = sup_{x∈X} ‖f(x)‖_Y be the sup norm on ℓ^∞(X, Y), then for f ∈ S(Y),

‖I(f)‖_Y ≤ ‖f‖_∞ μ(f ≠ 0).

Hence if μ(X) < ∞, I extends to a bounded linear transformation from S̄(Y) ⊂ ℓ^∞(X, Y) to Y.

(2) Assuming (X, A, μ) satisfies the hypothesis in Exercise 13.1, then C(X, Y) ⊂ S̄(Y).

(3) Now assume the notation in Section 13.4.1, i.e. X = [−M, M] for some M ∈ ℝ and μ is determined by an increasing function F. Let π := {−M = t_0 < t_1 < ··· < t_n = M} denote a partition of J := [−M, M] along with a choice c_i ∈ [t_i, t_{i+1}] for i = 0, 1, 2, ..., n−1. For f ∈ C([−M, M], Y), set

f_π ≡ f(c_0) 1_{[t_0, t_1]} + ∑_{i=1}^{n−1} f(c_i) 1_{(t_i, t_{i+1}]}.

Show that f_π ∈ S and

‖f − f_π‖_∞ → 0 as |π| := max{(t_{i+1} − t_i) : i = 0, 1, 2, ..., n−1} → 0.

Conclude from this that

I(f) = lim_{|π|→0} ∑_{i=0}^{n−1} f(c_i)(F(t_{i+1}) − F(t_i)).

As usual we will write this integral as ∫_{−M}^{M} f dF and as ∫_{−M}^{M} f(t) dt if F(t) = t.
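The limit in part (3) can be checked numerically. The following sketch is not part of the original exercises: it treats the scalar-valued case Y = ℝ, and the particular choices F(t) = t³, f = cos and c_i = t_i are illustrative assumptions only. It computes the partition sums ∑ f(c_i)(F(t_{i+1}) − F(t_i)) and shows that they stabilize as the mesh |π| shrinks.

import math

def stieltjes_sum(f, F, M, n):
    # uniform partition -M = t_0 < t_1 < ... < t_n = M
    t = [-M + 2.0 * M * i / n for i in range(n + 1)]
    # take c_i = t_i; any choice c_i in [t_i, t_{i+1}] gives the same limit
    return sum(f(t[i]) * (F(t[i + 1]) - F(t[i])) for i in range(n))

f = math.cos              # continuous integrand on [-M, M]
F = lambda t: t ** 3      # increasing function determining the measure dF
M = 1.0

for n in (10, 100, 1000, 10000):
    print(n, stieltjes_sum(f, F, M, n))
# The printed sums converge to the Riemann-Stieltjes integral
# int_{-M}^{M} cos(t) dF(t) = int_{-M}^{M} 3 t^2 cos(t) dt.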
Exercise 13.5. Folland problem 1.28.

Exercise 13.6. Suppose that F ∈ C¹(ℝ) is an increasing function and μ_F is the unique Borel measure on ℝ such that μ_F((a, b]) = F(b) − F(a) for all a ≤ b. Show that dμ_F = ρ dm for some function ρ ≥ 0. Find ρ explicitly in terms of F.

Exercise 13.7. Suppose that F(x) = e 1_{x≥3} + π 1_{x≥7} and μ_F is the unique Borel measure on ℝ such that μ_F((a, b]) = F(b) − F(a) for all a ≤ b. Give an explicit description of the measure μ_F.

Exercise 13.8. Let E ∈ B_ℝ with m(E) > 0. Then for any α ∈ (0, 1) there exists an open interval J ⊂ ℝ such that m(E ∩ J) ≥ α m(J).³² Hints: 1. Reduce to the case where m(E) ∈ (0, ∞). 2) Approximate E from the outside by an open set V ⊂ ℝ. 3. Make use of Exercise 3.43, which states that V may be written as a disjoint union of open intervals.

³² See also the Lebesgue differentiation Theorem 16.13 from which one may prove the much stronger form of this theorem, namely for m-a.e. x ∈ E there exists r_x > 0 such that m(E ∩ (x − r, x + r)) ≥ α m((x − r, x + r)) for all r ≤ r_x.

Exercise 13.9. Let (X, τ) be a second countable locally compact Hausdorff space and I : C_0(X, ℝ) → ℝ be a positive linear functional. Show I is necessarily bounded, i.e. there exists a C < ∞ such that |I(f)| ≤ C ‖f‖_u for all f ∈ C_0(X, ℝ). Hint: Let μ be the measure on B_X coming from the Riesz Representation theorem and for sake of contradiction suppose μ(X) = ‖I‖ = ∞. To reach a contradiction, construct a function f ∈ C_0(X, ℝ) such that I(f) = ∞.

Exercise 13.10. Suppose that I : C_c^∞(ℝ, ℝ) → ℝ is a positive linear functional. Show

(1) For each compact subset K ⊂⊂ ℝ there exists a constant C_K < ∞ such that

|I(f)| ≤ C_K ‖f‖_u whenever supp(f) ⊂ K.

(2) Show there exists a unique Radon measure μ on B_ℝ (the Borel σ-algebra on ℝ) such that I(f) = ∫_ℝ f dμ for all f ∈ C_c^∞(ℝ, ℝ).
13.10.1. The Laws of Large Number Exercises. For the rest of the problems of this section, let μ be a probability measure on B_ℝ such that ∫_ℝ |x| dμ(x) < ∞, μ_n := μ for n ∈ ℕ and let μ̄ denote the infinite product measure as constructed in Corollary 13.40. So μ̄ is the unique measure on (X := ℝ^ℕ, B := B_{ℝ^ℕ}) such that

(13.43)   ∫_X f(x_1, x_2, ..., x_N) dμ̄(x) = ∫_{ℝ^N} f(x_1, x_2, ..., x_N) dμ(x_1) ··· dμ(x_N)

for all N ∈ ℕ and bounded measurable functions f : ℝ^N → ℝ. We will also use the following notation:

S_n(x) := (1/n) ∑_{k=1}^n x_k for x ∈ X,
m := ∫_ℝ x dμ(x), the average of μ,
σ² := ∫_ℝ (x − m)² dμ(x), the variance of μ, and
γ := ∫_ℝ (x − m)⁴ dμ(x).

The variance may also be written as σ² = ∫_ℝ x² dμ(x) − m².

Exercise 13.11 (Weak Law of Large Numbers). Suppose further that σ² < ∞, show

∫_X S_n dμ̄ = m,   ‖S_n − m‖₂² = ∫_X (S_n − m)² dμ̄ = σ²/n

and μ̄(|S_n − m| > ε) ≤ σ²/(n ε²) for all ε > 0 and n ∈ ℕ.
Exercise 13.12 (A simple form of the Strong Law of Large Numbers). Suppose now that γ := ∫_ℝ (x − m)⁴ dμ(x) < ∞. Show for all ε > 0 and n ∈ ℕ that

‖S_n − m‖₄⁴ = ∫_X (S_n − m)⁴ dμ̄ = (1/n⁴)[n γ + 3n(n − 1) σ⁴]
            = (1/n²)[n⁻¹ γ + 3(1 − n⁻¹) σ⁴]

and

μ̄(|S_n − m| > ε) ≤ [n⁻¹ γ + 3(1 − n⁻¹) σ⁴] / (ε⁴ n²).

Conclude from the last estimate and the first Borel Cantelli Lemma 7.22 that lim_{n→∞} S_n(x) = m for μ̄-a.e. x ∈ X.
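A quick simulation (not part of the notes; the choice of distribution and the sample sizes are arbitrary illustrative assumptions) makes the content of Exercises 13.11 and 13.12 concrete: for i.i.d. samples drawn from μ, the averages S_n concentrate around m at the rate σ/√n suggested by the variance computation, and tend to m as n grows.

import random

random.seed(0)
# Take mu to be the exponential distribution with rate 1, so m = 1 and sigma^2 = 1.
def sample():
    return random.expovariate(1.0)

m = 1.0
for n in (10, 100, 1000, 10000, 100000):
    xs = [sample() for _ in range(n)]
    S_n = sum(xs) / n
    print(n, S_n, abs(S_n - m))
# |S_n - m| is typically of size sigma / sqrt(n), consistent with Exercise 13.11,
# and S_n -> m, as asserted by the strong law in Exercise 13.12.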
Exercise 13.13. Suppose γ := ∫_ℝ (x − m)⁴ dμ(x) < ∞ and m = ∫_ℝ x dμ(x) ≠ 0. For λ > 0 let T_λ : ℝ^ℕ → ℝ^ℕ be defined by T_λ(x) = (λx_1, λx_2, ..., λx_n, ...), μ̄_λ = μ̄ ∘ T_λ^{−1} and

X_λ := { x ∈ ℝ^ℕ : lim_{n→∞} (1/n) ∑_{j=1}^n x_j = λm }.

Show

μ̄_λ(X_{λ'}) = δ_{λ,λ'} = 1 if λ = λ' and 0 if λ ≠ λ',

and use this to show if λ ≠ 1, then dμ̄_λ ≠ ρ dμ̄ for any measurable function ρ : ℝ^ℕ → [0, ∞].
14. Daniell Integral Proofs

(This section follows the exposition in Royden and Loomis.) In this section we let X be a given set. We will be interested in certain spaces of extended real valued functions f : X → ℝ̄ on X.

Convention: Given functions f, g : X → ℝ̄, let f + g denote the collection of functions h : X → ℝ̄ such that h(x) = f(x) + g(x) for all x for which f(x) + g(x) is well defined, i.e. not of the form ∞ − ∞. For example, if X = {1, 2, 3} and f(1) = ∞, f(2) = 2 and f(3) = 5 and g(1) = g(2) = −∞ and g(3) = 4, then h ∈ f + g iff h(2) = −∞ and h(3) = f(3) + g(3) = 9. The value h(1) may be chosen freely. More generally if a, b ∈ ℝ and f, g : X → ℝ̄ we will write af + bg for the collection of functions h : X → ℝ̄ such that h(x) = af(x) + bg(x) for those x ∈ X where af(x) + bg(x) is well defined with the values of h(x) at the remaining points being arbitrary. It will also be useful to have some explicit representatives for af + bg which we define, for α ∈ ℝ̄, by

(14.1)   (af + bg)_α(x) = af(x) + bg(x) when defined, and α otherwise.

We will make use of this definition with α = 0 and α = −∞ below.

Definition 14.1. A set, L, of extended real valued functions on X is an extended vector space (or a vector space for short) if L is closed under scalar multiplication and addition in the following sense: if f, g ∈ L and λ ∈ ℝ then (f + λg) ⊂ L. A vector space L is said to be an extended lattice (or a lattice for short) if it is also closed under the lattice operations; f ∨ g = max(f, g) and f ∧ g = min(f, g). A linear functional I on L is a function I : L → ℝ such that

(14.2)   I(f + λg) = I(f) + λI(g) for all f, g ∈ L and λ ∈ ℝ.

Eq. (14.2) is to be interpreted as I(h) = I(f) + λI(g) for all h ∈ (f + λg), and in particular I is required to take the same value on all members of (f + λg). A linear functional I is positive if I(f) ≥ 0 when f ∈ L⁺, where L⁺ denotes the non-negative elements of L as in Notation 13.13.

Remark 14.2. Notice that an extended lattice L is closed under the absolute value operation since |f| = f ∨ 0 − f ∧ 0 = f ∨ (−f). Also if I is positive on L then I(f) ≤ I(g) when f, g ∈ L and f ≤ g. Indeed, f ≤ g implies (g − f)_0 ≥ 0, so 0 = I(0) ≤ I((g − f)_0) = I(g) − I(f) and hence I(f) ≤ I(g).

In the remainder of this chapter we fix a lattice, S, of bounded functions, f : X → ℝ, and a positive linear functional I : S → ℝ satisfying Property (D) of Definition 13.15.
14.1. Extension of Integrals.
Proposition 14.3. The set S

and the extension of I to S

in Denition 13.20
satises:
(1) (Monotonicity) I(f) I(g) if f, g S

with f g.
(2) S

is closed under the lattice operations, i.e. if f, g S

then f g S

and f g S

. Moreover, if I(f) < and I(g) < , then I(f g) <


and I(f g) < .
(3) (Positive Linearity) I (f +g) = I(f) +I(g) for all f, g S

and 0.
ANALYSIS TOOLS WITH APPLICATIONS 283
(4) f S
+

i there exists
n
S
+
such that f =
P

n=1

n
. Moreover, I(f) =
P

m=1
I(
m
).
(5) If f
n
S
+

, then
P

n=1
f
n
=: f S
+

and I(f) =
P

n=1
I(f
n
).
Remark 14.4. Similar results hold for the extension of I to S

in Denition 13.21.
Proof.
(1) Monotonicity follows directly from Lemma 13.19.
(2) If f
n
, g
n
S are chosen so that f
n
f and g
n
g, then f
n
g
n
f g and
f
n
g
n
f g. If we further assume that I(g) < , then f g g and
hence I(f g) I(g) < . In particular it follows that I(f 0) (, 0]
for all f S

. Combining this with the identity,


I(f) = I (f 0 +f 0) = I (f 0) +I(f 0) ,
shows I(f) < i I(f 0) < . Since f g f 0 + g 0, if both
I(f) < and I(g) < then
I(f g) I (f 0) +I (g 0) < .
(3) Let f
n
, g
n
S be chosen so that f
n
f and g
n
g, then (f
n
+g
n
)
(f +g) and therefore
I (f +g) = lim
n
I (f
n
+g
n
) = lim
n
I(f
n
) + lim
n
I(g
n
)
= I(f) +I(g).
(4) Let f S
+

and f
n
S be chosen so that f
n
f. By replacing f
n
by f
n
0
if necessary we may assume that f
n
S
+
. Now set
n
= f
n
f
n1
S for
n = 1, 2, 3, . . . with the convention that f
0
= 0 S. Then
P

n=1

n
= f
and
I(f) = lim
n
I(f
n
) = lim
n
I(
n
X
m=1

m
) = lim
n
n
X
m=1
I(
m
) =

X
m=1
I(
m
).
Conversely, if f =
P

m=1

m
with
m
S
+
, then f
n
:=
P
n
m=1

m
f as
n and f
n
S
+
.
(5) Using Item 4., f
n
=
P

m=1

n,m
with
n,m
S
+
. Thus
f =

X
n=1

X
m=1

n,m
= lim
N
X
m,nN

n,m
S

and
I(f) = lim
N
I(
X
m,nN

n,m
) = lim
N
X
m,nN
I(
n,m
)
=

X
n=1

X
m=1
I(
n,m
) =

X
n=1
I(f
n
).
Definition 14.5. Given an arbitrary function g : X → ℝ̄, let

Ī(g) = inf{I(f) : g ≤ f ∈ S↑} ∈ ℝ̄ and
I̲(g) = sup{I(f) : S↓ ∋ f ≤ g} ∈ ℝ̄,

with the convention that sup ∅ = −∞ and inf ∅ = +∞.
Proposition 14.6. Given functions f, g : X



R, then:
(1)

I(f) =

I(f) for all 0.


(2) (Chebyshevs Inequality.) Suppose f : X [0, ] is a function and
(0, ), then

I(1
{f}
)
1


I(f) and if

I(f) < then

I(1
{f=}
) = 0.
(3)

I is subadditive, i.e. if

I(f) +

I(g) is not of the form or +,
then
(14.3)

I(f +g)

I(f) +

I(g).
This inequality is to be interpreted to mean,

I(h)

I(f) +

I(g) for all h (f +g).
(4) I(g) =

I(g).
(5) I(g)

I(g).
(6) If f g then

I(f)

I(g) and I(f) I(g).
(7) If g S

and I(g) < or g S

and I(g) > then I(g) =



I(g) = I(g).
Proof.
(1) Suppose that > 0 (the = 0 case being trivial), then

I(f) = inf {I(h) : f h S

} = inf

I(h) : f
1
h S

= inf {I(g) : f g S

} = inf {I(g) : f g S

} =

I(f).
(2) For (0, ), 1
{f}
f and therefore,

I(1
{f}
) =

I(1
{f}
)

I(f).
Since N1
{f=}
f for all N (0, ),
N

I(1
{f=}
) =

I(N1
{f=}
)

I(f).
So if

I(f) < , this inequality implies

I(1
{f=}
) = 0 because N is arbi-
trary.
(3) If

I(f) +

I(g) = the inequality is trivial so we may assume that

I(f),

I(g) [, ). If

I(f) +

I(g) = then we may assume, by inter-
changing f and g if necessary, that

I(f) = and

I(g) < . By denition
of

I, there exists f
n
S

and g
n
S

such that f f
n
and g g
n
and
I(f
n
) and I(g
n
)

I(g). Since f +g f
n
+g
n
S

, (i.e. h f
n
+g
n
for all h (f +g) which holds because f
n
, g
n
> ) and
I(f
n
+g
n
) = I(f
n
) +I(g
n
) +

I(g) = ,
it follows that

I(f +g) = , i.e.

I(h) = for all h f +g. Henceforth
we may assume

I(f),

I(g) R. Let k (f +g) and f h
1
S

and
g h
2
S

. Then k h
1
+ h
2
S

because if (for example) f(x) =


and g(x) = , then h
1
(x) = and h
2
(x) > since h
2
S

. Thus
h
1
(x) + h
2
(x) = k(x) no matter the value of k(x). It now follows
from the denitions that

I(k) I(h
1
) + I(h
2
) for all f h
1
S

and
g h
2
S

. Therefore,

I(k) inf {I(h


1
) +I(h
2
) : f h
1
S

and g h
2
S

}
=

I(f) +

I(g)
and since k (f +g) is arbitrary we have proven Eq. (14.3).
ANALYSIS TOOLS WITH APPLICATIONS 285
(4) From the denitions and Exercise 13.2,
I(g) = sup{I(f) : f g S

} = sup{I(f) : g f S

}
= sup{I(h) : g h S

} = inf {I(h) : g h S

} =

I(g).
(5) The assertion is trivially true if

I(g) = I(g) = or

I(g) = I(g) = . So
we now assume that

I(g) and I(g) are not both or . Since 0 (gg)
and

I(g g)

I(g) +

I(g) (by Item 1),
0 =

I(0)

I(g) +

I(g) =

I(g) I(g)
provided the right side is well dened which it is by assumption. So again
we deduce that I(g)

I(g).
(6) If f g then

I(f) = inf {I(h) : f h S

} inf {I(h) : g h S

} =

I(g)
and
I(f) = sup{I(h) : S

3 h f} sup{I(h) : S

3 h g} = I(g).
(7) Let g S

with I(g) < and choose g


n
S such that g
n
g. Then

I(g) I(g) I(g


n
) I(g) as n .
Combining this with

I(g) = inf {I(f) : g f S

} = I(g)
shows

I(g) I(g) I(g) =



I(g)
and hence I(g) = I(g) =

I(g). If g S

and I(g) > , then by what we


have just proved,
I(g) = I(g) =

I(g).
This nishes the proof since I(g) =

I(g) and I(g) = I(g).


Lemma 14.7. Let f
n
: X [0, ] be a sequence of functions and F :=
P

n=1
f
n
.
Then
(14.4)

I(F) =

I(

X
n=1
f
n
)

X
n=1

I(f
n
).
Proof. Suppose
P

n=1

I(f
n
) < , for otherwise the result is trivial. Let > 0
be given and choose g
n
S
+

such that f
n
g
n
and I(g
n
) =

I(f
n
) +
n
where
P

n=1

n
. (For example take
n
2
n
.) Then
P

n=1
g
n
=: G S
+

, F G
and so

I(F)

I(G) = I(G) =

X
n=1
I(g
n
) =

X
n=1

I(f
n
) +
n


X
n=1

I(f
n
) +.
Since > 0 is arbitrary, the proof is complete.
Definition 14.8. A function g : X → ℝ̄ is integrable if I̲(g) = Ī(g) ∈ ℝ. Let

L¹(I) := {g : X → ℝ̄ : I̲(g) = Ī(g) ∈ ℝ}

and for g ∈ L¹(I), let Î(g) denote the common value I̲(g) = Ī(g).
286 BRUCE K. DRIVER

Remark 14.9. A function g : X



R is integrable i there exists f S

L
1
(I) and
h S

L
1
(I)
33
such that f g h and I(h f) < . Indeed if g is integrable,
then I(g) =

I(g) and there exists f S

L
1
(I) and h S

L
1
(I) such that
f g h and 0 I(g) I(f) < /2 and 0 I(h)

I(g) < /2. Adding these
two inequalities implies 0 I(h) I(f) = I(h f) < . Conversely, if there exists
f S

L
1
(I) and h S

L
1
(I) such that f g h and I(h f) < , then
I(f) = I(f) I(g) I(h) = I(h) and
I(f) =

I(f)

I(g)

I(h) = I(h)
and therefore
0

I(g) I(g) I(h) I(f) = I(h f) < .
Since > 0 is arbitrary, this shows

I(g) = I(g).
Proposition 14.10. The space L
1
(I) is an extended lattice and

I : L
1
(I) R is
linear in the sense of Denition 14.1.
Proof. Let us begin by showing that L
1
(I) is a vector space. Suppose that
g
1
, g
2
L
1
(I), and g (g
1
+ g
2
). Given > 0 there exists f
i
S

L
1
(I) and
h
i
S

L
1
(I) such that f
i
g
i
h
i
and I(h
i
f
i
) < /2. Let us now show
(14.5) f
1
(x) +f
2
(x) g(x) h
1
(x) +h
2
(x) x X.
This is clear at points x X where g
1
(x) +g
2
(x) is well dened. The other case to
consider is where g
1
(x) = = g
2
(x) in which case h
1
(x) = and f
2
(x) =
while , h
2
(x) > and f
1
(x) < because h
2
S

and f
1
S

. Therefore,
f
1
(x) +f
2
(x) = and h
1
(x) +h
2
(x) = so that Eq. (14.5) is valid no matter
how g(x) is chosen.
Since f
1
+f
2
S

L
1
(I), h
1
+h
2
S

L
1
(I) and

I(g
i
) I(f
i
) +/2 and /2 +I(h
i
)

I(g
i
),
we nd

I(g
1
) +

I(g
2
) I(f
1
) +I(f
2
) = I(f
1
+f
2
) I(g)

I(g)
I(h
1
+h
2
) = I(h
1
) +I(h
2
)

I(g
1
) +

I(g
2
) +.
Because > 0 is arbitrary, we have shown that g L
1
(I) and

I(g
1
) +

I(g
2
) =

I(g),
i.e.

I(g
1
+g
2
) =

I(g
1
) +

I(g
2
).
It is a simple matter to show g L
1
(I) and

I(g) =

I(g) for all g L


1
(I) and
R. For example if = 1 (the most interesting case), choose f S

L
1
(I)
and h S

L
1
(I) such that f g h and I(h f) < . Therefore,
S

L
1
(I) 3 h g f S

L
1
(I)
with I(f (h)) = I(h f) < and this shows that g L
1
(I) and

I(g) =

I(g). We have now shown that L


1
(I) is a vector space of extended real valued
functions and

I : L
1
(I) R is linear.
To show L
1
(I) is a lattice, let g
1
, g
2
L
1
(I) and f
i
S

L
1
(I) and h
i

S

L
1
(I) such that f
i
g
i
h
i
and I(h
i
f
i
) < /2 as above. Then using
Proposition 14.3 and Remark 14.4,
S

L
1
(I) 3 f
1
f
2
g
1
g
2
h
1
h
2
S

L
1
(I).
33
Equivalently, f S

with I(f) > and h S

with I(h) < .


ANALYSIS TOOLS WITH APPLICATIONS 287
Moreover,
0 h
1
h
2
f
1
f
2
h
1
f
1
+h
2
f
2
,
because, for example, if h
1
h
2
= h
1
and f
1
f
2
= f
2
then
h
1
h
2
f
1
f
2
= h
1
f
2
h
2
f
2
.
Therefore,
I (h
1
h
2
f
1
f
2
) I (h
1
f
1
+h
2
f
2
) <
and hence by Remark 14.9, g
1
g
2
L
1
(I). Similarly
0 h
1
h
2
f
1
f
2
h
1
f
1
+h
2
f
2
,
because, for example, if h
1
h
2
= h
1
and f
1
f
2
= f
2
then
h
1
h
2
f
1
f
2
= h
1
f
2
h
1
f
1
.
Therefore,
I (h
1
h
2
f
1
f
2
) I (h
1
f
1
+h
2
f
2
) <
and hence by Remark 14.9, g
1
g
2
L
1
(I).
Theorem 14.11 (Monotone convergence theorem). If f_n ∈ L¹(I) and f_n ↑ f, then f ∈ L¹(I) iff lim_{n→∞} Î(f_n) = sup_n Î(f_n) < ∞ in which case Î(f) = lim_{n→∞} Î(f_n).
Proof. If f L
1
(I), then by monotonicity

I(f
n
)

I(f) for all n and therefore
lim
n

I(f
n
)

I(f) < . Conversely, suppose := lim
n

I(f
n
) < and let
g :=
P

n=1
(f
n+1
f
n
)
0
. The reader should check that f (f
1
+ g)

(f
1
+g) .
So by Lemma 14.7,

I(f)

I((f
1
+g)

)

I(f
1
) +

I(g)


I(f
1
) +

X
n=1

I ((f
n+1
f
n
)
0
) =

I(f
1
) +

X
n=1

I (f
n+1
f
n
)
=

I(f
1
) +

X
n=1
h

I(f
n+1
)

I(f
n
)
i
=

I(f
1
) +

I(f
1
) = . (14.6)
Because f
n
f, it follows that

I(f
n
) = I(f
n
) I(f) which upon passing to limit
implies I(f). This inequality and the one in Eq. (14.6) shows

I(f) I(f)
and therefore, f L
1
(I) and

I(f) = = lim
n

I(f
n
).
Lemma 14.12 (Fatou's Lemma). Suppose {f_n} ⊂ [L¹(I)]⁺, then inf f_n ∈ L¹(I). If liminf_{n→∞} Î(f_n) < ∞, then liminf_{n→∞} f_n ∈ L¹(I) and in this case

Î(liminf_{n→∞} f_n) ≤ liminf_{n→∞} Î(f_n).
Proof. Let g
k
:= f
1
f
k
L
1
(I), then g
k
g := inf
n
f
n
. Since g
k
g,
g
k
L
1
(I) for all k and

I(g
k
)

I(0) = 0, it follow from Theorem 14.11 that
g L
1
(I) and hence so is inf
n
f
n
= g L
1
(I).
By what we have just proved, u
k
:= inf
nk
f
n
L
1
(I) for all k. Notice that
u
k
liminf
n
f
n
, and by monotonicity that

I(u
k
)

I(f
k
) for all k. Therefore,
lim
k

I(u
k
) = liminf
k

I(u
k
) liminf
k

I(f
n
) <
and by the monotone convergence Theorem 14.11, liminf
n
f
n
= lim
k
u
k

L
1
(I) and

I(liminf
n
f
n
) = lim
k

I(u
k
) liminf
n

I(f
n
).
Before stating the dominated convergence theorem, it is helpful to remove some of the annoyances of dealing with extended real valued functions. As we have done when studying integrals associated to a measure, we can do this by modifying integrable functions by a null function.

Definition 14.13. A function n : X → ℝ̄ is a null function if Ī(|n|) = 0. A subset E ⊂ X is said to be a null set if 1_E is a null function. Given two functions f, g : X → ℝ̄ we will write f = g a.e. if {f ≠ g} is a null set.
Here are some basic properties of null functions and null sets.
Proposition 14.14. Suppose that n : X

R is a null function and f : X

R is
an arbitrary function. Then
(1) n L
1
(I) and

I(n) = 0.
(2) The function n f is a null function.
(3) The set {x X : n(x) 6= 0} is a null set.
(4) If E is a null set and f L
1
(I), then 1
E
c f L
1
(I) and

I(f) =

I(1
E
cf).
(5) If g L
1
(I) and f = g a.e. then f L
1
(I) and

I(f) =

I(g).
(6) If f L
1
(I), then {|f| = } is a null set.
Proof.
(1) If n is null, using n |n| we nd

I(n)

I(|n|) = 0, i.e.

I(n) 0 and
I(n) =

I(n) 0. Thus it follows that

I(n) 0 I(n) and therefore
n L
1
(I) with

I (n) = 0.
(2) Since |n f| |n| ,

I (|n f|)

I ( |n|) . For k N, k |n| L
1
(I)
and

I(k |n|) = kI (|n|) = 0, so k |n| is a null function. By the monotone
convergence Theorem 14.11 and the fact k |n| |n| L
1
(I) as k ,

I ( |n|) = lim
k

I (k |n|) = 0. Therefore |n| is a null function and
hence so is n f.
(3) Since 1
{n6=0}
1
{n6=0}
= |n| ,

I

1
{n6=0}


I ( |n|) = 0 showing
{n 6= 0} is a null set.
(4) Since 1
E
f L
1
(I) and

I (1
E
f) = 0,
f1
E
c = (f 1
E
f)
0
(f 1
E
f) L
1
(I)
and

I(f1
E
c ) =

I(f)

I(1
E
f) =

I(f).
(5) Letting E be the null set {f 6= g} , then 1
E
c f = 1
E
c g L
1
(I) and 1
E
f is a
null function and therefore, f = 1
E
f + 1
E
c f L
1
(I) and

I(f) =

I(1
E
f) +

I(f1
E
c) =

I(1
E
cf) =

I(1
E
cg) =

I(g).
(6) By Proposition 14.10, |f| L
1
(I) and so by Chebyshevs inequality (Item
2 of Proposition 14.6), {|f| = } is a null set.
Theorem 14.15 (Dominated Convergence Theorem). Suppose that {f_n : n ∈ ℕ} ⊂ L¹(I) such that f := lim f_n exists pointwise and there exists g ∈ L¹(I) such that |f_n| ≤ g for all n. Then f ∈ L¹(I) and

lim_{n→∞} Î(f_n) = Î(lim_{n→∞} f_n) = Î(f).
Proof. By Proposition 14.14, the set E := {g = } is a null set and

I(1
E
cf
n
) =

I(f
n
) and

I(1
E
cg) =

I(g). Since

I(1
E
c(g f
n
)) 2

I(1
E
c g) = 2

I(g) < ,
we may apply Fatous Lemma 14.12 to nd 1
E
c (g f) L
1
(I) and

I(1
E
c (g f)) liminf
n

I(1
E
c (g f
n
))
= liminf
n
n

I(1
E
cg)

I(1
E
c f
n
)
o
= liminf
n
n

I(g)

I(f
n
)
o
.
Since f = 1
E
cf a.e. and 1
E
cf =
1
2
1
E
c (g +f (g +f)) L
1
(I), Proposition 14.14
implies f L
1
(I). So the previous inequality may be written as

I(g)

I(f) =

I(1
E
cg)

I(1
E
c f)
=

I(1
E
c (g f))

I(g) +

liminf
n

I(f
n
)
limsup
n

I(f
n
),
wherein we have used liminf
n
(a
n
) = limsupa
n
. These two inequalities im-
ply limsup
n

I(f
n
)

I(f) liminf
n

I(f
n
) which shows that lim
n

I(f
n
)
exists and is equal to

I(f).
14.2. The Structure of L
1
(I). Let S

denote the collections of functions f :


X

R for which there exists f
n
S

L
1
(I) such that f
n
f as n and
lim
n

I(f
n
) > . Applying the monotone convergence theorem to f
1
f
n
, it
follows that f
1
f L
1
(I) and hence f L
1
(I) so that S

L
1
(I).
Lemma 14.16. Let f : X

R be a function. If

I(f) R, then there exists g S

such that f g and



I(f) =

I(g). (Consequently, n : X [0, , ) is a positive null
function i there exists g S

such that g n and



I(g) = 0.) Moreover, f L
1
(I)
i there exists g S

such that g f and f = g a.e.


Proof. By denition of

I(f) we may choose a sequence of functions g
k
S

L
1
(I) such that g
k
f and

I(g
k
)

I(f). By replacing g
k
by g
1
g
k
if necessary
(g
1
g
k
S

L
1
(I) by Proposition 14.3), we may assume that g
k
is a decreasing
sequence. Then lim
k
g
k
=: g f and, since lim
k

I(g
k
) =

I(f) > ,
g S

. By the monotone convergence theorem applied to g


1
g
k
,

I(g
1
g) = lim
k

I(g
1
g
k
) =

I(g
1
)

I(f),
so

I(g) =

I(f).
Now suppose that f L
1
(I), then (g f)
0
0 and

I ((g f)
0
) =

I (g)

I(f) =

I(g)

I(f) = 0.
Therefore (g f)
0
is a null functions and hence so is (g f)
0
. Because
1
{f6=g}
= 1
{f<g}
(g f)
0
,
{f 6= g} is a null set so if f L
1
(I) there exists g S

such that f = g a.e. The


converse statement has already been proved in Proposition 14.14.
290 BRUCE K. DRIVER

Proposition 14.17. Suppose that I and S are as above and J is another Daniell
integral on a vector lattice T such that S T and I = J|
S
. (We abbreviate this by
writing I J.) Then L
1
(I) L
1
(J) and

I =

J on L
1
(I), or in abbreviated form:
if I J then

I

J.
Proof. From the construction of the extensions, it follows that S

and the
I = J on S

. Similarly, it follows that S

and

I =

J on S

. From Lemma
14.16 we learn, if n 0 is an I null function then there exists g S

such
that n g and 0 = I(g) = J(g). This shows that n is also a J null function and in
particular every I null set is a J null set. Again by Lemma 14.16, if f L
1
(I)
there exists g S

such that {f 6= g} is an I null set and hence a J null


set. So by Proposition 14.14, f L
1
(J) and I(f) = I(g) = J(g) = J(f).
14.3. Relationship to Measure Theory.

Definition 14.18. A function f : X → [0, ∞] is said to be measurable if f ∧ g ∈ L¹(I) for all g ∈ L¹(I).
Lemma 14.19. The set of non-negative measurable functions is closed under pair-
wise minimums and maximums and pointwise limits.
Proof. Suppose that f, g : X [0, ] are measurable functions. The fact that
f g and f g are measurable (i.e. (f g) h and (f g) h are in L
1
(I) for all
h L
1
(I)) follows from the identities
(f g) h = f (g h) and (f g) h = (f h) (g h)
and the fact that L
1
(I) is a lattice. If f
n
: X [0, ] is a sequence of measurable
functions such that f = lim
n
f
n
exists pointwise, then for h L
1
(I), we have
h f
n
h f . By the dominated convergence theorem (using |h f
n
| |h|)
it follows that h f L
1
(I). Since h L
1
(I) is arbitrary we conclude that f is
measurable as well.
Lemma 14.20. A non-negative function f on X is measurable i f L
1
(I)
for all S.
Proof. Suppose f : X [0, ] is a function such that f L
1
(I) for all
S and let g S

L
1
(I). Choose
n
S such that
n
g as n , then

n
f L
1
(I) and by the monotone convergence Theorem 14.11,
n
f g f
L
1
(I). Similarly, using the dominated convergence Theorem 14.15, it follows that
g f L
1
(I) for all g S

. Finally for any h L


1
(I), there exists g S

such
that h = g a.e. and hence h f = g f a.e. and therefore by Proposition 14.14,
h f L
1
(I). This completes the proof since the converse direction is trivial.
Definition 14.21. A set A ⊂ X is measurable if 1_A is measurable and A integrable if 1_A ∈ L¹(I). Let R denote the collection of measurable subsets of X.
Remark 14.22. Suppose that f 0, then f L
1
(I) i f is measurable and

I(f) <
. Indeed, if f is measurable and

I(f) < , there exists g S

L
1
(I) such that
f g. Since f is measurable, f = f g L
1
(I). In particular if A R, then A is
integrable i

I(1
A
) < .
Lemma 14.23. The set R is a ring which is a algebra if 1 is measurable.
(Notice that 1 is measurable i 1 L
1
(I) for all S. This condition is
clearly implied by assuming 1 S for all S. This will be the typical case in
applications.)
ANALYSIS TOOLS WITH APPLICATIONS 291
Proof. Suppose that A, B R, then AB and AB are in R by Lemma 14.19
because
1
AB
= 1
A
1
B
and 1
AB
= 1
A
1
B
.
If A
k
R, then the identities,
1

k=1
A
k
= lim
n
1

n
k=1
A
k
and 1

k=1
A
k
= lim
n
1

n
k=1
A
k
along with Lemma 14.19 shows that

k=1
A
k
and

k=1
A
k
are in R as well. Also if
A, B R and g S, then
(14.7) g 1
A\B
= g 1
A
g 1
AB
+g 0 L
1
(I)
showing the A\ B R as well.
34
Thus we have shown that R is a ring. If 1 = 1
X
is measurable it follows that X R and R becomes a algebra.
Lemma 14.24 (Chebyshevs Inequality). Suppose that 1 is measurable.
(1) If f

L
1
(I)

+
then, for all R, the set {f > } is measurable. More-
over, if > 0 then {f > } is integrable and

I(1
{f>}
)
1

I(f).
(2) (S) R.
Proof.
(1) If < 0, {f > } = X R since 1 is measurable. So now assume that
0. If = 0 let g = f L
1
(I) and if > 0 let g =
1
f

1
f

1.
(Notice that g is a dierence of two L
1
(I) functions and hence in L
1
(I).)
The function g

L
1
(I)

+
has been manufactured so that {g > 0} = {f >
}. Now let
n
:= (ng)1

L
1
(I)

+
then
n
1
{f>}
as n showing
1
{f>}
is measurable and hence that {f > } is measurable. Finally if
> 0,
1
{f>}
= 1
{f>}

1
f

L
1
(I)
showing the {f > } is integrable and

I(1
{f>}
) =

I(1
{f>}

1
f

)

I(
1
f) =
1

I(f).
(2) Since f S
+
is R measurable by (1) and S = S
+
S
+
, it follows that any
f S is R measurable, (S) R.
Lemma 14.25. Let 1 be measurable. Dene

: R [0, ] by

+
(A) =

I(1
A
) and

(A) = I(1
A
)
Then

are measures on R such that


+
and

(A) =
+
(A) whenever

+
(A) < .
34
Indeed, for x A B, x A\ B and x A
c
, Eq. (14.7) evaluated at x states, respectively,
that
g 0 = g 1 g 1 +g 0,
g 1 = g 1 g 0 +g 0 and
g 0 = g 0 g 0 +g 0,
all of which are true.
292 BRUCE K. DRIVER

Notice by Remark 14.22 that

+
(A) =


I(1
A
) if A is integrable
if A R but A is not integrable.
Proof. Since 1

= 0,

() =

I(0) = 0 and if A, B R, A B then
+
(A) =

I(1
A
)

I(1
B
) =
+
(B) and similarly,

(A) = I(1
A
) I(1
B
) =

(B). Hence

are monotonic. By Remark 14.22 if


+
(A) < then A is integrable so

(A) = I(1
A
) =

I(1
A
) =

I(1
A
) =
+
(A).
Now suppose that {E
j
}

j=1
R is a sequence of pairwise disjoint sets and let
E :=

j=1
E
j
R. If
+
(E
i
) = for some i then by monotonicity
+
(E) =
as well. If
+
(E
j
) < for all j then f
n
:=
P
n
j=1
1
E
j


L
1
(I)

+
with f
n
1
E
.
Therefore, by the monotone convergence theorem, 1
E
is integrable i
lim
n

I(f
n
) =

X
j=1

+
(E
j
) <
in which case 1
E
L
1
(I) and lim
n

I(f
n
) =

I(1
E
) =
+
(E). Thus we have
shown that
+
is a measure and

(E) =
+
(E) whenever
+
(E) < . The fact
the

is a measure will be shown in the course of the proof of Theorem 14.28.


Example 14.26. Suppose X is a set, S = {0} is the trivial vector space and
I(0) = 0. Then clearly I is a Daniel integral,

I(g) =

if g(x) > 0 for some x
0 if g 0
and similarly,
I(g) =

if g(x) < 0 for some x
0 if g 0.
Therefore, L
1
(I) = {0} and for any A X we have 1
A
0 = 0 S so that R = 2
X
.
Since 1
A
/ L
1
(I) = {0} unless A = set, the measure
+
in Lemma 14.25 is given
by
+
(A) = if A 6= and
+
() = 0, i.e.
+
(A) =

I(1
A
) while

0.
Lemma 14.27. For A R, let
(A) := sup{
+
(B) : B R, B A and
+
(B) < },
then is a measure on R such that (A) =
+
(A) whenever
+
(A) < . If
is any measure on R such that (B) =
+
(B) when
+
(B) < , then .
Moreover,

.
Proof. Clearly (A) =
+
(A) whenever
+
(A) < . Now let A =

n=1
A
n
with{A
n
}

n=1
R being a collection of pairwise disjoint subsets. Let B
n
A
n
with
+
(B
n
) < , then B
N
:=
N
n=1
B
n
A and
+
(B
N
) < and hence
(A)
+
(B
N
) =
N
X
n=1

+
(B
n
)
and since B
n
A
n
with
+
(B
n
) < is arbitrary it follows that (A)
P
N
n=1
(A
n
) and hence letting N implies (A)
P

n=1
(A
n
). Conversely,
ANALYSIS TOOLS WITH APPLICATIONS 293
if B A with
+
(B) < , then B A
n
A
n
and
+
(B A
n
) < . Therefore,

+
(B) =

X
n=1

+
(B A
n
)

X
n=1
(A
n
)
for all such B and hence (A)
P

n=1
(A
n
).
Using the denition of and the assumption that (B) =
+
(B) when
+
(B) <
,
(A) = sup{(B) : B R, B A and
+
(B) < } (A),
showing . Similarly,
(A) = sup{

I(1
B
) : B R, B A and
+
(B) < }
= sup{I(1
B
) : B R, B A and
+
(B) < } I(1
A
) =

(A).
Theorem 14.28 (Stone). Suppose that 1 is measurable and μ⁺ and μ⁻ are as defined in Lemma 14.25, then:

(1) L¹(I) = L¹(X, R, μ⁺) =: L¹(μ⁺) and for integrable f ∈ L¹(μ⁺),

(14.8)   Î(f) = ∫_X f dμ⁺.

(2) If ν is any measure on R such that S ⊂ L¹(ν) and

(14.9)   Î(f) = ∫_X f dν for all f ∈ S,

then μ⁻(A) ≤ ν(A) ≤ μ⁺(A) for all A ∈ R with μ⁻(A) = ν(A) = μ⁺(A) whenever μ⁺(A) < ∞.

(3) Letting μ̄ be the measure defined in Lemma 14.27, μ̄ = μ⁻ and hence μ⁻ is a measure. (So μ⁺ is the maximal and μ⁻ is the minimal measure for which Eq. (14.9) holds.)

(4) Conversely if ν is any measure on σ(S) such that ν(A) = μ⁺(A) when A ∈ σ(S) and μ⁺(A) < ∞, then Eq. (14.9) is valid.
Proof.
(1) Suppose that f

L
1
(I)

+
, then Lemma 14.24 implies that f is R mea-
surable. Given n N, let
(14.10)
n
:=
2
2n
X
k=1
k
2
n
1
{
k
2
n
<f
k+1
2
n
}
= 2
n
2
2n
X
k=1
1
{
k
2
n
<f}
.
Then we know {
k
2
n
< f} Rand that 1
{
k
2
n
<f}
= 1
{
k
2
n
<f}

2
n
k
f

L
1
(I),
i.e.
+

k
2
n
< f

< . Therefore
n

L
1
(I)

+
and
n
f. Suppose that
is any measure such that (A) =
+
(A) when
+
(A) < , then by the
monotone convergence theorems for

I and the Lebesgue integral,

I(f) = lim
n

I(
n
) = lim
n
2
n
2
2n
X
k=1

I(1
{
k
2
n
<f}
) = lim
n
2
n
2
2n
X
k=1

k
2
n
< f

= lim
n
2
n
2
2n
X
k=1

k
2
n
< f

= lim
n
Z
X

n
d =
Z
X
fd. (14.11)
294 BRUCE K. DRIVER

This shows that f



L
1
()

+
and that

I(f) =
R
X
fd. Since every f
L
1
(I) is of the form f = f
+
f

with f



L
1
(I)

+
, it follows that
L
1
(I) L
1
(
+
) L
1
() L
1
() and Eq. (14.9) holds for all f L
1
(I).
Conversely suppose that f

L
1
(
+
)

+
. Dene
n
as in Eq. (14.10).
Chebyshevs inequality implies that
+
(
k
2
n
< f) < and hence {
k
2
n
< f}
is I integrable. Again by the monotone convergence for Lebesgue integrals
and the computations in Eq. (14.11),
>
Z
X
fd
+
= lim
n

I(
n
)
and therefore by the monotone convergence theorem for

I, f L
1
(I) and
Z
X
fd
+
= lim
n

I(
n
) =

I(f).
(2) Suppose that is any measure such that Eq. (14.9) holds. Then by the
monotone convergence theorem,
I(f) =
Z
X
fd for all f S

.
Let A R and assume that
+
(A) < , i.e. 1
A
L
1
(I). Then there exists
f S

L
1
(I) such that 1
A
f and integrating this inequality relative to
implies
(A) =
Z
X
1
A
d
Z
X
fd =

I(f).
Taking the innum of this equation over those f S

such that 1
A
f
implies (A)

I(1
A
) =
+
(A). If
+
(A) = in this inequality holds
trivially.
Similarly, if A R and f S

such that 0 f 1
A
, then
(A) =
Z
X
1
A
d
Z
X
fd =

I(f).
Taking the supremum of this equation over those f S

such that 0 f
1
A
then implies (A)

(A). So we have shown that


+
.
(3) By Lemma 14.27, = is a measure as in (2) satisfying

and
therefore

and hence we have shown that =

. This also shows


that

is a measure.
(4) This can be done by the same type of argument used in the proof of (1).
Proposition 14.29 (Uniqueness). Suppose that 1 is measurable and there exists a
function L
1
(I) such that (x) > 0 for all x. Then there is only one measure
on (S) such that

I(f) =
Z
X
fd for all f S.
Remark 14.30. The existence of a function L
1
(I) such that (x) > 0 for all x is
equivalent to the existence of a function S

such that

I() < and (x) > 0
for all x X. Indeed by Lemma 14.16, if L
1
(I) there exists S

L
1
(I)
such .
ANALYSIS TOOLS WITH APPLICATIONS 295
Proof. As in Remark 14.30, we may assume S

L
1
(I). The sets X
n
:=
{ > 1/n} (S) R satisfy (X
n
) n

I() < . The proof is completed using


Theorem 14.28 to conclude, for any A (S), that

+
(A) = lim
n

+
(A X
n
) = lim
n

(A X
n
) =

(A).
Since


+
=
,
we see that =
+
=

.
15. Complex Measures, Radon-Nikodym Theorem and the Dual of Lᵖ

Definition 15.1. A signed measure ν on a measurable space (X, M) is a function ν : M → ℝ̄ such that

(1) Either ν(M) ⊂ (−∞, ∞] or ν(M) ⊂ [−∞, ∞).
(2) ν is countably additive, this is to say if E = ∐_{j=1}^∞ E_j with E_j ∈ M, then ν(E) = ∑_{j=1}^∞ ν(E_j).³⁵
(3) ν(∅) = 0.

If there exists X_n ∈ M such that |ν(X_n)| < ∞ and X = ∪_{n=1}^∞ X_n, then ν is said to be σ-finite and if ν(M) ⊂ ℝ then ν is said to be a finite signed measure. Similarly, a countably additive set function ν : M → ℂ such that ν(∅) = 0 is called a complex measure.

A finite signed measure is clearly a complex measure.
Example 15.2. Suppose that μ₊ and μ₋ are two positive measures on M such that either μ₊(X) < ∞ or μ₋(X) < ∞, then ν = μ₊ − μ₋ is a signed measure. If both μ₊(X) and μ₋(X) are finite then ν is a finite signed measure.

Example 15.3. Suppose that g : X → ℝ̄ is measurable and either ∫_X g⁺ dμ < ∞ or ∫_X g⁻ dμ < ∞, then

(15.1)   ν(A) = ∫_A g dμ, A ∈ M

defines a signed measure. This is actually a special case of the last example with μ±(A) := ∫_A g^± dμ. Notice that the measures μ± in this example have the property that they are concentrated on disjoint sets, namely μ₊ "lives" on {g > 0} and μ₋ "lives" on the set {g < 0}.

Example 15.4. Suppose that μ is a positive measure on (X, M) and g ∈ L¹(μ), then ν given as in Eq. (15.1) is a complex measure on (X, M). Also if {μ_r^±, μ_i^±} is any collection of four positive measures on (X, M), then

(15.2)   ν := μ_r^+ − μ_r^− + i(μ_i^+ − μ_i^−)

is a complex measure.

If ν is given as in Eq. (15.1), then ν may be written as in Eq. (15.2) with

dμ_r^± = (Re g)^± dμ and dμ_i^± = (Im g)^± dμ.
Definition 15.5. Let ν be a complex or signed measure on (X, M). A set E ∈ M is a null set or more precisely a ν-null set if ν(A) = 0 for all A ∈ M such that A ⊂ E, i.e. ν|_{M_E} = 0. Recall that M_E := {A ∩ E : A ∈ M} = i_E^{−1}(M) is the trace of M on E.

³⁵ If ν(E) ∈ ℝ then the series ∑_{j=1}^∞ ν(E_j) is absolutely convergent since it is independent of rearrangements.
15.1. Radon-Nikodym Theorem I. We will eventually show that every complex and σ-finite signed measure ν may be described as in Eq. (15.1). The next theorem is the first result in this direction.

Theorem 15.6. Suppose (X, M) is a measurable space, μ is a positive finite measure on M and ν is a complex measure on M such that |ν(A)| ≤ μ(A) for all A ∈ M. Then dν = ρ dμ where |ρ| ≤ 1. Moreover if ν is a positive measure, then 0 ≤ ρ ≤ 1.
Proof. For a simple function, f S(X, M), let (f) :=
P
aC
a(f = a). Then
|(f)|
X
aC
|a| |(f = a)|
X
aC
|a| (f = a) =
Z
X
|f| d.
So, by the B.L.T. Theorem 4.1, extends to a continuous linear functional on L
1
()
satisfying the bounds
|(f)|
Z
X
|f| d
p
(X) kfk
L
2
()
for all f L
1
().
The Riesz representation Theorem (Proposition 12.15) then implies there exists a
unique L
2
() such that
(f) =
Z
X
fd for all f L
2
().
Taking f = sgn()1
A
in this equation shows
Z
A
|| d = (sgn()1
A
) (A) =
Z
A
1d
from which it follows that || 1, a.e. If is a positive measure, then for real
f, 0 = Im[(f)] =
R
X
Imfd and taking f = Im shows 0 =
R
X
[Im]
2
d, i.e.
Im((x)) = 0 for a.e. x and we have shown is real a.e. Similarly,
0 (Re < 0) =
Z
{Re <0}
d 0,
shows 0 a.e.
Definition 15.7. Let ν and μ be two signed or complex measures on (X, M). Then ν and μ are mutually singular (written as ν ⊥ μ) if there exists A ∈ M such that A is a ν-null set and Aᶜ is a μ-null set. The measure ν is absolutely continuous relative to μ (written as ν ≪ μ) provided ν(A) = 0 whenever A is a μ-null set, i.e. all μ-null sets are ν-null sets as well.
Remark 15.8. If
1
,
2
and are signed measures on (X, M) such that
1
and

2
and
1
+
2
is well dened, then (
1
+
2
) . If {
i
}

i=1
is a sequence of
positive measures such that
i
for all i then =
P

i=1

i
as well.
Proof. In both cases, choose A
i
M such that A
i
is null and A
c
i
is
i
-null
for all i. Then by Lemma 15.17, A :=
i
A
i
is still a null set. Since
A
c
=
i
A
c
i
A
c
m
for all m
we see that A
c
is a
i
- null set for all i and is therefore a null set for =
P

i=1

i
.
This shows that .
Throughout the remainder of this section will be always be a positive measure.
Definition 15.9 (Lebesgue Decomposition). Suppose that ν is a signed (complex) measure and μ is a positive measure on (X, M). Two signed (complex) measures ν_a and ν_s form a Lebesgue decomposition of ν relative to μ if

(1) ν = ν_a + ν_s, where implicit in this statement is the assertion that if ν takes on the value ∞ (−∞) then ν_a and ν_s do not take on the value −∞ (∞).
(2) ν_a ≪ μ and ν_s ⊥ μ.
Lemma 15.10. Let ν be a signed (complex) measure and μ be a positive measure on (X, M). If there exists a Lebesgue decomposition of ν relative to μ then it is unique. Moreover, if ν is a positive measure and ν = ν_s + ν_a is the Lebesgue decomposition of ν relative to μ then:

(1) if ν is positive then ν_s and ν_a are positive.
(2) If ν is a σ-finite measure then so are ν_s and ν_a.
Proof. Since
s
, there exists A M such that (A) = 0 and A
c
is
s

null and because
a
, A is also a null set for
a
. So for C M,
a
(C A) = 0
and
s
(C A
c
) = 0 from which it follows that
(C) = (C A) +(C A
c
) =
s
(C A) +
a
(C A
c
)
and hence,

s
(C) =
s
(C A) = (C A) and

a
(C) =
a
(C A
c
) = (C A
c
). (15.3)
Item 1. is now obvious from Eq. (15.3). For Item 2., if is a nite measure
then there exists X
n
M such that X =

n=1
X
n
and |(X
n
)| < for all n. Since
(X
n
) =
a
(X
n
) +
s
(X
n
), we must have
a
(X
n
) R and
s
(X
n
) R showing
a
and
s
are nite as well.
For the uniqueness assertion, if we have another decomposition =
a
+
s
with

s
and
a
we may choose

A M such that (

A) = 0 and

A
c
is
s
null.
Letting B = A

A we have
(B) (A) +(

A) = 0
and B
c
= A
c


A
c
is both a
s
and a
s
null set. Therefore by the same arguments
that proves Eqs. (15.3), for all C M,

s
(C) = (C B) =
s
(C) and

a
(C) = (C B
c
) =
a
(C).
Lemma 15.11. Suppose is a positive measure on (X, M) and f, g : X

R are
extended integrable functions such that
(15.4)
Z
A
fd =
Z
A
gd for all A M,
R
X
f

d < ,
R
X
g

d < , and the measures |f| d and |g| d are nite.


Then f(x) = g(x) for a.e. x.
ANALYSIS TOOLS WITH APPLICATIONS 299
Proof. By assumption there exists X
n
M such that X
n
X and
R
X
n
|f| d <
and
R
X
n
|g| d < for all n. Replacing A by A X
n
in Eq. (15.4) implies
Z
A
1
X
n
fd =
Z
AX
n
fd =
Z
AX
n
gd =
Z
A
1
X
n
gd
for all A M. Since 1
X
n
f and 1
X
n
g are in L
1
() for all n, this equation implies
1
X
n
f = 1
X
n
g, a.e. Letting n then shows that f = g, a.e.
Remark 15.12. Suppose that f and g are two positive measurable functions on
(X, M, ) such that Eq. (15.4) holds. It is not in general true that f = g,
a.e. A trivial counter example is to take M = P(X), (A) = for all non-empty
A M, f = 1
X
and g = 2 1
X
. Then Eq. (15.4) holds yet f 6= g.
Theorem 15.13 (Radon Nikodym Theorem for Positive Measures). Suppose that μ, ν are σ-finite positive measures on (X, M). Then ν has a unique Lebesgue decomposition ν = ν_a + ν_s relative to μ and there exists a unique (modulo sets of μ-measure 0) function ρ : X → [0, ∞) such that dν_a = ρ dμ. Moreover, ν_s = 0 iff ν ≪ μ.
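As a simple illustration (not in the original text), take μ = m, Lebesgue measure on ℝ, and ν = δ_0 + ν_1 where dν_1 = 1_{[0,1]} dm. Then ν_a = ν_1 ≪ m with density ρ = 1_{[0,1]}, while ν_s = δ_0 ⊥ m since δ_0 is carried by {0} and m({0}) = 0. Here ν_s ≠ 0, and indeed ν is not absolutely continuous relative to m because ν({0}) = 1 > 0 = m({0}).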
Proof. The uniqueness assertions follow directly from Lemmas 15.10 and 15.11.
Existence. (Von-Neumanns Proof.) First suppose that and are nite
measures and let = +. By Theorem 15.6, d = hd with 0 h 1 and this
implies, for all non-negative measurable functions f, that
(15.5) (f) = (fh) = (fh) +(fh)
or equivalently
(15.6) (f(1 h)) = (fh).
Taking f = 1
{h=1}
and f = g1
{h<1}
(1 h)
1
with g 0 in Eq. (15.6)
({h = 1}) = 0 and (g1
{h<1}
) = (g1
{h<1}
(1 h)
1
h) = (g)
where := 1
{h<1}
h
1h
and
s
(g) := (g1
{h=1}
). This gives the desired decomposi-
tion
36
since
(g) = (g1
{h=1}
) +(g1
{h<1}
) =
s
(g) +(g)
and

s
(h 6= 1) = 0 while (h = 1) = ({h 6= 1}
c
) = 0.
If , then (h = 1) = 0 implies (h = 1) = 0 and hence that
s
= 0. If

s
= 0, then d = d and so if (A) = 0, then (A) = (1
A
) = 0 as well.
36
Here is the motivation for this construction. Suppose that d = d
s
+ d is the Radon-
Nikodym decompostion and X = A
`
B such that
s
(B) = 0 and (A) = 0. Then we nd

s
(f) +(f) = (f) = (fg) = (fg) +(fg).
Letting f 1
A
f then implies that

s
(1
A
f) = (1
A
fg)
which show that g = 1 a.e. on A. Also letting f 1
B
f implies that
(1
B
f(1 g)) = (1
B
f(1 g)) = (1
B
fg) = (fg)
which shows that
(1 g) = 1
B
(1 g) = g a.e..
This shows that =
g
1g
a.e.
300 BRUCE K. DRIVER

For the nite case, write X =


`

n=1
X
n
where X
n
M are chosen so that
(X
n
) < and (X
n
) < for all n. Let d
n
= 1
X
n
d and d
n
= 1
X
n
d. Then
by what we have just proved there exists
n
L
1
(X,
n
) and measure
s
n
such that
d
n
=
n
d
n
+d
s
n
with
s
n

n
, i.e. there exists A
n
, B
n
M
X
n
and (A
n
) = 0
and
s
n
(B
n
) = 0. Dene
s
:=
P

n=1

s
n
and :=
P

n=1
1
X
n

n
, then
=

X
n=1

n
=

X
n=1
(
n

n
+
s
n
) =

X
n=1
(
n
1
X
n
+
s
n
) = +
s
and letting A :=

n=1
A
n
and B :=

n=1
B
n
, we have A = B
c
and
(A) =

X
n=1
(A
n
) = 0 and (B) =

X
n=1
(B
n
) = 0.
Theorem 15.14 (Dual of Lᵖ-spaces). Let (X, M, μ) be a σ-finite measure space and suppose that p, q ∈ [1, ∞] are conjugate exponents. Then for p ∈ [1, ∞), the map g ∈ L^q → φ_g ∈ (Lᵖ)* is an isometric isomorphism of Banach spaces. (Recall that φ_g(f) := ∫_X f g dμ.) We summarize this by writing (Lᵖ)* = L^q for all 1 ≤ p < ∞.
map g L
q

g
(L
p
)

. When p = 2 the result follows directly from the Riesz


theorem. We will begin the proof under the extra assumption that (X) < in
which cased bounded functions are in L
p
() for all p. So let (L
p
)

. We need
to nd g L
q
() such that =
g
. When p [1, 2], L
2
() L
p
() so that we
may restrict to L
2
() and again the result follows fairly easily from the Riesz
Theorem, see Exercise 15.1 below.
To handle general p [1, ), dene (A) := (1
A
). If A =
`

n=1
A
n
with
A
n
M, then
k1
A

N
X
n=1
1
A
n
k
L
p = k1

n=N+1
A
n
k
L
p =

(

n=N+1
A
n
)

1
p
0 as N .
Therefore
(A) = (1
A
) =

X
1
(1
A
n
) =

X
1
(A
n
)
showing is a complex measure.
37
For A M, let || (A) be the total variation of A dened by
|| (A) := sup{|(f1
A
)| : |f| 1}
and notice that
(15.7) |(A)| || (A) kk
(L
p
)
(A)
1/p
for all A M.
You are asked to show in Exercise 15.2 that || is a measure on (X, M). (This can
also be deduced from Lemma 15.31 and Proposition 15.35 below.) By Eq. (15.7)
|| , by Theorem 15.6 d = hd || for some |h| 1 and by Theorem 15.13
37
It is at this point that the proof breaks down when p = .
ANALYSIS TOOLS WITH APPLICATIONS 301
d || = d for some L
1
(). Hence, letting g = h L
1
(), d = gd or
equivalently
(15.8) (1
A
) =
Z
X
g1
A
d A M.
By linearity this equation implies
(15.9) (f) =
Z
X
gfd
for all simple functions f on X. Replacing f by 1
{|g|M}
f in Eq. (15.9) shows
(f1
{|g|M}
) =
Z
X
1
{|g|M}
gfd
holds for all simple functions f and then by continuity for all f L
p
(). By the
converse to Holders inequality, (Proposition 9.26) we learn that

1
{|g|M}
g

q
= sup
kfk
p
=1

(f1
{|g|M}
)

sup
kfk
p
=1
kk
(L
p
)

f1
{|g|M}

p
kk
(L
p
)
.
Using the monotone convergence theorem we may let M in the previous equa-
tion to learn kgk
q
kk
(L
p
)
.With this result, Eq. (15.9) extends by continuity to
hold for all f L
p
() and hence we have shown that =
g
.
Case 2. Now suppose that is nite and X
n
Mare sets such that (X
n
) <
and X
n
X as n . We will identify f L
p
(X
n
, ) with f1
X
n
L
p
(X, )
and this way we may consider L
p
(X
n
, ) as a subspace of L
p
(X, ) for all n and
p [1, ].
By Case 1. there exists g
n
L
q
(X
n
, ) such that
(f) =
Z
X
n
g
n
fd for all f L
p
(X
n
, )
and
kg
n
k
q
= sup

|(f)| : f L
p
(X
n
, ) and kfk
L
p
(X
n
,)
= 1

kk
[L
p
()]

.
It is easy to see that g
n
= g
m
a.e. on X
n
X
m
for all m, n so that g := lim
n
g
n
exists a.e. By the above inequality and Fatous lemma, kgk
q
kk
[L
p
()]

<
and since (f) =
R
X
n
gfd for all f L
p
(X
n
, ) and n and

n=1
L
p
(X
n
, ) is dense
in L
p
(X, ) it follows by continuity that (f) =
R
X
gfd for all f L
p
(X, ),i.e.
=
g
.
Example 15.15. Theorem 15.14 fails in general when p = . Consider X = [0, 1],
M = B, and = m. Then (L

6= L
1
.
Proof. Let M := C([0, 1]) L

([0, 1], dm). It is easily seen for f M, that


kfk

= sup{|f(x)| : x [0, 1]} for all f M. Therefore M is a closed subspace of


L

. Dene (f) = f(0) for all f M. Then M

with norm 1. Appealing to


the Hahn-Banach Theorem 18.16 below, there exists an extension L (L

such
that L = on M and kLk = 1. If L 6=
g
for some g L
1
, i.e.
L(f) =
g
(f) =
Z
[0,1]
fgdm for all f L

,
302 BRUCE K. DRIVER

then replacing f by f
n
(x) = (1 nx) 1
xn
1 and letting n implies, (using the
dominated convergence theorem)
1 = lim
n
L(f
n
) = lim
n
Z
[0,1]
f
n
gdm =
Z
{0}
gdm = 0.
From this contradiction, we conclude that L 6=
g
for any g L
1
.
15.2. Signed Measures.

Definition 15.16. Let ν be a signed measure on (X, M) and E ∈ M, then

(1) E is positive if for all A ∈ M such that A ⊂ E, ν(A) ≥ 0, i.e. ν|_{M_E} ≥ 0.
(2) E is negative if for all A ∈ M such that A ⊂ E, ν(A) ≤ 0, i.e. ν|_{M_E} ≤ 0.
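For example (an illustration added here, not in the original), if dν = g dμ as in Example 15.3, then every measurable subset of {g ≥ 0} is positive for ν and every measurable subset of {g ≤ 0} is negative, since ν(A) = ∫_A g dμ inherits the sign of g on A.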
Lemma 15.17. Suppose that ν is a signed measure on (X, M). Then

(1) Any subset of a positive set is positive.
(2) The countable union of positive (negative or null) sets is still positive (negative or null).
(3) Let us now further assume that ν(M) ⊂ [−∞, ∞) and E ∈ M is a set such that ν(E) ∈ (0, ∞). Then there exists a positive set P ⊂ E such that ν(P) ≥ ν(E).
Proof. The rst assertion is obvious. If P
j
M are positive sets, let P =

S
n=1
P
n
. By replacing P
n
by the positive set P
n
\

n1
S
j=1
P
j
!
we may assume that
the {P
n
}

n=1
are pairwise disjoint so that P =

`
n=1
P
n
. Now if E P and E M,
E =

`
n=1
(E P
n
) so (E) =
P

n=1
(E P
n
) 0.which shows that P is positive.
The proof for the negative and the null case is analogous.
The idea for proving the third assertion is to keep removing big sets of negative
measure from E. The set remaining from this procedure will be P. We now proceed
to the formal proof.
For all A M let n(A) = 1 sup{(B) : B A}. Since () = 0, n(A) 0
and n(A) = 0 i A is positive. Choose A
0
E such that (A
0
)
1
2
n(E) and
set E
1
= E \ A
0
, then choose A
1
E
1
such that (A
1
)
1
2
n(E
1
) and set
E
2
= E \ (A
0
A
1
) . Continue this procedure inductively, namely if A
0
, . . . , A
k1
have been chosen let E
k
= E \

k1
`
i=0
A
i

and choose A
k
E
k
such that (A
k
)
1
2
n(E
k
). Let P := E \

`
k=0
A
k
=

T
k=0
E
k
, then E = P

`
k=0
A
k
and hence
(15.10) (0, ) 3 v(E) = (P) +

X
k=0
(A
k
) = (P)

X
k=0
(A
k
) (P).
From Eq. (15.10) we learn that
P

k=0
(A
k
) < and in particular that
lim
k
((A
k
)) = 0. Since 0
1
2
n(E
k
) (A
k
), this also implies
lim
k
n(E
k
) = 0. If A P, then A E
k
for all k and so, for k large so
that n(E
k
) < 1, we nd (A) n(E
k
). Letting k in this estimate shows
(A) 0 or equivalently (A) 0. Since A P was arbitrary, we conclude that
P is a positive set such that (P) (E).
ANALYSIS TOOLS WITH APPLICATIONS 303
15.2.1. Hahn Decomposition Theorem.
Denition 15.18. Suppose that is a signed measure on (X, M). A Hahn de-
composition for is a partition {P, N} of X such that P is positive and N is
negative.
Theorem 15.19 (Hahn Decomposition Theorem). Every signed measure space
(X, M, ) has a Hahn decomposition, {P, N}. Moreover, if {

P,

N} is another Hahn
decomposition, then P

P = N

N is a null set, so the decomposition is unique


modulo null sets.
Proof. With out loss of generality we may assume that (M) [, ). If
not just consider instead. Let us begin with the uniqueness assertion. Suppose
that A M, then
(A) = (A P) +(A N) (A P) (P)
and similarly (A) (

P) for all A M. Therefore
(P) (P

P) (

P) and (

P) (P

P) (P)
which shows that
s := (

P) = (P

P) = (P).
Since
s = (P

P) = (P) +(

P) (P

P) = 2s (P

P)
we see that (P

P) = s and since
s = (P

P) = (P

P) +(

PP)
it follows that (

PP) = 0. Thus N

N =

PP is a positive set with zero measure,
i.e. N

N =

PP is a null set and this proves the uniqueness assertion.
Let
s sup{(A) : A M}
which is non-negative since () = 0. If s = 0, we are done since P = and
N = X is the desired decomposition. So assume s > 0 and choose A
n
M such
that (A
n
) > 0 and lim
n
(A
n
) = s. By Lemma 15.17here exists positive sets
P
n
A
n
such that (P
n
) (A
n
). Then s (P
n
) (A
n
) s as n
implies that s = lim
n
(P
n
). The set P

n=1
P
n
is a positive set being the
union of positive sets and since P
n
P for all n,
(P) (P
n
) s as n .
This shows that (P) s and hence by the denition of s, s = (P) < .
We now claim that N = P
c
is a negative set and therefore, {P, N} is the desired
Hahn decomposition. If N were not negative, we could nd E N = P
c
such that
(E) > 0. We then would have
(P E) = (P) +(E) = s +(E) > s
which contradicts the denition of s.
15.2.2. Jordan Decomposition.

Definition 15.20. Let X = P ∪ N be a Hahn decomposition of ν and define

ν⁺(E) = ν(P ∩ E) and ν⁻(E) = −ν(N ∩ E) for all E ∈ M.

Suppose X = P̃ ∪ Ñ is another Hahn Decomposition and ν̃^± are defined as above with P and N replaced by P̃ and Ñ respectively. Then

ν̃⁺(E) = ν(E ∩ P̃) = ν(E ∩ P̃ ∩ P) + ν(E ∩ P̃ ∩ N) = ν(E ∩ P̃ ∩ P)

since P̃ ∩ N is both positive and negative and hence null. Similarly ν⁺(E) = ν(E ∩ P̃ ∩ P) showing that ν⁺ = ν̃⁺ and therefore also that ν⁻ = ν̃⁻.

Theorem 15.21 (Jordan Decomposition). There exist unique positive measures ν^± such that ν⁺ ⊥ ν⁻ and ν = ν⁺ − ν⁻.

Proof. Existence has been proved. For uniqueness suppose ν = ν⁺ − ν⁻ is a Jordan Decomposition. Since ν⁺ ⊥ ν⁻ there exists P, N = Pᶜ ∈ M such that ν⁺(N) = 0 and ν⁻(P) = 0. Then clearly P is positive for ν and N is negative for ν. Now ν(E ∩ P) = ν⁺(E) and ν(E ∩ N) = −ν⁻(E). The uniqueness now follows from the remarks after Definition 15.20.

Definition 15.22. |ν|(E) = ν⁺(E) + ν⁻(E) is called the total variation of ν. A signed measure ν is called σ-finite provided that |ν| := ν⁺ + ν⁻ is a σ-finite measure.

(BRUCE: Use Exercise 15.7 to prove the uniqueness of the Jordan decompositions, or make an exercise.)
Lemma 15.23. Let $\nu$ be a signed measure on $(X, \mathcal{M})$ and $A \in \mathcal{M}$. If $\nu(A) \in \mathbb{R}$ then $\nu(B) \in \mathbb{R}$ for all $B \subset A$. Moreover, $\nu(A) \in \mathbb{R}$ iff $|\nu|(A) < \infty$. In particular, $\nu$ is $\sigma$-finite iff $|\nu|$ is $\sigma$-finite. Furthermore if $P, N \in \mathcal{M}$ is a Hahn decomposition for $\nu$ and $g = 1_P - 1_N$, then $d\nu = g\, d|\nu|$, i.e.
$\nu(A) = \int_A g\, d|\nu|$ for all $A \in \mathcal{M}$.

Proof. Suppose that $B \subset A$ and $|\nu(B)| = \infty$; then since $\nu(A) = \nu(B) + \nu(A \setminus B)$ we must have $|\nu(A)| = \infty$. Let $P, N \in \mathcal{M}$ be a Hahn decomposition for $\nu$, then
(15.11) $\nu(A) = \nu(A \cap P) + \nu(A \cap N) = |\nu(A \cap P)| - |\nu(A \cap N)|$ and $|\nu|(A) = \nu(A \cap P) - \nu(A \cap N) = |\nu(A \cap P)| + |\nu(A \cap N)|$.
Therefore $\nu(A) \in \mathbb{R}$ iff $\nu(A \cap P) \in \mathbb{R}$ and $\nu(A \cap N) \in \mathbb{R}$ iff $|\nu|(A) < \infty$. Finally,
$\nu(A) = \nu(A \cap P) + \nu(A \cap N) = |\nu|(A \cap P) - |\nu|(A \cap N) = \int_A (1_P - 1_N)\, d|\nu|$
which shows that $d\nu = g\, d|\nu|$.

Definition 15.24. Let $\nu$ be a signed measure on $(X, \mathcal{M})$, let
$L^1(\nu) := L^1(\nu_+) \cap L^1(\nu_-) = L^1(|\nu|)$
and for $f \in L^1(\nu)$ we define
$\int_X f\, d\nu = \int_X f\, d\nu_+ - \int_X f\, d\nu_-$.
Lemma 15.25. Let $\mu$ be a positive measure on $(X, \mathcal{M})$, $g$ be an extended integrable function on $(X, \mathcal{M}, \mu)$ and $d\nu = g\, d\mu$. Then $L^1(\nu) = L^1(|g|\, d\mu)$ and for $f \in L^1(\nu)$,
$\int_X f\, d\nu = \int_X f g\, d\mu$.

Proof. We have already seen that $d\nu_+ = g_+\, d\mu$, $d\nu_- = g_-\, d\mu$, and $d|\nu| = |g|\, d\mu$ so that $L^1(\nu) = L^1(|\nu|) = L^1(|g|\, d\mu)$ and for $f \in L^1(\nu)$,
$\int_X f\, d\nu = \int_X f\, d\nu_+ - \int_X f\, d\nu_- = \int_X f g_+\, d\mu - \int_X f g_-\, d\mu = \int_X f (g_+ - g_-)\, d\mu = \int_X f g\, d\mu$.
Lemma 15.26. Suppose that $\mu$ is a positive measure on $(X, \mathcal{M})$ and $g : X \to \mathbb{R}$ is an extended integrable function. If $\nu$ is the signed measure $d\nu = g\, d\mu$, then $d\nu_\pm = g_\pm\, d\mu$ and $d|\nu| = |g|\, d\mu$. We also have
(15.12) $|\nu|(A) = \sup\big\{ \int_A f\, d\nu : |f| \le 1 \big\}$ for all $A \in \mathcal{M}$.

Proof. The pair $P = \{g > 0\}$ and $N = \{g \le 0\} = P^c$ is a Hahn decomposition for $\nu$. Therefore
$\nu_+(A) = \nu(A \cap P) = \int_{A \cap P} g\, d\mu = \int_A 1_{\{g>0\}}\, g\, d\mu = \int_A g_+\, d\mu$,
$\nu_-(A) = -\nu(A \cap N) = -\int_{A \cap N} g\, d\mu = -\int_A 1_{\{g\le0\}}\, g\, d\mu = \int_A g_-\, d\mu$
and
$|\nu|(A) = \nu_+(A) + \nu_-(A) = \int_A g_+\, d\mu + \int_A g_-\, d\mu = \int_A (g_+ + g_-)\, d\mu = \int_A |g|\, d\mu$.

If $A \in \mathcal{M}$ and $|f| \le 1$, then
$\big| \int_A f\, d\nu \big| = \big| \int_A f\, d\nu_+ - \int_A f\, d\nu_- \big| \le \big| \int_A f\, d\nu_+ \big| + \big| \int_A f\, d\nu_- \big| \le \int_A |f|\, d\nu_+ + \int_A |f|\, d\nu_- = \int_A |f|\, d|\nu| \le |\nu|(A)$.
For the reverse inequality, let $f \equiv 1_P - 1_N$; then
$\int_A f\, d\nu = \nu(A \cap P) - \nu(A \cap N) = \nu_+(A) + \nu_-(A) = |\nu|(A)$.
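For a measure with a density on a finite set the conclusions of Lemma 15.26 can be checked directly. The following sketch (not from the notes; hypothetical weights and density, plain Python) verifies the Jordan decomposition, the identity $d|\nu| = |g|\,d\mu$, and that the supremum in Eq. (15.12) is attained at $f = 1_P - 1_N$.

```python
# Discrete check of Lemma 15.26: nu(A) = sum_{x in A} g(x) mu({x}).
mu = {0: 0.5, 1: 1.0, 2: 0.25, 3: 2.0}            # hypothetical positive weights
g  = {0: 3.0, 1: -1.0, 2: 0.0, 3: -0.5}            # hypothetical density

def nu(A):       return sum(g[x] * mu[x] for x in A)
def nu_plus(A):  return sum(max(g[x], 0.0) * mu[x] for x in A)
def nu_minus(A): return sum(max(-g[x], 0.0) * mu[x] for x in A)
def tv(A):       return sum(abs(g[x]) * mu[x] for x in A)       # |nu|(A)

A = {0, 1, 3}
assert abs(nu(A) - (nu_plus(A) - nu_minus(A))) < 1e-12          # nu = nu_+ - nu_-
assert abs(tv(A) - (nu_plus(A) + nu_minus(A))) < 1e-12          # |nu| = nu_+ + nu_-
# Eq. (15.12): the sup over |f| <= 1 of int_A f dnu is attained at f = 1_P - 1_N.
f_opt = {x: (1.0 if g[x] > 0 else -1.0) for x in g}
assert abs(sum(f_opt[x] * g[x] * mu[x] for x in A) - tv(A)) < 1e-12
print("Lemma 15.26 checks passed")
```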
Lemma 15.27. Suppose $\nu$ is a signed measure, $\mu$ is a positive measure and $\nu = \nu_a + \nu_s$ is a Lebesgue decomposition of $\nu$ relative to $\mu$, then $|\nu| = |\nu_a| + |\nu_s|$.

Proof. Let $A \in \mathcal{M}$ be chosen so that $A$ is a null set for $\nu_a$ and $A^c$ is a null set for $\nu_s$. Let $A = P_0 \coprod N_0$ be a Hahn decomposition of $\nu_s|_{\mathcal{M}_A}$ and $A^c = \tilde{P} \coprod \tilde{N}$ be a Hahn decomposition of $\nu_a|_{\mathcal{M}_{A^c}}$. Let $P = P_0 \cup \tilde{P}$ and $N = N_0 \cup \tilde{N}$. Since for $C \in \mathcal{M}$,
$\nu(C \cap P) = \nu(C \cap P_0) + \nu(C \cap \tilde{P}) = \nu_s(C \cap P_0) + \nu_a(C \cap \tilde{P}) \ge 0$
and
$\nu(C \cap N) = \nu(C \cap N_0) + \nu(C \cap \tilde{N}) = \nu_s(C \cap N_0) + \nu_a(C \cap \tilde{N}) \le 0$
we see that $\{P, N\}$ is a Hahn decomposition for $\nu$. It is also easy to see that $\{P, N\}$ is a Hahn decomposition for both $\nu_s$ and $\nu_a$ as well. Therefore,
$|\nu|(C) = \nu(C \cap P) - \nu(C \cap N) = \nu_s(C \cap P) - \nu_s(C \cap N) + \nu_a(C \cap P) - \nu_a(C \cap N) = |\nu_s|(C) + |\nu_a|(C)$.
Lemma 15.28. 1) Let $\nu$ be a signed measure and $\mu$ be a positive measure on $(X, \mathcal{M})$ such that $\nu \ll \mu$ and $\nu \perp \mu$, then $\nu \equiv 0$. 2) Suppose that $\nu = \sum_{i=1}^{\infty} \nu_i$ where $\nu_i$ are positive measures on $(X, \mathcal{M})$ such that $\nu_i \ll \mu$, then $\nu \ll \mu$. Also if $\nu_1$ and $\nu_2$ are two signed measures such that $\nu_i \ll \mu$ for $i = 1, 2$ and $\nu = \nu_1 + \nu_2$ is well defined, then $\nu \ll \mu$.

Proof. (1) Because $\nu \perp \mu$, there exists $A \in \mathcal{M}$ such that $A$ is a $\nu$-null set and $B = A^c$ is a $\mu$-null set. Since $B$ is $\mu$-null and $\nu \ll \mu$, $B$ is also $\nu$-null. This shows by Lemma 15.17 that $X = A \cup B$ is also $\nu$-null, i.e. $\nu$ is the zero measure. The proof of (2) is easy and is left to the reader.
Theorem 15.29 (Radon-Nikodym Theorem for Signed Measures). Let $\nu$ be a $\sigma$-finite signed measure and $\mu$ be a $\sigma$-finite positive measure on $(X, \mathcal{M})$. Then $\nu$ has a unique Lebesgue decomposition $\nu = \nu_a + \nu_s$ relative to $\mu$ and there exists a unique (modulo sets of $\mu$-measure $0$) extended integrable function $\rho : X \to \mathbb{R}$ such that $d\nu_a = \rho\, d\mu$. Moreover, $\nu_s = 0$ iff $\nu \ll \mu$, i.e. $d\nu = \rho\, d\mu$ iff $\nu \ll \mu$.

Proof. Uniqueness is a direct consequence of Lemmas 15.10 and 15.11.
Existence. Let $\nu = \nu_+ - \nu_-$ be the Jordan decomposition of $\nu$. Assume, without loss of generality, that $\nu_+(X) < \infty$, i.e. $\nu(A) < \infty$ for all $A \in \mathcal{M}$. By the Radon-Nikodym Theorem 15.13 for positive measures there exist functions $f_\pm : X \to [0, \infty)$ and measures $\lambda_\pm$ such that $\nu_\pm = \mu_{f_\pm} + \lambda_\pm$ with $\lambda_\pm \perp \mu$. Since
$\infty > \nu_+(X) = \mu_{f_+}(X) + \lambda_+(X)$,
$f_+ \in L^1(\mu)$ and $\lambda_+(X) < \infty$ so that $f = f_+ - f_-$ is an extended integrable function, $d\nu_a := f\, d\mu$ and $\nu_s = \lambda_+ - \lambda_-$ are signed measures. This finishes the existence proof since
$\nu = \nu_+ - \nu_- = \mu_{f_+} + \lambda_+ - (\mu_{f_-} + \lambda_-) = \nu_a + \nu_s$
and $\nu_s = (\lambda_+ - \lambda_-) \perp \mu$ by Remark 15.8.
For the final statement, if $\nu_s = 0$, then $d\nu = \rho\, d\mu$ and hence $\nu \ll \mu$. Conversely if $\nu \ll \mu$, then $d\nu_s = d\nu - \rho\, d\mu \ll \mu$, so by Lemma 15.28, $\nu_s = 0$. Alternatively just use the uniqueness of the Lebesgue decomposition to conclude $\nu_a = \nu$ and $\nu_s = 0$. Or more directly, choose $B \in \mathcal{M}$ such that $\mu(B^c) = 0$ and $B$ is a $\nu_s$-null set. Since $\nu \ll \mu$, $B^c$ is also a $\nu$-null set so that, for $A \in \mathcal{M}$,
$\nu(A) = \nu(A \cap B) = \nu_a(A \cap B) + \nu_s(A \cap B) = \nu_a(A \cap B)$.

Notation 15.30. The function $f$ is called the Radon-Nikodym derivative of $\nu$ relative to $\mu$ and we will denote this function by $\frac{d\nu}{d\mu}$.
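On a finite set the Lebesgue decomposition of Theorem 15.29 is computed by splitting $\nu$ between the atoms that $\mu$ charges and those it does not. The sketch below (not from the notes; hypothetical point masses, plain Python) illustrates this: $\nu_a$ has density $\rho = d\nu_a/d\mu$ where $\mu > 0$ and $\nu_s$ is carried by $\{\mu = 0\}$, hence $\nu_s \perp \mu$.

```python
# Lebesgue decomposition nu = nu_a + nu_s relative to mu on the finite set {0,...,4}.
mu = {0: 1.0, 1: 0.5, 2: 0.0, 3: 2.0, 4: 0.0}      # hypothetical positive measure
nu = {0: -0.3, 1: 1.2, 2: 0.7, 3: 0.0, 4: -2.0}    # hypothetical signed measure

rho  = {x: (nu[x] / mu[x] if mu[x] > 0 else 0.0) for x in mu}   # Radon-Nikodym derivative
nu_a = {x: rho[x] * mu[x] for x in mu}                          # absolutely continuous part
nu_s = {x: nu[x] - nu_a[x] for x in mu}                         # singular part

assert all(abs(nu[x] - (nu_a[x] + nu_s[x])) < 1e-12 for x in mu)
assert all(abs(nu_s[x]) < 1e-12 for x in mu if mu[x] > 0)   # nu_s lives on {mu = 0}, so nu_s _|_ mu
print("rho =", rho)
```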
15.3. Complex Measures II. Suppose that $\nu$ is a complex measure on $(X, \mathcal{M})$, let $\nu_r := \operatorname{Re}\nu$, $\nu_i := \operatorname{Im}\nu$ and $\mu := |\nu_r| + |\nu_i|$. Then $\mu$ is a finite positive measure on $\mathcal{M}$ such that $\nu_r \ll \mu$ and $\nu_i \ll \mu$. By the Radon-Nikodym Theorem 15.29, there exist real functions $h, k \in L^1(\mu)$ such that $d\nu_r = h\, d\mu$ and $d\nu_i = k\, d\mu$. So letting $g := h + ik \in L^1(\mu)$,
$d\nu = (h + ik)\, d\mu = g\, d\mu$,
showing every complex measure may be written as in Eq. (15.1).
Lemma 15.31. Suppose that $\nu$ is a complex measure on $(X, \mathcal{M})$, and for $i = 1, 2$ let $\mu_i$ be a finite positive measure on $(X, \mathcal{M})$ such that $d\nu = g_i\, d\mu_i$ with $g_i \in L^1(\mu_i)$. Then
$\int_A |g_1|\, d\mu_1 = \int_A |g_2|\, d\mu_2$ for all $A \in \mathcal{M}$.
In particular, we may define a positive measure $|\nu|$ on $(X, \mathcal{M})$ by
$|\nu|(A) = \int_A |g_1|\, d\mu_1$ for all $A \in \mathcal{M}$.
The finite positive measure $|\nu|$ is called the total variation measure of $\nu$.

Proof. Let $\lambda = \mu_1 + \mu_2$ so that $\mu_i \ll \lambda$. Let $\rho_i = d\mu_i/d\lambda \ge 0$ and $h_i = \rho_i g_i$. Since
$\nu(A) = \int_A g_i\, d\mu_i = \int_A g_i \rho_i\, d\lambda = \int_A h_i\, d\lambda$ for all $A \in \mathcal{M}$,
$h_1 = h_2$, $\lambda$-a.e. Therefore
$\int_A |g_1|\, d\mu_1 = \int_A |g_1| \rho_1\, d\lambda = \int_A |h_1|\, d\lambda = \int_A |h_2|\, d\lambda = \int_A |g_2| \rho_2\, d\lambda = \int_A |g_2|\, d\mu_2$.
Definition 15.32. Given a complex measure $\nu$, let $\nu_r = \operatorname{Re}\nu$ and $\nu_i = \operatorname{Im}\nu$ so that $\nu_r$ and $\nu_i$ are finite signed measures such that
$\nu(A) = \nu_r(A) + i\,\nu_i(A)$ for all $A \in \mathcal{M}$.
Let $L^1(\nu) := L^1(\nu_r) \cap L^1(\nu_i)$ and for $f \in L^1(\nu)$ define
$\int_X f\, d\nu := \int_X f\, d\nu_r + i \int_X f\, d\nu_i$.
Example 15.33. Suppose that $\mu$ is a positive measure on $(X, \mathcal{M})$, $g \in L^1(\mu)$ and $\nu(A) = \int_A g\, d\mu$ as in Example 15.4, then $L^1(\nu) = L^1(|g|\, d\mu)$ and for $f \in L^1(\nu)$
(15.13) $\int_X f\, d\nu = \int_X f g\, d\mu$.

To check Eq. (15.13), notice that $d\nu_r = \operatorname{Re}g\, d\mu$ and $d\nu_i = \operatorname{Im}g\, d\mu$ so that (using Lemma 15.25)
$L^1(\nu) = L^1(\operatorname{Re}g\, d\mu) \cap L^1(\operatorname{Im}g\, d\mu) = L^1(|\operatorname{Re}g|\, d\mu) \cap L^1(|\operatorname{Im}g|\, d\mu) = L^1(|g|\, d\mu)$.
If $f \in L^1(\nu)$, then
$\int_X f\, d\nu := \int_X f\, \operatorname{Re}g\, d\mu + i \int_X f\, \operatorname{Im}g\, d\mu = \int_X f g\, d\mu$.
Remark 15.34. Suppose that $\nu$ is a complex measure on $(X, \mathcal{M})$ such that $d\nu = g\, d\mu$ and, as above, $d|\nu| = |g|\, d\mu$. Letting
$\rho = \operatorname{sgn}(g) := \begin{cases} g/|g| & \text{if } |g| \ne 0 \\ 1 & \text{if } |g| = 0 \end{cases}$
we see that
$d\nu = g\, d\mu = \rho |g|\, d\mu = \rho\, d|\nu|$
and $|\rho| = 1$ and $\rho$ is uniquely defined modulo $|\nu|$-null sets. We will denote $\rho$ by $d\nu/d|\nu|$. With this notation, it follows from Example 15.33 that $L^1(\nu) := L^1(|\nu|)$ and for $f \in L^1(\nu)$,
$\int_X f\, d\nu = \int_X f \frac{d\nu}{d|\nu|}\, d|\nu|$.
Proposition 15.35 (Total Variation). Suppose $\mathcal{A} \subset \mathcal{P}(X)$ is an algebra, $\mathcal{M} = \sigma(\mathcal{A})$, $\nu$ is a complex measure (or a signed measure which is $\sigma$-finite on $\mathcal{A}$) on $(X, \mathcal{M})$ and for $E \in \mathcal{M}$ let
$\nu_0(E) = \sup\big\{ \sum_{j=1}^{n} |\nu(E_j)| : E_j \in \mathcal{A}_E \ni E_i \cap E_j = \delta_{ij} E_i,\ n = 1, 2, \dots \big\}$
$\nu_1(E) = \sup\big\{ \sum_{j=1}^{n} |\nu(E_j)| : E_j \in \mathcal{M}_E \ni E_i \cap E_j = \delta_{ij} E_i,\ n = 1, 2, \dots \big\}$
$\nu_2(E) = \sup\big\{ \sum_{j=1}^{\infty} |\nu(E_j)| : E_j \in \mathcal{M}_E \ni E_i \cap E_j = \delta_{ij} E_i \big\}$
$\nu_3(E) = \sup\big\{ \big| \int_E f\, d\nu \big| : f \text{ is measurable with } |f| \le 1 \big\}$
$\nu_4(E) = \sup\big\{ \big| \int_E f\, d\nu \big| : f \in \mathcal{S}_f(\mathcal{A}, |\nu|) \text{ with } |f| \le 1 \big\}$.
Then $\nu_0 = \nu_1 = \nu_2 = \nu_3 = \nu_4 = |\nu|$.

Proof. Let $\rho = d\nu/d|\nu|$ and recall that $|\rho| = 1$, $|\nu|$-a.e. We will start by showing $|\nu| = \nu_3 = \nu_4$. If $f$ is measurable with $|f| \le 1$ then
$\big| \int_E f\, d\nu \big| = \big| \int_E f \rho\, d|\nu| \big| \le \int_E |f|\, d|\nu| \le \int_E 1\, d|\nu| = |\nu|(E)$
from which we conclude that $\nu_4 \le \nu_3 \le |\nu|$. Taking $f = \bar{\rho}$ above shows
$\big| \int_E f\, d\nu \big| = \int_E \bar{\rho}\rho\, d|\nu| = \int_E 1\, d|\nu| = |\nu|(E)$
which shows that $|\nu| \le \nu_3$ and hence $|\nu| = \nu_3$. To show $|\nu| = \nu_4$ as well, let $X_m \in \mathcal{A}$ be chosen so that $|\nu|(X_m) < \infty$ and $X_m \uparrow X$ as $m \to \infty$. By Theorem 11.3 or Corollary 13.27, there exist $\rho_n \in \mathcal{S}_f(\mathcal{A}, \mu)$ such that $\rho_n \to \rho 1_{X_m}$ in $L^1(|\nu|)$ and each $\rho_n$ may be written in the form
(15.14) $\rho_n = \sum_{k=1}^{N} z_k 1_{A_k}$
where $z_k \in \mathbb{C}$ and $A_k \in \mathcal{A}$ and $A_k \cap A_j = \emptyset$ if $k \ne j$. I claim that we may assume that $|z_k| \le 1$ in Eq. (15.14), for if $|z_k| > 1$ and $x \in A_k$,
$|\rho(x) - z_k| \ge \big| \rho(x) - |z_k|^{-1} z_k \big|$.
This is evident from Figure 31 and formally follows from the fact that
$\frac{d}{dt} \big| \rho(x) - t |z_k|^{-1} z_k \big|^2 = 2\big[ t - \operatorname{Re}(|z_k|^{-1} z_k \bar{\rho}(x)) \big] \ge 0$
when $t \ge 1$.
[Figure 31. Sliding points to the unit circle.]
Therefore if we define
$w_k := \begin{cases} |z_k|^{-1} z_k & \text{if } |z_k| > 1 \\ z_k & \text{if } |z_k| \le 1 \end{cases}$
and $\tilde{\rho}_n = \sum_{k=1}^{N} w_k 1_{A_k}$ then
$|\rho(x) - \tilde{\rho}_n(x)| \le |\rho(x) - \rho_n(x)|$
and therefore $\tilde{\rho}_n \to \rho 1_{X_m}$ in $L^1(|\nu|)$. So we now assume that $\rho_n$ is as in Eq. (15.14) with $|z_k| \le 1$.
Now
$\big| \int_E \bar{\rho}_n\, d\nu - \int_E 1_{X_m} \bar{\rho}\, d\nu \big| \le \int_E \big| (\bar{\rho}_n - 1_{X_m} \bar{\rho}) \rho \big|\, d|\nu| \le \int_E |\rho_n - 1_{X_m} \rho|\, d|\nu| \to 0$ as $n \to \infty$,
and hence
$\nu_4(E) \ge \big| \int_E 1_{X_m} \bar{\rho}\, d\nu \big| = |\nu|(E \cap X_m)$ for all $m$.
Letting $m \to \infty$ in this equation shows $\nu_4 \ge |\nu|$.

We will now show $\nu_0 = \nu_1 = \nu_2 = |\nu|$. Clearly $\nu_0 \le \nu_1 \le \nu_2$. Suppose $E_j \in \mathcal{M}_E$ such that $E_i \cap E_j = \delta_{ij} E_i$, then
$\sum |\nu(E_j)| = \sum \big| \int_{E_j} \rho\, d|\nu| \big| \le \sum |\nu|(E_j) = |\nu|(\cup E_j) \le |\nu|(E)$
which shows that $\nu_2 \le |\nu| = \nu_4$. So it suffices to show $\nu_4 \le \nu_0$. But if $f \in \mathcal{S}_f(\mathcal{A}, |\nu|)$ with $|f| \le 1$, then $f$ may be expressed as $f = \sum_{k=1}^{N} z_k 1_{A_k}$ with $|z_k| \le 1$ and $A_k \cap A_j = \delta_{ij} A_k$. Therefore,
$\big| \int_E f\, d\nu \big| = \big| \sum_{k=1}^{N} z_k \nu(A_k \cap E) \big| \le \sum_{k=1}^{N} |z_k|\, |\nu(A_k \cap E)| \le \sum_{k=1}^{N} |\nu(A_k \cap E)| \le \nu_0(E)$.
Since this equation holds for all $f \in \mathcal{S}_f(\mathcal{A}, |\nu|)$ with $|f| \le 1$, $\nu_4 \le \nu_0$ as claimed.
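For a purely atomic complex measure the supremum $\nu_1(E)$ over finite measurable partitions is attained by the partition into singletons, and equals $|\nu|(E) = \sum_{x \in E} |\nu(\{x\})|$. The brute-force sketch below (not from the notes; hypothetical complex point masses, plain Python) checks this instance of Proposition 15.35.

```python
# Check that sup over finite partitions of E of sum_j |nu(E_j)| equals |nu|(E)
# for a discrete complex measure with point masses w[x].
from itertools import product

w = {0: 1 + 1j, 1: -2.0, 2: 0.5j, 3: -1 - 0.5j}     # hypothetical complex point masses
E = list(w)

def nu(A): return sum(w[x] for x in A)

best, n_blocks = 0.0, len(E)
for labels in product(range(n_blocks), repeat=len(E)):   # every way to split E into blocks
    blocks = [[x for x, l in zip(E, labels) if l == b] for b in range(n_blocks)]
    best = max(best, sum(abs(nu(B)) for B in blocks if B))

tv = sum(abs(w[x]) for x in E)    # |nu|(E) for an atomic measure
print(best, tv)                   # the sup is attained by the partition into singletons
assert abs(best - tv) < 1e-12
```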
Theorem 15.36 (Radon-Nikodym Theorem for Complex Measures). Let $\nu$ be a complex measure and $\mu$ be a $\sigma$-finite positive measure on $(X, \mathcal{M})$. Then $\nu$ has a unique Lebesgue decomposition $\nu = \nu_a + \nu_s$ relative to $\mu$ and there exists a unique element $\rho \in L^1(\mu)$ such that $d\nu_a = \rho\, d\mu$. Moreover, $\nu_s = 0$ iff $\nu \ll \mu$, i.e. $d\nu = \rho\, d\mu$ iff $\nu \ll \mu$.

Proof. Uniqueness is a direct consequence of Lemmas 15.10 and 15.11.
Existence. Let $g : X \to S^1 \subset \mathbb{C}$ be a function such that $d\nu = g\, d|\nu|$. By Theorem 15.13, there exist $h \in L^1(\mu)$ and a positive measure $|\nu|_s$ such that $|\nu|_s \perp \mu$ and $d|\nu| = h\, d\mu + d|\nu|_s$. Hence we have $d\nu = \rho\, d\mu + d\nu_s$ with $\rho := gh \in L^1(\mu)$ and $d\nu_s := g\, d|\nu|_s$. This finishes the proof since, as is easily verified, $\nu_s \perp \mu$.
15.4. Absolute Continuity on an Algebra. The following results will be useful in Section 16.4 below.

Lemma 15.37. Let $\nu$ be a complex or a signed measure on $(X, \mathcal{M})$. Then $A \in \mathcal{M}$ is a $\nu$-null set iff $|\nu|(A) = 0$. In particular if $\mu$ is a positive measure on $(X, \mathcal{M})$, $\nu \ll \mu$ iff $|\nu| \ll \mu$.

Proof. In all cases we have $|\nu(A)| \le |\nu|(A)$ for all $A \in \mathcal{M}$, which clearly shows that $|\nu|(A) = 0$ implies $A$ is a $\nu$-null set. Conversely if $A$ is a $\nu$-null set, then, by definition, $\nu|_{\mathcal{M}_A} \equiv 0$ so by Proposition 15.35
$|\nu|(A) = \sup\big\{ \sum_{j=1}^{\infty} |\nu(E_j)| : E_j \in \mathcal{M}_A \ni E_i \cap E_j = \delta_{ij} E_i \big\} = 0$,
since $E_j \subset A$ implies that $E_j$ is a $\nu$-null set and hence $\nu(E_j) = 0$.

Alternate proofs that $A$ being $\nu$-null implies $|\nu|(A) = 0$:
1) Suppose $\nu$ is a signed measure and $\{P, N = P^c\} \subset \mathcal{M}$ is a Hahn decomposition for $\nu$. Then
$|\nu|(A) = \nu(A \cap P) - \nu(A \cap N) = 0$.
Now suppose that $\nu$ is a complex measure. Then $A$ is a null set for both $\nu_r := \operatorname{Re}\nu$ and $\nu_i := \operatorname{Im}\nu$. Therefore $|\nu|(A) \le |\nu_r|(A) + |\nu_i|(A) = 0$.
2) Here is another proof in the complex case. Let $\rho = \frac{d\nu}{d|\nu|}$; then by the assumption of $A$ being $\nu$-null,
$0 = \nu(B) = \int_B \rho\, d|\nu|$ for all $B \in \mathcal{M}_A$.
This shows that $\rho 1_A = 0$, $|\nu|$-a.e. and hence
$|\nu|(A) = \int_A |\rho|\, d|\nu| = \int_X 1_A |\rho|\, d|\nu| = 0$.
Theorem 15.38 ($\epsilon$-$\delta$ Definition of Absolute Continuity). Let $\nu$ be a complex measure and $\mu$ be a positive measure on $(X, \mathcal{M})$. Then $\nu \ll \mu$ iff for all $\epsilon > 0$ there exists a $\delta > 0$ such that $|\nu(A)| < \epsilon$ whenever $A \in \mathcal{M}$ and $\mu(A) < \delta$.

Proof. ($\Leftarrow$) If $\mu(A) = 0$ then $|\nu(A)| < \epsilon$ for all $\epsilon > 0$, which shows that $\nu(A) = 0$, i.e. $\nu \ll \mu$.
($\Rightarrow$) Since $\nu \ll \mu$ iff $|\nu| \ll \mu$ and $|\nu(A)| \le |\nu|(A)$ for all $A \in \mathcal{M}$, it suffices to assume $\nu \ge 0$ with $\nu(X) < \infty$. Suppose for the sake of contradiction there exist $\epsilon > 0$ and $A_n \in \mathcal{M}$ such that $\nu(A_n) \ge \epsilon > 0$ while $\mu(A_n) \le \frac{1}{2^n}$. Let
$A = \{A_n \text{ i.o.}\} = \bigcap_{N=1}^{\infty} \bigcup_{n \ge N} A_n$
so that
$\mu(A) = \lim_{N\to\infty} \mu(\cup_{n \ge N} A_n) \le \lim_{N\to\infty} \sum_{n=N}^{\infty} \mu(A_n) \le \lim_{N\to\infty} 2^{-(N-1)} = 0$.
On the other hand,
$\nu(A) = \lim_{N\to\infty} \nu(\cup_{n \ge N} A_n) \ge \liminf_{n\to\infty} \nu(A_n) \ge \epsilon > 0$
showing that $\nu$ is not absolutely continuous relative to $\mu$.
Corollary 15.39. Let $\mu$ be a positive measure on $(X, \mathcal{M})$ and $f \in L^1(d\mu)$. Then for all $\epsilon > 0$ there exists $\delta > 0$ such that $\big| \int_A f\, d\mu \big| < \epsilon$ for all $A \in \mathcal{M}$ such that $\mu(A) < \delta$.

Proof. Apply Theorem 15.38 to the signed measure $\nu(A) = \int_A f\, d\mu$ for all $A \in \mathcal{M}$.
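Corollary 15.39 is easy to visualize for a concrete unbounded $L^1$ density. The following sketch (not from the notes; a standard example, plain Python) takes $f(x) = x^{-1/2}$ on $(0,1]$ and shows numerically how small $\delta$ must be for a given $\epsilon$: the worst sets of measure $\delta$ are the intervals $(0, \delta]$.

```python
# f(x) = x^(-1/2) on (0,1] is unbounded but lies in L^1(m); sup over sets A with
# m(A) = delta of int_A f dm is 2*sqrt(delta), attained on A = (0, delta].
import math

def integral(a, b):          # int_a^b x^(-1/2) dx = 2(sqrt(b) - sqrt(a))
    return 2.0 * (math.sqrt(b) - math.sqrt(a))

for delta in [1e-1, 1e-2, 1e-4, 1e-8]:
    print(f"m(A) = {delta:8.1e}   sup_A int_A f dm = {integral(0.0, delta):.3e}")
# The printed values 2*sqrt(delta) -> 0, so given eps one may take delta = (eps/2)**2.
```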
Theorem 15.40 (Absolute Continuity on an Algebra). Let $\nu$ be a complex measure and $\mu$ be a positive measure on $(X, \mathcal{M})$. Suppose that $\mathcal{A} \subset \mathcal{M}$ is an algebra such that $\sigma(\mathcal{A}) = \mathcal{M}$ and that $\mu$ is $\sigma$-finite on $\mathcal{A}$. Then $\nu \ll \mu$ iff for all $\epsilon > 0$ there exists a $\delta > 0$ such that $|\nu(A)| < \epsilon$ for all $A \in \mathcal{A}$ with $\mu(A) < \delta$.

Proof. ($\Leftarrow$) This implication is a consequence of Theorem 15.38.
($\Rightarrow$) Let us begin by showing that the hypothesis "$|\nu(A)| < \epsilon$ for all $A \in \mathcal{A}$ with $\mu(A) < \delta$" implies $|\nu|(A) \le 4\epsilon$ for all $A \in \mathcal{A}$ with $\mu(A) < \delta$. To prove this decompose $\nu$ into its real and imaginary parts, $\nu = \nu_r + i\nu_i$, and suppose that $A = \coprod_{j=1}^{n} A_j$ with $A_j \in \mathcal{A}$. Then
$\sum_{j=1}^{n} |\nu_r(A_j)| = \sum_{j : \nu_r(A_j) \ge 0} \nu_r(A_j) - \sum_{j : \nu_r(A_j) \le 0} \nu_r(A_j) = \nu_r\big( \cup_{j : \nu_r(A_j) \ge 0} A_j \big) - \nu_r\big( \cup_{j : \nu_r(A_j) \le 0} A_j \big) \le \big| \nu\big( \cup_{j : \nu_r(A_j) \ge 0} A_j \big) \big| + \big| \nu\big( \cup_{j : \nu_r(A_j) \le 0} A_j \big) \big| < 2\epsilon$
using the hypothesis and the facts $\mu\big( \cup_{j : \nu_r(A_j) \ge 0} A_j \big) \le \mu(A) < \delta$ and $\mu\big( \cup_{j : \nu_r(A_j) \le 0} A_j \big) \le \mu(A) < \delta$. Similarly, $\sum_{j=1}^{n} |\nu_i(A_j)| < 2\epsilon$ and therefore
$\sum_{j=1}^{n} |\nu(A_j)| \le \sum_{j=1}^{n} |\nu_r(A_j)| + \sum_{j=1}^{n} |\nu_i(A_j)| < 4\epsilon$.
Using Proposition 15.35, it follows that
$|\nu|(A) = \sup\big\{ \sum_{j=1}^{n} |\nu(A_j)| : A = \coprod_{j=1}^{n} A_j \text{ with } A_j \in \mathcal{A} \text{ and } n \in \mathbb{N} \big\} \le 4\epsilon$.
Because of this argument, we may now replace $\nu$ by $|\nu|$ and hence we may assume that $\nu$ is a positive finite measure.
Let $\epsilon > 0$ and $\delta > 0$ be such that $\nu(A) < \epsilon$ for all $A \in \mathcal{A}$ with $\mu(A) < \delta$. Suppose that $B \in \mathcal{M}$ with $\mu(B) < \delta$. Use the regularity Theorem 8.40 or Corollary 13.27 to find $A \in \mathcal{A}_\sigma$ such that $B \subset A$ and $\mu(B) \le \mu(A) < \delta$. Write $A = \cup_n A_n$ with $A_n \in \mathcal{A}$. By replacing $A_n$ by $\cup_{j=1}^{n} A_j$ if necessary we may assume that $A_n$ is increasing in $n$. Then $\mu(A_n) \le \mu(A) < \delta$ for each $n$ and hence by assumption $\nu(A_n) < \epsilon$. Since $B \subset A = \cup_n A_n$ it follows that $\nu(B) \le \nu(A) = \lim_{n\to\infty} \nu(A_n) \le \epsilon$. Thus we have shown that $\nu(B) \le \epsilon$ for all $B \in \mathcal{M}$ such that $\mu(B) < \delta$.
15.5. Dual Spaces and the Complex Riesz Theorem.

Proposition 15.41. Let $\mathcal{S}$ be a vector lattice of bounded real functions on a set $X$. We equip $\mathcal{S}$ with the sup-norm topology and suppose $I \in \mathcal{S}^*$. Then there exist $I_\pm \in \mathcal{S}^*$ which are positive such that $I = I_+ - I_-$.

Proof. For $f \in \mathcal{S}_+$, let
$I_+(f) := \sup\{ I(g) : g \in \mathcal{S}_+ \text{ and } g \le f \}$.
One easily sees that $|I_+(f)| \le \|I\| \|f\|$ for all $f \in \mathcal{S}_+$ and $I_+(cf) = cI_+(f)$ for all $f \in \mathcal{S}_+$ and $c > 0$. Let $f_1, f_2 \in \mathcal{S}_+$. Then for any $g_i \in \mathcal{S}_+$ such that $g_i \le f_i$, we have $\mathcal{S}_+ \ni g_1 + g_2 \le f_1 + f_2$ and hence
$I(g_1) + I(g_2) = I(g_1 + g_2) \le I_+(f_1 + f_2)$.
Therefore,
(15.15) $I_+(f_1) + I_+(f_2) = \sup\{ I(g_1) + I(g_2) : \mathcal{S}_+ \ni g_i \le f_i \} \le I_+(f_1 + f_2)$.
For the opposite inequality, suppose $g \in \mathcal{S}_+$ and $g \le f_1 + f_2$. Let $g_1 = f_1 \wedge g$, then
$0 \le g_2 := g - g_1 = g - f_1 \wedge g = \begin{cases} 0 & \text{if } g \le f_1 \\ g - f_1 & \text{if } g \ge f_1 \end{cases} \le \begin{cases} 0 & \text{if } g \le f_1 \\ f_1 + f_2 - f_1 & \text{if } g \ge f_1 \end{cases} \le f_2$.
Since $g = g_1 + g_2$ with $\mathcal{S}_+ \ni g_i \le f_i$,
$I(g) = I(g_1) + I(g_2) \le I_+(f_1) + I_+(f_2)$
and since $\mathcal{S}_+ \ni g \le f_1 + f_2$ was arbitrary, we may conclude
(15.16) $I_+(f_1 + f_2) \le I_+(f_1) + I_+(f_2)$.
Combining Eqs. (15.15) and (15.16) shows that
(15.17) $I_+(f_1 + f_2) = I_+(f_1) + I_+(f_2)$ for all $f_i \in \mathcal{S}_+$.
We now extend $I_+$ to $\mathcal{S}$ by defining, for $f \in \mathcal{S}$,
$I_+(f) = I_+(f_+) - I_+(f_-)$
where $f_+ = f \vee 0$ and $f_- = -(f \wedge 0) = (-f) \vee 0$. (Notice that $f = f_+ - f_-$.) We will now show that $I_+$ is linear.
If $c \ge 0$, we may use $(cf)_\pm = cf_\pm$ to conclude that
$I_+(cf) = I_+(cf_+) - I_+(cf_-) = cI_+(f_+) - cI_+(f_-) = cI_+(f)$.
Similarly, using $(-f)_\pm = f_\mp$ it follows that $I_+(-f) = I_+(f_-) - I_+(f_+) = -I_+(f)$. Therefore we have shown
$I_+(cf) = cI_+(f)$ for all $c \in \mathbb{R}$ and $f \in \mathcal{S}$.
If $f = u - v$ with $u, v \in \mathcal{S}_+$ then
$v + f_+ = u + f_- \in \mathcal{S}_+$
and so by Eq. (15.17), $I_+(v) + I_+(f_+) = I_+(u) + I_+(f_-)$ or equivalently
(15.18) $I_+(f) = I_+(f_+) - I_+(f_-) = I_+(u) - I_+(v)$.
Now if $f, g \in \mathcal{S}$, then
$I_+(f + g) = I_+(f_+ + g_+ - (f_- + g_-)) = I_+(f_+ + g_+) - I_+(f_- + g_-) = I_+(f_+) + I_+(g_+) - I_+(f_-) - I_+(g_-) = I_+(f) + I_+(g)$,
wherein the second equality we used Eq. (15.18).
The last two paragraphs show $I_+ : \mathcal{S} \to \mathbb{R}$ is linear. Moreover,
$|I_+(f)| = |I_+(f_+) - I_+(f_-)| \le \max(|I_+(f_+)|, |I_+(f_-)|) \le \|I\| \max(\|f_+\|, \|f_-\|) = \|I\| \|f\|$
which shows that $\|I_+\| \le \|I\|$. That is, $I_+$ is a bounded positive linear functional on $\mathcal{S}$. Let $I_- = I_+ - I \in \mathcal{S}^*$. Then by definition of $I_+(f)$, $I_-(f) = I_+(f) - I(f) \ge 0$ for all $\mathcal{S} \ni f \ge 0$. Therefore $I = I_+ - I_-$ with $I_\pm$ being positive linear functionals on $\mathcal{S}$.
Corollary 15.42. Suppose $X$ is a second countable locally compact Hausdorff space and $I \in C_0(X, \mathbb{R})^*$, then there exists $\mu = \mu_+ - \mu_-$ where $\mu$ is a finite signed measure on $\mathcal{B}_X$ such that $I(f) = \int_X f\, d\mu$ for all $f \in C_0(X, \mathbb{R})$. Similarly if $I \in C_0(X, \mathbb{C})^*$ there exists a complex measure $\mu$ such that $I(f) = \int_X f\, d\mu$ for all $f \in C_0(X, \mathbb{C})$.
TODO Add in the isometry statement here.

Proof. Let $I = I_+ - I_-$ be the decomposition given as above. Then we know there exist finite measures $\mu_\pm$ such that
$I_\pm(f) = \int_X f\, d\mu_\pm$ for all $f \in C_0(X, \mathbb{R})$,
and therefore $I(f) = \int_X f\, d\mu$ for all $f \in C_0(X, \mathbb{R})$ where $\mu = \mu_+ - \mu_-$. Moreover the measure $\mu$ is unique. Indeed if $I(f) = \int_X f\, d\mu$ for some finite signed measure $\mu$, then the next result shows that $I_\pm(f) = \int_X f\, d\mu_\pm$ where $\mu_\pm$ is the Jordan decomposition of $\mu$. Now the measures $\mu_\pm$ are uniquely determined by $I_\pm$. The complex case is a consequence of applying the real case just proved to $\operatorname{Re}I$ and $\operatorname{Im}I$.
Proposition 15.43. Suppose that $\mu$ is a signed Radon measure and $I = I_\mu$. Let $\mu_+$ and $\mu_-$ be the Radon measures associated to $I_\pm$; then $\mu = \mu_+ - \mu_-$ is the Jordan decomposition of $\mu$.

Proof. Let $X = P \cup P^c$ where $P$ is a positive set for $\mu$ and $P^c$ is a negative set. Then for $A \in \mathcal{B}_X$,
(15.19) $\mu(P \cap A) = \mu_+(P \cap A) - \mu_-(P \cap A) \le \mu_+(P \cap A) \le \mu_+(A)$.
To finish the proof we need only prove the reverse inequality. To this end let $\epsilon > 0$ and choose $K \sqsubset\sqsubset P \cap A \subset U \subset_o X$ such that $|\mu|(U \setminus K) < \epsilon$. Let $f, g \in C_c(U, [0,1])$ with $f \le g$, then
$I(f) = \mu(f) = \mu(f : K) + \mu(f : U \setminus K) \le \mu(g : K) + O(\epsilon) \le \mu(K) + O(\epsilon) \le \mu(P \cap A) + O(\epsilon)$.
Taking the supremum over all such $f \le g$, we learn that $I_+(g) \le \mu(P \cap A) + O(\epsilon)$ and then taking the supremum over all such $g$ shows that
$\mu_+(U) \le \mu(P \cap A) + O(\epsilon)$.
Taking the infimum over all $U \subset_o X$ such that $P \cap A \subset U$ shows that
(15.20) $\mu_+(P \cap A) \le \mu(P \cap A) + O(\epsilon)$.
From Eqs. (15.19) and (15.20) it follows that $\mu(P \cap A) = \mu_+(P \cap A)$. Since
$I_-(f) = \sup_{0 \le g \le f} I(g) - I(f) = \sup_{0 \le g \le f} I(g - f) = \sup_{0 \le g \le f} -I(f - g) = \sup_{0 \le h \le f} -I(h)$
the same argument applied to $-I$ shows that
$-\mu(P^c \cap A) = \mu_-(P^c \cap A)$.
Since
$\mu(A) = \mu(P \cap A) + \mu(P^c \cap A) = \mu_+(P \cap A) - \mu_-(P^c \cap A)$ and $\mu(A) = \mu_+(A) - \mu_-(A)$
it follows that
$\mu_+(A \setminus P) = \mu_-(A \setminus P^c) = \mu_-(A \cap P)$.
Taking $A = P$ then shows that $\mu_-(P) = 0$ and taking $A = P^c$ shows that $\mu_+(P^c) = 0$ and hence
$\mu(P \cap A) = \mu_+(P \cap A) = \mu_+(A)$ and $-\mu(P^c \cap A) = \mu_-(P^c \cap A) = \mu_-(A)$
as was to be proved.
15.6. Exercises.

Exercise 15.1. Prove Theorem 15.14 for $p \in [1, 2]$ by directly applying the Riesz theorem to $\phi|_{L^2(\mu)}$.

Exercise 15.2. Show $|\nu|$ defined as in Eq. (15.7) is a positive measure. Here is an outline.
(1) Show
(15.21) $|\nu|(A) + |\nu|(B) \le |\nu|(A \cup B)$
when $A, B$ are disjoint sets in $\mathcal{M}$.
(2) If $A = \coprod_{n=1}^{\infty} A_n$ with $A_n \in \mathcal{M}$ then
(15.22) $|\nu|(A) \le \sum_{n=1}^{\infty} |\nu|(A_n)$.
(3) From Eqs. (15.21) and (15.22) it follows that $|\nu|$ is finitely additive, and hence
$|\nu|(A) = \sum_{n=1}^{N} |\nu|(A_n) + |\nu|(\cup_{n > N} A_n) \ge \sum_{n=1}^{N} |\nu|(A_n)$.
Letting $N \to \infty$ in this inequality shows $|\nu|(A) \ge \sum_{n=1}^{\infty} |\nu|(A_n)$, which combined with Eq. (15.22) shows $|\nu|$ is countably additive.

Exercise 15.3. Suppose $\mu_i, \nu_i$ are $\sigma$-finite positive measures on measurable spaces $(X_i, \mathcal{M}_i)$, for $i = 1, 2$. If $\nu_i \ll \mu_i$ for $i = 1, 2$ then $\nu_1 \otimes \nu_2 \ll \mu_1 \otimes \mu_2$ and in fact
$\frac{d(\nu_1 \otimes \nu_2)}{d(\mu_1 \otimes \mu_2)}(x_1, x_2) = \rho_1 \otimes \rho_2(x_1, x_2) := \rho_1(x_1)\rho_2(x_2)$
where $\rho_i := d\nu_i/d\mu_i$ for $i = 1, 2$.

Exercise 15.4. Folland 3.13 on p. 92.

Exercise 15.5. Let $\nu$ be a $\sigma$-finite signed measure, $f \in L^1(|\nu|)$ and define
$\int_X f\, d\nu = \int_X f\, d\nu_+ - \int_X f\, d\nu_-$.
Suppose that $\mu$ is a $\sigma$-finite measure and $\nu \ll \mu$. Show
(15.23) $\int_X f\, d\nu = \int_X f \frac{d\nu}{d\mu}\, d\mu$.

Exercise 15.6. Suppose that $\nu$ is a signed or complex measure on $(X, \mathcal{M})$ and $A_n \in \mathcal{M}$ such that either $A_n \uparrow A$ or $A_n \downarrow A$ with $\nu(A_1) \in \mathbb{R}$; then show $\nu(A) = \lim_{n\to\infty} \nu(A_n)$.

Exercise 15.7. Suppose that $\lambda$ and $\mu$ are positive measures and $\mu(X) < \infty$. Let $\nu := \lambda - \mu$; then show $\lambda \ge \nu_+$ and $\mu \ge \nu_-$.

Exercise 15.8. Folland Exercise 3.5 on p. 88 showing $|\nu_1 + \nu_2| \le |\nu_1| + |\nu_2|$.

Exercise 15.9. Folland Exercise 3.7a on p. 88.

Exercise 15.10. Show Theorem 15.38 may fail if $\nu$ is not finite. (For a hint, see problem 3.10 on p. 92 of Folland.)

Exercise 15.11. Folland 3.14 on p. 92.

Exercise 15.12. Folland 3.15 on p. 92.

Exercise 15.13. Folland 3.20 on p. 94.
16. Lebesgue Differentiation and the Fundamental Theorem of Calculus

Notation 16.1. In this chapter, let $\mathcal{B} = \mathcal{B}_{\mathbb{R}^n}$ denote the Borel $\sigma$-algebra on $\mathbb{R}^n$ and $m$ be Lebesgue measure on $\mathcal{B}$. If $V$ is an open subset of $\mathbb{R}^n$, let $L^1_{loc}(V) := L^1_{loc}(V, m)$ and simply write $L^1_{loc}$ for $L^1_{loc}(\mathbb{R}^n)$. We will also write $|A|$ for $m(A)$ when $A \in \mathcal{B}$.

Definition 16.2. A collection of measurable sets $\{E_r\}_{r>0} \subset \mathcal{B}$ is said to shrink nicely to $x \in \mathbb{R}^n$ if (i) $E_r \subset B_x(r)$ for all $r > 0$ and (ii) there exists $\alpha > 0$ such that $m(E_r) \ge \alpha\, m(B_x(r))$. We will abbreviate this by writing $E_r \downarrow \{x\}$ nicely. (Notice that it is not required that $x \in E_r$ for any $r > 0$.)

The main result of this chapter is the following theorem.

Theorem 16.3. Suppose that $\nu$ is a complex measure on $(\mathbb{R}^n, \mathcal{B})$; then there exist $g \in L^1(\mathbb{R}^n, m)$ and a complex measure $\nu_s$ such that $\nu_s \perp m$, $d\nu = g\, dm + d\nu_s$, and for $m$-a.e. $x$,
(16.1) $g(x) = \lim_{r\downarrow0} \frac{\nu(E_r)}{m(E_r)}$
for any collection of $\{E_r\}_{r>0} \subset \mathcal{B}$ which shrink nicely to $\{x\}$.

Proof. The existence of $g$ and $\nu_s$ such that $\nu_s \perp m$ and $d\nu = g\, dm + d\nu_s$ is a consequence of the Radon-Nikodym Theorem 15.36. Since
$\frac{\nu(E_r)}{m(E_r)} = \frac{1}{m(E_r)} \int_{E_r} g(x)\, dm(x) + \frac{\nu_s(E_r)}{m(E_r)}$,
Eq. (16.1) is a consequence of Theorem 16.13 and Corollary 16.15 below.
The rest of this chapter will be devoted to filling in the details of the proof of this theorem.
16.1. A Covering Lemma and Averaging Operators.

Lemma 16.4 (Covering Lemma). Let $\mathcal{E}$ be a collection of open balls in $\mathbb{R}^n$ and $U = \cup_{B\in\mathcal{E}} B$. If $c < m(U)$, then there exist disjoint balls $B_1, \dots, B_k \in \mathcal{E}$ such that $c < 3^n \sum_{j=1}^{k} m(B_j)$.

Proof. Choose a compact set $K \subset U$ such that $m(K) > c$ and then let $\mathcal{E}_1 \subset \mathcal{E}$ be a finite subcover of $K$. Choose $B_1 \in \mathcal{E}_1$ to be a ball with largest diameter in $\mathcal{E}_1$. Let $\mathcal{E}_2 = \{A \in \mathcal{E}_1 : A \cap B_1 = \emptyset\}$. If $\mathcal{E}_2$ is not empty, choose $B_2 \in \mathcal{E}_2$ to be a ball with largest diameter in $\mathcal{E}_2$. Similarly let $\mathcal{E}_3 = \{A \in \mathcal{E}_2 : A \cap B_2 = \emptyset\}$ and if $\mathcal{E}_3$ is not empty, choose $B_3 \in \mathcal{E}_3$ to be a ball with largest diameter in $\mathcal{E}_3$. Continue choosing $B_i \in \mathcal{E}$ for $i = 1, 2, \dots, k$ this way until $\mathcal{E}_{k+1}$ is empty, see Figure 32 below.
If $B = B(x_0, r) \subset \mathbb{R}^n$, let $B^* = B(x_0, 3r) \subset \mathbb{R}^n$, that is, $B^*$ is the ball concentric with $B$ which has three times the radius of $B$. We will now show $K \subset \cup_{i=1}^{k} B_i^*$. For each $A \in \mathcal{E}_1$ there exists a first $i$ such that $B_i \cap A \ne \emptyset$. In this case $\operatorname{diam}(A) \le \operatorname{diam}(B_i)$ and $A \subset B_i^*$. Therefore $A \subset \cup_{i=1}^{k} B_i^*$ and hence $K \subset \cup\{A : A \in \mathcal{E}_1\} \subset \cup_{i=1}^{k} B_i^*$. Hence by subadditivity,
$c < m(K) \le \sum_{i=1}^{k} m(B_i^*) \le 3^n \sum_{i=1}^{k} m(B_i)$.
[Figure 32. Picking out the large disjoint balls.]

Definition 16.5. For $f \in L^1_{loc}$, $x \in \mathbb{R}^n$ and $r > 0$ let
(16.2) $(A_r f)(x) = \frac{1}{|B_x(r)|} \int_{B_x(r)} f\, dm$
where $B_x(r) = B(x, r) \subset \mathbb{R}^n$, and $|A| := m(A)$.

Lemma 16.6. Let $f \in L^1_{loc}$; then for each $x \in \mathbb{R}^n$, $(0, \infty) \ni r \to (A_r f)(x)$ is continuous and for each $r > 0$, $\mathbb{R}^n \ni x \to (A_r f)(x)$ is measurable.

Proof. Recall that $|B_x(r)| = m(E_1) r^n$ which is continuous in $r$. Also $\lim_{r\to r_0} 1_{B_x(r)}(y) = 1_{B_x(r_0)}(y)$ if $|y - x| \ne r_0$ and since $m(\{y : |y - x| = r_0\}) = 0$ (you prove!), $\lim_{r\to r_0} 1_{B_x(r)}(y) = 1_{B_x(r_0)}(y)$ for $m$-a.e. $y$. So by the dominated convergence theorem,
$\lim_{r\to r_0} \int_{B_x(r)} f\, dm = \int_{B_x(r_0)} f\, dm$
and therefore
$(A_r f)(x) = \frac{1}{m(E_1) r^n} \int_{B_x(r)} f\, dm$
is continuous in $r$. Let $g_r(x, y) := 1_{B_x(r)}(y) = 1_{|x-y|<r}$. Then $g_r$ is $\mathcal{B} \otimes \mathcal{B}$-measurable (for example write it as a limit of continuous functions or just notice that $F : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ defined by $F(x, y) := |x - y|$ is continuous) and so by Fubini's theorem
$x \to \int_{B_x(r)} f\, dm = \int_{B_x(r)} g_r(x, y) f(y)\, dm(y)$
is $\mathcal{B}$-measurable and hence so is $x \to (A_r f)(x)$.
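The greedy selection in the proof of Lemma 16.4 is easy to run by hand in dimension one, where the balls are open intervals and the constant is $3^1 = 3$. The sketch below (not from the notes; hypothetical intervals, plain Python) selects the disjoint intervals exactly as in the proof and checks the resulting inequality.

```python
# Greedy selection from Lemma 16.4 in dimension n = 1: repeatedly keep the largest
# remaining interval that is disjoint from those already chosen.
balls = [(0.0, 4.0), (1.0, 9.0), (3.0, 5.0), (8.0, 12.0), (10.0, 11.0)]  # hypothetical (a, b)

def length(I): return I[1] - I[0]
def disjoint(I, J): return I[1] <= J[0] or J[1] <= I[0]

remaining, chosen = sorted(balls, key=length, reverse=True), []
while remaining:
    B = remaining.pop(0)                      # largest remaining interval
    chosen.append(B)
    remaining = [A for A in remaining if disjoint(A, B)]

def measure_of_union(intervals):              # m(U) for a union of intervals
    total, cur = 0.0, None
    for a, b in sorted(intervals):
        if cur is None or a > cur[1]:
            if cur: total += cur[1] - cur[0]
            cur = [a, b]
        else:
            cur[1] = max(cur[1], b)
    return total + (cur[1] - cur[0] if cur else 0.0)

print(measure_of_union(balls), "<=", 3 * sum(length(B) for B in chosen))
assert measure_of_union(balls) <= 3 * sum(length(B) for B in chosen)
```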
16.2. Maximal Functions.

Definition 16.7. For $f \in L^1(m)$, the Hardy-Littlewood maximal function $Hf$ is defined by
$(Hf)(x) = \sup_{r>0} A_r|f|(x)$.

Lemma 16.6 allows us to write
$(Hf)(x) = \sup_{r\in\mathbb{Q},\, r>0} A_r|f|(x)$
and then to conclude that $Hf$ is measurable.

Theorem 16.8 (Maximal Inequality). If $f \in L^1(m)$ and $\alpha > 0$, then
$m(Hf > \alpha) \le \frac{3^n}{\alpha} \|f\|_{L^1}$.
This should be compared with Chebyshev's inequality which states that
$m(|f| > \alpha) \le \frac{\|f\|_{L^1}}{\alpha}$.

Proof. Let $E_\alpha \equiv \{Hf > \alpha\}$. For all $x \in E_\alpha$ there exists $r_x$ such that $A_{r_x}|f|(x) > \alpha$, i.e.
$|B_x(r_x)| < \frac{1}{\alpha} \int_{B_x(r_x)} |f|\, dm$.
Since $E_\alpha \subset \cup_{x\in E_\alpha} B_x(r_x)$, if $c < m(E_\alpha) \le m(\cup_{x\in E_\alpha} B_x(r_x))$ then, using Lemma 16.4, there exist $x_1, \dots, x_k \in E_\alpha$ and disjoint balls $B_i = B_{x_i}(r_{x_i})$ for $i = 1, 2, \dots, k$ such that
$c < \sum_{i=1}^{k} 3^n |B_i| < \sum \frac{3^n}{\alpha} \int_{B_i} |f|\, dm \le \frac{3^n}{\alpha} \int_{\mathbb{R}^n} |f|\, dm = \frac{3^n}{\alpha} \|f\|_{L^1}$.
This shows that $c < 3^n \alpha^{-1} \|f\|_{L^1}$ for all $c < m(E_\alpha)$, which proves $m(E_\alpha) \le 3^n \alpha^{-1} \|f\|_{L^1}$.

Theorem 16.9. If $f \in L^1_{loc}$ then $\lim_{r\downarrow0}(A_r f)(x) = f(x)$ for $m$-a.e. $x \in \mathbb{R}^n$.

Proof. Without loss of generality we may assume $f \in L^1(m)$. We now begin with the special case where $f = g \in L^1(m)$ is also continuous. In this case we find:
$|(A_r g)(x) - g(x)| \le \frac{1}{|B_x(r)|} \int_{B_x(r)} |g(y) - g(x)|\, dm(y) \le \sup_{y\in B_x(r)} |g(y) - g(x)| \to 0$ as $r \to 0$.
In fact we have shown that $(A_r g)(x) \to g(x)$ as $r \to 0$ uniformly for $x$ in compact subsets of $\mathbb{R}^n$.
For general $f \in L^1(m)$,
$|A_r f(x) - f(x)| \le |A_r f(x) - A_r g(x)| + |A_r g(x) - g(x)| + |g(x) - f(x)| = |A_r(f-g)(x)| + |A_r g(x) - g(x)| + |g(x) - f(x)| \le H(f-g)(x) + |A_r g(x) - g(x)| + |g(x) - f(x)|$
and therefore,
$\overline{\lim}_{r\downarrow0} |A_r f(x) - f(x)| \le H(f-g)(x) + |g(x) - f(x)|$.
So if $\alpha > 0$, then
$E_\alpha \equiv \big\{ \overline{\lim}_{r\downarrow0} |A_r f(x) - f(x)| > \alpha \big\} \subset \big\{ H(f-g) > \tfrac{\alpha}{2} \big\} \cup \big\{ |g - f| > \tfrac{\alpha}{2} \big\}$
and thus
$m(E_\alpha) \le m\big( H(f-g) > \tfrac{\alpha}{2} \big) + m\big( |g - f| > \tfrac{\alpha}{2} \big) \le \frac{3^n}{\alpha/2} \|f-g\|_{L^1} + \frac{1}{\alpha/2} \|f-g\|_{L^1} = 2(3^n + 1)\alpha^{-1} \|f-g\|_{L^1}$,
where in the second inequality we have used the Maximal inequality (Theorem 16.8) and Chebyshev's inequality. Since this is true for all continuous $g \in C(\mathbb{R}^n) \cap L^1(m)$ and this set is dense in $L^1(m)$, we may make $\|f-g\|_{L^1}$ as small as we please. This shows that
$m\big( \big\{ x : \overline{\lim}_{r\downarrow0} |A_r f(x) - f(x)| > 0 \big\} \big) = m(\cup_{n=1}^{\infty} E_{1/n}) \le \sum_{n=1}^{\infty} m(E_{1/n}) = 0$.

Corollary 16.10. If $d\mu = g\, dm$ with $g \in L^1_{loc}$ then
$\frac{\mu(B_x(r))}{|B_x(r)|} = A_r g(x) \to g(x)$ for $m$-a.e. $x$.
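The weak-type bound of Theorem 16.8 can be explored numerically on a grid. The following sketch (not from the notes; assumes Python with NumPy, and takes the supremum only over finitely many radii, so it underestimates $Hf$) compares $m(Hf > \alpha)$ with $3\|f\|_{L^1}/\alpha$ in dimension $n = 1$.

```python
# Discrete Hardy-Littlewood maximal function on a 1D grid and the bound of Theorem 16.8.
import numpy as np

dx = 1e-3
x = np.arange(-2.0, 2.0, dx)
f = (np.abs(x) < 1) / np.sqrt(np.maximum(np.abs(x), dx / 2))   # an unbounded L^1 function

radii = dx * np.arange(1, 201)                 # radii r = dx, 2dx, ..., 200dx (a finite sample)
Hf = np.zeros_like(f)
for r in radii:
    k = int(round(r / dx))
    kernel = np.ones(2 * k + 1) / (2 * k + 1)  # discrete average over B_x(r)
    Hf = np.maximum(Hf, np.convolve(f, kernel, mode="same"))

norm1 = np.sum(f) * dx
for alpha in [2.0, 5.0, 10.0]:
    lhs = np.sum(Hf > alpha) * dx              # m(Hf > alpha), approximately
    print(f"alpha={alpha:5.1f}  m(Hf>alpha)={lhs:.4f}  3*||f||_1/alpha={3*norm1/alpha:.4f}")
```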
16.3. Lebesgue Set.

Definition 16.11. For $f \in L^1_{loc}(m)$, the Lebesgue set of $f$ is
$\mathcal{L}_f := \big\{ x \in \mathbb{R}^n : \lim_{r\downarrow0} \frac{1}{|B_x(r)|} \int_{B_x(r)} |f(y) - f(x)|\, dy = 0 \big\} = \big\{ x \in \mathbb{R}^n : \lim_{r\downarrow0} \big( A_r |f(\cdot) - f(x)| \big)(x) = 0 \big\}$.

Theorem 16.12. Suppose $1 \le p < \infty$ and $f \in L^p_{loc}(m)$, then $m\big( \mathbb{R}^d \setminus \mathcal{L}^p_f \big) = 0$ where
$\mathcal{L}^p_f := \big\{ x \in \mathbb{R}^n : \lim_{r\downarrow0} \frac{1}{|B_x(r)|} \int_{B_x(r)} |f(y) - f(x)|^p\, dy = 0 \big\}$.

Proof. For $w \in \mathbb{C}$ define $g_w(x) = |f(x) - w|^p$ and $E_w \equiv \{ x : \lim_{r\downarrow0} (A_r g_w)(x) \ne g_w(x) \}$. Then by Theorem 16.9 $m(E_w) = 0$ for all $w \in \mathbb{C}$ and therefore $m(E) = 0$ where
$E = \bigcup_{w \in \mathbb{Q} + i\mathbb{Q}} E_w$.
By definition of $E$, if $x \notin E$ then
$\lim_{r\downarrow0} (A_r |f(\cdot) - w|^p)(x) = |f(x) - w|^p$
for all $w \in \mathbb{Q} + i\mathbb{Q}$. Letting $q := \frac{p}{p-1}$ we have
$|f(\cdot) - f(x)|^p \le (|f(\cdot) - w| + |w - f(x)|)^p \le 2^q (|f(\cdot) - w|^p + |w - f(x)|^p)$,
$(A_r |f(\cdot) - f(x)|^p)(x) \le 2^q \big[ (A_r |f(\cdot) - w|^p)(x) + (A_r |w - f(x)|^p)(x) \big] = 2^q (A_r |f(\cdot) - w|^p)(x) + 2^q |w - f(x)|^p$
and hence for $x \notin E$,
$\overline{\lim}_{r\downarrow0} (A_r |f(\cdot) - f(x)|^p)(x) \le 2^q |f(x) - w|^p + 2^q |w - f(x)|^p = 2 \cdot 2^q |f(x) - w|^p$.
Since this is true for all $w \in \mathbb{Q} + i\mathbb{Q}$, we see that
$\lim_{r\downarrow0} (A_r |f(\cdot) - f(x)|^p)(x) = 0$ for all $x \notin E$,
i.e. $E^c \subset \mathcal{L}^p_f$ or equivalently $(\mathcal{L}^p_f)^c \subset E$. So $m\big( \mathbb{R}^d \setminus \mathcal{L}^p_f \big) \le m(E) = 0$.

Theorem 16.13 (Lebesgue Differentiation Theorem). Suppose $f \in L^1_{loc}$; then for all $x \in \mathcal{L}_f$ (so in particular for $m$-a.e. $x$)
$\lim_{r\downarrow0} \frac{1}{m(E_r)} \int_{E_r} |f(y) - f(x)|\, dy = 0$
and
$\lim_{r\downarrow0} \frac{1}{m(E_r)} \int_{E_r} f(y)\, dy = f(x)$
when $E_r \downarrow \{x\}$ nicely.

Proof. For all $x \in \mathcal{L}_f$,
$\big| \frac{1}{m(E_r)} \int_{E_r} f(y)\, dy - f(x) \big| = \big| \frac{1}{m(E_r)} \int_{E_r} (f(y) - f(x))\, dy \big| \le \frac{1}{m(E_r)} \int_{E_r} |f(y) - f(x)|\, dy \le \frac{1}{\alpha\, m(B_x(r))} \int_{B_x(r)} |f(y) - f(x)|\, dy$
which tends to zero as $r \to 0$ by Theorem 16.12. In the second inequality we have used the fact that $m(\bar{B}_x(r) \setminus B_x(r)) = 0$.
BRUCE: ADD an $L^p$ version of this theorem.
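The convergence $(A_r f)(x) \to f(x)$ of Theorems 16.9 and 16.13 can be seen numerically even for discontinuous $f$, since the exceptional set has measure zero. The sketch below (not from the notes; assumes Python with NumPy, a hypothetical test function, and a midpoint-rule approximation of the ball average) evaluates $A_r f$ at points of continuity for decreasing radii.

```python
# Ball (interval) averages A_r f(x) converging to f(x) as r -> 0 at Lebesgue points.
import numpy as np

def A_r(f, x, r, n_pts=10001):
    """Average of f over B_x(r) = (x - r, x + r), by the midpoint rule."""
    t = np.linspace(x - r, x + r, n_pts)
    return np.mean(f(t))

f = lambda t: np.where(t < 0.5, t ** 2, 3.0 + np.sin(t))   # an L^1_loc function with one jump
for x in [0.2, 0.8]:                                        # points of continuity of f
    errs = [abs(A_r(f, x, r) - f(np.array(x))) for r in (0.1, 0.01, 0.001)]
    print(f"x={x}:  |A_r f - f| for r=0.1, 0.01, 0.001 ->", [f"{e:.2e}" for e in errs])
# At the jump point x = 0.5 the averages converge to the mean of the one-sided limits
# instead; this is consistent with the theorem since {0.5} has Lebesgue measure zero.
```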
Lemma 16.14. Suppose $\lambda$ is a positive $\sigma$-finite measure on $\mathcal{B} \equiv \mathcal{B}_{\mathbb{R}^n}$ such that $\lambda \perp m$. Then for $m$-a.e. $x$,
$\lim_{r\downarrow0} \frac{\lambda(B_x(r))}{m(B_x(r))} = 0$.

Proof. Let $A \in \mathcal{B}$ be such that $\lambda(A) = 0$ and $m(A^c) = 0$. By the regularity theorem (Corollary 13.27 or Exercise 8.4), for all $\epsilon > 0$ there exists an open set $V_\epsilon \subset \mathbb{R}^n$ such that $A \subset V_\epsilon$ and $\lambda(V_\epsilon) < \epsilon$. Let
$F_k \equiv \big\{ x \in A : \overline{\lim}_{r\downarrow0} \frac{\lambda(B_x(r))}{m(B_x(r))} > \frac{1}{k} \big\}$;
then for $x \in F_k$ choose $r_x > 0$ such that $B_x(r_x) \subset V_\epsilon$ (see Figure 33) and $\frac{\lambda(B_x(r_x))}{m(B_x(r_x))} > \frac{1}{k}$, i.e.
$m(B_x(r_x)) < k\, \lambda(B_x(r_x))$.
[Figure 33. Covering a small set with balls.]
Let $\mathcal{E} = \{B_x(r_x)\}_{x\in F_k}$ and $U \equiv \bigcup_{x\in F_k} B_x(r_x) \subset V_\epsilon$. Heuristically, if all the balls in $\mathcal{E}$ were disjoint and $\mathcal{E}$ were countable, then
$m(F_k) \le \sum_{x\in F_k} m(B_x(r_x)) < k \sum_{x\in F_k} \lambda(B_x(r_x)) = k\lambda(U) \le k\lambda(V_\epsilon) \le k\epsilon$.
Since $\epsilon > 0$ is arbitrary this would imply that $m(F_k) = 0$.
To fix the above argument, suppose that $c < m(U)$ and use the covering lemma to find disjoint balls $B_1, \dots, B_N \in \mathcal{E}$ such that
$c < 3^n \sum_{i=1}^{N} m(B_i) < k 3^n \sum_{i=1}^{N} \lambda(B_i) \le k 3^n \lambda(U) \le k 3^n \lambda(V_\epsilon) \le k 3^n \epsilon$.
Since $c < m(U)$ is arbitrary we learn that $m(F_k) \le m(U) \le k 3^n \epsilon$ and in particular that $m(F_k) \le k 3^n \epsilon$. Since $\epsilon > 0$ is arbitrary, this shows that $m(F_k) = 0$ and therefore $m(F_\infty) = 0$ where
$F_\infty \equiv \big\{ x \in A : \overline{\lim}_{r\downarrow0} \frac{\lambda(B_x(r))}{m(B_x(r))} > 0 \big\} = \bigcup_{k=1}^{\infty} F_k$.
Since
$\{ x \in \mathbb{R}^n : \overline{\lim}_{r\downarrow0} \frac{\lambda(B_x(r))}{m(B_x(r))} > 0 \} \subset F_\infty \cup A^c$
and $m(A^c) = 0$, we have shown
$m(\{ x \in \mathbb{R}^n : \overline{\lim}_{r\downarrow0} \frac{\lambda(B_x(r))}{m(B_x(r))} > 0 \}) = 0$.

Corollary 16.15. Let $\lambda$ be a complex or a $\sigma$-finite signed measure such that $\lambda \perp m$. Then for $m$-a.e. $x$,
$\lim_{r\downarrow0} \frac{\lambda(E_r)}{m(E_r)} = 0$
whenever $E_r \downarrow \{x\}$ nicely.

Proof. Recalling that $\lambda \perp m$ implies $|\lambda| \perp m$, Lemma 16.14 and the inequalities
$\frac{|\lambda(E_r)|}{m(E_r)} \le \frac{|\lambda|(E_r)}{\alpha\, m(B_x(r))} \le \frac{|\lambda|(B_x(r))}{\alpha\, m(B_x(r))} \le \frac{|\lambda|(B_x(2r))}{\alpha\, 2^{-n} m(B_x(2r))}$
prove the result.

Proposition 16.16. TODO Add in almost everywhere convergence result of convolutions by approximate $\delta$-functions.
16.4. The Fundamental Theorem of Calculus. In this section we will restrict the results above to the one dimensional setting. The following notation will be in force for the rest of this chapter: $m$ denotes one dimensional Lebesgue measure on $\mathcal{B} := \mathcal{B}_{\mathbb{R}}$, $-\infty \le \alpha < \beta \le \infty$, $\mathcal{A} = \mathcal{A}_{[\alpha,\beta]}$ denotes the algebra generated by sets of the form $(a, b] \cap [\alpha, \beta]$ with $-\infty \le a < b \le \infty$, $\mathcal{A}_c$ denotes those sets in $\mathcal{A}$ which are bounded, and $\mathcal{B}_{[\alpha,\beta]}$ is the Borel $\sigma$-algebra on $[\alpha, \beta] \cap \mathbb{R}$.

Notation 16.17. Given a function $F : \mathbb{R} \to \bar{\mathbb{R}}$ or $F : \mathbb{R} \to \mathbb{C}$, let $F(x-) = \lim_{y\uparrow x} F(y)$, $F(x+) = \lim_{y\downarrow x} F(y)$ and $F(\pm\infty) = \lim_{x\to\pm\infty} F(x)$ whenever the limits exist. Notice that if $F$ is a monotone function then $F(\pm\infty)$ and $F(x\pm)$ exist for all $x$.

Theorem 16.18. Let $F : \mathbb{R} \to \mathbb{R}$ be increasing and define $G(x) = F(x+)$. Then
(1) $\{x \in \mathbb{R} : F(x+) > F(x-)\}$ is countable.
(2) The function $G$ is increasing and right continuous.
(3) For $m$-a.e. $x$, $F'(x)$ and $G'(x)$ exist and $F'(x) = G'(x)$.
(4) The function $F'$ is in $L^1_{loc}(m)$ and there exists a unique positive measure $\nu_s$ on $(\mathbb{R}, \mathcal{B}_{\mathbb{R}})$ such that
$F(b+) - F(a+) = \int_a^b F'\, dm + \nu_s((a, b])$ for all $-\infty < a < b < \infty$.
Moreover the measure $\nu_s$ is singular relative to $m$.

Proof. Properties (1) and (2) have already been proved in Theorem 13.34.
(3) Let $\nu_G$ denote the unique measure on $\mathcal{B}$ such that $\nu_G((a, b]) = G(b) - G(a)$ for all $a < b$. By Theorem 16.3, for $m$-a.e. $x$, for all sequences $\{E_r\}_{r>0}$ which shrink nicely to $\{x\}$, $\lim_{r\downarrow0}(\nu_G(E_r)/m(E_r))$ exists and is independent of the choice of sequence $\{E_r\}_{r>0}$ shrinking to $\{x\}$. Since $(x, x+r] \downarrow \{x\}$ and $(x-r, x] \downarrow \{x\}$ nicely,
(16.3) $\lim_{r\downarrow0} \frac{\nu_G((x, x+r])}{m((x, x+r])} = \lim_{r\downarrow0} \frac{G(x+r) - G(x)}{r} = \frac{d}{dx^+} G(x)$
and
(16.4) $\lim_{r\downarrow0} \frac{\nu_G((x-r, x])}{m((x-r, x])} = \lim_{r\downarrow0} \frac{G(x) - G(x-r)}{r} = \lim_{r\downarrow0} \frac{G(x-r) - G(x)}{-r} = \frac{d}{dx^-} G(x)$
exist and are equal for $m$-a.e. $x$, i.e. $G'(x)$ exists for $m$-a.e. $x$.
For $x \in \mathbb{R}$, let
$H(x) \equiv G(x) - F(x) = F(x+) - F(x) \ge 0$.
Since $F(x) = G(x) - H(x)$, the proof of (3) will be complete once we show $H'(x) = 0$ for $m$-a.e. $x$.
From Theorem 13.34,
$\Lambda := \{x \in \mathbb{R} : F(x+) > F(x)\} \subset \{x \in \mathbb{R} : F(x+) > F(x-)\}$
is a countable set and
$\sum_{x\in(-N,N)} H(x) = \sum_{x\in(-N,N)} (F(x+) - F(x)) \le \sum_{x\in(-N,N)} (F(x+) - F(x-)) < \infty$
for all $N < \infty$. Therefore $\lambda := \sum_{x\in\mathbb{R}} H(x)\, \delta_x$ (i.e. $\lambda(A) := \sum_{x\in A} H(x)$ for all $A \in \mathcal{B}_{\mathbb{R}}$) defines a Radon measure on $\mathcal{B}_{\mathbb{R}}$. Since $\lambda(\Lambda^c) = 0$ and $m(\Lambda) = 0$, the measure $\lambda \perp m$. By Corollary 16.15, for $m$-a.e. $x$,
$\big| \frac{H(x+r) - H(x)}{r} \big| \le \frac{|H(x+r)| + |H(x)|}{|r|} \le \frac{H(x+|r|) + H(x-|r|) + H(x)}{|r|} \le 2 \frac{\lambda([x-|r|, x+|r|])}{2|r|}$
and the last term goes to zero as $r \to 0$ because $\{[x-r, x+r]\}_{r>0}$ shrinks nicely to $\{x\}$ as $r \downarrow 0$ and $m([x-|r|, x+|r|]) = 2|r|$. Hence we conclude for $m$-a.e. $x$ that $H'(x) = 0$.
(4) From Theorem 16.3, item (3) and Eqs. (16.3) and (16.4), $F' = G' \in L^1_{loc}(m)$ and $d\nu_G = F'\, dm + d\nu_s$ where $\nu_s$ is a positive measure such that $\nu_s \perp m$. Applying this equation to an interval of the form $(a, b]$ gives
$F(b+) - F(a+) = \nu_G((a, b]) = \int_a^b F'\, dm + \nu_s((a, b])$.
The uniqueness of $\nu_s$ such that this equation holds is a consequence of Theorem 8.8.
Our next goal is to prove an analogue of Theorem 16.18 for complex valued $F$.

Definition 16.19. For $-\infty \le a < b < \infty$, a partition $\mathcal{P}$ of $[a, b]$ is a finite subset of $[a, b] \cap \mathbb{R}$ such that $\{a, b\} \cap \mathbb{R} \subset \mathcal{P}$. For $x \in \mathcal{P} \setminus \{b\}$, let $x_+ = \min\{y \in \mathcal{P} : y > x\}$ and if $x = b$ let $x_+ = b$.

Proposition 16.20. Let $\nu$ be a complex measure on $\mathcal{B}_{\mathbb{R}}$ and let $F$ be a function such that
$F(b) - F(a) = \nu((a, b])$ for all $a < b$,
for example let $F(x) = \nu((-\infty, x])$, in which case $F(-\infty) = 0$. The function $F$ is right continuous and for $-\infty < a < b < \infty$,
(16.5) $|\nu|(a, b] = \sup_{\mathcal{P}} \sum_{x\in\mathcal{P}} |\nu(x, x_+]| = \sup_{\mathcal{P}} \sum_{x\in\mathcal{P}} |F(x_+) - F(x)|$
where the supremum is over all partitions $\mathcal{P}$ of $[a, b]$. Moreover $\nu \ll m$ iff for all $\epsilon > 0$ there exists $\delta > 0$ such that
(16.6) $\sum_{i=1}^{n} |\nu((a_i, b_i])| = \sum_{i=1}^{n} |F(b_i) - F(a_i)| < \epsilon$
whenever $\{(a_i, b_i) \subset (a, b]\}_{i=1}^{n}$ are disjoint open intervals in $(a, b]$ such that $\sum_{i=1}^{n} (b_i - a_i) < \delta$.

Proof. Eq. (16.5) follows from Proposition 15.35 and the fact that $\mathcal{B} = \sigma(\mathcal{A})$ where $\mathcal{A}$ is the algebra generated by $(a, b] \cap \mathbb{R}$ with $a, b \in \bar{\mathbb{R}}$. Equation (16.6) is a consequence of Theorem 15.40 with $\mathcal{A}$ being the algebra of half open intervals as above. Notice that $\{(a_i, b_i) \subset (a, b]\}_{i=1}^{n}$ are disjoint intervals iff $\{(a_i, b_i] \subset (a, b]\}_{i=1}^{n}$ are disjoint intervals, $\sum_{i=1}^{n} (b_i - a_i) = m(\cup_{i=1}^{n} (a_i, b_i])$ and the general element $A \in \mathcal{A}_{(a,b]}$ is of the form $A = \cup_{i=1}^{n} (a_i, b_i]$.
Definition 16.21. Given a function $F : \mathbb{R} \cap [\alpha, \beta] \to \mathbb{C}$ let $\nu_F$ be the unique additive measure on $\mathcal{A}_c$ such that $\nu_F((a, b]) = F(b) - F(a)$ for all $a, b \in [\alpha, \beta]$ with $a < b$ and also define
$T_F([a, b]) = \sup_{\mathcal{P}} \sum_{x\in\mathcal{P}} |\nu_F(x, x_+]| = \sup_{\mathcal{P}} \sum_{x\in\mathcal{P}} |F(x_+) - F(x)|$
where the supremum is over all partitions $\mathcal{P}$ of $[a, b]$. We will also abuse notation and define $T_F(b) := T_F([\alpha, b])$. A function $F : \mathbb{R} \cap [\alpha, \beta] \to \mathbb{C}$ is said to be of bounded variation if $T_F(\beta) := T_F([\alpha, \beta]) < \infty$, and we write $F \in BV([\alpha, \beta])$. If $\alpha = -\infty$ and $\beta = +\infty$, we will simply denote $BV([-\infty, +\infty])$ by $BV$.

Definition 16.22. A function $F : \mathbb{R} \to \mathbb{C}$ is said to be of normalized bounded variation if $F \in BV$, $F$ is right continuous and $F(-\infty) := \lim_{x\to-\infty} F(x) = 0$. We will abbreviate this by saying $F \in NBV$. (The condition $F(-\infty) = 0$ is not essential and plays no role in the discussion below.)

Definition 16.23. A function $F : \mathbb{R} \cap [\alpha, \beta] \to \mathbb{C}$ is absolutely continuous if for all $\epsilon > 0$ there exists $\delta > 0$ such that
(16.7) $\sum_{i=1}^{n} |F(b_i) - F(a_i)| < \epsilon$
whenever $\{(a_i, b_i)\}_{i=1}^{n}$ are disjoint open intervals in $\mathbb{R} \cap [\alpha, \beta]$ such that $\sum_{i=1}^{n} (b_i - a_i) < \delta$.

Lemma 16.24. Let $F : \mathbb{R} \cap [\alpha, \beta] \to \mathbb{C}$ be any function and $a < b < c$ with $a, b, c \in \mathbb{R} \cap [\alpha, \beta]$; then
(1)
(16.8) $T_F([a, c]) = T_F([a, b]) + T_F([b, c])$.
(2) Letting $a = \alpha$ in this expression implies
(16.9) $T_F(c) = T_F(b) + T_F([b, c])$
and in particular $T_F$ is monotone increasing.
(3) If $T_F(b) < \infty$ for some $b \in \mathbb{R} \cap [\alpha, \beta]$ then
(16.10) $T_F(a+) - T_F(a) \le \limsup_{y\downarrow a} |F(y) - F(a)|$
for all $a \in \mathbb{R} \cap [\alpha, b)$. In particular $T_F$ is right continuous if $F$ is right continuous.
(4) If $\alpha = -\infty$ and $T_F(b) < \infty$ for some $b \in (-\infty, \beta] \cap \mathbb{R}$ then $T_F(-\infty) := \lim_{b\to-\infty} T_F(b) = 0$.

Proof. (1 – 2) By the triangle inequality, if $\mathcal{P}$ and $\mathcal{P}'$ are partitions of $[a, c]$ such that $\mathcal{P} \subset \mathcal{P}'$, then
$\sum_{x\in\mathcal{P}} |F(x_+) - F(x)| \le \sum_{x\in\mathcal{P}'} |F(x_+) - F(x)|$.
So if $\mathcal{P}$ is a partition of $[a, c]$, then $\mathcal{P} \subset \mathcal{P}' := \mathcal{P} \cup \{b\}$ implies
$\sum_{x\in\mathcal{P}} |F(x_+) - F(x)| \le \sum_{x\in\mathcal{P}'} |F(x_+) - F(x)| = \sum_{x\in\mathcal{P}'\cap[a,b]} |F(x_+) - F(x)| + \sum_{x\in\mathcal{P}'\cap[b,c]} |F(x_+) - F(x)| \le T_F([a, b]) + T_F([b, c])$.
Thus we see that $T_F([a, c]) \le T_F([a, b]) + T_F([b, c])$. Similarly if $\mathcal{P}_1$ is a partition of $[a, b]$ and $\mathcal{P}_2$ is a partition of $[b, c]$, then $\mathcal{P} = \mathcal{P}_1 \cup \mathcal{P}_2$ is a partition of $[a, c]$ and
$\sum_{x\in\mathcal{P}_1} |F(x_+) - F(x)| + \sum_{x\in\mathcal{P}_2} |F(x_+) - F(x)| = \sum_{x\in\mathcal{P}} |F(x_+) - F(x)| \le T_F([a, c])$.
From this we conclude $T_F([a, b]) + T_F([b, c]) \le T_F([a, c])$ which finishes the proof of Eqs. (16.8) and (16.9).
(3) Let $a \in \mathbb{R} \cap [\alpha, b)$ and, given $\epsilon > 0$, let $\mathcal{P}$ be a partition of $[a, b]$ such that
(16.11) $T_F(b) - T_F(a) = T_F([a, b]) \le \sum_{x\in\mathcal{P}} |F(x_+) - F(x)| + \epsilon$.
Let $y \in (a, a_+)$, then
$\sum_{x\in\mathcal{P}} |F(x_+) - F(x)| + \epsilon \le \sum_{x\in\mathcal{P}\cup\{y\}} |F(x_+) - F(x)| + \epsilon = |F(y) - F(a)| + \sum_{x\in(\mathcal{P}\cup\{y\})\setminus\{a\}} |F(x_+) - F(x)| + \epsilon \le |F(y) - F(a)| + T_F([y, b]) + \epsilon$. (16.12)
Combining Eqs. (16.11) and (16.12) shows
$T_F(y) - T_F(a) + T_F([y, b]) = T_F(b) - T_F(a) \le |F(y) - F(a)| + T_F([y, b]) + \epsilon$.
Since $y \in (a, a_+)$ is arbitrary we conclude that
$T_F(a+) - T_F(a) = \limsup_{y\downarrow a} T_F(y) - T_F(a) \le \limsup_{y\downarrow a} |F(y) - F(a)| + \epsilon$.
Since $\epsilon > 0$ is arbitrary this proves Eq. (16.10).
(4) Suppose that $T_F(b) < \infty$ and, given $\epsilon > 0$, let $\mathcal{P}$ be a partition of $[-\infty, b]$ such that
$T_F(b) \le \sum_{x\in\mathcal{P}} |F(x_+) - F(x)| + \epsilon$.
Let $x_0 = \min\mathcal{P}$; then by the previous equation
$T_F(x_0) + T_F([x_0, b]) = T_F(b) \le \sum_{x\in\mathcal{P}} |F(x_+) - F(x)| + \epsilon \le T_F([x_0, b]) + \epsilon$
which shows, using the monotonicity of $T_F$, that $T_F(-\infty) \le T_F(x_0) \le \epsilon$. Since $\epsilon > 0$ is arbitrary we conclude that $T_F(-\infty) = 0$.
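The total variation $T_F([a,b])$ of Definition 16.21 can be approximated by taking the supremum in Eq. above over finer and finer uniform partitions. The sketch below (not from the notes; assumes Python with NumPy and a hypothetical smooth test function) shows the partition sums increasing under refinement, in line with the triangle-inequality step of Lemma 16.24, toward $\int_a^b |F'|\,dm$.

```python
# Approximating T_F([a,b]) = sup_P sum |F(x_+) - F(x)| by refining uniform partitions.
import numpy as np

F = lambda x: np.sin(3 * x)           # hypothetical test function
a, b = 0.0, 2.0

def variation(F, a, b, n):
    x = np.linspace(a, b, n + 1)      # a uniform partition P with n subintervals
    return np.sum(np.abs(np.diff(F(x))))

for n in [10, 100, 1000, 10000]:
    print(f"n = {n:6d}   sum over P of |F(x_+) - F(x)| = {variation(F, a, b, n):.6f}")
# The values increase under refinement and approach
# int_0^2 |3 cos(3x)| dx = int_0^6 |cos u| du ~ 3.7206.
```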
The following lemma should help to clarify Proposition 16.20 and Definition 16.23.

Lemma 16.25. Let $\nu$ and $F$ be as in Proposition 16.20 and $\mathcal{A}$ be the algebra generated by $(a, b] \cap \mathbb{R}$ with $a, b \in \bar{\mathbb{R}}$. Then the following are equivalent:
(1) $\nu \ll m$
(2) $|\nu| \ll m$
(3) For all $\epsilon > 0$ there exists a $\delta > 0$ such that $T_F(A) < \epsilon$ whenever $m(A) < \delta$.
(4) For all $\epsilon > 0$ there exists a $\delta > 0$ such that $|\nu_F(A)| < \epsilon$ whenever $m(A) < \delta$.
Moreover, condition 4. shows that we could replace the last statement in Proposition 16.20 by: $\nu \ll m$ iff for all $\epsilon > 0$ there exists $\delta > 0$ such that
$\big| \sum_{i=1}^{n} \nu((a_i, b_i]) \big| = \big| \sum_{i=1}^{n} [F(b_i) - F(a_i)] \big| < \epsilon$
whenever $\{(a_i, b_i) \subset (a, b]\}_{i=1}^{n}$ are disjoint open intervals in $(a, b]$ such that $\sum_{i=1}^{n} (b_i - a_i) < \delta$.

Proof. This follows directly from Lemma 15.37 and Theorem 15.40.

Lemma 16.26.
(1) Monotone functions $F : \mathbb{R} \cap [\alpha, \beta] \to \mathbb{R}$ are in $BV([\alpha, \beta])$.
(2) Linear combinations of functions in $BV$ are in $BV$, i.e. $BV$ is a vector space.
(3) If $F : \mathbb{R} \cap [\alpha, \beta] \to \mathbb{C}$ is absolutely continuous then $F$ is continuous and $F \in BV([\alpha, \beta])$.
(4) If $-\infty < \alpha < \beta < \infty$ and $F : \mathbb{R} \cap [\alpha, \beta] \to \mathbb{R}$ is a differentiable function such that $\sup_{x\in\mathbb{R}} |F'(x)| = M < \infty$, then $F$ is absolutely continuous and $T_F([a, b]) \le M(b - a)$ for all $\alpha \le a < b \le \beta$.
(5) Let $f \in L^1(\mathbb{R} \cap [\alpha, \beta], m)$ and set
(16.13) $F(x) = \int_{(\alpha, x]} f\, dm$
for $x \in [\alpha, \beta] \cap \mathbb{R}$. Then $F : \mathbb{R} \cap [\alpha, \beta] \to \mathbb{C}$ is absolutely continuous.

Proof.
(1) If $F$ is monotone increasing and $\mathcal{P}$ is a partition of $(a, b]$ then
$\sum_{x\in\mathcal{P}} |F(x_+) - F(x)| = \sum_{x\in\mathcal{P}} (F(x_+) - F(x)) = F(b) - F(a)$
so that $T_F([a, b]) = F(b) - F(a)$. Also note that $F \in BV$ iff $F(\infty) - F(-\infty) < \infty$.
(2) Item 2. follows from the triangle inequality.
(3) Since $F$ is absolutely continuous, there exists $\delta > 0$ such that whenever $a < b < a + \delta$ and $\mathcal{P}$ is a partition of $(a, b]$,
$\sum_{x\in\mathcal{P}} |F(x_+) - F(x)| \le 1$.
This shows that $T_F([a, b]) \le 1$ for all $a < b$ with $b - a < \delta$. Thus using Eq. (16.8), it follows that $T_F([a, b]) \le N < \infty$ if $b - a < N\delta$ for an $N \in \mathbb{N}$.
(4) Suppose that $\{(a_i, b_i)\}_{i=1}^{n} \subset (a, b]$ are disjoint intervals; then by the mean value theorem,
$\sum_{i=1}^{n} |F(b_i) - F(a_i)| \le \sum_{i=1}^{n} |F'(c_i)| (b_i - a_i) \le M\, m(\cup_{i=1}^{n} (a_i, b_i)) \le M \sum_{i=1}^{n} (b_i - a_i) \le M(b - a)$
from which it clearly follows that $F$ is absolutely continuous. Moreover we may conclude that $T_F([a, b]) \le M(b - a)$.
(5) Let $\mu$ be the positive measure $d\mu = |f|\, dm$ on $(a, b]$. Let $\{(a_i, b_i)\}_{i=1}^{n} \subset (a, b]$ be disjoint intervals as above, then
(16.14) $\sum_{i=1}^{n} |F(b_i) - F(a_i)| = \sum_{i=1}^{n} \big| \int_{(a_i, b_i]} f\, dm \big| \le \sum_{i=1}^{n} \int_{(a_i, b_i]} |f|\, dm = \int_{\cup_{i=1}^{n}(a_i, b_i]} |f|\, dm = \mu(\cup_{i=1}^{n} (a_i, b_i])$.
Since $\mu$ is absolutely continuous relative to $m$, for all $\epsilon > 0$ there exists $\delta > 0$ such that $\mu(A) < \epsilon$ if $m(A) < \delta$. Taking $A = \cup_{i=1}^{n} (a_i, b_i]$ in Eq. (16.14) shows that $F$ is absolutely continuous. It is also easy to see from Eq. (16.14) that $T_F([a, b]) \le \int_{(a,b]} |f|\, dm$.
Theorem 16.27. Let $F : \mathbb{R} \to \mathbb{C}$ be a function, then
(1) $F \in BV$ iff $\operatorname{Re}F \in BV$ and $\operatorname{Im}F \in BV$.
(2) If $F : \mathbb{R} \to \mathbb{R}$ is in $BV$ then the functions $F_\pm := (T_F \pm F)/2$ are bounded and increasing functions.
(3) $F : \mathbb{R} \to \mathbb{R}$ is in $BV$ iff $F = F_+ - F_-$ where $F_\pm$ are bounded increasing functions.
(4) If $F \in BV$ then $F(x\pm)$ exist for all $x \in \bar{\mathbb{R}}$. Let $G(x) := F(x+)$.
(5) If $F \in BV$ then $\{x : \lim_{y\to x} F(y) \ne F(x)\}$ is a countable set and in particular $G(x) = F(x+)$ for all but a countable number of $x \in \mathbb{R}$.
(6) If $F \in BV$, then for $m$-a.e. $x$, $F'(x)$ and $G'(x)$ exist and $F'(x) = G'(x)$.

Proof.
(1) Item 1. is a consequence of the inequalities
$|F(b) - F(a)| \le |\operatorname{Re}F(b) - \operatorname{Re}F(a)| + |\operatorname{Im}F(b) - \operatorname{Im}F(a)| \le 2|F(b) - F(a)|$.
(2) By Lemma 16.24, for all $a < b$,
(16.15) $T_F(b) - T_F(a) = T_F([a, b]) \ge |F(b) - F(a)|$
and therefore
$T_F(b) \pm F(b) \ge T_F(a) \pm F(a)$
which shows that $F_\pm$ are increasing. Moreover from Eq. (16.15), for $b \ge 0$ and $a \le 0$,
$|F(b)| \le |F(b) - F(0)| + |F(0)| \le T_F(0, b] + |F(0)| \le T_F(0, \infty) + |F(0)|$
and similarly
$|F(a)| \le |F(0)| + T_F(-\infty, 0)$
which shows that $F$ is bounded by $|F(0)| + T_F(\infty)$. Therefore $F_\pm$ is bounded as well.
(3) By Lemma 16.26 if $F = F_+ - F_-$, then
$T_F([a, b]) \le T_{F_+}([a, b]) + T_{F_-}([a, b]) = |F_+(b) - F_+(a)| + |F_-(b) - F_-(a)|$
which is bounded, showing that $F \in BV$. Conversely if $F$ is of bounded variation, then $F = F_+ - F_-$ where $F_\pm$ are defined as in Item 2.
Items 4. – 6. follow from Items 1. – 3. and Theorem 16.18.

Theorem 16.28. Suppose that $F : \mathbb{R} \to \mathbb{C}$ is in $BV$, then
(16.16) $|T_F(x+) - T_F(x)| \le |F(x+) - F(x)|$
for all $x \in \mathbb{R}$. If we further assume that $F$ is right continuous then there exists a unique measure $\nu$ on $\mathcal{B} = \mathcal{B}_{\mathbb{R}}$ such that
(16.17) $\nu((-\infty, x]) = F(x) - F(-\infty)$ for all $x \in \mathbb{R}$.

Proof. Since $F \in BV$, $F(x+)$ exists for all $x \in \mathbb{R}$ and hence Eq. (16.16) is a consequence of Eq. (16.10). Now assume that $F$ is right continuous. In this case Eq. (16.16) shows that $T_F(x)$ is also right continuous. By considering the real and imaginary parts of $F$ separately it suffices to prove there exists a unique finite signed measure $\nu$ satisfying Eq. (16.17) in the case that $F$ is real valued. Now let $F_\pm = (T_F \pm F)/2$; then $F_\pm$ are increasing right continuous bounded functions. Hence there exist unique measures $\nu_\pm$ on $\mathcal{B}$ such that
$\nu_\pm((-\infty, x]) = F_\pm(x) - F_\pm(-\infty)$ for all $x \in \mathbb{R}$.
The finite signed measure $\nu \equiv \nu_+ - \nu_-$ satisfies Eq. (16.17). So it only remains to prove that $\nu$ is unique.
Suppose that $\tilde{\nu}$ is another such measure such that (16.17) holds with $\nu$ replaced by $\tilde{\nu}$. Then for $(a, b]$,
$|\nu|(a, b] = \sup_{\mathcal{P}} \sum_{x\in\mathcal{P}} |F(x_+) - F(x)| = |\tilde{\nu}|(a, b]$
where the supremum is over all partitions $\mathcal{P}$ of $(a, b]$. This shows that $|\nu| = |\tilde{\nu}|$ on $\mathcal{A} \subset \mathcal{B}$, the algebra generated by half open intervals, and hence $|\nu| = |\tilde{\nu}|$. It now follows that $|\nu| + \nu$ and $|\tilde{\nu}| + \tilde{\nu}$ are finite positive measures on $\mathcal{B}$ such that
$(|\nu| + \nu)((a, b]) = |\nu|((a, b]) + (F(b) - F(a)) = |\tilde{\nu}|((a, b]) + (F(b) - F(a)) = (|\tilde{\nu}| + \tilde{\nu})((a, b])$
from which we infer that $|\nu| + \nu = |\tilde{\nu}| + \tilde{\nu} = |\nu| + \tilde{\nu}$ on $\mathcal{B}$. Thus $\nu = \tilde{\nu}$.
Alternatively, one may prove the uniqueness by showing that $\mathcal{C} := \{A \in \mathcal{B} : \nu(A) = \tilde{\nu}(A)\}$ is a monotone class which contains $\mathcal{A}$, or by using the $\pi$-$\lambda$ theorem.
Theorem 16.29. Suppose that $F \in NBV$ and $\nu_F$ is the measure defined by Eq. (16.17), then
(16.18) $d\nu_F = F'\, dm + d\nu_s$
where $\nu_s \perp m$ and in particular for $-\infty < a < b < \infty$,
(16.19) $F(b) - F(a) = \int_a^b F'\, dm + \nu_s((a, b])$.

Proof. By Theorem 16.3, there exist $f \in L^1(m)$ and a complex measure $\nu_s$ such that for $m$-a.e. $x$,
(16.20) $f(x) = \lim_{r\downarrow0} \frac{\nu(E_r)}{m(E_r)}$,
for any collection of $\{E_r\}_{r>0} \subset \mathcal{B}$ which shrink nicely to $\{x\}$, $\nu_s \perp m$ and
$d\nu_F = f\, dm + d\nu_s$.
From Eq. (16.20) it follows that
$\lim_{h\downarrow0} \frac{F(x+h) - F(x)}{h} = \lim_{h\downarrow0} \frac{\nu_F((x, x+h])}{h} = f(x)$ and $\lim_{h\downarrow0} \frac{F(x-h) - F(x)}{-h} = \lim_{h\downarrow0} \frac{\nu_F((x-h, x])}{h} = f(x)$
for $m$-a.e. $x$, i.e. $\frac{d}{dx^+} F(x) = \frac{d}{dx^-} F(x) = f(x)$ for $m$-a.e. $x$. This implies that $F$ is $m$-a.e. differentiable and $F'(x) = f(x)$ for $m$-a.e. $x$.

Corollary 16.30. Let $F : \mathbb{R} \to \mathbb{C}$ be in $NBV$, then
(1) $\nu_F \perp m$ iff $F' = 0$ $m$-a.e.
(2) $\nu_F \ll m$ iff $\nu_s = 0$ iff
(16.21) $\nu_F((a, b]) = \int_{(a,b]} F'(x)\, dm(x)$ for all $a < b$.

Proof.
(1) If $F'(x) = 0$ for $m$-a.e. $x$, then by Eq. (16.18), $\nu_F = \nu_s \perp m$. If $\nu_F \perp m$, then by Eq. (16.18), $F'\, dm = d\nu_F - d\nu_s \perp dm$ and by Remark 15.8 $F'\, dm = 0$, i.e. $F' = 0$ $m$-a.e.
(2) If $\nu_F \ll m$, then $d\nu_s = d\nu_F - F'\, dm \ll dm$ which implies, by Lemma 15.28, that $\nu_s = 0$. Therefore Eq. (16.19) becomes (16.21). Now let
$\rho(A) := \int_A F'(x)\, dm(x)$ for all $A \in \mathcal{B}$.
Recall by the Radon-Nikodym theorem that $\int_{\mathbb{R}} |F'(x)|\, dm(x) < \infty$ so that $\rho$ is a complex measure on $\mathcal{B}$. So if Eq. (16.21) holds, then $\rho = \nu_F$ on the algebra generated by half open intervals. Therefore $\rho = \nu_F$ as in the uniqueness part of the proof of Theorem 16.28. Therefore $d\nu_F = F'\, dm$ and hence $\nu_s = 0$.
Theorem 16.31. Suppose that $F : [a, b] \to \mathbb{C}$ is a measurable function. Then the following are equivalent:
(1) $F$ is absolutely continuous on $[a, b]$.
(2) There exists $f \in L^1([a, b], dm)$ such that
(16.22) $F(x) - F(a) = \int_a^x f\, dm$ for all $x \in [a, b]$.
(3) $F'$ exists a.e., $F' \in L^1([a, b], dm)$ and
(16.23) $F(x) - F(a) = \int_a^x F'\, dm$ for all $x \in [a, b]$.

Proof. In order to apply the previous results, extend $F$ to $\mathbb{R}$ by $F(x) = F(b)$ if $x \ge b$ and $F(x) = F(a)$ if $x \le a$.
1. $\Rightarrow$ 3. If $F$ is absolutely continuous then $F$ is continuous on $[a, b]$ and $F - F(a) = F - F(-\infty) \in NBV$ by Lemma 16.26. By Proposition 16.20, $\nu_F \ll m$ and hence Item 3. is now a consequence of Item 2. of Corollary 16.30. The assertion 3. $\Rightarrow$ 2. is trivial.
2. $\Rightarrow$ 1. If 2. holds then $F$ is absolutely continuous on $[a, b]$ by Lemma 16.26.

Corollary 16.32 (Integration by parts). Suppose $-\infty < a < b < \infty$ and $F, G : [a, b] \to \mathbb{C}$ are two absolutely continuous functions. Then
$\int_a^b F'G\, dm = -\int_a^b FG'\, dm + FG|_a^b$.

Proof. Suppose that $\{(a_i, b_i)\}_{i=1}^{n}$ is a sequence of disjoint intervals in $[a, b]$, then
$\sum_{i=1}^{n} |F(b_i)G(b_i) - F(a_i)G(a_i)| \le \sum_{i=1}^{n} |F(b_i)|\, |G(b_i) - G(a_i)| + \sum_{i=1}^{n} |F(b_i) - F(a_i)|\, |G(a_i)| \le \|F\|_u \sum_{i=1}^{n} |G(b_i) - G(a_i)| + \|G\|_u \sum_{i=1}^{n} |F(b_i) - F(a_i)|$.
From this inequality, one easily deduces the absolute continuity of the product $FG$ from the absolute continuity of $F$ and $G$. Therefore,
$FG|_a^b = \int_a^b (FG)'\, dm = \int_a^b (F'G + FG')\, dm$.
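A quick numerical check of Corollary 16.32 is often a useful sanity test. The sketch below (not from the notes; assumes Python with NumPy and a hypothetical pair of smooth, hence absolutely continuous, functions) verifies the integration-by-parts identity up to quadrature error.

```python
# Check int_a^b F'G dm = -int_a^b FG' dm + FG|_a^b for F(x)=x^2, G(x)=sin(x) on [0,1].
import numpy as np

a, b, n = 0.0, 1.0, 20001
x = np.linspace(a, b, n)
F, G = x ** 2, np.sin(x)
Fp, Gp = 2 * x, np.cos(x)              # F' and G'

def integral(y):                        # trapezoid rule for int_a^b y dm
    return float(np.sum((y[1:] + y[:-1]) / 2) * (x[1] - x[0]))

lhs = integral(Fp * G)                                   # int F'G dm
rhs = -integral(F * Gp) + F[-1] * G[-1] - F[0] * G[0]    # -int FG' dm + FG|_a^b
print(lhs, rhs)                                          # agree up to quadrature error
assert abs(lhs - rhs) < 1e-8
```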
16.5. Alternative method to the Fundamental Theorem of Calculus. For simplicity assume that $\alpha = -\infty$, $\beta = \infty$ and $F \in BV$. Let $\nu^0 = \nu^0_F$ be the finitely additive set function on $\mathcal{A}_c$ such that $\nu^0((a, b]) = F(b) - F(a)$ for all $-\infty < a < b < \infty$. As in the real increasing case (Notation 13.6 above) we may define a linear functional, $I_F : \mathcal{S}_c(\mathcal{A}) \to \mathbb{C}$, by
$I_F(f) = \sum_{\lambda\in\mathbb{C}} \lambda\, \nu^0(f = \lambda)$.
If we write $f = \sum_{i=1}^{N} \lambda_i 1_{(a_i, b_i]}$ with $\{(a_i, b_i]\}_{i=1}^{N}$ pairwise disjoint subsets of $\mathcal{A}_c$ inside $(a, b]$ we learn
(16.24) $|I_F(f)| = \big| \sum_{i=1}^{N} \lambda_i (F(b_i) - F(a_i)) \big| \le \sum_{i=1}^{N} |\lambda_i|\, |F(b_i) - F(a_i)| \le \|f\|_u\, T_F((a, b])$.
In the usual way this estimate allows us to extend $I_F$ to those compactly supported functions $\bar{\mathcal{S}}_c(\mathcal{A})$ in the closure of $\mathcal{S}_c(\mathcal{A})$. As usual we will still denote the extension of $I_F$ to $\bar{\mathcal{S}}_c(\mathcal{A})$ by $I_F$ and recall that $\bar{\mathcal{S}}_c(\mathcal{A})$ contains $C_c(\mathbb{R}, \mathbb{C})$. The estimate in Eq. (16.24) still holds for this extension and in particular we have $|I(f)| \le T_F(\infty) \cdot \|f\|_u$ for all $f \in C_c(\mathbb{R}, \mathbb{C})$. Therefore $I$ extends uniquely by continuity to an element of $C_0(\mathbb{R}, \mathbb{C})^*$. So by appealing to the complex Riesz Theorem (Corollary 15.42) there exists a unique complex measure $\nu = \nu_F$ such that
(16.25) $I_F(f) = \int_{\mathbb{R}} f\, d\nu$ for all $f \in C_c(\mathbb{R})$.
This leads to the following theorem.

Theorem 16.33. To each function $F \in BV$ there exists a unique measure $\nu = \nu_F$ on $(\mathbb{R}, \mathcal{B}_{\mathbb{R}})$ such that Eq. (16.25) holds. Moreover, $F(x+) = \lim_{y\downarrow x} F(y)$ exists for all $x \in \mathbb{R}$ and the measure $\nu$ satisfies
(16.26) $\nu((a, b]) = F(b+) - F(a+)$ for all $-\infty < a < b < \infty$.

Remark 16.34. By applying Theorem 16.33 to the function $x \to F(-x)$ one shows every $F \in BV$ has left hand limits as well, i.e. $F(x-) = \lim_{y\uparrow x} F(y)$ exists for all $x \in \mathbb{R}$.
Proof. We must still prove F (x+) exists for all x R and Eq. (16.26) holds.
To prove let
b
and

be the functions shown in Figure 34 below. The reader


should check that
b
S
c
(A). Notice that
Figure 34. A couple of functions in S
c
(A).
I
F
(
b+
) = I
F
(

+ 1
(,b+]
) = I
F
(

) +F(b +) F()
and since k

b+
k
u
= 1,
|I(

) I
F
(
b+
)| = |I
F
(

b+
)|
T
F
([b +, b + 2]) = T
F
(b + 2) T
F
(b +),
which implies O() := I(

) I
F
(
b+
) 0 as 0 because T
F
is monotonic.
Therefore,
(16.27) I(

) = I
F
(
b+
) +I(

) I
F
(
b+
) = I
F
(

) +F(b +) F() +O().


Because

converges boundedly to
b
as 0, the dominated convergence theorem
implies
lim
0
I(

) = lim
0
Z
R

d =
Z
R

b
d =
Z
R

d +((, b]).
So we may let 0 in Eq. (16.27) to learn F(b+) exists and
Z
R

d +((, b]) = I
F
(

) +F(b+) F().
Similarly this equation holds with b replaced by a, i.e.
Z
R

d +((, a]) = I
F
(

) +F(a+) F().
Subtracting the last two equations proves Eq. (16.26).
16.5.1. Proof of Theorem 16.29. Proof. Given Theorem 16.33 we may now prove Theorem 16.29 in the same way we proved Theorem 16.18.
16.6. Examples: These are taken from I. P. Natanson, "Theory of Functions of a Real Variable," p. 269. Note it is proved in Natanson or in Rudin that the fundamental theorem of calculus holds for $f \in C([0,1])$ such that $f'(x)$ exists for all $x \in [0,1]$ and $f' \in L^1$. Now we give a couple of examples.

Example 16.35. In each case $f \in C([-1, 1])$.
(1) Let $f(x) = |x|^{3/2} \sin\frac{1}{x}$ with $f(0) = 0$; then $f$ is everywhere differentiable but $f'$ is not bounded near zero. However, the function $f' \in L^1([-1, 1])$.
(2) Let $f(x) = x^2 \cos\frac{\pi}{x^2}$ with $f(0) = 0$; then $f$ is everywhere differentiable but $f' \notin L^1_{loc}(-\epsilon, \epsilon)$. Indeed, if $0 \notin (\alpha, \beta)$ then
$\int_\alpha^\beta f'(x)\, dx = f(\beta) - f(\alpha) = \beta^2 \cos\frac{\pi}{\beta^2} - \alpha^2 \cos\frac{\pi}{\alpha^2}$.
Now take $\alpha_n := \sqrt{\frac{2}{4n+1}}$ and $\beta_n = 1/\sqrt{2n}$. Then
$\int_{\alpha_n}^{\beta_n} f'(x)\, dx = \frac{1}{2n} \cos 2n\pi - \frac{2}{4n+1} \cos\frac{(4n+1)\pi}{2} = \frac{1}{2n}$
and noting that $\{(\alpha_n, \beta_n)\}_{n=1}^{\infty}$ are all disjoint, we find $\int_0^\epsilon |f'(x)|\, dx = \infty$.

Example 16.36. Let $C \subset [0, 1]$ denote the Cantor set constructed as follows. Let $C_1 = [0, 1] \setminus (1/3, 2/3)$, $C_2 := C_1 \setminus [(1/9, 2/9) \cup (7/9, 8/9)]$, etc., so that we keep removing the middle thirds at each stage in the construction. Then
$C := \bigcap_{n=1}^{\infty} C_n = \big\{ x = \sum_{j=1}^{\infty} a_j 3^{-j} : a_j \in \{0, 2\} \big\}$
and
$m(C) = 1 - \big( \frac{1}{3} + \frac{2}{9} + \frac{2^2}{3^3} + \dots \big) = 1 - \frac{1}{3} \sum_{n=0}^{\infty} \big( \frac{2}{3} \big)^n = 1 - \frac{1}{3}\, \frac{1}{1 - 2/3} = 0$.
Associated to this set is the so called Cantor function $F(x) := \lim_{n\to\infty} f_n(x)$ where the $\{f_n\}_{n=1}^{\infty}$ are continuous non-decreasing functions such that $f_n(0) = 0$, $f_n(1) = 1$, with the $f_n$ pictured in Figure 35 below. From the pictures one sees that $\{f_n\}$ are uniformly Cauchy, hence there exists $F \in C([0, 1])$ such that $F(x) := \lim_{n\to\infty} f_n(x)$.
[Figure 35. Constructing the Cantor function.]
The function $F$ has the following properties:
(1) $F$ is continuous and non-decreasing.
(2) $F'(x) = 0$ for $m$-a.e. $x \in [0, 1]$ because $F$ is flat on all of the middle third open intervals used to construct the Cantor set $C$ and the total measure of these intervals is $1$ as proved above.
(3) The measure $\mu$ on $\mathcal{B}_{[0,1]}$ associated to $F$, namely $\mu([0, b]) = F(b)$, is singular relative to Lebesgue measure and $\mu(\{x\}) = 0$ for all $x \in [0, 1]$. Notice that $\mu([0, 1]) = 1$.
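The Cantor function of Example 16.36 is easy to evaluate from the ternary expansion of $x$: replace each digit $2$ by $1$, truncate at the first digit $1$, and read the result in base $2$. The sketch below (not from the notes; a standard construction in plain Python) computes $F$ this way and displays a few values, which illustrates items (1)–(2): $F$ is non-decreasing with $F(0)=0$, $F(1)=1$, yet constant on every removed middle-third interval.

```python
def cantor(x, n_digits=48):
    """Cantor function via ternary digits: digits 2 become 1, stop at the first digit 1,
    and interpret the resulting string in base 2."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    value, factor = 0.0, 0.5
    for _ in range(n_digits):
        x *= 3
        digit = int(x)             # next ternary digit (0, 1 or 2)
        x -= digit
        if digit == 1:
            return value + factor  # x lies under a middle-third plateau of F
        value += factor * (digit // 2)
        factor /= 2
    return value

for t in [0.0, 1/9, 0.25, 1/3, 0.5, 2/3, 0.75, 1.0]:
    print(f"F({t:.4f}) = {cantor(t):.6f}")
# F is non-decreasing and continuous, constant on each removed interval,
# so F'(x) = 0 for m-a.e. x even though F increases from 0 to 1.
```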
16.7. Exercises.

Exercise 16.1. Folland 3.22 on p. 100.
Exercise 16.2. Folland 3.24 on p. 100.
Exercise 16.3. Folland 3.25 on p. 100.
Exercise 16.4. Folland 3.27 on p. 107.
Exercise 16.5. Folland 3.29 on p. 107.
Exercise 16.6. Folland 3.30 on p. 107.
Exercise 16.7. Folland 3.33 on p. 108.
Exercise 16.8. Folland 3.35 on p. 108.
Exercise 16.9. Folland 3.37 on p. 108.
Exercise 16.10. Folland 3.39 on p. 108.
Exercise 16.11. Folland 3.40 on p. 108.
Exercise 16.12. Folland 8.4 on p. 239.

Solution. (16.12) Notice that
$A_r f = \frac{1}{|B_0(r)|} 1_{B_0(r)} * f$
and therefore $x \to A_r f(x) \in C_0(\mathbb{R}^n)$ for all $r > 0$ by Proposition 11.18. Since
$A_r f(x) - f(x) = \frac{1}{|B_0(r)|} \int_{B_0(r)} [f(x+y) - f(x)]\, dy = \frac{1}{|B_0(r)|} \int_{B_0(r)} (\tau_{-y} f - f)(x)\, dy$,
it follows from Minkowski's inequality for integrals (Theorem 9.27) that
$\|A_r f - f\|_\infty \le \frac{1}{|B_0(r)|} \int_{B_0(r)} \|\tau_{-y} f - f\|_\infty\, dy \le \sup_{|y|\le r} \|\tau_{-y} f - f\|_\infty$
and the latter goes to zero as $r \downarrow 0$ by assumption. In particular we learn that
$\|A_r f - A_\rho f\|_u \le \|A_r f - f\|_\infty + \|f - A_\rho f\|_\infty \to 0$ as $r, \rho \downarrow 0$,
showing $\{A_r f\}_{r>0}$ is uniformly Cauchy as $r \downarrow 0$. Therefore $\lim_{r\downarrow0} A_r f(x) = g(x)$ exists for all $x \in \mathbb{R}^n$ and $g = f$ a.e.
17. More Point Set Topology

17.1. Connectedness. The reader may wish to review the topological notions and results introduced in Section 3.3 above before proceeding.

Definition 17.1. $(X, \tau)$ is disconnected if there exist non-empty open sets $U$ and $V$ of $X$ such that $U \cap V = \emptyset$ and $X = U \cup V$. We say $\{U, V\}$ is a disconnection of $X$. The topological space $(X, \tau)$ is called connected if it is not disconnected, i.e. if there are no disconnections of $X$. If $A \subset X$ we say $A$ is connected iff $(A, \tau_A)$ is connected where $\tau_A$ is the relative topology on $A$. Explicitly, $A$ is disconnected in $(X, \tau)$ iff there exist $U, V \in \tau$ such that $U \cap A \ne \emptyset$, $V \cap A \ne \emptyset$, $A \cap U \cap V = \emptyset$ and $A \subset U \cup V$.

The reader should check that the following statement is an equivalent definition of connectivity: a topological space $(X, \tau)$ is connected iff the only sets $A \subset X$ which are both open and closed are the sets $X$ and $\emptyset$.

Remark 17.2. Let $A \subset Y \subset X$. Then $A$ is connected in $X$ iff $A$ is connected in $Y$.

Proof. Since
$\tau_A \equiv \{V \cap A : V \subset_o X\} = \{V \cap A \cap Y : V \subset_o X\} = \{U \cap A : U \subset_o Y\}$,
the relative topology on $A$ inherited from $X$ is the same as the relative topology on $A$ inherited from $Y$. Since connectivity is a statement about the relative topologies on $A$, $A$ is connected in $X$ iff $A$ is connected in $Y$.

The following elementary but important lemma is left as an exercise to the reader.

Lemma 17.3. Suppose that $f : X \to Y$ is a continuous map between topological spaces. Then $f(X) \subset Y$ is connected if $X$ is connected.

Here is a typical way these connectedness ideas are used.

Example 17.4. Suppose that $f : X \to Y$ is a continuous map between topological spaces, $X$ is connected, $Y$ is Hausdorff, and $f$ is locally constant, i.e. for all $x \in X$ there exists an open neighborhood $V$ of $x$ in $X$ such that $f|_V$ is constant. Then $f$ is constant, i.e. $f(X) = \{y_0\}$ for some $y_0 \in Y$. To prove this, let $y_0 \in f(X)$ and let $W := f^{-1}(\{y_0\})$. Since $Y$ is Hausdorff, $\{y_0\} \subset Y$ is a closed set and since $f$ is continuous $W \subset X$ is also closed. Since $f$ is locally constant, $W$ is open as well and since $X$ is connected it follows that $W = X$, i.e. $f(X) = \{y_0\}$.
Proposition 17.5. Let $(X, \tau)$ be a topological space.
(1) If $B \subset X$ is a connected set and $X$ is the disjoint union of two open sets $U$ and $V$, then either $B \subset U$ or $B \subset V$.
(2) a. If $A \subset X$ is connected, then $\bar{A}$ is connected.
b. More generally, if $A$ is connected and $B \subset \operatorname{acc}(A)$, then $A \cup B$ is connected as well. (Recall that $\operatorname{acc}(A)$, the set of accumulation points of $A$, was defined in Definition 3.19 above.)
(3) If $\{E_\alpha\}_{\alpha\in A}$ is a collection of connected sets such that $\bigcap_{\alpha\in A} E_\alpha \ne \emptyset$, then $Y := \bigcup_{\alpha\in A} E_\alpha$ is connected as well.
(4) Suppose $A, B \subset X$ are non-empty connected subsets of $X$ such that $\bar{A} \cap B \ne \emptyset$; then $A \cup B$ is connected in $X$.
(5) Every point $x \in X$ is contained in a unique maximal connected subset $C_x$ of $X$ and this subset is closed. The set $C_x$ is called the connected component of $x$.

Proof.
(1) Since $B$ is the disjoint union of the relatively open sets $B \cap U$ and $B \cap V$, we must have $B \cap U = B$ or $B \cap V = B$, for otherwise $\{B \cap U, B \cap V\}$ would be a disconnection of $B$.
(2) a. Let $Y = \bar{A}$ equipped with the relative topology from $X$. Suppose that $U, V \subset_o Y$ form a disconnection of $Y = \bar{A}$. Then by 1. either $A \subset U$ or $A \subset V$. Say that $A \subset U$. Since $U$ is both open and closed in $Y$, it follows that $Y = \bar{A} \subset U$. Therefore $V = \emptyset$ and we have a contradiction to the assumption that $\{U, V\}$ is a disconnection of $Y = \bar{A}$. Hence we must conclude that $Y = \bar{A}$ is connected as well.
b. Now let $Y = A \cup B$ with $B \subset \operatorname{acc}(A)$, then
$\bar{A}^Y = \bar{A} \cap Y = (A \cup \operatorname{acc}(A)) \cap Y = A \cup B$.
Because $A$ is connected in $Y$, by (2a) $Y = A \cup B = \bar{A}^Y$ is also connected.
(3) Let $Y := \bigcup_{\alpha\in A} E_\alpha$. By Remark 17.2, we know that $E_\alpha$ is connected in $Y$ for each $\alpha \in A$. If $\{U, V\}$ were a disconnection of $Y$, by item (1), either $E_\alpha \subset U$ or $E_\alpha \subset V$ for all $\alpha$. Let $\Lambda = \{\alpha \in A : E_\alpha \subset U\}$; then $U \supset \bigcup_{\alpha\in\Lambda} E_\alpha$ and $V \supset \bigcup_{\alpha\in A\setminus\Lambda} E_\alpha$. (Notice that neither $\Lambda$ nor $A \setminus \Lambda$ can be empty since $U$ and $V$ are not empty.) Since
$\emptyset = U \cap V \supset \bigcup_{\alpha\in\Lambda,\,\beta\in\Lambda^c} (E_\alpha \cap E_\beta) \supset \bigcap_{\alpha\in A} E_\alpha \ne \emptyset$,
we have reached a contradiction and hence no such disconnection exists.
(4) (A good example to keep in mind here is $X = \mathbb{R}$, $A = (0, 1)$ and $B = [1, 2)$.) For the sake of contradiction suppose that $\{U, V\}$ were a disconnection of $Y = A \cup B$. By item (1) either $A \subset U$ or $A \subset V$, say $A \subset U$, in which case $B \subset V$. Since $Y = A \cup B$ we must have $A = U$ and $B = V$ and so we may conclude: $A$ and $B$ are disjoint subsets of $Y$ which are both open and closed. This implies
$A = \bar{A}^Y = \bar{A} \cap Y = \bar{A} \cap (A \cup B) = A \cup (\bar{A} \cap B)$
and therefore
$\emptyset \ne \bar{A} \cap B \subset A \cap B = \emptyset$,
which gives us the desired contradiction.
(5) Let $\mathcal{C}$ denote the collection of connected subsets $C \subset X$ such that $x \in C$. Then by item 3., the set $C_x := \cup\mathcal{C}$ is also a connected subset of $X$ which contains $x$ and clearly this is the unique maximal connected set containing $x$. Since $\bar{C}_x$ is also connected by item (2) and $C_x$ is maximal, $C_x = \bar{C}_x$, i.e. $C_x$ is closed.
Theorem 17.6. The connected subsets of $\mathbb{R}$ are intervals.

Proof. Suppose that $A \subset \mathbb{R}$ is a connected subset and that $a, b \in A$ with $a < b$. If there exists $c \in (a, b)$ such that $c \notin A$, then $U := (-\infty, c) \cap A$ and $V := (c, \infty) \cap A$ would form a disconnection of $A$. Hence $(a, b) \subset A$. Let $\alpha := \inf(A)$ and $\beta := \sup(A)$ and choose $\alpha_n, \beta_n \in A$ such that $\alpha_n < \beta_n$ and $\alpha_n \downarrow \alpha$ and $\beta_n \uparrow \beta$ as $n \to \infty$. By what we have just shown, $(\alpha_n, \beta_n) \subset A$ for all $n$ and hence $(\alpha, \beta) = \bigcup_{n=1}^{\infty} (\alpha_n, \beta_n) \subset A$. From this it follows that $A = (\alpha, \beta)$, $[\alpha, \beta)$, $(\alpha, \beta]$ or $[\alpha, \beta]$, i.e. $A$ is an interval.
Conversely suppose that $A$ is an interval and, for the sake of contradiction, suppose that $\{U, V\}$ is a disconnection of $A$ with $a \in U$, $b \in V$. After relabeling $U$ and $V$ if necessary we may assume that $a < b$. Since $A$ is an interval, $[a, b] \subset A$. Let $p = \sup([a, b] \cap U)$; then because $U$ and $V$ are open, $a < p < b$. Now $p$ can not be in $U$ for otherwise $\sup([a, b] \cap U) > p$, and $p$ can not be in $V$ for otherwise $p < \sup([a, b] \cap U)$. From this it follows that $p \notin U \cup V$ and hence $A \ne U \cup V$, contradicting the assumption that $\{U, V\}$ is a disconnection.

Definition 17.7. A topological space $X$ is path connected if to every pair of points $\{x_0, x_1\} \subset X$ there exists a continuous path $\sigma \in C([0, 1], X)$ such that $\sigma(0) = x_0$ and $\sigma(1) = x_1$. The space $X$ is said to be locally path connected if for each $x \in X$, there is an open neighborhood $V \subset X$ of $x$ which is path connected.

Proposition 17.8. Let $X$ be a topological space.
(1) If $X$ is path connected then $X$ is connected.
(2) If $X$ is connected and locally path connected, then $X$ is path connected.
(3) If $X$ is any connected open subset of $\mathbb{R}^n$, then $X$ is path connected.

Proof. The reader is asked to prove this proposition in Exercises 17.1 – 17.3 below.
17.2. Product Spaces. Let $\{(X_\alpha, \tau_\alpha)\}_{\alpha \in A}$ be a collection of topological spaces (we assume $X_\alpha \neq \emptyset$) and let $X_A = \prod_{\alpha \in A} X_\alpha$. Recall that $x \in X_A$ is a function
$$x : A \to \coprod_{\alpha \in A} X_\alpha$$
such that $x_\alpha := x(\alpha) \in X_\alpha$ for all $\alpha \in A$. An element $x \in X_A$ is called a choice function and the axiom of choice states that $X_A \neq \emptyset$ provided that $X_\alpha \neq \emptyset$ for each $\alpha \in A$. If each $X_\alpha$ above is the same set $X$, we will denote $X_A = \prod_{\alpha \in A} X_\alpha$ by $X^A$. So $x \in X^A$ is a function from $A$ to $X$.

Notation 17.9. For $\alpha \in A$, let $\pi_\alpha : X_A \to X_\alpha$ be the canonical projection map, $\pi_\alpha(x) = x_\alpha$. The product topology $\tau = \otimes_{\alpha \in A} \tau_\alpha$ is the smallest topology on $X_A$ such that each projection $\pi_\alpha$ is continuous. Explicitly, $\tau$ is the topology generated by
$$(17.1)\qquad \mathcal{E} = \{\pi_\alpha^{-1}(V_\alpha) : \alpha \in A,\ V_\alpha \in \tau_\alpha\}.$$
A basic open set in this topology is of the form
$$(17.2)\qquad V = \{x \in X_A : \pi_\alpha(x) \in V_\alpha \text{ for } \alpha \in \Lambda\}$$
where $\Lambda$ is a finite subset of $A$ and $V_\alpha \in \tau_\alpha$ for all $\alpha \in \Lambda$. We will sometimes write $V$ above as
$$V = \prod_{\alpha \in \Lambda} V_\alpha \times \prod_{\alpha \notin \Lambda} X_\alpha = V_\Lambda \times X_{A \setminus \Lambda}.$$
Proposition 17.10. Suppose $Y$ is a topological space and $f : Y \to X_A$ is a map. Then $f$ is continuous iff $\pi_\alpha \circ f : Y \to X_\alpha$ is continuous for all $\alpha \in A$.

Proof. If $f$ is continuous then $\pi_\alpha \circ f$ is the composition of two continuous functions and hence is continuous. Conversely if $\pi_\alpha \circ f$ is continuous for all $\alpha \in A$, then $(\pi_\alpha \circ f)^{-1}(V_\alpha) = f^{-1}\big(\pi_\alpha^{-1}(V_\alpha)\big)$ is open in $Y$ for all $\alpha \in A$ and $V_\alpha \subset_o X_\alpha$. That is to say, $f^{-1}(\mathcal{E})$ consists of open sets, and therefore $f$ is continuous since $\mathcal{E}$ is a sub-basis for the product topology.

Proposition 17.11. Suppose that $(X, \tau)$ is a topological space and $\{f_n\} \subset X^A$ is a sequence. Then $f_n \to f$ in the product topology of $X^A$ iff $f_n(\alpha) \to f(\alpha)$ for all $\alpha \in A$.

Proof. Since $\pi_\alpha$ is continuous, if $f_n \to f$ then $f_n(\alpha) = \pi_\alpha(f_n) \to \pi_\alpha(f) = f(\alpha)$ for all $\alpha \in A$. Conversely, $f_n(\alpha) \to f(\alpha)$ for all $\alpha \in A$ iff $\pi_\alpha(f_n) \to \pi_\alpha(f)$ for all $\alpha \in A$. Therefore if $V = \pi_\alpha^{-1}(V_\alpha) \in \mathcal{E}$ and $f \in V$, then $\pi_\alpha(f) \in V_\alpha$ and $\pi_\alpha(f_n) \in V_\alpha$ a.a. and hence $f_n \in V$ a.a. This shows that $f_n \to f$ as $n \to \infty$.
Proposition 17.12. Let $(X_\alpha, \tau_\alpha)$ be topological spaces and $X_A$ be the product space with the product topology.
(1) If $X_\alpha$ is Hausdorff for all $\alpha \in A$, then so is $X_A$.
(2) If each $X_\alpha$ is connected for all $\alpha \in A$, then so is $X_A$.

Proof.
(1) Let $x, y \in X_A$ be distinct points. Then there exists $\alpha \in A$ such that $\pi_\alpha(x) = x_\alpha \neq y_\alpha = \pi_\alpha(y)$. Since $X_\alpha$ is Hausdorff, there exist disjoint open sets $U, V \subset X_\alpha$ such that $\pi_\alpha(x) \in U$ and $\pi_\alpha(y) \in V$. Then $\pi_\alpha^{-1}(U)$ and $\pi_\alpha^{-1}(V)$ are disjoint open sets in $X_A$ containing $x$ and $y$ respectively.
(2) Let us begin with the case of two factors, namely assume that $X$ and $Y$ are connected topological spaces; then we will show that $X \times Y$ is connected as well. To do this let $p = (x_0, y_0) \in X \times Y$ and $E$ denote the connected component of $p$. Since $\{x_0\} \times Y$ is homeomorphic to $Y$, $\{x_0\} \times Y$ is connected in $X \times Y$ and therefore $\{x_0\} \times Y \subset E$, i.e. $(x_0, y) \in E$ for all $y \in Y$. A similar argument now shows that $X \times \{y\} \subset E$ for any $y \in Y$, that is to say $X \times Y = E$. By induction the theorem holds whenever $A$ is a finite set.

For the general case, again choose a point $p \in X_A$ and let $C = C_p$ be the connected component of $p$ in $X_A$. Recall that $C_p$ is closed and therefore if $C_p$ is a proper subset of $X_A$, then $X_A \setminus C_p$ is a non-empty open set. By the definition of the product topology, this would imply that $X_A \setminus C_p$ contains an open set of the form
$$V := \bigcap_{\alpha \in \Lambda} \pi_\alpha^{-1}(V_\alpha) = V_\Lambda \times X_{A \setminus \Lambda}$$
where $\Lambda \subset\subset A$ and $V_\alpha \in \tau_\alpha$ for all $\alpha \in \Lambda$. We will now show that no such $V$ can exist and hence $X_A = C_p$, i.e. $X_A$ is connected.

Define $\varphi : X_\Lambda \to X_A$ by $\varphi(y) = x$ where
$$x_\alpha = \begin{cases} y_\alpha & \text{if } \alpha \in \Lambda \\ p_\alpha & \text{if } \alpha \notin \Lambda. \end{cases}$$
If $\alpha \in \Lambda$, $\pi_\alpha \circ \varphi(y) = y_\alpha = \pi_\alpha(y)$ and if $\alpha \in A \setminus \Lambda$ then $\pi_\alpha \circ \varphi(y) = p_\alpha$, so that in every case $\pi_\alpha \circ \varphi : X_\Lambda \to X_\alpha$ is continuous and therefore $\varphi$ is continuous.

Since $X_\Lambda$ is a product of a finite number of connected spaces it is connected by step 1. above. Hence so is the continuous image, $\varphi(X_\Lambda) = X_\Lambda \times \{p_\alpha\}_{\alpha \in A \setminus \Lambda}$, of $X_\Lambda$. Now $p \in \varphi(X_\Lambda)$ and $\varphi(X_\Lambda)$ being connected implies that $\varphi(X_\Lambda) \subset C$. On the other hand one easily sees that
$$\emptyset \neq V \cap \varphi(X_\Lambda) \subset V \cap C,$$
contradicting the assumption that $V \subset C^c$.
17.3. Tychonoff's Theorem. The main theorem of this subsection is that the product of compact spaces is compact. Before going to the general case of an arbitrary number of factors let us start with only two factors.

Proposition 17.13. Suppose that $X$ and $Y$ are non-empty compact topological spaces; then $X \times Y$ is compact in the product topology.

Proof. Let $\mathcal{U}$ be an open cover of $X \times Y$. Then for each $(x, y) \in X \times Y$ there exists $U \in \mathcal{U}$ such that $(x, y) \in U$. By definition of the product topology, there also exist open sets $V_x \subset_o X$ and $W_y \subset_o Y$ with $x \in V_x$, $y \in W_y$ such that $V_x \times W_y \subset U$. Therefore $\mathcal{V} := \{V_x \times W_y : (x, y) \in X \times Y\}$ is also an open cover of $X \times Y$. We will now show that $\mathcal{V}$ has a finite sub-cover, say $\mathcal{V}_0 \subset \mathcal{V}$. Assuming this is proved for the moment, this implies that $\mathcal{U}$ also has a finite subcover because each $V \in \mathcal{V}_0$ is contained in some $U_V \in \mathcal{U}$. So to complete the proof it suffices to show every cover $\mathcal{V}$ of the form $\mathcal{V} = \{V_\alpha \times W_\alpha : \alpha \in A\}$, where $V_\alpha \subset_o X$ and $W_\alpha \subset_o Y$, has a finite subcover.

Given $x \in X$, let $f_x : Y \to X \times Y$ be the map $f_x(y) = (x, y)$ and notice that $f_x$ is continuous since $\pi_X \circ f_x(y) = x$ and $\pi_Y \circ f_x(y) = y$ are continuous maps. From this we conclude that $\{x\} \times Y = f_x(Y)$ is compact. Similarly, it follows that $X \times \{y\}$ is compact for all $y \in Y$.

Since $\mathcal{V}$ is a cover of $\{x\} \times Y$, there exists $\Lambda_x \subset\subset A$ such that $\{x\} \times Y \subset \bigcup_{\alpha \in \Lambda_x} (V_\alpha \times W_\alpha)$; without loss of generality we may assume that $\Lambda_x$ is chosen so that $x \in V_\alpha$ for all $\alpha \in \Lambda_x$. Let $U_x := \bigcap_{\alpha \in \Lambda_x} V_\alpha \subset_o X$, and notice that
$$(17.3)\qquad \bigcup_{\alpha \in \Lambda_x} (V_\alpha \times W_\alpha) \supset \bigcup_{\alpha \in \Lambda_x} (U_x \times W_\alpha) = U_x \times Y,$$
see Figure 36 below.

Since $\{U_x\}_{x \in X}$ is now an open cover of $X$ and $X$ is compact, there exists $\Lambda \subset\subset X$ such that $X = \bigcup_{x \in \Lambda} U_x$. The finite subcollection, $\mathcal{V}_0 := \{V_\alpha \times W_\alpha : \alpha \in \bigcup_{x \in \Lambda} \Lambda_x\}$, of $\mathcal{V}$ is the desired finite subcover. Indeed using Eq. (17.3),
$$\bigcup \mathcal{V}_0 \supset \bigcup_{x \in \Lambda} \bigcup_{\alpha \in \Lambda_x} (V_\alpha \times W_\alpha) \supset \bigcup_{x \in \Lambda} (U_x \times Y) = X \times Y.$$
[Figure 36. Constructing the open set $U_x$.]

The results of Exercises 3.27 and 6.15 prove Tychonoff's Theorem for a countable product of compact metric spaces. We now state the general version of the theorem.

Theorem 17.14 (Tychonoff's Theorem). Let $\{X_\alpha\}_{\alpha \in A}$ be a collection of non-empty compact spaces. Then $X := X_A = \prod_{\alpha \in A} X_\alpha$ is compact in the product space topology.

Proof. The proof requires Zorn's lemma, which is equivalent to the axiom of choice, see Theorem B.7 of Appendix B below. For $\alpha \in A$ let $\pi_\alpha$ denote the projection map from $X$ to $X_\alpha$. Suppose that $\mathcal{F}$ is a family of closed subsets of $X$ which has the finite intersection property, see Definition 3.25. By Proposition 3.26 the proof will be complete if we can show $\bigcap \mathcal{F} \neq \emptyset$.

The first step is to apply Zorn's lemma to construct a maximal collection $\mathcal{F}_0$ of (not necessarily closed) subsets of $X$ with the finite intersection property. To do this, let $\Gamma := \{\mathcal{G} \subset 2^X : \mathcal{F} \subset \mathcal{G}\}$ equipped with the partial order $\mathcal{G}_1 < \mathcal{G}_2$ if $\mathcal{G}_1 \subset \mathcal{G}_2$. If $\Phi$ is a linearly ordered subset of $\Gamma$, then $\mathcal{G} := \bigcup \Phi$ is an upper bound for $\Phi$ which still has the finite intersection property, as the reader should check. So by Zorn's lemma, $\Gamma$ has a maximal element $\mathcal{F}_0$.

The maximal $\mathcal{F}_0$ has the following properties.
(1) If $\{F_i\}_{i=1}^n \subset \mathcal{F}_0$ then $\bigcap_{i=1}^n F_i \in \mathcal{F}_0$ as well. Indeed, if we let $(\mathcal{F}_0)_f$ denote the collection of all finite intersections of elements from $\mathcal{F}_0$, then $(\mathcal{F}_0)_f$ has the finite intersection property and contains $\mathcal{F}_0$. Since $\mathcal{F}_0$ is maximal, this implies $(\mathcal{F}_0)_f = \mathcal{F}_0$.
(2) If $A \subset X$ and $A \cap F \neq \emptyset$ for all $F \in \mathcal{F}_0$, then $A \in \mathcal{F}_0$. For if not, $\mathcal{F}_0 \cup \{A\}$ would still satisfy the finite intersection property and would properly contain $\mathcal{F}_0$; this would violate the maximality of $\mathcal{F}_0$.
(3) For each $\alpha \in A$, $\pi_\alpha(\mathcal{F}_0) := \{\pi_\alpha(F) \subset X_\alpha : F \in \mathcal{F}_0\}$ has the finite intersection property. Indeed, if $\{F_i\}_{i=1}^n \subset \mathcal{F}_0$, then $\bigcap_{i=1}^n \pi_\alpha(F_i) \supset \pi_\alpha\big(\bigcap_{i=1}^n F_i\big) \neq \emptyset$.

Since $X_\alpha$ is compact, item 3. above along with Proposition 3.26 implies $\bigcap_{F \in \mathcal{F}_0} \overline{\pi_\alpha(F)} \neq \emptyset$. Since this is true for each $\alpha \in A$, using the axiom of choice there exists $p \in X$ such that $p_\alpha = \pi_\alpha(p) \in \bigcap_{F \in \mathcal{F}_0} \overline{\pi_\alpha(F)}$ for all $\alpha \in A$. The proof will be completed by showing $p \in \bigcap \mathcal{F}$, hence $\bigcap \mathcal{F}$ is not empty as desired.

Since $\bigcap \{\bar{F} : F \in \mathcal{F}_0\} \subset \bigcap \mathcal{F}$, it suffices to show $p \in C := \bigcap \{\bar{F} : F \in \mathcal{F}_0\}$. For this suppose that $U$ is an open neighborhood of $p$ in $X$. By the definition of the product topology, there exist $\Lambda \subset\subset A$ and open sets $U_\alpha \subset X_\alpha$ for all $\alpha \in \Lambda$ such that $p \in \bigcap_{\alpha \in \Lambda} \pi_\alpha^{-1}(U_\alpha) \subset U$. Since $p_\alpha \in \bigcap_{F \in \mathcal{F}_0} \overline{\pi_\alpha(F)}$ and $p_\alpha \in U_\alpha$ for all $\alpha \in \Lambda$, it follows that $U_\alpha \cap \pi_\alpha(F) \neq \emptyset$ for all $F \in \mathcal{F}_0$ and all $\alpha \in \Lambda$, and this implies $\pi_\alpha^{-1}(U_\alpha) \cap F \neq \emptyset$ for all $F \in \mathcal{F}_0$ and all $\alpha \in \Lambda$. By item 2. above we conclude that $\pi_\alpha^{-1}(U_\alpha) \in \mathcal{F}_0$ for all $\alpha \in \Lambda$, and then by item 1., $\bigcap_{\alpha \in \Lambda} \pi_\alpha^{-1}(U_\alpha) \in \mathcal{F}_0$. In particular
$$\emptyset \neq F \cap \Big(\bigcap_{\alpha \in \Lambda} \pi_\alpha^{-1}(U_\alpha)\Big) \subset F \cap U \quad \text{for all } F \in \mathcal{F}_0,$$
which shows $p \in \bar{F}$ for each $F \in \mathcal{F}_0$.
17.4. Baire Category Theorem.

Definition 17.15. Let $(X, \tau)$ be a topological space. A set $E \subset X$ is said to be nowhere dense if $(\bar{E})^o = \emptyset$, i.e. $\bar{E}$ has empty interior.

Notice that $E$ is nowhere dense is equivalent to
$$X = \big((\bar{E})^o\big)^c = \overline{(\bar{E})^c} = \overline{(E^c)^o}.$$
That is to say $E$ is nowhere dense iff $E^c$ has dense interior.

17.5. Baire Category Theorem.

Theorem 17.16 (Baire Category Theorem). Let $(X, \rho)$ be a complete metric space.
(1) If $\{V_n\}_{n=1}^{\infty}$ is a sequence of dense open sets, then $G := \bigcap_{n=1}^{\infty} V_n$ is dense in $X$.
(2) If $\{E_n\}_{n=1}^{\infty}$ is a sequence of nowhere dense sets, then $\bigcup_{n=1}^{\infty} E_n \subset \bigcup_{n=1}^{\infty} \bar{E}_n \subsetneq X$ and in particular $X \neq \bigcup_{n=1}^{\infty} E_n$.

Proof. 1) We must show that $\bar{G} = X$, which is equivalent to showing that $W \cap G \neq \emptyset$ for all non-empty open sets $W \subset X$. Since $V_1$ is dense, $W \cap V_1 \neq \emptyset$ and hence there exist $x_1 \in X$ and $\varepsilon_1 > 0$ such that
$$\overline{B(x_1, \varepsilon_1)} \subset W \cap V_1.$$
Since $V_2$ is dense, $B(x_1, \varepsilon_1) \cap V_2 \neq \emptyset$ and hence there exist $x_2 \in X$ and $\varepsilon_2 > 0$ such that
$$\overline{B(x_2, \varepsilon_2)} \subset B(x_1, \varepsilon_1) \cap V_2.$$
Continuing this way inductively, we may choose $\{x_n \in X \text{ and } \varepsilon_n > 0\}_{n=1}^{\infty}$ such that
$$\overline{B(x_n, \varepsilon_n)} \subset B(x_{n-1}, \varepsilon_{n-1}) \cap V_n \quad \forall n.$$
Furthermore we can clearly do this construction in such a way that $\varepsilon_n \downarrow 0$ as $n \to \infty$. Hence $\{x_n\}_{n=1}^{\infty}$ is a Cauchy sequence and $x = \lim_{n\to\infty} x_n$ exists in $X$ since $X$ is complete. Since $\overline{B(x_n, \varepsilon_n)}$ is closed, $x \in \overline{B(x_n, \varepsilon_n)} \subset V_n$ so that $x \in V_n$ for all $n$ and hence $x \in G$. Moreover, $x \in \overline{B(x_1, \varepsilon_1)} \subset W \cap V_1$ implies $x \in W$ and hence $x \in W \cap G$, showing $W \cap G \neq \emptyset$.

2) The second assertion is equivalent to showing
$$\emptyset \neq \Big(\bigcup_{n=1}^{\infty} \bar{E}_n\Big)^c = \bigcap_{n=1}^{\infty} \big(\bar{E}_n\big)^c = \bigcap_{n=1}^{\infty} (E_n^c)^o.$$
As we have observed, $E_n$ is nowhere dense is equivalent to $(E_n^c)^o$ being a dense open set; hence by part 1), $\bigcap_{n=1}^{\infty} (E_n^c)^o$ is dense in $X$ and hence not empty.
Here is another version of the Baire Category theorem when $X$ is a locally compact Hausdorff space.

Proposition 17.17. Let $X$ be a locally compact Hausdorff space.
(1) If $\{V_n\}_{n=1}^{\infty}$ is a sequence of dense open sets, then $G := \bigcap_{n=1}^{\infty} V_n$ is dense in $X$.
(2) If $\{E_n\}_{n=1}^{\infty}$ is a sequence of nowhere dense sets, then $X \neq \bigcup_{n=1}^{\infty} E_n$.

Proof. As in the previous proof, the second assertion is a consequence of the first. To finish the proof, it suffices to show $G \cap W \neq \emptyset$ for all open sets $W \subset X$. Since $V_1$ is dense, there exists $x_1 \in V_1 \cap W$ and by Proposition 10.13 there exists $U_1 \subset_o X$ such that $x_1 \in U_1 \subset \bar{U}_1 \subset V_1 \cap W$ with $\bar{U}_1$ being compact. Similarly, there exists a non-empty open set $U_2$ such that $U_2 \subset \bar{U}_2 \subset U_1 \cap V_2$. Working inductively, we may find non-empty open sets $\{U_k\}_{k=1}^{\infty}$ such that $U_k \subset \bar{U}_k \subset U_{k-1} \cap V_k$. Since $\bigcap_{k=1}^{n} \bar{U}_k = \bar{U}_n \neq \emptyset$ for all $n$, the finite intersection characterization of $\bar{U}_1$ being compact implies that
$$\emptyset \neq \bigcap_{k=1}^{\infty} \bar{U}_k \subset G \cap W.$$
Definition 17.18. A subset $E \subset X$ is meager or of the first category if $E = \bigcup_{n=1}^{\infty} E_n$ where each $E_n$ is nowhere dense. A set $R \subset X$ is called residual if $R^c$ is meager.

Remarks 17.19. The reader should think of meager as being the topological analogue of sets of measure $0$ and residual as being the topological analogue of sets of full measure.
(1) $R$ is residual iff $R$ contains a countable intersection of dense open sets. Indeed if $R$ is a residual set, then there exist nowhere dense sets $\{E_n\}$ such that
$$R^c = \bigcup_{n=1}^{\infty} E_n \subset \bigcup_{n=1}^{\infty} \bar{E}_n.$$
Taking complements of this equation shows that
$$\bigcap_{n=1}^{\infty} \bar{E}_n^{\,c} \subset R,$$
i.e. $R$ contains a set of the form $\bigcap_{n=1}^{\infty} V_n$ with each $V_n$ $(= \bar{E}_n^{\,c})$ being an open dense subset of $X$.
Conversely, if $\bigcap_{n=1}^{\infty} V_n \subset R$ with each $V_n$ being an open dense subset of $X$, then $R^c \subset \bigcup_{n=1}^{\infty} V_n^c$ and hence $R^c = \bigcup_{n=1}^{\infty} E_n$ where each $E_n = R^c \cap V_n^c$ is a nowhere dense subset of $X$.
(2) A countable union of meager sets is meager and any subset of a meager set is meager.
(3) A countable intersection of residual sets is residual.

Remark 17.20. The Baire Category Theorems may now be stated as follows. If $X$ is a complete metric space or $X$ is a locally compact Hausdorff space, then
(1) all residual sets are dense in $X$ and
(2) $X$ is not meager.

Remark 17.21. It should also be remarked that incomplete metric spaces may be meager. For example, let $X \subset C([0,1])$ be the subspace of polynomial functions on $[0,1]$ equipped with the supremum norm. Then $X = \bigcup_{n=1}^{\infty} E_n$ where $E_n \subset X$ denotes the subspace of polynomials of degree less than or equal to $n$. You are asked to show in Exercise 17.7 below that $E_n$ is nowhere dense for all $n$. Hence $X$ is meager and the empty set is residual in $X$.
Here is an application of Theorem 17.16.

Theorem 17.22. Let $\mathcal{N} \subset C([0,1], \mathbb{R})$ be the set of nowhere differentiable functions. (Here a function $f$ is said to be differentiable at $0$ if $f'(0) := \lim_{t \downarrow 0} \frac{f(t) - f(0)}{t}$ exists and at $1$ if $f'(1) := \lim_{t \uparrow 1} \frac{f(1) - f(t)}{1 - t}$ exists.) Then $\mathcal{N}$ is a residual set, so the generic continuous function is nowhere differentiable.

Proof. If $f \notin \mathcal{N}$, then $f'(x_0)$ exists for some $x_0 \in [0,1]$ and, by the definition of the derivative and compactness of $[0,1]$, there exists $n \in \mathbb{N}$ such that $|f(x) - f(x_0)| \le n|x - x_0|$ for all $x \in [0,1]$. Thus if we define
$$E_n := \{f \in C([0,1]) : \exists\, x_0 \in [0,1] \ni |f(x) - f(x_0)| \le n|x - x_0| \ \forall x \in [0,1]\},$$
then we have just shown $\mathcal{N}^c \subset E := \bigcup_{n=1}^{\infty} E_n$. So to finish the proof it suffices to show (for each $n$) $E_n$ is a closed subset of $C([0,1], \mathbb{R})$ with empty interior.

1) To prove $E_n$ is closed, let $\{f_m\}_{m=1}^{\infty} \subset E_n$ be a sequence of functions such that there exists $f \in C([0,1], \mathbb{R})$ with $\|f - f_m\|_u \to 0$ as $m \to \infty$. Since $f_m \in E_n$, there exists $x_m \in [0,1]$ such that
$$(17.4)\qquad |f_m(x) - f_m(x_m)| \le n|x - x_m| \quad \forall x \in [0,1].$$
Since $[0,1]$ is a compact metric space, by passing to a subsequence if necessary we may assume $x_0 = \lim_{m\to\infty} x_m \in [0,1]$ exists. Passing to the limit in Eq. (17.4), making use of the uniform convergence of $f_m \to f$ to show $\lim_{m\to\infty} f_m(x_m) = f(x_0)$, implies
$$|f(x) - f(x_0)| \le n|x - x_0| \quad \forall x \in [0,1]$$
and therefore that $f \in E_n$. This shows $E_n$ is a closed subset of $C([0,1], \mathbb{R})$.

2) To finish the proof, we will show $E_n^o = \emptyset$ by showing for each $f \in E_n$ and $\varepsilon > 0$ given, there exists $g \in C([0,1], \mathbb{R}) \setminus E_n$ such that $\|f - g\|_u < \varepsilon$. We now construct $g$.

Since $[0,1]$ is compact and $f$ is continuous there exists $N \in \mathbb{N}$ such that $|f(x) - f(y)| < \varepsilon/2$ whenever $|y - x| < 1/N$. Let $k$ denote the piecewise linear function on $[0,1]$ such that $k(m/N) = f(m/N)$ for $m = 0, 1, \dots, N$ and $k''(x) = 0$ for $x \notin \pi_N := \{m/N : m = 0, 1, \dots, N\}$. Then it is easily seen that $\|f - k\|_u < \varepsilon/2$ and, for $x \in (m/N, (m+1)/N)$, that
$$|k'(x)| = \frac{|f(\frac{m+1}{N}) - f(\frac{m}{N})|}{1/N} < \frac{N\varepsilon}{2}.$$
We now make $k$ rougher by adding a small wiggly function $h$ which we define as follows. Let $M \in \mathbb{N}$ be chosen so that $\varepsilon M > n + N\varepsilon/2$ and define $h$ uniquely by $h(m/M) = (-1)^m \varepsilon/2$ for $m = 0, 1, \dots, M$ and $h''(x) = 0$ for $x \notin \pi_M$. Then $\|h\|_u \le \varepsilon/2 < \varepsilon$ and $|h'(x)| = \varepsilon M$ for $x \notin \pi_M$. See Figure 37 below.

[Figure 37. Constructing a rough approximation, $g$, to a continuous function $f$.]

Finally define $g := k + h$. Then
$$\|f - g\|_u \le \|f - k\|_u + \|h\|_u < \varepsilon/2 + \varepsilon/2 = \varepsilon$$
and
$$|g'(x)| \ge |h'(x)| - |k'(x)| > \Big(n + \frac{N\varepsilon}{2}\Big) - \frac{N\varepsilon}{2} = n \quad \forall x \notin \pi_M \cup \pi_N.$$
It now follows from this last equation and the mean value theorem that for any $x_0 \in [0,1]$,
$$\Big|\frac{g(x) - g(x_0)}{x - x_0}\Big| > n$$
for all $x \in [0,1]$ sufficiently close to $x_0$. This shows $g \notin E_n$ and so the proof is complete.
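The construction in step 2) is easy to experiment with numerically. The following sketch is only an illustration of the idea in the proof; the sample function $f$, the target slope $n$, the tolerance $\varepsilon$, and the grid sizes $N$, $M$ are hypothetical choices, not taken from the text. It builds the piecewise linear interpolant $k$, adds the sawtooth $h$, and checks that $g = k + h$ stays uniformly close to $f$ while every difference quotient along the sample grid exceeds $n$.

```python
import numpy as np

# Sketch of the proof of Theorem 17.22 (all parameters are illustrative choices).
f = lambda x: np.sin(3.0 * x)        # a sample continuous function on [0, 1]
n, eps = 10.0, 0.1                   # want difference quotients of g to exceed n

# N from uniform continuity of f: |f(x) - f(y)| < eps/2 when |x - y| < 1/N.
N = 64
grid_N = np.linspace(0.0, 1.0, N + 1)

# M chosen so that eps*M (the slope of the sawtooth h) exceeds n + N*eps/2.
M = int(np.ceil((n + N * eps / 2.0) / eps)) + 1
grid_M = np.linspace(0.0, 1.0, M + 1)

# sample grid containing all corner points of k and h, so each consecutive
# pair of sample points lies in a single linear piece of g
x = np.unique(np.concatenate([np.linspace(0.0, 1.0, 5001), grid_N, grid_M]))
k = np.interp(x, grid_N, f(grid_N))                                   # piecewise linear interpolant of f
h = np.interp(x, grid_M, (eps / 2.0) * (-1.0) ** np.arange(M + 1))    # sawtooth of amplitude eps/2
g = k + h

print("||f - g||_u =", np.max(np.abs(f(x) - g)), "(should be < eps =", eps, ")")
dq = np.abs(np.diff(g)) / np.diff(x)
print("smallest difference quotient of g on the grid =", dq.min(), "(should exceed n =", n, ")")
```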
Here is an application of the Baire Category Theorem in Proposition 17.17.

Proposition 17.23. Suppose that $f : \mathbb{R} \to \mathbb{R}$ is a function such that $f'(x)$ exists for all $x \in \mathbb{R}$. Let
$$U := \bigcup_{\varepsilon > 0} \Big\{x \in \mathbb{R} : \sup_{|y| < \varepsilon} |f'(x + y)| < \infty\Big\}.$$
Then $U$ is a dense open set. (It is not true that $U = \mathbb{R}$ in general, see Example 16.35 above.)

Proof. It is easily seen from the definition of $U$ that $U$ is open. Let $W \subset_o \mathbb{R}$ be an open subset of $\mathbb{R}$. For $k \in \mathbb{N}$, let
$$E_k := \Big\{x \in W : |f(y) - f(x)| \le k|y - x| \text{ when } |y - x| \le \tfrac{1}{k}\Big\} = \bigcap_{z : |z| \le k^{-1}} \{x \in W : |f(x + z) - f(x)| \le k|z|\},$$
which is a closed subset of $\mathbb{R}$ since $f$ is continuous. Moreover, if $x \in W$ and $M = |f'(x)|$, then
$$|f(y) - f(x)| = |f'(x)(y - x) + o(y - x)| \le (M + 1)|y - x|$$
for $y$ close to $x$. (Here $o(y - x)$ denotes a function such that $\lim_{y \to x} o(y - x)/(y - x) = 0$.) In particular, this shows that $x \in E_k$ for all $k$ sufficiently large. Therefore $W = \bigcup_{k=1}^{\infty} E_k$ and, since $W$ is not meager by the Baire category theorem in Proposition 17.17, some $E_k$ has non-empty interior. That is, there exist $x_0 \in E_k \subset W$ and $\varepsilon > 0$ such that
$$J := (x_0 - \varepsilon, x_0 + \varepsilon) \subset E_k \subset W.$$
For $x \in J$, we have $|f(x + z) - f(x)| \le k|z|$ provided that $|z| \le k^{-1}$ and therefore $|f'(x)| \le k$ for $x \in J$. Therefore $x_0 \in U \cap W$, showing $U$ is dense.

Remark 17.24. This proposition generalizes to functions $f : \mathbb{R}^n \to \mathbb{R}^m$ in an obvious way.
For our next application of Theorem 17.16, let $X := BC^{\infty}((-1,1))$ denote the set of smooth functions $f$ on $(-1,1)$ such that $f$ and all of its derivatives are bounded. In the metric
$$\rho(f, g) := \sum_{k=0}^{\infty} 2^{-k} \frac{\|f^{(k)} - g^{(k)}\|_{\infty}}{1 + \|f^{(k)} - g^{(k)}\|_{\infty}} \quad \text{for } f, g \in X,$$
$X$ becomes a complete metric space.

Theorem 17.25. Given an increasing sequence of positive numbers $\{M_n\}_{n=1}^{\infty}$, the set
$$F := \Big\{f \in X : \limsup_{n\to\infty} \frac{|f^{(n)}(0)|}{M_n} \ge 1\Big\}$$
is dense in $X$. In particular, there is a dense set of $f \in X$ such that the power series expansion of $f$ at $0$ has zero radius of convergence.

Proof. Step 1. Let $n \in \mathbb{N}$. Choose $g \in C_c^{\infty}((-1,1))$ such that $\|g\|_{\infty} < 2^{-n}$ while $g'(0) = 2M_n$ and define
$$f_n(x) := \int_0^x dt_{n-1} \int_0^{t_{n-1}} dt_{n-2} \cdots \int_0^{t_2} dt_1\, g(t_1).$$
Then for $k < n$,
$$f_n^{(k)}(x) = \int_0^x dt_{n-k-1} \int_0^{t_{n-k-1}} dt_{n-k-2} \cdots \int_0^{t_2} dt_1\, g(t_1),$$
$f_n^{(n)}(x) = g'(x)$, $f_n^{(n)}(0) = 2M_n$ and $f_n^{(k)}$ satisfies
$$\|f_n^{(k)}\|_{\infty} \le \frac{2^{-n}}{(n-1-k)!} \le 2^{-n} \quad \text{for } k < n.$$
Consequently,
$$\rho(f_n, 0) = \sum_{k=0}^{\infty} 2^{-k} \frac{\|f_n^{(k)}\|_{\infty}}{1 + \|f_n^{(k)}\|_{\infty}} \le \sum_{k=0}^{n-1} 2^{-k}\, 2^{-n} + \sum_{k=n}^{\infty} 2^{-k} \cdot 1 \le 2\big(2^{-n} + 2^{-n}\big) = 4 \cdot 2^{-n}.$$
Thus we have constructed $f_n \in X$ such that $\lim_{n\to\infty} \rho(f_n, 0) = 0$ while $f_n^{(n)}(0) = 2M_n$ for all $n$.

Step 2. The set
$$G_n := \bigcup_{m \ge n} \big\{f \in X : |f^{(m)}(0)| > M_m\big\}$$
is a dense open subset of $X$. The fact that $G_n$ is open is clear. To see that $G_n$ is dense, let $g \in X$ be given and define $g_m := g + \varepsilon_m f_m$ where $\varepsilon_m := \operatorname{sgn}(g^{(m)}(0))$. Then
$$|g_m^{(m)}(0)| = |g^{(m)}(0)| + |f_m^{(m)}(0)| \ge 2M_m > M_m \quad \text{for all } m.$$
Therefore, $g_m \in G_n$ for all $m \ge n$ and since
$$\rho(g_m, g) = \rho(f_m, 0) \to 0 \text{ as } m \to \infty$$
it follows that $g \in \bar{G}_n$.

Step 3. By the Baire Category theorem, $\bigcap G_n$ is a dense subset of $X$. This completes the proof of the first assertion since
$$F = \Big\{f \in X : \limsup_{n\to\infty} \frac{|f^{(n)}(0)|}{M_n} \ge 1\Big\} \supset \bigcap_{m=1}^{\infty} \big\{f \in X : |f^{(n)}(0)| > M_n \text{ for some } n \ge m\big\} = \bigcap_{n=1}^{\infty} G_n.$$

Step 4. Take $M_n = (n!)^2$ and recall that the power series expansion for $f$ near $0$ is given by $\sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^n$. This series can not converge for any $f \in F$ and any $x \neq 0$ because
$$\limsup_{n\to\infty} \Big|\frac{f^{(n)}(0)}{n!} x^n\Big| = \limsup_{n\to\infty} \Big[\frac{|f^{(n)}(0)|}{(n!)^2}\, n! |x|^n\Big] = \limsup_{n\to\infty} \frac{|f^{(n)}(0)|}{(n!)^2} \cdot \lim_{n\to\infty} n! |x|^n = \infty,$$
where we have used $\lim_{n\to\infty} n! |x|^n = \infty$ and $\limsup_{n\to\infty} \frac{|f^{(n)}(0)|}{(n!)^2} \ge 1$.
Remark 17.26. Given a sequence of real numbers $\{a_n\}_{n=0}^{\infty}$ there always exists $f \in X$ such that $f^{(n)}(0) = a_n$. To construct such a function $f$, let $\phi \in C_c^{\infty}((-1,1))$ be a function such that $\phi = 1$ in a neighborhood of $0$ and let $\varepsilon_n \in (0,1)$ be chosen so that $\varepsilon_n \downarrow 0$ as $n \to \infty$ and $\sum_{n=0}^{\infty} |a_n| \varepsilon_n^n < \infty$. The desired function $f$ can then be defined by
$$(17.5)\qquad f(x) = \sum_{n=0}^{\infty} \frac{a_n}{n!} x^n \phi(x/\varepsilon_n) =: \sum_{n=0}^{\infty} g_n(x).$$
The fact that $f$ is well defined and continuous follows from the estimate
$$|g_n(x)| = \Big|\frac{a_n}{n!} x^n \phi(x/\varepsilon_n)\Big| \le \frac{\|\phi\|_{\infty}}{n!} |a_n| \varepsilon_n^n$$
and the assumption that $\sum_{n=0}^{\infty} |a_n| \varepsilon_n^n < \infty$. The estimate
$$|g_n'(x)| = \Big|\frac{a_n}{(n-1)!} x^{n-1} \phi(x/\varepsilon_n) + \frac{a_n}{n!} \varepsilon_n^{-1} x^n \phi'(x/\varepsilon_n)\Big| \le \frac{\|\phi\|_{\infty}}{(n-1)!} |a_n| \varepsilon_n^{n-1} + \frac{\|\phi'\|_{\infty}}{n!} |a_n| \varepsilon_n^{n-1} \le \big(\|\phi\|_{\infty} + \|\phi'\|_{\infty}\big) |a_n| \varepsilon_n^{n-1}$$
and the assumption that $\sum_{n=0}^{\infty} |a_n| \varepsilon_n^{n-1} < \infty$ (which we may also arrange by shrinking the $\varepsilon_n$ if necessary) shows $f \in C^1((-1,1))$ and $f'(x) = \sum_{n=0}^{\infty} g_n'(x)$. Similar arguments show $f \in C^k((-1,1))$ and $f^{(k)}(x) = \sum_{n=0}^{\infty} g_n^{(k)}(x)$ for all $x$ and $k \in \mathbb{N}$. This completes the proof since, using $\phi(x/\varepsilon_n) = 1$ for $x$ in a neighborhood of $0$, $g_n^{(k)}(0) = \delta_{k,n} a_k$ and hence
$$f^{(k)}(0) = \sum_{n=0}^{\infty} g_n^{(k)}(0) = a_k.$$
17.6. Exercises.

Exercise 17.1. Prove item 1. of Proposition 17.8. Hint: show $X$ is not connected implies $X$ is not path connected.

Exercise 17.2. Prove item 2. of Proposition 17.8. Hint: fix $x_0 \in X$ and let $W$ denote the set of $x \in X$ such that there exists $\sigma \in C([0,1], X)$ satisfying $\sigma(0) = x_0$ and $\sigma(1) = x$. Then show $W$ is both open and closed.

Exercise 17.3. Prove item 3. of Proposition 17.8.

Exercise 17.4. Let
$$X := \big\{(x, y) \in \mathbb{R}^2 : y = \sin(x^{-1})\big\} \cup \{(0,0)\}$$
equipped with the relative topology induced from the standard topology on $\mathbb{R}^2$. Show $X$ is connected but not path connected.

Exercise 17.5. Prove the following strong version of item 3. of Proposition 17.8, namely to every pair of points $x_0, x_1$ in a connected open subset $V$ of $\mathbb{R}^n$ there exists $\sigma \in C^{\infty}(\mathbb{R}, V)$ such that $\sigma(0) = x_0$ and $\sigma(1) = x_1$. Hint: Use a convolution argument.

Exercise 17.6. Folland 5.27. Hint: Consider the generalized Cantor sets discussed on p. 39 of Folland.

Exercise 17.7. Let $(X, \|\cdot\|)$ be an infinite dimensional normed space and $E \subset X$ be a finite dimensional subspace. Show that $E \subset X$ is nowhere dense.

Exercise 17.8. Now suppose that $(X, \|\cdot\|)$ is an infinite dimensional Banach space. Show that $X$ can not have a countable algebraic basis. More explicitly, there is no countable subset $S \subset X$ such that every element $x \in X$ may be written as a finite linear combination of elements from $S$. Hint: make use of Exercise 17.7 and the Baire category theorem.
18. Banach Spaces II

Theorem 18.1 (Open Mapping Theorem). Let $X, Y$ be Banach spaces, $T \in L(X, Y)$. If $T$ is surjective then $T$ is an open mapping, i.e. $T(V)$ is open in $Y$ for all open subsets $V \subset X$.

Proof. For all $\alpha > 0$ let $B_\alpha^X = \{x \in X : \|x\|_X < \alpha\} \subset X$, $B_\alpha^Y = \{y \in Y : \|y\|_Y < \alpha\} \subset Y$ and $E_\alpha = T(B_\alpha^X) \subset Y$. The proof will be carried out by proving the following three assertions.
(1) There exists $\delta > 0$ such that $B_{\delta\alpha}^Y \subset \bar{E}_\alpha$ for all $\alpha > 0$.
(2) For the same $\delta > 0$, $B_{\delta\alpha}^Y \subset E_\alpha$, i.e. we may remove the closure in assertion 1.
(3) The last assertion implies $T$ is an open mapping.

1. Since $Y = \bigcup_{n \ge 1} \bar{E}_n$, the Baire category Theorem 17.16 implies there exists $n$ such that $\bar{E}_n^o \neq \emptyset$, i.e. there exist $y \in \bar{E}_n$ and $\varepsilon > 0$ such that $B^Y(y, \varepsilon) \subset \bar{E}_n$. Suppose $\|y'\| < \varepsilon$; then $y$ and $y + y'$ are in $B^Y(y, \varepsilon) \subset \bar{E}_n$, hence there exist $x', x \in B_n^X$ such that $\|Tx' - (y + y')\|$ and $\|Tx - y\|$ may be made as small as we please, which we abbreviate as follows:
$$\|Tx' - (y + y')\| \approx 0 \quad \text{and} \quad \|Tx - y\| \approx 0.$$
Hence by the triangle inequality,
$$\|T(x' - x) - y'\| = \|Tx' - (y + y') - (Tx - y)\| \le \|Tx' - (y + y')\| + \|Tx - y\| \approx 0$$
with $x' - x \in B_{2n}^X$. This shows that $y' \in \bar{E}_{2n}$, which implies $B^Y(0, \varepsilon) \subset \bar{E}_{2n}$. Since the map $\varphi_\alpha : Y \to Y$ given by $\varphi_\alpha(y) = \frac{\alpha}{2n} y$ is a homeomorphism with $\varphi_\alpha(\bar{E}_{2n}) = \bar{E}_\alpha$ and $\varphi_\alpha(B^Y(0, \varepsilon)) = B^Y(0, \frac{\alpha\varepsilon}{2n})$, it follows that $B_{\delta\alpha}^Y \subset \bar{E}_\alpha$ where $\delta := \frac{\varepsilon}{2n} > 0$.

2. Let $\delta$ be as in assertion 1., $y \in B_\delta^Y$ and $\alpha_1 \in (\|y\|/\delta, 1)$. Choose $\{\alpha_n\}_{n=2}^{\infty} \subset (0, \infty)$ such that $\sum_{n=1}^{\infty} \alpha_n < 1$. Since $y \in B_{\delta\alpha_1}^Y \subset \bar{E}_{\alpha_1} = \overline{T(B_{\alpha_1}^X)}$, by assertion 1. there exists $x_1 \in B_{\alpha_1}^X$ such that $\|y - Tx_1\| < \delta\alpha_2$. (Notice that $\|y - Tx_1\|$ can be made as small as we please.) Similarly, since $y - Tx_1 \in B_{\delta\alpha_2}^Y \subset \bar{E}_{\alpha_2} = \overline{T(B_{\alpha_2}^X)}$, there exists $x_2 \in B_{\alpha_2}^X$ such that $\|y - Tx_1 - Tx_2\| < \delta\alpha_3$. Continuing this way inductively, there exist $x_n \in B_{\alpha_n}^X$ such that
$$(18.1)\qquad \Big\|y - \sum_{k=1}^{n} Tx_k\Big\| < \delta\alpha_{n+1} \quad \text{for all } n \in \mathbb{N}.$$
Since $\sum_{n=1}^{\infty} \|x_n\| < \sum_{n=1}^{\infty} \alpha_n < 1$, $x := \sum_{n=1}^{\infty} x_n$ exists and $\|x\| < 1$, i.e. $x \in B_1^X$. Passing to the limit in Eq. (18.1) shows $\|y - Tx\| = 0$ and hence $y \in T(B_1^X) = E_1$. Therefore we have shown $B_\delta^Y \subset E_1$. The same scaling argument as above then shows $B_{\delta\alpha}^Y \subset E_\alpha$ for all $\alpha > 0$.

3. If $x \in V \subset_o X$ and $y = Tx \in TV$, we must show that $TV$ contains a ball $B^Y(y, \varepsilon) = Tx + B_\varepsilon^Y$ for some $\varepsilon > 0$. Now $B^Y(y, \varepsilon) = Tx + B_\varepsilon^Y \subset TV$ iff $B_\varepsilon^Y \subset TV - Tx = T(V - x)$. Since $V - x$ is a neighborhood of $0 \in X$, there exists $\alpha > 0$ such that $B_\alpha^X \subset (V - x)$ and hence by assertion 2., $B_{\delta\alpha}^Y \subset T B_\alpha^X \subset T(V - x)$ and therefore $B^Y(y, \varepsilon) \subset TV$ with $\varepsilon := \delta\alpha$.

Corollary 18.2. If $X, Y$ are Banach spaces and $T \in L(X, Y)$ is invertible (i.e. a bijective linear transformation) then the inverse map, $T^{-1}$, is bounded, i.e. $T^{-1} \in L(Y, X)$. (Note that $T^{-1}$ is automatically linear.)
Theorem 18.3 (Closed Graph Theorem). Let $X$ and $Y$ be Banach spaces and $T : X \to Y$ linear. Then $T$ is continuous iff $T$ is closed, i.e. $\Gamma(T) \subset X \times Y$ is closed.

Proof. If $T$ is continuous and $(x_n, Tx_n) \to (x, y) \in X \times Y$ as $n \to \infty$, then $Tx_n \to Tx = y$, which implies $(x, y) = (x, Tx) \in \Gamma(T)$.

Conversely, suppose $T$ is closed and let $\Gamma(x) := (x, Tx)$, so that $\Gamma(T) = \Gamma(X) \subset X \times Y$ is a closed subspace, and let $\pi_1(x, y) = x$ and $\pi_2(x, y) = y$ be the two (continuous) projections of $X \times Y$. Then $\pi_1|_{\Gamma(T)} : \Gamma(T) \to X$ is a continuous bijection between Banach spaces, which implies $\big(\pi_1|_{\Gamma(T)}\big)^{-1}$ is bounded by the open mapping Theorem 18.1. Hence $T = \pi_2 \circ \big(\pi_1|_{\Gamma(T)}\big)^{-1}$ is bounded, being the composition of bounded operators.

As an application we have the following proposition.

Proposition 18.4. Let $H$ be a Hilbert space. Suppose that $T : H \to H$ is a linear (not necessarily bounded) map such that there exists $T^* : H \to H$ such that
$$\langle Tx, y\rangle = \langle x, T^* y\rangle \quad \forall x, y \in H.$$
Then $T$ is bounded.

Proof. It suffices to show $T$ is closed. To prove this suppose that $x_n \in H$ are such that $(x_n, Tx_n) \to (x, y) \in H \times H$. Then for any $z \in H$,
$$\langle Tx_n, z\rangle = \langle x_n, T^* z\rangle \to \langle x, T^* z\rangle = \langle Tx, z\rangle \quad \text{as } n \to \infty.$$
On the other hand $\lim_{n\to\infty} \langle Tx_n, z\rangle = \langle y, z\rangle$ as well, and therefore $\langle Tx, z\rangle = \langle y, z\rangle$ for all $z \in H$. This shows that $Tx = y$ and proves that $T$ is closed.
Here is another example.

Example 18.5. Suppose that $M \subset L^2([0,1], m)$ is a closed subspace such that each element of $M$ has a representative in $C([0,1])$. We will abuse notation and simply write $M \subset C([0,1])$. Then
(1) There exists $A \in (0, \infty)$ such that $\|f\|_{\infty} \le A\|f\|_{L^2}$ for all $f \in M$.
(2) For all $x \in [0,1]$ there exists $g_x \in M$ such that
$$f(x) = \langle f, g_x\rangle \quad \text{for all } f \in M.$$
Moreover we have $\|g_x\| \le A$.
(3) The subspace $M$ is finite dimensional and $\dim(M) \le A^2$.

Proof. 1) I will give two proofs of part 1. Each proof requires that we first show that $(M, \|\cdot\|_{\infty})$ is a complete space. To prove this it suffices to show $M$ is a closed subspace of $C([0,1])$. So let $\{f_n\} \subset M$ and $f \in C([0,1])$ be such that $\|f_n - f\|_{\infty} \to 0$ as $n \to \infty$. Then $\|f_n - f_m\|_{L^2} \le \|f_n - f_m\|_{\infty} \to 0$ as $m, n \to \infty$, and since $M$ is closed in $L^2([0,1])$, $L^2\text{-}\lim_{n\to\infty} f_n = g \in M$. By passing to a subsequence if necessary we know that $g(x) = \lim_{n\to\infty} f_n(x) = f(x)$ for $m$-a.e. $x$. So $f = g \in M$.

i) Let $i : (M, \|\cdot\|_{\infty}) \to (M, \|\cdot\|_2)$ be the identity map. Then $i$ is bounded and bijective. By the open mapping theorem, $j = i^{-1}$ is bounded as well. Hence there exists $A < \infty$ such that $\|f\|_{\infty} = \|j(f)\| \le A\|f\|_2$ for all $f \in M$.

ii) Let $j : (M, \|\cdot\|_2) \to (M, \|\cdot\|_{\infty})$ be the identity map. We will show that $j$ is a closed operator and hence bounded by the closed graph theorem. Suppose that $f_n \in M$ are such that $f_n \to f$ in $L^2$ and $f_n = j(f_n) \to g$ in $C([0,1])$. Then as in the first paragraph, we conclude that $g = f = j(f)$ a.e., showing $j$ is closed. Now finish as in the last line of proof i).

2) For $x \in [0,1]$, let $e_x : M \to \mathbb{C}$ be the evaluation map $e_x(f) = f(x)$. Then
$$|e_x(f)| \le |f(x)| \le \|f\|_{\infty} \le A\|f\|_{L^2},$$
which shows that $e_x \in M^*$. Hence there exists a unique element $g_x \in M$ such that
$$f(x) = e_x(f) = \langle f, g_x\rangle \quad \text{for all } f \in M.$$
Moreover $\|g_x\|_{L^2} = \|e_x\|_{M^*} \le A$.

3) Let $\{f_j\}_{j=1}^{n}$ be an $L^2$-orthonormal subset of $M$. Then
$$A^2 \ge \|e_x\|_{M^*}^2 = \|g_x\|_{L^2}^2 \ge \sum_{j=1}^{n} |\langle f_j, g_x\rangle|^2 = \sum_{j=1}^{n} |f_j(x)|^2$$
and integrating this equation over $x \in [0,1]$ implies that
$$A^2 \ge \sum_{j=1}^{n} \int_0^1 |f_j(x)|^2 dx = \sum_{j=1}^{n} 1 = n,$$
which shows that $n \le A^2$. Hence $\dim(M) \le A^2$.

Remark 18.6. Keeping the notation in Example 18.5, set $G(x, y) = g_x(y)$ for all $x, y \in [0,1]$. Then
$$f(x) = e_x(f) = \int_0^1 f(y) \overline{G(x, y)}\, dy \quad \text{for all } f \in M.$$
The function $G$ is called the reproducing kernel for $M$.
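For a concrete finite dimensional instance of Remark 18.6, take $M$ to be the real polynomials of degree at most $d$, viewed inside $L^2([0,1], m)$. The sketch below is only an illustration; the degree $d$, the quadrature rule, and the test polynomial $f$ are hypothetical choices, not taken from the text. It builds an orthonormal basis $\{f_j\}$ of $M$ by Gram-Schmidt, forms $G(x, y) = \sum_j f_j(x) f_j(y)$, and checks the reproducing property $f(x) = \int_0^1 f(y) G(x, y)\,dy$ at a few points.

```python
import numpy as np

# Reproducing kernel of M = {polynomials of degree <= d} in L^2([0,1]); illustrative only.
d = 3
y, w = np.polynomial.legendre.leggauss(50)   # Gauss-Legendre rule on [-1, 1]
y, w = 0.5 * (y + 1.0), 0.5 * w              # transported to [0, 1]

# Gram-Schmidt on 1, t, ..., t^d in the L^2([0,1]) inner product (values at the nodes y)
basis = []
for j in range(d + 1):
    v = y ** j
    for b in basis:
        v = v - np.sum(w * v * b) * b
    basis.append(v / np.sqrt(np.sum(w * v * v)))

f = lambda t: 2.0 * t ** 3 - t + 0.25        # an element of M
for i in [5, 25, 45]:                        # check the reproducing property at x = y[i]
    Gx = sum(b[i] * b for b in basis)        # G(x, .) = sum_j f_j(x) f_j(.)
    print(y[i], f(y[i]), np.sum(w * f(y) * Gx))   # f(x) versus  int f(y) G(x, y) dy
```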
The above example generalizes as follows.

Proposition 18.7. Suppose that $(X, \mathcal{M}, \mu)$ is a finite measure space, $p \in [1, \infty)$ and $W$ is a closed subspace of $L^p(\mu)$ such that $W \subset L^p(\mu) \cap L^{\infty}(\mu)$. Then $\dim(W) < \infty$.

Proof. Without loss of generality we may assume that $\mu(X) = 1$. As in Example 18.5, we show that $W$ is a closed subspace of $L^{\infty}(\mu)$ and hence, by the open mapping theorem, there exists a constant $A < \infty$ such that $\|f\|_{\infty} \le A\|f\|_p$ for all $f \in W$. Now if $1 \le p \le 2$, then
$$\|f\|_{\infty} \le A\|f\|_p \le A\|f\|_2,$$
and if $p \in (2, \infty)$, then $\|f\|_p^p \le \|f\|_2^2 \|f\|_{\infty}^{p-2}$, or equivalently,
$$\|f\|_p \le \|f\|_2^{2/p} \|f\|_{\infty}^{1-2/p} \le \|f\|_2^{2/p} \big(A\|f\|_p\big)^{1-2/p},$$
from which we learn that $\|f\|_p \le A^{p/2-1}\|f\|_2$ and therefore that $\|f\|_{\infty} \le A \cdot A^{p/2-1}\|f\|_2$; so that in any case there exists a constant $B < \infty$ such that
$$\|f\|_{\infty} \le B\|f\|_2.$$
Let $\{f_n\}_{n=1}^{N}$ be an orthonormal subset of $W$ and $f = \sum_{n=1}^{N} c_n f_n$ with $c_n \in \mathbb{C}$; then
$$\Big\|\sum_{n=1}^{N} c_n f_n\Big\|_{\infty}^2 \le B^2 \sum_{n=1}^{N} |c_n|^2 = B^2 |c|^2$$
where $|c|^2 := \sum_{n=1}^{N} |c_n|^2$. For each $c \in \mathbb{C}^N$, there is an exceptional set $E_c$ such that for $x \notin E_c$,
$$\Big|\sum_{n=1}^{N} c_n f_n(x)\Big|^2 \le B^2 |c|^2.$$
Let $D := (\mathbb{Q} + i\mathbb{Q})^N$ and $E = \bigcup_{c \in D} E_c$. Then $\mu(E) = 0$ and for $x \notin E$, $\big|\sum_{n=1}^{N} c_n f_n(x)\big|^2 \le B^2 |c|^2$ for all $c \in D$. By continuity it then follows for $x \notin E$ that
$$\Big|\sum_{n=1}^{N} c_n f_n(x)\Big|^2 \le B^2 |c|^2 \quad \text{for all } c \in \mathbb{C}^N.$$
Taking $c_n = \overline{f_n(x)}$ in this inequality implies that
$$\Big(\sum_{n=1}^{N} |f_n(x)|^2\Big)^2 \le B^2 \sum_{n=1}^{N} |f_n(x)|^2 \quad \text{for all } x \notin E$$
and therefore that
$$\sum_{n=1}^{N} |f_n(x)|^2 \le B^2 \quad \text{for all } x \notin E.$$
Integrating this equation over $x$ then implies that $N \le B^2$, i.e. $\dim(W) \le B^2$.
Theorem 18.8 (Uniform Boundedness Principle). Let $X$ and $Y$ be normed vector spaces, $\mathcal{A} \subset L(X, Y)$ be a collection of bounded linear operators from $X$ to $Y$,
$$F = F_{\mathcal{A}} = \{x \in X : \sup_{A \in \mathcal{A}} \|Ax\| < \infty\} \quad \text{and}$$
$$(18.2)\qquad R = R_{\mathcal{A}} = F^c = \{x \in X : \sup_{A \in \mathcal{A}} \|Ax\| = \infty\}.$$
(1) If $\sup_{A \in \mathcal{A}} \|A\| < \infty$ then $F = X$.
(2) If $F$ is not meager, then $\sup_{A \in \mathcal{A}} \|A\| < \infty$.
(3) If $X$ is a Banach space, $F$ is not meager iff $\sup_{A \in \mathcal{A}} \|A\| < \infty$. In particular, if $\sup_{A \in \mathcal{A}} \|Ax\| < \infty$ for all $x \in X$ then $\sup_{A \in \mathcal{A}} \|A\| < \infty$.
(4) If $X$ is a Banach space, then $\sup_{A \in \mathcal{A}} \|A\| = \infty$ iff $R$ is residual. In particular if $\sup_{A \in \mathcal{A}} \|A\| = \infty$ then $\sup_{A \in \mathcal{A}} \|Ax\| = \infty$ for $x$ in a dense subset of $X$.

Proof. 1. If $M := \sup_{A \in \mathcal{A}} \|A\| < \infty$, then $\sup_{A \in \mathcal{A}} \|Ax\| \le M\|x\| < \infty$ for all $x \in X$, showing $F = X$.

2. For each $n \in \mathbb{N}$, let $E_n \subset X$ be the closed set given by
$$E_n = \{x : \sup_{A \in \mathcal{A}} \|Ax\| \le n\} = \bigcap_{A \in \mathcal{A}} \{x : \|Ax\| \le n\}.$$
Then $F = \bigcup_{n=1}^{\infty} E_n$, which is assumed to be non-meager, and hence there exists an $n \in \mathbb{N}$ such that $E_n$ has non-empty interior. Let $B_x(\delta)$ be a ball such that $\overline{B_x(\delta)} \subset E_n$. Then for $y \in X$ with $\|y\| = \delta$ we know $x - y \in \overline{B_x(\delta)} \subset E_n$, so that $Ay = Ax - A(x - y)$ and hence for any $A \in \mathcal{A}$,
$$\|Ay\| \le \|Ax\| + \|A(x - y)\| \le n + n = 2n.$$
Hence it follows that $\|A\| \le 2n/\delta$ for all $A \in \mathcal{A}$, i.e. $\sup_{A \in \mathcal{A}} \|A\| \le 2n/\delta < \infty$.

3. If $X$ is a Banach space, $F = X$ is not meager by the Baire Category Theorem 17.16. So item 3. follows from items 1. and 2. and the fact that $F = X$ iff $\sup_{A \in \mathcal{A}} \|Ax\| < \infty$ for all $x \in X$.

4. Item 3. is equivalent to: $F$ is meager iff $\sup_{A \in \mathcal{A}} \|A\| = \infty$. Since $R = F^c$, $R$ is residual iff $F$ is meager, so $R$ is residual iff $\sup_{A \in \mathcal{A}} \|A\| = \infty$.
Remarks 18.9. Let $S \subset X$ be the unit sphere in $X$, $f_A(x) = Ax$ for $x \in S$ and $A \in \mathcal{A}$.
(1) The assertion "$\sup_{A \in \mathcal{A}} \|Ax\| < \infty$ for all $x \in X$ implies $\sup_{A \in \mathcal{A}} \|A\| < \infty$" may be interpreted as follows. If $\sup_{A \in \mathcal{A}} \|f_A(x)\| < \infty$ for all $x \in S$, then $\sup_{A \in \mathcal{A}} \|f_A\|_u < \infty$ where $\|f_A\|_u := \sup_{x \in S} \|f_A(x)\| = \|A\|$.
(2) If $\dim(X) < \infty$ we may give a simple proof of this assertion. Indeed if $\{e_n\}_{n=1}^{N} \subset S$ is a basis for $X$, there is a constant $\varepsilon > 0$ such that $\big\|\sum_{n=1}^{N} \lambda_n e_n\big\| \ge \varepsilon \sum_{n=1}^{N} |\lambda_n|$ and so the assumption $\sup_{A \in \mathcal{A}} \|f_A(x)\| < \infty$ implies
$$\sup_{A \in \mathcal{A}} \|A\| = \sup_{A \in \mathcal{A}} \sup_{\lambda \neq 0} \frac{\big\|\sum_{n=1}^{N} \lambda_n A e_n\big\|}{\big\|\sum_{n=1}^{N} \lambda_n e_n\big\|} \le \sup_{A \in \mathcal{A}} \sup_{\lambda \neq 0} \frac{\sum_{n=1}^{N} |\lambda_n| \|A e_n\|}{\varepsilon \sum_{n=1}^{N} |\lambda_n|} \le \varepsilon^{-1} \sup_{A \in \mathcal{A}} \sup_n \|A e_n\| = \varepsilon^{-1} \sup_n \sup_{A \in \mathcal{A}} \|A e_n\| < \infty.$$
Notice that we have used the linearity of each $A \in \mathcal{A}$ in a crucial way.
(3) If we drop the linearity assumption, so that $f_A \in C(S, Y)$ for all $A \in \mathcal{A}$ ($\mathcal{A}$ now some index set), then it is no longer true that $\sup_{A \in \mathcal{A}} \|f_A(x)\| < \infty$ for all $x \in S$ implies $\sup_{A \in \mathcal{A}} \|f_A\|_u < \infty$. The reader is invited to construct a counterexample when $X = \mathbb{R}^2$ and $Y = \mathbb{R}$ by finding a sequence $\{f_n\}_{n=1}^{\infty}$ of continuous functions on $S^1$ such that $\lim_{n\to\infty} f_n(x) = 0$ for all $x \in S^1$ while $\lim_{n\to\infty} \|f_n\|_{C(S^1)} = \infty$.
(4) The assumption that $X$ is a Banach space in item 3. of Theorem 18.8 can not be dropped. For example, let $X \subset C([0,1])$ be the polynomial functions on $[0,1]$ equipped with the uniform norm $\|\cdot\|_u$ and for $t \in (0,1]$, let $f_t(x) := (x(t) - x(0))/t$ for all $x \in X$. Then $\lim_{t \downarrow 0} f_t(x) = \frac{d}{dt}\big|_0 x(t)$ and therefore $\sup_{t \in (0,1]} |f_t(x)| < \infty$ for all $x \in X$. If the conclusion of Theorem 18.8 (item 3.) were true, we would have $M := \sup_{t \in (0,1]} \|f_t\| < \infty$. This would then imply
$$\Big|\frac{x(t) - x(0)}{t}\Big| \le M\|x\|_u \quad \text{for all } x \in X \text{ and } t \in (0,1].$$
Letting $t \downarrow 0$ in this equation gives $|\dot{x}(0)| \le M\|x\|_u$ for all $x \in X$. But taking $x(t) = (1 - t)^n$ in this inequality shows $M = \infty$.
Example 18.10. Suppose that $\{c_n\}_{n=1}^{\infty} \subset \mathbb{C}$ is a sequence of numbers such that
$$\lim_{N\to\infty} \sum_{n=1}^{N} a_n c_n \text{ exists in } \mathbb{C} \text{ for all } a \in \ell^1.$$
Then $c \in \ell^{\infty}$.

Proof. Let $f_N \in (\ell^1)^*$ be given by $f_N(a) = \sum_{n=1}^{N} a_n c_n$ and set $M_N := \max\{|c_n| : n = 1, \dots, N\}$. Then
$$|f_N(a)| \le M_N \|a\|_{\ell^1}$$
and, by taking $a = e_k$ with $k$ such that $M_N = |c_k|$, we learn that $\|f_N\| = M_N$. Now by assumption, $\lim_{N\to\infty} f_N(a)$ exists for all $a \in \ell^1$ and in particular,
$$\sup_N |f_N(a)| < \infty \quad \text{for all } a \in \ell^1.$$
So by Theorem 18.8,
$$\infty > \sup_N \|f_N\| = \sup_N M_N = \sup\{|c_n| : n = 1, 2, 3, \dots\}.$$
18.1. Applications to Fourier Series. Let $T = S^1$ be the unit circle in $\mathbb{C}$ and $m$ denote the normalized arc length measure on $T$. So if $f : T \to [0, \infty)$ is measurable, then
$$\int_T f(w)\,dw := \int_T f\,dm := \frac{1}{2\pi} \int_{-\pi}^{\pi} f(e^{i\theta})\,d\theta.$$
Also let $\phi_n(z) = z^n$ for all $n \in \mathbb{Z}$. Recall that $\{\phi_n\}_{n \in \mathbb{Z}}$ is an orthonormal basis for $L^2(T)$. For $n \in \mathbb{N}$ let
$$s_n(f, z) := \sum_{k=-n}^{n} \langle f, \phi_k\rangle \phi_k(z) = \sum_{k=-n}^{n} \langle f, \phi_k\rangle z^k = \sum_{k=-n}^{n} \Big(\int_T f(w)\bar{w}^k\,dw\Big) z^k = \int_T f(w) \Big(\sum_{k=-n}^{n} \bar{w}^k z^k\Big) dw = \int_T f(w)\, d_n(z\bar{w})\,dw,$$
where $d_n(\alpha) := \sum_{k=-n}^{n} \alpha^k$. Now $\alpha\, d_n(\alpha) - d_n(\alpha) = \alpha^{n+1} - \alpha^{-n}$, so that
$$d_n(\alpha) := \sum_{k=-n}^{n} \alpha^k = \frac{\alpha^{n+1} - \alpha^{-n}}{\alpha - 1}$$
with the convention that
$$\frac{\alpha^{n+1} - \alpha^{-n}}{\alpha - 1}\Big|_{\alpha = 1} = \lim_{\alpha \to 1} \frac{\alpha^{n+1} - \alpha^{-n}}{\alpha - 1} = 2n + 1 = \sum_{k=-n}^{n} 1^k.$$
Writing $\alpha = e^{i\theta}$, we find
$$D_n(\theta) := d_n(e^{i\theta}) = \frac{e^{i\theta(n+1)} - e^{-i\theta n}}{e^{i\theta} - 1} = \frac{e^{i\theta(n+1/2)} - e^{-i\theta(n+1/2)}}{e^{i\theta/2} - e^{-i\theta/2}} = \frac{\sin\big((n + \frac{1}{2})\theta\big)}{\sin\frac{1}{2}\theta}.$$
Recall by Hilbert space theory, $L^2(T)\text{-}\lim_{n\to\infty} s_n(f, \cdot) = f$ for all $f \in L^2(T)$. We will now show that the convergence is not pointwise for all $f \in C(T) \subset L^2(T)$.
Proposition 18.11. For each $z \in T$, there exists a residual set $R_z \subset C(T)$ such that $\sup_n |s_n(f, z)| = \infty$ for all $f \in R_z$. Recall that $C(T)$ is a complete metric space, hence $R_z$ is a dense subset of $C(T)$.

Proof. By symmetry considerations, it suffices to take $z = 1 \in T$. Let $\Lambda_n : C(T) \to \mathbb{C}$ be given by
$$\Lambda_n f := s_n(f, 1) = \int_T f(w)\, d_n(\bar{w})\,dw.$$
From Corollary 15.42 we know that
$$(18.3)\qquad \|\Lambda_n\| = \|d_n\|_1 = \int_T |d_n(\bar{w})|\,dw = \frac{1}{2\pi} \int_{-\pi}^{\pi} \big|d_n(e^{i\theta})\big|\,d\theta = \frac{1}{2\pi} \int_{-\pi}^{\pi} \Big|\frac{\sin(n + \frac{1}{2})\theta}{\sin\frac{1}{2}\theta}\Big|\,d\theta,$$
which can also be proved directly as follows. Since
$$|\Lambda_n f| = \Big|\int_T f(w)\, d_n(\bar{w})\,dw\Big| \le \int_T |f(w)\, d_n(\bar{w})|\,dw \le \|f\|_{\infty} \int_T |d_n(\bar{w})|\,dw,$$
we learn $\|\Lambda_n\| \le \int_T |d_n(\bar{w})|\,dw$. Since $C(T)$ is dense in $L^1(T)$, there exist $f_k \in C(T, \mathbb{R})$ such that $f_k(w) \to \operatorname{sgn} d_n(\bar{w})$ in $L^1$. By replacing $f_k$ by $(f_k \wedge 1) \vee (-1)$ we may assume that $\|f_k\|_{\infty} \le 1$. It now follows that
$$\|\Lambda_n\| \ge \frac{|\Lambda_n f_k|}{\|f_k\|_{\infty}} \ge \Big|\int_T f_k(w)\, d_n(\bar{w})\,dw\Big|,$$
and passing to the limit as $k \to \infty$ implies that $\|\Lambda_n\| \ge \int_T |d_n(\bar{w})|\,dw$.

We have
$$\sin x = \int_0^x \cos y\,dy \le \int_0^x |\cos y|\,dy \le x$$
for all $x \ge 0$, and since $\sin x$ is odd, $|\sin x| \le |x|$ for all $x \in \mathbb{R}$. Using this in Eq. (18.3) implies that
$$\|\Lambda_n\| \ge \frac{1}{2\pi} \int_{-\pi}^{\pi} \Big|\frac{\sin(n + \frac{1}{2})\theta}{\frac{1}{2}\theta}\Big|\,d\theta = \frac{2}{\pi} \int_0^{\pi} \big|\sin(n + \tfrac{1}{2})\theta\big| \frac{d\theta}{\theta} = \frac{2}{\pi} \int_0^{(n+\frac{1}{2})\pi} |\sin y| \frac{dy}{y} \to \infty \quad \text{as } n \to \infty,$$
and hence $\sup_n \|\Lambda_n\| = \infty$. So by Theorem 18.8,
$$R_1 = \{f \in C(T) : \sup_n |\Lambda_n f| = \infty\}$$
is a residual set.

See Rudin Chapter 5 for more details.
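The divergence of $\|\Lambda_n\| = \|d_n\|_{L^1(T)}$ is easy to observe numerically; in fact these Lebesgue constants grow like $\frac{4}{\pi^2}\log n$. The quick check below is an illustration only; the grid size and the sample values of $n$ are arbitrary choices.

```python
import numpy as np

# Numerical check that the Lebesgue constants ||d_n||_{L^1(T)} diverge (roughly like
# (4/pi^2) log n).  The theta grid deliberately has an even number of points so that
# theta = 0 is avoided and the quotient sin((n+1/2)theta)/sin(theta/2) is well defined.
theta = np.linspace(-np.pi, np.pi, 200000)
for n in [10, 100, 1000, 10000]:
    Dn = np.sin((n + 0.5) * theta) / np.sin(0.5 * theta)
    norm1 = np.trapz(np.abs(Dn), theta) / (2.0 * np.pi)
    print(n, norm1, (4.0 / np.pi ** 2) * np.log(n))
```

The same computation also illustrates the last step of Lemma 18.12 below: the Fourier coefficients of $d_n$ are bounded by $1$ while $\|d_n\|_{L^1}$ blows up.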
Lemma 18.12. For $f \in L^1(T)$, let
$$\hat{f}(n) := \langle f, \phi_n\rangle = \int_T f(w)\bar{w}^n\,dw.$$
Then $\hat{f} \in c_0 := C_0(\mathbb{Z})$ (i.e. $\lim_{|n|\to\infty} \hat{f}(n) = 0$) and the map $f \in L^1(T) \to \hat{f} \in c_0$ is a one to one bounded linear transformation into but not onto $c_0$.

Proof. By Bessel's inequality, $\sum_{n \in \mathbb{Z}} |\hat{f}(n)|^2 < \infty$ for all $f \in L^2(T)$ and in particular $\lim_{|n|\to\infty} |\hat{f}(n)| = 0$. Given $f \in L^1(T)$ and $g \in L^2(T)$ we have
$$|\hat{f}(n) - \hat{g}(n)| = \Big|\int_T [f(w) - g(w)]\bar{w}^n\,dw\Big| \le \|f - g\|_1$$
and hence
$$\limsup_{n\to\infty} |\hat{f}(n)| = \limsup_{n\to\infty} |\hat{f}(n) - \hat{g}(n)| \le \|f - g\|_1$$
for all $g \in L^2(T)$. Since $L^2(T)$ is dense in $L^1(T)$, it follows that $\limsup_{n\to\infty} |\hat{f}(n)| = 0$ for all $f \in L^1$, i.e. $\hat{f} \in c_0$.

Since $|\hat{f}(n)| \le \|f\|_1$, we have $\|\hat{f}\|_{c_0} \le \|f\|_1$, showing that $\Lambda f := \hat{f}$ is a bounded linear transformation from $L^1(T)$ to $c_0$.

To see that $\Lambda$ is injective, suppose $\hat{f} = \Lambda f \equiv 0$; then $\int_T f(w)p(w, \bar{w})\,dw = 0$ for all polynomials $p$ in $w$ and $\bar{w}$. By the Stone-Weierstrass and the dominated convergence theorems, this implies that
$$\int_T f(w)g(w)\,dw = 0$$
for all $g \in C(T)$. Lemma 11.7 now implies $f = 0$ a.e.

If $\Lambda$ were surjective, the open mapping theorem would imply that $\Lambda^{-1} : c_0 \to L^1(T)$ is bounded. In particular this implies there exists $C < \infty$ such that
$$(18.4)\qquad \|f\|_{L^1} \le C\|\hat{f}\|_{c_0} \quad \text{for all } f \in L^1(T).$$
Taking $f = d_n$, we find $\|\hat{d}_n\|_{c_0} = 1$ while $\lim_{n\to\infty} \|d_n\|_{L^1} = \infty$, contradicting Eq. (18.4). Therefore $\operatorname{Ran}(\Lambda) \neq c_0$.
18.2. Hahn Banach Theorem. Our next goal is to show that the continuous dual $X^*$ of a Banach space $X$ is always large. This will be the content of the Hahn-Banach Theorem 18.16 below.

Proposition 18.13. Let $X$ be a complex vector space over $\mathbb{C}$. If $f \in X^*$ and $u = \operatorname{Re} f \in X^*_{\mathbb{R}}$, then
$$(18.5)\qquad f(x) = u(x) - iu(ix).$$
Conversely if $u \in X^*_{\mathbb{R}}$ and $f$ is defined by Eq. (18.5), then $f \in X^*$ and $\|u\|_{X^*_{\mathbb{R}}} = \|f\|_{X^*}$. More generally if $p$ is a semi-norm on $X$, then
$$|f| \le p \iff u \le p.$$
Proof. Let $v(x) = \operatorname{Im} f(x)$; then
$$v(ix) = \operatorname{Im} f(ix) = \operatorname{Im}(if(x)) = \operatorname{Re} f(x) = u(x).$$
Therefore $v(x) = -u(ix)$ and
$$f(x) = u(x) + iv(x) = u(x) - iu(ix).$$
Conversely for $u \in X^*_{\mathbb{R}}$ let $f(x) = u(x) - iu(ix)$. Then
$$f((a + ib)x) = u(ax + ibx) - iu(iax - bx) = au(x) + bu(ix) - i\big(au(ix) - bu(x)\big)$$
while
$$(a + ib)f(x) = au(x) + bu(ix) + i\big(bu(x) - au(ix)\big).$$
So $f$ is complex linear.

Because $|u(x)| = |\operatorname{Re} f(x)| \le |f(x)|$, it follows that $\|u\| \le \|f\|$. For $x \in X$ choose $\lambda \in S^1 \subset \mathbb{C}$ such that $|f(x)| = \lambda f(x)$, so
$$|f(x)| = f(\lambda x) = u(\lambda x) \le \|u\|\,\|\lambda x\| = \|u\|\,\|x\|.$$
Since $x \in X$ is arbitrary, this shows that $\|f\| \le \|u\|$, so $\|f\| = \|u\|$. (Footnote 38: To understand better why $\|f\| = \|u\|$, notice that
$$\|f\|^2 = \sup_{\|x\|=1} |f(x)|^2 = \sup_{\|x\|=1} \big(|u(x)|^2 + |u(ix)|^2\big).$$
Suppose that $M = \sup_{\|x\|=1} |u(x)|$ and this supremum is attained at $x_0 \in X$ with $\|x_0\| = 1$. Replacing $x_0$ by $-x_0$ if necessary, we may assume that $u(x_0) = M$. Since $u$ has a maximum at $x_0$,
$$0 = \frac{d}{dt}\Big|_0 u\Big(\frac{x_0 + itx_0}{\|x_0 + itx_0\|}\Big) = \frac{d}{dt}\Big|_0 \Big[\frac{1}{|1 + it|}\big(u(x_0) + tu(ix_0)\big)\Big] = u(ix_0),$$
since $\frac{d}{dt}\big|_0 |1 + it| = \frac{d}{dt}\big|_0 \sqrt{1 + t^2} = 0$. This explains why $\|f\| = \|u\|$.)

For the last assertion, it is clear that $|f| \le p$ implies $u \le |u| \le |f| \le p$. Conversely if $u \le p$ and $x \in X$, choose $\lambda \in S^1 \subset \mathbb{C}$ such that $|f(x)| = \lambda f(x)$. Then
$$|f(x)| = \lambda f(x) = f(\lambda x) = u(\lambda x) \le p(\lambda x) = p(x)$$
holds for all $x \in X$.
Definition 18.14 (Minkowski functional). $p : X \to \mathbb{R}$ is a Minkowski functional if
(1) $p(x + y) \le p(x) + p(y)$ for all $x, y \in X$ and
(2) $p(cx) = cp(x)$ for all $c \ge 0$ and $x \in X$.
Example 18.15. Suppose that $X = \mathbb{R}$ and
$$p(x) = \inf\{\lambda \ge 0 : x \in \lambda[-1, 2] = [-\lambda, 2\lambda]\}.$$
Notice that if $x \ge 0$, then $p(x) = x/2$ and if $x \le 0$ then $p(x) = |x|$, i.e.
$$p(x) = \begin{cases} x/2 & \text{if } x \ge 0 \\ |x| & \text{if } x \le 0. \end{cases}$$
From this formula it is clear that $p(cx) = cp(x)$ for all $c \ge 0$ but not for $c < 0$. Moreover, $p$ satisfies the triangle inequality; indeed if $p(x) = \lambda$ and $p(y) = \mu$, then $x \in \lambda[-1, 2]$ and $y \in \mu[-1, 2]$ so that
$$x + y \in \lambda[-1, 2] + \mu[-1, 2] \subset (\lambda + \mu)[-1, 2],$$
which shows that $p(x + y) \le \lambda + \mu = p(x) + p(y)$. To check the last set inclusion let $a, b \in [-1, 2]$; then
$$\lambda a + \mu b = (\lambda + \mu)\Big(\frac{\lambda}{\lambda + \mu} a + \frac{\mu}{\lambda + \mu} b\Big) \in (\lambda + \mu)[-1, 2]$$
since $[-1, 2]$ is a convex set and $\frac{\lambda}{\lambda + \mu} + \frac{\mu}{\lambda + \mu} = 1$.

TODO: Add in the relationship to convex sets and separation theorems, see Reed and Simon Vol. 1 for example.
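The properties claimed in Example 18.15 are easy to sanity-check numerically. The snippet below is an illustration only; the random sampling and the sample points are hypothetical choices.

```python
import numpy as np

# Numerical sanity check of the Minkowski functional of Example 18.15:
# p(x) = x/2 for x >= 0 and |x| for x <= 0, subadditive and positively homogeneous.
def p(x):
    return np.where(x >= 0, x / 2.0, np.abs(x))

rng = np.random.default_rng(0)
x, y = rng.uniform(-5, 5, 10000), rng.uniform(-5, 5, 10000)
c = rng.uniform(0, 5, 10000)
print("subadditivity violations:", np.sum(p(x + y) > p(x) + p(y) + 1e-12))
print("max |p(cx) - c p(x)| for c >= 0:", np.max(np.abs(p(c * x) - c * p(x))))
print("p(-1) vs -1 * p(1):", p(np.array([-1.0]))[0], -p(np.array([1.0]))[0])  # fails for c < 0
```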
Theorem 18.16 (Hahn-Banach). Let $X$ be a real vector space, $p$ a Minkowski functional on $X$, $M \subset X$ a subspace and $f : M \to \mathbb{R}$ a linear functional such that $f \le p$ on $M$. Then there exists a linear functional $F : X \to \mathbb{R}$ such that $F|_M = f$ and $F \le p$.

Proof. Step (1). We show for all $x \in X \setminus M$ there exists an extension $F$ to $M \oplus \mathbb{R}x$ with the desired properties. If $F$ exists and $\alpha = F(x)$, then for all $y \in M$ and $\lambda \in \mathbb{R}$ we must have $f(y) + \lambda\alpha = F(y + \lambda x) \le p(y + \lambda x)$, i.e. $\lambda\alpha \le p(y + \lambda x) - f(y)$. Equivalently put, we must find $\alpha \in \mathbb{R}$ such that
$$\alpha \le \frac{p(y + \lambda x) - f(y)}{\lambda} \quad \text{for all } y \in M \text{ and } \lambda > 0,$$
$$\alpha \ge \frac{f(z) - p(z - \mu x)}{\mu} \quad \text{for all } z \in M \text{ and } \mu > 0.$$
So if $\alpha \in \mathbb{R}$ is going to exist, we have to prove, for all $y, z \in M$ and $\lambda, \mu > 0$, that
$$\frac{f(z) - p(z - \mu x)}{\mu} \le \frac{p(y + \lambda x) - f(y)}{\lambda},$$
or equivalently
$$(18.6)\qquad f(\mu y + \lambda z) \le \mu\, p(y + \lambda x) + \lambda\, p(z - \mu x) = p(\mu y + \mu\lambda x) + p(\lambda z - \lambda\mu x).$$
But by assumption and the triangle inequality for $p$,
$$f(\mu y + \lambda z) \le p(\mu y + \lambda z) = p(\mu y + \mu\lambda x + \lambda z - \lambda\mu x) \le p(\mu y + \mu\lambda x) + p(\lambda z - \lambda\mu x),$$
which shows that Eq. (18.6) is true and, by working backwards, there exists an $\alpha \in \mathbb{R}$ such that $f(y) + \lambda\alpha \le p(y + \lambda x)$. Therefore $F(y + \lambda x) := f(y) + \lambda\alpha$ is the desired extension.

Step (2). Let us now write $F : X \to \mathbb{R}$ to mean $F$ is defined on a linear subspace $D(F) \subset X$ and $F : D(F) \to \mathbb{R}$ is linear. For $F, G : X \to \mathbb{R}$ we will say $F \prec G$ if $D(F) \subset D(G)$ and $F = G|_{D(F)}$, that is, $G$ is an extension of $F$. Let
$$\mathcal{F} = \{F : X \to \mathbb{R} : f \prec F \text{ and } F \le p \text{ on } D(F)\}.$$
Then $(\mathcal{F}, \prec)$ is a partially ordered set. If $\Phi \subset \mathcal{F}$ is a chain (i.e. a linearly ordered subset of $\mathcal{F}$) then $\Phi$ has an upper bound $G \in \mathcal{F}$ defined by $D(G) = \bigcup_{F \in \Phi} D(F)$ and $G(x) = F(x)$ for $x \in D(F)$. Then it is easily checked that $D(G)$ is a linear subspace, $G \in \mathcal{F}$, and $F \prec G$ for all $F \in \Phi$. We may now apply Zorn's Lemma (see Theorem B.7) to conclude there exists a maximal element $F \in \mathcal{F}$. Necessarily, $D(F) = X$, for otherwise we could extend $F$ by step (1), violating the maximality of $F$. Thus $F$ is the desired extension of $f$.

The use of Zorn's lemma in Step (2) above may be avoided in the case that $X$ may be written as $M \oplus \operatorname{span}(\Gamma)$ where $\Gamma := \{x_n\}_{n=1}^{\infty}$ is a countable subset of $X$. In this case $f : M \to \mathbb{R}$ may be extended to a linear functional $F : X \to \mathbb{R}$ with the desired properties by step (1) and induction. If $p(x)$ is a norm on $X$ and $X = \overline{M \oplus \operatorname{span}(\Gamma)}$ with $\Gamma$ as above, then this function $F$ constructed above extends by continuity to $X$.
Corollary 18.17. Suppose that $X$ is a complex vector space, $p : X \to [0, \infty)$ is a semi-norm, $M \subset X$ is a linear subspace, and $f : M \to \mathbb{C}$ is a linear functional such that $|f(x)| \le p(x)$ for all $x \in M$. Then there exists $F \in X'$ ($X'$ is the algebraic dual of $X$) such that $F|_M = f$ and $|F| \le p$.

Proof. Let $u = \operatorname{Re} f$; then $u \le p$ on $M$ and hence, by Theorem 18.16, there exists $U \in X'_{\mathbb{R}}$ such that $U|_M = u$ and $U \le p$ on $X$. Define $F(x) = U(x) - iU(ix)$; then as in Proposition 18.13, $F = f$ on $M$ and $|F| \le p$.
Theorem 18.18. Let $X$ be a normed space, $M \subset X$ be a closed subspace and $x \in X \setminus M$. Then there exists $f \in X^*$ such that $\|f\| = 1$, $f(x) = \delta = d(x, M)$ and $f = 0$ on $M$.

Proof. Define $h : M \oplus \mathbb{C}x \to \mathbb{C}$ by $h(m + \lambda x) := \lambda\delta$ for all $m \in M$ and $\lambda \in \mathbb{C}$. Then
$$\|h\| := \sup_{m \in M,\ \lambda \neq 0} \frac{|\lambda|\,\delta}{\|m + \lambda x\|} = \sup_{m \in M,\ \lambda \neq 0} \frac{\delta}{\|x + m/\lambda\|} = \frac{\delta}{\delta} = 1,$$
and by the Hahn-Banach theorem there exists $f \in X^*$ such that $f|_{M \oplus \mathbb{C}x} = h$ and $\|f\| \le 1$. Since $1 = \|h\| \le \|f\| \le 1$, it follows that $\|f\| = 1$.

Corollary 18.19. The linear map $x \in X \to \hat{x} \in X^{**}$, where $\hat{x}(f) = f(x)$ for all $f \in X^*$, is an isometry. (This isometry need not be surjective.)

Proof. Since $|\hat{x}(f)| = |f(x)| \le \|f\|_{X^*} \|x\|_X$ for all $f \in X^*$, it follows that $\|\hat{x}\|_{X^{**}} \le \|x\|_X$. Now applying Theorem 18.18 with $M = \{0\}$, there exists $f \in X^*$ such that $\|f\| = 1$ and $|\hat{x}(f)| = f(x) = \|x\|$, which shows that $\|\hat{x}\|_{X^{**}} \ge \|x\|_X$. This shows that $x \in X \to \hat{x} \in X^{**}$ is an isometry. Since isometries are necessarily injective, we are done.
Definition 18.20. A Banach space $X$ is reflexive if the map $x \in X \to \hat{x} \in X^{**}$ is surjective.

Example 18.21. Every Hilbert space $H$ is reflexive. This is a consequence of the Riesz Theorem, Proposition 12.15.

Example 18.22. Suppose that $\mu$ is a finite measure on a measurable space $(X, \mathcal{M})$; then $L^p(X, \mathcal{M}, \mu)$ is reflexive for all $p \in (1, \infty)$, see Theorem 15.14.
Example 18.23 (Following Riesz and Nagy, p. 214). The Banach space $X := C([0,1])$ is not reflexive. To prove this recall that $X^*$ may be identified with complex measures $\mu$ on $[0,1]$, which may be identified with right continuous functions of bounded variation ($F$) on $[0,1]$, namely
$$F \to \mu_F \in X^* \quad \text{where } \mu_F(f) = \int_{[0,1]} f\,d\mu_F = \int_0^1 f\,dF.$$
Define $\lambda \in X^{**}$ by
$$\lambda(\mu) = \sum_{x \in [0,1]} \mu(\{x\}) = \sum_{x \in [0,1]} \big(F(x) - F(x-)\big),$$
so $\lambda(\mu)$ is the sum of the atoms of $\mu$. Suppose there existed an $f \in X$ such that $\lambda(\mu) = \int_{[0,1]} f\,d\mu$ for all $\mu \in X^*$. Choosing $\mu = \delta_x$ for some $x \in (0,1)$ would then imply that
$$f(x) = \int_{[0,1]} f\,d\delta_x = \lambda(\delta_x) = 1,$$
showing $f$ would have to be the constant function $1$, which clearly can not work (for instance $\lambda(m) = 0$ while $\int_{[0,1]} 1\,dm = 1$ for Lebesgue measure $m$).
Example 18.24. The Banach space $X := L^1([0,1], m)$ is not reflexive. As we have seen in Theorem 15.14, $X^* \cong L^{\infty}([0,1], m)$. The argument in Example 15.15 shows $(L^{\infty}([0,1], m))^* \supsetneq L^1([0,1], m)$. Recall in that example, we show there exists $L \in X^{**} \cong (L^{\infty}([0,1], m))^*$ such that $L(f) = f(0)$ for all $f$ in the closed subspace, $C([0,1])$, of $X^*$. If there were to exist a $g \in X$ such that $\hat{g} = L$, we would have
$$(18.7)\qquad f(0) = L(f) = \hat{g}(f) = f(g) := \int_0^1 f(x)g(x)\,dx$$
for all $f \in C([0,1]) \subset L^{\infty}([0,1], m)$. Taking $f \in C_c((0,1])$ in this equation and making use of Lemma 11.7, it would follow that $g(x) = 0$ for a.e. $x \in (0,1]$. But this is clearly inconsistent with Eq. (18.7).
18.3. Weak and Strong Topologies.

Definition 18.25. Let $X$ and $Y$ be normed vector spaces and $L(X, Y)$ the normed space of bounded linear transformations from $X$ to $Y$.
(1) The weak topology on $X$ is the topology generated by $X^*$, i.e. sets of the form
$$N = \bigcap_{i=1}^{n} \{x \in X : |f_i(x) - f_i(x_0)| < \varepsilon\},$$
where $f_i \in X^*$ and $\varepsilon > 0$, form a neighborhood base for the weak topology on $X$ at $x_0$.
(2) The weak-$*$ topology on $X^*$ is the topology generated by $X$, i.e. sets of the form
$$N := \bigcap_{i=1}^{n} \{g \in X^* : |f(x_i) - g(x_i)| < \varepsilon\},$$
where $x_i \in X$ and $\varepsilon > 0$, form a neighborhood base for the weak-$*$ topology on $X^*$ at $f \in X^*$.
(3) The strong operator topology on $L(X, Y)$ is the smallest topology such that $T \in L(X, Y) \to Tx \in Y$ is continuous for all $x \in X$.
(4) The weak operator topology on $L(X, Y)$ is the smallest topology such that $T \in L(X, Y) \to f(Tx) \in \mathbb{C}$ is continuous for all $x \in X$ and $f \in Y^*$.
Theorem 18.26 (Alaoglu's Theorem). If $X$ is a normed space the unit ball in $X^*$ is weak-$*$ compact.

Proof. For all $x \in X$ let $D_x = \{z \in \mathbb{C} : |z| \le \|x\|\}$. Then $D_x \subset \mathbb{C}$ is a compact set and so by Tychonoff's Theorem $\Omega := \prod_{x \in X} D_x$ is compact in the product topology. If $f \in C^* := \{f \in X^* : \|f\| \le 1\}$, then $|f(x)| \le \|f\|\,\|x\| \le \|x\|$, which implies that $f(x) \in D_x$ for all $x \in X$, i.e. $C^* \subset \Omega$. The topology on $C^*$ inherited from the weak-$*$ topology on $X^*$ is the same as the relative topology coming from the product topology on $\Omega$. So to finish the proof it suffices to show $C^*$ is a closed subset of the compact space $\Omega$. To prove this let $\pi_x(f) = f(x)$ be the projection maps. Then
$$C^* = \{f \in \Omega : f \text{ is linear}\} = \{f \in \Omega : f(x + cy) - f(x) - cf(y) = 0 \text{ for all } x, y \in X \text{ and } c \in \mathbb{C}\}$$
$$= \bigcap_{x, y \in X}\ \bigcap_{c \in \mathbb{C}} \{f \in \Omega : f(x + cy) - f(x) - cf(y) = 0\} = \bigcap_{x, y \in X}\ \bigcap_{c \in \mathbb{C}} (\pi_{x+cy} - \pi_x - c\pi_y)^{-1}(\{0\}),$$
which is closed because $(\pi_{x+cy} - \pi_x - c\pi_y) : \Omega \to \mathbb{C}$ is continuous.
Theorem 18.27 (Alaoglu's Theorem for separable spaces). Suppose that $X$ is a separable Banach space, $C^* := \{f \in X^* : \|f\| \le 1\}$ is the closed unit ball in $X^*$ and $\{x_n\}_{n=1}^{\infty}$ is a countable dense subset of $C := \{x \in X : \|x\| \le 1\}$. Then
$$(18.8)\qquad \rho(f, g) := \sum_{n=1}^{\infty} \frac{1}{2^n} |f(x_n) - g(x_n)|$$
defines a metric on $C^*$ which is compatible with the weak-$*$ topology on $C^*$, $\tau_{C^*} := (\tau_{w^*})_{C^*} = \{V \cap C^* : V \in \tau_{w^*}\}$. Moreover $(C^*, \rho)$ is a compact metric space.

Proof. The routine check that $\rho$ is a metric is left to the reader. Let $\tau_{\rho}$ be the topology on $C^*$ induced by $\rho$. For any $g \in X^*$ and $n \in \mathbb{N}$, the map $f \in X^* \to (f(x_n) - g(x_n)) \in \mathbb{C}$ is $\tau_{w^*}$ continuous and, since the sum in Eq. (18.8) is uniformly convergent for $f \in C^*$, it follows that $f \to \rho(f, g)$ is $\tau_{C^*}$ continuous. This implies the open balls relative to $\rho$ are contained in $\tau_{C^*}$ and therefore $\tau_{\rho} \subset \tau_{C^*}$.

We now wish to prove $\tau_{C^*} \subset \tau_{\rho}$. Since $\tau_{C^*}$ is the topology generated by $\{\hat{x}|_{C^*} : x \in C\}$, it suffices to show $\hat{x}$ is $\tau_{\rho}$ continuous for all $x \in C$. But given $x \in C$ there exists a subsequence $y_k := x_{n_k}$ of $\{x_n\}_{n=1}^{\infty}$ such that $x = \lim_{k\to\infty} y_k$. Since
$$\sup_{f \in C^*} |\hat{x}(f) - \hat{y}_k(f)| = \sup_{f \in C^*} |f(x - y_k)| \le \|x - y_k\| \to 0 \text{ as } k \to \infty,$$
$\hat{y}_k \to \hat{x}$ uniformly on $C^*$ and, using that $\hat{y}_k$ is $\tau_{\rho}$ continuous for all $k$ (as is easily checked), we learn $\hat{x}$ is also $\tau_{\rho}$ continuous. Hence $\tau_{C^*} = \tau(\{\hat{x}|_{C^*} : x \in X\}) \subset \tau_{\rho}$.

The compactness assertion follows from Theorem 18.26. The compactness assertion may also be verified directly using: 1) sequential compactness is equivalent to compactness for metric spaces and 2) a Cantor's diagonalization argument as in the proof of Theorem 12.38. (See Proposition 19.16 below.)
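To make the metric in Eq. (18.8) concrete, take $X = \ell^1$, so that $X^* = \ell^\infty$. The coordinate functionals $f_k = e_k$ lie in the unit ball $C^*$, converge to $0$ weak-$*$ but not in norm, and the metric $\rho$ detects this. The finite truncation below is an illustration only; the truncation length and the choice of test vectors $x_n$ (unit coordinate vectors in the ball of $\ell^1$) are hypothetical simplifications.

```python
import numpy as np

# Weak-* convergence e_k -> 0 in the unit ball of l^infty = (l^1)*, seen through the
# metric rho of Eq. (18.8).  Everything is truncated to length L (illustrative only).
L = 2000
def e(n):
    v = np.zeros(L); v[n] = 1.0
    return v

xs = [e(n) for n in range(30)]                  # test vectors x_1, x_2, ... from the ball of l^1

def rho(f, g):
    return sum(2.0 ** -(n + 1) * abs(np.dot(f - g, xs[n])) for n in range(len(xs)))

for k in [1, 5, 10, 20]:
    f_k = e(k)                                  # the functional a -> a_k; sup norm 1
    print(k, "rho(f_k, 0) =", rho(f_k, np.zeros(L)), "  sup-norm distance to 0 =", 1.0)
```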
18.4. Weak Convergence Results. The following is an application of Theorem 3.48 characterizing compact sets in metric spaces.

Proposition 18.28. Suppose that $(X, \rho)$ is a complete separable metric space and $\mu$ is a probability measure on $\mathcal{B} = \sigma(\tau_{\rho})$. Then for all $\varepsilon > 0$, there exists $K_{\varepsilon} \subset\subset X$ such that $\mu(K_{\varepsilon}) \ge 1 - \varepsilon$.

Proof. Let $\{x_k\}_{k=1}^{\infty}$ be a countable dense subset of $X$. Then $X = \bigcup_k C_{x_k}(1/n)$ for all $n \in \mathbb{N}$. Hence by continuity of $\mu$, there exists, for all $n \in \mathbb{N}$, $N_n < \infty$ such that $\mu(F_n) \ge 1 - \varepsilon 2^{-n}$ where $F_n := \bigcup_{k=1}^{N_n} C_{x_k}(1/n)$. Let $K := \bigcap_{n=1}^{\infty} F_n$; then
$$\mu(X \setminus K) = \mu\Big(\bigcup_{n=1}^{\infty} F_n^c\Big) \le \sum_{n=1}^{\infty} \mu(F_n^c) = \sum_{n=1}^{\infty} \big(1 - \mu(F_n)\big) \le \sum_{n=1}^{\infty} \varepsilon 2^{-n} = \varepsilon,$$
so that $\mu(K) \ge 1 - \varepsilon$. Moreover $K$ is compact since $K$ is closed and totally bounded; $K \subset F_n$ for all $n$ and each $F_n$ is $1/n$-bounded.
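In practice one can often exhibit the compact set $K_{\varepsilon}$ explicitly. The sketch below is an illustration only, for the standard Gaussian measure on $\mathbb{R}$ (a hypothetical choice of measure); a closed interval $[-R, R]$ already serves as $K_{\varepsilon}$.

```python
import numpy as np
from math import erf, sqrt

# For mu = N(0,1) on R and a given eps, find R so that K_eps = [-R, R]
# has mu(K_eps) >= 1 - eps, illustrating Proposition 18.28.
def gaussian_mass(R):
    return erf(R / sqrt(2.0))          # mu([-R, R]) for the standard normal

for eps in [1e-1, 1e-2, 1e-4]:
    R = 0.0
    while gaussian_mass(R) < 1.0 - eps:
        R += 0.01
    print("eps =", eps, "  K_eps = [-%.2f, %.2f]" % (R, R), "  mass =", gaussian_mass(R))
```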
Definition 18.29. A sequence of probability measures $\{P_n\}_{n=1}^{\infty}$ is said to converge weakly to a probability $P$ (written $P_n \overset{w}{\Rightarrow} P$) if for every $f \in BC(X)$, $P_n(f) \to P(f)$. This is actually weak-$*$ convergence when viewing $P_n \in BC(X)^*$.
Proposition 18.30. The following are equivalent:
(1) $P_n \overset{w}{\Rightarrow} P$ as $n \to \infty$.
(2) $P_n(f) \to P(f)$ for every $f \in BC(X)$ which is uniformly continuous.
(3) $\limsup_{n\to\infty} P_n(F) \le P(F)$ for all closed $F \subset X$.
(4) $\liminf_{n\to\infty} P_n(G) \ge P(G)$ for all open $G \subset X$.
(5) $\lim_{n\to\infty} P_n(A) = P(A)$ for all $A \in \mathcal{B}$ such that $P(\operatorname{bd}(A)) = 0$.

Proof. 1. $\implies$ 2. is obvious. For 2. $\implies$ 3., let
$$(18.9)\qquad \phi(t) := \begin{cases} 1 & \text{if } t \le 0 \\ 1 - t & \text{if } 0 \le t \le 1 \\ 0 & \text{if } t \ge 1 \end{cases}$$
and let $f_n(x) := \phi(n\,d(x, F))$. Then $f_n \in BC(X, [0,1])$ is uniformly continuous, $0 \le 1_F \le f_n$ for all $n$ and $f_n \downarrow 1_F$ as $n \to \infty$. Passing to the limit $n \to \infty$ in the equation
$$0 \le P_n(F) \le P_n(f_m)$$
gives
$$0 \le \limsup_{n\to\infty} P_n(F) \le P(f_m),$$
and then letting $m \to \infty$ in this inequality implies item 3.

3. $\iff$ 4. Assuming item 3., let $F = G^c$; then
$$1 - \liminf_{n\to\infty} P_n(G) = \limsup_{n\to\infty} \big(1 - P_n(G)\big) = \limsup_{n\to\infty} P_n(G^c) \le P(G^c) = 1 - P(G),$$
which implies 4. Similarly 4. $\implies$ 3.

3. $\iff$ 5. Recall that $\operatorname{bd}(A) = \bar{A} \setminus A^o$, so if $P(\operatorname{bd}(A)) = 0$ and 3. (and hence also 4.) holds we have
$$\limsup_{n\to\infty} P_n(A) \le \limsup_{n\to\infty} P_n(\bar{A}) \le P(\bar{A}) = P(A) \quad \text{and}$$
$$\liminf_{n\to\infty} P_n(A) \ge \liminf_{n\to\infty} P_n(A^o) \ge P(A^o) = P(A),$$
from which it follows that $\lim_{n\to\infty} P_n(A) = P(A)$. Conversely, let $F \subset X$ be closed and set $F_{\delta} := \{x \in X : \rho(x, F) \le \delta\}$. Then
$$\operatorname{bd}(F_{\delta}) \subset F_{\delta} \setminus \{x \in X : \rho(x, F) < \delta\} = \{x \in X : \rho(x, F) = \delta\} =: A_{\delta}.$$
Since $\{A_{\delta}\}_{\delta > 0}$ are all disjoint, we must have
$$\sum_{\delta > 0} P(A_{\delta}) \le P(X) \le 1,$$
and in particular the set $\Lambda := \{\delta > 0 : P(A_{\delta}) > 0\}$ is at most countable. Let $\delta_n \notin \Lambda$ be chosen so that $\delta_n \downarrow 0$ as $n \to \infty$; then
$$P(F_{\delta_m}) = \lim_{n\to\infty} P_n(F_{\delta_m}) \ge \limsup_{n\to\infty} P_n(F).$$
Let $m \to \infty$ in this equation to conclude $P(F) \ge \limsup_{n\to\infty} P_n(F)$ as desired.

To finish the proof we will now show 3. $\implies$ 1. By an affine change of variables it suffices to consider $f \in C(X, (0,1))$, in which case we have
$$(18.10)\qquad \sum_{i=1}^{k} \frac{i-1}{k} 1_{\{\frac{i-1}{k} \le f < \frac{i}{k}\}} \le f \le \sum_{i=1}^{k} \frac{i}{k} 1_{\{\frac{i-1}{k} \le f < \frac{i}{k}\}}.$$
Let $F_i := \{\frac{i}{k} \le f\}$ and notice that $F_k = \emptyset$; then for any probability $P$,
$$(18.11)\qquad \sum_{i=1}^{k} \frac{i-1}{k} \big[P(F_{i-1}) - P(F_i)\big] \le P(f) \le \sum_{i=1}^{k} \frac{i}{k} \big[P(F_{i-1}) - P(F_i)\big].$$
Now
$$\sum_{i=1}^{k} \frac{i-1}{k} \big[P(F_{i-1}) - P(F_i)\big] = \sum_{i=1}^{k} \frac{i-1}{k} P(F_{i-1}) - \sum_{i=1}^{k} \frac{i-1}{k} P(F_i) = \sum_{i=1}^{k-1} \frac{i}{k} P(F_i) - \sum_{i=1}^{k} \frac{i-1}{k} P(F_i) = \frac{1}{k} \sum_{i=1}^{k-1} P(F_i)$$
and
$$\sum_{i=1}^{k} \frac{i}{k} \big[P(F_{i-1}) - P(F_i)\big] = \sum_{i=1}^{k} \frac{i-1}{k} \big[P(F_{i-1}) - P(F_i)\big] + \sum_{i=1}^{k} \frac{1}{k} \big[P(F_{i-1}) - P(F_i)\big] = \frac{1}{k} \sum_{i=1}^{k-1} P(F_i) + \frac{1}{k},$$
so that Eq. (18.11) becomes
$$\frac{1}{k} \sum_{i=1}^{k-1} P(F_i) \le P(f) \le \frac{1}{k} \sum_{i=1}^{k-1} P(F_i) + \frac{1}{k}.$$
Using this equation with $P = P_n$ and then with $P = P$ we find
$$\limsup_{n\to\infty} P_n(f) \le \limsup_{n\to\infty} \Big[\frac{1}{k} \sum_{i=1}^{k-1} P_n(F_i) + \frac{1}{k}\Big] \le \frac{1}{k} \sum_{i=1}^{k-1} P(F_i) + \frac{1}{k} \le P(f) + \frac{1}{k}.$$
Since $k$ is arbitrary,
$$\limsup_{n\to\infty} P_n(f) \le P(f).$$
This inequality also holds for $1 - f$ and this implies $\liminf_{n\to\infty} P_n(f) \ge P(f)$ and hence $\lim_{n\to\infty} P_n(f) = P(f)$ as claimed.
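The hypotheses in items 3.-5. are already sharp for the simplest example $P_n = \delta_{1/n}$, $P = \delta_0$ on $\mathbb{R}$: weak convergence holds, the closed-set inequality in 3. can be strict, and item 5. genuinely needs $P(\operatorname{bd}(A)) = 0$. The sketch below illustrates this example; the particular test function is a hypothetical choice.

```python
import numpy as np

# P_n = delta_{1/n}, P = delta_0 on R: weak convergence holds, but set-wise limits
# behave exactly as items 3.-5. of Proposition 18.30 predict.
def Pn(f, n):        # integral of f against delta_{1/n}
    return f(1.0 / n)

def P(f):            # integral of f against delta_0
    return f(0.0)

f = lambda x: np.cos(x) / (1.0 + x ** 2)     # a bounded continuous test function
print([Pn(f, n) for n in (1, 10, 100, 1000)], "->", P(f))

# Item 3: F = {0} is closed and limsup P_n(F) = 0 <= P(F) = 1 (strict inequality).
# Item 5: A = (0, infinity) has bd(A) = {0} with P(bd(A)) = 1, and indeed
#         P_n(A) = 1 for all n while P(A) = 0, so P_n(A) does not converge to P(A).
```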
Let $Q := [0,1]^{\mathbb{N}}$ and for $a, b \in Q$ let
$$d(a, b) := \sum_{n=1}^{\infty} \frac{1}{2^n} |a_n - b_n|$$
as in Notation 10.19, and recall that in this metric $(Q, d)$ is a complete metric space and that $\tau_d$ is the product topology on $Q$, see Exercises 3.27 and 6.15.
Theorem 18.31. To every separable metric space $(X, \rho)$, there exists a continuous injective map $G : X \to Q$ such that $G : X \to G(X) \subset Q$ is a homeomorphism. In short, any separable metrizable space $X$ is homeomorphic to a subset of $(Q, d)$.

Remark 18.32. Notice that if we let $\rho'(x, y) := d(G(x), G(y))$, then $\rho'$ induces the same topology on $X$ as $\rho$ and $G : (X, \rho') \to (Q, d)$ is isometric.

Proof. Let $D = \{x_n\}_{n=1}^{\infty}$ be a countable dense subset of $X$ and for $m, n \in \mathbb{N}$ let
$$f_{m,n}(x) := 1 - \phi\big(m\,\rho(x_n, x) - 1\big),$$
where $\phi$ is as in Eq. (18.9). Then $f_{m,n} = 0$ if $\rho(x, x_n) \le 1/m$ and $f_{m,n} = 1$ if $\rho(x, x_n) \ge 2/m$. Let $\{g_k\}_{k=1}^{\infty}$ be an enumeration of $\{f_{m,n} : m, n \in \mathbb{N}\}$ and define $G : X \to Q$ by
$$G(x) = (g_1(x), g_2(x), \dots) \in Q.$$
We will now show $G : X \to G(X) \subset Q$ is a homeomorphism. To show $G$ is injective suppose $x, y \in X$ with $x \neq y$ and choose $m \in \mathbb{N}$ so that $\rho(x, y) > 3/m$. Since $D$ is dense we may find $x_n \in D$ such that $\rho(x, x_n) < 1/m$, and hence $\rho(y, x_n) \ge \rho(x, y) - \rho(x, x_n) > 2/m$, so that $f_{m,n}(x) = 0$ while $f_{m,n}(y) = 1$. From this it follows that $G(x) \neq G(y)$ if $x \neq y$ and hence $G$ is injective.

The continuity of $G$ is a consequence of the continuity of each of the components $g_i$ of $G$. So it only remains to show $G^{-1} : G(X) \to X$ is continuous. Given $a = G(x) \in G(X) \subset Q$ and $\varepsilon > 0$, choose $m \in \mathbb{N}$ and $x_n \in D$ such that $\rho(x_n, x) < \frac{1}{2m}$ and $\frac{3}{m} < \varepsilon$. Then $f_{m,n}(x) = 0$ and for $y \notin B(x_n, 2/m)$, $f_{m,n}(y) = 1$. So if $k$ is chosen so that $g_k = f_{m,n}$, we have shown that
$$d(G(y), G(x)) \ge 2^{-k} \quad \text{for } y \notin B(x_n, 2/m),$$
or equivalently put, if
$$d(G(y), G(x)) < 2^{-k} \quad \text{then } y \in B(x_n, 2/m) \subset B\big(x, \tfrac{2}{m} + \tfrac{1}{2m}\big) \subset B(x, \varepsilon).$$
This shows that if $G(y)$ is sufficiently close to $G(x)$ then $\rho(y, x) < \varepsilon$, i.e. $G^{-1}$ is continuous at $a = G(x)$.
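The embedding is easy to simulate for a concrete separable metric space. The sketch below is an illustration only; the sample space $X = [0, 3]$ with its usual metric, the choice of dense set $D$, and the truncation of $G$ to finitely many coordinates are all hypothetical choices.

```python
import numpy as np

# Truncated version of the embedding G : X -> Q = [0,1]^N from Theorem 18.31,
# for the sample space X = [0, 3] with its usual metric.
phi = lambda t: np.clip(1.0 - t, 0.0, 1.0)             # the function of Eq. (18.9)
D = [k / 10.0 for k in range(31)]                       # a countable dense subset (truncated)
pairs = [(m, xn) for m in range(1, 11) for xn in D]     # enumeration of (m, x_n) (truncated)

def G(x):
    return np.array([1.0 - phi(m * abs(xn - x) - 1.0) for (m, xn) in pairs])

def d(a, b):
    return np.sum(2.0 ** -np.arange(1, len(a) + 1) * np.abs(a - b))

for x, y in [(0.5, 0.5001), (0.5, 0.6), (0.5, 2.5)]:
    print(x, y, "|x - y| =", abs(x - y), "  d(G(x), G(y)) =", d(G(x), G(y)))
```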
Definition 18.33. Let $X$ be a topological space. A collection $\Lambda$ of probability measures on $(X, \mathcal{B}_X)$ is said to be tight if for every $\varepsilon > 0$ there exists a compact set $K_{\varepsilon} \in \mathcal{B}_X$ such that $P(K_{\varepsilon}) \ge 1 - \varepsilon$ for all $P \in \Lambda$.
Theorem 18.34. Suppose $X$ is a separable metrizable space and $\Lambda = \{P_n\}_{n=1}^{\infty}$ is a tight sequence of probability measures on $\mathcal{B}_X$. Then there exists a subsequence $\{P_{n_k}\}_{k=1}^{\infty}$ which is weakly convergent to a probability measure $P$ on $\mathcal{B}_X$.

Proof. First suppose that $X$ is compact. In this case $C(X)$ is a Banach space which is separable by the Stone-Weierstrass theorem. By the Riesz theorem, Corollary 15.42, we know that $C(X)^*$ is in one to one correspondence with complex measures on $(X, \mathcal{B}_X)$. We have also seen that the unit ball in $C(X)^*$ is metrizable and weak-$*$ compact. Hence there exists a subsequence $\{P_{n_k}\}_{k=1}^{\infty}$ which is weak-$*$ convergent to a probability measure $P$ on $X$. Alternatively, use Cantor's diagonalization procedure on a countable dense set $\Gamma \subset C(X)$ to find $\{P_{n_k}\}_{k=1}^{\infty}$ such that $\Lambda(f) := \lim_{k\to\infty} P_{n_k}(f)$ exists for all $f \in \Gamma$. Then for $g \in C(X)$ and $f \in \Gamma$, we have
$$|P_{n_k}(g) - P_{n_l}(g)| \le |P_{n_k}(g) - P_{n_k}(f)| + |P_{n_k}(f) - P_{n_l}(f)| + |P_{n_l}(f) - P_{n_l}(g)| \le 2\|g - f\|_{\infty} + |P_{n_k}(f) - P_{n_l}(f)|,$$
which shows
$$\limsup_{k, l \to \infty} |P_{n_k}(g) - P_{n_l}(g)| \le 2\|g - f\|_{\infty}.$$
Letting $f \in \Gamma$ tend to $g$ in $C(X)$ shows $\limsup_{k, l \to \infty} |P_{n_k}(g) - P_{n_l}(g)| = 0$ and hence $\Lambda(g) := \lim_{k\to\infty} P_{n_k}(g)$ exists for all $g \in C(X)$. It is now clear that $\Lambda(g) \ge 0$ for all $g \ge 0$, so that $\Lambda$ is a positive linear functional on $C(X)$ and thus there is a probability measure $P$ such that $\Lambda(g) = P(g)$.

For the general case, by Theorem 18.31 we may assume that $X$ is a subset of a compact metric space which we will denote by $\bar{X}$. We now extend $P_n$ to $\bar{X}$ by setting $\bar{P}_n(A) := P_n(A \cap X)$ for all $A \in \mathcal{B}_{\bar{X}}$. By what we have just proved, there is a subsequence $\big\{\bar{P}'_k := \bar{P}_{n_k}\big\}_{k=1}^{\infty}$ such that $\bar{P}'_k$ converges weakly to a probability measure $\bar{P}$ on $\bar{X}$. The main thing we now have to prove is that $\bar{P}(X) = 1$; this is where the tightness assumption is going to be used.

Given $\varepsilon > 0$, let $K_{\varepsilon} \subset X$ be a compact set such that $\bar{P}_n(K_{\varepsilon}) \ge 1 - \varepsilon$ for all $n$. Since $K_{\varepsilon}$ is compact in $X$ it is compact in $\bar{X}$ as well and in particular a closed subset of $\bar{X}$. Therefore by Proposition 18.30,
$$\bar{P}(K_{\varepsilon}) \ge \limsup_{k\to\infty} \bar{P}'_k(K_{\varepsilon}) = 1 - \varepsilon.$$
Since $\varepsilon > 0$ is arbitrary, this shows that $X_0 := \bigcup_{n=1}^{\infty} K_{1/n}$ satisfies $\bar{P}(X_0) = 1$. Because $X_0 \in \mathcal{B}_X \cap \mathcal{B}_{\bar{X}}$, we may view $\bar{P}$ as a measure on $\mathcal{B}_X$ by letting $P(A) := \bar{P}(A \cap X_0)$ for all $A \in \mathcal{B}_X$.

Given a closed subset $F \subset X$, choose $\bar{F}$ closed in $\bar{X}$ such that $F = \bar{F} \cap X$. Then
$$\limsup_{k\to\infty} P'_k(F) = \limsup_{k\to\infty} \bar{P}'_k(\bar{F}) \le \bar{P}(\bar{F}) = \bar{P}(\bar{F} \cap X_0) = P(F),$$
which shows $P'_k \overset{w}{\Rightarrow} P$.
18.5. Supplement: Quotient spaces, adjoints, and more reflexivity.

Definition 18.35. Let $X$ and $Y$ be Banach spaces and $A : X \to Y$ be a linear operator. The transpose of $A$ is the linear operator $A^{\dagger} : Y^* \to X^*$ defined by $(A^{\dagger} f)(x) = f(Ax)$ for $f \in Y^*$ and $x \in X$. The null space of $A$ is the subspace $\operatorname{Nul}(A) := \{x \in X : Ax = 0\} \subset X$. For $M \subset X$ and $N \subset X^*$ let
$$M^0 := \{f \in X^* : f|_M = 0\} \quad \text{and} \quad N^{\perp} := \{x \in X : f(x) = 0 \text{ for all } f \in N\}.$$

Proposition 18.36 (Basic Properties).
(1) $\|A\| = \|A^{\dagger}\|$ and $A^{\dagger\dagger} \hat{x} = \widehat{Ax}$ for all $x \in X$.
(2) $M^0$ and $N^{\perp}$ are always closed subspaces of $X^*$ and $X$ respectively.
(3) $(M^0)^{\perp} = \bar{M}$.
(4) $\bar{N} \subset (N^{\perp})^0$ with equality when $X$ is reflexive.
(5) $\operatorname{Nul}(A) = \operatorname{Ran}(A^{\dagger})^{\perp}$ and $\operatorname{Nul}(A^{\dagger}) = \operatorname{Ran}(A)^0$. Moreover, $\overline{\operatorname{Ran}(A)} = \operatorname{Nul}(A^{\dagger})^{\perp}$ and, if $X$ is reflexive, then $\overline{\operatorname{Ran}(A^{\dagger})} = \operatorname{Nul}(A)^0$.
(6) $X$ is reflexive iff $X^*$ is reflexive. More generally $X^{***} = \widehat{X^*} \oplus \hat{X}^0$.
Proof.
(1)
$$\|A\| = \sup_{\|x\|=1} \|Ax\| = \sup_{\|x\|=1} \sup_{\|f\|=1} |f(Ax)| = \sup_{\|f\|=1} \sup_{\|x\|=1} |(A^{\dagger} f)(x)| = \sup_{\|f\|=1} \|A^{\dagger} f\| = \|A^{\dagger}\|.$$
(2) This is an easy consequence of the assumed continuity of all linear functionals involved.
(3) If $x \in \bar{M}$, then $f(x) = 0$ for all $f \in M^0$, so that $x \in (M^0)^{\perp}$. Therefore $\bar{M} \subset (M^0)^{\perp}$. If $x \notin \bar{M}$, then there exists $f \in X^*$ such that $f|_M = 0$ while $f(x) \neq 0$, i.e. $f \in M^0$ yet $f(x) \neq 0$. This shows $x \notin (M^0)^{\perp}$ and we have shown $(M^0)^{\perp} \subset \bar{M}$.
(4) It is again simple to show $N \subset (N^{\perp})^0$ and therefore $\bar{N} \subset (N^{\perp})^0$. Moreover, as above, if $f \notin \bar{N}$ there exists $\psi \in X^{**}$ such that $\psi|_{\bar{N}} = 0$ while $\psi(f) \neq 0$. If $X$ is reflexive, $\psi = \hat{x}$ for some $x \in X$ and, since $g(x) = \psi(g) = 0$ for all $g \in \bar{N}$, we have $x \in N^{\perp}$. On the other hand, $f(x) = \psi(f) \neq 0$, so $f \notin (N^{\perp})^0$. Thus again $(N^{\perp})^0 \subset \bar{N}$.
(5)
$$\operatorname{Nul}(A) = \{x \in X : Ax = 0\} = \{x \in X : f(Ax) = 0 \ \forall f \in Y^*\} = \{x \in X : (A^{\dagger} f)(x) = 0 \ \forall f \in Y^*\}$$
$$= \{x \in X : g(x) = 0 \ \forall g \in \operatorname{Ran}(A^{\dagger})\} = \operatorname{Ran}(A^{\dagger})^{\perp}.$$
Similarly,
$$\operatorname{Nul}(A^{\dagger}) = \{f \in Y^* : A^{\dagger} f = 0\} = \{f \in Y^* : (A^{\dagger} f)(x) = 0 \ \forall x \in X\} = \{f \in Y^* : f(Ax) = 0 \ \forall x \in X\}$$
$$= \{f \in Y^* : f|_{\operatorname{Ran}(A)} = 0\} = \operatorname{Ran}(A)^0.$$
(6) Let $\psi \in X^{***}$ and define $f_{\psi} \in X^*$ by $f_{\psi}(x) = \psi(\hat{x})$ for all $x \in X$ and set $\psi' := \psi - \widehat{f_{\psi}}$. For $x \in X$ (so $\hat{x} \in X^{**}$) we have
$$\psi'(\hat{x}) = \psi(\hat{x}) - \widehat{f_{\psi}}(\hat{x}) = f_{\psi}(x) - \hat{x}(f_{\psi}) = f_{\psi}(x) - f_{\psi}(x) = 0.$$
This shows $\psi' \in \hat{X}^0$ and we have shown $X^{***} = \widehat{X^*} + \hat{X}^0$. If $\psi \in \widehat{X^*} \cap \hat{X}^0$, then $\psi = \hat{f}$ for some $f \in X^*$ and $0 = \hat{f}(\hat{x}) = \hat{x}(f) = f(x)$ for all $x \in X$, i.e. $f = 0$, so $\psi = 0$. Therefore $X^{***} = \widehat{X^*} \oplus \hat{X}^0$ as claimed. If $X$ is reflexive, then $\hat{X} = X^{**}$ and so $\hat{X}^0 = \{0\}$, showing $X^{***} = \widehat{X^*}$, i.e. $X^*$ is reflexive. Conversely, if $X^*$ is reflexive we conclude that $\hat{X}^0 = \{0\}$ and therefore $X^{**} = \{0\}^{\perp} = (\hat{X}^0)^{\perp} = \overline{\hat{X}} = \hat{X}$, so that $X$ is reflexive.

Alternative proof. Notice that $f_{\psi} = J^{\dagger}\psi$, where $J : X \to X^{**}$ is given by $Jx = \hat{x}$, and the composition
$$f \in X^* \longrightarrow \hat{f} \in X^{***} \longrightarrow J^{\dagger}\hat{f} \in X^*$$
is the identity map since $(J^{\dagger}\hat{f})(x) = \hat{f}(Jx) = \hat{f}(\hat{x}) = \hat{x}(f) = f(x)$ for all $x \in X$. Thus it follows that the map $f \in X^* \to \hat{f} \in X^{***}$ is surjective (i.e. $X^*$ is reflexive) iff $\operatorname{Nul}(J^{\dagger}) = \{0\}$. But as above $\operatorname{Nul}(J^{\dagger}) = \operatorname{Ran}(J)^0$, which will be zero iff $\overline{\operatorname{Ran}(J)} = X^{**}$, and since $J$ is an isometry this is equivalent to saying $\operatorname{Ran}(J) = X^{**}$. So we have again shown $X^*$ is reflexive iff $X$ is reflexive.
Theorem 18.37. Let X be a Banach space, M X be a proper closed subspace,
X/M the quotient space, : X X/M the projection map (x) = x + M for
x X and dene the quotient norm on X/M by
k(x)k
X/M
= kx +Mk
X/M
= inf
mM
kx +mk
X
.
Then
(1) kk
X/M
is a norm on X/M.
(2) The projection map : X X/M has norm 1, kk = 1.
(3) (X/M, kk
X/M
) is a Banach space.
(4) If Y is another normed space and T : X Y is a bounded linear transfor-
mation such that M Nul(T), then there exists a unique linear transfor-
mation S : X/M Y such that T = S and moreover kTk = kSk .
Proof. 1) Clearly kx +Mk 0 and if kx +Mk = 0, then there exists m
n
M
such that kx + m
n
k 0 as n , i.e. x = lim
n
m
n


M = M. Since x M,
x +M = 0 X/M. If c C\ {0} , x X, then
kcx +Mk = inf
mM
kcx +mk = |c| inf
mM
kx +m/ck = |c| kx +Mk
because m/c runs through M as mruns through M. Let x
1
, x
2
X and m
1
, m
2
M
then
kx
1
+x
2
+Mk kx
1
+x
2
+m
1
+m
2
k kx
1
+m
1
k +kx
2
+m
2
k.
Taking innums over m
1
, m
2
M then implies
kx
1
+x
2
+Mk kx
1
+Mk +kx
2
+Mk.
and we have completed the proof the (X/M, k k) is a normed space.
2) Since k(x)k = inf
mM
kx +mk kxk for all x X, kk 1. To see kk = 1,
let x X \ M so that (x) 6= 0. Given (0, 1), there exists m M such that
kx +mk
1
k(x)k .
Therefore,
k(x +m)k
kx +mk
=
k(x)k
kx +mk

kx +mk
kx +mk
=
which shows kk . Since (0, 1) is arbitrary we conclude that k(x)k = 1.
3) Let (x
n
) X/M be a sequence such that
P
k(x
n
)k < . As above there
exists m
n
M such that k(x
n
)k
1
2
kx
n
+ m
n
k and hence
P
kx
n
+ m
n
k
2
P
k(x
n
)k < . Since X is complete, x :=

P
n=1
(x
n
+m
n
) exists in X and therefore
by the continuity of ,
(x) =

X
n=1
(x
n
+m
n
) =

X
n=1
(x
n
)
showing X/M is complete.
ANALYSIS TOOLS WITH APPLICATIONS 367
4) The existence of S is guaranteed by the factor theorem from linear algebra.
Moreover kSk = kTk because
kTk = kS k kSk kk = kSk
and
kSk = sup
x/ M
kS((x))k
k(x)k
= sup
x/ M
kTxk
k(x)k
sup
x/ M
kTxk
kxk
= sup
x6=0
kTxk
kxk
= kTk .
Theorem 18.38. Let X be a Banach space. Then
(1) Identifying X with

X X

, the weak topology on X

induces the
weak topology on X. More explicitly, the map x X x

X is a homeo-
morphism when X is equipped with its weak topology and

X with the relative
topology coming from the weak- topology on X

.
(2)

X X

is dense in the weak- topology on X

.
(3) Letting C and C

be the closed unit balls in X and X

respectively, then

C := { x C

: x C} is dense in C

in the weak topology on X


.
.
(4) X is reexive i C is weakly compact.
Proof.
(1) The weak topology on X

is generated by
n

f : f X

o
= { X

(f) : f X

} .
So the induced topology on X is generated by
{x X x X

x(f) = f(x) : f X

} = X

and so the induced topology on X is precisely the weak topology.


(2) A basic weak - neighborhood of a point X

is of the form
(18.12) N :=
n
k=1
{ X

: |(f
k
) (f
k
)| < }
for some {f
k
}
n
k=1
X

and > 0. be given. We must now nd x X such


that x N, or equivalently so that
(18.13) | x(f
k
) (f
k
)| = |f
k
(x) (f
k
)| < for k = 1, 2, . . . , n.
In fact we will show there exists x X such that (f
k
) = f
k
(x) for
k = 1, 2, . . . , n. To prove this stronger assertion we may, by discard-
ing some of the f
k
s if necessary, assume that {f
k
}
n
k=1
is a linearly in-
dependent set. Since the {f
k
}
n
k=1
are linearly independent, the map
x X (f
1
(x), . . . , f
n
(x)) C
n
is surjective (why) and hence there
exists x X such that
(18.14) (f
1
(x), . . . , f
n
(x)) = Tx = ((f
1
) , . . . , (f
n
))
as desired.
(3) Let C

and N be the weak - open neighborhood of as


in Eq. (18.12). Working as before, given > 0, we need to nd x C
such that Eq. (18.13). It will be left to the reader to verify that it suces
again to assume {f
k
}
n
k=1
is a linearly independent set. (Hint: Suppose that
368 BRUCE K. DRIVER

{f
1
, . . . , f
m
} were a maximal linearly dependent subset of {f
k
}
n
k=1
, then
each f
k
with k > m may be written as a linear combination {f
1
, . . . , f
m
} .)
As in the proof of item 2., there exists x X such that Eq. (18.14)
holds. The problem is that x may not be in C. To remedy this, let N :=

n
k=1
Nul(f
k
) = Nul(T), : X X/N

= C
n
be the projection map and

f
k
(X/N)

be chosen so that f
k
=

f
k
for k = 1, 2, . . . , n. Then we have
produced x X such that
((f
1
) , . . . , (f
n
)) = (f
1
(x), . . . , f
n
(x)) = (

f
1
((x)), . . . ,

f
n
((x))).
Since

f
1
, . . . ,

f
n

is a basis for (X/N)

we nd
k(x)k = sup
C
n
\{0}

P
n
i=1

i

f
i
((x))

P
n
i=1

i

f
i

= sup
C
n
\{0}
|
P
n
i=1

i
(f
i
)|
k
P
n
i=1

i
f
i
k
= sup
C
n
\{0}
|(
P
n
i=1

i
f
i
)|
k
P
n
i=1

i
f
i
k
kk sup
C
n
\{0}
k
P
n
i=1

i
f
i
k
k
P
n
i=1

i
f
i
k
= 1.
Hence we have shown k(x)k 1 and therefore for any > 1 there
exists y = x + n X such that kyk < and ((f
1
) , . . . , (f
n
)) =
(f
1
(y), . . . , f
n
(y)). Hence
|(f
i
) f
i
(y/)|

f
i
(y)
1
f
i
(y)

(1
1
) |f
i
(y)|
which can be arbitrarily small (i.e. less than ) by choosing suciently
close to 1.
(4) Let

C := { x : x C} C

. If X is reexive,

C = C

is weak
- compact and hence by item 1., C is weakly compact in X. Conversely
if C is weakly compact, then

C C

is weak compact being the


continuous image of a continuous map. Since the weak topology on
X

is Hausdor, it follows that



C is weak closed and so by item 3,
C

=

C
weak
=

C. So if X

, / kk C

=

C, i.e. there exists
x C such that x = / kk . This shows = (kk x)

and therefore

X = X

.
18.6. Exercises.
18.6.1. More Examples of Banach Spaces.
Exercise 18.1. Let (X, M) be a measurable space and M(X) denote the space
of complex measures on (X, M) and for M(X) let kk |k(X). Show
(M(X), kk) is a Banach space. (Move to Section 16.)
Exercise 18.2. Folland 5.9, p. 155.
Exercise 18.3. Folland 5.10, p. 155.
Exercise 18.4. Folland 5.11, p. 155.
ANALYSIS TOOLS WITH APPLICATIONS 369
18.6.2. Hahn-Banach Theorem Problems.
Exercise 18.5. Folland 5.17, p. 159.
Exercise 18.6. Folland 5.18, p. 159.
Exercise 18.7. Folland 5.19, p. 160.
Exercise 18.8. Folland 5.20, p. 160.
Exercise 18.9. Folland 5.21, p. 160.
Exercise 18.10. Let X be a Banach space such that X

is separable. Show X
is separable as well. (Folland 5.25.) Hint: use the greedy algorithm, i.e. suppose
D X

\{0} is a countable dense subset of X

, for D choose x

X such that
kx

k = 1 and |(x

)|
1
2
kk.
Exercise 18.11. Folland 5.26.
Exercise 18.12. Give another proof Corollary 4.10 based on Remark 4.8. Hint:
the Hahn Banach theorem implies
kf(b) f(a)k = sup
X

, 6=0
|(f(b)) (f(a))|
kk
.
18.6.3. Baire Category Result Problems.
Exercise 18.13. Folland 5.29, p. 164.
Exercise 18.14. Folland 5.30, p. 164.
Exercise 18.15. Folland 5.31, p. 164.
Exercise 18.16. Folland 5.32, p. 164.
Exercise 18.17. Folland 5.33, p. 164.
Exercise 18.18. Folland 5.34, p. 164.
Exercise 18.19. Folland 5.35, p. 164.
Exercise 18.20. Folland 5.36, p. 164.
Exercise 18.21. Folland 5.37, p. 165.
Exercise 18.22. Folland 5.38, p. 165.
Exercise 18.23. Folland 5.39, p. 165.
Exercise 18.24. Folland 5.40, p. 165.
Exercise 18.25. Folland 5.41, p. 165.
18.6.4. Weak Topology and Convergence Problems.
Exercise 18.26. Folland 5.47, p. 171.
Denition 18.39. A sequence {x
n
}

n=1
X is weakly Cauchy if for all V
w
such that 0 V, x
n
x
m
V for all m, n suciently large. Similarly a sequence
{f
n
}

n=1
X

is weak Cauchy if for all V


w
such that 0 V, f
n
f
m
V
for all m, n suciently large.
370 BRUCE K. DRIVER

Remark 18.40. These conditions are equivalent to {f(x


n
)}

n=1
being Cauchy for all
f X

and {f
n
(x)}

n=1
being Cauchy for all x X respectively.
Exercise 18.27. Folland 5.48, p. 171.
Exercise 18.28. Folland 5.49, p. 171.
Exercise 18.29. land 5.50, p. 172.
Exercise 18.30. Let X be a Banach space. Show every weakly compact subset of
X is norm bounded and every weak compact subset of X

is norm bounded.
Exercise 18.31. Folland 5.51, p. 172.
Exercise 18.32. Folland 5.53, p. 172.
ANALYSIS TOOLS WITH APPLICATIONS 371
19. Weak and Strong Derivatives
For this section, let be an open subset of R
d
, p, q, r [1, ], L
p
() =
L
p
(, B

, m) and L
p
loc
() = L
p
loc
(, B

, m), where m is Lebesgue measure on B


R
d
and B

is the Borel algebra on . If = R


d
, we will simply write L
p
and L
p
loc
for L
p
(R
d
) and L
p
loc
(R
d
) respectively. Also let
hf, gi :=
Z

fgdm
for any pair of measurable functions f, g : C such that fg L
1
(). For
example, by Hlders inequality, if hf, gi is dened for f L
p
() and g L
q
()
when q =
p
p1
.
Denition 19.1. A sequence {u
n
}

n=1
L
p
loc
() is said to converge to u L
p
loc
()
if lim
n
ku u
n
k
L
q
(K)
= 0 for all compact subsets K .
The following simple but useful remark will be used (typically without further
comment) in the sequel.
Remark 19.2. Suppose r, p, q [1, ] are such that r
1
= p
1
+ q
1
and f
t
f
in L
p
() and g
t
g in L
q
() as t 0, then f
t
g
t
fg in L
r
(). Indeed,
kf
t
g
t
fgk
r
= k(f
t
f) g
t
+f (g
t
g)k
r
kf
t
fk
p
kg
t
k
q
+kfk
p
kg
t
gk
q
0 as t 0
19.1. Basic Denitions and Properties.
Denition 19.3 (Weak Dierentiability). Let v R
d
and u L
p
() (u L
p
loc
())
then
v
u is said to exist weakly in L
p
() (L
p
loc
()) if there exists a function
g L
p
() (g L
p
loc
()) such that
(19.1) hu,
v
i = hg, i for all C

c
().
The function g if it exists will be denoted by
(w)
v
u. Similarly if N
d
0
and

is
as in Notation 11.10, we say

u exists weakly in L
p
() (L
p
loc
()) i there exists
g L
p
() (L
p
loc
()) such that
hu,

i = (1)
||
hg, i for all C

c
().
More generally if p() =
P
||N
a

is a polynomial in R
n
, then p()u exists
weakly in L
p
() (L
p
loc
()) i there exists g L
p
() (L
p
loc
()) such that
(19.2) hu, p()i = hg, i for all C

c
()
and we denote g by wp()u.
By Corollary 11.28, there is at most one g L
1
loc
() such that Eq. (19.2) holds,
so wp()u is well dened.
Lemma 19.4. Let p() be a polynomial on R
d
, k = deg (p) N, and u L
1
loc
()
such that p()u exists weakly in L
1
loc
(). Then
(1) supp
m
(wp()u) supp
m
(u), where supp
m
(u) is the essential support of
u relative to Lebesgue measure, see Denition 11.14.
(2) If deg p = k and u|
U
C
k
(U, C) for some open set U , then wp()u =
p () u a.e. on U.
Proof.
372 BRUCE K. DRIVER

(1) Since
hwp()u, i = hu, p()i = 0 for all C

c
(\ supp
m
(u)),
an application of Corollary 11.28 shows wp()u = 0 a.e. on \
supp
m
(u). So by Lemma 11.15, \ supp
m
(u) \ supp
m
(wp()u), i.e.
supp
m
(wp()u) supp
m
(u).
(2) Suppose that u|
U
is C
k
and let C

c
(U). (We view as a function
in C

c
(R
d
) by setting 0 on R
d
\ U.) By Corollary 11.25, there exists
C

c
() such that 0 1 and = 1 in a neighborhood of supp().
Then by setting u = 0 on R
d
\ supp() we may view u C
k
c
(R
d
) and
so by standard integration by parts (see Lemma 11.26) and the ordinary
product rule,
hwp()u, i = hu, p()i = hu, p()i
= hp() (u) , i = hp()u, i (19.3)
wherein the last equality we have is constant on supp(). Since Eq.
(19.3) is true for all C

c
(U), an application of Corollary 11.28 with
h = wp()u p () u and = m shows wp()u = p () u a.e. on U.
Notation 19.5. In light of Lemma 19.4 there is no danger in simply writing p () u
for wp()u. So in the sequel we will always interpret p()u in the weak or dis-
tributional sense.
Example 19.6. Suppose u(x) = |x| for x R, then u(x) = sgn(x) in L
1
loc
(R)
while
2
u(x) = 2(x) so
2
u(x) does not exist weakly in L
1
loc
(R) .
Example 19.7. Suppose d = 2 and u(x, y) = 1
y>x
. Then u L
1
loc

R
2

, while

x
1
y>x
= (y x) and
y
1
y>x
= (y x) and so that neither
x
u or
y
u exists
weakly. On the other hand (
x
+
y
) u = 0 weakly. To prove these assertions,
notice u C

R
2
\

where =

(x, x) : x R
2

. So by Lemma 19.4, for any


polynomial p () without constant term, if p () u exists weakly then p () u = 0.
However,
hu,
x
i =
Z
y>x

x
(x, y)dxdy =
Z
R
(y, y)dy,
hu,
y
i =
Z
y>x

y
(x, y)dxdy =
Z
R
(x, x)dx and
hu, (
x
+
y
)i = 0
from which it follows that
x
u and
y
u can not be zero while (
x
+
y
)u = 0.
On the other hand if p() and q () are two polynomials and u L
1
loc
() is a
function such that p()u exists weakly in L
1
loc
() and q () [p () u] exists weakly
in L
1
loc
() then (qp) () u exists weakly in L
1
loc
() . This is because
hu, (qp) () i = hu, p () q()i
= hp () u, q()i = hq()p () u, i for all C

c
() .
Example 19.8. Let u(x, y) = 1
x>0
+ 1
y>0
in L
1
loc

R
2

. Then
x
u(x, y) = (x)
and
y
u(x, y) = (y) so
x
u(x, y) and
y
u(x, y) do not exist weakly in L
1
loc

R
2

.
However
y

x
u does exists weakly and is the zero function. This shows
y

x
u may
exists weakly despite the fact both
x
u and
y
u do not exists weakly in L
1
loc

R
2

.
ANALYSIS TOOLS WITH APPLICATIONS 373
Lemma 19.9. Suppose u L
1
loc
() and p() is a polynomial of degree k such that
p () u exists weakly in L
1
loc
() then
(19.4) hp () u, i = hu, p () i for all C
k
c
() .
Note: The point here is that Eq. (19.4) holds for all C
k
c
() not just
C

c
() .
Proof. Let C
k
c
() and choose C

c
(B(0, 1)) such that
R
R
d
(x)dx = 1
and let

(x) :=
d
(x/). Then

c
() for suciently small and
p () [

] =

p () p () and

uniformly on compact
sets as 0. Therefore by the dominated convergence theorem,
hp () u, i = lim
0
hp () u,

i = lim
0
hu, p () (

)i = hu, p () i.
Lemma 19.10 (Product Rule). Let u L
1
loc
(), v R
d
and C
1
(). If
(w)
v
u
exists in L
1
loc
(), then
(w)
v
(u) exists in L
1
loc
() and

(w)
v
(u) =
v
u +
(w)
v
u a.e.
Moreover if C
1
c
() and F := u L
1
(here we dene F on R
d
by setting F = 0
on R
d
\ ), then
(w)
F =
v
u +
(w)
v
u exists weakly in L
1
(R
d
).
Proof. Let C

c
(), then using Lemma 19.9,
hu,
v
i = hu,
v
i = hu,
v
()
v
i = h
(w)
v
u, i +h
v
u, i
= h
(w)
v
u, i +h
v
u, i.
This proves the rst assertion. To prove the second assertion let C

c
() such
that 0 1 and = 1 on a neighborhood of supp(). So for C

c
(R
d
), using

v
= 0 on supp() and C

c
(), we nd
hF,
v
i = hF,
v
i = hF,
v
i = h(u) ,
v
()
v
i
= h(u) ,
v
()i = h
(w)
v
(u) , ()i
= h
v
u +
(w)
v
u, i = h
v
u +
(w)
v
u, i.
This show
(w)
v
F =
v
u +
(w)
v
u as desired.
Lemma 19.11. Suppose q [1, ), p() is a polynomial in R
d
and u L
q
loc
().
If there exists {u
m
}

m=1
L
q
loc
() such that p () u
m
exists in L
q
loc
() for all m
and there exists g L
q
loc
() such that for all C

c
(),
lim
m
hu
m
, i = hu, i and lim
m
hp () u
m
, i = hg, i
then p () u exists in L
q
loc
() and p () u = g.
Proof. Since
hu, p () i = lim
m
hu
m
, p () i = lim
m
hp () u
m
, i = hg, i
for all C

c
(), p () u exists and is equal to g L
q
loc
().
Conversely we have the following proposition.
374 BRUCE K. DRIVER

Proposition 19.12 (Mollication). Suppose q [1, ), p


1
(), . . . , p
N
() is a col-
lection of polynomials in R
d
and u L
q
loc
() such that p
l
()u exists weakly in
L
q
loc
() for l = 1, 2, . . . , N. Then there exists u
n
C

c
() such that u
n
u in
L
q
loc
() and p
l
() u
n
p
l
() u in L
q
loc
() for l = 1, 2, . . . , N.
Proof. Let C

c
(B(0, 1)) such that
R
R
d
dm = 1 and

(x) :=
d
(x/)
be as in the proof of Lemma 19.9. For any function f L
1
loc
() , > 0 and
x

:= {y : dist(y,
c
) > } , let
f

(x) := f

(x) := 1

(x) =
Z

f(y)

(x y)dy.
Notice that f

) and

as 0.
Given a compact set K let K

:= {x : dist(x, K) } . Then K

K as
0, there exists
0
> 0 such that K
0
:= K

0
is a compact subset of
0
:=

0

(see Figure 38) and for x K,
f

(x) :=
Z

f(y)

(x y)dy =
Z
K

f(y)

(x y)dy.
Therefore, using Theorem 11.21,
0

Figure 38. The geomentry of K K


0

0
.
kf

fk
L
p
(K)
= k(1
K
0
f)

1
K
0
fk
L
p
(K)
k(1
K
0
f)

1
K
0
fk
L
p
(R
d
)
0 as 0.
Hence, for all f L
q
loc
(), f

) and
(19.5) lim
0
kf

fk
L
p
(K)
= 0.
Now let p() be a polynomial on R
d
, u L
q
loc
() such that p () u L
q
loc
() and
v

:=

u C

) as above. Then for x K and <


0
,
p()v

(x) =
Z

u(y)p(
x
)

(x y)dy =
Z

u(y)p(
y
)

(x y)dy
=
Z

u(y)p(
y
)

(x y)dy = hu, p()

(x )i
= hp()u,

(x )i = (p()u)

(x). (19.6)
ANALYSIS TOOLS WITH APPLICATIONS 375
From Eq. (19.6) we may now apply Eq. (19.5) with f = u and f = p
l
()u for
1 l N to nd
kv

uk
L
p
(K)
+
N
X
l=1
kp
l
()v

p
l
()uk
L
p
(K)
0 as 0.
For n N, let
K
n
:= {x : |x| n and d(x,
c
) 1/n}
(so K
n
K
o
n+1
K
n+1
for all n and K
n
as n or see Lemma 10.10)
and choose
n
C

c
(K
o
n+1
, [0, 1]), using Corollary 11.25, so that
n
= 1 on a
neighborhood of K
n
. Choose
n
0 such that K
n+1

n
and
kv

n
uk
L
p
(K
n
)
+
N
X
l=1
kp
l
()v

n
p
l
()uk
L
p
(K
n
)
1/n.
Then u
n
:=
n
v

n
C

c
() and since u
n
= v

n
on K
n
we still have
(19.7) ku
n
uk
L
p
(K
n
)
+
N
X
l=1
kp
l
()u
n
p
l
()uk
L
p
(K
n
)
1/n.
Since any compact set K is contained in K
o
n
for all n suciently large, Eq.
(19.7) implies
lim
n
"
ku
n
uk
L
p
(K)
+
N
X
l=1
kp
l
()u
n
p
l
()uk
L
p
(K)
#
= 0.
The following proposition is another variant of Proposition 19.12 which the
reader is asked to prove in Exercise 19.2 below.
Proposition 19.13. Suppose q [1, ), p
1
(), . . . , p
N
() is a collection of poly-
nomials in R
d
and u L
q
= L
q

R
d

such that p
l
()u L
q
for l = 1, 2, . . . , N.
Then there exists u
n
C

R
d

such that
lim
n
"
ku
n
uk
L
p
+
N
X
l=1
kp
l
()u
n
p
l
()uk
L
p
#
= 0.
Notation 19.14 (Dierence quotients). For v R
d
and h R\{0} and a function
u : C, let

h
v
u(x) :=
u(x +hv) u(x)
h
for those x such that x+hv . When v is one of the standard basis elements,
e
i
for 1 i d, we will write
h
i
u(x) rather than
h
e
i
u(x). Also let

h
u(x) :=

h
1
u(x), . . . ,
h
n
u(x)

be the dierence quotient approximation to the gradient.


Denition 19.15 (Strong Dierentiability). Let v R
d
and u L
p
, then
v
u is
said to exist strongly in L
p
if the lim
h0

h
v
u exists in L
p
. We will denote the limit
by
(s)
v
u.
376 BRUCE K. DRIVER

It is easily veried that if u L


p
, v R
d
and
(s)
v
u L
p
exists then
(w)
v
u exists
and
(w)
v
u =
(s)
v
u. The key to checking this assetion is the identity,
h
h
v
u, i =
Z
R
d
u(x +hv) u(x)
h
(x)dx
=
Z
R
d
u(x)
(x hv) (x)
h
dx = hu,
h
v
i. (19.8)
Hence if
(s)
v
u = lim
h0

h
v
u exists in L
p
and C

c
(R
d
), then
h
(s)
v
u, i = lim
h0
h
h
v
u, i = lim
h0
hu,
h
v
i =
d
dh
|
0
hu, ( hv)i = hu,
v
i
wherein Corollary 7.43 has been used in the last equality to bring the derivative
past the integral. This shows
(w)
v
u exists and is equal to
(s)
v
u. What is somewhat
more surprising is that the converse assertion that if
(w)
v
u exists then so does

(s)
v
u. Theorem 19.18 is a generalization of Theorem 12.39 from L
2
to L
p
. For the
readers convenience, let us give a self-contained proof of the version of the Banach
- Alaoglus Theorem which will be used in the proof of Theorem 19.18. (This is the
same as Theorem 18.27 above.)
Proposition 19.16 (Weak- Compactness: Banach - Alaoglus Theorem). Let X
be a separable Banach space and {f
n
} X

be a bounded sequence, then there exist


a subsequence {

f
n
} {f
n
} such that lim
n
f
n
(x) = f(x) for all x X with f X

.
Proof. Let D X be a countable linearly independent subset of X such
that span(D) = X. Using Cantors diagonal trick, choose {

f
n
} {f
n
} such that

x
:= lim
n

f
n
(x) exist for all x D. Dene f : span(D) R by the formula
f(
X
xD
a
x
x) =
X
xD
a
x

x
where by assumption #({x D : a
x
6= 0}) < . Then f : span(D) R is linear
and moreover

f
n
(y) f(y) for all y span(D). Now
|f(y)| = lim
n
|

f
n
(y)| limsup
n
k

f
n
k kyk Ckyk for all y span(D).
Hence by the B.L.T. Theorem 4.1, f extends uniquely to a bounded linear functional
on X. We still denote the extension of f by f X

. Finally, if x X and y
span(D)
|f(x)

f
n
(x)| |f(x) f(y)| +|f(y)

f
n
(y)| +|

f
n
(y)

f
n
(x)|
kfk kx yk +k

f
n
k kx yk +|f(y)

f
n
(y)k
2Ckx yk +|f(y)

f
n
(y)| 2Ckx yk as n .
Therefore limsup
n
|f(x)

f
n
(x)| 2Ckx yk 0 as y x.
Corollary 19.17. Let p (1, ] and q =
p
p1
. Then to every bounded sequence
{u
n
}

n=1
L
p
() there is a subsequence { u
n
}

n=1
and an element u L
p
() such
that
lim
n
h u
n
, gi = hu, gi for all g L
q
() .
ANALYSIS TOOLS WITH APPLICATIONS 377
Proof. By Theorem 15.14, the map
v L
p
() hv, i (L
q
())

is an isometric isomorphism of Banach spaces. By Theorem 11.3, L


q
() is separable
for all q [1, ) and hence the result now follows from Proposition 19.16.
Theorem 19.18 (Weak and Strong Dierentiability). Suppose p [1, ), u
L
p
(R
d
) and v R
d
\ {0} . Then the following are equivalent:
(1) There exists g L
p
(R
d
) and {h
n
}

n=1
R\ {0} such that lim
n
h
n
= 0
and
lim
n
h
h
n
v
u, i = hg, i for all C

c
(R
d
).
(2)
(w)
v
u exists and is equal to g L
p
(R
d
), i.e. hu,
v
i = hg, i for all
C

c
(R
d
).
(3) There exists g L
p
(R
d
) and u
n
C

c
(R
d
) such that u
n
L
p
u and
v
u
n
L
p
g
as n .
(4)
(s)
v
u exists and is is equal to g L
p
(R
d
), i.e.
h
v
u g in L
p
as h 0.
Moreover if p (1, ) any one of the equivalent conditions 1. 4. above are
implied by the following condition.
1
0
. There exists {h
n
}

n=1
R\ {0} such that lim
n
h
n
= 0 and sup
n

h
n
v
u

p
<
.
Proof. 4. = 1. is simply the assertion that strong convergence implies weak
convergence.
1. = 2. For C

c
(R
d
), Eq. (19.8) and the dominated convergence theorem
implies
hg, i = lim
n
h
h
n
v
u, i = lim
n
hu,
h
n
v
i = hu,
v
i.
2. = 3. Let C

c
(R
d
, R) such that
R
R
d
(x)dx = 1 and let
m
(x) =
m
d
(mx), then by Proposition 11.24, h
m
:=
m
u C

(R
d
) for all m and

v
h
m
(x) =
v

m
u(x) =
Z
R
d

v

m
(x y)u(y)dy = hu,
v
[
m
(x )]i
= hg,
m
(x )i =
m
g(x).
By Theorem 11.21, h
m
u L
p
(R
d
) and
v
h
m
=
m
g g in L
p
(R
d
) as m .
This shows 3. holds except for the fact that h
m
need not have compact support.
To x this let C

c
(R
d
, [0, 1]) such that = 1 in a neighborhood of 0 and let

(x) = (x) and (


v
)

(x) := (
v
) (x). Then

v
(

h
m
) =
v

h
m
+

v
h
m
= (
v
)

h
m
+

v
h
m
so that

h
m
h
m
in L
p
and
v
(

h
m
)
v
h
m
in L
p
as 0. Let u
m
=

m
h
m
where
m
is chosen to be greater than zero but small enough so that
k

m
h
m
h
m
k
p
+k
v
(

m
h
m
)
v
h
m
k
p
< 1/m.
Then u
m
C

c
(R
d
), u
m
u and
v
u
m
g in L
p
as m .
3. = 4. By the fundamental theorem of calculus

h
v
u
m
(x) =
u
m
(x +hv) u
m
(x)
h
=
1
h
Z
1
0
d
ds
u
m
(x +shv)ds =
Z
1
0
(
v
u
m
) (x +shv)ds. (19.9)
378 BRUCE K. DRIVER

and therefore,

h
v
u
m
(x)
v
u
m
(x) =
Z
1
0
[(
v
u
m
) (x +shv)
v
u
m
(x)] ds.
So by Minkowskis inequality for integrals, Theorem 9.27,

h
v
u
m
(x)
v
u
m

p

Z
1
0
k(
v
u
m
) ( +shv)
v
u
m
k
p
ds
and letting m in this equation then implies

h
v
u g

p

Z
1
0
kg( +shv) gk
p
ds.
By the dominated convergence theorem and Proposition 11.13, the right member
of this equation tends to zero as h 0 and this shows item 4. holds.
(1
0
. =1. when p > 1) This is a consequence of Corollary 19.17 (or see Theorem
18.27 above) which asserts, by passing to a subsequence if necessary, that
h
n
v
u
w
g
for some g L
p
(R
d
).
Example 19.19. The fact that (1
0
) does not imply the equivalent conditions 1
4 in Theorem 19.18 when p = 1 is demonstrated by the following example. Let
u := 1
[0,1]
, then
Z
R

u(x +h) u(x)


h

dx =
1
|h|
Z
R

1
[h,1h]
(x) 1
[0,1]
(x)

dx = 2
for |h| < 1. On the other hand the distributional derivative of u is u(x) = (x)
(x 1) which is not in L
1
.
Alternatively, if there exists g L
1
(R, dm) such that
lim
n
u(x +h
n
) u(x)
h
n
= g(x) in L
1
for some sequence {h
n
}

n=1
as above. Then for C

c
(R) we would have on one
hand,
Z
R
u(x +h
n
) u(x)
h
n
(x)dx =
Z
R
(x h
n
) (x)
h
n
u(x)dx

Z
1
0

0
(x)dx = ((0) (1)) as n ,
while on the other hand,
Z
R
u(x +h
n
) u(x)
h
n
(x)dx
Z
R
g(x)(x)dx.
These two equations imply
(19.10)
Z
R
g(x)(x)dx = (0) (1) for all C

c
(R)
and in particular that
R
R
g(x)(x)dx = 0 for all C
c
(R\ {0, 1}). By Corollary
11.28, g(x) = 0 for m a.e. x R\ {0, 1} and hence g(x) = 0 for m a.e. x R.
But this clearly contradicts Eq. (19.10). This example also shows that the unit ball
in L
1
(R, dm) is not weakly sequentially compact. Compare with Example 18.24.
Corollary 19.20. If 1 p < , u L
p
such that
v
u L
p
, then

h
v
u

L
p

k
v
uk
L
p
for all h 6= 0 and v R
d
.
ANALYSIS TOOLS WITH APPLICATIONS 379
Proof. By Minkowskis inequality for integrals, Theorem 9.27, we may let m
in Eq. (19.9) to nd

h
v
u(x) =
Z
1
0
(
v
u) (x +shv)ds for a.e. x R
d
and

h
v
u

L
p

Z
1
0
k(
v
u) ( +shv)k
L
p
ds = k
v
uk
L
p
.
Proposition 19.21 (A weak form of Weyls Lemma). If u L
2
(R
d
) such that
f := 4u L
2
(R
d
) then

u L
2

R
d

for || 2. Furthermore if k N
0
and

f L
2

R
d

for all || k, then

u L
2

R
d

for || k + 2.
Proof. By Proposition 19.13, there exists u
n
C

R
d

such that u
n
u and
u
n
u = f in L
2

R
d

. By integration by parts we nd
Z
R
d
|(u
n
u
m
)|
2
dm = ((u
n
u
m
), (u
n
u
m
))
L
2 (f f, u u) = 0 as m, n
and hence by item 3. of Theorem 19.18,
i
u L
2
for each i. Since
kuk
2
L
2
= lim
n
Z
R
d
|u
n
|
2
dm = (u
n
, u
n
)
L
2 (f, u) as n
we also learn that
(19.11) kuk
2
L
2
= (f, u) kfk
L
2
kuk
L
2
.
Let us now consider
d
X
i,j=1
Z
R
d
|
i

j
u
n
|
2
dm =
d
X
i,j=1
Z
R
d

j
u
n

2
i

j
u
n
dm
=
d
X
j=1
Z
R
d

j
u
n

j
u
n
dm =
d
X
j=1
Z
R
d

2
j
u
n
u
n
dm
=
Z
R
d
|u
n
|
2
dm = ku
n
k
2
L
2
.
Replacing u
n
by u
n
u
m
in this calculation shows
d
X
i,j=1
Z
R
d
|
i

j
(u
n
u
m
)|
2
dm = k(u
n
u
m
)k
2
L
2
0 as m, n
and therefore by Lemma 19.4 (also see Exercise 19.4),
i

j
u L
2

R
d

for all i, j
and
(19.12)
d
X
i,j=1
Z
R
d
|
i

j
u|
2
dm = kuk
2
L
2
= kfk
2
L
2
.
Combining Eqs. (19.11) and (19.12) gives the estimate
X
||2
k

uk
2
L
2
kuk
2
L
2
+kfk
L
2
kuk
L
2
+kfk
2
L
2
= kuk
2
L
2
+kuk
L
2
kuk
L
2
+kuk
2
L
2
. (19.13)
380 BRUCE K. DRIVER

Let us now further assume


i
f =
i
u L
2

R
d

. Then for h R \ {0} ,

h
i
u L
2
(R
d
) and
h
i
u =
h
i
u =
h
i
f L
2
(R
d
) and hence by Eq. (19.13) and
what we have just proved,

h
i
u =
h
i

u L
2
and
X
||2

h
i

2
L
2
(R
d
)

h
i
u

2
L
2
+

h
i
f

L
2

h
i
u

L
2
+

h
i
f

2
L
2
k
i
uk
2
L
2
+k
i
fk
L
2
k
i
uk
L
2
+k
i
fk
2
L
2
where the last inequality follows from Corollary 19.20. Therefore applying Theorem
19.18 again we learn that
i

u L
2
(R
d
) for all || 2 and
X
||2
k
i

uk
2
L
2
(R
d
)
k
i
uk
2
L
2
+k
i
fk
L
2
k
i
uk
L
2
+k
i
fk
2
L
2
kuk
2
L
2
+k
i
fk
L
2
kuk
L
2
+k
i
fk
2
L
2
kfk
L
2
kuk
L
2
+k
i
fk
L
2

q
kfk
L
2
kuk
L
2
+k
i
fk
2
L
2
.
The remainder of the proof, which is now an induction argument using the above
ideas, is left as an exercise to the reader.
Theorem 19.22. Suppose that is a precompact open subset of R
d
and V is an
open precompact subset of .
(1) If 1 p < , u L
p
() and
i
u L
p
(), then k
h
i
uk
L
p
(V )
k
i
uk
L
p
()
for all 0 < |h| <
1
2
dist(V,
c
).
(2) Suppose that 1 < p , u L
p
() and assume there exists a constants
C
V
< and
V
(0,
1
2
dist(V,
c
)) such that
k
h
i
uk
L
p
(V )
C
V
for all 0 < |h| <
V
.
Then
i
u L
p
(V ) and k
i
uk
L
p
(V )
C
V
. Moreover if C := sup
V
C
V
<
then in fact
i
u L
p
() and k
i
uk
L
p
()
C.
Proof. 1. Let U
o
such that

V U and

U is a compact subset of . For
u C
1
() L
p
(), x B and 0 < |h| <
1
2
dist(V, U
c
),

h
i
u(x) =
u(x +he
i
) u(x)
h
=
Z
1
0

i
u(x +the
i
) dt
and in particular,
|
h
i
u(x)|
Z
1
0
|u(x +the
i
)|dt.
Therefore by Minikowskis inequality for integrals,
(19.14) k
h
i
uk
L
p
(V )

Z
1
0
ku( +the
i
)k
L
p
(V )
dt k
i
uk
L
p
(U)
.
For general u L
p
() with
i
u L
p
(), by Proposition 19.12, there exists
u
n
C

c
() such that u
n
u and
i
u
n

i
u in L
p
loc
(). Therefore we may
replace u by u
n
in Eq. (19.14) and then pass to the limit to nd
k
h
i
uk
L
p
(V )
k
i
uk
L
p
(U)
k
i
uk
L
p
()
.
ANALYSIS TOOLS WITH APPLICATIONS 381
2. If k
h
i
uk
L
p
(V )
C
V
for all h suciently small then by Corollary 19.17 there
exists h
n
0 such that
h
n
i
u
w
v L
p
(V ). Hence if C

c
(V ),
Z
V
vdm = lim
n
Z

h
n
i
udm = lim
n
Z

u
h
n
i
dm
=
Z

u
i
dm =
Z
V
u
i
dm.
Therefore
i
u = v L
p
(V ) and k
i
uk
L
p
(V )
kvk
L
p
(V )
C
V
. Finally if C :=
sup
V
C
V
< , then by the dominated convergence theorem,
k
i
uk
L
p
()
= lim
V
k
i
uk
L
p
(V )
C.
We will now give a couple of applications of Theorem 19.18.
Lemma 19.23. Let v R
d
.
(1) If h L
1
and
v
h exists in L
1
, then
R
R
d

v
h(x)dx = 0.
(2) If p, q, r [1, ) satisfy r
1
= p
1
+q
1
, f L
p
and g L
q
are functions
such that
v
f and
v
g exists in L
p
and L
q
respectively, then
v
(fg) exists in
L
r
and
v
(fg) =
v
f g +f
v
g. Moreover if r = 1 we have the integration
by parts formula,
(19.15) h
v
f, gi = hf,
v
gi.
(3) If p = 1,
v
f exists in L
1
and g BC
1
(R
d
) (i.e. g C
1
(R
d
) with g and
its rst derivatives being bounded) then
v
(gf) exists in L
1
and
v
(fg) =

v
f g +f
v
g and again Eq. (19.15) holds.
Proof. 1) By item 3. of Theorem 19.18 there exists h
n
C

c
(R
d
) such that
h
n
h and
v
h
n

v
h in L
1
. Then
Z
R
d

v
h
n
(x)dx =
d
dt
|
0
Z
R
d
h
n
(x +hv)dx =
d
dt
|
0
Z
R
d
h
n
(x)dx = 0
and letting n proves the rst assertion.
2) Similarly there exists f
n
, g
n
C

c
(R
d
) such that f
n
f and
v
f
n

v
f in
L
p
and g
n
g and
v
g
n

v
g in L
q
as n . So by the standard product rule
and Remark 19.2, f
n
g
n
fg L
r
as n and

v
(f
n
g
n
) =
v
f
n
g
n
+f
n

v
g
n

v
f g +f
v
g in L
r
as n .
It now follows from another application of Theorem 19.18 that
v
(fg) exists in L
r
and
v
(fg) =
v
f g +f
v
g. Eq. (19.15) follows from this product rule and item
1. when r = 1.
3) Let f
n
C

c
(R
d
) such that f
n
f and
v
f
n

v
f in L
1
as n . Then
as above, gf
n
gf in L
1
and
v
(gf
n
)
v
g f + g
v
f in L
1
as n . In
particular if C

c
(R
d
), then
hgf,
v
i = lim
n
hgf
n
,
v
i = lim
n
h
v
(gf
n
) , i
= lim
n
h
v
g f
n
+g
v
f
n
, i = h
v
g f +g
v
f, i.
This shows
v
(fg) exists (weakly) and
v
(fg) =
v
f g +f
v
g. Again Eq. (19.15)
holds in this case by item 1. already proved.
382 BRUCE K. DRIVER

Lemma 19.24. Let p, q, r [1, ] satisfy p


1
+ q
1
= 1 + r
1
, f L
p
, g L
q
and v R
d
.
(1) If
v
f exists strongly in L
r
, then
v
(f g) exists strongly in L
p
and

v
(f g) = (
v
f) g.
(2) If
v
g exists strongly in L
q
, then
v
(f g) exists strongly in L
r
and

v
(f g) = f
v
g.
(3) If
v
f exists weakly in L
p
and g C

c
(R
d
), then f g C

(R
d
),
v
(f g)
exists strongly in L
r
and

v
(f g) = f
v
g = (
v
f) g.
Proof. Items 1 and 2. By Youngs inequality (Theorem 11.19) and simple
computations:

hv
(f g) f g
h
(
v
f) g

r
=

hv
f g f g
h
(
v
f) g

r
=

hv
f f
h
(
v
f)

hv
f f
h
(
v
f)

p
kgk
q
which tends to zero as h 0. The second item is proved analogously, or just make
use of the fact that f g = g f and apply Item 1.
Using the fact that g(x) C

c
(R
d
) and the denition of the weak derivative,
f
v
g(x) =
Z
R
d
f(y) (
v
g) (x y)dy =
Z
R
d
f(y) (
v
g(x )) (y)dy
=
Z
R
d

v
f(y)g(x y)dy =
v
f g(x).
Item 3. is a consequence of this equality and items 1. and 2.
19.2. The connection of Weak and pointwise derivatives.
Proposition 19.25. Let = (, ) R be an open interval and f L
1
loc
() such
that
(w)
f = 0 in L
1
loc
(). Then there exists c C such that f = c a.e. More
generally, suppose F : C

c
() C is a linear functional such that F(
0
) = 0 for
all C

c
(), where
0
(x) =
d
dx
(x), then there exists c C such that
(19.16) F() = hc, i =
Z

c(x)dx for all C

c
().
Proof. Before giving a proof of the second assertion, let us show it includes the
rst. Indeed, if F() :=
R

fdm and
(w)
f = 0, then F(
0
) = 0 for all C

c
()
and therefore there exists c C such that
Z

fdm = F() = ch, 1i = c


Z

fdm.
But this implies f = c a.e. So it only remains to prove the second assertion.
ANALYSIS TOOLS WITH APPLICATIONS 383
Let C

c
() such that
R

dm = 1. Given C

c
() C

c
(R) , let
(x) =
R
x

((y) (y)h, 1i) dy. Then


0
(x) = (x) (x)h, 1i and C

c
()
as the reader should check. Therefore,
0 = F() = F( h, i) = F() h, 1iF()
which shows Eq. (19.16) holds with c = F(). This concludes the proof, however
it will be instructive to give another proof of the rst assertion.
Alternative proof of rst assertion. Suppose f L
1
loc
() and
(w)
f = 0
and f
m
:= f
m
as is in the proof of Lemma 19.9. Then f
0
m
=
(w)
f
m
= 0,
so f
m
= c
m
for some constant c
m
C. By Theorem 11.21, f
m
f in L
1
loc
() and
therefore if J = [a, b] is a compact subinterval of ,
|c
m
c
k
| =
1
b a
Z
J
|f
m
f
k
| dm 0 as m, k .
So {c
m
}

m=1
is a Cauchy sequence and therefore c := lim
m
c
m
exists and f =
lim
m
f
m
= c a.e.
Theorem 19.26. Suppose f L
1
loc
(). Then there exists a complex measure on
B

such that
(19.17) hf,
0
i = () :=
Z

d for all C

c
()
i there exists a right continuous function F of bounded variation such that F = f
a.e. In this case =
F
, i.e. ((a, b]) = F(b) F(a) for all < a < b < .
Proof. Suppose f = F a.e. where F is as above and let =
F
be the
associated measure on B

. Let G(t) = F(t) F() = ((, t]), then using


Fubinis theorem and the fundamental theorem of calculus,
hf,
0
i = hF,
0
i = hG,
0
i =
Z

0
(t)
Z

1
(,t]
(s)d(s)

dt
=
Z

0
(t)1
(,t]
(s)dtd(s) =
Z

(s)d(s) = ().
Conversely if Eq. (19.17) holds for some measure , let F(t) := ((, t]) then
working backwards from above,
hf,
0
i = () =
Z

(s)d(s) =
Z

0
(t)1
(,t]
(s)dtd(s) =
Z

0
(t)F(t)dt.
This shows
(w)
(f F) = 0 and therefore by Proposition 19.25, f = F +c a.e. for
some constant c C. Since F + c is right continuous with bounded variation, the
proof is complete.
Proposition 19.27. Let R be an open interval and f L
1
loc
(). Then
w
f
exists in L
1
loc
() i f has a continuous version

f which is absolutely continuous on
all compact subintervals of . Moreover,
w
f =

f
0
a.e., where

f
0
(x) is the usual
pointwise derivative.
Proof. If f is locally absolutely continuous and C

c
() with supp()
[a, b] , then by integration by parts, Corollary 16.32,
Z

f
0
dm =
Z
b
a
f
0
dm =
Z
b
a
f
0
dm+f|
b
a
=
Z

f
0
dm.
384 BRUCE K. DRIVER

This shows
w
f exists and
w
f = f
0
L
1
loc
().
Now suppose that
w
f exists in L
1
loc
() and a . Dene F C () by
F(x) :=
R
x
a

w
f(y)dy. Then F is absolutely continuous on compacts and therefore
by fundamental theorem of calculus for absolutely continuous functions (Theorem
16.31), F
0
(x) exists and is equal to
w
f(x) for a.e. x . Moreover, by the rst
part of the argument,
w
F exists and
w
F =
w
f, and so by Proposition 19.25
there is a constant c such that

f(x) := F(x) +c = f(x) for a.e. x .


Denition 19.28. Let X and Y be metric spaces. A function u : X Y is said
to be Lipschitz if there exists C < such that
d
Y
(u(x), u(x
0
)) Cd
X
(x, x
0
) for all x, x
0
X
and said to be locally Lipschitz if for all compact subsets K X there exists
C
K
< such that
d
Y
(u(x), u(x
0
)) C
K
d
X
(x, x
0
) for all x, x
0
K.
Proposition 19.29. Let u L
1
loc
(). Then there exists a locally Lipschitz function
u : C such that u = u a.e. i
i
u L
1
loc
() exists and is locally (essentially)
bounded for i = 1, 2, . . . , d.
Proof. Suppose u = u a.e. and u is Lipschitz and let p (1, ) and V be
a precompact open set such that

V W and let V

:=

x : dist(x,

V )

.
Then for < dist(

V ,
c
), V

and therefore there is constant C(V, ) < such


that | u(y) u(x)| C(V, ) |y x| for all x, y V

. So for 0 < |h| 1 and v R


d
with |v| = 1,
Z
V

u(x +hv) u(x)


h

p
dx =
Z
V

u(x +hv) u(x)


h

p
dx C(V, ) |v|
p
.
Therefore Theorem 19.18 may be applied to conclude
v
u exists in L
p
and moreover,
lim
h0
u(x +hv) u(x)
h
=
v
u(x) for m a.e. x V.
Since there exists {h
n
}

n=1
R\ {0} such that lim
n
h
n
= 0 and
|
v
u(x)| = lim
n

u(x +h
n
v) u(x)
h
n

C(V ) for a.e. x V,


it follows that k
v
uk

C(V ) where C(V ) := lim


0
C(V, ).
Conversely, let

:= {x : dist(x,
c
) > } and C

c
(B(0, 1), [0, )) such
that
R
R
n
(x)dx = 1,
m
(x) = m
n
(mx) and u
m
:= u
m
as in the proof of
Theorem 19.18. Suppose V
o
with

V and is suciently small. Then
u
m
C

),
v
u
m
=
v
u
m
, |
v
u
m
(x)| k
v
uk
L

(V
m
1
)
=: C(V, m) < and
therefore for x, y

V with |y x| ,
|u
m
(y) u
m
(x)| =

Z
1
0
d
dt
u
m
(x +t(y x))dt

Z
1
0
(y x) u
m
(x +t(y x))dt

Z
1
0
|y x| |u
m
(x +t(y x))| dt C(V, m) |y x| (19.18)
ANALYSIS TOOLS WITH APPLICATIONS 385
By passing to a subsequence if necessary, we may assume that lim
m
u
m
(x) =
u(x) for m a.e. x

V and then letting m in Eq. (19.18) implies
(19.19) |u(y) u(x)| C(V ) |y x| for all x, y V \ E and |y x|
where E

V is a m null set. Dene u
V
:

V C by u
V
= u on

V \ E
c
and
u
V
(x) = lim
yx
y/ E
u(y) if x E. Then clearly u
V
= u a.e. on

V and it is easy to
show u
V
is well dened and u
V
:

V C is continuous and still satises
| u
V
(y) u
V
(x)| C
V
|y x| for x, y

V with |y x| .
Since u
V
is continuous on

V there exists M
V
< such that | u
V
| M
V
on

V .
Hence if x, y

V with |x y| , we nd
| u
V
(y) u
V
(x)|
|y x|

2M

and hence
| u
V
(y) u
V
(x)| max

C
V
,
2M
V

|y x| for x, y

V
showing u
V
is Lipschitz on

V . To complete the proof, choose precompact open sets
V
n
such that V
n


V
n
V
n+1
for all n and for x V
n
let u(x) := u
V
n
(x).
Here is an alternative way to construct the function u
V
above. For x V \ E,
|u
m
(x) u(x)| =

Z
V
u(x y)(my)m
n
dy u(x)

Z
V
[u(x y/m) u(x)] (y)dy

Z
V
|u(x y/m) u(x)| (y)dy
C
m
Z
V
|y| (y)dy
wherein the last equality we have used Eq. (19.19) with V replaced by V

for some
small > 0. Letting K := C
R
V
|y| (y)dy < we have shown
ku
m
uk

K/m 0 as m
and consequently
ku
m
u
n
k
u
= ku
m
u
n
k

2K/m 0 as m .
Therefore, u
n
converges uniformly to a continuous function u
V
.
The next theorem is from Chapter 1. of Mazja [2].
Theorem 19.30. Let p 1 and be an open subset of R
d
, x R
d
be written as
x = (y, t) R
d1
R,
Y :=

y R
d1
: ({y} R) 6=

and u L
p
(). Then
t
u exists weakly in L
p
() i there is a version u of u such that
for a.e. y Y the function t u(y, t) is absolutely continuous,
t
u(y, t) =
u(y,t)
t
a.e., and

u
t

L
p
()
< .
Proof. For the proof of Theorem 19.30, it suces to consider the case where
= (0, 1)
d
. Write x as x = (y, t) Y (0, 1) = (0, 1)
d1
(0, 1) and
t
u for
the weak derivative
e
d
u. By assumption
Z

|
t
u(y, t)| dydt = k
t
uk
1
k
t
uk
p
<
386 BRUCE K. DRIVER

and so by Fubinis theorem there exists a set of full measure, Y


0
Y, such that
Z
1
0
|
t
u(y, t)| dt < for y Y
0
.
So for y Y
0
, the function v(y, t) :=
R
t
0

t
u(y, )d is well dened and absolutely
continuous in t with

t
v(y, t) =
t
u(y, t) for a.e. t (0, 1). Let C

c
(Y ) and
C

c
((0, 1)) , then integration by parts for absolutely functions implies
Z
1
0
v(y, t) (t)dt =
Z
1
0

t
v(y, t)(t)dt for all y Y
0
.
Multiplying both sides of this equation by (y) and integrating in y shows
Z

v(x) (t)(y)dydt =
Z

t
v(y, t)(t)(y)dydt =
Z

t
u(y, t)(t)(y)dydt.
Using the denition of the weak derivative, this equation may be written as
Z

u(x) (t)(y)dydt =
Z

t
u(x)(t)(y)dydt
and comparing the last two equations shows
Z

[v(x) u(x)] (t)(y)dydt = 0.


Since C

c
(Y ) is arbitrary, this implies there exists a set Y
1
Y
0
of full measure
such that
Z

[v(y, t) u(y, t)] (t)dt = 0 for all y Y


1
from which we conclude, using Proposition 19.25, that u(y, t) = v(y, t) + C(y) for
t J
y
where m
d1
(J
y
) = 1, here m
k
denotes k dimensional Lebesgue measure.
In conclusion we have shown that
(19.20) u(y, t) = u(y, t) :=
Z
t
0

t
u(y, )d +C(y) for all y Y
1
and t J
y
.
We can be more precise about the formula for u(y, t) by integrating both sides
of Eq. (19.20) on t we learn
C(y) =
Z
1
0
dt
Z
t
0

u(y, )d
Z
1
0
u(y, t)dt =
Z
1
0
(1 )

u(y, )d
Z
1
0
u(y, t)dt
=
Z
1
0
[(1 t)
t
u(y, t) u(y, t)] dt
and hence
u(y, t) :=
Z
t
0

u(y, )d +
Z
1
0
[(1 )

u(y, ) u(y, )] d
which is well dened for y Y
0
.
For the converse suppose that such a u exists, then for C

c
() ,
Z

u(y, t)
t
(y, t)dydt =
Z

u(y, t)
t
(y, t)dtdy =
Z

u(y, t)
t
(y, t)dtdy
wherein we have used integration by parts for absolutely continuous functions. From
this equation we learn the weak derivative
t
u(y, t) exists and is given by
u(y,t)
t
a.e.
ANALYSIS TOOLS WITH APPLICATIONS 387
19.3. Exercises.
Exercise 19.1. Give another proof of Lemma 19.10 base on Proposition 19.12.
Exercise 19.2. Prove Proposition 19.13. Hints: 1. Use u

as dened in the proof


of Proposition 19.12 to show it suces to consider the case where u C

R
d

L
p

R
d

with

u L
p

R
d

for all N
d
0
. 2. Then let C

c
(B(0, 1), [0, 1])
such that = 1 on a neighborhood of 0 and let u
n
(x) := u(x)(x/n).
Exercise 19.3. Suppose p() is a polynomial in R
d
, p (1, ), q :=
p
p1
,
u L
p
such that p()u L
p
and v L
q
such that p () v L
q
. Show hp () u, vi =
hu, p () vi.
Exercise 19.4. Let p [1, ), be a multi index (if = 0 let
0
be the identity
operator on L
p
),
D(

) := {f L
p
(R
n
) :

f exists weakly in L
p
(R
n
)}
and for f D(

) (the domain of

) let

f denote the weak derivative of f.


(See Denition 19.3.)
(1) Show

is a densely dened operator on L


p
, i.e. D(

) is a dense linear
subspace of L
p
and

: D(

) L
p
is a linear transformation.
(2) Show

: D(

) L
p
is a closed operator, i.e. the graph,
(

) := {(f,

f) L
p
L
p
: f D(

)} ,
is a closed subspace of L
p
L
p
.
(3) Show

: D(

) L
p
L
p
is not bounded unless = 0. (The norm on
D(

) is taken to be the L
p
norm.)
Exercise 19.5. Let p [1, ), f L
p
and be a multi index. Show

f exists
weakly (see Denition 19.3) in L
p
i there exists f
n
C

c
(R
n
) and g L
p
such
that f
n
f and

f
n
g in L
p
as n . Hints: See exercises 19.2 and 19.4.
Exercise 19.6. Folland 8.8 on p. 246.
Exercise 19.7. Assume n = 1 and let =
e
1
where e
1
= (1) R
1
= R.
(1) Let f(x) = |x| , show f exists weakly in L
1
loc
(R) and f(x) = sgn(x) for
m a.e. x.
(2) Show (f) does not exists weakly in L
1
loc
(R).
(3) Generalize item 1. as follows. Suppose f C(R, R) and there exists a nite
set := {t
1
< t
2
< < t
N
} R such that f C
1
(R \ , R). Assuming
f L
1
loc
(R) , show f exists weakly and
(w)
f(x) = f(x) for m a.e. x.
Exercise 19.8. Suppose that f L
1
loc
() and v R
d
and {e
j
}
n
j=1
is the standard
basis for R
d
. If
j
f :=
e
j
f exists weakly in L
1
loc
() for all j = 1, 2, . . . , n then
v
f
exists weakly in L
1
loc
() and
v
f =
P
n
j=1
v
j

j
f.
Exercise 19.9. Suppose, f L
1
loc
(R
d
) and
v
f exists weakly and
v
f = 0 in
L
1
loc
(R
d
) for all v R
d
. Then there exists C such that f(x) = for m a.e.
x R
d
. Hint: See steps 1. and 2. in the outline given in Exercise 19.10 below.
Exercise 19.10 (A generalization of Exercise 19.9). Suppose is a connected
open subset of R
d
and f L
1
loc
(). If

f = 0 weakly for Z
n
+
with || = N+1,
then f(x) = p(x) for m a.e. x where p(x) is a polynomial of degree at most N.
Here is an outline.
388 BRUCE K. DRIVER

(1) Suppose x
0
and > 0 such that C := C
x
0
() and let
n
be a
sequence of approximate functions such supp(
n
) B
0
(1/n) for all n.
Then for n large enough,

(f
n
) = (

f)
n
on C for || = N+1. Now
use Taylors theorem to conclude there exists a polynomial p
n
of degree at
most N such that f
n
= p
n
on C.
(2) Show p := lim
n
p
n
exists on C and then let n in step 1. to show
there exists a polynomial p of degree at most N such that f = p a.e. on C.
(3) Use Taylors theorem to show if p and q are two polynomials on R
d
which
agree on an open set then p = q.
(4) Finish the proof with a connectedness argument using the results of steps
2. and 3. above.
Exercise 19.11. Suppose
o
R
d
and v, w R
d
. Assume f L
1
loc
() and that

w
f exists weakly in L
1
loc
(), show
w

v
f also exists weakly and
w

v
f =
v

w
f.
Exercise 19.12. Let d = 2 and f(x, y) = 1
x0
. Show
(1,1)
f = 0 weakly in L
1
loc
despite the fact that
1
f does not exist weakly in L
1
loc
!
ANALYSIS TOOLS WITH APPLICATIONS 389
20. Fourier Transform
The underlying space in this section is R
n
with Lebesgue measure. The Fourier
inversion formula is going to state that
(20.1) f(x) =

1
2

n
Z
R
n
de
ix
Z
R
n
dyf(y)e
iy
.
If we let = 2, this may be written as
f(x) =
Z
R
n
de
i2x
Z
R
n
dyf(y)e
iy2
and we have removed the multiplicative factor of

1
2

n
in Eq. (20.1) at the expense
of placing factors of 2 in the arguments of the exponential. Another way to avoid
writing the 2s altogether is to redene dx and d and this is what we will do here.
Notation 20.1. Let m be Lebesgue measure on R
n
and dene:
dx =

1

n
dm(x) and d

n
dm().
To be consistent with this new normalization of Lebesgue measure we will redene
kfk
p
and hf, gi as
kfk
p
=
Z
R
n
|f(x)|
p
dx

1/p
=

1
2

n/2
Z
R
n
|f(x)|
p
dm(x)
!
1/p
and
hf, gi :=
Z
R
n
f(x)g(x)dx when fg L
1
.
Similarly we will dene the convolution relative to these normalizations by fFg :=

1
2

n/2
f g, i.e.
fFg(x) =
Z
R
n
f(x y)g(y)dy =
Z
R
n
f(x y)g(y)

1
2

n/2
dm(y).
The following notation will also be convenient; given a multi-index Z
n
+
, let
|| =
1
+ +
n
,
x

:=
n
Y
j=1
x

j
j
,

x
=


x

:=
n
Y
j=1


x
j

j
and
D

x
=

1
i

||

=

1
i

x

.
Also let
hxi := (1 +|x|
2
)
1/2
and for s R let

s
(x) = (1 +|x|)
s
.
390 BRUCE K. DRIVER

20.1. Fourier Transform.


Denition 20.2 (Fourier Transform). For f L
1
, let

f() = Ff() :=
Z
R
n
e
ix
f(x)dx (20.2)
g

(x) = F
1
g(x) =
Z
R
n
e
ix
g()d = Fg(x) (20.3)
The next theorem summarizes some more basic properties of the Fourier trans-
form.
Theorem 20.3. Suppose that f, g L
1
. Then
(1)

f C
0
(R
n
) and

u
kfk
1
.
(2) For y R
n
, (
y
f) () = e
iy

f() where, as usual,
y
f(x) := f(x y).
(3) The Fourier transform takes convolution to products, i.e. (fFg)

=

f g.
(4) For f, g L
1
, h

f, gi = hf, gi.
(5) If T : R
n
R
n
is an invertible linear transformation, then
(f T)

() = |det T|
1

f(

T
1

) and
(f T)

() = |det T|
1
f

T
1

)
(6) If (1+|x|)
k
f(x) L
1
, then

f C
k
and


f C
0
for all || k. Moreover,
(20.4)


f() = F [(ix)

f(x)] ()
for all || k.
(7) If f C
k
and

f L
1
for all || k, then (1 +||)
k

f() C
0
and
(20.5) (

f)

() = (i)


f()
for all || k.
(8) Suppose g L
1
(R
k
) and h L
1
(R
nk
) and f = g h, i.e.
f(x) = g(x
1
, . . . , x
k
)h(x
k+1
, . . . , x
n
),
then

f = g

h.
Proof. Item 1. is the Riemann Lebesgue Lemma 11.27. Items 2. 5. are
proved by the following straight forward computations:
(
y
f) () =
Z
R
n
e
ix
f(x y)dx =
Z
R
n
e
i(x+y)
f(x)dx = e
iy

f(),
h

f, gi =
Z
R
n

f()g()d =
Z
R
n
dg()
Z
R
n
dxe
ix
f(x)
=
Z
R
n
R
n
dxde
ix
g()f(x) =
Z
R
n
R
n
dx g(x)f(x) = hf, gi,
(fFg)

() =
Z
R
n
e
ix
fFg(x)dx =
Z
R
n
e
ix
Z
R
n
f(x y)g(y)dy

dx
=
Z
R
n
dy
Z
R
n
dxe
ix
f(x y)g(y) =
Z
R
n
dy
Z
R
n
dxe
i(x+y)
f(x)g(y)
=
Z
R
n
dye
iy
g(y)
Z
R
n
dxe
ix
f(x) =

f() g()
ANALYSIS TOOLS WITH APPLICATIONS 391
and letting y = Tx so that dx = |det T|
1
dy
(f T)

() =
Z
R
n
e
ix
f(Tx)dx =
Z
R
n
e
iT
1
y
f(y) |det T|
1
dy
= |det T|
1

f(

T
1

).
Item 6. is simply a matter of dierentiating under the integral sign which is easily
justied because (1 +|x|)
k
f(x) L
1
.
Item 7. follows by using Lemma 11.26 repeatedly (i.e. integration by parts) to
nd
(

f)

() =
Z
R
n

x
f(x)e
ix
dx = (1)
||
Z
R
n
f(x)

x
e
ix
dx
= (1)
||
Z
R
n
f(x)(i)

e
ix
dx = (i)


f().
Since

f L
1
for all || k, it follows that (i)


f() = (

f)

() C
0
for all
|| k. Since
(1 +||)
k

1 +
n
X
i=1
|
i
|
!
k
=
X
||k
c

|
where 0 < c

< ,

(1 +||)
k

f()


X
||k
c


f()

0 as .
Item 8. is a simple application of Fubinis theorem.
Example 20.4. If f(x) = e
|x|
2
/2
then

f() = e
||
2
/2
, in short
(20.6) Fe
|x|
2
/2
= e
||
2
/2
and F
1
e
||
2
/2
= e
|x|
2
/2
.
More generally, for t > 0 let
(20.7) p
t
(x) := t
n/2
e

1
2t
|x|
2
then
(20.8) b p
t
() = e

t
2
||
2
and (b p
t
)

(x) = p
t
(x).
By Item 8. of Theorem 20.3, to prove Eq. (20.6) it suces to consider the 1
dimensional case because e
|x|
2
/2
=
Q
n
i=1
e
x
2
i
/2
. Let g() :=

Fe
x
2
/2

() , then
by Eq. (20.4) and Eq. (20.5),
(20.9)
g
0
() = F
h
(ix) e
x
2
/2
i
() = iF

d
dx
e
x
2
/2

() = i(i)F
h
e
x
2
/2
i
() = g().
Lemma 8.36 implies
g(0) =
Z
R
e
x
2
/2
dx =
1

2
Z
R
e
x
2
/2
dm(x) = 1,
and so solving Eq. (20.9) with g(0) = 1 gives F
h
e
x
2
/2
i
() = g() = e

2
/2
as
desired. The assertion that F
1
e
||
2
/2
= e
|x|
2
/2
follows similarly or by using Eq.
(20.3) to conclude,
F
1
h
e
||
2
/2
i
(x) = F
h
e
||
2
/2
i
(x) = F
h
e
||
2
/2
i
(x) = e
|x|
2
/2
.
392 BRUCE K. DRIVER

The results in Eq. (20.8) now follow from Eq. (20.6) and item 5 of Theorem 20.3.
For example, since p
t
(x) = t
n/2
p
1
(x/

t),
(b p
t
)() = t
n/2

n
p
1
(

t) = e

t
2
||
2
.
This may also be written as (b p
t
)() = t
n/2
p1
t
(). Using this and the fact that p
t
is an even function,
(b p
t
)

(x) = Fb p
t
(x) = t
n/2
Fp1
t
(x) = t
n/2
t
n/2
p
t
(x) = p
t
(x).
20.2. Schwartz Test Functions.
Denition 20.5. A function f C(R
n
, C) is said to have rapid decay or rapid
decrease if
sup
xR
n
(1 +|x|)
N
|f(x)| < for N = 1, 2, . . . .
Equivalently, for each N N there exists constants C
N
< such that |f(x)|
C
N
(1 + |x|)
N
for all x R
n
. A function f C(R
n
, C) is said to have (at most)
polynomial growth if there exists N < such
sup(1 +|x|)
N
|f(x)| < ,
i.e. there exists N N and C < such that |f(x)| C(1 +|x|)
N
for all x R
n
.
Denition 20.6 (Schwartz Test Functions). Let S denote the space of functions
f C

(R
n
) such that f and all of its partial derivatives have rapid decay and let
kfk
N,
= sup
xR
n

(1 +|x|)
N

f(x)

so that
S =
n
f C

(R
n
) : kfk
N,
< for all N and
o
.
Also let P denote those functions g C

(R
n
) such that g and all of its derivatives
have at most polynomial growth, i.e. g C

(R
n
) is in P i for all multi-indices
, there exists N

< such
sup(1 +|x|)
N

g(x)| < .
(Notice that any polynomial function on R
n
is in P.)
Remark 20.7. Since C

c
(R
n
) S L
2
(R
n
) , it follows that S is dense in L
2
(R
n
).
Exercise 20.1. Let
(20.10) L =
X
||k
a

(x)

with a

P. Show L(S) S and in particular

f and x

f are back in S for all


multi-indices .
Notation 20.8. Suppose that p(x, ) =
||N
a

(x)

where each function a

(x)
is a smooth function. We then set
p(x, D
x
) :=
||N
a

(x)D

x
and if each a

(x) is also a polynomial in x we will let


p(D

, ) :=
||N
a

(D

)M

where M

is the operation of multiplication by

.
ANALYSIS TOOLS WITH APPLICATIONS 393
Proposition 20.9. Let p(x, ) be as above and assume each a

(x) is a polynomial
in x. Then for f S,
(20.11) (p(x, D
x
)f)

() = p(D

, )

f ()
and
(20.12) p(, D

)

f() = [p(D
x
, x)f(x)]

().
Proof. The identities (D

e
ix
= x

e
ix
and D

x
e
ix
=

e
ix
imply,
for any polynomial function q on R
n
,
(20.13) q(D

)e
ix
= q(x)e
ix
and q(D
x
)e
ix
= q()e
ix
.
Therefore using Eq. (20.13) repeatedly,
(p(x, D
x
)f)

() =
Z
R
n
X
||N
a

(x)D

x
f(x) e
ix
d
=
Z
R
n
X
||N
D

x
f(x) a

(D

)e
ix
d
=
Z
R
n
f(x)
X
||N
(D
x
)

(D

)e
ix

d
=
Z
R
n
f(x)
X
||N
a

(D

e
ix

d = p(D

, )

f ()
wherein the third inequality we have used Lemma 11.26 to do repeated integration
by parts, the fact that mixed partial derivatives commute in the fourth, and in the
last we have repeatedly used Corollary 7.43 to dierentiate under the integral. The
proof of Eq. (20.12) is similar:
p(, D

)

f() = p(, D

)
Z
R
n
f(x)e
ix
dx =
Z
R
n
f(x)p(, x)e
ix
dx
=
X
||N
Z
R
n
f(x)(x)

()e
ix
dx =
X
||N
Z
R
n
f(x)(x)

(D
x
)e
ix
dx
=
X
||N
Z
R
n
e
ix
a

(D
x
) [(x)

f(x)] dx = [p(D
x
, x)f(x)]

().
Corollary 20.10. The Fourier transform preserves the space S, i.e. F(S) S.
Proof. Let p(x, ) =
||N
a

(x)

with each a

(x) being a polynomial func-


tion in x. If f S then p(D
x
, x)f S L
1
and so by Eq. (20.12), p(, D

)

f()
is bounded in , i.e.
sup
R
n
|p(, D

)

f()| C(p, f) < .
Taking p(x, ) = (1 +||
2
)
N

with N Z
+
in this estimate shows

f() and all of
its derivatives have rapid decay, i.e.

f is in S.
394 BRUCE K. DRIVER

20.3. Fourier Inversion Formula .


Theorem 20.11 (Fourier Inversion Theorem). Suppose that f L
1
and

f L
1
,
then
(1) there exists f
0
C
0
(R
n
) such that f = f
0
a.e.
(2) f
0
= F
1
F f and f
0
= FF
1
f,
(3) f and

f are in L
1
L

and
(4) kfk
2
=

2
.
In particular, F : S S is a linear isomorphism of vector spaces.
Proof. First notice that

f C
0
(R
n
) L

and

f L
1
by assumption, so that

f L
1
L

. Let p
t
(x) t
n/2
e

1
2t
|x|
2
be as in Example 20.4 so that b p
t
() = e

t
2
||
2
and b p

t
= p
t
. Dene f
0
:=

f

C
0
then
f
0
(x) = (

f)

(x) =
Z
R
n

f()e
ix
d = lim
t0
Z
R
n

f()e
ix
b p
t
()d
= lim
t0
Z
R
n
Z
R
n
f(y)e
i(xy)
b p
t
()d dy
= lim
t0
Z
R
n
f(y)p
t
(y)dy = f(x) a.e.
wherein we have used Theorem 11.21 in the last equality along with the observations
that p
t
(y) = p
1
(y/

t) and
R
R
n
p
1
(y)dy = 1. In particular this shows that f
L
1
L

. A similar argument shows that F


1
F f = f
0
as well.
Let us now compute the L
2
norm of

f,
k

fk
2
2
=
Z
R
n

f()

f()d =
Z
R
n
d

f()
Z
R
n
dxf(x)e
ix
=
Z
R
n
dxf(x)
Z
R
n
d

f()e
ix
=
Z
R
n
dx f(x)f(x) = kfk
2
2
because
R
R
n
d

f()e
ix
= F
1

f(x) = f(x) a.e.
Corollary 20.12. By the B.L.T. Theorem 4.1, the maps F|
S
and F
1
|
S
extend to
bounded linear maps

F and

F
1
from L
2
L
2
. These maps satisfy the following
properties:
(1)

F and

F
1
are unitary and are inverses to one another as the notation
suggests.
(2) For f L
2
we may compute

F and

F
1
by

Ff() = L
2
lim
R
Z
|x|R
f(x)e
ix
dx and (20.14)

F
1
f() = L
2
lim
R
Z
|x|R
f(x)e
ix
dx. (20.15)
(3) We may further extend

F to a map from L
1
+L
2
C
0
+L
2
(still denote
by

F) dened by

Ff =

h+

Fg where f = h+g L
1
+L
2
. For f L
1
+L
2
,

Ff may be characterized as the unique function F L


1
loc
(R
n
) such that
(20.16) hF, i = hf,

i for all C

c
(R
n
).
ANALYSIS TOOLS WITH APPLICATIONS 395
Moreover if Eq. (20.16) holds then F C
0
+L
2
L
1
loc
(R
n
) and Eq.(20.16)
is valid for all S.
Proof. Item 1., If f L
2
and
n
S such that
n
f in L
2
, then

Ff :=
lim
n

n
. Since

n
S L
1
, we may concluded that

2
= k
n
k
2
for all n.
Thus


Ff

2
= lim
n

2
= lim
n
k
n
k
2
= kfk
2
which shows that

F is an isometry from L
2
to L
2
and similarly

F
1
is an isometry.
Since

F
1

F = F
1
F = id on the dense set S, it follows by continuity that

F
1

F =
id on all of L
2
. Hence

F

F
1
= id, and thus

F
1
is the inverse of

F. This proves
item 1.
Item 2. Let f L
2
and R < and set f
R
(x) := f(x)1
|x|R
. Then f
R
L
1
L
2
.
Let C

c
(R
n
) be a function such that
R
R
n
(x)dx = 1 and set
k
(x) = k
n
(kx).
Then f
R
F
k
f
R
L
1
L
2
with f
R
F
k
C

c
(R
n
) S. Hence

Ff
R
= L
2
lim
k
F (f
R
F
k
) = Ff
R
a.e.
where in the second equality we used the fact that F is continuous on L
1
. Hence
R
|x|R
f(x)e
ix
dx represents

Ff
R
() in L
2
. Since f
R
f in L
2
, Eq. (20.14)
follows by the continuity of

F on L
2
.
Item 3. If f = h +g L
1
+L
2
and S, then
h

h +

Fg, i = hh, i +h

Fg, i = hh,

i + lim
R
hF

g1
||R

, i
= hh,

i + lim
R
hg1
||R
,

i = hh +g,

i. (20.17)
In particular if h + g = 0 a.e., then h

h +

Fg, i = 0 for all S and since

h+

Fg L
1
loc
it follows from Corollary 11.28 that

h+

Fg = 0 a.e. This shows that

Ff is well dened independent of how f L


1
+ L
2
is decomposed into the sum
of an L
1
and an L
2
function. Moreover Eq. (20.17) shows Eq. (20.16) holds with
F =

h +

Fg C
0
+ L
2
and S. Now suppose G L
1
loc
and hG, i = hf,

i for
all C

c
(R
n
). Then by what we just proved, hG, i = hF, i for all C

c
(R
n
)
and so an application of Corollary 11.28 shows G = F C
0
+L
2
.
Notation 20.13. Given the results of Corollary 20.12, there is little danger in
writing

f or Ff for

Ff when f L
1
+L
2
.
Corollary 20.14. If f and g are L
1
functions such that

f, g L
1
, then
F(fg) =

fF g and F
1
(fg) = f

Fg

.
Since S is closed under pointwise products and F : S S is an isomorphism it
follows that S is closed under convolution as well.
Proof. By Theorem 20.11, f, g,

f, g L
1
L

and hence f g L
1
L

and

fF g L
1
L

. Since
F
1

fF g

= F
1

F
1
( g) = f g L
1
we may conclude from Theorem 20.11 that

fF g = FF
1

fF g

= F(f g).
Similarly one shows F
1
(fg) = f

Fg

.
396 BRUCE K. DRIVER

Corollary 20.15. Let p(x, ) and p(x, D


x
) be as in Notation 20.8 with each func-
tion a

(x) being a smooth function of x R


n
. Then for f S,
(20.18) p(x, D
x
)f(x) =
Z
R
n
p(x, )

f () e
ix
d.
Proof. For f S, we have
p(x, D
x
)f(x) = p(x, D
x
)

F
1

f

(x) = p(x, D
x
)
Z
R
n

f () e
ix
d
=
Z
R
n

f () p(x, D
x
)e
ix
d =
Z
R
n

f () p(x, )e
ix
d.
If p(x, ) is a more general function of (x, ) then that given in Notation 20.8,
the right member of Eq. (20.18) may still make sense, in which case we may use it
as a denition of p(x, D
x
). A linear operator dened this way is called a pseudo
dierential operator and they turn out to be a useful class of operators to study
when working with partial dierential equations.
Corollary 20.16. Suppose p() =
P
||N
a

is a polynomial in R
n
and
f L
2
. Then p()f exists in L
2
(see Denition 19.3) i p(i)

f() L
2
in
which case
(p()f)

() = p(i)

f() for a.e. .
In particular, if g L
2
then f L
2
solves the equation, p()f = g i p(i)

f() =
g() for a.e. .
Proof. By denition p()f = g in L
2
i
(20.19) hg, i = hf, p()i for all C

c
(R
n
).
If follows from repeated use of Lemma 19.23 that the previous equation is equivalent
to
(20.20) hg, i = hf, p()i for all S(R
n
).
This may also be easily proved directly as well as follows. Choose C

c
(R
n
)
such that (x) = 1 for x B
0
(1) and for S(R
n
) let
n
(x) := (x/n)(x). By
the chain rule and the product rule (Eq. A.5 of Appendix A),

n
(x) =
X

n
||

(x/n)

(x)
along with the dominated convergence theorem shows
n
and

in
L
2
as n . Therefore if Eq. (20.19) holds, we nd Eq. (20.20) holds because
hg, i = lim
n
hg,
n
i = lim
n
hf, p()
n
i = hf, p()i.
To complete the proof simply observe that hg, i = h g,

i and
hf, p()i = h

f, [p()]

i = h

f(), p(i)

()i
= hp(i)

f(),

()i
for all S(R
n
). From these two observations and the fact that F is bijective on
S, one sees that Eq. (20.20) holds i p(i)

f() L
2
and g() = p(i)

f() for
a.e. .
ANALYSIS TOOLS WITH APPLICATIONS 397
20.4. Summary of Basic Properties of F and F
1
. The following table sum-
marizes some of the basic properties of the Fourier transform and its inverse.
f

f or f

Smoothness Decay at innity

Multiplication by (i)

S S
L
2
(R
n
) L
2
(R
n
)
Convolution Products.
20.5. Fourier Transforms of Measures and Bochners Theorem. To moti-
vate the next denition suppose that is a nite measure on R
n
which is absolutely
continuous relative to Lebesgue measure, d(x) = (x)dx. Then it is reasonable to
require
() := () =
Z
R
n
e
ix
(x)dx =
Z
R
n
e
ix
d(x)
and
(Fg) (x) := Fg(x) =
Z
R
n
g(x y)(x)dx =
Z
R
n
g(x y)d(y)
when g : R
n
C is a function such that the latter integral is dened, for example
assume g is bounded. These considerations lead to the following denitions.
Denition 20.17. The Fourier transform, , of a complex measure on B
R
n is
dened by
(20.21) () =
Z
R
n
e
ix
d(x)
and the convolution with a function g is dened by
(Fg) (x) =
Z
R
n
g(x y)d(y)
when the integral is dened.
It follows from the dominated convergence theorem that is continuous. Also
by a variant of Exercise 11.11, if and are two complex measure on B
R
n such
that = , then = . The reader is asked to give another proof of this fact in
Exercise 20.4 below.
Example 20.18. Let
t
be the surface measure on the sphere S
t
of radius t centered
at zero in R
3
. Then

t
() = 4t
sint ||
||
.
Indeed,

t
() =
Z
tS
2
e
ix
d(x) = t
2
Z
S
2
e
itx
d(x)
= t
2
Z
S
2
e
itx
3
||
d(x) = t
2
Z
2
0
d
Z

0
dsine
it cos ||
= 2t
2
Z
1
1
e
itu||
du = 2t
2
1
it ||
e
itu||
|
u=1
u=1
= 4t
2
sint ||
t ||
.
398 BRUCE K. DRIVER

Denition 20.19. A function : R


n
C is said to be positive (semi) denite
i the matrices A := {(
k

j
)}
m
k,j=1
are positive denite for all m N and
{
j
}
m
j=1
R
n
.
Lemma 20.20. If C(R
n
, C) is a positive denite function, then
(1) (0) 0.
(2) () = () for all R
n
.
(3) |()| (0) for all R
n
.
(4) For all f S(R
d
),
(20.22)
Z
R
n
R
n
( )f()f()dd 0.
Proof. Taking m = 1 and
1
= 0 we learn (0) ||
2
0 for all C which
proves item 1. Taking m = 2,
1
= and
2
= , the matrix
A :=

(0) ( )
( ) (0)

is positive denite from which we conclude ( ) = ( ) (since A = A

by
denition) and
0 det

(0) ( )
( ) (0)

= |(0)|
2
|( )|
2
.
and hence |()| (0) for all . This proves items 2. and 3. Item 4. follows by
approximating the integral in Eq. (20.22) by Riemann sums,
Z
R
n
R
n
( )f()f()dd = lim
mesh0
X
(
k

j
)f(
j
)f(
k
) 0.
The details are left to the reader.
Lemma 20.21. If is a nite positive measure on B
R
n, then := C(R
n
, C)
is a positive denite function.
Proof. As has already been observed after Denition 20.17, the dominated
convergence theorem implies C(R
n
, C). Since is a positive measure (and
hence real),
() =
Z
R
n
e
ix
d(x) =
Z
R
n
e
ix
d(x) = ().
From this it follows that for any m N and {
j
}
m
j=1
R
n
, the matrix A :=
{ (
k

j
)}
m
k,j=1
is self-adjoint. Moreover if C
m
,
m
X
k,j=1
(
k

j
)
k

j
=
Z
R
n
m
X
k,j=1
e
i(
k

j
)x

j
d(x) =
Z
R
n
m
X
k,j=1
e
i
k
x

k
e
i
j
x

j
d(x)
=
Z
R
n

m
X
k=1
e
i
k
x

2
d(x) 0
showing A is positive denite.
Theorem 20.22 (Bochners Theorem). Suppose C(R
n
, C) is positive denite
function, then there exists a unique positive measure on B
R
n such that = .
ANALYSIS TOOLS WITH APPLICATIONS 399
Proof. If () = (), then for f S we would have
Z
R
n
fd =
Z
R
n
(f

d =
Z
R
n
f

() ()d.
This suggests that we dene
I(f) :=
Z
R
n
()f

()d for all f S.


We will now show I is positive in the sense if f S and f 0 then I(f) 0. For
general f S we have
I(|f|
2
) =
Z
R
n
()

|f|
2

()d =
Z
R
n
()

()d
=
Z
R
n
()f

( )

f

()dd =
Z
R
n
()f

( )f

()dd
=
Z
R
n
( )f

()f

()dd 0.
For t > 0 let p
t
(x) := t
n/2
e
|x|
2
/2t
S and dene
IFp
t
(x) := I(p
t
(x )) = I(

p
p
t
(x )

2
)
which is non-negative by above computation and because
p
p
t
(x ) S.
Using
[p
t
(x )]

() =
Z
R
n
p
t
(x y)e
iy
dy =
Z
R
n
p
t
(y)e
i(y+x)
dy
= e
ix
p

t
() = e
ix
e
t||
2
/2
,
hIFp
t
, i =
Z
R
n
I(p
t
(x ))(x)dx =
Z
R
n
Z
R
n
() [p
t
(x )]

()(x)ddx
=
Z
R
n
()

()e
t||
2
/2
d
which coupled with the dominated convergence theorem shows
hIFp
t
, i
Z
R
n
()

()d = I() as t 0.
Hence if 0, then I() = lim
t0
hIFp
t
, i 0.
Let K R be a compact set and C
c
(R, [0, )) be a function such that
= 1 on K. If f C

c
(R, R) is a smooth function with supp(f) K, then
0 kfk

f S and hence
0 hI, kfk

fi = kfk

hI, i hI, fi
and therefore hI, fi kfk

hI, i. Replacing f by f implies, hI, fi


kfk

hI, i and hence we have proved


(20.23) |hI, fi| C(supp(f)) kfk

for all f D
R
n := C

c
(R
n
, R) where C(K) is a nite constant for each compact
subset of R
n
. Because of the estimate in Eq. (20.23), it follows that I|
D
R
n
has a
unique extension I to C
c
(R
n
, R) still satisfying the estimates in Eq. (20.23) and
moreover this extension is still positive. So by the Riesz Markov theorem, there
400 BRUCE K. DRIVER

exists a unique Radon measure on R


n
such that such that hI, fi = (f) for all
f C
c
(R
n
, R).
To nish the proof we must show () = () for all R
n
given
(f) =
Z
R
n
()f

()d for all f C

c
(R
n
, R).
Let f C

c
(R
n
, R
+
) be a radial function such f(0) = 1 and f(x) is decreasing as
|x| increases. Let f

(x) := f(x), then by Theorem 20.3,


F
1

e
ix
f

(x)

() =
n
f

)
and therefore
(20.24)
Z
R
n
e
ix
f

(x)d(x) =
Z
R
n
()
n
f

)d.
Because
R
R
n
f

()d = Ff

(0) = f(0) = 1, we may apply the approximate


function Theorem 11.21 to Eq. (20.24) to nd
(20.25)
Z
R
n
e
ix
f

(x)d(x) () as 0.
On the the other hand, when = 0, the monotone convergence theorem implies
(f

) (1) = (R
n
) and therefore (R
n
) = (1) = (0) < . Now knowing the
is a nite measure we may use the dominated convergence theorem to concluded
(e
ix
f

(x)) (e
ix
) = () as 0
for all . Combining this equation with Eq. (20.25) shows () = () for all
R
n
.
20.6. Supplement: Heisenberg Uncertainty Principle. Suppose that H is a
Hilbert space and A, B are two densely dened symmetric operators on H. More
explicitly, A is a densely dened symmetric linear operator on H means there is
a dense subspace D
A
H and a linear map A : D
A
H such that (A, ) =
(, A) for all , D
A
. Let D
AB
:= { H : D
B
and B D
A
} and for
D
AB
let (AB) = A(B) with a similar denition of D
BA
and BA. Moreover,
let D
C
:= D
AB
D
BA
and for D
C
, let
C =
1
i
[A, B] =
1
i
(AB BA) .
Notice that for , D
C
we have
(C, ) =
1
i
{(AB, ) (BA, )} =
1
i
{(B, A) (A, B)}
=
1
i
{(, BA) (, AB)} = (, C),
so that C is symmetric as well.
Theorem 20.23 (Heisenberg Uncertainty Principle). Continue the above notation
and assumptions,
(20.26)
1
2
|(, C)|
q
kAk
2
(, A)
q
kBk
2
(, B)
ANALYSIS TOOLS WITH APPLICATIONS 401
for all D
C
. Moreover if kk = 1 and equality holds in Eq. (20.26), then
(A(, A)) = i(B (, B)) or
(B (, B)) = i(A(, A)) (20.27)
for some R.
Proof. By homogeneity (20.26) we may assume that kk = 1. Let a := (, A),
b = (, B),

A = AaI, and

B = B bI. Then we have still have
[

A,

B] = [AaI, B bI] = iC.
Now
i(, C) = (, iC) = (, [

A,

B]) = (,

A

B) (,

B

A)
= (

A,

B) (

B,

A) = 2i Im(

A,

B)
from which we learn
|(, C)| = 2

Im(

A,

B)

(

A,

B)

with equality i Re(



A,

B) = 0 and

A and

B are linearly dependent, i.e. i
Eq. (20.27) holds.
The result follows from this equality and the identities

2
= kA ak
2
= kAk
2
+a
2
kk
2
2a Re(A, )
= kAk
2
+a
2
2a
2
= kAk
2
(A, )
and

= kBk
2
(B, ).
Example 20.24. As an example, take H = L
2
(R), A =
1
i

x
and B =
M
x
with D
A
:= {f H : f
0
H} (f
0
is the weak derivative) and D
B
:=
n
f H :
R
R
|xf(x)|
2
dx <
o
. In this case,
D
C
= {f H : f
0
, xf and xf
0
are in H}
and C = I on D
C
. Therefore for a unit vector D
C
,
1
2

1
i

0
a

2
kx bk
2
where a = i
R
R

0
dm
39
and b =
R
R
x|(x)|
2
dm(x). Thus we have
(20.28)
1
4
=
1
4
Z
R
||
2
dm
Z
R
(k a)
2

(k)

2
dk
Z
R
(x b)
2
|(x)|
2
dx.
39
The constant a may also be described as
a = i
Z
R

0
dm =

2i
Z
R

()

()d
=
Z
R

()

2
dm().
402 BRUCE K. DRIVER

Equality occurs if there exists R such that


i(x b) (x) = (
1
i

x
a)(x) a.e.
Working formally, this gives rise to the ordinary dierential equation (in weak form),
(20.29)
x
= [(x b) +ia]
which has solutions (see Exercise 20.5 below)
(20.30) = C exp
Z
R
[(x b) +ia] dx

= C exp

2
(x b)
2
+iax

.
Let =
1
2t
and choose C so that kk
2
= 1 to nd

t,a,b
(x) =

1
2t

1/4
exp

1
4t
(x b)
2
+iax

are the functions which saturate the Heisenberg uncertainty principle in Eq. (20.28).
20.6.1. Exercises.
Exercise 20.2. Let f L
2
(R
n
) and be a multi-index. If

f exists in L
2
(R
n
)
then F(

f) = (i)


f() in L
2
(R
n
) and conversely if


f()

L
2
(R
n
) then

f exists.
Exercise 20.3. Suppose p() is a polynomial in R
d
and u L
2
such that
p () u L
2
. Show
F (p () u) () = p(i) u() L
2
.
Conversely if u L
2
such that p(i) u() L
2
, show p () u L
2
.
Exercise 20.4. Suppose is a complex measure on R
n
and () is its Fourier
transform as dened in Denition 20.17. Show satises,
h , i :=
Z
R
n
()()d = (

) :=
Z
R
n

d for all S
and use this to show if is a complex measure such that 0, then 0.
Exercise 20.5. Show that described in Eq. (20.30) is the general solution to
Eq. (20.29). Hint: Suppose that is any solution to Eq. (20.29) and is given
as in Eq. (20.30) with C = 1. Consider the weak dierential equation solved by
/.
20.6.2. More Proofs of the Fourier Inversion Theorem.
Exercise 20.6. Suppose that f L
1
(R) and assume that f continuously dieren-
tiable in a neighborhood of 0, show
(20.31) lim
M
Z

sinMx
x
f(x)dx = f(0)
using the following steps.
(1) Use Example 8.26 to deduce,
lim
M
Z
1
1
sinMx
x
dx = lim
M
Z
M
M
sinx
x
dx = .
ANALYSIS TOOLS WITH APPLICATIONS 403
(2) Explain why
0 = lim
M
Z
|x|1
sinMx
f(x)
x
dx and
0 = lim
M
Z
|x|1
sinMx
f(x) f(0)
x
dx.
(3) Add the previous two equations and use part (1) to prove Eq. (20.31).
Exercise 20.7 (Fourier Inversion Formula). Suppose that f L
1
(R) such that

f L
1
(R).
(1) Further assume that f is continuously dierentiable in a neighborhood of
0. Show that
:=
Z
R

f()d = f(0).
Hint: by the dominated convergence theorem, := lim
M
R
||M

f()d.
Now use the denition of

f(), Fubinis theorem and Exercise 20.6.
(2) Apply part 1. of this exercise with f replace by
y
f for some y R to
prove
(20.32) f(y) =
Z
R

f()e
iy
d
provided f is now continuously dierentiable near y.
The goal of the next exercises is to give yet another proof of the Fourier inversion
formula.
Notation 20.25. For L > 0, let C
k
L
(R) denote the space of C
k
2L periodic
functions:
C
k
L
(R) :=

f C
k
(R) : f(x + 2L) = f(x) for all x R

.
Also let h, i
L
denote the inner product on the Hilbert space H
L
:= L
2
([L, L])
given by
(f, g)
L
:=
1
2L
Z
[L,L]
f(x) g(x)dx.
Exercise 20.8. Recall that

L
k
(x) := e
ikx/L
: k Z

is an orthonormal basis for


H
L
and in particular for f H
L
,
(20.33) f =
X
kZ
hf,
L
k
i
L

L
k
where the convergence takes place in L
2
([L, L]). Suppose now that f
C
2
L
(R)
40
. Show (by two integration by parts)

(f
L
,
L
k
)
L

L
2
k
2
kf
00
k
u
where kgk
u
denote the uniform norm of a function g. Use this to conclude that the
sum in Eq. (20.33) is uniformly convergent and from this conclude that Eq. (20.33)
holds pointwise.
40
We view C
2
L
(R) as a subspace of H
L
by identifying f C
2
L
(R) with f|
[L,L]
H
L
.
404 BRUCE K. DRIVER

Exercise 20.9 (Fourier Inversion Formula on S). Let f S(R), L > 0 and
(20.34) f
L
(x) :=
X
kZ
f(x + 2kL).
Show:
(1) The sum dening f
L
is convergent and moreover that f
L
C

L
(R).
(2) Show (f
L
,
L
k
)
L
=
1

2L

f(k/L).
(3) Conclude from Exercise 20.8 that
(20.35) f
L
(x) =
1

2L
X
kZ

f(k/L)e
ikx/L
for all x R.
(4) Show, by passing to the limit, L , in Eq. (20.35) that Eq. (20.32)
holds for all x R. Hint: Recall that

f S.
Exercise 20.10. Folland 8.13 on p. 254.
Exercise 20.11. Folland 8.14 on p. 254. (Wirtingers inequality.)
Exercise 20.12. Folland 8.15 on p. 255. (The sampling Theorem. Modify to
agree with notation in notes, see Solution F.20 below.)
Exercise 20.13. Folland 8.16 on p. 255.
Exercise 20.14. Folland 8.17 on p. 255.
Exercise 20.15. .Folland 8.19 on p. 256. (The Fourier transform of a function
whose support has nite measure.)
Exercise 20.16. Folland 8.22 on p. 256. (Bessel functions.)
Exercise 20.17. Folland 8.23 on p. 256. (Hermite Polynomial problems and
Harmonic oscillators.)
Exercise 20.18. Folland 8.31 on p. 263. (Poisson Summation formula problem.)
ANALYSIS TOOLS WITH APPLICATIONS 405
21. Constant Coefficient partial differential equations
Suppose that p() =
P
||k
a

with a

C and
(21.1) L = p(D
x
) :=
||N
a

x
=
||N
a

1
i

.
Then for f S
c
Lf() = p()

f(),
that is to say the Fourier transform takes a constant coecient partial dierential
operator to multiplication by a polynomial. This fact can often be used to solve
constant coecient partial dierential equation. For example suppose g : R
n
C is
a given function and we want to nd a solution to the equation Lf = g. Taking the
Fourier transform of both sides of the equation Lf = g would imply p()

f() = g()
and therefore

f() = g()/p() provided p() is never zero. (We will discuss what
happens when p() has zeros a bit more later on.) So we should expect
f(x) = F
1

1
p()
g()

(x) = F
1

1
p()

Fg(x).
Denition 21.1. Let L = p(D
x
) as in Eq. (21.1). Then we let (L) :=Ran(p) C
and call (L) the spectrum of L. Given a measurable function G : (L) C, we
dene (a possibly unbounded operator) G(L) : L
2
(R
n
, m) L
2
(R
n
, m) by
G(L)f := F
1
M
Gp
F
where M
Gp
denotes the operation on L
2
(R
n
, m) of multiplication by G p, i.e.
M
Gp
f = (G p) f
with domain given by those f L
2
such that (G p) f L
2
.
At a formal level we expect
G(L)f = F
1
(G p) Fg.
21.0.3. Elliptic examples. As a specic example consider the equation
(21.2)

+m
2

f = g
where f, g : R
n
C and =
P
n
i=1

2
/x
2
i
is the usual Laplacian on R
n
. By
Corollary 20.16 (i.e. taking the Fourier transform of this equation), solving Eq.
(21.2) with f, g L
2
is equivalent to solving
(21.3)

||
2
+m
2


f() = g().
The unique solution to this latter equation is

f() =

||
2
+m
2

1
g()
and therefore,
f(x) = F
1

||
2
+m
2

1
g()

(x) =:

+m
2

1
g(x).
We expect
F
1

||
2
+m
2

1
g()

(x) = G
m
Fg(x) =
Z
R
n
G
m
(x y)g(y)dy,
406 BRUCE K. DRIVER

where
G
m
(x) := F
1

||
2
+m
2

1
(x) =
Z
R
n
1
m
2
+||
2
e
ix
d.
At the moment F
1

||
2
+m
2

1
only makes sense when n = 1, 2, or 3 because
only then is

||
2
+m
2

1
L
2
(R
n
).
For now we will restrict our attention to the one dimensional case, n = 1, in
which case
(21.4) G
m
(x) =
1

2
Z
R
1
( +mi) ( mi)
e
ix
d.
The function G
m
may be computed using standard complex variable contour inte-
gration methods to nd, for x 0,
G
m
(x) =
1

2
2i
e
i
2
mx
2im
=
1
2m

2e
mx
and since G
m
is an even function,
(21.5) G
m
(x) = F
1

||
2
+m
2

1
(x) =

2
2m
e
m|x|
.
This result is easily veried to be correct, since
F
"

2
2m
e
m|x|
#
() =

2
2m
Z
R
e
m|x|
e
ix
dx
=
1
2m
Z

0
e
mx
e
ix
dx +
Z
0

e
mx
e
ix
dx

=
1
2m

1
m+i
+
1
mi

=
1
m
2
+
2
.
Hence in conclusion we nd that

+m
2

f = g has solution given by


f(x) = G
m
Fg(x) =

2
2m
Z
R
e
m|xy|
g(y)dy =
1
2m
Z
R
e
m|xy|
g(y)dy.
Question. Why do we get a unique answer here given that f(x) = Asinh(x) +
Bcosh(x) solves

+m
2

f = 0?
The answer is that such an f is not in L
2
unless f = 0! More generally it is worth
noting that Asinh(x) +Bcosh(x) is not in P unless A = B = 0.
What about when m = 0 in which case m
2
+
2
becomes
2
which has a zero at
0. Noting that constants are solutions to f = 0, we might look at
lim
m0
(G
m
(x) 1) = lim
m0

2
2m
(e
m|x|
1) =

2
2
|x| .
as a solution, i.e. we might conjecture that
f(x) :=
1
2
Z
R
|x y| g(y)dy
solves the equation f
00
= g. To verify this we have
f(x) :=
1
2
Z
x

(x y) g(y)dy
1
2
Z

x
(y x) g(y)dy
ANALYSIS TOOLS WITH APPLICATIONS 407
so that
f
0
(x) =
1
2
Z
x

g(y)dy +
1
2
Z

x
g(y)dy and
f
00
(x) =
1
2
g(x)
1
2
g(x).
21.0.4. Poisson Semi-Group. Let us now consider the problems of nding a function
(x
0
, x) [0, ) R
n
u(x
0
, x) C such that
(21.6)


2
x
2
0
+

u = 0 with u(0, ) = f L
2
(R
n
).
Let u(x
0
, ) :=
R
R
n
u(x
0
, x)e
ix
dx denote the Fourier transform of u in the x R
n
variable. Then Eq. (21.6) becomes
(21.7)


2
x
2
0
||
2

u(x
0
, ) = 0 with u(0, ) =

f()
and the general solution to this dierential equation ignoring the initial condition
is of the form
(21.8) u(x
0
, ) = A()e
x
0
||
+B()e
x
0
||
for some function A() and B(). Let us now impose the extra condition that
u(x
0
, ) L
2
(R
n
) or equivalently that u(x
0
, ) L
2
(R
n
) for all x
0
0. The solution
in Eq. (21.8) will not have this property unless B() decays very rapidly at . The
simplest way to achieve this is to assume B = 0 in which case we now get a unique
solution to Eq. (21.7), namely
u(x
0
, ) =

f()e
x
0
||
.
Applying the inverse Fourier transform gives
u(x
0
, x) = F
1
h

f()e
x
0
||
i
(x) =:

e
x
0

(x)
and moreover

e
x
0

(x) = P
x
0
f(x)
where P
x
0
(x) = (2)
n/2

F
1
e
x
0
||

(x). From Exercise 21.1,


P
x
0
(x) = (2)
n/2

F
1
e
x
0
||

(x) = c
n
x
0
(x
2
0
+|x|
2
)
(n+1)/2
where
c
n
= (2)
n/2
((n + 1)/2)

2
n/2
=
((n + 1)/2)
2
n

(n+1)/2
.
Hence we have proved the following proposition.
Proposition 21.2. For f L
2
(R
n
),
e
x
0

f = P
x
0
f for all x
0
0
and the function u(x
0
, x) := e
x
0

f(x) is C

for (x
0
, x) (0, ) R
n
and
solves Eq. (21.6).
408 BRUCE K. DRIVER

21.0.5. Heat Equation on R


n
. The heat equation for a function u : R
+
R
n
C
is the partial dierential equation
(21.9)

1
2

u = 0 with u(0, x) = f(x),


where f is a given function on R
n
. By Fourier transforming Eq. (21.9) in the x
variables only, one nds that (21.9) implies that
(21.10)

t
+
1
2
||
2

u(t, ) = 0 with u(0, ) =



f().
and hence that u(t, ) = e
t||
2
/2

f(). Inverting the Fourier transform then shows
that
u(t, x) = F
1

e
t||
2
/2

f()

(x) =

F
1

e
t||
2
/2

Ff

(x) =: e
t/2
f(x).
From Example 20.4,
F
1

e
t||
2
/2

(x) = p
t
(x) = t
n/2
e

1
2t
|x|
2
and therefore,
u(t, x) =
Z
R
n
p
t
(x y)f(y)dy.
This suggests the following theorem.
Theorem 21.3. Let
(21.11) (t, x, y) := (2t)
n/2
e
|xy|
2
/2t
be the heat kernel on R
n
. Then
(21.12)

1
2

(t, x, y) = 0 and lim


t0
(t, x, y) =
x
(y),
where
x
is the function at x in R
n
. More precisely, if f is a continuous bounded
(can be relaxed considerably) function on R
n
, then u(t, x) =
R
R
n
(t, x, y)f(y)dy is
a solution to Eq. (21.9) where u(0, x) := lim
t0
u(t, x).
Proof. Direct computations show that

1
2

(t, x, y) = 0 and an ap-


plication of Theorem 11.21 shows lim
t0
(t, x, y) =
x
(y) or equivalently that
lim
t0
R
R
n
(t, x, y)f(y)dy = f(x) uniformly on compact subsets of R
n
. This shows
that lim
t0
u(t, x) = f(x) uniformly on compact subsets of R
n
.
This notation suggests that we should be able to compute the solution to g to
(m
2
)g = f using
g(x) =

m
2

1
f(x) =
Z

0

(
m
2

)
t
f

(x)dt =
Z

0

e
m
2
t
p
2t
Ff

(x)dt,
a fact which is easily veried using the Fourier transform. This gives us a method
to compute G
m
(x) from the previous section, namely
G
m
(x) =
Z

0
e
m
2
t
p
2t
(x)dt =
Z

0
(2t)
n/2
e
m
2
t
1
4t
|x|
2
dt.
ANALYSIS TOOLS WITH APPLICATIONS 409
We make the change of variables, = |x|
2
/4t (t = |x|
2
/4, dt =
|x|
2
4
2
d) to nd
G
m
(x) =
Z

0
(2t)
n/2
e
m
2
t
1
4t
|x|
2
dt =
Z

0

|x|
2
2
!
n/2
e
m
2
|x|
2
/4
|x|
2
(2)
2
d
=
2
(n/22)
|x|
n2
Z

0

n/22
e

e
m
2
|x|
2
/4
d. (21.13)
In case n = 3, Eq. (21.13) becomes
G
m
(x) =

2 |x|
Z

0
1

e
m
2
|x|
2
/4
d =

2 |x|
e
m|x|
where the last equality follows from Exercise 21.1. Hence when n = 3 we have
found

m
2

1
f(x) = G
m
Ff(x) = (2)
3/2
Z
R
3

2 |x y|
e
m|xy|
f(y)dy
=
Z
R
3
1
4 |x y|
e
m|xy|
f(y)dy. (21.14)
The function
1
4|x|
e
m|x|
is called the Yukawa potential.
Let us work out G
m
(x) for n odd. By dierentiating Eq. (21.26) of Exercise
21.1 we nd
Z

0
d
k1/2
e

1
4
x
2
e
m
2
=
Z

0
d
1

1
4
x
2

d
da

k
e
a
|
a=m
2
=

d
da

a
e

ax
= p
m,k
(x)e
mx
where p
m,k
(x) is a polynomial in x with deg p
m
= k with
p
m,k
(0) =

d
da

k
a
1/2
|
a=m
2 =

(
1
2
3
2
. . .
2k 1
2
)m
2k+1
= m
2k+1

2
k
(2k1)!!.
Letting k 1/2 = n/2 2 and m = 1 we nd k =
n1
2
2 N for n = 3, 5, . . . .
and we nd
Z

0

n/22
e

1
4
x
2
e

d = p
1,k
(x)e
x
for all x > 0.
Therefore,
G
m
(x) =
2
(n/22)
|x|
n2
Z

0

n/22
e

e
m
2
|x|
2
/4
d =
2
(n/22)
|x|
n2
p
1,n/22
(m|x|)e
m|x|
.
Now for even m, I think we get Bessel functions in the answer. (BRUCE: look
this up.) Let us at least work out the asymptotics of G
m
(x) for x . To this
end let
(y) :=
Z

0

n/22
e
(+
1
y
2
)
d = y
n2
Z

0

n/22
e
(y
2
+
1
)
d
The function f
y
() := (y
2
+
1
) satises,
f
0
y
() =

y
2

and f
00
y
() = 2
3
and f
000
y
() = 6
4
410 BRUCE K. DRIVER

so by Taylors theorem with remainder we learn


f
y
()

= 2y +y
3
( y
1
)
2
for all > 0,
see Figure 21.0.5 below.
2.5 2 1.5 1 0.5 0
30
25
20
15
10
5
0
x
y
x
y
Plot of f
4
and its second order Taylor approximation.
So by the usual asymptotics arguments,
(y)

= y
n2
Z
(+y
1
,y
1
+)

n/22
e
(y
2
+
1
)
d

= y
n2
Z
(+y
1
,y
1
+)

n/22
exp

2y y
3
( y
1
)
2

= y
n2
e
2y
Z
R

n/22
exp

y
3
( y
1
)
2

d (let y
1
)
= e
2y
y
n2
y
n/2+1
Z
R

n/22
exp

y( 1)
2

d
= e
2y
y
n2
y
n/2+1
Z
R
( + 1)
n/22
exp

y
2

d.
The point is we are still going to get exponential decay at .
When m = 0, Eq. (21.13) becomes
G
0
(x) =
2
(n/22)
|x|
n2
Z

0

n/21
e

=
2
(n/22)
|x|
n2
(n/2 1)
where (x) in the gamma function dened in Eq. (8.30). Hence for reasonable
functions f (and n 6= 2)
()
1
f(x) = G
0
Ff(x) = 2
(n/22)
(n/2 1)(2)
n/2
Z
R
n
1
|x y|
n2
f(y)dy
=
1
4
n/2
(n/2 1)
Z
R
n
1
|x y|
n2
f(y)dy.
The function

G
0
(x, y) :=
1
4
n/2
(n/2 1)
1
|x y|
n2
ANALYSIS TOOLS WITH APPLICATIONS 411
is a Greens function for . Recall from Exercise 8.16 that, for n = 2k, (
n
2

1) = (k 1) = (k 2)!, and for n = 2k + 1,
(
n
2
1) = (k 1/2) = (k 1 + 1/2) =

1 3 5 (2k 3)
2
k1
=

(2k 3)!!
2
k1
where (1)!! 1.
Hence

G
0
(x, y) =
1
4
1
|x y|
n2

k
(k 2)! if n = 2k
1

k
(2k3)!!
2
k1
if n = 2k + 1
and in particular when n = 3,

G
0
(x, y) =
1
4
1
|x y|
which is consistent with Eq. (21.14) with m = 0.
21.0.6. Wave Equation on R
n
. Let us now consider the wave equation on R
n
,
0 =

2
t

u(t, x) with
u(0, x) = f(x) and u
t
(0, x) = g(x). (21.15)
Taking the Fourier transform in the x variables gives the following equation
0 = u
t t
(t, ) +||
2
u(t, ) with
u(0, ) =

f() and u
t
(0, ) = g(). (21.16)
The solution to these equations is
u(t, ) =

f() cos (t ||) + g()
sint||
||
and hence we should have
u(t, x) = F
1

f() cos (t ||) + g()


sint||
||

(x)
= F
1
cos (t ||) Ff(x) +F
1
sint||
||
Fg (x)
=
d
dt
F
1

sint||
||

Ff(x) +F
1

sint||
||

Fg (x) . (21.17)
The question now is how interpret this equation. In particular what are the inverse
Fourier transforms of F
1
cos (t ||) and F
1
sin t||
||
. Since
d
dt
F
1
sin t||
||
Ff(x) =
F
1
cos (t ||)Ff(x), it really suces to understand F
1
h
sin t||
||
i
. The problem we
immediately run into here is that
sin t||
||
L
2
(R
n
) i n = 1 so that is the case we
should start with.
Again by complex contour integration methods one can show

F
1

1
sint

(x) =

1
x+t>0
1
(xt)>0

2
(1
x>t
1
x>t
) =

2
1
[t,t]
(x)
412 BRUCE K. DRIVER

where in writing the last line we have assume that t 0. Again this easily seen to
be correct because
F

2
1
[t,t]
(x)

() =
1
2
Z
R
1
[t,t]
(x)e
ix
dx =
1
2i
e
ix
|
t
t
=
1
2i

e
it
e
it

=
1
sint.
Therefore,

F
1

1
sint

Ff(x) =
1
2
Z
t
t
f(x y)dy
and the solution to the one dimensional wave equation is
u(t, x) =
d
dt
1
2
Z
t
t
f(x y)dy +
1
2
Z
t
t
g(x y)dy
=
1
2
(f(x t) +f(x +t)) +
1
2
Z
t
t
g(x y)dy
=
1
2
(f(x t) +f(x +t)) +
1
2
Z
x+t
xt
g(y)dy.
We can arrive at this same solution by more elementary means as follows. We
rst note in the one dimensional case that wave operator factors, namely
0 =

2
t

2
x

u(t, x) = (
t

x
) (
t
+
x
) u(t, x).
Let U(t, x) := (
t
+
x
) u(t, x), then the wave equation states (
t

x
) U = 0 and
hence by the chain rule
d
dt
U(t, x t) = 0. So
U(t, x t) = U(0, x) = g(x) +f
0
(x)
and replacing x by x +t in this equation shows
(
t
+
x
) u(t, x) = U(t, x) = g(x +t) +f
0
(x +t).
Working similarly, we learn that
d
dt
u(t, x +t) = g(x + 2t) +f
0
(x + 2t)
which upon integration implies
u(t, x +t) = u(0, x) +
Z
t
0
{g(x + 2) +f
0
(x + 2)} d
= f(x) +
Z
t
0
g(x + 2)d +
1
2
f(x + 2)|
t
0
=
1
2
(f(x) +f(x + 2t)) +
Z
t
0
g(x + 2)d.
Replacing x x t in this equation gives
u(t, x) =
1
2
(f(x t) +f(x +t)) +
Z
t
0
g(x t + 2)d
and then letting y = x t + 2 in the last integral shows again that
u(t, x) =
1
2
(f(x t) +f(x +t)) +
1
2
Z
x+t
xt
g(y)dy.
ANALYSIS TOOLS WITH APPLICATIONS 413
When n > 3 it is necessary to treat F
1
h
sin t||
||
i
as a distribution or gener-
alized function, see Section 30 below. So for now let us take n = 3, in which case
from Example 20.18 it follows that
(21.18) F
1

sint ||
||

=
t
4t
2

t
= t
t
where
t
is
1
4t
2

t
, the surface measure on S
t
normalized to have total measure
one. Hence from Eq. (21.17) the solution to the three dimensional wave equation
should be given by
(21.19) u(t, x) =
d
dt
(t
t
Ff(x)) +t
t
Fg (x) .
Using this denition in Eq. (21.19) gives
u(t, x) =
d
dt

t
Z
S
t
f(x y)d
t
(y)

+t
Z
S
t
g(x y)d
t
(y)
=
d
dt

t
Z
S
1
f(x t)d
1
()

+t
Z
S
1
g(x t)d
1
()
=
d
dt

t
Z
S
1
f(x +t)d
1
()

+t
Z
S
1
g(x +t)d
1
(). (21.20)
Proposition 21.4. Suppose f C
3
(R
3
) and g C
2
(R
3
), then u(t, x) dened by
Eq. (21.20) is in C
2

RR
3

and is a classical solution of the wave equation in


Eq. (21.15).
Proof. The fact that u C
2

R R
3

follows by the usual dierentiation under


the integral arguments. Suppose we can prove the proposition in the special case
that f 0. Then for f C
3
(R
3
), the function v(t, x) = +t
R
S
1
g(x + t)d
1
()
solves the wave equation 0 =

2
t

v(t, x) with v(0, x) = 0 and v


t
(0, x) = g(x).
Dierentiating the wave equation in t shows u = v
t
also solves the wave equation
with u(0, x) = g(x) and u
t
(0, x) = v
tt
(0, x) =
x
v(0, x) = 0.
These remarks reduced the problems to showing u in Eq. (21.20) with f 0
solves the wave equation. So let
(21.21) u(t, x) := t
Z
S
1
g(x +t)d
1
().
We now give two proofs the u solves the wave equation.
Proof 1. Since solving the wave equation is a local statement and u(t, x) only
depends on the values of g in B(x, t) we it suces to consider the case where
g C
2
c

R
3

. Taking the Fourier transform of Eq. (21.21) in the x variable shows


u(t, ) = t
Z
S
1
d
1
()
Z
R
3
g(x +t)e
ix
dx
= t
Z
S
1
d
1
()
Z
R
3
g(x)e
ix
e
it
dx = g()t
Z
S
1
e
it
d
1
()
= g()t
sin|tk|
|tk|
= g()
sin(t ||)
||
wherein we have made use of Example 20.18. This completes the proof since u(t, )
solves Eq. (21.16) as desired.
414 BRUCE K. DRIVER

Proof 2. Dierentiating
S(t, x) :=
Z
S
1
g(x +t)d
1
()
in t gives
S
t
(t, x) =
1
4
Z
S
1
g(x +t) d() =
1
4
Z
B(0,1)

g(x +t)dm()
=
t
4
Z
B(0,1)
g(x +t)dm() =
1
4t
2
Z
B(0,t)
g(x +y)dm(y)
=
1
4t
2
Z
t
0
dr r
2
Z
|y|=r
g(x +y)d(y)
where we have used the divergence theorem, made the change of variables y = t
and used the disintegration formula in Eq. (8.27),
Z
R
d
f(x)dm(x) =
Z
[0,)S
n1
f(r ) d()r
n1
dr =
Z

0
dr
Z
|y|=r
f(y)d(y).
Since u(t, x) = tS(t, x) if follows that
u
tt
(t, x) =

t
[S(t, x) +tS
t
(t, x)]
= S
t
(t, x) +

t
"
1
4t
Z
t
0
dr r
2
Z
|y|=r
g(x +y)d(y)
#
= S
t
(t, x)
1
4t
2
Z
t
0
dr
Z
|y|=r
g(x +y)d(y) +
1
4t
Z
|y|=t
g(x +y)d(y)
= S
t
(t, x) S
t
(t, x) +
t
4t
2
Z
|y|=1
g(x +t)d() = tu(t, x)
as required.
The solution in Eq. (21.20) exhibits a basic property of wave equations, namely
nite propagation speed. To exhibit the nite propagation speed, suppose that
f = 0 (for simplicity) and g has compact support near the origin, for example
think of g =
0
(x). Then x +tw = 0 for some w i |x| = t. Hence the wave front
propagates at unit speed and the wave front is sharp. See Figure 39 below.
The solution of the two dimensional wave equation may be found using
Hadamards method of decent which we now describe. Suppose now that f and
g are functions on R
2
which we may view as functions on R
3
which happen not to
depend on the third coordinate. We now go ahead and solve the three dimensional
wave equation using Eq. (21.20) and f and g as initial conditions. It is easily seen
that the solution u(t, x, y, z) is again independent of z and hence is a solution to
the two dimensional wave equation. See gure 40 below.
Notice that we still have nite speed of propagation but no longer sharp propa-
gation. The explicit formula for u is given in the next proposition.
Proposition 21.5. Suppose f C
3
(R
2
) and g C
2
(R
2
), then
u(t, x) :=

t
"
t
2
ZZ
D
1
f(x +tw)
p
1 |w|
2
dm(w)
#
+
t
2
ZZ
D
1
g(x +tw)
p
1 |w|
2
dm(w)
ANALYSIS TOOLS WITH APPLICATIONS 415
Figure 39. The geometry of the solution to the wave equation in
three dimensions. The observer sees a ash at t = 0 and x = 0
only at time t = |x| . The wave progates sharply with speed 1.
Figure 40. The geometry of the solution to the wave equation in
two dimensions. A ash at 0 R
2
looks like a line of ashes to the
ctitious 3 d observer and hence she sees the eect of the ash
for t |x| . The wave still propagates with speed 1. However there
is no longer sharp propagation of the wave front, similar to water
waves.
416 BRUCE K. DRIVER

is in C
2

R R
2

and solves the wave equation in Eq. (21.15).


Proof. As usual it suces to consider the case where f 0. By symmetry u
may be written as
u(t, x) = 2t
Z
S
+
t
g(x y)d
t
(y) = 2t
Z
S
+
t
g(x +y)d
t
(y)
where S
+
t
is the portion of S
t
with z 0. The surface S
+
t
may be parametrized by
R(u, v) = (u, v,

t
2
u
2
v
2
) with (u, v) D
t
:=

(u, v) : u
2
+v
2
t
2

. In these
coordinates we have
4t
2
d
t
=

u
p
t
2
u
2
v
2
,
v
p
t
2
u
2
v
2
, 1

dudv
=

t
2
u
2
v
2
,
v

t
2
u
2
v
2
, 1

dudv
=
r
u
2
+v
2
t
2
u
2
v
2
+ 1dudv =
|t|

t
2
u
2
v
2
dudv
and therefore,
u(t, x) =
2t
4t
2
Z
D
t
g(x + (u, v,
p
t
2
u
2
v
2
))
|t|

t
2
u
2
v
2
dudv
=
1
2
sgn(t)
Z
D
t
g(x + (u, v))

t
2
u
2
v
2
dudv.
This may be written as
u(t, x) =
1
2
sgn(t)
ZZ
D
t
g(x +w)
p
t
2
|w|
2
dm(w) =
1
2
sgn(t)
t
2
|t|
ZZ
D
1
g(x +tw)
p
1 |w|
2
dm(w)
=
1
2
t
ZZ
D
1
g(x +tw)
p
1 |w|
2
dm(w)
21.1. Elliptic Regularity. The following theorem is a special case of the main
theorem (Theorem 21.10) of this section.
Theorem 21.6. Suppose that M
o
R
n
, v C

(M) and u L
1
loc
(M) satises
u = v weakly, then u has a (necessarily unique) version u C

(M).
Proof. We may always assume n 3, by embedding the n = 1 and n = 2 cases
in the n = 3 cases. For notational simplicity, assume 0 M and we will show u is
smooth near 0. To this end let C

c
(M) such that = 1 in a neighborhood of 0
and C

c
(M) such that supp() { = 1} and = 1 in a neighborhood of 0
as well. Then formally, we have with := 1 ,
G (v) = G (u) = G ((u +u))
= G ((u) +(u)) = u +G ((u))
so that
u(x) = G (v) (x) G ((u))(x)
ANALYSIS TOOLS WITH APPLICATIONS 417
for x supp(). The last term is formally given by
G ((u))(x) =
Z
R
n
G(x y)(y)((y)u(y))dy
=
Z
R
n
(y)
y
[G(x y)(y)] u(y)dy
which makes sense for x near 0. Therefore we nd
u(x) = G (v) (x)
Z
R
n
(y)
y
[G(x y)(y)] u(y)dy.
Clearly all of the above manipulations were correct if we know u were C
2
to begin
with. So for the general case, let u
n
= u
n
with {
n
}

n=1
the usual sort of
sequence approximation. Then u
n
= v
n
=: v
n
away from M and
(21.22) u
n
(x) = G (v
n
) (x)
Z
R
n
(y)
y
[G(x y)(y)] u
n
(y)dy.
Since u
n
u in L
1
loc
(O) where O is a suciently small neighborhood of 0, we may
pass to the limit in Eq. (21.22) to nd u(x) = u(x) for a.e. x O where
u(x) := G (v) (x)
Z
R
n
(y)
y
[G(x y)(y)] u(y)dy.
This concluded the proof since u is smooth for x near 0.
Denition 21.7. We say L = p(D
x
) as dened in Eq. (21.1) is elliptic if p
k
() :=
P
||=k
a

is zero i = 0. We will also say the polynomial p() :=


P
||k
a

is elliptic if this condition holds.


Remark 21.8. If p() :=
P
||k
a

is an elliptic polynomial, then there exists


A < such that inf
||A
|p()| > 0. Since p
k
() is everywhere non-zero for S
n1
and S
n1
R
n
is compact, := inf
||=1
|p
k
()| > 0. By homogeneity this implies
|p
k
()| ||
k
for all A
n
.
Since
|p()| =

p
k
() +
X
||<k
a

|p
k
()|

X
||<k
a

||
k
C

1 +||
k1

for some constant C < from which it is easily seen that for A suciently large,
|p()|

2
||
k
for all || A.
For the rest of this section, let L = p(D
x
) be an elliptic operator and M
0
R
n
.
As mentioned at the beginning of this section, the formal solution to Lu = v for
v L
2
(R
n
) is given by
u = L
1
v = G v
where
G(x) :=
Z
R
n
1
p()
e
ix
d.
Of course this integral may not be convergent because of the possible zeros of p
and the fact
1
p()
may not decay fast enough at innity. We we will introduce
418 BRUCE K. DRIVER

a smooth cut o function () which is 1 on C


0
(A) := {x R
n
: |x| A} and
supp() C
0
(2A) where A is as in Remark 21.8. Then for M > 0 let
G
M
(x) =
Z
R
n
(1 ()) (/M)
p()
e
ix
d, (21.23)
(x) :=

(x) =
Z
R
n
()e
ix
d, and
M
(x) = M
n
(Mx). (21.24)
Notice
R
R
n
(x)dx = F(0) = (0) = 1, S since S and
LG
M
(x) =
Z
R
n
(1 ()) (/M)e
ix
d =
Z
R
n
[(/M) ()] e
ix
d
=
M
(x) (x)
provided M > 2.
Proposition 21.9. Let p be an elliptic polynomial of degree m. The function G
M
dened in Eq. (21.23) satises the following properties,
(1) G
M
S for all M > 0.
(2) LG
M
(x) = M
n
(Mx) (x).
(3) There exists G C

c
(R
n
\ {0}) such that for all multi-indecies ,
lim
M

G
M
(x) =

G(x) uniformly on compact subsets in R


n
\ {0} .
Proof. We have already proved the rst two items. For item 3., we notice that
(x)

G
M
(x) =
Z
R
n
(1 ()) (/M)

p()
(D)

e
ix
d
=
Z
R
n
D

(1 ())

p()
(/M)

e
ix
d
=
Z
R
n
D

(1 ())

p()
(/M)e
ix
d +R
M
(x)
where
R
M
(x) =
X
<

M
||||
Z
R
n
D

(1 ())

p()

(/M)e
ix
d.
Using

p()
(1 ())

C ||
||m||
and the fact that
supp(

(/M)) { R
n
: A || /M 2A} = { R
n
: AM || 2AM}
we easily estimate
|R
M
(x)| C
X
<

M
||||
Z
{R
n
:AM||2AM}
||
||m||
d
C
X
<

M
||||
M
||m||+n
= CM
||||m+n
.
Therefore, R
M
0 uniformly in x as M provided || > || m+n. It follows
easily now that G
M
G in C

c
(R
n
\ {0}) and furthermore that
(x)

G(x) =
Z
R
n
D

(1 ())

p()
e
ix
d
ANALYSIS TOOLS WITH APPLICATIONS 419
provided is suciently large. In particular we have shown,
D

G(x) =
1
|x|
2k
Z
R
n
(

)
k
(1 ())

p()
e
ix
d
provided m|| + 2k > n, i.e. k > (n m+||) /2.
We are now ready to use this result to prove elliptic regularity for the constant
coecient case.
Theorem 21.10. Suppose L = p(D

) is an elliptic dierential operator on R


n
,
M
o
R
n
, v C

(M) and u L
1
loc
(M) satises Lu = v weakly, then u has a
(necessarily unique) version u C

(M).
Proof. For notational simplicity, assume 0 M and we will show u is smooth
near 0. To this end let C

c
(M) such that = 1 in a neighborhood of 0 and
C

c
(M) such that supp() { = 1} , and = 1 in a neighborhood of 0 as
well. Then formally, we have with := 1 ,
G
M
(v) = G
M
(Lu) = G
M
(L(u +u))
= G
M
(L(u) +L(u)) =
M
(u) (u) +G
M
(L(u))
so that
(21.25)
M
(u) (x) = G
M
(v) (x) G
M
(L(u))(x) + (u) .
Since
F [G
M
(v)] () =

G
M
() (v)

() =
(1 ()) (/M)
p()
(v)

()

(1 ())
p()
(v)

() as M
with the convergence taking place in L
2
(actually in S), it follows that
G
M
(v) G (v) (x) :=
Z
R
n
(1 ())
p()
(v)

()e
ix
d
= F
1

(1 ())
p()
(v)

()

(x) S.
So passing the the limit, M , in Eq. (21.25) we learn for almost every x R
n
,
u(x) = G (v) (x) lim
M
G
M
(L(u))(x) + (u) (x)
for a.e. x supp(). Using the support properties of and we see for x near 0
that (L(u))(y) = 0 unless y supp() and y / { = 1} , i.e. unless y is in an
annulus centered at 0. So taking x suciently close to 0, we nd x y stays away
from 0 as y varies through the above mentioned annulus, and therefore
G
M
(L(u))(x) =
Z
R
n
G
M
(x y)(L(u))(y)dy
=
Z
R
n
L

y
{(y)G
M
(x y)} (u) (y)dy

Z
R
n
L

y
{(y)G(x y)} (u) (y)dy as M .
420 BRUCE K. DRIVER

Therefore we have shown,


u(x) = G (v) (x)
Z
R
n
L

y
{(y)G(x y)} (u) (y)dy + (u) (x)
for almost every x in a neighborhood of 0. (Again it suces to prove this equation
and in particular Eq. (21.25) assuming u C
2
(M) because of the same convo-
lution argument we have use above.) Since the right side of this equation is the
linear combination of smooth functions we have shown u has a smooth version in a
neighborhood of 0.
Remarks 21.11. We could avoid introducing G
M
(x) if deg(p) > n, in which case
(1())
p()
L
1
and so
G(x) :=
Z
R
n
(1 ())
p()
e
ix
d
is already well dened function with G C

(R
n
\ {0}) BC(R
n
). If deg(p) < n,
we may consider the operator L
k
= [p(D
x
)]
k
= p
k
(D
x
) where k is chosen so that
k deg(p) > n. Since Lu = v implies L
k
u = L
k1
v weakly, we see to prove the
hypoellipticity of L it suces to prove the hypoellipticity of L
k
.
21.2. Exercises.
Exercise 21.1. Using
1
||
2
+m
2
=
Z

0
e
(||
2
+m
2
)
d,
the identity in Eq. (21.5) and Example 20.4, show for m > 0 and x 0 that
e
mx
=
m

Z

0
d
1

1
4
x
2
e
m
2
(let /m
2
) (21.26)
=
Z

0
d
1

m
2
4
x
2
. (21.27)
Use this formula and Example 20.4 to show, in dimension n, that
F
h
e
m|x|
i
() = 2
n/2
((n + 1)/2)

m
(m
2
+||
2
)
(n+1)/2
where (x) in the gamma function dened in Eq. (8.30). (I am not absolutely
positive I have got all the constants exactly right, but they should be close.)
ANALYSIS TOOLS WITH APPLICATIONS 421
22. L
2
Sobolev spaces on R
n
Recall the following notation and denitions from Section 20. TODO Introduce
S
0
so that one may dene negative Sobolev spaces here and do the embedding
theorems. Localize to open sets, add in trace theorems to hyperplanes and sub-
manifolds and give some application to PDE.
Notation 22.1. Let
dx =

1

n
dm(x) and d

n
dm()
where m is Lebesgue measure on R
n
. Also let hi =
p
1 +||
2
,

x
=


x

and D

x
=

1
i

||

=

1
i

x

.
Denition 22.2 (Fourier Transform). For f L
1
, let

f() = Ff() :=
Z
R
n
e
ix
f(x)dx
g

(x) = F
1
g(x) =
Z
R
n
e
ix
g()d = Fg(x)
22.1. Sobolev Spaces.
Denition 22.3. To each s R and f S let
|f|
2
s

Z
|

f()|
2
(1 +||
2
)
s
d =
Z
|

f()|
2
hi
2s
d.
This norm may also be described by
|f|
s
= k(1 )
s/2
fk
L
2
We call ||
s
the L
2
Sobolev norm with s derivatives.
It will sometime be useful to use the following norms,
kfk
2
s

Z
|

f()|
2
(1 +||)
2s
d for all s R and f S.
For each s R, k k
s
is equivalent to ||
s
because
1 +||
2
(1 +||)
2
2(1 +||
2
).
Lemma 22.4. The Hilbert space L
2
(R
n
, (1+||
2
)
s
d) may be viewed as a subspace
of S
0
under the map
g L
2
(R
n
, (1 +||
2
)
s
d) ( S
Z
R
n
g()()d) S
0
.
Proof. Let g L
2
(R
n
, (1 +||
2
)
s
d) and S, then
Z
R
n
|g()()| d =
Z
R
n
|g()| (1 +||
2
)
s/2
|()| (1 +||
2
)
s/2
d
kgk
L
2
(R
n
,(1+||
2
)
s
d)
kk
L
2
(R
n
,(1+||
2
)
s
d)
.
422 BRUCE K. DRIVER

Now
kk
2
L
2
(R
n
,(1+||
2
)
s
d)
=
Z
R
n
|()|
2
(1 +||
2
)
s
d

Z
R
n
(1 +||
2
)
s
(1 +||
2
)
t
d sup

h
|()|
2
(1 +||
2
)
t
i
= C(s +t) sup

h
|()|
2
(1 +||
2
)
t
i
where
C(s +t) :=
Z
R
n
(1 +||
2
)
st
d <
provided s +t > n/2. So by choosing t > n/2 s, we have shown g L
1
(d) and
that

Z
R
n
g()()d

C(s +t) sup

h
|()|
2
(1 +||
2
)
t
i
.
Therefore S
R
R
n
g()()d is an element of S
0
.
Denition 22.5. The Sobolev space of order s on R
n
is the normed vector space
H
s
(R
n
) = F
1
(L
2
(R
n
, (1 +||
2
)
s
d)) S
0
or equivalently,
H
s
(R
n
) =
n
f S
0
:

f L
2
(R
n
, (1 +||
2
)
s
d)
o
.
We make H
s
(R
n
) into a Hilbert space by requiring
F
1
|
L
2
(R
n
,(1+||
2
)
s
d)
: L
2
(R
n
, (1 +||
2
)
s
d) H
s
(R
n
)
to be a unitary map. So the inner product on H
s
is given by
(22.1) hf, gi
s
:=
Z

f() g()(1 +||
2
)
s
d for all f, g H
s
(R
n
)
and the associated norm is
(22.2) |f|
2
s

Z
R
n
|

f()|
2
(1 +||
2
)
s
d.
Remark 22.6. We may also describe H
s
(R
n
) as
H
s
(R
n
) = (1 )
s/2
L
2
(R
n
, dx)
= {f S
0
: (1 )
s/2
f L
2
(R
n
, dx)}
and the inner product may be described as
hf, gi
s
= h(1 )
s/2
f, (1 )
s/2
gi
L
2.
Here we dene (1 )
s/2
acting on S
0
as the transpose of its action on S which is
determined by
F
h
(1 )
s/2
f
i
() = (1 +||
2
)
s/2

f() for all f S.
ANALYSIS TOOLS WITH APPLICATIONS 423
It will be useful to notice later that commutes with complex conjugation and
therefore so does (1 )
s/2
. To check this formally, recall that F

f() =

f(),
therefore,
F
h
(1 )
s/2
f
i
() = F(1 )
s/2
f() = (1 +||
2
)
s/2
f()
= (1 +||
2
)
s/2

f() = (1 +||
2
)
s/2
F

f()
= F

(1 )
s/2

f

().
This shows that (1 )
s/2
f = (1 )
s/2

f for f S and hence by duality for
f S
0
as well.
Lemma 22.7. S is dense in H
s
(R
n
) for all s R and (1 )
t/2
: H
s
H
st
is
unitary for all s, t R.
Proof. Because F : H
s
(R
n
) L
2
(R
n
, (1+||
2
)
s
d) is unitary and F(S) = S, it
suces to show S is dense in L
2
(R
n
, (1 +||
2
)
s
d). Since d
s
() := (1 +||
2
)
s
d is
a Radon measure on R
n
, we know that C

c
(R
n
) is dense in L
2
(d
s
) and therefore
by the virtue that C

c
(R
n
) S, S is dense as well.
Because the map
f L
2
(R
n
, (1 +||
2
)
s
d) (1 +||
2
)
t/2
f() L
2
(R
n
, (1 +||
2
)
st
d)
is unitary, it follows that (1 )
t/2
: H
s
H
st
is unitary for all s, t R as well.
Lemma 22.8. For each multi-index , the operator D

x
: S
0
S
0
restricts to a
contraction from H
s
H
s||
. We also have the relation
(22.3) F (D

x
f) () =


f () for all f H
s
.
Proof. Recall the Eq. (22.3) holds for all f S
0
in the sense
(22.4) F (D

x
f) = m


f
where m

() :=

. Now if f H
s
,

f is represented by a tempered function,
therefore m


f is represented by the tempered function


f () . That is Eq.
(22.3) holds and therefore,
|D

x
f|
2
s||
=
Z
|


f()|
2
(1 +||
2
)
s||
d
=
Z
|

f()|
2
(1 +||
2
)
s||
|

|
2
d

Z
|

f()|
2
(1 +||
2
)
s||
(1 +||
2
)
||
d

Z
|

f()|
2
(1 +||
2
)
s
d = |f|
2
s
,
wherein the third line we have used the estimate
|

|
2
=
2
1
1

2
n
n
||
2||
(1 +||
2
)
||
which follows from
2
i
||
2
for all i.
424 BRUCE K. DRIVER

Lemma 22.9. Suppose s N. Then H


s
may be characterized by
(22.5) H
s
= {f L
2
(d) : D

f exists in L
2
(d) for all || s},
where D

f denotes the distributional or weak derivatives of f. (See Theorem 19.18


for other characterizations of these derivatives.) Also if we let
kfk
2
s
:=
X
||s
kD

fk
2
L
2
for f H
s
,
then kk
s
and ||
s
are equivalent norms on H
s
.
Proof. Let

H
s
denote the right side of Eq. (22.5). If f H
s
and || s, then
Lemma 22.8,
|D

f|
2
0
|D

f|
2
s||
|f|
2
s
< .
This shows that f

H
s
and
(22.6) kfk
2
s

X
||s
|f|
2
s
C
s
|f|
2
s
.
Conversely if f

H
s
(letting m

() :=

as above),
> kfk
2
s
=
X
||s
kD

fk
2
L
2
=
X
||s

2
L
2
=
X
||s
Z
R
n
(

)
2

f()

2
d
=
Z
R
n
X
||s
(

)
2

f()

2
d. (22.7)
Let
0
= 1, then by the multinomial theorem
(1 +||
2
)
s
= (
n
X
i=0

2
i
)
s
=
X
||=s

2
where = (
0
,
1
, . . . ,
n
) N
n+1
and

=
s!
Q
n
j=0

j
!
.
We may rewrite this using = (
1
, . . . ,
n
) N
n
as follows
(1 +||
2
)
s
=
X
||s

s
(s || , )

2
so that
(22.8)
X
||s

2
c
s
(1 +||
2
)
s
with c
1
s
:= max
||s

s
(s || , )

.
Using this estimate in with Eq. (22.7) implies
(22.9) > kfk
2
s
c
s
Z
R
n
(1 +||
2
)
s

f()

2
d = c
s
|f|
2
s
.
This shows that f H
s
and Eqs. (22.6) and (22.9) prove kk
s
and ||
s
are equivalent.
ANALYSIS TOOLS WITH APPLICATIONS 425
Denition 22.10. Let C
k
0
(R
n
) denote the Banach space of C
k
functions on R
n
for which D

f C
0
(R
n
) for || k. The norm on C
k
0
(R
n
) is dened by
|f|
,k
=
X
||k
kD

x
fk

sup
x
X
||k
|D

x
f|.
Theorem 22.11 (Sobolev Embedding Theorem). Let k N. If s > k +
n
2
(or
k s <
n
2
) then every f H
s
has a representative i(f) C
k
0
(R
n
) which is given
by
(22.10) i(f)(x) =
Z
R
n

f()e
ix
d.
The map i : H
s
C
k
0
(R
n
) is bounded and linear.
Proof. For N
n
Z
R
n
|

f()

Z
R
n
(1 +||
2
)
s

f()

2
d
Z
R
n
|

|
2
(1 +||
2
)
s
d
= C
2

|f|
2
s
where
C
2

:=
Z
R
n
|

|
2
(1 +||
2
)
s
d.
If || k, then
|

|
2
(1 +||
2
)
s
(1 +||
2
)
k
(1 +||
2
)
s
= (1 +||
2
)
ks
L
1
(d)
provided k s < n/2. So we have shown,
(22.11)


f() L
1
(d) for all || k.
Using this result for = 0, we deduce

f L
1
L
2
and therefore the continuous
version of f is given by Eq. (22.10). Using the integrability of


f() in Eq. (22.11)
we may dierentiate this expression to nd
D

i(f)(x) =
Z
R
n


f()e
ix
d for all || k.
By the dominated convergence theorem and the Riemann Lebesgue lemma,
D

i(f) C
0
(R
n
) for all || k. Moreover,
|D

i(f)|


Z
R
n


f()

d C

|f|
s
for all || k.
This shows that |i(f)|
,k
(const.) |f|
s
.
Let us now improve the above result to get some Hlder continuity for f, for this
|f(x) f(y)| =

Z
R
n

f()

e
ix
e
iy

Z
R
n

f()

e
ix
e
iy

d
=
Z
R
n

f()

(1 +||
2
)
s/2

1 e
i(yx)

(1 +||
2
)
s/2
d

Z
R
n
(1 +||
2
)
s

f()

2
d

1/2

Z
R
n

1 e
i(yx)

2
(1 +||
2
)
s
d

1/2
= |f|
s
C
s
(|y x|)
426 BRUCE K. DRIVER

where
C
s
(|x|) =
Z
R
n

1 e
ix

2
(1 +||
2
)
s
d

1/2
=
Z
R
n

1 e
i|x|
n

2
(1 +||
2
)
s
d

1/2
.
Making the change of variables / |x| in the above formula gives
C
s
(|x|) =
_
_
|x|
n
Z
R
n

1 e
i
n

2 1
(1 +
||
2
|x|
2
)
s
d
_
_
1/2
= |x|
sn/2

Z
R
n

1 e
i
n

2 1
(|x|
2
+||
2
)
s
d
!
1/2
|x|
sn/2
Z
R
n

1 e
i
n

2 1
||
2s
d

1/2

p
(S
n1
) |x|
sn/2
Z

0
2 r
2
r
2s
r
n1
dr

1/2
.
Supposing the s n/2 = (0, 1), we nd
Z

0
2 r
2
r
2s
r
n1
dr =
Z

0
2 r
2
r
n+2
r
n1
dr =
Z

0
2 r
2
r
1+2
dr <
since 2/r
1+2
is integrable near innity and r
2
/r
1+2
= 1/r
21
is integrable near
0. Thus we have shown, for s n/2 (0, 1) that
|f(x) f(y)| K
s
|f|
s
|x y|
sn/2
where
K
s
:=
p
(S
n1
)
Z

0
2 r
2
r
2s
r
n1
dr

1/2
.
Notation 22.12. In the sequel, we will simply write f for i(f) with the under-
standing that if f L
1
loc
(R
n
) has a continuous version, then we will identify f with
its (necessarily unique) continuous version.
Denition 22.13. In the future we will work with the following two subspaces of
S
0
:
H

=
sR
H
s
=
s0
H
s
and
H

=
sR
H
s
=
s0
H
s
.
We also set
(22.12) hf, gi :=
Z
R
n

f()g

()d
for all f, g H

such that

fg

L
1
(d).
Notice that H

H
s
L
2
H
s
H

for all s R. Also if f, g H


0
= L
2
,
then

f, g

L
2
(d) so that

fg

L
1
(d) and
Z
R
n

f()g

()d =
Z
R
n

f()
b
g()d =
Z
R
n
f(x) g(x)dx =
Z
R
n
f(x)g(x)dx.
ANALYSIS TOOLS WITH APPLICATIONS 427
Therefore, h, i is an extension of the pairing
f, g L
2

Z
R
n
f(x)g(x)dx =: hf, gi
L
2.
Proposition 22.14. Let s R. If f H
s
and g H
s
, then hf, gi is well dened
and satises
hf, gi = h(1 )
s/2
f, (1 )
s/2
gi
L
2 = h(1 )
s/2
f, (1 )
s/2
gi.
If we further assume that g S, then hf, gi = hf, gi
S
0
S
where h, i
S
0
S
denotes
the natural pairing between S
0
and S. Moreover, if s 0, the map
(22.13) f H
s
T
hf, i H

s
is a unitary map (i.e. a Hilbert space isomorphism) and the |f|
s
may be computed
using
(22.14) |f|
s
= sup

|hf, gi
S
0
S
|
|g|
s
: 0 6= g S

.
Proof. Let s R, f H
s
and g H
s
, then

f()g

() = (1 +||
2
)
s/2

f() (1 +||
2
)
s/2
g

() L
1
since (1+||
2
)
s/2

f() and (1+||
2
)
s/2
g

() = (1+||
2
)
s/2
g() are L
2
functions
by denition of H
s
and H
s
respectively. Therefore hf, gi is well dened and
hf, gi =
Z
R
n
(1 +||
2
)
s/2

f() (1 +||
2
)
s/2
g

()d
=
Z
R
n
(1 +||
2
)
s/2

f() (1 +||
2
)
s/2
b
g()d
= hF
1
(1 +||
2
)
s/2

f(), F
1
(1 +||
2
)
s/2
b
g()i
L
2
= h(1 )
s/2
f, (1 )
s/2
gi
L
2 = h(1 )
s/2
f, (1 )
s/2
gi. (22.15)
If g S, then by denition of the Fourier transform for tempered distributions,
Z
R
n

f()g

()d = h

f, g

i
S
0
S
= hf, (g

i
S
0
S
= hf, gi
S
0
S
.
By Eq. (22.15),
|hf, gi|

(1 )
s/2
f

(1 )
s/2
g

0
= |f|
s
|g|
s
with equality if (1 )
s/2
g = (1 )
s/2
f, i.e. if g = (1 )
s
f H
s
. This
shows that
|f|
s
= sup

|hf, gi|
|g|
s
: g H
s

= sup

|hf, gi|
|g|
s
: g S

= khf, ik
H

s
,
where the second equality is a consequence of S being dense in H
s
. This proves Eq.
(22.14) and the fact the map, T, in Eq. (22.13) is isometric. So to nish the proof
we need only prove T is surjective.
By the Riesz theorem, every element of H

s
may be written in the form (, F)
s
for a unique element F H
s
. So we must nd f H
s
such that hf, gi = (g, F)
s
for all g H
s
, i.e.
((1)
s/2
f, (1 )
s/2
g)
0
= hf, gi = (g, F)
s
= ((1)
s/2
g, (1)
s/2
F)
0
g H
s
428 BRUCE K. DRIVER

from which we conclude


(1 )
s/2
f = (1 )
s/2
F.
So
f := (1 )
s/2
(1 )
s/2
F = (1 )
s

F H
s
is the desired function.
Lemma 22.15. Useful inequality: kf gk
2
kfk
1
kgk
2
. (Already proved some-
where else.)
Proof. We will give two the proofs, the rst is
kf gk
2
= k

f gk
2
k

fk

kgk
2
kfk
1
kgk
2
and the second is
kf gk
2
2
=
Z

Z
f(x y)g(y)dy

2
dx

Z Z
|f(x y)| |g(y)|dy

2
dx

Z Z
|f(x y)|
2
|g(y)|dy
Z
1
2
|g(y)|dy

dx
= kfk
2
2
kgk
2
1
.
Lemma 22.16 (Rellichs). For s < t in R, the inclusion map i : H
s
H
t
is
locally compact in the sense that if {f
l
}

l=1
H
s
is a sequence of distributions
such that supp(f
l
) K @@ R
n
for all l and sup
l
|f
l
|
s
= C < , then there exists
a subsequence of {f
l
}

l=1
which is convergent in H
t
.
Proof. Recall for C

c
(R
n
) S, S and hence, for all k N, there exists
C
k
< such that
| ()| C
k
(1 +||)
k
.
Choose C

c
(R
n
) such that 1 on a neighborhood of K @@ R
n
so that
f
l
= f
l
for all l. We then have
|

f
l
()| = |

f
l
|()
Z
|

f
l
()| | ( )|d (22.16)
C
k
Z
|

f
l
()|(1 +||)
s
(1 +||)
s
(1 +| |)
k
d
C
k
|f
l
|
s
Z
(1 +||)
2s
(1 +| |)
2k
d
1
2
C
k
|f
l
|
s
Z
(1 +||)
2s2k
d
1
2
(1 +||)
k
wherein the last inequality we have used Peetres inequality (Lemma 30.31). Since
R
(1 + ||)
2s2k
d < if k is chosen so that 2s + 2k > n, we have shown there
exists

C
k
< for all k >
ns
2
such that learn that
|

f
l
()|

C
k
|f
l
|
s
(1 +||)
k
for all R
n
.
ANALYSIS TOOLS WITH APPLICATIONS 429
Because D
i

f
l
() =

f
l
D
i
, the same argument shows (by increasing

C
k
if necessary)
that there exists

C
k
< for all k >
ns
2
such that
|D
i

f
l
|()

C
k
(1 +||)
k
for all R
n
.
The Ascolli-Arzela Theorem 3.59 now allows us to conclude there exists a subse-
quence

f
l
which is convergent uniformly on compact subsets of R
n
. For notational
simplicity we will continue to denote this subsequence by {f
l
} . For any M (0, ),
Z
||M
|

f
l


f
m
|
2
()(1 +||
2
)
t
d =
Z
||M
|

f
l


f
m
|
2
()(1 +||
2
)
s
(1 +||
2
)
ts
d
(1 +M
2
)
ts
|f|
2
s
=
1
(1 +M
2
)
ts
|f|
2
s
and
|f
l
f
m
|
2
t
=
Z
||M
|

f
l


f
m
|
2
()(1 +||
2
)
t
d +
Z
||M
|

f
l


f
m
|
2
()(1 +||
2
)
t
d.
Using these equations and the uniform convergence on compact just proved,
limsup
l,m
|f
l
f
m
|
2
t
limsup
l,m
Z
||M
|

f
l


f
m
|
2
()(1 +||
2
)
t
d

1
(1 +M
2
)
ts
|f|
2
s
0 as M .
Therefore {f
l
}

l=1
is Cauchy and hence convergent.
22.2. Examples.
Example 22.17. Let H

be given by

() = 1, then H

n
s

for any
> 0 and h, fi
R

f()d = f(0). That is to say is the delta distribution.
Example 22.18. (P(D
x
))

() = p(). So hP(D
x
), fi =
R
P()

f()d =
(P(D
x
)f)(0).
Example 22.19. Let g H


=
S
s0
H

s
. Then D

x
g H

and
hD

x
g, fi =
Z

g()

f()d =
Z
g()


f()d (22.17)
=
Z
g()((D
x
)

f)

()d
= hg, (D
x
)

fi
Note If H

. Then
0
= implies
b

0
= 1 or i

() = 1 implies

() =
1
i
/
L
1
loc
implies

/ H

. So hP(D
x
)g, fi = hg, P(D
x
)fi for all g H

and f S.
General Idea Suppose
S
s0
H

s
, how do we compute

. Recall

() H

and h, fi =
R

()

f()d = h

,

fi. Replace f

f implies h

, fi = h,

fi. So if

S
s0
H

s
, then the function f in

is characterized by
(22.18) h

, fi = h,

fi
430 BRUCE K. DRIVER

Example 22.20. Say h, fi = f(0). Then h

, fi = h,

fi =

f(0) =
R
f()d
R

()f(x)d. implies

() = 1.
Example 22.21. Take n = 3. Consider h
1
|x|
, fi
R
1
|x|
f(x)dx for f S. Claim
(False) h
1
|x|
, i H

and

1
|x|

() = 4
1
||
2
and
1
|x|
= 4 not zero.
Proof.
D

1
|x|

, f
E
=
D
1
|x|
,
Z

f(x)
E
=
Z
1
|x|

(x) dx
= lim
R%
Z
|x|R
1
|x|

f(x)dx = lim
R%
Z

|x|R
1
|x|

f
= lim
R%
Z

|x|R
1
|x|

()f()d
Now

|x|R
1
|x|

() =
Z
|x|R
1
|x|
e
ix
d
= 2
Z
R
0
dr
Z
1
1
d cos e
+ir|| cos
r
=
Z
R
0
dr 2
e
ir||
e
ir||
i||
= 4
Z
R
0
dr sin(r||)
= 4
cos(r||)
||
2

R
0
=
4
||
2
(cos(R||) 1)
So
D
1
|x|

, f
E
= 4

R
1
|xi|
2

f()d lim
R%
R
cos(R||)

f()
||
2
d

Claim 3.
(22.19) lim
R
Z
R
n
cos(R||)
g()
||
2
d = 0
Proof. Let I
R
:=
R
R
n
cos(R||)
g()
||
2
d which in polar coordinates may be written
as
I
R
=
Z
cos(Rt)
g(t, , )
t
2
t
2
dtd cos d
=
Z

0
cos(Rt)f(t)dt
where f L
1
. The result follows by Riemann Lebesgue Lemma. lim
R
I
R
= 0. So
we have nally shown

1
|x|

() =
4
||
2
/ L
2
. As a Corollary

1
|x|

= ||
2
4
||
2
= 4 = 4

() .
So
1
|x|
= 4 and not 0 as a naive direct calculation would show.
ANALYSIS TOOLS WITH APPLICATIONS 431
Example 22.22. Set g(x) =
[1,1]
(x). Then |hg, fi| 2kfk

C|f|
s
for s >
n/2. implies g H

s
= H
s
what are g and
dg
dx
? Answer:
D
dg
dx
, f
E
=
Z
g()

f() =
Z
g()(D
x
f)

()d
= hg, D
x
fi =
1
i
Z
1
1
f
0
(x)dx =
1
i
(f(r) f(r)
Taking the Fourier transform of the equation D
x
g = i(
1

1
) gives
g() = 2i
e
i
e
i
2i
= 2 sin().
which shows
(22.20) g() =
2 sin()

L
2
= H
0
.
Note D
x
g 6 0.
Second Method of Computation. Let f
l
(x) S = f
l
g L
2
then on one
hand
hD
x
f
l
, fi = hf
l
, D
x
fi hg, D
x
fi =
Z
1
1
D
x
f(x)dx
= i(f(1) f(1)) = i(
1

1
, f)
while on the other hand
hD
x
f
l
, fi hD
x
g, fi H
1
.
Combining these two equations shows that D
x
g = i(
1

1
).
22.3. Summary of operations on H

.
Example 22.23. hD

x
g, fi = hg, (D
x
)

fi for all g H

and f S. Suppose
h C

such that h and all its derivatives have at most polynomial growth then
M
h
: S S and M
h
extends to H

.
Lemma 22.24. For all f S the sum f


P
ytZ
n

n
f(y)
y
converges in H
s
for
all s >
n
2
. Furthermore lim
0
|f

f|
s
= 0.
Proof. Let Z
n
be a nite set, put g


P
y

n
f(y)y. Then
(22.21) |g

|
2
s

Z

X
y

n
f(y)e
iy

2
d
s
()
where d
s
(x) (1 + ||
2
)
s
d for all s R. Therefore f S we know |F(y)|
c(1 +|y|)
m
so
|g

|
2
s
c
n
Z

X
y
(1 +|y|)
m

2
d
s
()
c
n

X
y
(1 +|y|)
m

2
.
432 BRUCE K. DRIVER

Now
P
yZ
n
(1 + |y|)
m
< if m > n. Therefore if and are two nite subsets
of Z
n
,
|g

|
2
s
= |g

|
2
s
0 as , % Z
n
.
So the sums exists. Now consider
|f f

|
2
s
=
Z
|

f()
X

n
f(y)e
iy
|
2
d
s
().
Set

f

(x) = f(y) if |x y|
box


z
where y Z
n
. Then

f() =
X

n
f(y)e
iy
.
So |f f

|
2
s
=
R
|

f()

f()|
2
d
s
(). Now
|

f()

f()|
Z
|f(x)

f

(x)|dx 0 as 0.
So |f f

|
2
s
0 as 0 by dominated convergence theorem.
Lemma 22.25. The map x R
n

x
H
s
is C
k
for all s >
n
2
+k.
Proof. Since : H
s
L
2
(d
s
) is a unitary map it suces to prove that the
map
x R
n
e
ix
f(x)() L
2
(d
s
)
is C
k
. So we will show x f(x)() = e
ix
is C
k
.
Consider
f(x +te
1
) f(x)
x
=
1
t
Z
1
0
d
ds
f(x +ste
1
)ds
=
1
t
(i
1
t)
Z
1
0
f(x +ste
1
)dx.
So (Fix)
(22.22)

f(x +te
1
) f(x)
x
i
1
f(x)

s
=
Z

Z
1
0
(e
i(x+ste
1
)
e
ix
)ds

2
1
d
s
().
This shows
f
x
1
(x) exists, and this derivative is easily seen to be continuous in
L
2
(d
s
) norm. The other derivatives may be computed similarly.
Proposition 22.26. Suppose K : H
s
H
s
is a bounded operator and s >
n
2
+k
for some K = 0, 1, 2, . . . . Then exists a C
k
b
-function k(x, y) such that (Kf)(x) =
R
k(x, y)f(y)dy for all F S. Furthermore
(22.23) |k|
,k
C(s)kKk
H
s
H
s
.
Corollary 22.27. If K : H

H
+
then k(x, y) is C

.
Proof. Dene k(x, y) hk
y
,
x
i
Claim 4. k is C
k
.
ANALYSIS TOOLS WITH APPLICATIONS 433
Reasons:
R
n
R
n
C
k
H
s
H
s
C

H
s
H
s
C

C
(x, y) (
x
,
y
) (K
y
,
x
) hK
y
,
x
i
so k(x, y) is the composition of two C

maps and a C
k
-map. So k(x, y) is C
k
.
Note |k(x, y)| C(s)kKk
s,s
. So k is bounded.
Claim 5. For F S, Kf(x) =
R
k(x, y)f(y)dy. Indeed,
Z
k(x, y)f(y)dy = lim
0

n
X
yZ
n
k(x, y)(y) (22.24)
= hK lim
0
X

n
hf(y)
y
,
x
i
= hKf,
x
i = (Kf)(x).
Finally:

x
D

y
k(x, y)

= |hKD

y
, D

x
i| (22.25)
= |hK (D)

y
, (D)

x
i|
kKk
s,s
|D

y
|
s
|D

x
|
s
C(s)kKk
s,s
implies

x
D

y
k(x, y)

C(s)kKk
s,s
if ||, || k.
22.4. Application to Dierential Equations.
22.4.1. Dirichlet problem . Consider the following Dirichlet problem in one dimen-
sion written in Divergence form as
Lf(x) :=
d
dx
(a(x)
df
dx
(x)) = g(x) where a C

([0, 1], (0, )), (22.26)


f C
2
([0, 1], R) such that f(0) = f(1) = 0 and g C
0
([0, 1], R).
Theorem 22.28. There exists a solution to (22.26).
Proof. Suppose f solves (22.26) and C
1
([0, 1]), R) such that (0) = (1) =
0. Then
(f, )
a
=
Z
1
0
a(x)f
0
(x)
0
(x)dx =
Z
1
0
g(x)(x)dx =:
g
().
Dene
H {f AC([0, 1], R) : f(0) = f(1) = 0 and (f, f)
Z
1
0
|f
0
(x)|
2
dx < }.
Since
|f(x)| =

Z
x
0
f
0
(y)dy

Z
1
0
f
0
(y)1
[0,x]
(y)dy

kf
0
k
2

x kf
0
k
2
we nd conclude the following Poincar inequality holds,
kfk
2
kfk

kf
0
k
2
= kfk
p
(f, f).
In particular this shows that k k is a norm. Since that map f H f
0

L
2
([0, 1]) is unitary, it follows that H is complete, i.e. H is a Hilbert space. Also
k
g
()k kgk
2
kk
2
kgk
2
kk
2
434 BRUCE K. DRIVER

which implies
g
: H R is bounded and linear. We also notice that (, )
a
is an
equivalent inner product on H so by the Riesz theorem, there exists f H such
that
(f, )
a
=
g
() =
Z
1
0
g(x)(x)dx
for all H i.e.
(22.27)
Z
1
0
a(x)f
0
(x)
0
(x)dx =
Z
1
0
g(x)(x)dx.
At this point we have produced a so called weak solution of (22.26).
Let G(x) =
R
x
0
g(y)dy so G
0
(x) = g(x) a.e. Then by integration by parts
(Justication: See Theorem 3.30 and Proposition 3.31),
Z
1
0
g(x)(x)dx =
Z
1
0
G
0
(x)(x)dx =
Z
1
0
G(x)
0
(x)dx.
Using this in Eq. (22.27) we learn
Z
1
0
[a(x)f
0
(x) G(x)]
0
(x)dx = 0 for all H
By Lemma 22.29 below, this implies there is a constant C such that a(x)f
0
(x) +
G(x) = C for almost every x. Solving this equation givesf
0
(x) = (C G(x)) /a(x)
a.e. or
f(x) =
Z
x
0
C G(y)
a(y)
C
2
([0, 1])
showing f is in fact a strong solution.
Lemma 22.29. Suppose h L
1
([0, 1], dx) and
R
1
0
h(x)
0
(x)dx = 0 for all
C

c
((0, 1)) ten h = constant a.e.
Proposition 22.30. Suppose f is C
2
, f(0) = 0 = f(1) and f
00
= g C
0
([0, 1])
then
f(x) =
Z
1
0
k(x, y)g(y)dy
where
k(x, y) =

x(1 y) x y
y(1 x) x y.
Proof. By the fundamental theorem of calculus, f
0
(x) = f
0
(0) +
R
x
0
g(y)dy and
therefore
f(x) = 0 +f
0
(0)x +
Z
x
0
dy
Z
y
0
dzg(y)
= f(0)x +
Z
1
zyx
g(z)dz dy
= f
0
(0)x +
Z
x
0
(x z)g(z)dz.
Since
0 = f(1) = f
0
(0) +
Z
1
0
(1 z)g(z)dz
ANALYSIS TOOLS WITH APPLICATIONS 435
we have
f(x) =
Z
1
0
[1
zx
(x z) x(1 z)
| {z }
k(x,z)
]g(z)dz
So if we let
(Kg)(x) =
Z
1
0
k(x, y)g(y)dy
then we have shown K

d
2
dx
2

= I.
Exercise 22.1. (See previous test) Show
d
2
dx
2
K = I.
436 BRUCE K. DRIVER

23. Sobolev Spaces


Denition 23.1. For p [1, ], k N and an open subset of R
d
, let
W
k,p
loc
() := {f L
p
() :

f L
p
loc
() (weakly) for all || k} ,
W
k,p
() := {f L
p
() :

f L
p
() (weakly) for all || k} ,
(23.1) kfk
W
k,p
()
:=
_
_
X
||k
k

fk
p
L
p
()
_
_
1/p
if p <
and
(23.2) kfk
W
k,p
()
=
X
||k
k

fk
L

()
if p = .
In the special case of p = 2, we write W
k,2
loc
() =: H
k
loc
() and W
k,2
() =: H
k
()
in which case kk
W
k,2
()
= kk
H
k
()
is a Hilbertian norm associated to the inner
product
(23.3) (f, g)
H
k
()
=
X
||k
Z

g dm.
Theorem 23.2. The function, kk
W
k,p
()
, is a norm which makes W
k,p
() into a
Banach space.
Proof. Let f, g W
k,p
(), then the triangle inequality for the p norms on
L
p
() and l
p
({ : || k}) implies
kf +gk
W
k,p
()
=
_
_
X
||k
k

f +

gk
p
L
p
()
_
_
1/p

_
_
X
||k
h
k

fk
L
p
()
+k

gk
L
p
()
i
p
_
_
1/p

_
_
X
||k
k

fk
p
L
p
()
_
_
1/p
+
_
_
X
||k
k

gk
p
L
p
()
_
_
1/p
= kfk
W
k,p
()
+kgk
W
k,p
()
.
This shows kk
W
k,p
()
dened in Eq. (23.1) is a norm. We now show completeness.
If {f
n
}

n=1
W
k,p
() is a Cauchy sequence, then {

f
n
}

n=1
is a Cauchy
sequence in L
p
() for all || k. By the completeness of L
p
(), there exists
g

L
p
() such that g

= L
p
lim
n

f
n
for all || k. Therefore, for all
C

c
(),
hf,

i = lim
n
hf
n
,

i = (1)
||
lim
n
h

f
n
, i = (1)
||
lim
n
hg

, i.
This shows

f exists weakly and g

f a.e. This shows f W


k,p
() and that
f
n
f W
k,p
() as n .
ANALYSIS TOOLS WITH APPLICATIONS 437
Example 23.3. Let u(x) := |x|

for x R
d
and R. Then
Z
B(0,R)
|u(x)|
p
dx =

S
d1

Z
R
0
1
r
p
r
d1
dr =

S
d1

Z
R
0
r
dp1
dr
=

S
d1

(
R
dp
dp
if d p > 0
otherwise
(23.4)
and hence u L
p
loc

R
d

i < d/p. Now u(x) = |x|


1
x where x := x/ |x| .
Hence if u(x) is to exist in L
p
loc

R
d

it is given by |x|
1
x which is in
L
p
loc

R
d

i + 1 < d/p, i.e. if < d/p 1 =


dp
p
. Let us not check that
u W
1,p
loc

R
d

provided < d/p 1. To do this suppose C

c
(R
d
) and > 0,
then
hu,
i
i = lim
0
Z
|x|>
u(x)
i
(x)dx
= lim
0
(
Z
|x|>

i
u(x)(x)dx +
Z
|x|=
u(x)(x)
x
i

d(x)
)
.
Since

Z
|x|=
u(x)(x)
x
i

d(x)

kk

S
d1

d1
0 as 0
and
i
u(x) = |x|
1
x e
i
is locally integrable we conclude that
hu,
i
i =
Z
R
d

i
u(x)(x)dx
showing that the weak derivative
i
u exists and is given by the usual pointwise
derivative.
23.1. Mollications.
Proposition 23.4 (Mollication). Let be an open subset of R
d
, k N
0
:=
N{0} , p [1, ) and u W
k,p
loc
(). Then there exists u
n
C

c
() such that
u
n
u in W
k,p
loc
().
Proof. Apply Proposition 19.12 with polynomials, p

() =

, for || k.
Proposition 23.5. C

c
(R
d
) is dense in W
k,p
(R
d
) for all 1 p < .
Proof. The proof is similar to the proof of Proposition 23.4 using Exercise 19.2
in place of Proposition 19.12.
Proposition 23.6. Let be an open subset of R
d
, k N
0
:= N{0} and p 1,
then
(1) for any with || k,

: W
k,p
() W
k||,p
() is a contraction.
(2) For any open subset V , the restriction map u u|
V
is bounded from
W
k,p
() W
k,p
(V ) .
(3) For any f C
k
() and u W
k,p
loc
(), the fu W
k,p
loc
() and for || k,
(23.5)

(fu) =
X

u
where

:=
!
!()!
.
438 BRUCE K. DRIVER

(4) For any f BC


k
() and u W
k,p
loc
(), the fu W
k,p
loc
() and for || k
Eq. (23.5) still holds. Moreover, the linear map u W
k,p
() fu
W
k,p
() is a bounded operator.
Proof. 1. Let C

c
() and u W
k,p
() , then for with || k || ,
h

u,

i = (1)
||
hu,

i = (1)
||
hu,
+
i = (1)
||
h
+
u, i
from which it follows that

u) exists weakly and

u) =
+
u. This shows
that

u W
k||,p
() and it should be clear that k

uk
W
k||,p
()
kuk
W
k,p
()
.
Item 2. is trivial.
3 - 4. Given u W
k,p
loc
() , by Proposition 23.4 there exists u
n
C

c
() such
that u
n
u in W
k,p
loc
() . From the results in Appendix A.1, fu
n
C
k
c
()
W
k,p
() and
(23.6)

(fu
n
) =
X

u
n
holds. Given V
o
such that

V is compactly contained in , we may use the
above equation to nd the estimate
k

(fu
n
)k
L
p
(V )

X

(V )

u
n

L
p
(V )
C

(f, V )
X

u
n

L
p
(V )
C

(f, V ) ku
n
k
W
k,p
(V )
wherein the last equality we have used Exercise 23.1 below. Summing this equation
on || k shows
(23.7) kfu
n
k
W
k,p
(V )
C(f, V ) ku
n
k
W
k,p
(V )
for all n
where C(f, V ) :=
P
||k
C

(f, V ). By replacing u
n
by u
n
u
m
in the above
inequality it follows that {fu
n
}

n=1
is convergent in W
k,p
(V ) and since V was
arbitrary fu
n
fu in W
k,p
loc
(). Moreover, we may pass to the limit in Eq. (23.6)
and in Eq. (23.7) to see that Eq. (23.5) holds and that
kfuk
W
k,p
(V )
C(f, V ) kuk
W
k,p
(V )
C(f, V ) kuk
W
k,p
()
Moreover if f BC () then constant C(f, V ) may be chosen to be independent
of V and therefore, if u W
k,p
() then fu W
k,p
().
Alternative direct proof of 4. We will prove this by induction on || . If
= e
i
then, using Lemma 19.9,
hfu,
i
i = hu, f
i
i = hu,
i
[f]
i
f i
= h
i
u, fi +hu,
i
f i = hf
i
u +
i
f u, i
showing
i
(fu) exists weakly and is equal to
i
(fu) = f
i
u +
i
f u L
p
() .
Supposing the result has been proved for all such that || m with m [1, k).
Let = +e
i
with || = m, then by what we have just proved each summand in
Eq. (23.5) satises
i

exists weakly and

=
+e
i
f

u +

i
f
+e
u L
p
() .
ANALYSIS TOOLS WITH APPLICATIONS 439
Therefore

(fu) =
i

(fu) exists weakly in L


p
() and

(fu) =
X

+e
i
f

u +

f
+e
i
u

=
X

.
For the last equality see the combinatorics in Appendix A.1.
Theorem 23.7. Let be an open subset of R
d
, k N
0
:= N{0} and p [1, ).
Then C

() W
k,p
() is dense in W
k,p
().
Proof. Let
n
:= {x : dist(x, ) > 1/n} B(0, n) , then

n
{x : dist(x, ) 1/n} B(0, n)
n+1
,

n
is compact for every n and
n
as n . Let V
0
=
3
, V
j
:=
j+3
\

j
for
j 1, K
0
:=

2
and K
j
:=

j+2
\
j+1
for j 1 as in gure 41. Then K
n
@@ V
n
1
0

3
1
2
0
Figure 41. Decomposing into compact pieces. The compact
sets K
0
, K
1
and K
2
are the shaded annular regions while V
0
, V
1
and V
2
are the indicated open annular regions.
for all n and K
n
= . Choose
n
C

c
(V
n
, [0, 1]) such that
n
= 1 on K
n
and
set
0
=
0
and

j
= (1
1

j1
)
j
=
j
j1
Y
k=1
(1
k
)
for j 1. Then
j
C

c
(V
n
, [0, 1]),
1
n
X
k=0

k
=
n
Y
k=1
(1
k
) 0 as n
so that
P

k=0

k
= 1 on with the sum being locally nite.
Let > 0 be given. By Proposition 23.6, u
n
:=
n
u W
k,p
() with
supp(u
n
) @@ V
n
. By Proposition 23.4, we may nd v
n
C

c
(V
n
) such that
440 BRUCE K. DRIVER

ku
n
v
n
k
W
k,p
()
/2
n+1
for all n. Let v :=
P

n=1
v
n
, then v C

() because
the sum is locally nite. Since

X
n=0
ku
n
v
n
k
W
k,p
()


X
n=0
/2
n+1
= < ,
the sum
P

n=0
(u
n
v
n
) converges in W
k,p
() . The sum,
P

n=0
(u
n
v
n
) , also
converges pointwise to u v and hence u v =
P

n=0
(u
n
v
n
) is in W
k,p
() .
Therefore v W
k,p
() C

() and
ku vk

X
n=0
ku
n
v
n
k
W
k,p
()
.
Theorem 23.8 (Density of W
k,p
() C

in W
k,p
()). Let R
d
be a
manifold with C
0
boundary, then for k N
0
and p [1, ), W
k,p

is dense in W
k,p

. This may alternatively be stated by assuming R


d
is
an open set such that

=
0
and

is a manifold with C
0
boundary, then
W
k,p
() C

is dense in W
k,p
() .
Before going into the proof, let us point out that some restriction on the boundary
of is needed for assertion in Theorem 23.8 to be valid. For example, suppose

0
:=

x R
2
: 1 < |x| < 2

and :=
0
\ {(1, 2) {0}}
and : (0, 2) is dened so that x
1
= |x| cos (x) and x
2
= |x| sin(x),
see Figure 42. Then BC

() W
k,
() for all k N
0
yet can not be
Figure 42. The region
0
along with a vertical in .
approximated by functions from C

BC

(
0
) in W
1,p
() . Indeed, if this
were possible, it would follows that W
1,p
(
0
) . However, is not continuous
(and hence not absolutely continuous) on the lines {x
1
= } for all (1, 2)
and so by Theorem 19.30, / W
1,p
(
0
) .
The following is a warm-up to the proof of Theorem 23.8.
Proposition 23.9 (Warm-up). Let := H
d
:=

x R
d
: x
d
> 0

and C

)
denote those u C

which are restrictions of C

functions dened on an open


neighborhood of

. Then for p [1, ), C

) W
k,p
() is dense in W
k,p
() .
ANALYSIS TOOLS WITH APPLICATIONS 441
Proof. Let u W
k,p
() and for s > 0 let u
s
(x) := u(x +se
d
). Then it is easily
seen that u
s
W
k,p
( se
d
) and for || k that

u
s
= (

u)
s
because for
C

c
(se
d
) ,
h

u
s
, i = hu
s
, ()

i =
Z
R
d
u(x +se
d
) ()

(x)dx
=
Z
R
d
u(x) ()

(x se
d
)dx =
Z
R
d

u(x)(x se
d
)dx
=
Z
R
d
(

u) (x +se
d
)(x)dx = h(

u)
s
, i.
This result and by the strong continuity of translations in L
p
(see Proposition
11.13), it follows that lim
s0
ku u
s
k
W
k,p
()
= 0. By Theorem 23.7, we may choose
v
s
C

( se
d
) C

such that kv
s
u
s
k
W
k,p
()
s for all s > 0. Then
kv
s
uk
W
k,p
()
kv
s
u
s
k
W
k,p
()
+ku
s
uk
W
k,p
()
0 as s 0.
23.1.1. Proof of Theorem 23.8. Proof. By Theorem 23.7, it suces to show than
any u C

() W
k,p
() may be approximated by C

. To understand the
main ideas of the proof, suppose that is the triangular region in Figure 43 and
suppose that we have used a partition of unity relative to the cover shown so that
u = u
1
+ u
2
+ u
3
with supp(u
i
) B
i
. Now concentrating on u
1
whose support is
Figure 43. Splitting and moving a function in C

() so that
the result is in C

.
depicted as the grey shaded area in Figure 43. We now simply translate u
1
in the
direction v shown in Figure 43. That is for any small s > 0, let w
s
(x) := u
1
(x+sv),
then v
s
lives on the translated grey area as seen in Figure 43. The function w
s
extended to be zero o its domain of denition is an element of C

moreover
it is easily seen, using the same methods as in the proof of Proposition 23.9, that
w
s
u
1
in W
k,p
() .
The formal proof follows along these same lines. To do this choose an at most
countable locally nite cover {V
i
}

i=0
of

such that

V
0
and for each i 1,
after making an ane change of coordinates, V
i
= (, )
d
for some > 0 and
V
i


= {(y, z) V
i
: > z > f
i
(y)}
442 BRUCE K. DRIVER

where f
i
: (, )
d1
(, ), see Figure 44 below. Let {
i
}

i=0
be a partition of

Figure 44. The shaded area depicts the support of u


i
= u
i
.
unity subordinated to {V
i
} and let u
i
:= u
i
C

(V
i
) . Given > 0, we choose
s so small that w
i
(x) := u
i
(x+se
d
) (extended to be zero o its domain of denition)
may be viewed as an element of C

) and such that ku


i
w
i
k
W
k,p
()
< /2
i
. For
i = 0 we set w
0
:= u
0
= u
0
. Then, since {V
i
}

i=1
is a locally nite cover of

, it
follows that w :=
P

i=0
w
i
C

and further we have

X
i=0
ku
i
w
i
k
W
k,p
()


X
i=1
/2
i
= .
This shows
u w =

X
i=0
(u
i
w
i
) W
k,p
()
and ku wk
W
k,p
()
< . Hence w C

W
k,p
() is a approximation of
u and since > 0 arbitrary the proof is complete.
23.2. Dierence quotients.
Theorem 23.10. Suppose k N
0
, is a precompact open subset of R
d
and V is
an open precompact subset of .
(1) If 1 p < u W
k,p
() and
i
u W
k,p
(), then
(23.8) k
h
i
uk
W
k,p
(V )
k
i
uk
W
k,p
()
for all 0 < |h| <
1
2
dist(V,
c
).
(2) Suppose that 1 < p , u W
k,p
() and assume there exists a constant
C(V ) < such that
k
h
i
uk
W
k,p
(V )
C(V ) for all 0 < |h| <
1
2
dist(V,
c
).
Then
i
u W
k,p
(V ) and k
i
uk
W
k,p
(V )
C(V ). Moreover if C :=
sup
V
C(V ) < then in fact
i
u W
k,p
() and there is a constant
c < such that
k
i
uk
W
k,p
()
c

C +kuk
L
p
()

.
ANALYSIS TOOLS WITH APPLICATIONS 443
Proof. 1. Let || k, then
k

h
i
uk
L
p
(V )
= k
h
i

uk
L
p
(V )
k
i

uk
L
p
()
wherein we have used Theorem 19.22 for the last inequality. Eq. (23.8) now easily
follows.
2. If k
h
i
uk
W
k,p
(V )
C(V ) then for all || k,
k
h
i

uk
L
p
(V )
= k

h
i
uk
L
p
(V )
C(V ).
So by Theorem 19.22,
i

u L
p
(V ) and k
i

uk
L
p
(V )
C(V ). From this we
conclude that k

uk
L
p
(V )
C(V ) for all 0 < || k+1 and hence kuk
W
k+1,p
(V )

c

C(V ) +kuk
L
p
(V )

for some constant c.


23.3. Application to regularity.
Denition 23.11 (Negative order Sobolev space). Let H
1
() = H
1
()

and
recall that
kuk
H
1
()
:= sup
H
1
()
|hu, i|
kk
H
1
()
.
When = R
d
, C

R
d

is dense in H
1
(R
d
) and hence
kuk
H
1
(R
d
)
:= sup
C

c
(R
d
)
|hu, i|
kk
H
1
(R
d
)
and we may identify H
1

R
d

with

u D
0
(R
d
) : uk
H
1
()
<

D
0
(R
d
).
Theorem 23.12. Suppose u H
1
(R
d
) and 4u H
k
(R
d
) for k {0, 1, 2, . . . }
then u H
k+2
(R
d
).
Proof. Fourier transform proof. Since (1 + ||
2
) + ||
2
(1 + ||
2
)
k
(1 +
||
2
)
k+2
we are given
u() L
2
((1 +||
2
)d) and ||
2
u() L
2
((1 +||
2
)
k
d).
But this implies u H
k+2
(R
d
).
Proof with out the Fourier transform. For u H
1
(R
d
),
kuk
H
1 =
s
Z
R
d
(|u|
2
+u
2
)dm = sup
C

c
(R
d
)

R
R
d
(u +u)dm

kk
H
1
(R
d
)
= sup
C

c
(R
d
)
|hu +u, i|
kk
H
1
(R
d
)
= k(4+ 1)uk
H
1
(R
d
)
(23.9)
which shows (4+ 1) : H
1
(R
d
) H
1
(R
d
) is an isometry.
Now suppose that u H
1
and (4+ 1)u L
2
H
1
(R
d
). Then
k
h
i
uk
H
1 = k(4+ 1)
h
i
uk
H
1 = sup
kk
H
1
=1
|h
h
i
u, (4+ 1)i|
= sup
kk
H
1
=1
|hu,
h
i
(4+ 1)i| = sup
kk
H
1
=1
{h(4+ 1)u,
h
i
i}
sup
kk
H
1
=1
k(4+ 1)uk
L
2 k
h
i
k
L
2 = sup
kk
H
1
=1
k(4+ 1)uk
L
2 kk
L
2
k(4+ 1)uk
L
2.
444 BRUCE K. DRIVER

Therefore by Theorem 23.10


i
u H
1
and since this is true for i = 1, 2, . . . , d,
u H
2
and
kuk
H
1
Ck(4+ 1)uk
L
2.
Combining this with Eq. (23.9) allows us to conclude
kuk
H
2 Ck(4+ 1)uk
L
2.
The argument may now be repeated. For example if 4u H
1
, then u H
2
and

h
i
u H
2
and
k
h
i
uk
H
2 k(4+ 1)
h
i
uk
L
2 Ck
h
i
(4+ 1)uk
L
2 Ck(4+ 1)uk
H
1.
Therefore u H
3
and kuk
H
2 Ck(4 + 1)uk
H
1 and so kuk
H
3 Ck(4 +
1)uk
H
1.
23.4. Sobolev Spaces on Compact Manifolds.
Theorem 23.13 (Change of Variables). Suppose that U and V are open subsets
of R
d
, C
k
(U, V ) be a C
k
dieomorphism such that k

k
BC(U)
< for all
1 || k and := inf
U
|det
0
| > 0. Then the map

: W
k,p
(V ) W
k,p
(U)
dened by u W
k,p
(V )

u := u W
k,p
(U) is well dened and is bounded.
Proof. For u W
k,p
(V ) C

(V ) , repeated use of the chain and product rule


implies,
(u )
0
= (u
0
)
0
(u )
00
= (u
0
)
0

0
+ (u
0
)
00
= (u
00
)
0

0
+ (u
0
)
00
(u )
(3)
=

u
(3)

0
+ (u
00
) (
0

0
)
0
+ (u
00
)
0

00
+ (u
0
)
(3)
.
.
.
(u )
(l)
=

u
(l)

l times
z }| {
+
l1
X
j=1

u
(j)

p
j

0
,
00
, . . . ,
(l+1j)

.
(23.10)
This equation and the boundedness assumptions on
(j)
for 1 j k implies there
is a nite constant K such that

(u )
(l)

K
l
X
j=1

u
(j)

for all 1 l k.
By Hlders inequality for sums we conclude there is a constant K
p
such that
X
||k
|

(u )|
p
K
p
X
||k
|

u|
p

and therefore
ku k
p
W
k,p
(U)
K
p
X
||k
Z
U
|

u|
p
((x)) dx.
Making the change of variables, y = (x) and using
dy = |det
0
(x)| dx dx,
ANALYSIS TOOLS WITH APPLICATIONS 445
we nd
ku k
p
W
k,p
(U)
K
p
X
||k
Z
U
|

u|
p
((x)) dx

K
p

X
||k
Z
V
|

u|
p
(y) dy =
K
p

kuk
p
W
k,p
(V )
. (23.11)
This shows that

: W
k,p
(V ) C

(V ) W
k,p
(U) C

(U) is a bounded
operator. For general u W
k,p
(V ) , we may choose u
n
W
k,p
(V ) C

(V ) such
that u
n
u in W
k,p
(V ) . Since

is bounded, it follows that

u
n
is Cauchy
in W
k,p
(U) and hence convergent. Finally, using the change of variables theorem
again we know,
k

u
n
k
p
L
p
(V )

1
ku u
n
k
p
L
p
(U)
0 as n
and therefore

u = lim
n

u
n
and by continuity Eq. (23.11) still holds for
u W
k,p
(V ) .
Let M be a compact C
k
manifolds without boundary, i.e. M is a compact
Hausdor space with a collection of charts in an atlas A such that x : D(x)
o
M R(x)
o
R
d
is a homeomorphism such that
x y
1
C
k
(y (D(x) D(y))) , x(D(x) D(y))) for all x, y A.
Denition 23.14. Let {x
i
}
N
i=1
A such that M =
N
i=1
D(x
i
) and let {
i
}
N
i=1
be a partition of unity subordinate do the cover {D(x
i
)}
N
i=1
. We now dene u
W
k,p
(M) if u : M C is a function such that
(23.12) kuk
W
k,p
(M)
:=
N
X
i=1

(
i
u) x
1
i

W
k,p
(R(x
i
))
< .
Since kk
W
k,p
(R(x
i
))
is a norm for all i, it easily veried that kk
W
k,p
(M)
is a norm
on W
k,p
(M).
Proposition 23.15. If f C
k
(M) and u W
k,p
(M) then fu W
k,p
(M) and
(23.13) kfuk
W
k,p
(M)
C kuk
W
k,p
(M)
where C is a nite constant not depending on u. Recall that f : M R is said to
be C
j
with j k if f x
1
C
j
(R(x), R) for all x A.
Proof. Since

f x
1
i

has bounded derivatives on supp(


i
x
1
i
), it follows
from Proposition 23.6 that there is a constant C
i
< such that

(
i
fu) x
1
i

W
k,p
(R(x
i
))
=

f x
1
i

(
i
u) x
1
i

W
k,p
(R(x
i
))
C
i

(
i
u) x
1
i

W
k,p
(R(x
i
))
and summing this equation on i shows Eq. (23.13) holds with C := max
i
C
i
.
Theorem 23.16. If {y
j
}
K
j=1
A such that M =
K
j=1
D(y
j
) and {
j
}
K
j=1
is a
partition of unity subordinate to the cover {D(y
j
)}
K
j=1
, then the norm
(23.14) |u|
W
k,p
(M)
:=
K
X
j=1

(
j
u) y
1
j

W
k,p
(R(y
j
))
is equivalent to the norm in Eq. (23.12). That is to say the space W
k,p
(M) along
with its topology is well dened independent of the choice of charts and partitions
of unity used in dening the norm on W
k,p
(M) .
446 BRUCE K. DRIVER

Proof. Since ||
W
k,p
(M)
is a norm,
|u|
W
k,p
(M)
=

N
X
i=1

i
u

W
k,p
(M)

N
X
i=1
|
i
u|
W
k,p
(M)
=
K
X
j=1

N
X
i=1
(
j

i
u) y
1
j

W
k,p
(R(y
j
))

K
X
j=1
N
X
i=1

(
j

i
u) y
1
j

W
k,p
(R(y
j
))
(23.15)
and since x
i
y
1
j
and y
j
x
1
i
are C
k
dieomorphism and the sets y
j
(supp(
i
) supp(
j
))
and x
i
(supp(
i
) supp(
j
)) are compact, an application of Theorem 23.13 and
Proposition 23.6 shows there are nite constants C
ij
such that

(
j

i
u) y
1
j

W
k,p
(R(y
j
))
C
ij

(
j

i
u) x
1
i

W
k,p
(R(x
i
))
C
ij

i
u x
1
i

W
k,p
(R(x
i
))
which combined with Eq. (23.15) implies
|u|
W
k,p
(M)

K
X
j=1
N
X
i=1
C
ij

i
u x
1
i

W
k,p
(R(x
i
))
C kuk
W
k,p
(M)
where C := max
i
P
K
j=1
C
ij
< . Analogously, one shows there is a constant K <
such that kuk
W
k,p
(M)
K|u|
W
k,p
(M)
.
Lemma 23.17. Suppose x A(M) and U
o
M such that U

U D(x), then
there is a constant C < such that
(23.16)

u x
1

W
k,p
(x(U))
C kuk
W
k,p
(M)
for all u W
k,p
(M).
Conversely a function u : M C with supp(u) U is in W
k,p
(M) i

u x
1

W
k,p
(x(U))
< and in any case there is a nite constant such that
(23.17) kuk
W
k,p
(M)
C

u x
1

W
k,p
(x(U))
.
Proof. Choose charts y
1
:= x, y
2
, . . . , y
K
A such that {D(y
i
)}
K
j=1
is an
open cover of M and choose a partition of unity {
j
}
K
j=1
subordinate to the cover
{D(y
j
)}
K
j=1
such that
1
= 1 on a neighborhood of

U. To construct such a partition
of unity choose U
j

o
M such that U
j


U
j
D(y
j
),

U U
1
and
K
j=1
U
j
= M
and for each j let
j
C
k
c
(D(y
j
), [0, 1]) such that
j
= 1 on a neighborhood of

U
j
. Then dene
j
:=
j
(1
0
) (1
j1
) where by convention
0
0. Then
{
j
}
K
j=1
is the desired partition, indeed by induction one shows
1
l
X
j=1

j
= (1
1
) (1
l
)
and in particular
1
K
X
j=1

j
= (1
1
) (1
K
) = 0.
ANALYSIS TOOLS WITH APPLICATIONS 447
Using Theorem 23.16, it follows that

u x
1

W
k,p
(x(U))
=

(
1
u) x
1

W
k,p
(x(U))

(
1
u) x
1

W
k,p
(R(y
1
))

K
X
j=1

(
j
u) y
1
j

W
k,p
(R(y
j
))
= |u|
W
k,p
(M)
C kuk
W
k,p
(M)
which proves Eq. (23.16).
Using Theorems 23.16 and 23.13 there are constants C
j
for j = 0, 1, 2 . . . , N such
that
kuk
W
k,p
(M)
C
0
K
X
j=1

(
j
u) y
1
j

W
k,p
(R(y
j
))
= C
0
K
X
j=1

(
j
u) y
1
1
y
1
y
1
j

W
k,p
(R(y
j
))
C
0
K
X
j=1
C
j

(
j
u) x
1

W
k,p
(R(y
1
))
= C
0
K
X
j=1
C
j

j
x
1
u x
1

W
k,p
(R(y
1
))
.
This inequality along with K applications of Proposition 23.6 proves Eq. (23.17).
Theorem 23.18. The space (W
k,p
(M), kk
W
k,p
(M)
) is a Banach space.
Proof. Let {x
i
}
N
i=1
A and {
i
}
N
i=1
be as in Denition 23.14 and choose U
i

o
M such that supp(
i
) U
i


U
i
D(x
i
). If {u
n
}

n=1
W
k,p
(M) is a Cauchy
sequence, then by Lemma 23.17,

u
n
x
1
i

n=1
W
k,p
(x
i
(U
i
)) is a Cauchy se-
quence for all i. Since W
k,p
(x
i
(U
i
)) is complete, there exists v
i
W
k,p
(x
i
(U
i
)) such
that u
n
x
1
i
v
i
in W
k,p
(x
i
(U
i
)). For each i let v
i
:=
i
( v
i
x
i
) and notice by
Lemma 23.17 that
kv
i
k
W
k,p
(M)
C

v
i
x
1
i

W
k,p
(x
i
(U
i
))
= C k v
i
k
W
k,p
(x
i
(U
i
))
<
so that u :=
P
N
i=1
v
i
W
k,p
(M). Since supp(v
i

i
u
n
) U
i
, it follows that
ku u
n
k
W
k,p
(M)
=

N
X
i=1
v
i

N
X
i=1

i
u
n

W
k,p
(M)

N
X
i=1
kv
i

i
u
n
k
W
k,p
(M)
C
N
X
i=1

[
i
( v
i
x
i
u
n
)] x
1
i

W
k,p
(x
i
(U
i
))
= C
N
X
i=1

i
x
1
i

v
i
u
n
x
1
i

W
k,p
(x
i
(U
i
))
C
N
X
i=1
C
i

v
i
u
n
x
1
i

W
k,p
(x
i
(U
i
))
0 as n
wherein the last inequality we have used Proposition 23.6 again.
23.5. Trace Theorems. For many more general results on this subject matter,
see E. Stein [7, Chapter VI].
448 BRUCE K. DRIVER

Lemma 23.19. Suppose k 1, H


d
:=

x R
d
: x
d
> 0


o
R
d
, u C
k
c

H
d

and D is the smallest constant so that supp(u) R


d1
[0, D]. Then there is a
constant C = C(p, k, D, d) such that
(23.18) kuk
W
k1,p
(H
d
)
C(p, D, k, d) kuk
W
k,p
(H
d
)
.
Proof. Write x H
d
as x = (y, z) R
d1
[0, ), then by the fundamental
theorem of calculus we have for any N
d1
0
with || k 1 that
(23.19)

y
u(y, 0) =

y
u(y, z)
Z
z
0

y
u
t
(y, t)dt.
Therefore, for p [1, )

y
u(y, 0)

p
2
p/q

y
u(y, z)

p
+

Z
z
0

y
u
t
(y, t)dt

2
p/q

y
u(y, z)

p
+
Z
z
0

y
u
t
(y, t)

p
dt |z|
q/p

2
p1

"

y
u(y, z)

p
+
Z
D
0

y
u
t
(y, t)

p
dt z
p1
#
where q :=
p
p1
is the conjugate exponent to p. Integrating this inequality over
R
d1
[0, D] implies
Dk

uk
p
L
p
(H
d
)
2
p1

uk
p
L
p
(H
d
)
+

+e
d
u

p
L
p
(H
d
)
D
p
p

or equivalently that
k

uk
p
L
p
(H
d
)
2
p1
D
1
k

uk
p
L
p
(H
d
)
+ 2
p1
D
p1
p

+e
d
u

p
L
p
(H
d
)
from which implies Eq. (23.18).
Similarly, if p = , then from Eq. (23.19) we nd
k

uk
L

(H
d
)
= k

uk
L

(H
d
)
+D

+e
d
u

(H
d
)
and again the result follows.
Theorem 23.20 (Trace Theorem). Suppose k 1 and
o
R
d
such that

is
a compact manifold with C
k
boundary. Then there exists a unique linear map
T : W
k,p
() W
k1,p
() such that Tu = u|

for all u C
k

.
Proof. Choose a covering {V
i
}
N
i=0
of

such that

V
0
and for each i 1,
there is C
k
dieomorphism x
i
: V
i
R(x
i
)
o
R
d
such that
x
i
( V
i
) = R(x
i
) bd(H
d
) and
x
i
( V
i
) = R(x
i
) H
d
as in Figure 45. Further choose
i
C

c
(V
i
, [0, 1]) such that
P
N
i=0

i
= 1 on a
ANALYSIS TOOLS WITH APPLICATIONS 449
( )
Figure 45. Covering (the shaded region) as described in the text.
neighborhood of

and set y
i
:= x
i
|
V
i
for i 1. Given u C
k

, we compute
ku|

k
W
k1,p
(

)
=
N
X
i=1

(
i
u) |

y
1
i

W
k1,p
(R(x
i
)bd(H
d
))
=
N
X
i=1

(
i
u) x
1
i

|
bd(H
d
)

W
k1,p
(R(x
i
)bd(H
d
))

N
X
i=1
C
i

(
i
u) x
1
i

W
k,p
(R(x
i
))
max C
i

N
X
i=1

(
i
u) x
1
i

W
k,p
(R(x
i
)H
d
)
+

(
0
u) x
1
0

W
k,p
(R(x
0
))
C kuk
W
k,p
()
where C = max {1, C
1
, . . . , C
N
} . The result now follows by the B.L.T. Theorem
4.1 and the fact that C
k

is dense inside W
k,p
() .
Notation 23.21. In the sequel will often abuse notation and simply write u|

for
the function Tu W
k1,p
(

).
Proposition 23.22 (Integration by parts). Suppose
o
R
d
such that

is a
compact manifold with C
1
boundary, p [1, ] and q =
p
p1
is the conjugate
exponent. Then for u W
k,p
() and v W
k,q
() ,
(23.20)
Z

i
u vdm =
Z

u
i
vdm+
Z

u|

v|

n
i
d
where n :

R
d
is unit outward pointing norm to

.
450 BRUCE K. DRIVER

Proof. Equation 23.20 holds for u, v C


2

and therefore for (u, v)


W
k,p
() W
k,q
() since both sides of the equality are continuous in (u, v)
W
k,p
() W
k,q
() as the reader should verify.
Denition 23.23. Let W
k,p
0
() := C

c
()
W
k,p
()
be the closure of C

c
() inside
W
k,p
() .
Remark 23.24. Notice that if T : W
k,p
() W
k1,p

is the trace operator in


Theorem 23.20, then T

W
k,p
0
()

= {0} W
k1,p

since Tu = u|

= 0 for
all u C

c
().
Corollary 23.25. Suppose
o
R
d
such that

is a compact manifold with C
1

boundary, p [1, ] and T : W


1,p
() L
p
() is the trace operator of Theorem
23.20. Then W
1,p
0
() = Nul(T).
Proof. It has already been observed in Remark 23.24 that W
1,p
0
() Nul(T).
Suppose u Nul(T) and supp(u) is compactly contained in . The mollication
u

(x) dened in Proposition 23.4 will be in C

c
() for > 0 suciently small and
by Proposition 23.4, u

u in W
1,p
() . Thus u W
1,p
0
() . We will now give
two proofs for Nul(T) W
1,p
0
() .
Proof 1. For u Nul(T) W
1,p
() dene
u(x) =

u(x) for x

0 for x /

.
Then clearly u L
p

R
d

and moreover by Proposition 23.22, for v C

c
(R
d
),
Z
R
d
u
i
vdm =
Z

u
i
vdm =
Z

i
u vdm
from which it follows that
i
u exists weakly in L
p

R
d

and
i
u = 1

i
u a.e.. Thus
u W
1,p

R
d

with k uk
W
1,p
(R
d
)
= kuk
W
1,p
()
and supp( u) .
Choose V C
1
c

R
d
, R
d

such that V (x) n(x) > 0 for all x

and dene
u

(x) = T

u(x) := u e
V
(x).
Notice that supp( u

) e
V

@@ for all suciently small. By the change


of variables Theorem 23.13, we know that u

W
1,p
() and since supp( u

) is a
compact subset of , it follows from the rst paragraph that u

W
1,p
0
() .
To so nish this proof, it only remains to show u

u in W
1,p
() as 0.
Looking at the proof of Theorem 23.13, the reader may show there are constants
> 0 and C < such that
(23.21) kT

vk
W
1,p
(R
d
)
C kvk
W
1,p
(R
d
)
for all v W
1,p

R
d

.
By direct computation along with the dominated convergence it may be shown
that
(23.22) T

v v in W
1,p

R
d

for all v C

c
(R
d
).
As is now standard, Eqs. (23.21) and (23.22) along with the density of C

c
(R
d
) in
W
1,p

R
d

allows us to conclude T

v v in W
1,p

R
d

for all v W
1,p

R
d

which
completes the proof that u

u in W
1,p
() as 0.
Proof 2. As in the rst proof it suces to show that any u W
1,p
0
() may
be approximated by v W
1,p
() with supp(v) @ . As above extend u to
c
ANALYSIS TOOLS WITH APPLICATIONS 451
by 0 so that u W
1,p

R
d

. Using the notation in the proof of 23.20, it suces


to show u
i
:=
i
u W
1,p

R
d

may be approximated by u
i
W
1,p
() with
supp(u
i
) @ . Using the change of variables Theorem 23.13, the problem may be
reduced to working with w
i
= u
i
x
1
i
on B = R(x
i
). But in this case we need only
dene w

i
(y) := w

i
(y e
d
) for > 0 suciently small. Then supp(w

i
) H
d
B
and as we have already seen w

i
w
i
in W
1,p

H
d

. Thus u

i
:= w

i
x
i
W
1,p
() ,
u

i
u
i
as 0 with supp(u
i
) @ .
23.6. Extension Theorems.
Lemma 23.26. Let R > 0, B := B(0, R) R
d
, B

:= {x B : x
d
> 0} and
:= {x B : x
d
= 0} . Suppose that u C
k
(B \ ) C(B) and for each || k,

u extends to a continuous function v

on B. Then u C
k
(B) and

u = v

for
all || k.
Proof. For x and i < d, then by continuity, the fundamental theorem of
calculus and the dominated convergence theorem,
u(x +e
i
) u(x) = lim
yx
yB\
[u(y +e
i
) u(y)] = lim
yx
yB\
Z

0

i
u(y +se
i
)ds
= lim
yx
yB\
Z

0
v
e
i
(y +se
i
)ds =
Z

0
v
e
i
(x +se
i
)ds
and similarly, for i = d,
u(x +e
d
) u(x) = lim
yx
yB
sgn()
\
[u(y +e
d
) u(y)] = lim
yx
yB
sgn()
\
Z

0

d
u(y +se
d
)ds
= lim
yx
yB
sgn()
\
Z

0
v
e
d
(y +se
d
)ds =
Z

0
v
e
d
(x +se
d
)ds.
These two equations show, for each i,
i
u(x) exits and
i
u(x) = v
e
i
(x). Hence we
have shown u C
1
(B) .
Suppose it has been proven for some l 1 that

u(x) exists and is given by


v

(x) for all || l < k. Then applying the results of the previous paragraph to

u(x) with || = l shows that


i

u(x) exits and is given by v


+e
i
(x) for all i
and x B and from this we conclude that

u(x) exists and is given by v

(x) for
all || l +1. So by induction we conclude

u(x) exists and is given by v

(x) for
all || k, i.e. u C
k
(B).
Lemma 23.27. Given any k +1 distinct points, {c
i
}
k
i=0
, in R\ {0} , the (k + 1)
(k + 1) matrix C with entries C
ij
:= (c
i
)
j
is invertible.
Proof. Let a R
k+1
and dene p(x) :=
P
k
j=0
a
j
x
j
. If a Nul(C), then
0 =
k
X
j=0
(c
i
)
j
a
j
= p (c
i
) for i = 0, 1, . . . , k.
Since deg (p) k and the above equation says that p has k + 1 distinct roots, we
conclude that a Nul(C) implies p 0 which implies a = 0. Therefore Nul(C) =
{0} and C is invertible.
452 BRUCE K. DRIVER

Lemma 23.28. Let B, B

and be as in Lemma 23.26 and {c


i
}
k
i=0
, be k + 1
distinct points in (, 1] for example c
i
= (i + 1) will work. Also let a R
k+1
be the unique solution (see Lemma 23.27 to C
tr
a = 1 where 1 denotes the vector
of all ones in R
k+1
, i.e. a satises
(23.23) 1 =
k
X
j=0
(c
i
)
j
a
i
for j = 0, 1, 2 . . . , k.
For u C
k
(H
d
) C
c
(H
d
) with supp(u) B and x = (y, z) R
d
dene
(23.24) u(x) = u(y, z) =

u(y, z) if z 0
P
k
i=0
a
i
u(y, c
i
z) if z 0.
Then u C
k
c
(R
d
) with supp( u) B and moreover there exists a constant M
independent of u such that
(23.25) k uk
W
k,p
(B)
Mkuk
W
k,p
(B
+
)
.
Proof. By Eq. (23.23) with j = 0,
k
X
i=0
a
i
u(y, c
i
0) = u(y, 0)
k
X
i=0
a
i
= u(y, 0).
This shows that u in Eq. (23.24) is well dened and that u C

H
d

. Let K

:=
{(y, z) : (y, z) supp(u)} . Since c
i
(, 1], if x = (y, z) / K

and z < 0
then (y, c
i
z) / supp(u) and therefore u(x) = 0 and therefore supp( u) is compactly
contained inside of B. Similarly if N
d
0
with || k, Eq. (23.23) with j =
d
implies
v

(x) :=

(

u) (y, z) if z 0
P
k
i=0
a
i
c

d
i
(

u) (y, c
i
z) if z 0.
is well dened and v

R
d

. Dierentiating Eq. (23.24) shows

u(x) = v

(x)
for x B\ and therefore we may conclude from Lemma 23.26 that u C
k
c
(B)
C
k

R
d

and

u = v

for all || k.
We now verify Eq. (23.25) as follows. For || k,
k

uk
p
L
p
(B

)
=
Z
R
d
1
z<0

k
X
i=0
a
i
c

d
i
(

u) (y, c
i
z)

p
dydz
C
Z
R
d
1
z<0
k
X
i=0
|(

u) (y, c
i
z)|
p
dydz
= C
Z
R
d
1
z>0
k
X
i=0
1
|c
i
|
|(

u) (y, z)|
p
dydz
= C

k
X
i=0
1
|c
i
|
!
k

uk
p
L
p
(B
+
)
where C :=

P
k
i=0
|a
i
c

d
i
|
q

p/q
. Summing this equation on || k shows there ex-
ists a constant M
0
such that k uk
W
k,p
(B

)
M
0
kuk
W
k,p
(B
+
)
and hence Eq. (23.25)
holds with M = M
0
+ 1.
ANALYSIS TOOLS WITH APPLICATIONS 453
Theorem 23.29 (Extension Theorem). Suppose k 1 and
o
R
d
such that

is a compact manifold with C


k
boundary. Given U
o
R
d
such that

U, there
exists a bounded linear (extension) operator E : W
k,p
() W
k,p

R
d

such that
(1) Eu = u a.e. in and
(2) supp(Eu) U.
Proof. As in the proof of Theorem 23.20, choose a covering {V
i
}
N
i=0
of

such
that

V
0
,
N
i=0

V
i
U and for each i 1, there is C
k
dieomorphism x
i
:
V
i
R(x
i
)
o
R
d
such that
x
i
( V
i
) = R(x
i
) bd(H
d
) and x
i
( V
i
) = R(x
i
) H
d
= B
+
where B
+
is as in Lemma 23.28, refer to Figure 45. Further choose
i

C

c
(V
i
, [0, 1]) such that
P
N
i=0

i
= 1 on a neighborhood of

and set y
i
:= x
i
|
V
i
for i 1. Given u C
k

and i 1, the function v


i
:= (
i
u) x
1
i
may be
viewed as a function in C
k
(H
d
) C
c
(H
d
) with supp(u) B. Let v
i
C
k
c
(B) be
dened as in Eq. (23.24) above and dene u :=
0
u+
P
N
i=1
v
i
x
i
C

R
d

with
supp(u) U. Notice that u = u on

and making use of Lemma 23.17 we learn
k uk
W
k,p
(R
d
)
k
0
uk
W
k,p
(R
d
)
+
N
X
i=1
k v
i
x
i
k
W
k,p
(R
d
)
k
0
uk
W
k,p
()
+
N
X
i=1
k v
i
k
W
k,p
(R(x
i
))
C (
0
) kuk
W
k,p
()
+
N
X
i=1
kv
i
k
W
k,p
(B
+
)
= C (
0
) kuk
W
k,p
()
+
N
X
i=1

(
i
u) x
1
i

W
k,p
(B
+
)
C (
0
) kuk
W
k,p
()
+
N
X
i=1
C
i
kuk
W
k,p
()
.
This shows the map u C
k
(

) Eu := u C
k
c
(U) is bounded as map from
W
k,p
() to W
k,p
(U) . As usual, we now extend E using the B.L.T. Theorem 4.1
to a bounded linear map from W
k,p
() to W
k,p
(U) . So for general u W
k,p
() ,
Eu = W
k,p
(U) lim
n
u
n
where u
n
C
k
(

) and u = W
k,p
() lim
n
u
n
.
By passing to a subsequence if necessary, we may assume that u
n
converges a.e. to
Eu from which it follows that Eu = u a.e. on

and supp(Eu) U.
23.7. Exercises.
Exercise 23.1. Show the norm in Eq. (23.1) is equivalent to the norm
|f|
W
k,p
()
:=
X
||k
k

fk
L
p
()
.
454 BRUCE K. DRIVER

Solution. 23.1This is a consequence of the fact that all norms on l


p
({ : || k})
are equivalent. To be more explicit, let a

= k

fk
L
p
()
, then
X
||k
|a

|
_
_
X
||k
|a

|
p
_
_
1/p
_
_
X
||k
1
q
_
_
1/q
while
_
_
X
||k
|a

|
p
_
_
1/p

_
_
p
X
||k
_
_
X
||k
|a

|
_
_
p
_
_
1/p
[#{ : || k}]
1/p
X
||k
|a

| .
ANALYSIS TOOLS WITH APPLICATIONS 455
24. Hlder Spaces
Notation 24.1. Let be an open subset of R
d
, BC() and BC(

) be the bounded
continuous functions on and

respectively. By identifying f BC(

) with
f|

BC(), we will consider BC(

) as a subset of BC(). For u BC() and


0 < 1 let
kuk
u
:= sup
x
|u(x)| and [u]

:= sup
x,y
x6=y

|u(x) u(y)|
|x y|

.
If [u]

< , then u is Hlder continuous with holder exponent


41
. The collection
of Hlder continuous function on will be denoted by
C
0,
() := {u BC() : [u]

< }
and for u C
0,
() let
(24.1) kuk
C
0,
()
:= kuk
u
+ [u]

.
Remark 24.2. If u : C and [u]

< for some > 1, then u is constant on


each connected component of . Indeed, if x and h R
d
then

u(x +th) u(x)


t

[u]

/t 0 as t 0
which shows
h
u(x) = 0 for all x . If y is in the same connected component
as x, then by Exercise 17.5 there exists a smooth curve : [0, 1] such that
(0) = x and (1) = y. So by the fundamental theorem of calculus and the chain
rule,
u(y) u(x) =
Z
1
0
d
dt
u((t))dt =
Z
1
0
0 dt = 0.
This is why we do not talk about Hlder spaces with Hlder exponents larger than
1.
Lemma 24.3. Suppose u C
1
() BC() and
i
u BC() for i = 1, 2, . . . , d,
then u C
0,1
(), i.e. [u]
1
< .
The proof of this lemma is left to the reader as Exercise 24.1.
Theorem 24.4. Let be an open subset of R
d
. Then
(1) Under the identication of u BC

with u|

BC () , BC(

) is a
closed subspace of BC().
(2) Every element u C
0,
() has a unique extension to a continuous func-
tion (still denoted by u) on

. Therefore we may identify C
0,
() with
C
0,
(

) BC(

).
(3) The function u C
0,
() kuk
C
0,
()
[0, ) is a norm on C
0,
()
which make C
0,
() into a Banach space.
Proof. 1. The rst item is trivial since for u BC(

), the sup-norm of u on

agrees with the sup-norm on and BC(

) is complete in this norm.


2. Suppose that [u]

< and x
0
. Let {x
n
}

n=1
be a sequence such
that x
0
= lim
n
x
n
. Then
|u(x
n
) u(x
m
)| [u]

|x
n
x
m
|

0 as m, n
41
If = 1, u is is said to be Lipschitz continuous.
456 BRUCE K. DRIVER

showing {u(x
n
)}

n=1
is Cauchy so that u(x
0
) := lim
n
u(x
n
) exists. If {y
n
}

n=1

is another sequence converging to x
0
, then
|u(x
n
) u(y
n
)| [u]

|x
n
y
n
|

0 as n ,
showing u(x
0
) is well dened. In this way we dene u(x) for all x and let
u(x) = u(x) for x . Since a similar limiting argument shows
| u(x) u(y)| [u]

|x y|

for all x, y

it follows that u is still continuous and [ u]

= [u]

. In the sequel we will abuse


notation and simply denote u by u.
3. For u, v C
0,
(),
[v +u]

= sup
x,y
x6=y

|v(y) +u(y) v(x) u(x)|


|x y|

sup
x,y
x6=y

|v(y) v(x)| +|u(y) u(x)|


|x y|

[v]

+ [u]

and for C it is easily seen that [u]

= || [u]

. This shows []

is a semi-norm
on C
0,
() and therefore k k
C
0,
()
dened in Eq. (24.1) is a norm.
To see that C
0,
() is complete, let {u
n
}

n=1
be a C
0,
()Cauchy sequence.
Since BC(

) is complete, there exists u BC(

) such that ku u
n
k
u
0 as
n . For x, y with x 6= y,
|u(x) u(y)|
|x y|

= lim
n
|u
n
(x) u
n
(y)|
|x y|

limsup
n
[u
n
]

lim
n
ku
n
k
C
0,
()
< ,
and so we see that u C
0,
(). Similarly,
|u(x) u
n
(x) (u(y) u
n
(y))|
|x y|

= lim
m
|(u
m
u
n
)(x) (u
m
u
n
)(y)|
|x y|

limsup
m
[u
m
u
n
]

0 as n ,
showing [u u
n
]

0 as n and therefore lim


n
ku u
n
k
C
0,
()
= 0.
Notation 24.5. Since and

are locally compact Hausdor spaces, we may
dene C
0
() and C
0
(

) as in Denition 10.29. We will also let


C
0,
0
() := C
0,
() C
0
() and C
0,
0
(

) := C
0,
() C
0
(

).
It has already been shown in Proposition 10.30 that C
0
() and C
0
(

) are closed
subspaces of BC() and BC(

) respectively. The next proposition describes the


relation between C
0
() and C
0
(

).
Proposition 24.6. Each u C
0
() has a unique extension to a continuous func-
tion on

given by u = u on and u = 0 on and the extension u is in C
0
(

).
Conversely if u C
0
(

) and u|

= 0, then u|

C
0
(). In this way we may
identify C
0
() with those u C
0
(

) such that u|

= 0.
Proof. Any extension u C
0
() to an element u C(

) is necessarily unique,
since is dense inside

. So dene u = u on and u = 0 on . We must show u
is continuous on

and u C
0
(

).
For the continuity assertion it is enough to show u is continuous at all points
in . For any > 0, by assumption, the set K

:= {x : |u(x)| } is a
ANALYSIS TOOLS WITH APPLICATIONS 457
compact subset of . Since =

\ , K

= and therefore the distance,


:= d(K

, ), between K

and is positive. So if x and y



and
|y x| < , then | u(x) u(y)| = |u(y)| < which shows u :

C is continuous.
This also shows {| u| } = {|u| } = K

is compact in and hence also in



.
Since > 0 was arbitrary, this shows u C
0
(

).
Conversely if u C
0
(

) such that u|

= 0 and > 0, then K

:=

x

: |u(x)|

is a compact subset of

which is contained in since
K

= . Therefore K

is a compact subset of showing u|

C
0
(

).
Denition 24.7. Let be an open subset of R
d
, k N{0} and (0, 1]. Let
BC
k
() (BC
k
(

)) denote the set of k times continuously dierentiable functions


u on such that

u BC() (

u BC(

))
42
for all || k. Similarly, let
BC
k,
() denote those u BC
k
() such that [

u]

< for all || = k. For


u BC
k
() let
kuk
C
k
()
=
X
||k
k

uk
u
and
kuk
C
k,
()
=
X
||k
k

uk
u
+
X
||=k
[

u]

.
Theorem 24.8. The spaces BC
k
() and BC
k,
() equipped with k k
C
k
()
and
k k
C
k,
()
respectively are Banach spaces and BC
k
(

) is a closed subspace of
BC
k
() and BC
k,
() BC
k
(

). Also
C
k,
0
() = C
k,
0
(

) = {u BC
k,
() :

u C
0
() || k}
is a closed subspace of BC
k,
().
Proof. Suppose that {u
n
}

n=1
BC
k
() is a Cauchy sequence, then {

u
n
}

n=1
is a Cauchy sequence in BC() for || k. Since BC() is complete, there exists
g

BC() such that lim


n
k

u
n
g

k
u
= 0 for all || k. Letting u := g
0
,
we must show u C
k
() and

u = g

for all || k. This will be done by


induction on || . If || = 0 there is nothing to prove. Suppose that we have
veried u C
l
() and

u = g

for all || l for some l < k. Then for x ,


i {1, 2, . . . , d} and t R suciently small,

a
u
n
(x +te
i
) =
a
u
n
(x) +
Z
t
0

i

a
u
n
(x +e
i
)d.
Letting n in this equation gives

a
u(x +te
i
) =
a
u(x) +
Z
t
0
g
+e
i
(x +e
i
)d
from which it follows that
i

u(x) exists for all x and


i

u = g
+e
i
. This
completes the induction argument and also the proof that BC
k
() is complete.
It is easy to check that BC
k
(

) is a closed subspace of BC
k
() and by using
Exercise 24.1 and Theorem 24.4 that that BC
k,
() is a subspace of BC
k
(

). The
fact that C
k,
0
() is a closed subspace of BC
k,
() is a consequence of Proposition
10.30.
42
To say

u BC(

) means that

u BC() and

u extends to a continuous function


on

.
458 BRUCE K. DRIVER

To prove BC
k,
() is complete, let {u
n
}

n=1
BC
k,
() be a k k
C
k,
()

Cauchy sequence. By the completeness of BC
k
() just proved, there exists u
BC
k
() such that lim
n
kuu
n
k
C
k
()
= 0. An application of Theorem 24.4 then
shows lim
n
k

u
n

uk
C
0,
()
= 0 for || = k and therefore lim
n
ku
u
n
k
C
k,
()
= 0.
The reader is asked to supply the proof of the following lemma.
Lemma 24.9. The following inclusions hold. For any [0, 1]
BC
k+1,0
() BC
k,1
() BC
k,
()
BC
k+1,0
(

) BC
k,1
(

) BC
k,
().
Denition 24.10. Let A : X Y be a bounded operator between two (sep-
arable) Banach spaces. Then A is compact if A[B
X
(0, 1)] is precompact in Y
or equivalently for any {x
n
}

n=1
X such that kx
n
k 1 for all n the sequence
y
n
:= Ax
n
Y has a convergent subsequence.
Example 24.11. Let X =
2
= Y and
n
C such that lim
n

n
= 0, then
A : X Y dened by (Ax)(n) =
n
x(n) is compact.
Proof. Suppose {x
j
}

j=1

2
such that kx
j
k
2
=
P
|x
j
(n)|
2
1 for all j. By
Cantors Diagonalization argument, there exists {j
k
} {j} such that, for each n,
x
k
(n) = x
j
k
(n) converges to some x(n) C as k . Since for any M < ,
M
X
n=1
| x(n)|
2
= lim
k
M
X
n=1
| x
k
(n)|
2
1
we may conclude that

P
n=1
| x(n)|
2
1, i.e. x
2
.
Let y
k
:= A x
k
and y := A x. We will nish the verication of this example by
showing y
k
y in
2
as k . Indeed if

M
= max
nM
|
n
|, then
kA x
k
A xk
2
=

X
n=1
|
n
|
2
| x
k
(n) x(n)|
2
=
M
X
n=1
|
n
|
2
| x
k
(n) x(n)|
2
+|

M
|
2

X
M+1
| x
k
(n) x(n)|
2

M
X
n=1
|
n
|
2
| x
k
(n) x(n)|
2
+|

M
|
2
k x
k
xk
2

M
X
n=1
|
n
|
2
| x
k
(n) x(n)|
2
+ 4|

M
|
2
.
Passing to the limit in this inequality then implies
lim sup
k
kA x
k
A xk
2
4|

M
|
2
0 as M .
Lemma 24.12. If X
A
Y
B
Z are continuous operators such the either A or
B is compact then the composition BA : X Z is also compact.
ANALYSIS TOOLS WITH APPLICATIONS 459
Proof. If A is compact and B is bounded, then BA(B
X
(0, 1)) B(AB
X
(0, 1))
which is compact since the image of compact sets under continuous maps are com-
pact. Hence we conclude that BA(B
X
(0, 1)) is compact, being the closed subset of
the compact set B(AB
X
(0, 1)).
If A is continuos and B is compact, then A(B
X
(0, 1)) is a bounded set and so
by the compactness of B, BA(B
X
(0, 1)) is a precompact subset of Z, i.e. BA is
compact.
Proposition 24.13. Let
o
R
d
such that

is compact and 0 < 1.
Then the inclusion map i : C

() C

() is compact.
Let {u
n
}

n=1
C

() such that ku
n
k
C
1, i.e. ku
n
k

1 and
|u
n
(x) u
n
(y)| |x y|

for all x, y .
By Arzela-Ascoli, there exists a subsequence of { u
n
}

n=1
of {u
n
}

n=1
and u C
o
(

)
such that u
n
u in C
0
. Since
|u(x) u(y)| = lim
n
| u
n
(x) u
n
(y)| |x y|

,
u C

as well. Dene g
n
:= u u
n
C

, then kg
n
k
C
2 and g
n
0 in C
0
. To
nish the proof we must show that g
n
0 in C

. Given > 0,
sup
x6=y
|g
n
(x) g
n
(y)|
|x y|

: A
n
+B
n
where
+
A
n
:= sup
x 6= y
|x y|
|g
n
(x) g
n
(y)|
|x y|

sup
x6=y
|g
n
(x) g
n
(y)|
2

kg
n
k

0 as n
and
B
n
:= sup
x 6= y
|x y| >
|g
n
(x) g
n
(y)|
|x y|

sup
x 6= y
|x y|
|x y|

|x y|

= sup
x 6= y
|x y|
|x y|

.
Therefore,
lim sup
n
[g
n
]

lim sup
n
A
n
+ lim sup
n
B
n
0 +

0 as 0.
This proposition generalizes to the following theorem which the reader is asked to
prove in Exercise 24.2 below.
Theorem 24.14. Let be a precompact open subset of R
d
, , [0, 1] and k, j
N
0
. If j + > k +, then C
j,

is compactly contained in C
k,

.
460 BRUCE K. DRIVER

24.1. Exercises.
Exercise 24.1. Prove Lemma 24.3.
Exercise 24.2. Prove Theorem 24.14. Hint: First prove C
j,

@@ C
j,

is
compact if 0 < 1. Then use Lemma 24.12 repeatedly to handle all of the
other cases.
ANALYSIS TOOLS WITH APPLICATIONS 461
25. Sobolev Inequalities
25.1. Gagliardo-Nirenberg-Sobolev Inequality. In this section our goal is to
prove an inequality of the form:
(25.1) kuk
L
q Ckuk
L
p
(R
d
)
for u C
1
c
(R
d
).
For > 0, let u

(x) = u(x). Then


ku

k
q
L
q
=
Z
|u(x)|
q
dx =
Z
|u(y)|
q
dy

d
and hence ku

k
L
q =
d/q
kuk
L
q . Moreover, u

(x) = (u)(x) and thus


ku

k
L
p = k(u)

k
L
p =
d/p
kuk
L
p.
If (25.1) is to hold for all u C
1
c
(R
d
) then we must have

d/q
kuk
L
q = ku

k
L
q Cku

k
L
p
(R
d
)
= C
1d/p
kuk
L
p for all > 0
which only possible if 1 d/p + d/q = 0, i.e. 1/p = 1/d + 1/q. Let us denote the
solution, q, to this equation by p

so p

:=
dp
dp
.
Theorem 25.1. Let p = 1 so 1

=
d
d1
, then
(25.2) kuk
1
= kuk d
d1
d

1
2
kuk
1
for all u C
1
c
(R
d
).
Proof. To help the reader understand the proof, let us give the proof for d = 1,
d = 2 and d = 3 rst and with the constant d
1/2
being replaced by 1. After that
the general induction argument will be given. (The adventurous reader may skip
directly to the paragraph containing Eq. (25.3) below.)
(d = 1, p

= ) By the fundamental theorem of calculus,


|u(x)| =

Z
x

u
0
(y)dy

Z
x

|u
0
(y)| dy
Z
R
|u
0
(x)| dx.
Therefore kuk
L
ku
0
k
L
1, proving the d = 1 case.
(d = 2, p

= 2) Applying the same argument as above to y


1
u(y
1
, x
2
) and
y
2
u(x
1
, y
2
),,
|u(x
1
, x
2
)|
Z

|
1
u(y
1
, x
2
)| dy
1

Z

|u(y
1
, x
2
)| dy
1
and
|u(x
1
, x
2
)|
Z

|
2
u(x
1
, y
2
)| dy
2

Z

|u(x
1
, y
2
)| dy
2
and therefore
|u(x
1
, x
2
)|
2

|
1
u(y
1
, x
2
)|dy
1

|
2
u(x
1
, y
2
)| dy
2
.
Integrating this equation relative to x
1
and x
2
gives
kuk
2
L
2
=
Z
R
2
|u(x)|
2
dx
Z

|
1
u(x)| dx
Z

|
2
u(x)| dx

|u(x)| dx

2
which proves the d = 2 case.
462 BRUCE K. DRIVER

(d = 3, p

= 3/2) Let x
1
= (y
1
, x
2
, x
3
), x
2
= (x
1
, y
2
, x
3
), and x
3
= (x
1
, x
2
, y
3
) if
i = 3, then as above,
|u(x)|
Z

|
i
u(x
i
)|dy
i
for i = 1, 2, 3
and hence
|u(x)|
3
2

3
Y
i=1
Z

|
i
u(x
i
)|dy
i
1
2
.
Integrating this equation on x
1
gives,
Z
R
|u(x)|
3
2
dx
1

Z

|
1
u(x
1
)|dy
1
1
2
Z
3
Y
i=2
Z

|
i
u(x
i
)|dy
i
1
2
dx
1

|
1
u(x)|dx
1
1
2
3
Y
i=2
Z

|
i
u(x
i
)|dx
1
dy
i
1
2
wherein the second equality we have used the Hlders inequality with p = q = 2.
Integrating this result on x
2
and using Hlders inequality gives
Z
R
2
|u(x)|
3
2
dx
1
dx
2

Z
R
2
|
2
u(x)|dx
1
dx
2
1
2
Z
R
dx
2
Z

|
1
u(x)|dx
1
1
2
Z
R
2
|
3
u(x
3
)|dx
1
dy
3
1
2

Z
R
2
|
2
u(x)|dx
1
dx
2
1
2
Z
R
2
|
1
u(x)|dx
1
dx
2
1
2
Z
R
3
|
3
u(x)|dx
1
2
.
One more integration of x
3
and application of Hlders inequality, implies
Z
R
3
|u(x)|
3
2
dx
3
Y
i=1
Z
R
3
|
i
u(x)|dx
1
2

Z
R
3
|u(x)|dx
3
2
proving the d = 3 case.
For general d (p

=
d
d1
), as above let x
i
= (x
1
, . . . , y
i
, . . . , x
d
). Then
|u(x)|
Z

|
i
u(x
i
)|dy
i

and
(25.3) |u(x)|
d
d1

d
Y
i=1
Z

|
i
u(x
i
)|dy
i
1
d1
.
Integrating this equation relative to x
1
and making use of Hlders inequality in
the form
(25.4)

d1
Y
j=1
f
j

d1
Y
j=1
kf
j
k
d1
ANALYSIS TOOLS WITH APPLICATIONS 463
(see Corollary 9.3) we nd
Z
R
|u(x)|
d
d1
dx
1

Z
R

1
u(x)dx
1
1
d1
Z
R
dx
1
d
Y
i=2
Z
R
|
i
u(x
i
)|dy
i
1
d1

Z
R

1
u(x)dx
1
1
d1
d
Y
i=2
Z
R
2
|
i
u(x
i
)|dx
1
dy
i
1
d1
=
Z
R

1
u(x)dx
1
1
d1
Z
R
2
|
2
u(x)|dx
1
dx
2
1
d1
d
Y
i=3
Z
R
2
|
i
u(x
i
)|dx
1
dy
i
1
d1
.
Integrating this equation on x
2
and using Eq. (25.4) once again implies,
Z
R
2
|u(x)|
d
d1
dx
1
dx
2

Z
R
2
|
2
u(x)|dx
1
dx
2
1
d1
Z
R
dx
2
Z
R

1
u(x)dx
1
1
d1

d
Y
i=3
Z
R
2
|
i
u(x
i
)|dx
1
dy
i
1
d1

Z
R
2
|
2
u(x)|dx
1
dx
2
1
d1
Z
R
2
|
1
u(x)|dx
1
dx
2
1
d1

d
Y
i=3
Z
R
3
|
i
u(x
i
)|dx
1
dx
2
dy
i
1
d1
.
Continuing this way inductively, one shows
Z
R
k
|u(x)|
d
d1
dx
1
dx
2
. . . dx
k

k
Y
i=1
Z
R
k
|
i
u(x)|dx
1
dx
2
. . . dx
k
1
d1

d
Y
i=k+1
Z
R
3
|
i
u(x
i
)|dx
1
dx
2
. . . dx
k
dy
k+1
1
d1
and in particular when k = d,
Z
R
d
|u(x)|
d
d1
dx
d
Y
i=1
Z
R
d
|
i
u(x)|dx
1
dx
2
. . . dx
d
1
d1
(25.5)

d
Y
i=1
Z
R
d
|u(x)|dx
1
d1
=
Z
R
d
|u(x)|dx
d
d1
.
We can improve on this estimate by using Youngs inequality (see Exercise 25.1) in
the form
d
Q
i=1
a
i

1
d
P
d
i=1
a
d
i
. Indeed by Eq. (25.5) and Youngs inequality,
kuk d
d1

d
Y
i=1
Z
R
d
|
i
u(x)|dx
1
d

1
d
d
X
i=1
Z
R
d
|
i
u(x)|dx

=
1
d
Z
R
d
d
X
i=1
|
i
u(x)|dx
1
d
Z
R
d

d |u(x)| dx
464 BRUCE K. DRIVER

wherein the last inequality we have used Hlders inequality for sums,
d
X
i=1
|a
i
|

d
X
i=1
1
!
1/2

d
X
i=1
|a
i
|
2
!
1/2
=

d |a| .
The next theorem generalizes Theorem 25.1 to an inequality of the form in Eq.
(25.1).
Notation 25.2. For p [1, d), let p

:=
pd
dp
so that 1/p

+1/d = 1/p. In particular


1

=
d
d1
.
Theorem 25.3. If p [1, d) then
(25.6) kuk
L
p

d
1/2
p(d 1)
d p
kuk
L
p for all u C
1
c
(R
d
).
Proof. Let u C
1
c
(R
d
) and s > 1, then |u|
s
C
1
c
(R
d
) and |u|
s
=
s|u|
s1
sgn(u)u. Applying Eq. (25.2) with u replaced by |u|
s
and then using
Holders inequality gives
(25.7) k|u|
s
k d
d1
d

1
2
k|u|
s
k
1
= sd

1
2
k|u|
s1
uk
L
1
s

d
kuk
L
p k|u|
s1
k
L
q
where q =
p
p1
. Let us now choose s so that
s1

= s
d
d 1
= (s 1)q = (s 1)
p
p 1
=: p

,
i.e.
s =
q
q 1

=
p
p1
p
p1

d
d1
=
p(d 1)
p(d 1) d(p 1)
=
p(d 1)
d p
and p

=
p(d1)
dp
d
d1
=
pd
dp
. Using this s in Eq. (25.7) gives
kuk
p
d1
d
p
d
1/2
p(d 1)
d p
kuk
L
p kuk
p

/q
p
.
This proves Eq. (25.6) since
p

d 1
d
p

/q = p

s
p


s 1
p

= 1.
Corollary 25.4. The estimate kuk
L
p


p(d1)

d(dp)
kuk
L
p holds for all u
W
1,p
(R
d
).
Corollary 25.5. Suppose U R
d
is bounded with C
1
-boundary, then for all 1
p < d and 1 q p

there exists C = C(U) such that kuk


L
q
(U)
Ckuk
W
1,p
(U)
.
Proof. Let u C
1
(U) W
1,p
(U) and Eu denote an extension operator. Then
kuk
L
p

(u)
kEuk
L
p

(R
d
)
Ck(Eu)k
L
p
(R
d
)
Ckuk
W
1,p
(u)
.
Therefore
(25.8) kuk
L
p

(U)
Ckuk
W
1,p
(U)
ANALYSIS TOOLS WITH APPLICATIONS 465
Since C
1
(U) is dense in W
1,p
(U), Eq. (25.8) holds for all u W
1,p
(U). Finally for
all 1 q < p

,
kuk
L
q kuk
L
P

k1k
L
r = kuk
L
p

((U))
I
r
where
1
r
+
1
p

=
1
q
.
Corollary 25.6. Suppose n > 2 then
kuk
2+4/d
2
C
d
kuk
2
2
kuk
4/d
1
for all u C
1
c
.
Proof. Recall kuk
2
Ckuk
2
where 2

=
2d
d2
. Now
kuk
2
kuk

p
kuk
1
q
where

p
+
1
q
=
1
2
. Taking p = 2

and q = 1 implies

2

+ 1 =
1
2
, i.e.

1
2

=
1
2
and hence
=
1
2
1
1
2

=
2

2(2

1)
=
d
(d 2)

1
2d
d2
1
=
d
d 2
d 2
d + 2
=
d
d + 2
and
1 =
2
d + 2
.
Hence
kuk
2
kuk
d
d+2
2

kuk
2
d+2
1
C
d
d+2
kuk
d
d=2
2
kuk
2
d+2
1
and therefore
kuk
d+2
d
2
Ckuk
2
kuk
2
d
1
.
and squaring this equation then gives
kuk
2+4/d
2
C
2
kuk
2
2
kuk
4
d
1
.
25.2. Morreys Inequality.
Notation 25.7. Let S
d1
be the sphere of radius one centered at zero inside R
d
.
For S
d1
, x R
d
, and r (0, ), let

x,r
{x +s : 3 0 s r}.
So
x,r
= x +
0,r
where
0,r
is a cone based on .
Notation 25.8. If S
d1
is a measurable set let || = () be the surface
area of . If R
d
is a measurable set, let
Z

f(x)dx =
1
m()
Z

f(x)dx.
By Theorem 8.35,
(25.9)
Z

x,r
f(y)dy =
Z

0,r
f(x +y)dy =
Z
r
0
dt t
d1
Z

f(x +t) d()


and letting f = 1 in this equation implies
(25.10) m(
x,r
) = || r
d
/d.
466 BRUCE K. DRIVER

Lemma 25.9. Let S


d1
be a measurable set. For u C
1
(
x,r
),
(25.11)
Z

x,r
|u(y) u(x)|dy
1
||
Z

x,r
|u(y)|
|x y|
d1
dy.
Proof. Write y = x + s with S
d1
, then by the fundamental theorem of
calculus,
u(x +s) u(x) =
Z
s
0
u(x +t) dt
and therefore,
Z

|u(x +s) u(x)|d()


Z
s
0
Z

|u(x +t)|d()dt
=
Z
s
0
t
d1
dt
Z

|u(x +t)|
|x +t x|
d1
d()
=
Z

x,s
|u(y)|
|y x|
d1
dy
Z

x,r
|u(y)|
|x y|
d1
dy,
wherein the second equality we have used Eq. (25.9). Multiplying this inequality
by s
d1
and integrating on s [0, r] gives
Z

x,r
|u(y) u(x)|dy
r
d
d
Z

x,r
|u(y)|
|x y|
d1
dy =
m(
x,r
)
||
Z

x,r
|u(y)|
|x y|
d1
dy
which proves Eq. (25.11).
Corollary 25.10. For d N and p (d, ] there is a constant C = C(p, d) <
such that if u C
1
(R
d
) then for all x, y R
d
,
(25.12) |u(y) u(x)| C kuk
L
p
(B(x,r)B(y,r))
|x y|
(
1
d
p
)
where r := |x y| .
Proof. The case p = is easy and will be left to the reader. Let r := |x y| ,
V B
x
(r) B
y
(r) and , S
d1
be chosen so that x + r = B
x
(r) B
y
(r)
and y +r = B
y
(r) B
x
(r), i.e.
=
1
r
(B
x
(r) B
y
(r) x) and =
1
r
(B
y
(r) B
x
(r) y) = .
Also let W =
x,r

y,r
, see Figure 46 below. By a scaling,

d
:=
|
x,r

y,r
|
|
x,r
|
=
|
x,1

y,1
|
|
x,1
|
(0, 1)
is a constant only depending on d, i.e. we have |
x,r
| = |
y,r
| = |W|. Integrating
the inequality
|u(x) u(y)| |u(x) u(z)| +|u(z) u(y)|
ANALYSIS TOOLS WITH APPLICATIONS 467

Figure 46. The geometry of two intersecting balls of radius r := |x y| .


over z W gives
|u(x) u(y)|
Z
W
|u(x) u(z)|dz +
Z
W
|u(z) u(y)|dz
=

|
x,r
|
_
_
Z
W
|u(x) u(z)|dz +
Z
W
|u(z) u(y)|dz
_
_


|
x,r
|
_
_
_
Z

x,r
|u(x) u(z)|dz +
Z

y,r
|u(z) u(y)|dz
_
_
_
.
Hence by Lemma 25.9, Hlders inequality and translation and rotation invariance
of Lebesgue measure,
|u(x) u(y)|

||
_
_
_
Z

x,r
|u(z)|
|x z|
d1
dz +
Z

y,r
|u(z)|
|z y|
d1
dz
_
_
_


||

kuk
L
p
(
x,r
)
k
1
|x |
d1
k
L
q
(
x,r
)
+kuk
L
p
(
y,r
)
k
1
|y |
d1
k
L
q
(
y,r
)

2
||
kuk
L
p
(V )
k
1
| |
d1
k
L
q
(
0,r
)
(25.13)
468 BRUCE K. DRIVER

where q =
p
p1
is the conjugate exponent to p. Now
k
1
| |
d1
k
q
L
q
(
0,r
)
=
Z
r
0
dt t
d1
Z

t
d1

q
d()
= ||
Z
r
0
dt

t
d1

1
p
p1
= ||
Z
r
0
dt t

d1
p1
and since
d1
p1
+ 1 =
pd
p1
we nd
(25.14) k
1
| |
d1
k
L
q
(
0,r
)
=

p 1
p d
|| r
pd
p1

1/q
=

p 1
p d
||

p1
p
r
1
d
p
.
Combining Eqs. (25.13) and (25.14) gives
|u(x) u(y)|
2
||
1/p

p 1
p d

p1
p
kuk
L
p
(V )
r
1
d
p
.
Corollary 25.11. Suppose d < p < , B
S
d1, r (0, ) and u C
1
(
x,r
).
Then
(25.15) |u(x)| C(||, r, d, p) kuk
W
1,p
(
x,r
)
r
1d/p
where
C(||, r, d, p) :=
1
||
1/p
max

d
1/p
r
,

p 1
p d

11/p
!
.
Proof. For y
x,r
,
|u(x)| |u(y)| +|u(y) u(x)|
and hence using Eq. (25.11) and Hlders inequality,
|u(x)|
Z

x,r
|u(y)|dy +
1
||
Z

x,r
|u(y)|
|x y|
d1
dy

1
m(
x,r
)
kuk
L
p
(
x,r
)
k1k
L
p
(
x,r
)
+
1
||
kuk
L
p
(
x,r
)
k
1
|x |
d1
k
L
q
(
x,r
)
where q =
p
p1
as before. This equation combined with Eq. (25.14) and the equality,
(25.16)
1
m(
x,r
)
k1k
L
q
(
x,r
)
=
1
m(
x,r
)
m(
x,r
)
1/q
=

|| r
d
/d

1/p
shows
|u(x)| kuk
L
p
(
x,r
)

|| r
d
/d

1/p
+
1
||
kuk
L
p
(
x,r
)

p 1
p d
||

11/p
r
1d/p
=
1
||
1/p
"
kuk
L
p
(
x,r
)
d
1/p
r
+kuk
L
p
(
x,r
)

p 1
p d

11/p
#
r
1d/p
.

1
||
1/p
max

d
1/p
r
,

p 1
p d

11/p
!
kuk
W
1,p
(
x,r
)
r
1d/p
.
ANALYSIS TOOLS WITH APPLICATIONS 469
Theorem 25.12 (Morreys Inequality). If d < p < , u W
1,p
(R
d
), then there
exists a unique version u

of u (i.e. u

= u a.e.) such that u

is continuous.
Moreover u

C
0,1
p
d
(R
d
) and
(25.17) ku

k
C
0,1
p
d
(R
d
)
Ckuk
W
1,p
where C = C(p, d) is a universal constant.
Proof. First assume that u C
1
c
(R
d
) then by Corollary 25.11 kuk
C(R
d
)

Ckuk
W
1,p
(R
d
)
and by Corollary 25.10
|u(y) u(x)|
|x y|
1
d
p
Ckuk
L
p
(R
d
)
.
Therefore
[u]
1
d
p
Ckuk
L
p
(R
d
)
Ckuk
W
1,p
(R
d
)
and hence
(25.18) kuk
C
0,1
p
d
(R
d
)
Ckuk
W
1,p
(R
d
)
.
Now suppose u W
1,p
(R
d
), choose (using Exercise 19.8 and Theorem G.67) u
d

C
1
c
(R
d
) such that u
d
u in W
1,p
(R
d
). Then by Eq. (25.18), ku
n
u
m
k
C
0,1
p
d
(R
d
)

0 as m, n and therefore there exists u

C
0,1
p
d
(R
d
) such that u
n
u

in
C
0,1
p
d
(R
d
). Clearly u

= u a.e. and Eq. (25.17) holds.


The following example shows that L

(R
d
) 6 W
1,d
(R
d
) in general.
Example 25.13. Let u(x) = (x) log log

1 +
1
|x|

where C

c
(R
d
) is chosen
so that (x) = 1 for |x| 1. Then u / L

(R
d
) while u W
1,d
(R
d
). Let us check
this claim. Using Theorem 8.35, one easily shows u L
p
(R
d
). A short computation
shows, for |x| < 1, that
u(x) =
1
log

1 +
1
|x|

1
1 +
1
|x|

1
|x|
=
1
1 +
1
|x|
1
log

1 +
1
|x|

1
|x|
x

where x = x/ |x| and so again by Theorem 8.35,


Z
R
d
|u(x)|
d
dx
Z
|x|<1
_
_
1
|x|
2
+|x|
1
log

1 +
1
|x|

_
_
d
dx
(S
d1
)
Z
1
0

2
r log

1 +
1
r

!
d
r
d1
dr = .
Corollary 25.14. The above them holds with R
d
replaced by
o
R
d
such that
is compact C
1
-manifold with boundary.
Proof. Use Extension Theory.
470 BRUCE K. DRIVER

25.3. Rademachers Theorem.


Theorem 25.15. Suppose that u W
1,p
loc
() for some d < p . Then u is
dierentiable almost everywhere and w-
i
u =
i
u a.e. on .
Proof. We clearly may assume that p < . For v W
1,p
loc
() and x, y such
that B(x, r) B(y, r) where r := |x y| , the estimate in Corollary 25.10,
gives
|v(y) v(x)| Ckuk
L
p
(B(x,r)B(y,r))
|x y|
(
1
d
p
)
= Ckvk
L
p
(B(x,r)B(y,r))
r
(
1
d
p
)
. (25.19)
Let u now denote the unique continuous version of u W
1,p
loc
(). The by the
Lebesgue dierentiation Theorem 16.12, there exists an exceptional set E such
that m(E) = 0 and
lim
r0
1
m(B(x, r))
Z
B(x,r)
|u(y) u(x)|
p
dy = 0 for x \ E.
Fix a point x \ E and let v(y) := u(y) u(x) u(x) (y x) and notice that
v(y) = u(y) u(x). Applying Eq. (25.19) to v then implies
|u(y) u(x) u(x) (y x)|
Cku() u(x)k
L
p
(B(x,r)B(y,r))
r
(
1
d
p
)
C

Z
B(x,r)
|u(y) u(x)|
p
dy
!
1/p
r
(
1
d
p
)
= C
p
q
(S
d1
))r
d/p

1
m(B(x, r))
Z
B(x,r)
|u(y) u(x)|
p
dy
!
1/p
r
(
1
d
p
)
= C
p
q
(S
d1
))

1
m(B(x, r))
Z
B(x,r)
|u(y) u(x)|
p
dy
!
1/p
|x y|
which shows u is dierentiable at x and u(x) = w-u(x).
Theorem 25.16 (Rademachers Theorem). Let u be locally Lipschitz continuous
on
o
R
d
. Then u is dierentiable almost everywhere and w-
i
u =
i
u a.e. on
.
Proof. By Proposition 19.29
(w)
i
u exists weakly and is in
i
u L

(R
d
) for
i = 1, 2, . . . , d. The result now follows from Theorem 25.15.
25.4. Sobolev Embedding Theorems Summary.
Space Degree of Reguilarity
W
k,p
k d/p
C
k,
= C
k+
k +.
Summary A space embeds continuously in the other if it has a higher or equal
degree of regularity. Here are some examples:
(1) W
k,q
W
k,p
k
d
p
k
d
q
i.e.
d
p

d
q
or
1
q

1
p
=

d
.
ANALYSIS TOOLS WITH APPLICATIONS 471
(2) W
k,p
C

d
p

+

The embeddings are compact if the above inequalities are strict and in the case
of considering W
k,p
W
1,q
we must have k > j!
Example L
2
([0, 1]) L
1
([0, 1]) but this is not compact. To see this, take {u
d
}

d=1
to be the Haar basis for L
2
. Then u
d
0 in L
2
and L
1
, while ku
d
k
2
ku
d
k
1
1
since |u
d
| = 1.
25.5. Other Theorems along these lines. Another theorem of this form is de-
rived as follows. Let > 0 be xed and g C
c
((0, 1) , [0, 1]) such that g(t) = 1 for
|t| 1/2 and set (t) := g(t/). Then for x R
d
and we have
Z

0
d
dt
[(t)u(x +t)] dt = u(x)
and then by integration by parts repeatedly we learn that
u(x) =
Z

0

2
t
[(t)u(x +t)] tdt =
Z

0

2
t
[(t)u(x +t)] d
t
2
2
=
Z

0

3
t
[(t)u(x +t)] d
t
3
3!
= . . .
= (1)
m
Z

0

m
t
[(t)u(x +t)] d
t
m
m!
= (1)
m
Z

0

m
t
[(t)u(x +t)]
t
m1
(m1)!
dt.
Integrating this equatoin on then implies
|| u(x) = (1)
m
Z

d
Z

0

m
t
[(t)u(x +t)]
t
m1
(m1)!
dt
=
(1)
m
(m1)!
Z

d
Z

0
t
md

m
t
[(t)u(x +t)] t
d1
dt
=
(1)
m
(m1)!
Z

d
Z

0
t
md
m
X
k=0

m
k

(mk)
(t)

(x +t)
i
t
d1
dt
=
(1)
m
(m1)!
Z

d
Z

0
t
md
m
X
k=0

m
k

km
h
g
(mk)
(t)

(x +t)
i
t
d1
dt
=
(1)
m
(m1)!
m
X
k=0

m
k

km
Z

x,
|y x|
md
h
g
(mk)
(|y x|)

k
[
yx
u

(y)
i
dy
and hence
u(x) =
(1)
m
|| (m1)!
m
X
k=0

m
k

km
Z

x,
|y x|
md
h
g
(mk)
(|y x|)

k
[
yx
u

(y)
i
dy
and hence by the Hlders inequality,
|u(x)| C(g)
(1)
m
|| (m1)!
m
X
k=0

m
k

km
"
Z

x,
|y x|
q(md)
dy
#
1/q
"
Z

x,

k
[
yx
u

(y)

p
dy
#
1/p
.
472 BRUCE K. DRIVER

From the same computation as in Eq. (23.4) we nd


Z

x,
|y x|
q(md)
dy = ()
Z

0
r
q(md)
r
d1
dr = ()

q(md)+d
q (md) +d
= ()

pmd
p1
pmd
(p 1).
provided that pmd > 0 (i.e. m > d/p) wherein we have used
q (md) +d =
p
p 1
(md) +d =
p (md) +d (p 1)
p 1
=
pmd
p 1
.
This gives the estimate
"
Z

x,
|y x|
q(md)
dy
#
1/q

() (p 1)
pmd

p1
p

pmd
p
=

() (p 1)
pmd

p1
p

md/p
.
Thus we have obtained the estimate that
|u(x)|
C(g)
|| (m1)!

() (p 1)
pmd

p1
p

md/p
m
X
k=0

m
k

km

k
[
yx
u

L
p
(
x,p
)
.
25.6. Exercises.
Exercise 25.1. Let a
i
0 and p
i
[1, ) for i = 1, 2, . . . , d satisfy
P
d
i=1
p
1
i
= 1,
then
d
Y
i=1
a
i

d
X
i=1
1
p
i
a
p
i
i
.
Hint: This may be proved by induction on d making use of Lemma 2.27 or by
using Jensens inequality analogously to how the d = 2 case was done in Example
9.11.
ANALYSIS TOOLS WITH APPLICATIONS 473
26. Banach Spaces III: Calculus
In this section, X and Y will be Banach space and U will be an open subset of
X.
Notation 26.1 (, O, and o notation). Let 0 U
o
X, and f : U Y be a
function. We will write:
(1) f(x) = (x) if lim
x0
kf(x)k = 0.
(2) f(x) = O(x) if there are constants C < and r > 0 such that
kf(x)k Ckxk for all x B(0, r). This is equivalent to the condition
that limsup
x0
kf(x)k
kxk
< , where
limsup
x0
kf(x)k
kxk
lim
r0
sup{kf(x)k : 0 < kxk r}.
(3) f(x) = o(x) if f(x) = (x)O(x), i.e. lim
x0
kf(x)k/kxk = 0.
Example 26.2. Here are some examples of properties of these symbols.
(1) A function f : U
o
X Y is continuous at x
0
U if f(x
0
+ h) =
f(x
0
) +(h).
(2) If f(x) = (x) and g(x) = (x) then f(x) +g(x) = (x).
Now let g : Y Z be another function where Z is another Banach
space.
(3) If f(x) = O(x) and g(y) = o(y) then g f(x) = o(x).
(4) If f(x) = (x) and g(y) = (y) then g f(x) = (x).
26.1. The Dierential.
Denition 26.3. A function f : U
o
X Y is dierentiable at x
0
+h
0
U
if there exists a linear transformation L(X, Y ) such that
(26.1) f(x
0
+h) f(x
0
+h
0
) h = o(h).
We denote by f
0
(x
0
) or Df(x
0
) if it exists. As with continuity, f is dierentiable
on U if f is dierentiable at all points in U.
Remark 26.4. The linear transformation in Denition 26.3 is necessarily unique.
Indeed if
1
is another linear transformation such that Eq. (26.1) holds with
replaced by
1
, then
(
1
)h = o(h),
i.e.
limsup
h0
k(
1
)hk
khk
= 0.
On the other hand, by denition of the operator norm,
limsup
h0
k(
1
)hk
khk
= k
1
k.
The last two equations show that =
1
.
Exercise 26.1. Show that a function f : (a, b) X is a dierentiable at t (a, b)
in the sense of Denition 4.6 i it is dierentiable in the sense of Denition 26.3.
Also show Df(t)v = v

f(t) for all v R.
474 BRUCE K. DRIVER

Example 26.5. Assume that GL(X, Y ) is non-empty. Then f : GL(X, Y )


GL(Y, X) dened by f(A) A
1
is dierentiable and
f
0
(A)B = A
1
BA
1
for all B L(X, Y ).
Indeed (by Eq. (3.13)),
f(A+H) f(A) = (A+H)
1
A
1
= (A

I +A
1
H

)
1
A
1
=

I +A
1
H

)
1
A
1
A
1
=

X
n=0
(A
1
H)
n
A
1
A
1
= A
1
HA
1
+

X
n=2
(A
1
H)
n
.
Since
k

X
n=2
(A
1
H)
n
k

X
n=2
kA
1
Hk
n

kA
1
k
2
kHk
2
1 kA
1
Hk
,
we nd that
f(A+H) f(A) = A
1
HA
1
+o(H).
26.2. Product and Chain Rules. The following theorem summarizes some basic
properties of the dierential.
Theorem 26.6. The dierential D has the following properties:
Linearity: D is linear, i.e. D(f +g) = Df +Dg.
Product Rule: If f : U
o
X Y and A : U
o
X L(X, Z) are
dierentiable at x
0
then so is x (Af)(x) A(x)f(x) and
D(Af)(x
0
)h = (DA(x
0
)h)f(x
0
) +A(x
0
)Df(x
0
)h.
Chain Rule: If f : U
o
X V
o
Y is dierentiable at x
0
U, and
g : V
o
Y Z is dierentiable at y
0
f(h
o
), then g f is dierentiable
at x
0
and (g f)
0
(x
0
) = g
0
(y
0
)f
0
(x
0
).
Converse Chain Rule: Suppose that f : U
o
X V
o
Y is continuous
at x
0
U, g : V
o
Y Z is dierentiable y
0
f(h
o
), g
0
(y
0
) is invertible,
and g f is dierentiable at x
0
, then f is dierentiable at x
0
and
(26.2) f
0
(x
0
) [g
0
(x
0
)]
1
(g f)
0
(x
0
).
Proof. For the proof of linearity, let f, g : U
o
X Y be two functions which
are dierentiable at x
0
U and c R, then
(f +cg)(x
0
+h) = f(x
0
) +Df(x
0
)h +o(h) +c(g(x
0
) +Dg(x
0
)h +o(h)
= (f +cg)(x
0
) + (Df(x
0
) +cDg(x
0
))h +o(h),
which implies that (f +cg) is dierentiable at x
0
and that
D(f +cg)(x
0
) = Df(x
0
) +cDg(x
0
).
For item 2, we have
A(x
0
+h)f(x
0
+h) = (A(x
0
) +DA(x
0
)h +o(h))(f(x
0
) +f
0
(x
0
)h +o(h))
= A(x
0
)f(x
0
) +A(x
0
)f
0
(x
0
)h + [DA(x
0
)h]f(x
0
) +o(h),
which proves item 2.
ANALYSIS TOOLS WITH APPLICATIONS 475
Similarly for item 3,
(g f)(x
0
+h) = g(f(x
0
)) +g
0
(f(x
0
))(f(x
0
+h) f(x
0
)) +o(f(x
0
+h) f(x
0
))
= g(f(x
0
)) +g
0
(f(x
0
))(Df(x
0
)x
0
+o(h)) +o(f(x
0
+h) f(x
0
)
= g(f(x
0
)) +g
0
(f(x
0
))Df(x
0
)h +o(h),
where in the last line we have used the fact that f(x
0
+h) f(x
0
) = O(h) (see Eq.
(26.1)) and o(O(h)) = o(h).
Item 4. Since g is dierentiable at y
0
= f(x
0
),
g(f(x
0
+h)) g(f(x
0
)) = g
0
(f(x
0
))(f(x
0
+h) f(x
0
)) +o(f(x
0
+h) f(x
0
)).
And since g f is dierentiable at x
0
,
(g f)(x
0
+h) g(f(x
0
)) = (g f)
0
(x
0
)h +o(h).
Comparing these two equations shows that
f(x
0
+h) f(x
0
) = g
0
(f(x
0
))
1
{(g f)
0
(x
0
)h +o(h) o(f(x
0
+h) f(x
0
))}
= g
0
(f(x
0
))
1
(g f)
0
(x
0
)h +o(h)
g
0
(f(x
0
))
1
o(f(x
0
+h) f(x
0
)). (26.3)
Using the continuity of f, f(x
0
+ h) f(x
0
) is close to 0 if h is close to zero, and
hence ko(f(x
0
+h) f(x
0
))k
1
2
kf(x
0
+h) f(x
0
)k for all h suciently close to
0. (We may replace
1
2
by any number > 0 above.) Using this remark, we may
take the norm of both sides of equation (26.3) to nd
kf(x
0
+h) f(x
0
)k kg
0
(f(x
0
))
1
(g f)
0
(x
0
)kkhk +o(h) +
1
2
kf(x
0
+h) f(x
0
)k
for h close to 0. Solving for kf(x
0
+h) f(x
0
)k in this last equation shows that
(26.4) f(x
0
+h) f(x
0
) = O(h).
(This is an improvement, since the continuity of f only guaranteed that f(x
0
+h)
f(x
0
) = (h).) Because of Eq. (25.4), we now know that o(f(x
0
+h)f(x
0
)) = o(h),
which combined with Eq. (26.3) shows that
f(x
0
+h) f(x
0
) = g
0
(f(x
0
))
1
(g f)
0
(x
0
)h +o(h),
i.e. f is dierentiable at x
0
and f
0
(x
0
) = g
0
(f(x
0
))
1
(g f)
0
(x
0
).
Corollary 26.7. Suppose that : (a, b) U
o
X is dierentiable at t (a, b)
and f : U
o
X Y is dierentiable at (t) U. Then f is dierentiable at t
and
d(f )(t)/dt = f
0
((t)) (t).
Example 26.8. Let us continue on with Example 26.5 but now let X = Y to
simplify the notation. So f : GL(X) GL(X) is the map f(A) = A
1
and
f
0
(A) = L
A
1R
A
1, i.e. f
0
= L
f
R
f
.
where L
A
B = AB and R
A
B = AB for all A, B L(X). As the reader may easily
check, the maps
A L(X) L
A
, R
A
L(L(X))
are linear and bounded. So by the chain and the product rule we nd f
00
(A) exists
for all A L(X) and
f
00
(A)B = L
f
0
(A)B
R
f
L
f
R
f
0
(A)B
.
476 BRUCE K. DRIVER

More explicitly
(26.5) [f
00
(A)B] C = A
1
BA
1
CA
1
+A
1
CA
1
BA
1
.
Working inductively one shows f : GL(X) GL(X) dened by f(A) A
1
is
C

.
26.3. Partial Derivatives.
Denition 26.9 (Partial or Directional Derivative). Let f : U
o
X Y be a
function, x
0
U, and v X. We say that f is dierentiable at x
0
in the direction v
i
d
dt
|
0
(f(x
0
+tv)) =: (
v
f)(x
0
) exists. We call (
v
f)(x
0
) the directional or partial
derivative of f at x
0
in the direction v.
Notice that if f is dierentiable at x
0
, then
v
f(x
0
) exists and is equal to f
0
(x
0
)v,
see Corollary 26.7.
Proposition 26.10. Let f : U
o
X Y be a continuous function and D X be
a dense subspace of X. Assume
v
f(x) exists for all x U and v D, and there
exists a continuous function A : U L(X, Y ) such that
v
f(x) = A(x)v for all
v D and x U D. Then f C
1
(U, Y ) and Df = A.
Proof. Let x
0
U, > 0 such that B(x
0
, 2) U and M sup{kA(x)k : x
B(x
0
, 2)} <
43
. For x B(x
0
, ) D and v D B(0, ), by the fundamental
theorem of calculus,
(26.6)
f(x +v) f(x) =
Z
1
0
df(x +tv)
dt
dt =
Z
1
0
(
v
f)(x +tv) dt =
Z
1
0
A(x +tv) v dt.
For general x B(x
0
, ) and v B(0, ), choose x
n
B(x
0
, ) D and v
n

D B(0, ) such that x
n
x and v
n
v. Then
(26.7) f(x
n
+v
n
) f(x
n
) =
Z
1
0
A(x
n
+tv
n
) v
n
dt
holds for all n. The left side of this last equation tends to f(x + v) f(x) by the
continuity of f. For the right side of Eq. (26.7) we have
k
Z
1
0
A(x +tv) v dt
Z
1
0
A(x
n
+tv
n
) v
n
dtk
Z
1
0
kA(x +tv) A(x
n
+tv
n
) kkvk dt
+Mkv v
n
k.
It now follows by the continuity of A, the fact that kA(x+tv) A(x
n
+tv
n
) k M,
and the dominated convergence theorem that right side of Eq. (26.7) converges to
R
1
0
A(x +tv) v dt. Hence Eq. (26.6) is valid for all x B(x
0
, ) and v B(0, ). We
also see that
(26.8) f(x +v) f(x) A(x)v = (v)v,
43
It should be noted well, unlike in nite dimensions closed and bounded sets need not be
compact, so it is not sucient to choose suciently small so that B(x
0
, 2) U. Here is a
counter example. Let X H be a Hilbert space, {e
n
}

n=1
be an orthonormal set. Dene
f(x)
P

n=1
n(kx e
n
k), where is any continuous function on R such that (0) = 1 and
is supported in (1, 1). Notice that ke
n
e
m
k
2
= 2 for all m 6= n, so that ke
n
e
m
k =

2.
Using this fact it is rather easy to check that for any x
0
H, there is an > 0 such that for all
x B(x
0
, ), only one term in the sum dening f is non-zero. Hence, f is continuous. However,
f(e
n
) = n as n .
ANALYSIS TOOLS WITH APPLICATIONS 477
where (v)
R
1
0
[A(x +tv) A(x)] dt. Now
k(v)k
Z
1
0
kA(x +tv) A(x)k dt max
t[0,1]
kA(x +tv) A(x)k 0 as v 0,
by the continuity of A. Thus, we have shown that f is dierentiable and that
Df(x) = A(x).
26.4. Smooth Dependence of ODEs on Initial Conditions . In this subsec-
tion, let X be a Banach space, U
o
X and J be an open interval with 0 J.
Lemma 26.11. If Z C(J U, X) such that D
x
Z(t, x) exists for all (t, x) J U
and D
x
Z(t, x) C(J U, X) then Z is locally Lipschitz in x, see Denition 5.12.
Proof. Suppose I @@ J and x U. By the continuity of DZ, for every t I
there an open neighborhood N
t
of t I and
t
> 0 such that B(x,
t
) U and
sup{kD
x
Z(t
0
, x
0
)k : (t
0
, x
0
) N
t
B(x,
t
)} < .
By the compactness of I, there exists a nite subset I such that I
tI
N
t
.
Let (x, I) := min{
t
: t } and
K(x, I) sup{kDZ(t, x
0
)k(t, x
0
) I B(x, (x, I))} < .
Then by the fundamental theorem of calculus and the triangle inequality,
kZ(t, x
1
)Z(t, x
0
)k
Z
1
0
kD
x
Z(t, x
0
+s(x
1
x
0
)k ds

kx
1
x
0
k K(x, I)kx
1
x
0
k
for all x
0
, x
1
B(x, (x, I)) and t I.
Theorem 26.12 (Smooth Dependence of ODEs on Initial Conditions). Let X be
a Banach space, U
o
X, Z C(R U, X) such that D
x
Z C(R U, X) and
: D(Z) R X X denote the maximal solution operator to the ordinary
dierential equation
(26.9) y(t) = Z(t, y(t)) with y(0) = x U,
see Notation 5.15 and Theorem 5.21. Then C
1
(D(Z), U),
t
D
x
(t, x) exists
and is continuous for (t, x) D(Z) and D
x
(t, x) satises the linear dierential
equation,
(26.10)
d
dt
D
x
(t, x) = [(D
x
Z) (t, (t, x))]D
x
(t, x) with D
x
(0, x) = I
X
for t J
x
.
Proof. Let x
0
U and J be an open interval such that 0 J

J @@ J
x
0
,
y
0
:= y(, x
0
)|
J
and
O

:= {y BC(J, U) : ky y
0
k

< }
o
BC(J, X).
By Lemma 26.11, Z is locally Lipschitz and therefore Theorem 5.21 is applicable.
By Eq. (5.30) of Theorem 5.21, there exists > 0 and > 0 such that G :
B(x
0
, ) O

dened by G(x) (, x)|


J
is continuous. By Lemma 26.13 below,
for > 0 suciently small the function F : O

BC(J, X) dened by
(26.11) F(y) y
Z

0
Z(t, y(t))dt.
478 BRUCE K. DRIVER

is C
1
and
(26.12) DF(y)v = v
Z

0
D
y
Z(t, y(t))v(t)dt.
By the existence and uniqueness Theorem 5.5 for linear ordinary dierential
equations, DF(y) is invertible for any y BC(J, U). By the denition of ,
F(G(x)) = h(x) for all x B(x
0
, ) where h : X BC(J, X) is dened by
h(x)(t) = x for all t J, i.e. h(x) is the constant path at x. Since h is a bounded
linear map, h is smooth and Dh(x) = h for all x X. We may now apply the
converse to the chain rule in Theorem 26.6 to conclude G C
1
(B(x
0
, ), O) and
DG(x) = [DF(G(x))]
1
Dh(x) or equivalently, DF(G(x))DG(x) = h which in turn
is equivalent to
D
x
(t, x)
Z
t
0
[DZ((, x)]D
x
(, x) d = I
X
.
As usual this equation implies D
x
(t, x) is dierentiable in t, D
x
(t, x) is continuous
in (t, x) and D
x
(t, x) satises Eq. (26.10).
Lemma 26.13. Continuing the notation used in the proof of Theorem 26.12 and
further let
f(y)
Z

0
Z(, y()) d for y O

.
Then f C
1
(O

, Y ) and for all y O

,
f
0
(y)h =
Z

0
D
x
Z(, y())h() d =:
y
h.
Proof. Let h Y be suciently small and J, then by fundamental theorem
of calculus,
Z(, y() +h()) Z(, y()) =
Z
1
0
[D
x
Z(, y() +rh()) D
x
Z(, y())]dr
and therefore,
(f(y +h) f(y)
y
h) (t) =
Z
t
0
[Z(, y() +h()) Z(, y()) D
x
Z(, y())h() ] d
=
Z
t
0
d
Z
1
0
dr[D
x
Z(, y() +rh()) D
x
Z(, y())]h().
Therefore,
(26.13) k(f(y +h) f(y)
y
h)k

khk

(h)
where
(h) :=
Z
J
d
Z
1
0
dr kD
x
Z(, y() +rh()) D
x
Z(, y())k .
With the aide of Lemmas 26.11 and Lemma 5.13,
(r, , h) [0, 1] J Y kD
x
Z(, y() +rh())k
is bounded for small h provided > 0 is suciently small. Thus it follows from the
dominated convergence theorem that (h) 0 as h 0 and hence Eq. (26.13)
implies f
0
(y) exists and is given by
y
. Similarly,
kf
0
(y +h) f
0
(y)k
op

Z
J
kD
x
Z(, y() +h()) D
x
Z(, y())k d 0 as h 0
ANALYSIS TOOLS WITH APPLICATIONS 479
showing f
0
is continuous.
Remark 26.14. If Z C
k
(U, X), then an inductive argument shows that
C
k
(D(Z), X). For example if Z C
2
(U, X) then (y(t), u(t)) := ((t, x), D
x
(t, x))
solves the ODE,
d
dt
(y(t), u(t)) =

Z ((y(t), u(t))) with (y(0), u(0)) = (x, Id
X
)
where

Z is the C
1
vector eld dened by

Z (x, u) = (Z(x), D
x
Z(x)u) .
Therefore Theorem 26.12 may be applied to this equation to deduce: D
2
x
(t, x) and
D
2
x

(t, x) exist and are continuous. We may now dierentiate Eq. (26.10) to nd
D
2
x
(t, x) satises the ODE,
d
dt
D
2
x
(t, x) = [

D
x
(t,x)
D
x
Z

(t, (t, x))]D


x
(t, x) + [(D
x
Z) (t, (t, x))]D
2
x
(t, x)
with D
2
x
(0, x) = 0.
26.5. Higher Order Derivatives. As above, let f : U
o
X Y be a function.
If f is dierentiable on U, then the dierential Df of f is a function from U to
the Banach space L(X, Y ). If the function Df : U L(X, Y ) is also dieren-
tiable on U, then its dierential D
2
f = D(Df) : U L(X, L(X, Y )). Similarly,
D
3
f = D(D(Df)) : U L(X, L(X, L(X, Y ))) if the dierential of D(Df) ex-
ists. In general, let L
1
(X, Y ) L(X, Y ) and L
k
(X, Y ) be dened inductively by
L
k+1
(X, Y ) = L(X, L
k
(X, Y )). Then (D
k
f)(x) L
k
(X, Y ) if it exists. It will be
convenient to identify the space L
k
(X, Y ) with the Banach space dened in the
next denition.
Denition 26.15. For k {1, 2, 3, . . .}, let M
k
(X, Y ) denote the set of functions
f : X
k
Y such that
(1) For i {1, 2, . . . , k}, v X fhv
1
, v
2
, . . . , v
i1
, v, v
i+1
, . . . , v
k
i Y is
linear
44
for all {v
i
}
n
i=1
X.
(2) The norm kfk
M
k
(X,Y )
should be nite, where
kfk
M
k
(X,Y )
sup{
kfhv
1
, v
2
, . . . , v
k
ik
Y
kv
1
kkv
2
k kv
k
k
: {v
i
}
k
i=1
X \ {0}}.
Lemma 26.16. There are linear operators j
k
: L
k
(X, Y ) M
k
(X, Y ) dened in-
ductively as follows: j
1
= Id
L(X,Y )
(notice that M
1
(X, Y ) = L
1
(X, Y ) = L(X, Y ))
and
(j
k+1
A)hv
0
, v
1
, . . . , v
k
i = (j
k
(Av
0
))hv
1
, v
2
, . . . , v
k
i v
i
X.
(Notice that Av
0
L
k
(X, Y ).) Moreover, the maps j
k
are isometric isomorphisms.
Proof. To get a feeling for what j
k
is let us write out j
2
and j
3
explicitly. If A
L
2
(X, Y ) = L(X, L(X, Y )), then (j
2
A)hv
1
, v
2
i = (Av
1
)v
2
and if A L
3
(X, Y ) =
L(X, L(X, L(X, Y ))), (j
3
A)hv
1
, v
2
, v
3
i = ((Av
1
)v
2
)v
3
for all v
i
X.
It is easily checked that j
k
is linear for all k. We will now show by induction that
j
k
is an isometry and in particular that j
k
is injective. Clearly this is true if k = 1
since j
1
is the identity map. For A L
k+1
(X, Y ),
44
I will routinely write fhv
1
, v
2
, . . . , v
k
i rather than f(v
1
, v
2
, . . . , v
k
) when the function f
depends on each of variables linearly, i.e. f is a multi-linear function.
480 BRUCE K. DRIVER

kj
k+1
Ak
M
k+1
(X,Y )
sup{
k(j
k
(Av
0
))hv
1
, v
2
, . . . , v
k
ik
Y
kv
0
kkv
1
kkv
2
k kv
k
k
: {v
i
}
k
i=0
X \ {0}}
sup{
k(j
k
(Av
0
))k
M
k
(X,Y )
kv
0
k
: v
0
X \ {0}}
= sup{
kAv
0
k
L
k
(X,Y )
kv
0
k
: v
0
X \ {0}}
= kAk
L(X,L
k
(X,Y ))
kAk
L
k+1
(X,Y )
,
wherein the second to last inequality we have used the induction hypothesis. This
shows that j
k+1
is an isometry provided j
k
is an isometry.
To nish the proof it suces to shows that j
k
is surjective for all k. Again this is
true for k = 1. Suppose that j
k
is invertible for some k 1. Given f M
k+1
(X, Y )
we must produce A L
k+1
(X, Y ) = L(X, L
k
(X, Y )) such that j
k+1
A = f. If such
an equation is to hold, then for v
0
X, we would have j
k
(Av
0
) = fhv
0
, i. That
is Av
0
= j
1
k
(fhv
0
, i). It is easily checked that A so dened is linear, bounded,
and j
k+1
A = f.
From now on we will identify L
k
with M
k
without further mention. In particular,
we will view D
k
f as function on U with values in M
k
(X, Y ).
Theorem 26.17 (Dierentiability). Suppose k {1, 2, . . .} and D is a dense
subspace of X, f : U
o
X Y is a function such that (
v
1

v
2

v
l
f)(x)
exists for all x D U, {v
i
}
l
i=1
D, and l = 1, 2, . . . k. Further assume
there exists continuous functions A
l
: U
o
X M
l
(X, Y ) such that such
that (
v
1

v
2

v
l
f)(x) = A
l
(x)hv
1
, v
2
, . . . , v
l
i for all x D U, {v
i
}
l
i=1
D,
and l = 1, 2, . . . k. Then D
l
f(x) exists and is equal to A
l
(x) for all x U and
l = 1, 2, . . . , k.
Proof. We will prove the theorem by induction on k. We have already proved
the theorem when k = 1, see Proposition 26.10. Now suppose that k > 1 and that
the statement of the theorem holds when k is replaced by k 1. Hence we know
that D
l
f(x) = A
l
(x) for all x U and l = 1, 2, . . . , k 1. We are also given that
(26.14) (
v
1

v
2

v
k
f)(x) = A
k
(x)hv
1
, v
2
, . . . , v
k
i x U D, {v
i
} D.
Now we may write (
v
2

v
k
f)(x) as (D
k1
f)(x)hv
2
, v
3
, . . . , v
k
i so that Eq.
(26.14) may be written as
(26.15)

v
1
(D
k1
f)(x)hv
2
, v
3
, . . . , v
k
i) = A
k
(x)hv
1
, v
2
, . . . , v
k
i x U D, {v
i
} D.
So by the fundamental theorem of calculus, we have that
(26.16)
((D
k1
f)(x +v
1
) (D
k1
f)(x))hv
2
, v
3
, . . . , v
k
i =
Z
1
0
A
k
(x +tv
1
)hv
1
, v
2
, . . . , v
k
i dt
for all x U D and {v
i
} D with v
1
suciently small. By the same argument
given in the proof of Proposition 26.10, Eq. (26.16) remains valid for all x U and
{v
i
} X with v
1
suciently small. We may write this last equation alternatively
as,
(26.17) (D
k1
f)(x +v
1
) (D
k1
f)(x) =
Z
1
0
A
k
(x +tv
1
)hv
1
, i dt.
ANALYSIS TOOLS WITH APPLICATIONS 481
Hence
(D
k1
f)(x+v
1
)(D
k1
f)(x)A
k
(x)hv
1
, i =
Z
1
0
[A
k
(x+tv
1
)A
k
(x)]hv
1
, i dt
from which we get the estimate,
(26.18) k(D
k1
f)(x +v
1
) (D
k1
f)(x) A
k
(x)hv
1
, ik (v
1
)kv
1
k
where (v
1
)
R
1
0
kA
k
(x + tv
1
) A
k
(x)k dt. Notice by the continuity of A
k
that
(v
1
) 0 as v
1
0. Thus it follow from Eq. (26.18) that D
k1
f is dierentiable
and that (D
k
f)(x) = A
k
(x).
Example 26.18. Let f : L

(X, Y ) L

(Y, X) be dened by f(A) A


1
. We
assume that L

(X, Y ) is not empty. Then f is innitely dierentiable and


(26.19)
(D
k
f)(A)hV
1
, V
2
, . . . , V
k
i = (1)
k
X

{B
1
V
(1)
B
1
V
(2)
B
1
B
1
V
(k)
B
1
},
where sum is over all permutations of of {1, 2, . . . , k}.
Let me check Eq. (26.19) in the case that k = 2. Notice that we have already
shown that (
V
1
f)(B) = Df(B)V
1
= B
1
V
1
B
1
. Using the product rule we nd
that
(
V
2

V
1
f)(B) = B
1
V
2
B
1
V
1
B
1
+B
1
V
1
B
1
V
2
B
1
=: A
2
(B)hV
1
, V
2
i.
Notice that kA
2
(B)hV
1
, V
2
ik 2kB
1
k
3
kV
1
k kV
2
k, so that kA
2
(B)k 2kB
1
k
3
<
. Hence A
2
: L

(X, Y ) M
2
(L(X, Y ), L(Y, X)). Also
k(A
2
(B) A
2
(C))hV
1
, V
2
ik 2kB
1
V
2
B
1
V
1
B
1
C
1
V
2
C
1
V
1
C
1
k
2kB
1
V
2
B
1
V
1
B
1
B
1
V
2
B
1
V
1
C
1
k
+ 2kB
1
V
2
B
1
V
1
C
1
B
1
V
2
C
1
V
1
C
1
k
+ 2kB
1
V
2
C
1
V
1
C
1
C
1
V
2
C
1
V
1
C
1
k
2kB
1
k
2
kV
2
kkV
1
kkB
1
C
1
k
+ 2kB
1
kkC
1
kkV
2
kkV
1
kkB
1
C
1
k
+ 2kC
1
k
2
kV
2
kkV
1
kkB
1
C
1
k.
This shows that
kA
2
(B) A
2
(C)k 2kB
1
C
1
k{kB
1
k
2
+kB
1
kkC
1
k +kC
1
k
2
}.
Since B B
1
is dierentiable and hence continuous, it follows that A
2
(B) is
also continuous in B. Hence by Theorem 26.17 D
2
f(A) exists and is given as in Eq.
(26.19)
Example 26.19. Suppose that f : R R is a C

function and
F(x)
R
1
0
f(x(t)) dt for x X C([0, 1], R) equipped with the norm kxk
max
t[0,1]
|x(t)|. Then F : X R is also innitely dierentiable and
(26.20) (D
k
F)(x)hv
1
, v
2
, . . . , v
k
i =
Z
1
0
f
(k)
(x(t))v
1
(t) v
k
(t) dt,
for all x X and {v
i
} X.
482 BRUCE K. DRIVER

To verify this example, notice that


(
v
F)(x)
d
ds
|
0
F(x +sv) =
d
ds
|
0
Z
1
0
f(x(t) +sv(t)) dt
=
Z
1
0
d
ds
|
0
f(x(t) +sv(t)) dt =
Z
1
0
f
0
(x(t))v(t) dt.
Similar computations show that
(
v
1

v
2

v
k
f)(x) =
Z
1
0
f
(k)
(x(t))v
1
(t) v
k
(t) dt =: A
k
(x)hv
1
, v
2
, . . . , v
k
i.
Now for x, y X,
|A
k
(x)hv
1
, v
2
, . . . , v
k
i A
k
(y)hv
1
, v
2
, . . . , v
k
i|
Z
1
0
|f
(k)
(x(t)) f
(k)
(y(t))| |v
1
(t) v
k
(t) |dt

k
Y
i=1
kv
i
k
Z
1
0
|f
(k)
(x(t)) f
(k)
(y(t))|dt,
which shows that
kA
k
(x) A
k
(y)k
Z
1
0
|f
(k)
(x(t)) f
(k)
(y(t))|dt.
This last expression is easily seen to go to zero as y x in X. Hence A
k
is
continuous. Thus we may apply Theorem 26.17 to conclude that Eq. (26.20) is
valid.
26.6. Contraction Mapping Principle.
Theorem 26.20. Suppose that (X, ) is a complete metric space and S : X X
is a contraction, i.e. there exists (0, 1) such that (S(x), S(y)) (x, y) for
all x, y X. Then S has a unique xed point in X, i.e. there exists a unique point
x X such that S(x) = x.
Proof. For uniqueness suppose that x and x
0
are two xed points of S, then
(x, x
0
) = (S(x), S(x
0
)) (x, x
0
).
Therefore (1 )(x, x
0
) 0 which implies that (x, x
0
) = 0 since 1 > 0. Thus
x = x
0
.
For existence, let x
0
X be any point in X and dene x
n
X inductively by
x
n+1
= S(x
n
) for n 0. We will show that x lim
n
x
n
exists in X and because
S is continuous this will imply,
x = lim
n
x
n+1
= lim
n
S(x
n
) = S( lim
n
x
n
) = S(x),
showing x is a xed point of S.
So to nish the proof, because X is complete, it suces to show {x
n
}

n=1
is a
Cauchy sequence in X. An easy inductive computation shows, for n 0, that
(x
n+1
, x
n
) = (S(x
n
), S(x
n1
)) (x
n
, x
n1
)
n
(x
1
, x
0
).
Another inductive argument using the triangle inequality shows, for m > n, that,
(x
m
, x
n
) (x
m
, x
m1
) +(x
m1
, x
n
)
m1
X
k=n
(x
k+1
, x
k
).
ANALYSIS TOOLS WITH APPLICATIONS 483
Combining the last two inequalities gives (using again that (0, 1)),
(x
m
, x
n
)
m1
X
k=n

k
(x
1
, x
0
) (x
1
, x
0
)
n

X
l=0

l
= (x
1
, x
0
)

n
1
.
This last equation shows that (x
m
, x
n
) 0 as m, n , i.e. {x
n
}

n=0
is a
Cauchy sequence.
Corollary 26.21 (Contraction Mapping Principle II). Suppose that (X, ) is a
complete metric space and S : X X is a continuous map such that S
(n)
is a
contraction for some n N. Here
S
(n)

n times
z }| {
S S . . . S
and we are assuming there exists (0, 1) such that (S
(n)
(x), S
(n)
(y)) (x, y)
for all x, y X. Then S has a unique xed point in X.
Proof. Let T S
(n)
, then T : X X is a contraction and hence T has a
unique xed point x X. Since any xed point of S is also a xed point of T, we
see if S has a xed point then it must be x. Now
T(S(x)) = S
(n)
(S(x)) = S(S
(n)
(x)) = S(T(x)) = S(x),
which shows that S(x) is also a xed point of T. Since T has only one xed point,
we must have that S(x) = x. So we have shown that x is a xed point of S and
this xed point is unique.
Lemma 26.22. Suppose that (X, ) is a complete metric space, n N, Z is a
topological space, and (0, 1). Suppose for each z Z there is a map S
z
: X X
with the following properties:
Contraction property: (S
(n)
z
(x), S
(n)
z
(y)) (x, y) for all x, y X and
z Z.
Continuity in z: For each x X the map z Z S
z
(x) X is continu-
ous.
By Corollary 26.21 above, for each z Z there is a unique xed point G(z) X
of S
z
.
Conclusion: The map G : Z X is continuous.
Proof. Let T
z
S
(n)
z
. If z, w Z, then
(G(z), G(w)) = (T
z
(G(z)), T
w
(G(w)))
(T
z
(G(z)), T
w
(G(z))) +(T
w
(G(z)), T
w
(G(w)))
(T
z
(G(z)), T
w
(G(z))) +(G(z), G(w)).
Solving this inequality for (G(z), G(w)) gives
(G(z), G(w))
1
1
(T
z
(G(z)), T
w
(G(z))).
Since w T
w
(G(z)) is continuous it follows from the above equation that G(w)
G(z) as w z, i.e. G is continuous.
484 BRUCE K. DRIVER

26.7. Inverse and Implicit Function Theorems. In this section, let X be a


Banach space, U X be an open set, and F : U X and : U X be
continuous functions. Question: under what conditions on is F(x) := x + (x)
a homeomorphism from B
0
() to F(B
0
()) for some small > 0? Lets start by
looking at the one dimensional case rst. So for the moment assume that X = R,
U = (1, 1), and : U R is C
1
. Then F will be one to one i F is monotonic.
This will be the case, for example, if F
0
= 1 +
0
> 0. This in turn is guaranteed
by assuming that |
0
| < 1. (This last condition makes sense on a Banach space
whereas assuming 1 +
0
> 0 is not as easily interpreted.)
Lemma 26.23. Suppose that U = B = B(0, r) (r > 0) is a ball in X and : B
X is a C
1
function such that kDk < on U. Then for all x, y U we
have:
(26.21) k(x) (y)k kx yk.
Proof. By the fundamental theorem of calculus and the chain rule:
(y) (x) =
Z
1
0
d
dt
(x +t(y x))dt
=
Z
1
0
[D(x +t(y x))](y x)dt.
Therefore, by the triangle inequality and the assumption that kD(x)k on B,
k(y) (x)k
Z
1
0
kD(x +t(y x))kdt k(y x)k k(y x)k.
Remark 26.24. It is easily checked that if : B = B(0, r) X is C
1
and satises
(26.21) then kDk on B.
Using the above remark and the analogy to the one dimensional example, one is
lead to the following proposition.
Proposition 26.25. Suppose that U = B = B(0, r) (r > 0) is a ball in X,
(0, 1), : U X is continuous, F(x) x +(x) for x U, and satises:
(26.22) k(x) (y)k kx yk x, y B.
Then F(B) is open in X and F : B V := F(B) is a homeomorphism.
Proof. First notice from (26.22) that
kx yk = k(F(x) F(y)) ((x) (y))k
kF(x) F(y)k +k(x) (y)k
kF(x) F(y)k +k(x y)k
from which it follows that kx yk (1 )
1
kF(x) F(y)k. Thus F is injective
on B. Let V
.
= F(B) and G = F
1
: V B denote the inverse function which
exists since F is injective.
We will now show that V is open. For this let x
0
B and z
0
= F(x
0
) =
x
0
+(x
0
) V. We wish to show for z close to z
0
that there is an x B such that
F(x) = x+(x) = z or equivalently x = z (x). Set S
z
(x)
.
= z (x), then we are
looking for x B such that x = S
z
(x), i.e. we want to nd a xed point of S
z
. We
will show that such a xed point exists by using the contraction mapping theorem.
ANALYSIS TOOLS WITH APPLICATIONS 485
Step 1. S
z
is contractive for all z X. In fact for x, y B,
(26.23) kS
z
(x) S
z
(y)k = k(x) (y))k kx yk.
Step 2. For any > 0 such the C
.
= B(x
0
, ) B and z X such that
kz z
0
k < (1 ), we have S
z
(C) C. Indeed, let x C and compute:
kS
z
(x) x
0
k = kS
z
(x) S
z
0
(x
0
)k
= kz (x) (z
0
(x
0
))k
= kz z
0
((x) (x
0
))k
kz z
0
k +kx x
0
k
< (1 ) + = .
wherein we have used z
0
= F(x
0
) and (26.22).
Since C is a closed subset of a Banach space X, we may apply the contraction
mapping principle, Theorem 26.20 and Lemma 26.22, to S
z
to show there is a
continuous function G : B(z
0
, (1 )) C such that
G(z) = S
z
(G(z)) = z (G(z)) = z F(G(z)) +G(z),
i.e. F(G(z)) = z. This shows that B(z
0
, (1 )) F(C) F(B) = V. That is
z
0
is in the interior of V. Since F
1
|
B(z
0
,(1))
is necessarily equal to G which is
continuous, we have also shown that F
1
is continuous in a neighborhood of z
0
.
Since z
0
V was arbitrary, we have shown that V is open and that F
1
: V U
is continuous.
Theorem 26.26 (Inverse Function Theorem). Suppose X and Y are Banach
spaces, U
o
X, f C
k
(U X) with k 1, x
0
U and Df(x
0
) is invert-
ible. Then there is a ball B = B(x
0
, r) in U centered at x
0
such that
(1) V = f(B) is open,
(2) f|
B
: B V is a homeomorphism,
(3) g
.
= (f|
B
)
1
C
k
(V, B) and
(26.24) g
0
(y) = [f
0
(g(y))]
1
for all y V.
Proof. Dene F(x) [Df(x
0
)]
1
f(x + x
0
) and (x) x F(x) X for
x (U x
0
). Notice that 0 U x
0
, DF(0) = I, and that D(0) = I I = 0.
Choose r > 0 such that

B B(0, r) U x
0
and kD(x)k
1
2
for x

B. By
Lemma 26.23, satises (26.23) with = 1/2. By Proposition 26.25, F(

B) is open
and F|

B
:

B F(

B) is a homeomorphism. Let G F|
1

B
which we know to be a
continuous map from F(

B)

B.
Since kD(x)k 1/2 for x

B, DF(x) = I +D(x) is invertible, see Corollary
3.70. Since H(z)
.
= z is C
1
and H = F G on F(

B), it follows from the converse
to the chain rule, Theorem 26.6, that G is dierentiable and
DG(z) = [DF(G(z))]
1
DH(z) = [DF(G(z))]
1
.
Since G, DF, and the map A GL(X) A
1
GL(X) are all continuous maps,
(see Example 26.5) the map z F(

B) DG(z) L(X) is also continuous, i.e. G
is C
1
.
Let B =

B +x
0
= B(x
0
, r) U. Since f(x) = [Df(x
0
)]F(x x
0
) and Df(x
0
) is
invertible (hence an open mapping), V := f(B) = [Df(x
0
)]F(

B) is open in X. It
486 BRUCE K. DRIVER

is also easily checked that f|


1
B
exists and is given by
(26.25) f|
1
B
(y) = x
0
+G([Df(x
0
)]
1
y)
for y V = f(B). This shows that f|
B
: B V is a homeomorphism and it follows
from (26.25) that g
.
= (f|
B
)
1
C
1
(V, B). Eq. (26.24) now follows from the chain
rule and the fact that
f g(y) = y for all y B.
Since f
0
C
k1
(B, L(X)) and i(A) := A
1
is a smooth map by Example 26.18,
g
0
= i f
0
g is C
1
if k 2, i.e. g is C
2
if k 2. Again using g
0
= i f
0
g, we may
conclude g
0
is C
2
if k 3, i.e. g is C
3
if k 3. Continuing bootstrapping our way
up we eventually learn g
.
= (f|
B
)
1
C
k
(V, B) if f is C
k
.
Theorem 26.27 (Implicit Function Theorem). Now suppose that X, Y, and W
are three Banach spaces, k 1, A X Y is an open set, (x
0
, y
0
) is a
point in A, and f : A W is a C
k
map such f(x
0
, y
0
) = 0. Assume that
D
2
f(x
0
, y
0
) D(f(x
0
, ))(y
0
) : Y W is a bounded invertible linear transforma-
tion. Then there is an open neighborhood U
0
of x
0
in X such that for all connected
open neighborhoods U of x
0
contained in U
0
, there is a unique continuous function
u : U Y such that u(x
0
) = y
o
, (x, u(x)) A and f(x, u(x)) = 0 for all x U.
Moreover u is necessarily C
k
and
(26.26) Du(x) = D
2
f(x, u(x))
1
D
1
f(x, u(x)) for all x U.
Proof. Proof of 26.27. By replacing f by (x, y) D
2
f(x
0
, y
0
)
1
f(x, y) if
necessary, we may assume with out loss of generality that W = Y and D
2
f(x
0
, y
0
) =
I
Y
. Dene F : A X Y by F(x, y) (x, f(x, y)) for all (x, y) A. Notice that
DF(x, y) =

I D
1
f(x, y)
0 D
2
f(x, y)

which is invertible i D
2
f(x, y) is invertible and if D
2
f(x, y) is invertible then
DF(x, y)
1
=

I D
1
f(x, y)D
2
f(x, y)
1
0 D
2
f(x, y)
1

.
Since D
2
f(x
0
, y
0
) = I is invertible, the implicit function theorem guarantees that
there exists a neighborhood U
0
of x
0
and V
0
of y
0
such that U
0
V
0
A, F(U
0
V
0
)
is open in X Y, F|
(U
0
V
0
)
has a C
k
inverse which we call F
1
. Let
2
(x, y) y
for all (x, y) XY and dene C
k
function u
0
on U
0
by u
0
(x)
2
F
1
(x, 0).
Since F
1
(x, 0) = ( x, u
0
(x)) i (x, 0) = F( x, u
0
(x)) = ( x, f( x, u
0
(x))), it follows
that x = x and f(x, u
0
(x)) = 0. Thus (x, u
0
(x)) = F
1
(x, 0) U
0
V
0
A and
f(x, u
0
(x)) = 0 for all x U
0
. Moreover, u
0
is C
k
being the composition of the C
k

functions, x (x, 0), F


1
, and
2
. So if U U
0
is a connected set containing x
0
,
we may dene u u
0
|
U
to show the existence of the functions u as described in
the statement of the theorem. The only statement left to prove is the uniqueness
of such a function u.
Suppose that u
1
: U Y is another continuous function such that u
1
(x
0
) = y
0
,
and (x, u
1
(x)) A and f(x, u
1
(x)) = 0 for all x U. Let
O {x U|u(x) = u
1
(x)} = {x U|u
0
(x) = u
1
(x)}.
Clearly O is a (relatively) closed subset of U which is not empty since x
0
O.
Because U is connected, if we show that O is also an open set we will have shown
ANALYSIS TOOLS WITH APPLICATIONS 487
that O = U or equivalently that u
1
= u
0
on U. So suppose that x O, i.e.
u
0
(x) = u
1
(x). For x near x U,
(26.27) 0 = 0 0 = f( x, u
0
( x)) f( x, u
1
( x)) = R( x)(u
1
( x) u
0
( x))
where
(26.28) R( x)
Z
1
0
D
2
f(( x, u
0
( x) +t(u
1
( x) u
0
( x)))dt.
From Eq. (26.28) and the continuity of u
0
and u
1
, lim
xx
R( x) = D
2
f(x, u
0
(x))
which is invertible
45
. Thus R( x) is invertible for all x suciently close to x. Using
Eq. (26.27), this last remark implies that u
1
( x) = u
0
( x) for all x suciently close
to x. Since x O was arbitrary, we have shown that O is open.
26.8. More on the Inverse Function Theorem. In this section X and Y will
denote two Banach spaces, U
o
X, k 1, and f C
k
(U, Y ). Suppose x
0
U,
h X, and f
0
(x
0
) is invertible, then
f(x
0
+h) f(x
0
) = f
0
(x
0
)h +o(h) = f
0
(x
0
) [h +(h)]
where
(h) = f
0
(x
0
)
1
[f(x
0
+h) f(x
0
)] h = o(h).
In fact by the fundamental theorem of calculus,
(h) =
Z
1
0

f
0
(x
0
)
1
f
0
(x
0
+th) I

hdt
but we will not use this here.
Let h, h
0
B
X
(0, R) and apply the fundamental theorem of calculus to t
f(x
0
+t(h
0
h)) to conclude
(h
0
) (h) = f
0
(x
0
)
1
[f(x
0
+h
0
) f(x
0
+h)] (h
0
h)
=
Z
1
0

f
0
(x
0
)
1
f
0
(x
0
+t(h
0
h)) I

dt

(h
0
h).
Taking norms of this equation gives
k(h
0
) (h)k
Z
1
0

f
0
(x
0
)
1
f
0
(x
0
+t(h
0
h)) I

dt

kh
0
hk kh
0
hk
where
(26.29) := sup
xB
X
(x
0
,R)

f
0
(x
0
)
1
f
0
(x) I

L(X)
.
We summarize these comments in the following lemma.
Lemma 26.28. Suppose x
0
U, R > 0, f : B
X
(x
0
, R) Y be a C
1
function
such that f
0
(x
0
) is invertible, is as in Eq. (26.29) and C
1

B
X
(0, R), X

is
dened by
(26.30) f(x
0
+h) = f(x
0
) +f
0
(x
0
) (h +(h)) .
Then
(26.31) k(h
0
) (h)k kh
0
hk for all h, h
0
B
X
(0, R).
45
Notice that DF(x, u
0
(x)) is invertible for all x U
0
since F|
U
0
V
0
has a C
1
inverse. There-
fore D
2
f(x, u
0
(x)) is also invertible for all x U
0
.
488 BRUCE K. DRIVER

Furthermore if < 1 (which may be achieved by shrinking R if necessary) then


f
0
(x) is invertible for all x B
X
(x
0
, R) and
(26.32) sup
xB
X
(x
0
,R)

f
0
(x)
1

L(Y,X)

1
1

f
0
(x
0
)
1

L(Y,X)
.
Proof. It only remains to prove Eq. (26.32), so suppose now that < 1. Then
by Proposition 3.69 f
0
(x
0
)
1
f
0
(x) is invertible and

f
0
(x
0
)
1
f
0
(x)


1
1
for all x B
X
(x
0
, R).
Since f
0
(x) = f
0
(x
0
)

f
0
(x
0
)
1
f
0
(x)

this implies f
0
(x) is invertible and

f
0
(x)
1

f
0
(x
0
)
1
f
0
(x)

1
f
0
(x
0
)
1


1
1

f
0
(x
0
)
1

for all x B
X
(x
0
, R).
Theorem 26.29 (Inverse Function Theorem). Suppose U
o
X, k 1 and f
C
k
(U, Y ) such that f
0
(x) is invertible for all x U. Then:
(1) f : U Y is an open mapping, in particular V := f(U)
o
Y.
(2) If f is injective, then f
1
: V U is also a C
k
map and

f
1

0
(y) =

f
0
(f
1
(y))

1
for all y V.
(3) If x
0
U and R > 0 such that B
X
(x
0
, R) U and
sup
xB
X
(x
0
,R)

f
0
(x
0
)
1
f
0
(x) I

= < 1
(which may always be achieved by taking R suciently small by continuity
of f
0
(x)) then f|
B
X
(x
0
,R)
: B
X
(x
0
, R) f(B
X
(x
0
, R)) is invertible and
f|
1
B
X
(x
0
,R)
: f

B
X
(x
0
, R)

B
X
(x
0
, R) is C
k
.
(4) Keeping the same hypothesis as in item 3. and letting y
0
= f(x
0
) Y,
f(B
X
(x
0
, r)) B
Y
(y
0
, kf
0
(x
0
)k (1 +)r) for all r R
and
B
Y
(y
0
, ) f(B
X
(x
0
, (1 )
1

f
0
(x
0
)
1

))
for all < (x
0
) := (1 ) R/

f
0
(x
0
)
1

.
Proof. Let x
0
and R > 0 be as in item 3. above and be as dened in Eq.
(26.30) above, so that for x, x
0
B
X
(x
0
, R),
f(x) = f(x
0
) +f
0
(x
0
) [(x x
0
) +(x x
0
)] and
f(x
0
) = f(x
0
) +f
0
(x
0
) [(x
0
x
0
) +(x
0
x
0
)] .
Subtracting these two equations implies
f(x
0
) f(x) = f
0
(x
0
) [x
0
x +(x
0
x
0
) (x x
0
)]
or equivalently
x
0
x = f
0
(x
0
)
1
[f(x
0
) f(x)] +(x x
0
) (x
0
x
0
).
Taking norms of this equation and making use of Lemma 26.28 implies
kx
0
xk

f
0
(x
0
)
1

kf(x
0
) f(x)k +kx
0
xk
ANALYSIS TOOLS WITH APPLICATIONS 489
which implies
(26.33) kx
0
xk

f
0
(x
0
)
1

1
kf(x
0
) f(x)k for all x, x
0
B
X
(x
0
, R).
This shows that f|
B
X
(x
0
,R)
is injective and that f|
1
B
X
(x
0
,R)
: f

B
X
(x
0
, R)


B
X
(x
0
, R) is Lipschitz continuous because

f|
1
B
X
(x
0
,R)
(y
0
) f|
1
B
X
(x
0
,R)
(y)

f
0
(x
0
)
1

1
ky
0
yk for all y, y
0
f

B
X
(x
0
, R)

.
Since x
0
X was chosen arbitrarily, if we know f : U Y is injective, we then
know that f
1
: V = f(U) U is necessarily continuous. The remaining assertions
of the theorem now follow from the converse to the chain rule in Theorem 26.6 and
the fact that f is an open mapping (as we shall now show) so that in particular
f

B
X
(x
0
, R)

is open.
Let y B
Y
(0, ), with to be determined later, we wish to solve the equation,
for x B
X
(0, R),
f(x
0
) +y = f(x
0
+x) = f(x
0
) +f
0
(x
0
) (x +(x)) .
Equivalently we are trying to nd x B
X
(0, R) such that
x = f
0
(x
0
)
1
y (x) =: S
y
(x).
Now using Lemma 26.28 and the fact that (0) = 0,
kS
y
(x)k

f
0
(x
0
)
1
y

+k(x)k

f
0
(x
0
)
1

kyk +kxk

f
0
(x
0
)
1

+R.
Therefore if we assume is chosen so that

f
0
(x
0
)
1

+R < R, i.e. < (1 ) R/

f
0
(x
0
)
1

:= (x
0
),
then S
y
: B
X
(0, R) B
X
(0, R) B
X
(0, R).
Similarly by Lemma 26.28, for all x, z B
X
(0, R),
kS
y
(x) S
y
(z)k = k(z) (x)k kx zk
which shows S
y
is a contraction on B
X
(0, R). Hence by the contraction mapping
principle in Theorem 26.20, for every y B
Y
(0, ) there exists a unique solution
x B
X
(0, R) such that x = S
y
(x) or equivalently
f(x
0
+x) = f(x
0
) +y.
Letting y
0
= f(x
0
), this last statement implies there exists a unique function g :
B
Y
(y
0
, (x
0
)) B
X
(x
0
, R) such that f(g(y)) = y B
Y
(y
0
, (x
0
)). From Eq.
(26.33) it follows that
kg(y) x
0
k = kg(y) g(y
0
)k

f
0
(x
0
)
1

1
kf(g(y)) f(g(y
0
))k =

f
0
(x
0
)
1

1
ky y
0
k .
This shows
g(B
Y
(y
0
, )) B
X
(x
0
, (1 )
1

f
0
(x
0
)
1

)
and therefore
B
Y
(y
0
, ) = f

g(B
Y
(y
0
, ))

f

B
X
(x
0
, (1 )
1

f
0
(x
0
)
1

490 BRUCE K. DRIVER

for all < (x


0
).
This last assertion implies f(x
0
) f(W)
o
for any W
o
U with x
0
W. Since
x
0
U was arbitrary, this shows f is an open mapping.
26.8.1. Alternate construction of g. Suppose U
o
X and f : U Y is a C
2

function. Then we are looking for a function g(y) such that f(g(y)) = y. Fix an
x
0
U and y
0
= f(x
0
) Y. Suppose such a g exists and let x(t) = g(y
0
+ th) for
some h Y. Then dierentiating f(x(t)) = y
0
+th implies
d
dt
f(x(t)) = f
0
(x(t)) x(t) = h
or equivalently that
(26.34) x(t) = [f
0
(x(t))]
1
h = Z(h, x(t)) with x(0) = x
0
where Z(h, x) = [f
0
(x(t))]
1
h. Conversely if x solves Eq. (26.34) we have
d
dt
f(x(t)) = h and hence that
f(x(1)) = y
0
+h.
Thus if we dene
g(y
0
+h) := e
Z(h,)
(x
0
),
then f(g(y
0
+ h)) = y
0
+ h for all h suciently small. This shows f is an open
mapping.
26.9. Applications. A detailed discussion of the inverse function theorem on Ba-
nach and Frchet spaces may be found in Richard Hamiltons, The Inverse Func-
tion Theorem of Nash and Moser. The applications in this section are taken from
this paper.
Theorem 26.30 (Hamiltons Theorem on p. 110.). Let p : U := (a, b) V :=
(c, d) be a smooth function with p
0
> 0 on (a, b). For every g C

2
(R, (c, d)) there
exists a unique function y C

2
(R, (a, b)) such that
y(t) +p(y(t)) = g(t).
Proof. Let

V := C
0
2
(R, (c, d))
o
C
0
2
(R, R) and

U :=

y C
1
2
(R, R) : a < y(t) < b and c < y(t) +p(y(t)) < d for all t

o
C
1
2
(R, (a, b)).
The proof will be completed by showing P :

U

V dened by
P(y)(t) = y(t) +p(y(t)) for y

U and t R
is bijective.
Step 1. The dierential of P is given by P
0
(y)h =

h + p
0
(y)h, see Exercise
26.7. We will now show that the linear mapping P
0
(y) is invertible. Indeed let
f = p
0
(y) > 0, then the general solution to the Eq.

h +fh = k is given by
h(t) = e

R
t
0
f()d
h
0
+
Z
t
0
e

R
t

f(s)ds
k()d
where h
0
is a constant. We wish to choose h
0
so that h(2) = h
0
, i.e. so that
h
0

1 e
c(f)

=
Z
2
0
e

R
t

f(s)ds
k()d
ANALYSIS TOOLS WITH APPLICATIONS 491
where
c(f) =
Z
2
0
f()d =
Z
2
0
p
0
(y())d > 0.
The unique solution h C
1
2
(R, R) to P
0
(y)h = k is given by
h(t) =

1 e
c(f)

1
e

R
t
0
f()d
Z
2
0
e

R
t

f(s)ds
k()d +
Z
t
0
e

R
t

f(s)ds
k()d
=

1 e
c(f)

1
e

R
t
0
f(s)ds
Z
2
0
e

R
t

f(s)ds
k()d +
Z
t
0
e

R
t

f(s)ds
k()d.
Therefore P
0
(y) is invertible for all y. Hence by the implicit function theorem,
P :

U

V is an open mapping which is locally invertible.
Step 2. Let us now prove P :

U

V is injective. For this suppose y
1
, y
2


U
such that P(y
1
) = g = P(y
2
) and let z = y
2
y
1
. Since
z(t) +p(y
2
(t)) p(y
1
(t)) = g(t) g(t) = 0,
if t
m
R is point where z(t
m
) takes on its maximum, then z(t
m
) = 0 and hence
p(y
2
(t
m
)) p(y
1
(t
m
)) = 0.
Since p is increasing this implies y
2
(t
m
) = y
1
(t
m
) and hence z(t
m
) = 0. This shows
z(t) 0 for all t and a similar argument using a minimizer of z shows z(t) 0 for
all t. So we conclude y
1
= y
2
.
Step 3. Let W := P(

U), we wish to show W =



V . By step 1., we know W is
an open subset of

V and since

V is connected, to nish the proof it suces to show
W is relatively closed in

V . So suppose y
j


U such that g
j
:= P(y
j
) g

V .
We must now show g W, i.e. g = P(y) for some y W. If t
m
is a maximizer of
y
j
, then y
j
(t
m
) = 0 and hence g
j
(t
m
) = p(y
j
(t
m
)) < d and therefore y
j
(t
m
) < b
because p is increasing. A similar argument works for the minimizers then allows us
to conclude Ranpy
j
) Rang
j
) @@ (c, d) for all j. Since g
j
is converging uniformly
to g, there exists c < < < d such that Ran(p y
j
) Ran(g
j
) [, ] for all j.
Again since p
0
> 0,
Ran(y
j
) p
1
([, ]) = [, ] @@ (a, b) for all j.
In particular sup{| y
j
(t)| : t R and j} < since
(26.35) y
j
(t) = g
j
(t) p(y
j
(t)) [, ] [, ]
which is a compact subset of R. The Ascoli-Arzela Theorem 3.59 now allows us to
assume, by passing to a subsequence if necessary, that y
j
is converging uniformly
to y C
0
2
(R, [, ]). It now follows that
y
j
(t) = g
j
(t) p(y
j
(t)) g p(y)
uniformly in t. Hence we concluded that y C
1
2
(R, R) C
0
2
(R, [, ]), y
j
y and
P(y) = g. This has proved that g W and hence that W is relatively closed in

V .
492 BRUCE K. DRIVER

26.10. Exercises.
Exercise 26.2. Suppose that A : R L(X) is a continuous function and V : R
L(X) is the unique solution to the linear dierential equation
(26.36)

V (t) = A(t)V (t) with V (0) = I.
Assuming that V (t) is invertible for all t R, show that V
1
(t) [V (t)]
1
must
solve the dierential equation
(26.37)
d
dt
V
1
(t) = V
1
(t)A(t) with V
1
(0) = I.
See Exercise 5.14 as well.
Exercise 26.3 (Dierential Equations with Parameters). Let W be another Ba-
nach space, U V
o
X W and Z C
1
(U V, X). For each (x, w) U V, let
t J
x,w
(t, x, w) denote the maximal solution to the ODE
(26.38) y(t) = Z(y(t), w) with y(0) = x
and
D := {(t, x, w) R U V : t J
x,w
}
as in Exercise 5.18.
(1) Prove that is C
1
and that D
w
(t, x, w) solves the dierential equation:
d
dt
D
w
(t, x, w) = (D
x
Z)((t, x, w), w)D
w
(t, x, w) + (D
w
Z)((t, x, w), w)
with D
w
(0, x, w) = 0 L(W, X). Hint: See the hint for Exercise 5.18
with the reference to Theorem 5.21 being replace by Theorem 26.12.
(2) Also show with the aid of Duhamels principle (Exercise 5.16) and Theorem
26.12 that
D
w
(t, x, w) = D
x
(t, x, w)
Z
t
0
D
x
(, x, w)
1
(D
w
Z)((, x, w), w)d
Exercise 26.4. (Dierential of e
A
) Let f : L(X) L

(X) be the exponential


function f(A) = e
A
. Prove that f is dierentiable and that
(26.39) Df(A)B =
Z
1
0
e
(1t)A
Be
tA
dt.
Hint: Let B L(X) and dene w(t, s) = e
t(A+sB)
for all t, s R. Notice that
(26.40) dw(t, s)/dt = (A+sB)w(t, s) with w(0, s) = I L(X).
Use Exercise 26.3 to conclude that w is C
1
and that w
0
(t, 0) dw(t, s)/ds|
s=0
satises the dierential equation,
(26.41)
d
dt
w
0
(t, 0) = Aw
0
(t, 0) +Be
tA
with w(0, 0) = 0 L(X).
Solve this equation by Duhamels principle (Exercise 5.16) and then apply Proposi-
tion 26.10 to conclude that f is dierentiable with dierential given by Eq. (26.39).
Exercise 26.5 (Local ODE Existence). Let S
x
be dened as in Eq. (5.22) from the
proof of Theorem 5.10. Verify that S
x
satises the hypothesis of Corollary 26.21.
In particular we could have used Corollary 26.21 to prove Theorem 5.10.
ANALYSIS TOOLS WITH APPLICATIONS 493
Exercise 26.6 (Local ODE Existence Again). Let J = [1, 1], Z C
1
(X, X),
Y := C(J, X) and for y Y and s J let y
s
Y be dened by y
s
(t) := y(st). Use
the following outline to prove the ODE
(26.42) y(t) = Z(y(t)) with y(0) = x
has a unique solution for small t and this solution is C
1
in x.
(1) If y solves Eq. (26.42) then y
s
solves
y
s
(t) = sZ(y
s
(t)) with y
s
(0) = x
or equivalently
(26.43) y
s
(t) = x +s
Z
t
0
Z(y
s
())d.
Notice that when s = 0, the unique solution to this equation is y
0
(t) = x.
(2) Let F : J Y J Y be dened by
F(s, y) := (s, y(t) s
Z
t
0
Z(y())d).
Show the dierential of F is given by
F
0
(s, y)(a, v) =

a, t v(t) s
Z
t
0
Z
0
(y())v()d a
Z

0
Z(y())d

.
(3) Verify F
0
(0, y) : R Y R Y is invertible for all y Y and notice that
F(0, y) = (0, y).
(4) For x X, let C
x
Y be the constant path at x, i.e. C
x
(t) = x for all
t J. Use the inverse function Theorem 26.26 to conclude there exists > 0
and a C
1
map : (, ) B(x
0
, ) Y such that
F(s, (s, x)) = (s, C
x
) for all (s, x) (, ) B(x
0
, ).
(5) Show, for s that y
s
(t) := (s, x)(t) satises Eq. (26.43). Now dene
y(t, x) = (/2, x)(2t/) and show y(t, x) solve Eq. (26.42) for |t| < /2
and x B(x
0
, ).
Exercise 26.7. Show P dened in Theorem 26.30 is continuously dierentiable
and P
0
(y)h =

h +p
0
(y)h.
494 BRUCE K. DRIVER

27. Proof of the Change of Variable Theorem


This section is devoted to the proof of the change of variables theorem 8.31. For
convenience we restate the theorem here.
Theorem 27.1 (Change of Variables Theorem). Let
o
R
d
be an open set and
T : T()
o
R
d
be a C
1
dieomorphism. Then for any Borel measurable
f : T() [0, ] we have
(27.1)
Z

f T| det T
0
|dm =
Z
T()
f dm.
Proof. We will carry out the proof in a number of steps.
Step 1. Eq. (27.1) holds when = R
d
and T is linear and invertible. This was
proved in Theorem 8.33 above using Fubinis theorem, the scaling and translation
invariance properties of one dimensional Lebesgue measure and the fact that by
row reduction arguments T may be written as a product of elementary transfor-
mations.
Step 2. For all A B

,
(27.2) m(T(A))
Z
A
|det T
0
| dm.
This will be proved in Theorem 27.4below.
Step 3. Step 2. implies the general case. To see this, let B B
T()
and
A = T
1
(B) in Eq. (27.2) to learn that
Z

1
A
dm = m(A)
Z
T
1
(A)
|det T
0
| dm =
Z

1
A
T |det T
0
| dm.
Using linearity we may conclude from this equation that
(27.3)
Z
T()
fdm
Z

f T |det T
0
| dm
for all non-negative simple functions f on T(). Using Theorem 7.12 and the
monotone convergence theorem one easily extends this equation to hold for all
nonnegative measurable functions f on T().
Applying Eq. (27.3) with replaced by T(), T replaced by T
1
and f by
g : [0, ], we see that
(27.4)
Z

gdm =
Z
T
1
(T())
gdm
Z
T()
g T
1

det

T
1

dm
for all Borel measurable g. Taking g = (f T) |det T
0
| in this equation shows,
Z

f T |det T
0
| dm
Z
T()
f

det T
0
T
1

det

T
1

dm
=
Z
T()
fdm (27.5)
wherein the last equality we used the fact that T T
1
= id so that

T
0
T
1

T
1

0
= id and hence det T
0
T
1
det

T
1

0
= 1.
Combining Eqs. (27.3) and (27.5) proves Eq. (27.1). Thus the proof is complete
modulo Eq. (27.3) which we prove in Theorem 27.4 below.
ANALYSIS TOOLS WITH APPLICATIONS 495
Notation 27.2. For a, b R
d
we will write a b is a
i
b
i
for all i and a < b
if a
i
< b
i
for all i. Given a < b let [a, b] =
Q
d
i=1
[a
i
, b
i
] and (a, b] =
Q
d
i=1
(a
i
, b
i
].
(Notice that the closure of (a, b] is [a, b].) We will say that Q = (a, b] is a cube
provided that b
i
a
i
= 2 > 0 is a constant independent of i. When Q is a cube,
let
x
Q
:= a + (, , . . . , )
be the center of the cube.
Notice that with this notation, if Q is a cube of side length 2,
(27.6)

Q = {x R
d
: |x x
Q
| }
and the interior (Q
0
) of Q may be written as
Q
0
= {x R
d
: |x x
Q
| < }.
Notation 27.3. For a R
d
, let |a| = max
i
|a
i
| and if T is a d d matrix let
kTk = max
i
P
j
|T
ij
| .
A key point of this notation is that
|Ta| = max
i

X
j
T
ij
a
j

max
i
X
j
|T
ij
| |a
j
|
kTk |a| . (27.7)
Theorem 27.4. Let
o
R
d
be an open set and T : T()
o
R
d
be a C
1

dieomorphism. Then for any A B

,
(27.8) m(T(A))
Z
A
| det T
0
(x)|dx.
Proof. Step 1. We will rst assume that A = Q = (a, b] is a cube such that

Q = [a, b] . Let = (b
i
a
i
)/2 be half the side length of Q. By the fundamental
theorem of calculus (for Riemann integrals) for x Q,
T(x) = T(x
Q
) +
Z
1
0
T
0
(x
Q
+t(x x
Q
))(x x
Q
)dt
= T(x
Q
) +T
0
(x
Q
)S(x)
where
S(x) =
Z
1
0
T
0
(x
Q
)
1
T
0
(x
Q
+t(x x
Q
))dt

(x x
Q
).
Therefore T(Q) = T(x
Q
) +T
0
(x
Q
)S(Q) and hence
m(T(Q)) = m(T(x
Q
) +T
0
(x
Q
)S(Q)) = m(T
0
(x
Q
)S(Q))
= |det T
0
(x
Q
)| m(S(Q)) . (27.9)
Now for x

Q, i.e. |x x
Q
| ,
|S(x)|

Z
1
0
T
0
(x
Q
)
1
T
0
(x
Q
+t(x x
Q
))dt

|x x
Q
|
h(x
Q
, x)
496 BRUCE K. DRIVER

where
(27.10) h(x
Q
, x) :=
Z
1
0

T
0
(x
Q
)
1
T
0
(x
Q
+t(x x
Q
))

dt.
Hence
S(Q) max
xQ
h(x
Q
, x){x R
d
: |x| max
xQ
h
d
(x
Q
, x)}
and
(27.11) m(S(Q)) max
xQ
h(x
Q
, x)
d
(2)
d
= max
xQ
h
d
(x
Q
, x)m(Q).
Combining Eqs. (27.9) and (27.11) shows that
(27.12) m(T(Q)) |det T
0
(x
Q
)| m(Q) max
xQ
h
d
(x
Q
, x).
To rene this estimate, we will subdivide Q into smaller cubes, i.e. for n N let
Q
n
=

(a, a +
2
n
(, , . . . , )] +
2
n
: {0, 1, 2, . . . , n}
d

.
Notice that Q =
`
AQ
n
A. By Eq. (27.12),
m(T(A)) |det T
0
(x
A
)| m(A) max
xA
h
d
(x
A
, x)
and summing the equation on A gives
m(T(Q)) =
X
AQ
n
m(T(A))
X
AQ
n
|det T
0
(x
A
)| m(A) max
xA
h
d
(x
A
, x).
Since h
d
(x, x) = 1 for all x

Q and h
d
:

Q

Q [0, ) is continuous function on
a compact set, for any > 0 there exists n such that if x, y

Q and |x y| /n
then h
d
(x, y) 1 +. Using this in the previously displayed equation, we nd that
m(T(Q) (1 +)
X
AQ
n
|det T
0
(x
A
)| m(A)
= (1 +)
Z
Q
X
AQ
n
|det T
0
(x
A
)| 1
A
(x)dm(x). (27.13)
Since |det T
0
(x)| is continuous on the compact set

Q, it easily follows by uniform
continuity that
X
AQ
n
|det T
0
(x
A
)| 1
A
(x) |det T
0
(x)| as n
and the convergence in uniform on

Q. Therefore the dominated convergence theorem
enables us to pass to the limit, n , in Eq. (27.13) to nd
m(T(Q)) (1 +)
Z
Q
|det T
0
(x)| dm(x).
Since > 0 is arbitrary we are done we have shown that
m(T(Q))
Z
Q
|det T
0
(x)| dm(x).
Step 2. We will now show that Eq. (27.8) is valid when A = U is an open
subset of . For n N, let
Q
n
=

(0, (, , . . . , )] + 2
n
: Z
d

ANALYSIS TOOLS WITH APPLICATIONS 497


so that Q
n
is a partition of R
d
. Let F
1
:=

A Q
1
:

A U

and dene F
n

n
k=1
Q
k
inductively as follows. Assuming F
n1
has been dened, let
F
n
= F
n1

A Q
n
:

A U and A B = for all B F
n1

= F
n1

A Q
n
:

A U and A * B for any B F
n1

Now set F = F
n
(see Figure 47) and notice that U =
`
AF
A. Indeed by con-
Figure 47. Filling out an open set with half open disjoint cubes.
We have drawn F
2
.
struction, the sets in F are pairwise disjoint subset of U so that
`
AF
A U.
If x U, there exists an n and A Q
n
such that x A and

A U. Then by
construction of F, either A F or there is a set B F such that A B. In either
case x
`
AF
A which shows that U =
`
AF
A. Therefore by step 1.,
m(T(U)) = m(T(
AF
A)) = m((
AF
T(A)))
=
X
AF
m(T(A))
X
AF
Z
A
|det T
0
(x)| dm(x)
=
Z
U
|det T
0
(x)| dm(x)
which proves step 2.
Step 3. For general A B

let be the measure,


(A) :=
Z
A
|det T
0
(x)| dm(x).
Then m T and are ( nite measures as you should check) on B

such that
m T on open sets. By regularity of these measures, we may conclude that
m T . Indeed, if A B

,
m(T(A)) = inf
U
o

m(T(U)) inf
U
o

(U) = (A) =
Z
A
|det T
0
(x)| dm(x).
498 BRUCE K. DRIVER

27.1. Appendix: Other Approaches to proving Theorem 27.1 . Replace f


by f T
1
in Eq. (27.1) gives
Z

f| det T
0
|dm =
Z
T()
f T
1
dm =
Z

fd(m T)
so we are trying to prove d(m T) = | det T
0
|dm. Since both sides are measures it
suces to show that they agree on a multiplicative system which generates the
algebra. So for example it is enough to show m(T(Q)) =
R
Q
| det T
0
|dm when Q is
a small rectangle.
As above reduce the problem to the case where T(0) = 0 and T
0
(0) = id. Let
(x) = T(x) x and set T
t
(x) = x +t(x). (Notice that det T
0
> 0 in this case so
we will not need absolute values.) Then T
t
: Q T
t
(Q) is a C
1
morphism for Q
small and T
t
(Q) contains some xed smaller cube C for all t. Let f C
1
c
(C
o
), then
it suces to show
d
dt
Z
Q
f T
t
|det T
0
t
| dm = 0
for then
Z
Q
f T det T
0
dm =
Z
Q
f T
0
det T
0
0
dm =
Z
Q
fdm =
Z
T(Q)
fdm.
So we are left to compute
d
dt
Z
Q
f T
t
det T
0
t
dm =
Z
Q

f) (T
t
) det T
0
t
+f T
t
d
dt
det T
0
t

dm
=
Z
Q
{(

f) (T
t
) +f T
t
tr (T
0
t
)} det T
0
t
dm.
Now let W
t
:= (T
0
t
)
1
, then
W
t
(f T
t
) = W
t
(f T
t
) =

T
0
t
W
t
f

(T
t
) = (

f) (T
t
).
Therefore,
d
dt
Z
Q
f T
t
det T
0
t
dm =
Z
Q
{W
t
(f T
t
) +f T
t
tr (T
0
t
)} det T
0
t
dm.
Let us now do an integration by parts,
Z
Q
W
t
(f T
t
) det T
0
t
dm =
Z
Q
(f T
t
) {W
t
det T
0
t
+ W
t
det T
0
t
} dm
so that
d
dt
Z
Q
f T
t
det T
0
t
dm =
Z
Q
{tr (T
0
t
) det T
0
t
W
t
det T
0
t
W
t
det T
0
t
} f T
t
dm.
Finally,
W
t
det T
0
t
= det T
0
t
tr((T
0
t
)
1
W
t
T
0
t
) = det T
0
t
tr((T
0
t
)
1
T
00
t
(T
0
t
)
1
)
while
W
t
= trW
0
t
= tr
h
(T
0
t
)
1
T
00
t
(T
0
t
)
1

i
+tr
h
(T
0
t
)
1

0
i
.
so that
W
t
det T
0
t
+ W
t
det T
0
t
= det T
0
t
tr
h
(T
0
t
)
1

0
i
ANALYSIS TOOLS WITH APPLICATIONS 499
and therefore
d
dt
Z
Q
f T
t
det T
0
t
dm = 0
as desired.
The problem with this proof is that it requires T or equivalently to be twice
continuously dierentiable. I guess this can be overcome by smoothing a C
1

and then removing the smoothing after the result is proved.
Proof. Take care of lower bounds also.
(1) Show m(T(Q)) =
R
Q
(T
0
(x))dx =: (Q) for all Q
(2) Fix Q. Claim mT = on B
Q
= {A Q : A B}
Proof Equality holds on a k. Rectangles contained in Q. Therefore the algebra
of nite disjoint unison of such of rectangles here as ({rectangle contained in Q}.
But ({rectangle Q} = B
Q
.
(3) Since =

S
i=1
of such rectangles (even cubes) it follows that mJ(E) =
P
mT(E Q
i
) =
P
(E Q
i
) = (E) for all E B

.
Now for general open sets write =

S
j=1
Q
j
almost disjoint union. Then
m(T()) m(
[
j=1
T(Q
j
))
X
j
mTQ
j

X
j
Z
Q
j
|T
0
|dm =
Z

|T
0
|dm
so m(T())
R

|T
0
|d, for all . Let E such that E bounded. Choose

n
C such that
n
and m(E \
n
) 0. Then m(TE) m(T
n
)
R

n
|T
0
|dm
R
E
|T
0
|dm so m(T(E))
R
E
|T
0
|dm for all E bounded for general E
m(T(E)) = lim
n
m(T(E B
n
)) lim
n
Z
EB
n
|T
0
|dm =
Z
E
|T
0
|dm.
Therefore m(T(E))
R
E
|T
0
|dm for all E measurable.
27.2. Sards Theorem. See p. 538 of Taylor and references. Also see Milnors
topology book. Add in the Brower Fixed point theorem here as well. Also Spivaks
calculus on manifolds.
Theorem 27.5. Let U
o
R
m
, f C

(U, R
d
) and C := {x U : rank(f
0
(x)) < n}
be the set of critical points of f. Then the critical values, f(C), is a Borel measuralbe
subset of R
d
of Lebesgue measure 0.
Remark 27.6. This result clearly extends to manifolds.
For simplicity in the proof given below it will be convenient to use the norm,
|x| := max
i
|x
i
| . Recall that if f C
1
(U, R
d
) and p U, then
f(p +x) = f(p) +
Z
1
0
f
0
(p +tx)xdt = f(p) +f
0
(p)x +
Z
1
0
[f
0
(p +tx) f
0
(p)] xdt
so that if
R(p, x) := f(p +x) f(p) f
0
(p)x =
Z
1
0
[f
0
(p +tx) f
0
(p)] xdt
500 BRUCE K. DRIVER

we have
|R(p, x)| |x|
Z
1
0
|f
0
(p +tx) f
0
(p)| dt = |x| (p, x).
By uniform continuity, it follows for any compact subset K U that
sup{|(p, x)| : p K and |x| } 0 as 0.
Proof. Notice that if $x\in U\setminus C$, then $f'(x) : \mathbb{R}^m\to\mathbb{R}^n$ is surjective, which is an open condition, so that $U\setminus C$ is an open subset of $U$. This shows $C$ is relatively closed in $U$, i.e. there exists a closed set $\tilde{C}\subset\mathbb{R}^m$ such that $C = \tilde{C}\cap U$. Let $K_n\subset U$ be compact subsets of $U$ such that $K_n\uparrow U$; then $K_n\cap C\uparrow C$ and $K_n\cap C = K_n\cap\tilde{C}$ is compact for each $n$. Therefore $f(K_n\cap C)\uparrow f(C)$, i.e. $f(C) = \bigcup_n f(K_n\cap C)$ is a countable union of compact sets and therefore is Borel measurable. Moreover, since $m(f(C)) = \lim_{n\to\infty}m(f(K_n\cap C))$, it suffices to show $m(f(K)) = 0$ for all compact subsets $K\subset C$.
Case 1. ($n\ge m$.) Let $K = [a, a+\gamma]$ be a cube contained in $U$; by scaling the domain we may assume $\gamma = (1,1,\dots,1)$. For $N\in\mathbb{N}$ and $\mathbf{j}\in S_N := \{0,1,\dots,N-1\}^m$ let $K_{\mathbf{j}} := a + \mathbf{j}/N + [0,1/N]^m$, so that $K = \bigcup_{\mathbf{j}\in S_N}K_{\mathbf{j}}$ with $K_{\mathbf{j}}^o\cap K_{\mathbf{j}'}^o = \emptyset$ if $\mathbf{j}\ne\mathbf{j}'$. Let $\{Q_j\}$ be the collection of those $K_{\mathbf{j}}$, $\mathbf{j}\in S_N$, which intersect $C$. For each $j$ choose $p_j\in Q_j\cap C$, and for $x\in Q_j - p_j$ we have
$f(p_j + x) = f(p_j) + f'(p_j)x + R_j(x)$
where $|R_j(x)|\le\epsilon_j(N)/N$ and $\epsilon(N) := \max_j\epsilon_j(N)\to 0$ as $N\to\infty$. Now
$m(f(Q_j)) = m\big(f(p_j) + (f'(p_j)+R_j)(Q_j - p_j)\big) = m\big((f'(p_j)+R_j)(Q_j - p_j)\big) = m\big(O_j(f'(p_j)+R_j)(Q_j - p_j)\big)\qquad(27.14)$
where $O_j\in SO(n)$ is chosen so that $O_jf'(p_j)\mathbb{R}^m\subset\mathbb{R}^{n-1}\times\{0\}$ (possible since $\operatorname{rank}f'(p_j) < n$). Now $O_jf'(p_j)(Q_j - p_j)$ is contained in $\Gamma\times\{0\}$ where $\Gamma\subset\mathbb{R}^{n-1}$ is a cube centered at $0\in\mathbb{R}^{n-1}$ with side length at most $2|f'(p_j)|/N\le 2M/N$, where $M = \max_{p\in K}|f'(p)|$. It now follows that $O_j(f'(p_j)+R_j)(Q_j - p_j)$ is contained in the set of all points within $\epsilon(N)/N$ of $\Gamma\times\{0\}$, and in particular
$O_j(f'(p_j)+R_j)(Q_j - p_j)\subset(1+\epsilon(N)/N)\,\Gamma\times[-\epsilon(N)/N,\epsilon(N)/N].$
From this inclusion and Eq. (27.14) it follows that
$m(f(Q_j))\le\Big[2\tfrac{M}{N}(1+\epsilon(N)/N)\Big]^{n-1}\,\tfrac{2\epsilon(N)}{N} = 2^nM^{n-1}\big[1+\epsilon(N)/N\big]^{n-1}\epsilon(N)\,\frac{1}{N^n}$
and therefore, since there are at most $N^m$ cubes $Q_j$,
$m(f(C\cap K))\le\sum_j m(f(Q_j))\le N^m\,2^nM^{n-1}\big[1+\epsilon(N)/N\big]^{n-1}\epsilon(N)\,\frac{1}{N^n} = 2^nM^{n-1}\big[1+\epsilon(N)/N\big]^{n-1}\epsilon(N)\,N^{m-n}\to 0\text{ as }N\to\infty$
since $m\le n$. This proves the easy case since we may write $U$ as a countable union of cubes $K$ as above.
ANALYSIS TOOLS WITH APPLICATIONS 501
Remark. The case (m < n) also follows brom the case m = n as follows. When
m < n, C = U and we must show m(f(U)) = 0. Letting F : U R
nm
R
n
be
the map F(x, y) = f(x). Then F
0
(x, y)(v, w) = f
0
(x)v, and hence C
F
:= UR
nm
.
So if the assetion holds for m = n we have
m(f(U)) = m(F(U R
nm
)) = 0.
Case 2. ($m > n$.) This is the hard case and the case we will need in the co-area formula to be proved later. Here I will follow the proof in Milnor. Let
$C_i := \{x\in U : \partial^\alpha f(x) = 0\text{ for all }1\le|\alpha|\le i\}$
so that $C\supset C_1\supset C_2\supset C_3\supset\cdots$. The proof is by induction on the dimension $m$ of the domain and goes by the following steps:
(1) $m(f(C\setminus C_1)) = 0$.
(2) $m(f(C_i\setminus C_{i+1})) = 0$ for all $i\ge 1$.
(3) $m(f(C_i)) = 0$ for all $i$ sufficiently large.
Step 1. If $m = 1$, there is nothing to prove since then $C = C_1$, so we may assume $m\ge 2$. Suppose that $p\in C\setminus C_1$; then $f'(p)\ne 0$ and so, by reordering the components of $x$ and of $f$ if necessary, we may assume that $\partial f_1(p)/\partial x_1\ne 0$. The map $h(x) := (f_1(x), x_2,\dots,x_m)$ has differential
$h'(p) = \begin{pmatrix}\partial f_1(p)/\partial x_1 & \partial f_1(p)/\partial x_2 & \cdots & \partial f_1(p)/\partial x_m\\ 0 & 1 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & 1\end{pmatrix}$
which is not singular. So by the inverse function theorem there exists a neighborhood $V$ of $p$ such that $h : V\to h(V)$ is a diffeomorphism onto an open neighborhood of $h(p)$; in particular $\partial f_1(x)/\partial x_1\ne 0$ for $x\in V$ and hence $V\subset U\setminus C_1$. Consider the map $g := f\circ h^{-1} : V' := h(V)\to\mathbb{R}^n$, which satisfies
$(f_1(x), f_2(x),\dots,f_n(x)) = f(x) = g(h(x)) = g\big(f_1(x), x_2,\dots,x_m\big),$
which implies $g(t,y) = (t, u(t,y))$ for $(t,y)\in V' = h(V)$; see Figure 48 below.

Figure 48. Making a change of variables so as to apply induction.

Since
$g'(t,y) = \begin{pmatrix}1 & 0\\ \partial_t u(t,y) & \partial_y u(t,y)\end{pmatrix},$
it follows that $(t,y)$ is a critical point of $g$ iff $y\in C_t'$, the set of critical points of $y\to u(t,y)$. Since $h$ is a diffeomorphism, $C' := h(C\cap V)$ is the set of critical points of $g$ in $V'$ and
$f(C\cap V) = g(C') = \bigcup_t\big[\{t\}\times u_t(C_t')\big].$
By the induction hypothesis, $m_{n-1}(u_t(C_t')) = 0$ for all $t$, and therefore by Fubini's theorem,
$m(f(C\cap V)) = \int_{\mathbb{R}}m_{n-1}(u_t(C_t'))\,1_{V_t'\ne\emptyset}\,dt = 0.$
Since $C\setminus C_1$ may be covered by a countable collection of open sets $V$ as above, it follows that $m(f(C\setminus C_1)) = 0$.
Step 2. Suppose that $p\in C_k\setminus C_{k+1}$; then there is an $\alpha$ with $|\alpha| = k+1$ and an index $j$ such that $\partial^\alpha f_j(p)\ne 0$ while $\partial^\beta f(p) = 0$ for all $1\le|\beta|\le k$. By permuting coordinates we may assume that $\alpha_1\ne 0$ and that $j = 1$, i.e. $\partial^\alpha f_1(p)\ne 0$. Let $w(x) := \partial^{\alpha - e_1}f_1(x)$; then $w(p) = 0$ while $\partial_1w(p)\ne 0$. So again by the inverse function theorem there exists a neighborhood $V$ of $p$ such that $h(x) := (w(x), x_2,\dots,x_m)$ maps $V$ diffeomorphically onto $V' := h(V)$, an open neighborhood of $h(p)$; in particular $\partial_1w(x)\ne 0$ on $V$, so that $V\subset U\setminus C_{k+1}$. As before, let $g := f\circ h^{-1}$ and notice that $C_k' := h(C_k\cap V)\subset\{0\}\times\mathbb{R}^{m-1}$ and
$f(C_k\cap V) = g(C_k') = \bar{g}(C_k')$
where $\bar{g} := g|_{(\{0\}\times\mathbb{R}^{m-1})\cap V'}$. Clearly $C_k'$ is contained in the set of critical points of $\bar{g}$ (all first order derivatives of $f$, and hence of $g$, vanish on $C_k$), and therefore, by induction,
$0 = m(\bar{g}(C_k')) = m(f(C_k\cap V)).$
Since $C_k\setminus C_{k+1}$ is covered by a countable collection of such open sets, it follows that
$m(f(C_k\setminus C_{k+1})) = 0$ for all $k\ge 1$.
Step 3. Suppose that $Q$ is a closed cube with edge length $\delta$ contained in $U$ and $k > m/n - 1$. We will show $m(f(Q\cap C_k)) = 0$ and, since $Q$ is arbitrary, it will follow that $m(f(C_k)) = 0$ as desired.
By Taylor's theorem with (integral) remainder, it follows for $x\in Q\cap C_k$ and $h$ such that $x+h\in Q$ that
$f(x+h) = f(x) + R(x,h)\quad\text{where}\quad|R(x,h)|\le c\,\|h\|^{k+1}$
and $c = c(Q,k)$. Now subdivide $Q$ into $r^m$ cubes of edge size $\delta/r$ and let $Q'$ be one of the cubes in this subdivision such that $Q'\cap C_k\ne\emptyset$, and let $x\in Q'\cap C_k$. It then follows that $f(Q')$ is contained in a cube centered at $f(x)\in\mathbb{R}^n$ with side length at most $2c(\delta/r)^{k+1}$ and hence volume at most $(2c)^n(\delta/r)^{n(k+1)}$. Therefore, $f(Q\cap C_k)$ is contained in the union of at most $r^m$ cubes of volume $(2c)^n(\delta/r)^{n(k+1)}$, and hence
$m(f(Q\cap C_k))\le(2c)^n(\delta/r)^{n(k+1)}r^m = (2c)^n\delta^{n(k+1)}r^{m-n(k+1)}\to 0\text{ as }r\to\infty,$
provided that $m - n(k+1) < 0$, i.e. provided $k > m/n - 1$.
27.3. Co-Area Formula. See C:\driverdat\Bruce\DATA\MATHFILE\qft-notes\co-area.tex for this material.

27.4. Stokes Theorem. See Whitney's "Geometric Integration Theory," p. 100, for a fairly general form of Stokes' theorem allowing for rough boundaries.
28. Complex Differentiable Functions

28.1. Basic Facts About Complex Numbers.

Definition 28.1. $\mathbb{C} = \mathbb{R}^2$ and we write $1 = (1,0)$ and $i = (0,1)$. As usual $\mathbb{C}$ becomes a field with the multiplication rule determined by $1^2 = 1$ and $i^2 = -1$, i.e.
$(a+ib)(c+id)\equiv(ac - bd) + i(bc + ad).$

Notation 28.2. If $z = a + ib$ with $a,b\in\mathbb{R}$, let $\bar{z} = a - ib$ and
$|z|^2\equiv z\bar{z} = a^2 + b^2.$
Also notice that if $z\ne 0$, then $z$ is invertible with inverse given by
$z^{-1} = \frac{1}{z} = \frac{\bar{z}}{|z|^2}.$

Given $w = a + ib\in\mathbb{C}$, the map $z\in\mathbb{C}\to wz\in\mathbb{C}$ is complex and hence real linear, so we may view it as a linear transformation $M_w : \mathbb{R}^2\to\mathbb{R}^2$. To work out the matrix of this transformation, let $z = c + id$; then the map is $c + id\to wz = (ac - bd) + i(bc + ad)$, which written in terms of real and imaginary parts is equivalent to
$\begin{pmatrix}a & -b\\ b & a\end{pmatrix}\begin{pmatrix}c\\ d\end{pmatrix} = \begin{pmatrix}ac - bd\\ bc + ad\end{pmatrix}.$
Thus
$M_w = \begin{pmatrix}a & -b\\ b & a\end{pmatrix} = aI + bJ\quad\text{where}\quad J = \begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix}.$

Remark 28.3. Continuing the notation above, $M_w^{\mathrm{tr}} = M_{\bar{w}}$, $\det(M_w) = a^2 + b^2 = |w|^2$, and $M_wM_z = M_{wz}$ for all $w,z\in\mathbb{C}$. Moreover the reader may easily check that a real $2\times 2$ matrix $A$ is equal to $M_w$ for some $w\in\mathbb{C}$ iff $0 = [A,J] := AJ - JA$. Hence $\mathbb{C}$ and the set of real $2\times 2$ matrices $A$ such that $0 = [A,J]$ are algebraically isomorphic objects.
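For instance, taking $w = i$ (so $a = 0$, $b = 1$) gives $M_i = J$, and taking $w = \cos\theta + i\sin\theta$ gives $M_w = \cos\theta\,I + \sin\theta\,J$, the rotation of $\mathbb{R}^2$ by the angle $\theta$; in particular multiplication by $i$ acts on $\mathbb{R}^2$ as rotation by $\pi/2$.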
28.2. The complex derivative.

Definition 28.4. A function $F : \Omega\subset_o\mathbb{C}\to\mathbb{C}$ is complex differentiable at $z_0\in\Omega$ if
$(28.1)\qquad\lim_{z\to z_0}\frac{F(z) - F(z_0)}{z - z_0} = w$
exists.

Proposition 28.5. A function $F : \Omega\subset_o\mathbb{C}\to\mathbb{C}$ is complex differentiable at $z_0$ iff $F$ is differentiable (in the real sense as a function from $\Omega\subset_o\mathbb{R}^2$ to $\mathbb{R}^2$) and $[F'(z_0), J] = 0$, i.e. by Remark 28.3,
$F'(z_0) = M_w = \begin{pmatrix}a & -b\\ b & a\end{pmatrix}$
for some $w = a + ib\in\mathbb{C}$.

Proof. Eq. (28.1) is equivalent to the equation:
$F(z) = F(z_0) + w(z - z_0) + o(z - z_0) = F(z_0) + M_w(z - z_0) + o(z - z_0)\qquad(28.2)$
and hence $F$ is complex differentiable at $z_0$ iff $F$ is differentiable and the differential is of the form $F'(z_0) = M_w$ for some $w\in\mathbb{C}$.
Corollary 28.6 (Cauchy Riemann Equations). $F : \Omega\to\mathbb{C}$ is complex differentiable at $z_0\in\Omega$ iff $F'(z_0)$ exists (for example this is satisfied if $F$ is continuous at $z_0$ and $F_x$, $F_y$ exist in a neighborhood of $z_0$ and are continuous near $z_0$) and, writing $z_0 = x_0 + iy_0$,
$(28.3)\qquad i\,\frac{\partial F(x_0+iy_0)}{\partial x} = \frac{\partial F}{\partial y}(x_0+iy_0),$
or in short we write $\frac{\partial F}{\partial x} + i\frac{\partial F}{\partial y} = 0$.

Proof. The differential $F'(z_0)$ is, in general, an arbitrary matrix of the form
$F'(z_0) = \begin{pmatrix}a & c\\ b & d\end{pmatrix}$
where
$(28.4)\qquad\frac{\partial F}{\partial x}(z_0) = a + ib\quad\text{and}\quad\frac{\partial F}{\partial y}(z_0) = c + id.$
Now $F$ is complex differentiable at $z_0$ iff $d = a$ and $c = -b$, which is easily seen to be equivalent to Eq. (28.3) by Eq. (28.4) and comparing the real and imaginary parts of $iF_x(z_0)$ and $F_y(z_0)$.
Second Proof. If $F$ is complex differentiable at $z_0 = x_0 + iy_0$, then by the chain rule,
$\frac{\partial F}{\partial y}(x_0+iy_0) = iF'(x_0+iy_0) = i\,\frac{\partial F(x_0+iy_0)}{\partial x}.$
Conversely, if $F$ is real differentiable at $z_0$ there exists a real linear transformation $\Lambda : \mathbb{C}\cong\mathbb{R}^2\to\mathbb{C}$ such that
$(28.5)\qquad F(z) = F(z_0) + \Lambda(z - z_0) + o(z - z_0)$
and as usual this implies
$\frac{\partial F(z_0)}{\partial x} = \Lambda(1)\quad\text{and}\quad\frac{\partial F(z_0)}{\partial y} = \Lambda(i),$
where $1 = (1,0)$ and $i = (0,1)$ under the identification of $\mathbb{C}$ with $\mathbb{R}^2$. So if Eq. (28.3) holds, we have
$\Lambda(i) = i\Lambda(1),$
from which it follows that $\Lambda$ is complex linear. Hence if we set $\lambda := \Lambda(1)$, we have
$\Lambda(a+ib) = a\Lambda(1) + b\Lambda(i) = a\Lambda(1) + ib\Lambda(1) = \lambda(a+ib),$
which shows Eq. (28.5) may be written as
$F(z) = F(z_0) + \lambda(z - z_0) + o(z - z_0).$
This is equivalent to saying $F$ is complex differentiable at $z_0$ and $F'(z_0) = \lambda$.
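For example, if $F(z) = z^2$ then $F(x+iy) = (x^2-y^2) + i\,2xy$, so $F_x = 2x + i2y$ and $F_y = -2y + i2x = iF_x$; thus Eq. (28.3) holds at every point and $F'(z) = 2z$, in agreement with the difference-quotient definition (28.1).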
Notation 28.7. Let
$\partial = \frac{\partial}{\partial z} = \frac{1}{2}\Big(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\Big)\quad\text{and}\quad\bar{\partial} = \frac{\partial}{\partial\bar{z}} = \frac{1}{2}\Big(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\Big).$
With this notation we have
$\partial f\,dz + \bar{\partial}f\,d\bar{z} = \frac{1}{2}\Big(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\Big)f\,(dx+i\,dy) + \frac{1}{2}\Big(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\Big)f\,(dx-i\,dy) = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy = df.$
In particular if $\sigma(s)\in\Omega$ is a smooth curve, then
$\frac{d}{ds}f(\sigma(s)) = \partial f(\sigma(s))\,\sigma'(s) + \bar{\partial}f(\sigma(s))\,\overline{\sigma'(s)}.$
Corollary 28.8. Let $\Omega\subset_o\mathbb{C}$ be a given open set and $f : \Omega\to\mathbb{C}$ be a $C^1$ function in the real variable sense. Then the following are equivalent:
(1) The complex derivative $df(z)/dz$ exists for all $z\in\Omega$. (As we will see later in Theorem 28.38, the assumption that $f$ is $C^1$ here is redundant: complex differentiability of $f$ at all points $z\in\Omega$ already implies that $f\in C^\infty(\Omega,\mathbb{C})$.)
(2) The real differential $f'(z)$ satisfies $[f'(z), J] = 0$ for all $z\in\Omega$.
(3) The function $f$ satisfies the Cauchy Riemann equations $\bar{\partial}f = 0$ on $\Omega$.

Notation 28.9. A function $f\in C^1(\Omega,\mathbb{C})$ satisfying any and hence all of the conditions in Corollary 28.8 is said to be a holomorphic or an analytic function on $\Omega$. We will let $H(\Omega)$ denote the space of holomorphic functions on $\Omega$.
Corollary 28.10. The chain rule holds for complex differentiable functions. In particular, suppose $\Omega\subset_o\mathbb{C}\xrightarrow{\,f\,}D\subset_o\mathbb{C}\xrightarrow{\,g\,}\mathbb{C}$ are functions, $z_0\in\Omega$ and $w_0 = f(z_0)\in D$. Assume that $f'(z_0)$ exists and $g'(w_0)$ exists; then $(g\circ f)'(z_0)$ exists and is given by
$(28.6)\qquad(g\circ f)'(z_0) = g'(f(z_0))\,f'(z_0).$

Proof. This is a consequence of the chain rule for $F : \mathbb{R}^2\to\mathbb{R}^2$ when restricted to those functions whose differentials commute with $J$. Alternatively, one can simply follow the usual proof in the complex category as follows:
$g\circ f(z) = g(f(z)) = g(w_0) + g'(w_0)(f(z) - f(z_0)) + o(f(z) - f(z_0))$
and hence
$(28.7)\qquad\frac{g\circ f(z) - g(f(z_0))}{z - z_0} = g'(w_0)\,\frac{f(z) - f(z_0)}{z - z_0} + \frac{o(f(z) - f(z_0))}{z - z_0}.$
Since $\frac{o(f(z) - f(z_0))}{z - z_0}\to 0$ as $z\to z_0$, we may pass to the limit $z\to z_0$ in Eq. (28.7) to prove Eq. (28.6).
Lemma 28.11 (Converse to the Chain rule). Suppose $f : \Omega\subset_o\mathbb{C}\to U\subset_o\mathbb{C}$ and $g : U\subset_o\mathbb{C}\to\mathbb{C}$ are functions such that $f$ is continuous, $g\in H(U)$ and $h := g\circ f\in H(\Omega)$. Then $f\in H(\Omega\setminus\{z : g'(f(z)) = 0\})$. Moreover $f'(z) = h'(z)/g'(f(z))$ when $z\in\Omega$ and $g'(f(z))\ne 0$.

Proof. This follows from the previous converse to the chain rule, or directly as follows (one could also appeal to the inverse function theorem here as well). Suppose that $z_0\in\Omega$ and $g'(f(z_0))\ne 0$. On one hand
$h(z) = h(z_0) + h'(z_0)(z - z_0) + o(z - z_0)$
while on the other
$h(z) = g(f(z)) = g(f(z_0)) + g'(f(z_0))(f(z) - f(z_0)) + o(f(z) - f(z_0)).$
Combining these equations shows
$(28.8)\qquad h'(z_0)(z - z_0) = g'(f(z_0))(f(z) - f(z_0)) + o(f(z) - f(z_0)) + o(z - z_0).$
Since $g'(f(z_0))\ne 0$ we may conclude that
$f(z) - f(z_0) = o(f(z) - f(z_0)) + O(z - z_0);$
in particular it follows that
$|f(z) - f(z_0)|\le\frac{1}{2}|f(z) - f(z_0)| + O(z - z_0)\quad\text{for }z\text{ near }z_0$
and hence that $f(z) - f(z_0) = O(z - z_0)$. Using this back in Eq. (28.8) then shows that
$h'(z_0)(z - z_0) = g'(f(z_0))(f(z) - f(z_0)) + o(z - z_0),$
or equivalently that
$f(z) - f(z_0) = \frac{h'(z_0)}{g'(f(z_0))}(z - z_0) + o(z - z_0).$
Example 28.12. Here are some examples.
(1) $f(z) = z$ is analytic and, more generally, polynomials $f(z) = \sum_{n=0}^k a_nz^n$ with $a_n\in\mathbb{C}$ are analytic on $\mathbb{C}$.
(2) If $f,g\in H(\Omega)$ then $f\cdot g$, $f+g$, $cf\in H(\Omega)$ and $f/g\in H(\Omega\setminus\{g = 0\})$.
(3) $f(z) = \bar{z}$ is not analytic, and a real valued $f\in C^1(\mathbb{C},\mathbb{R})$ is analytic iff $f$ is constant.
The next theorem shows that analytic functions may be averaged to produce new analytic functions.

Theorem 28.13. Let $g : \Omega\times X\to\mathbb{C}$ be a function such that
(1) $g(\cdot, x)\in H(\Omega)$ for all $x\in X$, and write $g'(z,x)$ for $\frac{d}{dz}g(z,x)$.
(2) There exists $G\in L^1(X,\mu)$ such that $|g'(z,x)|\le G(x)$ on $\Omega\times X$.
(3) $g(z,\cdot)\in L^1(X,\mu)$ for $z\in\Omega$.
Then
$f(z) := \int_X g(z,x)\,d\mu(x)$
is holomorphic on $\Omega$ and the complex derivative is given by
$f'(z) = \int_X g'(z,x)\,d\mu(x).$

Exercise 28.1. Prove Theorem 28.13 using the dominated convergence theorem along with the mean value inequality of Corollary 4.10. Alternatively one may use the corresponding real variable differentiation theorem to show $\partial_xf$ and $\partial_yf$ exist and are continuous and then show $\bar{\partial}f = 0$.
As an application we will show that power series give examples of complex differentiable functions.

Corollary 28.14. Suppose that $\{a_n\}_{n=0}^\infty\subset\mathbb{C}$ is a sequence of complex numbers such that the series
$f(z) := \sum_{n=0}^\infty a_n(z - z_0)^n$
is convergent for $|z - z_0| < R$, where $R$ is some positive number. Then $f : D(z_0,R)\to\mathbb{C}$ is complex differentiable on $D(z_0,R)$ and
$(28.9)\qquad f'(z) = \sum_{n=0}^\infty na_n(z - z_0)^{n-1} = \sum_{n=1}^\infty na_n(z - z_0)^{n-1}.$
By induction it follows that $f^{(k)}$ exists for all $k$ and that
$f^{(k)}(z) = \sum_{n=0}^\infty n(n-1)\cdots(n-k+1)\,a_n(z - z_0)^{n-k}.$

Proof. Let $\rho < R$ be given and choose $r\in(\rho,R)$. Since $z = z_0 + r\in D(z_0,R)$, by assumption the series $\sum_{n=0}^\infty a_nr^n$ is convergent and in particular $M := \sup_n|a_nr^n| < \infty$. We now apply Theorem 28.13 with $X = \mathbb{N}\cup\{0\}$, $\mu$ being counting measure, $\Omega = D(z_0,\rho)$ and $g(z,n) := a_n(z - z_0)^n$. Since
$|g'(z,n)| = |na_n(z - z_0)^{n-1}|\le n|a_n|\rho^{n-1} = \frac{1}{r}\,n\Big(\frac{\rho}{r}\Big)^{n-1}|a_n|r^n\le\frac{M}{r}\,n\Big(\frac{\rho}{r}\Big)^{n-1}$
and the function $G(n) := \frac{M}{r}n(\rho/r)^{n-1}$ is summable (by the ratio test for example), we may use $G$ as our dominating function. It then follows from Theorem 28.13 that
$f(z) = \int_X g(z,n)\,d\mu(n) = \sum_{n=0}^\infty a_n(z - z_0)^n$
is complex differentiable with the differential given as in Eq. (28.9).
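As a concrete instance of Eq. (28.9), the geometric series $\sum_{n=0}^\infty z^n = \frac{1}{1-z}$ converges for $|z| < 1$, and Corollary 28.14 (with $z_0 = 0$) gives
$\frac{1}{(1-z)^2} = \frac{d}{dz}\frac{1}{1-z} = \sum_{n=1}^\infty nz^{n-1}\quad\text{for }|z| < 1.$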
Example 28.15. Let $w\in\mathbb{C}$, $\Omega := \mathbb{C}\setminus\{w\}$ and $f(z) = \frac{1}{w-z}$. Then $f\in H(\Omega)$. Let $z_0\in\Omega$ and write $z = z_0 + h$; then
$f(z) = \frac{1}{w-z} = \frac{1}{w-z_0-h} = \frac{1}{w-z_0}\,\frac{1}{1 - h/(w-z_0)} = \frac{1}{w-z_0}\sum_{n=0}^\infty\Big(\frac{h}{w-z_0}\Big)^n = \sum_{n=0}^\infty\Big(\frac{1}{w-z_0}\Big)^{n+1}(z - z_0)^n,$
which is valid for $|z - z_0| < |w - z_0|$. Summarizing this computation we have shown
$(28.10)\qquad\frac{1}{w-z} = \sum_{n=0}^\infty\Big(\frac{1}{w-z_0}\Big)^{n+1}(z - z_0)^n\quad\text{for }|z - z_0| < |w - z_0|.$
Proposition 28.16. The exponential function $e^z = \sum_{n=0}^\infty\frac{z^n}{n!}$ is holomorphic on $\mathbb{C}$ and $\frac{d}{dz}e^z = e^z$. Moreover,
(1) $e^{z+w} = e^ze^w$ for all $z,w\in\mathbb{C}$.
(2) (Euler's Formula) $e^{i\theta} = \cos\theta + i\sin\theta$ for all $\theta\in\mathbb{R}$, and $|e^{i\theta}| = 1$ for all $\theta\in\mathbb{R}$.
(3) $e^{x+iy} = e^x(\cos y + i\sin y)$ for all $x,y\in\mathbb{R}$.
(4) $\overline{e^z} = e^{\bar{z}}$.

Proof. By the chain rule for functions of a real variable,
$\frac{d}{dt}\big[e^{-tw}e^{z+tw}\big] = -we^{-tw}e^{z+tw} + e^{-tw}we^{z+tw} = 0,$
and hence $e^{-tw}e^{z+tw}$ is constant in $t$. So by evaluating this expression at $t = 0$ and $t = 1$ we find
$(28.11)\qquad e^{-w}e^{z+w} = e^z\quad\text{for all }w,z\in\mathbb{C}.$
Choosing $z = 0$ in Eq. (28.11) implies $e^{-w}e^w = 1$, i.e. $e^{-w} = 1/e^w$, which used back in Eq. (28.11) proves item 1. Similarly,
$\frac{d}{d\theta}\big[e^{-i\theta}(\cos\theta + i\sin\theta)\big] = -ie^{-i\theta}(\cos\theta + i\sin\theta) + e^{-i\theta}(-\sin\theta + i\cos\theta) = 0.$
Hence $e^{-i\theta}(\cos\theta + i\sin\theta) = e^{-i\theta}(\cos\theta + i\sin\theta)\big|_{\theta=0} = 1$, which proves item 2. Item 3 is a consequence of items 1 and 2, and item 4 follows from item 3 or directly from the power series expansion.
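For example, item 2 gives $e^{i\pi/2} = i$, $e^{i\pi} = -1$ and $e^{2\pi i} = 1$, and combining items 1 and 2 shows that for $z = x + iy$ the number $e^z = e^xe^{iy}$ has modulus $e^x$ and argument $y$.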
Remark 28.17. One could define $e^z$ by $e^z = e^x(\cos(y) + i\sin(y))$ when $z = x + iy$ and then use the Cauchy Riemann equations to prove $e^z$ is complex differentiable.

Exercise 28.2. By comparing the real and imaginary parts of the equality $e^{i\alpha}e^{i\beta} = e^{i(\alpha+\beta)}$ prove the formulas:
$\cos(\alpha+\beta) = \cos\alpha\cos\beta - \sin\alpha\sin\beta$ and
$\sin(\alpha+\beta) = \cos\alpha\sin\beta + \cos\beta\sin\alpha$
for all $\alpha,\beta\in\mathbb{R}$.

Exercise 28.3. Find all possible solutions to the equation $e^z = w$ where $z$ and $w$ are complex numbers. Let $\log(w)\equiv\{z : e^z = w\}$. Note that $\log : \mathbb{C}\to$ (subsets of $\mathbb{C}$). One often writes $\log : \mathbb{C}\to\mathbb{C}$ and calls $\log$ a multi-valued function. A continuous function $l$ defined on some open subset $\Omega$ of $\mathbb{C}$ is called a branch of $\log$ if $l(w)\in\log(w)$ for all $w\in\Omega$. Use the reverse chain rule to show any branch of $\log$ is holomorphic on its domain of definition and that $l'(z) = 1/z$ for all $z\in\Omega$.

Exercise 28.4. Let $\Omega = \{w = re^{i\theta}\in\mathbb{C} : r > 0\text{ and }-\pi < \theta < \pi\} = \mathbb{C}\setminus(-\infty,0]$, and define $\mathrm{Ln} : \Omega\to\mathbb{C}$ by $\mathrm{Ln}(re^{i\theta})\equiv\ln(r) + i\theta$ for $r > 0$ and $|\theta| < \pi$. Show that $\mathrm{Ln}$ is a branch of $\log$. This branch of the log function is often called the principal value branch of $\log$. The line $(-\infty,0]$ where $\mathrm{Ln}$ is not defined is called a branch cut.

Exercise 28.5. Let $\sqrt[n]{w}\equiv\{z\in\mathbb{C} : z^n = w\}$. The function $w\to\sqrt[n]{w}$ is another example of a multi-valued function. Let $h(w)$ be any branch of $\sqrt[n]{w}$, that is, $h$ is a continuous function on an open subset $\Omega$ of $\mathbb{C}$ such that $h(w)\in\sqrt[n]{w}$. Show that $h$ is holomorphic away from $w = 0$ and that $h'(w) = \frac{1}{n}h(w)/w$.

Exercise 28.6. Let $l$ be any branch of the log function. Define $w^z\equiv e^{zl(w)}$ for all $z\in\mathbb{C}$ and $w\in D(l)$ where $D(l)$ denotes the domain of $l$. Show that $w^{1/n}$ is a branch of $\sqrt[n]{w}$ and also show that $\frac{d}{dw}w^z = zw^{z-1}$.
28.3. Contour integrals.

Definition 28.18. Suppose that $\sigma : [a,b]\to\Omega$ is a piecewise $C^1$ function and $f : \Omega\to\mathbb{C}$ is continuous. We define the contour integral of $f$ along $\sigma$ (written $\int_\sigma f(z)\,dz$) by
$\int_\sigma f(z)\,dz := \int_a^b f(\sigma(t))\,\dot{\sigma}(t)\,dt.$
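For example, if $\sigma(t) = e^{it}$, $0\le t\le 2\pi$, parametrizes the unit circle and $f(z) = z^n$ with $n\in\mathbb{Z}$, then
$\int_\sigma z^n\,dz = \int_0^{2\pi}e^{int}\,ie^{it}\,dt = i\int_0^{2\pi}e^{i(n+1)t}\,dt = \begin{cases}2\pi i & \text{if }n = -1,\\ 0 & \text{otherwise,}\end{cases}$
a computation which will reappear in the Cauchy integral formula below.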
Notation 28.19. Given $\Omega\subset_o\mathbb{C}$ and a $C^2$ map $\sigma : [a,b]\times[0,1]\to\Omega$, let $\sigma_s := \sigma(\cdot,s)\in C^2([a,b],\Omega)$. In this way, the map $\sigma$ may be viewed as a map
$s\in[0,1]\to\sigma_s := \sigma(\cdot,s)\in C^2([a,b],\Omega),$
i.e. $s\to\sigma_s$ is a path of contours in $\Omega$.

Definition 28.20. Given a region $\Omega$ and $\alpha,\beta\in C^2([a,b],\Omega)$, we will write $\alpha\simeq\beta$ in $\Omega$ provided there exists a $C^2$ map $\sigma : [a,b]\times[0,1]\to\Omega$ such that $\sigma_0 = \alpha$, $\sigma_1 = \beta$, and $\sigma$ satisfies either of the following two conditions:
(1) $\frac{d}{ds}\sigma(a,s) = \frac{d}{ds}\sigma(b,s) = 0$ for all $s\in[0,1]$, i.e. the end points of the paths $\sigma_s$ for $s\in[0,1]$ are fixed.
(2) $\sigma(a,s) = \sigma(b,s)$ for all $s\in[0,1]$, i.e. $\sigma_s$ is a loop in $\Omega$ for all $s\in[0,1]$.
Proposition 28.21. Let $\Omega$ be a region and $\alpha,\beta\in C^2([a,b],\Omega)$ be two contours such that $\alpha\simeq\beta$ in $\Omega$. Then
$\int_\alpha f(z)\,dz = \int_\beta f(z)\,dz\quad\text{for all }f\in H(\Omega).$

Proof. Let $\sigma : [a,b]\times[0,1]\to\Omega$ be as in Definition 28.20; then it suffices to show the function
$F(s) := \int_{\sigma_s}f(z)\,dz$
is constant for $s\in[0,1]$. Writing $\dot{\sigma} = \partial\sigma/\partial t$ and $\sigma' = \partial\sigma/\partial s$, we compute:
$F'(s) = \frac{d}{ds}\int_a^b f(\sigma(t,s))\,\dot{\sigma}(t,s)\,dt = \int_a^b\frac{d}{ds}\big[f(\sigma(t,s))\,\dot{\sigma}(t,s)\big]\,dt$
$= \int_a^b\big\{f'(\sigma(t,s))\,\sigma'(t,s)\,\dot{\sigma}(t,s) + f(\sigma(t,s))\,\dot{\sigma}'(t,s)\big\}\,dt$
$= \int_a^b\frac{d}{dt}\big[f(\sigma(t,s))\,\sigma'(t,s)\big]\,dt = \big[f(\sigma(t,s))\,\sigma'(t,s)\big]_{t=a}^{t=b} = 0,$
where the last equality is a consequence of either of the two endpoint assumptions of Definition 28.20.
Remark 28.22. For those who know about differential forms and such, we may generalize the above computation to $f\in C^1(\Omega)$ using $df = \partial f\,dz + \bar{\partial}f\,d\bar{z}$. We then find
$F'(s) = \frac{d}{ds}\int_a^b f(\sigma(t,s))\,\dot{\sigma}(t,s)\,dt = \int_a^b\frac{d}{ds}\big[f(\sigma(t,s))\,\dot{\sigma}(t,s)\big]\,dt$
$= \int_a^b\Big\{\big[\partial f(\sigma)\,\sigma' + \bar{\partial}f(\sigma)\,\bar{\sigma}'\big]\dot{\sigma} + f(\sigma)\,\dot{\sigma}'\Big\}\,dt$
$= \int_a^b\Big\{\partial f(\sigma)\,\dot{\sigma}\,\sigma' + \bar{\partial}f(\sigma)\,\dot{\bar{\sigma}}\,\sigma' + f(\sigma)\,\dot{\sigma}'\Big\}\,dt + \int_a^b\bar{\partial}f(\sigma)\,\big(\bar{\sigma}'\dot{\sigma} - \dot{\bar{\sigma}}\sigma'\big)\,dt$
$= \int_a^b\frac{d}{dt}\big[f(\sigma)\,\sigma'\big]\,dt + \int_a^b\bar{\partial}f(\sigma)\,\big(\bar{\sigma}'\dot{\sigma} - \dot{\bar{\sigma}}\sigma'\big)\,dt$
$= \big[f(\sigma(t,s))\,\sigma'(t,s)\big]_{t=a}^{t=b} + \int_a^b\bar{\partial}f(\sigma)\,\big(\bar{\sigma}'\dot{\sigma} - \dot{\bar{\sigma}}\sigma'\big)\,dt = \int_a^b\bar{\partial}f(\sigma)\,\big(\bar{\sigma}'\dot{\sigma} - \dot{\bar{\sigma}}\sigma'\big)\,dt.$
Integrating this expression in $s$ then shows that
$\int_{\sigma_1}f\,dz - \int_{\sigma_0}f\,dz = \int_0^1ds\int_a^bdt\,\bar{\partial}f(\sigma(t,s))\,\big(\bar{\sigma}'(t,s)\dot{\sigma}(t,s) - \dot{\bar{\sigma}}(t,s)\sigma'(t,s)\big) = \int_\sigma d(f\,dz) = \int_\sigma\bar{\partial}f\,d\bar{z}\wedge dz.$
We have just given a proof of Green's theorem in this context.
The main point of this section is to prove the following theorem.

Theorem 28.23. Let $\Omega\subset_o\mathbb{C}$ be an open set and $f\in C^1(\Omega,\mathbb{C})$. Then the following statements are equivalent:
(1) $f\in H(\Omega)$,
(2) For all disks $D = D(z_0,\rho)$ such that $\bar{D}\subset\Omega$,
$(28.12)\qquad f(z) = \frac{1}{2\pi i}\oint_{\partial D}\frac{f(w)}{w - z}\,dw\quad\text{for all }z\in D.$
(3) For all disks $D = D(z_0,\rho)$ such that $\bar{D}\subset\Omega$, $f(z)$ may be represented as a convergent power series
$(28.13)\qquad f(z) = \sum_{n=0}^\infty a_n(z - z_0)^n\quad\text{for all }z\in D.$
In particular $f\in C^\infty(\Omega,\mathbb{C})$.
Moreover if $D$ is as above, we have
$(28.14)\qquad f^{(n)}(z) = \frac{n!}{2\pi i}\oint_{\partial D}\frac{f(w)}{(w - z)^{n+1}}\,dw\quad\text{for all }z\in D$
and the coefficients $a_n$ in Eq. (28.13) are given by
$a_n = f^{(n)}(z_0)/n! = \frac{1}{2\pi i}\oint_{\partial D}\frac{f(w)}{(w - z_0)^{n+1}}\,dw.$
Proof. 1) $\Longrightarrow$ 2) For $s\in[0,1]$, let $z_s := (1-s)z_0 + sz$, $\rho_s := \operatorname{dist}(z_s,\partial D) = \rho - s|z - z_0|$ and $\sigma_s(t) := z_s + \rho_se^{it}$ for $0\le t\le 2\pi$. Notice that $\sigma_0$ is a parametrization of $\partial D$, $\sigma_0\simeq\sigma_1$ in $\Omega\setminus\{z\}$, $w\to\frac{f(w)}{w-z}$ is in $H(\Omega\setminus\{z\})$, and hence by Proposition 28.21,
$\oint_{\partial D}\frac{f(w)}{w-z}\,dw = \int_{\sigma_0}\frac{f(w)}{w-z}\,dw = \int_{\sigma_1}\frac{f(w)}{w-z}\,dw.$
Now let $\tau_s(t) := z + s\rho_1e^{it}$ for $0\le t\le 2\pi$ and $s\in(0,1]$. Then $\tau_1 = \sigma_1$ and $\tau_1\simeq\tau_s$ in $\Omega\setminus\{z\}$, and so again by Proposition 28.21,
$\oint_{\partial D}\frac{f(w)}{w-z}\,dw = \int_{\tau_1}\frac{f(w)}{w-z}\,dw = \int_{\tau_s}\frac{f(w)}{w-z}\,dw = \int_0^{2\pi}\frac{f(z + s\rho_1e^{it})}{s\rho_1e^{it}}\,is\rho_1e^{it}\,dt = i\int_0^{2\pi}f(z + s\rho_1e^{it})\,dt\to 2\pi if(z)\text{ as }s\downarrow 0.$
2) $\Longrightarrow$ 3) By 2) and Eq. (28.10),
$f(z) = \frac{1}{2\pi i}\oint_{\partial D}\frac{f(w)}{w-z}\,dw = \frac{1}{2\pi i}\oint_{\partial D}f(w)\sum_{n=0}^\infty\Big(\frac{1}{w-z_0}\Big)^{n+1}(z - z_0)^n\,dw = \frac{1}{2\pi i}\sum_{n=0}^\infty\Big(\oint_{\partial D}f(w)\Big(\frac{1}{w-z_0}\Big)^{n+1}dw\Big)(z - z_0)^n.$
(The reader should justify the interchange of the sum and the integral.) The last equation proves Eq. (28.13) and shows that
$a_n = \frac{1}{2\pi i}\oint_{\partial D}\frac{f(w)}{(w-z_0)^{n+1}}\,dw.$
Also, using Theorem 28.13 we may differentiate Eq. (28.12) repeatedly to find
$(28.15)\qquad f^{(n)}(z) = \frac{n!}{2\pi i}\oint_{\partial D}\frac{f(w)}{(w-z)^{n+1}}\,dw\quad\text{for all }z\in D,$
which evaluated at $z = z_0$ shows that $a_n = f^{(n)}(z_0)/n!$.
3) $\Longrightarrow$ 1) This follows from Corollary 28.14 and the fact that being complex differentiable is a local property.
The proof of the theorem also reveals the following corollary.

Corollary 28.24. If $f\in H(\Omega)$ then $f'\in H(\Omega)$ and by induction $f^{(n)}\in H(\Omega)$ with $f^{(n)}$ defined as in Eq. (28.15).
Corollary 28.25 (Cauchy Estimates). Suppose that $f\in H(\Omega)$ where $\Omega\subset_o\mathbb{C}$, and suppose that $\bar{D}(z_0,\rho)\subset\Omega$. Then
$(28.16)\qquad\big|f^{(n)}(z_0)\big|\le\frac{n!}{\rho^n}\sup_{|\xi - z_0| = \rho}|f(\xi)|.$

Proof. From Eq. (28.15) evaluated at $z = z_0$ and letting $\sigma(t) = z_0 + \rho e^{it}$ for $0\le t\le 2\pi$, we find
$f^{(n)}(z_0) = \frac{n!}{2\pi i}\oint_{\partial D}\frac{f(w)}{(w-z_0)^{n+1}}\,dw = \frac{n!}{2\pi i}\int_\sigma\frac{f(w)}{(w-z_0)^{n+1}}\,dw = \frac{n!}{2\pi i}\int_0^{2\pi}\frac{f(z_0 + \rho e^{it})}{(\rho e^{it})^{n+1}}\,i\rho e^{it}\,dt = \frac{n!}{2\pi\rho^n}\int_0^{2\pi}f(z_0 + \rho e^{it})\,e^{-int}\,dt.\qquad(28.17)$
Therefore,
$\big|f^{(n)}(z_0)\big|\le\frac{n!}{2\pi\rho^n}\int_0^{2\pi}\big|f(z_0 + \rho e^{it})e^{-int}\big|\,dt = \frac{n!}{2\pi\rho^n}\int_0^{2\pi}\big|f(z_0 + \rho e^{it})\big|\,dt\le\frac{n!}{\rho^n}\sup_{|\xi - z_0| = \rho}|f(\xi)|.$
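In particular, if $f(z) = \sum_{n=0}^\infty a_n(z - z_0)^n$ on $D(z_0,R)$ and $0 < \rho < R$, then since $a_n = f^{(n)}(z_0)/n!$, Eq. (28.16) gives the coefficient bound $|a_n|\le\rho^{-n}\sup_{|\xi - z_0| = \rho}|f(\xi)|$.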
Exercise 28.7. Show that Theorem 28.13 is still valid with conditions 2) and 3) in the hypothesis being replaced by: there exists $G\in L^1(X,\mu)$ such that $|g(z,x)|\le G(x)$.
Hint: Use the Cauchy estimates.
Corollary 28.26 (Liouville's Theorem). If $f\in H(\mathbb{C})$ and $f$ is bounded, then $f$ is constant.

Proof. This follows from Eq. (28.16) with $n = 1$ by letting $\rho\to\infty$ to find $f'(z_0) = 0$ for all $z_0\in\mathbb{C}$.

Corollary 28.27 (Fundamental theorem of algebra). Every polynomial $p(z)$ of degree larger than $0$ has a root in $\mathbb{C}$.

Proof. Suppose that $p(z)$ is a polynomial with no roots in $\mathbb{C}$. Then $f(z) = 1/p(z)$ is a bounded holomorphic function (bounded since $|p(z)|\to\infty$ as $|z|\to\infty$) and hence constant. This shows that $p(z)$ is a constant, i.e. $p$ has degree zero.
Definition 28.28. We say that $\Omega$ is a region if $\Omega$ is a connected open subset of $\mathbb{C}$.

Corollary 28.29. Let $\Omega$ be a region, $f\in H(\Omega)$ and $Z(f) = f^{-1}(\{0\})$ denote the zero set of $f$. Then either $f\equiv 0$ or $Z(f)$ has no accumulation points in $\Omega$. More generally, if $f,g\in H(\Omega)$ and the set $\{z\in\Omega : f(z) = g(z)\}$ has an accumulation point in $\Omega$, then $f\equiv g$.

Proof. The second statement follows from the first by considering the function $f - g$. For the proof of the first assertion we will work strictly in $\Omega$ with the relative topology.
Let $A$ denote the set of accumulation points of $Z(f)$ (in $\Omega$). By continuity of $f$, $A\subset Z(f)$ and $A$ is a closed subset of $\Omega$ with the relative topology. (Recall that $x\in A$ iff $V_x'\cap Z(f)\ne\emptyset$ for every neighborhood $V_x$ of $x$, where $V_x' := V_x\setminus\{x\}$. Hence $x\notin A$ iff there exists a neighborhood $V_x$ of $x$ such that $V_x'\cap Z(f) = \emptyset$. Since $V_x'$ is open, it follows that $V_x'\subset A^c$ and thus $V_x\subset A^c$. So $A^c$ is open, i.e. $A$ is closed.) The proof is finished by showing that $A$ is open, and thus $A = \emptyset$ or $A = \Omega$ because $\Omega$ is connected.
Suppose that $z_0\in A$ and express $f(z)$ as its power series expansion
$f(z) = \sum_{n=0}^\infty a_n(z - z_0)^n$
for $z$ near $z_0$. Since $0 = f(z_0)$ it follows that $a_0 = 0$. Let $z_k\in Z(f)\setminus\{z_0\}$ be such that $\lim z_k = z_0$. Then
$0 = \frac{f(z_k)}{z_k - z_0} = \sum_{n=1}^\infty a_n(z_k - z_0)^{n-1}\to a_1\text{ as }k\to\infty,$
so that $f(z) = \sum_{n=2}^\infty a_n(z - z_0)^n$. Similarly,
$0 = \frac{f(z_k)}{(z_k - z_0)^2} = \sum_{n=2}^\infty a_n(z_k - z_0)^{n-2}\to a_2\text{ as }k\to\infty,$
and continuing by induction it follows that $a_n\equiv 0$, i.e. $f$ is zero in a neighborhood of $z_0$.
Definition 28.30. For $z\in\mathbb{C}$, let
$\cos(z) = \frac{e^{iz} + e^{-iz}}{2}\quad\text{and}\quad\sin(z) = \frac{e^{iz} - e^{-iz}}{2i}.$

Exercise 28.8. Show that these formulas are consistent with the usual definitions of $\cos$ and $\sin$ when $z$ is real. Also show that the addition formulas in Exercise 28.2 are valid for $\alpha,\beta\in\mathbb{C}$. This can be done with no additional computations by making use of Corollary 28.29.

Exercise 28.9. Let
$f(z) := \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\exp\big(-\tfrac{1}{2}x^2 + zx\big)\,dm(x)\quad\text{for }z\in\mathbb{C}.$
Show $f(z) = \exp(\tfrac{1}{2}z^2)$ using the following outline:
(1) Show $f\in H(\mathbb{C})$.
(2) Show $f(z) = \exp(\tfrac{1}{2}z^2)$ for $z\in\mathbb{R}$ by completing the square and using the translation invariance of $m$. Also recall that you have proved in the first quarter that $f(0) = 1$.
(3) Conclude $f(z) = \exp(\tfrac{1}{2}z^2)$ for all $z\in\mathbb{C}$ using Corollary 28.29.
Corollary 28.31 (Mean value property). Let $\Omega\subset_o\mathbb{C}$ and $f\in H(\Omega)$. Then $f$ satisfies the mean value property
$(28.18)\qquad f(z_0) = \frac{1}{2\pi}\int_0^{2\pi}f(z_0 + \rho e^{i\theta})\,d\theta,$
which holds for all $z_0\in\Omega$ and $\rho\ge 0$ such that $\bar{D}(z_0,\rho)\subset\Omega$.

Proof. Take $n = 0$ in Eq. (28.17).

Proposition 28.32. Suppose that $\Omega$ is a connected open subset of $\mathbb{C}$. If $f\in H(\Omega)$ is a function such that $|f|$ has a local maximum at $z_0\in\Omega$, then $f$ is constant.
Proof. Let $\rho > 0$ be such that $\bar{D} = \bar{D}(z_0,\rho)\subset\Omega$ and $|f(z)|\le|f(z_0)| =: M$ for $z\in\bar{D}$. By replacing $f$ by $e^{i\theta}f$ with an appropriate $\theta\in\mathbb{R}$, we may assume $M = f(z_0)$. Letting $u(z) = \operatorname{Re}f(z)$ and $v(z) = \operatorname{Im}f(z)$, we learn from Eq. (28.18) that
$M = f(z_0) = \operatorname{Re}f(z_0) = \frac{1}{2\pi}\int_0^{2\pi}u(z_0 + \rho e^{i\theta})\,d\theta\le M,$
since $u(z_0 + \rho e^{i\theta})\le|f(z_0 + \rho e^{i\theta})|\le M$ for all $\theta$. From the previous equation it follows that
$0 = \int_0^{2\pi}\big[M - u(z_0 + \rho e^{i\theta})\big]\,d\theta,$
which in turn implies that $M = u(z_0 + \rho e^{i\theta})$ for all $\theta$, since $M - u(z_0 + \rho e^{i\theta})$ is non-negative and continuous in $\theta$. So we have proved $M = u(z_0 + \rho e^{i\theta})$ for all $\theta$. Since
$M^2\ge\big|f(z_0 + \rho e^{i\theta})\big|^2 = u^2(z_0 + \rho e^{i\theta}) + v^2(z_0 + \rho e^{i\theta}) = M^2 + v^2(z_0 + \rho e^{i\theta}),$
we find $v(z_0 + \rho e^{i\theta}) = 0$ for all $\theta$. Thus we have shown $f(z_0 + \rho e^{i\theta}) = M$ for all $\theta$ (and all smaller radii as well), and hence by Corollary 28.29, $f(z) = M$ for all $z\in\Omega$.
The following lemma makes the same conclusion as Proposition 28.32 using the Cauchy Riemann equations. This lemma may be skipped.

Lemma 28.33. Suppose that $f\in H(D)$ where $D = D(z_0,\rho)$ for some $\rho > 0$. If $|f(z)| = k$ is constant on $D$, then $f$ is constant on $D$.

Proof. If $k = 0$ we are done, so assume that $k > 0$. By assumption
$0 = \bar{\partial}k^2 = \bar{\partial}|f|^2 = \bar{\partial}(\bar{f}f) = (\bar{\partial}\bar{f})f + \bar{f}\,\bar{\partial}f = (\bar{\partial}\bar{f})f = \overline{f'}\,f,$
wherein we have used
$\bar{\partial}\bar{f} = \tfrac{1}{2}(\partial_x + i\partial_y)\bar{f} = \overline{\tfrac{1}{2}(\partial_x - i\partial_y)f} = \overline{\partial f} = \overline{f'}$
and $\bar{\partial}f = 0$ by the Cauchy Riemann equations. Since $|f| = k > 0$, $f$ never vanishes; hence $f' = 0$ and $f$ is constant.
Corollary 28.34 (Maximum modulus principle). Let $\Omega$ be a bounded region and $f\in C(\bar{\Omega})\cap H(\Omega)$. Then for all $z\in\Omega$, $|f(z)|\le\sup_{\zeta\in\partial\Omega}|f(\zeta)|$. Furthermore, if there exists $z_0\in\Omega$ such that $|f(z_0)| = \sup_{\zeta\in\partial\Omega}|f(\zeta)|$, then $f$ is constant.

Proof. If there exists $z_0\in\Omega$ such that $|f(z_0)| = \max_{z\in\bar{\Omega}}|f(z)|$, then Proposition 28.32 implies that $f$ is constant and hence $|f(z)| = \sup_{\zeta\in\partial\Omega}|f(\zeta)|$. If no such $z_0$ exists, then the maximum of $|f|$ over $\bar{\Omega}$ is attained only on $\partial\Omega$, so $|f(z)|\le\sup_{\zeta\in\partial\Omega}|f(\zeta)|$ for all $z\in\Omega$.
28.4. Weak characterizations of $H(\Omega)$. The next theorem is the deepest theorem of this section.

Theorem 28.35. Let $\Omega\subset_o\mathbb{C}$ and $f : \Omega\to\mathbb{C}$ be a function which is complex differentiable at each point $z\in\Omega$. Then $\oint_{\partial T}f(z)\,dz = 0$ for all solid triangles $T\subset\Omega$.

Figure 49. Splitting $T$ into four similar triangles of equal size.

Proof. Write $T = S_1\cup S_2\cup S_3\cup S_4$ as in Figure 49.
Let $T_1\in\{S_1,S_2,S_3,S_4\}$ be such that $|\int_{\partial T_1}f(z)\,dz| = \max\{|\int_{\partial S_i}f(z)\,dz| : i = 1,2,3,4\}$; then
$\Big|\int_{\partial T}f(z)\,dz\Big| = \Big|\sum_{i=1}^4\int_{\partial S_i}f(z)\,dz\Big|\le\sum_{i=1}^4\Big|\int_{\partial S_i}f(z)\,dz\Big|\le 4\Big|\int_{\partial T_1}f(z)\,dz\Big|.$
Repeating the above argument with $T$ replaced by $T_1$ again and again, we find by induction that there are triangles $\{T_i\}_{i=1}^\infty$ such that
(1) $T\supset T_1\supset T_2\supset T_3\supset\cdots$,
(2) $\ell(\partial T_n) = 2^{-n}\ell(\partial T)$, where $\ell(\partial T)$ denotes the length of the boundary of $T$,
(3) $\operatorname{diam}(T_n) = 2^{-n}\operatorname{diam}(T)$, and
$(28.19)\qquad\Big|\int_{\partial T}f(z)\,dz\Big|\le 4^n\Big|\int_{\partial T_n}f(z)\,dz\Big|.$
By the finite intersection property of compact sets there exists $z_0\in\bigcap_{n=1}^\infty T_n$. Because
$f(z) = f(z_0) + f'(z_0)(z - z_0) + o(z - z_0),$
we find
$4^n\Big|\int_{\partial T_n}f(z)\,dz\Big| = 4^n\Big|\int_{\partial T_n}f(z_0)\,dz + \int_{\partial T_n}f'(z_0)(z - z_0)\,dz + \int_{\partial T_n}o(z - z_0)\,dz\Big| = 4^n\Big|\int_{\partial T_n}o(z - z_0)\,dz\Big|\le C\epsilon_n4^n\int_{\partial T_n}|z - z_0|\,d|z|$
(the first two integrals vanish since their integrands have holomorphic antiderivatives), where $\epsilon_n\to 0$ as $n\to\infty$. Since
$\int_{\partial T_n}|z - z_0|\,d|z|\le\operatorname{diam}(T_n)\,\ell(\partial T_n) = 2^{-n}\operatorname{diam}(T)\,2^{-n}\ell(\partial T) = 4^{-n}\operatorname{diam}(T)\,\ell(\partial T),$
we see
$4^n\Big|\int_{\partial T_n}f(z)\,dz\Big|\le C\epsilon_n4^n4^{-n}\operatorname{diam}(T)\,\ell(\partial T) = C\epsilon_n\to 0\text{ as }n\to\infty.$
Hence by Eq. (28.19), $\int_{\partial T}f(z)\,dz = 0$.
Theorem 28.36 (Morera's Theorem). Suppose that $\Omega\subset_o\mathbb{C}$ and $f\in C(\Omega)$ is a complex function such that
$(28.20)\qquad\int_{\partial T}f(z)\,dz = 0\quad\text{for all solid triangles }T\subset\Omega;$
then $f\in H(\Omega)$.

Proof. Let $D = D(z_0,\rho)$ be a disk such that $\bar{D}\subset\Omega$, and for $z\in D$ let
$F(z) = \int_{[z_0,z]}f(\zeta)\,d\zeta,$
where $[z_0,z]$ is by definition the contour $\sigma(t) = (1-t)z_0 + tz$ for $0\le t\le 1$. For $z,w\in D$ we have, using Eq. (28.20),
$F(w) - F(z) = \int_{[z,w]}f(\zeta)\,d\zeta = \int_0^1f(z + t(w - z))(w - z)\,dt = (w - z)\int_0^1f(z + t(w - z))\,dt.$
From this equation and the dominated convergence theorem we learn that
$\frac{F(w) - F(z)}{w - z} = \int_0^1f(z + t(w - z))\,dt\to f(z)\text{ as }w\to z.$
Hence $F' = f$, so that $F\in H(D)$. Corollary 28.24 now implies $f = F'\in H(D)$. Since $D$ was an arbitrary disk contained in $\Omega$ and the condition for being in $H(\Omega)$ is local, we conclude that $f\in H(\Omega)$.
The method of the proof above also gives the following corollary.

Corollary 28.37. Suppose that $\Omega\subset_o\mathbb{C}$ is a convex open set. Then for every $f\in H(\Omega)$ there exists $F\in H(\Omega)$ such that $F' = f$. In fact, fixing a point $z_0\in\Omega$, we may define $F$ by
$F(z) = \int_{[z_0,z]}f(\zeta)\,d\zeta\quad\text{for all }z\in\Omega.$
Exercise 28.10. Let $\Omega\subset_o\mathbb{C}$ and $\{f_n\}\subset H(\Omega)$ be a sequence of functions such that $f(z) = \lim_{n\to\infty}f_n(z)$ exists for all $z\in\Omega$ and the convergence is uniform on compact subsets of $\Omega$. Show $f\in H(\Omega)$ and $f'(z) = \lim_{n\to\infty}f_n'(z)$.
Hint: Use Morera's theorem to show $f\in H(\Omega)$ and then use Eq. (28.14) with $n = 1$ to prove $f'(z) = \lim_{n\to\infty}f_n'(z)$.
Theorem 28.38. Let $\Omega\subset_o\mathbb{C}$ be an open set. Then
$(28.21)\qquad H(\Omega) = \Big\{f : \Omega\to\mathbb{C}\text{ such that }\frac{df(z)}{dz}\text{ exists for all }z\in\Omega\Big\}.$
In other words, if $f : \Omega\to\mathbb{C}$ is complex differentiable at all points of $\Omega$, then $f'$ is automatically continuous and hence $f$ is $C^\infty$ by Theorem 28.23!!!

Proof. Combine Theorems 28.35 and 28.36.
Corollary 28.39 (Removable singularities). Let $\Omega\subset_o\mathbb{C}$, $z_0\in\Omega$ and $f\in H(\Omega\setminus\{z_0\})$. If $\limsup_{z\to z_0}|f(z)| < \infty$, i.e. $\sup_{0 < |z - z_0| < \epsilon}|f(z)| < \infty$ for some $\epsilon > 0$, then $\lim_{z\to z_0}f(z)$ exists. Moreover, if we extend $f$ to $\Omega$ by setting $f(z_0) = \lim_{z\to z_0}f(z)$, then $f\in H(\Omega)$.

Proof. Set
$g(z) = \begin{cases}(z - z_0)^2f(z) & \text{for }z\in\Omega\setminus\{z_0\},\\ 0 & \text{for }z = z_0.\end{cases}$
Then $g'(z_0)$ exists and is equal to zero. Therefore $g'(z)$ exists for all $z\in\Omega$ and hence $g\in H(\Omega)$. We may now expand $g$ into a power series using $g(z_0) = g'(z_0) = 0$ to learn $g(z) = \sum_{n=2}^\infty a_n(z - z_0)^n$, which implies
$f(z) = \frac{g(z)}{(z - z_0)^2} = \sum_{n=2}^\infty a_n(z - z_0)^{n-2}\quad\text{for }0 < |z - z_0| < \epsilon.$
Therefore $\lim_{z\to z_0}f(z) = a_2$ exists. Defining $f(z_0) = a_2$ we have $f(z) = \sum_{n=2}^\infty a_n(z - z_0)^{n-2}$ for $z$ near $z_0$. This shows that $f$ is holomorphic in a neighborhood of $z_0$, and since $f$ was already holomorphic away from $z_0$, $f\in H(\Omega)$.
Exercise 28.11. Show
$(28.22)\qquad\int_{-1}^1\frac{\sin Mx}{x}\,dx = \int_{-M}^M\frac{\sin x}{x}\,dx\to\pi\text{ as }M\to\infty$
using the following method. (In previous notes we evaluated this limit by real variable techniques based on the identity $\frac{1}{x} = \int_0^\infty e^{-\lambda x}\,d\lambda$ for $x > 0$.)
(1) Show that
$g(z) = \begin{cases}z^{-1}\sin z & \text{for }z\ne 0,\\ 1 & \text{if }z = 0\end{cases}$
defines a holomorphic function on $\mathbb{C}$.
(2) Let $\Gamma_M$ denote the straight line path from $-M$ to $-1$ along the real axis, followed by the contour $e^{i\theta}$ for $\theta$ going from $\pi$ to $2\pi$, and then followed by the straight line path from $1$ to $M$. Explain why
$\int_{-M}^M\frac{\sin x}{x}\,dx = \int_{\Gamma_M}\frac{\sin z}{z}\,dz = \frac{1}{2i}\int_{\Gamma_M}\frac{e^{iz}}{z}\,dz - \frac{1}{2i}\int_{\Gamma_M}\frac{e^{-iz}}{z}\,dz.$
(3) Let $C_M^+$ denote the path $Me^{i\theta}$ with $\theta$ going from $0$ to $\pi$ and $C_M^-$ denote the path $Me^{i\theta}$ with $\theta$ going from $\pi$ to $2\pi$. By deforming paths and using the Cauchy integral formula, show
$\int_{\Gamma_M + C_M^+}\frac{e^{iz}}{z}\,dz = 2\pi i\quad\text{and}\quad\int_{\Gamma_M - C_M^-}\frac{e^{-iz}}{z}\,dz = 0.$
(4) Show (by writing out the integrals explicitly) that
$\lim_{M\to\infty}\int_{C_M^+}\frac{e^{iz}}{z}\,dz = 0 = \lim_{M\to\infty}\int_{C_M^-}\frac{e^{-iz}}{z}\,dz.$
(5) Conclude from steps 3. and 4. that Eq. (28.22) holds.
28.5. Summary of Results.

Theorem 28.40. Let $\Omega\subset\mathbb{C}$ be an open subset and $f : \Omega\to\mathbb{C}$ be a given function. If $f'(z)$ exists for all $z\in\Omega$, then in fact $f$ has complex derivatives to all orders and hence $f\in C^\infty(\Omega)$. Set $H(\Omega)$ to be the set of holomorphic functions on $\Omega$.
Now assume that $f\in C^0(\Omega)$. Then the following are equivalent:
(1) $f\in H(\Omega)$.
(2) $\oint_{\partial T}f(z)\,dz = 0$ for all triangles $T\subset\Omega$.
(3) $\oint_{\partial R}f(z)\,dz = 0$ for all nice regions $R\subset\Omega$.
(4) $\oint_\sigma f(z)\,dz = 0$ for all closed paths $\sigma$ in $\Omega$ which are null-homotopic.
(5) $f\in C^1(\Omega)$ and $\bar{\partial}f\equiv 0$; equivalently, if $f(x+iy) = u(x,y) + iv(x,y)$, then the pair of real valued functions $u,v$ satisfies
$\begin{pmatrix}\frac{\partial}{\partial x} & -\frac{\partial}{\partial y}\\ \frac{\partial}{\partial y} & \frac{\partial}{\partial x}\end{pmatrix}\begin{pmatrix}u\\ v\end{pmatrix} = \begin{pmatrix}0\\ 0\end{pmatrix}.$
(6) For all closed discs $\bar{D}\subset\Omega$ and $z\in D^o$,
$f(z) = \frac{1}{2\pi i}\oint_{\partial D}\frac{f(\zeta)}{\zeta - z}\,d\zeta.$
(7) For all $z_0\in\Omega$ and $R > 0$ such that $D(z_0,R)\subset\Omega$, the function $f$ restricted to $D(z_0,R)$ may be written as a power series:
$f(z) = \sum_{n=0}^\infty a_n(z - z_0)^n\quad\text{for }z\in D(z_0,R).$
Furthermore
$a_n = f^{(n)}(z_0)/n! = \frac{1}{2\pi i}\oint_{|z - z_0| = r}\frac{f(z)}{(z - z_0)^{n+1}}\,dz,$
where $0 < r < R$.
Remark 28.41. The operator $L = \begin{pmatrix}\frac{\partial}{\partial x} & -\frac{\partial}{\partial y}\\ \frac{\partial}{\partial y} & \frac{\partial}{\partial x}\end{pmatrix}$ is an example of an elliptic differential operator. This means that if $\frac{\partial}{\partial x}$ is replaced by $\xi_1$ and $\frac{\partial}{\partial y}$ is replaced by $\xi_2$, then the principal symbol of $L$,
$\sigma_L(\xi) := \begin{pmatrix}\xi_1 & -\xi_2\\ \xi_2 & \xi_1\end{pmatrix},$
is an invertible matrix for all $\xi = (\xi_1,\xi_2)\ne 0$. Solutions to equations of the form $Lf = g$ where $L$ is an elliptic operator have the property that the solution $f$ is smoother than the forcing function $g$. Another example of an elliptic differential operator is the Laplacian $\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$, for which $\sigma_\Delta(\xi) = \xi_1^2 + \xi_2^2$ is invertible provided $\xi\ne 0$. The wave operator $\square = \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2}$, for which $\sigma_\square(\xi) = \xi_1^2 - \xi_2^2$, is not elliptic and also does not have the smoothing properties of an elliptic operator.
28.6. Exercises.
(1) Set $e^z = \sum_{n=0}^\infty\frac{z^n}{n!}$. Show that $e^z = e^x(\cos(y) + i\sin(y))$, that $\partial e^z = \frac{d}{dz}e^z = e^z$, and that $\bar{\partial}e^z = 0$.
(2) Find all possible solutions to the equation $e^z = w$ where $z$ and $w$ are complex numbers. Let $\log(w)\equiv\{z : e^z = w\}$. Note that $\log : \mathbb{C}\to$ (subsets of $\mathbb{C}$). One often writes $\log : \mathbb{C}\to\mathbb{C}$ and calls $\log$ a multi-valued function. A continuous function $l$ defined on some open subset $\Omega$ of $\mathbb{C}$ is called a branch of $\log$ if $l(w)\in\log(w)$ for all $w\in\Omega$. Use a result from class to show any branch of $\log$ is holomorphic on its domain of definition and that $l'(z) = 1/z$ for all $z\in\Omega$.
(3) Let $\Omega = \{w = re^{i\theta}\in\mathbb{C} : r > 0,\ -\pi < \theta < \pi\} = \mathbb{C}\setminus(-\infty,0]$, and define $\mathrm{Ln} : \Omega\to\mathbb{C}$ by $\mathrm{Ln}(re^{i\theta})\equiv\ln(r) + i\theta$ for $r > 0$ and $|\theta| < \pi$. Show that $\mathrm{Ln}$ is a branch of $\log$. This branch of the log function is often called the principal value branch of $\log$. The line $(-\infty,0]$ where $\mathrm{Ln}$ is not defined is called a branch cut. We will see that such a branch cut is necessary. In fact, for any continuous simple curve $\lambda$ joining $0$ and $\infty$ there will be a branch of the log function defined on the complement of $\lambda$.
(4) Let $\sqrt[n]{w}\equiv\{z\in\mathbb{C} : z^n = w\}$. The function $w\to\sqrt[n]{w}$ is another example of a multivalued function. Let $h(w)$ be any branch of $\sqrt[n]{w}$, that is, $h$ is a continuous function on an open subset $\Omega$ of $\mathbb{C}$ such that $h(w)\in\sqrt[n]{w}$. Show that $h$ is holomorphic away from $w = 0$ and that $h'(w) = \frac{1}{n}h(w)/w$.
(5) Let $l$ be any branch of the log function. Define $w^z\equiv e^{zl(w)}$ for all $z\in\mathbb{C}$ and $w\in D(l)$ where $D(l)$ denotes the domain of $l$. Show that $w^{1/n}$ is a branch of $\sqrt[n]{w}$ and also show that $\frac{d}{dw}w^z = zw^{z-1}$.
(6) Suppose that $(X,\mu)$ is a measure space and that $f : \Omega\times X\to\mathbb{C}$ is a function ($\Omega$ is an open subset of $\mathbb{C}$) such that for all $w\in X$ the function $z\to f(z,w)$ is in $H(\Omega)$ and $\int_X|f(z,w)|\,d\mu(w) < \infty$ for all $z\in\Omega$ (in fact one $z\in\Omega$ is enough). Also assume there is a function $g\in L^1(d\mu)$ such that $|\frac{\partial f(z,w)}{\partial z}|\le g(w)$ for all $(z,w)\in\Omega\times X$. Show that the function $h(z)\equiv\int_Xf(z,w)\,d\mu(w)$ is holomorphic on $\Omega$ and that $h'(z) = \int_X\frac{\partial f(z,w)}{\partial z}\,d\mu(w)$ for all $z\in\Omega$. Hint: use the Hahn Banach theorem and the mean value theorem to prove the estimate
$\Big|\frac{f(z+\delta,w) - f(z,w)}{\delta}\Big|\le g(w)$
for all $\delta\in\mathbb{C}$ sufficiently close to but not equal to zero.
(7) Assume that $f$ is a $C^1$ function on $\mathbb{C}$. Show that $\partial[f(\bar{z})] = (\bar{\partial}f)(\bar{z})$. (By the way, a $C^1$ function $f$ on $\mathbb{C}$ is said to be anti-holomorphic if $\partial f = 0$. This problem shows that $f$ is anti-holomorphic iff $z\to f(\bar{z})$ is holomorphic.)
(8) Let $U\subset\mathbb{C}$ be connected and open. Show that $f\in H(U)$ is constant on $U$ iff $f'\equiv 0$ on $U$.
(9) Let $f\in H(U)$ and $R\subset U$ be a nice closed region (see the figure to be supplied later). Use Green's theorem to show $\int_{\partial R}f(z)\,dz = 0$, where
$\int_{\partial R}f(z)\,dz\equiv\sum_{i=1}^n\int_{\sigma_i}f(z)\,dz$
and $\{\sigma_i\}_{i=1}^n$ denote the components of the boundary, appropriately oriented, see Figure 1.
(10) The purpose of this problem is to understand the Laurent series of a function holomorphic in an annulus. Let $0\le R_0 < r_0 < r_1 < R_1\le\infty$, $z_0\in\mathbb{C}$, $U\equiv\{z\in\mathbb{C}\mid R_0 < |z - z_0| < R_1\}$, and $A\equiv\{z\in\mathbb{C}\mid r_0 < |z - z_0| < r_1\}$.
a): Use the above problem (or otherwise) and the simple form of the Cauchy integral formula proved in class to show: if $g\in H(U)\cap C^1(U)$, then for all $z\in A$, $g(z) = \frac{1}{2\pi i}\int_{\partial A}\frac{g(w)}{w-z}\,dw$. Hint: Apply the above problem to the function $f(w) = \frac{g(w)}{w-z}$ with a judiciously chosen region $R\subset U$.
b): Mimic the proof (twice, one time for each component of $\partial A$) of the Taylor series done in class to show: if $g\in H(U)\cap C^1(U)$, then
$g(z) = \sum_{n=-\infty}^\infty a_n(z - z_0)^n,\quad z\in A,$
where
$a_n = \frac{1}{2\pi i}\int_{\sigma_\rho}\frac{g(w)}{(w - z_0)^{n+1}}\,dw,$
$\sigma_\rho(t) = z_0 + \rho e^{it}$ ($0\le t\le 2\pi$) and $\rho$ is any point in $(R_0, R_1)$.
c): Suppose that $R_0 = 0$, $g\in H(U)\cap C^1(U)$, and $g$ is bounded near $z_0$. Show in this case that $a_{-n}\equiv 0$ for all $n > 0$ and in particular conclude that $g$ may be extended uniquely to $z_0$ in such a way that $g$ is complex differentiable at $z_0$.
(11) A problem from Berenstein and Gay, Complex Variables: An Introduction, Springer, 1991, p. 163.
Notation and Conventions: Let $\Omega$ denote an open subset of $\mathbb{R}^N$. Let $L = \Delta = \sum_{i=1}^N\frac{\partial^2}{\partial x_i^2}$ be the Laplacian on $C^2(\Omega,\mathbb{R})$.
(12) (Weak Maximum Principle)
a): Suppose that $u\in C^2(\Omega,\mathbb{R})$ is such that $Lu(x) > 0$ for all $x\in\Omega$. Show that $u$ can have no local maximum in $\Omega$. In particular, if $\Omega$ is a bounded open subset of $\mathbb{R}^N$ and $u\in C(\bar{\Omega},\mathbb{R})\cap C^2(\Omega,\mathbb{R})$, then $u(x) < \max_{y\in\partial\Omega}u(y)$ for all $x\in\Omega$.
b): (Weak maximum principle) Suppose that $\Omega$ is now a bounded open subset of $\mathbb{R}^N$ and that $u\in C(\bar{\Omega},\mathbb{R})\cap C^2(\Omega,\mathbb{R})$ is such that $Lu\ge 0$ on $\Omega$. Show that $u(y)\le M := \max_{x\in\partial\Omega}u(x)$ for all $y\in\Omega$. (Hint: apply part a) to the function $u_\epsilon(x) = u(x) + \epsilon|x|^2$ where $\epsilon > 0$ and then let $\epsilon\downarrow 0$.)

Remark 28.42 (Fact). Assume now that $\Omega$ is connected. It is possible to prove, using just calculus techniques, the "strong maximum principle," which states that if $u$ as in part b) of the problem above has an interior maximum then $u$ must be a constant. (One may prove this result when the dimension $n = 2$ by using the mean value property of harmonic functions discussed in Chapter 11 of Rudin.) The direct calculus proof of this fact is elementary but tricky. If you are interested, see Protter and Weinberger, Maximum Principles in Differential Equations, p. 61.

(13) (Maximum modulus principle) Prove the maximum modulus principle using the strong maximum principle. That is, assume that $\Omega$ is a connected bounded subset of $\mathbb{C}$ and that $f\in H(\Omega)\cap C(\bar{\Omega},\mathbb{C})$. Show that $|f(z)|\le\max_{\zeta\in\partial\Omega}|f(\zeta)|$ for all $z\in\Omega$, and if equality holds for some $z\in\Omega$ then $f$ is a constant.
Hint: Assume for contradiction that $|f(z)|$ has a maximum greater than zero at $z_0\in\Omega$. Write $f(z) = e^{g(z)}$ for some analytic function $g$ in a neighborhood of $z_0$. (We have shown such a function must exist.) Now use the strong maximum principle on the function $u = \operatorname{Re}(g)$.
28.7. Problems from Rudin.
p. 229: #17.
Chapter 10: 2, 3, 4, 5.
Chapter 10: 8-13, 17, 18-21, 26, 30 (replace the word "show" by "convince yourself that" in problem 30).

Remark 28.43. Problem 30 is related to the fact that the fundamental group of $\Omega$ is not commutative, whereas the first homology group of $\Omega$ is in fact the abelianization of the fundamental group.

Chapter 11: 1, 2, 5, 6.
Chapter 12: 2 (Hint: use the fractional linear transformation
$\psi(z)\equiv i\,\frac{z - i}{z + i},$
which maps $\Pi^+\to U$ conformally.), 3, 4 (Hint: on 4a, apply the maximum modulus principle to $1/f$.), 5, 11 (Hint: Choose $\alpha > 1$ and $z_0\in\Pi^+$ such that $|f(z_0)| < \alpha$ and $\delta\in(0,1)$ such that $\bar{D}\equiv\bar{D}(z_0,\delta)\subset\Pi^+$ and $|f(z)|\le M$ on $\bar{D}$. For $R > \delta$ let $\Omega_R\equiv(\Pi^+\cap D(z_0,R))\setminus\bar{D}$. Show that $g_n(z)\equiv(f(z))^n/(z - z_0)$ satisfies $g_n\in H(\Omega_R)\cap C^0(\bar{\Omega}_R)$ and $|g_n|\le\max\{\alpha^nM^n/\delta,\ B^n/R\}$ on $\partial\Omega_R$. Now apply the maximum modulus principle to $g_n$, then let $R\to\infty$, then $n\to\infty$, and finally let $\alpha\downarrow 1$.)
29. Littlewood Paley Theory

Lemma 29.1 (Hadamard's three line lemma). Let $S$ be the vertical strip
$S = \{z : 0 < \operatorname{Re}(z) < 1\} = (0,1)\times i\mathbb{R}$
and $\phi(z)$ be a continuous bounded function on $\bar{S} = [0,1]\times i\mathbb{R}$ which is holomorphic on $S$. If $M_s := \sup_{\operatorname{Re}(z) = s}|\phi(z)|$, then $M_s\le M_0^{1-s}M_1^s$. (In words, this says that the maximum of $\phi(z)$ on the line $\operatorname{Re}(z) = s$ is controlled by the maximum of $\phi(z)$ on the lines $\operatorname{Re}(z) = 0$ and $\operatorname{Re}(z) = 1$; hence the name "three line lemma.")

Proof. Let $N_0 > M_0$ and $N_1 > M_1$ (if $M_0$ and $M_1$ are both positive, we may take $N_0 = M_0$ and $N_1 = M_1$) and $\epsilon > 0$ be given. For $z = x + iy\in\bar{S}$,
$\max(N_0,N_1)\ge|N_0^{1-z}N_1^z| = N_0^{1-x}N_1^x\ge\min(N_0,N_1)$
and $\operatorname{Re}(z^2 - 1) = x^2 - 1 - y^2\le 0$ with $\operatorname{Re}(z^2 - 1)\to-\infty$ as $z\to\infty$ in the strip $\bar{S}$. Therefore,
$\phi_\epsilon(z) := \frac{\phi(z)}{N_0^{1-z}N_1^z}\exp(\epsilon(z^2 - 1))\quad\text{for }z\in\bar{S}$
is a bounded continuous function on $\bar{S}$, $\phi_\epsilon\in H(S)$, and $\phi_\epsilon(z)\to 0$ as $z\to\infty$ in the strip $\bar{S}$. The maximum modulus principle applied to $\bar{S}_B := [0,1]\times i[-B,B]$ for $B$ sufficiently large shows that
$\max\{|\phi_\epsilon(z)| : z\in\bar{S}\} = \max\{|\phi_\epsilon(z)| : z\in\partial\bar{S}\}.$
For $z = iy$ we have
$|\phi_\epsilon(z)| = \Big|\frac{\phi(z)}{N_0^{1-z}N_1^z}\exp(\epsilon(z^2 - 1))\Big|\le\frac{|\phi(iy)|}{N_0}\le\frac{M_0}{N_0} < 1,$
and for $z = 1 + iy$,
$|\phi_\epsilon(z)|\le\frac{|\phi(1 + iy)|}{N_1}\le\frac{M_1}{N_1} < 1.$
Combining the last three equations implies $\max\{|\phi_\epsilon(z)| : z\in\bar{S}\} < 1$. Letting $\epsilon\downarrow 0$ then shows that
$\Big|\frac{\phi(z)}{N_0^{1-z}N_1^z}\Big|\le 1\quad\text{for all }z\in\bar{S},$
or equivalently that
$|\phi(z)|\le|N_0^{1-z}N_1^z| = N_0^{1-x}N_1^x\quad\text{for all }z = x + iy\in\bar{S}.$
Since $N_0 > M_0$ and $N_1 > M_1$ were arbitrary, we conclude that
$|\phi(z)|\le|M_0^{1-z}M_1^z| = M_0^{1-x}M_1^x\quad\text{for all }z = x + iy\in\bar{S},$
from which it follows that $M_x\le M_0^{1-x}M_1^x$ for all $x\in(0,1)$.
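As a sanity check, if $\phi(z) = c^z$ for some constant $c > 0$, then $M_s = \sup_{\operatorname{Re}z = s}|c^z| = c^s = M_0^{1-s}M_1^s$, so the inequality of the lemma is saturated; the lemma says that a general bounded holomorphic $\phi$ can do no better than this exponential interpolation between its boundary bounds.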
As a first application we have the following.

Proposition 29.2. Suppose that $A$ and $B$ are complex $n\times n$ matrices with $A > 0$. ($A\ge 0$ can be handled by a limiting argument.) Suppose that $\|AB\|\le 1$ and $\|BA\|\le 1$. Then $\|\sqrt{A}\,B\,\sqrt{A}\|\le 1$ as well.

Proof. Let $F(z) = A^zBA^{1-z}$ for $z\in\bar{S}$, where $A^zf := \lambda^zf = e^{z\ln\lambda}f$ when $Af = \lambda f$. Then one checks that $F$ is holomorphic and
$F(x + iy) = A^{x+iy}BA^{1-x-iy} = A^{iy}F(x)A^{-iy},$
so that $\|F(x + iy)\| = \|F(x)\|$. Hence $F$ is bounded on $\bar{S}$ and
$\|F(0 + iy)\| = \|F(0)\| = \|BA\|\le 1,\quad\text{and}\quad\|F(1 + iy)\| = \|F(1)\| = \|AB\|\le 1.$
So by the three lines lemma (and the Hahn Banach theorem) $\|F(z)\|\le 1$ for all $z\in\bar{S}$. Taking $z = 1/2$ then proves the proposition.
Theorem 29.3 (Riesz-Thorin Interpolation Theorem). Suppose that $(X,\mathcal{M},\mu)$ and $(Y,\mathcal{N},\nu)$ are $\sigma$-finite measure spaces and that $1\le p_i, q_i\le\infty$ for $i = 0,1$. For $0 < s < 1$, let $p_s$ and $q_s$ be defined by
$\frac{1}{p_s} = \frac{1-s}{p_0} + \frac{s}{p_1}\quad\text{and}\quad\frac{1}{q_s} = \frac{1-s}{q_0} + \frac{s}{q_1}.$
If $T$ is a linear map from $L^{p_0}(\mu) + L^{p_1}(\mu)$ to $L^{q_0}(\nu) + L^{q_1}(\nu)$ such that
$\|T\|_{p_0\to q_0}\le M_0 < \infty\quad\text{and}\quad\|T\|_{p_1\to q_1}\le M_1 < \infty,$
then
$\|T\|_{p_s\to q_s}\le M_s := M_0^{1-s}M_1^s < \infty.$
Alternatively put, we are trying to show
$(29.1)\qquad\|Tf\|_{q_s}\le M_s\|f\|_{p_s}\quad\text{for all }s\in(0,1)\text{ and }f\in L^{p_s}(\mu),$
given
$\|Tf\|_{q_0}\le M_0\|f\|_{p_0}\text{ for all }f\in L^{p_0}(\mu)\quad\text{and}\quad\|Tf\|_{q_1}\le M_1\|f\|_{p_1}\text{ for all }f\in L^{p_1}(\mu).$
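For instance, with $p_0 = 1$, $q_0 = \infty$, $p_1 = q_1 = 2$ and $s = 1/2$, the formulas give $1/p_s = 1/2 + 1/4 = 3/4$ and $1/q_s = 0 + 1/4 = 1/4$, i.e. a bound from $L^{4/3}$ to $L^4$ with constant $M_0^{1/2}M_1^{1/2}$; Example 29.4 below carries this computation out for the Fourier transform for all $s\in(0,1)$.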
Proof. Let us first give the main ideas of the proof. At the end we will fill in some of the missing technicalities. (See Theorem 6.27 in Folland for the details.)
Eq. (29.1) is equivalent to showing
$\Big|\int Tf\cdot g\,d\nu\Big|\le M_s$
for all $f\in L^{p_s}(\mu)$ such that $\|f\|_{p_s} = 1$ and for all $g\in L^{q_s^*}(\nu)$ such that $\|g\|_{q_s^*} = 1$, where $q_s^*$ is the conjugate exponent to $q_s$. Define $p_z$ and $q_z^*$ by
$\frac{1}{p_z} = \frac{1-z}{p_0} + \frac{z}{p_1}\quad\text{and}\quad\frac{1}{q_z^*} = \frac{1-z}{q_0^*} + \frac{z}{q_1^*}$
and let
$f_z = |f|^{p_s/p_z}\frac{f}{|f|}\quad\text{and}\quad g_z = |g|^{q_s^*/q_z^*}\frac{g}{|g|}.$
Writing $z = x + iy$ we have $|f_z| = |f|^{p_s/p_x}$ and $|g_z| = |g|^{q_s^*/q_x^*}$, so that
$(29.2)\qquad\|f_z\|_{L^{p_x}} = 1\quad\text{and}\quad\|g_z\|_{L^{q_x^*}} = 1$
for all $z = x + iy$ with $0 < x < 1$. Let
$F(z) := \langle Tf_z, g_z\rangle = \int_YTf_z\cdot g_z\,d\nu$
and assume that $f$ and $g$ are simple functions. It is then routine to show $F\in C_b(\bar{S})\cap H(S)$ where $S$ is the strip $S = (0,1) + i\mathbb{R}$. Moreover, using Eq. (29.2),
$|F(it)| = |\langle Tf_{it}, g_{it}\rangle|\le M_0\|f_{it}\|_{p_0}\|g_{it}\|_{q_0^*} = M_0$
and
$|F(1 + it)| = |\langle Tf_{1+it}, g_{1+it}\rangle|\le M_1\|f_{1+it}\|_{p_1}\|g_{1+it}\|_{q_1^*} = M_1$
for all $t\in\mathbb{R}$. By the three lines lemma, it now follows that
$|\langle Tf_z, g_z\rangle| = |F(z)|\le M_0^{1-\operatorname{Re}z}M_1^{\operatorname{Re}z}$
and in particular, taking $z = s$ and using $f_s = f$ and $g_s = g$,
$|\langle Tf, g\rangle| = |F(s)|\le M_0^{1-s}M_1^s.$
Taking the supremum over all simple $g\in L^{q_s^*}$ such that $\|g\|_{q_s^*} = 1$ shows $\|Tf\|_{L^{q_s}}\le M_0^{1-s}M_1^s$ for all simple $f\in L^{p_s}(\mu)$ such that $\|f\|_{p_s} = 1$, or equivalently that
$(29.3)\qquad\|Tf\|_{L^{q_s}}\le M_0^{1-s}M_1^s\|f\|_{p_s}\quad\text{for all simple }f\in L^{p_s}(\mu).$
Now suppose that $f\in L^{p_s}$ and $f_n$ are simple functions in $L^{p_s}$ such that $|f_n|\le|f|$ and $f_n\to f$ pointwise as $n\to\infty$. Set $E = \{|f| > 1\}$, $g = f1_E$, $h = f1_{E^c}$, $g_n = f_n1_E$ and $h_n = f_n1_{E^c}$. By renaming $p_0$ and $p_1$ if necessary we may assume $p_0 < p_1$. Under this hypothesis we have $g, g_n\in L^{p_0}$ and $h, h_n\in L^{p_1}$, and $f = g + h$ and $f_n = g_n + h_n$. By the dominated convergence theorem
$\|f_n - f\|_{p_s}\to 0,\quad\|g_n - g\|_{p_0}\to 0\quad\text{and}\quad\|h - h_n\|_{p_1}\to 0$
as $n\to\infty$. Therefore $\|Tg_n - Tg\|_{q_0}\to 0$ and $\|Th_n - Th\|_{q_1}\to 0$ as $n\to\infty$. Passing to a subsequence if necessary, we may also assume that $Tg_n - Tg\to 0$ and $Th_n - Th\to 0$ a.e. as $n\to\infty$. It then follows that $Tf_n = Tg_n + Th_n\to Tg + Th = Tf$ a.e. as $n\to\infty$. This result, Fatou's lemma, the dominated convergence theorem and Eq. (29.3) then give
$\|Tf\|_{q_s}\le\liminf_{n\to\infty}\|Tf_n\|_{q_s}\le\liminf_{n\to\infty}M_0^{1-s}M_1^s\|f_n\|_{p_s} = M_0^{1-s}M_1^s\|f\|_{p_s}.$
29.0.1. Applications. For the first application, we will give another proof of Theorem 11.19.

Proof. (Proof of Theorem 11.19.) The case $q = 1$ is simple, namely
$\|f*g\|_r = \Big\|\int_{\mathbb{R}^n}f(\cdot - y)g(y)\,dy\Big\|_r\le\int_{\mathbb{R}^n}\|f(\cdot - y)\|_r|g(y)|\,dy = \|f\|_r\|g\|_1,$
and by interchanging the roles of $f$ and $g$ we also have
$\|f*g\|_r\le\|f\|_1\|g\|_r.$
Letting $C_gf := f*g$, the above comments may be reformulated as saying
$\|C_g\|_{1\to p}\le\|g\|_p.$
Another easy case is when $r = \infty$: with $q$ the conjugate exponent to $p$,
$|f*g(x)| = \Big|\int_{\mathbb{R}^n}f(x - y)g(y)\,dy\Big|\le\|f(x - \cdot)\|_q\|g\|_p = \|f\|_q\|g\|_p,$
which may be formulated as saying that
$\|C_g\|_{q\to\infty}\le\|g\|_p.$
By the Riesz Thorin interpolation theorem with $p_0 = 1$, $q_0 = p$, $p_1 = q$ and $q_1 = \infty$,
$\|C_g\|_{p_s\to q_s}\le\|C_g\|_{1\to p}^{1-s}\|C_g\|_{q\to\infty}^s\le\|g\|_p^{1-s}\|g\|_p^s = \|g\|_p$
for all $s\in(0,1)$, which is equivalent to
$\|f*g\|_{q_s}\le\|f\|_{p_s}\|g\|_p.$
Since
$p_s^{-1} = (1-s) + sq^{-1}\quad\text{and}\quad q_s^{-1} = (1-s)p^{-1} + s\cdot\infty^{-1} = (1-s)p^{-1},$
if we set $a = q_s$ and $b = p_s$ then
$b^{-1} + p^{-1} = (1-s) + sq^{-1} + p^{-1} = (1-s) + s(q^{-1} + p^{-1}) + (1-s)p^{-1} = 1 + (1-s)p^{-1} = 1 + a^{-1},$
which is Young's inequality $\|f*g\|_a\le\|f\|_b\|g\|_p$ whenever $b^{-1} + p^{-1} = 1 + a^{-1}$.
Example 29.4. By the Riesz Thorin interpolation theorem we conclude that $\mathcal{F} : L^p\to L^q$ is bounded for all $p\in[1,2]$, where $q = p^*$ is the conjugate exponent to $p$. Indeed, in the notation of the Riesz Thorin interpolation theorem, $\mathcal{F} : L^{p_s}\to L^{q_s}$ is bounded where
$\frac{1}{p_s} = \frac{1-s}{1} + \frac{s}{2}\quad\text{and}\quad\frac{1}{q_s} = \frac{1-s}{\infty} + \frac{s}{2} = \frac{s}{2},$
i.e.
$\frac{1}{p_s} + \frac{1}{q_s} = 1 - s + \frac{s}{2} + \frac{s}{2} = 1.$
See Theorem 20.11.
For the next application we will need the following general duality argument.

Lemma 29.5. Suppose that $(X,\mathcal{M},\mu)$ and $(Y,\mathcal{N},\nu)$ are $\sigma$-finite measure spaces and $T : L^2(\mu)\to L^2(\nu)$ is a bounded operator. If there exist $p, q\in[1,\infty]$ and a constant $C < \infty$ such that
$\|Tg\|_q\le C\|g\|_p\quad\text{for all }g\in L^p(\mu)\cap L^2(\mu),$
then
$\|T^*f\|_{p^*}\le C\|f\|_{q^*}\quad\text{for all }f\in L^{q^*}(\nu)\cap L^2(\nu),$
where $T^*$ is the $L^2$ adjoint of $T$ and $p^*$ and $q^*$ are the conjugate exponents to $p$ and $q$.

Proof. Suppose that $f\in L^{q^*}(\nu)\cap L^2(\nu)$; then by the reverse Holder inequality
$\|T^*f\|_{p^*} = \sup\big\{|(T^*f, g)| : g\in L^p(\mu)\cap L^2(\mu)\text{ with }\|g\|_p = 1\big\} = \sup\big\{|(f, Tg)| : g\in L^p(\mu)\cap L^2(\mu)\text{ with }\|g\|_p = 1\big\}$
$\le\|f\|_{q^*}\sup\big\{\|Tg\|_q : g\in L^p(\mu)\cap L^2(\mu)\text{ with }\|g\|_p = 1\big\}\le C\|f\|_{q^*}.$
Lemma 29.6. Suppose that $K = \{k_{mn}\ge 0\}_{m,n=1}^\infty$ is a symmetric matrix such that
$(29.4)\qquad M := \sup_m\sum_{n=1}^\infty k_{mn} = \sup_n\sum_{m=1}^\infty k_{mn} < \infty,$
and define $Ka$ by $(Ka)_m = \sum_nk_{mn}a_n$ when the sum converges. Given $p\in[1,\infty]$, $K : \ell^p\to\ell^p$ is bounded with $\|K\|_{p\to p}\le M$.

Proof. Let $A_m = \sum_{n=1}^\infty k_{mn} = \sum_{n=1}^\infty k_{nm}$. For $a\in\ell^p$ with $p < \infty$, Jensen's inequality relative to the probability weights $k_{mn}/A_m$ gives
$\Big(\sum_nk_{mn}|a_n|\Big)^p = \Big(A_m\sum_n\frac{k_{mn}}{A_m}|a_n|\Big)^p\le A_m^p\sum_n\frac{k_{mn}}{A_m}|a_n|^p\le M^{p-1}\sum_nk_{mn}|a_n|^p\qquad(29.5)$
and hence
$\sum_m\Big(\sum_nk_{mn}|a_n|\Big)^p\le M^{p-1}\sum_m\sum_nk_{mn}|a_n|^p = M^{p-1}\sum_n\Big(\sum_mk_{mn}\Big)|a_n|^p\le M^p\|a\|_{\ell^p}^p,$
which shows $K : \ell^p\to\ell^p$ with $\|K\|_{p\to p}\le M$ for $p\in[1,\infty)$. Moreover,
$\sup_m\sum_nk_{mn}|a_n|\le M\|a\|_\infty,$
which shows that $K : \ell^\infty\to\ell^\infty$ is bounded with $\|K\|_{\infty\to\infty}\le M$ as well.
Alternatively, the cases $p = 1$ and $p = \infty$ follow directly from Eq. (29.4), and the general case then follows from the Riesz-Thorin interpolation theorem applied with $p_0 = q_0 = 1$ and $p_1 = q_1 = \infty$ (so that $q_s = p_s$), which gives $\|K\|_{p_s\to p_s} = \|K\|_{p_s\to q_s}\le M$.
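For example, if $k_{mn} = c_{|m-n|}$ for a fixed sequence $c_j\ge 0$ with $\sum_jc_j < \infty$, then $k$ is symmetric and $M\le c_0 + 2\sum_{j\ge 1}c_j < \infty$, so the lemma yields a discrete Young-type inequality $\|Ka\|_{\ell^p}\le M\|a\|_{\ell^p}$ for such convolution-like kernels.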
The following lemma only uses the case $p = 2$, which we proved without interpolation.

Lemma 29.7. Suppose that $\{u_n\}$ is a sequence in a Hilbert space $H$ such that: 1) $\sum_n|u_n|^2 < \infty$ and 2) there exist constants $k_{mn} = k_{nm}\ge 0$ satisfying Eq. (29.4) and
$|(u_m, u_n)|\le k_{mn}|u_n||u_m|\quad\text{for all }m\text{ and }n.$
Then $v = \sum_nu_n$ exists and
$(29.6)\qquad|v|^2\le M\sum_n|u_n|^2.$

Proof. Let us begin by assuming that only a finite number of the $u_n$ are non-zero. The key point is to prove Eq. (29.6). In this case
$|v|^2 = \sum_{m,n}(u_n, u_m)\le\sum_{m,n}k_{mn}|u_n||u_m| = Ka\cdot a,$
where $a_n = |u_n|$. Now by the above remarks
$Ka\cdot a\le M|a|^2 = M\sum_na_n^2 = M\sum_n|u_n|^2,$
which establishes Eq. (29.6) in this case.
For $M' < N$, let $v_{M',N} = \sum_{n=M'}^Nu_n$; then by what we have just proved,
$|v_{M',N}|^2\le M\sum_{n=M'}^N|u_n|^2\to 0\text{ as }M', N\to\infty.$
This shows that $v = \sum_nu_n$ exists. Moreover we have
$|v_{1,N}|^2\le M\sum_{n=1}^N|u_n|^2\le M\sum_{n=1}^\infty|u_n|^2.$
Letting $N\to\infty$ in this last equation shows that Eq. (29.6) holds in general.
30. Elementary Distribution Theory

Reference: Friedlander, F. G., Introduction to the Theory of Distributions, 2nd ed. (with additional material by M. Joshi), Cambridge University Press, 1998. [S&E Stacks QA324 .F74 1998]

30.1. Distributions on $U\subset_o\mathbb{R}^n$. Let $U$ be an open subset of $\mathbb{R}^n$ and
$(30.1)\qquad C_c^\infty(U) = \bigcup_{K\subset\subset U}C^\infty(K)$
denote the set of smooth functions on $U$ with compact support in $U$.

Definition 30.1. A sequence $\{\phi_k\}_{k=1}^\infty\subset\mathcal{D}(U)$ converges to $\phi\in\mathcal{D}(U)$ iff there is a compact set $K\subset\subset U$ such that $\operatorname{supp}(\phi_k)\subset K$ for all $k$ and $\phi_k\to\phi$ in $C^\infty(K)$.

Definition 30.2 (Distributions on $U\subset_o\mathbb{R}^n$). A generalized function $T$ on $U\subset_o\mathbb{R}^n$ is a continuous linear functional on $\mathcal{D}(U)$, i.e. $T : \mathcal{D}(U)\to\mathbb{C}$ is linear and $\lim_{k\to\infty}\langle T,\phi_k\rangle = 0$ for all $\{\phi_k\}\subset\mathcal{D}(U)$ such that $\phi_k\to 0$ in $\mathcal{D}(U)$. We denote the space of generalized functions by $\mathcal{D}'(U)$.
Proposition 30.3. Let $T : \mathcal{D}(U)\to\mathbb{C}$ be a linear functional. Then $T\in\mathcal{D}'(U)$ iff for all $K\subset\subset U$ there exist $n\in\mathbb{N}$ and $C < \infty$ such that
$(30.2)\qquad|T(\phi)|\le C\,p_n(\phi)\quad\text{for all }\phi\in C^\infty(K).$

Proof. Suppose that $\{\phi_k\}\subset\mathcal{D}(U)$ is such that $\phi_k\to 0$ in $\mathcal{D}(U)$. Let $K$ be a compact set such that $\operatorname{supp}(\phi_k)\subset K$ for all $k$. Since $\lim_{k\to\infty}p_n(\phi_k) = 0$, it follows that if Eq. (30.2) holds then $\lim_{k\to\infty}\langle T,\phi_k\rangle = 0$. Conversely, suppose that there is a compact set $K\subset\subset U$ such that for no choice of $n\in\mathbb{N}$ and $C < \infty$ does Eq. (30.2) hold. Then we may choose non-zero $\phi_n\in C^\infty(K)$ such that
$|T(\phi_n)|\ge n\,p_n(\phi_n)\quad\text{for all }n.$
Let $\psi_n = \frac{1}{n\,p_n(\phi_n)}\phi_n\in C^\infty(K)$; then $p_n(\psi_n) = 1/n\to 0$ as $n\to\infty$, which shows that $\psi_n\to 0$ in $\mathcal{D}(U)$. On the other hand $|T(\psi_n)|\ge 1$, so that $\lim_{n\to\infty}\langle T,\psi_n\rangle\ne 0$.
Alternate Proof: The definition of $T$ being continuous is equivalent to $T|_{C^\infty(K)}$ being sequentially continuous for all $K\subset\subset U$. Since $C^\infty(K)$ is a metric space, sequential continuity and continuity are the same thing. Hence $T$ is continuous iff $T|_{C^\infty(K)}$ is continuous for all $K\subset\subset U$, and $T|_{C^\infty(K)}$ is continuous iff a bound like Eq. (30.2) holds.
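For example, the delta function $\delta_a$, $a\in U$, defined by $\langle\delta_a,\phi\rangle = \phi(a)$, satisfies Eq. (30.2) with $n = 0$ and $C = 1$ for every compact $K$ (assuming, as above, that $p_0$ denotes the sup-norm on $C^\infty(K)$), since $|\phi(a)|\le\|\phi\|_\infty$; hence $\delta_a\in\mathcal{D}'(U)$.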
Definition 30.4. Let $Y$ be a topological space and $T_y\in\mathcal{D}'(U)$ for all $y\in Y$. We say that $T_y\to T\in\mathcal{D}'(U)$ as $y\to y_0$ iff
$\lim_{y\to y_0}\langle T_y,\phi\rangle = \langle T,\phi\rangle\quad\text{for all }\phi\in\mathcal{D}(U).$
30.1.1. Examples of distributions and related computations.

Example 30.5. Let $\mu$ be a positive Radon measure on $U$ and $f\in L^1_{loc}(U)$. Define $T_f\in\mathcal{D}'(U)$ by $\langle T_f,\phi\rangle = \int_U\phi f\,d\mu$ for all $\phi\in\mathcal{D}(U)$. Notice that if $\phi\in C^\infty(K)$ then
$|\langle T_f,\phi\rangle|\le\int_U|\phi f|\,d\mu = \int_K|\phi f|\,d\mu\le C_K\|\phi\|_\infty,$
where $C_K := \int_K|f|\,d\mu < \infty$. Hence $T_f\in\mathcal{D}'(U)$. Furthermore, the map
$f\in L^1_{loc}(U)\to T_f\in\mathcal{D}'(U)$
is injective. Indeed, $T_f = 0$ is equivalent to
$(30.3)\qquad\int_U\phi f\,d\mu = 0\quad\text{for all }\phi\in\mathcal{D}(U).$
By the dominated convergence theorem and the usual convolution argument, this is equivalent to
$(30.4)\qquad\int_U\phi f\,d\mu = 0\quad\text{for all }\phi\in C_c(U).$
Now fix a compact set $K\subset\subset U$ and $\phi_n\in C_c(U)$ such that $\phi_n\to\operatorname{sgn}(f)1_K$ in $L^1(\mu)$. By replacing $\phi_n$ by $\chi(\phi_n)$ if necessary, where
$\chi(z) = \begin{cases}z & \text{if }|z|\le 1,\\ z/|z| & \text{if }|z|\ge 1,\end{cases}$
we may assume that $|\phi_n|\le 1$. By passing to a further subsequence, we may assume that $\phi_n\to\operatorname{sgn}(f)1_K$ a.e. Thus we have
$0 = \lim_{n\to\infty}\int_U\phi_nf\,d\mu = \int_U\operatorname{sgn}(f)1_Kf\,d\mu = \int_K|f|\,d\mu.$
This shows that $|f(x)| = 0$ for $\mu$-a.e. $x\in K$. Since $K$ is arbitrary and $U$ is the countable union of such compact sets $K$, it follows that $f(x) = 0$ for $\mu$-a.e. $x\in U$.
The injectivity may also be proved slightly more directly as follows. As before, it suffices to prove Eq. (30.4) implies that $f(x) = 0$ for $\mu$-a.e. $x$. We may further assume that $f$ is real by considering real and imaginary parts separately. Let $K\subset\subset U$ and $\epsilon > 0$ be given. Set $A = \{f > 0\}\cap K$; then $\mu(A) < \infty$ and hence, since all finite measures on $U$ are Radon, there exist $F\subset A\subset V$ with $F$ compact and $V\subset_o U$ such that $\mu(V\setminus F) < \delta$. By Urysohn's lemma, there exists $\phi\in C_c(V)$ such that $0\le\phi\le 1$ and $\phi = 1$ on $F$. Then by Eq. (30.4),
$0 = \int_U\phi f\,d\mu = \int_F\phi f\,d\mu + \int_{V\setminus F}\phi f\,d\mu = \int_Ff\,d\mu + \int_{V\setminus F}\phi f\,d\mu,$
so that
$\Big|\int_Ff\,d\mu\Big| = \Big|\int_{V\setminus F}\phi f\,d\mu\Big|\le\int_{V\setminus F}|f|\,d\mu < \epsilon,$
provided that $\delta$ is chosen sufficiently small, by the $\epsilon$-$\delta$ definition of absolute continuity. Similarly, it follows that
$0\le\int_Af\,d\mu\le\Big|\int_Ff\,d\mu\Big| + \int_{A\setminus F}|f|\,d\mu\le 2\epsilon.$
Since $\epsilon > 0$ is arbitrary, it follows that $\int_Af\,d\mu = 0$. Since $K$ was arbitrary, we learn that
$\int_{\{f > 0\}}f\,d\mu = 0,$
which shows that $f\le 0$ $\mu$-a.e. Similarly one shows that $f\ge 0$ $\mu$-a.e., and hence $f = 0$ $\mu$-a.e.
ANALYSIS TOOLS WITH APPLICATIONS 531
Example 30.6. Let us now assume that = m and write hT
f
, i =
R
U
fdm. For
the moment let us also assume that U = R. Then we have
(1) lim
M
T
sin Mx
= 0
(2) lim
M
T
M
1
sin Mx
=
0
where
0
is the point measure at 0.
(3) If f L
1
(R
n
, dm) with
R
R
n
fdm = 1 and f

(x) =
n
f(x/), then
lim
0
T
f

=
0
. As a special case,
consider lim
0

(x
2
+
2
)
=
0
.
Definition 30.7 (Multiplication by smooth functions). Suppose that $g\in C^\infty(U)$ and $T\in\mathcal{D}'(U)$. Then we define $gT\in\mathcal{D}'(U)$ by
$\langle gT,\phi\rangle = \langle T, g\phi\rangle\quad\text{for all }\phi\in\mathcal{D}(U).$
It is easily checked that $gT$ is continuous.

Definition 30.8 (Differentiation). For $T\in\mathcal{D}'(U)$ and $i\in\{1,2,\dots,n\}$ let $\partial_iT\in\mathcal{D}'(U)$ be the distribution defined by
$\langle\partial_iT,\phi\rangle = -\langle T,\partial_i\phi\rangle\quad\text{for all }\phi\in\mathcal{D}(U).$
Again it is easy to check that $\partial_iT$ is a distribution.
More generally, if $L = \sum_{|\alpha|\le m}a_\alpha\partial^\alpha$ with $a_\alpha\in C^\infty(U)$ for all $\alpha$, then $LT$ is the distribution defined by
$\langle LT,\phi\rangle = \Big\langle T,\sum_{|\alpha|\le m}(-1)^{|\alpha|}\partial^\alpha(a_\alpha\phi)\Big\rangle\quad\text{for all }\phi\in\mathcal{D}(U).$
Hence we can talk about distributional solutions to differential equations of the form $LT = S$.
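For example, let $H = 1_{[0,\infty)}$ be the Heaviside function on $U = \mathbb{R}$. Then for $\phi\in\mathcal{D}(\mathbb{R})$,
$\langle\partial T_H,\phi\rangle = -\langle T_H,\phi'\rangle = -\int_0^\infty\phi'(x)\,dx = \phi(0) = \langle\delta_0,\phi\rangle,$
so $\partial T_H = \delta_0$ in $\mathcal{D}'(\mathbb{R})$, even though $H$ has no classical derivative at $0$.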
Example 30.9. Suppose that $f\in L^1_{loc}$ and $g\in C^\infty(U)$; then $gT_f = T_{gf}$. If further $f\in C^1(U)$, then $\partial_iT_f = T_{\partial_if}$. If $f\in C^m(U)$, then $LT_f = T_{Lf}$.

Example 30.10. Suppose that $a\in U$; then
$\langle\partial_i\delta_a,\phi\rangle = -\partial_i\phi(a)$
and more generally we have
$\langle L\delta_a,\phi\rangle = \sum_{|\alpha|\le m}(-1)^{|\alpha|}\,\partial^\alpha(a_\alpha\phi)(a).$

Example 30.11. Consider the distribution $T := T_{|x|}$ for $x\in\mathbb{R}$, i.e. take $U = \mathbb{R}$. Then
$\frac{d}{dx}T = T_{\operatorname{sgn}(x)}\quad\text{and}\quad\frac{d^2}{dx^2}T = 2\delta_0.$
More generally, suppose that $f$ is piecewise $C^1$; then
$\frac{d}{dx}T_f = T_{f'} + \sum_x\big(f(x+) - f(x-)\big)\delta_x,$
where the sum is over the jump points of $f$.
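To verify the first formula in Example 30.11, integrate by parts on each half line: for $\phi\in\mathcal{D}(\mathbb{R})$,
$\Big\langle\frac{d}{dx}T_{|x|},\phi\Big\rangle = -\int_{\mathbb{R}}|x|\,\phi'(x)\,dx = \int_0^\infty\phi(x)\,dx - \int_{-\infty}^0\phi(x)\,dx = \int_{\mathbb{R}}\operatorname{sgn}(x)\,\phi(x)\,dx = \langle T_{\operatorname{sgn}},\phi\rangle,$
and differentiating once more, together with the Heaviside computation above, gives $\frac{d^2}{dx^2}T_{|x|} = 2\delta_0$.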
Example 30.12. Consider $T = T_{\ln|x|}$ on $\mathcal{D}(\mathbb{R})$. Then
$\langle T',\phi\rangle = -\int_{\mathbb{R}}\ln|x|\,\phi'(x)\,dx = -\lim_{\epsilon\downarrow 0}\int_{|x| > \epsilon}\ln|x|\,\phi'(x)\,dx = \lim_{\epsilon\downarrow 0}\int_{|x| > \epsilon}\frac{1}{x}\phi(x)\,dx + \lim_{\epsilon\downarrow 0}\ln\epsilon\,[\phi(\epsilon) - \phi(-\epsilon)] = \lim_{\epsilon\downarrow 0}\int_{|x| > \epsilon}\frac{1}{x}\phi(x)\,dx,$
since $\ln\epsilon\,[\phi(\epsilon) - \phi(-\epsilon)]\to 0$. We will write $T' = PV\frac{1}{x}$ in the future. Here is another formula for $T'$:
$\langle T',\phi\rangle = \lim_{\epsilon\downarrow 0}\int_{1\ge|x| > \epsilon}\frac{1}{x}\phi(x)\,dx + \int_{|x| > 1}\frac{1}{x}\phi(x)\,dx = \lim_{\epsilon\downarrow 0}\int_{1\ge|x| > \epsilon}\frac{1}{x}[\phi(x) - \phi(0)]\,dx + \int_{|x| > 1}\frac{1}{x}\phi(x)\,dx = \int_{1\ge|x|}\frac{1}{x}[\phi(x) - \phi(0)]\,dx + \int_{|x| > 1}\frac{1}{x}\phi(x)\,dx.$
Please notice in the last example that $\frac{1}{x}\notin L^1_{loc}(\mathbb{R})$, so that $T_{1/x}$ is not well defined. This is an example of the so-called division problem of distributions. Here is another possible interpretation of $\frac{1}{x}$ as a distribution.
as a distribution.
Example 30.13. Here we try to dene 1/x as lim
y0
1
xiy
, that is we want to
dene a distribution T

by
hT

, i := lim
y0
Z
1
x iy
(x)dx.
Let us compute T
+
explicitly,
lim
y0
Z
R
1
x +iy
(x)dx = lim
y0
Z
|x|1
1
x +iy
(x)dx + lim
y0
Z
|x|>1
1
x +iy
(x)dx
= lim
y0
Z
|x|1
1
x +iy
[(x) (0)] dx +(0) lim
y0
Z
|x|1
1
x +iy
dx
+
Z
|x|>1
1
x
(x)dx
= PV
Z
R
1
x
(x)dx +(0) lim
y0
Z
|x|1
1
x +iy
dx.
Now by deforming the contour we have
Z
|x|1
1
x +iy
dx =
Z
<|x|1
1
x +iy
dx +
Z
C

1
z +iy
dz
where C

: z = e
i
with : 0. Therefore,
lim
y0
Z
|x|1
1
x +iy
dx = lim
y0
Z
<|x|1
1
x +iy
dx + lim
y0
Z
C

1
z +iy
dz
=
Z
<|x|1
1
x
dx +
Z
C

1
z
dz = 0 .
ANALYSIS TOOLS WITH APPLICATIONS 533
Hence we have shown that T
+
= PV
1
x
i
0
. Similarly, one shows that T

=
PV
1
x
+i
0
. Notice that it follows from these computations that T

T
+
= i2
0
.
Notice that
1
x iy

1
x +iy
=
2iy
x
2
+y
2
and hence we conclude that lim
y0
y
x
2
+y
2
=
0
a result that we saw in Example
30.6, item 3.
Example 30.14. Suppose that μ is a complex measure on R and F(x) = μ((−∞, x]); then T'_F = μ. Moreover, if f ∈ L^1_{loc}(R) and T'_f = μ, then f = F + C a.e. for some constant C.
Proof. Let φ ∈ D := D(R); then
⟨T'_F, φ⟩ = −⟨T_F, φ'⟩ = −∫_R F(x)φ'(x) dx = −∫_R dx ∫_R dμ(y) φ'(x) 1_{y≤x}
= −∫_R dμ(y) ∫_R dx φ'(x) 1_{y≤x} = ∫_R dμ(y) φ(y) = ⟨μ, φ⟩
by Fubini's theorem and the fundamental theorem of calculus. If T'_f = μ, then T'_{f−F} = 0 and the result follows from Corollary 30.16 below.
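For example, if dμ(x) = ρ(x) dx with ρ ∈ L^1(R), then F(x) = ∫_{−∞}^x ρ(y) dy and the example asserts that
\frac{d}{dx} T_F = T_ρ,
i.e. the fundamental theorem of calculus holds in the distributional sense for such F.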
Lemma 30.15. Suppose that T ∈ D'(R^n) is a distribution such that ∂_i T = 0 for some i; then there exists a distribution S ∈ D'(R^{n−1}) such that ⟨T, φ⟩ = ⟨S, \barφ_i⟩ for all φ ∈ D(R^n), where
\barφ_i := ∫_R φ(· + t e_i) dt ∈ D(R^{n−1}),
the remaining n − 1 variables being those orthogonal to e_i.
Proof. To simplify notation, assume that i = n and write x ∈ R^n as x = (y, z) with y ∈ R^{n−1} and z ∈ R. Let ρ ∈ C_c^∞(R) be such that ∫_R ρ(z) dz = 1 and, for ψ ∈ D(R^{n−1}), let (ψ ⊗ ρ)(x) = ψ(y)ρ(z). The mapping
ψ ∈ D(R^{n−1}) → ψ ⊗ ρ ∈ D(R^n)
is easily seen to be sequentially continuous and therefore ⟨S, ψ⟩ := ⟨T, ψ ⊗ ρ⟩ defines a distribution in D'(R^{n−1}). Now suppose that φ ∈ D(R^n). If φ = ∂_n f for some f ∈ D(R^n) we would have to have ∫_R φ(y, z) dz = 0. This is not generally true; however, the function φ − \barφ ⊗ ρ does have this property. Define
f(y, z) := ∫_{−∞}^z [φ(y, z') − \barφ(y)ρ(z')] dz';
then f ∈ D(R^n) and ∂_n f = φ − \barφ ⊗ ρ. Therefore,
0 = −⟨∂_n T, f⟩ = ⟨T, ∂_n f⟩ = ⟨T, φ⟩ − ⟨T, \barφ ⊗ ρ⟩ = ⟨T, φ⟩ − ⟨S, \barφ⟩.
Corollary 30.16. Suppose that T ∈ D'(R^n) is a distribution such that there exists m ≥ 0 with ∂^α T = 0 for all |α| = m; then T = T_p where p(x) is a polynomial on R^n of degree less than or equal to m − 1, where by convention if deg(p) = −1 then p ≡ 0.
Proof. The proof will be by induction on n and m. The corollary is trivially true when m = 0 and n is arbitrary. Let n = 1 and assume the corollary holds for m = k − 1 with k ≥ 1. Let T ∈ D'(R) be such that 0 = ∂^k T = ∂^{k−1} T'. By the induction hypothesis, there exists a polynomial, q, of degree at most k − 2 such that T' = T_q. Let p(x) = ∫_0^x q(z) dz; then p is a polynomial of degree at most k − 1 such that p' = q and hence T'_p = T_q = T'. So (T − T_p)' = 0 and hence by Lemma 30.15, T − T_p = T_C where C = ⟨T − T_p, ρ⟩ and ρ is as in the proof of Lemma 30.15. This proves the result for n = 1.
For the general induction, suppose there exists (m, n) ∈ N^2 with m ≥ 0 and n ≥ 1 such that the assertion in the corollary holds for pairs (m', n') with either n' < n, or n' = n and m' ≤ m. Suppose that T ∈ D'(R^n) is a distribution such that
∂^α T = 0 for all |α| = m + 1.
In particular this implies that ∂^α ∂_n T = 0 for all |α| = m and hence, by induction, ∂_n T = T_{q_n} where q_n is a polynomial of degree at most m − 1 on R^n. Let p_n(x) = ∫_0^z q_n(y, z') dz', a polynomial of degree at most m on R^n. The polynomial p_n satisfies: 1) ∂^α p_n = 0 if |α| = m and α_n = 0, and 2) ∂_n p_n = q_n. Hence ∂_n(T − T_{p_n}) = 0 and so by Lemma 30.15,
⟨T − T_{p_n}, φ⟩ = ⟨S, \barφ_n⟩
for some distribution S ∈ D'(R^{n−1}). If α is a multi-index such that α_n = 0 and |α| = m + 1, then (using ∂^α T = 0 and ∂^α p_n = 0, both valid since |α| = m + 1)
0 = ⟨∂^α(T − T_{p_n}), φ⟩ = (−1)^{|α|}⟨T − T_{p_n}, ∂^α φ⟩ = (−1)^{|α|}⟨S, \overline{(∂^α φ)}_n⟩ = (−1)^{|α|}⟨S, ∂^α \barφ_n⟩ = ⟨∂^α S, \barφ_n⟩,
and in particular, by taking φ = ψ ⊗ ρ, we learn that ⟨∂^α S, ψ⟩ = 0 for all ψ ∈ D(R^{n−1}). Thus by the induction hypothesis, S = T_r for some polynomial r of degree at most m on R^{n−1}. Letting p(y, z) = p_n(y, z) + r(y), a polynomial of degree at most m on R^n, it is easily checked that T = T_p.
Example 30.17. Consider the wave equation
(∂_t − ∂_x)(∂_t + ∂_x) u(t, x) = (∂_t^2 − ∂_x^2) u(t, x) = 0.
From this equation one learns that u(t, x) = f(x + t) + g(x − t) solves the wave equation for f, g ∈ C^2. Suppose that f is a bounded Borel measurable function on R and consider the function f(x + t) as a distribution on R^2. We compute
⟨(∂_t − ∂_x) f(x + t), φ(x, t)⟩ = −∫_{R^2} f(x + t) (∂_t − ∂_x) φ(x, t) dx dt
= −∫_{R^2} f(x) [(∂_t − ∂_x) φ](x − t, t) dx dt
= −∫_{R^2} f(x) \frac{d}{dt}[φ(x − t, t)] dx dt
= −∫_R f(x) [φ(x − t, t)]|_{t=−∞}^{t=∞} dx = 0.
This shows that (∂_t − ∂_x) f(x + t) = 0 in the distributional sense. Similarly, (∂_t + ∂_x) g(x − t) = 0 in the distributional sense. Hence u(t, x) = f(x + t) + g(x − t) solves the wave equation in the distributional sense whenever f and g are bounded Borel measurable functions on R.
Example 30.18. Consider f(x) = ln|x| for x ∈ R^2 and let T = T_f. Then, pointwise we have
∇ ln|x| = \frac{x}{|x|^2} and Δ ln|x| = \frac{2}{|x|^2} − 2 \frac{x · x}{|x|^4} = 0.
Hence Δf(x) = 0 for all x ∈ R^2 except at x = 0 where it is not defined. Does this imply that ΔT = 0? No, in fact ΔT = 2πδ as we shall now prove. By definition of ΔT and the dominated convergence theorem,
⟨ΔT, φ⟩ = ⟨T, Δφ⟩ = ∫_{R^2} ln|x| Δφ(x) dx = lim_{ε↓0} ∫_{|x|>ε} ln|x| Δφ(x) dx.
Using the divergence theorem,
∫_{|x|>ε} ln|x| Δφ(x) dx = −∫_{|x|>ε} ∇ln|x| · ∇φ(x) dx + ∫_{∂{|x|>ε}} ln|x| ∇φ(x) · n(x) dS(x)
= ∫_{|x|>ε} Δln|x| · φ(x) dx − ∫_{∂{|x|>ε}} ∇ln|x| · n(x) φ(x) dS(x) + ∫_{∂{|x|>ε}} ln|x| (∇φ(x) · n(x)) dS(x)
= ∫_{∂{|x|>ε}} ln|x| (∇φ(x) · n(x)) dS(x) − ∫_{∂{|x|>ε}} ∇ln|x| · n(x) φ(x) dS(x),
where n(x) is the outward pointing normal, i.e. n(x) = −x̂ := −x/|x|. Now
|∫_{∂{|x|>ε}} ln|x| (∇φ(x) · n(x)) dS(x)| ≤ 2πε C ln\frac{1}{ε} → 0 as ε ↓ 0,
where C is a bound on |∇φ(x) · n(x)|. While
∫_{∂{|x|>ε}} ∇ln|x| · n(x) φ(x) dS(x) = ∫_{∂{|x|>ε}} \frac{x}{|x|^2} · (−x̂) φ(x) dS(x) = −\frac{1}{ε} ∫_{|x|=ε} φ(x) dS(x) → −2πφ(0) as ε ↓ 0.
Combining these results shows
⟨ΔT, φ⟩ = 2πφ(0).
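In other words, E(x) := \frac{1}{2π} ln|x| satisfies
Δ T_E = δ_0 in D'(R^2),
i.e. \frac{1}{2π} ln|x| is a fundamental solution for the Laplacian on R^2.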
Exercise 30.1. Carry out a similar computation to that in Example 30.18 to show
Δ T_{1/|x|} = −4πδ
where now x ∈ R^3.
Example 30.19. Let z = x + iy and \bar∂ = \frac{1}{2}(∂_x + i∂_y). Let T = T_{1/z}; then
\bar∂ T = πδ_0,
or imprecisely, \bar∂ \frac{1}{z} = πδ(z).
Proof. Pointwise we have \bar∂ \frac{1}{z} = 0, so we shall work as above. We then have
⟨\bar∂ T, φ⟩ = −⟨T, \bar∂ φ⟩ = −∫_{R^2} \frac{1}{z} \bar∂ φ(z) dm(z) = −lim_{ε↓0} ∫_{|z|>ε} \frac{1}{z} \bar∂ φ(z) dm(z)
= lim_{ε↓0} ∫_{|z|>ε} \bar∂\frac{1}{z} · φ(z) dm(z) − lim_{ε↓0} ∫_{∂{|z|>ε}} \frac{1}{z} φ(z) \frac{1}{2}(n_1(z) + i n_2(z)) dσ(z)
= 0 − lim_{ε↓0} ∫_{∂{|z|>ε}} \frac{1}{z} φ(z) \frac{1}{2}(−\frac{z}{|z|}) dσ(z) = \frac{1}{2} lim_{ε↓0} ∫_{∂{|z|>ε}} \frac{1}{|z|} φ(z) dσ(z)
= lim_{ε↓0} \frac{1}{2ε} ∫_{∂{|z|>ε}} φ(z) dσ(z) = πφ(0).
30.2. Other classes of test functions. (For what follows, see Exercises 6.13 and 6.14 of Chapter 6.)
Notation 30.20. Suppose that X is a vector space and {p_n}_{n=0}^∞ is a family of semi-norms on X such that p_n ≤ p_{n+1} for all n and with the property that p_n(x) = 0 for all n implies that x = 0. (We allow for p_n = p_0 for all n, in which case X is a normed vector space.) Let τ be the smallest topology on X such that p_n(x − ·) : X → [0, ∞) is continuous for all n ∈ N and x ∈ X. For n ∈ N, x ∈ X and ε > 0 let
B_n(x, ε) := {y ∈ X : p_n(x − y) < ε}.
Proposition 30.21. The balls B := {B_n(x, ε) : n ∈ N, x ∈ X and ε > 0} form a basis for the topology τ. This topology is the same as the topology induced by the metric d on X defined by
d(x, y) = Σ_{n=0}^∞ 2^{−n} \frac{p_n(x − y)}{1 + p_n(x − y)}.
Moreover, a sequence {x_k} ⊂ X is convergent to x ∈ X iff lim_{k→∞} d(x, x_k) = 0 iff lim_{k→∞} p_n(x − x_k) = 0 for all n ∈ N, and {x_k} ⊂ X is Cauchy in X iff lim_{k,l→∞} d(x_l, x_k) = 0 iff lim_{k,l→∞} p_n(x_l − x_k) = 0 for all n ∈ N.
Proof. Suppose that z ∈ B_n(x, ε) ∩ B_m(y, δ) and assume without loss of generality that m ≥ n. Then if p_m(w − z) < α, we have
p_m(w − y) ≤ p_m(w − z) + p_m(z − y) < α + p_m(z − y) < δ
provided that α ∈ (0, δ − p_m(z − y)), and similarly
p_n(w − x) ≤ p_m(w − x) ≤ p_m(w − z) + p_m(z − x) < α + p_m(z − x) < ε
provided that α ∈ (0, ε − p_m(z − x)). So choosing
α = \frac{1}{2} min(δ − p_m(z − y), ε − p_m(z − x)),
we have shown that B_m(z, α) ⊂ B_n(x, ε) ∩ B_m(y, δ). This shows that B forms a basis for a topology. In detail, V ⊂_o X iff for all x ∈ V there exist n ∈ N and ε > 0 such that B_n(x, ε) := {y ∈ X : p_n(x − y) < ε} ⊂ V.
Let τ(B) be the topology generated by B. Since |p_n(x − y) − p_n(x − z)| ≤ p_n(y − z), we see that p_n(x − ·) is continuous relative to τ(B) for each x ∈ X and n ∈ N. This shows that τ ⊂ τ(B). On the other hand, since p_n(x − ·) is τ-continuous, it follows that B_n(x, ε) = {y ∈ X : p_n(x − y) < ε} ∈ τ for all x ∈ X, ε > 0 and n ∈ N. This shows that B ⊂ τ and therefore that τ(B) ⊂ τ. Thus τ = τ(B).
Given x ∈ X and ε > 0, let B^d(x, ε) = {y ∈ X : d(x, y) < ε} be a d-ball. Choose N large so that Σ_{n=N+1}^∞ 2^{−n} < ε/2. Then for y ∈ B_N(x, ε/4) we have
d(x, y) ≤ p_N(x − y) Σ_{n=0}^N 2^{−n} + ε/2 < 2 · \frac{ε}{4} + ε/2 < ε,
which shows that B_N(x, ε/4) ⊂ B^d(x, ε). Conversely, if d(x, y) < δ, then
2^{−n} \frac{p_n(x − y)}{1 + p_n(x − y)} < δ,
which implies that
p_n(x − y) < \frac{2^n δ}{1 − 2^n δ} =: ε
when 2^n δ < 1; this shows that B_n(x, ε) contains B^d(x, δ) with δ and ε as above. This shows that τ and the topology generated by d are the same.
The moreover statements are now easily proved and are left to the reader.
Exercise 30.2. Keep the same notation as in Proposition 30.21 and further assume that {p'_n}_{n∈N} is another family of semi-norms as in Notation 30.20. Then the topology τ' determined by {p'_n}_{n∈N} is weaker than the topology τ determined by {p_n}_{n∈N} (i.e. τ' ⊂ τ) iff for every n ∈ N there is an m ∈ N and C < ∞ such that p'_n ≤ C p_m.
Solution. Suppose that τ' ⊂ τ. Since 0 ∈ {p'_n < 1} ∈ τ' ⊂ τ, there exists an m ∈ N and ε > 0 such that {p_m < ε} ⊂ {p'_n < 1}. So for x ∈ X,
\frac{εx}{2 p_m(x)} ∈ {p_m < ε} ⊂ {p'_n < 1},
which implies p'_n(x) < \frac{2}{ε} p_m(x) and hence p'_n ≤ C p_m with C = 2/ε. (Actually 1/ε would do here.)
For the converse assertion, let U ∈ τ' and x_0 ∈ U. Then there exist an n ∈ N and ε > 0 such that {p'_n(x_0 − ·) < ε} ⊂ U. If m ∈ N and C < ∞ are such that p'_n ≤ C p_m, then
x_0 ∈ {p_m(x_0 − ·) < ε/C} ⊂ {p'_n(x_0 − ·) < ε} ⊂ U,
which shows that U ∈ τ.
Lemma 30.22. Suppose that X and Y are vector spaces equipped with sequences of semi-norms {p_n} and {q_n} as in Notation 30.20. Then a linear map T : X → Y is continuous iff for all n ∈ N there exist C_n < ∞ and m_n ∈ N such that q_n(Tx) ≤ C_n p_{m_n}(x) for all x ∈ X. In particular, f ∈ X* iff |f(x)| ≤ C p_m(x) for some C < ∞ and m ∈ N. (We may also characterize continuity by sequential convergence since both X and Y are metric spaces.)
Proof. Suppose that T is continuous; then {x : q_n(Tx) < 1} is an open neighborhood of 0 in X. Therefore, there exist m ∈ N and ε > 0 such that B_m(0, ε) ⊂ {x : q_n(Tx) < 1}. So for x ∈ X and α < 1, αεx/p_m(x) ∈ B_m(0, ε) and thus
q_n(\frac{αε}{p_m(x)} Tx) < 1 ⟹ q_n(Tx) < \frac{1}{αε} p_m(x)
for all x. Letting α ↑ 1 shows that q_n(Tx) ≤ \frac{1}{ε} p_m(x) for all x ∈ X.
Conversely, if T satisfies
q_n(Tx) ≤ C_n p_{m_n}(x) for all x ∈ X,
then
q_n(Tx − Tx') = q_n(T(x − x')) ≤ C_n p_{m_n}(x − x') for all x, x' ∈ X.
This shows Tx' → Tx as x' → x, i.e. that T is continuous.
Definition 30.23. A Fréchet space is a vector space X equipped with a family {p_n} of semi-norms such that X is complete in the associated metric d.
Example 30.24. Let K ⊂⊂ R^n be a compact set and C^∞(K) := {f ∈ C_c^∞(R^n) : supp(f) ⊂ K}. For m ∈ N, let
p_m(f) := Σ_{|α|≤m} ‖∂^α f‖_∞.
Then (C^∞(K), {p_m}_{m=1}^∞) is a Fréchet space. Moreover the derivative operators {∂_k} and multiplication by smooth functions are continuous linear maps from C^∞(K) to C^∞(K). If μ is a finite measure on K, then T(f) := ∫_K ∂^α f dμ is an element of C^∞(K)* for any multi-index α.
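For instance, taking dμ = δ_{x_0} with x_0 ∈ K in the last sentence, T(f) := ∂^α f(x_0) satisfies
|T(f)| = |∂^α f(x_0)| ≤ p_{|α|}(f),
so T is a continuous linear functional on C^∞(K) by Lemma 30.22.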
Example 30.25. Let U ⊂_o R^n and, for m ∈ N and a compact set K ⊂⊂ U, let
p^K_m(f) := Σ_{|α|≤m} ‖∂^α f‖_{∞,K} := Σ_{|α|≤m} max_{x∈K} |∂^α f(x)|.
Choose a sequence K_m ⊂⊂ U such that K_m ⊂ K°_{m+1} ⊂ K_{m+1} ⊂⊂ U for all m and set q_m(f) = p^{K_m}_m(f). Then (C^∞(U), {q_m}_{m=1}^∞) is a Fréchet space and the topology is independent of the choice of sequence of compact sets K_m exhausting U. Moreover the derivative operators {∂_k} and multiplication by smooth functions are continuous linear maps from C^∞(U) to C^∞(U). If μ is a finite measure with compact support in U, then T(f) := ∫_U ∂^α f dμ is an element of C^∞(U)* for any multi-index α.
Proposition 30.26. A linear functional T on C^∞(U) is continuous, i.e. T ∈ C^∞(U)*, iff there exist a compact set K ⊂⊂ U, m ∈ N and C < ∞ such that
|⟨T, φ⟩| ≤ C p^K_m(φ) for all φ ∈ C^∞(U).
Notation 30.27. Let ν_s(x) := (1 + |x|)^s (or change to ν_s(x) = (1 + |x|^2)^{s/2} = ⟨x⟩^s ?) for x ∈ R^n and s ∈ R.
Example 30.28. Let S denote the space of functions f ∈ C^∞(R^n) such that f and all of its partial derivatives decay faster than (1 + |x|)^{−m} for all m > 0, as in Definition 20.6. Define
p_m(f) = Σ_{|α|≤m} ‖(1 + |·|)^m ∂^α f(·)‖_∞ = Σ_{|α|≤m} ‖ν_m ∂^α f(·)‖_∞;
then (S, {p_m}) is a Fréchet space. Again the derivative operators {∂_k} and multiplication by a function f ∈ P are examples of continuous linear operators on S. For an example of an element T ∈ S*, let μ be a measure on R^n such that
∫ (1 + |x|)^{−N} d|μ|(x) < ∞
for some N ∈ N. Then T(f) := ∫_{R^n} ∂^α f dμ defines an element of S*.
Proposition 30.29. The Fourier transform F : S → S is a continuous linear transformation.
Proof. For the purposes of this proof, it will be convenient to use the semi-norms
p'_m(f) = Σ_{|α|≤m} ‖(1 + |·|^2)^m ∂^α f(·)‖_∞.
This is permissible, since by Exercise 30.2 they give rise to the same topology on S.
Let f ∈ S and m ∈ N; then
(1 + |ξ|^2)^m ∂^α \hat f(ξ) = (1 + |ξ|^2)^m F((−ix)^α f)(ξ) = F[(1 − Δ)^m ((−ix)^α f)](ξ)
and therefore, if we let g = (1 − Δ)^m ((−ix)^α f) ∈ S,
|(1 + |ξ|^2)^m ∂^α \hat f(ξ)| ≤ ‖g‖_1 = ∫_{R^n} |g(x)| dx = ∫_{R^n} |g(x)| (1 + |x|^2)^n \frac{1}{(1 + |x|^2)^n} dx ≤ C ‖|g(·)| (1 + |·|^2)^n‖_∞
where C = ∫_{R^n} \frac{dx}{(1 + |x|^2)^n} < ∞. Using the product rule repeatedly, it is not hard to show
‖|g(·)| (1 + |·|^2)^n‖_∞ = ‖(1 + |·|^2)^n (1 − Δ)^m ((−ix)^α f)‖_∞ ≤ k Σ_{|β|≤2m} ‖(1 + |·|^2)^{n+|α|/2} ∂^β f‖_∞ ≤ k p'_{2m+n}(f)
for some constant k < ∞. Combining the last two displayed equations implies that p'_m(\hat f) ≤ C k p'_{2m+n}(f) for all f ∈ S, and thus F is continuous.
Proposition 30.30. The subspace C_c^∞(R^n) is dense in S(R^n).
Proof. Let ψ ∈ C_c^∞(R^n) be such that ψ = 1 in a neighborhood of 0 and set ψ_m(x) = ψ(x/m) for all m ∈ N. We will now show for all f ∈ S that ψ_m f converges to f in S. The main point is, by the product rule,
∂^α(ψ_m f − f)(x) = Σ_{β≤α} \binom{α}{β} ∂^β ψ_m(x) ∂^{α−β} f(x) − ∂^α f(x)
= Σ_{β≤α, β≠0} \binom{α}{β} \frac{1}{m^{|β|}} (∂^β ψ)(x/m) ∂^{α−β} f(x) + (ψ(x/m) − 1) ∂^α f(x).
Since ψ = 1 in a neighborhood of 0, the last term vanishes for |x| ≤ cm for some c > 0, and there it is dominated by a quantity of order 1/m because f ∈ S; since max_β ‖∂^β ψ‖_∞ is bounded, it then follows from the last equation that
‖ν_t ∂^α(ψ_m f − f)‖_∞ = O(1/m) for all t > 0 and α.
That is to say, ψ_m f → f in S.
Lemma 30.31 (Peetre's Inequality). For all x, y ∈ R^n and s ∈ R,
(30.5) (1 + |x + y|)^s ≤ min{(1 + |y|)^{|s|}(1 + |x|)^s, (1 + |y|)^s (1 + |x|)^{|s|}},
that is to say, ν_s(x + y) ≤ ν_{|s|}(x) ν_s(y) and ν_s(x + y) ≤ ν_s(x) ν_{|s|}(y) for all s ∈ R, where ν_s(x) = (1 + |x|)^s as in Notation 30.27. We also have the same results for ⟨x⟩, namely
(30.6) ⟨x + y⟩^s ≤ 2^{|s|/2} min{⟨x⟩^{|s|} ⟨y⟩^s, ⟨x⟩^s ⟨y⟩^{|s|}}.
Proof. By elementary estimates,
(1 + |x + y|) ≤ 1 + |x| + |y| ≤ (1 + |x|)(1 + |y|),
and so Eq. (30.5) holds if s ≥ 0. Now suppose that s < 0; then
(1 + |x + y|)^s ≥ (1 + |x|)^s (1 + |y|)^s,
and letting x → x − y and y → −y in this inequality implies
(1 + |x|)^s ≥ (1 + |x + y|)^s (1 + |y|)^s.
This inequality is equivalent to
(1 + |x + y|)^s ≤ (1 + |x|)^s (1 + |y|)^{−s} = (1 + |x|)^s (1 + |y|)^{|s|}.
By symmetry we also have
(1 + |x + y|)^s ≤ (1 + |x|)^{|s|} (1 + |y|)^s.
For the proof of Eq. (30.6),
⟨x + y⟩^2 = 1 + |x + y|^2 ≤ 1 + (|x| + |y|)^2 = 1 + |x|^2 + |y|^2 + 2|x||y| ≤ 1 + 2|x|^2 + 2|y|^2 ≤ 2(1 + |x|^2)(1 + |y|^2) = 2⟨x⟩^2 ⟨y⟩^2.
From this it follows that ⟨x⟩^2 ≤ 2⟨x + y⟩^2 ⟨y⟩^2 and hence
⟨x + y⟩^{−2} ≤ 2⟨x⟩^{−2} ⟨y⟩^2.
So if s ≥ 0, then
⟨x + y⟩^s ≤ 2^{s/2} ⟨x⟩^s ⟨y⟩^s and ⟨x + y⟩^{−s} ≤ 2^{s/2} ⟨x⟩^{−s} ⟨y⟩^s.
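Replacing y by −z in the first form of Eq. (30.5) (and using ν_{|s|}(−z) = ν_{|s|}(z)) gives the version used repeatedly below,
ν_s(x − z) ≤ ν_{|s|}(z) ν_s(x) for all s ∈ R and x, z ∈ R^n.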
Proposition 30.32. Suppose that f, g ∈ S; then f ∗ g ∈ S.
Proof. First proof. Since F(f ∗ g) = \hat f \hat g ∈ S, it follows that f ∗ g = F^{−1}(\hat f \hat g) ∈ S as well.
For the second proof we will make use of Peetre's inequality. We have for any k, l ∈ N that
ν_t(x) |∂^α(f ∗ g)(x)| = ν_t(x) |∂^α f ∗ g(x)| ≤ ν_t(x) ∫ |∂^α f(x − y)| |g(y)| dy
≤ C ν_t(x) ∫ ν_{−k}(x − y) ν_{−l}(y) dy ≤ C ν_t(x) ∫ ν_{−k}(x) ν_k(y) ν_{−l}(y) dy
= C ν_{t−k}(x) ∫ ν_{k−l}(y) dy.
Choosing k = t and l > t + n we learn that
ν_t(x) |∂^α(f ∗ g)(x)| ≤ C ∫ ν_{k−l}(y) dy < ∞,
showing ‖ν_t ∂^α(f ∗ g)‖_∞ < ∞ for all t ≥ 0 and α ∈ N_0^n.
30.3. Compactly supported distributions.
Definition 30.33. For a distribution T ∈ D'(U) and V ⊂_o U, we say T|_V = 0 if ⟨T, φ⟩ = 0 for all φ ∈ D(V).
Proposition 30.34. Suppose that V := {V_α}_{α∈A} is a collection of open subsets of U such that T|_{V_α} = 0 for all α; then T|_W = 0, where W = ∪_{α∈A} V_α.
Proof. Let {ψ_α}_{α∈A} be a smooth partition of unity subordinate to V, i.e. supp(ψ_α) ⊂ V_α for all α ∈ A, for each point x ∈ W there exists a neighborhood N_x ⊂_o W such that #{α ∈ A : supp(ψ_α) ∩ N_x ≠ ∅} < ∞, and 1_W = Σ_{α∈A} ψ_α. Then for φ ∈ D(W) we have φ = Σ_{α∈A} ψ_α φ, and there are only a finite number of nonzero terms in the sum since supp(φ) is compact. Since ψ_α φ ∈ D(V_α) for all α,
⟨T, φ⟩ = ⟨T, Σ_{α∈A} ψ_α φ⟩ = Σ_{α∈A} ⟨T, ψ_α φ⟩ = 0.
Definition 30.35. The support, supp(T), of a distribution T ∈ D'(U) is the relatively closed subset of U determined by
U \ supp(T) = ∪{V ⊂_o U : T|_V = 0}.
By Proposition 30.34, supp(T) may be described as the smallest (relatively) closed set F such that T|_{U\F} = 0.
Proposition 30.36. If f ∈ L^1_{loc}(U), then supp(T_f) = ess sup(f), where
ess sup(f) := {x ∈ U : m({y ∈ V : f(y) ≠ 0}) > 0 for all neighborhoods V of x}
as in Definition 11.14.
Proof. The key point is that T_f|_V = 0 iff f = 0 a.e. on V, and therefore
U \ supp(T_f) = ∪{V ⊂_o U : f 1_V = 0 a.e.}.
On the other hand,
U \ ess sup(f) = {x ∈ U : m({y ∈ V : f(y) ≠ 0}) = 0 for some neighborhood V of x}
= {x ∈ U : f 1_V = 0 a.e. for some neighborhood V of x}
= ∪{V ⊂_o U : f 1_V = 0 a.e.}.
Definition 30.37. Let E'(U) := {T ∈ D'(U) : supp(T) ⊂ U is compact}, the compactly supported distributions in D'(U).
Lemma 30.38. Suppose that T ∈ D'(U) and f ∈ C^∞(U) is a function such that K := supp(T) ∩ supp(f) is a compact subset of U. Then we may define ⟨T, f⟩ := ⟨T, θf⟩, where θ ∈ D(U) is any function such that θ = 1 on a neighborhood of K. Moreover, if K ⊂⊂ U is a given compact set and F ⊂⊂ U is a compact set such that K ⊂ F°, then there exist m ∈ N and C < ∞ such that
(30.7) |⟨T, f⟩| ≤ C Σ_{|α|≤m} ‖∂^α f‖_{∞,F}
for all f ∈ C^∞(U) such that supp(T) ∩ supp(f) ⊂ K. In particular, if T ∈ E'(U) then T extends uniquely to a linear functional on C^∞(U) and there is a compact subset F ⊂⊂ U such that the estimate in Eq. (30.7) holds for all f ∈ C^∞(U).
Proof. Suppose that θ' is another such cutoff function and let V be an open neighborhood of K such that θ = θ' = 1 on V. Setting g := (θ − θ') f ∈ D(U), we see that
supp(g) ⊂ supp(f) \ V ⊂ supp(f) \ K = supp(f) \ supp(T) ⊂ U \ supp(T),
see Figure 50 below. Therefore,
0 = ⟨T, g⟩ = ⟨T, (θ − θ') f⟩ = ⟨T, θf⟩ − ⟨T, θ'f⟩,
which shows that ⟨T, f⟩ is well defined.
Figure 50. Intersecting the supports.
Moreover, if F ⊂⊂ U is a compact set such that K ⊂ F° and θ ∈ C_c^∞(F°) is a function which is 1 on a neighborhood of K, we have
|⟨T, f⟩| = |⟨T, θf⟩| ≤ C Σ_{|α|≤m} ‖∂^α(θf)‖_∞ ≤ C' Σ_{|α|≤m} ‖∂^α f‖_{∞,F},
and this estimate holds for all f ∈ C^∞(U) such that supp(T) ∩ supp(f) ⊂ K.
Theorem 30.39. The restriction of T ∈ C^∞(U)* to C_c^∞(U) defines an element in E'(U). Moreover the map
T ∈ C^∞(U)* →(i) T|_{D(U)} ∈ E'(U)
is a linear isomorphism of vector spaces. The inverse map is defined as follows. Given S ∈ E'(U) and θ ∈ C_c^∞(U) such that θ = 1 on K = supp(S), then i^{−1}(S) = θS, where θS ∈ C^∞(U)* is defined by
⟨θS, φ⟩ = ⟨S, θφ⟩ for all φ ∈ C^∞(U).
Proof. Suppose that T ∈ C^∞(U)*; then there exist a compact set K ⊂⊂ U, m ∈ N and C < ∞ such that
|⟨T, φ⟩| ≤ C p^K_m(φ) for all φ ∈ C^∞(U),
where p^K_m is defined in Example 30.25. It is clear, using the sequential notion of continuity, that T|_{D(U)} is continuous on D(U), i.e. T|_{D(U)} ∈ D'(U). Moreover, if θ ∈ C_c^∞(U) is such that θ = 1 on a neighborhood of K, then
|⟨T, φ⟩ − ⟨T, θφ⟩| = |⟨T, (θ − 1)φ⟩| ≤ C p^K_m((θ − 1)φ) = 0,
which shows θT = T. Hence supp(T|_{D(U)}) = supp(θT|_{D(U)}) ⊂ supp(θ) ⊂⊂ U, showing that T|_{D(U)} ∈ E'(U). Therefore the map i is well defined and is clearly linear. I also claim that i is injective, because if T ∈ C^∞(U)* and i(T) = T|_{D(U)} ≡ 0, then
⟨T, φ⟩ = ⟨T, θφ⟩ = ⟨T|_{D(U)}, θφ⟩ = 0 for all φ ∈ C^∞(U).
To show i is surjective, suppose that S ∈ E'(U). By Lemma 30.38 we know that S extends uniquely to an element S̃ of C^∞(U)* such that S̃|_{D(U)} = S, i.e. i(S̃) = S, with K = supp(S).
Lemma 30.40. The space E'(U) is a sequentially dense subset of D'(U).
Proof. Choose K_n ⊂⊂ U such that K_n ⊂ K°_{n+1} ⊂ K_{n+1} ↑ U as n → ∞. Let χ_n ∈ C_c^∞(K°_{n+1}) be such that χ_n = 1 on K_n. Then for T ∈ D'(U), χ_n T ∈ E'(U) and χ_n T → T as n → ∞.
30.4. Tempered Distributions and the Fourier Transform. The space of tempered distributions S'(R^n) is the continuous dual to S = S(R^n). A linear functional T on S is continuous iff there exist k ∈ N and C < ∞ such that
(30.8) |⟨T, φ⟩| ≤ C p_k(φ) := C Σ_{|α|≤k} ‖ν_k ∂^α φ‖_∞
for all φ ∈ S. Since D = D(R^n) is a dense subspace of S, any element T ∈ S' is determined by its restriction to D. Moreover, if T ∈ S' it is easy to see that T|_D ∈ D'. Conversely, an element T ∈ D' satisfying an estimate of the form in Eq. (30.8) for all φ ∈ D extends uniquely to an element of S'. In this way we may view S' as a subspace of D'.
Example 30.41. Any compactly supported distribution is tempered, i.e. E'(U) ⊂ S'(R^n) for any U ⊂_o R^n.
One of the virtues of S' is that we may extend the Fourier transform to S'. Recall that for L^1 functions f and g we have the identity
⟨\hat f, g⟩ = ⟨f, \hat g⟩.
This suggests the following definition.
Definition 30.42. The Fourier and inverse Fourier transform of a tempered distribution T ∈ S' are the distributions \hat T = FT ∈ S' and T^∨ = F^{−1}T ∈ S' defined by
⟨\hat T, φ⟩ = ⟨T, \hat φ⟩ and ⟨T^∨, φ⟩ = ⟨T, φ^∨⟩ for all φ ∈ S.
Since F : S → S is a continuous isomorphism with inverse F^{−1}, one easily checks that \hat T and T^∨ are well defined elements of S' and that F^{−1} is the inverse of F on S'.
Example 30.43. Suppose that μ is a complex measure on R^n. Then we may view μ as an element of S' via ⟨μ, φ⟩ = ∫ φ dμ for all φ ∈ S. Then by Fubini-Tonelli,
⟨\hat μ, φ⟩ = ⟨μ, \hat φ⟩ = ∫ \hat φ(x) dμ(x) = ∫ (∫ φ(ξ) e^{−ix·ξ} dξ) dμ(x) = ∫ (∫ e^{−ix·ξ} dμ(x)) φ(ξ) dξ,
which shows that \hat μ is the distribution associated to the continuous function ξ → ∫ e^{−ix·ξ} dμ(x). We will somewhat abuse notation and identify the distribution \hat μ with the function ∫ e^{−ix·ξ} dμ(x). When dμ(x) = f(x) dx with f ∈ L^1, we have \hat μ = \hat f, so the definitions are all consistent.
Corollary 30.44. Suppose that μ is a complex measure such that \hat μ = 0; then μ = 0. So complex measures on R^n are uniquely determined by their Fourier transform.
Proof. If \hat μ = 0, then μ = 0 as a distribution, i.e. ∫ φ dμ = 0 for all φ ∈ S and in particular for all φ ∈ D. By Example 30.5 this implies that μ is the zero measure.
More generally we have the following analogous theorem for compactly supported distributions.
Theorem 30.45. Let S ∈ E'(R^n); then \hat S is an analytic function and \hat S(z) = ⟨S(x), e^{−ix·z}⟩. Also, if supp(S) ⊂⊂ B(0, M), then \hat S(z) satisfies a bound of the form
|\hat S(z)| ≤ C(1 + |z|)^m e^{M|Im z|}
for some m ∈ N and C < ∞. If S ∈ D(R^n), i.e. if S is assumed to be smooth, then for all m ∈ N there exists C_m < ∞ such that
|\hat S(z)| ≤ C_m (1 + |z|)^{−m} e^{M|Im z|}.
Proof. The function h(z) = ⟨S(ξ), e^{−iξ·z}⟩ for z ∈ C^n is analytic, since the map z ∈ C^n → e^{−iξ·z} ∈ C^∞(ξ ∈ R^n) is analytic and S is complex linear. Moreover, we have the bound
|h(z)| = |⟨S(ξ), e^{−iξ·z}⟩| ≤ C Σ_{|α|≤m} ‖∂_ξ^α e^{−iξ·z}‖_{∞,B(0,M)} = C Σ_{|α|≤m} ‖z^α e^{−iξ·z}‖_{∞,B(0,M)} ≤ C Σ_{|α|≤m} |z|^{|α|} ‖e^{−iξ·z}‖_{∞,B(0,M)} ≤ C (1 + |z|)^m e^{M|Im z|}.
If we now assume that S ∈ D(R^n), then
|z^α \hat S(z)| = |∫_{R^n} S(ξ) z^α e^{−iξ·z} dξ| = |∫_{R^n} S(ξ) (i∂_ξ)^α e^{−iξ·z} dξ| = |∫_{R^n} (i∂_ξ)^α S(ξ) e^{−iξ·z} dξ| ≤ e^{M|Im z|} ∫_{R^n} |∂_ξ^α S(ξ)| dξ,
showing
|z^α \hat S(z)| ≤ e^{M|Im z|} ‖∂^α S‖_1
and therefore
(1 + |z|)^m |\hat S(z)| ≤ C e^{M|Im z|} Σ_{|α|≤m} ‖∂^α S‖_1 ≤ C_m e^{M|Im z|}.
So to finish the proof it suffices to show h = \hat S in the sense of distributions.⁵²
For this let φ ∈ D, let K ⊂⊂ R^n be a compact set, and for ε > 0 let
φ_ε(ξ) = (2π)^{−n/2} ε^n Σ_{x∈εZ^n} φ(x) e^{−ix·ξ}.
This is a finite sum and
sup_{ξ∈K} |∂_ξ^α φ_ε(ξ) − ∂_ξ^α \hat φ(ξ)| = sup_{ξ∈K} |(2π)^{−n/2} Σ_{y∈εZ^n} ∫_{y+ε(0,1]^n} [(−iy)^α φ(y) e^{−iy·ξ} − (−ix)^α φ(x) e^{−ix·ξ}] dx|
≤ (2π)^{−n/2} Σ_{y∈εZ^n} ∫_{y+ε(0,1]^n} sup_{ξ∈K} |(−iy)^α φ(y) e^{−iy·ξ} − (−ix)^α φ(x) e^{−ix·ξ}| dx.
By uniform continuity of x → (−ix)^α φ(x) e^{−ix·ξ} for (ξ, x) ∈ K × R^n (φ has compact support),
δ(ε) := sup_{ξ∈K} sup_{y∈εZ^n} sup_{x∈y+ε(0,1]^n} |(−iy)^α φ(y) e^{−iy·ξ} − (−ix)^α φ(x) e^{−ix·ξ}| → 0 as ε ↓ 0,
which shows
sup_{ξ∈K} |∂_ξ^α φ_ε(ξ) − ∂_ξ^α \hat φ(ξ)| ≤ C δ(ε),
where C is the volume of a cube in R^n which contains the support of φ. This shows that φ_ε → \hat φ in C^∞(R^n). Therefore,
⟨\hat S, φ⟩ = ⟨S, \hat φ⟩ = lim_{ε↓0} ⟨S, φ_ε⟩ = lim_{ε↓0} (2π)^{−n/2} ε^n Σ_{x∈εZ^n} φ(x) ⟨S(ξ), e^{−ix·ξ}⟩
= lim_{ε↓0} (2π)^{−n/2} ε^n Σ_{x∈εZ^n} φ(x) h(x) = ∫_{R^n} φ(x) h(x) dx = ⟨h, φ⟩.
Remark 30.46. Notice that
∂_z^α \hat S(z) = ⟨S(x), ∂_z^α e^{−ix·z}⟩ = ⟨S(x), (−ix)^α e^{−ix·z}⟩ = ⟨(−ix)^α S(x), e^{−ix·z}⟩
and (−ix)^α S(x) ∈ E'(R^n). Therefore we find a bound of the form
|∂_z^α \hat S(z)| ≤ C (1 + |z|)^{m'} e^{M|Im z|},
where C and m' depend on α. In particular, this shows that \hat S ∈ P, i.e. S' is preserved under multiplication by \hat S.
The converse of this theorem holds as well. For the moment we only have the tools to prove the smooth converse. The general case will follow by using the notion of convolution to regularize a distribution to reduce the question to the smooth case.
52 This is most easily done using Fubini's Theorem 31.2 for distributions, proved below. The proof goes as follows. Let θ, ψ ∈ D(R^n) be such that θ = 1 on a neighborhood of supp(S) and ψ = 1 on a neighborhood of supp(φ); then
⟨h, φ⟩ = ⟨φ(x), ⟨S(ξ), e^{−ix·ξ}⟩⟩ = ⟨φ(x)ψ(x), ⟨S(ξ), θ(ξ)e^{−ix·ξ}⟩⟩ = ⟨φ(x), ⟨S(ξ), ψ(x)θ(ξ)e^{−ix·ξ}⟩⟩.
We may now apply Theorem 31.2 to conclude,
⟨h, φ⟩ = ⟨S(ξ), ⟨φ(x), ψ(x)θ(ξ)e^{−ix·ξ}⟩⟩ = ⟨S(ξ), θ(ξ)⟨φ(x), e^{−ix·ξ}⟩⟩ = ⟨S(ξ), ⟨φ(x), e^{−ix·ξ}⟩⟩ = ⟨S(ξ), \hat φ(ξ)⟩.
Theorem 30.47. Let S ∈ S(R^n) and assume that \hat S is an analytic function and that there exists an M < ∞ such that for all m ∈ N there exists C_m < ∞ with
|\hat S(z)| ≤ C_m (1 + |z|)^{−m} e^{M|Im z|}.
Then supp(S) ⊂ B(0, M).
Proof. By the Fourier inversion formula,
S(x) = ∫_{R^n} \hat S(ξ) e^{iξ·x} dξ,
and by deforming the contour we may express this integral as
S(x) = ∫_{R^n + iη} \hat S(ξ) e^{iξ·x} dξ = ∫_{R^n} \hat S(ξ + iη) e^{i(ξ + iη)·x} dξ
for any η ∈ R^n. From this last equation it follows that
|S(x)| ≤ e^{−η·x} ∫_{R^n} |\hat S(ξ + iη)| dξ ≤ C_m e^{−η·x} e^{M|η|} ∫_{R^n} (1 + |ξ + iη|)^{−m} dξ ≤ C_m e^{−η·x} e^{M|η|} ∫_{R^n} (1 + |ξ|)^{−m} dξ ≤ \tilde C_m e^{−η·x} e^{M|η|},
where \tilde C_m < ∞ if m > n. Letting η = λx with λ > 0 we learn
(30.9) |S(x)| ≤ \tilde C_m exp(−λ|x|^2 + Mλ|x|) = \tilde C_m e^{λ|x|(M − |x|)}.
Hence if |x| > M, we may let λ → ∞ in Eq. (30.9) to show S(x) = 0. That is to say, supp(S) ⊂ B(0, M).
Let us now pause to work out some specific examples of Fourier transforms of measures.
Example 30.48 (Delta Functions). Let a ∈ R^n and δ_a be the point mass measure at a; then
\hat δ_a(ξ) = e^{−ia·ξ}.
In particular it follows that
F^{−1} e^{−ia·ξ} = δ_a.
To see the content of this formula, let φ ∈ S. Then
∫ e^{−ia·ξ} φ^∨(ξ) dξ = ⟨e^{−ia·ξ}, F^{−1}φ⟩ = ⟨F^{−1} e^{−ia·ξ}, φ⟩ = ⟨δ_a, φ⟩ = φ(a),
which is precisely the Fourier inversion formula.
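Combining this with Definition 30.8 gives, for any multi-index α,
⟨F(∂^α δ_a), φ⟩ = ⟨∂^α δ_a, \hat φ⟩ = (−1)^{|α|} ∂^α \hat φ(a) = ∫ (iξ)^α e^{−ia·ξ} φ(ξ) dξ,
i.e. \widehat{∂^α δ_a}(ξ) = (iξ)^α e^{−ia·ξ}.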
Example 30.49. Suppose that p(x) is a polynomial. Then
⟨\hat p, φ⟩ = ⟨p, \hat φ⟩ = ∫ p(ξ) \hat φ(ξ) dξ.
Now
p(ξ) \hat φ(ξ) = ∫ φ(x) p(i∂_x) e^{−ix·ξ} dx = ∫ (p(\frac{1}{i}∂_x) φ)(x) e^{−ix·ξ} dx = F(p(\frac{1}{i}∂)φ)(ξ),
which combined with the previous equation implies
⟨\hat p, φ⟩ = ∫ F(p(\frac{1}{i}∂)φ)(ξ) dξ = (F^{−1} F(p(\frac{1}{i}∂)φ))(0) = p(\frac{1}{i}∂)φ(0) = ⟨δ_0, p(\frac{1}{i}∂)φ⟩ = ⟨p(i∂)δ_0, φ⟩.
Thus we have shown that \hat p = p(i∂)δ_0.
Lemma 30.50. Let p(ξ) be a polynomial in ξ ∈ R^n, let L = p(\frac{1}{i}∂) (a constant coefficient partial differential operator) and let T ∈ S'; then
F(p(\frac{1}{i}∂)T) = p \hat T.
In particular, if T = δ_0 we have
F(p(\frac{1}{i}∂)δ_0) = p \hat δ_0 = p.
Proof. By definition,
⟨FLT, φ⟩ = ⟨LT, \hat φ⟩ = ⟨p(\frac{1}{i}∂)T, \hat φ⟩ = ⟨T, p(i∂)\hat φ⟩
and
p(i∂_ξ)\hat φ(ξ) = p(i∂_ξ) ∫ φ(x) e^{−ix·ξ} dx = ∫ p(x) φ(x) e^{−ix·ξ} dx = \widehat{(pφ)}(ξ).
Thus
⟨FLT, φ⟩ = ⟨T, \widehat{(pφ)}⟩ = ⟨\hat T, pφ⟩ = ⟨p\hat T, φ⟩,
which proves the lemma.
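For example, taking p(ξ) = −|ξ|^2 gives L = p(\frac{1}{i}∂) = Σ_j ∂_j^2 = Δ, and the lemma reads
F(ΔT) = −|ξ|^2 \hat T, equivalently F(−ΔT) = |ξ|^2 \hat T,
for every T ∈ S'.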
Example 30.51. Let n = 1, −∞ < a < b < ∞, and dμ(x) = 1_{[a,b]}(x) dx. Then
\hat μ(ξ) = ∫_a^b e^{−iξx} dx = \frac{1}{\sqrt{2π}} \frac{e^{−iξx}}{−iξ}|_a^b = \frac{1}{\sqrt{2π}} \frac{e^{−ibξ} − e^{−iaξ}}{−iξ} = \frac{1}{\sqrt{2π}} \frac{e^{−iaξ} − e^{−ibξ}}{iξ}.
So by the inversion formula we may conclude that
(30.10) F^{−1}(\frac{1}{\sqrt{2π}} \frac{e^{−iaξ} − e^{−ibξ}}{iξ})(x) = 1_{[a,b]}(x)
in the sense of distributions. This is also true at the level of L^2 functions. When a = −b and b > 0 these formulas reduce to
F 1_{[−b,b]} = \frac{1}{\sqrt{2π}} \frac{e^{ibξ} − e^{−ibξ}}{iξ} = \frac{2}{\sqrt{2π}} \frac{\sin bξ}{ξ}
and
F^{−1} \frac{2}{\sqrt{2π}} \frac{\sin bξ}{ξ} = 1_{[−b,b]}.
Let us pause to work out Eq. (30.10) from first principles. For M ∈ (0, ∞) let ν_M be the complex measure on R defined by
dν_M(ξ) = \frac{1}{\sqrt{2π}} 1_{|ξ|≤M} \frac{e^{−iaξ} − e^{−ibξ}}{iξ} dξ;
then
\frac{1}{\sqrt{2π}} \frac{e^{−iaξ} − e^{−ibξ}}{iξ} = lim_{M→∞} ν_M in the S' topology.
Hence
F^{−1}(\frac{1}{\sqrt{2π}} \frac{e^{−iaξ} − e^{−ibξ}}{iξ})(x) = lim_{M→∞} F^{−1}ν_M
and
F^{−1}ν_M(x) = \frac{1}{2π} ∫_{−M}^M \frac{e^{−iaξ} − e^{−ibξ}}{iξ} e^{iξx} dξ.
Since ξ → \frac{e^{−iaξ} − e^{−ibξ}}{iξ} e^{iξx} is a holomorphic function on C, we may deform the contour to any contour in C starting at −M and ending at M. Let Γ_M denote the straight line path from −M to −1 along the real axis, followed by the contour e^{iθ} for θ going from π to 2π, followed by the straight line path from 1 to M. Then
∫_{|ξ|≤M} \frac{e^{−iaξ} − e^{−ibξ}}{iξ} e^{iξx} dξ = ∫_{Γ_M} \frac{e^{−iaξ} − e^{−ibξ}}{iξ} e^{iξx} dξ = ∫_{Γ_M} \frac{e^{i(x−a)ξ} − e^{i(x−b)ξ}}{iξ} dξ,
so that
F^{−1}ν_M(x) = \frac{1}{2πi} ∫_{Γ_M} \frac{e^{i(x−a)ξ} − e^{i(x−b)ξ}}{ξ} dm(ξ).
By the usual contour methods we find
lim_{M→∞} \frac{1}{2πi} ∫_{Γ_M} \frac{e^{iyξ}}{ξ} dm(ξ) = 1 if y > 0 and 0 if y < 0,
and therefore we have
F^{−1}(\frac{1}{\sqrt{2π}} \frac{e^{−iaξ} − e^{−ibξ}}{iξ})(x) = lim_{M→∞} F^{−1}ν_M(x) = 1_{x>a} − 1_{x>b} = 1_{[a,b]}(x).
Example 30.52. Let σ_t be the surface measure on the sphere S_t of radius t centered at zero in R^3. Then
\hat σ_t(ξ) = 4πt \frac{\sin t|ξ|}{|ξ|}.
Indeed,
\hat σ_t(ξ) = ∫_{tS^2} e^{−ix·ξ} dσ(x) = t^2 ∫_{S^2} e^{−itx·ξ} dσ(x) = t^2 ∫_{S^2} e^{−itx_3|ξ|} dσ(x)
= t^2 ∫_0^{2π} dθ ∫_0^π dφ sin φ e^{−it cos φ |ξ|} = 2πt^2 ∫_{−1}^1 e^{−itu|ξ|} du = 2πt^2 \frac{1}{−it|ξ|} e^{−itu|ξ|}|_{u=−1}^{u=1} = 4πt^2 \frac{\sin t|ξ|}{t|ξ|}.
By the inversion formula, it follows that
F^{−1} \frac{\sin t|ξ|}{|ξ|} = \frac{t}{4πt^2} σ_t = t \barσ_t,
where \barσ_t is \frac{1}{4πt^2} σ_t, the surface measure on S_t normalized to have total measure one.
Let us again pause to try to compute this inverse Fourier transform directly. To this end, let f_M(ξ) := \frac{\sin t|ξ|}{t|ξ|} 1_{|ξ|≤M}. By the dominated convergence theorem, it follows that f_M → \frac{\sin t|ξ|}{t|ξ|} in S', i.e. pointwise on S. Therefore,
⟨F^{−1} \frac{\sin t|ξ|}{t|ξ|}, φ⟩ = ⟨\frac{\sin t|ξ|}{t|ξ|}, F^{−1}φ⟩ = lim_{M→∞} ⟨f_M, F^{−1}φ⟩ = lim_{M→∞} ⟨F^{−1}f_M, φ⟩
and
(2π)^{3/2} F^{−1}f_M(x) = ∫_{R^3} \frac{\sin t|ξ|}{t|ξ|} 1_{|ξ|≤M} e^{iξ·x} dξ
= ∫_{r=0}^M ∫_{θ=0}^{2π} ∫_{φ=0}^π \frac{\sin tr}{tr} e^{ir|x| cos φ} r^2 sin φ dφ dθ dr
= ∫_{r=0}^M ∫_{θ=0}^{2π} ∫_{u=−1}^1 \frac{\sin tr}{tr} e^{ir|x|u} r^2 du dθ dr = 2π ∫_{r=0}^M \frac{\sin tr}{t} \frac{e^{ir|x|} − e^{−ir|x|}}{ir|x|} r dr
= \frac{4π}{t|x|} ∫_{r=0}^M \sin tr \sin r|x| dr
= \frac{4π}{t|x|} ∫_{r=0}^M \frac{1}{2} (\cos(r(t − |x|)) − \cos(r(t + |x|))) dr
= \frac{2π}{t|x|} (\frac{\sin(M(t − |x|))}{t − |x|} − \frac{\sin(M(t + |x|))}{t + |x|}).
Now make use of the fact that \frac{\sin Mx}{\pi x} → δ(x) in one dimension to finish the proof.
30.4.1. Wave Equation. Given a distribution T and a test function φ, we wish to define T ∗ φ ∈ C^∞ by the formula
T ∗ φ(x) = "∫ T(y)φ(x − y) dy" = ⟨T, φ(x − ·)⟩.
As motivation for wanting to understand convolutions of distributions, let us reconsider the wave equation in R^n,
0 = (∂_t^2 − Δ) u(t, x) with u(0, x) = f(x) and u_t(0, x) = g(x).
Taking the Fourier transform in the x variables gives the following equation,
0 = \hat u_{tt}(t, ξ) + |ξ|^2 \hat u(t, ξ) with \hat u(0, ξ) = \hat f(ξ) and \hat u_t(0, ξ) = \hat g(ξ).
The solution to these equations is
\hat u(t, ξ) = \hat f(ξ) \cos(t|ξ|) + \hat g(ξ) \frac{\sin t|ξ|}{|ξ|}
and hence we should have
u(t, x) = F^{−1}[\hat f(ξ) \cos(t|ξ|) + \hat g(ξ) \frac{\sin t|ξ|}{|ξ|}](x)
= F^{−1}\cos(t|ξ|) ∗ f(x) + F^{−1}\frac{\sin t|ξ|}{|ξ|} ∗ g(x)
= \frac{d}{dt} F^{−1}\frac{\sin t|ξ|}{|ξ|} ∗ f(x) + F^{−1}\frac{\sin t|ξ|}{|ξ|} ∗ g(x).
The question now is how to interpret this equation. In particular, what are the inverse Fourier transforms F^{−1}\cos(t|ξ|) and F^{−1}\frac{\sin t|ξ|}{|ξ|}? Since \frac{d}{dt} F^{−1}\frac{\sin t|ξ|}{|ξ|} ∗ f(x) = F^{−1}\cos(t|ξ|) ∗ f(x), it really suffices to understand F^{−1}\frac{\sin t|ξ|}{|ξ|}. This was worked out in Example 30.51 when n = 1, where we found
(F^{−1} \frac{\sin tξ}{ξ})(x) = \frac{\sqrt{2π}}{2} (1_{x+t>0} − 1_{x−t>0}) = \frac{\sqrt{2π}}{2} (1_{x>−t} − 1_{x>t}) = \frac{\sqrt{2π}}{2} 1_{[−t,t]}(x),
where in writing the last line we have assumed that t ≥ 0. Therefore,
F^{−1}(\frac{\sin tξ}{ξ}) ∗ f(x) = \frac{1}{2} ∫_{−t}^t f(x − y) dy.
Therefore the solution to the one dimensional wave equation is
u(t, x) = \frac{d}{dt} \frac{1}{2} ∫_{−t}^t f(x − y) dy + \frac{1}{2} ∫_{−t}^t g(x − y) dy
= \frac{1}{2} (f(x − t) + f(x + t)) + \frac{1}{2} ∫_{−t}^t g(x − y) dy
= \frac{1}{2} (f(x − t) + f(x + t)) + \frac{1}{2} ∫_{x−t}^{x+t} g(y) dy.
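As a quick check of this formula (d'Alembert's formula), at t = 0 the integral term vanishes and u(0, x) = f(x), while
u_t(t, x) = \frac{1}{2}(f'(x + t) − f'(x − t)) + \frac{1}{2}(g(x + t) + g(x − t)),
so that u_t(0, x) = g(x), recovering the prescribed initial data.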
We can arrive at this same solution by more elementary means as follows. We first note in the one dimensional case that the wave operator factors, namely
0 = (∂_t^2 − ∂_x^2) u(t, x) = (∂_t − ∂_x)(∂_t + ∂_x) u(t, x).
Let U(t, x) := (∂_t + ∂_x) u(t, x); then the wave equation states (∂_t − ∂_x) U = 0 and hence by the chain rule \frac{d}{dt} U(t, x − t) = 0. So
U(t, x − t) = U(0, x) = g(x) + f'(x),
and replacing x by x + t in this equation shows
(∂_t + ∂_x) u(t, x) = U(t, x) = g(x + t) + f'(x + t).
Working similarly, we learn that
\frac{d}{dt} u(t, x + t) = g(x + 2t) + f'(x + 2t),
which upon integration implies
u(t, x + t) = u(0, x) + ∫_0^t [g(x + 2τ) + f'(x + 2τ)] dτ = f(x) + ∫_0^t g(x + 2τ) dτ + \frac{1}{2} f(x + 2τ)|_0^t
= \frac{1}{2} (f(x) + f(x + 2t)) + ∫_0^t g(x + 2τ) dτ.
Replacing x → x − t in this equation then implies
u(t, x) = \frac{1}{2} (f(x − t) + f(x + t)) + ∫_0^t g(x − t + 2τ) dτ.
Finally, letting y = x − t + 2τ in the last integral gives
u(t, x) = \frac{1}{2} (f(x − t) + f(x + t)) + \frac{1}{2} ∫_{x−t}^{x+t} g(y) dy,
as derived using the Fourier transform.
For the three dimensional case we have
u(t, x) = \frac{d}{dt} [F^{−1}\frac{\sin t|ξ|}{|ξ|} ∗ f(x)] + F^{−1}\frac{\sin t|ξ|}{|ξ|} ∗ g(x) = \frac{d}{dt} (t \barσ_t ∗ f(x)) + t \barσ_t ∗ g(x).
The question is what is μ ∗ g(x) where μ is a measure. To understand the definition, suppose first that dμ(x) = ρ(x) dx; then we should have
μ ∗ g(x) = ρ ∗ g(x) = ∫_{R^n} g(x − y) ρ(y) dy = ∫_{R^n} g(x − y) dμ(y).
Thus we expect our solution to the wave equation to be given by
(30.11) u(t, x) = \frac{d}{dt} [t ∫_{S_t} f(x − y) d\barσ_t(y)] + t ∫_{S_t} g(x − y) d\barσ_t(y)
= \frac{d}{dt} [t ∫_{S_1} f(x − tω) d\barσ(ω)] + t ∫_{S_1} g(x − tω) d\barσ(ω)
= \frac{d}{dt} [t ∫_{S_1} f(x + tω) d\barσ(ω)] + t ∫_{S_1} g(x + tω) d\barσ(ω),
where d\barσ := d\barσ_1(ω). Notice the sharp propagation of speed. To understand this, suppose that f = 0 for simplicity and g has compact support near the origin; for example think of g = δ_0(x). Then x + tω = 0 for some ω ∈ S_1 iff |x| = t. Hence the wave front propagates at unit speed in a sharp way. See the figure below.
We may also use this solution to solve the two dimensional wave equation using Hadamard's method of descent. Indeed, suppose now that f and g are functions on R^2 which we may view as functions on R^3 which do not depend on the third coordinate, say. We now go ahead and solve the three dimensional wave equation using Eq. (30.11) with these f and g as initial conditions. It is easily seen that the solution u(t, x, y, z) is again independent of z and hence is a solution to the two dimensional wave equation. See the figure below.
Figure 51. The geometry of the solution to the wave equation in three dimensions.
Figure 52. The geometry of the solution to the wave equation in two dimensions.
Notice that we still have finite speed of propagation but no longer sharp propagation. In fact we can work out the solution analytically as follows. Again for simplicity assume that f ≡ 0. Then
u(t, x, y) = \frac{t}{4π} ∫_0^{2π} dθ ∫_0^π dφ sin φ · g((x, y) + t(sin φ cos θ, sin φ sin θ))
= \frac{t}{2π} ∫_0^{2π} dθ ∫_0^{π/2} dφ sin φ · g((x, y) + t(sin φ cos θ, sin φ sin θ)),
and letting u = sin φ, so that du = cos φ dφ = \sqrt{1 − u^2} dφ, we find
u(t, x, y) = \frac{t}{2π} ∫_0^{2π} dθ ∫_0^1 \frac{du}{\sqrt{1 − u^2}} u g((x, y) + ut(cos θ, sin θ)),
and then letting r = ut we learn,
u(t, x, y) = \frac{1}{2π} ∫_0^{2π} dθ ∫_0^t \frac{dr}{\sqrt{1 − r^2/t^2}} \frac{r}{t} g((x, y) + r(cos θ, sin θ))
= \frac{1}{2π} ∫_0^{2π} dθ ∫_0^t \frac{dr}{\sqrt{t^2 − r^2}} r g((x, y) + r(cos θ, sin θ))
= \frac{1}{2π} ∬_{D_t} \frac{g((x, y) + w)}{\sqrt{t^2 − |w|^2}} dm(w).
Here is a better alternative derivation of this result. We begin by using symmetry to find
u(t, x) = 2t ∫_{S_t^+} g(x − y) d\barσ_t(y) = 2t ∫_{S_t^+} g(x + y) d\barσ_t(y),
where S_t^+ is the portion of S_t with z ≥ 0. This hemisphere is parametrized by R(u, v) = (u, v, \sqrt{t^2 − u^2 − v^2}) with (u, v) ∈ D_t := {(u, v) : u^2 + v^2 ≤ t^2}. In these coordinates we have
4πt^2 d\barσ_t = |(−∂_u \sqrt{t^2 − u^2 − v^2}, −∂_v \sqrt{t^2 − u^2 − v^2}, 1)| du dv = |(\frac{u}{\sqrt{t^2 − u^2 − v^2}}, \frac{v}{\sqrt{t^2 − u^2 − v^2}}, 1)| du dv
= \sqrt{\frac{u^2 + v^2}{t^2 − u^2 − v^2} + 1} du dv = \frac{|t|}{\sqrt{t^2 − u^2 − v^2}} du dv,
and therefore,
u(t, x) = \frac{2t}{4πt^2} ∫_{D_t} g(x + (u, v, \sqrt{t^2 − u^2 − v^2})) \frac{|t|}{\sqrt{t^2 − u^2 − v^2}} du dv = \frac{1}{2π} sgn(t) ∫_{D_t} \frac{g(x + (u, v))}{\sqrt{t^2 − u^2 − v^2}} du dv.
This may be written as
u(t, x) = \frac{1}{2π} sgn(t) ∬_{D_t} \frac{g((x, y) + w)}{\sqrt{t^2 − |w|^2}} dm(w)
as before. (I should check on the sgn(t) term.)
30.5. Appendix: Topology on C_c^∞(U). Let U be an open subset of R^n and
(30.12) C_c^∞(U) = ∪_{K⊂⊂U} C^∞(K)
denote the set of smooth functions on U with compact support in U. Our goal is to topologize C_c^∞(U) in a way which is compatible with the topologies defined in Example 30.24 above. This leads us to the inductive limit topology, which we now pause to introduce.
Definition 30.53 (Inductive Limit Topology). Let X be a set, X_α ⊂ X for α ∈ A (A is an index set) and assume that τ_α ⊂ P(X_α) is a topology on X_α for each α. Let i_α : X_α → X denote the inclusion maps. The inductive limit topology on X is the largest topology τ on X such that i_α is continuous for all α ∈ A. That is to say, a set W ⊂ X is open (W ∈ τ) iff i_α^{−1}(W) = W ∩ X_α ∈ τ_α for all α ∈ A.
Notice that C ⊂ X is closed iff C ∩ X_α is closed in X_α for all α. Indeed, C ⊂ X is closed iff C^c = X \ C is open, iff C^c ∩ X_α = X_α \ C is open in X_α, iff X_α ∩ C = X_α \ (X_α \ C) is closed in X_α for all α ∈ A.
Definition 30.54. Let D(U) denote C_c^∞(U) equipped with the inductive limit topology arising from writing C_c^∞(U) as in Eq. (30.12) and using the Fréchet topologies on C^∞(K) as defined in Example 30.24.
For each K ⊂⊂ U, C^∞(K) is a closed subset of D(U). Indeed, if F is another compact subset of U, then C^∞(K) ∩ C^∞(F) = C^∞(K ∩ F), which is a closed subset of C^∞(F). The set 𝒰 ⊂ D(U) defined by
(30.13) 𝒰 = {ψ ∈ D(U) : Σ_{|α|≤m} ‖∂^α(ψ − φ)‖_∞ < ε}
for some φ ∈ D(U) and ε > 0 is an open subset of D(U). Indeed, if K ⊂⊂ U, then
𝒰 ∩ C^∞(K) = {ψ ∈ C^∞(K) : Σ_{|α|≤m} ‖∂^α(ψ − φ)‖_∞ < ε}
is easily seen to be open in C^∞(K).
Proposition 30.55. Let (X, τ) be as described in Definition 30.53 and let f : X → Y be a function where Y is another topological space. Then f is continuous iff f ∘ i_α : X_α → Y is continuous for all α ∈ A.
Proof. Since the composition of continuous maps is continuous, it follows that f ∘ i_α : X_α → Y is continuous for all α ∈ A if f : X → Y is continuous. Conversely, if f ∘ i_α is continuous for all α ∈ A, then for all V ⊂_o Y we have
τ_α ∋ (f ∘ i_α)^{−1}(V) = i_α^{−1}(f^{−1}(V)) = f^{−1}(V) ∩ X_α for all α ∈ A,
showing that f^{−1}(V) ∈ τ.
Lemma 30.56. Let us continue the notation introduced in Definition 30.53. Suppose further that there exist α_k ∈ A such that X'_k := X_{α_k} ↑ X as k → ∞, and for each α ∈ A there exists a k ∈ N such that X_α ⊂ X'_k and the inclusion map is continuous. Then τ = {A ⊂ X : A ∩ X'_k ⊂_o X'_k for all k}, and a function f : X → Y is continuous iff f|_{X'_k} : X'_k → Y is continuous for all k. In short, the inductive limit topologies on X arising from the two collections of subsets {X_α}_{α∈A} and {X'_k}_{k∈N} are the same.
Proof. Suppose that A ⊂ X; if A ∈ τ then A ∩ X'_k = A ∩ X_{α_k} ⊂_o X'_k by definition. Now suppose that A ∩ X'_k ⊂_o X'_k for all k. For α ∈ A choose k such that X_α ⊂ X'_k; then A ∩ X_α = (A ∩ X'_k) ∩ X_α ⊂_o X_α, since A ∩ X'_k is open in X'_k and, by the assumption that X_α is continuously embedded in X'_k, V ∩ X_α ⊂_o X_α for all V ⊂_o X'_k. The characterization of continuous functions is proved similarly.
Let K_k ⊂⊂ U for k ∈ N be such that K_k ⊂ K°_{k+1} ⊂ K_{k+1} for all k and K_k ↑ U as k → ∞. Then it follows that for any K ⊂⊂ U there exists a k such that K ⊂ K°_k ⊂ K_k. One now checks that C^∞(K) embeds continuously into C^∞(K_k) and, moreover, that C^∞(K) is a closed subset of C^∞(K_{k+1}). Therefore we may describe D(U) as C_c^∞(U) with the inductive limit topology coming from {C^∞(K_k)}_{k∈N}.
Lemma 30.57. Suppose that {φ_k}_{k=1}^∞ ⊂ D(U); then φ_k → φ ∈ D(U) iff φ_k − φ → 0 ∈ D(U).
Proof. Let φ ∈ D(U) and 𝒰 ⊂ D(U) be a set. We will begin by showing that 𝒰 is open in D(U) iff 𝒰 − φ is open in D(U). To this end, let K_k be the compact sets described above and choose k_0 sufficiently large so that φ ∈ C^∞(K_k) for all k ≥ k_0. Now 𝒰 − φ ⊂ D(U) is open iff (𝒰 − φ) ∩ C^∞(K_k) is open in C^∞(K_k) for all k ≥ k_0. Because φ ∈ C^∞(K_k), we have (𝒰 − φ) ∩ C^∞(K_k) = 𝒰 ∩ C^∞(K_k) − φ, which is open in C^∞(K_k) iff 𝒰 ∩ C^∞(K_k) is open in C^∞(K_k). Since this is true for all k ≥ k_0, we conclude that 𝒰 − φ is an open subset of D(U) iff 𝒰 is open in D(U).
Now φ_k → φ in D(U) iff for all φ ∈ 𝒰 ⊂_o D(U), φ_k ∈ 𝒰 for almost all k, which happens iff φ_k − φ ∈ 𝒰 − φ for almost all k. Since 𝒰 − φ ranges over all open neighborhoods of 0 when 𝒰 ranges over the open neighborhoods of φ, the result follows.
Lemma 30.58. A sequence {φ_k}_{k=1}^∞ ⊂ D(U) converges to φ ∈ D(U) iff there is a compact set K ⊂⊂ U such that supp(φ_k) ⊂ K for all k and φ_k → φ in C^∞(K).
Proof. If φ_k → φ in C^∞(K), then for any open set V ⊂ D(U) with φ ∈ V we have V ∩ C^∞(K) open in C^∞(K), and hence φ_k ∈ V ∩ C^∞(K) ⊂ V for almost all k. This shows that φ_k → φ ∈ D(U).
For the converse, suppose that there exists {φ_k}_{k=1}^∞ ⊂ D(U) which converges to φ ∈ D(U), yet there is no compact set K such that supp(φ_k) ⊂ K for all k. Using Lemma 30.57, we may replace φ_k by φ_k − φ if necessary, so that we may assume φ_k → 0 in D(U). By passing to subsequences of {φ_k} and {K_k} if necessary, we may also assume there are x_k ∈ K_{k+1} \ K_k such that φ_k(x_k) ≠ 0 for all k. Let p denote the semi-norm on C_c^∞(U) defined by
p(φ) = Σ_{k=0}^∞ sup{\frac{|φ(x)|}{|φ_k(x_k)|} : x ∈ K_{k+1} \ K°_k}.
One then checks that
p(φ) ≤ (Σ_{k=0}^N \frac{1}{|φ_k(x_k)|}) ‖φ‖_∞
for φ ∈ C^∞(K_{N+1}). This shows that p|_{C^∞(K_{N+1})} is continuous for all N, and hence p is continuous on D(U). Since p is continuous on D(U) and φ_k → 0 in D(U), it follows that lim_{k→∞} p(φ_k) = p(lim_{k→∞} φ_k) = p(0) = 0. While on the other hand, p(φ_k) ≥ 1 by construction, and hence we have arrived at a contradiction. Thus for any convergent sequence {φ_k}_{k=1}^∞ ⊂ D(U) there is a compact set K ⊂⊂ U such that supp(φ_k) ⊂ K for all k.
We will now show that {φ_k}_{k=1}^∞ is convergent to φ in C^∞(K). To this end, let 𝒰 ⊂ D(U) be the open set described in Eq. (30.13); then φ_k ∈ 𝒰 for almost all k and in particular φ_k ∈ 𝒰 ∩ C^∞(K) for almost all k. (Letting ε > 0 tend to zero shows that supp(φ) ⊂ K, i.e. φ ∈ C^∞(K).) Since sets of the form 𝒰 ∩ C^∞(K) with 𝒰 as in Eq. (30.13) form a neighborhood base for the C^∞(K) topology at φ, we conclude that φ_k → φ in C^∞(K).
Definition 30.59 (Distributions on U ⊂_o R^n). A generalized function on U ⊂_o R^n is a continuous linear functional on D(U). We denote the space of generalized functions by D'(U).
Proposition 30.60. Let f : D(U) → C be a linear functional. Then the following are equivalent.
(1) f is continuous, i.e. f ∈ D'(U).
(2) For all K ⊂⊂ U there exist n ∈ N and C < ∞ such that
(30.14) |f(φ)| ≤ C p_n(φ) for all φ ∈ C^∞(K).
(3) For all sequences {φ_k} ⊂ D(U) such that φ_k → 0 in D(U), lim_{k→∞} f(φ_k) = 0.
Proof. 1) ⟺ 2). If f is continuous, then by definition of the inductive limit topology f|_{C^∞(K)} is continuous. Hence an estimate of the type in Eq. (30.14) must hold. Conversely, if estimates of the type in Eq. (30.14) hold for all compact sets K, then f|_{C^∞(K)} is continuous for all K ⊂⊂ U, and again by the definition of the inductive limit topology f is continuous on D(U).
1) ⟺ 3). By Lemma 30.58, the assertion in item 3 is equivalent to saying that f|_{C^∞(K)} is sequentially continuous for all K ⊂⊂ U. Since the topology on C^∞(K) is first countable (being a metric topology), sequential continuity and continuity are the same thing. Hence item 3 is equivalent to the assertion that f|_{C^∞(K)} is continuous for all K ⊂⊂ U, which is equivalent to the assertion that f is continuous on D(U).
Proposition 30.61. The maps (λ, φ) ∈ C × D(U) → λφ ∈ D(U) and (φ, ψ) ∈ D(U) × D(U) → φ + ψ ∈ D(U) are continuous. (Actually, I will have to look up how to prove this.) What is obvious is that all of these operations are sequentially continuous, which is enough for our purposes.
31. Convolutions involving distributions
31.1. Tensor Product of Distributions. Let X ⊂_o R^n and Y ⊂_o R^m, and let S ∈ D'(X) and T ∈ D'(Y). We wish to define S ⊗ T ∈ D'(X × Y). Informally, we should have
⟨S ⊗ T, φ⟩ = ∫_{X×Y} S(x)T(y)φ(x, y) dx dy = ∫_X dx S(x) ∫_Y dy T(y)φ(x, y) = ∫_Y dy T(y) ∫_X dx S(x)φ(x, y).
Of course we should interpret this last equation as follows,
(31.1) ⟨S ⊗ T, φ⟩ = ⟨S(x), ⟨T(y), φ(x, y)⟩⟩ = ⟨T(y), ⟨S(x), φ(x, y)⟩⟩.
This formula takes on a particularly simple form when φ = u ⊗ v with u ∈ D(X) and v ∈ D(Y), in which case
(31.2) ⟨S ⊗ T, u ⊗ v⟩ = ⟨S, u⟩⟨T, v⟩.
We begin with the following smooth version of the Weierstrass approximation theorem, which will be used to show that Eq. (31.2) uniquely determines S ⊗ T.
Theorem 31.1 (Density Theorem). Suppose that X ⊂_o R^n and Y ⊂_o R^m; then D(X) ⊗ D(Y) is dense in D(X × Y).
Proof. First let us consider the special case where X = (0, 1)^n and Y = (0, 1)^m, so that X × Y = (0, 1)^{m+n}. To simplify notation, let m + n = k, Ω = (0, 1)^k, and let π_i : Ω → (0, 1) be projection onto the i-th factor of Ω. Suppose that φ ∈ C_c^∞(Ω) and K = supp(φ). We will view φ ∈ C_c^∞(R^k) by setting φ = 0 outside of Ω. Since K is compact, π_i(K) ⊂ [a_i, b_i] for some 0 < a_i < b_i < 1. Let a = min{a_i : i = 1, ..., k} and b = max{b_i : i = 1, ..., k}. Then supp(φ) = K ⊂ [a, b]^k ⊂ Ω.
As in the proof of the Weierstrass approximation theorem, let q_n(t) = c_n(1 − t^2)^n 1_{|t|≤1}, where c_n is chosen so that ∫_R q_n(t) dt = 1. Also set Q_n = q_n ⊗ ··· ⊗ q_n, i.e. Q_n(x) = Π_{i=1}^k q_n(x_i) for x ∈ R^k. Let
(31.3) f_n(x) := Q_n ∗ φ(x) = c_n^k ∫_{R^k} φ(y) Π_{i=1}^k (1 − (x_i − y_i)^2)^n 1_{|x_i − y_i|≤1} dy_i.
By standard arguments, we know that ∂^α f_n → ∂^α φ uniformly on R^k as n → ∞. Moreover, for x ∈ Ω it follows from Eq. (31.3) that
f_n(x) = c_n^k ∫_Ω φ(y) Π_{i=1}^k (1 − (x_i − y_i)^2)^n dy_i = p_n(x),
where p_n(x) is a polynomial in x. Notice that p_n ∈ C^∞((0,1)) ⊗ ··· ⊗ C^∞((0,1)), so that we are almost there.⁵³ We need only cut off these functions so that they have compact support. To this end, let θ ∈ C_c^∞((0, 1)) be a function such that θ = 1 on a neighborhood of [a, b] and define
φ_n = (θ ⊗ ··· ⊗ θ) f_n = (θ ⊗ ··· ⊗ θ) p_n ∈ C_c^∞((0, 1)) ⊗ ··· ⊗ C_c^∞((0, 1)).
I claim now that φ_n → φ in D(Ω). Certainly by construction supp(φ_n) ⊂ supp(θ)^k ⊂⊂ Ω for all n. Also
(31.4) ∂^α(φ_n − φ) = ∂^α((θ ⊗ ··· ⊗ θ) f_n − φ) = (θ ⊗ ··· ⊗ θ)(∂^α f_n − ∂^α φ) + R_n,
where R_n is a sum of terms of the form ∂^β(θ ⊗ ··· ⊗ θ) · ∂^{α−β} f_n with β ≠ 0. Since ∂^β(θ ⊗ ··· ⊗ θ) = 0 on [a, b]^k and ∂^{α−β} f_n converges uniformly to zero on R^k \ [a, b]^k, it follows that R_n → 0 uniformly as n → ∞. Combining this with Eq. (31.4) and the fact that ∂^α f_n → ∂^α φ uniformly on R^k as n → ∞, we see that φ_n → φ in D(Ω). This finishes the proof in the case X = (0, 1)^n and Y = (0, 1)^m.
53 One could also construct f_n ∈ C^∞(R)^{⊗k} such that ∂^α f_n → ∂^α φ uniformly as n → ∞ using Fourier series. To this end, let φ̃ be the 1-periodic extension of φ to R^k. Then φ̃ ∈ C^∞_{periodic}(R^k) and hence it may be written as
φ̃(x) = Σ_{m∈Z^k} c_m e^{i2πm·x},
where the {c_m : m ∈ Z^k} are the Fourier coefficients of φ̃, which decay faster than (1 + |m|)^{−l} for any l > 0. Thus f_n(x) := Σ_{m∈Z^k : |m|≤n} c_m e^{i2πm·x} ∈ C^∞(R)^{⊗k} and ∂^α f_n → ∂^α φ uniformly on Ω as n → ∞.
For the general case, let K = supp(φ) ⊂⊂ X × Y and K_1 = π_1(K) ⊂⊂ X and K_2 = π_2(K) ⊂⊂ Y, where π_1 and π_2 are the projections from X × Y to X and Y respectively. Then K ⊂ K_1 × K_2 ⊂⊂ X × Y. Let {V_i}_{i=1}^a and {U_j}_{j=1}^b be finite covers of K_1 and K_2 respectively by open rectangles V_i = (a_i, b_i) and U_j = (c_j, d_j) with a_i, b_i ∈ X and c_j, d_j ∈ Y. Also let η_i ∈ C_c^∞(V_i) for i = 1, ..., a and γ_j ∈ C_c^∞(U_j) for j = 1, ..., b be functions such that Σ_{i=1}^a η_i = 1 on a neighborhood of K_1 and Σ_{j=1}^b γ_j = 1 on a neighborhood of K_2. Then φ = Σ_{i=1}^a Σ_{j=1}^b (η_i ⊗ γ_j) φ and, by what we have just proved (after scaling and translating), each term in this sum, (η_i ⊗ γ_j) φ, may be written as a limit of elements of D(X) ⊗ D(Y) in the D(X × Y) topology.
Theorem 31.2 (Distribution Fubini Theorem). Let S ∈ D'(X), T ∈ D'(Y), h(x) := ⟨T(y), φ(x, y)⟩ and g(y) := ⟨S(x), φ(x, y)⟩. Then h ∈ D(X), g ∈ D(Y), ∂^α h(x) = ⟨T(y), ∂_x^α φ(x, y)⟩ and ∂^β g(y) = ⟨S(x), ∂_y^β φ(x, y)⟩ for all multi-indices α and β. Moreover,
(31.5) ⟨S(x), ⟨T(y), φ(x, y)⟩⟩ = ⟨S, h⟩ = ⟨T, g⟩ = ⟨T(y), ⟨S(x), φ(x, y)⟩⟩.
We denote this common value by ⟨S ⊗ T, φ⟩ and call S ⊗ T the tensor product of S and T. This distribution is uniquely determined by its values on D(X) ⊗ D(Y), and for u ∈ D(X) and v ∈ D(Y) we have
⟨S ⊗ T, u ⊗ v⟩ = ⟨S, u⟩⟨T, v⟩.
Proof. Let K = supp(φ) ⊂⊂ X × Y, K_1 = π_1(K) and K_2 = π_2(K). Then K_1 ⊂⊂ X, K_2 ⊂⊂ Y and K ⊂ K_1 × K_2 ⊂ X × Y. If x ∈ X and y ∉ K_2, then φ(x, y) = 0 and more generally ∂_x^α φ(x, y) = 0, so that {y : ∂_x^α φ(x, y) ≠ 0} ⊂ K_2. Thus for all x ∈ X, supp(∂^α φ(x, ·)) ⊂ K_2 ⊂ Y. By the fundamental theorem of calculus,
(31.6) ∂_y^β φ(x + v, y) − ∂_y^β φ(x, y) = ∫_0^1 (v · ∇_x) ∂_y^β φ(x + τv, y) dτ,
and therefore
‖∂_y^β φ(x + v, ·) − ∂_y^β φ(x, ·)‖_∞ ≤ |v| ∫_0^1 ‖∇_x ∂_y^β φ(x + τv, ·)‖_∞ dτ ≤ C|v| → 0 as v → 0.
This shows that x ∈ X → φ(x, ·) ∈ D(Y) is continuous. Thus h is continuous, being the composition of continuous functions. Letting v = te_i in Eq. (31.6) we find
\frac{∂_y^β φ(x + te_i, y) − ∂_y^β φ(x, y)}{t} − ∂_{x_i} ∂_y^β φ(x, y) = ∫_0^1 [∂_{x_i} ∂_y^β φ(x + τte_i, y) − ∂_{x_i} ∂_y^β φ(x, y)] dτ,
and hence
‖\frac{∂_y^β φ(x + te_i, ·) − ∂_y^β φ(x, ·)}{t} − ∂_{x_i} ∂_y^β φ(x, ·)‖_∞ ≤ ∫_0^1 ‖∂_{x_i} ∂_y^β φ(x + τte_i, ·) − ∂_{x_i} ∂_y^β φ(x, ·)‖_∞ dτ,
which tends to zero as t → 0. Thus we have checked that
∂_{x_i} φ(x, ·) = D(Y)–lim_{t→0} \frac{φ(x + te_i, ·) − φ(x, ·)}{t},
and therefore,
\frac{h(x + te_i) − h(x)}{t} = ⟨T, \frac{φ(x + te_i, ·) − φ(x, ·)}{t}⟩ → ⟨T, ∂_{x_i} φ(x, ·)⟩
as t → 0, showing that ∂_i h(x) exists and is given by ⟨T, ∂_{x_i} φ(x, ·)⟩. By what we have proved above, it follows that ∂_i h(x) = ⟨T, ∂_{x_i} φ(x, ·)⟩ is continuous in x. By induction on |α|, it follows that ∂^α h(x) exists and is continuous, and ∂^α h(x) = ⟨T(y), ∂_x^α φ(x, y)⟩ for all α. Now if x ∉ K_1, then φ(x, ·) ≡ 0, showing that {x ∈ X : h(x) ≠ 0} ⊂ K_1 and hence supp(h) ⊂ K_1 ⊂⊂ X. Thus h has compact support. This proves all of the assertions made about h. The assertions pertaining to the function g are proved analogously.
Let ⟨Γ, φ⟩ = ⟨S(x), ⟨T(y), φ(x, y)⟩⟩ = ⟨S, h_φ⟩ for φ ∈ D(X × Y). Then Γ is clearly linear and we have
|⟨Γ, φ⟩| = |⟨S, h_φ⟩| ≤ C Σ_{|α|≤m} ‖∂_x^α h_φ‖_{∞,K_1} = C Σ_{|α|≤m} ‖⟨T(y), ∂_x^α φ(·, y)⟩‖_{∞,K_1},
which combined with the estimate
|⟨T(y), ∂_x^α φ(x, y)⟩| ≤ C Σ_{|β|≤p} ‖∂_y^β ∂_x^α φ(x, ·)‖_{∞,K_2}
shows
|⟨Γ, φ⟩| ≤ C Σ_{|α|≤m} Σ_{|β|≤p} ‖∂_y^β ∂_x^α φ‖_{∞,K_1×K_2}.
So Γ is continuous, i.e. Γ ∈ D'(X × Y), i.e.
φ ∈ D(X × Y) → ⟨S(x), ⟨T(y), φ(x, y)⟩⟩
defines a distribution. Similarly,
φ ∈ D(X × Y) → ⟨T(y), ⟨S(x), φ(x, y)⟩⟩
also defines a distribution, and since both of these distributions agree on the dense subspace D(X) ⊗ D(Y), it follows that they are equal.
Theorem 31.3. If (T, φ) is a distribution test function pair satisfying one of the following three conditions
(1) T ∈ E'(R^n) and φ ∈ C^∞(R^n),
(2) T ∈ D'(R^n) and φ ∈ D(R^n), or
(3) T ∈ S'(R^n) and φ ∈ S(R^n),
let
(31.7) T ∗ φ(x) = "∫ T(y) φ(x − y) dy" = ⟨T, φ(x − ·)⟩.
Then T ∗ φ ∈ C^∞(R^n), ∂^α(T ∗ φ) = (∂^α T) ∗ φ = T ∗ (∂^α φ) for all α, and supp(T ∗ φ) ⊂ supp(T) + supp(φ). Moreover, if (3) holds then T ∗ φ ∈ P, the space of smooth functions with slow decrease.
Proof. I will supply the proof for case (3) since the other cases are similar and easier. Let h(x) := T ∗ φ(x). Since T ∈ S'(R^n), there exist m ∈ N and C < ∞ such that |⟨T, φ⟩| ≤ C p_m(φ) for all φ ∈ S, where p_m is defined in Example 30.28. Therefore,
|h(x) − h(y)| = |⟨T, φ(x − ·) − φ(y − ·)⟩| ≤ C p_m(φ(x − ·) − φ(y − ·)) = C Σ_{|α|≤m} ‖ν_m (∂^α φ(x − ·) − ∂^α φ(y − ·))‖_∞.
Let ψ := ∂^α φ; then
(31.8) ψ(x − z) − ψ(y − z) = ∫_0^1 ∇ψ(y + τ(x − y) − z) · (x − y) dτ,
and hence
|ψ(x − z) − ψ(y − z)| ≤ |x − y| ∫_0^1 |∇ψ(y + τ(x − y) − z)| dτ ≤ C |x − y| ∫_0^1 ν_{−M}(y + τ(x − y) − z) dτ
for any M < ∞. By Peetre's inequality,
ν_{−M}(y + τ(x − y) − z) ≤ ν_{−M}(z) ν_M(y + τ(x − y)),
so that
(31.9) |∂^α φ(x − z) − ∂^α φ(y − z)| ≤ C |x − y| ν_{−M}(z) ∫_0^1 ν_M(y + τ(x − y)) dτ ≤ C(x, y) |x − y| ν_{−M}(z),
where C(x, y) is a continuous function of (x, y). Putting all of this together (and taking M ≥ m) we see that
|h(x) − h(y)| ≤ \tilde C(x, y) |x − y| → 0 as x → y,
showing h is continuous. Let us now compute a partial derivative of h. Suppose that v ∈ R^n is a fixed vector; then by Eq. (31.8),
\frac{φ(x + tv − z) − φ(x − z)}{t} − ∂_v φ(x − z) = ∫_0^1 ∂_v φ(x + τtv − z) dτ − ∂_v φ(x − z) = ∫_0^1 [∂_v φ(x + τtv − z) − ∂_v φ(x − z)] dτ.
This then implies
|∂_z^α {\frac{φ(x + tv − z) − φ(x − z)}{t} − ∂_v φ(x − z)}| ≤ ∫_0^1 |∂_z^α [∂_v φ(x + τtv − z) − ∂_v φ(x − z)]| dτ.
But by the same argument as above, it follows that
|∂_z^α [∂_v φ(x + τtv − z) − ∂_v φ(x − z)]| ≤ C(x + τtv, x) |τtv| ν_{−M}(z),
and thus
|∂_z^α {\frac{φ(x + tv − z) − φ(x − z)}{t} − ∂_v φ(x − z)}| ≤ t ν_{−M}(z) |v| ∫_0^1 C(x + τtv, x) τ dτ.
Putting this all together shows
‖ν_M ∂_z^α {\frac{φ(x + tv − ·) − φ(x − ·)}{t} − ∂_v φ(x − ·)}‖_∞ = O(t) → 0 as t → 0.
That is to say, \frac{φ(x + tv − ·) − φ(x − ·)}{t} → ∂_v φ(x − ·) in S as t → 0. Hence, since T is continuous on S, we learn
∂_v(T ∗ φ)(x) = ∂_v ⟨T, φ(x − ·)⟩ = lim_{t→0} ⟨T, \frac{φ(x + tv − ·) − φ(x − ·)}{t}⟩ = ⟨T, ∂_v φ(x − ·)⟩ = T ∗ ∂_v φ(x).
By the first part of the proof, we know that ∂_v(T ∗ φ) is continuous, and hence by induction it now follows that T ∗ φ is C^∞ and ∂^α(T ∗ φ) = T ∗ ∂^α φ. Since
T ∗ ∂^α φ(x) = ⟨T(z), (∂^α φ)(x − z)⟩ = (−1)^{|α|} ⟨T(z), ∂_z^α [φ(x − z)]⟩ = ⟨∂_z^α T(z), φ(x − z)⟩ = ∂^α T ∗ φ(x),
the proof is complete except for showing T ∗ φ ∈ P.
For the last statement, it suffices to prove |T ∗ φ(x)| ≤ C ν_M(x) for some C < ∞ and M < ∞. This goes as follows:
|h(x)| = |⟨T, φ(x − ·)⟩| ≤ C p_m(φ(x − ·)) = C Σ_{|α|≤m} ‖ν_m ∂^α φ(x − ·)‖_∞,
and using Peetre's inequality, |∂^α φ(x − z)| ≤ C ν_{−m}(x − z) ≤ C ν_{−m}(z) ν_m(x), so that
‖ν_m ∂^α φ(x − ·)‖_∞ ≤ C ν_m(x).
Thus it follows that |T ∗ φ(x)| ≤ C ν_m(x) for some C < ∞.
If x ∈ R^n \ (supp(T) + supp(φ)) and y ∈ supp(φ), then x − y ∉ supp(T), for otherwise x = (x − y) + y ∈ supp(T) + supp(φ). Thus
supp(φ(x − ·)) = x − supp(φ) ⊂ R^n \ supp(T),
and hence h(x) = ⟨T, φ(x − ·)⟩ = 0 for all x ∈ R^n \ (supp(T) + supp(φ)). This implies that {h ≠ 0} ⊂ supp(T) + supp(φ) and hence
supp(h) = \overline{{h ≠ 0}} ⊂ \overline{supp(T) + supp(φ)}.
As we have seen in the previous theorem, T ∗ φ is a smooth function and hence may be used to define a distribution in D'(R^n) by
⟨T ∗ φ, ψ⟩ = ∫ T ∗ φ(x) ψ(x) dx = ∫ ⟨T, φ(x − ·)⟩ ψ(x) dx.
Using the linearity of T we might expect that
∫ ⟨T, φ(x − ·)⟩ ψ(x) dx = ⟨T, ∫ φ(x − ·) ψ(x) dx⟩,
or equivalently that
(31.10) ⟨T ∗ φ, ψ⟩ = ⟨T, φ̃ ∗ ψ⟩,
where φ̃(x) := φ(−x).
Theorem 31.4. Suppose that (T, φ) is a distribution test function pair satisfying one of the three conditions in Theorem 31.3; then T ∗ φ as a distribution may be characterized by
(31.11) ⟨T ∗ φ, ψ⟩ = ⟨T, φ̃ ∗ ψ⟩
for all ψ ∈ D(R^n). Moreover, if T ∈ S' and φ ∈ S, then Eq. (31.11) holds for all ψ ∈ S.
Proof. Let us first assume that T ∈ D' and φ, ψ ∈ D, and let θ ∈ D be a function such that θ = 1 on a neighborhood of the support of ψ. Then
⟨T ∗ φ, ψ⟩ = ∫_{R^n} ⟨T, φ(x − ·)⟩ ψ(x) dx = ⟨ψ(x), ⟨T(y), φ(x − y)⟩⟩
= ⟨θ(x)ψ(x), ⟨T(y), φ(x − y)⟩⟩ = ⟨ψ(x), θ(x)⟨T(y), φ(x − y)⟩⟩ = ⟨ψ(x), ⟨T(y), θ(x)φ(x − y)⟩⟩.
Now the function θ(x)φ(x − y) ∈ D(R^n × R^n), so we may apply Fubini's theorem for distributions to conclude that
⟨T ∗ φ, ψ⟩ = ⟨ψ(x), ⟨T(y), θ(x)φ(x − y)⟩⟩ = ⟨T(y), ⟨ψ(x), θ(x)φ(x − y)⟩⟩ = ⟨T(y), ⟨θ(x)ψ(x), φ(x − y)⟩⟩ = ⟨T(y), ⟨ψ(x), φ(x − y)⟩⟩ = ⟨T(y), φ̃ ∗ ψ(y)⟩ = ⟨T, φ̃ ∗ ψ⟩,
as claimed.
If T ∈ E', let θ ∈ D(R^n) be a function such that θ = 1 on a neighborhood of supp(T); then, working as above,
⟨T ∗ φ, ψ⟩ = ⟨ψ(x), ⟨T(y), φ(x − y)⟩⟩ = ⟨ψ(x), ⟨T(y), θ(y)φ(x − y)⟩⟩,
and since ψ(x)θ(y)φ(x − y) ∈ D(R^n × R^n), we may apply Fubini's theorem for distributions to conclude again that
⟨T ∗ φ, ψ⟩ = ⟨T(y), ⟨ψ(x), θ(y)φ(x − y)⟩⟩ = ⟨θ(y)T(y), ⟨ψ(x), φ(x − y)⟩⟩ = ⟨T(y), ⟨ψ(x), φ(x − y)⟩⟩ = ⟨T, φ̃ ∗ ψ⟩.
Now suppose that T ∈ S' and φ, ψ ∈ S. Let φ_n, ψ_n ∈ D be sequences such that φ_n → φ and ψ_n → ψ in S; then, using arguments similar to those in the proof of Theorem 31.3, one shows
⟨T ∗ φ, ψ⟩ = lim_{n→∞} ⟨T ∗ φ_n, ψ_n⟩ = lim_{n→∞} ⟨T, φ̃_n ∗ ψ_n⟩ = ⟨T, φ̃ ∗ ψ⟩.
Theorem 31.5. Let U ⊂_o R^n; then D(U) is sequentially dense in E'(U). When U = R^n, E'(R^n) is a dense subspace of S'(R^n) ⊂ D'(R^n). Hence we have the following inclusions,
D(U) ⊂ E'(U) ⊂ D'(U),
D(R^n) ⊂ E'(R^n) ⊂ S'(R^n) ⊂ D'(R^n), and
D(R^n) ⊂ S(R^n) ⊂ S'(R^n) ⊂ D'(R^n),
with all inclusions being dense in the next space up.
Proof. The key point is to show D(U) is dense in E'(U). Choose θ ∈ C_c^∞(R^n) such that supp(θ) ⊂ B(0, 1), θ = θ̃ and ∫ θ(x) dx = 1. Let θ_m(x) = m^n θ(mx), so that supp(θ_m) ⊂ B(0, 1/m). An element T ∈ E'(U) may be viewed as an element of E'(R^n) in a natural way. Namely, if χ ∈ C_c^∞(U) with χ = 1 on a neighborhood of supp(T), and φ ∈ C^∞(R^n), let ⟨T, φ⟩ = ⟨T, χφ⟩. Define T_m = T ∗ θ_m. It is easily seen that supp(T_m) ⊂ supp(T) + B(0, 1/m) ⊂ U for all m sufficiently large. Hence T_m ∈ D(U) for large enough m. Moreover, if ψ ∈ D(U), then
⟨T_m, ψ⟩ = ⟨T ∗ θ_m, ψ⟩ = ⟨T, θ_m ∗ ψ⟩ → ⟨T, ψ⟩,
since θ_m ∗ ψ → ψ in D(U) by standard arguments. If U = R^n, T ∈ E'(R^n) ⊂ S'(R^n) and ψ ∈ S, the same argument goes through to show ⟨T_m, ψ⟩ → ⟨T, ψ⟩ provided we show θ_m ∗ ψ → ψ in S(R^n) as m → ∞. This latter is proved by showing, for all α and t > 0, that
‖ν_t ∂^α(θ_m ∗ ψ − ψ)‖_∞ → 0 as m → ∞,
which is a consequence of the estimates:
|∂^α(θ_m ∗ ψ)(x) − ∂^α ψ(x)| = |θ_m ∗ ∂^α ψ(x) − ∂^α ψ(x)| = |∫ θ_m(y) [∂^α ψ(x − y) − ∂^α ψ(x)] dy|
≤ sup_{|y|≤1/m} |∂^α ψ(x − y) − ∂^α ψ(x)| ≤ \frac{1}{m} sup_{|y|≤1/m} |∇∂^α ψ(x − y)|
≤ \frac{1}{m} C sup_{|y|≤1/m} ν_{−t}(x − y) ≤ \frac{1}{m} C ν_{−t}(x) sup_{|y|≤1/m} ν_t(y) ≤ \frac{1}{m} C (1 + m^{−1})^t ν_{−t}(x).
Definition 31.6 (Convolution of Distributions). Suppose that T ∈ D' and S ∈ E'; then define T ∗ S ∈ D' by
⟨T ∗ S, φ⟩ = ⟨T ⊗ S, φ_+⟩,
where φ_+(x, y) = φ(x + y) for all x, y ∈ R^n. More generally, we may define T ∗ S for any two distributions having the property that supp(T ⊗ S) ∩ supp(φ_+) = [supp(T) × supp(S)] ∩ supp(φ_+) is compact for all φ ∈ D.
Proposition 31.7. Suppose that T ∈ D' and S ∈ E'; then T ∗ S is well defined and
(31.12) ⟨T ∗ S, φ⟩ = ⟨T(x), ⟨S(y), φ(x + y)⟩⟩ = ⟨S(y), ⟨T(x), φ(x + y)⟩⟩.
Moreover, if T ∈ S', then T ∗ S ∈ S' and F(T ∗ S) = \hat S \hat T. Recall from Remark 30.46 that \hat S ∈ P, so that \hat S \hat T ∈ S'.
Proof. Let θ ∈ D be a function such that θ = 1 on a neighborhood of supp(S); then by Fubini's theorem for distributions,
⟨T ⊗ S, φ_+⟩ = ⟨T ⊗ S(x, y), θ(y)φ(x + y)⟩ = ⟨T(x)S(y), θ(y)φ(x + y)⟩ = ⟨T(x), ⟨S(y), θ(y)φ(x + y)⟩⟩ = ⟨T(x), ⟨S(y), φ(x + y)⟩⟩
and
⟨T ⊗ S, φ_+⟩ = ⟨T(x)S(y), θ(y)φ(x + y)⟩ = ⟨S(y), ⟨T(x), θ(y)φ(x + y)⟩⟩ = ⟨S(y), θ(y)⟨T(x), φ(x + y)⟩⟩ = ⟨S(y), ⟨T(x), φ(x + y)⟩⟩,
proving Eq. (31.12).
Suppose that T ∈ S'; then
|⟨T ∗ S, φ⟩| = |⟨T(x), ⟨S(y), φ(x + y)⟩⟩| ≤ C Σ_{|α|≤m} ‖ν_m ∂_x^α ⟨S(y), φ(· + y)⟩‖_∞ = C Σ_{|α|≤m} ‖ν_m ⟨S(y), ∂^α φ(· + y)⟩‖_∞
and
|⟨S(y), ∂^α φ(x + y)⟩| ≤ C Σ_{|β|≤p} sup_{y∈K} |∂^{α+β} φ(x + y)| ≤ C p_{m+p}(φ) sup_{y∈K} ν_{−m−p}(x + y) ≤ C p_{m+p}(φ) ν_{−m−p}(x) sup_{y∈K} ν_{m+p}(y) = \tilde C ν_{−m−p}(x) p_{m+p}(φ).
Combining the last two displayed equations shows
|⟨T ∗ S, φ⟩| ≤ C p_{m+p}(φ),
which shows that T ∗ S ∈ S'. We still should check that
⟨T ∗ S, φ⟩ = ⟨T(x), ⟨S(y), φ(x + y)⟩⟩ = ⟨S(y), ⟨T(x), φ(x + y)⟩⟩
still holds for all φ ∈ S. This is a matter of showing that all of the expressions are continuous in S when restricted to D. Explicitly, let φ_n ∈ D be a sequence of functions such that φ_n → φ in S; then
(31.13) ⟨T ∗ S, φ⟩ = lim_{n→∞} ⟨T ∗ S, φ_n⟩ = lim_{n→∞} ⟨T(x), ⟨S(y), φ_n(x + y)⟩⟩
and
(31.14) ⟨T ∗ S, φ⟩ = lim_{n→∞} ⟨T ∗ S, φ_n⟩ = lim_{n→∞} ⟨S(y), ⟨T(x), φ_n(x + y)⟩⟩.
So it suffices to show that the maps φ ∈ S → ⟨S(y), φ(· + y)⟩ ∈ S and φ ∈ S → ⟨T(x), φ(x + ·)⟩ ∈ C^∞(R^n) are continuous. These may be verified by methods similar to what we have been doing, so I will leave the details to the reader. Given these continuity assertions, we may pass to the limits in Eqs. (31.13) and (31.14) to learn that
⟨T ∗ S, φ⟩ = ⟨T(x), ⟨S(y), φ(x + y)⟩⟩ = ⟨S(y), ⟨T(x), φ(x + y)⟩⟩
still holds for all φ ∈ S.
The last and most important point is to show F(T ∗ S) = \hat S \hat T. Using
\hat φ(x + y) = ∫_{R^n} φ(ξ) e^{−i(x+y)·ξ} dξ = ∫_{R^n} φ(ξ) e^{−iy·ξ} e^{−ix·ξ} dξ = F(φ(ξ) e^{−iy·ξ})(x)
and the definition of F on S', we learn
(31.15) ⟨F(T ∗ S), φ⟩ = ⟨T ∗ S, \hat φ⟩ = ⟨S(y), ⟨T(x), \hat φ(x + y)⟩⟩ = ⟨S(y), ⟨T(x), F(φ(ξ) e^{−iy·ξ})(x)⟩⟩ = ⟨S(y), ⟨\hat T(ξ), φ(ξ) e^{−iy·ξ}⟩⟩.
Let θ ∈ D be a function such that θ = 1 on a neighborhood of supp(S), and assume φ ∈ D for the moment. Then from Eq. (31.15) and Fubini's theorem for distributions we find
(31.16) ⟨F(T ∗ S), φ⟩ = ⟨S(y), θ(y)⟨\hat T(ξ), φ(ξ) e^{−iy·ξ}⟩⟩ = ⟨S(y), ⟨\hat T(ξ), φ(ξ) θ(y) e^{−iy·ξ}⟩⟩
= ⟨\hat T(ξ), ⟨S(y), φ(ξ) θ(y) e^{−iy·ξ}⟩⟩ = ⟨\hat T(ξ), φ(ξ)⟨S(y), e^{−iy·ξ}⟩⟩ = ⟨\hat T(ξ), φ(ξ) \hat S(ξ)⟩ = ⟨\hat S \hat T, φ⟩.
Since F(T ∗ S) ∈ S' and \hat S \hat T ∈ S', we conclude that Eq. (31.16) holds for all φ ∈ S, and hence F(T ∗ S) = \hat S \hat T as was to be proved.
31.2. Elliptic Regularity.
Theorem 31.8 (Hypoellipticity). Suppose that $p(\xi) = \sum_{|\alpha| \le m} a_\alpha \xi^\alpha$ is a polynomial on $\mathbb{R}^n$ and $L$ is the constant coefficient differential operator
$$L = p\Big(\tfrac{1}{i}\partial\Big) = \sum_{|\alpha| \le m} a_\alpha \Big(\tfrac{1}{i}\partial\Big)^\alpha = \sum_{|\alpha| \le m} a_\alpha (-i\partial)^\alpha.$$
Also assume there exists a distribution $T \in \mathcal{D}'(\mathbb{R}^n)$ such that $R := \delta - LT \in C^\infty(\mathbb{R}^n)$ and $T|_{\mathbb{R}^n\setminus\{0\}} \in C^\infty(\mathbb{R}^n\setminus\{0\})$. Then if $v \in C^\infty(U)$ and $u \in \mathcal{D}'(U)$ solves $Lu = v$, then $u \in C^\infty(U)$. In particular, all solutions $u$ to the equation $Lu = 0$ are smooth.
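The classical example is $L = \Delta$ on $\mathbb{R}^n$ with $n \ge 3$: the fundamental solution $T = c_n|x|^{2-n}$ (for a suitable constant $c_n$) satisfies $\Delta T = \delta$, so $R = 0$, and $T$ is smooth away from the origin. The theorem then recovers Weyl's lemma: every $u \in \mathcal{D}'(U)$ with $\Delta u = 0$ is in fact a smooth (harmonic) function.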


Proof. We must show for each $x_0 \in U$ that $u$ is smooth on a neighborhood of $x_0$. So let $x_0 \in U$ and $\theta \in \mathcal{D}(U)$ be such that $0 \le \theta \le 1$ and $\theta = 1$ on a neighborhood $V$ of $x_0$. Also pick $\eta \in \mathcal{D}(V)$ such that $0 \le \eta \le 1$ and $\eta = 1$ on a neighborhood of $x_0$. Then
$$\theta u = \delta \ast (\theta u) = (LT + R) \ast (\theta u) = (LT) \ast (\theta u) + R \ast (\theta u) = T \ast L(\theta u) + R \ast (\theta u)$$
$$= T \ast \{\eta L(\theta u) + (1-\eta) L(\theta u)\} + R \ast (\theta u) = T \ast \{\eta L u + (1-\eta) L(\theta u)\} + R \ast (\theta u)$$
$$= T \ast (\eta v) + R \ast (\theta u) + T \ast [(1-\eta) L(\theta u)].$$
Since $\eta v \in \mathcal{D}(U)$ and $T \in \mathcal{D}'(\mathbb{R}^n)$, it follows that $T \ast (\eta v) \in C^\infty(\mathbb{R}^n)$. Also since $R \in C^\infty(\mathbb{R}^n)$ and $\theta u \in \mathcal{E}'(U)$, $R \ast (\theta u) \in C^\infty(\mathbb{R}^n)$. So to show $\theta u$, and hence $u$, is smooth near $x_0$ it suffices to show $T \ast g$ is smooth near $x_0$, where $g := (1-\eta) L(\theta u)$. Working formally for the moment,
$$T \ast g(x) = \int_{\mathbb{R}^n} T(x-y) g(y)\,dy = \int_{\{\eta \ne 1\}} T(x-y) g(y)\,dy,$$
which should be smooth for $x$ near $x_0$ since in this case $x - y \ne 0$ when $g(y) \ne 0$. To make this precise, let $\epsilon > 0$ be chosen so that $\eta = 1$ on a neighborhood of $B(x_0, \epsilon)$, so that $\operatorname{supp}(g) \subset B(x_0, \epsilon)^c$. For $\varphi \in \mathcal{D}(B(x_0, \epsilon/2))$,
$$\langle T \ast g, \varphi\rangle = \langle T(x), \langle g(y), \varphi(x+y)\rangle\rangle = \langle T, h\rangle$$
where $h(x) := \langle g(y), \varphi(x+y)\rangle$. If $|x| \le \epsilon/2$ then
$$\operatorname{supp}(\varphi(x + \cdot)) = \operatorname{supp}(\varphi) - x \subset B(x_0, \epsilon/2) - x \subset B(x_0, \epsilon),$$
so that $h(x) = 0$ and hence $\operatorname{supp}(h) \subset B(0, \epsilon/2)^c$. Hence if we let $\gamma \in \mathcal{D}(B(0, \epsilon/2))$ be a function such that $\gamma = 1$ near $0$, we have $\gamma h \equiv 0$, and thus
$$\langle T \ast g, \varphi\rangle = \langle T, h\rangle = \langle T, h - \gamma h\rangle = \langle (1-\gamma) T, h\rangle = \langle [(1-\gamma) T] \ast g, \varphi\rangle.$$
Since this last equation is true for all $\varphi \in \mathcal{D}(B(x_0, \epsilon/2))$, $T \ast g = [(1-\gamma) T] \ast g$ on $B(x_0, \epsilon/2)$, and this finishes the proof since $[(1-\gamma) T] \ast g \in C^\infty(\mathbb{R}^n)$ because $(1-\gamma) T \in C^\infty(\mathbb{R}^n)$.
Definition 31.9. Suppose that $p(\xi) = \sum_{|\alpha| \le m} a_\alpha \xi^\alpha$ is a polynomial on $\mathbb{R}^n$ and $L$ is the constant coefficient differential operator
$$L = p\Big(\tfrac{1}{i}\partial\Big) = \sum_{|\alpha| \le m} a_\alpha \Big(\tfrac{1}{i}\partial\Big)^\alpha = \sum_{|\alpha| \le m} a_\alpha (-i\partial)^\alpha.$$
Let $\sigma_p(L)(\xi) := \sum_{|\alpha| = m} a_\alpha \xi^\alpha$ and call $\sigma_p(L)$ the principal symbol of $L$. The operator $L$ is said to be elliptic provided that $\sigma_p(L)(\xi) \ne 0$ if $\xi \ne 0$.
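For example, $-\Delta = \sum_{j=1}^n (-i\partial_j)^2$ corresponds to $p(\xi) = |\xi|^2$, so $\sigma_p(-\Delta)(\xi) = |\xi|^2 \ne 0$ for $\xi \ne 0$ and the Laplacian is elliptic. By contrast, the heat operator $\partial_t - \Delta_x$ on $\mathbb{R}^{1+n}$ has principal symbol $|\xi|^2$ (in the variables $(\tau, \xi)$ dual to $(t, x)$), which vanishes at $(\tau, \xi) = (1, 0) \ne 0$, and the wave operator $\partial_t^2 - \Delta_x$ has principal symbol $-\tau^2 + |\xi|^2$, which vanishes on the light cone; neither operator is elliptic.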
Theorem 31.10 (Existence of Parametrix). Suppose that $L = p(\tfrac{1}{i}\partial)$ is an elliptic constant coefficient differential operator. Then there exists a distribution $T \in \mathcal{D}'(\mathbb{R}^n)$ such that $R := \delta - LT \in C^\infty(\mathbb{R}^n)$ and $T|_{\mathbb{R}^n\setminus\{0\}} \in C^\infty(\mathbb{R}^n\setminus\{0\})$.
Proof. The idea is to try to find $T$ such that $LT = \delta$. Taking the Fourier transform of this equation implies that $p(\xi)\hat{T}(\xi) = 1$ and hence we should try to define $\hat{T}(\xi) = 1/p(\xi)$. The main problem with this definition is that $p(\xi)$ may have zeros. However, these zeros cannot occur for large $\xi$ by the ellipticity assumption. Indeed, let $q(\xi) := \sigma_p(L)(\xi) = \sum_{|\alpha|=m} a_\alpha\xi^\alpha$, $r(\xi) := p(\xi) - q(\xi) = \sum_{|\alpha|<m} a_\alpha\xi^\alpha$, and let $c := \min\{|q(\xi)| : |\xi| = 1\}$ and $C := \max\{|q(\xi)| : |\xi| = 1\}$. Because $|q(\xi)|$ is a nowhere vanishing continuous function on the compact set $S := \{\xi \in \mathbb{R}^n : |\xi| = 1\}$, $0 < c \le C < \infty$. For $\xi \in \mathbb{R}^n$, let $\hat{\xi} := \xi/|\xi|$ and notice, since $|q(\xi)| = |\xi|^m|q(\hat{\xi})| \ge c|\xi|^m$, that
$$|p(\xi)| \ge |q(\xi)| - |r(\xi)| \ge c|\xi|^m - |r(\xi)| = |\xi|^m\Big(c - \frac{|r(\xi)|}{|\xi|^m}\Big) > 0$$
for all $|\xi| \ge M$ with $M$ sufficiently large, since $\lim_{|\xi|\to\infty} |r(\xi)|/|\xi|^m = 0$. Choose $\chi \in \mathcal{D}(\mathbb{R}^n)$ such that $\chi = 1$ on a neighborhood of $B(0, M)$ and let
$$h(\xi) = \frac{1 - \chi(\xi)}{p(\xi)} = \frac{\beta(\xi)}{p(\xi)} \in C^\infty(\mathbb{R}^n),$$
where $\beta := 1 - \chi$. Since $h(\xi)$ is bounded (in fact $\lim_{|\xi|\to\infty} h(\xi) = 0$), $h \in \mathcal{S}'(\mathbb{R}^n)$, so $T := \mathcal{F}^{-1}h \in \mathcal{S}'(\mathbb{R}^n)$ is well defined. Moreover,
$$\mathcal{F}(\delta - LT) = 1 - p(\xi)h(\xi) = 1 - \beta(\xi) = \chi(\xi) \in \mathcal{D}(\mathbb{R}^n),$$
which shows that
$$R := \delta - LT \in \mathcal{S}(\mathbb{R}^n) \subset C^\infty(\mathbb{R}^n).$$
So to finish the proof it suffices to show
$$T|_{\mathbb{R}^n\setminus\{0\}} \in C^\infty(\mathbb{R}^n\setminus\{0\}).$$
To prove this recall that
$$\mathcal{F}(x^\alpha T) = (i\partial)^\alpha \hat{T} = (i\partial)^\alpha h.$$
By the product rule (Leibniz's rule), the fact that every derivative of $\beta$ has compact support contained in $B(0, M)^c$, and the fact that $p$ is non-vanishing on this set,
$$\partial^\alpha h = \beta\,\partial^\alpha\frac{1}{p} + r_\alpha$$
where $r_\alpha \in \mathcal{D}(\mathbb{R}^n)$. Moreover,
$$\partial_{\xi_i}\frac{1}{p} = -\frac{\partial_{\xi_i}p}{p^2} \quad\text{and}\quad \partial_{\xi_j}\partial_{\xi_i}\frac{1}{p} = -\frac{\partial_{\xi_j}\partial_{\xi_i}p}{p^2} + 2\,\frac{\partial_{\xi_j}p\,\partial_{\xi_i}p}{p^3},$$
from which it follows that
$$\Big|\beta(\xi)\,\partial_{\xi_i}\frac{1}{p}(\xi)\Big| \le C|\xi|^{-(m+1)} \quad\text{and}\quad \Big|\beta(\xi)\,\partial_{\xi_j}\partial_{\xi_i}\frac{1}{p}(\xi)\Big| \le C|\xi|^{-(m+2)}.$$
More generally, one shows inductively that
(31.17) $\Big|\beta(\xi)\,\partial^\alpha\frac{1}{p}(\xi)\Big| \le C|\xi|^{-(m+|\alpha|)}.$
In particular, if $k \in \mathbb{N}$ is given and $\alpha$ is chosen so that $|\alpha| + m > n + k$, then $|\xi|^k\,\partial^\alpha h(\xi) \in L^1(d\xi)$ and therefore
$$x^\alpha T = \mathcal{F}^{-1}[(i\partial)^\alpha h] \in C^k(\mathbb{R}^n).$$
Hence we learn that for any $k \in \mathbb{N}$ we may choose $N$ sufficiently large so that
$$|x|^{2N}\,T \in C^k(\mathbb{R}^n).$$
This shows that $T|_{\mathbb{R}^n\setminus\{0\}} \in C^\infty(\mathbb{R}^n\setminus\{0\})$, since on $\mathbb{R}^n\setminus\{0\}$ we may write $T = |x|^{-2N}\big(|x|^{2N}T\big)$ with $|x|^{-2N}$ smooth there and $k$ arbitrary.
Here is the induction argument that proves Eq. (31.17). Let $q_\alpha := p^{|\alpha|+1}\,\partial^\alpha p^{-1}$ with $q_0 = 1$. Then
$$\partial_{\xi_i}\partial^\alpha p^{-1} = \partial_{\xi_i}\big(p^{-|\alpha|-1} q_\alpha\big) = -(|\alpha|+1)\,p^{-|\alpha|-2}\,q_\alpha\,\partial_{\xi_i}p + p^{-|\alpha|-1}\,\partial_{\xi_i}q_\alpha,$$
so that
$$q_{\alpha+e_i} = p^{|\alpha|+2}\,\partial_{\xi_i}\partial^\alpha p^{-1} = -(|\alpha|+1)\,q_\alpha\,\partial_{\xi_i}p + p\,\partial_{\xi_i}q_\alpha.$$
It follows by induction that $q_\alpha$ is a polynomial in $\xi$ and, letting $d_\alpha := \deg(q_\alpha)$, that $d_{\alpha+e_i} \le d_\alpha + m - 1$ with $d_0 = 0$. Again by induction this implies $d_\alpha \le |\alpha|(m-1)$. Therefore, for $|\xi| \ge M$,
$$\Big|\partial^\alpha\frac{1}{p}(\xi)\Big| = \frac{|q_\alpha(\xi)|}{|p(\xi)|^{|\alpha|+1}} \le C|\xi|^{d_\alpha - m(|\alpha|+1)} \le C|\xi|^{|\alpha|(m-1) - m(|\alpha|+1)} = C|\xi|^{-(m+|\alpha|)}$$
as claimed in Eq. (31.17).
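As a concrete check of Eq. (31.17), take $p(\xi) = |\xi|^2$ (so $m = 2$, corresponding to $L = -\Delta$). Then
$$\partial_{\xi_i}\frac{1}{|\xi|^2} = -\frac{2\xi_i}{|\xi|^4} = O(|\xi|^{-3}) \quad\text{and}\quad \partial_{\xi_j}\partial_{\xi_i}\frac{1}{|\xi|^2} = -\frac{2\delta_{ij}}{|\xi|^4} + \frac{8\xi_i\xi_j}{|\xi|^6} = O(|\xi|^{-4}),$$
in agreement with the exponents $-(m+1) = -3$ and $-(m+2) = -4$ for $|\xi| \ge M$.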
31.3. Appendix: Old Proof of Theorem 31.4. This indeed turns out to be the
case but is a bit painful to prove. The next theorem is the key ingredient to proving
Eq. (31.10).
Theorem 31.11. Let $\psi \in \mathcal{D}$ (or $\psi \in \mathcal{S}$), $d\mu(y) = \psi(y)\,dy$, and $\varphi \in C^\infty(\mathbb{R}^n)$ (respectively $\varphi \in \mathcal{S}$). For $\epsilon > 0$ we may write $\mathbb{R}^n = \coprod_{m\in\mathbb{Z}^n} \epsilon(m + Q)$ where $Q = (0, 1]^n$. For $y \in \epsilon(m + Q)$, let $y_\epsilon \in \epsilon m + \epsilon\bar{Q}$ be the point closest to the origin in $\epsilon m + \epsilon\bar{Q}$. (This will be one of the corners of the translated cube.) In this way we define a function $y \in \mathbb{R}^n \to y_\epsilon \in \epsilon\mathbb{Z}^n$ which is constant on each cube $\epsilon(m + Q)$. Let
(31.18) $F_\epsilon(x) := \int \varphi(x - y_\epsilon)\,d\mu(y) = \sum_{m\in\mathbb{Z}^n} \varphi(x - (\epsilon m)_\epsilon)\,\mu(\epsilon(m+Q)),$
then the above sum converges in $C^\infty(\mathbb{R}^n)$ (respectively $\mathcal{S}$) and $F_\epsilon \to \varphi \ast \psi$ in $C^\infty(\mathbb{R}^n)$ (respectively $\mathcal{S}$) as $\epsilon \downarrow 0$. (In particular if $\varphi, \psi \in \mathcal{S}$ then $\varphi \ast \psi \in \mathcal{S}$.)
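To see what the map $y \mapsto y_\epsilon$ looks like, consider $n = 1$: the cube $\epsilon(m + Q)$ is the interval $(\epsilon m, \epsilon(m+1)]$, and the endpoint of its closure closest to the origin is $\epsilon m$ when $m \ge 0$ and $\epsilon(m+1)$ when $m < 0$. Thus $y_\epsilon$ rounds $y$ toward $0$ to a nearby point of $\epsilon\mathbb{Z}$, so that $|y_\epsilon| \le |y|$ and $|y - y_\epsilon| \le \epsilon$ (in general $|y - y_\epsilon| \le \epsilon\sqrt{n}$); both facts are used repeatedly below.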
Proof. First suppose that $\psi \in \mathcal{D}$. Then the measure $\mu$ has compact support, hence the sum in Eq. (31.18) is finite and so is certainly convergent in $C^\infty(\mathbb{R}^n)$. To show $F_\epsilon \to \varphi \ast \psi$ in $C^\infty(\mathbb{R}^n)$, let $K$ be a compact set and $m \in \mathbb{N}$. Then for $|\alpha| \le m$,
(31.19) $|\partial^\alpha F_\epsilon(x) - \partial^\alpha(\varphi \ast \psi)(x)| = \Big|\int [\partial^\alpha\varphi(x - y_\epsilon) - \partial^\alpha\varphi(x - y)]\,d\mu(y)\Big| \le \int |\partial^\alpha\varphi(x - y_\epsilon) - \partial^\alpha\varphi(x - y)|\,|\psi(y)|\,dy$
and therefore,
$$\|\partial^\alpha F_\epsilon - \partial^\alpha(\varphi \ast \psi)\|_{\infty, K} \le \int \|\partial^\alpha\varphi(\cdot - y_\epsilon) - \partial^\alpha\varphi(\cdot - y)\|_{\infty, K}\,|\psi(y)|\,dy \le \sup_{y\in\operatorname{supp}(\psi)} \|\partial^\alpha\varphi(\cdot - y_\epsilon) - \partial^\alpha\varphi(\cdot - y)\|_{\infty, K} \int |\psi(y)|\,dy.$$
Since $\psi$ has compact support, we may use the uniform continuity of $\partial^\alpha\varphi$ on compact sets to conclude
$$\sup_{y\in\operatorname{supp}(\psi)} \|\partial^\alpha\varphi(\cdot - y_\epsilon) - \partial^\alpha\varphi(\cdot - y)\|_{\infty, K} \to 0 \text{ as } \epsilon \downarrow 0.$$
This finishes the proof for $\psi \in \mathcal{D}$ and $\varphi \in C^\infty(\mathbb{R}^n)$.
Now suppose that both $\varphi$ and $\psi$ are in $\mathcal{S}$, in which case the sum in Eq. (31.18) is in general an infinite sum, so we need to check that it converges to an element of $\mathcal{S}$. For this we estimate each term in the sum. Given $s, t > 0$ and a multi-index $\alpha$, using Peetre's inequality and simple estimates,
$$|\partial^\alpha\varphi(x - (\epsilon m)_\epsilon)\,\mu(\epsilon(m+Q))| \le C\,\nu_{-t}(x - (\epsilon m)_\epsilon) \int_{\epsilon(m+Q)} |\psi(y)|\,dy \le C\,\nu_{-t}(x)\,\nu_t((\epsilon m)_\epsilon)\,K \int_{\epsilon(m+Q)} \nu_{-s}(y)\,dy$$
for some finite constants $K$ and $C$. Making the change of variables $y = \epsilon m + \epsilon z$, we find
$$\int_{\epsilon(m+Q)} \nu_{-s}(y)\,dy = \epsilon^n \int_Q \nu_{-s}(\epsilon m + \epsilon z)\,dz \le \epsilon^n\,\nu_{-s}(\epsilon m) \int_Q \nu_s(\epsilon z)\,dz \le \epsilon^n\,\nu_{-s}(\epsilon m) \int_Q (1 + |z|)^s\,dz \le C\,\epsilon^n\,\nu_{-s}(\epsilon m)$$
for $0 < \epsilon \le 1$. Combining these two estimates shows
$$\|\nu_t\,\partial^\alpha\varphi(\cdot - (\epsilon m)_\epsilon)\,\mu(\epsilon(m+Q))\|_\infty \le C\,\nu_t((\epsilon m)_\epsilon)\,\epsilon^n\,\nu_{-s}(\epsilon m) \le C\,\nu_t(\epsilon m)\,\nu_{-s}(\epsilon m)\,\epsilon^n = C\,\nu_{t-s}(\epsilon m)\,\epsilon^n,$$
and therefore, for some (different) constant $C$,
$$\sum_{m\in\mathbb{Z}^n} p_k\big(\varphi(\cdot - (\epsilon m)_\epsilon)\,\mu(\epsilon(m+Q))\big) \le \sum_{m\in\mathbb{Z}^n} C\,\nu_{k-s}(\epsilon m)\,\epsilon^n = \sum_{m\in\mathbb{Z}^n} C\,\frac{1}{(1 + \epsilon|m|)^{s-k}}\,\epsilon^n,$$
which can be made finite by taking $s > k + n$, as can be seen by a comparison with the integral $\int \frac{1}{(1+|x|)^{s-k}}\,dx$. Therefore the sum is convergent in $\mathcal{S}$ as claimed.
To finish the proof, we must show that $F_\epsilon \to \varphi \ast \psi$ in $\mathcal{S}$. From Eq. (31.19) we still have
$$|\partial^\alpha F_\epsilon(x) - \partial^\alpha(\varphi \ast \psi)(x)| \le \int |\partial^\alpha\varphi(x - y_\epsilon) - \partial^\alpha\varphi(x - y)|\,|\psi(y)|\,dy.$$
The estimate in Eq. (31.9) gives
$$|\partial^\alpha\varphi(x - y_\epsilon) - \partial^\alpha\varphi(x - y)| \le C \int_0^1 \nu_M(y_\epsilon + \tau(y - y_\epsilon))\,d\tau\;|y - y_\epsilon|\;\nu_{-M}(x)$$
$$\le C\,\epsilon\,\nu_{-M}(x) \int_0^1 \nu_M(y_\epsilon + \tau(y - y_\epsilon))\,d\tau \le C\,\epsilon\,\nu_{-M}(x) \int_0^1 \nu_M(y)\,d\tau = C\,\epsilon\,\nu_{-M}(x)\,\nu_M(y),$$
where in the last inequality we have used the fact that $|y_\epsilon + \tau(y - y_\epsilon)| \le |y|$. Therefore,
$$\|\nu_M\,(\partial^\alpha F_\epsilon - \partial^\alpha(\varphi \ast \psi))\|_\infty \le C\,\epsilon \int_{\mathbb{R}^n} \nu_M(y)\,|\psi(y)|\,dy = O(\epsilon) \to 0 \text{ as } \epsilon \downarrow 0,$$
because $\int_{\mathbb{R}^n} \nu_M(y)\,|\psi(y)|\,dy < \infty$ for all $M < \infty$ since $\psi \in \mathcal{S}$.
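As a special case of the parenthetical assertion in Theorem 31.11, the convolution of two Gaussians stays in $\mathcal{S}$ and can be computed explicitly: if $\varphi(x) = \psi(x) = e^{-|x|^2/2}$ then, completing the square in the exponent,
$$(\varphi \ast \psi)(x) = \int_{\mathbb{R}^n} e^{-|y|^2/2}\,e^{-|x-y|^2/2}\,dy = e^{-|x|^2/4} \int_{\mathbb{R}^n} e^{-|y - x/2|^2}\,dy = \pi^{n/2}\,e^{-|x|^2/4},$$
which is again a Schwartz function.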
We are now in a position to prove Eq. (31.10). Let us state this in the form of a theorem.

Theorem 31.12. Suppose that $(T, \varphi)$ is a distribution – test function pair satisfying one of the three conditions in Theorem 31.3. Then $T \ast \varphi$, as a distribution, may be characterized by
(31.20) $\langle T \ast \varphi, \psi\rangle = \langle T, \tilde{\varphi} \ast \psi\rangle$
(where $\tilde{\varphi}(x) := \varphi(-x)$) for all $\psi \in \mathcal{D}(\mathbb{R}^n)$, and for all $\psi \in \mathcal{S}$ when $T \in \mathcal{S}'$ and $\varphi \in \mathcal{S}$.
Proof. Let
$$F_\epsilon(x) := \int \tilde{\varphi}(x - y_\epsilon)\,d\mu(y) = \sum_{m\in\mathbb{Z}^n} \tilde{\varphi}(x - (\epsilon m)_\epsilon)\,\mu(\epsilon(m+Q)),$$
where $d\mu(y) = \psi(y)\,dy$. Then, making use of Theorem 31.11 in all cases, we find
$$\langle T, \tilde{\varphi} \ast \psi\rangle = \lim_{\epsilon\downarrow 0} \langle T, F_\epsilon\rangle = \lim_{\epsilon\downarrow 0} \Big\langle T(x), \sum_{m\in\mathbb{Z}^n} \tilde{\varphi}(x - (\epsilon m)_\epsilon)\,\mu(\epsilon(m+Q))\Big\rangle$$
$$= \lim_{\epsilon\downarrow 0} \sum_{m\in\mathbb{Z}^n} \langle T(x), \varphi((\epsilon m)_\epsilon - x)\rangle\,\mu(\epsilon(m+Q))$$
(31.21) $= \lim_{\epsilon\downarrow 0} \sum_{m\in\mathbb{Z}^n} (T \ast \varphi)((\epsilon m)_\epsilon)\,\mu(\epsilon(m+Q)).$
To compute this last limit, let $h(x) := T \ast \varphi(x)$ and let us do the hard case where $T \in \mathcal{S}'$. In this case we know that $h \in \mathcal{P}$, and in particular there exist $k < \infty$ and $C < \infty$ such that $\|\nu_{-k}\,h\|_\infty < \infty$. So we have
$$\Big|\int_{\mathbb{R}^n} h(x)\,d\mu(x) - \sum_{m\in\mathbb{Z}^n} (T \ast \varphi)((\epsilon m)_\epsilon)\,\mu(\epsilon(m+Q))\Big| = \Big|\int_{\mathbb{R}^n} [h(x) - h(x_\epsilon)]\,d\mu(x)\Big| \le \int_{\mathbb{R}^n} |h(x) - h(x_\epsilon)|\,|\psi(x)|\,dx.$$
Now
$$|h(x) - h(x_\epsilon)| \le C\,(\nu_k(x) + \nu_k(x_\epsilon)) \le 2C\,\nu_k(x),$$
and since $\nu_k\,|\psi| \in L^1$ we may use the dominated convergence theorem to conclude that
$$\lim_{\epsilon\downarrow 0} \Big|\int_{\mathbb{R}^n} h(x)\,d\mu(x) - \sum_{m\in\mathbb{Z}^n} (T \ast \varphi)((\epsilon m)_\epsilon)\,\mu(\epsilon(m+Q))\Big| = 0,$$
which, combined with Eq. (31.21) and the fact that $\int_{\mathbb{R}^n} h(x)\,d\mu(x) = \langle T \ast \varphi, \psi\rangle$, proves the theorem.
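As a quick check of Eq. (31.20), take $T = \delta$, so that $T \ast \varphi = \varphi$. Then for $\psi \in \mathcal{D}$,
$$\langle \delta, \tilde{\varphi} \ast \psi\rangle = (\tilde{\varphi} \ast \psi)(0) = \int \tilde{\varphi}(-y)\,\psi(y)\,dy = \int \varphi(y)\,\psi(y)\,dy = \langle \varphi, \psi\rangle,$$
so both sides of Eq. (31.20) agree, as they must.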
