John Loftin
May 4, 2013
1 Spaces of functions
1.1 Banach spaces
Many natural spaces of functions form infinite-dimensional vector spaces.
Examples are the space of polynomials and the space of smooth functions.
If we are interested in solving differential equations, then it is important to
understand analysis in infinite-dimensional vector spaces (over $\mathbb{R}$ or $\mathbb{C}$).
First of all, we should recognize the following straightforward fact about
finite-dimensional vector spaces:
to define limits in infinite-dimensional spaces, however (though a related
construction is used in defining the topology of Fréchet spaces).
Finite-dimensional vector spaces are also all complete with respect to
their standard norm (in other words, they are all Banach spaces). Given a
norm on an infinite dimensional vector space, completeness must be proved,
however. There are many examples of Banach function spaces: On a measure
space, the $L^p$ spaces of functions are all Banach spaces for $1 \le p \le \infty$. Also,
on a metric space $X$, the space of all bounded continuous functions $C^0(X)$
is a Banach space under the norm
$$\|f\|_{C^0} = \sup_{x \in X} |f(x)|.$$
The $L^p$ and $C^0$ spaces form the basis of most other useful Banach spaces, with
extensions typically provided by measuring not just the functions themselves, but
also their partial derivatives (as in Sobolev and $C^k$ spaces) or their difference
quotients (Hölder spaces).
Completeness of a metric space of course means that any Cauchy sequence
has a unique limit. More roughly, this means that any sequence that should
converge, in that its elements are becoming infinitesimally close to each other,
will converge to a limit in the space. As we will see, taking such limits is
a powerful way to construct solutions to analytic problems. Unfortunately,
many of the most familiar spaces of functions (such as smooth functions) do
not have the structure of a Banach space, and so it is difficult to ensure that
a given limit of smooth functions is smooth. In fact we have the following
theorem, which we state without proof:
Theorem 1. On $\mathbb{R}^n$ equipped with Lebesgue measure, the space $C_0^\infty(\mathbb{R}^n)$ of
smooth functions with compact support is dense in $L^p(\mathbb{R}^n)$ for all $1 \le p < \infty$.
In other words, the completion of the space of smooth functions with compact
support on $\mathbb{R}^n$ with respect to the $L^p$ norm is simply the space of all $L^p$
functions for $1 \le p < \infty$.
If we are working in $L^2$, for example, it is possible for the limit of smooth
functions to be quite non-smooth: there are many $L^2$ functions which are
discontinuous everywhere. This poses a potential problem if the limit we
have produced is supposed to be a solution to a differential equation. In
particular, such a limit may be nowhere differentiable. Some of our goals then
are to understand (1) how to make sense of taking derivatives of functions
which are not classically differentiable (the theory of distributions and weak
derivatives), and (2) how to show that a limit function actually has enough
derivatives to solve the equation (bootstrapping).
Theorem 1 reminds us that the $L^p$ Banach spaces have a very large overlap,
which of course includes many more functions than the smooth functions
with compact support. In particular, it is often useful to take the point of
view that these Banach function spaces are not so much different spaces but
different tools to study either the space of all functions or (via the completion
process) the space of only very nice functions (e.g., smooth functions of
compact support).
In particular, two function spaces which are very closely related to each
other are $L^\infty$ and $C^0$. As we will see below, they have essentially the same
norm. First of all, we show that $C^0(X)$ is a Banach space for any metric
space $X$.
$$\|f\|_{C^0} = 0 \iff f \equiv 0,$$
$$\|\lambda f\|_{C^0} = |\lambda| \, \|f\|_{C^0},$$
$$\|f + g\|_{C^0} \le \|f\|_{C^0} + \|g\|_{C^0}.$$
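Proof. We simply need to check the metric induced on $C^0(X)$ is complete.
Let $d$ denote the metric on $X$, and consider a Cauchy sequence $\{f_i\} \subset C^0(X)$.
In other words, for all $\epsilon > 0$, there is an $N$ so that $n, m > N$
implies $\|f_n - f_m\|_{C^0} < \epsilon$. By the definition of the norm, this implies
$|f_n(x) - f_m(x)| < \epsilon$ for all $x \in X$. Now for each $x \in X$, $\{f_i(x)\} \subset \mathbb{R}$ is a
Cauchy sequence, and since $\mathbb{R}$ is complete, there is a limit $f(x) = \lim_{i \to \infty} f_i(x)$.
We have now produced a limit function $f$. We need to show that
$\|f_i - f\|_{C^0} \to 0$ and $f \in C^0(X)$. The first statement is straightforward:
For all $\epsilon > 0$, there is an $N$ so that for all $n, m > N$, for all $x \in X$,
$$|f_n(x) - f_m(x)| < \epsilon.$$
Letting $m \to \infty$, we have that for all $\epsilon > 0$, there is an $N$ so that for all $n > N$, and for all
$x \in X$,
$$|f_n(x) - f(x)| \le \epsilon.$$
Since this is true for all $x \in X$, we have
$$\|f_n - f\|_{C^0} = \sup_{x \in X} |f_n(x) - f(x)| \le \epsilon,$$
and so $\|f_i - f\|_{C^0} \to 0$.
We still need to prove that the limit function $f$ is continuous. So let $x \in X$
and choose $\epsilon > 0$. Then there is an $N$ so that for $n > N$, $\|f_n - f\|_{C^0} < \epsilon$.
By the previous paragraph and the definition of $\|\cdot\|_{C^0}$,
$$|f_n(x) - f(x)| < \epsilon \quad \text{and} \quad |f_n(y) - f(y)| < \epsilon \text{ for all } y \in X.$$
Fix such an $n$. Since $f_n$ is continuous at $x$, there is a $\delta > 0$ so that $d(x, y) < \delta$
implies $|f_n(x) - f_n(y)| < \epsilon$. For such $y$,
$$\begin{aligned}
|f(x) - f(y)| &= |[f(x) - f_n(x)] + [f_n(x) - f_n(y)] + [f_n(y) - f(y)]| \\
&\le |f(x) - f_n(x)| + |f_n(x) - f_n(y)| + |f_n(y) - f(y)| \\
&< \epsilon + \epsilon + \epsilon = 3\epsilon.
\end{aligned}$$
Since $\epsilon$ was arbitrary, $f$ is continuous at $x$.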
The last bit of the proof can be remembered as this: Any uniform limit
of continuous functions is continuous.
Remark. The previous proposition works as well for functions whose range is
the complex numbers $\mathbb{C}$, or a vector space $\mathbb{R}^n$, or in fact any Banach space
$B$. The proof is the same. In this last case, we could refer to the Banach
space $C^0(X; B)$ as the Banach space of continuous functions from $X$ into $B$.
Consider an open set $\Omega \subset \mathbb{R}^n$. On $\Omega$, the $C^0$ norm is essentially the same
as the $L^\infty$ norm, but is simpler to define because we can consider functions
as elements of $C^0$, while we need equivalence classes of functions to define
$L^\infty$. In fact, more is true. Let $\Omega$ inherit the standard metric and Lebesgue
measure from $\mathbb{R}^n$. For a measurable function $f : \Omega \to \mathbb{R}$, let $[f]$ be the
equivalence class whose members are all functions from $\Omega \to \mathbb{R}$ which agree
with $f$ almost everywhere.
Proposition 2. The map $\Phi : C^0(\Omega) \to L^\infty(\Omega)$ given by $\Phi(f) = [f]$ is
one-to-one and preserves the norm.
Proof. First of all, note that it follows immediately from the definitions that
for $f \in C^0(\Omega)$, $\Phi(f) \in L^\infty(\Omega)$. Also, we should show that $\|f\|_{C^0} = \|\Phi(f)\|_{L^\infty}$
to show $\Phi$ preserves the norm.
The proof hinges on the simple fact that every full-measure subset $V$ of
$\Omega$ is dense in $\Omega$. (Recall $V$ has full measure if $\Omega \setminus V$ has Lebesgue measure
zero.) This fact may be proved as follows: let $V \subset \Omega$ have full measure.
Then there is no open ball contained in $\Omega \setminus V$ (since open balls have positive
measure). This shows $V$ is dense in $\Omega$. (Question: We need to use that $\Omega$ is an
open subset of $\mathbb{R}^n$ in this paragraph. Where did we use that $\Omega$ is open?)
Now we prove the map $\Phi$ is injective. So if $f$ and $g$ are in $C^0(\Omega)$, and
$[f] = [g]$, then by definition, $f = g$ on a set $V$ of full measure. Let $x \in \Omega$.
Since $V$ is dense, there is a sequence $x_n \to x$, $x_n \in V$. Then
$$f(x) = f\left(\lim_{n \to \infty} x_n\right) = \lim_{n \to \infty} f(x_n) = \lim_{n \to \infty} g(x_n) = g\left(\lim_{n \to \infty} x_n\right) = g(x)$$
since $f$ and $g$ are continuous and $f(x_n) = g(x_n)$. So $f$ and $g$ coincide at each
point of $\Omega$ and so $f = g$ in $C^0(\Omega)$.
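Finally, we show that for $f \in C^0(\Omega)$, $\|f\|_{C^0} = \|f\|_{L^\infty}$. In particular, let
$\mu$ denote Lebesgue measure and compute (recall we often write $\|f\|_{L^\infty}$ instead
of the more correct $\|[f]\|_{L^\infty} = \|\Phi(f)\|_{L^\infty}$)
$$\begin{aligned}
\|f\|_{L^\infty(\Omega)} &= \inf\{a : |f(x)| \le a \text{ for almost every } x \in \Omega\} \\
&= \inf\{a : \mu\{x : |f(x)| > a\} = 0\}.
\end{aligned}$$
But $\mu\{x : |f(x)| > a\} = 0$ implies that $\{x : |f(x)| > a\} = \emptyset$. (Proof: If the
set is not empty it is an open subset of $\Omega$ since $|f|$ is continuous. The only
open subset of $\Omega$ with measure zero is the empty set.) So now
$$\|f\|_{L^\infty(\Omega)} = \inf\{a : |f(x)| \le a \text{ for all } x \in \Omega\} = \sup_{x \in \Omega} |f(x)| = \|f\|_{C^0}.$$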
Hint for (b): Show that if $f_n \to g$ in $L^\infty(\mathbb{R})$, then $\{f_n\}$ is a Cauchy sequence
in $C^0(\mathbb{R})$. Then use Proposition 1 and show the resulting limit function
$f \in C^0(\mathbb{R})$ must be equal to $g$ almost everywhere. (This amounts to showing
that $\Phi(C^0)$ is a closed subspace of $L^\infty$.) Provide a contradiction.
1.3 Quantifiers
It is worth taking the time to look in some detail at $C^0$ convergence, and to
compare it to pointwise convergence. By contrast, $C^0$ convergence is often
called uniform convergence.
For a metric space $X$, $f_n \to f$ in $C^0(X)$ if for all $\epsilon > 0$, there is an $N$ so
that
$$n > N \implies \|f_n - f\|_{C^0(X)} < \epsilon.$$
In other words, for all $\epsilon > 0$, there is an $N$ so that
$$n > N \implies \sup_{x \in X} |f_n(x) - f(x)| < \epsilon.$$
So then $f_n \to f$ in $C^0(X)$ implies that for all $\epsilon > 0$, there is an $N$ so that for all
$x \in X$,
$$n > N \implies |f_n(x) - f(x)| < \epsilon.$$
A few easy manipulations imply in fact the following
Lemma 3. Let $X$ be a metric space and let $f_n \in C^0(X)$. Then $f_n \to f$ in
$C^0(X)$ if and only if for every $\epsilon > 0$, there is an $N = N(\epsilon)$ so that for all $x \in X$,
$$n > N \implies |f_n(x) - f(x)| < \epsilon.$$
Homework Problem 3. Prove Lemma 3.
Since $C^0(X)$ is a Banach space, we know that the limit function $f \in C^0(X)$
as well, and thus the uniform limit of continuous functions is continuous.
$C^0$ convergence is called uniform convergence because the $N$ in Lemma
3 depends only on $\epsilon > 0$ and not on $x \in X$: thus $N$ is uniform over all
$x \in X$.
We contrast this with pointwise convergence. If $f_n$ are functions on $X$,
then $f_n \to f$ pointwise if for all $\epsilon > 0$ and $x \in X$, there is an $N = N(\epsilon, x)$ so
that
$$n > N \implies |f_n(x) - f(x)| < \epsilon.$$
The difference between pointwise and uniform convergence is subtle but very
important: in pointwise convergence $N = N(\epsilon, x)$ may depend on $\epsilon$ and $x$,
while in uniform convergence $N = N(\epsilon)$ only depends on $\epsilon$ and is independent
of $x$.
We have belabored this point because it is one of the major issues in
analysis: keeping track of which constants, or quantifiers, depend on which
other quantifiers. (It is even better to have explicit bounds (estimates) on the
behavior of quantifiers with respect to each other.) Of course it is desirable
(though not always possible) to have more uniform dependence of quantifiers,
as we see in the following standard example:
We have seen that the uniform limit of continuous functions is continuous.
On the other hand, a pointwise limit of continuous functions may not be:
Example 1. Consider $X = [0, 1]$ and $f_n(x) = x^n$. Then $f_n \to f$ pointwise
on $[0, 1]$, where
$$f(x) = \begin{cases} 0 & \text{for } x \in [0, 1), \\ 1 & \text{for } x = 1. \end{cases}$$
So the pointwise limit $f$ is discontinuous, and thus we see that $f_n \not\to f$
uniformly.
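The failure of uniform convergence in Example 1 can also be seen numerically. The following is a minimal Python sketch, not part of the original notes; the function names and the witness point $x = 2^{-1/n}$ (chosen so that $x^n = 1/2$) are illustrative assumptions.

```python
def f_n(n, x):
    """The n-th function of Example 1: f_n(x) = x**n on [0, 1]."""
    return x ** n

def f_limit(x):
    """Pointwise limit of f_n: 0 on [0, 1) and 1 at x = 1."""
    return 1.0 if x == 1.0 else 0.0

def witness_gap(n):
    """|f_n(x) - f(x)| at the (assumed) witness point x = 2**(-1/n)."""
    x = 0.5 ** (1.0 / n)
    return abs(f_n(n, x) - f_limit(x))

# At a fixed x < 1 the values shrink, but the gap at the witness stays near 1/2.
for n in (1, 10, 100, 1000):
    print(n, f_n(n, 0.9), witness_gap(n))
```

Since the gap at the witness point is about 1/2 for every $n$, the sup norm of $f_n - f$ cannot tend to 0, even though $f_n(x) \to f(x)$ at each fixed $x$.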
1.4 Derivatives
The theory of derivatives in one variable is fairly straightforward: if a function
$f : \mathbb{R} \to \mathbb{R}$ is differentiable at $p$ (i.e., $f'(p)$ exists), then $f$ must be continuous
at $p$. For functions of more than one variable, however, consider the following
example:
Example 2. The function
$$f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & \text{for } (x, y) \ne (0, 0), \\ 0 & \text{for } (x, y) = (0, 0), \end{cases}$$
has first partial derivatives everywhere but is not even continuous at $(0, 0)$.
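A short numerical check of Example 2 may help. This is a hedged Python sketch with illustrative function names: the difference quotients along the axes vanish at the origin, while the function equals 1/2 along the diagonal $y = x$ arbitrarily close to $(0, 0)$.

```python
def f(x, y):
    """The function of Example 2."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

def partial_x_at_origin(h):
    """Difference quotient for the x-partial derivative at (0, 0)."""
    return (f(h, 0.0) - f(0.0, 0.0)) / h

def partial_y_at_origin(h):
    """Difference quotient for the y-partial derivative at (0, 0)."""
    return (f(0.0, h) - f(0.0, 0.0)) / h

# Both quotients are identically 0, yet f(t, t) = 1/2 for every t != 0.
for t in (1e-1, 1e-3, 1e-6):
    print(partial_x_at_origin(t), partial_y_at_origin(t), f(t, t))
```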
Even though f has all its first partial derivatives at (0, 0), we do not
consider f to be differentiable at (0, 0). For functions of more than one
variable, we introduce the following definition of differentiability, which is
stronger than just the existence of all the partial derivatives. Instead of
$\mathbb{R}$-valued functions, we consider the slightly more general case of maps from $\mathbb{R}^n$
to $\mathbb{R}^m$. A basic reference is Spivak, Calculus on Manifolds, Chapter 2.
Let $O \subset \mathbb{R}^n$ be a domain, and let $f = (f^1, \dots, f^m) : O \to \mathbb{R}^m$. Then $f$
is differentiable at a point $a \in O$ if there is a linear map $Df(a) : \mathbb{R}^n \to \mathbb{R}^m$
which satisfies
$$\lim_{h \to 0} \frac{|f(a + h) - f(a) - Df(a)(h)|}{|h|} = 0,$$
where $h \in \mathbb{R}^n$. $Df(a)$ is called the derivative, or total derivative, of $f$ at $a$.
Lemma 4. In terms of standard bases of $\mathbb{R}^n$ and $\mathbb{R}^m$, $Df(a)$ is written as
the Jacobian matrix
$$Df(a) = \left( \frac{\partial f^i}{\partial x^j}(a) \right), \quad i = 1, \dots, m, \quad j = 1, \dots, n.$$
In particular, if $f$ is differentiable at $a$, then all the partial derivatives $\partial f^i / \partial x^j$
exist at $a$.
Proof. Write $Df(a)$ as the matrix $(\alpha^i_j)$. Also consider a path $h = (0, \dots, k, \dots, 0)$,
where $k \to 0$ sits in the $j$th slot. (In other words, $h^l = \delta^l_j k$, where $\delta^l_j$ is the
Kronecker delta, which is 1 if $l = j$ and 0 otherwise.) We also use Einstein's
summation convention. In $n$-space, this summation convention requires that
any repeated index which appears in both up and down positions (such as
the $l$ in the last two lines below) is assumed to be summed from 1 to $n$.
Compute
$$\begin{aligned}
\frac{\partial f^i}{\partial x^j}(a) &= \lim_{k \to 0} \frac{f^i(a^1, \dots, a^j + k, \dots, a^n) - f^i(a)}{k} \\
&= \lim_{k \to 0} \frac{[f^i(a^1, \dots, a^j + k, \dots, a^n) - f^i(a) - \alpha^i_l h^l] + \alpha^i_l h^l}{k} \\
&= 0 + \lim_{k \to 0} \frac{\alpha^i_l \delta^l_j k}{k} \\
&= \alpha^i_l \delta^l_j = \alpha^i_j.
\end{aligned}$$
The key step, going from the second to the third line, follows from the
assumption that $f$ is differentiable at $a$.
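The defining limit of the total derivative can be tested numerically against a Jacobian matrix. The sketch below is illustrative only: the map $f(x, y) = (x^2 y, \sin x + y)$ and all function names are assumptions chosen for the example, not taken from the notes.

```python
import math

def f(p):
    """A hypothetical smooth map R^2 -> R^2 used only for illustration."""
    x, y = p
    return (x * x * y, math.sin(x) + y)

def jacobian(p):
    """Matrix of partial derivatives of f at p, computed by hand."""
    x, y = p
    return ((2 * x * y, x * x), (math.cos(x), 1.0))

def remainder_ratio(a, h):
    """|f(a+h) - f(a) - Df(a)(h)| / |h| from the definition of Df(a)."""
    fa = f(a)
    fah = f((a[0] + h[0], a[1] + h[1]))
    J = jacobian(a)
    lin = [J[i][0] * h[0] + J[i][1] * h[1] for i in range(2)]
    err = [fah[i] - fa[i] - lin[i] for i in range(2)]
    return math.hypot(err[0], err[1]) / math.hypot(h[0], h[1])

# The ratio should shrink roughly linearly with |h|.
for s in (1e-1, 1e-2, 1e-3, 1e-4):
    print(s, remainder_ratio((1.0, 2.0), (s, s)))
```

The ratio decreasing toward 0 is what the definition of differentiability at $a$ requires when $Df(a)$ is the Jacobian matrix.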
Another important result with essentially the same proof concerns directional
derivatives. For a vector $v = (v^j) \in \mathbb{R}^n$, the directional derivative at $a$
in the direction $v$ is the vector
$$D_v f(a) = \lim_{t \to 0} \frac{f(a + tv) - f(a)}{t}.$$
(Note we do not require $\|v\| = 1$ to define the directional derivative.) We
have the following lemma:
Proposition 6. If $f = (f^1, \dots, f^m)$ has continuous first partial derivatives
$\partial f^i / \partial x^j$ on a neighborhood of $a$, then $f$ is differentiable at $a$.
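Proof. For a component function $f^i$, write
$$\begin{aligned}
f^i(a + h) - f^i(a) = {} & f^i(a^1 + h^1, a^2, \dots, a^n) - f^i(a^1, a^2, \dots, a^n) \\
& + f^i(a^1 + h^1, a^2 + h^2, \dots, a^n) - f^i(a^1 + h^1, a^2, \dots, a^n) \\
& + \cdots + f^i(a^1 + h^1, a^2 + h^2, \dots, a^n + h^n) \\
& - f^i(a^1 + h^1, a^2 + h^2, \dots, a^{n-1} + h^{n-1}, a^n).
\end{aligned}$$
Now consider the first term in terms of the function $f^i(x^1, a^2, \dots, a^n)$ of the
first variable $x^1$ alone. The Mean Value Theorem shows that there is a $b^1$
between $a^1$ and $a^1 + h^1$ so that
$$f^i(a^1 + h^1, a^2, \dots, a^n) - f^i(a^1, a^2, \dots, a^n) = h^1 \frac{\partial f^i}{\partial x^1}(b^1, a^2, \dots, a^n).$$
Similarly, for all other terms the difference equals
$$h^j \frac{\partial f^i}{\partial x^j}(a^1 + h^1, \dots, a^{j-1} + h^{j-1}, b^j, a^{j+1}, \dots, a^n)$$
for $b^j$ between $a^j$ and $a^j + h^j$. So if we set $c_j = (a^1 + h^1, \dots, a^{j-1} + h^{j-1}, b^j, a^{j+1}, \dots, a^n)$,
then we have
$$f^i(a + h) - f^i(a) = \sum_{j=1}^n h^j \frac{\partial f^i}{\partial x^j}(c_j).$$
Since $|h^j| \le |h|$ and $c_j \to a$ as $h \to 0$,
$$\lim_{h \to 0} \frac{\left| f^i(a + h) - f^i(a) - \sum_{j=1}^n \frac{\partial f^i}{\partial x^j}(a) h^j \right|}{|h|}
\le \lim_{h \to 0} \sum_{j=1}^n \left| \frac{\partial f^i}{\partial x^j}(c_j) - \frac{\partial f^i}{\partial x^j}(a) \right| = 0$$
since each $\partial f^i / \partial x^j$ is assumed to be continuous at $a$.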
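So we have proved that each component function $f^i$ is differentiable at $a$.
To show $f$ is differentiable, just note
$$\frac{\left| f(a + h) - f(a) - \frac{\partial f}{\partial x^j}(a) h^j \right|}{|h|}
\le \sum_{i=1}^m \frac{\left| f^i(a + h) - f^i(a) - \frac{\partial f^i}{\partial x^j}(a) h^j \right|}{|h|},$$
which goes to 0 as $h \to 0$.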
Recall a function is (locally) $C^1$ if its first partial derivatives are continuous.
The previous Proposition 6 shows that such functions are differentiable,
and Lemma 5 then shows that directional derivatives work as expected for
$C^1$ functions.
Now, for functions $f$ on an open subset $\Omega$ of $\mathbb{R}^m$, consider the norm
$$\|f\|_{C^1(\Omega)} = \|f\|_{C^0(\Omega)} + \sum_{i=1}^m \left\| \frac{\partial f}{\partial x^i} \right\|_{C^0(\Omega)}$$
(1) shows it suffices to prove that
$$\frac{\partial f}{\partial x^i} = g_i, \quad i = 1, \dots, m.$$
As usual, we recognize that integrating has better properties than differentiating.
For $x \in \Omega$, choose an $x_0 = x - (0, \dots, k, \dots, 0)$, where the $k > 0$
is in the $i$th slot. Since $\Omega$ is open, we may choose $k$ small enough so that the
line segment from $x_0$ to $x$ is contained in $\Omega$. Compute
$$\begin{aligned}
f(x) &= \lim_{n \to \infty} f_n(x) \\
&= \lim_{n \to \infty} \left[ f_n(x_0) + \int_{y = x^i - k}^{y = x^i} \frac{\partial f_n}{\partial x^i}(x^1, \dots, x^{i-1}, y, x^{i+1}, \dots, x^m) \, dy \right] \\
&= f(x_0) + \int_{y = x^i - k}^{y = x^i} g_i(x^1, \dots, x^{i-1}, y, x^{i+1}, \dots, x^m) \, dy \qquad (2)
\end{aligned}$$
The key step in the computation is the last one: $f_n(x_0) \to f(x_0)$ is easy,
and the integral converges by the Dominated Convergence Theorem: Since
$g_i \in C^0$, there is a constant $C$ so that $|g_i| \le C$ on $\Omega$. Moreover, since
$\frac{\partial f_n}{\partial x^i} \to g_i$ in $C^0$, there is an $N$ so that $\left| \frac{\partial f_n}{\partial x^i} - g_i \right| \le 1$ for all $n \ge N$.
Thus $\frac{\partial f_n}{\partial x^i}$ are all
bounded by the integrable function $C + 1$, and the Dominated Convergence
Theorem applies.
Now we can differentiate (2) with respect to $x^i$ and we see that $\frac{\partial f}{\partial x^i} = g_i$
at each $x \in \Omega$. This completes the proof.
The last part of the proof is of independent interest. We record it as
Proposition 8. Let $f_n$ be $C^1$ functions on a domain $\Omega \subset \mathbb{R}^m$. Then if $f_n \to f$
uniformly and $\partial f_n / \partial x^i \to g_i$ uniformly for $i = 1, \dots, m$, then $g_i = \partial f / \partial x^i$.
Remark. We can also define $C^k(\Omega, \mathbb{R}^p)$ to be the space of all functions
$f : \Omega \to \mathbb{R}^p$ so that $f$ and all its partial derivatives up to order $k$ are continuous
and bounded. The norm is given by
$$\|f\|_{C^k} = \sum_{|\alpha| \le k} \|\partial^\alpha f\|_{C^0}, \qquad (3)$$
(if some $\alpha_i = 0$, then there is no differentiation with respect to $x^i$).
We can use the same proof as above to conclude that $C^k$ is a Banach
space. In particular, we can apply the theorem to $F = (f, f_{,1}, \dots, f_{,n})$ and
then relate $\|F\|_{C^1}$ to $\|f\|_{C^2}$ to provide an inductive step.
$C^\infty$ is not a Banach space, as the analog of (3) would involve an infinite
sum.
We've used the following problem implicitly a few times above.
Homework Problem 4. Show that if $f : \mathbb{R}^n \to \mathbb{R}^m$ is differentiable at a
point $a$, then it is continuous at $a$.
Homework Problem 5. Let $f$ be a real-valued function defined on a domain
in $\mathbb{R}^2$. Show that if the second mixed partials $f_{,12} = \frac{\partial^2 f}{\partial x^1 \partial x^2}$ and $f_{,21} = \frac{\partial^2 f}{\partial x^2 \partial x^1}$
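Proof. Let $A = Df(a)$, $B = Dg(f(a))$. Now consider the remainder terms
in the definition of differentiable maps. For $h \in \mathbb{R}^l$, $k \in \mathbb{R}^m$,
$$\begin{aligned}
\phi(h) &= f(a + h) - f(a) - A(h), \\
\psi(k) &= g(f(a) + k) - g(f(a)) - B(k), \\
\rho(h) &= (g \circ f)(a + h) - (g \circ f)(a) - (B \circ A)(h).
\end{aligned}$$
Then since $f$ and $g$ are differentiable,
$$\lim_{h \to 0} \frac{|\phi(h)|}{|h|} = 0, \qquad (4)$$
$$\lim_{k \to 0} \frac{|\psi(k)|}{|k|} = 0. \qquad (5)$$
So compute
$$\begin{aligned}
\rho(h) &= g(f(a + h)) - g(f(a)) - B(A(h)) \\
&= g(f(a + h)) - g(f(a)) - B(f(a + h) - f(a) - \phi(h)) \\
&= [g(f(a + h)) - g(f(a)) - B(f(a + h) - f(a))] + B(\phi(h)) \\
&= \psi(f(a + h) - f(a)) + B(\phi(h)).
\end{aligned}$$
So then
$$\frac{|\rho(h)|}{|h|} \le \frac{|\psi(f(a + h) - f(a))|}{|h|} + \frac{|B(\phi(h))|}{|h|}.$$
$|B(\phi(h))| / |h| \to 0$ as $h \to 0$ by (4) and Lemma 9. On the other hand (5)
shows that for all $\epsilon > 0$ there is a $\delta > 0$ so that
$$|k| < \delta \implies |\psi(k)| \le \epsilon |k|.$$
Therefore if $|f(a + h) - f(a)| < \delta$ (which can be achieved if $|h| < \eta$ for some $\eta > 0$, since $f$
is continuous),
$$\frac{|\psi(f(a + h) - f(a))|}{|h|} \le \epsilon \frac{|f(a + h) - f(a)|}{|h|}
\le \epsilon \left( \frac{|A(h)|}{|h|} + \frac{|\phi(h)|}{|h|} \right).$$
Now if we let $h \to 0$, using (4) and Lemma 9,
$$\limsup_{h \to 0} \frac{|\psi(f(a + h) - f(a))|}{|h|} \le \epsilon C.$$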
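Proof. As above, denote our metric space by $X$ with metric $d$, and let
$\lambda \in (0, 1)$ be the constant for the contraction map $g$: for all $x, y \in X$,
$$d(g(x), g(y)) \le \lambda \, d(x, y).$$
First we prove uniqueness. If $x$ and $y$ are fixed points of $g$ (so $g(x) = x$,
$g(y) = y$), then
$$d(x, y) = d(g(x), g(y)) \le \lambda \, d(x, y).$$
So $(1 - \lambda) d(x, y) \le 0$. Since $\lambda < 1$ and $d(x, y) \ge 0$ (since $X$ is a metric
space), we must have $d(x, y) = 0$ and so $x = y$ (again since $X$ is a metric
space).
To prove existence of the fixed point, we consider any point $x_0 \in X$,
and consider iterates defined inductively by $x_{n+1} = g(x_n)$ for all $n \ge 0$. We
claim $x_n$ is a Cauchy sequence and the limit $x_\infty$ of $x_n$ is the fixed point. For
$n \ge m \ge 0$, compute
$$d(x_n, x_m) \le \sum_{i=m}^{n-1} d(x_{i+1}, x_i) \le \sum_{i=m}^{n-1} \lambda^i d(x_1, x_0)
\le d(x_1, x_0) \sum_{i=m}^{\infty} \lambda^i = d(x_1, x_0) \frac{\lambda^m}{1 - \lambda}.$$
(Note that in this computation, we've used the exact sum of the geometric
series, and it is crucial that $\lambda \in (0, 1)$: the geometric series diverges for
$\lambda \ge 1$.) So if $N$ is a positive integer, then for all $n, m > N$,
$d(x_n, x_m) \le d(x_1, x_0) \lambda^N / (1 - \lambda)$, and this last quantity
$d(x_1, x_0) \lambda^N / (1 - \lambda) \to 0$ as $N \to \infty$.
Thus $\{x_n\}$ is a Cauchy sequence which has a limit $x_\infty \in X$ since $X$ is a
complete metric space.
Now we prove that $x_\infty$ is a fixed point. Since $x_\infty = \lim_{i \to \infty} x_i = \lim_{i \to \infty} x_{i+1}$,
we have
$$g(x_\infty) = g\left(\lim_{i \to \infty} x_i\right) = \lim_{i \to \infty} g(x_i) = \lim_{i \to \infty} x_{i+1} = x_\infty,$$
where the second equality holds because a contraction map is continuous.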
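The iteration in this proof is easy to run. The following Python sketch is an illustration under stated assumptions: the contraction $g = \cos$ on $[0, 1]$ with constant $\lambda = \sin 1$ is a hypothetical example (cosine maps $[0, 1]$ into itself and $|g'| \le \sin 1 < 1$ there), and the bound checked is the one obtained by letting $n \to \infty$ in the geometric series estimate, $d(x_m, x_\infty) \le d(x_1, x_0) \lambda^m / (1 - \lambda)$.

```python
import math

def iterate(g, x0, n):
    """Return the iterates x_0, x_1, ..., x_n of x_{k+1} = g(x_k)."""
    xs = [x0]
    for _ in range(n):
        xs.append(g(xs[-1]))
    return xs

# Assumed example: g = cos is a contraction on [0, 1] with constant sin(1).
LAMBDA = math.sin(1.0)
xs = iterate(math.cos, 0.0, 200)
x_star = xs[-1]  # numerical stand-in for the fixed point

def a_priori_bound(m):
    """Geometric-series bound d(x_1, x_0) * lambda**m / (1 - lambda)."""
    return abs(xs[1] - xs[0]) * LAMBDA ** m / (1.0 - LAMBDA)

for m in (0, 5, 10, 20):
    print(m, abs(xs[m] - x_star), a_priori_bound(m))
```

In this example the actual errors sit well below the a priori bound, which is typical: the bound only uses the worst-case contraction constant on the whole interval.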
(a) The main point is to exhibit the Newton's method iteration as a
contraction map on a complete metric space (recall a closed subset of any
complete metric space is complete). You must find an appropriately
small neighborhood of x on whose closure Newton's method is a
contraction map.
(b) You will need the following lemma: For a $C^1$ function $g : \mathbb{R} \to \mathbb{R}$,
$$y \ne z \in [a, b] \implies \frac{|g(y) - g(z)|}{|y - z|} \le \max_{w \in [a, b]} |g'(w)|.$$
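for $\tilde y = (y^1, \dots, y^{i-1}, b^i, y^{i+1}, \dots, y^n)$, $b^i$ between $y^i$ and $y^i + k$. Since $f$ is $C^1$,
$\partial f / \partial y^i$ is continuous, $\Omega$ is compact, and $\tilde y$ stays in a compact neighborhood
of $y$, then the absolute value of the integrand is bounded by a constant $M$.
Since $\int_\Omega M \, dx < \infty$, the Dominated Convergence Theorem shows that
$$\begin{aligned}
\frac{\partial}{\partial y^i} \int_\Omega f(y, x) \, dx &= \lim_{k \to 0} \int_\Omega \frac{f(y + k e_i, x) - f(y, x)}{k} \, dx \\
&= \int_\Omega \lim_{k \to 0} \frac{f(y + k e_i, x) - f(y, x)}{k} \, dx \\
&= \int_\Omega \frac{\partial f}{\partial y^i}(y, x) \, dx.
\end{aligned}$$
To show that $\int_\Omega f(y, x) \, dx$ is $C^1$ as a function of $y$, note that its partial
derivatives
$$g_i(y) = \int_\Omega \frac{\partial f}{\partial y^i}(y, x) \, dx$$
are continuous in $y$ by the Dominated Convergence Theorem again, since if
$y \to y_0$, then
$$\begin{aligned}
\lim_{y \to y_0} g_i(y) &= \lim_{y \to y_0} \int_\Omega \frac{\partial f}{\partial y^i}(y, x) \, dx \\
&= \int_\Omega \lim_{y \to y_0} \frac{\partial f}{\partial y^i}(y, x) \, dx \\
&= \int_\Omega \frac{\partial f}{\partial y^i}(y_0, x) \, dx \\
&= g_i(y_0)
\end{aligned}$$
because $\partial f / \partial y^i$ is continuous in $y$.
Remark. The last argument also shows that if $f = f(z, x)$ is a continuous
function of $z$ and $x$, and $x \in \Omega$ a compact subset of $\mathbb{R}^n$, then the function
$$z \mapsto \int_\Omega f(z, x) \, dx$$
is continuous.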
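Differentiation under the integral sign can be checked numerically on a simple case. This Python sketch is an illustration under assumptions not in the notes: the integrand $f(y, x) = \sin(yx)$ on the compact set $[0, 1]$, a midpoint quadrature rule, and a central (rather than one-sided) difference quotient are all choices made for the example.

```python
import math

def midpoint(func, a, b, n=2000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def F(y):
    """F(y) = integral over [0, 1] of sin(y * x) dx (assumed example)."""
    return midpoint(lambda x: math.sin(y * x), 0.0, 1.0)

def dF_difference(y, k=1e-5):
    """Central difference quotient of F at y."""
    return (F(y + k) - F(y - k)) / (2 * k)

def dF_under_integral(y):
    """Integral over [0, 1] of the y-partial derivative x * cos(y * x)."""
    return midpoint(lambda x: x * math.cos(y * x), 0.0, 1.0)

print(dF_difference(1.3), dF_under_integral(1.3))
```

The two printed values should agree to many digits, as the interchange of limit and integral predicts.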
1.7 The Inverse Function Theorem
We need the following lemma first:
Lemma 12. If $f$ is a $C^1$ function from a ball $B$ in $\mathbb{R}^n$ to $\mathbb{R}^m$, which satisfies
$$\left| \frac{\partial f^i}{\partial x^j} \right| \le C$$
on $B$, then for $y, z \in B$,
(Note this argument is essentially the same as the use of the Mean Value
Theorem.) Now apply
$$|f(y) - f(z)| \le \sum_{i=1}^m |f^i(y) - f^i(z)|.$$
Now consider $g(x) = x - f(x)$ and note that $Dg(0) = 0$, the zero linear
transformation. Since $g$ is $C^1$, there is an $r > 0$ so that $|x| < 2r$ implies
$$\left| \frac{\partial g^i}{\partial x^j}(x) \right| < \frac{1}{2m^2} \quad \text{for } i, j = 1, \dots, m. \qquad (6)$$
Let $B(r) = \{x \in \mathbb{R}^m : |x| < r\}$. Then Lemma 12 and $g(0) = 0$ imply that
$g(B(r)) \subset B(r/2)$.
Now let $y \in B(r/2)$ and consider
$$g_y(x) = y + g(x) = y + x - f(x).$$
Then
$g_y(x) = x$ is equivalent to $f(x) = y$, and so a fixed point of $g_y$ is
equivalent to a solution to $f(x) = y$.
If $x \in \overline{B(r)}$, $|g_y(x)| \le |g(x)| + |y| \le r$, and so $g_y$ is a map from the
complete metric space $\overline{B(r)}$ to itself.
Lemma 12 and (6) imply $g_y$ is a contraction map (with $\lambda = 1/2$). In
other words, for $x_1, x_2 \in \overline{B(r)}$,
$$|g_y(x_1) - g_y(x_2)| \le \tfrac{1}{2} |x_1 - x_2|.$$
To show this, compute
Therefore,
$$\frac{|f^{-1}(y_1) - f^{-1}(y_2) - (Df(x_2))^{-1}(y_1 - y_2)|}{|y_1 - y_2|}
= \frac{|f^{-1}(y_1) - f^{-1}(y_2) - (Df(x_2))^{-1}(y_1 - y_2)|}{|x_1 - x_2|} \cdot \frac{|x_1 - x_2|}{|y_1 - y_2|}$$
(Note $y_1 \ne y_2$ implies $x_1 \ne x_2$ since $y_i = f(x_i)$.) This expression goes to zero
as $y_1 \to y_2$ by (8) and (9), since $f$ is differentiable at $x_2$.
Finally we show the total derivative $(Df(x))^{-1}$ is continuous in $y$. We
can think of $Df$ as a map from $x$ to $\mathbb{R}^{m^2}$, which represents the space of
$m \times m$ matrices. $Df(x)$ is continuous in $x$ ($f$ is $C^1$), and thus is continuous
in $y$. The determinant function $\det : \mathbb{R}^{m^2} \to \mathbb{R}$ is continuous, since it is a
polynomial in the matrix entries. So $\det Df(x)$ is bounded away from zero,
by compactness of $\overline{B(r)}$. We are left to prove the continuity of the matrix
inverse operation for square matrices with determinant bounded away from 0.
This follows from the formula for the inverse in terms of cofactor matrices:
Each entry of the inverse matrix $A^{-1} = (a_{ij})^{-1}$ is of the form
$$\frac{(m - 1)\text{st-order polynomial in the } a_{ij}}{\det(a_{ij})}.$$
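The cofactor formula can be made concrete in a few lines. The following Python sketch is illustrative and not from the notes; the helper names are assumptions, and exact rational arithmetic (`fractions.Fraction`) is used so that each entry is visibly a polynomial in the $a_{ij}$ divided by the determinant.

```python
from fractions import Fraction

def minor(A, i, j):
    """Matrix A with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if not A:
        return 1  # determinant of the empty (0 x 0) matrix
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def inverse(A):
    """Inverse via cofactors: entry (i, j) is (-1)**(i+j) * det(minor(A, j, i)) / det(A)."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular")
    n = len(A)
    return [[Fraction((-1) ** (i + j) * det(minor(A, j, i))) / d
             for j in range(n)] for i in range(n)]

print(inverse([[2, 1], [5, 3]]))
```

Each numerator is a determinant of an $(m-1) \times (m-1)$ minor, hence a polynomial of order $m - 1$ in the entries, which is why the inverse is continuous wherever the determinant stays away from zero.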
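(b) Use the formula for $D(f^{-1})$ to show that $f^{-1}$ is $C^\infty$.
Hints: It may be helpful to use the following notation. If $f = f(x) =
f(x^1, \dots, x^n)$, we may write $(y^1, \dots, y^n) = y = y(x) = f(x)$. And so
$f^{-1}(y) = x$ may be written simply as $x = x(y)$. To show $f^{-1}$ is $C^2$, for
example, you should write
$$\frac{\partial^2 (f^{-1})^k}{\partial y^i \partial y^j} = \frac{\partial^2 x^k}{\partial y^i \partial y^j}$$
in terms of
$$\frac{\partial f}{\partial x^i} = \frac{\partial y}{\partial x^i}, \quad \text{and} \quad \frac{\partial^2 f}{\partial x^i \partial x^j} = \frac{\partial^2 y}{\partial x^i \partial x^j}$$
and verify that the resulting expression is continuous.
Remember to use the Chain Rule, as in e.g.,
$$\frac{\partial}{\partial y^j} = \frac{\partial x^i}{\partial y^j} \frac{\partial}{\partial x^i},$$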
(a) Consider $F : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n \times \mathbb{R}^m$ defined by $F(x, y) = (x, f(x, y))$ and
apply the Inverse Function Theorem to $F$.
(c) Show that g(x) = p(x, 0) satisfies the conditions of the theorem.
Homework Problem 11. The Lipschitz constant of a locally $C^1$ function
$f : \mathbb{R} \to \mathbb{R}$ is equal to $\sup_{x \in \mathbb{R}} |f'(x)|$.
Hint: To show the two quantities are equal, you need to relate the sup of
the derivative to the sup of the difference quotients. To relate the derivative
$f'(x)$ to difference quotients, use the definition of the derivative. To relate a
given difference quotient to a derivative, use the Mean Value Theorem.
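A quick numerical experiment can make the statement plausible before proving it. The Python sketch below is an illustration only: the test function $\sin$ on $[-3, 3]$, the grid size, and the helper names are assumptions. It compares the largest difference quotient over adjacent grid points with the largest sampled value of $|f'|$.

```python
import math

def max_difference_quotient(f, a, b, n):
    """Largest |f(x_{k+1}) - f(x_k)| / (x_{k+1} - x_k) over a uniform grid."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    return max(abs(f(xs[k + 1]) - f(xs[k])) / (xs[k + 1] - xs[k]) for k in range(n))

def max_abs_derivative(df, a, b, n):
    """Largest sampled value of |f'| on the same uniform grid."""
    return max(abs(df(a + (b - a) * k / n)) for k in range(n + 1))

q = max_difference_quotient(math.sin, -3.0, 3.0, 600)
s = max_abs_derivative(math.cos, -3.0, 3.0, 600)
print(q, s)  # q stays below s and approaches it as the grid is refined
```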
The previous problem shows that any differentiable function with bounded
derivative is Lipschitz. The converse is false, as we see in the following
example.
Example 3. The function $x \mapsto |x|$ is a Lipschitz function from $\mathbb{R}$ to $\mathbb{R}$. This
follows from the observation that for each $x \ne y \in \mathbb{R}$,
$$\frac{\big| |x| - |y| \big|}{|x - y|} \le 1.$$
Hint: The point of this problem is that there should be no uniform $L < 1$
which works for all $x$ and $y$. To construct such a function $f$, use Problem
11 above. In particular, first construct the derivative $f'$ and then integrate
to find $f$. (You'll need $\sup_x |f'(x)| = 1$; why?) Use the Mean Value Theorem
to relate values of $f'$ to difference quotients.
$$x, y \in C, \ t \in [0, 1] \implies tx + (1 - t)y \in C.$$
Proof. Any ball is convex (see the following homework problem), and so if
f is C 1 on a small ball, then it is Lipschitz on the ball by the previous
Proposition 15.
Homework Problem 13. Show that any ball $B_x(r) = \{y \in \mathbb{R}^n : |y - x| < r\}$
is convex.
$$\frac{d_Y(f(x), f(x'))}{d_X(x, x')},$$
$$\frac{d_Y(f(x), f(x'))}{d_X(x, x')} \le L$$
and $f$ is Lipschitz on $K$.
2 Ordinary Differential Equations
2.1 Introduction
An ordinary differential equation (an ODE ) is an equation of the form
$$x = (x^1, \dots, x^m) : I \to \mathbb{R}^m,$$
For other values of g(y), compute
$$\begin{aligned}
\frac{dy}{dx} &= x^2 y, \\
\frac{dy}{y} &= x^2 \, dx, \\
\int \frac{dy}{y} &= \int x^2 \, dx, \\
\ln |y| &= \frac{x^3}{3} + C, \\
y &= \pm e^C e^{x^3/3} = C' e^{x^3/3},
\end{aligned}$$
If we let $C'$ be any real number, then we capture both cases above, and
the general solution is $y = C' e^{x^3/3}$.
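The general solution can be sanity-checked numerically. The following Python sketch is an illustration only; the function names and the central-difference step size are assumptions. It evaluates the residual $y' - x^2 y$ for several values of the constant, including negative values and zero.

```python
import math

def y(x, C):
    """Candidate general solution y = C * exp(x**3 / 3)."""
    return C * math.exp(x ** 3 / 3.0)

def residual(x, C, h=1e-6):
    """Central-difference estimate of y' minus x**2 * y; should be near 0."""
    dydx = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return dydx - x ** 2 * y(x, C)

for C in (-2.0, 0.0, 0.5, 3.0):
    print(C, [residual(x, C) for x in (-1.0, 0.0, 0.7, 1.5)])
```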
(b) Find the particular solution passing through (x, y) = (1, 1).
(c) Find the particular solution passing through (x, y) = (1, 1). (Hint:
What is the formula for $\tan(\theta + \frac{\pi}{2})$?)
We'll see below that if $v$ satisfies a Lipschitz condition, then for $t$ in a small
interval around $t_0$, there is a unique solution to the initial value problem.
Example 7. The differential equation $\dot x = x^2 + t$ has no solution which can
be written down in terms of standard algebraic and transcendental functions
(such as roots, exponentials, trigonometric functions). Theorem 5 states that
there is a local solution for every initial value problem. For example, for
initial conditions $x(0) = 1$, there is a solution valid on an open interval
containing $t = 0$.
Theorem 5 does not guarantee a solution which is valid for all time $t$ (see
Example 6 above). In fact the solution for the present initial-value problem
will also blow up in finite time. This is basically because for $t \ge 0$, $\dot x =
x^2 + t \ge x^2$, and so the solution should grow faster than the solution to
Example 6, which goes to infinity in finite time.
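The comparison in Example 7 can be observed numerically. The sketch below is a hedged illustration: it assumes the comparison problem is $\dot x = x^2$, $x(0) = 1$, with solution $1/(1 - t)$, and it uses a classical fourth-order Runge-Kutta integrator, which is a choice made here and not a method from the notes. The integration stops at $t = 0.5$, well before either solution blows up.

```python
def rk4(v, x0, t0, t1, steps):
    """Integrate x' = v(x, t) from t0 to t1 with classical RK4; return x(t1)."""
    h = (t1 - t0) / steps
    x, t = x0, t0
    for _ in range(steps):
        k1 = v(x, t)
        k2 = v(x + 0.5 * h * k1, t + 0.5 * h)
        k3 = v(x + 0.5 * h * k2, t + 0.5 * h)
        k4 = v(x + h * k3, t + h)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        t += h
    return x

def phi(t1):
    """Numerical solution of x' = x**2 + t, x(0) = 1, evaluated at t1."""
    return rk4(lambda x, t: x * x + t, 1.0, 0.0, t1, 2000)

def psi(t):
    """Exact solution 1 / (1 - t) of the assumed comparison problem x' = x**2, x(0) = 1."""
    return 1.0 / (1.0 - t)

for t in (0.1, 0.3, 0.5):
    print(t, phi(t), psi(t))
```

At each sampled time the numerical solution of $\dot x = x^2 + t$ lies above $1/(1 - t)$, consistent with the heuristic that it must blow up no later than $t = 1$.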
Proof of Theorem 5. The idea of the proof is to set up the problem in terms
of a contraction mapping. We first find an iteration whose fixed point solves
the differential equation and then find an appropriate complete metric space
on which the iteration is a contraction map.
For a continuous $\mathbb{R}^n$-valued function $\phi$ defined on a neighborhood of $t_0$,
let $A\phi$ be another such function defined as follows:
$$(A\phi)(t) = x_0 + \int_{t_0}^t v(\phi(\tau), \tau) \, d\tau. \qquad (12)$$
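The operator in (12) can be iterated by hand in simple cases. The following Python sketch is an illustration under stated assumptions: it takes the hypothetical scalar problem $\dot x = x$, $x(0) = 1$ (so $v(x, t) = x$, $x_0 = 1$, $t_0 = 0$) and represents each iterate as a list of polynomial coefficients in $t$.

```python
from fractions import Fraction

def picard_step(coeffs):
    """Apply A to the polynomial with the given coefficients, for v(x, t) = x, x0 = 1, t0 = 0.

    Integrating c_k * t**k from 0 to t gives c_k / (k + 1) * t**(k + 1).
    """
    integral = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]
    integral[0] += 1  # add the initial value x0 = 1
    return integral

def picard_iterates(n):
    """Start from the constant function 1 and apply A a total of n times."""
    phi = [Fraction(1)]
    for _ in range(n):
        phi = picard_step(phi)
    return phi

print(picard_iterates(4))  # coefficients 1, 1, 1/2, 1/6, 1/24
```

In this example the iterates are the Taylor partial sums of $e^t$, so the fixed point of $A$ is the expected solution of the initial value problem.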
Proof. First of all, $C^0(\bar I, \mathbb{R}^n)$ is complete by Proposition 1. Moreover, the
conditions imposed give closed subsets of the Banach space $C^0$. The second
condition is obviously closed since the norm on any Banach space is continuous.
To check the condition $\phi(t_0) = x_0$ is closed, use the following lemma,
whose proof is immediate:
Lemma 20. For a metric space $J$ and $y \in J$, the map from the Banach
space $C^0(J, \mathbb{R}^n)$ to $\mathbb{R}^n$ given by $f \mapsto f(y)$ is continuous.
Since these two conditions are closed, $X$ is a closed subset of the complete
metric space $C^0(\bar I, \mathbb{R}^n)$, and so is complete with the induced metric.
Remark. Lemma 20 is false for the Banach space $L^\infty$. Why?
So we have proved that X is a complete metric space. Next we show
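Lemma 21. For $\epsilon > 0$ small enough, $A : X \to X$.
Proof. First of all, choose $\delta > 0$ so that $[t_0 - \delta, t_0 + \delta] \subset I$. Since $v$ is
continuous and $\{x : |x - x_0| \le P\} \times [t_0 - \delta, t_0 + \delta]$ is compact, there is a
constant $M$ so that
$$|v(x, t)| \le M \quad \text{for } |x - x_0| \le P, \ |t - t_0| \le \delta.$$
In order for this bound to work below, we must have $\epsilon \le \delta$ (so then
$\bar I \subset [t_0 - \delta, t_0 + \delta]$). To check $A : X \to X$, we need to check for each $\phi \in X$,
$$(A\phi)(t_0) = x_0 \quad \text{and} \quad |(A\phi)(t) - x_0| = \left| \int_{t_0}^t v(\phi(\tau), \tau) \, d\tau \right| \le M |t - t_0| \le M \epsilon \le P.$$
So $A : X \to X$ if $\epsilon \le \min\{\delta, P/M\}$.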
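Finally we use the Lipschitz hypothesis on $v$ to show that $A$ is a
contraction map. Let $L$ be the Lipschitz constant for $v$. Then for $\phi, \psi \in X$,
compute
$$\begin{aligned}
|(A\phi)(t) - (A\psi)(t)| &= \left| \int_{t_0}^t [v(\phi(\tau), \tau) - v(\psi(\tau), \tau)] \, d\tau \right| \\
&\le \left| \int_{t_0}^t |v(\phi(\tau), \tau) - v(\psi(\tau), \tau)| \, d\tau \right| \\
&\le \left| \int_{t_0}^t L |\phi(\tau) - \psi(\tau)| \, d\tau \right| \\
&\le L \|\phi - \psi\|_{C^0} |t - t_0| \\
&\le L \epsilon \|\phi - \psi\|_{C^0}.
\end{aligned}$$
Then since $\|A\phi - A\psi\|_{C^0} = \sup_{t \in \bar I} |(A\phi)(t) - (A\psi)(t)|$, we see that
$$\|A\phi - A\psi\|_{C^0} \le L \epsilon \|\phi - \psi\|_{C^0}.$$
So $A$ is a contraction map if $\epsilon < 1/L$. Thus all together, if we require
$\epsilon < \min\{\delta, P/M, 1/L\}$, then $A$ is a contraction map on $X$, and its fixed
point is a solution to the initial value problem.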
In order to show uniqueness of the initial value problem, note that the
Contraction Mapping Theorem automatically proves that any two continuous
solutions $\phi_1$ and $\phi_2$ to the initial value problem from $\bar I$ to $\mathbb{R}^n$ must coincide
if the additional constraint
$$\sup_{t \in \bar I} |\phi(t) - x_0| \le P$$
is satisfied. Since $\phi_1$ and $\phi_2$ are continuous and satisfy the initial condition,
this condition is automatically satisfied for both $\phi_1$ and $\phi_2$ on a (perhaps
smaller) interval $\tilde I \subset \bar I$ containing $t_0$. Then uniqueness applies on this smaller
interval, since $A$ is a contraction map for any $\epsilon$ small enough. Note that the
interval $\tilde I$ on which $\phi_1 = \phi_2$ may depend on $\phi_1$ and $\phi_2$. The proof that the
two solutions must coincide on all of $\bar I$ depends on the Extension Theorem 6
below.
We record what we have proven so far with respect to uniqueness here.
Proposition 22. Any two solutions $\phi_1$ and $\phi_2$ to the initial value problem
(11) coincide on a small interval containing $t_0$. The interval may depend on
the solutions $\phi_1$ and $\phi_2$.
Remark. Note that in the proof of the previous theorem, we only use that
$v$ is Lipschitz in the $x$ variables (with a uniform Lipschitz constant
valid for all $t$). We still require $v$ to be continuous in $t$.
The previous theorem provides a continuously differentiable solution on
an interval $\bar I$ containing the initial time $t_0$ and proves uniqueness on a
(perhaps) smaller interval $\tilde I$. There is a satisfactory more global theory of ODEs
which we detail in the next subsection.
Proof. Recall that in the proof of Theorem 5, any $\epsilon < \min\{\delta, P/M, 1/L\}$
works. By compactness of $K$ and since $I$ is open, we can choose a uniform
$\delta > 0$ so that for all $(x_0, t_0) \in K$, $[t_0 - \delta, t_0 + \delta] \subset I$. We may choose $P$ to
be any positive number (since $O = \mathbb{R}^n$ in the present case). The Lipschitz
constant $L = L_K$ is uniform over any compact set $K$ by the locally Lipschitz
property of $v$ (Proposition 17). Let
$$M = \sup_{(x, t) \in \tilde K} |v(x, t)|,$$
where
$$\tilde K = \{(x, t) \in \mathbb{R}^{n+1} : \exists (x_0, t_0) \in K : |t - t_0| \le \delta, \ |x - x_0| \le P\}.$$
(In order to have a single $M$ work for all $(x_0, t_0) \in K$, we must let
$(x, t) \in \tilde K$. $L$ must be valid on all of $\tilde K$ as well, since we consider integrals
from $t_0$ to $t$, where $(x_0, t_0) \in K$, $|t - t_0| < \epsilon$.)
Now we must ensure that $\epsilon < \min\{\delta, P/M, 1/L\}$. All of these quantities
can be chosen independently of $(x_0, t_0) \in K$.
Lemma 24 (Gluing solutions). Consider any two solutions to $\dot x = v(x, t)$
which are defined on intervals in $\mathbb{R}$. If the two coincide on any interval
in $\mathbb{R}$ then they must coincide on the entire intersection of their intervals of
definition. Thus they can be glued together to form a solution on the union
of their intervals of definition.
Proof. Consider two solutions $\phi_1$, $\phi_2$ to $\dot x = v(x, t)$ defined on intervals $I_1$
and $I_2$. Assume they coincide on an interval $I_3 \subset I_1 \cap I_2$. We want to show
$\phi_1 = \phi_2$ on all of $I_1 \cap I_2$. Let $I_4$ be the largest interval containing $I_3$ on which
$\phi_1$ and $\phi_2$ coincide (take $I_4$ to be the path-connected component of the closed
set $\{t : \phi_1(t) = \phi_2(t)\}$ containing $I_3$). Now we will show that $I_4 = I_1 \cap I_2$.
Assume $I_4 \ne I_1 \cap I_2$. Then since $I_4$ is a relatively closed subinterval of
$I_1 \cap I_2$, there is an endpoint $T$ of $I_4$ in the interior of $I_1 \cap I_2$. Now $\phi_1$ and $\phi_2$
are both solutions of
so $E = I_+$ since $I_+$ is connected). $E$ is nonempty by Theorem 5 and Lemma
24 above.
To show $E$ is open in $I_+$, let $T \in E$. Then there is a unique solution $\phi$
defined on $[t_0, T)$. First we note that $(t_0, T] \subset E$. To see this, let $T' \in (t_0, T]$.
Then the restriction of $\phi = \phi_T$ to $[t_0, T')$ is a solution to (13) on $[t_0, T')$.
Moreover, it is unique, since any other solution to (13) on $[t_0, T')$ agrees with
$\phi$ on a neighborhood of $t_0$, and so Lemma 24 shows they must agree on all
$[t_0, T')$.
So to show $E$ is open, we may restrict our attention to times larger than
$T$. Since $|\phi|$ is uniformly bounded by $C$ and $[t_0, T]$ is a compact subinterval
of $I$, we may apply Lemma 23 to show there is uniform $\epsilon$ so that any solution
to the differential equation with initial condition $x(\tau) = \xi$ for $\tau \in [t_0, T]$,
$|\xi| \le C$ must exist on $[\tau - \epsilon, \tau + \epsilon]$. Now we may consider the initial value
problem
$$\dot x = v(x, t), \quad x(T - \tfrac{\epsilon}{2}) = \phi(T - \tfrac{\epsilon}{2}). \qquad (14)$$
So Lemma 23 shows there is a solution $\psi$ to this initial value problem which
exists on $[T - \tfrac{3\epsilon}{2}, T + \tfrac{\epsilon}{2}]$. Moreover, Lemma 24 says that $\phi = \psi$ on the
intersection of their intervals of definition, and moreover, that $\phi$ may be
extended by $\psi$ to a solution on $[t_0, T + \tfrac{\epsilon}{2}]$. Lemma 24 also implies this
extension is unique on every subinterval containing $t_0$, and so in particular
$[T - \tfrac{\epsilon}{2}, T + \tfrac{\epsilon}{2}] \subset E$ and $E$ is open.
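[Figure: the time axis showing the intervals $[t_0, T]$ and $[T - \tfrac{3\epsilon}{2}, T + \tfrac{\epsilon}{2}]$, with the points $T - \tfrac{\epsilon}{2}$ and $T$ marked.]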
This Lemma 25 completes the proof of the Extension Theorem 6, at least
for solutions moving forward in time. The reason is this: if there is a time
T I+ J (we may choose I+ since we are only moving forward in time),
then
$$E = J_+ \ne I_+.$$
Therefore, by the contrapositive of Lemma 25, $\sup_E |\phi| = \infty$. But since $\phi$ is
continuous on $[t_0, T)$, we must have $\limsup_{t \to T} |\phi(t)| = \infty$.
The argument for solutions moving backward in time is the same.
The above theorem may be improved as follows:
Proof. Let $\phi_1$ and $\phi_2$ be the two solutions. If their graphs cross
at $(t_0, x_0)$, then they both solve the initial value problem
\[ \dot{x} = v(x, t), \qquad x(t_0) = x_0. \]
The solutions must coincide on a small interval by Proposition 22, and then
must coincide on the whole intersection of their intervals of definition by
Lemma 24.
Hint: See Examples 6 and 7 above. Let $\phi(t)$ be the solution to the
current initial value problem. We will compare $\phi$ to the solution
$\psi(t) = \frac{1}{1-t}$ of the initial value problem $\dot{x} = x^2$,
$x(0) = 1$. Let $J$ be the maximal interval on which $\phi$ can be extended.
Let $J_+ = J \cap (0, \infty)$; $T$ is then the positive endpoint of $J_+$.
Now consider the interval
blows up at a time T < 1. Then note that parts (a)-(e) can be repeated
to show that J+ (0, T).
Proposition 27. Consider the equation $\dot{x} = A(t)x$, where $A(t)$ is a
continuous $n \times n$ matrix valued function of $t$, and
$x(t) \in \mathbb{R}^n$. For each $t_0$, there is an interval $I \ni t_0$ so
that the space of solutions $\phi(t)$ on $I$ has dimension $n$. Consider an
initial value condition $x(t_0) = x_0$. Let $\phi_{x_0}(t)$ be the solution to
this initial value problem. Then the map $S : x_0 \mapsto \phi_{x_0}$ is a
linear isomorphism from $\mathbb{R}^n$ to the space of solutions defined on
$I$.
Remark. It is not too hard to show that the interval I can be taken to be
the maximal open interval containing t0 on which A(t) is continuous. (See
Michael Taylor, Partial Differential Equations, Basic Theory.)
Proof. $A(t)x$ is locally Lipschitz in $x$ and continuous in $t$, as needed
for Theorems 5 and 6. First of all, for a basis $\xi_i$ of $\mathbb{R}^n$, let
$I$ be a small interval on which all the solutions $\phi_{\xi_i}$ exist. Note
the map $x_0 \mapsto \phi_{x_0}$ is obviously linear. $S$ is injective since
if $x_0 \neq y_0$, $\phi_{x_0}(t_0) \neq \phi_{y_0}(t_0)$, and thus
$\phi_{x_0} \neq \phi_{y_0}$. Therefore, if $x_0 = a^i \xi_i$,
$\phi_{x_0} = a^i \phi_{\xi_i}$. Again by uniqueness, any solution to
$\dot{x} = A(t)x$ is determined by the initial value $\phi(t_0) = x_0$, and so
$S$ is onto.
Given a linear equation $\dot{x} = A(t)x$, for $x = x(t) \in \mathbb{R}^n$, we
can consider a similar equation $\dot{X} = A(t)X$ for $X = X(t)$ an
$n \times n$ matrix valued function. The solution $\Phi(t)$ of the initial
value problem
\[ \dot{X} = A(t)X, \qquad X(t_0) = I \text{ the identity matrix}, \]
is called the fundamental solution of the equation $\dot{x} = A(t)x$. It is
straightforward to see that the $i$th column of $\Phi(t)$ is the solution to
$\dot{x} = A(t)x$, $x^j(t_0) = \delta^j_i$. Moreover, the fundamental solution
can be used to compute any solution to the differential equation near $t_0$.
Lemma 28. On the maximal interval of existence of the fundamental solution
$\Phi(t)$ of $\dot{x} = A(t)x$, the solution to the initial value problem
\[ \dot{x} = A(t)x, \qquad x(t_0) = x_0, \]
is given by $\Phi(t)x_0$.
Proof. The proof is an immediate calculation.
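As a numerical illustration of Lemma 28, the following sketch builds the fundamental solution column by column and compares $\Phi(t)x_0$ with a direct integration of the same initial value problem. The coefficient matrix `A(t)`, the initial vector and the Runge-Kutta integrator are assumptions chosen only for illustration; they do not come from the notes.

```python
def A(t):
    # Hypothetical time-dependent coefficient matrix (not from the notes).
    return [[0.0, 1.0], [-1.0, -t]]

def rhs(t, x):
    # Right-hand side of x' = A(t) x for a 2-vector x.
    a = A(t)
    return [a[0][0] * x[0] + a[0][1] * x[1],
            a[1][0] * x[0] + a[1][1] * x[1]]

def rk4(f, y, t0, t1, steps=2000):
    # Classical fourth-order Runge-Kutta integration from t0 to t1.
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k1)])
        k3 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k2)])
        k4 = f(t + h, [yi + h * k for yi, k in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

def fundamental_solution(t0, t1):
    # Column i of Phi(t1) solves x' = A(t) x with x(t0) = e_i.
    c0 = rk4(rhs, [1.0, 0.0], t0, t1)
    c1 = rk4(rhs, [0.0, 1.0], t0, t1)
    return [[c0[0], c1[0]], [c0[1], c1[1]]]

t0, t1 = 0.0, 1.5
x0 = [2.0, -1.0]
Phi = fundamental_solution(t0, t1)
via_phi = [Phi[0][0] * x0[0] + Phi[0][1] * x0[1],
           Phi[1][0] * x0[0] + Phi[1][1] * x0[1]]
direct = rk4(rhs, x0, t0, t1)
print(via_phi, direct)
```

The two printed vectors should agree up to rounding, since the integrator is linear in the initial data, just as the solution map is.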
Homework Problem 16. An inhomogeneous linear system is a system of
the form
\[ \dot{x} = A(t)x + b(t), \tag{15} \]
where $A(t)$ and $x$ are as above and $b(t)$ is a continuous
$\mathbb{R}^n$-valued function.
(a) Let $\psi(t)$ be a solution to (15). Show that the solution space to (15)
is equal to
\[ \{\psi(t) + \phi(t) : \phi(t) \text{ solves } \dot{x} = A(t)x\}. \]
(b) In dimension 1, let $\Phi(t)$ be the fundamental solution to
$\dot{x} = A(t)x$. Show that the general solution to (15) is
\[ \Phi(t)\left(\int \frac{b(t)}{\Phi(t)}\,dt + C\right). \]
Homework Problem 17. Let $B$ be the $n \times n$ Jordan block matrix
\[ \begin{pmatrix}
\lambda & 1 & 0 & \cdots & 0 \\
0 & \lambda & 1 & \cdots & 0 \\
0 & 0 & \lambda & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \lambda
\end{pmatrix} \tag{16} \]
with $\lambda$ on the diagonal, 1 just above the diagonal, and 0 elsewhere.
Find the fundamental solution $e^{tB}$ to $\dot{x} = Bx$.
Hint: Write out the system of equations in terms of components. Note
that $\dot{x}^n$ only involves $x^n$ and not any other $x^i$. So first solve
the appropriate initial value problems for $x^n$ (you'll need to do one
initial value problem for each column of the identity matrix $I$). Then do
$x^{n-1}$, then $x^{n-2}$, etc., and find a formula that works for all $x^i$.
Alternatively, it is possible to write out $e^{tB}$ as a power series. If you
approach the problem this way, you must check to be sure your answer works.
Remark. $A$ is diagonalizable if and only if each Jordan block is
$1 \times 1$. If the characteristic polynomial of $A$ has distinct roots,
then $A$ is diagonalizable, but the converse is false in general ($A = I$ the
identity matrix is a counterexample).
Homework Problem 18. Assume that all the eigenvalues of the $n \times n$
matrix $A$ have negative real part. ($A$ is not necessarily diagonalizable.)
Show that $e^{tA} \to 0$ as $t \to \infty$. (Just check that each entry in
the matrix $e^{tA}$ goes to 0.)
Homework Problem 19. Solve the initial value problem
2.5 Regularity
Regularity of a function refers to how many times the function may be
differentiated. A function is (locally) $C^k$ if it and all of its partial
derivatives up to order $k$ are continuous. A function is $C^\infty$ if it and
all of its partial derivatives of all orders are continuous. For the purposes
of this course, a function is smooth if it is $C^\infty$ (in other settings a
function may be called smooth if it has as many derivatives as the purpose at
hand requires). There are other notions of regularity in which the function
and perhaps its derivatives, suitably defined, are in $L^p$ or other Banach
spaces.
A vector-valued function is smooth or $C^k$ if and only if each of its
component functions is smooth or $C^k$ respectively.
Theorem 9. Assume $v : O \times I \to \mathbb{R}^n$ is smooth
($O \subset \mathbb{R}^n$ is a domain and $I \subset \mathbb{R}$ is an open
interval). Any solution $\phi$ to $\dot{x} = v(x, t)$ is smooth.
Proof. Let $\phi$ be a solution. Since $\dot\phi$ exists, then $\phi$ is
differentiable, and thus continuous. Since $v$ is continuous as well,
$\dot\phi = v(\phi, t)$ is continuous and so $\phi$ is (locally) $C^1$. Now
since $v$ is smooth, we may differentiate to find
\[ \ddot\phi(t) = \frac{\partial v}{\partial x^i}(\phi, t)\,\dot\phi^i(t) + \frac{\partial v}{\partial t}(\phi, t). \]
Now since $\phi$ and $\dot\phi$ and the partial derivatives of $v$ are
continuous, we see that $\ddot\phi$ is continuous and $\phi$ is (locally)
$C^2$. Since $v$ is smooth, we can keep differentiating, using the chain and
product rules, to find by induction $d^m\phi/dt^m$ is continuous for all $m$
and so $\phi$ is $C^\infty$.
Remark. The technique used in the proof of Theorem 9 above is called
bootstrapping. In this process, once we know that $\phi$ is $C^0$, we plug
into the equation to find that $\phi$ is $C^1$. Then we use the fact that
$\phi$ is $C^1$ to prove $\phi$ is $C^2$, etc.
Remark. The proof above also shows that if $v$ is $C^k$, then $\phi$ is
$C^{k+1}$.
\[ x^{(m)} = v(x^{(m-1)}, \dots, \dot{x}, x, t), \tag{17} \]
where of course $x^{(m)} = \frac{d^m x}{dt^m}$. There is an easy trick to
transform this system to an equivalent first-order system with more
variables. Let $y^1 = \dot{x}, \dots, y^{m-1} = x^{(m-1)}$. Then it is easy to
see the system (17) above is equivalent to the system
\[ \begin{aligned}
\dot{y}^{m-1} &= v(y^{m-1}, \dots, y^1, x, t), \\
\dot{y}^{m-2} &= y^{m-1}, \\
&\ \,\vdots \\
\dot{y}^1 &= y^2, \\
\dot{x} &= y^1.
\end{aligned} \tag{18} \]
So for an $m$th order differential equation, we need initial conditions for
the function and its derivatives up to order $m - 1$.
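The reduction to a first-order system can be sketched in code as follows. The helper `to_first_order`, the Runge-Kutta integrator and the test equation $\ddot{x} = -x$ are our own illustrative choices, not part of the notes; the state is stored as the list $[x, y^1, \dots, y^{m-1}]$.

```python
import math

def to_first_order(v, m):
    # State s = [x, y1, ..., y_{m-1}], where y_j stands for the j-th derivative of x.
    def rhs(t, s):
        assert len(s) == m
        lower = s[1:]                # x' = y1, y1' = y2, ..., y_{m-2}' = y_{m-1}
        top = v(*reversed(s), t)     # y_{m-1}' = v(y_{m-1}, ..., y1, x, t)
        return lower + [top]
    return rhs

def rk4(f, y, t0, t1, steps=1000):
    # Classical fourth-order Runge-Kutta integration from t0 to t1.
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k1)])
        k3 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k2)])
        k4 = f(t + h, [yi + h * k for yi, k in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

# Second-order example x'' = -x with x(0) = 0, x'(0) = 1, whose solution is sin t.
rhs = to_first_order(lambda xdot, x, t: -x, 2)
x_end, xdot_end = rk4(rhs, [0.0, 1.0], 0.0, 1.0)
print(x_end, math.sin(1.0))
```

Note that the initial state needs both $x(0)$ and $\dot{x}(0)$, matching the count of initial conditions for an $m$th order equation.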
Remark. The trick of introducing new variables into a system of ODEs is
standard in physics. For a particle at position $x = x(t)$, a typical equation
involves how a force acts on the particle. The sum $F$ of the forces acting on
the particle must be equal to $m\ddot{x}$, where $m$ is a constant called the
mass. It is standard to introduce a new vector quantity, called the momentum
$q = m\dot{x}$. Then $F = m\ddot{x}$ is equivalent to the system
\[ \dot{q} = F, \qquad \dot{x} = \frac{q}{m}. \]
Again, an important class of examples is linear equations with constant
coefficients. If
\[ \lambda^m + a_{m-1}\lambda^{m-1} + \cdots + a_1\lambda + a_0. \]
If all the roots $\lambda_k$ are distinct, then $\{e^{\lambda_k t}\}$ form a
basis. If a root $\lambda_k$ is repeated $l$ times, then we must consider
functions of the form $t^j e^{\lambda_k t}$ for $j = 0, \dots, l - 1$ to form
a basis of the solution space.
Euler's formula again allows us to handle complex roots of the characteristic
equation.
Homework Problem 20. For which real values of the constants $a$ and $b$ do
all the solutions to
\[ \ddot{x} + a\dot{x} + bx = 0 \]
go to 0 as $t \to \infty$? Prove your answer, and draw your answer as a
region in the $(a, b)$ plane.
First of all we remark that there is a neighborhood $N$ of $(x_0, t_0)$ in
$\mathbb{R}^{n+1}$ and an $\epsilon > 0$ so that every solution to the
equation with initial condition $x(\tau) = y$ for $(y, \tau) \in N$ exists on
$[\tau - \epsilon, \tau + \epsilon]$ by Lemma 23. This existence on a
neighborhood allows us to consider taking derivatives in $y$ in what follows.
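Theorem 11. Let $v$ be a $C^2$ function on a neighborhood of the initial
conditions $(y, t_0) \in \mathbb{R}^n \times \mathbb{R}$. Then the solution
$\phi = \phi(y, t)$ to the initial value problem
\[ \dot{x} = v(x, t), \qquad x(t_0) = y, \]
is $C^1$ in $y$.
Proof. If $\partial\phi/\partial y^i$ exists, then it must satisfy
\[ \frac{\partial}{\partial t} D_y\phi = D_x v(\phi, t)\, D_y\phi. \]
(Here $D_y\phi$ is the total derivative matrix with respect to the $y$
variables. So its entries are $\partial\phi^j/\partial y^i$.) So
$\Phi = (\phi, D_y\phi) = (x, z)$ should satisfy the initial value problem
\[ \begin{aligned}
\dot{x} &= v(x, t), \\
\dot{z} &= D_x v(x, t)\, z, \\
x(t_0) &= y, \\
z(t_0) &= I \text{ the identity matrix.}
\end{aligned} \tag{20} \]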
(see Proposition 11 above), we can easily check that
$D_y\phi_n = \zeta_n$ implies $D_y\phi_{n+1} = \zeta_{n+1}$.
We know by the proof of Theorem 5 that $\phi_n \to \phi$ and
$\zeta_n \to \zeta$ uniformly on a small interval containing $t_0$. Then
Proposition 8 shows that $\partial\phi/\partial y^i = \zeta_i$ the $i$th
component of $\zeta$ for $i = 1, \dots, n$. Since these partial derivatives
are continuous (the uniform limit of continuous functions is continuous), then
Proposition 6 shows $D_y\phi = \zeta$.
Remark. The previous theorem is true if we assume v is only C 1 and not
necessarily C 2 . The proof is more involved in the case v is only C 1 . (See
Taylor, Partial Differential Equations, Basic Theory, section 1.6.)
A bootstrapping argument can be used to prove the following
is $C^{r-1}$ in $y$.
Now analyze the right-hand side of the equations in (21). They are $C^r$
functions of $x, z, t$. Therefore, Proposition $T_r$ shows that
$z = D_y\phi$ is locally $C^{r-1}$ in $y$. Since the first partial
derivatives of $\phi$ are $C^{r-1}$, $\phi$ is $C^r$. This proves the
inductive step, and the proposition.
We also have the following
Theorem 12. Let $r \ge 2$. If $v(x, t)$ is $C^r$ jointly in $x$ and $t$, and
if $\phi$ is the solution to $\dot{x} = v(x, t)$, $x(t_0) = y$, then $\phi$ is
jointly $C^{r-1}$ in $y$, $t$ and $t_0$.
Idea of proof. The difficult part is already done (the $C^{r-1}$ dependence
on $y$). For the rest, recall that any solution $\phi = \phi(y, t_0, t)$
satisfies
\[ \phi = y + \int_{t_0}^{t} v(\phi(y, t_0, \tau), \tau)\, d\tau. \]
\[ \dot{x} = v(x, t, \lambda), \qquad x(t_0) = x_0 \]
is smooth as a function of $\lambda$.
Hint: Show that this initial value problem is equivalent to the problem
2.8 Autonomous equations
An ODE system of the form $\dot{x} = v(x)$ is autonomous. In other words, a
system is autonomous if there is no explicit dependence on $t$. The main fact
about autonomous systems is the following proposition, whose proof is an
easy computation:
Proposition 31. If $\phi$ is a solution to $\dot{x} = v(x)$, then for all
$T \in \mathbb{R}$, $\tilde\phi(t) = \phi(t + T)$ is also a solution.
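For reference, the computation behind Proposition 31 can be written out as follows. This is our own one-line sketch, writing $\tilde\phi(t) = \phi(t + T)$ for the translated curve; it uses only the chain rule and the fact that $v$ has no explicit dependence on $t$.

```latex
\[
\frac{d}{dt}\tilde\phi(t)
  = \frac{d}{dt}\phi(t + T)
  = \dot\phi(t + T)\cdot 1
  = v(\phi(t + T))
  = v(\tilde\phi(t)).
\]
```

If $v$ depended explicitly on $t$, the fourth expression would read $v(\phi(t+T), t+T)$, which is in general not $v(\tilde\phi(t), t)$, so the argument is special to autonomous systems.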
If $x(0) \in (-1, 1)$, then $A < 0$, and the solution $x$ exists for all time
and is bounded between the equilibrium solutions at $-1$ and $1$. Moreover,
$x$ approaches the equilibrium solutions $x \to -1$ as $t \to \infty$ and
$x \to 1$ as $t \to -\infty$. If $x(0) > 1$, then $A \in (0, 1)$ and the
solution exists only for $t \in (-\infty, -\tfrac12 \ln A)$. If $x(0) < -1$,
then $A > 1$ and the solution exists only for
$t \in (-\tfrac12 \ln A, \infty)$.
This behavior is typical of the behavior of autonomous equations for
Lipschitz $v$. Any bounded solution which exists for all time must be
asymptotic to equilibrium solutions as $t \to \pm\infty$. Also note that any
integral curve $I$ acts as a barrier to other solutions, in that no other
integral curves can cross $I$ (see Proposition 26 above).
Homework Problem 23. Let $v : \mathbb{R} \to \mathbb{R}$ be locally
Lipschitz. Show that any bounded solution $\phi$ of $\dot{x} = v(x)$ which
exists for all time satisfies $\lim_{t\to\infty} \phi(t) = c$, where
$v(c) = 0$.
Hint: There are three cases:
Case 1: $v(\phi(0)) = 0$. Show that $\phi$ is constant by uniqueness.
Case 2: $v(\phi(0)) > 0$. Show that $v(\phi(t)) > 0$ for all $t$ (if it is
ever equal to zero, apply the argument of Case 1 above to show $\phi$ is
constant; also use the continuity of $v \circ \phi$). Now show $\phi(t)$ is
always increasing, and so must have a finite limit $c$ as $t \to \infty$.
Compute $\lim_{t\to\infty} v(\phi(t))$. Write
\[ \infty > c = \phi(0) + \int_0^\infty \dot\phi(t)\, dt = \phi(0) + \int_0^\infty v(\phi(t))\, dt, \]
Proposition 32. Let $O \subset \mathbb{R}^n$ be an open set, and let
$v : O \to \mathbb{R}^n$ be locally Lipschitz. If $\phi_1$ and $\phi_2$ are
two maximally extended solutions to $\dot{x} = v(x)$ which satisfy
$\phi_1(t_1) = \phi_2(t_2)$, then $\phi_1(t) = \phi_2(t + t_2 - t_1)$ for all
$t$ in the maximal interval of definition of $\phi_1$.
Proof. $\phi_1(t)$ and $\tilde\phi_2(t) = \phi_2(t + t_2 - t_1)$ both satisfy
the initial value problem
Here is the principal theorem regarding flows of vector fields on open sets:
Theorem 13. Let $O \subset \mathbb{R}^n$ be open, and
$v : O \to \mathbb{R}^n$ be smooth. Then there is an open set $U$ so that
$O \times \{0\} \subset U \subset O \times \mathbb{R}$ on which the solution
$\phi(y, t)$ to
\[ \dot{x} = v(x), \qquad x(0) = y \]
exists, is unique, and is smooth jointly as a function of $(y, t)$.
Proof. This follows immediately from Theorems 5, 7 and 11.
Remark. It may not be possible to find an $\epsilon > 0$ so that
$O \times (-\epsilon, \epsilon) \subset U$. The reason is that solutions may
leave $O$ in shorter and shorter times for initial conditions $y \in O$. A
simple example is given by $v(x) = 1$, $O = (0, 1)$.
This problem cannot be fixed by considering $O = \mathbb{R}^n$, since we may
have $v(y) \to \infty$ rapidly as $y \to \infty$ in $\mathbb{R}^n$. However,
see the following corollary.
Corollary 33. Under the conditions of Theorem 13 above, if $K \subset O$ is
compact, then there is an $\epsilon > 0$ so that the solution
\[ \phi : K \times (-\epsilon, \epsilon) \to O. \]
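Proposition 34. Consider $\phi(y, t)$ the solution to $\dot{x} = v(x)$,
$x(0) = y$, for $v$ smooth. Then as long as
$\phi(y, t_1), \phi(y, t_1 + t_2) \in O$, then
\[ \phi(y, t_1 + t_2) = \phi(\phi(y, t_1), t_2). \]
Proof. Consider
\[ \alpha(t) = \phi(y, t_1 + t), \qquad \beta(t) = \phi(\phi(y, t_1), t). \]
Then if we show $\alpha$ and $\beta$ satisfy the same initial value problem,
then uniqueness will show that $\alpha(t) = \beta(t)$ and we are done.
Compute
\[ \begin{aligned}
\alpha(0) &= \phi(y, t_1), \\
\beta(0) &= \phi(\phi(y, t_1), 0) = \phi(y, t_1), \\
\dot\alpha(t) &= \dot\phi(y, t_1 + t)\cdot 1 = v(\phi(y, t_1 + t)) = v(\alpha(t)), \\
\dot\beta(t) &= \dot\phi(\phi(y, t_1), t) = v(\phi(\phi(y, t_1), t)) = v(\beta(t)).
\end{aligned} \]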
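The identity in Proposition 34 can be checked by hand on an equation with an explicit flow. The sketch below uses $\dot{x} = x^2$, whose solution with $x(0) = y$ is $y/(1 - yt)$ (for $y = 1$ this is the function $\frac{1}{1-t}$ used in an earlier hint); the particular values of $y$, $t_1$, $t_2$ are arbitrary choices for illustration.

```python
def flow(y, t):
    # Exact flow of x' = x^2 with x(0) = y: F(y, t) = y / (1 - y t).
    # The formula is only valid before blow-up, i.e. while 1 - y t > 0.
    assert 1.0 - y * t > 0.0, "solution has blown up"
    return y / (1.0 - y * t)

y, t1, t2 = 0.5, 0.3, 0.4
lhs = flow(y, t1 + t2)          # follow the flow for time t1 + t2
rhs = flow(flow(y, t1), t2)     # follow for t1, then restart and follow for t2
print(lhs, rhs)
```

Both numbers equal $y/(1 - y(t_1 + t_2))$ up to rounding, as the proposition predicts.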
2.10 Vector fields as differential operators
A vector field $v$ on $O$ naturally differentiates functions $f$ on $O$ by
the directional derivative:
\[ vf = D_v f = v^i \frac{\partial f}{\partial x^i} \]
for $v^i$ the components of $v$. Therefore, we often write
\[ v = v^i \frac{\partial}{\partial x^i}. \]
We say that $v$ is a first-order differential operator on functions $f$.
This observation is natural from the point of view of ODEs by the fol-
lowing
Proof. Compute
Homework Problem 25. Let $v$ and $w$ be two smooth vector fields on
$\Omega$.
(a) Show that the differential operator [v, w] is also a first-order differential
operator determined by a vector field (which we also write as [v, w]).
What are the components of [v, w]?
\[ [u, v] = -[v, u] \]
and
\[ [[u, v], w] + [[v, w], u] + [[w, u], v] = 0. \]
(This last identity is the Jacobi identity.)
Remark. Part (b) of the previous problem shows that the vector space of
smooth vector fields on $O$ is a Lie algebra. The bracket $[\cdot, \cdot]$ is
called the Lie bracket.
3 Manifolds
3.1 Smooth manifolds
We define smooth manifolds as subsets of $\mathbb{R}^N$. We basically follow
Spivak, Calculus on Manifolds, Chapter 5. When we say smooth in this section,
we mean $C^\infty$.
We say a subset $M \subset \mathbb{R}^n$ is a smooth $k$-dimensional manifold
(or, more properly, a submanifold of $\mathbb{R}^n$), if for all $x \in M$,
there are open subsets $U \subset \mathbb{R}^k$ and $O \subset M$ with
$x \in O$ and a one-to-one $C^\infty$ map $\phi : U \to \mathbb{R}^n$
satisfying
1. $\phi(U) = O$.
2. $D\phi(y)$ has rank $k$ for all $y \in U$.
3. $\phi^{-1} : O \to U$ is continuous.
(1') $\phi(U) = M \cap W$.
and so $D\phi$ has rank $0 < 1$ at the point at which $\phi(\mathbb{R})$ is
not smooth.
Condition (3) is necessary by the following problem:
Homework Problem 26. Recall polar coordinates
$(x, y) = (r\cos\theta, r\sin\theta)$ in $\mathbb{R}^2$. Show that a portion
of the polar graph $r = \sin 2\theta$ can be parametrized for $I$ an open
interval in $\mathbb{R}$, by $\phi : I \to \mathbb{R}^2$ so that $\phi$ is
one-to-one, $C^\infty$, and $D\phi$ is never 0, but so that
$\phi^{-1} : \phi(I) \to I$ is not continuous. Sketch the graph and indicate
pictorially why $\phi(I)$ should not be considered a submanifold of
$\mathbb{R}^2$.
If $W$ and $V$ are open subsets of $\mathbb{R}^n$, then a map $f : W \to V$
is a diffeomorphism if $f$ is one-to-one, onto, $C^\infty$, and $f^{-1}$ is
$C^\infty$. The Inverse Function Theorem and Problem 9 show
Lemma 36. $f : W \to V$ is a diffeomorphism if and only if $f$ is one-to-one,
onto, $C^\infty$, and $\det Df(x) \neq 0$ for all $x \in W$.
The following theorem is useful in proving properties about manifolds:
Theorem 14. $M \subset \mathbb{R}^n$ is a $k$-dimensional manifold if and
only if for all $x \in M$, there are two open subsets $V, W$ of
$\mathbb{R}^n$, with $x \in W$ and a diffeomorphism $h : W \to V$ satisfying
\[ h(W \cap M) = V \cap (\mathbb{R}^k \times \{0\}) = \{y \in V : y^{k+1} = \cdots = y^n = 0\}. \]
Proof. ($\Leftarrow$) Let $U = \{a \in \mathbb{R}^k : (a, 0) \in h(W)\}$, and
define $\phi : U \to \mathbb{R}^n$ by $\phi(a) = h^{-1}(a, 0)$. $\phi$ is
smooth and one-to-one since $h$ is a diffeomorphism. Moreover,
$\phi(U) = M \cap W$ to satisfy condition (1'). $\phi^{-1} = h|_{W \cap M}$
is continuous.
So all that is left to check is the rank condition (2). Consider
$H : W \to \mathbb{R}^k$
\[ H(z) = (h^1(z), \dots, h^k(z)). \]
Then $H(\phi(y)) = y$ for all $y \in U$. Then use the Chain Rule to compute
$DH(\phi(y))\, D\phi(y) = I$, and so $D\phi(y)$ must be an injective linear
map, and so must have rank $k$. Thus $M$ is a smooth manifold.
($\Rightarrow$) Now assume $M$ is a manifold, and define
$y = \phi^{-1}(x)$. Then $D\phi(y)$ has rank $k$, and so there is at least one
$k \times k$ submatrix of $D\phi(y)$ with nonzero determinant. (We may think
of $D\phi(y)$ as an $n \times k$ matrix mapping column vectors in
$\mathbb{R}^k$ to column vectors in $\mathbb{R}^n$. Then a $k \times k$
submatrix is simply a collection of $k$ distinct rows of $D\phi(y)$.) By a
linear change of basis, if necessary, then, we may assume that
\[ \det\left(\frac{\partial\phi^i}{\partial y^j}\right)_{1 \le i,j \le k}(y) \neq 0. \]
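By continuity, this is true on an open neighborhood $U'$ of $y$.
Define $g : U' \times \mathbb{R}^{n-k} \to \mathbb{R}^n$ by
$g(a, b) = \phi(a) + (0, b)$. Then, in block matrix form,
\[ Dg(a, b) = \begin{pmatrix}
\left(\dfrac{\partial\phi^i}{\partial y^j}\right)_{1 \le i,j \le k} & 0 \\[2ex]
\left(\dfrac{\partial\phi^i}{\partial y^j}\right)_{1 \le j \le k,\ k < i \le n} & I_{n-k}
\end{pmatrix}. \]
So $\det Dg(a, b) = \det\left(\frac{\partial\phi^i}{\partial y^j}\right)_{1 \le i,j \le k} \neq 0$.
So we may apply the Inverse Function Theorem to find that there are open
subsets of $\mathbb{R}^n$, $V_1' \ni (y, 0)$ and $V_2' \ni g(y, 0) = x$, so
that $g : V_1' \to V_2'$ has a smooth inverse $h : V_2' \to V_1'$.
Define $O$ via
\[ \begin{aligned}
W \cap M &= \{\phi(a) : (a, 0) \in V\} \\
&= \{g(a, 0) : (a, 0) \in V\}, \\
h(W \cap M) &= g^{-1}(W \cap M) \\
&= g^{-1}(\{g(a, 0) : (a, 0) \in V\}) \\
&= V \cap (\mathbb{R}^k \times \{0\}).
\end{aligned} \]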
Proposition 37. 1 1 1
: (O ) (O ) is a diffeomorphism.
guaranteed by Theorem 14. Then (a) = h1 1
(a, 0), (x) = (h (x)), and
so
1 1
= h h
F = f h1 h : W Rp
F = f h1 h = (f ) h.
($\Leftarrow$) For a local parametrization $\phi$, $f \circ \phi$ is smooth
since locally, $f \circ \phi = F \circ \phi$, which is smooth by the Chain
Rule.
$X \subset \mathbb{R}^N$ is a smooth manifold of dimension $k$ if every
$x \in X$ has a neighborhood that is diffeomorphic to an open subset of
$\mathbb{R}^k$. In other words, there is an open cover $\{O_\alpha\}$ of $X$
so that each $O_\alpha$ is diffeomorphic to an open subset
$U_\alpha \subset \mathbb{R}^k$. Let $\phi_\alpha : U_\alpha \to O_\alpha$ be
the diffeomorphism. $\phi_\alpha$ is called a parametrization of
$O_\alpha \subset X$, and the inverse map $\phi_\alpha^{-1}$ is called a
coordinate system. The open cover, together with the coordinate systems
\[ \{O_\alpha, \phi_\alpha, U_\alpha\} \]
is called a smooth atlas of $X$, and $X$ is a smooth manifold if and only if
it has a smooth atlas.
Example 10. The unit sphere
\[ S^2 = \{(x^1, x^2, x^3) \in \mathbb{R}^3 : (x^1)^2 + (x^2)^2 + (x^3)^2 = 1\} \]
is a two-dimensional submanifold of $\mathbb{R}^3$.
To show this, we provide an atlas. Let $N = (0, 0, 1)$ be the north pole and
$S = (0, 0, -1)$ be the south pole. Then let $O_1 = S^2 \setminus \{N\}$,
$O_2 = S^2 \setminus \{S\}$, $U_1 = U_2 = \mathbb{R}^2$. We construct the
coordinate systems $\phi_\alpha^{-1}$, $\alpha = 1, 2$, by stereographic
projection. We may realize $\mathbb{R}^2$ as the plane
$\{x^3 = 0\} \subset \mathbb{R}^3$.
\[ (y^1, y^2) = \phi_1^{-1}(x^1, x^2, x^3) = \left(\frac{x^1}{1 - x^3}, \frac{x^2}{1 - x^3}\right), \]
\[ (x^1, x^2, x^3) = \phi_1(y^1, y^2) = \left(\frac{2y^1}{|y|^2 + 1}, \frac{2y^2}{|y|^2 + 1}, \frac{|y|^2 - 1}{|y|^2 + 1}\right). \]
Similarly, for any point $x \in O_2$, define $\phi_2^{-1}(x)$ to be the
unique point in $\mathbb{R}^2 \cap L_{x,S}$, and we find as above
\[ (z^1, z^2) = \phi_2^{-1}(x^1, x^2, x^3) = \left(\frac{x^1}{1 + x^3}, \frac{x^2}{1 + x^3}\right), \]
\[ (x^1, x^2, x^3) = \phi_2(z^1, z^2) = \left(\frac{2z^1}{|z|^2 + 1}, \frac{2z^2}{|z|^2 + 1}, -\frac{|z|^2 - 1}{|z|^2 + 1}\right). \]
It is straightforward to check that each of these coordinate systems is a
diffeomorphism, and since $S^2 = O_1 \cup O_2$, we have produced a smooth
atlas of $S^2$ and thus have shown that $S^2$ is a two-dimensional manifold.
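The stereographic formulas of Example 10 are easy to sanity-check numerically. In the sketch below the function names `phi1`, `phi1_inv`, `phi2`, `phi2_inv` are ours; the signs assume the north pole is $(0, 0, 1)$ and the south pole is $(0, 0, -1)$, and the sample point is arbitrary.

```python
def phi1(y):
    # Parametrization of S^2 minus the north pole N = (0, 0, 1).
    y1, y2 = y
    r2 = y1 * y1 + y2 * y2
    return (2 * y1 / (r2 + 1), 2 * y2 / (r2 + 1), (r2 - 1) / (r2 + 1))

def phi1_inv(x):
    # Coordinate system: stereographic projection from N onto the plane {x3 = 0}.
    x1, x2, x3 = x
    return (x1 / (1 - x3), x2 / (1 - x3))

def phi2(z):
    # Parametrization of S^2 minus the south pole S = (0, 0, -1).
    z1, z2 = z
    r2 = z1 * z1 + z2 * z2
    return (2 * z1 / (r2 + 1), 2 * z2 / (r2 + 1), (1 - r2) / (r2 + 1))

def phi2_inv(x):
    # Coordinate system: stereographic projection from S onto the plane {x3 = 0}.
    x1, x2, x3 = x
    return (x1 / (1 + x3), x2 / (1 + x3))

y = (0.3, -1.2)
x = phi1(y)
norm_sq = sum(c * c for c in x)     # should be 1: the image lies on the sphere
y_back = phi1_inv(x)                # should recover y
z_back = phi2_inv(phi2(y))          # same round trip in the second chart
print(norm_sq, y_back, z_back)
```

The round trips only test that each parametrization and its coordinate system are mutually inverse at one point; smoothness is the part left to the reader in the text.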
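Given a smooth manifold $X$ with a smooth atlas
$\{O_\alpha, \phi_\alpha, U_\alpha\}$, let
$O_{\alpha\beta} = O_\alpha \cap O_\beta$. Also define
$U_{\alpha\beta} = U_\alpha \cap \phi_\alpha^{-1}(O_{\alpha\beta})$. As long
as $O_{\alpha\beta} \neq \emptyset$, the map
\[ \phi_{\alpha\beta} = \phi_\beta^{-1} \circ \phi_\alpha : U_{\alpha\beta} \to U_{\beta\alpha} \]
is a diffeomorphism. These maps $\phi_{\alpha\beta}$ are called the gluing
maps of the manifold $X$ associated to the atlas. In particular, the manifold
can be thought of as the union of the coordinate charts $U_\alpha$ glued
together by the gluing maps. It is straightforward to see, at least as a set,
we may identify
\[ X = \Big(\bigsqcup_\alpha U_\alpha\Big) \Big/ \sim, \qquad x \sim y \text{ if } x \in U_{\alpha\beta},\ y \in U_{\beta\alpha},\ y = \phi_{\alpha\beta}(x). \]
Gluing maps may be used to define smooth manifolds which are not necessarily
subsets of $\mathbb{R}^N$ (though we won't do so here). It is instructive to
think of $k$-dimensional smooth manifolds as spaces that are smoothly glued
together from open sets in $\mathbb{R}^k$.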
\[ O_{12} = S^2 \setminus \{S, N\}, \qquad U_{12} = \mathbb{R}^2 \setminus \{0\}, \qquad U_{21} = \mathbb{R}^2 \setminus \{0\}, \]
\[ z = \phi_{12}(y) = \phi_2^{-1}(\phi_1(y)) = \left(\frac{y^1}{|y|^2}, \frac{y^2}{|y|^2}\right) = \frac{y}{|y|^2}. \]
This gluing map is called inversion across the circle $|y|^2 = 1$ in
$\mathbb{R}^2$. Each point is mapped to a point on the same ray through the
origin, but the distance to the origin is replaced by its reciprocal. So we
can think of $S^2$ as two copies of $\mathbb{R}^2$ glued together along
$\mathbb{R}^2 \setminus \{0\}$ by the inversion map across the unit circle.
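The inversion formula for the gluing map can be confirmed numerically by composing the two stereographic charts. The helper names below are ours, and the formulas assume the sign conventions $N = (0, 0, 1)$, $S = (0, 0, -1)$; the sample point is arbitrary.

```python
def phi1(y):
    # Parametrization of S^2 minus the north pole (inverse stereographic projection).
    y1, y2 = y
    r2 = y1 * y1 + y2 * y2
    return (2 * y1 / (r2 + 1), 2 * y2 / (r2 + 1), (r2 - 1) / (r2 + 1))

def phi2_inv(x):
    # Stereographic projection from the south pole onto the plane {x3 = 0}.
    x1, x2, x3 = x
    return (x1 / (1 + x3), x2 / (1 + x3))

def gluing_map(y):
    # phi_12 = phi_2^{-1} composed with phi_1, defined for y != 0.
    return phi2_inv(phi1(y))

def inversion(y):
    # Inversion across the unit circle: y / |y|^2.
    r2 = y[0] * y[0] + y[1] * y[1]
    return (y[0] / r2, y[1] / r2)

y = (0.6, -0.2)
z = gluing_map(y)
print(z, inversion(y))
```

Applying `gluing_map` twice returns the starting point, reflecting the fact that inversion across the unit circle is its own inverse.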
this to define tangent vectors to manifolds. A tangent vector at a point $p$
in a smooth manifold $X$ is given by the derivative $\dot\gamma(0)$ of a
smooth curve $\gamma : (-\epsilon, \epsilon) \to X \subset \mathbb{R}^N$ so
that $\gamma(0) = p$. (Note the fact $\mathbb{R}^N$ is a vector space allows
us to differentiate $\gamma$.) The space of all tangent vectors at $p$ is
called the tangent space $T_pX$ of $X$ at $p$, and it is characterized by the
following proposition.
Proposition 40. If $X \subset \mathbb{R}^N$ is a $k$-dimensional smooth
manifold, then the tangent space $T_pX$ is the following: Given a local
parametrization of $X$
\[ \phi : U \to O \ni p \]
so that $\phi(0) = p$,
\[ T_pX = D\phi(0)(\mathbb{R}^k). \]
In particular, $T_pX$ is naturally a $k$-dimensional vector space.
Proof. First of all, given a curve $\gamma : (-\epsilon, \epsilon) \to X$ so
that $\gamma(0) = p$, we can ensure (by shrinking $\epsilon$ if necessary),
that the image of $\gamma$ is contained in the coordinate neighborhood $O$.
Now
\[ \gamma = \phi \circ (\phi^{-1} \circ \gamma) \]
\[ \gamma'(0) = D\phi(0)v \]
and so $D\phi(0)(\mathbb{R}^k) = T_pX$.
Also note the following corollary of our definition of $T_pX$:
Corollary 41. $T_pX$ is independent of the coordinate neighborhood $O$ of
$p$.
If $f : X \to \mathbb{R}^m$ is a smooth map from a smooth $k$-dimensional
manifold $X$, and if $p \in X$, then we define
\[ Df(p) : T_pX \to \mathbb{R}^m \]
by using a local parametrization $\phi : U \to X$ so that $\phi(q) = p$. Then
we define
The following exercise verifies this definition makes sense (see Guillemin
and Pollack).
Homework Problem 27.
This definition depends only on $v$, and not on the curve $\gamma$ used. (For
each $v$ there are many $\gamma$, since $v$ only depends on the first
derivative $\gamma'(0)$ and no higher Taylor coefficients.)
For a coordinate system
\[ \phi^{-1} = (x^1, \dots, x^k) : O \to \mathbb{R}^k, \]
(where we assume as usual that $\phi(0) = p$), then the coordinate basis of
$T_pX$ induced by $\phi$ may be written as $\{\partial/\partial x^i\}$, which
are thought of as tangent vectors differentiating functions $f$ by
\[ \left.\frac{\partial}{\partial x^i}\right|_p f = \left.\frac{\partial}{\partial x^i}\right|_0 (f \circ \phi) = \left.\frac{\partial}{\partial x^i}\right|_0 f(x^1, \dots, x^k). \]
($\partial/\partial x^i$ is the tangent vector associated to the curve
$\gamma = \phi(te_i)$, for $e_i$ the $i$th standard basis vector in
$\mathbb{R}^k$.) Thus we can write any tangent vector $v$ at $p$ as
\[ v = v^i \frac{\partial}{\partial x^i}. \]
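Writing tangent vectors in terms of the coordinate basis of $T_pX$ is much
more useful than writing them in terms of a basis of
$\mathbb{R}^N \supset T_pX$.
The components $v^i$ will change depending on the local coordinates. On
$O_{\alpha\beta} = O_\alpha \cap O_\beta$ the intersection of two coordinate
neighborhoods of $p$, then we have two coordinate systems
$\phi_\alpha^{-1} = (x^1, \dots, x^k)$ and
$\phi_\beta^{-1} = (y^1, \dots, y^k)$. We can write by using the chain rule
\[ v = v^i(x) \frac{\partial}{\partial x^i} = v^i(x) \frac{\partial y^j}{\partial x^i} \frac{\partial}{\partial y^j} = v^j(y) \frac{\partial}{\partial y^j}. \]
\[ TX = \{(p, w) \in \mathbb{R}^N \times \mathbb{R}^N : p \in X,\ w \in T_pX\}. \]
\[ \pi : TX \to X, \qquad \pi(p, w) = p, \]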
and each $\pi^{-1}(\{p\})$ is the vector space $T_pX$.
Each coordinate system $\phi^{-1} = (x^1, \dots, x^k)$ provides a local frame
$\{\partial/\partial x^i\}$ of the tangent bundle. A local frame is a basis of
the tangent space for every $p$ in a neighborhood $O \subset X$. These frames
are patched together in the following paragraph.
A more abstract view of the tangent bundle is given by looking at a given
smooth atlas $\{O_\alpha, \phi_\alpha, U_\alpha\}$ of $X$. Then as a set, we
may identify
\[ TX = \Big(\bigsqcup_\alpha U_\alpha \times \mathbb{R}^k\Big) \Big/ \sim, \]
\[ v = v^i(x) \frac{\partial}{\partial x^i} \]
for $v^i$ smooth on $U_\alpha \subset \mathbb{R}^k$.
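In order to describe the relationship between the local and global pictures
of the ODE system, consider $X \subset \mathbb{R}^N$ and
$v : X \to \mathbb{R}^N$ so that for each $p \in X$, $v(p) \in T_pX$. Consider
a local parametrization $\phi_\alpha : U_\alpha \to O_\alpha$. Let
$\phi_\alpha^{-1} = (x^1, \dots, x^k)$. Locally on
$U_\alpha \subset \mathbb{R}^k$, we represent $v$ by
\[ v = v^i \frac{\partial}{\partial x^i}. \]
In other words, for $p \in O_\alpha \subset X$, we have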
Theorem 15. Let $v$ be a smooth vector field on a compact manifold $X$. Then
the flow $F(y, t)$ along the vector field (the solution to
\[ \dot{x} = v(x), \qquad x(0) = y) \]
x = v (x ), x(0) = x0
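An important class of functions is smooth functions with compact support.
Prominent examples can be constructed using the smooth function on
$\mathbb{R}$
\[ f(x) = \begin{cases} e^{-1/x} & \text{for } x > 0, \\ 0 & \text{for } x \le 0. \end{cases} \]
\[ f \circ g \in \mathrm{Diff}(X), \qquad f^{-1} \in \mathrm{Diff}(X), \qquad f \circ f^{-1} = \mathrm{id} \]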
$v^2 x^2 + v^3 x^3 = 0$. (Proof: $S^2 = \{f = 1\}$ for
$f = (x^1)^2 + (x^2)^2 + (x^3)^2$, and so for any local parametrization
$\phi$, we have $f \circ \phi = 1$. Thus the Chain Rule shows that
$Df(x)(T_xS^2) = 0$, and so $T_xS^2 \subset \ker Df(x)$. They must be equal
since both are two-dimensional vector spaces. Then simply compute
$\ker Df(x)$.) Therefore, $v$ is a smooth vector field on $S^2$.
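Recall that the coordinate systems of the atlas introduced above are
\[ (y^1, y^2) = \phi_1^{-1}(x^1, x^2, x^3) = \left(\frac{x^1}{1 - x^3}, \frac{x^2}{1 - x^3}\right), \]
\[ (z^1, z^2) = \phi_2^{-1}(x^1, x^2, x^3) = \left(\frac{x^1}{1 + x^3}, \frac{x^2}{1 + x^3}\right). \]
On $U_1$, compute at $x = (x^1, x^2, x^3) \in O_1 \subset S^2$,
\[ D\phi_1^{-1}(x)(v) = \begin{pmatrix}
\dfrac{1}{1 - x^3} & 0 & \dfrac{x^1}{(1 - x^3)^2} \\[2ex]
0 & \dfrac{1}{1 - x^3} & \dfrac{x^2}{(1 - x^3)^2}
\end{pmatrix}
\begin{pmatrix} -x^2 \\ x^1 \\ 0 \end{pmatrix}
= \begin{pmatrix} -\dfrac{x^2}{1 - x^3} \\[2ex] \dfrac{x^1}{1 - x^3} \end{pmatrix}
= \begin{pmatrix} -y^2 \\ y^1 \end{pmatrix}. \]
as well.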
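In the coordinate charts, these systems can be solved explicitly. For
$A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, compute the fundamental
solution
\[ \begin{aligned}
e^{At} &= P e^{tD} P^{-1} \\
&= \begin{pmatrix} 1 & 1 \\ -i & i \end{pmatrix}
\exp\left[ t \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} \right]
\begin{pmatrix} \frac12 & \frac{i}{2} \\ \frac12 & \frac{1}{2i} \end{pmatrix} \\
&= \begin{pmatrix} 1 & 1 \\ -i & i \end{pmatrix}
\begin{pmatrix} \cos t + i \sin t & 0 \\ 0 & \cos t - i \sin t \end{pmatrix}
\begin{pmatrix} \frac12 & \frac{i}{2} \\ \frac12 & \frac{1}{2i} \end{pmatrix} \\
&= \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}.
\end{aligned} \]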
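As a cross-check that does not use the diagonalization, one can sum the power series for the matrix exponential directly. The sketch below assumes the sign convention $A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ (so that $\dot{y} = Ay$ reads $\dot{y}^1 = -y^2$, $\dot{y}^2 = y^1$); the helper names and the number of series terms are our own choices.

```python
import math

def mat_mul(a, b):
    # Product of two 2x2 matrices stored as nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm_series(a, t, terms=40):
    # Partial sum of exp(tA) = sum_k (tA)^k / k!.
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        # term_k = term_{k-1} * A * t / k
        term = [[t * entry / k for entry in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

A = [[0.0, -1.0], [1.0, 0.0]]
t = 0.7
E = expm_series(A, t)
R = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
print(E, R)
```

For moderate $t$ the truncated series matches the rotation matrix to machine precision; for large $t$ more terms, or a different method, would be needed.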
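Therefore, for $y \in U_1$, the solution to $\dot{y} = v(y)$, $y(0) = y_0$ is
\[ y(t) = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix} y_0. \tag{23} \]
Proposition 42 implies that these two flows should be related, since they
both correspond to flows on $S^2$. In particular, for $y_0 \in U_{12}$, let
$z_0 = \phi_{12}(y_0) = y_0 / |y_0|^2$. Then we check that the solution
\[ z(t) = \phi_{12}(y(t)) \]
on $U_2$ by
\[ F_t(z) = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix} z, \]
and even on $S^2 \subset \mathbb{R}^3$ itself by
\[ F_t(x) = \exp\left[ \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} t \right] x
= \begin{pmatrix} \cos t & -\sin t & 0 \\ \sin t & \cos t & 0 \\ 0 & 0 & 1 \end{pmatrix} x. \]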
Homework Problem 30. Consider the atlas given above for S2 . On U1 ,
consider the vector field
2
v = y 1 y .
y 1 y 2
\[ v = v^a \frac{\partial}{\partial y^a} \]
bilinearity shows
\[ g(v, w) = g(v^i e_i, w^j e_j) = v^i g(e_i, w^j e_j) = v^i w^j g(e_i, e_j) = v^i w^j g_{ij}. \]
The fact $g$ is symmetric is equivalent to $g_{ij} = g_{ji}$.
Note that a positive definite inner product $g$ provides a way to measure
the length of a vector $|v|_g = \sqrt{g(v, v)}$, and it also provides a
measurement of the angle $\theta$ between two nonzero vectors $v$ and $w$:
\[ \cos\theta = \frac{g(v, w)}{|v|_g\, |w|_g}. \]
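To make the component formulas concrete, here is a small sketch that evaluates $g(v, w) = g_{ij} v^i w^j$, the length $|v|_g$ and the angle for a sample inner product. The matrix `G` and the two vectors are hypothetical values chosen for illustration; `G` is symmetric and positive definite, as the text requires.

```python
import math

# Hypothetical components g_ij of a positive definite inner product on R^2.
G = [[2.0, 1.0], [1.0, 3.0]]

def inner(v, w):
    # g(v, w) = g_ij v^i w^j, with the sum over i and j written out.
    return sum(G[i][j] * v[i] * w[j] for i in range(2) for j in range(2))

def length(v):
    # |v|_g = sqrt(g(v, v))
    return math.sqrt(inner(v, v))

def cos_angle(v, w):
    # cos(theta) = g(v, w) / (|v|_g |w|_g) for nonzero v and w.
    return inner(v, w) / (length(v) * length(w))

v, w = [1.0, 0.0], [0.0, 1.0]
print(inner(v, w), length(v), length(w), cos_angle(v, w))
```

Here the standard basis vectors are not orthogonal with respect to $g$: the off-diagonal entry $g_{12} = 1$ gives $\cos\theta = 1/\sqrt{6}$.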
A Riemannian metric on $X$ gives a positive definite inner product on each
tangent space $T_pX$. We also require these inner products to vary smoothly
as the point $p$ varies in $X$. To describe this, consider a smooth atlas on
$X$, and a local coordinate system $(x^1, \dots, x^k)$ around $p$. Then a
smooth vector field $v$ can be represented as
$v = v^i \frac{\partial}{\partial x^i}$ for the standard local frame
$\{\partial/\partial x^i\}$ of the tangent bundle. Then at each point, the
inner product $g$ is represented by $g_{ij}(x)$, and
\[ g(v, w) = g_{ij} v^i w^j, \qquad v^i = v^i(x), \quad w^j = w^j(x), \quad g_{ij} = g_{ij}(x). \]
Then $g$ is smoothly varying on $X$ if the functions $g_{ij}$ are smoothly
varying on each coordinate chart in the smooth atlas of $X$.
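Euclidean space $\mathbb{R}^N$ has a standard Riemannian metric given by the
standard inner product $\delta_{ab}$. As we've seen above, this inner product
endows any submanifold $X \subset \mathbb{R}^N$ with a Riemannian metric. In
particular, for $v, w \in T_pX \subset \mathbb{R}^N$, we can form $g(v, w)$
using the inner product $\delta_{ab}$. In particular, consider a smooth
parametrization $\phi : U \to O \subset X \subset \mathbb{R}^N$. Then
$\phi = (\phi^1, \dots, \phi^N)$. A vector field represented by
\[ v = v^i \frac{\partial}{\partial x^i} \]
on $U \subset \mathbb{R}^n$ is represented by
\[ D\phi(x)(v) = \frac{\partial\phi^a}{\partial x^i}(x)\, v^i(x) \in T_{\phi(x)}X \subset \mathbb{R}^N. \]
$D\phi(x)(v)$ is called the push-forward of $v$ under the map $\phi$. For
$v, w \in T_pX$, we may define the metric
\[ \begin{aligned}
g_{ij} v^i w^j = g(v, w) &= \left(\frac{\partial\phi^a}{\partial x^i} v^i\right)\left(\frac{\partial\phi^b}{\partial x^j} w^j\right) \delta_{ab} \\
&= \delta_{ab} \frac{\partial\phi^a}{\partial x^i} \frac{\partial\phi^b}{\partial x^j}\, v^i w^j.
\end{aligned} \]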
Therefore, the Euclidean inner product on $\mathbb{R}^N$ induces the
Riemannian metric on $X$ locally given by the formula
\[ g\left(\frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}\right) = g_{ij} = \delta_{ab} \frac{\partial\phi^a}{\partial x^i} \frac{\partial\phi^b}{\partial x^j}. \tag{25} \]
Given a real vector space $V$, the dual vector space $V^*$ is given by the set
of all linear functions from $V$ to $\mathbb{R}$. It is easy to check $V^*$ is
a vector space. If $V$ has a basis $\{e_i\}$, then there is a dual basis
$\{\eta^i\}$ of $V^*$, which is defined as follows:
\[ \eta^i(e_j) = \delta^i_j. \]
Given a local coordinate frame $\{\partial/\partial x^i\}$ of $TX$, the local
frame on the dual space is written as $\{dx^i\}$. Each $dx^i$ is called a
differential. The dual space $T_p^*X$ of $T_pX$ is called the cotangent space
of $X$ at $p$.
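Lemma 45. If $y = y(x)$ is a coordinate change as in (22), then
\[ dy^j = \frac{\partial y^j}{\partial x^i}\, dx^i. \]
Proof. Write $dy^j = \alpha^j_\ell\, dx^\ell$. Then we have
\[ \delta^j_i = dy^j\left(\frac{\partial}{\partial y^i}\right)
= \alpha^j_\ell\, dx^\ell\left(\frac{\partial}{\partial y^i}\right)
= \alpha^j_\ell\, dx^\ell\left(\frac{\partial x^k}{\partial y^i} \frac{\partial}{\partial x^k}\right)
= \alpha^j_\ell \frac{\partial x^k}{\partial y^i}\, \delta^\ell_k
= \alpha^j_k \frac{\partial x^k}{\partial y^i}. \]
Therefore, $(\alpha^j_k)$ is the inverse matrix of
$\left(\frac{\partial x^k}{\partial y^i}\right)$, and so
$\alpha^j_k = \frac{\partial y^j}{\partial x^k}$.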
A Riemannian metric can be naturally written as
y k y ` i j
gk` dy k dy ` = gij dx dx .
xi xj
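This makes sense because the natural pairing
\[ dx^i\left(\frac{\partial}{\partial x^j}\right) = \delta^i_j \]
between the tangent and cotangent spaces implies that
\[ g(v, w) = g_{ij}\, dx^i\left(v^k \frac{\partial}{\partial x^k}\right) dx^j\left(w^\ell \frac{\partial}{\partial x^\ell}\right) = g_{ij} (v^k \delta^i_k)(w^\ell \delta^j_\ell) = g_{k\ell}\, v^k w^\ell. \]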
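A Riemannian metric is an example of a tensor on $X$. The tensor product
$V \otimes W$ of two real vector spaces with bases respectively $\sigma_i$
and $\tau_j$ is the real vector space formed from the basis
\[ \{\sigma_i \otimes \tau_j\}. \]
This implies
\[ \dim V \otimes W = (\dim V)(\dim W). \]
A tensor of type $(k, \ell)$ on a manifold $X$ assigns to each point
$p \in X$ an element of
\[ (T_pX)^{\otimes k} \otimes (T_p^*X)^{\otimes \ell}, \]
which has as its basis
\[ \frac{\partial}{\partial x^{i_1}} \otimes \cdots \otimes \frac{\partial}{\partial x^{i_k}} \otimes dx^{j_1} \otimes \cdots \otimes dx^{j_\ell}. \]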
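\[ \phi(y^1, y^2) = \left(\frac{2y^1}{|y|^2 + 1}, \frac{2y^2}{|y|^2 + 1}, \frac{|y|^2 - 1}{|y|^2 + 1}\right), \]
and the Riemannian metric induced from $\mathbb{R}^3$ is
\[ \begin{aligned}
g_{ij}\, dy^i dy^j &= \delta_{ab} \frac{\partial\phi^a}{\partial y^i} \frac{\partial\phi^b}{\partial y^j}\, dy^i dy^j \\
&= \delta_{ab}\, d\phi^a\, d\phi^b \\
&= d\phi^1 d\phi^1 + d\phi^2 d\phi^2 + d\phi^3 d\phi^3 \\
&= \left(\frac{-2(y^1)^2 + 2(y^2)^2 + 2}{(|y|^2 + 1)^2}\, dy^1 - \frac{4y^1 y^2}{(|y|^2 + 1)^2}\, dy^2\right)^2 \\
&\quad + \left(-\frac{4y^1 y^2}{(|y|^2 + 1)^2}\, dy^1 + \frac{2(y^1)^2 - 2(y^2)^2 + 2}{(|y|^2 + 1)^2}\, dy^2\right)^2 \\
&\quad + \left(\frac{4y^1}{(|y|^2 + 1)^2}\, dy^1 + \frac{4y^2}{(|y|^2 + 1)^2}\, dy^2\right)^2 \\
&= \frac{4}{(|y|^2 + 1)^2}\left(dy^1 dy^1 + dy^2 dy^2\right).
\end{aligned} \]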
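The final formula for the induced metric can be checked numerically from the definition $g_{ij} = \delta_{ab}\, \partial_i\phi^a\, \partial_j\phi^b$. The sketch below approximates the partial derivatives of the stereographic parametrization by central differences; the step size `h`, the sample point and the helper names are our own choices.

```python
def phi(y):
    # Stereographic parametrization of S^2 minus the north pole.
    y1, y2 = y
    r2 = y1 * y1 + y2 * y2
    return [2 * y1 / (r2 + 1), 2 * y2 / (r2 + 1), (r2 - 1) / (r2 + 1)]

def partials(f, y, h=1e-6):
    # partials(f, y)[i][a] approximates d f^a / d y^i by central differences.
    rows = []
    for i in range(len(y)):
        yp, ym = list(y), list(y)
        yp[i] += h
        ym[i] -= h
        rows.append([(p - m) / (2 * h) for p, m in zip(f(yp), f(ym))])
    return rows

def induced_metric(y):
    # g_ij = sum over a of (d phi^a / d y^i)(d phi^a / d y^j).
    J = partials(phi, y)
    return [[sum(J[i][a] * J[j][a] for a in range(3)) for j in range(2)]
            for i in range(2)]

y = [0.3, -1.2]
g = induced_metric(y)
conformal_factor = 4.0 / (y[0] ** 2 + y[1] ** 2 + 1.0) ** 2
print(g, conformal_factor)
```

Up to discretization error, `g` is `conformal_factor` times the identity matrix, which is the statement that stereographic coordinates are conformal.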
Note in the previous example, we used the formula for differentials
\[ d\phi^a = \frac{\partial\phi^a}{\partial y^i}\, dy^i. \]
It is also useful to have the following notation: If
$h = h_{ab}\, dz^a dz^b$ is a Riemannian metric on $Z$, and
$\phi : Y \to Z$ is a smooth map, then we denote the pullback metric
\[ \phi^* h = h_{ab}(\phi)\, d\phi^a\, d\phi^b \]
on $Y$. Thus in the construction above, if
$\delta = \delta_{ab}\, dx^a dx^b$ is the Euclidean metric on $\mathbb{R}^N$,
then the metric $g$ induced on a submanifold
$\iota : X \hookrightarrow \mathbb{R}^N$ is the pullback $\iota^*\delta$.
Homework Problem 31. Let $\phi : X \to Y$ be a smooth map of manifolds.
Let $Y$ have a Riemannian metric $h$ on it. Show that $\phi^* h$ is a
Riemannian metric on $X$ if and only if the tangent map
$D\phi(x) : T_xX \to T_{\phi(x)}Y$ is injective for every $x \in X$. (In this
case $\phi$ is called an immersion.)
Hint: Do the calculations in local coordinates on $X$ and $Y$. The key point
to check is whether $\phi^* h$ is positive definite. Show $\phi^* h(x)$ is 0
on the kernel of $D\phi(x)$.
Note in the previous example, we considered the Riemannian metric on $S^2$
pulled back from the Euclidean metric on $\mathbb{R}^3$. It is possible to
write down
Example 14. Consider hyperbolic space
\[ \mathbb{H}^n = \{x = (x^1, \dots, x^n) \in \mathbb{R}^n : x^n > 0\} \]
A famous theorem of John Nash shows that for every Riemannian metric
$g$ on a smooth manifold $X$, there is an embedding
$i : X \to \mathbb{R}^N$ so that $g$ is induced from the standard metric on
$\mathbb{R}^N$. (Although it is not in most cases obvious what the embedding
is.)
\[ \pi : TX \to X. \]
\[ \pi^{-1} O \to O \times \mathbb{R}^k \]
provides for each $p \in O$ a basis of the vector space $\pi^{-1}(p)$ by
taking the preimage of the standard basis of $\mathbb{R}^k$ under the
diffeomorphism. Such a smoothly varying basis is called a local frame of the
vector bundle over $O$.
Given a gluing map $y = y(x)$ of two small coordinate neighborhoods $O_x$
and $O_y$ in $X$, there is a corresponding gluing map of
$O_x \times \mathbb{R}^k$ and $O_y \times \mathbb{R}^k$. We require this
gluing map to be of the form
\[ (x, v) \mapsto (y(x), A(x)v) \]
for $v$ a vector in $\mathbb{R}^k$ and $A(x)$ a smoothly varying nonsingular
matrix in $x$. Therefore, above each point $p$, if we change coordinates from
$x$ to $y$, the frame changes by the matrix $A(x)$. $A(x)$ is a transition
function of the vector bundle $V$. So the transition functions act on the
fibers of a vector bundle as linear isomorphisms. This preserves the
vector-space structure on each fiber when changing coordinates.
Remark. We have defined real vector bundles of rank $k$, for which each fiber
is diffeomorphic to $\mathbb{R}^k$. We may also define complex vector bundles
with fibers diffeomorphic to $\mathbb{C}^k$.
A section of a vector bundle $\pi : V \to X$ is a map $s : X \to V$
satisfying $\pi(s(p)) = p$ for all $p \in X$. So for each $p \in X$, $s(p)$ is
an element of the vector space $\pi^{-1}(p)$. A vector field is precisely a
section of the tangent bundle. Locally, $k$ sections which are linearly
independent on each fiber form a frame of the vector bundle. For example,
$\{\partial/\partial x^i\}$ are $n$ linearly independent sections of the
tangent bundle over a coordinate chart.
Since vector bundles preserve the linear structure on each fiber, we may
do linear algebra on the fibers to create new vector bundles. In particular,
we can take duals and tensor products of the fiber space to form new vector
bundles. The tensor bundle of type $(k, \ell)$ over an $n$-dimensional
manifold $X$ is the vector bundle of rank $n^{k+\ell}$ with the fiber over $p$
given by
\[ T_pX^{\otimes k} \otimes T_p^*X^{\otimes \ell}. \]
Over each coordinate chart, the natural frame of the tensor bundle is
\[ \frac{\partial}{\partial x^{i_1}} \otimes \cdots \otimes \frac{\partial}{\partial x^{i_k}} \otimes dx^{j_1} \otimes \cdots \otimes dx^{j_\ell} \]
for $i_1, \dots, i_k, j_1, \dots, j_\ell \in \{1, \dots, n\}$. The transition
functions of a tensor bundle are determined by the formulas
\[ \frac{\partial}{\partial x^i} = \frac{\partial y^k}{\partial x^i} \frac{\partial}{\partial y^k}, \qquad dx^j = \frac{\partial x^j}{\partial y^\ell}\, dy^\ell. \]
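For example the transition functions for the $(0, 2)$ tensor bundle are given by
$$dx^i \otimes dx^j = \frac{\partial x^i}{\partial y^k}\,\frac{\partial x^j}{\partial y^\ell}\,dy^k \otimes dy^\ell.$$
Note we can view
$$\frac{\partial x^i}{\partial y^k}\,\frac{\partial x^j}{\partial y^\ell}$$
as a nonsingular $n^2 \times n^2$ matrix, which is the tensor product of the matrix
$$\frac{\partial x^i}{\partial y^k}$$
with itself.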
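The tensor-product description of the $(0, 2)$ transition functions can be checked on a small example. The sketch below uses a hypothetical $2 \times 2$ Jacobian `J` (with `J[i][k]` standing for $\partial x^i/\partial y^k$ at one point) and hypothetical components `g`; the helper names are assumptions made for illustration. It compares the index formula $g_{k\ell}(y) = g_{ij}\,\frac{\partial x^i}{\partial y^k}\frac{\partial x^j}{\partial y^\ell}$ with the action of a single $4 \times 4$ Kronecker-product matrix on the flattened components.

```python
def matmul(A, B):
    return [[sum(A[i][m] * B[m][j] for m in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def kron(A, B):
    """Kronecker (tensor) product of two square matrices."""
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m]
             for j in range(n * m)] for i in range(n * m)]

J = [[2.0, 1.0], [0.5, 3.0]]   # assumed nonsingular Jacobian dx^i/dy^k
g = [[1.0, 0.2], [0.2, 4.0]]   # assumed components g_ij in the x coordinates

# Index form of the change of coordinates: J^T g J.
g_y = matmul(matmul(transpose(J), g), J)

# The same linear map written as one n^2 x n^2 matrix on flattened components.
Jt = transpose(J)
T = kron(Jt, Jt)
flat_g = [g[i][j] for i in range(2) for j in range(2)]
flat_gy = [sum(T[r][c] * flat_g[c] for c in range(4)) for r in range(4)]
```

Here `flat_gy` agrees entry by entry with `g_y`, which is the statement that the transition matrix of the $(0, 2)$ bundle is the tensor product of the Jacobian with itself.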
A smooth tensor of type $(k, \ell)$ is a smooth section of the $(k, \ell)$ tensor
bundle. Thus a Riemannian metric is a smooth symmetric, positive-definite
$(0, 2)$ tensor.
Proposition 46. For every open cover of a smooth manifold X, there exists
a subordinate partition of unity.
For a proof, see Spivak or Guillemin and Pollack.
Theorem 17. A Riemannian metric g on a manifold X provides a measure
on X called the Riemannian density.
The construction of this measure follows below, along with a sketch of a
proof.
Let $\{O_\alpha, \phi_\alpha, U_\alpha\}$ be a smooth atlas of $X$. A function $f : X \to \mathbb{R}$ is
measurable if each $f \circ \phi_\alpha : U_\alpha \to \mathbb{R}$ is measurable. For a Riemannian metric
$g$ on $X$, the density $dV_g$ is defined first for measurable functions $f : X \to \mathbb{R}$
whose supports are contained in some $O_\alpha$. In this case, define
$$\int_X f\,dV_g = \int_{O_\alpha} f\,dV_g = \int_{U_\alpha} f(x)\sqrt{\det g_{ij}(x)}\,dx$$
The calculation in the previous paragraph can be used to ensure that this
definition is independent of the atlas and partition of unity used. It is
straightforward to check that $dV_g$ defines a measure on $X$. Then for any $L^1$ function
$f$ on $X$ (measured by $dV_g$ of course),
$$\int_X f\,dV_g = \sum_\alpha \int_X \rho_\alpha f\,dV_g,$$
where $\{\rho_\alpha\}$ denotes the partition of unity subordinate to the cover $\{O_\alpha\}$.
(see Problem 33 below). So the density
$$dV_g = \sqrt{1 + |df|^2}\,dx_{n-1}$$
for $dx_{n-1}$ Lebesgue measure on $\mathbb{R}^{n-1}$.
Homework Problem 33. For $w$ an $n$-dimensional column vector, and $I$
the $n \times n$ identity matrix, show that $\det(I + ww^\top) = 1 + |w|^2$.
Hint: Show that $I + ww^\top$ can be diagonalized, with one eigenvalue $1 + |w|^2$,
and with the eigenvalue 1 repeated $n - 1$ times. (For this last step, show that
on the $(n - 1)$-dimensional space orthogonal to the natural $(1 + |w|^2)$-eigenvector, $I + ww^\top$
acts as the identity. What is a natural eigenvector to try?)
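Before proving the identity, it can be reassuring to test it numerically. The sketch below is only a check on one arbitrary sample vector `w`, not a proof; the helper `det` (Gaussian elimination with partial pivoting) is written out here so that the example needs nothing beyond the Python standard library.

```python
def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    n, d = len(A), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[p][c]) < 1e-14:
            return 0.0
        if p != c:
            A[c], A[p] = A[p], A[c]
            d = -d
        d *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= factor * A[c][k]
    return d

w = [1.0, -2.0, 0.5, 3.0]      # arbitrary sample vector
n = len(w)
M = [[(1.0 if i == j else 0.0) + w[i] * w[j] for j in range(n)] for i in range(n)]

lhs = det(M)                                   # det(I + w w^T)
rhs = 1.0 + sum(x * x for x in w)              # 1 + |w|^2
Mw = [sum(M[i][j] * w[j] for j in range(n)) for i in range(n)]   # (I + w w^T) w
```

The vector `Mw` comes out as `rhs` times `w`, which illustrates the eigenvalue $1 + |w|^2$ appearing in the hint.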
For a function $f : \Omega \to \mathbb{R}$, the differential, or one-form, $df = \frac{\partial f}{\partial x^i}\,dx^i$.
Homework Problem 35. Show that $\nabla f$ transforms as a vector field under
coordinate changes. In other words, check that if $y = y(x)$,
$$(\nabla f)^j(y) = (\nabla f)^i(x)\,\frac{\partial y^j}{\partial x^i}$$
as in (22).
Hint: First check how the inverse of the metric $g^{ij}$ transforms. Note that
in the definition $g^{ij}g_{jk} = \delta^i_k$, $\delta^i_k$ is independent of coordinate changes.
(Here $n$ is the unit outward normal vector field to $\partial\Omega$, and $dV$ is the measure
on $\partial\Omega$ induced from the Euclidean metric.)
Remark. The way we have put the integration depends on the Euclidean
metric (to form the dot product, $dV$ and $n$). In the general form of Stokes's
Theorem, it is unnecessary to use the metric. (We may recast $\nabla \cdot v$ and $v \cdot n$ as
differential forms.)
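Idea of proof. We do the computation in a very special case, for $v$ having
compact support in $\Omega$, which is the lower half-space $\{x = (x^1, \dots, x^n) \in \mathbb{R}^n : x^n \le 0\}$.
In this case the unit normal vector $n = (0, \dots, 0, 1)$ and $dV = dx_{n-1}$
Lebesgue measure on $\mathbb{R}^{n-1} = \{x^n = 0\}$. Then, using Fubini's Theorem, we
want to prove
$$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \int_{-\infty}^{0} \frac{\partial v^i}{\partial x^i}\,dx^n\,dx^{n-1} \cdots dx^1 = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} v^n\,dx^{n-1} \cdots dx^1.$$
since $v$ has compact support. Therefore, using Fubini's Theorem, for each
$i \ne n$, we can integrate $\partial v^i/\partial x^i$ with respect to $x^i$ first to get zero. The
remaining term is the case $i = n$, and so
$$\begin{aligned}
\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \int_{-\infty}^{0} \frac{\partial v^i}{\partial x^i}\,dx^n\,dx^{n-1} \cdots dx^1
&= \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \int_{-\infty}^{0} \frac{\partial v^n}{\partial x^n}\,dx^n\,dx^{n-1} \cdots dx^1 \\
&= \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} v^n\,dx^{n-1} \cdots dx^1.
\end{aligned}$$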
apply the above special case to $\rho_\alpha v$ for $\rho_\alpha$ in the partition of unity and $v$ the
vector field.
It is also necessary to make sure that the various terms in the integrals
transform well with respect to the local diffeomorphisms. This can be checked
directly, but it is better to use the language of differential forms (see Spivak
or Guillemin and Pollack).
Homework Problem 36. Let $\Omega$ be a domain in $\mathbb{R}^n$ with smooth boundary.
On a neighborhood $N \subset \mathbb{R}^n$ of a point in the boundary $\partial\Omega$, assume that
$$\Omega \cap N = \{x \in N : x^n < f(x^1, \dots, x^{n-1})\}$$
so that $\Omega$ is locally the region under the graph of a smooth function $f$. Compute
$n$ and $dV$. For a smooth vector field $v$, compute
$$\int_{\partial\Omega \cap N} v \cdot n\,dV$$
3.7 The $\epsilon$-Neighborhood Theorem
Theorem 19. Let $X \subset \mathbb{R}^n$ be a compact $k$-dimensional manifold. Then
there is an $\epsilon > 0$ so that for
(b) At each $x \in X$, and given a smooth function $\phi$ as above, show that the
normal space $N_x$ is the image of the transpose of the tangent map
$D\phi(x)^\top : \mathbb{R}^{n-k} \to \mathbb{R}^n$.
(c) Use the previous section and the techniques of Problem 28 to show $NX$
is a manifold.
Proof of the $\epsilon$-Neighborhood Theorem. Consider the map $F : NX \to \mathbb{R}^n$
given by $F : (x, y) \mapsto x + y$. For each $x \in X$, $DF(x, 0) : T_{(x,0)}(NX) \to \mathbb{R}^n$
is a linear isomorphism. This can be proved since $T_{(x,0)}(NX)$ can be written
as a sum $T_x(X) + N_x(X)$, and $DF(x, 0)$, when restricted to each factor, is
a linear isomorphism. The Inverse Function Theorem then shows that for each
$x \in X$, there are neighborhoods $N_x$ of $(x, 0)$ in $NX$ and $W_x$ of $x$ in $\mathbb{R}^n$ so that
$F|_{N_x}$ is a diffeomorphism from $N_x$ to $W_x$. Note we may apply the Inverse
Function Theorem by considering a local parametrization of $NX$,
since diffeomorphisms of (open subsets of) manifolds are defined in terms of
these parametrizations.
Consider the following lemma:
Lemma 48. There are open sets $N$ and $X_\epsilon$ so that $X \times \{0\} \subset N \subset NX$ and
$X \subset X_\epsilon \subset \mathbb{R}^n$ and the restriction of $F$ is a diffeomorphism from $N$ to $X_\epsilon$.
Proof. First of all, we note that $DF$ is a linear isomorphism on $N' = \bigcup_{x \in X} N_x$.
The Inverse Function Theorem then shows that $F|_{N'}$ is a diffeomorphism
onto its image as long as it is one-to-one. Therefore, we need
only find an open $N$ satisfying $X \times \{0\} \subset N \subset N'$ on which $F$ is one-to-one.
Now assume by contradiction that no such $N$ exists. Then there are
points $(x_n, y_n) \ne (x_n', y_n') \in NX$ satisfying $F(x_n, y_n) = F(x_n', y_n')$ and so
that $|y_n|, |y_n'| < \frac{1}{n}$. (Why? You must use the compactness of $X$.) Since $X$ is
compact, there must be a subsequence $n_i$ so that $(x_{n_i}, y_{n_i}) \to (x, 0)$ as $i \to \infty$.
Then we may take a further subsequence $n_{i_j}$ so that $(x'_{n_{i_j}}, y'_{n_{i_j}}) \to (x', 0)$ as
$j \to \infty$. For simplicity, we rename the subsequence $n_{i_j}$ as simply $n$. Then
the continuity of $F$ shows that
4 The Calculus of Variations
4.1 The variational principle
In this section, we want to consider the problem of constructing a function
which minimizes a given functional. (A functional is a map from functions
to R.)
By pulling back the Euclidean metric on $\mathbb{R}^{n+1}$, we can consider the $n$-volume
of the graph. We have computed above
$$\mathrm{Vol}(f) = \int_\Omega \sqrt{1 + |\nabla f|^2}\,dx_n.$$
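$h \in C^2(\Omega) \cap C^0(\bar\Omega)$ and $h = 0$ on $\partial\Omega$.
$$\begin{aligned}
0 &= \frac{d}{d\epsilon}\Big|_{\epsilon=0} \mathrm{Vol}(f + \epsilon h) \\
&= \frac{d}{d\epsilon}\Big|_{\epsilon=0} \int_\Omega \sqrt{1 + |\nabla f + \epsilon\nabla h|^2}\,dx_n \\
&= \frac{d}{d\epsilon}\Big|_{\epsilon=0} \int_\Omega \sqrt{1 + |\nabla f|^2 + 2\epsilon\,\nabla f \cdot \nabla h + \epsilon^2|\nabla h|^2}\,dx_n \\
&= \int_\Omega \frac{2\,\nabla f \cdot \nabla h + 2\epsilon|\nabla h|^2}{2\sqrt{1 + |\nabla f + \epsilon\nabla h|^2}}\,\Bigg|_{\epsilon=0}\,dx_n \\
&= \int_\Omega \frac{\nabla f \cdot \nabla h}{\sqrt{1 + |\nabla f|^2}}\,dx_n \\
&= -\int_\Omega h\,\nabla \cdot \left(\frac{\nabla f}{\sqrt{1 + |\nabla f|^2}}\right) dx_n + \int_{\partial\Omega} h\,\frac{\nabla f}{\sqrt{1 + |\nabla f|^2}} \cdot n\,dV \\
&= -\int_\Omega h\,\nabla \cdot \left(\frac{\nabla f}{\sqrt{1 + |\nabla f|^2}}\right) dx_n.
\end{aligned}$$
This last integral must be equal to zero for every $h \in C^0(\bar\Omega)$ which vanishes
on $\partial\Omega$. We claim this forces
$$g = \nabla \cdot \left(\frac{\nabla f}{\sqrt{1 + |\nabla f|^2}}\right) = 0$$
on $\Omega$.
To prove the claim, note that since $f$ is $C^2$, $g$ is continuous on $\Omega$. We
prove the claim by contradiction. If $g$ is nonzero at any point $x \in \Omega$, assume
without loss of generality that $g(x) > 0$. Then by continuity, $g > 0$ in a small
ball $B$ centered at $x$. Now it is easy to find a smooth bump function $h$ whose
support is contained in $B$. In this case
$$\int_\Omega hg\,dx_n = \int_B hg\,dx_n > 0,$$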
Thus any function $f$ which minimizes the functional Vol satisfies the
Euler-Lagrange equation of the functional
$$\nabla \cdot \left(\frac{\nabla f}{\sqrt{1 + |\nabla f|^2}}\right) = 0.$$
This is the formula of the first variation, which comes from the first derivative
test in calculus. We may also use the second derivative test. A minimizer $f$
as above must satisfy the second variation formula
$$\frac{d^2}{d\epsilon^2}\Big|_{\epsilon=0} P(f_\epsilon) \ge 0.$$
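The first-variation formula for the volume functional can be sanity-checked in one dimension, where Vol is the length of a graph over $[0, 1]$. The sketch below assumes the sample functions $f(x) = x^2$ and $h(x) = \sin(\pi x)$ (so $h$ vanishes at both endpoints); these choices and the helper names are illustrative assumptions. It compares a central finite difference of $\epsilon \mapsto \mathrm{Vol}(f + \epsilon h)$ with the integral of $f'h'/\sqrt{1 + (f')^2}$.

```python
import math

def trapezoid(ys, xs):
    """Composite trapezoid rule for samples ys at nodes xs."""
    return sum((ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i]) / 2 for i in range(len(xs) - 1))

N = 2000
xs = [i / N for i in range(N + 1)]
df = [2 * x for x in xs]                              # f'(x) for f(x) = x^2
dh = [math.pi * math.cos(math.pi * x) for x in xs]    # h'(x) for h(x) = sin(pi x)

def vol(eps):
    """Length of the graph of f + eps*h over [0, 1]."""
    return trapezoid([math.sqrt(1 + (a + eps * b) ** 2) for a, b in zip(df, dh)], xs)

eps = 1e-5
numeric = (vol(eps) - vol(-eps)) / (2 * eps)          # finite-difference first variation
formula = trapezoid([a * b / math.sqrt(1 + a * a) for a, b in zip(df, dh)], xs)

# For a straight line (f' = 1) the first variation vanishes for this h.
formula_line = trapezoid([b / math.sqrt(2.0) for b in dh], xs)
```

The parabola is not a critical point, so `formula` is nonzero, while the straight line gives `formula_line` equal to zero up to rounding, as the Euler-Lagrange equation predicts.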
Homework Problem 38. Consider a variational problem for $C^2$ functions
$y = y(x)$ from a domain $[a, b]$ and fixed endpoints $y(a) = y_0$, $y(b) = y_1$.
Assume the functional is of the form
$$J(y) = \int_a^b F(y, y')\,dx,$$
4.2 Geodesics
Given a $C^1$ path $\gamma : I \to X$ for $I = [\alpha, \beta]$ an interval and $X \subset \mathbb{R}^N$ a manifold
with Riemannian metric $g$ induced from the Euclidean metric on $\mathbb{R}^N$, the
length of the path $\gamma(I)$ is given by
$$L(\gamma) = \int_I |\dot\gamma|_g\,dt = \int_I \sqrt{g(\dot\gamma, \dot\gamma)}\,dt = \int_I \sqrt{g_{ij}(\gamma(t))\,\dot\gamma^i(t)\,\dot\gamma^j(t)}\,dt.$$
(In the last formulation, note the use of local coordinates. So the last
formulation is strictly only true when $\gamma(I)$ is contained in a single coordinate
chart.) $L(\gamma)$ is called the length functional, which takes paths to $\mathbb{R}$.
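Proposition 49. The length of a path is independent of the parametrization.
In other words, if $\tilde\gamma(\tau) = \gamma(t(\tau))$ for $t = t(\tau)$ a $C^1$ diffeomorphism onto $I$,
then $L(\tilde\gamma) = L(\gamma)$.
Proof. Let $t = t(\tau)$ with $t(\tilde\alpha) = \alpha$, $t(\tilde\beta) = \beta$. Assume that $\tilde\alpha < \tilde\beta$; since $t$
is a diffeomorphism, $dt/d\tau > 0$. Then compute
$$\begin{aligned}
L(\tilde\gamma) &= \int_{\tilde\alpha}^{\tilde\beta} \sqrt{g\left(\frac{d\tilde\gamma}{d\tau}, \frac{d\tilde\gamma}{d\tau}\right)}\,d\tau \\
&= \int_{\tilde\alpha}^{\tilde\beta} \sqrt{g\left(\frac{d\gamma}{dt}\frac{dt}{d\tau}, \frac{d\gamma}{dt}\frac{dt}{d\tau}\right)}\,d\tau \\
&= \int_{\tilde\alpha}^{\tilde\beta} \sqrt{g\left(\frac{d\gamma}{dt}, \frac{d\gamma}{dt}\right)}\,\frac{dt}{d\tau}\,d\tau \\
&= \int_{\alpha}^{\beta} \sqrt{g\left(\frac{d\gamma}{dt}, \frac{d\gamma}{dt}\right)}\,dt \\
&= L(\gamma).
\end{aligned}$$
The case when $dt/d\tau < 0$ and $\tilde\alpha > \tilde\beta$ is similar.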
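As an illustration of the proposition, the following sketch approximates the length of a quarter circle in $\mathbb{R}^2$ (with the Euclidean metric) under two parametrizations. The curve, the reparametrization `t_of_tau`, and the polygonal approximation are all assumptions chosen for the example.

```python
import math

def length(curve, a, b, n=20000):
    """Polygonal approximation to the Euclidean length of curve on [a, b]."""
    pts = [curve(a + (b - a) * i / n) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def gamma(t):
    """Quarter of the unit circle, t in [0, pi/2]."""
    return (math.cos(t), math.sin(t))

def t_of_tau(tau):
    """Reparametrization of [0, 1] onto [0, pi/2] with dt/dtau > 0."""
    return (math.pi / 2) * (tau + tau * tau) / 2

def gamma_tilde(tau):
    return gamma(t_of_tau(tau))

L_gamma = length(gamma, 0.0, math.pi / 2)
L_gamma_tilde = length(gamma_tilde, 0.0, 1.0)
```

Both values approximate $\pi/2$, even though `gamma_tilde` traverses the arc at a non-constant speed.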
So this definition corresponds to the usual definition of the arc length of a
parametric curve. In particular, it is invariant under change of parametrization.
This particular feature turns out to cause trouble analytically. In the
following sections, we'll seek to find paths minimizing arc length by constructing
a sequence of paths approaching a length-minimizing one. The fact
that a potentially minimizing path has many different parametrizations will
make the analysis more difficult, since it will be difficult to find a sequence of
paths which approaches a particular minimizing path among all the possible
parametrizations. Another analytic objection to the length functional is that
it is the $L^1$ norm of the length of the tangent vector $\dot\gamma$. $L^2$ norms tend to
behave better, since we can use the structure of Hilbert spaces.
Assume for convenience that the interval $I = [0, 1]$. This can always be
achieved by using a linear map to take a given $I$ to $[0, 1]$.
Thus we introduce a related functional, the energy of a $C^1$ path $\gamma : [0, 1] \to X$. Define
$$E(\gamma) = \int_0^1 |\dot\gamma|_g^2\,dt.$$
The energy is related to the length by the following proposition.
Proposition 50. For a given homotopy class $C$ of curves $\gamma : [0, 1] \to X$, a
$C^1$ curve minimizes $E$ in $C$ if and only if it minimizes $L$ among $C^1$ curves
in $C$ and the speed $|\dot\gamma(t)|_g$ is constant.
Note this definition is well-defined, since $H(1/2, t) = \gamma_1(t)$ for either
definition above. This observation also shows that $H$ is continuous. It is
straightforward to show $H$ is a homotopy.
A $C^1$ diffeomorphism $t = t(\tau)$ of $[0, 1]$ is called orientation preserving if
$dt/d\tau > 0$. Another fact about homotopy we'll presently use is the following
Lemma 52. If $\tilde\gamma(\tau) = \gamma(t(\tau))$ for $t = t(\tau)$ an orientation-preserving
diffeomorphism of $[0, 1]$, then $\gamma$ and $\tilde\gamma$ are homotopic.
Proof. For $s, \tau \in [0, 1]$, define $\sigma(s, \tau) = s\tau + (1 - s)t(\tau)$. Then we will
show that $G(s, \tau) = \gamma(\sigma(s, \tau))$ is the required homotopy. First of all, since
$t(\tau)$ is an orientation-preserving diffeomorphism, we see $t(0) = 0$, $t(1) = 1$.
Now check that for $s, \tau \in [0, 1]$, $\sigma(s, \tau) \in [0, 1]$: because $0 \le \tau \le 1$ and
$0 \le t(\tau) \le 1$, then
$$0 = s(0) + (1 - s)0 \le s\tau + (1 - s)t(\tau) \le s(1) + (1 - s)(1) = 1.$$
This shows the homotopy $G$ is well-defined. It is obvious for $\tau \in [0, 1]$
that $G(0, \tau) = \gamma_0(\tau)$ and $G(1, \tau) = \gamma_1(\tau)$. Also compute for $s \in [0, 1]$,
$G(s, 0) = \gamma(0)$ and $G(s, 1) = \gamma(1)$.
Also, note the following
Lemma 53. For any $C^1$ path $\gamma$, $E(\gamma) \ge L(\gamma)^2$ and they are equal if and
only if $|\dot\gamma(t)|_g$ is constant.
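The inequality between energy and length is an instance of the Cauchy-Schwarz inequality on $[0, 1]$, and it is easy to see numerically. The sketch below takes the straight segment from 0 to 1 in $\mathbb{R}$ traversed in two ways, $\gamma(t) = t$ and $\gamma(t) = t^2$; these curves and the midpoint-rule helper are assumptions made only for illustration.

```python
def energy_and_length(speed, n=10000):
    """Midpoint-rule values of E = int |gamma'|^2 dt and L = int |gamma'| dt on [0, 1]."""
    ts = [(i + 0.5) / n for i in range(n)]
    E = sum(speed(t) ** 2 for t in ts) / n
    L = sum(speed(t) for t in ts) / n
    return E, L

E_const, L_const = energy_and_length(lambda t: 1.0)       # gamma(t) = t, constant speed
E_var, L_var = energy_and_length(lambda t: 2.0 * t)       # gamma(t) = t^2, variable speed
```

Both parametrizations have length 1, but only the constant-speed one attains $E = L^2$; the other has energy close to $4/3$.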
Homework Problem 39. (a) Let $\gamma : [0, 1] \to X$, $\gamma = \gamma(t)$ be a $C^1$ path
into a Riemannian manifold $X$. Assume $|\dot\gamma(t)|_g \ne 0$ for all $t \in [0, 1]$.
Show that there is a reparametrization $t(\tau)$ so that $t(0) = 0$, $t(1) = 1$,
$dt/d\tau > 0$, and $\left|\frac{d\gamma}{d\tau}\right|_g$ is constant.
Hint: Show the constant must be equal to $L(\gamma)$. Then show the condition
is an ODE in $\tau = \tau(t)$. (Note that if $dt/d\tau > 0$, then $t(\tau)$ is
strictly increasing and thus has an inverse on $[0, 1]$.)
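formula
$$\begin{aligned}
\frac{d}{d\epsilon}\Big|_{\epsilon=0} E(\gamma_\epsilon) &= \frac{d}{d\epsilon}\Big|_{\epsilon=0} \int_0^1 g(\dot\gamma_\epsilon(t), \dot\gamma_\epsilon(t))\,dt \\
&= \frac{d}{d\epsilon}\Big|_{\epsilon=0} \int_I g_{ij}(\gamma(t) + \epsilon h(t))\,[\dot\gamma^i(t) + \epsilon\dot h^i(t)]\,[\dot\gamma^j(t) + \epsilon\dot h^j(t)]\,dt \\
&= \int_I \frac{\partial g_{ij}}{\partial x^k}(\gamma(t))\,h^k(t)\,\dot\gamma^i(t)\,\dot\gamma^j(t)\,dt \\
&\quad + \int_I g_{ij}(\gamma(t))\,\dot h^i(t)\,\dot\gamma^j(t)\,dt \\
&\quad + \int_I g_{ij}(\gamma(t))\,\dot\gamma^i(t)\,\dot h^j(t)\,dt
\end{aligned}$$
Now we integrate by parts in the last two integrals. Note that since $h$ has
compact support, all the boundary terms involving $h$ vanish. Compute
$$\int_I g_{ij}(\gamma(t))\,\dot h^i(t)\,\dot\gamma^j(t)\,dt = -\int_I \frac{\partial g_{ij}}{\partial x^k}(\gamma(t))\,\dot\gamma^k(t)\,h^i(t)\,\dot\gamma^j(t)\,dt - \int_I g_{ij}(\gamma(t))\,h^i(t)\,\ddot\gamma^j(t)\,dt.$$
Since this is true for each $h$ with compact support in $I$, then we must have
for each $k = 1, \dots, n$, and for all $t$ in the open interval $I$,
$$0 = \frac{\partial g_{ij}}{\partial x^k}\,\dot\gamma^i\dot\gamma^j - \frac{\partial g_{kj}}{\partial x^i}\,\dot\gamma^i\dot\gamma^j - \frac{\partial g_{ik}}{\partial x^j}\,\dot\gamma^j\dot\gamma^i - g_{kj}\,\ddot\gamma^j - g_{jk}\,\ddot\gamma^j.$$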
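Since $g_{kj} = g_{jk}$, we have
$$\begin{aligned}
0 &= g_{jk}\,\ddot\gamma^j + \frac{1}{2}\left(-\frac{\partial g_{ij}}{\partial x^k} + \frac{\partial g_{kj}}{\partial x^i} + \frac{\partial g_{ik}}{\partial x^j}\right)\dot\gamma^i\dot\gamma^j, \\
0 &= \ddot\gamma^\ell + \frac{1}{2}\,g^{k\ell}\left(\frac{\partial g_{kj}}{\partial x^i} + \frac{\partial g_{ik}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^k}\right)\dot\gamma^i\dot\gamma^j \\
&= \ddot\gamma^\ell + \Gamma^\ell_{ij}\,\dot\gamma^i\dot\gamma^j, \\
\Gamma^\ell_{ij} &= \frac{1}{2}\,g^{k\ell}\left(\frac{\partial g_{kj}}{\partial x^i} + \frac{\partial g_{ik}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^k}\right).
\end{aligned}$$
$\Gamma^\ell_{ij}$ are called the Christoffel symbols of the metric $g_{ij}$, and
$$\ddot\gamma^\ell + \Gamma^\ell_{ij}\,\dot\gamma^i\dot\gamma^j = 0 \qquad (26)$$
$$\Gamma^\ell_{ij} = \Gamma^\ell_{ji}.$$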
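The formula $\Gamma^\ell_{ij} = \frac{1}{2} g^{k\ell}(\partial_i g_{kj} + \partial_j g_{ik} - \partial_k g_{ij})$ is mechanical enough to evaluate numerically. The sketch below approximates the partial derivatives of the metric by central differences; the function names `inverse`, `christoffel` and `hyperbolic` are hypothetical, and the sample metric is the upper half-plane metric $g_{ij} = (x^n)^{-2}\delta_{ij}$ with $n = 2$.

```python
def inverse(M):
    """Inverse of a small nonsingular matrix by Gauss-Jordan elimination."""
    n = len(M)
    A = [list(map(float, M[i])) + [1.0 if i == j else 0.0 for j in range(n)]
         for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(n):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [row[n:] for row in A]

def christoffel(metric, x, h=1e-5):
    """Return Gamma[l][i][j] at x; metric(x) gives the matrix g_ij(x)."""
    n = len(x)
    ginv = inverse(metric(x))
    dg = []                      # dg[k][i][j] approximates d g_ij / d x^k
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)]
                   for i in range(n)])
    return [[[0.5 * sum(ginv[k][l] * (dg[i][k][j] + dg[j][i][k] - dg[k][i][j])
                        for k in range(n))
              for j in range(n)] for i in range(n)] for l in range(n)]

def hyperbolic(x):
    """Upper half-space metric (x^n)^(-2) delta_ij."""
    s = x[-1] ** -2
    return [[s if i == j else 0.0 for j in range(len(x))] for i in range(len(x))]

G = christoffel(hyperbolic, [0.3, 2.0])
```

At the point with last coordinate 2, the nonzero symbols come out as $\pm 1/2$, and `G[l][i][j]` is symmetric in `i` and `j`, as the last displayed identity requires.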
(b) Use part (a) to prove the following generalization of Proposition 50:
A curve in C is a critical point of E if and only if it is a critical point
of L and it has constant speed.
Homework Problem 42. Let $(X, g)$ be an $n$-dimensional smooth compact
Riemannian manifold. By Nash's Theorem, we may assume that $g = i^*\delta$, the
pull-back of the Euclidean metric on $\mathbb{R}^N$ for some embedding $i : X \to \mathbb{R}^N$.
If $(p, v) \in TX$ (i.e. $p \in X$ and $v \in T_pX$), show that the solution to the
geodesic equation (26) on $X$ with initial conditions $\gamma(0) = p$ and $\dot\gamma(0) = v$
exists for all time.
Hints:
(a) Show that if $\gamma(t)$ solves the geodesic equation (26), then the speed $|\dot\gamma(t)|_g$
is constant in $t$.
(b) Reduce the problem to the case the initial speed $|v|_{g(p)} = 1$.
(c) The unit tangent bundle $UTX$ is defined by
$$UTX = \{(p, v) \in TX : |v|_{g(p)} = 1\}.$$
Show $UTX$ is compact as long as $X$ is compact.
(d) Mimic the proof of Theorem 15 to complete the proof.
Example 16. Euclidean space is $\mathbb{R}^n$ with the standard Euclidean metric
$\delta = \delta_{ij}\,dx^i\,dx^j$. In this case, all the Christoffel symbols $\Gamma^k_{ij}$ vanish, since each
term involves differentiating the components of the metric tensor, all of which
are constant. Therefore, the geodesic system is simply $\ddot\gamma^k = 0$. Solutions to
this ODE are simply linear functions of $t$, and so geodesics are of the form
$\gamma = tv + w$ for $v, w \in \mathbb{R}^n$. So geodesics on Euclidean space are straight lines
traversed at constant speed.
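Example 17. For hyperbolic space, recall the metric $g_{ij} = (x^n)^{-2}\delta_{ij}$ on
$\{x \in \mathbb{R}^n : x^n > 0\}$. Compute the Christoffel symbols:
$$\begin{aligned}
g^{ij} &= (x^n)^2\,\delta^{ij}, \\
g_{ij,k} &= -2(x^n)^{-3}\,\delta_{ij}\,\delta_{kn}, \\
\Gamma^k_{ij} &= \tfrac{1}{2}(x^n)^2\,\delta^{k\ell}\,(g_{i\ell,j} + g_{\ell j,i} - g_{ij,\ell}) \\
&= \tfrac{1}{2}(x^n)^2\,\delta^{k\ell}\,[-2(x^n)^{-3}]\,(\delta_{i\ell}\delta_{jn} + \delta_{\ell j}\delta_{in} - \delta_{ij}\delta_{\ell n}) \\
&= -(x^n)^{-1}(\delta_{ik}\delta_{jn} + \delta_{jk}\delta_{in} - \delta_{kn}\delta_{ij}).
\end{aligned}$$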
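Now consider $i, j, k$ distinct integers in $\{1, \dots, n\}$.
$$\begin{aligned}
\Gamma^k_{ij} &= 0, \\
\Gamma^i_{ik} = \Gamma^i_{ki} &= -(x^n)^{-1}\delta_{kn}, \\
\Gamma^k_{ii} &= (x^n)^{-1}\delta_{kn}, \\
\Gamma^i_{ii} &= -(x^n)^{-1}\delta_{in}.
\end{aligned}$$
$$\begin{aligned}
0 = \ddot\gamma^k &= -\Gamma^k_{ij}\,\dot\gamma^i\dot\gamma^j \\
&= -\Gamma^k_{nn}\,\dot\gamma^n\dot\gamma^n \\
&= -(x^n)^{-1}\delta_{kn}\,\dot\gamma^n\dot\gamma^n = 0.
\end{aligned}$$
$$\begin{aligned}
\ddot\gamma^n &= -\Gamma^n_{ij}\,\dot\gamma^i\dot\gamma^j \\
&= -\Gamma^n_{nn}\,\dot\gamma^n\dot\gamma^n \\
&= (x^n)^{-1}\,\dot\gamma^n\dot\gamma^n, \\
\ddot\gamma^n &= (\gamma^n)^{-1}\,\dot\gamma^n\dot\gamma^n. \qquad (27)
\end{aligned}$$
$$(\dot\gamma^n\gamma^n)^{\cdot} = \ddot\gamma^n\gamma^n + \dot\gamma^n\dot\gamma^n,$$
and that each of these terms is similar to those in the geodesic equation (27)
above.
In particular, compute for a function $f$ of $\gamma^n$
$$\begin{aligned}
0 &= (f(\gamma^n)\,\dot\gamma^n)^{\cdot} \qquad (28) \\
&= f(\gamma^n)\,\ddot\gamma^n + f'(\gamma^n)\,\dot\gamma^n\dot\gamma^n, \\
0 &= \ddot\gamma^n + \frac{f'(\gamma^n)\,\dot\gamma^n\dot\gamma^n}{f(\gamma^n)}. \qquad (29)
\end{aligned}$$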
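This last equation is the same as the geodesic equation (27) if
$$\frac{f'(\gamma^n)}{f(\gamma^n)} = -\frac{1}{\gamma^n},$$
and this is now a first-order separable equation for $f$. We may solve to find
$f = (\gamma^n)^{-1}$ is a solution.
Now plug into (28) to find
$$\begin{aligned}
0 &= \left(\frac{\dot\gamma^n}{\gamma^n}\right)^{\cdot}, \\
C &= \frac{\dot\gamma^n}{\gamma^n} = (\log \gamma^n)^{\cdot}, \\
Ct + D &= \log \gamma^n, \\
\gamma^n &= Ae^{Ct}
\end{aligned}$$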
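The solution $\gamma^n = Ae^{Ct}$ can be substituted back into equation (27), $\ddot\gamma^n = (\gamma^n)^{-1}\dot\gamma^n\dot\gamma^n$, as a check. The sketch below does this with finite differences for one assumed pair of constants `A`, `C`; it is a numerical spot check at a few times, not a derivation.

```python
import math

A, C = 2.0, -0.7                      # assumed sample constants, with A > 0

def gamma_n(t):
    return A * math.exp(C * t)

def residual(t, h=1e-4):
    """Finite-difference value of gamma'' - (gamma')^2 / gamma at time t."""
    d1 = (gamma_n(t + h) - gamma_n(t - h)) / (2 * h)
    d2 = (gamma_n(t + h) - 2 * gamma_n(t) + gamma_n(t - h)) / h ** 2
    return d2 - d1 ** 2 / gamma_n(t)

residuals = [abs(residual(t)) for t in (-1.0, 0.0, 0.5, 2.0)]
```

All residuals are zero up to discretization and rounding error, for either sign of `C`.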
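Dropping the pull back notation, we compute (with no sum over $j$ until indicated)
$$\begin{aligned}
y^j &= \frac{\lambda x^j}{|x|^2}, \\
dy^j &= \frac{\partial y^j}{\partial x^i}\,dx^i
= \lambda \sum_{i=1}^n \frac{|x|^2\delta_{ij} - 2x^ix^j}{|x|^4}\,dx^i, \\
(dy^j)^2 &= \lambda^2 \left(\sum_{i=1}^n \frac{|x|^2\delta_{ij} - 2x^ix^j}{|x|^4}\,dx^i\right)^2 \\
&= \lambda^2 \left(\sum_{i=1}^n \frac{|x|^2\delta_{ij} - 2x^ix^j}{|x|^4}\,dx^i\right)\left(\sum_{k=1}^n \frac{|x|^2\delta_{kj} - 2x^kx^j}{|x|^4}\,dx^k\right) \\
&= \frac{\lambda^2}{|x|^8} \sum_{i,k=1}^n \left[4x^ix^k(x^j)^2 - 2|x|^2x^ix^j\delta^j_k - 2|x|^2x^kx^j\delta^j_i + |x|^4\delta^j_i\delta^j_k\right] dx^i\,dx^k \\
&= \frac{\lambda^2}{|x|^8} \Bigg\{4(x^j)^2 \sum_{i,k=1}^n x^ix^k\,dx^i\,dx^k - 2|x|^2x^j\,dx^j \sum_{i=1}^n x^i\,dx^i \\
&\qquad\qquad - 2|x|^2x^j\,dx^j \sum_{k=1}^n x^k\,dx^k + |x|^4(dx^j)^2\Bigg\} \\
&= \frac{\lambda^2}{|x|^8} \Bigg\{4(x^j)^2 \sum_{i,k=1}^n x^ix^k\,dx^i\,dx^k - 4|x|^2x^j\,dx^j \sum_{i=1}^n x^i\,dx^i + |x|^4(dx^j)^2\Bigg\},
\end{aligned}$$
$$\begin{aligned}
\sum_{j=1}^n (dy^j)^2 &= \frac{\lambda^2}{|x|^8} \Bigg\{4\left(\sum_{j=1}^n (x^j)^2\right)\left(\sum_{i,k=1}^n x^ix^k\,dx^i\,dx^k\right) \\
&\qquad\qquad - 4|x|^2 \sum_{i,j=1}^n x^jx^i\,dx^i\,dx^j + |x|^4 \sum_{j=1}^n (dx^j)^2\Bigg\} \\
&= \frac{\lambda^2}{|x|^8} \Bigg\{4|x|^2 \sum_{i,k=1}^n x^ix^k\,dx^i\,dx^k - 4|x|^2 \sum_{i,k=1}^n x^ix^k\,dx^i\,dx^k + |x|^4 \sum_{j=1}^n (dx^j)^2\Bigg\} \\
&= \frac{\lambda^2}{|x|^4} \sum_{j=1}^n (dx^j)^2,
\end{aligned}$$
$$\frac{\sum_{j=1}^n (dy^j)^2}{(y^n)^2} = \frac{\frac{\lambda^2}{|x|^4}\sum_{j=1}^n (dx^j)^2}{\lambda^2\,\frac{(x^n)^2}{|x|^4}} = \frac{\sum_{j=1}^n (dx^j)^2}{(x^n)^2}.$$
Therefore, $\Phi^*g = g$ and $\Phi$ is an isometry.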
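The isometry computation above can be spot-checked at a single point. The sketch below assumes the map has the form $y = \lambda x/|x|^2$ with a positive constant written `lam` in the code (taking `lam = 1` gives the plain inversion); the function name and the sample point are hypothetical. It forms the Jacobian explicitly and compares the pulled-back metric $J^\top J/(y^n)^2$ with $(x^n)^{-2}\delta_{ik}$.

```python
def inversion_pullback_check(x, lam=1.0):
    """Return (pulled-back metric, hyperbolic metric) at x for y = lam * x / |x|^2."""
    n = len(x)
    r2 = sum(c * c for c in x)
    # Jacobian J[j][i] = dy^j/dx^i = lam * (|x|^2 delta_ij - 2 x^i x^j) / |x|^4
    J = [[lam * ((r2 if i == j else 0.0) - 2 * x[i] * x[j]) / r2 ** 2
          for i in range(n)] for j in range(n)]
    yn = lam * x[-1] / r2
    pulled = [[sum(J[j][i] * J[j][k] for j in range(n)) / yn ** 2
               for k in range(n)] for i in range(n)]
    original = [[(1.0 if i == k else 0.0) / x[-1] ** 2 for k in range(n)]
                for i in range(n)]
    return pulled, original

pulled, original = inversion_pullback_check([0.4, -1.3, 0.7], lam=2.5)
```

The two matrices agree to rounding error, independently of the value chosen for `lam`.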
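Moreover, it is trivial to check that any translation $x \mapsto x + x_0$ is an
isometry of $\mathbb{H}^n$ if the last component $x_0^n = 0$. Also, note that the composition
of two isometries is again an isometry (indeed the set of isometries of a
Riemannian manifold $X$ forms a subgroup of the diffeomorphism group called
the isometry group).
Proposition 55 below shows that for any geodesic $\gamma : \mathbb{R} \to \mathbb{H}^n$, then $\Phi \circ \gamma$
is also a geodesic. Recall we know so far that
$$\gamma = (\gamma_0^1, \dots, \gamma_0^{n-1}, Ae^{Ct})$$
are geodesics for $A > 0$, $C \in \mathbb{R}$. Compute for $\lambda > 0$,
$$\Phi \circ \gamma = \frac{\lambda\gamma}{|\gamma|^2} = \frac{\lambda\,(\gamma_0^1, \dots, \gamma_0^{n-1}, Ae^{Ct})}{(\gamma_0^1)^2 + \cdots + (\gamma_0^{n-1})^2 + A^2e^{2Ct}}.$$
The image $\Phi \circ \gamma(\mathbb{R})$ is then the half-circle in $\mathbb{R}^n$ which intersects $\{x^n = 0\}$
perpendicularly at
$$0 \quad \text{and} \quad \frac{\lambda\,(\gamma_0^1, \dots, \gamma_0^{n-1}, 0)}{(\gamma_0^1)^2 + \cdots + (\gamma_0^{n-1})^2}.$$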
Then if we apply the isometry given by adding a constant $x_0$ with $x_0^n = 0$,
then every half-circle in $\mathbb{H}^n$ which intersects $\{x^n = 0\}$ perpendicularly at both
endpoints is the image of a geodesic path in $\mathbb{H}^n$.
All together, for constants
$$v_0^1 = \cdots = v_0^{n-1} = 0.$$
The following proposition was discussed in Example 17 above.
Proposition 54. Consider a Riemannian manifold $(X, g)$. Given $p \in X$,
$v \in T_pX$, there is an $\epsilon > 0$ and a unique geodesic $\gamma : (-\epsilon, \epsilon) \to X$ with
$\gamma(0) = p$, $\dot\gamma(0) = v$.
Remark. In general, the geodesic may not exist for all time, although we
have seen that all the geodesics on hyperbolic space (Example 17) and on
compact Riemannian manifolds (Problem 42) do exist for all time.
A map $\Phi : X \to Y$ for manifolds $X$ and $Y$ with Riemannian metrics $g$
and $h$ respectively is a local isometry if every point in $X$ has a neighborhood
$O$ on which $\Phi : O \to \Phi(O) \subset Y$ is an isometry.
Proposition 55. If $\Phi : X \to Y$ is a local isometry of Riemannian manifolds,
then for every geodesic $\gamma : (\alpha, \beta) \to X$, $\Phi \circ \gamma$ is a geodesic on $Y$. Any geodesic
on $\Phi(X) \subset Y$ is of this form.
Proof. In local coordinates on $X$ and $Y$, we can write the isometry as $y = y(x)$.
Note this is the same form as a coordinate change, and the condition
that the map is an isometry is simply that the metric pulls back as a $(0, 2)$
tensor when changing coordinates.
Therefore, the proof boils down to the following fact: for a local isometry,
and for any $C^2$ path $\gamma$, the quantity
$$w^k = \ddot\gamma^k + \Gamma^k_{ij}\,\dot\gamma^i\dot\gamma^j$$
transforms like a tangent vector (i.e. a $(1, 0)$ tensor) under changes of
coordinates. Therefore,
$$w^k\,\frac{\partial}{\partial x^k} = w^k\,\frac{\partial y^I}{\partial x^k}\,\frac{\partial}{\partial y^I}$$
and $w^k(x) = 0$ for $k = 1, \dots, n$ is equivalent to $w^I(y) = 0$ for $I = 1, \dots, n$.
This is because $\frac{\partial y^I}{\partial x^k}$ is nonsingular for $y = y(x)$ a diffeomorphism.
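Compute
$$\begin{aligned}
g_{IJ,K} &= \frac{\partial g_{IJ}}{\partial y^K} \\
&= \frac{\partial}{\partial y^K}\left(g_{ij}\,\frac{\partial x^i}{\partial y^I}\,\frac{\partial x^j}{\partial y^J}\right) \\
&= \frac{\partial g_{ij}}{\partial y^K}\,\frac{\partial x^i}{\partial y^I}\,\frac{\partial x^j}{\partial y^J} + g_{ij}\,\frac{\partial^2 x^i}{\partial y^I \partial y^K}\,\frac{\partial x^j}{\partial y^J} + g_{ij}\,\frac{\partial x^i}{\partial y^I}\,\frac{\partial^2 x^j}{\partial y^J \partial y^K} \\
&= g_{ij,k}\,\frac{\partial x^k}{\partial y^K}\,\frac{\partial x^i}{\partial y^I}\,\frac{\partial x^j}{\partial y^J} + g_{ij}\,\frac{\partial^2 x^i}{\partial y^I \partial y^K}\,\frac{\partial x^j}{\partial y^J} + g_{ij}\,\frac{\partial x^i}{\partial y^I}\,\frac{\partial^2 x^j}{\partial y^J \partial y^K}.
\end{aligned}$$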
Then compute
Then the Christoffel symbols
This equation follows from the formula for the first derivative of an inverse
matrix. If $\dot A$ represents the first derivative of a matrix $A$ (with respect to
any parameter or variable), then
$$(A^{-1})^{\cdot} = -A^{-1}\dot A A^{-1}.$$
(Proof: Differentiate the equation $AA^{-1} = I$ to find $\dot A A^{-1} + A(A^{-1})^{\cdot} = 0$.)
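The formula for the derivative of an inverse matrix is easy to test on a one-parameter family. The sketch below uses an assumed $2 \times 2$ family `A(t)` with entrywise derivative `dA(t)` and compares a central finite difference of $A(t)^{-1}$ with $-A^{-1}\dot A A^{-1}$; all names and values are illustrative.

```python
def inv2(M):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(A, B):
    return [[sum(A[i][m] * B[m][j] for m in range(2)) for j in range(2)] for i in range(2)]

def A(t):
    """Assumed sample family of nonsingular matrices near t = 0.3."""
    return [[2.0 + t, t * t], [1.0, 3.0 - t]]

def dA(t):
    """Entrywise derivative of A(t)."""
    return [[1.0, 2.0 * t], [0.0, -1.0]]

t, h = 0.3, 1e-6
plus, minus = inv2(A(t + h)), inv2(A(t - h))
numeric = [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

Ainv = inv2(A(t))
formula = [[-v for v in row] for row in matmul(matmul(Ainv, dA(t)), Ainv)]
```

The matrices `numeric` and `formula` agree to within the finite-difference error.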
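Then since $(\partial y^L/\partial x^\ell)$ is the inverse matrix of $(\partial x^\ell/\partial y^L)$,
$$\begin{aligned}
\frac{\partial^2 y^L}{\partial x^\ell \partial x^j} &= \frac{\partial}{\partial x^j}\left(\frac{\partial y^L}{\partial x^\ell}\right) \\
&= -\frac{\partial y^L}{\partial x^k}\,\frac{\partial}{\partial x^j}\left(\frac{\partial x^k}{\partial y^J}\right)\frac{\partial y^J}{\partial x^\ell} \\
&= -\frac{\partial y^L}{\partial x^k}\,\frac{\partial y^I}{\partial x^j}\,\frac{\partial^2 x^k}{\partial y^I \partial y^J}\,\frac{\partial y^J}{\partial x^\ell}.
\end{aligned}$$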
Upon plugging in, this proves formula (31) and the proposition.
Remark. There is also a more geometric proof of the previous proposition.
Recall that we derived the geodesic equation as the Euler-Lagrange equation
of the energy functional. So any path which minimizes the energy satisfies the
geodesic equation. It is easy to see that the energy of a path is invariant under
an isometry; therefore, the notion of energy-minimizing path is invariant
under isometries.
The problem is that there are geodesics which do not minimize the en-
ergy. (They may be saddle points of the energy functional.) This can be
surmounted by restricting to small domains by using the following fact from
Riemannian geometry: Every point in a Riemannian manifold has a neighbor-
hood O so that all geodesic paths in O are energy-minimizing for endpoints
in O. (In Riemannian geometry books, this fact is usually stated in terms
of the length functional instead; to translate to the present situation, re-
call that energy-minimizing paths are length-minimizing paths parametrized
with constant speed.)
Homework Problem 43. Given a smooth function $f$ on a Riemannian manifold,
the Hessian of $f$ is defined locally by the formula
$$H(f)_{ij} = \frac{\partial^2 f}{\partial x^i \partial x^j} - \Gamma^k_{ij}\,\frac{\partial f}{\partial x^k}.$$
Show that the Hessian of $f$ is a symmetric $(0, 2)$ tensor.
Homework Problem 44. Compute all the geodesics on $S^2$.
Hint: Use the expression for the metric in local coordinates $(y^1, y^2)$ from
Example 13. Compute the Christoffel symbols. Analyze the case when $y^2 = 0$
and only $y^1$ varies. Solve the resulting second-order ODE for $\gamma^1 = y^1$. Then
move these geodesics around via the isometry group of $S^2$.
(The isometry group of $S^2$ is given by the orthogonal group of $3 \times 3$ matrices
$$O(3) = \{A : AA^\top = I\}.$$
Show that each such linear action is an isometry of $\mathbb{R}^3$ which takes the unit
sphere $S^2$ to itself. For every line $L$ through the origin in $\mathbb{R}^3$, show that
rotating by an angle $\theta$ around the line $L$ is a linear map in $O(3)$. Show
that every initial condition $(p, v) \in TS^2$ of the geodesic equation on $S^2$ can
be realized by the examples you computed above, when acted on by such a
rotation in $O(3)$.)
elliptic PDEs. The problem we approach involves geodesics, and thus the
solution we produce will be a solution to an ODE. This will allow us to proceed
with much of the general picture of the calculus of variations while avoiding
some of the more technical points. In particular, we will learn about
distributions, weak derivatives, Hilbert spaces, and compact maps between
Banach spaces in solving our problem.
Given a smooth manifold $X$, a loop is a continuous map from the circle $S^1$
to $X$. Each such loop is equivalent to a continuous map $\gamma : \mathbb{R} \to X$ which is
periodic in the sense that $\gamma(t + 1) = \gamma(t)$ for all $t \in \mathbb{R}$. We will abuse notation
by using the same $\gamma$ for $\gamma : S^1 \to X$ and the periodic $\gamma : \mathbb{R} \to X$. (This is
because $S^1$ is naturally the quotient $\mathbb{R}/\mathbb{Z}$, where $\mathbb{Z}$ acts on $\mathbb{R}$ by adding
integers to real numbers.) Two loops $\gamma_0, \gamma_1 : S^1 \to X$ are freely homotopic if
there is a continuous homotopy
in $X_\epsilon$. Then the map $\pi : X_\epsilon \to X$ which sends a point in $X_\epsilon$ to its closest
point in $X$ is a smooth map of $X_\epsilon$ to $X$, and it fixes each point in $X \subset X_\epsilon$.
Let $\gamma_0$ and $\gamma_1$ be loops on $X$ satisfying
Apply Proposition 56 to show $\gamma$ and $\gamma_i$ are in the same free homotopy class.
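Theorem 20. Let $f : \mathbb{R}^n \to Y$ be uniformly continuous, where $Y \subset \mathbb{R}^N$ is
a compact submanifold without boundary. Then $f$ is homotopic to a smooth
map from $\mathbb{R}^n \to Y$.
Proof. Since $f$ is uniformly continuous, for all $\epsilon > 0$, there is a $\delta > 0$ so
that if $|x - x'| < \delta$, then $|f(x) - f(x')| < \epsilon$. The $\epsilon$-Neighborhood Theorem
shows that there is an $\epsilon > 0$ so that the map $\pi : Y_\epsilon \to Y$ is well-defined and
smooth. Let $\delta$ be the corresponding $\delta$ from the uniform continuity of $f$.
Let $\rho$ be a smooth nonnegative bump function with support in the unit
ball $B_1(0)$ in $\mathbb{R}^n$ so that $\int_{\mathbb{R}^n} \rho\,dx_n = 1$. Then for $\eta > 0$, define
$\rho_\eta(x) = \eta^{-n}\rho(x/\eta)$. Note $\mathrm{supp}\,\rho_\eta = B_\eta(0)$. Define
$$f_\eta(x) = \int_{\mathbb{R}^n} f(y)\,\rho_\eta(x - y)\,dy_n = \int_{\{y : |x - y| \le \eta\}} f(y)\,\rho_\eta(x - y)\,dy_n.$$
(Note each $f_\eta$ is $\mathbb{R}^N$-valued.) If $\eta < \delta$, then $|f(y) - f(x)| < \epsilon$ for $y$ in the
domain of integration, and so
$$\begin{aligned}
f_\eta(x) &= \int_{\{y : |x - y| \le \eta\}} f(y)\,\rho_\eta(x - y)\,dy_n \\
&= \int_{\{y : |x - y| \le \eta\}} [f(y) - f(x)]\,\rho_\eta(x - y)\,dy_n + \int_{\{y : |x - y| \le \eta\}} f(x)\,\rho_\eta(x - y)\,dy_n \\
&= \int_{\{y : |x - y| \le \eta\}} [f(y) - f(x)]\,\rho_\eta(x - y)\,dy_n + f(x)
\end{aligned}$$
since
$$\int_{\{y : |x - y| \le \eta\}} \rho_\eta(x - y)\,dy_n = \int_{\mathbb{R}^n} \rho_\eta(x - y)\,dy_n = \int_{\mathbb{R}^n} \rho_\eta(z)\,dz_n = 1$$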
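Therefore if $\eta \in (0, \delta)$, then $f_\eta(x) \in Y_\epsilon$. Then we check that $\tilde f_\eta(x) = \pi(f_\eta(x))$
is the desired homotopy. In particular, as $\eta \to 0$, $f_\eta(x) \to f(x)$
uniformly by (32) (view $\epsilon$ as varying to zero instead of fixed for this
interpretation). Since $\pi$ and $f_\eta$ are smooth, then $\tilde f_\eta$ is smooth for small $\eta > 0$.
In particular, we have shown that
$$F(\eta, x) = \begin{cases} \tilde f_\eta(x) & \text{for } \eta > 0 \text{ small} \\ f(x) & \text{for } \eta = 0 \end{cases}$$
is the desired homotopy.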
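The mollification step in the proof above can be illustrated in one dimension. The sketch below takes the uniformly continuous function $f(x) = |x|$, a standard smooth bump supported in $[-1, 1]$, and approximates the convolution $\int f(x - \eta z)\,\rho(z)\,dz$ by a midpoint rule with normalized weights. The names `bump`, `mollify` and the parameter `eta` are assumptions made for the example, and the scale parameter need not match the symbol used in the notes.

```python
import math

def bump(x):
    """Smooth nonnegative bump supported in [-1, 1] (normalized below)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

M = 2000
NODES = [-1 + (2 * k + 1) / M for k in range(M)]     # midpoints of [-1, 1]
WEIGHTS = [bump(z) for z in NODES]
TOTAL = sum(WEIGHTS)                                  # makes the discrete weights sum to 1

def mollify(f, x, eta):
    """Approximate f_eta(x) = int f(x - eta z) rho(z) dz by a midpoint rule."""
    return sum(w * f(x - eta * z) for w, z in zip(WEIGHTS, NODES)) / TOTAL

f = abs                                               # uniformly continuous, not smooth at 0
grid = [i / 50 - 1 for i in range(101)]
errors = {eta: max(abs(mollify(f, x, eta) - f(x)) for x in grid)
          for eta in (0.4, 0.1, 0.025)}
```

Since $|f(y) - f(x)| \le |y - x|$ for this $f$, each entry of `errors` is at most the corresponding `eta`, and the sup-norm error shrinks as `eta` decreases, which is the uniform convergence used to build the homotopy.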
Theorem 21. Let $f : X \to Y$ be a continuous map between smooth manifolds.
Then $f$ is homotopic to a smooth map from $X \to Y$.
Sketch of proof. We may assume $X \subset \mathbb{R}^M$ by Whitney's Embedding Theorem.
Then there is a $\delta > 0$ so that $\pi_M : X_\delta \to X$ is well-defined and
smooth. Define $g : \mathbb{R}^M \to \mathbb{R}^N$ by $g(p) = f(\pi_M(p))$ for $p \in X_\delta$ and $g(p) = 0$
for $p \notin X_\delta$. Note $g(p)$ is uniformly continuous on a neighborhood of $X$.
Apply the mollifier argument as above to $g$ and show that the homotopy
constructed in the proof of Theorem 20, when restricted to $X \subset \mathbb{R}^M$, has the
desired properties.
The discussion above about energy and length still holds. Assuming the
minimizer is smooth enough, then a constant-speed length-minimizing loop
is the same as an energy-minimizing loop. Thus we may as well consider
energy-minimizing loops, and we have the equivalent problem.
4.4 Distributions
On $\mathbb{R}^n$, we consider each smooth function $\phi$ with compact support to be a
test function. For any $C^1$ function $f$ on $\mathbb{R}^n$ and test function $\phi$, we have the
following formula by integrating by parts:
$$\int_{\mathbb{R}^n} f_{,i}\,\phi\,dx_n = -\int_{\mathbb{R}^n} f\,\phi_{,i}\,dx_n. \qquad (33)$$
Let $D(\mathbb{R}^n)$ be the vector space of all smooth functions with compact support
in $\mathbb{R}^n$. For our purposes, we will define a distribution on $\mathbb{R}^n$ to be a linear
map from $D(\mathbb{R}^n) \to \mathbb{R}$. We often allow $\mathbb{C}$-valued test functions and consider
complex linear maps to $\mathbb{C}$; complex-valued functions are useful when doing
Fourier analysis. (The usual definition of a distribution is more involved:
one must define a topology on $D(\mathbb{R}^n)$ and then consider distributions to be
only continuous linear maps to $\mathbb{C}$. For our purposes, the simpler definition
suffices. See Section 4.9 below for a more standard treatment of distributions
on the circle $S^1$.) Recall a measurable function $f$ is locally $L^1$ if over every
compact subset $K$ of the domain of $f$, $\int_K |f| < \infty$. Any locally $L^1$ function
$f$ on $\mathbb{R}^n$ gives a distribution by sending
$$f : \phi \mapsto f(\phi) = \int_{\mathbb{R}^n} f\phi\,dx_n.$$
Example 18. Any locally finite Borel measure $d\mu$ on $\mathbb{R}^n$ defines a distribution
by sending
$$\phi \mapsto \int_{\mathbb{R}^n} \phi\,d\mu$$
for any test function $\phi$.
An important example of this is the inaptly named $\delta$-function, or unit
point mass, at the origin. The $\delta$-function is a measure on $\mathbb{R}^n$ so that for any
subset $\Omega \subset \mathbb{R}^n$,
$$\delta(\Omega) = \begin{cases} 1 & \text{if } 0 \in \Omega \\ 0 & \text{if } 0 \notin \Omega. \end{cases}$$
So the distribution defined by this measure is
$$\delta : \phi \mapsto \phi(0),$$
which is just evaluation of $\phi$ at the origin. The following problem shows there
is no locally $L^1$ function which is equal to the $\delta$-function.
then $f_\epsilon \to f$ in $L^1$ as $\epsilon \to 0$.
(a) Show that for all $x \ne 0$ that $f_\epsilon(x) = 0$ for $\epsilon$ small enough. (Follow the
proof of Proposition 58.)
(d) Find a contradiction.
We have just seen that distributions are more general than functions.
In particular, it is possible to differentiate any distribution by mimicking
formula (33). A distributional derivative of a function may no longer be a
function, but it will be well-defined as a distribution. Given a distribution $f$
defined by a map $f : \phi \mapsto f(\phi) \in \mathbb{R}$, the partial derivative $f_{,i}$ in the sense of
distributions is defined to be the distribution
$$f_{,i} : \phi \mapsto -f(\phi_{,i}).$$
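As an illustration of this definition on $\mathbb{R}$, consider the step function $H$ equal to 1 for $x > 0$ and 0 otherwise (an example chosen here, not taken from the text). Its distributional derivative sends $\phi$ to $-\int H\dot\phi\,dx = \phi(0)$, which is the action of the unit point mass at the origin. The sketch below checks this numerically for one assumed test function `phi`; all function names are hypothetical.

```python
import math

def phi(x):
    """Test function: smooth bump supported in [-0.7, 1.3]."""
    u = x - 0.3
    return math.exp(-1.0 / (1.0 - u * u)) if abs(u) < 1 else 0.0

def dphi(x):
    """Derivative of phi, computed by the chain rule."""
    u = x - 0.3
    return phi(x) * (-2.0 * u) / (1.0 - u * u) ** 2 if abs(u) < 1 else 0.0

def heaviside(x):
    return 1.0 if x > 0 else 0.0

def integrate(fn, a, b, n=200000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(fn(a + (i + 0.5) * h) for i in range(n))

# The derivative of H in the sense of distributions, applied to phi: -H(phi').
derivative_of_H_at_phi = -integrate(lambda x: heaviside(x) * dphi(x), -2.0, 2.0)
```

The computed value matches `phi(0.0)`, even though $H$ has no classical derivative at the origin.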
(a) Show $PV(\frac{1}{x})(\phi)$ converges for all smooth test functions $\phi$. (Hint: The
potential problem is clearly at $x = 0$. Use Taylor's Theorem to write
$\phi = \phi(0) + O(x)$, where $O(x)$ represents a term so that $O(x)/x$ converges
to a real limit as $x \to 0$.)
(b) Show that the first derivative in the sense of distributions of $PV(\frac{1}{x})$ is
given in terms of $\phi \in D(\mathbb{R})$ as
$$\lim_{\epsilon \to 0^+} \left[\int_{-\infty}^{-\epsilon} \frac{-1}{x^2}\,\phi(x)\,dx + \int_{\epsilon}^{\infty} \frac{-1}{x^2}\,\phi(x)\,dx + \frac{2}{\epsilon}\,\phi(0)\right].$$
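Proof. We first consider the case when $f_1$ and $f_2$ are both globally $L^1$ on $\mathbb{R}^n$.
Then recall that we can use a mollifier to approximate each in $L^1$ by smooth
functions. In particular, if $\rho$ is a smooth nonnegative function with compact
support so that $\int_{\mathbb{R}^n} \rho\,dx_n = 1$, then define
$$\rho_\epsilon(x) = \frac{1}{\epsilon^n}\,\rho\left(\frac{x}{\epsilon}\right), \qquad f_i^\epsilon(x) = \int_{\mathbb{R}^n} \rho_\epsilon(x - y)\,f_i(y)\,dy_n, \quad i = 1, 2.$$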
So far, we have discussed distributions on $\mathbb{R}^n$. On the circle $S^1$, the
definitions are similar, the main difference being that since $S^1$ is compact,
our test functions are simply all smooth functions on $S^1$. In particular, we
can think of test functions on $S^1$ as smooth periodic functions on $\mathbb{R}$ with
period 1. In this way, an $L^1$ function $f$ on $S^1$ acts on test functions by
$$f : \phi \mapsto \int_0^1 f\phi\,dt.$$
$$\begin{aligned}
&= -\int_{S^1} f\dot\phi\,dt + f(1)\phi(1) - f(0)\phi(0) \\
&= -\int_{S^1} f\dot\phi\,dt
\end{aligned}$$
because $f(0) = f(1)$ and $\phi(0) = \phi(1)$ since $f$ and $\phi$ are periodic. So we
have the same basic formula as in (33), and we may define distributions and
distributional derivatives in the same manner as above.
Now we return to our problem. We want to consider all loops $\gamma : S^1 \to X \subset \mathbb{R}^N$ so that
$$E(\gamma) = \int_0^1 |\dot\gamma|_g^2\,dt = \|\dot\gamma\|^2_{L^2(S^1, \mathbb{R}^N)} < \infty.$$
each $a = 1, \dots, N$. Thus we may work with each component of $\gamma$ separately
in $\mathbb{R}^N$. Below we will see that $L^2_1$ is a Hilbert space, but for now we are
content to show that every function in $L^2_1(S^1)$ is continuous. Recall that
elements of $L^2_1(S^1)$ are only equivalence classes of functions, two functions
being equivalent if they agree almost everywhere.
$$\begin{aligned}
|f(t_2) - f(t_1)| &= \left|\int_{t_1}^{t_2} \dot f(t)\,dt\right| \\
&\le \left(\int_{t_1}^{t_2} |\dot f(t)|^2\,dt\right)^{\frac{1}{2}} \left(\int_{t_1}^{t_2} dt\right)^{\frac{1}{2}} \\
&\le C(t_2 - t_1)^{\frac{1}{2}}.
\end{aligned}$$
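function $\phi$. Then
$$\begin{aligned}
\dot g(\phi) &= -\int g(t)\,\dot\phi(t)\,dt \\
&= -\int \left(\int_0^t \dot f(s)\,ds\right) \dot\phi(t)\,dt \\
&= -\iint_{R_1} \dot f(s)\,\dot\phi(t)\,ds\,dt + \iint_{R_2} \dot f(s)\,\dot\phi(t)\,ds\,dt
\end{aligned}$$
$$R_1 = \{(s, t) : s \ge 0,\ t \ge s\}, \qquad R_2 = -R_1.$$
But now
$$\int (\phi - L\psi)\,dt = L - L \cdot 1 = 0,$$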
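and thus the function
$$\Phi(t) = \int^t [\phi(s) - L\psi(s)]\,ds \qquad (34)$$
$$\begin{aligned}
h(\phi) &= LK + h(\phi - L\psi) = LK + h(\dot\Phi) \\
&= LK - \dot h(\Phi) = LK
\end{aligned}$$
and $h = K$ as distributions.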
(b) Show that $\dot f = g$ in the sense of distributions. In other words, for every
smooth periodic test function $\phi \in D(S^1)$, show that
$$-\int_0^1 f\dot\phi\,dt = \int_0^1 g\phi\,dt.$$
(c) If $h$ is a distribution on $S^1$ which satisfies $\dot h = 0$ in the sense of
distributions, show there is a constant $K$ so that $h = K$ as distributions. In
other words, show that for every periodic smooth $\phi : \mathbb{R} \to \mathbb{R}$,
$$h(\phi) = K \int_0^1 \phi\,dt.$$
$$2(g_{ik}\dot\gamma^i)^{\cdot} - g_{ij,k}\,\dot\gamma^i\dot\gamma^j$$
Compute the first variation as in the derivation of the Euler-Lagrange equations
in Subsection 4.2 above:
$$\int_{S^1} g_{ij,k}\,h^k\,\dot\gamma^i\dot\gamma^j\,dt + \int_{S^1} g_{ij}\,\dot h^i\,\dot\gamma^j\,dt + \int_{S^1} g_{ij}\,\dot\gamma^i\,\dot h^j\,dt = 0.$$
Since the components of $h$ are smooth with compact support, they act as
test functions, and we may then integrate by parts in the second and third
integrals, in the sense of distributions, to conclude that
(b) Let $\delta$ be the $\delta$-function on $\mathbb{R}$. Compute its first derivative in the sense
of distributions.
space. Recall that $L^2_1(S^1, \mathbb{R})$ consists of all $L^2$ functions on $S^1$ whose derivative
in the sense of distributions is also $L^2$. This suggests a natural inner product:
$$\langle f, h\rangle_{L^2_1} = \int_{S^1} fh\,dt + \int_{S^1} \dot f\dot h\,dt.$$
Then plug in $f = h$ to find
$$\|f\|^2_{L^2_1} = \int_{S^1} |f|^2\,dt + \int_{S^1} |\dot f|^2\,dt = \langle f, f\rangle_{L^2_1},$$
and so the norm on $L^2_1$ is induced by the inner product. Below in Corollary
67, we show that any positive definite inner product defines a norm.
Remark. $L^2_1(S^1, \mathbb{R}^N)$ is also naturally a Hilbert space, with inner product
given by
$$\langle f, h\rangle_{L^2_1} = \int_{S^1} \langle f, h\rangle\,dt + \int_{S^1} \langle \dot f, \dot h\rangle\,dt,$$
where $\langle\cdot, \cdot\rangle$ is the inner product on $\mathbb{R}^N$.
It is also useful to define complex Hilbert spaces, in which the inner
product $\langle\cdot, \cdot\rangle$ is Hermitian and positive definite. A Hermitian inner product
on a complex vector space $V$ is a map from $V \times V \to \mathbb{C}$ which satisfies for
$\lambda \in \mathbb{C}$ and $f, g, h \in V$,
$$\begin{aligned}
\langle \lambda f + g, h\rangle &= \lambda\langle f, h\rangle + \langle g, h\rangle, \\
\langle f, \lambda g + h\rangle &= \bar\lambda\langle f, g\rangle + \langle f, h\rangle, \\
\langle f, g\rangle &= \overline{\langle g, f\rangle}.
\end{aligned}$$
These three conditions are respectively that the inner product is complex
linear in the first slot, complex antilinear in the second slot, and
conjugate-symmetric. The first two conditions together are called sesquilinear.
Then $L^2_1(S^1, \mathbb{C})$ is a complex Hilbert space with inner product
$$\langle f, g\rangle = \int_{S^1} f\bar g\,dt + \int_{S^1} \dot f\,\bar{\dot g}\,dt.$$
We can also define the Sobolev space $L^2_1(\mathbb{R}^n, \mathbb{R})$ by the inner product
$$\langle f, g\rangle = \int_{\mathbb{R}^n} fg\,dx_n + \sum_{i=1}^n \int_{\mathbb{R}^n} f_{,i}\,g_{,i}\,dx_n,$$
the derivatives taken in the sense of distributions. The elements of $L^2_1(\mathbb{R}^n, \mathbb{R})$
are then equivalence classes of functions in $L^2$ so that all the first partials in
the sense of distributions are also in $L^2$.
We will work with $L^2_1(S^1, \mathbb{R})$ instead of $L^2_1(S^1, \mathbb{R}^N)$, since convergence
in $L^2_1(S^1, \mathbb{R}^N)$ is equivalent to each component converging in $L^2_1(S^1, \mathbb{R})$. The
proofs that follow will work with minor modifications for the spaces $L^2_1(S^1, \mathbb{R}^N)$
and $L^2_1(S^1, \mathbb{C})$.
We focus on $L^2_1(S^1, \mathbb{R})$, which we refer to simply as $L^2_1$.
Proof. We've exhibited an inner product on $L^2_1$, and it is easy to check that
it is positive definite (if we consider elements to be equivalence classes of
functions, two functions being equivalent if they agree almost everywhere).
Thus the remaining thing to check is that the metric on $L^2_1(S^1, \mathbb{R})$ is complete
(and so it is a Banach space).
First of all note that $f_n \to f$ in $L^2_1$ is equivalent to $f_n \to f$ in $L^2$ and
$\dot f_n \to \dot f$ in $L^2$.
Let $f_n$ be a Cauchy sequence in $L^2_1$. Then by the definition of the norm,
it is clear that $f_n$ and $\dot f_n$ are both Cauchy sequences in $L^2$. Then we have
limits $f_n \to f$ and $\dot f_n \to g$ in $L^2$. In order to show that $f_n \to f$ in $L^2_1$, it
suffices to show that $\dot f = g$ in the sense of distributions.
Let $\phi$ be a test function, and note that $f_n \to f$ in $L^2$ implies by Hölder's
inequality that
$$|f_n(\phi) - f(\phi)| = \left|\int_{S^1} (f_n - f)\phi\,dt\right| \le \|f_n - f\|_{L^2}\,\|\phi\|_{L^2} \to 0$$
For a real Hilbert space $H$, an orthonormal basis is a collection of elements $\{e_\alpha\}_{\alpha\in A}$ which are orthonormal in that
$$\langle e_\alpha, e_\beta\rangle = \delta_{\alpha\beta}$$
Proof. Check that for $w = \sum_{i=1}^n \langle y, e_i\rangle e_i$, $\langle y - w, w\rangle = 0$. Then apply the Pythagorean Theorem to $y = (y - w) + w$, and note that $\|w\|^2 = \sum_{i=1}^n |\langle y, e_i\rangle|^2$.
where $A^{+} = \{\alpha \in A : x_\alpha > 0\}$ is countable. Show that if $A^{+}$ is infinite, the right-hand sum is the usual sum of a convergent countably infinite series (for any bijection between $A^{+}$ and the natural numbers).
Hint for (a): Each $x_\alpha > 0$ satisfies $x_\alpha \in [2^n, 2^{n+1})$ for some $n \in \mathbb{Z}$. Derive a contradiction if the number of positive $x_\alpha$ is uncountable.
Remark. Note that
$$\sum_{\alpha\in A} x_\alpha = \int_A x_\alpha \, dc$$
Here, the second equality is by the Pythagorean Theorem. Since the series $\sum_{i=1}^\infty |v^i|^2$ converges, the tail of the series $\sum_{i=m+1}^\infty |v^i|^2$ must go to zero as $m \to \infty$, and thus $\{v_n\}$ is a Cauchy sequence in $H$. Since $H$ is complete, $v_n$ converges to the limit $v \in H$.
Now let $v \in H$ and $v^\alpha = \langle v, e_\alpha\rangle$. Then Bessel's Inequality shows that for all finite subsets $A' \subset A$,
$$\sum_{\alpha\in A'} |v^\alpha|^2 \le \|v\|^2.$$
So for the collection $\{|v^\alpha|^2\}_{\alpha\in A}$, the set $S$ of finite partial sums is bounded. So Homework Problem 50 shows that all but countably many $v^\alpha = 0$. Denumerate the countable number of nonzero terms as $v^1, v^2, \dots$, and the corresponding elements of the orthonormal basis as $e_1, e_2, \dots$.
Since the sequence $\sum_{i=1}^N |v^i|^2$ is bounded and increasing, it has a finite limit as $N \to \infty$. We have shown above that the series $\sum_{i=1}^\infty v^i e_i$ converges to a limit $v' \in H$. Compute
$$\langle v - v', e_i\rangle = \lim_{n\to\infty} \left\langle v - \sum_{j=1}^n v^j e_j, e_i\right\rangle = v^i - v^j\delta_{ji} = 0.$$
And for any $e_\alpha \notin \{e_1, e_2, \dots\}$, compute
$$\langle v - v', e_\alpha\rangle = \lim_{n\to\infty} \left\langle v - \sum_{j=1}^n v^j e_j, e_\alpha\right\rangle = 0.$$
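Proof. Compute
$$\|v + w\|^2 = \|v\|^2 + 2\langle v, w\rangle + \|w\|^2,$$
$$\begin{aligned}
\langle v, w\rangle &= \tfrac12\left[\|v + w\|^2 - \|v\|^2 - \|w\|^2\right] \\
&= \tfrac12 \sum_{i=1}^\infty \left[(v^i + w^i)^2 - (v^i)^2 - (w^i)^2\right] \\
&= \sum_{i=1}^\infty v^i w^i.
\end{aligned}$$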
Remark. The formula for a complex Hilbert space is
$$\langle v, w\rangle = \sum_{i=1}^\infty v^i\, \overline{w^i}.$$
Remark. Homework Problem 50 shows that this result still holds for non-
separable Hilbert spaces, since the number of basis elements with nonzero
coefficients for v and/or w is countable.
Here is another basic result in Hilbert spaces:
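Corollary 67. Any positive definite inner product on a real vector space $V$ produces a norm by the formula $\|v\|^2 = \langle v, v\rangle$.
Proof. The main thing to check is the triangle inequality. Let $v, w \in V$ and note that
$$\begin{aligned}
&\|v + w\| \le \|v\| + \|w\| \\
\iff\ &\|v + w\|^2 \le \|v\|^2 + 2\|v\|\|w\| + \|w\|^2 \\
\iff\ &\|v\|^2 + 2\langle v, w\rangle + \|w\|^2 \le \|v\|^2 + 2\|v\|\|w\| + \|w\|^2 \\
\iff\ &\langle v, w\rangle \le \|v\|\|w\|.
\end{aligned}$$
The last line follows from the Cauchy-Schwarz inequality.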
The main results we will use regarding Hilbert spaces involve another
topology on the Hilbert space which is different from the topology defined
by the metric. The usual metric convergence of sequences is called strong
convergence. So a sequence $v_i \to v$ in $H$ strongly if
$$\|v_i - v\|_H \to 0,$$
and we write $v_i \to v$. On the other hand, a sequence $v_i \in H$ is weakly convergent to a limit $v \in H$ if
is also countable, and it represents all the basis elements with nonvanishing coefficients for all the $v_i$. Denumerate these elements as $e_1, e_2, \dots$, and write
$$v_i = \sum_{j=1}^\infty v_i^j e_j.$$
Since there is a constant $K$ so that $\|v_i\| \le K$, then Theorem 23 shows for each $N$
$$\sum_{j=1}^N |v_i^j|^2 \le K^2. \qquad (37)$$
This is because ${}_1v_i^2 \in [-K, K]$, which is compact, and the bound follows from (37). Recursively, we may define for each $N$ a subsequence $\{{}_Nv_i\}$ and a real number $v^N$ so that
$$\{{}_Nv_i\} \text{ is a subsequence of } \{{}_{(N-1)}v_i\},$$
$$\lim_{i\to\infty} {}_Nv_i^j = v^j \quad \text{for } j = 1, \dots, N, \qquad (38)$$
$$|v^1|^2 + \cdots + |v^N|^2 \le K^2. \qquad (39)$$
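for all $n$. Here the third line follows from the second since ${}_iv_i^\alpha = v^\alpha = 0$ if $e_\alpha \notin \{e_1, e_2, \dots\}$.
Since $\|{}_iv_i\| \le K$ and $\|v\| \le K$, then $\|{}_iv_i - v\| \le 2K$ and Cauchy-Schwarz shows that
$$\begin{aligned}
\sum_{j=n+1}^\infty \left|({}_iv_i^j - v^j) w^j\right| &\le \left(\sum_{j=n+1}^\infty |{}_iv_i^j - v^j|^2\right)^{\frac12} \left(\sum_{j=n+1}^\infty |w^j|^2\right)^{\frac12} \\
&\le \left(\sum_{j=1}^\infty |{}_iv_i^j - v^j|^2\right)^{\frac12} \left(\sum_{j=n+1}^\infty |w^j|^2\right)^{\frac12} \\
&\le 2K \left(\sum_{j=n+1}^\infty |w^j|^2\right)^{\frac12}
\end{aligned}$$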
Since $w \in H$, $\sum_{j=1}^\infty |w^j|^2 \le \sum_{\alpha\in A} |w^\alpha|^2 = \|w\|^2$ converges, and there is an $n$ so that
$$\left(\sum_{j=n+1}^\infty |w^j|^2\right)^{\frac12} < \epsilon.$$
In other words, the Hilbert space norm is lower semicontinuous under weak
convergence.
Proof. The proof is to translate the current problem into Fatou's Lemma. Let $\{e_\alpha\}_{\alpha\in A}$ be an orthonormal basis of our Hilbert space $H$. Then put the counting measure $c$ on the index set $A$. Let $f: A \to [0, \infty)$, $f: \alpha \mapsto f_\alpha$ be a nonnegative real-valued function on $A$. Then it is straightforward to check that
$$\int_A f \, dc = \sum_{\alpha\in A} f_\alpha,$$
and thus each sum may be thought of as an integral with respect to the counting measure.
In our case, if
$$v_i = \sum_{\alpha\in A} v_i^\alpha e_\alpha, \qquad v = \sum_{\alpha\in A} v^\alpha e_\alpha,$$
$$v_i^\alpha = \langle v_i, e_\alpha\rangle \to \langle v, e_\alpha\rangle = v^\alpha$$
Note that the above proofs depend heavily on the existence of an or-
thonormal basis, Proposition 63, which we did not prove. The following
problem outlines a standard procedure for getting around the proof of Propo-
sition 63, by proving the existence of an orthonormal basis for any Hilbert
space with a countable spanning set. A subset S of a Hilbert space H is said
to be a spanning set if the (strong) closure of finite linear combinations of
elements in S is equal to all of H. For example, in the proof of Theorem
24, we need only deal with the closure $H'$ of the span of $\{v_1, v_2, \dots\}$. The
existence of an orthonormal basis of $H'$ is sufficient for the proof of Theorem
24.
Homework Problem 53. Show that any strongly closed linear subspace of
a Hilbert space H is again a Hilbert space (with the same inner product).
We say a subset $\{v_\alpha\}_{\alpha\in A} \subset H$ is linearly independent (in the sense of Banach spaces) if any convergent sum
$$\sum_{\alpha\in A} b^\alpha v_\alpha = 0$$
implies $b^\alpha = 0$ for all $\alpha \in A$. Note in particular, the implication holds for any finite sum (and thus this notion of linear independence in the Banach-space sense implies linear independence in the usual vector-space sense).
Homework Problem 54 (Gram-Schmidt Orthogonalization).
(a) Let H be a Hilbert space with a countable spanning set {v1 , v2 , . . . }
which is finite or countably infinite. Show that there is a subset of
{v1 , v2 , . . . } which is a linearly independent spanning set of H.
(b) Given a linearly independent spanning set $\{v_1, v_2, \dots\}$ on a Hilbert space $H$, define $f_i$ and $e_i$ recursively by
$$f_1 = v_1, \qquad e_1 = \frac{f_1}{\|f_1\|},$$
$$f_2 = v_2 - \langle v_2, e_1\rangle e_1, \qquad e_2 = \frac{f_2}{\|f_2\|},$$
$$f_n = v_n - \sum_{i=1}^{n-1} \langle v_n, e_i\rangle e_i, \qquad e_n = \frac{f_n}{\|f_n\|}.$$
Show that this recursive definition can be carried out (in particular, show that $f_n \ne 0$). Then show that $\{e_1, e_2, \dots\}$ is an orthonormal basis for $H$. In other words, show that $\langle e_i, e_j\rangle = \delta_{ij}$ and that any $v$ in $H$ can be written as a convergent sum $v = \sum_{i=1}^\infty v^i e_i$.
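The recursion in part (b) can be tried out numerically in a finite-dimensional setting. The following is only a minimal sketch, assuming the Hilbert space is $\mathbb{R}^3$ with the standard inner product; the names `inner`, `gram_schmidt` and the tolerance `tol` are illustrative choices, not part of the notes. Vectors whose $f_n$ is numerically zero are skipped, which mirrors the reduction to a linearly independent subset in part (a).

```python
import math

def inner(u, v):
    """Standard inner product on R^n (assumed stand-in for <.,.> on H)."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Apply f_n = v_n - sum_i <v_n, e_i> e_i, e_n = f_n / ||f_n||."""
    basis = []
    for v in vectors:
        f = list(v)
        for e in basis:
            c = inner(v, e)  # coefficient <v_n, e_i>
            f = [fi - c * ei for fi, ei in zip(f, e)]
        norm = math.sqrt(inner(f, f))
        if norm < tol:
            # v is (numerically) a combination of earlier vectors: skip it
            continue
        basis.append([fi / norm for fi in f])
    return basis

# The third vector is the sum of the first two, so it is skipped.
vs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [2.0, 1.0, 1.0], [0.0, 1.0, 1.0]]
es = gram_schmidt(vs)
gram = [[inner(a, b) for b in es] for a in es]
```

Here `gram` should be close to the identity matrix, which is the condition $\langle e_i, e_j\rangle = \delta_{ij}$.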
The use of the previous problem isn't strictly necessary for our purposes, as $L^2_1(S^1, \mathbb{R})$ is separable (though we won't prove that it is).
Recall that for every Banach space $B$, the dual Banach space $B^*$ is the space of all continuous linear functionals $\lambda: B \to \mathbb{R}$, with norm given by
$$\|\lambda\|_{B^*} = \sup_{x\in B\setminus\{0\}} \frac{|\lambda(x)|}{\|x\|_B}.$$
Also recall that for any $p \in (1, \infty)$, the dual Banach space of $L^p(\mathbb{R}^n)$ is $L^q(\mathbb{R}^n)$ for $p^{-1} + q^{-1} = 1$. Thus $L^2(\mathbb{R}^n)$ is dual to itself. This fact is true for all Hilbert spaces, as the following problem shows in the separable case.
$$x \mapsto \lambda_x = \langle \cdot, x\rangle.$$
Show that this map preserves the norm, is one-to-one and onto.
Hint: The most significant step is showing the map is onto. First reduce to the case $\lambda \ne 0$. Show that $L = \lambda^{-1}(0)$ is also a separable Hilbert space, and let $\{e_i\}$ be an orthonormal basis for $L$. Let $y \notin L$ and use a version of Gram-Schmidt to show we may assume $y \perp L$. Construct $x$ from $y$ and $\lambda$.
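Theorem 27. The complex exponential functions
$$\{e^{2\pi i k t} : k \in \mathbb{Z}\}$$
Therefore, $\{e^{2\pi i k t}\}_{k=-\infty}^{\infty}$ forms an orthonormal set in $L^2(S^1, \mathbb{C})$.
We must show that every element $f \in L^2(S^1, \mathbb{C})$ can be written as a Fourier series
$$f = \sum_{k=-\infty}^{\infty} \langle f, e^{2\pi i k t}\rangle e^{2\pi i k t},$$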
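Proof. We use the following claim: For any $L^2$ function $f$, the Fourier coefficients $\langle f, e^{2\pi i k t}\rangle \to 0$ as $k \to \pm\infty$. This follows from Bessel's Inequality
$$\sum_{k=-\infty}^{\infty} |\langle f, e^{2\pi i k t}\rangle|^2 \le \|f\|^2_{L^2} < \infty.$$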
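If $f$ is smooth, then $\dot f$ is also smooth (and thus is in $L^2$), and integration by parts gives us
$$\begin{aligned}
\langle \dot f, e^{2\pi i k t}\rangle &= \int_0^1 \dot f\, \overline{e^{2\pi i k t}} \, dt \\
&= -\int_0^1 f\, \frac{d}{dt}\left(\overline{e^{2\pi i k t}}\right) dt + f(t)\,\overline{e^{2\pi i k t}}\,\Big|_0^1 \\
&= 2\pi i k \langle f, e^{2\pi i k t}\rangle + 0.
\end{aligned}$$
Thus any polynomial $P(k)$ times the Fourier coefficients also goes to zero as $k \to \pm\infty$.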
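The identity $\langle \dot f, e^{2\pi i k t}\rangle = 2\pi i k \langle f, e^{2\pi i k t}\rangle$ can be checked numerically. The sketch below is an illustration under assumptions: the sample function $f(t) = e^{\cos 2\pi t}$, the helper name `fourier_coeff`, and the number of sample points are all choices made here, and the integral is approximated by an equally spaced Riemann sum on $[0, 1)$.

```python
import cmath
import math

def fourier_coeff(func, k, n=2048):
    """Approximate <func, e^{2 pi i k t}> = int_0^1 func(t) * conj(e^{2 pi i k t}) dt."""
    total = sum(func(j / n) * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
    return total / n

def f(t):
    # A smooth 1-periodic sample function (an assumption for this example).
    return math.exp(math.cos(2 * math.pi * t))

def f_dot(t):
    # Its derivative, computed by hand with the chain rule.
    return -2 * math.pi * math.sin(2 * math.pi * t) * f(t)

# Difference between the two sides of the identity for a few values of k.
errors = {}
for k in (1, 2, 5):
    lhs = fourier_coeff(f_dot, k)
    rhs = 2j * math.pi * k * fourier_coeff(f, k)
    errors[k] = abs(lhs - rhs)
```

Each entry of `errors` should be at the level of floating-point rounding, since the Riemann sum is very accurate for smooth periodic integrands.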
The previous lemma shows that for any smooth function $f \in C^\infty(S^1, \mathbb{C})$, the Fourier series
$$g(t) = \sum_{k=-\infty}^{\infty} \langle f, e^{2\pi i k t}\rangle e^{2\pi i k t}$$
Therefore, uniform convergence implies that $g(t)$ is continuous (and thus is in $L^2$ as well; why?). (In fact, $g(t)$ is smooth; see Homework Problem 58 below.) If we let
$$h(t) = f(t) - g(t) = f(t) - \sum_{k=-\infty}^{\infty} \langle f, e^{2\pi i k t}\rangle e^{2\pi i k t},$$
then by the same techniques in the proof of Theorem 23 above, we see that
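Assume that $\operatorname{Re} h(\tau) > 0$ (the other cases are similar), and let $\phi(t) = \operatorname{Re} h(t)$. Since $\phi$ is continuous, there is a $\delta > 0$ so that
Now compute
$$\begin{aligned}
|\operatorname{Re} \langle h, b_n\rangle| &= \left|\operatorname{Re} \int_{S^1} h(t) b_n(t) \, dt\right| \\
&= \left|\int_{S^1} \phi(t) b_n(t) \, dt\right| \\
&= \left|\int_{\tau-\delta}^{\tau+\delta} \phi(t) b_n(t) \, dt + \int_{S^1\setminus[\tau-\delta, \tau+\delta]} \phi(t) b_n(t) \, dt\right| \\
&\ge \left|\int_{\tau-\delta}^{\tau+\delta} \phi(t) b_n(t) \, dt\right| - \left|\int_{S^1\setminus[\tau-\delta, \tau+\delta]} \phi(t) b_n(t) \, dt\right| \\
&> \int_{\tau-\frac{\delta}{2}}^{\tau+\frac{\delta}{2}} \phi(t) b_n(t) \, dt - \left|\int_{S^1\setminus[\tau-\delta, \tau+\delta]} \phi(t) b_n(t) \, dt\right|.
\end{aligned}$$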
(Note the last inequality follows since the integrand is positive.) Also, we
have the following bounds:
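Now compute
$$|\operatorname{Re} \langle h, b_n\rangle| > \int_{\tau-\frac{\delta}{2}}^{\tau+\frac{\delta}{2}} \phi(t) b_n(t) \, dt - \left|\int_{S^1\setminus[\tau-\delta, \tau+\delta]} \phi(t) b_n(t) \, dt\right|$$
Now (41) shows the ratio of the first term over the second goes to $+\infty$ as $n \to \infty$ and thus there is an $n$ so that $|\operatorname{Re} \langle h, b_n\rangle| > 0$.
Now the contradiction is this: Since $b_n$ is a finite Fourier series, $\langle h, b_n\rangle$ is a finite linear combination of Fourier coefficients $\langle h, e^{2\pi i k t}\rangle$, which we assume are all zero. Thus $\langle h, b_n\rangle = 0$, and we have a contradiction.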
Since $h$ is the difference between the smooth $f$ and its Fourier series, we have shown
Lemma 70. Let $f \in C^\infty(S^1, \mathbb{C})$. Then
$$f(t) = \sum_{k=-\infty}^{\infty} \langle f, e^{2\pi i k t}\rangle\, e^{2\pi i k t},$$
$$\|L(v)\|_{B_2} \le C\|v\|_{B_1}.$$
The proof of Theorem 23 shows that $G$ preserves the norms. In other words,
So in $L^2$,
$$f_n \to G(f) = \sum_{k=-\infty}^{\infty} f_k e^{2\pi i k t}.$$
Homework Problem 58. Let $f^k \in \mathbb{C}$ for all $k \in \mathbb{Z}$, and assume for all $n \ge 0$ that
$$\lim_{k\to\infty} k^n f^k = \lim_{k\to-\infty} k^n f^k = 0.$$
In this section, we show that this map is compact. A linear map between Banach spaces $\Phi: B_1 \to B_2$ is called compact if the closure of the image of the unit ball in $B_1$ is strongly compact in $B_2$. In other words, if $v_i \in B_1$ satisfy $\|v_i\|_{B_1} \le 1$, then $\{\Phi(v_i)\}$ has a strongly convergent subsequence in $B_2$: i.e. there is a subsequence $\{v_{i_j}\}$ and an element $w \in B_2$ so that
$$\lim_{j\to\infty} \|\Phi(v_{i_j}) - w\|_{B_2} = 0.$$
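The basic observation which allows us to conclude that the natural inclusion map $L^2_1(S^1) \to C^0(S^1)$ is compact comes from the proof of Proposition 59. If $f \in L^2_1(S^1)$, then
$$\begin{aligned}
|f(t_2) - f(t_1)| &= \left|\int_{t_1}^{t_2} \dot f(t) \, dt\right| \\
&\le \left(\int_{t_1}^{t_2} |\dot f(t)|^2 \, dt\right)^{\frac12} \left(\int_{t_1}^{t_2} dt\right)^{\frac12} \\
&\le \left(\int_{S^1} |\dot f(t)|^2 \, dt\right)^{\frac12} (t_2 - t_1)^{\frac12} \\
&\le \|f\|_{L^2_1} (t_2 - t_1)^{\frac12}
\end{aligned}$$
(Note that the first equality was justified in the proof of Proposition 59.)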
Therefore, $f$ is continuous. But moreover, for every $\epsilon > 0$, we may choose
$$\delta = \left(\frac{\epsilon}{\|f\|_{L^2_1}}\right)^2$$
so that
$$|t_2 - t_1| < \delta \implies |f(t_2) - f(t_1)| < \epsilon.$$
So the modulus of continuity does not depend on $t$, and depends only on the norm $\|f\|_{L^2_1}$, and on no other information about $f$.
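The estimate $|f(t_2) - f(t_1)| \le \|f\|_{L^2_1} (t_2 - t_1)^{1/2}$ can be illustrated on a concrete function. The sketch below assumes the sample function $f(t) = \sin 2\pi t$ on $[0, 1]$ and approximates the $L^2_1$ norm by a Riemann sum; the names `l21_norm`, `delta_for` and the grid sizes are choices made for this example only.

```python
import math

def l21_norm(f, f_dot, n=4096):
    """Approximate the L^2_1 norm: sqrt of int_0^1 (|f|^2 + |f'|^2) dt."""
    total = sum(f(j / n) ** 2 + f_dot(j / n) ** 2 for j in range(n)) / n
    return math.sqrt(total)

def f(t):
    return math.sin(2 * math.pi * t)

def f_dot(t):
    return 2 * math.pi * math.cos(2 * math.pi * t)

norm = l21_norm(f, f_dot)

def delta_for(eps):
    """The delta from the text: (eps / ||f||_{L^2_1})^2."""
    return (eps / norm) ** 2

# Largest observed ratio |f(t2) - f(t1)| / (t2 - t1)^{1/2} on a grid in [0, 1].
m = 200
worst = max(
    abs(f(j / m) - f(i / m)) / math.sqrt((j - i) / m)
    for i in range(m)
    for j in range(i + 1, m + 1)
)
```

On this grid `worst` stays below `norm`, consistent with the bound; `delta_for(eps)` depends on $f$ only through the norm.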
A family $\mathcal{F}$ of functions from a metric space $X$ to a metric space $Y$ is called equicontinuous at a point $x \in X$ if for all $\epsilon > 0$, there is a $\delta > 0$ so that
$$d_X(x, x') < \delta \implies d_Y(f(x), f(x')) < \epsilon$$
for all $f \in \mathcal{F}$. The point is that $\delta$ does not depend on $f$. Such a family of functions is called equicontinuous on $X$ if it is equicontinuous at each point $x \in X$.
Note that if $\mathcal{F}$ is equicontinuous on $X$ then each $f \in \mathcal{F}$ is continuous.
The computations above show
Proof. We'll prove the theorem with the help of a few lemmas.
Lemma 72. Any compact metric space has a countable dense subset.
Proof. Let $X$ be the compact metric space. For $\epsilon = 1/n$, obviously
$$X = \bigcup_{x\in X} B_\epsilon(x), \qquad B_\epsilon(x) = \{y \in X : d_X(x, y) < \epsilon\}.$$
For each positive integer $n$, this open cover of $X$ has a finite subcover consisting of balls of radius $1/n$ centered at points $x_{n,1}, \dots, x_{n,m_n}$. The union
$$\bigcup_{n=1}^\infty \{x_{n,1}, \dots, x_{n,m_n}\}$$
Proof. First we show that $f_n(x)$ converges pointwise everywhere to a function $f(x)$. Let $y \in X$ and let $\epsilon > 0$. Then by equicontinuity, there is a $\delta > 0$ so that
$$d_X(x, y) < \delta \implies |f_n(x) - f_n(y)| < \epsilon.$$
(Note $\delta$ is independent of $n$.) Since $f_n$ converges on a dense subset of $X$, there is an $x \in B_\delta(y)$ for which $f_n(x)$ converges. Therefore, $\{f_n(x)\}$ is a Cauchy sequence in $\mathbb{R}$, and so there is an $N$ so that
Therefore, for $n, m \ge N$,
$$|f_n(y) - f_m(y)| \le |f_n(y) - f_n(x)| + |f_n(x) - f_m(x)| + |f_m(x) - f_m(y)| < 3\epsilon.$$
Therefore, $\{f_n(y)\}$ is a Cauchy sequence in the complete metric space $\mathbb{R}$, and so it converges to a limit which we call $f(y)$.
Let $y \in X$ and $\epsilon > 0$. Then equicontinuity shows that there is a $\delta > 0$ so that
$$x \in B_\delta(y) \implies |f_n(x) - f_n(y)| < \epsilon \qquad (42)$$
for all $n$. By letting $n \to \infty$, we also have
Then for $x \in X$, $x \in B_{\delta_i}(y_i)$ for some $y_i$, and so (42), (43) and (44) show
$$|f_n(x) - f(x)| \le |f_n(x) - f_n(y_i)| + |f_n(y_i) - f(y_i)| + |f(y_i) - f(x)| < 3\epsilon.$$
Homework Problem 59. Let $P$ be a countable set, and let $f_n: P \to \mathbb{R}$ be a sequence of functions. Assume that for each $p \in P$, there is a constant $C = C_p$ so that $|f_n(p)| \le C$ for all $n = 1, 2, \dots$. Show there is a subsequence of $\{f_n\}$ which converges everywhere on $P$ to a function $f: P \to \mathbb{R}$.
Hint: Use a diagonalization argument.
An important version of the Ascoli-Arzela Theorem is the following:
Theorem 29. Let $X$ be a metric space so that there is a countable number of open subsets $O_i$ satisfying
$$X = \bigcup_{i=1}^\infty O_i, \qquad O_i \subset O_{i+1}, \qquad (45)$$
Proof. This follows from the Ascoli-Arzela Theorem and Lemma 71 above, once we know in addition that there is a constant $K$ so that $|f_n| \le K$ pointwise. First of all, note that
$$|f_n(t_2) - f_n(t_1)| \le \|f_n\|_{L^2_1} |t_2 - t_1|^{\frac12} \le C|t_2 - t_1|^{\frac12}$$
$$\|\gamma_n\|^2_{L^2_1(S^1,\mathbb{R}^N)} = \|\gamma_n\|^2_{L^2(S^1,\mathbb{R}^N)} + \|\dot\gamma_n\|^2_{L^2(S^1,\mathbb{R}^N)} = \|\gamma_n\|^2_{L^2(S^1,\mathbb{R}^N)} + E(\gamma_n),$$
and moreover,
$$\|\gamma_n\|^2_{L^2_1(S^1,\mathbb{R}^N)} \le C + K^2$$
independently of $n$. So each component function $\gamma_n^a$ for $a = 1, \dots, N$ satisfies
$$\|\gamma_n^a\|_{L^2_1(S^1,\mathbb{R})} \le \sqrt{C + K^2}.$$
$$\|f\|_{C^{0,\frac12}(S^1)} = \|f\|_{C^0} + \sup_{t_1 \ne t_2} \frac{|f(t_1) - f(t_2)|}{d_{S^1}(t_1, t_2)^{\frac12}}.$$
(Here we define
$$d_{S^1}(t_1, t_2) = \inf_{k\in\mathbb{Z}} |(t_1 + k) - t_2|.$$
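The distance $d_{S^1}$ is easy to compute directly. The following short sketch assumes the circle is modeled as $\mathbb{R}/\mathbb{Z}$ with parameters given as real numbers; the function name `circle_distance` is illustrative. It uses the fact that the infimum over $k \in \mathbb{Z}$ is attained at one of the two integers nearest to $t_2 - t_1$.

```python
def circle_distance(t1, t2):
    """d_{S^1}(t1, t2) = inf over integers k of |(t1 + k) - t2|, for S^1 = R/Z."""
    d = (t1 - t2) % 1.0      # representative of t1 - t2 in [0, 1)
    return min(d, 1.0 - d)   # go around the circle the shorter way

examples = [
    circle_distance(0.1, 0.9),    # wraps through 0, so the distance is about 0.2
    circle_distance(0.25, 0.75),  # antipodal points, distance 0.5
    circle_distance(0.3, 2.3),    # same point on the circle, distance about 0
]
```

The distance never exceeds $1/2$, which is why the Hölder quotient on $S^1$ only needs to be tested for nearby parameter values.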
$$C^{0,\alpha}(X) = \{f: X \to \mathbb{R} : \|f\|_{C^{0,\alpha}} < \infty\},$$
$$\|f\|_{C^{0,\alpha}} = \sup_{x\in X} |f(x)| + \sup_{x \ne y \in X} \frac{|f(x) - f(y)|}{d_X(x, y)^\alpha}.$$
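Example 21. This is the standard example for $X = [-1, 1] \subset \mathbb{R}$. $f(x) = |x|^\alpha$ is in $C^{0,\alpha}(X)$.
Proof. It clearly suffices to bound the difference quotient
$$q(x, y) = \frac{\big||x|^\alpha - |y|^\alpha\big|}{|x - y|^\alpha}, \qquad x \ne y \in [-1, 1].$$
We will show that this is always $\le 1$. First, simplify to the case $x$ and $y$ have the same sign, since if they have opposite signs, $q(x, y) < q(x, -y)$. We may assume $x$ and $y$ have the same sign. By possibly interchanging $(x, y) \to (-x, -y)$ and switching $x$ and $y$, we may assume $x > y \ge 0$. Then write
$$q(x, y) = \frac{x^\alpha - y^\alpha}{(x - y)^\alpha} = \frac{1 - \lambda^\alpha}{(1 - \lambda)^\alpha}, \qquad \lambda = \frac{y}{x} \in [0, 1).$$
Then we compute
$$\frac{dq}{d\lambda} = \frac{\alpha(1 - \lambda^{\alpha-1})}{(1 - \lambda)^{\alpha+1}} \le 0.$$
Therefore, the max of $q(\lambda)$ is achieved at $\lambda = 0$, $q = 1$.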
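The bound $q(x, y) \le 1$ can be spot-checked numerically. This is only a sanity check on a finite grid, not a proof; the exponent $\alpha = 1/2$, the grid spacing, and the function name `q` are assumptions chosen for the example.

```python
def q(x, y, alpha):
    """Difference quotient | |x|^alpha - |y|^alpha | / |x - y|^alpha."""
    return abs(abs(x) ** alpha - abs(y) ** alpha) / abs(x - y) ** alpha

alpha = 0.5
# Grid on [-1, 1] with spacing 1/50; the point i = 50 is exactly 0.
grid = [i / 50 - 1 for i in range(101)]
worst = max(q(x, y, alpha) for x in grid for y in grid if x != y)
at_zero = q(1.0, 0.0, alpha)
```

The largest value found on the grid is attained at pairs with $y = 0$, matching the statement that the maximum occurs at $\lambda = 0$.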
We also say $f(x) = |x|^\alpha$ is locally $C^{0,\alpha}$ on $\mathbb{R}$, since the Hölder norm of $f$ is finite on any compact subset of $\mathbb{R}$.
In the case $\alpha = 1$, note that a function in $C^{0,1}$ is simply a $C^0$ function which is globally Lipschitz.
Homework Problem 61.
(a) Show that the inclusion $C^1(S^1) \hookrightarrow C^0(S^1)$ is compact (Hint: use the Mean Value Theorem).
(b) Show that every bounded sequence $f_n \in C^1(\mathbb{R})$ (i.e., there is a uniform $C$ so that $\|f_n\|_{C^1} \le C$ for all $n$) has a subsequence which converges uniformly on compact subsets of $\mathbb{R}$ to a continuous limit $f$. Hint: It is easy to show that $\mathbb{R}$ satisfies condition (45).
(c) Find an example of a bounded sequence of functions $f_n \in C^1(\mathbb{R})$ which does not have a convergent subsequence in $C^0(\mathbb{R})$. Thus the inclusion $C^1(\mathbb{R}) \hookrightarrow C^0(\mathbb{R})$ is not compact. (Hint: How is this situation different from parts (a) and (b)? You must use the noncompactness of $\mathbb{R}$. Therefore, the interesting behavior of the $f_n$ should be moving off to infinity.)
It is also useful to apply Hölder norms to the derivatives of a function. In particular, on $\mathbb{R}^n$, we may define for $k$ a positive integer, $\alpha \in (0, 1]$,
$$\|f\|_{C^{k,\alpha}} = \sum_{|\beta|\le k} \|\partial^\beta f\|_{C^{0,\alpha}},$$
where, as in (3) above, we use the multi-index notation to denote all the partial derivatives of $f$ of order $\le k$.
Remark. It is not useful to define $C^{0,\alpha}$ for $\alpha > 1$, as the following problem shows.
Homework Problem 62. Let $\alpha > 1$, and let $f: \mathbb{R} \to \mathbb{R}$. Assume that
$$\sup_{x \ne y} \frac{|f(x) - f(y)|}{|x - y|^\alpha} = C < \infty.$$
Show that $f$ is a constant function.
Hint: Use the definition of the derivative to show that $f'(x) = 0$ for all $x$.
Proposition 78. Let $X$ be a metric space and $\alpha \in (0, 1]$. Then $C^{0,\alpha}(X)$ is a Banach space.
Proof. It is straightforward to show that $\|\cdot\|_{C^{0,\alpha}}$ is a norm. As always, we must check completeness carefully.
Let $\{f_n\}$ be a Cauchy sequence in $C^{0,\alpha}(X)$. We want to show that there is a limit $f \in C^{0,\alpha}$ and that $\|f_n - f\|_{C^{0,\alpha}} \to 0$ as $n \to \infty$.
First of all, it is obvious from the definition of the Hölder norm that $\{f_n\}$ is a Cauchy sequence in $C^0(X)$, and since $C^0$ is complete, there is a continuous limit function $f$, and $f_n \to f$ uniformly.
Now we show $f \in C^{0,\alpha}$. Let $\epsilon > 0$. Then there is an $N$ so that
$$m, n \ge N \implies \|f_m - f_n\|_{C^{0,\alpha}} < \epsilon. \qquad (46)$$
Then for all $m \ge N$, $\|f_m\|_{C^{0,\alpha}} < \|f_N\|_{C^{0,\alpha}} + \epsilon =: C$. By the definition of the Hölder norm, for all $x, y \in X$,
$$|f_m(x) - f_m(y)| \le C\, d_X(x, y)^\alpha.$$
Taking $m \to \infty$ shows that $f \in C^{0,\alpha}$. Now (46) also implies that for all $x, y \in X$,
$$|f_m(x) - f_n(x) - f_m(y) + f_n(y)| \le \epsilon\, d_X(x, y)^\alpha,$$
and so again let $m \to \infty$ to show for all $x, y \in X$, and for all $n \ge N$,
B1 (0) B2
Remark. The Hölder spaces $C^{k,\alpha}$, for $\alpha \in (0, 1)$, and the Sobolev spaces $L^p_k$, for $p \in (1, \infty)$, play a very important role in the theory of partial differential equations. In particular, they behave much better than the more obvious spaces $C^k$. Our simple proofs that $L^2_1(S^1)$ embeds continuously in $C^{0,\frac12}(S^1)$ and compactly in $C^0(S^1)$ constitute some of the easiest cases of the Sobolev embedding theorem. The Sobolev embedding theorem allows us to embed certain Sobolev spaces, in which derivatives are defined only in the sense of distributions, into Hölder and $C^k$ spaces, in which we may take derivatives in the usual sense. These spaces are crucial to the regularity theory of solutions to PDEs.
4.7 Convergence
Now we have finally developed the tools needed to solve our problem. Recall
$$L = \inf_{\gamma\in C} E(\gamma).$$
is bounded independent of i.
This proposition shows there is a further subsequence of $\gamma_i$ which converges weakly to a $\gamma \in L^2_1(S^1, \mathbb{R}^N)$ by Theorem 24. (Explanatory note: a further subsequence means that we take a subsequence not just of the original $\gamma_i$, but of the subsequence taken in the paragraph above Proposition 79.) We still refer to this further subsequence as $\gamma_i$. Then Theorem 25 shows that the Hilbert space norm
$$\|\gamma\|_{L^2_1(S^1,\mathbb{R}^N)} \le \liminf_{i\to\infty} \|\gamma_i\|_{L^2_1(S^1,\mathbb{R}^N)}.$$
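Proposition 82. For every $\phi \in \mathcal{D}(S^1)$, there is a $\psi \in \mathcal{D}(S^1)$ so that $\phi = \psi - \ddot\psi$.
Proof. Recall $\mathcal{D}(S^1) = C^\infty(S^1, \mathbb{R})$. Moreover, Lemma 70 and Problem 58 show that
$$C^\infty(S^1, \mathbb{C}) = \left\{\sum_{k=-\infty}^{\infty} f^k e^{2\pi i k t} : \lim_{k\to\pm\infty} f^k |k|^n = 0 \text{ for } n = 1, 2, \dots\right\}. \qquad (48)$$
The convergence of each such series is uniform, and the sum commutes with the derivative $d/dt$.
Therefore, if
$$\psi = \sum_{k=-\infty}^{\infty} \psi^k e^{2\pi i k t} \in C^\infty(S^1, \mathbb{C}),$$
then
$$\ddot\psi = \sum_{k=-\infty}^{\infty} (-4\pi^2 k^2)\psi^k e^{2\pi i k t},$$
$$\psi - \ddot\psi = \sum_{k=-\infty}^{\infty} (1 + 4\pi^2 k^2)\psi^k e^{2\pi i k t}.$$
So if
$$\phi = \sum_{k=-\infty}^{\infty} \phi^k e^{2\pi i k t} \in C^\infty(S^1, \mathbb{C}),$$
we may define $\psi^k = \phi^k / (1 + 4\pi^2 k^2)$, so that $\phi = \psi - \ddot\psi$.
We must prove that $\psi \in C^\infty(S^1, \mathbb{C})$. Let $n$ be a positive integer. Then
$$\lim_{k\to\pm\infty} \psi^k |k|^n = \lim_{k\to\pm\infty} \frac{\phi^k |k|^n}{1 + 4\pi^2 k^2} = 0.$$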
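The coefficient-by-coefficient solution of $\phi = \psi - \ddot\psi$ can be sketched in code for a finite Fourier series. The representation of a series as a dictionary from $k$ to its coefficient, the names `solve_psi` and `evaluate`, and the sample coefficients are all assumptions for this illustration.

```python
import cmath
import math

def solve_psi(phi_coeffs):
    """Given coefficients {k: phi^k}, return {k: psi^k} with psi - psi'' = phi."""
    return {k: c / (1 + 4 * math.pi ** 2 * k ** 2) for k, c in phi_coeffs.items()}

def evaluate(coeffs, t, order=0):
    """Evaluate the order-th t-derivative of sum_k c_k e^{2 pi i k t}."""
    return sum(
        c * (2j * math.pi * k) ** order * cmath.exp(2j * math.pi * k * t)
        for k, c in coeffs.items()
    )

# A sample finite Fourier series (hypothetical data).
phi = {-2: 0.5j, 0: 1.0, 1: 2.0, 3: -1.0}
psi = solve_psi(phi)

# Check the differential equation at one point of the circle.
t = 0.3
residual = evaluate(psi, t) - evaluate(psi, t, order=2) - evaluate(phi, t)
```

The residual is zero up to rounding, since each term of $\psi - \ddot\psi$ carries the factor $1 + 4\pi^2 k^2$ that was divided out.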
Remark. The previous proposition uses a standard technique for solving constant-coefficient differential equations on $S^1$. The differential equation then breaks into an algebraic equation for each Fourier coefficient, each of which can typically be solved.
This also works for functions on the $n$-torus $(S^1)^n$. In this case, the Fourier series is summed over $\mathbb{Z}^n$, and we can solve constant-coefficient PDEs. Also, on $\mathbb{R}^n$, the Fourier transform turns constant-coefficient PDEs into algebraic equations of the Fourier transform variable.
Homework Problem 65. $L^2_2(S^1, \mathbb{C})$ is the complex Hilbert space defined by the inner product
$$\langle f, g\rangle_{L^2_2} = \int_{S^1} \left(f \bar g + \dot f\, \bar{\dot g} + \ddot f\, \bar{\ddot g}\right) dt.$$
The elements of $L^2_2(S^1, \mathbb{C})$ are all complex-valued functions $f$ on $S^1$ which are $L^2$ and whose first and second derivatives $\dot f$ and $\ddot f$ in the sense of distributions are also $L^2$ functions. (You may assume $L^2_2(S^1, \mathbb{C})$ is a Hilbert space, as in Proposition 62.)
Show that if $f_n \rightharpoonup f$ converges weakly in $L^2_2(S^1, \mathbb{C})$, then for all $\phi \in \mathcal{D}(S^1)$,
$$f_n(\phi) \to f(\phi).$$
Hint: Mimic the proofs of Propositions 81 and 82.
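Moreover, $\gamma \in C$, the same free homotopy class of $L^2_1$ loops containing the $\gamma_i$.
Since $\gamma_i \to \gamma$ uniformly, we have
$$\|\gamma_i - \gamma\|^2_{L^2(S^1,\mathbb{R}^N)} = \int_{S^1} |\gamma_i - \gamma|^2 \, dt \le \sup_t |\gamma_i - \gamma|^2 \to 0,$$
and so $\gamma_i \to \gamma$ in $L^2$.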
Now Theorem 25 shows that
Corollary 83. This minimizing $\gamma$ satisfies the geodesic equations (in local coordinates on $X$) in the form
$$2(g_{ik}\dot\gamma^i)^{\cdot} - g_{ij,k}\dot\gamma^i\dot\gamma^j = 0$$
the first of which is compact and the second of which is continuous. The
following problem gives a direct proof.
Homework Problem 66. Show directly that the inclusion $L^2_1(S^1, \mathbb{C}) \hookrightarrow L^2(S^1, \mathbb{C})$ is a compact linear map.
Hints:
(a) Use the characterization of $L^2_1(S^1, \mathbb{C})$ in terms of Fourier series from Proposition 87 below.
(b) If $\|f_i(t)\|_{L^2_1} \le 1$, then use a diagonalization argument to produce a subsequence $\{f_{i_j}\}$ so that for each $k \in \mathbb{Z}$, the Fourier coefficients $f^k_{i_j}$ converge to constants $g^k \in \mathbb{C}$ as $j \to \infty$.
(c) For all $\epsilon > 0$, show that there is an $N$ so that if $|k| \ge N$, then
$$\sum_{|k|\ge N} |f^k|^2 < \epsilon$$
in $L^2(S^1, \mathbb{C})$.
Remark. The proof presented in the previous problem works for Sobolev spaces in higher dimensions (for functions on the $n$-dimensional torus $S^1 \times \cdots \times S^1$), whereas the use of the Sobolev embedding theorem for the compact inclusion $L^2_1(S^1, \mathbb{C}) \hookrightarrow C^0(S^1, \mathbb{C})$ is only available in dimension $n = 1$.
4.8 Regularity
Now we show that $\gamma$ is smooth. First of all, note that $\Gamma^k_{ij}$ is smooth in each set of local coordinates $x$ on $X$. Also, since $\gamma \in L^2_1(S^1, \mathbb{R}^N)$, then we know that $\gamma$ is continuous in $t \in S^1$, and so $\Gamma^k_{ij}(\gamma)$ is continuous on $S^1$.
Until now, we've been lax about distinguishing between $\gamma = (\gamma^1, \dots, \gamma^N) \in X \subset \mathbb{R}^N$ and $\gamma$ in local coordinates. There is an important point in which we should make a distinction. Recall we are working on a coordinate chart $\phi: U \to O \subset X \subset \mathbb{R}^N$, where $U \subset \mathbb{R}^n$. Our notation has been this: $\gamma^a$ is the $a$th coordinate of $\gamma$ in $\mathbb{R}^N \supset X$, while $\gamma^i$ has been shorthand for $(\phi^{-1} \circ \gamma)^i$, the $i$th coordinate of $\phi^{-1} \circ \gamma$ in $\mathbb{R}^n \supset U$.
In the previous subsections, we have dealt with the $L^2_1$ norm of $\gamma$ in $\mathbb{R}^N$, while in local coordinates, we should deal with the $L^2_1$ norm of $\phi^{-1} \circ \gamma$ in $U \subset \mathbb{R}^n$. Let $\phi^{-1}: O \to U$ be the restriction of the smooth map
$$y = (y^1, \dots, y^n): Q \to U,$$
where $Q$ is an open subset of $\mathbb{R}^N$ which contains $O \subset X \subset \mathbb{R}^N$. (Recall we may do this by the definition of smooth maps from $O$ to $\mathbb{R}^n$.) Let $x = (x^1, \dots, x^N)$ represent coordinates on $\mathbb{R}^N$. Compute for $k = 1, \dots, n$
$$\frac{\partial}{\partial t}(y \circ \gamma)^k = \frac{\partial y^k}{\partial x^a}\dot\gamma^a,$$
where $a$ is summed from 1 to $N$.
Remark. A related, simpler notion is the following: Two norms $\|\cdot\|_{B_1}$ and $\|\cdot\|_{B_2}$ on a single linear space $B$ are called equivalent if there are constants $C_1 > C_2 > 0$ so that for all $x \in B$,
Then it is easy to check that for $A, B \ge 0$,
$$\frac{1}{\sqrt2}(A + B) \le \sqrt{A^2 + B^2} \le A + B.$$
In other words, the norm on $\gamma$ given by the sum of the $L^2$ norm of $\gamma$ and the $L^2$ norm of $\dot\gamma$ is equivalent to the $L^2_1$ norm. It is straightforward to use this fact to prove the claim.
Since $\phi^{-1}$ is $C^1$ on $K$, it is locally Lipschitz and thus globally Lipschitz on $K$ (see Proposition 17). So for $C$ the Lipschitz constant and $x_0$ a point in $K$, for all $x \in K$,
This is essentially one half of (49) for the $L^2$ norm of $\gamma$. The other half follows from the fact that $\phi$ is a $C^1$ function on the compact set $\phi^{-1}K$.
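We still must address the $L^2_1$ norm of $\gamma$. Recall for $y = \phi^{-1}$ as above, that
$$\frac{d}{dt}(\phi^{-1} \circ \gamma) = \frac{\partial y}{\partial x^a}\dot\gamma^a.$$
On the compact set $K$, since $\phi^{-1}$ is $C^1$, there is a constant $C$ so that
$$\left|\frac{\partial y}{\partial x^a}\right| \le C \text{ on } K,$$
and so on $K$
$$\left|\frac{d}{dt}(\phi^{-1} \circ \gamma)\right| = \left|\frac{\partial y}{\partial x^a}\dot\gamma^a\right| \le C\sum_{a=1}^N |\dot\gamma^a| \le CN|\dot\gamma|.$$
$$\frac{d^2}{dt^2}(\phi^{-1} \circ \gamma) = \frac{\partial y}{\partial x^a}\ddot\gamma^a + \frac{\partial^2 y}{\partial x^a \partial x^b}\dot\gamma^a\dot\gamma^b.$$
So first derivative terms of $\gamma$ come into the calculations of the second derivatives of $\phi^{-1} \circ \gamma$.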
The geodesic equation is written in terms of the coordinates on $U \subset \mathbb{R}^n$, and for an open interval $I \subset S^1$, $\gamma(I) \subset O$. On any compact subinterval of $I$, there is a constant $C$ so that the components of the metric $g_{k\ell}(\gamma)$ and its first derivatives $g_{k\ell,m}(\gamma)$ have absolute values bounded by $C$ (this is since $\gamma$ is continuous on the compact interval $I$). Since $\gamma \in L^2_1$, each $\dot\gamma^i \in L^2$. Therefore, Hölder's inequality shows that
$$\int_I \left|\tfrac12 g_{ij,k}\dot\gamma^i\dot\gamma^j\right| dt \le \frac{C}{2}\sum_{i,j=1}^n \left(\int_I |\dot\gamma^i|^2 \, dt\right)^{\frac12} \left(\int_I |\dot\gamma^j|^2 \, dt\right)^{\frac12} < \infty.$$
Thus $\tfrac12 |g_{ij,k}\dot\gamma^i\dot\gamma^j| \in L^1(I)$ for each $k$, and thus Corollary 83 shows $(g_{ik}\dot\gamma^i)^{\cdot} \in L^1(I)$ for each $k$ in the sense of distributions. Lemma 86 below and the proof of Proposition 59 above then show $g_{ik}(\gamma)\dot\gamma^i$ is continuous.
k L1 (I)
gk` () (50)
Now bootstrap using Corollary 83 again to show that $(g_{ik}(\gamma)\dot\gamma^i)^{\cdot}$ is continuous as well. Thus $g_{ik}(\gamma)\dot\gamma^i$ is, in the sense of distributions, a $C^1$ function. As above, this shows $\dot\gamma^i$ is also $C^1$, and thus $\gamma$ is locally $C^2$.
We now have enough regularity to rewrite Corollary 83 as the geodesic equation
$$\ddot\gamma^k = -\Gamma^k_{ij}(\gamma)\dot\gamma^i\dot\gamma^j$$
for $\gamma^k$ $C^2$ functions. The equation holds in the usual sense of ODEs. Therefore, since $\Gamma^k_{ij}$ is smooth, the usual regularity theory for ODEs, Theorem 9, applies, and the geodesic is smooth.
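As a concrete illustration of the geodesic equation as an ODE system, the sketch below integrates it numerically in one simple case. Everything specific here is an assumption made for the example and not taken from the notes: the manifold is the flat plane in polar coordinates $(r, \theta)$ with metric $dr^2 + r^2 d\theta^2$, whose nonzero Christoffel symbols are $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$, and the integrator is a classical fourth-order Runge-Kutta step. Geodesics of this metric are straight lines, which gives an exact answer to compare against.

```python
import math

def geodesic_rhs(state):
    """Right-hand side of the first-order system for (r, theta, r', theta')."""
    r, theta, r_dot, theta_dot = state
    r_ddot = r * theta_dot ** 2                  # -Gamma^r_{theta theta} * theta'^2
    theta_ddot = -2.0 * r_dot * theta_dot / r    # -2 Gamma^theta_{r theta} * r' * theta'
    return (r_dot, theta_dot, r_ddot, theta_ddot)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(
        s + h / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def integrate(state, t_end, steps=1000):
    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

# Start at Cartesian point (1, 0) with Cartesian velocity (0, 1).
r, theta, _, _ = integrate((1.0, 0.0, 0.0, 1.0), 1.0)
x, y = r * math.cos(theta), r * math.sin(theta)
```

With these initial conditions the exact geodesic is the line $t \mapsto (1, t)$, so at $t = 1$ the computed point `(x, y)` should be close to $(1, 1)$.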
is continuous.
Proof. Let $t \in \mathbb{R}$, and let $h > 0$ (the case $h < 0$ is similar). Compute
$$g(t + h) - g(t) = \int_t^{t+h} f(s) \, ds = \int_{\mathbb{R}} \chi_{[t,t+h]}(s) f(s) \, ds$$
for $\chi_{[t,t+h]}$ the characteristic function of the interval $[t, t + h]$. Then as $h \to 0$,
Homework Problem 67. Let $f: \mathbb{R} \to \mathbb{R}$ be an $L^1$ function. Show that
$$\phi(t) = \exp\left(\int_0^t f(\tau) \, d\tau\right)$$
is a distribution.
Proposition 87.
$$L^2_1(S^1, \mathbb{C}) = \left\{\sum_{k\in\mathbb{Z}} f^k e^{2\pi i k t} : \sum_{k\in\mathbb{Z}} |f^k|^2 (k^2 + 1) < \infty\right\}.$$
kZ kZ
Since f L2 ,
X X
4 2 k 2 |fk |2 = kfk2L2 < . k 2 |fk |2 < .
kZ kZ
160
Now since f L2 also, then
X X
|fk |2 < and |fk |2 (k 2 + 1) < .
kZ kZ
This proves .
To show , note that
X X X
|f k |2 (k 2 + 1) < |f k |2 < and k 2 |f k |2 < .
kZ kZ kZ
Therefore, X
f= f k e2ikt L2 ,
kZ
f() = f ()
Z
= f dt
S1
L2
= hf, i
X
= f k 2ik k
kZ
X
= (2ik)f k k
kZ
X
= fk k
kZ
* +
X
= k 2ikt
f e , .
kZ L2
Remark. Similar easy calculations show that
$$L^2_m(S^1, \mathbb{C}) = \left\{\sum_{k\in\mathbb{Z}} f^k e^{2\pi i k t} : \sum_{k\in\mathbb{Z}} |f^k|^2 (k^2 + 1)^m < \infty\right\}$$
by X
f () = fk k , (51)
kZ
k
where f = f (e2ikt ) and we assume that
$$\sum_{k\in\mathbb{Z}} |f^k|^2 (1 + k^2)^s < \infty. \qquad (52)$$
Now we finally give the correct definition of complex distributions on $S^1$. A distribution on $S^1$ is a continuous $\mathbb{C}$-linear map from $C^\infty(S^1, \mathbb{C})$ to $\mathbb{C}$. Denote the space of complex distributions on $S^1$ by $\mathcal{D}'(S^1, \mathbb{C})$.
Proposition 88. $\mathcal{D}'(S^1, \mathbb{C}) = \bigcup_{m\in\mathbb{Z}} L^2_m(S^1, \mathbb{C})$, and the image of $\mathcal{D}'(S^1, \mathbb{C})$ under the Fourier transform is the set of all polynomially bounded complex sequences. In other words, it is the set of all sequences $\{f^k\}$ so that there are $m \in \mathbb{Z}$, $C > 0$ so that $|f^k| \le C(k^2 + 1)^{\frac{m}{2}}$ for all $k \in \mathbb{Z}$.
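Proof. We prove the first equality, and leave the rest as an exercise.
To prove $\supset$, if $f$ is in the union, then $f \in L^2_{-m}(S^1, \mathbb{C})$ for some positive $m$. To show $f \in \mathcal{D}'(S^1, \mathbb{C})$, consider a sequence of $\phi_j \to \phi$ in $C^\infty(S^1, \mathbb{C})$. Then by definition, $\phi_j \to \phi$ in $L^2_m$. Then
$$\begin{aligned}
|f(\phi_j) - f(\phi)| &= |f(\phi_j - \phi)| \\
&\le \sum_{k\in\mathbb{Z}} |f^k (\phi_j^k - \phi^k)| \\
&= \sum_{k\in\mathbb{Z}} \frac{|f^k|}{(1 + k^2)^{\frac{m}{2}}} \left[|\phi_j^k - \phi^k| (1 + k^2)^{\frac{m}{2}}\right] \\
&\le \left(\sum_{k\in\mathbb{Z}} \frac{|f^k|^2}{(1 + k^2)^m}\right)^{\frac12} \left(\sum_{k\in\mathbb{Z}} |\phi_j^k - \phi^k|^2 (1 + k^2)^m\right)^{\frac12}.
\end{aligned}$$
The second term in the last line goes to zero by the remark after Proposition 87, while the first term is finite by the fact $f \in L^2_{-m}$. Therefore, $f(\phi_j) \to f(\phi)$ for every test function $\phi$, and $f \in \mathcal{D}'(S^1, \mathbb{C})$.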
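We prove $\subset$ by contradiction. If $f \in \mathcal{D}'(S^1, \mathbb{C})$ is not in $L^2_m(S^1, \mathbb{C})$ for every $m \in \mathbb{Z}$, then for all $m \in \mathbb{Z}$,
$$\sum_{k=-\infty}^{\infty} |f^k|^2 (1 + k^2)^m = \infty.$$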
So for each $j$, there is a $k_j$ so that
$$\frac{|f^{k_j}|}{(1 + k_j^2)^{\frac{j}{2}}} \ge 1.$$
We may assume $k_j \ne 0$.
Now we construct a sequence $\phi_j$ which converges to 0 in $C^\infty(S^1, \mathbb{C})$, but for which $f(\phi_j) \not\to 0$. Define
$$\phi_j = \frac{\overline{f^{k_j}}}{|f^{k_j}| (1 + k_j^2)^{\frac{j}{2}}}\, e^{2\pi i k_j t}.$$
Compute
$$\|\phi_j\|^2_{L^2_n} \sim \frac{(1 + k_j^2)^n}{(1 + k_j^2)^j} = (1 + k_j^2)^{n-j},$$
where $\sim$ denotes equivalence of norms. For each fixed $n$, since each $k_j^2 \ge 1$, then
$$\lim_{j\to\infty} \|\phi_j\|^2_{L^2_n} = 0,$$
$$f(\phi_j) = f^{k_j}\, \frac{\overline{f^{k_j}}}{|f^{k_j}| (1 + k_j^2)^{\frac{j}{2}}} = \frac{|f^{k_j}|}{(1 + k_j^2)^{\frac{j}{2}}} \ge 1.$$