
MATH 18.152 COURSE NOTES - CLASS MEETING # 1

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting # 1: Introduction to PDEs


1. What is a PDE?
We will be studying functions u = u(x1 , x2 , · · · , xn ) and their partial derivatives. Here x1 , x2 , · · · , xn
are standard Cartesian coordinates on Rn . We sometimes use the alternate notation u(x, y), u(x, y, z),
etc. We also write e.g. u(r, θ, φ) for spherical coordinates on R3 , etc. We sometimes also have a
“time” coordinate t, in which case t, x1 , · · · , xn denotes standard Cartesian coordinates on R1+n .
We also use the alternate notation x^0 := t.
We use lots of different notation for partial derivatives:


(1.0.1a) ∂u/∂x^i = u_{x^i} = ∂i u, 1 ≤ i ≤ n,
(1.0.1b) ∂^2 u/(∂x^i ∂x^j) = u_{x^i x^j} = ∂i ∂j u, 1 ≤ i, j ≤ n.
If i = j, then we sometimes abbreviate ∂i ∂j u := ∂i^2 u. If u is a function of (x, y), then we also write ux = ∂x u, etc.
Definition 1.0.1. A PDE in a single unknown u is an equation involving u and its partial deriva-
tives. All such equations can be written as

(1.0.2) F (u, ux1 , · · · , uxn , ux1 x1 , · · · , uxi1 ···xiN , x1 , x2 , · · · , xn ) = 0, i1 , · · · , iN ∈ {1, 2, · · · , n}


for some function F.
Here N is called the order of the PDE; it is the order of the highest derivative of u appearing in the equation.
Example 1.0.1. u = u(t, x)

(1.0.3) −∂t2 u + (1 + cos u)∂x3 u = 0


is a third-order nonlinear PDE.
Example 1.0.2. u = u(t, x)

(1.0.4) −∂t2 u + 2∂x2 u + u = t


is a second-order linear PDE.
We say that (1.0.4) is a constant coefficient linear PDE because u and its derivatives appear
linearly (i.e. first power only) and are multiplied only by constants.

Example 1.0.3. u = u(t, x)

(1.0.5) ∂t u + 2(1 + x2 )∂x3 u + u = t


is a third-order linear PDE.
We say that (1.0.5) is a variable coefficient linear PDE because u and its derivatives appear
linearly (i.e. first power only) and are multiplied only by functions of the coordinates (t, x).
Example 1.0.4. u = u(t, x), v = v(t, x)
(1.0.6a) ∂t u + 2x∂x v = sin(x^2),
(1.0.6b) ∂t v − x^2 ∂x u = 0
is a system of PDEs in the unknowns u, v.

2. The Goals of PDE (and of this course)


Suppose that we are interested in some physical system. A very fundamental question is:
• Which PDEs are good models for the system?
A major goal of modeling is to answer this question. There is no general recipe for answering it!
In practice, good models are often the end result of confrontations between experimental data and
theory. In this course, we will discuss some important physical systems and the PDEs that are
commonly used to model them.
Now let’s assume that we have a PDE that we believe is a good model for our system of interest.
Then most of the time, the primary goals of PDE are to answer questions such as the following:
(1) Does the PDE have any solutions? (Some PDEs have NO SOLUTIONS whatsoever!!)
(2) What kind of “data” do we need to specify in order to solve the PDE?
(3) Are the solutions corresponding to the given data unique?
(4) What are the basic qualitative properties of the solution?
(5) Does the solution contain singularities? If so, what is their nature?
(6) What happens if we slightly vary the data? Does the solution then also vary only slightly?
(7) What kinds of quantitative estimates can be derived for the solutions?
(8) How can we define the size (i.e., “the norm”) of a solution in a way that is useful for the problem at hand?

3. Physical Examples
It is difficult to exaggerate how prevalent PDEs are. We will discuss some important physically
motivated examples throughout this course. Here is a first look.
• −∂t2 u + ∂x2 u = 0 wave equation, second-order, linear, homogeneous
• −∂t u + ∂x2 u = 0 heat equation, second-order, linear, homogeneous
• ∂x2 u + ∂y2 u + ∂z2 u = 0 Laplace’s equation, second-order, linear, homogeneous
• ∂x2 u + ∂y2 u + ∂z2 u = f (x, y, z) Poisson’s equation with source function f, second-order, linear,
inhomogeneous (unless f = 0)
• i∂t u + ∂x2 u = 0 Schrödinger’s equation, second-order, linear, homogeneous
• ut + ux = 0, transport equation, first-order, linear, homogeneous
• ut + uux = 0, Burgers’ equation, first-order, nonlinear, homogeneous
 
E = (E1 (x, y, z), E2 (x, y, z), E3 (x, y, z)) and B = (B1 (x, y, z), B2 (x, y, z), B3 (x, y, z)) are vectors in R3

(3.0.7a) ∂t E − ∇ × B = 0, ∇ · E = 0,
(3.0.7b) ∂t B + ∇ × E = 0, ∇·B=0
“Maxwell’s equations” in a vacuum (i.e., matter-free spacetime), first-order, linear, homogeneous.

4. Linear PDEs
Before we dive into a specific model, let’s discuss a distinguished class of PDEs that are relatively
easy to study. The PDEs of interest are called linear PDEs. Most of this course will concern linear
PDEs.
Definition 4.0.2. A linear differential operator L is a differential operator such that

(4.0.8) L(au + bv) = aLu + bLv


for all constants a, b ∈ R and all functions u, v.
Remark 4.0.1. The notation was introduced out of convenience and laziness. The definition is
closely connected to the superposition principle.
Example 4.0.5. L := −∂t2 + (t2 − x2 )∂x2 is a linear operator: Lu = −∂t2 u + (t2 − x2 )∂x2 u
Example 4.0.6. u = u(x, y), Lu = ∂x2 u + u2 ∂y2 u does NOT define a linear operator:
L(u + v) = ∂x2 (u + v) + (u + v)2 ∂y2 (u + v) ≠ ∂x2 u + u2 ∂y2 u + ∂x2 v + v 2 ∂y2 v = Lu + Lv
Definition 4.0.3. A PDE is linear if it can be written as

(4.0.9) Lu = f (x1 , · · · , xn )
for some linear operator L and some function f of the coordinates.
Definition 4.0.4. If f = 0, then we say that the PDE is homogeneous. Otherwise, we say that it
is inhomogeneous.
Example 4.0.7. u = u(t, x)

(4.0.10) ∂t u − (1 + cos t)∂x2 u = tx


is a linear PDE.
Here is an incredibly useful property of linear PDEs.
Proposition 4.0.1 (Superposition principle). If u1 , · · · , uM are solutions to the linear PDE

(4.0.11) Lu = 0,
and c1 , · · · , cM ∈ R, then ∑_{i=1}^{M} ci ui is also a solution.

Proof.

(4.0.12) L(∑_{i=1}^{M} ci ui) = ∑_{i=1}^{M} ci Lui = 0,

since Lui = 0 for each i.

Remark 4.0.2. This shows that the set of all solutions to Lu = 0 is a vector space when L is
linear.
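The superposition principle is easy to test numerically. Below is a minimal sketch (Python with NumPy; not part of the original notes) that checks a linear combination of two explicit solutions of the wave operator L = −∂t^2 + ∂x^2 from Section 3; the particular solutions and coefficients are illustrative choices.

```python
import numpy as np

# u1(t, x) = sin(x - t) and u2(t, x) = cos(x + t) both solve Lu = -u_tt + u_xx = 0,
# so by the superposition principle any linear combination should too.
u = lambda t, x: 2.0 * np.sin(x - t) + 3.0 * np.cos(x + t)   # c1*u1 + c2*u2

# Second-order central differences for u_tt and u_xx at random sample points.
h = 1e-4
rng = np.random.default_rng(2)
ts, xs = rng.uniform(-1, 1, 40), rng.uniform(-1, 1, 40)
u_tt = (u(ts + h, xs) - 2 * u(ts, xs) + u(ts - h, xs)) / h**2
u_xx = (u(ts, xs + h) - 2 * u(ts, xs) + u(ts, xs - h)) / h**2

residual = np.max(np.abs(-u_tt + u_xx))
print(residual)  # ~0 up to finite-difference error
```

The residual is not exactly zero only because of discretization and rounding error in the second differences.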
As we will see in the next proposition, inhomogeneous and homogeneous linear PDEs are closely
related.
Proposition 4.0.2 (Relationship between the inhomogeneous and homogeneous linear
PDE solutions). Let SH be the set of all solutions to the homogeneous linear PDE

(4.0.13) Lu = 0,
and let uI be a “fixed” solution to the inhomogeneous linear PDE

(4.0.14) Lu = f (x1 , · · · , xn ).
Then the set SI of all solutions to (4.0.14) is the translation of SH by uI : SI = {uI + uH | uH ∈
SH }.
Proof. Assume that LuI = f, and let w be any other solution to (4.0.14), i.e., Lw = f. Then L(w − uI ) = f − f = 0, so that w − uI ∈ SH . Thus, w = uI + (w − uI ), where w − uI belongs to SH , and so w ∈ SI by definition. On the other hand, if w ∈ SI , then w = uI + uH for some uH ∈ SH . Therefore, Lw = L(uI + uH ) = LuI + LuH = f + 0 = f. Thus, w is a solution to (4.0.14). □
5. How to solve PDEs
• There is no general recipe that works for all PDEs! We will develop some tools that
will enable us to analyze some important classes of PDEs.
• Usually, we don’t have explicit formulas for the solutions to the PDEs we are
interested in! Instead, we are forced to understand and estimate the solutions without
having explicit formulas.
The two things that you typically need to study a PDE:
• You need to know the PDE.
• You need some “data.”
6. Some simple PDEs that we can easily solve
6.1. Constant coefficient transport equations. Consider the first-order linear transport equa-
tion

(6.1.1) a∂x u(x, y) + b∂y u(x, y) = 0,


where a, b ∈ R. Let’s try to solve this PDE by reasoning geometrically. Geometrically, this equation says that ∇u · v = 0, where ∇u := (∂x u, ∂y u) and v is the vector (a, b) ∈ R2 . Thus, the derivative of

u in the direction (a, b) is 0, which implies that u is constant along lines pointing in the direction of (a, b). The slope of such a line is b/a. Therefore, every such line can be described as the set of solutions to bx − ay = c, where c ∈ R. Since u is constant along these lines, we know that u is a “function that depends only on the line c.” Therefore u(x, y) = f (c) = f (bx − ay) for some function f.
In order to determine u completely, we would need to prescribe some “data.” For example, if it is known that u(x, 0) = x^2 , then x^2 = f (bx). Thus, f (c) = b^{-2} c^2 , and u(x, y) = (x − (a/b)y)^2 .
In the future, we will discuss the kinds of data that can be specified in more detail. As we will see,
the type of data will depend on the type of PDE.
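The general solution u(x, y) = f (bx − ay) can be spot-checked numerically. Here is a minimal sketch (Python with NumPy; the coefficients a, b and the profile f are arbitrary illustrative choices, not from the notes) verifying that the PDE residual vanishes up to finite-difference error.

```python
import numpy as np

a, b = 2.0, 3.0                       # illustrative constant coefficients
f = lambda c: np.sin(c)               # any differentiable profile f works
u = lambda x, y: f(b * x - a * y)     # the general solution u(x, y) = f(bx - ay)

# Central finite differences for the partial derivatives at random sample points.
h = 1e-6
rng = np.random.default_rng(0)
xs, ys = rng.uniform(-1, 1, 50), rng.uniform(-1, 1, 50)
u_x = (u(xs + h, ys) - u(xs - h, ys)) / (2 * h)
u_y = (u(xs, ys + h) - u(xs, ys - h)) / (2 * h)

residual = np.max(np.abs(a * u_x + b * u_y))
print(residual)  # ~0 up to finite-difference error
```

By the chain rule, u_x = b f′(bx − ay) and u_y = −a f′(bx − ay), so a u_x + b u_y = (ab − ab) f′ = 0, which is what the tiny residual reflects.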
6.2. Solving a variable coefficient transport equation. With only a bit of additional effort, the procedure from Section 6.1 can be extended to cover the case where the coefficients are pre-specified functions of x, y. Let’s consider the following example:

(6.2.1) y∂x u + x∂y u = 0.


Let P denote a point P = (x, y), and let V denote the vector V = (y, x). Using vector calculus
notation, (6.2.1) can be written as ∇u(P ) · V = 0, i.e., the derivative of u at P in the direction
of V is 0. Thus, equation (6.2.1) implies that u is constant along the curve C passing through P
that points in the same direction as V. This vector can be viewed as a line segment with slope x/y. Therefore, if the curve C is parameterized by x → (x, y(x)) (where we are viewing y as a function of x along C), then C has slope dy/dx, and y is therefore a solution to the following ODE:

(6.2.2) dy/dx = x/y.
We can use the following steps to integrate (6.2.2), which you might have learned in an ODE
class:

(6.2.3) (6.2.2) ⟹ y dy/dx = x ⟹ (1/2) d/dx (y^2) = x,
(6.2.4) ⟹ y^2/2 = x^2/2 + c, c = constant.
Thus, the curve C is a hyperbola of the form {y 2 − x2 = c}. These curves are called characteristics.
We conclude that u is constant along the hyperbolas {y^2 − x^2 = c}, which implies that u(x, y) = f (x^2 − y^2) for some function f.
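As with the constant-coefficient case, this conclusion can be checked numerically. A minimal sketch (Python with NumPy; the profile f and the sample region are arbitrary choices, not from the notes):

```python
import numpy as np

# The characteristics of y*u_x + x*u_y = 0 are the hyperbolas y^2 - x^2 = c,
# so any u(x, y) = f(x^2 - y^2) should satisfy the PDE.
f = lambda c: np.cos(c)                 # arbitrary differentiable profile
u = lambda x, y: f(x**2 - y**2)

h = 1e-6
rng = np.random.default_rng(1)
xs, ys = rng.uniform(0.5, 2.0, 50), rng.uniform(0.5, 2.0, 50)
u_x = (u(xs + h, ys) - u(xs - h, ys)) / (2 * h)
u_y = (u(xs, ys + h) - u(xs, ys - h)) / (2 * h)

residual = np.max(np.abs(ys * u_x + xs * u_y))
print(residual)  # ~0: u is indeed constant along the characteristics
```

Here u_x = 2x f′(x^2 − y^2) and u_y = −2y f′(x^2 − y^2), so y u_x + x u_y = 2xy f′ − 2xy f′ = 0 exactly.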
We can carry out the same procedure for a PDE of the form

(6.2.5) a(x, y)∂x u + b(x, y)∂y u = 0,


as long as we can figure out how to integrate the ODE

(6.2.6) dy/dx = b(x, y)/a(x, y).
7. Some basic analytical notions and tools
We now discuss a few ideas from analysis that will appear repeatedly throughout the course.

7.1. Norms. In PDE, there are many different ways to measure the “size” of a function f. These
measures are called norms. Here is a simple, but useful norm that will appear throughout this
course.
Definition 7.1.1 (C k norms). Let f be a function defined on a domain Ω ⊂ R. Then for any
integer k ≥ 0, we define the C k norm of f on Ω by

(7.1.1) ‖f‖_{C^k(Ω)} := ∑_{a=0}^{k} sup_{x∈Ω} |f^{(a)}(x)|,

where f (a) (x) is the ath order derivative of f (x). We often omit the symbol Ω when Ω = R.
Example 7.1.1.
(7.1.2) ‖sin(x)‖_{C^7(R)} = 8.
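This value is easy to reproduce numerically: the derivatives of sin cycle through sin, cos, −sin, −cos, each with sup equal to 1, so the eight terms (a = 0, …, 7) each contribute 1. A quick check (Python with NumPy; not part of the original notes, and the grid is an arbitrary discretization of one period):

```python
import numpy as np

# One period suffices, since sin and all of its derivatives are 2*pi-periodic.
x = np.linspace(0, 2 * np.pi, 100001)
derivs = [np.sin, np.cos,
          lambda t: -np.sin(t), lambda t: -np.cos(t)]   # the 4-cycle of derivatives of sin
ck_norm = sum(np.max(np.abs(derivs[a % 4](x))) for a in range(8))  # a = 0, ..., 7
print(ck_norm)  # → 8.0
```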
The same notation is used in the case that Ω ⊂ Rn , but in this case, we now sum over all partial derivatives of order ≤ k. For example, if Ω ⊂ R2 , then
‖f‖_{C^2(Ω)} := sup_{(x,y)∈Ω} |f (x, y)| + sup_{(x,y)∈Ω} |∂x f (x, y)| + sup_{(x,y)∈Ω} |∂y f (x, y)| + sup_{(x,y)∈Ω} |∂x^2 f (x, y)| + sup_{(x,y)∈Ω} |∂x ∂y f (x, y)| + sup_{(x,y)∈Ω} |∂y^2 f (x, y)|.
If f is a function of more than one variable, then we sometimes want to extract different informa-
tion about f in one variable compared to another. For example, if f = f (t, x), then we use notation
such as

(7.1.3) ‖f‖_{C^{1,2}} := ∑_{a=0}^{1} sup_{(t,x)∈R^2} |∂t^a f (t, x)| + ∑_{a=1}^{2} sup_{(t,x)∈R^2} |∂x^a f (t, x)|.

Above, the “1” in C 1,2 refers to the t coordinate, while the “2” refers to the x coordinate.
The next definition provides a very important example of another class of norms that are prevalent
in PDE theory.
Definition 7.1.2 (Lp norms). Let 1 ≤ p < ∞ be a number, and let f be a function defined on a
domain Ω ⊂ Rn . We define the Lp norm of f by

(7.1.4) ‖f‖_{L^p(Ω)} := ( ∫_Ω |f (x)|^p d^n x )^{1/p} .

We often write just Lp instead of Lp (Rn ).


‖ · ‖_{L^p(Ω)} has all the properties of a norm:
• Non-negativity: ‖f‖_{L^p(Ω)} ≥ 0, and ‖f‖_{L^p(Ω)} = 0 ⟺ f (x) = 0 almost everywhere1
• Scaling: ‖λf‖_{L^p(Ω)} = |λ| ‖f‖_{L^p(Ω)}
• Triangle inequality: ‖f + g‖_{L^p(Ω)} ≤ ‖f‖_{L^p(Ω)} + ‖g‖_{L^p(Ω)}
Similarly, ‖ · ‖_{C^k(Ω)} also has all the properties of a norm. All of these properties are very easy to show except for the last one in the case of ‖ · ‖_{L^p(Ω)}. You will study the very important case p = 2 in detail in your homework.
1“Almost everywhere” is a term that would be precisely defined in a course on measure theory.
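The three norm properties can be spot-checked numerically for p = 2. A minimal sketch (Python with NumPy; the test functions f, g and the Riemann-sum quadrature are illustrative choices, not from the notes):

```python
import numpy as np

# Approximate the L^2([0,1]) norm with a simple Riemann sum.
x = np.linspace(0, 1, 10001)
dx = x[1] - x[0]
l2 = lambda h: np.sqrt(np.sum(h**2) * dx)

f, g = np.sin(2 * np.pi * x), x**2     # arbitrary test functions
lam = -3.0                             # arbitrary scalar

nonneg_ok = l2(f) >= 0 and np.isclose(l2(0 * f), 0.0)           # non-negativity
scaling_ok = np.isclose(l2(lam * f), abs(lam) * l2(f))          # scaling
triangle_ok = l2(f + g) <= l2(f) + l2(g) + 1e-12                # triangle inequality
print(nonneg_ok, scaling_ok, triangle_ok)  # → True True True
```

A finite sample of functions cannot prove the triangle inequality, of course; it only illustrates the property you will prove rigorously for p = 2 in the homework.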

7.2. The divergence theorem. A lot of PDE results are derived using integration by parts (some-
times very fancy versions of it), which provides us with integral identities. This will become more
apparent as the course progresses. Let’s recall a very important version of integration by parts from
vector calculus: the divergence theorem. We first need to recall the notion of a vectorfield on Rn .
Definition 7.2.1 (Vectorfield). Recall that a vectorfield F on Ω ⊂ Rn is an Rn −valued (i.e.
vector-valued) function defined on Ω. That is,

(7.2.1) F : Ω → Rn , F(x1 , · · · , xn ) = (F 1 (x1 , · · · , xn ), · · · , F n (x1 , · · · , xn )),

where each of the F i are scalar-valued functions on Rn .


We also need to recall the definition of the divergence operator, which is a differential operator
that acts on vectorfields.
Definition 7.2.2 (Divergence). Recall that ∇·F, the divergence of F, is the scalar-valued function
on Rn defined by

(7.2.2) ∇ · F := ∑_{i=1}^{n} ∂i F^i .

We are now ready to recall the divergence theorem.


Theorem 7.1 (Divergence Theorem). Let Ω ⊂ R3 be a domain2 with a boundary that we denote
by ∂Ω. Then the following formula holds:
(7.2.3) ∫_Ω ∇ · F(x, y, z) dxdydz = ∫_{∂Ω} F(σ) · N̂(σ) dσ.

Above, N̂(σ) is the unit outward normal vector to ∂Ω, and dσ is the surface measure induced
on ∂Ω. Recall that if ∂Ω ⊂ R3 can locally be described as the graph of a function φ(x, y) (e.g.,
∂Ω = {(x, y, z) |z = φ(x, y)}), then
(7.2.4) dσ = √(1 + |∇φ(x, y)|^2) dxdy,
where ∇φ := (∂x φ, ∂y φ) is the gradient of φ, and |∇φ| = √((∂x φ)^2 + (∂y φ)^2) is the Euclidean length of ∇φ.
Remark 7.2.1. The divergence theorem holds in all dimensions, not just 3. In dimension 1, the
divergence theorem is
(7.2.5) ∫_{[a,b]} (d/dx)F (x) dx = F (b) − F (a),
which is just the Fundamental Theorem of Calculus.
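The one-dimensional statement is easy to verify by quadrature. A minimal sketch (Python with NumPy; the function F, the interval, and the trapezoid rule are illustrative choices, not from the notes):

```python
import numpy as np

F = lambda x: np.exp(x) * np.sin(3 * x)                          # arbitrary smooth F
Fp = lambda x: np.exp(x) * (np.sin(3 * x) + 3 * np.cos(3 * x))   # dF/dx, by the product rule

a_end, b_end = 0.0, 2.0
x = np.linspace(a_end, b_end, 200001)
y = Fp(x)
lhs = np.sum(y[:-1] + y[1:]) * (x[1] - x[0]) / 2   # trapezoid rule for the integral of dF/dx
rhs = F(b_end) - F(a_end)                          # the "boundary term" F(b) - F(a)
print(lhs, rhs)                                    # the two agree up to quadrature error
```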

2Throughout this course, a domain is defined to be an open, connected subset of Rn .


MIT OpenCourseWare
http://ocw.mit.edu

18.152 Introduction to Partial Differential Equations.


Fall 2011

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
MATH 18.152 COURSE NOTES - CLASS MEETING # 2

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting # 2: The Diffusion (aka Heat) Equation


1. Introduction to the Heat Equation
The heat equation for a function u(t, x), x := (x1 , ⋯, xn ) ∈ Rn , is

(1.0.1) ut − D∆u = f (t, x).


Here, the constant D > 0 is the diffusion coefficient, f (t, x) is an inhomogeneous term, and ∆ is the
Laplacian operator, which takes the following form in Cartesian coordinates:

(1.0.2) ∆ := ∑_{i=1}^{n} ∂i^2 .

Equation (1.0.1) is second-order and linear.

2. A simple model of heat flow that leads to the heat equation


We now give an example of a simple model of heat flow that leads to the heat equation. Consider
a homogeneous, isotropic solid body B ⊂ Rn (n = 3 is the physically relevant case) described by the
following physical properties:

(2.0.3) ρ := mass density ∼ [mass] × [volume]^{-1} = constant,
(2.0.4) e(t, x) := thermal energy per unit mass ∼ [energy] × [mass]^{-1} .
Let’s also assume that heat is supplied to the body by an external source which pumps in heat at
the following rate per unit mass:

(2.0.5) R ∼ [energy] × [time]−1 × [mass]−1 .


The total thermal energy E(t; V ) contained in a body sub-volume V ⊂ B at time t is the integral of e(t, x) over V :

(2.0.6) E(t; V ) := ∫_V ρe(t, x) d^n x.
The rate of change of the total energy contained in V is

(2.0.7) (d/dt) E(t; V ) = (d/dt) ∫_V ρe(t, x) d^n x = ∫_V ρ∂t e(t, x) d^n x.
In (2.0.7), we have assumed that you can differentiate under the integral; we can do this when
e(t, x) is a “nice” function. We will be more precise about the meaning of “nice” later in the course.

Let’s now address the factors that can cause (d/dt) E(t; V ) to be non-zero. That is, let’s account for the factors that cause the energy within the volume V to change. In our simple model, we will
account for only two factors. First, by integrating (2.0.5) over V, we deduce the rate of energy
pumped into the sub-volume V by the external source:

(2.0.8) ∫_V ρR(t, x) d^n x ∼ [energy] × [time]^{-1} .

Second, we will also assume that heat energy is flowing throughout the body, and that flow can
be modeled by a heat flux vector q

(2.0.9) q ∼ [energy] × [time]−1 × [area]−1 ,


which specifies the direction and magnitude of heat flow across a unit area. That is, if dσ ⊂ ∂V is a small surface area with outward unit-normal N̂, then q ⋅ N̂ is the energy flowing out of the small surface. Thus, the rate of heat energy flowing into V is

(2.0.10) −∫_{∂V} q ⋅ N̂ dσ = −∫_V ∇ ⋅ q d^n x ∼ [energy] × [time]^{-1} ,

where the equality follows from the divergence theorem.


We will connect the various energies together by assuming the following energy conservation
“law:” The rate of change of total energy in the sub-volume V is equal to the rate of heat energy
flowing into V + rate of heat energy supplied by the external source. Using (2.0.7), (2.0.8), and
(2.0.10), we see that this “law” takes the following form in terms of integrals:

(2.0.11) ∫_V ρ∂t e(t, x) d^n x = −∫_V ∇ ⋅ q d^n x + ∫_V ρR d^n x.
Since the above relations are assumed to hold for all body sub-volumes V, the integrands must be
equal (again, as long as they are nice):

(2.0.12) ρ∂t e(t, x) = −∇ ⋅ q + ρR.

2.1. Fourier’s law. In order to turn (2.0.12) into a PDE that we can study, we need to make an-
other assumption about e(t, x), q, and their relation to the temperature u(t, x). Fourier hypothesized
the following “Fourier’s Law of heat conduction:”

(2.1.1) q(t, x) = −κ∇u(t, x),


where κ > 0 is the thermal conductivity, and ∇u := (∂1 u, ⋯, ∂n u) is the spatial gradient
of the temperature u(t, x). We will assume that κ is a constant. Recall that at each fixed t,
∇u(t, x) points in the direction of maximal increase and that ∇u(t, x) is perpendicular to the level
sets {x ∣ u(x) = constant}. Thus, (2.1.1) states that heat flows “from hot to cold” (i.e. towards
decreasing temperature) and that the flow is perpendicular to the surfaces of constant temperature.
Remark 2.1.1. (2.1.1) is NOT A FUNDAMENTAL LAW OF NATURE ! It is a simple but rea-
sonable (under certain circumstances) model!

We need one more assumption in order to derive our PDE - we need to relate e(t, x) to u(t, x). We
will assume a very simple model, which is experimentally verified by many substances in moderate
temperature ranges:

(2.1.2) e = cυ u.
Here, cυ > 0 is the specific heat at constant volume. We also assume that cυ is constant. Like
many of our previous assumptions, (2.1.2) is also just a simple model, and not a fundamental law
of nature.
Finally, we combine (2.0.12), (2.1.1), and (2.1.2), and use the identity ∇ ⋅ ∇u = ∆u, thus arriving
at

(2.1.3) ∂t u(t, x) = (κ/(cυ ρ)) ∆u + (1/cυ) R.

This is the heat equation (1.0.1) with D = κ/(cυ ρ) and f = (1/cυ) R.

3. Well-posedness
Remember, one of the main goals of PDE theory is to figure out which kind of data lead to a
unique solution. It is not always obvious which kind of data we are allowed to specify in order to
solve the equation. When we have a PDE and a notion of data such that the data always lead to a
unique solution, and the solution depends “continuously” on the data, we say that the problem is
well-posed.
3.1. Dirichlet boundary conditions. Let’s study Dirichlet boundary conditions for the heat
equation in n = 1 spatial dimension. Think of a one-dimensional rod with endpoints at x = 0 and x = L.
Let’s set most of the constants equal to 1 for simplicity, and assume that there is no external source
pumping energy into the rod, i.e., that there is no inhomogeneous term f.
Then we could, for example, prescribe the temperature of the rod at t = 0 (sometimes called
Cauchy data) and also at the boundaries x = 0 and x = L for all times t ∈ [0, T ] ∶


(3.1.1)
∂t u − D∂x2 u = 0, (t, x) ∈ (0, T ) × (0, L),
u(0, x) = g(x), x ∈ [0, L], (Cauchy data),
u(t, 0) = h0 (t), u(t, L) = hL (t), t > 0, (Dirichlet data).
As we will see, under suitable assumptions on the functions g, h0 , hL , these conditions lead to a well-posed problem.
3.2. Neumann (N for Normal!) boundary conditions. Instead of prescribing the temperature
at the boundaries, let’s instead prescribe the inward rate of heat flow (given by Fourier’s law with
κ = 1) at the boundaries:


(3.2.1)
∂t u − D∂x2 u = 0, (t, x) ∈ (0, T ) × (0, L),
u(0, x) = g(x), (Cauchy data),
−∂x u(t, 0) = h0 (t), ∂x u(t, L) = hL (t), (Neumann data).
Under suitable assumptions on the functions g, h0 , hL , these conditions also lead to a well-posed problem.

3.3. Robin boundary conditions. We can also take some linear combinations of the Dirichlet
and Neumann conditions:


(3.3.1)
∂t u − D∂x2 u = 0, (t, x) ∈ (0, T ) × (0, L),
u(0, x) = g(x), (Cauchy data),
−∂x u(t, 0) + αu(t, 0) = h0 (t), ∂x u(t, L) + αu(t, L) = hL (t), (Robin data),

where α > 0 is a positive constant. Under suitable assumptions on the functions g, h0 , hL , these conditions also lead to a well-posed problem.

3.4. Mixed boundary conditions. The above three boundary conditions are called homogeneous
because they are of the same type at each end. It is also possible to prescribe one condition at
one endpoint, and a different condition at the other endpoint. These are called mixed boundary
conditions. These conditions also lead to a well-posed problem.

4. Separation of variables
We now discuss a technique, known as separation of variables, that can be used to explicitly
solve certain PDEs. It is especially useful in the study of linear PDEs. Although this technique is
applicable to some important PDEs, it is unfortunately far from universally applicable.
In a nutshell, the separation of variables technique can be summarized as:
● Look for a solution of the form u(t, x) = v (t)w(x).
● Plug this guess into the PDE and hope that the PDE forces the functions v and w to be
solutions to ODEs that can be solved without too much trouble.
As we will see, when one tries to apply this technique, one quickly runs into difficulties that are
best addressed using techniques from Fourier analysis. We don’t have time right now to give a
detailed introduction to Fourier analysis, but we will return to it later in the course if time permits;
at the moment, we will only show how to use some of these techniques, without fully justifying
them.
A great way to illustrate separation of variables is through an example. Let’s try to solve the
heat equation problem with homogeneous (i.e., vanishing) Dirichlet conditions


(4.0.1)
ut − uxx = 0, (t, x) ∈ (0, T ] × [0, 1],
u(0, x) = x, x ∈ [0, 1],
u(t, 0) = 0, u(t, 1) = 0,

by separation of variables.

Remark 4.0.1. Note that such a solution cannot possibly be continuous at the point (0, 1).

We plug in the form u(t, x) = v(t)w(x) into (4.0.1) and discover that

(4.0.2) v′(t)/v(t) = w′′(x)/w(x).

This should hold for all t, x. It therefore must be the case that both sides are equal to a constant,
which we will call λ. We then have

(4.0.3a) v ′ (t) = λv (t),


(4.0.3b) w′′ (x) = λw(x).
Furthermore, w(0) = w(1) = 0 by the boundary conditions.
Let’s address v first, since it requires less work to deal with than w. If λ ∈ R, then (4.0.3a) can
be generally solved:

(4.0.4) v(t) = Aeλt


for some A ∈ R.
In contrast, the study of w(x) splits into three cases:
● λ = 0. Then w(x) = Bx + C for some B, C ∈ R. The boundary conditions imply that C = 0 and B + C = 0, so that B = C = 0. Thus, this solution is not very interesting.
● λ > 0. Then w(x) = Be^{√λ x} + Ce^{−√λ x} for some B, C ∈ R. The boundary conditions imply that B + C = 0 and Be^{√λ} + Ce^{−√λ} = 0, which forces B = C = 0. This solution is also not very interesting.
● λ < 0. Then w(x) = B sin(√|λ| x) + C cos(√|λ| x) for some B, C ∈ R. The boundary condition w(0) = 0 forces C = 0, so w(x) = B sin(√|λ| x). The boundary condition w(1) = 0 then forces λ = −π^2 m^2 for some m ∈ Z+ , where Z+ := the set of positive integers. The λ are called eigenvalues, and the corresponding wm are the corresponding eigenvectors. Equation (4.0.3b) is called an eigenvalue problem corresponding to the linear operator L := ∂x^2 .
We have shown that the only solutions w are of the form wm (x) = B sin(mπx), m ∈ Z+ . Using also (4.0.4) and the fact that λ = −π^2 m^2 for our solutions, we have produced a family of solutions to the heat equation ∂t u − ∂x2 u = 0 satisfying the boundary conditions:

(4.0.5) um (t, x) = e^{−m^2 π^2 t} sin(mπx), m ∈ Z+ .
But we haven’t yet satisfied the initial condition u(0, x) = x. To do this, we could try using the
superposition principle:


(4.0.6) u(t, x) = ∑_{m=1}^{∞} Am um (t, x), Am ∈ R.

We would have to solve for the Am to achieve the desired initial condition u(0, x) = x.
Here is a list of things we would have to do to fully solve this problem using this technique:
(1) Find plausible Am .
(2) Show that the infinite sum (4.0.6) converges.
(3) Show that the infinite sum solves the heat equation.
(4) Show that u(t, x) satisfies the boundary conditions.
(5) Check that limt→0+ u(t, x) = u(0, x) = x. We also have to investigate in which sense this limit
may or may not hold. We already know that this equality cannot hold pointwise at the
point (0, 1).

(6) Show that there can be no other solution with these initial/boundary conditions (unique-
ness).
Let’s deal with (1) first. If (4.0.6) holds, then at t = 0 ∶

(4.0.7) x = u(0, x) = ∑_{m=1}^{∞} Am um (0, x) = ∑_{m=1}^{∞} Am sin(mπx).

This is a Fourier series expansion for the function f (x) = x on the interval [0, 1].
It is helpful to think of a function f (x) as a vector in an infinite dimensional vector space and the
sin(mπx) as basis vectors (however, it is not trivial to show that they form a basis...). Furthermore,
if we introduce the dot product

(4.0.8) ⟨f (x), g(x)⟩ := ∫_{[0,1]} f (x)g(x) dx,

then the basis vectors are orthogonal (do the computation yourself!):

(4.0.9) ⟨sin(mπx), sin(nπx)⟩ = 1/2 if m = n, and 0 if m ≠ n.
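This orthogonality computation (which you are asked to do by hand) can also be confirmed numerically. A minimal sketch (Python with NumPy; the quadrature and the range of m, n are illustrative choices, not from the notes):

```python
import numpy as np

# Approximate the dot product (4.0.8) on [0, 1] with a simple Riemann sum.
x = np.linspace(0, 1, 100001)
dx = x[1] - x[0]
inner = lambda f, g: np.sum(f * g) * dx

# Gram matrix of the first few basis vectors sin(m*pi*x), m = 1, ..., 5.
gram = np.array([[inner(np.sin(m * np.pi * x), np.sin(n * np.pi * x))
                  for n in range(1, 6)]
                 for m in range(1, 6)])
print(np.round(gram, 4))  # ≈ 0.5 on the diagonal, ≈ 0 off the diagonal
```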
This suggests that the following heuristic computations might be able to be made completely
rigorous:


(4.0.10) ∫_{[0,1]} f (x) sin(nπx) dx = ⟨f (x), sin(nπx)⟩ = ⟨∑_{m=1}^{∞} Am sin(mπx), sin(nπx)⟩
= ∑_{m=1}^{∞} Am ⟨sin(mπx), sin(nπx)⟩
= (1/2) An .
Applying this to our function f (x) = x, we integrate by parts to compute that

(4.0.11) Am = 2 ∫_{[0,1]} x sin(mπx) dx = −(2/(mπ)) x cos(mπx)|_{x=0}^{x=1} + (2/(mπ)) ∫_{[0,1]} cos(mπx) dx = (2/(mπ)) (−1)^{m+1} .
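The closed-form coefficients can be double-checked against direct quadrature of 2 ∫_0^1 x sin(mπx) dx. A minimal sketch (Python with NumPy; the grid and the range of m are illustrative choices, not from the notes):

```python
import numpy as np

x = np.linspace(0, 1, 200001)
dx = x[1] - x[0]

ms = np.arange(1, 11)
# Riemann-sum approximation of A_m = 2 * integral of x*sin(m*pi*x) over [0, 1].
numeric = np.array([2 * np.sum(x * np.sin(m * np.pi * x)) * dx for m in ms])
# The closed form from the integration by parts: A_m = 2*(-1)^(m+1)/(m*pi).
closed = 2 * (-1.0) ** (ms + 1) / (ms * np.pi)

max_err = np.max(np.abs(numeric - closed))
print(max_err)  # small quadrature error
```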
We now hope that our solution is:


(4.0.12) u(t, x) = ∑_{m=1}^{∞} (−1)^{m+1} (2/(mπ)) e^{−m^2 π^2 t} sin(mπx).
Remark 4.0.2. The individual terms (−1)^{m+1} (2/(mπ)) e^{−m^2 π^2 t} sin(mπx) are sometimes called the modes of the solution. Note that each mode decays rapidly at an exponential rate as t → ∞. Furthermore, the infinite sum ∑_{m=1}^{∞} (−1)^{m+1} (2/(mπ)) e^{−m^2 π^2 t} sin(mπx) also decays exponentially in time. Later in the course we will study the heat equation on all of R, and we will once again see that under suitable assumptions, solutions to the heat equation tend to decay exponentially in time. However, if we had non-zero Dirichlet conditions for the problem (4.0.1), then the solution might not decay to 0, but instead to some other state.
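A truncation of the series (4.0.12) can be evaluated directly to illustrate these qualitative claims. A minimal sketch (Python with NumPy; the truncation level M and the sample times are arbitrary illustrative choices, not from the notes):

```python
import numpy as np

def u(t, x, M=200):
    """Partial sum of the series (4.0.12) with M terms (M is an arbitrary truncation)."""
    m = np.arange(1, M + 1).reshape(-1, 1)
    terms = ((-1.0) ** (m + 1) * (2 / (m * np.pi))
             * np.exp(-(m * np.pi) ** 2 * t)
             * np.sin(m * np.pi * np.atleast_1d(x)))
    return terms.sum(axis=0)

xs = np.linspace(0, 1, 101)
# Dirichlet boundary conditions: u vanishes at x = 0 and x = 1 for t > 0.
bc = max(abs(u(0.1, 0.0)[0]), abs(u(0.1, 1.0)[0]))
# Decay in time: the amplitude at t = 1 is much smaller than at t = 0.1.
decay = np.max(np.abs(u(1.0, xs))) < np.max(np.abs(u(0.1, xs)))
print(bc, decay)
```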

Let’s now answer some of the remaining questions from above.


(2) Thanks to the factor e^{−m^2 π^2 t}, which decays rapidly in m, for any t > 0 the series (4.0.12) can be seen to converge uniformly for x ∈ [0, 1] using one of the standard convergence arguments from analysis (carefully work through this argument yourself; pg. 9 of your book might be a helpful reference). The argument for t = 0 is much more subtle and is addressed in Theorem 4.1 below.
(3) We already know that each mode in (4.0.12) solves the heat equation. So what about the infinite sum? Again, for any t > 0, the e^{−m^2 π^2 t} factor plus standard results from analysis allow us to repeatedly differentiate the series term-by-term in both t and x (work through this yourself). In particular, the series is smooth (i.e., infinitely differentiable in all variables) for any t > 0. In particular, for t > 0, we have that

(4.0.13a) ∂t u = ∑_{m=1}^{∞} ∂t [(−1)^{m+1} (2/(mπ)) e^{−m^2 π^2 t} sin(mπx)] = ∑_{m=1}^{∞} (−1)^m 2mπ e^{−m^2 π^2 t} sin(mπx),
(4.0.13b) ∂x^2 u = ∑_{m=1}^{∞} ∂x^2 [(−1)^{m+1} (2/(mπ)) e^{−m^2 π^2 t} sin(mπx)] = ∑_{m=1}^{∞} (−1)^m 2mπ e^{−m^2 π^2 t} sin(mπx),

which shows that −∂t u + ∂x^2 u = 0.
(4) The fact that u verifies the correct Dirichlet conditions at x = 0 and x = 1 follows from the
fact that each of the modes does.
The remaining two questions require more work. We first quote the following theorem from
Fourier analysis to help us understand the Fourier expansion at t = 0. Using this theorem, you will
address question (5) in your homework.
Theorem 4.1 (Some basic facts from Fourier analysis). If f (x) is a function such that
‖f‖^2_{L^2([0,1])} := ∫_0^1 |f (x)|^2 dx < ∞, then f (x) can be Fourier-expanded as f (x) = ∑_{m=1}^{∞} Am sin(mπx), where Am = 2 ∫_{[0,1]} f (x) sin(mπx) dx. The infinite sum converges in the sense that

(4.0.14) ‖f − ∑_{m=1}^{N} Am sin(mπx)‖_{L^2([0,1])} → 0 as N → ∞.

We also have the Parseval identity

(4.0.15) ‖f‖^2_{L^2([0,1])} = ∑_{m=1}^{∞} A_m^2 ‖sin(mπx)‖^2_{L^2([0,1])} = (1/2) ∑_{m=1}^{∞} A_m^2 .
Note that (4.0.15) is an “infinite dimensional Pythagorean theorem.”
Furthermore, if f is continuous on [0, 1], then for any subinterval [a, b] ⊂ (0, 1),

(4.0.16) ‖f − ∑_{m=1}^{N} Am sin(mπx)‖_{C^0([a,b])} → 0 as N → ∞,
i.e., the convergence is uniform on any closed subinterval [a, b] of the open interval (0, 1).
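The Parseval identity can be spot-checked for the function f (x) = x from our problem: ‖f‖^2_{L^2} = ∫_0^1 x^2 dx = 1/3, while A_m = 2(−1)^{m+1}/(mπ) gives (1/2) ∑ A_m^2 = ∑ 2/(m^2 π^2) = (2/π^2)(π^2/6) = 1/3. A quick numerical confirmation (Python with NumPy; the truncation at two million terms is an arbitrary choice, not from the notes):

```python
import numpy as np

m = np.arange(1, 2_000_000)
A = 2 * (-1.0) ** (m + 1) / (m * np.pi)   # the coefficients from (4.0.11)
parseval_sum = np.sum(A**2) / 2           # should approach ||f||^2 = 1/3
print(parseval_sum)                       # ≈ 0.3333...
```

The truncated sum undershoots 1/3 by roughly the tail 2/(π^2 N), which is about 10^{-7} here.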
Exercise 4.0.1. Many extensions of Theorem 4.1 are possible. Read Appendix A of your textbook
in order to learn about them.
MATH 18.152 COURSE NOTES - CLASS MEETING # 3

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting # 3: The Heat Equation: Uniqueness


1. Uniqueness
The results from the previous lecture produced one solution to the Dirichlet problem

(1.0.1)
ut − uxx = 0, (t, x) ∈ (0, T ] × [0, 1],
u(0, x) = x, x ∈ [0, 1],
u(t, 0) = 0, u(t, 1) = 0,

namely


(1.0.2) u(t, x) = ∑_{m=1}^{∞} (−1)^{m+1} (2/(mπ)) e^{−m^2 π^2 t} sin(mπx).

But how do we know that this is the only one? In other words, we need to answer the uniqueness
question (6) from the previous lecture. The next theorem addresses this question. We first need to
introduce some important spacetime domains that will play a role in the analysis.
Definition 1.0.1. Let Ω ⊂ Rn be a bounded spatial domain (i.e., an open connected subset of Rn ),
and let T > 0 be a time. We define the corresponding spacetime cylinder QT ⊂ R1+n by

(1.0.3) QT := (0, T ) × Ω.

We also define the parabolic boundary ∂p QT of QT as follows:

(1.0.4) ∂p QT := ({0} × Ω̄) ∪ ((0, T ] × ∂Ω) = bottom of Q̄T ∪ sides of Q̄T .

Here, Q̄T denotes the closure of QT in R^{1+n}, and Ω̄ denotes the closure of Ω in Rn .
Theorem 1.1 (A uniqueness result for the heat equation on a finite interval). Solutions u ∈ C^{1,2}(Q̄_T) to the inhomogeneous heat equation

(1.0.5) ∂_t u − D∂_x² u = f(t, x)

are unique under Dirichlet, Neumann, Robin, or mixed boundary conditions.

Remark 1.0.1. By u ∈ C^{1,2}(Q̄_T), we mean that the time derivatives of u(t, x) up to order 1 (the first index) and the spatial derivatives of u(t, x) up to order 2 (the second index) exist and are continuous on Q_T and extend continuously to the closure Q̄_T. Unfortunately, these kinds of technical details often play a role in PDE theory.
Remark 1.0.2. In its current form, Theorem 1.1 is not quite strong enough to apply to the problem (1.0.1). More precisely, the solution to that problem has a discontinuity at (t, x) = (0, 1), while Theorem 1.1 requires that the solutions are of class C^{1,2}(Q̄_T). Uniqueness does in fact hold in a certain sense for the problem (1.0.1), but because of the discontinuity, this issue is best addressed in a more advanced course.
Proof. Let’s do the Dirichlet proof in the case D = 1. Assume we have two solutions to (1.0.5) with the same specified Cauchy and Dirichlet data. Then by subtracting them and calling the difference w, we get another solution w satisfying

(1.0.6) ∂_t w − ∂_x² w = 0, (t, x) ∈ [0, T] × [0, L],
        w(0, x) = 0, x ∈ [0, L],
        w(t, 0) = 0, w(t, L) = 0, t ∈ [0, T].

We want to show that w(t, x) = 0 for (t, x) ∈ [0, T] × [0, L]. We perform the following super-important and very commonly used strategy: we multiply both sides of (1.0.6) by w and integrate dx over the interval [0, L] to derive

(1.0.7) ∫_{[0,L]} w ∂_t w dx = ∫_{[0,L]} w ∂_x² w dx.

Differentiating under the integral on the left-hand side and integrating by parts on the right-hand side, we deduce

(d/dt) (1/2) ∫_{[0,L]} w²(t, x) dx = ∫_{[0,L]} w ∂_t w dx = ∫_{[0,L]} w ∂_x² w dx
    = −∫_{[0,L]} (∂_x w(t, x))² dx + [w(t, x) ∂_x w(t, x)]_{x=0}^{x=L}
    = −∫_{[0,L]} (∂_x w(t, x))² dx ≤ 0,

where the boundary term vanishes by the boundary conditions w(t, 0) = w(t, L) = 0.
So if we define the energy

(1.0.8) E(t) := ∫_{[0,L]} w²(t, x) dx ≥ 0,

then we have shown that

(1.0.9) (d/dt) E(t) ≤ 0.

But E(0) = 0 by the initial condition for w. Therefore, E(t) = 0 for t ∈ [0, T]. But since w²(t, x) is continuous and non-negative, it must be that w(t, x) = 0 for (t, x) ∈ [0, T] × [0, L]. □
Remark 1.0.3. Broadly speaking, the strategy we have used in this proof is called the energy method. It is a very flexible strategy that applies to many PDEs. Note also that we did not need to know very much about the solution to conclude that it is unique! In particular, we didn’t need to “find a formula” for the solution! Note also that E(t) is the square of the spatial L²([0, L]) norm of w at time t.
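The energy inequality (1.0.9) is easy to watch numerically. The following is a minimal sketch (Python, with an explicit finite-difference scheme; the grid parameters are ad hoc choices, not part of the notes) that evolves a homogeneous Dirichlet problem and checks that the discrete energy E(t) never increases:

```python
import math

def heat_energy_history(L=1.0, T=0.1, nx=50, nt=2000):
    """Explicit finite differences for w_t = w_xx with w(t,0) = w(t,L) = 0.
    Returns the discrete energy E(t) = int w^2 dx at each time step."""
    dx = L / nx
    dt = T / nt
    assert dt <= 0.5 * dx * dx, "explicit scheme stability condition"
    # Some nonzero initial data vanishing at the endpoints.
    w = [math.sin(math.pi * i * dx / L) + 0.5 * math.sin(3 * math.pi * i * dx / L)
         for i in range(nx + 1)]
    energies = []
    for _ in range(nt):
        energies.append(sum(v * v for v in w) * dx)
        new = w[:]
        for i in range(1, nx):
            new[i] = w[i] + dt * (w[i - 1] - 2 * w[i] + w[i + 1]) / (dx * dx)
        w = new
    return energies

if __name__ == "__main__":
    E = heat_energy_history()
    print(E[0], E[-1])
```

Under the stability condition dt ≤ dx²/2, the discrete update is a contraction in the discrete L² norm, which mirrors (1.0.9).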
MATH 18.152 COURSE NOTES - CLASS MEETING # 4

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting #4: The Heat Equation: The Weak Maximum Principle
1. The Weak Maximum Principle
We will now study some important properties of solutions to the heat equation ∂t u−D∆u = 0. For
simplicity, we sometimes only study the case of 1 + 1 spacetime dimensions, even though analogous
properties are verified in higher dimensions.
Theorem 1.1 (Weak Maximum Principle). Let Ω ⊂ Rⁿ be a bounded domain. Recall that Q_T := (0, T) × Ω is a spacetime cylinder and that ∂_p Q_T := ({0} × Ω̄) ∪ ((0, T] × ∂Ω) is its corresponding parabolic boundary. Let w ∈ C^{1,2}(Q_T) ∩ C(Q̄_T) be a solution to the (possibly inhomogeneous) heat equation

(1.0.1) w_t − D∆w = f,

where f ≤ 0. Then w(t, x) attains its max over the region Q̄_T on ∂_p Q_T. Thus, if w is strictly negative on ∂_p Q_T, then w is strictly negative on Q̄_T.
Proof. For simplicity, we consider only the case of 1 + 1 spacetime dimensions. Let ε be a positive number, and let u = w − εt. Our goal is to first study u, and then take a limit as ε ↓ 0 to extract information about w. Note that on Q̄_T we have u ≤ w and w ≤ u + εT, and that in Q_T we have

(1.0.2) u_t − Du_xx = f − ε < 0.

We claim that the maximum of u on Q̄_{T−ε} occurs on ∂_p Q_{T−ε}. To verify the claim, suppose that u(t, x) has its max at (t₀, x₀) ∈ Q̄_{T−ε}. We may assume that 0 < t₀ ≤ T − ε, since if t₀ = 0, then (t₀, x₀) ∈ ∂_p Q_{T−ε}, and the claim is obviously true. Similarly, we may assume that x₀ ∈ Ω, since otherwise we would have (t₀, x₀) ∈ ∂_p Q_{T−ε}, and the claim would again be true.
Then from vector calculus, u_x(t₀, x₀) must be equal to 0. Furthermore, u_t(t₀, x₀) must also be equal to 0 if t₀ < T − ε, and u_t(t₀, x₀) ≥ 0 if t₀ = T − ε. Now since u(t₀, x₀) is a maximum value, we can apply Taylor’s remainder theorem in x to deduce that for x near x₀, we have

(1.0.3) u(t₀, x) − u(t₀, x₀) = u_x(t₀, x₀)(x − x₀) + (1/2) u_xx(t₀, x*)(x − x₀)² ≤ 0,

where u_x(t₀, x₀) = 0 and x* is some point in between x₀ and x. Therefore, u_xx(t₀, x*) ≤ 0, and by taking the limit as x → x₀, it follows that u_xx(t₀, x₀) ≤ 0. Thus, in any possible case, we have that

(1.0.4) u_t(t₀, x₀) − Du_xx(t₀, x₀) ≥ 0,

which contradicts (1.0.2).


Using u ≤ w and the fact that ∂_p Q_{T−ε} ⊂ ∂_p Q_T, we have thus shown that

(1.0.5) max_{Q̄_{T−ε}} u = max_{∂_p Q_{T−ε}} u ≤ max_{∂_p Q_{T−ε}} w ≤ max_{∂_p Q_T} w.

Using (1.0.5) and w ≤ u + εT, we also have that

(1.0.6) max_{Q̄_{T−ε}} w ≤ max_{Q̄_{T−ε}} u + εT ≤ εT + max_{∂_p Q_T} w.

Now since w is uniformly continuous on Q̄_T, we have that

(1.0.7) max_{Q̄_{T−ε}} w ↑ max_{Q̄_T} w

as ε ↓ 0. Thus, allowing ε ↓ 0 in inequality (1.0.6), we deduce that

(1.0.8) max_{Q̄_T} w = lim_{ε↓0} max_{Q̄_{T−ε}} w ≤ lim_{ε↓0} (εT + max_{∂_p Q_T} w) = max_{∂_p Q_T} w ≤ max_{Q̄_T} w.

Therefore, all of the inequalities in (1.0.8) are actually equalities, and

(1.0.9) max_{Q̄_T} w = max_{∂_p Q_T} w,

as desired. □

The following very important corollary shows how to compare two different solutions to the heat
equation with possibly different inhomogeneous terms. The proof relies upon the weak maximum
principle.
Corollary 1.0.1 (Comparison Principle and Stability). Suppose that v, w are solutions to the heat equations

(1.0.10) v_t − Dv_xx = f,
(1.0.11) w_t − Dw_xx = g.

Then:
(1) (Comparison): If v ≥ w on ∂_p Q_T and f ≥ g, then v ≥ w on all of Q̄_T.
(2) (Stability): max_{Q̄_T} |v − w| ≤ max_{∂_p Q_T} |v − w| + T max_{Q̄_T} |f − g|.
Proof. One of the things that makes linear PDEs relatively easy to study is that you can add or subtract solutions. Setting u := w − v, we have

(1.0.12) u_t − Du_xx = g − f ≤ 0.

Then by Theorem 1.1, since u ≤ 0 on ∂_p Q_T, we have that u ≤ 0 on Q̄_T. This proves (1).
To prove (2), we define M := max_{Q̄_T} |f − g| and u := w − v − tM, and note that
(1.0.13) u_t − Du_xx = g − f − M ≤ 0.

Thus, by Theorem 1.1, we have that

(1.0.14) max_{Q̄_T} u = max_{∂_p Q_T} u ≤ max_{∂_p Q_T} |w − v|.

Thus, subtracting and adding tM, we have

(1.0.15) max_{Q̄_T} (w − v) ≤ max_{Q̄_T} (w − v − tM) + max_{Q̄_T} tM ≤ max_{∂_p Q_T} |w − v| + TM.

Similarly, by setting u := v − w − tM, we can show that

(1.0.16) max_{Q̄_T} (v − w) ≤ max_{∂_p Q_T} |w − v| + TM.

Combining (1.0.15) and (1.0.16), and recalling the definition of M, we have shown (2). □
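As a quick numerical illustration of the weak maximum principle that drives these arguments, the sketch below (Python; the explicit scheme and grid parameters are ad hoc choices, not part of the notes) evolves the homogeneous heat equation with zero lateral data and checks that the largest value over the whole space-time grid occurs on the parabolic boundary (here at t = 0):

```python
import math

def evolve(nx=40, nt=1600, T=0.1):
    """Explicit scheme for w_t = w_xx on [0,1] with zero Dirichlet data.
    Returns the full space-time grid of values, one row per time step."""
    dx = 1.0 / nx
    dt = T / nt
    assert dt <= 0.5 * dx * dx, "explicit scheme stability condition"
    w = [math.sin(math.pi * i * dx) for i in range(nx + 1)]
    rows = [w[:]]
    for _ in range(nt):
        new = w[:]
        for i in range(1, nx):
            new[i] = w[i] + dt * (w[i - 1] - 2 * w[i] + w[i + 1]) / (dx * dx)
        w = new
        rows.append(w[:])
    return rows

if __name__ == "__main__":
    rows = evolve()
    overall_max = max(max(r) for r in rows)
    boundary_max = max(max(rows[0]), max(r[0] for r in rows), max(r[-1] for r in rows))
    print(overall_max, boundary_max)
```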

MATH 18.152 COURSE NOTES - CLASS MEETING # 5

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting # 5: The Fundamental Solution for the Heat Equation


1. The Fundamental solution
As we will see, in the case Ω = Rⁿ, we will be able to represent general solutions to the inhomogeneous heat equation

(1.0.1) u_t − D∆u = f,  ∆ := ∑_{i=1}^n ∂_i²,

in terms of f, the initial data, and a single solution that has very special properties. This special solution is called the fundamental solution.
Remark 1.0.1. Note that when Ω = Rn , there are no finite boundary conditions to worry about.
However, we do have to worry about “boundary conditions at ∞.” Roughly speaking, this means
that we have to assume something about the growth rate of the solution as |x| → ∞.
Definition 1.0.1. The fundamental solution Γ_D(t, x) to (1.0.1) is defined to be

(1.0.2) Γ_D(t, x) := (4πDt)^{−n/2} e^{−|x|²/(4Dt)},  t > 0, x ∈ Rⁿ,

where x = (x¹, ⋯, xⁿ) and |x|² := ∑_{i=1}^n (x^i)².
Let’s check that Γ_D(t, x) solves (1.0.1) when f = 0 in the next lemma.

Lemma 1.0.1. Γ_D(t, x) is a solution to the heat equation (1.0.1) with f = 0 for x ∈ Rⁿ, t > 0.

Proof. We compute that

∂_t Γ_D(t, x) = (−n/(2t) + |x|²/(4Dt²)) Γ_D(t, x).

Also, ∂_i Γ_D(t, x) = −(x^i/(2Dt)) Γ_D(t, x), and therefore

∂_i² Γ_D(t, x) = (−1/(2Dt) + (x^i)²/(4D²t²)) Γ_D(t, x).

Summing over i, we obtain

D∆Γ_D(t, x) = (−n/(2t) + |x|²/(4Dt²)) Γ_D(t, x) = ∂_t Γ_D(t, x).

Lemma 1.0.1 now easily follows. □
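The computation in Lemma 1.0.1 can be double-checked numerically. The sketch below (Python, for n = 1; the step size and sample points are arbitrary choices, not part of the notes) compares centered finite-difference approximations of ∂_t Γ_D and D∂_x² Γ_D at a few points and confirms the residual is tiny:

```python
import math

def gamma(t, x, D=1.0):
    """Fundamental solution of the 1D heat equation."""
    return math.exp(-x * x / (4.0 * D * t)) / math.sqrt(4.0 * math.pi * D * t)

def heat_residual(t, x, D=1.0, h=1e-4):
    """Centered-difference approximation of Gamma_t - D * Gamma_xx."""
    g_t = (gamma(t + h, x, D) - gamma(t - h, x, D)) / (2 * h)
    g_xx = (gamma(t, x + h, D) - 2 * gamma(t, x, D) + gamma(t, x - h, D)) / (h * h)
    return g_t - D * g_xx

if __name__ == "__main__":
    for (t, x) in [(0.5, 0.0), (1.0, 0.7), (0.25, -1.3)]:
        print(t, x, heat_residual(t, x))
```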
Here are a few very important properties of Γ_D(t, x).

Lemma 1.0.2. Γ_D(t, x) has the following properties:
(1) If x ≠ 0, then lim_{t→0⁺} Γ_D(t, x) = 0.
(2) lim_{t→0⁺} Γ_D(t, 0) = ∞.
(3) ∫_{Rⁿ} Γ_D(t, x) dⁿx = 1 for all t > 0.

Proof. This is a good exercise for you to do on your own. □

As we will see, (1) - (3) suggest that as t → 0⁺, Γ_D(t, x) behaves like the “delta distribution centered at 0.” We’ll make sense of this in the next lemma.
Remark 1.0.2. The delta distribution is sometimes called the “delta function,” but it is not a
function in the usual sense!
So what is the delta distribution?
Definition 1.0.2. The delta distribution δ is an example of a mathematical object called a distribution. It acts on suitable functions φ(x) as follows:

(1.0.3) ⟨δ, φ⟩ := φ(0).

Remark 1.0.3. The notation ⟨·, ·⟩ is meant to remind you of the L² inner product

(1.0.4) ⟨f, g⟩ = ∫_{Rⁿ} f(x) g(x) dⁿx.

The next lemma shows that Γ_D(t, x) behaves like the delta distribution as t → 0⁺.

Lemma 1.0.3. Suppose that φ(x) is a continuous function on Rⁿ and that there exist constants a, b ≥ 0 such that

(1.0.5) |φ(x)| ≤ a e^{b|x|²}.

Then

(1.0.6) lim_{t→0⁺} ∫_{Rⁿ} Γ_D(t, x) φ(x) dⁿx = φ(0).

Proof. Using Property (3) of Lemma 1.0.2, we start with the simple identity

(1.0.7) φ(0) = ∫_{Rⁿ} Γ_D(t, x) φ(0) dⁿx = ∫_{Rⁿ} Γ_D(t, x) φ(x) dⁿx + ∫_{Rⁿ} Γ_D(t, x)(φ(0) − φ(x)) dⁿx.

Let ε > 0 be any small positive number, and choose a ball B of radius R centered at 0 such that |φ(0) − φ(x)| ≤ ε for x ∈ B (this is possible since φ is continuous). Then the last term from above can be estimated as follows, where B^c denotes the complement of B in Rⁿ:

(1.0.8)
|∫_{Rⁿ} Γ_D(t, x)(φ(0) − φ(x)) dⁿx| ≤ ∫_B Γ_D(t, x)|φ(0) − φ(x)| dⁿx + ∫_{B^c} Γ_D(t, x)|φ(0) − φ(x)| dⁿx
    ≤ ε ∫_B Γ_D(t, x) dⁿx + |φ(0)| ∫_{B^c} Γ_D(t, x) dⁿx + ∫_{B^c} Γ_D(t, x)|φ(x)| dⁿx
    ≤ ε + |φ(0)| ∫_{B^c} Γ_D(t, x) dⁿx + ∫_{B^c} Γ_D(t, x)|φ(x)| dⁿx.

We have thus shown that

(1.0.9) |φ(0) − ∫_{Rⁿ} Γ_D(t, x) φ(x) dⁿx| ≤ ε + |φ(0)| ∫_{B^c} Γ_D(t, x) dⁿx + ∫_{B^c} Γ_D(t, x)|φ(x)| dⁿx.
To estimate the final term on the right-hand side of (1.0.8), we take advantage of the spherical symmetry of Γ_D(t, x) in x. More precisely, we introduce the radial variable r := |x| = (∑_{i=1}^n (x^i)²)^{1/2} and recall from vector calculus that for spherically symmetric functions, dⁿx = C_n r^{n−1} dr, where C_n > 0 is a constant. Therefore, using the assumed bound |φ(x)| ≤ a e^{br²}, we have that

(1.0.10) ∫_{B^c} Γ_D(t, x)|φ(x)| dⁿx ≤ C_n′ t^{−n/2} ∫_{r=R}^∞ r^{n−1} e^{−(1/(4Dt) − b) r²} dr ≤ C_n″ (1/(4D) − bt)^{−n/2} ∫_{ρ=R(1/(4Dt) − b)^{1/2}}^∞ ρ^{n−1} e^{−ρ²} dρ,

where C_n′ > 0 and C_n″ > 0 are constants. To deduce the second inequality in (1.0.10), we have made the change of variables r = ρ(1/(4Dt) − b)^{−1/2} = ρ t^{1/2} (1/(4D) − bt)^{−1/2}. Now since R(1/(4Dt) − b)^{1/2} → ∞ as t → 0⁺, it easily follows from the last expression in (1.0.10) that ∫_{B^c} Γ_D(t, x)|φ(x)| dⁿx goes to 0 as t → 0⁺.
The second term on the right-hand side of (1.0.8) can similarly be shown to go to 0 as t → 0⁺. Combining the above arguments, we have thus shown that for any ε > 0,

(1.0.11) limsup_{t→0⁺} |φ(0) − ∫_{Rⁿ} Γ_D(t, x) φ(x) dⁿx| ≤ ε.

We therefore conclude that

(1.0.12) lim_{t→0⁺} |φ(0) − ∫_{Rⁿ} Γ_D(t, x) φ(x) dⁿx| = 0,

as desired. □
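Lemma 1.0.3 can also be observed numerically. The sketch below (Python; quadrature by simple midpoint Riemann sums over a truncated interval, with ad hoc parameters and a sample φ chosen for illustration) shows ∫ Γ_D(t, x) φ(x) dx approaching φ(0) as t ↓ 0:

```python
import math

def gamma(t, x, D=1.0):
    """Fundamental solution of the 1D heat equation."""
    return math.exp(-x * x / (4.0 * D * t)) / math.sqrt(4.0 * math.pi * D * t)

def smeared(phi, t, D=1.0, R=20.0, n=100000):
    """Midpoint-rule approximation of int_{-R}^{R} Gamma_D(t,x) phi(x) dx."""
    dx = 2.0 * R / n
    return sum(gamma(t, -R + (k + 0.5) * dx, D) * phi(-R + (k + 0.5) * dx)
               for k in range(n)) * dx

if __name__ == "__main__":
    phi = lambda x: math.cos(x) + 0.1 * x * x   # continuous, with sub-Gaussian growth
    for t in (1.0, 0.1, 0.01):
        print(t, smeared(phi, t))   # approaches phi(0) = 1 as t decreases
```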
Remark 1.0.4. Lemma 1.0.3 can be restated as

(1.0.13) lim_{t→0⁺} ⟨Γ_D(t, ·), φ(·)⟩ = ⟨δ, φ⟩ = φ(0).

On the left, ⟨·, ·⟩ means the integral inner product (1.0.4), whereas in the middle it has the meaning of (1.0.3). We sometimes restate (1.0.13) as

(1.0.14) lim_{t→0⁺} Γ_D(t, x) = δ(x).
Let’s summarize the above results.

Proposition 1.0.4 (Properties of Γ_D(t, x)). Γ_D(t, x) is a solution to the heat equation (1.0.1) (with f = 0) verifying the initial condition

(1.0.15) lim_{t→0⁺} Γ_D(t, x) = δ(x).

1.1. Solving the global Cauchy problem when n = 1. Let’s see how we can use Γ_D to solve the following initial value (aka Cauchy) problem:

(1.1.1) u_t − Du_xx = 0,  (t, x) ∈ (0, ∞) × R,
        u(0, x) = g(x).

We will make use of an important mathematical operation called convolution.
Definition 1.1.1. If f and g are two functions on Rⁿ, then we define their convolution f ∗ g to be the following function on Rⁿ:

(1.1.2) (f ∗ g)(x) := ∫_{Rⁿ} f(y) g(x − y) dⁿy.

Convolution is an averaging process, in which the function f(x) is replaced by the “average value” of f(x) relative to the “profile” function g(x).
The convolution operator plays a very important role in many areas of mathematics. Here are two key properties. First, by making the change of variables z = x − y, dⁿz = dⁿy in (1.1.2), we see that

(1.1.3) (f ∗ g)(x) = ∫_{Rⁿ} f(y) g(x − y) dⁿy = ∫_{Rⁿ} f(x − z) g(z) dⁿz = (g ∗ f)(x),

which implies that convolution is a commutative operation. Next, Fubini’s theorem can be used to show that

(1.1.4) f ∗ (g ∗ h) = (f ∗ g) ∗ h,

so that ∗ is also associative.
Remark 1.1.1. According to (1.0.3) and (1.1.3),

(1.1.5) (f ∗ δ)(x) = ⟨δ(y), f(x − y)⟩_y = f(x),

so that in the context of convolutions, the δ distribution plays the role of an “identity element.”
The next proposition is a standard fact from analysis. It allows us to differentiate under integrals
under certain assumptions. We will use it in the proof of the next theorem.
Proposition 1.1.1 (Differentiating under the integral). Let I(a, b) be a function on R × R. Assume that

(1.1.6) ∫_R |I(a, b)| da < ∞

for all b belonging to a neighborhood of b₀, and define

(1.1.7) J(b) := ∫_R I(a, b) da.

Assume that there exists a neighborhood N of b₀ such that for almost every¹ a, ∂_b I(a, b) exists for b ∈ N. In addition, assume that there exists a function U(a) (defined for almost all a) such that for b ∈ N, we have that |∂_b I(a, b)| ≤ U(a) and such that

(1.1.8) ∫_R U(a) da < ∞.

Then J(b) is differentiable near b₀, and

(1.1.9) ∂_b J(b) = ∫_R ∂_b I(a, b) da.

Remark 1.1.2. An analogous proposition is true for functions I(a, b) defined on R^m × Rⁿ.

¹In a measure theory course, you would learn a precise technical definition of “almost every.” For the purposes of this course, it suffices to know the following fact: if a statement holds for all a except for those values of a belonging to a countable set, then the statement holds for almost every a. The main point is that the function I(a, b) does not have to be “well-behaved” at every single value of a; it can have some “bad a spots,” just not too many of them.


Theorem 1.1 (Solving the global Cauchy problem via the fundamental solution). Assume that g(x) is a continuous function on Rⁿ that verifies the bound |g(x)| ≤ a e^{b|x|²}, where a, b > 0 are constants. Then there exists a solution u(t, x) to the homogeneous heat equation

(1.1.10) u_t − D∆u = 0,  (t > 0, x ∈ Rⁿ),
         u(0, x) = g(x),  x ∈ Rⁿ,

existing for (t, x) ∈ [0, T) × Rⁿ, where

(1.1.11) T := 1/(4Db).

Furthermore, u(t, x) can be represented as

(1.1.12) u(t, x) = [g(·) ∗ Γ_D(t, ·)](x) = ∫_{Rⁿ} g(y) Γ_D(t, x − y) dⁿy = (4πDt)^{−n/2} ∫_{Rⁿ} g(y) e^{−|x−y|²/(4Dt)} dⁿy.

The solution u(t, x) is of regularity C^∞((0, 1/(4Db)) × Rⁿ) (i.e., it is infinitely differentiable). Finally, for each compact subinterval [0, T′] ⊂ [0, T), there exist constants A, B > 0 (depending on the compact subinterval) such that

(1.1.13) |u(t, x)| ≤ A e^{B|x|²}

for all (t, x) ∈ [0, T′] × Rⁿ. The solution u(t, x) is the unique solution in the class of functions verifying a bound of the form (1.1.13).
Remark 1.1.3. Note the very important smoothing property of diffusion: the solution to the
heat equation on all of Rn is smooth even if the data are merely continuous.
Remark 1.1.4. The formula (1.1.12) shows that solutions to (1.1.10) propagate with infinite
speed: even if the initial data g(x) have support that is contained within some compact region,
(1.1.12) shows that at any time t > 0, the solution u(t, x) has “spread out over the entire space
Rn .” In contrast, as we will see later in the course, some important PDEs have finite speeds of
propagation (for example, the wave equation).
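Formula (1.1.12) is directly computable. The sketch below (Python; midpoint Riemann sums on a truncated domain, with all parameters chosen ad hoc) builds u(t, x) = (g ∗ Γ_D(t, ·))(x) for the sample data g(x) = e^{−x²} and checks the result against the closed form that Gaussian data happen to admit:

```python
import math

def gamma(t, x, D=1.0):
    """Fundamental solution of the 1D heat equation."""
    return math.exp(-x * x / (4.0 * D * t)) / math.sqrt(4.0 * math.pi * D * t)

def solve_cauchy(g, t, x, D=1.0, R=15.0, n=30000):
    """Midpoint-rule approximation of u(t,x) = int g(y) Gamma_D(t, x - y) dy."""
    dy = 2.0 * R / n
    return sum(g(-R + (k + 0.5) * dy) * gamma(t, x - (-R + (k + 0.5) * dy), D)
               for k in range(n)) * dy

if __name__ == "__main__":
    g = lambda y: math.exp(-y * y)
    t, x = 0.3, 0.5
    numeric = solve_cauchy(g, t, x)
    # For g(x) = e^{-x^2} and D = 1, convolving Gaussians gives
    # u(t,x) = (1 + 4t)^{-1/2} e^{-x^2/(1 + 4t)}.
    exact = math.exp(-x * x / (1.0 + 4.0 * t)) / math.sqrt(1.0 + 4.0 * t)
    print(numeric, exact)
```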
Proof. For simplicity, we only give the proof in the case n = 1. The basic strategy of the proof is to analyze the behavior of Γ_D(t, y) in detail.
Let u(t, x) be the function defined by (1.1.12). The argument that follows will show that the right-hand side of (1.1.12) is finite (and more). In fact, let us first demonstrate the bound (1.1.13). To this end, let ε > 0 be any positive number. Then using the simple algebraic estimate |2xy| ≤ ε^{−1}x² + εy², we deduce the inequality

(1.1.14) |x − y|² = x² − 2xy + y² ≤ (1 + ε^{−1})x² + (1 + ε)y².
Using (1.1.14) and the assumed bound on |g(·)|, we deduce that

(1.1.15) |g(x − y)| ≤ a e^{b|x−y|²} ≤ a e^{(1+ε^{−1})b|x|²} e^{(1+ε)b|y|²}.

Using (1.1.15) and the fact that ∫_R g(y) Γ_D(t, x − y) dy = ∫_R g(x − y) Γ_D(t, y) dy (i.e., that convolution is commutative), we have the following estimates:

(1.1.16) |u(t, x)| ≤ ∫_R |g(x − y)| Γ_D(t, y) dy ≤ a e^{(1+ε^{−1})b|x|²} ∫_R e^{(1+ε)b|y|²} Γ_D(t, y) dy
    = a e^{(1+ε^{−1})b|x|²} (4πDt)^{−1/2} ∫_R e^{−(1/(4Dt) − (1+ε)b) y²} dy
    = a e^{(1+ε^{−1})b|x|²} (4πD)^{−1/2} (1/(4D) − (1+ε)bt)^{−1/2} ∫_R e^{−z²} dz
    ≤ A e^{(1+ε^{−1})b|x|²},

where A > 0 is an ε-dependent constant (note that ∫_R e^{−z²} dz < ∞), and in the next-to-last step, we have made the change of variables z = (1/(4Dt) − (1+ε)b)^{1/2} y = t^{−1/2} (1/(4D) − (1+ε)bt)^{1/2} y. Note that this change of variables is valid as long as 0 < t < 1/(4D(1+ε)b). Since ε is allowed to be arbitrarily small, we have thus demonstrated an estimate of the form (1.1.13).
Let’s now check that the function u(t, x) defined by (1.1.12) is a solution to the heat equation and also that it takes on the initial conditions g(x). To this end, let L := ∂_t − D∂_x². We want to show that Lu(t, x) = 0 for t > 0, x ∈ R, and that u(t, x) → g(x) as t ↓ 0. Recall that by Proposition 1.0.4, LΓ_D(t, x) = 0 for t > 0, x ∈ R. For t > 0, x ∈ R, we have that

(1.1.17) Lu(t, x) = ∫_R g(y) LΓ_D(t, x − y) dy = 0,

since LΓ_D(t, x − y) = 0. To derive (1.1.17), we have used Proposition 1.1.1 to differentiate under the integral; because of the rapid exponential decay of Γ_D(·, ·) in its second argument as the argument goes to ∞, one can use arguments similar to those given in the beginning of this proof to check that the hypotheses of the proposition are verified.
Similarly, the fact that u ∈ C^∞((0, 1/(4Db)) × R) can be derived by repeatedly differentiating with respect to t and x under the integral in (1.1.12).
Furthermore, by (1.0.15) and (1.1.5), we have that

(1.1.18) lim_{t→0⁺} u(t, x) = lim_{t→0⁺} (g(·) ∗ Γ_D(t, ·))(x) = (g ∗ δ)(x) = g(x).

The question of uniqueness in the class of solutions verifying a bound of the form (1.1.13) is challenging and will not be addressed here. Instead, with the help of the weak maximum principle, you will prove a weakened version of the uniqueness result in your homework. □
In the next theorem, we extend the results of Theorem 1.1 to allow for an inhomogeneous term f(t, x).

Theorem 1.2 (Duhamel’s principle). Let g(x) and T := 1/(4Db) be as in Theorem 1.1. Also assume that f(t, x), ∂_i f(t, x), and ∂_i ∂_j f(t, x) are continuous, bounded functions on [0, T) × Rⁿ for 1 ≤ i, j ≤ n. Then there exists a unique solution u(t, x) to the inhomogeneous heat equation

(1.1.19) u_t − D∆u = f(t, x),  (t, x) ∈ (0, ∞) × Rⁿ,
         u(0, x) = g(x),  x ∈ Rⁿ,

existing for (t, x) ∈ [0, T) × Rⁿ. Furthermore, u(t, x) can be represented as

(1.1.20) u(t, x) = (Γ_D(t, ·) ∗ g)(x) + ∫_0^t (Γ_D(t − s, ·) ∗ f(s, ·))(x) ds.

The solution has the following regularity properties: u ∈ C⁰([0, T) × Rⁿ) ∩ C^{1,2}((0, T) × Rⁿ).

Proof. A slightly less technical version of this theorem is one of your homework exercises. □
2. Deriving Γ_D(t, x)

Let’s backtrack a bit and discuss how one could derive the fundamental solution to the heat equation

(2.0.21) ∂_t u(t, x) − D∆u(t, x) = 0,  (t, x) ∈ [0, ∞) × Rⁿ.

As we will see, the fundamental solution is connected to some important invariance properties associated to solutions of (2.0.21). These properties are addressed in the next lemma.
Lemma 2.0.2 (Invariance of solutions to the heat equation under translations and parabolic dilations). Suppose that u(t, x) is a solution to the heat equation (2.0.21). Let A, t₀ ∈ R be constants, and let x₀ ∈ Rⁿ. Then the amplified and translated function

(2.0.22) u∗(t, x) := A u(t − t₀, x − x₀)

is also a solution to (2.0.21).
Similarly, if λ > 0 is a constant, then the amplified, parabolically scaled function

(2.0.23) u∗(t, x) := A u(λ²t, λx)

is also a solution.

Proof. We address only the case (2.0.23), and leave (2.0.22) as a simple exercise. Using the chain rule, we calculate that if u is a solution to (2.0.21), then

(2.0.24) ∂_t u∗(t, x) − D∆u∗(t, x) = λ²A [(∂_t u)(λ²t, λx) − (D∆u)(λ²t, λx)] = 0.

Thus, u∗ is also a solution. □
We would now like to choose the constant A in (2.0.23) so that the “total thermal energy” of u∗ is equal to the “total thermal energy” of u.

Definition 2.0.2. We define the total thermal energy T(t) at time t associated to u(t, x) by

(2.0.25) T(t) := ∫_{Rⁿ} u(t, x) dⁿx.

It is important to note that for rapidly spatially decaying solutions to the heat equation, T(t) is constant.
Lemma 2.0.3. Let u(t, x) ∈ C^{1,2}([0, ∞) × Rⁿ) be a solution to the heat equation ∂_t u(t, x) − ∆u(t, x) = 0 (i.e., (2.0.21) with D = 1). Assume that at each fixed t, lim_{|x|→∞} |x|^{n−1} |∇_x u(t, x)| = 0, uniformly in x. Furthermore, assume that there exists a function f(x) ≥ 0, not depending on t, such that |∂_t u| ≤ f(x) and such that ∫_{Rⁿ} f(x) dⁿx < ∞. Then the total thermal energy of u(t, x) is constant in time:

(2.0.26) T(t) = T(0).

Proof. Recall that T(t) := ∫_{Rⁿ} u(t, x) dⁿx denotes the total thermal energy at time t. The hypotheses on u ensure that we can differentiate under the integral and use the heat equation:

(2.0.27) (d/dt) T(t) = ∫_{Rⁿ} ∂_t u(t, x) dⁿx = ∫_{Rⁿ} ∆u(t, x) dⁿx = lim_{R→∞} ∫_{B_R(0)} ∆u(t, x) dⁿx,

where B_R(0) ⊂ Rⁿ denotes the ball of radius R centered at the origin. Then with the help of the divergence theorem, and recalling that dσ = R^{n−1} dω along ∂B_R(0), where ω denotes angular coordinates along the unit sphere ∂B₁(0), we conclude that

(2.0.28) lim_{R→∞} ∫_{B_R(0)} ∆u(t, x) dⁿx = lim_{R→∞} ∫_{∂B_R(0)} ∇_N̂ u(t, σ) dσ
    = lim_{R→∞} ∫_{∂B₁(0)} R^{n−1} ∇_N̂ u(t, Rω) dω
    = ∫_{∂B₁(0)} lim_{R→∞} R^{n−1} ∇_N̂ u(t, Rω) dω = ∫_{∂B₁(0)} 0 dω = 0.

In the last steps, we have used the following basic fact from analysis: the condition lim_{|x|→∞} |x|^{n−1} |∇_x u(t, x)| = 0 uniformly in ω allows us to interchange the order of the limit and the integral. □
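Lemma 2.0.3 can be watched in action. The sketch below (Python; explicit finite differences on a truncated domain, with ad hoc parameters and rapidly decaying Gaussian data) evolves the heat equation and checks that the discrete total thermal energy ∫ u dx stays essentially constant:

```python
import math

def total_energy_drift(R=10.0, nx=200, T=0.5, nt=2000):
    """Evolve u_t = u_xx on [-R, R] (with u ~ 0 at the ends) and
    return (initial, final) values of the discrete integral of u."""
    dx = 2.0 * R / nx
    dt = T / nt
    assert dt <= 0.5 * dx * dx, "explicit scheme stability condition"
    u = [math.exp(-((-R + i * dx) ** 2)) for i in range(nx + 1)]
    initial = sum(u) * dx
    for _ in range(nt):
        new = u[:]
        for i in range(1, nx):
            new[i] = u[i] + dt * (u[i - 1] - 2 * u[i] + u[i + 1]) / (dx * dx)
        u = new
    return initial, sum(u) * dx

if __name__ == "__main__":
    initial, final = total_energy_drift()
    print(initial, final)
```

For this data the discrete energy also agrees with ∫ e^{−x²} dx = √π up to quadrature error, since essentially no heat reaches the artificial boundary over the chosen time span.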
We now return to the issue of choosing the constant A in (2.0.23) so that the total thermal energy of u∗ is equal to the total thermal energy of u. Using the change of variables z = λx, and recalling from multi-variable calculus that dⁿz = λⁿ dⁿx, we compute that

(2.0.29) ∫_{Rⁿ} u∗(t, x) dⁿx = A ∫_{Rⁿ} u(λ²t, λx) dⁿx = Aλ^{−n} ∫_{Rⁿ} u(λ²t, z) dⁿz.

Observe that ∫_{Rⁿ} u(λ²t, z) dⁿz is just the total thermal energy of u at time λ²t. Thus, we choose A = λⁿ, which results in

(2.0.30) u∗(t, x) = λⁿ u(λ²t, λx).

Motivated by the parabolic scaling result (2.0.23), we now introduce the dimensionless variable

(2.0.31) ζ := x/√(Dt),

where we have used the fact that the constant D has the dimensions of [length]²/[time]. Note that ζ is invariant under the parabolic scaling t → λ²t, x → λx.
We now proceed to derive the fundamental solution. For simplicity, we only consider the case of 1 + 1 spacetime dimensions. We will look for a fundamental solution of the form

(2.0.32) Γ_D(t, x) = (1/√(Dt)) V(ζ),

where V(ζ) is a function that we hope to determine. Admittedly, it is not easy to completely motivate the fact that Γ_D(t, x) should look like (2.0.32). We first note that since we would like to achieve ∫_R Γ_D(t, x) dx = 1, the change of variables (2.0.31) leads to the following identity:

(2.0.33) 1 = ∫_R Γ_D(t, x) dx = ∫_R (1/√(Dt)) V(x/√(Dt)) dx = ∫_R V(ζ) dζ.
Next, since Γ_D(t, x) is assumed to solve the heat equation, we calculate that

(2.0.34) 0 = ∂_t Γ_D − D∂_x² Γ_D = −D^{−1/2} t^{−3/2} { V″(ζ) + (1/2)ζV′(ζ) + (1/2)V(ζ) }.

Therefore, V must be a solution to the following ODE:

(2.0.35) V″(ζ) + (1/2)ζV′(ζ) + (1/2)V(ζ) = 0.
Since we want Γ_D(t, x) to behave like the δ distribution (at least for small t > 0), we demand that

(2.0.36) V(ζ) ≥ 0.

Furthermore, since we want Γ_D(t, x) to decay rapidly as |x| → ∞, we demand that

(2.0.37) V(±∞) = 0.

We also expect that ideally, V(ζ) should be an even function. Furthermore, it is easy to see that if V(ζ) is a solution to (2.0.35), then so is W(ζ) := V(−ζ). Thus, it is reasonable to look for an even solution. Now for any differentiable even function V(ζ), it necessarily follows that V′(0) = 0. Thus, we demand that

(2.0.38) V′(0) = 0.
We now note that (2.0.35) can be written in the form

(2.0.39) (d/dζ) ( V′(ζ) + (1/2)ζV(ζ) ) = 0,

which implies that V′(ζ) + (1/2)ζV(ζ) is constant. By setting ζ = 0 in this expression and using (2.0.38), we see that the constant is 0:

(2.0.40) V′(ζ) + (1/2)ζV(ζ) = 0.

Now the first-order ODE (2.0.40) can be written in the form

(2.0.41) (d/dζ) ln V(ζ) = −(1/2)ζ,

which can be easily integrated as follows:

(2.0.42) ln ( V(ζ)/V(0) ) = −(1/4)ζ²,
(2.0.43) ⟹ V(ζ) = V(0) e^{−ζ²/4}.

To find V(0), we use the relation (2.0.33) and the integral identity²

(2.0.44) 1 = ∫_R V(0) e^{−ζ²/4} dζ = (setting ζ = 2α) 2V(0) ∫_R e^{−α²} dα = 2V(0)√π.

Therefore, V(0) = 1/(2√π), and

(2.0.45) V(ζ) = (1/(2√π)) e^{−ζ²/4}.

Finally, from (2.0.32) and (2.0.45), we deduce that

(2.0.46) Γ_D(t, x) = (1/√(4πDt)) e^{−x²/(4Dt)},

as desired.

²Let I := ∫_R e^{−x²} dx. Then I² = ∫_{R²} e^{−(x²+y²)} dx dy, and by switching to polar coordinates, we have that I² = 2π ∫_{r=0}^∞ r e^{−r²} dr = π. Thus, I = √π.
MATH 18.152 COURSE NOTES - CLASS MEETING # 6

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting # 6: Laplace’s and Poisson’s Equations


We will now study the Laplace and Poisson equations on a domain (i.e., open connected subset) Ω ⊂ Rⁿ. Recall that

(0.0.1) ∆ := ∑_{i=1}^n ∂_i².

The Laplace equation is

(0.0.2) ∆u(x) = 0, x ∈ Ω,
while the Poisson equation is the inhomogeneous equation

(0.0.3) ∆u(x) = f (x).


Functions u ∈ C²(Ω) verifying (0.0.2) are said to be harmonic. (0.0.2) and (0.0.3) are both second-order, linear, constant-coefficient PDEs. As in our study of the heat equation, we will need to supply some kind of boundary conditions to get a well-posed problem. But unlike the heat equation, there is no “timelike” variable, so there is no “initial condition” to specify!

1. Where does it come from?


1.1. Basic examples. First example: set ∂t u ≡ 0 in the heat equation, and (0.0.2) results. These
solutions are known as steady state solutions.

Second example: We start with the Maxwell equations from electrodynamics. The quantities of interest are
● E = (E1 (t, x, y, z ), E2 (t, x, y, z ), E3 (t, x, y, z )) is the electric field
● B = (B1 (t, x, y, z), B2 (t, x, y, z ), B3 (t, x, y, z )) is the magnetic induction
● J = (J1 (t, x, y, z ), J2 (t, x, y, z ), J3 (t, x, y, z )) is the current density
● ρ is the charge density
Maxwell’s equations are

(1.1.1) ∂t E − ∇ × B = −J, ∇ ⋅ E = ρ,
(1.1.2) ∂t B + ∇ × E = 0, ∇ ⋅ B = 0.
Recall that ∇× is the curl operator, so that e.g. ∇ × B = (∂y B3 − ∂z B2 , ∂z B1 − ∂x B3 , ∂x B2 − ∂y B1 ).
Let’s look for steady-state solutions with ∂t E = ∂t B ≡ 0. Then equation (1.1.2) implies that
(1.1.3) ∇ × E = 0,
so that by the Poincaré lemma, there exists a scalar-valued function φ(x, y, z) such that

(1.1.4) E(x, y, z) = −∇φ(x, y, z ).


The function φ is called an electric potential. Plugging (1.1.4) into the second of (1.1.1), and using
the identity ∇ ⋅ ∇φ = ∆φ, we deduce that

(1.1.5) ∆φ(x, y, z ) = −ρ(x, y, z ).


This is exactly the Poisson equation (0.0.3) with inhomogeneous term f = −ρ. Thus, Poisson’s
equation is at the heart of electrostatics.
1.2. Connections to complex analysis. Let z = x + iy (where x, y ∈ R) be a complex number,
and let f (z ) = u(z ) + iv (z ) be a complex-valued function (where u, v ∈ R). We recall that f is said
to be differentiable at z0 if

(1.2.1) lim_{z→z₀} (f(z) − f(z₀))/(z − z₀)

exists. If the limit exists, we denote it by f′(z₀).
A fundamental result of complex analysis is the following: f is differentiable at z0 = x0 + iy0 ≃
(x0 , y0 ) if and only if the real and imaginary parts of f verify the Cauchy-Riemann equations at z0 ∶

(1.2.2) ux (x0 , y0 ) = vy (x0 , y0 ),


(1.2.3) uy (x0 , y0 ) = −vx (x0 , y0 ).
Differentiating (1.2.2) and using the symmetry of mixed partial derivatives (we are assuming here that u(x, y) and v(x, y) are C² near (x₀, y₀)), we have

(1.2.4) ∆u := u_xx + u_yy = v_yx − v_xy = 0,
(1.2.5) ∆v := v_xx + v_yy = −u_yx + u_xy = 0.

Thus, the real and imaginary parts of a complex-differentiable function are harmonic!
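This consequence of the Cauchy-Riemann equations is easy to test. The sketch below (Python; f(z) = z³ is just a sample analytic function, and the step size is an arbitrary choice) checks that the real and imaginary parts of a complex-differentiable function have numerically vanishing Laplacian:

```python
def u_part(x, y):
    """Real part of f(z) = z^3 = (x + iy)^3."""
    return x**3 - 3.0 * x * y * y

def v_part(x, y):
    """Imaginary part of f(z) = z^3."""
    return 3.0 * x * x * y - y**3

def laplacian(F, x, y, h=1e-3):
    """Centered five-point approximation of F_xx + F_yy."""
    return (F(x + h, y) + F(x - h, y) + F(x, y + h) + F(x, y - h)
            - 4.0 * F(x, y)) / (h * h)

if __name__ == "__main__":
    print(laplacian(u_part, 0.7, -0.4), laplacian(v_part, 0.7, -0.4))
```

For cubic polynomials the five-point stencil is exact up to roundoff, so the printed residuals are essentially zero.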

2. Well-posed Problems
Much like in the case of the heat equation, we are interested in well-posed problems for the Laplace and Poisson equations. Recall that well-posed problems are problems that i) have a solution; ii) the solution is unique; and iii) the solution varies continuously with the data.
Let Ω ⊂ Rⁿ be a domain with a Lipschitz boundary, and let N̂ denote the unit outward normal vector to ∂Ω. We consider the PDE

(2.0.6) ∆u(x) = f (x), x ∈ Ω,


supplemented by some boundary conditions. The following boundary conditions are known to lead
to well-posed problems:
(1) Dirichlet data: specify a function g (x) defined on ∂Ω such that u∣∂Ω (x) = g (x).
(2) Neumann data: specify a function h(x) defined on ∂Ω such that ∇N̂ u(x)∣∂Ω (x) = h(x).
(3) Robin-type data: specify a function h(x) defined on ∂Ω such that ∇N̂ u(x)∣∂Ω (x)+ αu∣∂Ω (x) =
h(x), where α > 0 is a constant.
(4) Mixed conditions: for example, we can divide ∂Ω into two disjoint pieces ∂Ω = SD ∪ SN ,
where SN is relatively open in ∂Ω, and specify a function g (x) defined on SD and a function
h(x) defined on SN such that u∣SD (x) = g (x), ∇N̂ u∣SN (x) = h(x).
(5) Conditions at infinity: When Ω = Rn , we can specify asymptotic conditions on u(x) as
∣x∣ → ∞. We will return to this kind of condition later in the course.

3. Uniqueness via the Energy Method


In this section, we address the question of uniqueness for solutions to the equation (0.0.3), sup-
plemented by suitable boundary conditions. As in the case of the heat equation, we are able to
provide a simple proof based on the energy method.
Theorem 3.1. Let Ω ⊂ Rn be a smooth, bounded domain. Then under Dirichlet, Robin, or mixed
boundary conditions, there is at most one solution of regularity u ∈ C 2 (Ω) ∩ C 1 (Ω) to the Poisson
equation (0.0.3).
In the case of Neumann conditions, any two solutions can differ by at most a constant.
Proof. If u and v are two solutions to (0.0.3) with the same boundary data, then we can subtract
def
them (aren’t linear PDEs nice?!...) to get a solution w = u − v to the Poisson equation with 0 data:

(3.0.7) ∆w = 0.
Let’s perform the usual trick of multiplying (3.0.7) by w, integrating over Ω, and integrating by
parts via the divergence theorem:

(3.0.8) 0 = ∫_Ω w∆w dⁿx = ∫_Ω w∇ ⋅ ∇w dⁿx = −∫_Ω ∣∇w∣² dⁿx + ∫_∂Ω w∇_N̂ w dσ.
In the case of Dirichlet data, w∣∂Ω = 0, so the last term in (3.0.8) vanishes. Thus, in the Dirichlet
case, we have that

(3.0.9) ∫_Ω ∣∇w∣² dⁿx = 0.
Thus, ∇w = 0 in Ω, and so w is constant in Ω. Since w is 0 on ∂Ω, we have that w ≡ 0 in Ω, which
shows that u ≡ v in Ω.
Similarly, in the Robin case

(3.0.10) ∫_∂Ω w∇_N̂ w dσ = −α ∫_∂Ω w² dσ ≤ 0,
which implies that

(3.0.11) ∫_Ω ∣∇w∣² dⁿx = 0,
and we can argue as before to conclude that w ≡ 0 in Ω.

Now in the Neumann case, we have that ∇N̂ w∣∂Ω = 0, and we can argue as above to conclude that
w is constant in Ω. But now we can’t say anything about the constant, so the best we can conclude
is that u = v + constant in Ω.
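The Neumann conclusion — uniqueness only up to a constant — has a clean finite-dimensional analogue: the discrete Neumann Laplacian is singular, with kernel spanned by the constant vector. A minimal numerical illustration (our own construction, not from the notes):

```python
import numpy as np

# Discrete 1D Laplacian with homogeneous Neumann conditions (the standard
# symmetric "free end" stencil). Its kernel is exactly the constants,
# mirroring "Neumann solutions are unique up to a constant".
n = 50
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
A[0, 0] = A[-1, -1] = -1.0   # one-sided rows encoding the zero-flux ends

w = np.linalg.svd(A, compute_uv=False)
print(int(np.sum(w < 1e-8)))           # 1: a one-dimensional null space
print(np.max(np.abs(A @ np.ones(n))))  # 0.0: constants solve the homogeneous problem
```

The single zero singular value is the discrete counterpart of the one-parameter family u = v + constant.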


4. Mean value properties


Harmonic functions u have some amazing properties. Some of the most important ones are
captured in the following theorem, which shows that the pointwise values of u can be determined
by its average over solid balls or their boundaries.
Theorem 4.1 (Mean value properties). Let u(x) be harmonic in the domain Ω ⊂ Rn , and let
BR (x) ⊂ Ω be a ball of radius R centered at the point x. Then the following mean value formulas
hold:

(4.0.12a) u(x) = (n/(ω_n Rⁿ)) ∫_{B_R(x)} u(y) dⁿy,
(4.0.12b) u(x) = (1/(ω_n R^{n−1})) ∫_{∂B_R(x)} u(σ) dσ,
where ω_n is the area of ∂B₁(0) ⊂ Rⁿ, that is, the area of the boundary of the unit ball in Rⁿ.
Proof. Let’s address the n = 2 case only; the proof is similar for other values of n. Let’s also assume
that x is the origin; as we will see, we will be able to treat the case of general x by reducing it to
the origin. We will work with polar coordinates (r, θ) on R2 . For a ball of radius r, we have that
the measure dσ corresponding to ∂B_r(0) is dσ = r dθ. Note also that along ∂B_r(0), we have that ∂_r u = ∇u ⋅ N̂ = ∇_N̂ u, where N̂(σ) is the unit outward normal to ∂B_r(0). For any 0 ≤ r < R, we define

(4.0.13) g(r) ≝ (1/(2πr)) ∫_{∂B_r(0)} u(σ) dσ = (1/(2πr)) ∫_{θ=0}^{2π} r u(r,θ) dθ = (1/2π) ∫_{θ=0}^{2π} u(r,θ) dθ.
We now note that since u is continuous at 0, we have that
(4.0.14) u(0) = lim_{r→0⁺} g(r).

Thus, we would obtain (4.0.12b) in the case x = 0 if we could show that g ′ (r) = 0. Let’s now show
this. To this end, we calculate that

(4.0.15) g′(r) = (1/2π) ∫_{θ=0}^{2π} ∂_r u(r,θ) dθ = (1/2π) ∫_{θ=0}^{2π} ∇_N̂ u(r,θ) dθ = (1/(2πr)) ∫_{∂B_r(0)} ∇_N̂(σ) u(σ) dσ.
By the divergence theorem, this last term is equal to

(4.0.16) (1/(2πr)) ∫_{B_r(0)} ∆u(y) d²y.
But ∆u = 0 since u is harmonic, so we have shown that

(4.0.17) g ′ (r) = 0,
and we have shown (4.0.12b) for x = 0.
To prove (4.0.12a), we use polar coordinate integration and (4.0.12b) (in the case x =0) to obtain

(4.0.18) u(0) R²/2 = ∫_0^R r u(0) dr = ∫_{r=0}^R (1/2π) ∫_{θ=0}^{2π} r u(r,θ) dθ dr = (1/2π) ∫_{B_R(0)} u(y) d²y.
We have now shown (4.0.12a) and (4.0.12b) when x = 0.
def
To obtain the corresponding formulas for non-zero x, define v (y ) = u(x + y ), and note that
∆y v (y ) = (∆y u)(x + y ) = 0. Therefore, using what we have already shown,

(4.0.19) u(x) = v(0) = (2/(ω₂R²)) ∫_{B_R(0)} v(y) d²y = (2/(ω₂R²)) ∫_{B_R(0)} u(x + y) d²y = (2/(ω₂R²)) ∫_{B_R(x)} u(y) d²y,

which implies (4.0.12a) for general x. We can similarly obtain (4.0.12b) for general x.
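The mean value property (4.0.12b) is easy to test numerically. Below is a minimal sketch (our own example, not from the notes) using the harmonic polynomial u(x, y) = x² − y² + 3x:

```python
import numpy as np

# Numerical check of the n = 2 mean value property (4.0.12b) for the
# harmonic function u(x, y) = x^2 - y^2 + 3x (an arbitrary test choice).
def u(x, y):
    return x**2 - y**2 + 3.0 * x

x0, y0, R = 0.7, -0.2, 0.5
theta = np.linspace(0.0, 2.0 * np.pi, 20001)[:-1]   # uniform points on the circle
circle_avg = np.mean(u(x0 + R * np.cos(theta), y0 + R * np.sin(theta)))
print(abs(circle_avg - u(x0, y0)) < 1e-10)  # True: the average equals u at the center
```

Because u is a trigonometric polynomial along the circle, the uniform quadrature here is exact up to rounding, so the agreement is to machine precision.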

5. Maximum Principle
Let's now discuss another amazing property verified by harmonic functions. The property, known as the strong maximum principle, says that harmonic functions attain their maxima and minima only on the boundary ∂Ω. The only exceptions are the constant functions.
Theorem 5.1 (Strong Maximum Principle). Let Ω ⊂ Rn be a domain, and assume that u ∈ C (Ω)
verifies the mean value property (4.0.12a). Then if u achieves its max or min at a point p ∈ Ω, then
u is constant on Ω. Therefore, if Ω is bounded and u ∈ C(Ω) is not constant, then for every x ∈ Ω,
we have

(5.0.20) u(x) < max_{y∈∂Ω} u(y), u(x) > min_{y∈∂Ω} u(y).

Proof. We give the argument for the "min" in the case n = 2. Suppose that u achieves its min at a point p ∈ Ω, and that u(p) = m. Let B(p) ⊂ Ω be any ball centered at p, and let z be any point in B(p). Choose a small ball B_r(z) of radius r centered at z with B_r(z) ⊂ B(p).
Note that by the definition of a min, we have that

(5.0.21) u(z) ≥ m.
Using the assumption that the mean value property (4.0.12a) holds, we conclude that

(5.0.22)
m = (1/∣B(p)∣) ∫_{B(p)} u(y) d²y = (1/∣B(p)∣) { ∫_{B_r(z)} u(y) d²y + ∫_{B(p)∖B_r(z)} u(y) d²y }
= (1/∣B(p)∣) { ∣B_r(z)∣ u(z) + ∫_{B(p)∖B_r(z)} u(y) d²y } ≥ (1/∣B(p)∣) { ∣B_r(z)∣ u(z) + m(∣B(p)∣ − ∣B_r(z)∣) }.
Rearranging inequality (5.0.22), we conclude that

(5.0.23) u(z) ≤ m.
Combining (5.0.21) and (5.0.23), we conclude that

(5.0.24) u(x) = m

holds for all points x ∈ B(p). Therefore, u is locally constant at any point where it achieves its min.
Since Ω is open and connected, we conclude that u(x) = m for all x ∈ Ω.
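A quick numerical illustration of the maximum principle (our own sketch, not part of the notes): solving Laplace's equation on a square by Jacobi iteration — which repeatedly replaces each interior value by the average of its neighbors, a discrete cousin of the mean value property — yields interior values strictly between the boundary extremes.

```python
import numpy as np

# Solve Laplace's equation on the unit square by Jacobi iteration: each
# sweep replaces interior values by the average of their four neighbors.
n = 40
u = np.zeros((n, n))
x = np.linspace(0.0, 1.0, n)
u[0, :], u[-1, :] = np.sin(np.pi * x), -np.sin(np.pi * x)  # non-constant data
for _ in range(5000):  # more than enough sweeps to converge at this size
    u[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1]
                            + u[1:-1, :-2] + u[1:-1, 2:])

interior = u[1:-1, 1:-1]
boundary = np.concatenate([u[0, :], u[-1, :], u[:, 0], u[:, -1]])
print(interior.max() < boundary.max(), interior.min() > boundary.min())  # True True
```

The strict inequalities mirror (5.0.20): the discrete solution is non-constant, so its extremes live only on the boundary.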

The next corollary will allow us to compare the size of two solutions to Poisson’s equation if we
have information about the size of the source terms and about the values of the solutions on ∂Ω.
The proof is based on Theorem 5.1.
Corollary 5.0.1. Let Ω ⊂ Rn be a bounded domain and let f ∈ C(Ω). Then the PDE

(5.0.25) ∆u = 0, x ∈ Ω,
         u(x) = f(x), x ∈ ∂Ω,
has at most one solution uf ∈ C 2 (Ω) ∩ C (Ω). Furthermore, if uf and ug are the solutions corre-
sponding to the data f, g ∈ C (Ω), then
(1) (Comparison Principle) If f ≥ g on ∂Ω and f ≠ g, then
uf > ug in Ω.
(2) (Stability Estimate) For any x ∈ Ω, we have that
∣uf (x) − ug (x)∣ ≤ max ∣f (y ) − g (y )∣.
y ∈∂Ω

Proof. We first prove the Comparison Principle. Let w = uf − ug . Then by subtracting the PDEs,
we see that w solves

(5.0.26) ∆w = 0, x ∈ Ω,
         w(x) = f(x) − g(x) ≥ 0, x ∈ ∂Ω.
Since w is harmonic, since f (x) − g (x) ≥ 0 on ∂Ω, and since f ≠ g, Theorem 5.1 implies that w is
not constant and that for every x ∈ Ω, we have

(5.0.27) w(x) > min_{y∈∂Ω} (f(y) − g(y)) ≥ 0.

This proves the Comparison Principle.


For the Stability Estimate, we perform a similar argument for both ±w, which leads to the
estimates

(5.0.28) w(x) ≥ −max_{y∈∂Ω} ∣f(y) − g(y)∣,

(5.0.29) −w(x) ≥ −max_{y∈∂Ω} ∣f(y) − g(y)∣.

Combining (5.0.28) and (5.0.29), we deduce the Stability Estimate.


The “at most one” statement of the corollary now follows directly from applying the Stability
Estimate to w in the case f = g. 
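The Stability Estimate can be illustrated with explicit harmonic functions, whose difference is again harmonic and therefore attains its largest absolute value on the boundary. A small Monte Carlo check on the unit disk (our own example; the two functions are arbitrary choices):

```python
import numpy as np

# Two explicit harmonic functions on the unit disk: u = Re(z^2) and
# v = Re(z^2) + Im(z^3). Their difference is harmonic, so by the maximum
# principle sup |u - v| over the closed disk is attained on |z| = 1.
rng = np.random.default_rng(0)
pts = rng.uniform(-1.0, 1.0, size=(200000, 2))
pts = pts[np.hypot(pts[:, 0], pts[:, 1]) <= 1.0]        # samples in the closed disk
z = pts[:, 0] + 1j * pts[:, 1]
diff_interior = np.abs(np.imag(z**3))                   # |u - v| at the samples
theta = np.linspace(0.0, 2.0 * np.pi, 100000)
diff_boundary = np.abs(np.imag(np.exp(1j * theta)**3))  # |u - v| on the circle
print(diff_interior.max() <= diff_boundary.max() + 1e-6)  # True
```

The small tolerance only absorbs the finite sampling of the boundary circle; the underlying inequality is exactly the Stability Estimate with f − g = −Im(z³).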
MIT OpenCourseWare
http://ocw.mit.edu

18.152 Introduction to Partial Differential Equations.


Fall 2011

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
MATH 18.152 COURSE NOTES - CLASS MEETING # 7

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting # 7: The Fundamental Solution and Green Functions

1. The Fundamental Solution for ∆ in Rn


Here is a situation that often arises in physics. We are given a function f (x) on Rn representing
the spatial density of some kind of quantity, and we want to solve the following equation:

(1.0.1) ∆u(x) = f (x), x = (x1 , ⋯, xn ) ∈ Rn .


Furthermore, we often want to impose the following decay condition as ∣x∣ → ∞ ∶

(1.0.2) ∣u(x)∣ → 0.
For technical reasons, we will need a different condition in the case n = 2. A good physical example
is the theory of electrostatics, in which u(x) is the electric potential¹ and f(x) is the charge density. f(x) could be e.g. a compactly supported function modeling the charge density of a charged star,
and we might want to know how the potential behaves far away from the star (i.e. as ∣x∣ → ∞).
Roughly speaking, the decay conditions (1.0.2) are physically motivated by the fact that the star
should not have a large effect on far-away locations.
As we will soon see, the PDE (1.0.1) has a unique solution verifying (1.0.2) as long as f (x) is
sufficiently differentiable and decays sufficiently rapidly as ∣x∣ → ∞. Much like in the case of the heat
equation, we will be able to construct the solution using an object called the fundamental solution.
Definition 1.0.1. The fundamental solution Φ corresponding to the operator ∆ is

(1.0.3) Φ(x) ≝ (1/2π) ln∣x∣ for n = 2, and Φ(x) ≝ −1/((n−2)ω_n ∣x∣^{n−2}) for n ≥ 3,

where as usual ∣x∣ ≝ (∑_{i=1}^n (x^i)²)^{1/2} and ω_n is the surface area of the unit ball in Rⁿ (e.g. ω₃ = 4π).

Remark 1.0.1. Some people prefer to define their Φ to be the negative of our Φ.
Essentially, our goal in this section is to show that ∆Φ(x) = δ(x), where δ is the delta distribution. Let's assume that this holds for now. We then claim that the solution to (1.0.1) is u(x) = (f ∗ Φ)(x) = ∫_{Rⁿ} f(y)Φ(x − y) dⁿy. This can be justified by the following heuristic computation:

∆_x (f ∗ Φ) = f ∗ ∆_x Φ = f ∗ δ = f(x).
Let’s now make rigorous sense of this. We first show that away from the origin, the fundamental
solution verifies Laplace’s equation.
Lemma 1.0.1. If x ≠ 0, then ∆Φ(x) = 0.
¹Recall that the force F associated to u is F = −∇u.

Proof. Let's do the proof in the case n = 3. Note that Φ(x) = Φ(r) (r ≝ ∣x∣) is spherically symmetric. Thus, using the fact that ∆ = ∂_r² + (2/r)∂_r when r > 0 for spherically symmetric functions, and Φ(r) = −1/(ω₃ r), we have that
∆Φ = ∂_r²Φ + (2/r)∂_rΦ = −2/(ω₃ r³) + 2/(ω₃ r³) = 0.
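Lemma 1.0.1 can also be verified symbolically for n = 3; the following sketch (an illustration, assuming the SymPy library) checks that Φ = −1/(4πr) is annihilated by the Laplacian away from the origin:

```python
import sympy as sp

# Symbolic check of Lemma 1.0.1 for n = 3: Phi = -1/(4*pi*r) satisfies
# Laplace's equation away from the origin.
x, y, z = sp.symbols('x y z', positive=True)
r = sp.sqrt(x**2 + y**2 + z**2)
Phi = -1 / (4 * sp.pi * r)
laplacian = sum(sp.diff(Phi, v, 2) for v in (x, y, z))
print(sp.simplify(laplacian))  # 0
```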
We are now ready to state and prove a rigorous version of the aforementioned heuristic results.
Theorem 1.1 (Solution to Poisson's equation in Rⁿ). Let f(x) ∈ C₀^∞(Rⁿ) (i.e., f(x) is a smooth, compactly supported function on Rⁿ). Then for n ≥ 3, the Poisson equation ∆u(x) = f(x) has a unique smooth solution u(x) that tends to 0 as ∣x∣ → ∞. For n = 2, the solution is unique under the assumptions u(x)/ln∣x∣ → 0 as ∣x∣ → ∞ and ∣∇u(x)∣ → 0 as ∣x∣ → ∞. Furthermore, these unique solutions can be represented as

(1.0.4) u(x) = (Φ ∗ f)(x) = (1/2π) ∫_{R²} ln∣y∣ f(x − y) d²y for n = 2, and u(x) = −1/((n−2)ω_n) ∫_{Rⁿ} ∣y∣^{−(n−2)} f(x − y) dⁿy for n ≥ 3.

Furthermore, there exist constants C_n > 0 such that the following decay estimate holds for the solution as ∣x∣ → ∞ ∶

(1.0.5) ∣u(x)∣ ≤ C₂ ln∣x∣ for n = 2, and ∣u(x)∣ ≤ C_n ∣x∣^{−(n−2)} for n ≥ 3.
Remark 1.0.2. As we alluded to above, Theorem 1.1 shows that ∆Φ(x) = δ (x), where δ is the
“delta distribution.” For on the one hand, as we have previously discussed, we have that f = δ ∗ f.
On the other hand, our proof of Theorem 1.1 below will show that f = ∆u = ∆(Φ ∗ f ) = (∆Φ) ∗ f.
Thus, for any f, we have δ ∗ f = (∆Φ) ∗ f, and so ∆Φ = δ.
Proof. We consider only the case n = 3. Let's first show existence by checking that the function u defined in (1.0.4) solves the equation and has the desired properties. We first differentiate under the integral (we use one of our prior propositions to justify this) and use the fact that ∆_x f(x − y) = ∆_y f(x − y) (you can easily verify this with the chain rule) to derive

(1.0.6) ∆_x u(x) = −(1/4π) ∫_{R³} (1/∣y∣) ∆_x f(x − y) d³y = −(1/4π) ∫_{R³} (1/∣y∣) ∆_y f(x − y) d³y.

To show that the right-hand side of (1.0.6) is equal to f(x), we will split the integral into two pieces: a small ball centered at the origin, and its complement. Thus, let B_ϵ(0) denote the ball of radius ϵ centered at 0. We then split

(1.0.7) ∆_x u(x) = −(1/4π) ∫_{B_ϵ(0)} (1/∣y∣) ∆_y f(x − y) d³y − (1/4π) ∫_{B_ϵ^c(0)} (1/∣y∣) ∆_y f(x − y) d³y ≝ I + II.
We first show that I goes to 0 as ϵ → 0⁺. To this end, let

(1.0.8) M ≝ sup_{y∈R³} { ∣f(y)∣ + ∣∇f(y)∣ + ∣∆f(y)∣ }.

Then using spherical coordinates (r, ω) for the y variable, and recalling that d³y = r² dr dω in spherical coordinates (where ω ∈ ∂B₁(0) ⊂ R³ is a point on the unit sphere and dω = sin θ dθ dφ), we have that

(1.0.9) ∣I∣ ≤ (1/4π) ∫_{B_ϵ(0)} (1/∣y∣) ∣∆_y f(x − y)∣ d³y ≤ (M/4π) ∫_{r=0}^{ϵ} ∫_{∂B₁(0)} r dω dr = ϵ²M/2.

Clearly, the right-hand side of (1.0.9) goes to 0 as ϵ → 0⁺.


We would now like to understand the second term on the right-hand side of (1.0.7). We claim that

(1.0.10) ∣f(x) − II∣ → 0

as ϵ → 0⁺. After we show this, we can combine (1.0.7), (1.0.9), and (1.0.10) and let ϵ → 0⁺ to deduce that ∆_x u(x) = f(x) as desired.
To show (1.0.10), we will use integration by parts via Green's identity and simple estimates to control the boundary terms. Recall that Green's identity for two functions u, v is

(1.0.11) ∫_Ω ( v(x)∆u(x) − u(x)∆v(x) ) dⁿx = ∫_∂Ω ( v∇_N̂ u(σ) − u∇_N̂ v(σ) ) dσ.

Using (1.0.11) and Lemma 1.0.1 (which gives ∆_y (1/∣y∣) = 0 for y ≠ 0), we compute that

(1.0.12) ∫_{B_ϵ^c(0)} ( −(1/∣y∣) ∆_y f(x − y) + f(x − y) ∆_y (1/∣y∣) ) d³y = ∫_{∂B_ϵ(0)} ( (1/∣σ∣) ∇_N̂(σ) f(x − σ) − f(x − σ) ∇_N̂(σ) (1/∣σ∣) ) dσ.

Above, ∇_N̂(σ) is the outward unit radial derivative on the sphere ∂B_ϵ(0). This corresponds to the "opposite" choice of normal that appears in the standard formulation of Green's identity for B_ϵ^c(0), but we have compensated for this by carefully inserting minus signs on the right-hand side of (1.0.12). Recalling also that ∇_N̂(σ) (1/∣σ∣) = −1/∣σ∣², that ∣σ∣ = ϵ on ∂B_ϵ(0), and that dσ = ϵ² dω on ∂B_ϵ(0), it follows that

(1.0.13) −∫_{B_ϵ^c(0)} (1/∣y∣) ∆_y f(x − y) d³y = −ϵ ∫_{∂B₁(0)} ω ⋅ (∇f)(x − ϵω) dω + ∫_{∂B₁(0)} f(x − ϵω) dω.

Using (1.0.8), it follows that the first integral on the right-hand side of (1.0.13) is bounded in magnitude by 4πMϵ, and thus goes to 0 as ϵ → 0⁺. Furthermore, since f is continuous and since ∫_{∂B₁(0)} 1 dω = 4π, it follows that the second integral converges to 4πf(x) as ϵ → 0⁺. We have thus proved (1.0.10) for n = 3.
To estimate ∣u(x)∣ as ∣x∣ → ∞, we assume that f(x) vanishes outside of the ball B_R(0). It suffices to estimate the right-hand side of (1.0.4) when ∣x∣ > 2R. We first note the inequality 1/∣x − y∣ ≤ 2/∣x∣, which holds for ∣y∣ ≤ R and ∣x∣ > 2R. Using this inequality and (1.0.8), we can estimate the right-hand side of (1.0.4) by

(1.0.14) ∣u(x)∣ = (1/4π) ∣∫_{B_R(0)} (1/∣x − y∣) f(y) d³y∣ ≤ (M/(2π∣x∣)) ∫_{B_R(0)} 1 d³y = 2R³M/(3∣x∣),

and we have shown (1.0.5) in the case n = 3.

To prove uniqueness, we will make use of Corollary 4.0.4, which we will prove later. Now if u, v
are two solutions with the assumed decay conditions at ∞, then using the usual strategy, we note
def
that w = u − v is a solution to the Laplace equation

(1.0.15) ∆w = 0
that verifies ∣w(x)∣ → 0 as ∣x∣ → ∞. In particular, w is a bounded harmonic function on R3 . We will
show in Corollary 4.0.4 below that w(x) must be a constant function. Furthermore, the constant
must be 0 since ∣w(x)∣ → 0 as ∣x∣ → ∞.
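The decay estimate (1.0.5) can be probed numerically. The sketch below (our own construction, not from the notes) approximates u = Φ ∗ f by a midpoint-rule sum for f equal to the indicator of the unit ball, and checks the far-field behavior u(x) ≈ −(∫f)/(4π∣x∣) — the "point charge" monopole approximation suggested by the proof:

```python
import numpy as np

# Midpoint-rule approximation of u = Phi * f in R^3 for f = indicator of
# the unit ball, followed by a far-field check: for |x| >> 1 we expect
# u(x) ~ -Q/(4*pi*|x|), where Q = integral of f.
h = 0.05
grid = np.arange(-1.0, 1.0, h) + h / 2.0           # cell centers
X, Y, Z = np.meshgrid(grid, grid, grid, indexing='ij')
f = (X**2 + Y**2 + Z**2 <= 1.0).astype(float)      # f = 1 on the unit ball
Q = f.sum() * h**3                                 # ~ 4*pi/3

def u(px, py, pz):
    r = np.sqrt((px - X)**2 + (py - Y)**2 + (pz - Z)**2)
    return -(1.0 / (4.0 * np.pi)) * np.sum(f / r) * h**3

for R in (5.0, 10.0, 20.0):
    print(R, u(R, 0.0, 0.0) * R, -Q / (4.0 * np.pi))  # last two columns agree
```

The product u(x)·∣x∣ stabilizes at −Q/4π, which is exactly the n = 3 decay rate ∣u(x)∣ ≤ C₃/∣x∣ in (1.0.5).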


2. Green functions for domains Ω


Our goal in this section is to derive an analog of Theorem 1.1 on the interior of domains Ω ⊂ Rn .
Specifically, we will study the boundary value Poisson problem

(2.0.16) ∆u(x) = f (x), x ∈ Ω ⊂ Rn ,


u(x) = g (x), x ∈ ∂Ω.
Theorem 2.1 (Basic existence theorem). Let Ω be a bounded Lipschitz domain, and let g ∈ C(∂Ω). Then the PDE (2.0.16) has a unique solution u ∈ C²(Ω) ∩ C(Ω).
Proof. This proof is a bit beyond this course. 
Definition 2.0.2. Let Ω ⊂ Rn be a domain. A Green function in Ω is defined to be a function of
(x, y ) ∈ Ω × Ω verifying the following conditions for each fixed x ∈ Ω ∶

(2.0.17) ∆_y G(x, y) = δ(x − y), y ∈ Ω,


(2.0.18) G(x, σ ) = 0, σ ∈ ∂Ω.
Proposition 2.0.2. Let Φ be the fundamental solution (1.0.3) for ∆ in Rⁿ, and let Ω ⊂ Rⁿ be a domain. Then the Green function G(x, y) for Ω can be decomposed as

(2.0.19) G(x, y) = Φ(x − y ) − φ(x, y ),


where for each x ∈ Ω, φ(x, y ) solves the Dirichlet problem

(2.0.20) ∆y φ(x, y) = 0, y ∈ Ω,
(2.0.21) φ(x, σ ) = Φ(x − σ ), σ ∈ ∂Ω.
Proof. As we have previously discussed, ∆Φ = δ. Also using (2.0.20), we compute that

(2.0.22) ∆y (Φ(x − y) − φ(x, y )) = ∆y Φ(x − y ) − ∆y φ(x, y ) = δ (x − y ).


Therefore, Φ(x − y ) − φ(x, y ) verifies equation (2.0.17).
Furthermore, using (2.0.21), we have that Φ(x − σ ) − φ(x, σ ) = 0 whenever σ ∈ ∂Ω. Thus, Φ(x −
y ) − φ(x, y ) also verifies the boundary condition (2.0.18).


The following technical proposition will play a role later in this section when we derive representation formulas for solutions to (2.0.16) in terms of Green functions.
Proposition 2.0.3 (Representation formula for u). Let Φ be the fundamental solution (1.0.3)
for ∆ in Rn , and let Ω ⊂ Rn be a domain. Assume that u ∈ C 2 (Ω). Then for every x ∈ Ω, we have
the following representation formula for u(x) ∶

(2.0.23) u(x) = ∫_Ω Φ(x − y) ∆_y u(y) dⁿy − ∫_∂Ω Φ(x − σ) ∇_N̂(σ) u(σ) dσ + ∫_∂Ω u(σ) ∇_N̂(σ) Φ(x − σ) dσ,

where the second integral is called the single layer potential and the third is called the double layer potential.
Proof. We'll do the proof for n = 3, in which case Φ(x) = −1/(4π∣x∣). We will also make use of Green's identity (1.0.11). Let B_ϵ(x) be a ball of radius ϵ centered at x, and let Ω_ϵ ≝ Ω∖B_ϵ(x). Note that ∂Ω_ϵ = ∂Ω ∪ ∂B_ϵ(x). Using (1.0.11), we compute that

(2.0.24) ∫_{Ω_ϵ} (1/∣x − y∣) ∆u(y) d³y = ∫_{∂Ω_ϵ} ( (1/∣x − σ∣) ∇_N̂ u(σ) − u(σ) ∇_N̂ (1/∣x − σ∣) ) dσ
= ∫_∂Ω (1/∣x − σ∣) ∇_N̂ u(σ) dσ − ∫_∂Ω u(σ) ∇_N̂ (1/∣x − σ∣) dσ
− ∫_{∂B_ϵ(x)} (1/∣x − σ∣) ∇_N̂ u(σ) dσ + ∫_{∂B_ϵ(x)} u(σ) ∇_N̂ (1/∣x − σ∣) dσ.
In the last two integrals above, N̂(σ) denotes the radially outward unit normal to the boundary of the ball B_ϵ(x). This corresponds to the "opposite" choice of normal that appears in the standard formulation of Green's identity, but we have compensated by adjusting the signs in front of the integrals.
Let's symbolically write (2.0.24) as
(2.0.25) L = R1 + R2 + R3 + R4.
Our goal is to show that as ϵ ↓ 0, the following limits are achieved:
● L → −4π ∫_Ω Φ(x − y) ∆_y u(y) d³y
● R1 → 4π × single layer potential
● R2 → 4π × double layer potential
● R3 → 0
● R4 → −4πu(x).
Once we have calculated the above limits, (2.0.23) then follows from simple algebraic rearranging.
We first address L. Let M ≝ max_{y∈Ω} ∣∆u(y)∣. We then estimate

(2.0.26) ∣∫_{Ω_ϵ} (1/∣x − y∣) ∆u(y) d³y − ∫_Ω (1/∣x − y∣) ∆u(y) d³y∣ ≤ ∫_{B_ϵ(x)} (1/∣x − y∣) ∣∆u(y)∣ d³y ≤ M ∫_{B_ϵ(x)} (1/∣x − y∣) d³y → 0 as ϵ ↓ 0.

This shows that L converges to ∫_Ω (1/∣x − y∣) ∆u(y) d³y as ϵ ↓ 0.


The limits for R1 and R2 are obvious since these terms do not depend on ϵ.
We now address R3. To this end, let M′ ≝ max_{y∈Ω} ∣∇u(y)∣. We then estimate R3 by

(2.0.27) ∣R3∣ ≤ ∫_{∂B_ϵ(x)} (1/∣x − σ∣) ∣∇_N̂ u(σ)∣ dσ ≤ (1/ϵ) M′ ⋅ 4πϵ² = 4πϵM′ → 0 as ϵ ↓ 0,

where 4πϵ² is the surface area of ∂B_ϵ(x).

We now address R4. Using spherical coordinates (r, θ, φ) ∈ [0, ∞) × [0, π) × [0, 2π) centered at x, we have that dσ = r² sin θ dθ dφ. Therefore, ∫_{∂B_ϵ(x)} (1/∣x − σ∣²) dσ = ∫_{φ∈[0,2π]} ∫_{θ∈[0,π]} sin θ dθ dφ = 4π. We now estimate

(2.0.28) ∣(1/4π) R4 − [−u(x)]∣ = ∣u(x) + (1/4π) ∫_{∂B_ϵ(x)} u(σ) ∇_N̂(σ) (1/∣x − σ∣) dσ∣
= ∣(1/4π) ∫_{∂B_ϵ(x)} (u(x) − u(σ)) (1/∣x − σ∣²) dσ∣
≤ (1/4π) ∫_{∂B_ϵ(x)} ∣u(x) − u(σ)∣ (1/∣x − σ∣²) dσ
≤ (1/4π) max_{σ∈∂B_ϵ(x)} ∣u(x) − u(σ)∣ ∫_{∂B_ϵ(x)} (1/∣x − σ∣²) dσ
= max_{σ∈∂B_ϵ(x)} ∣u(x) − u(σ)∣ → 0 as ϵ ↓ 0.

This shows that R4 → −4πu(x) as ϵ ↓ 0.
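The limit computed for R4 boils down to the fact that spherical averages of a continuous function converge to the value at the center. A Monte Carlo illustration (our own sketch; the test function u is an arbitrary smooth choice):

```python
import numpy as np

# The limit behind the R4 term: the average of a continuous function over
# the sphere of radius eps about x tends to u(x) as eps -> 0.
def sphere_avg(u, x, eps, m=200000):
    rng = np.random.default_rng(1)
    v = rng.normal(size=(m, 3))
    v /= np.linalg.norm(v, axis=1, keepdims=True)   # uniform points on the unit sphere
    return np.mean(u(x + eps * v))

u = lambda p: np.cos(p[:, 0]) * np.exp(p[:, 1]) + p[:, 2]**2  # arbitrary smooth test
x = np.array([0.3, -0.1, 0.5])
for eps in (0.5, 0.1, 0.02):
    print(eps, abs(sphere_avg(u, x, eps) - u(x[None, :])[0]))  # error shrinks with eps
```

The error decays roughly like ϵ² (plus sampling noise), matching the modulus-of-continuity bound max_{σ∈∂B_ϵ(x)} ∣u(x) − u(σ)∣ in (2.0.28).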



Theorem 2.2 (Representation formula for solutions to the boundary value Poisson equation). The solution u to (2.0.16) can be represented as

(2.0.29) u(x) = ∫_Ω f(y) G(x, y) dⁿy + ∫_∂Ω g(σ) ∇_N̂(σ) G(x, σ) dσ,

where the quantity ∇_N̂(σ) G(x, σ) appearing in the boundary integral is called the Poisson kernel.
Proof. Applying Proposition 2.0.3 and using ∆u = f and u∣∂Ω = g, we have that

(2.0.30) u(x) = ∫_Ω Φ(x − y) f(y) dⁿy − ∫_∂Ω Φ(x − σ) ∇_N̂(σ) u(σ) dσ + ∫_∂Ω g(σ) ∇_N̂(σ) Φ(x − σ) dσ.

Recall also that
(2.0.31) G(x, y) = Φ(x − y) − φ(x, y),

(2.0.32) G(x, σ) = 0 when σ ∈ ∂Ω.

Applying the Green identity (1.0.11) to the functions u(y) and φ(x, y), and recalling that ∆_y φ(x, y) = 0 for each fixed x ∈ Ω, that ∆u(y) = f(y), that φ(x, σ) = Φ(x − σ) for σ ∈ ∂Ω, and that u(σ) = g(σ) on ∂Ω, we have that

(2.0.33) 0 = ∫_Ω φ(x, y) f(y) dⁿy − ∫_∂Ω φ(x, σ) ∇_N̂ u(σ) dσ + ∫_∂Ω g(σ) ∇_N̂ φ(x, σ) dσ.

Subtracting (2.0.33) from (2.0.30), and using (2.0.31) and (2.0.32), we deduce the formula (2.0.29).

3. Poisson’s Formula
Let's compute the Green function G(x, y) and the Poisson kernel P(x, σ) ≝ ∇_N̂(σ) G(x, σ) from (2.0.29) in the case that Ω ≝ B_R(0) ⊂ R³ is a ball of radius R centered at the origin. We'll use a technique called the method of images that works for special domains.
Warning 3.0.1. Brace yourself for a bunch of tedious computations that at the end of the day will
lead to a very nice expression.
The basic idea is to hope that φ(x, y) from (2.0.19), viewed as a potential that depends on y, is equal to the potential generated by some "imaginary charge" q placed at a point x* ∈ Ω^c. To ensure that property (2.0.18) holds, q and x* have to be chosen so that along the boundary {y ∈ R³ ∣ ∣y∣ = R}, φ(x, y) = −1/(4π∣x − y∣). In a nutshell, we guess that

(3.0.34) G(x, y) = −1/(4π∣x − y∣) + q/(4π∣x* − y∣),

and we try to solve for q and x* so that G(x, y) vanishes when ∣y∣ = R. Thus, when ∣y∣ = R, we must have

(3.0.35) 1/(4π∣x − y∣) = q/(4π∣x* − y∣).

Squaring and rearranging, this is equivalent to

(3.0.36) ∣x* − y∣² = q²∣x − y∣².

Expanding both sides and using ∣y∣ = R, we have

(3.0.37) ∣x*∣² − 2x* ⋅ y + R² = ∣x* − y∣² = q²∣x − y∣² = q²(∣x∣² − 2x ⋅ y + R²).


Then performing simple algebra, we have

(3.0.38) ∣x∗ ∣2 + R2 − q 2 (R2 + ∣x∣2 ) = 2y ⋅ (x∗ − q 2 x).


Now since the left-hand side of (3.0.38) does not depend on y, it must be the case that the right-hand side vanishes. This implies that x* = q²x, and also leads to the equation

(3.0.39) q⁴∣x∣² − q²(R² + ∣x∣²) + R² = 0.

Solving (3.0.39) for q (and discarding the trivial root q = 1, which would place x* = x inside the ball), we finally have that

(3.0.40) q = R/∣x∣,
(3.0.41) x* = (R²/∣x∣²) x.
Therefore,

(3.0.42) φ(x, y) = −(1/4π) R/(∣x∣ ∣(R²/∣x∣²)x − y∣),

(3.0.43) φ(0, y) = −1/(4πR),

where we took a limit as x → 0 in (3.0.42) to derive (3.0.43).
Next, using (3.0.34), we have

(3.0.44) G(x, y) = −1/(4π∣x − y∣) + (1/4π) R/(∣x∣ ∣(R²/∣x∣²)x − y∣), x ≠ 0,

(3.0.45) G(0, y) = −(1/4π)(1/∣y∣ − 1/R).

Differentiating in y, we compute that

(3.0.46) ∇_y G(x, y) = −(x − y)/(4π∣x − y∣³) + (1/4π)(R/∣x∣)(x* − y)/∣x* − y∣³.
Now when σ ∈ ∂B_R(0), (3.0.36) and (3.0.40) imply that

(3.0.47) ∣x* − σ∣ = (R/∣x∣) ∣x − σ∣.

Therefore, using (3.0.46) and (3.0.47) together with x* = (R²/∣x∣²)x, we compute that

(3.0.48) ∇_σ G(x, σ) = −(x − σ)/(4π∣x − σ∣³) + (∣x∣²/R²)(x* − σ)/(4π∣x − σ∣³) = (σ/(4π∣x − σ∣³))(1 − ∣x∣²/R²).

Using (3.0.48) and the fact that N̂(σ) = σ/R, we deduce

(3.0.49) ∇_N̂(σ) G(x, σ) ≝ ∇_σ G(x, σ) ⋅ N̂(σ) = ((R² − ∣x∣²)/(4πR)) (1/∣x − σ∣³).
Remark 3.0.3. If the ball were centered at the point p ∈ R³ instead of the origin, then the formula (3.0.49) would be replaced with

(3.0.50) ∇_N̂(σ) G(x, σ) = ((R² − ∣x − p∣²)/(4πR)) (1/∣x − σ∣³).
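The algebra behind the image construction can be double-checked numerically: with q = R/∣x∣ and x* = (R²/∣x∣²)x, the candidate Green function vanishes identically on the boundary sphere. A short sketch (our own, with arbitrary choices of R and x):

```python
import numpy as np

# With q = R/|x| and x* = (R^2/|x|^2) x, the candidate Green function
# G(x, y) = -1/(4*pi*|x - y|) + q/(4*pi*|x* - y|) vanishes for |y| = R.
R = 2.0
x = np.array([0.7, -0.4, 1.1])                      # arbitrary point with |x| < R
q = R / np.linalg.norm(x)
xs = (R**2 / np.linalg.norm(x)**2) * x              # image point, outside the ball

rng = np.random.default_rng(2)
y = rng.normal(size=(1000, 3))
y *= R / np.linalg.norm(y, axis=1, keepdims=True)   # random points on |y| = R
G = (-1.0 / (4.0 * np.pi * np.linalg.norm(x - y, axis=1))
     + q / (4.0 * np.pi * np.linalg.norm(xs - y, axis=1)))
print(np.max(np.abs(G)) < 1e-12)  # True: zero up to rounding error
```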
Theorem 3.1 (Poisson's formula). Let B_R(p) ⊂ R³ be a ball of radius R centered at p = (p¹, p², p³), and let x = (x¹, x², x³) denote a point in R³. Then the unique solution u ∈ C²(B_R(p)) ∩ C(B̄_R(p)) of the PDE

(3.0.51) ∆u = 0, x ∈ B_R(p),
         u(x) = f(x), x ∈ ∂B_R(p),

can be represented using the Poisson formula:
can be represented using the Poisson formula:

(3.0.52) u(x) = ((R² − ∣x − p∣²)/(4πR)) ∫_{∂B_R(p)} f(σ)/∣x − σ∣³ dσ.

Remark 3.0.4. In n dimensions, the formula (3.0.52) gets replaced with

(3.0.53) u(x) = ((R² − ∣x − p∣²)/(ω_n R)) ∫_{∂B_R(p)} f(σ)/∣x − σ∣ⁿ dσ,

where as usual, ωn is the surface area of the unit ball in Rn .


Proof. The identity (3.0.52) follows immediately from Theorem 2.2 and (3.0.50). 
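Poisson's formula can be tested against a known harmonic function by approximating the surface integral with Monte Carlo sampling on the sphere. A minimal sketch (our own example, with an arbitrary harmonic polynomial):

```python
import numpy as np

# Monte Carlo evaluation of Poisson's formula (3.0.52) on B_R(0) in R^3,
# tested against the harmonic polynomial u(x) = x1^2 - x3^2 + 2 x2.
R = 1.5
u_exact = lambda p: p[:, 0]**2 - p[:, 2]**2 + 2.0 * p[:, 1]

rng = np.random.default_rng(3)
sig = rng.normal(size=(400000, 3))
sig *= R / np.linalg.norm(sig, axis=1, keepdims=True)  # uniform on |sigma| = R

x = np.array([0.4, -0.3, 0.2])                          # interior evaluation point
kernel = (R**2 - x @ x) / (4.0 * np.pi * R * np.linalg.norm(x - sig, axis=1)**3)
area = 4.0 * np.pi * R**2
u_poisson = area * np.mean(kernel * u_exact(sig))       # estimate of the surface integral
print(u_poisson, u_exact(x[None, :])[0])  # both close to -0.48
```

The Monte Carlo estimate reproduces the interior value to within sampling error, which is a direct check that the kernel (R² − ∣x∣²)/(4πR∣x − σ∣³) integrates boundary data correctly.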

4. Harnack’s inequality
Theorem 4.1 (Harnack's inequality). Let u be harmonic and non-negative in the ball B_R(0) ⊂ Rⁿ. Then for any x ∈ B_R(0), we have that

(4.0.54) (R^{n−2}(R − ∣x∣)/(R + ∣x∣)^{n−1}) u(0) ≤ u(x) ≤ (R^{n−2}(R + ∣x∣)/(R − ∣x∣)^{n−1}) u(0).
Proof. We'll do the proof for n = 3. The basic idea is to combine the Poisson representation formula with simple inequalities and the mean value property. By Theorem 3.1, with f denoting the restriction of u to ∂B_R(0), we have that

(4.0.55) u(x) = ((R² − ∣x∣²)/(4πR)) ∫_{∂B_R(0)} f(σ)/∣x − σ∣³ dσ.

By the triangle inequality, for σ ∈ ∂B_R(0) (i.e. ∣σ∣ = R), we have that R − ∣x∣ ≤ ∣x − σ∣ ≤ R + ∣x∣. Applying the first inequality to (4.0.55), and using the non-negativity of f, we deduce that

(4.0.56) u(x) ≤ ((R + ∣x∣)/(R − ∣x∣)²) (1/(4πR)) ∫_{∂B_R(0)} f(σ) dσ.

Now recall that by the mean value property, we have that

(4.0.57) u(0) = (1/(4πR²)) ∫_{∂B_R(0)} f(σ) dσ.

Thus, combining (4.0.56) and (4.0.57), we have that

(4.0.58) u(x) ≤ (R(R + ∣x∣)/(R − ∣x∣)²) u(0),

which is the right-hand inequality in (4.0.54) for n = 3. The other one can be proven similarly using the remaining triangle inequality.
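Harnack's inequality for n = 3 is easy to spot-check numerically with an explicit positive harmonic function, e.g. u(x) = 1/∣x − a∣ with the pole a placed outside the ball (an illustrative choice of ours, not from the notes):

```python
import numpy as np

# Spot-check of Harnack's inequality (n = 3) on B_R(0) for the positive
# harmonic function u(x) = 1/|x - a| with pole a outside the ball.
R = 1.0
a = np.array([3.0, 0.0, 0.0])
u = lambda p: 1.0 / np.linalg.norm(p - a)
u0 = u(np.zeros(3))

rng = np.random.default_rng(4)
xs = rng.uniform(-0.6, 0.6, size=(1000, 3))
xs = xs[np.linalg.norm(xs, axis=1) < R]             # keep points inside the ball
ok = True
for p in xs:
    r = np.linalg.norm(p)
    lo = R * (R - r) / (R + r)**2 * u0              # lower bound in (4.0.54), n = 3
    hi = R * (R + r) / (R - r)**2 * u0              # upper bound in (4.0.54), n = 3
    ok = ok and (lo <= u(p) <= hi)
print(ok)  # True
```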

Corollary 4.0.4 (Liouville's theorem). Suppose that u ∈ C²(Rⁿ) is harmonic on Rⁿ. Suppose there exists a constant M such that u(x) ≥ M for all x ∈ Rⁿ, or such that u(x) ≤ M for all x ∈ Rⁿ. Then u is constant.

Proof. We first consider the case that u(x) ≥ M. Let v ≝ u + ∣M∣. Observe that v ≥ 0 is harmonic and verifies the hypotheses of Theorem 4.1 on every ball B_R(0). Thus, by (4.0.54), if x ∈ Rⁿ and R > ∣x∣, we have that

(4.0.59) (R^{n−2}(R − ∣x∣)/(R + ∣x∣)^{n−1}) v(0) ≤ v(x) ≤ (R^{n−2}(R + ∣x∣)/(R − ∣x∣)^{n−1}) v(0).

Allowing R → ∞ in (4.0.59), we conclude that v(x) = v(0). Thus, v is a constant-valued function (and therefore u is too).
To handle the case u(x) ≤ M, we simply consider the function w(x) ≝ −u(x) + ∣M∣ in place of v(x), and we argue as above.

MATH 18.152 COURSE NOTES - CLASS MEETING # 8

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting # 8: Green Functions


1. Green functions for domains Ω
Our goal in this section is to derive an integral representation formula for the solution to Poisson’s
equation on domains Ω ⊂ Rn . Specifically, we will study the boundary value Poisson PDE

(1.0.1) ∆u(x) = f (x), x ∈ Ω ⊂ Rn ,


u(x) = g (x), x ∈ ∂Ω.
We first state a basic existence theorem.
Theorem 1.1 (Basic existence theorem). Let Ω be a bounded Lipschitz domain, and let g ∈ C(∂Ω). Then the PDE (1.0.1) has a unique solution u ∈ C²(Ω) ∩ C(Ω).
Proof. This proof is a bit beyond this course. 
We now define the basic object that will play the role of a fundamental solution on a domain Ω.
Definition 1.0.1. Let Ω ⊂ Rn be a domain. A Green function in Ω is defined to be a function of
(x, y ) ∈ Ω × Ω verifying the following conditions for each fixed x ∈ Ω ∶

def
(1.0.2) ∆y G(x, y) = δx (y) = δ(y − x), y ∈ Ω,
(1.0.3) G(x, σ ) = 0, σ ∈ ∂Ω.
Let’s now connect G(x, y) to Φ(x − y ).
Proposition 1.0.1. Let Φ be the fundamental solution for ∆ in Rⁿ, and let Ω ⊂ Rⁿ be a domain. Then the Green function G(x, y) for Ω can be decomposed as

(1.0.4) G(x, y ) = Φ(x − y ) − φ(x, y ),


where for each x ∈ Ω, φ(x, y ) solves the Dirichlet problem

(1.0.5) ∆y φ(x, y ) = 0, y ∈ Ω,
(1.0.6) φ(x, σ) = Φ(x − σ ), σ ∈ ∂Ω.
Proof. As we have previously discussed, ∆Φ = δ. Also using (1.0.5), we compute that

(1.0.7) ∆y (Φ(x − y) − φ(x, y)) = ∆y Φ(x − y ) − ∆y φ(x, y ) = δ (y − x).


Therefore, Φ(x − y ) − φ(x, y ) verifies equation (1.0.2).

Furthermore, using (1.0.6), we have that Φ(x − σ ) − φ(x, σ ) = 0 whenever σ ∈ ∂Ω. Thus, Φ(x −
y ) − φ(x, y ) also verifies the boundary condition (1.0.3).

The following technical proposition will play a role later in this section when we derive representation formulas for solutions to (1.0.1) in terms of Green functions.
Proposition 1.0.2 (Representation formula for u). Let Φ be the fundamental solution for ∆
in Rn , and let Ω ⊂ Rn be a domain. Assume that u ∈ C 2 (Ω). Then for every x ∈ Ω, we have the
following representation formula for u(x) ∶

(1.0.8) u(x) = ∫_Ω Φ(x − y) ∆_y u(y) dⁿy − ∫_∂Ω Φ(x − σ) ∇_N̂(σ) u(σ) dσ + ∫_∂Ω u(σ) ∇_N̂(σ) Φ(x − σ) dσ,

where the second integral is called the single layer potential and the third is called the double layer potential.
Proof. We'll do the proof for n = 3, in which case Φ(x) = −1/(4π∣x∣). We will also make use of Green's identity. Let B_ϵ(x) be a ball of radius ϵ centered at x, and let Ω_ϵ ≝ Ω∖B_ϵ(x). Note that ∂Ω_ϵ = ∂Ω ∪ ∂B_ϵ(x). Using Green's identity, we compute that

(1.0.9) ∫_{Ω_ϵ} (1/∣x − y∣) ∆u(y) d³y = ∫_{∂Ω_ϵ} ( (1/∣x − σ∣) ∇_N̂ u(σ) − u(σ) ∇_N̂ (1/∣x − σ∣) ) dσ
= ∫_∂Ω (1/∣x − σ∣) ∇_N̂ u(σ) dσ − ∫_∂Ω u(σ) ∇_N̂ (1/∣x − σ∣) dσ
− ∫_{∂B_ϵ(x)} (1/∣x − σ∣) ∇_N̂ u(σ) dσ + ∫_{∂B_ϵ(x)} u(σ) ∇_N̂ (1/∣x − σ∣) dσ.
In the last two integrals above, N̂(σ) denotes the radially outward unit normal to the boundary of the ball B_ϵ(x). This corresponds to the "opposite" choice of normal that appears in the standard formulation of Green's identity, but we have compensated by adjusting the signs in front of the integrals.
Let's symbolically write (1.0.9) as
(1.0.10) L = R1 + R2 + R3 + R4.
Our goal is to show that as ϵ ↓ 0, the following limits are achieved:
● L → −4π ∫_Ω Φ(x − y) ∆_y u(y) d³y
● R1 → 4π × single layer potential
● R2 → 4π × double layer potential
● R3 → 0
● R4 → −4πu(x).
Once we have calculated the above limits, (1.0.8) then follows from simple algebraic rearranging.
We first address L. Let M ≝ max_{y∈Ω} ∣∆u(y)∣. We then estimate

(1.0.11) ∣∫_{Ω_ϵ} (1/∣x − y∣) ∆u(y) d³y − ∫_Ω (1/∣x − y∣) ∆u(y) d³y∣ ≤ ∫_{B_ϵ(x)} (1/∣x − y∣) ∣∆u(y)∣ d³y ≤ M ∫_{B_ϵ(x)} (1/∣x − y∣) d³y → 0 as ϵ ↓ 0.

This shows that L converges to ∫_Ω (1/∣x − y∣) ∆u(y) d³y as ϵ ↓ 0.
The limits for R1 and R2 are obvious since these terms do not depend on ϵ.
We now address R3. To this end, let M′ ≝ max_{y∈Ω} ∣∇u(y)∣. We then estimate R3 by

(1.0.12) ∣R3∣ ≤ ∫_{∂B_ϵ(x)} (1/∣x − σ∣) ∣∇_N̂ u(σ)∣ dσ ≤ (1/ϵ) M′ ⋅ 4πϵ² = 4πϵM′ → 0 as ϵ ↓ 0,

where 4πϵ² is the surface area of ∂B_ϵ(x).
We now address R4. Using spherical coordinates (r, θ, φ) ∈ [0, ∞) × [0, π) × [0, 2π) centered at x, we have that dσ = r² sin θ dθ dφ. Therefore, ∫_{∂B_ϵ(x)} (1/∣x − σ∣²) dσ = ∫_{φ∈[0,2π]} ∫_{θ∈[0,π]} sin θ dθ dφ = 4π. We now estimate

(1.0.13) ∣(1/4π) R4 − [−u(x)]∣ = ∣u(x) + (1/4π) ∫_{∂B_ϵ(x)} u(σ) ∇_N̂(σ) (1/∣x − σ∣) dσ∣
= ∣(1/4π) ∫_{∂B_ϵ(x)} (u(x) − u(σ)) (1/∣x − σ∣²) dσ∣
≤ (1/4π) ∫_{∂B_ϵ(x)} ∣u(x) − u(σ)∣ (1/∣x − σ∣²) dσ
≤ (1/4π) max_{σ∈∂B_ϵ(x)} ∣u(x) − u(σ)∣ ∫_{∂B_ϵ(x)} (1/∣x − σ∣²) dσ
= max_{σ∈∂B_ϵ(x)} ∣u(x) − u(σ)∣ → 0 as ϵ ↓ 0.

This shows that R4 → −4πu(x) as ϵ ↓ 0.



MATH 18.152 COURSE NOTES - CLASS MEETING # 9

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting # 9: Poisson's Formula, Harnack's Inequality, and Liouville's Theorem
1. Representation Formula for Solutions to Poisson's Equation
We now derive our main representation formula for solutions to Poisson's equation on a domain Ω.
Theorem 1.1 (Representation formula for solutions to the boundary value Poisson
equation). Let Ω be a domain with a smooth boundary, and assume that f ∈ C 2 (Ω) and g ∈ C (∂ Ω).
Then the unique solution u ∈ C 2 (Ω) ∩ C (Ω) to

(1.0.1) ∆u(x) = f (x), x ∈ Ω ⊂ Rn ,


u(x) = g(x), x ∈ ∂Ω,
can be represented as

(1.0.2) u(x) = ∫_Ω f(y) G(x, y) dⁿy + ∫_∂Ω g(σ) ∇_N̂(σ) G(x, σ) dσ,

where G(x, y) is the Green function for Ω; the quantity ∇_N̂(σ) G(x, σ) appearing in the boundary integral is called the Poisson kernel.
Proof. Applying the representation formula for u (Proposition 1.0.2 of the previous lecture), we have that

(1.0.3) u(x) = ∫ Φ(x − y)f (y) dn y − ∫ Φ(x − σ)∇N̂ (σ) u(σ ) dσ + ∫ g (σ )∇N̂ (σ) Φ(x − σ ) dσ.
Ω ∂Ω ∂Ω
Recall also that
(1.0.4) G(x, y) = Φ(x − y) − φ(x, y),
where

(1.0.5) ∆y φ(x, y ) = 0, x ∈ Ω,
and

(1.0.6) G(x, σ ) = 0 when x ∈ Ω and σ ∈ ∂Ω.


The expression (1.0.3) is not very useful since we don't know the value of ∇_{N̂(σ)} u(σ) along ∂Ω. To
fix this, we will use Green’s identity. Applying Green’s identity to the functions u(y) and φ(x, y),
and recalling that ∆y φ(x, y ) = 0 for each fixed x ∈ Ω, we have that

(1.0.7) 0 = ∫_Ω φ(x, y) f(y) dⁿy − ∫_{∂Ω} φ(x, σ) ∇_{N̂} u(σ) dσ + ∫_{∂Ω} g(σ) ∇_{N̂} φ(x, σ) dσ,

where we have used ∆u(y) = f(y), φ(x, σ) = Φ(x − σ) for σ ∈ ∂Ω, and u(σ) = g(σ) for σ ∈ ∂Ω.
Subtracting (1.0.7) from (1.0.3), and using (1.0.4), we deduce the formula (1.0.2).


2. Poisson’s Formula
Let's compute the Green function G(x, y) and the Poisson kernel P(x, σ) := ∇_{N̂(σ)} G(x, σ) from (1.0.2) in the case that Ω := B_R(0) ⊂ R³ is a ball of radius R centered at the origin. We'll use a technique
called the method of images that works for special domains.
Warning 2.0.1. Brace yourself for a bunch of tedious computations that at the end of the day will
lead to a very nice expression.
The basic idea is to hope that φ(x, y ) from the decomposition G(x, y ) = Φ(x − y ) − φ(x, y ), where
φ(x, y ) is viewed as a function of x that depends on the parameter y, is equal to the Newtonian
potential generated by some “imaginary charge” q placed at a point x∗ ∈ BRc (0). To ensure that
G(x, σ) = 0 when σ ∈ ∂B_R(0), q and x∗ have to be chosen so that along the boundary {y ∈ R³ ∣ ∣y∣ = R}, we have φ(x, y) = 1/(4π∣x − y∣). In a nutshell, we guess that

(2.0.8) G(x, y) = −1/(4π∣x − y∣) + q/(4π∣x∗ − y∣),

where the second term is our candidate for φ(x, y),

and we try to solve for q and x∗ so that G(x, y ) vanishes when ∣y ∣ = R.


Remark 2.0.1. Note that ∆_y [q/(4π∣x∗ − y∣)] = 0 for y ≠ x∗, which is one of the conditions necessary for constructing
G(x, y).
By the definition of G(x, y), we must have G(x, y) = 0 when ∣y∣ = R, which implies that

(2.0.9) 1/(4π∣x − y∣) = q/(4π∣x∗ − y∣).
Simple algebra then leads to

(2.0.10) ∣ x∗ − y ∣ 2 = q 2 ∣ x − y ∣ 2 .
When ∣y∣ = R, we use (2.0.10) to compute that
(2.0.11) ∣x∗ ∣2 − 2x∗ ⋅ y + R2 = ∣x∗ − y ∣2 = q 2 ∣x − y∣2 = q 2 (∣x∣2 − 2x ⋅ y + R2 ),
where ⋅ denotes the Euclidean dot product. Then performing simple algebra, it follows from (2.0.11)
that

(2.0.12) ∣x∗ ∣2 + R2 − q 2 (R2 + ∣x∣2 ) = 2y ⋅ (x∗ − q 2 x).


Now since the left-hand side of (2.0.12) does not depend on y, it must be the case that the
right-hand side is always 0. This implies that x∗ = q 2 x, and also leads to the equation

(2.0.13) q 4 ∣x∣2 − q 2 (R2 + ∣x∣2 ) + R2 = 0.


Solving (2.0.13) for q (and discarding the trivial root q = 1, which would place the image charge at x∗ = x), we finally have that

(2.0.14) q = R/∣x∣,
(2.0.15) x∗ = (R²/∣x∣²) x.
Therefore,

(2.0.16) φ(x, y) = (1/(4π)) · R/(∣x∣ ∣(R²/∣x∣²)x − y∣),   x ≠ 0,
(2.0.17) φ(0, y) = 1/(4πR),
where we took a limit as x → 0 in (2.0.16) to derive (2.0.17).
Next, using (2.0.8), we have
(2.0.18) G(x, y) = −1/(4π∣x − y∣) + (1/(4π)) · R/(∣x∣ ∣(R²/∣x∣²)x − y∣),   x ≠ 0,
(2.0.19) G(0, y) = −1/(4π∣y∣) + 1/(4πR).
For future use, we also compute that

(2.0.20) ∇_y G(x, y) = −(x − y)/(4π∣x − y∣³) + (1/(4π)) · (R/∣x∣) · (x∗ − y)/∣x∗ − y∣³.
Now when σ ∈ ∂BR (0), (2.0.10) and (2.0.14) imply that

(2.0.21) ∣x∗ − σ∣ = (R/∣x∣) ∣x − σ∣.
Therefore, using (2.0.20) and (2.0.21), we compute that

(2.0.22) ∇_σ G(x, σ) = −(x − σ)/(4π∣x − σ∣³) + (1/(4π)) · (∣x∣²/R²) · (x∗ − σ)/∣x − σ∣³
= −(x − σ)/(4π∣x − σ∣³) + (1/(4π)) · (x − (∣x∣²/R²)σ)/∣x − σ∣³
= (σ/(4π∣x − σ∣³)) (1 − ∣x∣²/R²).

Using (2.0.22) and the fact that N̂(σ) = (1/R)σ, we deduce

(2.0.23) ∇_{N̂(σ)} G(x, σ) := ∇_σ G(x, σ) ⋅ N̂(σ) = (R² − ∣x∣²)/(4πR) · (1/∣x − σ∣³).

Remark 2.0.2. If the ball were centered at the point p ∈ R³ instead of the origin, then the formula (2.0.23) would be replaced with

(2.0.24) ∇_{N̂(σ)} G(x, σ) := ∇_σ G(x, σ) ⋅ N̂(σ) = (R² − ∣x − p∣²)/(4πR) · (1/∣x − σ∣³).
Let’s summarize this by stating a lemma.
Lemma 2.0.1. The Green function for a ball B_R(p) ⊂ R³ is

(2.0.25a) G(x, y) = −1/(4π∣x − y∣) + (1/(4π)) · R/(∣x − p∣ ∣(R²/∣x − p∣²)(x − p) − (y − p)∣),   x ≠ p,
(2.0.25b) G(p, y) = −1/(4π∣y − p∣) + 1/(4πR).

Furthermore, if x ∈ B_R(p) and σ ∈ ∂B_R(p), then

(2.0.25c) ∇_{N̂(σ)} G(x, σ) = (R² − ∣x − p∣²)/(4πR) · (1/∣x − σ∣³).
We can now easily derive a representation formula for solutions to the Laplace equation on a ball.
Theorem 2.1 (Poisson's formula). Let B_R(p) ⊂ R³ be a ball of radius R centered at p = (p¹, p², p³), and let x = (x¹, x², x³) denote a point in R³. Let g ∈ C(∂B_R(p)). Then the unique solution u ∈ C²(B_R(p)) ∩ C(B̄_R(p)) of the PDE

(2.0.26) ∆u(x) = 0, x ∈ B_R(p),
         u(x) = g(x), x ∈ ∂B_R(p),

can be represented using the Poisson formula:

(2.0.27) u(x) = ((R² − ∣x − p∣²)/(4πR)) ∫_{∂B_R(p)} g(σ)/∣x − σ∣³ dσ.

Remark 2.0.3. In n dimensions, the formula (2.0.27) gets replaced with

(2.0.28) u(x) = ((R² − ∣x − p∣²)/(ωₙ R)) ∫_{∂B_R(p)} g(σ)/∣x − σ∣ⁿ dσ,

where as usual, ωₙ is the surface area of the unit ball in Rⁿ.


Proof. The identity (2.0.27) follows immediately from Theorem 1.1 and Lemma 2.0.1. 
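As a sanity check on Theorem 2.1, one can feed the boundary values of a known harmonic function into the Poisson integral and confirm that the integral reproduces it inside the ball. The Python sketch below (an illustrative aside; it uses the harmonic polynomial y¹² − y²², the ball B₁(0), and a trapezoid-rule surface quadrature):

```python
import numpy as np

R = 1.0

def h(y):
    """A harmonic polynomial on R^3; its restriction to the boundary plays g."""
    return y[0]**2 - y[1]**2

def poisson(x, n=400):
    """Evaluate u(x) by trapezoid quadrature of Poisson's formula (2.0.27),
    for the ball B_R(0) (i.e. p = 0)."""
    th = np.linspace(0.0, np.pi, n)
    ph = np.linspace(0.0, 2*np.pi, n)
    TH, PH = np.meshgrid(th, ph, indexing="ij")
    sigma = R * np.stack([np.sin(TH)*np.cos(PH),
                          np.sin(TH)*np.sin(PH),
                          np.cos(TH)])
    dist = np.linalg.norm(sigma - x.reshape(3, 1, 1), axis=0)
    vals = h(sigma) / dist**3 * R**2 * np.sin(TH)   # dsigma = R^2 sin(th) dth dph
    inner = (vals[:, 1:] + vals[:, :-1]).sum(axis=1) / 2 * (ph[1] - ph[0])
    surf = (inner[1:] + inner[:-1]).sum() / 2 * (th[1] - th[0])
    return (R**2 - x @ x) / (4*np.pi*R) * surf

# Inside the ball, the Poisson integral of the boundary values of a harmonic
# function must reproduce that function:
x = np.array([0.2, -0.1, 0.3])
assert abs(poisson(x) - h(x)) < 1e-3
```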

3. Harnack’s inequality
We will now use some of our tools to prove a famous inequality for harmonic functions. The
theorem provides some estimates that place limitations on how slow/fast harmonic functions are
allowed to grow.

Theorem 3.1 (Harnack's inequality). Let B_R(0) ⊂ Rⁿ be the ball of radius R centered at the origin, and let u ∈ C²(B_R(0)) ∩ C(B̄_R(0)) be the unique solution to (2.0.26). Assume that u is non-negative on B̄_R(0). Then for any x ∈ B_R(0), we have that

(3.0.29) (Rⁿ⁻²(R − ∣x∣)/(R + ∣x∣)ⁿ⁻¹) u(0) ≤ u(x) ≤ (Rⁿ⁻²(R + ∣x∣)/(R − ∣x∣)ⁿ⁻¹) u(0).
Proof. We’ll do the proof for n = 3. The basic idea is to combine the Poisson representation formula
with simple inequalities and the mean value property. By Theorem 2.1, we have that

(3.0.30) u(x) = ((R² − ∣x∣²)/(4πR)) ∫_{∂B_R(0)} g(σ)/∣x − σ∣³ dσ.

By the triangle inequality, for σ ∈ ∂B_R(0) (i.e. ∣σ∣ = R), we have that R − ∣x∣ ≤ ∣x − σ∣ ≤ R + ∣x∣. Applying the first inequality to (3.0.30), and using the non-negativity of g, we deduce that

(3.0.31) u(x) ≤ ((R² − ∣x∣²)/(R − ∣x∣)³) · (1/(4πR)) ∫_{∂B_R(0)} g(σ) dσ = ((R + ∣x∣)/(R − ∣x∣)²) · (1/(4πR)) ∫_{∂B_R(0)} g(σ) dσ.

Now recall that by the mean value property, we have that

(3.0.32) u(0) = (1/(4πR²)) ∫_{∂B_R(0)} g(σ) dσ.

Thus, combining (3.0.31) and (3.0.32), we have that

(3.0.33) u(x) ≤ (R(R + ∣x∣)/(R − ∣x∣)²) u(0),

which is the upper bound in (3.0.29) for n = 3. The lower bound can be proved similarly using the remaining triangle inequality.
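Harnack's inequality can also be observed numerically. In the sketch below (an illustrative aside; NumPy assumed), we take the positive harmonic function u(y) = 1/∣y − a∣ on B₁(0), with the singularity a placed outside the closed ball, and check the n = 3 bounds (3.0.29) at random interior points:

```python
import numpy as np

R = 1.0
a = np.array([2.0, 0.0, 0.0])          # singularity placed outside B_R(0)

def u(y):
    """u(y) = 1/|y - a|: harmonic and positive on the closed ball B_R(0)."""
    return 1.0 / np.linalg.norm(y - a)

rng = np.random.default_rng(1)
u0 = u(np.zeros(3))
for _ in range(200):
    x = rng.uniform(-0.5, 0.5, size=3)       # interior points, |x| < R
    r = np.linalg.norm(x)
    lo = R * (R - r) / (R + r)**2 * u0       # Harnack lower bound, n = 3
    hi = R * (R + r) / (R - r)**2 * u0       # Harnack upper bound, n = 3
    assert lo <= u(x) <= hi                  # (3.0.29) holds at every sample
```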

We now prove a famous consequence of Harnack’s inequality. The statement is also often proved
in introductory courses in complex analysis, and it plays a central role in some proofs of the
fundamental theorem of algebra.
Corollary 3.0.2 (Liouville’s theorem). Suppose that u ∈ C 2 (Rn ) is harmonic on Rn . Assume
that there exists a constant M such that u(x) ≥ M for all x ∈ Rn , or such that u(x) ≤ M for all
x ∈ Rn . Then u is a constant-valued function.
Proof. We first consider the case that u(x) ≥ M. Let v := u + ∣M∣. Observe that v ≥ 0 is harmonic
and verifies the hypotheses of Theorem 3.1. Thus, by (3.0.29), if x ∈ Rn and R is sufficiently large,
we have that

(3.0.34) (Rⁿ⁻²(R − ∣x∣)/(R + ∣x∣)ⁿ⁻¹) v(0) ≤ v(x) ≤ (Rⁿ⁻²(R + ∣x∣)/(R − ∣x∣)ⁿ⁻¹) v(0).
Allowing R → ∞ in (3.0.34), we conclude that v (x) = v (0). Thus, v is a constant-valued function
(and therefore u is too).

To handle the case u(x) ≤ M, we simply consider the function w(x) := −u(x) + ∣M∣ in place of
v (x), and we argue as above.

MATH 18.152 COURSE NOTES - CLASS MEETING # 10

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting # 10: Introduction to the Wave Equation


1. What is the wave equation?
The standard wave equation for a function u(t, x) (where t ∈ R, x ∈ Rn ) is

(1.0.1) −(1/c²) ∂t²u + ∆u = 0.

(1.0.1) is second order and linear. The constant c > 0 is called the speed (this terminology will be justified as our course progresses), and it has dimensions of length/time. Note that heuristically speaking,
if we let c → ∞, then (1.0.1) becomes Laplace’s equation. However, as we will see, in order to have
a well-posed problem for (1.0.1), we will need to specify Cauchy data (i.e. initial data) for u and also
∂t u. The fact that we need to specify Cauchy data is in stark contrast to Laplace’s equation, but
is analogous to the heat equation. The fact that we need to specify two pieces of Cauchy data is
connected to the fact that the wave equation is second order in time.

2. Where does it come from?


Equation (1.0.1) arises in an incredible variety of physical contexts, especially those involving
disturbances that propagate at a finite speed. Let’s discuss how the wave equation arises as an
approximation to the equations of fluid mechanics. For simplicity, let’s only discuss the case of 1
spatial dimension. The equations of fluid mechanics, which are known as the Euler equations, take
the following form in 1 + 1 dimensions:

(2.0.2a) ∂t ρ + ∂x (ρv ) = 0,
(2.0.2b) ∂t (ρv ) + ∂x (ρv 2 ) = −∂x p,
where ρ(t, x) is the fluid mass density, v (t, x) is the fluid velocity, and p(t, x) is the pressure.
Equation (2.0.2a) implies the conservation of mass, and equation (2.0.2b) is Newton’s second law:
the rate of change of fluid momentum is equal to the force, which is created by the pressure gradient
(i.e., the −∂x p term). The Euler equations are highly nonlinear, and we are very far from obtaining
a full understanding of how their solutions behave in general.
A fundamental aspect of fluid mechanics is that the system is not closed because there are not
enough equations. A common method of achieving closure is by choosing an equation of state, which
is a relationship between the fluid variables. This relationship is often empirically determined. A
commonly studied equation of state is

(2.0.3a) p = Kργ
where γ > 1 and K > 0 are constants. For future use, we note that under (2.0.3a), we have

(2.0.3b) ∂x p = Kγργ −1 ∂x ρ,
(2.0.3c) ∂x²p = Kγρ^{γ−1} ∂x²ρ + Kγ(γ − 1)ρ^{γ−2} (∂x ρ)².
Also for future use, we differentiate (2.0.2a) with respect to t and (2.0.2b) with respect to x to
deduce that

(2.0.4a) ∂t²ρ + ρ∂t∂xv + v∂t∂xρ + ∂tρ ∂xv + ∂tv ∂xρ = 0,
(2.0.4b) ρ∂t∂xv + v∂t∂xρ + ∂tρ ∂xv + ∂tv ∂xρ + v²∂x²ρ + 2ρv∂x²v + 2ρ(∂xv)² + 4v∂xρ ∂xv = −∂x²p.
The theory of acoustics is based on linearizing (i.e. throwing away the nonlinear terms) the
equations (2.0.4a) - (2.0.4b) around the static solutions ρ = ρ̄ = const > 0, v = 0, p = p̄ = const > 0.
These static solutions describe a fluid at rest. Let’s assume that we make a small perturbation of
this solution, i.e., that v is small, and that

(2.0.5) ρ = ρ̄ + δ,
where δ (t, x) is a small function.
Using the expansion (2.0.5), we now throw away (with the help of (2.0.3c)) all of the quadratic and
higher-order small terms from (2.0.4a) - (2.0.4b) to obtain the following approximating system
(the quantities that are assumed to be small are v, δ, and all of their partial derivatives):

(2.0.6a) ∂t²δ + ρ̄ ∂t∂xv = 0,
(2.0.6b) ρ̄ ∂t∂xv = −Kγρ̄^{γ−1} ∂x²δ.
Comparing (2.0.6a) and (2.0.6b), we see that δ verifies the following approximating equation

(2.0.7) −∂t2 δ + Kγρ̄γ−1 ∂x2 δ = 0.


Equation (2.0.7) is a wave equation for the perturbation δ (t, x)! It models the propagation of sound
waves. This is the linear theory of acoustics! Note that the speed associated to the equation (2.0.7)
depends on the background density ρ̄:

(2.0.8) c = √(Kγρ̄^{γ−1}).

When γ > 1, a higher background density implies a faster speed of sound propagation.
Remark 2.0.1. For air under “normal” atmospheric conditions, γ = 1.4 is a pretty good model.
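To connect (2.0.8) with a familiar number: using p̄ = Kρ̄^γ, the sound speed can be rewritten as c = √(γp̄/ρ̄), and plugging in standard sea-level values for air (illustrative numbers, not taken from the notes) recovers the usual speed of sound:

```python
import math

# Illustrative sea-level values for air (assumed, not from the notes):
gamma = 1.4          # adiabatic index, as in Remark 2.0.1
p_bar = 101325.0     # background pressure [Pa]
rho_bar = 1.225      # background density [kg/m^3]

# Since p_bar = K * rho_bar**gamma, the speed (2.0.8) can be rewritten
# without ever knowing K:
#   c = sqrt(K * gamma * rho_bar**(gamma-1)) = sqrt(gamma * p_bar / rho_bar)
c = math.sqrt(gamma * p_bar / rho_bar)
assert 330.0 < c < 350.0     # roughly 340 m/s, the familiar speed of sound
```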

3. Some Well-Posed Problems


Recall that well-posed PDEs have three important properties:
● Given suitable data, a solution exists.
● The solution is unique.
● The solution depends continuously on the data.

Perhaps the most often studied well-posed problem for the wave equation is the global Cauchy
problem in 1 + n spacetime dimensions:

(3.0.9a) −∂t2 u(t, x) + ∆x u(t, x) = 0, (t, x) ∈ R × Rn ,


(3.0.9b) u(0, x) = f (x), x ∈ Rn ,
(3.0.9c) ∂t u(0, x) = g (x), x ∈ Rn .
We now mention some additional well-posed problems in the case of 1 + 1 dimensions. We assume
that u verifies the wave equation for (t, x) ∈ (−∞, ∞) × [0, L] and that Cauchy data is given:

(3.0.10a) −∂t2 u(t, x) + ∂x2 u(t, x) = 0, (t, x) ∈ R × [0, L],


(3.0.10b) u(0, x) = f (x), x ∈ [0, L],
(3.0.10c) ∂t u(0, x) = g (x), x ∈ [0, L].
Unlike in the case of (3.0.9a) - (3.0.9c), because of the finiteness of the interval [0, L], we need to supplement (3.0.10a) - (3.0.10c) with additional conditions in order to generate a well-posed
problem. Here are some well-known ways of generating a well-posed problem; they are essentially
the same as in the case of the heat equation.
(1) Dirichlet data: also specifying u(t, 0) = a(t), u(t, L) = b(t) for t > 0
(2) Neumann data: also specifying ∂x u(t, 0) = a(t), ∂x u(t, L) = b(t) for t > 0
(3) Robin data: also specifying ∂x u(t, 0) = ku(t, 0), ∂x u(t, L) = −ku(t, L) for t > 0, where k > 0
is a constant
(4) Mixed data: e.g. one kind of data at x = 0, and a different kind at x = L

4. 1 + 1 spacetime dimensions
Let’s consider the wave equation with speed c in 1 + 1 dimensions:

(4.0.11) −c−2 ∂τ2 u(τ, x) + ∂x2 u(τ, x) = 0.


Let's first note the following fact: if f, g are any twice-differentiable functions, then u(τ, x) := f(x − cτ) and u(τ, x) := g(x + cτ) solve (4.0.11). The first is called a right-traveling wave, and the second is called a left-traveling wave. To visualize wave propagation in 1 + 1 dimensions, you can imagine that the graphs of f(⋅) and g(⋅) are translated to the right/left at speed c. This gives a good idea
of what wave motion looks like in 1 + 1 dimensions. In particular, the amplitudes of the traveling
wave solutions are preserved in time. As we will see, wave propagation in higher dimensions is quite
different. In higher dimensions, the amplitudes decay in time due to the spreading out of the waves.
You will study the case of 1 + 3 spacetime dimensions in one of your homework exercises; you will show
that in this case, the amplitudes decay at a rate of order t−1 as t → ∞.
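One can confirm numerically that a right-traveling profile solves (4.0.11): the sketch below (an illustrative aside; it approximates the wave operator by central second differences) applies −c⁻²∂τ² + ∂x² to u(τ, x) = f(x − cτ) for a Gaussian profile and checks that the residual is negligible:

```python
import numpy as np

c = 2.0
f = lambda z: np.exp(-z**2)          # a smooth profile (arbitrary choice)

def u(tau, x):
    """Right-traveling wave u(tau, x) = f(x - c*tau)."""
    return f(x - c*tau)

h = 1e-4
tau0, x0 = 0.7, 0.3
# Central second differences approximate the tau- and x-second derivatives:
utt = (u(tau0 + h, x0) - 2*u(tau0, x0) + u(tau0 - h, x0)) / h**2
uxx = (u(tau0, x0 + h) - 2*u(tau0, x0) + u(tau0, x0 - h)) / h**2
residual = -utt / c**2 + uxx         # the wave operator of (4.0.11) applied to u
assert abs(residual) < 1e-5          # u solves the speed-c wave equation
```

The same check with u(τ, x) = g(x + cτ) works verbatim for left-traveling waves.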
Remark 4.0.2. Not all wave solutions in 1 + 1 dimensions are traveling waves; see Theorem 4.1.
By making the change of variables t := cτ, we can transform equation (4.0.11) into a wave equation with speed equal to 1:

(4.0.12) −∂t2 u(t, x) + ∂x2 u(t, x) = 0.



This makes our life a bit easier. Let’s now consider the global Cauchy problem by supplementing
(4.0.12) with the initial data

(4.0.13) u(0, x) = f (x), ∂t u(0, x) = g(x).


As we will see, (4.0.12) + (4.0.13) has a unique solution that has a nice representation.
Theorem 4.1 (d’Alembert’s formula). Assume that f ∈ C 2 (R) and g ∈ C 1 (R). Then the
unique solution u(t, x) to (4.0.12) + (4.0.13) satisfies u ∈ C 2 ([0, ∞) × R) and can be represented by
d’Alembert’s formula:

(4.0.14) u(t, x) = (1/2)(f(x + t) + f(x − t)) + (1/2) ∫_{z=x−t}^{z=x+t} g(z) dz.
Remark 4.0.3. For the wave equation −c⁻²∂t²u + ∂x²u = 0, formula (4.0.14) is replaced with

(4.0.15) u(t, x) = (1/2)(f(x + ct) + f(x − ct)) + (1/(2c)) ∫_{z=x−ct}^{z=x+ct} g(z) dz.
Remark 4.0.4. Equation (4.0.14) illustrates the finite speed of propagation property associated
to the wave equation. More precisely, the value of the solution at (t, x) is only influenced by the
“initial data interval” {(0, y) ∣ x − t ≤ y ≤ x + t}; changes to the initial data (4.0.13) outside of this
interval have no effect on the solution at (t, x). We will reexamine this property later in the course
with the help of energy methods.
Proof. To derive (4.0.14), it is convenient to introduce a change of variables called null coordinates:

(4.0.16) q := t − x,
(4.0.17) s := t + x.
The chain rule implies the following relationships between partial derivatives:

(4.0.18) ∂q = (1/2)(∂t − ∂x),   ∂s = (1/2)(∂t + ∂x),
(4.0.19) ∂t = ∂q + ∂s,   ∂x = ∂s − ∂q.
The operators ∂q and ∂s can be viewed as directional derivatives in the (t, x) Cartesian spacetime directions (1/2)(1, −1) and (1/2)(1, 1) respectively. These null directions, which are sometimes called
characteristic directions, are extremely important. In the future, we will discuss the notion of a
characteristic direction in a general setting.
It is now easy to see that (4.0.12) takes the following form in null coordinates:

(4.0.20) ∂s ∂q u = 0.
Integrating (4.0.20) with respect to s, we have that

(4.0.21) ∂q u = H(q ),
where H is a function of q.

Note that the value of q is the same for the pair of Cartesian spacetime points (τ, y ) and (0, y − τ ).
Thus, using the initial conditions (4.0.13), we have that

(4.0.22) ∂q u(τ, y) = ∂q u(0, y − τ) = ((1/2)(∂t − ∂x)u)(0, y − τ) = (1/2)(g(y − τ) − f′(y − τ)).
Similarly, interchanging the partial derivatives in (4.0.20) to deduce ∂q ∂s u = 0, we conclude that

(4.0.23) ∂s u(τ, y) = (1/2)(g(y + τ) + f′(y + τ)).
Adding (4.0.22) and (4.0.23), and using (4.0.19), we have that

(4.0.24) ∂t u(t, x) = (1/2)(f′(x + t) − f′(x − t) + g(x + t) + g(x − t)).
Integrating (4.0.24) in time with respect to t from 0 to t, and again using the initial conditions
(4.0.13), we have that

(4.0.25) u(t, x) = u(0, x) + (1/2)(f(x + t) − f(x) + f(x − t) − f(x)) + (1/2) ∫_{τ=0}^{t} (g(x + τ) + g(x − τ)) dτ
= (1/2)(f(x + t) + f(x − t)) + (1/2) ∫_{z=x−t}^{z=x+t} g(z) dz,

where u(0, x) = f(x), and where to derive the last equality, we made the integration change of variables z = x + τ for the g(x + τ) term, and the change of variables z = x − τ for the g(x − τ) term. We have thus derived (4.0.14).
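As a small check of d'Alembert's formula, take f = sin and g = cos; the formula then collapses (via the antiderivative of cos) to the left-traveling wave sin(x + t). The Python sketch below (an illustrative aside; the g-integral is done by a hand-rolled trapezoid rule) verifies this:

```python
import numpy as np

f = np.sin    # initial displacement u(0, x)
g = np.cos    # initial velocity (d/dt)u(0, x)

def trapz(y, x):
    """Composite trapezoid rule."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def dalembert(t, x, n=2001):
    """Evaluate (4.0.14), computing the g-integral by quadrature."""
    z = np.linspace(x - t, x + t, n)
    return 0.5*(f(x + t) + f(x - t)) + 0.5*trapz(g(z), z)

# With this data, trig identities collapse (4.0.14) to sin(x + t),
# which we use as the exact answer:
for t, x in [(0.0, 0.4), (1.3, -0.7), (2.5, 1.1)]:
    assert abs(dalembert(t, x) - np.sin(x + t)) < 1e-4
```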

Without a lot of additional effort, we can extend Theorem 4.1 to apply to the following initial
+ boundary value PDE in 1 + 1 dimensions; the result is stated and proved in the next corollary.
This PDE would arise in the study of e.g. the following idealized problem: a description of the
propagation of waves on an infinitely long vibrating string with one end fixed. Furthermore, the
corollary will later play a role in our extension of Theorem 4.1 to the case of 1 + 3 dimensions.
Corollary 4.0.1. Let f ∈ C²([0, ∞)), g ∈ C¹([0, ∞)), and assume that f(0) = g(0) = 0. Then the unique solution to the following 1 + 1 dimensional initial + boundary value problem

(4.0.26a) −∂t2 u(t, x) + ∂x2 u(t, x) = 0, (t, x) ∈ [0, ∞) × (0, ∞),


(4.0.26b) u(t, 0) = 0, t ∈ [0, ∞),
(4.0.26c) u(0, x) = f (x), x ∈ (0, ∞),
(4.0.26d) ∂t u(0, x) = g (x), x ∈ (0, ∞)
satisfies u ∈ C 2 ([0, ∞) × [0, ∞)). Furthermore, it can be represented as


(4.0.27) u(t, x) = (1/2)(f(x + t) + f(x − t)) + (1/2) ∫_{z=∣x−t∣}^{z=x+t} g(z) dz, if 0 ≤ t ≤ x,
         u(t, x) = (1/2)(f(x + t) − f(t − x)) + (1/2) ∫_{z=∣x−t∣}^{z=x+t} g(z) dz, if 0 ≤ x ≤ t.


Proof. The idea is that if we extend u to be odd in x, then we can reduce the problem to the case
of Theorem 4.1. Motivated by this, we define

(4.0.28) ũ(t, x) := u(t, x) if t ≥ 0, x ≥ 0, and ũ(t, x) := −u(t, −x) if t ≥ 0, x ≤ 0,
(4.0.29) f̃(x) := f(x) if x ≥ 0, and f̃(x) := −f(−x) if x ≤ 0,
(4.0.30) g̃(x) := g(x) if x ≥ 0, and g̃(x) := −g(−x) if x ≤ 0.
Since u(t, x) solves (4.0.26a), it follows that ũ(t, x) is a solution to the wave equation (4.0.12) for (t, x) ∈ R × R with initial data ũ(0, x) = f̃(x), ∂t ũ(0, x) = g̃(x). Thus, by (4.0.14), we have that

(4.0.31) ũ(t, x) = (1/2)(f̃(x + t) + f̃(x − t)) + (1/2) ∫_{z=x−t}^{z=x+t} g̃(z) dz.
The expression (4.0.27) now easily follows from considering (4.0.31) separately in the spacetime regions {(t, x) ∣ 0 ≤ t ≤ x} and {(t, x) ∣ 0 ≤ x ≤ t}, and from the definitions (4.0.28) - (4.0.30); note that in the case {(t, x) ∣ 0 ≤ x ≤ t}, since g̃ is odd, the part of the integral from x − t to t − x cancels, and thus the only net contribution comes from the integration interval [∣x − t∣, x + t].
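The odd-reflection argument can be tested numerically: the sketch below (an illustrative aside; NumPy assumed, with the data f(x) = x²e^{−x²}, g = sin chosen to satisfy f(0) = g(0) = 0, and f deliberately not odd so the reflection does real work) compares the half-line formula (4.0.27) against the whole-line formula (4.0.31) applied to the odd extensions, and checks the boundary condition u(t, 0) = 0:

```python
import numpy as np

f = lambda x: x**2 * np.exp(-x**2)   # data on [0, inf) with f(0) = 0
g = np.sin                           # data on [0, inf) with g(0) = 0

def trapz(y, x):
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

ft = lambda z: np.where(z >= 0, f(z), -f(-z))   # odd extension, as in (4.0.29)
gt = lambda z: np.where(z >= 0, g(z), -g(-z))   # odd extension, as in (4.0.30)

def u_reflect(t, x, n=4001):
    """Whole-line d'Alembert formula (4.0.31) applied to the extensions."""
    z = np.linspace(x - t, x + t, n)
    return 0.5*(ft(x + t) + ft(x - t)) + 0.5*trapz(gt(z), z)

def u_halfline(t, x, n=4001):
    """The piecewise formula (4.0.27) for the half-line problem."""
    z = np.linspace(abs(x - t), x + t, n)
    I = 0.5*trapz(g(z), z)
    if t <= x:
        return 0.5*(f(x + t) + f(x - t)) + I
    return 0.5*(f(x + t) - f(t - x)) + I

for t, x in [(0.5, 2.0), (2.0, 0.5), (1.0, 1.0)]:
    assert abs(u_reflect(t, x) - u_halfline(t, x)) < 1e-5
assert abs(u_reflect(1.5, 0.0)) < 1e-12     # boundary condition u(t, 0) = 0
```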

MATH 18.152 COURSE NOTES - CLASS MEETING # 11

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting # 11: The Method of Spherical Means

1. 1 + 3 spacetime dimensions and the method of spherical means


We would now like to derive an analog of d’Alembert’s formula in the physically relevant case
of 1 + 3 dimensions. As we will see, the analogous formula, known as Kirchhoff’s formula, can be
derived through the following steps.
● Given a solution u(t, x) to the 1 + 3 dimensional wave equation, we will define a spherical
average of u centered at x. The average will depend on the averaging radius r.
● For fixed x, we will show that a slight modification of the average will solve the 1 + 1 dimen-
sional wave equation in the unknowns (t, r). With the help of our corollary to d’Alembert’s
formula, we will be able to find an explicit formula for this modified function.
● We will take a limit as the averaging radius goes to 0 in order to recover an expression for u(t, x).
This procedure is known as the method of spherical means. The final result will be stated and
proved as a theorem. Before proving the theorem, we will develop some preliminary estimates. We
will use spherical coordinates (r, θ, φ) ∈ [0, ∞) × [0, π ) × [0, 2π ) on R3 . Recall that if the spherical
coordinates are centered at the Cartesian point (p1 , p2 , p3 ), then the standard Cartesian coordinates
(x1 , x2 , x3 ) are connected to spherical coordinates by

(1.0.1a) x1 = p1 + r sin θ cos φ,


(1.0.1b) x2 = p2 + r sin θ sin φ,
(1.0.1c) x3 = p3 + r cos θ.

Also recall that the surface integration measure associated to ∂B_r(0) is dσ = r² dω, where dω := sin θ dθ dφ.
Here, ω represents the angular variables. We will abuse notation by using the symbol ω to denote
both the angular coordinates (θ, φ), and alternatively as the corresponding point (sin θ cos φ, sin θ sin φ, cos θ) ∈
∂B1 (0).

Proposition 1.0.1 (Spherical averages). Let u(t, x) ∈ C 2 ([0, ∞) × R3 ) be a solution to the 1 + 3


dimensional global Cauchy problem

(1.0.2a) −∂t2 u(t, x) + ∆u(t, x) = 0, (t, x) ∈ [0, ∞) × R3 ,


(1.0.2b) u(0, x) = f (x), x ∈ R3 ,
(1.0.2c) ∂t u(0, x) = g (x), x ∈ R3 .

For each r > 0, define the spherically averaged quantities

(1.0.3a) U(t, r; x) := (1/(4πr²)) ∫_{∂B_r(x)} u(t, σ) dσ = (1/(4π)) ∫_{ω∈∂B₁(0)} u(t, x + rω) dω,
(1.0.3b) F(r; x) := (1/(4πr²)) ∫_{∂B_r(x)} f(σ) dσ,
(1.0.3c) G(r; x) := (1/(4πr²)) ∫_{∂B_r(x)} g(σ) dσ,

and their related modifications

(1.0.4a) Ũ(t, r; x) := rU(t, r; x),
(1.0.4b) F̃(r; x) := rF(r; x),
(1.0.4c) G̃(r; x) := rG(r; x).

Then Ũ(t, r; x) ∈ C²([0, ∞) × [0, ∞)) is a solution to the following initial + boundary-value problem for the one-dimensional wave equation:

(1.0.5a) −∂t²Ũ(t, r; x) + ∂r²Ũ(t, r; x) = 0,   (t, r) ∈ [0, ∞) × [0, ∞),
(1.0.5b) Ũ(t, 0; x) = 0,   t ∈ [0, ∞),
(1.0.5c) Ũ(0, r; x) = F̃(r; x),   r ∈ (0, ∞),
(1.0.5d) ∂t Ũ(0, r; x) = G̃(r; x),   r ∈ (0, ∞).

Furthermore,

(1.0.6) lim_{r→0⁺} U(t, r; x) = u(t, x).

Proof. Differentiating under the integral on the right-hand side of (1.0.3a), using the chain rule relation ∂r[u(t, x + rω)] dω = (∇u)(t, x + rω) ⋅ ω dω = (1/r²) ∇_{N̂(σ)} u(t, σ) dσ (where N̂(σ) is the outward unit normal to ∂B_r(x)), and applying the divergence theorem, we compute that

(1.0.7) ∂r U = (1/(4πr²)) ∫_{∂B_r(x)} ∇_{N̂(σ)} u(t, σ) dσ = (1/(4πr²)) ∫_{B_r(x)} ∆_y u(t, y) d³y.
We now derive a version of the fundamental theorem of calculus that will be used in our analysis below. If h is a continuous function on R³, then using spherical coordinates (ρ, ω) centered at the fixed point x, we have

(1.0.8) ∂r ∫_{B_r(x)} h(y) d³y = ∂r ∫_{ρ=0}^{r} ∫_{ω∈∂B₁(0)} ρ² h(x + ρω) dω dρ = ∫_{ω∈∂B₁(0)} r² h(x + rω) dω = ∫_{∂B_r(x)} h(σ) dσ.

Multiplying both sides of (1.0.7) by r² and applying (1.0.8), we have that

(1.0.9) ∂r(r² ∂r U) = (1/(4π)) ∂r ∫_{B_r(x)} ∆_y u(t, y) d³y = (1/(4π)) ∫_{∂B_r(x)} ∆u(t, σ) dσ.

Differentiating under the integral in (1.0.3a) and using (1.0.2a), we have that

(1.0.10) ∂t²U(t, r; x) = (1/(4πr²)) ∫_{∂B_r(x)} ∂t²u(t, σ) dσ = (1/(4πr²)) ∫_{∂B_r(x)} ∆u(t, σ) dσ.

Comparing (1.0.9) and (1.0.10), we see that

(1.0.11) ∂t²U(t, r; x) = (1/r²) ∂r(r² ∂r U) = ∂r²U(t, r; x) + (2/r) ∂r U(t, r; x).

Multiplying both sides of (1.0.11) by r and performing simple calculations, we see that

(1.0.12) ∂t²[rU(t, r; x)] = ∂r²[rU(t, r; x)].


We have thus shown that the PDE (1.0.5a) is verified by Ũ := rU.
Using (1.0.2b) - (1.0.2c) and definitions (1.0.3b) - (1.0.3c), it is easy to check that the initial
conditions (1.0.5c) - (1.0.5d) hold. Note that you will have to differentiate under the integral in
(1.0.3a) in order to show that (1.0.5d) holds.
The limit (1.0.6) follows easily from the right-hand side of (1.0.3a), since u is continuous.
Finally, the boundary condition (1.0.5b) then follows easily from multiplying (1.0.6) by r before
taking the limit r → 0+ .
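Proposition 1.0.1 can be illustrated numerically with the explicit solution u(t, y) = cos(t)cos(y¹) of the 3D wave equation. Its spherical mean about x = 0 is U(t, r; 0) = cos(t)·sin(r)/r, so Ũ = rU = cos(t)sin(r), which indeed solves the 1D wave equation and vanishes at r = 0. The sketch below (an illustrative aside; trapezoid quadrature over the unit sphere) checks this:

```python
import numpy as np

def u(t, y):
    """u(t, y) = cos(t)*cos(y1): an explicit solution of -u_tt + Lap(u) = 0."""
    return np.cos(t) * np.cos(y[0])

def spherical_mean(t, r, x, n=400):
    """U(t, r; x) from (1.0.3a), by trapezoid quadrature over the unit sphere."""
    th = np.linspace(0.0, np.pi, n)
    ph = np.linspace(0.0, 2*np.pi, n)
    TH, PH = np.meshgrid(th, ph, indexing="ij")
    omega = np.stack([np.sin(TH)*np.cos(PH),
                      np.sin(TH)*np.sin(PH),
                      np.cos(TH)])
    vals = u(t, x.reshape(3, 1, 1) + r*omega) * np.sin(TH) / (4*np.pi)
    inner = (vals[:, 1:] + vals[:, :-1]).sum(axis=1) / 2 * (ph[1] - ph[0])
    return (inner[1:] + inner[:-1]).sum() / 2 * (th[1] - th[0])

# About x = 0, the mean of cos(y1) over the sphere of radius r is sin(r)/r,
# so U~ = r*U = cos(t)*sin(r): a solution of the 1D wave equation that
# vanishes at r = 0, exactly as the proposition predicts.
x0 = np.zeros(3)
for t, r in [(0.3, 0.5), (1.2, 1.7)]:
    assert abs(r*spherical_mean(t, r, x0) - np.cos(t)*np.sin(r)) < 1e-4
```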

Corollary 1.0.2 (Representation formula for Ũ(t, r; x)). Under the assumptions of Proposition 1.0.1, for 0 ≤ r ≤ t, we have that

(1.0.13) Ũ(t, r; x) := rU(t, r; x) = (1/2)(F̃(t + r; x) − F̃(t − r; x)) + (1/2) ∫_{ρ=t−r}^{ρ=t+r} G̃(ρ; x) dρ.

Proof. (1.0.13) follows from (1.0.5a) - (1.0.5d) and the Corollary to d'Alembert's formula. □
MATH 18.152 COURSE NOTES - CLASS MEETING # 12

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting # 12: Kirchhoff’s Formula and Minkowskian Geometry


1. Kirchhoff’s Formula
We are now ready to derive Kirchhoff’s famous formula.
Theorem 1.1 (Kirchhoff ’s formula). Assume that f ∈ C 3 (R3 ) and g ∈ C 2 (R3 ). Then the unique
solution u(t, x) to the global Cauchy problem

(1.0.1a) −∂t2 u(t, x) + ∆u(t, x) = 0, (t, x) ∈ [0, ∞) × R3 ,


(1.0.1b) u(0, x) = f (x), x ∈ R3 ,
(1.0.1c) ∂t u(0, x) = g (x), x ∈ R3
in 1 + 3 dimensions satisfies u ∈ C 2 ([0, ∞) × R3 ) and can be represented as follows:

(1.0.2) u(t, x) = (1/(4πt²)) ∫_{∂B_t(x)} f(σ) dσ + (1/(4πt)) ∫_{∂B_t(x)} ∇_{N̂(σ)} f(σ) dσ + (1/(4πt)) ∫_{∂B_t(x)} g(σ) dσ.
Remark 1.0.1. Equation (1.0.2) again illustrates the finite speed of propagation property associ-
ated to the linear wave equation. More precisely, the behavior of the solution at the point (t, x)
is only affected by the initial data in the region {(0, y ) ∣ ∣x − y ∣ = t}. The fact that this region is
the boundary of a ball rather than a solid ball is known as the sharp Huygens principle. It can be
shown that the sharp version of this principle holds in 1 + n dimensions when n ≥ 3 is odd, but not
when n = 1 or when n is even. However, even when the sharp version fails, there still is a finite
speed of propagation property; the solution in these cases depends on the data in the solid ball.
Remark 1.0.2. Note that in Theorem 1.1, we can only guarantee that the solution is one degree less
differentiable than the data. This contrasts to d’Alembert’s formula, in which the 1 + 1 dimensional
solution was shown to have the same degree of differentiability as the data.
Proof. Using the Representation formula for Ũ(t, r; x) corollary, the differentiability of F̃, and the continuity of G̃, we have that

(1.0.3) u(t, x) = lim_{r→0⁺} U(t, r; x) = lim_{r→0⁺} Ũ(t, r; x)/r
= lim_{r→0⁺} [ (F̃(t + r; x) − F̃(t − r; x))/(2r) + (1/(2r)) ∫_{ρ=t−r}^{ρ=t+r} G̃(ρ; x) dρ ]
= ∂t F̃(t; x) + G̃(t; x).

The ∂t F̃(t; x) term on the right-hand side of (1.0.3) arises from the definition of a partial derivative,
while to derive the G̃(t; x) term, we applied the fundamental theorem of calculus (think about both

of these claims on your own!). By the definitions of F̃ and G̃ (see the Spherical averages Proposition), it therefore follows from (1.0.3) that

(1.0.4) u(t, x) = ∂t (t · (1/(4πt²)) ∫_{∂B_t(x)} f(σ) dσ) + t · (1/(4πt²)) ∫_{∂B_t(x)} g(σ) dσ.
Differentiating under the integral sign, using the chain rule relation ∂t[f(x + tω)] = (∇f)(x + tω) ⋅ ω = ∇_{N̂(x+tω)} f(x + tω) (where N̂ is the unit outward normal to ∂B_t(x)), and recalling that dσ = t² dω on ∂B_t(x), we have that

(1.0.5) t ∂t ((1/(4πt²)) ∫_{∂B_t(x)} f(σ) dσ) = t ∂t ((1/(4π)) ∫_{∂B₁(0)} f(x + tω) dω) = (t/(4π)) ∫_{∂B₁(0)} ∂t[f(x + tω)] dω
= (t/(4π)) ∫_{∂B₁(0)} ∇_{N̂(x+tω)} f(x + tω) dω
= (1/(4πt)) ∫_{∂B_t(x)} ∇_{N̂(σ)} f(σ) dσ.
Combining (1.0.4) and (1.0.5), we have that

(1.0.6) u(t, x) = (1/(4πt²)) ∫_{∂B_t(x)} f(σ) dσ + (1/(4πt)) ∫_{∂B_t(x)} ∇_{N̂(σ)} f(σ) dσ + (1/(4πt)) ∫_{∂B_t(x)} g(σ) dσ.
We have thus shown (1.0.2).
The fact that u ∈ C 2 ([0, ∞) × R3 ) follows from differentiating the integrals in the formula (1.0.2)
and using the hypotheses on f and g.

Exercise 1.0.1. Show that (1.0.3) holds.
Exercise 1.0.2. Verify that u ∈ C 2 ([0, ∞) × R3 ), as was claimed at the end of the proof above.
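As a numerical illustration of Kirchhoff's formula, take f(y) = cos(y¹) and g = 0, for which the exact solution is u(t, x) = cos(t)cos(x¹). The sketch below (an illustrative aside; it evaluates ∂t[t · F(t; x)] by a central difference of a quadrature-computed spherical mean) recovers u(t, 0) = cos(t):

```python
import numpy as np

def f(y):
    """Initial datum f(y) = cos(y1); with g = 0 the exact solution of the
    global Cauchy problem is u(t, x) = cos(t)*cos(x1)."""
    return np.cos(y[0])

def mean_on_sphere(x, r, n=300):
    """Average of f over the sphere of radius r about x, by trapezoid rule."""
    th = np.linspace(0.0, np.pi, n)
    ph = np.linspace(0.0, 2*np.pi, n)
    TH, PH = np.meshgrid(th, ph, indexing="ij")
    omega = np.stack([np.sin(TH)*np.cos(PH),
                      np.sin(TH)*np.sin(PH),
                      np.cos(TH)])
    vals = f(x.reshape(3, 1, 1) + r*omega) * np.sin(TH) / (4*np.pi)
    inner = (vals[:, 1:] + vals[:, :-1]).sum(axis=1) / 2 * (ph[1] - ph[0])
    return (inner[1:] + inner[:-1]).sum() / 2 * (th[1] - th[0])

def kirchhoff(t, x, h=1e-3):
    """With g = 0, Kirchhoff's formula (1.0.2) reduces to d/dt [t*F(t; x)];
    the t-derivative is approximated by a central difference."""
    return ((t+h)*mean_on_sphere(x, t+h) - (t-h)*mean_on_sphere(x, t-h)) / (2*h)

x0 = np.zeros(3)
for t in (0.5, 1.0, 2.0):
    assert abs(kirchhoff(t, x0) - np.cos(t)) < 1e-3   # exact value cos(t)*cos(0)
```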
The Linear Wave Equation: A Geometric Point of View
We will now derive some very important results for solutions to the linear wave equation. The
results will exploit interplay between geometry and analysis. Many of the techniques that we will
discuss play a central role in current PDE research.
2. Geometric background
Throughout this lecture, standard rectangular coordinates on R1+n are denoted by (x0 , x1 , ⋯, xn ),
and we often use the alternate notation x0 = t. The Minkowski metric on R1+n , which we denote by
m, embodies the Lorentzian geometry at the heart of Einstein’s theory of special relativity. As we
will see, this geometry is intimately connected to the linear wave equation. The components of m
take the following form relative to a standard rectangular coordinate system:

(2.0.7) m_{µν} = (m⁻¹)^{µν} = diag(−1, 1, 1, ⋯, 1),

where there are n copies of 1. We can view m_{µν} as a (1 + n) × (1 + n) matrix of real numbers. It is conventional to label the first row and column of m_{µν} starting with “0” rather than “1,” so that m₀₀ = −1, m₂₂ = 1, m₀₂ = 0, etc. Note that m is symmetric: m_{µν} = m_{νµ}.

If X is a vector in R^{1+n} with components X^µ (0 ≤ µ ≤ n), then we define its metric dual to be the covector with components X_µ (0 ≤ µ ≤ n) defined by

(2.0.8) X_µ := ∑_{α=0}^{n} m_{µα} X^α.

This is called “lowering the indices of X with m.”


Similarly, given a covector with components Y_µ, we can use m⁻¹ to form a vector Y^µ by raising the indices:

(2.0.9) Y^µ := ∑_{α=0}^{n} (m⁻¹)^{µα} Y_α.

These notions of duality are called metric duality. They are related to, but distinct from (roughly
speaking by a minus sign in the first component), the notion of basis duality commonly introduced
in linear algebra.
We will make use of Einstein’s summation convention, in which we avoid writing many of the
summation signs Σ to reduce the notational clutter. In particular, repeated indices, with one up
and one down, are summed over their ranges. Here is an example:

(2.0.10) X_α Y^α := ∑_{α=0}^{n} X_α Y^α = m_{αβ} X^β Y^α = m_{αβ} X^α Y^β,

where the last equality is a consequence of the symmetry property of m.
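The index gymnastics above map directly onto array operations. The sketch below (an illustrative aside using NumPy's einsum, with n = 3) lowers an index with m and checks the symmetry identity in (2.0.10):

```python
import numpy as np

# Minkowski metric (2.0.7) on R^{1+3}; note m equals its own inverse:
m = np.diag([-1.0, 1.0, 1.0, 1.0])

X = np.array([2.0, 1.0, 0.0, -1.0])        # a vector X^mu
X_low = np.einsum('ma,a->m', m, X)         # lowering: X_mu = m_{mu alpha} X^alpha

# Lowering flips the sign of the time component only:
assert np.allclose(X_low, [-2.0, 1.0, 0.0, -1.0])

# The inner product X_alpha Y^alpha = m_{alpha beta} X^alpha Y^beta:
Y = np.array([1.0, 3.0, 0.0, 0.0])
assert np.isclose(X_low @ Y, np.einsum('ab,a,b->', m, X, Y))
```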


We now make the following important observation: the linear wave equation −∂t2 φ + ∆φ = 0 can
be written as

(2.0.11) (m−1 )αβ ∂α ∂β φ = 0.


We will return to this observation in a bit.
We first provide a standard division of vectors into three classes: timelike, spacelike, and null.
Definition 2.0.1.
(1) Timelike vectors: m(X, X) := m_{αβ} X^α X^β < 0
(2) Spacelike vectors: m(X, X) > 0
(3) Null vectors: m(X, X) = 0
(4) Causal vectors: {Timelike vectors} ∪ {Null vectors}
We also will need to know when a vector is pointing “towards the future.” This idea is captured
by the next definition.
Definition 2.0.2. A vector X ∈ R1+n is said to be future-directed if X 0 > 0.
2.1. Lorentz transformations. Lorentz transformations play a very important role in the study
of the linear wave equation.
Definition 2.1.1. A Lorentz transformation is a linear transformation Λµν (i.e., a matrix) that
def
preserves the form of the Minkowski metric mµν = diag(−1, 1, 1, ⋯, 1) ∶

(2.1.1) Λαµ Λβν mαβ = mµν .


In standard matrix notation, (2.1.1) reads

(2.1.2) ΛT mΛ = m,
where T denotes the transpose.
By taking the determinant of each side of (2.1.2) and using the basic properties of the determinant,
we see that ∣det(Λ)∣ = 1. If det(Λ) = 1, then Λ is said to be proper or orientation preserving.
It is easy to see that (2.1.1) is equivalent to

(2.1.3) m(ΛX, ΛY ) = m(X, Y ), ∀ vectors X, Y ∈ R1+n ,


def
i.e., that the linear transformation Λ preserves the Minkowskian inner product. In (2.1.3), m(X, Y ) =
mαβ X α Y β and ΛX is the vector with components (ΛX )µ = Λµα X α .
Also note that the left-hand side of (2.1.2) is connected to the linear-algebraic notion of change
of basis on R1+n . More precisely, an important way of thinking about Lorentz transformations Λ
is the following: if we have a standard rectangular coordinate system (x0 , ⋯, xn ) on R1+n , and we
def
change coordinates by defining y µ = Λµα xα , then relative to the new coordinate system (y 0 , ⋯, y n ),
the Minkowski metric still has the same form mµν = diag(−1, 1, 1, ⋯, 1). This statement would be
false if, for example, we changed to polar spatial coordinates, or we dilated spacetime coordinates
by setting (y 0 , ⋯, y n ) = α(x0 , ⋯, xn ) for some constant α > 0. Thus, the Lorentz transformations
capture some invariance properties of m under certain special linear coordinate transformations.
Corollary 2.1.1. If X is timelike, and Λ is a Lorentz transformation, then ΛX is also timelike.
Analogous results also hold if X is spacelike or null.
Proof. Corollary 2.1.1 easily follows from Definition 2.0.1 and (2.1.3). 

It can be checked that the Lorentz transformations form a group. In particular:


● If Λ is a Lorentz transformation, then so is Λ−1 .
● If Λ and Υ are Lorentz transformations, then so is their matrix product ΛΥ, which has
def
components (ΛΥ)µν = Λµα Υαν .
The condition (2.1.2) can be viewed as (n + 1)2 scalar equations. However, by the symmetry of
m, there are plenty of redundancies, so that only (1/2)(n + 1)(n + 2) of the equations are independent.
This leaves (n + 1)2 − (1/2)(n + 1)(n + 2) = (1/2)n(n + 1) “free parameters” that determine the matrix Λ.
Thus, the Lorentz transformations form a “(1/2)n(n + 1) dimensional” group.
It can be shown that the proper Lorentz group is generated1 by the (1/2)n(n − 1) dimensional subgroup
of spatial rotations, and the n dimensional subgroup of proper Lorentz boosts. For the sake of
concreteness let’s focus on the physical case of n = 3 spatial dimensions.
Then the rotations about the x3 axis are the set of linear transformations of the form
1 By “generated,” we mean that all proper Lorentz transformations can be built out of a finite number of products
of boosts and spatial rotations.


(2.1.4) Λµν =
⎡ 1      0        0      0 ⎤
⎢ 0    cos θ   − sin θ   0 ⎥
⎢ 0    sin θ     cos θ   0 ⎥
⎣ 0      0        0      1 ⎦ ,
where θ ∈ [0, 2π ) is the counter-clockwise angle of rotation. Analogous matrices yield the rotations
about the x1 and x2 axes. Note that the X 0 (i.e. “time”) coordinate of vectors X is not affected
by such transformations.
The (proper) Lorentz boosts are the famous linear transformations that play a distinguished role
in Einstein’s theory of special relativity. They are sometimes called spacetime rotations, because
they intermix the time component X 0 of vectors X with their spatial components X 1 , X 2 , ⋯, X n .
The Lorentz boosts in the x1 direction can be expressed as


(2.1.5) Λµν =
⎡   cosh ζ   − sinh ζ   0   0 ⎤
⎢ − sinh ζ     cosh ζ   0   0 ⎥
⎢     0           0     1   0 ⎥
⎣     0           0     0   1 ⎦ ,
where ζ ∈ (−∞, ∞). Equivalently, (2.1.5) may be parameterized by


(2.1.6) Λµν =
⎡   γ   −γv   0   0 ⎤
⎢ −γv     γ   0   0 ⎥
⎢   0     0   1   0 ⎥
⎣   0     0   0   1 ⎦ ,

where v ∈ (−1, 1) is a “velocity” and γ = 1/√(1 − v 2 ). The requirement that ∣v∣ < 1 is directly connected
to the idea that in special relativity, material particles should never “exceed the speed of light.”
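The defining property (2.1.2) can be spot-checked numerically for the boost (2.1.5); the sketch below (NumPy, with an arbitrary rapidity ζ = 0.7 and an arbitrary timelike test vector) also illustrates Corollary 2.1.1:

```python
import numpy as np

def boost_x1(zeta):
    """The Lorentz boost (2.1.5) in the x^1 direction with rapidity zeta."""
    Lam = np.eye(4)
    Lam[0, 0] = Lam[1, 1] = np.cosh(zeta)
    Lam[0, 1] = Lam[1, 0] = -np.sinh(zeta)
    return Lam

m = np.diag([-1.0, 1.0, 1.0, 1.0])
Lam = boost_x1(0.7)

# The defining property (2.1.2): Lambda^T m Lambda = m, with det(Lambda) = 1 (proper).
assert np.allclose(Lam.T @ m @ Lam, m)
assert np.isclose(np.linalg.det(Lam), 1.0)

# Corollary 2.1.1: a timelike vector stays timelike after boosting.
X = np.array([1.0, 0.3, 0.0, 0.0])     # m(X, X) = -1 + 0.09 < 0
assert (Lam @ X) @ m @ (Lam @ X) < 0.0
print("Lorentz boost checks passed")
```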
2.2. Null frames. It is often the case that the standard basis on R1+n is not the best basis for
analyzing solutions to the linear wave equation. One of the most useful bases is called a null frame,
which can vary from spacetime point to spacetime point.
Definition 2.2.1. A null frame is a basis for R1+n consisting of vectors {L, L, e(1) , ⋯, e(n−1) }. Here,
L and L are null vectors normalized by m(L, L) = −2, and the e(i) are orthonormal vectors that
span the m−orthogonal complement of span(L, L): m(e(i) , e(j) ) = δij , m(L, e(i) ) = m(L, e(i) ) = 0,
for 1 ≤ i, j ≤ n − 1. Note that the e(i) must form a basis for this complement; i.e., since they are
m−orthonormal, they must be linearly independent.
In particular, we have the decomposition

(2.2.1) R1+n = span(L, L) ⊕ span(e(1) , ⋯, e(n−1) ),


where each of the two subspaces in the above direct sum are m−orthogonal.
Example 2.2.1. A common choice of a null frame is to take Lµ = (1, ω 1 , ⋯, ω n ), Lµ = (1, −ω 1 , ⋯, −ω n ),
and to take the e(i) to be any m−orthonormal basis for the m−orthogonal complement of span(L, L).
Note that this n − 1 dimensional complementary space is spanned by the n vectors

def
v(i)µ = (0, −ω i ω 1 , −ω i ω 2 , ⋯, 1 − (ω i )2 , ⋯, −ω i ω n ), 1 ≤ i ≤ n

(which are not linearly independent, since there are n of them). Here, ω i = x i /r, and
def
r = √(∑_{i=1}^{n} (x i )2 ) is the standard radial coordinate. Observe that v(i) is formed by subtracting
the “radial part” ω i (0, ω 1 , ⋯, ω n ) from the standard spatial unit basis vector b(i)µ =
(0, 0, ⋯, 0, 1, 0, ⋯, 0), where the 1 occupies the ith spatial slot. Note that ∑_{i=1}^{n} (ω i )2 = 1.
For this null frame, in terms of differential operators, ∇L = ∂t + ∂r , while ∇L = ∂t − ∂r . The
∇e(i) are the angular derivatives, i.e., derivatives in directions tangential to the Euclidean spheres

def
Sr,t = {(τ, x 1 , ⋯, x n ) ∣ τ = t, ∑_{i=1}^{n} (x i )2 = r 2 }.
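The normalizations in Definition 2.2.1 for the frame of Example 2.2.1 can likewise be checked numerically. In the sketch below, Lbar plays the role of the second null vector (written with an underbar in the notes), and the spatial point is an arbitrary choice:

```python
import numpy as np

n = 3
x = np.array([1.0, 2.0, 2.0])             # an arbitrary spatial point
r = np.linalg.norm(x)                     # r = 3 here
omega = x / r                             # omega^i = x^i / r, so sum (omega^i)^2 = 1

m = np.diag([-1.0] + [1.0] * n)

L = np.concatenate(([1.0], omega))        # L^mu    = (1,  omega^1, ..., omega^n)
Lbar = np.concatenate(([1.0], -omega))    # Lbar^mu = (1, -omega^1, ..., -omega^n)

def inner(X, Y):
    """The Minkowski inner product m(X, Y)."""
    return X @ m @ Y

assert np.isclose(inner(L, L), 0.0)       # L is null
assert np.isclose(inner(Lbar, Lbar), 0.0) # Lbar is null
assert np.isclose(inner(L, Lbar), -2.0)   # the normalization m(L, Lbar) = -2
print("null frame normalization verified")
```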
The following proposition shows that the Minkowski metric has a very nice form when expressed
relative to a null frame.
Proposition 2.2.1 (Null frame decomposition of m). If {L, L, e(1) , ⋯, e(n−1) } is a null frame,
then we can decompose

(2.2.2) mµν = −(1/2) Lµ Lν − (1/2) Lµ Lν + m/ µν ,
/ µν is positive-definite on the m−orthogonal complement of span(L, L), and m
where m / µν vanishes on
span(L, L).
Similarly, by raising each index on both sides of (2.2.2) with m−1 , we have that

(2.2.3) (m−1 )µν = −(1/2) Lµ Lν − (1/2) Lµ Lν + m/ µν .
def
Proof. We define m/ µν = mµν + (1/2) Lµ Lν + (1/2) Lµ Lν . Since m(L, L) = m(L, L) = 0, and m(L, L) = −2, it
easily follows that m/ (L, L) = m/ (L, L) = m/ (L, L) = 0. Thus, m/ µν vanishes on span(L, L).
Since m(L, e(i) ) = m(L, e(i) ) = 0 for 1 ≤ i ≤ n − 1, it easily follows that m/ (L, e(i) ) = m/ (L, e(i) ) = 0.
Finally, it also easily follows that m/ (e(i) , e(j) ) = m(e(i) , e(j) ) = δij , where δij = 1 if i = j and
δij = 0 if i ≠ j, so that {e(i) }_{i=1}^{n−1} is an m/ −orthonormal basis for the m−orthogonal complement of
span(L, L).

Remark 2.2.1. If the null frame is the one described in Example 2.2.1, then m/ µν is a metric that is
positive definite in the “angular” directions, and 0 otherwise. In fact, m/ is the standard Euclidean
metric on the family of Euclidean spheres Sr,t . m/ is known as the first fundamental form of the
spheres relative to m.
MIT OpenCourseWare
http://ocw.mit.edu

18.152 Introduction to Partial Differential Equations.


Fall 2011

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
MATH 18.152 COURSE NOTES - CLASS MEETING # 13

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting # 13: Geometric Energy Estimates


1. ◻m , the energy-momentum tensor, and compatible currents
The following shorthand notation is often used for the “linear wave operator associated to m:”

def
(1.0.1) ◻m = (m−1 )αβ ∂α ∂β .

Using this notation, the wave equation −∂t2 φ + ∆φ = 0 can be expressed as

(1.0.2) ◻m φ = 0.
We now introduce a very important object called the energy-momentum tensor. As we will see,
it encodes some very important conservation laws associated to solutions of (1.0.2).
Definition 1.0.1. The energy-momentum tensor associated to equation (1.0.2) is

def
(1.0.3) Tµν = ∂µ φ ∂ν φ − (1/2) mµν (m−1 )αβ ∂α φ ∂β φ.
Later in the course, we will hopefully have time to motivate its derivation in the larger context
of variational methods. For now, we will simply study/use its useful properties.
Note that Tµν is symmetric:

(1.0.4) Tµν = Tνµ .


In your homework, you will prove the following very important positivity property of T, which is
called the dominant energy condition.
Lemma 1.0.1 (Dominant Energy Condition for Tµν ).
(1.0.5)
def
T (X, Y ) = Tαβ X α Y β ≥ 0 if X, Y are both timelike and future-directed or timelike and past-directed.
Since causal vectors are the limit of timelike vectors, we have the following consequence of (1.0.5):

(1.0.6)
def
T (X, Y ) = Tαβ X α Y β ≥ 0 if X, Y are future-directed and causal or past-directed and causal.
As before, we can raise the indices of T ∶

(1.0.7) T µν = (m−1 )µα (m−1 )νβ Tαβ .



A very special case of Lemma 1.0.1 is the following, which corresponds to X µ = Y µ = δ0µ =
(1, 0, 0, ⋯, 0) in the lemma:

(1.0.8) T00 = T 00 = (1/2) ∑_{µ=0}^{n} (∂µ φ)2 = (1/2) ∣∇t,x φ∣2 .

The derivation of (1.0.8) is a simple computation that you should do for yourself. Note that T00 is
positive definite in all of the derivatives of φ. This fact will play an important role in Theorem 2.1
below.
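Both (1.0.8) and the positivity T(X, X) ≥ 0 for X µ = δ0µ can be confirmed directly from Definition 1.0.1; the sketch below uses a random sample gradient (an arbitrary choice) in 1 + 3 dimensions:

```python
import numpy as np

n = 3
m = np.diag([-1.0] + [1.0] * n)           # m_{mu nu}
m_inv = m.copy()                          # entrywise, the inverse is the same matrix here

rng = np.random.default_rng(0)
dphi = rng.normal(size=n + 1)             # a random sample gradient (d_mu phi)

# (1.0.3): T_{mu nu} = d_mu phi d_nu phi - (1/2) m_{mu nu} (m^{-1})^{ab} d_a phi d_b phi.
T = np.outer(dphi, dphi) - 0.5 * m * (dphi @ m_inv @ dphi)

# (1.0.8): T_{00} = (1/2) * sum over mu of (d_mu phi)^2, which is manifestly >= 0.
assert np.isclose(T[0, 0], 0.5 * np.sum(dphi ** 2))
assert T[0, 0] >= 0.0
print("T_00 =", T[0, 0])
```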
The next lemma shows that T µν is divergence-free whenever φ verifies the wave equation. This fact
is intimately connected to the derivation of conservation laws, which are fundamental ingredients
in the study of hyperbolic PDEs.
Lemma 1.0.2 (The divergence of T µν ). Let Tµν be the energy-momentum tensor defined in
(1.0.3). Then

(1.0.9) ∂µ T µν = (◻m φ)(m−1 )να ∂α φ.


In particular, if φ is a solution to (1.0.2), then

(1.0.10) ∂µ T µν = 0.
Proof. The proof is a computation that uses the symmetry property (m−1 )µν = (m−1 )νµ and the
fact that we are allowed to interchange the order of partial derivatives (if φ is sufficiently smooth):

(1.0.11) ∂µ T µν = ∂µ ((m−1 )µα (m−1 )νβ ∂α φ ∂β φ − (1/2)(m−1 )µν (m−1 )αβ ∂α φ ∂β φ)
= (◻m φ)(m−1 )νβ ∂β φ + (m−1 )µα (m−1 )νβ (∂α φ) ∂µ ∂β φ
− (1/2)(m−1 )µν (m−1 )αβ (∂µ ∂α φ) ∂β φ − (1/2)(m−1 )µν (m−1 )αβ (∂α φ) ∂µ ∂β φ
= (◻m φ)(m−1 )νβ ∂β φ,
where the last three terms have canceled each other.

As we will soon see, the energy-momentum tensor provides an amazingly convenient way of
bookkeeping in the divergence theorem. However, in order to apply the divergence theorem, we
need to find a useful vectorfield to take the divergence of. By useful, we mean a vectorfield that can
be used to control a solution φ to the wave equation. One way of constructing a useful vectorfield is
to start with an auxiliary vectorfield X and then to contract it with the energy momentum tensor
to form a new vectorfield J. The next definition shows how to do this.
Definition 1.0.2. Given any vectorfield X, we associate to it the following compatible current,
which is itself a vectorfield:

def
(1.0.12) (X)J µ = T µα Xα .

So which vectors X are the useful ones? It turns out that the answer is causal vectors. This fact is
closely connected to the dominant energy condition (1.0.5). This will become more clear in our proof
of Theorem 2.1 below. Note that by Lemma 1.0.1, (X)J µ Yµ = T µα Xα Yµ = Tαβ X α Y β = T (X, Y ) ≥ 0 if
X, Y are both timelike and future-directed (i.e., X 0 , Y 0 > 0) or past-directed (i.e., X 0 , Y 0 < 0).
In order to apply the divergence theorem to (X)J µ , we of course need to know its divergence. We
carry out this computation in the next corollary.
Corollary 1.0.3. Using (1.0.4) and (1.0.10), we have that

(1.0.13) ∂µ ((X)J µ ) = T αβ (X)παβ ,

where

def
(1.0.14) (X)πµν = (1/2)(∂µ Xν + ∂ν Xµ )
is called the deformation tensor of X.
We now state a version of the divergence theorem that is tailored to our study of the linear wave
equation.
Theorem 1.1 (Divergence Theorem). Let φ be a solution to the linear wave equation ◻m φ = 0.
Let X be any vectorfield, and let (X)J be the compatible current defined in Definition 1.0.2. Let
Ω ⊂ R1+n be a domain with boundary ∂Ω. Then the following integral identity holds:

(1.0.15) ∫_{∂Ω} N̂α (X)J α [φ(σ)] dσ = ∫_{Ω} ∂µ ((X)J µ [φ(t, x)]) dt dn x.

2. Energy Estimates and Uniqueness


We will now use the results of the previous section to derive some extremely important energy
estimates for solutions to ◻m φ = 0. The results we derive are a geometric version of integration by
parts + the divergence theorem. They could alternatively be derived by multiplying both sides of
the wave equation by a suitable quantity and then integrating by parts over suitable hypersurfaces,
but there is a substantial gain in geometric insight that accompanies our use of compatible currents.
Theorem 2.1 (Energy estimates in a cone). Let φ(t, x) be a C 2 solution to the 1 + n dimensional
global Cauchy problem for the linear wave equation

(2.0.16) ◻m φ = 0,
(2.0.17) φ(0, x) = f (x), x ∈ Rn ,
(2.0.18) ∂t φ(0, x) = g (x), x ∈ Rn .
Let R ∈ [0, ∞], let X be the past-directed timelike vector defined by X µ = −δ0µ , and let (X)J µ [φ(t, y)]
be the compatible current (1.0.12) associated to X. Note that by (1.0.8), N̂µ (X)J µ [φ(t, y)] = T 00 [φ(t, y)] =
(1/2)∣∇t,y φ(t, y)∣2 , where ∣∇t,y φ∣2 = (∂t φ)2 + ∑_{i=1}^{n} (∂i φ)2 . Define the square of the energy E[φ](t) by

def
(2.0.19) E 2 [φ](t) = ∫_{BR−t (p)} N̂µ (X)J µ [φ(t, y)] dn y = (1/2) ∫_{BR−t (p)} ∣∇t,y φ(t, y)∣2 dn y,

where N̂µ = δµ0 (and therefore N̂ µ = −δ0µ ) is the past-pointing unit normal covector to {t} × BR−t (p) ⊂
R1+n , and BR (p) ⊂ Rn denotes the solid Euclidean ball of radius R centered at p. Then

(2.0.20) E[φ](t) ≤ E [φ](0).


def
Proof. The goal is to apply Theorem 1.1 to the solid truncated backwards light cone Ct,p;R = {(τ, y) ∈
[0, t] × Rn ∣ ∣y − p∣ ≤ R − τ } and to make use of the dominant energy condition. It is easy to see
def def
that ∂Ct,p;R = B ∪ Mt,p;R ∪ T , where B = {0} × BR (p) is the flat base of the truncated cone, T =
def
{t} × BR−t (p) is the flat top of the truncated cone, and Mt,p;R = {(τ, y) ∈ [0, t] × Rn ∣ ∣y − p∣ = R − τ }
is the mantle of the truncated cone.
By Theorem 1.1, we have that

(2.0.21) E 2 [φ](t) − E 2 [φ](0) + F[φ] = ∫_{Ct,p;R} ∂µ ((X)J µ [φ(τ, y)]) dτ dn y,

where

def
(2.0.22) F[φ] = ∫_{Mt,p;R} N̂α (X)J α [φ(σ)] dσ

is the “flux” associated to Mt,p;R . Since φ solves the wave equation (2.0.16), and since (X)πµν = 0,
the identity (1.0.13) implies that the right-hand side of (2.0.21) is 0. Therefore,

(2.0.23) E 2 [φ](t) − E 2 [φ](0) + F[φ] = 0.

We claim that F[φ] ≥ 0. The energy inequality (2.0.20) would then follow from (2.0.23). The key
observation for showing that F[φ] ≥ 0 is the following. Along the mantle Mt,p;R , it is easy to see
(draw the picture!) that N̂µ = Lµ , where L is a past-directed null vector. Therefore, the integrand
in (2.0.22) is equal to Tαβ X α Lβ , and since X is a past-directed timelike vector, the dominant energy
condition (1.0.6) implies that Tαβ X α Lβ ≥ 0. Therefore, F[φ] ≥ 0 as desired.

Theorem 2.1 can easily be used to prove the following local uniqueness result for solutions to the
linear wave equation.
Corollary 2.0.4 (Uniqueness). Suppose that two C 2 solutions φ1 and φ2 to (2.0.16) have the
same initial data on BR (p) ⊂ {(τ, y ) ∣ τ = 0}. Then the two solutions agree on the “solid backwards
def
light cone” Cp;R = {(τ, y ) ∣ 0 ≤ τ ≤ R, 0 ≤ ∣y − p∣ ≤ R − τ }.
def
Proof. Define ψ = φ1 − φ2 . Then ψ verifies (2.0.16) and furthermore, E[ψ](0) = 0. Thus, by Theorem
2.1, E [ψ ](t) = 0 for 0 ≤ t ≤ R. Therefore, from the definition of E [ψ ](t), it follows that ∇τ,y ψ (τ, y ) =
0 for (τ, y ) ∈ Cp;R . Thus, by elementary analysis, ψ is constant in Cp;R . But ψ (0, x) = 0 for (0, x) ∈ Cp;R .
Thus, ψ (τ, y ) = 0 for all points (τ, y ) ∈ Cp;R . 
Corollary 2.0.4 is one illustration of the finite speed of propagation property associated to the
linear wave equation. Another way to think about it is the following. Suppose you alter the initial
conditions outside of BR (p), but not on BR (p) itself. Then this alteration has no effect whatsoever
on the behavior of the solution in the spacetime region Cp;R . Think about this claim yourself; it
follows easily from the Corollary!
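A crude numerical illustration of the energy estimate: on a periodic spatial domain (where there is no mantle flux and the total energy with R = ∞ is exactly conserved), a finite-difference evolution of the 1 + 1 wave equation keeps the discrete analogue of (2.0.19) nearly constant. The grid parameters, Gaussian data, and the symplectic-Euler scheme below are all arbitrary choices for the sketch:

```python
import numpy as np

# 1+1 wave equation phi_tt = phi_xx on a periodic grid (arbitrary sizes).
N, Lx = 400, 20.0
dx = Lx / N
dt = 0.5 * dx                                   # comfortably below the CFL limit
x = np.linspace(0.0, Lx, N, endpoint=False)

phi = np.exp(-(x - Lx / 2) ** 2)                # initial data f: a Gaussian bump
phi_t = np.zeros_like(phi)                      # initial data g = 0

def energy(phi, phi_t):
    """Discrete analogue of E^2 in (2.0.19): (1/2) * integral of (phi_t^2 + phi_x^2)."""
    phi_x = (np.roll(phi, -1) - np.roll(phi, 1)) / (2.0 * dx)
    return 0.5 * np.sum(phi_t ** 2 + phi_x ** 2) * dx

E0 = energy(phi, phi_t)
for _ in range(400):                            # symplectic-Euler time stepping
    phi_xx = (np.roll(phi, -1) - 2.0 * phi + np.roll(phi, 1)) / dx ** 2
    phi_t = phi_t + dt * phi_xx
    phi = phi + dt * phi_t
E1 = energy(phi, phi_t)
print(E0, E1)
```

The symplectic scheme does not conserve the discrete energy exactly, only up to a small oscillation, which is why the comparison below is approximate rather than exact.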

3. Developments, Domain of Dependence, and Range of Influence


We will now develop a language for discussing the finite speed of propagation properties of the
linear wave equation in more detail. If we had more time in this course, we could adopt a more
geometric point of view that would apply to many other hyperbolic PDEs. This would involve
fleshing out our discussion of Lorentzian geometry, and also developing a generalized version of
geometry that applies to a large class of PDEs.
Warning 3.0.1. Some people permute or even severely alter the following definitions, which can
be very confusing. The definitions below therefore indicate some of my biases.
Definition 3.0.3 (Development). Let S ⊂ {(t, x) ∣ t = 0} be a set. Assume that we know the
initial data φ(0, x) = f (x), ∂t φ(0, x) = g(x) for the wave equation (1.0.2), but only for x ∈ S. Then a
future development Ω of S is defined to be a “future” region of spacetime Ω ⊂ R1+n ∩ {(t, x) ∣ t ≥ 0}
on which the solution φ(t, x) to (1.0.2) is uniquely determined by the initial data on S. A past
development D− (S ) can be analogously defined (replace t ≥ 0 with t ≤ 0 in the previous definition).
Example 3.0.1. If BR (p) and Cp;R are as in Corollary 2.0.4, then Cp;R is a development of BR (p).
You can imagine that the solution knows how to “develop” in Cp;R from the initial conditions on its
subset BR (p).
Definition 3.0.4 (Maximal development). The maximal future development of S, which we
denote by D+ (S ), is defined to be the union of all future developments of S. The maximal past
development D− (S ) can be analogously defined. The maximal development of S is defined to be
D + (S ) ∪ D − (S ) .
def
Example 3.0.2. Consider the plane P = {(t, x1 , x2 , x3 ) ∣ x1 = 0}. Then using techniques from
a more advanced course, one could show that D(P ) = P for the wave equation (1.0.2). That is,
knowing the conditions of a solution φ along P is not enough information to determine the solution
anywhere else. This is closely connected to the fact that P contains smooth curves whose tangent
vectors are timelike relative to the Minkowski metric.
Definition 3.0.5 (Domain of dependence). Let Ω ⊂ R1+n . Assume that φ is a solution to the
wave equation (1.0.2) in Ω. A domain of dependence for Ω is a set S such that φ is completely
determined on Ω from only the data φ∣S and ∇t,x φ∣S .
Remark 3.0.1. For general nonlinear hyperbolic PDEs, domains of dependence depend both on
Ω and the solution φ itself. However, for the linear wave equation, domains of dependence do
not depend on the solution. Roughly speaking, this is because the “geometry of the solution” is
predetermined by the Minkowski metric m.
Example 3.0.3. In 1 + 1 dimensions, a domain of dependence for the spacetime point (t, x) (for the
wave equation (1.0.2)) is the “initial data” interval {0}×[x − t, x + t]. Another domain of dependence
for this point is the interval {t/2} × [x − t/2, x + t/2]. A trivial example is that (t, x) is a domain of
dependence for itself.
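The 1 + 1 dimensional statements can be tested with d'Alembert's formula: modifying the data only outside {0} × [x − t, x + t] leaves the solution at (t, x) unchanged. In the sketch below, the data f, g and the "bump" modification are arbitrary choices:

```python
import numpy as np

def dalembert(f, g, t, x, npts=2001):
    """d'Alembert's formula in 1+1 dimensions:
    phi(t,x) = (f(x+t) + f(x-t))/2 + (1/2) * integral of g over [x-t, x+t]."""
    s = np.linspace(x - t, x + t, npts)
    w = np.full(npts, s[1] - s[0])
    w[0] *= 0.5
    w[-1] *= 0.5                       # trapezoid-rule weights
    return 0.5 * (f(x + t) + f(x - t)) + 0.5 * np.sum(w * g(s))

t0, x0 = 1.0, 0.0                      # evaluate the solution at the point (1, 0)

f1 = lambda z: np.exp(-z ** 2)
g1 = lambda z: np.cos(z)
# Change the data, but only OUTSIDE the interval [x0 - t0, x0 + t0] = [-1, 1].
bump = lambda z: np.where(np.abs(z) > 1.0, np.sin(3.0 * z), 0.0)
f2 = lambda z: f1(z) + bump(z)
g2 = lambda z: g1(z) + bump(z)

v1 = dalembert(f1, g1, t0, x0)
v2 = dalembert(f2, g2, t0, x0)
print(v1, v2)                          # identical: the point (1, 0) never "sees" the change
```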
Example 3.0.4. In 1+3 dimensions, a domain of dependence for the positive t axis {(t, x1 , x2 , x3 ) ∣ x1 =
x2 = x3 = 0, t ≥ 0} (for the wave equation (1.0.2)) is all of “space:” {(t, x1 , x2 , x3 ) ∣ t = 0}. No
proper subset of space is a domain of dependence for the positive t axis.
The next definition is complementary to the notion of domain of dependence.

Definition 3.0.6 (Range of influence). Assume that φ is a solution to the wave equation (1.0.2)
in R1+n . The range of influence R for a set S ⊂ R1+n is the set of all points (t, x) ∈ R1+n such that
φ(t, x) is affected by the initial data φ∣S and ∇t,x φ∣S .
Example 3.0.5. In 1 + 1 dimensions, the (future) range of influence (for t ≥ 0) of the interval
S = {0} × [−1, 1] is R = {(t, x) ∣ − t − 1 ≤ x ≤ t + 1}.
Example 3.0.6. In 1 + 1 dimensions, the (future) range of influence (for t ≥ 0) of the t axis
S = {(t, 0) ∣t ≥ 0} is R = {(t, x) ∣ t ≥ 0}.
Example 3.0.7. In 1 + 3 dimensions, the (future) range of influence (for t ≥ 0) of S = {0} × ∂B1 (0)
def
is R = {(t, x) ∣ 0 ≤ t ≤ 1, ∣x∣ = 1 − t} ∪ {(t, x) ∣ 0 ≤ t, ∣x∣ = 1 + t}, where ∣x∣ = √((x1 )2 + (x2 )2 + (x3 )2 ).
This is a consequence of the Sharp Huygens’ Principle.
MATH 18.152 COURSE NOTES - CLASS MEETING # 15

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting # 15: Classification of second order equations


1. Review of Three Important Examples of PDEs
Let’s review some basic facts concerning the three PDEs we’ve examined in detail thus far.

Equation: ∆u(x) = f (x)
Type: Elliptic
Well-posed problems: Boundary value problems: all of Rn (with boundary conditions at ∞);
finite boundaries under Dirichlet, Neumann, Robin, or mixed boundary conditions
Features: mean value properties; maximum principle; Harnack inequality

Equation: ∂t u(t, x) − ∆u(t, x) = f (t, x)
Type: Diffusive (parabolic)
Well-posed problems: Initial value (Cauchy) problems: all of Rn at t = 0; initial + boundary
value problems: data at t = 0 + Dirichlet, Neumann, Robin, or mixed boundary conditions
Features: infinite speed of propagation; smoothing properties; maximum principle; t^{−n/2}
decay as t → ∞ for the global Cauchy problem

Equation: −∂t2 u(t, x) + ∆u(t, x) = f (t, x)
Type: Hyperbolic
Well-posed problems: Initial value (Cauchy) problems: all of Rn at t = 0; initial + boundary
value problems: data at t = 0 + Dirichlet, Neumann, Robin, or mixed boundary conditions
Features: finite speed of propagation; domains of dependence and influence; energy identities;
t^{(1−n)/2} decay as t → ∞ for the global Cauchy problem

2. Motivating example
Let’s consider the following second-order linear PDE on R1+n :

def
(2.0.1) Lu = Aαβ ∂α ∂β u + B α ∂α u + Cu = 0.
In (2.0.1), A, B, C are allowed to be functions of the coordinates (x0 , · · · , xn ). We will also use the
standard notation x0 = t. By the symmetry of the mixed partial derivatives, we can also assume
that A is symmetric:

(2.0.2) Aµν = Aνµ .


The question we would like to address at the moment is the following: what are the basic properties
of solutions to (2.0.1)? Is this equation most like a Laplace, heat, or wave equation? That is, is
(2.0.1) elliptic, diffusive, or hyperbolic? As we will see, the most important part of equation (2.0.1)
in this context is the principal part Aαβ ∂α ∂β u, which involves the top-order derivatives.
To begin answering this question, let’s start with a simple example on R2 . Let’s try to classify
the following equation:

def
(2.0.3) Lu = ∂t2 u − 4∂t ∂x u + 2∂x2 u = 0.
Note that it would be easy to answer our question if we were able to make a linear change of
variables that eliminates the cross term −4∂t ∂x u; the PDE would then look just like one of the
other ones we have already studied. More precisely, let’s try to eliminate the cross terms by making
good choices for the constants a, b, c, d in the following linear change variables:

(2.0.4a) t̃ = at + bx,
(2.0.4b) x̃ = ct + dx.
In order to have a viable change of variables, we also need to achieve the following non-degeneracy
condition from linear algebra:

(2.0.5) ad − bc ≠ 0.
(2.0.5) states that the determinant of the above linear transformation is non-zero, and that the
transformation is non-degenerate.
Then using the chain rule, we have that

(2.0.6a) ∂t = (∂ t̃/∂t) ∂t̃ + (∂ x̃/∂t) ∂x̃ = a ∂t̃ + c ∂x̃ ,
(2.0.6b) ∂x = (∂ t̃/∂x) ∂t̃ + (∂ x̃/∂x) ∂x̃ = b ∂t̃ + d ∂x̃ .
Inserting (2.0.6a) - (2.0.6b) into (2.0.3), we compute that

(2.0.7) Lu = (a2 − 4ab + 2b2 ) ∂t̃2 u + (2ac + 4bd − 4ad − 4bc) ∂t̃ ∂x̃ u + (c2 − 4cd + 2d2 ) ∂x̃2 u.
To make the cross term in (2.0.7) vanish, we now choose

(2.0.8) a = 1, b = 0, c = 2, d = 1.
Note that (2.0.8) also verifies the non-degeneracy condition (2.0.5). We remark that other choices
would also have worked. In the new coordinates, we have that

(2.0.9) Lu = ∂t̃2 u − 2∂x̃2 u.



Dividing by −2, we see that the PDE (2.0.3) was actually a “standard” linear wave equation in
disguise:

(2.0.10) −(1/2) ∂t̃2 u + ∂x̃2 u = 0.

Relative to the coordinates (t̃, x̃), the “speed” associated to the wave equation (2.0.10) is √2.
Let’s do another example. Consider the PDE

def
(2.0.11) Lu = −2∂t2 u − 2∂t ∂x u − ∂x2 u + ∂x u = 0.
Using (2.0.6a) - (2.0.6b) again, we compute that

(2.0.12) Lu = (−2a2 − 2ab − b2 ) ∂t̃2 u + (−4ac − 2ad − 2bc − 2bd) ∂t̃ ∂x̃ u + (−2c2 − 2cd − d2 ) ∂x̃2 u
+ b ∂t̃ u + d ∂x̃ u.
Choosing

(2.0.13) a = 1/√2, b = 0, c = −1/√2, d = √2,

we see that

(2.0.14) Lu = −∂t̃2 u − ∂x̃2 u + √2 ∂x̃ u.

Thus, multiplying by −1, we see that (2.0.11) is really just a Laplace-like equation in disguise:

(2.0.15) ∂t̃2 u + ∂x̃2 u − √2 ∂x̃ u = 0.


Equation (2.0.11) is therefore elliptic. We remark that the first-order term in (2.0.15) does not
affect the elliptic nature of the system.
Let’s do one final example. Consider the PDE

def
(2.0.16) Lu = ∂t2 u − 2∂t ∂x u + ∂x2 u + ∂x u = 0.
Using (2.0.6a) - (2.0.6b) again, we compute that

(2.0.17) Lu = (a2 − 2ab + b2 ) ∂t̃2 u + (2ac + 2bd − 2ad − 2bc) ∂t̃ ∂x̃ u + (c2 − 2cd + d2 ) ∂x̃2 u
+ b ∂t̃ u + d ∂x̃ u.
Choosing

(2.0.18) a = 1, b = 0, c = −1, d = −1,


we see that

(2.0.19) Lu = ∂t̃2 u − ∂x̃ u.


Thus, (2.0.16) is equivalent to

(2.0.20) −∂x̃ u + ∂t̃2 u = 0.

Now observe that (2.0.20) is just the standard heat equation, with the variable x̃ playing the role
of “time” and t̃ playing the role of “space.” Equation (2.0.20) is therefore diffusive (parabolic).
3. A general framework
In this section, we will establish a general framework for classifying second order constant co-
efficient scalar PDEs. The framework will cover the three examples from the previous section as
special cases. The proof will reveal that the classification is intimately connected to the theory of
quadratic forms from linear algebra. Throughout this section, we will use the notation

(3.0.21) x = (x0 , x1 , · · · , xn ).
As above, we will investigate PDEs of the form

def
(3.0.22) Lu = Aαβ ∂α ∂β u + B α ∂α u + Cu = 0,
where Aµν = Aνµ .
We begin by providing a simple version of Hadamard’s classic definitions.
Definition 3.0.1 (Hadamard’s classification of second order scalar PDEs). Equation (3.0.22)
is respectively said to be elliptic, hyperbolic, or parabolic according to the following conditions on
the (1 + n) × (1 + n) symmetric matrix A :
• All of the eigenvalues of A have the same sign - elliptic
• n of the eigenvalues of A have the same (non-zero) sign, and the remaining one has the
opposite (non-zero) sign - hyperbolic
• n of the eigenvalues of A have the same (non-zero) sign, and the remaining one is 0 -
parabolic
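Hadamard's criterion reduces to counting eigenvalue signs of the symmetric matrix A, which is simple to implement; the helper name classify below is our own, not standard terminology:

```python
import numpy as np

def classify(A, tol=1e-12):
    """Hadamard classification (Definition 3.0.1) via eigenvalue sign counts."""
    lam = np.linalg.eigvalsh(np.asarray(A, dtype=float))  # A is symmetric
    pos = int(np.sum(lam > tol))
    neg = int(np.sum(lam < -tol))
    zero = len(lam) - pos - neg
    if zero == 0 and (pos == 0 or neg == 0):
        return "elliptic"
    if zero == 0 and min(pos, neg) == 1:
        return "hyperbolic"
    if zero == 1 and (pos == 0 or neg == 0):
        return "parabolic"
    return "none of the three classes"

# The three motivating examples from Section 2:
print(classify([[1, -2], [-2, 2]]))    # principal part of (2.0.3)
print(classify([[-2, -1], [-1, -1]]))  # principal part of (2.0.11)
print(classify([[1, -1], [-1, 1]]))    # principal part of (2.0.16)
```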
Remark 3.0.1. Many of the ideas in this section, including the definition above, can be generalized
to include the case where A depends on (x), or even on the solution u itself; PDEs of the latter
type are said to be quasilinear.
We now state and prove the main classification theorem.
Theorem 3.1 (Classification of second order constant-coefficient PDEs). Consider the
following second order constant coefficient PDE

def
(3.0.23) Lu(x) = Aαβ ∂α ∂β u(x) + B α ∂α u(x) + Cu(x) = 0,

def
where ∂α = ∂/∂xα . Then there exists a linear change of variables y µ = Mαµ xα such that:
• If all of the eigenvalues of Aµν have the same (non-zero) sign, then (3.0.23) can be written as
def
±Lu = ∆y u(y) + B̃ α ∂α u(y) + Cu(y) = 0, where ∆y = ∑_{α=0}^{n} ∂2 /(∂y α )2 .
• If n of the eigenvalues of A have the same (non-zero) sign, and the remaining one has the
opposite (non-zero) sign, then (3.0.23) can be written as ±Lu = ◻y u(y) + B̃ α ∂α u(y) + Cu(y) = 0,
def def
where ◻y = (m−1 )αβ (∂/∂y α )(∂/∂y β ) is the standard linear wave operator, and (m−1 )µν =
diag(−1, 1, 1, ⋯, 1) is the standard Minkowskian matrix.
• If n eigenvalues λ(1) , ⋯, λ(n) of A have the same (non-zero) sign, and the remaining one is
λ(0) = 0, then (3.0.23) can be written as ±Lu = B̃ 0 (∂/∂y 0 )u(y) + ∑_{i=1}^{n} (∂2 /(∂y i )2 )u(y) +
∑_{i=1}^{n} B̃ i (∂/∂y i )u(y) + Cu(y) = 0. Furthermore, let v (0) , v (1) , ⋯, v (n) be a corresponding
diagonalizing unit-length covector basis. More precisely, this means that ∑_{α=0}^{n} ∣vα(µ) ∣2 = 1
for 0 ≤ µ ≤ n, that Aαβ vα(µ) vβ(ν) = λ(µ) if µ = ν, and that Aαβ vα(µ) vβ(ν) = 0 if µ ≠ ν (standard
linear algebraic theory guarantees the existence of such a basis). Then if the non-zero vector
B satisfies B α vα(0) ≠ 0, we also have that B̃ 0 ≠ 0.
Remark 3.0.2. The “±” sign above distinguishes whether most of the eigenvalues of Aµν are
positive or negative. For example, if all of the eigenvalues of Aµν are positive, then Lu =
∆y u(y) + ⋯, while if they are all negative, then Lu = −∆y u(y) + ⋯ (and similarly for the other
two cases).
Proof. Let’s consider the first case, in which all of the eigenvalues have the same (non-zero) sign.
Then by standard linear algebra, since Aµν is symmetric and positive definite (perhaps after mul-
tiplying it by −1), there exists an invertible “change-of-basis” matrix Mµν such that

(3.0.24) Mαµ Aαβ Mβν = I µν ,

def
where I µν = diag(1, 1, ⋯, 1) is the (n + 1) × (n + 1) identity matrix. In fact, we can choose

(3.0.25) Mαµ = (1/√∣λ(µ)∣) vα(µ) (no summation in µ),

where λ(µ) is the “eigenvalue” of A corresponding to the unit-length covector vα(µ) (i.e.,
∑_{α=0}^{n} ∣vα(µ) ∣2 = 1) appearing in the statement of the theorem.
We now make the linear change of variables y µ = Mαµ xα . Then by the chain rule, ∂/∂xα =
(∂y µ /∂xα )(∂/∂y µ ) = Mαµ (∂/∂y µ ). Therefore,

(3.0.26) Aαβ (∂/∂xα )(∂/∂xβ ) u = Aαβ Mαµ Mβν (∂/∂y µ )(∂/∂y ν ) u = I µν (∂/∂y µ )(∂/∂y ν ) u = ∆y u.
This completes the proof in the first case.
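The construction (3.0.24)-(3.0.25) can be reproduced numerically with an eigendecomposition (valid when no eigenvalue vanishes, i.e., in the first two cases). Applied to the hyperbolic example (2.0.3), M A Mᵀ returns the Minkowski matrix, as in (3.0.27); the code below is a sketch:

```python
import numpy as np

def diagonalize(A):
    """Build M as in (3.0.25): rows are unit eigenvectors of A scaled by 1/sqrt(|lambda|),
    so that M A M^T is diagonal with entries +/-1 (requires nonzero eigenvalues)."""
    A = np.asarray(A, dtype=float)
    lam, V = np.linalg.eigh(A)           # columns of V: orthonormal eigenvectors
    M = (V / np.sqrt(np.abs(lam))).T     # row mu of M: v^(mu) / sqrt|lambda(mu)|
    return M, M @ A @ M.T

A = [[1.0, -2.0], [-2.0, 2.0]]           # the hyperbolic example (2.0.3)
M, D = diagonalize(A)
print(np.round(D, 12))                   # diagonal, with one -1 and one +1 entry
```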
In the second case, in which n of the eigenvalues of A have the same (non-zero) sign, and the
remaining one has the opposite (non-zero) sign, the proof is similar. The key difference is that
because of the eigenvalue of opposite sign, (3.0.24) is replaced with

(3.0.27) Mαµ Aαβ Mβ ν = (m−1 )µν ,


def
where (m−1 )µν = diag(−1, 1, 1, · · · , 1) is the standard (1+n)×(1+n) Minkowski matrix. Therefore,

(3.0.28) Aαβ (∂/∂xα )(∂/∂xβ ) u = Aαβ Mαµ Mβν (∂/∂y µ )(∂/∂y ν ) u = (m−1 )µν (∂/∂y µ )(∂/∂y ν ) u = ◻y u.
This completes the proof in the second case.

In the third case, in which n of the eigenvalues of A have the same (non-zero) sign, and the
remaining one is 0, the proof is similar. The key difference is that because of the zero eigenvalue,
(3.0.24) is replaced with

(3.0.29) Mαµ Aαβ Mβ ν = Dµν ,


def
where Dµν = diag(0, 1, 1, · · · , 1).
Therefore,

(3.0.30) Aαβ (∂/∂xα )(∂/∂xβ ) u = Aαβ Mαµ Mβν (∂/∂y µ )(∂/∂y ν ) u = Dµν (∂/∂y µ )(∂/∂y ν ) u = ∑_{i=1}^{n} (∂2 /(∂y i )2 ) u.
Furthermore, we have that

(3.0.31) B α (∂/∂xα ) u = Mαµ B α (∂/∂y µ ) u.

Thus, using (3.0.25), we have that

def
(3.0.32) B̃ 0 = Mα0 B α = vα(0) B α ≠ 0.

Example 3.0.1. In the first example from above,

(3.0.33) A^{µν} = [[1, −2], [−2, 2]].
To calculate the eigenvalues of A, we first set

(3.0.34) det(A − λI) = det [[1 − λ, −2], [−2, 2 − λ]] = λ² − 3λ − 2 = 0.

The solutions are

(3.0.35) λ = (3 ± √17)/2.
Since the eigenvalues are of opposite sign, the corresponding PDE is hyperbolic.
Example 3.0.2. In the second example from above,

(3.0.36) A^{µν} = [[−2, −1], [−1, −1]].
To calculate the eigenvalues of A, we first set

(3.0.37) det(A − λI) = det [[−2 − λ, −1], [−1, −1 − λ]] = λ² + 3λ + 1 = 0.

The solutions are

(3.0.38) λ = (−3 ± √5)/2.
Both of these eigenvalues are negative, and thus the corresponding PDE is elliptic.
Example 3.0.3. In the final example from above,

(3.0.39) A^{µν} = [[1, −1], [−1, 1]].
To calculate the eigenvalues of A, we first set

(3.0.40) det(A − λI) = det [[1 − λ, −1], [−1, 1 − λ]] = λ² − 2λ = 0.

The solutions are

(3.0.41) λ = 0, 2,
and so the corresponding PDE is parabolic.
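The three computations above follow a single recipe: compute the eigenvalues of A^{µν} and count their signs. As an illustrative cross-check (my addition, not part of the original notes; the helper name `classify` is hypothetical), the following short script classifies a second-order operator in two variables from its symmetric coefficient matrix:

```python
import math

def classify(a11, a12, a22):
    """Classify a second-order PDE in two variables by the signs of the
    eigenvalues of its symmetric coefficient matrix A = [[a11, a12], [a12, a22]]."""
    # For a symmetric 2x2 matrix the eigenvalues solve
    # lambda^2 - tr(A)*lambda + det(A) = 0, and the discriminant is >= 0.
    tr = a11 + a22
    det = a11 * a22 - a12 * a12
    disc = math.sqrt(tr * tr - 4.0 * det)
    lam1, lam2 = (tr + disc) / 2.0, (tr - disc) / 2.0
    if lam1 * lam2 > 0:
        return "elliptic"      # both eigenvalues have the same (non-zero) sign
    if lam1 * lam2 < 0:
        return "hyperbolic"    # eigenvalues of opposite sign
    return "parabolic"         # one eigenvalue vanishes

# The three examples above:
print(classify(1, -2, 2))    # hyperbolic
print(classify(-2, -1, -1))  # elliptic
print(classify(1, -1, 1))    # parabolic
```

The quadratic-formula shortcut is safe here only because a symmetric real matrix always has real eigenvalues.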
MIT OpenCourseWare
http://ocw.mit.edu

18.152 Introduction to Partial Differential Equations.


Fall 2011

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
MATH 18.152 COURSE NOTES - CLASS MEETING # 16

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting # 16: The Fourier Transform on Rn


1. Introduction to the Fourier Transform
Earlier in the course, we learned that periodic functions f ∈ L2 ([−1, 1]) (of period 2) can be
represented using a Fourier series:

(1.0.1) f(x) “=” a_0/2 + Σ_{m=1}^∞ a_m cos(mπx) + Σ_{m=1}^∞ b_m sin(mπx).

The “ = ” sign above is interpreted in the sense of the convergence of the sequence of partial sums
associated to the right-hand side in the L2 ([−1, 1]) norm. The coefficients am and bm represent the
“amount of the frequency m” that the function f contains. These coefficients were related to f
itself by
(1.0.2a) a_0 = ∫_{−1}^{1} f(x) dx,
(1.0.2b) a_m = ∫_{−1}^{1} f(x) cos(mπx) dx, (m ≥ 1),
(1.0.2c) b_m = ∫_{−1}^{1} f(x) sin(mπx) dx, (m ≥ 1).

The Fourier transform is a “continuous” version of the formula (1.0.1) for functions defined on
the whole space Rn . Our goal is to write functions f defined on Rn as a superposition of different
frequencies. However, instead of discrete frequencies m, we will need to use “continuous frequencies”
ξ.
Definition 1.0.1 (Fourier Transform). Let f ∈ L¹(R^n), i.e., ∫_{R^n} |f(x)| d^n x < ∞. The Fourier
transform of f is denoted by f̂, and it is a new function of the frequency variable ξ ∈ R^n. It is
defined for each frequency ξ as follows:

(1.0.3) f̂(ξ) def= ∫_{R^n} f(x) e^{−2πiξ·x} d^n x,

where · denotes the Euclidean dot product, i.e., if x = (x^1, · · · , x^n) and ξ = (ξ^1, · · · , ξ^n), then
ξ · x def= Σ_{j=1}^n ξ^j x^j. In the above formula, recall that if r is a real number, then e^{ir} = cos r + i sin r.
The formula (1.0.3) is analogous to the formulas (1.0.2a) - (1.0.2c). It provides the “amount of
the frequency component” ξ that f contains. Later in the course, we will derive an analog of the
representation formula (1.0.1).
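Definition (1.0.3) is easy to probe numerically. The sketch below (an illustrative addition, not from the notes; the helper `ft_box` and all parameters are my own choices) approximates the transform of the indicator function of [−1/2, 1/2] by a midpoint Riemann sum and compares it with the exact answer sin(πξ)/(πξ):

```python
import cmath, math

def ft_box(xi, N=8000):
    """Midpoint-rule approximation of (1.0.3) for the indicator function
    of [-1/2, 1/2]; N is the number of quadrature points."""
    h = 1.0 / N
    total = 0j
    for k in range(N):
        x = -0.5 + (k + 0.5) * h
        total += cmath.exp(-2j * math.pi * xi * x) * h
    return total

# The exact transform is sin(pi*xi)/(pi*xi) for xi != 0.
for xi in (0.5, 1.0, 2.5):
    exact = math.sin(math.pi * xi) / (math.pi * xi)
    assert abs(ft_box(xi) - exact) < 1e-6
```

Note that the transform decays only like 1/|ξ| here: the box function is in L¹ but not smooth, an early hint of the decay/regularity duality discussed later in this section.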

Remark 1.0.1. The Fourier transform can be defined on a much larger class of functions than
those that belong to L¹. However, making rigorous sense of this fact requires advanced techniques
that go beyond this course.
We will also use the following notation.
Definition 1.0.2 (Inverse Fourier transform). Given a function f(ξ) ∈ L¹(R^n), its inverse
Fourier transform, which is denoted by f^∨, is a new function of x defined as follows:

(1.0.4) f^∨(x) def= f̂(−x) = ∫_{R^n} f(ξ) e^{2πiξ·x} d^n ξ.

The name is motivated as follows: later in the course, we will show that (fˆ)∨ = f. Thus, ∨ is in
fact the inverse of the operator ∧.
The Fourier transform is very useful in the study of certain PDEs. To use it in the context
of PDEs, we will have to understand how the Fourier transform operator interacts with partial
derivatives. In order to do this, it is convenient to introduce the following notation, which will
simultaneously help us with bookkeeping when taking repeated derivatives and with classifying the
structure of monomials.
Definition 1.0.3. If

(1.0.5) α⃗ def= (α_1, · · · , α_n)

is an array of non-negative integers, then we define ∂_α⃗ to be the differential operator

(1.0.6) ∂_α⃗ def= ∂_1^{α_1} · · · ∂_n^{α_n}.

Note that ∂_α⃗ is an operator of order |α⃗| def= α_1 + · · · + α_n.
If x = (x^1, · · · , x^n) is an element of C^n, then we also define x^α⃗ to be the monomial

(1.0.7) x^α⃗ def= (x^1)^{α_1} · · · (x^n)^{α_n}.
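The multi-index conventions (1.0.5)-(1.0.7) are pure bookkeeping, which a few lines of Python make concrete (the helper names `order` and `monomial` are hypothetical, added here for illustration):

```python
from functools import reduce

def order(alpha):
    """|alpha| = alpha_1 + ... + alpha_n, the order of the operator in (1.0.6)."""
    return sum(alpha)

def monomial(x, alpha):
    """x^alpha = (x^1)^{alpha_1} * ... * (x^n)^{alpha_n}, as in (1.0.7)."""
    return reduce(lambda acc, pair: acc * pair[0] ** pair[1], zip(x, alpha), 1)

alpha = (2, 0, 3)                   # encodes two x1-derivatives and three x3-derivatives
print(order(alpha))                 # 5
print(monomial((2, 5, 3), alpha))   # 2**2 * 5**0 * 3**3 = 108
```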
The following function spaces will play an important role in our study of the Fourier transform.
Throughout this discussion, the functions f are allowed to be complex-valued.
Definition 1.0.4 (Some important function spaces).
(1.0.8) C^k def= {f : R^n → C | ∂_α⃗ f is continuous for |α⃗| ≤ k},
(1.0.9) C_0 def= {f : R^n → C | f is continuous and lim_{|x|→∞} f(x) = 0}.

We also recall the following norm on the space of bounded, continuous functions f : R^n → C:

(1.0.10) ‖f‖_{C^0} def= max_{x∈R^n} |f(x)|.

The L² norm plays an important role in Fourier analysis. Since f̂ is in general complex-valued,
we also need to extend the notion of the L² inner product to complex-valued functions. This is
accomplished in the next definition.

Definition 1.0.5 (Inner product for complex-valued functions). Let f and g be complex-
valued functions defined on R^n. We define their complex inner product by

(1.0.11) ⟨f, g⟩ def= ∫_{R^n} f(x) ḡ(x) d^n x,

where ḡ denotes the complex conjugate of g. That is, if g(x) = u(x) + iv(x), where u and v are
real-valued, then ḡ(x) def= u(x) − iv(x).
We also define the norm of f by

(1.0.12) ‖f‖ def= ⟨f, f⟩^{1/2} = ( ∫_{R^n} |f(x)|² d^n x )^{1/2}.

Note that this is just the standard L2 norm extended to complex-valued functions.
Note that ⟨·, ·⟩ and ‖·‖ verify all of the standard properties associated to a complex inner product
and its norm:
• ‖f‖ ≥ 0, and ‖f‖ = 0 if and only if f = 0 almost everywhere
• ⟨g, f⟩ = conj ⟨f, g⟩ (Hermitian symmetry)
• If a and b are complex numbers, then ⟨af + bg, h⟩ = a⟨f, h⟩ + b⟨g, h⟩, and ⟨f, ag⟩ = ā⟨f, g⟩
(Hermitian linearity)
• |⟨f, g⟩| ≤ ‖f‖ ‖g‖ (Cauchy-Schwarz inequality)
• ‖f + g‖ ≤ ‖f‖ + ‖g‖ (Triangle Inequality)

2. Properties of the Fourier Transform


The next lemma illustrates some basic properties of f̂ that hold whenever f ∈ L¹.

Lemma 2.0.1 (Properties of f̂ for f ∈ L¹). Suppose that f ∈ L¹(R^n). Then f̂ is a bounded,
continuous function and

(2.0.13) ‖f̂‖_{C^0} ≤ ‖f‖_{L¹}.

Proof. Since |e^{ir}| = 1 for all real numbers r, it follows that for each fixed ξ, we have

(2.0.14) |f̂(ξ)| ≤ ∫_{R^n} |f(x) e^{−2πiξ·x}| d^n x ≤ ∫_{R^n} |f(x)| d^n x def= ‖f‖_{L¹}.

Taking the max over all ξ ∈ R^n, the estimate (2.0.13) thus follows.
We now prove that f̂ is continuous. Given ε > 0, let B_R be a ball of radius R centered at the
origin such that the integral of |f| over its complement B_R^c is no larger than ε:

(2.0.15) ∫_{B_R^c} |f(x)| d^n x ≤ ε.

It is possible to choose such a ball since f ∈ L¹. We then estimate

(2.0.16) |f̂(ξ) − f̂(η)| ≤ ∫_{B_R} |f(x)| |e^{−2πiξ·x} − e^{−2πiη·x}| d^n x + ∫_{B_R^c} |f(x)| |e^{−2πiξ·x} − e^{−2πiη·x}| d^n x
≤ ∫_{B_R} |f(x)| |e^{−2πiξ·x} − e^{−2πiη·x}| d^n x + 2ε,

where in the second integral we used |e^{−2πiξ·x} − e^{−2πiη·x}| ≤ 2 together with (2.0.15).
Now since e^{−2πir} is a uniformly continuous function of the real number r on any compact set, if
|ξ − η| is sufficiently small, then we can ensure that max_{x∈B_R} |e^{−2πiξ·x} − e^{−2πiη·x}| ≤ ε. We then
conclude that the final integral over B_R on the right-hand side of (2.0.16) will be no larger than

(2.0.17) max_{x∈B_R} |e^{−2πiξ·x} − e^{−2πiη·x}| ∫_{B_R} |f(x)| d^n x ≤ ε ∫_{R^n} |f(x)| d^n x def= ε ‖f‖_{L¹}.

Thus, in total, we have shown that if |ξ − η| is sufficiently small, then |f̂(ξ) − f̂(η)| ≤ ε ‖f‖_{L¹} + 2ε.
Since such an estimate holds for all ε > 0, f̂ is continuous by definition. □
It is helpful to introduce notation to indicate that a function has been translated.
Definition 2.0.6 (Translation of a function). If f : R^n → C is a function and y ∈ R^n is any point,
then we define the translated function τy f by

(2.0.18) τ_y f(x) def= f(x − y).
The next theorem collects together some very important properties of the Fourier transform. In
particular, it illustrates how the Fourier transform interacts with translations, derivatives, multiplication
by polynomials, products, convolutions, and complex conjugates.
Theorem 2.1 (Important properties of the Fourier transform). Assume that f, g ∈ L¹(R^n),
and let t ∈ R. Then

(2.0.19a) (τ_y f)^∧(ξ) = e^{−2πiξ·y} f̂(ξ),
(2.0.19b) ĥ(ξ) = τ_η f̂(ξ) if h(x) def= e^{2πiη·x} f(x),
(2.0.19c) ĥ(ξ) = t^n f̂(tξ) if h(x) def= f(t^{−1}x),
(2.0.19d) (f ∗ g)^∧(ξ) = f̂(ξ) ĝ(ξ),
(2.0.19e) if x^α⃗ f ∈ L¹ for |α⃗| ≤ k, then f̂ ∈ C^k and ∂_α⃗ f̂(ξ) = [(−2πix)^α⃗ f(x)]^∧(ξ),
(2.0.19f) if f ∈ C^k, ∂_α⃗ f ∈ L¹ for |α⃗| ≤ k, and ∂_α⃗ f ∈ C_0 for |α⃗| ≤ k − 1, then (∂_α⃗ f)^∧(ξ) = (2πiξ)^α⃗ f̂(ξ),
(2.0.19g) conj(f̂)(ξ) = (f̄)^∨(ξ) and conj(f^∨)(ξ) = (f̄)^∧(ξ).

Above, f¯ denotes the complex conjugate of f ; i.e., if f = u + iv, where u and v are real-valued,
then f¯ = u − iv.
Proof. To prove (2.0.19a), we make the change of variables z = x − y, d^n z = d^n x, and calculate that

(2.0.20) (τ_y f)^∧(ξ) def= ∫_{R^n} f(x − y) e^{−2πix·ξ} d^n x = ∫_{R^n} f(z) e^{−2πi(z+y)·ξ} d^n z = e^{−2πiy·ξ} ∫_{R^n} f(z) e^{−2πiz·ξ} d^n z def= e^{−2πiy·ξ} f̂(ξ).

To prove (2.0.19b), we calculate that

(2.0.21) ĥ(ξ) def= ∫_{R^n} e^{2πiη·x} f(x) e^{−2πix·ξ} d^n x = ∫_{R^n} f(x) e^{−2πix·(ξ−η)} d^n x def= f̂(ξ − η) def= τ_η f̂(ξ).

To prove (2.0.19c), we make the change of variables y = t^{−1}x, d^n y = t^{−n} d^n x, to deduce that

(2.0.22) ĥ(ξ) def= ∫_{R^n} f(t^{−1}x) e^{−2πix·ξ} d^n x = ∫_{R^n} f(y) e^{−2πiy·tξ} t^n d^n y def= t^n f̂(tξ).
To prove (2.0.19d), we use the definition of convolution, (2.0.19a), and Fubini’s theorem to deduce
that

(2.0.23) (f ∗ g)^∧(ξ) def= ∫_{R^n} e^{−2πix·ξ} ( ∫_{R^n} f(x − y) g(y) d^n y ) d^n x = ∫_{R^n} g(y) ( ∫_{R^n} e^{−2πix·ξ} f(x − y) d^n x ) d^n y
= ∫_{R^n} g(y) e^{−2πiξ·y} f̂(ξ) d^n y = f̂(ξ) ∫_{R^n} e^{−2πiξ·y} g(y) d^n y = f̂(ξ) ĝ(ξ),

where the inner x-integral was evaluated using (2.0.19a).

To prove (2.0.19e), we differentiate under the integral in the definition of f̂(ξ) to deduce that

(2.0.24) ∂_α⃗^{(ξ)} f̂(ξ) = ∫_{R^n} f(x) ∂_α⃗^{(ξ)} e^{−2πix·ξ} d^n x = ∫_{R^n} f(x) (−2πix)^α⃗ e^{−2πix·ξ} d^n x def= [(−2πix)^α⃗ f(x)]^∧(ξ),

where the superscript (ξ) indicates differentiation with respect to ξ.

To prove (2.0.19f), we integrate by parts |α⃗| times and use the hypotheses on f to discard the
boundary terms at infinity, thus concluding that

(2.0.25) (∂_α⃗ f)^∧(ξ) def= ∫_{R^n} ∂_α⃗^{(x)} f(x) e^{−2πix·ξ} d^n x = ∫_{R^n} f(x) (−1)^{|α⃗|} ∂_α⃗^{(x)} e^{−2πix·ξ} d^n x
= ∫_{R^n} f(x) (2πiξ)^α⃗ e^{−2πix·ξ} d^n x def= (2πiξ)^α⃗ f̂(ξ).

To deduce the first relation in (2.0.19g), we compute that

(2.0.26) conj(f̂(ξ)) def= conj( ∫_{R^n} f(x) e^{−2πix·ξ} d^n x ) = ∫_{R^n} f̄(x) e^{2πix·ξ} d^n x = (f̄)^∧(−ξ) def= (f̄)^∨(ξ).

The second relation in (2.0.19g) can be shown using similar reasoning. □
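The translation property (2.0.19a) can be sanity-checked numerically. In the sketch below (my addition, not from the notes; `ft` is a crude Riemann-sum version of (1.0.3), and the Gaussian test function, grid parameters, and tolerance are arbitrary choices), both sides of the identity agree to high accuracy because the integrand is smooth and rapidly decaying:

```python
import cmath, math

def ft(f, xi, L=8.0, N=4000):
    """Riemann-sum approximation of (1.0.3) over [-L, L]; adequate when f
    decays fast enough that the tails are negligible."""
    h = 2.0 * L / N
    total = 0j
    for k in range(N):
        x = -L + (k + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * xi * x) * h
    return total

f = lambda x: math.exp(-math.pi * x * x)   # a rapidly decaying test function
y, xi = 0.7, 1.3

lhs = ft(lambda x: f(x - y), xi)                     # (tau_y f)^(xi)
rhs = cmath.exp(-2j * math.pi * xi * y) * ft(f, xi)  # e^{-2 pi i xi y} f^(xi)
assert abs(lhs - rhs) < 1e-6
```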

(2.0.19e) roughly shows that if f decays very rapidly at infinity, then fˆ is very differentiable.
Similarly, (2.0.19f) roughly shows that if f is very differentiable with rapidly decaying derivatives,
then fˆ also rapidly decays. The Fourier transform thus connects the decay properties of f to
the differentiability properties of fˆ, and vice versa. In the next proposition, we provide a specific
example of these phenomena. More precisely, the next proposition shows that the Fourier transform
of a smooth, compactly supported function is itself smooth and rapidly decaying at infinity.
Proposition 2.0.2. Let f ∈ Cc∞ (Rn ), i.e., f is a smooth, compactly supported function. Then fˆ
is smooth and “rapidly decaying at infinity” in the following sense: for each N ≥ 0, there exists a
constant CN > 0 such that

(2.0.27) |fˆ(ξ)| ≤ CN (1 + |ξ|)−N .


Furthermore, an estimate similar to (2.0.27) holds (with possibly different constants) for all of the
derivatives |∂β~ fˆ(ξ)|.
In particular, f̂ ∈ L¹:

(2.0.28) ‖f̂‖_{L¹} def= ∫_{R^n} |f̂(ξ)| d^n ξ < ∞,

and similarly for ∂_β⃗ f̂, where β⃗ is any derivative multi-index.


Proof. Using (2.0.19e) and the fact that f is compactly supported (and hence xα~ f ∈ L1 ), we see
that fˆ is smooth.
To prove (2.0.27), we use (2.0.19f), (2.0.13), and the fact that k∂α~ f kL1 < ∞ for any differential
operator ∂α~ to deduce that

(2.0.29) |(2πiξ)α~ fˆ(ξ)| = |(∂α~ f )∧ (ξ)| ≤ k∂α~ f kL1 = Cα~ ,


where C_α⃗ is a constant depending on α⃗. In particular, if M ≥ 0 is an integer, then by applying
(2.0.29) to the differential operator ∆^M def= (Σ_{i=1}^n ∂_i²)^M — i.e., using that (Σ_{i=1}^n (ξ^i)²)^M = |ξ|^{2M}, so that |(2πi)^{2M} |ξ|^{2M} f̂(ξ)| = |(∆^M f)^∧(ξ)| ≤ C_M — it follows that

(2.0.30) (2π|ξ|)^{2M} |f̂(ξ)| ≤ C_M

for some constant C_M > 0. It is easy to see that an estimate of the form (2.0.27) follows from
(2.0.30).

(2.0.28) follows from (2.0.27) and the fact that

(2.0.31) ∫_{R^n} (1 + |ξ|)^{−(n+1)} d^n ξ < ∞.

To see that (2.0.31) holds, perform the integration using spherical coordinates on R^n:

(2.0.32) ∫_{R^n} (1 + |ξ|)^{−(n+1)} d^n ξ = ω_n ∫_{ρ=0}^∞ ρ^{n−1} (1 + ρ)^{−(n+1)} dρ,

where ρ def= |ξ| def= (Σ_{j=1}^n (ξ^j)²)^{1/2} is the radial variable on R^n, and ω_n is the surface area of the unit
ball in R^n. By a simple comparison estimate, it is easy to see that the integral on the right-hand
side of (2.0.32) converges (the integrand is bounded near ρ = 0, and behaves like 1/ρ² near ∞).
To show that similar results hold for ∂_β⃗ f̂, we first use (2.0.19e) to conclude that

(2.0.33) ∂_β⃗ f̂(ξ) = [(−2πix)^β⃗ f(x)]^∧(ξ).

Furthermore, the function (−2πix)^β⃗ f(x) also satisfies the hypotheses of the proposition. We can
therefore repeat the above arguments with ∂_β⃗ f̂ in place of f̂ and (−2πix)^β⃗ f(x) in place of f. □


3. Gaussians
One of the most important classes of functions in Fourier theory is the class of Gaussians. The
next proposition shows that this class interacts very nicely with the Fourier transform.
Proposition 3.0.3 (The Fourier transform of a Gaussian is another Gaussian). Let
f(x) = exp(−πz|x|²), where z = a + ib is a complex number, a, b ∈ R, a > 0, x = (x^1, · · · , x^n) ∈ R^n,
and |x|² def= Σ_{j=1}^n (x^j)². Then

(3.0.34) f̂(ξ) = z^{−n/2} exp(−π|ξ|²/z).


Proof. We consider only the case b = 0, so that z = a. The case b ≠ 0 follows from an
argument similar to the one we give below, but requiring a few additional technical details. We first
address the case n = 1. Then by properties (2.0.19e)-(2.0.19f) of Theorem 2.1, we have that

(3.0.35) f̂′(ξ) = (−2πix e^{−aπx²})^∧(ξ) = ( (i/a) (d/dx) e^{−aπx²} )^∧(ξ) = (i/a)(2πiξ) f̂(ξ) = (−2π/a) ξ f̂(ξ).

We can view (3.0.35) as

(3.0.36) (d/dξ) ln f̂ = (−2π/a) ξ.

Integrating (3.0.36) with respect to ξ and then exponentiating both sides, we conclude that

(3.0.37) f̂(ξ) = C exp(−πξ²/a).

Furthermore, the constant C clearly must be equal to f̂(0).

We now compute f̂(0):

(3.0.38) f̂(0) def= ∫_R e^{−πax²} e^{−2πi·0·x} dx = ∫_R e^{−πax²} dx = a^{−1/2}.

Note that you have previously calculated this integral in your homework. Combining (3.0.37) and
(3.0.38), we arrive at the desired expression (3.0.34) in the case n = 1.
To treat the case of general n, we note that the properties of the exponential function and the
Fubini theorem together allow us to reduce it to the case n = 1:

(3.0.39) f̂(ξ) = ∫_{R^n} exp(−πa|x|²) exp(−2πiξ · x) d^n x
= ∫_{R^n} exp(−πa Σ_{k=1}^n (x^k)²) exp(−2πi Σ_{j=1}^n ξ^j x^j) d^n x
= ∫_{R^n} Π_{j=1}^n { exp(−πa(x^j)²) exp(−2πiξ^j x^j) } d^n x
= Π_{j=1}^n { ∫_R exp(−πa(x^j)²) exp(−2πiξ^j x^j) dx^j }
= Π_{j=1}^n a^{−1/2} exp(−π(ξ^j)²/a)
= a^{−n/2} exp(−πa^{−1} Σ_{j=1}^n (ξ^j)²)
= a^{−n/2} exp(−π|ξ|²/a).
We have thus shown (3.0.34).
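Formula (3.0.34) in the real case z = a can also be verified by direct numerical integration. The following sketch is an illustrative addition (the helper `ft_gauss` and the choices a = 2, L, N are mine); it checks the formula at a few frequencies:

```python
import cmath, math

def ft_gauss(a, xi, L=10.0, N=8000):
    """Riemann-sum approximation of the Fourier transform of exp(-pi*a*x^2)
    for real a > 0 (the b = 0 case treated in the proof above)."""
    h = 2.0 * L / N
    total = 0j
    for k in range(N):
        x = -L + (k + 0.5) * h
        total += math.exp(-math.pi * a * x * x) * cmath.exp(-2j * math.pi * xi * x) * h
    return total

a = 2.0
for xi in (0.0, 0.5, 1.5):
    exact = a ** -0.5 * math.exp(-math.pi * xi * xi / a)  # formula (3.0.34) with z = a, n = 1
    assert abs(ft_gauss(a, xi) - exact) < 1e-8
```

The agreement is essentially at machine precision: for analytic, rapidly decaying integrands, the uniform Riemann sum converges spectrally fast.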


4. Fourier Inversion and the Plancherel Theorem


The next lemma is very important. It shows that the Fourier transform interacts nicely with the
L2 inner product.
Lemma 4.0.4 (Interaction of the Fourier transform with the L2 inner product). Assume
that f, g ∈ L1 . Then
(4.0.40) ∫_{R^n} f̂(x) g(x) d^n x = ∫_{R^n} f(x) ĝ(x) d^n x.

Alternatively, in terms of the complex L2 inner product, we have that

(4.0.41) hfˆ, gi = hf, g ∨ i.



Proof. Using the definition of the Fourier transform and Fubini’s theorem, the left-hand side of
(4.0.40) is equal to

(4.0.42) ∫_{R^n} ∫_{R^n} f(ξ) g(x) e^{−2πiξ·x} d^n ξ d^n x.

By the same reasoning, this is also equal to the right-hand side of (4.0.40).
To obtain (4.0.41), simply replace g with ḡ in the identity (4.0.40) and use property (2.0.19g).


The next theorem is central to Fourier analysis. It shows that the operators ∧ and ∨ are inverses
of each other whenever f and fˆ are nice functions.
Theorem 4.1 (Fourier inversion theorem). Suppose that f : Rn → C is a continuous function,
that f ∈ L1 , and that fˆ ∈ L1 . Then

(4.0.43) (fˆ)∨ = (f ∨ )∧ = f.
That is, the operators ∧ and ∨ are inverses of each other.
Proof. We first note that

(4.0.44) (f̂)^∨(x) def= ∫_{R^n} { ∫_{R^n} f(y) e^{−2πiy·ξ} d^n y } e^{2πix·ξ} d^n ξ.

Note that the integral in (4.0.44) is not absolutely convergent when viewed as a function of (y, ξ) ∈
Rn × Rn . Thus, our proof of (4.0.43) will involve a slightly delicate limiting procedure that makes
use of the auxiliary function

(4.0.45) φ(t, ξ) def= exp(−πt²|ξ|² + 2πiξ · x).
Note that (2.0.19b) and Proposition 3.0.3 together imply that

(4.0.46) φ̂(t, y) = t^{−n} exp(−π|x − y|²/t²) def= Γ(t, x − y),

where

(4.0.47) Γ(t, y) def= t^{−n} exp(−π|y|²/t²),

and where the Fourier transform of φ is taken with respect to the variable ξ.
Also note that Γ(t, y) is just the fundamental solution of the heat equation with diffusion constant
D = 1/(4π). In particular, we previously showed in our study of the heat equation that

(4.0.48) ∫_{R^n} Γ(t, y) d^n y = 1

for all t > 0. We now compute that

(4.0.49) (Γ(t, ·) ∗ f)(x) def= ∫_{R^n} Γ(t, x − y) f(y) d^n y
= ∫_{R^n} φ̂(t, y) f(y) d^n y
= ∫_{R^n} φ(t, ξ) f̂(ξ) d^n ξ
= ∫_{R^n} exp(−πt²|ξ|²) f̂(ξ) exp(2πiξ · x) d^n ξ,

where Lemma 4.0.4 was used to pass to the third line.
During our study of the heat equation, we showed that the left-hand side of (4.0.49) converges to
f (x) as t ↓ 0. To complete the proof of the theorem, it remains to show that the right-hand side
converges to
(4.0.50) ∫_{R^n} f̂(ξ) exp(2πiξ · x) d^n ξ def= (f̂)^∨(x) def= (f̂)^∧(−x)

as t ↓ 0. To this end, given any number ε > 0, choose a ball B_R of radius R centered at the origin
such that

(4.0.51) ∫_{B_R^c} |f̂(ξ)| d^n ξ ≤ ε.

Above, B_R^c denotes the complement of the ball. It is possible to choose such a ball since f̂ ∈ L¹.
We then estimate

(4.0.52) | ∫_{R^n} exp(−πt²|ξ|²) f̂(ξ) exp(2πiξ · x) d^n ξ − (f̂)^∨(x) |
def= | ∫_{R^n} exp(−πt²|ξ|²) f̂(ξ) exp(2πiξ · x) d^n ξ − ∫_{R^n} f̂(ξ) exp(2πiξ · x) d^n ξ |
≤ ∫_{R^n} |exp(−πt²|ξ|²) − 1| |f̂(ξ)| d^n ξ
≤ max_{ξ∈B_R} |exp(−πt²|ξ|²) − 1| ∫_{B_R} |f̂(ξ)| d^n ξ + ∫_{B_R^c} |exp(−πt²|ξ|²) − 1| |f̂(ξ)| d^n ξ
≤ max_{ξ∈B_R} |exp(−πt²|ξ|²) − 1| ‖f̂‖_{L¹} + ∫_{B_R^c} |f̂(ξ)| d^n ξ
≤ max_{ξ∈B_R} |exp(−πt²|ξ|²) − 1| ‖f̂‖_{L¹} + ε,

where we have used |exp(−πt²|ξ|²) − 1| ≤ 1 on B_R^c together with (4.0.51).
As t ↓ 0, the first term on the right-hand side of (4.0.52) converges to 0. In particular, if t is
sufficiently small, then the right-hand side of (4.0.52) will be no larger than 2ε. Since this holds
for any ε > 0, we have thus shown that the right-hand side of (4.0.49) converges to the expression
(4.0.50) as t ↓ 0, i.e., that it converges to (f̂)^∨(x). Since, as we have previously noted, the left-hand
side of (4.0.49) converges to f(x) as t ↓ 0, we have thus shown that (f̂)^∨(x) = f(x).
It can similarly be shown that (f^∨)^∧(x) = f(x). This completes the proof of (4.0.43). □
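For a concrete instance of Theorem 4.1, one can push a rapidly decaying function through a discretized version of (1.0.3) and then through (1.0.4) and confirm that the original values come back. The script below is an illustrative addition, not part of the notes (the grid sizes, the Gaussian test function, and the evaluation point are arbitrary choices):

```python
import cmath, math

f = lambda x: math.exp(-math.pi * x * x)   # continuous, with f and f^ both in L^1

L, N = 5.0, 800
h = 2.0 * L / N
grid = [-L + (k + 0.5) * h for k in range(N)]
fvals = [f(x) for x in grid]

# Step 1: approximate f^(xi) on the grid, via a Riemann sum for (1.0.3).
fhat = [sum(fv * cmath.exp(-2j * math.pi * xi * x) for fv, x in zip(fvals, grid)) * h
        for xi in grid]

# Step 2: approximate (f^)v(x0), via a Riemann sum for (1.0.4).
x0 = 0.4
recovered = sum(fh * cmath.exp(2j * math.pi * xi * x0) for fh, xi in zip(fhat, grid)) * h

assert abs(recovered - f(x0)) < 1e-8
```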


The next theorem plays a central role in many areas of PDE and analysis. It shows that the
Fourier transform preserves the L2 norm of functions.
Theorem 4.2 (The Plancherel theorem). Suppose that f, g : Rn → C are continuous functions,
that f, g ∈ L1 ∩ L2 , and that fˆ, ĝ ∈ L1 . Then fˆ, ĝ ∈ L2 , and

(4.0.53) hf, gi = hfˆ, ĝi,


i.e., the Fourier transform preserves the L2 inner product. In particular, by setting f = g, it follows
from (4.0.53) that

(4.0.54) kf kL2 = kfˆkL2 .


Proof. By applying (4.0.41) with g replaced by ĝ, we have that

(4.0.55) hfˆ, ĝi = hf, (ĝ)∨ i.


By the Fourier inversion theorem (i.e. Theorem 4.1), we have that (ĝ)∨ = g, and so the right-hand
side of (4.0.55) is equal to

(4.0.56) hf, gi.


We have thus shown (4.0.53).
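The Plancherel identity can be checked concretely for the Gaussians of Proposition 3.0.3. In the sketch below (my addition; the value a = 3 and the quadrature parameters are arbitrary choices), both ‖f‖² and ‖f̂‖² come out equal to the exact value (2a)^{−1/2}:

```python
import math

def l2_norm_sq(g, L=10.0, N=20000):
    """Riemann-sum approximation of ||g||^2 = int |g(x)|^2 dx for a rapidly
    decaying real-valued g."""
    h = 2.0 * L / N
    return sum(g(-L + (k + 0.5) * h) ** 2 for k in range(N)) * h

a = 3.0
f = lambda x: math.exp(-math.pi * a * x * x)
fhat = lambda xi: a ** -0.5 * math.exp(-math.pi * xi * xi / a)  # by (3.0.34)

lhs = l2_norm_sq(f)      # ||f||^2; the exact value is (2a)^{-1/2}
rhs = l2_norm_sq(fhat)   # ||f^||^2
assert abs(lhs - rhs) < 1e-10
assert abs(lhs - (2.0 * a) ** -0.5) < 1e-10
```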

MATH 18.152 COURSE NOTES - CLASS MEETING # 19

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting # 19: Schrödinger’s Equation

1. Introduction
Schrödinger’s equation is the fundamental PDE of quantum mechanics. In the case of a single
quantum particle, the unknown function is the wave function ψ(t, x), which is a map from R1+n
into the complex numbers:

ψ : R1+n → C.

Above and throughout these notes, t is the time coordinate, and x = (x1 , · · · , xn ) are the spatial
coordinates. Schrödinger’s equation is

(1.0.1) i∂_t ψ(t, x) + (1/2)∆ψ(t, x) = V(t, x)ψ(t, x),

where ∆ = Σ_{i=1}^n ∂_i² is the usual Laplacian with respect to the spatial variables, and V(t, x) is the
potential, which models the interaction of the particle with its environment. In this course, we
will mainly consider the case of free particles, in which V = 0 (i.e., the homogeneous Schrödinger
equation). In the case of free particles, there is an important family of solutions to (1.0.1), namely
the free waves. The free wave solutions provide some important intuition about how solutions to
the homogeneous Schrödinger equation behave. To derive the free wave solutions, we first make the
assumption that

(1.0.2) ψ(t, x) = ei(ωt−ξ·x) ,


where · is the Euclidean dot product. Above, ω ∈ R is the frequency, and ξ ∈ R^n is the wave vector.
Note that (1.0.2) can be written as e^{i|ξ|((ω/|ξ|)t − (ξ/|ξ|)·x)}, where |ξ| is the Euclidean length of ξ. Since
ξ/|ξ| is a unit vector in R^n, it therefore follows that the speed of the plane wave is

(1.0.3) ω/|ξ|.
Plugging (1.0.2) into (1.0.1), we derive the algebraic relation

(1.0.4) −(ω + |ξ|²/2) e^{i(ωt−ξ·x)} = 0,

which implies

(1.0.5) ω = −|ξ|²/2,
(1.0.6) ω/|ξ| = −|ξ|/2.
These conditions are necessary and sufficient in order for the function given in (1.0.2) to solve
(1.0.1) when V = 0. Note in particular that (1.0.6) shows that the speed of the plane wave solution
depends on |ξ|, and in particular that larger values of |ξ| lead to larger speeds. The dependence of the speed
of the plane wave on ξ is known as dispersion, and (1.0.5) is known as the dispersion relation of
Schrödinger’s equation.
Dispersion plays a very important role in the analysis of certain PDEs, and in particular Schrödinger’s
equation. Heuristically, one sometimes imagines that a “typical” solution to a dispersive PDE is
composed of many free waves, each moving at a different speed and/or spatial direction (at least
when the dispersion relation is non-trivial). The dispersive nature of the PDE suggests that the
different free wave components in the solution should separate from each other. As we will see
(see e.g. Theorem 2.1), this heuristic argument is sometimes rigorously borne out, and separation
can cause the overall amplitude of the solution to decay in time (frequently at a rate of t to some
negative power).
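The algebra leading to (1.0.5) can be confirmed numerically by applying the free Schrödinger operator i∂_t + (1/2)∂_x² (i.e., (1.0.1) with V = 0 and n = 1) to a plane wave via finite differences; the residual should vanish up to discretization error. This sketch is an illustrative addition (the wave vector, sample point, step size, and tolerance are arbitrary choices):

```python
import cmath

xi = 2.0
omega = -xi * xi / 2.0                                    # the dispersion relation (1.0.5)
psi = lambda t, x: cmath.exp(1j * (omega * t - xi * x))   # the plane wave (1.0.2), n = 1

h = 1e-4
t0, x0 = 0.3, 1.7
psi_t  = (psi(t0 + h, x0) - psi(t0 - h, x0)) / (2.0 * h)          # central difference in t
psi_xx = (psi(t0, x0 + h) - 2.0 * psi(t0, x0) + psi(t0, x0 - h)) / (h * h)

residual = 1j * psi_t + 0.5 * psi_xx   # should vanish exactly for the true solution
assert abs(residual) < 1e-5
```

Repeating the experiment with an ω that violates (1.0.5) produces an O(1) residual, which is a quick way to see that the dispersion relation is forced.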

2. The Fundamental Solution


We are now going to study the following global Cauchy problem for Schrödinger’s equation:

(2.0.7a) i∂_t ψ(t, x) + (1/2)∆ψ(t, x) = 0,
(2.0.7b) ψ(0, x) = φ(x).
Let’s start by momentarily forgetting about the initial data and instead trying to find the fundamental
solution K(t, x) to equation (2.0.7a). We will precisely define the fundamental solution below;
it is analogous to the fundamental solution for the heat equation. As we will see, the techniques
from Fourier analysis that we have previously developed will allow us to derive the fundamental
solution with relative ease. To this end, we set ψ(t, x) = K(t, x), take the spatial Fourier transform
of equation (2.0.7a), and use the Fourier transform property (∂_α⃗ K)^∧(t, ξ) = (2πiξ)^α⃗ K̂(t, ξ) (and in
particular (∆K)^∧(t, ξ) = −4π²|ξ|² K̂(t, ξ)) to deduce the following ODE for K̂(t, ξ):

(2.0.8) i∂_t K̂(t, ξ) − 2π²|ξ|² K̂(t, ξ) = 0.

We rewrite (2.0.8) as

(2.0.9) ∂_t ln K̂(t, ξ) = −2π²i|ξ|²,

which can be easily integrated to give

(2.0.10) K̂(t, ξ) = C e^{−2π²it|ξ|²},

where C = C(ξ) is a constant that we have to calculate.

To calculate C(ξ), we recall that we are ultimately trying to solve the following initial value
problem for Schrödinger’s equation:

(2.0.11a) i∂_t ψ(t, x) + (1/2)∆ψ(t, x) = 0,
(2.0.11b) ψ(0, x) = φ(x).
Since K(t, x) is supposed to be the fundamental solution, we would like (in analogy with the results
of our study of the heat equation) the solution to (2.0.11a) - (2.0.11b) to be of the form

(2.0.12) ψ(t, x) = (K(t, ·) ∗ φ(·))(x).


Formally taking the Fourier transform of (2.0.12), using the fact that the Fourier transform turns
convolutions into products, and using (2.0.10), we arrive at the formal relation

(2.0.13) ψ̂(t, ξ) = K̂(t, ξ) φ̂(ξ) = C(ξ) e^{−2π²it|ξ|²} φ̂(ξ).

Since (2.0.13) must in particular hold at t = 0, it is easy to see that

(2.0.14) C(ξ) = 1.
Thus, the spatial Fourier transform of K can be expressed as

(2.0.15) K̂(t, ξ) = e^{−2π²it|ξ|²}.
In the next proposition, we make rigorous sense of the above formal calculations, and we calculate
K(t, x) from K̂(t, ξ).

Proposition 2.0.1 (Calculation of the Fundamental Solution K(t, x) for Schrödinger’s
equation). Let φ(x) be a smooth compactly supported function, and let ψ(t, x) be the function
whose spatial Fourier transform is defined as in (2.0.13):

(2.0.16) ψ̂(t, ξ) = K̂(t, ξ) φ̂(ξ),

where K̂(t, ξ) is defined in (2.0.15). Then if t > 0, we have that

(2.0.17) ψ(t, x) def= (K(t, ·) ∗ φ)(x) = ∫_{R^n} K(t, y) φ(x − y) d^n y = ∫_{R^n} K(t, x − y) φ(y) d^n y,

where

(2.0.18) K(t, x) = (2πit)^{−n/2} e^{i|x|²/(2t)}.

Above, i^{1/2} = e^{iπ/4} = (1/√2)(1 + i).

Remark 2.0.1. We refer to K̂(t, ξ) as the Fourier transform of K(t, x), and to K(t, x) as the inverse
Fourier transform of K̂(t, ξ).
Remark 2.0.2. Note that K(t, ·) is not an element of L¹ because ∫_{R^n} |K(t, x)| d^n x = ∞. Since many
of our previous results for the Fourier transform used the assumption that K(t, ·) ∈ L¹, our analysis
of K(t, x) must be more delicate than these results.
Proof. For simplicity, let’s consider only the case n = 1. Previously, we showed that since φ is
smooth and compactly supported, φ̂ is smooth, is rapidly decaying at infinity, and is an element of
L¹. Therefore, the same is true of the function ψ̂(t, ξ) = e^{−2π²it|ξ|²} φ̂(ξ). Thus, by the Fourier inversion
theorem, ψ(t, x) is the inverse Fourier transform of ψ̂(t, ξ):

(2.0.19) ψ(t, x) = (ψ̂)^∨(t, x) def= ∫_R e^{2πiξx} ψ̂(t, ξ) dξ = ∫_R e^{2πiξx} e^{−2π²it|ξ|²} φ̂(ξ) dξ.

To complete the proof, we will use the fact that the aforementioned properties of φ̂ together
with the expression (2.0.19) allow us to express

(2.0.20) ψ(t, x) = lim_{δ→0+} ∫_R e^{2πiξx} e^{−2π²(δ+i)t|ξ|²} φ̂(ξ) dξ.
We will show (2.0.20) at the end of the proof; let us take it for granted at the moment.
Defining

(2.0.21) f_{δ;t}(ξ) def= e^{−2π²(δ+i)t|ξ|²},

we see that (2.0.20) is by definition equivalent to

(2.0.22) ψ(t, x) = lim_{δ→0+} (f_{δ;t} φ̂)^∨(x).
Note that f_{δ;t} is a Gaussian whose argument has negative real part. Thus, we have previously
calculated its inverse Fourier transform:

(2.0.23) f_{δ;t}^∨(x) = (2π(δ + i)t)^{−1/2} e^{−|x|²/(2t(δ+i))}.

Furthermore, it is easy to see that

(2.0.24) lim_{δ→0+} f_{δ;t}^∨(x) = (2πit)^{−1/2} e^{i|x|²/(2t)}.

We note that in the formula (2.0.24), i^{1/2} = e^{iπ/4} = (1/√2)(1 + i).
Using (2.0.22), the Fourier transform/Fourier inversion identity (uv)^∨ = u^∨ ∗ v^∨, and the
Fourier inversion theorem ((φ̂)^∨ = φ), we have that

(2.0.25) ψ(t, x) = lim_{δ→0+} [f_{δ;t}^∨ ∗ φ](x) def= lim_{δ→0+} ∫_R f_{δ;t}^∨(x − y) φ(y) dy
= ∫_R lim_{δ→0+} f_{δ;t}^∨(x − y) φ(y) dy
= (2πit)^{−1/2} ∫_R e^{i|x−y|²/(2t)} φ(y) dy.

We are allowed to bring the limit inside the integral in (2.0.25) because φ(y) is smooth and compactly
supported and because (for each fixed t > 0) the limit (2.0.24) is achieved uniformly on compact
spatial sets. We have thus shown (2.0.17).
It remains to prove (2.0.20). We need to show that

(2.0.26) ∫_R e^{2πiξx} e^{−2π²it|ξ|²} ( e^{−2π²δt|ξ|²} − 1 ) φ̂(ξ) dξ

goes to 0 as δ ↓ 0. As we have previously discussed several times, the key to such an estimate is to
split the integral over R into an integral over a ball [−R, R] and its complement. More precisely,
for any R > 0, the absolute value of the expression (2.0.26) can be bounded as follows:

(2.0.27) ≤ ∫_{[−R,R]} |e^{−2π²δt|ξ|²} − 1| |φ̂(ξ)| dξ + ∫_{|ξ|≥R} |e^{−2π²δt|ξ|²} − 1| |φ̂(ξ)| dξ
≤ max_{ξ∈[−R,R]} |e^{−2π²δt|ξ|²} − 1| ∫_{[−R,R]} |φ̂(ξ)| dξ + ∫_{|ξ|≥R} |φ̂(ξ)| dξ
def= I + II,

where in the second integral we used |e^{−2π²δt|ξ|²} − 1| ≤ 1.
Let ε > 0 be a positive number. In our previous studies of the Fourier transform, we showed that
(see also the remarks above) ∫_R |φ̂(ξ)| dξ def= ‖φ̂‖_{L¹} < ∞. Now by Taylor expanding, we see that the
following inequality holds whenever R > 0, ξ ∈ [−R, R], and δtR² is sufficiently small:

(2.0.28) |e^{−2π²δt|ξ|²} − 1| ≤ CδtR²,

where C is a positive constant. Thus, we have the following estimate, valid whenever δtR² is
sufficiently small:

(2.0.29) |I| ≤ CδtR² ∫_{[−R,R]} |φ̂(ξ)| dξ ≤ CδtR² ‖φ̂‖_{L¹}.

Furthermore, since ‖φ̂‖_{L¹} < ∞, if R is sufficiently large, then

(2.0.30) |II| ≤ ε.

Thus, if t is fixed, R is first chosen to be sufficiently large, and then δ is chosen to be sufficiently
small, we have that

(2.0.31) |I| + |II| ≤ CδtR² ‖φ̂‖_{L¹} + ε ≤ 2ε.

In total, we have shown that if δ is sufficiently small, then the expression (2.0.26) has magnitude ≤ 2ε.
Since this holds for any ε > 0, we have thus shown (2.0.20). □

We now formally define the fundamental solution.

Definition 2.0.1 (The Fundamental Solution to Schrödinger’s equation). The fundamental
solution associated to (1.0.1) is the function K(t, x) = (2πit)^{−n/2} e^{i|x|²/(2t)} given in (2.0.18).

As an exercise, let’s check that K(t, x) verifies the free Schrödinger equation.


Lemma 2.0.2 (K(t, x) verifies the free Schrödinger equation). For t > 0, K(t, x) is a solution
to the free Schrödinger equation.
Proof. We use the chain rule to calculate

(2.0.32) ∂_j e^{i|x|²/(2t)} = (i/t) x^j e^{i|x|²/(2t)},
(2.0.33) ∂_j² e^{i|x|²/(2t)} = (i/t)(1 + (i/t)(x^j)²) e^{i|x|²/(2t)},
(2.0.34) (1/2)∆K(t, x) = (2πit)^{−n/2} ( (in)/(2t) − |x|²/(2t²) ) e^{i|x|²/(2t)},
(2.0.35) i∂_t K(t, x) = (2πit)^{−n/2} ( −(in)/(2t) + |x|²/(2t²) ) e^{i|x|²/(2t)}.

From the last two calculations, it easily follows that

(2.0.36) i∂_t K(t, x) + (1/2)∆K(t, x) = 0.
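Lemma 2.0.2 can also be checked numerically in one spatial dimension: evaluate (2.0.18) exactly and apply the operator i∂_t + (1/2)∂_x² by central finite differences. The sketch below is my addition; the sample point, step size, and tolerance are arbitrary choices:

```python
import cmath

def K(t, x):
    """The fundamental solution (2.0.18) with n = 1; cmath.sqrt uses the
    principal branch, consistent with i^{1/2} = e^{i pi/4}."""
    return cmath.exp(1j * x * x / (2.0 * t)) / cmath.sqrt(2j * cmath.pi * t)

h = 1e-4
t0, x0 = 1.0, 0.5
K_t  = (K(t0 + h, x0) - K(t0 - h, x0)) / (2.0 * h)
K_xx = (K(t0, x0 + h) - 2.0 * K(t0, x0) + K(t0, x0 - h)) / (h * h)

residual = 1j * K_t + 0.5 * K_xx   # the free Schrodinger operator applied to K
assert abs(residual) < 1e-6
```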

We would like our fundamental solution to have the property that limt→0+ ψ(t, x) = φ(x) for nice
def
functions φ, where ψ(t, x) = [K(t, ·) ∗ φ(·)](x). Now using (2.0.13), if the initial datum φ is smooth
and compactly supported (and therefore, as previously shown, φˆ is smooth and rapidly decaying),
it is not difficult to show that

(2.0.37) lim_{t↓0} ‖ψ̂(t, ·) − φ̂‖_{L²} = 0.

(2.0.13) shows that the transformed function ψ̂(t, ·) converges to the transformed datum φ̂(·) in
the L² norm as t ↓ 0. But how does the function ψ(t, ·) def= [K(t, ·) ∗ φ(·)](x) behave as t ↓ 0? By
(2.0.17), this is equivalent to studying the behavior of (2πit)^{−n/2} ∫_{R^n} e^{i|x−y|²/(2t)} φ(y) d^n y as t ↓ 0.
The next proposition briefly addresses this surprisingly difficult question.
Proposition 2.0.3 (The behavior of K(t, ·) ∗ φ(·) as t ↓ 0). Let φ ∈ C_c^∞(R^n). Then

(2.0.38) lim_{t→0+} (2πit)^{−n/2} ∫_{R^n} e^{i|x−y|²/(2t)} φ(y) d^n y = φ(x).

Proof. The proof of this proposition requires a technically involved technique from Fourier analysis
known as the method of stationary phase; it is therefore slightly beyond the scope of this course.
The main difficulty is that most of the important behavior in (2.0.38) is due to the rapid
oscillation in y of the integrand (except when y is near x!) as t ↓ 0. □
We are now ready to state and prove the main theorem concerning the solution to the free
Schrödinger equation.

Theorem 2.1 (The Solution to the Global Cauchy Problem for Schrödinger’s Equation
and the Dispersive Estimate). Let φ(x) ∈ C_c^∞(Rⁿ). Then there exists a unique solution ψ ∈
C ∞ ((0, ∞) × Rn ) to the free Schrödinger equation

(2.0.39a) i ∂_t ψ(t, x) + (1/2) Δψ(t, x) = 0, t > 0, x ∈ Rⁿ,

(2.0.39b) ψ(0, x) = φ(x), x ∈ Rⁿ.
The solution can be expressed as

(2.0.40) ψ(t, x) = [K(t, ·) ∗ φ(·)](x),


where K(t, x) is the fundamental solution defined in (2.0.18).
Furthermore, for each t > 0, the solution ψ(t, x) verifies the dispersive estimate
(2.0.41) ‖ψ(t, ·)‖_{C⁰} ≝ max_{x∈Rⁿ} |ψ(t, x)| ≤ (C/t^{n/2}) ‖φ‖_{L¹} ≝ (C/t^{n/2}) ∫_{Rⁿ} |φ(x)| dⁿx.
Above, C > 0 is a constant that does not depend on the initial data.
Proof. Let L ≝ i∂_t + (1/2)Δ_x denote the free Schrödinger operator. By definition, we have that

(2.0.42) [K(t, ·) ∗ φ(·)](x) = ∫_{Rⁿ} φ(y) (2πit)^{−n/2} e^{i|x−y|²/(2t)} dⁿy.
According to our previously discussed differentiation-under-the-integral theorem (and making use
of our assumptions on φ(x)), for t > 0, we can differentiate under the integral in (2.0.42) and use
Lemma 2.0.2 to deduce that
(2.0.43) L[K(t, ·) ∗ φ(·)](x) = ∫_{Rⁿ} φ(y) L{ (2πit)^{−n/2} e^{i|x−y|²/(2t)} } dⁿy = 0.
Thus, K(t, ·) ∗ φ verifies Schrödinger’s equation (2.0.39a).
The fact that ψ ∈ C ∞ ((0, ∞) × Rn ) follows from expressing
(2.0.44) [K(t, ·) ∗ φ(·)](x) = ∫_{Rⁿ} φ(x − y) (2πit)^{−n/2} e^{i|y|²/(2t)} dⁿy

and repeatedly differentiating with respect to x under the integral.
To prove (2.0.41), we note that the following simple pointwise inequality follows easily from
(2.0.42):

(2.0.45) |[K(t, ·) ∗ φ(·)](x)| ≤ ∫_{Rⁿ} |φ(y)| |(2πit)^{−n/2} e^{i|x−y|²/(2t)}| dⁿy
≤ (2π)^{−n/2} t^{−n/2} ∫_{Rⁿ} |φ(y)| dⁿy ≝ (2π)^{−n/2} t^{−n/2} ‖φ‖_{L¹}.

Taking the max over all x ∈ Rⁿ, the estimate (2.0.41) thus follows. □
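The t^{−n/2} decay in (2.0.41) can also be observed numerically. The sketch below is our illustrative addition (the box size and the Gaussian datum are arbitrary choices of ours): it evolves the datum in n = 1 on a large periodic grid via the unitary Fourier multiplier e^{−ik²t/2} (the angular-frequency form of the solution formula) and checks that max_x |ψ(t, x)| · t^{1/2}/‖φ‖_{L¹} stays below the constant C = (2π)^{−1/2} ≈ 0.399 appearing in (2.0.45).

```python
import numpy as np

# Spectral solution of i ψ_t + (1/2) ψ_xx = 0: ψ̂(t, k) = e^{-i k² t / 2} φ̂(k).
N, box = 4096, 400.0
dx = box / N
x = (np.arange(N) - N // 2) * dx
k = 2 * np.pi * np.fft.fftfreq(N, d=dx)        # angular frequencies
phi = np.exp(-x**2)                            # smooth, rapidly decaying datum
phi_L1 = np.sum(np.abs(phi)) * dx              # ≈ ‖φ‖_{L¹} = √π

def psi(t):
    return np.fft.ifft(np.exp(-0.5j * k**2 * t) * np.fft.fft(phi))

# The dispersive estimate for n = 1: max |ψ(t)| ≤ (2π)^{-1/2} t^{-1/2} ‖φ‖_{L¹}
ratios = [np.max(np.abs(psi(t))) * np.sqrt(t) / phi_L1 for t in (1.0, 4.0, 16.0)]
assert all(r < (2 * np.pi) ** -0.5 for r in ratios)
```

For the Gaussian datum the ratio increases toward (2π)^{−1/2} as t grows, suggesting the estimate captures the true decay rate.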


Let’s now prove a very important property of sufficiently regular solutions to the free Schrödinger
equation: their L2 norm is constant in time.
Proposition 2.0.4 (Preservation of L² norm). Under the assumptions of Theorem 2.1, we have
that

(2.0.46) ‖ψ(t, ·)‖_{L²} = ‖φ‖_{L²}

(the right-hand side being the L² norm of the data), where the L² norm on the left-hand side of (2.0.46) is taken over the spatial variables only. In particular,
if ∫_{Rⁿ} |φ(x)|² dⁿx = 1, then ∫_{Rⁿ} |ψ(t, x)|² dⁿx = 1 holds for all t ≥ 0.
Proof. We give two proofs, the first using the original solution, and the second using its Fourier
transform; both proofs are important. For the first proof, we begin by noting that if

(2.0.47) i ∂_t ψ(t, x) + (1/2) Δψ(t, x) = 0,

then by taking the complex conjugate of both sides, we have that

(2.0.48) −i ∂_t ψ̄(t, x) + (1/2) Δψ̄(t, x) = 0,

where ψ̄ denotes the complex conjugate of ψ.
Differentiating under the integral in the definition of the L² norm, recalling that |ψ|² = ψψ̄, and
using (2.0.47) - (2.0.48), we thus deduce that

(2.0.49) (d/dt) ‖ψ(t, ·)‖²_{L²} = (d/dt) ∫_{Rⁿ} ψ(t, x) ψ̄(t, x) dⁿx = ∫_{Rⁿ} ∂_t ψ(t, x) ψ̄(t, x) + ψ(t, x) ∂_t ψ̄(t, x) dⁿx
= (i/2) ∫_{Rⁿ} Δψ(t, x) ψ̄(t, x) − ψ(t, x) Δψ̄(t, x) dⁿx.

Integrating by parts on the right-hand side of (2.0.49), we conclude that

(2.0.50) (d/dt) ‖ψ(t, ·)‖²_{L²} = −(i/2) ∫_{Rⁿ} ∇ψ(t, x) · ∇ψ̄(t, x) − ∇ψ̄(t, x) · ∇ψ(t, x) dⁿx = 0,

where · denotes the Euclidean dot product. We have thus shown (2.0.46).
For the second proof, we begin by recalling (2.0.13) and (2.0.14):

(2.0.51) ψ̂(t, ξ) = e^{−2π²it|ξ|²} φ̂(ξ).

In particular, (2.0.51) implies that

(2.0.52) |ψ̂(t, ξ)|² = |φ̂(ξ)|².

Integrating (2.0.52) over Rⁿ, we deduce that

(2.0.53) ‖ψ̂(t, ·)‖_{L²} = ‖φ̂‖_{L²},

where the L² norm on the left-hand side of (2.0.53) is taken over the ξ variables only. Finally, by
Plancherel’s theorem, we see that (2.0.53) implies

(2.0.54) ‖ψ(t, ·)‖_{L²} = ‖φ‖_{L²}.

Again, we have shown (2.0.46). □
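Both proofs can be mirrored numerically; the sketch below is our illustrative addition. Because the discrete Fourier multiplier e^{−ik²t/2} has modulus 1, the FFT-based evolution conserves the discrete L² norm to machine precision, exactly as in the second proof.

```python
import numpy as np

# Discrete analogue of Proposition 2.0.4 in n = 1 (sample grid and datum are ours).
N, box = 2048, 80.0
dx = box / N
x = (np.arange(N) - N // 2) * dx
k = 2 * np.pi * np.fft.fftfreq(N, d=dx)
phi = (1 + 0.5j) * np.exp(-x**2)               # a complex-valued datum

def l2_norm(f):
    return np.sqrt(np.sum(np.abs(f)**2) * dx)

norms = [l2_norm(np.fft.ifft(np.exp(-0.5j * k**2 * t) * np.fft.fft(phi)))
         for t in (0.0, 1.0, 5.0, 25.0)]
assert max(norms) - min(norms) < 1e-12         # the L² norm is preserved in time
```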

MIT OpenCourseWare
http://ocw.mit.edu

18.152 Introduction to Partial Differential Equations.


Fall 2011

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
MATH 18.152 COURSE NOTES - CLASS MEETING # 21

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting # 21: Lagrangian Field Theories


Many of the PDEs of interest to us can be realized as the Euler-Lagrange equations corresponding
to a function known as a Lagrangian L. Closely related are the notions of the action corresponding
to the Lagrangian, and the notion of a stationary point of the action. These ideas fall under a
branch of mathematics known as the calculus of variations. As we will see, these ideas will provide
a framework for deriving conserved (and more generally almost-conserved) quantities for solutions
to the Euler-Lagrange equations, the availability of which plays a central role in the analysis of
these solutions. Some important examples of PDEs to which these methods apply include the
familiar linear wave equation, Maxwell’s equations of electromagnetism, the Euler equations of
fluid mechanics, and the Einstein equations of general relativity.

1. Variational formulation (The Action Principle)


In this section, we will study (scalar-valued) functions φ on R1+n . They are sometimes called
(scalar-valued) fields on R1+n . We will use the notation

(1.0.1) x = (x0 , x1 , · · · , xn )
to denote the standard coordinates on R1+n , and as usual, we will sometimes use the alternate
notation x0 = t. We will use the notation

def
(1.0.2) ∇φ = (∇t φ, ∇1 φ, · · · , ∇n φ)
to denote the spacetime gradient of φ. We will study PDEs that are (in a sense to be explained)
generated by a Lagrangian.
Definition 1.0.1 (Lagrangian). A Lagrangian L is a function of φ and ∇φ (and sometimes the
spacetime coordinates x and perhaps other quantities too). We indicate the dependence of L on
e.g. φ and ∇φ by writing

(1.0.3) L(φ, ∇φ).


Example 1.0.1. As we will see, L ≝ −(1/2)(m⁻¹)^{αβ} ∇_α φ ∇_β φ is the Lagrangian corresponding to the
linear wave equation, where m⁻¹ = diag(−1, 1, 1, · · · , 1) is the standard Minkowski metric.
Given a Lagrangian L and a compact subset of spacetime K, we can define an important functional
known as the action. The action inputs functions φ and outputs a real number.
Definition 1.0.2 (Action). Let K ⊂ R1+n be a compact subset of spacetime. We define the action
A of φ over the set K by

(1.0.4) A[φ; K] ≝ ∫_K L(φ(x), ∇φ(x)) d^{1+n}x.

Above, d^{1+n}x ≝ dt dx¹ dx² · · · dxⁿ denotes spacetime integration. We often omit the argument x of φ
and ∇φ.
A main theme that runs throughout this section is that it is possible to generalize certain aspects
of standard calculus, which takes place on R1+n , to apply to (infinite dimensional) spaces of func-
tions. In this context, the action A plays the same role that a function plays in standard calculus.
Moreover, many important PDEs have solutions that are stationary points 1 for the action. The
notion of a stationary point is a generalization of the notion of a critical point from calculus. In
order to define a stationary point of A, we will need to introduce the notion of a variation. The
motivation behind the next two definitions is that we would like to understand how A[φ; K] changes
when we slightly change φ.
Definition 1.0.3 (Variation). Given a compact set K, a function ψ ∈ C_c^∞(K) is called a variation.

Definition 1.0.4. Given a variation ψ and a small number ε, we define

(1.0.5) φ_ε ≝ φ + εψ (a tiny perturbation of φ).
We now give the definition of a stationary point of the action. Stationary points are the moral
equivalent of critical points2 from calculus.

Definition 1.0.5 (Definition of a stationary point φ). We say that φ is a stationary point of
the action if the following relation holds for all compact subsets K and all variations ψ ∈ C_c^∞(K):

(1.0.6) (d/dε)|_{ε=0} A[φ_ε; K] = 0.
The next theorem is central to our discussion in this section. It shows that the stationary points
of A verify a PDE called the Euler-Lagrange equation.
Theorem 1.1 (The Principle of Stationary Action). Let L(φ, ∇φ, x) be a C 2 Lagrangian.
Then a C 2 field φ is a stationary point of the action if and only if the following Euler-Lagrange
PDE is verified by φ :
 
(1.0.7) ∇_α ( ∂L(φ, ∇φ, x)/∂(∇_α φ) ) = ∂L(φ, ∇φ, x)/∂φ.

Above, ∂L(φ, ∇φ, x)/∂(∇_α φ) denotes partial differentiation of L with respect to its argument ∇_α φ, with its other
arguments (e.g., the other ∇_μ φ with μ ≠ α, φ, x, etc.) held fixed.
Proof. Let K ⊂ R^{1+n} be a compact subset of spacetime and let ψ be any variation with support
contained in K. For any ε > 0, we define as in (1.0.5): φ_ε ≝ φ + εψ. We then differentiate under the
integral and use the chain rule to conclude that
1Even though they are called “stationary points,” they are actually fields on R1+n .
2Recall that x is a critical point of the function f if f′(x) = 0.

(1.0.8) (d/dε) A[φ_ε; K] ≝ (d/dε) ∫_K L(φ_ε, ∇φ_ε, x) d^{1+n}x = ∫_K ∂_ε L(φ_ε, ∇φ_ε, x) d^{1+n}x
= ∫_K ( ∂L(φ_ε, ∇φ_ε, x)/∂φ ) ∂_ε φ_ε + ( ∂L(φ_ε, ∇φ_ε, x)/∂(∇_α φ) ) ∂_ε ∇_α φ_ε d^{1+n}x,

where ∂_ε φ_ε = ψ and ∂_ε ∇_α φ_ε = ∇_α ψ.
Above, ∂_ε denotes the derivative with respect to the parameter ε with all other variables held fixed.
We now set ε = 0 and integrate by parts in (1.0.8) (observing that the conditions on ψ guarantee
that there are no boundary terms) to deduce that

(1.0.9) (d/dε)|_{ε=0} A[φ_ε; K] = ∫_K ( ∂L(φ, ∇φ, x)/∂φ ) ψ + ( ∂L(φ, ∇φ, x)/∂(∇_α φ) ) ∇_α ψ d^{1+n}x
= ∫_K ( ∂L(φ, ∇φ, x)/∂φ ) ψ − ∇_α ( ∂L(φ, ∇φ, x)/∂(∇_α φ) ) ψ d^{1+n}x
= ∫_K { ∂L(φ, ∇φ, x)/∂φ − ∇_α ( ∂L(φ, ∇φ, x)/∂(∇_α φ) ) } ψ d^{1+n}x.
We now observe that (1.0.9) is equal to 0 for all variations ψ if and only if the term in large braces
on the right-hand side of (1.0.9) is 0. Since this observation holds for any compact subset K,
we have thus shown that (1.0.7) holds if and only if φ is a stationary point of the action.

Example 1.0.2. Let L ≝ −(1/2)(m⁻¹)^{αβ} ∇_α φ ∇_β φ (note that this L does not directly depend on x),
where m⁻¹ = diag(−1, 1, 1, · · · , 1) is the standard Minkowski metric. Then

(1.0.10) ∂L(φ, ∇φ)/∂φ = 0,

(1.0.11) ∂L(φ, ∇φ)/∂(∇_μ φ) = −(m⁻¹)^{μα} ∇_α φ.

Therefore, the Euler-Lagrange equation corresponding to L is

(1.0.12) ∇_μ ( (m⁻¹)^{μα} ∇_α φ ) = 0.

Note that equation (1.0.12) is just the familiar linear wave equation (m⁻¹)^{αβ} ∇_α ∇_β φ = 0.
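As an illustrative aside (our addition, not part of the original notes), the computation (1.0.10) - (1.0.12) can be reproduced with sympy in 1 + 1 dimensions:

```python
import sympy as sp

# Euler-Lagrange equation for L = -(1/2)(m⁻¹)^{αβ} ∇_α φ ∇_β φ in 1 + 1 dimensions.
t, x = sp.symbols('t x')
phi = sp.Function('phi')(t, x)
coords = (t, x)
m_inv = sp.diag(-1, 1)                          # Minkowski metric, signature (-, +)

grad = [sp.diff(phi, c) for c in coords]
# ∂L/∂(∇_μ φ) = -(m⁻¹)^{μα} ∇_α φ, as in (1.0.11)
dL_dgrad = [-sum(m_inv[mu, a] * grad[a] for a in range(2)) for mu in range(2)]
# Left-hand side of (1.0.7); the right-hand side is 0 by (1.0.10)
EL = sum(sp.diff(dL_dgrad[mu], coords[mu]) for mu in range(2))

# EL equals ∂_t²φ - ∂_x²φ, so (1.0.7) is the linear wave equation (1.0.12)
assert sp.simplify(EL - sp.diff(phi, t, 2) + sp.diff(phi, x, 2)) == 0
```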

2. Coordinate Invariant Lagrangians


Many important PDEs are the Euler-Lagrange equations corresponding to coordinate invariant
Lagrangians; we will explain what this means momentarily. Motivated by this claim, we will now
introduce a class of changes of coordinates on spacetime. The new coordinates will be formed by
flowing the old coordinates in the direction of a vectorfield Y on spacetime. These new coordinates
will therefore verify a system of ordinary differential equations generated by the flow of Y. In the
next proposition, we review some facts concerning these new coordinates; these facts are basic
results in ODE theory.

Proposition 2.0.1 (Basic facts from ODE theory for autonomous systems). Let Y(x) =
(Y⁰(x⁰, · · · , xⁿ), Y¹(x⁰, · · · , xⁿ), · · · , Yⁿ(x⁰, · · · , xⁿ)) be a smooth vectorfield on R^{1+n}. Assume that
there exists a uniform constant C > 0 such that

(2.0.13) |∇_μ Y^ν(x)| ≤ C, x ∈ R^{1+n}, 0 ≤ μ, ν ≤ n.


Consider the initial value problem (where the independent variable is the “flow parameter” ε) for
the following system of ordinary differential equations:

(2.0.14) (d/dε) x̃^μ(ε) = Y^μ(x̃),

(2.0.15) x̃^μ(0) = x^μ.

Then there exists a number ε₀ > 0 such that the initial value problem (2.0.14) - (2.0.15) has a
unique smooth (in ε) solution existing on the interval ε ∈ [−ε₀, ε₀].
Let us denote the “flow map” from the data x to the solution x̃ at flow parameter ε by x̃ = F_ε(x).
Then on the interval [−ε₀, ε₀], the flow map

(2.0.16) x → F_ε(x) ≝ x̃

is a smooth (in x), bijective map from R^{1+n} to R^{1+n} with smooth inverse F_{−ε}(·), i.e., x̃ = F_ε(x) =⇒
x = F_{−ε}(x̃); such maps are called diffeomorphisms of R^{1+n}. Furthermore, if |ε₁| + |ε₂| ≤ ε₀, then
the flow map verifies the following one-parameter commutative group properties:

(2.0.17) F_{ε₁} ∘ F_{ε₂} = F_{ε₂} ∘ F_{ε₁} = F_{ε₁+ε₂}.


Let us also denote the derivative matrix corresponding to the flow map F_ε by M:

(2.0.18) M^μ_ν ≝ ∂x̃^μ/∂x^ν.

Then if |ε| is sufficiently small, we have the following expansions in ε:

(2.0.19) x̃^μ = F_ε^μ(x) ≝ x^μ + εY^μ(x) + ε²R^μ(ε, x),

(2.0.20) M^μ_ν = δ^μ_ν + ε∇_ν Y^μ(x) + ε²∇_ν R^μ(ε, x),

(2.0.21) (M⁻¹)^μ_ν = ∂x^μ/∂x̃^ν = δ^μ_ν − ε∇_ν Y^μ(x) + ε²S^μ_ν(ε, x),

(2.0.22) det M⁻¹ = 1 − ε∇_α Y^α + ε²S(ε, x).

Above, R^μ(ε, x), ∇_ν R^μ(ε, x), S^μ_ν(ε, x), S(ε, x) are smooth functions of (ε, x) for ε ∈ [−ε₀, ε₀], x ∈
R^{1+n}.
Remark 2.0.1. The assumption (2.0.13) guarantees that the “time of existence” ε₀ can be chosen
to be independent of the initial data x.

Proof. Most of the results of Proposition 2.0.1 are standard facts from ODE theory and will not
be proved here. We will show how to derive the expansions (2.0.21) and (2.0.22) from the other
results. To this end, we will need some basic facts from matrix theory. We will use the following
norm for (1 + n) × (1 + n) matrix-valued functions on R^{1+n}:

(2.0.23) ‖M‖ ≝ max_{x∈R^{1+n}} ( Σ_{0≤μ,ν≤n} |M^μ_ν(x)|² )^{1/2}.

Now if I is the (1 + n) × (1 + n) identity matrix3, and A is a (1 + n) × (1 + n) matrix with ‖A‖ sufficiently small, then
the matrix (I − A)⁻¹ can be expanded in a convergent series:

(2.0.24) (I − A)⁻¹ = I + A + A² + A³ + · · · .

Note in particular that the tail (i.e., all but the first two terms) can be bounded by

(2.0.25) ‖A² + A³ + A⁴ + · · · ‖ = ‖A²(I − A)⁻¹‖ ≤ 2‖A‖²

if ‖A‖ is sufficiently small. We now apply (2.0.24) and (2.0.25) with A^μ_ν ≝ −ε∇_ν Y^μ(x) − ε²∇_ν R^μ(ε, x), so that
by (2.0.20) we have M = I − A; inverting M via the series (2.0.24) then yields the expansion (2.0.21).
To derive (2.0.22), we first Taylor expand the determinant (viewed as a real-valued function of
matrices) for sufficiently small ‖A‖:

(2.0.26) det(I + A) = 1 + A^α_α + O(‖A‖²).

Above, we write O(‖A‖²) to denote a term that can be bounded by C‖A‖², where C > 0 is some
positive constant independent of (all sufficiently small) A. The expansion (2.0.22) now follows from
(2.0.21) and (2.0.26). We remark that you will derive the expansion (2.0.26) in your homework in
more detail. □
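The expansions (2.0.20) - (2.0.22) can be spot-checked numerically; the sketch below is our illustrative addition, with an arbitrarily chosen smooth vectorfield Y on R^{1+1}. It integrates the flow (2.0.14) - (2.0.15) by a Runge-Kutta scheme, builds M by central differences, and compares det M⁻¹ against 1 − ε∇_αY^α.

```python
import numpy as np

def Y(p):                                      # a sample smooth vectorfield on R^{1+1}
    return np.array([np.sin(p[0]), np.cos(p[1])])

def div_Y(p):                                  # ∇_α Y^α = ∂_0 Y⁰ + ∂_1 Y¹
    return np.cos(p[0]) - np.sin(p[1])

def flow(p, eps, steps=200):                   # integrate (2.0.14)-(2.0.15) by RK4
    h = eps / steps
    for _ in range(steps):
        k1 = Y(p); k2 = Y(p + h/2 * k1); k3 = Y(p + h/2 * k2); k4 = Y(p + h * k3)
        p = p + (h / 6) * (k1 + 2*k2 + 2*k3 + k4)
    return p

x0 = np.array([0.3, 0.7])
eps, d = 1e-3, 1e-5
M = np.zeros((2, 2))                           # M^μ_ν = ∂x̃^μ/∂x^ν by central differences
for nu in range(2):
    e = np.zeros(2); e[nu] = d
    M[:, nu] = (flow(x0 + e, eps) - flow(x0 - e, eps)) / (2 * d)

det_M_inv = 1.0 / np.linalg.det(M)
# (2.0.22): det M⁻¹ = 1 - ε ∇_α Y^α + O(ε²)
assert abs(det_M_inv - (1 - eps * div_Y(x0))) < 1e-5
```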

We will now “define” how various fields and their derivatives transform under a change of coordi-
nates. A full justification of these definitions can be found in books on tensor analysis or differential
geometry.
Definition 2.0.6 (Transformation properties of fields). Let φ(x) be a scalar-valued function,
let m(x) be an (invertible) metric (depending on x) with components m_{μν}(x), and let x → x̃ be a
spacetime diffeomorphism. Then upon changing coordinates x → x̃, these quantities transform as
follows:

(2.0.27a) φ̃(x̃) ≝ φ|_{x∘x̃},

(2.0.27b) ∇̃_μ φ̃(x̃) ≝ (M⁻¹)^α_μ|_{x∘x̃} ∇_α φ|_{x∘x̃},

(2.0.27c) m̃_{μν}(x̃) ≝ (M⁻¹)^α_μ|_{x∘x̃} (M⁻¹)^β_ν|_{x∘x̃} m_{αβ}|_{x∘x̃},

(2.0.27d) (m̃⁻¹)^{μν}(x̃) ≝ M^μ_α|_{x∘x̃} M^ν_β|_{x∘x̃} (m⁻¹)^{αβ}|_{x∘x̃}.

3Note that I^μ_ν = δ^μ_ν.



Above and throughout, we use the notation

(2.0.28a) ∇_μ ≝ ∂/∂x^μ,

(2.0.28b) ∇̃_μ ≝ ∂/∂x̃^μ,

M^μ_ν ≝ ∂x̃^μ/∂x^ν is the derivative matrix defined in (2.0.18), and (M⁻¹)^μ_ν = ∂x^μ/∂x̃^ν is its inverse. Furthermore,
the notation x ∘ x̃ indicates that we are viewing x as a function of x̃; this is possible since x → x̃ is
a diffeomorphism.

Remark 2.0.2. (2.0.27a) simply says that the transformed function φ̃ takes the same value at the
new coordinate x̃ that φ takes at the old coordinate x. (2.0.27b) is really just the chain rule expressing
∂/∂x̃^μ in terms of ∂/∂x^μ. (2.0.27c) is the standard transformation law for tensors with two downstairs
indices. These transformation laws generalize to other tensors in a straightforward fashion; the
generalization can be found in books on tensor analysis/differential geometry. Roughly speaking,
tensors with indices downstairs transform by multiplication by the matrix M⁻¹ (one copy of M⁻¹
for each index), and tensors with indices upstairs transform by multiplication by the matrix M (one
copy of M for each index).
We will now define what it means for a Lagrangian to be coordinate invariant.
Definition 2.0.7 (Coordinate invariant Lagrangian). Let L(φ, ∇φ, m) be a Lagrangian that
depends only on φ, ∇φ, and the Minkowski metric m. We say that L is coordinate invariant if for
all spacetime diffeomorphisms x → x̃, we have that

(2.0.29) L(φ(x), ∇φ(x), m(x)) = L(φ̃(x̃), ∇̃φ̃(x̃), m̃(x̃)),

where the transformed fields are defined in Definition 2.0.6.
Example 2.0.3. Consider the Lagrangian for the linear wave equation: L(φ, ∇φ, m) = −(1/2)(m⁻¹)^{μν} ∇_μ φ ∇_ν φ.
Using (2.0.27a) - (2.0.27d) and the fact that M^μ_α (M⁻¹)^κ_μ = δ^κ_α4, we compute that

(2.0.30) L(φ̃, ∇̃φ̃, m̃) ≝ −(1/2)(m̃⁻¹)^{μν} ∇̃_μ φ̃ ∇̃_ν φ̃
= −(1/2) M^μ_α M^ν_β (m⁻¹)^{αβ} (M⁻¹)^κ_μ ∇_κ φ (M⁻¹)^λ_ν ∇_λ φ
= −(1/2) (m⁻¹)^{κλ} ∇_κ φ ∇_λ φ.

This Lagrangian is therefore coordinate invariant.
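Since (2.0.30) is a pointwise algebraic identity, it can be spot-checked numerically at a single spacetime point with randomly chosen data; this sketch is our illustrative addition. In matrix form, (2.0.27d) reads m̃⁻¹ = M m⁻¹ Mᵀ and (2.0.27b) reads ∇̃φ̃ = M⁻ᵀ∇φ.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3                                          # 1 + 2 spacetime dimensions, one point
M = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # sample derivative matrix M^μ_ν
m_inv = np.diag([-1.0, 1.0, 1.0])              # inverse Minkowski metric
grad = rng.standard_normal(n)                  # components of ∇φ at the point

m_inv_tilde = M @ m_inv @ M.T                  # (2.0.27d)
grad_tilde = np.linalg.inv(M).T @ grad         # (2.0.27b)

L_orig = -0.5 * grad @ m_inv @ grad
L_tilde = -0.5 * grad_tilde @ m_inv_tilde @ grad_tilde
assert abs(L_orig - L_tilde) < 1e-12           # the Lagrangian is coordinate invariant
```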
As we will see, the availability of an energy-momentum tensor for certain Euler-Lagrange equa-
tions is closely connected to the coordinate invariance property of their Lagrangians. In order to
derive this connection, we will need to understand more about how the coordinate transformations
(2.0.16) vary with .

4Recall that δ^κ_α = 1 if α = κ and δ^κ_α = 0 if α ≠ κ; δ^κ_α can be viewed as the identity matrix.

Proposition 2.0.2 (Derivatives with respect to the flow parameter ε). Let x̃^μ = F_ε^μ(x)
be the change of spacetime coordinates defined in (2.0.16), and let φ̃, ∇̃_μ φ̃, m̃_{μν}, (m̃⁻¹)^{μν} be the
transformed fields defined in Definition 2.0.6. Then the following identities hold for all spacetime
points x̃:

(2.0.31a) ∂_ε|_{ε=0} φ̃|_{x̃} = −Y^α|_{x̃} ∇_α φ|_{x̃},

(2.0.31b) ∂_ε|_{ε=0} ∇̃_μ φ̃|_{x̃} = −∇_μ Y^α|_{x̃} ∇_α φ|_{x̃} − Y^α|_{x̃} ∇_α ∇_μ φ|_{x̃} = −∇_μ(Y^α ∇_α φ)|_{x̃},

(2.0.31c) ∂_ε|_{ε=0} m̃_{μν}|_{x̃} = −m_{να}|_{x̃} ∇_μ Y^α|_{x̃} − m_{μα}|_{x̃} ∇_ν Y^α|_{x̃} − Y^α|_{x̃} ∇_α m_{μν}|_{x̃} (the last term is 0 for the Minkowski metric),

(2.0.31d) ∂_ε|_{ε=0} (m̃⁻¹)^{μν}|_{x̃} = (m⁻¹)^{αν}|_{x̃} ∇_α Y^μ|_{x̃} + (m⁻¹)^{μα}|_{x̃} ∇_α Y^ν|_{x̃} − Y^α|_{x̃} ∇_α (m⁻¹)^{μν}|_{x̃} (the last term is 0 for the Minkowski metric),

(2.0.31e) ∂_ε|_{ε=0} det M⁻¹|_{x̃} = −∇_α Y^α|_{x̃}.

Above and for the remainder of these notes, ∂_ε denotes the derivative of an ε-dependent quantity
with the new coordinates x̃ held fixed.
Remark 2.0.3. In the language of differential geometry, the right-hand sides of (2.0.31a) - (2.0.31d) are the Lie
derivatives of the un-tilded fields with respect to the vectorfield −Y.
Proof. Recall that x̃^μ = F_ε^μ(x), x^μ = F_{−ε}^μ(x̃), F_0^μ(x) = x^μ (so that x = x̃ when ε = 0), and
∂_ε F_ε^μ(·)|_{ε=0} = Y^μ(·). Therefore, using the chain rule, we compute that

(2.0.32) ∂_ε|_{ε=0} φ̃(x̃) ≝ ∂_ε|_{ε=0} φ(F_{−ε}(x̃)) = ∇_α φ|_{x̃} ∂_ε|_{ε=0} F_{−ε}^α(x̃) = −Y^α|_{x̃} ∇_α φ|_{x̃}.

We have thus shown (2.0.31a).
Similarly, with the help of (2.0.21), and noting that (M⁻¹)^ν_μ = δ^ν_μ when ε = 0 and
∂_ε|_{ε=0} (M⁻¹)^α_μ ∘ F_{−ε}|_{x̃} = −∇_μ Y^α|_{x̃}, we compute that

(2.0.33) ∂_ε|_{ε=0} ∇̃_μ φ̃(x̃) ≝ ∂_ε|_{ε=0} { (M⁻¹)^α_μ ∘ F_{−ε}(x̃) · (∇_α φ) ∘ F_{−ε}(x̃) }
= { ∂_ε|_{ε=0} (M⁻¹)^α_μ ∘ F_{−ε}(x̃) } ∇_α φ|_{x̃} + (M⁻¹)^α_μ|_{x̃} { ∂_ε|_{ε=0} (∇_α φ) ∘ F_{−ε}(x̃) }
= −∇_μ Y^α|_{x̃} ∇_α φ|_{x̃} − δ^α_μ Y^β|_{x̃} ∇_β ∇_α φ|_{x̃}.

We have thus shown (2.0.31b). The proofs of (2.0.31c) and (2.0.31d) are similar,
and we omit the details.
To prove (2.0.31e), we simply differentiate the expansion (2.0.22) with respect to ε and set ε = 0. □

We now state the following simple corollary to Proposition 2.0.2.

Corollary 2.0.3 (The derivative of L with respect to the flow parameter ε). Let L(φ, ∇φ, m)
be a C² Lagrangian. Then under the assumptions of Proposition 2.0.2, the following identity holds
at all spacetime points:

(2.0.34) ∂_ε|_{ε=0} L(φ̃, ∇̃φ̃, m̃) = −( ∂L(φ, ∇φ, m)/∂φ ) Y^α ∇_α φ
− ( ∂L(φ, ∇φ, m)/∂(∇_μ φ) ) ∇_μ(Y^α ∇_α φ)
− ( ∂L(φ, ∇φ, m)/∂m_{μν} ) { m_{αν} ∇_μ Y^α + m_{μα} ∇_ν Y^α + Y^α ∇_α m_{μν} },

where the term Y^α ∇_α m_{μν} is 0 for the Minkowski metric.

Proof. By the chain rule, we have that

(2.0.35) ∂_ε L(φ̃, ∇̃φ̃, m̃) = ( ∂L(φ̃, ∇̃φ̃, m̃)/∂φ ) ∂_ε φ̃
+ ( ∂L(φ̃, ∇̃φ̃, m̃)/∂(∇_μ φ) ) ∂_ε ∇̃_μ φ̃ + ( ∂L(φ̃, ∇̃φ̃, m̃)/∂m_{μν} ) ∂_ε m̃_{μν}.

The relation (2.0.34) now follows from Proposition 2.0.2 and (2.0.35). □

3. The energy-momentum tensor


The main goal of this section is to show that for a certain class of coordinate invariant Lagrangians
L, there exists an energy-momentum tensor T^{μν}. This T^{μν} plays the same role in the analysis of the
Euler-Lagrange equation corresponding to L as it did in our previous analysis of the
linear wave equation. More precisely, for solutions to the Euler-Lagrange equation corresponding
to L, we will show that ∇_μ T^{μν} = 0. As we saw earlier in the course, this identity forms the basis
for the derivation of conserved quantities for solutions to the Euler-Lagrange equations.
Theorem 3.1 (Derivation and divergence-free property of the energy-momentum ten-
sor). Let L(φ, ∇φ, m) be a coordinate invariant Lagrangian (in the sense of Definition 2.0.7) that
depends only on φ, ∇φ, and the Minkowski metric m. Let

(3.0.36) T^{μν} ≝ 2 ∂L/∂m_{μν} + (m⁻¹)^{μν} L
be the energy-momentum tensor corresponding to L. Then T µν is symmetric:

(3.0.37) T µν = T νµ , 0 ≤ µ, ν ≤ n.
Furthermore, if φ verifies the Euler-Lagrange equation (1.0.7), the following divergence identity is
verified by T µν :

(3.0.38) ∇µ T µν = 0, (ν = 0, 1, 2, · · · , n).

Proof. The relation (3.0.37) follows easily from (3.0.36) since m_{μν} = m_{νμ}.
We will now prove (3.0.38). To this end, let K ⊂ R^{1+n} be a compact spacetime subset, and let
Y : R^{1+n} → R^{1+n} be a smooth vectorfield with support contained in K. Let x̃ be the change of
variables (2.0.16), and consider the transformed quantities φ̃, ∇̃φ̃, m̃, m̃⁻¹ given in Definition 2.0.6.
Now by assumption, we have that L(φ, ∇φ, m) = L(φ̃, ∇̃φ̃, m̃). Furthermore, by the standard change
of variables theorem from advanced calculus, we have that d^{1+n}x = det(∂x/∂x̃) d^{1+n}x̃ = det M⁻¹ d^{1+n}x̃,
where the matrix M is defined in (2.0.18). Therefore, we have that

(3.0.39) A[φ; K] = ∫_K L(φ, ∇φ, m) d^{1+n}x = ∫_K L(φ̃, ∇̃φ̃, m̃) det M⁻¹ d^{1+n}x̃.

Now the left-hand side of (3.0.39) doesn’t depend on ε. We therefore have that

(3.0.40) (d/dε)|_{ε=0} A[φ; K] = 0.
On the other hand, we can differentiate under the integral on the right-hand side of (3.0.39) with
respect to ε at ε = 0 and use (2.0.31e) plus Corollary 2.0.3 (together with the fact that x = x̃ when
ε = 0) to deduce that

(3.0.41) (d/dε)|_{ε=0} A[φ; K] = ∫_K { −( ∂L(φ, ∇φ, m)/∂φ ) Y^α ∇_α φ − ( ∂L(φ, ∇φ, m)/∂(∇_μ φ) ) ∇_μ(Y^α ∇_α φ) } d^{1+n}x
− ∫_K ( ∂L(φ, ∇φ, m)/∂m_{μν} ) { m_{αν} ∇_μ Y^α + m_{μα} ∇_ν Y^α + Y^α ∇_α m_{μν} } d^{1+n}x
− ∫_K L(φ, ∇φ, m) ∇_α Y^α d^{1+n}x,

where the term Y^α ∇_α m_{μν} is 0 for the Minkowski metric.
Integrating by parts in (3.0.41) and using (3.0.40), we have that

(3.0.42) 0 = −∫_K { ∂L(φ, ∇φ, m)/∂φ − ∇_μ( ∂L(φ, ∇φ, m)/∂(∇_μ φ) ) } Y^α ∇_α φ d^{1+n}x
− ∫_K ( ∂L(φ, ∇φ, m)/∂m_{μν} ) { m_{αν} ∇_μ Y^α + m_{μα} ∇_ν Y^α + Y^α ∇_α m_{μν} } d^{1+n}x
− ∫_K L(φ, ∇φ, m) ∇_α Y^α d^{1+n}x,

where again Y^α ∇_α m_{μν} = 0 for the Minkowski metric.
We now note that the Euler-Lagrange equation (1.0.7) implies that the first line on the right-hand
side of (3.0.42) is 0. Therefore, we collect the remaining terms together to derive that

(3.0.43) 0 = −∫_K { ∂L(φ, ∇φ, m)/∂m_{μν} + (1/2)(m⁻¹)^{μν} L(φ, ∇φ, m) } { m_{αν} ∇_μ Y^α + m_{μα} ∇_ν Y^α } d^{1+n}x
= −∫_K { 2 ∂L(φ, ∇φ, m)/∂m_{μν} + (m⁻¹)^{μν} L(φ, ∇φ, m) } m_{αν} ∇_μ Y^α d^{1+n}x.

Integrating by parts in (3.0.43), we deduce that

(3.0.44) 0 = ∫_K ∇_μ { 2 ∂L(φ, ∇φ, m)/∂m_{μν} + (m⁻¹)^{μν} L(φ, ∇φ, m) } m_{αν} Y^α d^{1+n}x.

Since (3.0.44) must hold for all such smooth vectorfields Y with support contained in K, we
conclude that the divergence of the term in braces is 0:

(3.0.45) ∇_μ { 2 ∂L(φ, ∇φ, m)/∂m_{μν} + (m⁻¹)^{μν} L(φ, ∇φ, m) } = 0.

We have thus shown that (3.0.38) holds. □
Example 3.0.4. The Lagrangian for the linear wave equation is L = −(1/2)(m⁻¹)^{αβ} ∇_α φ ∇_β φ. We
therefore appeal to (3.0.36) and calculate that

(3.0.46) T^{μν} ≝ 2 ∂L/∂m_{μν} + (m⁻¹)^{μν} L
= (m⁻¹)^{μα} (m⁻¹)^{νβ} ∇_α φ ∇_β φ − (1/2)(m⁻¹)^{μν} (m⁻¹)^{αβ} ∇_α φ ∇_β φ.
Remark 3.0.4. To derive (3.0.46), we have used the fact that if q is any quantity, and m is a
symmetric invertible (1 + n) × (1 + n) matrix that depends on q, then

(3.0.47) (d/dq)(m⁻¹)^{μν} = −(m⁻¹)^{μα} (m⁻¹)^{νβ} (d/dq) m_{αβ}, 0 ≤ μ, ν ≤ n.

You will derive the simple relation (3.0.47) in your homework. In particular, it follows from (3.0.47)
that

(3.0.48) ∂(m⁻¹)^{μν}/∂m_{κλ} = −(m⁻¹)^{μκ} (m⁻¹)^{νλ}.

On the left-hand side of (3.0.48), we are viewing the components (m⁻¹)^{μν} as functions of the
components m_{κλ}, 0 ≤ κ, λ ≤ n.
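As a concrete cross-check (our illustrative addition), the divergence identity (3.0.38) can be verified symbolically for the plane-wave solution φ = sin(t − x) of the linear wave equation in 1 + 1 dimensions, with T^{μν} built from (3.0.46):

```python
import sympy as sp

t, x = sp.symbols('t x')
phi = sp.sin(t - x)                            # solves (m⁻¹)^{αβ} ∇_α ∇_β φ = 0
coords = (t, x)
m_inv = sp.diag(-1, 1)

grad = [sp.diff(phi, c) for c in coords]
quad = sum(m_inv[a, b] * grad[a] * grad[b] for a in range(2) for b in range(2))

# (3.0.46): T^{μν} = (m⁻¹)^{μα}(m⁻¹)^{νβ} ∇_αφ ∇_βφ - (1/2)(m⁻¹)^{μν}(m⁻¹)^{αβ} ∇_αφ ∇_βφ
T = [[sum(m_inv[mu, a] * m_inv[nu, b] * grad[a] * grad[b]
          for a in range(2) for b in range(2))
      - sp.Rational(1, 2) * m_inv[mu, nu] * quad
      for nu in range(2)] for mu in range(2)]

div_T = [sum(sp.diff(T[mu][nu], coords[mu]) for mu in range(2)) for nu in range(2)]
assert all(sp.simplify(expr) == 0 for expr in div_T)   # ∇_μ T^{μν} = 0, i.e. (3.0.38)
```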
MATH 18.152 COURSE NOTES - CLASS MEETING # 24

18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck

Class Meeting # 24: Transport Equations and Burger’s Equation


In these notes, we introduce a class of evolution PDEs known as transport equations. Such
equations arise in a physical context whenever a quantity is “transported” in a certain direction.
Some important physical examples include the mass density flow for an incompressible fluid, and
the Boltzmann equation of kinetic theory. We discuss both linear transport equations and a famous
nonlinear transport equation known as Burger’s equation. One of our major goals is to show that
in contrast to the case of linear PDEs, solutions to Burger’s equation can develop singularities in
finite time.

1. Transport Equations
Linear homogeneous transport equations are PDEs of the form

(1.0.1) X µ ∂µ u = 0,
where (x0 , x1 , · · · , xn ) are coordinates on R1+n and X(x0 , · · · , xn ) is a vectorfield on R1+n . As we
will soon see, the transport equation is closely connected to the following system of ODEs for the
unknowns γ µ :

(1.0.2) (d/ds) γ^μ(s) = X^μ(γ⁰(s), γ¹(s), · · · , γⁿ(s)), (μ = 0, 1, · · · , n).
Given initial conditions γ µ (0), the solutions to (1.0.2) are curves γ : I → R1+n , where I is an interval.
These curves are known as the integral curves of the vectorfield X. They are also known as the
characteristic curves associated to the PDE (1.0.1). The next proposition clarifies the connection
between the transport equation (1.0.1) and its characteristic curves.
Proposition 1.0.1 (Connection between transport equations and ODEs). If u solves the
transport equation (1.0.1), then u is constant along the integral curves of X. More precisely, if γ(s)
is any solution to (1.0.2), then

d
u γ 0 (s), · · · , γ n (s) = 0.

(1.0.3)
ds
Proof. Using the chain rule, (1.0.2), and (1.0.1), we have that

(1.0.4) (d/ds) u(γ⁰(s), · · · , γⁿ(s)) = Σ_{μ=0}^{n} (∂u/∂x^μ)|_{γ(s)} (d/ds) γ^μ(s)
= Σ_{μ=0}^{n} (∂u/∂x^μ)|_{γ(s)} X^μ(γ(s)) = (X^μ ∂_μ u)|_{γ(s)} = 0. □


1.1. Constant vectorfields. Let’s consider a very special case of (1.0.1) in which the components
of X are constant. That is, we assume that

(1.1.1) X = (X⁰, X¹, · · · , Xⁿ),

where the X^μ are constants independent of (x⁰, · · · , xⁿ).
In this case, the solutions to the system (1.0.2) of ODEs are the straight lines

(1.1.2) γ(s) = γ̊ + sX,


where γ̊ = γ(0) is a constant vector.
For concreteness, let’s also assume that

(1.1.3) X⁰ = 1,

and as usual, let’s use the alternate notation x⁰ = t. Let’s assume that we are given Cauchy data
for u on the hypersurface {t = 0} × Rⁿ:

(1.1.4) u(0, x¹, . . . , xⁿ) = f(x¹, · · · , xⁿ),


where f is a function on Rⁿ. We now note that

(1.1.5) (t, x¹, · · · , xⁿ) = (0, x¹ − tX¹, · · · , xⁿ − tXⁿ) + tX,

which implies that the spacetime point (t, x¹, · · · , xⁿ) lies on the characteristic curve γ(t) passing
through the “initial” point (0, x¹ − tX¹, · · · , xⁿ − tXⁿ) ∈ {t = 0} × Rⁿ. Therefore, by Proposition
1.0.1, we have that

(1.1.6) u(t, x¹, . . . , xⁿ) = f(x¹ − tX¹, · · · , xⁿ − tXⁿ),

and we have explicitly solved the PDE (1.0.1).
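The explicit formula (1.1.6) is easy to test; the sketch below is our illustrative addition (the sample speed X¹ and Gaussian datum are arbitrary choices of ours) and checks the PDE residual by centered finite differences:

```python
import numpy as np

# u(t, x) = f(x - t X¹) solves ∂_t u + X¹ ∂_x u = 0 in one space dimension (X⁰ = 1).
X1 = 2.0                                       # sample constant transport speed
f = lambda s: np.exp(-s**2)                    # sample Cauchy datum
u = lambda t, x: f(x - t * X1)

t0, x0, h = 0.7, 0.4, 1e-5
u_t = (u(t0 + h, x0) - u(t0 - h, x0)) / (2 * h)
u_x = (u(t0, x0 + h) - u(t0, x0 - h)) / (2 * h)
assert abs(u_t + X1 * u_x) < 1e-8              # residual of the transport equation
```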

2. A Nonlinear Scalar PDE: Burger’s (Inviscid) Equation


Burger’s equation is a simple nonlinear PDE in 1+1 dimensions. It is often used to illustrate some
important features of (some) nonlinear PDEs. As we will see, it can be viewed as a nonlinear version
of the transport equation. Our main goal in these next two sections is to illustrate a phenomenon
not found in linear PDEs: the formation of a singularity in the solution.
Burger’s equation is the following PDE for the function u(t, x) :

(2.0.7) ∂t u + u∂x u = 0, (t, x) ∈ [0, ∞) × R.


As we will see, the Cauchy problem (i.e., the initial value problem in which the datum u(0, x) is
prescribed) for (2.0.7) is well-posed.
Equation (2.0.7) is a simple example of a nonlinear conservation law. More precisely, the next
proposition shows that under suitable assumptions, the spatial L2 norm of solutions to (2.0.7) is
preserved in time.

Proposition 2.0.1 (Burger’s equation is a conservation law). Let T ≥ 0, and let u(t, x) be
a C¹ solution to (2.0.7) on S_T ≝ [0, T] × R. Assume that for each fixed t ∈ [0, T], we have that
lim_{x→±∞} u(t, x) = 0. Then for t ∈ [0, T], we have that

(2.0.8) ∫_R u²(t, x) dx = ∫_R u²(0, x) dx,

i.e., the spatial L² norm of u(t, ·) is preserved in time.


Proof. Multiplying both sides of (2.0.7) by u, we deduce that

(2.0.9) (1/2) ∂_t(u²) + (1/3) ∂_x(u³) = 0.

Integrating (2.0.9) over R, using the Fundamental Theorem of Calculus and the assumption on the
behavior of u(t, x) as x → ±∞, and “un-differentiating” under the integral, we deduce that

(2.0.10) (1/2) (d/dt) ∫_R |u(t, x)|² dx = 0.

The proposition now follows from (2.0.10).



Notice that (2.0.7) can be viewed as a transport equation whose speed and direction depend on
the solution u itself. As in the case of transport equations, we can define the characteristic curves
associated to a solution of (2.0.7).
Definition 2.0.1. Let u be a solution of (2.0.7). The characteristic curves associated to u are the
solutions to the following system of ODEs:

(2.0.11a) (d/ds) γ⁰ = 1,

(2.0.11b) (d/ds) γ¹ = u ∘ γ = u(γ⁰(s), γ¹(s)).
Remark 2.0.1. Equation (2.0.11a) shows that γ⁰(s) = s + c, where c is a constant. There is no
loss of generality in parameterizing the curve with the constant c set equal to 0.
The next two propositions are essential for our analysis of Burger’s equation.
Proposition 2.0.2 (Burger solutions are constant along characteristics). C 1 solutions to
(2.0.7) are constant along the characteristic curves (2.0.11a) - (2.0.11b).
Proof. Using the chain rule and the equations (2.0.7), (2.0.11a) - (2.0.11b), we compute that

d d d
(2.0.12) [u ◦ γ(s)] = (∂t u)|γ γ 0 + (∂x u)|γ γ 1 = (∂t u)|γ + u|γ (∂x u)|γ = 0.
ds ds ds

Proposition 2.0.3 (Burger characteristics are straight lines). The characteristic curves
(2.0.11a) - (2.0.11b) are straight lines in R1+1 .

Proof. It clearly follows from (2.0.11a) that

(2.0.13) (d²/ds²) γ⁰(s) = 0.

Furthermore, using the ODE (2.0.11b) and the computation (2.0.12), we compute that

(2.0.14) (d²/ds²) γ¹(s) = (d/ds)[u ∘ γ(s)] = 0.

We have thus shown that (d²/ds²) γ^μ(s) = 0 for μ = 0, 1. Thus, the curve γ has 0 acceleration, and is
therefore a straight line. □

3. “Solving” Burger’s equation


Using the propositions from the previous section, we will now exhibit an implicit solution to the
following initial value problem for Burger’s equation:

(3.0.15) ∂_t u + u ∂_x u = 0, (t, x) ∈ [0, ∞) × R,
u(0, x) = f(x), x ∈ R.
Theorem 3.1. Let u be a C 1 solution to (3.0.15), and let (t, x) be a spacetime point. With (t, x)
fixed, assume that the implicit equation x = p + f (p)t in the unknown p has a unique solution. Then

(3.0.16) u(t, x) = f (p).


Proof. Let γ(s) = (γ⁰(s), γ¹(s)) denote the characteristic curve passing through the Cartesian (t, x)
spacetime point (0, p) when s = 0, i.e., (γ⁰(0), γ¹(0)) = (0, p). According to the ODEs (2.0.11a)
- (2.0.11b) and Proposition 2.0.3, γ(s) is a straight line with constant “t/x” slope γ̇⁰(0)/γ̇¹(0) = 1/f(p). It
therefore follows that

(3.0.17)  γ⁰(s) = s,
(3.0.18)  γ¹(s) = p + f(p)s.
Consequently, by Proposition 2.0.2, we have that

(3.0.19) u(s, p + f (p)s) = u(0, p) = f (p).


Equation (3.0.16) thus follows. 
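Theorem 3.1 reduces evaluating u(t, x) to a scalar root-finding problem: find the foot point p solving x = p + f(p)t, then set u(t, x) = f(p). A minimal Python sketch follows; the datum f(p) = tanh(p) is our own illustrative choice, made so that f′ ≥ 0 and the root is unique for t ≥ 0 (uniqueness for general data is addressed in Section 4).

```python
import math

def f(p):
    # Initial datum; our illustrative choice, with f' > 0 everywhere.
    return math.tanh(p)

def foot_point(t, x, lo=-50.0, hi=50.0, iters=100):
    """Solve x = p + f(p) t for p by bisection.
    g(p) = p + f(p) t - x is strictly increasing when t >= 0 and f' >= 0."""
    g = lambda p: p + f(p) * t - x
    assert g(lo) < 0 < g(hi), "bracket must straddle the root"
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def u(t, x):
    # Implicit solution of Theorem 3.1: u(t, x) = f(p(t, x)).
    return f(foot_point(t, x))

# Check the PDE u_t + u u_x = 0 at a sample point via centered differences.
t0, x0, h = 1.3, 0.4, 1e-5
u_t = (u(t0 + h, x0) - u(t0 - h, x0)) / (2 * h)
u_x = (u(t0, x0 + h) - u(t0, x0 - h)) / (2 * h)
assert abs(u_t + u(t0, x0) * u_x) < 1e-6
# Check the initial condition u(0, x) = f(x).
assert abs(u(0.0, 0.8) - f(0.8)) < 1e-10
```

Bisection applies here only because g is monotone; for data with f′ < 0 somewhere, the equation x = p + f(p)t can have several roots once t is large enough, which is exactly the singularity formation discussed next.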

4. Formation of Singularities
Proposition 2.0.1 shows that the spatial L2 norm of nice solutions to Burger’s equation is preserved
in time. This conserved quantity suggests that the solution can never grow large and therefore that
the solution should exist for all time. However, this intuition is false! The next theorem shows that
even though the L2 norm is preserved, the solution can develop a singularity in finite time, even if
the initial datum f is very small and very nice.

Theorem 4.1 (Sharp Characterization of Singularity Formation in Burger’s Equation).


Let f ∈ C¹(R) be the initial datum for Burger's equation (3.0.15). Then the corresponding solution u(t, x)
remains C¹ for all (t, x) ∈ [0, ∞) × R if and only if f′(x) ≥ 0 holds for all x ∈ R.
Proof. Suppose that there exists a point x₀ such that f′(x₀) < 0. Then there exists a nearby
point x₁ > x₀ with f(x₁) < f(x₀). Let γ(xᵢ)(s) denote the characteristic curve passing through
the spacetime point (0, xᵢ) at s = 0. Then by Proposition 2.0.2, u ◦ γ(xᵢ)(s) = f(xᵢ) for all s ≥
0. Furthermore, as in the proof of Theorem 3.1, γ(xᵢ)(s) traces out a straight line with slope (x
horizontal, t vertical) mᵢ := 1/f(xᵢ). Since 1/m₁ < 1/m₀ (i.e., f(x₁) < f(x₀)), it is easy to check that γ(x₀) intersects γ(x₁) at
the spacetime point (t, x) = ((x₁ − x₀)/(1/m₀ − 1/m₁), (m₀x₀ − m₁x₁)/(m₀ − m₁)). Thus, by Proposition 2.0.2, u(t, x) = f(x₀) and
u(t, x) = f(x₁), which is a contradiction.
On the other hand, if f′(p) ≥ 0 for all p, then for all t₀ ≥ 0 and all x₀, the equation

(4.0.20)  x₀ = p + f(p)t₀

has a unique solution p = p₀(t₀, x₀) that depends on (t₀, x₀) in a C¹ fashion. This fact follows from
e.g. the implicit function theorem, since ∂p(p + f(p)t₀) = 1 + f′(p)t₀ > 0 (i.e., the right-hand side
of (4.0.20) is strictly increasing in p). Therefore, by Theorem 3.1, u(t₀, x₀) = f ◦ p₀(t₀, x₀), and
u ∈ C¹([0, ∞) × R). □
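The crossing argument in the first half of the proof is easy to see numerically. In the sketch below (Python; the datum f(x) = −arctan(x) is our own illustrative choice, with f′(0) = −1 < 0), the characteristics from two nearby points x₀ < x₁ with f(x₁) < f(x₀) are the lines x = xᵢ + f(xᵢ)t, and they meet at a positive time t*, where a C¹ solution would have to take two different values:

```python
import math

def f(x):
    # Illustrative datum with f'(x) < 0 near 0; not from the notes.
    return -math.atan(x)

x0, x1 = -0.1, 0.1
assert x1 > x0 and f(x1) < f(x0)

# Characteristics: x = x_i + f(x_i) t.  Setting them equal gives the
# crossing time t* = (x1 - x0) / (f(x0) - f(x1)) > 0.
t_star = (x1 - x0) / (f(x0) - f(x1))
x_star = x0 + f(x0) * t_star
assert t_star > 0

# Both characteristics really do pass through (t*, x*)...
assert abs((x1 + f(x1) * t_star) - x_star) < 1e-12
# ...so a C^1 solution would need f(x0) = u(t*, x*) = f(x1): contradiction.
assert f(x0) != f(x1)
print("characteristics cross at t* =", t_star)
```

Shrinking the gap x₁ − x₀ toward a point where f′ < 0 shows the earliest crossing time is governed by the size of the negative slope of f, not by the size of f itself.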

Exercise 4.0.1. Work through the details to show that γ(x₀) intersects γ(x₁) at (t, x) = ((x₁ − x₀)/(1/m₀ − 1/m₁), (m₀x₀ − m₁x₁)/(m₀ − m₁)).

Exercise 4.0.2. Find a reference and review the implicit function theorem.
MIT OpenCourseWare
http://ocw.mit.edu

18.152 Introduction to Partial Differential Equations.


Fall 2011

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
