Notes For Math 471 - Real Analysis Measure and Integral by Wheeden and Zygmund

Clayton J. Lungstrum
December 8, 2012
HW: Chapter 1: pp. 13-14, #16,17 Chapter 2: p. 31, #1,5,6
1 The Riemann Integral
We begin with a sample of things we hope to see later in the course.
Definition 1.1. A function $f$ on $X$ is said to vanish at infinity if for all $\varepsilon > 0$ there is a compact set $K \subset X$ such that $x \in K^C$ implies $|f(x)| < \varepsilon$.
Theorem (Riesz). Let $X$ be a locally compact, regular topological space. Let $C_0(X)$ be the space of continuous functions which vanish at infinity. Then the space of bounded linear functionals on $C_0(X)$ can be identified with the space of Borel measures on $X$.
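Remark 1.1. $C_0(X)^*$ is the dual space of $C_0(X)$, i.e., the set
\[ \Big\{ L : C_0(X) \to \mathbb{C} \text{ linear} : |L(f)| \le C \sup_{x \in X} |f(x)| \text{ for some } C \in \mathbb{R} \Big\}. \]
This means that $L \in C_0(X)^*$ implies there exists a unique Borel measure $\mu$ on $X$ such that
\[ L(f) = \int_X f(x)\,d\mu(x). \]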
Example 1.1. Fix $x_0 \in X$, and consider $L_{x_0}(f) = f(x_0)$. Then $L_{x_0} \in C_0(X)^*$ and
\[ L_{x_0}(f) = \int_X f(x)\,d\delta_{x_0}(x), \]
where $\delta_{x_0}$ is the Dirac delta measure.
We'll begin by covering concrete measure theory; that is, Lebesgue measure on $\mathbb{R}^n$, $n \ge 1$. Defining the Lebesgue measure of measurable sets $E \subset \mathbb{R}^n$ is similar to defining the volume of sets. Along the way we'll need to define measurable functions $f : \mathbb{R}^n \to \mathbb{R}$, as well as the integral of a measurable function. After defining these, we'll discuss various properties of the integral, such as changing the order of integration (Fubini's Theorem) and interchanging limits and integrals (Lebesgue Dominated Convergence Theorem).
Let us begin by recalling Riemann integration on $\mathbb{R}$.
Definition 1.2. A partition of an interval $[a, b]$ is a finite set of points $\Gamma = \{a = x_0 < x_1 < \cdots < x_N = b\}$.
Definition 1.3. The norm (or mesh size) of $\Gamma$ is $|\Gamma| = \max_{1 \le j \le N} |x_j - x_{j-1}|$.
Definition 1.4. The Riemann sum is defined by
\[ S_\Gamma = \sum_{j=1}^{N} f(\xi_j)(x_j - x_{j-1}), \]
where the $\xi_j$ are sample points chosen from the intervals $[x_{j-1}, x_j]$. We will say that $f$ is Riemann integrable if $\lim_{|\Gamma| \to 0} S_\Gamma$ exists. That is, if there exists an $A \in \mathbb{R}$ such that for every $\varepsilon > 0$ there exists a $\delta > 0$ such that $|\Gamma| < \delta$ implies $|S_\Gamma - A| < \varepsilon$. Denote $A$ by
\[ (R)\int_a^b f(x)\,dx. \]
Theorem 1.1. If f is continuous on [a, b], then f is Riemann integrable.
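Example 1.2. A function that is too discontinuous for the Riemann integral to handle is the Dirichlet function, i.e.,
\[ f(x) = \begin{cases} 1 & : x \in \mathbb{Q} \\ 0 & : x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases} \]
This function is not Riemann integrable on any non-degenerate interval: every subinterval contains both rational and irrational sample points, so the Riemann sums do not converge.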
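As an illustration of Definition 1.4 and Theorem 1.1, here is a minimal Python sketch that evaluates Riemann sums of $f(x) = x^2$ on $[0, 1]$ with randomly chosen sample points. The helper names (`riemann_sum`, `uniform_partition`, `random_tags`) and the choice of $f$ are assumptions made for this example; they are not part of the notes.

```python
import random

def riemann_sum(f, partition, tags):
    """Sum of f(xi_j) * (x_j - x_{j-1}) over the subintervals of the partition."""
    return sum(f(t) * (x1 - x0)
               for x0, x1, t in zip(partition, partition[1:], tags))

def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length (mesh (b - a) / n)."""
    return [a + (b - a) * j / n for j in range(n + 1)]

def random_tags(partition, rng):
    """One sample point chosen at random from each subinterval."""
    return [rng.uniform(x0, x1) for x0, x1 in zip(partition, partition[1:])]

rng = random.Random(0)      # fixed seed so the run is reproducible
f = lambda x: x * x         # continuous on [0, 1]; its integral is 1/3

errors = []
for n in (10, 100, 1000):
    p = uniform_partition(0.0, 1.0, n)
    s = riemann_sum(f, p, random_tags(p, rng))
    errors.append(abs(s - 1 / 3))
```

Since $|f'| \le 2$ on $[0, 1]$, each error is at most $2|\Gamma|$ regardless of how the sample points are chosen, which is the kind of uniform control the definition of Riemann integrability asks for.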
2 Functions of Bounded Variation
Definition 2.1. Let $f : [a, b] \to \mathbb{R}$ and let $\Gamma = \{a = x_0 < x_1 < \cdots < x_N = b\}$ be a partition of $[a, b]$. Define the variation of $f$ relative to $\Gamma$ to be
\[ S_\Gamma(f; [a, b]) = \sum_{j=1}^{N} |f(x_j) - f(x_{j-1})|. \]
Define the total variation of $f$ on $[a, b]$ to be
\[ V(f; [a, b]) = V(f) = \sup_\Gamma S_\Gamma(f; [a, b]). \]
Definition 2.2. We say that $f$ is of bounded variation if $V(f) < \infty$ and $f$ is of unbounded variation if $V(f) = \infty$.
Example 2.1. If $f$ is monotonic on $[a, b]$, then $V(f) = |f(b) - f(a)|$, regardless of whether $f$ is increasing or decreasing.
Definition 2.3. A function $f$ defined on $[a, b]$ is said to satisfy a Lipschitz condition on $[a, b]$, or to be a Lipschitz function on $[a, b]$, if there is a constant $C$ such that
\[ |f(x) - f(y)| \le C|x - y| \quad \text{for all } x, y \in [a, b]. \]
Example 2.2. If $f \in C^1[a, b]$, then
\[ |f(x) - f(y)| = \left| \int_y^x f'(s)\,ds \right| \le \left| \int_y^x |f'(s)|\,ds \right| \le M|x - y|, \]
where $|f'(x)| \le M$ for all $x \in [a, b]$; such an $M$ exists since $f'$ is continuous and the continuous image of a compact set is bounded. Thus all continuously differentiable functions are Lipschitz functions.
Proposition 2.1. Let f satisfy a Lipschitz condition. Then f is of bounded variation.
Proof. For any partition $\Gamma$,
\[ S_\Gamma = \sum_{j=1}^{N} |f(x_j) - f(x_{j-1})| \le \sum_{j=1}^{N} C|x_j - x_{j-1}| = C(b - a) < \infty. \]
Thus, we have that $V(f) \le C(b - a)$.
Q.E.D.
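The bound $V(f) \le C(b - a)$ can be checked numerically. The sketch below, with the hypothetical helper `variation_sum` and the example $f = \sin$ on $[0, 2\pi]$ (Lipschitz constant $C = 1$, since $|\cos| \le 1$), computes $S_\Gamma$ on uniform partitions that contain the turning points $\pi/2$ and $3\pi/2$. These choices are assumptions made for illustration only.

```python
import math

def variation_sum(f, partition):
    """S_Gamma(f): sum of |f(x_j) - f(x_{j-1})| over the partition."""
    return sum(abs(f(x1) - f(x0)) for x0, x1 in zip(partition, partition[1:]))

def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * j / n for j in range(n + 1)]

a, b = 0.0, 2 * math.pi
# n divisible by 4, so pi/2 and 3*pi/2 are partition points
sums = [variation_sum(math.sin, uniform_partition(a, b, n)) for n in (4, 16, 256)]

lipschitz_bound = 1.0 * (b - a)   # C * (b - a) with C = 1
```

Because $\sin$ is monotone between consecutive turning points, each of these sums already equals $V(\sin; [0, 2\pi]) = 4$ up to rounding, which is below the Lipschitz bound $2\pi$.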
Note 2.1. We will denote the vector space of all functions of bounded variation on $[a, b]$ by $BV[a, b]$.
At this point, it may be useful to note that $C^1[a, b] \subset \mathrm{Lip}[a, b] \subset BV[a, b]$.
Note 2.2. $V$ is a seminorm on $BV[a, b]$ (constant functions have zero variation); $BV[a, b]$ becomes a normed linear space with, for example, $\|f\| = |f(a)| + V(f)$.
Theorem 2.1.
(i) If $[a', b'] \subset [a, b]$, then $V(f; [a', b']) \le V(f; [a, b])$.
(ii) If $a < c < b$, then $V(f; [a, b]) < \infty$ implies $V(f; [a, b]) = V(f; [a, c]) + V(f; [c, b])$.
Proof. For a proof, refer to the text.
Q.E.D.
Now we will introduce the concepts of positive and negative variation, which, as their names imply, measure how much upward variation the function has (positive) and how much downward variation the function has (negative).
Definition 2.4. For $x \in \mathbb{R}$, define the following:
\[ x^+ = \begin{cases} x & : x > 0 \\ 0 & : x \le 0 \end{cases} \qquad \text{and} \qquad x^- = \begin{cases} 0 & : x > 0 \\ -x & : x \le 0. \end{cases} \]
Note 2.3. It will be useful to note that $|x| = x^+ + x^-$, $x = x^+ - x^-$, $x^+ = \frac{1}{2}\big(x + |x|\big)$, and $x^- = \frac{1}{2}\big(|x| - x\big)$.
Define
\[ P_\Gamma(f) = \sum_{j=1}^{n} \big(f(x_j) - f(x_{j-1})\big)^+, \qquad N_\Gamma(f) = \sum_{j=1}^{n} \big(f(x_j) - f(x_{j-1})\big)^- \]
to be the positive and negative variation with respect to the partition $\Gamma$.
Note 2.4. $P_\Gamma(f) - N_\Gamma(f) = \sum_{j=1}^{n} \big(f(x_j) - f(x_{j-1})\big) = f(b) - f(a)$.
Definition 2.5. The positive and negative variations of $f$ on $[a, b]$ are
\[ P = P(f) = \sup_\Gamma P_\Gamma(f) \quad \text{and} \quad N = N(f) = \sup_\Gamma N_\Gamma(f). \]
Theorem 2.2. If $f$ is a bounded function on $[a, b]$ and any one of $P$, $N$, or $V$ is finite, then the other two are finite as well, and
\[ P + N = V; \qquad P - N = f(b) - f(a). \]
Proof. See the text.
Q.E.D.
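The partition-level identities behind Theorem 2.2, namely $P_\Gamma + N_\Gamma = S_\Gamma$ and $P_\Gamma - N_\Gamma = f(b) - f(a)$, can be verified exactly with rational arithmetic. The function names and the example $f(x) = x^3 - x$ on $[-1, 1]$ below are assumptions chosen for this sketch.

```python
from fractions import Fraction

def pos(x):
    """Positive part x^+."""
    return x if x > 0 else 0

def neg(x):
    """Negative part x^- (a nonnegative number)."""
    return -x if x < 0 else 0

def variations(f, partition):
    """Return (P_Gamma, N_Gamma, S_Gamma) for f on the given partition."""
    diffs = [f(x1) - f(x0) for x0, x1 in zip(partition, partition[1:])]
    return (sum(pos(d) for d in diffs),
            sum(neg(d) for d in diffs),
            sum(abs(d) for d in diffs))

f = lambda x: x**3 - x
a, b = Fraction(-1), Fraction(1)
partition = [a + Fraction(j, 4) for j in range(9)]   # 8 equal subintervals of [-1, 1]

P, N, S = variations(f, partition)
```

Using `Fraction` keeps every difference exact, so the two identities hold with equality rather than up to rounding error.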
Corollary 2.1 (Jordan's Theorem). A function $f$ is of bounded variation on $[a, b]$ if and only if $f = f_1 - f_2$, where $f_1$ and $f_2$ are bounded, increasing functions. Equivalently, $f = f_1 + f_2$, where $f_1$ is a bounded, increasing function and $f_2$ is a bounded, decreasing function.
Proof. Suppose $f = f_1 - f_2$ with $f_1$, $f_2$ bounded and increasing. Being monotone, $f_1, f_2 \in BV[a, b]$, and since $BV[a, b]$ is a linear space, $f_1 - f_2 \in BV[a, b]$.
For the converse, refer to the text.
Q.E.D.
Definition 2.6. We say $f$ has a discontinuity of the first kind at $a$ if $f(a^+) = \lim_{x \to a^+} f(x)$ and $f(a^-) = \lim_{x \to a^-} f(x)$ both exist. If $f(a^+) = f(a^-)$, we can redefine $f$ at $a$ to be this common value and say $f$ has a removable discontinuity at $a$. If $f(a^+) \ne f(a^-)$, then $f$ has a jump discontinuity at $a$.
Note that if $f$ is monotonic and bounded, then any discontinuities are of the first kind since $f(a^+)$ and $f(a^-)$ exist by basic real analysis.

Theorem 2.3. A function $f \in BV[\alpha, \beta]$ has only discontinuities of the first kind, and there are at most countably many of them.
Proof. If $f \in BV[\alpha, \beta]$, write $f = f_1 - f_2$ with $f_1$ and $f_2$ increasing. It suffices to prove the theorem for each of $f_1$ and $f_2$ separately, so let us treat $f_1$; the same argument applies to $f_2$. We know that $f_1$ only has discontinuities of the first kind. Moreover, since $f_1$ is increasing, $f_1(a^-) \le f_1(a) \le f_1(a^+)$, so $f_1$ is continuous at $a$ whenever $f_1(a^+) = f_1(a^-)$; hence only jump discontinuities occur. Let
\[ D_1 = \{a \in [\alpha, \beta] : f_1 \text{ has a discontinuity at } a\} = \bigcup_{k=1}^{\infty} \Big\{a \in [\alpha, \beta] : f_1(a^+) - f_1(a^-) > \frac{1}{k}\Big\}. \]
Since $f_1$ is bounded and increasing, the jumps at distinct points sum to at most $f_1(\beta) - f_1(\alpha)$, so each set in the union is finite. Hence $D_1$ is at most countable.
Following the same process for $f_2$, we have the same conclusion. Every discontinuity of $f$ is a discontinuity of $f_1$ or of $f_2$, and the union of two countable sets is countable, thus the set of discontinuities of $f$ is at most countable.
Q.E.D.
Theorem 2.4. If $f \in C[a, b]$, then $V(f) = \lim_{|\Gamma| \to 0} S_\Gamma(f)$; i.e., for every $\varepsilon > 0$ there exists a $\delta > 0$ such that $|\Gamma| < \delta$ implies $S_\Gamma > V(f) - \varepsilon$.
Proof. Refer to the text for the proof.
Q.E.D.
Theorem 2.5. If $f \in C^1[a, b]$, then
\[ V(f) = (R)\int_a^b |f'(x)|\,dx, \qquad N(f) = (R)\int_a^b (f'(x))^-\,dx, \qquad P(f) = (R)\int_a^b (f'(x))^+\,dx. \]
Proof. Since $f \in C^1[a, b]$, we can apply the Mean Value Theorem to each of the subintervals determined by the partition; thus we have, for some points $\xi_j \in (x_{j-1}, x_j)$,
\[ f(x_j) - f(x_{j-1}) = f'(\xi_j)(x_j - x_{j-1}). \]
This implies
\[ S_\Gamma(f) = \sum_{j=1}^{n} |f'(\xi_j)|(x_j - x_{j-1}), \]
which is a Riemann sum for $|f'|$ on $[a, b]$. Since $|f'| \in C[a, b]$, the theory of Riemann integration implies
\[ \lim_{|\Gamma| \to 0} S_\Gamma(f) = (R)\int_a^b |f'(x)|\,dx, \]
and by Theorem 2.4, $\lim_{|\Gamma| \to 0} S_\Gamma(f) = V(f)$. The formulas for $P(f)$ and $N(f)$ are derived in Note 2.5.
Q.E.D.
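Note 2.5. We have the following:
(i) $P + N = V$
(ii) $P - N = f(b) - f(a)$
(iii)
\[ P = \frac{1}{2}\big[V + f(b) - f(a)\big] = \frac{1}{2}\left[\int_a^b |f'(x)|\,dx + \int_a^b f'(x)\,dx\right] = \int_a^b \frac{1}{2}\big[|f'(x)| + f'(x)\big]\,dx = \int_a^b (f'(x))^+\,dx. \]
The formula for $N$ follows in the same way from $N = \frac{1}{2}\big[V - (f(b) - f(a))\big]$.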
Now let us turn our attention to rectifiable curves. We'll consider these curves in the plane, $\mathbb{R}^2$, for simplicity, though for $\mathbb{R}^n$ the process is almost the same. A curve has a parametric representation as follows:
\[ C = \begin{cases} x = \phi(t) & : a \le t \le b \\ y = \psi(t) & : a \le t \le b. \end{cases} \]
Definition 2.7. The graph or image of $C$ is $\{(\phi(t), \psi(t)) : a \le t \le b\} \subset \mathbb{R}^2$.
For these curves, we'll consider polygonal approximations and measure the length of these approximations. To that end, let $\Gamma = \{a = t_0 < t_1 < \cdots < t_n = b\}$ be a partition of $[a, b]$. Let $P_j = (\phi(t_j), \psi(t_j))$ and approximate $C$ by the union of the line segments $P_{j-1}P_j$. Observe that these are polygonal, i.e., piecewise linear, approximations to $C$.
Definition 2.8. Define the geometric length of the polygonal approximation $C_\Gamma$ to be
\[ \ell(\Gamma) = \sum_{j=1}^{n} |P_{j-1}P_j| = \sum_{j=1}^{n} \Big[\big(\phi(t_j) - \phi(t_{j-1})\big)^2 + \big(\psi(t_j) - \psi(t_{j-1})\big)^2\Big]^{1/2}. \]
Definition 2.9.
(i) $L = L(C) = \sup_\Gamma \ell(\Gamma)$, $0 \le L \le \infty$;
(ii) $C$ is rectifiable if $L < \infty$.
Note 2.6. Rectifiability is a property of the parametric curve, not its graph.
Example 2.3. Let $\phi \notin BV[-1, 1]$, but $-1 \le \phi(t) \le 1$ for all $t \in [-1, 1]$. Then $C = \{(\phi(t), \phi(t)) : -1 \le t \le 1\}$ is not rectifiable even though it is contained in a line segment. For a proof of this, see the following theorem.
Theorem 2.6. A parametric curve $C$, given by functions $\phi(t)$ and $\psi(t)$ on $[a, b]$, is rectifiable if and only if both $\phi(t)$ and $\psi(t)$ are of bounded variation. In that case,
\[ \max\{V(\phi), V(\psi)\} \le L \le V(\phi) + V(\psi). \]
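The two-sided estimate in Theorem 2.6 already holds partition by partition, because $\max\{|u|, |v|\} \le (u^2 + v^2)^{1/2} \le |u| + |v|$ for each segment. The following sketch checks this for the unit circle $\phi = \cos$, $\psi = \sin$ on $[0, 2\pi]$; the helper names and the choice of curve are assumptions made for this example.

```python
import math

def polygonal_length(phi, psi, partition):
    """l(Gamma): total length of the segments joining consecutive points P_j."""
    pts = [(phi(t), psi(t)) for t in partition]
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

def variation_sum(g, partition):
    """S_Gamma(g): sum of |g(t_j) - g(t_{j-1})| over the partition."""
    return sum(abs(g(t1) - g(t0)) for t0, t1 in zip(partition, partition[1:]))

n = 1000
partition = [2 * math.pi * j / n for j in range(n + 1)]

length = polygonal_length(math.cos, math.sin, partition)   # inscribed polygon perimeter
s_phi = variation_sum(math.cos, partition)                  # approximately V(cos) = 4
s_psi = variation_sum(math.sin, partition)                  # approximately V(sin) = 4
```

Here the polygonal length approaches the circumference $2\pi$, which sits between $\max\{V(\phi), V(\psi)\} = 4$ and $V(\phi) + V(\psi) = 8$.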
HW: Due Thursday, 09/13/2012, Chapter 2 pp. 31-32, #9, 11, 13, 18
Now we shall present our final remarks on functions of bounded variation. First, we'll discuss extensions of functions of bounded variation to the complex number system, and we'll conclude with the notion of bounded variation on intervals other than closed intervals.
Let $f : [a, b] \to \mathbb{C}$ be a complex-valued function, and define $S_\Gamma(f)$ by
\[ S_\Gamma(f) = \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})| \]
and $V(f)$ by taking the supremum of $S_\Gamma(f)$ over all partitions $\Gamma$. Since for $z = x + iy$ we have $\max\{|x|, |y|\} \le |z| \le |x| + |y|$, we have $f \in BV[a, b]$ if and only if $\mathrm{Re}(f)$ and $\mathrm{Im}(f)$ are of bounded variation.
Definition 2.10. We'll define $f \in BV(a, b)$ if and only if for all $a < a' < b' < b$, $f \in BV[a', b']$ and $\sup_{a < a' < b' < b} V(f; [a', b']) < \infty$. Similarly, $f \in BV(\mathbb{R})$ if and only if $f \in BV[a, b]$ for all $-\infty < a < b < \infty$ and $\sup_{-\infty < a < b < \infty} V(f; [a, b]) < \infty$.
3 The Riemann-Stieltjes Integral
If we recall Riemann integration, we would have integrals like the following: $\int_a^b f\,dx$. The Riemann-Stieltjes integral replaces the $x$ in $dx$ with a more general $\phi(x)$. We'll start with a partition $\Gamma$ of $[a, b]$ and sample points $\xi_i$ within the subintervals of the partition, and we will define the Riemann-Stieltjes integral in a similar fashion as the standard Riemann integral. More formally, we write the Riemann-Stieltjes sum with respect to $\Gamma$ as:
\[ R_\Gamma = \sum_{i=1}^{n} f(\xi_i)\big[\phi(x_i) - \phi(x_{i-1})\big]. \]
Notice this becomes the standard Riemann sum if $\phi(x) = x$.
Definition 3.1. If $\lim_{|\Gamma| \to 0} R_\Gamma = I$ exists, i.e., for all $\varepsilon > 0$ there exists a $\delta > 0$ such that $|\Gamma| < \delta$ implies $|R_\Gamma - I| < \varepsilon$, then we say that the Riemann-Stieltjes integral of $f$ with respect to $\phi$ exists, written as
\[ \int_a^b f(x)\,d\phi(x) = \int_a^b f\,d\phi, \]
and we say that $f$ is Riemann-Stieltjes integrable with respect to $\phi$.
Note 3.1.
(i) An equivalent condition is the following: for all $\varepsilon > 0$, there exists a $\delta > 0$ such that $|\Gamma|, |\Gamma'| < \delta$ implies $|R_\Gamma - R_{\Gamma'}| < \varepsilon$.
(ii) For $\phi(x) = x$, the notion of Riemann-Stieltjes integrability agrees with Riemann integrability and the value of $\int_a^b f\,d\phi$ is the same.
(iii) If $f \in C[a, b]$ and $\phi \in C^1[a, b]$, then $\int_a^b f\,d\phi = \int_a^b f\phi'\,dx$.
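Proof. Consider a sum $R_\Gamma = \sum_{i=1}^{n} f(x_i)[\phi(x_i) - \phi(x_{i-1})]$ (written with sample points $x_i$; the argument is unchanged for arbitrary sample points in $[x_{i-1}, x_i]$). Each $\phi(x_i) - \phi(x_{i-1}) = \phi'(\xi_i)(x_i - x_{i-1})$ for some $\xi_i \in (x_{i-1}, x_i)$ by the Mean Value Theorem, thus
\[ R_\Gamma = \sum_{i=1}^{n} f(x_i)\phi'(\xi_i)(x_i - x_{i-1}) = \sum_{i=1}^{n} (f\phi')(\xi_i)(x_i - x_{i-1}) + \sum_{i=1}^{n} \big(f(x_i) - f(\xi_i)\big)\phi'(\xi_i)(x_i - x_{i-1}). \]
The first sum is a Riemann sum for the continuous function $f\phi'$, so it tends to $\int_a^b f\phi'\,dx$. It suffices to show the last term tends to zero as the norm of the partition tends to zero. Because $f$ is continuous on a compact set, it is uniformly continuous, and similarly, since $\phi'$ is continuous on a compact set, it is bounded, say $|\phi'| \le M$ with $M > 0$. Hence, given $\varepsilon > 0$, let $\delta > 0$ be such that $|f(x) - f(y)| < \frac{\varepsilon}{M(b - a)}$ if $|x - y| < \delta$. Thus, for $|\Gamma| < \delta$ we have:
\[ \left| \sum_{i=1}^{n} \big(f(x_i) - f(\xi_i)\big)\phi'(\xi_i)(x_i - x_{i-1}) \right| < \frac{\varepsilon}{M(b - a)} \sum_{i=1}^{n} |\phi'(\xi_i)|(x_i - x_{i-1}) \le \frac{\varepsilon}{b - a} \sum_{i=1}^{n} (x_i - x_{i-1}) = \varepsilon. \]
Thus, the last term is small whenever $|\Gamma|$ is sufficiently small, and the desired result follows.
Q.E.D.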
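(iv) Suppose $\phi$ is a step function, i.e., it is piecewise constant. An equivalent expression is to say there exists a partition $A = \{a = \alpha_0 < \alpha_1 < \cdots < \alpha_n = b\}$ with $\phi$ constant on the open intervals $(\alpha_{i-1}, \alpha_i)$. Let $d_i = \phi(\alpha_i^+) - \phi(\alpha_{i-1}^+)$, $1 \le i \le n - 1$ (this is the jump $\phi(\alpha_i^+) - \phi(\alpha_i^-)$ of $\phi$ at $\alpha_i$), with $d_0 = \phi(a^+) - \phi(a)$ and $d_n = \phi(b) - \phi(b^-)$. Our claim is:
\[ \int_a^b f\,d\phi = \sum_{i=0}^{n} f(\alpha_i)\,d_i \]
as long as $f \in C[a, b]$.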
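Items (iii) and (iv) of Note 3.1 can be illustrated numerically. The sketch below evaluates Riemann-Stieltjes sums with midpoint sample points for $f(x) = x$ on $[0, 1]$, first against the smooth integrator $\phi(x) = x^2$ (where (iii) predicts $\int_0^1 x \cdot 2x\,dx = 2/3$) and then against a step function with a single jump of size 3 at $x = 1/2$ (where (iv) predicts $f(1/2) \cdot 3 = 1.5$). The function `rs_sum` and both integrators are assumptions made for this example.

```python
def rs_sum(f, phi, partition, tags):
    """Riemann-Stieltjes sum: sum of f(xi_i) * (phi(x_i) - phi(x_{i-1}))."""
    return sum(f(t) * (phi(x1) - phi(x0))
               for x0, x1, t in zip(partition, partition[1:], tags))

n = 1000
partition = [j / n for j in range(n + 1)]                          # uniform partition of [0, 1]
tags = [(x0 + x1) / 2 for x0, x1 in zip(partition, partition[1:])]  # midpoints

f = lambda x: x

# Smooth integrator: phi(x) = x^2, so the sums approximate the integral of f * phi'.
smooth = rs_sum(f, lambda x: x * x, partition, tags)

# Step integrator: constant except for a jump of size 3 at x = 1/2.
step = lambda x: 3.0 if x >= 0.5 else 0.0
jump = rs_sum(f, step, partition, tags)
```

Only the subinterval that contains the jump contributes to the second sum, which is why the value is $f$ evaluated near $1/2$ times the jump size.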
Theorem 3.1. A necessary condition for $\int_a^b f\,d\phi$ to exist is for $f$ and $\phi$ to have no common point of discontinuity.
Proof. Refer to the text.
Q.E.D.
Theorem 3.2 (Linearity Properties).
(i) $\int_a^b (c_1 f_1 + c_2 f_2)\,d\phi = c_1 \int_a^b f_1\,d\phi + c_2 \int_a^b f_2\,d\phi$ if the two integrals on the right-hand side exist.
(ii) $\int_a^b f\,d(c_1\phi_1 + c_2\phi_2) = c_1 \int_a^b f\,d\phi_1 + c_2 \int_a^b f\,d\phi_2$ if the two integrals on the right-hand side exist.
Thus we can say the map $(f, \phi) \mapsto \int_a^b f\,d\phi$ is bilinear; i.e., it is linear in each of the two components separately.
Note 3.2. This is defined on some linear space containing $C[a, b] \times C^1[a, b]$.
Theorem 3.3 (Additivity on Intervals). $\int_a^b f\,d\phi = \int_a^c f\,d\phi + \int_c^b f\,d\phi$ for all $a < c < b$, if the integral on the left-hand side exists.
Note 3.3. The converse of the preceding theorem fails! See the book for further details.
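Theorem 3.4 (Integration by Parts). If $f$ and $\phi$ are bounded on $[a, b]$, then if $\int_a^b f\,d\phi$ exists, so does $\int_a^b \phi\,df$ and
\[ \int_a^b f\,d\phi = f(b)\phi(b) - f(a)\phi(a) - \int_a^b \phi\,df. \]
Proof. A typical Riemann-Stieltjes sum of $\int_a^b f\,d\phi$ is
\[ \begin{aligned} R_\Gamma &= \sum_{i=1}^{n} f(\xi_i)\big[\phi(x_i) - \phi(x_{i-1})\big] \\ &= \sum_{i=1}^{n} f(\xi_i)\phi(x_i) - \sum_{i=1}^{n} f(\xi_i)\phi(x_{i-1}) \\ &= \sum_{i=1}^{n} f(\xi_i)\phi(x_i) - \sum_{i=0}^{n-1} f(\xi_{i+1})\phi(x_i) \\ &= -\sum_{i=1}^{n-1} \phi(x_i)\big[f(\xi_{i+1}) - f(\xi_i)\big] + f(\xi_n)\phi(b) - f(\xi_1)\phi(a). \end{aligned} \]
Now add and subtract the term $\phi(a)[f(\xi_1) - f(a)] + \phi(b)[f(b) - f(\xi_n)]$, then do some algebra, and we get
\[ R_\Gamma = f(b)\phi(b) - f(a)\phi(a) - T_{\Gamma'}, \]
where
\[ T_{\Gamma'} = \sum_{i=1}^{n-1} \phi(x_i)\big[f(\xi_{i+1}) - f(\xi_i)\big] + \phi(a)\big[f(\xi_1) - f(a)\big] + \phi(b)\big[f(b) - f(\xi_n)\big]. \]
This is a Riemann-Stieltjes sum of $\phi$ with respect to $f$ with partition $\Gamma' = \{a = \xi_0 \le \xi_1 \le \cdots \le \xi_n \le \xi_{n+1} = b\}$ and sample points $x_i \in [\xi_i, \xi_{i+1}]$, $0 \le i \le n$. Note that $|\Gamma'| \le 2|\Gamma| \to 0$ as $|\Gamma| \to 0$, thus the existence of $\lim_{|\Gamma| \to 0} R_\Gamma$ implies that $\lim_{|\Gamma'| \to 0} T_{\Gamma'}$ exists.
Q.E.D.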
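Now assume that $\phi$ is increasing and bounded so that $\phi(x_i) - \phi(x_{i-1}) \ge 0$ for all $i$ and all partitions, and assume that $f$ is continuous so that
\[ m_i = \inf_{x \in [x_{i-1}, x_i]} f(x) \quad \text{and} \quad M_i = \sup_{x \in [x_{i-1}, x_i]} f(x) \]
exist. Then, for any sample points $\{\xi_i\}$, we have
\[ \sum_{i=1}^{n} m_i\big(\phi(x_i) - \phi(x_{i-1})\big) = L_\Gamma \le R_\Gamma \le U_\Gamma = \sum_{i=1}^{n} M_i\big(\phi(x_i) - \phi(x_{i-1})\big). \]
Recall the definition of refinement, i.e., $\Gamma'$ is a refinement of $\Gamma$ if $\Gamma \subset \Gamma'$. If $\Gamma$, $\Gamma'$ are any two partitions, then $\Gamma \cup \Gamma'$ is a common refinement with $|\Gamma \cup \Gamma'| \le \min\{|\Gamma|, |\Gamma'|\}$.
Lemma 3.1.
(i) If $\Gamma \subset \Gamma'$, then $L_\Gamma \le L_{\Gamma'}$ and $U_{\Gamma'} \le U_\Gamma$.
(ii) If $\Gamma$, $\Gamma'$ are any two partitions, $L_\Gamma \le U_{\Gamma'}$.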
Q.E.D.
Theorem 3.5. If $f \in C[a, b]$, $\phi \in BV[a, b]$, then $\int_a^b f\,d\phi$ exists and
\[ \left| \int_a^b f\,d\phi \right| \le \Big( \sup_{x \in [a, b]} |f(x)| \Big)\, V(\phi; [a, b]). \]
4 Lebesgue Measure on $\mathbb{R}^n$
Definition 4.1. An $n$-dimensional interval is $I \subset \mathbb{R}^n$ where $I = \prod_{j=1}^{n} [a_j, b_j]$. The volume of $I$ is $v(I) = \prod_{j=1}^{n} (b_j - a_j)$.
Definition 4.2. A cover of a set $E \subset \mathbb{R}^n$ is a countable collection $S = \{S_i\}_{i=1}^{\infty}$ of sets with $E \subset \bigcup_{i=1}^{\infty} S_i$.
We are interested in covers which consist of intervals, $S = \{I_k\}_{k=1}^{\infty}$. Define
\[ \sigma(S) = \sum_{k=1}^{\infty} v(I_k). \]
Definition 4.3. For $E \subset \mathbb{R}^n$, the Lebesgue outer measure of $E$, denoted $|E|_e$, is
\[ |E|_e = \inf\{\sigma(S) : S \text{ is a countable cover of } E \text{ by intervals}\}. \]
We say that $E$ is a set of Lebesgue measure zero if $|E|_e = 0$. More formally, for every $\varepsilon > 0$, there exists a cover $S = \{I_k\}_{k=1}^{\infty}$ with $E \subset \bigcup_{k=1}^{\infty} I_k$ and $\sum_{k=1}^{\infty} v(I_k) < \varepsilon$. Such an $E$ is also called a null set.
Theorem 4.1. If $E = I$ is an interval, then $|I|_e = v(I)$.
Proof. Note $S = \{I\}$ is a cover of $I$, thus $|I|_e \le \sigma(S) = v(I)$. For the reverse inequality, we remark that if $I$ is an $n$-dimensional interval and $0 < r < \infty$, we let $rI$ denote the interval with the same center as $I$ and side lengths equal to $r$ times the corresponding side lengths of $I$. Then $v(rI) = r^n v(I)$. Now, suppose $S = \{I_k\}_{k=1}^{\infty}$ is a cover of $I$. Let $\varepsilon > 0$ and $I_k^* = (1 + \varepsilon)I_k$ so that $v(I_k^*) = (1 + \varepsilon)^n v(I_k)$. Now we have $I \subset \bigcup_{k=1}^{\infty} I_k \subset \bigcup_{k=1}^{\infty} I_k^*$. Note also that $I \subset \bigcup_{k=1}^{\infty} (I_k^*)^\circ$, which implies $\{(I_k^*)^\circ\}_{k=1}^{\infty}$ is an open cover of a compact set. By the Heine-Borel theorem, there exists a finite subcover. Relabeling if necessary, we can assume $I \subset \bigcup_{k=1}^{N} (I_k^*)^\circ$, thus
\[ v(I) \le \sum_{k=1}^{N} v(I_k^*) = (1 + \varepsilon)^n \sum_{k=1}^{N} v(I_k) \le (1 + \varepsilon)^n \sigma(S). \]
Since this holds for every $\varepsilon > 0$, we have $v(I) \le \sigma(S)$ for every cover $S$ of $I$. Taking the infimum, we get $v(I) \le |I|_e$, and hence, $v(I) = |I|_e$.
Q.E.D.
Theorem 4.2 (Monotonicity). If $E_1 \subset E_2$, then $|E_1|_e \le |E_2|_e$.
Proof. Let $\mathcal{S}_j$ be the family of all covers $S$ of $E_j$ by intervals. It's clear that any cover of $E_2$ is also a cover of $E_1$, thus $\mathcal{S}_1 \supset \mathcal{S}_2$ and
\[ |E_1|_e = \inf_{S \in \mathcal{S}_1} \sigma(S) \le \inf_{S \in \mathcal{S}_2} \sigma(S) = |E_2|_e. \]
Q.E.D.
Theorem 4.3. If $E = \bigcup_{k=1}^{\infty} E_k$, then $|E|_e \le \sum_{k=1}^{\infty} |E_k|_e$.
Q.E.D.
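Example 4.1. A singleton $E = \{x_0\} \subset \mathbb{R}^n$ can be covered by
\[ I_\varepsilon = \prod_{j=1}^{n} \Big[\pi_j(x_0) - \frac{\varepsilon}{2},\ \pi_j(x_0) + \frac{\varepsilon}{2}\Big], \]
where $\pi_j(x_0)$ denotes the $j$-th coordinate of $x_0$. Then $\sigma(S) = v(I_\varepsilon) = \varepsilon^n$, which converges to 0 as $\varepsilon$ tends to 0. That is,
\[ 0 \le |E|_e \le \inf_{\varepsilon > 0} \varepsilon^n = 0, \]
hence $|E|_e = 0$.
Corollary 4.1. If $E \subset \mathbb{R}^n$ is countable, then $|E|_e = 0$. Note that even though $\mathbb{Q}^n$ is dense in $\mathbb{R}^n$, it is countable and therefore has measure zero.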
Example 4.2. Let $P \subset \mathbb{R}^n$ be an affine $k$-plane, $1 \le k \le n - 1$, i.e., $P = \Pi + a$ for some $a \in \mathbb{R}^n$ and some $k$-dimensional linear subspace $\Pi \subset \mathbb{R}^n$.
Claim 4.1. $|P|_e = 0$ for a hyperplane, i.e., for $k = n - 1$.
Proof. By the remarks below, we can assume $P = \{x \in \mathbb{R}^n : x_n = 0\}$. Write $P = \bigcup_{k=1}^{\infty} P_k$, where $P_k = \{x \in \mathbb{R}^n : x_n = 0,\ |x_j| \le k,\ 1 \le j \le n - 1\}$. This implies that $|P|_e \le \sum_{k=1}^{\infty} |P_k|_e$.
Claim 4.2. For all $k \in \mathbb{N}$, $|P_k|_e = 0$.
Proof. For all $\varepsilon > 0$, consider
\[ I_\varepsilon = \Big( \prod_{j=1}^{n-1} [-k, k] \Big) \times [-\varepsilon, \varepsilon]. \]
Now,
\[ v(I_\varepsilon) = (2k)^{n-1}(2\varepsilon) = 2^n k^{n-1} \varepsilon, \]
which approaches zero as $\varepsilon$ approaches zero. Hence, $|P_k|_e = 0$, as desired.
Q.E.D.
Returning to the claim above, we now have $|P|_e \le \sum_{k=1}^{\infty} |P_k|_e = 0$, thus $|P|_e = 0$.
Q.E.D.
By the remarks cited above, we mean specifically the symmetries of $|\cdot|_e$, i.e.,
(1) Translation Invariance; that is, for any $a \in \mathbb{R}^n$, $E + a = \{x + a : x \in E\}$ and $|E + a|_e = |E|_e$.
(2) Rotation Invariance (defined below)
Definition 4.4. The rotation group in $\mathbb{R}^n$ is $O(n) = \{R \in M_{n \times n}(\mathbb{R}) : R^T R = I_n = R R^T\}$. If $E \subset \mathbb{R}^n$, $R \in O(n)$, then $R(E) = \{Rx : x \in E\}$.
Proposition 4.1. $|R(E)|_e = |E|_e$, for all $E \subset \mathbb{R}^n$, $R \in O(n)$.
(3) Scaling or homogeneity
Definition 4.5. Let $r \in (0, \infty)$, $E \subset \mathbb{R}^n$. Then $rE = \{rx : x \in E\}$. [Note that this conflicts with the earlier $rI$.] For every interval $I$, we have $v(rI) = r^n v(I)$.
Proposition 4.2. $|rE|_e = r^n |E|_e$ for $0 < r < \infty$.
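Theorem 4.4. If $E \subset \mathbb{R}^n$ and $\varepsilon > 0$, then there exists an open set $G \subset \mathbb{R}^n$ with $E \subset G$ and $|G|_e \le |E|_e + \varepsilon$.
Proof. Let $\varepsilon > 0$. If $|E|_e = \infty$, take $G = \mathbb{R}^n$. Otherwise, there exists a cover $S = \{I_k\}_{k=1}^{\infty}$ with $\sum_{k=1}^{\infty} v(I_k) \le |E|_e + \frac{\varepsilon}{2}$. Let $I_k^*$ be an open interval containing $I_k$ and with $v(I_k^*) \le v(I_k) + \frac{\varepsilon}{2^{k+1}}$. Let $G = \bigcup_{k=1}^{\infty} I_k^*$ (which is open), then $E \subset \bigcup_{k=1}^{\infty} I_k \subset G$. We have
\[ |G|_e = \Big| \bigcup_{k=1}^{\infty} I_k^* \Big|_e \le \sum_{k=1}^{\infty} |I_k^*|_e \le \sum_{k=1}^{\infty} \Big( v(I_k) + \frac{\varepsilon}{2^{k+1}} \Big) \le |E|_e + \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = |E|_e + \varepsilon. \]
Q.E.D.
Theorem 4.5. If $E \subset \mathbb{R}^n$, there exists a $G_\delta$-set $H$ with $E \subset H$ and $|E|_e = |H|_e$.
HW: pp. 47-48, #2, 3, 9, 10, 13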
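Definition 4.6. For all $E \subset \mathbb{R}^n$, let $S = \{I_k\}_{k=1}^{\infty}$ be a cover with each $I_k$ a closed interval. Define $\sigma(S) = \sum_{k=1}^{\infty} v(I_k)$, and then $0 \le \sigma(S) \le \infty$. Then the Lebesgue outer measure of $E$ is
\[ |E|_e = \inf\{\sigma(S) : S \text{ is a cover of } E\}. \]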
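Now we will restrict our attention to $\mathcal{M} \subset 2^{\mathbb{R}^n}$, the measurable sets as a subset of the power set of $\mathbb{R}^n$.
Definition 4.7. $E \subset \mathbb{R}^n$ is (Lebesgue) measurable if for every $\varepsilon > 0$, there exists an open set $G \subset \mathbb{R}^n$ such that $E \subset G$ and $|G \setminus E|_e < \varepsilon$.
Note 4.1. If we write $G = E \cup (G \setminus E)$, by subadditivity, this implies
\[ |G|_e \le |E|_e + |G \setminus E|_e. \]
This inequality bounds $|G|_e$ from above in terms of $|G \setminus E|_e$, not the other way around. Thus, the previous material (in particular Theorem 4.4) does not imply the condition in the definition of Lebesgue measurability.
We would like to point out that $\mathcal{M} \ne 2^{\mathbb{R}^n}$; in particular, there exist nonmeasurable subsets.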
Note 4.2. We would like to list some properties of measurable sets.
(i) $|\emptyset|_e = 0$
(ii) Any open set $G$ is measurable
(iii) Any $E$ such that $|E|_e = 0$ is measurable.
Let us prove the last statement:
Proof. Given $\varepsilon > 0$, there exists a cover $S = \{I_k\}_{k=1}^{\infty}$ of $E$ such that $\sum_{k=1}^{\infty} v(I_k) < \frac{\varepsilon}{2}$. Similar to before, we can dilate each $I_k$ by $r = 2^{1/n}$ and call the result $I_k^*$. Then $v(I_k^*) = 2v(I_k)$, and hence $E \subset \bigcup (I_k^*)^\circ$, which is open. Let $G = \bigcup (I_k^*)^\circ$. Clearly $G \setminus E \subset G$, hence, by monotonicity and subadditivity we have
\[ |G \setminus E|_e \le |G|_e = \Big| \bigcup (I_k^*)^\circ \Big|_e \le \sum_{k=1}^{\infty} v(I_k^*) < \varepsilon. \]
Therefore, we have satisfied the definition of Lebesgue measurable.
Q.E.D.
If $E \in \mathcal{M}$, we refer to $|E|_e$ as just the measure of $E$ and write $|E|$. Thus, $|E|_e = 0$ implies $|E| = 0$.
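Corollary 4.2. Any subset of any affine $k$-plane is measurable and of measure zero.
Theorem 4.6. If $\{E_k\}_{k=1}^{\infty}$ is a countable collection of measurable sets, then $\bigcup_{k=1}^{\infty} E_k \in \mathcal{M}$ and
\[ \Big| \bigcup_{k=1}^{\infty} E_k \Big| \le \sum_{k=1}^{\infty} |E_k|. \]
Proof. By subadditivity, it suffices to show $\bigcup_{k=1}^{\infty} E_k \in \mathcal{M}$. To that end, let $\varepsilon > 0$. For each $k$, there exists an open set $G_k$ that contains $E_k$ such that $|G_k \setminus E_k|_e < \frac{\varepsilon}{2^k}$. Let $G = \bigcup G_k$, thus $G$ is open, $\bigcup E_k \subset G$, and
\[ G \setminus \bigcup E_k = \bigcup G_k \setminus \bigcup E_k \subset \bigcup (G_k \setminus E_k), \quad \text{so} \quad \Big| G \setminus \bigcup E_k \Big|_e \le \sum_{k=1}^{\infty} |G_k \setminus E_k|_e < \sum_{k=1}^{\infty} \varepsilon 2^{-k} = \varepsilon. \]
Thus, we have that $\bigcup E_k$ is measurable.
Q.E.D.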
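Corollary 4.3. Any closed interval $I$ is measurable and $v(I) = |I|$.
Proof. We know $I = I^\circ \cup \partial I$, $I^\circ \in \mathcal{M}$, and $|\partial I|_e = 0$, thus $|\partial I| = 0$, i.e., $\partial I \in \mathcal{M}$ and, by the preceding theorem, $I \in \mathcal{M}$. Then,
\[ v(I) \le |I^\circ| \le |I| \le |I^\circ| + |\partial I| = |I^\circ| = v(I). \]
Q.E.D.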
Lemma 4.1. If $E_1, E_2 \subset \mathbb{R}^n$ and
\[ d(E_1, E_2) = \inf_{x \in E_1,\ y \in E_2} |x - y| > 0, \]
then $|E_1 \cup E_2|_e = |E_1|_e + |E_2|_e$.
Theorem 4.7. If $F \subset \mathbb{R}^n$ is closed, then $F \in \mathcal{M}$.
Q.E.D.
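Theorem 4.8. $E \in \mathcal{M}$ if and only if $E^C \in \mathcal{M}$, where $E^C = \mathbb{R}^n \setminus E$.
Proof. If $E \in \mathcal{M}$, then for every $k \in \mathbb{N}$, there exists an open set $G_k$ with $E \subset G_k$, $|G_k \setminus E|_e < \frac{1}{k}$. We have $G_k^C \subset E^C$, and $G_k^C$ is closed, thus is measurable by the previous theorem. Thus, $H = \bigcup_k G_k^C$ is a countable union of measurable sets, hence is measurable. Moreover, $E^C = H \cup (E^C \setminus H)$, and $E^C \setminus H \subset E^C \setminus G_k^C = G_k \setminus E$ for every $k$. Thus,
\[ |E^C \setminus H|_e \le |G_k \setminus E|_e < \frac{1}{k}, \]
which implies $|E^C \setminus H|_e = 0$, thus $E^C \setminus H$ is measurable. Now, $E^C = H \cup (E^C \setminus H) \in \mathcal{M}$. The converse follows by applying this argument to $E^C$, since $(E^C)^C = E$.
Q.E.D.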
Theorem 4.9. If $\{E_k\}_{k=1}^{\infty}$ are measurable, then $E = \bigcap_{k=1}^{\infty} E_k$ is measurable, and if $E_1, E_2 \in \mathcal{M}$, then $E_1 \setminus E_2 \in \mathcal{M}$.
Proof. Since $E_1 \setminus E_2 = E_1 \cap E_2^C$, it suffices to show the first part of the theorem. But, by De Morgan's Law, we have
\[ E = \bigcap_{k=1}^{\infty} E_k = \Big( \bigcup_{k=1}^{\infty} E_k^C \Big)^C. \]
Thus, by previous theorems, we have that $E \in \mathcal{M}$.
Q.E.D.
Now let us discuss some abstract topics to make them more accessible later on in the course. If $X$ is any set, denote the power set of $X$ by $2^X$.
Definition 4.8. A nonempty collection $\Sigma \subset 2^X$ is a $\sigma$-algebra if it is closed under complements and countable intersections.
Remark 4.1. It is easy to show that for any $\sigma$-algebra $\Sigma$, we have $\emptyset, X \in \Sigma$. Just take $E \in \Sigma$, then $E \cap E^C = \emptyset$, and the complement of $\emptyset$ is the whole space $X$.
It's clear that $2^X$ is a $\sigma$-algebra since it consists of all subsets of $X$. We can summarize what we've done with the measurable sets in the following theorem.
Theorem 4.10. $\mathcal{M}$ is a $\sigma$-algebra.
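Proposition 4.3. Let $X$ be a set and let $\mathcal{F}$ be a set of $\sigma$-algebras on $X$. Then the intersection of the members of $\mathcal{F}$ is again a $\sigma$-algebra on $X$.
Proof. Let $\{\Sigma_i\}_{i \in I}$ be a family of $\sigma$-algebras and let $\Sigma = \bigcap_{i \in I} \Sigma_i$. If $E \in \Sigma$, then $E \in \Sigma_i$ for all $i \in I$, and since each $\Sigma_i$ is a $\sigma$-algebra, $E^C$ is in each $\Sigma_i$, thus $E^C \in \Sigma$. A similar argument holds for the intersection of any countable collection of sets in $\Sigma$.
Q.E.D.
If $S \subset 2^X$ is an arbitrary collection of sets, then let
\[ \mathcal{F}_S = \big\{ \Sigma \subset 2^X : \Sigma \text{ is a } \sigma\text{-algebra and } S \subset \Sigma \big\}. \]
Observe that $\mathcal{F}_S \ne \emptyset$ since $2^X$ is a $\sigma$-algebra. Then
\[ \Sigma_S = \bigcap \{\Sigma : \Sigma \in \mathcal{F}_S\} \]
is the smallest $\sigma$-algebra containing $S$, called the $\sigma$-algebra generated by $S$.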
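Example 4.3. Let $X = \mathbb{R}^n$, $S$ be the collection of all open subsets of $\mathbb{R}^n$. $S$ is clearly not a $\sigma$-algebra, but, by the process above, $S$ generates a $\sigma$-algebra. This is known as the Borel $\sigma$-algebra on $\mathbb{R}^n$ and is denoted by $\mathcal{B}$, in honor of Émile Borel.
Since $\mathcal{M}$ contains all of the open sets and is a $\sigma$-algebra, and $\mathcal{B}$ is the smallest $\sigma$-algebra containing the open sets, we have $\mathcal{B} \subset \mathcal{M}$. It can be shown that $\mathcal{B} \ne \mathcal{M}$.
Example 4.4. $\mathcal{M} \subset 2^{\mathbb{R}^n}$, the measurable subsets of $\mathbb{R}^n$. Note that $\tau \subset \mathcal{M}$, where $\tau$ is the topology on $\mathbb{R}^n$.
Let $(X, \tau)$ be any topological space. Taking $S = \tau$, the minimal $\sigma$-algebra containing $\tau$ is denoted $\mathcal{B}$, the Borel $\sigma$-algebra on $X$.
Theorem 4.11. On $\mathbb{R}^n$, $\mathcal{B} \subsetneq \mathcal{M}$.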
4.0.1 Two Important Properties of M
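Recall $E \subset \mathbb{R}^n$ is measurable if and only if for every $\varepsilon > 0$, there exists an open set $G$ such that $E \subset G$ and $|G \setminus E|_e < \varepsilon$.
Lemma 4.2. $E$ is measurable if and only if for every $\varepsilon > 0$, there exists a closed set $F \subset E$ such that $|E \setminus F|_e < \varepsilon$.
Proof. We know $E$ is measurable if and only if $E^C$ is measurable, which holds if and only if for every $\varepsilon > 0$ there exists an open set $G$ such that $E^C \subset G$ and $|G \setminus E^C|_e < \varepsilon$. We can rewrite this set as
\[ G \setminus E^C = G \cap (E^C)^C = G \cap E = E \cap (G^C)^C = E \setminus G^C. \]
Now we have $|G \setminus E^C|_e < \varepsilon$ if and only if $|E \setminus G^C|_e < \varepsilon$. Hence, just take $F = G^C$, which is closed and satisfies $F \subset E$.
Q.E.D.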
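Recall that outer measure is subadditive, and that if the distance $d(E_1, E_2)$ between two sets is positive, then we actually have equality (Lemma 4.1).
Theorem 4.12. If $\{E_k\}_{k=1}^{\infty}$ is a countable family of disjoint measurable sets, then
\[ \Big| \bigcup_{k=1}^{\infty} E_k \Big| = \sum_{k=1}^{\infty} |E_k|. \]
Proof. By subadditivity, it suffices to show
\[ \Big| \bigcup_{k=1}^{\infty} E_k \Big| \ge \sum_{k=1}^{\infty} |E_k|. \]
For simplicity, assume each $E_k$ is bounded and let $\varepsilon > 0$. Then, for every $k \in \mathbb{N}$, there exists a closed set $F_k \subset E_k$ such that $|E_k \setminus F_k| < \varepsilon 2^{-k}$. Note that since $F_k \subset E_k$, the $F_k$ are pairwise disjoint and each $F_k$ is bounded. Since each $F_k$ is also closed, it is compact. Disjoint compact sets are at positive distance from each other, so $d(F_j, F_k) > 0$ for $j \ne k$.
Let $m \in \mathbb{N}$, and notice that by Lemma 4.1 and induction, we have
\[ \Big| \bigcup_{k=1}^{m} F_k \Big| = \sum_{k=1}^{m} |F_k|. \]
Notice that
\[ \bigcup_{k=1}^{m} F_k \subset \bigcup_{k=1}^{m} E_k \subset \bigcup_{k=1}^{\infty} E_k, \]
thus $\sum_{k=1}^{m} |F_k| \le |\bigcup_{k=1}^{\infty} E_k|$. Since $m$ is arbitrary, it follows that $\sum_{k=1}^{\infty} |F_k| \le |\bigcup_{k=1}^{\infty} E_k|$. But since $|E_k| \le |F_k| + \varepsilon 2^{-k}$, we have
\[ \sum_{k=1}^{\infty} \big( |E_k| - \varepsilon 2^{-k} \big) = -\varepsilon + \sum_{k=1}^{\infty} |E_k| \le \sum_{k=1}^{\infty} |F_k| \le \Big| \bigcup_{k=1}^{\infty} E_k \Big|. \]
Since $\varepsilon > 0$ was arbitrary, we have
\[ \sum_{k=1}^{\infty} |E_k| \le \Big| \bigcup_{k=1}^{\infty} E_k \Big|. \]
Q.E.D.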
Corollary 4.4. Suppose $E_1, E_2 \in \mathcal{M}$, $E_2 \subset E_1$ and $|E_2| < \infty$. Then $|E_1 \setminus E_2| = |E_1| - |E_2|$.
Proof. Write $E_1 = E_2 \cup (E_1 \setminus E_2)$. Then this is a disjoint union and so $|E_1| = |E_2| + |E_1 \setminus E_2|$. Since $|E_2| < \infty$, we can subtract it from both sides and obtain the desired identity.
Q.E.D.
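Theorem 4.13. Let $\{E_k\}_{k=1}^{\infty} \subset \mathcal{M}$. Then
(i) if $E_m \subset E_n$ for $m \le n$ and $E = \bigcup_{k=1}^{\infty} E_k$, then $|E| = \lim_{k \to \infty} |E_k|$;
(ii) if $E_m \supset E_n$ for $m \le n$ and $E = \bigcap_{k=1}^{\infty} E_k$ and some $E_k$ has finite measure, then $|E| = \lim_{k \to \infty} |E_k|$.
Proof. For the second condition, write
\[ E_1 = (E_1 \setminus E_2) \cup (E_2 \setminus E_3) \cup \cdots \cup E = \bigcup_{k=1}^{\infty} (E_k \setminus E_{k+1}) \cup E, \]
which is a disjoint union. Relabeling if necessary, we can assume $|E_1|$ is finite. Then we have
\[ |E_1| = \sum_{k=1}^{\infty} |E_k \setminus E_{k+1}| + |E| = \sum_{k=1}^{\infty} \big( |E_k| - |E_{k+1}| \big) + |E| = |E_1| - \lim_{N \to \infty} |E_N| + |E|. \]
Since $|E_1|$ is finite, we can subtract $|E_1| - \lim_{N \to \infty} |E_N|$ from both sides and obtain the desired identity.
Q.E.D.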
4.0.2 Equivalent Formulations of Measurability
Theorem 4.14. Let $E \subset \mathbb{R}^n$. Then
(i) $E$ is measurable if and only if $E = H \setminus Z$, where $H$ is a $G_\delta$-set and $|Z| = 0$;
(ii) $E$ is measurable if and only if $E = K \cup W$, where $K$ is an $F_\sigma$-set and $|W| = 0$.
Proof. Clearly sets of either form are measurable. Conversely, if $E \in \mathcal{M}$, then for every $k \in \mathbb{N}$, there exists an open set $G_k$ such that $E \subset G_k$ and $|G_k \setminus E| < \frac{1}{k}$. Thus, let $H = \bigcap_{k=1}^{\infty} G_k$, which is a $G_\delta$-set. Then $H \setminus E \subset G_k \setminus E$, thus $|H \setminus E| \le |G_k \setminus E| < \frac{1}{k}$ for every $k$, which implies $|H \setminus E| = 0$. Set $Z = H \setminus E$, and $E = H \setminus Z$. The second condition follows by applying the first to $E^C$ and taking complements.
Q.E.D.
Theorem 4.15. If $|E|_e < \infty$, then $E$ is measurable if and only if for every $\varepsilon > 0$ we can write $E = (S \cup N_1) \setminus N_2$, where $S = \bigcup_{k=1}^{N} I_k$ is a finite union of non-overlapping intervals and $N_1, N_2 \in \mathcal{M}$ with $|N_1| < \varepsilon$ and $|N_2| < \varepsilon$.
Theorem 4.16 (Caratheodory). $E$ is measurable if and only if for every $A \in 2^{\mathbb{R}^n}$, we have
\[ |A|_e = |A \cap E|_e + |A \setminus E|_e. \]
HW: pp. 48-49, #12, 15, 20
Let us present a proof of Caratheodory's theorem.
Proof. Suppose $E \in \mathcal{M}$. Then for any $A \subset \mathbb{R}^n$ there exists $H \supset A$ such that $H$ is a $G_\delta$-set and $|A|_e = |H|$. Now, write $H = (H \cap E) \cup (H \cap E^C)$, a disjoint union of measurable sets, so $|A|_e = |H| = |H \cap E| + |H \cap E^C|$. Now, observe $A \cap E \subset H \cap E$ and $A \cap E^C \subset H \cap E^C$, so $|A \cap E|_e + |A \cap E^C|_e \le |A|_e$. The reverse inequality follows from subadditivity of outer measure. Thus,
\[ |A|_e = |A \cap E|_e + |A \cap E^C|_e. \]
Conversely, we want to show that if $|A|_e = |A \cap E|_e + |A \cap E^C|_e$ for any $A \subset \mathbb{R}^n$, then $E$ is measurable. Suppose $|E|_e < \infty$. Then we can find a $G_\delta$-set $H \supset E$ with $|H| = |E|_e$. Write $H = (H \cap E) \cup (H \cap E^C)$. Then $H \cap E = E$ and $H \cap E^C = H \setminus E$, so we can apply the hypothesis with $A = H$ and see that
\[ |E|_e = |H| = |H|_e = |E|_e + |H \setminus E|_e. \]
Since $|E|_e < \infty$, we can subtract it from both sides, so $|H \setminus E|_e = 0$, thus $H \setminus E$ is measurable and $|H \setminus E| = 0$. Define $Z = H \setminus E$, then $E = H \setminus Z$, therefore $E \in \mathcal{M}$. See the text for the case $|E|_e = \infty$.
Q.E.D.
4.0.3 Action of Transformations on M
We saw that outer measure is invariant under translations and rotations.
Definition 4.9. $E(n)$ is the Euclidean motion group of $\mathbb{R}^n$, that is
\[ E(n) = \{T : \mathbb{R}^n \to \mathbb{R}^n : Tx = Ax + b,\ A \in O(n),\ b \in \mathbb{R}^n\}. \]
$T \in E(n)$ implies $T$ is a homeomorphism, thus it maps $G_\delta$-sets to $G_\delta$-sets and similarly for $F_\sigma$-sets. In particular, Borel sets are mapped to Borel sets.
Claim 4.3. $E(n)$ maps measurable sets to measurable sets.
Note 4.3. If $|Z| = 0$, then $|TZ| = 0$ for all $T \in E(n)$; that is, letting
\[ \mathcal{Z} = \{Z \subset \mathbb{R}^n : |Z| = 0\}, \quad E(n) : \mathcal{Z} \to \mathcal{Z}. \]
Proof. If $E \in \mathcal{M}$, by the previous theorem, we can write $E = H \setminus Z$, with $H$ a $G_\delta$-set and $|Z| = 0$, thus $Z \in \mathcal{Z}$. Since $T$ is a bijection, $TE = TH \setminus TZ$, where $TH$ is a $G_\delta$-set and $TZ \in \mathcal{Z}$; thus, $TE$ is measurable.
Q.E.D.
Question: Is there a larger (interesting) class of mappings $T : \mathbb{R}^n \to \mathbb{R}^n$ such that $T : \mathcal{M} \to \mathcal{M}$?
Definition 4.10.
(i) Let $f : \mathbb{R}^n \to \mathbb{C}$. We say that $f$ is Lipschitz (continuous) if there exists $C \in \mathbb{R}$ such that for all $x, y \in \mathbb{R}^n$,
\[ |f(x) - f(y)| \le C|x - y|; \]
(ii) $T : \mathbb{R}^n \to \mathbb{R}^n$ is Lipschitz if there exists $C \in \mathbb{R}$ such that for all $x, y \in \mathbb{R}^n$,
\[ |Tx - Ty| \le C|x - y|. \]
Remark 4.2.
(i) We could define Lipschitz on a general domain $D$, where $f : D \to \mathbb{C}$ and $T : D \to \mathbb{R}^n$;
(ii) We can define Lipschitz for $T : (X_1, d_1) \to (X_2, d_2)$, general metric spaces.
$T : \mathbb{R}^n \to \mathbb{R}^n$ can be written as $Tx = (f_1(x), f_2(x), \ldots, f_n(x))$, where $f_j : \mathbb{R}^n \to \mathbb{R}$, $1 \le j \le n$.
Exercise 4.1. Show that $T$ is Lipschitz if and only if each $f_j$ is Lipschitz for $1 \le j \le n$.
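Example 4.5. Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable with $\sup_{x \in \mathbb{R}^n} |\nabla f(x)| = C < \infty$. Then
\[ \begin{aligned} |f(x) - f(y)| &= \left| \int_0^1 \frac{d}{dt} \big( f((1 - t)y + tx) \big)\,dt \right| = \left| \int_0^1 \nabla f((1 - t)y + tx) \cdot (x - y)\,dt \right| \\ &\le \int_0^1 |\nabla f((1 - t)y + tx) \cdot (x - y)|\,dt \le \int_0^1 |\nabla f((1 - t)y + tx)|\,|x - y|\,dt \le C|x - y|, \end{aligned} \]
hence, $f$ is Lipschitz with Lipschitz constant $C$.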
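Theorem 4.17. If $T : \mathbb{R}^n \to \mathbb{R}^n$ is Lipschitz and $E \in \mathcal{M}$, then $TE \in \mathcal{M}$.
Remark 4.3. Consider the affine group
\[ A(n) = \{T : \mathbb{R}^n \to \mathbb{R}^n : Tx = Ax + b,\ b \in \mathbb{R}^n,\ A \in M_{n \times n}(\mathbb{R}) \text{ with } \det(A) \ne 0\}. \]
Observe $E(n) \subset A(n)$.
Note 4.4. $T \in A(n)$ implies $T \in C^1$ and the Jacobian matrix $\frac{DT}{Dx} = A$ for all $x$, thus $\frac{DT}{Dx}$ is bounded, so each component is Lipschitz, which implies $T$ is Lipschitz.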
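We'll give a sketch of the proof of Theorem 4.17:
(i) If $K \in F_\sigma$, then $TK \in F_\sigma$.
(ii) If $W \in \mathcal{Z}$, then $TW \in \mathcal{Z}$.
(iii) Write $E \in \mathcal{M}$ as $E = K \cup W$, then $TE = TK \cup TW \in \mathcal{M}$.
If $T : \mathbb{R}^n \to \mathbb{R}^n$ is linear, $Tx = Ax$, $A \in M_{n \times n}(\mathbb{R})$, then
\[ |Tx - Ty| = |A(x - y)| \le \|A\|\,|x - y|. \]
Here, $\|A\| = \sup_{|x| \le 1} |Ax| = \sup_{x \ne 0} \frac{|Ax|}{|x|}$, thus $T$ is Lipschitz.
If $I = \prod_{j=1}^{n} [a_j, b_j]$ is an interval in $\mathbb{R}^n$, then $TI = \{Ax : x \in I\}$ is a measurable set with $|TI| = \delta|I|$, where $\delta = |\det(A)|$.
Theorem 4.18. Let $Tx = Ax$ with $A \in M_{n \times n}(\mathbb{R})$. Then, for all $E \in \mathcal{M}$, $|TE| = \delta|E|$.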
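Example 4.6. Consider the Cantor set $C \subset [0, 1]$. Here $C = \lim_{k \to \infty} C_k = \bigcap_{k=1}^{\infty} C_k$, where $C_k$ is the $k$-th stage of the construction, so $|C| = \lim_{k \to \infty} |C_k| = \lim_{k \to \infty} \left( \frac{2}{3} \right)^k = 0$. The Cantor-Lebesgue function is a continuous, monotone function $f : [0, 1] \to [0, 1]$ with $f(0) = 0$, $f(1) = 1$, and $f$ constant on each interval of the complement of $C$. Observe $|C| = 0$, but $f(C) = f([0, 1]) = [0, 1]$, thus $|f(C)| = 1$. Hence, $f$ is not Lipschitz, since a Lipschitz map sends sets of measure zero to sets of measure zero.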
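The computation $|C_k| = (2/3)^k$ can be reproduced by carrying out the middle-thirds construction explicitly. The sketch below, with the hypothetical helpers `cantor_step` and `cantor_stage`, builds the closed intervals of each stage with exact rational endpoints and sums their lengths.

```python
from fractions import Fraction

def cantor_step(intervals):
    """Remove the open middle third from each closed interval (a, b)."""
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))
        out.append((b - third, b))
    return out

def cantor_stage(k):
    """Intervals of the k-th stage C_k, starting from C_0 = [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        intervals = cantor_step(intervals)
    return intervals

# Total length of C_k for k = 0, ..., 5; the intervals of a stage are disjoint.
lengths = [sum(b - a for a, b in cantor_stage(k)) for k in range(6)]
```

Stage $k$ consists of $2^k$ disjoint intervals of length $3^{-k}$ each, so the total length is $(2/3)^k$, which tends to 0.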
Theorem 4.19. If $A \in M_{n \times n}(\mathbb{R})$, $Tx = Ax$, then $|TE| = \delta|E|$ where $\delta = |\det(A)|$ for all $E \in \mathcal{M}$.
Example 4.7. Consider the dilation of intervals. Then for $r \in \mathbb{R}^+$, $D_r x = rx$ if and only if $A = rI_n$, and $\det(A) = r^n$. If $E \in \mathcal{M}$, then $rE = \{rx : x \in E\}$ is measurable and $|rE| = r^n|E|$.
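In the plane, the scaling law $|TI| = \delta|I|$ for an interval $I$ can be checked directly: the image of a rectangle is a parallelogram whose area is given by the shoelace formula. The helpers below and the particular matrix and rectangle are assumptions chosen for this sketch.

```python
def det2(A):
    """Determinant of a 2x2 matrix given as nested lists."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def apply(A, p):
    """Image of the point p = (x, y) under the linear map x -> Ax."""
    return (A[0][0] * p[0] + A[0][1] * p[1],
            A[1][0] * p[0] + A[1][1] * p[1])

def polygon_area(vertices):
    """Area of a simple polygon from its vertices in order (shoelace formula)."""
    n = len(vertices)
    twice = sum(vertices[i][0] * vertices[(i + 1) % n][1]
                - vertices[(i + 1) % n][0] * vertices[i][1]
                for i in range(n))
    return abs(twice) / 2

interval = [(0, 0), (2, 0), (2, 3), (0, 3)]   # I = [0, 2] x [0, 3], area 6
A = [[2, 1], [1, 3]]                          # det(A) = 5
dilation = [[3, 0], [0, 3]]                   # r = 3 in dimension n = 2, det = r**n = 9

image_area = polygon_area([apply(A, p) for p in interval])
dilated_area = polygon_area([apply(dilation, p) for p in interval])
```

With these choices the image areas are $5 \cdot 6$ and $3^2 \cdot 6$ respectively, matching $\delta|I|$ and the dilation formula $|rE| = r^n|E|$.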
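Proof. If $\delta = 0$, then $\dim(\ker(A)) > 0$, thus $\dim(\mathrm{coker}(A)) > 0$, where $\mathrm{coker}(A) = \mathbb{R}^n / \mathrm{Ran}(T)$. Thus, $\mathrm{Ran}(T) \subsetneq \mathbb{R}^n$, which implies $\mathrm{Ran}(T)$ is a proper subspace, i.e., $\mathrm{Ran}(T)$ is contained in some hyperplane, which has measure zero. By monotonicity, $|TE| \le |T\mathbb{R}^n| = 0$, thus $|TE| = 0 = \delta|E|$.
If $\delta \ne 0$, then $A$ is invertible, and in particular, $T : \mathbb{R}^n \to \mathbb{R}^n$ is injective. From the text, we will assume that if $I$ is an interval, then $TI$ is a parallelepiped and $|TI| = \delta|I|$.
Now, let $E \subset \mathbb{R}^n$ be any subset of $\mathbb{R}^n$. For $\varepsilon > 0$, we can find a cover $\{I_k\}_{k=1}^{\infty}$ of $E$ with $\sum_{k=1}^{\infty} |I_k| < |E|_e + \varepsilon$. Since $E \subset \bigcup_{k=1}^{\infty} I_k$, we have
\[ TE \subset T\Big( \bigcup_{k=1}^{\infty} I_k \Big) = \bigcup_{k=1}^{\infty} T(I_k). \]
Now we have
\[ |TE|_e \le \sum_{k=1}^{\infty} |T(I_k)|_e = \delta \sum_{k=1}^{\infty} |I_k| < \delta\big( |E|_e + \varepsilon \big). \]
Since $\varepsilon > 0$ was arbitrary, we have
\[ |TE|_e \le \delta|E|_e. \]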
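Now it suffices to prove $|TE| \ge \delta|E|$ for all $E \in \mathcal{M}$. For all $\varepsilon > 0$, there exists an open set $G$ such that $E \subset G$ and $|G \setminus E| < \varepsilon$. By a result from Chapter 1, $G$ can be written as $G = \bigcup_{k=1}^{\infty} I_k$, where $\{I_k\}_{k=1}^{\infty}$ is a family of pairwise nonoverlapping intervals, i.e., $I_j \cap I_k$ is contained in a boundary face for all $j \ne k$. Hence,
\[ TG = T\Big( \bigcup_{k=1}^{\infty} I_k \Big) = \bigcup_{k=1}^{\infty} T(I_k), \]
where each $T(I_k)$ is a parallelepiped. Observe that $T(I_j) \cap T(I_k)$ is contained in some hyperplane, so the interiors $T(I_k)^\circ$ are pairwise disjoint and
\[ |TG| = \Big| \bigcup_{k=1}^{\infty} T(I_k) \Big| \ge \Big| \bigcup_{k=1}^{\infty} T(I_k)^\circ \Big| = \sum_{k=1}^{\infty} |T(I_k)| = \delta \sum_{k=1}^{\infty} |I_k| \ge \delta|G|. \]
Since $E \subset G$, we have $|G| \ge |E|$. But
\[ |TG| = |TE| + |TG \setminus TE| \overset{(*)}{=} |TE| + |T(G \setminus E)| \le |TE| + C\varepsilon, \]
where $(*)$ follows from the injectivity of $T$. Notice that the inequality comes from the proof of the previous theorem, which tells us $|TF| \le C|F|$, i.e., small sets are mapped to small sets. Combining these, $\delta|E| \le \delta|G| \le |TG| \le |TE| + C\varepsilon$, and letting $\varepsilon$ approach zero gives $|TE| \ge \delta|E|$.
Q.E.D.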
4.0.4 Nonmeasurable Sets
We know $\mathcal{B} \subset \mathcal{M}$; however, most subsets of $\mathbb{R}^n$ are not measurable.
Theorem 4.20 (Vitali). There exists $E \subset \mathbb{R}^n$ such that $E \notin \mathcal{M}$.
Before we can prove this theorem, we must provide a bit of background.
Let $\{E_\alpha\}_{\alpha \in A}$ be an indexed collection of nonempty sets with $A$ an arbitrary indexing set. Then the Axiom of Choice implies that there exists a section of $\{E_\alpha\}_{\alpha \in A}$, i.e., there exists
\[ s : A \to \bigcup_{\alpha \in A} E_\alpha \]
such that for every $\alpha \in A$, $s(\alpha) \in E_\alpha$. Thus, $s$ picks for each $\alpha \in A$ one and only one element of $E_\alpha$.
A relation $\sim$ on a set $X$ that is
(i) Reflexive: $x \sim x$ for all $x \in X$;
(ii) Symmetric: $x \sim y$ if and only if $y \sim x$ for all $x, y \in X$;
(iii) Transitive: $x \sim y$ and $y \sim z$ imply $x \sim z$ for all $x, y, z \in X$
is called an equivalence relation on $X$. The equivalence class of $a \in X$ is $[a] = \{b \in X : a \sim b\}$. Then for all $a, b \in X$, either $[a] = [b]$ or $[a] \cap [b] = \emptyset$. Thus,
\[ X = \bigcup_{[a] \in X/\sim} [a], \]
where $X/\sim$ is the set of equivalence classes.
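Example 4.8. Let $X = \mathbb{R}$ and define $x \sim y$ if and only if $x - y \in \mathbb{Q}$. Then
\[ [0] = \mathbb{Q}, \qquad [a] = \{a + \tfrac{p}{q} : \tfrac{p}{q} \in \mathbb{Q}\} \text{ for all } a \in \mathbb{R}. \]
As defined above, it is clear that for each $a \in \mathbb{R}$, $[a]$ has the same cardinality as $\mathbb{Q}$, i.e., is countably infinite. Since
\[ \mathbb{R} = \bigcup_{[a] \in \mathbb{R}/\sim} [a], \]
we know that $\mathbb{R}/\sim$ is uncountable.
Lemma 4.3. Let $E \subset \mathbb{R}$, $E \in \mathcal{M}$, and $|E| > 0$. Then the difference set of $E$,
\[ \Delta_E = \{x - y : x, y \in E\}, \]
contains an open interval about 0.
Proof. Next time (see Lemma 4.4).
Q.E.D.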
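Now we shall present the proof of Vitali's Theorem.
Proof. Write $\mathbb{R} = \bigcup_{[a] \in \mathbb{R}/\sim} [a]$ as before. By the axiom of choice, for each $[a] \in \mathbb{R}/\sim$ we can choose an element of $[a]$, say $x_{[a]} \in [a]$. Let $E = \{x_{[a]} : [a] \in \mathbb{R}/\sim\}$. Note that for all $x, y \in E$ with $x \ne y$, $x - y \notin \mathbb{Q}$ since $[x] \ne [y]$. Thus the difference set $\Delta_E = \{x - y : x, y \in E\}$ satisfies $\Delta_E \cap \mathbb{Q} = \{0\}$, so $\Delta_E$ contains no interval.
Hence, by Lemma 4.3, either $E \notin \mathcal{M}$ or $|E| = 0$. But if $|E| = 0$,
\[ \mathbb{R} = \bigcup_{[a] \in \mathbb{R}/\sim} [a] = \{x + \tfrac{p}{q} : x \in E, \tfrac{p}{q} \in \mathbb{Q}\} = \bigcup_{p/q \in \mathbb{Q}} \{x + \tfrac{p}{q} : x \in E\}. \]
Each set in the last union is just a translate of $E$ by $\tfrac{p}{q}$, thus of measure zero. Since this is a countable union, it is also of measure zero, but then $\mathbb{R}$ would have measure zero, a contradiction.
Therefore, $E \notin \mathcal{M}$.
Q.E.D.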
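Lemma 4.4. For every measurable subset $E \subset \mathbb{R}$ with $|E| > 0$, the set of differences, i.e.,
\[ \Delta_E = \{x - y : x, y \in E\}, \]
contains a neighborhood of 0.
Remark 4.4. This also holds in $\mathbb{R}^n$, with $\Delta_E$ containing a ball centered at 0.
Proof. We may assume $|E| < \infty$; otherwise replace $E$ by a bounded subset of positive measure, whose difference set is contained in $\Delta_E$. Let $\varepsilon > 0$. We can find an open set $G$ such that $E \subset G$ and $|G| < (1 + \varepsilon) |E|$. Recall that since $G$ is open, it can be written as $G = \bigcup_{k=1}^{\infty} I_k$, with each $I_k$ being a closed interval and $I_k \cap I_j$ at most a singleton for $k \ne j$.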
Let $E_k = E \cap I_k$. Then $E = \bigcup_{k=1}^{\infty} E_k$ with the set of common points
\[ \{x : x \in E_k \cap E_j \text{ for some } k \ne j\}. \]
Notice that the above set is at most countable, hence has measure zero. Then, by the (almost) disjointness, we have
\[ |G| = \sum_{k=1}^{\infty} |I_k| < (1 + \varepsilon) \sum_{k=1}^{\infty} |E_k|, \qquad |E| = \sum_{k=1}^{\infty} |E_k|. \]
This implies there exists a $k_0$ such that $|I_{k_0}| < (1 + \varepsilon) |E_{k_0}|$. Now, we can pick any positive number for $\varepsilon$, but for simplicity, we shall choose $\varepsilon = \tfrac{1}{3}$. Then $|I_{k_0}| < \tfrac{4}{3} |E_{k_0}|$. Hence,
\[ E_{k_0} \subset I_{k_0} \quad \text{and} \quad |E_{k_0}| > \tfrac{3}{4} |I_{k_0}|. \]
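If $A \subset \mathbb{R}$ and $d \in \mathbb{R}$, let $A_d$ be the translate of $A$ by $d$, i.e.,
\[ A_d = \{x + d : x \in A\}. \]
Claim 4.4. For every $d \in \mathbb{R}$ with $|d| < \tfrac{1}{2} |I_{k_0}|$, we have $E_{k_0} \cap E_{k_0,d} \ne \emptyset$.
Note that the claim implies the lemma, for if $x \in E_{k_0} \cap E_{k_0,d}$, then there exists $y \in E_{k_0}$ such that $x = y + d$, thus $x - y = d$. Hence the difference set of $E$ contains the difference set of $E_{k_0}$, which in turn satisfies
\[ \{x - y : x, y \in E_{k_0}\} \supset \{d \in \mathbb{R} : |d| < \tfrac{1}{2} |I_{k_0}|\}. \]
The proof of the claim follows.
If not, there exists $d$ with $|d| < \tfrac{1}{2} |I_{k_0}|$ such that $E_{k_0} \cap E_{k_0,d} = \emptyset$. Considering $E_{k_0} \cup E_{k_0,d}$ as a disjoint union, and using the translation invariance of Lebesgue measure, we have
\[ |E_{k_0} \cup E_{k_0,d}| = |E_{k_0}| + |E_{k_0,d}| = 2 |E_{k_0}| > \tfrac{3}{2} |I_{k_0}|. \]
But $E_{k_0} \cup E_{k_0,d}$ is contained in $I_{k_0} \cup (I_{k_0} + d)$, an interval of length $|I_{k_0}| + |d| < \tfrac{3}{2} |I_{k_0}|$, thus we have a contradiction.
Q.E.D.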
5 Lebesgue Measurable Functions
HW: Chapter 4; exercises 1, 2, 3, and 5.
Definition 5.1. The super-level sets of an extended real-valued function $f$ are the sets
\[ \{x \in E : f(x) > \alpha\}, \quad \alpha \in [-\infty, \infty). \]
The sub-level sets of $f$ are the sets
\[ \{x \in E : f(x) < \alpha\}, \quad \alpha \in (-\infty, \infty]. \]
For the remaining discussion, we will assume the level set
\[ \{x \in E : f(x) = -\infty\} \]
is always measurable.
Definition 5.2. Let $f : E \to \overline{\mathbb{R}}$, where $\overline{\mathbb{R}}$ is the extended real numbers. We say $f$ is measurable if for every $\alpha \in [-\infty, \infty)$, the set $\{x \in E : f(x) > \alpha\}$ is measurable.
Remark 5.1. Observe that $E = \{x \in E : f(x) > -\infty\} \cup \{x \in E : f(x) = -\infty\}$, hence $E$ is measurable as it is the union of two measurable sets.
Definition 5.3. We say that $f$ is Borel measurable if $\{f(x) = -\infty\} \in \mathcal{B}$ and for every $\alpha$, $\{x \in E : f(x) > \alpha\} \in \mathcal{B}$.
Example 5.1. A continuous function is Borel measurable.
Remark 5.2. In the general setting, measurability is defined relative to a $\sigma$-algebra $\Sigma$ on a set $X$: a function $f : X \to \overline{\mathbb{R}}$ is $\Sigma$-measurable if for every $\alpha$, $\{x \in X : f(x) > \alpha\} \in \Sigma$.
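Theorem 5.1. A function $f : E \to \overline{\mathbb{R}}$, where $E \subset \mathbb{R}^n$, is $\mathcal{M}$-measurable if and only if any one of the following conditions holds for every $a \in \mathbb{R}$:
(i) $\{f \ge a\} \in \mathcal{M}$;
(ii) $\{f < a\} \in \mathcal{M}$;
(iii) $\{f \le a\} \in \mathcal{M}$.
Proof. Suppose $f$ is measurable. Then
\[ \{f \ge a\} = \bigcap_{k=1}^{\infty} \left\{ f > a - \tfrac{1}{k} \right\}, \]
which is measurable since $\mathcal{M}$ is a $\sigma$-algebra.
The first condition implies the second condition since $\sigma$-algebras are closed under complements. Finally, the second condition implies the third since
\[ \{f \le a\} = \bigcap_{k=1}^{\infty} \left\{ f < a + \tfrac{1}{k} \right\}, \]
which is measurable as the countable intersection of measurable sets.
The converse is clear, as $\{f > a\}$ is the complement of the set in the third condition.
Q.E.D.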
Definition 5.4. We say that a condition is true or holds almost everywhere if it is true except on a set of measure zero. The general notation for this is a.e.
Remark 5.3. Since Lebesgue was a French mathematician, it is not uncommon to see p.p. instead of a.e., where p.p. stands for presque partout.
Example 5.2. Oftentimes we'll compare functions and say that $f(x) = g(x)$ a.e., which means $f$ and $g$ are indistinguishable from the point of view of Lebesgue measure theory. For example, take the Heaviside function on $\mathbb{R}$,
\[ H(x) = \begin{cases} 0 & : x < 0 \\ 1 & : x > 0. \end{cases} \]
Then for any two values we pick for $H(0)$, say $\alpha$ and $\beta$, the resulting functions satisfy $H_\alpha(x) = H_\beta(x)$ a.e.
Theorem 5.2. If $f$ is measurable and $f(x) = g(x)$ a.e., then $g$ is measurable.
Q.E.D.
Definition 5.5. If $\{f_n\}_{n=1}^{\infty}$ is a sequence of functions and $f$ is a function, then we say that $f_n \to f$ pointwise a.e. if $\lim_{n\to\infty} f_n(x) = f(x)$ a.e.
Definition 5.6. If $f : E \to \overline{\mathbb{R}}$, define
\[ \omega_f(a) = |\{x \in E : f(x) > a\}|. \]
This is the distribution function of $f$. Note $\omega_f : \mathbb{R} \to [0, \infty]$.
Remark 5.4. If $f = g$ a.e., then $\omega_f = \omega_g$.
One of the homework problems is to show that if $f$ and $\varphi$ are measurable, then it does not follow that $\varphi \circ f$ is measurable; however, we do have the following.
Theorem 5.3. If $f : E \to \mathbb{R}$ is measurable and $\varphi : \mathbb{R} \to \mathbb{R}$ is continuous, then $\varphi \circ f$ is measurable.
Proof. Let $h = \varphi \circ f$. To show that $h$ is measurable, it suffices to show $h^{-1}(G)$ is measurable for every open set $G \subset \mathbb{R}$. But
\[ h^{-1}(G) = (\varphi \circ f)^{-1}(G) = f^{-1} \circ \varphi^{-1}(G) = f^{-1}\big(\varphi^{-1}(G)\big), \]
and observe that $\varphi^{-1}(G)$ is open as it is the preimage of an open set under a continuous map. Recall that the inverse image of an open set under a measurable function is measurable, thus $h$ is measurable.
Q.E.D.
Example 5.3. The functions
\[ \varphi(t) = \begin{cases} |t| \\ |t|^p, \quad 0 < p < \infty \\ \ln(2 + |t|) \end{cases} \]
are all continuous; thus for every measurable function $f$, $\varphi \circ f$ is measurable.
Theorem 5.4. If $f$ and $g$ are measurable functions on $E$, then
\[ \{x \in E : f(x) > g(x)\} \]
is measurable, as is $\{x \in E : f(x) \ge g(x)\}$.
Proof. Let $\{r_k\}_{k=1}^{\infty}$ be an enumeration of $\mathbb{Q}$. Then
\[ \{f(x) > g(x)\} = \bigcup_{k=1}^{\infty} \{f(x) > r_k > g(x)\} = \bigcup_{k=1}^{\infty} \big( \{f > r_k\} \cap \{r_k > g\} \big), \]
which is measurable since $\mathcal{M}$ is a $\sigma$-algebra. For the second part, notice that
\[ \{f(x) \ge g(x)\}^C = \{f(x) < g(x)\}. \]
The set on the right is of the first form if we interchange $f$ and $g$, hence is measurable.
Q.E.D.
Example 5.4. Combining this with the previous result, we have
\[ \{x : [f(x)]^2 + 3f(x) > [g(x)]^3\} \]
is measurable. We can extend this: $\{\varphi(f_1(x), \ldots, f_k(x)) > 0\}$ is measurable if $\varphi$ is continuous and the $f_j$'s are measurable.
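Theorem 5.5. If $f : E \to \overline{\mathbb{R}}$ and $g : E \to \overline{\mathbb{R}}$ are measurable and $c_1, c_2 \in \mathbb{R}$, then $c_1 f + c_2 g$ is measurable (if defined a.e.). Another way to say "defined a.e." is that the set where the sum has the form $\infty - \infty$ is null, e.g., $|\{x \in E : f(x) = \infty\} \cap \{x \in E : g(x) = -\infty\}| = 0$, and likewise with the roles of $\pm\infty$ exchanged. This extends to a finite linear combination: $c_1 f_1 + \cdots + c_k f_k$ is measurable if the $f_k$'s are measurable.
Theorem 5.6. If $f, g$ are measurable, then so is $fg$ (if defined a.e.). In addition, if $g(x) \ne 0$ a.e., then $\frac{f(x)}{g(x)}$ is measurable.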
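Definition 5.7. Let $\{f_k\}_{k=1}^{\infty}$ be a sequence of functions. Recall the definitions
\[ \limsup_{k\to\infty} f_k(x) = \lim_{j\to\infty} \Big( \sup_{j \le k < \infty} f_k(x) \Big), \qquad \liminf_{k\to\infty} f_k(x) = \lim_{j\to\infty} \Big( \inf_{j \le k < \infty} f_k(x) \Big). \]
Then $\lim_{k\to\infty} f_k(x)$ exists if and only if $\limsup_{k\to\infty} f_k(x) = \liminf_{k\to\infty} f_k(x)$.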
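To see the tail suprema and infima at work, here is a small sketch for a single numerical sequence (think of a fixed point $x$). The sequence $(-1)^k (1 + 1/k)$ and the truncation level `N` are illustrative choices, not taken from the notes; the true tails are infinite, and the truncation only approximates them.

```python
def f(k):
    """The k-th term of the sample sequence (-1)^k (1 + 1/k)."""
    return (-1) ** k * (1 + 1 / k)

N = 2000  # truncation level standing in for the infinite tail

def tail_sup(j):
    """sup of f(k) over j <= k <= N; nonincreasing in j."""
    return max(f(k) for k in range(j, N + 1))

def tail_inf(j):
    """inf of f(k) over j <= k <= N; nondecreasing in j."""
    return min(f(k) for k in range(j, N + 1))

for j in (1, 10, 100, 1000):
    print(j, tail_sup(j), tail_inf(j))
```

The tail suprema decrease toward 1 and the tail infima increase toward -1, so the limit superior and limit inferior differ and the sequence has no limit.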
Theorem 5.7. If $\{f_k\}_{k=1}^{\infty}$ is a sequence of measurable functions on $E$, then $\limsup_{k\to\infty} f_k$ and $\liminf_{k\to\infty} f_k$ are measurable. Hence, if $\lim_{k\to\infty} f_k(x)$ exists a.e. on $E$, then it is measurable. Likewise, if $f_1, \ldots, f_N$ are measurable, then so are $\max\{f_1(x), \ldots, f_N(x)\}$ and $\min\{f_1(x), \ldots, f_N(x)\}$.
Proof. This is clear; just observe
\[ \{x : \sup_{k \in \mathbb{N}} f_k(x) > a\} = \bigcup_{k=1}^{\infty} \{f_k > a\}, \]
which is measurable. For the infimum, notice that $\inf_{1 \le k < \infty} f_k(x) = -\sup_{1 \le k < \infty} (-f_k(x))$.
Additionally, $\limsup_{k\to\infty} f_k$ and $\liminf_{k\to\infty} f_k$ are measurable. Use the fact that $\sup_{j \le k < \infty} f_k(x)$ decreases to $\limsup_{k\to\infty} f_k(x)$ as $j \to \infty$, so the limit superior is an infimum of measurable functions.
Q.E.D.
5.0.5 Simple Functions
Definition 5.8. Let $X$ be a set, $E \subset X$. The characteristic function or indicator function of $E$ is $\chi_E : X \to \mathbb{R}$ defined by
\[ \chi_E(x) = \begin{cases} 1 & : x \in E \\ 0 & : x \notin E. \end{cases} \]
For pairwise disjoint sets $\{E_j\}_{j=1}^{m}$ and pairwise distinct coefficients $\{a_j\}_{j=1}^{m}$, we let
\[ f(x) = \sum_{j=1}^{m} a_j \chi_{E_j}(x). \]
Such an $f$ is called a simple function.
Proposition 5.1. If $f(x)$ as defined above equals $g(x) = \sum_{j} b_j \chi_{F_j}(x)$, written in the same form, then the $E_j$'s and the $F_j$'s are equal and the $a_j$'s and $b_j$'s are equal after a permutation.
Proposition 5.2. A simple function $f : \mathbb{R}^n \to \mathbb{R}$ is measurable if and only if each $E_j$ is measurable.
Theorem 5.8.
(i) Every function $f : X \to \overline{\mathbb{R}}$ can be written as a pointwise limit of a sequence of simple functions, $f(x) = \lim_{k\to\infty} f_k(x)$ for all $x \in X$.
(ii) If $f$ is bounded below, then $f_k(x)$ can be assumed to be an increasing sequence.
(iii) If $X = \mathbb{R}^n$ and $f$ is measurable, then each $f_k$ can be taken to be measurable.
Proof. Refer to the text, but note that if $f(x)$ is bounded, then the $f_k$ can be chosen to converge uniformly to $f$.
Q.E.D.
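One common way to build the increasing sequence for a nonnegative $f$ is the dyadic truncation $f_k = \min(\lfloor 2^k f \rfloor / 2^k,\ k)$. The sketch below assumes this construction (the notes defer to the text for the proof, so this may differ in detail), and the helper name `simple_approx` is hypothetical.

```python
import math

def simple_approx(f, k):
    """Return f_k = min(floor(2^k f) / 2^k, k) for a nonnegative function f.

    f_k takes finitely many values, satisfies f_k <= f, and increases with k.
    """
    def f_k(x):
        value = f(x)
        if value >= k:
            return float(k)              # cap large values at k
        return math.floor(value * 2**k) / 2**k  # round down to the dyadic grid
    return f_k

def square(x):
    return x * x

# Sample points in [0, 3]; the error is at most 2^-k wherever f < k.
points = [i / 7 for i in range(22)]
for k in (1, 4, 12):
    f_k = simple_approx(square, k)
    print(k, max(square(x) - f_k(x) for x in points))
```

When $f$ is bounded, the cap at $k$ is eventually inactive and the error is at most $2^{-k}$ everywhere, which is the uniform convergence mentioned in the proof.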
5.0.6 Semicontinuous Functions
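Definition 5.9. Let $E \subset \mathbb{R}^n$, $f : E \to \overline{\mathbb{R}}$, and $x_0 \in E$ a limit point of $E$. Then we say
(i) $f$ is upper semicontinuous at $x_0$ if
\[ \limsup_{x \to x_0,\, x \in E} f(x) = \lim_{\delta \to 0^+} \Big( \sup_{|x - x_0| < \delta,\, x \in E} f(x) \Big) \le f(x_0); \]
(ii) $f$ is lower semicontinuous at $x_0$ if
\[ \liminf_{x \to x_0,\, x \in E} f(x) = \lim_{\delta \to 0^+} \Big( \inf_{|x - x_0| < \delta,\, x \in E} f(x) \Big) \ge f(x_0). \]
We can see that $f : E \to \mathbb{R}$ is continuous at $x_0$ if and only if $f$ is usc at $x_0$ and lsc at $x_0$.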
Note 5.1.
(i) If $f(x_0) = +\infty$ (respectively, $f(x_0) = -\infty$), then $f$ is usc (respectively, lsc) at $x_0$.
(ii) If $f(x_0) \in \mathbb{R}$, then $f$ being usc at $x_0$ implies that for all $M > f(x_0)$, there exists a $\delta > 0$ such that $x \in E$ and $|x - x_0| < \delta$ implies $f(x) < M$.
(iii) $f$ is usc at $x_0$ if and only if $-f$ is lsc at $x_0$.
Example 5.5. Let us look at three versions of the Heaviside function.
\[ H_1(x) = \begin{cases} 1 & : x > 0 \\ 0 & : x = 0 \\ -1 & : x < 0 \end{cases} \]
is neither usc nor lsc at $x = 0$. However,
\[ H_2(x) = \begin{cases} 1 & : x \ge 0 \\ -1 & : x < 0 \end{cases} \]
is usc at 0, but is not lsc at 0. Finally,
\[ H_3(x) = \begin{cases} 1 & : x > 0 \\ -1 & : x \le 0 \end{cases} \]
is lsc at 0, but it is not usc at $x = 0$.
HW: p. 31, #4, 16; p. 48, #6, 21; pp. 61-62, #4, 8, 12
Example 5.6. Let $E = [-1, 1]$ and $x_0 = 0$. Then
\[ f(x) = \begin{cases} 0 & : x \in E \setminus \{0\} \\ 1 & : x = 0 \end{cases} \]
is usc at $x_0$, but not lsc at $x = 0$. On the other hand, if we replace 1 by $-1$, the function becomes lsc and not usc. Finally, consider the function defined by
\[ f(x) = \frac{1}{x^2}, \]
with the convention $f(0) = +\infty$. This function is both usc and lsc at $x = 0$, but note that it is not continuous as a real-valued function, since its value at 0 is infinite.
Definition 5.10. A function $f$ is usc (lsc) relative to $E$ if it is usc (lsc) at every $x_0 \in E$ which is a limit point of $E$.
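Theorem 5.9.
(i) $f$ is usc relative to $E$ if and only if for every $a \in \mathbb{R}$, $\{x \in E : f(x) \ge a\}$ is relatively closed in $E$. Equivalently, $\{x \in E : f(x) < a\}$ is relatively open in $E$.
(ii) $f$ is lsc relative to $E$ if and only if for all $a \in \mathbb{R}$, $\{f \le a\}$ is relatively closed and $\{f > a\}$ is relatively open.
Proof. Refer to the text for the equivalences. We record a consequence: if $E$ is measurable and $f$ is usc relative to $E$, then $\{x \in E : f(x) \ge a\} = E \cap F$, where $F$ is closed in $\mathbb{R}^n$, which is measurable, thus $f$ is measurable. Also, if $E$ is open and $f : E \to \overline{\mathbb{R}}$ is usc relative to $E$, then for every $a \in \mathbb{R}$,
\[ \{x \in E : f(x) > a\} = \bigcup_{k=1}^{\infty} \left\{ x \in E : f(x) \ge a + \tfrac{1}{k} \right\}, \]
which is an $F_\sigma$-set, i.e., a Borel set, thus $f$ is Borel-measurable.
Q.E.D.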
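Theorem 5.10 (Egorov's Theorem). Suppose $\{f_k\}_{k=1}^{\infty}$ is a sequence of measurable functions on a measurable set $E$ with $|E| < \infty$ such that
\[ \lim_{k\to\infty} f_k(x) = f(x) \]
for almost every $x \in E$. Then, for every $\varepsilon > 0$, there is a closed set $F \subset E$ such that $|E \setminus F| < \varepsilon$ and $f_k \to f$ uniformly on $F$. That is, for every $\eta > 0$ there exists a $K_\eta$ such that $|f_k(x) - f(x)| < \eta$ for all $x \in F$ and $k > K_\eta$.
Remark 5.5. We need to assume $|E| < \infty$ (see the example below). We also need to assume $f$ is real-valued to even state the theorem.
Example 5.7. For $k \in \mathbb{N}$, let $f_k(x) = \chi_{\{|x| \le k\}}(x)$. Then $f_k(x) \to \chi_{\mathbb{R}^n}(x) = 1$ for every $x$. For any $F \subset \mathbb{R}^n$ with $|\mathbb{R}^n \setminus F| < \infty$, $F$ must be unbounded, so $\sup_{x \in F} |f_k(x) - 1| = 1$ for every $k$ and the convergence is not uniform on $F$.
Before proving Egorov's theorem, let us first prove a lemma with the quantifiers reordered.
Lemma 5.1. With the same assumptions as in the theorem, for every $\varepsilon, \eta > 0$, there exist a closed set $F_{\varepsilon,\eta} \subset E$ with $|E \setminus F_{\varepsilon,\eta}| < \eta$ and $K_{\varepsilon,\eta} \in \mathbb{N}$ such that $|f(x) - f_k(x)| < \varepsilon$ for all $x \in F_{\varepsilon,\eta}$ and $k > K_{\varepsilon,\eta}$.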
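Proof. Fix $\varepsilon > 0$ and $\eta > 0$. For $m \in \mathbb{N}$, define
\[ E_m = \{x \in E : |f(x) - f_k(x)| < \varepsilon \text{ for all } k > m\} = \bigcap_{k=m+1}^{\infty} \{x : |f(x) - f_k(x)| < \varepsilon\} = \bigcap_{k=m+1}^{\infty} \{x : f(x) - f_k(x) < \varepsilon\} \cap \{x : f(x) - f_k(x) > -\varepsilon\}, \]
so each $E_m$ is measurable. Observe that this gives us a nested sequence $E_1 \subset E_2 \subset \cdots \subset E_m \subset E_{m+1} \subset \cdots$. Since $f_k(x) \to f(x)$ for a.e. $x \in E$, $\bigcup_{m=1}^{\infty} E_m = E \setminus Z$, where $|Z| = 0$. Since $|E| < \infty$, we have $|E \setminus E_m| \to 0$ as $m \to \infty$. Pick $m_0 \in \mathbb{N}$ such that $|E \setminus E_{m_0}| < \tfrac{1}{2}\eta$. Since $E_{m_0} \in \mathcal{M}$, there exists a closed set $F \subset E_{m_0}$ such that $|E_{m_0} \setminus F| < \tfrac{1}{2}\eta$. Then
\[ |E \setminus F| \le |E \setminus E_{m_0}| + |E_{m_0} \setminus F| < \eta. \]
Then $x \in F$ implies $x \in E_{m_0}$ and $|f(x) - f_k(x)| < \varepsilon$ for $k > m_0$. Take $F = F_{\varepsilon,\eta}$ and $m_0 = K_{\varepsilon,\eta}$.
Q.E.D.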
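Now we prove Egorov's theorem.
Proof. Given $\varepsilon > 0$, apply the lemma for each $m \in \mathbb{N}$ (with $\tfrac{1}{m}$ as the error bound and $\varepsilon 2^{-m}$ as the measure bound) to get a closed set $F_m \subset E$ and $K_{\varepsilon,m}$ such that
\[ |E \setminus F_m| < \varepsilon 2^{-m}, \qquad |f - f_k| < \tfrac{1}{m} \text{ on } F_m \text{ for } k > K_{\varepsilon,m}. \]
Let $F = \bigcap_{m=1}^{\infty} F_m$, which is closed and contained in $E$. Observe
\[ E \setminus F = E \setminus \bigcap_{m=1}^{\infty} F_m = \bigcup_{m=1}^{\infty} (E \setminus F_m). \]
So we have
\[ |E \setminus F| \le \sum_{m=1}^{\infty} |E \setminus F_m| < \sum_{m=1}^{\infty} \varepsilon 2^{-m} = \varepsilon. \]
For every $\eta > 0$, there is an $m_\eta > 0$ such that $\tfrac{1}{m_\eta} < \eta$, and $x \in F$ implies $x \in F_{m_\eta}$, which implies $|f(x) - f_k(x)| < \tfrac{1}{m_\eta} < \eta$ for all $k > K_{\varepsilon,m_\eta}$. Thus, $f_k \to f$ uniformly on $F$.
Q.E.D.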
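A concrete instance may help: on $E = [0,1]$ the functions $f_k(x) = x^k$ converge to 0 almost everywhere (everywhere except $x = 1$), not uniformly on $[0,1)$, but uniformly on the closed set $F = [0, 1-\varepsilon]$. This example and the helper names below are illustrative assumptions, not part of the notes.

```python
def sup_error_on_F(k, eps):
    """sup of |x^k - 0| over F = [0, 1 - eps]; attained at the right endpoint."""
    return (1 - eps) ** k

def first_index_below(eta, eps):
    """Smallest k with sup over F of |f_k - f| < eta (plays the role of K_eta)."""
    k = 1
    while sup_error_on_F(k, eps) >= eta:
        k += 1
    return k

eps = 0.1  # |E \ F| = eps
for eta in (0.1, 0.01, 0.001):
    K = first_index_below(eta, eps)
    print(eta, K, sup_error_on_F(K, eps))
```

Since the sup over $F$ decreases in $k$, every index beyond the returned one also satisfies the bound, which is the uniform convergence asserted by the theorem.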
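Definition 5.11. Let $E \in \mathcal{M}$ and $f : E \to \mathbb{R}$ be measurable. We say that $f$ has property C on $E$ if for every $\varepsilon > 0$ there exists a closed set $F \subset E$ with $|E \setminus F| < \varepsilon$ such that $f$ is continuous relative to $F$.
Lemma 5.2. Any measurable simple function has property C.
Proof. Let $f = \sum_{j=1}^{N} a_j \chi_{E_j}$ with $a_j \ne a_k$ for $j \ne k$ and $E_j \cap E_k = \emptyset$ for $j \ne k$. Let $E = \bigcup_{j=1}^{N} E_j$. Let $\varepsilon > 0$. Then $E_j \in \mathcal{M}$ implies there exists $F_j \subset E_j$, closed, with $|E_j \setminus F_j| < \tfrac{\varepsilon}{N}$. Define $F = \bigcup_{j=1}^{N} F_j \subset E$. Since the union is finite, $F$ is closed, thus
\[ |E \setminus F| = \Big| \bigcup_{j=1}^{N} E_j \setminus \bigcup_{j=1}^{N} F_j \Big| = \Big| \bigcup_{j=1}^{N} (E_j \setminus F_j) \Big| = \sum_{j=1}^{N} |E_j \setminus F_j| < \sum_{j=1}^{N} \frac{\varepsilon}{N} = \varepsilon. \]
Claim 5.1. $f$ is continuous relative to $F$.
Note that for all $1 \le j \le N$, $\bigcup_{k=1, k \ne j}^{N} F_k$ is closed and thus closed in $F$ with respect to the relative topology. Thus, $F_j = F \setminus \bigcup_{k=1, k \ne j}^{N} F_k$ is relatively open, so it is a neighborhood of each of its points. $f$ is constant on $F_j$, thus constant on a neighborhood of every point $x \in F_j$, thus continuous.
Q.E.D.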
Theorem 5.11 (Lusin's Theorem). Let $E \in \mathcal{M}$ and $f : E \to \mathbb{R}$. Then $f$ is measurable if and only if $f$ has property C with respect to $E$.
Proof. Suppose $f$ has property C. We want to show $f : E \to \mathbb{R}$ is measurable. By property C, for every $k \in \mathbb{N}$, there exists $F_k \subset E$, closed, with $|E \setminus F_k| < \tfrac{1}{k}$, such that $f$ is continuous relative to $F_k$. Let $H = \bigcup_{k=1}^{\infty} F_k \subset E$, an $F_\sigma$-set. Let $Z = E \setminus H$. Then $Z \subset E \setminus F_k$ for all $k \in \mathbb{N}$.
Now, for all $k \in \mathbb{N}$, we have $|Z|_e \le |E \setminus F_k| < \tfrac{1}{k}$, so $|Z| = 0$. So $E = H \cup Z$. This implies
\[ \{x \in E : f(x) > a\} = \bigcup_{k=1}^{\infty} \{x \in F_k : f(x) > a\} \cup \{x \in Z : f(x) > a\}. \]
Note that since $\{x \in Z : f(x) > a\} \subset Z$, it is measurable and has measure zero. For the other portion of the union, $f$ is continuous relative to $F_k$, thus each of these sets is open in the relative topology, i.e., there is an open set $G_k$ such that $\{x \in F_k : f(x) > a\} = F_k \cap G_k$ for all $k$, thus this is clearly measurable. Since $\mathcal{M}$ is a $\sigma$-algebra, we are done.
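Conversely, suppose that $f$ is measurable. From an earlier result, there exists a sequence of simple measurable functions such that $f_k(x) \to f(x)$ for all $x$. By the lemma, each $f_k$ has property C. Thus, given $\varepsilon > 0$, there exists $F_k \subset E$, closed, with $|E \setminus F_k| < \tfrac{\varepsilon}{2^{k+1}}$, such that $f_k$ is continuous relative to $F_k$. For the moment, assume $|E| < \infty$. There exists a closed set $F_0 \subset E$ with $|E \setminus F_0| < \tfrac{\varepsilon}{2}$ such that $f_k \to f$ uniformly on $F_0$ by Egorov. Define $F = F_0 \cap \bigcap_{k=1}^{\infty} F_k \subset E$, which is closed. Then
\[ |E \setminus F| \le |E \setminus F_0| + \sum_{k=1}^{\infty} |E \setminus F_k| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \]
and $f_k \to f$ uniformly on $F$. Each $f_k$ is continuous relative to $F$, so $f$ is continuous relative to $F$ as a uniform limit of continuous functions. If $|E| = \infty$, write $E = E_0 \cup \bigcup_{k=1}^{\infty} E_k$, where $E_0 = E \cap \{x \in \mathbb{R}^n : |x| \le 1\}$ and $E_k = E \cap \{x \in \mathbb{R}^n : k < |x| \le k + 1\}$; by the first part, $f$ has property C on each $E_k$. Given $\varepsilon > 0$, apply the first part of the proof for $f$ on $E_k$ with $\tfrac{\varepsilon}{2^{k+1}}$. This implies there exists $F_k \subset E_k$, closed, with $|E_k \setminus F_k| < \tfrac{\varepsilon}{2^{k+1}}$ and $f$ continuous relative to $F_k$. Let $F = \bigcup_{k=0}^{\infty} F_k$. Simply show $F$ is closed and we are done.
Q.E.D.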
5.1 Convergence in Measure
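Definition 5.12. Let $E \in \mathcal{M}$, $\{f_k\}_{k=1}^{\infty}$ be a sequence of measurable functions, and $f$ measurable. Then we say that $f_k$ converges to $f$ in measure if, for every $\varepsilon > 0$,
\[ |\{x \in E : |f_k(x) - f(x)| > \varepsilon\}| \to 0 \text{ as } k \to \infty. \]
This will be denoted $f_k \xrightarrow{m} f$.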
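Example 5.8. Let $E = [0, 1]$ and let
\[ f_k(x) = \begin{cases} 4k^2 x & : x \in \left[0, \tfrac{1}{2k}\right] \\ 4k - 4k^2 x & : x \in \left[\tfrac{1}{2k}, \tfrac{1}{k}\right] \\ 0 & : x \in \left[\tfrac{1}{k}, 1\right] \end{cases} \qquad \text{and} \qquad f(x) = 0, \; x \in [0, 1]. \]
Observe $f_k \to f$ pointwise for all $x \in [0, 1]$ but not uniformly. We can check that it converges in measure to $f$ as follows:
\[ \{x \in [0, 1] : |f_k(x) - f(x)| > \varepsilon\} \subset \left[0, \tfrac{1}{k}\right], \qquad |\{x \in [0, 1] : |f_k(x) - f(x)| > \varepsilon\}| \le \left|\left[0, \tfrac{1}{k}\right]\right| = \tfrac{1}{k}, \]
which clearly goes to 0 as $k \to \infty$. Thus, $f_k \xrightarrow{m} f$.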
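The tent functions of this example can be explored numerically. The sketch below, with hypothetical helper names, checks two facts by hand-derived formulas: the integral of $f_k$ over $[0,1]$ is 1 for every $k$, and the super-level set $\{f_k > \varepsilon\}$ has measure $\tfrac{1}{k} - \tfrac{\varepsilon}{2k^2}$ when $\varepsilon < 2k$ (the closed form is my own computation, stated as an assumption).

```python
def tent(k, x):
    """The tent function f_k: height 2k at x = 1/(2k), supported on [0, 1/k]."""
    if 0 <= x <= 1 / (2 * k):
        return 4 * k**2 * x
    if 1 / (2 * k) < x <= 1 / k:
        return 4 * k - 4 * k**2 * x
    return 0.0

def integral_midpoint(k, cells_per_piece=1000):
    """Midpoint rule on [0, 1] with breakpoints 1/(2k), 1/k on the grid.

    The rule is exact (up to rounding) on each cell because f_k is linear there.
    """
    n = 2 * k * cells_per_piece
    h = 1 / n
    return sum(tent(k, (i + 0.5) * h) for i in range(n)) * h

def superlevel_measure(k, eps):
    """|{x in [0,1] : f_k(x) > eps}|, which is 1/k - eps/(2 k^2) if eps < 2k, else 0."""
    return max(0.0, 1 / k - eps / (2 * k**2))

for k in (1, 5, 20):
    print(k, integral_midpoint(k), superlevel_measure(k, 0.5))
```

The measures shrink like $1/k$ while the integrals stay at 1, which is why convergence in measure alone does not force convergence of the integrals.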
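Example 5.9. Let $E = \mathbb{R}$ and
\[ f_k(x) = \begin{cases} 0 & : x \in \mathbb{R} \setminus [k, k+1) \\ 4(x - k) & : x \in \left[k, \tfrac{2k+1}{2}\right) \\ 4(k+1) - 4x & : x \in \left[\tfrac{2k+1}{2}, k+1\right) \end{cases} \qquad \text{and} \qquad f(x) = 0, \; x \in \mathbb{R}. \]
Then, for $0 < \varepsilon < 1$,
\[ |\{x \in \mathbb{R} : f_k(x) > \varepsilon\}| = 1 - \tfrac{\varepsilon}{2} > \tfrac{1}{2} \]
independent of $k$, thus $f_k$ does not converge in measure to $f$, even though $f_k \to f$ pointwise.
HW: pp. 62-63, #15, 16, 17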
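Theorem 5.12. If $|E| < \infty$ and if $f_k \to f$ a.e. on $E$, then $f_k \xrightarrow{m} f$ on $E$.
To prove this, we need a lemma from the proof of Egorov's Theorem.
Lemma 5.3. With the assumptions above, for every $\varepsilon, \eta > 0$, there exist a closed set $F \subset E$ and $k \in \mathbb{N}$ such that $|E \setminus F| < \eta$ and for every $j > k$, $|f(x) - f_j(x)| < \varepsilon$ for every $x \in F$.
Proof (of the theorem). Applying the lemma, for $j > k$,
\[ \{x \in E : |f_j(x) - f(x)| > \varepsilon\} \subset E \setminus F, \]
which implies
\[ |\{x \in E : |f_j(x) - f(x)| > \varepsilon\}| \le |E \setminus F| < \eta. \]
Since $\eta$ is arbitrarily small, it follows that
\[ \lim_{j\to\infty} |\{x \in E : |f_j(x) - f(x)| > \varepsilon\}| = 0 \]
for every $\varepsilon > 0$. Thus, $f_j \xrightarrow{m} f$ on $E$.
Q.E.D.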
Question: Is the converse of the theorem true?
Answer: No! Refer to the book for an example. We are able to establish the following weak converse, however.
Theorem 5.13. If $f_k \xrightarrow{m} f$ on $E$, then there exists a subsequence $\{f_{k_j}\}_{j=1}^{\infty}$ such that
\[ \lim_{j\to\infty} f_{k_j}(x) = f(x) \text{ a.e.} \]
Q.E.D.
6 Lebesgue Integration
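Let $E \in \mathcal{M}(\mathbb{R}^n)$, $f : E \to [0, \infty]$ (measurable unless otherwise stated). We would like to define $\int_E f$, and since $f$ is nonnegative, $\int_E f \in [0, \infty]$.
Definition 6.1. $\Gamma(f, E)$ is the graph of $f$ over $E$, i.e., the set
\[ \{(x, f(x)) \in \mathbb{R}^{n+1} : x \in E, f(x) < \infty\}. \]
Example 6.1. Let $f(x) = \tfrac{1}{x}$ on $[0, 1]$. Then $\Gamma(f) = \{(x, y) \in \mathbb{R}^2 : 0 < x \le 1, y = \tfrac{1}{x}\}$.
Definition 6.2. $R(f, E)$ is the region under the graph of $f$ over $E$:
\[ R(f, E) = \{(x, x_{n+1}) \in \mathbb{R}^{n+1} : x \in E, \; 0 \le x_{n+1} \le f(x) \text{ when } f(x) \text{ is finite and } 0 \le x_{n+1} < \infty \text{ otherwise}\}. \]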
We will distinguish between $\mathcal{M}(\mathbb{R}^n)$ and $\mathcal{M}(\mathbb{R}^{n+1})$. For outer measure and measure, we write $|E|_{e(n)}$ and $|E'|_{e(n+1)}$, with $|E|_{(n)}$ and $|E'|_{(n+1)}$, where $E$ and $E'$ are in $\mathbb{R}^n$ and $\mathbb{R}^{n+1}$ respectively.
Definition 6.3. $\int_E f = |R(f, E)|_{(n+1)}$. This requires $R(f, E) \in \mathcal{M}(\mathbb{R}^{n+1})$; otherwise $\int_E f$ is not defined.
Theorem 6.1. Let $f \ge 0$ on $E \in \mathcal{M}(\mathbb{R}^n)$. Then $\int_E f$ exists if and only if $f$ is measurable on $E$.
Before proving the theorem, we first present a lemma.
Lemma 6.1. Let $E \in \mathcal{M}(\mathbb{R}^n)$, $a \in [0, \infty]$. Let $f_a = a$ on $E$ (which is measurable). Then $R(f_a, E) \in \mathcal{M}(\mathbb{R}^{n+1})$ and
\[ \int_E f_a = a |E|_{(n)}. \]
Recall the convention is that $0 \cdot x = 0$ for all $x \in [-\infty, \infty]$.
Proof. If $|E|_{(n)} = 0$, we just need to show
\[ |R(f_a, E)|_{e(n+1)} = 0, \]
which is left as an exercise.
So assume $|E|_{(n)} > 0$ and $a < \infty$. If $E$ is a (possibly) half-open interval, then $R(f_a, E) = E \times [0, a]$ is another possibly half-open interval in $\mathbb{R}^{n+1}$, thus $R(f_a, E) \in \mathcal{M}(\mathbb{R}^{n+1})$ and
\[ \int_E f_a = |R(f_a, E)|_{(n+1)} = |E|_{(n)} \, |[0, a]|_{(1)} = a |E|_{(n)}. \]
If $E \subset \mathbb{R}^n$ is open, then $E = \bigcup_{k=1}^{\infty} I_k$, a countable disjoint union of intervals. This says
\[ R(f_a, E) = \bigcup_{k=1}^{\infty} R(f_a, I_k), \]
which is measurable as a countable union of measurable sets, so
\[ |R(f_a, E)|_{(n+1)} = \sum_{k=1}^{\infty} |R(f_a, I_k)|_{(n+1)} = \sum_{k=1}^{\infty} a |I_k|_{(n)} = a |E|. \]
If $E \subset \mathbb{R}^n$ is of measure zero, we have already done it (as an exercise). Thus, if $E \subset \mathbb{R}^n$, $E \in \mathcal{M}(\mathbb{R}^n)$, $|E| < \infty$, then $E = H \setminus Z$ where $|Z|_{(n)} = 0$ and $H$ is a $G_\delta$-set. Without loss of generality, $H = \bigcap_{i=1}^{\infty} G_i$ with $G_1 \supset G_2 \supset \cdots \supset \bigcap_{i=1}^{\infty} G_i = H$ open and $|G_1| < \infty$. Thus,
\[ R(f_a, E) = \bigcap_{i=1}^{\infty} R(f_a, G_i) \setminus R(f_a, Z). \]
We know $R(f_a, G_i)$ decreases since the $G_i$'s are nested, with $|R(f_a, G_1)|_{(n+1)} = a |G_1|_{(n)} < \infty$, which implies
\[ \Big| \bigcap_{i=1}^{\infty} R(f_a, G_i) \Big| = \lim_{i\to\infty} |R(f_a, G_i)| = \lim_{i\to\infty} a |G_i| = a |H| = a |E|. \]
If $|E| = \infty$, then we can write $E = \bigcup_{k=1}^{\infty} E_k$ with $E_k = E \cap (B_k(0) \setminus B_{k-1}(0))$, which is measurable and of finite measure, so we apply the previous step. Doing this, we get
\[ R(f_a, E) = \bigcup_{k=1}^{\infty} R(f_a, E_k), \]
therefore, we have
\[ \int_E f_a = |R(f_a, E)| = \sum_{k=1}^{\infty} |R(f_a, E_k)| = \sum_{k=1}^{\infty} a |E_k| = a \sum_{k=1}^{\infty} |E_k| = a |E|. \]
It is left as an exercise to check the cases when $a = \infty$ and $0 < |E| \le \infty$.
Q.E.D.
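Lemma 6.2. Let $f : E \to [0, \infty]$ be measurable, $E \in \mathcal{M}(\mathbb{R}^n)$. Then $|\Gamma(f, E)|_{(n+1)} = 0$.
Proof. As above, we can assume $|E|_{(n)} < \infty$. Given $\varepsilon > 0$ and $k = 0, 1, 2, \ldots$, let $E_k = \{x \in E : k\varepsilon \le f(x) < (k+1)\varepsilon\}$. Then $\{E_k\}_{k=0}^{\infty} \subset \mathcal{M}(\mathbb{R}^n)$, where $E_j \cap E_k = \emptyset$ if $j \ne k$, and
\[ \bigcup_{k=0}^{\infty} E_k = \{x \in E : f(x) < \infty\}. \]
Thus, $\Gamma(f, E) = \bigcup_{k=0}^{\infty} \Gamma(f, E_k)$, but $\Gamma(f, E_k) \subset E_k \times [k\varepsilon, (k+1)\varepsilon)$, which implies $|\Gamma(f, E_k)|_{e(n+1)} \le \varepsilon |E_k|$.
Now we have
\[ |\Gamma(f, E)|_{e(n+1)} \le \sum_{k=0}^{\infty} |\Gamma(f, E_k)|_{e(n+1)} \le \varepsilon \sum_{k=0}^{\infty} |E_k|_{(n)} \le \varepsilon |E|_{(n)}. \]
Since $\varepsilon > 0$ is arbitrarily small, $|\Gamma(f, E)| = 0$.
Q.E.D.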
HW: p. 85, #2, 3, 4, 5
Now we'll present the proof of the theorem.
Proof. Assume $f$ is a nonnegative measurable function on $E$. Then we know there exists a sequence $\{f_k\}_{k=1}^{\infty}$ of simple measurable functions such that $f_k$ increases to $f$ for all $x \in E$. Thus, $R(f_k, E) \cup \Gamma(f, E)$ increases to $R(f, E)$ as $k \to \infty$. By the previous lemma, $|\Gamma(f, E)| = 0$, so $|R(f_k, E)|_{e(n+1)}$ increases to $|R(f, E)|_{e(n+1)}$ and
\[ R(f, E) = \bigcup_{k=1}^{\infty} R(f_k, E) \cup Z, \]
where $Z \subset \Gamma(f, E)$ has measure zero, so $R(f, E)$ is clearly measurable. Thus,
\[ \int_E f = \lim_{k\to\infty} \int_E f_k \]
exists.
The converse is left as an exercise.
Q.E.D.
Let's take a look at some of the properties of the Lebesgue integral.
Theorem 6.2.
(i) If $0 \le f \le g$ on $E$ and if $f$ and $g$ are measurable, then $\int_E f \le \int_E g$. Thus $\int_E f \le (\sup_{x \in E} f(x)) |E|$.
(ii) If $f \ge 0$ and measurable on $E$ with $\int_E f < \infty$, then $f < \infty$ a.e. in $E$.
(iii) If $E_1 \subset E_2$ are measurable, then for $f \ge 0$, $\int_{E_1} f \le \int_{E_2} f$.
Proof.
(i) We know $f \le g$ implies $R(f, E) \subset R(g, E)$, thus $|R(f, E)| \le |R(g, E)|$, i.e., $\int_E f \le \int_E g$.
(ii) We prove the contrapositive using the third condition. Suppose $f = \infty$ on a set of positive measure. Write $E_1 = \{x \in E : f(x) = \infty\} \subset E$, so Lemma 6.1 and the third condition imply
\[ \infty = \infty \cdot |E_1| = \int_{E_1} f \le \int_E f. \]
(iii) Observe $E_1 \subset E_2$ implies $R(f, E_1) \subset R(f, E_2)$, thus $\int_{E_1} f \le \int_{E_2} f$.
Q.E.D.
Theorem 6.3 (Monotone Convergence Theorem). Let $\{f_k\}_{k=1}^{\infty}$ be a sequence of nonnegative measurable functions on $E \in \mathcal{M}$. Assume $f_k$ increases to $f$ for all $x \in E$. Then $\int_E f_k$ increases to $\int_E f$. In particular,
\[ \lim_{k\to\infty} \int_E f_k = \int_E f. \]
Proof. The fact that $\left\{\int_E f_k\right\}_{k=1}^{\infty}$ increases follows from $f_k \le f_{k+1}$ and the previous theorem. $R(f_k, E) \cup \Gamma(f, E)$ increases to $R(f, E)$, and $|\Gamma(f, E)| = 0$, which implies $|R(f_k, E)|_{(n+1)}$ increases to $|R(f, E)|_{(n+1)}$.
Q.E.D.
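Theorem 6.4. Suppose $f \ge 0$ and measurable on $E$, and let $E$ be partitioned into $E = \bigcup_{k=1}^{\infty} E_k$, with $E_k \in \mathcal{M}$ and $E_k \cap E_j = \emptyset$ if $k \ne j$. Then
\[ \int_E f = \sum_{k=1}^{\infty} \int_{E_k} f. \]
Proof. $R(f, E) = \bigcup_{k=1}^{\infty} R(f, E_k)$, a disjoint union. This gives us $|R(f, E)| = \sum_{k=1}^{\infty} |R(f, E_k)|$.
Q.E.D.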
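Theorem 6.5. Let $f \ge 0$ be measurable on $E \in \mathcal{M}$. Then
\[ \int_E f = \sup \left\{ \sum_{k=1}^{\infty} \Big( \inf_{x \in E_k} f(x) \Big) |E_k|_{(n)} \right\}, \]
where the supremum is over all countable partitions $E = \bigcup_{k=1}^{\infty} E_k$ into disjoint measurable sets.
Q.E.D.
Corollary 6.1. If $f \ge 0$ and $|E| = 0$, then $\int_E f = 0$.
Proof. The hypotheses imply the measurability of $f$, since every subset of a set of measure zero is measurable. Then, for any partition of $E$, say $E = \bigcup_{k=1}^{\infty} E_k$, each $|E_k| = 0$, so
\[ \int_E f = \sup \left\{ \sum_{k=1}^{\infty} \Big( \inf_{x \in E_k} f(x) \Big) |E_k|_{(n)} \right\} = \sup\{0\} = 0. \]
Q.E.D.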
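Example 6.2. Let
\[ f(x) = \begin{cases} 1 & : x \in \mathbb{Q} \\ 0 & : x \notin \mathbb{Q}. \end{cases} \]
(i) Let $E = \mathbb{Q}$. Since $|\mathbb{Q}| = 0$, we know $\int_E f = 0$.
(ii) In fact, $\int_{\mathbb{R}} f = \int_{\mathbb{Q}} f + \int_{\mathbb{R} \setminus \mathbb{Q}} f = 0 + 0 = 0$.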
Theorem 6.6.
(i) If $f$ and $g$ are nonnegative measurable functions on $E \in \mathcal{M}$ and $f(x) \le g(x)$ a.e., then $\int_E f \le \int_E g$.
(ii) If $f(x) = g(x)$ a.e., then $\int_E f = \int_E g$.
Proof. Clearly (ii) follows from (i). To prove (i), decompose $E$ into the sets $\{f > g\} \cup \{f \le g\} = Z \cup A$, where $|Z| = 0$. Then
\[ \int_E f = \int_Z f + \int_A f = \int_A f \le \int_A g = \int_A g + \int_Z g = \int_E g. \]
Q.E.D.
Next, we will cover an important estimate/inequality, the Tchebyshev Inequality.
Theorem 6.7 (Tchebyshev's Inequality). Let $f \ge 0$ be measurable on $E$. For every $\alpha > 0$ define
\[ \omega_{f,E}(\alpha) = \omega(\alpha) = |\{x \in E : f(x) > \alpha\}|. \]
This is called the distribution function of $f$ and satisfies
\[ \omega_{f,E}(\alpha) \le \frac{1}{\alpha} \int_E f. \]
Remark 6.1. This gives a decay rate as $\alpha \to \infty$ and a blow-up rate as $\alpha \to 0^+$. If $\int_E f = \infty$, then the statement contains no information.
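Example 6.3. Let $E = \mathbb{R}^n$ and $f(x) = \frac{1}{|x|^p}$, where $p > 0$. Then
\[ \left\{x \in \mathbb{R}^n : \frac{1}{|x|^p} > \alpha \right\} = \left\{x \in \mathbb{R}^n : |x| < \alpha^{-1/p} \right\} = B_{\alpha^{-1/p}}(0). \]
Recall that $|B_r(x_0)| = c_n r^n$, where $c_n$ is a constant depending on the dimension. Thus,
\[ \omega(\alpha) = c_n \left(\alpha^{-1/p}\right)^n = c_n \alpha^{-n/p}. \]
This does not contradict Tchebyshev's inequality since $\int_{\mathbb{R}^n} f = \infty$. If we change $E$ to $B_1(0)$, then it turns out that
\[ \int_{B_1(0)} \frac{1}{|x|^p} < \infty \]
if and only if $0 < p < n$. In that case, Tchebyshev's inequality implies
\[ \omega_{f,B_1(0)}(\alpha) \le \frac{1}{\alpha} \left( \int_{B_1(0)} \frac{1}{|x|^p} \right) < \infty. \]
Tchebyshev's inequality gives a weaker result than the explicit calculation: since $\tfrac{n}{p} > 1$, $\alpha^{-n/p} \to 0$ as $\alpha \to \infty$ faster than $\alpha^{-1}$ does.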
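For a concrete comparison, take $n = 1$, $p = 1/2$ and $E = B_1(0) = (-1, 1)$. The values below are hand computations and should be read as assumptions of this sketch: $\int_{-1}^{1} |x|^{-1/2}\,dx = 4$ and $\omega(\alpha) = 2\min(1, \alpha^{-2})$, which matches $c_n \alpha^{-n/p}$ with $c_1 = 2$ once $\alpha \ge 1$.

```python
def omega(alpha):
    """|{x in (-1, 1) : |x|**-0.5 > alpha}| = 2 * min(1, alpha**-2)."""
    return 2 * min(1.0, alpha ** -2)

INTEGRAL = 4.0  # integral of |x|**-0.5 over (-1, 1), computed by hand

def chebyshev_bound(alpha):
    """Right-hand side of Tchebyshev's inequality: (1/alpha) * integral of f."""
    return INTEGRAL / alpha

for alpha in (0.5, 1, 2, 10, 100):
    print(alpha, omega(alpha), chebyshev_bound(alpha))
```

The exact distribution function decays like $\alpha^{-2}$ while the bound decays only like $\alpha^{-1}$, illustrating the gap described in the example.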
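Proof. Fix $\alpha > 0$. Let $E_\alpha = \{x \in E : f(x) > \alpha\} \subset E$, $E_\alpha \in \mathcal{M}$. Since $E_\alpha \subset E$, we have $\int_E f \ge \int_{E_\alpha} f$ by a previous theorem. On $E_\alpha$, $f(x) > \alpha$, so now
\[ \int_E f \ge \int_{E_\alpha} f \ge \alpha |E_\alpha| = \alpha\, \omega(\alpha). \]
Thus, $\omega(\alpha) \le \frac{1}{\alpha} \int_E f$.
Q.E.D.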
Theorem 6.8. Let $f \ge 0$ on $E$. Then $f = 0$ a.e. on $E$ if and only if $\int_E f = 0$.
Proof. Suppose $f = 0$ a.e. Then $f \le 0$ a.e., which implies
\[ 0 \le \int_E f \le \int_E 0 = 0, \]
thus $\int_E f = 0$.
Conversely, if $\int_E f = 0$, apply Tchebyshev's inequality for any $\alpha > 0$. Then
\[ |\{x \in E : f(x) > \alpha\}| \le \frac{1}{\alpha} \int_E f = 0. \]
Observe that $\{x \in E : f(x) \ne 0\} = \bigcup_{k=1}^{\infty} \{x \in E : f(x) > \tfrac{1}{k}\}$, which is a countable union of sets of measure zero, thus it has measure zero. Hence, $f = 0$ a.e. on $E$.
Q.E.D.
Theorem 6.9. If $f \ge 0$ and $c \in \mathbb{R}_+$, then
\[ \int_E cf = c \int_E f. \]
Proof. Recall if $f$ is simple, $f = \sum_{i=1}^{N} a_i \chi_{E_i}$, then
\[ \int_E f = \sum_{i=1}^{N} a_i |E_i| \quad \text{and} \quad cf = \sum_{i=1}^{N} (c a_i) \chi_{E_i}, \]
which implies
\[ \int_E cf = \sum_{i=1}^{N} c a_i |E_i| = c \sum_{i=1}^{N} a_i |E_i| = c \int_E f. \]
A general measurable $f \ge 0$ can always be expressed as $f = \lim_{k\to\infty} f_k$, with each $f_k \ge 0$ simple, measurable, and increasing to $f$ for all $x \in E$. Then it follows that $c f_k(x)$ increases to $c f(x)$. If we apply the monotone convergence theorem, we get that
\[ \int_E f = \lim_{k\to\infty} \int_E f_k. \]
Thus,
\[ \int_E cf = \lim_{k\to\infty} \int_E c f_k = \lim_{k\to\infty} c \int_E f_k = c \lim_{k\to\infty} \int_E f_k = c \int_E f. \]
Q.E.D.
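Theorem 6.10. If $f$ and $g$ are nonnegative, measurable functions, then
\[ \int_E (f + g) = \int_E f + \int_E g. \]
Proof. First, assume $f$ and $g$ are simple functions. Then
\[ f = \sum_{i=1}^{M} a_i \chi_{A_i} \quad \text{and} \quad g = \sum_{j=1}^{N} b_j \chi_{B_j}, \]
where the $A_i$ are pairwise disjoint, the $B_j$ are pairwise disjoint, and $\bigcup_i A_i = \bigcup_j B_j = E$. Write
\[ f = \sum_{i=1}^{M} a_i \Big( \sum_{j=1}^{N} \chi_{A_i \cap B_j} \Big) = \sum_{i,j=1}^{M,N} a_i \chi_{A_i \cap B_j}, \qquad g = \sum_{j=1}^{N} b_j \Big( \sum_{i=1}^{M} \chi_{A_i \cap B_j} \Big) = \sum_{i,j=1}^{M,N} b_j \chi_{A_i \cap B_j}. \]
This implies
\[ f + g = \sum_{i,j=1}^{M,N} (a_i + b_j) \chi_{A_i \cap B_j}. \]
Now we have
\[ \int_E (f + g) = \sum_{i,j=1}^{M,N} (a_i + b_j) |A_i \cap B_j| = \sum_{i=1}^{M} a_i \Big( \sum_{j=1}^{N} |A_i \cap B_j| \Big) + \sum_{j=1}^{N} b_j \Big( \sum_{i=1}^{M} |A_i \cap B_j| \Big) = \int_E f + \int_E g. \]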
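The common-refinement argument for simple functions can be traced on a small case. The sketch below assumes simple functions on $[0,1)$ whose pieces are half-open intervals, stored as `(left, right, value)` triples; this representation and the helper names are illustrative, not from the notes.

```python
from fractions import Fraction as Fr

def integral(pieces):
    """Integral of a simple function given as (left, right, value) pieces."""
    return sum(value * (b - a) for a, b, value in pieces)

def value_at(pieces, x):
    """Value of the simple function at x, for half-open pieces [left, right)."""
    for a, b, value in pieces:
        if a <= x < b:
            return value
    raise ValueError("x lies outside the partition")

def add_simple(f, g):
    """Sum of two simple functions, expressed on the refinement {A_i ∩ B_j}."""
    cuts = sorted({p for a, b, _ in f + g for p in (a, b)})
    return [(lo, hi, value_at(f, lo) + value_at(g, lo))
            for lo, hi in zip(cuts, cuts[1:])]

f = [(Fr(0), Fr(1, 3), 2), (Fr(1, 3), Fr(1), 5)]
g = [(Fr(0), Fr(1, 2), 1), (Fr(1, 2), Fr(1), 4)]
h = add_simple(f, g)
print(integral(f), integral(g), integral(h))
```

Each refined piece carries the value $a_i + b_j$ on $A_i \cap B_j$, so summing value times length reproduces the double sum in the proof.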
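For general nonnegative functions $f$ and $g$, express $f = \lim_{k\to\infty} f_k$ and $g = \lim_{k\to\infty} g_k$, where $f_k$ and $g_k$ increase to $f$ and $g$, respectively, and $f_k$ and $g_k$ are simple and measurable. Thus, note that each $f_k + g_k$ is simple and measurable and $f_k + g_k$ increases to $f + g$. Then, by the monotone convergence theorem,
\[ \int_E (f_k + g_k) \to \int_E (f + g). \]
By the simple case, $\int_E (f_k + g_k) = \int_E f_k + \int_E g_k$, and again by the monotone convergence theorem,
\[ \int_E f_k + \int_E g_k \to \int_E f + \int_E g. \]
Comparing the two limits gives the result.
Q.E.D.
Theorem 6.11. Suppose $0 \le f \le g$ are measurable on $E$ and $\int_E f < \infty$. Then
\[ \int_E (g - f) = \int_E g - \int_E f. \]
Proof. Clearly $g - f \ge 0$ and is measurable on $E$, and $g = f + (g - f)$. Thus,
\[ \int_E g = \int_E f + \int_E (g - f), \]
and since $\int_E f$ is finite it may be subtracted from both sides.
Q.E.D.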
6.1 Convergence Theorem
The goal of this section is to achieve the following: if $\{f_k\}_{k=1}^{\infty}$ is a sequence of functions with $\lim_{k\to\infty} f_k(x) = f(x)$ a.e. on $E$, then
\[ \int_E f = \int_E \lim_{k\to\infty} f_k = \lim_{k\to\infty} \int_E f_k. \]
Equivalently, suppose $\sum_{k=0}^{\infty} f_k = F(x)$ a.e. on $E$. If we form the sequence of partial sums $S_n(x) = \sum_{k=0}^{n} f_k(x)$, then $\lim_{n\to\infty} S_n(x) = F(x)$ a.e. on $E$. Then we want
\[ \int_E F(x) = \int_E \lim_{n\to\infty} S_n(x) = \lim_{n\to\infty} \int_E S_n(x) = \lim_{n\to\infty} \sum_{k=0}^{n} \int_E f_k = \sum_{k=0}^{\infty} \int_E f_k. \]
We have problems, though, unless we impose some hypotheses, as the following example demonstrates.
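Example 6.4. On $E = [0, 1]$, let
\[ f_k(x) = \begin{cases} 4k^2 x & : x \in \left[0, \tfrac{1}{2k}\right) \\ 4k - 4k^2 x & : x \in \left[\tfrac{1}{2k}, \tfrac{1}{k}\right) \\ 0 & : x \in \left[\tfrac{1}{k}, 1\right]. \end{cases} \]
Clearly $f_k \ge 0$ and is measurable, and $f_k(x) \to 0$, i.e., $f_k(x) \to f(x)$ for all $x \in [0, 1]$, where $f = 0$. Thus, $f_k \to f$ pointwise everywhere on $E$, but
\[ \int_E f_k = 1 \not\to 0. \]
Similarly, if $E = \mathbb{R}$ and
\[ g_k(x) = \begin{cases} 4(x - k) & : x \in \left[k, \tfrac{2k+1}{2}\right) \\ 4(k+1) - 4x & : x \in \left[\tfrac{2k+1}{2}, k+1\right) \\ 0 & : x \in \mathbb{R} \setminus [k, k+1), \end{cases} \]
then $g_k \to g$, where $g(x) = 0$ for all $x \in E$, yet
\[ \int_E g_k = 1 \not\to \int_E g = 0. \]
Thus, we need other convergence theorems besides the monotone one.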
Theorem 6.12 (Fatou's Lemma). Let $\{f_k\}_{k=1}^{\infty}$ be a sequence of nonnegative measurable functions on $E$. Then
\[ \int_E \liminf_{k\to\infty} f_k \le \liminf_{k\to\infty} \int_E f_k. \]
Proof. Recall the definition of the limit infimum: $\liminf_{k\to\infty} f_k(x) = \lim_{k\to\infty} \left( \inf_{n \ge k} f_n(x) \right)$, where the infimum is taken over decreasing sets, hence, as a sequence in $k$, it is monotonically increasing. Thus, the monotone convergence theorem implies
\[ \int_E \liminf_{k\to\infty} f_k = \lim_{k\to\infty} \int_E \inf_{n \ge k} f_n(x). \]
Note that for all $x \in E$, $k \in \mathbb{N}$, $\inf_{n \ge k} f_n(x) \le f_k(x)$, thus
\[ \int_E \inf_{n \ge k} f_n(x) \le \int_E f_k(x), \]
hence, taking the limit infimum of both sides, we have
\[ \lim_{k\to\infty} \int_E \inf_{n \ge k} f_n \le \liminf_{k\to\infty} \int_E f_k. \]
Q.E.D.
Corollary 6.2. Let $f_k$, $k = 1, 2, \ldots$, be nonnegative and measurable on $E$, and let $f_k \to f$ a.e. on $E$. If $\int_E f_k \le M$ for all $k$, then $\int_E f \le M$.
Proof. $f = \liminf_{k\to\infty} f_k$ a.e., which implies
\[ \int_E f \le \liminf_{k\to\infty} \int_E f_k \le M. \]
Q.E.D.
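Example 6.5. In Fatou's Lemma (and hence in the above corollary) we can have strict inequality, using the functions
\[ f_k(x) = \begin{cases} 4k^2 x & : x \in \left[0, \tfrac{1}{2k}\right) \\ 4k - 4k^2 x & : x \in \left[\tfrac{1}{2k}, \tfrac{1}{k}\right) \\ 0 & : x \in \left[\tfrac{1}{k}, 1\right]. \end{cases} \]
Integrating this, we have
\[ \int_{[0,1]} \liminf_{k\to\infty} f_k = 0 \ne 1 = \liminf_{k\to\infty} \int_{[0,1]} f_k. \]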
Theorem 6.13 (Lebesgue's Dominated Convergence Theorem for Nonnegative Functions). Let $\{f_k\}_{k=1}^{\infty}$ be a sequence of nonnegative measurable functions on $E$ such that $f_k \to f$ a.e. on $E$. If there exists a measurable function $\varphi$ such that $f_k \le \varphi$ a.e. for all $k$, and if $\int_E \varphi$ is finite, then
\[ \int_E f_k \to \int_E f. \]
Remark 6.2. Lebesgue's Dominated Convergence Theorem does not apply to the example since on $\tfrac{1}{4k} \le x \le \tfrac{1}{3k}$, $f_k(x) \ge k$. So any majorant $\varphi(x)$ must satisfy $\varphi(x) \ge k$ on $\left[\tfrac{1}{4k}, \tfrac{1}{3k}\right]$ for all $k \in \mathbb{N}$. Thus $\int_{[0,1]} \varphi = \infty$, since $\varphi$ can be bounded below by simple functions whose integral mimics the harmonic series.
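Proof. One inequality follows immediately from Fatou's Lemma:
\[ \int_E f \le \liminf_{k\to\infty} \int_E f_k. \]
So, it suffices to show
\[ \limsup_{k\to\infty} \int_E f_k \le \int_E f. \]
Let $g_k = \varphi - f_k$, and notice $g_k \ge 0$ a.e. on $E$ and $\lim_{k\to\infty} g_k = \varphi - \lim_{k\to\infty} f_k = \varphi - f$. Now, Fatou's Lemma implies
\[ \int_E (\varphi - f) = \int_E \lim_{k\to\infty} g_k \le \liminf_{k\to\infty} \int_E g_k = \liminf_{k\to\infty} \int_E (\varphi - f_k) = \liminf_{k\to\infty} \left( \int_E \varphi - \int_E f_k \right) = \int_E \varphi - \limsup_{k\to\infty} \int_E f_k. \]
On the other hand,
\[ \int_E (\varphi - f) = \int_E \varphi - \int_E f. \]
Combining these two and cancelling the finite quantity $\int_E \varphi$, we achieve the desired result.
Q.E.D.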
Definition 6.4. If at least one of the integrals $\int_E f^+$ or $\int_E f^-$ is finite, then $\int_E f$ is defined to be $\int_E f^+ - \int_E f^-$. If $\int_E f$ exists and is finite, we say that $f$ is Lebesgue integrable, or simply integrable, on $E$ and write $f \in L(E)$. Thus,
\[ L(E) = L^1(E) = L^1(E, dx) = \left\{ f : \int_E |f| < \infty \right\}, \]
where $dx$ represents Lebesgue measure. Observe that $L^1(E)$ is a vector space under pointwise addition and scalar multiplication.
Observe that
\[ \left| \int_E f \right| = \left| \int_E f^+ - \int_E f^- \right| \le \left| \int_E f^+ \right| + \left| \int_E f^- \right| = \int_E f^+ + \int_E f^- = \int_E |f|. \]
Definition 6.5. A function $f$ is essentially bounded if there exists $M < \infty$ such that $|f(x)| \le M$ for almost every $x \in E$. The essential supremum is then defined to be the infimum of the $M$'s that satisfy this property.
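Proposition 6.1. Let $|E| < \infty$. Then if $f$ is essentially bounded,
\[ \left| \int_E f \right| \le \Big( \operatorname*{ess\,sup}_{x \in E} |f| \Big) |E|. \]
Proof.
\[ \left| \int_E f \right| \le \int_E |f| \le \int_E \operatorname*{ess\,sup}_{x \in E} |f| = \Big( \operatorname*{ess\,sup}_{x \in E} |f| \Big) \int_E 1 = \Big( \operatorname*{ess\,sup}_{x \in E} |f| \Big) |E|. \]
Q.E.D.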
Theorem 6.14. Let $f : E \to \overline{\mathbb{R}}$ be measurable on $E$. Then $f$ is integrable over $E$ if and only if $|f|$ is integrable over $E$.
Proof. If $|f| \in L^1(E)$, use the inequality $\left| \int_E f \right| \le \int_E |f|$ to get $f \in L^1(E)$. Conversely, if $f \in L^1(E)$, then $\int_E f^+$ and $\int_E f^-$ both must be finite. Then
\[ \int_E |f| = \int_E (f^+ + f^-) = \int_E f^+ + \int_E f^- < \infty. \]
Thus, $|f| \in L^1(E)$ by definition.
Q.E.D.
Many of the results for nonnegative extended real-valued functions extend to $\overline{\mathbb{R}}$-valued functions if either $\int_E f$ is defined or $f \in L^1(E)$. As a warning, however, the integral operator is no longer monotonic in the sets we integrate over (simply consider the function $f(x) = 2x$ over the intervals $[-1, 1]$ and $[0, 1]$). We do, however, have the following proposition.
Proposition 6.2. Let f L
1
(E) and E
2
E
1
. Then f
E
2
L
1
(E
2
) and
_
E
2
|f|
_
E
1
|f|.
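The warning above can be checked numerically. The following is a minimal sketch, assuming a simple midpoint rule (the helper `midpoint_integral` is a hypothetical name, not part of the notes) stands in for the Lebesgue integral of the continuous function $f(x) = 2x$: the integral of $f$ is larger over the smaller set $[0, 1]$, while the integral of $|f|$ is monotone in the set, as Proposition 6.2 states.

```python
def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def f(x):
    return 2.0 * x


# E_2 = [0, 1] is a subset of E_1 = [-1, 1].
over_small = midpoint_integral(f, 0.0, 1.0)
over_large = midpoint_integral(f, -1.0, 1.0)

# The integrals of |f| are ordered the way Proposition 6.2 predicts.
abs_small = midpoint_integral(lambda x: abs(f(x)), 0.0, 1.0)
abs_large = midpoint_integral(lambda x: abs(f(x)), -1.0, 1.0)

print(over_small, over_large)  # integral over the smaller set is the larger one
print(abs_small, abs_large)
```

The positive and negative parts of $f$ cancel on $[-1, 1]$, which is exactly why monotonicity in the set can only be expected for $|f|$ (or for nonnegative $f$).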
Theorem 6.15. If $f \in L(E)$, then $f$ is finite a.e. in $E$.
Q.E.D.
Theorem 6.16.
(i) If both $\int_E f$ and $\int_E g$ exist and if $f \le g$ a.e. in $E$, then $\int_E f \le \int_E g$. In particular, if $f = g$ a.e. in $E$, then $\int_E f = \int_E g$.
(ii) If $\int_{E_2} f$ exists and $E_1$ is a measurable subset of $E_2$, then $\int_{E_1} f$ exists.
Q.E.D.
Theorem 6.17. If $\int_E f$ exists and $E = \bigcup_{k=1}^{\infty} E_k$ is the countable union of disjoint measurable sets $E_k$, then
\[ \int_E f = \sum_{k=1}^{\infty} \int_{E_k} f. \]
Q.E.D.
Theorem 6.18. If $|E| = 0$ or if $f = 0$ a.e. in $E$, then $\int_E f = 0$.
Q.E.D.
6.2 Linearity of the Integral
Lemma 6.3. If $\int_E f$ exists and $c \in \mathbb{R}$, then $\int_E cf$ exists and equals $c \int_E f$.
Proof. Suppose $c = -1$. Then $\int_E f$ exists implies $\int_E f = \int_E f^+ - \int_E f^-$ with at least one of the integrals on the right finite. Note that $(-f)^+ = f^-$ and $(-f)^- = f^+$. Then
\[ \int_E (-f) = \int_E (-f)^+ - \int_E (-f)^- = \int_E f^- - \int_E f^+ = -\left( \int_E f^+ - \int_E f^- \right) = -\int_E f. \]
Now, suppose $0 \le c < \infty$. Then
\[ \int_E cf = \int_E (cf)^+ - \int_E (cf)^- = \int_E c f^+ - \int_E c f^- = c \int_E f^+ - c \int_E f^- = c \left( \int_E f^+ - \int_E f^- \right) = c \int_E f. \]
For general $c \in \mathbb{R}$, either $c \ge 0$ or $c \le 0$. In the latter case, $c = -|c|$; thus apply case 2, or cases 2 and 1, respectively.
Q.E.D.
Corollary 6.3. If $\int_{E_1} f_1$ and $\int_{E_2} f_2$ are defined and finite, and $c_1, c_2 \in \mathbb{R}$, then
\[ \int_{E_1} c_1 f_1 + \int_{E_2} c_2 f_2 = c_1 \int_{E_1} f_1 + c_2 \int_{E_2} f_2. \]
Thus, the set of functions $f$ such that $\int_E f$ is defined is closed under multiplication by real scalars, as is $L^1(E)$.
HW: Chap. 5, #7,8,9,11
Theorem 6.19. If $f, g \in L^1(E)$, then $f + g \in L^1(E)$ and
\[ \int_E (f + g) = \int_E f + \int_E g. \]
Q.E.D.
Thus, $L^1(E)$ is a vector space over $\mathbb{R}$. In fact, $L^1(E)$ is a normed linear space with
\[ \|f\|_{L^1(E)} = \int_E |f|, \]
which satisfies
(i) $\|f\|_{L^1(E)} \ge 0$ with equality if and only if $f = 0$ a.e. on $E$;
(ii) $\|cf\|_{L^1(E)} = |c| \, \|f\|_{L^1(E)}$;
(iii) $\|f + g\|_{L^1(E)} \le \|f\|_{L^1(E)} + \|g\|_{L^1(E)}$.
Strictly speaking, we need to consider
\[ Z(E) = \{ f : f = 0 \text{ a.e. on } E \}. \]
Clearly this is a linear subspace of $L^1(E)$, so we replace $L^1(E)$ by $L^1(E)/Z(E)$ to identify all of the functions in $Z(E)$ with 0. Now $L^1(E)$ becomes a metric space, where
\[ d(f, g) = \|f - g\|_{L^1(E)}. \]
Example 6.6. $f_k \to f$ in $L^1(E)$ if $d(f_k, f) = \int_E |f_k - f| \to 0$ as $k \to \infty$.
Definition 6.6. If $(X, \|\cdot\|)$ is a normed linear space, then a linear functional $\Lambda : X \to \mathbb{R}$ is bounded (or continuous) if there is some constant $C < \infty$ such that $|\Lambda(v)| \le C \|v\|$ for all $v \in X$.
We know that if $f \in L^1(E)$, then $\Lambda(f) = \int_E f$ is a linear functional on $L^1(E)$ and
\[ |\Lambda(f)| = \left| \int_E f \right| \le \int_E |f| = 1 \cdot \|f\|_{L^1(E)}, \]
which implies $\Lambda$ is a bounded linear functional with constant $C = 1$.
Theorem 6.20. Suppose $f \in L^1(E)$ and $g : E \to \mathbb{R}$ is measurable. Then
(i) if there exists $M < \infty$ such that $|g(x)| \le M |f(x)|$ a.e. on $E$, then $g \in L^1(E)$ with $\|g\|_{L^1(E)} \le M \|f\|_{L^1(E)}$;
(ii) if there exist constants $a$ and $b$ such that $a f(x) \le g(x) \le b f(x)$ a.e. on $E$, then
\[ a \int_E f \le \int_E g \le b \int_E f, \]
so $g \in L^1(E)$.
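If $m(x)$ is essentially bounded on $E$, then $f \in L^1(E)$ implies $mf \in L^1(E)$, so we may define
\[ f \mapsto \Lambda_m(f) = \int_E m(x) f(x) \, dx. \]
Observe $\Lambda_m(cf) = c \Lambda_m(f)$ for all $c \in \mathbb{R}$ and $f \in L^1(E)$. Also, $\Lambda_m(f + g) = \Lambda_m(f) + \Lambda_m(g)$ for all $f$ and $g \in L^1(E)$; thus $\Lambda_m$ is a linear functional.
By the theorem,
\[ |\Lambda_m(f)| \le \int_E |m(x) f(x)| \, dx \le M \int_E |f| = M \|f\|_{L^1(E)} \]
for any $M$ for which $M \ge \operatorname*{ess\,sup}_{x \in E} |m(x)|$. In fact, here $\Lambda_m$ is a bounded linear functional with constant $C = \operatorname*{ess\,sup}_{x \in E} |m|$. Furthermore, every bounded linear functional on $L^1(E)$ arises in this way.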
6.3 Convergence Theorems for General Functions
Theorem 6.21 (Monotone Convergence Theorem). Let $\{f_k\}_{k=1}^{\infty}$ be a sequence of measurable functions on $E$. If $f_k$ increases monotonically to $f$ a.e. and there exists $\varphi \in L^1(E)$ such that $f_k(x) \ge \varphi(x)$ a.e. on $E$, then $\lim_{k\to\infty} \int_E f_k = \int_E f$.
Proof. Clearly $f_k(x) - \varphi(x) \ge 0$ a.e. on $E$ and $f_k(x) - \varphi(x)$ increases to $f(x) - \varphi(x)$ a.e. on $E$. The monotone convergence theorem for nonnegative functions implies
\[ \int_E (f_k - \varphi) \to \int_E (f - \varphi), \]
but that means
\[ \int_E f_k - \int_E \varphi \to \int_E f - \int_E \varphi. \]
Cancelling the finite quantity $\int_E \varphi$ from both sides yields the desired result.
Q.E.D.
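Theorem 6.22 (Fatou's Lemma). Let $\{f_k\}_{k=1}^{\infty}$ be a sequence of measurable functions on $E$ such that there exists $\varphi \in L^1(E)$ with $f_k \ge \varphi$ a.e. on $E$. Then
\[ \int_E \liminf_{k\to\infty} f_k \le \liminf_{k\to\infty} \int_E f_k. \]
Proof. Apply Fatou's Lemma for nonnegative functions to $\{f_k - \varphi\}_{k=1}^{\infty}$, using
\[ \liminf_{k\to\infty} (f_k - \varphi) = \liminf_{k\to\infty} (f_k) - \varphi, \]
so $\int_E (f_k - \varphi) = \int_E f_k - \int_E \varphi$ implies
\[ \int_E \liminf_{k\to\infty} f_k - \int_E \varphi = \int_E \liminf_{k\to\infty} (f_k - \varphi) \le \liminf_{k\to\infty} \left( \int_E f_k - \int_E \varphi \right) = \liminf_{k\to\infty} \int_E f_k - \int_E \varphi. \]
Cancelling the $\int_E \varphi$ from both sides yields Fatou's Lemma for measurable $f_k$ bounded below by an integrable function.
Q.E.D.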
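Theorem 6.23 (Lebesgue's Dominated Convergence Theorem). If $\{f_k\}_{k=1}^{\infty}$ is a sequence of measurable functions on $E$, $\varphi \in L^1(E)$ is such that $|f_k(x)| \le \varphi(x)$ a.e. on $E$, and $f_k(x) \to f(x)$ a.e. on $E$, then
\[ \lim_{k\to\infty} \int_E f_k = \int_E f. \]
Proof. For a.e. $x \in E$, $-\varphi(x) \le f_k(x) \le \varphi(x)$, so $0 \le f_k(x) + \varphi(x) \le 2\varphi(x)$. Observe $2\varphi \in L^1(E)$, and
\[ f_k(x) + \varphi(x) \to f(x) + \varphi(x) \quad \text{for a.e. } x \in E, \]
so we can apply Lebesgue's Dominated Convergence Theorem for nonnegative functions. Hence,
\[ \int_E f_k + \int_E \varphi = \int_E (f_k + \varphi) \to \int_E (f + \varphi) = \int_E f + \int_E \varphi. \]
Thus, $\lim_{k\to\infty} \int_E f_k = \int_E f$.
Q.E.D.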
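A small numerical experiment may help to see why the dominating function matters. The sketch below is only an illustration under stated assumptions: the integrals are approximated by a midpoint rule (`midpoint_integral` is a hypothetical helper), and the two sequences are standard textbook choices rather than examples taken from these notes. The sequence $f_k(x) = x^k$ on $[0,1]$ is dominated by the integrable function $1$ and its integrals tend to $\int_0^1 0 = 0$; the sequence $g_k = k\,\chi_{(0,1/k)}$ also tends to $0$ a.e., but admits no integrable dominating function, and its integrals stay equal to $1$.

```python
def midpoint_integral(f, a, b, n=20_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


ks = [1, 10, 100]

# Dominated case: |x**k| <= 1 on [0, 1], and x**k -> 0 for every x in [0, 1).
dominated = [midpoint_integral(lambda x, k=k: x ** k, 0.0, 1.0) for k in ks]

# Undominated case: k * chi_(0, 1/k) -> 0 a.e., yet every integral equals 1.
escaping = [
    midpoint_integral(lambda x, k=k: float(k) if 0.0 < x < 1.0 / k else 0.0, 0.0, 1.0)
    for k in ks
]

for k, d, e in zip(ks, dominated, escaping):
    print(k, d, e)
```

In the first column the integrals follow the exact values $1/(k+1)$; in the second they do not converge to the integral of the limit, so the hypothesis $|f_k| \le \varphi \in L^1(E)$ cannot simply be dropped.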
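6.4 The Distribution Function and $L^p$-spaces
For measurable $f : E \to \mathbb{R}$, we define $\omega$ for every $\alpha \in \mathbb{R}$ in the following way:
\[ \omega_{f,E}(\alpha) = \omega_f(\alpha) = \omega(\alpha) = |\{ x \in E : f(x) > \alpha \}|. \]
Thus, $\omega : \mathbb{R} \to [0, \infty]$. If $|E| < \infty$, then $\omega : \mathbb{R} \to [0, \infty)$.
Let us review some elementary facts regarding $\omega$.
(i) If $\alpha \le \beta$, then $\omega(\alpha) \ge \omega(\beta)$, i.e., $\omega$ is a decreasing function;
(ii) if $\tilde{f} = f$ a.e. on $E$, then $\omega_{\tilde{f}} = \omega_f$.
As $\alpha \to \infty$, $\{f > \alpha\} \searrow \{f = \infty\}$, which (when $|E| < \infty$) implies $\lim_{\alpha\to\infty} \omega(\alpha) = |\{f = \infty\}|$. It is convenient to assume $|\{f = \infty\}| = 0$, which holds if and only if $\lim_{\alpha\to\infty} \omega(\alpha) = 0$. Also note that as $\alpha \to -\infty$, $\lim_{\alpha\to-\infty} \omega(\alpha) = |\{f > -\infty\}|$. Similar to before, it is convenient to assume $|\{f = -\infty\}| = 0$; thus $\lim_{\alpha\to-\infty} \omega(\alpha) = |E|$.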
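Lemma 6.4. If $\alpha < \beta$, then
\[ \omega(\alpha) - \omega(\beta) = |\{ \alpha < f(x) \le \beta \}|. \]
Proof.
\begin{align*}
\omega(\alpha) - \omega(\beta) &= |\{f > \alpha\}| - |\{f > \beta\}| \\
&= |\{f > \beta\} \cup \{\beta \ge f > \alpha\}| - |\{f > \beta\}| \\
&= |\{f > \beta\}| + |\{\beta \ge f > \alpha\}| - |\{f > \beta\}| \\
&= |\{\beta \ge f > \alpha\}|.
\end{align*}
Q.E.D.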
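Lemma 6.5. For every $\alpha \in \mathbb{R}$,
(i) $\omega$ is continuous from the right: $\omega(\alpha) = \lim_{\varepsilon \to 0^+} \omega(\alpha + \varepsilon)$;
(ii) $\lim_{\varepsilon \to 0^+} \omega(\alpha - \varepsilon) = |\{f \ge \alpha\}|$;
(iii) $\omega$ is continuous at $\alpha$ if and only if $|\{f = \alpha\}| = 0$.
Proof. For any sequence $\varepsilon_k \to 0^+$, $\{f > \alpha + \varepsilon_k\}$ increases to $\{f > \alpha\}$; thus $\omega(\alpha + \varepsilon_k)$ increases to $\omega(\alpha)$. On the other hand, $\{f > \alpha - \varepsilon_k\}$ decreases to $\{f \ge \alpha\}$, which implies $\omega(\alpha - \varepsilon_k)$ decreases to $|\{f \ge \alpha\}|$. Note that $\omega$ is constant on some open interval $(\alpha, \beta)$ if and only if $f(x) \notin (\alpha, \beta)$ a.e.
Q.E.D.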
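Example 6.7. We will show that it is necessary to take the set $E$ with finite measure. Take $E = \mathbb{R}$ and $f(x) = -e^{x}$. Then $\omega(\alpha) = 0$ for all $\alpha \ge 0$ while $\omega(\alpha) = \infty$ for any $\alpha < 0$. In particular,
\[ \omega(0^-) = \infty \ne 0 = |\{f \ge 0\}|. \]
Proof. Let $\varepsilon_k \to 0^+$; then
\[ \omega(\alpha - \varepsilon_k) = |\{f > \alpha - \varepsilon_k\}|, \]
and
\[ \bigcap_{k=1}^{\infty} \{x : f(x) > \alpha - \varepsilon_k\} = \{x : f(x) \ge \alpha\}. \]
Q.E.D.
To get the convergence of the measures, we need $|E| < \infty$.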
Proposition 6.3. $\omega$ is continuous at $\alpha$ if and only if $|\{f = \alpha\}| = 0$.
Definition 6.7. We say $f$ and $g$ are equimeasurable or equidistributed on $E$ if
\[ \omega_{f,E}(\alpha) = \omega_{g,E}(\alpha) \]
for all $\alpha \in \mathbb{R}$. This will imply that $\int_E \varphi(f) = \int_E \varphi(g)$ for all continuous $\varphi$.
There is a natural relation between the Lebesgue integral of $f$ and the Riemann-Stieltjes integral of $\omega_f$.
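Theorem 6.24. Suppose $|E| < \infty$, $-\infty < a < b < \infty$, and $f$ is measurable with $a < f(x) \le b$ a.e. in $E$. Then
\[ \int_E f = -\int_a^b \alpha \, d\omega_{f,E}(\alpha). \]
Proof. Since $f$ is bounded and $|E| < \infty$, $\int_E f$ is defined and
\[ \left| \int_E f \right| \le \operatorname*{ess\,sup}_{x \in E} |f| \cdot |E| < \infty, \]
thus $f \in L^1(E)$. Note that the integrand $\alpha$ is continuous on $[a, b]$ and $\omega_{f,E}$ is a function of bounded variation since it is decreasing and bounded. Thus
\[ \int_a^b \alpha \, d\omega_{f,E}(\alpha) \]
is defined as a Riemann-Stieltjes integral.
Let
\[ \Gamma = \{ a = \alpha_0 < \alpha_1 < \cdots < \alpha_k = b \} \]
be a partition of $[a, b]$ (the range). For each $1 \le j \le k$, define
\[ E_j = \{ x \in E : \alpha_{j-1} < f(x) \le \alpha_j \}. \]
Then $E = \bigcup_{j=1}^{k} E_j$ (up to a set of measure zero). The pairwise disjointness implies
\[ \int_E f = \sum_{j=1}^{k} \int_{E_j} f. \]
Note that
\[ \int_{E_j} \alpha_{j-1} \le \int_{E_j} f \le \int_{E_j} \alpha_j, \]
that is,
\[ \alpha_{j-1} |E_j| \le \int_{E_j} f \le \alpha_j |E_j|. \]
Thus,
\[ \sum_{j=1}^{k} \alpha_{j-1} |E_j| \le \int_E f \le \sum_{j=1}^{k} \alpha_j |E_j|. \]
Observe that
\[ |E_j| = |\{ \alpha_{j-1} < f(x) \le \alpha_j \}| = \omega(\alpha_{j-1}) - \omega(\alpha_j) = -[\omega(\alpha_j) - \omega(\alpha_{j-1})]. \]
Inserting this into the left-hand side and the right-hand side, we get two Riemann-Stieltjes sums for $-\int_a^b \alpha \, d\omega(\alpha)$. Letting $|\Gamma| \to 0$, the left-hand side and the right-hand side approach $-\int_a^b \alpha \, d\omega(\alpha)$, forcing
\[ \int_E f = -\int_a^b \alpha \, d\omega(\alpha). \]
Q.E.D.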
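In fact, the next theorem shows we can relax the boundedness assumption.
Theorem 6.25. Let $f : E \to \mathbb{R}$ be measurable. For $-\infty < a < b < \infty$, let $E_{ab} = \{x \in E : a < f(x) \le b\}$. Then
\[ \int_{E_{ab}} f = -\int_a^b \alpha \, d\omega_{f,E}(\alpha). \]
Theorem 6.26. Let $f : E \to \mathbb{R}$ be measurable. If either of
\[ \int_E f \quad \text{or} \quad \int_{-\infty}^{\infty} \alpha \, d\omega_{f,E}(\alpha) \]
exists and is finite, then the same is true for the other and
\[ \int_E f = -\int_{-\infty}^{\infty} \alpha \, d\omega(\alpha). \]
Q.E.D.
Corollary 6.4. If $f$ and $g$ are equimeasurable on $E$, then $f \in L^1(E)$ if and only if $g \in L^1(E)$, and
\[ \int_E f = \int_E g. \]
Proof.
\[ \int_E f = -\int_{-\infty}^{\infty} \alpha \, d\omega_{f,E}(\alpha) = -\int_{-\infty}^{\infty} \alpha \, d\omega_{g,E}(\alpha) = \int_E g. \]
Q.E.D.
Theorem 6.27. Let $\varphi : \mathbb{R} \to \mathbb{R}$ be continuous and $f : E \to \mathbb{R}$ be measurable. Then
\[ \int_E \varphi(f) = -\int_{-\infty}^{\infty} \varphi(\alpha) \, d\omega_{f,E}(\alpha). \]
If $f \ge 0$, then
\[ \int_E \varphi(f) = -\int_0^{\infty} \varphi(\alpha) \, d\omega_{f,E}(\alpha). \]
For general $f$, replace $f$ by $|f|$. This gives us
\[ \int_E \varphi(|f|) = -\int_0^{\infty} \varphi(\alpha) \, d\omega_{|f|,E}(\alpha). \]
Example 6.8. Consider $\varphi(\alpha) = \alpha^p$ where $0 < p < \infty$ is fixed. Then
\[ \int_E |f(x)|^p = -\int_0^{\infty} \alpha^p \, d\omega_{|f|,E}(\alpha). \]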
Definition 6.8. We define the $L^p$-spaces as follows:
\[ L^p(E) = \left\{ f : \int_E |f|^p < \infty \right\}. \]
When $p = 1$, these are just the integrable functions on $E$. When $p = 2$, these are the square-integrable functions on $E$. Likewise, we can let $p = \infty$ and we get the essentially bounded functions on $E$.
For $1 \le p < \infty$ define the $L^p$-norm of $f \in L^p(E)$ to be
\[ \|f\|_{L^p(E)} = \|f\|_{L^p} = \|f\|_p = \left( \int_E |f|^p \right)^{1/p}. \]
For $p = \infty$, $\|f\|_{L^\infty(E)} = \operatorname*{ess\,sup}_{x \in E} |f|$.
Theorem 6.28. $(L^p(E), \|\cdot\|_p)$ is a normed linear space (and thus a metric space with $d_{L^p}(f, g) = \|f - g\|_{L^p}$).
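Theorem 6.29 (Tchebyshev's Inequality). For $p = 1$, nonnegative $f \in L^1(E)$, and $\alpha > 0$, we have
\[ \omega_{f,E}(\alpha) \le \frac{1}{\alpha} \int_E f. \]
When $0 < p < \infty$,
\[ \omega_{|f|,E}(\alpha) \le \alpha^{-p} \int_E |f|^p. \]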
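Note 6.1. For $f \ge 0$, an estimate of the form $\omega_f(\alpha) \le \dfrac{C}{\alpha^p}$ does not imply $f \in L^p(E)$.
Example 6.9. Take $E = \mathbb{R}$ and $f(x) = \dfrac{1}{|x|}$. Then
\[ |\{x \in \mathbb{R} : f(x) > \alpha\}| = \left| \left\{ \frac{1}{|x|} > \alpha \right\} \right| = \left| \left\{ |x| < \frac{1}{\alpha} \right\} \right| = \frac{2}{\alpha}, \]
but
\[ \int_{\mathbb{R}} \frac{1}{|x|} = \infty. \]
Even on the bounded set $E = [-1, 1]$ the function $f \notin L^p(E)$ for $p = 1$.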
Definition 6.9. We say that $f$ is in weak-$L^p$ if there exists $C < \infty$ such that
\[ \omega_{|f|,E}(\alpha) \le \frac{C}{\alpha^p}, \quad 0 < \alpha < \infty. \]
The notation for weak-$L^p$ is $L^p_{\text{weak}} = L^p_w$.
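Let $f \ge 0$ and $E_\alpha = \{x \in E : f(x) > \alpha\}$. Note that $f > \alpha$ on $E_\alpha$ implies $f > \alpha \chi_{E_\alpha}$ on $E_\alpha$; thus $f^p > \alpha^p \chi_{E_\alpha}$ there, which implies
\[ \int_{E_\alpha} f^p \ge \alpha^p |E_\alpha|, \]
i.e.,
\[ |\{x : f > \alpha\}| \le \frac{1}{\alpha^p} \int_{E_\alpha} f^p \le \frac{1}{\alpha^p} \int_E f^p. \]
This then implies
\[ |\{x : |f| > \alpha\}| \le \frac{1}{\alpha^p} \int_{\{|f| > \alpha\}} |f|^p \le \frac{1}{\alpha^p} \int_E |f|^p. \]
These are the $L^p$ Tchebyshev inequalities. Thus, $\alpha^p \omega_f(\alpha) \le \int_E f^p < \infty$ if $f^p$ is integrable, that is, if $f \in L^p(E)$. Hence, $\alpha^p \omega(\alpha)$ is bounded on $(0, \infty)$.
Question: Is this a sufficient condition for $f \in L^p(E)$?
Answer: No, there are other necessary conditions.
Since $|E| < \infty$, $\alpha^p \omega(\alpha) = \alpha^p |E_\alpha| \le \alpha^p |E| \to 0$ as $\alpha \to 0^+$. We can also show $\alpha^p \omega(\alpha) \to 0$ as $\alpha \to \infty$; by the inequality above, it suffices to show $\int_{E_\alpha} f^p \to 0$ as $\alpha \to \infty$.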
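Lemma 6.6. If $f \in L^p(E)$, then
\[ \lim_{\alpha \to \infty} \alpha^p \omega(\alpha) = 0. \]
Proof. We will do this by using Tchebyshev. It suffices to show for every $\{\alpha_k\}_{k=1}^{\infty} \subset (0, \infty)$ with $\alpha_k \to \infty$ as $k \to \infty$ that $\int_{E_{\alpha_k}} f^p \to 0$. Since $\int_E f^p < \infty$, $f$ is finite a.e. Thus, let
\[ f_k(x) = \begin{cases} f(x) & : x \in E_{\alpha_k} \\ 0 & : x \in E \setminus E_{\alpha_k}. \end{cases} \]
Now, clearly
\[ \lim_{k\to\infty} f_k(x) = 0 \text{ a.e.} \]
Also, $0 \le (f_k(x))^p \le (f(x))^p$. Since $f \in L^p(E)$, the Lebesgue Dominated Convergence Theorem implies
\[ \lim_{k\to\infty} \int_E f_k^p = \int_E \lim_{k\to\infty} f_k^p = 0. \]
Thus, we have
\[ \lim_{k\to\infty} \int_{E_{\alpha_k}} f^p = \lim_{k\to\infty} \int_E f_k^p = 0, \]
as desired.
Q.E.D.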
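Theorem 6.30. If $|E| < \infty$, $0 < p < \infty$, and $f \in L^p(E)$ is nonnegative, then $\alpha^p \omega_{f,E}(\alpha)$ is a bounded function on $(0, \infty)$ and tends to 0 as $\alpha \to 0^+$ or $\alpha \to \infty$.
Again, the conclusion is not sufficient for $f \in L^p(E)$.
Theorem 6.31. Let $f$ be a nonnegative function over a set $E$ with finite measure, and let $\int_E f^p < \infty$. Then
\[ \int_E f^p = -\int_0^{\infty} \alpha^p \, d\omega_{f,E}(\alpha) = p \int_0^{\infty} \alpha^{p-1} \omega(\alpha) \, d\alpha. \]
Proof. For $0 < a < b < \infty$,
\[ -\int_a^b \alpha^p \, d\omega(\alpha) = -\alpha^p \omega(\alpha) \Big|_a^b + \int_a^b \omega(\alpha) \, p \alpha^{p-1} \, d\alpha, \]
by integration by parts. Now, this is
\[ -b^p \omega(b) + a^p \omega(a) + \int_a^b \omega(\alpha) \, p \alpha^{p-1} \, d\alpha. \]
Now, from the lemma above, the first two terms tend to 0 as $b \to \infty$ and $a \to 0^+$, respectively. Thus, we have the desired result.
Q.E.D.
Unfortunately, this still is not enough in order to guarantee $f \in L^p(E)$.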
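The identity $\int_E f^p = p \int_0^\infty \alpha^{p-1} \omega(\alpha) \, d\alpha$ can be sanity-checked numerically. The sketch below is an illustration under assumptions that are not in the notes: $E = [0,1]$, $f(x) = x$, $p = 3$, the measure of $\{f > \alpha\}$ is estimated by counting midpoints of a uniform grid, and the helper names `distribution` and `midpoint_integral` are hypothetical. Since $f \le 1$, $\omega(\alpha) = 0$ for $\alpha \ge 1$, so the $\alpha$-integral only runs over $[0,1]$.

```python
def midpoint_integral(f, a, b, n=500):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def distribution(f, alpha, n=500):
    """Estimate omega(alpha) = |{x in [0, 1] : f(x) > alpha}| on a uniform grid."""
    h = 1.0 / n
    return h * sum(1 for i in range(n) if f((i + 0.5) * h) > alpha)


def f(x):
    return x


p = 3.0

# Left-hand side: the integral of f**p over E = [0, 1].
lhs = midpoint_integral(lambda x: f(x) ** p, 0.0, 1.0)

# Right-hand side: p * integral of alpha**(p-1) * omega(alpha) over [0, 1].
rhs = p * midpoint_integral(lambda a: a ** (p - 1) * distribution(f, a), 0.0, 1.0)

print(lhs, rhs)  # both should be close to 1 / (p + 1)
```

For this choice $\omega(\alpha) = 1 - \alpha$ on $[0,1]$, so both sides equal $1/(p+1)$ exactly; the grid estimate of $\omega$ introduces an error of the order of the grid spacing.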
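7 $L^p$ Space
In analysis, for $1 \le p \le \infty$, the $L^p$ spaces are basic examples of complete normed linear spaces, also known as Banach spaces. For $0 < p < 1$, $L^p$ is not normed but is a topological vector space.
Definition 7.1. Let $E$ be a measurable subset of $\mathbb{R}^n$, $0 < p < \infty$. We say that $f \in L^p(E)$ if $\int_E |f|^p < \infty$. The norm associated with $1 \le p < \infty$ is
\[ \|f\|_{L^p} = \left( \int_E |f|^p \right)^{1/p}. \]
For $p = \infty$,
\[ L^\infty(E) = \{ f : \operatorname*{ess\,sup}_{x \in E} |f| < \infty \}. \]
The natural norm on this set is to take the essential supremum.
Thus, $\omega_{|f|}(\operatorname*{ess\,sup}_{x \in E} |f|) = 0$, and $\omega_{|f|}(\alpha) = 0$ for all $\alpha \ge \operatorname{ess\,sup} |f|$. For every $\alpha < \operatorname{ess\,sup} |f|$, $\omega_{|f|}(\alpha) > 0$.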
Theorem 7.1. For $1 \le p \le \infty$, $(L^p(E), \|\cdot\|_{L^p})$ is a normed linear space; thus $L^p(E)$ is a vector space with respect to pointwise linear combinations.
Recall that a norm must vanish only at the zero element, so strictly speaking, we need to replace $L^p(E)$ by $L^p(E)/Z(E)$, where $Z(E)$ is the subspace of functions equal to 0 a.e.
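Theorem 7.2. If $|E| < \infty$ and $f \in L^p(E)$ for all $p \ge p_0$ for some fixed $p_0$, then
\[ \lim_{p\to\infty} \|f\|_{L^p} = \|f\|_{L^\infty}. \]
Proof. Let $M = \|f\|_{L^\infty} = \operatorname{ess\,sup} |f|$. For any $\tilde{M} < M$, we have that
\[ A = \{ x \in E : |f(x)| > \tilde{M} \} \]
has the property that $|A| > 0$. Then,
\[ \|f\|_{L^p(E)} \ge \|f\|_{L^p(A)} = \left( \int_A |f|^p \right)^{1/p} \ge \tilde{M} |A|^{1/p}. \]
As $p \to \infty$, $|A|^{1/p} \to 1$, so we have $\liminf_{p\to\infty} \|f\|_{L^p} \ge \tilde{M}$. Since $\tilde{M} < M$ was arbitrary, $\liminf_{p\to\infty} \|f\|_{L^p} \ge M$.
On the other hand, $|f| \le M$ a.e. on $E$, which implies
\[ \|f\|_{L^p} = \left( \int_E |f|^p \right)^{1/p} \le \left( \int_E M^p \right)^{1/p} = M |E|^{1/p}, \]
which clearly converges to $M$ as $p \to \infty$. Thus,
\[ \limsup_{p\to\infty} \|f\|_{L^p} \le M. \]
Thus the limit exists as $p \to \infty$, and the common value is $\|f\|_{L^\infty}$.
Q.E.D.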
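To see the convergence $\|f\|_{L^p} \to \|f\|_{L^\infty}$ concretely, here is a minimal numerical sketch under assumptions that are not part of the notes: $E = [0,1]$, $f(x) = x$ (so $\|f\|_{L^\infty} = 1$), and a midpoint rule for the integral; the function name `lp_norm_of_identity` is hypothetical. For this $f$ the exact value is $\|f\|_{L^p} = (p+1)^{-1/p}$.

```python
def lp_norm_of_identity(p, n=20_000):
    """Midpoint-rule approximation of the L^p norm of f(x) = x on [0, 1]."""
    h = 1.0 / n
    integral = h * sum(((i + 0.5) * h) ** p for i in range(n))
    return integral ** (1.0 / p)


exponents = [1, 2, 10, 100, 1000]
norms = [lp_norm_of_identity(p) for p in exponents]
exact = [(1.0 / (p + 1)) ** (1.0 / p) for p in exponents]

for p, approx, value in zip(exponents, norms, exact):
    print(p, approx, value)  # the norms increase towards ess sup |f| = 1
```

The slow approach to $1$ reflects the factor $|A|^{1/p}$ in the proof: the set where $f$ is close to its essential supremum is small, and its measure is only washed out by the $1/p$ power for large $p$.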
Observe that most inequalities are trivial if $|E| = 0$, so we will always assume $|E| > 0$.
Suppose $E_2 \subset E$. Then if $f \in L^p(E)$, $f|_{E_2} \in L^p(E_2)$, and $\|f\|_{L^p(E_2)} \le \|f\|_{L^p(E)}$. Thus note that the operator $R_{E,E_2}$ which sends $f$ to $f|_{E_2}$ is a bounded linear operator with $\|R_{E,E_2}\| = 1$.
Theorem 7.3.
(i) Let $0 < p_2 < p_1 \le \infty$ with $|E| < \infty$. Then $f \in L^{p_1}(E)$ implies $f \in L^{p_2}(E)$. Hence, there is a natural inclusion $L^{p_1}(E) \subset L^{p_2}(E)$, and so the $L^p$-spaces shrink as $p$ increases.
(ii) If $1 \le p_2 < p_1 \le \infty$, then the inclusion above is a bounded linear operator.
We will prove the first part now, saving the proof for the second part until after more estimates have been established.
Proof. We treat $p_1 < \infty$; when $p_1 = \infty$ the claim is immediate, since $|f|^{p_2} \le \|f\|_{L^\infty}^{p_2}$ a.e. and $|E| < \infty$. Decompose
\[ E = E_1 \cup E_2 = \{x \in E : |f(x)| \le 1\} \cup \{x \in E : |f(x)| > 1\} \]
into disjoint measurable subsets.
First, notice that when $x \in E_1$, $|f|^{p_2} \le 1$, so $\int_{E_1} |f|^{p_2} \le |E_1| \le |E| < \infty$. Next, use the fact that for every $x \in E_2$, $|f(x)|^p$ is increasing in $p$ for $0 < p < \infty$. Thus,
\[ \int_{E_2} |f|^{p_2} \le \int_{E_2} |f|^{p_1} \le \int_E |f|^{p_1}. \]
Hence,
\[ \int_E |f|^{p_2} \le |E| + \int_E |f|^{p_1} < \infty. \]
This follows from the fact that $|E| < \infty$ and $f \in L^{p_1}(E)$.
Q.E.D.
Example 7.1. Note that this is not true in general if $|E| = \infty$. Take $E = \mathbb{R}$ and let $p_2 < p_1$. Then neither containment holds. To see this, pick $q \in (p_2, p_1)$ and note that
\[ \frac{p_2}{q} < 1 < \frac{p_1}{q}. \]
Let
\[ f(x) = |x|^{-1/q} \chi_{\{|x| \le 1\}}. \]
Thus $f \in L^{p_2} \setminus L^{p_1}$.
On the other hand, let
\[ g(x) = (1 + |x|)^{-1/q}. \]
Then $g \in L^{p_1} \setminus L^{p_2}$.
7.1 Some Inequalities
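Theorem 7.4 (Young's Inequality). Let $\varphi : [0, \infty) \to [0, \infty)$ be continuous, strictly increasing, with $\varphi(0) = 0$. Let $\psi = \varphi^{-1}$, which exists since $\varphi$ is strictly increasing. Then, for all $a, b \in (0, \infty)$, we have
\[ ab \le \int_0^a \varphi(x) \, dx + \int_0^b \psi(y) \, dy, \]
with equality if and only if $\varphi(a) = b$.
Example 7.2. Take $\varphi(x) = x^{\alpha}$, $0 < \alpha < \infty$, so that $\psi(y) = y^{1/\alpha}$. Young's Inequality implies
\[ ab \le \int_0^a x^{\alpha} \, dx + \int_0^b y^{1/\alpha} \, dy. \]
Integration yields
\[ ab \le \frac{a^{\alpha+1}}{\alpha+1} + \frac{b^{\frac{1}{\alpha}+1}}{\frac{1}{\alpha}+1}. \]
Let $p = \alpha + 1$ so $1 < p < \infty$. Note that $\frac{1}{\alpha} + 1 = p' = \frac{p}{p-1}$. Thus, $\frac{1}{p} + \frac{1}{p'} = 1$, and hence $p \mapsto p'$ maps $(1, \infty)$ back into itself (called an involution), and we can check $(p')' = p$. Note that $p = p'$ if and only if $p = p' = 2$, and we have that $1 < p < 2$ implies $p' \in (2, \infty)$, and vice versa.
In this special case, Young's inequality becomes
\[ ab \le \frac{a^p}{p} + \frac{b^{p'}}{p'}, \]
with $0 < a$, $0 < b$, $1 < p < \infty$, $1 < p' < \infty$, and $\frac{1}{p} + \frac{1}{p'} = 1$.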
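The special case $ab \le a^p/p + b^{p'}/p'$ is easy to probe numerically. The following sketch is only an illustration: the sample values of $a$, $b$, $p$ and the helper names `conjugate` and `young_gap` are arbitrary choices, not taken from the notes. It evaluates the gap between the two sides on a small grid and checks the equality case $b = \varphi(a) = a^{p-1}$.

```python
import itertools


def conjugate(p):
    """Conjugate exponent p' defined by 1/p + 1/p' = 1, for 1 < p < infinity."""
    return p / (p - 1.0)


def young_gap(a, b, p):
    """Right-hand side minus left-hand side of ab <= a**p / p + b**p' / p'."""
    q = conjugate(p)
    return a ** p / p + b ** q / q - a * b


values = [0.1, 0.5, 1.0, 2.0, 7.5]
exponents = [1.1, 1.5, 2.0, 3.0, 10.0]

gaps = [young_gap(a, b, p) for a, b, p in itertools.product(values, values, exponents)]
smallest_gap = min(gaps)

# Equality case: phi(x) = x**(p - 1), so b = phi(a) should give a zero gap.
a, p = 2.0, 3.0
equality_gap = young_gap(a, a ** (p - 1.0), p)

print(smallest_gap, equality_gap)
```

The gap is nonnegative on every sample and vanishes (up to rounding) exactly when $b = a^{p-1}$, which matches the equality condition $\varphi(a) = b$ in Theorem 7.4.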
Theorem 7.5 (Hölder's Inequality). If $1 \le p \le \infty$ and $f$ and $g$ are measurable functions on $E$, then
\[ \int_E |fg| \le \|f\|_{L^p(E)} \|g\|_{L^{p'}(E)}, \]
where we define $1' = \infty$ and $\infty' = 1$.
When $p = p' = 2$, this is known as the Cauchy-Schwarz Inequality.
Proof. If $\|f\|_{L^p(E)} = \infty$ or $\|g\|_{L^{p'}(E)} = \infty$, then there is nothing to prove, so assume $\|f\|_{L^p(E)}, \|g\|_{L^{p'}(E)} < \infty$. Likewise, the result follows trivially if $\|f\|_{L^p(E)} = 0$ or $\|g\|_{L^{p'}(E)} = 0$, as then $f = 0$ a.e. or $g = 0$ a.e., and so $|fg| = 0$ a.e., so the inequality effectively says $0 \le 0$.
If $p = 1$ and $p' = \infty$, then $|g(x)| \le \|g\|_{L^\infty(E)} = \operatorname*{ess\,sup}_{x \in E} |g(x)|$ a.e. Thus
\[ |f(x) g(x)| \le \|g\|_{L^\infty(E)} |f(x)| \quad \text{a.e.}, \]
and so
\[ \|fg\|_{L^1(E)} \le \|g\|_{L^\infty(E)} \|f\|_{L^1(E)}, \]
as desired. Obviously if we switch the roles of $p$ and $p'$ we can form a similar argument showing the claim is true for that case.
Now, to prove Hölder's Inequality for $p, p' \in (1, \infty)$, it suffices to do it under the assumption that $\|f\|_{L^p(E)} = \|g\|_{L^{p'}(E)} = 1$. This is clear if we consider
\[ \tilde{f}(x) = \frac{f(x)}{\|f\|_{L^p}}, \]
which implies
\[ \|\tilde{f}\|_{L^p} = \frac{1}{\|f\|_{L^p}} \|f\|_{L^p} = 1, \]
and similarly for $\tilde{g}$. Thus, assuming the claim to be true for normalized functions for the moment, we have
\[ \int_E |\tilde{f} \tilde{g}| \le \|\tilde{f}\|_{L^p} \|\tilde{g}\|_{L^{p'}} = 1. \]
Now, using the definition of $\tilde{f}$ and $\tilde{g}$, we have
\[ \int_E |fg| \le \|f\|_{L^p} \|g\|_{L^{p'}}. \]
Thus, we need only prove the claim for normalized functions. Using Young's Inequality, we have
\[ |f(x) g(x)| \le \frac{|f(x)|^p}{p} + \frac{|g(x)|^{p'}}{p'}, \]
so that by the monotonicity and linearity of the Lebesgue integral, we have
\[ \int_E |fg| \le \frac{1}{p} \int_E |f|^p + \frac{1}{p'} \int_E |g|^{p'} = \frac{1}{p} + \frac{1}{p'} = 1 \]
since
\[ \int_E |f|^p = \left( \left( \int_E |f|^p \right)^{1/p} \right)^p = \|f\|_p^p = 1. \]
Clearly a similar argument holds for $g$. Thus the claim has been proven.
Q.E.D.
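As a quick check of Hölder's Inequality, the sketch below compares both sides for two sample functions on $E = [0,1]$. Everything specific here is an assumption made for illustration: the functions $f(x) = \sin(7x) + 0.5$ and $g(x) = e^x$, the exponents, and the helper names `lp_norm` and `holder_sides`. The integrals are replaced by midpoint sums on a uniform grid; such weighted sums themselves satisfy Hölder's Inequality, so the comparison is meaningful at the discrete level.

```python
import math

n = 2000
h = 1.0 / n
xs = [(i + 0.5) * h for i in range(n)]

# Sample values of two (hypothetical) test functions on E = [0, 1].
f_values = [math.sin(7.0 * x) + 0.5 for x in xs]
g_values = [math.exp(x) for x in xs]


def lp_norm(values, p):
    """Midpoint-sum approximation of the L^p norm on [0, 1], 1 <= p < infinity."""
    return (h * sum(abs(v) ** p for v in values)) ** (1.0 / p)


def holder_sides(p):
    """Return (integral of |fg|, ||f||_p * ||g||_p') for the conjugate pair p, p'."""
    q = p / (p - 1.0)
    lhs = h * sum(abs(a * b) for a, b in zip(f_values, g_values))
    rhs = lp_norm(f_values, p) * lp_norm(g_values, q)
    return lhs, rhs


checks = {p: holder_sides(p) for p in (1.5, 2.0, 4.0)}
for p, (lhs, rhs) in checks.items():
    print(p, lhs, rhs)  # lhs should never exceed rhs
```

The case $p = 2$ is the Cauchy-Schwarz Inequality; the left-hand side does not depend on $p$, while the right-hand side changes with the conjugate pair.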
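Example 7.3. As an application of Hölder's Inequality, we can now prove the second claim from the theorem above.
Proof. Recall we want to show that $|E| < \infty$ and $p_2 < p_1$ imply $L^{p_1}(E) \subset L^{p_2}(E)$, and moreover that the inclusion $L^{p_1}(E) \subset L^{p_2}(E)$ is a bounded linear operator. With $E_1 = \{|f| \le 1\}$ and $E_2 = \{|f| > 1\}$ as before,
\[ \int_E |f|^{p_2} = \int_{E_1} |f|^{p_2} + \int_{E_2} |f|^{p_2}. \]
Now,
\[ \int_{E_2} |f|^{p_2} \le 1 \cdot \int_E |f|^{p_1} \quad \text{and} \quad \int_{E_1} |f|^{p_2} = \int_{E_1} 1 \cdot |f|^{p_2}. \]
Assume $p_1 < \infty$ and pick $p > 1$ so that $p_2 p = p_1$. Then, by Hölder's Inequality on $E_1$,
\[ \int_{E_1} |f|^{p_2} \le \|1\|_{L^{p'}(E_1)} \, \big\| |f|^{p_2} \big\|_{L^p(E_1)} = \big\| |f|^{p_2} \big\|_{L^p(E_1)} |E_1|^{1/p'} \le \big\| |f|^{p_2} \big\|_{L^p(E_1)} |E|^{1/p'}. \]
Now,
\[ \big\| |f|^{p_2} \big\|_{L^p(E_1)} = \left( \int_{E_1} |f|^{p_2 p} \right)^{1/p} \le \|f\|_{p_1}^{p_1/p} = \|f\|_{p_1}^{p_2}. \]
The same Hölder estimate applied on all of $E$ in place of $E_1$ gives
\[ \int_E |f|^{p_2} \le |E|^{1/p'} \|f\|_{p_1}^{p_2}. \]
Now we take the $\frac{1}{p_2}$ power, and we get
\[ \|f\|_{p_2} \le C \|f\|_{p_1}, \qquad C = |E|^{\frac{1}{p_2} - \frac{1}{p_1}}, \]
which implies the inclusion is bounded. When $p_1 = \infty$, the bound $\|f\|_{p_2} \le |E|^{1/p_2} \|f\|_{\infty}$ follows directly from $|f| \le \|f\|_\infty$ a.e.
Q.E.D.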
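Theorem 7.6 (Minkowski's Inequality). Let $1 \le p \le \infty$, and let $f$ and $g$ be measurable functions finite a.e. on a set $E$. Then
\[ \|f + g\|_p \le \|f\|_p + \|g\|_p. \]
Proof. We'll handle this by cases. First, suppose $p = \infty$. Then, by definition, $|f(x)| \le \|f\|_\infty$ a.e. and $|g(x)| \le \|g\|_\infty$ a.e. The triangle inequality implies that
\[ |f(x) + g(x)| \le \|f\|_\infty + \|g\|_\infty \quad \text{a.e.} \]
Since $\|f\|_\infty + \|g\|_\infty$ is an upper bound a.e., it is at least as great as the least upper bound a.e., hence
\[ \operatorname*{ess\,sup}_{x \in E} |f(x) + g(x)| = \|f + g\|_\infty \le \|f\|_\infty + \|g\|_\infty, \]
as desired.
When $p = 1$, we know $|f(x) + g(x)| \le |f(x)| + |g(x)|$ by the triangle inequality. By the monotonicity of the integral, we have
\[ \int_E |f + g| = \|f + g\|_1 \le \int_E |f| + \int_E |g| = \|f\|_1 + \|g\|_1, \]
as desired.
When $1 < p < \infty$, we may assume $\|f\|_p$ and $\|g\|_p$ are finite, in which case $|f + g|^p \le 2^p (|f|^p + |g|^p)$ shows $\|f + g\|_p < \infty$. We can apply Hölder's inequality, with $p' = \frac{p}{p-1}$, in the following way:
\begin{align*}
\|f + g\|_p^p &= \int_E |f + g|^p = \int_E |f + g|^{p-1} |f + g| \\
&\le \int_E |f + g|^{p-1} |f| + \int_E |f + g|^{p-1} |g| \\
&\le \left( \int_E |f|^p \right)^{\frac{1}{p}} \left( \int_E |f + g|^{(p-1)\frac{p}{p-1}} \right)^{\frac{p-1}{p}} + \left( \int_E |g|^p \right)^{\frac{1}{p}} \left( \int_E |f + g|^{(p-1)\frac{p}{p-1}} \right)^{\frac{p-1}{p}} \\
&= \|f\|_p \|f + g\|_p^{p-1} + \|g\|_p \|f + g\|_p^{p-1}.
\end{align*}
Dividing both sides of the inequality by $\|f + g\|_p^{p-1}$, we attain the desired result, except in the case when $\|f + g\|_p^{p-1} = 0$, in which case $f + g = 0$ a.e. and so the statement is trivial.
Q.E.D.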
Remark 7.1. This implies $(L^p(E), \|\cdot\|_p)$ is a normed linear space.
Definition 7.2. We say $f_k \to f$ in $L^p$-norm if $\|f_k - f\|_p \to 0$ as $k \to \infty$; for $p < \infty$ this means $\int_E |f_k - f|^p \to 0$. Note that the definition is unambiguous for all $p \in (0, \infty]$.
Definition 7.3. A Banach space is a complete normed linear space.
Definition 7.4. A space $X$ is separable if there exists a countable dense subset.
Theorem 7.7. $L^p(E)$ is a Banach space for $1 \le p \le \infty$ with the $L^p$-norm. Furthermore, if $1 \le p < \infty$, $L^p(E)$ is separable.
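Example 7.4. The space $L^\infty(E)$ is not necessarily separable. Let $E = \mathbb{R}$; then for all $-\infty < a < b < \infty$, let $\chi_{a,b}(x) = \chi_{[a,b]}(x)$. Observe that
\[ \{ \chi_{a,b} : -\infty < a < b < \infty \} \subset L^\infty(E). \]
If $[a, b] \ne [c, d]$, then $\|\chi_{a,b} - \chi_{c,d}\|_\infty = 1$. This implies $d_{L^\infty}(\chi_{a,b}, \chi_{c,d}) = 1$, which implies $L^\infty$ is not separable (for if it were, then a countable dense set would approximate each of these uncountably many functions arbitrarily closely, but their mutual distance is always 1).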
HW: Chap. 8, # 2,3,6, and 8
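Theorem 7.8. Let $E \in \mathcal{M}(\mathbb{R}^n)$ and $1 \le p \le \infty$. Then $L^p(E)$ is a Banach space.
Proof. We will show that every Cauchy sequence converges. Suppose that $\{f_k\}_{k=1}^{\infty} \subset L^p(E)$ is such that for every $\varepsilon > 0$, there exists a $K_\varepsilon > 0$ such that $k, l > K_\varepsilon$ implies $\|f_k - f_l\|_p < \varepsilon$.
Suppose $p = \infty$. Then $|f_k(x) - f_l(x)| \le \|f_k - f_l\|_\infty$ for a.e. $x$. Let $Z_{kl} \subset E$ with $|Z_{kl}| = 0$ be such that $x \in E \setminus Z_{kl}$ implies $|f_k(x) - f_l(x)| \le \|f_k - f_l\|_\infty$. Let $Z = \bigcup_{k,l} Z_{kl}$. Since this is a countable union of sets with measure zero, $|Z| = 0$ as well. Thus, $x \in E \setminus Z$ implies $|f_k(x) - f_l(x)| \le \|f_k - f_l\|_\infty$ for all $k$ and $l$. For $x \in E \setminus Z$, $\{f_k(x)\}_{k=1}^{\infty} \subset \mathbb{R}$ is Cauchy, so there exists a limit of the $f_k(x)$, say $f(x)$. Note that $f(x)$ is not defined if $x \in Z$. Thus, $f_k(x) \to f(x)$ as $k \to \infty$ uniformly on $E \setminus Z$. Hence, $\|f_k - f\|_\infty \to 0$ as $k \to \infty$. Note that $f = (f - f_{k_0}) + f_{k_0}$ for $k_0 \in \mathbb{N}$ fixed. Since $f - f_{k_0} \in L^\infty(E)$ and $f_{k_0} \in L^\infty(E)$, we have that their sum, $f$, is also in $L^\infty(E)$, which implies $L^\infty(E)$ is complete.
Now suppose $1 \le p < \infty$ and we apply Tchebyshev's inequality. That is, for every $\varepsilon > 0$, we have
\[ |\{x \in E : |f_k(x) - f_l(x)| > \varepsilon\}| \le \frac{1}{\varepsilon^p} \int_E |f_k - f_l|^p, \]
which converges to zero as $k$ and $l$ tend to infinity since $\{f_k\}_{k=1}^{\infty}$ is Cauchy in $L^p(E)$. Thus, from the above inequality, $\{f_k\}_{k=1}^{\infty}$ is Cauchy with respect to convergence in measure. By Theorem 4.22, $\{f_k\}_{k=1}^{\infty}$ has a subsequence which converges a.e. on $E$, say
\[ \lim_{j\to\infty} f_{k_j}(x) = f(x) \quad \text{a.e. } x \in E \]
for some measurable $f(x)$.
Given $\varepsilon > 0$, there exists a $K_\varepsilon \in \mathbb{N}$ such that $k, l \ge K_\varepsilon$ implies $\|f_k - f_l\|_p < \varepsilon$. Take $k = k_j \ge K_\varepsilon$; then
\[ f_{k_j}(x) - f_l(x) \to f(x) - f_l(x) \quad \text{a.e. as } j \to \infty. \]
Now, apply Fatou's lemma with $j \to \infty$ to get
\[ \int_E |f - f_l|^p = \int_E \Big| \lim_{j\to\infty} f_{k_j} - f_l \Big|^p = \int_E \lim_{j\to\infty} |f_{k_j} - f_l|^p \le \liminf_{j\to\infty} \int_E |f_{k_j} - f_l|^p \le \varepsilon^p. \]
This implies that $f - f_l \to 0$ in $L^p$ as $l \to \infty$, i.e., $f_l \to f$ in $L^p$-norm as $l \to \infty$. For fixed $l_0$, decompose $f = (f - f_{l_0}) + f_{l_0}$, which is in $L^p(E)$ since $L^p(E)$ is closed under addition. Thus, $L^p(E)$ is complete.
Q.E.D.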
8 Repeated (Iterated) Integration
For convenience, we will denote $\mathbb{R}^n \times \mathbb{R}^m$ by $\mathbb{R}^{n+m}$, where we will naturally let the $x$ coordinates range over $\mathbb{R}^n$ and the $y$ coordinates range over $\mathbb{R}^m$. Let $I \subset \mathbb{R}^{n+m}$ be an interval, thus $I = I_1 \times I_2$, where $I_1 = \prod_{i=1}^{n} [a_i, b_i]$ and $I_2 = \prod_{j=1}^{m} [c_j, d_j]$, where we allow $a_i, c_j = -\infty$ and $b_i, d_j = \infty$. Now, let $f : I \to \mathbb{R}$ be measurable. When calculating
\[ \iint_I f \, dx \, dy = \int_I f(x, y), \]
the $(n+m)$-dimensional Lebesgue integral, we would like to calculate it in terms of lower dimensional integrals, integrating first in $x$, then in $y$, or vice versa. Let, for $x \in I_1$ fixed,
\[ F(x) = \int_{I_2} f(x, y) \, dy. \]
Is it necessarily true that
\[ \iint_I f \, dx \, dy = \int_{I_1} F(x) \, dx? \]
To answer this question, we have Fubini's theorem, and we have a partial converse, known as Tonelli's theorem.
Theorem 8.1 (Fubini's Theorem). Let $f \in L^1(I_1 \times I_2)$. Then
(i) for a.e. $x \in I_1$, $f(x, \cdot)$ is a measurable, integrable function of $y \in I_2$;
(ii) if we set $F(x) = \int_{I_2} f(x, y) \, dy$, then $F \in L^1(I_1)$ and
\[ \int_{I_1} F(x) \, dx = \int_{I_1} \left( \int_{I_2} f(x, y) \, dy \right) dx = \iint_{I_1 \times I_2} f \, dx \, dy. \]
(iii) The above hold in the obvious senses if the roles of $x$ and $y$ are interchanged.
Theorem 8.2 (Tonelli's Theorem). Assume $f \ge 0$ and measurable on $I_1 \times I_2$. Then
(i) for a.e. $x \in I_1$, $f(x, \cdot)$ is measurable on $I_2$;
(ii) $F(x) = \int_{I_2} f(x, y) \, dy \ge 0$ is a measurable function of $x$ on $I_1$;
(iii) $\displaystyle \int_{I_1} F(x) \, dx = \int_{I_1} \left( \int_{I_2} f(x, y) \, dy \right) dx = \iint_{I_1 \times I_2} f \, dx \, dy$.
Corollary 8.1. If $f(x, y)$ is measurable and such that for a.e. $x \in I_1$, $f(x, \cdot) \in L^1(I_2)$ and
\[ \int_{I_1} \left( \int_{I_2} |f(x, y)| \, dy \right) dx < \infty, \]
then $f \in L^1(I_1 \times I_2)$ and
\[ \iint_{I_1 \times I_2} f \, dx \, dy = \int_{I_1} \left( \int_{I_2} f \, dy \right) dx. \]
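Example 8.1. Let $I_1 = I_2 = [0, 1]$. Divide the unit square into fourths, then take the upper right corner and divide it into fourths. Continue in this manner, and we get a sequence of squares from the bottom left corner to the top right corner, which we denote $\{J_k\}_{k=1}^{\infty}$, and each square has side length $\frac{1}{2^k}$, with the lower left vertex $(x^{(k)}, x^{(k)})$, where
\[ x^{(k)} = \sum_{j=0}^{k-2} \frac{1}{2^{j+1}}. \]
Note that $|J_k| = 2^{-k} \cdot 2^{-k} = 2^{-2k}$, which clearly converges to 0 as $k \to \infty$. Define $f$ to be 0 for $(x, y) \notin \bigcup_{k=1}^{\infty} J_k$. On each $J_k$, subdivide $J_k$ into four equal subsquares and let $f = \frac{1}{|J_k|}$ on the lower left and upper right subsquares and $f = -\frac{1}{|J_k|}$ on the other two (the sign pattern that the computation of $F$ below requires). Thus,
\[ |f| \big|_{J_k} = \frac{1}{|J_k|} = 2^{2k}. \]
For a.e. $x \in [0, 1]$ (i.e., for $x \ne 0, \frac{1}{4}, \frac{1}{2}, \ldots$), we have that $f(x, \cdot)$ is bounded (and thus integrable), with $F(x) = \int_{I_2} f(x, y) \, dy = 0$, $F \in L^1([0, 1])$, and $\int_{[0,1]} F \, dx = 0$. But $f \notin L^1([0, 1]^2)$ since
\[ \int_{J_k} |f| = \int_{J_k} \frac{1}{|J_k|} = 1, \]
which implies for every $k_0 \in \mathbb{N}$,
\[ \int_{[0,1]^2} |f| \ge \sum_{k=1}^{k_0} \iint_{J_k} |f| = \sum_{k=1}^{k_0} 1 = k_0. \]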
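The example can be explored with exact rational arithmetic. The sketch below is an illustration under explicit assumptions: on each square $J_k$ it uses the standard sign pattern ($+1/|J_k|$ on the lower left and upper right quarters, $-1/|J_k|$ on the other two), and the names `square`, `f`, `slice_integral`, and `abs_integral_on_square` are hypothetical helpers. Because $f$ is piecewise constant on dyadic rectangles, midpoint sums on a dyadic grid that resolves those rectangles are exact.

```python
from fractions import Fraction


def square(k):
    """Lower left corner coordinate and side length of J_k."""
    return 1 - Fraction(1, 2 ** (k - 1)), Fraction(1, 2 ** k)


def f(x, y):
    """Assumed sign pattern: +1/|J_k| on two diagonal quarters of J_k, -1/|J_k| on the others."""
    for k in range(1, 61):
        corner, side = square(k)
        if corner <= x < corner + side and corner <= y < corner + side:
            half = side / 2
            same_half = (x - corner < half) == (y - corner < half)
            return Fraction(4 ** k) if same_half else -Fraction(4 ** k)
    return Fraction(0)


def slice_integral(x, cells=2 ** 8):
    """Integral of f(x, .) over [0, 1]; exact while the grid resolves the quarters."""
    h = Fraction(1, cells)
    return sum(f(x, (i + Fraction(1, 2)) * h) for i in range(cells)) * h


def abs_integral_on_square(k):
    """Integral of |f| over J_k, using one sample point in each of its four quarters."""
    corner, side = square(k)
    points = (corner + side / 4, corner + 3 * side / 4)
    return sum(abs(f(x, y)) for x in points for y in points) * (side / 2) ** 2


slices = [slice_integral(Fraction(1, 3)), slice_integral(Fraction(2, 3)), slice_integral(Fraction(4, 5))]
partial_abs = sum(abs_integral_on_square(k) for k in range(1, 6))

print(slices, partial_abs)
```

Every sampled slice integrates to zero, so the iterated integral vanishes, while the integral of $|f|$ over the first $k_0$ squares equals $k_0$ and therefore grows without bound. This is consistent with Fubini's Theorem, whose hypothesis $f \in L^1(I_1 \times I_2)$ fails here.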
HW: pp. 96-97, #1, 2, 3, and 11
Proof. Here we prove Fubini's Theorem. We will say that a function has property $\mathcal{F}$ if the conclusions of Fubini's theorem hold. By extending $f$ to be zero for all $(x, y) \notin I_1 \times I_2$, we may assume that $I_1 = \mathbb{R}^n$ and $I_2 = \mathbb{R}^m$. Now we present two basic results concerning property $\mathcal{F}$.
Claim 8.1. If $\{f_j\}_{j=1}^{k} \subset L^1(\mathbb{R}^{n+m})$ satisfy $\mathcal{F}$, then so does any linear combination (with real-valued coefficients).
Claim 8.2. Suppose $\{f_j\}_{j=1}^{\infty}$ have property $\mathcal{F}$ and converge monotonically to $f \in L^1(\mathbb{R}^{n+m})$. Then $f$ has property $\mathcal{F}$.
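Proof. Without loss of generality, we can assume $f_j$ increases to $f$. Since each of these has property $\mathcal{F}$, for every $j$, $f_j(x, \cdot)$ is measurable on $\mathbb{R}^m$ for a.e. $x \in \mathbb{R}^n$. Let $Z_j$ be the set of measure zero such that for every $x \in \mathbb{R}^n \setminus Z_j$, $f_j(x, \cdot)$ is measurable and integrable on $\mathbb{R}^m$. Let $Z = \bigcup_{j=1}^{\infty} Z_j$ and observe that $|Z|_n = 0$. Hence, for $x \notin Z$ and for every $j \in \mathbb{N}$, $f_j(x, \cdot)$ is measurable and integrable.
Apply the Monotone Convergence Theorem to get
\[ \lim_{j\to\infty} \int_{\mathbb{R}^m} f_j(x, y) \, dy = \int_{\mathbb{R}^m} f(x, y) \, dy. \]
Let $F_j(x) = \int_{\mathbb{R}^m} f_j(x, y) \, dy$ and $F(x) = \int_{\mathbb{R}^m} f(x, y) \, dy$. Now we have that $F_j$ increases to $F$, and applying the monotone convergence theorem again yields
\[ \int_{\mathbb{R}^n} F_j \, dx \to \int_{\mathbb{R}^n} F \, dx, \]
where the convergence is increasing. By property $\mathcal{F}$, we have
\[ \int_{\mathbb{R}^n} \left( \int_{\mathbb{R}^m} f_j \, dy \right) dx = \iint_{\mathbb{R}^{n+m}} f_j \, dx \, dy, \]
and the right-hand side increases to $\iint_{\mathbb{R}^{n+m}} f \, dx \, dy < \infty$ by the monotone convergence theorem once more. Hence $\int_{\mathbb{R}^n} F \, dx = \iint_{\mathbb{R}^{n+m}} f \, dx \, dy$, which implies $f$ has property $\mathcal{F}$.
Q.E.D.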
Claim 8.3. If $E \subset \mathbb{R}^{n+m}$ is a $G_\delta$-set, say $E = \bigcap_{k=1}^{\infty} G_k$ with $G_k$ open and $|G_1| < \infty$, then $\chi_E$ has property $\mathcal{F}$.
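Proof. We begin from a base case and proceed to generalize it:
(i) Let $E = J_1 \times J_2$ with $J_1$ and $J_2$ open, bounded intervals in $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively, and $|E|_{n+m} = |J_1|_n |J_2|_m < \infty$. For every $x \in \mathbb{R}^n$,
\[ \chi_E(x, \cdot) = \chi_{J_1}(x) \chi_{J_2}(\cdot) = \begin{cases} \chi_{J_2}(\cdot) & : x \in J_1 \\ 0 & : x \notin J_1. \end{cases} \]
Note these are measurable functions. Now, since $\int_{\mathbb{R}^m} \chi_{J_2} \, dy = |J_2| < \infty$, the slices are in $L^1(\mathbb{R}^m) = L^1(dy)$. Define
\[ F(x) = \int_{\mathbb{R}^m} \chi_E(x, y) \, dy = |J_2| \chi_{J_1}(x), \]
which is in $L^1(dx)$, with
\[ \int_{\mathbb{R}^n} F \, dx = \int_{\mathbb{R}^n} \left( \int_{\mathbb{R}^m} \chi_E \, dy \right) dx = |J_2| \, |J_1| = |E|, \]
but we know
\[ |E| = \iint_{\mathbb{R}^{n+m}} \chi_E \, dx \, dy, \]
hence, $\chi_E$ has property $\mathcal{F}$.
(ii) Suppose $E \subset \partial(J_1 \times J_2)$ with $J_1$ and $J_2$ as above. Then $E$ is contained in a union of coordinate planes $\{x_i = a\}$ or $\{y_j = b\}$. Consider the first. Then
\[ \chi_E(x, \cdot) = \begin{cases} 0 & : x_i \ne a \\ \chi_F(\cdot) & : x_i = a, \text{ where } F \subset \mathbb{R}^m \text{ is contained in some interval.} \end{cases} \]
The second case occurs only on a set of measure zero ($dx$). The first case implies the sliced function has integral equal to zero.
In the second case, $E \subset \{(x, y) : y_j = b\}$, so $\chi_E(x, y) = 0$ unless $y_j = b$. For each $x$ the slice $\chi_E(x, \cdot)$ therefore vanishes off a set of measure zero ($dy$); it is measurable and integrable ($dy$), with
\[ F(x) = \int_{\mathbb{R}^m} \chi_E(x, y) \, dy = 0. \]
Thus, $F \in L^1(dx)$ and
\[ \int_{\mathbb{R}^n} F \, dx = 0 = \iint_{\mathbb{R}^{n+m}} \chi_E \, dx \, dy = |E|_{n+m}. \]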
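(iii) If $E$ is a bounded product of possibly half-open intervals, then
\[ \chi_E = \chi_{\operatorname{int}(E) \cup (E \cap \partial E)} = \chi_{\operatorname{int}(E)} + \chi_{E \cap \partial E}, \]
which satisfies $\mathcal{F}$ by (i), (ii), and linearity.
(iv) Let $E \subset \mathbb{R}^{n+m}$ be open with $|E| < \infty$. Then $E = \bigcup_{j=1}^{\infty} I_j$, where the $I_j$'s are disjoint, possibly half-open intervals. For every $k \in \mathbb{N}$, let $E_k = \bigcup_{j=1}^{k} I_j$; then $f_k = \chi_{E_k} = \sum_{j=1}^{k} \chi_{I_j}$ has property $\mathcal{F}$ by (iii) and Claim 8.1. As $k \to \infty$, $f_k$ increases to $\chi_E$, and since $\chi_E \in L^1(dx \, dy)$, $\chi_E$ has property $\mathcal{F}$ by monotonicity (Claim 8.2).
(v) Suppose $E = \bigcap_{k=1}^{\infty} G_k$, where the $G_k$ are open and $|G_1| < \infty$. Without loss of generality, we can assume the sequence of open sets is nested (replace $G_k$ by $G_1 \cap \cdots \cap G_k$), so we can assume $G_k$ decreases to $E$. But that implies $\chi_{G_k}$ decreases to $\chi_E$, and by (iv) and monotonicity, $\chi_E$ has property $\mathcal{F}$.
Q.E.D.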
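Claim 8.4. Let $Z \subset \mathbb{R}^{n+m}$ with $|Z| = 0$. Then $\chi_Z$ has property $\mathcal{F}$.
Proof. We can express $Z \subset H$ where $H$ is a $G_\delta$-set with $|H| = 0$.
Note that we can assume $Z$ is contained in some translate of $[0, 1]^{n+m}$. For if not, decompose $Z$ into the following disjoint union:
\[ Z = \bigcup_{\gamma \in \mathbb{Z}^{n+m}} \left( Z \cap ([0, 1)^{n+m} + \gamma) \right). \]
Now
\[ \chi_Z = \sum_{\gamma \in \mathbb{Z}^{n+m}} \chi_{Z \cap ([0,1)^{n+m} + \gamma)}, \]
which is an increasing limit of linear combinations of characteristic functions of translates of subsets of $[0, 1]^{n+m}$. Monotonicity implies we may assume $Z \subset [0, 1]^{n+m}$. Since $Z \subset H = \bigcap_{k=1}^{\infty} G_k$, we can assume each $G_k \subset [-\frac{1}{2}, \frac{3}{2}]^{n+m}$ so that $|G_1| < \infty$, which implies $\chi_H$ has property $\mathcal{F}$. Since $\iint \chi_H = |H| = 0$, for a.e. $x$ the slice $\chi_H(x, \cdot)$ vanishes a.e.; as $0 \le \chi_Z \le \chi_H$, the same holds for $\chi_Z(x, \cdot)$, hence $\chi_Z$ has property $\mathcal{F}$.
Q.E.D.
Now, let $E \subset \mathbb{R}^{n+m}$ be measurable with $|E| < \infty$; we claim $\chi_E$ has property $\mathcal{F}$. We know $E = H \setminus Z$ where $H$ is a $G_\delta$-set and $|Z| = 0$. Since $\chi_E = \chi_H - \chi_Z$, where $\chi_H$ and $\chi_Z$ have property $\mathcal{F}$, we know $\chi_E$ has property $\mathcal{F}$.
For the final step, let $f \in L^1(dx \, dy)$; then $f = f^+ - f^-$, with $f^+$ and $f^-$ in $L^1(dx \, dy)$. Thus, it suffices to show that $f$ has property $\mathcal{F}$ if $f \in L^1(dx \, dy)$ and $f$ is nonnegative. We know we can express $f$ as an increasing limit of $f_k$, with $f_k \ge 0$ being measurable simple functions. By monotonicity, it suffices to show each $f_k$ has property $\mathcal{F}$. Write
\[ f_k = \sum_{j=1}^{J_k} a_j^{(k)} \chi_{E_j^{(k)}} \]
with $a_j^{(k)} > 0$. Then
\[ \sum_{j=1}^{J_k} a_j^{(k)} \big| E_j^{(k)} \big| = \int f_k \le \int f < \infty, \]
hence $\big| E_j^{(k)} \big| < \infty$ for all $j, k$, and so $f_k$ has property $\mathcal{F}$ by the above.
Q.E.D.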
