Devinatz D. - Advanced Calculus (1968)

ADVANCED CALCULUS
ALLEN DEVINATZ
Northwestern University
HOLT, RINEHART AND WINSTON

New York
Montreal
Chicago
Toronto
San Francisco
London
Atlanta
Dallas
Copyright © 1968 by Holt, Rinehart and Winston, Inc.
All Rights Reserved

Library of Congress Catalog Card Number: 68-18409
2689453
Printed in the United States of America
1 2 3 4 5 6 7 8 9
PREFACE
The contents of this book represent a somewhat expanded version of
a one-year course that I have given from time to time since 1961.
Those taking the course have been mainly undergraduate and first
year graduate students concentrating in mathematics. Occasionally,
students from engineering and the physical sciences have taken the
course and have told me they enjoyed it. I recommend that students
who select a course such as this should generally have a little more
mathematical maturity than that afforded by the usual
freshman-sophomore courses in the calculus. One excellent way to gain
such maturity is through a beginning course in linear algebra, although
the contents of such a course are not a specific prerequisite for the
understanding of this book.
Section 1.1 on logic is to be read by the student. Of course, such a
brief introduction is not intended to teach the student the elements
of logic, but rather to make him aware of the formal processes involved
in mathematical reasoning. It may not be too well understood on the
first reading, but if the student will reread it several times during the
course, it probably will begin to appear more reasonable. The notation
of the propositional calculus is to be viewed as a concise shorthand for
mathematical statements. My experience has been that students learn

to use the notation in a reasonable way in a relatively short time and
with very little trouble.
For those instructors who do not wish to spend time on an extended
treatment of the real number system, I have arranged matters so that
they can begin the discussion of real numbers with Section 1.8. In that
section, I have given what amounts to a set of axioms for the real
number system, the more usual starting point for a beginning course in
analysis.
If, in going through the material, my peers should at times accuse
me of being pedantic, I plead guilty to the charge; my aim in doing

this has been deliberate. All too often students beginning the serious
study of mathematics get the idea that a vague or seemingly trivial
point should be waved away. I have tried to convey the idea to the novice
that he should be sure that he can really prove these seemingly trivial
points.
As far as the differential calculus is concerned, there is probably

not too much choice in the way one can proceed. As for the integral
calculus, I have chosen the more cumbersome and less general method
of Riemann-Darboux integration and Jordan content rather than one

of the more modern theories of measure and integration. Although
I do not feel that a historical approach to a subject is necessarily always
the best, in the case of integration my view is that a student cannot

fully appreciate or even fully understand the more modern theories
until he has seen the gradual and natural evolution of the ideas involved.
I make absolutely no claims to originality. I have no gimmicks or
special pedagogical devices as aids in understanding. Mathematics is

a difficult subject; I have tried to set down a small but important portion
of it in as straightforward, clean, and concise a way as I know how,
consistent with the level of student to whom it is addressed. Only the
readers can ultimately decide whether or not I have succeeded.
I am grateful to several friends for their help in preparing the
manuscript. I am deeply indebted to Sam Lachterman of St. Louis University.

He read the entire manuscript, pointed out a large, but finite, number
of errors, and showed me how to make several proofs in a shorter and
more elegant way. Jacob K. Goldhaber of the University of Maryland
read several of the chapters and gave me some excellent advice. Thanks
are also due to my former colleagues, Sebastian Koh and A. Edward
Nussbaum of Washington University. The former used a preliminary
version of the first five chapters in his class and made several
suggestions for improvement, while I had several helpful conversations with
the latter on the subject matter of the book. Above all I am grateful to
the various classes of students who endured varying versions of the
course.
Evanston, Illinois
March 1968
A. D.
CONTENTS
Preface

CHAPTER 1 | THE REAL NUMBER SYSTEM
1.1 Some Ideas about Logic
1.2 Sets
1.3 Relations and Functions
1.4 The Natural Numbers
1.5 The Integers and the Rationals
1.6 Countability
1.7 The Reals
1.8 A Review of the Real Number System and Sequences
1.9 Properties of the Reals

CHAPTER 2 | LIMITS
2.1 The Limit Concept and Continuity
2.2 The Heine-Borel Theorem and Uniform Continuity
2.3 Monotone Functions
2.4 Limit Superior and Limit Inferior

CHAPTER 3 | INFINITE SERIES
3.1 Series of Real Numbers
3.2 Convergence Tests
3.3 Decimal Expansions
3.4 Sequences and Series of Functions
3.5 Infinite Products

CHAPTER 4 | DIFFERENTIATION
4.1 The Derivative Concept
4.2 Differentiation Rules
4.3 Mean Value Theorems
4.4 Taylor's Remainder Formulas
4.5 Power Series
4.6 The Weierstrass Approximation Theorem

CHAPTER 5 | INTEGRATION
5.1 Riemann-Darboux Integrals
5.2 Properties and Existence of Riemann-Darboux Integrals
5.3 Improper Integrals
5.4 Riemann-Stieltjes Integrals
5.5 Functions of Bounded Variation and the Existence of
    Riemann-Stieltjes Integrals

CHAPTER 6 | HIGHER DIMENSIONAL SPACE
6.1 Real Vector Spaces
6.2 Euclidean Spaces
6.3 Topology in E^n
6.4 Continuous Functions
6.5 Linear Transformations
6.6 Determinants
6.7 Function Spaces

CHAPTER 7 | HIGHER DIMENSIONAL DIFFERENTIATION
7.1 Motivation
7.2 Directional Derivatives and Differentials
7.3 Differentiation Rules
7.4 Higher-Order Differentials and Taylor's Theorem
7.5 The Inverse and Implicit Function Theorems
7.6 Maxima and Minima

CHAPTER 8 | HIGHER DIMENSIONAL INTEGRATION
8.1 Riemann-Darboux Integrals
8.2 Jordan Content
8.3 Existence and Properties of Riemann-Darboux Integrals
8.4 Iterated Integration
8.5 The Transformation Theorem for Integrals

CHAPTER 9 | THE INTEGRATION OF DIFFERENTIAL FORMS
I. LINE INTEGRALS
9.1 Motivation and Definitions
9.2 The Length of a Curve
9.3 A Special Case of Stokes' Theorem
9.4 Closed and Exact Differentials
II. SURFACE INTEGRALS
9.5 Motivation and Definitions
9.6 The Algebra of Differential Forms
9.7 Closed and Exact Forms
9.8 Manifolds
9.9 Integration on Manifolds
9.10 Stokes' Theorem

Symbols
Index
2 I THE REAL NUMBER SYSTEM
new statements that are called true. In this sense mathematics is a

complicated game and truth has nothing to do with reality (whatever
that elusive thing is!) or various concepts of truth discussed by the
philosophers. Truth, for us, shall be something prescribed by a set of
rules.
To be somewhat more specific, a branch of mathematics is usually
constructed in the following way. A small number of statements are
written down which are called axioms and these are arbitrarily called
true and the letter 't' assigned to them. By means of a given rule,
from each true statement a new statement can be formed which is
called false and the letter 'f' is assigned to these. Then there are
various rules for assigning 't' or 'f' to new statements formed from
collections of statements that already have 't' or 'f' values attached to them.
This enlarges our collection of statements with 't' or 'f' attached to them.
We can then use our rules on this enlarged collection to get a possibly
still larger collection of statements having 't' or 'f' values attached to
them. We can then apply the rules again to get a possibly still larger
collection of statements having 't' or 'f' assigned to them, and so on.
It is always our hope that in starting from the given axioms and
applying our rules that we will not get a statement that has both letters
't' and 'f' attached to it. If we get a statement with both 't' and 'f'
attached to it, we say that our axioms are inconsistent. If this is never the
case, we say our axioms are consistent.
For a consistent set of axioms, those statements taking on the value
't' are called lemmas, propositions, theorems, and corollaries. It is not always
clear which true statements should bear which names. However,
current usage seems to suggest the following rules. A theorem is

an important true statement. A lemma is a true statement that is used
in constructing the proof of another true statement and usually does
not have wider applicability. A corollary is a true statement that is an
immediate consequence of a true statement. Finally, a proposition is
a true statement that is not a lemma or corollary but is not important
enough to be called a theorem. Many people also use the word
'scholium' to play the same role as the word 'proposition' or even

possibly to be a true statement that is not as important as a proposition.
We shall now give some rules for forming new statements from given
statements A and B that have values 't' or 'f'. That is to say, we shall
construct new statements containing the statements A or B or both and
give rules for assigning 't' or 'f' to the new statements. We shall do
this by means of a truth table, which will list the symbol 't' or 'f' to be
given to a new statement given the various combinations of 't' and 'f'
values that A and B can take on.
a. Negation -A
(To be read: not A.)

A    -A
t    f
f    t

b. Implication A => B
(To be read: A implies B, or if A then B, or B if A, or A only if
B, or A is a sufficient condition for B, or B is a necessary
condition for A.)

A    B    A => B
t    t    t
t    f    f
f    t    t
f    f    t

c. Conjunction A & B
(To be read: A and B.)

A    B    A & B
t    t    t
t    f    f
f    t    f
f    f    f

d. Disjunction A ∨ B
(To be read: A or B.)

A    B    A ∨ B
t    t    t
t    f    t
f    t    t
f    f    f
The preceding truth tables give a prescription for the use of the
symbols '-', '=>', '&', and '∨'. As such, we can loosely think of these
tables as giving a meaning to statements containing these symbols. We
shall try to explain this in more detail.
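As an informal aside, the four truth tables can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the values 't' and 'f' are modelled by Python's True and False, and the names neg, implies, conj, disj, letter, and table are chosen here and are not part of the text.

```python
from itertools import product

# Assumption of this sketch: 't' is modelled by True and 'f' by False.

def neg(a):
    """Negation -A."""
    return not a

def implies(a, b):
    """Implication A => B: 'f' only when A is 't' and B is 'f'."""
    return (not a) or b

def conj(a, b):
    """Conjunction A & B."""
    return a and b

def disj(a, b):
    """Disjunction A or B (inclusive)."""
    return a or b

def letter(value):
    """Translate True/False back into the letters 't' and 'f'."""
    return "t" if value else "f"

def table(connective):
    """Rows (A, B, value) of a two-place connective, in the order tt, tf, ft, ff."""
    return [
        (letter(a), letter(b), letter(connective(a, b)))
        for a, b in product([True, False], repeat=2)
    ]

for name, connective in [("A => B", implies), ("A & B", conj), ("A or B", disj)]:
    print(name, table(connective))
```

Each row printed by the loop can be compared directly with the corresponding row of tables b, c, and d.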
In our previous discussion we have used the word 'statement' as
if this were a well-known concept to the reader. Actually we think the
reader has a good idea of this concept, but we shall be pedantic and
comment on it further. To form a statement in the written English

language, for example, we begin with an alphabet consisting of 52
Latin letters (lower case and capitals), the various punctuation marks,
and various other symbols such as parentheses, brackets, and so forth.
We may even suppose the alphabet contains a symbol that cannot be
seen-an empty space. A statement in the English language, meaningful
or not, is a string of these symbols usually placed in a horizontal row,
and one has a rule to tell where a statement begins and where it ends.
A string of Latin letters that begins and ends with the empty-space
symbol and has no empty-space symbol in between is called a word.

A string of objects beginning with two empty-space symbols and a
capital Latin letter and ending with a period, and having no period in
between is called a sentence, and so forth.
Statements in mathematics are formed in the same way, that is, by
placing symbols in various positions. However, it is usually the case that
we form these statements from a different collection of symbols than
those that we use for the English language. The symbols '-', '=>', '&',
and '∨' are part of our mathematical alphabet.

Now, what we have described as statements of the written English
language cannot be said to constitute the written English language.
Most of the strings of symbols that would be written down would not
be meaningful. The meaningful statements are those prescribed by
means of lists of words contained in dictionaries and by means of the
rules of grammar. A moment's reflection is enough to convince us that
for someone who does not already know the written English language
it would be impossible to describe the rules of grammar or how to use
a dictionary in terms of the written language. The various rules must
be described in terms of a different language that is understood by
the learner. For a child this is usually done by means of a spoken
language, and for someone who understands a different written language
such as, for example, Hebrew or Sanskrit, the rules of written
English can be described in terms of those languages.
The same situation persists with regard to the mathematical language.
In the mathematical language we say that the meaningful statements
are those which can be assigned a value 't' or 'f'. The rules whereby
we describe which mathematical statements are meaningful must be
prescribed by a language outside the mathematical language. For us,
the describing language is the English language. We are assuming that
a truth table is part of the English language, since if we had a mind to
do it we could describe these tables in terms of the conventional
language. So we see that truth tables are nothing more than rules, written
in a language that we presumably understand, which describe which
statements in our mathematical language are meaningful. Of course,
we have some intuitive ideas of what we want and these prescriptions
of the truth tables are nothing more than formalizations of these
intuitive ideas.
Let us give an example that illustrates how the truth-table method
works. Let us suppose that A, B, and C are statements that can be
given 't' or 'f' values. We wish to show that the statement
[(A => B) & (B => C)] => [A => C]
always has a 't' value regardless of the values taken on by A, B, and C.
To get the table to fit on one page, let us set 'D' for 'A => B', 'E' for
'B => C', 'F' for '(A => B) & (B => C)', and 'G' for 'A => C'. The table
looks as follows.

A    B    C    D    E    F    G    F => G
t    t    t    t    t    t    t    t
t    t    f    t    f    f    f    t
t    f    t    f    t    f    t    t
t    f    f    f    t    f    f    t
f    t    t    t    t    t    t    t
f    t    f    t    f    f    t    t
f    f    t    t    t    t    t    t
f    f    f    t    t    t    t    t

Since the last column always has the 't' value we have shown what we set
out to show.
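The same check can be carried out mechanically. The following Python sketch is an illustration only: it assumes 't' and 'f' are modelled by True and False, and the names implies, is_tautology, and transitivity are introduced here for the purpose. It runs through all eight assignments and confirms that the last column is always 't'.

```python
from itertools import product

def implies(a, b):
    """A => B is 'f' only when A is 't' and B is 'f'."""
    return (not a) or b

def is_tautology(statement, n_vars):
    """True if `statement` has the 't' value under every assignment of its variables."""
    return all(
        statement(*values)
        for values in product([True, False], repeat=n_vars)
    )

def transitivity(a, b, c):
    d = implies(a, b)   # D stands for 'A => B'
    e = implies(b, c)   # E stands for 'B => C'
    f = d and e         # F stands for '(A => B) & (B => C)'
    g = implies(a, c)   # G stands for 'A => C'
    return implies(f, g)

print(is_tautology(transitivity, 3))
```

By contrast, is_tautology(implies, 2) is False, since 'A => B' by itself takes the 'f' value when A is 't' and B is 'f'.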
Once we have given the basic symbols of our mathematical language
we can define new symbols in terms of our basic symbols. As a case in
point we can define the equivalence symbol '<=>'. When we write
A <=> B, this is to be read: A is equivalent with B. It is sometimes also
read: A if and only if B. The set of symbols 'A <=> B' is defined as
another representation for the statement (A => B) & (B => A), which is
in terms of the basic symbols of our mathematical language. Once
'A <=> B' has been defined, we can consider it as the name of a
statement. It is easily seen that the statement A <=> B has the value 't'
attached to it whenever A and B both have the 't' value or whenever A
and B both have the 'f' value. Otherwise A <=> B has the 'f' value
attached to it.
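A short Python sketch of this definition follows, assuming again that 't' and 'f' are modelled by True and False. The names implies and equiv are illustrative. The point is that equiv is built only from the basic connectives, yet it has the 't' value exactly when both arguments carry the same value.

```python
def implies(a, b):
    """A => B is 'f' only when A is 't' and B is 'f'."""
    return (not a) or b

def equiv(a, b):
    """A <=> B, defined as shorthand for (A => B) & (B => A)."""
    return implies(a, b) and implies(b, a)

# Tabulate the defined symbol over all four combinations of values.
rows = {(a, b): equiv(a, b) for a in (True, False) for b in (True, False)}
for (a, b), value in rows.items():
    print(a, b, value)
```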
In the above paragraph we have used a short symbol to replace a

more cumbersome one. This is, in essence, the nature of a definition.
A definition gives a (usually shorter) new name to something that can
be described in terms of known symbols or names. The object, as in
any other language, is for efficiency in expression, which leads to
efficiency in thought. The criterion of a definition is that it should only
introduce new symbols or names for groups of known symbols and we
should not be able to obtain any true statements by use of the
definition that could not be obtained without it. In other words, we should
think of definitions as simply introducing a system of shorthand into
the mathematical language.
Suppose now that we have a set of statements H_1, H_2, ..., H_n which
are meaningful in the sense that they have 't' or 'f' values attached to
them. In addition, we suppose that there are statements A_1, ..., A_m
so that each H_k is composed of some of the A_i together with the logical
symbols '-', '=>', etc. It may not, in general, be known whether the A_i
are meaningful. Further, suppose C is a statement that is composed of
some of the A_i and the logical symbols, and we don't know in general
whether C is meaningful. However, suppose that under the supposition
that A_1, ..., A_m can be given 't' or 'f' values we find by the use of
our truth-table rules that
H_1 & H_2 & ... & H_n => C
always has a 't' value. Then, if all the H_k have the 't' value we shall give
C a 't' value. This is a new rule for giving statements 't' or 'f' values and
is usually called the rule of inference. Our hope here, as with the
truth-table rules, is that starting from our axioms we cannot also give C an
'f' value, that is, -C cannot be given a 't' value by the scheme that has
been outlined.
As an example of this new rule, suppose A_1 => A_2 and A_2 => A_3 are
axioms and therefore have 't' values, even though we don't know
whether A_1, A_2, and A_3 can be given 't' or 'f' values. However, under
the supposition that they are meaningful we have established by a truth
table that
[(A_1 => A_2) & (A_2 => A_3)] => [A_1 => A_3]
always has a 't' value. Hence we would give A_1 => A_3 a 't' value. Another
example is
{A_1 & [A_1 => A_2]} => A_2.
If A_1 and A_1 => A_2 have 't' values, we would assign the 't' value to A_2.
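The truth-table step that licenses the rule of inference can be sketched in Python. This is a minimal illustration, not a formal system: 't' and 'f' are modelled by True and False, and the helper name follows and the example variables are assumptions of the sketch. The function checks that the conjunction of the hypotheses implies the conclusion under every assignment.

```python
from itertools import product

def implies(a, b):
    """A => B is 'f' only when A is 't' and B is 'f'."""
    return (not a) or b

def follows(hypotheses, conclusion, n_vars):
    """Check by truth table that H_1 & ... & H_n => C always has the 't' value."""
    for values in product([True, False], repeat=n_vars):
        all_hypotheses = all(h(*values) for h in hypotheses)
        if not implies(all_hypotheses, conclusion(*values)):
            return False
    return True

# From A_1 => A_2 and A_2 => A_3 we may infer A_1 => A_3.
chain = follows(
    [lambda a1, a2, a3: implies(a1, a2), lambda a1, a2, a3: implies(a2, a3)],
    lambda a1, a2, a3: implies(a1, a3),
    3,
)

# From A_1 and A_1 => A_2 we may infer A_2.
detach = follows(
    [lambda a1, a2: a1, lambda a1, a2: implies(a1, a2)],
    lambda a1, a2: a2,
    2,
)

# A pattern the truth table does not support: from A_2 and A_1 => A_2 infer A_1.
converse = follows(
    [lambda a1, a2: a2, lambda a1, a2: implies(a1, a2)],
    lambda a1, a2: a1,
    2,
)

print(chain, detach, converse)
```

The third call is included only as a contrast: the assignment A_1 = 'f', A_2 = 't' makes both hypotheses 't' and the conclusion 'f', so the rule of inference may not be applied there.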
In case a statement can be given a 't' value by means of the rules we
have prescribed, then the statement is said to be derived or proved from
the axioms. Suppose a statement C has a 't' value and it is obtained by
our rules through an implication H_1 & H_2 & ... & H_n => C. Each
H_k is in turn either an axiom or obtained through an implication of
a conjunction of other statements, and so forth, until we finally get
back to where all the statements appearing in the conjunction on the left
are axioms. The collection of all such statements is called the derivation
or proof of C. However, it would be quite impractical to list all these
statements beginning with the axioms and, as the reader well knows
from experience, in practice a proof usually consists of just a portion
of this collection. In other words, the proof starts with known true
statements that have been proved elsewhere and proceeds from there.
The set of rules we have given above is usually called the
propositional calculus or sometimes a model for the propositional calculus.
However, the symbols we introduced, and the rules for their use, are
not rich enough to provide an adequate basis for most of the discourse
of mathematics. Hence we shall introduce some new symbols together
with rules for their use; in formal studies of logic this system is called the
predicate calculus. A good deal of mathematics is described in the
language of the predicate calculus, usually at an informal level, since a
very formal approach gets very cumbersome and may often interfere
with understanding. However, there are some situations in which a
formal approach may very much clarify and facilitate the handling
of complicated situations. We have in mind the precise statements of
complicated definitions and, in particular, the negating of complicated
statements.
Suppose that Q(x) is a statement that depends on a variable 'x'. The
reader may think of a variable as the name of an unspecified object
that can be replaced by any member of a specified set. The counterpart
of a variable in the English language is a pronoun or a common noun.
Since x is not a specified object, in general it would make no sense to
associate a 't' or 'f' value with Q(x). However, by adding certain
qualifying statements to Q(x) it may be possible to do so. One of these
qualifying statements is 'for every x,' which in symbols is '(x)' or '∀x.'
We may then write down a statement:
(x)(Q(x)) or ∀x Q(x).
This is to be read: For every x the statement Q(x) is true. Another
qualifying statement is 'there exists an x', which in symbols is '(∃x).'
We may then write down another statement:
(∃x)(Q(x)).
This is to be read: There exists an x such that Q(x) is true. It may now
be possible to attach 't' or 'f' values to these statements. The symbols
'(x)' and '∀x' are called universal quantifiers and the symbol '(∃x)' is
called an existential quantifier.
It is also possible that we may have a statement Q(x, y) which depends
on two variables and we may write down a statement:
(x)(y)(Q(x, y)).
This is to be read: For every x and for every y the statement Q(x, y)
is true. We may also write down a statement:
(x)(∃y)(Q(x, y)).
This statement is to be read: For every x there exists a y such that Q(x, y)
is true. Clearly, the statements
(x)(∃y)(Q(x, y)) and (∃y)(x)(Q(x, y))
are different statements simply because the symbols are placed in a
different order. However, even intuitively they cannot be considered
equivalent. In the second case y is independent of which x is chosen,
whereas in the first case y may depend on x. As an example, suppose
x and y may be replaced by real numbers. We may then write the true
statement:
(x)(∃y)(x < y).
The translation of this statement into the English language reads:
For every real number there exists another real number (depending
on x) that is larger. On the other hand, we may write down the false
statement:
(∃y)(x)(x < y).
In the English language this says: There exists a (fixed) real number
that is larger than every real number. Many other situations will arise
later on in the text which will demonstrate again and again the
distinction between these two types of statements.
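The effect of the order of the quantifiers can be seen on a small finite set, where Python's all and any play the roles of the universal and existential quantifiers. This sketch rests on assumptions that are not in the text: the set {0, 1, 2} replaces the real numbers, and the predicate q ("y is the successor of x modulo 3") is a hypothetical choice made so that both statements can be evaluated exhaustively.

```python
domain = range(3)  # a small specified set standing in for the reals

def q(x, y):
    """Hypothetical predicate Q(x, y): y is the successor of x modulo 3."""
    return y == (x + 1) % 3

# (x)(Ey)(Q(x, y)): for every x there is a y, which may depend on x.
forall_exists = all(any(q(x, y) for y in domain) for x in domain)

# (Ey)(x)(Q(x, y)): there is one fixed y that works for every x.
exists_forall = any(all(q(x, y) for x in domain) for y in domain)

print(forall_exists, exists_forall)
```

Every x has a successor, so the first statement holds, but no single y is the successor of all three elements, so the second fails.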
We shall now outline some of the rules for operating with the
predicate calculus. These rules will show the connections with the rules we
outlined previously for operating with the propositional calculus. The
reader may very well recognize that he has been using these rules in
an informal way in his previous studies of mathematics.
As in the propositional calculus we shall suppose that statements

involving quantifiers are to be given 't' and 'f' values. We start with a
given set of statements that have 't' or 'f' values attached to them, and
by means of a set of rules we obtain other statements that have 't' or 'f'
values attached to them. The first rules we shall give are those for
negating statements. The rule for negation is the following:
-(x)(Q(x)) <=> (∃x)(-Q(x)).
This is to be taken to mean that if the left side has a 't' or 'f' value, then
the right side is to be given the 't' or 'f' value, respectively, and vice
versa. From this rule it is easy to establish the rule
-(∃x)(Q(x)) <=> (x)(-Q(x)).
Using these rules we can negate more complicated expressions. For
example,
-(x_1)(x_2)(∃x_3)(∃x_4)(x_5)(Q(x_1, ..., x_5))
is equivalent to
(∃x_1)(∃x_2)(x_3)(x_4)(∃x_5)(-Q(x_1, ..., x_5)).
This is done by negating one at a time; that is, we consider
Q_1(x_1) = (x_2)(∃x_3)(∃x_4)(x_5)(Q(x_1, ..., x_5))
and then -(x_1)(Q_1(x_1)) is equivalent to (∃x_1)(-Q_1(x_1)), and
continue on in this way.
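On a finite set the negation rules can be checked exhaustively. The Python sketch below is an illustration under assumptions of its own: the variables range over small finite sets, a predicate is represented as a dictionary of True/False values, and the helper name all_predicates is introduced here. It verifies the rule for one quantifier over every predicate on a three-element set, and the one-at-a-time negation of a statement with two quantifiers over every predicate on pairs from a two-element set.

```python
from itertools import product

def all_predicates(points):
    """Yield every assignment of True/False to the given points, as a dict."""
    points = list(points)
    for values in product([True, False], repeat=len(points)):
        yield dict(zip(points, values))

domain = [0, 1, 2]

# -(x)(Q(x)) <=> (Ex)(-Q(x)), checked for all 8 predicates on the domain.
single_ok = all(
    (not all(Q[x] for x in domain)) == any(not Q[x] for x in domain)
    for Q in all_predicates(domain)
)

small = [0, 1]
pairs = list(product(small, repeat=2))

# -(x)(Ey)(Q(x, y)) <=> (Ex)(y)(-Q(x, y)), checked for all 16 predicates.
nested_ok = all(
    (not all(any(Q[(x, y)] for y in small) for x in small))
    == any(all(not Q[(x, y)] for y in small) for x in small)
    for Q in all_predicates(pairs)
)

print(single_ok, nested_ok)
```

A finite check of this kind is of course no substitute for the rule itself, which is laid down for arbitrary sets.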

Now, let us go on to some of the other rules. Usually it is necessary
to start a chain of proof by means of known true statements that
involve quantifiers. A broad description of the rules is as follows. First,
have a consistent way of removing the quantifiers from the known true
statements. Next manipulate the resulting quantifier-free statements
by the rules of the propositional calculus. Finally, have a consistent
way of replacing the quantifiers. The final quantified statement can be
given a 't' value provided we have used all the rules correctly.
We believe that the rules of manipulating with quantifiers are best
explained by means of examples. Let us first look at a simple situation
involving only universal quantifiers. Suppose P(x), Q(x), and R(x) are
statements involving the variable 'x', and it is known that the statements
(x)(P(x) => Q(x)),

(x)(Q(x) =>R(x)),
are true, that is, have a 't' value attached to them. It would seem natural,
at least from the rules given for the propositional calculus, that we
should be able to conclude that the statement
(x)(P(x) =>R(x))
has a 't' value attached to it. The rule here is to remove the universal
quantifiers to get the statements
P(x) => Q(x),

Q(x) =>R(x).
Consider each of these two statements to have the 't' value attached to
it even though with the variable 'x' the statements may have no
meaning. However, we are thinking that if we replace 'x' by any member of
a given set, then the statements will be true. By the methods of the
propositional calculus we conclude that
{[P(x) => Q(x)] & [Q(x) => R(x)]} => {P(x) => R(x)}
is a true statement. Consequently, by the rule of inference we conclude

that
P(x) =>R(x)
is a true statement. The rule now is to add the universal quantifier and
get the statement
(x)(P(x) =>R(x)).
Let us now look at a simple situation that involves an existential
quantifier. Suppose it is known that the statements
(x)(P(x) => Q(x)),
(∃x)(P(x))
are true. It would seem reasonable that the rules we give should lead
to the conclusion that the statement
(∃x)(Q(x))
is true. Now, the statement (∃x)(P(x)) has the intuitive meaning that
P(x) is true when x is replaced by only certain members (possibly only
one) of a specified set. Hence the rule we now formulate is that when
an existential quantifier is removed, the variable 'x' shall be replaced by
a symbol that stands for a definite but unspecified member of some
set. This is in accordance with the conventions used in ordinary
mathematical discourse. For the purpose of this discussion let us use the
beginning letters of the Latin alphabet to stand for these definite but
unspecified symbols. Consequently, remove the existential quantifier
to obtain the statement P(a). If we remove the universal quantifier
from the statement (x)(P(x) => Q(x)) to get the statement P(x) => Q(x),
then there would be no way for us to proceed. For the statement
{P(a) & [P(x) => Q(x)]} => Q(a) is not a true statement according to
truth-table methods. But
{P(a) & [P(a) => Q(a)]} => Q(a)
is a true statement, and by the rule of inference we conclude that
Q(a) is a true statement. Hence we adopt the rule that whenever a
universal quantifier is removed, we may retain the variable 'x' or replace
it by one of the letters 'a', 'b', and so on, which stand for definite but
unspecified objects. The tactics depend on just what is intended to be
accomplished.
Once we have established that Q(a) is true, we want to reinstate a
quantifier. The rule is that whenever the letters 'a', 'b', and so on, appear
in our statements, they can be quantified by existential quantifiers.
Hence we get that the statement
(∃x)(Q(x))
is to be given a 't' value.
In removing existential quantifiers, the rule we made is that the

letters usually reserved for variables are replaced by other letters.
This is done to serve as a warning that, when we reach the point where
we want to reinstate quantifiers, we should not add a universal
quantifier where we should have added an existential quantifier. As an
example of the type of difficulty that could arise if we did not follow this
procedure, let us consider the statements:
(x)(x < 1 => x + 1 < 2),
(∃x)(x < 1).
In these statements we are supposing that x may be replaced by real
numbers. Hence these are true statements. Remove the existential
quantifier but do not replace the variable 'x' by another letter. We get
the statements:
x < 1 => x + 1 < 2,
x < 1,
which are to be considered as being true. Applying the propositional
calculus we are led to the statement:
x + 1 < 2.
Now, in adding a quantifier to this statement we may forget that x
cannot be replaced by an arbitrary real number. If we add a universal
quantifier we get the wrong statement:
(x)(x + 1 < 2).
In the simple example we have considered, it is, of course, not hard

to remember that in adjoining a quantifier we should add an existential
rather than a universal quantifier. However, in a long and complicated
chain of reasoning this may not be so easy to remember. For this
reason we must proceed in the manner outlined.
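The difficulty can be made concrete with a numerical sketch in Python. The finite list of sample values below is an assumption of the illustration and merely stands in for the real numbers; the names sample and hypothesis are likewise chosen here. On the sample, both starting statements hold and the existential conclusion holds, while the universal conclusion fails.

```python
# A finite sample standing in for the real numbers (an assumption of this sketch).
sample = [-3.0, 0.0, 0.5, 1.0, 4.0]

def hypothesis(x):
    """The statement x < 1 => x + 1 < 2 for a particular x."""
    return (not x < 1) or (x + 1 < 2)

all_hypothesis = all(hypothesis(x) for x in sample)      # (x)(x < 1 => x + 1 < 2)
exists_small = any(x < 1 for x in sample)                # (Ex)(x < 1)
exists_conclusion = any(x + 1 < 2 for x in sample)       # (Ex)(x + 1 < 2)
forall_conclusion = all(x + 1 < 2 for x in sample)       # (x)(x + 1 < 2)

print(all_hypothesis, exists_small, exists_conclusion, forall_conclusion)
```

The value x = 1.0 already shows that the universally quantified conclusion is wrong, since 1 + 1 < 2 is false.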
Let us give several more examples which will illustrate several other
points that arise when dealing with quantifiers. In the next three
examples we shall suppose our variables represent real numbers. We
start with the following true statements:
(∃y)(x)(x + y = x),
(x)(x < x + 1).
In statements involving both existential and universal quantifiers, the
rule is to remove the existential quantifiers first. Doing this we get the
statements:
(x)(x + a = x),
(x)(x < x + 1).
Now remove the universal quantifiers to get the statements:
x + a = x,
x < x + 1.
In dealing with the equality sign we shall adopt the rule that the symbols which
stand on either side of the equality may be used interchangeably in any
expression involving these symbols. Hence we get the statement:
x + a < x + 1.
Now replace the quantifiers. The rule here is the reverse of the rule
in removing the quantifiers. First, replace the universal quantifier to
get the statement:
(x)(x + a < x + 1).
Now replace the existential quantifier, using a variable other than 'x', to
get the true statement:
(∃y)(x)(x + y < x + 1).
Note here that we do not try to quantify the symbol '1'. This would lead
to a different statement from the one we have obtained. The general
rule is not to quantify symbols in the final statement of a chain of
argument which are not quantified in the initial statements.
Let us look at a second example closely related to, but not the same
as, the previous example. Again the variable 'x' shall represent real
numbers. Let us start with the true statements:
(∃y)(x)(x + y = x),
(∃y)(x)(x < x + y).
In removing the existential quantifiers we should be careful not to
use the same symbol in separate statements involving the existential
quantifiers. If, for example, we used the same symbol upon removing
the existential quantifiers we would get the statements:
(x)(x + a = x),
(x)(x < x + a).
After a few more steps this would lead to the incorrect statement:
(∃y)(x)(x + y < x + y).
Hence, removing the existential quantifiers in the correct way and
then removing the universal quantifiers, we get the statements:
x + a = x,
x < x + b.
Using our rule for equality we get the statement:
x + a < x + b.
Replacing the quantifiers, first the universal and then the existential,
we get the true statement:
(∃y)(∃z)(x)(x + y < x + z).
We shall consider one final example. The variables are again assumed
to represent real numbers. Let us start with the true statements:
(x)(∃y)(x < y),
(x)(y)(x < y => x + y < 2y).
If we proceed according to the previous rules we get the statements:
x < a,
x < a => x + a < 2a.
Using the propositional calculus we get the statement:
x + a < 2a.
Replacing the quantifiers by our previous rules leads to the incorrect
statement:
(∃y)(x)(x + y < 2y).
The reader has probably already recognized what has gone wrong
here. In removing the quantifier from the first statement the ambiguous
element a does not remain the same for all x but must change as x
changes. In mathematics one usually indicates this by writing 'a_x' or
'a(x)'. If we do this we get the statements:
x < a_x,
x < a_x => x + a_x < 2a_x.
By means of the propositional calculus these statements lead to the
statement:
x + a_x < 2a_x.
We must now decide how to replace the quantifiers. Clearly the rule
in this situation is to replace the existential quantifier first, using a
variable other than 'x', to get the statement:
(∃y)(x + y < 2y).
This does not contradict any of our previous rules, since we have
introduced a new type of symbol, namely, 'a_x'. Now replace the
universal quantifier to get the correct statement:
(x)(∃y)(x + y < 2y).
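The role of the symbol 'a_x' can be illustrated by writing the dependence on x as a Python function. The particular choice a(x) = x + 1, the sample values, and the fixed value 5 are assumptions made for this sketch only; any rule producing a number larger than x would serve equally well.

```python
def a(x):
    """A hypothetical witness a_x: some number larger than x, here x + 1."""
    return x + 1

samples = [-2.5, 0, 1, 10**6]

# For each sampled x: x < a_x and x + a_x < 2 a_x, with a_x depending on x.
dependent_ok = all(x < a(x) and x + a(x) < 2 * a(x) for x in samples)

# A single fixed y cannot serve for every x: with y = 5, the value x = 7 fails.
y_fixed = 5
fixed_fails = not (7 + y_fixed < 2 * y_fixed)

print(dependent_ok, fixed_fails)
```

This mirrors the distinction drawn above: the witness in (x)(∃y)(x + y < 2y) is allowed to move with x, whereas (∃y)(x)(x + y < 2y) would require one y for all x.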
In working with quantified statements the reader should be aware
of the fact that any letters or symbols may be used for the variables.
This is in accordance with accepted mathematical practice. For example,
the following three sets of symbols all have the same meaning:
∫_a^b f(x) dx, ∫_a^b f(y) dy, ∫_a^b f(t) dt.
The verbal description of the rules of the predicate calculus we have

given above can actually be given in a very succinct manner by means
of a set of axioms. That is, given any set of axioms for a mathematical
system, it is always possible to add another short set of axioms so that
together with the propositional calculus the latter set of axioms has the
effect of automatically applying the rules of the predicate calculus
which we have given. For such a set of axioms we refer the reader to
page 57 of the book by Mendelson cited at the end of this section.
One final rule. We shall always make the assumption that every
statement we make has one and only one of the values 't' or 'f'; that is,
every statement we make is either true or false and our system is consistent.
This allows us to use the rule or principle of proof by contradiction.
Suppose we want to find the value 't' or 'f' of a statement A. Suppose
also that we know a statement B has a 't' value and we are able to prove
that A ⇒ -B; that is, the last statement has a 't' value. If A had the 't'
value attached to it, then it would follow that -B has the 't' value
attached to it, which would contradict our hypothesis that B can have
only one value attached to it. Hence A must have the 'f' value.
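The rule just described amounts to saying that the statement [B & (A ⇒ -B)] ⇒ -A always has the value 't'. As a small sketch, the truth table can be run through mechanically in Python; the helper name `implies` is our own, and True and False stand in for 't' and 'f'.

```python
from itertools import product

def implies(p, q):
    """Truth value of p ⇒ q in the propositional calculus."""
    return (not p) or q

# Run through every assignment of truth values to A and B and check
# that [B & (A ⇒ -B)] ⇒ -A comes out true each time.
rule_is_tautology = all(
    implies(b and implies(a, not b), not a)
    for a, b in product([True, False], repeat=2)
)

print(rule_is_tautology)
```

The same loop, with a different formula in place of the one shown, gives a mechanical way of filling in any truth table with two statement letters.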
Our discussion of the elements of logic has of necessity been rather
circumscribed, so that at this point there is not a sufficiently adequate
basis for a completely formal development of a branch of mathematics.
On the other hand, a completely formal development usually turns out
to be very cumbersome and, unless one already understands the subject
that is being formalized, is probably undesirable. After all, the
process of understanding is an intuitive psychological phenomenon.
In understanding the proof of a theorem or in discovering a new
theorem, the mind seems to operate in a haphazard manner rather than
in a formal step by step process that a logical proof requires. What
then is the role of logic in mathematics? There are several important
roles it can play: It can provide a mechanical check on our informal
reasoning processes; it can provide a system of automatic symbolic
operations to assist in complicated situations; it can provide a means
of increasing precision and generality; and, last but not least, it can
provide an accurate method of transmitting mathematical information.
When we begin the discussion of the natural numbers we shall
proceed in a somewhat more formal manner than we do later on in
the book. The reason for this is that we feel that most readers will be
very familiar with the basic facts of elementary arithmetic and hence
the formalism will not interfere with their understanding. As the material
becomes less familiar, our approach shall become more informal
so that understanding may not be jeopardized. However, we shall
never completely abandon a certain amount of formalism, since we feel
that in many instances it provides an accurate and efficient method for

handling complicated situations.
References
Kershner, R. B., and L. R. Wilcox, The Anatomy of Mathematics, The Ronald
Press Company, New York, 1950.
Mendelson, Elliott, Introduction to Mathematical Logic, D. Van Nostrand
Company, Inc., Princeton, N. J., 1964.
Suppes, Patrick, Introduction to Logic, D. Van Nostrand Company, Inc.,
Princeton, N. J., 1957.
Exercises
1. By use of a truth table verify the following:
(a) [-(A & B)] ⇔ [-A ∨ -B].
(b) [-(A & B)] ⇔ [A ⇒ -B].
(c) [-(A ⇒ B)] ⇔ [A & -B].
2. Use the truth-table method to show that the following statements
are always true:
(a) [A & (A ⇒ B)] ⇒ B.
(b) [(A ∨ B) & -A] ⇒ B.
(c) [A & B] ⇒ B.
3. Use the truth-table method to show that the following statements
are always true:
(a) [A ⇒ B] ⇔ [-B ⇒ -A].
(b) [(A ⇒ B) & -B] ⇒ -A.
(c) [(-A) & (A ∨ B)] ⇒ B.
4. Translate the following into the English language assuming the
variables represent real numbers:
(a) (x)(y)(z)([x > y & y > z] ⇒ x > z).
(b) (x)(x > 0 ⇒ (y)(∃z)(y < zx)).
(c) (∃z)(x)(x + z = x).
(d) (∃z)(x)(∃y)(x + y = z).
(e) (x)(∃y)(∃z)(x + y = z).
Explain why (d) and (e) are not equivalent statements.
5. Using the results of Exercise [...], negate all the statements of
Exercise 2.
6. Translate the following statements into the symbolism of the
propositional and predicate calculus:
(a) Every nonzero real number is either positive or negative.
(b) It is not the case that there is a real number greater than
every real number.
(c) There is a real number with the property that its multiplication
with any real number x gives x again.
(d) There is a negative real number whose square is 1.
7. Suppose P(x) is a statement involving the variable 'x' and it is
known that the following statement is true:
(x)(P(x)).
Use the rules of the propositional and predicate calculus to show that
the following statement is true:
(∃x)(P(x)).
8. In the following list show that if the left statement is true, so is
the right one. The results of Exercise 1 may be helpful.
(a) (∃y)(x)(x < 0 ⇒ x < y)        (∃y)(x)((x < y) ∨ -(x < 0)).
(b) (x)(∃y)(y > 0 & xy = x)       -(∃x)(y)(y > 0 ⇒ xy ≠ x).
(c) (x)(∃y)(x ≠ 0 ⇒ xy = 1)       (x)(∃y)(-[x ≠ 0 & xy ≠ 1]).
1.2
SETS
We shall adopt the naive, intuitive point of view that a set is a collection
of objects without questioning what these words mean. The term
'b ∈ B' is to be read: b is an element of the set B. Often a set will be
specified by some descriptive property. For example, suppose that
Q(x) is a sentence containing a variable 'x'. Then we can form the class
of all objects that when their names are substituted for 'x' make the
sentence Q true. We denote this by means of the term '{x : Q(x)}',
and this is to be read: The collection of all x such that Q(x) is true. As
in the situation for quantifiers, x is understood to vary over a specified
set. The set consisting of the one element a will be designated by '{a}',
the set consisting of the two elements a and b will be designated by
'{a, b}', and so on.
Given two elements a ∈ A and b ∈ B we can form the ordered pair
(a, b). The reason we use the word 'ordered' is that in general (a, b)
and (b, a) are not considered the same object. Indeed, the ordered pairs
(a, b) and (a_1, b_1) shall be identified if and only if a = a_1 and b = b_1.
Recall that the symbol '=' has the meaning that the symbols that stand
to the left and right of it are simply different names for the same object
and we adopted the rule that different names for the same object may
be used interchangeably in any expression.
From two sets A and B we can form a new set, the Cartesian product
A × B, which is defined by the equality
A × B = {(x, y) : x ∈ A & y ∈ B}.    (1.2.1)
We can also form the intersection of two sets defined by the equality
A ∩ B = {x : x ∈ A & x ∈ B}.    (1.2.2)
More generally, if 𝒜 is a collection of sets we shall define
∩{A : A ∈ 𝒜} = {x : (A)(A ∈ 𝒜 ⇒ x ∈ A)},    (1.2.2')
that is, the collection of all elements each of which belongs to every set
in 𝒜. The union of two sets is defined by the equality
A ∪ B = {x : x ∈ A ∨ x ∈ B}.    (1.2.3)
More generally, if 𝒜 is a collection of sets we shall define
∪{A : A ∈ 𝒜} = {x : (∃A)(A ∈ 𝒜 & x ∈ A)},    (1.2.3')
that is, the collection of all elements that belong to at least one set in 𝒜.
The complement of a set A is defined by the equality
A^c = {x : -(x ∈ A)}.    (1.2.4)
We usually use the term 'x ∉ A' for the term '-(x ∈ A)' and shall
write 'A\B' for 'A ∩ B^c'; that is,
A\B = A ∩ B^c.    (1.2.5)
This latter set can be described as the set of all elements in A that are
not in B; that is,
A\B = {x : x ∈ A & x ∉ B}.    (1.2.5')
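For finite sets the constructions above can be tried out directly. The following Python sketch is an informal illustration only: the universe `universe`, the particular sets A and B, and the small collection `family` are assumptions made for the example, and the complement is taken relative to that assumed universe.

```python
from itertools import product

universe = set(range(10))                   # assumed universe of discourse
A = {x for x in universe if x % 2 == 0}     # {x : x is even}
B = {x for x in universe if x < 5}          # {x : x < 5}

cartesian = set(product(A, B))      # A × B as a set of ordered pairs, cf. (1.2.1)
intersection = A & B                # A ∩ B, cf. (1.2.2)
union = A | B                       # A ∪ B, cf. (1.2.3)
complement_B = universe - B         # B^c relative to the universe, cf. (1.2.4)
difference = A - B                  # A\B, cf. (1.2.5')

family = [A, B, {0, 2, 9}]          # a small collection of sets
big_intersection = set.intersection(*family)   # cf. (1.2.2')
big_union = set.union(*family)                  # cf. (1.2.3')

# (1.2.5): A\B coincides with A ∩ B^c.
print(difference == A & complement_B)
```

Note that the set comprehension `{x for x in universe if ...}` plays the role of the term '{x : Q(x)}', with x varying over a specified set.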
A very helpful intuitive way of thinking about the sets we have
been forming is to represent them diagrammatically. These diagrams

are usually referred to as Venn diagrams. For example, the set A n B
can be represented by the cross-sectioned area in Fig. 1.2.1. The set
A U B can be represented by the cross-sectioned area in Fig. 1.2.2.
FIGURE 1.2.1
FIGURE 1.2.2
The rectangle in which the sets are enclosed represents the entire
universe of elements of the discourse.
We shall define the inclusion symbol '⊂' by means of the equivalence
A ⊂ B ⇔ (x)(x ∈ A ⇒ x ∈ B).    (1.2.6)
The term 'A ⊂ B' is to be read: A is contained in B, or A is a subset
of B. If A ⊂ B and A ≠ B, A is called a proper subset of B. Two sets are
to be identified if and only if they are contained in each other, that is,
A = B ⇔ (A ⊂ B & B ⊂ A).    (1.2.7)
Since we have taken equality as a primitive logical notion, the equivalence
(1.2.7) is to be viewed as an axiom rather than as a definition of
equality between sets. For, from the rule we have adopted for the
symbol '=', it is a simple matter to prove that
A = B ⇒ (A ⊂ B & B ⊂ A).
However, the converse implication cannot be proved and in axiomatic
set theory it is usually adopted as an axiom, provided equality is taken
as a logical notion. Actually in making the definition (1.2.6) and in
taking as an axiom (1.2.7) we should have used the universal quantifiers
'(A)' and '(B)', so that these statements would refer to all pairs of sets
rather than to two particular sets.

In our previous discussion we have introduced a new symbolism,
'{x : Q(x)}', which has not been defined in terms of the symbolism of
the predicate calculus. Hence we must either define this symbol in
terms of the rules of the predicate calculus or else give new rules for
operating with statements that contain these symbols. The first method
is clearly the preferable one. Hence we take the symbol '{x : Q(x)}' to
be a name for that set for which the following statement is true:
(y)(y ∈ {x : Q(x)} ⇔ Q(y)).
A moment's reflection is enough to convince us that the intuitive
meaning of this statement is the same as the intuitive meaning we
previously gave to the term '{x : Q(x)}'.
As an example, A ∩ B is to be defined as that set for which the
following statement is true:
(x)(x ∈ A ∩ B ⇔ [x ∈ A & x ∈ B]).    (1.2.8)
Let us prove that the following statement is true:
A ∩ B ⊂ A.    (1.2.9)
First, removing the universal quantifier from (1.2.8) we get the
statement:
x ∈ A ∩ B ⇔ [x ∈ A & x ∈ B],    (1.2.9')
which for the purpose of applying the rules of the propositional calculus
is assumed to have a 't' value. The truth-table method of the propositional
calculus tells us that the following statement has a 't' value:
{x ∈ A ∩ B ⇔ [x ∈ A & x ∈ B]} ⇒ {x ∈ A ∩ B ⇒ [x ∈ A & x ∈ B]}.    (1.2.10)
From (1.2.9), (1.2.10), and the rule of inference we find that the following
statement is true:
x ∈ A ∩ B ⇒ [x ∈ A & x ∈ B].    (1.2.11)
The rules of the propositional calculus tell us that the following is
true [Exercise 2(c) of Section 1.1]:
[x ∈ A & x ∈ B] ⇒ x ∈ A.    (1.2.12)
Designating the statement (1.2.11) by 'R(x)' and the statement (1.2.12)
by 'S(x)', we get the following true statement:
[R(x) & S(x)] ⇒ [x ∈ A ∩ B ⇒ x ∈ A].    (1.2.13)
Using the rule of inference the following statement has a 't' value:
x ∈ A ∩ B ⇒ x ∈ A.    (1.2.14)
Adding a universal quantifier we get the following true statement:
(x)(x ∈ A ∩ B ⇒ x ∈ A).    (1.2.15)
Using the statement (1.2.6) and the rules of the propositional calculus,
we arrive at the true statement
(x)(x ∈ A ∩ B ⇒ x ∈ A) ⇒ A ∩ B ⊂ A.    (1.2.16)
Using the fact that statements (1.2.15) and (1.2.16) are true, by the rule
of inference we finally arrive at the true statement:
A ∩ B ⊂ A.
We have presented above a formal proof of the last statement, being
careful to point out at each stage exactly what was being used. Of course,
we could have developed a scheme so that the proof would have been
more mechanical and the amount of space needed to write it down
would have been much less. Nevertheless, we think the reader now
sees how cumbersome a formal proof can be, even of the simplest
statements. For this pragmatic reason most of the discourse of mathematics
is carried on in an informal way.
In an informal proof we do not write down all the steps but only
those considered to be essential. This is analogous to the situation
when in making an arithmetic or algebraic computation we usually
do not take cognizance of the fact that we are using, for example, the
commutative or associative laws, but suppose these are standard facts
which the reader recognizes. For example, the chain of argument
leading from (1.2.9) to (1.2.11) or the chain of argument leading from
(1.2.11) to (1.2.14) is usually considered a standard argument and
would not be mentioned in an informal proof. Of course, just how much
is written down is at the discretion of the writer. Usually enough should
be written down so that it would be clear how to make the formal proof
if any question should arise about the validity of the informal proof.
As an example of an informal proof let us show the following:
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
We have
x ∈ A ∩ (B ∪ C) ⇔ [x ∈ A & x ∈ B ∪ C].
Also,
x ∈ B ∪ C ⇔ [x ∈ B ∨ x ∈ C].
Now, it is easily checked by a truth table that
[(x ∈ A) & (x ∈ B ∨ x ∈ C)] ⇔ [(x ∈ A & x ∈ B) ∨ (x ∈ A & x ∈ C)].
The disjunction on the right is equivalent with
x ∈ (A ∩ B) ∪ (A ∩ C).
Consequently, we have shown that
x ∈ A ∩ (B ∪ C) ⇔ x ∈ (A ∩ B) ∪ (A ∩ C),
which gives the equality we are seeking.
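A mechanical check is no substitute for either the formal or the informal proof, but for a small universe both the inclusion A ∩ B ⊂ A and the distributive equality can be verified exhaustively. The following Python sketch does this over all subsets of an assumed three-element universe; the helper name `subsets` and the universe {0, 1, 2} are illustrative choices of ours.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a finite universe, as Python sets."""
    items = sorted(universe)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

universe = {0, 1, 2}            # assumed small universe of discourse
all_subsets = subsets(universe)

# A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) for every choice of A, B, C.
distributive_holds = all(
    A & (B | C) == (A & B) | (A & C)
    for A in all_subsets for B in all_subsets for C in all_subsets
)

# A ∩ B ⊂ A for every choice of A and B ('<=' is Python's subset test).
inclusion_holds = all(
    (A & B) <= A for A in all_subsets for B in all_subsets
)

print(distributive_holds, inclusion_holds)
```

Such a check covers only the sets actually enumerated; the argument in the text is what establishes the result for arbitrary sets.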

Of course, proving such an equality or discovering it may be two different
matters. Often the way to discover such an equality is by looking
at the Venn diagram. In this case the set in question is shown by the
cross-sectioned area in Fig. 1.2.3.
FIGURE 1.2.3
The reader may now object that we have not defined the symbol '∈'
in terms of the symbols of the predicate calculus. This is true, and in a
formal development of mathematics it is necessary to give the rules or
axioms that prescribe the use of this symbol. The situation is analogous
to that of Euclidean geometry, where points and lines are taken as
undefined objects and a set of axioms is given that gives the relationships
between points and lines. In axiomatic set theory, sets and the
symbol '∈' are taken as undefined things and a set of axioms is given that
will allow us to develop the kind of a theory of sets which seems intuitively
reasonable to us. These axioms deal mainly with prescribing the
conditions under which new sets can be formed from given sets. For
example, in axiomatic set theory the facts that A ∩ B and A × B can
be taken to be sets are usually given by axioms. In connection with the
set A × B, the notion of ordered pair can be defined by use of the
axioms.
To try to give a reasonable axiomatic approach to set theory would
be too difficult at this stage and would delay our study of the calculus
for a long time. Hence, as we mentioned at the beginning of the discussion
on sets, we shall suppose that everyone understands what a
set is and we shall allow operations on sets and the construction of sets
that seem intuitively reasonable. Such a procedure can, on occasion, lead
to serious philosophical difficulties, but we shall pretend that they
don't exist.
Finally, let us remark that it is convenient to consider the set that has
no elements. It is defined by the equality
∅ = {x : x ≠ x}.
The set ∅ is called the null set or the empty set or the void set.
Exercises
1. Draw Venn diagrams for the sets A\B, A^c, and A ∩ (B ∪ C).
Give a schematic diagram (not a Venn diagram) for a Cartesian product
set.
2. Give formal proofs of the following statements:
(a) A ∪ (B ∪ C) = (A ∪ B) ∪ C.
(b) A ∩ (B ∩ C) = (A ∩ B) ∩ C.
(c) (A^c)^c = A.
(d) A ∩ A^c = ∅.
3. Prove the following:
(a) (A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C).
(b) (A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C).
(c) A ∩ (A ∪ B) = A.
(d) A ∪ (A ∩ B) = A.
4.
(a) (A ∩ B)^c = A^c ∪ B^c.
(b) (A ∪ B)^c = A^c ∩ B^c.
5. Using the results of Exercises 2, 3, and 4, find the complements
of the following sets:
(a) A ∪ B ∪ C^c.
(b) A ∩ (B ∪ (C ∪ D)^c).
(c) (A ∪ B^c) ∩ (A ∪ (B ∩ C^c)).
(d) ∅.
6. Prove the following by using the results of Exercise 3:
If A ∩ B = A ∩ C and A ∪ B = A ∪ C, then B = C.
7. Show the following:
(a) AB;,,,_A\(A\B).
(b) A ∩ B = A\(A\B).
8. If 𝒜 is any collection of sets and B is any set, show the following:
(a) B ∩ ∪{A : A ∈ 𝒜} = ∪{A ∩ B : A ∈ 𝒜}.
(b) B ∪ ∩{A : A ∈ 𝒜} = ∩{A ∪ B : A ∈ 𝒜}.
9. If 𝒜 is any collection of sets and B is any set, show the following:
(a) B ∪ ∪{A : A ∈ 𝒜} = ∪{A ∪ B : A ∈ 𝒜}.
(b) B ∩ ∩{A : A ∈ 𝒜} = ∩{A ∩ B : A ∈ 𝒜}.
10. If 𝒜 is any collection of sets, show the following:
(a) (∪{A : A ∈ 𝒜})^c = ∩{A^c : A ∈ 𝒜}.
(b) (∩{A : A ∈ 𝒜})^c = ∪{A^c : A ∈ 𝒜}.
11. If 𝒜 is any collection of sets and B is any set, use the results of
the previous three exercises to show the following:
(a) B\∪{A : A ∈ 𝒜} = ∩{B\A : A ∈ 𝒜}.
(b) B\∩{A : A ∈ 𝒜} = ∪{B\A : A ∈ 𝒜}.
1.3
RELATIONS AND FUNCTIONS
The concept of the Cartesian product of two sets leads to the concept
of a relation. We shall first give a formal definition and then comment
on the meaning.
1.3.1 Definition. A relation is a subset of a Cartesian product set.
If R is a relation, the set 𝒟(R) = {x : (∃y)((x, y) ∈ R)} is called the
domain of R and the set ℛ(R) = {y : (∃x)((x, y) ∈ R)} is called the range
of R.
The relation defined by R^{-1} = {(y, x) : (x, y) ∈ R} is called the inverse
of the relation R. If A is any set, then the set
R^{-1}(A) = {x : (∃y)(y ∈ A & (x, y) ∈ R)} is called the inverse image
of A under R.
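For a finite relation, each of the sets in Definition 1.3.1 can be computed by running through the ordered pairs. The following Python sketch represents a relation as a set of tuples; the sample relation R and the function names `domain`, `range_of`, `inverse`, and `inverse_image` are illustrative choices of ours, not notation from the text.

```python
# A relation as a finite set of ordered pairs (an assumed sample).
R = {(1, "a"), (1, "b"), (2, "b"), (3, "c")}

def domain(rel):
    """{x : (∃y)((x, y) ∈ rel)}"""
    return {x for (x, y) in rel}

def range_of(rel):
    """{y : (∃x)((x, y) ∈ rel)}"""
    return {y for (x, y) in rel}

def inverse(rel):
    """{(y, x) : (x, y) ∈ rel}"""
    return {(y, x) for (x, y) in rel}

def inverse_image(rel, A):
    """{x : (∃y)(y ∈ A & (x, y) ∈ rel)}"""
    return {x for (x, y) in rel if y in A}

print(domain(R), range_of(R), inverse_image(R, {"b"}))
```

In this representation the inverse image of A under R is the same as the domain of the pairs of R whose second members lie in A.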

An example of a relation is the following. Let A be the set consisting
of all men in the United States and B the set of all people in the United
States. Let R be the set of all (x, y) ∈ A × B so that x ∈ A and y is a
relative of x. Since R is a subset of a Cartesian product it is a relation. Note
that we also have R ⊂ B × B. The domain of R is the set of elements
which are first members of the ordered pairs that are in R. In this case
this is A.† Suppose C is the subset consisting of those people in B who
have at least one living male relative. It is probably true that C = B.
At any rate, C is the set of elements which are the second members of
the ordered pairs that are in R and hence is the range of R. Note that
it is not true that R = A × C, although certainly R ⊂ A × C.
The situation where to each element in the domain of a relation
there corresponds only one element in the range so that the resulting
ordered pair is in the relation is of special significance. Such relations
are called functions and we now give the formal definition.
1.3.2 Definition. A function F is a relation with the additional property
that
(x)(y)(z)([(x, y) ∈ F & (x, z) ∈ F] ⇒ y = z).
For example, if A and B are the sets given above, then the set of all
(x, y) ∈ A × B with x a husband and y his legal wife is a function. A
more pertinent example of the distinction between a relation and a
function is perhaps the following:
{(x, y) : x^2 + y^2 = 1} is a relation.
{(x, y) : x^2 + y^2 = 1 and y ≥ 0} is a function.
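The distinction can be seen concretely on a finite sample. In the Python sketch below we replace the unit circle by the integer points of the circle x^2 + y^2 = 25, purely so that the relation is a finite set; this substitution, the grid range, and the helper name `is_function` are assumptions made for the illustration.

```python
def is_function(rel):
    """True when no first member is paired with two different second members."""
    seen = {}
    for x, y in rel:
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True

grid = range(-5, 6)

# Integer points on x^2 + y^2 = 25: a finite stand-in for the circle.
circle = {(x, y) for x in grid for y in grid if x * x + y * y == 25}

# The same points restricted to y >= 0.
upper = {(x, y) for (x, y) in circle if y >= 0}

print(is_function(circle), is_function(upper))
```

The full set fails the test of Definition 1.3.2 because, for example, both (3, 4) and (3, -4) belong to it, while the restriction to y ≥ 0 leaves at most one second member for each first member.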
Some people prefer the words multivalued function in place of the word
'relation.'
If F is a function and (x, y) ∈ F, then the usual convention is to denote
the second member y by F(x). We shall follow this convenient
notation. The element F(x) is called the value of F at x, and we also
often speak of it as the map of x under F. In case F maps distinct elements
of its domain into distinct elements of its range the function is said
to be one to one. The formal definition is the following.
1.3.3 Definition. A function F is said to be one to one ⇔
(x)(y)(F(x) = F(y) ⇔ x = y).

In the statement above the variables are, of course, understood to
represent elements of 𝒟(F). In case F is a one-to-one function, it is
clear that F^{-1} is also a function. However, we shall state this as a formal
proposition and leave the proof as an exercise.
1.3.4 Proposition. If F is a one-to-one function, then F^{-1} is also a
one-to-one function.
Given two or more functions there may be ways of combining these
functions to get a new function. We shall give one way here, the
composition of two functions. We shall give other ways later.
†We are supposing that every man has a relative.

1.3.5 Definition. If F and G are functions, then F ∘ G is that function
having domain {x : G(x) ∈ 𝒟(F)} and ∀x ∈ 𝒟(F ∘ G),
F ∘ G(x) = F(G(x)).
In very formal terms we can write
F ∘ G = {(x, y) : (∃z)((x, z) ∈ G & (z, y) ∈ F)}.
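The very formal description of F ∘ G translates almost word for word into a computation on finite sets of ordered pairs. The following Python sketch is an illustration under that finiteness assumption; the function name `compose` and the sample functions F and G are our own.

```python
def compose(F, G):
    """F ∘ G = {(x, y) : (∃z)((x, z) ∈ G & (z, y) ∈ F)}"""
    return {(x, y) for (x, z) in G for (z2, y) in F if z == z2}

# Two sample functions given as sets of ordered pairs.
G = {(1, 10), (2, 20), (3, 30)}
F = {(10, "p"), (20, "q")}

FG = compose(F, G)
domain_FG = {x for (x, y) in FG}

print(FG, domain_FG)
```

Here 3 is in the domain of G but G(3) = 30 is not in the domain of F, so 3 does not belong to the domain of F ∘ G, in agreement with the description of that domain in Definition 1.3.5.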

By an abuse of language we shall often designate the range of a
function f by the symbol
{f(x) : x ∈ 𝒟(f)}.
If A is a set, we can define a new function g as that subset of f consisting
of those ordered pairs (possibly void) whose first members belong to
A. We shall often write g = f|A. If A ⊂ 𝒟(f), when we write
f(A) = {f(x) : x ∈ A},
we are referring to the range of g.
A function is an important special type of relation. There is another
special type of relation, an equivalence relation, which plays an extremely
important role in all branches of mathematics.
1.3.6 Definition. A relation R is said to be an equivalence relation if
and only if the following are satisfied:
(a) (x)(y)((x, y) ∈ R ⇒ (y, x) ∈ R).    (symmetric)
(b) (x)(y)(z)([(x, y) ∈ R & (y, z) ∈ R] ⇒ (x, z) ∈ R).    (transitive)
Many authors prefer to talk about an equivalence relation on a set X and
add the condition
(c) (x)(x ∈ X ⇒ (x, x) ∈ R).    (reflexive)
The condition (c) simply assures us that X ⊂ 𝒟(R). In fact, from
(a) and (b) it is easy to prove the following:
(x)(x ∈ 𝒟(R) ⇒ (x, x) ∈ R).
Indeed, (a) tells us that if (x, y) ∈ R, then (y, x) ∈ R. Hence from
(b) we get that
(x, y) ∈ R & (y, x) ∈ R ⇒ (x, x) ∈ R.
We shall usually denote an equivalence relation by the symbol '≡' and
instead of '(x, y) ∈ ≡' we shall write 'x ≡ y'.
It is not hard to check that
it is possible to use the symbol ⇔ to define an equivalence relation in
the Cartesian product X × X, where X is the set of all meaningful statements.
We shall soon meet other familiar equivalence relations.
1.3.7 Theorem. Let X be a set and ≡ an equivalence relation having
domain and range the set X. There is a collection ℰ of subsets of X so that
X = ∪{E : E ∈ ℰ},
where ∀E, F ∈ ℰ, E ≠ F ⇒ E ∩ F = ∅ and x, y ∈ E ⇒ x ≡ y; x ≡ y ⇒
∃E ∈ ℰ so that x, y ∈ E. (The sets E ∈ ℰ are called equivalence
classes.)
Proof. For every x ∈ X let E(x) = {y : y ≡ x}. Since, as we have
shown, x ∈ 𝒟(≡) ⇒ x ≡ x, it follows that x ∈ E(x). For any sets
E(x) and E(y) suppose E(x) ∩ E(y) ≠ ∅ and let z ∈ E(x) ∩ E(y).
We have z ≡ x and for w ∈ E(x) we have w ≡ x. From the symmetry
condition (a) we get x ≡ z, and thus from the transitivity condition (b)
we get w ≡ x & x ≡ z ⇒ w ≡ z. On the other hand, z ≡ y, and hence
from (b) w ≡ z & z ≡ y ⇒ w ≡ y. Hence we have shown that w ∈ E(x)
⇒ w ∈ E(y), which means E(x) ⊂ E(y). By making the same kind
of argument for the set E(y) we arrive at the conclusion that E(y)
⊂ E(x). This shows E(x) = E(y). If we now take
ℰ = {E(x) : x ∈ X},
we see that the theorem is proved.
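The construction in the proof, forming E(x) = {y : y ≡ x} for each x and collecting the distinct sets, can be carried out directly when X is finite. The Python sketch below does this for an assumed sample: X is a range of integers and the relation is "n - m is divisible by 3". The function name `equivalence_classes` and the sample relation are illustrative choices of ours.

```python
def equivalence_classes(X, related):
    """Return the distinct classes E(x) = {y : y ≡ x} for x in X."""
    classes = []
    for x in X:
        E_x = frozenset(y for y in X if related(y, x))
        if E_x not in classes:
            classes.append(E_x)
    return classes

X = range(-10, 11)                                # assumed finite sample set

def same_mod_3(n, m):
    """Assumed sample relation: n - m is divisible by 3."""
    return (n - m) % 3 == 0

classes = equivalence_classes(X, same_mod_3)

# The classes cover X and distinct classes are disjoint.
covers_X = set().union(*classes) == set(X)
pairwise_disjoint = all(
    E.isdisjoint(F) for E in classes for F in classes if E != F
)

print(len(classes), covers_X, pairwise_disjoint)
```

The two checks at the end correspond to the two assertions of Theorem 1.3.7: X is the union of the classes, and two different classes have no element in common.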
Exercises
1. Let f be a function and A, B ⊂ 𝒟(f). Show the following:
(a) A ⊂ B ⇒ f(A) ⊂ f(B).
(b) f(A ∪ B) = f(A) ∪ f(B).
(c) f(A\B) ⊂ f(A).
(d) f(A ∩ B) ⊂ f(A) ∩ f(B).
2. Prove Proposition 1.3.4: The inverse of a one-to-one function is
a function, which is also one to one.
3. If f is a one-to-one function, show that ∀x ∈ 𝒟(f), f^{-1} ∘ f(x) = x.
4. Let f be a function and A and B subsets of ℛ(f). Prove the
following:
(a) A ⊂ B ⇒ f^{-1}(A) ⊂ f^{-1}(B).
(b) f^{-1}(A ∪ B) = f^{-1}(A) ∪ f^{-1}(B).
(c) f^{-1}(A\B) = f^{-1}(A)\f^{-1}(B).
(d) f^{-1}(A ∩ B) = f^{-1}(A) ∩ f^{-1}(B).
5. Give an example which shows that we may not have equality in
Exercise 1(d). However, show that if f is a one-to-one function we get
equality.
6. Suppose f and g are functions such that ℛ(g) ⊂ 𝒟(f) and
∀x ∈ 𝒟(g), f ∘ g(x) = x. Show that g is one to one. If, in addition,
ℛ(f) ⊂ 𝒟(g) and ∀y ∈ 𝒟(f), g ∘ f(y) = y, show that f = g^{-1}.
7. Define a relation on the set Z of integers by writing n ≡ m ⇔ n - m
is divisible by 5. Show that this is an equivalence relation. How many
equivalence classes are there?
8. For ordered pairs (x, y) and (u, v) of real numbers write (x, y) ≡
(u, v) ⇔ there exists a real number t > 0 so that (x, y) = (tu, tv). Show
that this is an equivalence relation and give a geometric description
of the equivalence classes.
9. Suppose R is a relation with the following properties:
(α) y ∈ ℛ(R) ⇒ (y, y) ∈ R.
(β) (x, y), (z, y) ∈ R ⇒ (z, x) ∈ R.
Prove that (x, y) ∈ R ⇒ (y, x) ∈ R.
1.4
THE NATURAL NUMBERS
In this section we shall give a set of axioms for the natural numbers
and derive some of their more important properties. The proofs we
give will be informal, as explained in Section 1.2, and the set theory
we shall use will be intuitive. One may, quite legitimately, ask why
we bother to be so formal about the development of the real number
system when we are being so informal about logic and set theory. One
answer is that the first serious questions about the nature of mathematics
arose in connection with the real numbers, first among the ancient
Greeks and later again among the nineteenth-century mathematicians.
Hence an enormous amount of intellect and energy have been expended
in trying to clarify the nature of these objects. Many people seem to
feel that between certain limits these efforts have been successful and
that a usable system can be obtained from a few psychologically satisfying
and clearly stated principles or axioms. Of course, there may
be sharp disagreement on just where to start and how far one can go
without getting involved in contradictions. We shall start with a set
of axioms that are not as minimal and/or perhaps not as intuitively
satisfactory as others. However, we feel they are reasonably satisfactory
and have the advantage that the development of the real numbers
can proceed quite rapidly from them.
The name 'the natural numbers' is given to any set N together with two
functions + and · each with domain N × N and range in N satisfying the
following axioms:
(a) (x)(y)(x + y = y + x).
(a') (x)(y)(x · y = y · x).    (commutative laws)
(b) (x)(y)(z)(x + (y + z) = (x + y) + z).
(b') (x)(y)(z)(x · (y · z) = (x · y) · z).    (associative laws)
(c) (x)(y)(z)(x · (y + z) = x · y + x · z).    (distributive law)
(d) 1 ∈ N & (x)(x · 1 = x).
(e) For every x, y in N, one and only one of the following is possible:    (trichotomy law)
(1) x = y.
(2) (∃z)(x = y + z).
(3) (∃z)(y = x + z).
(f) For every M ⊂ N, the following is true:    (induction)
[1 ∈ M & (x)(x ∈ M ⇒ x + 1 ∈ M)] ⇒ M = N.
Using these axioms it is immediately possible to prove a number of
results about the natural numbers N. [We are using 'N' to designate the
natural numbers, although strictly speaking we should use the triple
'(N, +, ·)'.] However, let us first make some comments about the
previous axioms. The axioms (a) through (c) are of course the familiar
ones from arithmetic. The first part of the conjunction of axiom (d)
says that N ≠ ∅ and names a particular element. The second part of
the axiom states a property for this element. Axiom (e) has been stated
rather informally for the sake of clarity. It simply says that we can have
one and only one possibility; either x and y are the same, x is greater
than y, or x is less than y. More formally, we could have given this axiom
in terms of two axioms:
(e') (x)(y)(x ≠ y ⇒ (∃z)(y = x + z ∨ x = y + z)).
(e'') (x)(y)((∃z)(y = x + z) ⇒ -(∃z)(x = y + z)).
The last axiom (f) is often stated in the following way: If P(x) is a statement
depending on x, then
(f') [P(1) & (x)(P(x) ⇒ P(x + 1))] ⇒ (x)(P(x)).
This can be translated to our statement (f) by the following device. Set
M = {x : P(x)};
then if (f') is true, (f) is true for M, and vice versa.
Let us now give some examples that show how these axioms may be
used to obtain other true statements about N. Our first statement says
that N has no zero element; actually it says more.
1.4.1 Proposition. There is no x and no y in N, so that x + y = x;
that is,
-(∃x)(∃y)(x + y = x).
Proof. We shall prove this by contradiction. Suppose (∃x)(∃y)
(x + y = x). This implies
(∃x)(∃y)((x = x) & (x + y = x)),
which contradicts the trichotomy axiom (e).
Our next statement is to the effect that we have a cancellation law
in N with respect to multiplication.
1.4.2 Proposition. If x · z = y · z, then x = y, and vice versa, that is,
(x)(y)(z)(x · z = y · z ⇔ x = y).
Proof. The fact that x = y ⇒ x · z = y · z follows from our rules
that x and y are different names for the same thing and hence we may
use them interchangeably in any expression. Hence we must prove the
implication (x)(y)(z)(x · z = y · z ⇒ x = y). Suppose this is not true.
Then using our rule for negating statements we have
(∃x)(∃y)(∃z)(x · z = y · z & x ≠ y).    (1.4.1)
By the trichotomy axiom (e) [or (e')]
x ≠ y ⇒ (∃w)(x = y + w ∨ y = x + w),
and by the distributive law the latter statement implies
x ≠ y ⇒ (∃w)(x · z = y · z + w · z ∨ y · z = x · z + w · z).    (1.4.2)
Hence from (1.4.1) and (1.4.2) we get
[(∃x)(∃y)(∃z)(x · z = y · z & x ≠ y)]
⇒ [(∃x)(∃y)(∃z)(x · z = y · z & (∃w)(x · z = y · z + w · z ∨ y · z = x · z + w · z))],
which contradicts the trichotomy axiom.
1.4.3 Definition
(x)(y)(x < y ⇔ (∃z)(y = x + z)),
(x)(y)(x ≤ y ⇔ x < y ∨ x = y).
1.4.4 Proposition. If x ≤ y and z ≤ w, then x + z ≤ y + w; that is,
(x)(y)(z)(w)([x ≤ y & z ≤ w] ⇒ x + z ≤ y + w).
Proof. Exercise.
1.4.5 Proposition
(x)(1 ≤ x).
Proof. Let us set
M = {x : 1 ≤ x}.
Clearly 1 ∈ M and (x)(x + 1 ∈ M). The latter statement follows from
Definition 1.4.3 by putting y = x + 1 and z = 1. Hence (x)(x ∈ M
⇒ x + 1 ∈ M), and by the principle of induction it follows that M = N.
The next statement is to the effect that there is no natural number
between two successive natural numbers.
1.4.6 Proposition
-(∃x)(∃y)(x < y < x + 1).
Proof. By the definition of the relation < it follows that
n < m < n + 1 ⇒ (∃x)(m = n + x < n + 1).    (1.4.3)
Using Proposition 1.4.4 we have
∀x ∈ N, 1 ≤ x ⇒ n + 1 ≤ n + x.
Combining this with Proposition 1.4.5 we see that
∀x ∈ N, n + 1 ≤ n + x.    (1.4.4)
If we assume that Proposition 1.4.6 is not true, then from (1.4.3)
there are natural numbers m, n, and k so that
m = n + k < n + 1.    (1.4.5)
Replace x by k in (1.4.4) and we see that (1.4.4) and (1.4.5) contradict
the trichotomy axiom.
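Definition 1.4.3 and Proposition 1.4.6 can be tried out on an initial segment of the natural numbers. The Python sketch below is only a finite experiment and proves nothing about all of N; the bound 60 and the name `less_than` are assumptions made for the illustration, and Python's built-in integers stand in for N.

```python
N = range(1, 60)   # a finite stand-in for N; the axioms concern all of N

def less_than(x, y):
    """x < y  iff  (∃z)(y = x + z), with z searched in the finite sample."""
    return any(y == x + z for z in N)

# Proposition 1.4.6 on the sample: no y satisfies x < y < x + 1.
no_number_between = not any(
    less_than(x, y) and less_than(y, x + 1) for x in N for y in N
)

# On the sample the defined relation agrees with Python's built-in '<'.
agrees_with_builtin = all(
    less_than(x, y) == (x < y) for x in N for y in N
)

print(no_number_between, agrees_with_builtin)
```

The point of the sketch is only that the order relation here is defined through addition, as in Definition 1.4.3, rather than taken as given.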
1.4.7 Theorem (Well Ordering of N). Every nonvoid subset of N
has a unique smallest element; that is, for every nonvoid S ⊂ N there is a unique
m ∈ S so that ∀n ∈ S, m ≤ n.
Proof. Let S be a given nonempty set in N and set
R = {x : (y)(y ∈ S ⇒ x < y)}.
We have two cases to consider.
(a) 1 ∉ R. In this case 1 ∈ S. For in case 1 ∉ S, by trichotomy and
Proposition 1.4.5 we get (x)(x ∈ S ⇒ 1 < x), and hence 1 ∈ R. This
is a contradiction. Since 1 ∈ S, the result follows by Proposition 1.4.5.
(b) 1 ∈ R. In this case we claim (∃x)(x ∈ R & (x + 1) ∉ R). For
if we suppose the contrary is true, we have (see Exercise 1 of Section 1.1)
-(∃x)(x ∈ R & (x + 1) ∉ R) ⇒ (x)(x ∈ R ⇒ (x + 1) ∈ R).
But since 1 ∈ R, the principle of induction tells us that R = N. By the
trichotomy axiom we get
R ∩ S = ∅ = N ∩ S = S,
which contradicts the hypothesis that S ≠ ∅.
This being the case we must have that ∃k ∈ R, so that (k + 1) ∉ R
and k + 1 ∈ S. For, if k + 1 ∉ S, then by the trichotomy axiom we
must have
(x)(x ∈ S ⇒ x < k + 1 ∨ k + 1 < x).
But for x ∈ S we cannot have x < k + 1, since otherwise k ∈ R
⇒ k < x < k + 1, contradicting Proposition 1.4.6. Therefore, (x)(x ∈ S
⇒ k + 1 < x), which contradicts the fact that k + 1 ∉ R.
We now claim that (x)(x ∈ S ⇒ k + 1 ≤ x). Indeed, in the contrary
case, (∃x)(x ∈ S & x < k + 1), which we have seen contradicts Proposition
1.4.6. This shows that k + 1 is the smallest element of S. The
uniqueness is an immediate consequence of trichotomy.
1.4.8 Theorem (Archimedian Ordering of N)
(x)(y)(∃z)(x ≤ z · y).
Proof. By Proposition 1.4.5 we know that
(y)(y ≠ 1 ⇒ (∃z)(y = 1 + z)).
Therefore, using the distributive axiom we have
(x)(y)(y ≠ 1 ⇒ (∃z)(x · y = x + x · z)),
which means (x)(y)(y ≠ 1 ⇒ x < x · y). Since (x)(x ≤ x · 1), it follows
that (x)(y)(x ≤ x · y). This says even more than we set out to prove.
NOTE: Let us agree from this point on that for n, m ∈ N we shall
write 'nm' instead of 'n · m' unless we have a special reason for emphasizing
the multiplication function. This is in accordance with the usual
procedure.
Exercises
1. (x)(y)(z)(x + z = y + z ⇒ x = y).
2. (x)(y)(z)(w)(x ≤ y & z ≤ w ⇒ xz ≤ yw).
Use this in conjunction with Proposition 1.4.5 to get the Archimedian
ordering of N.
3. If mn < mp or m + n < m + p, show that n < p.
4. If n^2 = 1, show that n = 1. In formal terms,
(x)(x^2 = 1 ⇒ x = 1).
5. If we define 2 = 1 + 1, prove that ∀m ∈ N, there exists exactly
one k ∈ N so that
m < k < m + 2.
6. Prove there is no number n so that n^2 = 2. In formal terms prove
-(∃x)(x^2 = 2).
7. If we define 3 = 2 + 1, show that there is no number n so that
2n = 3.
8. Replace the principle of induction by the principle of well ordering
in the axioms for the positive integers and prove (x)(1 ≤ x).
9. Replace the principle of induction by the principle of well ordering
in the axioms for the positive integers and prove the principle of
induction as a theorem.
10. Show that the axiom (e') implies the following:
(x)(y)(x = y ∨ (∃z)(y = x + z ∨ x = y + z)).
1.5
THE INTEGERS AND THE RATIONALS
If m and n are natural numbers with m > n, then it is not true that there
is a natural number p so that m + p = n. For if it were true, it would
contradict the axiom of trichotomy. It would be very useful if we could
embed $N$ into a larger set $Z$ so that $(x)(y)(\exists z)(y = x + z)$. (The variables
in the last expression, of course, represent elements of $Z$.) The object
of this section is to construct such a larger set $Z$ called the integers.
We shall construct the integers using ordered pairs of natural numbers.
We define an equivalence relation in $(N \times N) \times (N \times N)$ by writing
$(n,m) \sim (p,q) \Leftrightarrow n + q = m + p$.
[We are, of course, thinking of
the pair (n,m) as n - m, from which the construction of the equivalence
is natural.] To check that this is an equivalence relation we must check
that
(a) $(n,m) \sim (p,q) \Rightarrow (p,q) \sim (n,m)$,
(b) $(n,m) \sim (p,q)$ and $(p,q) \sim (r,s) \Rightarrow (n,m) \sim (r,s)$.
We think that (a) is immediate and we leave the verification to the
reader. To prove (b) we must show that from the equalities $n + q = m + p$
and $p + s = q + r$ we can get $n + s = m + r$. From the first equality we
get $n + q + r = m + p + r$. The second equality allows us to substitute
$p + s$ for $q + r$ and we get
$(n + s) + p = (m + r) + p$.
The result of Exercise 1 of Section 1.4 shows that $n + s = m + r$, which
is the result we wished to obtain.
Denote the equivalence class of (n,m) by 'Z(n,m)' and the collection
of equivalence classes by 'Z'. We want to define two functions $+$ and $\cdot$
with domain $Z \times Z$ and range in $Z$ for which the rules (a) through (d)
of Section 1.3 are valid and moreover $(x)(y)(\exists z)(x = y + z)$. We shall
define these functions as follows:
$Z(n,m) + Z(p,q) = Z(n+p,\, m+q)$,
$Z(n,m) \cdot Z(p,q) = Z(np+mq,\, mp+nq)$.
For these definitions to make sense, they must be independent of the
representatives we choose from the equivalence classes. In other words,
if $(n,m) \sim (n_1,m_1)$ and $(p,q) \sim (p_1,q_1)$, then we must make sure that
$(n+p,\, m+q) \sim (n_1+p_1,\, m_1+q_1)$
and
$(np+mq,\, mp+nq) \sim (n_1p_1+m_1q_1,\, m_1p_1+n_1q_1)$.
Otherwise we would not be defining
functions. Let us prove the first one. The equivalences $(n,m) \sim (n_1,m_1)$
and $(p,q) \sim (p_1,q_1)$ give
$n + m_1 = m + n_1$
and
$p + q_1 = q + p_1$.
If we add corresponding sides of both equations we get
$n + p + m_1 + q_1 = m + q + n_1 + p_1$.
This is precisely the condition that $(n+p,\, m+q) \sim (n_1+p_1,\, m_1+q_1)$.
We shall leave as an exercise the proof of the second statement about
equivalence.
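The independence from representatives can be spot-checked by brute force. The following is a minimal sketch, not part of the text: pairs of natural numbers are modeled as Python tuples, and the names `equiv`, `add`, `mul`, and `well_defined` are illustrative assumptions. A finite search over small pairs is of course only evidence, not a proof.

```python
from itertools import product

def equiv(a, b):
    """(n, m) ~ (p, q) exactly when n + q == m + p."""
    (n, m), (p, q) = a, b
    return n + q == m + p

def add(a, b):
    """Representative of Z(n, m) + Z(p, q) = Z(n + p, m + q)."""
    (n, m), (p, q) = a, b
    return (n + p, m + q)

def mul(a, b):
    """Representative of Z(n, m) . Z(p, q) = Z(np + mq, mp + nq)."""
    (n, m), (p, q) = a, b
    return (n * p + m * q, m * p + n * q)

def well_defined(op, bound=4):
    """Check on all pairs with entries in 1..bound that equivalent
    inputs give equivalent outputs (a finite spot check only)."""
    pairs = list(product(range(1, bound + 1), repeat=2))
    for a, a1, b, b1 in product(pairs, repeat=4):
        if equiv(a, a1) and equiv(b, b1):
            if not equiv(op(a, b), op(a1, b1)):
                return False
    return True
```

For instance, `mul((1, 3), (1, 4))` returns `(13, 7)`, which is equivalent to `(7, 1)`; this matches the informal reading $(-2)(-3) = 6$.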
The equivalence class Z(n, n) shall be denoted by a special symbol,
$0 = Z(n,n)$,
and it is given the name zero. It is almost immediate that for every
$x \in Z$, $x + 0 = x$.
Also,
$(x)(y)(\exists z)(x + z = y)$.
To prove this suppose $x = Z(n,m)$, $y = Z(p,q)$; then $z = Z(m+p,\, n+q)$
is an equivalence class so that $x + z = y$.
For the construction of Z to be meaningful, we must show that it
contains N in some sense. Define a function $\iota$ with domain N and range
in Z by the equality
$\iota(n) = Z(n+1,\, 1)$.
If $\iota(n) = \iota(m)$, then $(n+1, 1) \sim (m+1, 1)$, which by definition means
$n + 2 = m + 2$ and hence by Exercise 1 of Section 1.4 we get $n = m$.
As we noted in Definition 1.3.3, a function with this property is said
to be one to one.
The function $\iota$ also satisfies the equalities
$\iota(n + m) = \iota(n) + \iota(m)$,
$\iota(n \cdot m) = \iota(n) \cdot \iota(m)$.
Indeed, the first equality is nothing more than the fact that
$(n+m+1,\, 1) \sim (n+m+2,\, 2)$ and the second equality is the fact that
$(nm+1,\, 1) \sim (nm+n+m+2,\, n+m+2)$. A one-to-one function that preserves
the operations $+$ and $\cdot$ is called an isomorphism. Hence N has an image
in Z that reflects in a faithful manner all its properties. Therefore,
it is usual to say that $N \subset Z$ and to give the equivalence classes in Z
that correspond to elements of N the same names as the elements of N,
although strictly speaking they are different entities. The triple $(Z, +, \cdot)$
is given the name 'the integers', although usually we shall speak only
of Z as being the integers. The set N in Z will be called the natural
numbers or the positive integers. As in the case of the natural numbers, the
symbol for multiplication in $Z \times Z$ will usually be dropped and we shall
write 'nm' instead of '$n \cdot m$'.
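The embedding of N into Z can be checked numerically on small values. This is a minimal sketch under the same modeling assumption as before (classes represented by tuples of natural numbers); the function names are hypothetical, and `iota` stands for the map $n \mapsto Z(n+1, 1)$ described above.

```python
def equiv(a, b):
    """(n, m) ~ (p, q) exactly when n + q == m + p."""
    return a[0] + b[1] == a[1] + b[0]

def add(a, b):
    """Representative of Z(n, m) + Z(p, q)."""
    return (a[0] + b[0], a[1] + b[1])

def mul(a, b):
    """Representative of Z(n, m) . Z(p, q) = Z(np + mq, mp + nq)."""
    return (a[0] * b[0] + a[1] * b[1], a[1] * b[0] + a[0] * b[1])

def iota(n):
    """Representative of the image of the natural number n in Z."""
    return (n + 1, 1)

def iota_preserves_operations(bound=20):
    """Check that iota respects + and . for all n, m in 1..bound."""
    for n in range(1, bound + 1):
        for m in range(1, bound + 1):
            if not equiv(iota(n + m), add(iota(n), iota(m))):
                return False
            if not equiv(iota(n * m), mul(iota(n), iota(m))):
                return False
    return True

def iota_is_one_to_one(bound=20):
    """Check that iota(n) ~ iota(m) only when n == m, for n, m in 1..bound."""
    return all(
        equiv(iota(n), iota(m)) == (n == m)
        for n in range(1, bound + 1)
        for m in range(1, bound + 1)
    )
```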
It is very convenient to define a function with domain and range Z
and which is designated by the symbol '$-$'. This symbol represents a
collection of ordered pairs in $Z \times Z$, and very properly it is defined as
that function satisfying
$(x)(y)((x,y) \in - \Leftrightarrow (m)(n)(m,n \in N \,\&\, x = Z(m,n) \Rightarrow y = Z(n,m)))$.
The function $-$ is read: minus. In line with our usual convention, if
$(x,y) \in -$, then we write $y = -x$. This way of making a definition is
rather formidable and is usually valuable only when one already
understands what is being defined. Thus we shall not, in general, make
definitions in such a formal manner, but may introduce the formalism after
the informal statement has been given.
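1.5.1 Definition
$(x)(-x = Z(m,n) \Leftrightarrow x = Z(n,m))$.
$(x)(y)(x - y = x + (-y))$.
$-N = \{x : -x \in N\}$.
1.5.2 Definition
$(x)(y)(x < y \Leftrightarrow y - x \in N)$.
$(x)(y)(x \le y \Leftrightarrow x < y \lor x = y)$.
$(x)(y)(y > x \Leftrightarrow x < y)$.
$(x)(y)(y \ge x \Leftrightarrow x \le y)$.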
It is a simple matter to show that the properties (a) through (d) of
Section 1.4 hold for the integers. The trichotomy rule reads that for
every $x, y \in Z$ one and only one of the following holds: $x < y$, $x = y$, $y < x$.
The trichotomy rule is equivalent to the statement of the following
proposition:
1.5.3 Proposition
(a) $-N \cap N = \emptyset$.
(b) $-N \cup N \cup \{0\} = Z$.
(c) $0 \notin -N \cup N$.
Proof. (a) If $-N \cap N \ne \emptyset$ and $x \in -N \cap N$, then it follows that
$-x \in -N \cap N$. Hence $x - x = 0 \in N$, which contradicts Proposition
1.4.1.
(b) Suppose now that $x \in Z$ and $x \notin N \cup \{0\}$. Let us write
$x = Z(m,n)$. If $n < m$, then $\exists p \in N$, so that $m = n + p$. Since
$Z(p+n,\, n) = Z(p+1,\, 1)$, it follows that $x \in N$, which is a contradiction. Also
$m \ne n$, since otherwise $x = 0$. Hence $m < n$ and $-x = Z(n,m) \in N$,
which implies $x \in -N$. Thus $Z \subset -N \cup N \cup \{0\}$. Since the converse
inclusion is obvious, the proof is finished.
(c) Clear.
1.5.4 Proposition
(a) $(x)(x0 = 0)$.
(b) $(x)(y)((-x)y = x(-y) = -(xy))$.
Proof. (a) Since $0 = 0 + 0$, if we multiply both sides by $x$ we get
$x0 = x0 + x0$. Add $-(x0)$ to both sides and we get
$x0 - x0 = x0 + (x0 - x0)$.
Now, for every $x \in Z$, $x - x = 0$ and therefore we get
$0 = x0 + 0 = x0$.
(b) Using the distributive rule and part (a) of this proposition we
have
$0 = 0y = (x - x)y = xy + (-x)y$.
Add $-(xy)$ to both ends of the equality to get
$-(xy) = (-x)y$.
If we use the commutative rule and interchange the symbols 'x' and 'y'
in this formula we get
$-(xy) = (x)(-y)$.
This completes the proof.

If $n, m \in Z$ and $m \ne 0$, it is not always true that
$(\exists x)(x \in Z \,\&\, mx = n)$.
Hence it would be very convenient to embed Z into a larger system that
will make this true.
We shall consider the class
$P = \{(x,y) : x \in Z \,\&\, y \in Z \,\&\, y \ne 0\}$.
[We are, of course, thinking of the ordered pair (x,y) as x/y in the usual
notation.] We shall say
$(m,n) \sim (p,q) \Leftrightarrow mq = np$.
To prove this is an equivalence relation we must show that the symmetric

and transitive properties hold. The first property is automatic and we
leave this for the reader. Hence we must prove the transitivity property.
To do this we first prove the following.
1.5.5 Lemma. If x and y are in Z and $xy = 0$, then $x = 0$ or $y = 0$. In
more formal terms:
$(x)(y)(xy = 0 \Rightarrow [x = 0 \lor y = 0])$.
Proof. Since $-N \cup N \cup \{0\} = Z$, if we can show that not both x
and y belong to $-N \cup N$, we will be done. If $x, y \in N$, then by
Proposition 1.5.3(c), $xy \ne 0$, and this contradicts the fact that $xy = 0$.
If $x \in N$, $y \in -N$, or if $x \in -N$, $y \in N$, or if $x, y \in -N$, then since
$-0 = 0$, we get from Proposition 1.5.4,
$(x)(-y) = (-x)(y) = (-x)(-y) = 0$.
Apply the reasoning of the first paragraph to get a contradiction in
each case.
Hence we have $xy = 0 \Rightarrow \neg(x \ne 0 \,\&\, y \ne 0)$. But
$\neg(x \ne 0 \,\&\, y \ne 0) \Leftrightarrow (x = 0 \lor y = 0)$, which gives the desired conclusion.
It is now a simple matter to prove the transitivity for the equivalence
relation given above. Suppose $(m,n) \sim (p,q)$ and $(p,q) \sim (r,s)$. This
means
$mq = np$ and $ps = qr$.
Multiply the first equation by s to get
$mqs = nps$.
Then use the fact that $ps = qr$ to get
$(ms - nr)q = 0$.
From Lemma 1.5.5 we get $ms - nr = 0 \lor q = 0$. But $q \ne 0$
and thus $ms = nr$.
We shall denote the equivalence class of an ordered pair $(m,n)$, $n \ne 0$,
by 'Q(m,n)'. The collection of equivalence classes shall be denoted
by 'Q'. We shall also define two functions $+$ and $\cdot$ with domain
$Q \times Q$ and range Q by means of the following equations:
$Q(m,n) + Q(p,q) = Q(mq + np,\, nq)$,
$Q(m,n) \cdot Q(p,q) = Q(mp,\, nq)$.
These are well-defined functions that satisfy the commutative,
associative, and distributive laws. The triple $(Q, +, \cdot)$ will be called the rational
numbers, and by an abuse of language we shall refer to Q as the rationals.
We shall write
$(r)(-r = Q(m,n) \Leftrightarrow r = Q(-m,n) = Q(m,-n))$,
and we shall set
$Q^+ = \{Q(m,n) : (m > 0) \,\&\, (n > 0)\}$.
$Q^+$ will be called the positive rational numbers. As before, we shall take
$r - s = r + (-s)$.
The relation $<$ shall be defined as before,
$(r)(s)(r < s \Leftrightarrow s - r \in Q^+)$,
$(r)(s)(r \le s \Leftrightarrow r < s \lor r = s)$,
and we shall also define
$(r)(s)(s > r \Leftrightarrow r < s)$,
$(r)(s)(s \ge r \Leftrightarrow r \le s)$.
The integers Z can be embedded into Q by means of an order-preserving
isomorphism defined by the equality
$\eta(n) = Q(n, 1)$.
By $\eta$ being order-preserving we mean that $\forall m, n \in Z$, if $m < n$ we
have $\eta(m) < \eta(n)$.
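The construction of the rationals can be modeled in a few lines. The sketch below is an illustration only: a class Q(m, n) is represented by a tuple of Python integers with nonzero second entry, and every function name (`q_equiv`, `q_add`, `q_mul`, `q_neg`, `is_positive`, `less`, `eta`) is a hypothetical helper, not notation from the text. The positivity test uses the observation that a class has a representative with both entries positive exactly when the product of the entries of any representative is positive.

```python
def q_equiv(a, b):
    """(m, n) ~ (p, q) exactly when mq == np."""
    return a[0] * b[1] == a[1] * b[0]

def q_add(a, b):
    """Representative of Q(m, n) + Q(p, q) = Q(mq + np, nq)."""
    return (a[0] * b[1] + a[1] * b[0], a[1] * b[1])

def q_mul(a, b):
    """Representative of Q(m, n) . Q(p, q) = Q(mp, nq)."""
    return (a[0] * b[0], a[1] * b[1])

def q_neg(a):
    """Representative of -Q(m, n) = Q(-m, n)."""
    return (-a[0], a[1])

def is_positive(a):
    """True when the class of a belongs to Q+."""
    return a[0] * a[1] > 0

def less(a, b):
    """r < s exactly when s - r is in Q+."""
    return is_positive(q_add(b, q_neg(a)))

def eta(n):
    """Representative of the image of the integer n in Q."""
    return (n, 1)
```

For example, `q_add((1, 2), (1, 3))` returns `(5, 6)`, and replacing `(1, 2)` by the equivalent pair `(2, 4)` gives `(10, 12)`, which lies in the same class.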
We shall now introduce the concept of absolute value on Q. This will

be needed to describe the construction of the real number system.
1.5.6 Definition. The absolute value is that function with domain Q
and range in $Q^+ \cup \{0\}$ defined by the following:
$|r| = r \Leftrightarrow r \ge 0$,
$|r| = -r \Leftrightarrow r < 0$.
1.5.7 Proposition
$(x)(-x \le |x| \,\&\, x \le |x|)$.
$(x)(|x| = |-x|)$.
$(x)(y)(|xy| = |x|\,|y|)$.
Proof. Exercise 18 of Section 1.5.
1.5.8 Proposition (Triangle Inequality)
$(x)(y)(\,||x| - |y|| \le |x + y| \le |x| + |y|\,)$.
Proof. Since $x \le |x|$ and $y \le |y|$, it follows that $x + y \le |x| + |y|$.
In the same way it follows that $-(x + y) \le |x| + |y|$. Thus from the
definition of the absolute value we have
$|x + y| \le |x| + |y|$.
In this inequality, replace $x$ by $-x$ and $y$ by $y + x$ to get
$|y| - |x| \le |x + y|$.
In the same way, if we replace $x$ by $-y$ and $y$ by $y + x$ we get
$|x| - |y| \le |x + y|$.
Taken together these say that
$||x| - |y|| \le |x + y| \le |x| + |y|$.
Exercises
1. Show that the properties (a) through (d) given in Section 1.4 as
part of the axioms for N hold for Z.
2. Show the following:
$(x)(y)(x, y \in Z \,\&\, x < y \Rightarrow (\exists w)(w \in N \,\&\, y = x + w))$.
3. Trichotomy in Z is taken to mean that $\forall x, y \in Z$, one and only
one of the following is true: $x < y$, $x = y$, $x > y$. Show that trichotomy
in Z is equivalent to the statement of Proposition 1.5.3.
4. Show that the isomorphisms $\iota$ and $\eta$ are order-preserving.
5. Show that Z is Archimedian-ordered in the following sense:
$(x)(y)(y > 0 \Rightarrow (\exists z)(z \in N \,\&\, x \le zy))$.
6.
(a) $x, y \in -N \Rightarrow xy \in N$.
(b) $x \in N,\ y \in -N \Rightarrow xy \in -N$.
Using these facts and the known fact that the product of elements in
N is again an element of N, show that
(c) $x < y \,\&\, z > 0 \Rightarrow xz < yz$.
(d) $x < y \,\&\, z < 0 \Rightarrow yz < xz$.
7. Show that multiplication in Z is well defined, that is, independent
of the representatives chosen from the equivalence classes.
8. Show the following:
$(x)(y)(x, y \in Z \Rightarrow (-x)(-y) = xy)$.
9. Show that the commutative, associative, and distributive laws are
valid for the rational numbers.
10. (a) Show the following:
(1) $(r)(s)(r, s \in Q \,\&\, r \ne 0 \Rightarrow (\exists x)(x \in Q \,\&\, rx = s))$.
(2) $(r)(s)(r, s \in Q \Rightarrow (\exists x)(x \in Q \,\&\, r + x = s))$.
(b) For given r and s show that x obtained in (1) and (2) is unique.
11. In defining the equivalence classes for the rationals we allowed
only ordered pairs of integers $(m, n)$ for which $n \ne 0$. If we had allowed
ordered pairs of the form $(m, 0)$, describe at least one unfavorable
circumstance that would have occurred.
12. Show that $Q^+$ is not well ordered under the order relation $\le$.
That is, it is not true that every subset of $Q^+$ has a least element.
13. Prove that $Q^+$ is Archimedian ordered; that is, for every r and s
in $Q^+$, $\exists n \in N$ such that $r \le ns$.
14. Let us define
$Q^- = \{x : x \in Q \,\&\, -x \in Q^+\}$.
Show that $Q^- \cap Q^+ = \emptyset$ and $Q^- \cup Q^+ \cup \{0\} = Q$.
15. Do Exercise 6 with $-N$ and $N$ replaced by $Q^-$ and $Q^+$, respectively.
16. If $m \in Z$, let
$Z_m = \{k : k \in Z \,\&\, k \ge m\}$.
Show that the principle of induction implies the following for every
$M \subset Z_m$:
$[m \in M \,\&\, (x)(x \in M \Rightarrow x + 1 \in M)] \Rightarrow M = Z_m$.
17. Prove that the set $Z_m$ of Exercise 16 is well ordered.
18. Prove Proposition 1.5.7.
1.6
COUNTABILITY
It is interesting, as well as important, to know that there are the "same

number" of rational numbers as there are positive integers. By this we
mean that there is a one-to-one function with domain
N and range Q.
In this section we shall prove this fact.
1.6.1 Definition. Any set that is the range of a one-to-one function with
domain N will be called denumerable.
1.6.2
Lemma.
The set {(m, n) : m E N & n E N} is denumerable.
We shall first give an informal construction so as to make the formal
proof understandable. Let us look at a picture of the lattice points
(m, n), Fig. 1.6.1. Now follow along the paths of the arrows and attach
successive positive integers to the successive points on the paths. We
shall write this out as follows, calling the point (1, 1) the first arrow.
1st arrow: $1 \to (1,1)$
2nd arrow: $2 \to (1,2)$, $3 \to (2,1)$
3rd arrow: $4 \to (1,3)$, $5 \to (2,2)$, $6 \to (3,1)$
nth arrow: $m_n \to (1,n)$, $m_n + 1 \to (2,n-1)$, $\ldots$, $m_n + n - 1 \to (n,1)$.

FIGURE 1.6.1 [the lattice points (m, n) of $N \times N$, with arrows running along the diagonals from (1, n) to (n, 1)]

We have taken $m_n$ to be the integer that begins the nth arrow. Since
the nth arrow contains n elements, it is clear that
$m_{n+1} = m_n + n$.
Hence
$m_n = m_1 + (m_2 - m_1) + (m_3 - m_2) + \cdots + (m_n - m_{n-1})$
$= 1 + 1 + 2 + \cdots + (n-1)$
$= 1 + \frac{n(n-1)}{2}$.
Consequently, we have the general correspondence
$\frac{n(n-1)}{2} + k + 1 \leftrightarrow (k+1,\, n-k)$, $0 \le k \le n-1$.
We are now in a position to prove Lemma 1.6.2. We will have proved
this if we can show that the relation $\Phi$ which is the set of ordered pairs
of the form
$\left(\frac{n(n-1)}{2} + k + 1,\ (k+1,\, n-k)\right)$, $n \in N$, $0 \le k \le n-1$,
is a function with domain N, range $N \times N$, and is one to one.

It is convenient for us to use the term '$\langle m, n \rangle$' to stand for all
integers between m and n; that is,
(1.6.1) $\langle m, n \rangle = \{k : k \in Z \,\&\, m \le k \le n\}$.
Let us first show that the domain of $\Phi$ is N. We shall show the
following is true:
$(m)(m \in N \Rightarrow (\exists n)(\exists k)(n \in N \,\&\, k \in \langle 0, n-1 \rangle \,\&\, \frac{n(n-1)}{2} + k + 1 = m))$.
Let P(m) be the statement:
$(\exists n)(\exists k)(n \in N \,\&\, k \in \langle 0, n-1 \rangle \,\&\, \frac{n(n-1)}{2} + k + 1 = m)$.
P(1) is true. Indeed, for $m = 1$ we take $n = 1$, $k = 0$. Next we shall
show that $\forall m$, $P(m) \Rightarrow P(m+1)$. Let n and k be integers of the required
kind, so that
(1.6.2) $m = \frac{n(n-1)}{2} + k + 1$.
Then
$m + 1 = \frac{n(n-1)}{2} + (k+1) + 1$.
If $0 \le k < n-1$, then $0 \le k+1 \le n-1$ and hence n and k + 1 will
make P(m+1) true. If $k = n-1$, take $n_1 = n+1$, $k_1 = 0$, and we see
that
$m + 1 = \frac{n_1(n_1-1)}{2} + k_1 + 1$.
Therefore, also in this case P(m+1) is true. By the principle of
induction P(m) is true for all m in N.
We next show that $\Phi$ is a function. This will be accomplished if we
can show that every integer m has a unique representation in the form
given by (1.6.2). Suppose that $n$, $k$, $n_1$, and $k_1$ are integers with
$0 \le k \le n-1$, $0 \le k_1 \le n_1-1$, and
$\frac{n(n-1)}{2} + k + 1 = \frac{n_1(n_1-1)}{2} + k_1 + 1$.
If $n < n_1$, then
$\frac{n(n-1)}{2} + k < \frac{n(n-1)}{2} + n = \frac{n(n+1)}{2} \le \frac{n_1(n_1-1)}{2} \le \frac{n_1(n_1-1)}{2} + k_1$,
and conversely if $n_1 < n$ we get the reverse inequality. Hence we can
have equality if and only if $n = n_1$, and this implies that $k = k_1$. This
shows that for a given m there is one and only one ordered pair
$(k+1,\, n-k)$ corresponding to it. This shows that $\Phi$ is a function.
Finally, to show $\Phi$ is one to one we note that if
$\Phi\left(\frac{n(n-1)}{2} + k + 1\right) = (k+1,\, n-k) = \Phi\left(\frac{n_1(n_1-1)}{2} + k_1 + 1\right) = (k_1+1,\, n_1-k_1)$,
then $k = k_1$, $n = n_1$, and therefore
$\frac{n(n-1)}{2} + k + 1 = \frac{n_1(n_1-1)}{2} + k_1 + 1$.
The fact that $\mathcal{R}(\Phi) = N \times N$ is almost obvious.
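The correspondence used in the proof of Lemma 1.6.2 can be computed directly. The following is a minimal sketch, with `phi` and `phi_inverse` as illustrative names: `phi(m)` finds the arrow n and offset k with m = n(n-1)/2 + k + 1 and 0 <= k <= n-1, and returns the lattice point (k+1, n-k); `phi_inverse` recovers m from the point.

```python
def phi(m):
    """Return the point (k + 1, n - k), where m = n(n-1)/2 + k + 1
    and 0 <= k <= n - 1."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    n = 1
    # The nth arrow starts at m_n = n(n-1)/2 + 1; advance while the
    # start of the next arrow, m_{n+1} = n(n+1)/2 + 1, is still <= m.
    while n * (n + 1) // 2 + 1 <= m:
        n += 1
    k = m - (n * (n - 1) // 2 + 1)
    return (k + 1, n - k)

def phi_inverse(point):
    """Return the positive integer attached to the lattice point (a, b)."""
    a, b = point
    n = a + b - 1  # a = k + 1 and b = n - k give a + b = n + 1
    k = a - 1
    return n * (n - 1) // 2 + k + 1
```

A quick use: `[phi(m) for m in range(1, 7)]` reproduces the first three arrows, (1,1), (1,2), (2,1), (1,3), (2,2), (3,1). Composing the two functions in either order on a finite range gives the identity, which is the computational counterpart of the statement that the correspondence is one to one with range $N \times N$.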
1.6.3 Definition. A set A is said to be finite $\Leftrightarrow$ A is void or there exists
an $n \in N$ and a one-to-one function $\Phi$ with domain
$\langle 1, n \rangle = \{k : k \in N \,\&\, 1 \le k \le n\}$ such that $A = \mathcal{R}(\Phi)$.
In these cases A is said to have zero
elements or n elements, respectively. Otherwise A is said to be infinite. A set that
is finite or denumerable is called countable.
1.6.4 Theorem. An infinite subset of a denumerable set is denumerable.
Proof. Suppose A is denumerable, $B \subset A$, and B is infinite. Since
A is denumerable, there is a one-to-one function $\Phi$ with domain N
and range A. Let
$C = \{n : n \in N \,\&\, \Phi(n) \in B\}$;
then $C \subset N$, $\Phi$ takes C onto B, and C is infinite. Indeed, if C is finite
there is an integer m and a one-to-one function $\Psi$ with domain the set
$\langle 1, m \rangle$ and range C. But then $\Phi \circ \Psi$ is a one-to-one function with
domain $\langle 1, m \rangle$ and range B, which contradicts the fact that B is infinite.
If we can show that C is denumerable, then it will follow that B is
denumerable. For, if $\pi$ is a one-to-one function with domain N and
range C, then $\Phi \circ \pi$ is a one-to-one function with domain N and range B.
The method of constructing $\pi$ proceeds informally as follows:
$\pi(1)$ = smallest element of $C$
$\pi(2)$ = smallest element of $C \setminus \{\pi(1)\}$
$\pi(n)$ = smallest element of $C \setminus \{\pi(1), \ldots, \pi(n-1)\}$
To proceed more formally we let P(n) be the statement: There exists
a unique function $\pi_n$ with domain $\langle 1, n \rangle$ and range in C such that
$j < k \le n$ implies $\pi_n(j) < \pi_n(k)$, and $l \in C$ and $l \le \pi_n(n)$ implies
$l \in \mathcal{R}(\pi_n)$. The statement P(1) is true; simply take $\pi_1(1)$ as the first
element of C. If P(n) is true, define
$\pi_{n+1}(k) = \pi_n(k) \Leftrightarrow k < n+1$
and $\pi_{n+1}(n+1)$ to be the smallest element in $C \setminus \mathcal{R}(\pi_n)$. We
leave to the reader the easy task of verifying that all the conditions of the
statement P(n+1) are satisfied. Hence, by the principle of induction,
$(n)(P(n))$.
Define
$\pi(n) = \pi_n(n)$.
This defines a function with domain N and $\mathcal{R}(\pi) \subset C$. $\pi$ has the
property that $j < k \Rightarrow \pi(j) < \pi(k)$ and if $l \in C$ and $l \le \pi(n)$ for some
$n \in N$, then $l \in \mathcal{R}(\pi)$. To prove these statements we first note that if
$k \le n$, then
$\pi_n(k) = \pi_k(k)$.
This follows from the uniqueness of the function $\pi_k$. Indeed, if we
restrict $\pi_n$ to the set $\langle 1, k \rangle$, we get a function that has all the properties
of $\pi_k$ and hence by the uniqueness of $\pi_k$ this restriction of $\pi_n$ must be
$\pi_k$. Therefore, if $j < k$ we have
$\pi(j) = \pi_j(j) = \pi_k(j) < \pi_k(k) = \pi(k)$,
and if $l \le \pi(n)$, $l \in C$, then $l \le \pi_n(n)$ and hence by P(n) there is a
$k \le n$ so that $l = \pi_n(k) = \pi_k(k) = \pi(k)$.
The above proof shows that $\pi$ is a one-to-one function. Indeed, if
$\pi(m) = \pi(n)$ we must have $m = n$. Otherwise, $m < n$ or $n < m$. In the
first case we get $\pi(m) < \pi(n)$ and in the second case $\pi(n) < \pi(m)$.
Each contradicts the fact that $\pi(m) = \pi(n)$.
The function $\pi$ has all of C as its range. In the contrary case,
$C \setminus \mathcal{R}(\pi) \ne \emptyset$. Let n be the first element of $C \setminus \mathcal{R}(\pi)$. Now, the
collection $\{k : \pi(k) < n\}$ is nonempty and finite and hence has a maximum
element m. But $\pi(m+1) \ge n$, which means $n \in \mathcal{R}(\pi)$, which is a
contradiction. This completes the proof of the theorem.
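The informal recipe for the enumeration of C translates into a short search. This is a sketch only, assuming C is given by a membership test `in_C` on the positive integers; the name `pi_values` is hypothetical. Because the values already chosen are increasing, the smallest element of C not yet chosen is found by continuing the upward scan. The loop terminates only if C has at least `how_many` elements, which holds whenever C is infinite.

```python
def pi_values(in_C, how_many):
    """Return [pi(1), ..., pi(how_many)], where pi(n) is the smallest
    element of C not among pi(1), ..., pi(n-1), and C = {n : in_C(n)}."""
    chosen = []
    candidate = 1
    while len(chosen) < how_many:
        # Every element of C below `candidate` has already been chosen,
        # so the first member found from here on is the smallest one left.
        if in_C(candidate):
            chosen.append(candidate)
        candidate += 1
    return chosen
```

For example, with C the multiples of 3, `pi_values(lambda n: n % 3 == 0, 5)` gives `[3, 6, 9, 12, 15]`; the output is strictly increasing and skips no element of C, which are the two properties established for the function in the proof.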
1.6.5 Theorem. $Q^+$ is denumerable.
Proof. Suppose $r = Q(m, n) \in Q^+$ and let $\Phi$ be the one-to-one
function obtained in Lemma 1.6.2 that maps N onto $N \times N$. Set
$P(r) = \{x : x \in N \,\&\, \Phi(x) \in r\}$,
$\omega(r) = \Phi(k)$, k = least element of P(r).
This defines $\omega$ as a function with domain $Q^+$ and range in $N \times N$. It
is a one-to-one function, since if
$\omega(r) = \Phi(k) = \Phi(j) = \omega(q)$,
then $j = k$; this implies $j \in P(r)$, which means $q = r$.
The set $\mathcal{R}(\omega)$ is infinite, since the one-to-one character of $\omega$ implies
that $\mathcal{R}(\omega \mid N)$ is already infinite. By Lemma 1.6.2 $N \times N$ is denumerable
and since $\mathcal{R}(\omega)$ is infinite, by Theorem 1.6.4 it is also denumerable.
Hence there is a one-to-one function $\pi$ that takes N onto $\mathcal{R}(\omega)$. The
function $\omega^{-1} \circ \pi$ is a one-to-one function with domain N and range $Q^+$.
Exercises
1. Prove that every subset of a finite set is finite.
2. Provide the details of the proof that $P(n) \Rightarrow P(n+1)$, where P(n)
is the statement given in the proof of Theorem 1.6.4.
3. Justify all the statements in the last paragraph of the proof of
Theorem 1.6.4. Prove:
(a) $\{k : \pi(k) < n\}$ is nonempty and finite.
(b) Every nonempty finite subset of N has a maximum element.
(c) $\pi(m+1) \ge n$.
4. Show that the function $\pi$ constructed in the proof of Theorem
1.6.4 is unique in that it is the only function from N onto C that has its
properties.
5. (a) Show that a denumerable set is infinite.
(b) Show that if $B \subset A$ and B is infinite, then A is infinite.
These two facts will completely justify the proof of Theorem 1.6.5.
6. Show that the rationals Q are denumerable and hence the integers
Z are denumerable.
7. Show that Definition 1.6.3 is meaningful in the sense that a finite
set cannot have both n elements and m elements, where $n \ne m$.
8. If A is a finite set with n elements, show that there are n! one-to-one
functions with domain $\langle 1, n \rangle$ and range A.
9. Let $\{A_1, \ldots, A_n\}$ be a collection of n sets. Define
$A_1 \times \cdots \times A_n$ "inductively" by setting
$A_1 \times A_2 \times A_3 = (A_1 \times A_2) \times A_3$,
$A_1 \times \cdots \times A_k = (A_1 \times \cdots \times A_{k-1}) \times A_k$,
$A_1 \times \cdots \times A_n = (A_1 \times \cdots \times A_{n-1}) \times A_n$.
The elements of $A_1 \times \cdots \times A_n$ are written $(a_1, \ldots, a_n)$. (We shall
not make the meaning of "definition by induction" precise at this time.)
If each set $A_k$ is countable, show that $A_1 \times \cdots \times A_n$ is countable.
10. Show that the collection of finite subsets of N is denumerable by
proceeding in the following way: If A is a finite subset of N, set
$\chi_A(k) = 0 \Leftrightarrow k \notin A$,
$\chi_A(k) = 1 \Leftrightarrow k \in A$.
Since A is finite, there is a smallest $n(A) \in N$ so that $\chi_A(k) = 0$ if
$k > n(A)$. Now set
Show that is a one-to-one map with range an infinite subset of Q+.
The results of Exercise 5 may be helpful.
From this result it might be natural to guess that the collection of all
subsets of N is denumerable. Disturbingly enough, this is not true, as
we shall show in Section 3.3.
1.7
THE REALS
The equation $x^2 = 2$ has no rational solution. Indeed, suppose there is
a rational solution p/q (in the standard terminology), where p and q are
not both even. Then $p^2 = 2q^2$, which means $p^2$ and therefore p is even.
If we write $p = 2k$, then $p^2 = 4k^2 = 2q^2$, which means $2k^2 = q^2$ and hence
$q^2$ and q are even. This contradicts the fact that p or q is not even.
An equation of the form we have just considered arises quite naturally
when we try to compute the length of the hypotenuse of a right triangle
with legs each of length one unit. The "geometric proof" of the
Pythagorean theorem is interesting. We consider a square of side length c and
decompose it into right triangles with legs having lengths a and b, as
shown in Fig. 1.7.1. The square has been decomposed into four
congruent right triangles and a square of side length a - b. Hence we have
area $= c^2 = 2ab + (a - b)^2 = a^2 + b^2$.

FIGURE 1.7.1 [a square of side length c decomposed into four congruent right triangles with legs a and b and an inner square of side length a - b]

There are various methods for "completing" the rational numbers,
so that, for example, an equation such as the one we have been
considering above has a solution. One method is due to R. Dedekind, and
another method is due to G. Cantor. We shall follow the latter's method
here.
To describe Cantor's method for completing the rationals it will be
necessary for us first to make some remarks about rational sequences.
1.7.1 Definition. A rational sequence is a function with domain the
nonnegative integers $N_0 = N \cup \{0\}$ and range a subset of the rational numbers.
There is an operation of addition and multiplication for rational
sequences given formally by the following.
1.7.2 Definition. If r and s are rational sequences, then $r + s$ and $r \cdot s$
are rational sequences defined by the following equalities for all $n \in N_0$:
$(r + s)(n) = r(n) + s(n)$,
$(r \cdot s)(n) = r(n)s(n)$.
We now come to a special type of rational sequence that bears the
name of one of the great nineteenth-century mathematicians, Augustin
Cauchy. In his Cours d'analyse Cauchy stated that irrational numbers are
to be regarded as limits of sequences of rational numbers. He tried to
prove that if r is a rational sequence so that $|r(n) - r(m)|$ tends to zero
as n and m get larger, then r(n) tends to a real number. Of course, since
he did not really have a definition of an irrational number, it was not
possible to prove this.
1.7.3 Definition. A rational sequence r is said to be a rational Cauchy
sequence $\Leftrightarrow \forall \varepsilon \in Q^+$, $\exists M$ so that if $n, m \in N$ and $n \ge M$, $m \ge M$, then
$|r(m) - r(n)| < \varepsilon$.
In the notation of the predicate calculus the last statement of the
definition would be as follows:
$(\varepsilon)(\varepsilon \in Q^+ \Rightarrow (\exists M)(m)(n)(m, n \in N \,\&\, m, n \ge M \Rightarrow |r(m) - r(n)| < \varepsilon))$.
This may actually look rather formidable, but after the definition of a
rational Cauchy sequence is understood, this notation will seem quite
reasonable. Actually, this method of writing the definition of a Cauchy
sequence is very convenient if one should want to negate the statement
that a sequence is Cauchy.
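A concrete rational sequence may help in reading the definition. The sketch below is an illustration that does not appear in the text: it uses the iteration r(0) = 1, r(j+1) = (r(j) + 2/r(j))/2, whose terms are rational and approach a solution of $x^2 = 2$, and it tests the Cauchy condition on a finite window of indices with exact arithmetic from Python's `fractions` module. The names `newton_sqrt2` and `cauchy_on_window` are hypothetical, and a finite check is evidence for the Cauchy property, not a proof of it.

```python
from fractions import Fraction

def newton_sqrt2(n):
    """Return r(n) for the rational sequence r(0) = 1,
    r(j + 1) = (r(j) + 2 / r(j)) / 2."""
    x = Fraction(1)
    for _ in range(n):
        x = (x + 2 / x) / 2
    return x

def cauchy_on_window(r, eps, M, upto):
    """Check |r(m) - r(n)| < eps for all indices M <= m, n <= upto."""
    terms = [r(n) for n in range(M, upto + 1)]
    return all(abs(a - b) < eps for a in terms for b in terms)
```

With $\varepsilon = 1/1000$ the choice M = 3 passes the check on the window up to index 8, while M = 0 fails because $|r(0) - r(1)| = 1/2$. This shows how M has to depend on $\varepsilon$.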
To obtain some properties of rational Cauchy sequences we make a
few more definitions.
1.7.4 Definition. The maximum of two rationals, $r, s \in Q$, is the value
at (r, s) of that function with domain $Q \times Q$ defined as follows:
$\max(r, s) = r \Leftrightarrow s \le r$,
$\max(r, s) = s \Leftrightarrow r \le s$.
In an analogous fashion we could define the minimum function on $Q \times Q$ with
values min(r, s).
In Exercise 2 at the end of this section the reader is asked to show
that every finite set of real numbers has a unique maximum and a
unique minimum element. Hence if $\{r(k) : k \in \langle 1, n \rangle\}$ is a finite
set of numbers we can define, in an obvious way, a maximum function
and a minimum function on these finite sets. We shall designate the
values of these functions, respectively, by
$\max\{r(k) : k \in \langle 1, n \rangle\}$,
$\min\{r(k) : k \in \langle 1, n \rangle\}$.
1.7.5 Definition. A set $A \subset Q$ is said to be bounded $\Leftrightarrow \exists m$ so that
$\forall r \in A$, $|r| \le m$.
In the symbolism of the predicate calculus the last part of this
statement is written as follows:
$(\exists y)(x)(x \in A \Rightarrow |x| \le y)$.
Note, at this stage, unless stated to the contrary, our variables are
operative on the universe of the rationals.
1.7.6 Lemma. Every finite set in Q is bounded.
Proof. For $n \in N_0 = N \cup \{0\}$ let P(n) be the statement: Every
set in Q with n elements is bounded. P(0) is true, since the null set is
clearly bounded. Assume P(n) is true. Suppose $A \subset Q$ has n + 1
elements; that is, there exists a one-to-one function $\Phi$ with domain
$\langle 1, n+1 \rangle$ and range A. Let $B = \{\Phi(k) : k \in \langle 1, n \rangle\}$; then B has
n elements and by P(n), $\exists l$ so that $\forall r \in B$, $|r| \le l$. Let
$m = \max(l, |\Phi(n+1)|)$; then it is clear that $\forall r \in A$, $|r| \le m$. Thus
$P(n) \Rightarrow P(n+1)$ and by the principle of induction $(x)(P(x))$.
REMARK: In the above proof we have used the principle of induction
as applied to $N_0 = N \cup \{0\}$ rather than to N. That it may be so
applied follows from Exercise 16 of Section 1.5.
1.7.7 Proposition. Every rational Cauchy sequence is bounded.
Proof. By the definition of a Cauchy sequence r, $\exists M$ so that $n \ge M \Rightarrow$
$|r(M) - r(n)| < 1$.
From the triangle inequality we get, for $n \ge M$,
$|r(n)| < 1 + |r(M)|$.
The set $\{r(n) : n \in \langle 0, M-1 \rangle\}$ is finite and hence by the previous
lemma is bounded, say by k. Set $m = \max(k, 1 + |r(M)|)$ and we see
that
$(n)(|r(n)| \le m)$.
1.7.8 Proposition. If r and s are rational Cauchy sequences, then
$r + s$ and $r \cdot s$ are rational Cauchy sequences.
Proof. For every $\varepsilon \in Q^+$, $\exists M$ such that $n, m \ge M$ implies
$|r(n) - r(m)| < \varepsilon/2$ and $|s(n) - s(m)| < \varepsilon/2$.
Hence, if $n, m \ge M$,
$|(r+s)(n) - (r+s)(m)| \le |r(n) - r(m)| + |s(n) - s(m)| < \varepsilon$.
Since r and s are Cauchy sequences they are bounded and hence
$(\exists k)(n)(|r(n)| < k \,\&\, |s(n)| < k)$. There exists an M such that $n, m \ge M$
implies
$|r(n) - r(m)| < \varepsilon/(2k)$ and $|s(n) - s(m)| < \varepsilon/(2k)$.
Hence, if $n, m \ge M$,
$|(r \cdot s)(n) - (r \cdot s)(m)| = |r(n)s(n) - r(n)s(m) + r(n)s(m) - r(m)s(m)|$
$\le |r(n)|\,|s(n) - s(m)| + |s(m)|\,|r(n) - r(m)| < \varepsilon$.
As we have mentioned before, it was impossible for Cauchy to prove
that every rational Cauchy sequence converged to a real number, since
he had no formulation of an irrational number. This gap was filled in
the second part of the nineteenth century by people such as Cantor,
Dedekind, Heine, and Weierstrass. From intuitive geometric
considerations it seemed reasonable that a rational Cauchy sequence should
converge to some entity. It was Cantor's view that this entity could be
identified with the rational Cauchy sequence itself. However, again from
an intuitive geometric point of view, there are many Cauchy sequences
that converge to the same entity so, more properly, Cantor's view was
that this entity could be identified with an equivalence class of rational
Cauchy sequences.
1.7.9 Definition. Two rational Cauchy sequences r and s are said to be
equivalent $\Leftrightarrow$ for every $\varepsilon \in Q^+$, $\exists M$ such that for every $n \in N_0$ with $n \ge M$
we have $|r(n) - s(n)| < \varepsilon$.
We leave as an exercise the proof of the fact that what we have
defined is indeed an equivalence relation.
We shall designate the collection of equivalence classes of rational
Cauchy sequences by 'R' and the equivalence class of a rational Cauchy
sequence r by 'R(r)'. We shall define two functions $+$ and $\cdot$ with domain
$R \times R$ and range R by means of the equations
$R(r) + R(s) = R(r + s)$,
$R(r) \cdot R(s) = R(r \cdot s)$.
It is, of course, necessary to show that these are well-defined functions.
That is, the definitions are independent of the particular representatives
chosen from each class. We shall leave the proofs of these facts as
exercises. We shall call the ordered triple $(R, +, \cdot)$ the real number system, but
by an abuse of language we shall simply say that R is the real number
system. Also, in accordance with standard practice, we shall usually
drop the multiplication dot when multiplying real numbers.
We shall leave as an exercise the facts that R obeys the commutative,
associative, and distributive laws. The rationals Q are embedded into R
by means of the isomorphism defined by
$\rho(s) = R(r_s)$, where $r_s(n) = s$ for all $n \in N_0$.
It is clear that $\rho$ is order-preserving (see
Definition 1.7.13 and the notion of order preserving before Definition
1.5.6). In most instances no confusion will result if we label the elements
in R which are in the range of $\rho$ by the same names used in Q. In fact,
we shall suppose that $N \subset Z \subset Q \subset R$.
1.7.10 Definition. A rational sequence r is said to be positive $\Leftrightarrow$
$\exists \delta \in Q^+$ & $\exists M$ so that $\forall n \in N_0$ with $n \ge M$ we have $r(n) \ge \delta$.
In the formalism of the predicate calculus the last part of the above
statement would read as follows:
$(\exists \delta)(\delta \in Q^+ \,\&\, (\exists M)(n)(n \in N_0 \,\&\, n \ge M \Rightarrow r(n) \ge \delta))$.
1.7.11 Definition. $R^+$ is the set of all equivalence classes R(r), where
r is a positive rational Cauchy sequence. We also define $-R(r) = R(-r)$,
where $(-r)(n) = -r(n)$, and $R^- = \{R(r) : -R(r) \in R^+\}$. The set $R^+$ is
called the positive real numbers and the set $R^-$ is called the negative real
numbers. As usual we write $R(r) - R(s)$ for $R(r) + (-R(s))$.
The next theorem is the statement of trichotomy for R.
1.7.12 Theorem
$0 \notin R^+ \cup R^-$.
$R^+ \cap R^- = \emptyset$.
$R^+ \cup R^- \cup \{0\} = R$.
Proof. The first statement is clear. Next, if $R(r) \in R^+ \cap R^-$, then
$-R(r) \in R^+ \cap R^-$, which implies $R(r) - R(r) = 0 \in R^+ \cap R^-$, which
is a contradiction.
Suppose $R(r) \ne 0$. This means that r is not equivalent to the zero
sequence; that is,
$\neg(\varepsilon)(\varepsilon \in Q^+ \Rightarrow (\exists M)(n)(n \in N_0 \,\&\, n \ge M \Rightarrow |r(n)| < \varepsilon))$
is a true statement. Using our rule for negating statements we find that
this statement becomes
$(\exists \varepsilon)(\varepsilon \in Q^+ \,\&\, (M)(\exists n)(n \in N_0 \,\&\, n \ge M \,\&\, |r(n)| \ge \varepsilon))$.
Since r is a Cauchy sequence, $\exists N$ so that $n, m \ge N \Rightarrow |r(n) - r(m)| < \varepsilon/2$.
Choose $n_0 > N$ so that $|r(n_0)| \ge \varepsilon$.
Now, we know that trichotomy holds in Q and hence since $r(n_0) \ne 0$,
we have $r(n_0) > 0 \lor r(n_0) < 0$. In the first case, since $r(n_0) - r(m) \le$
$|r(n_0) - r(m)| < \varepsilon/2$ for $m \ge N$, it follows that $r(m) > r(n_0) - \varepsilon/2 \ge \varepsilon/2$.
In the second case, since $r(m) - r(n_0) \le |r(n_0) - r(m)| < \varepsilon/2$ it follows
that for $m \ge N$, $-r(m) > -r(n_0) - \varepsilon/2 = |r(n_0)| - \varepsilon/2 \ge \varepsilon/2$. Hence r
or $-r$ is a positive sequence.
1.7.13 Definition
$(x)(y)(x < y \Leftrightarrow y - x \in R^+)$.
$(x)(y)(x \le y \Leftrightarrow x < y \lor x = y)$.
$(x)(y)(x > y \Leftrightarrow y < x)$.
$(x)(y)(x \ge y \Leftrightarrow y \le x)$.
Note that in terms of the relation $<$, Theorem 1.7.12 is the statement
of trichotomy for R in the form: If $x, y \in R$, then one and only one of
the statements $x < y$, $x = y$, $y < x$ is true. Indeed, the three conditions
in the theorem are equivalent to the statement that $\forall x, y \in R$ one and
only one of the following possibilities holds: $y - x \in R^+$, $y - x = 0$,
$y - x \in R^-$.
1.7.14 Lemma. If $x < y$, then $\exists r \in Q$ such that $x < r < y$.
Proof. Let $x = R(\xi)$, $y = R(\eta)$; since $y - x > 0$, $\exists \delta \in Q^+$ and $\exists L$
such that $n \ge L \Rightarrow \eta(n) - \xi(n) \ge \delta > 0$. Further, since $\xi$ and $\eta$ are
rational Cauchy sequences, $\exists M$ such that $n, m \ge M \Rightarrow |\eta(n) - \eta(m)| < \delta/4$
and $|\xi(n) - \xi(m)| < \delta/4$. Pick $n_0 \ge N = \max(L, M)$ and take
$r = [\eta(n_0) + \xi(n_0)]/2$. Then for $n \ge N$ we have
$\eta(n) - r = \eta(n) - \eta(n_0) + [\eta(n_0) - \xi(n_0)]/2 > \delta/4$,
$r - \xi(n) = [\eta(n_0) - \xi(n_0)]/2 + \xi(n_0) - \xi(n) > \delta/4$.
If we now call the isomorphic image of r in R by the same name, we
have proved the lemma.
1.7.15 Theorem. R+ is Archimedian-ordered in the sense that
(x)(y)(x,y ∈ R+ ⇒ (∃n)(n ∈ N & x ≤ ny)).
Proof. By the previous lemma ∃r,s ∈ Q+, so that 0 < r < y and
x < s < x + 1. Since Q+ is Archimedian-ordered (Exercise 13 of Section
1.5), ∃n ∈ N, so that x < s ≤ nr < ny.
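For rational x and y the multiplier n in Theorem 1.7.15 can be written down explicitly. The following is a minimal Python sketch, assuming x and y are given as exact fractions; the function name archimedean_multiplier is a hypothetical helper, not something from the text.

```python
from fractions import Fraction
from math import floor


def archimedean_multiplier(x: Fraction, y: Fraction) -> int:
    """Return a natural number n with x <= n*y, for positive x and y.

    Hypothetical helper for illustration only.
    """
    if x <= 0 or y <= 0:
        raise ValueError("both arguments must be positive")
    # floor(x/y) + 1 is strictly greater than x/y, so n*y > x,
    # and it is at least 1, so n is a natural number.
    return floor(x / y) + 1


n = archimedean_multiplier(Fraction(22, 7), Fraction(1, 1000))
assert Fraction(22, 7) <= n * Fraction(1, 1000)
```

The sketch returns one admissible n, not necessarily the smallest one the theorem allows.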
1.7.16 Definition. The absolute value is that function with domain R and
range R+ ∪ {0} defined by the following:
|x| = x ⇔ x ≥ 0,
|x| = -x ⇔ x ≤ 0.
1.7.17 Theorem. For every x and y in R,
x ≤ |x|,
-x ≤ |x|,
|x| = |-x|,
||x| - |y|| ≤ |x + y| ≤ |x| + |y|.
Proof. See Propositions 1.5.7 and 1.5.8.
The important question now arises as to what happens if we repeat
the process for real Cauchy sequences that we have just gone through
for rational Cauchy sequences. Theorem 1.7.20 below shows that we get
nothing new.
1.7.18 Definition. A real sequence is a function with domain N0 = N
∪ {0} and range in R. A real Cauchy sequence x is a sequence such that
∀ε > 0, ∃N so that if n,m ∈ N0 and n,m ≥ N, then |x(n) - x(m)| < ε.
In the formalism of the predicate calculus the definition of a real
Cauchy sequence would read as follows: A real sequence x is a real
Cauchy sequence ⇔
(ε)(ε > 0 ⇒ (∃N)(n)(m)(n,m ∈ N0 & n,m ≥ N ⇒ |x(n) - x(m)| < ε)).
Our variables, of course, are now assumed to take values in R.
1.7.19 Definition. a ∈ R is said to be a limit of the real sequence
x ⇔ ∀ε > 0, ∃N so that ∀n ∈ N0 with n ≥ N, |x(n) - a| < ε. If the real
sequence x has a limit a we say that x is convergent, and also x converges to a.
In the formalism of the predicate calculus the definition of a limit
would be as follows: a ∈ R is a limit of the real sequence x ⇔
(ε)(ε > 0 ⇒ (∃N)(n)(n ∈ N0 & n ≥ N ⇒ |x(n) - a| < ε)).
1.7.20 Theorem. Every real Cauchy sequence has a unique limit in R.
Proof.
To make the proof clear, it is necessary to distinguish carefully
between elements in Q and elements in R which are the isomorphic
images of elements in Q. For a rational ŝ ∈ R, let us write ŝ = ρ(s),
where s ∈ Q and ρ is the isomorphism taking Q into R.
Let r̂ be a rational Cauchy sequence with range in R. For every n ∈ N0
we may write
r̂(n) = ρ(r(n)) = R(r_n),
where r_n is a rational Cauchy sequence with range Q and r_n(p) = r(n).
Suppose ε̂ is a positive rational in R. Then ∃N so that ∀p ∈ N0,
n,p ≥ N ⇒ |r̂(n) - r̂(p)| < ε̂/2,
or, what is equivalent,
-ε̂/2 < r̂(n) - r̂(p) and r̂(n) - r̂(p) < ε̂/2.
Since the isomorphism ρ is order-preserving, this gives
-(ε/2) < r(n) - r(p) and r(n) - r(p) < ε/2,
or what is equivalent,
-(ε/2) < r_n(p) - r(p) and r_n(p) - r(p) < ε/2.
This leads to the set of inequalities
ε + [r_n(p) - r(p)] > ε/2,
ε - [r_n(p) - r(p)] > ε/2,
for all n,p ≥ N.
If we now identify ε with the constant Cauchy sequence defined by
ε(p) = ε, we have shown that ∀n ≥ N the Cauchy sequences
ε + [r_n - r] and ε - [r_n - r]
are positive, where r is the rational Cauchy sequence that evaluated at
p is r(p). If we take the equivalence classes of these sequences we get
R(ε) + R(r_n - r) > 0 and R(ε) - R(r_n - r) > 0.
If we now set a = R(r), then from the facts that
ε̂ = R(ε) and r̂(n) = R(r_n),
we have arrived at the conclusion that
∀n ≥ N, |r̂(n) - a| < ε̂.
Suppose now that x is a real Cauchy sequence. Using the Archimedian
ordering of R+, the set U_n = {m: m ∈ Z & x(n) ≤ m/n} is nonvoid and
hence, by the well ordering of N (see Exercise 17 of Section 1.5), U_n
has a minimal element m_n. If we set r(n) = m_n/n, then from the fact that
(m_n - 1)/n < x(n) we get 0 ≤ r(n) - x(n) < 1/n. Since x is a Cauchy
sequence, the sequence r defined by the numbers r(n) is Cauchy.
Indeed, ∀ε > 0, ∃ε' ∈ Q+ with ε' < ε and ∃M so that n,m ≥ M
⇒ |x(n) - x(m)| < ε'/2. Hence, if n,m ≥ max{M, 4/ε'}, we have
|r(n) - r(m)| ≤ |r(n) - x(n)| + |x(n) - x(m)| + |x(m) - r(m)|
< 1/n + 1/m + ε'/2 ≤ ε'.
From the first part of the proof, ∃a ∈ R and ∃L such that n ≥ L
⇒ |r(n) - a| < ε'/2. Consequently, for n ≥ max(L, 2/ε') we have
|x(n) - a| ≤ |x(n) - r(n)| + |r(n) - a| < ε.
This is what we set out to prove.
To show that a is unique, suppose ∃b, so that for every ε > 0, ∃N such
that n ≥ N implies |x(n) - b| < ε. Then
|a - b| ≤ |x(n) - a| + |x(n) - b| < 2ε.
This implies a = b; for in the contrary case the use of the trichotomy
theorem for R shows that |a - b| > 0. Choose 0 < ε < |a - b|/2 and we
get a contradiction.
This concludes the proof of the theorem.
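The approximating sequence r(n) = m_n/n used in the second part of the proof can be computed directly. The sketch below is only an illustration under stated assumptions: exact fractions stand in for real numbers, the sample sequence x is an arbitrary choice, and the names rational_approximant and x are hypothetical.

```python
from fractions import Fraction
from math import ceil


def rational_approximant(x_n: Fraction, n: int) -> Fraction:
    """Return r(n) = m_n / n, where m_n is the least integer m with x(n) <= m/n."""
    if n < 1:
        raise ValueError("n must be a natural number")
    m_n = ceil(x_n * n)  # least integer m with n * x(n) <= m
    return Fraction(m_n, n)


def x(n: int) -> Fraction:
    """Sample Cauchy sequence (an assumption): partial sums of 1/2**k."""
    return sum(Fraction(1, 2**k) for k in range(n + 1))


# The estimate 0 <= r(n) - x(n) < 1/n from the proof holds for each n checked.
for n in range(1, 31):
    gap = rational_approximant(x(n), n) - x(n)
    assert 0 <= gap < Fraction(1, n)
```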
REMARKS:
(a) Let us emphasize once more the meaning of this last
theorem. If we form an equivalence class of real Cauchy sequences
and define functions of addition and multiplication, as we did for
the rationals, then Theorem 1.7.20 tells us that, up to isomorphism,
we shall get nothing new. For this reason R is said to be complete.
Since √2 ∉ Q, we know that Q is not complete.
(b) The uniqueness part of the proof of the last theorem shows that
we are justified in calling a limit of a sequence, the limit of the sequence.
As is usual, if a is the limit of the real sequence x, we shall write
lim x(n) = a,
and, as we noted before, we say that the sequence converges to a.
We shall also write
x(n) → a as n → ∞.
NOTE: To avoid notational confusion we shall usually use a term such
as '(x(n))' or '(x_n)' to denote a real sequence. The second notation is
the more standard one in the mathematical literature and, although
we shall eventually revert to it, for the remainder of this chapter we
shall use the functional notation to remind the reader that a real
sequence is a function on N0. In the future when speaking about
sequences we shall usually drop the word 'real', since it will be
understood that R is the range of the function that is the sequence.
Let us now prove the converse to Theorem 1.7.20.
1.7.20' Theorem. Every convergent sequence is a Cauchy sequence.
Proof. Suppose (x(n)) is a real sequence and
lim x(n) = a as n → ∞.
This means that ∀ε > 0, ∃N so that n ≥ N implies
|x(n) - a| < ε/2.
Therefore, if n,m ≥ N,
|x(n) - x(m)| ≤ |x(n) - a| + |a - x(m)| < ε.
The sum and product of real sequences are defined in the same way
as the sum and product of rational sequences; see Definition 1.7.2.
The analogues of Propositions 1.7.7 and 1.7.8 are valid for real
sequences and we leave it to the reader to satisfy himself of these things.
There is an important concept that we have not noted yet and that
is the concept of a subsequence.
1.7.21 Definition. A sequence y is said to be a subsequence of a sequence
x ⇔ there is a function φ with domain N0 = N ∪ {0} and range in N0 so
that j < k ⇒ φ(j) < φ(k) and y = x∘φ.
In other words, y(n) = x(φ(n)) and we shall write (y(n)) = (x∘φ(n))
= (x(φ(n))). The condition j < k ⇒ φ(j) < φ(k) means that the range of φ is
denumerable. Hence, speaking loosely, φ "picks out" an infinite number
of the ordered pairs (n, x(n)) to form the sequence y. As a simple
example, suppose we take (x(n)) to be the sequence given by x(n) = n² + 2.
Take φ(n) = 2n + 1; then
y(n) = x∘φ(n) = (2n + 1)² + 2 = 4n² + 4n + 3.
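The example above can be checked mechanically. A minimal Python sketch, in which the helper name subsequence is an assumption introduced for illustration:

```python
def x(n: int) -> int:
    """The sequence x(n) = n^2 + 2 from the example."""
    return n**2 + 2


def phi(n: int) -> int:
    """The strictly increasing index function phi(n) = 2n + 1."""
    return 2 * n + 1


def subsequence(seq, index):
    """Return the composition y = seq o index (hypothetical helper)."""
    return lambda n: seq(index(n))


y = subsequence(x, phi)

# phi is strictly increasing on the indices checked, as the definition requires.
assert all(phi(j) < phi(j + 1) for j in range(100))
# y(n) agrees with the closed form 4n^2 + 4n + 3.
assert all(y(n) == 4 * n**2 + 4 * n + 3 for n in range(100))
```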
D Exercises
I.
Use the principle of induction to show the following inequalities:

(a)
;;,.
&
N0 =>
n(
(1+h) n ;;,. 1+nh+
0
(b)
(I - h) n
h
(c)
N0 =>
n (n
1 - nh+
h2
0, n. E N, n
;;,.
(I+h) n
2.
&
n; I)
;;,. 2 =>
;;,. 1+nh +
; I) h2
h2
Show that every finite set in R has a unique maximum element
and a unique minimum element. Use this fact to give another proof
of Lemma 1.7.7.
3.
Show that there is always an irrational (not rational) number
between any two real numbers. Recall that there exists an irrational
number: √2.
If
lxl <
4.
1, show that
xn - 0
lxl
> 1, show that Vk E
5.
If
6.
For every
as
n-oo.
E R, show that
xn
,-o
n.
as
n-oo.
7. If x(n) → a, y(n) → b as n → ∞, show that x(n) + y(n) → a + b
and x(n)y(n) → ab as n → ∞.
8. If (x(n)) is a convergent sequence, show that every subsequence
converges to the same limit as (x(n)). Conversely, if every "proper"
subsequence of (x(n)) converges, then (x(n)) converges. By a "proper"
subsequence we mean a subsequence x∘φ, where (m)(∃n)(n ≥ m &
φ(n+1) > φ(n) + 1).
9. Without using Theorem 1.7.20, show that if a subsequence of a
Cauchy sequence converges, then the Cauchy sequence itself converges.
10. Suppose x(n) → a as n → ∞ and ∃N such that n ≥ N ⇒
|x(n) - b| < c. What can be said about |a - b|?
11. If x(n) → a as n → ∞, show that |x(n)| → |a| as n → ∞. Give an
example that shows that the converse is not always true. For what
value(s) of a, if any, is it always true that |x(n)| → |a| ⇒ x(n) → a?
12. If (x(n)) is a sequence and x(2n) → a, x(2n + 1) → a as n → ∞,
show that x(n) → a as n → ∞.
13. Let (s(n)) be a sequence and set
σ(n) = [s(0) + s(1) + ... + s(n)]/(n + 1).
If s(n) → s as n → ∞, show that σ(n) → s as n → ∞. The sequence
(σ(n)) is called the Cesàro mean of the sequence (s(n)).
If A C Ris a finite set with n elements, show that there is a unique
14.
one-to-one function with domain
<l> (j)
15.
<
For every
is countable.
N,
1.8
n be the unique function

(1,mn) and range An. Let 'I' be that func
defined by
'l'(k,n)
TT
[Hint: If An has mn elements,
tion with domain
0 TT
and range A so that
suppose An is a finite set in R. Show that
of Exercise 14 having domain
Let
(1, n)
<l>(k).
'I'
sequence
{ <l>n (k) k (I,m,.),

0
otherwise.
be a one-to-one function with domain
N and range N
0 TT ) . l
N. Then
is a function with domain N andA C &i('I'
A REVIEW OF THE REAL NUMBER SYSTEM AND SEQUENCES
In the previous sections we constructed the real number system starting

from a set of axioms for the natural numbers and derived a number
of relevant properties. In developing further properties of the real
numbers, it is usually more efficient to list a certain number of these
relevant properties and then proceed from these properties without
regard as to how they arose. In this section we shall give this list.
Consequently, this section can be read in either of two ways: (1) as a
review of the previous material, or (2) taking statements (a) through (j)
below as an axiom system for the real numbers. What is a theorem under
reading (1) may be an axiom under reading (2) and vice versa.
The numbered statements given below have in the previous sections
either been taken as definitions or axioms, or have been proved. Under
the reading (2) they are either definitions or can be proved using
statements (a) through (j) as axioms, and we have indicated how to do this
in some instances. For example, if the current section is being read as
introducing an axiom system for the reals, then it is of course
necessary to identify the natural numbers in the set R, since several axioms
require the use of the natural numbers. This is what has been done in
the statements labeled 1.8.1 and 1.8.2. Note then that the principle of
induction can be proved as a theorem, 1.8.3. On the other hand, if
the current section is being read as a review, then it is not necessary to
identify the natural numbers, since they form the basis for the
construction of the reals. However, then the statement 1.8.2 must be proved.
Because of the dual nature in which this section may be viewed, we have
not labeled statements as definitions or theorems but have left it to the
reader to decide in which way he wants to read it. In writing it our
tendency has been to consider statements (a) through (j) as a
full-blown axiom system for the reals.
One of the properties of the real number system that we wish to list
is stated in the language of sequences. Hence this section will serve as
a good opportunity to look once again at the salient definitions and facts
about sequences.
We shall describe the real number system as a sextuple consisting of a set R,
two functions + and · with domain R × R and range R, a relation < with
domain and range R, a function with domain and range R whose value at
x ∈ R is designated by -x, and a function with domain and range R\{0}
whose value at x ∈ R\{0} is designated by x⁻¹ or 1/x, and satisfying the
properties (a) through (j) listed below.
(a) For every x and y in R:
x + y = y + x,
x · y = y · x.
(commutative laws)
(b) For every x, y, and z in R:
(x + y) + z = x + (y + z),
(x · y) · z = x · (y · z).
(associative laws)
(c) For every x, y, and z in R:
x · (y + z) = x · y + x · z.
(distributive law)
(d) 1 ∈ R, 0 ∈ R, 1 ≠ 0, and for every x in R:
x + 0 = x,
x · 1 = x.
(e) For every x in R:
x + (-x) = 0,
x ≠ 0 ⇒ x · x⁻¹ = 1.
(f) For every x and y in R, one and only one of the following is valid:
x < y, x = y, y < x.
(trichotomy law)
(g) For every x, y, and z in R:
x < y & y < z ⇒ x < z.
(h) For every x, y, and z in R:
x < y ⇒ x + z < y + z,
(x < y & 0 < z) ⇒ x · z < y · z.
To state the Archimedian ordering property for R it is necessary to

identify the natural numbers in R. For this purpose we introduce the
following definitions:
1.8.1 A set A ⊂ R is said to be inductive ⇔
(a) 1 ∈ A.
(b) x ∈ A ⇒ x + 1 ∈ A.
1.8.2 The natural numbers N is the intersection of all inductive sets in
R, that is, N is the smallest inductive set in R.
Note that N is not the null set 0, since R itself is an inductive set and
1 belongs to every inductive set. Note also it is a simple matter to prove
that the principle of induction is valid for N:
1.8.3 For every M ⊂ N, if 1 ∈ M and x ∈ M ⇒ x + 1 ∈ M, then M = N.
Indeed, M is an inductive set and hence N ⊂ M. Since M ⊂ N we
must have M = N.
Now that we have N we can state the next property of the real number
system.
(i) For every x and y in R with x > 0 and y > 0, there is an n in N so
that x < ny (Archimedian ordering).
To state our last and rather crucial property for the real number
system, it is necessary to consider the concepts of a sequence, a Cauchy
sequence, and a limit of a sequence. For this purpose we have the
following definitions:

1.8.4 A (real) sequence is a function with domain N0 = N ∪ {0} and
range in R. A sequence will usually be denoted by the term '(x(n))' or '(x_n)',
and this is to indicate that x(n) or x_n is the value of the function at n ∈ N0.
A sequence (y(n)) is said to be a subsequence of (x(n)) ⇔ there is a function
φ with domain N0 and range in N0 so that j < k ⇒ φ(j) < φ(k) and
y(n) = x(φ(n)).
1.8.5 The absolute value is the function defined by
|x| = x ⇔ x ≥ 0,
|x| = -x ⇔ x < 0.
1.8.6 The following properties hold for the absolute value:
-x ≤ |x|,
x ≤ |x|,
|x| = |-x|,
||x| - |y|| ≤ |x + y| ≤ |x| + |y|.
We are, incidentally, using x ≤ y to mean x < y or x = y.
1.8.7 A sequence (x(n)) is said to be a Cauchy sequence ⇔ ∀ε > 0,
∃N so that if m,n ≥ N, then |x(n) - x(m)| < ε.
1.8.8 A sequence (x(n)) is said to have a limit ⇔ ∃a such that ∀ε > 0,
∃N so that n ≥ N ⇒ |x(n) - a| < ε. The number a is said to be the limit of
(x(n)). If a sequence has a limit, it is said to be convergent.
We can now state a crucial property of the real number system:
(j) Every Cauchy sequence has a limit.
Let us note that if a sequence has a limit, then the limit must be
unique. Indeed, suppose that a and b are limits of (x(n)). Then ∀ε > 0,
∃N so that n ≥ N ⇒ |x(n) - a| < ε/2 and |x(n) - b| < ε/2. Hence we
have ∀ε > 0,
|a - b| ≤ |a - x(n)| + |x(n) - b| < ε, n ≥ N.
If δ = |a - b| > 0, choose ε = δ and we get the following contradiction
to the trichotomy law:
δ = |a - b| < δ.
Hence, if a sequence (x(n)) has a limit a, we are justified in speaking
of the limit a. We usually write
lim x(n) = a as n → ∞,
or
x(n) → a as n → ∞.
The statement converse to (j) is true and very easy to prove:
1.8.9 Every convergent sequence is Cauchy.
Indeed, suppose x(n) → a. This means that ∀ε > 0, ∃N so that
n ≥ N ⇒ |x(n) - a| < ε/2. Hence, if m,n ≥ N, we get, using the triangle
inequality,
|x(n) - x(m)| ≤ |x(n) - a| + |x(m) - a| < ε.
This, of course, proves that (x(n)) is Cauchy.
1.8.10 Every convergent sequence is bounded; that is, ∃M so that ∀n ∈ N0,
|x(n)| ≤ M.
Suppose x(n) → a; then ∃N so that n ≥ N ⇒ |x(n) - a| < 1. Hence
using the triangle inequality, we get that
n ≥ N ⇒ |x(n)| ≤ 1 + |a|.
Let L = max{|x(n)|: n ∈ (0, N)} and M = max(L, 1 + |a|). Clearly
∀n ∈ N0, |x(n)| ≤ M.
1.8.11 The sum and product of two sequences (x(n)) and (y(n)) are
defined as follows:
(x + y)(n) = x(n) + y(n),
(xy)(n) = x(n)y(n).
Note we have reverted to the custom of dropping the symbol '·' for
multiplication.
1.8.12 If x(n) → a and y(n) → b, then
(x + y)(n) → a + b,
(xy)(n) → ab.
The facts that x(n) → a and y(n) → b mean first of all that ∃M > 1,
so that ∀n ∈ N0, |x(n)| ≤ M, and |y(n)| ≤ M. Second, ∀ε > 0, ∃N
so that n ≥ N ⇒ |x(n) - a| < ε/(2M), |y(n) - b| < ε/(2M). Hence, if n ≥ N,
|(x + y)(n) - (a + b)| ≤ |x(n) - a| + |y(n) - b| < ε,
|(xy)(n) - ab| ≤ |y(n)| |x(n) - a| + |a| |y(n) - b|
≤ M{|x(n) - a| + |y(n) - b|} < ε.
In the last part of the above proof we have used the fact that |x(n)| ≤ M
for all n ∈ N0 ⇒ |a| ≤ M. We shall leave the verification of this simple
fact to the reader.
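The product estimate in 1.8.12 rests on an add-and-subtract step that the proof leaves implicit. Written out, under the same bounds |y(n)| ≤ M and |a| ≤ M and for n ≥ N, it can be sketched as:

```latex
\begin{align*}
(xy)(n) - ab &= x(n)\,y(n) - a\,y(n) + a\,y(n) - ab \\
             &= y(n)\bigl(x(n) - a\bigr) + a\bigl(y(n) - b\bigr), \\
|(xy)(n) - ab| &\le |y(n)|\,|x(n) - a| + |a|\,|y(n) - b| \\
               &\le M\bigl(|x(n) - a| + |y(n) - b|\bigr)
                < M\Bigl(\frac{\varepsilon}{2M} + \frac{\varepsilon}{2M}\Bigr)
                = \varepsilon .
\end{align*}
```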

In 1.8.2 we defined the natural numbers. Now take
-N = {x: -x ∈ N},
Z = -N ∪ N ∪ {0}.
The set Z is, of course, called the integers. The rational numbers is the
set
Q = {m/n: m,n ∈ Z, n ≠ 0}.
We are, of course, writing m/n for m(1/n).
1.8.13 The rationals are dense in the reals in the sense that ∀ε > 0 and
∀x ∈ R, ∃r ∈ Q, so that |x - r| < ε.
Indeed, by the Archimedian ordering of R, ∃n ∈ N so that |x| < n, or,
equivalently, -n < x < n. Again, using Archimedian ordering,
∃m ∈ N so that 1/m < ε. Let k ∈ N be the smallest natural number so
that x ≤ -n + k/m. Then -n + (k - 1)/m < x and if we set r = -n + k/m,
we have |x - r| = -n + k/m - x < k/m - (k - 1)/m < ε.
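The construction in the proof of 1.8.13 is explicit enough to run. The following Python sketch mirrors its three choices (n, m, then the least k) under the assumption that x and ε are given as exact fractions; the function name rational_within is hypothetical.

```python
from fractions import Fraction
from math import ceil, floor


def rational_within(x: Fraction, eps: Fraction) -> Fraction:
    """Return r = -n + k/m with 0 <= r - x < eps, following the proof of 1.8.13."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    n = floor(abs(x)) + 1   # natural number with |x| < n
    m = floor(1 / eps) + 1  # natural number with 1/m < eps
    # Least k with x <= -n + k/m; since x > -n, this k is at least 1.
    k = ceil((x + n) * m)
    return Fraction(-n) + Fraction(k, m)


x = Fraction(-7, 3)
eps = Fraction(1, 10)
r = rational_within(x, eps)
assert 0 <= r - x < eps
```

As in the proof, the approximant lies at or above x, so the absolute value |x - r| equals r - x.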
In the previous proof we have made use of some facts about the
relation < without explicitly mentioning them. For example, we said
that |x| < n is equivalent to the fact that -n < x and x < n. Indeed,
from x ≤ |x| we get from (g) that x < n, and from -x ≤ |x| we get
-x < n, and from (e) and (h), n + x > 0. Now, using (d), (e), and (h)
we get x = -n + n + x > -n + 0 = -n, which is what we set out to prove.
The reverse implication follows in a similarly easy way. Note we are
using the usual convention that x - y = x + (-y). We are sure the reader
can fill in the details of proofs of other facts.

Aside from the fact that the rationals are dense in the reals, they also
have the property that they have the "same number" of elements as
the positive integers. More formally this can be written as follows:
1.8.14
There exists a one-to-one function with domain
N and range Q.
The proof of this statement can be found in Section 1.6. More
generally we can make the following definition:
1.8.15 A set A is said to be denumerable ⇔ there exists a one-to-one
function with domain N and range A.
Another way of phrasing 1.8.14 is to say that the rationals are
denumerable. We can also talk about finite sets.
1.8.16 A set A is said to be finite ⇔ A is the null set, in which case we say
that A has zero elements, or else there is a one-to-one function with domain the
set (1, n) = {k: k ∈ N & 1 ≤ k ≤ n} and range A, in which case we say
A has n elements. A set that is either finite or denumerable is called countable.
Exercises
Use only statements (a) through (j) as axioms in proving the following:
1. Show that -(x + y) = -x + (-y) and (xy)⁻¹ = x⁻¹y⁻¹.
2. If n,m ∈ N, then n + m ∈ N and n · m ∈ N.
3. Show that 0 < 1.
4. For every n ∈ N, 1 ≤ n.
5. It is not true that there is an n ∈ N and an m ∈ N so that
n < m < n + 1.
6. Show that N is well ordered; that is, every nonvoid subset of N
has a unique smallest element.
7. If n,m ∈ Z, then n + m ∈ Z and n · m ∈ Z.
8. Show that every nonvoid subset of Z that is bounded below has
a first element. A set A ⊂ Z is said to be bounded below ⇔ ∃m so that
∀n ∈ A, m ≤ n.
9. If x(n) → a and b, c ∈ R so that ∀n, x(n) - b < c show that
a - b ≤ c. Show that if ∀n, |x(n) - b| < c, then also |a - b| ≤ c. Give
an example which shows that in general we cannot conclude that
a - b < c or |a - b| < c.
10.
If Vn E N0, x(n)
11.
Set
O!
l/(n +I), prove that x(n)
= 1 and Vn E N, n! = 1
._
0.
n. If k E N and
_(k!)R
,
n!
x( n) show that x(n) .- 0 as n .-
12.
If x(n)
._
oo.
and y(n)
._
b - 0, show that
x(n)
a
-----
y(n)
b
13.
Let
that
p(x)= 2x4 +x2+3x+ 1,
p(n)
q(n)
14.
q(x)= 3x4 +x3 +2x+3.
._
_g_ .
3

2xy :;:;; x2+y 2, Vx,y E R.
(b ) x+ l/x ;::;,: 2 , x > 0.
(a)
[Hint: (x - y)2;::;,: O.]
15.
Use Exercise 14 to prove the Cauchy-Schwarz inequality

(X1Y1 +
XnYn>2:;:;; (X12+
+Xn2HY12 +
[Hint: Use Exercise l 4(a) to get for each k,
+Y n 2)
Prove
(X12 +
xk2
+
+ Xn2)
(Y12
Yk2
+
+ Yn2 )
Then add both sides of all the inequalities as k varies from I to n.]
1.9
PROPERTIES OF THE REALS
A very crucial property of R is that every Cauchy sequence of reals

converges. One reason why this property is so crucial was pointed out
in Section 1.7. There are, however, a number of other properties which
may be used in place of this property and which may be more convenient
to use in some circumstances. The purpose of this section is to derive
these other properties.
1.9.1 Definition. A set S ⊂ R is said to be bounded above [below] ⇔
(∃M)(x)(x ∈ S ⇒ x ≤ M [x ≥ M]). Any such M is called an upper bound
[lower bound] for S. A set is said to be bounded ⇔ it is bounded above and
bounded below. A sequence is said to be bounded above [below] ⇔ the range
of the sequence is bounded above [below]; it is said to be bounded ⇔ its range
is bounded.
1.9.2 Definition. A real sequence (x(n)) is said to be monotone
increasing [nondecreasing] ⇔ (n)(m)(n < m ⇒ x(n) < x(m) [x(n) ≤ x(m)]),
and monotone decreasing [nonincreasing] ⇔ (n)(m)(n < m ⇒ x(m)
< x(n) [x(m) ≤ x(n)]).
1.9.3 Theorem. Every monotone nondecreasing [nonincreasing]
sequence that is bounded above [below] has a limit.
Proof. Suppose (x(n)) is a monotone nondecreasing sequence
bounded above by M. We claim that (x(n)) is Cauchy. For if (x(n)) is
not Cauchy, then we must negate the statement
(ε)(ε > 0 ⇒ (∃L)(n)(m)(n,m ≥ L ⇒ |x(n) - x(m)| < ε)).
The negation of this statement is the statement
(∃ε)(ε > 0 & (L)(∃n)(∃m)(n,m ≥ L & |x(n) - x(m)| ≥ ε)).
Take L = 2 and fix n1 > m1 > 1 so that x(n1) - x(m1) ≥ ε. The absolute
value is not needed, since by monotonicity x(n1) ≥ x(m1). Next take
L = n1 + 1 and n2 > m2 > n1 so that x(n2) - x(m2) ≥ ε. Assuming we
have chosen n1, m1, ..., nk, mk, take L = nk + 1 and n(k+1) > m(k+1) > nk
so that x(n(k+1)) - x(m(k+1)) ≥ ε. Hence, by the principle of induction, for
every k ≥ 2 there exists a finite set of pairs {(nj, mj): j ∈ (1, k)} so
that m1 < n1 < m2 < n2 < ... < mk < nk and x(nj) - x(mj) ≥ ε.
Since R is Archimedian ordered, ∃k0 ∈ N so that M - x(m1) ≤ k0 ε.
Take k > k0 and for this k a finite set of the type given in the first
paragraph. Then
x(nk) - x(m1) = x(nk) - x(mk) + x(mk) - x(n(k-1))
+ x(n(k-1)) - x(m(k-1)) + x(m(k-1)) - x(n(k-2))
+ ... + x(n2) - x(m2) + x(m2) - x(n1) + x(n1) - x(m1).
Now x(nj) - x(mj) ≥ ε and x(m(j+1)) - x(nj) ≥ 0. Hence
kε ≤ x(nk) - x(m1) ≤ M - x(m1) ≤ k0 ε.
This is a contradiction.
Since (x(n)) is a real Cauchy sequence it has a limit. If (x(n)) is a
monotone nonincreasing sequence, (-x(n)) is a monotone nondecreasing
sequence and hence there is an a so that
lim -x(n) = a as n → ∞.
But this is equivalent with the statement
lim x(n) = -a as n → ∞.
This concludes the proof of the theorem.
1.9.4 Definition. A least upper bound [greatest lower bound] for a set
A ⊂ R is a number ξ that is
(a) an upper [lower] bound for A, and
(b) if η is any upper [lower] bound for A, then ξ ≤ η [ξ ≥ η].
A least upper bound [greatest lower bound] for A is denoted by
'l.u.b. A [g.l.b. A]' and is often called the supremum [infimum] of A and
denoted by 'sup A [inf A]'.
1.9.5 Theorem. Every nonempty set A ⊂ R that is bounded above
[below] has a unique least upper bound [greatest lower bound].
Proof. Let U be the set of upper bounds for A, which, by hypothesis,
is nonvoid. For every n ∈ N set
U_n = {m/2^n: m ∈ Z & m/2^n ∈ U},
I_n = {m: m ∈ Z & m/2^n ∈ U}.
By Archimedian ordering, U_n and hence I_n is nonempty. The set of
integers I_n is bounded below (by any element of A times 2^n) and
therefore has a least element m_n. (Why?) Hence m_n/2^n is the smallest element
of U_n.
If m/2^k ∈ U_k, then since m/2^k = 2m/2^(k+1), it follows that m/2^k ∈ U_(k+1).
Hence U_k ⊂ U_(k+1) and consequently
m_(k+1)/2^(k+1) ≤ m_k/2^k.
If we set r(n) = m_n/2^n, then the above inequality shows that (r(n)) is
a monotone nonincreasing sequence bounded below by any element
of A. It follows from Theorem 1.9.3 that (r(n)) has a limit ξ.
We claim that ξ = l.u.b. A.
The first thing to show is that ξ is an upper bound for A. For every
x ∈ A we have x ≤ m_n/2^n for all n, since the number on the right is
always in U. For every ε > 0, ∃N so that n ≥ N implies
0 ≤ m_n/2^n - ξ < ε.
This follows from the fact that m_n/2^n ≥ ξ. Therefore, ∀ε > 0, ∃N so
that n ≥ N ⇒
x ≤ m_n/2^n < ξ + ε.
This means ξ - x ≥ 0; for if ξ - x < 0, choose ε = (x - ξ)/2 and we
get the contradiction that (ξ - x)/2 > 0. We have consequently shown
that ξ is an upper bound for A.
To show it is a l.u.b., let η be any upper bound for A. If we suppose
that ξ - η = δ > 0, then we shall show that there is a rational number
of the form p/2^n so that η ≤ p/2^n < ξ. First note that it is a consequence
of the Archimedian ordering of the reals that there exist m and n in
N such that m/2^n < δ. Indeed, for every m, ∃n such that m < nδ, and
since it is easy to show by induction that n < 2^n for all n ∈ N, the
assertion of the last sentence follows. Let us fix m and n so that m/2^n < δ
and let
S = {k: k ∈ Z & η ≤ km/2^n}.
The set S is nonvoid by the Archimedian ordering of R, and it is
bounded below by η. Since N is well ordered, it follows that S has a
minimum element k0 (see Exercise 17 of Section 1.5). Hence we get
(k0 - 1)m/2^n < η,
and this implies that
η ≤ k0 m/2^n < η + m/2^n < η + δ = ξ.
Take p = k0 m and we have shown that
η ≤ p/2^n < ξ.
This is a contradiction, since η is an upper bound for A and hence p/2^n
is an upper bound for A, and thus ξ ≤ m_n/2^n ≤ p/2^n. Hence we must
have ξ ≤ η.
If A is bounded below, then
-A = {x: -x ∈ A}
is bounded above. If ξ is the supremum of -A, -ξ is the infimum of
A.
Since the unicity of a l.u.b. or g.l.b. is trivial, we have concluded the
proof of the theorem.
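The dyadic sequence r(n) = m_n/2^n from the proof can be generated step by step, because m_(n+1) is always either 2m_n or 2m_n - 1. The Python sketch below assumes the set is described by a predicate telling whether a rational is an upper bound; the names dyadic_upper_bounds and bounds_sqrt2_set, and the sample set A = {x ∈ Q: x² < 2}, are illustrative choices rather than anything in the text.

```python
from fractions import Fraction


def dyadic_upper_bounds(is_upper_bound, start, steps):
    """Return [r(1), ..., r(steps)] with r(n) = m_n / 2**n, where m_n is the
    least integer m such that m / 2**n is an upper bound.

    Assumes `start` is an integer upper bound, the set is nonempty, and the
    predicate is upward closed (anything above an upper bound is one too).
    """
    if not is_upper_bound(Fraction(start)):
        raise ValueError("start must be an upper bound")
    m = start
    while is_upper_bound(Fraction(m - 1)):  # descend to the least integer bound
        m -= 1
    r = []
    for n in range(1, steps + 1):
        m *= 2  # m / 2**n now equals the previous bound
        if is_upper_bound(Fraction(m - 1, 2**n)):
            m -= 1  # the next finer dyadic point is still an upper bound
        r.append(Fraction(m, 2**n))
    return r


def bounds_sqrt2_set(q):
    """Upper-bound test for A = {x in Q : x*x < 2} (an illustrative choice)."""
    return q > 0 and q * q >= 2


r = dyadic_upper_bounds(bounds_sqrt2_set, 2, 20)
# (r(n)) is monotone nonincreasing, as the proof asserts.
assert all(r[i + 1] <= r[i] for i in range(len(r) - 1))
```

Each r(n) is an upper bound while r(n) - 1/2^n is not, which is the minimality of m_n used in the proof.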
1.9.6 Definition. A real number a is said to be an accumulation point
of a set A ⊂ R ⇔ ∀ε > 0, ∃x ∈ A such that x ≠ a and |x - a| < ε.
1.9.7 Theorem (Bolzano-Weierstrass). Every bounded infinite set in
R has an accumulation point.
Proof. (a) Suppose, at first, that A is a bounded denumerable set.
Hence we may suppose that A is the range of a sequence (x(n)) with
the property that m ≠ n ⇒ x(n) ≠ x(m).
Since (x(n)) is bounded below, we may use Theorem 1.9.5 to obtain
a sequence (y(n)), where
y(0) = g.l.b.{x(n): n ≥ 0}
y(1) = g.l.b.{x(n): n ≥ 1}
...
y(k) = g.l.b.{x(n): n ≥ k}
...
Since {x(n): n ≥ k + 1} ⊂ {x(n): n ≥ k} it follows that y(k) ≤ y(k + 1).
The sequence (y(n)) is bounded above, and hence we may
again use Theorem 1.9.5 and set
a = l.u.b.{y(n): n ≥ 0}.
We claim that a = lim y(n) as n → ∞. Indeed, if this is not true, then ∃ε > 0
such that for ∀N, ∃n ≥ N so that
a - y(n) ≥ ε.
The absolute-value sign is not necessary in the previous inequality,
since y(n) ≤ a. Now if m ≤ N ≤ n, y(m) ≤ y(n), and therefore
a - y(m) ≥ a - y(n) ≥ ε.
If we successively choose N = 1, 2, ..., then by what we have just
proved, we see that for every m,
y(m) ≤ a - ε < a.
This contradicts the fact that a is the l.u.b. of the range of y.
It remains to prove that a is an accumulation point of the range of
(x(n)). To do this we must show that ∀ε > 0, ∃x(n) such that 0 <
|x(n) - a| < ε. Now, ∀ε > 0, ∃N such that n ≥ N ⇒ |y(n) - a| < ε/2.
Also, ∀n ≥ N, ∃n1 ≥ n so that |x(n1) - y(n)| < ε/2 and ∀m > n1,
∃n2 ≥ m so that |x(n2) - y(m)| < ε/2. Hence ∃n1 and ∃n2 > n1 so that
|x(n1) - a| ≤ |x(n1) - y(n)| + |y(n) - a| < ε,
|x(n2) - a| ≤ |x(n2) - y(m)| + |y(m) - a| < ε.
Since x(n1) ≠ x(n2), it follows that either x(n1) ≠ a or x(n2) ≠ a.
(b) Suppose now that A is any bounded infinite set in R. One way
of trying to reduce this case to the previous case is to try to choose a
denumerable set in A. However, since it is not clear how to do this using
only the axiom of induction, we shall proceed in a slightly different way.
Let us set
m = inf A,
M = sup A.
These numbers exist by virtue of the Theorem 1.9.5. Next, let us put
P = {x: x ∈ Q & m < x < M}.
The set P contains an infinite number of elements (why?) and hence is
denumerable. Let (r(n)) be a one-to-one sequence with range P and
∀n ∈ N0 set
A_n = {x: x ∈ A & r(n) ≤ x}.
Each set A_n is nonvoid and we put
x(n) = inf A_n.
We claim that the range of the sequence (x(n)) is infinite, and
therefore by Theorem 1.6.4 is denumerable. Indeed, suppose the range of
this sequence has k elements y1, ..., yk. By the use of the principle
of induction we may suppose these elements are labeled so that
y1 < y2 < ... < yk. If we put y0 = m, y(k+1) = M, then the sets
B_j = {x: x ∈ A & y_j < x < y_(j+1)}
must be void for 0 ≤ j ≤ k. But this means that A can have at most k + 2
elements, which is a contradiction.
If ∃n so that x(n) ∉ A, then x(n) is an accumulation point of A. On
the other hand, if ∀n ∈ N0, x(n) ∈ A, then A contains a denumerable
set and we get the existence of an accumulation point from the first
part of the proof.
If A is the conjunction of the statements (a) through (i) given in the
description for R in Section 1.8, then an examination of the proofs of
this section shows that we have the following chain of implications:
A & (j) ⇒ A & Theorem 1.9.3 ⇒ A & Theorem 1.9.5 ⇒ A & Theorem 1.9.7.
[Incidentally, the reason we did not use Theorem 1.9.3 directly in the
proof of Theorem 1.9.7(a) in taking a as the limit of the monotone
sequence (y(n)) is that we wanted to establish the above chain of
implication.] If we can show that A & Theorem 1.9.7 ⇒ A & (j), then all
these statements are equivalent and any one of the statements of
Theorems 1.9.3, 1.9.5, or 1.9.7 can be used in place of (j).
1.9.8 Theorem. A & Theorem 1.9.7 ⇒ A & (j).
Proof. Let (x(n)) be a real Cauchy sequence. We distinguish two
cases.
(a) The range of (x(n)) is finite. In this case ∃N so that n,m ≥ N
⇒ x(n) = x(m). Indeed, if this is not the case, ∀k, ∃nk ≥ k & ∃mk ≥ k
such that |x(nk) - x(mk)| = δk > 0. By hypothesis, it is immediate
that the collection of numbers {δk} is finite. Let δ be the minimum of
these numbers and ε = δ/2. Since (x(n)) is Cauchy, ∃L such that
k ≥ L ⇒ |x(nk) - x(mk)| < ε, which is a contradiction. Take a = x(n)
for n ≥ N, and this is clearly the limit of (x(n)).
(b) The range of (x(n)) is infinite. Let a be an accumulation point
of the range of (x(n)), which exists by Theorem 1.9.7. Let ε > 0. Since (x(n))
is Cauchy, ∃N so that n,m ≥ N ⇒ |x(n) - x(m)| < ε/2. Now a is an
accumulation point of the set {x(n): n ≥ N}. Therefore, ∃n1 ≥ N so
that |x(n1) - a| < ε/2. Hence, for any n ≥ N,
|x(n) - a| ≤ |x(n1) - a| + |x(n) - x(n1)| < ε.

References
Cohen, Leon W., and Gertrude Ehrlich,
The Structure of the Real Number System,
D. Van Nostrand Company, Inc., Princeton, N.J., 1963.

Hamilton, Norman, and Joseph Landin,
Set Theory and the Structure of Arithmetic,
Allyn and Bacon, Inc., Boston, 1961.

Exercises
1. If A and B are bounded subsets of R and A ⊂ B, show that
l.u.b. A ≤ l.u.b. B,
g.l.b. B ≤ g.l.b. A.
2. If A and B are bounded subsets of R, show that
sup A ∪ B = max(sup A, sup B),
inf A ∪ B = min(inf A, inf B).
3. If A ⊂ R and A has an upper bound that belongs to A, show
that this upper bound must be sup A.
4. If a set A has a l.u.b. that does not belong to A, show that this
l.u.b. is an accumulation point of A.
5. If A is a denumerable set in R and a is an accumulation point
of A, show that there is a sequence in A which converges to a. If you
use the axiom of induction, be sure you use it carefully and correctly.
6. (a) Show that the l.u.b. of the range of a bounded monotone
nondecreasing sequence is the limit of the sequence.
(b) Show that if a subsequence of a monotone nondecreasing
sequence is bounded, then the sequence is bounded.
7. Show that the sequence defined by the following expression is
monotone increasing and bounded:
$$a(n) = \frac{3}{2} - \frac{2n + 3}{2(n + 1)}, \qquad n \in N_0.$$
8. Assuming the binomial theorem as known, show that the sequence
defined as follows is monotone increasing and bounded:
$$a(n) = \left(1 + \frac{1}{n + 1}\right)^{n+1}, \qquad n \in N_0.$$
The reader may recognize that the limit of this sequence is designated
by 'e'.
9. Show that $\forall a \ge 0$ and $\forall n \in N$, there is a unique
$y \ge 0$ so that $y^n = a$. Designate this unique $y$ by '$a^{1/n}$' and
show that if $0 \le a < b$, then $a^{1/n} < b^{1/n}$, and conversely.
10. Prove that $\lim_{n \to \infty} n^{1/n} = 1$. [Hint: Set
$n^{1/n} = 1 + a_n$, and hence
$$n = (1 + a_n)^n \ge [n(n-1)/2!]\, a_n^2$$
for $n \ge 2$.]
11. Show that $\lim_{n \to \infty} n!/n^n = 0$.
12. What is the set of all accumulation points of the subset of the
rationals of the form [formula illegible], $n, p, q \in N$?
CHAPTER 2
LIMITS
We have already discussed the concept of function in Section 1.3. In
this and in the next few chapters we shall be exclusively interested in
those functions which have their domains and ranges in the real number
system. A real sequence is an example of such a function.
To discuss the properties of real-valued functions, it is convenient
to introduce some notation and terminology for certain sets of real
numbers. We shall define
$$]a,b[ \;= \{x: a < x < b\}, \qquad [a,b[ \;= \{x: a \le x < b\},$$
$$[a,b] = \{x: a \le x \le b\}, \qquad ]a,b] = \{x: a < x \le b\}.$$
The sets $[a,b]$ and $]a,b[$ will be called closed and open intervals,
respectively. The sets $]a,b]$ and $[a,b[$ will be called half-open
intervals. Note that any of these could be the null set.
We shall also define
$$[a,\infty[ \;= \{x: a \le x\}, \qquad ]-\infty,a] = \{x: x \le a\},$$
$$]a,\infty[ \;= \{x: a < x\}, \qquad ]-\infty,a[ \;= \{x: x < a\}.$$
An open interval containing the point $x$ will often be denoted by '$I(x)$'
and it may sometimes be referred to as a neighborhood of $x$. When we
write '$I(\infty)$' or '$I(-\infty)$' we shall mean a set of the form
$]a,\infty[$ or $]-\infty,a[$, respectively.
2.1 THE LIMIT CONCEPT AND CONTINUITY
In the last chapter we introduced the idea of the limit of a sequence.
We shall now generalize this concept to the situation where we are
dealing with arbitrary real-valued functions.
2.1.1 Definition. The number $l$ is said to be the limit of the function
$f$ at $a$ $\Leftrightarrow$ $a$ is an accumulation point of $\mathcal{D}(f)$
and $\forall \varepsilon > 0$, $\exists \delta > 0$ so that
$x \in \mathcal{D}(f) \setminus \{a\}$ and $|x - a| < \delta \Rightarrow |f(x) - l| < \varepsilon$.
If $l$ is the limit of $f$ at $a$, we write
$$\lim_{x \to a} f(x) = l \quad \text{or} \quad f(x) \to l \text{ as } x \to a.$$
In case $\mathcal{D}(f)$ is not bounded above, we write
$\lim_{x \to \infty} f(x) = l$ $\Leftrightarrow$ $\forall \varepsilon > 0$,
$\exists M$ so that $x \ge M$ and $x \in \mathcal{D}(f)$ implies
$|f(x) - l| < \varepsilon$. Note that the latter statement includes the
definition of the limit of a sequence. If $\mathcal{D}(f)$ is not bounded
below, we have a corresponding definition for
$\lim_{x \to -\infty} f(x) = l$.
An equivalent way of phrasing the definition of the limit of a function
at a point is as follows: The number $l$ is a limit of the function $f$ at
$a$ $\Leftrightarrow$ $a$ is an accumulation point of $\mathcal{D}(f)$ and
$\forall I(l)$, $\exists I(a)$ such that
$x \in I(a) \cap \mathcal{D}(f) \setminus \{a\} \Rightarrow f(x) \in I(l)$.
We leave for the reader the formulation when $a = \infty$ or $a = -\infty$.
In dealing with the limit of a function $f$ at $a$ it is very natural and
often very convenient to want to conclude that $\lim_{x \to a} f(x) = l$
$\Leftrightarrow$ $a$ is an accumulation point of $\mathcal{D}(f)$ and for
every sequence $(x_n)$ with range in $\mathcal{D}(f) \setminus \{a\}$ for
which $x_n \to a$, $f(x_n) \to l$. One implication is very easy to prove: If
$f(x) \to l$ as $x \to a$ and $x_n \to a$, then $f(x_n) \to l$. Indeed,
$\forall \varepsilon > 0$, $\exists \delta > 0$ so that $|x - a| < \delta$
and $x \in \mathcal{D}(f) \setminus \{a\} \Rightarrow |f(x) - l| < \varepsilon$.
On the other hand, $\exists N$ so that $n \ge N \Rightarrow |x_n - a| < \delta$.
Hence $n \ge N \Rightarrow |f(x_n) - l| < \varepsilon$. The last statement
is precisely the fact that $f(x_n) \to l$.
However, it is not always so clear how to prove the converse
implication. We have the hypothesis that $a$ is an accumulation point of
$\mathcal{D}(f)$ and for every $(x_n)$ so that $x_n \to a$, $f(x_n) \to l$.
We want to conclude that $f(x) \to l$ as $x \to a$. Let us see how we could
proceed. Suppose it is not true that $f(x) \to l$ as $x \to a$. Then we must
negate the statement:
$$(\varepsilon > 0)(\exists \delta > 0)(x)(0 < |x - a| < \delta \;\&\; x \in \mathcal{D}(f) \Rightarrow |f(x) - l| < \varepsilon).$$
Negating this statement leads to the statement:
$$(\exists \varepsilon > 0)(\delta > 0)(\exists x)(0 < |x - a| < \delta \;\&\; x \in \mathcal{D}(f) \;\&\; |f(x) - l| \ge \varepsilon).$$
(Note that, for the sake of brevity, we have shortened the symbolism
of the predicate calculus by putting restrictions on the quantified
variables together with the quantification symbols.) Let $\varepsilon_0$ be
a number for which the last statement is true. For $n \in N_0$, let
$\delta_n = 1/(n + 1)$ and let
$$A_n = \{x: 0 < |x - a| < \delta_n \;\&\; x \in \mathcal{D}(f) \;\&\; |f(x) - l| \ge \varepsilon_0\}. \tag{2.1.1}$$
The set $A_n$ is not void. Now $\forall n \in N_0$, choose $x_n \in A_n$.
Clearly $x_n \to a$, but $|f(x_n) - l| \ge \varepsilon_0$ for all
$n \in N_0$. This last statement contradicts the hypothesis.
The sticking point in the last argument is the question of whether it
is possible to prove that there exists a sequence $(x_n)$ with
$x_n \in A_n$ using only the axioms that prescribe the use of $\in$ and the
axioms we gave for $N$. Now, it is a consequence of the axioms that
prescribe the use of the symbol '$\in$' that for every nonvoid set $A$ there
is a function $f$ so that $f(A) \in A$. Using this result and the axiom of
induction, it is not hard to show that $\forall n \in N_0$ there exists a
function $\Phi_n$ with domain $(0, n)$ so that $\Phi_n(k) \in A_k$. (See
page 199, Exercise 2, in the reference to Mendelson cited at the end of
Section 1.1.) However, it is highly unlikely that there is a canonical
(unique) way of prescribing the $\Phi_n$, and hence there is no way of
patching them together as we did in the proof of Theorem 1.6.4 to get a
function $\Phi$, with domain $N_0$, so that $\Phi(n) \in A_n$.
We know that many readers may possibly be quite chagrined by our
suggestion that we don't know how to construct a sequence of the type
needed above. After all, it is "intuitively obvious" that this can be done.
However, strictly speaking, we have shown the existence of a function
only if we can proceed from the axioms by means of the predicate
calculus, and prove its existence. At this point the reader must take our
word for it that the axioms of set theory that can be used to lead to the
real number system do not include the justification for picking one
element from each set of an infinite number of nonvoid sets. Indeed,
this "picking" process, even for a denumerable number of nonvoid
sets, is quite independent of the axiom of induction, unless the sets
have some special property.
The way out of the dilemma is simply to institute another axiom so
that we can construct the sequence we needed in the previous discussion
and for which the possibility of construction seems so intuitively
reasonable. This axiom is called the axiom of choice.
(AC) For every collection $\mathcal{A}$ of nonvoid sets there exists a
function $f$ with domain $\mathcal{A}$ so that for every
$A \in \mathcal{A}$, $f(A) \in A$.
Whether or not we are justified in assuming such an axiom may be
discussed along the same lines as to whether we are justified in
assuming any set of axioms. However, the axiom (AC) seems to have a special
role, since the real number system can be obtained without its use and
one naturally wonders why it is needed to develop a calculus based on
the real number system. Actually we should point out that we could
develop most of the topics of this book without ever mentioning (AC).
The difficulty usually arises when one tries to prove the equivalence of
a "sequential" type of criterion with another type of criterion, as in the
case of the limit of a function at a point. Indeed the use of (AC) will
arise at only a very few points which if avoided would not really affect
the usefulness of the calculus. However, since the "sequential" criteria
seem so embedded into classical analysis, we felt it was best if we pointed
them out at the appropriate places. When we write '(AC)' before a
statement it means we are using the axiom of choice in its proof. It does
not necessarily mean that the statement cannot be proved without this axiom
but only that the author is not aware of how to do so.
During the course of the preceding discussion we have proved the
following.
2.1.2 Proposition. If $f(x) \to l$ as $x \to a$, then for every sequence
$(x_n)$ with range in $\mathcal{D}(f) \setminus \{a\}$ such that
$x_n \to a$, it follows that $f(x_n) \to l$. (AC) Conversely, if $a$ is an
accumulation point of $\mathcal{D}(f)$, and for every $(x_n)$ with range in
$\mathcal{D}(f) \setminus \{a\}$, $x_n \to a$ implies $f(x_n) \to l$, then
$f(x) \to l$ as $x \to a$.
As in the case of sequences, if each of two functions has a limit at a
given point so do their sum and product. For the purpose of proving
this result, we first introduce a formal definition.
2.1.3 Definition. If $f$ and $g$ are (real-valued) functions, then
(a) $f + g$ is that function with domain
$\mathcal{D}(f + g) = \{x: x \in \mathcal{D}(f) \cap \mathcal{D}(g)\}$ and
so that $\forall x \in \mathcal{D}(f + g)$, $(f + g)(x) = f(x) + g(x)$.
(b) $fg$ is that function with domain
$\mathcal{D}(fg) = \{x: x \in \mathcal{D}(f) \cap \mathcal{D}(g)\}$ and so
that $\forall x \in \mathcal{D}(fg)$, $(fg)(x) = f(x)g(x)$.
(c) $f/g$ is that function with domain
$\mathcal{D}(f/g) = \{x: x \in \mathcal{D}(f) \cap \mathcal{D}(g) \;\&\; g(x) \ne 0\}$
and so that $\forall x \in \mathcal{D}(f/g)$, $(f/g)(x) = f(x)/g(x)$.
2.1.4 Proposition. If $f$ and $g$ are functions, $a$ is an accumulation
point of $\mathcal{D}(f) \cap \mathcal{D}(g)$, and $\lim_{x \to a} f(x)$ and
$\lim_{x \to a} g(x)$ exist, then
(a) $\lim_{x \to a} (f + g)(x) = \lim_{x \to a} f(x) + \lim_{x \to a} g(x)$;
(b) $\lim_{x \to a} (fg)(x) = \lim_{x \to a} f(x) \, \lim_{x \to a} g(x)$;
(c) and if $\lim_{x \to a} g(x) \ne 0$, then
$\lim_{x \to a} (f/g)(x) = \lim_{x \to a} f(x) / \lim_{x \to a} g(x)$.
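Proof. (a) Let us set $l = \lim_{x \to a} f(x)$, $m = \lim_{x \to a} g(x)$.
Then $\forall \varepsilon > 0$, $\exists \delta > 0$ so that
$|x - a| < \delta$ and
$x \in \mathcal{D}(f) \cap \mathcal{D}(g) \setminus \{a\} \Rightarrow$
$$|f(x) - l| < \varepsilon/2 \quad \text{and} \quad |g(x) - m| < \varepsilon/2.$$
Hence for these same $x$ we have
$$|(f + g)(x) - (l + m)| \le |f(x) - l| + |g(x) - m| < \varepsilon.$$
(b) Take $l$ and $m$ as in part (a). Then $\exists \delta_1 > 0$ so that
$|x - a| < \delta_1$ and $x \in \mathcal{D}(f) \setminus \{a\} \Rightarrow$
$$|f(x)| - |l| \le |f(x) - l| < 1;$$
that is,
$$|f(x)| < 1 + |l|.$$
Also, $\forall \varepsilon > 0$, $\exists \delta_2 > 0$ so that
$|x - a| < \delta_2$ and
$x \in \mathcal{D}(f) \cap \mathcal{D}(g) \setminus \{a\} \Rightarrow$
$$|f(x) - l| < \frac{\varepsilon}{2(1 + |m|)} \quad \text{and} \quad |g(x) - m| < \frac{\varepsilon}{2(1 + |l|)}.$$
Take $\delta = \min(\delta_1, \delta_2)$; then for $|x - a| < \delta$ and
$x \in \mathcal{D}(f) \cap \mathcal{D}(g) \setminus \{a\}$ we get
$$|(fg)(x) - lm| = |f(x)g(x) - f(x)m + f(x)m - lm| \le |f(x)|\,|g(x) - m| + |m|\,|f(x) - l| < \varepsilon.$$
(c) Again, take $m$ as in part (a). Then $\exists \delta_1 > 0$ so that
$x \in \mathcal{D}(g)$ and $0 < |x - a| < \delta_1 \Rightarrow |g(x)| > |m|/2 > 0$,
and hence
$]a - \delta_1, a + \delta_1[ \;\cap\; \mathcal{D}(g) \setminus \{a\} \subset \mathcal{D}(1/g)$.
Now, $\forall \varepsilon > 0$, $\exists \delta$ so that
$0 < \delta < \delta_1$ and $x \in \mathcal{D}(g)$ with
$0 < |x - a| < \delta \Rightarrow$
$$|g(x) - m| < \varepsilon m^2/2.$$
Thus $x \in \mathcal{D}(1/g)$ and $0 < |x - a| < \delta \Rightarrow$
$$\left| \frac{1}{g(x)} - \frac{1}{m} \right| = \frac{|g(x) - m|}{|m\, g(x)|} \le \frac{2}{m^2}\, |g(x) - m| < \varepsilon.$$
This shows that
$$\lim_{x \to a} (1/g)(x) = 1/m.$$
The proof of part (c) is completed by applying part (b) to the product
function $f(1/g)$.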
It is by no means always obvious whether or not a function has a limit
at a point; and even if we know that it has a limit at a point, the value
of the limit may not be too easy to establish. A standard example that
shows this is the function defined by
$$s(x) = \frac{\sin x}{x}, \qquad x \ne 0.$$
We shall see later that
$$\lim_{x \to 0} s(x) = 1.$$
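The behaviour of $s$ near $0$ can be explored numerically. The following Python sketch is an illustration added here, not part of the argument: the name `s` follows the text, while the sample points $10^{-k}$ are an arbitrary choice. A table of values is only suggestive and does not establish the limit.

```python
import math


def s(x: float) -> float:
    """s(x) = sin(x)/x for x != 0; the point 0 is not in the domain."""
    if x == 0.0:
        raise ValueError("0 is not in the domain of s")
    return math.sin(x) / x


# Sample s along a sequence x_k -> 0 with x_k != 0 (illustrative choice).
samples = [(10.0 ** -k, s(10.0 ** -k)) for k in range(1, 6)]
for x, value in samples:
    print(f"x = {x:.0e}   s(x) = {value:.12f}")
```

The printed values approach 1 as $x$ shrinks, which is consistent with the limit stated above.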
Closely connected with the concept of the limit of a function at a
point is the concept of continuity of a function at a point.
2.1.5 Definition. A function $f$ is said to be continuous at the point
$a$ $\Leftrightarrow$ $a \in \mathcal{D}(f)$ and $\forall \varepsilon > 0$,
$\exists \delta > 0$ such that $|x - a| < \delta$ and
$x \in \mathcal{D}(f) \Rightarrow |f(x) - f(a)| < \varepsilon$. If $f$ is
continuous at every point of its domain, we say $f$ is a continuous function.
Note that according to this definition, any $a \in \mathcal{D}(f)$ that is
not an accumulation point of this set is a point of continuity of $f$. In
case $a$ is an accumulation point of $\mathcal{D}(f)$, then for $f$ to be
continuous at $a$, it is necessary and sufficient that
$$\lim_{x \to a} f(x) = f(a).$$
2.1.6 Proposition. If a function $f$ is continuous at $a$, then
$a \in \mathcal{D}(f)$ and for every sequence $(x_n)$ with range in
$\mathcal{D}(f)$ and $x_n \to a$, we have $f(x_n) \to f(a)$. (AC)
Conversely, if $a \in \mathcal{D}(f)$, and for every sequence $(x_n)$ with
range in $\mathcal{D}(f)$, $x_n \to a$ implies $f(x_n) \to f(a)$, then $f$
is continuous at $a$.
Proof. If $f$ is continuous at $a$, then $\forall \varepsilon > 0$,
$\exists \delta > 0$ so that $|x - a| < \delta$ and
$x \in \mathcal{D}(f) \Rightarrow |f(x) - f(a)| < \varepsilon$. Also
$\exists N$ so that $n \ge N \Rightarrow |x_n - a| < \delta$. Hence
$n \ge N \Rightarrow |f(x_n) - f(a)| < \varepsilon$, which is the proof
of the first sentence.
To prove the second statement, let us assume to the contrary that
it is not true. Then $\exists \varepsilon_0 > 0$ so that
$\forall \delta > 0$ there exists an $x \in \mathcal{D}(f)$ so that
$|x - a| < \delta$ and $|f(x) - f(a)| \ge \varepsilon_0$. For $n \in N_0$,
let $\delta_n = 1/(n + 1)$ and
$A_n = \{x: x \in \mathcal{D}(f) \;\&\; |x - a| < \delta_n \;\&\; |f(x) - f(a)| \ge \varepsilon_0\}$.
Each set $A_n$ is nonvoid and hence by (AC) there exists a sequence $(x_n)$
so that $x_n \in A_n$. Clearly $x_n \to a$, but since $\forall n \in N_0$,
$|f(x_n) - f(a)| \ge \varepsilon_0$, we get a contradiction.
2.1.7 Theorem. If $f$ and $g$ are continuous at
$a \in \mathcal{D}(f) \cap \mathcal{D}(g)$, then $f + g$ and $fg$ are
continuous at $a$, and if $g(a) \ne 0$, $f/g$ is also continuous at $a$.
Proof. In case $a$ is an accumulation point of the domain common to $f$
and $g$, then the theorem is an immediate consequence of Proposition 2.1.4
and the remark made prior to Proposition 2.1.6. In case $a$ is not an
accumulation point of the common domain, all the functions listed are
automatically continuous.
2.1.8 Theorem. If $f$ and $g$ are functions, if $g$ is continuous at $a$
and $f$ is continuous at $b = g(a)$, then $f \circ g$ is continuous at $a$.
(See Definition 1.3.5 for $f \circ g$.)
Proof. Since $f$ is continuous at $b$, $\forall \varepsilon > 0$,
$\exists \eta > 0$ so that $|y - b| < \eta$ and
$y \in \mathcal{D}(f) \Rightarrow |f(y) - f(b)| < \varepsilon$. Also, since
$g$ is continuous at $a$, $\exists \delta > 0$ so that $|x - a| < \delta$
and $x \in \mathcal{D}(g) \Rightarrow |g(x) - g(a)| < \eta$. Hence
$$|f \circ g(x) - f \circ g(a)| < \varepsilon$$
provided $|x - a| < \delta$ and $x \in \mathcal{D}(f \circ g)$. This is
precisely the meaning of continuity of $f \circ g$ at $a$.
We shall now give a simple example to show that, in general, the
$\delta$ of Definition 2.1.5 depends in an essential way on both
$\varepsilon$ and $a$. Let $f$ be the function with domain $]0, 1]$ given by
the equation
$$f(x) = \frac{1}{x}.$$
Since it is almost trivial to prove that the function defined by
$g(x) = x$ is continuous on $]0, 1]$, the fact that $f$ is continuous at
every point of $]0, 1]$ follows from Theorem 2.1.7.
Let us take $\varepsilon > 0$ and $a \in ]0, 1]$. We know that
$\exists \delta(\varepsilon, a)$, depending on $\varepsilon$ and $a$, so
that $|x - a| < \delta$ and
$x \in ]0, 1] \Rightarrow |f(x) - f(a)| < \varepsilon$. Therefore,
$$\left| \frac{1}{x} - \frac{1}{a} \right| = \frac{|x - a|}{xa} < \varepsilon,$$
and this in turn implies that
$$|x - a| < \varepsilon x a \le \varepsilon a,$$
since $x \le 1$. Hence for fixed $\varepsilon$, as $a$ goes to zero we see
that $\delta(\varepsilon, a)$ must go to zero for $f(x)$ to be within
$\varepsilon$ of $f(a)$. Indeed, this also shows that if $a$ is fixed and
$\varepsilon$ goes to zero, then $\delta(\varepsilon, a)$ must also go to
zero.
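For this particular function the dependence of $\delta$ on the point $a$ can be made explicit. The Python sketch below is an added illustration; the name `largest_delta` and the closed form it returns are assumptions of the sketch, obtained by solving $1/x - 1/a < \varepsilon$ for $x < a$, and are not taken from the text. It tabulates the largest admissible $\delta(\varepsilon, a)$ for a fixed $\varepsilon$ as $a$ shrinks.

```python
def largest_delta(eps: float, a: float) -> float:
    """Largest delta such that |x - a| < delta and x in ]0, 1]
    imply |1/x - 1/a| < eps, for a in ]0, 1] and eps > 0.

    Assumption of this sketch: the binding constraint lies to the left
    of a, where 1/x - 1/a < eps  <=>  x > a / (1 + eps * a), so that
    delta = a - a / (1 + eps * a) = eps * a**2 / (1 + eps * a).
    """
    if not (0.0 < a <= 1.0) or eps <= 0.0:
        raise ValueError("need 0 < a <= 1 and eps > 0")
    return eps * a * a / (1.0 + eps * a)


eps = 0.5  # illustrative choice
for a in (1.0, 0.1, 0.01, 0.001):
    d = largest_delta(eps, a)
    print(f"a = {a:<6} delta = {d:.3e}   eps * a = {eps * a:.3e}")
```

In each row the computed $\delta$ stays below $\varepsilon a$, in agreement with the bound $|x - a| < \varepsilon a$ derived above, and it tends to zero with $a$.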
Exercises
1. State and prove a result like Proposition 2.1.4 when $\mathcal{D}(f)$
and $\mathcal{D}(g)$ are unbounded above and $a = \infty$.
2. If $\lim_{x \to a} f(x) = l$, show that we are justified in calling $l$
the limit of $f(x)$ at $a$; that is, show that if $l$ exists it must be
unique.
3. Let $s$ be the function with domain $R \setminus \{0\}$ defined by
$$s(x) = \sin(1/x).$$
Show that $s$ does not have a limit at $x = 0$.
4. A polynomial function of degree $n$ is a function with domain $R$
given by
$$f(x) = a_0 + a_1 x + \cdots + a_n x^n, \qquad a_n \ne 0.$$
Show that every polynomial function is continuous.
5. Suppose $f$ is a function with domain $R$ so that
$$f(x + y) = f(x) f(y), \qquad \forall x, y \in R.$$
If $f$ is continuous at $a$, show that $f$ is continuous. Further, if
$0 \in \mathcal{R}(f)$, then $\{0\} = \mathcal{R}(f)$.
6. Suppose $f$ is a function with $\mathcal{D}(f) = R$ and
$\exists a \in R$ so that $f$ is continuous at $a$. Show that if for every
$x$ and $y$ in $R$,
$$f(x + y) = f(x) + f(y),$$
then $\exists m \in R$ so that $f(x) = mx$.
7. If $f$ is a function, then $|f|$ shall be that function with
$\mathcal{D}(|f|) = \mathcal{D}(f)$ and $\forall x \in \mathcal{D}(f)$,
$|f|(x) = |f(x)|$. If $f$ is continuous at $a$, use Theorem 2.1.8 to show
that $|f|$ is continuous at $a$. Is the converse true?
8. Let $f$ be a function with domain $R$ defined by
$$f(x) = \begin{cases} 1 & \Leftarrow x \text{ is rational}, \\ 0 & \Leftarrow x \text{ is irrational}. \end{cases}$$
At what points is $f$ continuous and at what points does the limit of $f$
exist?
9. If $a \ge 0$, then the results of Exercise 9 of Section 1.9 show that
$\forall n \in N$ there is a unique nonnegative number $a^{1/n}$ so that
$(a^{1/n})^n = a$. If $f$ is a function having domain the nonnegative reals
and whose value at each $x$ of its domain is $x^{1/2}$, show that $f$ is
continuous. Do the same for that function whose value at each nonnegative
$x$ is $x^{1/n}$, $n \in N$.
10. Suppose that $f_1$ and $f_2$ are functions with a common domain and
$f$ is that function having the same domain and defined by
$$f(x) = \max(f_1(x), f_2(x)).$$
If $f_1$ and $f_2$ are continuous show that $f$ is continuous. Extend this
result to the case of $n$ functions.
11. Give an example which shows that $fg$ may be continuous but
$f$ and $g$ are not continuous. Do the same for $f \circ g$.
12. Assume the following properties for the sine and cosine functions:
$$|\sin(x)| \le |x|, \qquad |\cos(x)| \le 1, \qquad \forall x \in R;$$
$$\sin(x) - \sin(y) = 2 \sin\left(\frac{x - y}{2}\right) \cos\left(\frac{x + y}{2}\right), \qquad \forall x, y \in R.$$
Prove that the sine function is continuous.
13. If $f$ is a function continuous at $a$ and $f(a) > 0$, show that
$\exists h > 0$ so that for $x \in \{x: |x - a| < h\} \cap \mathcal{D}(f)$,
$f(x) > 0$.
14. A positive rational number $p/q$ is said to be in lowest terms if
there is no equal [equivalent in the terminology of 1.5] $p_1/q_1$ so that
$0 < p_1 < p$, $0 < q_1 < q$. Let $f$ be that function with domain $[0, 1]$
defined as follows:
$$f(x) = \begin{cases} 1/q & \Leftarrow x = p/q \text{ in lowest terms}, \\ 0 & \Leftarrow x \text{ is irrational}. \end{cases}$$
Show that $f$ is continuous only at the irrational numbers.

2.2 THE HEINE-BOREL THEOREM AND UNIFORM CONTINUITY
At the end of Section 2.1 we gave a simple example which showed
that for a continuous function the $\delta$ of Definition 2.1.5 depended in
an essential way on both $\varepsilon$ and the given point of the domain of
the function. In 1872 E. Heine showed that if the domain of the function
was a closed bounded interval, then $\delta$ depended only on $\varepsilon$
and not on the point chosen in the interval. In 1895 E. Borel distilled out
the essence of Heine's proof and stated a certain property about closed
bounded intervals from which Heine's theorem followed as an immediate
corollary. The purpose of this section is to prove the Heine-Borel
theorem and use it to prove Heine's theorem on continuous functions.
We begin with a number of definitions.
2.2.1 Definition. A set in $R$ is said to be closed $\Leftrightarrow$ it
contains all its accumulation points. The closure $\bar{A}$ of any set $A$
is $A$ together with all its accumulation points.
A set in $R$ is said to be compact $\Leftrightarrow$ it is closed and
bounded.
2.2.2 Definition. A set $U \subset R$ is said to be open $\Leftrightarrow$
$\forall x \in U$ there is an open interval $I(x) \subset U$.
Recall that $x \in I(x)$ so that if $U$ is open we have
$$U = \bigcup \{I(x): x \in U\}.$$
In terms of open sets the definition of continuity at a point is as
follows: A function $f$ is continuous at $a$ $\Leftrightarrow$
$a \in \mathcal{D}(f)$ and for every open $U$ containing $f(a)$ there exists
an open $V$ containing $a$ so that $f(V \cap \mathcal{D}(f)) \subset U$.
2.2.3 Proposition. The complement of a closed set is open and the
complement of an open set is closed.
Proof. Suppose $A$ is closed. Then $\forall x \in A^c$, the complement of
$A$, $\exists I(x) \subset A^c$. Otherwise $x$ is an accumulation point of
$A$ and would belong to $A$.
Conversely, if $A$ is open, then $\forall x$ which is an accumulation
point of $A^c$ and $\forall I(x)$, $I(x) \cap A^c \ne \emptyset$. Hence
$x \in A^c$, since otherwise $x \in A$, and $\exists I(x)$ so that
$I(x) \cap A^c = \emptyset$.
2.2.4 Definition. A collection of sets $\mathcal{U}$ is said to be a
covering for a set $A \subset R$ $\Leftrightarrow$
$$A \subset \bigcup \{U: U \in \mathcal{U}\}.$$
The collection $\mathcal{U}$ is said to be an open covering for $A$
$\Leftrightarrow$ every $U \in \mathcal{U}$ is open, and $\mathcal{U}$ is a
covering for $A$.
2.2.5 Theorem (Heine-Borel). A set $A \subset R$ is compact
$\Leftrightarrow$ in every open covering for $A$ there exists a finite
number of sets that cover $A$.
Proof. Suppose first that $A$ is a closed bounded interval $[a, b]$. Let
$\mathcal{U}$ be an open covering for $A$ and let $E$ be the collection of
all $x$ in $A$ so that $[a, x]$ can be covered by a finite number of
elements of $\mathcal{U}$. (Note that $a \in E$, so $E$ is nonvoid.) Set
$\xi = \sup E$ and suppose $U_0 \in \mathcal{U}$ so that $\xi \in U_0$, and
$I(\xi)$ is an open interval about $\xi$ so that $I(\xi) \subset U_0$. By
the definition of $\sup E$ we must have $I(\xi) \cap E \ne \emptyset$. Let
$x \in I(\xi) \cap E$ and $\{U_k: k \in (1, n)\} \subset \mathcal{U}$ be an
open covering for $[a, x]$; then $\{U_k: k \in (0, n)\}$ is an open covering
for $[a, \xi]$. Hence $\xi \in E$ and if $\xi < b$ we get a contradiction,
since the same finite collection then covers $[a, y]$ for some
$y \in I(\xi)$ with $\xi < y \le b$. This shows that $[a, b]$ can be covered
by a finite number of elements from $\mathcal{U}$.
Now let us suppose that $A$ is any closed and bounded set in $R$. Let $I$
be a closed (and bounded) interval so that $A \subset I$ and set
$U_0 = A^c$. Then $U_0$ is open and $\mathcal{U} \cup \{U_0\}$ is an open
covering for $I$. This reduces to a finite subcovering
$\{U_k: k \in (0, n)\}$ of $I$, and since $U_0 \cap A = \emptyset$,
$\{U_k: k \in (1, n)\}$ must cover $A$.
To prove the converse implication we suppose that every open covering
of $A$ reduces to a finite subcovering. For every $n \in N$, let
$I_n = ]-n, n[$. The collection $\{I_n: n \in N\}$ of open intervals covers
$R$ and hence $A$. Hence there exists a finite set
$\{I_{n_j}: j \in (1, k)\}$ that covers $A$. Let
$m = \max \{n_j: j \in (1, k)\}$; then clearly $A \subset I_m$, which means
it is bounded.
To show $A$ is closed, let $a \in A^c$. Then $\forall x \in A$,
$|x - a| = \delta(x) > 0$. Hence, if $\forall n \in N$,
$U_n = \{x: |x - a| > 1/n\}$, it follows that
$\mathcal{U} = \{U_n: n \in N\}$ is an open covering for $A$, since
$\forall \delta(x)$, $\exists n \in N$ so that $1/n < \delta(x)$. By
hypothesis, this reduces to a finite subcovering
$\{U_{n_j}: j \in (1, k)\}$. Let $m = \max \{n_j: j \in (1, k)\}$; then,
since $\forall x \in A$, $\exists n_j$ so that $x \in U_{n_j}$, it follows
that $\{x: |x - a| < 1/m\}$ is completely in $A^c$. This means $A^c$ is open
and therefore $A$ is closed.
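The covering $\{U_n\}$ used in the last step of the proof can be tried out on a set that is not closed. The Python sketch below is an added illustration under assumptions made here: it takes $A = ]0, 1]$ and $a = 0$, and the function names are hypothetical. Every point of $A$ lies in some $U_n$, yet any finite choice of indices leaves a point of $A$ uncovered, so this open covering has no finite subcovering.

```python
import math
from fractions import Fraction


def in_U(n: int, x: Fraction, a: Fraction = Fraction(0)) -> bool:
    """Membership in U_n = {x : |x - a| > 1/n}."""
    return abs(x - a) > Fraction(1, n)


def covering_index(x: Fraction) -> int:
    """An index n with x in U_n, for x in ]0, 1] and a = 0."""
    return math.floor(1 / x) + 1


def uncovered_point(indices) -> Fraction:
    """A point of ]0, 1] missed by every U_n with n in `indices` (a = 0)."""
    return Fraction(1, max(indices))


indices = [2, 5, 17]  # an arbitrary finite subfamily
p = uncovered_point(indices)
print(p, [in_U(n, p) for n in indices], covering_index(p))
```

Exact rational arithmetic is used so that the boundary case $|x - a| = 1/n$ is decided without rounding.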
Many people like to use a "sequential" version of compactness. The
equivalence with the definition of compactness given in Definition 2.2.1
seems to involve the use of (AC).
2.2.6 Proposition. If a set $A \subset R$ is compact, then every sequence
with range in $A$ has a subsequence that converges to a point in $A$. (AC)
Conversely, if every sequence with range in $A$ has a subsequence that
converges to a point in $A$, then $A$ is compact.
Proof. Suppose $A$ is compact and $(x_n)$ is a sequence with range in
$A$. If the range of $(x_n)$ is finite, then there is a subsequence whose
range consists of only one element (proof?) and hence is convergent to an
element of $A$. If the range of $(x_n)$ is infinite, then the
Bolzano-Weierstrass theorem tells us that the range has an accumulation
point $a$ which belongs to $A$, since $A$ is closed. Apply the result of
Exercise 5 of Section 1.9, which shows the existence of a subsequence of
$(x_n)$ that converges to $a$.
Conversely, suppose that every sequence with range in $A$ contains
a subsequence which converges to a point in $A$. If $a$ is an accumulation
point of $A$, then by use of (AC) we conclude that there is a sequence that
converges to $a$. Indeed, for $n \in N_0$ let
$A_n = \{x: 0 < |x - a| < 1/(n + 1)\} \cap A$. The sets $A_n$ are nonvoid
and by (AC) we get a sequence $(x_n)$ with $x_n \in A_n$; clearly
$x_n \to a$. By hypothesis there is a subsequence $(y_n)$ that converges to
an element of $A$. But any subsequence of a convergent sequence converges
to the same limit as the original sequence. Hence $a \in A$, which shows
$A$ is closed.
To show $A$ is bounded let us assume the contrary. Suppose it is not
bounded above. Now, $\forall n \in N_0$ let
$$A_n = A \cap [n, n + 1[.$$
There are an infinite number of the sets $A_n$ which are nonvoid. Otherwise
$A$ is bounded above. Hence the collection of nonvoid $A_n$ is denumerable
and there is a one-to-one function $\Phi$ that maps $N_0$ into itself so
that the set $A_{\Phi(n)}$ is nonvoid. Clearly it is possible to choose
$\Phi$ so that $m < n \Rightarrow \Phi(m) < \Phi(n)$ (proof?). Let us put
$x_n = \inf A_{\Phi(n)}$. From the definition of $A_k$ and the fact that
$\Phi$ is increasing, we get
$$\Phi(n) \le x_n < \Phi(n) + 1 \le \Phi(n + 1).$$
Since we have already shown $A$ is closed, $\forall n \in N_0$,
$x_n \in A$. Now set
$$y_n = x_{2n+1}.$$
Then we have
$$y_{n+1} - y_n = x_{2n+3} - x_{2n+1} > \Phi(2n + 3) - \Phi(2n + 2) \ge 1,$$
and by induction it follows that if $m > n$,
$$y_m - y_n \ge 1.$$
Thus no subsequence of $(y_n)$ can be Cauchy and hence no subsequence
can converge. This contradicts the original hypothesis, and shows that
$A$ must be bounded above. In the same way, the assumption that $A$
is unbounded below would lead to a contradiction.
An immediate corollary of the Heine-Borel theorem is the following,
which is often called the Cantor intersection theorem.
2.2.7 Theorem (Cantor). If $(A_n)$ is a sequence of nonvoid compact
sets so that $\forall n \in N_0$, $A_{n+1} \subset A_n$, then
$$\bigcap \{A_n: n \in N_0\} \ne \emptyset.$$
Proof. Suppose to the contrary that this intersection is void. Then
[see Exercise 10(b) of Section 1.2]
$$R = \left(\bigcap \{A_n: n \in N_0\}\right)^c = \bigcup \{A_n^c: n \in N_0\},$$
and since $\forall n \in N_0$, $A_n^c$ is open, the collection
$\{A_n^c: n \in N_0\}$ is an open covering for $A_0$. Consequently, there
is a finite set $\{A_{n_j}^c: j \in (1, k)\}$ that covers $A_0$. Let
$m = \max \{n_j: j \in (1, k)\}$; then since $\forall n$,
$A_n^c \subset A_{n+1}^c$, it follows that $A_m^c$ covers $A_0$. But this
is impossible, since $A_m \subset A_0$, $A_m \ne \emptyset$, and
$A_m \cap A_m^c = \emptyset$.

Let us now go on to establish Heine's theorem about continuous
functions.
2.2.8 Definition. A function $f$ is said to be uniformly continuous
$\Leftrightarrow$ $\forall \varepsilon > 0$, $\exists \delta > 0$ so that if
$|x - y| < \delta$ and $x, y \in \mathcal{D}(f)$, then
$|f(x) - f(y)| < \varepsilon$.
In the very formal symbolism of the predicate calculus this would be
written
$$(\varepsilon)(\varepsilon > 0 \Rightarrow (\exists \delta)(\delta > 0 \;\&\; (x)(y)(x, y \in \mathcal{D}(f) \;\&\; |x - y| < \delta \Rightarrow |f(x) - f(y)| < \varepsilon))).$$
It is not true that every continuous function is uniformly continuous.
We have already seen in Section 2.1 that the function defined by
$f(x) = 1/x$ for $x \in ]0, 1]$ is not uniformly continuous. Another
example is given by $f(x) = \sin(1/x)$, $x \in ]0, 1]$, and this function
is even bounded.
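The failure of uniform continuity for $f(x) = 1/x$ on $]0, 1]$ can be seen from explicit pairs of points. The Python sketch below is an added illustration; the choice $x = 1/n$, $y = 1/(n + 1)$ and the name `witness` are assumptions made here, not taken from the text. The distance $|x - y|$ shrinks to zero while $|f(x) - f(y)|$ stays equal to 1, so no single $\delta$ can serve for $\varepsilon = 1$.

```python
from fractions import Fraction


def f(x: Fraction) -> Fraction:
    """f(x) = 1/x on ]0, 1]."""
    return 1 / x


def witness(n: int):
    """Return (|x - y|, |f(x) - f(y)|) for x = 1/n and y = 1/(n + 1)."""
    x, y = Fraction(1, n), Fraction(1, n + 1)
    return abs(x - y), abs(f(x) - f(y))


for n in (1, 10, 100, 1000):
    gap, jump = witness(n)
    print(f"n = {n:<5} |x - y| = {float(gap):.2e}   |f(x) - f(y)| = {jump}")
```

Rational arithmetic keeps the computation exact, so the jump is exactly 1 for every $n$ rather than approximately 1.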
2.2.9 Theorem (Heine). If $f$ is a continuous function and
$\mathcal{D}(f)$ is compact, then $f$ is uniformly continuous.
Proof. $\forall x \in \mathcal{D}(f)$ and $\forall \varepsilon > 0$,
$\exists \delta(\varepsilon, x) > 0$ so that if
$|y - x| < \delta(\varepsilon, x)$ and $y \in \mathcal{D}(f)$, then
$|f(y) - f(x)| < \varepsilon/2$. For every $x \in \mathcal{D}(f)$ set
$I(x) = \{y: |y - x| < \delta(\varepsilon, x)/2\}$. The collection
$$\mathcal{U} = \{I(x): x \in \mathcal{D}(f)\}$$
is an open covering for $\mathcal{D}(f)$ and by the compactness of
$\mathcal{D}(f)$ reduces to a finite subcover. That is, there is a finite
set of points $\{x_k: k \in (1, n)\} \subset \mathcal{D}(f)$ so that
$$\mathcal{D}(f) \subset \bigcup \{I(x_k): k \in (1, n)\}.$$
Let us set $\delta = \min \{\delta(\varepsilon, x_k)/2: k \in (1, n)\}$.
Suppose $x, y \in \mathcal{D}(f)$ and $|x - y| < \delta$. Now,
$\exists k \in (1, n)$ so that $x \in I(x_k)$. This implies that
$$|y - x_k| \le |y - x| + |x - x_k| < \delta + \delta(\varepsilon, x_k)/2 \le \delta(\varepsilon, x_k).$$
Hence
$$|f(y) - f(x)| \le |f(y) - f(x_k)| + |f(x_k) - f(x)| < \varepsilon,$$
which proves that $f$ is uniformly continuous.
REMARK: In the proof of the last theorem, it might seem that it is
necessary to use the axiom of choice in order to pick a
$\delta(\varepsilon, x)$ for every $x \in \mathcal{D}(f)$. However,
there are many explicit ways of picking $\delta(\varepsilon, x)$.
For example, for a given $\varepsilon$ and $x$, one might consider the
collection of all $\delta$'s less than one for which the first statement in
the proof holds, and then take the supremum of this set. There are also a
number of other ways to construct a suitable covering for $\mathcal{D}(f)$
and we are sure that the reader can think of several alternate ways. There
are a number of places throughout the text where a point like this will
arise, but we shall not comment about it in the future.
There are several other important results about continuous functions
with compact domains that follow from the Heine-Borel theorem. We
shall give these below.
2.2.10 Theorem. A continuous function with a compact domain has a
compact range.
Proof. If $f$ is a continuous function with a compact domain, we shall
show that its range is closed and bounded. We shall consider only the
case where $\mathcal{R}(f) \ne \emptyset$, since otherwise the theorem is
trivial. First, if $b$ is an accumulation point of $\mathcal{R}(f)$,
$\forall n \in N$ let
$$A_n = \{x: x \in \mathcal{D}(f) \;\&\; |f(x) - b| \le 1/n\}.$$
Each set $A_n$ is nonvoid, closed (Exercise 16 of this section), and hence
compact. Further, it is clear that $\forall n \in N$,
$A_{n+1} \subset A_n$. Thus by the Cantor intersection theorem,
$\exists a \in \mathcal{D}(f)$, which belongs to $A_n$ for every
$n \in N$. Thus $f(a) = b$, which shows $b \in \mathcal{R}(f)$, and hence
$\mathcal{R}(f)$ is closed.
To show that $\mathcal{R}(f)$ is bounded, we take
$a \in \mathcal{D}(f)$. Since $f$ is continuous, $\forall n \in N$ there is
an open interval $I_n$ so that $x \in I_n \cap \mathcal{D}(f) \Rightarrow$
$$|f(x)| - |f(a)| \le |f(x) - f(a)| < n.$$
Thus $x \in I_n \cap \mathcal{D}(f) \Rightarrow |f(x)| \in J_n = [0, |f(a)| + n]$.
Since the collection $\{J_n: n \in N\}$ covers $R^+ \cup \{0\}$, it follows
that $\{I_n: n \in N\}$ is an open covering for $\mathcal{D}(f)$. By the
compactness of $\mathcal{D}(f)$ this reduces to a finite subcovering
$\{I_{n_j}: j \in (1, k)\}$. Let $m = \max \{n_j: j \in (1, k)\}$; since
$\forall n \in N$, $J_n \subset J_{n+1}$, it follows that
$|f(x)| \in J_m$ for every $x \in \mathcal{D}(f)$, which proves that
$\mathcal{R}(f)$ is bounded.
In Chapter 6 we shall give a "global" characterization of continuous

functions from which the above theorem will follow as an immediate
corollary of the Heine-Borel theorem.
We shall now show that continuous functions with compact domains
have a maximum and a minimum. We first give the formal definition.
2.2.11 Definition. If $f$ is a function and $a \in \mathcal{D}(f)$, then
$a$ is said to be at the maximum [minimum] for $f$ $\Leftrightarrow$
$\forall x \in \mathcal{D}(f)$, $f(x) \le f(a)$ [$f(a) \le f(x)$]. The
number $f(a)$ is called the maximum [minimum] of $f$.
$a \in \mathcal{D}(f)$ is said to be at a local maximum [minimum] for $f$
$\Leftrightarrow$ there exists an open interval $I(a)$ so that
$\forall x \in I(a) \cap \mathcal{D}(f)$, $f(x) \le f(a)$
[$f(a) \le f(x)$]. The number $f(a)$ is called a local maximum [minimum]
of $f$.
2.2.12 Theorem. Every continuous function with a compact domain
has a maximum and a minimum.
Proof. If $f$ is continuous on a compact domain, it follows from the
last theorem that $\mathcal{R}(f)$ is compact. Since $\mathcal{R}(f)$ is
bounded, the numbers
$$m = \inf \mathcal{R}(f), \qquad M = \sup \mathcal{R}(f),$$
exist, and since $\mathcal{R}(f)$ is closed, they belong to
$\mathcal{R}(f)$. This establishes the theorem.
The next theorem gives more specific information about functions
that are defined on compact intervals. We first prove a lemma.
2.2.13 Lemma. If $f$ is continuous, $\mathcal{D}(f) = [a, b]$ and
$f(a)f(b) \le 0$, then $\exists c \in [a, b]$ so that $f(c) = 0$.
Proof. If $f(a) = 0$ or $f(b) = 0$ we are done. Hence for the sake of
argument suppose $f(a) < 0$ and $f(b) > 0$. Let
$A = \{x: x \in [a, b] \;\&\; f(x) < 0\}$; since $f(a) < 0$,
$A \ne \emptyset$ and hence $c = \text{l.u.b. } A$ is well defined.
If $f(c) < 0$, then, if we use the continuity of $f$ at $c$, there is an
open interval $I(c)$ so that
$x \in I(c) \cap \mathcal{D}(f) \Rightarrow f(x) < 0$ (see Exercise 13
of Section 2.1). This contradicts the definition of $c$. By the same
argument we cannot have $f(c) > 0$. Consequently, we must have $f(c) = 0$.
If $f(a) > 0$ and $f(b) < 0$, apply the above proof to the function $-f$
and we are done.
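The lemma asserts only that a zero exists; its proof, through a least upper bound, does not compute one. As a complement, the Python sketch below approximates such a point by repeated halving of the interval. This bisection procedure is a different technique from the proof above, and the name `bisect_zero`, the tolerance, and the sample function are assumptions made for the illustration.

```python
from typing import Callable


def bisect_zero(f: Callable[[float], float], a: float, b: float,
                tol: float = 1e-12) -> float:
    """Approximate a zero of a continuous f on [a, b] when f(a) f(b) <= 0."""
    fa, fb = f(a), f(b)
    if fa == 0.0:
        return a
    if fb == 0.0:
        return b
    if (fa < 0.0) == (fb < 0.0):
        raise ValueError("need f(a) f(b) <= 0")
    while b - a > tol:
        c = (a + b) / 2.0
        fc = f(c)
        if fc == 0.0:
            return c
        # Keep the half-interval on which f still changes sign.
        if (fa < 0.0) != (fc < 0.0):
            b, fb = c, fc
        else:
            a, fa = c, fc
    return (a + b) / 2.0


root = bisect_zero(lambda x: x * x - 2.0, 0.0, 2.0)
print(root)
```

Each step preserves the sign condition of the lemma on the retained half-interval, which is why the lemma guarantees that a zero remains inside it.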
2.2.14 Theorem. If $f$ is continuous, $\mathcal{D}(f) = [a, b]$, $m$ is
the minimum of $f$ and $M$ is the maximum of $f$, then
$\mathcal{R}(f) = [m, M]$. That is, $f$ takes on all values between its
maximum and its minimum.
Proof. Let $y \in [m, M]$, $f(c) = m$, $f(d) = M$. If we set
$g(x) = f(x) - y$, then $g$ is continuous and $g(c)g(d) \le 0$. By the last
lemma, $\exists e$ between $c$ and $d$ so that $g(e) = f(e) - y = 0$. This
shows $[m, M] \subset \mathcal{R}(f)$. On the other hand, since
$\forall x \in [a, b]$, $m \le f(x) \le M$, we see that
$\mathcal{R}(f) \subset [m, M]$. These two inclusions establish the theorem.
Exercises
1. Show that a closed interval is a closed set and an open interval
is an open set.
2. If $A \subset R$, designate by $A'$ the set of accumulation points of
$A$. Show the following:
(a) $A'$ is closed.
(b) $A \subset B \Rightarrow A' \subset B'$.
(c) $(A \cup B)' = A' \cup B'$.
(d) $\bar{A} = A \cup A'$ is closed.
(e) $(\bar{A})' = A'$.
(f) $A$ is closed $\Leftrightarrow$ $A = \bar{A}$.
3. For $n \in N$, let $J_n = ]1/n, 2/n[$. Show that the collection
$\{J_n: n \in N\}$ is an open covering for $J = ]0, 1[$ but that no finite
subset of these intervals covers $J$.
4. Show that a closed subset of a compact set is compact.
5. Show that the union of any number of open sets is open and the
intersection of a finite number of open sets is open. Give an example
which shows that the intersection of a collection of open sets may not
be open.
6. Using the results of Exercise 5, show that the intersection of any
number of closed sets is closed and the union of any finite number of
closed sets is closed.
7. If $A \subset R$ and $x \in R$, define the distance between $x$ and $A$
as
$$d(x, A) = \inf \{|x - y|: y \in A\}.$$
If $A$ is closed and $x \notin A$, show that $\exists y \in A$ so that
$d(x, A) = |x - y|$. Is this nearest point $y$ unique? (Hint: Properties of
continuous functions may be useful here.)
8. Generalizing the notion in Exercise 7, if $A$ and $B$ are subsets of
$R$, define the distance between $A$ and $B$ as
$$d(A, B) = \inf \{|x - y|: x \in A \;\&\; y \in B\}.$$
If $A$ is compact and $B$ is closed, show that $\exists x \in A$ and
$\exists y \in B$ so that $d(A, B) = |x - y|$. Is this result true if $A$
and $B$ are merely closed?
9. Show that every closed subset of $R$ is the intersection of a
countable number of open sets.
10. If $A \subset R$ and $x \in R$, define
$$x + A = \{x + y: y \in A\}.$$
Show the following:
(a) $A$ is open $\Rightarrow$ $x + A$ is open.
(b) $A$ is closed $\Rightarrow$ $x + A$ is closed.
(c) $A$ is compact $\Rightarrow$ $x + A$ is compact.
11. Show that the complement of any closed set is the union of a
countable number of pairwise disjoint open intervals.
12. Prove the Cantor intersection theorem in the following form:
Let $S \subset R$ and $\forall x \in S$ let $A_x$ be a nonempty compact
subset of $R$. Suppose further that $x, y \in S$ and
$x < y \Rightarrow A_y \subset A_x$. Show that
$$\bigcap \{A_x: x \in S\}$$
is nonvoid.
13. Assuming the Cantor intersection theorem, 2.2.7, and the axiom
of induction, prove the Bolzano-Weierstrass theorem. [Hint: If $A$ is an
infinite bounded set, use the axiom of induction to establish the
existence of a sequence $(I_n)$ of compact intervals, each of which
contains a point of $A$ so that $\forall k \in N_0$,
$I_{k+1} \subset I_k$, and the length of $I_n$ goes to zero as
$n \to \infty$.]
14. Show that the Heine-Borel theorem implies the Bolzano-Weierstrass
theorem.
15. If f and g are uniformly continuous functions, show thatf+ g,
g,
J and f g are also uniformly continuous. Under what condition(s)
is l/g uniformly continuous?
0
16. If $f$ is a continuous function with a closed domain and $m$ is a fixed number, show that $\{x : x \in \mathcal{D}(f) \ \&\ |f(x)| \le m\}$ is a closed subset of $\mathcal{D}(f)$.
17. Let A be a set with the property that every continuous function
with domain A is uniformly continuous. Is A necessarily compact?
18. Let $f$ be a continuous function with domain $[a, b]$ and $g$ be that function with the same domain defined by
$$g(x) = \max (f \,|\, [a, x]);$$
that is, $g(x)$ is the maximum of $f$ restricted to $[a, x]$. Show that $g$ is continuous.
2.3
MONOTONE FUNCTIONS
In this section we shall treat a special class of functions that is of importance in the theory of Riemann-Stieltjes integration. Although the functions in this class are not necessarily continuous, they are continuous at "almost all" the points in their domains.
2.3.1 Definition. A function $f$ is called monotone increasing [decreasing] $\Leftrightarrow$ $\forall x, y \in \mathcal{D}(f)$ with $x < y$ we have $f(x) < f(y)$ [$f(y) < f(x)$].
A function $f$ is called monotone nondecreasing [nonincreasing] $\Leftrightarrow$ $\forall x, y \in \mathcal{D}(f)$ with $x < y$ we have $f(x) \le f(y)$ [$f(y) \le f(x)$].
For monotone functions it is convenient to talk about right and left

limits, and for this purpose we need the concept of right and left
accumulation points.
2.3.2 Definition. We say that $a$ is a right [left] accumulation point of a set $A$ $\Leftrightarrow$ $\forall \varepsilon > 0$,
$$\{x : 0 < x - a < \varepsilon\} \cap A \ne \emptyset \qquad [\{x : 0 < a - x < \varepsilon\} \cap A \ne \emptyset].$$
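2.3.3 Definition. We say that $l$ is a right [left] limit of $f$ at $a$ $\Leftrightarrow$ $a$ is a right [left] accumulation point of $\mathcal{D}(f)$ and $\forall \varepsilon > 0$, $\exists \delta > 0$ so that
$$x \in \{x : 0 < x - a < \delta\} \cap \mathcal{D}(f) \quad [\{x : 0 < a - x < \delta\} \cap \mathcal{D}(f)] \ \Rightarrow\ |f(x) - l| < \varepsilon.$$
A right [left] limit at $a$ will be designated by $f(a+)$ [$f(a-)$], and we shall write
$$f(a+) = \lim_{x \searrow a} f(x), \qquad f(a-) = \lim_{x \nearrow a} f(x).$$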
We leave as an exercise the simple fact that if the right and left limits of $f$ exist at $a$, then $f$ has a limit at $a$ if and only if the right and left limits are equal.
2.3.4 Definition. If $f$ is a function, $a \in \mathcal{D}(f)$, and $f(a+)$ and $f(a-)$ exist, but $f(a+) \ne f(a)$ or $f(a-) \ne f(a)$, then we say that $f$ has a jump discontinuity at $a$.
If $f$ is a monotone nondecreasing function and $a$ is a left and right accumulation point of $\mathcal{D}(f)$, then it is clear that
$$f(a+) = \text{g.l.b.}\ \{f(x) : x \in \mathcal{D}(f) \ \&\ x > a\},$$
$$f(a-) = \text{l.u.b.}\ \{f(x) : x \in \mathcal{D}(f) \ \&\ x < a\}.$$
If $a \in \mathcal{D}(f)$ and is a right accumulation point but not a left accumulation point, set $f(a-) = f(a)$, and if $a$ is a left accumulation point but not a right accumulation point, set $f(a+) = f(a)$. With this convention we see that the only type of discontinuity that a monotone nondecreasing function may have is a jump discontinuity. If $f$ is monotone nonincreasing, then $-f$ is monotone nondecreasing and hence the result holds as well in this case. Indeed, we can say more.
2.3.5 Proposition. The discontinuities of a monotone function are jump

discontinuities and there are a countable number (possibly zero) of them.
Proof.
We have already noted above that a monotone function can
have only jump discontinuities. Hence it remains to show that there

are a countable number of them.
For every $n \in \mathbf{N}$, let us set
$$J_n = \{x : x \in \mathcal{D}(f) \ \&\ |f(x)| \le n\},$$
and let us set
$$D_n = \{x : x \in J_n \ \&\ |f(x+) - f(x-)| \ge 1/n\}.$$
We claim that $D_n$ has at most $2n^2$ points. Indeed, let us first note that $\forall x, y \in J_n$ we have
$$|f(x) - f(y)| \le 2n.$$
Next, suppose that $\{a_j : j \in (1, k)\}$ is any finite set of points in $D_n$. Without loss of generality we may suppose that these points are labeled so that $a_1 < a_2 < \cdots < a_k$. Let us adopt the convention that if $a \in \mathcal{D}(f)$ is not a right accumulation point of $\mathcal{D}(f)$, we set $f(a+) = f(a)$, and if $a$ is not a left accumulation point of $\mathcal{D}(f)$, we set $f(a-) = f(a)$. Then, clearly, we may write $f(a_k+) - f(a_1-)$ as a "telescoping sum" in the following way:
$$f(a_k+) - f(a_1-) = \sum_{j=1}^{k} [f(a_j+) - f(a_j-)] + \sum_{j=2}^{k} [f(a_j-) - f(a_{j-1}+)].$$
Since $f$ is monotone, the terms that appear under the summation signs are always nonnegative or always nonpositive. Thus
$$2n \ge |f(a_k+) - f(a_1-)| \ge \sum_{j=1}^{k} |f(a_j+) - f(a_j-)| \ge k/n.$$
This shows that $k \le 2n^2$.
The set of discontinuities of $f$ is exactly the set
$$D = \bigcup \{D_n : n \in \mathbf{N}\}.$$
It follows from Exercise 15 of Section 1.7 that $D$ is countable.
The next theorem tells us that if the domain of a strictly monotone
function satisfies certain conditions, in particular if it is an interval,
then the inverse of the function is always continuous. This is true
regardless of whether the given function is continuous or not.
2.3.6 Theorem. If $f$ is a monotone increasing or monotone decreasing function, then $f^{-1}$ is a monotone increasing or decreasing function, respectively. If $\mathcal{D}(f)$ has the property that $\forall a, b \in \mathcal{D}(f)$ with $a \le b$, $[a, b] \cap \mathcal{D}(f)$ is closed (hence compact), then $f^{-1}$ is continuous.
Proof. If we suppose that $f$ is monotone increasing, it is clearly a one-to-one function and hence $f^{-1}$ is a function. For every $y_1$ and $y_2$ in $\mathcal{D}(f^{-1})$ let $x_1$ and $x_2$ in $\mathcal{D}(f)$ be those elements for which $y_1 = f(x_1)$ and $y_2 = f(x_2)$. If $y_1 < y_2$, then we must have $f^{-1}(y_2) - f^{-1}(y_1) = x_2 - x_1 > 0$. For otherwise, if $x_2 - x_1 \le 0$, we would get $y_2 = f(x_2) \le f(x_1) = y_1$, which is a contradiction. Thus $f^{-1}$ is increasing. If $f$ is monotone decreasing, a similar proof shows that $f^{-1}$ is monotone decreasing.
Suppose now that $b \in \mathcal{D}(f^{-1})$ and $b = f(a)$. If $a$ is a right accumulation point of $\mathcal{D}(f)$, then $\forall \varepsilon > 0$, $\exists x_\varepsilon \in \mathcal{D}(f)$ so that $0 < x_\varepsilon - a < \varepsilon$. Suppose, for the sake of being definite, that $f$ is monotone increasing and set $y_\varepsilon = f(x_\varepsilon)$. If $y \in \mathcal{D}(f^{-1})$ and $b \le y \le y_\varepsilon$, then since $f^{-1}$ is monotone increasing we get
$$0 \le f^{-1}(y) - f^{-1}(b) \le f^{-1}(y_\varepsilon) - f^{-1}(b) = x_\varepsilon - a < \varepsilon.$$
If $\exists \varepsilon > 0$ so that $\mathcal{D}(f) \cap\, ]a, a + \varepsilon[\, = \emptyset$, then we claim $\exists \delta > 0$ so that $\mathcal{D}(f^{-1}) \cap\, ]b, b + \delta[\, = \emptyset$. Indeed, if this is not true, $b$ is a right accumulation point of $\mathcal{D}(f^{-1})$. Let $y_1 \in \mathcal{D}(f^{-1})$ and $y_1 > b$, and set $x_1 = f^{-1}(y_1)$. Then $x_1 > a$, and indeed $x_1 \ge a + \varepsilon$ since $\mathcal{D}(f) \cap\, ]a, a + \varepsilon[\, = \emptyset$. Now, by hypothesis, $[a + \varepsilon, x_1] \cap \mathcal{D}(f)$ is closed. Further, $y \in\, ]b, y_1] \cap \mathcal{D}(f^{-1}) \Rightarrow f^{-1}(y) \in [a + \varepsilon, x_1]$. Thus
$$c = f^{-1}(b+) = \inf \{f^{-1}(y) : y \in \mathcal{D}(f^{-1}) \ \&\ y > b\}$$
belongs to $[a + \varepsilon, x_1] \cap \mathcal{D}(f)$. Now, $f(c) > b$ since $c \ge a + \varepsilon$. On the other hand, since $b$ is a right accumulation point, $\exists y \in \mathcal{D}(f^{-1})$ so that $b < y < f(c)$, which means that $f^{-1}(y) < f^{-1}(f(c)) = c$. The latter inequality contradicts the definition of $c$.
We have established in the last two paragraphs that $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $0 \le y - b < \delta$ and $y \in \mathcal{D}(f^{-1}) \Rightarrow 0 \le f^{-1}(y) - f^{-1}(b) < \varepsilon$. Arguing in an exactly analogous fashion we can prove that $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $0 \le b - y < \delta$ and $y \in \mathcal{D}(f^{-1}) \Rightarrow 0 \le f^{-1}(b) - f^{-1}(y) < \varepsilon$. This means, of course, that $f^{-1}$ is continuous at $b$.
If $f$ is monotone decreasing, then $-f$ is monotone increasing and hence the continuity of $f^{-1}$ follows from the continuity of $(-f)^{-1}$.
EXPONENTIAL FUNCTIONS
As an example of a monotone function we shall construct a generalized exponential function. Let $a$ be a fixed number. If $n \in \mathbf{N}$, then $a^n$ can be defined inductively by the equation
$$a^{n+1} = a^n a.$$
The meaning of this is that there exists a unique function $f_a$ with domain $\mathbf{N}$ such that $f_a(1) = a$ and $\forall n \in \mathbf{N}$,
$$f_a(n + 1) = f_a(n) f_a(1).$$
This statement can be proved by induction in a manner very similar to the proof of Theorem 1.6.4. We also define $a^0 = 1$, and if $n \in \mathbf{Z}$, $n < 0$, and $a \ne 0$, we define $a^n = 1/a^{-n}$.

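From the results of Exercise 9 of Section 1.9 we know that for every $a > 0$ and $n \in \mathbf{N}$ there exists a unique solution in $\mathbf{R}^+$ to the equation
$$x^n = a,$$
which is labeled $a^{1/n}$. If $a > 0$, $m \in \mathbf{N}_0$, and $n \in \mathbf{N}$, define (Exercise 9 of Section 2.3)
$$a^{m/n} = (a^{1/n})^m.$$
If $m/n < 0$, we define
$$a^{m/n} = 1/a^{-m/n}.$$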
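For every $m, n \in \mathbf{Z}$ and $\forall b, c \in \mathbf{R} \setminus \{0\}$, it is easily established by induction that
$$b^m b^n = b^{m+n}, \qquad (b^m)^n = b^{mn}, \qquad (bc)^n = b^n c^n.$$
Using these facts, and the uniqueness of the solutions of any equation $x^n = y$, $y > 0$, if $b, c \in \mathbf{R}^+$ we can establish that for every $r, s \in \mathbf{Q}$,
$$b^r b^s = b^{r+s}, \qquad (b^r)^s = b^{rs}, \qquad (bc)^r = b^r c^r.$$
We shall leave the verification of these simple facts as an exercise.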

The function defined by
$$e_a(r) = a^r, \qquad a > 0,$$
is a function with domain $\mathbf{Q}$ and range in $\mathbf{R}^+$. If $0 < a < 1$, $e_a$ is monotone decreasing; if $a = 1$, then $\forall r \in \mathbf{Q}$, $e_a(r) = 1$; if $a > 1$, then $e_a$ is monotone increasing. Let us prove the last statement. We must show that if $r, s \in \mathbf{Q}$ and $s < r$, then $a^r - a^s > 0$. Now
$$a^r - a^s = a^s (a^{r-s} - 1).$$
Since $a^s > 0$, if we can show $a^{r-s} - 1 > 0$ we are done. Suppose $r - s = p/q$; then we get
$$(a^{r-s})^q = a^p.$$
Since $a > 1$, a simple inductive proof shows that $a^p > 1$. Now, if $a^{r-s} \le 1$, then $(a^{r-s})^q \le 1$, which would contradict the above equality.
Since $e_a$ is monotone on $\mathbf{Q}$ it is bounded on every bounded set in $\mathbf{Q}$. Further, it is continuous at $r = 0$. To prove this let us first suppose that $a \ge 1$. For any $a \ge 1$, it is always true that $\forall n \in \mathbf{N}$,
$$1 \le a < (1 + a/n)^n.$$
For we can either expand the right side by the binomial theorem (which we suppose known) or use the result of Exercise 1 of Section 1.7 to get
$$(1 + a/n)^n \ge 1 + a > a.$$
Therefore, we have (why?)
$$1 \le a^{1/n} < 1 + a/n.$$
Consequently, since $e_a$ is monotone, if $0 \le r \le 1/n$ we have
$$1 \le a^r \le a^{1/n} < 1 + a/n.$$
If $\varepsilon > 0$, $\exists n \in \mathbf{N}$ so that $a/n < \varepsilon$. Therefore, $\forall \varepsilon > 0$, $\exists \delta > 0$ so that
$$0 \le r < \delta \ \Rightarrow\ 0 \le a^r - 1 < \varepsilon. \qquad (2.3.1)$$
Since $\forall r \ge 0$, $a^r \ge 1$, and $\forall r$, $a^r a^{-r} = 1$, we get $\forall r \le 0$, $a^r \le 1$. Hence, if $-\delta < r \le 0$, we get from (2.3.1)
$$0 \le 1 - a^r = a^r (a^{-r} - 1) \le a^{-r} - 1 < \varepsilon. \qquad (2.3.2)$$
The inequalities (2.3.1) and (2.3.2) show the continuity at zero. If $0 < a < 1$, we work with $b = 1/a$. We have already proved the continuity for $e_b$. Since $e_a = 1/e_b$, and $e_b(r) \ne 0$, we have continuity for $e_a$.
Now, for every $x \in \mathbf{R}$, if $a > 1$, then since $e_a$ is monotone on $\mathbf{Q}$, the left and right limits exist:
$$e_a(x+) = \text{g.l.b.}\ \{e_a(r) : r > x\},$$
$$e_a(x-) = \text{l.u.b.}\ \{e_a(r) : r < x\}.$$
If $r < x < s$, $r, s \in \mathbf{Q}$, then
$$e_a(r) < e_a(x-) \le e_a(x+) < e_a(s),$$
from which it follows that
$$0 \le e_a(x+) - e_a(x-) < e_a(s) - e_a(r) = a^r (a^{s-r} - 1).$$
If now we use the fact that $e_a$, defined on $\mathbf{Q}$, is continuous at zero, and is bounded on any bounded set, we see that we must have
$$e_a(x-) = e_a(x+).$$
We denote this common value by $e_a(x)$ and this extends $e_a$ to a continuous monotone increasing function defined on all of $\mathbf{R}$, which we denote by the same symbol $e_a$. We usually write $e_a(x) = a^x$.
If $a \ne 1$, $e_a$ is monotone increasing or decreasing and hence its inverse $e_a^{-1}$ exists and is continuous. As is well known, we usually write
$$e_a^{-1}(x) = \log_a x, \qquad x > 0,$$
and this is a monotone increasing function if $a > 1$, monotone decreasing if $a < 1$.
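The construction above can be imitated numerically. The following is only a sketch under stated assumptions: it takes $a \ge 1$, finds $a^{1/n}$ by bisection on the interval $[1, 1 + a/n]$ suggested by the inequality $1 \le a^{1/n} < 1 + a/n$, and then brackets $a^x$ between $e_a(r)$ and $e_a(s)$ for rationals $r < x < s$. The function names `nth_root`, `e_a_rational`, and `e_a_bracket` are hypothetical and do not come from the text.

```python
from fractions import Fraction


def nth_root(a: float, n: int, tol: float = 1e-13) -> float:
    """Approximate the positive solution of x**n = a for a >= 1 by bisection.

    The starting interval [1, 1 + a/n] comes from 1 <= a**(1/n) < 1 + a/n.
    """
    lo, hi = 1.0, 1.0 + a / n
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid ** n < a:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def e_a_rational(a: float, r: Fraction) -> float:
    """Evaluate a**r for rational r = m/n as (a**(1/n))**m, with a >= 1."""
    m, n = r.numerator, r.denominator  # Fraction keeps n > 0 and lowest terms
    root = nth_root(a, n)
    return root ** m if m >= 0 else 1.0 / root ** (-m)


def e_a_bracket(a: float, x: float, digits: int):
    """Return (e_a(r), e_a(s)) for decimal rationals r < x < s, with a >= 1."""
    q = 10 ** digits
    r = Fraction(int(x * q) - 1, q)  # a rational strictly below x
    s = Fraction(int(x * q) + 1, q)  # a rational strictly above x
    return e_a_rational(a, r), e_a_rational(a, s)


lower, upper = e_a_bracket(2.0, 2 ** 0.5, 3)
print(lower, upper)  # two values that enclose 2**sqrt(2)
```

Increasing `digits` narrows the bracket, which mirrors the argument that $e_a(x-) = e_a(x+)$; the sketch relies on floating-point arithmetic and is an illustration, not a proof.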
D Exercises
1. Let $e_a$ be the function with domain $\mathbf{R}$ which was constructed above. Prove the following:
(a) If $a > 1$, $e_a$ is a monotone increasing continuous function.
(b) If $a \ne 1$, $\mathcal{R}(e_a) = \,]0, \infty[$.
(c) $e_a(x + y) = e_a(x) e_a(y)$.
(d) $(a^x)^y = a^{xy}$.
2. Show that the logarithm function $e_a^{-1}$ satisfies the following:
(a) $\mathcal{R}(e_a^{-1}) = \,]-\infty, \infty[$.
(b) $\log_a x^y = y \log_a x$.
(c) $\log_a xy = \log_a x + \log_a y$.
3. Suppose $f$ is a continuous function on $\mathbf{R}$ such that
$$f(x + y) = f(x) f(y).$$
Show that there is a unique $a \in \mathbf{R}$ so that $f(x) = e_a(x)$.
4. If $a > 1$, show that $x^{\alpha} a^{-x} \to 0$ as $x \to \infty$ for any real $\alpha$. Also, if $a > 1$ and $\alpha > 0$, show that $(\log_a x)/x^{\alpha} \to 0$ as $x \to \infty$.
5. Show that if f is a continuous function with domain [a, b] and

does not have a local maximum or minimum in ] a, b [, then f must be
monotone decreasing or monotone increasing.
6. If f is a continuous one-to-one function with an interval domain,
show that f must be monotone increasing or monotone decreasing.
7. Give an example which shows that Theorem 2.3.6 may not be true if no restriction is put on $\mathcal{D}(f)$.
8. For every $n \in \mathbf{N}$ show that the function $f_n$ with domain $[0, \infty[$ and defined by
$$f_n(x) = x^n$$
is a monotone continuous function with range $[0, \infty[$. Hence it has a continuous inverse and consequently deduce the result of Exercise 9 of Section 1.9: $\forall a \in [0, \infty[$ there exists a unique $b \in [0, \infty[$ so that $b^n = a$.
9. If $m \in \mathbf{N}_0$ and $n \in \mathbf{N}$, we defined
$$a^{m/n} = (a^{1/n})^m.$$
Show that this is independent of the representation of $m/n$; that is, if $p \in \mathbf{N}_0$ and $q \in \mathbf{N}$ so that $p/q = m/n$ show that
$$(a^{1/q})^p = (a^{1/n})^m.$$
10. If $b, c \in \mathbf{R}^+$ and $r, s \in \mathbf{Q}$, show that
$$b^r b^s = b^{r+s}, \qquad (b^r)^s = b^{rs}, \qquad (bc)^r = b^r c^r.$$
2.4
LIMIT SUPERIOR AND LIMIT INFERIOR
There is a concept which is more general than the concept of the limit
of a function at a point and is very useful in many instances. In this sec
tion we shall introduce this concept and obtain several facts about it.
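Let us suppose that $f$ is a bounded function with $\mathcal{D}(f) \ne \emptyset$ and $a$ is an accumulation point of $\mathcal{D}(f)$. For every $r \in \mathbf{R}^+$ let us define
$$\overline{\varphi}_f(r) = \text{l.u.b.}\ \{f(x) : x \in \mathcal{D}(f) \ \&\ 0 < |x - a| < r\},$$
$$\underline{\varphi}_f(r) = \text{g.l.b.}\ \{f(x) : x \in \mathcal{D}(f) \ \&\ 0 < |x - a| < r\}.$$
It is almost immediate that $\overline{\varphi}_f$ is a nondecreasing function and $\underline{\varphi}_f$ is a nonincreasing function.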
2.4.1 Definition. If $f$ is a bounded function and $a$ is an accumulation point of $\mathcal{D}(f)$, we shall set
$$\overline{\lim}_{x \to a} f(x) = \lim_{r \to 0} \overline{\varphi}_f(r) = \overline{\varphi}_f(0+),$$
$$\underline{\lim}_{x \to a} f(x) = \lim_{r \to 0} \underline{\varphi}_f(r) = \underline{\varphi}_f(0+).$$
These numbers are called the limit superior and the limit inferior of $f$ at $a$, respectively.
The limit superior and the limit inferior of $f$ at $a$ are sometimes designated by
$$\limsup_{x \to a} f(x) \quad \text{and} \quad \liminf_{x \to a} f(x),$$
respectively. Clearly it is not necessary that $f$ be bounded, but only bounded in a neighborhood of $a$ to define the concepts of limit superior and limit inferior.
In case $\mathcal{D}(f)$ is not bounded above, we can set
$$\overline{\varphi}_f(r) = \sup \{f(x) : x \in \mathcal{D}(f) \ \&\ x > r\},$$
$$\underline{\varphi}_f(r) = \inf \{f(x) : x \in \mathcal{D}(f) \ \&\ x > r\}.$$
These are monotone nonincreasing and nondecreasing functions, respectively, and we define
$$\overline{\lim}_{x \to \infty} f(x) = \lim_{r \to \infty} \overline{\varphi}_f(r),$$
$$\underline{\lim}_{x \to \infty} f(x) = \lim_{r \to \infty} \underline{\varphi}_f(r).$$
We can consider $\infty$ as an accumulation point of $\mathcal{D}(f)$ since every open interval $I(\infty)$ contains a point of $\mathcal{D}(f)$. If $f$ is a bounded sequence, the latter quantities are called the limit superior and the limit inferior of the sequence, since the domain of a sequence has no finite accumulation point. We shall leave to the reader the easy task of formulating these notions at $-\infty$.
It is possible to get a geometric meaning of the previous definition which may make it more understandable. The number $\overline{\varphi}_f(r)$ may be thought of as measuring the size of the largest peak of $f$ as $x$ varies over the deleted interval $\{x : 0 < |x - a| < r\}$, and $\underline{\varphi}_f(r)$ measures the depth of the deepest valley. As $r$ decreases to zero, the size of the largest peak decreases to $\overline{\varphi}_f(0+)$ and the size of the deepest valley shortens to $\underline{\varphi}_f(0+)$.
As an example, let us consider the function given by $f(x) = \sin(1/x)$ for $x \ne 0$. A sketch of the graph of this function is shown in Fig. 2.4.1.
Figure 2.4.1
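As $x \to 0$, we get an infinite number of peaks and valleys of this function. It is clear that $\forall r > 0$, $\overline{\varphi}_f(r) = 1$, $\underline{\varphi}_f(r) = -1$. Hence
$$\overline{\lim}_{x \to 0} \sin(1/x) = 1, \qquad \underline{\lim}_{x \to 0} \sin(1/x) = -1.$$
Note that this function does not have a limit at $x = 0$.
2.4.2 Proposition. The function $f$ has the limit $l$ at $a$ $\Leftrightarrow$ $a$ is an accumulation point of $\mathcal{D}(f)$ and
$$\underline{\lim}_{x \to a} f(x) = l = \overline{\lim}_{x \to a} f(x).$$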
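Proof. Suppose $f(x) \to l$ as $x \to a$. This means $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $x \in \mathcal{D}(f)$ and $0 < |x - a| < \delta \Rightarrow |f(x) - l| < \varepsilon$. It follows that if $0 < r \le \delta$, then $|\overline{\varphi}_f(r) - l| \le \varepsilon$ and $|\underline{\varphi}_f(r) - l| \le \varepsilon$. This means, by definition, that
$$\underline{\varphi}_f(0+) = l = \overline{\varphi}_f(0+).$$
Conversely, suppose this last equation is satisfied. Then $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $|\overline{\varphi}_f(\delta) - l| < \varepsilon$ and $|\underline{\varphi}_f(\delta) - l| < \varepsilon$. However, $\forall x \in \mathcal{D}(f)$ for which $0 < |x - a| < \delta$ we have $\underline{\varphi}_f(\delta) \le f(x) \le \overline{\varphi}_f(\delta)$. Hence
$$-\varepsilon < \underline{\varphi}_f(\delta) - l \le f(x) - l \le \overline{\varphi}_f(\delta) - l < \varepsilon.$$
This shows that $f(x) \to l$ as $x \to a$.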
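The criterion of Proposition 2.4.2 can be explored numerically. The sketch below estimates the least upper bound and greatest lower bound of $f$ over the deleted interval $0 < |x - a| < r$ by sampling; the helper name `phi_bounds` and the sample count are assumptions made for illustration, and sampling only approximates the true bounds. It compares $\sin(1/x)$, whose bounds stay near $1$ and $-1$, with $x \sin(1/x)$, whose bounds both shrink toward $0$.

```python
import math


def phi_bounds(f, a: float, r: float, samples: int = 50_000):
    """Sampled estimate of (g.l.b., l.u.b.) of f over 0 < |x - a| < r."""
    lo, hi = math.inf, -math.inf
    for k in range(1, samples + 1):
        h = r * k / (samples + 1)  # 0 < h < r, so x = a +/- h is never a
        for x in (a - h, a + h):
            y = f(x)
            lo, hi = min(lo, y), max(hi, y)
    return lo, hi


def oscillating(x: float) -> float:
    return math.sin(1.0 / x)


def damped(x: float) -> float:
    return x * math.sin(1.0 / x)


for r in (0.1, 0.01, 0.001):
    print(r, phi_bounds(oscillating, 0.0, r), phi_bounds(damped, 0.0, r))
```

For the first function the two estimates remain far apart for every $r$, consistent with the absence of a limit at $0$; for the second they are squeezed between $-r$ and $r$, consistent with the limit $0$.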

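2.4.3 Theorem. If $f$ and $g$ are bounded functions and $a$ is an accumulation point of $\mathcal{D}(f) \cap \mathcal{D}(g)$, then
(a)
$$\overline{\lim}_{x \to a} (f + g)(x) \le \overline{\lim}_{x \to a} f(x) + \overline{\lim}_{x \to a} g(x),$$
$$\underline{\lim}_{x \to a} f(x) + \underline{\lim}_{x \to a} g(x) \le \underline{\lim}_{x \to a} (f + g)(x);$$
(b) if $f$ and $g$ are nonnegative, then
$$\overline{\lim}_{x \to a} (fg)(x) \le \overline{\lim}_{x \to a} f(x)\ \overline{\lim}_{x \to a} g(x),$$
$$\underline{\lim}_{x \to a} f(x)\ \underline{\lim}_{x \to a} g(x) \le \underline{\lim}_{x \to a} (fg)(x);$$
(c) if $\forall x \in \mathcal{D}(g)$, $g(x) \ne 0$ and does not change its sign, and $1/g$ is bounded, then
$$\overline{\lim}_{x \to a} (1/g)(x) = 1 / \underline{\lim}_{x \to a} g(x), \qquad \underline{\lim}_{x \to a} (1/g)(x) = 1 / \overline{\lim}_{x \to a} g(x).$$
Proof. (a) For every $\varepsilon > 0$ and $\forall r > 0$, $\exists b \in \mathcal{D}(f + g)$ so that $0 < |b - a| < r$ and
$$\overline{\varphi}_{f+g}(r) < (f + g)(b) + \varepsilon \le \overline{\varphi}_f(r) + \overline{\varphi}_g(r) + \varepsilon.$$
Letting $r \to 0$ and taking account of the fact that $\varepsilon$ is arbitrary, we get the first inequality in (a). The second one follows by similar reasoning.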
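(b) For every $\varepsilon > 0$ and $\forall r > 0$, $\exists b \in \mathcal{D}(fg)$ so that $0 < |b - a| < r$ and
$$\overline{\varphi}_{fg}(r) < (fg)(b) + \varepsilon \le \overline{\varphi}_f(r)\, \overline{\varphi}_g(r) + \varepsilon.$$
The last inequality makes use of the facts that $0 \le f(b) \le \overline{\varphi}_f(r)$ and $0 \le g(b) \le \overline{\varphi}_g(r)$. Letting $r \to 0$ we get the first inequality in (b). The second one follows by similar reasoning.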
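(c) For every $\varepsilon > 0$ and $\forall r > 0$, $\exists b \in \mathcal{D}(g)$ so that $0 < |b - a| < r$ and
$$\overline{\varphi}_{1/g}(r) < (1/g)(b) + \varepsilon.$$
Since $g$ does not change its sign and $1/g$ is bounded, $\exists m > 0$ so that $\forall x$, $g(x) \ge m$ or $g(x) \le -m$. Consequently, $\underline{\varphi}_g(r) \ge m$ or $\overline{\varphi}_g(r) \le -m$, respectively. Since $\forall x \in \mathcal{D}(g)$ for which $0 < |x - a| < r$ we have $g(x) \ge \underline{\varphi}_g(r)$, we have in either case,
$$\overline{\varphi}_{1/g}(r) < (1/g)(b) + \varepsilon \le 1/\underline{\varphi}_g(r) + \varepsilon.$$
Letting $r \to 0$ we get the inequality
$$\overline{\lim}_{x \to a} (1/g)(x) \le 1 / \underline{\lim}_{x \to a} g(x).$$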
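On the other hand, $\forall \varepsilon > 0$ and $\forall r > 0$, $\exists b \in \mathcal{D}(g)$ so that $0 < |b - a| < r$ and
$$\underline{\varphi}_g(r) \le g(b) < \underline{\varphi}_g(r) + \varepsilon.$$
Since $1/g$ is bounded, $\underline{\varphi}_g(r)$ is never zero. Suppose $\varepsilon$ is so small that $\underline{\varphi}_g(r)$ and $\underline{\varphi}_g(r) + \varepsilon$ have the same sign. Then $g(b)$ also has the same sign and
$$\frac{1}{\underline{\varphi}_g(r) + \varepsilon} < \frac{1}{g(b)} \le \overline{\varphi}_{1/g}(r).$$
Letting $r \to 0$ we see that we have the inequality
$$1 / \underline{\lim}_{x \to a} g(x) \le \overline{\lim}_{x \to a} (1/g)(x).$$
This inequality when taken together with the previous one gives equality. The second equality follows by similar reasoning.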
NOTE: The results of this theorem are completed in Exercises 4 and 5 at the end of the chapter. Also, it is clear that the above theorem will hold if the various hypotheses put on the functions hold in a neighborhood of $a$. Indeed, if $I(a)$ is an open neighborhood of $a$ and $f_1$ is the restriction of $f$ to $I(a) \cap \mathcal{D}(f)$, then clearly
$$\overline{\lim}_{x \to a} f_1(x) = \overline{\lim}_{x \to a} f(x), \qquad \underline{\lim}_{x \to a} f_1(x) = \underline{\lim}_{x \to a} f(x).$$
We cannot expect equality in (c) if $g$ changes sign in every neighborhood of $a$. For then the limit inferior of $g$ or $1/g$ at $a$ will be negative while the corresponding limit superior will be positive.
We shall now give several examples which show that the hypotheses of Theorem 2.4.3(b) cannot be weakened in any essential way. That is to say, if we remove the conditions of nonnegativity, the conclusions may not follow. For the first example take
$$f(x) = \sin(1/x), \qquad g(x) = \cos(1/x), \qquad x \ne 0.$$
These functions change their signs infinitely many times in any neighborhood of the origin. Now, by a well-known trigonometric identity we have
$$f(x) g(x) = \sin(1/x) \cos(1/x) = \tfrac{1}{2} \sin(2/x).$$
Therefore,
$$\underline{\lim}_{x \to 0} f(x) = \underline{\lim}_{x \to 0} g(x) = -1, \qquad \underline{\lim}_{x \to 0} (fg)(x) = -1/2,$$
and hence
$$\underline{\lim}_{x \to 0} f(x)\ \underline{\lim}_{x \to 0} g(x) > \underline{\lim}_{x \to 0} (fg)(x).$$
For the second example take
$$f(x) = 1 + \sin(1/x), \qquad g(x) = \cos(1/x), \qquad x \ne 0.$$
Clearly $f(x) \ge 0$ for all $x$ in its domain, and $g$ changes sign infinitely often in any neighborhood of zero. If we take $x_k$ so that $1/x_k = (2k + 1)\pi$, $k = 0, 1, \ldots$, we get
$$f(x_k) g(x_k) = -1.$$
Hence
$$\underline{\lim}_{x \to 0} (fg)(x) \le -1.$$
On the other hand,
$$\underline{\lim}_{x \to 0} f(x) = 0, \qquad \underline{\lim}_{x \to 0} g(x) = -1,$$
which leads to the inequality
$$\underline{\lim}_{x \to 0} f(x)\ \underline{\lim}_{x \to 0} g(x) > \underline{\lim}_{x \to 0} (fg)(x).$$
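Finally, as a third example we consider
$$f(x) = -1 + \sin(1/x), \qquad g(x) = -1 + \cos(1/x).$$
If we take $x_k$ so that $1/x_k = (2k + 1)\pi$, $k = 0, 1, 2, \ldots$, we get
$$f(x_k) g(x_k) = 2$$
and therefore
$$\overline{\lim}_{x \to 0} (fg)(x) \ge 2.$$
But $\overline{\lim}_{x \to 0} f(x) = \overline{\lim}_{x \to 0} g(x) = 0$ and consequently we get
$$\overline{\lim}_{x \to 0} f(x)\ \overline{\lim}_{x \to 0} g(x) < \overline{\lim}_{x \to 0} (fg)(x).$$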
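The values used in the three examples can be checked numerically. The following sketch evaluates the products along the points $x_k$ with $1/x_k = (2k + 1)\pi$ and spot-checks the identity $\sin(1/x)\cos(1/x) = \tfrac{1}{2}\sin(2/x)$; the names `xs`, `second`, and `third` are illustrative only, and floating-point results agree with the exact values only up to rounding.

```python
import math

# Points x_k with 1/x_k = (2k + 1) * pi, as in the second and third examples.
xs = [1.0 / ((2 * k + 1) * math.pi) for k in range(50)]

# Second example: f = 1 + sin(1/x), g = cos(1/x); the product is -1 at each x_k.
second = [(1 + math.sin(1 / x)) * math.cos(1 / x) for x in xs]

# Third example: f = -1 + sin(1/x), g = -1 + cos(1/x); the product is 2 at each x_k.
third = [(-1 + math.sin(1 / x)) * (-1 + math.cos(1 / x)) for x in xs]

# First example: largest deviation from the identity at a few sample points.
identity_gap = max(
    abs(math.sin(1 / x) * math.cos(1 / x) - 0.5 * math.sin(2 / x))
    for x in (0.3, 0.07, 0.011, -0.2)
)

print(second[:3], third[:3], identity_gap)
```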
Let us finish this section by giving an application of the use of the concept of limit superior and limit inferior. This involves an extension of the idea of a Cauchy sequence.
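2.4.4 Proposition. Suppose $f$ is a function, $a$ is an accumulation point of $\mathcal{D}(f)$ and $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $0 < |x - a| < \delta$, $0 < |y - a| < \delta$ and $x, y \in \mathcal{D}(f) \Rightarrow |f(x) - f(y)| < \varepsilon$. Then $\lim_{x \to a} f(x)$ exists.
Proof. Clearly $f$ is bounded in a neighborhood of $a$ and hence we may set
$$l = \overline{\lim}_{x \to a} f(x).$$
We claim that $l = \lim_{x \to a} f(x)$. Indeed, $\forall \varepsilon > 0$, $\exists \eta > 0$ so that $\forall \delta$ with $0 < \delta < \eta$, $\exists y \in \mathcal{D}(f)$ so that $0 < |y - a| < \delta$ and
$$|f(y) - l| < \varepsilon/2.$$
Suppose we have taken $\delta$ small enough so that $x, y \in \mathcal{D}(f)$ and $0 < |x - a| < \delta$ and $0 < |y - a| < \delta \Rightarrow |f(x) - f(y)| < \varepsilon/2$. Then we get
$$|f(x) - l| \le |f(x) - f(y)| + |f(y) - l| < \varepsilon.$$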
D Exercises
1. If $f$ is bounded and $a$ is an accumulation point of $\mathcal{D}(f)$, show that
$$\overline{\lim}_{x \to a} (-f(x)) = -\underline{\lim}_{x \to a} f(x).$$
2. Show that the inequalities in Theorem 2.4.3(b) are reversed if

f and g are nonpositive.
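3. If $f$ is a bounded nonnegative function and $a$ is an accumulation point of $\mathcal{D}(f)$, show that $\forall \alpha \ge 0$,
$$\overline{\lim}_{x \to a} f^{\alpha}(x) = \left(\overline{\lim}_{x \to a} f(x)\right)^{\alpha}.$$
4. Under the hypotheses of Theorem 2.4.3(a) prove that
$$\underline{\lim}_{x \to a} f(x) + \overline{\lim}_{x \to a} g(x) \le \overline{\lim}_{x \to a} (f + g)(x),$$
$$\underline{\lim}_{x \to a} (f + g)(x) \le \underline{\lim}_{x \to a} f(x) + \overline{\lim}_{x \to a} g(x).$$
Using this in conjunction with Theorem 2.4.3(a) show that if $\lim_{x \to a} f(x)$ exists, then
$$\overline{\lim}_{x \to a} (f + g)(x) = \lim_{x \to a} f(x) + \overline{\lim}_{x \to a} g(x),$$
$$\underline{\lim}_{x \to a} (f + g)(x) = \lim_{x \to a} f(x) + \underline{\lim}_{x \to a} g(x).$$
5. Under the hypotheses of Theorem 2.4.3(b) prove that
$$\underline{\lim}_{x \to a} f(x)\ \overline{\lim}_{x \to a} g(x) \le \overline{\lim}_{x \to a} (fg)(x),$$
$$\underline{\lim}_{x \to a} (fg)(x) \le \underline{\lim}_{x \to a} f(x)\ \overline{\lim}_{x \to a} g(x).$$
Using this in conjunction with Theorem 2.4.3(b), show that if $\lim_{x \to a} f(x)$ exists, then
$$\overline{\lim}_{x \to a} (fg)(x) = \lim_{x \to a} f(x)\ \overline{\lim}_{x \to a} g(x),$$
$$\underline{\lim}_{x \to a} (fg)(x) = \lim_{x \to a} f(x)\ \underline{\lim}_{x \to a} g(x).$$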
6. If $x \in [0, 1]$ and is rational of the form $p/q$ (in lowest terms), define $f(x) = 1/q$. Also set $f(0) = 1$ and if $x$ is irrational set $f(x) = 0$. Compute $\overline{\lim}_{x \to a} f(x)$ and $\underline{\lim}_{x \to a} f(x)$ for every $a$ in $[0, 1]$.
7. If $(a_n)$ is a bounded sequence and
$$A = \overline{\lim}_{n \to \infty} a_n,$$
show that $\forall \varepsilon > 0$ there are only a finite number of $n \in \mathbf{N}_0$ with $a_n > A + \varepsilon$ and an infinite number of $n$ so that $A - \varepsilon < a_n$. State and prove an analogous statement for $\underline{\lim}_{n \to \infty} a_n$.
8. Compute the limit inferior and the limit superior for the sequences whose terms are the following:
(a) $\sin(n\pi/2)$.
(b) $(1 + (-1)^n) \cos(n\pi)$.
(c) $(1 + 1/n)(1 + \sin(n\pi/8))^{1/n}$.
9. Let $(r_n)$ be a sequence whose range consists of all rationals in $]0, 1[$. If $r_n = p_n/q_n$, $p_n, q_n \in \mathbf{N}$, set $s_n = (p_n/q_n)^{1/q_n}$. Show that
$$\overline{\lim}_{n \to \infty} s_n = \underline{\lim}_{n \to \infty} s_n = 1.$$
10. A function $f$ is said to be upper semicontinuous at $a$ $\Leftrightarrow$ $a \in \mathcal{D}(f)$ and $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $|x - a| < \delta$ and $x \in \mathcal{D}(f) \Rightarrow f(x) < f(a) + \varepsilon$. If $a \in \mathcal{D}(f)$ and is an accumulation point of $\mathcal{D}(f)$, show that $f$ is upper semicontinuous at $a$ if and only if $\overline{\lim}_{x \to a} f(x) \le f(a)$. Give an analogous definition for lower semicontinuity and prove a similar result.
11. Suppose $f$ has domain $[0, 1]$ and is defined by
$$f(x) = \begin{cases} 1 & \Leftrightarrow x \text{ is rational}, \\ 0 & \Leftrightarrow x \text{ is irrational}. \end{cases}$$
Where is $f$ upper semicontinuous and where is $f$ lower semicontinuous?

12. Show that the sum and product of two upper [lower] semicontinuous functions are upper [lower] semicontinuous.
13. If $f$ is upper [lower] semicontinuous at every point of a compact domain, show that $f$ is "uniformly" upper [lower] semicontinuous.
14. If $f$ is upper [lower] semicontinuous with a compact domain, show that it is bounded above [below] and attains its maximum [minimum].
15. If $f$ is a uniformly continuous function, show that it may be extended to a function $\bar{f}$ which is uniformly continuous on $\overline{\mathcal{D}(f)}$, the closure of $\mathcal{D}(f)$. Do this by the theme of Proposition 2.4.4.
16. Use the results of Exercise 15 to extend the function $e_a$ of Section 2.3 from $\mathbf{Q}$ to $\mathbf{R}$.
17. If $f$ is a continuous one-to-one function with a compact domain, show that $f^{-1}$ is continuous.
CHAPTER 3
INFINITE SERIES
We have already discussed the concept of real sequences, real Cauchy

sequences, and real convergent sequences in Chapter 1. In Chapter 2
we discussed corresponding questions for real functions. In this chapter
we shall discuss the concept of infinite series, which actually is based
on the concept of sequence.
Before we launch into the definitions and theorems about infinite
series, we shall digress for a moment and comment about the meaning
of the sum of a finite set of real numbers. Although we have used finite
sums several times in Chapter 2 and have assumed that the meaning
is intuitively clear (which it is!), it is nevertheless of some value to
comment about the formal meaning. It is possible to prove the following
statement by induction.
There exists a unique function $\sigma$ with domain and range the collection of all real sequences so that if $a = (a_n)$ is a sequence, then
$$\sigma(a)_0 = a_0, \qquad \sigma(a)_{n+1} = \sigma(a)_n + a_{n+1}, \quad \forall n \in \mathbf{N}_0.$$
Although we shall not bother to carry out the proof, the general idea of how to use induction in this context is like that used in the proof of Theorem 1.6.4. It is not hard to prove, using the commutative, associative, and distributive laws for $\mathbf{R}$, and the axiom of induction, that $\forall \alpha, \beta \in \mathbf{R}$ and for all sequences $a$ and $b$,
$$\sigma(\alpha a + \beta b) = \alpha \sigma(a) + \beta \sigma(b).$$
Of course, by $\alpha a$ we mean that sequence so that $\forall n \in \mathbf{N}_0$, $(\alpha a)_n = \alpha a_n$. We now set
$$\sum_{k=0}^{n} a_k = \sigma(a)_n = a_0 + a_1 + \cdots + a_n.$$
If we have given a finite set of numbers $\{a_k : k \in (0, n)\}$, it is of course necessary to extend the function with values $a_k$ and domain $(0, n)$ to all of $\mathbf{N}_0$ to be able to use the summation symbol. This of course gives rise to the question whether if we extend in two different ways, say to sequences $a$ and $b$, we get $\sigma(a)_k = \sigma(b)_k$, $\forall k \in (0, n)$. However, it is a very easy matter to show that if $c = (c_n)$ is a sequence and $c_k = 0$, $\forall k \in (0, n)$, then $\sigma(c)_n = 0$. Hence, if $a_k = b_k$, $\forall k \in (0, n)$, and $c = a - b$, we have $\sigma(c)_n = \sigma(a - b)_n = \sigma(a)_n - \sigma(b)_n = 0$. It is also not hard to show that if $a$ and $b$ are sequences, and $\phi$ is a one-to-one function with domain and range $(0, n)$ so that $\forall k \in (0, n)$, $a_k = b_{\phi(k)}$, then $\sigma(a)_n = \sigma(b)_n$.
Very often we shall use the symbol
$$\sum_{k=m}^{n} a_k.$$
If $n < m$, this is to be given the value zero. If $n \ge m$ and $a = (a_n)$ is any sequence that is an extension of the function with domain $(m, n)$ and values $a_k$, we set $b_k = a_{m+k}$ and
$$\sum_{k=m}^{n} a_k = \sigma(b)_{n-m}.$$
More generally, suppose $A$ is any finite set and $a$ is a function with domain $A$ and range in $\mathbf{R}$. If $A$ has $n + 1$ elements, let $\phi$ be a one-to-one function with domain $(0, n)$ and range $A$. Then define
$$\sum_{\alpha \in A} a_{\alpha} = \sum_{k=0}^{n} a_{\phi(k)}.$$
If $A = \emptyset$ we define the symbol on the left to be zero.
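The partial-sum function $\sigma$ and the two facts stated about finite sums (linearity, and invariance under a one-to-one relabeling of the indices) can be illustrated on short lists. This is a minimal sketch on finite lists standing in for the initial segment of a sequence; the name `sigma` mirrors the text, while the particular sequences, the scalars, and the permutation `perm` are arbitrary choices made for the illustration.

```python
from fractions import Fraction
from itertools import accumulate


def sigma(a):
    """Partial sums: sigma(a)[0] = a[0] and sigma(a)[n+1] = sigma(a)[n] + a[n+1]."""
    return list(accumulate(a))


a = [Fraction(1, k + 1) for k in range(6)]
b = [Fraction((-1) ** k, 2 ** k) for k in range(6)]
alpha, beta = Fraction(3), Fraction(-2)

# Linearity: sigma(alpha*a + beta*b) equals alpha*sigma(a) + beta*sigma(b).
combined = sigma([alpha * x + beta * y for x, y in zip(a, b)])
separate = [alpha * s + beta * t for s, t in zip(sigma(a), sigma(b))]

# Relabeling the indices 0..5 by a one-to-one map leaves the full sum unchanged.
perm = [3, 0, 5, 1, 4, 2]
rearranged = [a[perm[k]] for k in range(6)]

print(combined == separate, sigma(rearranged)[-1] == sigma(a)[-1])
```

Exact rational arithmetic is used so that the equalities are checked without rounding error.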
3.1
SERIES OF REAL NUMBERS
In the introduction to this chapter we have discussed the meaning of a finite sum of real numbers by means of the $\sigma$ function. We now wish to discuss the meaning of "infinite sums" of real numbers. We think that it should now be quite clear that the most natural way to extend the meaning of a finite sum is to say that the sum of the sequence $(a_n)$ is the limit of the sequence $\sigma(a)$, provided the limit exists. Let us write down some formal definitions.
3.1.1 Definition. An infinite series is an element of the $\sigma$ function, that is, is an ordered pair of sequences $(a, \sigma(a))$.
An infinite series $(a, \sigma(a))$ is said to be convergent $\Leftrightarrow$ the sequence $\sigma(a)$ is convergent. Otherwise the series is said to be divergent.
An infinite series $(a, \sigma(a))$ will often be designated by
$$\left(\sum_{k=0}^{\infty} a_k\right),$$
and in case it is convergent we shall remove the parentheses and designate the limit by
$$\sum_{k=0}^{\infty} a_k.$$
This is in keeping with standard practice and the notations are very convenient. Very often we shall use a notation such as
$$\left(\sum_{k=n}^{\infty} a_k\right).$$
Let $b_k = a_{k+n}$; then we define the above symbol as another name for
$$\left(\sum_{k=0}^{\infty} b_k\right).$$
The following fact is often very useful in deciding whether a series

diverges, but it tells us nothing about the convergence of a series.
3.1.2 Proposition. If $(a, \sigma(a))$ is a convergent series, then
$$\lim_{n \to \infty} a_n = 0.$$
Proof. Since $\sigma(a)$ is a convergent sequence, it must be Cauchy. Hence $\forall \varepsilon > 0$, $\exists N$ so that $n \ge N \Rightarrow$
$$|a_n| = |\sigma(a)_n - \sigma(a)_{n-1}| < \varepsilon.$$
This is exactly the meaning of $a_n \to 0$ as $n \to \infty$.
As an example of how this may be used consider the series
$$\left(\sum_{k=0}^{\infty} (-1)^k \frac{k}{1 + k}\right).$$
It is not immediately evident whether this series is convergent or divergent unless we note that
$$\lim_{k \to \infty} \frac{k}{1 + k} = 1,$$
so that by the last proposition the series is divergent. However, even if $a_n \to 0$ it does not mean that the series is convergent. The standard example is the harmonic series
$$\left(\sum_{k=1}^{\infty} \frac{1}{k}\right).$$
Clearly, $1/k \to 0$ as $k \to \infty$, but the sequence of partial sums given by
$$s_n = \sum_{k=1}^{n} \frac{1}{k}$$
is not Cauchy. Indeed, we first note that if $k \in \mathbf{N}$ and $k \le n$, then $n + k \le 2n$. Hence $\forall n \in \mathbf{N}$,
$$s_{2n} - s_n = \sum_{k=1}^{n} \frac{1}{n + k} \ge n \cdot \frac{1}{2n} = \frac{1}{2}.$$
This means that the sequence of partial sums is not Cauchy and hence cannot be convergent.
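The estimate $s_{2n} - s_n \ge 1/2$ can be confirmed for particular values of $n$ with exact rational arithmetic. The sketch below is only a finite check for a handful of values, not a substitute for the argument above; the helper name `harmonic_partial_sum` and the chosen values of $n$ are assumptions for illustration.

```python
from fractions import Fraction


def harmonic_partial_sum(n: int) -> Fraction:
    """Exact value of s_n = 1 + 1/2 + ... + 1/n."""
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))


# Gap s_{2n} - s_n for a few n; each gap should be at least 1/2.
gaps = {
    n: harmonic_partial_sum(2 * n) - harmonic_partial_sum(n)
    for n in (1, 10, 100, 500)
}

for n, gap in gaps.items():
    print(n, float(gap))
```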
Let us now record a very simple but very useful fact: The sum of two
convergent series is again convergent.
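3.1.3 Theorem. If $(a, \sigma(a))$ and $(b, \sigma(b))$ are convergent series and $\alpha, \beta \in \mathbf{R}$, then $(\alpha a + \beta b, \sigma(\alpha a + \beta b))$ is convergent and
$$\sum_{k=0}^{\infty} [\alpha a_k + \beta b_k] = \alpha \sum_{k=0}^{\infty} a_k + \beta \sum_{k=0}^{\infty} b_k.$$
Proof. The proof is simply a consequence of the facts that
$$\sigma(\alpha a + \beta b) = \alpha \sigma(a) + \beta \sigma(b),$$
$$\lim_{n \to \infty} [\alpha \sigma(a)_n + \beta \sigma(b)_n] = \alpha \lim_{n \to \infty} \sigma(a)_n + \beta \lim_{n \to \infty} \sigma(b)_n.$$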
As we pointed out in our discussion of finite sums, it is not hard to prove, making use of the commutativity law for $\mathbf{R}$, that if $\forall k \in (0, n)$, $a_k = b_{\phi(k)}$ where $\phi$ is a one-to-one function with domain and range $(0, n)$, then
$$\sum_{k=0}^{n} a_k = \sum_{k=0}^{n} b_k.$$
In other words, rearranging the indices on a finite set of numbers does not change its sum. This may no longer be true for a convergent infinite series, as we shall eventually show. However, we must first know what it means to rearrange an infinite series.
3.1.4 Definition. A rearrangement of the infinite series $(a, \sigma(a))$ is an infinite series $(a \circ \phi, \sigma(a \circ \phi))$, where $\phi$ is a one-to-one function with domain and range $\mathbf{N}_0$.
It turns out that every rearrangement of an infinite series is convergent if and only if the series whose terms are the absolute values of the terms of the original series is convergent. We first give a formal definition.
3.1.5 Definition. An infinite series $(a, \sigma(a))$ is said to be absolutely convergent $\Leftrightarrow$ the sequence $\sigma(|a|)$ is convergent, where $\forall n \in \mathbf{N}_0$, $|a|_n = |a_n|$. If an infinite series is convergent, but not absolutely convergent, it is called conditionally convergent.

exist. The answer is in the affirmative and an example is given .by the
senes
k=O
( (-I)k )
k+I
This series is certainly not absolutely convergent, since the series of

absolute values is the harmonic series. The fact that the series is con
vergent is a consequence of Leibnitz' criteria, which will be established
in Section 3.2.
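3.1.6 Lemma. If $\forall k \in \mathbf{N}_0$, $a_k \ge 0$, $\sigma(a)$ is convergent, and $A$ is any finite subset of $\mathbf{N}_0$, then
$$\sum_{k \in A} a_k \le \sum_{k=0}^{\infty} a_k.$$
Proof. Let $n = \max A$ and $B = (0, n) \setminus A$. Then
$$\sum_{k \in A} a_k \le \sum_{k \in A} a_k + \sum_{k \in B} a_k = \sigma(a)_n.$$
Now $\sigma(a)$ is monotone nondecreasing and convergent. Hence
$$\sigma(a)_n \le \sum_{k=0}^{\infty} a_k,$$
which proves the lemma.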
3.1.7 Theorem. If an infinite series is absolutely convergent, then it is convergent, and any rearrangement converges to the same limit and is absolutely convergent.
Proof. Suppose, at first, that $\forall n \in \mathbf{N}_0$, $a_n \ge 0$. Let $(a \circ \phi, \sigma(a \circ \phi))$ be a rearrangement and $A = \{j : (\exists k)(k \in (0, n) \ \&\ j = \phi(k))\}$. Then by the last lemma we have
$$\sum_{k=0}^{n} (a \circ \phi)_k = \sum_{k \in A} a_k \le \sum_{k=0}^{\infty} a_k.$$
Hence $\sigma(a \circ \phi)$ is bounded, and since it is monotone nondecreasing it converges and
$$\sum_{k=0}^{\infty} (a \circ \phi)_k \le \sum_{k=0}^{\infty} a_k.$$
Actually, since $(a, \sigma(a))$ is a rearrangement of $(a \circ \phi, \sigma(a \circ \phi))$, we get the reverse inequality, so that the two numbers are equal.
For a general absolutely convergent sequence $(a_n)$ let us set
$$a_n^{+} = \frac{|a_n| + a_n}{2}, \qquad a_n^{-} = \frac{|a_n| - a_n}{2}.$$
Thus $0 \le a_n^{+} \le |a_n|$, $0 \le a_n^{-} \le |a_n|$, and $a_n = a_n^{+} - a_n^{-}$. Hence
$$\sum_{k=0}^{n} a_k^{\pm} \le \sum_{k=0}^{n} |a_k| \le \sum_{k=0}^{\infty} |a_k|,$$
and since $\sigma(a^{+})$ and $\sigma(a^{-})$ are monotone nondecreasing sequences they are convergent. From the previous paragraph and Theorem 3.1.3 we get
$$\sum_{k=0}^{\infty} (a \circ \phi)_k = \sum_{k=0}^{\infty} (a^{+} \circ \phi)_k - \sum_{k=0}^{\infty} (a^{-} \circ \phi)_k = \sum_{k=0}^{\infty} a_k^{+} - \sum_{k=0}^{\infty} a_k^{-} = \sum_{k=0}^{\infty} a_k.$$
Note that the first equality and the last equality come from Theorem 3.1.3 and establish the fact that $\sigma(a)$ is convergent. Also, $\sigma(|a \circ \phi|)$ is clearly convergent, since
$$|(a \circ \phi)_k| = (a^{+} \circ \phi)_k + (a^{-} \circ \phi)_k.$$
The converse of the last theorem is due to B. Riemann. Although the idea behind the proof is relatively simple, the technical details of a formal proof are slightly complicated. Hence before we proceed with the statement and proof of this theorem, we shall note that a form of the associative law is true for convergent infinite series. We make this precise by means of the following two statements.
3.1.8 Definition. The series $(b, \sigma(b))$ is said to be a grouping of the series $(a, \sigma(a))$ $\Leftrightarrow$ there exists a monotone increasing function $\phi$ from $\mathbf{N}_0$ into $\mathbf{N}_0$ so that $\phi(0) = 0$ and
$$b_n = \sum_{k=\phi(n)}^{\phi(n+1)-1} a_k.$$
3.1.9 Proposition. If $(a, \sigma(a))$ is [absolutely] convergent, then any grouping is [absolutely] convergent to the same limit.
The proof is so simple that we shall leave it as an exercise. We should point out that it is sometimes possible to group a divergent series and obtain a convergent series. For example, the series
\[ \sum_{k=0}^{\infty} ((-1)^k) \]
is clearly divergent, since $((-1)^k)$ does not converge to zero. However, if we take $\Phi(k) = 2k$, then $b_k = a_{2k} + a_{2k+1} = 0$ and consequently $\sigma(b)$ is convergent.
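The grouping example can be checked directly. The following is a minimal sketch, assuming the helper names `partial_sums`, `grouped_terms`, `a` and `phi` (none of which appear in the text); it forms the grouped terms $b_n$ of Definition 3.1.8 for the series with terms $(-1)^k$ and the grouping function $\Phi(n) = 2n$.

```python
def partial_sums(terms):
    """Return the list of running totals of `terms`."""
    total, sums = 0, []
    for t in terms:
        total += t
        sums.append(total)
    return sums


def grouped_terms(a, phi, count):
    """First `count` terms b_n = a(phi(n)) + ... + a(phi(n+1) - 1)."""
    return [sum(a(k) for k in range(phi(n), phi(n + 1))) for n in range(count)]


def a(k):
    return (-1) ** k          # the divergent series 1 - 1 + 1 - 1 + ...


def phi(n):
    return 2 * n              # group the terms in consecutive pairs


ungrouped = partial_sums([a(k) for k in range(8)])
grouped = partial_sums(grouped_terms(a, phi, 4))
print(ungrouped)  # the partial sums keep oscillating between 1 and 0
print(grouped)    # every partial sum of the grouped series is 0
```

The ungrouped partial sums never settle, while the grouped ones are constant, which is all the example in the text claims.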

3.1.10 Theorem (Riemann). If a series $(a, \sigma(a))$ is conditionally convergent, then for any two real numbers $\alpha \le \beta$, there is a rearrangement $(a\circ\Phi, \sigma(a\circ\Phi))$ so that
\[ \liminf_{n\to\infty} \sigma(a\circ\Phi)_n = \alpha, \qquad \limsup_{n\to\infty} \sigma(a\circ\Phi)_n = \beta. \]
Proof. The intuitive idea of the proof is quite simple. Let $(a^+_n)$ be that subsequence of $(a_n)$ which consists of the nonnegative terms and $(-a^-_n)$ that subsequence which consists of the negative terms. The sequences $\sigma(a^+)$ and $\sigma(a^-)$ must both be unbounded. For since both sequences are monotone nondecreasing, if one is bounded it converges, and, since $\sigma(a)$ converges, the other sequence will converge also. Hence $\sigma(|a|)$ will converge, contrary to hypothesis. Further, note that since $\sigma(a)$ is convergent, $a^+_n \to 0$ and $a^-_n \to 0$ as $n \to \infty$.
Let $r_0$ be the smallest integer so that $\sigma(a^+)_{r_0} > \beta$. This is possible since $\sigma(a^+)$ is unbounded. Hence the difference between $\sigma(a^+)_{r_0}$ and $\beta$ is at most $a^+_{r_0}$. Next let $s_0$ be the smallest integer so that $\sigma(a^+)_{r_0} - \sigma(a^-)_{s_0} < \alpha$. Again this is possible, since $\sigma(a^-)$ is unbounded. The difference between $\alpha$ and $\sigma(a^+)_{r_0} - \sigma(a^-)_{s_0}$ is at most $a^-_{s_0}$. Now let $r_1$ be the smallest integer so that $\sigma(a^+)_{r_1} - \sigma(a^-)_{s_0} > \beta$; that is,
\[ \sum_{k=0}^{r_0} a^+_k - \sum_{k=0}^{s_0} a^-_k + \sum_{k=r_0+1}^{r_1} a^+_k > \beta. \]
In other words, we have added just enough positive terms to bring the sum up past $\beta$. The difference between this sum and $\beta$ is at most $a^+_{r_1}$. Proceeding in this way we see we get a rearrangement of the original series having the specified properties.
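The informal construction can be tried numerically. The sketch below is only an illustration under stated assumptions: it uses the alternating harmonic series $1 - 1/2 + 1/3 - \cdots$ (a standard conditionally convergent series, not one taken from the text), the hypothetical function name `riemann_rearrangement`, and floating-point arithmetic. It adds unused positive terms until the running sum passes $\beta$, then unused negative terms until it drops below $\alpha$, and repeats.

```python
def riemann_rearrangement(alpha, beta, n_terms):
    """Partial sums of a greedy rearrangement of sum_{k>=1} (-1)^(k+1) / k.

    Positive terms 1/1, 1/3, 1/5, ... are used in order until the sum
    exceeds beta; then negative terms -1/2, -1/4, ... are used in order
    until the sum falls below alpha; then the process repeats.
    """
    pos = 1          # denominator of the next unused positive term
    neg = 2          # denominator of the next unused negative term
    total = 0.0
    sums = []
    going_up = True
    while len(sums) < n_terms:
        if going_up:
            total += 1.0 / pos
            pos += 2
            if total > beta:
                going_up = False
        else:
            total -= 1.0 / neg
            neg += 2
            if total < alpha:
                going_up = True
        sums.append(total)
    return sums


sums = riemann_rearrangement(0.9, 1.0, 20000)
tail = sums[10000:]
print(max(tail), min(tail))  # close to beta and alpha, respectively
```

Each overshoot past $\beta$ (or undershoot past $\alpha$) is at most the size of the last term added, which is why the peaks and troughs of the tail approach $\beta$ and $\alpha$.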
Let us now proceed to write all this down in a formal manner following the informal proof as an outline. We shall break the proof into a number of parts.
(a) The first step is to get the subsequences $(a^+_n)$ and $(a^-_n)$. For this purpose consider the sets
\[ A^+ = \{n : a_n \ge 0\}, \qquad A^- = \{n : a_n < 0\}. \]
The sequence $(a^+_k)$ is obtained by relabeling the elements in the set $\{a_n : n \in A^+\}$ and $-(a^-_n)$ is obtained by relabeling the elements in the set $\{a_n : n \in A^-\}$. The sets $A^+$ and $A^-$ are both denumerable subsets of $N_0$. Indeed, if either $A^+$ or $A^-$ is finite, all but a finite number of elements of the range of $(a_n)$ are negative or nonnegative, respectively. In either event $\sigma(|a|)$ is convergent, contradicting the original hypothesis. Clearly $A^+ \cap A^- = \emptyset$ and $N_0 = A^+ \cup A^-$.
The functions that relabel the elements of $(a_n)$ to give $(a^+_k)$ and $(a^-_k)$ are the functions $\Phi^+$ and $\Phi^-$ having domains $N_0$ and $N$ and ranges $A^+$ and $A^-$, respectively, so that $\Phi^+(0) = \min A^+$, $\Phi^-(1) = \min A^-$, and
\[ \Phi^+(n+1) = \min\{k \in A^+ : k > \Phi^+(n)\}, \qquad \Phi^-(n+1) = \min\{k \in A^- : k > \Phi^-(n)\}. \]
The existence and uniqueness of such functions can be proved by induction in exactly the same manner as done in the proof of Theorem 1.6.4. They clearly are monotone increasing functions. Now, define
\[ a^+_k = a\circ\Phi^+(k) \ (k \in N_0), \qquad a^-_k = -a\circ\Phi^-(k) \ (k \in N), \qquad a^-_0 = 0. \]
Clearly $a^+_k \ge 0$ and $a^-_k \ge 0$, and hence $\sigma(a^+)$ and $\sigma(a^-)$ are monotone nondecreasing.
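Actually, both of these sequences are unbounded. For if, for example, $\sigma(a^+)$ is bounded it is convergent. Now, if $n$ is sufficiently large, the set
\[ \{k : \Phi^+(k) \in \langle 0, \Phi^-(n)\rangle\} \]
is not void, and hence has a maximum which we label $m_n$. Clearly $m_n \to \infty$ as $n \to \infty$, since otherwise $\mathcal{R}(\Phi^+)$ is finite. Now, $\mathcal{R}(\Phi^+|\langle 0, m_n\rangle) = A^+ \cap \langle 0, \Phi^-(n)\rangle$, and $\mathcal{R}(\Phi^-|\langle 1, n\rangle) = A^- \cap \langle 0, \Phi^-(n)\rangle$. These sets, which we shall label $B_n^+$ and $B_n^-$, respectively, are disjoint and together give $\langle 0, \Phi^-(n)\rangle$. Thus
\[ \sigma(a)_{\Phi^-(n)} = \sum_{k\in B_n^+} a_k + \sum_{k\in B_n^-} a_k = \sigma(a^+)_{m_n} - \sigma(a^-)_n. \]
Since $\sigma(a)$ and $\sigma(a^+)$ converge, it follows that $\sigma(a^-)$ is convergent. But since
\[ \sigma(|a|)_{\Phi^-(n)} = \sigma(a^+)_{m_n} + \sigma(a^-)_n, \]
it follows that $\sigma(|a|)$ is convergent, which is a contradiction.
(b)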
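We shall now formalize the process of adding blocks of the sequences $(a^+_n)$ and $(-a^-_n)$. Let $\Psi$ be the function with domain $N_0$ and range in $N_0$ which satisfies the following:
\[ \Psi(0) = \min\{k : \sigma(a^+)_k > \beta\}, \]
\[ \Psi(1) = \min\{k : \sigma(a^+)_{\Psi(0)} - \sigma(a^-)_k < \alpha\}, \]
\[ \Psi(2) = \min\{k : \sigma(a^+)_k - \sigma(a^-)_{\Psi(1)} > \beta\}, \]
\[ \Psi(2n) = \min\{k : \sigma(a^+)_k - \sigma(a^-)_{\Psi(2n-1)} > \beta\}, \]
\[ \Psi(2n+1) = \min\{k : \sigma(a^+)_{\Psi(2n)} - \sigma(a^-)_k < \alpha\}. \]
The existence and uniqueness of the function $\Psi$ can easily be proved using the axiom of induction and the fact that $\sigma(a^+)$ and $\sigma(a^-)$ are unbounded. It is clear that the sequences $(\Psi(2n))$ and $(\Psi(2n+1))$ are monotone increasing.
For the sake of notational convenience let us set, for $n \in N_0$,
\[ r_n = \Psi(2n), \qquad s_n = \Psi(2n+1). \]
From the definition of $\Psi$ we see that $\forall n \in N_0$,
\[ \sigma(a^+)_{r_{n+1}-1} - \sigma(a^-)_{s_n} \le \beta < \sigma(a^+)_{r_{n+1}} - \sigma(a^-)_{s_n}, \]
\[ \sigma(a^+)_{r_n} - \sigma(a^-)_{s_n} < \alpha \le \sigma(a^+)_{r_n} - \sigma(a^-)_{s_n-1}. \qquad (3.1.1) \]
Now $\sigma(a^+)_{r_{n+1}-1} = \sigma(a^+)_{r_{n+1}} - a^+_{r_{n+1}}$ and $\sigma(a^-)_{s_n-1} = \sigma(a^-)_{s_n} - a^-_{s_n}$. Since $\sigma(a)$ converges, $a_n \to 0$ as $n \to \infty$, and hence $a^+_n \to 0$ and $a^-_n \to 0$ as $n \to \infty$. But $r_n \to \infty$ and $s_n \to \infty$ as $n \to \infty$. Hence $\forall \epsilon > 0$, $\exists N$ so that $n \ge N \Rightarrow a^+_{r_{n+1}} < \epsilon$ and $a^-_{s_n} < \epsilon$. Thus from (3.1.1) we see that $\forall \epsilon > 0$, $\exists N$ so that $n \ge N \Rightarrow$
\[ \beta < \sigma(a^+)_{r_{n+1}} - \sigma(a^-)_{s_n} < \beta + \epsilon, \]
\[ \alpha - \epsilon < \sigma(a^+)_{r_n} - \sigma(a^-)_{s_n} < \alpha. \qquad (3.1.2) \]
(c)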
We shall now define the one-to-one function $\Phi$ that gives the proper rearrangement of the series $(a, \sigma(a))$. Let us put, for $j \in N_0$,
\[ n_{-1} = r_0, \qquad n_{2j} = s_j + r_j, \qquad n_{2j+1} = s_j + r_{j+1}. \qquad (3.1.3) \]
Since $(s_j)$ and $(r_j)$ are monotone increasing, the sequence $(n_m)$ is monotone increasing. Let us put, for $m \in N_0$,
\[ B_{-1} = \{k : 0 \le k \le r_0\}, \qquad B_m = \{k : n_{m-1} < k \le n_m\}. \]
These sets are certainly pairwise disjoint and $N_0 = \bigcup \{B_k : k \in N_0 \cup \{-1\}\}$.
Let $\Phi$ be that function, with domain $N_0$, defined as follows:
\[ \Phi(k) = \begin{cases} \Phi^+(k - s_{j-1}) & \Leftarrow k \in B_{2j-1}, \\ \Phi^-(k - r_j) & \Leftarrow k \in B_{2j}, \end{cases} \qquad (3.1.4) \]
where we set $s_{-1} = 0$. Let us rewrite this in a slightly different form which will be useful later. We have
\[ \Phi(s_{j-1} + k) = \Phi^+(k) \quad \text{for } r_{j-1} < k \le r_j, \]
\[ \Phi(r_j + k) = \Phi^-(k) \quad \text{for } s_{j-1} < k \le s_j, \qquad (3.1.4') \]
where we set $r_{-1} = -1$, and recall that $s_{-1} = 0$.
Let us put
\[ B^+ = \bigcup \{B_{2j-1} : j \in N_0\}, \qquad B^- = \bigcup \{B_{2j} : j \in N_0\}. \]
Then $B^+ \cap B^- = \emptyset$ and $B^+ \cup B^- = N_0$. The range of $\Phi|B^+$ is $A^+$ and the range of $\Phi|B^-$ is $A^-$. This is immediately clear from (3.1.4') and the fact that $\mathcal{R}(\Phi^+) = A^+$ and $\mathcal{R}(\Phi^-) = A^-$. Thus $\mathcal{R}(\Phi) = A^+ \cup A^- = N_0$.
Further, we see that $\Phi|B^+$ and $\Phi|B^-$ are monotone functions. Indeed, suppose $p \in B_{2i-1}$, $q \in B_{2j-1}$ and $p < q$. We have $n_{2(i-1)} < p < q \le n_{2j-1}$, and since the sequence $(n_m)$ is monotone increasing we have that $2(i-1) < 2j-1$, which implies that $i \le j$. Hence, since the sequence $(r_k)$ is monotone increasing we have $p - s_{i-1} \le r_i \le r_{j-1} < q - s_{j-1}$ if $i < j$, and $p - s_{j-1} < q - s_{j-1}$ if $i = j$. Thus
\[ \Phi(p) = \Phi^+(p - s_{i-1}) < \Phi^+(q - s_{j-1}) = \Phi(q). \]
In the same way we show that $\Phi|B^-$ is monotone. These facts, together with the fact that $\mathcal{R}(\Phi|B^+) \cap \mathcal{R}(\Phi|B^-) = \emptyset$, show that $\Phi$ is one to one. Hence, we have shown that $\Phi$ is a one-to-one function with domain and range $N_0$.
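(d) Let us now bring all these things together and show that
\[ \liminf_{n\to\infty} \sigma(a\circ\Phi)_n = \alpha, \qquad \limsup_{n\to\infty} \sigma(a\circ\Phi)_n = \beta. \]
We shall prove only the second one, since the first one will follow by similar reasoning.
Suppose we take $m$ odd, $m = 2j+1$. Then we can write $\sigma(a\circ\Phi)_{n_m}$ as a "telescoping sum" in the following way:
\[ \sigma(a\circ\Phi)_{n_m} = \sigma(a\circ\Phi)_{r_0} + \sum_{k=0}^{j} [\sigma(a\circ\Phi)_{n_{2k}} - \sigma(a\circ\Phi)_{n_{2k-1}}] + \sum_{k=0}^{j} [\sigma(a\circ\Phi)_{n_{2k+1}} - \sigma(a\circ\Phi)_{n_{2k}}]. \]
Now,
\[ \sigma(a\circ\Phi)_{n_{2k}} - \sigma(a\circ\Phi)_{n_{2k-1}} = \sum_{i=s_{k-1}+r_k+1}^{s_k+r_k} a\circ\Phi(i). \]
By (3.1.4') we have $a\circ\Phi(r_k + i) = a\circ\Phi^-(i) = -a^-_i$ for $s_{k-1} < i \le s_k$. Thus we get
\[ \sigma(a\circ\Phi)_{n_{2k}} - \sigma(a\circ\Phi)_{n_{2k-1}} = -\sum_{i=s_{k-1}+1}^{s_k} a^-_i = -[\sigma(a^-)_{s_k} - \sigma(a^-)_{s_{k-1}}]. \]
Hence
\[ \sum_{k=0}^{j} [\sigma(a\circ\Phi)_{n_{2k}} - \sigma(a\circ\Phi)_{n_{2k-1}}] = -\sum_{k=0}^{j} [\sigma(a^-)_{s_k} - \sigma(a^-)_{s_{k-1}}] = -\sigma(a^-)_{s_j}. \]
(Recall, we have taken $a^-_0 = 0$.) By exactly similar reasoning we find that
\[ \sum_{k=0}^{j} [\sigma(a\circ\Phi)_{n_{2k+1}} - \sigma(a\circ\Phi)_{n_{2k}}] = \sigma(a^+)_{r_{j+1}} - \sigma(a^+)_{r_0}, \]
and hence, since $\sigma(a\circ\Phi)_{r_0} = \sigma(a^+)_{r_0}$, we get
\[ \sigma(a\circ\Phi)_{n_{2j+1}} = \sigma(a^+)_{r_{j+1}} - \sigma(a^-)_{s_j}. \qquad (3.1.5) \]
If $n_{m-1} < n \le n_{m+1}$ we have
\[ \sum_{k=0}^{n} a\circ\Phi_k = \sum_{k=0}^{n_{m-1}} a\circ\Phi_k + \sum_{k=n_{m-1}+1}^{n} a\circ\Phi_k. \]
Now, $n_{m-1} = n_{2j}$, and for $n_{2j} < k \le n_{2j+1}$ we have $a\circ\Phi_k = a\circ\Phi^+(k - s_j) = a^+_{k-s_j} \ge 0$. Also, for $n_{2j+1} < k \le n_{2j+2}$ we have $a\circ\Phi_k = a\circ\Phi^-(k - r_{j+1}) = -a^-_{k-r_{j+1}} \le 0$. Hence, in either case, $n < n_m$ or $n_m \le n$, we get
\[ \sum_{k=0}^{n} a\circ\Phi_k \le \sum_{k=0}^{n_m} a\circ\Phi_k = \sigma(a\circ\Phi)_{n_{2j+1}}. \qquad (3.1.6) \]
If we combine (3.1.2) with (3.1.5) and (3.1.6) we have $\forall \epsilon > 0$, $\exists N$ so that
\[ n \ge n_{2N} \Rightarrow \sum_{k=0}^{n} a\circ\Phi_k < \beta + \epsilon. \]
This means that $\limsup_{n\to\infty} \sigma(a\circ\Phi)_n \le \beta$. On the other hand, from (3.1.2) and (3.1.5), $\forall j \in N_0$, $\beta < \sigma(a\circ\Phi)_{n_{2j+1}}$. Thus we see that
\[ \limsup_{n\to\infty} \sigma(a\circ\Phi)_n = \beta. \]
REMARK: It is clear from the proof of the last theorem that we could take $\beta = \infty$ or $\alpha = -\infty$. That is, we could find a rearrangement so that $\sigma(a\circ\Phi)$ is unbounded above or unbounded below.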
Exercises
1. If $(a, \sigma(a))$ is absolutely convergent, and $b$ is a subsequence of $a$, show that $(b, \sigma(b))$ is convergent.
2. If $r \ne 1$, show that $\forall n \in N_0$,
\[ \sum_{k=0}^{n} r^k = \frac{1 - r^{n+1}}{1 - r}. \]
Hence give the values of $r$ for which the series $\sum_{k=0}^{\infty} (r^k)$ will converge and the values for which it will diverge.
3. Let $(b_n)$ be the subsequence of $(a_n)$ consisting of the nonzero terms of the latter. Prove that the series $(a, \sigma(a))$ and $(b, \sigma(b))$ converge or diverge simultaneously.
4. If $\sum_{k=0}^{\infty} (a_k)$ is convergent, show that $\forall n \in N_0$, $\sum_{k=n}^{\infty} (a_k)$ is convergent.
5. If $(a, \sigma(a))$ is convergent and $b$ is a subsequence of $a$, is it necessarily true that $(b, \sigma(b))$ is convergent?
6. Show that
\[ \sum_{n=1}^{\infty} \frac{1}{n(n+1)} = 1. \]
[Hint: $\frac{1}{n(n+1)} = \frac{1}{n} - \frac{1}{n+1}$.]
7. Show that
\[ \sum_{n=1}^{\infty} \frac{1}{n(n+1)(n+2)} = \frac{1}{4}. \]
8.
Determine whether the series
is convergent or divergent.
9. If $\lim_{n\to\infty} a_n$ exists, show that the sequence defined by
\[ b_n = \frac{1}{n+1} \sum_{k=0}^{n} a_k \]
converges to the same limit. This is called Cesàro summability.
10. If $(a_n)$ is a monotone nonincreasing sequence of nonnegative terms and $\sigma(a)$ converges, show that $n a_n \to 0$. This result is often called Pringsheim's theorem.
I I.
Show that the following two series converge or diverge simul
taneously, provided
Vk
00
N0, ak 0.
L ((k+ l)(a2k+a 2k+1)),

k=O
I2.
If
00
( 2k )
o ; a1
(ak) is a sequence with nonnegative terms and (a, CT(a)) is

(En)
divergent, show that there is monotone nonincreasing sequence

that converges to zero and
is divergent.
3.2 CONVERGENCE TESTS
We shall first give a number of tests for the absolute convergence of series.
3.2.1 Comparison Test. Suppose $(a_n)$ and $(b_n)$ are sequences for which there is an $N$ so that $n \ge N \Rightarrow |a_n| \le |b_n|$. Then if $\sigma(|b|)$ is convergent, so is $\sigma(|a|)$, and consequently if $\sigma(|a|)$ is divergent, so is $\sigma(|b|)$.
Proof. Fix $n_0 \ge N$ and for $n > n_0$ we have
\[ 0 \le \sigma(|a|)_n - \sigma(|a|)_{n_0} = \sum_{k=n_0+1}^{n} |a_k| \le \sum_{k=n_0+1}^{n} |b_k| = \sigma(|b|)_n - \sigma(|b|)_{n_0}. \]
Hence, if $\sigma(|b|)$ is convergent it is bounded, and thus $\sigma(|a|)$ is bounded and convergent.
As an example of the comparison test, let us first examine the convergence of the geometric series
\[ \sum_{k=0}^{\infty} (r^k). \]
We know (see Exercise 2 of Section 3.1) that for $r \ne 1$,
\[ \sum_{k=0}^{n} r^k = \frac{1 - r^{n+1}}{1 - r}. \]
Hence, if $|r| < 1$, $r^n \to 0$ and the geometric series converges to $1/(1 - r)$. If $|r| \ge 1$, the series diverges since clearly $(r^n)$ does not converge to 0.
Now, since $n! = 2 \cdot 3 \cdots n \ge 2^{n-1}$, it follows that $1/n! \le (1/2)^{n-1}$. Consequently,
\[ \sum_{n=1}^{\infty} \left(\frac{1}{n!}\right) \]
converges by the comparison test.
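The two facts used in this example are easy to confirm in exact arithmetic. The sketch below is illustrative only; the names `geometric_partial_sum`, `closed_form_ok`, `comparison_ok` and `factorial_partial` are assumptions of this example, and the ratio $r = 1/3$ is an arbitrary choice.

```python
from fractions import Fraction
from math import factorial


def geometric_partial_sum(r, n):
    """Return r^0 + r^1 + ... + r^n."""
    return sum(r ** k for k in range(n + 1))


r = Fraction(1, 3)

# Closed form of the geometric partial sums for r != 1.
closed_form_ok = all(
    geometric_partial_sum(r, n) == (1 - r ** (n + 1)) / (1 - r)
    for n in range(10)
)

# Termwise comparison 1/n! <= (1/2)^(n-1) for n >= 1.
comparison_ok = all(
    Fraction(1, factorial(n)) <= Fraction(1, 2) ** (n - 1)
    for n in range(1, 30)
)

# Partial sums of sum 1/n! stay below the geometric sum
# 1 + 1/2 + 1/4 + ... = 2.
factorial_partial = sum(Fraction(1, factorial(n)) for n in range(1, 30))
print(closed_form_ok, comparison_ok, float(factorial_partial))
```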
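3.2.2 Cauchy's Root Test. Let $(a_n)$ be a sequence and†
\[ \rho = \limsup_{n\to\infty} |a_n|^{1/n}. \]
Then $\sigma(|a|)$ converges if $\rho < 1$ and diverges if $\rho > 1$.
† If $|a_n|^{1/n}$ is unbounded, say that $\rho = \infty$ and consider $\rho > 1$.
Proof. If $\rho < 1$, $\exists \epsilon > 0$ so that $\rho + \epsilon < 1$. By the definition of limit superior, $\exists N$ so that $n \ge N \Rightarrow |a_n|^{1/n} < \rho + \epsilon$, that is, $|a_n| < (\rho + \epsilon)^n$, and $\sigma(|a|)$ converges by comparison with the geometric series. If $1 < \rho < \infty$, $\exists \epsilon > 0$ so that $\rho - \epsilon > 1$. By the definition of limit superior, $\forall N$, $\exists n \ge N$ so that
\[ |a_n|^{1/n} > \rho - \epsilon, \qquad |a_n| > (\rho - \epsilon)^n > 1. \]
Consequently, it is not true that $|a_n| \to 0$ and thus $\sigma(|a|)$ cannot converge. If $\rho = \infty$, the sequence $|a_n|^{1/n}$ is unbounded and clearly we get the same result.
Let us prove the convergence of $\sum_{n=1}^{\infty} (1/n!)$ by means of the root test. First, if $k \in N$ and $n > k$, we have
\[ n! > k^{n-k+1}. \]
Consequently, for all sufficiently large $n$,
\[ (n!)^{1/n} > k \cdot k^{(1-k)/n} > k/2. \]
This shows that
\[ \lim_{n\to\infty} \left(\frac{1}{n!}\right)^{1/n} = 0, \]
and the series converges by the root test.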
3.2.3 D'Alembert's Ratio Test. Suppose $(a_n)$ is a sequence for which $\exists N$ so that $n \ge N \Rightarrow a_n \ne 0$. Define
\[ R = \limsup_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right|, \qquad r = \liminf_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right|. \]
If $R < 1$, $\sigma(|a|)$ is convergent and if $r > 1$, $\sigma(|a|)$ is divergent.
Proof. This is an immediate consequence of the root test and the chain of inequalities,
\[ \liminf_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| \le \liminf_{n\to\infty} |a_n|^{1/n} \le \limsup_{n\to\infty} |a_n|^{1/n} \le \limsup_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right|. \]
We shall leave the proof of these inequalities to Exercise 1 at the end of the section.
The convergence of the series of factorials that we have considered previously is also an immediate consequence of the ratio test since $n!/(n+1)! = 1/(n+1) \to 0$.
The ratio test, when it works, is usually very easy to apply and hence is very useful. When it fails it may be possible to apply a slightly sharper variant of it.
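For the factorial series both test quantities can be tabulated. This is a rough numerical sketch, assuming the hypothetical helper names `nth_root_of_term` and `ratio_of_terms`; it uses `math.lgamma`, for which `lgamma(n + 1)` equals $\log n!$, to avoid forming huge factorials in floating point.

```python
from math import exp, lgamma


def nth_root_of_term(n):
    """|a_n|^(1/n) for a_n = 1/n!, computed as exp(-log(n!) / n)."""
    return exp(-lgamma(n + 1) / n)


def ratio_of_terms(n):
    """|a_(n+1) / a_n| for a_n = 1/n!, which simplifies to 1/(n + 1)."""
    return 1.0 / (n + 1)


roots = [nth_root_of_term(n) for n in (10, 100, 1000)]
ratios = [ratio_of_terms(n) for n in (10, 100, 1000)]
print(roots)   # decreasing toward 0, so the root test applies
print(ratios)  # decreasing toward 0, so the ratio test applies
```

Both columns shrink toward zero, but the ratios do so faster, in line with the chain of inequalities relating the two tests.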
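3.2.4 Raabe's Test. Suppose $(a_n)$ is a sequence for which $\exists N$ so that $n \ge N \Rightarrow a_n \ne 0$. Let us put
\[ \alpha = \liminf_{n\to\infty} n\left(1 - \left|\frac{a_{n+1}}{a_n}\right|\right), \qquad \beta = \limsup_{n\to\infty} n\left(1 - \left|\frac{a_{n+1}}{a_n}\right|\right). \]
If $\alpha > 1$, $\sigma(|a|)$ is convergent and if $\beta < 1$, $\sigma(|a|)$ is divergent.
Proof. If $\alpha > 1$, $\forall \rho$ so that $\alpha > \rho > 1$, $\exists K$ so that
\[ k \ge K \Rightarrow k\left(1 - \left|\frac{a_{k+1}}{a_k}\right|\right) > \rho. \]
Thus $k|a_{k+1}| < (k - 1)|a_k| + (1 - \rho)|a_k|$, or, rearranging terms, we get
\[ (\rho - 1)|a_k| < (k - 1)|a_k| - k|a_{k+1}|. \]
Hence $k|a_{k+1}|$ is monotone decreasing as $k$ increases and consequently it is bounded. If we sum the above inequality we get
\[ (\rho - 1) \sum_{k=p}^{n} |a_k| < (p - 1)|a_p| - n|a_{n+1}|, \qquad p \ge K. \]
Therefore, $\sigma(|a|)$ is bounded and thus convergent.
Suppose now that $\beta < 1$. Then $\forall \rho$ so that $\beta < \rho < 1$, $\exists K$ so that $k \ge K \Rightarrow$
\[ \left|\frac{a_{k+1}}{a_k}\right| > 1 - \frac{\rho}{k} > 1 - \frac{1}{k}. \]
Therefore, $k|a_{k+1}| > (k - 1)|a_k|$, so that $k|a_{k+1}|$ is monotone increasing as $k$ increases. For fixed $p \ge K$ and $k \ge p$ we get
\[ |a_{k+1}| > (p - 1)|a_p|/k. \]
Consequently, $\sigma(|a|)$ diverges by comparison with the harmonic series.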
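Both the root test and the ratio test fail for the series
\[ \sum_{k=1}^{\infty} \left(\frac{1}{k^2}\right). \]
However, Raabe's test will show this is convergent. Indeed,
\[ \left|\frac{a_{k+1}}{a_k}\right| = \left(\frac{k}{k+1}\right)^2 = \frac{1}{(1 + 1/k)^2} < \frac{1}{1 + 2/k}. \]
Thus
\[ 1 - \left(\frac{k}{k+1}\right)^2 > 1 - \frac{1}{1 + 2/k} = \frac{2}{k+2}, \]
and hence
\[ \lim_{k\to\infty} (k + 2)\left[1 - \left(\frac{k}{k+1}\right)^2\right] = \lim_{k\to\infty} k\left[1 - \left(\frac{k}{k+1}\right)^2\right] \ge 2. \]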
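The Raabe quantity for the series with terms $1/k^2$ can be computed exactly. In this small sketch the name `raabe_quantity` is an assumption of the example; the value it returns is $k(1 - (k/(k+1))^2)$, which simplifies to $k(2k+1)/(k+1)^2$.

```python
from fractions import Fraction


def raabe_quantity(k):
    """k * (1 - |a_(k+1) / a_k|) for a_k = 1/k^2, in exact arithmetic."""
    ratio = Fraction(k, k + 1) ** 2
    return k * (1 - ratio)


values = {k: raabe_quantity(k) for k in (1, 10, 100, 10 ** 6)}
for k, v in values.items():
    print(k, float(v))  # the values increase toward 2
```

The values stay above $2k/(k+2)$ and approach 2, so the lower limit in Raabe's test exceeds 1 even though the plain ratio tends to 1.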
3.2.5 Cauchy's Condensation Test. If $(a_n)$ is a monotone nonincreasing sequence with nonnegative terms, then the following series converge or diverge simultaneously:
\[ \sum_{k=0}^{\infty} (a_k), \qquad \sum_{k=0}^{\infty} (2^k a_{2^k}). \]
Proof. The sequences of partial sums of these series are monotone nondecreasing. If a monotone nondecreasing sequence has a subsequence that is bounded, then the sequence is bounded and thus convergent. If the sequence is divergent, then every subsequence is unbounded.
Using the monotone nonincreasing character of $(a_n)$ we get
\[ \sum_{j=2^k}^{2^{k+1}-1} a_j \le 2^k a_{2^k} \le 2 \sum_{j=2^{k-1}}^{2^k-1} a_j, \]
where the sum on the right is taken to be $2a_0$ if $k = 0$. Summing from $k = 0$ to $n$ we get
\[ \sigma(a)_{2^{n+1}-1} \le a_0 + \sum_{k=0}^{n} 2^k a_{2^k} \le 2\sigma(a)_{2^n-1}. \]
This set of inequalities when taken together with the comments in the first paragraph constitute the proof.
As an example of the use of the condensation test, let us consider the convergence properties of the $p$ series,
\[ \sum_{n=1}^{\infty} \left(\frac{1}{n^p}\right). \]
Recall that for $p = 1$ we called this the harmonic series and showed it diverged. Hence the comparison test shows that the $p$ series diverges for $p < 1$. We can apply the condensation test only for $p \ge 0$, since otherwise the sequence $(1/n^p)$ is increasing. For $p \ge 0$ we must examine the convergence of the series
\[ \sum_{k=0}^{\infty} \left(\frac{1}{2^{(p-1)k}}\right). \]
This is a geometric series with $r = 1/2^{(p-1)}$. Hence, if $p > 1$, $0 \le r < 1$, and the series converges. If $p \le 1$, $r \ge 1$, and the series diverges. Thus the $p$ series converges for $p > 1$ and diverges for $p \le 1$.
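The two-sided bound that drives the condensation test can be checked on a concrete sequence. The sketch below makes its own assumptions: the sequence $a_n = 1/(n+1)^2$, indexed from $n = 0$ so that $a_0$ exists, and the helper names `a`, `sigma` and `condensed`. It verifies $\sigma(a)_{2^{n+1}-1} \le a_0 + \sum_{k=0}^{n} 2^k a_{2^k} \le 2\sigma(a)_{2^n-1}$ for small $n$.

```python
from fractions import Fraction


def a(n):
    """A monotone nonincreasing, nonnegative sequence indexed from n = 0."""
    return Fraction(1, (n + 1) ** 2)


def sigma(n):
    """Partial sum a_0 + a_1 + ... + a_n."""
    return sum(a(k) for k in range(n + 1))


def condensed(n):
    """Partial sum of the condensed series: sum over k = 0..n of 2^k * a(2^k)."""
    return sum(2 ** k * a(2 ** k) for k in range(n + 1))


checks = [
    sigma(2 ** (n + 1) - 1) <= a(0) + condensed(n) <= 2 * sigma(2 ** n - 1)
    for n in range(8)
]
print(checks)
```

Because the partial sums of each series are squeezed by those of the other, boundedness of one forces boundedness of the other.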
There is another very useful test for absolute convergence called the integral test. We shall take this up in Chapter 5. For now let us turn to some other convergence tests which are not specifically tests for absolute convergence. Let us first prove a result called the Abel summation formula. In Chapter 5 we shall recognize this formula as a special case of integration by parts for Riemann-Stieltjes integrals.
3.2.6 Lemma (Abel). For every pair of sequences $(a_n)$ and $(b_n)$
\[ \sum_{k=0}^{n} a_k b_k = b_{n+1}\sigma(a)_n - \sum_{k=0}^{n} (b_{k+1} - b_k)\sigma(a)_k. \qquad (3.2.1) \]
Proof. We have, upon setting $\sigma(a)_{-1} = 0$,
\[ a_k = \sigma(a)_k - \sigma(a)_{k-1}. \]
Therefore,
\[ \sum_{k=0}^{n} a_k b_k = \sum_{k=0}^{n} b_k \sigma(a)_k - \sum_{k=1}^{n} b_k \sigma(a)_{k-1}. \qquad (3.2.2) \]
The last sum begins at $k = 1$, since $\sigma(a)_{-1} = 0$. This last sum can also be written
\[ \sum_{k=0}^{n} b_{k+1}\sigma(a)_k - b_{n+1}\sigma(a)_n. \]
If we put this into (3.2.2), we get (3.2.1).
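Abel's summation formula is a finite algebraic identity, so it can be confirmed on arbitrary data. The following sketch assumes the helper name `abel_sides` and two made-up integer lists; it compares $\sum_{k=0}^{n} a_k b_k$ with $b_{n+1}\sigma(a)_n - \sum_{k=0}^{n} (b_{k+1} - b_k)\sigma(a)_k$.

```python
def abel_sides(a, b, n):
    """Return both sides of Abel's summation formula for indices 0..n.

    `a` needs at least n + 1 entries and `b` at least n + 2 entries.
    """
    sigma, total = [], 0
    for k in range(n + 1):
        total += a[k]
        sigma.append(total)            # sigma[k] = a_0 + ... + a_k
    lhs = sum(a[k] * b[k] for k in range(n + 1))
    rhs = b[n + 1] * sigma[n] - sum(
        (b[k + 1] - b[k]) * sigma[k] for k in range(n + 1)
    )
    return lhs, rhs


a = [3, -1, 4, -1, 5, -9]        # arbitrary sample data
b = [2, 7, 1, 8, 2, 8, 1]        # one entry longer than a
results = [abel_sides(a, b, n) for n in range(len(a))]
print(results)
```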
3.2.7 Abel's Test. If $(a, \sigma(a))$ is a convergent series and $(b_n)$ is a monotonic convergent sequence, then
\[ \sum_{k=0}^{\infty} (a_k b_k) \]
is convergent.
Proof. Since $\sigma(a)$ is convergent and $(b_n)$ is convergent, it follows that $(\sigma(a)_n b_{n+1})$ is a convergent sequence. Also $\sigma(a)$ is bounded, say, by $M$, and hence, since $(b_n)$ is monotone,
\[ \sum_{k=0}^{n} |b_{k+1} - b_k|\,|\sigma(a)_k| \le M \sum_{k=0}^{n} |b_{k+1} - b_k| = M|b_{n+1} - b_0|. \]
Since $(b_n)$ is convergent, the sums on the left are bounded, and since they are monotone nondecreasing they are convergent. Thus by using (3.2.1) we have completed the proof.

3.2.8 Dirichlet's Test. If $\sigma(a)$ is bounded and $(b_n)$ is nonincreasing to zero, then
\[ \sum_{k=0}^{\infty} (a_k b_k) \]
is convergent.
Proof. The fact that
\[ \sum_{k=0}^{\infty} ((b_{k+1} - b_k)\sigma(a)_k) \]
is convergent follows in exactly the same way as the previous proof. If $M$ is a bound for $\sigma(a)$, then
\[ |\sigma(a)_n b_{n+1}| \le M|b_{n+1}|, \]
and since $b_n \to 0$, $\sigma(a)_n b_{n+1} \to 0$. These facts in conjunction with Abel's summation formula complete the proof.

As we saw in Proposition 3.1.2, an infinite series $(a, \sigma(a))$ is convergent only if $a_n \to 0$. We also gave an example, the harmonic series, which showed that in general this is not a sufficient condition for convergence. However, under certain circumstances it is a sufficient condition, as the next result shows.
3.2.9 Leibnitz' Test. If $(a_n)$ is nonincreasing to zero, then the series
\[ \sum_{k=0}^{\infty} ((-1)^k a_k) \]
is convergent.
Proof. This is an immediate consequence of Dirichlet's test. Indeed, $(a_n)$ is nonincreasing to zero and the sequence with terms
\[ \sum_{k=0}^{n} (-1)^k \]
is bounded by 1.
As an example of the use of Dirichlet's test we consider the series
\[ \sum_{k=1}^{\infty} \left(\frac{\sin kx}{k}\right), \qquad x \in \,]-\infty, \infty[. \]
Recall the trigonometric identity
\[ 2 \sin(x/2) \sin kx = \cos(k - 1/2)x - \cos(k + 1/2)x. \]
Summing both sides we get
\[ 2 \sin(x/2) \sum_{k=1}^{n} \sin kx = \cos(x/2) - \cos(n + 1/2)x. \]
Thus, if $x \ne 2\pi m$, $(\sum_{k=1}^{n} \sin kx)$ is a bounded sequence. Since $1/k$ is nonincreasing to zero we may apply Dirichlet's test and we see the original series converges. If $x = 2\pi m$, the series converges to zero.
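The summed identity and the resulting bound on the sine sums can be checked in floating point. This is a minimal sketch; the name `sine_partial_sum`, the sample point $x = 1$ and the tolerance are assumptions of the example. Dividing the identity by $2\sin(x/2)$ gives the bound $|\sum_{k=1}^{n} \sin kx| \le 1/|\sin(x/2)|$ used below.

```python
from math import sin, cos, isclose


def sine_partial_sum(x, n):
    """Return sin(x) + sin(2x) + ... + sin(nx)."""
    return sum(sin(k * x) for k in range(1, n + 1))


x = 1.0  # any x that is not a multiple of 2*pi

identity_holds = all(
    isclose(
        2 * sin(x / 2) * sine_partial_sum(x, n),
        cos(x / 2) - cos((n + 0.5) * x),
        abs_tol=1e-9,
    )
    for n in range(1, 200)
)

bound = 1 / abs(sin(x / 2))
bounded = all(abs(sine_partial_sum(x, n)) <= bound + 1e-9 for n in range(1, 200))
print(identity_holds, bounded, bound)
```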
O Exercises
1.
Prove the chain of inequalities in the proof of the D'Alembert
ratio test,
2.
Test the following series for convergence:

(a)
(b)
3.
3.2.3.
o (i : )
2 (
)
k2
k3
1 + k! .
00
k=O
Find all values of a for which the following series converge:

00
(a)
(b)
(c)
4.
k =l
(akka).
i( )
i C10 )
k( lo k)a
k)a

00
(a)
(b)
(c)
k=I
( (-l)k21tk).
0 (
(
).
)
(-l)k [v'k+T -VkJ

(-l)k [211k
211<k+i>J .
3.l!
5.
Show that the following series is convergent by using the Abel
summation formula.
fG
k =I
6.
log
(I+ l/k )
(a, <T(a)) is an absolutely convergent series, show that it is
If
always possible to find an unbounded monotone nondecreasing se

quence
(wk ) so that
is absolutely convergent.
7. Show that there is a divergent series $(a, \sigma(a))$ so that for every monotone nonincreasing sequence $(\epsilon_n)$ which converges to zero,
\[ \sum_{k=0}^{\infty} (\epsilon_k a_k) \]
is convergent. Compare with Exercise 12 of Section 3.1.
8. Give another proof of Leibnitz' test by taking $b_k = (-1)^k a_k$ and showing that the sequence $(\sigma(b)_{2n+1})$ is monotone nondecreasing, the sequence $(\sigma(b)_{2n})$ is monotone nonincreasing, and noting that
\[ \sigma(b)_{2n+1} = \sigma(b)_{2n} - a_{2n+1}. \]

9.

(a)
(b)
IO.
(
:i (
k =O
(-l )k 2-1 1k /(k + 1) .

(- l )k
(Vk+i - vk) .

co
(a)
L
k =l
<3-11k [211k
2 11< k + l ) J) .
11. Prove the limit comparison test: If $(a_n)$ and $(b_n)$ are sequences with positive terms and $\lim_{n\to\infty} (a_n/b_n)$ exists and is different from zero, then $\sigma(a)$ and $\sigma(b)$ converge or diverge simultaneously.
12. Suppose $\limsup_{n\to\infty} |a_n|^{1/n} = \rho < 1$. If $s = \lim_{n\to\infty} \sigma(a)_n$, show that $\forall r$ for which $\rho < r < 1$, there is an $N$ so that
\[ n \ge N \Rightarrow |s - \sigma(a)_n| < \frac{r^{n+1}}{1 - r}. \]
13.
that
Suppose limn-
Vr for which
p <
co
lan+ifanl = p
<
1,
Is - <T(a)nl
14.
Suppose limn-co
Show that
Vr for which
<
lanl
n(l - lan+1/anl ) = p > 1 and s

limn-co <T(a)n.
p > r > 1, 3N so that n:;;. N
=
Is - <T(a)nl
15.
l, and s = limn-co <T(a)n Show

n:;;. N
<
3N so that
<
n
r-
--
lan+1I
{an) and ( bn) satisfy the hypotheses of

s =limn-co <T(ab)n, find an error estimate for
Suppose the series
Dirichlet's
test.
Is - <T(ab)nl .
16.
Suppose
If
(an) satisfies the conditions of Leibnitz' test. Show that
(-l)kak
(-l)kak an+1.
3.3 DECIMAL EXPANSIONS

If $m$ is a positive integer greater than 1, then the series
\[ \sum_{k=1}^{\infty} (a_k/m^k), \qquad a_k \in \langle 0, m - 1\rangle, \]
is convergent to a number in $[0, 1]$. Indeed, since $0 \le a_k \le m - 1$, the convergence follows by comparison with the geometric series, and, moreover,
\[ \sum_{k=1}^{\infty} a_k/m^k \le (m - 1) \sum_{k=1}^{\infty} 1/m^k = 1. \]
The object of this section is to show that every number in $[0, 1]$ may be represented by such a series.
3.3.1 Theorem. If $m$ is a positive integer greater than 1, then $\forall a \in [0, 1]$, there exists a sequence $(a_k)$ with range in $\langle 0, m - 1\rangle$ so that
\[ a = \sum_{k=1}^{\infty} a_k/m^k. \]
Proof. Let $a_1$ be the largest integer in $\langle 0, m - 1\rangle$ so that $a_1/m \le a$, $a_2$ the largest integer in $\langle 0, m - 1\rangle$ so that $a_1/m + a_2/m^2 \le a$, and if $a_1, \ldots, a_{n-1}$ have been determined we take $a_n$ to be the largest integer so that
\[ \sum_{k=1}^{n} a_k/m^k \le a. \qquad (3.3.1) \]
In this way we can prove by induction that there exists a unique sequence $(a_k)$ so that $\forall n \in N$, (3.3.1) is satisfied and $\forall j \in \langle 1, n\rangle$ if $c \in \langle 0, m - 1\rangle$ & $c > a_j$, then
\[ \sum_{k\in A_j} a_k/m^k + c/m^j > a, \]
where $A_j = \langle 1, n\rangle \setminus \{j\}$.
The sequence with terms given by (3.3.1) is monotone nondecreasing and bounded above by $a$, hence convergent. Of course, the limit is also bounded above by $a$. Hence $\forall n \in N$, we get
\[ \sum_{k=1}^{n} a_k/m^k \le \sum_{k=1}^{\infty} a_k/m^k = b \le a. \]
Since $m > 1$, $1/m^n \to 0$ as $n \to \infty$. Thus, if $b < a$, there is a smallest positive integer $p$ so that
\[ 1/m^{p-1} \le a - b. \]
Now if $n \ge p$ we have $1/m^{n-1} \le a - b$ and consequently
\[ \sum_{k=1}^{n-1} a_k/m^k + (m - 1)/m^n \le b + 1/m^{n-1} \le a. \]
Thus, because of the way $(a_k)$ was defined, we must have $a_n = m - 1$. Let $q$ be the smallest integer so that $n \ge q \Rightarrow a_n = m - 1$. Clearly $q > 1$, for otherwise, using the formula for the sum of a geometric series, we get
\[ b = \sum_{k=1}^{\infty} (m - 1)/m^k = 1 \ge a, \]
which contradicts the assumption that $b < a$. Since
\[ \sum_{k=q}^{\infty} (m - 1)/m^k = 1/m^{q-1}, \]
it follows that
\[ b = \sum_{k=1}^{q-1} a_k/m^k + 1/m^{q-1} = \sum_{k=1}^{q-2} a_k/m^k + [a_{q-1} + 1]/m^{q-1} < a. \]
Since $a_{q-1} < m - 1$, it follows that $a_{q-1} + 1 \le m - 1$, which is a contradiction, since the definition of $(a_k)$ does not allow such an inequality as the last one. Therefore, we must have $b = a$ and the proof is complete.
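The digit selection in this proof is a greedy procedure and can be run as written. The sketch below is an illustration in exact rational arithmetic; the function name `greedy_digits` and the sample inputs are assumptions of the example.

```python
from fractions import Fraction


def greedy_digits(a, m, count):
    """First `count` base-m digits of a in [0, 1], chosen greedily.

    At step n the digit is the largest d in {0, ..., m - 1} for which the
    partial sum plus d / m^n does not exceed a.
    """
    digits = []
    partial = Fraction(0)
    for n in range(1, count + 1):
        d = m - 1
        while partial + Fraction(d, m ** n) > a:
            d -= 1                     # d = 0 always works, so this stops
        digits.append(d)
        partial += Fraction(d, m ** n)
    return digits, partial


quarter_digits, quarter_partial = greedy_digits(Fraction(1, 4), 10, 4)
third_digits, third_partial = greedy_digits(Fraction(1, 3), 10, 4)
one_digits, _ = greedy_digits(Fraction(1), 10, 4)
print(quarter_digits, third_digits, one_digits)
```

Note that for $a = 1$ the greedy choice produces the digit $m - 1$ at every step, which is the only expansion of this kind that the number 1 has.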
As the reader well knows, the decimal expansion of a number in $[0, 1]$ is obtained by writing down the elements of the sequence $(a_k)$ in succession to get
\[ .a_1 a_2 a_3 \cdots. \]
Of course, the base $m$ must be specified.
It is not true that a real number has a unique representing sequence. Indeed, suppose
\[ a = \sum_{k=1}^{n} a_k/m^k, \qquad a_n \ne 0. \]
Then we also have
\[ a = \sum_{k=1}^{n-1} a_k/m^k + [a_n - 1]/m^n + \sum_{k=n+1}^{\infty} (m - 1)/m^k. \]
Nonzero rational numbers such as these, which have a representation with $a_k = 0$ for all $k$ sufficiently large, are called terminating decimals. These rationals are the only reals that do not have unique series representations, and as a matter of fact they have exactly two representations. We shall now prove these statements.
3.3.2 Theorem. Every real number in $]0, 1]$, except a terminating decimal, has a unique representing series with respect to a fixed $m \in N$, $m > 1$. Each terminating decimal may be represented in exactly two ways, either by a finite sum or by a series for which $\exists n$ so that $\forall k \ge n$, $a_k = m - 1$.
Proof. Suppose we have
\[ a = \sum_{k=1}^{\infty} a_k/m^k = \sum_{k=1}^{\infty} b_k/m^k. \]
If we set $c_k = a_k - b_k$, then $|c_k| \le m - 1$ and
\[ \sum_{k=1}^{\infty} c_k/m^k = 0. \]
Let $n$ be the smallest integer in the set $\{k : c_k \ne 0\}$, provided this set is not empty. We then get
\[ |c_n|/m^n = \left|\sum_{k=n+1}^{\infty} c_k/m^k\right| \le \sum_{k=n+1}^{\infty} |c_k|/m^k \le (m - 1) \sum_{k=n+1}^{\infty} 1/m^k = 1/m^n. \]
Thus we must have $|c_n| = 1$, which in turn implies that the above inequalities are equalities. Hence $\forall k > n$, $c_k = m - 1$, or $\forall k > n$, $c_k = 1 - m$. Since $a_k$ and $b_k$ are in $\langle 0, m - 1\rangle$, this means that $\forall k > n$, $a_k = m - 1$ and $b_k = 0$, or $\forall k > n$, $a_k = 0$ and $b_k = m - 1$. These cases are the cases of a terminating decimal. If $a$ is not a terminating decimal, the set $\{k : c_k \ne 0\}$ must be void. This concludes the proof.
We can use the series or decimal representation to show that the set of numbers in $[0, 1]$ is not countable. The set of terminating decimals, being an infinite subset of the rationals, is denumerable. Hence the set of numbers in $[0, 1]$ is not finite and the set of sequences each having its range in $\langle 0, m - 1\rangle$ is denumerable or not denumerable if and only if the set of numbers in $[0, 1]$ is denumerable or not denumerable, respectively.
Let us suppose that the set of sequences each having range in $\langle 0, m - 1\rangle$ is denumerable. Then there is a one-to-one function $\Phi$ mapping $N$ onto this set of sequences. Let us set $a^k = \Phi(k)$, and write $a^k_j$ for the $j$th term of the sequence $a^k$. Let $(a_k)$ be that sequence defined as follows:
\[ a_k = \begin{cases} 0 & \Leftarrow a^k_k \ne 0, \\ m - 1 & \Leftarrow a^k_k = 0. \end{cases} \]
This sequence exists since $m \ge 2$. Now the sequence $(a_k)$ has range in $\langle 0, m - 1\rangle$ and therefore there is a $q \in N$ so that $(a_k) = \Phi(q) = a^q$. But $a_q \ne a^q_q$, which is a contradiction. Hence the original assumption that $[0, 1]$ is denumerable is untenable. Thus $[0, 1]$ is an example of an infinite set that is not denumerable. Such a set is called uncountable or nondenumerable.
The fact that $[0, 1]$ is uncountable is due to G. Cantor. The process by which this was shown to be true is, for obvious reasons, called Cantor's diagonal process. We have proved the following theorem.
3.3.3 Theorem (Cantor). The set $[0, 1]$ is uncountable.
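The diagonal process can be illustrated on a finite table of digit sequences. This is only a finite toy version of the argument, with the hypothetical function name `diagonal_sequence`, 0-based indices, and made-up rows; it applies the same rule as the proof (use 0 where the diagonal digit is nonzero, and $m - 1$ where it is 0).

```python
def diagonal_sequence(rows, m):
    """Build digits that differ from row k in position k, for every k.

    Each row is a list of base-m digits; only the first len(rows) digits
    of the new sequence are produced.
    """
    return [0 if rows[k][k] != 0 else m - 1 for k in range(len(rows))]


rows = [
    [1, 4, 1, 5],
    [0, 0, 0, 0],
    [9, 9, 9, 9],
    [2, 7, 1, 8],
]
diag = diagonal_sequence(rows, 10)
print(diag)  # differs from row k in position k
```

Whatever list is supplied, the constructed sequence cannot coincide with any listed row, which is the heart of the contradiction in the proof.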
CANTOR'S SET
Let us take $m = 3$ in the previous considerations, the ternary expansions of the numbers in $[0, 1]$. The set $C$ of numbers in $[0, 1]$ that can be written in a ternary expansion given by the sequence $(a_k)$ so that $\forall k \in N$, $a_k \ne 1$, is called Cantor's set. This set has many interesting properties that can be used to advantage for giving examples and counterexamples.
First, the set $C$ is closed. We will show this by showing that its complement is open. Let $b$ be a number in the complement of $C$ and suppose $j$ is the smallest element in the set $\{k : b_k = 1\}$. Now, $\exists k > j$ so that $b_k \ne 2$. Otherwise, since
\[ \sum_{k=j+1}^{\infty} 2/3^k = 1/3^j, \qquad (3.3.2) \]
we see that $b$ would have an expansion of the form
\[ b = \sum_{k=1}^{j-1} b_k/3^k + 2/3^j, \]
which contradicts the fact that $b$ is in the complement of $C$. Also, $\exists k > j$, so that $b_k \ne 0$; otherwise we see from (3.3.2) that
\[ b = \sum_{k=1}^{j-1} b_k/3^k + \sum_{k=j+1}^{\infty} 2/3^k, \]
which would contradict the fact that $b \in C^c$. Let us set
\[ q/3^{j-1} = \sum_{k=1}^{j-1} b_k/3^k. \]
Since $\forall k \in \langle 1, j - 1\rangle$, $b_k$ is even, we see that $q$ is even and moreover $q < 3^{j-1}$. By what we have just proved,
\[ q/3^{j-1} + 1/3^j < b < q/3^{j-1} + 2/3^j. \qquad (3.3.3) \]
The inequalities (3.3.3) show that every $b \in C^c$ is in an open interval of the form $]p/3^j, (p + 1)/3^j[$, where $p$ is odd and $p < 3^j$. We claim that every open interval of this form is contained in $C^c$. Indeed, let us write
\[ p/3^j = \sum_{k=1}^{j} p_k/3^k, \qquad p_k \in \langle 0, 2\rangle. \]
Since $p$ is odd, $\exists k \in \langle 1, j\rangle$ so that $p_k = 1$. Suppose that $p/3^j < a < (p + 1)/3^j$ and
\[ a = \sum_{k=1}^{\infty} a_k/3^k; \]
then $\forall k \in \langle 1, j\rangle$, $a_k = p_k$. For if not, let $r$ be the smallest integer in $\{k : k \in \langle 1, j\rangle \ \&\ a_k \ne p_k\}$. If $a_r \ge p_r + 1$, then
\[ a \ge \sum_{k=1}^{r} p_k/3^k + 1/3^r = \sum_{k=1}^{r} p_k/3^k + \sum_{k=r+1}^{\infty} 2/3^k \ge \sum_{k=1}^{j} p_k/3^k + \sum_{k=j+1}^{\infty} 2/3^k = (p + 1)/3^j, \]
which is a contradiction. If $a_r \le p_r - 1$, then
\[ a \le \sum_{k=1}^{r} p_k/3^k - 1/3^r + \sum_{k=r+1}^{\infty} 2/3^k = \sum_{k=1}^{r} p_k/3^k \le p/3^j, \]
which is again a contradiction. Note that if $a$ is a terminating decimal, the previous proof is valid regardless of which representation is used for $a$. Thus $a \in C^c$, since, as we have noted, $\exists k \in \langle 1, j\rangle$ so that $a_k = p_k = 1$. Hence $C^c$ is open and $C$ is closed.
Actually, the last two paragraphs prove more than we have stated. Namely, the inequalities (3.3.3) and the last paragraph show that $C^c$ is the union of all intervals of the form
\[ I_{q,n} = \,]q/3^{n-1} + 1/3^n, \ q/3^{n-1} + 2/3^n[, \]
where
\[ q/3^{n-1} = \sum_{k=1}^{n-1} q_k/3^k, \qquad q_k \in \{0, 2\}. \]
Further, the proof of the last paragraph shows that these intervals are pairwise disjoint. For suppose $a \in I_{p,k} \cap I_{q,n}$, where $(p, k) \ne (q, n)$. Then, from the last paragraph we have
\[ a = \sum_{j=1}^{n-1} q_j/3^j + 1/3^n + \sum_{j=n+1}^{\infty} a_j/3^j = \sum_{j=1}^{k-1} p_j/3^j + 1/3^k + \sum_{j=k+1}^{\infty} a_j/3^j. \]
If $k < n$, the proof in the last paragraph leads to the fact that $q_k = 1$, which is a contradiction. If $k = n$, then $p_j = q_j$ for $j \in \langle 1, n - 1\rangle$, which contradicts the fact that $p \ne q$. If $k > n$, we again get a contradiction.
For fixed $n$ there are exactly $2^{n-1}$ intervals of the form $I_{q,n}$ that we have described. Since each such interval has length $1/3^n$, the sum of their lengths adds up to $2^{n-1}/3^n$. Hence the sum of the lengths of the pairwise disjoint intervals which make up $C^c$ is
\[ \sum_{n=1}^{\infty} 2^{n-1}/3^n = 1. \]
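The removed intervals can be listed explicitly for the first few levels. The sketch below assumes the indexing $I_{q,n} = \,]q/3^{n-1} + 1/3^n,\ q/3^{n-1} + 2/3^n[$ with $q/3^{n-1} = \sum_{k=1}^{n-1} q_k/3^k$ and $q_k \in \{0, 2\}$; the function name `removed_intervals` is an assumption of this example.

```python
from fractions import Fraction
from itertools import product


def removed_intervals(n):
    """Endpoints of all level-n intervals of the complement of Cantor's set."""
    intervals = []
    for digits in product((0, 2), repeat=n - 1):
        base = sum(Fraction(d, 3 ** (k + 1)) for k, d in enumerate(digits))
        intervals.append((base + Fraction(1, 3 ** n), base + Fraction(2, 3 ** n)))
    return sorted(intervals)


levels = 5
all_intervals = sorted(
    iv for n in range(1, levels + 1) for iv in removed_intervals(n)
)
total_length = sum(right - left for left, right in all_intervals)
print(removed_intervals(2))
print(total_length)  # equals 1 - (2/3)^levels
```

The total removed length after `levels` stages is $1 - (2/3)^{\text{levels}}$, which tends to 1, matching the sum of the series.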
Every point of the Cantor set is an accumulation point of $C$. A closed set with this property is called a perfect set. The proof is very easy. Suppose $a \in C$ and
\[ a = \sum_{k=1}^{\infty} a_k/3^k, \qquad a_k \in \{0, 2\}, \]
where $a_k \ne 0$ for an infinite number of $k$. Then the partial sums of this series are all different from $a$, belong to $C$, and converge to $a$. If, on the other hand,
\[ a = \sum_{k=1}^{j} a_k/3^k, \qquad a_k \in \{0, 2\}, \]
then for $n > j$, the numbers
\[ \sum_{k=1}^{j} a_k/3^k + 2/3^n \]
are in $C$, are different from $a$, and converge to $a$.
The Cantor set is uncountable. The proof is also rather easy, being simply an application of the Cantor diagonal process. Suppose $C$ is countable and $\Phi$ is a one-to-one function with domain $N$ and range $C$. Set $a^k = \Phi(k)$ and define
\[ a_k = \begin{cases} 0 & \Leftarrow a^k_k = 2, \\ 2 & \Leftarrow a^k_k = 0. \end{cases} \]
Clearly, the number determined by the sequence $(a_k)$ in its ternary expansion belongs to $C$ but is not in the range of $\Phi$.
Collecting all the previous results we have proved the following theorem.
3.3.4 Theorem. The Cantor set is an uncountable perfect set in $[0, 1]$ whose complement consists of the union of pairwise disjoint open intervals, the sum of whose lengths is 1.
It is interesting and instructive to look at the geometric positions of the intervals that comprise $C^c$ (see Fig. 3.3.1). The interval $I_{0,1}$ is $]1/3, 2/3[$, that is, the "middle third" of the interval $[0, 1]$. The interval $I_{0,2}$ is $]1/9, 2/9[$, and the interval $I_{2,2}$ is $]7/9, 8/9[$. These intervals represent the "middle thirds" of the intervals that remain after $I_{0,1}$ is removed. Proceeding in this way, by removing the "middle thirds" of the intervals that remain after any given stage, after a denumerable number of steps we are left with Cantor's set.
[FIGURE 3.3.1: the interval [0, 1] with the points 1/9, 2/9, 1/3, 2/3, 7/9, 8/9 marked]
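Exercises
1. If $m, p, n \in N$, $m > 1$, and $p < m^n$, show that it is possible to write
\[ p/m^n = \sum_{k=1}^{n} p_k/m^k, \qquad p_k \in \langle 0, m - 1\rangle. \]
2. If $p, m \in N$, $m > 1$, show that $p$ has a unique representation in the form
\[ p = \sum_{k=0}^{n} p_k m^k, \qquad p_k \in \langle 0, m - 1\rangle, \quad p_n \ne 0. \]
(Hint: Exercise 1 may be helpful.)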
3. A sequence $(a_k)$ is said to be eventually periodic $\Leftrightarrow$ $\exists N$ and $\exists p \in N$ so that $k \ge N \Rightarrow a_{k+p} = a_k$. If $m \in N$ and $m > 1$, we know that every $a \in [0, 1]$ can be written
\[ a = \sum_{k=1}^{\infty} a_k/m^k, \qquad a_k \in \langle 0, m - 1\rangle. \]
Show that $a$ is rational $\Leftrightarrow$ $(a_k)$ is eventually periodic.
4. Suppose that $m \in N$, $m > 1$, and suppose we make the decimal expansion of any number in $[0, 1]$ unique by taking the representing series of a terminating decimal as a finite sum. Show that
\[ \sum_{k=1}^{\infty} a_k/m^k < \sum_{k=1}^{\infty} b_k/m^k \]
if and only if $A = \{k : a_k \ne b_k\} \ne \emptyset$ and $a_n < b_n$, where $n = \min A$.
5. If the range of a function is uncountable, show that its domain is also uncountable.
6. Prove that the set of irrational numbers in $[0, 1]$ is uncountable.
7. If $x \in C$ and
\[ x = \sum_{k=1}^{\infty} x_k/3^k, \qquad x_k \in \{0, 2\}, \]
then define $f$ to be that function with domain $C$ given by
\[ f(x) = \sum_{k=1}^{\infty} x_k/2^{k+1}. \]
Show that $\mathcal{R}(f) = [0, 1]$ and that $f$ is continuous. Note that in conjunction with Exercise 5 this provides another proof of the fact that $C$ is uncountable.
8. Show that the function $f$ of Exercise 7 is not a one-to-one function. Describe a subset $A \subset C$ so that $f|C \setminus A$ is a one-to-one function with range $[0, 1]$.
9. Show that the function $f$ defined in Exercise 7 is monotone nondecreasing. Show that $f$ may be extended to a function $F$ which has domain $[0, 1]$, is monotone nondecreasing, continuous, and is constant on each interval $I_{q,n}$ of $C^c$. The function $F$ is called Cantor's function. A way of describing $F$ is as follows. If
\[ x = \sum_{k=1}^{\infty} x_k/3^k, \]
let us set $n = n(x) = \min\{k : x_k = 1\}$, and $n = \infty$ if $x \in C$. Then
\[ F(x) = \sum_{k=1}^{n-1} x_k/2^{k+1} + 1/2^n. \]
3.4 SEQUENCES AND SERIES OF FUNCTIONS
In this section we shall generalize our definition of sequences and

series to include the situation where the elements are functions.
3.4.1 Definition. Let $\mathcal{F}$ be the collection of all functions each having domain and range in $R$. A function sequence is a function with domain $N_0$ and range in $\mathcal{F}$.
As in the situation for a real sequence, we shall designate a function sequence by $(f_n)$. The definition of convergence of a function sequence at a point reduces to the definition of a real sequence. We shall state this formally.
3.4.2 Definition. A function sequence $(f_n)$ is said to be convergent at a point $x \Leftrightarrow \exists N$ so that $\forall n \in N_0$ with $n \ge N$, $x \in \mathcal{D}(f_n)$, and the real sequence defined by the set of numbers $\{f_n(x) : n \ge N\}$ is convergent.
In dealing with function sequences, a new concept may be introduced
in considering convergence, namely, the consideration of uniform
convergence. When speaking of uniform convergence, for simplicity
we shall suppose that all elements of the range of the function sequence
have the same domain.
3.4.3 Definition. Suppose $(f_n)$ is a function sequence so that $\forall n \in N_0$, $f_n$ has the same domain $\mathcal{D}$. The sequence $(f_n)$ is said to be uniformly convergent $\Leftrightarrow \exists g \in \mathcal{F}$ with $\mathcal{D}(g) = \mathcal{D}$, so that $\forall \varepsilon > 0$, $\exists N$ such that $\forall x \in \mathcal{D}$ and $\forall n \ge N$,
$$|f_n(x) - g(x)| < \varepsilon.$$
Loosely speaking, for a uniformly convergent sequence the rate of convergence of each real sequence $(f_n(x))$ is independent of $x$. The idea of uniform convergence leads to the idea of a uniformly Cauchy function sequence.
3.4.4 Definition. Suppose $(f_n)$ is a function sequence so that $\forall n \in N_0$, $f_n$ has the same domain $\mathcal{D}$. The sequence $(f_n)$ is said to be uniformly Cauchy $\Leftrightarrow \forall \varepsilon > 0$, $\exists N$ so that $\forall x \in \mathcal{D}$ and $\forall n, m \ge N$,
$$|f_n(x) - f_m(x)| < \varepsilon.$$
As one would expect, a function sequence is uniformly Cauchy if

and only if it is uniformly convergent. This is proved in the next
theorem.
3.4.5 Theorem. A function sequence is uniformly convergent $\Leftrightarrow$ it is uniformly Cauchy.
Proof. Suppose $(f_n)$ is uniformly convergent, each $f_n$ having the common domain $\mathcal{D}$. Then $\exists g \in \mathcal{F}$, $\mathcal{D}(g) = \mathcal{D}$, so that $\forall \varepsilon > 0$, $\exists N$ such that $\forall x \in \mathcal{D}$ and $\forall n \ge N$,
$$|f_n(x) - g(x)| < \varepsilon/2.$$
Hence $\forall x \in \mathcal{D}$ and $\forall n, m \ge N$,
$$|f_n(x) - f_m(x)| \le |f_n(x) - g(x)| + |g(x) - f_m(x)| < \varepsilon.$$
Conversely, suppose $(f_n)$ is uniformly Cauchy. Then $\forall x \in \mathcal{D}$, the real sequence $(f_n(x))$ is Cauchy, and hence converges to a real number which we designate '$g(x)$'. The collection of ordered pairs $(x, g(x))$ is a function with domain $\mathcal{D}$. Now, $\forall \varepsilon > 0$, $\exists N$ so that $\forall x \in \mathcal{D}$ and $\forall n, m \ge N$,
$$|f_n(x) - f_m(x)| < \varepsilon/2.$$
For each $x$, $f_n(x) \to g(x)$, and since the absolute value is a continuous function with domain $R$, we have
$$\lim_{m \to \infty} |f_n(x) - f_m(x)| = |f_n(x) - g(x)| \le \varepsilon/2 < \varepsilon.$$
This shows that $f_n \to g$ uniformly.
3.4.6 Theorem. If $(f_n)$ is a function sequence uniformly convergent to $g$ and each $f_n$ is continuous at $a$, then $g$ is continuous at $a$.
Proof. Since $(f_n)$ is uniformly convergent, $\forall \varepsilon > 0$, $\exists N$ so that $\forall x$ in the common domain $\mathcal{D}$ and $\forall n \ge N$,
$$|f_n(x) - g(x)| < \varepsilon/3.$$
Fix $n_0 \ge N$; by the continuity of $f_{n_0}$ at $a$ we have that $\exists \delta > 0$ so that
$$|x - a| < \delta \text{ and } x \in \mathcal{D} \Rightarrow |f_{n_0}(x) - f_{n_0}(a)| < \varepsilon/3.$$
Hence, if $|x - a| < \delta$ and $x \in \mathcal{D}$,
$$|g(x) - g(a)| \le |g(x) - f_{n_0}(x)| + |f_{n_0}(x) - f_{n_0}(a)| + |f_{n_0}(a) - g(a)| < \varepsilon.$$
This proves the continuity of $g$ at $a$.
The condition that the function sequence $(f_n)$ is uniformly convergent cannot be removed from the hypothesis of the last theorem. Indeed suppose $\forall n \in N_0$, $f_n$ has domain $[0, 1]$ and is defined by $f_n(x) = x^n$. Then for $x \in [0, 1[$,
$$\lim_{n \to \infty} f_n(x) = 0, \qquad \lim_{n \to \infty} f_n(1) = 1.$$
Hence the limit function takes the value zero on $[0, 1[$ and the value 1 at 1. Thus it is not continuous.
As in the case of real sequences a function $\sigma$ can be defined from function sequences to function sequences. For the sake of simplicity we shall restrict the domain of $\sigma$ to those function sequences $f = (f_n)$ so that $\forall n \in N_0$, $\mathcal{D}(f_n) = \mathcal{D}$. Then
$$\sigma(f)_n(x) = \sum_{k=0}^{n} f_k(x)$$
is well defined on $\mathcal{D}$.
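Returning to the example $f_n(x) = x^n$ on $[0, 1]$, the failure of uniform convergence can be seen numerically. The following is a minimal sketch, not part of the original text; the helper names `f` and `witness` are illustrative assumptions. For each $n$ it produces a point of $[0, 1[$ at which $f_n$ is about $1/2$, so the distance from $f_n$ to the limit function cannot be made small on all of $[0, 1[$ at once, even though $f_n(x) \to 0$ at each fixed $x$.

```python
def f(n, x):
    """The n-th function of the sequence, f_n(x) = x**n."""
    return x ** n

def witness(n):
    """A point x_n in [0, 1) with f_n(x_n) = 1/2 (up to rounding)."""
    return 0.5 ** (1.0 / n)

for n in (1, 5, 50, 500):
    x_n = witness(n)
    # pointwise value at the fixed point 0.9, and value at the moving witness
    print(n, f(n, 0.9), x_n, f(n, x_n))
```

The second printed column tends to zero, while the last column stays near $1/2$; this is the sense in which the rate of convergence depends on $x$.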
3.4.7 Definition. A function series is an ordered pair $(f, \sigma(f))$. The function series is said to be [absolutely] convergent at $x \Leftrightarrow [\sigma(|f|)]\ \sigma(f)$ is convergent at $x$. The function series is said to be [absolutely] uniformly convergent $\Leftrightarrow [\sigma(|f|)]\ \sigma(f)$ is uniformly convergent.
Analogous to our previous notation, we shall denote a function series by
$$\sum_{k \ge 0} (f_k)$$
and if the limit exists at every point of the common domain of all the $f_k$, we shall denote the limit function by
$$\sum_{k=0}^{\infty} f_k.$$
We have the following immediate corollary to Theorem 3.4.6, the proof of which we leave to the reader.
3.4.8 Corollary. If each $f_n$ of the sequence $f = (f_n)$ is continuous at $a$ and $(f, \sigma(f))$ is uniformly convergent, then $\sum_{k=0}^{\infty} f_k$ is continuous at $a$.
There is a very simple but very useful test for uniform convergence
of function series.
3.4.9 Weierstrass M Test. Suppose $f = (f_n)$ is a function sequence so that each $f_n$ has the same domain $\mathcal{D}$ and there is a constant sequence $M = (M_n)$ so that $\forall n \in N_0$ and $\forall x \in \mathcal{D}$, $|f_n(x)| \le M_n$, and $(M, \sigma(M))$ is convergent. Then $(|f|, \sigma(|f|))$, and hence $(f, \sigma(f))$, are uniformly convergent.
Proof. Since for $n > m$,
$$|\sigma(|f|)_n(x) - \sigma(|f|)_m(x)| = \sum_{k=m+1}^{n} |f_k(x)| \le \sigma(M)_n - \sigma(M)_m,$$
it follows from the convergence of $\sigma(M)$ that $\sigma(|f|)$ and $\sigma(f)$ are uniformly Cauchy and hence uniformly convergent.
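The estimate in the proof can be checked on a concrete series. The sketch below is an illustration only; the choice $f_k(x) = \cos(kx)/2^k$ with $M_k = 1/2^k$, the sampling grid, and the names `partial` and `worst_gap` are assumptions made for the example. Since $\sum_{k > n} M_k = 1/2^n$, the distance between the $n$-th partial sum and a much later partial sum should not exceed $1/2^n$ at any sample point.

```python
from math import cos

def partial(n, x):
    """n-th partial sum of the series with terms cos(k x) / 2**k."""
    return sum(cos(k * x) / 2 ** k for k in range(n + 1))

def worst_gap(n, reference=60, points=200):
    """Largest sampled distance between the n-th and a late partial sum."""
    xs = [-10 + 20 * i / points for i in range(points + 1)]
    return max(abs(partial(reference, x) - partial(n, x)) for x in xs)

for n in (2, 5, 10, 20):
    # the bound 2**-n does not depend on the sample point x
    print(n, worst_gap(n), 2.0 ** -n)
```

The bound in the last column is the same for every $x$, which is what uniform convergence asks for.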

As an example, let us consider the series whose terms are
$$\frac{x^n}{1 + x^{2n}}$$
and ask for the values of $x$ where this series converges. From the Cauchy root test we know the series will converge for those $x$ for which
$$\limsup_{n \to \infty} \left| \frac{x^n}{1 + x^{2n}} \right|^{1/n} < 1,$$
and diverge for those $x$ for which this limit superior is greater than 1. Since $1 + x^{2n} > x^{2n}$, we have
$$\limsup_{n \to \infty} \left| \frac{x^n}{1 + x^{2n}} \right|^{1/n} \le \frac{1}{|x|},$$
and hence we certainly get convergence for $|x| > 1$. On the other hand, since
$$\frac{x^n}{1 + x^{2n}} = \frac{1/x^n}{1 + 1/x^{2n}},$$
it follows that the series converges as well for $|x| < 1$.
Looking at this another way, if $|x| \le \rho < 1$ we see that
$$\left| \frac{x^n}{1 + x^{2n}} \right| \le \rho^n,$$
and thus the series converges uniformly by the Weierstrass M test. On the other hand, since
$$\frac{1/x^n}{1 + 1/x^{2n}} = \frac{x^n}{1 + x^{2n}},$$
it follows the series converges uniformly for $|x| \ge 1/\rho$. Thus the series converges certainly for all $x$ so that $|x| \neq 1$. If $|x| = 1$, it is not true that
$$\frac{x^n}{1 + x^{2n}} \to 0,$$
so that the series cannot converge for these values of $x$.
0 Exercises
1. Show that the series given in the last example does not converge uniformly on $R \setminus \{1, -1\}$.
2. Discuss the uniform convergence of the following sequences:
(a ) fn(x) = 1
[Hint:
(b )
fn(x)
(c)
fn(x)
nx
' x
JO, l[.
, x E [O, oo[.
xn
nx(I - x)n, x E [O, l].
1 +
Use the binomial theorem to show that
n-1
(1-; ) n
3.
3 ! n2
Discuss the uniform convergence of the following series:
( a)
(b )
4.
(n- l)(n-2)
( ! x2 ) , x ]-00,oo[.
(0 :x2)k ) x ]-oo,oo[.
k2
Discuss the uniform convergence of the following series:
J-oo,oo[ .
[Hint: Use the methods of elementary calculus to compute the maximum of each function in the series. Another possibility is the use of the Cauchy-Schwarz inequality (Exercise 15 of Section 1.8).]
5. Suppose that $\forall n \in N_0$, the function $f_n$ is defined and continuous on a compact set $K$, and $\forall x \in K$ and $\forall n \in N_0$, $f_n(x) \ge f_{n+1}(x)$. If $\forall x \in K$, $f_n(x) \to 0$, show that $f_n \to 0$ uniformly on $K$. This fact is often called Dini's theorem.
6. Suppose that the function sequence $(f_n)$ is uniformly convergent and $\forall n \in N_0$, $f_n$ is bounded. Show that there is a uniform bound for all the $f_n$.
7. Suppose $f_n \to f$ and $g_n \to g$ uniformly on a set $K$ and each $f_n$ and $g_n$ is bounded. Then $f_n g_n \to fg$ uniformly on $K$. Show that this may not be true if we remove the boundedness condition.
8. Derive an analogue to Dirichlet's test that will give sufficient conditions for the uniform convergence of a function series.
9. Derive an analogue to Abel's test for function series.
10. Let $(\varphi_n)$ be a sequence of continuous functions with common domain $R$ and defined as follows: $\forall x \in R$, $n \in N$,
$$\varphi_{n-1}(x) = \begin{cases} 0 & \Leftrightarrow |x| \ge 1/n, \\ 1 - n|x| & \Leftrightarrow |x| \le 1/n. \end{cases}$$
Let $(r_k)$ be a sequence whose range is the rational numbers $Q$. Set
$$f_n(x) = \sum_{k=0}^{\infty} \varphi_n(x - r_k)/2^k.$$
Show that $\forall n \in N_0$, $f_n$ is continuous, and $\forall x \in R$, $(f_n(x))$ is convergent. However, the sequence $(f_n)$ does not converge uniformly on any interval in $R$.
3.5
INFINITE PRODUCTS
Just as it is possible to define an infinite sum in terms of a sequence,

it is possible also to define an infinite product in terms of a sequence.
Before we do this, it would be well to take a few moments for a dis
cussion of finite products of real numbers. As in the situation for finite
sums, it is possible to prove the following statement using the axiom
of induction.
There exists a unique function $\pi$ with domain the collection of all real sequences and range in the collection of real sequences so that if $a = (a_n)$ is a sequence, then
$$\pi(a)_0 = a_0, \qquad \pi(a)_{n+1} = \pi(a)_n a_{n+1} \quad \forall n \in N_0.$$
We shall not carry out the proof here, but shall note only that the use
of induction for this result is very much like that used in the proof of
Theorem 1.6.4. Actually, the existence of the $\sigma$ function discussed at the beginning of this chapter, the existence of a power function as discussed in Chapter 2, and more generally the $\pi$ function discussed here are all special cases of a very general logical theorem on the possibility of recursive or inductive definition. The interested reader should consult page 163 of the book by Kershner and Wilcox cited at the end of Section 1.1.
We shall use the notation
$$\prod_{k=0}^{n} a_k = \pi(a)_n = a_0 a_1 \cdots a_n.$$
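The recursion defining $\pi$ translates directly into a running product. The sketch below is illustrative only; the function name `partial_products` is an assumption, and it works on a finite initial segment $a_0, \ldots, a_n$ of a sequence.

```python
from itertools import accumulate
from operator import mul

def partial_products(a):
    """Return [pi(a)_0, ..., pi(a)_n] for a finite list a = [a_0, ..., a_n].

    accumulate applies pi(a)_{k+1} = pi(a)_k * a_{k+1}, starting from a_0.
    """
    return list(accumulate(a, mul))

print(partial_products([2, 3, 4]))
# Two different extensions of (2, 3, 4) agree on the first three products.
print(partial_products([2, 3, 4, 5])[:3], partial_products([2, 3, 4, 99])[:3])
```

The second line of output illustrates why the value of a finite product does not depend on how the finite list is extended to a sequence on all of $N_0$.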
If we are given a finite set of numbers $\{a_k : k \in (0, n)\}$, it is of course necessary to extend the function with values $a_k$ and domain $(0, n)$ to a function defined on all of $N_0$ to be able to use the product symbol. However, if we extend in two different ways, say to sequences $a$ and $b$, it is a rather simple matter to show that $\pi(a)_k = \pi(b)_k$, $\forall k \in (0, n)$.
We may very often use the symbol $\prod_{k=m}^{n} a_k$ for $n \ge m$. If $a = (a_n)$ is any sequence that is an extension of the function with domain $(m, n)$ and values $a_k$, we set $b_k = a_{m+k}$ and we define
$$\prod_{k=m}^{n} a_k = \pi(b)_{n-m}.$$
More generally, if $A$ is any finite set with $n + 1$ elements, let $\varphi$ be a one-to-one function with domain $(0, n)$ and range $A$. Then define
$$\prod_{q \in A} a_q = \prod_{k=0}^{n} a_{\varphi(k)}.$$
The last definition, of course, raises the question whether or not we have given a definition that is independent of the function $\varphi$. We think that the reader can easily establish by use of the commutativity property of the product of two real numbers and the axiom of induction that for every one-to-one function $\varphi$ with domain and range $(0, n)$, we have
$$\prod_{k=0}^{n} a_k = \prod_{k=0}^{n} a_{\varphi(k)}.$$
The following associativity property for finite products is also easily

established by using the associativity property for the product of real
numbers and the axiom of induction:
3.5.1 Definition. An infinite product is an element of the $\pi$ function, that is, is an ordered pair of sequences $(a, \pi(a))$. If $a = (a_k)$ and $\forall k \in N_0$, $a_k \neq 0$, then the infinite product $(a, \pi(a))$ is said to be convergent $\Leftrightarrow$ the sequence $\pi(a)$ is convergent to a nonzero number. Otherwise the infinite product is said to be divergent.
The reason for excluding zero factors and a zero limit when speaking about the convergence of products is that it leads to neater criteria for deciding when a product converges. This will be borne out by the following results. An infinite product will often be designated by
$$\prod_{k \ge 0} (a_k)$$
and, if it is convergent, its limit will be denoted by
$$\prod_{k=0}^{\infty} a_k.$$
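3.5.2 Theorem. An infinite product $\prod_{k \ge 0} (a_k)$, $a_k \neq 0$, $\forall k \in N_0$, is convergent $\Leftrightarrow \forall \varepsilon > 0$, $\exists N$ so that $n \ge m \ge N \Rightarrow$
$$\left| \prod_{k=m}^{n} a_k - 1 \right| < \varepsilon. \qquad (3.5.1)$$
Proof. Suppose the infinite product is convergent; that is, $\pi(a)_n \to p \neq 0$. Thus $\forall \varepsilon > 0$, $\exists N$ so that $n \ge m \ge N \Rightarrow$
$$|\pi(a)_{m-1}| > |p|/2 \quad \text{and} \quad |\pi(a)_n - \pi(a)_{m-1}| < \varepsilon |p|/2.$$
Thus we have
$$\left| \frac{\pi(a)_n}{\pi(a)_{m-1}} - 1 \right| = \frac{|\pi(a)_n - \pi(a)_{m-1}|}{|\pi(a)_{m-1}|} < \varepsilon. \qquad (3.5.2)$$
But
$$\frac{\pi(a)_n}{\pi(a)_{m-1}} = \prod_{k=m}^{n} a_k,$$
and hence the necessity is established.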
Conversely, suppose
If we fix m
;.:;.
(3.5. 1 ) is satisfied. This is the same as (3.5.2).

7T (a) is a bounded sequence. Set
N, we see that
Pi = lim 7T ( a ) n ,
n-oo
From
(3.5.2)
we see that Vm;.:;. N,
-E
<
( i
- 1
7T a ) m-1
j= 1, 2.
E,
<
Thus we see that Pt #- 0, since otherwise, if we had originally taken

E
<
1,
we would get a contradiction. Let us take j=
inequalities. Using Theorem

.
lIm
2.4.3(c)
Pt
m=oo 7T ( a ) m-1
lim
m-oo
we get
I in
the last set of
P1
= P1
7T(a ) m-1 p/
and thus
-E
Since this is true, VE > 0.
Pt
- 1
P2
<
p1 = p2
=;i=
<
E.
0, and the sufficiency is established.
From the last theorem it follows that a necessary condition that the infinite product $\prod_{k \ge 0} (a_k)$ converges is that $a_k \to 1$ as $k \to \infty$. Thus we shall write $a_k = 1 + \alpha_k$ and the last condition is converted into $\alpha_k \to 0$ as $k \to \infty$.
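3.5.3 Theorem. A sufficient condition that the infinite product $\prod_{k \ge 0} (1 + \alpha_k)$ is convergent is that the infinite product $\prod_{k \ge 0} (1 + |\alpha_k|)$ is convergent. If $\forall k \in N_0$, $\alpha_k \ge 0$, or if $\forall k \in N_0$, $\alpha_k \le 0$, then the condition is also necessary.
Proof. We claim that for every sequence $(\beta_k)$ and $\forall n \in N_0$,
$$\left| \prod_{k=0}^{n} \{1 + \beta_k\} - 1 \right| \le \prod_{k=0}^{n} \{1 + |\beta_k|\} - 1.$$
The proof is by induction. It is certainly true for $n = 0$. If it is true for $n = j$, then since
$$\prod_{k=0}^{j+1} \{1 + \beta_k\} - 1 = \prod_{k=0}^{j} \{1 + \beta_k\} - 1 + \beta_{j+1} \prod_{k=0}^{j} \{1 + \beta_k\},$$
by taking absolute values on both sides, using the inductive hypothesis and the triangle inequality, we get
$$\left| \prod_{k=0}^{j+1} \{1 + \beta_k\} - 1 \right| \le \prod_{k=0}^{j} \{1 + |\beta_k|\} - 1 + |\beta_{j+1}| \prod_{k=0}^{j} \{1 + |\beta_k|\} = \prod_{k=0}^{j+1} \{1 + |\beta_k|\} - 1.$$
Consequently, we have
$$\left| \prod_{k=m}^{n} \{1 + \alpha_k\} - 1 \right| \le \prod_{k=m}^{n} \{1 + |\alpha_k|\} - 1.$$
For every $\varepsilon > 0$, $\exists N$ so that $n \ge m \ge N$ implies the right side is less than $\varepsilon$. Hence the sufficiency follows from Theorem 3.5.2.
Now, if $\forall k \in N_0$, $\alpha_k \ge 0$, then the necessity is clear. On the other hand, if $\forall k \in N_0$, $\alpha_k \le 0$, let us write
$$\frac{1}{\prod_{k=0}^{n} \{1 + \alpha_k\}} = \prod_{k=0}^{n} \left\{ 1 + \frac{|\alpha_k|}{1 + \alpha_k} \right\}.$$
Because the left side converges as $n \to \infty$, so does the right side. Since $\alpha_k \to 0$, $\exists K$ so that $k \ge K \Rightarrow 0 < 1 + \alpha_k \le 1$. Thus, for $n \ge m \ge K$,
$$\prod_{k=m}^{n} \{1 + |\alpha_k|\} - 1 \le \prod_{k=m}^{n} \left\{ 1 + \frac{|\alpha_k|}{1 + \alpha_k} \right\} - 1.$$
The right side is small provided $m$ and $n$ are sufficiently large. Apply Theorem 3.5.2 and we see that $\prod_{k \ge 0} (1 + |\alpha_k|)$ is convergent.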
3.5.4 Definition. An infinite product $\prod_{k \ge 0} (1 + \alpha_k)$ is said to be absolutely convergent $\Leftrightarrow$ the infinite product $\prod_{k \ge 0} (1 + |\alpha_k|)$ is convergent. A convergent infinite product that is not absolutely convergent is called conditionally convergent.
For emphasis we shall remark that if the $\alpha_k$ maintain their signs, then the infinite product $\prod_{k \ge 0} \{1 + \alpha_k\}$ is convergent if and only if it is absolutely convergent. This is, of course, a consequence of the last theorem.
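3.5.5 Theorem. The infinite product $\prod_{k \ge 0} \{1 + \alpha_k\}$ is absolutely convergent $\Leftrightarrow$ the infinite series $\sum_{k \ge 0} (|\alpha_k|)$ is convergent.
Proof. Using the Mean Value Theorem (see Chapter 4) we can establish that $\forall k \in N_0$, $\exists \beta_k$ with $1/(1 + |\alpha_k|) \le \beta_k \le 1$, so that
$$\log(1 + |\alpha_k|) = \beta_k |\alpha_k|.$$
Thus, if $\sum_{k \ge 0} (|\alpha_k|)$ is convergent, then by the comparison test,
$$\lim_{n \to \infty} \sum_{k=0}^{n} \log(1 + |\alpha_k|) = \lim_{n \to \infty} \sum_{k=0}^{n} \beta_k |\alpha_k| = \lim_{n \to \infty} \log \prod_{k=0}^{n} \{1 + |\alpha_k|\} \qquad (3.5.3)$$
exists. Since the exponential function is continuous,
$$\lim_{n \to \infty} \prod_{k=0}^{n} \{1 + |\alpha_k|\} = \lim_{n \to \infty} \exp \left( \log \prod_{k=0}^{n} \{1 + |\alpha_k|\} \right)$$
exists and clearly is not zero.
On the other hand, if the infinite product is convergent, $\exists K$ so that $\forall k \ge K \Rightarrow |\alpha_k| \le 1$. Thus $\forall k \ge K$, $\beta_k \ge 1/2$, and from (3.5.3) we see that
$$|\alpha_k| \le 2 \log(1 + |\alpha_k|) \quad \forall k \ge K.$$
Since the infinite product converges absolutely, the series
$$\sum_{k \ge 0} (\log(1 + |\alpha_k|))$$
is convergent. Hence by the comparison test the series $\sum_{k \ge 0} (|\alpha_k|)$ is convergent, and the theorem is proved.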
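The dichotomy in Theorem 3.5.5 can be seen on two standard examples. The sketch below is an illustration only; the two sequences and the names `partial_product`, `harmonic`, and `squares` are assumptions chosen for the example. With $\alpha_k = 1/(k+1)$ the series of the $|\alpha_k|$ diverges and the partial products telescope to $n + 2$; with $\alpha_k = 1/(k+1)^2$ the series converges and the partial products stay below the exponential of the partial sums, in line with $\log(1 + |\alpha_k|) \le |\alpha_k|$.

```python
from fractions import Fraction
from math import exp

def partial_product(alpha, n):
    """Exact value of the product of (1 + alpha(k)) for k = 0..n."""
    p = Fraction(1)
    for k in range(n + 1):
        p *= 1 + alpha(k)
    return p

def harmonic(k):
    return Fraction(1, k + 1)           # sum of |alpha_k| diverges

def squares(k):
    return Fraction(1, (k + 1) ** 2)    # sum of |alpha_k| converges

print([partial_product(harmonic, n) for n in (0, 1, 2, 50)])   # n + 2 each time
bound = exp(sum(float(squares(k)) for k in range(201)))
print(float(partial_product(squares, 200)), bound)
```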
3.5.6 Corollary. If $\forall k \in N_0$, $\alpha_k \ge 0$, or $\forall k \in N_0$, $\alpha_k \le 0$, then $\prod_{k \ge 0} (1 + \alpha_k)$ is convergent $\Leftrightarrow$ the series $\sum_{k \ge 0} (\alpha_k)$ is convergent.
D Exercises
1. If $\alpha_{2k} = 1/(k + 2)$, $\alpha_{2k+1} = -1/(k + 2)$, show that $\prod_{k \ge 0} (1 + \alpha_k)$ is conditionally convergent. Show that this product has the same value as the absolutely convergent product
$$\prod_{k=2}^{\infty} \left( 1 - \frac{1}{k^2} \right).$$
2. Generalizing Exercise 1, let $\alpha_{2k}(x) = x/(k + 1)$, $\alpha_{2k+1}(x) = -x/(k + 1)$. Show that the sequence defined by
$$\prod_{k=0}^{n} \{1 + \alpha_k(x)\}$$
is uniformly convergent for $x$ in a bounded set in $R$, and the limit is the same as the limit defined by the sequence
$$\prod_{k=1}^{n} \left\{ 1 - \frac{x^2}{k^2} \right\}.$$
3. If the infinite product $\prod_{k \ge 0} (|1 + \alpha_k|)$ is convergent, is it necessarily true that the infinite product $\prod_{k \ge 0} (1 + \alpha_k)$ is absolutely convergent, or simply convergent?
4.
Show that the terms of an absolutely convergent infinite product
may be rearranged in any way to obtain another absolutely convergent

infinite product that converges to the same limit as the original product.
5. Show that the infinite product $\prod_{k \ge 0} (1 + \alpha_k)$, $1 + \alpha_k > 0$, $\forall k \in N_0$, is conditionally convergent $\Leftrightarrow$ the infinite series $\sum_{k \ge 0} (\log(1 + \alpha_k))$ is conditionally convergent.
6. Suppose the infinite product $\prod_{k \ge 0} (1 + \alpha_k)$ is conditionally convergent. Use the results of Exercise 5 to show that $\forall a \in R$, $a \neq 0$, there is a rearrangement of the product which converges to $a$.
CHAPTER 4
DIFFERENTIATION
4.1
THE DERIVATIVE CONCEPT
When speaking about the derivative of a function at a point, it is usual

to take the domain of the function an open interval or an open set, and,
of course, the point to be in this open set. However, situations often
arise where functions are defined on closed intervals, and it is conven
ient to talk about the derivatives of the functions at the end points of
the intervals. It is for this reason that we give the following slightly
more general definition.
4.1.1 Definition. A function $f$ is said to have a derivative at $a \Leftrightarrow a \in \mathcal{D}(f)$, $a$ is an accumulation point of $\mathcal{D}(f)$, and
$$\lim_{x \to a} \frac{f(x) - f(a)}{x - a}$$
exists. In case this limit exists, it is called the derivative of $f$ at $a$ and is denoted by any one of the symbols '$f'(a)$', '$df(a)/dx$', or '$Df(a)$'. If $f$ is differentiable at every point of its domain it is called differentiable.
The derivative of a function $f$ is that function $f'$ (or $df/dx$ or $Df$) with domain $\mathcal{D}(f') = \{x : f'(x) \text{ exists}\}$ (possibly void) whose value at the point $a \in \mathcal{D}(f')$ is the derivative of $f$ at $a$.
From a logical point of view the notation $Df$, for the derivative of a function, is the most suitable. For we can consider $D$ as a function with domain the collection $\mathcal{F}$ of all real-valued functions each having domain in $R$ (including that function whose domain is the null set), and the range of $D$ is also in $\mathcal{F}$. Then we can define
$$D^2 = D \circ D, \qquad D^3 = D^2 \circ D,$$
and so on. This "definition by induction" is made precise in the following way. It can be proved by induction that there exists a unique function $F$ with domain $N$ and range in the collection of functions each having domain $\mathcal{F}$ and range in $\mathcal{F}$ so that
$$F(1) = D, \qquad F(n + 1) = F(n) \circ D.$$
If we set $D^n = F(n)$, then $\forall f \in \mathcal{F}$ we call $D^n f$ the nth derivative of $f$. Other notations are '$f^{(n)}$' and '$d^n f/dx^n$'. We also set $D^0 f = f$.
4.1.2 Definition. We shall say that a function $f$ is n-times differentiable at $a \Leftrightarrow a \in \mathcal{D}(f^{(n)})$, and that $f$ is n times differentiable $\Leftrightarrow \mathcal{D}(f) = \mathcal{D}(f^{(n)})$. A function $f$ will be said to be infinitely differentiable at $a \Leftrightarrow \forall n \in N_0$, $a \in \mathcal{D}(f^{(n)})$, and $f$ is said to be infinitely differentiable $\Leftrightarrow \forall n \in N_0$, $\mathcal{D}(f) = \mathcal{D}(f^{(n)})$.
4.1.3 Theorem. If $f'(a)$ exists, then $f$ is continuous at $a$.
Proof. By the definition of $f'(a)$, $\exists \delta' > 0$ so that $0 < |x - a| < \delta'$ & $x \in \mathcal{D}(f) \Rightarrow$
$$\left| \frac{f(x) - f(a)}{x - a} - f'(a) \right| < 1.$$
Multiply both sides of this inequality by $|x - a|$ and then use the triangle inequality. This gives
$$|f(x) - f(a)| < (1 + |f'(a)|) |x - a|.$$
Now, for a given $\varepsilon > 0$, take $\delta < \min(\delta', \varepsilon/(1 + |f'(a)|))$. Then the preceding inequality implies that
$$|f(x) - f(a)| < \varepsilon,$$
provided $|x - a| < \delta$ & $x \in \mathcal{D}(f)$.
The converse of this theorem is not true. Namely, it is not true that
if a function is continuous at a point, then it is differentiable at the
point. The simplest example that illustrates this is given by the function
$$f(x) = |x|.$$
This is continuous at 0, but
$$\lim_{x \to 0} \frac{|x|}{x}$$
does not exist. By adding together a finite number of translations of

the above function we can get a continuous function whose derivative
does not exist at a finite number of points. By modifying this process
and using what essentially amounts to an infinite number of transla
tions of the absolute value function, we can construct a continuous func
tion with domain R that does not have a derivative at any point. We
shall carry out this construction after we give an example on differ
entiation.
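For the absolute value function the failure of the limit at 0 is visible directly from the difference quotients. The following small sketch is illustrative; the helper name `quotient` and the chosen step sizes are assumptions.

```python
def quotient(f, a, h):
    """Difference quotient (f(a + h) - f(a)) / h."""
    return (f(a + h) - f(a)) / h

# Approach 0 from the right and from the left with shrinking steps.
right = [quotient(abs, 0.0, 10.0 ** -k) for k in range(1, 6)]
left = [quotient(abs, 0.0, -(10.0 ** -k)) for k in range(1, 6)]
print(right)
print(left)
```

The quotients from the right are all 1 and those from the left are all $-1$, so no single limit can exist as $h \to 0$.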
THE DERIVATIVE OF THE LOGARITHM FUNCTION
As an example of the process of differentiation, let us show that $\log_a$ is differentiable and compute its derivative at any point $x \in R^+$. We shall suppose that all the standard additive and multiplicative properties have been proved, as, for example,
$$\log_a xy = \log_a x + \log_a y, \qquad y \log_a x = \log_a x^y, \qquad -\log_a x = \log_a x^{-1}.$$
Indeed, these properties are almost immediate consequences of the definition of $\log_a$ as the inverse of $e_a$ and the additive-multiplicative properties of the generalized exponential function (see the Exercises of Section 2.3).
We form the differential quotient of $\log_a x$ and attempt to pass to the limit. We have
$$\frac{\log_a(x + h) - \log_a x}{h} = \log_a \left( 1 + \frac{h}{x} \right)^{1/h}. \qquad (4.1.1)$$
Let $v = x/h$; then as $h \searrow 0$, $v \nearrow \infty$. We shall show that
$$\lim_{v \to \infty} \left( 1 + \frac{1}{v} \right)^v \qquad (4.1.2)$$
exists and shall designate this limit by '$e$'. From this we will be able to show immediately that if $h \nearrow 0$, then also
$$\lim_{v \to -\infty} \left( 1 + \frac{1}{v} \right)^v = e.$$
Assuming, for the moment, that we have proved these facts, we have from (4.1.1) and the fact that $\log_a$ is continuous,
$$D \log_a x = \log_a e^{1/x} = \frac{1}{x} \log_a e.$$
To show that the limit in (4.1.2) exists, we shall first show that the sequence with values
$$a_0 = 1, \qquad a_n = \left( 1 + \frac{1}{n} \right)^n, \quad n \in N,$$
is monotone increasing and bounded above (see Exercise 8 of Section 1.9). Expanding by the binomial theorem we get
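$$\left( 1 + \frac{1}{n} \right)^n = 1 + n \cdot \frac{1}{n} + \frac{n(n-1)}{2!} \cdot \frac{1}{n^2} + \cdots + \frac{n(n-1) \cdots 1}{n!} \cdot \frac{1}{n^n}$$
$$= 1 + 1 + \frac{1}{2!} \left( 1 - \frac{1}{n} \right) + \frac{1}{3!} \left( 1 - \frac{1}{n} \right) \left( 1 - \frac{2}{n} \right) + \cdots + \frac{1}{n!} \left( 1 - \frac{1}{n} \right) \cdots \left( 1 - \frac{n-1}{n} \right). \qquad (4.1.3)$$
If we use $n + 1$ instead of $n$ in (4.1.3), we see first that for a fixed $k \le n$,
$$1 - \frac{k}{n} < 1 - \frac{k}{n+1},$$
and second, the binomial expansion for
$$\left( 1 + \frac{1}{n+1} \right)^{n+1}$$
has one more positive term than the corresponding expansion for $n$. Hence
$$\left( 1 + \frac{1}{n} \right)^n < \left( 1 + \frac{1}{n+1} \right)^{n+1}.$$
Next, we see from (4.1.3) that
$$\left( 1 + \frac{1}{n} \right)^n \le 1 + 1 + \frac{1}{2!} + \cdots + \frac{1}{n!}. \qquad (4.1.4)$$
Since $k! = 2 \cdot 3 \cdots k \ge 2^{k-1}$ for $k \ge 2$, if we use this inequality in (4.1.4) we get
$$\left( 1 + \frac{1}{n} \right)^n \le 1 + 1 + \frac{1}{2} + \cdots + \frac{1}{2^{n-1}} = 1 + \frac{1 - (1/2)^n}{1 - (1/2)} < 3.$$
This establishes the fact that this sequence is bounded. Consequently, $(a_n)$ is monotone and bounded and its limit exists. We denote the limit by '$e$'; that is,
$$e = \lim_{n \to \infty} \left( 1 + \frac{1}{n} \right)^n.$$
Suppose now that $n \le v \le n + 1$; then
$$\left( 1 + \frac{1}{n+1} \right)^n \le \left( 1 + \frac{1}{v} \right)^v \le \left( 1 + \frac{1}{n} \right)^{n+1}.$$
(Why?) Now,
$$\lim_{n \to \infty} \left( 1 + \frac{1}{n} \right)^{n+1} = \lim_{n \to \infty} \left( 1 + \frac{1}{n} \right)^n \cdot \lim_{n \to \infty} \left( 1 + \frac{1}{n} \right) = e,$$
$$\lim_{n \to \infty} \left( 1 + \frac{1}{n+1} \right)^n = \lim_{n \to \infty} \left( 1 + \frac{1}{n+1} \right)^{n+1} \cdot \lim_{n \to \infty} \left( 1 + \frac{1}{n+1} \right)^{-1} = e.$$
Therefore,
$$\lim_{v \to \infty} \left( 1 + \frac{1}{v} \right)^v = \lim_{n \to \infty} \left( 1 + \frac{1}{n} \right)^n = e.$$
If $h \nearrow 0$, then we have, with $u = v - 1$,
$$\lim_{v \to -\infty} \left( 1 + \frac{1}{v} \right)^v = \lim_{v \to \infty} \left( 1 - \frac{1}{v} \right)^{-v} = \lim_{u \to \infty} \left( 1 + \frac{1}{u} \right)^u \left( 1 + \frac{1}{u} \right) = e.$$
The function whose values are $e^x$ is usually called the exponential function and $\log_e x$ is called the natural logarithm of $x$ and usually is simply denoted by '$\log x$'.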
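The two facts established for the sequence $a_n = (1 + 1/n)^n$, monotonicity and the bound 3, as well as the formula $D \log x = 1/x$, can be spot-checked numerically. This is only a sketch; the names `a`, `vals`, and `log_quotient`, the range of $n$, and the step size are assumptions chosen for illustration, and exact rational arithmetic is used for the sequence so that rounding cannot disturb the comparisons.

```python
from fractions import Fraction
from math import e, log

def a(n):
    """Exact value of (1 + 1/n)**n as a rational number."""
    return (1 + Fraction(1, n)) ** n

vals = [a(n) for n in range(1, 41)]
increasing = all(x < y for x, y in zip(vals, vals[1:]))
bounded = all(v < 3 for v in vals)
print(increasing, bounded, float(vals[-1]), e)

def log_quotient(x, h):
    """Difference quotient of the natural logarithm at x."""
    return (log(x + h) - log(x)) / h

print(log_quotient(2.0, 1e-6))   # close to 1/2
```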
A NOWHERE-DIFFERENTIABLE CONTINUOUS FUNCTION
We shall now give an example of a continuous function defined on R

that does not have a derivative at any point of R. The idea is to con
struct a sequence of "sawtooth" functions where the spacings of the
"teeth" get finer and finer. The construction is due essentially to B. L.
van der Waerden. The variation we use is taken from the book, A Primer
of Real Functions by Ralph P. Boas, Jr., The Mathematical Association of
America, 1960, where it is attributed to M. Mikolás. The reader is

referred to this book for other references on this problem.
We first obtain the following necessary condition for a function to be
differentiable at a point in its domain:
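If $f$ is differentiable at $a \in \mathcal{D}(f)$, then $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $\forall t_1, t_2, s_1, s_2 \in \mathcal{D}(f)$ with $t_1 \le a < t_2$, $s_1 \le a < s_2$, and $|t_1 - t_2| < \delta$ and $|s_1 - s_2| < \delta$, we have
$$\left| \frac{f(t_2) - f(t_1)}{t_2 - t_1} - \frac{f(s_2) - f(s_1)}{s_2 - s_1} \right| < \varepsilon. \qquad (4.1.5)$$
To prove this we know that $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $\forall x \in \mathcal{D}(f)$ with $0 < |x - a| < \delta$ we have
$$\left| \frac{f(x) - f(a)}{x - a} - f'(a) \right| < \varepsilon/2.$$
In other words, we can write
$$\frac{f(x) - f(a)}{x - a} = f'(a) + \eta(x),$$
where $|\eta(x)| < \varepsilon/2$. Thus, if $t_1 \neq a$, we have
$$f(t_2) - f(t_1) = f'(a)(t_2 - t_1) + \eta(t_2)(t_2 - a) + \eta(t_1)(a - t_1).$$
Hence, since $(t_2 - a) + (a - t_1) = t_2 - t_1$, we get
$$\left| \frac{f(t_2) - f(t_1)}{t_2 - t_1} - f'(a) \right| < \varepsilon/2.$$
If $t_1 = a$, this inequality is automatic. We get the same inequality when $t_1$ and $t_2$ are replaced by $s_1$ and $s_2$ and thus an application of the triangle inequality gives (4.1.5).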
Let us now set
$$g(x) = |x|, \qquad x \in [-1/2, 1/2],$$
and extend $g$ periodically to $R$. That is, we set
$$f_0(x) = g(x - k) \Leftrightarrow x \in [k - 1/2, k + 1/2], \quad k \in Z.$$
The function $f_0$ has domain $R$, is continuous, and is periodic of period 1; that is, $\forall x \in R$,
$$f_0(x + 1) = f_0(x).$$
Now, for $\forall k \in N_0$ and $\forall x \in R$ let us set
$$f_k(x) = f_0(2^k x).$$
Each function $f_k$ is continuous and periodic of period $1/2^k$; that is, $\forall x \in R$,
$$f_k(x + 1/2^k) = f_k(x).$$
Figure 4.1.1
The graphs of $f_0$ and $f_1$ restricted to $[0, 1]$ are shown in Fig. 4.1.1. Let us note from the definition of $f_k$ that if $x \in [p/2^k, p/2^k + 1/2^{k+1}]$, then
$$f_k(x) = 2^k x - p,$$
and if $x \in [p/2^k + 1/2^{k+1}, (p + 1)/2^k]$, then
$$f_k(x) = -2^k x + p + 1.$$
We now set
$$f(x) = \sum_{k=0}^{\infty} f_k(x)/2^k.$$
By the Weierstrass M Test this is a uniformly convergent series of continuous functions and hence $f$ is continuous. Let $a \in R$, $\delta > 0$, $1/2^n < \delta$, and $m$ be that integer so that $m/2^n \le a < (m + 1)/2^n$. Set $b_1 = m/2^n$, $b_2 = (m + 1)/2^n$, and $b = (b_1 + b_2)/2$. If $k > n$, then since $f_k$ has period $1/2^k$, we have $f_k(b_1) = f_k(b) = f_k(b_2)$. If $k < n$, let $p$ be that integer so that $p/2^{k+1} \le b_1 < (p + 1)/2^{k+1}$. Then, of course, $p/2^{k+1} < b < (p + 1)/2^{k+1}$ and $p/2^{k+1} < b_2 \le (p + 1)/2^{k+1}$, so that $b_1$, $b$, and $b_2$ all lie in an interval on which $f_k$ is linear. If $p$ is even we have, for $j \in (1, 2)$,
$$\frac{f_k(b_2) - f_k(b_1)}{b_2 - b_1} = \frac{f_k(b_j) - f_k(b)}{b_j - b} = 2^k,$$
and, if $p$ is odd,
$$\frac{f_k(b_2) - f_k(b_1)}{b_2 - b_1} = \frac{f_k(b_j) - f_k(b)}{b_j - b} = -2^k.$$
In either event, if $k > n$ or $k < n$, we have
$$\frac{f_k(b_2) - f_k(b_1)}{b_2 - b_1} - \frac{f_k(b_j) - f_k(b)}{b_j - b} = 0.$$
Hence we have
$$\frac{f(b_2) - f(b_1)}{b_2 - b_1} - \frac{f(b_j) - f(b)}{b_j - b} = \frac{1}{2^n} \left[ \frac{f_n(b_2) - f_n(b_1)}{b_2 - b_1} - \frac{f_n(b_j) - f_n(b)}{b_j - b} \right] = \pm 1.$$
Now, either $b_1 \le a < b$ or $b \le a < b_2$. In either event the above equality shows that we cannot have (4.1.5) for $\varepsilon = 1/2$, say, regardless of how small $\delta$ is taken. Thus $f$ cannot be differentiable at $a$.
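The key equality of the argument can be verified in exact arithmetic. The sketch below is an illustration under stated assumptions: the names `f0`, `f_upto`, and `quotient_gap` are hypothetical, the point $a = 1/3$ is an arbitrary choice, and the infinite series is replaced by the partial sum up to $k = n$, which is legitimate at the three points $b_1$, $b$, $b_2$ because every term with $k > n$ vanishes there.

```python
from fractions import Fraction
from math import floor

def f0(y):
    """Distance from y to the nearest integer (the periodic extension of |x|)."""
    r = y - floor(y)
    return min(r, 1 - r)

def f_upto(x, n):
    """Partial sum of f_k(x) / 2**k for k = 0..n, with f_k(x) = f0(2**k * x)."""
    return sum(f0(2 ** k * x) / 2 ** k for k in range(n + 1))

def quotient_gap(a, n):
    """Difference of the two difference quotients compared in the proof."""
    m = floor(a * 2 ** n)
    b1, b2 = Fraction(m, 2 ** n), Fraction(m + 1, 2 ** n)
    b = (b1 + b2) / 2
    f = lambda x: f_upto(x, n)       # terms with k > n vanish at b1, b, b2
    bj = b1 if a < b else b2         # b1 <= a < b, or b <= a < b2
    return (f(b2) - f(b1)) / (b2 - b1) - (f(bj) - f(b)) / (bj - b)

a = Fraction(1, 3)
gaps = [quotient_gap(a, n) for n in range(1, 13)]
print(gaps)   # every entry is +1 or -1, however large n is
```

Since the gap never shrinks as $n$ grows, condition (4.1.5) fails at $a$ for $\varepsilon = 1/2$, as the text concludes.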
D Exercises
1. Suppose $f$ is a monotone increasing [decreasing] differentiable function with an interval domain whose derivative does not vanish at any point. Show that $f^{-1}$ is also differentiable and compute its derivative in terms of the derivative of $f$.
2. Use the results of Exercise 1 to show that the functions given in the following are differentiable and compute their derivatives:
(a) $f(x) = a^x$, $a > 0$, $x \in R$.
(b) $f(x) = x^{1/n}$, $x > 0$, $n \in N$. You may assume as known that if $g(x) = x^n$, $n \in N$, then $g'(x) = nx^{n-1}$.
3. Suppose $f$ is a differentiable function with domain $[a, b]$ and $\forall x \in [a, b]$, $f'(x) > 0$. Show that $f$ is monotone increasing. [Hint: Suppose that $\exists \alpha, \beta \in [a, b]$ so that $\alpha < \beta$ and $f(\alpha) > f(\beta)$. Let $A = \{x : x > \alpha \ \&\ f(\alpha) > f(x)\}$ and set $\gamma = \text{g.l.b. } A$. Show that $f'(\gamma) \le 0$.]
(bn)
Suppose
4.
is a sequence whose value at
bn = 1+1
Show that
bn
- e.
1
+ f"
= 1 + I + ...!._
2!
1m
[Hi nt:
For
+ ...!._
n!
( ; r=bnm
1+
Then bnm
5.
- bn
(O!=l) .
n.
Hence obtain an estimate for the value of
valid to three decimal places.
E N0 is given by
and
Tnm - Tn;;;. 0
as
;;;. n,
m -
- 1m
that is
set
),
nm.
oo. ]
For what values of a will the sequence whose terms are
converge? Compute the limit in each case where it converges.
6.
Let us set
f(x)
-1/x
oF-
0,
0.
0,
Show thatj<n>(o) exists and i s zero
Vn
E N. You may assume as known
the results of Exercise 4 of Section 2.3.
7. Define two functions $f$ and $g$ with domain $R$ as follows: $f(0) = g(0) = 0$ and, if $x \neq 0$, $f(x) = x \sin(1/x)$, $g(x) = x^2 \sin(1/x)$. Show that $f'(0)$ does not exist, $g'(0) = 0$ but $\lim_{x \to 0} g'(x)$ does not exist. You may assume as known all trigonometric identities and all differentiation formulas for trigonometric functions.
4.2
DIFFERENTIATION RULES
In this section we shall obtain the rules for differentiating various

combinations of functions. Presumably most of these rules are well
known to the reader from his previous mathematical studies. However,

some of the discussion we present here may clarify some points in the
elementary calculus.
4.2.1 Theorem. If $f$ and $g$ are both differentiable at $a$, and $a$ is an accumulation point of $\mathcal{D}(f) \cap \mathcal{D}(g)$, then $f + g$, $fg$, and, if $g(a) \neq 0$, $1/g$ are differentiable at $a$. Moreover,
(a) $(f + g)'(a) = f'(a) + g'(a)$.
(b) $(fg)'(a) = f(a)g'(a) + f'(a)g(a)$.
(c) $(1/g)'(a) = -g'(a)/g(a)^2$.
Proof. Formula (a) is an immediate consequence of the equation
$$\frac{(f + g)(a + h) - (f + g)(a)}{h} = \frac{f(a + h) - f(a)}{h} + \frac{g(a + h) - g(a)}{h},$$
and the fact that a limit of a sum is a sum of the limits. Clearly, $h$ is chosen so that $a + h \in \mathcal{D}(f) \cap \mathcal{D}(g)$.
Formula (b) is a consequence of the equation
$$\frac{(fg)(a + h) - (fg)(a)}{h} = f(a + h) \frac{g(a + h) - g(a)}{h} + g(a) \frac{f(a + h) - f(a)}{h},$$
the fact that $f$ is continuous at $a$, and the fact that a limit of a sum and product is the sum of the limits and the product of the limits, respectively.
Formula (c) is a consequence of the equation
$$\frac{(1/g)(a + h) - (1/g)(a)}{h} = \frac{-1}{g(a + h)g(a)} \cdot \frac{g(a + h) - g(a)}{h},$$
the fact that $g$ is continuous at $a$, and the fact that the limit of a product is the product of the limits. Note that since $g$ is continuous and $g(a) \neq 0$, $g(a + h) \neq 0$ for all sufficiently small $h$ for which $a + h \in \mathcal{D}(g)$.
4.2.2 Theorem (Chain Rule). If $f$ and $g$ are functions with $\mathcal{R}(g) \subset \mathcal{D}(f)$, and $g'(a)$ and $f'(g(a))$ exist, then $(f \circ g)'(a)$ exists and
$$(f \circ g)'(a) = f'(g(a))g'(a).$$
Proof. The most natural way to begin to construct a proof is to consider the equality
$$\frac{(f \circ g)(a + h) - (f \circ g)(a)}{h} = \frac{f(g(a + h)) - f(g(a))}{g(a + h) - g(a)} \cdot \frac{g(a + h) - g(a)}{h}.$$
Since $g$ has a derivative at $a$, it is continuous at $a$ and hence as $h \to 0$, $g(a + h) - g(a) \to 0$. Consequently, taking limits on both sides of the above equality we should get the formula stated in the theorem. The one possible difficulty with this method is that $g(a + h) - g(a)$ could be zero for an infinite number of values of $h$ in every neighborhood of zero. Hence it would not always be possible to divide by the quantity $g(a + h) - g(a)$. Consequently, we must proceed in a somewhat different way.
Since $f$ is differentiable at $g(a)$, $\forall \varepsilon_1 > 0$, $\exists \delta_1 > 0$ so that $|y - g(a)| < \delta_1$ & $y \in \mathcal{D}(f) \Rightarrow$
$$|f(y) - f(g(a)) - f'(g(a))(y - g(a))| \le \varepsilon_1 |y - g(a)|. \qquad (4.2.1)$$
Also, since $g$ is differentiable at $a$, $\forall \varepsilon_1 > 0$, $\exists \delta > 0$ so that $|x - a| < \delta$ & $x \in \mathcal{D}(g) \Rightarrow |g(x) - g(a)| < \delta_1$ and
$$|g(x) - g(a) - g'(a)(x - a)| \le \varepsilon_1 |x - a|. \qquad (4.2.2)$$
Hence, if $|x - a| < \delta$ & $x \in \mathcal{D}(g)$, we get from (4.2.1) and (4.2.2),
$$|f(g(x)) - f(g(a)) - f'(g(a))g'(a)(x - a)|$$
$$\le |f(g(x)) - f(g(a)) - f'(g(a))(g(x) - g(a))| + |f'(g(a))| \, |g(x) - g(a) - g'(a)(x - a)|$$
$$\le \varepsilon_1 |g(x) - g(a)| + \varepsilon_1 |f'(g(a))| \, |x - a|. \qquad (4.2.3)$$
From (4.2.2) it follows from the triangle inequality that
$$|g(x) - g(a)| \le (\varepsilon_1 + |g'(a)|) |x - a|.$$
Using this in the last line of (4.2.3) the latter becomes less than or equal to
$$\varepsilon_1 (\varepsilon_1 + |g'(a)| + |f'(g(a))|) |x - a|.$$
For every $\varepsilon > 0$, take $\varepsilon_1 < \min(1, \varepsilon/(1 + |g'(a)| + |f'(g(a))|))$ and a corresponding $\delta$ so that $|x - a| < \delta$ & $x \in \mathcal{D}(g)$ implies
$$|f \circ g(x) - f \circ g(a) - f'(g(a))g'(a)(x - a)| \le \varepsilon |x - a|. \qquad (4.2.4)$$
If $x \neq a$, we can divide both sides of (4.2.4) by $|x - a|$, which shows that $f \circ g$ has a derivative at $a$ and indeed the derivative at that point is given by the formula in the theorem.
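The difficulty mentioned at the start of the proof is not hypothetical. The sketch below uses an example chosen for illustration, not taken from the proof itself: $g(x) = x^2 \sin(1/x)$ with $g(0) = 0$ and $f = \exp$; the names `g` and `composite_quotient` are assumptions. In exact arithmetic $g(h) - g(0)$ vanishes at every $h = 1/(k\pi)$, so the naive factorization breaks down near 0, yet the difference quotient of $f \circ g$ at 0 still tends to $f'(g(0))g'(0) = 1 \cdot 0 = 0$.

```python
from math import expm1, sin

def g(x):
    """g(x) = x**2 * sin(1/x) for x != 0, and g(0) = 0; here g'(0) = 0."""
    return 0.0 if x == 0 else x * x * sin(1.0 / x)

def composite_quotient(h):
    """Difference quotient of exp(g(x)) at 0: (exp(g(h)) - exp(g(0))) / h."""
    return expm1(g(h)) / h     # expm1(t) = exp(t) - 1, and exp(g(0)) = 1

for k in range(1, 7):
    h = 10.0 ** -k
    print(h, composite_quotient(h), composite_quotient(-h))
```

The printed quotients shrink roughly like $|h|$, consistent with the estimate (4.2.4), which never divides by $g(x) - g(a)$.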

There is another theorem, very closely related to the chain rule,
which is sometimes confused with the chain rule in elementary calculus.
For this reason we feel it is worthwhile to point it out.
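4.2.3 Theorem. If $\mathcal{R}(g) \subset \mathcal{D}(f)$, $(f \circ g)'(a)$ and $f'(g(a))$ exist, $f'(g(a)) \neq 0$, and $g$ is continuous at $a$, then $g'(a)$ exists and
$$g'(a) = (f \circ g)'(a)/f'(g(a)).$$
Proof. Since $(f \circ g)'(a)$ exists, $\forall \varepsilon_1 > 0$, $\exists \delta_1 > 0$ so that $|x - a| < \delta_1$ and $x \in \mathcal{D}(g) \Rightarrow$
$$|f \circ g(x) - f \circ g(a) - (f \circ g)'(a)(x - a)| \le \varepsilon_1 |x - a|. \qquad (4.2.5)$$
Also, since $f'(g(a))$ exists, $\exists \delta_2 > 0$ so that $|y - g(a)| < \delta_2$ and $y \in \mathcal{D}(f) \Rightarrow$
$$|f(y) - f(g(a)) - f'(g(a))(y - g(a))| \le \varepsilon_1 |y - g(a)|. \qquad (4.2.6)$$
Since $g$ is continuous at $a$, $\exists \delta_3 > 0$ so that $|x - a| < \delta_3$ and $x \in \mathcal{D}(g) \Rightarrow |g(x) - g(a)| < \delta_2$. Hence if $|x - a| < \delta_3$ and $x \in \mathcal{D}(g)$, we have from (4.2.6)
$$|f(g(x)) - f(g(a)) - f'(g(a))(g(x) - g(a))| \le \varepsilon_1 |g(x) - g(a)|. \qquad (4.2.6')$$
Let us put $\delta = \min(\delta_1, \delta_3)$ and $m = 1/|f'(g(a))|$. If we use (4.2.5) and (4.2.6'), then $\forall x \in \mathcal{D}(g)$ with $|x - a| < \delta$, we get
$$\left| g(x) - g(a) - \frac{(f \circ g)'(a)}{f'(g(a))} (x - a) \right| \le \varepsilon_1 m \{ |g(x) - g(a)| + |x - a| \}. \qquad (4.2.7)$$
Now set $M = |(f \circ g)'(a)|$ and use the triangle inequality on (4.2.7) to get
$$(1 - m\varepsilon_1) |g(x) - g(a)| \le m(\varepsilon_1 + M) |x - a|.$$
Take $\varepsilon_1 < 1/2m$, and we get
$$|g(x) - g(a)| \le (1 + 2Mm) |x - a|.$$
If we use this inequality in the right side of (4.2.7) and set $A = 2m(1 + mM)$, we get
$$\left| g(x) - g(a) - \frac{(f \circ g)'(a)}{f'(g(a))} (x - a) \right| \le \varepsilon_1 A |x - a|. \qquad (4.2.8)$$
Thus $\forall \varepsilon > 0$, take $\varepsilon_1 < \min(\varepsilon/A, 1/2m)$, and we see that $\exists \delta > 0$ so that $0 < |x - a| < \delta$ and $x \in \mathcal{D}(g) \Rightarrow$
$$\left| \frac{g(x) - g(a)}{x - a} - \frac{(f \circ g)'(a)}{f'(g(a))} \right| < \varepsilon. \qquad (4.2.9)$$
REMARKS. In Theorem 4.2.2 we demanded that $\mathcal{R}(g) \subset \mathcal{D}(f)$ to ensure that $a$ is an accumulation point of $\mathcal{D}(f \circ g)$ so that we could talk about $(f \circ g)'(a)$. In Theorem 4.2.3 we needed $\mathcal{R}(g) \subset \mathcal{D}(f)$, since an examination of the proof reveals that otherwise (4.2.9) would not necessarily be valid for all $x \in \mathcal{D}(g)$ for which $0 < |x - a| < \delta$.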
D Exercises
1. Use the rules for differentiation to establish the following:
(a) If $f(x) = x^n$, $\forall x \in R$ and fixed $n \in N_0$, then $f$ is differentiable.
(b) If $c$ is a constant and $f$ is differentiable at $a$, then $cf$ is differentiable at $a$.
(c) Any polynomial function of degree $n \in N_0$,
$$p(x) = \sum_{k=0}^{n} a_k x^k,$$
is differentiable.
2. Use Theorem 4.2.3 to solve Exercise 1 of Section 4.1.
3. Assume the results of Exercise l of Section 4.1 as known.

Supposefand g are functions with se(g) C >(!), f'(y) exists and is
continuous in an open interval around g(a), J'(g(a)) # 0, and
(J g)' (a) exists. Use the chain rule (Theorem 4.2.2) and the results
of Exercise 1 of Section 4.1 to show that g'(a) exists and
0
g'(a) = (f g)'(a)/J'(g(a)).
0
4. Compute the derivative of e (x ) =ex and assuming the deriva

tive of the logarithm function is known, use the chain rule to show
that the following functions are differentiable and compute their
derivatives:
a> 0, Vx ER.
(a) J(x) = ax = e x loga'
Vx> 0, a ER.
(b) J(x ) = xa =ea logx'
5. Assuming that the logarithm is a differentiable function, use

Theorem 4.2.3 (and not the chain rule) to show that the functions in
(a) and (b) of Exercise 4 are differentiable.
6. Suppose that f and g are n times differentiable functions on
]a, b[. If h is the product off and g, prove Leibnitz's formula for the
nth derivative of h,
h<n>(x) =
n
) pn-k>(x)g/-l<>(x)'
(
k=O
where
4.3
MEAN VALUE THEOREMS
All the mean value theorems of the differential calculus are based on
two principles: (a) a continuous function on a compact set assumes a
maximum and a minimum, and (b) if a differentiable function is defined
in an open interval about a point where it has a local maximum or
minimum, then the derivative must be zero at this local maximum or
minimum.
4.3.1 Theorem. If $f$ is a function with $\mathcal{D}(f) = ]a, b[$, $a < b$, and if $f$ has a local maximum or local minimum at $c$, and if moreover $f'(c)$ exists, then $f'(c) = 0$.
Proof. Let us suppose that $f$ has a local minimum at $c$. Then, for all sufficiently small $h \neq 0$,
\[ \frac{f(c + h) - f(c)}{h} \begin{cases} \ge 0 & \text{if } h > 0, \\ \le 0 & \text{if } h < 0. \end{cases} \]
From this we see that the limit of the difference quotient, as $h \to 0$, must be zero.
4.3.2 Rolle's Theorem. If $f$ is a continuous function with $\mathcal{D}(f) = [a, b]$, $a < b$, $f(a) = f(b) = 0$, and $f$ is differentiable on $]a, b[$, then $\exists c \in ]a, b[$ so that $f'(c) = 0$.
Proof. Since $f$ is continuous on the compact set $[a, b]$, it has a maximum and a minimum on this interval. If the maximum and minimum are taken on at the end points, we have $f(x) = 0$ for every $x$ and hence the theorem is true. If $f$ has a local maximum or minimum at $c \in ]a, b[$, then by Theorem 4.3.1, $f'(c) = 0$.
4.3.3 Mean Value Theorem. If $f$ is a continuous function with $\mathcal{D}(f) = [a, b]$, $a < b$, and $f$ is differentiable on $]a, b[$, then $\exists c \in ]a, b[$ such that
\[ \frac{f(b) - f(a)}{b - a} = f'(c). \]
Proof. From a geometric point of view, the Mean Value Theorem says that there is a point on the graph of $f$ where the tangent line to $f$ is parallel to the line joining $(a, f(a))$ to $(b, f(b))$, Fig. 4.3.1. The equation of the straight line through the points $(a, f(a))$ and $(b, f(b))$ is
\[ y = f(a) + \frac{f(b) - f(a)}{b - a}(x - a). \]
Figure 4.3.1
The difference between $y$ and $f(x)$ is
\[ F(x) = f(a) + \frac{f(b) - f(a)}{b - a}(x - a) - f(x). \]
Now, $F$ is a function which is continuous on $[a, b]$, is differentiable on $]a, b[$, and $F(a) = F(b) = 0$. Hence we may apply Rolle's theorem to $F$ and find a $c \in ]a, b[$ such that
\[ F'(c) = \frac{f(b) - f(a)}{b - a} - f'(c) = 0. \]
4.3.4 Generalized Mean Value Theorem. If $f$ and $g$ are continuous functions defined on $[a, b]$, $a < b$, and have derivatives on $]a, b[$, then $\exists c \in ]a, b[$ such that
\[ g'(c)[f(b) - f(a)] = f'(c)[g(b) - g(a)]. \]
Proof. Set
\[ F(x) = [g(b) - g(a)][f(a) - f(x)] + [g(x) - g(a)][f(b) - f(a)]. \]
This is analogous to the formula we wrote down in the previous theorem, where $x - a$ has been replaced by $g(x) - g(a)$, $b - a$ has been replaced by $g(b) - g(a)$, and we may think about it as if we had multiplied through by the latter factor. Now $F$ is continuous on $[a, b]$, differentiable on $]a, b[$, and $F(a) = F(b) = 0$. Hence we apply Rolle's theorem and find a $c \in ]a, b[$ such that
\[ F'(c) = g'(c)[f(b) - f(a)] - f'(c)[g(b) - g(a)] = 0, \]
which concludes the proof.
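The intermediate point $c$ of the Generalized Mean Value Theorem can be located numerically by finding a zero of $F'$. The sketch below does this by bisection for the illustrative choice $f(x) = x^3$, $g(x) = x^2$ on $[1, 2]$; these functions and the helper `bisect_root` are assumptions made for the example. Here $F'(x) = 14x - 9x^2$, so the exact value is $c = 14/9$.

```python
def bisect_root(func, lo, hi, tol=1e-12):
    """Bisection for a root of func on [lo, hi]; assumes a sign change."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo > 0) == (f_mid > 0):
            lo, f_lo = mid, f_mid   # root lies in [mid, hi]
        else:
            hi = mid                # root lies in [lo, mid]
    return 0.5 * (lo + hi)

# Hypothetical example data (assumptions for illustration only).
a, b = 1.0, 2.0
f = lambda x: x**3
g = lambda x: x**2
df = lambda x: 3 * x**2
dg = lambda x: 2 * x

def F_prime(x):
    """F'(x) = g'(x)[f(b) - f(a)] - f'(x)[g(b) - g(a)]."""
    return dg(x) * (f(b) - f(a)) - df(x) * (g(b) - g(a))

c = bisect_root(F_prime, a, b)
print(c)   # close to 14/9
```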

REMARKS: If one uses the theory of determinants (which we shall develop in Section 6.6), there is an interesting way of looking at the proof of the Mean Value Theorem which leads to more general results. If we refer to Fig. 4.3.2, then the area of the triangle with vertices at the points $(a, f(a))$, $(b, f(b))$ and $(x, f(x))$ is one half the absolute value of the determinant
\[ F(x) = \begin{vmatrix} f(x) & x & 1 \\ f(a) & a & 1 \\ f(b) & b & 1 \end{vmatrix}. \]
Figure 4.3.2
Now, if $f$ is continuous on $[a, b]$ and differentiable on $]a, b[$, the same is true for the function $F$. Clearly, $F(a) = F(b) = 0$, so that by Rolle's theorem $\exists c \in ]a, b[$ so that $F'(c) = 0$. The derivative $F'(x)$ is obtained by differentiating the top row of the determinant. If we do this at $c$ and set the resulting determinant to zero, we get the Mean Value Theorem.
This procedure can be generalized in the following way. Suppose $f$, $g$, and $h$ are continuous on $[a, b]$ and differentiable on $]a, b[$. Set
\[ F(x) = \begin{vmatrix} f(x) & g(x) & h(x) \\ f(a) & g(a) & h(a) \\ f(b) & g(b) & h(b) \end{vmatrix}. \]
Then $F$ is continuous on $[a, b]$, differentiable on $]a, b[$, and $F(a) = F(b) = 0$. By Rolle's theorem, $\exists c \in ]a, b[$ so that $F'(c) = 0$. But $F'(x)$ is obtained by differentiating the top row of this determinant. Thus we get
\[ F'(c) = \begin{vmatrix} f'(c) & g'(c) & h'(c) \\ f(a) & g(a) & h(a) \\ f(b) & g(b) & h(b) \end{vmatrix} = 0. \]
Note that if $h(x) = 1$, this is the Generalized Mean Value Theorem.
Let us now turn to the problem of obtaining a mean value theorem for higher-order differences. Suppose that $f$ is a function defined on $[a, b]$ and for $x$ and $x + h$ in $[a, b]$ we define
\[ \Delta_h f(x) = f(x + h) - f(x). \]
If $x + 2h$ is also in $[a, b]$, then $\Delta_h(\Delta_h f(x))$ is defined. We call this $\Delta_h^2 f(x)$ and we have
\[ \Delta_h^2 f(x) = f(x + 2h) - 2f(x + h) + f(x). \]
If $n$ is a positive integer so that $x + nh$ is in $[a, b]$, we define $\Delta_h^n f(x)$ inductively by means of the formula
\[ \Delta_h^n f(x) = \Delta_h(\Delta_h^{n-1} f(x)). \]
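The inductive definition of the higher-order difference translates directly into a short recursive routine. The following is a minimal sketch; the function name `forward_difference` and the test function $x^3$ are our own illustrative choices.

```python
def forward_difference(f, x, h, n):
    """n-th order forward difference, defined inductively:
    Delta^0 f(x) = f(x),  Delta^n f(x) = Delta^(n-1) f(x + h) - Delta^(n-1) f(x)."""
    if n == 0:
        return f(x)
    return (forward_difference(f, x + h, h, n - 1)
            - forward_difference(f, x, h, n - 1))

cube = lambda t: t**3          # hypothetical test function
x, h = 1.0, 0.5

second = forward_difference(cube, x, h, 2)
explicit = cube(x + 2 * h) - 2 * cube(x + h) + cube(x)   # closed form for n = 2
third = forward_difference(cube, x, h, 3)                # equals 3! * h^3 for a cubic
print(second, explicit, third)
```

For the cubic the third difference is constant and equal to $3!\,h^3$, in agreement with the theorem that follows.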
4.3.5 Theorem. Suppose $f$ is continuous on $[a, b]$, and $n$ times differentiable on $]a, b[$. Then $\forall x \in [a, b]$ and $\forall h$ for which $x + nh \in [a, b]$, $\exists \theta$, $0 < \theta < 1$, so that
\[ \Delta_h^n f(x) = f^{(n)}(x + n\theta h) h^n. \]
Proof. Let $P(n)$ be the statement of the theorem. More precisely $P(n)$ should be stated in the following way: For every nondegenerate interval $[a, b]$ and for every continuous function $f$ on $[a, b]$ which is $n$ times differentiable on $]a, b[$ and $\forall x \in [a, b]$ and $\forall h$, if $x + nh \in [a, b]$, then $\exists \theta$ so that $0 < \theta < 1$ and
\[ \Delta_h^n f(x) = f^{(n)}(x + n\theta h) h^n. \]
If $h = 0$, the theorem is clearly true and therefore we shall suppose $h \neq 0$. Indeed, for the sake of argument we shall suppose that $h > 0$.
The statement $P(1)$ is a statement of the Mean Value Theorem and hence is true. Assume $P(n - 1)$ is true for $n > 1$ and set
\[ g(x) = \frac{\Delta_h f(x)}{h}. \]
The function $g$ is defined and continuous on $[a, b - h]$ and is $n - 1$ times differentiable on $]a, b - h[$. If $x + nh \in [a, b]$, then $x + (n - 1)h \in [a, b - h]$. Consequently, we may apply $P(n - 1)$ to $g$ and find a $\theta_1$ so that $0 < \theta_1 < 1$ and
\[ \Delta_h^{n-1} g(x) = g^{(n-1)}(x + (n - 1)\theta_1 h) h^{n-1}. \tag{4.3.1} \]
Now,
\[ g^{(n-1)}(x + (n - 1)\theta_1 h) = \frac{1}{h} \left\{ f^{(n-1)}(x + (n - 1)\theta_1 h + h) - f^{(n-1)}(x + (n - 1)\theta_1 h) \right\}. \]
Apply the Mean Value Theorem to the right side and we find a $\theta_2$ so that $0 < \theta_2 < 1$ and
\[ g^{(n-1)}(x + (n - 1)\theta_1 h) = f^{(n)}(x + (n - 1)\theta_1 h + \theta_2 h). \tag{4.3.2} \]
If we set $\theta = [(n - 1)\theta_1 + \theta_2]/n$, then it is clear that $0 < \theta < 1$. Further, since
\[ \Delta_h^{n-1} g(x) = \Delta_h^{n-1}(\Delta_h f(x)/h) = \Delta_h^n f(x)/h, \]
it follows from (4.3.1) and (4.3.2) that
\[ \Delta_h^n f(x) = f^{(n)}(x + n\theta h) h^n. \tag{4.3.3} \]
If $h < 0$, a similar argument will lead to the same conclusion (4.3.3). Consequently, we have shown that $P(n - 1) \Longrightarrow P(n)$, and using the axiom of induction completes the proof.
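For the exponential function the number $\theta$ of Theorem 4.3.5 can be computed explicitly, since $f^{(n)} = f$ and the equation $\Delta_h^n f(x) = e^{x + n\theta h} h^n$ can be solved for $\theta$ with a logarithm. The sketch below does this for the illustrative values $x = 0.2$, $h = 0.5$, $n = 3$ (our own choices) and checks that $0 < \theta < 1$.

```python
import math

def forward_difference(f, x, h, n):
    """n-th order forward difference, defined inductively."""
    if n == 0:
        return f(x)
    return (forward_difference(f, x + h, h, n - 1)
            - forward_difference(f, x, h, n - 1))

# Hypothetical sample values (assumptions for illustration only).
x, h, n = 0.2, 0.5, 3

delta = forward_difference(math.exp, x, h, n)
# Solve  delta = exp(x + n*theta*h) * h**n  for theta.
theta = (math.log(delta / h**n) - x) / (n * h)
print(theta)
```

Because $\Delta_h^n e^x = e^x (e^h - 1)^n$, the value found here equals $\log((e^h - 1)/h)/h$ and does not depend on $x$ or $n$; this is a special feature of the exponential function.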

As an application of the Generalized Mean Value Theorem, 4.3.4, we shall obtain a result that is an aid in computing limits of quotients of functions. We have seen in Chapter 2 that
\[ \lim_{x \to a} (f/g)(x) = \lim_{x \to a} f(x) \Big/ \lim_{x \to a} g(x), \]
provided the limits in the numerator and denominator on the right exist and $\lim_{x \to a} g(x) \neq 0$. Now, it may happen that $\lim_{x \to a} f(x) = 0$ and $\lim_{x \to a} g(x) = 0$. In this case it may still be possible to compute the limit of $f/g$ as $x \to a$. Or it may happen that $\lim_{x \to a} g(x) = \infty$ and it may be possible to compute the limit of the quotient. By $\lim_{x \to a} g(x) = \infty$ we mean that $\forall M$, $\exists \delta > 0$ so that $0 < |x - a| < \delta$ & $x \in \mathcal{D}(g) \Longrightarrow g(x) \ge M$. By $\lim_{x \to a} g(x) = -\infty$ we mean that $\forall M$, $\exists \delta > 0$ so that $0 < |x - a| < \delta$ & $x \in \mathcal{D}(g) \Longrightarrow g(x) \le M$.
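4.3.6 L'Hospital's Rule. Suppose $f$ and $g$ are differentiable on $[a, b[$, $a < b$, and $g'(x) \neq 0$ in this interval. If
(a) $\lim_{x \nearrow b} f(x) = 0 = \lim_{x \nearrow b} g(x)$,
or if
(b) $\lim_{x \nearrow b} g(x) = \infty$,
and if
\[ \lim_{x \nearrow b} (f'/g')(x) = l, \]
then
\[ \lim_{x \nearrow b} (f/g)(x) = l. \]
Proof. (a) Let us first give the proofs under the supposition that the left limits of $f$ and $g$ are zero at $b$. This being the case we may set $f(b) = g(b) = 0$ and thus extend $f$ and $g$ so as to be continuous on $[a, b]$. From the Mean Value Theorem, $\forall x \in [a, b[$ we get $g(x) - g(b) = g'(\xi)(x - b)$, $x < \xi < b$. Since, by hypothesis, $g'(\xi) \neq 0$, it follows that $g(x) - g(b) \neq 0$ and we may divide by it. Using the Generalized Mean Value Theorem 4.3.4, we get
\[ \frac{f(x)}{g(x)} = \frac{f(x) - f(b)}{g(x) - g(b)} = \frac{f'(\xi)}{g'(\xi)}, \qquad x < \xi < b. \]
Now, $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $0 < b - x < \delta \Longrightarrow 0 < b - \xi < \delta$ and
\[ \left| \frac{f(x)}{g(x)} - l \right| = \left| \frac{f'(\xi)}{g'(\xi)} - l \right| < \varepsilon. \]
(b) Let us now look at the situation where $\lim_{x \nearrow b} g(x) = \infty$. Let us set $h(x) = f(x) - l g(x)$ so that, by hypothesis,
\[ \lim_{x \nearrow b} (h'/g')(x) = 0. \]
Now, since $g(x) \to \infty$ as $x \nearrow b$, $\exists c_1 \in [a, b[$ so that $c_1 \le x < b \Longrightarrow g(x) > 0$, and for any $\varepsilon > 0$, $\exists c \ge c_1$ so that $c < x < b \Longrightarrow$
\[ \left| \frac{h(x) - h(c)}{g(x) - g(c)} \right| = \left| \frac{h'(\xi)}{g'(\xi)} \right| < \frac{\varepsilon}{2}, \qquad c < \xi < x. \]
Also, since $g(x) \to \infty$ as $x \nearrow b$, $\exists d$ so that $c < d < b$ and if $d < x < b$, then $0 < g(x) - g(c) < g(x)$ and $|h(c)/g(x)| < \varepsilon/2$. Thus, for $x \in ]d, b[$ we have
\[ \frac{|h(x)|}{g(x)} - \frac{|h(c)|}{g(x)} \le \frac{|h(x) - h(c)|}{g(x)} < \frac{|h(x) - h(c)|}{g(x) - g(c)} < \frac{\varepsilon}{2}. \]
Since $|h(c)/g(x)| < \varepsilon/2$ for $d < x < b$, for these $x$ we get
\[ \left| \frac{h(x)}{g(x)} \right| = \left| \frac{f(x)}{g(x)} - l \right| < \varepsilon. \]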
The previous theorem is clearly true for $b$ finite or infinite. There is also a corresponding theorem involving right limits and also a theorem for the case where $g(x) \to -\infty$ as $x \nearrow b$. We leave the details for the reader.
As an example of the use of L'Hospital's rule, let us consider
\[ h(x) = \frac{1 - \cos x}{\sin x}, \qquad \sin x \neq 0. \]
Let us suppose it is known that $\lim_{x \to 0} \cos x = 1$, $\lim_{x \to 0} \sin x = 0$, $D \sin x = \cos x$, and $D \cos x = -\sin x$. All the conditions of L'Hospital's rule are satisfied for the above quotient and
\[ \lim_{x \to 0} h(x) = \lim_{x \to 0} \frac{\sin x}{\cos x} = 0. \]
Sometimes it may be possible, and necessary, to apply L'Hospital's rule more than once in a given situation. For example, consider the function given by
\[ \frac{f(x)}{g(x)} = \frac{\sin x - x}{x^2}, \qquad x \neq 0. \]
Then we have
\[ \frac{f'(x)}{g'(x)} = \frac{\cos x - 1}{2x}. \]
Now, $f'(x) \to 0$ and $g'(x) \to 0$ as $x \to 0$ and $D^2 g(x) \neq 0$. Hence we may apply the rule to $f'/g'$ to get
\[ \lim_{x \to 0} \frac{f(x)}{g(x)} = \lim_{x \to 0} \frac{-\sin x}{2} = 0. \]
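Limits obtained from L'Hospital's rule are easy to spot-check numerically. The sketch below evaluates the quotients $(1 - \cos x)/\sin x$ and $(\sin x - x)/x^2$ at points approaching $0$; both should tend to $0$. The function names `q1` and `q2` and the sample points are illustrative assumptions, and such a table is only evidence for a limit, not a proof.

```python
import math

def q1(x):
    """(1 - cos x) / sin x, defined where sin x != 0."""
    return (1.0 - math.cos(x)) / math.sin(x)

def q2(x):
    """(sin x - x) / x^2, defined for x != 0."""
    return (math.sin(x) - x) / x**2

for x in (1e-1, 1e-2, 1e-3, 1e-4):
    print(x, q1(x), q2(x))
```

Near $0$ the printed values behave like $x/2$ and $-x/6$ respectively, which is consistent with both limits being zero.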
Finally, let us give an important application of the use of the Mean
Value Theorem 4.3.3. In Theorem 4.2.1 we showed that if two functions
are differentiable at a point, then the sum of the two functions is also
differentiable at that point and the derivative of the sum is the sum of
the derivatives. By the principle of induction this result can be stated
for any finite number of functions. Under suitable hypotheses the result
will extend for an infinite sum of functions. This is what we now wish
to establish.
4.3.7 Theorem. Suppose that $(f_n)$ is a function sequence for which each $f_n$ has the domain $]a, b[$ and is differentiable there. Suppose further that the sequence $(f_n')$ is uniformly convergent to a function $g$ and $\exists c \in ]a, b[$ so that the real sequence $(f_n(c))$ is convergent. Then $(f_n)$ is uniformly convergent to a function $f$ and $f' = g$.
Proof. Using the Mean Value Theorem, $\forall x \in ]a, b[$ & $\forall n, m \in \mathbb{N}_0$ we get
\[ [f_n(x) - f_m(x)] - [f_n(c) - f_m(c)] = (x - c)[f_n'(\xi) - f_m'(\xi)], \]
where $\xi$ is between $x$ and $c$. Hence we get
\[ |f_n(x) - f_m(x)| \le |f_n(c) - f_m(c)| + (b - a)|f_n'(\xi) - f_m'(\xi)|. \]
Since $(f_n')$ is uniformly convergent it is uniformly Cauchy, and since $(f_n(c))$ is Cauchy, it follows from the above inequality that $(f_n)$ is uniformly Cauchy. Thus $(f_n)$ is uniformly convergent to a function $f$.
Now fix $x \in ]a, b[$ and use the Mean Value Theorem again to get $\forall z \in ]a, b[$ & $\forall n, m \in \mathbb{N}_0$,
\[ [f_n(z) - f_m(z)] - [f_n(x) - f_m(x)] = (z - x)[f_n'(\zeta) - f_m'(\zeta)], \]
where $\zeta$ is between $z$ and $x$. If $z - x \neq 0$, we may divide by this quantity and get
\[ \left| \frac{f_n(z) - f_n(x)}{z - x} - \frac{f_m(z) - f_m(x)}{z - x} \right| = |f_n'(\zeta) - f_m'(\zeta)|. \]
Using the uniform convergence of $(f_n')$ we arrive at the conclusion that $\forall \varepsilon > 0$, $\exists N$ so that $\forall z \in ]a, b[$, $z \neq x$, and $\forall n, m \ge N$,
\[ \left| \frac{f_n(z) - f_n(x)}{z - x} - \frac{f_m(z) - f_m(x)}{z - x} \right| < \frac{\varepsilon}{3}. \]
If we let $n \to \infty$ in this last inequality and use the fact that the absolute value function is continuous, we get
\[ \left| \frac{f(z) - f(x)}{z - x} - \frac{f_m(z) - f_m(x)}{z - x} \right| \le \frac{\varepsilon}{3}. \]
Choose a fixed $m > N$ so that
\[ |f_m'(x) - g(x)| < \frac{\varepsilon}{3}. \]
For this fixed $m$, $\exists \delta > 0$ so that $0 < |z - x| < \delta$ & $z \in ]a, b[ \Longrightarrow$
\[ \left| \frac{f_m(z) - f_m(x)}{z - x} - f_m'(x) \right| < \frac{\varepsilon}{3}. \]
From the last three inequalities and the use of the triangle inequality we find that if $z \in ]a, b[$ & $0 < |z - x| < \delta$, then
\[ \left| \frac{f(z) - f(x)}{z - x} - g(x) \right| < \varepsilon. \]
Of course, this says that $f'(x)$ exists and $f'(x) = g(x)$.
4.3.8 Corollary. If $\forall n \in \mathbb{N}_0$, $f_n$ is defined and differentiable on $]a, b[$, if $\sum_{n=0}^{\infty} (f_n')$ is uniformly convergent on $]a, b[$ and $\exists c \in ]a, b[$ so that $\sum_{n=0}^{\infty} (f_n(c))$ is convergent, then $\sum_{n=0}^{\infty} (f_n)$ is uniformly convergent and moreover
\[ \left( \sum_{n=0}^{\infty} f_n \right)' = \sum_{n=0}^{\infty} f_n'. \]
We shall leave the proof to the reader.
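Termwise differentiation as in Corollary 4.3.8 can be illustrated with a series whose sum is known in closed form. In the sketch below we take $f_k(x) = x^k/k$ on $]-1/2, 1/2[$ (an assumption made for the example): the series of derivatives $\sum x^{k-1}$ converges uniformly there by comparison with $\sum (1/2)^{k-1}$, the series itself sums to $-\log(1 - x)$, and the termwise derivative sums to $1/(1 - x)$, which is indeed the derivative of $-\log(1 - x)$.

```python
import math

def series_sum(x, terms=200):
    """Partial sum of sum_{k>=1} x^k / k."""
    return sum(x**k / k for k in range(1, terms + 1))

def derivative_series_sum(x, terms=200):
    """Partial sum of the termwise derivatives sum_{k>=1} x^(k-1)."""
    return sum(x**(k - 1) for k in range(1, terms + 1))

x = 0.3   # hypothetical sample point in ]-1/2, 1/2[
print(series_sum(x), -math.log(1.0 - x))
print(derivative_series_sum(x), 1.0 / (1.0 - x))
```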
Exercises
1. A function $f$ is said to be Lipschitz if and only if $\exists M > 0$ so that $\forall x, y \in \mathcal{D}(f)$, $|f(x) - f(y)| \le M|x - y|$. If $f$ has domain $[a, b]$ and $f'$ has the same domain and is continuous, show that $f$ is Lipschitz.
2. If $f$ is differentiable on $[a, b]$, and $\forall x \in [a, b]$, $f'(x) = 0$, show that $f$ is a constant function.
3. Do Exercise 3 of Section 4.1 by using the Mean Value Theorem.
4. Assume that $f$ has a continuous derivative at every point of its domain $]c, d[$, $g$ is differentiable on its domain $]a, b[$, and $\mathcal{R}(g) \subset \mathcal{D}(f)$. Use the Mean Value Theorem to derive the chain rule for differentiating the composite function $f \circ g$.
5. Suppose that $f$ is a differentiable function on $]a, b[$ such that $f'$ is bounded. If $(x_n)$ is a sequence with range in $]a, b[$ so that $x_n \to a$, show that $\lim_{n \to \infty} f(x_n)$ exists.
6. Suppose that $f$ is a differentiable function defined on $]a, b[$ and for $c \in ]a, b[$ suppose that $\lim_{x \to c} f'(x)$ exists. Show that this limit must be $f'(c)$.
7. Suppose that $f$ has a continuous derivative on $]a, b[$ and that $f'$ can be extended to a continuous function $F$ defined on $[a, b]$. Show that $f$ itself can be extended to a continuous function defined on $[a, b]$ and that
\[ F(a) = \lim_{x \searrow a} \frac{f(x) - f(a)}{x - a}, \qquad F(b) = \lim_{x \nearrow b} \frac{f(x) - f(b)}{x - b}. \]
8. Use L'Hospital's rule to establish the following:
(a) $x^a e^{-x} \to 0$ as $x \to \infty$, $a \ge 0$.
(b) $x \log x \to 0$ as $x \to 0$.
9. Compute the following limits:
(a) $\displaystyle \lim_{x \to 0} \frac{e^x - 1 - x}{x^2}$.
(b) $\displaystyle \lim_{x \to 0} \frac{\log(1 - x) + x}{x^2}$.
(c) $\displaystyle \lim_{x \to 0} \frac{\cos x - 1 + x^2/2}{x^4}$.
10. If $f^{(2)}(x)$ exists on $]a, b[$, show that
\[ f^{(2)}(x) = \lim_{h \to 0} \frac{f(x + h) - 2f(x) + f(x - h)}{h^2}. \]
(Hint: Use L'Hospital's rule.)
11. Suppose $f$ is differentiable on $[a, b]$ and $f'(a) = \alpha$, $f'(b) = \beta$. Show that $f'$ takes on all values between $\alpha$ and $\beta$. [Hint: If $\gamma$ is strictly between $\alpha$ and $\beta$, show that $g(x) = f(x) - \gamma(x - a)$ takes on its maximum or its minimum in the open interval $]a, b[$. This result is often called Darboux's theorem or property.]
12. Let $(r_n)$ be a sequence whose range consists of all rationals in $]0, 1[$. Show that
4.4
TAYLOR'S REMAINDER FORMULAS
If $f$ is a polynomial of degree $n - 1$ and $a \in \mathbb{R}$, then we may write
\[ f(x) = a_0 + a_1(x - a) + \cdots + a_{n-1}(x - a)^{n-1}. \]
By successive differentiation of both sides of this equation at $a$ we find that
\[ a_k = \frac{f^{(k)}(a)}{k!}, \qquad k \in (0, n - 1). \]
For a general function defined on an interval $[a, b]$, and which is $n - 1$ times differentiable there, we write
\[ f(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}(x - a)^k + R_n(x, a), \]
where this formula serves to define $R_n$. The problem is to find a convenient form for the remainder $R_n$. Any such formula is called a Taylor's remainder formula, although none of the expressions for $R_n$ is due to Taylor himself. The method for obtaining expressions for the remainder is an application of Rolle's theorem.
4.4.1 Theorem.† Suppose $f$ has $n - 1$ continuous derivatives on $[a, b]$, and is $n$ times differentiable on $]a, b[$. Further suppose $\Phi$ and $\Psi$ are continuous on $[a, b]$, differentiable on $]a, b[$ and $\forall x, y \in ]a, b[$ with $a < y < x$, the determinant
\[ \begin{vmatrix} \Phi'(y) & \Psi'(y) \\ \Phi(x) & \Psi(x) \end{vmatrix} \neq 0. \]
Then $\forall x \in ]a, b]$, $\exists c \in ]a, x[$, so that
\[ f(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}(x - a)^k + R_n(x, a), \]
where
\[ R_n(x, a) = - \frac{\begin{vmatrix} \Phi(a) & \Psi(a) \\ \Phi(x) & \Psi(x) \end{vmatrix}}{\begin{vmatrix} \Phi'(c) & \Psi'(c) \\ \Phi(x) & \Psi(x) \end{vmatrix}} \, \frac{f^{(n)}(c)}{(n - 1)!}(x - c)^{n-1}. \tag{4.4.1} \]
Proof. If $F$ is continuous on $[a, x]$, $a < x \le b$, and differentiable on $]a, x[$, then by Rolle's theorem $\exists c \in ]a, x[$ so that the determinant‡
\[ \begin{vmatrix} F'(c) & \Phi'(c) & \Psi'(c) \\ F(a) & \Phi(a) & \Psi(a) \\ F(x) & \Phi(x) & \Psi(x) \end{vmatrix} = 0. \tag{4.4.2} \]
(See the remarks after Theorem 4.3.4.) Let us take, for $a \le t \le b$,
\[ F(t) = f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(t)}{k!}(x - t)^k. \]
Clearly $F$ is continuous on $[a, b]$ and its derivative in $]a, b[$ is given by
\[ F'(t) = - \frac{(x - t)^{n-1}}{(n - 1)!} f^{(n)}(t). \tag{4.4.3} \]
Further,
\[ F(a) = R_n(x, a), \qquad F(x) = 0. \tag{4.4.4} \]
Using (4.4.3) and (4.4.4) in (4.4.2) and solving the resulting equation for $R_n(x, a)$, we get exactly the form (4.4.1).
† This theorem is due to L. M. Blumenthal, Am. Math. Monthly, 33 (1926), 424-426. It was pointed out to me by Professor Sam Lachterman.
‡ A development of the theory of determinants is given in Chapter 6.
By taking $\Phi$ and $\Psi$ special functions, it is possible to get a number of special useful expressions for $R_n$ which have appeared in the literature under different names.
4.4.2 Corollary. Under the hypothesis on $f$ given in Theorem 4.4.1, we have the following special cases:
(a) If $\forall t \in ]a, b[$, $\Phi'(t) \neq 0$ we get the Schlömilch form of the remainder:
\[ R_n(x, a) = \frac{\Phi(x) - \Phi(a)}{\Phi'(c)} \, \frac{f^{(n)}(c)}{(n - 1)!}(x - c)^{n-1}, \qquad a < c < x. \]
(b) The Roche form of the remainder:
\[ R_n(x, a) = \frac{f^{(n)}(c)}{p\,(n - 1)!}(x - a)^p (x - c)^{n-p}, \qquad a < c < x. \]
(c) The Lagrange form of the remainder:
\[ R_n(x, a) = \frac{f^{(n)}(c)}{n!}(x - a)^n, \qquad a < c < x. \]
(d) The Cauchy form of the remainder:
\[ R_n(x, a) = \frac{f^{(n)}(c)}{(n - 1)!}(x - a)(x - c)^{n-1}, \qquad a < c < x. \]
Proof. To prove (a) choose $\Psi$ any nonzero constant and apply Theorem 4.4.1. To prove (b) set $\Phi(t) = (x - t)^p$, $1 \le p \le n$, in part (a). To prove (c) and (d), set $p = n$ and $p = 1$ in part (b), respectively.
REMARK: In Theorem 4.4.1 and Corollary 4.4.2 for the sake of convenience in the statements and proofs we have expanded about the left end point $a$. Clearly, everything will be equally valid if we expand around the right end point $b$. In the future we shall use this fact without comment even though we refer to Theorem 4.4.1 or Corollary 4.4.2.
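The intermediate point $c$ in the remainder formulas is usually not known explicitly, but for simple functions it can be computed. The sketch below takes $f = \exp$, $a = 0$, $x = 1$, $n = 3$ (illustrative choices of ours) and finds the point $c$ of the Lagrange form in closed form and the point $c$ of the Cauchy form by bisection; both should lie in $]0, 1[$, and in general they are different points.

```python
import math

def taylor_poly_exp(x, n):
    """Taylor polynomial of exp about 0 with terms k = 0, ..., n-1."""
    return sum(x**k / math.factorial(k) for k in range(n))

# Hypothetical sample values (assumptions for illustration only).
x, n = 1.0, 3
remainder = math.exp(x) - taylor_poly_exp(x, n)      # R_n(x, 0)

# Lagrange form: R = e^c x^n / n!, solved for c.
c_lagrange = math.log(remainder * math.factorial(n) / x**n)

# Cauchy form: R = e^c x (x - c)^(n-1) / (n-1)!, solved for c by bisection.
def cauchy_gap(c):
    return math.exp(c) * x * (x - c)**(n - 1) / math.factorial(n - 1) - remainder

lo, hi = 0.0, x            # cauchy_gap(lo) > 0 > cauchy_gap(hi)
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if cauchy_gap(mid) > 0:
        lo = mid
    else:
        hi = mid
c_cauchy = 0.5 * (lo + hi)

print(c_lagrange, c_cauchy)
```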
The Taylor remainder formulas give a very convenient method for deciding when a given infinitely differentiable function is the sum of a convergent infinite series. For example, $\forall n \in \mathbb{N}_0$ the $n$th derivative of the exponential function is again the exponential function. If we use the Lagrange form of the remainder we may write
\[ e^x = \sum_{k=0}^{n-1} \frac{x^k}{k!} + \frac{e^c}{n!} x^n, \]
where $c$ is a number between $x$ and $0$ and depends on both $n$ and $x$. If $|x| \le b$, then
\[ \left| \frac{e^c}{n!} x^n \right| \le e^b \frac{b^n}{n!}. \]
Since $b^n/n! \to 0$ (Exercise 6 of Section 1.7), we see that the remainder goes uniformly to zero in $[-b, b]$. Thus
\[ e^x = \sum_{k=0}^{\infty} x^k/k!, \]
and the convergence of the series on the right is uniform on every compact set in $\mathbb{R}$.
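The uniform estimate for the remainder of the exponential series can be checked numerically. The following sketch compares the largest observed remainder on a grid in $[-b, b]$ with the bound $e^b b^n/n!$; the value $b = 2$, the grid, and the function names are illustrative assumptions.

```python
import math

def exp_partial_sum(x, n):
    """Sum of x^k / k! for k = 0, ..., n-1."""
    return sum(x**k / math.factorial(k) for k in range(n))

def remainder_bound(b, n):
    """Bound e^b b^n / n! from the Lagrange form, valid for |x| <= b."""
    return math.exp(b) * b**n / math.factorial(n)

def worst_remainder(b, n, points=41):
    """Largest |e^x - partial sum| over an equally spaced grid in [-b, b]."""
    xs = [-b + i * (2.0 * b) / (points - 1) for i in range(points)]
    return max(abs(math.exp(x) - exp_partial_sum(x, n)) for x in xs)

b = 2.0
for n in (5, 10, 20):
    print(n, worst_remainder(b, n), remainder_bound(b, n))
```

Both columns decrease rapidly with $n$, and the observed remainder stays below the bound.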
The proof we have given above to show that the exponential function is the sum of an infinite series leads immediately to a more general result: If a function $f$ is defined and infinitely differentiable in an open interval $I(a)$ about the point $a$, and if $(f^{(k)})$ when restricted to every compact subinterval of $I(a)$ is uniformly bounded, then
\[ f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x - a)^k, \]
and the convergence is uniform on every compact set in $I(a)$. Indeed, suppose $J = \{x : |x - a| \le b\} \subset I(a)$ and let $M = \sup\{|f^{(k)}(x)| : x \in J \ \&\ k \in \mathbb{N}_0\}$. Using the Lagrange form of the remainder again, we get for $x \in J$,
\[ |R_n(x, a)| = \left| \frac{f^{(n)}(c)}{n!}(x - a)^n \right| \le M \frac{b^n}{n!}. \]
This estimate on $R_n$ gives the result.
Actually, it is a rather interesting fact that the conclusions we have obtained in the previous paragraph will follow from the considerably milder assumption that all the derivatives are uniformly bounded below (or above). This fact is due to Serge Bernstein. Before we prove Bernstein's theorem we shall prove a lemma. The development we give is taken from the book by W. Maak, An Introduction to Modern Calculus, Holt, Rinehart and Winston, Inc., New York, 1963.
4.4.3 Lemma. Suppose $f$ has domain $[a, b]$, is infinitely differentiable, and $\exists m \ge 0$ so that $\forall x$ in $[a, b]$ and $\forall n \in \mathbb{N}_0$,
\[ -m \le f^{(n)}(x). \]
Then $\exists M$ so that $\forall x \in [a, b[$ and $\forall n \in \mathbb{N}_0$,
\[ f^{(n)}(x) \le \frac{n!\,M}{(b - x)^n}. \]
Proof. Suppose, at first, that $m = 0$. Using the Lagrange form of the remainder, $\forall x \in [a, b[$, we get
\[ f(b) = \sum_{k=0}^{n} \frac{f^{(k)}(x)}{k!}(b - x)^k + \frac{f^{(n+1)}(c)}{(n + 1)!}(b - x)^{n+1}, \]
where $x < c < b$. Since all the terms on the right are nonnegative we must have
\[ \frac{f^{(n)}(x)}{n!}(b - x)^n \le f(b). \]
In case $m > 0$, consider the auxiliary function
\[ g(x) = f(x) + m e^{x-a}, \qquad x \in [a, b]. \]
Then we have
\[ g^{(n)}(x) = f^{(n)}(x) + m e^{x-a} \ge -m + m = 0. \]
Thus $\forall x \in [a, b[$,
\[ f^{(n)}(x) \le g^{(n)}(x) \le n!\,g(b)/(b - x)^n. \]
4.4.4 Theorem (S. Bernstein). Suppose $f$ has domain $[a, b]$, is infinitely differentiable, and $\exists m$ so that $\forall x$ in $[a, b]$ and $\forall n \in \mathbb{N}_0$,
\[ -m \le f^{(n)}(x). \]
If $\forall c \in [a, b[$, we set $I(c) = \{x : |x - c| < b - c\}$, then $\forall x \in I(c) \cap [a, b[$,
\[ f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(c)}{k!}(x - c)^k. \]
Indeed, the infinite series on the right converges uniformly in every compact set in $I(c)$.
Proof. By hypothesis $f^{(n)}(x)$ is uniformly bounded below, and by the last lemma $\exists M$ so that it is bounded above by $n!\,M/(b - x)^n$. Since $n!/(b - a)^n \to \infty$ as $n \to \infty$, and $\forall x \in [a, b[$, $n!/(b - x)^n \ge n!/(b - a)^n$, it follows that we may take $M$ so large that $\forall n \in \mathbb{N}_0$ and $\forall x \in [a, b[$, $m \le n!\,M/(b - x)^n$. Thus $\forall x \in [a, b[$ we have
\[ |f^{(n)}(x)| \le \frac{n!\,M}{(b - x)^n}. \]
Let us use the Cauchy form of the remainder in Taylor's formula:
\[ R_n(x, c) = \frac{f^{(n)}(\xi)}{(n - 1)!}(x - c)(x - \xi)^{n-1}, \qquad \xi \in ]c, x[. \]
Using the above estimate for $|f^{(n)}(\xi)|$ we get
\[ |R_n(x, c)| \le nM \left| \frac{x - c}{b - \xi} \right| \left| \frac{x - \xi}{b - \xi} \right|^{n-1}. \]
If $c < x < b$, then $c < \xi < x < b$ and a simple computation shows that the derivative of the function of $\xi$ with values $(x - \xi)/(b - \xi)$ cannot vanish in the interval $]c, x[$. Hence this function of $\xi$, when restricted to $[c, x]$, must take its minimum and maximum value at the end points of this interval. Clearly the minimum is taken at $x$ and consequently the maximum is taken at $c$. We get the same result if $x < \xi < c$; that is, the maximum of $|x - \xi|/|b - \xi|$ is taken at $\xi = c$. If $c < d < b$, let us set
\[ \rho = \frac{d - c}{b - c} < 1; \]
then for $|x - c| \le d - c$ we have
\[ \frac{|x - \xi|}{|b - \xi|} \le \frac{|x - c|}{|b - c|} \le \rho. \]
Thus for $|x - c| \le d - c$, $x \in [a, b[$, and $\forall n \in \mathbb{N}$,
\[ |R_n(x, c)| \le nM \frac{d - c}{b - d} \rho^{n-1}. \]
Since $\rho < 1$, $n\rho^{n-1} \to 0$ as $n \to \infty$, and hence $R_n(x, c) \to 0$ uniformly for $|x - c| \le d - c$. This proves the theorem, since the last statement of the theorem is obvious.
The last theorem gives a sufficient condition in order that a function
may be represented by a special kind of infinite series. Such functions
have a special name that we emphasize by a formal definition.
4.4.5 Definition. If $f$ is an infinitely differentiable function having an open domain whose values in some open interval about the point $a$ can be represented as
\[ f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x - a)^k, \]
then $f$ is said to be analytic at $a$. The series is called the Taylor expansion of $f$ at $a$. If $f$ is analytic at every point of its domain it is called analytic.
We should point out that if a function is infinitely differentiable in the neighborhood of a point it does not necessarily mean that the function is analytic at the point. For example, the function given by
\[ f(x) = \begin{cases} e^{-1/x^2}, & x \neq 0, \\ 0, & x = 0, \end{cases} \]
is infinitely differentiable and $\forall n \in \mathbb{N}_0$, $f^{(n)}(0) = 0$ (see Exercise 6 of Section 4.1). Hence, if $f$ were analytic at zero, it would of necessity have to have zero values at every point of some neighborhood of the origin, which of course it doesn't.
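The flatness of this function at the origin can be seen numerically: $f(x)/x^n$ becomes extremely small near $0$ for every fixed $n$, although $f$ itself is positive away from $0$. The sketch below assumes the standard example $f(x) = e^{-1/x^2}$ for $x \neq 0$, $f(0) = 0$; the name `flat` and the sample points are our own choices.

```python
import math

def flat(x):
    """exp(-1/x^2) for x != 0, and 0 at x = 0."""
    return math.exp(-1.0 / x**2) if x != 0 else 0.0

# f(x) / x^n is tiny near 0 for each fixed n, consistent with f^(n)(0) = 0 ...
for n in (1, 5, 10):
    print(n, flat(0.1) / 0.1**n)

# ... yet f is strictly positive away from 0, so the zero Taylor series
# at the origin cannot represent f on any interval about 0.
print(flat(0.5))
```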
Exercises
1. Do Exercise 10 of Section 4.3 by using an appropriate form of Taylor's remainder formula and assuming that $f^{(2)}$ is continuous.
2. Let $f$ be the function with domain $]-1, 1[$ defined by
\[ f(x) = \frac{1}{1 - x}. \]
Show that $f$ is analytic at zero and that its Taylor expansion at zero converges uniformly on every compact subset of $]-1, 1[$.
3. Generalizing the considerations of Exercise 2, let $f_\alpha$ be the function with domain $]-1, 1[$ defined by
$\alpha \in \mathbb{R}$.
Show that $f_\alpha$ is analytic at zero and its Taylor expansion at zero converges uniformly on every compact subset of $]-1, 1[$.
4. Show that the function $f$ with domain $]-1, 1[$ defined by
\[ f(x) = \log(1 - x) \]
is analytic at zero and its Taylor expansion at zero converges uniformly on every compact subset of $]-1, 1[$.
5. We have shown in Section 4.3 that the exponential function with values $e^x$ is analytic at zero and its Taylor expansion converges at every point of $\mathbb{R}$. Hence we may write
\[ e = \sum_{k=0}^{\infty} 1/k! = \sum_{k=0}^{n} 1/k! + R_{n+1}. \]
Show that $R_{n+1} < 1/n!\,n$. This estimate for $R_{n+1}$ shows that $e$ is irrational. Indeed, if $e$ is rational it can be written $e = p/q$, $p, q \in \mathbb{N}$ and
\[ \sum_{k=0}^{q} 1/k! < p/q < \sum_{k=0}^{q} 1/k! + 1/q!\,q. \]
Show that this leads to a contradiction.

6. Suppose that $f$ is $n$ times differentiable in $]a, b[$ and $\exists k < n$ and $c \in ]a, b[$ so that $f^{(j)}(c) = 0$ for $j \in (1, k)$. Using Taylor's formula state sufficient conditions so that $f$ will have either a relative maximum at $c$, or a relative minimum at $c$, or neither.
7. Suppose $f$ and $g$ have $(n - 1)$ continuous derivatives on $[a, b]$ and are $n$ times differentiable on $]a, b[$. Show that $\forall x \in ]a, b]$, $\exists c \in ]a, x[$ so that
\[ \left[ f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}(x - a)^k \right] g^{(n)}(c) = \left[ g(x) - \sum_{k=0}^{n-1} \frac{g^{(k)}(a)}{k!}(x - a)^k \right] f^{(n)}(c). \]
By choosing $g$ to be a suitable function, obtain the Taylor formula with Lagrange remainder.
8. Give an example of a function $f$ that is defined and infinitely differentiable in a neighborhood of zero so that in some interval $[0, a[$,
\[ f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k, \]
but the power series does not represent $f$ for $x < 0$.
9. Suppose $f$ is defined and infinitely differentiable in $]a, b[$. Show that a necessary and sufficient condition that $f$ is analytic at $c \in ]a, b[$ is that $\exists \rho > 0$, $\exists I(c)$, and $\exists M$ so that $\forall k \in \mathbb{N}_0$ and $\forall x \in I(c)$,
\[ |f^{(k)}(x)| \le k!\,M/\rho^k. \]
10. Suppose $f$ is defined in an open interval around $a$ and is analytic at $a$. Show that $f$ is analytic in some open interval around $a$. (Hint: Use the result of Exercise 9.)
4.5
POWER SERIES
We have seen in the previous section that some (but not all) infinitely differentiable functions may be analytic at a given point; that is, they may be represented in a neighborhood of the given point by their Taylor expansions. It therefore behooves us to study function series of the form
\[ \sum_{k=0}^{\infty} (c_k (x - a)^k). \]
Such series are called power series about $a$ or Taylor series about $a$, and if $a = 0$ they are sometimes called Maclaurin series. Every power series about $a$ converges at $x = a$, but clearly it is of interest to ask for an effective criterion, in terms of the sequence $(c_k)$, which will give the values of $x$ for which the power series converges. A second natural question is the following: If a power series converges in an open interval about the point $a$, is it always the Taylor expansion, about $a$, of an analytic function?
An effective criterion to determine the values of $x$ for which a power series converges is given by the Cauchy root test. For every sequence $(c_k)$ let us set
\[ \frac{1}{r} = \limsup_{k \to \infty} |c_k|^{1/k}. \tag{4.5.1} \]
We shall adopt the usual convention that $r = 0$ if the sequence $(|c_k|^{1/k})$ is unbounded, and shall take $r = \infty$ if the limit superior in (4.5.1) is zero.
4.5.1 Theorem (Cauchy-Hadamard). The power series
\[ \sum_{k=0}^{\infty} (c_k (x - a)^k) \]
is uniformly absolutely convergent on any compact subset of the interval $]a - r, a + r[$ and diverges on the complement of $[a - r, a + r]$, where $r$ is given by (4.5.1).
Proof. From the Cauchy root test the power series will converge absolutely for every $x$ for which
\[ \limsup_{k \to \infty} |c_k|^{1/k} |x - a| < 1 \]
and will diverge for every $x$ for which
\[ \limsup_{k \to \infty} |c_k|^{1/k} |x - a| > 1. \]
Indeed, the proof of the Cauchy root test shows that the absolute convergence is uniform on every compact subinterval of $]a - r, a + r[$. However, the uniform convergence is also easily established by means of the Weierstrass M Test. For, if $0 < s < r$, then $\sum_{k=0}^{\infty} |c_k| s^k$ converges and hence $\sum_{k=0}^{\infty} |c_k| |x - a|^k$ converges uniformly in the interval $[a - s, a + s]$.
The number $r$ of (4.5.1) is called the radius of convergence of the given power series and $]a - r, a + r[$ is called the interval of convergence.
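Formula (4.5.1) suggests a rough numerical estimate of the radius of convergence: evaluate $|c_k|^{-1/k}$ for a large index $k$. The sketch below is only a heuristic, since the limit superior is replaced by a single term; it works with $\log |c_k|$ to avoid floating-point overflow. The coefficient sequence $c_k = k\,2^k$ (for which $r = 1/2$) and the function names are illustrative assumptions.

```python
import math

def radius_estimate(log_abs_coeff, k):
    """Heuristic estimate of r from one term: |c_k|^(-1/k) = exp(-log|c_k| / k)."""
    return math.exp(-log_abs_coeff(k) / k)

def log_abs_c(k):
    """log|c_k| for the hypothetical coefficients c_k = k * 2^k, k >= 1."""
    return math.log(k) + k * math.log(2.0)

for k in (10, 100, 1000, 10000):
    print(k, radius_estimate(log_abs_c, k))   # slowly approaches 0.5
```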
The question as to whether a convergent power series is always the

Taylor expansion of an analytic function is answered by the next two
results.
4.5.2 Theorem. If the power series $\sum_{k=0}^{\infty} (c_k (x - a)^k)$ has a nonzero radius of convergence $r$, then the function $f$ defined on $]a - r, a + r[$ by
\[ f(x) = \sum_{k=0}^{\infty} c_k (x - a)^k \]
is infinitely differentiable and
\[ f'(x) = \sum_{k=1}^{\infty} k c_k (x - a)^{k-1}, \]
where the latter series also has the radius of convergence $r$.
Proof. Since (Exercise 10 of Section 1.9 and Exercise 5 of Section 2.4)
\[ \limsup_{k \to \infty} |k c_k|^{1/k} = \limsup_{k \to \infty} |c_k|^{1/k}, \]
it follows that both of the series above have the same radius of convergence.
If we set $f_k(x) = c_k (x - a)^k$, then $f_k$ is differentiable and by Theorem 4.5.1 and the first paragraph of the proof of this theorem it follows that
\[ \sum_{k=0}^{\infty} (f_k') \]
is uniformly convergent on any compact subinterval of the interval of convergence. Thus we may apply Corollary 4.3.8 on termwise differentiation of a series to arrive at the differentiation formula of this theorem. The fact that the limit of the power series is infinitely differentiable follows by use of the axiom of induction.
4.5.3 Corollary. If the power series $\sum_{k=0}^{\infty} (c_k (x - a)^k)$ has a nonzero radius of convergence $r$, and if
\[ f(x) = \sum_{k=0}^{\infty} c_k (x - a)^k, \qquad x \in ]a - r, a + r[, \]
then $\forall n \in \mathbb{N}_0$,
\[ c_n = f^{(n)}(a)/n!. \]
Proof. By the use of the previous theorem and the axiom of induction it is easy to establish that $\forall n \in \mathbb{N}_0$,
\[ f^{(n)}(x) = \sum_{k=n}^{\infty} k(k - 1) \cdots (k - n + 1) c_k (x - a)^{k-n}. \]
Setting $x = a$ in both sides gives the formula for $c_n$.
If we formally multiply two Maclaurin series together and collect terms all having the same powers of $x$, we get
\[ \sum_{n=0}^{\infty} (a_n x^n) \cdot \sum_{n=0}^{\infty} (b_n x^n) = \sum_{n=0}^{\infty} (c_n x^n), \]
where
\[ c_n = \sum_{k=0}^{n} a_k b_{n-k}. \]
The power series on the right is called the Cauchy product of the series on the left. The natural question to ask concerns the value of the radius of convergence of the Cauchy product in relation to the radii of convergence of the series that make up the product. We shall prove a theorem of this nature for series of constant terms that will immediately answer this question for power series.
4.5.4 Definition. If $(a, \sigma(a))$ and $(b, \sigma(b))$ are infinite series, then the Cauchy product of these series is the series $(c, \sigma(c))$, where
\[ c_n = \sum_{k=0}^{n} a_k b_{n-k}. \tag{4.5.2} \]
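Formula (4.5.2) is a discrete convolution and is easy to compute for finitely many coefficients. As a minimal sketch, the routine below forms the first terms of a Cauchy product and checks them on the Maclaurin coefficients $1/k!$ of $e^x$: the product of this series with itself should have coefficients $2^n/n!$, those of $e^{2x}$. The function name `cauchy_product` and the choice of example are our own.

```python
import math

def cauchy_product(a, b):
    """First terms c_n = sum_{k=0}^{n} a[k] * b[n-k] of the Cauchy product."""
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]

N = 10
exp_coeffs = [1.0 / math.factorial(k) for k in range(N)]     # coefficients of e^x
product = cauchy_product(exp_coeffs, exp_coeffs)
expected = [2.0**n / math.factorial(n) for n in range(N)]    # coefficients of e^(2x)
print(product[:4], expected[:4])
```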
The following theorem about Cauchy products is somewhat more

general than is needed to establish the facts about Cauchy products
of power series. The proof would be somewhat easier if we demanded
that both series be absolutely convergent.
4.5.5 Theorem (Mertens). If $(a, \sigma(a))$ is absolutely convergent and $(b, \sigma(b))$ is convergent, then their Cauchy product is convergent and its sum is
\[ \left( \sum_{k=0}^{\infty} a_k \right) \left( \sum_{k=0}^{\infty} b_k \right). \]
Proof. Let us set $f(n, k) = a_k b_{n-k}$ for $k \in (0, n)$ and $n \in \mathbb{N}_0$. The domain of $f$ may be pictured as an infinite triangle of lattice points, a finite portion of which is shown in Fig. 4.5.1. If $c_n$ is given by (4.5.2), then we may write
\[ \sum_{n=0}^{m} c_n = \sum_{n=0}^{m} \left[ \sum_{k=0}^{n} f(n, k) \right]. \]
Figure 4.5.1
What we are doing is adding together all the values that $f$ takes on the triangle of lattice points shown in the figure. Note we are first adding the terms along the columns and then adding the resulting numbers. We get the same result if we first add the terms along the rows and then add the resulting numbers. Thus we get
\[ \sum_{n=0}^{m} c_n = \sum_{k=0}^{m} \left[ \sum_{n=k}^{m} f(n, k) \right]. \]
Of course, a formal proof of this interchange of summations can be easily carried out by induction, but we shall not do so. Now,
\[ \sum_{n=k}^{m} f(n, k) = a_k \sum_{n=k}^{m} b_{n-k} = a_k \sigma(b)_{m-k}, \]
and thus
\[ \sum_{n=0}^{m} c_n = \sum_{k=0}^{m} a_k \sigma(b)_{m-k}. \]
Let us set $A = \lim_{n \to \infty} \sigma(a)_n$ and $B = \lim_{n \to \infty} \sigma(b)_n$. Then we may write
\[ \sum_{n=0}^{m} c_n - AB = \sum_{k=0}^{m} a_k \sigma(b)_{m-k} - B \sum_{k=0}^{\infty} a_k = \sum_{k=0}^{m} a_k [\sigma(b)_{m-k} - B] - B \sum_{k=m+1}^{\infty} a_k. \tag{4.5.3} \]
Let $M$ be an upper bound for both $\sigma(|a|)$ and $|\sigma(b)|$, which exists by virtue of the convergence of these sequences. Now, $\forall \varepsilon > 0$, $\exists m_0 \in \mathbb{N}$ so that the following hold:
\[ m - k \ge m_0 \Longrightarrow |\sigma(b)_{m-k} - B| < \varepsilon/2M, \]
\[ \sum_{k=m_0+1}^{\infty} |a_k| < \varepsilon/6M. \]
If we take $m \ge 2m_0$ and use the above estimates in the following inequality for the right side of (4.5.3) we have completed the proof:
\[ \left| \sum_{k=0}^{m} a_k [\sigma(b)_{m-k} - B] - B \sum_{k=m+1}^{\infty} a_k \right| \le \sum_{k=0}^{m_0} |a_k| |\sigma(b)_{m-k} - B| + \sum_{k=m_0+1}^{m} |a_k| |\sigma(b)_{m-k} - B| + |B| \sum_{k=m+1}^{\infty} |a_k|. \]
Note that $k \le m_0$ and $m \ge 2m_0 \Longrightarrow m - k \ge m_0$.
4.5.6 Corollary. The radius of convergence of the Cauchy product of two power series about $a$ is at least as large as the smaller of the radii of the two component series.
Proof. Since a power series is absolutely convergent inside its interval of convergence, the corollary is an immediate result of the last theorem.
As we have shown, a power series converges inside its interval of convergence and if that open interval is not the null set, the power series converges to a function that is analytic. Now, the reader can easily show that nothing can be said about the convergence of a power series at the end points of its interval of convergence. In other words, examples of power series can be given which converge at both end points of the interval of convergence, at neither end point, or at only one end point.
There is another question about the end points of the interval of convergence of a power series which is well illustrated by the following example. It is very easy to show that for $|x| < 1$,
$$\log(1 + x) = \sum_{n=1}^{\infty} (-1)^{n+1}\,\frac{x^n}{n},$$
and the interval of convergence of the series on the right is $]-1, 1[$. For $x = 1$ the series converges and the function defined by the quantity on the left has the value $\log 2$. The question is whether these two quantities are equal. The answer is given by the following theorem due to N. Abel.
4.5.7 Theorem (Abel). If $(a,\sigma(a))$ is a convergent series and
$$f(x) = \sum_{k=0}^{\infty} a_k x^k, \qquad |x| < 1,$$
then $f(1-)$ exists and
$$f(1-) = \sum_{k=0}^{\infty} a_k.$$
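Proof. Using Abel's summation formula (Lemma 3.2.5) we get
$$\sum_{k=0}^{n} a_k x^k = \sigma(a)_n x^{n+1} - \sum_{k=0}^{n} \sigma(a)_k\,[x^{k+1} - x^k] = \sigma(a)_n x^{n+1} + (1-x)\sum_{k=0}^{n} \sigma(a)_k x^k.$$
Noting that $|x| < 1$ and $\sigma(a)$ is convergent, by letting $n \to \infty$ we get
$$f(x) = (1-x)\sum_{k=0}^{\infty} \sigma(a)_k x^k.$$
Let $A$ be the limit of $\sigma(a)$. Then, since for $|x| < 1$,
$$\frac{1}{1-x} = \sum_{k=0}^{\infty} x^k,$$
we get
$$f(x) - A = (1-x)\sum_{k=0}^{\infty} [\sigma(a)_k - A]\,x^k.$$
Now, $\forall \varepsilon > 0$, $\exists N$ so that $k \ge N \Rightarrow |\sigma(a)_k - A| < \varepsilon/2$. Thus for $n \ge N$ and $0 \le x < 1$,
$$(1-x)\sum_{k=n}^{\infty} |\sigma(a)_k - A|\,x^k \le \varepsilon/2.$$
Also, for fixed $n \ge N$,
$$(1-x)\sum_{k=0}^{n-1} [\sigma(a)_k - A]\,x^k$$
is less than $\varepsilon/2$ in absolute value provided $x$ is close enough to 1. This constitutes the proof of Abel's theorem.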
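Abel's theorem can be watched at work on the logarithm example above. The sketch below sums the series for $\log(1+x)$ at points approaching 1 from the left; the function name `log_series` and the number of terms are assumptions chosen for illustration.

```python
import math

def log_series(x, terms=50000):
    """Partial sum of sum_{n>=1} (-1)^(n+1) x^n / n, intended for 0 <= x < 1."""
    total, power, sign = 0.0, 1.0, 1.0
    for n in range(1, terms + 1):
        power *= x                 # power = x^n
        total += sign * power / n
        sign = -sign
    return total

for x in (0.9, 0.99, 0.999):
    print(x, log_series(x), math.log(1 + x))
print("log 2 =", math.log(2))
```

As $x$ increases toward 1 the printed series values approach $\log 2$, which is what the theorem asserts for $f(1-)$.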
The converse of Abel's theorem is, in general, not true. That is to say, suppose
$$f(x) = \sum_{k=0}^{\infty} a_k x^k, \qquad |x| < 1,$$
where we are supposing the series is convergent for $|x| < 1$. If $f(1-)$ exists, it is not necessarily true that $\sum_{k \ge 0}(a_k)$ is convergent. For example,
$$\frac{1}{1+x} = \sum_{k=0}^{\infty} (-1)^k x^k, \qquad |x| < 1.$$
The left side goes to $1/2$ as $x \to 1$, but $\sum_{k \ge 0}(-1)^k$ is not convergent.
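The counterexample can be checked numerically. In the following sketch the helper names `alternating_partial_sums` and `abel_value` are hypothetical, introduced only for this illustration.

```python
def alternating_partial_sums(n):
    """First n partial sums of sum_{k>=0} (-1)^k; they oscillate between 1 and 0."""
    sums, total = [], 0
    for k in range(n):
        total += (-1) ** k
        sums.append(total)
    return sums

def abel_value(x, terms=40000):
    """Partial sum of sum_{k>=0} (-1)^k x^k for 0 <= x < 1; close to 1/(1+x)."""
    total, power = 0.0, 1.0
    for k in range(terms):
        total += (-1) ** k * power
        power *= x
    return total

print(alternating_partial_sums(6))
for x in (0.9, 0.99, 0.999):
    print(x, abel_value(x), 1 / (1 + x))
```

The partial sums never settle, while the values of the power series tend to $1/2$ as $x \to 1$ from the left.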
There are a number of "corrected" converses to Abel's theorem, and these are called Tauberian theorems. We shall prove the original theorem obtained by Tauber. We first need a lemma (see Exercise 13 of Section 1.7).
4.5.8 Lemma (Cesàro). If $(s_n)$ converges to $S$, then
$$c_n = \frac{1}{n+1}\sum_{k=0}^{n} s_k \to S.$$
Proof. We may write
$$c_n - S = \frac{1}{n+1}\sum_{k=0}^{n} [s_k - S].$$
Now $\forall \varepsilon > 0$, $\exists K$ so that $k \ge K \Rightarrow |s_k - S| < \varepsilon/2$. Fix $m \ge K$ and we get for $n \ge m$,
$$|c_n - S| \le \frac{1}{n+1}\sum_{k=0}^{m-1} |s_k - S| + \frac{1}{n+1}\sum_{k=m}^{n} |s_k - S|.$$
The last term on the right is less than $\varepsilon/2$ and the first term on the right is also less than $\varepsilon/2$ for all sufficiently large $n$. This proves the lemma.
4.5.9 Theorem (Tauber). If
$$f(x) = \sum_{k=0}^{\infty} a_k x^k, \qquad |x| < 1,$$
and if $ka_k \to 0$ as $k \to \infty$ and $f(1-)$ exists, then $\sum_{k \ge 0}(a_k)$ is convergent and, of course, $f(1-) = \sum_{k \ge 0} a_k$.
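Proof. If $|x| < 1$, we can write
$$\sum_{k=0}^{n} a_k - f(1-) = f(x) - f(1-) + \sum_{k=0}^{n} a_k(1 - x^k) - \sum_{k=n+1}^{\infty} a_k x^k. \tag{4.5.4}$$
Now, for $k \ge 1$,
$$1 - x^k = (1 - x)\sum_{j=0}^{k-1} x^j,$$
and since $|x| < 1$ we get $1 - x^k \le k(1 - x)$. Using this in (4.5.4) we get, for $0 \le x < 1$,
$$\Big|\sum_{k=0}^{n} a_k - f(1-)\Big| \le |f(x) - f(1-)| + (1 - x)\sum_{k=0}^{n} k|a_k| + \sum_{k=n+1}^{\infty} |a_k| x^k. \tag{4.5.5}$$
Let us set $x_n = 1 - 1/(n+1)$; then, by hypothesis,
$$|f(x_n) - f(1-)| \to 0 \quad \text{as } n \to \infty. \tag{4.5.6}$$
By Lemma 4.5.8 and the fact that $k|a_k| \to 0$, we get
$$(1 - x_n)\sum_{k=0}^{n} k|a_k| = \frac{1}{n+1}\sum_{k=0}^{n} k|a_k| \to 0 \quad \text{as } n \to \infty. \tag{4.5.7}$$
Also, since $ka_k \to 0$, $\forall \varepsilon > 0$, $\exists N$ so that $k \ge N \Rightarrow |a_k| < \varepsilon/k$ and thus, for $n \ge N$,
$$\sum_{k=n+1}^{\infty} |a_k| x_n^k \le \frac{\varepsilon}{n+1}\sum_{k=0}^{\infty} x_n^k = \frac{\varepsilon}{n+1}\cdot\frac{1}{1 - x_n} = \varepsilon. \tag{4.5.8}$$
If we use (4.5.6), (4.5.7), and (4.5.8) in (4.5.5) with $x_n$ replacing $x$, we have completed the proof.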

THE TRIGONOMETRIC FUNCTIONS
It is certainly true that the best approach to the understanding of the properties and uses of the trigonometric functions is through the intuition of geometry. However, once we understand what we are looking for, the demands of mathematical rigor require that we give precise definitions and proofs. One of the easiest ways to do this for the trigonometric functions is through the use of power series.
Let us pose the problem of finding two differentiable functions $s$ and $c$, each with domain $R$ and which satisfy the equations
$$s'(x) = c(x), \qquad c'(x) = -s(x). \tag{4.5.9}$$
If there exist such functions, it is clear from these equations that they must be infinitely differentiable and moreover $\forall k \in N_0$,
$$s^{(2k)}(x) = (-1)^k s(x), \quad s^{(2k+1)}(x) = (-1)^k c(x), \quad c^{(2k)}(x) = (-1)^k c(x), \quad c^{(2k+1)}(x) = (-1)^{k+1} s(x). \tag{4.5.10}$$
Since $s$ and $c$ are differentiable they are continuous and thus, from (4.5.10), for every compact set in $R$ all the derivatives are uniformly bounded. It follows from Bernstein's theorem, 4.4.4, that these functions are analytic at every point of $R$ and moreover they are represented on all of $R$ by their Taylor expansions.
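If we consider the Taylor expansions around the origin, from (4.5.10) we get
$$\begin{aligned}
s(x) &= s(0)\sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{(2k)!} + c(0)\sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k+1)!},\\
c(x) &= c(0)\sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{(2k)!} - s(0)\sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k+1)!}.
\end{aligned} \tag{4.5.11}$$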
Hence, if there exist functions that satisfy (4.5.9), they must be of the
form (4.5.11). From Theorem 4.5.2 we may differentiate the series on
the right termwise, and it is a simple exercise to establish that these
functions actually satisfy the differential equations (4.5.9).
The equations (4.5.11) show that the solutions to (4.5.9) are not
uniquely determined. However, once s(O) and c(O) are specified, they
are uniquely determined. By taking $s(0) = 0$ and $c(0) = 1$, we obtain the trigonometric functions sine and cosine:
$$\sin x = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k+1)!}, \qquad \cos x = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{(2k)!}. \tag{4.5.12}$$
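The series (4.5.12) are easy to evaluate directly. The sketch below sums the first 30 terms of each and compares with the library functions; the names `sin_series` and `cos_series` and the cutoff of 30 terms are assumptions made for illustration.

```python
import math

def sin_series(x, terms=30):
    """Partial sum of sum_k (-1)^k x^(2k+1) / (2k+1)!  (first series of 4.5.12)."""
    total, term = 0.0, x                      # term for k = 0
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))   # ratio of term k+1 to term k
    return total

def cos_series(x, terms=30):
    """Partial sum of sum_k (-1)^k x^(2k) / (2k)!  (second series of 4.5.12)."""
    total, term = 0.0, 1.0                    # term for k = 0
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 1) * (2 * k + 2))
    return total

for x in (-3.0, -1.0, 0.5, 2.0):
    print(x, sin_series(x) - math.sin(x), cos_series(x) - math.cos(x))
```

Each term is obtained from the previous one by a simple ratio, which avoids computing large factorials explicitly.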
Let us now obtain the main properties of these trigonometric functions. Let us first get the addition formulas, from which it is then relatively easy to get the other essential properties. If for fixed $y \in R$ we set $s_y(x) = \sin(x + y)$, $c_y(x) = \cos(x + y)$, then $s_y$ and $c_y$ satisfy the differential equations (4.5.9) and moreover $s_y(0) = \sin y$, and $c_y(0) = \cos y$. Hence from (4.5.11) and (4.5.12) we get
$$\begin{aligned}
\sin(x + y) &= \sin x \cos y + \cos x \sin y,\\
\cos(x + y) &= \cos x \cos y - \sin x \sin y.
\end{aligned} \tag{4.5.13}$$
From (4.5.12) it is clear that sine is an odd function and cosine is an even function; that is, $\forall x \in R$, $\sin(-x) = -\sin x$, and $\cos(-x) = \cos x$. If we use these facts, and in the second equation of (4.5.13) we take $y = -x$, we get the usual formula
$$\cos^2 x + \sin^2 x = 1, \qquad \forall x \in R. \tag{4.5.14}$$
From this it is immediately clear that
$$|\cos x| \le 1, \qquad |\sin x| \le 1, \qquad \forall x \in R. \tag{4.5.14'}$$
Also from (4.5.13) we get the usual double-angle formulas:
$$\begin{aligned}
\sin 2x &= 2\sin x\cos x,\\
\cos 2x &= \cos^2 x - \sin^2 x = 2\cos^2 x - 1 = 1 - 2\sin^2 x.
\end{aligned} \tag{4.5.15}$$
From the facts that cosine is continuous, $\cos 0 = 1$, and $D\sin x = \cos x$ it follows that sine is increasing in some neighborhood of the origin and indeed it is increasing in that interval around the origin for which $\cos x > 0$. Now,
$$\cos 2 = 1 - \frac{2^2}{2!} + \frac{2^4}{4!} - \sum_{k=0}^{\infty} 2^{6+4k}\Big[\frac{1}{(6+4k)!} - \frac{4}{(8+4k)!}\Big],$$
and since
$$\frac{1}{(6+4k)!} > \frac{4}{(8+4k)!}, \qquad k \ge 0,$$
it follows that $\cos 2 < 0$. Since a continuous real-valued function with an interval domain must take on all values between any two points of its range, there is a number in $]0, 2[$ at which the cosine takes on the value zero. Since the cosine is continuous, the set of points at which it takes on the zero value is closed and hence there is a smallest positive number at which it takes on the zero value. This number is clearly not zero. Tradition demands that the smallest positive number that makes cosine zero be labeled $\pi/2$.
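The intermediate-value argument above suggests a way to compute this number. The sketch below evaluates the cosine series and bisects on $[0, 2]$; the function names and the step count are assumptions. Bisection only guarantees some zero in $]0, 2[$; that it is the smallest positive one relies on cosine being decreasing there, a fact the text establishes separately.

```python
import math

def cos_series(x, terms=30):
    """Partial sum of sum_k (-1)^k x^(2k) / (2k)!."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 1) * (2 * k + 2))
    return total

def zero_by_bisection(lo=0.0, hi=2.0, steps=60):
    """Bisect on [lo, hi], keeping cos_series(lo) > 0 >= cos_series(hi)."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if cos_series(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(cos_series(2.0))        # negative, as shown in the text
half_pi = zero_by_bisection()
print(2 * half_pi, math.pi)   # the computed zero, doubled, against the library constant
```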
Because cosine is an even function, the restriction of sine to $[-\pi/2, \pi/2]$ is monotone increasing. From (4.5.14) and the fact that $\cos \pi/2 = \cos(-\pi/2) = 0$ we get that $\sin \pi/2 = 1$ and $\sin(-\pi/2) = -1$. Thus, from (4.5.14') and the continuity of sine, we know that its range is $[-1, 1]$.
From the first formula of (4.5.15) we get $\sin \pi = 0$ and from the second formula of the same number we get $\cos \pi = -\sin^2(\pi/2) = -1$. Repeating this process with $2\pi$ in place of $\pi$, we find that $\cos 2\pi = 1$, $\sin 2\pi = 0$. If we use these facts in conjunction with the addition formulas (4.5.13), we find that
$$\cos(x + 2\pi) = \cos x, \qquad \sin(x + 2\pi) = \sin x, \qquad \forall x \in R. \tag{4.5.16}$$
If $f$ is a function with domain $R$ and $p$ is a nonzero number so that $\forall x \in R$, $f(x + p) = f(x)$, then $p$ is called a period for $f$, and $f$ is called periodic. If there is a smallest positive period for $f$, then it is called the period for $f$. We have just shown in (4.5.16) that $2\pi$ is a period for both sine and cosine. Thus both of these functions are periodic and we shall show that $2\pi$ is the period for both. Let us first show this for cosine. Since cosine is continuous, the set of its periods is either closed or else zero is an accumulation point. Since any integer times a period is again a period, in the latter case it would follow that the set of periods is dense in $R$, and using the continuity of the cosine we find that $\forall x \in R$, $\cos x = 1$. We have seen in a previous computation that this is not true. Thus let $p$ be the period of cosine. From (4.5.15) we get $1 = \cos p = 2\cos^2(p/2) - 1$, from which it follows that $\cos^2(p/2) = 1$ and hence $\sin(p/2) = 0$. Now, $\cos(p/2) = -1$, since otherwise from (4.5.13) we would find that $p/2$ is a period for cosine, contradicting the fact that $p$ is the period. Consequently, $-1 = \cos(p/2) = 2\cos^2(p/4) - 1$, or $\cos(p/4) = 0$. Since $\pi/2$ is the smallest positive number at which cosine is zero, it follows that $p \ge 2\pi$, and since $2\pi$ is a period, $p = 2\pi$. From (4.5.13) and the facts that $\cos(\pi/2) = 0$, $\sin(\pi/2) = 1$, it follows that
$$\cos x = \sin(x + \pi/2).$$
If $p$ is any period for sine, it follows that
$$\cos(x + p) = \sin(x + \pi/2 + p) = \sin(x + \pi/2) = \cos x.$$
Hence, if $p > 0$, then $p \ge 2\pi$, and we see that $2\pi$ is also the period for sine.
Since $\sin(x + \pi) = -\sin x$, it follows that sine, when restricted to $[\pi/2, 3\pi/2]$, is decreasing, and has only one zero in that interval at $\pi$. Since $\sin x > 0$ for $x \in\ ]0, \pi[$ and $D\cos x = -\sin x$, we conclude that cosine is decreasing in $[0, \pi]$, and, since it is even, is increasing in $[-\pi, 0]$. Using the periodicity of sine we see that $\forall k \in Z$, the restriction of sine to $[(k - 1/2)\pi, (k + 1/2)\pi]$ has an inverse called the $k$th branch of the arc sine and whose values we shall denote by '$\operatorname{arc\,sin}_k x$.' For $k = 0$, the inverse is called the principal arc sine and its values are usually denoted by 'Arc sin $x$.' Note that the domain of each branch of arc sine is $[-1, 1]$ and the range of the $k$th branch is $[(k - 1/2)\pi, (k + 1/2)\pi]$. In a similar way, $\forall k \in Z$ the restriction of cosine to $[k\pi, (k + 1)\pi]$ has an inverse which is called the $k$th branch of the arc cosine and its values are denoted by '$\operatorname{arc\,cos}_k x$.' Each branch of arc cosine has domain $[-1, 1]$ and the branch for $k = 0$ is called the principal branch of arc cosine and its values are denoted by 'Arc cos $x$.'
Let us compute $D \operatorname{arc\,sin}_k x$, wherever the derivative exists. Since sine restricted to $[(k - 1/2)\pi, (k + 1/2)\pi]$ is monotone, Theorem 2.3.6 tells us that the inverse is continuous. Hence, since $\forall x \in [-1, 1]$,
$$\sin(\operatorname{arc\,sin}_k x) = x,$$
Theorem 4.2.3 tells us that $D \operatorname{arc\,sin}_k x$ exists wherever $\cos(\operatorname{arc\,sin}_k x) \ne 0$ and moreover
$$D(\operatorname{arc\,sin}_k x) = \frac{1}{\cos(\operatorname{arc\,sin}_k x)}.$$
Now,
$$\cos^2(\operatorname{arc\,sin}_k x) + \sin^2(\operatorname{arc\,sin}_k x) = 1,$$
so that
$$\cos^2(\operatorname{arc\,sin}_k x) = 1 - x^2.$$
For $k$ odd, $\cos x$ is nonpositive for $x \in [(k - 1/2)\pi, (k + 1/2)\pi]$, and for $k$ even, $\cos x$ is nonnegative on this interval. Hence
$$D \operatorname{arc\,sin}_k x = \frac{(-1)^k}{\sqrt{1 - x^2}}, \qquad |x| < 1.$$
In a similar way we can show that
$$D \operatorname{arc\,cos}_k x = \frac{(-1)^{k+1}}{\sqrt{1 - x^2}}, \qquad |x| < 1.$$
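The sign $(-1)^k$ in the derivative can be checked numerically. The sketch below uses the closed form $k\pi + (-1)^k \operatorname{Arc\,sin} x$ for the $k$th branch; this formula is not stated in the text and is an assumption here (it follows from the addition formulas), as are the function names.

```python
import math

def arcsin_branch(k, x):
    """k-th branch of arc sine, assumed to equal k*pi + (-1)^k * Arc sin x."""
    return k * math.pi + (-1) ** k * math.asin(x)

def numeric_derivative(g, x, h=1e-6):
    """Central difference approximation to g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

x = 0.3
for k in range(-2, 3):
    approx = numeric_derivative(lambda t: arcsin_branch(k, t), x)
    exact = (-1) ** k / math.sqrt(1 - x * x)
    print(k, approx, exact)
```

For even $k$ the branch is increasing and for odd $k$ it is decreasing, matching the sign of the cosine on the corresponding interval.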
Once the sine and cosine have been defined, the other trigonometric functions can be defined in terms of these. The most important other trigonometric function is the tangent defined by the equation
$$\tan x = \frac{\sin x}{\cos x}, \qquad x \ne (2k + 1)\pi/2.$$
The tangent is monotone increasing in the open intervals $](2k - 1)\pi/2, (2k + 1)\pi/2[$, $k \in Z$, and its range when restricted to any one of these intervals is $R$. The inverse of the tangent restricted to the interval $](2k - 1)\pi/2, (2k + 1)\pi/2[$ is designated by '$\operatorname{arc\,tan}_k$,' and $\operatorname{arc\,tan}_0$ is called the principal branch of the arc tangent and is designated by 'Arc tan.'
Since
$$\tan(\operatorname{arc\,tan}_k x) = x,$$
and $D \tan x$ is never zero, it follows from Theorem 4.2.3 that $\operatorname{arc\,tan}_k x$ has a derivative
$$D \operatorname{arc\,tan}_k x = \frac{1}{D\tan(\operatorname{arc\,tan}_k x)}.$$
Now, from the definition of tangent,
$$D \tan y = \frac{1}{\cos^2 y}, \qquad y \ne (2k + 1)\pi/2.$$
Further, since $\cos^2 y + \sin^2 y = 1$, it follows that $1 + \tan^2 y = 1/\cos^2 y$ and thus
$$\frac{1}{\cos^2(\operatorname{arc\,tan}_k x)} = 1 + \tan^2(\operatorname{arc\,tan}_k x) = 1 + x^2.$$
Thus
$$D \operatorname{arc\,tan}_k x = \frac{1}{1 + x^2}.$$
Exercises
1. Suppose that $\forall x$ in $[a - \delta, a + \delta]$, $\delta > 0$,
$$\sum_{k=0}^{\infty} c_k (x - a)^k = 0.$$
Show that $\forall k \in N_0$, $c_k = 0$.
2.
Compute the radius of convergence of each of the following
power series and test for convergence at the end points:

(a)
f (i ! k x
k=O
(b)
k .
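$$\sum_{k=0}^{\infty} \frac{x^{2k+1}}{2k+1}.$$
(c) $\sum_{k \ge 1}(k^k x^k)$.
3. Suppose that $(c_k)$ is a sequence with the property that $\exists \rho$ and $\exists M$ so that $\forall k \in N_0$,
$$|c_k| \le M\,k!\,\rho^k.$$
Show that $\forall a \in R$ there exists an infinitely differentiable function $f$ so that $a \in \mathcal{D}(f)$ and $\forall k \in N_0$, $f^{(k)}(a) = c_k$.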
4. Suppose that
$$f(x) = \sum_{k=0}^{\infty} c_k x^k,$$
where the power series has a nonzero radius of convergence. What is the function
$$g(x) = \sum_{k=0}^{\infty} k^3 c_k x^k?$$
5. Suppose a power series has the interval of convergence $]a - r, a + r[$, $r > 0$, and the series converges at $a + r$. Show that the series is uniformly convergent in $[a, a + r]$. (Hint: Take a close look at the proof of Abel's theorem, 4.5.7.)
6. Use Abel's theorem to prove the following: If $\sum_{k \ge 0}(a_k)$ and $\sum_{k \ge 0}(b_k)$ are convergent and their Cauchy product $\sum_{k \ge 0}(c_k)$ is convergent, then
$$\sum_{k=0}^{\infty} c_k = \Big(\sum_{k=0}^{\infty} a_k\Big)\Big(\sum_{k=0}^{\infty} b_k\Big).$$
7. Suppose that
$$f(x) = \sum_{k=0}^{\infty} a_k x^k, \qquad |x| < 1,$$
and $\forall k \in N_0$, $a_k \ge 0$. If $\sum_{k \ge 0}(a_k)$ is divergent, show that $f$ is unbounded.
8. Prove the addition formula for cosine by means of Cauchy multiplication for power series.
4.6
THE WEIERSTRASS APPROXIMATION THEOREM
We gave an example at the end of Section 4.4 which showed that it is not true that every infinitely differentiable function is analytic, that is, can be represented by a convergent power series in the neighborhood of a point. On the other hand, if a function is analytic at a point, then on any compact interval inside the interval of convergence of the power series there is a sequence of polynomials that converge uniformly to the function, namely, the finite sections of the power series. Surprisingly this latter property persists not only for infinitely differentiable functions but for merely continuous functions as well. Of course, for general continuous functions it will no longer be true that the approximating polynomials will be sections of the same power series.
The fact that a continuous function on a closed bounded interval can be uniformly approximated by polynomials was discovered by K. Weierstrass. The proof we shall give here is due to S. Bernstein. It is probably one of the best constructive proofs of this theorem in the sense that the approximating polynomials can actually be constructed and their rate of convergence estimated. Another famous proof of this theorem is associated with the name L. Fejér. In Section 6.7 we shall give another proof due to M. H. Stone. That proof is a pure existence proof and does not give a ready method for constructing the approximating polynomials or of estimating their rate of convergence. On the other hand, it has the advantage that it is a proof which can be adapted to very general situations.
We shall begin by proving a theorem of a somewhat more general nature than the Weierstrass theorem. It very often happens that by generalizing a problem some of the obscuring details of the special case are removed, and thus we can see much more clearly what is involved. We feel that this is the case here, although we shall not give as general a theorem as is possible, since this might have the effect of again obscuring the problem.
4.6.1 Theorem. Suppose $(A_n)$ is a sequence of functions each having domain $N_0 \times [0, 1]$ and satisfying the following conditions:
(a) $A_n(k,x) \ge 0$ and $\forall k > n$, $A_n(k,x) = 0$.
(b) $\forall \delta > 0$, $\sum_{k \in K_\delta(x)} A_n(k,x) \to 0$, uniformly on $[0, 1]$, as $n \to \infty$, where $K_\delta(x) = \{k : |x - k/n| \ge \delta\}$.
(c) $\forall x \in [0, 1]$, $\sum_{k=0}^{n} A_n(k,x) = 1$.
If $f$ is a continuous function with domain $[0, 1]$ and if
$$B_n(f,x) = \sum_{k=0}^{n} f(k/n)\,A_n(k,x),$$
then $B_n(f,x) \to f(x)$ uniformly in $x$ as $n \to \infty$.
Proof.
From condition (c) it follows that
$$f(x) = \sum_{k=0}^{n} f(x)\,A_n(k,x).$$
Hence we may write
$$B_n(f,x) - f(x) = \sum_{k=0}^{n} [f(k/n) - f(x)]\,A_n(k,x).$$
If we use the fact that $A_n(k,x) \ge 0$ we find that
$$|B_n(f,x) - f(x)| \le \sum_{k=0}^{n} |f(k/n) - f(x)|\,A_n(k,x).$$
Since $f$ is uniformly continuous on $[0, 1]$, it is bounded, say by $M$, and moreover $\forall \varepsilon > 0$, $\exists \delta > 0$, so that $|x - k/n| < \delta \Rightarrow |f(k/n) - f(x)| < \varepsilon/2$. Thus we write
$$|B_n(f,x) - f(x)| \le \sum_{|x - k/n| < \delta} |f(k/n) - f(x)|\,A_n(k,x) + \sum_{|x - k/n| \ge \delta} |f(k/n) - f(x)|\,A_n(k,x). \tag{4.6.1}$$
The first term on the right is less than or equal to
$$\frac{\varepsilon}{2}\sum_{|x - k/n| < \delta} A_n(k,x) \le \frac{\varepsilon}{2}\sum_{k=0}^{n} A_n(k,x) = \frac{\varepsilon}{2}.$$
The second term on the right in (4.6.1) is less than or equal to
$$2M \sum_{|x - k/n| \ge \delta} A_n(k,x).$$
By condition (b), $\forall \varepsilon > 0$, $\exists N$ so that $n \ge N$ implies this last number is less than $\varepsilon/2$, uniformly in $x$. Thus using these facts in (4.6.1) we find that $\forall \varepsilon > 0$, $\exists N$ so that $n \ge N \Rightarrow$
$$|B_n(f,x) - f(x)| < \varepsilon.$$

The reader who continues his studies in analysis will find that sequences of functions $(A_n)$ which satisfy hypotheses like those in the previous theorem arise again and again. Such sequences fit under the generic name "approximate identity." The reason for this name will become clearer to those readers who investigate the theory of Banach algebras.
The Bernstein proof of the Weierstrass theorem is obtained by choosing the sequence $(A_n)$ in a special way. Our next lemma is devoted to proving that the special sequence we choose satisfies the hypotheses of the previous theorem.
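4.6.2 Lemma. For every $n \in N_0$ let $A_n$ be the function on $N_0 \times [0, 1]$ defined by
$$A_n(k,x) = \binom{n}{k} x^k (1 - x)^{n-k},$$
where $\binom{n}{k}$ is the binomial coefficient given by
$$\binom{n}{k} = \begin{cases} \dfrac{n!}{(n-k)!\,k!} & \Leftrightarrow\ k \le n,\\[1ex] 0 & \Leftrightarrow\ k > n. \end{cases}$$
The sequence $(A_n)$ satisfies the hypotheses of Theorem 4.6.1.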

Proof. Condition (a) is clearly satisfied and requires no further comment. Condition (c) follows from the binomial theorem:
$$(x + 1 - x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k (1 - x)^{n-k}.$$
It therefore remains to prove the crucial condition (b). This will require some computations.
From the binomial theorem, $\forall n \in N_0$ and $\forall x, y \in R$, we have
$$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}.$$
If we fix $y$ and differentiate both sides with respect to $x$ we get
$$n(x + y)^{n-1} = \sum_{k=1}^{n} k\binom{n}{k} x^{k-1} y^{n-k}. \tag{4.6.2}$$
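Multiply both sides by $x$ and set $y = 1 - x$ to get
$$nx = \sum_{k=0}^{n} k\binom{n}{k} x^k (1 - x)^{n-k}. \tag{4.6.3}$$
Now keeping $y$ fixed again and differentiating both sides of (4.6.2) with respect to $x$ we get
$$n(n-1)(x + y)^{n-2} = \sum_{k=2}^{n} k(k-1)\binom{n}{k} x^{k-2} y^{n-k}.$$
Multiply both sides by $x^2$ and set $y = 1 - x$ to get
$$n(n-1)x^2 = \sum_{k=0}^{n} k(k-1)\binom{n}{k} x^k (1 - x)^{n-k} = \sum_{k=0}^{n} k^2\binom{n}{k} x^k (1 - x)^{n-k} - nx. \tag{4.6.4}$$
From the fact that $(k - nx)^2 = k^2 - 2knx + n^2x^2$, we get from (4.6.3) and (4.6.4)
$$\sum_{k=0}^{n} (nx - k)^2\binom{n}{k} x^k (1 - x)^{n-k} = nx(1 - x).$$
Supposing $n > 0$, and dividing both sides by $n^2$ we arrive at the formula
$$\sum_{k=0}^{n} \Big(x - \frac{k}{n}\Big)^2\binom{n}{k} x^k (1 - x)^{n-k} = \frac{x(1 - x)}{n}. \tag{4.6.5}$$
Using (4.6.5) we get
$$\delta^2 \sum_{|x - k/n| \ge \delta} \binom{n}{k} x^k (1 - x)^{n-k} \le \sum_{|x - k/n| \ge \delta} \Big(x - \frac{k}{n}\Big)^2\binom{n}{k} x^k (1 - x)^{n-k} \le \frac{x(1 - x)}{n} \le \frac{1}{4n}.$$
This shows that $\forall \delta > 0$, the sum on the left goes to zero uniformly in $x$ as $n \to \infty$, which completes the proof of the lemma.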
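The identities used in this proof can be spot-checked numerically. In the sketch below the helper names `A`, `check_identities`, and `tail_sum`, and the sample values $n = 20$, $x = 0.3$, $\delta = 0.2$, are assumptions made for illustration.

```python
from math import comb

def A(n, k, x):
    """A_n(k, x) = C(n, k) x^k (1 - x)^(n - k) for k <= n, and 0 for k > n."""
    return comb(n, k) * x ** k * (1 - x) ** (n - k) if k <= n else 0.0

def check_identities(n, x):
    """Return the sums appearing in condition (c), (4.6.3) and (4.6.5)."""
    total = sum(A(n, k, x) for k in range(n + 1))                      # should be 1
    first = sum(k * A(n, k, x) for k in range(n + 1))                  # should be n x
    second = sum((x - k / n) ** 2 * A(n, k, x) for k in range(n + 1))  # should be x(1-x)/n
    return total, first, second

def tail_sum(n, x, delta):
    """Sum of A_n(k, x) over those k with |x - k/n| >= delta (condition (b))."""
    return sum(A(n, k, x) for k in range(n + 1) if abs(x - k / n) >= delta)

n, x, delta = 20, 0.3, 0.2
print(check_identities(n, x), (1.0, n * x, x * (1 - x) / n))
print(delta ** 2 * tail_sum(n, x, delta), 1 / (4 * n))   # left side should not exceed right
```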
4.6.3 Theorem (Weierstrass). If $f$ is continuous with compact domain $[a, b]$, then $\forall \varepsilon > 0$ there exists a polynomial $p$ so that $\forall x \in [a, b]$, $|f(x) - p(x)| < \varepsilon$.
Proof. Suppose at first that $[a, b] = [0, 1]$. Then by Lemma 4.6.2 and Theorem 4.6.1, the sequence of polynomials given by
$$B_n(f, x) = \sum_{k=0}^{n} f(k/n)\binom{n}{k} x^k (1 - x)^{n-k}$$
converges uniformly to $f$.
In the general case, set $g(y) = f((b - a)y + a)$ for $y \in [0, 1]$. As $y$ ranges over $[0, 1]$, $x = (b - a)y + a$ ranges over $[a, b]$. Now,
$$B_n(g, y) = \sum_{k=0}^{n} g(k/n)\binom{n}{k}\Big(\frac{x - a}{b - a}\Big)^k\Big(\frac{b - x}{b - a}\Big)^{n-k}.$$
The right side is a polynomial in $x$ and thus $f$ is approached uniformly on $[a, b]$ by polynomials. This concludes the proof.
REMARK: For continuous functions $f$ on $[0, 1]$, the polynomials with values $B_n(f, x)$ formed by using the special functions of Lemma 4.6.2 are called the Bernstein polynomials for $f$.
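Because the proof is constructive, the Bernstein polynomials can be computed directly. The sketch below evaluates $B_n(f, x)$ for the continuous but non-differentiable function $f(x) = |x - 1/2|$ and measures the error on a grid; the function names, the test function, and the grid size are assumptions chosen for illustration, and the grid maximum only estimates the true uniform error.

```python
from math import comb

def bernstein(f, n, x):
    """B_n(f, x) = sum_{k=0}^{n} f(k/n) C(n, k) x^k (1 - x)^(n - k)."""
    return sum(f(k / n) * comb(n, k) * x ** k * (1 - x) ** (n - k) for k in range(n + 1))

def max_error(f, n, grid=101):
    """Largest |B_n(f, x) - f(x)| over an evenly spaced grid in [0, 1]."""
    points = [j / (grid - 1) for j in range(grid)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in points)

def kink(x):
    return abs(x - 0.5)

for n in (5, 20, 80, 200):
    print(n, max_error(kink, n))
```

The printed errors decrease as $n$ grows, though slowly, which reflects the estimate $1/(4n\delta^2)$ obtained in the lemma.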
CHAPTER 5 | INTEGRATION
5.1 RIEMANN-DARBOUX INTEGRALS
We now come to the operation of integration, which is, broadly speaking, the inverse of the operation of differentiation. We shall suppose that the reader has already obtained, in his studies of the elementary calculus, the intuitive geometric conception of a Riemann integral as an area. Hence we shall forego a discussion of this aspect of the subject and proceed immediately to the formal aspects.
5.1.1 Definition. A decomposition $\Delta$ of a closed interval $[a, b]$ is a finite set $\{I_k : k \in (1, n)\}$ of closed nonvoid intervals such that any two intervals of this set have at most one point in common and
$$[a, b] = \bigcup \{I_k : k \in (1, n)\}.$$
If $I_k = [a_k, b_k]$, we shall put $|I_k| = b_k - a_k$, and
$$|\Delta| = \max\{|I_k| : k \in (1, n)\}.$$
5.1.2 Definition. A decomposition $\Delta^*$ is called a refinement of the decomposition $\Delta$ (in symbols $\Delta^* \succ \Delta$) if each interval in $\Delta^*$ is contained in an interval in $\Delta$.
If $\Delta_1$ and $\Delta_2$ are decompositions of the same interval, the common refinement of $\Delta_1$ and $\Delta_2$ is the set of all nonvoid intervals each of which is the intersection of an interval of $\Delta_1$ with an interval of $\Delta_2$.
We shall leave as an exercise the fact that if $\Delta^*$ is a refinement of $\Delta$, then each interval in $\Delta$ is a union of intervals in $\Delta^*$. We also think it is clear that the common refinement of two decompositions is actually a refinement of each.
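Let us now turn to the problem of defining Riemann and Darboux sums and integrals.
5.1.3 Definition. Let $f$ be a real-valued function with domain the interval $[a, b]$. Let $R_f$ be the real-valued function with domain the collection of ordered pairs $(\Delta, \{x_k\})$ where $\Delta = \{I_k : k \in (1, n)\}$ is a decomposition of $[a, b]$ and $x_k \in I_k$, and defined by
$$R_f(\Delta, \{x_k\}) = \sum_{k=1}^{n} f(x_k)\,|I_k|.$$
The function $R_f$ is called the Riemann sum function for $f$ and any number in its range is called a Riemann sum for $f$.
The function $R_f$ is said to have the limit $R(f) \Leftrightarrow \forall \varepsilon > 0$, $\exists \Delta_\varepsilon$ so that $\Delta \succ \Delta_\varepsilon \Rightarrow |R_f(\Delta, \{x_k\}) - R(f)| < \varepsilon$. In case $R_f$ has a limit, we say that $f$ is Riemann integrable and the limit $R(f)$ is the Riemann integral of $f$.
Note that in case $R_f$ has a limit $R(f)$, we are justified in calling $R(f)$ the limit of $R_f$ since it is unique. Indeed suppose $R_1(f)$ is also a limit of $R_f$. Then $\forall \varepsilon > 0$, $\exists \Delta_\varepsilon$ and $\exists \Delta^1_\varepsilon$ so that if $\Delta \succ \Delta_\varepsilon$ and $\Delta \succ \Delta^1_\varepsilon$, then $|R_f(\Delta, \{x_k\}) - R(f)| < \varepsilon/2$ and $|R_f(\Delta, \{x_k\}) - R_1(f)| < \varepsilon/2$. Let $\Delta^*_\varepsilon$ be the common refinement of $\Delta_\varepsilon$ and $\Delta^1_\varepsilon$. Then $\Delta \succ \Delta^*_\varepsilon \Rightarrow \Delta \succ \Delta_\varepsilon$ and $\Delta \succ \Delta^1_\varepsilon$. Hence $\Delta \succ \Delta^*_\varepsilon \Rightarrow |R(f) - R_1(f)| \le |R(f) - R_f(\Delta, \{x_k\})| + |R_f(\Delta, \{x_k\}) - R_1(f)| < \varepsilon$. Since this is true $\forall \varepsilon > 0$, we have $R(f) = R_1(f)$.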
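5.1.4 Definition. Let $f$ be a real-valued bounded function with domain $[a, b]$ and let $\overline{D}_f$ and $\underline{D}_f$ be real-valued functions with domain the set of decompositions of $[a, b]$ so that if $\Delta = \{I_k : k \in (1, n)\}$,
$$\overline{D}_f(\Delta) = \sum_{k=1}^{n} M_k |I_k|, \qquad M_k = \sup\{f(x) : x \in I_k\},$$
$$\underline{D}_f(\Delta) = \sum_{k=1}^{n} m_k |I_k|, \qquad m_k = \inf\{f(x) : x \in I_k\}.$$
The functions $\overline{D}_f$ and $\underline{D}_f$ are called the upper and lower Darboux sum functions for $f$, respectively. The numbers $\overline{D}_f(\Delta)$ and $\underline{D}_f(\Delta)$ are called upper Darboux and lower Darboux sums for $f$, respectively.
Set
$$\overline{D}(f) = \inf\{\overline{D}_f(\Delta)\}, \qquad \underline{D}(f) = \sup\{\underline{D}_f(\Delta)\},$$
where $\Delta$ ranges over all decompositions of $[a, b]$, and call these numbers the upper and lower Darboux integrals of $f$, respectively. In case $\overline{D}(f) = \underline{D}(f) = D(f)$, we say that $f$ is Darboux integrable and call $D(f)$ the Darboux integral of $f$.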
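The three kinds of sums can be compared on a simple example. The sketch below treats only an increasing function, for which the supremum and infimum on each interval are attained at the right and left end points; that restriction, the evenly spaced decomposition, and all function names are assumptions made for illustration.

```python
def uniform_decomposition(a, b, n):
    """Decomposition of [a, b] into n intervals of equal length, as (left, right) pairs."""
    return [(a + (b - a) * k / n, a + (b - a) * (k + 1) / n) for k in range(n)]

def riemann_sum(f, intervals, points):
    """R_f(decomposition, {x_k}) = sum of f(x_k) |I_k| with x_k taken from points."""
    return sum(f(x) * (hi - lo) for (lo, hi), x in zip(intervals, points))

def darboux_sums_increasing(f, intervals):
    """(lower, upper) Darboux sums, valid only when f is increasing on each interval."""
    lower = sum(f(lo) * (hi - lo) for lo, hi in intervals)   # m_k = f(left end point)
    upper = sum(f(hi) * (hi - lo) for lo, hi in intervals)   # M_k = f(right end point)
    return lower, upper

def square(x):
    return x * x

intervals = uniform_decomposition(0.0, 1.0, 100)
midpoints = [(lo + hi) / 2 for lo, hi in intervals]
lower, upper = darboux_sums_increasing(square, intervals)
mid = riemann_sum(square, intervals, midpoints)
print(lower, mid, upper)
```

Every Riemann sum for a decomposition lies between the lower and upper Darboux sums for the same decomposition, which is the inequality used repeatedly in what follows.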
The fundamental theorem of this section is the following.
5.1.5 Theorem. The Riemann integral of $f$ exists if and only if the Darboux integral of $f$ exists, and if they exist, then $R(f) = D(f)$.
From this theorem we are justified in denoting the common value of $R(f)$ and $D(f)$, if they exist, by one symbol. The standard symbol is
$$\int_a^b f(x)\,dx,$$
and we shall call this the Riemann-Darboux integral. Before we prove Theorem 5.1.5 it is necessary to establish the following lemma.
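5.1.6 Lemma. If $f$ is a bounded function defined on $[a, b]$ and $\Delta^*$ is a refinement of $\Delta$, then
$$\underline{D}_f(\Delta) \le \underline{D}_f(\Delta^*) \le \overline{D}_f(\Delta^*) \le \overline{D}_f(\Delta).$$
Proof. By definition
$$\underline{D}_f(\Delta) = \sum_{k=1}^{n} m_k |I_k|, \qquad m_k = \inf\{f(x) : x \in I_k\}.$$
For each $I_k \in \Delta$, let
$$\Delta_k = \{I^* : I^* \in \Delta^* \ \&\ I^* \subset I_k\}.$$
Since $\Delta^* \succ \Delta$, it follows that
$$I_k = \bigcup \{I^* : I^* \in \Delta_k\}$$
and hence (Exercise 2 at the end of this section)
$$|I_k| = \sum_{I^* \in \Delta_k} |I^*|.$$
If $m^*(I^*) = \inf\{f(x) : x \in I^*\}$ and $I^* \subset I_k$, it is clear that $m_k \le m^*(I^*)$. Thus we get
$$\sum_{k=1}^{n} m_k |I_k| \le \sum_{k=1}^{n} \sum_{I^* \in \Delta_k} m^*(I^*)\,|I^*|.$$
Since $\Delta^* = \{I^* : I^* \in \Delta_k \ \&\ k \in (1, n)\}$, it follows that the right side of the above inequality is precisely $\underline{D}_f(\Delta^*)$. This proves the left-hand inequality of Lemma 5.1.6. The right-hand inequality follows by similar reasoning, and the middle inequality is obvious.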
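Proof of Theorem 5.1.5. Suppose that the Riemann integral of $f$ exists. $f$ must be bounded (Exercise 4 of Section 5.1) and $\forall \varepsilon > 0$ there is a decomposition $\Delta$ of $[a, b]$ such that
$$-\varepsilon/2 < R_f(\Delta, \{x_k\}) - R(f) < \varepsilon/2. \tag{5.1.1}$$
We shall suppose that $a < b$, since otherwise the fact that $D(f)$ exists and is equal to $R(f)$ is trivial. If we choose $x_k \in I_k$ so that $M_k - f(x_k) < \varepsilon/2(b - a)$, then
$$\overline{D}_f(\Delta) - R_f(\Delta, \{x_k\}) = \sum_{k=1}^{n} (M_k - f(x_k))\,|I_k| < \varepsilon/2. \tag{5.1.2}$$
On the other hand, if we choose $x_k \in I_k$ so that $f(x_k) - m_k < \varepsilon/2(b - a)$, we get
$$R_f(\Delta, \{x_k\}) - \underline{D}_f(\Delta) = \sum_{k=1}^{n} (f(x_k) - m_k)\,|I_k| < \varepsilon/2. \tag{5.1.3}$$
From (5.1.1) and (5.1.2) and the definition of $\overline{D}(f)$ we get
$$\overline{D}(f) - R(f) \le \overline{D}_f(\Delta) - R(f) < \varepsilon, \tag{5.1.4}$$
and from (5.1.1) and (5.1.3) we get
$$R(f) - \underline{D}(f) \le R(f) - \underline{D}_f(\Delta) < \varepsilon, \tag{5.1.5}$$
which, since $\underline{D}(f) \le \overline{D}(f)$, implies $\underline{D}(f) = \overline{D}(f) = D(f)$. Replacing $\overline{D}(f)$ and $\underline{D}(f)$ by $D(f)$ in (5.1.4) and (5.1.5), respectively, leads to the conclusion that $D(f) = R(f)$.
Conversely, suppose that the Darboux integral of $f$ exists. For every $\varepsilon > 0$ there are decompositions $\Delta_1$ and $\Delta_2$ so that $D(f) - \underline{D}_f(\Delta_1) < \varepsilon$, and $\overline{D}_f(\Delta_2) - D(f) < \varepsilon$. If $\Delta_\varepsilon$ is the common refinement of $\Delta_1$ and $\Delta_2$, then Lemma 5.1.6 gives
$$D(f) - \underline{D}_f(\Delta_\varepsilon) < \varepsilon, \qquad \overline{D}_f(\Delta_\varepsilon) - D(f) < \varepsilon. \tag{5.1.6}$$
If $\Delta \succ \Delta_\varepsilon$, then by Lemma 5.1.6, and (5.1.6),
$$-\varepsilon < \underline{D}_f(\Delta_\varepsilon) - D(f) \le \underline{D}_f(\Delta) - D(f) \le R_f(\Delta, \{x_k\}) - D(f) \le \overline{D}_f(\Delta) - D(f) \le \overline{D}_f(\Delta_\varepsilon) - D(f) < \varepsilon.$$
This shows that the Riemann integral of $f$ exists and $R(f) = D(f)$, which completes the proof of the theorem.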
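5.1.7 Theorem. The Riemann-Darboux integral of the function $f$ exists $\Leftrightarrow$ the function $R_f$ is Cauchy in the sense that $\forall \varepsilon > 0$, $\exists \Delta_\varepsilon$ so that $\Delta \succ \Delta_\varepsilon$ and $\Delta' \succ \Delta_\varepsilon \Rightarrow$
$$|R_f(\Delta, \{x_k\}) - R_f(\Delta', \{x'_k\})| < \varepsilon.$$
Proof. If the Riemann-Darboux integral of $f$ exists, then $\forall \varepsilon > 0$, $\exists \Delta_\varepsilon$ so that $\Delta \succ \Delta_\varepsilon$ and $\Delta' \succ \Delta_\varepsilon \Rightarrow |R_f(\Delta, \{x_k\}) - R(f)| < \varepsilon/2$, $|R_f(\Delta', \{x'_k\}) - R(f)| < \varepsilon/2$. Hence, by the triangle inequality,
$$|R_f(\Delta, \{x_k\}) - R_f(\Delta', \{x'_k\})| < \varepsilon.$$
Conversely, suppose $R_f$ is Cauchy. The method of obtaining a limit is a variation on the theme of the proof of Proposition 2.4.4. There exists a decomposition $\Delta_0$ so that $\Delta \succ \Delta_0 \Rightarrow$
$$|R_f(\Delta, \{x_k\}) - R_f(\Delta_0, \{x_{0k}\})| < 1.$$
Hence $\Delta \succ \Delta_0 \Rightarrow |R_f(\Delta, \{x_k\})|$ is bounded by $1 + |R_f(\Delta_0, \{x_{0k}\})|$. For every $\Delta \succ \Delta_0$ let us set
$$\overline{\omega}(\Delta) = \sup\{R_f(\Delta', \{x'_k\}) : \Delta' \succ \Delta\},$$
$$\lim R_f = \inf\{\overline{\omega}(\Delta) : \Delta \succ \Delta_0\}.$$
We claim that $\lim R_f$ is the Riemann integral of $f$. First, note that $\Delta^* \succ \Delta \Rightarrow \overline{\omega}(\Delta^*) \le \overline{\omega}(\Delta)$. Next $\forall \varepsilon > 0$, $\exists \Delta_1 \succ \Delta_0$ so that
$$0 \le \overline{\omega}(\Delta_1) - \lim R_f < \varepsilon. \tag{5.1.7}$$
Also, $\exists \Delta_2 \succ \Delta_0$ so that $\Delta, \Delta^* \succ \Delta_2 \Rightarrow$
$$|R_f(\Delta, \{x_k\}) - R_f(\Delta^*, \{x^*_k\})| < \varepsilon/2,$$
and $\exists \Delta' \succ \Delta_2$ so that
$$0 \le \overline{\omega}(\Delta_2) - R_f(\Delta', \{x'_k\}) < \varepsilon/2. \tag{5.1.8}$$
Since $\Delta' \succ \Delta_2$, it follows that $\forall \Delta \succ \Delta_2$ we get
$$|R_f(\Delta, \{x_k\}) - R_f(\Delta', \{x'_k\})| < \varepsilon/2. \tag{5.1.9}$$
The inequalities (5.1.8) and (5.1.9) show that $\forall \Delta \succ \Delta_2$,
$$0 \le \overline{\omega}(\Delta_2) - R_f(\Delta, \{x_k\}) < \varepsilon. \tag{5.1.10}$$
Let $\Delta_\varepsilon$ be the common refinement of $\Delta_1$ and $\Delta_2$. Then $\forall \Delta \succ \Delta_\varepsilon$ we get from (5.1.7) and (5.1.10) and the monotone character of $\overline{\omega}$ that
$$0 \le \overline{\omega}(\Delta) - \lim R_f < \varepsilon, \qquad 0 \le \overline{\omega}(\Delta) - R_f(\Delta, \{x_k\}) < \varepsilon.$$
From these inequalities it is immediate that $\Delta \succ \Delta_\varepsilon \Rightarrow$
$$|R_f(\Delta, \{x_k\}) - \lim R_f| < \varepsilon.$$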
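5.1.8 Corollary. Suppose $[c, d] \subset [a, b]$, $f$ is defined on $[a, b]$, and $g$ is the restriction of $f$ to $[c, d]$. If $f$ is Riemann-Darboux integrable, then so is $g$.
Proof. Let $\Delta_0$ be the decomposition of $[a, b]$ that consists of the intervals $[a, c]$, $[c, d]$, and $[d, b]$. For every $\varepsilon > 0$, let $\Delta_\varepsilon$ be a refinement of $\Delta_0$ so that $\Delta \succ \Delta_\varepsilon$ and $\Delta' \succ \Delta_\varepsilon \Rightarrow$
$$|R_f(\Delta, \{x_k\}) - R_f(\Delta', \{x'_k\})| < \varepsilon.$$
Let $\Delta_{1\varepsilon}$, $\Delta_{2\varepsilon}$, and $\Delta_{3\varepsilon}$ be the subsets of $\Delta_\varepsilon$ which are decompositions of $[a, c]$, $[c, d]$, and $[d, b]$, respectively. If $\Delta_2$ is a decomposition of $[c, d]$ and $\Delta_2 \succ \Delta_{2\varepsilon}$, then $\Delta = \Delta_{1\varepsilon} \cup \Delta_2 \cup \Delta_{3\varepsilon}$ is a decomposition of $[a, b]$, which is a refinement of $\Delta_\varepsilon$. If $\Delta'_2 \succ \Delta_{2\varepsilon}$ and $\Delta' = \Delta_{1\varepsilon} \cup \Delta'_2 \cup \Delta_{3\varepsilon}$, then, choosing the same points in the intervals of $\Delta_{1\varepsilon}$ and $\Delta_{3\varepsilon}$ for both sums,
$$|R_g(\Delta_2, \{x_k\}) - R_g(\Delta'_2, \{x'_k\})| = |R_f(\Delta, \{x_k\}) - R_f(\Delta', \{x'_k\})| < \varepsilon.$$
Hence $R_g$ is Cauchy and the corollary follows from Theorem 5.1.7.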

5.1.9 Theorem. If f and g are Riemann-Darboux integrable functions
de.fined on [a,b] , then
( a) for all real numbers a and f3, af + {3g is integrable,
(b) f g is integrable, and
(c) Ill is integrable.
188 I INTEGRATION
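Proof. For every $\varepsilon > 0$, $\exists \Delta_\varepsilon$ so that $\Delta \succ \Delta_\varepsilon \Rightarrow$
$$\Big|\sum_{k=1}^{n} \alpha f(x_k)\,|I_k| - \alpha\int_a^b f(x)\,dx\Big| < \varepsilon/2, \qquad \Big|\sum_{k=1}^{n} \beta g(x_k)\,|I_k| - \beta\int_a^b g(x)\,dx\Big| < \varepsilon/2.$$
Thus
$$\Big|\sum_{k=1}^{n} [\alpha f(x_k) + \beta g(x_k)]\,|I_k| - \alpha\int_a^b f(x)\,dx - \beta\int_a^b g(x)\,dx\Big| < \varepsilon.$$
This proves part (a).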
To prove part (b) we first note that it is enough to prove that the square of any integrable function is integrable. Indeed, if this is the case, then from the formula
$$fg = \tfrac{1}{4}\,[(f + g)^2 - (f - g)^2],$$
and the fact that $f + g$ and $f - g$ are integrable, it follows that $fg$ is integrable.
Let us suppose at first that $f \ge 0$ and $M$ is an upper bound for $f$. For every $\varepsilon > 0$, $\exists \Delta$ so that
$$\overline{D}_f(\Delta) - \underline{D}_f(\Delta) < \varepsilon/2M.$$
If $I_k \in \Delta$, $m_k = \inf\{f(x) : x \in I_k\}$, $M_k = \sup\{f(x) : x \in I_k\}$, then since $f \ge 0$ we get $m_k^2 = \inf\{f(x)^2 : x \in I_k\}$, $M_k^2 = \sup\{f(x)^2 : x \in I_k\}$. Hence
$$\overline{D}_{f^2}(\Delta) - \underline{D}_{f^2}(\Delta) = \sum_{k=1}^{n} (M_k^2 - m_k^2)\,|I_k| = \sum_{k=1}^{n} (M_k + m_k)(M_k - m_k)\,|I_k| \le 2M\,[\overline{D}_f(\Delta) - \underline{D}_f(\Delta)] < \varepsilon.$$
This shows that $\overline{D}(f^2) = \underline{D}(f^2)$.
In the general case let $m = \inf\{f(x) : x \in [a, b]\}$. Then $f - m$ is a nonnegative integrable function. Hence $(f - m)^2$ is integrable. But
$$(f - m)^2 = f^2 - 2mf + m^2,$$
and, since $2mf$ and $m^2$ are integrable, it follows that $f^2$ is integrable. This completes the proof of part (b).
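To prove part (c), since $|f| = (f^2)^{1/2}$, it is enough to prove that the positive square root of any nonnegative integrable function is integrable. Suppose $g \ge 0$ and integrable. For any $\varepsilon > 0$, $\exists \Delta$ so that
$$\overline{D}_g(\Delta) - \underline{D}_g(\Delta) < \varepsilon^2.$$
Let $I_k \in \Delta$, $m_k = \inf\{g(x)^{1/2} : x \in I_k\}$, $M_k = \sup\{g(x)^{1/2} : x \in I_k\}$, $A = \{k : M_k + m_k < \varepsilon\}$ and $B = \{k : M_k + m_k \ge \varepsilon\}$. Then
$$\sum_{k \in A} (M_k - m_k)\,|I_k| < \varepsilon(b - a),$$
$$\varepsilon\sum_{k \in B} (M_k - m_k)\,|I_k| \le \sum_{k \in B} (M_k + m_k)(M_k - m_k)\,|I_k| \le \overline{D}_g(\Delta) - \underline{D}_g(\Delta) < \varepsilon^2.$$
Hence
$$\sum_{k=1}^{n} (M_k - m_k)\,|I_k| < \varepsilon(b - a + 1),$$
which completes the proof of (c).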
Exercises
1. If $\Delta^*$ is a refinement of $\Delta$, show that every interval in $\Delta$ is a union of intervals in $\Delta^*$.
2. If $\Delta$ is a decomposition of $[a, b]$, $a \le b$, show that
$$b - a = \sum_{k=1}^{n} |I_k|.$$
3.
If f is a bounded function defined on
[a, b],
establish the fact
that
4.
Show that a Riemann integrable function must be bounded.
5.
Using the definition of a Riemann-Darboux integral, show the
following:
J: x dx= 1/2.
r x2 dx 2/3.
f dx= -
(a)
(b)
-1
(c)
1.
6. If $\Delta$ is a decomposition of $[a, b]$, we have defined
$$|\Delta| = \max\{|I| : I \in \Delta\}.$$
Show that a bounded function $f$, with domain $[a, b]$, is Darboux integrable $\Leftrightarrow \forall \varepsilon > 0$, $\exists \delta > 0$ so that
$$|\Delta| < \delta \Rightarrow 0 \le \overline{D}_f(\Delta) - \underline{D}_f(\Delta) < \varepsilon.$$
7. Show that a function $f$, with domain $[a, b]$, is Riemann integrable $\Leftrightarrow$ there is a number $R(f)$ so that $\forall \varepsilon > 0$, $\exists \delta > 0$ so that for every decomposition $\Delta$ of $[a, b]$ with $|\Delta| < \delta$,
$$|R_f(\Delta, \{x_k\}) - R(f)| < \varepsilon.$$
8. If $f$ is Riemann-Darboux integrable on $[0, 1]$ and if
$$S_n = \frac{1}{n}\sum_{k=1}^{n} f(k/n),$$
show that
$$S_n \to \int_0^1 f(x)\,dx.$$
(Hint: The results of Exercises 6 or 7 may be useful.)
9. Evaluate the following limits:
(a) $\displaystyle\lim_{n \to \infty} \frac{1}{n}\sum_{k=1}^{n} \Big(\frac{k}{n}\Big)^2$.
(b) $\displaystyle\lim_{n \to \infty} \frac{1}{n}\sum_{k=1}^{n} e^{k/n}$.
(Hint: See Exercises 5 and 8.)
10. If $f$ is defined on $[a, b]$ and integrable, and $g$ differs from $f$ at only a finite set of points, show that $g$ is integrable and has the same integral as $f$.
5.2 PROPERTIES AND EXISTENCE OF RIEMANN-DARBOUX INTEGRALS
In this section we bring together the relevant properties of Riemann-Darboux integrals and prove some theorems about their existence. Most of the results we prove here should be known to the reader, at least in form, from the elementary calculus.
5.2.1 Theorem. If $f$ and $g$ are functions defined on $[a, b]$ and are Riemann-Darboux integrable, then
(a) for all real numbers $\alpha$ and $\beta$,
$$\int_a^b [\alpha f(x) + \beta g(x)]\,dx = \alpha\int_a^b f(x)\,dx + \beta\int_a^b g(x)\,dx,$$
(b) $f \ge 0$ implies $\int_a^b f(x)\,dx \ge 0$,
(c) $\big|\int_a^b f(x)\,dx\big| \le \int_a^b |f(x)|\,dx$, and
(d) $a \le c \le b$ implies
$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx.$$
Proof. The proof of part (a) is the same as the proof of part (a) of Theorem 5.1.9.
Part (b) follows from the fact that all the Riemann sums are nonnegative, and hence the limit must also be nonnegative.
To prove part (c) we note that it is always true that $|f| - f \ge 0$ and $|f| + f \ge 0$. By combining the additive property proved in (a) with the positivity property of (b) we get
$$\pm \int_a^b f(x)\,dx \le \int_a^b |f(x)|\,dx.$$
This is the content of part (c).
To prove (d) we first remark that by Corollary 5.1.8 the restrictions of $f$ to $[a, c]$ and $[c, b]$ are integrable. Now, $\forall \varepsilon > 0$, there exist decompositions $\Delta_{1\varepsilon}$ and $\Delta_{2\varepsilon}$ of $[a, c]$ and $[c, b]$, respectively, so that if $\Delta_1 \succ \Delta_{1\varepsilon}$ and $\Delta_2 \succ \Delta_{2\varepsilon}$, then
$$\left| R_{f_1}(\Delta_1, \{x_{1k}\}) - \int_a^c f_1(x)\,dx \right| < \varepsilon/2,$$
$$\left| R_{f_2}(\Delta_2, \{x_{2k}\}) - \int_c^b f_2(x)\,dx \right| < \varepsilon/2,$$
where $f_1$ and $f_2$ are the restrictions of $f$ to $[a, c]$ and $[c, b]$, respectively.
If $\Delta_\varepsilon = \Delta_{1\varepsilon} \cup \Delta_{2\varepsilon}$ and $\Delta = \Delta_1 \cup \Delta_2$, it follows that $\Delta$ is a decomposition of $[a, b]$ with $\Delta \succ \Delta_\varepsilon$, and if $\{x_k\} = \{x_{1k}\} \cup \{x_{2k}\}$, then by the triangle inequality
$$\left| R_f(\Delta, \{x_k\}) - \int_a^c f(x)\,dx - \int_c^b f(x)\,dx \right| < \varepsilon.$$
Since every refinement of $\Delta_\varepsilon$ is clearly of the form $\Delta_1 \cup \Delta_2$, we are done.
5.2.2 Theorem. If $f$ is defined and integrable on $[a, b]$, and $g$ is defined and continuous on $[a, b]$, differentiable on $]a, b[$, and $\forall x \in\, ]a, b[$, we have $g'(x) = f(x)$, then
$$\int_a^b f(x)\,dx = g(b) - g(a).$$
Proof. Let $\Delta = \{I_k : k \in (1, n)\}$ be any decomposition of $[a, b]$ with $I_k = [a_k, b_k]$ and $a_k < b_k$. Then
$$g(b) - g(a) = \sum_{k=1}^{n} [g(b_k) - g(a_k)] = \sum_{k=1}^{n} g'(x_k)[b_k - a_k] = \sum_{k=1}^{n} f(x_k)|I_k|, \qquad x_k \in\, ]a_k, b_k[.$$
We have, of course, used the Mean Value Theorem, and this is where we require that $g$ be continuous at the end points.
For any $\varepsilon > 0$, if we choose $\Delta$ so that
$$\left| R_f(\Delta, \{x_k\}) - \int_a^b f(x)\,dx \right| < \varepsilon,$$
it follows that for the particular choice of $\{x_k\}$ as given above by the use of the Mean Value Theorem,
$$\left| \int_a^b f(x)\,dx - [g(b) - g(a)] \right| = \left| \int_a^b f(x)\,dx - R_f(\Delta, \{x_k\}) \right| < \varepsilon.$$
This completes the proof of the theorem.
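The statement of Theorem 5.2.2 can be checked numerically. The following is only a sketch under assumptions that are not in the text: the functions $g(x) = x^3$ and $f(x) = 3x^2$ on $[0, 2]$ are chosen for illustration, and the Riemann sum uses equal subintervals tagged at midpoints (the helper name `riemann_sum` is hypothetical).

```python
def riemann_sum(f, a, b, n):
    """Midpoint Riemann sum of f over n equal subintervals of [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Illustrative choice (not from the text): g(x) = x**3, so g'(x) = f(x) = 3*x**2.
f = lambda x: 3.0 * x * x
g = lambda x: x ** 3

approx = riemann_sum(f, 0.0, 2.0, 1000)  # Riemann sum for the integral of f over [0, 2]
exact = g(2.0) - g(0.0)                  # value predicted by Theorem 5.2.2
print(approx, exact)
```

As the decomposition is refined, the Riemann sum should approach $g(b) - g(a)$, which is what the theorem asserts for the limit.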
5.2.3 Corollary (Integration by Parts). If $f$ and $g$ are defined and continuous on $[a, b]$ and differentiable on $]a, b[$ and $fg'$ and $f'g$ are integrable (by defining $f'$ and $g'$ in any way at $a$ and $b$), then
$$\int_a^b f(x)g'(x)\,dx + \int_a^b f'(x)g(x)\,dx = f(b)g(b) - f(a)g(a).$$
Proof. $f(x)g'(x) + f'(x)g(x) = [f(x)g(x)]'$. Since the left side is integrable, the conclusion is an immediate consequence of the previous theorem (see Exercise 10 of Section 5.1).
As an application of Theorem 5.2.2 we can obtain an integral remainder formula in Taylor's formula. We shall suppose that $f$ is defined and has $n - 1$ continuous derivatives on $[a, b]$, is $n$ times differentiable on $]a, b[$ and $f^{(n)}$ (when defined in any way at $a$ and $b$) is integrable. Let us write
$$f(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(t)}{k!}(x - t)^k + R_n(x, t), \qquad x, t \in [a, b].$$
Clearly, $R_n(t, t) = 0$. If we differentiate with respect to $t$, holding $x$ fixed, we get
$$R'_n(x, t) = -\frac{(x - t)^{n-1}}{(n - 1)!} f^{(n)}(t), \qquad t \in\, ]a, b[.$$
Hence, if $a \le c \le x \le b$, we have from Theorem 5.2.2,
$$R_n(x, c) - R_n(x, x) = -\int_c^x R'_n(x, t)\,dt = \int_c^x \frac{(x - t)^{n-1}}{(n - 1)!} f^{(n)}(t)\,dt.$$
If $\beta \ge \alpha$ and $g$ is integrable on $[\alpha, \beta]$, define
$$\int_\beta^\alpha g(t)\,dt = -\int_\alpha^\beta g(t)\,dt.$$
With this definition, the formula for $R_n$ is valid as well for $a \le x \le c \le b$. Hence
$$R_n(x, c) = \frac{1}{(n - 1)!} \int_c^x (x - t)^{n-1} f^{(n)}(t)\,dt. \qquad (5.2.1)$$
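Formula (5.2.1) can be compared against the remainder computed directly from its definition. The sketch below assumes $f = \exp$, $c = 0$, $x = 1$, and $n = 4$, none of which come from the text, and approximates the integral by a midpoint rule; the helper names are hypothetical.

```python
import math

def midpoint(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def remainder_direct(x, c, n):
    # For f = exp every derivative at c equals exp(c), so the Taylor
    # polynomial of degree n - 1 about c is easy to write down.
    poly = sum(math.exp(c) * (x - c) ** k / math.factorial(k) for k in range(n))
    return math.exp(x) - poly

def remainder_integral(x, c, n):
    # Right side of (5.2.1) with f^(n) = exp.
    integrand = lambda t: (x - t) ** (n - 1) * math.exp(t)
    return midpoint(integrand, c, x) / math.factorial(n - 1)

direct = remainder_direct(1.0, 0.0, 4)
via_integral = remainder_integral(1.0, 0.0, 4)
print(direct, via_integral)
```

The two numbers should agree up to the quadrature error of the midpoint rule.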
5.2.4 Theorem. If $g$ is a continuous increasing function on $[a, b]$ and differentiable on $]a, b[$, $f$ is defined on $\mathscr{R}(g)$ and integrable, and $(f \circ g)g'$ is integrable, then
$$\int_{g(a)}^{g(b)} f(x)\,dx = \int_a^b f \circ g(x)\,g'(x)\,dx.$$
Proof. If $\Delta = \{[a_k, b_k] : k \in (1, n)\}$ is a decomposition of $[a, b]$, then $\Delta' = \{[a'_k, b'_k] : k \in (1, n)\}$, where $a'_k = g(a_k)$, $b'_k = g(b_k)$, is a decomposition of $[g(a), g(b)]$. Conversely, since $g$ is continuous and increasing, it is one to one and every decomposition of $[g(a), g(b)]$ comes from a decomposition of $[a, b]$ in this way. Further, $x_k \in [a_k, b_k] \Leftrightarrow x'_k = g(x_k) \in [a'_k, b'_k]$ and $\Delta_1 \succ \Delta \Leftrightarrow \Delta'_1 \succ \Delta'$.
Since $f$ is integrable, for every $\varepsilon > 0$ there exists a decomposition $\Delta'_\varepsilon$ of $[g(a), g(b)]$ so that if $\Delta' \succ \Delta'_\varepsilon$, then
$$\left| R_f(\Delta', \{x'_k\}) - \int_{g(a)}^{g(b)} f(x)\,dx \right| < \varepsilon. \qquad (5.2.2)$$
Let $\Delta$ and $\Delta_\varepsilon$ be the corresponding decompositions of $[a, b]$. If $I'_k \in \Delta'$, and $I_k$ the corresponding interval in $\Delta$, use the Mean Value Theorem to get
$$|I'_k| = g(b_k) - g(a_k) = g'(x_k)(b_k - a_k) = g'(x_k)|I_k|. \qquad (5.2.3)$$
If we take $x'_k = g(x_k)$ in (5.2.2), then from (5.2.2) and (5.2.3) we get $\forall \Delta \succ \Delta_\varepsilon$,
$$\left| \sum_{k=1}^{n} f \circ g(x_k)\,g'(x_k)|I_k| - \int_{g(a)}^{g(b)} f(x)\,dx \right| < \varepsilon.$$
Since $f \circ g(x)\,g'(x)$ is integrable, this shows the equality of the two integrals (see Exercise 10 of Section 5.1).
REMARK: Because the last inequality in the previous proof holds $\forall \Delta \succ \Delta_\varepsilon$, we might be tempted to conclude that $(f \circ g)g'$ is integrable and we can eliminate this hypothesis from the theorem. However, note that for each $\Delta \succ \Delta_\varepsilon$ we have this inequality true for only a special set $\{x_k\}$ and not for all such sets.
The previous theorem justifies, in many instances, the process used in elementary calculus for changing variables under an integral sign. For example, suppose we wish to compute
$$\int_0^1 \sqrt{1 - x^2}\,dx.$$
In this case $f(x) = \sqrt{1 - x^2}$. Let us take $g(x) = \sin x$ for $0 \le x \le \pi/2$. Then $g$ is monotone increasing, $g(0) = 0$, and $g(\pi/2) = 1$. Thus the hypotheses of the last theorem are satisfied. We have $f \circ g(x) = \cos x$, $g'(x) = \cos x$, and thus
$$\int_0^1 \sqrt{1 - x^2}\,dx = \int_0^{\pi/2} \cos^2 x\,dx.$$
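The change of variable in this example can be verified numerically. This is a rough sketch: both integrals are approximated by a midpoint rule with a hypothetical helper `midpoint`, and the reference value $\pi/4$ for the cosine integral is a standard fact rather than something stated in the text.

```python
import math

def midpoint(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Left side: integral of sqrt(1 - x^2) over [0, 1].
left = midpoint(lambda x: math.sqrt(1.0 - x * x), 0.0, 1.0)
# Right side: integral of cos^2 x over [0, pi/2], obtained from g(x) = sin x.
right = midpoint(lambda x: math.cos(x) ** 2, 0.0, math.pi / 2)
print(left, right, math.pi / 4)
```

The left integrand has an unbounded derivative at $x = 1$, so its midpoint approximation converges more slowly than the right one; the two values should still agree to several decimal places.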
5.2.5 Theorem. If $f$ is defined on $[a, b]$ and integrable, and if $F$ is also defined on $[a, b]$ by means of the equation
$$F(x) = \int_a^x f(t)\,dt,$$
then $F$ is differentiable at every point $x \in [a, b]$ which is a point of continuity of $f$, and $F'(x) = f(x)$.
Proof. Suppose $x$ is a point of continuity of $f$. Then if $x + h \in [a, b]$, we have
$$\frac{F(x + h) - F(x)}{h} - f(x) = \frac{1}{h} \left\{ \int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt \right\} - f(x).$$
If $h > 0$, we have
$$\int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt = \int_x^{x+h} f(t)\,dt, \qquad f(x) = \frac{1}{h} \int_x^{x+h} f(x)\,dt.$$
Hence
$$\left| \frac{F(x + h) - F(x)}{h} - f(x) \right| \le \frac{1}{h} \int_x^{x+h} |f(t) - f(x)|\,dt.$$
Since $f$ is continuous at $x$, $\forall \varepsilon > 0$, $\exists\,\delta$ so that $|t - x| < \delta$ & $t \in \mathscr{D}(f)$ $\Rightarrow$ $|f(t) - f(x)| < \varepsilon$. If we take $h < \delta$, we are led to the conclusion
$$\lim_{h \searrow 0} \frac{F(x + h) - F(x)}{h} = f(x).$$
If $h < 0$, a similar argument shows that
$$\lim_{h \nearrow 0} \frac{F(x + h) - F(x)}{h} = f(x),$$
which concludes the proof of the theorem.
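The role of the continuity hypothesis in Theorem 5.2.5 can be seen on a step function. The sketch below assumes an illustrative $f$ on $[0, 2]$ that is 0 for $t < 1$ and 1 for $t \ge 1$ (not an example from the text), for which $F$ has the closed form $F(x) = \max(0, x - 1)$.

```python
def f(t):
    """Step function: continuous everywhere on [0, 2] except at t = 1."""
    return 1.0 if t >= 1.0 else 0.0

def F(x):
    """Closed form of the integral of f over [0, x] for 0 <= x <= 2."""
    return max(0.0, x - 1.0)

def quotient(x, h):
    """Difference quotient (F(x + h) - F(x)) / h."""
    return (F(x + h) - F(x)) / h

h = 1e-3
at_continuity = quotient(1.5, h)   # x = 1.5 is a point of continuity, f(1.5) = 1
right_at_jump = quotient(1.0, h)   # one-sided quotient from the right at the jump
left_at_jump = quotient(1.0, -h)   # one-sided quotient from the left at the jump
print(at_continuity, right_at_jump, left_at_jump)
```

At the point of continuity the quotient matches $f(x)$, while at the jump the one-sided quotients disagree, so $F$ is not differentiable there; the theorem makes no claim at such points.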
5.2.6 First Mean Value Theorem. If $f$ and $g$ are defined on $[a, b]$, $g \ge 0$, $fg$ and $g$ are integrable, and $f$ is bounded, then there exists a number $c$ such that $\inf f \le c \le \sup f$ and
$$\int_a^b f(x)g(x)\,dx = c \int_a^b g(x)\,dx.$$
Proof. Let $m = \inf f$, $M = \sup f$. Since $mg(x) \le f(x)g(x) \le Mg(x)$, it follows that
$$m \int_a^b g(x)\,dx \le \int_a^b f(x)g(x)\,dx \le M \int_a^b g(x)\,dx.$$
Now, if $\int_a^b g(x)\,dx = 0$, then the theorem is clearly true. If $\int_a^b g(x)\,dx > 0$, set
$$c = \int_a^b f(x)g(x)\,dx \Big/ \int_a^b g(x)\,dx,$$
and it is immediate that $m \le c \le M$.
5.2.7 Second Mean Value Theorem. If $f$ and $g$ are defined on $[a, b]$, $g \ge 0$, $fg$ and $g$ are integrable, $f$ is bounded, and $m \le \inf f \le \sup f \le M$, then $\exists\, c \in [a, b]$ such that
$$\int_a^b f(x)g(x)\,dx = m \int_a^c g(x)\,dx + M \int_c^b g(x)\,dx.$$
Proof. Define the function $G$ on $[a, b]$ by the equation
$$G(x) = m \int_a^x g(t)\,dt + M \int_x^b g(t)\,dt.$$
$G$ is a continuous function (Exercise 4 of Section 5.2) and
$$\min G \le G(b) = m \int_a^b g(t)\,dt \le \int_a^b f(t)g(t)\,dt \le M \int_a^b g(t)\,dt = G(a) \le \max G.$$
Since $G$ takes on all values between $\min G$ and $\max G$, and since the above inequality shows that $\int_a^b f(t)g(t)\,dt$ is a number between this maximum and minimum, $\exists\, c \in [a, b]$ such that
$$G(c) = \int_a^b f(t)g(t)\,dt.$$
This proves the theorem.
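The point $c$ in the Second Mean Value Theorem can be located numerically by applying bisection to the function $G$ from the proof. The sketch below assumes the illustrative data $f(x) = x^2$, $g(x) = x$ on $[0, 1]$ with $m = 0$ and $M = 1$ (not taken from the text), for which all integrals have closed forms.

```python
def G(x, m=0.0, M=1.0):
    """G(x) = m * (integral of g over [0, x]) + M * (integral of g over [x, 1]), g(t) = t."""
    return m * (x * x / 2.0) + M * ((1.0 - x * x) / 2.0)

target = 0.25  # integral of f*g over [0, 1], i.e. of t**3

# G is continuous with G(0) = 1/2 >= target >= G(1) = 0, so bisection applies.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2.0
    if G(mid) > target:
        lo = mid      # G is decreasing here, so the solution lies to the right
    else:
        hi = mid
c = (lo + hi) / 2.0
print(c)
```

For this choice the equation $G(c) = 1/4$ reduces to $(1 - c^2)/2 = 1/4$, so the computed $c$ should be close to $1/\sqrt{2}$.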

We shall now present a theorem concerning the integration of a
uniformly convergent sequence of integrable functions. The theorem
is useful for a wide variety of purposes.
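5.2.8 Theorem. If $(f_n)$ is a sequence of functions each of which is defined on $[a, b]$ and integrable, and if $f_n \to f$ uniformly on $[a, b]$, then $f$ is integrable and
$$\int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx.$$
Proof. For every $\varepsilon > 0$, $\exists\, N$ so that $n \ge N$ and $x \in [a, b]$ $\Rightarrow$ $|f_n(x) - f(x)| < \varepsilon/[3(b - a)]$. In particular, this means that for every set $A \subset [a, b]$, and $\forall n \ge N$,
$$|\sup\{f_n(x) : x \in A\} - \sup\{f(x) : x \in A\}| \le \varepsilon/[3(b - a)],$$
$$|\inf\{f_n(x) : x \in A\} - \inf\{f(x) : x \in A\}| \le \varepsilon/[3(b - a)].$$
Fix $m \ge N$ and let $\Delta_\varepsilon$ be a decomposition of $[a, b]$ so that $\Delta \succ \Delta_\varepsilon$ $\Rightarrow$
$$|\bar D_{f_m}(\Delta) - \underline D_{f_m}(\Delta)| < \varepsilon/3.$$
Hence
$$|\bar D_f(\Delta) - \underline D_f(\Delta)| \le |\bar D_f(\Delta) - \bar D_{f_m}(\Delta)| + |\bar D_{f_m}(\Delta) - \underline D_{f_m}(\Delta)| + |\underline D_{f_m}(\Delta) - \underline D_f(\Delta)| < \varepsilon.$$
Thus $f$ is integrable. Further, $n \ge N$ $\Rightarrow$
$$\left| \int_a^b f_n(x)\,dx - \int_a^b f(x)\,dx \right| \le \int_a^b |f_n(x) - f(x)|\,dx \le \varepsilon/3.$$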
5.2.9 Corollary. If $(f_n)$ is a sequence of functions each of which is defined on $[a, b]$ and integrable, and if $\sum_{k \ge 0} (f_k)$ is uniformly convergent to $f$ on $[a, b]$, then $f$ is integrable and
$$\int_a^b f(x)\,dx = \sum_{k=0}^{\infty} \int_a^b f_k(x)\,dx.$$
We shall leave the obvious proof of this corollary for the reader. We should remark that it is possible to relax the hypothesis about the uniform convergence of the sequence $(f_n)$ and still obtain the conclusion of Theorem 5.2.8. However, the hypothesis cannot be relaxed all the way to pointwise convergence as the following simple example shows. Suppose $(f_n)$ is the sequence of functions, each having domain $[0, 1]$ and defined in the following way:
$$f_n(x) = \begin{cases} n + 1, & x \in\, ]0, 1/(n + 1)], \\ 0, & \text{otherwise.} \end{cases}$$
It is not hard to check that each $f_n$ is integrable and $\forall x \in [0, 1]$, $f_n(x) \to 0$. But $\forall n \in N_0$,
$$\int_0^1 f_n(x)\,dx = 1.$$
If a sequence $(f_n)$ converges pointwise to an integrable function $f$ and if

the sequence is uniformly bounded, then the conclusion of Theorem 5.2.8
remains valid. The proof of this result belongs more properly to the
circle of ideas connected with the Lebesgue theory of integration and
we shall not prove it in this book. Note that in our previous example
the sequence of functions does not remain uniformly bounded.
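The behavior of the example sequence can be tabulated exactly. The sketch below uses Python's `fractions.Fraction` for exact arithmetic; the function names are hypothetical, and the integral is computed from the fact that $f_n$ is a step function of height $n + 1$ on an interval of length $1/(n + 1)$.

```python
from fractions import Fraction

def f(n, x):
    """f_n(x) = n + 1 on ]0, 1/(n + 1)] and 0 elsewhere on [0, 1]."""
    return Fraction(n + 1) if 0 < x <= Fraction(1, n + 1) else Fraction(0)

def integral(n):
    """Exact integral of the step function f_n over [0, 1]: height times width."""
    return Fraction(n + 1) * Fraction(1, n + 1)

x = Fraction(1, 10)
values_at_x = [f(n, x) for n in (5, 9, 10, 100)]   # eventually 0 for fixed x
integrals = [integral(n) for n in range(50)]       # constant sequence
print(values_at_x, set(integrals))
```

For fixed $x > 0$ the values $f_n(x)$ vanish once $1/(n + 1) < x$, yet every integral equals 1 and $\sup f_n = n + 1$ is unbounded, which is the failure of uniform boundedness noted above.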
We shall now finish this section by giving two different sufficient
conditions that a Riemann-Darboux integral of a function exists. Although
the conditions we shall give are not necessary conditions, they
nevertheless are broad enough to be very useful in a wide variety of
circumstances.
5.2.10 Theorem. If $f$ is defined on $[a, b]$ and is monotone nondecreasing [nonincreasing], then $f$ has a Riemann-Darboux integral.
Proof. Let $\Delta = \{I_k : k \in (1, n)\}$ be any decomposition of $[a, b]$. Suppose that the intervals $I_k = [a_k, b_k]$ are named so that $a_1 = a$, $b_n = b$, and $a_k = b_{k-1}$ for $1 < k \le n$. Since $f$ is nondecreasing, the minimum and maximum of $f$ restricted to $I_k$ are taken on at $a_k$ and $b_k$, respectively. Hence
$$\bar D_f(\Delta) = f(a_2)(a_2 - a_1) + f(a_3)(a_3 - a_2) + \cdots + f(b)(b - a_n),$$
$$\underline D_f(\Delta) = f(a_1)(a_2 - a_1) + f(a_2)(a_3 - a_2) + \cdots + f(a_n)(b - a_n).$$
Thus
$$\bar D_f(\Delta) - \underline D_f(\Delta) = \sum_{k=1}^{n} [f(a_{k+1}) - f(a_k)][a_{k+1} - a_k], \qquad a_{n+1} = b.$$
Now, for $\varepsilon > 0$, choose $\Delta$ so that $|\Delta| < \varepsilon$, and we get
$$\bar D_f(\Delta) - \underline D_f(\Delta) \le \varepsilon[f(b) - f(a)].$$
Since $\bar D(f) \le \bar D_f(\Delta)$ and $\underline D_f(\Delta) \le \underline D(f)$, and (Exercise 3 of Section 5.1) $\underline D(f) \le \bar D(f)$, it follows that
$$0 \le \overline{\int_a^b} f(x)\,dx - \underline{\int_a^b} f(x)\,dx \le \varepsilon[f(b) - f(a)].$$
Since this is true for every $\varepsilon > 0$, the upper and lower integrals are equal, which proves the theorem.
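The key estimate of the proof, that the gap between upper and lower Darboux sums of a nondecreasing function is at most $|\Delta|[f(b) - f(a)]$, can be checked on a concrete partition. The sketch assumes the step function $f(x) = \lfloor 3x \rfloor$ on $[0, 1]$ and a non-uniform partition, both chosen only for illustration.

```python
import math

def darboux_gap(f, points):
    """Upper minus lower Darboux sum of a nondecreasing f on the given partition.

    For nondecreasing f the sup and inf on [l, r] are f(r) and f(l).
    """
    return sum((f(r) - f(l)) * (r - l) for l, r in zip(points, points[1:]))

f = lambda x: math.floor(3 * x)                 # nondecreasing, with jumps
n = 200
points = [(k / n) ** 2 for k in range(n + 1)]   # non-uniform partition of [0, 1]
mesh = max(r - l for l, r in zip(points, points[1:]))

gap = darboux_gap(f, points)
bound = mesh * (f(points[-1]) - f(points[0]))
print(gap, bound)
```

Only the subintervals that contain a jump of $f$ contribute to the gap, and each has length at most the mesh, which is why the bound holds.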
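5.2.11 Theorem. If $f$ is a function with domain $[a, b]$, is bounded, and is continuous except possibly at a finite set of points, then $f$ has a Riemann-Darboux integral.
Proof. Let $m$ be the number of discontinuities of $f$ and $M = \sup |f|$. For every $\varepsilon > 0$, let $\delta_\varepsilon$ be a positive number so that $4Mm\delta_\varepsilon < \varepsilon$ and let $\Delta_0 = \{J_k : k \in (1, n)\}$ be a decomposition with $|\Delta_0| \le \delta_\varepsilon$. Let us take $A_0$ as the set of $k$ in $(1, n)$ so that $f$ is continuous on $J_k$ and $B_0 = (1, n) \setminus A_0$. Clearly $B_0$ has at most $2m$ elements, and
$$\sum_{k \in B_0} |J_k| \le 2m\delta_\varepsilon.$$
If $k \in A_0$, $f|J_k$ is uniformly continuous. Hence $\forall \varepsilon > 0$, $\exists\,\delta$ so that $\forall k \in A_0$ and $\forall x, y \in J_k$ with $|x - y| < \delta$ we have $|f(x) - f(y)| < \varepsilon$.
Let $\Delta_\varepsilon \succ \Delta_0$, $|\Delta_\varepsilon| < \delta$, and let $\Delta = \{I_j : j \in (1, p)\}$ be a decomposition of $[a, b]$ with $\Delta \succ \Delta_\varepsilon$. Let $A = \{j : \exists\, k \in A_0 \text{ so that } I_j \subset J_k\}$, and $B = (1, p) \setminus A$. Clearly, if $j \in A$ we have $M_j - m_j < \varepsilon$, where $M_j = \sup f|I_j$ and $m_j = \inf f|I_j$. Further,
$$\sum_{j \in B} |I_j| \le \sum_{k \in B_0} |J_k|.$$
Hence we have
$$\bar D_f(\Delta) - \underline D_f(\Delta) = \sum_{j \in A} (M_j - m_j)|I_j| + \sum_{j \in B} (M_j - m_j)|I_j| < \varepsilon(b - a) + 4Mm\delta_\varepsilon < \varepsilon(b - a + 1).$$
We can now complete the proof in the same way as we completed the proof of the previous theorem.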
The last two theorems are special cases of a much more general theorem concerning the existence of Riemann-Darboux integrals. Indeed, a necessary and sufficient condition is known. To describe this result it is necessary to define a set of measure zero. A set of real numbers is said to be of measure zero if and only if $\forall \varepsilon > 0$ there is a covering by a set of open intervals the sum of whose lengths is less than $\varepsilon$. It turns out that a bounded function is Riemann-Darboux integrable if and only if it is continuous except possibly on a set of measure zero. Clearly it follows from this result that many of the hypotheses of the theorems of this section are redundant. We shall delay a proof of this general result until Chapter 8.
5.2.12 Fundamental Theorem of the Calculus. If $f$ is a continuous function with domain $[a, b]$, then there exists a function $F$, defined and differentiable on $[a, b]$ so that $F' = f$.
Proof. Since $f$ is continuous, its Riemann-Darboux integral exists for every interval $[a, x]$, $a \le x \le b$. Setting
$$F(x) = \int_a^x f(t)\,dt,$$
the result is an immediate consequence of Theorem 5.2.5.

The Leibnitz-Newton conception of an integral was that it was an antiderivative, that is, the inverse of the differentiation operator. Thus a function $f$ has an integral in their sense if and only if there is a function $F$ such that $F' = f$. The function $F$ is called a primitive of $f$ and the collection of primitives of $f$ is called the indefinite integral of $f$. The fundamental theorem of the calculus is so named because it shows that every continuous function, with domain an interval, has a primitive, and therefore an integral in the Leibnitz-Newton sense.
Exercises
1. If $f$ is defined and integrable on $[a, b]$, define
$$\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx.$$
If $a$, $b$, and $c$ are any real numbers, prove that the conclusion of Theorem 5.2.1(d) is valid.
2. In Theorem 5.2.1 replace the hypothesis that $f$ and $g$ are Riemann-Darboux integrable by the hypothesis that they are bounded. Either show that the conclusions of the theorem are valid for both the upper and lower Darboux integrals, or obtain more general statements involving the upper and lower Darboux integrals from which Theorem 5.2.1 will follow as a corollary.
3. In Theorem 5.2.2 replace the hypothesis that $f$ is integrable by the hypothesis that $f$ is bounded. Obtain a generalization of Theorem 5.2.2 involving upper and lower Darboux integrals.

4. If $f$ is defined and integrable on $[a, b]$, and $F$ is defined on $[a, b]$ by
$$F(x) = \int_a^x f(t)\,dt,$$
show that $F$ is continuous.
5. Using the convention of Exercise 1, show that Theorem 5.2.4 remains valid if $g$ is monotone decreasing.
6. Suppose $g$ and $g'$ are defined and continuous on $[a, b]$, and $f$ is defined and continuous on $\mathscr{R}(g)$. If $F$ is a function defined on $\mathscr{R}(g)$ by the equation
$$F(y) = \int_{g(a)}^{y} f(t)\,dt,$$
show that for every $x \in [a, b]$,
$$\int_{g(a)}^{g(x)} f(t)\,dt = \int_a^x f \circ g(t)\,g'(t)\,dt.$$
[Note: If $y < g(a)$, $\int_{g(a)}^{y} f(t)\,dt$ is defined in Exercise 1. If we set $x = b$, then we get the formula of Theorem 5.2.4. Hence for a certain class of functions this result is a generalization of Theorem 5.2.4.]
7. Under the hypotheses of Theorem 5.2.6 and the additional hypothesis that $f$ is continuous, show that $\exists\, c \in [a, b]$ so that
$$\int_a^b f(x)g(x)\,dx = f(c) \int_a^b g(x)\,dx.$$
8. Show that any two primitives of a given continuous function differ by an additive constant.
9. Let $f$ be a function defined on $[0, 1]$ as follows: $f(x) = \ldots$ if $x$ is irrational, $f(x) = \ldots$ if $x$ is rational. Does the Riemann-Darboux integral of $f$ exist?
10. If $g$ is continuous on $[a, b]$, $g \ge 0$ and $\int_a^b g(x)\,dx = 0$, show that $g(x) = 0$ for every $x$.
11. Suppose $(f_n)$ is a sequence of continuous functions each defined on $[a, b]$, and $\forall n \in N_0$ and $\forall x \in [a, b]$, $f_n(x) \le f_{n+1}(x)$. Further, suppose that $\forall x \in [a, b]$, $f_n(x) \to f(x)$, where $f$ is continuous. Show that
$$\int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx.$$
(Hint: See Exercise 5 of Section 3.4.)

12. Suppose $(f_n)$ is a sequence of continuous functions each defined on $[a, b]$, and $\forall x \in [a, b]$, $f_n(x) \to f(x)$, where $f$ is continuous. Is it …
13. Give a proof for the integrability of $f$ in Theorem 5.2.8 using the Cauchy criteria for the existence of an integral.

14. From the expansion
$$\frac{1}{1 - x} = \sum_{k=0}^{\infty} x^k, \qquad |x| < 1,$$
obtain the Taylor expansion of $\log(1 - x)$ around 0.
15. Expand $1/\sqrt{1 - x^2}$ in a Taylor series around the origin and use this to get the Taylor expansion for Arc sin $x$ around the origin.
16. Prove Bernstein's theorem by using the integral form of the remainder in Taylor's formula: If $f$ and all its derivatives are nonnegative in an interval $[a, b]$, then $f$ is analytic in $]a, b[$.
For $c \in [a, b[$, write the remainder as
$$R_n(x, c) = \frac{(x - c)^{n}}{(n - 1)!} \int_0^1 (1 - s)^{n-1} f^{(n)}((x - c)s + c)\,ds$$
by making the change of variable $t = (x - c)s + c$ in (5.2.1). Now, since $f^{(n+1)} \ge 0$, $f^{(n)}$ is nondecreasing and thus for $x > c$,
$$R_n(x, c) \le \left( \frac{x - c}{b - c} \right)^{n} R_n(b, c).$$
But $R_n(b, c) \le f(b)$, and this implies
$$0 \le R_n(x, c) \le \left( \frac{x - c}{b - c} \right)^{n} f(b).$$
This implies $R_n(x, c) \to 0$.
5.3 IMPROPER INTEGRALS
It may often happen that it is possible to define an integral for an unbounded function whose domain is a finite interval or for a function that is defined on an unbounded interval such as $[a, \infty[$. One natural way to do this is by means of a limiting process. Since the definitions and results given here parallel very closely the definitions and results given for infinite series, we shall not be as thorough here as in Chapter 3.
5.3.1 Definition. Suppose $f$ is a function with domain $[a, b[$ ($b \in R$ or $b = \infty$) and $\forall x \in [a, b[$, $f|[a, x]$ has a Riemann-Darboux integral. Let $I(f)$ be that function with domain $[a, b[$ defined by
$$I(f)(x) = \int_a^x f(t)\,dt.$$
The ordered pair $(f, I(f))$ is called an improper integral of the first kind $\Leftrightarrow$ $b = \infty$, and it is called an improper integral of the second kind $\Leftrightarrow$ $b \in R$. The improper integral $(f, I(f))$ is called convergent $\Leftrightarrow$ $\lim_{x \to b} I(f)(x)$ exists. An improper integral is called divergent $\Leftrightarrow$ it is not convergent. The improper integral $(f, I(f))$ is called absolutely convergent $\Leftrightarrow$ $(|f|, I(|f|))$ is convergent. In case $(f, I(f))$ is convergent, then $\lim_{x \to b} I(f)(x)$ is called the limit of the improper integral and it is denoted by
$$\int_a^b f(t)\,dt.$$
As in the case of infinite series, it is often convenient to denote an improper integral by the symbol
$$\int_a^b (f(t)\,dt).$$
As an example of a convergent improper integral of the first kind consider the function defined by $f(t) = e^{-t}$. Then
$$\lim_{x \to \infty} \int_0^x e^{-t}\,dt = 1,$$
and hence we write
$$\int_0^\infty e^{-t}\,dt = 1.$$
As an example of a convergent improper integral of the second kind, we have
$$\lim_{x \to 1} \int_0^x \frac{dt}{\sqrt{1 - t^2}} = \frac{\pi}{2}.$$
Clearly Definition 5.3.1 does not exhaust all the possibilities for defining improper integrals. For example, suppose $f$ is a function with domain $]a, b]$ ($a \in R$ or $a = -\infty$) and $\forall x \in\, ]a, b]$, $f|[x, b]$ is Riemann-Darboux integrable. Then it is quite reasonable to call the ordered pair $(f, I(f))$ an improper integral, where
$$I(f)(x) = \int_x^b f(t)\,dt.$$
The theorems that one can prove about improper integrals of the first and second kinds closely resemble analogous theorems for infinite series. As examples we prove the following results.
5.3.2 Theorem. The improper integral $(f, I(f))$ is convergent $\Leftrightarrow$ the function $I(f)$ is Cauchy in the sense that $\forall \varepsilon > 0$, $\exists\, c \in [a, b[$ so that $\forall x, y \in [a, b[$ with $x, y > c$ we have
$$|I(f)(x) - I(f)(y)| < \varepsilon.$$
Proof. If $\lim_{x \to b} I(f)(x)$ exists, then the Cauchy condition is obvious. On the other hand, if $I(f)$ is Cauchy, the theorem follows from Proposition 2.4.4.
5.3.3 Theorem. Suppose $(g, I(g))$ is an absolutely convergent improper integral and that $(f, I(f))$ is an improper integral so that $\mathscr{D}(f) = \mathscr{D}(g)$ and $\forall t \in \mathscr{D}(f)$, $|f(t)| \le |g(t)|$. Then $(f, I(f))$ is absolutely convergent and thus convergent.
Proof. Suppose $\mathscr{D}(f) = \mathscr{D}(g) = [a, b[$ and $x, y \in [a, b[$ with $y > x$. Then
$$|I(f)(y) - I(f)(x)| = \left| \int_x^y f(t)\,dt \right| \le \int_x^y |f(t)|\,dt \le \int_x^y |g(t)|\,dt = |I(|g|)(y) - I(|g|)(x)|.$$
Since $(|g|, I(|g|))$ is convergent, it follows from Theorem 5.3.2 that $I(|g|)$ is Cauchy. Thus the previous inequality shows that $I(|f|)$ and hence $I(f)$ is Cauchy. Thus, using Theorem 5.3.2 again, we have completed the proof of the theorem.
5.3.4 Integral Test. Suppose $f$ is a nonnegative nonincreasing function with domain $[0, \infty[$ and integrable on every finite interval $[0, x]$, $x > 0$. Then the series $\sum_{k \ge 0} (f(k))$ and the function $I(f)$ converge or diverge simultaneously.
Proof. Since $f$ is nonincreasing, we have $f(k) \le f(t) \le f(k - 1)$ for $t \in [k - 1, k]$. Thus
$$\sum_{k=1}^{n} f(k) \le \sum_{k=1}^{n} \int_{k-1}^{k} f(t)\,dt \le \sum_{k=1}^{n} f(k - 1). \qquad (5.3.1)$$
However,
$$I(f)(n) = \int_0^n f(t)\,dt = \sum_{k=1}^{n} \int_{k-1}^{k} f(t)\,dt.$$
Hence we see that if $\lim_{x \to \infty} I(f)(x)$ exists, the monotone nondecreasing sequence defined by the left side of (5.3.1) is bounded and hence convergent. On the other hand, if the series $\sum_{k \ge 0} (f(k))$ is convergent, the right-hand inequality of (5.3.1) shows that the monotone nondecreasing sequence $(I(f)(n))$ is bounded and hence, since $I(f)$ is monotone nondecreasing, $\lim_{x \to \infty} I(f)(x)$ exists.
Since the statement about divergence is an immediate consequence of the statement about convergence, we shall consider the theorem as proved.
As an example of the use of the integral test, we come back to the $p$ series of Section 3.2. Let $f$ be that function on $[0, \infty[$ defined by
$$f(t) = \frac{1}{(t + 1)^p}, \qquad p > 0.$$
Then clearly $f \ge 0$ and is monotone nonincreasing, and
$$f(k) = \frac{1}{(k + 1)^p}.$$
Thus the series $\sum_{k \ge 1} (1/k^p)$ and the integral $\int_1^\infty (1/t^p\,dt)$ converge or diverge simultaneously. Now,
$$\int_1^x \frac{dt}{t^p} = \begin{cases} \dfrac{1}{1 - p}\,[x^{1-p} - 1], & \text{for } p \ne 1, \\ \log x, & \text{for } p = 1. \end{cases}$$
Thus we see that the given integral of the first kind converges $\Leftrightarrow$ $p > 1$. Thus the $p$ series converges $\Leftrightarrow$ $p > 1$.
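The comparison between the $p$ series and its integral can be observed numerically. The sketch below is illustrative only: the function names are hypothetical, and the two inequalities it checks are the ones that the argument of the integral test gives for $f(t) = t^{-p}$ on $[1, \infty[$.

```python
import math

def integral(p, x):
    """Closed form of the integral of t**(-p) over [1, x]."""
    return math.log(x) if p == 1 else (x ** (1 - p) - 1) / (1 - p)

def partial_sum(p, n):
    """Partial sum of the p series: sum of 1/k**p for k = 1..n."""
    return sum(k ** -p for k in range(1, n + 1))

n = 1000
# p = 2: the partial sums stay below 1 + (integral over [1, n]), which is bounded.
s2, bound2 = partial_sum(2, n), 1 + integral(2, n)
# p = 1: the partial sums stay above the integral over [1, n + 1], which is unbounded.
s1, bound1 = partial_sum(1, n), integral(1, n + 1)
print(s2, bound2, s1, bound1)
```

For $p = 2$ the bound tends to 2 as $n$ grows, while for $p = 1$ the lower bound $\log(n + 1)$ grows without limit, matching the dichotomy $p > 1$ versus $p \le 1$.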
Let us give another example that shows how the techniques used in the proof of the integral test can be used to obtain rather refined estimates for certain finite sums. Let us show that
$$\sum_{k=2}^{n} \frac{1}{k \log k} = \log \log n + a + b_n,$$
where $a$ is a constant that satisfies
$$0 < a < \frac{1}{2 \log 2} - \log \log 2, \qquad 0 < b_n < \frac{1}{n \log n}.$$
Let us draw the graph of the function with values $1/(t \log t)$, $t > 1$ (Fig. 5.3.1). Let $e_k$ be the area as shown in the figure; that is,
$$e_k = \frac{1}{k \log k} - \int_k^{k+1} \frac{dt}{t \log t}, \qquad k \ge 2.$$
[FIGURE 5.3.1: graph of $1/(t \log t)$, showing the area $e_k$ over $[k, k + 1]$]
Then we may write
$$\sum_{k=2}^{n} \frac{1}{k \log k} = \sum_{k=2}^{n-1} \int_k^{k+1} \frac{dt}{t \log t} + \sum_{k=2}^{n-1} e_k + \frac{1}{n \log n}. \qquad (5.3.2)$$
Note now that
$$\sum_{k=2}^{n-1} \int_k^{k+1} \frac{dt}{t \log t} = \int_2^n \frac{dt}{t \log t} = \log \log n - \log \log 2. \qquad (5.3.3)$$
Further,
$$0 < e_k < \frac{1}{k \log k} - \frac{1}{(k + 1) \log (k + 1)},$$
and thus
$$0 < \sum_{k=2}^{n-1} e_k < \frac{1}{2 \log 2} - \frac{1}{n \log n}. \qquad (5.3.4)$$
Thus the series $\sum_{k \ge 2} (e_k)$ is convergent to a number which is less than $1/(2 \log 2)$. Further, by the same type of reasoning we find that
$$0 < \sum_{k=n}^{\infty} e_k < \frac{1}{n \log n}. \qquad (5.3.5)$$
Let us set
$$a = \sum_{k=2}^{\infty} e_k - \log \log 2, \qquad (5.3.6)$$
$$b_n = -\sum_{k=n}^{\infty} e_k + \frac{1}{n \log n}. \qquad (5.3.6')$$
Then from (5.3.4) and (5.3.6) we get the estimates for $a$ and from (5.3.5) and (5.3.6') we get the estimates for $b_n$. Also, from (5.3.2) and (5.3.3) we get
$$\sum_{k=2}^{n} \frac{1}{k \log k} = \log \log n + a + b_n.$$
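The estimates for $a$ and $b_n$ can be probed numerically. In the sketch below the quantity $S_n - \log\log n$ equals $a + b_n$ by the formula just derived, so the bound $0 < b_n < 1/(n \log n)$ traps $a$ in a short interval; the choice $n = 10000$ and the variable names are assumptions made for illustration.

```python
import math

def S(n):
    """Partial sum of 1/(k log k) for k = 2..n."""
    return sum(1.0 / (k * math.log(k)) for k in range(2, n + 1))

n = 10000
value = S(n) - math.log(math.log(n))          # equals a + b_n
a_upper = value                               # because b_n > 0
a_lower = value - 1.0 / (n * math.log(n))     # because b_n < 1/(n log n)

# Bound for a stated in the text: 0 < a < 1/(2 log 2) - log log 2.
text_upper = 1.0 / (2.0 * math.log(2.0)) - math.log(math.log(2.0))
print(a_lower, a_upper, text_upper)
```

The interval $[a_{lower}, a_{upper}]$ has length $1/(n \log n)$, so increasing $n$ pins down $a$ more tightly while staying inside the bounds given in the text.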
Now, on to other matters. An integral such as
$$\int_0^\infty \left( \frac{\sin x}{x}\,dx \right) \qquad (5.3.7)$$
is not absolutely convergent; that is, the function with values $|\sin x|/x$ does not have a convergent integral of the first kind. To see this we simply note that $\forall n \in N$,
$$\int_0^{n\pi} \frac{|\sin t|}{t}\,dt = \sum_{k=0}^{n-1} \int_{k\pi}^{(k+1)\pi} \frac{|\sin t|}{t}\,dt.$$
Now, $\forall t \in [k\pi, (k + 1)\pi]$, $1/t \ge 1/(k + 1)\pi$. Further, $\forall k \in N_0$,
$$\int_{k\pi}^{(k+1)\pi} |\sin t|\,dt = \int_0^\pi \sin t\,dt = 2.$$
Hence
$$\int_0^{n\pi} \frac{|\sin t|}{t}\,dt \ge \frac{2}{\pi} \sum_{k=0}^{n-1} \frac{1}{k + 1}.$$
Since the sequence defined by the sum on the right is divergent, we see that the integral in (5.3.7) is not absolutely convergent.
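The lower bound used in this argument can be checked for a particular $n$. The sketch assumes $n = 20$ and a midpoint rule whose grid is aligned with the multiples of $\pi$; both choices, and the helper name `midpoint`, are illustrative assumptions.

```python
import math

def midpoint(f, a, b, n):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

N = 20
# Integral of |sin t| / t over [0, N*pi]; midpoints never hit t = 0.
integral = midpoint(lambda t: abs(math.sin(t)) / t, 0.0, N * math.pi, 200 * N)
# Lower bound from the text: (2/pi) * (1 + 1/2 + ... + 1/N).
lower_bound = (2.0 / math.pi) * sum(1.0 / (k + 1) for k in range(N))
print(integral, lower_bound)
```

Since the harmonic sums on the right grow without bound, so do the integrals on the left, which is the divergence asserted in the text.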
On the other hand, the integral in (5.3.7) is convergent and the proof is very much like the proof for Abel's test, 3.2.6, or Dirichlet's test, 3.2.7: integrate by parts. Let us first write, for $x \ge \pi/2$,
$$\int_0^x \frac{\sin t}{t}\,dt = \int_0^{\pi/2} \frac{\sin t}{t}\,dt + \int_{\pi/2}^{x} \frac{\sin t}{t}\,dt.$$
We are supposing here, as we also did previously without mentioning it, that we have extended the function, with values $\sin t/t$, continuously to $t = 0$ by giving it the value 1 there. Hence the first integral on the right exists as an ordinary Riemann-Darboux integral. Let us use integration by parts on the second integral. We have
$$\int_{\pi/2}^{x} \frac{\sin t}{t}\,dt = -\frac{\cos x}{x} - \int_{\pi/2}^{x} \frac{\cos t}{t^2}\,dt.$$
Since $\forall t$, $|\cos t| \le 1$, it follows from Theorem 5.3.3 that the integral on the right converges. Hence
$$\int_0^\infty \frac{\sin t}{t}\,dt = \int_0^{\pi/2} \frac{\sin t}{t}\,dt - \int_{\pi/2}^{\infty} \frac{\cos t}{t^2}\,dt.$$
Let us now go on to discuss another type of improper integral. For $\varepsilon > 0$ it is clear from the formula
$$-\log \varepsilon = \int_\varepsilon^1 \frac{dt}{t}$$
that the function with domain $]0, 1]$ and values $1/t$ does not have a convergent improper integral of the second kind. On the other hand, for $\varepsilon > 0$,
$$\lim_{\varepsilon \to 0} \left[ \int_{-1}^{-\varepsilon} \frac{dt}{t} + \int_\varepsilon^1 \frac{dt}{t} \right] = 0.$$
If an integral of a function $f$ exists in this sense, we say that $f$ has a convergent Cauchy principal value integral. We give the formal definition below.
5.3.5 Definition. Suppose $f$ has domain $[a, b] \setminus \{c\}$, $c \in\, ]a, b[$, and $\forall \varepsilon$ with $0 < \varepsilon \le \min(c - a, b - c)$, $f|[a, c - \varepsilon]$ and $f|[c + \varepsilon, b]$ are integrable. Let $I(f)$ be that function with domain $]0, \min(c - a, b - c)]$ given by
$$I(f)(\varepsilon) = \int_a^{c - \varepsilon} f(t)\,dt + \int_{c + \varepsilon}^{b} f(t)\,dt.$$
Then the ordered pair $(f, I(f))$ is called a Cauchy principal value integral. Similarly, if $f$ has domain $]-\infty, \infty[$ and $\forall x \ge 0$, $f|[-x, x]$ is integrable, and
$$I(f)(x) = \int_{-x}^{x} f(t)\,dt,$$
then the ordered pair $(f, I(f))$ is also called a Cauchy principal value integral. The Cauchy principal value integral $(f, I(f))$ is said to be convergent $\Leftrightarrow$ $\lim_{\varepsilon \to 0} I(f)(\varepsilon)$ or $\lim_{x \to \infty} I(f)(x)$ exists, and the latter numbers are called Cauchy principal values.
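We shall now give some examples which show that the symmetry used in the definition of the Cauchy principal value may be very important. Since $t^2 \sin t/(1 + t^2)$ is an odd function, its integral over $[-x, x]$ is zero. Also the integral of $1/(1 + t^2)$ over $[-x, x]$ is 2 Arc tan $x$. Thus
$$\lim_{x \to \infty} \int_{-x}^{x} \frac{1 + t^2 \sin t}{1 + t^2}\,dt = \pi.$$
On the other hand,
$$\int_{-x}^{x + \pi} \frac{1 + t^2 \sin t}{1 + t^2}\,dt = 2\,\text{Arc tan}\,x + 2 \cos x + \int_x^{x + \pi} \frac{1 - \sin t}{1 + t^2}\,dt,$$
and thus the limit does not exist as $x \to \infty$.
By the same type of reasoning we find that
$$\lim_{x \to \infty} \int_{-x}^{x} \frac{1 + t}{1 + t^2}\,dt = \pi.$$
On the other hand,
$$\lim_{x \to \infty} \int_{-x}^{2x} \frac{1 + t}{1 + t^2}\,dt = \lim_{x \to \infty} \int_{-x}^{x} \frac{1 + t}{1 + t^2}\,dt + \lim_{x \to \infty} \int_x^{2x} \frac{1 + t}{1 + t^2}\,dt.$$
Now,
$$\lim_{x \to \infty} \int_x^{2x} \frac{dt}{1 + t^2} = 0, \qquad \lim_{x \to \infty} \int_x^{2x} \frac{t}{1 + t^2}\,dt = \lim_{x \to \infty} \frac{1}{2} \log \left( \frac{1 + 4x^2}{1 + x^2} \right) = \log 2.$$
Thus
$$\lim_{x \to \infty} \int_{-x}^{2x} \frac{1 + t}{1 + t^2}\,dt = \pi + \log 2.$$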
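The dependence on symmetric limits in the last example can be made concrete with the antiderivative $\text{Arc tan}\,t + \tfrac{1}{2}\log(1 + t^2)$ of $(1 + t)/(1 + t^2)$. The sketch below evaluates the integral over $[-x, cx]$ in closed form for a large $x$; the function name and the choice $x = 10^6$ are assumptions made for illustration.

```python
import math

def integral(x, c):
    """Closed form of the integral of (1 + t)/(1 + t**2) over [-x, c*x]."""
    return (math.atan(c * x) + math.atan(x)
            + 0.5 * math.log((1.0 + (c * x) ** 2) / (1.0 + x ** 2)))

x = 1.0e6
symmetric = integral(x, 1.0)   # interval [-x, x]
skewed = integral(x, 2.0)      # interval [-x, 2x]
print(symmetric, math.pi)
print(skewed, math.pi + math.log(2.0))
```

The two limits differ by $\log 2$, so the value assigned to the integral over $]-\infty, \infty[$ depends on how the end points are sent to infinity.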
The definitions we have written down do not exhaust all the possibilities for defining improper integrals and the reader can undoubtedly think of cases we have not discussed. However, in most instances a suitable definition of an improper integral will be either a variant or a combination of the definitions we have discussed. Some of the following exercises are designed to exhibit the various possibilities.
Exercises
I.
State and prove results analogous to Theorems 5.3.2 and 5.3.3
for Cauchy principal value integrals.
2.
Discuss the convergence of the following improper integrals:

(a)
(b)
(c)
3.
Discuss the convergence of the following improper integrals:

(a)
(b)
(c)
4.
!100 (/ 1).
L"" C2 )
J:""C3 )
L1 (t t dt).
J: c- t dt) .
in/2 ( )
0
log
in
cost
For what values of

(a)
f000 (t"'e-1 dt).
will the following integrals converge?
(b)
(c)
5.
J: (;!).
Loo (;!).
Discuss the convergence of the following integrals:

(a )
(b)
L (k
)
J:oo C( (2 )
J-oooo ( t2 t I tltl1/2 ) dt
00
(c)
6.
1112
For what values of

( a)
(b)
(c)
7.
00
f C t)a )
J: ( J
L"' Ca g t)
1
( lo
will the following integrals converge?
r I g
Show the following:
i"' --t dt J"' --t dt.

t2
t
sin 2
sin
8.
State and prove an analogue of Abel's test for improper integrals.
9.
State and prove an analogue of Dirichlet's test for improper
integrals.
IO.
Discuss the convergence of the following integrals:
(a )
(b)
(c)
I I.
L"' ( t dt) .
"'
L ( I t ti dt ) .
oo C! t dt .
)
L(
i
5
si
Use the integral test to establish convergence or divergence of
the following series:

(a)
""
k=I
(k3e-k).
00
(b)
(c)
k=2
((log k)P/k).
( k (log og k)")
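12. Let $p$ and $q$ be fixed integers $p \ge q \ge 1$ and let
$$a_n = \sum_{k=qn+1}^{pn} \frac{1}{k}.$$
Use the ideas involved in the proof of the integral test to establish that
$$a_n \to \log \left( \frac{p}{q} \right) \quad \text{as } n \to \infty.$$
(Hint:
$$\sum_{k=1}^{n} \frac{1}{k} = \int_1^n \frac{dt}{t} + a + b_n,$$
where $a$ is constant and $0 < b_n < c/n$, $c$ constant.)
13.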
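In the equation
$$\sum_{k=2}^{n} \frac{1}{k \log k} = \log \log n + a + b_n,$$
show that
$$\frac{1}{4 \log 2} - \log \log 2 < a < \frac{1}{2 \log 2} - \log \log 2$$
and
$$0 < b_n < \frac{1}{2 n \log n}.$$
[Hint: Extend the techniques used in the text and show that $\forall k \ge 2$,
$$\frac{1}{2} \left[ \frac{1}{k \log k} - \frac{1}{(k + 1) \log (k + 1)} \right] < e_k < \frac{1}{k \log k} - \frac{1}{(k + 1) \log (k + 1)}.]$$
5.4 RIEMANN-STIELTJES INTEGRALS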
In this section we shall discuss a generalized integral that is very valuable

in that it brings together under one form such seemingly diverse topics
as absolutely convergent infinite series and Riemann-Darboux integrals.
We think it is also true that without the concept of such a generalized
integral a very important part of modern functional analysis would not
be available to us.
5.4.1 Definition. Let $f$ and $g$ be real-valued functions each having the domain $[a, b]$. Let $S_{f,g}$ be that real-valued function with domain the collection of ordered pairs $(\Delta, \{x_k\})$, where $\Delta = \{[a_k, b_k] : k \in (1, n)\}$ is a decomposition of $[a, b]$ and $x_k \in [a_k, b_k]$, and defined by
$$S_{f,g}(\Delta, \{x_k\}) = \sum_{k=1}^{n} f(x_k)[g(b_k) - g(a_k)].$$
The function $S_{f,g}$ is called the Riemann-Stieltjes sum function for $f$ with respect to $g$ and any element in its range is called a Riemann-Stieltjes sum.
The function $S_{f,g}$ is said to converge to the limit $S(f, g)$ $\Leftrightarrow$ $\forall \varepsilon > 0$, $\exists\, \Delta_\varepsilon$ so that $\Delta \succ \Delta_\varepsilon$ $\Rightarrow$
$$|S_{f,g}(\Delta, \{x_k\}) - S(f, g)| < \varepsilon.$$
In case the limit of $S_{f,g}$ exists, the number $S(f, g)$ is called the Riemann-Stieltjes integral of $f$ with respect to $g$ and it is denoted by
$$S(f, g) = \int_a^b f(x)\,dg(x).$$
In this case $f$ is said to be integrable with respect to $g$.
As in the case of an ordinary Riemann-Darboux integral, if $S_{f,g}$ has a limit we are justified in calling $S(f, g)$ the limit of $S_{f,g}$, since it is unique. The proof is the same as before.
5.4.2 Definition. Let $f$ be a bounded function and $g$ a monotone nondecreasing function, each having domain $[a, b]$. Let $\bar D_{f,g}$ and $\underline D_{f,g}$ be those real-valued functions with domain the set of decompositions of $[a, b]$ so that if $\Delta = \{[a_k, b_k] : k \in (1, n)\}$, $M_k = \sup\{f(x) : x \in [a_k, b_k]\}$, $m_k = \inf\{f(x) : x \in [a_k, b_k]\}$, then
$$\bar D_{f,g}(\Delta) = \sum_{k=1}^{n} M_k[g(b_k) - g(a_k)], \qquad \underline D_{f,g}(\Delta) = \sum_{k=1}^{n} m_k[g(b_k) - g(a_k)].$$
The functions $\bar D_{f,g}$ and $\underline D_{f,g}$ are called the upper and lower Darboux-Stieltjes sum functions for $f$ with respect to $g$, respectively. Any element in the range of $\bar D_{f,g}$ and any element in the range of $\underline D_{f,g}$ is called an upper and lower Darboux-Stieltjes sum for $f$ with respect to $g$, respectively.
Set
$$\bar D(f, g) = \inf\{\bar D_{f,g}(\Delta) : \Delta \in \mathscr{D}(\bar D_{f,g})\} = \overline{\int_a^b} f(x)\,dg(x),$$
$$\underline D(f, g) = \sup\{\underline D_{f,g}(\Delta) : \Delta \in \mathscr{D}(\underline D_{f,g})\} = \underline{\int_a^b} f(x)\,dg(x),$$
and call these numbers the upper and lower Darboux-Stieltjes integrals of $f$ with respect to $g$, respectively. In case $\bar D(f, g) = \underline D(f, g) = D(f, g)$, we say that $f$ is Darboux-Stieltjes integrable with respect to $g$ and call $D(f, g)$ the Darboux-Stieltjes integral of $f$ with respect to $g$.
The fundamental theorem here, as in the case of Riemann-Darboux integrals, is the following:
5.4.3 Theorem. If $f$ is bounded and $g$ is monotone nondecreasing, then the Riemann-Stieltjes integral of $f$ with respect to $g$ exists if and only if the Darboux-Stieltjes integral exists and if they exist they are equal.
The proof of this theorem follows the details of the proof of Theorem 5.1.5, and we shall not reproduce it here.

5.4.4 Theorem. The Riemann-Stieltjes integral of the function $f$ with respect to $g$ exists $\Leftrightarrow$ the function $S_{f,g}$ is Cauchy in the sense that $\forall \varepsilon > 0$, $\exists\, \Delta_\varepsilon$ so that $\Delta \succ \Delta_\varepsilon$ and $\Delta' \succ \Delta_\varepsilon$ $\Rightarrow$
$$|S_{f,g}(\Delta, \{x_k\}) - S_{f,g}(\Delta', \{x'_k\})| < \varepsilon.$$
5.4.5 Corollary. Suppose $[a_1, b_1] \subset [a, b]$, $f$ and $g$ are defined on $[a, b]$, and $f_1$ and $g_1$ are the restrictions of $f$ and $g$ to $[a_1, b_1]$, respectively. If $f$ is Riemann-Stieltjes integrable with respect to $g$, then $f_1$ is Riemann-Stieltjes integrable with respect to $g_1$.
The proofs of Theorem 5.4.4 and Corollary 5.4.5 follow mutatis mutandis the proofs of Theorem 5.1.7 and Corollary 5.1.8.
5.4.6 Theorem. Suppose f, g, and h are defined on [a, b].

(a) If f and g are Riemann-Stieltjes integrable with respect to h, then for
all real numbers $\alpha$ and $\beta$, $\alpha f + \beta g$ is Riemann-Stieltjes integrable with respect
to h and
\[ \int_a^b [\alpha f(x) + \beta g(x)]\,dh(x) = \alpha \int_a^b f(x)\,dh(x) + \beta \int_a^b g(x)\,dh(x). \]

(b) If f is Riemann-Stieltjes integrable with respect to g and h, then for
all real numbers $\alpha$ and $\beta$, f is Riemann-Stieltjes integrable with respect to
$\alpha g + \beta h$ and
\[ \int_a^b f(x)\,d[\alpha g(x) + \beta h(x)] = \alpha \int_a^b f(x)\,dg(x) + \beta \int_a^b f(x)\,dh(x). \]

(c) If $f \ge 0$ and Riemann-Stieltjes integrable with respect to g and g is
monotone nondecreasing, then
\[ \int_a^b f(x)\,dg(x) \ge 0. \]

(d) If f is bounded and Riemann-Stieltjes integrable with respect to g,
and if g is monotone nondecreasing, then |f| is integrable with respect to g and
\[ \left| \int_a^b f(x)\,dg(x) \right| \le \int_a^b |f(x)|\,dg(x). \]

(e) If f is Riemann-Stieltjes integrable with respect to g and if $a \le c \le b$,
then f|[a,c] and f|[c,b] are Riemann-Stieltjes integrable with respect to
g|[a,c] and g|[c,b], respectively, and
\[ \int_a^b f(x)\,dg(x) = \int_a^c f(x)\,dg(x) + \int_c^b f(x)\,dg(x). \]
The proofs of (a), (c), (d), and (e) follow the proofs of Theorems
5.1.9 and 5.2.1 and we leave this as an exercise. The proof of (b) is very
simple and we also leave it as an exercise. The proof of (b) also follows from
part (a) and Theorem 5.4.8.
5.4.7 Theorem. If f and g are defined on [a,b], g is continuous on
[a, b] and differentiable on ]a,b[ and if the Riemann-Stieltjes integral of f
with respect to g exists and the Riemann-Darboux integral of f(x)g'(x) exists
(when g' is defined in any way at the end points), then
\[ \int_a^b f(x)\,dg(x) = \int_a^b f(x) g'(x)\,dx. \]
Proof. Let $\Delta = \{I_k : k \in (1, n)\}$ be any decomposition of [a, b]
with $I_k = [a_k, b_k]$. By the Mean Value Theorem, $\exists x_k \in [a_k, b_k]$, so that
\[ g(b_k) - g(a_k) = g'(x_k)(b_k - a_k). \]
Consequently,
\[ S_{f,g}(\Delta, \{x_k\}) = \sum_{k=1}^{n} f(x_k)[g(b_k) - g(a_k)] = \sum_{k=1}^{n} f(x_k) g'(x_k)(b_k - a_k) = R_{fg'}(\Delta, \{x_k\}). \]
Since the limits on the left side and on the right side exist, they must
be equal. This establishes the theorem.
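The identity of sums used in this proof can be observed numerically. The sketch below is only an illustration under assumed choices: f = cos, g(x) = x^2 on [0, 1], and a hypothetical helper `stieltjes_and_riemann_sums`. For this g the Mean Value Theorem point x_k is the midpoint of [a_k, b_k], since b^2 - a^2 = 2 * ((a + b)/2) * (b - a).

```python
import math


def stieltjes_and_riemann_sums(f, g, g_prime, a, b, n, mean_value_point):
    """Return (S_{f,g}, R_{fg'}) on n equal subintervals, both tagged at
    the point x_k supplied by mean_value_point(a_k, b_k)."""
    s_fg = r_fgp = 0.0
    for k in range(n):
        a_k = a + (b - a) * k / n
        b_k = a + (b - a) * (k + 1) / n
        x_k = mean_value_point(a_k, b_k)
        s_fg += f(x_k) * (g(b_k) - g(a_k))           # Riemann-Stieltjes sum
        r_fgp += f(x_k) * g_prime(x_k) * (b_k - a_k)  # Riemann sum of f g'
    return s_fg, r_fgp


s, r = stieltjes_and_riemann_sums(
    math.cos, lambda x: x * x, lambda x: 2 * x, 0.0, 1.0, 1000,
    lambda p, q: (p + q) / 2,
)
# Closed form of the integral of 2x cos x over [0, 1].
exact = 2 * (math.sin(1.0) + math.cos(1.0) - 1.0)
print(s, r, exact)
```

The two sums agree up to rounding for every n, and both approach the closed-form value as n grows.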
An immediate corollary of Theorem 5.4.7 is Theorem 5.2.2. Indeed,
if we take f(x) = 1, and note that
\[ \int_a^b dg(x) = g(b) - g(a), \]
we immediately have Theorem 5.2.2. The next theorem is integration
by parts for Riemann-Stieltjes integrals. When taken in conjunction
with the previous theorem, it is seen to yield Corollary 5.2.3.
5.4.8 Theorem. If f and g are defined on [a, b] and if the integral
of f with respect to g exists, then the integral of g with respect to f exists and
\[ \int_a^b f(x)\,dg(x) + \int_a^b g(x)\,df(x) = f(b)g(b) - f(a)g(a) = \int_a^b d[f(x)g(x)]. \]
Proof. Let $\Delta = \{I_k : k \in (1, n)\}$ be any decomposition of [a,b],
where $I_k = [a_k, b_k]$. We may write
\[ S_{g,f}(\Delta, \{x_k\}) = \sum_{k=1}^{n} g(x_k)[f(b_k) - f(a_k)]. \]
On the other hand,
\[ f(b)g(b) - f(a)g(a) = \sum_{k=1}^{n} [f(b_k)g(b_k) - f(a_k)g(a_k)]. \]
Hence
\[ f(b)g(b) - f(a)g(a) - S_{g,f}(\Delta, \{x_k\}) = \sum_{k=1}^{n} f(b_k)[g(b_k) - g(x_k)] + \sum_{k=1}^{n} f(a_k)[g(x_k) - g(a_k)]. \]
Now $[a_k, x_k] \cup [x_k, b_k] = [a_k, b_k]$ and hence
\[ \Delta' = \{[a_k, x_k] : k \in (1, n)\} \cup \{[x_k, b_k] : k \in (1, n)\} \]
is a decomposition of [a,b] and indeed is a refinement of $\Delta$. We may
therefore write
\[ S_{g,f}(\Delta, \{x_k\}) = f(b)g(b) - f(a)g(a) - S_{f,g}(\Delta', \{x'_k\}), \]
where $x'_k = a_k$ for the interval $[a_k, x_k]$ and $x'_k = b_k$ for the interval
$[x_k, b_k]$. Since, by hypothesis, S(f,g) exists, the conclusion of the
theorem is an immediate consequence of the last equality.
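The last equality of the proof is an exact algebraic identity between finite sums, so it can be checked directly. The sketch below does this for the assumed functions f = exp and g = sin on [0, 2], with midpoints as the tags x_k; the helper name `parts_identity` is hypothetical.

```python
import math


def parts_identity(f, g, a, b, n):
    """Return (S_{g,f}(D, {x_k}), f(b)g(b) - f(a)g(a) - S_{f,g}(D', {x'_k}))
    for the uniform decomposition D with midpoint tags x_k."""
    pts = [a + (b - a) * k / n for k in range(n + 1)]
    tags = [(pts[k] + pts[k + 1]) / 2 for k in range(n)]
    s_gf = sum(g(tags[k]) * (f(pts[k + 1]) - f(pts[k])) for k in range(n))
    # Refinement D': [a_k, x_k] tagged at a_k and [x_k, b_k] tagged at b_k.
    s_fg = sum(
        f(pts[k]) * (g(tags[k]) - g(pts[k]))
        + f(pts[k + 1]) * (g(pts[k + 1]) - g(tags[k]))
        for k in range(n)
    )
    return s_gf, f(b) * g(b) - f(a) * g(a) - s_fg


lhs, rhs = parts_identity(math.exp, math.sin, 0.0, 2.0, 50)
print(lhs, rhs)   # equal up to rounding error
```

Because the identity holds for every decomposition, existence of the limit of one side forces existence of the limit of the other, which is the content of the theorem.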

The generalization of Theorem 5.2.4 is the following:

5.4.9 Theorem. If g is defined, increasing, and continuous on [a,b],
f and h are defined on [g(a),g(b)], and the Riemann-Stieltjes integral of
f with respect to h exists, then the integral of $f \circ g$ with respect to $h \circ g$ exists,
and
\[ \int_{g(a)}^{g(b)} f(x)\,dh(x) = \int_a^b f \circ g(x)\,dh \circ g(x). \]
Proof. As in the proof of Theorem 5.2.4, if $\Delta = \{[a_k, b_k] : k \in (1, n)\}$
is a decomposition of [a,b], then $\Delta' = \{[a'_k, b'_k] : k \in (1, n)\}$, where
$a'_k = g(a_k)$, $b'_k = g(b_k)$, is a decomposition of [g(a),g(b)], and conversely.
Further, if $x_k \in [a_k, b_k]$, then $x'_k = g(x_k) \in [a'_k, b'_k]$, and conversely. Hence
\[ S_{f,h}(\Delta', \{x'_k\}) = S_{f \circ g,\, h \circ g}(\Delta, \{x_k\}). \]
Since $\Delta_1 \succ \Delta_2 \Leftrightarrow \Delta'_1 \succ \Delta'_2$, and since the limit on the left exists, the
limit on the right exists and we have proved the theorem.

Note that this theorem in conjunction with Theorem 5.4.7 gives
Theorem 5.2.4 by taking h(x) = x.
The Riemann-Stieltjes integral and Theorem 5.4.9 can be used to
justify some of the formal operations used in elementary calculus.
For example, we may write
\[ \int_0^{\pi/2} f(\sin x)\,d \sin x = \int_0^1 f(x)\,dx \]
for any continuous function f. Assuming that the integral on the left
exists (which we shall prove in Section 5.5), set g(x) = Arc sin x. Since g
is monotone increasing and g(0) = 0, g(1) = $\pi/2$, if we use the formula
of Theorem 5.4.9 we get
\[ \int_0^{\pi/2} f(\sin x)\,d \sin x = \int_0^1 f(\sin g(x))\,d \sin g(x). \]
But since g is the inverse of the sine function, we have sin g(x) = x.
Of course, in this case Theorem 5.2.4 in conjunction with Theorem 5.4.7
will also justify this change of variable. (See also Exercise 6 of Section
5.2.)
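The change of variable above can be checked with Riemann-Stieltjes sums. The following sketch assumes the sample integrand f(t) = t^2 (so that both sides equal 1/3) and uses midpoint tags on uniform decompositions; `rs_sum` is a hypothetical helper, not notation from the text.

```python
import math


def rs_sum(f, g, a, b, n):
    """Riemann-Stieltjes sum of f with respect to g on n equal
    subintervals of [a, b], tagged at the midpoints."""
    total = 0.0
    for k in range(n):
        a_k = a + (b - a) * k / n
        b_k = a + (b - a) * (k + 1) / n
        x_k = 0.5 * (a_k + b_k)
        total += f(x_k) * (g(b_k) - g(a_k))
    return total


f = lambda t: t * t
# Left side: integral of f(sin x) with respect to sin x over [0, pi/2].
left = rs_sum(lambda x: f(math.sin(x)), math.sin, 0.0, math.pi / 2, 2000)
# Right side: ordinary integral of f over [0, 1], taking g(x) = x.
right = rs_sum(f, lambda x: x, 0.0, 1.0, 2000)
print(left, right)   # both close to 1/3
```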
The next two theorems are generalizations of the mean value theorems,
5.2.6 and 5.2.7.

5.4.10 First Mean Value Theorem. If f is defined and bounded on
[a, b] and g is monotone nondecreasing on the same interval, and if f is
integrable with respect to g, then $\exists c$ such that $\inf f \le c \le \sup f$ and
\[ \int_a^b f(x)\,dg(x) = c \int_a^b dg(x) = c[g(b) - g(a)]. \]
The proof of this theorem is essentially the same as the proof of
Theorem 5.2.6, and we leave it as an exercise.
5.4.11 Second Mean Value Theorem. If f and g are defined on
[a, b], f is monotone nondecreasing, and g is continuous and integrable with
respect to f, then $\exists c \in [a, b]$ such that
\[ \int_a^b f(x)\,dg(x) = f(a) \int_a^c dg(x) + f(b) \int_c^b dg(x). \]
Proof. By Theorem 5.4.8 we may write
\[ \int_a^b f(x)\,dg(x) = f(b)g(b) - f(a)g(a) - \int_a^b g(x)\,df(x). \]
Since g is continuous, it takes on all values between its maximum and
its minimum. Hence if we use this fact and apply Theorem 5.4.10, we
see that $\exists c \in [a, b]$ so that
\[ \int_a^b g(x)\,df(x) = g(c) \int_a^b df(x) = g(c)[f(b) - f(a)]. \]
Hence
\[ \int_a^b f(x)\,dg(x) = f(b)[g(b) - g(c)] + f(a)[g(c) - g(a)] = f(a) \int_a^c dg(x) + f(b) \int_c^b dg(x). \]
NOTE: Since g is continuous and f is monotone nondecreasing, it
will follow from Theorem 5.5.2 that g is integrable with respect to
f, and hence this part of the hypothesis is superfluous. We should
also note that if f is monotone nonincreasing we get exactly the
same theorem.
Exercises
1. If f is defined and bounded on [a, b] and g is monotone nondecreasing
on the same interval and f is integrable with respect to g,
is it necessarily true that
\[ F(x) = \int_a^x f(t)\,dg(t) \]
is continuous? If so, prove; if not, give an example and give sufficient
conditions under which it is continuous.
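2. If h is nondecreasing on [a, b], and f and g are bounded on [a, b],
show that
(a) \[ \overline{\int_a^b} f(x)\,dh(x) = \overline{\int_a^c} f(x)\,dh(x) + \overline{\int_c^b} f(x)\,dh(x) \quad (a \le c \le b). \]
(b) \[ \overline{\int_a^b} [f(x) + g(x)]\,dh(x) \le \overline{\int_a^b} f(x)\,dh(x) + \overline{\int_a^b} g(x)\,dh(x). \]
(c) \[ \underline{\int_a^b} [f(x) + g(x)]\,dh(x) \ge \underline{\int_a^b} f(x)\,dh(x) + \underline{\int_a^b} g(x)\,dh(x). \]
3. Let f be a function defined on [0, 1] in the following way:
\[ f(x) = \begin{cases} 1, & \text{for } x \in [0, 1/2], \\ 0, & \text{for } x \in\, ]1/2, 1]. \end{cases} \]
Does
\[ \int_0^1 f(x)\,df(x) \]
exist?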
4. Suppose $\Delta = \{[a_k, b_k] : k \in (1, n)\}$ is a decomposition of [a, b]
with $a_1 = a$, $b_n = b$, and $a_{k+1} = b_k$, $k \in (1, n - 1)$. Suppose g is constant
on each interval $[a, b_1[$, $]a_n, b]$, and $]a_k, b_k[$, is defined in any way for
$x = a_k$, and $g(a_k+) - g(a_k-) = \delta_k$ for $k \in (2, n - 1)$. If f is continuous
on [a, b], show that S(f, g) exists and
\[ \int_a^b f(x)\,dg(x) = \sum_k f(a_k)\delta_k. \]
5. Suppose g is defined on [0, 1] in the following way:
\[ g(x) = \begin{cases} x, & \text{for } x \in [0, 1/3], \\ x + 3/4, & \text{for } x \in\, ]1/3, 2/3[, \\ x + 9/8, & \text{for } x \in [2/3, 1]. \end{cases} \]
If f(x) = x for $x \in [0, 1]$, compute
\[ \int_0^1 f(x)\,dg(x). \]
6. If f, g, and h are defined on [a, b], h is monotone nondecreasing
and f and g are bounded and integrable with respect to h, show that
fg is integrable with respect to h.
7. Prove Theorem 5.4.6(d).
8. Suppose $(f_n)$ is a sequence of functions and g is a monotone
nondecreasing function, all functions being defined on [a, b]. Suppose
$\forall n \in N$, $f_n$ is integrable with respect to g and $f_n \to f$ uniformly on
[a, b]. Show that f is integrable with respect to g and
\[ \int_a^b f_n(x)\,dg(x) \to \int_a^b f(x)\,dg(x). \]
9. Show that Abel's lemma, 3.2.5, is a special case of Theorem 5.4.8.
10. Suppose f, g, and h are defined on [a, b] and f and g are integrable
with respect to h. Prove the integral form of the Lagrange identity:
\[ \frac{1}{2} \int_a^b \left[ \int_a^b \{f(x)g(y) - f(y)g(x)\}^2\,dh(x) \right] dh(y) = \left( \int_a^b f(x)^2\,dh(x) \right) \left( \int_a^b g(x)^2\,dh(x) \right) - \left( \int_a^b f(x)g(x)\,dh(x) \right)^2. \]
If h is monotone nondecreasing, use this to obtain the Cauchy-Schwarz
inequality:
\[ \left( \int_a^b f(x)g(x)\,dh(x) \right)^2 \le \left( \int_a^b f(x)^2\,dh(x) \right) \left( \int_a^b g(x)^2\,dh(x) \right). \]

5.5 FUNCTIONS OF BOUNDED VARIATION AND THE EXISTENCE
OF RIEMANN-STIELTJES INTEGRALS
To get an idea of the kinds of functions that will guarantee the existence
of the Riemann-Stieltjes integral, we shall start with a continuous function
f and write down the Riemann-Stieltjes sums of f with respect to
a function g.
Let $\Delta_1$ and $\Delta_2$ be decompositions of an interval [a, b] and $\Delta$ the common
refinement of $\Delta_1$ and $\Delta_2$. Suppose $\Delta_j = \{I_{jk} : k \in (1, n_j)\}$, j = 1, 2,
and $\Delta = \{I_k : 1 \le k \le n\}$. Now, suppose we have labeled the elements
of $\Delta$ so that
\[ I_{1k} = \bigcup \{I_j : l_k < j \le l_{k+1}\}. \]
Consequently, we may write
\[ g(b_{1k}) - g(a_{1k}) = \sum_{j=l_k+1}^{l_{k+1}} [g(b_j) - g(a_j)], \]
and hence if $x_j \in I_j$ and $x_{1k} \in I_{1k}$ we have
\[ f(x_{1k})[g(b_{1k}) - g(a_{1k})] - \sum_{j=l_k+1}^{l_{k+1}} f(x_j)[g(b_j) - g(a_j)] = \sum_{j=l_k+1}^{l_{k+1}} [f(x_{1k}) - f(x_j)][g(b_j) - g(a_j)]. \]
If we sum both sides of this over k we get
\[ S_{f,g}(\Delta_1, \{x_{1k}\}) - S_{f,g}(\Delta, \{x_k\}) = \sum_{k=1}^{n_1} \sum_{j=l_k+1}^{l_{k+1}} [f(x_{1k}) - f(x_j)][g(b_j) - g(a_j)]. \tag{5.5.1} \]
We get a similar formula for $S_{f,g}(\Delta_2, \{x_{2k}\})$.

If f is continuous on [a, b], it is uniformly continuous and hence
$\forall \varepsilon > 0$, $\exists \delta > 0$ so that $|x - y| < \delta$ and $x, y \in [a, b]$ implies $|f(x) - f(y)| < \varepsilon$.
Consequently, if $\Delta_\varepsilon$ is a decomposition with $|\Delta_\varepsilon| < \delta$, and $\Delta_1 \succ \Delta_\varepsilon$ and
$\Delta_2 \succ \Delta_\varepsilon$ it follows that $\Delta \succ \Delta_\varepsilon$ and $|\Delta| < \delta$. Consequently, the absolute value
of the left side of (5.5.1) is less than
\[ \varepsilon \sum_{j=1}^{n} |g(b_j) - g(a_j)|. \]
If we suppose, for the moment, that $\exists M$ so that for any decomposition
\[ \sum_{j=1}^{n} |g(b_j) - g(a_j)| \le M, \]
then, combining this estimate with the similar one for $\Delta_2$, we have
\[ |S_{f,g}(\Delta_1, \{x_{1k}\}) - S_{f,g}(\Delta_2, \{x_{2k}\})| \le 2\varepsilon M. \tag{5.5.2} \]
That is, $S_{f,g}$ is Cauchy. It follows from Theorem 5.4.4 and (5.5.2) that
the Riemann-Stieltjes integral of f with respect to g exists.
We are now ready to write down a formal definition.
5.5.1 Definition. A function g defined on the interval [a, b] is said
to be of bounded variation $\Leftrightarrow$ $\exists M$ such that for every decomposition
$\Delta = \{[a_k, b_k] : k \in (1, n)\}$ of [a, b],
\[ \sum_{k=1}^{n} |g(b_k) - g(a_k)| \le M. \]
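To make the sums in this definition concrete, the sketch below evaluates them on uniform decompositions. The helper `variation_sum` and the sample functions are assumptions made for illustration only. For a monotone function the sum telescopes to |g(b) - g(a)|; for the sine function on [0, 2 pi] no decomposition gives more than 4.

```python
import math


def variation_sum(g, a, b, n):
    """Sum of |g(b_k) - g(a_k)| over the n equal subintervals of [a, b]."""
    pts = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(abs(g(pts[k + 1]) - g(pts[k])) for k in range(n))


monotone = variation_sum(math.exp, 0.0, 1.0, 10)           # telescopes to e - 1
coarse = variation_sum(math.sin, 0.0, 2 * math.pi, 7)      # a lower value
aligned = variation_sum(math.sin, 0.0, 2 * math.pi, 400)   # turning points on the grid
print(monotone, coarse, aligned)
```

Any single decomposition only gives a lower bound for the least admissible M; the decomposition with 400 intervals contains the turning points of the sine and therefore attains the bound 4 up to rounding.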
We have actually proved the following theorem.
5.5.2 Theorem. If f and g are defined on the same interval, f is
continuous and g is of bounded variation, then the Riemann-Stieltjes integral of
f with respect to g exists.
Note that this theorem when used in conjunction with Theorem
5.4.8 on integration by parts contains Theorem 5.2.10 as a special case.

Note also that a slight adjustment in the proof of Theorem 5.5.2 will
lead to a theorem that contains Theorem 5.2.11 as a special case (see
Exercise 5 at the end of this section).
It is a simple consequence of the triangle inequality that the sum of
two functions of bounded variation is again a function of bounded
variation. Since it is clear that a monotone function on a finite interval is of
bounded variation, it follows that the sum of any two monotone
functions on a finite interval is a function of bounded variation. Indeed,
the last statement is a description of any function of bounded variation.
5.5.3 Theorem. A function g with domain [a, b] is of bounded variation
if and only if it is the difference of two monotone nondecreasing functions.
Further, there are unique monotone nondecreasing functions $g^+$ and $g^-$ so
that $g^+(a) = g^-(a) = 0$,
\[ g(x) - g(a) = g^+(x) - g^-(x), \tag{5.5.3} \]
and so that for every pair of monotone nondecreasing functions $f^+$ and $f^-$ for
which
\[ g(x) - g(a) = f^+(x) - f^-(x), \]
we have
\[ g^+(x) \le f^+(x) - f^+(a), \qquad g^-(x) \le f^-(x) - f^-(a). \tag{5.5.4} \]
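Proof. We shall prove only the necessity of the first statement,
since the sufficiency is obvious. If $x \in [a, b]$, let $\Delta(x)$ be any decomposition
of [a, x] and $\mathcal{D}(x)$ the collection of all such decompositions. Set
\[ S_g^+(\Delta(x)) = \frac{1}{2} \sum_{k=1}^{n} \{|g(b_k) - g(a_k)| + [g(b_k) - g(a_k)]\}, \]
\[ S_g^-(\Delta(x)) = \frac{1}{2} \sum_{k=1}^{n} \{|g(b_k) - g(a_k)| - [g(b_k) - g(a_k)]\}, \]
where $[a_k, b_k] \in \Delta(x)$. It is clear that
\[ S_g^+(\Delta(x)) - S_g^-(\Delta(x)) = g(x) - g(a). \tag{5.5.5} \]
Since g is of bounded variation, the sets $\{S_g^{\pm}(\Delta(x)) : \Delta(x) \in \mathcal{D}(x)\}$
are bounded and hence
\[ g^+(x) = \sup\{S_g^+(\Delta(x)) : \Delta(x) \in \mathcal{D}(x)\}, \]
\[ g^-(x) = \sup\{S_g^-(\Delta(x)) : \Delta(x) \in \mathcal{D}(x)\}, \]
are well defined. The functions $g^+$ and $g^-$ are monotone nondecreasing
functions. We shall show this for $g^+$, the proof for $g^-$ being similar. We
first note that every term in the sum $S_g^+(\Delta(x))$ is nonnegative. This
means that if $\Delta(x) \subset \Delta(y)$, then
\[ S_g^+(\Delta(x)) \le S_g^+(\Delta(y)). \tag{5.5.6} \]
Now, if x < y, then $\forall \Delta(x)$, $\exists \Delta(y)$ such that $\Delta(x) \subset \Delta(y)$. Indeed, if
$\Delta(x, y)$ is any decomposition of [x, y], then $\Delta(x) \subset \Delta(x) \cup \Delta(x, y)
= \Delta(y)$. Consequently, it follows from this fact and (5.5.6) that
\[ g^+(x) = \sup\{S_g^+(\Delta(x)) : \Delta(x) \in \mathcal{D}(x)\} \le g^+(y), \]
which shows that $g^+$ is monotone nondecreasing.

Let us now prove (5.5.3); that is,
\[ g(x) - g(a) = g^+(x) - g^-(x). \]
To show this we first note that
\[ S_g^{\pm}(\Delta(x)) = \frac{1}{2} \sum_{k=1}^{n} |g(b_k) - g(a_k)| \pm \frac{1}{2}[g(x) - g(a)] = \frac{1}{2} S_g(\Delta(x)) \pm \frac{1}{2}[g(x) - g(a)], \]
where we have set
\[ S_g(\Delta(x)) = S_g^+(\Delta(x)) + S_g^-(\Delta(x)) = \sum_{k=1}^{n} |g(b_k) - g(a_k)|. \]
Hence
\[ g^{\pm}(x) = \sup\{S_g^{\pm}(\Delta(x)) : \Delta(x) \in \mathcal{D}(x)\} = \frac{1}{2} \sup\{S_g(\Delta(x)) : \Delta(x) \in \mathcal{D}(x)\} \pm \frac{1}{2}[g(x) - g(a)]. \]
If we subtract $g^-(x)$ from $g^+(x)$ and use the above equality, we get
(5.5.3).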
To complete the proof, let us first remark that if $A \ge 0$ and $B \ge 0$,
we always have
\[ \frac{1}{2}\{|A - B| + (A - B)\} \le A. \]
Now, if $g(x) - g(a) = f^+(x) - f^-(x)$, where $f^+$ and $f^-$ are nondecreasing,
then if we apply the previous remark with
$A = f^+(b_k) - f^+(a_k)$ and $B = f^-(b_k) - f^-(a_k)$, we get
\[ \frac{1}{2}\{|g(b_k) - g(a_k)| + [g(b_k) - g(a_k)]\} \le f^+(b_k) - f^+(a_k). \]
Thus
\[ S_g^+(\Delta(x)) \le \sum_{k=1}^{n} [f^+(b_k) - f^+(a_k)] \le f^+(x) - f^+(a). \]
It follows that
\[ g^+(x) \le f^+(x) - f^+(a). \]
Now, $f^+(a) = f^-(a)$, and
\[ f^-(x) - g^-(x) = f^+(x) - g^+(x). \]
Thus
\[ f^-(x) - f^-(a) - g^-(x) = f^+(x) - f^+(a) - g^+(x) \ge 0, \]
which shows that $g^-(x) \le f^-(x) - f^-(a)$. The uniqueness of the
decomposition of g - g(a) into functions
having the properties of $g^+$ and $g^-$ is an immediate consequence of the
normalization conditions at a and the minimum conditions (5.5.4).
This completes the proof of the theorem.

The decomposition of a function of bounded variation into the
difference of two monotone nondecreasing functions is clearly not
unique. The unique pair of functions $g^+$ and $g^-$ described in the last
theorem is called the canonical decomposition of g - g(a).
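The sums $S_g^+$ and $S_g^-$ from the proof of Theorem 5.5.3 are easy to accumulate along a grid. The sketch below is a rough illustration, not the supremum itself: it uses only the decompositions formed by consecutive grid points, so in general it yields lower bounds for $g^+$ and $g^-$. For the assumed example g = sin on [0, 2 pi], whose turning points lie on the chosen grid, the values at grid points agree with the canonical decomposition up to rounding. The name `canonical_parts` is hypothetical.

```python
import math


def canonical_parts(g, a, b, n):
    """Running sums S_g^+ and S_g^- over the grid decomposition of [a, x]
    for each grid point x; returns (points, plus_values, minus_values)."""
    pts = [a + (b - a) * k / n for k in range(n + 1)]
    plus, minus = [0.0], [0.0]
    for k in range(n):
        d = g(pts[k + 1]) - g(pts[k])
        plus.append(plus[-1] + (abs(d) + d) / 2)     # positive part of the increment
        minus.append(minus[-1] + (abs(d) - d) / 2)   # negative part of the increment
    return pts, plus, minus


pts, g_plus, g_minus = canonical_parts(math.sin, 0.0, 2 * math.pi, 400)
print(g_plus[-1], g_minus[-1])   # each close to 2
```

At every grid point the difference of the two running sums reproduces g(x) - g(a), which is identity (5.5.5).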
Since every g of bounded variation can be decomposed into the
difference of two monotone nondecreasing functions, it follows that
$\forall x \in [a, b]$, g(x+) and g(x-) exist. We are using the same convention
here as in Section 2.3: g(a-) = g(a) and g(b+) = g(b).
5.5.4 Theorem. If g is defined on [a, b] and is of bounded variation,
and if the pair $g^+$ and $g^-$ is the canonical decomposition of g - g(a), then
$\forall x \in [a, b]$,
\[ g^{\pm}(x+) - g^{\pm}(x) = \frac{1}{2}\{|g(x+) - g(x)| \pm [g(x+) - g(x)]\}, \]
\[ g^{\pm}(x) - g^{\pm}(x-) = \frac{1}{2}\{|g(x) - g(x-)| \pm [g(x) - g(x-)]\}. \tag{5.5.7} \]
In particular, g is continuous at x $\Leftrightarrow$ $g^+$ and $g^-$ are continuous at x.
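Proof. For every $\varepsilon > 0$ there exists a decomposition $\Delta_\varepsilon$ of [a, b]
so that $g^+(b) < S_g^+(\Delta_\varepsilon) + \varepsilon$. If $\Delta \succ \Delta_\varepsilon$, then $S_g^+(\Delta_\varepsilon) \le S_g^+(\Delta)$, so that
$g^+(b) < S_g^+(\Delta) + \varepsilon$.
Let $x \in\, ]a, b[$, $\Delta \succ \Delta_\varepsilon$, so that $[x, y] \in \Delta$. Further, let $\Delta(x)
= \{I : I \in \Delta \ \&\ I \subset [a, x]\}$, $\Delta(x, y) = \{[x, y]\}$, and $\Delta(y, b) = \{I : I \in \Delta
\ \&\ I \subset [y, b]\}$. Then it is clear that
\[ S_g^+(\Delta) = S_g^+(\Delta(x)) + S_g^+(\Delta(x, y)) + S_g^+(\Delta(y, b)), \]
and hence
\[ 0 \le g^+(b) - S_g^+(\Delta) = [g^+(x) - S_g^+(\Delta(x))] + [g^+(y) - g^+(x) - S_g^+(\Delta(x, y))] + [g^+(b) - g^+(y) - S_g^+(\Delta(y, b))] < \varepsilon. \tag{5.5.8} \]
Now, by the method used in the last paragraph of the proof of Theorem
5.5.3, we can easily establish that all the terms in the sum on the right in
(5.5.8) are nonnegative. Thus we get
\[ 0 \le g^+(y) - g^+(x) - S_g^+(\Delta(x, y)) < \varepsilon, \]
that is,
\[ 0 \le g^+(y) - g^+(x) - \frac{1}{2}\{|g(y) - g(x)| + [g(y) - g(x)]\} < \varepsilon. \]
If now we allow $y \to x+$, and then $\varepsilon \to 0$, we get
\[ g^+(x+) - g^+(x) = \frac{1}{2}\{|g(x+) - g(x)| + [g(x+) - g(x)]\}. \]
The proofs of the other formulas in (5.5.7) follow in a similar way.
The proof of the statement about continuity is almost obvious and we
leave it for the reader.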
5.5.5 Definition. If g is of bounded variation and $g^+$ and $g^-$ form the
canonical decomposition of g - g(a), then
\[ v_g = g^+ + g^- \tag{5.5.9} \]
is called the variation of g and $v(g) = v_g(b)$ is called the total variation of g.
The next proposition gives another way of computing the variation
of a function.
5.5.6 Proposition.
\[ v_g(x) = \sup\{S_g(\Delta(x)) : \Delta(x) \in \mathcal{D}(x)\}, \tag{5.5.10} \]
where
\[ S_g(\Delta(x)) = S_g^+(\Delta(x)) + S_g^-(\Delta(x)) = \sum_{k=1}^{n} |g(b_k) - g(a_k)|. \]
Moreover,
\[ v_g(x+) - v_g(x) = |g(x+) - g(x)|, \qquad v_g(x) - v_g(x-) = |g(x) - g(x-)|. \tag{5.5.11} \]
Proof. As we have already noted in the proof of Theorem 5.5.3,
\[ g^{\pm}(x) = \frac{1}{2} \sup\{S_g(\Delta(x)) : \Delta(x) \in \mathcal{D}(x)\} \pm \frac{1}{2}[g(x) - g(a)]. \]
Thus, adding $g^+$ and $g^-$, we get the formula (5.5.10). The formulas in
(5.5.11) follow immediately from the formulas (5.5.7).
As we have already noted, a function g of bounded variation can
have only jump discontinuities. From Theorem 5.5.4 there are only a
countable number of jump discontinuities and moreover from Proposition
5.5.6,
\[ |g(x+) - g(x)| + |g(x) - g(x-)| = v_g(x+) - v_g(x-). \]
Let J be the set of jump discontinuities of g and $\{x_k : k \in (1, n)\}$ any
finite set in J. Suppose these are labeled so that $x_1 < x_2 < \cdots < x_n$.
Then
\[ \sum_{k=1}^{n} [\,|g(x_k+) - g(x_k)| + |g(x_k) - g(x_k-)|\,] = \sum_{k=1}^{n} [v_g(x_k+) - v_g(x_k-)] \]
\[ \le \sum_{k=1}^{n-1} [v_g(x_k+) - v_g(x_k-) + v_g(x_{k+1}-) - v_g(x_k+)] + v_g(x_n+) - v_g(x_n-) \]
\[ \le v_g(x_n+) - v_g(x_1-) \le v_g(b). \]
The last three inequalities follow from the fact that $v_g$ is a monotone
nondecreasing function. Now, if J is denumerable, let $(x_k)$ be a sequence
with range J. The last inequality shows that
\[ \sum_{k=0}^{\infty} |g(x_k+) - g(x_k-)| \le v(g), \]
and thus the series on the left is absolutely convergent and consequently
independent of the order of summation of the terms. Therefore, if
J is finite or denumerable, the number
\[ \sum_{x \in J} [g(x+) - g(x-)] \]
has a meaning and value independent of how the elements of J are
labeled.
Let us set $J_x = J \cap [a, x[$, and $\forall x \in [a, b]$,
\[ g_s(x) = \sum_{y \in J_x} [g(y+) - g(y-)] + g(x) - g(x-). \]
We are using the convention that g(a-) = g(a), so that $g_s(a) = 0$. The
function $g_s$ is called the saltus function for g. The proofs of the following
facts about the saltus function are easily established and we have
left the proofs for an exercise at the end.
(a) \[ g_s(x+) - g_s(x) = g(x+) - g(x), \qquad g_s(x) - g_s(x-) = g(x) - g(x-), \qquad \forall x \in [a, b]. \]
(b) If $g^+$, $g^-$ is the canonical decomposition of g - g(a), then
\[ g_s^+ \le g^+ \quad \text{and} \quad g_s^- \le g^-. \]
(c) The function $g_s$ is of bounded variation and
\[ v_{g_s}(x) \le v_g(x), \qquad \forall x \in [a, b]. \]
If we set
\[ g_c = g - g_s \]
it is clear from condition (a) that $g_c$ is a continuous function and moreover
from condition (c) it follows that $g_c$ is of bounded variation.
It is interesting to compute the Riemann-Stieltjes integral of a continuous
function f with respect to the saltus function $g_s$. We know that
this integral exists by virtue of Theorem 5.5.2. By the proof of the latter
theorem $\forall \varepsilon > 0$, $\exists \eta > 0$ so that $0 < \delta < \eta$ and $|\Delta| < \delta$ $\Rightarrow$
\[ \left| \sum_{k=1}^{n} f(x_k)[g_s(b_k) - g_s(a_k)] - \int_a^b f(x)\,dg_s(x) \right| < \varepsilon/2. \]
If $a_k \ne a$ and $b_k \ne b$, then clearly there is no loss in generality in taking
$a_k$ and $b_k$ as points of continuity of $g_s$. In the latter case we get from the
definition of $g_s$ that
\[ g_s(b_k) - g_s(a_k) = \sum_{x \in J \cap I_k} [g(x+) - g(x-)], \]
where $I_k = [a_k, b_k[$. Indeed, the last formula is also valid if $a_k = a$. In
case $b_k = b_n = b$, we must add the term g(b) - g(b-) to the sum on
the right.
Let us suppose that $\delta$ is small enough so that $x, y \in [a_k, b_k]$ $\Rightarrow$
$|f(x) - f(y)| < \varepsilon/2M$, where M is a fixed number that satisfies the
inequality
\[ \sum_{x \in J} |g(x+) - g(x-)| < M. \]
Thus if $b_k \ne b$, we get
\[ f(x_k)[g_s(b_k) - g_s(a_k)] = \sum_{x \in J \cap I_k} f(x)[g(x+) - g(x-)] + \varepsilon_k \]
and, if k = n, we must add the term f(b)[g(b) - g(b-)] on the right,
where
\[ \sum_{k=1}^{n} |\varepsilon_k| < \varepsilon/2. \]
Hence, summing over k, we see that
\[ \left| \sum_{x \in J} f(x)[g(x+) - g(x-)] - \int_a^b f(x)\,dg_s(x) \right| < \varepsilon. \]
We have proved the following result.
5.5.7 Theorem. If f is continuous and g is of bounded variation on
[a, b], then taking g(a-) = g(a), g(b+) = g(b) we get
\[ \int_a^b f(x)\,dg(x) = \sum_{x \in J} f(x)[g(x+) - g(x-)] + \int_a^b f(x)\,dg_c(x), \]
where J is the set of discontinuities of g,
\[ g_c(x) = g(x) - g_s(x), \]
and $g_s$ is the saltus function for g.
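The formula of Theorem 5.5.7 can be tried on a function with a single jump. In the sketch below, the integrator g(x) = x + 2 for x >= 1/2 and g(x) = x otherwise, the integrand f(x) = x^2, and the helper `rs_sum` are all assumptions chosen for illustration. Here J = {1/2}, the jump g(1/2+) - g(1/2-) equals 2, and $g_c(x) = x$.

```python
def rs_sum(f, g, a, b, n):
    """Riemann-Stieltjes sum on n equal subintervals with midpoint tags."""
    total = 0.0
    for k in range(n):
        a_k = a + (b - a) * k / n
        b_k = a + (b - a) * (k + 1) / n
        total += f(0.5 * (a_k + b_k)) * (g(b_k) - g(a_k))
    return total


jump_point, jump_size = 0.5, 2.0
g_c = lambda x: x                                                # continuous part
g = lambda x: g_c(x) + (jump_size if x >= jump_point else 0.0)   # part plus one jump
f = lambda x: x * x

# An odd n keeps the jump point strictly inside one subinterval.
direct = rs_sum(f, g, 0.0, 1.0, 20001)
by_theorem = f(jump_point) * jump_size + rs_sum(f, g_c, 0.0, 1.0, 20001)
print(direct, by_theorem)   # both close to 1/2 + 1/3
```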

The last theorem shows, for example, that the sum of an absolutely
convergent infinite series may be written as a Riemann-Stieltjes integral.
Indeed, suppose that $\sum_{k \ge 0} (a_k)$ is an absolutely convergent series. Set
g(0) = 0 and for $x \in\, ]0, 1]$,
\[ g(x) = a_0 + \sum_{kx \ge 1} a_k. \]
It is clear that $\forall k \in N$, $k \ge 1$,
\[ g(1/k) - g(1/k-) = a_k, \qquad g(0+) - g(0) = a_0, \]
and these are the only discontinuities of g. Since the series is absolutely
convergent, g is of bounded variation. The function g is a saltus function
and indeed $g_s = g$. Further,
\[ \sum_{k=0}^{\infty} a_k = \int_0^1 dg(x). \]
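The construction can be tried with exact rational arithmetic. The sketch below is only an illustration: it truncates the series at a hypothetical index K (so that g can be evaluated exactly) and uses the assumed terms a_k = (-1)^k / (k + 1)^2. The integral of dg over [0, 1] is then g(1) - g(0).

```python
from fractions import Fraction

K = 50   # hypothetical truncation index, an assumption for this example
a = [Fraction((-1) ** k, (k + 1) ** 2) for k in range(K + 1)]


def g(x):
    """g(0) = 0 and g(x) = a_0 + sum of a_k over k >= 1 with k*x >= 1."""
    if x == 0:
        return Fraction(0)
    return a[0] + sum(a[k] for k in range(1, K + 1) if k * x >= 1)


total = g(Fraction(1)) - g(Fraction(0))   # integral of dg over [0, 1]
# Jump of g at x = 1/3, measured from a point slightly to the left.
jump_at_third = g(Fraction(1, 3)) - g(Fraction(1, 3) - Fraction(1, 1000))
print(total == sum(a), jump_at_third == a[3])
```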
Exercises
1. If f and g are functions of bounded variation on [a, b], show
that f + g and fg are functions of bounded variation. If $|g| \ge m > 0$,
show that 1/g is also of bounded variation.

2. Suppose that g is of bounded variation on [a, b] and $c \in [a, b]$.
If $g_1 = g|[a, c]$, $g_2 = g|[c, b]$, show that $g_1$ and $g_2$ are of bounded
variation and
\[ v(g) = v(g_1) + v(g_2). \]
3. Suppose g is defined on [a, b], is of bounded variation, and
$g^+$, $g^-$ is the canonical decomposition for g - g(a). If $g - g(a) = f^+ - f^-$,
where $f^+$ and $f^-$ are monotone nondecreasing, show that $h = f^+ - g^+
= f^- - g^-$ is a monotone nondecreasing function.
4. Let g be that function defined on [0, 1] as follows:
\[ g(x) = \begin{cases} x \sin(\pi/x), & \text{for } x \ne 0, \\ 0, & \text{for } x = 0. \end{cases} \]
Decide whether or not g is of bounded variation.
5. A function f defined on [a, b] is said to be piecewise continuous
$\Leftrightarrow$ it is continuous except at a finite number of points $\{a_k : k \in (1, n)\}$
and $\forall k \in (1, n)$ $f(a_k+)$ and $f(a_k-)$ exist. If f is piecewise continuous
and g is continuous and of bounded variation on [a, b], show that the
Riemann-Stieltjes integral of f with respect to g exists. Can the hypotheses
on f be weakened further?
6. Let the function f be as in Exercise 5. Weaken the continuity
requirements on g in that exercise so that the Riemann-Stieltjes integral
of f with respect to g exists. Can the hypotheses on f be weakened also?
7. If f is defined and differentiable on [a, b] and f' is bounded,
show that f is of bounded variation.

8. Suppose f is defined on [a, b] and $\exists M > 0$ so that $\forall x, y \in [a, b]$,
\[ |f(x) - f(y)| \le M |x - y|^{1/2}. \]
Is f necessarily of bounded variation?

9. Suppose f is defined on [a, b] and $\exists M > 0$ so that $\forall x, y \in [a, b]$,
\[ |f(x) - f(y)| \le M |x - y|. \]
If g is any continuous function, show that the Riemann-Stieltjes integral
of f with respect to g exists.
10. Suppose f is continuous and g is of bounded variation on [a, b].
Show that
\[ \left| \int_a^b f(x)\,dg(x) \right| \le \int_a^b |f(x)|\,|dg(x)|, \]
where |dg(x)| is another symbol for $dv_g(x)$.
CHAPTER 6
HIGHER-DIMENSIONAL SPACE
In the previous chapters we have developed the essentials of the calculus
for real-valued functions with domains in the real number system.
The topics we have developed include (1) the real number system, (2)
sequences and series, (3) real-valued continuous functions, (4) differentiable
functions, and (5) integration theory. The object of the remaining
chapters will be to extend some of these theories to higher-dimensional
situations. Some results extend in a straightforward manner. On the
other hand, there are other results that are very simple and easy to
prove in the one-dimensional case which become rather complicated
and far-reaching theories in the higher-dimensional case. For example,
the simple fact that a differentiable monotone function has an inverse
that is also a differentiable monotone function generalizes to the
considerably more difficult inverse function theorem. The rather elementary
formula for the integration of a derivative becomes a much more
complicated theorem that requires considerably more algebraic and
analytic machinery for its proof. This theorem is generally called
Stokes' theorem, but the names of Gauss and Green may also be associated
with it.
In this chapter we shall lay the basic foundation for the subsequent
chapters. We shall discuss real vector spaces with a certain distance
function acting on pairs of points and shall also discuss general properties
of continuous functions as well as special continuous functions called
linear transformations.
6.1 REAL VECTOR SPACES
We shall begin our discussion with the n-fold Cartesian product $R^n$ of
the real numbers. The set $R^n$ shall be defined as the set of all n-tuples
$(x^1, \ldots, x^n)$, where $x^k \in R$ for $k \in (1, n)$. A much more formal
definition of $R^n$ can be given as the collection of all functions with
domain the set (1, n) and range in R. Indeed, a little reflection will
convince the reader that one reasonable way to define an n-tuple is as
a function with domain (1, n) and range in R. We shall continue to
use the very suggestive notation that we have indicated above.

We shall now introduce two functions + and $\cdot$, the first one having
domain $R^n \times R^n$ and range $R^n$ and the second one having domain
$R \times R^n$ and range $R^n$. These functions are defined by the equalities
\[ (x^1, \ldots, x^n) + (y^1, \ldots, y^n) = (x^1 + y^1, \ldots, x^n + y^n), \]
\[ \alpha \cdot (x^1, \ldots, x^n) = (\alpha x^1, \ldots, \alpha x^n). \]
The triple $(R^n, +, \cdot)$ is a prototype example of what we shall call a
real n-dimensional vector space and we shall designate it by $V^n$. By an
abuse of language we shall write $x \in V^n$ rather than $x \in R^n$ when we
want to emphasize that we are working with a vector space, and shall
speak of the elements of $V^n$ rather than the elements of $R^n$. The elements
of $V^n$ shall be called points or vectors and we shall designate them
by letters without superscripts; for example,
\[ x = (x^1, \ldots, x^n). \]
However, in two and three dimensions we shall often revert to the
standard notations (x,y) and (x,y,z). As is usual, when we "multiply"
a vector by a scalar (an element of R) we shall drop the dot. The numbers
$x^k$ will be called the components of the vector x. The zero vector, 0, is
that one for which every component is the zero element of R. We shall
also set $-x = (-1)x$.
6.1.1 Definition. A finite set $\{x_k : k \in (1, m)\} \subset V^n$, where if $j \ne k$,
then $x_j \ne x_k$, is said to be linearly independent $\Leftrightarrow$ for every finite set
$\{\alpha_k : k \in (1, m)\} \subset R$:
\[ \sum_{k=1}^{m} \alpha_k x_k = 0 \Rightarrow \alpha_k = 0, \quad \forall k \in (1, m). \]
A set of vectors in $V^n$ that is not linearly independent is called linearly
dependent.
REMARKS: In using the notation $\{x_k : k \in (1, m)\}$ we are already
specifying a function $\phi$ with domain (1, m) and range in $V^n$ by means
of the equality $\phi(k) = x_k$. The set $\{x_k : k \in (1, m)\}$ is the range of $\phi$,
and if this set contains more than one element there is more than one
function with domain (1, m) and with range $\{x_k : k \in (1, m)\}$. In
some instances, for example when we talk about matrices, it is important
to know exactly which function we are using. If this is the
case, we shall call the function an ordered m-tuple of vectors and denote
it by $(x_1, \ldots, x_m)$. Ordinarily it is only the range of the function that
is important. For example, the first sentence in Definition 6.1.1 should
perhaps more properly be stated as follows:
A finite nonvoid set $A \subset V^n$ is said to be linearly independent $\Leftrightarrow$ for every
function $\alpha$ with domain A and range in R we have
\[ \sum_{x \in A} \alpha(x) x = 0 \Rightarrow \alpha(x) = 0, \quad \forall x \in A. \]
Suppose A has m elements and $\phi$ is any one-to-one function with
domain (1, m) and range A. If we put $x_k = \phi(k)$ and $\alpha_k = \alpha(x_k)$,
then (referring to the introduction to Chapter 3) we get
\[ \sum_{x \in A} \alpha(x) x = \sum_{k=1}^{m} \alpha(\phi(k)) \phi(k) = \sum_{k=1}^{m} \alpha_k x_k. \]
Hence, we have shown that Definition 6.1.1 is independent of $\phi$.

In using the notation $\{x_k : k \in (1, m)\}$ we usually mean that if
$j \ne k$, then $x_j \ne x_k$, although this is not generally the case when we
use the notation $(x_1, \ldots, x_m)$.
6.1.2 Definition. A nonempty subset $L \subset V^n$ is said to be a linear
subspace of $V^n$ or a linear manifold $\Leftrightarrow$ $\forall x, y \in L$ and $\forall \alpha, \beta \in R$,
$\alpha x + \beta y \in L$.
6.1.3 Definition. A vector $x \in V^n$ is said to be a linear combination of
the vectors in the set $\{x_k : k \in (1, m)\} \subset V^n$ $\Leftrightarrow$ there exist numbers
$\alpha_1, \ldots, \alpha_m \in R$ so that
\[ x = \sum_{k=1}^{m} \alpha_k x_k. \]
The last definition should perhaps more properly be stated as follows:
A vector $x \in V^n$ is said to be a linear combination of the vectors in the
finite set $A \subset V^n$ $\Leftrightarrow$ there exists a function $\alpha$ with domain A and range in R
so that
\[ x = \sum_{y \in A} \alpha(y) y. \]
As in the discussion of Definition 6.1.1, the sum on the right is quite
independent of any function with domain (1, m) and range A. Thus
Definition 6.1.3 really makes sense quite independent of which function
from (1, m) onto $\{x_k : k \in (1, m)\}$ we are using to define the sum.
The terminology and notations we have adopted in the formal definitions
6.1.1 and 6.1.3 are more classical and standard than those we have
used in the rewritten versions of these definitions. Hence, if the reader
will keep in mind what is involved, there seems to be no real reason to
change from the classical terminology and notations, and we shall use
them in the future without further comment.
6.1.4 Definition. If L is a linear subspace of $V^n$, then a set
$\{x_k : k \in (1, m)\} \subset L$ is said to generate L $\Leftrightarrow$ every $x \in L$ is a linear combination
of the vectors in $\{x_k : k \in (1, m)\}$. A set of vectors that generates L is called
a basis for L $\Leftrightarrow$ the set is linearly independent.
It is clear that every finite set $\{x_k : k \in (1, n)\} \subset V^n$ generates a linear
subspace of $V^n$, namely, the set of linear combinations of the vectors of
the given set.
As an example of a basis for $V^n$ consider the vectors
\[ e_k = (e_k^1, \ldots, e_k^n), \quad k \in (1, n), \tag{6.1.1} \]
where $e_k^j = 0$ if $j \ne k$ and $e_k^k = 1$. Of course, a vector space may have
many different bases, and for example a basis of $V^3$ is given by the
three vectors
\[ x_1 = (1, 0, 0), \quad x_2 = (1, 1, 0), \quad x_3 = (1, 1, 1). \]
We shall leave the verification to the reader.
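One way to organize this verification is to solve $v = c_1 x_1 + c_2 x_2 + c_3 x_3$ by back substitution: the third component gives $c_3$, the second gives $c_2 + c_3$, and the first gives $c_1 + c_2 + c_3$. The sketch below records the resulting formulas; the function names `coordinates` and `combine` are hypothetical and introduced only for this example.

```python
x1, x2, x3 = (1, 0, 0), (1, 1, 0), (1, 1, 1)


def coordinates(v):
    """Coefficients (c1, c2, c3) with v = c1*x1 + c2*x2 + c3*x3,
    obtained by back substitution from the last component upward."""
    a, b, c = v
    return (a - b, b - c, c)


def combine(coeffs, vectors):
    """The linear combination sum_k coeffs[k] * vectors[k] in V^3."""
    return tuple(sum(c * vec[i] for c, vec in zip(coeffs, vectors))
                 for i in range(3))


v = (2, -5, 7)
coeffs = coordinates(v)                       # (7, -12, 7)
print(coeffs, combine(coeffs, (x1, x2, x3)))  # the combination reproduces v
```

Since the coefficients exist for every v, the three vectors generate $V^3$; since back substitution determines them uniquely, the only combination equal to the zero vector has all coefficients zero, which is linear independence.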
6.1.5 Theorem. If $\{y_k : k \in (1, s)\}$ is a linearly independent set in
the linear subspace of $V^n$ generated by the set $\{x_k : k \in (1, r)\} \subset V^n$, then
$s \le r$.
Proof. Since $y_1 \ne 0$, we may write
\[ y_1 = \sum_{k=1}^{r} \alpha_{1k} x_k, \]
where not all the $\alpha_{1k}$ are zero. Let $k_1$ be an integer so that $\alpha_{1k_1} \ne 0$.
Then $x_{k_1}$ is a linear combination of $\{y_1\} \cup \{x_k : k \in (1, r) \ \&\ k \ne k_1\}$.
Since $y_2$ is a linear combination of the $x_k$ we find
\[ y_2 = \beta y_1 + \sum_{k \ne k_1} \alpha_{2k} x_k. \]
Not all the numbers $\alpha_{2k}$ are zero, for otherwise $y_1$ and $y_2$ would be
linearly dependent. Let $k_2$ be an integer so that $\alpha_{2k_2} \ne 0$. Then $x_{k_2}$ is
a linear combination of $\{y_1, y_2\} \cup \{x_k : k \in (1, r) \ \&\ k \ne k_1, k_2\}$. If
r < s, then proceeding in this way we see that the set $\{x_k : k \in (1, r)\}$
is contained in the linear subspace generated by the set $\{y_k : k \in (1, r)\}$.
Hence the latter set generates the same linear subspace generated by
$\{x_k : k \in (1, r)\}$, and thus
\[ y_{r+1} = \sum_{k=1}^{r} \beta_k y_k. \]
This contradicts the linear independence.
The formal way to do this is as follows. For $j \in (1, s)$ let P(j) be the
statement: There exist distinct integers $k_1, \ldots, k_j$ in (1, r) so that each
vector in the set $\{x_{k_i} : i \in (1, j)\}$ is a linear combination of the vectors
in the set
\[ \{y_i : i \in (1, j)\} \cup \{x_k : k \in (1, r) \ \&\ k \ne k_i, i \in (1, j)\}. \]
For j > s, let P(j) be any true statement, for example, 1 = 1. Now, we
have shown P(1) is true. Further, suppose that j < s and P(j) is true.
The set $\{x_k : k \in (1, r) \ \&\ k \ne k_i, i \in (1, j)\}$ is nonvoid, since otherwise
$y_{j+1}$ is a linear combination of the elements in the set $\{y_i : i \in (1, j)\}$,
which is impossible. Now, the process we have carried out
for $y_2$ in the previous paragraph can be carried out for $y_{j+1}$, and we see
that P(j + 1) is true. If $j \ge s$, P(j) $\Rightarrow$ P(j + 1) automatically. Hence
we have the statement $(\forall j)(P(j))$, and taking j = s we see that $s \le r$.
6.1.6 Corollary. If L is a linear subspace of $V^n$, then any two bases for
L have the same number of elements.
6.1.7 Theorem. If L is a linear subspace of $V^n$, then every linearly
independent set in L can be extended to a basis for L.
Proof. Suppose $\{x_k : k \in (1, r)\} \subset L$ is the given linearly independent
set. Since $L \subset V^n$ and $V^n$ is generated by n elements, it follows
from Theorem 6.1.5 that any linearly independent set contained in L
and containing $\{x_k : k \in (1, r)\}$ has at most n elements. Hence, among
the collection of such linearly independent sets, there is one with a
maximal number of elements, say m. If this set is $\{x_k : k \in (1, m)\}$ and
$L_1$ is the linear space generated by this set, then $L_1 = L$. Otherwise,
$L - L_1 \ne \emptyset$ and there is an $x_{m+1} \in L - L_1$, so that $\{x_k : k \in (1, m + 1)\}$
is linearly independent and contains $\{x_k : k \in (1, r)\}$. But this is a
contradiction.
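The extension step can be carried out mechanically in the special case $L = V^n$. The following is a minimal sketch under that assumption, not the argument of the proof: it tests linear independence by computing a rank with exact rational Gaussian elimination and adds the vectors $e_k$ of (6.1.1) one at a time whenever they enlarge the independent set. The function names `rank` and `extend_to_basis` are hypothetical.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def extend_to_basis(independent, n):
    """Extend a linearly independent list in V^n to a basis of V^n by
    adjoining standard basis vectors that are not already generated."""
    basis = [tuple(v) for v in independent]
    for k in range(n):
        e_k = tuple(1 if j == k else 0 for j in range(n))
        if rank(basis + [e_k]) > len(basis):   # e_k is not a combination of basis
            basis.append(e_k)
    return basis


basis = extend_to_basis([(1, 1, 0)], 3)
print(basis)   # [(1, 1, 0), (1, 0, 0), (0, 0, 1)]
```

The vector (0, 1, 0) is skipped because it equals (1, 1, 0) - (1, 0, 0), so the result has exactly three elements, in agreement with Corollary 6.1.6.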
6.1.8 Corollary. If L is a linear subspace of $V^n$, $L \ne \{0\}$, then L has
a basis.
Proof. Since $L \ne \{0\}$, there is a nonzero element in L. The set
formed from this one element is linearly independent and hence can
be extended to a basis for L.
Corollaries 6.1.6 and 6.1.8 show that the following definition makes
sense.
6.1.9 Definition. If L is a linear subspace of $V^n$ and $L \ne \{0\}$, the number
of elements in any basis for L is called the dimension of L. If L = {0}, its
dimension is said to be zero.
6.1.10 Corollary. If L is a linear subspace of $V^n$ of dimension m, then any linearly independent set contained in L and having m elements is a basis for L.
Proof. If $\{y_k : k \in (1, m)\} \subset L$ is a linearly independent set, by Theorem 6.1.7 we can extend it to a basis for L, which must have m elements. Hence the given set is a basis.
6.1.11 Corollary. If L is an m-dimensional linear subspace of $V^n$, then any set contained in L with m + 1 elements is linearly dependent.
Proof. Otherwise, L would have dimension larger than m.
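As an informal illustration of Definition 6.1.9 and Corollary 6.1.11, the dimension of the subspace generated by a finite set of vectors can be computed by row elimination. The sketch below is not part of the text's development; the function name `rank` and the sample vectors are our own choices, and exact rational arithmetic is assumed so that no rounding occurs.

```python
from fractions import Fraction

def rank(vectors):
    """Dimension of the linear subspace generated by the given vectors
    (hypothetical helper; plain Gaussian elimination over the rationals)."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(ncols):
        # Look for a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate this column from every later row.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# Four vectors in a 3-dimensional space generate a subspace of dimension at most 3,
# so they are necessarily linearly dependent.
four = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
print(rank(four) < len(four))
```

A set of m vectors is linearly independent exactly when the computed rank equals m, which is how the sketch relates to the corollaries above.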

As we had mentioned before, the space $V^n$ is a prototype example of a real vector space. There are many other sets which arise quite naturally in mathematics that have enough of the properties of $V^n$ that all the theorems and corollaries we have given above will be valid for these other sets. For this reason we abstract the essential properties of $V^n$ and give the following definition.
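6.1.12 Definition. A real vector space (or real linear space) is a triple consisting of a set V, a function $+$ with domain $V \times V$ and range V, and a function $\cdot$ with domain $R \times V$ and range V which satisfy the following conditions:
(a) $x + y = y + x$, $\forall x, y \in V$.
(b) $x + (y + z) = (x + y) + z$, $\forall x, y, z \in V$.
(c) There is a vector $0 \in V$ so that $\forall x \in V$, $x + 0 = x$.
(d) $(\alpha\beta) \cdot x = \alpha \cdot (\beta \cdot x)$, $\forall \alpha, \beta \in R$ & $\forall x \in V$.
(e) $\alpha \cdot (x + y) = \alpha \cdot x + \alpha \cdot y$, $\forall \alpha \in R$ & $\forall x, y \in V$.
(f) $(\alpha + \beta) \cdot x = \alpha \cdot x + \beta \cdot x$, $\forall \alpha, \beta \in R$ & $\forall x \in V$.
(g) $1 \cdot x = x$, $0 \cdot x = 0$, $\forall x \in V$.
A real vector space is called finite-dimensional $\Leftrightarrow$ it is generated by a finite set of its elements.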
We shall usually abuse the language and designate a real vector space by 'V' instead of the triple '$(V, +, \cdot)$.' Also we shall follow the usual custom and drop the dot when we "multiply" an element of R by an element of V; that is, we shall write '$\alpha x$' instead of '$\alpha \cdot x$.' We shall also define $-x$ to be the vector $(-1)x$.
All the definitions, theorems, and corollaries that we have given for the vector space $V^n$ will carry over mutatis mutandis to a finite-dimensional vector space V. We shall leave the verification of this as an exercise.
A simple example of a finite-dimensional vector space is the space $R_n[x]$ consisting of all polynomial functions of degree at most n, that is, all functions given by
$$p(x) = \sum_{k=0}^{n} a_k x^k.$$
Two such polynomial functions are added by the rule $(p + q)(x) = p(x) + q(x)$, and multiplication by an element $\alpha$ of R is defined by $(\alpha p)(x) = \alpha p(x)$.
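The two operations on polynomial functions can be made concrete by storing a polynomial as its list of coefficients $a_0, \ldots, a_n$. The sketch below is only an illustration under that assumed representation; the names `poly_add`, `poly_scale`, and `poly_eval` are our own.

```python
def poly_add(p, q):
    """Coefficients of p + q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_scale(alpha, p):
    """Coefficients of the polynomial alpha * p."""
    return [alpha * a for a in p]

def poly_eval(p, x):
    """Value p(x), computed by Horner's rule."""
    result = 0
    for a in reversed(p):
        result = result * x + a
    return result

p = [1, 2, 3]   # 1 + 2x + 3x^2
q = [0, 1]      # x
# (p + q)(x) = p(x) + q(x) and (alpha p)(x) = alpha p(x) at a sample point:
print(poly_eval(poly_add(p, q), 2) == poly_eval(p, 2) + poly_eval(q, 2))
print(poly_eval(poly_scale(3, p), 2) == 3 * poly_eval(p, 2))
```

In this representation the coefficient lists behave exactly like vectors in $V^{n+1}$, which is consistent with the dimension asked for in the exercises.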
We shall soon see other examples of a real vector space, and indeed we shall even see an important example of one that is not finite-dimensional.
D Exercises
1. Let V be a real vector space of dimension n. Show that V is isomorphic to $V^n$; that is, there exists a one-to-one function $\phi$ with domain V and range $V^n$ so that $\phi(\alpha x + \beta y) = \alpha\phi(x) + \beta\phi(y)$.
2. If $\{x_k : k \in (1, n)\}$ is a basis for a vector space V, show that $\forall x \in V$, there exist unique numbers $\alpha_1, \ldots, \alpha_n$ in R so that $x = \sum_{k=1}^{n} \alpha_k x_k$.
3. Decide which of the following sets of vectors form a basis for $V^4$:
(a) {(1, 1, 0, 0), (0, 0, 1, 1), (1, 0, 0, 4), (0, 0, 0, 2)}.
(b) {(1, 0, 3, 1), (1, 1, 0, 2), (0, 1, 2, 1), (2, 2, 5, 4)}.
(c) {(1, 0, 1, 0), (0, 1, 0, 0), (0, 0, 1, 0), (2, 1, 3, 0)}.
(d) {(a, b, 0, 0), (1, 0, c, 0), (0, 1, 1, a), (1, 0, 1, b)}, where ab < 0.
4. Extend the set {(1, 1, 0, 2), (0, 2, 1, 0)} to be a basis for $V^4$.
5. Decide which of the following sets of vectors are linear subspaces

of vn.
{ k=l
xk=O}
(b) { x: (xk)2= }
k=I
(c) { x: i kxk 1} .
k=I
( a)
x:
(d) {x: x1xn O }.

(e) {x: x1x2 O}.
=
6. Is the vector (1, -3, 2, -2) in the linear subspace generated by the vectors in the set {(2, 3, -1, 4), (3, 0, 1, 2), (1, 6, -3, 6)}?
7. Show that the vector space of polynomial functions of degree at most n, given as an example at the end of Section 6.1, has dimension n + 1.
8. For any real vector space V show the following:
(a) The zero vector is unique.
(b) $\forall x \in V$, $0x = 0$.
(c) $\forall x \in V$, $x + (-x) = 0$, and $x + y = 0 \Rightarrow y = -x$.
9. If L and M are the linear subspaces of $V^4$ generated by the sets of vectors {(1, 0, 1, 0), (0, 2, 1, 2), (2, 3, 1, 1)} and {(0, 2, 0, -3), (2, 3, 1, 1)}, respectively, find the dimensions of the linear spaces $L + M$ and $L \cap M$. By definition,
$$L + M = \{z : (\exists x)(\exists y)(x \in L,\ y \in M,\ \&\ z = x + y)\}.$$

10. If V is a finite-dimensional real vector space, designate its dimension by d(V). If L and M are linear subspaces of $V^n$ (or any finite-dimensional real vector space), show that
$$d(L) + d(M) = d(L + M) + d(L \cap M).$$
The subspace $L + M$ is defined in Exercise 9.
11. Suppose L and M are linear subspaces of a given vector space

V. Give necessary and sufficient conditions on L and M that L U M
is a linear subspace of V.
12. Suppose L and M are variable linear subspaces of $V^n$ but with fixed dimensions l and m, respectively. What is the maximum possible dimension of $L + M$ (see Exercise 9) and minimum possible dimension of $L \cap M$?
6.2 EUCLIDEAN SPACES
To proceed further in the direction that we wish to go, it is necessary to define a distance between two points in $V^n$. The way in which we choose to do this is based on the Pythagorean theorem. Recall that we use the symbol '$|x|$' to denote the absolute value of a number and we can intuitively think of this as being the distance from x to 0. We shall use the same notation for elements of $V^n$.
6.2.1 Definition. The norm or length function on $V^n$ is a function from $V^n$ to R defined by
$$|x| = \Big\{ \sum_{k=1}^{n} (x_k)^2 \Big\}^{1/2}.$$
The pair $(V^n, |\ |)$, consisting of $V^n$ and the length function $|\ |$, is a prototype example of what is called a Euclidean space of dimension n and we shall designate it '$E^n$.' The number $|x - y|$ will often be called the distance between x and y. By an abuse of language we shall say $x \in E^n$ if we want to emphasize that it is in the domain of the given length function.
The reader will perhaps recall from the elementary theory of analytic geometry that the cosine of the angle between two vectors x and y in the plane (Fig. 6.2.1) is given by
$$\cos\theta = \frac{x_1 y_1 + x_2 y_2}{|x|\,|y|}.$$
FIGURE 6.2.1
The number $x_1 y_1 + x_2 y_2$ is called the inner product or the dot product of the vector x with the vector y.
6.2.2 Definition. The function with domain $V^n \times V^n$ and range R defined by
$$x \cdot y = \sum_{k=1}^{n} x_k y_k$$
is called the inner product or dot product of x and y.

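6.2.3 Proposition. The inner product function satisfies the following conditions:
(a) $x \cdot y = y \cdot x$, $\forall x, y \in V^n$.
(b) $\alpha(x \cdot y) = (\alpha x) \cdot y = x \cdot (\alpha y)$, $\forall \alpha \in R$ & $\forall x, y \in V^n$.
(c) $(x + y) \cdot z = x \cdot z + y \cdot z$, $\forall x, y, z \in V^n$.
(d) $x \cdot x \geq 0$, $\forall x \in V^n$, and $x \cdot x = 0 \Leftrightarrow x = 0$.
(e) $x \cdot x = |x|^2$, $\forall x \in V^n$.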
We shall leave the proof of this proposition as an exercise.
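The inner product, the length function, and the distance can be written out directly for tuples of numbers. The following is a minimal sketch with names of our own choosing (`dot`, `norm`, `distance`); it spot-checks conditions (a), (c), and (e) on sample integer vectors and is of course no substitute for the proof.

```python
import math

def dot(x, y):
    """Inner product: the sum of the products of corresponding components."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Length |x|, the square root of x . x."""
    return math.sqrt(dot(x, x))

def distance(x, y):
    """Distance |x - y| between two points."""
    return norm([a - b for a, b in zip(x, y)])

x, y, z = (1, 2, 3), (4, 5, 6), (-1, 0, 2)
s = [a + b for a, b in zip(x, y)]
print(dot(x, y) == dot(y, x))                  # condition (a)
print(dot(s, z) == dot(x, z) + dot(y, z))      # condition (c)
print(norm((3, 4)) ** 2 == dot((3, 4), (3, 4)))  # condition (e) on a sample vector
```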
6.2.4 Theorem (Cauchy-Bunjakovsky-Schwarz). For every $x, y \in E^n$,
$$|x \cdot y| \leq |x|\,|y|,$$
with equality holding $\Leftrightarrow$ $y = \alpha x$ or $x = 0$.

A proof of this theorem has already been indicated in Exercise 15 of Section 1.8. Another proof is to be found in Exercise 10 of Section 5.4 (see also Exercise 1 at the end of this section). The proof we shall give is possibly not as straightforward or elementary as any of the proofs we have mentioned, but it does have the advantage that it will carry over to real vector spaces that are not necessarily finite-dimensional, that is, to real spaces V for which there is a function defined on $V \times V$ that satisfies the conditions of Proposition 6.2.3.
Note that in $V^2$ this theorem says nothing more than $|\cos\theta| \leq 1$.
Before we prove this theorem it will be convenient to prove a lemma.
6.2.5 Lemma. If $a \neq 0$ and if $\forall \lambda \in R$, $a\lambda^2 + b\lambda + c \geq 0$, then $b^2 - 4ac \leq 0$; $b^2 - 4ac = 0 \Leftrightarrow a\lambda^2 + b\lambda + c$ has a double root (which must be at $-b/2a$).
Proof. By the technique of completing the square, we have
$$a\lambda^2 + b\lambda + c = a\Big(\lambda + \frac{b}{2a}\Big)^2 - \frac{1}{4a}(b^2 - 4ac) \geq 0, \quad \forall \lambda \in R. \qquad (6.2.1)$$
Since $a \neq 0$, it must be positive; otherwise the right side of (6.2.1) shows that
$$\forall \lambda, \quad 4a^2\Big(\lambda + \frac{b}{2a}\Big)^2 \leq (b^2 - 4ac),$$
which is impossible if $\lambda$ is sufficiently large. Therefore, we have
$$(b^2 - 4ac) \leq 4a^2\Big(\lambda + \frac{b}{2a}\Big)^2, \quad \forall \lambda,$$
and, taking $\lambda = -b/2a$, we get that $b^2 - 4ac \leq 0$.
If $b^2 - 4ac = 0$, then (6.2.1) shows that our polynomial has a double root at $-b/2a$. Conversely, if $a\lambda^2 + b\lambda + c$ has a double real root, r, then
$$a\lambda^2 + b\lambda + c = a(\lambda - r)^2 = a\lambda^2 - 2ar\lambda + ar^2, \quad \forall \lambda,$$
and hence $r = -b/2a$. From (6.2.1) it follows that $b^2 - 4ac = 0$.
Proof of Theorem 6.2.4. For every $\lambda \in R$,
$$(y + \lambda x) \cdot (y + \lambda x) = |x|^2\lambda^2 + 2(x \cdot y)\lambda + |y|^2 \geq 0. \qquad (6.2.2)$$
If we apply Lemma 6.2.5, we get
$$(x \cdot y)^2 \leq |x|^2\,|y|^2,$$
from which it immediately follows that $|x \cdot y| \leq |x|\,|y|$.
If $y = \alpha x$ or $x = 0$, we clearly have equality. Conversely, if we have equality, then either $x = 0$ or, taking $\alpha = x \cdot y / |x|^2$, it follows from Lemma 6.2.5 that $-\alpha$ is a double root of (6.2.2) and hence $|y - \alpha x| = 0$. From this it follows that $y = \alpha x$.
6.2.6 Theorem (Triangle Inequality). For every $x, y \in E^n$,
$$\big|\,|x| - |y|\,\big| \leq |x + y| \leq |x| + |y|,$$
with equality on the right holding $\Leftrightarrow$ $x = 0$ or $y = \alpha x$, $\alpha \geq 0$, and equality on the left holding $\Leftrightarrow$ $x = 0$ or $y = \beta x$, $\beta \leq 0$.
Proof. Using Theorem 6.2.4 we get
$$|x + y|^2 = |x|^2 + 2x \cdot y + |y|^2 \leq |x|^2 + 2|x|\,|y| + |y|^2 = (|x| + |y|)^2.$$
Thus $|x + y| \leq |x| + |y|$. In this inequality replace y by $-y$ to get $|x - y| \leq |x| + |y|$, and then x by $x + y$ to get $|x| \leq |x + y| + |y|$, from which it follows that $|x + y| \geq |x| - |y|$. By a similar argument we find that $|x + y| \geq |y| - |x|$ and this inequality with the previous one completes the proof of the inequalities. The statements about equality are immediate consequences of Theorem 6.2.4 and the above computations.
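Both inequalities are easy to test numerically on particular vectors. The sketch below is only a spot check with hypothetical helper names (`cbs_holds`, `triangle_holds`) and a small tolerance for floating-point rounding; the sample vectors are chosen so that the equality cases $y = \alpha x$, $\alpha \geq 0$, and $y = \beta x$, $\beta \leq 0$, can be seen.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def cbs_holds(x, y, tol=1e-12):
    """Check |x . y| <= |x| |y| up to a rounding tolerance."""
    return abs(dot(x, y)) <= norm(x) * norm(y) + tol

def triangle_holds(x, y, tol=1e-12):
    """Check | |x| - |y| | <= |x + y| <= |x| + |y| up to a rounding tolerance."""
    s = [a + b for a, b in zip(x, y)]
    return abs(norm(x) - norm(y)) - tol <= norm(s) <= norm(x) + norm(y) + tol

x = (1, 2, 2)
y_pos = (2, 4, 4)      # y = 2x: equality in C-B-S and on the right of the triangle inequality
y_neg = (-2, -4, -4)   # y = -2x: equality on the left of the triangle inequality
print(cbs_holds(x, y_pos), triangle_holds(x, y_pos), triangle_holds(x, y_neg))
```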
The reason that the previous theorem is called the triangle inequality
is that the length of any side of a triangle is less than or equal to the
sum of the lengths of the other two sides (Fig. 6.2.2).
FIGURE 6.2.2
6.2.7 Definition. Two vectors $x, y \in E^n$ are said to be orthogonal $\Leftrightarrow$ $x \cdot y = 0$. A set of vectors $\{x_k : k \in (1, m)\} \subset E^n$ is said to be orthonormal $\Leftrightarrow$ $\forall j, k \in (1, m)$, $j \neq k \Rightarrow x_j \cdot x_k = 0$, and $\forall k \in (1, m)$, $|x_k| = 1$.
A simple example of an orthonormal set for $E^n$ is the set $\{e_k : k \in (1, n)\}$ that we displayed in Section 6.1.
6.2.8 Theorem (Gram-Schmidt). Every linear subspace $L \subset E^n$, $L \neq \{0\}$, has an orthonormal basis.
Proof. Let $\{x_k : k \in (1, m)\}$ be a basis for L, which exists by virtue of Corollary 6.1.8. Let us set
$$y_1 = x_1/|x_1|,$$
and then project $x_2$ onto the one-dimensional linear subspace generated by $y_1$ to obtain a vector $w_2$. To see the form of $w_2$, we shall consider the plane determined by $y_1$ and $x_2$ and use elementary trigonometry to compute it. Considering Fig. 6.2.3, and recalling that $\cos\theta = x_2 \cdot y_1/|x_2|$, we see that the projection of $x_2$ onto the linear space generated by $y_1$ is the vector
$$w_2 = (x_2 \cdot y_1)\,y_1.$$
FIGURE 6.2.3
Now, let $z_2$ be the vector perpendicular to $w_2$ so that $z_2 + w_2 = x_2$. Hence
$$z_2 = x_2 - (x_2 \cdot y_1)\,y_1.$$
The vector $z_2 \neq 0$, for otherwise $y_1$ and $x_2$ are linearly dependent. Let us set
$$y_2 = z_2/|z_2|.$$
Formally, by a direct computation, we see that $y_1 \cdot z_2 = 0$. Therefore $\{y_1, y_2\}$ is an orthonormal set, and we think it is clear that the linear subspace generated by this set and the set $\{x_1, x_2\}$ is the same.
Let us proceed by induction. Suppose that $\{y_j : j \in (1, k)\}$, $k < m$, is an orthonormal set that generates the same linear subspace as the set $\{x_j : j \in (1, k)\}$. Let us set
$$z_{k+1} = x_{k+1} - \sum_{j=1}^{k} (x_{k+1} \cdot y_j)\,y_j.$$
The vector $z_{k+1} \neq 0$, for otherwise $x_{k+1}$ is in the linear space generated by $\{x_j : j \in (1, k)\}$. If we set
$$y_{k+1} = z_{k+1}/|z_{k+1}|,$$
it follows that $\{y_j : j \in (1, k + 1)\}$ is an orthonormal set. It is an easy exercise to show that this set generates the same linear space as the set $\{x_j : j \in (1, k + 1)\}$. The principle of induction shows that there is an orthonormal set, having the same number of elements as the dimension of L, which generates L. This proves the theorem.
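The inductive construction in the proof translates almost line for line into a short routine. The following is a sketch under the assumption that the input vectors are linearly independent; the function name `gram_schmidt` and the tolerance used to detect a vanishing $z_{k+1}$ are our own choices, and floating-point arithmetic means the output is orthonormal only up to rounding.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(basis):
    """Return an orthonormal list y_1, ..., y_m generating the same subspace
    as the linearly independent list x_1, ..., x_m."""
    ys = []
    for x in basis:
        # z_{k+1} = x_{k+1} - sum over j of (x_{k+1} . y_j) y_j
        z = list(x)
        for y in ys:
            c = dot(x, y)
            z = [zi - c * yi for zi, yi in zip(z, y)]
        length = math.sqrt(dot(z, z))
        if length < 1e-12:
            # z vanishes exactly when x lies in the span of the earlier vectors
            raise ValueError("input vectors are linearly dependent")
        ys.append([zi / length for zi in z])
    return ys

ys = gram_schmidt([(1, 1, 0), (1, 0, 1), (0, 1, 1)])
for i, u in enumerate(ys):
    for j, v in enumerate(ys):
        expected = 1.0 if i == j else 0.0
        assert abs(dot(u, v) - expected) < 1e-9
```

The coefficients are computed from the original vector $x_{k+1}$, as in the displayed formula; numerically more stable variants exist but are not what the proof describes.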
6.2.9 Definition. If L is a linear subspace of a linear space $M \subset E^n$, then the orthogonal complement $L^\perp$ of L in M is the set of all elements in M which are orthogonal to every element of L; that is,
$$L^\perp = \{x : x \in M \ \&\ (y)(y \in L \Rightarrow x \cdot y = 0)\}.$$
6.2.10 Proposition. If L is a linear subspace of the linear space $M \subset E^n$, then the following hold:
(a) $L^\perp$ is a linear subspace of M and $L \cap L^\perp = \{0\}$.
(b) $M = L \oplus L^\perp$, where
$$L \oplus L^\perp = \{z : (\exists x)(\exists y)(x \in L,\ y \in L^\perp,\ \&\ z = x + y)\}.$$
Further, every element of M is the unique sum of an element in L and an element in $L^\perp$.
(c) $L^{\perp\perp} = L$.
Proof. The proof of (a) is almost trivial and we leave it as an exercise. To prove (b) we think it is immediately clear that $L \oplus L^\perp \subset M$. On the other hand, since any orthonormal basis for L can be extended to an orthonormal basis for M (Exercise 4), it follows that $M \subset L \oplus L^\perp$. If $x, x_1 \in L$ and $y, y_1 \in L^\perp$ and $x + y = x_1 + y_1$, then $x - x_1 = y_1 - y$. The left side is in L and the right side is in $L^\perp$, so that $x = x_1$ and $y = y_1$. Statement (c) is an immediate consequence of (b).
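The decomposition in part (b) can be computed explicitly once an orthonormal basis of L is at hand. The sketch below assumes such a basis is given and takes M to be the whole space; the function name `decompose` is hypothetical. The component in L is the sum of the projections on the basis vectors, and what remains is orthogonal to L.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def decompose(z, orthonormal_basis_of_L):
    """Split z as x + y with x in L and y orthogonal to every basis vector of L."""
    x = [0.0] * len(z)
    for u in orthonormal_basis_of_L:
        c = dot(z, u)                      # coefficient of z along u
        x = [xi + c * ui for xi, ui in zip(x, u)]
    y = [zi - xi for zi, xi in zip(z, x)]  # remainder, orthogonal to L
    return x, y

basis_L = [(1, 0, 0), (0, 1, 0)]           # L is the coordinate plane of the first two axes
x, y = decompose((1, 2, 3), basis_L)
print(x, y)
print(all(dot(y, u) == 0 for u in basis_L))
```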
If V is a real finite-dimensional vector space and there is a function with domain $V \times V$ and range R which satisfies conditions (a) through (d) of Proposition 6.2.3, then V together with this bilinear function is called a Euclidean vector space. By defining the norm or length function by condition (e) of that proposition it is possible to prove the Cauchy-Bunjakovsky-Schwarz inequality, the triangle inequality, and the Gram-Schmidt theorem in exactly the same way as we proved them for $E^n$.
As an example of a finite-dimensional Euclidean space that is not $E^n$, we consider the space $R_n[x]$ of polynomial functions of degree at most n which we considered at the end of Section 6.1. We put an inner product on $R_n[x] \times R_n[x]$ by means of the formula
$$p \cdot q = \int p(x)\,q(x)\,dx. \qquad (6.2.3)$$
We shall leave as an exercise the fact that this function satisfies conditions (a) through (d) of Proposition 6.2.3.
D Exercises
1.
Prove the Lagrange identity,
and use it to deduce the C-B-S inequality.
2. Find an orthonormal basis for the subspace of $E^4$ generated by the vectors in the following sets:
(a) {(0, 1, 1, 2), (1, 0, 2, 3), (2, 1, 1, 2)}.
(b) {(1, -2, 3, -2), (0, -1, 0, 3), (2, 3, 6, -7)}.
(c) {(1, 0, -1, 2), (0, 1, 2, 3), (2, 1, 0, 7), (1, -1, -3, -1)}.
3. Show that any orthonormal set in $E^n$ is linearly independent.
4. Show that any orthonormal set in $E^n$ can be extended to an orthonormal basis for $E^n$.
5. Show that $\forall x, y \in E^n$,
Interpret this result geometrically in $E^2$.
6. If $x, y \in E^n$ and $|x| = |y|$, show that $x + y$ is orthogonal to $x - y$. Interpret this geometrically in $E^2$.
7. Suppose that x and y are fixed vectors in $E^n$. Find the vector of the form $x + \alpha y$ which has the smallest length as $\alpha$ varies over R. Show that this vector is orthogonal to y.
8. Verify that the inner product defined by (6.2.3) makes $R_n[x]$ a Euclidean vector space. Apply the Gram-Schmidt process to the basis $\{1, x, x^2, x^3, x^4\}$ for $R_4[x]$.
9. Let $\{x_k : k \in (1, n)\}$ be an orthonormal basis for a Euclidean vector space E. For any $x \in E$ we may write
$$x = \sum_{k=1}^{n} \alpha_k x_k.$$
Compute the numbers $\alpha_k$. If $y \in E$, show that
$$x \cdot y = \sum_{k=1}^{n} (x \cdot x_k)(y \cdot x_k).$$
10. Suppose that $\{x_k : k \in (1, m)\}$ is an orthonormal set in $E^n$. For a given $x \in E^n$ find the numbers $\alpha_k$ so that
$$\Big| x - \sum_{k=1}^{m} \alpha_k x_k \Big| = \text{minimum}.$$
6.3 TOPOLOGY IN $E^n$
Let us recall that for $E^1 = R$, questions about limits and continuity of functions are discussed in terms of the absolute value, which is the same as the length function. An alternative, but equivalent, way of discussing these matters is through the use of open intervals or more generally open sets. In the same way, we can discuss limit and continuity questions in $E^n$ by means of the length function or alternatively by means of open sets.
In $E^1$ open sets were defined by means of open intervals. We could do the same thing in $E^n$, replacing the open intervals by open cubes or more generally by higher-dimensional rectangles. Another way is to replace the open intervals in $E^1$ by open balls in $E^n$ defined by
$$B(a, \rho) = \{x : x \in E^n \text{ and } |x - a| < \rho\}.$$
The set $B(a, \rho)$ is called an open ball with center at a and radius $\rho$. However, it makes very little difference which sets we take as basic, since every open cube about a contains an open ball with center at a, and, conversely, every open ball with center at a contains an open cube about a. We shall use the open balls as our basic sets.
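The mutual containment of balls and cubes can be made quantitative: the open cube $\{x : |x_k - a_k| < h \text{ for all } k\}$ contains $B(a, h)$, and $B(a, \rho)$ contains the open cube with $h = \rho/\sqrt{n}$. The following sketch uses names of our own (`in_ball`, `in_cube`, with `h` the half side length) and merely tests these containments at sample points.

```python
import math

def in_ball(x, a, rho):
    """True if x lies in the open ball B(a, rho)."""
    return math.sqrt(sum((xi - ai) ** 2 for xi, ai in zip(x, a))) < rho

def in_cube(x, a, h):
    """True if x lies in the open cube about a with half side length h."""
    return all(abs(xi - ai) < h for xi, ai in zip(x, a))

a = (0.0, 0.0)
rho = 1.0
h_inner = rho / math.sqrt(len(a))   # cube of this half side fits inside B(a, rho)

p = (0.7, 0.7)                      # in the small cube, hence in the ball
q = (0.9, 0.9)                      # in the cube of half side 1, but not in the ball
print(in_cube(p, a, h_inner), in_ball(p, a, rho))
print(in_cube(q, a, rho), in_ball(q, a, rho))
```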
6.3.1 Definition. A set $U \subset E^n$ is said to be open $\Leftrightarrow$ $\forall x \in U$, $\exists \rho > 0$, so that $B(x, \rho) \subset U$.

The concepts of accumulation point and closure are the same as for the one-dimensional case, but we shall repeat the definitions.
6.3.2 Definition. A point $a \in E^n$ is said to be an accumulation point of the set $A \subset E^n$ $\Leftrightarrow$ $\forall \rho > 0$, $[B(a, \rho) \setminus \{a\}] \cap A \neq \emptyset$. The set of all accumulation points of A is called the derived set of A and is denoted by $A'$. The closure of a set A is the set $\bar{A} = A \cup A'$, and A is said to be closed $\Leftrightarrow$ $A = \bar{A}$.
In other words, an accumulation point a of a set A is a point with the property that every open ball with center at a contains a point in A which is different from a. This means we can get arbitrarily close to a by elements in A. The point a may or may not belong to A.
The concept of compactness for $E^n$ is the same as the concept of compactness for $E^1 = R$. For the sake of completeness we shall repeat the relevant definitions.
6.3.3 Definition. A set $A \subset E^n$ is said to be bounded $\Leftrightarrow$ $\exists M$ so that $\forall x \in A$, $|x| \leq M$.
6.3.4 Definition. A set in $E^n$ is said to be compact $\Leftrightarrow$ it is closed and bounded.
To discuss the Heine-Borel theorem in $E^n$ we need the concept of covering. We shall give this formally in the next definition.
6.3.5 Definition. A collection of sets $\mathcal{U}$ is said to be a covering for a set $A \subset E^n$ $\Leftrightarrow$
$$A \subset \bigcup \{U : U \in \mathcal{U}\}.$$
The collection is said to be an open covering $\Leftrightarrow$ each $U \in \mathcal{U}$ is open.
6.3.6 Theorem (Heine-Borel). A set $A \subset E^n$ is compact $\Leftrightarrow$ in every open covering for A there are a finite number of sets which form a covering for A.
Proof. We shall prove the necessity part of the theorem by induction on the dimension n of $E^n$. We shall suppose that P(n) is the statement: An open covering for every closed and bounded set in $E^n$ reduces to a finite subcovering. We have shown in Theorem 2.2.5 that P(1) is true. Let us suppose that P(k) is true and that
$$A = B \times \{t\},$$
where B is a closed and bounded set in $E^k$ and $t \in E^1$. Suppose $\mathcal{U}$ is an open covering for A and for every $U \in \mathcal{U}$ we set $U_1 = \{x : x \in E^k \ \&\ (x, t) \in U\}$. It is almost immediate that $U_1$ is an open set in $E^k$ and $\mathcal{U}_1 = \{U_1 : U \in \mathcal{U}\}$ is an open covering for B. Hence, by P(k), this reduces to a finite subcovering for B which in turn implies that $\mathcal{U}$ reduces to a finite subcovering for A.
Next suppose that
$$A = B \times I,$$
where B is as before and $I = [a, b] \subset E^1$. Let $\mathcal{U}$ be an open covering for A and let J be the collection of t in [a, b] so that $B \times [a, t]$ is covered by a finite number of sets in $\mathcal{U}$. J is nonvoid, since, by the first paragraph, $a \in J$. Let us set $t_0 = \sup J$. By the first paragraph of the proof, $B \times \{t_0\}$ is covered by a finite number of sets $\{U_j : j \in (1, n)\} \subset \mathcal{U}$. At each point of $B \times \{t_0\}$ place an open cube with center at the point and whose closure is in one of the $U_j$. This is an open covering for $B \times \{t_0\}$ and hence reduces to a finite subcovering. Let $2l$ be the minimum of the side lengths of this finite number of cubes. Since $B \times [a, t_0 - l]$ is covered by a finite number of sets in $\mathcal{U}$ and since $B \times [t_0 - l, t_0 + l]$ is covered by $\{U_j : j \in (1, n)\}$ it follows that $B \times [a, t_0 + l]$ is covered by a finite number of sets in $\mathcal{U}$. Hence we must have $t_0 = b$.
Finally, if A is closed and bounded, it is contained in a large enough cube,
$$A \subset I^k \times I,$$
where $I = [a, b] \subset E^1$ and $I^k$ is the direct product of I taken k times. If $\mathcal{U}$ is an open covering for A and $A^c$ is the complement of A, then $\mathcal{U} \cup \{A^c\}$ covers $I^k \times I$ and hence reduces to a finite subcover. Consequently, $\mathcal{U}$ reduces to a finite subcover, and we have proved that $P(k) \Rightarrow P(k + 1)$.
The sufficiency part of the theorem follows exactly the same reasoning as the sufficiency part of the proof of Theorem 2.2.5, and we shall leave it for the reader.
The Bolzano-Weierstrass theorem is also true in $E^n$, and as we shall see its proof is an immediate consequence of the Heine-Borel theorem.
6.3.7 Theorem (Bolzano-Weierstrass). Every bounded infinite set in $E^n$ has an accumulation point.
Proof. Suppose $A \subset E^n$ is a bounded infinite set. If A has no accumulation points it is closed and hence compact. Also, since it has no accumulation points, $\forall x \in A$, $\exists \rho(x) > 0$ so that $B(x, \rho(x)) \cap A = \{x\}$. The collection $\{B(x, \rho(x)) : x \in A\}$ is an open covering for A and thus reduces to a finite subcovering. But this means that A has only a finite number of points, contrary to the original hypothesis.
CONNECTED SETS
The concept of a connected set is rather important in many problems in analysis. The easiest way to introduce this concept is first to introduce the concept of relatively open and relatively closed sets.
6.3.8 Definition. A set $U \subset E^n$ is said to be relatively open in a set $A \subset E^n$ $\Leftrightarrow$ there exists an open set $V \subset E^n$ so that $U = V \cap A$. A set $C \subset E^n$ is said to be relatively closed in a set $A \subset E^n$ $\Leftrightarrow$ there exists a closed set $D \subset E^n$ so that $C = D \cap A$.
Using the concepts of relatively open sets and relatively closed sets, it is possible to describe the concept of a connected set.
6.3.9 Definition. A set $A \subset E^n$ is said to be connected $\Leftrightarrow$ A cannot be written as the union of two disjoint nonvoid sets that are relatively open in A.
It is an easy matter to see, and we leave it as an exercise, that a set A is connected $\Leftrightarrow$ it cannot be written as the union of two disjoint nonvoid sets that are relatively closed with respect to A.
It is not always a trivial matter to decide whether or not a set in $E^n$, $n > 1$, is connected. Roughly speaking, a set is not connected if it consists of two or more disjoint pieces. For example, the set that is the union of the open balls $B((2, 2), 1)$ and $B((-2, 0), 1)$ in $E^2$ is a disconnected set. On the other hand, connected sets in $E^1$ are easy to describe, and this is often very useful in deciding whether or not a set in higher dimensions is connected.
6.3.10 Theorem. A set in $E^1$ is connected $\Leftrightarrow$ it is an interval.
Proof. A set $I \subset E^1$ is an interval $\Leftrightarrow$ $\forall x, y \in I$, $\{z : x \leq z \leq y\} \subset I$. If a set A is connected and not an interval, then $\exists x, y \in A$ and $\exists z \in A^c$ so that $x < z < y$. The intervals $]-\infty, z[$ and $]z, \infty[$ are open in $E^1$ and if we set
$$B = \ ]-\infty, z[\ \cap A, \quad \text{and} \quad C = \ ]z, \infty[\ \cap A,$$
then B and C are nonvoid and relatively open in A, $A = B \cup C$, and $B \cap C = \emptyset$. This contradicts the assumption that A is connected.
Conversely, suppose I is an interval and $I = A \cup B$, where $A \cap B = \emptyset$, and A and B are nonvoid and relatively open in I. Let $x \in A$, $y \in B$ and suppose $x < y$. Since I is an interval, $[x, y] \subset I$. Since A is relatively open in I, there is an open U so that $A = U \cap I$. Now $\exists \xi$ so that $x < \xi < y$ and $[x, \xi[\ \subset U$. Thus $[x, \xi[\ \subset A$. Let $E = \{\xi : [x, \xi[\ \subset A\}$; the set E is nonvoid and bounded above and hence $z = \text{l.u.b. } E$ exists and $z \in [x, y]$. Now $z \notin A$; otherwise, since A is relatively open in I, there is a $\xi > z$ so that $[z, \xi[\ \subset A$ and thus $[x, \xi[\ \subset A$, which means that $\xi \in E$ and $z \neq \text{l.u.b. } E$. This means that $z \in B$. But since B is relatively open, $\exists \xi' < z$ so that $]\xi', z] \subset B$, which again contradicts the fact that z is the least upper bound of E.
Using Theorem 6.3.10 it is possible to prove the connectedness of a large number of sets in $E^n$. We first define these sets.
6.3.11 Definition. A set $C \subset E^n$ is said to be convex $\Leftrightarrow$ $\forall x, y \in C$, the straight-line segment
$$L(x, y) = \{z : (\exists t)(t \in [0, 1] \ \&\ z = (1 - t)x + ty)\}$$
is contained in C.
6.3.12 Corollary. Any convex set in $E^n$ is connected.
Proof. Suppose C is convex and $C = A \cup B$, where A, B are disjoint, relatively open sets in C. If A and B are nonvoid, let $x \in A$ and $y \in B$. Let us set
$$A_1 = \{t : t \in [0, 1] \ \&\ (1 - t)x + ty \in A\},$$
$$B_1 = \{t : t \in [0, 1] \ \&\ (1 - t)x + ty \in B\}.$$
Since C is convex, $[0, 1] = A_1 \cup B_1$. From the fact A and B are relatively open in C, it is a simple matter to show that $A_1$ and $B_1$ are relatively open in [0, 1]. But this means [0, 1] is not connected, contradicting the last theorem.
After we discuss continuous functions in Section 6.4 we shall be able to generalize this last corollary. We shall finish the discussion of connectedness by discussing maximal connected sets.
6.3.13 Definition. A component in $A \subset E^n$ is a maximal connected subset of A (in the sense that it is not contained in any larger connected subset of A).
6.3.14 Theorem. Any set in $E^n$ is a union of pairwise disjoint components.
Proof. If $A \subset E^n$, let us define an equivalence relation on A by setting $x \sim y$ $\Leftrightarrow$ x and y belong to a subset of A which is connected. For every $x \in A$, it is clear that $x \sim x$. Also, $\forall x, y \in A$ it is also clear that $x \sim y \Rightarrow y \sim x$. Suppose now that $x, y, z \in A$, and $x \sim y$ and $y \sim z$. Let B and C be connected subsets of A so that $x, y \in B$ and $y, z \in C$. We claim that $B \cup C$ is a connected subset of A. For suppose $B \cup C = E \cup F$, where E and F are relatively open in $B \cup C$ and $E \cap F = \emptyset$. Now, $B = (B \cap E) \cup (B \cap F)$ and the latter sets are disjoint and relatively open in B. Since B is connected, one of these sets must be void. If $B \cap F = \emptyset$ then $y \in E$. In the same way, $C = (C \cap E) \cup (C \cap F)$ so that one of the latter two sets must be void. Since $y \in C \cap E$, we must have $C \cap F = \emptyset$ and thus $(C \cup B) \cap F = \emptyset$, which implies $F = \emptyset$. Thus $B \cup C$ is connected, and since $x, z \in B \cup C$, we see that $x \sim y$ and $y \sim z \Rightarrow x \sim z$. Consequently, we have an equivalence relation.
We can write A as the union of pairwise disjoint equivalence classes. We claim that each equivalence class is a component of A. If E is an equivalence class of A, it is connected. For suppose $E = B \cup C$, $B \cap C = \emptyset$, and B and C are nonvoid and relatively open in E. If $x \in B$ and $y \in C$, since $x \sim y$ there is a connected subset D of A so that $x, y \in D$. Clearly $D \subset E$. But then the sets $D \cap B$ and $D \cap C$ are nonvoid, disjoint, relatively open sets whose union is D. This is a contradiction.
If E is not a maximal connected subset of A, there is a point of A outside of E that is equivalent with every point of E. This contradicts the fact that E is an equivalence class. The proof is finished.
6.3.15 Corollary. If $A \subset E^n$ is open, then the components of A are open and there are a countable number of them.
Proof. Let E be a component of A and $x \in E$. Let $\rho > 0$ so that $B(x, \rho) \subset A$. Now, $B(x, \rho)$ is convex and thus, by Corollary 6.3.12, is connected. Hence every element in $B(x, \rho)$ is equivalent to x and thus is in E. Hence E is open.
Suppose $Q^n$ is the Cartesian product of the rationals Q with itself n times, and $Q_A = Q^n \cap A$; that is, $Q_A$ is the set of points in A with rational components. Since A is open, $Q_A$ is denumerable. (Proof?) Let $\phi$ be a one-to-one function with domain N and range $Q_A$. If E is a component of A, let
$$N_E = \{n : n \in N \ \&\ \phi(n) \in E\}.$$
Since E is open, $N_E$ is nonvoid and thus by the well-ordering principle it has a minimal element $n_E$. Let $\psi$ be that function whose domain consists of the equivalence classes E and defined by
$$\psi(E) = n_E.$$
The one-to-one function $\psi$ has range a subset of N, and thus its range is countable. This means the collection of components of A is countable.
6.3.16 Corollary. If $A \subset E^1$ is open, then A is the countable union of open intervals.
Proof. From the last corollary, A is the countable union of open connected sets. Since by Theorem 6.3.10, every connected set in $E^1$ is an interval, the proof is complete.
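For an open set in $E^1$ that is presented as a union of finitely many open intervals, the components can be found by merging overlapping intervals. The sketch below covers only this finite case, which is an assumption of ours and far narrower than the corollary; the function name `components` is hypothetical. Two open intervals are merged only when they actually overlap, since $]0, 1[\ \cup\ ]1, 2[$ omits the point 1 and so has two components.

```python
def components(intervals):
    """Components of a finite union of open intervals ]a, b[, as a sorted list
    of pairwise disjoint open intervals."""
    result = []
    # Ignore empty intervals and sweep from left to right.
    for a, b in sorted(i for i in intervals if i[0] < i[1]):
        if result and a < result[-1][1]:
            # ]a, b[ overlaps the current component: extend it.
            left, right = result[-1]
            result[-1] = (left, max(right, b))
        else:
            # No overlap (touching endpoints do not count): start a new component.
            result.append((a, b))
    return result

print(components([(0, 2), (1, 3), (5, 6)]))
print(components([(0, 1), (1, 2)]))
```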
INTERIORS AND BOUNDARIES
The concepts of the interior and boundary of a set in $E^n$ will be of some importance in Chapter 8 when we discuss Jordan content and the theory of integration in higher dimensions. In somewhat loose language the interior of a set A is the largest open set that is contained in A. The boundary of A is the set of points that do not belong to the interior of A or the interior of $A^c$. The formal definition is as follows.
6.3.17 Definition. If $A \subset E^n$, then the interior of A, denoted by $A^0$, is the union of all open sets contained in A. The boundary of A, denoted $\beta A$, is the set $\bar{A} \setminus A^0$.
It is clear that $x \in A^0$ $\Leftrightarrow$ $\exists \delta > 0$, so that $B(x, \delta) \subset A$. Also we think it is clear that $x \in \beta A$ $\Leftrightarrow$ $\forall \delta > 0$, $B(x, \delta) \cap A \neq \emptyset$ and $B(x, \delta) \cap A^c \neq \emptyset$. The boundary of a set is clearly a closed set and, if $A \subset B$, then $A^0 \subset B^0$.
It is quite possible that a set may be rather "thin" and yet its boundary may be rather "thick." For example, the rationals in [0, 1] are in some sense rather "thin" but the boundary of this set consists of the whole interval [0, 1]. Note that this set of rationals has no interior. A single point in $E^n$ is an example of a closed set with no interior. The reader may easily construct an example of a denumerable closed set, which consequently has no interior. The Cantor set is an example of a closed set with no interior which nevertheless has an uncountable number of points.
D Exercises
In the following exercises all sets are to be taken in En, unless otherwise
specified.
1. Show that the intersection of any finite number of relatively open sets is relatively open and the union of any number of relatively open sets is relatively open.
2. Show that the union of any finite number of relatively closed sets is relatively closed and the intersection of any number of relatively closed sets is relatively closed.
3. Suppose U is a relatively open set in A. Show that $A \setminus U$ is relatively closed. If C is a relatively closed set in A, show that $A \setminus C$ is relatively open.
4. Let C be relatively compact in A. If $a \in A$, is it true that if we set
$$d = \inf \{|x - a| : x \in C\},$$
then $\exists c \in C$, so that $d = |c - a|$?
5. Prove that a set $A \subset E^n$ is connected $\Leftrightarrow$ the subsets of A which are both open and closed relative to A are A itself and the null set.
6. If A and B are connected and $A \cap B \neq \emptyset$, show that $A \cup B$ is connected.
7. Let A be a connected set in $E^n$, and $\bar{A}$ its closure. If $A \subset B \subset \bar{A}$, show that B is connected.
8. If $A \subset E^n$ and $B \subset E^m$, show that $A \times B$ is open in $E^{n+m}$ $\Leftrightarrow$ A and B are open.
9. If $A \subset E^n$ and $B \subset E^m$, show that $A \times B$ is closed in $E^{n+m}$ $\Leftrightarrow$ A and B are closed.
10. If $A \subset E^n$ and $B \subset E^m$, show that $A \times B$ is compact $\Leftrightarrow$ A and B are compact.
11. Show that a sequence with range in $E^n$ is convergent if and only if it is a Cauchy sequence.
12. If $A \subset E^n$ and $B \subset E^m$, show that $A \times B$ is connected $\Leftrightarrow$ A and B are connected.
13. Show the following:
(a) $\beta A = [A^0 \cup (A^c)^0]^c$.
(b) $\beta A = (A^0)^c \setminus (A^c)^0$.
14. Show the following:
(a) A is closed $\Leftrightarrow$ $\beta A \subset A$.
(b) A is open $\Leftrightarrow$ $\beta A \subset A^c$.
6.4 CONTINUOUS FUNCTIONS
In Chapter 2 we discussed the limit and continuity concept for functions having their domains and ranges in R. We shall now do the same for functions having their domains in $E^n$, $n \geq 1$, and ranges in $E^m$, $m \geq 1$. The definitions are essentially identical and the theorems and their proofs are, by and large, the same. Hence we shall not present as detailed a study here as we did in Chapter 2, but shall limit ourselves to those matters for which a slightly different formulation than given in Chapter 2 may lead to deeper insights.
6.4.1 Definition. If f is a function with $\mathfrak{D}(f) \subset E^n$ and $\mathfrak{R}(f) \subset E^m$ and a is an accumulation point of $\mathfrak{D}(f)$, then f is said to have the limit l at a $\Leftrightarrow$ $\forall \epsilon > 0$, $\exists \delta > 0$ so that $x \in [B(a, \delta) \setminus \{a\}] \cap \mathfrak{D}(f) \Rightarrow f(x) \in B(l, \epsilon)$.
In case f has a limit at a we write, as usual,
$$\lim_{x \to a} f(x) = l \quad \text{or} \quad f(x) \to l \ \text{as} \ x \to a.$$
6.4.2 Definition. If f is a function with $\mathfrak{D}(f) \subset E^n$ and $\mathfrak{R}(f) \subset E^m$, then f is said to be continuous at a $\Leftrightarrow$ $a \in \mathfrak{D}(f)$ and $\forall \epsilon > 0$, $\exists \delta > 0$ so that $x \in B(a, \delta) \cap \mathfrak{D}(f) \Rightarrow f(x) \in B(f(a), \epsilon)$. The function f is said to be continuous $\Leftrightarrow$ f is continuous at every point of its domain.
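To see the definition at work on one example, consider the map $f(x_1, x_2) = (x_1 + x_2, x_1 - x_2)$ from $E^2$ to $E^2$, an example of our own choosing. A direct computation gives $|f(x) - f(a)| = \sqrt{2}\,|x - a|$, so $\delta = \epsilon/\sqrt{2}$ works at every point a. The sketch below samples points of $B(a, \delta)$ and checks that their images fall in $B(f(a), \epsilon)$; it is a numerical spot check, not a proof, and all names in it are hypothetical.

```python
import math
import random

def f(x):
    """Sample map from E^2 to E^2 (an assumed example, not taken from the text)."""
    return (x[0] + x[1], x[0] - x[1])

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def delta_for(eps):
    """A delta that works for this particular f, since |f(x) - f(a)| = sqrt(2) |x - a|."""
    return eps / math.sqrt(2)

def check(a, eps, trials=1000, seed=0):
    """Sample points of B(a, delta) and test that f maps them into B(f(a), eps)."""
    rng = random.Random(seed)
    d = delta_for(eps)
    for _ in range(trials):
        angle = rng.uniform(0, 2 * math.pi)
        r = d * rng.random()                      # radius in [0, d)
        x = (a[0] + r * math.cos(angle), a[1] + r * math.sin(angle))
        if dist(x, a) < d and not dist(f(x), f(a)) < eps:
            return False
    return True

print(check((1.0, 2.0), 0.1))
```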
The concept of continuity at a point is a local concept, that is, depends only on the values that the given function takes on in some neighborhood of the given point. However, a continuous function has a "global" characterization that is important and useful.
6.4.3 Theorem. A function f with $\mathfrak{D}(f) \subset E^n$ and $\mathfrak{R}(f) \subset E^m$ is continuous $\Leftrightarrow$ for every open $U \subset E^m$, $f^{-1}(U)$ is relatively open in $\mathfrak{D}(f)$.
Proof. Suppose f is continuous and $U \subset E^m$ is open. If $f^{-1}(U)$ is void, it is open relative to $\mathfrak{D}(f)$ and we are done. If $f^{-1}(U) \neq \emptyset$, let $x \in f^{-1}(U)$; since U is open, $\exists \rho > 0$ so that $B(f(x), \rho) \subset U$. From the continuity of f, $\exists \delta(x, \rho)$, so that $y \in B(x, \delta) \cap \mathfrak{D}(f) \Rightarrow f(y) \in B(f(x), \rho)$, which in turn implies that $B(x, \delta) \cap \mathfrak{D}(f) \subset f^{-1}(U)$. If we set
$$V = \bigcup \{B(x, \delta) : x \in f^{-1}(U)\},$$
then V is open and $f^{-1}(U) = V \cap \mathfrak{D}(f)$.
Conversely, suppose $f^{-1}(U)$ is relatively open in $\mathfrak{D}(f)$ for every open $U \subset E^m$. Let $a \in \mathfrak{D}(f)$ and take $U = B(f(a), \epsilon)$; then there is an open $V \subset E^n$ so that $f^{-1}(U) = V \cap \mathfrak{D}(f)$. Since V is open, $\exists \delta > 0$, so that $B(a, \delta) \subset V$. Hence f takes $B(a, \delta) \cap \mathfrak{D}(f)$ into $B(f(a), \epsilon)$, which is the definition of continuity at a. This completes the proof.
6.4.4 Definition. A function $f$ is said to be an open function or an open
map $\iff$ for every relatively open $A \subset \mathcal{D}(f)$, $f(A)$ is relatively open in $\mathcal{R}(f)$.
As we shall see in Chapter 7, open maps play a relatively important
role in many considerations. Right now, Theorem 6.4.3 leads immediately
to the following corollary.
6.4.5 Corollary. If $f$ is a one-to-one open map, then $f^{-1}$ is continuous.
Now, an open one-to-one map need not itself be continuous, as the
reader may easily verify (for example, refer to Theorem 2.3.6). A
function $f$ that is one-to-one and for which $f$ and $f^{-1}$ are both continuous is called a
homeomorphism or topological map. So a continuous one-to-one open map
is topological. Exercise 4 at the end of this section will give another
condition for a continuous one-to-one function to be topological.
6.4.6 Theorem. If $f$ is a continuous function with $\mathcal{D}(f) \subset E^n$ and
$\mathcal{R}(f) \subset E^m$, and if $\mathcal{D}(f)$ is compact, then $\mathcal{R}(f)$ is compact.
Proof. Let $\mathcal{U}$ be an open covering of $\mathcal{R}(f)$. For every $U \in \mathcal{U}$,
there exists an open $V \subset E^n$ so that $f^{-1}(U) = V \cap \mathcal{D}(f)$. The collection
of such $V$ is an open covering for $\mathcal{D}(f)$, and hence there are a
finite number that cover $\mathcal{D}(f)$. Hence there are a finite number of
elements of $\mathcal{U}$ that cover $\mathcal{R}(f)$.
6.4.7 Corollary. If $f$ is continuous and $\mathcal{D}(f)$ is compact, then $f$ is
bounded; that is, $\mathcal{R}(f)$ is bounded.
Proof. $\mathcal{R}(f)$ is compact and therefore is bounded.
6.4.8 Corollary. If $f$ is real-valued and continuous and $\mathcal{D}(f) \subset E^n$ is
compact, then $f$ assumes its maximum and its minimum on $\mathcal{D}(f)$.
Proof. Since $\mathcal{R}(f) \subset E^1$ is compact, it is closed and bounded. Hence
$\mathcal{R}(f)$ must contain its supremum and infimum, which are the
maximum and minimum of $f$, respectively.
6.4.9 Definition. A function $f$ with $\mathcal{D}(f) \subset E^n$ and $\mathcal{R}(f) \subset E^m$
is said to be uniformly continuous $\iff$ $\forall \varepsilon > 0$, $\exists \delta > 0$ so that
$$x, y \in \mathcal{D}(f) \ \text{and} \ x - y \in B(0,\delta) \implies f(x) - f(y) \in B(0,\varepsilon).$$
6.4.10 Theorem. If $f$ is continuous and $\mathcal{D}(f)$ is compact, then $f$ is
uniformly continuous.
Proof. See the proof of Theorem 2.2.9.
In the one-dimensional case we showed (Theorem 2.2.14) that every

continuous function with a closed interval domain takes on all values
between its maximum and its minimum. This fact generalizes in a very
interesting way for functions with domains and ranges in higher-dimensional
space.
6.4.11 Theorem. If $f$ is a continuous function with $\mathcal{D}(f) \subset E^n$ and
$\mathcal{R}(f) \subset E^m$, and if $\mathcal{D}(f)$ is connected, then $\mathcal{R}(f)$ is connected.
Proof. Suppose $U$ and $V$ are open in $E^m$, $\mathcal{R}(f) \subset U \cup V$, and
$U \cap \mathcal{R}(f)$ and $V \cap \mathcal{R}(f)$ are disjoint. Then $f^{-1}(U) = f^{-1}(U \cap \mathcal{R}(f))$
and $f^{-1}(V) = f^{-1}(V \cap \mathcal{R}(f))$ are disjoint and open relative to $\mathcal{D}(f)$,
and $\mathcal{D}(f) = f^{-1}(U) \cup f^{-1}(V)$. Since $\mathcal{D}(f)$ is connected, one of the
sets $f^{-1}(U)$ or $f^{-1}(V)$ must be the null set and hence the corresponding
set $U \cap \mathcal{R}(f)$ or $V \cap \mathcal{R}(f)$ must be the null set. Thus $\mathcal{R}(f)$ must be
connected.
Theorem 6.4.11 is an important aid in deciding whether or not a set
in higher dimensions is connected. For example, the proof of Corollary
6.3.12 can be made in the following way. Suppose C is a convex set in
$E^n$ and $C = A \cup B$, where $A$ and $B$ are nonvoid disjoint relatively open
sets in $C$. Let $x \in A$ and $y \in B$, and set
$$f(t) = (1 - t)x + ty, \quad \text{for } t \in [0,1].$$
The function $f$ is continuous and since its domain is connected its range
is connected. But $\mathcal{R}(f) = [\mathcal{R}(f) \cap A] \cup [\mathcal{R}(f) \cap B]$, where clearly,
$\mathcal{R}(f) \cap A$ and $\mathcal{R}(f) \cap B$ are disjoint, nonvoid, relatively open sets in
$\mathcal{R}(f)$. This is a contradiction.
We think it is clear that the procedure we have just given can be
generalized in a very simple way.
6.4.12 Definition. A set $A \subset E^n$ is said to be arcwise connected $\iff$
$\forall x, y \in A$, there exists a continuous function $f$, with domain $[0,1]$ and range
in $A$, so that $f(0) = x$ and $f(1) = y$.
6.4.13 Proposition. Every arcwise connected set is connected.
The details of the proof are essentially the same as the proof we have
just given above for convex sets, and we shall leave it as an exercise.
As an example of how Proposition 6.4.13 can be used, let us consider
the ring in $E^2$, which is the set
$$A = \{x : 0 < r_1 \le |x| \le r_2\}.$$
Let $x, y \in A$ and let $\theta$ and $\varphi$ be numbers so that
$$x^1 = |x| \cos\theta, \quad x^2 = |x| \sin\theta, \qquad y^1 = |y| \cos\varphi, \quad y^2 = |y| \sin\varphi.$$
Let $f = (f^1, f^2)$ be the continuous function with domain $[0,1]$ and
range in $A$ defined as follows:
$$f^1(t) = |x| \cos[(1 - 2t)\theta + 2t\varphi], \quad f^2(t) = |x| \sin[(1 - 2t)\theta + 2t\varphi], \qquad 0 \le t \le 1/2,$$
$$f(t) = 2(1 - t) f(1/2) + (2t - 1) y, \qquad 1/2 \le t \le 1.$$
FIGURE 6.4.1
This is a continuous function with $f(0) = x$, $f(1) = y$. Hence $A$ is arcwise
connected. The diagram of $A$ and the range of $f$ is given in Fig. 6.4.1.
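The arc in the ring example can be checked numerically. The following is only a sketch under stated assumptions: the function name `ring_arc`, the choice of angles through `atan2`, and the sample points and radii are illustrative and do not come from the text.

```python
import math

def ring_arc(x, y):
    """Return the arc f: [0, 1] -> E^2 of the ring example, joining x to y."""
    rx = math.hypot(x[0], x[1])          # |x|
    theta = math.atan2(x[1], x[0])       # one admissible angle for x (assumed choice)
    phi = math.atan2(y[1], y[0])         # one admissible angle for y (assumed choice)
    # f(1/2): the point of length |x| on the ray through y
    mid = (rx * math.cos(phi), rx * math.sin(phi))

    def f(t):
        if t <= 0.5:
            # first half: move along the circle of radius |x|
            angle = (1 - 2 * t) * theta + 2 * t * phi
            return (rx * math.cos(angle), rx * math.sin(angle))
        # second half: move along the radial segment from f(1/2) to y
        return (2 * (1 - t) * mid[0] + (2 * t - 1) * y[0],
                2 * (1 - t) * mid[1] + (2 * t - 1) * y[1])

    return f

# Illustrative ring with r_1 = 1, r_2 = 2 and two sample points in it.
r1, r2 = 1.0, 2.0
f = ring_arc((1.0, 0.0), (0.0, 2.0))
radii = [math.hypot(*f(k / 200)) for k in range(201)]
stays_in_ring = all(r1 - 1e-12 <= r <= r2 + 1e-12 for r in radii)
```

The first half of the arc keeps the length $|x|$ fixed and the second half is radial, which is why every sampled point has length between $r_1$ and $r_2$.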
There is a partial converse to Proposition 6.4.13, the following.
6.4.14 Proposition. Every open connected set is arcwise connected.
Proof. Exercise 9 at the end of this section.

In general, it is not true that a connected set is arcwise connected.
As an example, let us consider the set $A = A_1 \cup A_2 \subset E^2$, where
$$A_1 = \{(0,y) : |y| \le 1\}, \qquad A_2 = \{(x,y) : y = \sin(1/x), \ 0 < x \le 1\}.$$
In other words, $A_1$ is a line segment along the $y$ axis and $A_2$ is the graph
of the function given by $\sin(1/x)$ for $0 < x \le 1$. The set $A$ is not arcwise
connected. For suppose $g = (g^1, g^2)$ is a continuous function defined
on $[0,1]$ and whose range is in $A$, $g(0) = (0,y)$ and $g(1) = (x, \sin(1/x))$.
Let
$$B = \{t : g(t) \in A_1\}, \qquad t_0 = \sup B.$$
Since $A_1$ is closed and $g$ is continuous, $B$ is closed. Hence $t_0 \in B$, and
since $A_1 \cap A_2 = \emptyset$ and $g(1) \in A_2$, it follows that $t_0 \ne 1$.
Since $g^1$ is continuous and $]t_0, 1]$ is connected, it follows that the
range of $g^1$ restricted to $]t_0, 1]$ is an interval whose closure is, say,
$[a,b]$. It is clear that $a = 0$, since otherwise by continuity $g^1(t_0) > 0$.
Now as $t \searrow t_0$, we have that $x = g^1(t)$ ranges over an interval to the
right of zero and moreover $g(t) = (g^1(t), \sin(1/g^1(t))) \to g(t_0)$. But this
is impossible, since $\sin(1/x)$ has no limit as $x \to 0$.
On the other hand, the set $A$ is connected. For suppose $A = U \cup V$,
where $U$ and $V$ are disjoint, nonvoid, relatively closed (and open) sets
in $A$. If $A_1 \cap U \ne \emptyset$, then $A_1 \subset U$. Otherwise the sets $U_1 = U \cap A_1$ and
$V_1 = V \cap A_1$ would disconnect $A_1$. By the same token, since $A_2$ is the
range of a continuous function with domain a connected set, it is connected
and hence must be entirely in $U$ or $V$. If $A_1 \cup A_2 \subset U$, we
contradict the fact that $V$ is nonvoid. The same kind of statement is
true if $U$ and $V$ are interchanged. If $A_1 = U$ and $A_2 = V$, or vice versa,
since $A_2$ is not relatively closed in $A$, we again get a contradiction.
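The fact that $\sin(1/x)$ has no limit as $x \to 0$, which the argument above relies on, can be illustrated with two sequences tending to zero. This is a minimal numerical sketch; the two sequences and the names `zeros` and `peaks` are choices made here for illustration only.

```python
import math

def sin_recip(x):
    """The function x -> sin(1/x) for x > 0."""
    return math.sin(1.0 / x)

# Two sequences in ]0, 1] that both tend to 0.
zeros = [1.0 / (n * math.pi) for n in range(1, 51)]                    # 1/x = n*pi
peaks = [1.0 / (math.pi / 2 + 2 * n * math.pi) for n in range(1, 51)]  # 1/x = pi/2 + 2*n*pi

values_on_zeros = [sin_recip(x) for x in zeros]   # stay near 0
values_on_peaks = [sin_recip(x) for x in peaks]   # stay near 1
```

Along the first sequence the values stay near 0 and along the second they stay near 1, so no single limit at 0 is possible.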
PEANO CURVES
As another example of the strange behavior that some continuous
functions may exhibit, we shall give a continuous function whose domain
is $[0,1]$ and whose range is the entire interval $[0,1] \times [0,1]$. Any
continuous function with this property is called a Peano curve.
The function we shall give is a slight modification of the function
given in Exercise 7 of Section 3.3. Let $C$ be Cantor's set (Section 3.3).
Recall that $C$ is the set of all $x \in [0,1]$ that can be written in the form
$$x = \sum_{k=1}^{\infty} \frac{x_k}{3^k},$$
where $\forall k \in N$, $x_k = 0$ or $x_k = 2$. Set
$$f^1(x) = \sum_{k=1}^{\infty} \frac{x_{2k-1}/2}{2^k}, \qquad f^2(x) = \sum_{k=1}^{\infty} \frac{x_{2k}/2}{2^k}, \qquad f(x) = (f^1(x), f^2(x)).$$
We think it is clear that $\mathcal{R}(f^1) = \mathcal{R}(f^2) = [0,1]$ (see Theorem 3.3.2)
and that $\mathcal{R}(f) = [0,1] \times [0,1]$. To show that $f$ is continuous, it is
enough to show that $f^1$ and $f^2$ are continuous.
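For points of Cantor's set with finitely many nonzero ternary digits, the two coordinate functions can be evaluated exactly. The sketch below rests on an assumption about the garbled formulas: that $f^1$ uses the odd-numbered digits $x_1, x_3, \ldots$ and $f^2$ the even-numbered digits $x_2, x_4, \ldots$, each halved and read as a binary digit. The name `peano_on_cantor` is hypothetical.

```python
from fractions import Fraction

def peano_on_cantor(digits):
    """Evaluate x and (f1(x), f2(x)) for ternary digits x_1, x_2, ... in {0, 2}.

    Digits beyond the end of the list are taken to be 0.
    Assumption: f1 reads the odd-numbered digits, f2 the even-numbered ones.
    """
    if any(d not in (0, 2) for d in digits):
        raise ValueError("ternary digits of a Cantor-set point must be 0 or 2")
    x = sum(Fraction(d, 3 ** k) for k, d in enumerate(digits, start=1))
    odd = digits[0::2]    # x_1, x_3, x_5, ...
    even = digits[1::2]   # x_2, x_4, x_6, ...
    f1 = sum(Fraction(d // 2, 2 ** k) for k, d in enumerate(odd, start=1))
    f2 = sum(Fraction(d // 2, 2 ** k) for k, d in enumerate(even, start=1))
    return x, (f1, f2)

# Digits (2, 0, 0, 2): x = 2/3 + 2/81, binary digits .10 for f1 and .01 for f2.
x, (u, v) = peano_on_cantor([2, 0, 0, 2])
```

Reading the construction backwards, interleaving the binary digits of a prescribed pair $(u, v)$ produces a point of $C$ that is mapped to it, which is the reason the range is the whole square.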
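Let $\varepsilon_n = 1/2^n$ and $\delta_n = 1/3^{2(n+1)}$. Suppose that $x$ and $a$ are in Cantor's
set and
$$|x - a| < \delta_n.$$
Clearly, we must have $x_k = a_k$ for $k \in (1, 2(n+1))$ in order that this
inequality be satisfied. Hence
$$f^1(x) - f^1(a) = \sum_{k=1}^{\infty} \frac{(x_{2k-1} - a_{2k-1})/2}{2^k} = \sum_{k=n+2}^{\infty} \frac{(x_{2k-1} - a_{2k-1})/2}{2^k} \le \sum_{k=n+2}^{\infty} \frac{1}{2^k} = \frac{1}{2^{n+1}} < \frac{1}{2^n}.$$
Similarly, we get
$$-\frac{1}{2^n} < -\frac{1}{2^{n+1}} \le \sum_{k=1}^{\infty} \frac{(x_{2k-1} - a_{2k-1})/2}{2^k}.$$
These inequalities mean that if $x, a \in C$ and $|x - a| < \delta_n$, then
$$|f^1(x) - f^1(a)| < \frac{1}{2^n}.$$
In exactly the same way, we show that
$$x, a \in C \ \text{and} \ |x - a| < \delta_n \implies |f^2(x) - f^2(a)| < \frac{1}{2^n}.$$
Since $\forall \varepsilon > 0$, $\exists n$ so that $1/2^n < \varepsilon$, we have established the continuity
of $f$.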
Clearly $f$ cannot be a one-to-one function. For if it were, $f^{-1}$ would
be continuous (Exercise 4) and thus, since $[0,1] \times [0,1]$ is connected,
it would follow that $C$ is connected, which is clearly false.
Now, it is a very easy matter to extend $f$ to a continuous function $g$
with domain $[0,1]$ and range the same as the range of $f$. Indeed, since
$C^c$ is open, it is the countable union of pairwise disjoint open intervals
(Corollary 6.3.16). If $]a,b[$ is one of these intervals, then $a, b \in C$.
Consequently, on $[a,b]$ we define $g$ so that its range is the line segment
joining $f(a)$ and $f(b)$:
$$g(x) = \frac{1}{b - a}\,[(x - a) f(b) + (b - x) f(a)].$$
Of course, $\forall x \in C$ we take $g(x) = f(x)$. Since $[0,1] \times [0,1]$ is a convex
set, it follows that $\mathcal{R}(g)$ is contained in this square, and indeed is all
of this square.
Clearly, $g$ is continuous at every point of $C^c$. If $a \in C$, and is a left
end point of an interval in $C^c$, it is clear that
$$\lim_{x \searrow a} g(x) = g(a).$$
If $a$ is not a left end point of an interval in $C^c$ and $a \ne 1$, then it is a left
accumulation point of $C$. Now, $\forall \varepsilon > 0$, $\exists \delta$ so
that $x \in C$ and $0 < x - a < \delta \implies |f(x) - f(a)| < \varepsilon$.
Suppose $b, c \in C$, $0 < b - a < \delta$, $0 < c - a < \delta$, and $]b,c[ \subset C^c$.
Then $\forall x \in ]b,c[$,
$$g(x) - g(a) = \frac{1}{c - b}\,[(x - b)(f(c) - f(a)) + (c - x)(f(b) - f(a))],$$
and consequently
$$|g(x) - g(a)| < \frac{1}{c - b}\,[(x - b) + (c - x)]\,\varepsilon = \varepsilon.$$
Hence, we have in this case also
$$\lim_{x \searrow a} g(x) = g(a).$$
A similar argument shows that we also always have
$$\lim_{x \nearrow a} g(x) = g(a).$$
This shows $g$ is continuous.
Exercises
1. Suppose $f$ is a continuous function with domain the cube
$C = \{x : |x^k| < 1, \ k \in (1,n)\}$ and range in $E^m$. Let $a \in C$ and define
a function $f_k$ on $]-1,1[$ by means of the equality
$$f_k(t) = f(a + (t - a^k) e_k).$$
Show that $f_k$ is continuous.
2. Let $f$ be a real-valued function defined on $E^2$ as follows:
$$f(x,y) = \begin{cases} \dfrac{xy}{x^2 + y^2}, & (x,y) \ne (0,0), \\ 0, & (x,y) = (0,0). \end{cases}$$
Let $a$ be fixed and define $f_1$ and $f_2$ as in Exercise 1, and show that these
functions are continuous. Show that $f$ is not continuous at $(0,0)$.
(Hint: Look at the set of points which satisfy the equation $x = y$.)
This problem gives an example of a function of two variables that is
continuous in each variable separately but is not continuous in both
variables simultaneously.
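The behavior described in Exercise 2 can be observed numerically before it is proved. The sketch below only evaluates the function along the two axes and along the line $x = y$ named in the hint; the sample points are an arbitrary choice made here.

```python
def f(x, y):
    """The function of Exercise 2: xy / (x^2 + y^2) away from the origin, 0 at the origin."""
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x * x + y * y)

# Points approaching the origin along three different lines.
ts = [10.0 ** (-k) for k in range(1, 8)]
along_x_axis = [f(t, 0.0) for t in ts]      # every value is 0
along_y_axis = [f(0.0, t) for t in ts]      # every value is 0
along_diagonal = [f(t, t) for t in ts]      # every value is 1/2
```

The values on the diagonal do not approach $f(0,0) = 0$, which is the numerical counterpart of the discontinuity at the origin.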
3. Discuss the functions prescribed in the following with respect
to continuity in each variable separately and continuity in both variables
(see Exercise 2):
(a)
(b)
(c)
(d)
{
{
r
{
f(x,y)=
f(x,y)=
f(x,y) =
f(x,y)
x2 + y
4'
0,
x'+y'
lxl+IYI'
0,
-y
lxl + IYI'
0,
lxl
IYI'
0,
(x,y)#(O,O),
(x,y) = (O,O).
(x,y)#(O,O),
(x,y)=(O,O).
(x,y) #(O,O),
(x,y)=(O,O).
(x,y) # (0,0),
(x,y)= (0,0).
4. If $f$ is a one-to-one continuous function and $\mathcal{D}(f)$ is compact,
show that $f^{-1}$ is continuous.
5. (a) If $g$ is continuous at $a$ and $f$ is continuous at $g(a)$, show
that $f \circ g$ (see Definition 1.3.5) is continuous at $a$.
(b) If $f$ is continuous, show that $|f|$ is continuous. Is the
converse true?
6. If $f$ is uniformly continuous, show that $f$ may be extended to
a continuous function having domain the closure of $\mathcal{D}(f)$.
7. If $f$ is uniformly continuous and $\mathcal{D}(f)$ is bounded, show that
$\mathcal{R}(f)$ is bounded. Give examples which show that this may not remain
true if the hypothesis of uniform continuity or the hypothesis of
bounded domain is removed.
8. Suppose $f$ is a function with domain all of $E^n$ and $\exists p \in E^n$,
$p^k \ne 0$, $\forall k \in (1,n)$, so that $\forall x \in E^n$, $f(x + p) = f(x)$. If $f$ is continuous,
show that it is uniformly continuous.
9. Show that every open connected set in $E^n$ is arcwise connected.
10. Let $C$ be the cube in $E^2$ given by $C = \{x : |x^k| \le 1, \ k = 1, 2\}$ and
$A \subset C$, the set of points with rational components. Is $C \setminus A$ connected
or disconnected?
11. Show the existence of a Peano curve with range the unit cube
in $E^n$.
12. Show that no Peano curve (range $[0,1] \times [0,1]$) can be one-to-one.
(Hint: Consider how two intersecting perpendicular line segments
in $[0,1] \times [0,1]$ would map under $f^{-1}$.)
13. Designate the elements of $E^{n+m}$ by $(x,y)$, where $x \in E^n$
and $y \in E^m$. Define a function $P$ with domain $E^{n+m}$ and range $E^n$ by
$$P(x,y) = (x,0).$$
Show that $P$ is a continuous open map. Does $P$ necessarily map closed
sets onto closed sets?
14. Use the results of Exercise 13 (as far as possible) to obtain the
results of Exercises 8, 9, and 10 of Section 6.3.
15. If $A \subset E^n$ and $B \subset E^m$ and $A \times B$ is connected, use the results
of Exercise 13 to show that $A$ and $B$ are connected.
6.5
LINEAR TRANSFORMATIONS
Many important questions in the theory of functions defined on domains
in higher-dimensional space reduce to questions about a special
class of uniformly continuous functions, the linear transformations.
6.5.1 Definition. A function $T$ with domain a linear subspace $L \subset V^n$
and range in $V^m$ is said to be a linear transformation $\iff$ $\forall \alpha, \beta \in R$ and
$\forall x, y \in L$,
$$T(\alpha x + \beta y) = \alpha T(x) + \beta T(y).$$
It is an immediate consequence of this definition that $T(0) = 0$.
Indeed, if $\alpha \in R$, $T(0) = T(\alpha 0) = \alpha T(0)$; choosing $\alpha = 0$ we have the
result. It is also an immediate consequence of the definition that the
range of a linear transformation is a linear subspace of $V^m$.
6.5.2 Theorem. A linear transformation $T$ is one-to-one (nonsingular)
$\iff$ the range of $T$ has the same dimension as the domain of $T$.
Proof. Suppose $\{u_k : k \in (1,r)\}$ is a basis for $\mathcal{D}(T)$. Since $T$ is
linear, every element in $\mathcal{R}(T)$ is a linear combination of the vectors
in the set $\{T(u_k) : k \in (1,r)\}$. Hence, if under the assumption that $T$
is one to one we can show that this latter set is linearly independent, we
will have proved the necessity.
If $T$ is one to one it follows that $T(u_k) \ne 0$ $\forall k \in (1,r)$, since $0$ is
the only vector taken into $0$. Suppose that
$$\sum_{k=1}^{r} \alpha_k T(u_k) = T\Big(\sum_{k=1}^{r} \alpha_k u_k\Big) = 0.$$
Since $T$ is one to one, we get
$$\sum_{k=1}^{r} \alpha_k u_k = 0,$$
and since $\{u_k : k \in (1,r)\}$ is a linearly independent set we have $\alpha_k = 0$,
$\forall k \in (1,r)$.
Conversely, suppose the range of $T$ has the same dimension as its
domain. We must show that $T(x) = T(y) \implies x = y$. Using the linearity
of $T$, this is equivalent with showing that $T(x) = 0 \implies x = 0$. Let us write
$$x = \sum_{k=1}^{r} \alpha_k u_k;$$
then
$$T(x) = 0 \implies \sum_{k=1}^{r} \alpha_k T(u_k) = 0.$$
Now, by hypothesis, the dimension of $\mathcal{R}(T)$ is $r$, and, as we have already
noted, $\mathcal{R}(T)$ is generated by the set $\{T(u_k) : k \in (1,r)\}$. Consequently,
this latter set is linearly independent and $\forall k \in (1,r)$, $\alpha_k = 0$. Hence
$x = 0$ and the proof of the theorem is complete.
6.5.3 Definition. The dimension of the range of a linear transformation
is called the rank of the linear transformation.
Suppose $L$ and $M$ are linear subspaces in $V^n$ and $V^m$, respectively,
and $S$ and $T$ are linear transformations each having domain $L$ and
range in $M$. If for $\alpha \in R$ we define
$$(\alpha T)(x) = \alpha T(x), \qquad (S + T)(x) = S(x) + T(x),$$
then these two operations will make the set of all linear transformations
with domain $L$ and range in $M$ into a vector space. Indeed, this vector
space is finite-dimensional, and if $p$ is the dimension of $L$ and $q$ is the
dimension of $M$, then its dimension is $pq$. We have asked the reader to
verify these facts in Exercise 4 at the end of this section.

There is a very useful representation for linear transformations by
means of matrices. It is at this point that it becomes important to consider
a basis of a vector space as a function rather than as a set (see the
discussion in Section 6.1). We shall use the terminology ordered basis
for such a function.
Suppose $A$ is a linear transformation with domain $L$ and range in
the linear space $M$. Let $(u_1, \ldots, u_p)$ and $(v_1, \ldots, v_q)$ be ordered bases
for $L$ and $M$, respectively. Then we may write
$$A(u_k) = \sum_{j=1}^{q} a_{jk} v_j, \qquad k \in (1,p).$$
The array of numbers
$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ a_{21} & a_{22} & \cdots & a_{2p} \\ \vdots & \vdots & & \vdots \\ a_{q1} & a_{q2} & \cdots & a_{qp} \end{bmatrix}$$
is called a $q \times p$ matrix. More specifically it is called a matrix representation
for $A$ with respect to the ordered pair of ordered bases
$((u_1, \ldots, u_p), (v_1, \ldots, v_q))$. From a very formal point of view, a
$q \times p$ matrix is a function with domain $(1,q) \times (1,p)$ and range in $R$.
Suppose $B$ is a linear transformation with domain $M$ and range in
$N$. Let $(w_1, \ldots, w_r)$ be an ordered basis for $N$ and let us compute the
$r \times p$ matrix representation of $B \circ A$ with respect to the ordered pair
$((u_1, \ldots, u_p), (w_1, \ldots, w_r))$. Let the matrix entries of $B$ with respect
to the ordered pair of ordered bases $((v_1, \ldots, v_q), (w_1, \ldots, w_r))$ be
denoted by $b_{jk}$. Then we have
$$B \circ A(u_k) = \sum_{j=1}^{q} a_{jk} B(v_j) = \sum_{l=1}^{r} \Big(\sum_{j=1}^{q} b_{lj} a_{jk}\Big) w_l, \qquad k \in (1,p).$$
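It follows that if $c_{lk}$ is in the $l$th row and $k$th column of the matrix
representation of $B \circ A$ with respect to the given bases, then
$$c_{lk} = \sum_{j=1}^{q} b_{lj} a_{jk}.$$
This serves to define a multiplication of matrices,
$$\begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1q} \\ b_{21} & b_{22} & \cdots & b_{2q} \\ \vdots & \vdots & & \vdots \\ b_{r1} & b_{r2} & \cdots & b_{rq} \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ a_{21} & a_{22} & \cdots & a_{2p} \\ \vdots & \vdots & & \vdots \\ a_{q1} & a_{q2} & \cdots & a_{qp} \end{bmatrix} = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1p} \\ c_{21} & c_{22} & \cdots & c_{2p} \\ \vdots & \vdots & & \vdots \\ c_{r1} & c_{r2} & \cdots & c_{rp} \end{bmatrix}.$$
Thus we see that we get $c_{lk}$ by "multiplying" the $l$th row of the matrix
of $B$ with the $k$th column of the matrix of $A$; that is, $b_{lj}$ is multiplied by
$a_{jk}$ and then the result is summed over $j$ to get $c_{lk}$.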
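The row-by-column rule can be written out directly and compared with composition of the two transformations. This is a small sketch in plain Python; the helper names `mat_mul` and `apply` and the sample matrices are illustrative choices, not part of the text.

```python
def mat_mul(b, a):
    """Product of an r x q matrix b and a q x p matrix a: c[l][k] = sum_j b[l][j] * a[j][k]."""
    r, q, p = len(b), len(a), len(a[0])
    return [[sum(b[l][j] * a[j][k] for j in range(q)) for k in range(p)]
            for l in range(r)]

def apply(m, x):
    """Apply the matrix m to the coordinate vector x."""
    return [sum(row[k] * x[k] for k in range(len(x))) for row in m]

a = [[1, 2], [0, 1], [3, 0]]   # 3 x 2 matrix of A (q = 3, p = 2)
b = [[1, 0, 2], [0, 1, 1]]     # 2 x 3 matrix of B (r = 2, q = 3)
c = mat_mul(b, a)              # 2 x 2 matrix of the composition

x = [1, 1]
composed = apply(b, apply(a, x))   # B applied to A(x)
direct = apply(c, x)               # the product matrix applied to x
```

Both routes give the same coordinates, which is exactly the statement that the product matrix represents the composition.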
Suppose now that $A$ is a linear transformation with domain a linear
subspace $L \subset V^n$ and range in $L$. Let $\mathcal{U} = (u_1, \ldots, u_p)$ be an ordered
basis for $L$ and let us designate the matrix representation of $A$ with
respect to the pair $(\mathcal{U}, \mathcal{U})$ by $[a_{jk}]$. We want to investigate the question
of how the matrix representation changes when we pick a new
ordered basis $\mathcal{U}' = (u'_1, \ldots, u'_p)$ for $L$ and get a matrix representation
$[a'_{jk}]$ with respect to the pair $(\mathcal{U}', \mathcal{U}')$. From this point on, for the sake of
simplicity, let us agree that if $[a_{jk}]$ is a matrix representation of $A$ with respect
to a pair of ordered bases $(\mathcal{U}, \mathcal{U})$ we shall say that $A$ has the matrix representation
$[a_{jk}]$ with respect to the basis $\mathcal{U}$.
Let $Q$ be the linear transformation with domain and range $L$ that is
defined by
$$Q\Big(\sum_{k=1}^{p} \alpha_k u_k\Big) = \sum_{k=1}^{p} \alpha_k u'_k.$$
This means $Q(u_k) = u'_k$, and we may write
$$u'_k = Q(u_k) = \sum_{j=1}^{p} q_{jk} u_j.$$
Hence the matrix representation of $Q$ with respect to the basis $\mathcal{U}$ is
$[q_{jk}]$. Clearly, $Q$ is nonsingular, since the dimension of $\mathcal{R}(Q)$ is the same
as the dimension of $\mathcal{D}(Q)$. Let $Q^{-1}$ be the inverse of $Q$ and suppose its
matrix representation with respect to the basis $\mathcal{U}$ is $[r_{jk}]$.
Let us now compute the matrix representation of $A$ with respect
to the basis $\mathcal{U}'$. We have
$$A(u'_k) = A \circ Q(u_k) = \sum_{j=1}^{p} s_{jk} u_j, \qquad \text{where} \quad s_{jk} = \sum_{l=1}^{p} a_{jl} q_{lk}.$$
Now,
$$Q^{-1}(u_j) = \sum_{l=1}^{p} r_{lj} u_l,$$
and if we compose both sides with $Q$, and note that $Q(u_l) = u'_l$, we get
$$u_j = \sum_{l=1}^{p} r_{lj} u'_l.$$
Using this in the expansion for $A(u'_k)$ we get
$$A(u'_k) = \sum_{l=1}^{p} \Big[\sum_{j=1}^{p} r_{lj} s_{jk}\Big] u'_l.$$
Hence, if we write $[q_{jk}]^{-1}$ for the matrix $[r_{jk}]$, we have
$$[a'_{jk}] = [q_{jk}]^{-1} [a_{jk}] [q_{jk}]. \qquad (6.5.1)$$
In Section 6.6 we shall show how to compute the numbers $r_{jk}$ in terms
of the entries of $[q_{jk}]$.
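Formula (6.5.1) can be tried on a small case with exact arithmetic. The sketch below is restricted to $2 \times 2$ matrices so that the inverse can be written by hand; the helper names `mat_mul` and `inverse_2x2` and the sample matrices are assumptions made for illustration.

```python
from fractions import Fraction

def mat_mul(b, a):
    """Row-by-column product: c[l][k] = sum_j b[l][j] * a[j][k]."""
    return [[sum(b[l][j] * a[j][k] for j in range(len(a))) for k in range(len(a[0]))]
            for l in range(len(b))]

def inverse_2x2(q):
    """Inverse of a nonsingular 2 x 2 matrix, computed exactly with fractions."""
    det = q[0][0] * q[1][1] - q[0][1] * q[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(q[1][1], det), Fraction(-q[0][1], det)],
            [Fraction(-q[1][0], det), Fraction(q[0][0], det)]]

# Matrix of A with respect to the basis (u_1, u_2).
a = [[2, 1], [0, 3]]
# New basis u'_1 = u_1, u'_2 = u_1 + u_2: column k of q holds the coordinates of u'_k.
q = [[1, 1], [0, 1]]

a_new = mat_mul(inverse_2x2(q), mat_mul(a, q))   # matrix of A in the new basis
```

In this example the new matrix is diagonal, reflecting that $A(u'_1) = 2u'_1$ and $A(u'_2) = 3u'_2$ for the chosen basis.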
Let us now turn our attention to properties of a linear transformation
that are connected with the inner product, or, what is the same thing,
with the length function on $V^n \times V^n$.
6.5.4 Theorem. Let $T$ be a linear transformation whose domain is a
linear subspace $L \subset E^n$ and range is in $E^m$. There exists an $M > 0$ so that
$\forall x \in L$,
$$|T(x)| \le M |x|. \qquad (6.5.2)$$
In particular this means that $T$ is uniformly continuous.
Proof. Let $\{u_k : k \in (1,p)\}$ be any orthonormal basis for $L$. If
$x \in L$, we may write
$$x = \sum_{k=1}^{p} \xi^k u_k, \qquad |x|^2 = \sum_{k=1}^{p} |\xi^k|^2.$$
If we apply the transform $T$ we get
$$T(x) = \sum_{k=1}^{p} \xi^k T(u_k),$$
and taking the norm of both sides and using the triangle inequality we
get
$$|T(x)| \le \sum_{k=1}^{p} |\xi^k| \, |T(u_k)|.$$
Now use the Cauchy-Bunjakovsky-Schwarz inequality on the right to
give
$$|T(x)| \le \Big\{\sum_{k=1}^{p} |T(u_k)|^2\Big\}^{1/2} \Big\{\sum_{k=1}^{p} |\xi^k|^2\Big\}^{1/2}.$$
This is the inequality (6.5.2). Now, replace $x$ by $x - y$ and
use the linearity of $T$ to give
$$|T(x) - T(y)| \le M |x - y|.$$
This proves the uniform continuity.
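The constant produced by the proof is the square root of the sum of the squared lengths of the images of the basis vectors. A short numerical sketch follows, assuming the standard orthonormal basis of $E^3$, so that the images $T(e_k)$ are the columns of the matrix; the matrix, the random samples, and the helper names are illustrative only.

```python
import math
import random

def apply(m, x):
    """Apply the matrix m to the coordinate vector x."""
    return [sum(row[k] * x[k] for k in range(len(x))) for row in m]

def norm(v):
    """Euclidean length of v."""
    return math.sqrt(sum(c * c for c in v))

# A linear transformation from E^3 to E^2, given by its matrix in the standard bases.
t_matrix = [[1.0, -2.0, 0.5],
            [3.0, 0.0, 1.0]]

# The bound from the proof: square root of the sum of |T(e_k)|^2 over the basis,
# which is the square root of the sum of all squared matrix entries.
bound = math.sqrt(sum(c * c for row in t_matrix for c in row))

rng = random.Random(0)
samples = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(1000)]
bound_holds = all(norm(apply(t_matrix, x)) <= bound * norm(x) + 1e-12 for x in samples)
```

The bound is usually not the smallest admissible constant; the smallest one is the norm of the transformation.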
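6.5.5 Corollary. If $T$ is a nonsingular linear transformation, then
$\exists m > 0$, so that $\forall x \in \mathcal{D}(T)$,
$$m |x| \le |T(x)|.$$
Proof. Since $T^{-1}$ is a linear transformation, it follows by the previous
theorem that $\exists M > 0$ so that $\forall y \in \mathcal{D}(T^{-1})$, $|T^{-1}(y)| \le M |y|$. But
$\forall x \in \mathcal{D}(T)$, $\exists y \in \mathcal{D}(T^{-1})$, so that $T(x) = y$. Hence $|x| \le M |T(x)|$,
and, taking $m = 1/M$, we are finished.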
For any linear transformation $T$ let us set
$$\|T\| = \inf \{M : \forall x \in \mathcal{D}(T), \ |T(x)| \le M |x|\}. \qquad (6.5.3)$$
Clearly, $\forall x \in \mathcal{D}(T)$ we have
$$|T(x)| \le \|T\| \, |x|.$$
The real number defined by (6.5.3) is called the norm of $T$ and it defines
a distance function on the vector space of all linear transformations with
domain a fixed space $L$ and range in a fixed space $M$. For this distance
function to be useful, the triangle inequality should be satisfied and
this is seen as follows. Let $S$ be another linear transformation with
domain $L$ and range in $M$. Then $\forall x \in L$ we have
$$|(T + S)(x)| \le |T(x)| + |S(x)| \le \{\|T\| + \|S\|\} |x|.$$
This shows immediately that
$$\|T + S\| \le \|T\| + \|S\|.$$

It is also a very simple matter to check that $\forall \alpha \in R$,
$$\|\alpha T\| = |\alpha| \, \|T\|,$$
and that
$$\|T\| = 0 \iff T = 0.$$
We shall leave the proofs of these simple facts as an exercise.

There are other expressions for $\|T\|$ that are often very useful. For
example,
$$\|T\| = \sup \{|T(x)| : |x| = 1\}. \qquad (6.5.3')$$
The proof of this is very simple. Indeed, if $|x| = 1$, $|T(x)| \le \|T\|$, and
hence the right side of (6.5.3') is dominated by $\|T\|$. On the other hand,
let $M_0$ be the right side of (6.5.3'). Then, $\forall x \ne 0$,
$$|T(x/|x|)| \le M_0.$$
From this it follows that $|T(x)| \le M_0 |x|$ and hence $\|T\| \le M_0$. This
shows the equality.
Another useful expression is
$$\|T\| = \sup \{|T(x) \cdot y| : |x| = |y| = 1 \ \& \ y \in \mathcal{R}(T)\}. \qquad (6.5.3'')$$
To prove this, let us first note that if $L$ is a linear subspace of $E^n$, then
$\forall z \in L$,
$$|z| = \sup \{|z \cdot y| : |y| = 1 \ \& \ y \in L\}. \qquad (6.5.4)$$
Indeed, if $M_1$ is the right side of (6.5.4) the Cauchy-Bunjakovsky-Schwarz
inequality shows that $M_1 \le |z|$. On the other hand, if $z \ne 0$,
take $y = z/|z|$. Then $|z| = z \cdot z/|z| \le M_1$, which shows equality. If $z = 0$,
the equality (6.5.4) is obvious.
To prove (6.5.3''), let $M_0$ be the right side of that equation. From
the Cauchy-Bunjakovsky-Schwarz inequality we get
$$|T(x) \cdot y| \le |T(x)| \le \|T\|,$$
for $|x| = |y| = 1$. Hence $M_0 \le \|T\|$. On the other hand, from
(6.5.4), applied to the linear subspace $\mathcal{R}(T)$, we get $\forall x$, so that $|x| = 1$,
$$|T(x)| = \sup \{|T(x) \cdot y| : |y| = 1 \ \& \ y \in \mathcal{R}(T)\} \le M_0.$$
Hence, from (6.5.3'),
$$\|T\| = \sup \{|T(x)| : |x| = 1\} \le M_0.$$
6.5.6 Definition. A linear transformation with range in $R$ is called a
linear functional.
Linear functionals have very interesting and useful representations,
as the next theorem will show.
6.5.7 Theorem. If $A$ is a linear functional with domain $L \subset E^n$, then
there exists a unique $y \in L$ so that $\forall x \in L$,
$$A(x) = x \cdot y.$$
Proof. Let $\{u_k : k \in (1,p)\}$ be an orthonormal basis for $L$. If
$x = \sum_{k=1}^{p} \xi^k u_k \in L$, we may write
$$A(x) = \sum_{k=1}^{p} \xi^k A(u_k).$$
If we set
$$y = \sum_{k=1}^{p} A(u_k) u_k,$$
it is clear that $A(x) = x \cdot y$.
Suppose $\exists z \in L$, so that $\forall x \in L$, $A(x) = x \cdot z$. Then $\forall x \in L$, $x \cdot (y - z) = 0$.
In particular, take $x = y - z$ and we get $|y - z| = 0$, which implies
$y = z$.
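The representing vector of Theorem 6.5.7 can be computed from any orthonormal basis by the formula in the proof. The following is a sketch only; the functional, the particular orthonormal basis of $E^3$, and the names `functional` and `basis` are choices made here for illustration.

```python
import math

def dot(x, y):
    """Inner product of two coordinate vectors."""
    return sum(a * b for a, b in zip(x, y))

def functional(x):
    """An illustrative linear functional on E^3."""
    return 3 * x[0] - x[1] + 2 * x[2]

# An orthonormal basis of E^3 that is not the standard one.
s = 1 / math.sqrt(2)
basis = [(s, s, 0.0), (s, -s, 0.0), (0.0, 0.0, 1.0)]

# The vector from the proof: the sum over k of A(u_k) u_k.
y = [sum(functional(u) * u[i] for u in basis) for i in range(3)]

x = (0.7, -1.2, 4.0)
agrees = abs(functional(x) - dot(x, y)) < 1e-12
```

Up to rounding, the computed vector is $(3, -1, 2)$ whatever orthonormal basis is used, in line with the uniqueness part of the theorem.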
6.5.8 Theorem. Let $A$ be a linear transformation with domain $L \subset E^n$
and range in $M \subset E^m$. Then there exists a unique linear transformation $A^t$
with domain $M$ and range in $L$ so that $\forall x \in L$ and $\forall y \in M$
$$A(x) \cdot y = x \cdot A^t(y). \qquad (6.5.5)$$
Proof. Fix $y \in M$ and set $A_y(x) = A(x) \cdot y$. Since $A_y$ is a linear functional
with domain $L$, it follows from the last theorem that $\exists$ a unique
$y^t \in L$ so that $\forall x \in L$
$$A(x) \cdot y = x \cdot y^t. \qquad (6.5.5')$$
Since $A_{\alpha y + \beta z}(x) = \alpha A_y(x) + \beta A_z(x)$ it follows that
$$(\alpha y + \beta z)^t = \alpha y^t + \beta z^t.$$
Hence, if we set $A^t(y) = y^t$, it follows that $A^t$ is a linear transformation
with domain $M$ and range in $L$. Since the $y^t$ for which (6.5.5') holds is
unique, it follows that there is only one linear transformation $A^t$ for
which (6.5.5) holds.
6.5.9 Theorem. If $A$ is a linear transformation with domain $L$ and range
in $M$, if $N(A) = \{x : A(x) = 0\}$ is the null space of $A$, and if $N(A)^\perp$ is the
orthogonal complement of $N(A)$ in $L$, then
$$N(A)^\perp = \mathcal{R}(A^t).$$
Proof. For every $x \in N(A)$ and $\forall y \in M$, we have
$$A(x) \cdot y = x \cdot A^t(y) = 0.$$
Thus $\mathcal{R}(A^t) \subset N(A)^\perp$. On the other hand, suppose $\exists z \in N(A)^\perp \setminus \mathcal{R}(A^t)$.
Because $\mathcal{R}(A^t) \subset N(A)^\perp$, without loss of generality we may
suppose that $z \in \mathcal{R}(A^t)^\perp$. Thus we have $\forall y \in M$,
$$A(z) \cdot y = z \cdot A^t(y) = 0,$$
from which it follows, upon setting $y = A(z)$, that $A(z) = 0$, and hence
$z \in N(A) \cap N(A)^\perp$. It follows that $z = 0$, which is a contradiction.
Hence $N(A)^\perp = \mathcal{R}(A^t)$.
6.5.10 Corollary. For every linear transformation $A$,
$$\operatorname{rank} A = \operatorname{rank} A^t.$$
Proof. The range of $A$ is clearly the same as the range of $A|N(A)^\perp$.
Since $A|N(A)^\perp$ is a nonsingular linear transformation, we have
$$\operatorname{rank} A = \dim N(A)^\perp = \dim \mathcal{R}(A^t) = \operatorname{rank} A^t.$$
The linear transformation $A^t$ is called the transpose of $A$ with respect
to the space $M$. If $A$ has the matrix representation $[a_{jk}]$ with respect
to the ordered orthonormal bases $((u_1, \ldots, u_p), (v_1, \ldots, v_q))$, it is
interesting and useful to compute the matrix representation of $A^t$ with
respect to the bases $((v_1, \ldots, v_q), (u_1, \ldots, u_p))$. We have
$$A(u_k) = \sum_{i=1}^{q} a_{ik} v_i,$$
and thus
$$A(u_k) \cdot v_j = a_{jk}.$$
On the other hand,
$$A^t(v_j) = \sum_{k=1}^{p} a^t_{kj} u_k,$$
and hence
$$A^t(v_j) \cdot u_k = a^t_{kj}.$$
Thus
$$a^t_{kj} = u_k \cdot A^t(v_j) = A(u_k) \cdot v_j = a_{jk}.$$
The matrix $[a^t_{kj}]$ is called the transpose of the matrix $[a_{jk}]$ and is a
$p \times q$ matrix. It is usually denoted by $[a_{jk}]^t$.
The rank of a linear transformation can be computed from its matrix
representation with respect to any ordered pair of ordered bases. The
precise facts are given in the following proposition.
6.5.11 Proposition. If $[a_{jk}]$ is the $q \times p$ matrix representation of a
linear transformation $A$ with respect to any ordered pair of ordered bases,
then the rank of $A$ is the dimension of the linear subspace in $V^q$ generated by
the column vectors $\{a_k = (a_{1k}, \ldots, a_{qk}) : k \in (1,p)\}$, which is the same
as the dimension of the linear subspace of $V^p$ generated by the row
vectors $\{b_k = (a_{k1}, \ldots, a_{kp}) : k \in (1,q)\}$.
Proof. Suppose $(u_1, \ldots, u_p)$ is an ordered basis for $\mathcal{D}(A)$ and
$(v_1, \ldots, v_q)$ is an ordered basis for a linear space that contains $\mathcal{R}(A)$.
If $r$ is the rank of $A$, then there is a set $\{A(u_{k_i}) : i \in (1,r)\}$ of $r$ linearly
independent vectors that generate $\mathcal{R}(A)$. Suppose that $\{\alpha_i : i \in (1,r)\} \subset R$ and
$$\sum_{i=1}^{r} \alpha_i a_{k_i} = 0.$$
Since
$$A(u_{k_i}) = \sum_{j=1}^{q} a_{j k_i} v_j, \qquad \sum_{i=1}^{r} \alpha_i A(u_{k_i}) = \sum_{j=1}^{q} \Big(\sum_{i=1}^{r} \alpha_i a_{j k_i}\Big) v_j = 0,$$
it follows that $\forall i \in (1,r)$, $\alpha_i = 0$. Thus the vectors $\{a_{k_i} : i \in (1,r)\}$
are linearly independent in $V^q$. To show that these vectors generate
the same space as the vectors $\{a_k : k \in (1,p)\}$, we first note that $\forall u_k$
there exist numbers $\{\beta_{ik} : i \in (1,r)\}$ so that
$$A(u_k) = \sum_{i=1}^{r} \beta_{ik} A(u_{k_i}).$$
Hence
$$a_{jk} = \sum_{i=1}^{r} \beta_{ik} a_{j k_i}, \qquad j \in (1,q).$$
But this says that
$$a_k = \sum_{i=1}^{r} \beta_{ik} a_{k_i},$$
which proves the assertion about the vectors $\{a_k : k \in (1,p)\}$.
To prove the assertion about the vectors $\{b_k : k \in (1,q)\}$, let $(e_1, \ldots, e_p)$
be an ordered orthonormal basis for $E^p$ and $(f_1, \ldots, f_q)$
be an ordered orthonormal basis for $E^q$. Let $B$ be the linear transformation
with domain $E^p$, range in $E^q$, and whose matrix representation with
respect to $((e_1, \ldots, e_p), (f_1, \ldots, f_q))$ is $[a_{jk}]$. By what we have
proved in the previous paragraph rank $B$ = rank $A$. Also, we know from
Corollary 6.5.10 that rank $B^t$ = rank $B$. The matrix representation of
$B^t$ with respect to $((f_1, \ldots, f_q), (e_1, \ldots, e_p))$ is a $p \times q$ matrix with
column vectors the set $\{b_k : k \in (1,q)\}$. Thus from the first paragraph
of the proof, the dimension of the linear space generated by these
vectors in $V^p$ is rank $B^t$ = rank $A$. This completes the proof.
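The equality of the column rank and the row rank asserted by Proposition 6.5.11 can be observed on an example. The sketch below computes the rank by Gaussian elimination over exact fractions; elimination is a standard technique swapped in here for illustration and is not the argument used in the proof. The names `rank` and `transpose` and the sample matrix are assumptions.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix, found by Gaussian elimination with exact fractions."""
    rows = [[Fraction(c) for c in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank_count = 0
    for col in range(n_cols):
        # look for a row at or below position rank_count with a nonzero entry in this column
        pivot = next((i for i in range(rank_count, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rank_count], rows[pivot] = rows[pivot], rows[rank_count]
        for i in range(rank_count + 1, len(rows)):
            factor = rows[i][col] / rows[rank_count][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rank_count])]
        rank_count += 1
    return rank_count

def transpose(matrix):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*matrix)]

a = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [1, 0, 1, 0]]
column_rank = rank(transpose(a))   # dimension of the span of the columns of a
row_rank = rank(a)                 # dimension of the span of the rows of a
```

Elimination on the rows of a matrix gives the dimension of its row space, so applying it to the transpose gives the dimension of the column space of the original matrix.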

SPECIAL LINEAR TRANSFORMATIONS
Projections. Suppose $M$ is a linear subspace of $E^n$ and $L$ is a linear
subspace of $M$. If $L^\perp$ is the orthogonal complement of $L$ in $M$, then
$\forall x \in M$ there is a unique $y \in L$ and a unique $y^\perp \in L^\perp$ so that
$$x = y + y^\perp.$$
The last statement is just Proposition 6.2.10(b). Let us define a linear
transformation $P$ with domain $M$ and range $L$ by the equation
$$P(x) = y.$$
The linear transformation $P$ is called the projection of $M$ onto $L$. It has
the following properties:
(a) $x \in L \iff P(x) = x$.
(b) $P^2 = P \circ P = P$.
(c) $P = P^t$.
We shall leave these simple facts as an exercise for the reader.
If $\{u_k : k \in (1,r)\}$ is an orthonormal basis for the linear space $L$,
it is a simple matter to compute $P(x)$ in terms of this basis. Indeed,
let us write
$$P(x) = \sum_{k=1}^{r} \alpha_k u_k.$$
If we take the dot product of both sides with respect to $u_k$, we get
$$\alpha_k = P(x) \cdot u_k = x \cdot u_k.$$
The last equality follows from the facts that $P^t = P$ and $P(u_k) = u_k$.
Hence
$$P(x) = \sum_{k=1}^{r} (x \cdot u_k) u_k.$$
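The formula for a projection in terms of an orthonormal basis, together with properties (b) and (c), can be checked numerically. This is a sketch under assumptions: the subspace of $E^3$, its orthonormal basis, the test vectors, and the name `project` are illustrative choices.

```python
import math

def dot(x, y):
    """Inner product of two coordinate vectors."""
    return sum(a * b for a, b in zip(x, y))

# Orthonormal basis of a two-dimensional subspace L of E^3.
s = 1 / math.sqrt(2)
basis = [(1.0, 0.0, 0.0), (0.0, s, s)]

def project(x):
    """P(x) = sum over k of (x . u_k) u_k."""
    return tuple(sum(dot(x, u) * u[i] for u in basis) for i in range(3))

x = (1.0, 2.0, -4.0)
y = (0.5, -1.0, 3.0)
px = project(x)

# Property (b): applying P twice changes nothing.
idempotent = all(abs(a - b) < 1e-12 for a, b in zip(project(px), px))
# Property (c): P(x) . y = x . P(y).
symmetric = abs(dot(px, y) - dot(x, project(y))) < 1e-12
```

For these vectors the projection of $x$ is, up to rounding, $(1, -1, -1)$, and both properties hold to within floating-point error.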
Let us use the idea of projection to obtain the spherical representation
of a vector in $E^n$. Let $P_1$ be the projection of $E^n$ onto the linear subspace
$\{x : x \in E^n \ \& \ x^1 = 0\}$. $P_1$ may also be described as the projection of
$E^n$ onto the space generated by the vectors $\{e_k : k \in (2,n)\}$. Here we
are taking $e_k = (e_k^1, \ldots, e_k^n)$ where $e_k^j = 0 \iff j \ne k$, $e_k^k = 1$. Let $P_2$
be the projection of $E^n$ onto the linear subspace $\{x : x \in E^n \ \& \ x^1 = x^2 = 0\}$.
The projection $P_2$ may also be described as the projection of
$E^n$ onto the space generated by $\{e_k : k \in (3,n)\}$. Note that $\mathcal{R}(P_2) = \mathcal{R}(P_2 | \mathcal{R}(P_1))$,
so that $P_2$ restricted to $\mathcal{R}(P_1)$ is the projection of
the latter subspace onto the subspace $\mathcal{R}(P_2)$. In general, let $P_j$,
$j \in (1, n-1)$, be the projection of $E^n$ onto the subspace
$\{x : x \in E^n \ \& \ x^1 = \cdots = x^j = 0\}$. Clearly, the last subspace is the space generated
by $\{e_k : k \in (j+1, n)\}$, and $\mathcal{R}(P_j) = \mathcal{R}(P_j | \mathcal{R}(P_{j-1}))$, $j \in (2, n-1)$.
The vector $(t^1, 0, \ldots, 0)$ is the projection of $t$ onto the subspace
generated by $e_1$, and we have
$$(t^1, 0, \ldots, 0) = (t \cdot e_1) e_1.$$
Now, there exists a unique $\theta_1 \in [0,\pi]$ so that $\cos\theta_1 = (t \cdot e_1)/|t|$,
provided $|t| \ne 0$. Hence we get
$$t^1 = |t| \cos\theta_1.$$
The number $\theta_1$ may be considered as the measure of the angle between
the vectors $t$ and $e_1$. Since $t = (t \cdot e_1) e_1 + P_1(t)$ and $e_1$ and $P_1(t)$ are
orthogonal, we have
$$|t|^2 = |t|^2 \cos^2\theta_1 + |P_1(t)|^2.$$
But, we also have
$$|t|^2 \cos^2\theta_1 + |t|^2 \sin^2\theta_1 = |t|^2,$$
so that
$$|P_1(t)|^2 = |t|^2 \sin^2\theta_1.$$
Since $\theta_1 \in [0,\pi]$, $\sin\theta_1 \ge 0$, so that
$$|P_1(t)| = |t| \sin\theta_1.$$
Now, let us repeat this process with the vector $P_1(t)$ playing the role
of $t$ and $P_2$ playing the role of $P_1$. We find that
$$(0, t^2, 0, \ldots, 0) = (P_1(t) \cdot e_2) e_2,$$
and there exists a unique $\theta_2 \in [0,\pi]$ so that $\cos\theta_2 = (P_1(t) \cdot e_2)/|P_1(t)|$,
provided, of course, that $P_1(t) \ne 0$. We then get
$$t^2 = |P_1(t)| \cos\theta_2 = |t| \sin\theta_1 \cos\theta_2,$$
$$|P_2(t)| = |P_1(t)| \sin\theta_2 = |t| \sin\theta_1 \sin\theta_2.$$
The number $\theta_2$ may be considered the measure of the angle between
$P_1(t)$ and $e_2$ (Fig. 6.5.1).
FIGURE 6.5.1
If $P_{n-2}(t) \ne 0$, then none of the vectors $t$, $P_1(t)$, $P_2 \circ P_1(t)$,
$\ldots$, $P_{n-2} \circ P_{n-3} \circ \cdots \circ P_1(t)$ is zero, and we can proceed by induction and
find that there exist unique $\theta_k \in [0,\pi]$, $k \in (1, n-1)$, so that $\forall k \in (1, n-1)$,
$$t^k = |t| \sin\theta_1 \sin\theta_2 \cdots \sin\theta_{k-1} \cos\theta_k,$$
and moreover
$$|t^n| = |t| \sin\theta_1 \sin\theta_2 \cdots \sin\theta_{n-2} \sin\theta_{n-1}.$$
The last equation comes from the fact that
$$(0, \ldots, 0, t^n) = P_{n-1} \circ P_{n-2} \circ \cdots \circ P_1(t) = P_{n-1}(t).$$
If $t = 0$, these equations still hold, but the numbers $\theta_1, \ldots, \theta_{n-1}$
are no longer uniquely determined. Indeed, any numbers will do. If
$t \ne 0$, but $P_1(t) = 0$, then from the equation $|P_1(t)| = |t| \sin\theta_1$, it
follows that $\theta_1 = 0$ or $\theta_1 = \pi$. Again the equations above hold, but now
$\theta_2, \ldots, \theta_{n-1}$ are no longer uniquely determined and again any numbers
will work. Proceeding in this way, we see that if $t \ne 0$ and $\forall k \in (1, n-2)$,
$\theta_k \in \,]0,\pi[$, then the vector $t$ uniquely determines the
numbers $\theta_k$, $\forall k \in (1, n-1)$.
Unfortunately the last equation is an equation for $|t^n|$ rather than
$t^n$. If we wish to remove the absolute value sign, it may no longer be
true that we can take $\theta_{n-1} \in [0,\pi]$. However, if $\forall k \in (1, n-2)$,
$\theta_k \in \,]0,\pi[$, then $\sin\theta_1 \cdots \sin\theta_{n-2} \ne 0$, and there exists a unique
$\theta_{n-1} \in [0, 2\pi[$ so that
$$t^{n-1} = |t| \sin\theta_1 \cdots \sin\theta_{n-2} \cos\theta_{n-1},$$
$$t^n = |t| \sin\theta_1 \cdots \sin\theta_{n-2} \sin\theta_{n-1}.$$
Let $S$ be the collection of $n$-tuples $(\rho, \theta_1, \ldots, \theta_{n-1})$ where $\rho \ge 0$,
$\theta_k \in [0,\pi]$ for $k \in (1, n-2)$ and $\theta_{n-1} \in [0, 2\pi]$. Let $S_0$ be this
set of $n$-tuples with $\rho > 0$, $\theta_k \in \,]0,\pi[$ for $k \in (1, n-2)$ and
$\theta_{n-1} \in [0, 2\pi[$.
We have proved the following result.
The function with domain $S$ defined by
$$\begin{aligned} t^1 &= |t| \cos\theta_1 \\ t^2 &= |t| \sin\theta_1 \cos\theta_2 \\ &\ \ \vdots \\ t^k &= |t| \sin\theta_1 \sin\theta_2 \cdots \sin\theta_{k-1} \cos\theta_k \\ &\ \ \vdots \\ t^{n-1} &= |t| \sin\theta_1 \sin\theta_2 \cdots \sin\theta_{n-2} \cos\theta_{n-1} \\ t^n &= |t| \sin\theta_1 \sin\theta_2 \cdots \sin\theta_{n-2} \sin\theta_{n-1}, \end{aligned} \qquad (6.5.6)$$
where $|t| = \rho$, has range all of $E^n$. If this function is restricted to $S_0$ then it is one to one and
its range is all of $E^n$ with the exception of the subspace generated by the set
$\{e_j : j \in (1, n-2)\} \cup \{0\}$.
Note that if $n = 2$, the formulas (6.5.6) give the ordinary transformation
from "polar coordinates" to "rectangular coordinates," and the
exceptional set is $\{0\}$.
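The formulas (6.5.6) and the step-by-step determination of the angles can be written as a pair of functions and tested on a vector outside the exceptional set. The sketch below is illustrative: the function names are hypothetical, the inverse assumes that none of the intermediate projections vanishes, and the last angle is obtained with `atan2` reduced to $[0, 2\pi[$, which is one convenient way of solving the last two equations.

```python
import math

def spherical_to_rect(rho, thetas):
    """Formulas (6.5.6): map (rho, theta_1, ..., theta_{n-1}) to a vector in E^n."""
    t = []
    s = rho                      # running product rho * sin(theta_1) * ... * sin(theta_{k-1})
    for th in thetas:
        t.append(s * math.cos(th))
        s *= math.sin(th)
    t.append(s)                  # last coordinate: the product of all the sines times rho
    return t

def rect_to_spherical(t):
    """Recover (rho, thetas) for a vector t outside the exceptional set (assumed)."""
    n = len(t)
    rho = math.sqrt(sum(c * c for c in t))
    thetas = []
    for k in range(n - 2):
        # length of the part of t that remains after deleting the first k coordinates
        tail = math.sqrt(sum(c * c for c in t[k:]))
        thetas.append(math.acos(t[k] / tail))             # angle in [0, pi]
    # last angle in [0, 2*pi[, determined by the last two coordinates
    thetas.append(math.atan2(t[n - 1], t[n - 2]) % (2 * math.pi))
    return rho, thetas

t = [1.0, 2.0, -3.0, -0.5]
rho, thetas = rect_to_spherical(t)
back = spherical_to_rect(rho, thetas)
```

For $n = 2$ the first function reduces to the usual polar formulas, since the list of angles then contains the single angle $\theta_1$.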
Symmetric Transformations. Suppose $L$ is a linear subspace of $E^n$
and $A$ is a linear transformation with $\mathcal{D}(A) = L$ and $\mathcal{R}(A) \subset L$. The
linear transformation $A$ is said to be symmetric $\iff A = A^t$.
If $(u_1, \ldots, u_r)$ is an ordered orthonormal basis for $L$ and $[a_{ij}]$ is
the matrix representation of $A$ with respect to this basis, it follows from
our discussion about the matrix representation of $A^t$ that
$$a^t_{ij} = a_{ji}.$$
But since $A = A^t$, we get that
$$a^t_{ij} = a_{ij} = a_{ji}.$$
Any matrix whose entries satisfy the last relation is called symmetric.
There is a method of computing the norm of a symmetric transformation
that is usually more convenient than the methods we have indicated
previously. Let us set
$$M = \sup \{|A(x) \cdot x| : |x| = 1\},$$
so that $\forall x \in L$,
$$|A(x) \cdot x| \le M |x|^2.$$
It is clear that
$$M \le \sup \{|A(x) \cdot y| : |x| = |y| = 1\} = \|A\|.$$
On the other hand, a direct computation shows that
$$A(x) \cdot y = \tfrac{1}{4} [A(x + y) \cdot (x + y) - A(x - y) \cdot (x - y)].$$
From the facts that
$$|[A(x + y) \cdot (x + y) - A(x - y) \cdot (x - y)]| \le M [|x + y|^2 + |x - y|^2],$$
$$|x + y|^2 + |x - y|^2 = 2[|x|^2 + |y|^2],$$
we get
$$\|A\| = \sup \{|A(x) \cdot y| : |x| = |y| = 1\} \le M.$$

Consequently, we have the following equality for symmetric transfor
mations:
sup {IA(x)
Since
IA(x) xi
xi: lxl = l}
llAll.
(6.5.7)
is a continuous real-valued function and the unit
sphere in L,S ={ x:
lxl = l},
is compact it follows that
3x0 E S,
llAll =max {IA(x) xi: x ES}=IA(x0) x01.
Let us set
,0=A(x0) x0
A direct computation shows that
0 IA(xo ) - 1-toxol2 =IA(xo )J2-J.to2
so that
Since
IA(xo ) I
llAII= 11.to l.
:!f::
it follows that
IA(xo)-,o xol2=0.
This means that
(6.5.8)
Any number $\mu_0$ for which there exists a nonzero vector $x_0 \in \mathcal{D}(A)$ for which (6.5.8) is satisfied is called an eigenvalue or proper value of the linear transformation $A$. Any nonzero vector that satisfies (6.5.8) is called an eigenvector or proper vector for $A$. Our previous discussion shows that a symmetric transformation has an eigenvalue. The corresponding eigenvectors are not unique, since clearly any element in the linear space generated by an eigenvector satisfies the relation (6.5.8).
For a given eigenvalue $\lambda$, let $M_\lambda$ be the linear subspace of all vectors in $L$ that satisfy the relation $Ax = \lambda x$. It is clear that $A$ takes every element in $M_\lambda$ into an element in $M_\lambda$. If $A$ is symmetric, then it is also true that $A$ takes every element in $M_\lambda^\perp$ into $M_\lambda^\perp$. To prove this last statement, let $x \in M_\lambda^\perp$; then $\forall y \in M_\lambda$ we have $x \cdot y = 0$. Now, $\forall y \in M_\lambda$, since $A(y) \in M_\lambda$ and $x \in M_\lambda^\perp$ we get
\[ A(x) \cdot y = x \cdot A(y) = \lambda (x \cdot y) = 0 . \]
This means $A(x) \in M_\lambda^\perp$.
Let $\mu_0$ be the eigenvalue whose existence we established several paragraphs back. Let $A_1$ be the restriction of $A$ to $M_{\mu_0}^\perp$; that is, $A_1 = A \,|\, M_{\mu_0}^\perp$. As we have shown in the last paragraph, $A_1$ is a symmetric linear transformation with domain $M_{\mu_0}^\perp$ and range in the same linear subspace. Hence, by the same existence proof as before, there exists an $x_1 \in M_{\mu_0}^\perp$ and an eigenvalue $\mu_1$ so that $|x_1| = 1$ and $A_1(x_1) = \mu_1 x_1$.
Proceeding in this way (formally, by induction!) we find that there is an ordered orthonormal basis $(v_1, \ldots, v_r)$ for $L$ and eigenvalues $\{\lambda_1, \ldots, \lambda_r\}$ so that
\[ A v_k = \lambda_k v_k . \]
We are not excluding the possibility that some of the $\lambda_k$ are the same. This can happen, for example, if the dimension of $M_\lambda$ is more than 1.
With respect to the ordered basis $(v_1, \ldots, v_r)$ the matrix representation of $A$ is
\[
\begin{bmatrix}
\lambda_1 & & 0 \\
 & \ddots & \\
0 & & \lambda_r
\end{bmatrix},
\]
where the entries off of the main diagonal are zero. This means that the matrix $[a_{ij}]$, which is the matrix representation of $A$ with respect to the ordered basis $(u_1, \ldots, u_r)$, is similar to a diagonal matrix in the sense that there is an invertible matrix $[b_{ij}]$ so that $[b_{ij}]^{-1}[a_{ij}][b_{ij}]$ is the given diagonal matrix. Because of this, we usually say that a symmetric matrix is diagonalizable.
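Let us suppose that we have numbered the eigenvalues so that $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_r$. For $x \in L$ let us write
\[ x = \sum_{k=1}^{r} x^k v_k . \]
Hence we get
\[ A(x) = \sum_{k=1}^{r} x^k \lambda_k v_k , \qquad A(x) \cdot x = \sum_{k=1}^{r} \lambda_k (x^k)^2 . \]
This shows that $\forall x \in L$,
\[ \lambda_r |x|^2 \leq A(x) \cdot x \leq \lambda_1 |x|^2 . \]
Indeed, as the reader may easily verify,
\[ \lambda_1 = \sup\{A(x) \cdot x : |x| = 1\}, \qquad \lambda_r = \inf\{A(x) \cdot x : |x| = 1\}. \]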
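The extremal characterization of the eigenvalues, together with (6.5.7), can be observed numerically for a $2 \times 2$ symmetric matrix. The sketch below is illustrative only: the matrix entries, the sampling grid of 3600 unit vectors, and the helper names are assumptions, and the closed-form eigenvalue formula used is the standard one for a matrix of the form $\begin{bmatrix} a & b \\ b & c \end{bmatrix}$.

```python
import math

def eigenvalues_sym2(a, b, c):
    """Eigenvalues (largest first) of the symmetric matrix [[a, b], [b, c]]."""
    mid = (a + c) / 2.0
    rad = math.hypot((a - c) / 2.0, b)
    return mid + rad, mid - rad

def quadratic_form(a, b, c, x):
    """A(x) . x for A = [[a, b], [b, c]]."""
    return a * x[0] ** 2 + 2 * b * x[0] * x[1] + c * x[1] ** 2

a, b, c = 2.0, 1.0, -1.0          # illustrative symmetric matrix
lam1, lam2 = eigenvalues_sym2(a, b, c)

# Sample A(x) . x over unit vectors x = (cos s, sin s).
steps = 3600
samples = []
for i in range(steps):
    s = 2 * math.pi * i / steps
    samples.append(quadratic_form(a, b, c, (math.cos(s), math.sin(s))))

sup_q = max(samples)                          # approximates lambda_1
inf_q = min(samples)                          # approximates lambda_r
norm_estimate = max(abs(q) for q in samples)  # approximates ||A|| by (6.5.7)
```

Up to the sampling error, `sup_q` and `inf_q` agree with the largest and smallest eigenvalue, and `norm_estimate` agrees with the larger of their absolute values.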
Orthogonal Transformations. Let $L$ be a linear subspace of $E^n$ and $A$ a linear transformation with $\mathcal{D}(A) = L$ and $\mathcal{R}(A) \subset L$. The linear transformation $A$ is said to be orthogonal $\iff \forall x \in L$, $|A(x)| = |x|$.
A necessary and sufficient condition that a linear transformation is orthogonal is that $\forall x, y \in L$,
\[ A(x) \cdot A(y) = x \cdot y . \tag{6.5.9} \]
We believe that the sufficiency of this condition is clear without further comment. The necessity follows from the relation
\[ A(x) \cdot A(y) = \tfrac{1}{4}\,[\,|A(x+y)|^2 - |A(x-y)|^2\,] = \tfrac{1}{4}\,[\,|x+y|^2 - |x-y|^2\,] = x \cdot y . \]
From the condition (6.5.9) it is almost immediate that a necessary and sufficient condition that a linear transformation is orthogonal is that
\[ A^t \circ A = I , \tag{6.5.10} \]
where $I$ is the identity transformation; that is, $\forall x \in L$, $I(x) = x$. Indeed, if $A$ is orthogonal, then $\forall x, y \in L$,
\[ A^t \circ A(x) \cdot y = A(x) \cdot A(y) = x \cdot y . \]
Hence for any fixed $x$ and for every $y$, $[A^t \circ A(x) - x] \cdot y = 0$, which implies that $A^t \circ A(x) = x$. Conversely, if $A^t \circ A = I$, then $\forall x \in L$, $A^t \circ A(x) \cdot x = |A(x)|^2 = |x|^2$. Note that an equivalent way of stating (6.5.10) is that
\[ A^{-1} = A^t . \]
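As an illustration of (6.5.9) and (6.5.10), a rotation of the plane can be checked directly. This is only a sketch under stated assumptions: the angle, the test vectors, and the helper names `apply`, `dot`, `transpose`, and `matmul` are chosen here for illustration, and matrices are stored as lists of rows.

```python
import math

def apply(m, x):
    """Matrix-vector product m x."""
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]

def dot(x, y):
    return sum(p * q for p, q in zip(x, y))

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

angle = 0.8  # illustrative rotation angle
A = [[math.cos(angle), -math.sin(angle)],
     [math.sin(angle),  math.cos(angle)]]

x, y = [1.0, 2.0], [-3.0, 0.5]
lhs = dot(apply(A, x), apply(A, y))  # A(x) . A(y)
rhs = dot(x, y)                      # x . y
gram = matmul(transpose(A), A)       # should be the identity, as in (6.5.10)
```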
Exercises
1. If $T$ is a linear transformation, show that its range is a linear space. If $T$ is one to one, show that $T^{-1}$ is also a linear transformation.
2. Decide which of the following functions from $V^3$ into itself is a linear transformation.
(a) $T(x) = (x^1 - x^3,\ x^1 + x^2,\ x^1 + 3x^2 - x^3)$.
(b) $T(x) = (|x^1|,\ x^2 + x^3,\ x^2)$.
(c) $T(x) = (3x^2 - x^3,\ \sqrt{(x^1)^2},\ x^2 + x^3)$.
(d) $T(x) = (x^1 x^3,\ x^2 + 3x^3,\ 0)$.
(e) $T(x) = (3x^1 + 2x^2 + x^3,\ x^1 - x^2,\ 4x^1 + x^2 + 2x^3)$.
For those transformations that are linear, give the matrix representations with respect to the ordered basis $(e_1, e_2, e_3)$.
3. Let $T$ be a linear transformation with domain the linear subspace $V \subset V^n$ and range in $V$. Let us set
\[ N = \{x : x \in V \ \&\ T(x) = 0\}, \]
the null space of $T$. Show that $N$ is a linear subspace of $V$ and moreover
\[ \operatorname{rank}(T) + \dim(N) = \dim(V). \]
4. Verify the facts stated in the paragraph after Definition 6.5.3.
5. Suppose that $A$ and $B$ are linear transformations each having domain a real vector space $V$ and ranges in $V$. Show that
\[ A \circ B = I \iff B \circ A = I, \]
where $I$ is the identity transformation; that is, $\forall x \in V$, $I(x) = x$.
6. Suppose $A$ and $B$ are linear transformations each having domain a linear subspace $L \subset E^n$ and range $M \subset E^m$. Show the following:
(a) $A^{tt} = A$.
(b) $(\alpha A + \beta B)^t = \alpha A^t + \beta B^t$, $\forall \alpha, \beta \in R$.
(c) $(A \circ B)^t = B^t \circ A^t$.
7. Let $C$ be a closed set in $E^n$ with the property that $x \in C \implies \alpha x \in C$ for all $\alpha \in R$ with $\alpha \geq 0$. Suppose $f$ is a continuous function with $\mathcal{D}(f) = C$, $\mathcal{R}(f) \subset E^n$ and so that $\forall x \in C$ and $\forall \alpha \in R$ with $\alpha \geq 0$, $f(\alpha x) = \alpha f(x)$. Show that $\exists M > 0$, so that $\forall x \in C$,
\[ |f(x)| \leq M|x| . \]
8. Let $f$ be a function with domain $E^n$ and range in $E^m$ which is additive; that is, $\forall x, y \in E^n$,
\[ f(x + y) = f(x) + f(y). \]
Show that if $f$ is continuous at $x = 0$, then it is continuous at every point of $E^n$.
9.
Let
Ay be
the linear functional defined on En by the equation
Ay(x) =x y.
Show that
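10. Let $A$ be an orthogonal linear transformation with domain and range a linear subspace $L \subset E^n$. Suppose $(u_1, \ldots, u_r)$ is any ordered orthonormal basis of $L$ and $[a_{ij}]$ the matrix representation of $A$ with respect to this basis. Show that
\[
\sum_{j=1}^{r} a_{ij} a_{kj} =
\begin{cases}
1 & \text{if } i = k, \\
0 & \text{if } i \neq k.
\end{cases}
\]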
11. Let $A$ be a symmetric transformation with domain a linear subspace $L \subset E^n$ and range in $E^n$. Let $[a_{ij}]$ be the matrix representation of $A$ with respect to the ordered orthonormal basis $(u_1, \ldots, u_r)$. Show that there is a matrix $[b_{ij}]$ that is the matrix representation of an orthogonal transformation from $L$ onto $L$ so that $[b_{ij}]^{-1}[a_{ij}][b_{ij}]$ is a diagonal matrix.
12. If $L$ and $M$ are linear subspaces of $E^n$ that have the same dimension, show that there is a linear transformation $A$ with domain $L$ and range $M$ so that $\forall x \in L$, $|A(x)| = |x|$.
13. Show that if $A$ is a linear transformation with domain $L \subset E^n$ and range $M \subset E^n$ so that $\forall x \in L$, $|A(x)| = |x|$, then $A$ can be extended to be an orthogonal transformation with domain and range $E^n$.
14. Let $P$ be a symmetric linear transformation with the property that $P^2 = P$. Show that $P$ is the projection of its domain onto its range. Give an example that shows that $P^2 = P$ does not imply that $P$ is a symmetric linear transformation.
15. Let $A$ be a symmetric linear transformation, $M$ a linear subspace of $\mathcal{D}(A)$, and $P$ the projection of $\mathcal{D}(A)$ onto $M$. Show that $A$ takes $M$ into itself $\iff P \circ A = A \circ P$.
16. If $P$ is a nonzero projection show that $\|P\| = 1$. If $y$ is a nonzero vector in $E^n$ and $x$ is any vector in $E^n$, use the formula for the projection of $x$ onto the linear space generated by $y$ and the result of the first sentence of this exercise to give another proof of the Cauchy-Bunjakovsky-Schwarz inequality.
17. Show that a linear transformation is an open map.

6.6 DETERMINANTS
In his study of elementary algebra the reader has undoubtedly come

across the notion of determinants and has learned enough of their
properties to be able to use them for solving systems of linear equations.
Our purpose in this section is to derive a number of properties of
determinants in a rigorous way since they are very important quantities
in the higher-dimensional calculus.
Before we discuss determinants, it is necessary to discuss a certain class of functions, called permutations, which take finite sets onto themselves.
6.6.1 Definition. Let $S$ be a finite set. A one-to-one function $\sigma$ having $S$ as its domain and range is called a permutation.
For example, if $S$ is the set of integers $\langle 1, n\rangle$, then a permutation is simply a function that takes these integers into each other. It is not difficult to show, although we shall not do it, that each permutation is a composite of permutations that permute or interchange only two elements in the original set and leave the remaining elements fixed. This leads to the important concept of an even or odd permutation: An even [odd] permutation is one that is a composition of an even [odd] number of elementary permutations which interchange only two elements.
Of course, in trying to define an even or an odd permutation as we did in the last paragraph, it is necessary to show that a permutation cannot be written in two ways as a composite of both an even number and an odd number of elementary permutations. To avoid proving this, we shall define the concept of an even or an odd permutation in a different way.
Let $S_n$ be that function whose domain is the set of permutations of $\langle 1, n\rangle$ onto itself which is given by
\[ S_n(\sigma) = \prod_{1 \leq j < k \leq n} (\sigma(k) - \sigma(j)). \tag{6.6.1} \]
The definition of the product on the right has already been indicated in Section 3.5. To make this precise we first remark that the set $A_n = \{(j,k) : j, k \in N,\ 1 \leq j < k \leq n\}$ has $n(n-1)/2$ elements. Let $f_\sigma$ be that function with domain $A_n$ defined by
\[ f_\sigma(j,k) = \sigma(k) - \sigma(j), \]
and let $\phi$ be any one-to-one function with domain $\langle 1, n(n-1)/2\rangle$ and range $A_n$. Then, as we noted in Section 3.5, the right side of (6.6.1) is defined as
\[ S_n(\sigma) = \prod_{k=1}^{n(n-1)/2} f_\sigma \circ \phi(k), \]
and this definition is quite independent of the one-to-one function $\phi$ that is used.

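Let us now define a function on the set $A_n$ as follows:
\[
\sigma_{jk} =
\begin{cases}
1 & \text{if } \sigma(k) - \sigma(j) > 0, \\
-1 & \text{if } \sigma(k) - \sigma(j) < 0.
\end{cases}
\]
We claim that
\[ \prod_{1 \leq j < k \leq n} \sigma_{jk}[\sigma(k) - \sigma(j)] = \prod_{1 \leq j < k \leq n} (k - j), \]
where these products are defined in a manner indicated in the last paragraph. Indeed, for each permutation $\sigma$ let $F_\sigma$ be that function having domain $A_n$ and range in $R$ which is given by
\[ F_\sigma(j,k) = \sigma_{jk}[\sigma(k) - \sigma(j)]. \]
If $\iota$ is the identity permutation, then clearly
\[ F_\iota(j,k) = k - j = f_\iota(j,k). \]
Now, define the one-to-one function $\psi$ with domain and range $A_n$ by letting $\psi(j,k)$ be the pair formed from $\sigma^{-1}(j)$ and $\sigma^{-1}(k)$, written with the smaller number first. It is clear that $F_\sigma \circ \psi = F_\iota$. If $\phi$ is any one-to-one function with domain $\langle 1, n(n-1)/2\rangle$ and range $A_n$, then $\psi \circ \phi$ is the same type of a function and thus
\[ \prod_{k=1}^{n(n-1)/2} F_\sigma \circ \phi(k) = \prod_{k=1}^{n(n-1)/2} F_\sigma \circ \psi \circ \phi(k) = \prod_{k=1}^{n(n-1)/2} F_\iota \circ \phi(k), \]
which proves our assertion. From the associativity property of finite products we get
\[ \prod_{1 \leq j < k \leq n} \sigma_{jk}(\sigma(k) - \sigma(j)) = \prod_{1 \leq j < k \leq n} \sigma_{jk} \prod_{1 \leq j < k \leq n} (\sigma(k) - \sigma(j)). \]
We shall set
\[ \operatorname{sgn} \sigma = \prod_{1 \leq j < k \leq n} \sigma_{jk}, \tag{6.6.2} \]
and note that $\operatorname{sgn} \sigma$ is either $1$ or $-1$.
We have established part of the following theorem.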
6.6.2 Theorem. There exists a function, with domain the collection of permutations of $\langle 1, n\rangle$ onto itself and range the set $\{1, -1\}$, whose values are denoted by '$\operatorname{sgn} \sigma$,' so that
\[ S_n(\sigma) = \operatorname{sgn} \sigma \; S_n(\iota), \tag{6.6.3} \]
where $\iota$ is the identity permutation. Further, for every pair of permutations $\sigma$ and $\tau$,
\[ \operatorname{sgn} \tau \circ \sigma = \operatorname{sgn} \tau \; \operatorname{sgn} \sigma . \tag{6.6.4} \]
Proof. Since we have already proved (6.6.3), it remains to prove (6.6.4). Let $f$ be any function whose domain is $\langle 1, n\rangle$ and range is in $\langle 1, n\rangle$. We set
\[ S_n(f) = \prod_{1 \leq j < k \leq n} (f(k) - f(j)). \]
We claim that
\[ S_n(f \circ \sigma) = \operatorname{sgn} \sigma \; S_n(f). \]
Indeed, the proof of this carries over mutatis mutandis from the proof of (6.6.3) given just previous to the statement of this theorem, where in that case $f$ was taken as the identity permutation $\iota$. If in the last formula we replace $f$ by $\tau$ we get (6.6.4).
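The relations (6.6.3) and (6.6.4) can be checked by brute force for a small $n$. The sketch below is an illustration rather than a proof. As an assumption made for convenience, a permutation is stored as a tuple acting on $\{0, \ldots, n-1\}$ instead of $\langle 1, n\rangle$, which does not affect any of the differences $\sigma(k) - \sigma(j)$ or $k - j$. The names `sgn`, `S`, and `compose` are chosen here.

```python
from itertools import permutations

def sgn(sigma):
    """Product of the sigma_jk of (6.6.2) over all pairs j < k."""
    sign = 1
    n = len(sigma)
    for j in range(n):
        for k in range(j + 1, n):
            if sigma[k] - sigma[j] < 0:
                sign = -sign
    return sign

def S(sigma):
    """S_n(sigma) of (6.6.1): product of sigma(k) - sigma(j) over j < k."""
    prod = 1
    n = len(sigma)
    for j in range(n):
        for k in range(j + 1, n):
            prod *= sigma[k] - sigma[j]
    return prod

def compose(tau, sigma):
    """The permutation tau o sigma."""
    return tuple(tau[sigma[i]] for i in range(len(sigma)))

n = 4
perms = list(permutations(range(n)))
identity = tuple(range(n))

check_663 = all(S(s) == sgn(s) * S(identity) for s in perms)
check_664 = all(sgn(compose(t, s)) == sgn(t) * sgn(s)
                for s in perms for t in perms)
```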
Suppose now that $[a_{ij}]$ is a square $n \times n$ matrix. One way of defining a determinant of a matrix, which the reader has probably already seen in his previous studies, is as follows:
\[ \det[a_{ij}] = \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1\sigma 1} a_{2\sigma 2} \cdots a_{n\sigma n}, \tag{6.6.5} \]
where the symbol $\sum_\sigma$ indicates we are summing over all permutations of $\langle 1, n\rangle$ onto itself and for notational convenience we have set $\sigma(j) = \sigma j$. We leave it to the reader as an exercise to establish that there are $n!$ permutations with domain $\langle 1, n\rangle$.
Although the equality in (6.6.5) is a perfectly good definition of the determinant of a matrix, it is usually easier to obtain the principal properties of the determinant by proceeding in a way that, at first glance, may seem slightly different. We shall do this by means of alternating multilinear functionals.
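The sum over permutations in (6.6.5) translates directly into a short program. This is a sketch for small matrices only, since the sum has $n!$ terms. The function names and the sample matrix are assumptions for illustration, rows and columns are indexed from 0, and a matrix is stored as a list of rows.

```python
from itertools import permutations

def sgn(sigma):
    """Sign of a permutation, as the product of the sigma_jk of (6.6.2)."""
    sign = 1
    n = len(sigma)
    for j in range(n):
        for k in range(j + 1, n):
            if sigma[k] < sigma[j]:
                sign = -sign
    return sign

def det_permutation_sum(a):
    """Determinant of a square matrix by the permutation sum (6.6.5)."""
    n = len(a)
    total = 0
    for sigma in permutations(range(n)):
        term = sgn(sigma)
        for j in range(n):
            term *= a[j][sigma[j]]   # the factor a_{j sigma(j)}
        total += term
    return total

example = [[2, 0, 1],
           [1, 3, -1],
           [0, 5, 4]]
value = det_permutation_sum(example)
identity_value = det_permutation_sum([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
```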
6.6.3 Definition. A multilinear functional is a real-valued function defined on an $m$-fold Cartesian product of $V^n$ which has the property that when any $m - 1$ variables are held fixed the function is a linear functional of the remaining variable.
A multilinear functional $D$ defined on the $m$-fold Cartesian product of $V^n$ which has the further property that for every permutation $\sigma$ acting on $\langle 1, m\rangle$ onto itself,
\[ D(a_{\sigma 1}, a_{\sigma 2}, \ldots, a_{\sigma m}) = \operatorname{sgn} \sigma \; D(a_1, a_2, \ldots, a_m) \]
is called an alternating multilinear functional. If $m = n$ and $D$ has the additional property that
\[ D(e_1, \ldots, e_n) = 1, \]
then $D$ is called a determinant function, and the number $D(a_1, \ldots, a_n)$ is called the determinant of the $n \times n$ matrix whose $k$th row consists of the components of the vector $a_k$.
It is not very difficult to verify that the function defined by (6.6.5)

is a determinant function. However, since it is an important point, we
shall state and prove this formally.
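6.6.4 Theorem. The function defined by the right side of (6.6.5) is a determinant function.
Proof. Let us set
\[ D(a_1, \ldots, a_n) = \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1\sigma 1} \cdots a_{n\sigma n}. \]
Now, $\forall \alpha, \beta \in R$,
\[
\begin{aligned}
D(a_1, \ldots, \alpha a_k + \beta b_k, \ldots, a_n)
&= \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1\sigma 1} \cdots (\alpha a_{k\sigma k} + \beta b_{k\sigma k}) \cdots a_{n\sigma n} \\
&= \alpha \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1\sigma 1} \cdots a_{k\sigma k} \cdots a_{n\sigma n}
 + \beta \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1\sigma 1} \cdots b_{k\sigma k} \cdots a_{n\sigma n} \\
&= \alpha D(a_1, \ldots, a_k, \ldots, a_n) + \beta D(a_1, \ldots, b_k, \ldots, a_n).
\end{aligned}
\]
This shows that $D$ is multilinear. Further,
\[ D(a_{\tau 1}, \ldots, a_{\tau n}) = \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{\tau 1\, \sigma 1} \cdots a_{\tau n\, \sigma n}. \]
If $\tau(j) = k$, then $\sigma(j) = \sigma \circ \tau^{-1}(k)$. Thus, since $\tau$ is a permutation and a finite product is independent of the order of the products, we get
\[ \prod_{j=1}^{n} a_{\tau j\, \sigma j} = \prod_{k=1}^{n} a_{k\, \sigma\tau^{-1} k}. \]
Now use the fact that $\sigma = (\sigma \circ \tau^{-1}) \circ \tau$ and formula (6.6.4), and we get
\[ \sum_{\sigma} (\operatorname{sgn} \sigma) \prod_{j=1}^{n} a_{\tau j\, \sigma j} = \operatorname{sgn} \tau \sum_{\sigma} (\operatorname{sgn} \sigma \circ \tau^{-1}) \prod_{k=1}^{n} a_{k\, \sigma\tau^{-1} k}. \]
Now, as $\sigma$ goes over the set of permutations of $\langle 1, n\rangle$, $\sigma \circ \tau^{-1}$ also goes over this set. Thus the function $\Phi_\tau(\sigma) = \sigma \circ \tau^{-1}$ is a permutation of the collection of permutations of $\langle 1, n\rangle$. Because a finite sum is independent of the order of summation, we have
\[ \sum_{\sigma} (\operatorname{sgn} \sigma \circ \tau^{-1}) \prod_{k=1}^{n} a_{k\, \sigma\tau^{-1} k} = \sum_{\sigma} (\operatorname{sgn} \sigma) \prod_{k=1}^{n} a_{k\sigma k}. \]
Thus we see that
\[ D(a_{\tau 1}, \ldots, a_{\tau n}) = \operatorname{sgn} \tau \; D(a_1, \ldots, a_n), \]
so that $D$ is alternating.
Finally,
\[ D(e_1, \ldots, e_n) = \sum_{\sigma} (\operatorname{sgn} \sigma)\, e_1^{\sigma 1} \cdots e_n^{\sigma n}. \]
Since $e_k^{\sigma k} = 0$, unless $\sigma k = k$, we see that the only nonzero term on the right appears when $\sigma$ is the identity permutation. Thus
\[ D(e_1, \ldots, e_n) = 1, \]
and the theorem is established.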

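6.6.5 Theorem. An alternating multilinear functional $D$ satisfies the following (where $j \neq k$ in (b) and (c)):
(a) $x_k = 0 \implies D(x_1, \ldots, x_k, \ldots, x_m) = 0$.
(b) $x_j = x_k \implies D(x_1, \ldots, x_j, \ldots, x_k, \ldots, x_m) = 0$.
(c) $\forall \alpha \in R$ & $\forall (x_1, \ldots, x_m)$,
\[ D(x_1, \ldots, x_j + \alpha x_k, \ldots, x_k, \ldots, x_m) = D(x_1, \ldots, x_j, \ldots, x_k, \ldots, x_m). \]
Proof. (a) Since $D$ is multilinear we have
\[ D(x_1, \ldots, 0, \ldots, x_m) = D(x_1, \ldots, 0 + 0, \ldots, x_m) = 2D(x_1, \ldots, 0, \ldots, x_m). \]
This proves (a).
(b) Let $\sigma$ be the permutation with domain $\langle 1, m\rangle$ so that $\sigma(i) = i$ if $i \neq j, k$, $\sigma(j) = k$, and $\sigma(k) = j$. Then since $\operatorname{sgn} \sigma = -1$, we have
\[ D(x_1, \ldots, x_j, \ldots, x_k, \ldots, x_m) = -D(x_1, \ldots, x_k, \ldots, x_j, \ldots, x_m). \]
Since $x_j = x_k$ we have proved (b).
(c) Using the multilinear property we have
\[ D(x_1, \ldots, x_j + \alpha x_k, \ldots, x_k, \ldots, x_m) = D(x_1, \ldots, x_j, \ldots, x_k, \ldots, x_m) + \alpha D(x_1, \ldots, x_k, \ldots, x_k, \ldots, x_m). \]
By part (b) the second term on the right is zero, which completes the proof.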
6.6.6 Theorem. Suppose $m \leq n$, and $\alpha$ is a real-valued function with domain the set $\{(j_1, \ldots, j_m) : 1 \leq j_1 < \cdots < j_m \leq n\}$. Then there exists at most one alternating multilinear functional $D$ defined on the $m$-fold Cartesian product of $V^n$ so that $\forall (j_1, \ldots, j_m)$ with $1 \leq j_1 < \cdots < j_m \leq n$, we have
\[ D(e_{j_1}, \ldots, e_{j_m}) = \alpha(j_1, \ldots, j_m). \]
Proof. Suppose $E$ is another alternating multilinear functional on the $m$-fold Cartesian product of $V^n$, so that $E(e_{j_1}, \ldots, e_{j_m}) = \alpha(j_1, \ldots, j_m)$. If we set
\[ F = D - E, \]
then $F$ is an alternating multilinear functional with the property that
\[ F(e_{j_1}, \ldots, e_{j_m}) = 0 \]
for all indices for which $1 \leq j_1 < \cdots < j_m \leq n$. Now, let $x_1, \ldots, x_m$ be vectors in $V^n$ and set
\[ x_k = \sum_{j=1}^{n} x_k^j e_j . \]
Using the multilinearity of $F$ we find that
\[ F(x_1, \ldots, x_m) = \sum_{j_m=1}^{n} \cdots \sum_{j_1=1}^{n} x_1^{j_1} \cdots x_m^{j_m}\, F(e_{j_1}, \ldots, e_{j_m}). \]
If $j_r = j_s$ for some $r$ and $s$ with $r \neq s$, it follows from part (b) of the last theorem that $F(e_{j_1}, \ldots, e_{j_m}) = 0$. If the $j_k$ are all different, let $i_1, \ldots, i_m$ be a renumbering of the $j_k$ so that $i_1 < i_2 < \cdots < i_m$. Let $\sigma$ be the permutation given by $i_{\sigma k} = j_k$. Then
\[ F(e_{j_1}, \ldots, e_{j_m}) = (\operatorname{sgn} \sigma)\, F(e_{i_1}, \ldots, e_{i_m}) = 0. \]
Hence $F = 0$, which proves the theorem.
6.6.7 Corollary. There exists one and only one determinant function on the $n$-fold Cartesian product of $V^n$ with itself.
Proof. The existence of such a determinant function is given by Theorem 6.6.4. The uniqueness is given by the last theorem.
The determinant function on the $n$-fold Cartesian product of $V^n$ may also be interpreted as a real-valued function acting on the collection of $n \times n$ matrices. If $[a_{ij}]$ is an $n \times n$ matrix, we shall often denote the determinant of this matrix as on the left side of (6.6.5): $\det[a_{ij}]$. Another standard notation for the determinant of a matrix is
\[
\begin{vmatrix}
a_{11} & \cdots & a_{1n} \\
\vdots & & \vdots \\
a_{n1} & \cdots & a_{nn}
\end{vmatrix} .
\]
One of the most useful properties of the determinant function is that the determinant of a product of two $n \times n$ matrices is the product of the determinants of the matrices. We shall prove a slightly generalized version of this result which will be useful to us when we discuss surface integrals.
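6.6.8 Theorem (Binet-Cauchy). Let $m \leq n$, $[a_{ij}]$ an $m \times n$ matrix, and $[b_{ij}]$ an $n \times m$ matrix. Then
\[ \det\,[a_{ij}][b_{ij}] = \sum_{1 \leq k_1 < \cdots < k_m \leq n} \det[a_{i k_j}]\, \det[b_{k_i j}], \tag{6.6.6} \]
where in each term $[a_{i k_j}]$ and $[b_{k_i j}]$ are the $m \times m$ matrices with $i, j \in \langle 1, m\rangle$.
Proof. Let $[b_{ij}]$ be fixed and set $a_k = (a_{k1}, \ldots, a_{kn})$. The $k$th row of the matrix $[a_{ij}][b_{ij}]$ is $a_k [b_{ij}]$, where we are considering this as matrix multiplication of the $1 \times n$ matrix $a_k$ with the $n \times m$ matrix $[b_{ij}]$ to give a $1 \times m$ matrix which can be considered as a vector in $V^m$. Thus, since $[b_{ij}]$ is fixed, we see that the left side of (6.6.6) is an alternating multilinear functional of the $m$-tuples $(a_1, \ldots, a_m)$. Also, for fixed $[b_{ij}]$, the function with values $\det[a_{i k_j}]$ is an alternating multilinear functional of the vectors $(a_{i k_1}, \ldots, a_{i k_m})$, $i \in \langle 1, m\rangle$. Hence a fortiori it is an alternating multilinear functional of $a_i$, $i \in \langle 1, m\rangle$. Thus both sides of (6.6.6) are alternating multilinear functionals of the $m$-tuples $(a_1, \ldots, a_m)$.
Denote the left side of (6.6.6) by $D(a_1, \ldots, a_m)$ and the right side by $E(a_1, \ldots, a_m)$. Let $\{e_k : k \in \langle 1, n\rangle\}$ be the standard unit vectors in $V^n$; that is, $e_k^j = 0$ for $j \neq k$, $e_k^k = 1$. Now,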
the vectors
ek1[b;i]
Thus we see that
(bk i'' bk12,
, bk1m)
On the other hand,
e ktkt
ek/
"'
=l.
ekmk1
ekmkm
kt < < k,,., and Ut, ,jm) =fa (kt,

(l,m)} so Vi E {l,m},j-i6.k;. The proof
of this is a simple induction argument on m and we leave it for the
reader. Since for j =fa k;, ek/ = 0, it follows that if Ut, ,jm) =fa (k1,
,km) and 1 j1 < < jm n and 1 k1 < < km n, then
Further, if
-,km),
jt
then
jm
{j;: i
< <
3j
and
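Hence we see that for $1 \leq k_1 < \cdots < k_m \leq n$, the number $E(e_{k_1}, \ldots, e_{k_m})$ is the determinant of the $m \times m$ matrix whose $i$th row is $(b_{k_i 1}, \ldots, b_{k_i m})$. Therefore, we see that for $1 \leq k_1 < \cdots < k_m \leq n$, we have
\[ D(e_{k_1}, \ldots, e_{k_m}) = E(e_{k_1}, \ldots, e_{k_m}). \]
It follows from Theorem 6.6.6 that $D = E$, which proves the theorem.
6.6.9 Corollary. If $[a_{ij}]$ and $[b_{ij}]$ are $n \times n$ matrices, then
\[ \det\,[a_{ij}][b_{ij}] = \det[a_{ij}]\, \det[b_{ij}] . \]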
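The Binet-Cauchy identity and its corollary lend themselves to a direct numerical check. The sketch below assumes the usual statement of the identity: the determinant of the product of an $m \times n$ matrix and an $n \times m$ matrix is the sum, over all choices of $m$ columns $k_1 < \cdots < k_m$, of the product of the corresponding $m \times m$ minors. The helper names and the sample matrices are illustrative assumptions, and indices start at 0.

```python
from itertools import combinations, permutations

def det(a):
    """Determinant by the permutation sum (6.6.5); fine for small matrices."""
    n = len(a)
    total = 0
    for sigma in permutations(range(n)):
        inversions = sum(1 for j in range(n) for k in range(j + 1, n)
                         if sigma[k] < sigma[j])
        term = -1 if inversions % 2 else 1
        for j in range(n):
            term *= a[j][sigma[j]]
        total += term
    return total

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def binet_cauchy_sum(a, b):
    """Sum of products of m x m minors of a (m x n) and b (n x m)."""
    m, n = len(a), len(b)
    total = 0
    for cols in combinations(range(n), m):
        a_minor = [[a[i][k] for k in cols] for i in range(m)]
        b_minor = [[b[k][j] for j in range(m)] for k in cols]
        total += det(a_minor) * det(b_minor)
    return total

a = [[1, 2, 0], [3, -1, 4]]        # 2 x 3
b = [[2, 1], [0, 5], [-1, 3]]      # 3 x 2
lhs = det(matmul(a, b))
rhs = binet_cauchy_sum(a, b)

# Square case (Corollary 6.6.9).
p = [[1, 2], [3, 4]]
q = [[0, 1], [5, -2]]
product_rule_holds = det(matmul(p, q)) == det(p) * det(q)
```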
From the formula (6.6.5) it is very easy to see that the determinant of a square matrix may also be considered as an alternating multilinear functional of the column vectors of the matrix. Indeed, as the permutation $\sigma$ ranges over all permutations of $\langle 1, n\rangle$, $\sigma^{-1}$ also ranges over all permutations. Moreover, if $\sigma(j) = k$, then $\sigma^{-1}(k) = j$, and hence
\[ a_{j\sigma j} = a_{\sigma^{-1}k\, k} \]
and
\[ a_{1\sigma 1} a_{2\sigma 2} \cdots a_{n\sigma n} = a_{\sigma^{-1}1\, 1}\, a_{\sigma^{-1}2\, 2} \cdots a_{\sigma^{-1}n\, n}. \]
Since $\operatorname{sgn} \sigma = \operatorname{sgn} \sigma^{-1}$, it follows that
\[ \det[a_{ij}] = \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{\sigma 1\, 1}\, a_{\sigma 2\, 2} \cdots a_{\sigma n\, n}, \]
which shows that $\det[a_{ij}]$ is an alternating multilinear functional of the columns of $[a_{ij}]$.
Let $[a_{ij}]^t$ designate the transpose of $[a_{ij}]$; that is, if $a^t_{ij}$ are the entries of the transpose of $[a_{ij}]$, then $a^t_{ij} = a_{ji}$. Hence the rows of $[a_{ij}]^t$ are the columns of $[a_{ij}]$. Our previous remarks constitute a proof of the following.
6.6.10 Theorem. If $[a_{ij}]^t$ is a square matrix with entries $a^t_{ij}$ that satisfy $a^t_{ij} = a_{ji}$, then
\[ \det[a_{ij}]^t = \det[a_{ij}] . \tag{6.6.7} \]
The actual computing of a determinant is probably most effectively carried out by the method of expansion by cofactors or minors of the $j$th row (or column). This is a method that reduces the computing of the determinant of an $n \times n$ matrix to the computation of determinants of $(n-1) \times (n-1)$ matrices. Let $[a_{ij}]$ be an $n \times n$ matrix and as usual set
\[ a_j = (a_{j1}, \ldots, a_{jn}) = \sum_{k=1}^{n} a_{jk} e_k . \]
If we set $D(a_1, \ldots, a_n) = \det[a_{ij}]$, then using the multilinearity of $D$ we have
\[ \det[a_{ij}] = \sum_{k=1}^{n} a_{jk}\, D(a_1, \ldots, a_{j-1}, e_k, a_{j+1}, \ldots, a_n), \tag{6.6.8} \]
where, as we have indicated, each $e_k$ occurs in the $j$th position in the $n$-tuple.
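Let us, at first, compute the determinant $D(e_1, a_2, \ldots, a_n)$. Set $b_i = a_i - a_{i1} e_1$, $\forall i \in \langle 2, n\rangle$, and
\[ a'_i = (a_{i2}, \ldots, a_{in}). \]
Clearly $b_i$ is determined as soon as $a'_i$ is given. Now, from Theorem 6.6.5(c),
\[ D(e_1, a_2, a_3, \ldots, a_n) = D(e_1, b_2, a_3, \ldots, a_n). \]
A simple induction argument shows that
\[ D(e_1, a_2, \ldots, a_n) = D(e_1, b_2, \ldots, b_n). \]
The number on the right and hence the number on the left side depends only on $a'_2, \ldots, a'_n$. Let us set
\[ E(a'_2, \ldots, a'_n) = D(e_1, a_2, \ldots, a_n). \]
This defines $E$ as an alternating multilinear functional on the $(n-1)$-fold Cartesian product of $V^{n-1}$. Further,
\[ E(e'_2, \ldots, e'_n) = D(e_1, e_2, \ldots, e_n) = 1, \]
so that $E$ is a determinant function. Thus
\[
D(e_1, a_2, \ldots, a_n) =
\begin{vmatrix}
a_{22} & \cdots & a_{2n} \\
\vdots & & \vdots \\
a_{n2} & \cdots & a_{nn}
\end{vmatrix} .
\tag{6.6.9}
\]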
Suppose now that I :s;;
:s;;
e1 1s m the jth position. Let
n and
<r
be
that permutation of (I, n) onto itself defined by
i
1,
iE(2,j),
iE(j+l,n).
=
A very simple induction argument establishes that
sgn
Thus, from
<r =
(6.6.9),
= ( l)i-1
-
Let us now compute D(ek> a2,

equality
<r
a u-1>n
au+ on
kE (I, n). From the

bn) are the rows of the trans
,an, then
be the same type of permutation as described above,
rr(i) =
Then
au-u2
au+u2
,an), where
(6.6. 7) we know that if (bi.
pose of the matrix with rows ek, a2,
Let
I)i-1
=>i=l ,
1 =>iE(2, k)
i =>iE(k+l,n).
Finally, let us take the transpose of the matrix having rows ( ba1 ,
The new matrix has
e1
brr,.).
in the first row and using (6.6.7) and (6.6.9)
we get
Finally, if
ek is
a2ck-o
a2ck+o
aack-o
aack+o
anck-1>
anck+O
in thejth position, by the use of a permutation in exactly
the same way as used in the case where
e1
was in the jth position, we
find that
D ai.
au
(- I ) i+ k
ek,
.. 'a,.
)
aHk-o
aHk+i>
aln
au-m
au-1xk-o
ac;-O<k+O
au-on
au+o1
au+ock-o
au+i><k+o
a< i+l)n
ant
an<k-0
anck+O
a,.n
ai2
an2
(6.6.10)
The number on the right of (6.6.10) is called the cofactor of $a_{jk}$ and we shall denote it by '$\operatorname{Co}(a_{jk})$.' Note that $\operatorname{Co}(a_{jk})$ is a suitably signed determinant of an $(n-1) \times (n-1)$ matrix. Also note that this $(n-1) \times (n-1)$ matrix is obtained by deleting the $j$th row and the $k$th column from the original matrix.
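The arguments we have given in the last several paragraphs constitute a proof of the following theorem.
6.6.11 Theorem. If $[a_{ij}]$ is any $n \times n$ matrix, then $\forall j \in \langle 1, n\rangle$,
\[ \det[a_{ij}] = \sum_{k=1}^{n} a_{jk} \operatorname{Co}(a_{jk}) = \sum_{k=1}^{n} a_{kj} \operatorname{Co}(a_{kj}). \tag{6.6.11} \]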
In computing the determinant of a matrix, whether by the formula
(6.6.5) or the cofactor formula (6.6.11), it is often very useful to use

Theorem 6.6.5, especially part (c). This says we can multiply any row
by any number and add it to any other row without changing the value
of the determinant. By use of Theorem 6.6.10 it follows that the same
procedure is valid if we operate on the columns of the matrix. This
procedure allows us to compute the determinant of a matrix by com
puting the determinant of another matrix which may have many more
zeros than the original matrix.
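The cofactor expansion (6.6.11) and the row operation of Theorem 6.6.5(c) can be sketched in a few lines. The code is an illustration under stated assumptions: rows and columns are indexed from 0, the cofactor sign is taken as $(-1)^{j+k}$, and the function names and sample matrix are chosen here.

```python
def minor(a, j, k):
    """Matrix obtained by deleting row j and column k."""
    return [row[:k] + row[k + 1:] for i, row in enumerate(a) if i != j]

def det_cofactor(a, j=0):
    """Determinant by cofactor expansion along row j."""
    n = len(a)
    if n == 1:
        return a[0][0]
    return sum((-1) ** (j + k) * a[j][k] * det_cofactor(minor(a, j, k))
               for k in range(n))

def add_multiple_of_row(a, src, dst, factor):
    """Return a copy of a with factor * (row src) added to row dst."""
    out = [row[:] for row in a]
    out[dst] = [out[dst][k] + factor * out[src][k] for k in range(len(out[dst]))]
    return out

m = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]

d_row0 = det_cofactor(m, 0)   # expansion along the first row
d_row2 = det_cofactor(m, 2)   # expansion along the last row

# Adding a multiple of one row to another leaves the determinant unchanged.
m2 = add_multiple_of_row(m, 0, 1, -3)
d_after = det_cofactor(m2)
```

Expanding along a row that contains zeros skips the corresponding minors entirely, which is the practical reason for creating zeros by row operations first.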
Suppose now that $[a_{ij}]$ is an $n \times n$ matrix. Fix the integers $i$ and $j$ with $i \neq j$, and let $[b_{ij}]$ be obtained from $[a_{ij}]$ by replacing the $i$th row of $[a_{ij}]$ by its $j$th row. Then $[b_{ij}]$ has two rows that are equal and hence has a zero determinant. Use formula (6.6.11) to expand by the $i$th row of $[b_{ij}]$ to get (remember that $b_{ik} = a_{jk}$!)
\[ \det[b_{ij}] = \sum_{k=1}^{n} a_{jk} \operatorname{Co}(a_{ik}) = 0. \]
Thus we see that for $\forall i, j \in \langle 1, n\rangle$ we may write
\[ \sum_{k=1}^{n} a_{jk} \operatorname{Co}(a_{ik}) = \delta_{ij} \det[a_{ij}], \tag{6.6.12} \]
where $\delta_{ij} = 0$ if $i \neq j$ & $\delta_{ii} = 1$. Let $\operatorname{adj}[a_{ij}]$ be a matrix whose $(i,j)$ entry is the number $\operatorname{Co}(a_{ji})$. This matrix has classically been called the adjoint of $[a_{ij}]$, although in recent times the word 'adjoint' has been used for other purposes. The formula (6.6.12) shows that
\[ [a_{ij}]\,(\operatorname{adj}[a_{ij}]) = (\det[a_{ij}])\,[\delta_{ij}]. \tag{6.6.13} \]
By working with the columns of $[a_{ij}]$ instead of the rows, the same kind of reasoning leads to the fact that
\[ (\operatorname{adj}[a_{ij}])\,[a_{ij}] = (\det[a_{ij}])\,[\delta_{ij}]. \tag{6.6.13'} \]
The equation (6.6.13') leads to Cramer's rule for solving a system of $n$ linear equations in $n$ unknowns. Let $\{y_1, \ldots, y_n\}$ be a given set of real numbers and suppose $\{x_1, \ldots, x_n\}$ is another set of real numbers so that the matrix equation
\[
[a_{ij}]
\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}
\tag{6.6.14}
\]
is satisfied. If we perform the matrix multiplication and equate corresponding entries on each side, we see that this is the same as a system of $n$ linear equations in $n$ "unknowns."

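If we multiply both sides of the matrix equation (6.6.13') on the right by the column matrix consisting of the $x_k$, we get
\[
(\operatorname{adj}[a_{ij}])
\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}
=
(\det[a_{ij}])
\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} .
\tag{6.6.15}
\]
If we multiply out the left side and equate entries we get
\[ x_k \det[a_{ij}] = \sum_{j=1}^{n} y_j \operatorname{Co}(a_{jk}). \tag{6.6.15'} \]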
By the formula for the expansion of the determinant of a matrix by cofactors of a column, the right side of (6.6.15') is nothing more than the determinant of the matrix $[b_{ij}]_k$ which is obtained from the matrix $[a_{ij}]$ by replacing the $k$th column of the latter matrix by the column consisting of the $y_j$. Now, if $\det[a_{ij}] = 0$, there is nothing more that can be said at this point. However, if $\det[a_{ij}] \neq 0$, from (6.6.15') it follows that
\[ x_k = \frac{\det[b_{ij}]_k}{\det[a_{ij}]} . \tag{6.6.16} \]
This shows that if $\det[a_{ij}] \neq 0$ and the equation (6.6.14) has a solution, the solution must be of the form (6.6.16) which implies the solution is unique.
Conversely, if we suppose $\det[a_{ij}] \neq 0$, then the numbers $x_k$ given by (6.6.16) satisfy the equation (6.6.15') or, what is the same thing, the equation (6.6.15). Multiply both sides of equation (6.6.15) on the left by the matrix $[a_{ij}]$ and then use the relation (6.6.13). This shows that (6.6.14) is satisfied. The formula (6.6.16) is known as Cramer's rule for solving (6.6.14). We can summarize these facts as follows:
6.6.12 Theorem. If $[a_{ij}]$ is an $n \times n$ matrix with $\det[a_{ij}] \neq 0$, and $\{y_1, \ldots, y_n\}$ is any set of $n$ real numbers, then the equation (6.6.14) has a unique solution given by (6.6.16) (Cramer's rule), where $[b_{ij}]_k$ is the matrix obtained from $[a_{ij}]$ by replacing the $k$th column in $[a_{ij}]$ by the column vector $(y_1, \ldots, y_n)$.
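Cramer's rule (6.6.16) can be carried out exactly with rational arithmetic. The following sketch uses Python's `fractions.Fraction` to avoid rounding. The function names and the sample system, which was built here so that its solution is $(1, 2, 3)$, are assumptions for illustration, and the determinant is computed by cofactor expansion along the first row.

```python
from fractions import Fraction

def det(a):
    """Determinant by cofactor expansion along the first row."""
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0
    for k in range(n):
        sub = [row[:k] + row[k + 1:] for row in a[1:]]
        total += (-1) ** k * a[0][k] * det(sub)
    return total

def cramer(a, y):
    """Solve a x = y by (6.6.16); requires det(a) != 0."""
    d = det(a)
    if d == 0:
        raise ValueError("det is zero: Cramer's rule does not apply")
    solution = []
    for k in range(len(a)):
        # b_k: replace the k-th column of a by the column y.
        b_k = [row[:k] + [y[i]] + row[k + 1:] for i, row in enumerate(a)]
        solution.append(Fraction(det(b_k), d))
    return solution

a = [[2, 1, -1],
     [1, 3, 2],
     [1, 0, 1]]
y = [1, 13, 4]
x = cramer(a, y)
residual = [sum(a[i][j] * x[j] for j in range(3)) - y[i] for i in range(3)]
```

For larger systems elimination is far cheaper than computing $n + 1$ determinants; the rule is mainly of theoretical interest, as in the proof of Theorem 6.6.13.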
Suppose now that $I$ is the identity transformation of a linear space onto itself. The matrix representation of $I$ with respect to every ordered basis is $[\delta_{ij}]$, where $\delta_{ij} = 0$ if $i \neq j$, $\delta_{ii} = 1$. If $A$ is a linear transformation with an inverse, $[a_{ij}]$ a matrix representation of $A$ with respect to an ordered basis, and $[a_{ij}]^{-1}$ denotes the matrix representation of $A^{-1}$ with respect to the same ordered basis, then
\[ [a_{ij}]^{-1}[a_{ij}] = [\delta_{ij}]. \]
Since by Corollary 6.6.9, the determinant of a product of square matrices is the product of the determinants of the matrices, it follows that
\[ \det[a_{ij}]^{-1} = (\det[a_{ij}])^{-1}. \]
Let $T$ be a linear transformation of a vector space into itself and $[t_{ij}]$ a matrix representation of $T$ with respect to a given ordered basis. If $[t'_{ij}]$ is the representation of $T$ with respect to another ordered basis, then according to formula (6.5.1) there exists an invertible linear operator $Q$ so that if $[q_{ij}]$ is the matrix representation of $Q$ with respect to the original ordered basis, then
\[ [t'_{ij}] = [q_{ij}]^{-1}[t_{ij}][q_{ij}]. \]
Hence
\[ \det[t'_{ij}] = \det[t_{ij}]. \]
This shows that the determinant of any matrix representation of $T$ has the same value. This fact allows us to define the determinant of a linear transformation (of a linear space into itself) as the determinant of any matrix representation of the linear transformation with respect to an ordered basis.
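6.6.13 Theorem. The linear transformation $T$ is nonsingular $\iff \det T \neq 0$.
Proof. If $T$ has an inverse $T^{-1}$, then $T^{-1} \circ T = I$ and hence
\[ \det T^{-1} \det T = 1. \]
This shows $\det T \neq 0$.
Conversely, suppose $\det T \neq 0$ and $T(x) = 0$. Suppose $(u_1, \ldots, u_n)$ is an ordered basis for $\mathcal{D}(T)$ and
\[ T(u_k) = \sum_{j=1}^{n} t_{jk} u_j , \qquad x = \sum_{k=1}^{n} x^k u_k . \]
Then
\[ T(x) = \sum_{k=1}^{n} x^k T(u_k) = \sum_{j=1}^{n} \Bigl( \sum_{k=1}^{n} t_{jk} x^k \Bigr) u_j = 0. \]
Since the $u_j$ form a linearly independent set, we have a system of linear equations
\[ \sum_{k=1}^{n} t_{jk} x^k = 0, \qquad j \in \langle 1, n\rangle . \]
Since $\det T \neq 0$, Cramer's rule tells us that $\forall k \in \langle 1, n\rangle$, $x^k = 0$. Hence $x = 0$, which shows that $T$ is one to one.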
We shall close this section by proving a theorem that we shall need at one point in Chapter 7.
6.6.14 Theorem. Suppose $T$ is a symmetric linear transformation defined on a linear subspace of dimension $n$ in $E^q$, and $[t_{ij}]$ a matrix representation with respect to an ordered orthonormal basis. Let us set
\[
\Delta_k =
\begin{vmatrix}
t_{11} & t_{12} & \cdots & t_{1k} \\
t_{21} & t_{22} & \cdots & t_{2k} \\
\vdots & \vdots & & \vdots \\
t_{k1} & t_{k2} & \cdots & t_{kk}
\end{vmatrix} .
\tag{6.6.17}
\]
Necessary and sufficient conditions that $\exists m > 0$ so that $\forall x \in \mathcal{D}(T)$
\[ T(x) \cdot x \geq m |x|^2 \tag{6.6.18} \]
are
\[ \Delta_k > 0, \qquad \forall k \in \langle 1, n\rangle . \tag{6.6.19} \]
Proof. Let us first prove the necessity. Suppose that $[t_{ij}]$ is the matrix representation with respect to the ordered orthonormal basis $(u_1, \ldots, u_n)$. Then the matrix $[t_{ij}]$ itself is symmetric; that is, $t_{ij} = t_{ji}$. The condition (6.6.18) means certainly that all the eigenvalues of $T$ are bounded below by $m$. If $\{\lambda_k : k \in \langle 1, n\rangle\}$ are the eigenvalues of $T$, then since $\det T$ is independent of any matrix representation, we have
\[ \Delta_n = \det T = \prod_{k=1}^{n} \lambda_k \geq m^n > 0. \]
Now, $\forall k \in \langle 1, n\rangle$, let $T_k$ be the symmetric linear transformation acting on the space generated by $(u_1, \ldots, u_k)$ and given by
\[ T_k(u_j) = \sum_{i=1}^{k} t_{ij} u_i , \qquad j \in \langle 1, k\rangle . \]
The symmetry of $T_k$ comes from the symmetry of $[t_{ij}]$ or from the easily established fact that $\forall x, y \in \mathcal{D}(T_k)$,
\[ T_k(x) \cdot y = T(x) \cdot y = x \cdot T(y) = x \cdot T_k(y). \]
Replacing $y$ by $x$ in the above equalities we see that the inequality (6.6.18) is satisfied for $T_k$. Thus arguing in exactly the same way as before, we get that
\[ \Delta_k = \det T_k \geq m^k > 0. \]
The proof of the sufficiency will proceed by induction. If
n=1, it
is clear that the inequality (6.6.19) implies the inequality (6.6.18).

Assume now that the sufficiency is true for
n=p
1 < q. Let
T be a
symmetric transformation on a linear subspace of dimension p in Eq,

and suppose the inequalities (6.6.19) are satisfied
VkE (l,p). By
repeated application of Theorem 6.6.5(c) on the addition of multiples

of a row of a determinant to other rows, and the use of Theorem 6.6.10,
which allows us to do the same thing for columns, we get
lip
l11
l11
S22
S2p
Sp2
Spp
Lip =
lpp
lpl
where,sincelrs=lsr.
Vr,s E(l,p),
Note that, since Li1 =t 11 > 0, we can divide by t11 Further, if
VkE (2,p) ,
rk-1 =
it follows that
VkE(2,p).
Thus
(6.6 2 0)
VkE(l,p-1).
Let S be the symmetric transformation, with domain the linear space

generated by (u2,
,up), and defined by

p
Sui =
L s u ui
i=2
j E(2,p).
>
From (6.6.2 0) and the inductive hypothesis it follows that
Vx E .:B(S),
o/:- 0, we have
S(x)
> 0.
Let us now set
kE(2,p).
It is an easy matter to check that ( u1, u2,
and we leave this for the reader. Now,
T(u;)
u;) is a basis for .:B(T),
Vj,kE (2,p),
t it k
ufc=tki- i t =S(ui)
t11
Further, VkE(2, p),
T(u 1)
Suppose now that
u;,
u1 T(u;,) = 0.
uk.
Then, since
u1
T(uk)
T(x) x
1
x u1
k =2
0, we have, if
xkuk,
1
t11(x )2
t11(x1)2
1
t11(x )2
+ S(x')
#- 0,
f f xixkT(uj)
p
L L
k=2 j=2
xixkS(u;)
x '>
ufc
uk
0.
In other words, $\forall x \in \mathcal{D}(T)$, $x \neq 0$, we have $T(x) \cdot x > 0$.
The function with values $T(x) \cdot x$ is continuous. If we restrict it to the compact set $K = \{x : |x| = 1\ \&\ x \in \mathcal{D}(T)\}$, then since $T(x) \cdot x > 0$, $\exists m > 0$, so that $x \in K \implies T(x) \cdot x \geq m$. If $x \neq 0$, $x \in \mathcal{D}(T)$, then $x/|x| \in K$ and thus
\[ T(x) \cdot x \geq m |x|^2 . \]
If $x = 0$, this inequality is obviously fulfilled. Thus the induction is established and the theorem is proved.
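The criterion of Theorem 6.6.14 is easy to apply to a concrete symmetric matrix. The sketch below computes the determinants $\Delta_k$ of (6.6.17) and tests the conditions (6.6.19). The function names and the two sample matrices are assumptions for illustration, and the determinant is computed by cofactor expansion along the first row.

```python
def det(a):
    """Determinant by cofactor expansion along the first row."""
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0
    for k in range(n):
        sub = [row[:k] + row[k + 1:] for row in a[1:]]
        total += (-1) ** k * a[0][k] * det(sub)
    return total

def leading_minors(t):
    """The numbers Delta_1, ..., Delta_n of (6.6.17)."""
    return [det([row[:k] for row in t[:k]]) for k in range(1, len(t) + 1)]

def satisfies_6619(t):
    """True when every Delta_k is positive, i.e. (6.6.19) holds."""
    return all(d > 0 for d in leading_minors(t))

def quadratic_form(t, x):
    """T(x) . x for the symmetric matrix t."""
    n = len(t)
    return sum(t[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

positive = [[2, -1, 0],
            [-1, 2, -1],
            [0, -1, 2]]
indefinite = [[1, 2],
              [2, 1]]

minors_positive = leading_minors(positive)
minors_indefinite = leading_minors(indefinite)
negative_value = quadratic_form(indefinite, [1, -1])
```

For the second matrix the condition fails at $k = 2$, and the vector $(1, -1)$ exhibits a negative value of $T(x) \cdot x$, in line with the theorem.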
6.6.15 Corollary. Suppose $T$, $[t_{ij}]$, and $\Delta_k$ are as in the last theorem.
Then necessary and sufficient conditions that $\exists m > 0$ so that $\forall x \in \mathcal{D}(T)$
$$T(x) \cdot x \le -m\,|x|^2 \qquad (6.6.21)$$
are
$$(-1)^k \Delta_k > 0, \qquad \forall k \in (1,n). \qquad (6.6.22)$$
In case $\det T = \Delta_n \neq 0$, but the conditions (6.6.19) do not hold and the
conditions (6.6.22) do not hold, then $\exists x$ so that $T(x) \cdot x > 0$ and $\exists y$ so that
$T(y) \cdot y < 0$.
Proof. If we set $S = -T$ and $\Delta'_k$ the determinant in (6.6.17) for the
transformation $S$, then
$$\Delta'_k = (-1)^k \Delta_k.$$
The first statement in the corollary is now an immediate consequence
of the previous theorem.
To prove the second statement we first note that since $\Delta_n \neq 0$, zero
is not an eigenvalue of $T$. If the conditions (6.6.19) do not hold and
the conditions (6.6.22) do not hold, then $T$ must have at least one
positive and one negative eigenvalue. Taking $x$ a nonzero eigenvector for
the positive eigenvalue and $y$ a nonzero eigenvector for the negative
eigenvalue completes the proof.
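The sign tests on the determinants can be tried on small symmetric matrices. What follows is only a minimal sketch, assuming that the determinants in (6.6.17) are those of the upper-left k x k blocks of the matrix; the helper names `det`, `leading_minors`, `classify`, and `quadratic_form` are ours, not the text's.

```python
from fractions import Fraction

def det(m):
    """Determinant by Gaussian elimination over exact rationals."""
    a = [[Fraction(x) for x in row] for row in m]
    n = len(a)
    sign = 1
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign  # a row interchange changes the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return sign * result

def leading_minors(t):
    """Determinants of the upper-left k x k blocks, k = 1..n (assumed meaning of (6.6.17))."""
    return [det([row[:k] for row in t[:k]]) for k in range(1, len(t) + 1)]

def classify(t):
    minors = leading_minors(t)
    if all(d > 0 for d in minors):
        return "positive definite"
    if all((-1) ** k * d > 0 for k, d in enumerate(minors, start=1)):
        return "negative definite"
    if minors[-1] != 0:
        return "takes both signs"
    return "criterion does not apply"

def quadratic_form(t, x):
    """The number T(x) . x for the matrix t and the coordinate vector x."""
    n = len(t)
    return sum(t[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
```

For the matrix with rows (1, 2) and (2, 1), neither set of sign conditions holds and the determinant is nonzero; the vectors (1, 1) and (1, -1) then give values of the quadratic form of opposite sign, as the last statement of the corollary predicts.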
D Exercises
1.
Compute the following determinants:

(a)
(c)
-1
I 1
-2
-2
-1
-1
1
1
3
(b)
( d)
-3
-1
a
a
1
c
2
c
2. Use Cramer's rule to obtain solutions (if possible) to the following
systems of linear equations:
(a)
2x1 + 3x2 - 5x3 = 3
 x1 - 2x2 +  x3 = 0
3x1 +  x2 + 3x3 = 0.
(b)
2x1 + 4x2 - 3x3 = 3
3x1 - 8x2 + 6x3 =
4x1 + 8x2 - 6x3 = 2.
3. Let $D$ be an alternating multilinear functional on the n-fold
Cartesian product of $V^n$. Show that
$$D(a_1, \ldots, a_n) = \det [a_{ij}]\; D(e_1, \ldots, e_n).$$
4. Let $\sigma$ be a permutation of $(1,n)$ that interchanges only two
elements; that is, $\sigma(j) = k$, $\sigma(k) = j$, and $\forall i$, $i \neq j, k$, $\sigma(i) = i$. Assuming
$j < k$, give all the details of the proof that $\operatorname{sgn} \sigma = -1$.
5. Let $[a_{ij}]$ be an $n \times n$ matrix with
$$a_{ij} = 0 \quad \text{if } i > j.$$
Prove that
$$\det [a_{ij}] = \prod_{k=1}^{n} a_{kk}.$$
6. If $U$ is an orthogonal linear transformation from an n-dimensional
real vector space onto itself show that $|\det U| = 1$. Give an
example of an orthogonal linear transformation for which the determinant
is $-1$ and an example for which the determinant is $1$.
3
Suppose that A is a linear transformation of V onto itself whose

matrix representation with respect to a given ordered basis is
7.
-3
1
-2
-n
Compute the matrix representation of $A^{-1}$ with respect to the same
ordered basis.

8. A linear transformation $A$ is called skew-symmetric $\iff A^{t} = -A$.
If $A$ acts on an n-dimensional space into itself, and $n$ is odd, prove
that $\det A = 0$.
9. Suppose that we have an $(r + s) \times (r + s)$ matrix of the form
$$\begin{bmatrix} A & B \\ 0 & C \end{bmatrix},$$
where $A$ is an $r \times r$ matrix, $B$ is an $r \times s$ matrix, $C$ is an $s \times s$ matrix,
and the lower left-hand corner is an $s \times r$ zero matrix. Show that
$$\det \begin{bmatrix} A & B \\ 0 & C \end{bmatrix} = (\det A)(\det C).$$
(Hint: Fix $A$ and $B$ and set
$$D(A, B, C) = \det \begin{bmatrix} A & B \\ 0 & C \end{bmatrix}.$$
Show that this is an alternating multilinear functional on the s-fold
Cartesian product of $V^n$. Hence by Exercise 3
$$D(A, B, C) = (\det C)\, D(A, B, I),$$
where $I$ is the identity matrix $[\delta_{ij}]$. Next show that by subtracting
multiples of the rows of $I$ from the rows of $B$ we get
$$D(A, B, I) = D(A, 0, I),$$
and the right side is an alternating multilinear functional of the rows
of $A$.)
10. Show that the matrix equation
$$\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ a_{21} & \cdots & a_{2n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$
has a nonzero solution $x = (x_1, \ldots, x_n)$ $\iff \det [a_{ij}] = 0$.
11. Define a submatrix of a given matrix as one obtained by striking
out rows and/or columns from the original matrix. Let $T$ be a linear
transformation and $[t_{ij}]$ a matrix representation with respect to any
ordered pair of ordered bases. Show that rank $T = p$ $\iff$ there exists a
$p \times p$ submatrix of $[t_{ij}]$ with nonvanishing determinant, and any
submatrix with more rows and columns has a vanishing determinant.
6.7 FUNCTION SPACES
Let $K$ be any set in $E^n$ and designate by $C_m(K)$ the collection of all
bounded continuous functions with domain $K$ and range in $E^m$. As usual,
we shall define the sum of two functions in $C_m(K)$ by the equality
$$(f + g)(x) = f(x) + g(x), \qquad \forall x \in K,$$
and $\forall \alpha \in R$ we define
$$(\alpha f)(x) = \alpha f(x), \qquad \forall x \in K.$$
With these definitions $C_m(K)$ becomes a real vector space as defined in
Section 6.1. However, this vector space is finite-dimensional if and only
if $K$ has a finite number of points (Exercise 1).
We can put a norm on the elements of $C_m(K)$ by defining
$$\|f\| = \sup\{|f(x)| : x \in K\}.$$
It is clear that
$$\|f\| = 0 \iff f = 0.$$
Also, it is almost immediate that $\forall \alpha \in R$ and $\forall f \in C_m(K)$,
$$\|\alpha f\| = |\alpha|\,\|f\|,$$
and $\forall f, g \in C_m(K)$ the triangle inequality
$$\|f + g\| \le \|f\| + \|g\|$$
is valid. The distance between $f$ and $g$ is taken as the number $\|f - g\|$.
Using this definition of distance it is clear how to define a Cauchy
sequence in Cm(K). In Theorem 3.4.5 we proved that a uniformly
Cauchy function sequence is uniformly convergent. This was done for
function sequences whose elements were real-valued functions with a
common domain in R. However, the proof carries over mutatis mutandis
to Cauchy sequences in Cm(K). In Theorem 3.4.6 we proved that a
uniformly convergent sequence of continuous functions converges to a
continuous function. This also carries over to Cm(K). Thus we have the
following fact.
6.7.1 Proposition. The space $C_m(K)$ is complete in the sense that every
Cauchy sequence with range in $C_m(K)$ converges to an element of this space.
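The norm and the distance just introduced can be explored numerically. The following is a rough sketch for real-valued functions (the case m = 1) on K = [0, 1], assuming that a maximum over a finite sample of K is an acceptable stand-in for the supremum; the names `sup_norm`, `distance`, and the two sample functions are illustrative only.

```python
import math

def sup_norm(f, points):
    """Approximate ||f|| = sup{|f(x)| : x in K} by a maximum over sample points of K."""
    return max(abs(f(x)) for x in points)

def distance(f, g, points):
    """Approximate the distance ||f - g|| between two functions."""
    return sup_norm(lambda x: f(x) - g(x), points)

# Sample points of K = [0, 1]; the sampled maximum can only underestimate the supremum.
K = [i / 1000 for i in range(1001)]

def f(x):
    return math.sin(2 * math.pi * x)

def g(x):
    return x * x

norm_f = sup_norm(f, K)
norm_g = sup_norm(g, K)
norm_sum = sup_norm(lambda x: f(x) + g(x), K)
```

On the sample, `norm_sum` does not exceed `norm_f + norm_g`, which is the triangle inequality for the sampled norm.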
An open ball in $C_m(K)$ can be defined in the same way as for Euclidean
space: the set
$$B(g, \rho) = \{f : f \in C_m(K)\ \&\ \|f - g\| < \rho\}$$
is said to be the open ball in $C_m(K)$ with center $g$ and radius $\rho$. Using
the concept of open ball, an open set in Cm(K) is a set for which every
point is the center of an open ball contained completely in the set.

Thus we are led to the idea of an open covering for a set in Cm(K).
Also, the idea of an open ball or an open set can be used to formulate
the concepts of accumulation point and closure in the obvious way.
It is probably clear from the discussion of the last paragraph on open
sets and open coverings that we are interested in the concept of
compactness for subsets of Cm(K). We saw in Chapter 2 and again in Section
6.3 of this chapter the importance of the Heine-Borel theorem.
Unfortunately, it is no longer true that a closed and bounded set in Cm(K)
has the Heine-Borel property. Since the Heine-Borel property seems
to be the more important property we define compactness in terms of it.
6.7.2 Definition. A set $A \subset C_m(K)$ is said to be compact $\iff$ in every
open covering for $A$ there are a finite number of sets which cover $A$.
There is a covering concept in terms of open balls which as we shall

see is essentially the same as the Heine-Borel property, but is much
easier to use in some contexts. We first give a definition.
6.7.3 Definition. A set $A \subset C_m(K)$ is said to be totally bounded $\iff$
$\forall \epsilon > 0$, there exists a finite set $\{g_k : k \in (1,p)\} \subset A$ so that
$$A \subset \bigcup \{B(g_k, \epsilon) : k \in (1,p)\}.$$
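For a finite family of functions the finite set of centers in this definition can be produced by a simple greedy selection. The sketch below is only an illustration under our own assumptions: the family consists of the functions x -> a x on [0, 1] for a = 0, 0.01, ..., 1, and the supremum is replaced by a maximum over sample points; `sup_distance`, `greedy_net`, and `make_line` are hypothetical helper names.

```python
def sup_distance(f, g, points):
    """Sampled version of the distance ||f - g||."""
    return max(abs(f(x) - g(x)) for x in points)

def greedy_net(family, eps, points):
    """Pick centers from the family so that every member lies within eps of some center."""
    centers = []
    for f in family:
        # f becomes a new center only if no existing ball of radius eps contains it.
        if all(sup_distance(f, g, points) >= eps for g in centers):
            centers.append(f)
    return centers

def make_line(a):
    return lambda x: a * x

K = [i / 100 for i in range(101)]
family = [make_line(i / 100) for i in range(101)]
eps = 0.25
centers = greedy_net(family, eps, K)
```

Every member of `family` then lies in one of the balls of radius `eps` about the chosen centers, and the centers belong to the family itself, as the definition requires.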

It is very easy to show that a totally bounded set is bounded and we
leave this for the reader (Exercise 2). A more important fact is the
connection between total boundedness and compactness. To establish
the connection it seems to be necessary to use the axiom of choice that
we enunciated in Section 2.1. Recall that we agreed to use the symbol
(AC) in front of a statement that required this axiom.
6.7.4 Theorem. A compact set in $C_m(K)$ is closed and totally bounded.
(AC) Conversely every closed and totally bounded set in $C_m(K)$ is compact.
Proof. Let us first prove that every compact set is totally bounded.
Suppose $A$ is compact; then $\forall \epsilon > 0$, $\{B(g, \epsilon) : g \in A\}$ is an open
covering for $A$. By the definition of compactness, this open covering reduces
to a finite subcover, which is just the definition of total boundedness.
To show $A$ is closed, let $g \in A^c$, and write $\bar{B}(g, \rho) = \{f : \|f - g\| \le \rho\}$
for the closed ball. Since every element in $A$ has a positive
distance from $g$, the collection of open sets $\{\bar{B}(g, 1/k)^c : k \in N\}$ is a
covering for $A$. By compactness there are a finite number of these sets
$\{\bar{B}(g, 1/k_j)^c : j \in (1,p)\}$ which cover $A$. Let $\rho = \min \{1/k_j : j \in (1,p)\}$;
then $\bar{B}(g, \rho)^c$ covers $A$ and thus $B(g, \rho) \subset A^c$. This shows $A^c$ is open and
hence $A$ is closed.
To prove the converse statement we shall first proceed in a way in
which it may not be completely clear just how the axiom of choice is
being used. We are doing this so that the idea of the proof does not
become obscured with formalities; we shall take care of the formalities
later on.
It will make technical matters slightly simpler if we restrict the distance
function on $C_m(K) \times C_m(K)$ to $A \times A$ and forget about the ambient
space $C_m(K)$ altogether. Thus when we speak about a ball or an open
set we shall mean a ball in $A$ or an open set in $A$ with respect to the
given distance function. Since, by hypothesis, $A$ is closed and $C_m(K)$
is complete, it follows that $A$ is complete; that is, every Cauchy sequence
with elements in $A$ converges to an element in $A$.
Suppose now that $\mathcal{U}$ is an open covering for $A$ but no finite subset of
$\mathcal{U}$ covers $A$. Since $A$ is totally bounded, a finite number of balls of radius
1 covers $A$. Hence there must be one of these balls, say $B_0$, so that no
finite subset of $\mathcal{U}$ covers $B_0$. Now, $B_0$ (being a subset of $A$) is totally
bounded. Indeed, $\forall \epsilon > 0$ we may cover $A$ by a finite number of balls
of radius $\epsilon/2$ and hence a finite number of these balls covers $B_0$. The
only question is whether these balls have their centers in $B_0$. However,
if a ball of radius $\epsilon/2$ has a nonvoid intersection with $B_0$ we can put a
ball of radius $\epsilon$ around a point of this intersection and so get a finite
set of balls with centers in $B_0$ and radius $\epsilon$ which covers it.
Since $B_0$ is totally bounded and covered by $\mathcal{U}$ but no finite subset of
$\mathcal{U}$ covers $B_0$, we may repeat the argument given above and find a ball
$B_1$ in $A$ of radius $1/2$ with center in $B_0$ so that no finite subset of $\mathcal{U}$
covers $B_1$. Proceeding in this way (a rather vague statement!) we get
a sequence $(B_n)$ of balls with the radius of $B_n$ being $1/2^n$, the center of
$B_n$ is in $B_{n-1}$ and no finite subset of $\mathcal{U}$ covers $B_n$. Let $g_n$ be the center of
$B_n$. Then for $n > m$ we have
$$\|g_n - g_m\| \le \sum_{k=m}^{n-1} \|g_{k+1} - g_k\| < \frac{1}{2^{m-1}}.$$
Thus the sequence $(g_k)$ is Cauchy, and since $A$ is complete, it follows
that $\exists g \in A$, so that $g_k \to g$. Since $\mathcal{U}$ is an open covering for $A$,
$\exists U \in \mathcal{U}$ so that $g \in U$. However, $\exists N$ so that $n \ge N \Rightarrow g_n \in U$ and
$B_n \subset U$. This is a contradiction, since no finite subset of $\mathcal{U}$ covers $B_n$.
The part of the proof that requires the axiom of choice is the rigorous
establishment of the existence of the sequence $(B_n)$. Let $R$ be the
relation consisting of all ordered pairs $(B(g, 1/2^n), B(h, 1/2^{n+1}))$, where $n$
ranges over $N_0$, $h \in B(g, 1/2^n)$ and each of the balls $B(g, 1/2^n)$ and
$B(h, 1/2^{n+1})$ cannot be covered by any finite subset of $\mathcal{U}$. If we suppose
that $A$ is totally bounded but cannot be covered by any finite subset of
$\mathcal{U}$, then $R$ is nonvoid and indeed $\forall n \in N_0$, $R$ contains an ordered pair
$(B(g, 1/2^n), B(h, 1/2^{n+1}))$. Now, $\forall \xi \in \mathcal{D}(R)$, put
$$R_\xi = \{\eta : (\xi, \eta) \in R\}.$$
The set $R_\xi$ is nonvoid. Thus, using the axiom of choice, there exists a
function $F$ with $\mathcal{D}(F) = \mathcal{D}(R)$ and $F(\xi) \in R_\xi$.
Using the function $F$ we can establish the existence of the sequence
$(B_n)$ by means of the axiom of induction. Let $B(g_0, 1)$ be a ball in $A$
which cannot be covered by any finite set from $\mathcal{U}$. We leave to the reader
the simple induction argument which establishes that $\forall n \in N_0$, there
exists a unique function $G_n$ with domain $(0, n)$ so that
$$G_n(0) = B(g_0, 1), \qquad G_n(k+1) = F(G_n(k)), \quad k \in (0, n-1),\ n \ge 1.$$
Now, $\forall n \in N_0$, take $G(n) = G_n(n)$ and it is the function $G$ that is the
sequence $(B_n)$.
Another useful equivalence with compactness in $C_m(K)$ is given by
the following theorem.
6.7.5 Theorem. If $A \subset C_m(K)$ is compact, then every sequence with
range in $A$ has a subsequence that converges to an element of $A$. (AC)
Conversely, if every sequence with range in $A$ has a subsequence that converges to
an element of $A$, then $A$ is compact.
Proof. Suppose $A$ is compact and $(g_n)$ is a sequence with range in $A$.
If no subsequence of $(g_n)$ converges to a point of $A$, then $\forall g \in A$,
$\exists \rho_g$ so that the set $\{k : g_k \in B(g, \rho_g)\}$ is finite. To prove this we assume
to the contrary that $\exists g \in A$ so that $\forall \rho > 0$, $\{k : g_k \in B(g, \rho)\}$ is infinite.
Let $A_{n-1} = \{k : g_k \in B(g, 1/n)\}$, $n \in N$. Using the axiom of induction,
it is easy to establish that $\forall n \in N_0$, there exists a unique function $\varphi_n$
with domain $(0, n)$ so that $\varphi_n(0) = \min A_0$ and $\forall k \in (1, n)$
$$\varphi_n(k) = \min \{A_k \setminus \{\varphi_n(0), \ldots, \varphi_n(k-1)\}\}.$$
Note that $A_k \setminus \{\varphi_n(0), \ldots, \varphi_n(k-1)\}$ is nonvoid, since $A_k$ is infinite.
Further, since $\forall n \in N_0$, $A_{n+1} \subset A_n$, it follows that $\varphi_n(n) < \varphi_{n+1}(n+1)$.
Now let $\varphi$ be that function with domain $N_0$ so that $\varphi(n) = \varphi_n(n)$. Then
$(g_{\varphi(n)})$ is a subsequence of $(g_n)$ that converges to $g$. This is a contradiction.
The collection $\{B(g, \rho_g) : g \in A\}$ is an open covering for $A$ and thus
reduces to a finite subcover. By what we have proved above, this means
that $\{k : g_k \in A\}$ is a finite set, which is a contradiction.
Conversely, suppose every sequence with range in A has a subsequence
that converges to a point of A. The first conclusion that can be drawn
from this is that A is closed. For if g is an accumulation point of A, using
the axiom of choice we can pick a sequence from the collection of balls
{B(g, l/n): n E N} which converges to g. But since this sequence
contains a subsequence that converges to an element of A, we must have
g EA.
If the set $A$ is not totally bounded, then $\exists \epsilon > 0$ so that no finite set of
balls of radius $\epsilon$ covers $A$. Pick $g_0 \in A$, and since $B(g_0, \epsilon)$ does not
cover $A$ we may pick $g_1 \in A \cap B(g_0, \epsilon)^c$. Since $B(g_0, \epsilon)$ and $B(g_1, \epsilon)$
do not cover $A$, we may pick $g_2$ in $A$ and outside of $B(g_0, \epsilon) \cup B(g_1, \epsilon)$.
Proceeding in this way we get a sequence $(g_k)$ with range in $A$ so that
$\forall n, m \in N_0$ with $n \neq m$, $\|g_n - g_m\| \ge \epsilon$. Hence no subsequence of $(g_n)$ is Cauchy
and hence no subsequence can converge. This is a contradiction.

Of course, the proof of the existence of the sequence $(g_k)$ of the last
paragraph requires the axiom of choice. The method of procedure
is similar to that used in the proof of Theorem 6.7.4. We are going to
leave it to the reader as an exercise to make this precise.
The last two theorems we have proved can be carried over to very
much more general situations than what we have considered here:
they can be carried over to metric spaces. A metric space is a set $X$
together with a distance function $d$ defined on $X \times X$ which satisfies the
following:
(a) $d(x, y) \ge 0$, $\forall x, y \in X$.
(b) $d(x, y) = 0 \iff x = y$.
(c) $d(x, y) = d(y, x)$, $\forall x, y \in X$.
(d) $d(x, y) \le d(x, z) + d(z, y)$, $\forall x, y, z \in X$.
We leave it to the reader to formulate the relevant concepts in this
context and to convince himself that the last two theorems can be
stated and proved in essentially the same way as for $C_m(K)$.
THE ARZELA-ASCOLI THEOREM
The compactness of a set in $C_m(K)$ is not equivalent with the set being
closed and bounded. However, if $K$ is compact in $E^n$ we get equivalence
(using the axiom of choice) if we add a third condition to "closed and
bounded." The third condition is equicontinuity, which we now formulate.
6.7.6 Definition. A set $A \subset C_m(K)$ is said to be equicontinuous $\iff$
$\forall \epsilon > 0$, $\exists \delta > 0$ so that $\forall x, y \in K$ with $|x - y| < \delta$ and $\forall f \in A$,
$$|f(x) - f(y)| < \epsilon.$$
Note that every element of an equicontinuous set is uniformly
continuous. A simple example which shows that there may be closed and
bounded sets in $C_m(K)$ which are not compact is provided by the
following: Let $K = [0, 1]$ and
$$A = \{f : f \in C_1(K)\ \&\ \|f\| \le 1\}.$$
The set $A$ is clearly closed and bounded. However, the functions defined by
$$f_n(x) = x^n$$
are in $A$, but no subsequence converges to an element of $A$.
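One way to see why no subsequence of the powers of x can converge in the norm is to observe that the distance between the n-th and the 2n-th power never shrinks: with u equal to the n-th power of x, the difference is u - u^2, whose largest value on [0, 1] is 1/4. The sketch below checks this on a grid; the function name `gap` and the grid size are our own choices, and the grid maximum slightly underestimates the supremum.

```python
def gap(n, samples=10001):
    """Sampled value of ||f_n - f_2n|| on [0, 1], where f_n(x) = x**n."""
    points = [i / (samples - 1) for i in range(samples)]
    return max(abs(x ** n - x ** (2 * n)) for x in points)

gaps = {n: gap(n) for n in (1, 5, 50)}
```

Each value in `gaps` is close to 1/4, so a sequence containing both the n-th and the 2n-th power for arbitrarily large n cannot be Cauchy. This is only one heuristic route to the claim in the text; a convergent subsequence is also ruled out because its pointwise limit would be discontinuous at x = 1.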
6.7.7 Theorem. Suppose $K$ is a compact set in $E^n$ and $A \subset C_m(K)$.
If $A$ is compact, then $A$ is closed, bounded, and equicontinuous. (AC) Conversely,
if $A$ is closed, bounded, and equicontinuous, then $A$ is compact.
Proof. If $A$ is compact, then by Theorem 6.7.4 it follows that $A$
is closed and totally bounded, and hence, a fortiori, $A$ is bounded. Since
$A$ is totally bounded, $\forall \epsilon > 0$, there exists a finite set of open balls
$\{B(g_k, \epsilon/3) : k \in (1, p)\}$ which covers $A$. Each $g_k$ is in $A$, and since $K$
is compact each $g_k$ is uniformly continuous. Thus $\exists \delta > 0$, so that
$\forall x, y \in K$ with $|x - y| < \delta$ and $\forall k \in (1, p)$,
$$|g_k(x) - g_k(y)| < \epsilon/3.$$
Now, $\forall f \in A$, $\exists k \in (1, p)$ so that $f \in B(g_k, \epsilon/3)$. Thus $\forall x, y \in K$
with $|x - y| < \delta$, we have
$$|f(x) - f(y)| \le |f(x) - g_k(x)| + |g_k(x) - g_k(y)| + |g_k(y) - f(y)| < \epsilon.$$
Hence $A$ is equicontinuous.
To prove the converse statement we shall prove that $A$ is totally
bounded. Since $A$ is closed it will follow by Theorem 6.7.4 that $A$ is
compact. Note that the axiom of choice is needed, since it is needed
in part of the proof of Theorem 6.7.4.
Since $A$ is equicontinuous, $\forall \epsilon > 0$, $\exists \delta > 0$ so that $\forall x, y \in K$ with
$|x - y| < \delta$ and $\forall f \in A$ we have $|f(x) - f(y)| < \epsilon/4$. Since $K$ is
compact, it is totally bounded and hence there is a finite set of balls
$\{B(x_k, \delta) : k \in (1, p)\}$ which covers $K$. Since $A$ is bounded, each set
$$A_k = \{f(x_k) : f \in A\}, \qquad k \in (1, p),$$
is bounded, and thus totally bounded. Hence there is a finite set of
balls $\{B(f_i(x_k), \epsilon/4) : i \in (1, p_k)\}$ which covers $A_k$.
Suppose, for the moment, that we are able to find a set $\{h_k : k \in (1, p)\}$
of real-valued nonnegative continuous functions on $K$ so
that $\forall x \in B(x_k, \delta)^c \cap K$, $h_k(x) = 0$, and $\forall x \in K$,
$$\sum_{k=1}^{p} h_k(x) = 1.$$
Let $j = (j_1, \ldots, j_p)$, where $j_k \in (1, p_k)$, and $\forall x \in K$ set
$$g_j(x) = \sum_{k=1}^{p} h_k(x)\, f_{j_k}(x_k).$$
As $j$ ranges over the set of vectors we have indicated above, we get a
finite set of functions each of which belongs to $C_m(K)$.
Now, $\forall f \in A$, $\exists j = (j_1, \ldots, j_p)$ so that $\forall k \in (1, p)$,
$|f(x_k) - f_{j_k}(x_k)| < \epsilon/4$. This simply follows from the fact that
$\{B(f_i(x_k), \epsilon/4) : i \in (1, p_k)\}$ covers $A_k$. Further, by the equicontinuity of $A$,
$\forall x \in B(x_k, \delta)$, $|f(x) - f(x_k)| < \epsilon/4$, and thus $|f(x) - f_{j_k}(x_k)| < \epsilon/2$.
Since $\forall x \in K$,
$$f(x) = \sum_{k=1}^{p} h_k(x)\, f(x),$$
we get
$$|g_j(x) - f(x)| \le \sum_{k=1}^{p} h_k(x)\, |f_{j_k}(x_k) - f(x)|.$$
For every $x \in K$, let us set $N(x) = \{k : x \in B(x_k, \delta)\}$. For every
$k \notin N(x)$, $h_k(x) = 0$, and $\forall k \in N(x)$, $|f_{j_k}(x_k) - f(x)| < \epsilon/2$. Thus $\forall x \in K$,
$$|g_j(x) - f(x)| < (\epsilon/2) \sum_{k \in N(x)} h_k(x) = \epsilon/2.$$
This shows that $A$ can be covered by a finite set of balls of radius $\epsilon/2$.
From this we conclude that $A$ is totally bounded. The only small point
to be clarified is that the centers of the balls may not lie in $A$. However,
as in the proof of Theorem 6.7.4, for each of these balls which has a
nonvoid intersection with $A$ we choose a point in this intersection and
take the ball with the chosen point as center and radius $\epsilon$. This gives
a finite set of balls with centers in $A$ and radius $\epsilon$ which cover $A$.
To complete the proof we must show the existence of the set
$\{h_k : k \in (1, p)\}$. Let $\varphi$ be that function on $E^n$ defined by
$$\varphi(x) = \begin{cases} 1 - |x| & \iff |x| < 1, \\ 0 & \iff |x| \ge 1. \end{cases}$$
The function $\varphi$ is clearly continuous. Now define $\varphi_k$ with domain
$E^n$, by setting
$$\varphi_k(x) = \varphi((x - x_k)/\delta).$$
The function $\varphi_k$ is clearly continuous, $\forall x \in B(x_k, \delta)$ we have $\varphi_k(x) > 0$,
and $\forall x \in B(x_k, \delta)^c$, $\varphi_k(x) = 0$.
If we set
$$B = \bigcup \{B(x_k, \delta) : k \in (1, p)\},$$
then $\forall x \in B$,
$$\Phi(x) = \sum_{k=1}^{p} \varphi_k(x) \neq 0.$$
Let us define $h_k$ on $K$ by putting $\forall x \in K$
$$h_k(x) = \varphi_k(x)/\Phi(x).$$
We see immediately that this set of functions has the required properties.
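The construction of these functions is easy to try in one dimension. The sketch below follows the same recipe (a tent function, rescaled copies centered at the points of a covering, then division by their sum) for K = [0, 1]; the particular centers, the radius 0.2, and the names `phi` and `make_partition` are assumptions made for the illustration.

```python
def phi(x):
    """Tent function: 1 - |x| for |x| < 1 and 0 otherwise."""
    return max(0.0, 1.0 - abs(x))

def make_partition(centers, delta):
    """Return h with h(k, x) = phi_k(x) / Phi(x), where phi_k(x) = phi((x - x_k) / delta)."""
    def phi_k(k, x):
        return phi((x - centers[k]) / delta)

    def total(x):
        # Phi(x) is positive wherever x lies in at least one ball B(x_k, delta).
        return sum(phi_k(k, x) for k in range(len(centers)))

    def h(k, x):
        return phi_k(k, x) / total(x)

    return h

# Balls of radius 0.2 about these centers cover [0, 1] because the spacing is 0.25.
centers = [0.0, 0.25, 0.5, 0.75, 1.0]
delta = 0.2
h = make_partition(centers, delta)

grid = [i / 200 for i in range(201)]
sums = [sum(h(k, x) for k in range(len(centers))) for x in grid]
```

Each entry of `sums` equals 1 up to rounding, each `h(k, x)` is nonnegative, and `h(k, x)` vanishes whenever x is at distance at least `delta` from the k-th center.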
REMARK: The set of functions $\{h_k : k \in (1, p)\}$ that we used in the
proof of the previous theorem is called a partition of unity for $K$
subordinate to the covering $\{B(x_k, \delta) : k \in (1, p)\}$. We shall meet
these objects again later on, especially when we study integration on
manifolds.

THE STONE-WEIERSTRASS THEOREM
We now come to an important generalization of the Weierstrass
approximation theorem which was proved in Section 4.6. This generalized
theorem was proved by M. H. Stone in a context that is more general
than we shall present it, although the proof is the same. The reason we
do not present the theorem in as general a context as originally given is
that a discussion of the relevant concepts would take us too far afield.
The theorem we shall state is valid only for $C_1(K)$, where $K$ is a compact
set in $E^n$. For the sake of simplicity we shall simply write $C(K)$ in place
of $C_1(K)$.
6.7.8 Definition. A set $A \subset C(K)$ is said to be an algebra $\iff$
(a) $A$ is a linear subspace of $C(K)$.
(b) $\forall f, g \in A$, $fg \in A$.
An algebra $A$ is said to be a separating algebra $\iff$ $\forall x, y \in K$ with $x \neq y$, $\exists f \in A$
so that $f(x) \neq f(y)$. An algebra $A$ is said to be a closed algebra $\iff$ $A$ is
closed as a point set in $C(K)$.
For an example of a separating algebra take $K = [0, 1]$ and $A$ the
collection of all polynomials restricted to $[0, 1]$. If $K$ is a compact set in $E^2$,
take $A$ as the set of all polynomials in two variables restricted to $K$; that
is, the elements of $A$ are functions given by finite sums of the form
$$P(x, y) = \sum_{k}\sum_{j} a_{jk}\, x^j y^k, \qquad (x, y) \in K.$$
Clearly this is a separating algebra. A third example of a separating
algebra in $C(K)$ is the set of all functions in $C(K)$ that vanish at a fixed
point of $K$. Indeed the last algebra is even closed.
6.7.9 Theorem. Suppose $K$ is a compact set in $E^n$ and $A$ is a closed
separating algebra in $C(K)$. Then either $A = C(K)$, or $\exists a \in K$ so that
$A = \{f : f \in C(K)\ \&\ f(a) = 0\}$.
Proof. We shall make the proof in a number of steps.
(a) If $g \in A$, then $|g| \in A$.
To make the proof of this theorem independent of the classical
Weierstrass approximation theorem, 4.6.3, we shall proceed in this part
in a way that is slightly longer than absolutely necessary. If $a > 0$, the
function with domain $]-a, \infty[$ and values $(t + a)^{1/2}$ is infinitely
differentiable. By using, for example, the Lagrange form of the remainder, the
reader may easily convince himself that the Taylor expansion of this
function around $t = 1/2$ converges uniformly in the interval $[0, 1]$.
If we replace $t$ by $y^2$ and $a$ by $\epsilon^2$, $\epsilon > 0$, we see then that there is a
polynomial $p$ so that $|p(0)| < 2\epsilon$ and, for all $y$ with $y^2 \in [0, 1]$,
$$|(y^2 + \epsilon^2)^{1/2} - p(y^2)| < \epsilon.$$
Now,
$$\epsilon\,[(y^2 + \epsilon^2)^{1/2} - (y^2)^{1/2}] \le [(y^2 + \epsilon^2)^{1/2} + (y^2)^{1/2}]\,[(y^2 + \epsilon^2)^{1/2} - (y^2)^{1/2}] \le \epsilon^2.$$
Thus, since $|y| = (y^2)^{1/2}$, we have
$$|(y^2 + \epsilon^2)^{1/2} - |y|| \le \epsilon.$$
Consequently,
$$|p(y^2) - |y|| < 2\epsilon.$$
If $\|g\| \le 1$, this gives $\forall x \in K$,
$$|p(g^2(x)) - |g(x)|| < 2\epsilon.$$
Since $A$ is an algebra, $p \circ g^2 - p(0) \in A$. Thus we see that $|g|$ can be
approximated by elements of $A$, and since $A$ is closed this means that
$|g| \in A$. If $\|g\| > 1$, then $g/\|g\|$ has norm 1 and it is in $A$. Thus by this
device we see that it is always true that
$$g \in A \Rightarrow |g| \in A.$$
(b) If $g, h \in A$, then $g \vee h = \max (g, h)$ and $g \wedge h = \min (g, h)$
also belong to $A$.
The proof of this statement is an immediate consequence of the
formulas
$$g \vee h = \tfrac{1}{2}[g + h + |g - h|], \qquad g \wedge h = \tfrac{1}{2}[g + h - |g - h|],$$
and the facts that $g + h \in A$ and $|g - h| \in A$.
(c) If $\forall x \in K$, $\exists g \in A$ so that $g(x) \neq 0$, then $\forall x, y \in K$ so that $x \neq y$
and $\forall a, b \in R$, $\exists f \in A$ so that $f(x) = a$ and $f(y) = b$.
By hypothesis $\exists g, h \in A$ so that $g(x) = 1$ and $h(y) = 1$. Also, by
hypothesis of the theorem, $\exists p \in A$ so that $p(x) \neq p(y)$. For $\delta, \epsilon \in R$,
let us set
$$\varphi(\delta, \epsilon) = p(x) + \delta + \epsilon h(x),$$
$$\psi(\delta, \epsilon) = p(y) + \delta g(y) + \epsilon.$$
Then, since
$$|\varphi(\delta, \epsilon) - \psi(\delta, \epsilon)| \ge |p(x) - p(y)| - |\delta(1 - g(y)) + \epsilon(h(x) - 1)|,$$
it is clear that $\exists \eta > 0$ so that if $|\delta| < \eta$ and $|\epsilon| < \eta$ then
$\varphi(\delta, \epsilon) \neq \psi(\delta, \epsilon)$. Now, if $p(x) = 0$, then $p(y) \neq 0$ and hence we think it is
clear that it is possible to choose $\delta$ and $\epsilon$, so that $|\delta| < \eta$ and $|\epsilon| < \eta$,
and $\varphi(\delta, \epsilon) \neq 0$ and $\psi(\delta, \epsilon) \neq 0$. If $p(y) = 0$, then $p(x) \neq 0$ and the
same result holds. If $p(x) \neq 0$ and $p(y) \neq 0$ we clearly get the same
result. Thus we see it is possible to choose $\delta$ and $\epsilon$ so that if we set
$$q = p + \delta g + \epsilon h,$$
then $q(x) \neq 0$, $q(y) \neq 0$ and $q(x) \neq q(y)$. Since $A$ is an algebra and
$p, g, h \in A$, it follows that $q \in A$.
Now, let us set
$$f = \alpha q + \beta q^2,$$
where $\alpha$ and $\beta$ are taken so that
$$a = \alpha q(x) + \beta q^2(x), \qquad b = \alpha q(y) + \beta q^2(y).$$
This is possible since
$$\begin{vmatrix} q(x) & q^2(x) \\ q(y) & q^2(y) \end{vmatrix} = q(x)\,q(y)\,[q(y) - q(x)] \neq 0.$$
(d) Under the hypotheses of the theorem and the additional hypotheses of
(c) it follows that $A = C(K)$.
Let $f \in C(K)$; under the additional hypotheses of (c) it follows that
$\forall x, y \in K$, $\exists p_{xy} \in A$ so that
$$p_{xy}(x) = f(x) \quad \text{and} \quad p_{xy}(y) = f(y).$$
Fix $y$ and $\epsilon > 0$ and consider the set
$$U_x = \{z : f(z) - \epsilon < p_{xy}(z)\}.$$
Since $p_{xy}$ and $f$ are continuous, $U_x$ is a relatively open set in $K$ which
contains $x$. The collection $\{U_x : x \in K\}$ is a relatively open covering
of $K$ and therefore the Heine-Borel theorem tells us that there is a
finite set $\{U_{x_k} : k \in (1, m)\}$ which covers $K$. Let
$$q_y = p_{x_1 y} \vee \cdots \vee p_{x_m y};$$
then by (b) we know that $q_y \in A$. Moreover, $\forall z \in K$,
$$f(z) - \epsilon < q_y(z).$$
Next, let us set
$$V_y = \{z : q_y(z) < f(z) + \epsilon\}.$$
Since $q_y$ and $f$ are continuous, it follows that $V_y$ is a relatively open set
in $K$, and it contains $y$ because $q_y(y) = f(y)$. It follows as before, by use
of the Heine-Borel theorem, that there
is a finite set $\{V_{y_k} : k \in (1, r)\}$ which covers $K$. If we set
$h = q_{y_1} \wedge \cdots \wedge q_{y_r}$, then by (b), $h \in A$, and moreover $\forall z \in K$
$$f(z) - \epsilon < h(z) < f(z) + \epsilon,$$
or
$$|f(z) - h(z)| < \epsilon.$$
This means that $f$ can be approximated by elements of $A$, and, since
$A$ is closed, $f \in A$.
(e) If $\exists a \in K$ so that $\forall f \in A$, $f(a) = 0$, then
$A = \{g : g \in C(K)\ \&\ g(a) = 0\}$.
Let $A_1$ be the closed algebra generated by $A$ and the function $e$,
which has the value $e(x) = 1$ for every $x \in K$. Then $A_1$ is a closed,
separating algebra which satisfies the additional hypotheses of (c).
Thus $A_1 = C(K)$, which means that $\forall f \in C(K)$ so that $f(a) = 0$ and
$\forall \epsilon > 0$, $\exists g \in A$ and $c \in R$ so that
$$\|f - (g + c)\| < \epsilon/2.$$
In particular, this means that if we evaluate the function $f - (g + c)$
at $a$ we get $|c| < \epsilon/2$. Thus
$$\|f - g\| < \epsilon,$$
and this shows that $f \in A$. This completes the proof of the theorem.
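Step (a) of the proof rests on approximating the absolute value by a polynomial in the square of the variable, obtained from the Taylor expansion of a square root about the point 1/2. The sketch below evaluates a partial sum of the binomial series for that square root and compares it with the absolute value on [-1, 1]; the value 0.1 for the small parameter, the degree 400, and the name `sqrt_taylor` are choices made for this illustration and are not taken from the text.

```python
import math

def sqrt_taylor(t, eps, degree=400):
    """Partial sum of the Taylor series of (t + eps**2)**0.5 about t = 1/2.

    With c = 1/2 + eps**2 and r = (t - 1/2)/c, the series is
    sqrt(c) * sum_n C(1/2, n) * r**n, which converges for |r| < 1.
    """
    c = 0.5 + eps ** 2
    r = (t - 0.5) / c
    total = 0.0
    binom = 1.0   # generalized binomial coefficient C(1/2, n)
    power = 1.0   # r**n
    for n in range(degree + 1):
        total += binom * power
        binom *= (0.5 - n) / (n + 1)
        power *= r
    return math.sqrt(c) * total

eps = 0.1
ys = [i / 100 - 1.0 for i in range(201)]  # sample points of [-1, 1]
err_sqrt = max(abs(sqrt_taylor(y * y, eps) - math.sqrt(y * y + eps ** 2)) for y in ys)
err_abs = max(abs(sqrt_taylor(y * y, eps) - abs(y)) for y in ys)
```

On the sample, `err_sqrt` stays below `eps` and `err_abs` stays below `2 * eps`, which are the two inequalities that step (a) uses.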
D Exercises
1. Show that $C_m(K)$ is finite-dimensional $\iff$ $K$ has a finite number
of points.
2. Show that every totally bounded set in $C_m(K)$ is bounded.
3. Give a proof, using the axiom of choice, of the existence of the
sequence $(g_k)$ needed in the proof of the second statement of Theorem
6.7.5. The method will be similar to that used in the proof of Theorem
6.7.4.
4. Let $A$ be a bounded equicontinuous set of functions in $C(K)$.
Show that
$$f^*(x) = \sup\{f(x) : f \in A\}$$
defines a uniformly continuous function on $K$. Show that the conclusion
may not necessarily hold if $A$ is not equicontinuous.
5. Suppose $K \subset E^n$ is compact and $A \subset C(K)$. Suppose $A$ has the
property that $\forall \epsilon > 0$ and $\forall x \in K$, $\exists \delta(x, \epsilon)$ so that $\forall f \in A$ and
$\forall y \in K$ for which $|x - y| < \delta(x, \epsilon)$ we have $|f(x) - f(y)| < \epsilon$. Show
that $A$ is an equicontinuous set.
6. If $K$ is not compact, show that there may be a compact set
$A \subset C(K)$ which is not equicontinuous.
7. Let $K = [1, \infty[$, and $\forall x \in K$ and $\forall n \in N$ set
$$f_n(x) = x^{-1/n}$$
and $A = \{f_n : n \in N\}$. Show that $A \subset C(K)$ is closed, bounded, and
equicontinuous, but is not compact. This shows that if $K$ is not a bounded
set, the last statement of Theorem 6.7.7 may not hold.
8. If $K$ is a bounded set in $E^n$ and $A \subset C(K)$ is closed, bounded,
and equicontinuous, show that the last statement of Theorem 6.7.7
is valid.
9. Let $T$ be the unit circle in $E^2$; that is, $T = \{x : x \in E^2\ \&\ |x| = 1\}$.
The function $G$ with domain $[0, 2\pi[$ and defined by
$$G(\theta) = (\cos \theta, \sin \theta)$$
is a topological map with range $T$. Let $A \subset C(T)$ be the collection of
functions $f$ in $C(T)$ so that
$$f(G(\theta)) = \sum_{k=0}^{n} a_k \cos k\theta + \sum_{k=1}^{n} b_k \sin k\theta.$$
Show that the closure of $A$ is $C(T)$. This result is a slightly less refined
version of a result called Fejer's theorem.
10. If $I = [0, 1]$ and $J = I \times I$, show that any continuous function
in $C(J)$ can be uniformly approximated by functions of the form
$$F(x, y) = \sum_{k=1}^{n} f_k(x)\, g_k(y),$$
where $f_k, g_k \in C(I)$.
11. Let $A$ be that subset of $C(R)$ consisting of all functions of the
form
$$f(x) = p(x)\, e^{-|x|},$$
where $p$ is a polynomial. Show that the closure of $A$ is the set
$\{f : f \in C(R)\ \&\ \lim_{|x| \to \infty} f(x) = 0\}$. (Hint: Find a topological map that takes
$R$ onto the unit circle minus one point.)
CHAPTER 7
HIGHER-DIMENSIONAL DIFFERENTIATION
7.1 MOTIVATION
In Chapter 4 we considered the concept of the derivative for real-valued
functions having domains in $R$. At that time we did not give any
motivational reasons for considering such a concept since we felt that such
reasons are usually adequately and clearly discussed in elementary
calculus. Now, in carrying over a theory from one dimension to higher
dimensions, one tries to do this in a way in which the essential features
of the one-dimensional theory are preserved. Very often this can be
done by formal analogy, and this will lead to a perfectly satisfactory
theory. At the other extreme, a higher-dimensional situation may
present such a rich diversity of paths that it is only after studying many
examples from geometry and physics, and after a considerable amount
of experimentation and inspiration that the essential features of a useful
theory are isolated. It often happens that only after the
higher-dimensional theory has been developed is it realized that it is a very rich
generalization of a very simple one-dimensional situation.
The adoption of a definition of differentiation in higher dimensions
presents a problem somewhere between the two extremes mentioned
in the previous paragraph. If $f$ is a function with domain in $R$ and
range in $E^n$, $n \ge 1$, then clearly the derivative of $f$ can be defined in
the same way as for real-valued functions. However, if the domain of $f$
is in $E^m$, $m > 1$, then a motivational discussion of a geometric nature
may help to clarify the nature of the problem.

Let us first look at a real-valued, differentiable function $f$ defined
on an interval $I$. The graph of this function is by definition the set
$\{(x, f(x)) : x \in I\}$, considered as a subset of $E^2$. Of course, the graph
of a function is the same as the function $f$. The name 'graph' is used
only to emphasize that the collection of ordered pairs which is $f$ is
being considered as a subset of a space having a distance function. The
graph of a real-valued function with domain an interval $I$ is often called
the trace of a path, and the path itself is defined as the function $F$ with
values
$$F(x) = (x, f(x)).$$
The function $F$ has domain $I$ and range in $E^2$. By definition, the tangent
vector to $F$ at $c \in I^0$ (the interior of $I$) is
$$F'(c) = (1, f'(c)).$$
The tangent line to $F$ at $c$ is, by definition, the subset of $E^2$ given by
$$\mathcal{T}_{F(c)} = \{tF'(c) + F(c) : t \in R\}.$$
Note that the set
$$\mathcal{L}_{F(c)} = \{tF'(c) : t \in R\}$$
is a one-dimensional subspace of $E^2$ and
$$\mathcal{T}_{F(c)} = \mathcal{L}_{F(c)} + F(c).$$
Thus we see that $\mathcal{T}_{F(c)}$ is the "translation" of a linear space $\mathcal{L}_{F(c)}$ by a
constant vector $F(c)$. Such a point set is usually called an affine space.
The pictorial representation is shown in Fig. 7.1.1.
[FIGURE 7.1.1]
Let us now suppose that $g$ is a real-valued function with domain in
$E^2$. The graph of $g$ is the collection of ordered pairs $\{(x, g(x)) : x \in \mathcal{D}(g)\}$,
considered as a subset of $E^3$. Such a graph is often called the
trace of a surface or trace of a surface element, the surface or surface element
itself being considered as the function $G$ with values
$$G(x) = (x, g(x)).$$
The function $G$ has domain $\mathcal{D}(g) \subset E^2$. In analogy with the concept
of the tangent line in $E^2$, we want to define the concept of a tangent
plane to $G$ at a point $c$ interior to $\mathcal{D}(g)$. Probably, the most natural
thing to do is to consider the intersection of the graph of $g$ with all
planes through $c$ which are parallel to the $x^3$ axis. This will give a
collection of paths in the graph of $g$, and if all these paths have tangent
lines, the collection of tangent lines, if they form an affine space, could
possibly be called the tangent plane to $G$ at $c$ (Fig. 7.1.2).
[FIGURE 7.1.2]
Let us make a little more precise the things we have said in the
previous paragraph. A two-dimensional linear space which contains
the vector $(u, 0) = (u^1, u^2, 0)$ and the set of all vectors $(0, 0, x^3)$ is the set
$$\mathcal{P}_u = \{t(u, 0) + (0, 0, x^3) : t, x^3 \in R\}.$$
A plane through $c \in \mathcal{D}(g)$ and parallel to $\mathcal{P}_u$ is the affine space
$$\mathcal{P}_u(c) = \mathcal{P}_u + c.$$
If we intersect this space with the graph of $g$ we get the set of points
$$\{(c + tu, g(c + tu)) : c + tu \in \mathcal{D}(g)\}.$$
We can consider this set of points as the graph of the path $G_u$ defined by
$$G_u(t) = (c + tu, g(c + tu)),$$
where $t$ ranges over an interval $I$ for which $0 \in I^0$ and $c + tu \in \mathcal{D}(g)$.
If we assume that $G'_u(0)$ exists, then it follows that
$$D_u g(c) = \lim_{h \to 0} \frac{g(c + hu) - g(c)}{h} \qquad (7.1.1)$$
exists. This is usually called the directional derivative of $g$ at $c$ in the
direction $u$. We shall set
$$D_u G(c) = G'_u(0) = (u, D_u g(c)) = \lim_{h \to 0} \frac{G(c + hu) - G(c)}{h}. \qquad (7.1.1')$$
This is called the directional derivative of $G$ at $c$ in the direction $u$.
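The difference quotient in (7.1.1) can be evaluated directly for a small step. The sketch below does this for a sample polynomial on the plane; the step size, the sample function, and the names `directional_derivative` and `surface_directional_derivative` are assumptions for the illustration, and a small fixed step only approximates the limit.

```python
def directional_derivative(g, c, u, h=1e-6):
    """Difference quotient (g(c + h*u) - g(c)) / h, approximating the limit in (7.1.1)."""
    shifted = (c[0] + h * u[0], c[1] + h * u[1])
    return (g(shifted) - g(c)) / h

def surface_directional_derivative(g, c, u, h=1e-6):
    """The vector (u, D_u g(c)) for the surface with values (x, g(x))."""
    return (u[0], u[1], directional_derivative(g, c, u, h))

def g(x):
    # Sample function on the plane; its partial derivatives at (1, 2) are 8 and 3.
    return x[0] ** 2 + 3 * x[0] * x[1]

c = (1.0, 2.0)
u = (0.6, 0.8)
v = (1.0, 0.0)
d_u = directional_derivative(g, c, u)
d_v = directional_derivative(g, c, v)
d_sum = directional_derivative(g, c, (u[0] + v[0], u[1] + v[1]))
```

For this smooth function `d_u` is close to 8(0.6) + 3(0.8) = 7.2, and `d_sum` is close to `d_u + d_v`, so the directional derivative behaves additively in the direction here. That additivity is a property of the example, not something the difference quotient guarantees in general.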
Let us suppose that $\forall u \in E^2$, $D_u G(c)$ exists. If the set of vectors
$$\mathcal{L}_{G(c)} = \{D_u G(c) : u \in E^2\} \qquad (7.1.2)$$
is a linear subspace of $E^3$, then there is some temptation to say that the
surface $G$ has a tangent plane at $c$. However, from the point of view
of analysis, there are compelling reasons why people prefer to apply
the term 'tangent plane' in a situation which is slightly more restrictive
than the one we have given here. We shall say more about this a little
later on. For now, let us show that if $\mathcal{L}_{G(c)}$ is linear, then the function
with values $D_u G(c)$ is a linear transformation with domain $E^2$ and
range in $E^3$. Indeed, $\forall u, v \in E^2$ and $\forall \alpha, \beta \in R$, $\exists w \in E^2$ so that
$$\alpha D_u G(c) + \beta D_v G(c) = D_w G(c).$$
If we write this out in component form we get
$$(\alpha u + \beta v,\ \alpha D_u g(c) + \beta D_v g(c)) = (w,\ D_w g(c)).$$
From this it follows that $w = \alpha u + \beta v$ and hence
$$D_{\alpha u + \beta v} G(c) = \alpha D_u G(c) + \beta D_v G(c). \qquad (7.1.3)$$
The equality (7.1.3) means that the function $L_{G(c)}$ with domain $E^2$ and
defined by
$$L_{G(c)}(u) = D_u G(c) \qquad (7.1.4)$$
is a linear transformation with range in $E^3$. Indeed, the range of $L_{G(c)}$
is the linear space $\mathcal{L}_{G(c)}$ defined by (7.1.2).
In the case where the domain of $g$ is in $R$, the fact that $g'(c)$ exists
implies that $g$ is continuous. We would like this situation to persist in
higher dimensions. Let us see what is involved. We shall suppose as
before that $\forall u \in E^2$, $D_uG(c)$ exists. Then for every $u \in E^2$ for
which $|u| = 1$, and $\forall \varepsilon > 0$, $\exists \delta > 0$, depending on $u$ and $\varepsilon$, so that
$\forall t \in R$ for which $|t| < \delta$ we have
$$|G(c+tu) - G(c) - tD_uG(c)| \le \varepsilon |t|. \tag{7.1.5}$$
This simply follows from the definition of $D_uG(c)$. Now, the last
inequality is enough to prove the continuity of $G$ at $c$ if the domain of $G$
is in $R$. Unfortunately, it is not quite enough in higher dimensions.
However, if we can say that $\forall \varepsilon > 0$, $\exists \delta > 0$ so that for all $u \in E^2$ with
$|u| = 1$, the inequality (7.1.5) is satisfied, and if in addition $\mathcal{L}_{G(c)}$ is a
linear manifold, then this will be enough to show that $G$ is continuous
at $c$. Indeed, we can then say that $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $\forall v \in E^2$ with
$|v| < \delta$ we have
$$|G(c+v) - G(c) - L_{G(c)}(v)| \le \varepsilon |v|. \tag{7.1.6}$$
Since $L_{G(c)}$ is a linear transformation, $\exists M$ so that $\forall v \in E^2$ we have
$|L_{G(c)}(v)| \le M|v|$. If we take $\varepsilon = 1$ in (7.1.6), then $\exists \delta > 0$ so that
$\forall v \in E^2$ with $|v| < \delta$ we have
$$|G(c+v) - G(c)| \le |v| + |L_{G(c)}(v)| \le (M+1)|v|.$$
This shows the continuity of $G$ at $c$.
In case (7.1.6) is satisfied, where $L_{G(c)}$ is a linear transformation, we
say that $G$ has a tangent plane at $c$. We take this tangent plane to be the
affine space
$$\mathcal{T}_{G(c)} = \mathcal{L}_{G(c)} + G(c),$$
which is the range of the affine transformation
$$T_{G(c)} = L_{G(c)} + G(c).$$
There are any number of examples which show that $D_uG(c)$ exists
$\forall u \in E^2$ but that $G$ is not continuous at $c$. From our previous remarks
it follows that such a $G$ does not have a tangent plane at $c$. The example
we shall give is a very standard one. We feel sure that the reader will
be able to construct many more without difficulty.
Let us write the points in $E^2$ as $(x, y)$ and set
$$g(x,y) = \begin{cases} xy^2/[x^2+y^4], & \text{if } x \ne 0, \\ 0, & \text{if } x = 0. \end{cases}$$
The result of Exercise 3(a) of Section 6.4 is that the function $g$ is not
continuous at $(0,0)$. However, let us show that $D_{(u,v)}g(0,0)$ exists
$\forall (u,v) \in E^2$. Since $g(0,0) = 0$ we have
$$\frac{g(hu, hv) - g(0,0)}{h} = \frac{uv^2}{u^2 + h^2 v^4}.$$
Hence, if $u \ne 0$, we get $D_{(u,v)}g(0,0) = v^2/u$ and, if $u = 0$, $D_{(u,v)}g(0,0) = 0$.
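The behaviour of this example can be checked numerically. The following is only an illustrative sketch in Python: the helper name `difference_quotient` and the sample direction and step sizes are assumptions chosen for the demonstration, not part of the text.

```python
def g(x, y):
    """The example function: x*y^2 / (x^2 + y^4) for x != 0, and 0 when x == 0."""
    if x == 0:
        return 0.0
    return x * y**2 / (x**2 + y**4)


def difference_quotient(u, v, h):
    """[g(hu, hv) - g(0, 0)] / h, whose limit as h -> 0 is D_(u,v) g(0, 0)."""
    return (g(h * u, h * v) - g(0.0, 0.0)) / h


# Directional derivatives at the origin: v^2/u when u != 0, and 0 when u == 0.
q1 = difference_quotient(2.0, 3.0, 1e-6)  # close to 3^2 / 2
q2 = difference_quotient(0.0, 3.0, 1e-6)  # g vanishes on the line x = 0

# Along the parabola x = y^2 the function is constantly 1/2, so g(x, y) does
# not tend to g(0, 0) = 0 as (x, y) -> (0, 0).
parabola_values = [g(t * t, t) for t in (1e-1, 1e-2, 1e-3)]
```

Every difference quotient settles down, yet the values along the parabola stay at 1/2 however close the point is to the origin, which is the failure of continuity described above.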
7.2 DIRECTIONAL DERIVATIVES AND DIFFERENTIALS
The purpose of this section is to write down in a formal manner the

things that were discussed informally in the last section. We shall not,
however, adopt the geometric terminology. For the sake of simplicity
of presentation we shall not make our definitions as general as in
Chapter 4 but will define derivatives only at interior points (see Section
6.3) of the domains of functions.
7.2.1 Definition. Suppose $f$ is a function with domain in $E^n$ and range
in $E^m$. The function $f$ is said to have a directional derivative at $a$ in the direction
$u$ $\Leftrightarrow$ $a$ is an interior point of $\mathcal{D}(f)$, $u \in E^n$, and for $h \in R$ the limit
$$D_u f(a) = \lim_{h \to 0} \frac{f(a+hu) - f(a)}{h}$$
exists. If $u = e_k$, and $D_k f(a) = D_{e_k} f(a)$ exists, then it is called the partial
derivative of $f$ at $a$ with respect to the variable $x_k$.
The directional derivative of a function $f$ in the direction $u$ is that function
$D_u f$ with domain $\mathcal{D}(D_u f) = \{x : D_u f(x) \text{ exists}\}$ (possibly void) and whose
value at $a \in \mathcal{D}(D_u f)$ is $D_u f(a)$.
Many people speak about the directional derivative of a function
in the direction $u$ only if $|u| = 1$. However, this is just a matter of
taste and it doesn't seem worthwhile to make such a distinction. Also,
a more classical and possibly more popular notation for a partial
derivative at a point is given by
$$\frac{\partial f(a)}{\partial x_k} = D_k f(a).$$
This notation does have considerable appeal in some situations since
it does lead to an easy way to remember some complicated formulas. We
have in mind, for example, the formula for the chain rule which we
shall develop in Section 7.3. Other notations, such as $f_{x_k}$, are also in use.
The notation that we have adopted in the formal definition 7.2.1 has
become more popular with the advent of the modern theory of partial
differential equations since $D_k$ can be considered to be a function whose
domain is a space of functions and whose range is also a space of
functions. We shall, in most instances, prefer the notations we have
introduced in the formal definition for the same reasons as discussed after
Definition 4.1.1. Let $\mathcal{P}$ be the collection of all functions each of which
has its domain in some Euclidean space and range in some Euclidean
space. The set $\mathcal{P}$ shall include that function which has the null set as
domain. Then for every vector $u$ and $\forall f \in \mathcal{P}$, $D_u f$ is well defined;
more likely than not it will be the function whose domain is the null
set. However, regardless of this, the composite function $D_v \circ D_u$ can
be defined and its domain is also $\mathcal{P}$. This way of looking at things
will be convenient when we discuss higher-order derivatives and we
shall consider the matter again in Section 7.4.
7.2.2 Definition. If $f$ is a function with domain in $E^n$ and range in
$E^m$, then $f$ is said to be differentiable at $a \in \mathcal{D}(f)$ $\Leftrightarrow$ $a \in \mathcal{D}(f)^0$ and
there exists a linear transformation $L$, with domain $E^n$ and range in $E^m$, so
that $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $\forall x \in \mathcal{D}(f)$ with $|x - a| < \delta$, we have
$$|f(x) - f(a) - L(x - a)| \le \varepsilon |x - a|. \tag{7.2.1}$$
If $f$ is differentiable at $a$, then the linear transformation $L$ is called the
differential of $f$ at $a$ and will be denoted by '$df(a)$.'
The differential of $f$ can be defined as that function $df$ with domain
$\mathcal{D}(df) = \{x : df(x) \text{ exists}\}$, and whose value $\forall x \in \mathcal{D}(df)$ is $df(x)$.
Some authors prefer to use the word 'derivative' for $df$ and reserve
the word 'differential' for the function defined on $\mathcal{D}(df) \times E^n$ with
values $df(x)(u)$. Since from a strictly logical point of view $df(a)$ is not
the derivative of $f$ at $a$ when $n = 1$, we shall not use this terminology.
However, it is really just a matter of taste.

There is one small point that must be justified in Definition 7.2.2.
We have called the linear transformation $L$ the differential of $f$ at $a$.
The implication is that $L$ is unique, and this can be settled in short
order. Indeed, suppose that $L_1$ is a linear transformation that satisfies
the conditions of Definition 7.2.2. Then since $a \in \mathcal{D}(f)^0$, $\forall \varepsilon > 0$,
$\exists \delta > 0$ so that $\forall u \in E^n$ with $|u| < \delta$ we have $a+u \in \mathcal{D}(f)$ and
$$|L(u) - L_1(u)| \le |f(a+u) - f(a) - L(u)| + |f(a+u) - f(a) - L_1(u)| \le 2\varepsilon|u| < 2\varepsilon\delta.$$
Now, $\forall v \in E^n$, if $v \ne 0$, then $u = \delta v/2|v|$ has the property that
$|u| < \delta$. Thus using the fact that $\forall v \in E^n$ and $\forall \alpha \in R$, $L(\alpha v) = \alpha L(v)$ and
$L_1(\alpha v) = \alpha L_1(v)$, we get
$$(\delta/2|v|)\,|L(v) - L_1(v)| = |L(u) - L_1(u)| < 2\varepsilon\delta.$$
Hence
$$|L(v) - L_1(v)| < 4\varepsilon |v|.$$
Since for fixed $v \ne 0$ this is true $\forall \varepsilon > 0$, we must have $L(v) = L_1(v)$.
Further, since $L(0) = L_1(0) = 0$, we have shown that $L = L_1$.
7.2.3 Proposition. If a function has a differential at a point $a$, then the
function must be continuous at $a$.
The simple proof of this fact was given in Section 7.1 and we shall
not repeat it here. However, we should call attention to the fact that
the proof of Proposition 7.2.3 very definitely requires the use of the
linearity of the differential. On the other hand, the proof given above
of the unicity of the function $L$ which satisfies the conditions of
Definition 7.2.2 requires only the homogeneity of $L$; that is,
$L(\alpha u) = \alpha L(u)$.
7.2.4 Proposition. If $f$ is a function with domain in $E^n$ and range in
$E^m$ and has a differential at $a$, then $\forall u \in E^n$, $D_u f(a)$ exists and
$$D_u f(a) = df(a)(u).$$
Proof. Suppose $u \in E^n$ and $|u| = 1$. Now, $a \in \mathcal{D}(f)^0$ and
$\forall \varepsilon > 0$, $\exists \delta > 0$ so that $\forall h \in R$ with $0 < |h| < \delta$ we have $a+hu \in \mathcal{D}(f)$ and
$$|f(a+hu) - f(a) - df(a)(hu)| \le \varepsilon |h|.$$
If we divide by $|h|$ and note that $df(a)$ is linear, we see that the
proposition is proved in this case.
If $v \in E^n$ and $v \ne 0$, then upon setting $u = v/|v|$ we see that $|u| = 1$.
Thus
$$D_{v/|v|} f(a) = \frac{1}{|v|}\, df(a)(v).$$
But $\forall \alpha \in R$, $\alpha \ne 0$,
$$\frac{f(a+h\alpha u) - f(a)}{h} = \alpha\, \frac{f(a+h\alpha u) - f(a)}{h\alpha},$$
so that if $D_u f(a)$ exists, then $D_{\alpha u} f(a)$ exists and is equal to $\alpha D_u f(a)$.
Thus $|v|\, D_{v/|v|} f(a) = D_v f(a)$. Finally, since $D_0 f(a) = df(a)(0) = 0$, we
have completed the proof of the proposition.

REMARK: For future reference, let us call attention to the fact that
during the proof of the last proposition we have shown that if $D_u f(a)$
exists then $\forall \alpha \in R$, $D_{\alpha u} f(a)$ exists and
$$D_{\alpha u} f(a) = \alpha D_u f(a).$$
As we saw at the end of Section 7.1, the converse of the last
proposition is not true. That is to say, if $\forall u \in E^n$, $D_u f(a)$
exists, it is not necessarily true that $f$ has a differential at $a$. Indeed, the example we
gave showed the existence of a function all of whose directional
derivatives exist at the origin but the function itself is not continuous at the
origin.
However, if in addition to the existence of the directional
derivatives of a function at a point, the directional derivatives are
continuous, then the function has a differential at the point. This is
shown by the next theorem, which actually proves somewhat more.
7.2.5 Theorem. Suppose $f$ is a function with domain in $E^n$ and range
in $E^m$, $\{\xi_k : k \in (1,n)\}$ is a basis for $E^n$, $\exists i \in (1,n)$ so that $D_{\xi_i} f(a)$
exists, and there is a ball $B(a,\rho) \subset \mathcal{D}(f)$ so that $\forall j \in (1,n) \setminus \{i\}$, $B(a,\rho)
\subset \mathcal{D}(D_{\xi_j} f)$ and $D_{\xi_j} f$ is continuous at $a$. Then $df(a)$ exists.
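Proof. We shall break the proof into several steps.
(a) Suppose $u$ and $v$ are linearly independent vectors in $E^n$ so that $D_v f(a)$
exists, $B(a,\rho) \subset \mathcal{D}(D_u f)$ and $D_u f$ is continuous at $a$. Then $\forall \varepsilon > 0$,
$\exists \delta > 0$ so that $\forall \alpha, \beta \in R$ with $|\alpha u + \beta v| < \delta$ we have
$$|f(a + \alpha u + \beta v) - f(a) - \alpha D_u f(a) - \beta D_v f(a)| \le \varepsilon |\alpha u + \beta v|. \tag{7.2.2}$$
From the definition of the directional derivative, $\forall \varepsilon > 0$, $\exists \delta_1 > 0$,
so that if $|\beta v| < \delta_1$, then
$$|f(a+\beta v) - f(a) - \beta D_v f(a)| \le \varepsilon |\beta v|.$$
Next, let us set
$$F(\alpha, \beta) = f(a + \alpha u + \beta v),$$
where we suppose that $a + \alpha u + \beta v \in \mathcal{D}(f)$; this is certainly true if
$|\alpha u| + |\beta v| < \rho$. We have
$$D_1F(\alpha, \beta) = \lim_{h \to 0} \frac{1}{h}[F(\alpha + h, \beta) - F(\alpha, \beta)]
= \lim_{h \to 0} \frac{1}{h}[f(a + (\alpha+h)u + \beta v) - f(a + \alpha u + \beta v)]
= D_u f(a + \alpha u + \beta v).$$
Since $\mathcal{R}(F) \subset E^m$ we may write
$$F(\alpha, \beta) = \sum_{k=1}^{m} F_k(\alpha, \beta) e_k,$$
where $\mathcal{R}(F_k) \subset R$. Using the one-dimensional mean value theorem
we get
$$f_k(a + \alpha u + \beta v) - f_k(a + \beta v) = F_k(\alpha, \beta) - F_k(0, \beta) = \alpha D_u f_k(a + \theta_k u + \beta v),$$
where $0 \le |\theta_k| \le |\alpha|$. By hypothesis, $D_u f$ is continuous at $a$ and
thus $\exists \delta_2$ with $0 < \delta_2 < \rho$, so that if $|\alpha u| + |\beta v| < \delta_2$, then
$$|f(a + \alpha u + \beta v) - f(a + \beta v) - \alpha D_u f(a)| \le \varepsilon |\alpha u|.$$
If we now write
$$f(a + \alpha u + \beta v) - f(a) = [f(a + \alpha u + \beta v) - f(a + \beta v)] + [f(a + \beta v) - f(a)],$$
and take $\delta_3 = \min(\delta_1, \delta_2)$, then $\forall \alpha, \beta \in R$ with $|\alpha u| + |\beta v| < \delta_3$
we have
$$|f(a + \alpha u + \beta v) - f(a) - \alpha D_u f(a) - \beta D_v f(a)| \le \varepsilon\,[|\alpha u| + |\beta v|]. \tag{7.2.2'}$$
Since $u$ and $v$ are linearly independent, it follows from Theorem 6.2.4
that $\exists \eta$ so that $0 < \eta < 1$ and $2|\alpha u \cdot \beta v| \le 2(1-\eta^2)|\alpha u|\,|\beta v| \le
(1-\eta^2)[|\alpha u|^2 + |\beta v|^2]$. Thus
$$|\alpha u + \beta v|^2 = |\alpha u|^2 + 2\alpha u \cdot \beta v + |\beta v|^2
\ge \eta^2 [|\alpha u|^2 + |\beta v|^2]
\ge \eta^2 [|\alpha u| + |\beta v|]^2/2.$$
If we take $\delta = \eta\delta_3/\sqrt{2}$, then $|\alpha u + \beta v| < \delta \Rightarrow |\alpha u| + |\beta v| < \delta_3$. Thus
from (7.2.2') and the above inequality we get (7.2.2), up to the
unimportant factor $\sqrt{2}/\eta$.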
(b) Under the hypotheses of the theorem, $\forall u \in E^n$, $D_u f(a)$ exists, and if
$u = \sum_{k=1}^{n} u_k \xi_k$, then
$$D_u f(a) = \sum_{k=1}^{n} u_k D_{\xi_k} f(a). \tag{7.2.3}$$
Further, $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $|u| < \delta \Rightarrow$
$$|f(a + u) - f(a) - D_u f(a)| \le \varepsilon |u|. \tag{7.2.4}$$
Note that (7.2.2) shows that $D_{\alpha u + \beta v} f(a)$ exists and moreover
$$D_{\alpha u + \beta v} f(a) = \alpha D_u f(a) + \beta D_v f(a).$$
Let us now prove the first statement in (b) by induction. For every
$k \in (0, n-1)$ let $P(k)$ be the following statement: For every $u$ which
is a linear combination of $k$ vectors in the set $A = \{\xi_j : j \in (1,n) \setminus \{i\}\}$,
and $\forall \alpha \in R$, $D_{u + \alpha\xi_i} f(a)$ exists and
$$D_{u + \alpha\xi_i} f(a) = \sum_{j \ne i} u_j D_{\xi_j} f(a) + \alpha D_{\xi_i} f(a), \tag{7.2.3'}$$
where $u = \sum_{j \ne i} u_j \xi_j$. For every $k \in N$ and $k \ge n$, let $P(k)$ be the
statement: $1 = 1$. Clearly, $\forall k \in N_0$ and $k \ge n$, $P(k) \Rightarrow P(k+1)$.
Now, $P(0)$ is certainly true, since a zero-dimensional subspace consists
only of $\{0\}$, and thus (7.2.3') holds. If $n = 1$, there is nothing more to
prove. Hence, suppose $n > 1$, $k < n-1$ and $P(k)$ is true. Suppose $u$ is
a nonzero linear combination of $(k+1)$ vectors in $A$. Then $\exists l \in
(1,n) \setminus \{i\}$, so that $u^1 = u - u_l\xi_l$ is a linear combination of $k$ vectors in
$A$. From the hypothesis $P(k)$, $\forall \alpha \in R$, $D_{u^1 + \alpha\xi_i} f(a)$ exists and
$$D_{u^1 + \alpha\xi_i} f(a) = \sum_{j \ne i,\, l} u_j D_{\xi_j} f(a) + \alpha D_{\xi_i} f(a). \tag{7.2.3''}$$
Now, by part (a) of the proof,
$$D_{u + \alpha\xi_i} f(a) = u_l D_{\xi_l} f(a) + D_{u^1 + \alpha\xi_i} f(a),$$
since by hypothesis $D_{\xi_l} f$ is defined on $B(a,\rho)$ and is continuous at $a$.
Thus (7.2.3') holds. Hence $P(k) \Rightarrow P(k+1)$ and the induction is
complete. This shows that $\forall u \in E^n$, $D_u f(a)$ exists and (7.2.3) holds.
The second statement of part (b) is an immediate consequence of the
fact that $u_l\xi_l$ and $u^1 + \alpha\xi_i$ are linearly independent, the fact that $D_{\xi_l} f$
is defined on $B(a,\rho)$ and continuous at $a$, formula (7.2.3''), and the
inequality (7.2.2).
(c) The function $f$ has a differential at $a$. From part (b) we know that
$\forall u \in E^n$, $D_u f(a)$ exists. If we set $L(u) = D_u f(a)$, then (7.2.3) shows
that $L$ is a linear transformation. The fact that (7.2.1) is satisfied for
$L$ is simply the inequality (7.2.4). This completes the proof.
REMARK: The last theorem is usually stated in terms of the basis
$\{e_j : j \in (1,n)\}$ since in practical situations the partial derivatives
of a function are usually the easiest to compute.
The fact that we stated the theorem in the form that allowed one
directional derivative merely to exist and not necessarily be continuous
at $a$ was not done for reasons of sophistry. This was done to include the
case $n = 1$, where a function has a differential at a point if it has a
derivative at the point, regardless of whether or not the function has
a derivative in the neighborhood of the point. Indeed, if $f$ has domain
in $R$ and is differentiable at $a$, then $\forall u \in R$,
$$df(a)(u) = u f'(a).$$
Let us now note a particularly useful matrix form of the differential
of a function at a point. Suppose $f$ is a function with domain in $E^n$ and
range in $E^m$ and $df(a)$ exists. Let us find the matrix representation of
this differential with respect to the ordered pair of ordered bases,
$((e_1, \ldots, e_n), (e_1, \ldots, e_m))$. If $m \ne n$, we are using the same symbol $e_k$
to stand for different vectors in different dimensional spaces. However,
we think that no confusion will result. We may write
$$df(a)(e_k) = D_k f(a) = \sum_{j=1}^{m} D_k f_j(a) e_j.$$
Thus the matrix representation of $df(a)$ with respect to this ordered
pair of ordered bases is
$$\begin{pmatrix} D_1 f_1(a) & \cdots & D_n f_1(a) \\ \vdots & & \vdots \\ D_1 f_m(a) & \cdots & D_n f_m(a) \end{pmatrix}. \tag{7.2.5}$$
This matrix is called the Jacobian matrix of $f$ at $a$, and, if $n = m$, the
determinant of this matrix is called simply the Jacobian of $f$ at $a$ and is
denoted by $J_f(a)$. Of course, the reader should be aware of the fact that
the Jacobian matrix of a function can exist at a point without the
function having a differential at the point in question.
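As an illustration of the matrix form, the sketch below approximates the Jacobian matrix of a sample map by central differences and compares the product of that matrix with a vector $u$ against a difference quotient for $D_u f(a)$. The map `f`, the helper names, and the step size are assumptions made for this example only.

```python
import math


def f(x):
    """Hypothetical example map from E^2 to E^3."""
    x1, x2 = x
    return [x1 * x2, math.sin(x1), x1 + x2 ** 2]


def jacobian(func, a, h=1e-6):
    """Approximate the m x n matrix [D_k f_j(a)] by central differences."""
    n = len(a)
    columns = []
    for k in range(n):
        forward = list(a)
        backward = list(a)
        forward[k] += h
        backward[k] -= h
        fp, fm = func(forward), func(backward)
        columns.append([(p - q) / (2 * h) for p, q in zip(fp, fm)])
    m = len(columns[0])
    # Row j, column k holds the approximation to D_k f_j(a).
    return [[columns[k][j] for k in range(n)] for j in range(m)]


def apply(matrix, u):
    """Multiply the matrix by the column vector u."""
    return [sum(row[k] * u[k] for k in range(len(u))) for row in matrix]


def directional_derivative(func, a, u, h=1e-6):
    """Central difference quotient approximating D_u func(a)."""
    plus = func([ai + h * ui for ai, ui in zip(a, u)])
    minus = func([ai - h * ui for ai, ui in zip(a, u)])
    return [(p - q) / (2 * h) for p, q in zip(plus, minus)]


a = [1.0, 2.0]
u = [0.5, -1.5]
J = jacobian(f, a)
via_matrix = apply(J, u)                       # Jacobian matrix times u
via_limit = directional_derivative(f, a, u)    # D_u f(a) from the definition
```

For this smooth map the two vectors agree up to discretization error, in line with Proposition 7.2.4. The agreement cannot be expected for a function such as the example at the end of Section 7.1, where the partial derivatives exist but the differential does not.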
If $f$ has a differential at $a$, then it is easy to compute $df(a)(u)$ in
terms of the partial derivatives of $f$ and the components of $u$ with
respect to the ordered basis $(e_1, \ldots, e_n)$. We already noted such a
form in formula (7.2.3). In general let us write
$$u = \sum_{k=1}^{n} u_k e_k$$
and apply the linear transformation $df(a)$ to both sides. Noting that
$df(a)(e_k) = D_k f(a)$, we get
$$df(a)(u) = \sum_{k=1}^{n} u_k D_k f(a). \tag{7.2.6}$$
The last formula can be put into a form that we feel sure the reader has
seen in elementary calculus, even though the meaning may not have
been clear at that time.
For every $k \in (1,n)$, let $x_k$ be that function with domain $E^n$ and
range $E^1$ defined by
$$x_k(u) = u_k. \tag{7.2.7}$$
Note that we have also used the same symbol '$x_k$' as a variable. We think
it will always be clear from the context in which way we are using this
symbol, and when we use it as a function it will be clear on which space
it is acting. The function $x_k$ is clearly a projection and an almost trivial
calculation shows that $\forall x, u \in E^n$
$$D_u x_k(x) = u_k.$$
Since $\forall u \in E^n$, this is a continuous function of $x$, $dx_k(x)$ exists and
$$dx_k(x)(u) = u_k. \tag{7.2.8}$$
Note that $dx_k(a)$ is independent of $a$ so that we will usually write simply
'$dx_k$' instead of '$dx_k(a)$.' Also note that $dx_k(a)(u) = x_k(u)$.
Let us suppose, for the moment, that $f$ is a real-valued function.
Then putting (7.2.8) into (7.2.6) we get
$$df(a) = \sum_{k=1}^{n} D_k f(a)\, dx_k(a). \tag{7.2.9}$$
Since $D_k f(a)$ is a real number, $D_k f(a)\, dx_k(a)$ is a linear transformation
and (7.2.9) simply means that the linear transformations on both sides
of the equality are equal. However, even if $f$ is not real-valued the
right side of (7.2.9) can be interpreted in a way that will make the
equality valid: $\forall u \in E^n$ we interpret $D_k f(a)\, dx_k(a)(u)$ as $u_k D_k f(a)$,
and thus both sides of (7.2.9) are the same when evaluated at every
$u \in E^n$. The reader may have seen (7.2.9) written in the form
$$df = \sum_{k=1}^{n} \frac{\partial f}{\partial x_k}\, dx_k.$$
From the above discussion he should be able to supply the interpretation
of this formula.
Exercises

1. Consider the real-valued functions with domain $E^2$ defined as
follows:
(a) $f(x_1, x_2) = (x_1 x_2)^{1/3}$.
(b) $f(x_1, x_2) = |x_1 x_2|^{1/2}$.
Show that $D_1 f(0)$ and $D_2 f(0)$ exist but that $D_u f(0)$ does not exist
$\forall u = (u_1, u_2)$ so that $u_1 u_2 \ne 0$.

2. If $f$ has domain $E^3$ and is defined by
$$f(x_1, x_2, x_3) = (x_1)^2 + 3x_1x_2 - (x_2)^3 + x_3,$$
calculate $D_u f(a)$ for $u = (1, 0, 3)$, and $u = (1, -1, 2)$.

3. In each of the following exercises, $f$ has domain $E^2$. Compute $df(a)$.
(a) $f(x_1, x_2) = (x_1)^4 + (x_2)^4 + (x_2)^4 + 3x_1x_2$.
(b) $f(x) = |x|^3$.
(c) $f(x_1, x_2) = x_1 e^{x_2}$.

4. Let $f$ be a real-valued function with domain $E^2$ defined as
follows:
$$f(x) = \begin{cases} |x|^2 \sin \dfrac{1}{|x|^2}, & \text{if } x \ne 0, \\ 0, & \text{if } x = 0. \end{cases}$$
Show that the partial derivatives of $f$ exist at every point of $E^2$ but are
not continuous at the origin. Nevertheless, $df(0)$ exists.

5. Suppose $f$ and $g$ are real-valued functions and each has a
differential at $a \in E^n$. Show that $fg$ has a differential at $a$ and
$$d(fg)(a) = f(a)\, dg(a) + g(a)\, df(a).$$
If $g(a) \ne 0$, show that
$$d\frac{f}{g}(a) = \frac{g(a)\, df(a) - f(a)\, dg(a)}{g(a)^2}.$$

6. Suppose $f$ is a real-valued function, all of whose partial
derivatives exist at $a \in E^n$. Define
$$\nabla f(a) = \sum_{k=1}^{n} D_k f(a) e_k,$$
and call this the gradient of $f$ at $a$. If $df(a)$ exists show that $\forall u \in E^n$
$$df(a)(u) = \nabla f(a) \cdot u.$$

7. Suppose $f$ is a real-valued function with domain in $E^n$ and that
$df(a)$ exists. Define the real-valued function $g$ on the unit sphere in
$E^n$, $\{u : u \in E^n \ \&\ |u| = 1\}$, by
$$g(u) = df(a)(u) = D_u f(a).$$
If $df(a)$ is not the zero transformation, show that $g$ has a maximum
when $u = \nabla f(a)/|\nabla f(a)|$ and $g$ does not take on its maximum at any
other vector. In other words the maximum of the directional
derivatives of $f$ at $a$ is taken on in the direction of the gradient of $f$ at $a$.

8. Suppose that $f$ is a linear transformation with domain $E^n$ and
range in $E^m$. Show that $\forall a \in E^n$, $df(a) = f$.

9. Suppose $f$ is a function with domain in $E^n$ and range in $E^m$.
Write $f$ componentwise as
$$f = (f_1, \ldots, f_m),$$
where $\forall k \in (1,m)$ $f_k$ is a real-valued function with $\mathcal{D}(f_k) = \mathcal{D}(f)$.
Show that $df(a)$ exists $\Leftrightarrow$ $\forall k \in (1,m)$ $df_k(a)$ exists and moreover
$\forall u \in E^n$,
$$df(a)(u) = (df_1(a)(u), \ldots, df_m(a)(u)).$$

10. Suppose $f$ is a function with domain in $E^n$ and range in $E^m$
and $\forall t \in R^+$ so that $tx \in \mathcal{D}(f)$ we have
$$f(tx) = t^p f(x).$$
Such a function is called homogeneous of degree $p$. If $df(a)$ exists,
show that
$$df(a)(a) = p f(a).$$
This is known as Euler's relation.

11. (a) Let $A$ be a symmetric linear transformation from $E^n$ to $E^n$,
and set $f(x) = A(x) \cdot x$. Show that $\forall x \in E^n$, $df(x)$ exists and
moreover $\forall x, y \in E^n$
$$df(x)(y) = df(y)(x) = 2A(x) \cdot y.$$
(b) If $[b_{ij}]$ is any $n \times n$ matrix, then $\forall x \in E^n$ set
$$g(x) = \sum_{j=1}^{n} \sum_{i=1}^{n} b_{ij} x_i x_j.$$
Use the results of part (a) to show that $dg(x)$ exists $\forall x \in E^n$.

12. Suppose that $f$ and $g$ have a common domain in $E^n$ and both
have range in $E^m$. If $df(a)$ and $dg(a)$ exist and if $h(x) = f(x) \cdot g(x)$,
show that $dh(a)$ exists and $\forall u \in E^n$
$$dh(a)(u) = f(a) \cdot dg(a)(u) + g(a) \cdot df(a)(u).$$

13. Suppose that $f$ is a function with domain in $E^n$ and range in
$E^m$ so that $\forall x \in \mathcal{D}(f)$, $|f(x)| = 1$. If $df(a)$ exists, show that $\forall u \in E^n$
$$f(a) \cdot df(a)(u) = 0.$$
If we think of $f$ as parameterizing the unit sphere in $E^m$, then from a
geometric point of view this says that the tangent plane to the unit
sphere at a point is perpendicular to the line from the origin to the
point.

7.3 DIFFERENTIATION RULES
In this section we shall obtain the higher-dimensional differentiation
rules, analogous to those we derived in Section 4.2. The one slight
difference here is that the rules will be stated in the language of differentials.
7.3.1 Theorem. (a) If $f$ and $g$ each have their domain in $E^n$ and range
in $E^m$, and if $df(a)$ and $dg(a)$ exist, then $d(f+g)(a)$ exists and
$$d(f+g)(a) = df(a) + dg(a).$$
(b) If $f$ is a real-valued function with domain in $E^n$ and $g$ has domain in
$E^n$ and range in $E^m$, and if $df(a)$ and $dg(a)$ exist, then $d(fg)(a)$ exists and
$$d(fg)(a) = f(a)\, dg(a) + g(a)\, df(a).$$
[The function $fg$ has domain $\mathcal{D}(f) \cap \mathcal{D}(g)$ and $\forall x \in \mathcal{D}(fg)$ we have
$fg(x) = f(x)g(x)$. The symbol $g(a)\,df(a)$ is interpreted to mean that $\forall u \in E^n$
$g(a)\,df(a)(u) = df(a)(u)\,g(a)$ and this has meaning since $df(a)(u) \in R$.]
(c) If $g$ is real-valued with domain in $E^n$, $dg(a)$ exists, and $g(a) \ne 0$,
then $d(1/g)(a)$ exists and
$$d\frac{1}{g}(a) = -\frac{dg(a)}{g(a)^2}.$$
Proof. (a) Since $a$ is an interior point of $\mathcal{D}(f)$ and $\mathcal{D}(g)$, it is an
interior point of $\mathcal{D}(f+g)$. The function $df(a) + dg(a)$ is certainly a
linear transformation with domain $E^n$ and range in $E^m$. Now, $\forall \varepsilon > 0$,
$\exists \delta > 0$ so that $\forall x \in \mathcal{D}(f+g)$ with $|x-a| < \delta$ we have
$$|f(x) - f(a) - df(a)(x-a)| \le \varepsilon|x-a|/2,$$
$$|g(x) - g(a) - dg(a)(x-a)| \le \varepsilon|x-a|/2.$$
Thus the triangle inequality gives
$$|(f+g)(x) - (f+g)(a) - [df(a) + dg(a)](x-a)| \le \varepsilon|x-a|.$$
This constitutes the proof of (a).

(b) The point $a$ is an interior point of $\mathcal{D}(fg)$. Under the given
interpretation of $g(a)\,df(a)$, it follows that $f(a)\,dg(a) + g(a)\,df(a)$
is a linear transformation with domain $E^n$ and range in $E^m$. Since
$df(a)$ exists, $f$ is continuous at $a$ and thus $\exists M > 0$ and $\exists \eta > 0$ so that
$x \in \mathcal{D}(f)$ and $|x-a| < \eta \Rightarrow |f(x)| \le M$. We may suppose
that $M$ is large enough so that $|g(a)| \le M$ and $\forall u \in E^n$,
$|dg(a)(u)| \le M|u|$. Further, $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $\delta \le \eta$, and
$x \in \mathcal{D}(fg)$ and $|x-a| < \delta \Rightarrow$
$$|f(x) - f(a)| < \varepsilon/3M,$$
$$|f(x) - f(a) - df(a)(x-a)| \le (\varepsilon/3M)|x-a|,$$
$$|g(x) - g(a) - dg(a)(x-a)| \le (\varepsilon/3M)|x-a|.$$
Thus we have
$$|fg(x) - fg(a) - [f(a)\,dg(a) + g(a)\,df(a)](x-a)|
\le |f(x)|\,|g(x) - g(a) - dg(a)(x-a)|
+ |g(a)|\,|f(x) - f(a) - df(a)(x-a)|
+ |f(x) - f(a)|\,|dg(a)(x-a)|
\le \varepsilon|x-a|.$$
This completes the proof of (b).

(c) Since $dg(a)$ exists, it follows that $g$ is continuous at $a$. Thus,
since $g(a) \ne 0$, $\exists M > 0$ and $\exists \eta > 0$ so that $x \in \mathcal{D}(g)$ and
$|x-a| < \eta \Rightarrow g(x) \ne 0$ and $|1/g(x)g(a)| \le M$. We may suppose $M$ is large
enough so that $\forall u \in E^n$, $|dg(a)(u)| \le M|u|$. Further, $\forall \varepsilon > 0$, $\exists \delta > 0$
so that $\delta \le \eta$, and $x \in \mathcal{D}(g)$ and $|x-a| < \delta \Rightarrow$
$$|g(x) - g(a) - dg(a)(x-a)| \le \varepsilon|x-a|/2M,$$
$$\left|\frac{g(x)}{g(a)} - 1\right| < \varepsilon/2M^2.$$
Thus
$$\left|\frac{1}{g}(x) - \frac{1}{g}(a) + \frac{dg(a)(x-a)}{g(a)^2}\right|
\le \left|\frac{1}{g(x)g(a)}\right| |g(x) - g(a) - dg(a)(x-a)|
+ \left|\frac{1}{g(x)g(a)}\right| \left|\frac{g(x)}{g(a)} - 1\right| |dg(a)(x-a)|
\le \varepsilon|x-a|.$$
This completes the proof of (c).

In the higher-dimensional case the easiest and neatest way to state
the theorem about the differentiation of composite functions is in terms
of differentials. The proof is the same as for the one-dimensional case,
but we shall repeat it.
7.3.2 Theorem (Chain Rule). Suppose $g$ is a function with domain
in $E^r$ and range in $E^q$, $f$ is a function with domain in $E^q$ and range in $E^p$, and
$\mathcal{R}(g) \subset \mathcal{D}(f)$. If $dg(a)$ and $df(g(a))$ exist, then $df \circ g(a)$ exists and
$$df \circ g(a) = df(g(a)) \circ dg(a). \tag{7.3.1}$$
Proof. For simplicity of notation let us set $b = g(a)$. First, note that
since $df(b)$ is a linear transformation $\exists N > 0$ so that $\forall u \in E^q$,
$|df(b)(u)| \le N|u|$. Hence
$$|df(b)(g(x) - g(a)) - df(b) \circ dg(a)(x-a)| \le N\,|g(x) - g(a) - dg(a)(x-a)|. \tag{7.3.2}$$
Next, $\forall \varepsilon > 0$, $\exists \delta' > 0$ so that $\forall x \in \mathcal{D}(g)$ with $|x-a| < \delta'$ we have
$$|g(x) - g(a) - dg(a)(x-a)| \le \varepsilon|x-a|/2N. \tag{7.3.3}$$
If we apply the triangle inequality to (7.3.3) and use the fact that
$\exists M > 0$ so that $\forall u \in E^r$, $|dg(a)(u)| \le M|u|$, we get that $\forall x \in \mathcal{D}(g)$
with $|x-a| < \delta'$,
$$|g(x) - g(a)| \le \varepsilon|x-a|/2N + |dg(a)(x-a)| \le L|x-a|, \tag{7.3.4}$$
where $L = M + \varepsilon/2N$. Finally, $\forall \varepsilon > 0$, $\exists \delta'' > 0$ so that $\forall x \in \mathcal{D}(f \circ g)$
with $|g(x) - g(a)| < \delta''$ we have
$$|f \circ g(x) - f(b) - df(b)(g(x) - b)| \le \varepsilon|g(x) - b|/2L. \tag{7.3.5}$$
Let us set $\delta = \min(\delta', \delta''/L)$; then $\forall x \in \mathcal{D}(f \circ g)$ with $|x-a| < \delta$ we
have from (7.3.2) through (7.3.5),
$$|f \circ g(x) - f \circ g(a) - df(g(a)) \circ dg(a)(x-a)|
\le |f \circ g(x) - f(b) - df(b)(g(x) - b)|
+ |df(b)(g(x) - b) - df(b) \circ dg(a)(x-a)|
\le \varepsilon|g(x) - g(a)|/2L + N\,|g(x) - g(a) - dg(a)(x-a)|
\le \varepsilon|x-a|.$$
This shows that $df \circ g(a)$ exists and moreover has the value given by
(7.3.1).
WARNING: The linear transformation $df(g(a))$ is the differential of $f$ at
the point $b = g(a)$ and should not be confused with $df \circ g(a)$, which is the
differential of $f \circ g$ at the point $a$.
The formula for the differential of a composite function is in very
abbreviated form, and for practical computational purposes it is useful
to compute the entries of the Jacobian matrix of $df \circ g(a)$. From the
formula (7.3.1) it follows that the Jacobian matrix of $df \circ g(a)$ is the
product of the Jacobian matrix of $df(g(a))$ with the Jacobian matrix of
$dg(a)$. If we perform the multiplication of these two Jacobian matrices
we get
$$D_k(f \circ g)_j(a) = \sum_{i=1}^{q} D_i f_j(g(a))\, D_k g_i(a). \tag{7.3.6}$$
This formula may be rather difficult to remember, although with some
practice one can fix in one's mind the positions of the $i$, $j$, and $k$.
However, if we revert to the notations
$$\frac{\partial g_i(a)}{\partial x_k} = D_k g_i(a),$$
then (7.3.6) becomes
$$\frac{\partial (f \circ g)_j(a)}{\partial x_k} = \sum_{i=1}^{q} \frac{\partial f_j(g(a))}{\partial g_i}\, \frac{\partial g_i(a)}{\partial x_k}. \tag{7.3.6'}$$
This formula may be easier to remember because of the temptation to
cancel $\partial g_i(a)$ with $\partial g_i$.
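Formula (7.3.6) says that the Jacobian matrix of the composite is a matrix product, and this can be checked numerically. The sketch below does so for two sample maps on $E^2$; the maps `f` and `g`, the helper names, and the step size are assumptions chosen for illustration, and the derivatives are central-difference approximations rather than exact values.

```python
import math


def g(x):
    """Hypothetical inner map from E^2 to E^2."""
    return [x[0] * x[1], x[0] + x[1]]


def f(y):
    """Hypothetical outer map from E^2 to E^2."""
    return [math.sin(y[0]), y[0] * y[1]]


def jacobian(func, a, h=1e-6):
    """Central-difference approximation; entry [j][k] approximates D_k func_j(a)."""
    n = len(a)
    m = len(func(a))
    J = [[0.0] * n for _ in range(m)]
    for k in range(n):
        forward = list(a)
        backward = list(a)
        forward[k] += h
        backward[k] -= h
        fp, fm = func(forward), func(backward)
        for j in range(m):
            J[j][k] = (fp[j] - fm[j]) / (2 * h)
    return J


def matmul(A, B):
    """Entry [j][k] is the sum over i of A[j][i] * B[i][k], as in (7.3.6)."""
    return [[sum(A[j][i] * B[i][k] for i in range(len(B)))
             for k in range(len(B[0]))] for j in range(len(A))]


a = [0.3, 1.2]
composite = jacobian(lambda x: f(g(x)), a)           # Jacobian of f o g at a
product = matmul(jacobian(f, g(a)), jacobian(g, a))  # J_f(g(a)) times J_g(a)
```

Note that the Jacobian of `f` is evaluated at `g(a)`, not at `a`, which is the point of the warning above.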
An analogue to Theorem 4.2.3 is also valid in higher dimensions.
The statement is as follows.
7.3.3 Theorem. Suppose $g$ is a function with domain in $E^r$ and range
in $E^q$, $f$ is a function with domain in $E^q$ and range in $E^p$, and $\mathcal{R}(g) \subset \mathcal{D}(f)$.
If $df \circ g(a)$ and $df(g(a))$ exist, the latter is nonsingular, and $g$ is continuous
at $a$, then $dg(a)$ exists and
$$dg(a) = df(g(a))^{-1} \circ df \circ g(a).$$
The proof of this theorem follows mutatis mutandis the proof of
Theorem 4.2.3 and we shall not reproduce it here. However, the reader
may find it rewarding to trace through the proof. Note that in Theorem
7.3.2 the condition $\mathcal{R}(g) \subset \mathcal{D}(f)$ assures us that $a$ is an interior point
of $\mathcal{D}(f \circ g)$. In Theorem 7.3.3 the condition $\mathcal{R}(g) \subset \mathcal{D}(f)$ is required
in the proof.
7.3.4 Mean Value Theorem. Suppose $f$ is a real-valued function with
$\mathcal{D}(f) \subset E^n$ and the line segment $A = \{x : x = tb + (1-t)a \ \&\ t \in [0,1]\}$
is contained in $\mathcal{D}(f)^0$. If $\forall x \in A$, $df(x)$ exists, then $\exists c = (1-\gamma)a + \gamma b$,
$\gamma \in\ ]0,1[$, so that
$$f(b) - f(a) = df(c)(b - a).$$
Proof. For $t \in [0,1]$ set
$$F(t) = f(tb + (1-t)a).$$
Either by direct computation or the use of the chain rule we find that
$$F'(t) = df(tb + (1-t)a)(b - a).$$
By the use of the one-dimensional mean value theorem, $\exists \gamma \in\ ]0,1[$
so that
$$F(1) - F(0) = F'(\gamma).$$
But $F(1) = f(b)$ and $F(0) = f(a)$, which proves the result.
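The reduction to one dimension used in this proof can be followed on a concrete case. The sketch below takes a sample function on $E^2$ and locates a value of $\gamma$ by bisection; the function, the endpoints, and the use of bisection are assumptions for this example only (bisection works here because $F'(t) - [f(b) - f(a)]$ changes sign on $[0,1]$, which need not happen in general).

```python
def f(x):
    """Hypothetical real-valued function on E^2."""
    return x[0] ** 3 + x[1]


def df(x, u):
    """Differential of f at x applied to u: the gradient (3*x1^2, 1) dotted with u."""
    return 3 * x[0] ** 2 * u[0] + u[1]


a, b = (0.0, 0.0), (1.0, 1.0)
direction = (b[0] - a[0], b[1] - a[1])


def point(t):
    """The point tb + (1 - t)a on the segment from a to b."""
    return (t * b[0] + (1 - t) * a[0], t * b[1] + (1 - t) * a[1])


def F_prime(t):
    """F'(t) = df(tb + (1 - t)a)(b - a) for F(t) = f(tb + (1 - t)a)."""
    return df(point(t), direction)


target = f(b) - f(a)

lo, hi = 0.0, 1.0  # F'(0) - target < 0 < F'(1) - target for this example
for _ in range(60):
    mid = (lo + hi) / 2
    if F_prime(mid) - target < 0:
        lo = mid
    else:
        hi = mid

gamma = (lo + hi) / 2
c = point(gamma)  # the point of the segment at which df(c)(b - a) = f(b) - f(a)
```

For this choice $F(t) = t^3 + t$, so the equation $F'(\gamma) = 2$ has the solution $\gamma = 1/\sqrt{3}$ in $]0,1[$.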
7.3.5 Corollary. Suppose $f$ is a function with domain in $E^n$ and range
in $E^m$ and the line segment $A$ is in $\mathcal{D}(f)^0$. If $\forall x \in A$, $df(x)$ exists, then
there exists a linear transformation $L$ with domain $E^n$ and range in $E^m$ so that
$$f(b) - f(a) = L(b - a).$$
Proof. Write
$$f = \sum_{k=1}^{m} f_k e_k.$$
Now, $\forall k \in (1,m)$, $\exists c_k \in A$ so that
$$f_k(b) - f_k(a) = df_k(c_k)(b - a).$$
For every $u \in E^n$ define
$$L(u) = \sum_{k=1}^{m} df_k(c_k)(u)\, e_k.$$
It is clear that $L$ has all the properties described in the corollary.
Exercises

1. Suppose $f$ is a real-valued function with domain $E^3$ defined by
$$f(x) = (x_1)^2 - x_2x_3 - \sin x_1x_3.$$
Let $g$ be the function with domain and range $E^3$ whose components are
$$g_1(x) = x_1 \cos x_2,$$
$$g_2(x) = x_1 \sin x_2 \cos x_3,$$
$$g_3(x) = x_1 \sin x_2 \sin x_3.$$
Compute the three partial derivatives of $f \circ g$.

2. Let $g$ be a function with domain $E^n$ and range in $E^n$. If $g_k$ is
the $k$th component of $g$, suppose that
$$g_k(x) = g_k(x_k),$$
a function depending only on $x_k$. Suppose that $f$ is a function with
$\mathcal{R}(g) \subset \mathcal{D}(f)$. Assuming appropriate differentiability properties for
$f$ and $g$, show that
$$J_{f \circ g}(x) = J_f(g(x)) \prod_{k=1}^{n} g_k'(x_k).$$

3. Suppose $x \in E^9$ and the components of $x$ are labeled in some
fixed order as $x_{ij}$, $i, j \in (1,3)$. Suppose we set
$$\Delta(x) = \det[x_{ij}].$$
Show that $\forall a \in E^9$,
$$\frac{\partial \Delta(a)}{\partial x_{ij}} = \mathrm{Co}(a_{ij}).$$
If $\forall i, j \in (1,3)$, $g_{ij}$ is a differentiable function with domain $[0,1]$,
show by the chain rule that
$$D(\Delta \circ g) =
\begin{vmatrix} g_{11}' & g_{12}' & g_{13}' \\ g_{21} & g_{22} & g_{23} \\ g_{31} & g_{32} & g_{33} \end{vmatrix}
+ \begin{vmatrix} g_{11} & g_{12} & g_{13} \\ g_{21}' & g_{22}' & g_{23}' \\ g_{31} & g_{32} & g_{33} \end{vmatrix}
+ \begin{vmatrix} g_{11} & g_{12} & g_{13} \\ g_{21} & g_{22} & g_{23} \\ g_{31}' & g_{32}' & g_{33}' \end{vmatrix}.$$

4. Generalize the results of Exercise 3 to $n \times n$ determinants.

5. Suppose that $f$ is a one-to-one function and $df(a)$ and $df^{-1}(f(a))$
exist. Show that $df(a)$ is nonsingular and that $df(a)^{-1} = df^{-1}(f(a))$.

6. Suppose $f$ and $g$ are functions which have a common domain
$\Omega \subset E^n$ which is open and connected. Suppose further that $\forall x \in \Omega$
the differentials of $f$ and $g$ at $x$ exist and $df(x) = dg(x)$. Show that there
is a vector $c$ so that $\forall x \in \Omega$, $f(x) = g(x) + c$.

7. Compute the matrix representation of the linear transformation
$L$ obtained in Corollary 7.3.5 with respect to the ordered pair of ordered
bases $((e_1, \ldots, e_n), (e_1, \ldots, e_m))$.

8. Suppose the conditions of Corollary 7.3.5 are satisfied. Show that
$\forall u \in E^m$, $\exists c \in A$ so that
$$[f(b) - f(a)] \cdot u = df(c)(b - a) \cdot u.$$

9. Under the hypotheses of Exercise 8 and the additional hypothesis
that $\forall c \in A$, $\|df(c)\| \le M$, use the results of Exercise 8 to show that
$\|L\| \le M$, where $L$ is any linear transformation satisfying the conclusion
of Corollary 7.3.5. Hence, deduce that
$$|f(b) - f(a)| \le M\,|b - a|.$$
[For definition of the symbol $\|\ \|$ see (6.5.3).]

7.4 HIGHER-ORDER DIFFERENTIALS AND TAYLOR'S THEOREM
In Section 7.2 we discussed the possibility of defining the function $D_u$
on the collection $\mathcal{P}$ of all functions each of which has its domain in some
Euclidean space and range in some Euclidean space. ($\mathcal{P}$ includes the
function whose domain is the null set.) More specifically, $\forall f \in \mathcal{P}$,
$D_u f$ is that function whose domain consists of all $x \in \mathcal{D}(f)$ so that
$D_u f(x)$ exists and takes on that value at $x$. Clearly, it may very well
happen that $\mathcal{D}(D_u f) = \emptyset$. Nevertheless, this approach allows us to
form the composition $D_v \circ D_u$ and this is a well-defined function with
domain $\mathcal{P}$ and range in $\mathcal{P}$.
It is, of course, "intuitively clear" how to define any finite number
of compositions of such functions. However, the precise way to handle
the situation is to use the idea of recursive definition or definition by
induction which we have mentioned several times before. The process
is entirely analogous to the use of the sum function or the product
function as done in Chapter 3. We again refer the reader to the book by
Kershner and Wilcox cited at the end of Section 1.1 for an easily
accessible treatment of the idea of inductive definition.

Let $\mathcal{S}$ be the collection of all function sequences $F$ so that
$F = (F_k) \in \mathcal{S} \Rightarrow \forall k \in N_0$, $\mathcal{D}(F_k) = \mathcal{P}$, and
$\mathcal{R}(F_k) \subset \mathcal{P}$. It is not hard to prove,
using the axiom of induction, that there is a unique function $\Gamma$ with
$\mathcal{D}(\Gamma) = \mathcal{S}$, $\mathcal{R}(\Gamma) \subset \mathcal{S}$ so that $\forall F = (F_k) \in \mathcal{S}$,
$$\Gamma(F)(0) = F_0,$$
$$\Gamma(F)(n+1) = F_{n+1} \circ \Gamma(F)(n).$$
Let us remark that according to Definition 1.3.5 the composition of
two functions is always defined, even though the resulting function may
be that one whose domain is the null set. This, of course, allows us the
possibility of establishing the existence of $\Gamma$. If $F = (F_k) \in \mathcal{S}$ we shall
write
$$\prod_{k=0}^{n} F_k = \Gamma(F)(n).$$
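The recursion that defines the iterated composition translates directly into a loop. The sketch below is only an analogy: in the text the $F_k$ act on the collection of functions (for example, $F_k = D_{u_k}$), while here, as an assumption made to keep the example short, they are ordinary functions of a number; the name `iterated_composition` is likewise hypothetical.

```python
def iterated_composition(F, n):
    """Return F[n] o F[n-1] o ... o F[0], built by the recursion
    value(0) = F[0], value(k + 1) = F[k + 1] o value(k)."""
    result = F[0]
    for k in range(n):
        outer, inner = F[k + 1], result
        # Bind the current pair explicitly so each step keeps its own functions.
        result = (lambda o, i: (lambda x: o(i(x))))(outer, inner)
    return result


# A short sample sequence: add one, then double, then square.
F = [lambda x: x + 1, lambda x: 2 * x, lambda x: x ** 2]

first = iterated_composition(F, 0)(3)   # F_0(3)
second = iterated_composition(F, 1)(3)  # F_1(F_0(3))
third = iterated_composition(F, 2)(3)   # F_2(F_1(F_0(3)))
```

As in the product notation above, the factor with the largest index is applied last.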
The usual conventions hold when we write $\prod_{k=m}^{n} F_k$, and so on. In
the case of partial derivatives, we shall often use a special notation.
If $j = (j_1, \ldots, j_n)$, where $\forall k \in (1,n)$, $j_k \in N$, then we shall set
$$D_j = \prod_{k=1}^{n} D_{j_k} = D_{j_n} \circ D_{j_{n-1}} \circ \cdots \circ D_{j_1}.$$
7.4.1 Definition. If $f$ is a function with domain in $E^n$ and range in
$E^m$, and if $j = (j_1, \ldots, j_q)$, where $\forall k \in (1,q)$, $j_k \in (1,n)$, then we call
$D_j f(a)$ a $q$th-order partial derivative of $f$ at $a$.
Other, more standard, notations for a $q$th-order partial derivative of
a function $f$ are in common use. We shall prefer to remain with the
symbols we have chosen in the previous paragraph.
Now, if $j, k \in N$ and $j \ne k$, it is not true that $D_j \circ D_k = D_k \circ D_j$.
However, for a fixed $f$ it may often happen that $D_j \circ D_k f = D_k \circ D_j f$, and
for a fixed $f$ and $a \in \mathcal{D}(f)$ it is more likely to happen that
$D_j \circ D_k f(a) = D_k \circ D_j f(a)$.
The next theorem gives sufficient conditions under
which the last commutativity relation holds.
7.4.2 Theorem. Suppose f is a function with domain in En and range

in Em. Suppose further that j, k E (I, n) and there is a ball B(a, p) contained
in the domains of Dif, Dkf, and Di Dkf, and the latter is continuous at a.
Then Dk Dif(a) exists and moreover
0
(7.4.1)
Proof. Without loss of generality we may suppose that $f$ is real valued. Otherwise we could prove the theorem for each real-valued component of $f$, and the theorem itself would follow from this.

We must show that
$$\lim_{h \to 0} \frac{1}{h} [D_j f(a + h e_k) - D_j f(a)] \tag{7.4.2}$$
exists and its value is $D_j \circ D_k f(a)$. Now, $\forall x \in B(a, \rho)$ we know that $D_j f(x)$ exists and, of course,
$$D_j f(x) = \lim_{t \to 0} \frac{1}{t} [f(x + t e_j) - f(x)].$$
If $x \in B(a, \rho)$ and we set
$$\Delta_t f(x) = \frac{1}{t} [f(x + t e_j) - f(x)],$$
then for all sufficiently small $t$, $x + t e_j \in B(a, \rho)$, and thus we can write $D_j f(x) = \lim_{t \to 0} \Delta_t f(x)$. Further, $D_k \Delta_t f(x)$ exists. Consequently,
$$\frac{1}{h} [D_j f(a + h e_k) - D_j f(a)] = \frac{1}{h} \lim_{t \to 0} [\Delta_t f(a + h e_k) - \Delta_t f(a)] = \lim_{t \to 0} D_k \Delta_t f(a + \theta h e_k), \tag{7.4.3}$$
where $\theta$, $0 < \theta < 1$, depends on $a$, $h$, $t$, $j$, and $k$, and of course is obtained by use of the Mean Value Theorem.

Let us now write $D_k \Delta_t f(a + \theta h e_k)$ in terms of $f$ to get
$$D_k \Delta_t f(a + \theta h e_k) = \frac{1}{t} [D_k f(a + \theta h e_k + t e_j) - D_k f(a + \theta h e_k)] = D_j \circ D_k f(a + \theta h e_k + \theta' t e_j), \tag{7.4.4}$$
where $\theta'$, $0 < \theta' < 1$, depends on $a$, $\theta$, $h$, $j$, $k$, and $t$, and is also obtained by the use of the Mean Value Theorem. From the continuity of $D_j \circ D_k f$ at $a$, $\forall \varepsilon > 0$, $\exists \delta > 0$ so that
$$|\theta h e_k + \theta' t e_j| < \delta \Rightarrow |D_j \circ D_k f(a + \theta h e_k + \theta' t e_j) - D_j \circ D_k f(a)| < \varepsilon,$$
or using (7.4.4) we have
$$|D_k \Delta_t f(a + \theta h e_k) - D_j \circ D_k f(a)| < \varepsilon.$$
If we use the fact that the absolute value is a continuous function, then by allowing $t \to 0$ in the left side of the last inequality, and using (7.4.3) we get $\forall h$ for which $|h| < \delta$,
$$\left| \frac{1}{h} [D_j f(a + h e_k) - D_j f(a)] - D_j \circ D_k f(a) \right| \le \varepsilon.$$
But, this simply says that $D_k \circ D_j f(a)$ exists and is equal to $D_j \circ D_k f(a)$, which completes the proof.
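The continuity hypothesis in Theorem 7.4.2 cannot simply be dropped. The following Python sketch illustrates this with a classical example that is not taken from the text: the function $f(x, y) = xy(x^2 - y^2)/(x^2 + y^2)$, with $f(0, 0) = 0$. The helper names `d1` and `d2` and the step sizes are assumptions chosen for illustration; the partial derivatives are only approximated by central differences, with a small inner step and a larger outer step.

```python
def f(x, y):
    """Classical example whose two mixed second partials differ at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)


def d1(g, x, y, t=1e-6):
    """Central-difference approximation of the first partial D_1 g at (x, y)."""
    return (g(x + t, y) - g(x - t, y)) / (2.0 * t)


def d2(g, x, y, t=1e-6):
    """Central-difference approximation of the second partial D_2 g at (x, y)."""
    return (g(x, y + t) - g(x, y - t)) / (2.0 * t)


# D_2 o D_1 f(0, 0): differentiate in the first variable (small inner step),
# then in the second variable (larger outer step).
d2_d1_at_origin = d2(lambda x, y: d1(f, x, y), 0.0, 0.0, t=1e-2)

# D_1 o D_2 f(0, 0): the same two operations in the opposite order.
d1_d2_at_origin = d1(lambda x, y: d2(f, x, y), 0.0, 0.0, t=1e-2)

print(d2_d1_at_origin, d1_d2_at_origin)  # approximately -1 and +1
```

For this function both mixed partials exist at the origin but neither is continuous there, so the theorem does not apply and the two orders of differentiation give different values.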
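7.4.3 Corollary. Let $\Omega$ be an open set in $E^n$, $i = (i_1, \ldots, i_q)$, where $\forall j \in (1, q)$, $i_j \in (1, n)$, $\sigma$ any permutation of $(1, q)$, $\sigma^* i = (i_{\sigma 1}, \ldots, i_{\sigma q})$, and $C^q(\Omega)$ the collection of all functions $f$ with domain $\Omega$ and range in some $E^m$ so that all partial derivatives of $f$ of order $q$ have $\Omega$ as their domain and are continuous. Then $\forall f \in C^q(\Omega)$,
$$D_i f = D_{\sigma^* i} f.$$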
P(q)
P(l) is true and P(2) is
true by the previous theorem. Suppose q ;;;:-: 2 and assume P ( q) is true;
we shall prove that P(q + 1) is true. Suppose cr(l)=r. If r
1 we may
Proof.
We shall prove this by induction on the order q. Let
be the statement of the corollary. Clearly,
write
Of course, the fact that
(g
follows from
P(q).
If
D;k (D;J) =
q, we
11 D;kf=(ft
(g
D;"K (D;J)
have from
P(q),
D; k (D;J)
kr
Since the domain of both sides is. we may apply

Hence, if we do this, then by the use of
P(q)
D;q+I
we get
to both sides.
DIf
(D
= (Ii
=
lq+l
k=2
q 0D- (D f)
fl
I,
'k
k=l
k,,.r
D;uk
) (D;ulf) =Du ;f.

Finally, a similar argument gives the case when $r = q + 1$. Hence $P(q) \Rightarrow P(q + 1)$, and the corollary is proved.
7.4.4 Definition. A function $f$ is said to belong to the class $C^k$ $\Leftrightarrow$ all of the partial derivatives of $f$ of order $k$ have domain $\mathfrak{D}(f)$ and are continuous. The function $f$ is said to belong to the class $C^\infty$ $\Leftrightarrow$ all of the partial derivatives of $f$ (of all orders) have domain $\mathfrak{D}(f)$ and are continuous.
If $f \in C^k$, then from Corollary 7.4.3 it follows that any partial derivative of $f$ of order $k$ is independent of the manner in which the partial derivatives are composed. Hence the operator $D^\alpha$, which we shall define in an informal way below, becomes rather useful. Suppose that $\mathfrak{D}(f) \subset E^n$ and $\mathfrak{R}(f) \subset E^m$ and $\alpha = (\alpha_1, \ldots, \alpha_n)$, where $\alpha_j \in N_0$. We shall set
$$|\alpha| = \sum_{j=1}^{n} \alpha_j. \tag{7.4.5}$$
If $f \in C^k$ and $|\alpha| \le k$, we shall set
$$D^\alpha f = \Bigl( \prod_{j=1}^{n} D_j^{\alpha_j} \Bigr)(f), \tag{7.4.6}$$
where $D_j^{\alpha_j}$ is $\alpha_j$ compositions of $D_j$ and $D_j^0 f = f$. The notation $D^\alpha f$ has become popular with the advent of the modern theory of partial differential equations.
7.4.5 Definition. Suppose $f$ is a function with domain in $E^n$ and range in $E^m$ which has a differential at the point $a$ and $\forall j \in (1, k - 1)$ and $\forall (u_1, \ldots, u_j)$, where $u_i \in E^n$, the function $\prod_{i=1}^{j} D_{u_i} f$ has a differential at $a$. Then $\forall (u_1, \ldots, u_k)$ we set
$$d^k f(a)(u_1, \ldots, u_k) = \prod_{i=1}^{k} D_{u_i} f(a), \tag{7.4.7}$$
and call $d^k f(a)$ the $k$th-order differential of $f$ at $a$. If $u_1 = u_2 = \cdots = u_k = u$, we shall set
$$d^k f(a)(u)^k = d^k f(a)(u, \ldots, u), \tag{7.4.8}$$
and $d^0 f(a)(u)^0 = f(a)$.

A special case of Theorem 7.2.5 says that if a function has continuous first partials at a point, then the function has a differential at the point. The same type of result holds for $k$th-order differentials, although for the sake of simplicity we shall state it in a slightly less general form than is possible.
7.4.6 Theorem. Suppose $f$ has domain in $E^n$ and range in $E^m$, and there is a ball $B(a, \rho) \subset \mathfrak{D}(f)$, so that all the partials of $f$ of order $k$, $(k \ge 1)$, have $B(a, \rho)$ in their domains and are continuous there. Then $f$ has a $k$th-order differential at every $x \in B(a, \rho)$ and
$$d^k f(x)(u_1, \ldots, u_k) = \sum_{i_1 = 1}^{n} \cdots \sum_{i_k = 1}^{n} u_1^{i_1} \cdots u_k^{i_k} \prod_{j=1}^{k} D_{i_j} f(x). \tag{7.4.9}$$

Proof. Let $P(k)$ be the statement of the theorem. The statement $P(1)$ is true by Theorem 7.2.5. Assume $P(k)$ is true and we shall try to prove $P(k + 1)$. Since the hypotheses of $P(k + 1)$ imply the hypotheses of $P(k)$, it follows from $P(k)$ that $f$ has a $k$th-order differential at every point of $B(a, \rho)$ and (7.4.9) holds.

Each function on the right side of (7.4.9) has, by the hypotheses of $P(k + 1)$, continuous first partials on $B(a, \rho)$, and thus by Theorem 7.2.5 has a differential at each point of this ball. Further, $\forall u_{k+1} \in E^n$ and $\forall x \in B(a, \rho)$
$$D_{u_{k+1}} \Bigl( \prod_{j=1}^{k} D_{i_j} f \Bigr)(x) = \sum_{i_{k+1} = 1}^{n} u_{k+1}^{i_{k+1}} D_{i_{k+1}} \Bigl( \prod_{j=1}^{k} D_{i_j} f \Bigr)(x).$$
Hence, if we apply $D_{u_{k+1}}$ to both sides of (7.4.9) and use the last equality we see that the induction is complete.

REMARK: Clearly there is nothing special about partial derivatives and we could have stated the previous theorem in terms of the directional derivatives of any basis for $E^n$. Note that the right side of (7.4.9) shows that $d^k f(x)$ is multilinear.
As an example of the formula (7.4.9) let us write down $d^3 f(x)(u)^3$ in terms of the more classical terminology for partial derivatives. We have
$$d^3 f(x)(u)^3 = \sum_{i_1 = 1}^{n} \sum_{i_2 = 1}^{n} \sum_{i_3 = 1}^{n} u^{i_1} u^{i_2} u^{i_3} \frac{\partial^3 f(x)}{\partial x_{i_3} \partial x_{i_2} \partial x_{i_1}},$$
where of course $u^{i_j}$ is the $i_j$ component of the vector $u$.
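The triple sum can be checked on a small case. The sketch below assumes a hypothetical function $f(x^1, x^2) = (x^1)^2 x^2$, which is not an example from the text; its third-order partials are constant, equal to 2 when exactly two of the three indices are 1 and the remaining one is 2, and zero otherwise. Expanding $F(t) = f(x + tu)$ by hand gives $F'''(t) = 6 (u^1)^2 u^2$, which the sum should reproduce.

```python
from itertools import product


def third_partial(i1, i2, i3):
    """Third-order partial of the hypothetical f(x1, x2) = (x1)**2 * x2.

    The value is 2 when the indices are (1, 1, 2) in some order, and 0 otherwise.
    """
    return 2.0 if sorted((i1, i2, i3)) == [1, 1, 2] else 0.0


def d3f(u):
    """Evaluate the triple sum for d^3 f(x)(u)^3 with n = 2."""
    total = 0.0
    for i1, i2, i3 in product((1, 2), repeat=3):
        total += u[i1 - 1] * u[i2 - 1] * u[i3 - 1] * third_partial(i1, i2, i3)
    return total


u = (0.5, -2.0)
triple_sum = d3f(u)
closed_form = 6.0 * u[0] ** 2 * u[1]  # third derivative of t -> f(x + t u)
print(triple_sum, closed_form)
```

Three of the eight index triples contribute, which is where the factor 3 in $6 = 3 \cdot 2$ comes from.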
7.4.7 Theorem. Suppose $f$ is a real-valued function with $\mathfrak{D}(f) \subset E^n$, and the line segment $L = \{x : x = tb + (1 - t)a \ \&\ t \in [0, 1]\}$ is contained in $\mathfrak{D}(f)$. If $\forall k \in (1, m)$ and $\forall x \in L$, $f$ has a $k$th-order differential $d^k f(x)$, then $\forall x \in L$, $\exists c \in L$ so that $c = \gamma x + (1 - \gamma)a$, $\gamma \in \,]0, 1[$, and
$$f(x) = \sum_{k=0}^{m-1} \frac{1}{k!} d^k f(a)(x - a)^k + \frac{1}{m!} d^m f(c)(x - a)^m. \tag{7.4.10}$$

Proof. For $\forall t \in [0, 1]$ let us set
$$F(t) = f(tx + (1 - t)a).$$
Since $x \in L$ we may write $x = \tau b + (1 - \tau)a$ with $\tau \in [0, 1]$. Hence $tx + (1 - t)a = t\tau b + (1 - t\tau)a$ with $t\tau \in [0, 1]$. Thus $tx + (1 - t)a$ is in $L$ and hence by the chain rule, $\forall t \in [0, 1]$,
$$F'(t) = df(tx + (1 - t)a)(x - a).$$
It follows by an easy induction argument that $\forall k \in (1, m)$ and $\forall t \in [0, 1]$ we have
$$F^{(k)}(t) = d^k f(tx + (1 - t)a)(x - a)^k.$$
If we apply the one-dimensional Taylor formula with the Lagrange form of the remainder, Corollary 4.4.2(c), we get
$$F(1) = \sum_{k=0}^{m-1} \frac{1}{k!} F^{(k)}(0) + \frac{1}{m!} F^{(m)}(\gamma), \qquad \gamma \in \,]0, 1[.$$
If we substitute in for $F(1)$, $F^{(k)}(0)$, and $F^{(m)}(\gamma)$ we have completed the proof of the theorem.
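To see the intermediate point of formula (7.4.10) concretely, the following Python sketch locates $\gamma$ numerically. Everything specific here is an assumption made for illustration: the function $f(x^1, x^2) = \exp(x^1 + 2x^2)$, the points $a$ and $x$, and the order $m = 3$. For this $f$ one has $d^k f(p)(x - a)^k = s^k f(p)$ with $s = (x^1 - a^1) + 2(x^2 - a^2)$, so the Lagrange term is monotone in $\gamma$ and bisection applies.

```python
import math

a = (0.0, 0.0)
x = (0.5, 0.25)
m = 3

# For f = exp(x1 + 2*x2): d^k f(p)(x - a)^k = s**k * f(p).
s = (x[0] - a[0]) + 2.0 * (x[1] - a[1])


def f(p):
    return math.exp(p[0] + 2.0 * p[1])


def point(t):
    """The point t*x + (1 - t)*a on the segment from a to x."""
    return tuple(ai + t * (xi - ai) for ai, xi in zip(a, x))


def dkf(p, k):
    """k-th order differential of f at p applied to (x - a) repeated k times."""
    return s ** k * f(p)


partial_sum = sum(dkf(a, k) / math.factorial(k) for k in range(m))
remainder = f(x) - partial_sum


def lagrange_term(gamma):
    return dkf(point(gamma), m) / math.factorial(m)


# Bisection: lagrange_term is increasing in gamma because s > 0 here.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if lagrange_term(mid) < remainder:
        lo = mid
    else:
        hi = mid
gamma = 0.5 * (lo + hi)
print(gamma)
```

The theorem only asserts that some such $\gamma$ exists in $]0, 1[$; being able to solve for it in closed form or by bisection is a feature of this particular example.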

The above theorem is valid only for real-valued functions. However, by applying it componentwise to a function whose range is in $E^m$, $m > 1$, we do get a Taylor remainder formula. However, the reader should be cautioned that the same point $c$ will not work for all components and will in general change with the different components.

There is an integral formula for the Taylor remainder that looks like the integral remainder formula for functions defined in $R$. The integral formula does not require that $f$ be real-valued. We only note that if $f(t) = \sum_{k=1}^{m} f^k(t) e_k$ is a continuous function with domain $[a, b]$ we define
$$\int_a^b f(t)\,dt = \sum_{k=1}^{m} \Bigl( \int_a^b f^k(t)\,dt \Bigr) e_k.$$
7.4.7' Theorem. Suppose the hypotheses of Theorem 7.4.7 are satisfied and in addition $\forall x \in L$, $d^m f(tx + (1 - t)a)(x - a)^m$ is a continuous function of $t$. Then $\forall x \in L$,
$$f(x) = \sum_{k=0}^{m-1} \frac{1}{k!} d^k f(a)(x - a)^k + \int_0^1 \frac{(1 - t)^{m-1}}{(m - 1)!} d^m f(tx + (1 - t)a)(x - a)^m \,dt. \tag{7.4.10'}$$

Proof. As in the proof of Theorem 7.4.7 we set
$$F(t) = f(tx + (1 - t)a).$$
Then by the formula (5.2.1) we get
$$F(1) = \sum_{k=0}^{m-1} \frac{1}{k!} F^{(k)}(0) + \int_0^1 \frac{(1 - t)^{m-1}}{(m - 1)!} F^{(m)}(t)\,dt.$$
If we substitute in for $F^{(k)}(0)$ and $F^{(m)}(t)$ we get formula (7.4.10').

If we assume that $f \in C^m$, then we can write formulas (7.4.10) or (7.4.10') in the terminology of the operators $D^\alpha$. In this case Corollary 7.4.3 tells us that it doesn't matter in which order the $D_{i_j}$ are applied and we can write, for $k \le m$,
$$\frac{1}{k!} d^k f(x)(u)^k = \sum_{|\alpha| = k} u^\alpha c_\alpha D^\alpha f(x),$$
where $u^\alpha = \prod_{j=1}^{n} (u^j)^{\alpha_j}$, and $c_\alpha$ is a constant independent of $f$ and $x$. This can be proved by an easy induction argument on $k$. Suppose $p$ is a polynomial of $n$ variables of degree $m$; that is,
$$p(x) = \sum_{|\alpha| \le m} p_\alpha (x - a)^\alpha,$$
where $(x - a)^\alpha = \prod_{j=1}^{n} (x^j - a^j)^{\alpha_j}$. From Theorem 7.4.7 we can write
$$p(x) = \sum_{|\alpha| \le m} (x - a)^\alpha c_\alpha D^\alpha p(a),$$
since the $(m + 1)$st remainder vanishes. If $|\alpha| \le m$ and we apply $D^\alpha$ to both sides of this equation and evaluate at $a$, we get
$$\alpha!\, p_\alpha = D^\alpha p(a) = \alpha!\, c_\alpha D^\alpha p(a),$$
where $\alpha! = \prod_{j=1}^{n} \alpha_j!$. If we choose $p$ so that $p_\alpha \ne 0$, we see that $c_\alpha = 1/\alpha!$. Thus we can write the formula (7.4.10) in the form
$$f(x) = \sum_{|\alpha| \le m - 1} \frac{1}{\alpha!} D^\alpha f(a)(x - a)^\alpha + \sum_{|\alpha| = m} \frac{1}{\alpha!} D^\alpha f(c)(x - a)^\alpha.$$
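The multi-index form of Taylor's formula is exact for polynomials, and this can be checked with exact rational arithmetic. The sketch below is illustrative only: the polynomial $p(x^1, x^2) = x^1 (x^2)^2 + 3 (x^1)^2 x^2$, the point $a = (1, -1)$, and all helper names are assumptions. A polynomial is stored as a dictionary from exponent pairs to coefficients, and the coefficients of the expansion about $a$ are computed as $D^\alpha p(a)/\alpha!$.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def evaluate(p, x):
    """Evaluate a two-variable polynomial given as {(i, j): coefficient}."""
    return sum(c * x[0] ** i * x[1] ** j for (i, j), c in p.items())


def partial(p, k):
    """Partial derivative with respect to variable k (0 or 1), as a new dict."""
    q = {}
    for exps, c in p.items():
        if exps[k] == 0:
            continue
        new = list(exps)
        new[k] -= 1
        q[tuple(new)] = q.get(tuple(new), 0) + c * exps[k]
    return q


def d_alpha(p, alpha):
    """Apply D^alpha: alpha[k] successive partial derivatives in variable k."""
    for k, times in enumerate(alpha):
        for _ in range(times):
            p = partial(p, k)
    return p


def taylor_coefficients(p, a, m):
    """Coefficients D^alpha p(a) / alpha! for all multi-indices with |alpha| <= m."""
    coeffs = {}
    for alpha in product(range(m + 1), repeat=2):
        if sum(alpha) <= m:
            alpha_factorial = factorial(alpha[0]) * factorial(alpha[1])
            coeffs[alpha] = Fraction(evaluate(d_alpha(p, alpha), a), alpha_factorial)
    return coeffs


def taylor_value(coeffs, a, x):
    return sum(c * (x[0] - a[0]) ** i * (x[1] - a[1]) ** j
               for (i, j), c in coeffs.items())


p = {(1, 2): 1, (2, 1): 3}  # hypothetical example polynomial of degree 3
a = (1, -1)
coeffs = taylor_coefficients(p, a, 3)
x = (2, 3)
print(taylor_value(coeffs, a, x), evaluate(p, x))
```

Because the degree is 3, the expansion with $m = 3$ has no remainder, and the top-degree coefficients are unchanged by the shift to powers of $(x - a)$.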
Exercises
1.
Compute d3f(a)(x - a)3 for the following functions at the given
point a:
(a)
(b)
(c)
2.
f(x1, x2, x3)

(x2)2 + 2x1 (x2)2 + (x3)2, a= (I, 0, -1).
f(x1, x2, x3) =sin (x1 + x2 + x3), a= 0.
f(x1' x2) =exx, a= 0.
=
Write Taylor's formula about (0, 0) for
f(x1, x2) =sin (x1 + ex)

form=3.
3.
Suppose
p(x1' x2)
x1 (x2)2 + 3(x1)2 x2
x1 + 2.
Write this polynomial as a polynomial in powers of

4.
f E C2 and Dif(a) Dd(a)

JFJ(f), show 3M > 0 so that Vb E
Suppose
convex set in
IJ(b) - f(a)I
,,s; Mlb
(x1 - 1)
and
(x2
1).
0. If B is a compact
- a l2
C"' and JFJ(f) is an

JFJ(f) is a convex set containing a E JFJ(f),
and 3M so that Va andVx E B, ID"f(x)I :s; M. Show that Vx E B, the
remainder in Taylor's formula goes to zero as m -'> oo.
5.
Suppose
open set in E
6.
n
.
is a real-valued function of class
Suppose B C
State and prove an analogue of Bernstein's theorem, 4.4.4, for
functions with domain in E n,
7.5
n > 1.
THE INVERSE AND IMPLICIT FUNCTION THEOREMS
Suppose $f$ is a function with domain in $E^n$, range in $E^m$ and $df(a)$ exists. Since $df(a)(x - a) + f(a)$ approximates $f(x)$ very closely in a neighborhood of $a$ we might hope that if $df(a)$ is nonsingular, then $f$ itself, restricted to a neighborhood of $a$, is a one-to-one function. It turns out that this is essentially the case and our first object in this section is to prove this, and indeed somewhat more.
7.5.1 Proposition. Suppose $f$ is of class $C^1$ with an open domain in $E^n$ and range in $E^m$. For every compact set $K \subset \mathfrak{D}(f)$ and $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $\forall x, y \in K$ with $|x - y| < \delta$ and $\forall u \in E^n$ we have
$$|df(x)(u) - df(y)(u)| \le \varepsilon |u|, \tag{7.5.1}$$
$$|f(x) - f(y) - df(x)(x - y)| \le \varepsilon |x - y|. \tag{7.5.2}$$

Proof. Let $S = \{u : u \in E^n \ \&\ |u| = 1\}$ be the unit sphere in $E^n$. From the expansion
$$df(x)(u) = \sum_{j=1}^{n} u^j D_j f(x),$$
it is clear that $df(x)(u)$ is continuous on the Cartesian product $\mathfrak{D}(f) \times S$. If we restrict $df(x)(u)$ to $K \times S$, the restricted function is uniformly continuous. Thus $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $\forall x, y \in K$ with $|x - y| < \delta$, and $\forall u \ne 0$ we have
$$|df(x)(u/|u|) - df(y)(u/|u|)| < \varepsilon.$$
Using the homogeneity of $df(x)$ and $df(y)$, if we multiply the last inequality by $|u|$ we get (7.5.1). If $u = 0$, (7.5.1) is clearly true.

To prove (7.5.2), for every $a \in K$ we can find a ball $B(a, \delta(a)) \subset \mathfrak{D}(f)$ so that $\forall x, y \in B(a, \delta(a))$ we have (7.5.1). Using the mean value theorem, $\forall u \in E^m$ there is a $c$ on the straight line joining $x$ and $y$ so that
$$[f(x) - f(y) - df(x)(x - y)] \cdot u = [df(c)(x - y) - df(x)(x - y)] \cdot u.$$
Replace $u$ by $[f(x) - f(y) - df(x)(x - y)]$ and for the corresponding $c$ we get, using the C-B-S inequality,
$$|f(x) - f(y) - df(x)(x - y)| \le |df(c)(x - y) - df(x)(x - y)|.$$
If we now use the estimate (7.5.1) on the right, we have (7.5.2) in $B(a, \delta(a))$.

The collection $\{B(a, \delta(a)/2) : a \in K\}$ is an open covering for $K$ and thus reduces to a finite subcovering $\{B(a_j, \delta(a_j)/2) : j \in (1, q)\}$. Let $\delta = \min\{\delta(a_j)/2 : j \in (1, q)\}$ and suppose $x, y \in K$ with $|x - y| < \delta$. Now $\exists j \in (1, q)$ so that $|x - a_j| < \delta(a_j)/2$. Thus $|y - a_j| \le |y - x| + |x - a_j| < \delta(a_j)$. Hence $x, y \in B(a_j, \delta(a_j))$, and since we have the estimate (7.5.2) in this ball we have concluded the proof.
7.5.2 Corollary. If $f$ satisfies the hypotheses of Proposition 7.5.1 and $\exists a \in \mathfrak{D}(f)$ so that $df(a)$ is nonsingular, then there exists a ball $B(a, \delta) \subset \mathfrak{D}(f)$ and $\exists m > 0$, so that $\forall x \in B(a, \delta)$ and $\forall u \in E^n$ we have
$$|df(x)(u)| \ge m |u|. \tag{7.5.3}$$
Moreover, if $df(x)$ is nonsingular for every $x$ in a compact set $K \subset \mathfrak{D}(f)$, then $\exists m > 0$ so that (7.5.3) holds $\forall x \in K$.

Proof. Since $df(a)$ is nonsingular, it follows from Corollary 6.5.5 that $\exists m > 0$ so that $\forall u \in E^n$, $|df(a)(u)| \ge 2m |u|$. Now, from Proposition 7.5.1, $\exists \delta > 0$ so that $\forall x \in B(a, \delta)$ and $\forall u \in E^n$ we have
$$|df(a)(u)| - |df(x)(u)| \le m |u|.$$
Thus
$$|df(x)(u)| \ge |df(a)(u)| - m |u| \ge m |u|.$$
To prove the second statement, it follows from what we have just proved, that $\forall a \in K$, $\exists \delta(a) > 0$ and $\exists m(a) > 0$ so that (7.5.3) holds $\forall x \in B(a, \delta(a))$, provided $m$ is replaced by $m(a)$. The collection $\{B(a, \delta(a)) : a \in K\}$ is an open covering for $K$ and thus reduces to a finite subcovering $\{B(a_j, \delta(a_j)) : j \in (1, q)\}$. If we now take $m = \min\{m(a_j) : j \in (1, q)\}$ we have completed the proof.
The next two propositions constitute essentially the proof of the
Inverse Function Theorem.
7.5.3 Proposition. Suppose $f \in C^1$ has (an open) domain in $E^n$, range in $E^m$, and $df(a)$ is nonsingular. Then there exists a ball $B(a, \delta) \subset \mathfrak{D}(f)$ and $\exists m > 0$ so that $\forall x \in B(a, \delta)$, $df(x)$ is nonsingular and $\forall x, y \in B(a, \delta)$,
$$|f(x) - f(y)| \ge m |x - y|.$$
In particular this means that $f|B(a, \delta)$ is a one-to-one function.

Proof. From Corollary 7.5.2 there exists a ball $B(a, 2\delta_1) \subset \mathfrak{D}(f)$ and $\exists m > 0$ so that $\forall x \in B(a, 2\delta_1)$ and $\forall u \in E^n$ we have
$$|df(x)(u)| \ge 2m |u|. \tag{7.5.3'}$$
If we take $K$ as the closure of $B(a, \delta_1)$, then it follows from Proposition 7.5.1 that $\exists \delta$, $0 < \delta < \delta_1$, so that $\forall x, y \in K$ with $|x - y| < 2\delta$ we have
$$|f(x) - f(y) - df(x)(x - y)| \le m |x - y|. \tag{7.5.4}$$
Thus $\forall x, y \in B(a, \delta)$, we get from (7.5.3') and (7.5.4),
$$|f(x) - f(y)| \ge |df(x)(x - y)| - m |x - y| \ge m |x - y|.$$
This concludes the proof.
7.5.4 Proposition. Suppose $f$ is a one-to-one function with (an open) domain in $E^n$ and range in $E^n$. If $\forall x \in \mathfrak{D}(f)$, $df(x)$ exists and is nonsingular, then $\mathfrak{R}(f)$ is open. In particular, $f$ is an open map.

Proof. If $\bar{B}(a, \rho) \subset \mathfrak{D}(f)$, we shall show that $f(B(a, \rho))$ contains an open ball in $E^n$ with center at $f(a)$. Let $S = \bar{B}(a, \rho) \setminus B(a, \rho) = \{x : x \in E^n \ \&\ |x - a| = \rho\}$. $S$ is a compact set and since $f$ is continuous, $f(S)$ is compact. Further, since $f$ is one to one and $a \notin S$, it follows that $f(a) \notin f(S)$. Thus the distance from $f(a)$ to $f(S)$ is a positive number, $m$. Indeed, $m$ is nothing more than the minimum of that continuous function with domain $S$ and values $|f(x) - f(a)|$ (Fig. 7.5.1).

FIGURE 7.5.1

We claim that $B(f(a), m/2) \subset f(B(a, \rho))$. To see this let $y \in B(f(a), m/2)$ and let $h_y$ be that function with domain $\bar{B}(a, \rho)$ and defined by
$$h_y(x) = |f(x) - y|.$$
Since $h_y$ is a continuous real-valued function with a compact domain, it takes on a minimum. This minimum must be in the open set $B(a, \rho)$. Indeed, on the one hand,
$$\min h_y \le h_y(a) = |f(a) - y| < m/2,$$
and, on the other hand, if $x \in S$, then
$$h_y(x) = |f(x) - y| \ge |f(x) - f(a)| - |f(a) - y| > m - m/2 = m/2.$$
If $b$ is at a minimum point for $h_y$, it is also at a minimum point for $h_y^2$. But
$$h_y^2(x) = \sum_{j=1}^{n} [f^j(x) - y^j]^2,$$
and hence all the partial derivatives of $h_y^2$ exist. If we set
$$g_k(t) = h_y^2(b + t e_k),$$
then $g_k$ is defined in some open interval around $t = 0$, and $g_k(0)$ is a local minimum for $g_k$. Thus $\forall k \in (1, n)$,
$$\frac{dg_k(0)}{dt} = D_k h_y^2(b) = 0,$$
and hence $\forall u \in E^n$,
$$dh_y^2(b)(u) = \sum_{j=1}^{n} u^j D_j h_y^2(b) = 0.$$
Now, $h_y^2(x) = [f(x) - y] \cdot [f(x) - y]$ and thus $\forall u \in E^n$
$$dh_y^2(b)(u) = 2\, df(b)(u) \cdot [f(b) - y] = 0.$$
Since $df(b)$ is nonsingular, its range is all of $E^n$, and thus we must have $y = f(b)$.
7.5.5 Inverse Function Theorem. Suppose $f$ is of class $C^1$ with (an open) domain in $E^n$ and range in $E^n$. If $df(a)$ is nonsingular, then there exist open sets $U$ and $V$ in $E^n$ so that $a \in U$, $f(a) \in V$, $f|U$ is a one-to-one function with range $V$, and the inverse function is also of class $C^1$.

Proof. Since $df(a)$ is nonsingular, all the hypotheses of Proposition 7.5.3 are satisfied. Thus there is a ball $B(a, \rho) \subset \mathfrak{D}(f)$ so that $\forall x \in B(a, \rho)$, $df(x)$ is nonsingular, and $f_1 = f|B(a, \rho)$ is a one-to-one function. Thus $f_1$ satisfies the hypotheses of Proposition 7.5.4 and $\mathfrak{R}(f_1)$ is an open set. We take $U = B(a, \rho)$ and $V = \mathfrak{R}(f_1)$.

Let $g = f_1^{-1}$. Since $f_1$ is an open map, it follows from Corollary 6.4.5 that $g$ is continuous. Now, $f_1 \circ g$ is the identity function on $V$ and thus $\forall y \in V$, $d(f_1 \circ g)(y) = \iota$, where $\iota$ is the identity linear transformation on $E^n$. Further, $\forall x \in U$, $df(g(f(x))) = df(x)$ exists and is nonsingular. Thus the hypotheses of Theorem 7.3.3 are satisfied, and it follows that $\forall y \in V$, $dg(y)$ exists and
$$dg(y) = df(g(y))^{-1} \circ \iota = df(g(y))^{-1}. \tag{7.5.5}$$

The proof of the theorem will be concluded if we can show that $\forall u \in E^n$, the function with domain $V$ defined by $dg(y)(u)$ is continuous. For the sake of simplicity let us set $L(y) = df(g(y))$. Since $g$ is continuous and $f \in C^1$, it follows that $\forall u \in E^n$, the function on $V$ with values $L(y)(u)$ is continuous. Let $b \in V$ and $b = f(a)$. By Corollary 7.5.2 $\exists \eta > 0$ and $\exists M > 0$, so that $\forall x \in B(a, \eta)$ and $\forall u \in E^n$
$$|df(x)(u)| \ge |u|/M.$$
This is the same as saying that $\forall y \in f(B(a, \eta))$, and $\forall u \in E^n$,
$$|L(y)(u)| \ge |u|/M.$$
Next, let us note that
$$L(y) \circ [L(y)^{-1} - L(b)^{-1}](u) = [L(b) - L(y)] \circ L(b)^{-1}(u).$$
Using the above inequality for $|L(y)(u)|$ we get
$$|[L(y)^{-1} - L(b)^{-1}](u)| \le M\, |[L(b) - L(y)] \circ L(b)^{-1}(u)|.$$
Now, $dg(y) = L(y)^{-1}$ and $dg(b) = L(b)^{-1}$. If we use the fact that the function with values $L(y)(L(b)^{-1}(u))$ is continuous at $b$, then $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $\forall y \in V$ with $|y - b| < \delta$ we get that the right side of the last inequality is less than $\varepsilon$. This completes the proof of continuity and of the theorem.
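Formula (7.5.5) can be tested numerically on a concrete map. The sketch below is only an illustration under stated assumptions: it uses the map $f(x, y) = (x^2 - y^2, 2xy)$ near $a = (1, 0)$, computes the local inverse $g$ by Newton's method (a technique not used in the text's proof), and compares a central-difference approximation of $D_1 g(y)$ with the first column of $df(g(y))^{-1}$. All function names and step sizes are hypothetical choices.

```python
def f(p):
    x, y = p
    return (x * x - y * y, 2.0 * x * y)


def jacobian(p):
    """Jacobian matrix of f at p, stored as a pair of rows."""
    x, y = p
    return ((2.0 * x, -2.0 * y), (2.0 * y, 2.0 * x))


def inverse_2x2(mat):
    (a, b), (c, d) = mat
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def apply(mat, v):
    return (mat[0][0] * v[0] + mat[0][1] * v[1],
            mat[1][0] * v[0] + mat[1][1] * v[1])


def local_inverse(target, start=(1.0, 0.0), steps=30):
    """Newton iteration for f(p) = target, started at the base point a = (1, 0)."""
    p = start
    for _ in range(steps):
        value = f(p)
        residual = (value[0] - target[0], value[1] - target[1])
        step = apply(inverse_2x2(jacobian(p)), residual)
        p = (p[0] - step[0], p[1] - step[1])
    return p


y = (1.1, 0.2)          # a point near b = f(a) = (1, 0)
gy = local_inverse(y)

# Central difference for D_1 g(y).
h = 1e-6
g_plus = local_inverse((y[0] + h, y[1]))
g_minus = local_inverse((y[0] - h, y[1]))
numeric_D1_g = ((g_plus[0] - g_minus[0]) / (2 * h),
                (g_plus[1] - g_minus[1]) / (2 * h))

# Formula (7.5.5): dg(y) = df(g(y))^{-1}, applied to e_1.
formula_D1_g = apply(inverse_2x2(jacobian(gy)), (1.0, 0.0))
print(numeric_D1_g, formula_D1_g)
```

The iteration is only expected to converge for targets close to $f(a)$, which matches the local character of the theorem.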
7.5.6 Corollary. If $f$ satisfies the hypotheses of Theorem 7.5.5 with the exception that $f \in C^q$, $q \ge 1$, then the local inverse function $g$ is also of class $C^q$.

Proof. Probably the easiest way to prove this is to use formula (7.5.5), which is
$$df(g(y)) \circ dg(y) = \iota.$$
If we evaluate both sides at $e_j$, we get
$$df(g(y))(D_j g(y)) = e_j.$$
From Cramer's rule it follows that
$$D_j g^k(y) = \frac{\Delta_{jk}(g(y))}{J_f(g(y))}, \tag{7.5.6}$$
where $\Delta_{jk}(g(y))$ is obtained from $J_f(g(y))$ by replacing its $k$th column by $e_j$. Both $\Delta_{jk}(g(y))$ and $J_f(g(y))$ are polynomials in the functions $(D_r f^s) \circ g$ evaluated at $y$. Now, $g \in C^1$, and if $f \in C^2$, then by the chain rule it follows that each of the functions $(D_r f^s) \circ g$ is in $C^1$. Thus from (7.5.6) it follows that $D_j g^k \in C^1$, and moreover
$$D_i D_j g^k(y) = \frac{\Delta_{ijk}(g(y))}{[J_f(g(y))]^2}, \tag{7.5.7}$$
where $\Delta_{ijk}(g(y))$ is a polynomial in the functions $(D_r f^s) \circ g$, $(D_i D_r f^s) \circ g$, and $D_i g$, evaluated at $y$. If $f \in C^3$, we see from (7.5.7) that $D_i D_j g^k \in C^1$ and we may proceed to differentiate again. Proceeding in this way we get the proof of the corollary. Of course, the precise way to do this is by induction, but we shall not bother to write this out in a formal way.
REMARKS: In most instances the easiest way to decide whether or not $df(a)$ is nonsingular is to check the rank of the Jacobian matrix. If the Jacobian matrix is square, then, of course, $df(a)$ is nonsingular $\Leftrightarrow$ $J_f(a) \ne 0$. The reader should be warned that the invertibility theorem, 7.5.5, is only a "local" theorem. In other words, $\forall x \in \mathfrak{D}(f)$ it could be possible that $df(x)$ is nonsingular and yet $f$ is not "globally" a one-to-one function. This is already seen in the following very simple example of the polar coordinate transformation. For $r \in \,]1, 2[$ and $\theta \in \,]0, 4\pi[$ let us set
$$f^1(r, \theta) = r \cos \theta, \qquad f^2(r, \theta) = r \sin \theta,$$
and, of course, $f(r, \theta) = (f^1(r, \theta), f^2(r, \theta))$. Then
$$J_f(r, \theta) = \begin{vmatrix} \cos \theta & -r \sin \theta \\ \sin \theta & r \cos \theta \end{vmatrix} = r.$$
Hence $J_f$ never vanishes and yet $f$ is not a one-to-one function.
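A short numerical check of the polar coordinate example, written as a Python sketch with hypothetical names: the Jacobian determinant equals $r$ at a sample point, while two distinct points of the domain $]1, 2[ \,\times\, ]0, 4\pi[$ that differ by $2\pi$ in the angle have the same image.

```python
import math


def f(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))


def jacobian_det(r, theta):
    """Determinant of [[cos t, -r sin t], [sin t, r cos t]]."""
    return (math.cos(theta) * (r * math.cos(theta))
            - (-r * math.sin(theta)) * math.sin(theta))


r, theta = 1.5, 1.0
p = f(r, theta)
q = f(r, theta + 2.0 * math.pi)  # theta + 2*pi is still inside ]0, 4*pi[

print(jacobian_det(r, theta))  # equals r up to rounding
print(p, q)                    # the same point, up to rounding
```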

The conditions given in Theorem 7.5.5 are only sufficient conditions for local invertibility and are by no means necessary. Indeed, let us consider the function
$$f(x^1, x^2) = ((x^1)^3, (x^2)^3).$$
It is clear that $df(0, 0) = 0$, and yet $f$ maps an open neighborhood of the origin onto an open neighborhood of the origin in a one-to-one fashion. Of course, we cannot expect in a case like this that the inverse function will be differentiable at $(0, 0)$, since it would contradict the fact that $df(0, 0)$ is singular.
On the other hand, there are any number of examples which show that if $df(a)$ is singular we may not get local invertibility at $a$. A specific example is given by the function
$$f(x^1, x^2) = ((x^1)^2, (x^2)^2).$$
We feel that the reader can construct many more.
We now turn to the important situation of a function whose domain is in $E^{m+n}$ and range in $E^n$. The prototype example we have in mind is that of a linear function where the $j$th component is given by
$$f^j(x, y) = b_{j1} x^1 + \cdots + b_{jm} x^m + a_{j1} y^1 + \cdots + a_{jn} y^n.$$
If $\det [a_{jk}] \ne 0$, then by Cramer's rule we can "solve" the equation $f(x, y) = 0$ for $y$ in terms of $x$, so that $f(x, y(x)) = 0$. Note that $\det [a_{jk}]$ is the Jacobian of $f$ taken with respect to the $y$ variable.
Suppose now that $f$ is any function of class $C^1$ with domain in $E^{m+n}$ and range in $E^n$. Let us identify $E^{m+n}$ with $E^m \times E^n$ and designate the points in the latter space by $(x, y)$. Suppose that $f(a, b) = 0$ and that $d_y f(a, b)$ is nonsingular, where by $d_y f(a, b)$ we mean the differential of the function with values $f(a, y)$. Keeping in mind the example of the linear function we discussed in the last paragraph and the fact that $d_y f(a, b)(y - b) + f(a, b)$ approximates $f(a, y)$ very closely in a $y$ neighborhood of $(a, b)$, one might hope that there is a neighborhood about $a \in E^m$ so that we can "solve" the equation $f(x, y) = 0$ for $y$ in terms of $x$ in this neighborhood; that is, $f(x, y(x)) = 0$. This is actually the case and the precise statement is given by the next theorem.

However, before we state and prove the Implicit Function Theorem,
it will be illuminating to look at several very simple examples. These
should help to clarify the nature of the problem.
Suppose that $x$ and $y$ are variables on $R$ and
$$f(x, y) = x^2 + y^2 - 1.$$
Now, for $|x| < 1$ we can "solve" the equation $f(x, y) = 0$ for $y$ in terms of $x$. Indeed, we get two $C^\infty$ functions
$$g_1(x) = \sqrt{1 - x^2}, \qquad g_2(x) = -\sqrt{1 - x^2},$$
so that $f(x, g_j(x)) = 0$ for $j \in (1, 2)$ and $|x| < 1$. This simple example already illustrates several points. First, we can only "solve" the equation $f(x, y) = 0$ "locally" and in general cannot find $y$ as a function of $x$ for all $x$ for which $f$ is defined. Second, there may be more than one "solution" to the equation. However, note that if $f(a, b) = 0$ with $b > 0$, then there is only one of these functions, namely, $g_1$, whose graph contains $(a, b)$. In the same way if $b < 0$, then $g_2$ is the only one of these functions whose graph contains $(a, b)$. Indeed, we can say more. If $f(x, y) = 0$ and $y \in \,]0, 1]$ we must have $y = g_1(x)$, and if $y \in [-1, 0[$ we must have $y = g_2(x)$. That is, there is a unique function with range in $]0, 1]$ and domain $]-1, 1[$ and a unique function with the same domain and range in $[-1, 0[$ so that $f(x, g(x)) = 0$. From this we can conclude that $\forall (a, b)$ with $b \ne 0$ and $f(a, b) = 0$, there is an open neighborhood $V \subset E^2$ about $(a, b)$ and a function $g$ so that if $(x, y) \in V$, and $f(x, y) = 0$, then $y = g(x)$.

For fixed $a$, the Jacobian at $b$ of the function with values $f(a, y)$ is $D_2 f(a, b) = 2b$. Thus $d_y f(a, b)$ is nonsingular $\Leftrightarrow$ $b \ne 0$. Now, $f(1, 0) = 0$, and we can still find solutions whose graph contains $(1, 0)$, but several interesting things happen. First, the solutions cannot be defined in an open neighborhood of $x = 1$, and are not differentiable at this point. Second, in every open set in $E^2$ which contains $(1, 0)$ and for all $x$ which are sufficiently close to 1 and with $x < 1$, there are always two numbers, $y$ and $-y$, so that $f(x, y) = f(x, -y) = 0$. The same state of affairs exists at $x = -1$. Thus we lose the unicity discussed in the last paragraph.
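The local solvability in the circle example can be imitated numerically. In the Python sketch below, which is an illustration and not the method of the text, the equation $f(x, y) = 0$ is solved for $y$ at a fixed $x$ by Newton's method in the $y$ variable, using $D_2 f(x, y) = 2y$. The starting values, the sample point $x = 0.6$, and the function names are assumptions. Which of the two solutions is found depends only on the sign of the starting value, mirroring the roles of $g_1$ and $g_2$.

```python
import math


def f(x, y):
    return x * x + y * y - 1.0


def d2f(x, y):
    """Partial derivative of f with respect to y."""
    return 2.0 * y


def solve_for_y(x, y_start, steps=50):
    """Newton iteration in y for f(x, y) = 0; requires D_2 f != 0 along the way."""
    y = y_start
    for _ in range(steps):
        y = y - f(x, y) / d2f(x, y)
    return y


x = 0.6
upper = solve_for_y(x, 0.5)    # converges to g_1(x) = sqrt(1 - x**2)
lower = solve_for_y(x, -0.5)   # converges to g_2(x) = -sqrt(1 - x**2)
print(upper, lower)
```

Near $(1, 0)$ the divisor $D_2 f = 2y$ approaches zero, which is the numerical counterpart of $d_y f(1, 0)$ being singular.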
As a second example we shall consider a function with domain in $E^3$ and range in $E^2$. We suppose that $x$, $y$, and $z$ are variables on $R$,
$$f^1(x, y, z) = x^2 + y^2 + z^2 - 1,$$
$$f^2(x, y, z) = xy - z,$$
and $f(x, y, z) = (f^1(x, y, z), f^2(x, y, z))$. If we "solve" the equation $f(x, y, z) = 0$ by elementary techniques, for $y$ and $z$ in terms of $x$, we get the following two $C^\infty$ "solutions" for $|x| < 1$:
$$g_1(x) = \sqrt{\frac{1 - x^2}{1 + x^2}}\,(1, x), \qquad g_2(x) = -\sqrt{\frac{1 - x^2}{1 + x^2}}\,(1, x).$$
Now, $\forall (a, b, c) \in E^3$ so that $|a| < 1$ and $f(a, b, c) = 0$, it is not difficult to check that only one of these solutions has $(a, b, c)$ in its graph. However, $f(1, 0, 0) = 0$ and both solutions, extended to $x = 1$, have $(1, 0, 0)$ in their graphs. As before, at $x = 1$ we lose the differentiability as well as the uniqueness of the solutions in the neighborhood of a point.

Let us compute the Jacobian of the function with domain $E^2$ and having values $f(a, y, z)$, where $a$ is fixed. We calculate this Jacobian as
$$\begin{vmatrix} 2y & 2z \\ a & -1 \end{vmatrix} = -2(y + az).$$
Hence, this Jacobian can vanish only for $y + az = 0$, and the only points $(a, b, c)$ where both the Jacobian and $f(a, b, c)$ vanish are $(\pm 1, 0, 0)$. As in the previous example, the points where we lose differentiability and unicity of the solutions are the points where $d_{(y,z)} f$ is singular.
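The two solution branches and the Jacobian of this example are easy to verify numerically. The Python sketch below uses hypothetical helper names; the sample value $x = 0.3$ is an arbitrary choice inside $]-1, 1[$.

```python
import math


def f(x, y, z):
    return (x * x + y * y + z * z - 1.0, x * y - z)


def g1(x):
    """First solution branch: (y, z) = sqrt((1 - x^2)/(1 + x^2)) * (1, x)."""
    s = math.sqrt((1.0 - x * x) / (1.0 + x * x))
    return (s, s * x)


def g2(x):
    """Second solution branch, the negative of the first."""
    y, z = g1(x)
    return (-y, -z)


def jacobian_yz(a, y, z):
    """Determinant of [[2y, 2z], [a, -1]], which equals -2*(y + a*z)."""
    return (2.0 * y) * (-1.0) - (2.0 * z) * a


x = 0.3
for branch in (g1, g2):
    y, z = branch(x)
    print(f(x, y, z), jacobian_yz(x, y, z))  # f is (0, 0) up to rounding

# At the exceptional point (1, 0, 0) both f and the Jacobian vanish.
print(f(1.0, *g1(1.0)), jacobian_yz(1.0, *g1(1.0)))
```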
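7.5.7 Implicit Function Theorem. Suppose $f$ is of class $C^q$, $q \ge 1$, $\mathfrak{D}(f) \subset E^{m+n}$, and $\mathfrak{R}(f) \subset E^n$. Suppose further that $\exists a \in \mathfrak{D}(f)$ so that $f(a) = 0$ and $df(a)$ is of rank $n$. Then there is an open set $U \subset E^m$ containing 0, an open set $V \subset \mathfrak{D}(f)$ containing $a$, a function $g$ with $\mathfrak{D}(g) = U$ and $\mathfrak{R}(g) \subset V$ which satisfies the following:
(a) $g(0) = a$.
(b) $f \circ g(t) = 0$, $\forall t \in \mathfrak{D}(g)$.
(c) $g \in C^q$ and $\forall t \in U$, rank $dg(t) = m$.
(d) If $x \in V$ and $f(x) = 0$, then $x \in \mathfrak{R}(g)$.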
Proof. Let us identify $E^{m+n}$ with $E^m \times E^n$ in the obvious way, and identify $E^m$ with the subspace $E^m \times \{0\}$ of $E^{m+n}$ and $E^n$ with the subspace $\{0\} \times E^n$ of $E^{m+n}$. Let $M$ be any linear subspace of $E^{m+n}$ of dimension $n$ so that the range of $df(a)|M$ is $E^n$. Let $P$ be the projection of $E^{m+n}$ onto $M^\perp$ and $A$ any linear transformation of $E^{m+n}$ into itself which takes $M^\perp$ onto $E^m$. This is possible, since $\dim M = n$ and thus $\dim M^\perp = m$. (See Exercises 12 and 13 of Section 6.5.)

If $\forall x \in \mathfrak{D}(f)$ we set
$$F(x) = (A \circ P(x), f(x)),$$
then $F$ is a function of class $C^q$ with domain $\mathfrak{D}(f)$ and range in $E^{m+n}$. Further, $\forall u \in E^{m+n}$ we have
$$dF(a)(u) = (d(A \circ P)(a)(u), df(a)(u)) = (A \circ P(u), df(a)(u)).$$
The range of $dF(a)$ is $E^{m+n}$. Indeed, let $(v_1, v_2) \in E^{m+n}$ and $u_1 \in M^\perp$ so that $A \circ P(u_1) = v_1$, $u_2 \in M$ so that $df(a)(u_2) = v_2 - df(a)(u_1)$, and $u = u_1 + u_2$. Then $A \circ P(u) = v_1$, $df(a)(u) = v_2$, and we see that $dF(a)(u) = (v_1, v_2)$. Consequently $dF(a)$ has rank $m + n$, which means that it is nonsingular.

If we apply the Inverse Function Theorem to $F$, we find that there is an open set $V \subset \mathfrak{D}(f)$ containing $a$ and an open set $W \subset E^{m+n}$ containing $F(a)$ so that $F|V$ is a one-to-one function with range $W$ and having an inverse function $G$ of class $C^q$. Set
$$W_1 = \{\tau : \tau \in E^m \ \&\ (\tau, 0) \in W\}.$$
It is clear that $W_1$ is open in $E^m$ and is nonvoid, since $A \circ P(a) \in W_1$. For every $\tau \in W_1$ let us set
$$h(\tau) = G(\tau, 0).$$
Then $h \in C^q$ and since $dG(\tau, 0)$ is nonsingular it follows that $dG(\tau, 0)|E^m$ has rank $m$. But $\forall u \in E^m$ we have
$$dG(\tau, 0)(u, 0) = \sum_{j=1}^{m} u^j D_j G(\tau, 0) = dh(\tau)(u).$$
Hence rank $dh(\tau) = m$. Now,
$$h(A \circ P(a)) = G(A \circ P(a), 0) = G(A \circ P(a), f(a)) = G \circ F(a) = a. \tag{7.5.8}$$
Further, $\forall \tau \in W_1$
$$(A \circ P \circ h(\tau), f \circ h(\tau)) = F \circ h(\tau) = F \circ G((\tau, 0)) = (\tau, 0).$$
Thus, we get
$$A \circ P \circ h(\tau) = \tau, \tag{7.5.9}$$
$$f \circ h(\tau) = 0. \tag{7.5.10}$$
Note also, if $x \in V$ and $f(x) = 0$, then $F(x) \in W$ and
$$x = G \circ F(x) = G(A \circ P(x), 0) = h(A \circ P(x)). \tag{7.5.11}$$

Finally, let us set $U = W_1 - A \circ P(a) = \{t : t \in E^m \ \&\ t = \tau - A \circ P(a) \ \&\ \tau \in W_1\}$. Then $\forall t \in U$ let us set
$$g(t) = h(\tau), \qquad t = \tau - A \circ P(a).$$
Clearly $g$ satisfies the conclusions of Theorem 7.5.7, condition (d) coming from (7.5.11). The proof is complete.

Condition (d) is a uniqueness condition on $\mathfrak{R}(g)$ rather than on $g$ itself. We can get any number of other functions that satisfy the conclusions of the theorem by composing $g$ with a function of class $C^q$ that takes $U$ onto itself, leaves the origin fixed, and is of rank $m$ at every point of $U$. To pin down the uniqueness of $g$, the Implicit Function Theorem is usually stated in a special form. We state this as a corollary, although it is really a corollary of the proof.
7.5.8 Corollary. Suppose $f$ is of class $C^q$, $q \ge 1$, $\mathfrak{D}(f) \subset E^m \times E^n$, and $\mathfrak{R}(f) \subset E^n$. Suppose further that $(a, b) \in \mathfrak{D}(f)$ so that $f(a, b) = 0$ and $d_y f(a, b)$ is nonsingular, where $d_y f(a, b)$ is the differential of the function with domain in $E^n$ and values $f(a, y)$. Then there is an open set $U \subset E^m$ containing $a$ and an open set $Y \subset E^n$ containing $b$, so that $U \times Y \subset \mathfrak{D}(f)$ and a function $g$ with $\mathfrak{D}(g) = U$ and $\mathfrak{R}(g) \subset Y$ that satisfies the following:
(a') $g(a) = b$.
(b') $f(x, g(x)) = 0$, $\forall x \in \mathfrak{D}(g)$.
(c') $g \in C^q$.
(d') If $(x, y) \in U \times Y$ and $f(x, y) = 0$, then $y = g(x)$.
Proof. We shall use the notations of the proof of the last theorem, except that we shall designate the elements of $E^m \times E^n$ by $(x, y)$. Let $u = (0, u_2) \in E^m \times E^n$. From the formula
$$df(a, b)(u) = \sum_{j=1}^{n} u_2^j D_{m+j} f(a, b) = d_y f(a, b)(u_2),$$
and the fact that $d_y f(a, b)$ is nonsingular and $\mathfrak{R}(df(a, b)) \subset E^n$, we see that $df(a, b)$ has rank $n$. Let $M = E^n$; then, of course, the orthogonal complement of $M$ in $E^{m+n}$ is $E^m$. As in the proof of Theorem 7.5.7 we let $P$ be the projection of $E^{m+n}$ onto $E^m$ so that $\forall (x, y) \in E^{m+n}$ we have $P(x, y) = x$. We take $A$ to be the identity transformation of $E^{m+n}$ onto itself. Hence the function $F$ of the last theorem becomes
$$F(x, y) = (x, f(x, y)).$$
If we apply the proof of the last theorem, we find that there is an open neighborhood $U \subset E^m$ containing $a$ and an open neighborhood $U \times Y \subset \mathfrak{D}(f)$ containing $(a, b)$ and a function $h(x) = (h_1(x), h_2(x))$ of class $C^q$ with domain $U$ and range in $U \times Y$ so that from (7.5.8) we have
$$h(P(a, b)) = h(a) = (a, b).$$
Thus $h_2(a) = b$. Further, from (7.5.9) we have
$$h_1(x) = P \circ h(x) = x,$$
and thus from (7.5.10) we get
$$f \circ h(x) = f(x, h_2(x)) = 0.$$
If we take $g = h_2$, then conditions (a'), (b') and (c') are satisfied. To prove the unicity condition (d') let us suppose $(x, y) \in U \times Y$ and $f(x, y) = 0$; then $(x, y) \in \mathfrak{D}(F)$ and $F(x, y) = (x, 0)$. Applying the inverse function $G$ and recalling that $G(x, 0) = h(x)$ we get
$$(x, y) = G \circ F(x, y) = G(x, 0) = (x, g(x)),$$
from which it follows that $y = g(x)$. This completes the proof.

Of course, the Jacobian matrix of $d_y f(a, b)$ is the matrix
$$\begin{pmatrix} D_{m+1} f^1(a, b) & \cdots & D_{m+n} f^1(a, b) \\ \vdots & & \vdots \\ D_{m+1} f^n(a, b) & \cdots & D_{m+n} f^n(a, b) \end{pmatrix}.$$
As we remarked after the proof of the Inverse Function Theorem, the easiest way to check that $d_y f(a, b)$ is nonsingular is to check that the Jacobian, that is, the determinant of the above matrix, does not vanish. The reader may find it instructive to go back and review the examples given before Theorem 7.5.7 in the light of that theorem and its corollary.
Exercises
1. Define a function $f$ on $E^2$ by means of the equations
$$f^1(x, y) = x^2 - y^2, \qquad f^2(x, y) = 2xy.$$
Show that $f$ has a nonsingular differential at every point except the origin and thus at every point of $E^2 \setminus \{(0, 0)\}$ is locally a one-to-one function. Show that $f$ is not a one-to-one function. Is the restriction of $f$ to some neighborhood of $(0, 0)$ a one-to-one function? [Note: From the point of view of complex variables, if we set $z = x + iy$, then $f^1(x, y)$ is the real part of $z^2$ and $f^2(x, y)$ is its imaginary part.]
2.
Letf be that function on
f'(x,y)=
{x
E2 defined by
x2 sin (I/x) {::::> x
oF-
0,
if x = 0,
f2(x,y)=y.
Show that $df(0, 0)$ is nonsingular but that $f$ is not a one-to-one function on any neighborhood of the origin. Is $df(x, y)$ nonsingular for every $(x, y)$ in some neighborhood of the origin?

3. (a) Suppose that $f$ is a real-valued function defined on $E^2$ by
$$f(x, y) = x - y^2.$$
Does there exist a real-valued function $g$ defined in a neighborhood of $x = 0$ so that $f(x, g(x)) = 0$?
(b) Suppose that $f$ is the same function as in part (a). Show that there is a unique function $g$ defined on a suitable neighborhood of $x = 1$ so that $f(x, g(x)) = 0$ and $g(x) > 0$.
4. Suppose that $f$ is a real-valued function on $E^2$ defined by
$$f(x, y) = x^2 - y^2.$$
How many continuous functions $g$ do there exist, defined on a neighborhood of $x = 0$ so that $f(x, g(x)) = 0$? Are there more functions for which this is true if we remove the requirement of continuity on $g$?
5. Suppose $f$ has domain $E^2$ and is defined by the equations
$$f^1(x, y) = e^x \cos y, \qquad f^2(x, y) = e^x \sin y.$$
Show that $\mathfrak{R}(f) = E^2 \setminus \{0\}$. Is $f$ a one-to-one function? Is $f$ locally one-to-one? Note that in terms of the complex variable $z = x + iy$, $f^1(x, y)$ is the real part of $e^z$, $f^2(x, y)$ is its imaginary part.
6. If the open set $U$ of Corollary 7.5.8 is connected and $h$ is a continuous function with domain $U$ which satisfies (a') and (b'), show that $h = g$.
7. Suppose $f$ is a real-valued function with $\mathfrak{D}(f) \subset E^2$ and satisfies the hypotheses of Corollary 7.5.8. Compute the derivative of $g$ in terms of the partial derivatives of $f$.
8. Extend the results of Exercise 7 to higher dimensions, that is, where the domain and range of $f$ are in higher dimensions. In fact, show that
$$dg(x) = -d_y f(x, g(x))^{-1} \circ d_x f(x, g(x)).$$
[Hint: Note that $\forall u \in E^m$ and $\forall v \in E^n$
$$d_x f(x, y)(u) = df(x, y)(u, 0), \qquad d_y f(x, y)(v) = df(x, y)(0, v).]$$
9.
Suppose
C1
with domain in En and range in Em and
is nonsingular. Show that there is a ball B (a,
=JIB(a, p),
then
Va
E R so that
a= 1, 3m
lg(x)-g(y) I
lx-yl"
Suppose
lim
lg( x) - g(y)I
Ix - YI
C1
0,
>
Vu
Vx E J0(f ) ,
[Hint: Use the
[J(x)-J(y)] when u = x - y.]
E E",
=ft 0, and
=ft 0. Show thatfis a one-to-one function.
mean value theorem on
7 .6
df(a)
g
so that if
and its domain is a convex set in E" and its
range is in En. Suppose further that
u df(x)(u)
J0(J)
> 0
lx-yl-O
10.
1 we have
Jim
lx-yl-O
and if
a<
p)
7.6 MAXIMA AND MINIMA

For a real-valued function whose domain is in $E^n$, $n \ge 1$, it is, of course, possible to talk about the maximum and minimum of the function. The purpose of this section is to develop some of the criteria that will tell when a multivariate function has a (local) maximum or minimum. We are sure that the reader knows the definition of the maximum and minimum of a function, but for the sake of completeness we shall repeat it.
7.6.1 Definition. If $f$ is a real-valued function with domain in $E^n$, then $f$ is said to have a maximum [minimum] at $a$ $\Leftrightarrow$ $a \in \mathcal{D}(f)$ and $\forall x \in \mathcal{D}(f)$, $f(x) \le f(a)$ [$f(a) \le f(x)$]. The function $f$ is said to have a local maximum [minimum] at $a$ $\Leftrightarrow$ $a \in \mathcal{D}(f)$ and there is a ball $B(a,\rho) \subset E^n$ so that $\forall x \in B(a,\rho) \cap \mathcal{D}(f)$, $f(x) \le f(a)$ [$f(a) \le f(x)$].

The analogue of Theorem 4.3.1 for a function with domain in $E^n$ is given below. Actually it is not so much a generalization of that theorem as a consequence of it.
7.6.2 Theorem. Suppose $f$ is a real-valued function with domain in $E^n$ and all the first partial derivatives of $f$ exist at $a \in \mathcal{D}(f)$. A necessary condition that $f$ have a local maximum or local minimum at $a$ is that $\forall k \in (1,n)$,
$$D_k f(a) = 0. \tag{7.6.1}$$

Proof. Let us set $g_k(t) = f(a + te_k)$. This defines a function $g_k$ with $\mathcal{D}(g_k) = \{t : a + te_k \in \mathcal{D}(f)\}$. Clearly 0 is an interior point of $\mathcal{D}(g_k)$. If $f$ has a local maximum or minimum at $a$, then $g_k$ has a local maximum or minimum at 0. Thus by Theorem 4.3.1 we have $g_k'(0) = D_k f(a) = 0$.
In case $f$ has a differential at $a$, then the last theorem tells us that a necessary condition that $f$ has a local extremum at $a$ is that $df(a) = 0$.

Now, it is not necessarily true that if $df(a) = 0$, then $a$ is at a local extremum for $f$. For example, let $f$ be that function on $E^2$ defined by
$$f(x,y) = x^3 + y^3.$$
Clearly $df(0) = 0$, but 0 is not at a local maximum or minimum for $f$. A point $a$ for which $df(a) = 0$ but which is not at a local extremum for $f$ is called a saddle point.
7.6.3 Definition. If $f$ has domain in $E^n$ and range in $E^m$, then $a$ is said to be a critical point for $f$ $\Leftrightarrow$ $df(a)$ exists and $df(a) = 0$.

The next theorem gives a sufficient condition under which a real-valued function has a local maximum or a local minimum at a point. It is based on Theorem 6.6.14.
7.6.4 Theorem. Suppose $f$ is a real-valued function of class $C^2$ with an open domain in $E^n$ and $a \in \mathcal{D}(f)$ is a critical point for $f$. For every $k \in (1,n)$ let us set
$$\Delta_k(x) = \det \big[ D_i D_j f(x) \big]_{i,j = 1}^{k} \tag{7.6.2}$$
[$\Delta_n(x)$ is called the Hessian of $f$ at $x$]. If $\forall k \in (1,n)$, $\Delta_k(a) > 0$, then $f$ has a local minimum at $a$, and if $\forall k \in (1,n)$, $(-1)^k \Delta_k(a) > 0$, then $f$ has a local maximum at $a$. If the Hessian $\Delta_n(a) \neq 0$, but neither of the previous conditions holds, then $f$ has a saddle point at $a$, that is, has neither a local maximum nor a local minimum at $a$.
Proof. Since $\mathcal{D}(f)$ is open, there is a ball $B(a,\rho) \subset \mathcal{D}(f)$. Since $df(a) = 0$, from Taylor's Theorem 7.4.7, $\forall x \in B(a,\rho)$, $\exists c$ on the straight line joining $a$ to $x$ so that
$$f(x) - f(a) = \tfrac{1}{2}\, d^2 f(c)(x-a)^2. \tag{7.6.3}$$
For every $x \in B(a,\rho)$, let $T(x)$ be that linear transformation from $E^n$ into $E^n$ whose matrix with respect to the basis $\{e_j : j \in (1,n)\}$ is the Hessian matrix $[D_j \circ D_k f(x)]$. Since $f$ is of class $C^2$, it follows from Theorem 7.4.2 that $\forall x \in B(a,\rho)$, $D_j \circ D_k f(x) = D_k \circ D_j f(x)$. Hence $T(x)$ is a symmetric linear transformation. Further, we have from Theorem 7.4.6,
$$d^2 f(c)(x-a)^2 = \sum_{k=1}^{n} \sum_{j=1}^{n} (x^j - a^j)(x^k - a^k)\, D_j D_k f(c) = T(c)(x-a) \cdot (x-a). \tag{7.6.4}$$
If $\forall k \in (1,n)$, $\Delta_k(a) > 0$, then since $\Delta_k$ is a continuous function, $\exists \sigma$, $0 < \sigma \le \rho$, so that $\forall c \in B(a,\sigma)$ and $\forall k \in (1,n)$, $\Delta_k(c) > 0$. Thus from (7.6.3), (7.6.4), and Theorem 6.6.14 it follows that $\forall x \in B(a,\sigma)$, $x \neq a$, $f(x) - f(a) > 0$, and hence $f(a)$ is a local minimum for $f$.

If $\forall k \in (1,n)$, $(-1)^k \Delta_k(a) > 0$, then arguing the same way as above and using Corollary 6.6.15, we find that $f(a)$ is a local maximum for $f$.

To prove the last statement of the theorem we use the last statement in Corollary 6.6.15. This tells us that $\exists u, v \in E^n$, so that $T(a)(u) \cdot u > 0$ and $T(a)(v) \cdot v < 0$. Since the functions with values $T(x)(u) \cdot u$ and $T(x)(v) \cdot v$ are continuous, there is a ball $B(a,\eta) \subset \mathcal{D}(f)$ so that $\forall c \in B(a,\eta)$, $T(c)(u) \cdot u > 0$ and $T(c)(v) \cdot v < 0$. Now $\forall \alpha \in R$, $\alpha \neq 0$, and $\forall c \in B(a,\eta)$,
$$T(c)(\alpha u) \cdot \alpha u = |\alpha|^2\, T(c)(u) \cdot u > 0,$$
$$T(c)(\alpha v) \cdot \alpha v = |\alpha|^2\, T(c)(v) \cdot v < 0.$$
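For every $\epsilon > 0$, $\exists \alpha \in R$, $\alpha \neq 0$, so that $|\alpha u| < \epsilon$ and $|\alpha v| < \epsilon$. Suppose $0 < \epsilon < \eta$, and we set $y = \alpha u + a$ and $z = \alpha v + a$. Then $y, z \in B(a,\epsilon)$ and from (7.6.3) and (7.6.4), with $c$ and $c'$ the corresponding intermediate points,
$$f(y) - f(a) = \tfrac{1}{2}\, T(c)(y-a) \cdot (y-a) = \tfrac{1}{2}\, T(c)(\alpha u) \cdot \alpha u > 0,$$
$$f(z) - f(a) = \tfrac{1}{2}\, T(c')(z-a) \cdot (z-a) = \tfrac{1}{2}\, T(c')(\alpha v) \cdot \alpha v < 0.$$
Thus the function with values $f(x) - f(a)$ changes sign in every neighborhood of $a$, so that $a$ is at a saddle point for $f$.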
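
The test of Theorem 7.6.4 can be tried out numerically. The following is a minimal sketch, assuming the Hessian matrix at the critical point has already been computed by hand; the helper names `det`, `leading_minors`, and `classify` are illustrative and are not taken from the text. It evaluates the leading principal minors $\Delta_1, \ldots, \Delta_n$ and applies the three cases of the theorem, reporting "inconclusive" when the Hessian vanishes (as it does for $x^3 + y^3$ at the origin).

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def leading_minors(hessian):
    """Return [Delta_1, ..., Delta_n], the leading principal minors."""
    n = len(hessian)
    return [det([row[:k] for row in hessian[:k]]) for k in range(1, n + 1)]


def classify(hessian):
    """Apply the three cases of Theorem 7.6.4 to a Hessian at a critical point."""
    minors = leading_minors(hessian)
    if all(d > 0 for d in minors):
        return "local minimum"
    if all((-1) ** k * d > 0 for k, d in enumerate(minors, start=1)):
        return "local maximum"
    if minors[-1] != 0:
        return "saddle point"
    return "inconclusive"  # Hessian is zero: the theorem says nothing


# Hessians at the origin of x^2 + y^2, -(x^2 + y^2), x^2 - y^2, and x^3 + y^3.
examples = {
    "x^2 + y^2": [[2, 0], [0, 2]],
    "-(x^2 + y^2)": [[-2, 0], [0, -2]],
    "x^2 - y^2": [[2, 0], [0, -2]],
    "x^3 + y^3": [[0, 0], [0, 0]],
}
results = {name: classify(h) for name, h in examples.items()}
```

Exact comparisons with zero are adequate here because the sample Hessians have integer entries; for Hessians obtained by floating-point differentiation a tolerance would be needed.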
LAGRANGE MULTIPLIERS
If $f$ is a real-valued function with $\mathcal{D}(f) \subset E^n$, it very often happens that we are not interested in the local extrema of $f$ but rather in the local extrema of a new function $g$ that is $f$ restricted to a subset of $\mathcal{D}(f)$. Usually the subset of $\mathcal{D}(f)$ we are interested in is given by a set $\{x : h(x) = 0\} \cap \mathcal{D}(f)$, where $h$ is a function with domain in $E^n$ and range in $E^m$, $m \le n$. This is a standard type of problem that arises, for example, in classical analytical mechanics. It is usually called an extremal problem for $f$ under the constraint $h(x) = 0$.

The method of Lagrange multipliers gives a necessary condition that a point should be at a local extremum of $f$ under the constraint $h(x) = 0$. Actually, it is based on Theorem 7.6.2, being an elaboration on that theme.
7.6.5 Theorem. Suppose $f$ and $h$ are of class $C^1$ with (open) domains in $E^n$. Suppose also that $f$ is real-valued, $\mathcal{R}(h) \subset E^m$, $m \le n$, and $\forall x \in \mathcal{D}(h)$, $dh(x)$ has rank $m$. A necessary condition that $a$ be at a local extremum for $f$ restricted to the set $\{x : h(x) = 0\} \cap \mathcal{D}(f)$ is that $\exists \lambda \in E^m$, so that the function $F$ with domain $[\mathcal{D}(f) \cap \mathcal{D}(h)] \times E^m$ and defined by
$$F(x,y) = f(x) + h(x) \cdot y \tag{7.6.5}$$
has a critical point at $(a,\lambda)$; that is,
$$dF(a,\lambda) = df(a) + \sum_{k=1}^{m} \lambda_k\, dh^k(a) = 0. \tag{7.6.6}$$

Proof. We shall suppose that $m < n$, since otherwise, as we shall show later, the theorem is trivial. Since $h(a) = 0$, and rank $dh(a) = m$, according to the Implicit Function Theorem 7.5.7, there exists an open set $U \subset E^{n-m}$ containing the origin, and a function $g$ of class $C^1$ with domain $U$, so that $\forall t \in U$, rank $dg(t) = n - m$, $h \circ g(t) = 0$, and $g(0) = a$.

Since $g(0) \in \mathcal{D}(f)$, $g$ is continuous, and $\mathcal{D}(f)$ is open, there is a ball $B(0,\rho) \subset U$ so that $t \in B(0,\rho) \Rightarrow g(t) \in \mathcal{D}(f)$. Consequently, since $a$ is at a local extremum for $f$ restricted to $\{x : h(x) = 0\} \cap \mathcal{D}(f)$, it follows that $t = 0$ is at a local extremum for the function $f \circ g$, and is an interior point of the domain of this function. We may apply Theorem 7.6.2 to $f \circ g$, and also use the fact that $h \circ g$ is the zero function, to get the following two equations:
$$d(h \circ g)(0) = dh(a) \circ dg(0) = 0, \qquad d(f \circ g)(0) = df(a) \circ dg(0) = 0. \tag{7.6.7}$$
Let $N$ be the null space of $dh(a)$ and $N^\perp$ its orthogonal complement in $E^n$. Now, $\mathcal{R}(dh(a)) = \mathcal{R}(dh(a) \mid N^\perp)$, and $dh(a) \mid N^\perp$ is a one-to-one function. Hence, since rank $dh(a) = m$, we must have $\dim N^\perp = m$, and since $\dim N + \dim N^\perp = n$, we must have $\dim N = n - m$. Since rank $dg(0) = n - m$, it follows from the first equality of (7.6.7) that $N = \mathcal{R}(dg(0))$.

Since $df(a)$ is a linear functional on $E^n$, it follows from Theorem 6.5.7 that $\exists b \in E^n$ so that $\forall u \in E^n$,
$$df(a)(u) = u \cdot b. \tag{7.6.8}$$
From the second equality of (7.6.7) we get $\forall u \in E^{n-m}$,
$$df(a) \circ dg(0)(u) = dg(0)(u) \cdot b = 0.$$
Thus $b \in N^\perp$. Now, from Theorem 6.5.9 we know that $\mathcal{R}(dh(a)^t) = N^\perp$. Thus $\exists \lambda \in E^m$, so that
$$b = -\,dh(a)^t(\lambda).$$
If we use this in (7.6.8) we get $\forall u \in E^n$,
$$df(a)(u) = -\,dh(a)(u) \cdot \lambda. \tag{7.6.9}$$
Now, $\forall u \in E^n$ and $\forall v \in E^m$,
$$dF(a,\lambda)(u,v) = df(a)(u) + dh(a)(u) \cdot \lambda + h(a) \cdot dy(v), \tag{7.6.10}$$
where, of course, by $dy$ we mean the differential of that function defined on $E^n \times E^m$ whose value at $(x,y)$ is $y$. Since $h(a) = 0$, it follows from (7.6.9) and (7.6.10) that
$$dF(a,\lambda) = df(a) + \sum_{k=1}^{m} \lambda_k\, dh^k(a) = 0.$$
If $n = m$, then we cannot use the preceding technique since $g$ does not exist. However, in this case $dh(a)$ has an inverse and thus $dh(a)^t$ has an inverse. So again $\exists \lambda \in E^n$ so that $b = -\,dh(a)^t(\lambda)$. We can then proceed exactly as before. However, this situation is really trivial, since $h$ is one to one in a neighborhood of $a$ and thus $a$ is the only point in the neighborhood where $h(x) = 0$. Consequently, $a$ is an isolated point of $\{x : h(x) = 0\} \cap \mathcal{D}(f)$. Of course, $f$ restricted to this set still has a relative maximum and minimum at $a$. The proof is concluded.
If we write (7.6.6) in terms of partial derivatives we get $n$ equations:
$$D_k f(a) + \sum_{j=1}^{m} \lambda_j\, D_k h^j(a) = 0, \qquad \forall k \in (1,n). \tag{7.6.11}$$
From the fact that $h(a) = 0$ we get $m$ more equations
$$h^j(a) = 0, \qquad \forall j \in (1,m). \tag{7.6.12}$$
If in the set of equations (7.6.11) and (7.6.12) we replace $(a,\lambda)$ by $(x,y)$, then these equations can be viewed as a system of $m + n$ equations in $m + n$ unknowns, $x^1, \ldots, x^n$ and $y^1, \ldots, y^m$. The points that are at the relative extrema of $f$ under the constraint $h(x) = 0$ must be among the solutions of this system of $m + n$ equations. The auxiliary solutions $\lambda_1, \ldots, \lambda_m$ are called Lagrange multipliers.
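
To see the system (7.6.11), (7.6.12) in a concrete case, here is a small sketch for a single constraint ($m = 1$). The example, $f(x,y) = x + y$ on the unit circle $h(x,y) = x^2 + y^2 - 1 = 0$, is chosen for illustration and does not come from the text; the gradients are written out by hand and the function name `lagrange_residuals` is hypothetical. The code only evaluates the left sides of the equations at two candidate pairs $(a, \lambda)$ found by hand, so a result near zero confirms that they solve the system.

```python
import math


def lagrange_residuals(grad_f, grad_h, h, x, lam):
    """Left sides of (7.6.11) and (7.6.12) for one constraint (m = 1)."""
    gf, gh = grad_f(x), grad_h(x)
    stationarity = [gf_k + lam * gh_k for gf_k, gh_k in zip(gf, gh)]
    return stationarity, h(x)


def grad_f(x):
    # f(x, y) = x + y, so D_1 f = D_2 f = 1.
    return [1.0, 1.0]


def h(x):
    # Constraint: the unit circle.
    return x[0] ** 2 + x[1] ** 2 - 1.0


def grad_h(x):
    return [2.0 * x[0], 2.0 * x[1]]


s = 1.0 / math.sqrt(2.0)
# Candidates solved by hand: 1 + 2*lam*x = 0 and 1 + 2*lam*y = 0 force x = y.
candidates = [((s, s), -s), ((-s, -s), s)]

residuals = [lagrange_residuals(grad_f, grad_h, h, x, lam) for x, lam in candidates]
values = [x[0] + x[1] for x, _ in candidates]  # f at the candidates
```

The two candidates give the values $\sqrt{2}$ and $-\sqrt{2}$ of $f$; the Lagrange conditions alone do not say which is the maximum, which is why the text follows the method with a separate compactness argument in its own example.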
Unless the functions f and h are relatively simple, the method of
Lagrange multipliers is difficult to apply. However, we shall now give

an example which shows that it can lead to nice results. We shall obtain
the so-called
geometric-arithmetic means inequality. Other examples of
its uses are given in the exercises at the end of the chapter.
We shall prove the following statement: If $\forall k \in (1,n)$, $a_k \ge 0$, then
$$\Big( \prod_{k=1}^{n} a_k \Big)^{1/n} \le \frac{1}{n} \sum_{k=1}^{n} a_k. \tag{7.6.13}$$
To prove this, we shall find the maximum of the function
$$f(x) = \Big( \prod_{j=1}^{n} x^j \Big)^2, \qquad x^j > 0, \quad \forall j \in (1,n),$$
under the constraint
$$h(x) = \sum_{j=1}^{n} (x^j)^2 - 1 = 0.$$
By use of the method of Lagrange multipliers, the local extrema are contained among the solutions to the $n + 1$ equations
$$\frac{2 f(x)}{x^k} + 2 \lambda x^k = 0, \quad k \in (1,n), \qquad \sum_{j=1}^{n} (x^j)^2 = 1. \tag{7.6.14}$$
Suppose $(b,\lambda)$ is a solution of the above system. If we multiply the $k$th equation in (7.6.14) by $b^k$ and divide by 2 we get
$$f(b) + \lambda (b^k)^2 = 0. \tag{7.6.15}$$
If we sum up over $k$ and use the last equation of (7.6.14) we get
$$n f(b) + \lambda = 0.$$
Putting this value of $\lambda$ into (7.6.15) we get, $\forall k \in (1,n)$,
$$(b^k)^2 = 1/n, \tag{7.6.16}$$
and thus
$$\lambda = -n f(b) = -n^{1-n}. \tag{7.6.16'}$$
It is not difficult to check that the numbers given by (7.6.16) and (7.6.16') constitute a solution of the system (7.6.14) and thus this system has a unique solution in the domain where $f$ is defined. To see whether this solution leads to an extremum for $f$, let us extend $f$, in the obvious way, to a continuous function $F$ defined on the set $D = \{x : x \in E^n \ \&\ \forall j \in (1,n),\ x^j \ge 0\}$. If $S$ is the unit sphere in $E^n$, then since $F$ is continuous, $F \mid (D \cap S)$ must take on a maximum and minimum. Clearly, the minimum is taken on when $\exists j \in (1,n)$ so that $x^j = 0$, and the minimum is 0. Thus the maximum of $F \mid (D \cap S)$ is taken on when $\forall j \in (1,n)$, $x^j > 0$, and by Theorem 7.6.5 the point where the maximum is taken on must satisfy the system (7.6.14). If we specify that $\forall j \in (1,n)$, $x^j > 0$, this system has a unique solution. Hence it follows that the maximum is taken on at the point whose components are given by (7.6.16).
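Let $\{a_k : k \in (1,n)\} \subset R^+$, $y^k = (a_k)^{1/2}$, $y = (y^1, \ldots, y^n)$, $x^k = y^k / |y|$, and $x = (x^1, \ldots, x^n)$. Then $|x| = 1$ and by what we have proved previously we have
$$f(x) = \prod_{k=1}^{n} (x^k)^2 \le n^{-n}.$$
But
$$\prod_{k=1}^{n} (x^k)^2 = \frac{\prod_{k=1}^{n} a_k}{|y|^{2n}} = \frac{\prod_{k=1}^{n} a_k}{\big( \sum_{k=1}^{n} a_k \big)^n}, \qquad \text{so that} \qquad \prod_{k=1}^{n} a_k \le \Big( \frac{1}{n} \sum_{k=1}^{n} a_k \Big)^n.$$
The last inequality is precisely (7.6.13). In case $\exists j \in (1,n)$ so that $a_j = 0$, the inequality (7.6.13) is obviously true.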
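
The argument above can be spot-checked numerically. The sketch below is an illustration only, not a proof: it samples random points on the positive part of the unit sphere, confirms that $f(x) = \prod_j (x^j)^2$ never exceeds $n^{-n}$ (the value at the point given by (7.6.16)), and checks (7.6.13) on random positive numbers. The function names, the seed, and the sample sizes are arbitrary choices.

```python
import math
import random


def f(x):
    """f(x) = (x^1 * ... * x^n)^2, the function maximized in the text."""
    return math.prod(t * t for t in x)


def random_point_on_positive_sphere(n, rng):
    """A point of the unit sphere in E^n with all components positive."""
    y = [rng.random() + 1e-9 for _ in range(n)]
    norm = math.sqrt(sum(t * t for t in y))
    return [t / norm for t in y]


def geometric_mean(a):
    return math.prod(a) ** (1.0 / len(a))


def arithmetic_mean(a):
    return sum(a) / len(a)


n = 4
rng = random.Random(0)  # fixed seed so the run is repeatable

b = [1.0 / math.sqrt(n)] * n  # the critical point from (7.6.16)
max_value = f(b)              # should equal n^(-n)

samples = [random_point_on_positive_sphere(n, rng) for _ in range(2000)]
largest_sampled = max(f(x) for x in samples)

am_gm_gaps = []
for _ in range(2000):
    a = [rng.uniform(0.0, 10.0) for _ in range(n)]
    am_gm_gaps.append(arithmetic_mean(a) - geometric_mean(a))
```

Every entry of `am_gm_gaps` should be nonnegative up to rounding, and `largest_sampled` should stay below `max_value`.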

O Exercises
I.
Suppose A is a compact set in
E"
with a nonvoid interior A0,
and f is a real, continuous function with domain A which has a differen

tial at every point of A0 If Vx E {3A =A\A0, f(x)
so that df(a) =
2.
0.
Let f be a real-valued function defined on
f(x,y) = ax2
If
-,!:. 0
and b2 - 4ac =
0,
Let
bxy
J has
= (O, 0).
show that
or a relative minimum at (x,y)
3.
0,
show 3 a E A0
This generalizes Rolle's theorem.
E2
by the equation
cy2
either a relative maximum
f be that real-valued function defined on E2 by the

f(x,y)
1
x2 + xy + 2 y3.
1
equation
Find all the relative maxima and minima for f restricted to the triangle
and its interior which has vertices at the points
( -1, 6) .
(-1, 2 ) , (-2, 4),

f restricted to
What is the maximum and minimum of
and
this
triangle and its interior?

4.
Let
f be
that real-valued function defined by the equation
J(x,y) =2+2+2xy,
x
y
x=F-0, y=F-0.
Find all the critical points of the function and decide whether they are
at a relative minima, a relative maxima, or at a saddle point.
5. Let $P$ be that plane in $E^3$ whose equation is
$$3x + y - 2z = 5.$$
Find that point on $P$ whose distance from the point $(12, 1, 5)$ is a minimum.

6. Find the shortest distance from the point $(3, 3, -1)$ to the surface in $E^3$ whose equation is
$$xy - z = 0.$$

7. Let $f$ be a real-valued function with domain $E^2$ given by the equation
$$f(x,y) = ax^2 + bxy^2 + cy^4.$$
Show that if $b^2 - 4ac > 0$, then $f$ does not have a relative extremum at 0. However, if $\alpha^2 + \beta^2 \neq 0$, and $t \in R$, the function defined by $g_{\alpha\beta}(t) = f(\alpha t, \beta t)$ has a relative minimum at $t = 0$ if $a > 0$ and a relative maximum at $t = 0$ if $a < 0$.
8. Suppose $f$ is that real-valued function with $\mathcal{D}(f) = \{x : x \in E^n \ \&\ \forall j \in (1,n),\ x^j > 0\}$, and defined by
$$f(x) = \frac{1}{n} \sum_{j=1}^{n} x^j.$$
Use the method of Lagrange multipliers to find the minimum of this function under the constraint
$$h(x) = \prod_{j=1}^{n} x^j - 1 = 0.$$
Deduce the geometric-arithmetic means inequality (7.6.13).

9. (a) For fixed positive $p$ and $q$, let $f$ be that function defined on the open first quadrant of $E^2$ by the equation
$$f(x,y) = \frac{x^p}{p} + \frac{y^q}{q}.$$
Show that the minimum of $f$ under the constraint
$$h(x,y) = xy - 1 = 0$$
is $(1/p) + (1/q)$.

(b) Use the result of part (a) to show that if $a \ge 0$ and $b \ge 0$, $p > 1$ and $q > 1$, and $(1/p) + (1/q) = 1$, then
$$ab \le \frac{a^p}{p} + \frac{b^q}{q}.$$
This is a generalization of the result that $2ab \le a^2 + b^2$.
10. Suppose $f$ and $g$ are nonnegative continuous functions on $[a,b]$ and $h$ is nondecreasing on the same interval. Use Exercise 9(b) to show that if $p > 1$ and $q > 1$, and $(1/p) + (1/q) = 1$, then
$$\int_a^b f(x) g(x)\, dh(x) \le \Big[ \int_a^b f(x)^p\, dh(x) \Big]^{1/p} \Big[ \int_a^b g(x)^q\, dh(x) \Big]^{1/q}.$$
This is known as Hölder's inequality. If $p = 1$ and $q = \infty$, define
$$\Big[ \int_a^b g(x)^q\, dh(x) \Big]^{1/q} = \sup g.$$
Show that Hölder's inequality is true in this case also.

11. Use the results of Exercise 10 to show that if $p > 1$ and $(1/p) + (1/q) = 1$, and $\forall k \in (1,n)$, $a_k \ge 0$ and $b_k \ge 0$, then
$$\sum_{k=1}^{n} a_k b_k \le \Big[ \sum_{k=1}^{n} a_k^{\,p} \Big]^{1/p} \Big[ \sum_{k=1}^{n} b_k^{\,q} \Big]^{1/q}.$$

12. Use Hölder's inequality of Exercise 10 to obtain Minkowski's inequality
$$\Big[ \int_a^b |f(x) + g(x)|^p\, dh(x) \Big]^{1/p} \le \Big[ \int_a^b |f(x)|^p\, dh(x) \Big]^{1/p} + \Big[ \int_a^b |g(x)|^p\, dh(x) \Big]^{1/p}$$
for $1 \le p < \infty$. [Hint:
$$|f(x) + g(x)|^p = |f(x) + g(x)|^{p-1}\, |f(x) + g(x)| \le |f(x) + g(x)|^{p-1}\, |f(x)| + |f(x) + g(x)|^{p-1}\, |g(x)|.$$
Integrate both sides of the inequality and apply Hölder's inequality on the right.]
CHAPTER 8
HIGHER-DIMENSIONAL INTEGRATION

8.1 RIEMANN-DARBOUX INTEGRALS

The definitions and elementary results about Riemann and Darboux sums and integrals for higher dimensions are essentially the same as for the one-dimensional case given in Section 5.1. However, for the sake of completeness we shall repeat these things in this section but shall not dwell at any length upon them, and refer for the proofs to Chapter 5.

As we have done at times in previous chapters, if $m < n$ we shall often identify $E^m$ with that subspace of $E^n$ consisting of all vectors of the form $(x, 0)$ (or $(0, x)$), where $x \in E^m$ and $0 \in E^{n-m}$. Also $E^p \times E^q$ shall usually be identified with $E^{p+q}$ in the obvious way.
8.1.1 Definition. An interval in $E^n$ is an $n$-fold Cartesian product of $n$ nonvoid intervals in $E^1$;
$$I = I^1 \times \cdots \times I^n, \qquad I^k \subset E^1, \quad \forall k \in (1,n).$$
The volume or content of $I$ is defined by
$$|I| = \prod_{k=1}^{n} |I^k|,$$
and the diameter of $I$ is defined by
$$d(I) = \Big[ \sum_{k=1}^{n} |I^k|^2 \Big]^{1/2}.$$
The interval $I$ is said to be $m$-dimensional, $m \le n$ $\Leftrightarrow$ exactly $n - m$ of the component intervals $I^k$ are degenerate, that is, consist of a single point.
The next thing to do is to define the decomposition of a closed interval $I$ in $E^n$. The most natural thing to do in carrying over Definition 5.1.1 is to say that $\Delta$ is a decomposition of $I$ if and only if $\Delta$ is a finite set of closed intervals in $E^n$, any two of which intersect in at most an interval of dimension less than $n$, and whose set theoretic union is $I$. However, with such a definition it becomes somewhat complicated to prove, at least with the tools we have at present, that the sum of the volumes of the intervals in $\Delta$ is the volume of $I$. Thus we give a somewhat more restrictive definition.
8.1.2 Definition. A finite set $\Delta$ of intervals in $E^n$ is called a decomposition of the closed interval $I = I^1 \times \cdots \times I^n$ $\Leftrightarrow$ $\forall k \in (1,n)$ there is a decomposition $\Delta_k$ of $I^k$ so that every interval
$$J = J^1 \times J^2 \times \cdots \times J^n, \qquad J^k \in \Delta_k,$$
is in $\Delta$ and every interval in $\Delta$ is of this form. The norm of a decomposition $\Delta$ is defined as
$$|\Delta| = \max \{ d(J) : J \in \Delta \}.$$

With this definition of the decomposition of an interval it is now not hard to prove the fact about volumes that we mentioned above. We shall do this formally in the next proposition.
8.1.3 Proposition. If $I$ is any closed interval in $E^n$ and $\Delta$ is any decomposition of $I$, then
$$|I| = \sum_{J \in \Delta} |J|.$$

Proof. The proof will be by induction on $n$. Let $P(n)$ be the statement of the proposition. The statement $P(1)$ is true (Section 5.1, Exercise 2). Assume that $P(n)$ is true and let $I$ be an interval in $E^{n+1}$, $I = I^1 \times \cdots \times I^{n+1}$, and let $\Delta$ be a decomposition of $I$. Let $I' = I^1 \times \cdots \times I^n$ and let $\Delta'$ be the set of all intervals $J' = J^1 \times \cdots \times J^n$, where $\forall k \in (1,n)$, $J^k \in \Delta_k$. By definition, $\Delta'$ is a decomposition of $I'$ and thus, by $P(n)$,
$$|I'| = \sum_{J' \in \Delta'} |J'|.$$
Now, $|I| = |I'|\, |I^{n+1}|$ and, by $P(1)$,
$$|I^{n+1}| = \sum_{J^{n+1} \in \Delta_{n+1}} |J^{n+1}|.$$
Hence
$$|I| = \sum_{(J',\, J^{n+1}) \in \Delta' \times \Delta_{n+1}} |J'|\, |J^{n+1}|,$$
where the notation on the right means we are summing over all ordered pairs in $\Delta' \times \Delta_{n+1}$, as is usual when two finite sums are multiplied. It is almost trivial to prove that there is a one-to-one map from $\Delta' \times \Delta_{n+1}$ onto $\Delta$, and if $(J', J^{n+1}) \in \Delta' \times \Delta_{n+1}$ corresponds to $J \in \Delta$, then $|J'|\, |J^{n+1}| = |J|$. This establishes that $P(n+1)$ is true and the proof is complete.
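
Proposition 8.1.3 is easy to check on a concrete decomposition. The sketch below is illustrative only: it builds a decomposition in the sense of Definition 8.1.2 as the Cartesian product of one-dimensional decompositions and compares the sum of the cell volumes with the volume of the whole interval. The representation of an interval as a tuple of `(lo, hi)` pairs and all function names are assumptions made for this example.

```python
from itertools import product


def volume(interval):
    """Volume |I| of an interval given as a tuple of (lo, hi) pairs."""
    v = 1.0
    for lo, hi in interval:
        v *= hi - lo
    return v


def decompose_1d(points):
    """Decomposition of [points[0], points[-1]] into consecutive subintervals."""
    return [(points[i], points[i + 1]) for i in range(len(points) - 1)]


def product_decomposition(partitions):
    """All cells J^1 x ... x J^n with J^k taken from the k-th 1-D decomposition."""
    return [tuple(cell) for cell in product(*(decompose_1d(p) for p in partitions))]


# I = [0, 2] x [1, 3] x [0, 1], cut by the listed partition points on each axis.
partitions = [[0.0, 0.5, 2.0], [1.0, 1.5, 1.75, 3.0], [0.0, 1.0]]
whole = tuple((p[0], p[-1]) for p in partitions)

cells = product_decomposition(partitions)
total = sum(volume(cell) for cell in cells)
```

Here `cells` has $2 \cdot 3 \cdot 1 = 6$ members, mirroring the one-to-one correspondence between $\Delta' \times \Delta_{n+1}$ and $\Delta$ used in the induction step.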
8.1.4 Definition. A decomposition $\Delta^*$ is called a refinement of a decomposition $\Delta$ (in symbols $\Delta^* \succ \Delta$) $\Leftrightarrow$ each interval in $\Delta^*$ is contained in an interval in $\Delta$. If $\Delta_1$ and $\Delta_2$ are two decompositions of the same interval, their common refinement is that decomposition which consists of every interval that is an intersection of an interval of $\Delta_1$ with an interval of $\Delta_2$.

We leave as an exercise for the reader the rather simple proof that the second sentence in the previous definition makes sense, namely, what we have defined as a common refinement is actually a decomposition. We now proceed to define a Riemann sum and integral.
8.1.5 Definition. Let $f$ be a real-valued function with domain the closed interval $I \subset E^n$. Let $\mathcal{P}$ be the collection of all pairs $(\Delta, \{x_k\})$, where $\Delta = \{I_k : k \in (1,m)\}$ is a decomposition of $I$ and $\forall k \in (1,m)$, $x_k \in I_k$. Let $R_f$ be that function with domain $\mathcal{P}$ defined by
$$R_f(\Delta, \{x_k\}) = \sum_{k=1}^{m} f(x_k)\, |I_k|.$$
The function $R_f$ is called the Riemann sum function for $f$, and any number in its range is called a Riemann sum for $f$.

The function $R_f$ is said to have the limit $R(f)$ $\Leftrightarrow$ $\forall \epsilon > 0$, $\exists \Delta_\epsilon$ so that $\Delta \succ \Delta_\epsilon \Rightarrow |R_f(\Delta, \{x_k\}) - R(f)| < \epsilon$. In case $R_f$ has a limit, we say that $f$ is Riemann integrable and the limit $R(f)$ is called the Riemann integral of $f$.

In case $R_f$ has a limit, we are justified in calling $R(f)$ the limit since it is unique. The proof is extremely simple and is found after Definition 5.1.3 when $f$ has domain in $R$.
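8.1.6 Theorem. The function $R_f$ has a limit $\Leftrightarrow$ $R_f$ is Cauchy in the sense that $\forall \epsilon > 0$, $\exists \Delta_\epsilon$ so that $\Delta \succ \Delta_\epsilon$ and $\Delta' \succ \Delta_\epsilon$ $\Rightarrow$
$$|R_f(\Delta, \{x_k\}) - R_f(\Delta', \{x_k'\})| < \epsilon.$$

The proof of this is exactly the same as the proof of Theorem 5.1.7 and we shall not repeat it. We go on to define upper and lower Darboux sums and integrals.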
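8.1.7 Definition. Let $f$ be a real-valued bounded function with domain the closed interval $I$ and let $\overline{D}_f$ and $\underline{D}_f$ be those real-valued functions each having domain the set of all decompositions of $I$ and defined by
$$\overline{D}_f(\Delta) = \sum_{J \in \Delta} M(J)\, |J|, \qquad M(J) = \sup \{ f(x) : x \in J \},$$
$$\underline{D}_f(\Delta) = \sum_{J \in \Delta} m(J)\, |J|, \qquad m(J) = \inf \{ f(x) : x \in J \}.$$
The functions $\overline{D}_f$ and $\underline{D}_f$ are called the upper and lower Darboux sum functions for $f$, respectively. The numbers $\overline{D}_f(\Delta)$ and $\underline{D}_f(\Delta)$ are called upper and lower Darboux sums for $f$, respectively. Set
$$\overline{D}(f) = \inf \{ \overline{D}_f(\Delta) : \Delta \in \mathcal{D}(\overline{D}_f) \} = \overline{\int}_I f(x)\, dx,$$
$$\underline{D}(f) = \sup \{ \underline{D}_f(\Delta) : \Delta \in \mathcal{D}(\underline{D}_f) \} = \underline{\int}_I f(x)\, dx,$$
and call these numbers the upper and lower Darboux integrals of $f$, respectively. In case $\overline{D}(f) = \underline{D}(f) = D(f)$, we say that $f$ is Darboux integrable and call $D(f)$ the Darboux integral of $f$.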
As in Chapter 5, to show the connection between Riemann integrals and Darboux integrals it is necessary to prove the following lemma.

8.1.8 Lemma. If $f$ is a real-valued bounded function on the closed interval $I$ and $\Delta^* \succ \Delta$, then
$$\underline{D}_f(\Delta) \le \underline{D}_f(\Delta^*) \le \overline{D}_f(\Delta^*) \le \overline{D}_f(\Delta).$$

The proof of this, using Proposition 8.1.3, is exactly the same as the proof of Lemma 5.1.6, and we shall not repeat it. We now state the main connection between Riemann and Darboux integrals. Again, the proof is the same as the proof of Theorem 5.1.5.

8.1.9 Theorem. The Riemann integral of $f$ exists if and only if the Darboux integral of $f$ exists, and if they exist then $R(f) = D(f)$.

In case the integral of $f$ exists we call it the Riemann-Darboux integral and denote it by
$$\int_I f(x)\, dx.$$
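
A small computation may make Lemma 8.1.8 concrete. The sketch below is a hedged illustration, not part of the text: it takes $f(x,y) = xy$ on $I = [0,1] \times [0,1]$ with the uniform $n \times n$ decomposition. Because this particular $f$ is nondecreasing in each variable on $I$, the supremum and infimum over a cell are attained at its upper-right and lower-left corners; that shortcut is specific to the example and would not work for a general bounded function.

```python
def darboux_sums(n):
    """Lower and upper Darboux sums of f(x, y) = x*y on [0,1]^2, uniform n-by-n cells."""
    h = 1.0 / n
    lower = 0.0
    upper = 0.0
    for i in range(n):
        for j in range(n):
            cell_volume = h * h
            lower += (i * h) * (j * h) * cell_volume              # m(J) at lower-left corner
            upper += ((i + 1) * h) * ((j + 1) * h) * cell_volume  # M(J) at upper-right corner
    return lower, upper


# The 8-by-8 grid is a refinement of the 4-by-4 grid (every cell is halved).
lower4, upper4 = darboux_sums(4)
lower8, upper8 = darboux_sums(8)
exact = 0.25  # the integral of x*y over the unit square
```

The four numbers should line up as in the lemma, `lower4 <= lower8 <= upper8 <= upper4`, with the exact integral $1/4$ caught between the lower and upper sums.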
GENERALIZED LIMITS

It is possible that by this time the reader may have begun to wonder whether or not the various concepts of limit we have used, for example, the limit of a sequence, the limit of a function at a point in $E^n$, or the limit of the function $R_f$, can all be brought under some general definition. The answer is yes, and we shall briefly describe this general concept and show how our previous definitions fit under it.

A set $\mathcal{N}$ is called a directed set if and only if there is a relation $\succ$ on $\mathcal{N}$ with the following properties:
(a) $(\forall n)(n \succ n)$.
(b) $(\forall m)(\forall n)(\forall p)(p \succ n \ \&\ n \succ m \Rightarrow p \succ m)$.
(c) $(\forall m)(\forall n)(\exists p)(p \succ m \ \&\ p \succ n)$.
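A real net is a real-valued function with domain a directed set $\mathcal{N}$. The real net $S$ is said to have a limit $s$ $\Leftrightarrow$ $\forall \epsilon > 0$, $\exists N \in \mathcal{N}$ so that $n \succ N \Rightarrow |S(n) - s| < \epsilon$. If a net has a limit, the limit is unique. Indeed, suppose $t$ is another limit. Then $\exists M \in \mathcal{N}$ so that $n \succ M \Rightarrow |S(n) - t| < \epsilon$. Now, by condition (c) for a directed set, $\exists P \in \mathcal{N}$ so that $P \succ M$ and $P \succ N$. Hence if $n \succ P$ we have
$$|s - t| \le |S(n) - s| + |S(n) - t| < 2\epsilon,$$
and thus $s = t$. Hence we are justified in calling a limit of a real net the limit of the real net.

A real net $S$ is said to be Cauchy $\Leftrightarrow$ $\forall \epsilon > 0$, $\exists N \in \mathcal{N}$ so that $n \succ N \ \&\ m \succ N \Rightarrow |S(n) - S(m)| < \epsilon$. A real net has a limit $\Leftrightarrow$ it is Cauchy. It is clear that if a real net has a limit it is Cauchy. On the other hand, suppose the net $S$ is Cauchy. Then $\exists M \in \mathcal{N}$ so that $n \succ M \Rightarrow |S(n) - S(M)| < 1$. Thus $n \succ M \Rightarrow |S(n)|$ is bounded by $1 + |S(M)|$. For every $n \succ M$ set $\varphi(n) = \sup \{ S(m) : m \succ n \}$ and $\overline{\lim}\, S = \inf \{ \varphi(n) : n \succ M \}$. We claim that $s = \overline{\lim}\, S$ is the limit of $S$. First, note that $n \succ m \Rightarrow \varphi(n) \le \varphi(m)$. Next, $\forall \epsilon > 0$, $\exists n_1 \succ M$ so that
$$0 \le \varphi(n_1) - s < \epsilon. \tag{8.1.1}$$
Also, $\exists n_2 \succ M$ so that $m, n \succ n_2 \Rightarrow |S(m) - S(n)| < \epsilon/2$, and $\exists n' \succ n_2$ so that
$$0 \le \varphi(n_2) - S(n') < \epsilon/2. \tag{8.1.2}$$
Since $n' \succ n_2$, it follows that $\forall n \succ n_2$ we get
$$|S(n) - S(n')| < \epsilon/2. \tag{8.1.3}$$
From the inequalities (8.1.2) and (8.1.3) we get that $n \succ n_2 \Rightarrow$
$$0 \le \varphi(n_2) - S(n) < \epsilon. \tag{8.1.4}$$
From condition (c) $\exists N \in \mathcal{N}$ so that $N \succ n_1$ and $N \succ n_2$. Hence if $n \succ N$ we get from (8.1.1) and (8.1.4) and the monotone character of $\varphi$ that
$$0 \le \varphi(n) - s < \epsilon, \qquad 0 \le \varphi(n) - S(n) < \epsilon.$$
From these two inequalities it is immediately clear that $n \succ N \Rightarrow |S(n) - s| < \epsilon$.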
Let us see how we can apply this concept to the various definitions of limit that we have given. In case $\mathcal{N} = N_0$, a net is just a sequence and the relation $\succ$ is taken as the relation $\ge$. Suppose $f$ is a real-valued function with $\mathcal{D}(f) \subset E^n$ and $a$ is an accumulation point of $\mathcal{D}(f)$. If $x, y \in \mathcal{D}(f) \setminus \{a\}$ set $x \succ y \Leftrightarrow |x - a| \le |y - a|$. It is easily checked that $\mathcal{D}(f) \setminus \{a\}$ is a directed set under this relation. The function $f$ restricted to the directed set $\mathcal{D}(f) \setminus \{a\}$ is a net and the function $f$ has the limit $l$ at $a$ $\Leftrightarrow$ the net $f$ has the limit $l$ at $a$.

In the case of the function $R_f$ we take $\mathcal{N}$ as the domain of $R_f$ and define $(\Delta^*, \{x_k^*\}) \succ (\Delta, \{x_k\}) \Leftrightarrow \Delta^*$ is a refinement of $\Delta$. Note that with this definition our discussion of Cauchy nets provides a general proof for Theorems 5.1.7 and 8.1.6. For the functions $\overline{D}_f$ and $\underline{D}_f$ we can take $\mathcal{N}$ as the set of all decompositions of the given interval $I$ and take $\Delta^* \succ \Delta \Leftrightarrow \Delta^*$ is a refinement of $\Delta$. Then $\overline{D}_f$ becomes a monotone nonincreasing net and $\underline{D}_f$ a monotone nondecreasing net. The numbers $\overline{D}(f)$ and $\underline{D}(f)$ are the limits of these nets, respectively.

Of course, everything we have said for real nets will work as well for vector-valued nets with ranges in $E^n$, $n \ge 1$.
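Exercises

1. Suppose that $f$ and $g$ are real-valued bounded functions each having as domain the closed interval $I \subset E^n$. Show that
$$\overline{\int}_I [f(x) + g(x)]\, dx \le \overline{\int}_I f(x)\, dx + \overline{\int}_I g(x)\, dx,$$
$$\underline{\int}_I f(x)\, dx + \underline{\int}_I g(x)\, dx \le \underline{\int}_I [f(x) + g(x)]\, dx.$$

2. Suppose $f$ and $g$ are bounded real-valued functions having as common domain the closed interval $I \subset E^n$. If $f \le g$ show that
$$\overline{\int}_I f(x)\, dx \le \overline{\int}_I g(x)\, dx, \qquad \underline{\int}_I f(x)\, dx \le \underline{\int}_I g(x)\, dx.$$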
3. Suppose f is a bounded integrable function with domain the

closed interval I C E" and J is a closed subinterval of /. Is it always
true that
4. Suppose that $I$ and $J$ are closed intervals in $E^n$ so that $I \cup J$ is an interval and $I \cap J$ is at most an $(n-1)$-dimensional interval. If $f$ is a real bounded function with domain $I \cup J$, show that
$$\overline{\int}_{I \cup J} f(x)\, dx = \overline{\int}_I f(x)\, dx + \overline{\int}_J f(x)\, dx,$$
$$\underline{\int}_{I \cup J} f(x)\, dx = \underline{\int}_I f(x)\, dx + \underline{\int}_J f(x)\, dx.$$

5. Let $\mathcal{N}$ be a directed set, and $\forall n \in \mathcal{N}$ let $K_n$ be a nonvoid compact subset of $E^p$ so that if $n \succ m$, then $K_n \subset K_m$. Show that the following version of the Cantor intersection theorem is valid; that is,
$$\bigcap \{ K_n : n \in \mathcal{N} \}$$
is nonvoid.

8.2 JORDAN CONTENT
In the case of higher dimensions it is very apparent that there are simple geometric shapes, other than intervals, over which we wish to define integrals of functions. We shall first make a definition and then show that the definition makes sense.

8.2.1 Definition. Let $A$ be any bounded set in $E^n$ and $I$ any closed interval in $E^n$ so that $A \subset I$. If $f$ is a real-valued, bounded function with $\mathcal{D}(f) = A$, extend $f$ to $I$ by taking
$$f_A(x) = \begin{cases} f(x) & \Leftarrow x \in A, \\ 0 & \text{if } x \in I \setminus A. \end{cases}$$
Then define
$$\overline{\int}_A f(x)\, dx = \overline{\int}_I f_A(x)\, dx \qquad \text{and} \qquad \underline{\int}_A f(x)\, dx = \underline{\int}_I f_A(x)\, dx.$$
In case these numbers are equal, we define the common value as the integral of $f$ over $A$ and denote it by
$$\int_A f(x)\, dx.$$

In order that the previous definition make sense, it is clear that the numbers we have defined should be independent of the interval $I$ that contains $A$. Although this seems almost obvious, we shall state and prove it formally.
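8.2.2 Proposition. If $A \subset E^n$ is bounded and $I$ is any closed interval with $A \subset I$, then the numbers
$$\overline{\int}_I f_A(x)\, dx \qquad \text{and} \qquad \underline{\int}_I f_A(x)\, dx$$
are independent of $I$.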
Proof.
We shall make the proof only for the upper integral, the proof
for the lower integral being similar. Suppose, at first that, A
C ] C I,
I= /1 X
XI",]=J1 X
X]". Since ] C I, it
follows that Vk E (I, n),Jk C Ik, and hence there is a decomposition
where recall that
!!,.k Jk
l2\l3k,
consisting of three intervals (some possibly degenerate) 1/,

where
Corresponding to the set
k E (I,
of
decompositions, there is a decomposition
of I as in Definition 8.1.2.
Now, Ve> 0 there is a decomposition l!..1 of] so that
of
l2k =lk.
!!..
{t:..k:
n)}
(8.2.1)
n)
(!!,.k\{Jk})U t:..1k C
!!.'. \l!..i.
=
DrA(t:..') = L M(/')JI'J+ L M(/')JI'I =DrA(t:..1).
Jk;
For every k E (I,

the set
these give a decomposition !!..' of /. Since A
1, it follows that if
then M(I')
sup{JA(x): x E /'} = 0. Thus
I' E
/'EL>.1
(8.2.2)
/'El>.'\L>.1
From (8.2.1) and (8.2.2) we get
!A(x) dx-
IfA(x)
dx.:;;
DrA
(!!..1) -
IfA(x)
dx < e.
(8.2.3)
On the other hand, let !!..2 be a decomposition of I so that

(8.2.4)
If !!..* is the common refinement of !!..2 and !!.., we get
< e.
f
D1A
l!..1* = {/*: /* !!..* I* C j};
C !!..*\l!..1*,
*
M(/ )
DrA(!!.. *) = DrA *).
(!!..*) -
0.:;;
Now,take
that
!A(x) dx
(8.2.4')
E
&
then if/*
it follows
0. Thus we again have a formula like (8.2.2), that is,
(!!..1
(8.2.2')
Hence from (8.2.2') and (8.2.4') we get
IfA
(x) dx-
IfA(x)
dx.:;;
i51A(l!..1*) - IfA(x)
dx < e.
(8.2.5)
The inequalities (8.2.3) and (8.2.5) show that the upper integral of
over j is the same as its upper integral over /.
In case whereA
/,but it is not true thatA
1, let us proceed
as follows. If
and e> 0, set I/=
e], and
- e,
I,= 1.1 X
XI/. In the same way we construct the interval]. We
leave for the reader the very easy, but slightly tedious,task of verifying
that 3M> 0, depending on and/, so that
fA
Cl C
lk = [ak, bk]
C
[ak bk+
II.fA(x)
I I/
dx-
A(x) dx-
I
l <Me,
IfA(x) l <Me.
fA(x) dx
dx
The analysis is very similar to that carried out in the first part of the
proof.
Since
Ve>
A Cj Cj, CI.,
we have, from the first part of the proof,
0,
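Using this fact and the two inequalities above, we see that the upper integral of $f_A$ over $J$ is the same as its upper integral over $I$. Finally, if $A \subset I$ and $A \subset K$, then $A$ is contained in the closed interval $I \cap K$, and by what we have just shown the upper integrals of $f_A$ over $I$ and over $K$ both equal its upper integral over $I \cap K$.

8.2.3 Definition. If $A$ is a bounded set in $E^n$, let
$$\overline{\chi}(A) = \overline{\int}_A dx \qquad \text{and} \qquad \underline{\chi}(A) = \underline{\int}_A dx,$$
and call these the outer and inner Jordan content of $A$, respectively. In the case where $\overline{\chi}(A) = \underline{\chi}(A)$, we say that $A$ is Jordan measurable and designate this common value by $|A|$, calling the latter number the Jordan content of $A$.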
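
For a concrete feeling for inner and outer content, one can compute the lower and upper Darboux sums of the characteristic function of a simple set. The sketch below is illustrative and rests on stated assumptions: $A$ is taken to be the closed unit disk in $E^2$, $I = [-1,1] \times [-1,1]$, and the decomposition is the uniform $n \times n$ grid. A cell contributes to the lower sum when it lies inside the disk (its farthest corner from the origin is within distance 1) and to the upper sum when it meets the disk (its nearest point to the origin is within distance 1). The function names are hypothetical.

```python
import math


def nearest_abs(lo, hi):
    """Smallest |t| for t in [lo, hi]."""
    if lo <= 0.0 <= hi:
        return 0.0
    return min(abs(lo), abs(hi))


def disk_content_bounds(n):
    """Lower and upper Darboux sums of the characteristic function of the unit disk."""
    h = 2.0 / n
    inner = 0.0
    outer = 0.0
    for i in range(n):
        for j in range(n):
            x0, x1 = -1.0 + i * h, -1.0 + (i + 1) * h
            y0, y1 = -1.0 + j * h, -1.0 + (j + 1) * h
            far = max(abs(x0), abs(x1)) ** 2 + max(abs(y0), abs(y1)) ** 2
            near = nearest_abs(x0, x1) ** 2 + nearest_abs(y0, y1) ** 2
            if far <= 1.0:    # cell contained in the disk: inf of chi_A over it is 1
                inner += h * h
            if near <= 1.0:   # cell meets the disk: sup of chi_A over it is 1
                outer += h * h
    return inner, outer


inner50, outer50 = disk_content_bounds(50)
inner200, outer200 = disk_content_bounds(200)
```

Both pairs bracket $\pi$, and the gap between the two sums shrinks as the grid is refined, which is what one expects for a Jordan measurable set; for a set such as the rationals in $[0,1]$ the two sums would stay a fixed distance apart.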
The next theorem, although of a rather simple nature, is nevertheless relatively important in the development of the theory of Jordan content. In loose language it says that the outer content is a subadditive monotone function.

8.2.4 Theorem. Suppose $A$ and $B$ are bounded sets in $E^n$.
(a) $\overline{\chi}(A \cup B) \le \overline{\chi}(A) + \overline{\chi}(B)$, and if the distance between $A$ and $B$, $d = \inf \{ |x - y| : x \in A \ \&\ y \in B \}$, is positive, then equality maintains.
(b) If $A \subset B$, then $\overline{\chi}(A) \le \overline{\chi}(B)$.

Proof. For any set $S$, let $\chi_S$ be that function which has the value 1 on $S$, and has the value 0 on $S^c$, the complement of $S$. It is called the characteristic function of $S$.

(a) Suppose $I$ is a closed interval in $E^n$ and $A \cup B \subset I$. Since $\forall x \in E^n$ we have $\chi_{A \cup B}(x) \le \chi_A(x) + \chi_B(x)$ we have
$$\overline{\chi}(A \cup B) = \overline{\int}_I \chi_{A \cup B}(x)\, dx \le \overline{\int}_I \chi_A(x)\, dx + \overline{\int}_I \chi_B(x)\, dx = \overline{\chi}(A) + \overline{\chi}(B).$$
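If the distance $d$ between $A$ and $B$ is positive, let $\Delta$ be any decomposition of $I$ so that $|\Delta| < d/2$. Then an interval in $\Delta$ cannot intersect both $A$ and $B$ in nonvoid sets. Let $\Delta_A = \{ J : J \in \Delta \ \&\ J \cap A \neq \emptyset \}$ and $\Delta_B = \{ J : J \in \Delta \ \&\ J \cap B \neq \emptyset \}$. Then $\Delta_A \cap \Delta_B = \emptyset$ and if $J \in \Delta \setminus (\Delta_A \cup \Delta_B)$ it follows that $M(J) = \sup \{ \chi_{A \cup B}(x) : x \in J \} = 0$. Thus
$$\overline{D}_{\chi_{A \cup B}}(\Delta) = \sum_{J \in \Delta_A} M(J)\, |J| + \sum_{J \in \Delta_B} M(J)\, |J| = \overline{D}_{\chi_A}(\Delta) + \overline{D}_{\chi_B}(\Delta),$$
and consequently
$$\overline{\int}_I \chi_A(x)\, dx + \overline{\int}_I \chi_B(x)\, dx \le \overline{D}_{\chi_A}(\Delta) + \overline{D}_{\chi_B}(\Delta) = \overline{D}_{\chi_{A \cup B}}(\Delta).$$
Now, $\forall \epsilon > 0$ it is certainly true that $\exists \Delta$ so that $|\Delta| < d/2$ and
$$\overline{D}_{\chi_{A \cup B}}(\Delta) < \overline{\int}_I \chi_{A \cup B}(x)\, dx + \epsilon.$$
Consequently,
$$\overline{\chi}(A) + \overline{\chi}(B) \le \overline{\chi}(A \cup B).$$
This combined with the inequality in (a) gives equality.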

(b) The inequality in part (b) is simply a consequence of the fact that $A \subset B \Rightarrow \chi_A \le \chi_B$ and thus
$$\overline{\chi}(A) = \overline{\int}_I \chi_A(x)\, dx \le \overline{\int}_I \chi_B(x)\, dx = \overline{\chi}(B).$$

REMARKS: From the last proposition and the definition of outer Jordan content, it follows that for every bounded set $A \subset E^n$,
$$\overline{\chi}(A) = \text{g.l.b.} \Big\{ \sum_{J \in \Phi} |J| : \Phi \in \Gamma \Big\}, \tag{8.2.6}$$
where $\Gamma$ is the collection of all coverings of $A$ containing a finite number of closed intervals. Indeed, if $\Phi \in \Gamma$, then from the last proposition we have
$$\overline{\chi}(A) \le \overline{\chi}\Big( \bigcup \{ J : J \in \Phi \} \Big) \le \sum_{J \in \Phi} |J|.$$
On the other hand, the definition of $\overline{\chi}(A)$ says that it is the g.l.b. over that subset of $\Gamma$ consisting of decompositions of closed intervals that contain $A$. Thus (8.2.6) is valid.
It is not hard to give examples of sets that are not Jordan measurable. The easiest example is obtained by considering the set $A$ as all the rationals in $[0,1]$; that is, $A = Q \cap [0,1]$. It is a very easy matter to check that $\overline{\chi}(A) = 1$, while $\underline{\chi}(A) = 0$. More generally, it is true that if $A$ is a bounded set, and if $A^0 = \emptyset$, then $\underline{\chi}(A) = 0$; hence, if $\overline{\chi}(A) > 0$, $A$ cannot be Jordan measurable. Indeed, if we embed $A$ into a closed interval $I$ and $\Delta$ is any decomposition of $I$, then
$$\underline{D}_{\chi_A}(\Delta) = \sum_{\substack{J \in \Delta \\ J \subset A}} |J|,$$
where the right side is considered to be zero if there is no $J$ in $\Delta$ that is contained in $A$.
We now wish to state a necessary and sufficient condition that a

bounded set in En be Jordan measurable. This is done in terms of the
boundary of the set, and it would be well for the reader to review the
pertinent material at the end of Section 6.3. The condition we shall
give will also provide another viewpoint on the facts discussed in the
last paragraph.
8.2.5 Theorem. A bounded set $A \subset E^n$ is Jordan measurable $\Leftrightarrow$ $|\beta A| = 0$.
Although it is not difficult to give a direct proof of this theorem, we

prefer to defer it until after we have proved Theorem 8.3.2, since it
is really a very easy corollary of that theorem. We shall leave it to the
reader in Exercise 8 at the end of this section to establish a result that
will give Theorem 8.2.5 as an immediate consequence. Nevertheless,
we want to use Theorem 8.2.5 to investigate under what conditions the
map of a Jordan measurable set remains Jordan measurable. We shall
need this type of result when we prove the transformation theorem for
multiple integrals in Section 8.5.
8.2.6 Definition. A function $g$ with domain in $E^n$ and range in $E^m$ is said to be locally Lipschitz $\Leftrightarrow$ for every compact set $K \subset \mathfrak{D}(g)$, $\exists M > 0$ and $\exists \delta > 0$ so that $\forall x, y \in K$ with $|x - y| < \delta$ we have $|g(x) - g(y)| \le M |x - y|$.
For example, every function of class $C^1$ is a locally Lipschitz function. The proof of this statement is essentially contained in the statement of Proposition 7.5.1. We shall leave it to the reader to fill in the elementary details in Exercise 9 at the end of this section.
8.2.7 Theorem. Suppose $g$ is a locally Lipschitz function with domain and range in $E^n$. For every bounded open set $U$ with $\bar{U} \subset \mathfrak{D}(g)$ and $\forall \varepsilon > 0$, $\exists \delta > 0$ so that if $\bar{A} \subset U$ and $\bar{\chi}(A) < \delta$, then $\bar{\chi}(g(A)) < \varepsilon$. In particular, if $|A| = 0$, then $|g(A)| = 0$.
Proof. Since $\bar{U}$ is compact and $g$ is locally Lipschitz, $\exists M > 0$ and $\exists \eta > 0$ so that $\forall x, y \in \bar{U}$ with $|x - y| < \eta$ we have $|g(x) - g(y)| \le M |x - y|$. Now, given $\varepsilon > 0$, cover $A$ by a finite number of closed intervals $\{I_k : k \in \langle 1, m \rangle\}$ so that
$$\sum_{k=1}^{m} |I_k| < \bar{\chi}(A) + \varepsilon / 2(M\sqrt{n})^n.$$
Since it is clear that each $I_k$ may be covered by cubes whose total volume is as close to the volume of $I_k$ as we may wish, we may as well suppose at the outset that each $I_k$ is a cube. Let $\xi$ be the positive distance from $A$ to $U^c$, the complement of $U$, and $\zeta = \min(\eta, \xi)$. Considering each $I_k$ as a cube, we can decompose it into cubes, so that the diameter of each is less than $\zeta$. Thus we may as well suppose from the very beginning that $d(I_k) < \zeta$. Hence we can suppose that $\cup I_k \subset U$. Thus, since $x, y \in I_k \Rightarrow |x - y| < \eta$, we get
$$|g(x) - g(y)| \le M |x - y| \le M\, d(I_k).$$
This means that $g(I_k)$ is contained in a cube $J_k$ whose side length is at most $M\, d(I_k)$. Now $d(I_k)$ is $\sqrt{n}$ times the side length of $I_k$ so that
$$|J_k| \le (M\sqrt{n})^n\, |I_k|.$$
Since $g(A) \subset g(\cup I_k) \subset \cup J_k$, it follows from Proposition 8.2.4 that
$$\bar{\chi}(g(A)) \le \bar{\chi}(\cup J_k) \le \sum_{k=1}^{m} |J_k| \le (M\sqrt{n})^n \sum_{k=1}^{m} |I_k| < (M\sqrt{n})^n\, \bar{\chi}(A) + \varepsilon/2.$$
If we take $\delta = \varepsilon / 2(M\sqrt{n})^n$, then $\bar{\chi}(A) < \delta \Rightarrow \bar{\chi}(g(A)) < \varepsilon$.
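The key estimate in the proof, $|J_k| \le (M\sqrt{n})^n |I_k|$, can be checked numerically on a small case. The following is a minimal sketch under stated assumptions: $g$ is taken to be a linear map of $E^2$ (an illustrative choice, not part of the theorem), and the Frobenius norm of its matrix is used as the Lipschitz constant $M$, which is admissible because it is never smaller than the operator norm. All function names are hypothetical.

```python
import itertools
import math

def apply_linear(matrix, x):
    """Image of the point x under the linear map with the given matrix."""
    return tuple(sum(row[j] * x[j] for j in range(len(x))) for row in matrix)

def frobenius_norm(matrix):
    """An admissible Lipschitz constant M for the linear map (assumption)."""
    return math.sqrt(sum(entry * entry for row in matrix for entry in row))

def bounding_box_volume(points):
    """Volume of the smallest closed interval containing the given points."""
    volume = 1.0
    for i in range(len(points[0])):
        coords = [p[i] for p in points]
        volume *= max(coords) - min(coords)
    return volume

n = 2
matrix = [[2.0, 1.0], [0.5, -1.0]]   # illustrative linear map g
corner, side = (1.0, -3.0), 0.25     # the cube I_k = corner + [0, side]^n

# A linear map sends the cube onto the convex hull of the images of its
# vertices, so the vertices suffice to find the enclosing interval.
vertices = [tuple(corner[i] + side * bit[i] for i in range(n))
            for bit in itertools.product((0, 1), repeat=n)]
images = [apply_linear(matrix, v) for v in vertices]

M = frobenius_norm(matrix)
enclosing_volume = bounding_box_volume(images)
bound = (M * math.sqrt(n)) ** n * side ** n   # (M sqrt(n))^n |I_k|
```

The enclosing interval computed here is not necessarily a cube, but its volume is no larger than that of the cube $J_k$ used in the proof, so the comparison with the bound is still meaningful.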
8.2.8 Theorem. Suppose $g$ is a locally Lipschitz map with an open domain in $E^n$ and range also in $E^n$. If $A$ is a bounded Jordan measurable set with $\bar{A} \subset \mathfrak{D}(g)$ and $g(A^\circ)$ is open, then $g(A)$ is Jordan measurable.
Proof. The theorem will be proved if we can show that $|\beta g(A)| = 0$. Now, since $\bar{A}$ is compact and $\mathfrak{D}(g)$ is open, there is a bounded open set $U$ with $\bar{A} \subset U \subset \bar{U} \subset \mathfrak{D}(g)$. [Just take $U$ as the union of all open balls with centers in $\bar{A}$ and radii equal to half the positive distance from $\bar{A}$ to $\mathfrak{D}(g)^c$.] Hence, since $\beta A = \bar{A} \setminus A^\circ \subset U$, we can apply the last theorem and get $|g(\beta A)| = 0$. Thus, if we can show that $\beta g(A) \subset g(\beta A)$, we shall be done.
To prove that $\beta g(A) \subset g(\beta A)$, we note several facts. First, since $\bar{A}$ is compact and $g$ is continuous, it follows that $g(\bar{A})$ is compact and hence closed. It follows that $\overline{g(A)} \subset g(\bar{A})$. Next, since $g(A^\circ)$ is open we have $g(A^\circ) \subset g(A)^\circ$. Consequently,
$$\beta g(A) = \overline{g(A)} \setminus g(A)^\circ \subset g(\bar{A}) \setminus g(A^\circ) \subset g(\bar{A} \setminus A^\circ) = g(\beta A).$$

Exercises
1. Show that an $(n - 1)$-dimensional interval in $E^n$ has zero $n$-dimensional Jordan content.
2. Let $f$ be a continuous real-valued function with $\mathfrak{D}(f)$ a compact set in $E^1$. The graph of $f$ is the set of ordered pairs in $f$ when considered as a subset of $E^2$. Show that the two-dimensional Jordan content of the graph of $f$ is zero. Extend this result to the situation where $f$ is a real-valued continuous function with compact domain in $E^n$.
3. Show that the sphere $\{(x, y, z) : x^2 + y^2 + z^2 = 1\}$ has zero three-dimensional Jordan content.
4. A Jordan arc or path in $E^n$ is a one-to-one continuous function $f$ with domain $[0, 1]$ and range in $E^n$. It is piecewise differentiable $\Leftrightarrow$ the domain of $f'$ is all of $[0, 1]$ except for a finite number of points. If $f$ is a piecewise differentiable Jordan arc in $E^2$ and $f'$ is bounded, show that the two-dimensional Jordan content of $\mathfrak{R}(f)$ is zero.
5. Let $A$ be a bounded set in $E^n$ having a finite number (possibly zero!) of accumulation points. Show that the Jordan content of $A$ is zero.
6. Give an example of a bounded open set that is not Jordan measurable. (Hint: Let $\{r_k : k \in N\}$ be the rationals in $[0, 1]$ and $\forall k \in N$ let $I_k$ be an open interval contained in $[0, 1]$ so that $r_k \in I_k$ and $\sum_{k=1}^{\infty} |I_k| \le 1/2$.)
7. Give an example of a bounded connected set in $E^n$, $n \ge 2$, which is not Jordan measurable.
8. If $A$ is a bounded set in $E^n$, show that
$$\bar{\chi}(\beta A) = \bar{\chi}(A) - \underline{\chi}(A).$$
Deduce Theorem 8.2.5 as a corollary.
9. Show that every function of class $C^1$ is a locally Lipschitz function.
10. Let $g = (x, y)$ be the polar coordinate function with domain and range $E^2$ defined by
$$x(r, \theta) = r \cos\theta, \qquad y(r, \theta) = r \sin\theta.$$
Let $A$ be the open interval in $E^2$ given by $A = \,]0, 1[\ \times\ ]0, 3\pi[$. Show that $\beta g(A) \subset g(\beta A)$ but that $\beta g(A) \ne g(\beta A)$.
11. If in addition to the hypotheses of Theorem 8.2.8 we assume that $g$ is one to one, show that $\beta g(A) = g(\beta A)$.
8.3 EXISTENCE AND PROPERTIES OF RIEMANN-DARBOUX INTEGRALS
In Section 5 we showed that a sufficient condition that a function with domain $I \subset E^1$ have a Riemann-Darboux integral is that it be continuous. Although this condition is clearly not necessary, it is nevertheless not far off the mark. The purpose of this section is to give a necessary and sufficient condition that a real-valued function defined on a closed and bounded interval in $E^n$ has a Riemann-Darboux integral.
Let us slightly change the definition of the limit superior and limit inferior of a real-valued bounded function at a point of its domain so as not to exclude the point itself from consideration, as is done in the usual definition. Let us set
$$\overline{\operatorname{Lim}}_{x \to a} f(x) = \lim_{r \to 0}\, [\sup\{f(x) : x \in B(a, r) \cap \mathfrak{D}(f)\}],$$
$$\underline{\operatorname{Lim}}_{x \to a} f(x) = \lim_{r \to 0}\, [\inf\{f(x) : x \in B(a, r) \cap \mathfrak{D}(f)\}].$$
The function $f$ is continuous at the point $a \in \mathfrak{D}(f)$ $\Leftrightarrow$ $\overline{\operatorname{Lim}}_{x \to a} f(x) = \underline{\operatorname{Lim}}_{x \to a} f(x)$. If $f$ is not continuous at $a$, the difference of the latter two quantities gives a measure of the discontinuity at $a$. The number
$$\omega(f, a) = \overline{\operatorname{Lim}}_{x \to a} f(x) - \underline{\operatorname{Lim}}_{x \to a} f(x) \tag{8.3.1}$$
is called the oscillation of $f$ at $a$. In terms of this quantity, the function $f$ is continuous at the point $a \in \mathfrak{D}(f)$ $\Leftrightarrow$ $\omega(f, a) = 0$.
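The definition of the oscillation lends itself to a rough numerical illustration. The sketch below is only an estimate under stated assumptions: the domain is taken to be an interval $[lo, hi]$ in $E^1$, the supremum and infimum over $B(a, r) \cap \mathfrak{D}(f)$ are replaced by a maximum and minimum over finitely many sample points (so the result can underestimate the true oscillation), and the names `oscillation_estimate` and `step` are hypothetical.

```python
def oscillation_estimate(f, a, lo, hi, radii=(1e-1, 1e-2, 1e-3, 1e-4), samples=201):
    """Estimate w(f, a) on the domain [lo, hi] by sampling shrinking balls B(a, r)."""
    estimate = None
    for r in radii:
        left, right = max(lo, a - r), min(hi, a + r)
        points = [left + (right - left) * i / (samples - 1) for i in range(samples)]
        points.append(a)  # the point a itself is not excluded from consideration
        values = [f(x) for x in points]
        estimate = max(values) - min(values)  # sampled sup minus sampled inf
    return estimate  # value obtained for the smallest radius

def step(x):
    """A function with a single jump of height 1 at the origin."""
    return 1.0 if x >= 0.0 else 0.0

at_jump = oscillation_estimate(step, 0.0, -1.0, 1.0)
away_from_jump = oscillation_estimate(step, 0.5, -1.0, 1.0)
identity_at_zero = oscillation_estimate(lambda x: x, 0.0, -1.0, 1.0)
```

For the step function the estimate is 1 at the jump and 0 elsewhere, in agreement with the statement that $f$ is continuous at $a$ exactly when $\omega(f, a) = 0$.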
Before we give a necessary and sufficient condition for the existence of a Riemann-Darboux integral it will be convenient to prove a lemma. The lemma may be viewed as a generalization of the theorem that a continuous function on a compact set is uniformly continuous, and indeed the proof of the lemma is essentially the same as the proof of the latter fact.
8.3.1 Lemma. If $f$ is a bounded real-valued function with a compact domain in $E^n$, and if at every point $x \in \mathfrak{D}(f)$, $\omega(f, x) < \varepsilon/2$, then $\exists \delta > 0$ so that $\forall x, y \in \mathfrak{D}(f)$ with $|x - y| < \delta$, we have $|f(x) - f(y)| < \varepsilon$.
Proof. Suppose $x \in \mathfrak{D}(f)$. By the definition of $\overline{\operatorname{Lim}}$ and $\underline{\operatorname{Lim}}$, $\exists B(x, \delta(x))$ so that $\forall y \in B(x, \delta(x)) \cap \mathfrak{D}(f)$ we have
$$\underline{\operatorname{Lim}}_{t \to x} f(t) - \varepsilon/4 < f(y) < \overline{\operatorname{Lim}}_{t \to x} f(t) + \varepsilon/4.$$
Thus $\forall y, z \in B(x, \delta(x)) \cap \mathfrak{D}(f)$ we have
$$-\omega(f, x) - \varepsilon/2 < f(y) - f(z) < \omega(f, x) + \varepsilon/2$$
or, in other words, $\forall y, z \in B(x, \delta(x)) \cap \mathfrak{D}(f)$,
$$|f(y) - f(z)| < \omega(f, x) + \varepsilon/2 < \varepsilon. \tag{8.3.2}$$
The collection $\{B(x, \delta(x)/2) : x \in \mathfrak{D}(f)\}$ is an open covering for $\mathfrak{D}(f)$ and hence by the Heine-Borel theorem there is a finite subset $\{B(x_k, \delta(x_k)/2) : k \in \langle 1, m \rangle\}$ that covers $\mathfrak{D}(f)$. Set $\delta = \min\{\delta(x_k)/2 : k \in \langle 1, m \rangle\}$. If $x, y \in \mathfrak{D}(f)$ with $|x - y| < \delta$, and $x \in B(x_k, \delta(x_k)/2)$, then $y \in B(x_k, \delta(x_k))$. Thus it follows from (8.3.2) that $|f(x) - f(y)| < \varepsilon$.
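8.3.2 Theorem. Let $f$ be a bounded real-valued function defined on a closed interval $I$ in $E^n$. The function $f$ has a Riemann-Darboux integral $\Leftrightarrow$ $\forall \varepsilon > 0$ the compact set $\Omega(f, \varepsilon) = \{x : \omega(f, x) \ge \varepsilon\}$ has zero Jordan content.
Proof. Suppose $f$ has a Riemann-Darboux integral. Then $\forall \varepsilon > 0$ and $\forall \eta > 0$, there exists a decomposition $\Delta$ of $I$ so that
$$0 \le \bar{D}_f(\Delta) - \underline{D}_f(\Delta) = \sum_{J \in \Delta} [M(J) - m(J)]\,|J| < \varepsilon\eta.$$
Let $\Delta' = \{J : J \in \Delta \ \&\ J \cap \Omega(f, \varepsilon) \ne \emptyset\}$; if $J \in \Delta'$, it follows that $M(J) - m(J) \ge \varepsilon$. Also, since $\Omega(f, \varepsilon) \subset \cup\{J : J \in \Delta'\}$ it follows from Proposition 8.2.4 that
$$\bar{\chi}(\Omega(f, \varepsilon)) \le \bar{\chi}\big(\cup\{J : J \in \Delta'\}\big) \le \sum_{J \in \Delta'} |J|.$$
Consequently,
$$\varepsilon\, \bar{\chi}(\Omega(f, \varepsilon)) \le \sum_{J \in \Delta'} [M(J) - m(J)]\,|J| < \varepsilon\eta.$$
Hence $\forall \eta > 0$ we have $\bar{\chi}(\Omega(f, \varepsilon)) < \eta$, which means that $|\Omega(f, \varepsilon)| = 0$.
Conversely, suppose that $\forall \varepsilon > 0$, $|\Omega(f, \varepsilon)| = 0$. This being the case, there is a decomposition $\Delta$ of $I$ so that
$$\bar{D}_{\chi_{\Omega(f, \varepsilon)}}(\Delta) < \varepsilon. \tag{8.3.3}$$
Let $\Delta^* = \{J : J \in \Delta \ \&\ J \cap \Omega(f, \varepsilon) = \emptyset\}$. The set $K = \cup\{J : J \in \Delta^*\}$ is a compact subset of $\mathfrak{D}(f)$ and $\forall x \in K$, $\omega(f, x) < \varepsilon$. Thus by Lemma 8.3.1, $\exists \delta > 0$ so that $\forall x, y \in K$ with $|x - y| < \delta$ we have $|f(x) - f(y)| < 2\varepsilon$. Let $\Delta_1$ be a refinement of $\Delta$ so that $|\Delta_1| < \delta$. Let $\Delta_1^*$ be the set of all $J_1$ in $\Delta_1$ so that $\exists J \in \Delta^*$ with $J_1 \subset J$. If $L \in \Delta_1^*$, then clearly
$$M(L) - m(L) \le 2\varepsilon.$$
Consequently, if $M$ is an upper bound for $|f|$, we have
$$0 \le \bar{D}_f(\Delta_1) - \underline{D}_f(\Delta_1) = \sum_{J \in \Delta_1^*} [M(J) - m(J)]\,|J| + \sum_{J \in \Delta_1 \setminus \Delta_1^*} [M(J) - m(J)]\,|J| \le 2\varepsilon\,|I| + 2M\, \bar{D}_{\chi_{\Omega(f, \varepsilon)}}(\Delta) \le 2\varepsilon\,[\,|I| + M\,].$$
This shows that the integral of $f$ exists.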
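A one-dimensional instance makes the mechanism of the theorem concrete. The sketch below rests on assumptions not made in the text: $f$ is taken to be nondecreasing on $[0, 1]$, so that on each cell the infimum and supremum are the values at the left and right endpoints, and the decomposition is into $n$ equal cells. For the jump function used here, $\Omega(f, \varepsilon) = \{1/2\}$ for $0 < \varepsilon \le 1$, a set of zero Jordan content. The names `darboux_gap_monotone` and `jump` are hypothetical.

```python
from fractions import Fraction

def darboux_gap_monotone(f, n):
    """Upper minus lower Darboux sum of a nondecreasing f on [0, 1], n equal cells."""
    h = Fraction(1, n)
    gap = Fraction(0)
    for k in range(n):
        left, right = k * h, (k + 1) * h
        # For nondecreasing f: m(J) = f(left) and M(J) = f(right).
        gap += (f(right) - f(left)) * h
    return gap

def jump(x):
    """Nondecreasing function with a single jump of height 1 at x = 1/2."""
    return Fraction(1) if x >= Fraction(1, 2) else Fraction(0)

gap_10 = darboux_gap_monotone(jump, 10)
gap_1000 = darboux_gap_monotone(jump, 1000)
```

Only the cell containing the jump contributes to the difference of the sums, and its volume tends to zero as the decomposition is refined, which is the situation described by (8.3.3).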

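There is one small point we must address ourselves to before the proof can be considered complete. We should prove the statement that $\Omega(f, \varepsilon)$ is compact. Although it is not needed in the proof of this theorem, it will be needed later on. Since $\Omega(f, \varepsilon)$ is contained in the closed bounded set $I$, it is enough to show that $\Omega(f, \varepsilon)$ is closed in $I$, or what is the same thing, that $I \setminus \Omega(f, \varepsilon)$ is relatively open in $I$. Suppose $a \in I \setminus \Omega(f, \varepsilon)$; then $\omega(f, a) < \varepsilon$. Now, $\forall \eta > 0$, $\exists \rho > 0$ so that $\forall x \in B(a, \rho) \cap I$ we have
$$\underline{\operatorname{Lim}}_{t \to a} f(t) - \eta/2 < f(x) < \overline{\operatorname{Lim}}_{t \to a} f(t) + \eta/2.$$
In particular this means that $\forall x \in B(a, \rho) \cap I$, we have
$$\underline{\operatorname{Lim}}_{t \to a} f(t) - \eta/2 \le \overline{\operatorname{Lim}}_{t \to x} f(t) \le \overline{\operatorname{Lim}}_{t \to a} f(t) + \eta/2,$$
$$\underline{\operatorname{Lim}}_{t \to a} f(t) - \eta/2 \le \underline{\operatorname{Lim}}_{t \to x} f(t) \le \overline{\operatorname{Lim}}_{t \to a} f(t) + \eta/2.$$
Hence, $\forall x \in B(a, \rho) \cap I$, we have
$$\omega(f, x) = \overline{\operatorname{Lim}}_{t \to x} f(t) - \underline{\operatorname{Lim}}_{t \to x} f(t) \le \omega(f, a) + \eta.$$
If we fix $\eta < \varepsilon - \omega(f, a)$, then $\exists \rho > 0$ so that $\forall x \in B(a, \rho) \cap I$, $\omega(f, x) < \varepsilon$. Consequently, $I \setminus \Omega(f, \varepsilon)$ is relatively open in $I$.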
Proof of Theorem 8.2.5. Suppose $A$ is a bounded set in $E^n$. Then the characteristic function $\chi_A$ of $A$ is continuous except at the points of $\beta A$. For every $x \in \beta A$, $\omega(\chi_A, x) = 1$ and thus $\forall \varepsilon$ so that $0 < \varepsilon \le 1$, $\Omega(\chi_A, \varepsilon) = \beta A$. Embed $A$ into an interval $I$ and the last theorem tells us that
$$\overline{\int_I} \chi_A(x)\,dx = \underline{\int_I} \chi_A(x)\,dx$$
if and only if $|\Omega(\chi_A, \varepsilon)| = |\beta A| = 0$.
An immediate corollary of Theorems 8.3.2 and 8.2.5 is the following.
8.3.3 Corollary. Suppose $A$ is a Jordan-measurable set and $f$ is a bounded continuous real-valued function with $\mathfrak{D}(f) = A$. Then $f$ has a Riemann-Darboux integral.
Proof. Embed $A$ into $I$; then $f_A$ is continuous except possibly on $\beta A$. Thus $\forall \varepsilon > 0$, $\Omega(f_A, \varepsilon) \subset \beta A$, so that applying Proposition 8.2.4 and Theorem 8.2.5 we get $|\Omega(f_A, \varepsilon)| = 0$. The proof is completed by an application of Theorem 8.3.2.


To put the result of Theorem 8.3.2 into a more usable form, it is
necessary to introduce the concept of an outer Lebesgue measure.
The outer Lebesgue measure can be defined in a manner analogous to
formula (8.2.6). The basic difference is that in defining outer Lebesgue
measure we allow a countable number of intervals in the covering,
rather than only a finite number, as in the case of outer Jordan content.
Although this does not seem to be much of a difference, actually it
turns out to be quite profound, and ultimately leads to a theory of
integration that is much more flexible and useful than the theory of
Riemann-Darboux integration.
8.3.4 Definition. If $A \subset E^n$, the outer Lebesgue measure of $A$ is defined as
$$\bar{\lambda}(A) = \text{g.l.b.} \Big\{ \sum_{I \in \mathcal{U}} |I| : \mathcal{U} \in \Gamma \Big\}, \tag{8.3.4}$$
where $\Gamma$ is the collection of all coverings of $A$ that consist of a countable number of open intervals.
If $A = \emptyset$, then $A$ can be covered by a zero number of open intervals; that is, there is a void covering $\mathcal{U}$ of $\emptyset$. In such a case we consider the sum over $\mathcal{U}$ to be zero, and thus $\bar{\lambda}(\emptyset) = 0$. However, even if void coverings are not allowed, it is still clear that $\bar{\lambda}(\emptyset) = 0$. Note also that we have not taken account of the fact that the series $\sum_{I \in \mathcal{U}} |I|$ may not be convergent. In the study of measure theory, one usually extends $R$ to a set $R \cup \{\infty\}$, where $\infty \notin R$, and extends the order relation on $R$ to $R \cup \{\infty\}$ by taking $x < \infty$, $\forall x \in R$. If a series of nonnegative terms does not converge in the ordinary sense we say it converges to $\infty$. Hence in the new system every series of nonnegative terms is convergent, and we set $\bar{\lambda}(A) = \infty$ if every series on the right side of (8.3.4) converges to $\infty$.
The new element $\infty$ is introduced to be able to make the statements in measure theory in a neater way. Of course, the addition and multiplication functions cannot be extended to the new system so as to maintain all their relevant properties, and care must be exercised on that score. However, addition and multiplication can be extended to some extent and many of the usual properties of the order relation continue to hold.
If $\bar{\lambda}(A) = 0$, we shall usually say that $A$ has zero Lebesgue measure rather than zero outer Lebesgue measure. Of course, as we already have noted, the null set always has zero Lebesgue measure. Also note that every countable number of points in $E^n$ has zero Lebesgue measure. Indeed, since each point can be covered by an open interval of arbitrarily small volume, we see that a countable set can be covered by open intervals for which the sum of the volumes is arbitrarily small. This fact is actually a case of the following proposition.
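The covering argument for a countable set of points can be written out explicitly in $E^1$. The following is a minimal sketch under stated assumptions: the countable set is represented by a finite initial segment of an enumeration of the rationals in $[0, 1]$ (here those with denominator at most 12), and the $k$th point receives an open interval of length $\varepsilon/2^{k+2}$, so that the total length stays below $\varepsilon/2$ however many points are enumerated. The function names are hypothetical.

```python
from fractions import Fraction

def cover_of_points(points, eps):
    """Open intervals ]p - r, p + r[ whose lengths are eps / 2**(k + 2), k = 0, 1, ..."""
    cover = []
    for k, p in enumerate(points):
        radius = eps / 2 ** (k + 3)   # half of the length eps / 2**(k + 2)
        cover.append((p - radius, p + radius))
    return cover

def total_length(cover):
    """Sum of the volumes (lengths) of the intervals in the covering."""
    return sum(b - a for a, b in cover)

# A finite initial segment of an enumeration of the rationals in [0, 1].
points = sorted({Fraction(p, q) for q in range(1, 13) for p in range(q + 1)})
eps = Fraction(1, 100)
cover = cover_of_points(points, eps)
length = total_length(cover)
```

Because the lengths form a geometric series, adding further points of the enumeration never pushes the total past $\varepsilon/2$, which is the content of the remark that a countable set has zero Lebesgue measure.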
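8.3.5 Proposition. (a) If $\mathcal{A}$ is a countable collection of subsets of $E^n$ and $B = \cup\{A : A \in \mathcal{A}\}$, then
$$\bar{\lambda}(B) \le \sum_{A \in \mathcal{A}} \bar{\lambda}(A). \tag{8.3.5}$$
(b) If $A \subset B$, then $\bar{\lambda}(A) \le \bar{\lambda}(B)$.
Proof. Let $\Phi$ be a function with domain $N_0$ and range $\mathcal{A} \cup \{\emptyset\}$, where if $\mathcal{A}$ is finite with $m$ elements, then $\forall k > m$, $\Phi(k) = \emptyset$. Let us suppose that $\sum_{k=0}^{\infty} \bar{\lambda}(\Phi(k)) < \infty$, since otherwise (8.3.5) is clearly true. According to the definition of outer measure, $\forall \varepsilon > 0$ and $\forall k \in N_0$, there exists a covering $\mathcal{U}_k$ of $\Phi(k)$ so that
$$\sum_{I \in \mathcal{U}_k} |I| \le \bar{\lambda}(\Phi(k)) + \varepsilon/2^{k+1}.$$
Now, $\mathcal{U} = \cup\{\mathcal{U}_k : k \in N_0\}$ is an open covering for $B$ and hence
$$\bar{\lambda}(B) \le \sum_{k=0}^{\infty} \Big( \sum_{I \in \mathcal{U}_k} |I| \Big) \le \sum_{k=0}^{\infty} \bar{\lambda}(\Phi(k)) + \varepsilon.$$
Since this is true $\forall \varepsilon > 0$ we have proved (a). Part (b) is an immediate consequence of the fact that every covering for $B$ is a covering for $A$.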
The last proposition says, in particular, that the union of a countable

number of sets of zero Lebesgue measure is again a set of zero Lebesgue
measure. The reader should not come to the conclusion that sets of zero
Lebesgue measure consist only of a countable number of points. Indeed,
Cantor's set has zero Lebesgue measure and we have asked the reader
to verify this in Exercise 10 at the end of this section. We now give a
connection between outer Lebesgue measure and outer Jordan content.
8.3.6 Proposition. If $A$ is a bounded set in $E^n$, then
$$\bar{\lambda}(A) \le \bar{\chi}(A), \tag{8.3.6}$$
and equality holds if $A$ is compact.
Proof. For every $\varepsilon > 0$, there is a covering of $A$ by a finite number of closed intervals $\{I_k : k \in \langle 1, m \rangle\}$ so that
$$\sum_{k=1}^{m} |I_k| < \bar{\chi}(A) + \frac{\varepsilon}{2}.$$
Clearly, $\forall k \in \langle 1, m \rangle$ there exists an open interval $J_k$ so that $I_k \subset J_k$ and
$$\bar{\lambda}(A) \le \sum_{k=1}^{m} |J_k| < \sum_{k=1}^{m} |I_k| + \frac{\varepsilon}{2} < \bar{\chi}(A) + \varepsilon.$$
Since this is true $\forall \varepsilon > 0$ we have (8.3.6).
Suppose now that $A$ is compact. For every $\varepsilon > 0$ there is a covering $\mathcal{U}$ of $A$ by a countable number of open intervals so that
$$\sum_{I \in \mathcal{U}} |I| < \bar{\lambda}(A) + \varepsilon.$$
Since $A$ is compact there exists a finite number $\{I_k : k \in \langle 1, m \rangle\}$ from this set which cover $A$. From Proposition 8.2.4 we get
$$\bar{\chi}(A) \le \sum_{k=1}^{m} |I_k| < \bar{\lambda}(A) + \varepsilon.$$
Since this is true $\forall \varepsilon > 0$ we get equality in (8.3.6).
8.3.7 Theorem (Lebesgue). Suppose $f$ is a bounded real-valued function with domain the closed interval $I$. The function $f$ is Riemann-Darboux integrable $\Leftrightarrow$ the set of discontinuities of $f$ has zero Lebesgue measure.
Proof. Suppose $f$ is Riemann-Darboux integrable. The set of points where $f$ is not continuous is exactly the set $\Omega(f) = \cup\{\Omega(f, 1/n) : n \in N\}$. By Theorem 8.3.2 each set $\Omega(f, 1/n)$ has zero Jordan content, and hence by Proposition 8.3.6 has zero Lebesgue measure. By Proposition 8.3.5 it follows that $\Omega(f)$ has zero Lebesgue measure.
Conversely, suppose $\Omega(f)$ has zero Lebesgue measure. Then $\forall \varepsilon > 0$, since $\Omega(f, \varepsilon) \subset \Omega(f)$ it follows that $\Omega(f, \varepsilon)$ has zero Lebesgue measure. Since $\Omega(f, \varepsilon)$ is compact, it follows from Proposition 8.3.6 that it has zero Jordan content. An application of Theorem 8.3.2 completes the proof.
The theorem we have just proved makes it clear that many of the hypotheses we made in the theorems of Section 5.2 were redundant. The redundancies occurred in having to assume that various combinations of integrable functions were integrable. We first state the generalization to higher dimensions of Theorem 5.1.9 and Theorem 5.2.1 (a), (b), and (c).
8.3.8 Theorem. If $f$ and $g$ are real-valued functions defined on a closed interval $I$ and are Riemann-Darboux integrable, then
(a) $\forall a, b \in R$, $af + bg$ is integrable and
$$\int_I [a f(x) + b g(x)]\,dx = a \int_I f(x)\,dx + b \int_I g(x)\,dx.$$
(b) $f \ge 0$ implies $\int_I f(x)\,dx \ge 0$.
(c) $|f|$ is integrable and $\big| \int_I f(x)\,dx \big| \le \int_I |f(x)|\,dx$.
(d) $fg$ is integrable.
Proof. The integrability property of $af + bg$, $|f|$, and $fg$ is an immediate consequence of Theorem 8.3.7. The remainder of the proof is exactly the same as the proof of Theorem 5.2.1 and we shall not repeat it.
The generalization of Theorem 5.2.1(d) is the following.
8.3.9 Theorem. If $A$ and $B$ are bounded Jordan measurable subsets of $E^n$, and $f$ is a bounded real-valued function with domain $A \cup B$ which is continuous except for a set of zero Lebesgue measure, then the Riemann-Darboux integrals of $f$ over $A \cup B$, $A \cap B$, $A$, and $B$ exist and
$$\int_{A \cup B} f(x)\,dx + \int_{A \cap B} f(x)\,dx = \int_A f(x)\,dx + \int_B f(x)\,dx. \tag{8.3.7}$$
Proof. Since $A$ and $B$ are Jordan measurable, $|\beta A| = |\beta B| = 0$. Further, since $A^\circ \cup B^\circ \subset (A \cup B)^\circ$ and $\overline{A \cup B} = \bar{A} \cup \bar{B}$ we have $\beta(A \cup B) = \overline{A \cup B} \setminus (A \cup B)^\circ \subset (\bar{A} \cup \bar{B}) \setminus (A^\circ \cup B^\circ)$. But $(\bar{A} \cup \bar{B}) \setminus (A^\circ \cup B^\circ) \subset (\bar{A} \setminus A^\circ) \cup (\bar{B} \setminus B^\circ) = \beta A \cup \beta B$. Thus $\beta(A \cup B) \subset \beta A \cup \beta B$, and hence $|\beta(A \cup B)| = 0$. This means that $\beta(A \cup B)$ has zero Lebesgue measure and consequently $\chi_{A \cup B}$ is continuous except for a set of zero Lebesgue measure. Since $\chi_{A \cup B} = \chi_A + \chi_B - \chi_{A \cap B}$, it follows that $\chi_{A \cap B}$ is continuous except on a set of zero Lebesgue measure, which means, of course, that $\bar{\chi}(\beta(A \cap B)) = |\beta(A \cap B)| = 0$.
Embed $A \cup B$ into a closed interval $I$ and we get
$$f_{A \cup B} + f_{A \cap B} = f_A + f_B.$$
Since the boundaries of all the sets in question have zero Lebesgue measure, it follows that all the above functions are continuous except on sets of zero Lebesgue measure. Hence by Theorem 8.3.7 the integrals of all these functions exist. The formula (8.3.7) is then a consequence of Theorem 8.3.8(a).
8.3.10 Corollary. If $A$ and $B$ are bounded Jordan measurable sets, then $A \cup B$ and $A \cap B$ are Jordan measurable and
$$|A \cup B| + |A \cap B| = |A| + |B|.$$
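The identity of the corollary can be checked directly for two closed intervals in $E^2$. The sketch below works under simplifying assumptions that are not part of the corollary: both sets are rectangles with integer corners, so that the content of the union can be found independently by counting unit cells. The names `area`, `intersection`, and `union_area_by_cells` are hypothetical.

```python
def area(rect):
    """Content of the rectangle ((x0, x1), (y0, y1)); zero if it is void."""
    (x0, x1), (y0, y1) = rect
    return max(0, x1 - x0) * max(0, y1 - y0)

def intersection(r, s):
    """Intersection of two rectangles, possibly void."""
    (a0, a1), (b0, b1) = r
    (c0, c1), (d0, d1) = s
    return ((max(a0, c0), min(a1, c1)), (max(b0, d0), min(b1, d1)))

def union_area_by_cells(r, s):
    """Count unit cells [i, i+1] x [j, j+1] lying in r or in s (integer corners assumed)."""
    def contains(rect, i, j):
        (x0, x1), (y0, y1) = rect
        return x0 <= i and i + 1 <= x1 and y0 <= j and j + 1 <= y1
    xs = range(min(r[0][0], s[0][0]), max(r[0][1], s[0][1]))
    ys = range(min(r[1][0], s[1][0]), max(r[1][1], s[1][1]))
    return sum(1 for i in xs for j in ys if contains(r, i, j) or contains(s, i, j))

A = ((0, 4), (0, 3))
B = ((2, 7), (1, 5))
left_side = union_area_by_cells(A, B) + area(intersection(A, B))
right_side = area(A) + area(B)
```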
The first mean value theorem, 5.2.6, has a natural analogue in higher dimensions and we leave it for the reader to formulate. The analogues of some of the other theorems in Section 5.2 become very much more complicated in the higher-dimensional situation. For example, Theorem 5.2.4 must be interpreted in a somewhat more complicated way and has a considerably more complicated proof. We shall carry this out in Section 8.5. Theorem 5.2.2 and its corollary on integration by parts become Stokes' theorem in higher dimensions and we shall delay this until Chapter 9.
Exercises
1. The set of rational numbers in $[0, 1]$ has zero Lebesgue measure. Since the characteristic function of this set of rationals is not Riemann-Darboux integrable, why does this not contradict Theorem 8.3.7?
2. For every bounded set $A \subset E^n$ show that
$$\underline{\chi}(A) \le \bar{\lambda}(A).$$
Give an example of a set for which equality does not hold. At any rate, if $A$ is Jordan measurable, this inequality shows that $|A| = \bar{\lambda}(A)$.
3. Give an example of a compact set that is not Jordan measurable. [Hint: Let $A$ be the complement of the set suggested in Exercise 6 of Section 8.2. Show that $\underline{\chi}(A) = 0$, but $\bar{\chi}(A) \ge 1/2$.]
4. Give an example of a function with domain $[0, 1]$ which is discontinuous at every rational number in $[0, 1]$ and yet is Riemann-Darboux integrable.
5. Suppose $f$ is a real-valued function defined on a bounded set $A \subset E^n$ and $f$ is integrable. Suppose $g$ is another real-valued bounded function defined on $A$ and $f(x) = g(x)$ except on a set of zero Jordan content. Show that $g$ is integrable and
$$\int_A f(x)\,dx = \int_A g(x)\,dx.$$
6. Show that every function of bounded variation with domain $[a, b]$ has a Riemann-Darboux integral.
7. If $f$ and $g$ are defined on an interval $I$ and are Riemann-Darboux integrable, show that the functions
$$f \vee g = \max(f, g), \qquad f \wedge g = \min(f, g)$$
are Riemann-Darboux integrable.
8. Suppose $f$ is defined and integrable on an interval $I \subset E^m$ and $g$ is defined and integrable on an interval $J \subset E^n$. Show that the function $h$ defined by
$$h(x, y) = f(x)\,g(y)$$
is integrable on $I \times J$.
9. Suppose that $f$ is a real-valued integrable function with domain the interval $I \subset E^n$. Show that the graph of $f$ has $(n + 1)$-dimensional Jordan content zero. The graph of $f$ is the set of ordered pairs $\{(x, f(x)) : x \in I\}$ considered as a subset of $E^{n+1}$.
10. Show that Cantor's set has zero Jordan content and hence zero Lebesgue measure. (Hint: The result of Exercise 8 of Section 8.2 may be very useful.)
8.4 ITERATED INTEGRATION
To evaluate higher-dimensional integrals, usually the most practical thing to do is to integrate with respect to one variable at a time. Thus the evaluation of higher-dimensional integrals is reduced to the problem of evaluating several one-dimensional integrals. The purpose of this section is to show that this can be done. As is usual, we shall identify $E^m \times E^n$ with $E^{m+n}$. If $f$ is a bounded real-valued function with domain the interval $I \times J \subset E^m \times E^n$, then
$$\overline{\int_J} f(x, y)\,dy \qquad \text{and} \qquad \underline{\int_J} f(x, y)\,dy$$
define real-valued bounded functions on $I$, and hence these functions have upper and lower Darboux integrals.
8.4.1 Theorem. If $f$ is a real-valued bounded function with domain the interval $I \times J \subset E^{m+n}$, then
$$\underline{\int_{I \times J}} f(z)\,dz \le \underline{\int_I} \Big[ \underline{\int_J} f(x, y)\,dy \Big] dx \le \overline{\int_I} \Big[ \overline{\int_J} f(x, y)\,dy \Big] dx \le \overline{\int_{I \times J}} f(z)\,dz. \tag{8.4.1}$$
Proof. Let us set
$$L(f) = \underline{\int_I} \Big[ \underline{\int_J} f(x, y)\,dy \Big] dx.$$
Then the following properties hold (see Exercises 1 and 2 of Section 8.1):
(a) $f \le g \Rightarrow L(f) \le L(g)$.
(b) $L(f) + L(g) \le L(f + g)$.
(c) $L(af) = aL(f)$, $\forall a \in R$.
(d) If $K \subset I \times J$ is an interval, then $L(\chi_{K^\circ}) = L(\chi_K) = |K|$.
Let $\Delta$ be any decomposition of $I \times J$ and $\forall K \in \Delta$ set $m(K) = \inf\{f(z) : z \in K\}$. Also set $K^* = K^\circ$ if $m(K) \ge 0$ and $K^* = K$ if $m(K) < 0$. If we put
$$f_\Delta = \sum_{K \in \Delta} m(K)\, \chi_{K^*},$$
then $f_\Delta \le f$. Hence from property (a) we get
$$L(f_\Delta) \le L(f).$$
From the properties (b), (c), and (d) we get
$$\underline{D}_f(\Delta) = \sum_{K \in \Delta} m(K)\,|K| \le L(f_\Delta).$$
Thus
$$\underline{D}_f(\Delta) \le L(f),$$
and taking the supremum over $\Delta$ on the left we have the left-hand inequality of the theorem.
The right-hand inequality of the theorem follows by similar reasoning. The middle inequality is of course a well-known inequality for upper and lower Darboux integrals.
8.4.2 Corollary. Suppose $f$ is a Riemann-Darboux integrable function with domain the interval $I \times J \subset E^{m+n}$. Then $\exists A \subset I$ so that $\bar{\lambda}(I \cap A^c) = 0$ and $\forall x \in A$, the function $f_x$ with domain $J$ and defined by $f_x(y) = f(x, y)$ is Riemann-Darboux integrable, the functions $g$ and $h$ given by
$$g(x) = \underline{\int_J} f(x, y)\,dy, \qquad h(x) = \overline{\int_J} f(x, y)\,dy,$$
are integrable on $I$, and
$$\int_{I \times J} f(z)\,dz = \int_I \Big[ \underline{\int_J} f(x, y)\,dy \Big] dx = \int_I \Big[ \overline{\int_J} f(x, y)\,dy \Big] dx.$$
Proof. Clearly $g \le h$ and from Theorem 8.4.1 we get
$$\underline{\int_{I \times J}} f(z)\,dz \le \underline{\int_I} g(x)\,dx \le \overline{\int_I} g(x)\,dx \le \overline{\int_I} h(x)\,dx \le \overline{\int_{I \times J}} f(z)\,dz. \tag{8.4.2}$$
This shows that the upper and lower integrals of $g$ are equal, and thus $g$ is integrable on $I$. In a similar way, we show that $h$ is integrable. Further, the above chain of inequalities shows that
$$\int_I [h(x) - g(x)]\,dx = 0.$$
Now, the function $F = h - g$ is nonnegative and integrable and hence is continuous except on a set of zero Lebesgue measure. Further at every point $a$ of continuity of $F$ we must have $F(a) = 0$. For, if we suppose $F(a) > 0$, then there is an $m$-dimensional interval $M \subset I$, with center at $a$, so that $\forall x \in M$, $F(x) \ge F(a)/2$. Hence
$$\int_I F(x)\,dx \ge \frac{F(a)}{2}\,|M| > 0,$$
which is a contradiction.
We have proved that
$$g(x) = \underline{\int_J} f(x, y)\,dy = \overline{\int_J} f(x, y)\,dy = h(x)$$
except on a set of zero Lebesgue measure. This taken in conjunction with (8.4.2) concludes the proof of the corollary.
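For a continuous integrand the corollary says that the integral over $I \times J$ can be computed one variable at a time. The following is a minimal numerical sketch, assuming the illustrative integrand $f(x, y) = x y^2$ on $[0, 1] \times [0, 2]$ and using the midpoint rule as a stand-in for the one-dimensional integrals; the function names are hypothetical and the comparison value $4/3$ comes from elementary calculus.

```python
def midpoint_integral(func, a, b, n):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def iterated_integral(f, x_range, y_range, n=200):
    """Integrate in y for each fixed x, then integrate the result in x."""
    def inner(x):
        return midpoint_integral(lambda y: f(x, y), y_range[0], y_range[1], n)
    return midpoint_integral(inner, x_range[0], x_range[1], n)

def f(x, y):
    """Illustrative integrand (an assumption of this sketch)."""
    return x * y * y

approx = iterated_integral(f, (0.0, 1.0), (0.0, 2.0))
exact = 0.5 * (8.0 / 3.0)   # (integral of x over [0, 1]) * (integral of y^2 over [0, 2])
```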
We should point out that it is possible that iterated integrals may exist without the integral existing. For example, in $E^2$ let $I = [0, 1] \times [0, 1]$ and let $A$ be the subset of $I$ consisting of all rationals of the form $(p/2^n, q/2^n)$, where $p$ and $q$ are odd integers. It is, we think, clear that $A$ is dense in $I$. Also, every line parallel to the $x$ axis and every line parallel to the $y$ axis contains only a finite number (possibly zero) of points in $A$. Let us put $f = 1 - \chi_A$; then $f$ has no points of continuity in $I$ and consequently it is not Riemann-Darboux integrable over $I$. On the other hand, for every fixed $y$ the function defined on $[0, 1]$ by $x \to f(x, y)$ is integrable, and a similar result is valid for every fixed $x$. Further, we have
$$\int_0^1 f(x, y)\,dy = 1 \qquad \text{and} \qquad \int_0^1 f(x, y)\,dx = 1,$$
and thus
$$\int_0^1 \Big[ \int_0^1 f(x, y)\,dy \Big] dx = \int_0^1 \Big[ \int_0^1 f(x, y)\,dx \Big] dy = 1.$$
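The claim that each horizontal line meets $A$ in only finitely many points can be made explicit. The sketch below assumes that the exponent $n$ in $(p/2^n, q/2^n)$ is a nonnegative integer common to both coordinates, as the formula suggests; the function name is hypothetical. For a rational $y$ written in lowest terms, the exponent $n$ is determined by $y$, and the admissible abscissas are the numbers $p/2^n$ with $p$ odd.

```python
from fractions import Fraction

def points_of_A_on_horizontal_line(y):
    """All x in [0, 1] with (x, y) in A, for a rational y in [0, 1]."""
    denom = y.denominator
    # y must equal q / 2**n in lowest terms with q odd.
    if y.numerator % 2 == 0 or denom & (denom - 1) != 0:
        return []
    n = denom.bit_length() - 1            # denom == 2**n
    return [Fraction(p, 2 ** n) for p in range(1, 2 ** n + 1, 2)]

on_three_eighths = points_of_A_on_horizontal_line(Fraction(3, 8))
on_one_third = points_of_A_on_horizontal_line(Fraction(1, 3))
on_one_half = points_of_A_on_horizontal_line(Fraction(1, 2))
```

Since a finite set has zero Jordan content, the function $x \to f(x, y)$ differs from the constant 1 only on such a set, which is why each one-dimensional integral equals 1.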
Let us now work out a simple example that makes use of iterated integrals. This is the type of exercise that is usually done in the elementary calculus. However, the reader may possibly find it instructive to look at it from the slightly more sophisticated point of view that we have developed in this section. The problem is to find the volume enclosed by the ellipsoid whose equation is
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1, \qquad a, b, c \in R^+.$$
In precise terms this means that we wish to find the integral of the constant function, with value 1, whose domain is the set
$$A = \Big\{ (x, y, z) : \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \le 1 \Big\}.$$
Let us embed $A$ into the interval $I \times J \times K$, where $I = [-a, a]$, $J = [-b, b]$, and $K = [-c, c]$. By definition the volume $V$ is
$$V = \int_{I \times J \times K} \chi_A(x, y, z)\,d(x, y, z).$$
Now, for fixed $(y, z)$ so that $(y/b)^2 + (z/c)^2 \le 1$, we have
$$\chi_A(x, y, z) = \begin{cases} 1, & |x| \le a[1 - (y/b)^2 - (z/c)^2]^{1/2}, \\ 0, & \text{otherwise}, \end{cases}$$
and, of course, $\chi_A(x, y, z) = 0$ if $(y/b)^2 + (z/c)^2 > 1$. Let us set
$$\Phi(y, z) = \begin{cases} a[1 - (y/b)^2 - (z/c)^2]^{1/2}, & (y/b)^2 + (z/c)^2 \le 1, \\ 0, & \text{otherwise}. \end{cases}$$
Then we get
$$\int_I \chi_A(x, y, z)\,dx = \int_{-\Phi(y, z)}^{\Phi(y, z)} dx = 2\Phi(y, z).$$
Thus
$$V = 2 \int_{J \times K} \Phi(y, z)\,d(y, z) = 2 \int_K \Big[ \int_J \Phi(y, z)\,dy \Big] dz.$$
Let us now set
$$\Psi(z) = \begin{cases} b[1 - (z/c)^2]^{1/2}, & (z/c)^2 \le 1, \\ 0, & \text{otherwise}. \end{cases}$$
Then, for fixed $z$,
$$\int_J \Phi(y, z)\,dy = \int_{-\Psi(z)}^{\Psi(z)} \Phi(y, z)\,dy.$$
Hence
$$V = 2 \int_K \Big[ \int_{-\Psi(z)}^{\Psi(z)} \Phi(y, z)\,dy \Big] dz = \frac{4}{3} \pi abc.$$
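The reduction $V = 2\int_{J \times K} \Phi(y, z)\,d(y, z)$ can be checked numerically. The following is a rough sketch, assuming a midpoint grid on $J \times K$ in place of the Riemann-Darboux integral and the illustrative semi-axes $a = 1$, $b = 2$, $c = 3$; the function name is hypothetical, and the agreement with $\frac{4}{3}\pi abc$ is only approximate.

```python
import math

def ellipsoid_volume(a, b, c, n=300):
    """Approximate V = 2 * (integral over J x K of Phi(y, z)) on a midpoint grid."""
    hy, hz = 2.0 * b / n, 2.0 * c / n
    total = 0.0
    for j in range(n):
        z = -c + (j + 0.5) * hz
        for i in range(n):
            y = -b + (i + 0.5) * hy
            s = 1.0 - (y / b) ** 2 - (z / c) ** 2
            if s > 0.0:
                total += a * math.sqrt(s)   # Phi(y, z); zero outside the ellipse
    return 2.0 * total * hy * hz

a, b, c = 1.0, 2.0, 3.0
approx = ellipsoid_volume(a, b, c)
exact = 4.0 / 3.0 * math.pi * a * b * c
```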
DIFFERENTIATION OF INTEGRALS
In the previous example we have come across integrals of the form
$$g(y) = \int_{\varphi(y)}^{\Phi(y)} f(x, y)\,dx.$$
It is frequently desirable to have sufficient conditions under which $g$ is differentiable.
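8.4.3 Theorem. Suppose $f$ is defined in the rectangle $[a, b] \times [c, d] \subset E^2$, is integrable on $[a, b]$ for each $y \in [c, d]$, and $D_2 f$ exists and is continuous on $[a, b] \times [c, d]$. If we set
$$g(y) = \int_a^b f(x, y)\,dx,$$
then $g'$ is defined and continuous on $[c, d]$, and moreover
$$g'(y) = \int_a^b D_2 f(x, y)\,dx. \tag{8.4.3}$$
Proof. By an application of the Mean Value Theorem we get
$$\frac{g(y + h) - g(y)}{h} = \frac{1}{h} \int_a^b [f(x, y + h) - f(x, y)]\,dx = \int_a^b D_2 f(x, y + \theta h)\,dx,$$
where $0 < \theta < 1$. Hence
$$\Big| \frac{g(y + h) - g(y)}{h} - \int_a^b D_2 f(x, y)\,dx \Big| = \Big| \int_a^b [D_2 f(x, y + \theta h) - D_2 f(x, y)]\,dx \Big| \le \int_a^b |D_2 f(x, y + \theta h) - D_2 f(x, y)|\,dx.$$
Now, $h$ is always chosen so that $y + h \in [c, d]$. Hence $(x, y + \theta h)$ and $(x, y)$ are in the compact set $[a, b] \times [c, d]$. Since $D_2 f$ is uniformly continuous on this set, $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $y + h \in [c, d]$ and $|h| < \delta$, and thus $|(x, y + \theta h) - (x, y)| = |\theta h| < \delta$, imply
$$|D_2 f(x, y + \theta h) - D_2 f(x, y)| < \varepsilon/(b - a).$$
Consequently, $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $y + h \in [c, d]$ and $|h| < \delta$ imply
$$\Big| \frac{g(y + h) - g(y)}{h} - \int_a^b D_2 f(x, y)\,dx \Big| < \varepsilon.$$
This gives (8.4.3). Since the continuity of $g'$ is immediate, the proof is complete.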
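Formula (8.4.3) can be tested numerically on a particular integrand. The sketch below makes several assumptions for illustration only: $f(x, y) = \sin(xy)$ on $[0, 1] \times [c, d]$, the midpoint rule for the integrals, and a central difference quotient for $g'$. The function names are hypothetical; the closed form $g(y) = (1 - \cos y)/y$ for $y \ne 0$ is used only as an independent check.

```python
import math

def midpoint_integral(func, a, b, n=2000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def f(x, y):
    return math.sin(x * y)

def d2f(x, y):
    """Partial derivative of f with respect to its second variable."""
    return x * math.cos(x * y)

def g(y):
    return midpoint_integral(lambda x: f(x, y), 0.0, 1.0)

def g_prime_by_formula(y):
    """Right side of (8.4.3): integrate D_2 f over [0, 1]."""
    return midpoint_integral(lambda x: d2f(x, y), 0.0, 1.0)

def g_prime_by_difference(y, h=1e-4):
    """Central difference quotient of g."""
    return (g(y + h) - g(y - h)) / (2.0 * h)

y0 = 1.3
by_formula = g_prime_by_formula(y0)
by_difference = g_prime_by_difference(y0)
closed_form = (y0 * math.sin(y0) - 1.0 + math.cos(y0)) / y0 ** 2
```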
8.4.4 Corollary. Under the hypotheses of Theorem 8.4.3 and the additional hypotheses that $a \le \varphi(y) \le \Phi(y) \le b$, where $\varphi$ and $\Phi$ are defined and differentiable on $[c, d]$, and $\forall y \in [c, d]$ the function $f_y$ on $[a, b]$ with values $f_y(x) = f(x, y)$ is continuous, then
$$g(y) = \int_{\varphi(y)}^{\Phi(y)} f(x, y)\,dx$$
is differentiable, and
$$g'(y) = \int_{\varphi(y)}^{\Phi(y)} D_2 f(x, y)\,dx + f(\Phi(y), y)\,\Phi'(y) - f(\varphi(y), y)\,\varphi'(y).$$
Proof. Set
$$G(y, u, v) = \int_u^v f(x, y)\,dx.$$
By the use of the chain rule we get
$$g'(y) = D_1 G(y, \varphi(y), \Phi(y)) + D_2 G(y, \varphi(y), \Phi(y))\,\varphi'(y) + D_3 G(y, \varphi(y), \Phi(y))\,\Phi'(y).$$
Using the last theorem we get
$$D_1 G(y, \varphi(y), \Phi(y)) = \int_{\varphi(y)}^{\Phi(y)} D_2 f(x, y)\,dx,$$
and the other terms are obtained by the use of Theorem 5.2.5.
Exercises
1. Suppose $f$ is defined and continuous on the rectangle $[a, b] \times [c, d]$. Suppose further that $a \le \varphi(y) \le \Phi(y) \le b$, where $\varphi$ and $\Phi$ are continuous real-valued functions defined on $[c, d]$. Show that
$$g(y) = \int_{\varphi(y)}^{\Phi(y)} f(x, y)\,dx$$
is continuous.
2. Locate the relative maxima and minima of
$$g(x) = \int_0^x \cos[(y - x)^2]\,dy$$
in the interval $]0, \infty[$.
3. Suppose $f$ is a continuous real-valued function defined on an interval $I \subset E^2$ containing 0 as an interior point. If $(x, y) \in I$, define
$$F(x, y) = \int_0^x \Big[ \int_0^y f(t, r)\,dr \Big] dt.$$
Show that at every interior point $(x, y) \in I$,
$$f(x, y) = D_1 D_2 F(x, y) = D_2 D_1 F(x, y).$$
4.
[a, b] x [c, d ] , and

D2f is continuous on this interval. Use Theorem 8.4.1
Corollary 8.4.2 to obtain a proof of Theorem 8.4.3.
Suppose f is real-valued and defined on
together with
or its
5. Suppose $f$, $D_1 D_2 f$, and $D_2 D_1 f$ are defined and continuous on a rectangle in $E^2$. Use the results on iterated integration to show that
$$D_1 D_2 f = D_2 D_1 f.$$
6. Suppose $g$ is defined on $[0, 1]$ and integrable. Show that
$$\int_0^1 \Big[ \int_x^1 g(t)\,dt \Big] dx = \int_0^1 t\,g(t)\,dt.$$
7. Suppose $f$ is a real-valued, nonnegative integrable function with domain $A \subset E^n$. Show that the volume of the set $\{(x, y) : x \in A \ \&\ y \in [0, f(x)]\}$ is $\int_A f(x)\,dx$. (Hint: See Exercise 9 of Section 8.3.)
8. If $V_n(r)$ is the volume of the ball $B(0, r)$ in $E^n$ show that
$$V_{2k}(r) = r^{2k} \pi^k / k!,$$
$$V_{2k-1}(r) = r^{2k-1} \pi^{k-1} 4^k k! / (2k)!.$$
Do this by making use of induction and showing that
$$V_{n+1}(r) = 2 V_n(1) \int_0^r (r^2 - t^2)^{n/2}\,dt.$$
8.5 THE TRANSFORMATION THEOREM FOR INTEGRALS
The purpose of this entire section is to generalize Theorem 5.2.4 to higher dimensions. We shall do this first for linear transformations, which leads immediately to a corresponding theorem for affine transformations. Since under an affine transformation an interval goes over into a generalized parallelepiped, it is comparatively easy to compute how the volume of an interval changes under an affine transformation. Now, functions of class $C^1$ can be closely approximated in small neighborhoods by affine transformations, namely, translations of their differentials. Since an integral may be approximated by weighted sums of the volumes of small intervals, and since the volumes of the maps of these small intervals under a map of class $C^1$ can be very closely approximated by the volumes of maps of these intervals under affine transformations, it seems quite reasonable that a general transformation formula for integrals can be obtained in this way. This is actually the way it works and we shall develop this in the subsequent pages.
We shall begin by considering how the volume of an interval changes under three very simple types of linear transformations of $E^n$ into $E^n$.
We list them as follows:
g(x)=x+(A.-l)xkek, A.ER.
g(x)=(x"\
xrrn), <Fa permutation of (1, n).
g(x) = x + x2e 1
I.
II.
III.
If we write out these transformations componentwise we see that we get

a transformation of type I simply by multiplying the kth component of
by A.. We get a linear transformation of type II by permuting the
components and we get a linear transformation of type III by adding

the second component to the first component.
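The effect of the three types on content can be checked directly in the plane. The sketch below is an illustration under stated assumptions: the rectangle, the value of $\lambda$, and the function names are hypothetical choices. Since a linear map sends a polygon onto the polygon spanned by the images of its vertices, the shoelace formula applied to the mapped corners gives the area of the image.

```python
def shoelace_area(pts):
    """Area of a simple polygon from its vertices listed in order."""
    s = 0.0
    n = len(pts)
    for i in range(n):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def type_I(p, lam, k):
    """Multiply the k-th component (k = 0 or 1 here) by lam."""
    q = list(p)
    q[k] *= lam
    return tuple(q)

def type_II(p):
    """Permute the components (in the plane: swap them)."""
    return (p[1], p[0])

def type_III(p):
    """Add the second component to the first."""
    return (p[0] + p[1], p[1])

# Corners of the interval [0, 2] x [1, 4], whose area is 6.
rect = [(0.0, 1.0), (2.0, 1.0), (2.0, 4.0), (0.0, 4.0)]

area_I = shoelace_area([type_I(p, -3.0, 0) for p in rect])   # expected |lam| * 6
area_II = shoelace_area([type_II(p) for p in rect])          # expected 6
area_III = shoelace_area([type_III(p) for p in rect])        # expected 6
```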
8.5.1 Proposition. If $I$ is a closed interval in $E^n$ and $g$ is one of the linear transformations listed above, then
I. $|g(I)| = |\lambda|\,|I|$.
II, III. $|g(I)| = |I|$.
Proof. Case I is obvious. To prove case II we simply note that if $I = I_1 \times \cdots \times I_n$, then $g(I) = I_{\sigma 1} \times \cdots \times I_{\sigma n}$ and hence the result is immediate. To prove case III we set
\[ J = \{(x^1 + x^2, x^2) : x^1 \in I_1 \ \&\ x^2 \in I_2\}, \qquad K = I_3 \times \cdots \times I_n. \]
It is clear that $g(I) = J \times K$. The picture of how $I_1 \times I_2$ transforms onto $J$ is given in Fig. 8.5.1.
If $I_1 = [a_1, b_1]$ and $I_2 = [a_2, b_2]$, let $L = [a_1 + a_2, b_1 + b_2]$ and $M = L \times I_2 \times K$. Clearly, $M$ is a closed interval for which $g(I) \subset M$.
FIGURE 8.5.1
Now, in case III it is clear that $g$ is a nonsingular linear transformation. It follows from the Inverse Function Theorem that $g$ is an open map with an open range. Since a linear transformation is always a locally Lipschitz map it follows from Theorem 8.2.8 that $g(I)$ is Jordan measurable.
Let us apply Corollary 8.4.2 on iterated integration. Let us first note that if we designate the elements of $E^2 \times E^{n-2}$ by $(x, y)$, then
\[ \chi_{g(I)}(x, y) = \chi_J(x)\,\chi_K(y). \]
Now, since the transformation that takes $x$ into $x + x^2 e_1$ is nonsingular, it follows as in the argument of the previous paragraph that $\chi_J$ is integrable. It is also clear that $\chi_K$ is integrable. (Hence from Exercise 8 of Section 8.3 we get another proof that $\chi_{g(I)}$ is integrable.) Thus we have
\[ |g(I)| = \int_M \chi_{g(I)}(x, y)\,dx\,dy = \int_K \Bigl[ \int_{L \times I_2} \chi_{g(I)}(x, y)\,dx \Bigr] dy = \int_{L \times I_2} \chi_J(x)\,dx \int_K \chi_K(y)\,dy. \]
Now, clearly
\[ \int_K \chi_K(y)\,dy = |K| = \prod_{k=3}^{n} |I_k|. \]
Further, using the theorem on iterated integrals again, we get
\[ \int_{L \times I_2} \chi_J(x)\,dx = \int_{I_2} \Bigl[ \int_L \chi_J(x)\,dx^1 \Bigr] dx^2. \]
For fixed $x^2 \in I_2$ we have
\[ \chi_J(x^1, x^2) = \begin{cases} 1, & a_1 + x^2 \le x^1 \le b_1 + x^2, \\ 0, & \text{otherwise.} \end{cases} \]
Thus we get
\[ \int_{L \times I_2} \chi_J(x)\,dx = |I_1|\,|I_2|, \]
and thus $|g(I)| = |I|$.
8.5.2 Corollary. If $A$ is a bounded Jordan measurable set in $E^n$ and $g$ is any one of the three linear transformations of type I, II, or III, then
I. $|g(A)| = |\lambda|\,|A|$.
II, III. $|g(A)| = |A|$.
Proof. In case I, if $\lambda = 0$, then the corollary is clearly true, since $g(A)$ is a bounded set in an $(n - 1)$-dimensional subspace of $E^n$ and thus has $n$-dimensional Jordan content zero.
If $\lambda \ne 0$, then $g$ is nonsingular and by the same reasoning as in the proof of the last proposition it follows that $g(A)$ is Jordan measurable. Thus $\forall \varepsilon > 0$ we may cover $A$ by a finite set of intervals $\{I_j : j \in (1, k)\}$ so that
\[ \sum_{j=1}^{k} |I_j| < |A| + \varepsilon/|\lambda|. \]
The set $\{g(I_j) : j \in (1, k)\}$ covers $g(A)$ and hence
\[ |g(A)| \le \sum_{j=1}^{k} |g(I_j)| = |\lambda| \sum_{j=1}^{k} |I_j| \le |\lambda|\,|A| + \varepsilon. \]
On the other hand, $g^{-1}$ is a linear transformation of the same form as $g$ with $\lambda$ replaced by $1/\lambda$. Thus, using the same argument as before, we get
\[ |A| = |g^{-1} \circ g(A)| \le \frac{1}{|\lambda|}\,|g(A)|. \]
This gives the equality in case I.
Cases II and III are proved in essentially the same way, and we shall leave this as an exercise for the reader.
Suppose now that $h$ is any linear transformation from $E^n$ into itself which has the matrix representation $[h_{jk}]$ with respect to the ordered basis $(e_1, \ldots, e_n)$. Let $h_j$ be the $j$th row of $[h_{jk}]$; that is, $h_j = (h_{j1}, h_{j2}, \ldots, h_{jn})$. Let $A$ be a fixed bounded Jordan measurable set in $E^n$. Set
\[ V_A(h_1, \ldots, h_n) = |h(A)|. \tag{8.5.1} \]
This clearly defines a function $V_A$ with domain the $n$-fold Cartesian product of $E^n$, and range in $R$. In Exercise 8 at the end of the chapter we have asked the reader to show that $h(A)$ is Jordan measurable.
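8.5.3 Proposition. The function $V_A$ defined by (8.5.1) satisfies the following properties:
(a) $V_A(h_1, \ldots, \lambda h_k, \ldots, h_n) = |\lambda|\,V_A(h_1, \ldots, h_n)$.
(b) $V_A(h_1, \ldots, h_n) = V_A(h_{\sigma 1}, \ldots, h_{\sigma n})$, for every permutation $\sigma$ of $(1, n)$.
(c) $V_A(h_1, \ldots, h_n) = V_A(h_1, \ldots, h_p + \lambda h_q, \ldots, h_q, \ldots, h_n)$, $\forall p, q \in (1, n)$ so that $p \ne q$, and $\forall \lambda \in R$.
(d) $V_A(e_1, \ldots, e_n) = |A|$.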
Proof. To prove (a) we suppose that $h$ is a linear transformation from $E^n$ into itself so that $h_1, \ldots, h_n$ are the rows of the matrix representation of $h$ with respect to the ordered basis $(e_1, \ldots, e_n)$. Let $g$ be a linear transformation of type I. Then $\forall j \ne k$, $g(e_j) = e_j$, and $g(e_k) = \lambda e_k$. Thus, since
\[ h(e_j) = \sum_{i \ne k} h_{ij} e_i + h_{kj} e_k, \]
we get
\[ g \circ h(e_j) = \sum_{i \ne k} h_{ij} e_i + \lambda h_{kj} e_k. \]
Thus the rows of the matrix representation of $g \circ h$ with respect to the ordered basis $(e_1, \ldots, e_n)$ are $h_1, \ldots, \lambda h_k, \ldots, h_n$. Hence using Corollary 8.5.2 we get
\[ V_A(h_1, \ldots, \lambda h_k, \ldots, h_n) = |g \circ h(A)| = |\lambda|\,|h(A)| = |\lambda|\,V_A(h_1, \ldots, h_n). \]
To prove (b) we simply note that if $g$ is a linear transformation of type II, then the rows of the matrix representation of $g \circ h$ are $h_{\sigma 1}, \ldots, h_{\sigma n}$. This follows from the fact that $\forall j \in (1, n)$, $g(e_j) = e_{\sigma j}$.
To prove (c) we first note that if $g$ is a transformation of type III, then $\forall j \ne 2$, $g(e_j) = e_j$ and $g(e_2) = e_1 + e_2$. Thus
\[ g \circ h(e_j) = (h_{1j} + h_{2j})e_1 + \sum_{i=2}^{n} h_{ij} e_i, \]
and hence the rows of $g \circ h$ are $h_1 + h_2, h_2, \ldots, h_n$. Thus
\[ V_A(h_1 + h_2, h_2, \ldots, h_n) = |g \circ h(A)| = |h(A)| = V_A(h_1, \ldots, h_n). \]
Hence if we apply a transformation of type II that interchanges $h_1$ and $h_p$, $h_2$ and $h_q$, and leaves the other rows fixed, use the result (b) and the equality we have just obtained for a transformation of type III, and then use the result (b) again, we have arrived at the fact that
\[ V_A(h_1, \ldots, h_p + h_q, \ldots, h_n) = V_A(h_1, \ldots, h_n). \]
If we use this fact and (a) we get
\[ |\lambda|\,V_A(h_1, \ldots, h_n) = V_A(h_1, \ldots, \lambda h_q, \ldots, h_n) = V_A(h_1, \ldots, h_p + \lambda h_q, \ldots, \lambda h_q, \ldots, h_n) = |\lambda|\,V_A(h_1, \ldots, h_p + \lambda h_q, \ldots, h_q, \ldots, h_n). \]
From this, if $\lambda \ne 0$ we get (c). If $\lambda = 0$, (c) is obvious. Since the equality (d) is obvious, we have completed the proof.
8.5.4 Proposition. There exists only one function defined on the $n$-fold Cartesian product of $E^n$ that satisfies the conditions (a) through (d) of the preceding proposition.
Proof. Suppose $U$ is a function that satisfies all the conditions (a) through (d) of the previous proposition and let $W = V_A - U$. Then $W$ satisfies the conditions (a) through (c) and moreover $W(e_1, \ldots, e_n) = 0$.
If for some $k$, $h_k = 0$, then by taking $\lambda = 0$ in (a) it follows that $W(h_1, \ldots, h_n) = 0$. Hence if $\{h_k : k \in (1, n)\}$ is linearly dependent, there is a linear combination of the form
\[ h_k + \sum_{j \ne k} \lambda_j h_j \]
which is zero. In this case a repeated application of (c) (rigorously by induction!) gives
\[ W(h_1, \ldots, h_n) = W\Bigl(h_1, \ldots, h_k + \sum_{j \ne k} \lambda_j h_j, \ldots, h_n\Bigr) = 0. \]
If the set $\{h_k : k \in (1, n)\}$ is linearly independent, then $\exists k_1 \in (1, n)$ so that the set $\{h_{k_1}, e_2, \ldots, e_n\}$ is linearly independent. Otherwise, the set $\{h_k : k \in (1, n)\}$ is in an $(n - 1)$-dimensional space. Also, for the same reason, $\exists k_2$ so that $\{h_{k_1}, h_{k_2}, e_3, \ldots, e_n\}$ is linearly independent. Proceeding by induction, there exist numbers $k_1, \ldots, k_n$ so that all the different sets $\{h_{k_1}, \ldots, h_{k_j}, e_{j+1}, \ldots, e_n\}$, $j \in (1, n)$, are linearly independent. Now, we may write
\[ h_{k_1} = \sum_{j=1}^{n} \lambda_{j k_1} e_j, \]
and by a repeated application of (c) and (a) we get
\[ W(h_{k_1}, e_2, \ldots, e_n) = 0. \]
Also,
\[ h_{k_2} = \lambda_{1 k_2} h_{k_1} + \sum_{j=2}^{n} \lambda_{j k_2} e_j. \]
Using the previous equality and a repeated application of (c) and (a) we get
\[ W(h_{k_1}, h_{k_2}, e_3, \ldots, e_n) = 0. \]
Proceeding by induction we find that
\[ W(h_{k_1}, h_{k_2}, \ldots, h_{k_n}) = 0. \]
Now let $\sigma$ be that permutation of $(1, n)$ so that $\sigma(j) = k_j$. Then, applying (b) we get
\[ W(h_1, \ldots, h_n) = 0. \]
8.5.5 Theorem. If $g$ is a linear transformation of $E^n$ into itself and $A$ is a bounded Jordan measurable set in $E^n$, then
\[ |g(A)| = |\det g|\,|A|. \tag{8.5.2} \]
Proof. If $g$ is any linear transformation of $E^n$ into itself, let $g_1, \ldots, g_n$ be the row vectors of the matrix representation of $g$ with respect to the ordered basis $(e_1, \ldots, e_n)$. The function
\[ U(g_1, \ldots, g_n) = |\det g|\,|A| \]
satisfies the conditions (a) through (d) of Proposition 8.5.3, and thus by the last proposition $U = V_A$. Now, (see Exercise 8) $g(A)$ is a Jordan-measurable set and thus
\[ |g(A)| = V_A(g_1, \ldots, g_n) = |\det g|\,|A|. \]
From the preceding theorem it would be immediately possible to get a formula for transforming integrals under linear or affine transformations. We shall not bother to carry this through but shall instead proceed directly to the general case. We shall first establish two propositions that essentially constitute the heart of the matter.
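Propositions 8.5.3 and 8.5.4 say that the rules (a) through (d) determine $V_A$ completely, and Theorem 8.5.5 identifies the result as $|\det g|\,|A|$. The sketch below illustrates this under the normalizing assumption $|A| = 1$: it evaluates $V_A$ on a list of rows using nothing but rules (a), (b), (c), (d), and compares the answer with a cofactor expansion. The function names and the sample matrix are hypothetical, and the exact zero test on pivots is adequate only for a small illustration.

```python
def abs_det_by_row_rules(rows):
    """Evaluate V(h_1, ..., h_n), normalized so that V(e_1, ..., e_n) = 1,
    using only rules (a)-(d) of Proposition 8.5.3."""
    m = [list(map(float, r)) for r in rows]
    n = len(m)
    factor = 1.0
    for col in range(n):
        # Rule (b): permuting rows does not change V; bring a nonzero pivot up.
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        if m[pivot][col] == 0.0:
            return 0.0  # the rows are linearly dependent, so V = 0
        m[col], m[pivot] = m[pivot], m[col]
        # Rule (a): V(..., lam * h_k, ...) = |lam| V(..., h_k, ...); scale the pivot row.
        lam = m[col][col]
        factor *= abs(lam)
        m[col] = [v / lam for v in m[col]]
        # Rule (c): adding a multiple of one row to another row does not change V.
        for i in range(n):
            if i != col and m[i][col] != 0.0:
                c = m[i][col]
                m[i] = [vi - c * vc for vi, vc in zip(m[i], m[col])]
    # The rows are now e_1, ..., e_n, and rule (d) gives V = 1 for them.
    return factor

def det3(m):
    """Cofactor expansion of a 3 x 3 determinant, used only as a cross-check."""
    (a, b, c), (d, e, f), (p, q, s) = m
    return a * (e * s - f * q) - b * (d * s - f * p) + c * (d * q - e * p)

example = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
```

Calling `abs_det_by_row_rules(example)` and `abs(det3(example))` should give the same number, which mirrors the uniqueness argument: any function obeying the four rules is forced to agree with $|\det|$.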
8.5.6 Proposition. Suppose $g$ is of class $C^1$ with (an open) domain in $E^n$ and range in $E^n$ and whose differential is nonsingular at every point of its domain, or in other words, $g$ has a nonvanishing Jacobian at every point of its domain. For every compact set $K \subset \mathcal{D}(g)$ and $\forall \varepsilon > 0$, $\exists \delta > 0$ so that for every interval $I \subset K$ with $d(I) < \delta$ and $\forall x \in I$ we have
\[ |g(I)| \le (1 + \varepsilon)\,|J_g(x)|\,|I|. \tag{8.5.3} \]
Proof. Let us first note that from the Inverse Function Theorem it follows that $g$ is an open map with an open range and from Proposition 7.5.1, formula (7.5.2), it follows that $g$ is locally Lipschitz. Thus from Theorem 8.2.8, $g$ takes Jordan measurable sets onto Jordan measurable sets. Hence the left side of (8.5.3) makes sense. We shall now break the remainder of the proof into several parts.

(a) From Proposition 7.5.1 it follows that $\forall \varepsilon' > 0$, $\exists \delta' > 0$, so that $\forall x, a \in K$ with $|x - a| < \delta'$ we have
\[ |g(x) - g(a) - dg(a)(x - a)| \le \varepsilon'\,|x - a|. \tag{8.5.4} \]
Let us suppose, at first, that $I$ is a cube in $K$ with center at $a$ and of side length $2l$, where $2l < \delta'/\sqrt{n}$. This of course means that $d(I) < \delta'$, and vice versa. Also let us suppose that $dg(a)$ is the identity transformation. Then from (8.5.4) we get, $\forall k \in (1, n)$ and $\forall x \in I$,
\[ |g^k(x) - g^k(a) - (x^k - a^k)| \le \varepsilon'\,|x - a|. \tag{8.5.4'} \]
Thus, since $\forall x \in I$, $|x - a| \le l\sqrt{n}$, and $|x^k - a^k| \le l$, we get
\[ |g^k(x) - g^k(a)| \le (1 + \varepsilon'\sqrt{n})\,l. \tag{8.5.5} \]
This means that $g(I)$ is contained in a cube with center at $g(a)$ and side length $2(1 + \varepsilon'\sqrt{n})\,l$. Consequently,
\[ |g(I)| \le (1 + \varepsilon'\sqrt{n})^n\,|I|. \]
If for a given $\varepsilon > 0$ we choose $\varepsilon'$ so that $(1 + \varepsilon'\sqrt{n})^n < (1 + \varepsilon)$, we have established (8.5.3) in this case, at least for $x = a$, since $J_g(a) = \det dg(a) = 1$.
(b) Let us now go to the case where it is not necessarily true that $dg(a)$ is the identity, but $I$ is still a cube as in part (a). Let $M$ be a positive number so that $\forall x \in K$ and $\forall u \in E^n$
\[ |dg(x)^{-1}(u)| \le M\,|u|. \]
This is a consequence of Corollary 7.5.2. Hence from (8.5.4) and the linearity of $dg(a)^{-1}$ we get, $\forall x, a \in K$ with $|x - a| < \delta'$,
\[ |dg(a)^{-1}(g(x)) - dg(a)^{-1}(g(a)) - (x - a)| \le M\,|g(x) - g(a) - dg(a)(x - a)| \le M\varepsilon'\,|x - a|. \tag{8.5.6} \]
Let us set
\[ h = dg(a)^{-1} \circ g. \]
From the chain rule it follows that $h$ is of class $C^1$ in $\mathcal{D}(g)$ and further $dh(a)$ is the identity transformation. The inequality (8.5.6) leads to the inequality (8.5.5) for $h$ with the exception that $\varepsilon'$ has been replaced by $M\varepsilon'$. Consequently, for a given $\varepsilon > 0$, choose $\varepsilon'$ so that $(1 + M\varepsilon'\sqrt{n})^n < 1 + \varepsilon/2$. Applying the result of part (a) to $h$ we find that
\[ |h(I)| < (1 + \varepsilon/2)\,|I|. \]
From Theorem 8.5.5 and the definition of $h$ we get
\[ |h(I)| = |dg(a)^{-1}(g(I))| = |\det dg(a)^{-1}|\,|g(I)|. \]
But since $|\det dg(a)^{-1}| = |J_g(a)|^{-1}$, we get
\[ |g(I)| < (1 + \varepsilon/2)\,|J_g(a)|\,|I|. \tag{8.5.7} \]
(c) Let us now finish the proof. From the compactness of $K$, and the continuity and nonvanishing of $J_g(x)$, $\exists m > 0$ so that $\forall x \in K$ we have $|J_g(x)| \ge m$. Further, from the uniform continuity of $J_g(x) \mid K$, $\forall \eta > 0$, $\exists \delta$ with $0 < \delta < \delta'$ so that $\forall x, y \in K$ with $|x - y| < \delta$ we have
\[ -\eta < |J_g(y)| - |J_g(x)| < \eta. \]
Hence we find that
\[ |J_g(y)| < (1 + \eta/m)\,|J_g(x)|. \tag{8.5.8} \]
Take $\eta$ so that $(1 + \varepsilon/2)(1 + \eta/m) < (1 + \varepsilon)$.
Now, if $I$ is any $n$-dimensional interval in $K$ with $d(I) < \delta$, from Theorem 8.2.7 and Corollary 8.3.10 it follows that $\forall \zeta > 0$ there is a finite set $\{I_k : k \in (1, m)\}$ of cubes in $I$ so that
\[ |g(I)| < \zeta + \Bigl| g\Bigl( \bigcup_{k=1}^{m} I_k \Bigr) \Bigr| \quad \text{and} \quad \sum_{k=1}^{m} |I_k| \le |I|. \]
Let $a_k$ be the center of $I_k$. Then using (8.5.7) we get
\[ |g(I)| < \zeta + \Bigl| g\Bigl( \bigcup_{k=1}^{m} I_k \Bigr) \Bigr| \le \zeta + \sum_{k=1}^{m} |g(I_k)| < \zeta + (1 + \varepsilon/2) \sum_{k=1}^{m} |J_g(a_k)|\,|I_k|. \]
From (8.5.8) we have $\forall x \in I$, $|J_g(a_k)| < (1 + \eta/m)\,|J_g(x)|$. Thus recalling how $\eta$ was taken we get $\forall \zeta > 0$,
\[ |g(I)| < \zeta + (1 + \varepsilon)\,|J_g(x)|\,|I|. \]
Since this is true $\forall \zeta > 0$, we have completed the proof. Note that if $|I| = 0$, then (8.5.3) is automatically satisfied.
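The content estimate (8.5.3) can be watched at work on a concrete map. The sketch below is an illustration under stated assumptions: it uses the polar map $g(r, \theta) = (r\cos\theta, r\sin\theta)$, whose Jacobian is $r$, approximates $|g(I)|$ for a small cube $I$ by mapping many boundary points and applying the shoelace formula, and then forms the largest value of $|g(I)| / (|J_g(x)|\,|I|)$ over $x \in I$. The map, the cube, and all names are hypothetical choices made for the example.

```python
import math

def g(r, th):
    """Polar map; its Jacobian is J_g(r, theta) = r."""
    return (r * math.cos(th), r * math.sin(th))

def shoelace_area(pts):
    """Area of a simple polygon from its vertices listed in order."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def image_area(r0, th0, l, n=2000):
    """Approximate |g(I)| for the cube I with center (r0, th0) and side 2l by
    mapping n points per edge of its boundary and using the shoelace formula."""
    ts = [-l + 2 * l * i / n for i in range(n)]
    boundary = ([(r0 + t, th0 - l) for t in ts] +   # bottom edge
                [(r0 + l, th0 + t) for t in ts] +   # right edge
                [(r0 - t, th0 + l) for t in ts] +   # top edge
                [(r0 - l, th0 - t) for t in ts])    # left edge
    return shoelace_area([g(r, th) for r, th in boundary])

def worst_ratio(r0, th0, l):
    """Largest value over x in I of |g(I)| / (|J_g(x)| |I|); it occurs at the smallest r."""
    cube_content = (2 * l) ** 2
    smallest_jacobian = r0 - l
    return image_area(r0, th0, l) / (smallest_jacobian * cube_content)

ratios = [worst_ratio(1.0, 0.5, l) for l in (0.2, 0.1, 0.05)]
```

As the cube shrinks, the entries of `ratios` decrease toward 1, so that for any prescribed $\varepsilon$ the bound $(1 + \varepsilon)$ is eventually met.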
8.5.7 Proposition. Suppose $g$ is of class $C^1$ with an open domain in $E^n$ and range in $E^n$. If $A$ is a bounded set with $\bar{A} \subset \mathcal{D}(g)$ and $\forall x \in A$, $J_g(x) = 0$, then $|g(A)| = 0$.
Proof. Let $B$ be a bounded open set in $\mathcal{D}(g)$ so that $\bar{B} \subset \mathcal{D}(g)$ and $\bar{A} \subset B$. Since $\bar{B}$ is compact, from Proposition 7.5.1 it follows that $\forall \varepsilon > 0$, $\exists \delta > 0$ so that $x, y \in \bar{B}$ and $|x - y| < \delta \Rightarrow$
\[ |g(x) - g(y) - dg(x)(x - y)| \le \varepsilon\,|x - y|. \tag{8.5.9} \]
Let $\{I_k : k \in (1, m)\}$ be a covering for $A$ by cubes so that $\bigcup \{I_k : k \in (1, m)\} \subset B$, the center of $I_k$ is in $A$, $d(I_k) < \delta$ and
\[ \sum_{k=1}^{m} |I_k| \le 2^n \chi(B). \]
The factor $2^n$ is needed to make sure we can get the center of $I_k$ in $A$.
Let $J$ be any one of these cubes and $a$ its center. Then from (8.5.9) we get
\[ |g(x) - g(a) - dg(a)(x - a)| \le \varepsilon l\sqrt{n}, \tag{8.5.9'} \]
where $2l$ is the side length of $J$.
Since $J_g(a) = 0$, the rank of $dg(a)$ is $r < n$; that is, $dg(a)$ maps $E^n$ into a subspace of dimension $r$. For every $x \in J$ let us write
\[ g(x) - g(a) = h(x) + k(x), \]
where $k(x) \in \mathcal{R}(dg(a))$ and $h(x) \in \mathcal{R}(dg(a))^{\perp}$. Hence we get
\[ |g(x) - g(a) - dg(a)(x - a)|^2 = |h(x)|^2 + |k(x) - dg(a)(x - a)|^2. \]
It follows from (8.5.9') that
\[ |h(x)| \le \varepsilon l\sqrt{n}, \qquad |k(x) - dg(a)(x - a)| \le \varepsilon l\sqrt{n}. \tag{8.5.10} \]
Since $g \in C^1$ and $\bar{B}$ is compact, $\exists M > 0$ so that $\forall x \in \bar{B}$ and $\forall u \in E^n$ we have
\[ |dg(x)(u)| \le M\,|u|. \]
Thus from (8.5.10) we get
\[ |k(x)| \le (M + \varepsilon)\,l\sqrt{n}. \tag{8.5.10'} \]
Let $U$ be any orthogonal transformation from $E^n$ onto itself so that $U$ takes $\mathcal{R}(dg(a))$ onto $E^r$. Of course, $U$ takes the orthogonal complement of $\mathcal{R}(dg(a))$ onto $E^{n-r}$. From (8.5.10) and (8.5.10') we have
\[ |Uh(x)| \le \varepsilon l\sqrt{n}, \qquad |Uk(x)| \le (M + \varepsilon)\,l\sqrt{n}. \]
Thus $Uh(x)$ is contained in a cube $J_1 \subset E^{n-r}$ of side length $2\varepsilon l\sqrt{n}$ and $Uk(x)$ is contained in a cube $J_2 \subset E^r$ of side length $2(M + \varepsilon)\,l\sqrt{n}$. Hence $Ug(x) - Ug(a)$ is contained in $J_1 \times J_2$ and $Ug(x)$ is contained in the cube $J_1 \times J_2 + Ug(a)$, whose content is, of course, the same as the content of $J_1 \times J_2$. Thus
\[ \chi(Ug(J)) \le (2\varepsilon l\sqrt{n})^{n-r}\,(2(M + \varepsilon)\,l\sqrt{n})^{r} = \varepsilon^{n-r}(M + \varepsilon)^{r} n^{n/2}\,|J|. \]
Set $N = (M + 1)^n n^{n/2}$; then $\forall \varepsilon$ for which $0 < \varepsilon < 1$ we have
\[ \chi(Ug(J)) \le \varepsilon N\,|J|. \]
From this we get
\[ \chi(Ug(A)) \le \sum_{k=1}^{m} \chi(Ug(I_k)) \le \varepsilon N \sum_{k=1}^{m} |I_k| \le \varepsilon N 2^n \chi(B). \]
Since $\varepsilon$ is arbitrary we get $|Ug(A)| = 0$. But since $|\det U^{-1}| = 1$ we get from Theorem 8.5.5 that
\[ |g(A)| = |U^{-1} Ug(A)| = |\det U^{-1}|\,|U(g(A))| = 0. \]
We are now in a position to prove a transformation theorem for higher-dimensional integrals.
8.5.8 Theorem. Let $g$ be a function of class $C^1$ with open domain in $E^n$ and range in $E^n$. Let $A$ be a bounded set with $\bar{A} \subset \mathcal{D}(g)$ and suppose that $B = \{x : J_g(x) \ne 0\} \cap A$ is a Jordan measurable set, and $g \mid B^0$ is one to one. Then $g(A)$ is Jordan measurable, and if $f$ is a real-valued bounded function with domain $g(A)$, which is continuous except on a set of zero Lebesgue measure, then
\[ \int_{g(A)} f(x)\,dx = \int_A f \circ g(x)\,|J_g(x)|\,dx. \tag{8.5.11} \]
Proof. (a) Let us suppose, at first, that $A = B$; that is, $J_g(x) \ne 0$ for every $x \in A$. Since $g \in C^1$, it is locally Lipschitz, and since $J_g(x) \ne 0$, $\forall x \in A^0$, it follows from the Inverse Function Theorem that $g \mid A^0$ is an open map. It follows from Theorem 8.2.8 that $g(A)$ is Jordan measurable. From Theorem 8.3.7 it follows that the left-hand integral in (8.5.11) exists. On the other hand, since the inverse of $g \mid A^0$ takes sets of zero Lebesgue measure into sets of zero Lebesgue measure (the proof is almost exactly the same as the proof of Lemma 8.2.7), it follows that $(f \circ g)\,|J_g|$ is continuous on $A^0$ except on a set of zero Lebesgue measure. Hence the integral on the right side of (8.5.11) exists.

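Let us first prove (8.5.11) in the case where $f \ge 0$. Let $K$ be a compact Jordan measurable set in $A^0$ and $V$ an open set in $A^0$ so that $K \subset V \subset \bar{V} \subset A^0$. Let $\eta$ be the positive distance from $K$ to $V^c$, and $\forall \varepsilon > 0$ let $\delta$ be a number so that $0 < \delta < \eta$, and which is small enough so that the conclusions of Proposition 8.5.6 hold with $\bar{V}$ the compact set of that proposition. Embed $K$ into an interval $I$ and let $\Delta$ be a decomposition of $I$ so that $|\Delta| < \delta$ and
\[ D_{(f \circ g)|J_g|_K}(\Delta) - \int_K f \circ g(x)\,|J_g(x)|\,dx < \varepsilon. \]
By $(f \circ g)|J_g|_K$ we mean that function whose value is $(f \circ g)\,|J_g|$ on $K$ and zero elsewhere. Let us set $\Delta^* = \{J : J \in \Delta \ \&\ J \cap K \ne \emptyset\}$. Clearly $\forall J \in \Delta^*$, $J \subset \bar{V}$, and thus from Proposition 8.5.6 it follows that $\forall x \in J$,
\[ f \circ g(x)\,|g(J)| \le (1 + \varepsilon)\,f \circ g(x)\,|J_g(x)|\,|J|. \]
If we put
\[ M(J) = \sup\{f \circ g(x) : x \in J \cap K\} \]
and
\[ M'(J) = \sup\{f \circ g(x)\,|J_g(x)| : x \in J \cap K\}, \]
then we get
\[ M(J)\,|g(J)| \le (1 + \varepsilon)\,M'(J)\,|J|. \]
Let us now set
\[ f_\Delta = \sum_{J \in \Delta^*} M(J)\,\chi_{g(J)}. \]
Clearly, $f\,\chi_{g(K)} \le f_\Delta$, and, since from Theorem 8.2.8 the compact set $g(K)$ is Jordan measurable, we get
\[ \int_{g(K)} f(x)\,dx \le \int_{g(I)} f_\Delta(x)\,dx = \sum_{J \in \Delta^*} M(J)\,|g(J)| \le (1 + \varepsilon)\,D_{(f \circ g)|J_g|_K}(\Delta) \le (1 + \varepsilon)\Bigl[ \int_K f \circ g(x)\,|J_g(x)|\,dx + \varepsilon \Bigr]. \]
Since this is true $\forall \varepsilon > 0$, we have
\[ \int_{g(K)} f(x)\,dx \le \int_K f \circ g(x)\,|J_g(x)|\,dx. \tag{8.5.12} \]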
Since $A$ is Jordan measurable, there is a Jordan measurable compact set $K \subset A$ so that $|A \setminus K|$ is as small as we please. By Theorem 8.2.7, $|g(A \setminus K)|$ is as small as we please provided $|A \setminus K|$ is small enough. But since $g(A) \setminus g(K) \subset g(A \setminus K)$, it follows that $|g(A) \setminus g(K)|$ can be made as small as we please provided $|A \setminus K|$ is small enough. Thus we see that (8.5.12) holds true when $K$ is replaced by $A$.
Now, let $g^{-1}$ be the inverse of $g \mid A^0$. The function $g^{-1}$ has domain $g(A^0)$, is a one-to-one function of class $C^1$ and has a nonvanishing Jacobian. Thus if $L$ is any Jordan measurable compact set in $g(A^0)$, we can apply the inequality (8.5.12) when $K$ is replaced by $L$, $g(K)$ is replaced by $g^{-1}(L)$ and $f$ is replaced by $f \circ g(x)\,|J_g(x)|$. Thus we get
\[ \int_{g^{-1}(L)} f \circ g(x)\,|J_g(x)|\,dx \le \int_L f \circ g \circ g^{-1}(x)\,|J_g(g^{-1}(x))|\,|J_{g^{-1}}(x)|\,dx. \]
Now, of course, $f \circ g \circ g^{-1} = f$ and from the chain rule
\[ |J_g(g^{-1}(x))|\,|J_{g^{-1}}(x)| = 1. \]
Now, let $K$ be a compact measurable set in $A^0$ and put $L = g(K)$. Then we get
\[ \int_K f \circ g(x)\,|J_g(x)|\,dx \le \int_{g(K)} f(x)\,dx. \tag{8.5.12'} \]
Now argue exactly the same as in the previous paragraph and we see (8.5.12') is satisfied with $K$ replaced by $A$. Thus (8.5.11) is satisfied.
For an $f$ that takes on both positive and negative values, let us set
\[ f^+(x) = \tfrac{1}{2}\,[\,|f(x)| + f(x)\,], \qquad f^-(x) = \tfrac{1}{2}\,[\,|f(x)| - f(x)\,]. \]
Then $f^+$ and $f^-$ are nonnegative and integrable and $f = f^+ - f^-$. We can apply (8.5.11) to each function $f^+$ and $f^-$ and then use the addition theorem for integrals to get (8.5.11) for $f$.
(b) We shall now complete the proof in the general case. Since $A$ is a bounded set with $\bar{A} \subset \mathcal{D}(g)$ it follows that $A \setminus B$ is bounded and its closure is in $\mathcal{D}(g)$. Since $J_g(x) = 0$ on $A \setminus B$, it follows from Proposition 8.5.7 that $|g(A \setminus B)| = 0$. But since $g(A) \setminus g(B) \subset g(A \setminus B)$, we see that $|g(A) \setminus g(B)| = 0$. Now, since $g \mid B^0$ is certainly an open map, it follows from Theorem 8.2.8 that $g(B)$ is Jordan measurable. Hence it follows from Corollary 8.3.10 that $g(A) = g(B) \cup (g(A) \setminus g(B))$ is Jordan measurable. From the hypothesis on $f$ it follows that the left side of (8.5.11) certainly exists. On the other hand, as we noted in part (a), $(f \circ g)\,|J_g|$ is continuous on $B$ except for a set of Lebesgue measure zero. Further, $\forall x \in A \setminus B$, $f \circ g(x)\,|J_g(x)| = 0$ and thus, since $B$ is measurable, $(f \circ g)\,|J_g|$ is integrable over $A$ regardless of whether or not $A$ is Jordan measurable. Hence the right-hand integral in (8.5.11) always exists.
We can write
\[ \int_{g(A)} f(x)\,dx = \int_{g(B)} f(x)\,dx + \int_{g(A) \setminus g(B)} f(x)\,dx = \int_{g(B)} f(x)\,dx, \]
and also
\[ \int_A f \circ g(x)\,|J_g(x)|\,dx = \int_B f \circ g(x)\,|J_g(x)|\,dx. \]
If we apply part (a) of the proof we get
\[ \int_{g(B)} f(x)\,dx = \int_B f \circ g(x)\,|J_g(x)|\,dx. \]
If we combine this with the previous two equalities we get (8.5.11) and the proof is complete.
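Formula (8.5.11) lends itself to a numerical sanity check. The sketch below is an illustration under stated assumptions, not part of the proof: it takes $g$ to be the polar map on $A = [0, 1] \times [0, 2\pi]$, so that $g(A)$ is the closed unit disk and $|J_g(r, \theta)| = r$, and it takes $f(x, y) = e^{-(x^2 + y^2)}$. Both sides are approximated by midpoint sums and compared with the standard closed-form value $\pi(1 - e^{-1})$. The function names and grid sizes are hypothetical choices.

```python
import math

def f(x, y):
    """Sample integrand on the unit disk."""
    return math.exp(-(x * x + y * y))

def lhs_cartesian(n=400):
    """Midpoint sum of f over the unit disk g(A), using an n x n grid on [-1, 1]^2."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        for j in range(n):
            y = -1.0 + (j + 0.5) * h
            if x * x + y * y <= 1.0:
                total += f(x, y)
    return total * h * h

def rhs_polar(n=200):
    """Midpoint sum of f(g(r, theta)) * |J_g(r, theta)| over A = [0, 1] x [0, 2 pi],
    where g(r, theta) = (r cos theta, r sin theta) and |J_g| = r."""
    hr, ht = 1.0 / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            th = (j + 0.5) * ht
            total += f(r * math.cos(th), r * math.sin(th)) * r
    return total * hr * ht

exact = math.pi * (1.0 - math.exp(-1.0))
```

Note that here the Jacobian vanishes on the edge $r = 0$ of $A$ and $g$ is not one to one on the boundary of $A$, which is exactly the situation the hypotheses on $B$ and $g \mid B^0$ are designed to allow.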
REMARK: The statement of Theorem 8.5.8 could be made to appear somewhat simpler if the concept of the Lebesgue integral had been used instead of the concept of the Riemann-Darboux integral. For example, the hypothesis that $\bar{A} \subset \mathcal{D}(g)$ and the hypothesis that the set $B$ is measurable could be removed, provided we supposed that $A$ was Lebesgue measurable. Even if we had assumed that $A$ was Jordan measurable in Theorem 8.5.8, we could not have eliminated the condition on $B$. Also the theory of Lebesgue integration would allow us to assume less about the functions that are involved.

We now want to show that the hypotheses which have been made in Theorem 8.5.8 are not unnecessarily refined for easy and carefree application. This will already be seen for the most standard and most often used transformation: the transformation from rectangular to polar or spherical coordinates. We shall do this in three dimensions. Let $g$ be that function with domain and range $E^3$ whose components are given by (see Section 6.5)
\[ g^1(r, \theta, \varphi) = r\cos\theta, \qquad g^2(r, \theta, \varphi) = r\sin\theta\cos\varphi, \qquad g^3(r, \theta, \varphi) = r\sin\theta\sin\varphi. \]
The function $g$ is of class $C^1$ and an elementary computation shows that
\[ J_g(r, \theta, \varphi) = r^2\sin\theta. \]

Let $S$ be the set of points whose coordinates satisfy the following:
\[ r \ge 0, \qquad 0 \le \theta \le \pi, \qquad \beta \le \varphi \le \beta + 2\pi, \qquad \beta \text{ arbitrary.} \]
If $A \subset S$, it is not necessarily true that $g \mid A$ is one to one with a nonvanishing Jacobian. However, it is always true that $g \mid A^0$ has these properties and hence we may apply the transformation theorem and get
\[ \int_{g(A)} f(x, y, z)\,dx\,dy\,dz = \int_A f \circ g(r, \theta, \varphi)\,r^2\sin\theta\,dr\,d\theta\,d\varphi. \]
For example, if $f(x, y, z) = 1$ and
\[ A = \{(r, \theta, \varphi) : r \in [0, 1],\ \theta \in [0, \pi],\ \varphi \in [0, 2\pi]\}, \]
then $g(A)$ is the closure of $B(0, 1)$ and we can thus compute the volume of the unit ball in $E^3$. Actually, if we use the full strength of Theorem 8.5.8
we can take S to be the set of points whose coordinates satisfy the
following:
0,
a,
aOa+1T,
{3 arbitrary.
Loosely speaking, the second-order differential $\sin\theta\,d\theta\,d\varphi$ is an element of surface area on the unit sphere in $E^3$. On a sphere of radius $r$ this element of surface area is magnified (or shrunk) to $r^2\sin\theta\,d\theta\,d\varphi$ and hence $r^2\sin\theta\,d\theta\,d\varphi\,dr$ is an element of volume in the $(r, \theta, \varphi)$ space.
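Both the Jacobian $r^2\sin\theta$ and the resulting volume of the unit ball can be checked numerically. The sketch below is only an illustration: it approximates the Jacobian determinant of the spherical map (with the component order $r\cos\theta$, $r\sin\theta\cos\varphi$, $r\sin\theta\sin\varphi$) by central differences, and integrates $r^2\sin\theta$ over $[0, 1] \times [0, \pi] \times [0, 2\pi]$ by the midpoint rule. The helper names, the step size, and the grid size are assumptions made for the example.

```python
import math

def g(r, th, ph):
    """Spherical-coordinate map with the component order (r cos th, r sin th cos ph, r sin th sin ph)."""
    return (r * math.cos(th),
            r * math.sin(th) * math.cos(ph),
            r * math.sin(th) * math.sin(ph))

def det3(m):
    """Cofactor expansion of a 3 x 3 determinant."""
    (a, b, c), (d, e, f), (p, q, s) = m
    return a * (e * s - f * q) - b * (d * s - f * p) + c * (d * q - e * p)

def jacobian_fd(r, th, ph, h=1e-6):
    """Central-difference approximation of the Jacobian determinant of g."""
    point = (r, th, ph)
    cols = []
    for k in range(3):
        plus = list(point)
        plus[k] += h
        minus = list(point)
        minus[k] -= h
        gp, gm = g(*plus), g(*minus)
        cols.append([(gp[i] - gm[i]) / (2 * h) for i in range(3)])
    # cols[k][i] approximates d g^i / d x^k; arrange as the usual Jacobian matrix.
    matrix = [[cols[k][i] for k in range(3)] for i in range(3)]
    return det3(matrix)

def unit_ball_volume(n=200):
    """Midpoint rule for the integral of r^2 sin(theta) over [0,1] x [0,pi] x [0,2 pi].
    The integrand does not depend on phi, so that integration contributes a factor 2 pi."""
    hr, ht = 1.0 / n, math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            th = (j + 0.5) * ht
            total += r * r * math.sin(th)
    return total * hr * ht * 2.0 * math.pi
```

The value returned by `unit_ball_volume()` should be close to $4\pi/3$, the familiar volume of the unit ball in three dimensions.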
There is a rather amusing way to remember the transformation formula (8.5.11) of Theorem 8.5.8. Let us suppose that $\forall x \in \mathcal{D}(g)$, $J_g(x) > 0$ and let us write
\[ J_g(x) = \frac{\partial(g^1, \ldots, g^n)}{\partial(x^1, \ldots, x^n)}(x). \]
Then the transformation formula becomes
\[ \int_{g(A)} f(x)\,dx = \int_A f \circ g(x)\,\frac{\partial(g^1, \ldots, g^n)}{\partial(x^1, \ldots, x^n)}(x)\,dx^1 \cdots dx^n. \tag{8.5.11'} \]
This reminds us more nearly of the form of Theorem 5.2.4. Also, if we rather loosely think of $dx^1 \cdots dx^n$ as canceling $\partial(x^1, \ldots, x^n)$ and think of $\partial(g^1, \ldots, g^n)$ as $dg^1 \cdots dg^n$, then the right-hand integral goes over into
\[ \int_{g(A)} f(g^1, \ldots, g^n)\,dg^1 \cdots dg^n, \]
and this is the correct domain of integration since as the $x$ variables vary over $A$, the $g$ variables vary over $g(A)$. This last integral is the form of the left side of (8.5.11').
Exercises
1. Suppose
\[ f(x, y) = 2\pi(x^2 - y^2)\sin \pi(x - y)^2. \]
Compute the integral of $f$ over the interior of the square having vertices at $(1, 0)$, $(0, 1)$, $(-1, 0)$ and $(0, -1)$. (Hint: Make the transformation $u = x + y$, $v = x - y$.)
2. Suppose $f$ is a real-valued $C^1$ function with domain $[a, b]$. Compute the area of the set
\[ \{(x, y) : x \in [a, b] \ \&\ y \in [f(x), f(x) + 1]\}. \]
Do this by making the change of variable $g(x, y) = (x, y - f(x))$. (See Exercise 9 of Section 8.3.)
3. Let $g$ be the function on $E^2 \setminus \{0\}$ defined by
\[ g(x, y) = \Bigl( \frac{x}{x^2 + y^2}, \frac{y}{x^2 + y^2} \Bigr). \]
Using this transformation, compute the integral
\[ \iint_A \frac{dx\,dy}{(x^2 + y^2)^2}, \]
where $A$ is the region common to the four regions
\[ \{(x, y) : (x - 1)^2 + y^2 < 1\}, \quad \{(x, y) : x^2 + (y - 1)^2 < 1\}, \quad \{(x, y) : (x - 1/2)^2 + y^2 > 1/4\}, \quad \{(x, y) : x^2 + (y - 1/2)^2 > 1/4\}. \]
4. This exercise generalizes Exercise 3. Suppose $f$ is a real-valued function of class $C^2$ with domain an open connected set in $E^n$. Suppose that the gradient of $f$,
\[ \nabla f(x) = \sum_{k=1}^{n} D_k f(x)\,e_k, \]
is a one-to-one function, and the Hessian of $f$
\[ H_f(x) = \det[D_j D_k f(x)] \ne 0, \qquad \forall x \in \mathcal{D}(f). \]
Let $A$ be a bounded Jordan-measurable set with $\bar{A} \subset \mathcal{D}(f)$. Show that
\[ \Bigl| \int_A H_f(x)\,dx \Bigr| = |(\nabla f)^{-1}(A)|. \]
5. Show that
\[ \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}. \]
Do this as follows: First note that
\[ \Bigl( \int_{-\infty}^{\infty} e^{-x^2}\,dx \Bigr)^2 = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-(x^2 + y^2)}\,dx\,dy. \]
Change to polar coordinates and use the transformation theorem on the integral on the right. Do all this carefully, justifying each step.
6. Compute the volume of the unit ball $B(0, 1)$ in $E^n$ by changing to spherical coordinates (see Section 6.5 and Exercise 8 of Section 8.4). [Hint: Use induction to show that the Jacobian of the spherical coordinate transformation in $E^n$ is
\[ \rho^{n-1}(\sin\theta_1)^{n-2}(\sin\theta_2)^{n-3} \cdots (\sin\theta_{n-2}).] \]
7. Suppose $g$ is a function of class $C^1$ with an open domain in $E^2$ so that $\forall x \in \mathcal{D}(g)$, $J_g(x) \ne 0$. Give an example which shows that the transformation theorem may not necessarily be valid for this type of $g$.
8. Suppose $h$ is a linear transformation with domain $E^n$ and range in $E^n$. If $A \subset E^n$ is Jordan measurable show that $h(A)$ is Jordan measurable. [Hint: If $h$ is singular, then $\dim \mathcal{R}(h) = m < n$. Let $U$ be an orthogonal transformation of $E^n$ onto itself so that $U(\mathcal{R}(h)) = E^m$. Then $|U \circ h(A)| = 0$ so that $U \circ h(A)$ is Jordan measurable. Now apply $U^{-1}$ to $U \circ h(A)$.]
CHAPTER 9 | THE INTEGRATION OF DIFFERENTIAL FORMS
I. LINE INTEGRALS
9.1 MOTIVATION AND DEFINITIONS
The concept of a line integral arises quite naturally in physics in connection with the motion of a particle. Suppose that $\gamma(t) = (\gamma^1(t), \gamma^2(t), \gamma^3(t))$ defines a function of the time $t$ and represents the motion under the action of a force field $\omega(x) = (\omega_1(x), \omega_2(x), \omega_3(x))$, which depends only on the position vector $x = (x^1, x^2, x^3)$. Let $\Delta$ be a decomposition of the time interval $[a, b]$. If $J \in \Delta$ and $\tau \in J$, then
\[ \frac{d\gamma^k(\tau)}{dt}\,|J| \]
is the approximate displacement of the particle in the $x^k$ direction during the time interval $J$, and
\[ \sum_{k=1}^{3} \omega_k \circ \gamma(\tau)\,\frac{d\gamma^k(\tau)}{dt}\,|J| \]
is the approximate work done on the particle in the time interval $J$. If we add up all these quantities, as $J$ varies over $\Delta$, and then allow $|\Delta| \to 0$, we get the Riemann-Darboux integral
\[ \int_a^b \sum_{k=1}^{3} \omega_k \circ \gamma(t)\,\frac{d\gamma^k(t)}{dt}\,dt. \tag{9.1.1} \]
By definition, this is the work done on the particle in the given time interval $[a, b]$. The integral in (9.1.1) is called a line integral.
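The sum of approximate work contributions described above translates directly into a short computation. The sketch below is an illustration under stated assumptions: the helix, the force field, and the function names are hypothetical data chosen so that the exact work is easy to state, namely $\int_0^{2\pi} (1 + t)\,dt = 2\pi + 2\pi^2$.

```python
import math

def work_riemann_sum(w, y, dy, a, b, n):
    """Sum over a uniform decomposition of [a, b] of
    sum_k w_k(y(tau)) * (dy^k/dt)(tau) * |J|, with tau the midpoint of J."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        tau = a + (i + 0.5) * h
        force = w(y(tau))
        velocity = dy(tau)
        total += sum(fk * vk for fk, vk in zip(force, velocity)) * h
    return total

# Hypothetical data: motion along a helix in the field w(x) = (-x^2, x^1, x^3).
def helix(t):
    return (math.cos(t), math.sin(t), t)

def helix_velocity(t):
    return (-math.sin(t), math.cos(t), 1.0)

def field(x):
    return (-x[1], x[0], x[2])

approx_work = work_riemann_sum(field, helix, helix_velocity, 0.0, 2.0 * math.pi, 1000)
exact_work = 2.0 * math.pi + 2.0 * math.pi ** 2
```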
We can describe a line integral in the language of differentials in the following way. A first-order differential form $\omega$ is defined as a function with domain in $E^n$ and range in the set of linear transformations (linear functionals) acting from $E^n$ to $E^1$. If $x \in \mathcal{D}(\omega)$ and $u \in E^n$, then
\[ \omega(x)(u) = \sum_{k=1}^{n} u^k\,\omega(x)(e_k). \]
If we set $\omega_k(x) = \omega(x)(e_k)$, and recall that in Section 7.2 we found that $dx^k(x)(u) = u^k$, then we have
\[ \omega(x) = \sum_{k=1}^{n} \omega_k(x)\,dx^k(x). \tag{9.1.2} \]
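Suppose $\gamma$ is a function with domain the closed interval $[a, b]$ and range in $\mathcal{D}(\omega)$. If $\forall k \in (1, n)$ and $\forall t \in [a, b]$, $d\gamma^k(t)$ exists [that is, $d\gamma^k(t)/dt$ exists] we define the composition of $\omega$ with $\gamma$ by the equation
\[ \omega \circ \gamma(t) = \sum_{k=1}^{n} \omega_k \circ \gamma(t)\,dx^k \circ \gamma(t). \tag{9.1.3} \]
Now, from the chain rule we get
\[ dx^k \circ \gamma(t) = dx^k(\gamma(t)) \circ d\gamma(t) = d\gamma^k(t). \]
Hence (9.1.3) can be written
\[ \omega \circ \gamma(t) = \sum_{k=1}^{n} \omega_k \circ \gamma(t)\,d\gamma^k(t). \tag{9.1.3'} \]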
We define the line integral of $\omega$ over the line $\gamma$ by the equation
\[ \int_\gamma \omega = \sum_{k=1}^{n} \int_a^b \omega_k \circ \gamma(t)\,d\gamma^k(t), \tag{9.1.4} \]
provided the integrals on the right exist in some sense. They will certainly exist as Riemann-Stieltjes integrals if each $\gamma^k$ is a continuous function of bounded variation and each $\omega_k$ is continuous. Note that we may have such a situation even though $d\gamma^k(t)$ does not exist for every $t \in [a, b]$. In case $d\gamma^k(t)/dt$ exists $\forall t \in [a, b]$, is continuous, and each $\omega_k$ is continuous, then of course $d\gamma^k(t)$ may be replaced by $(d\gamma^k(t)/dt)\,dt$ in (9.1.4). For $n = 3$, the right side of (9.1.4) is the same as (9.1.1).
It is quite possible that the line integral of a differential form over a curve $\gamma$ may be independent of the "parameterization" of the curve. In more precise language this means the following. Suppose $t(\tau)$ defines a continuous monotone increasing function from an interval $[c, d]$ onto the interval $[a, b]$. If we set $\alpha(\tau) = \gamma \circ t(\tau)$, then from Theorem 5.4.9 we have, $\forall k \in (1, n)$,
\[ \int_c^d \omega_k \circ \alpha(\tau)\,d\alpha^k(\tau) = \int_a^b \omega_k \circ \gamma(t)\,d\gamma^k(t), \]
and thus
\[ \int_\alpha \omega = \int_\gamma \omega. \tag{9.1.5} \]
If $t(\tau)$ defines a monotone decreasing function and we set $\beta(\tau) = \gamma \circ t(\tau)$, then we get (by a slight modification of Theorem 5.4.9)
\[ \int_\beta \omega = -\int_\gamma \omega. \tag{9.1.6} \]
The functions $\alpha$, $\beta$, and $\gamma$ all have the same range, but the line integral of $\omega$ with respect to these various paths may not be the same. In other words, the line integral of $\omega$ depends on more than just the point set that is the common range of these functions. Intuitively speaking, the line integral also depends on the orientation that these functions give to the range. As $\tau$ goes from $c$ to $d$, $\alpha(\tau)$ goes from $\gamma(a)$ to $\gamma(b)$, while $\beta(\tau)$ goes from $\gamma(b)$ to $\gamma(a)$, thus reversing the direction of travel of the range.
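The dependence of a line integral on orientation, and its independence of an increasing change of parameter, can be observed with Riemann-Stieltjes sums. The sketch below is an illustration with hypothetical data: the form $-x^2\,dx^1 + x^1\,dx^2$, the upper unit semicircle, and the two changes of parameter are assumptions made for the example, and for this form the integral over the semicircle traversed counterclockwise equals $\pi$.

```python
import math

def line_integral(omega, curve, a, b, n=20000):
    """Riemann-Stieltjes sum of  sum_k omega_k(curve(tau_i)) * (curve^k(t_{i+1}) - curve^k(t_i))
    over a uniform decomposition of [a, b], with tau_i the midpoint of [t_i, t_{i+1}]."""
    h = (b - a) / n
    total = 0.0
    prev = curve(a)
    for i in range(n):
        nxt = curve(a + (i + 1) * h)
        coeffs = omega(curve(a + (i + 0.5) * h))
        total += sum(c * (q - p) for c, p, q in zip(coeffs, prev, nxt))
        prev = nxt
    return total

def omega(x):
    """Coefficient functions of the form -x^2 dx^1 + x^1 dx^2 (hypothetical example)."""
    return (-x[1], x[0])

def gamma(t):
    """Upper unit semicircle, t in [0, pi]."""
    return (math.cos(t), math.sin(t))

def alpha(tau):
    """gamma composed with the increasing map t(tau) = pi * tau**2 from [0, 1] onto [0, pi]."""
    return gamma(math.pi * tau ** 2)

def beta(tau):
    """gamma composed with the decreasing map t(tau) = pi * (1 - tau) from [0, 1] onto [0, pi]."""
    return gamma(math.pi * (1.0 - tau))

I_gamma = line_integral(omega, gamma, 0.0, math.pi)
I_alpha = line_integral(omega, alpha, 0.0, 1.0)
I_beta = line_integral(omega, beta, 0.0, 1.0)
```

Here `I_alpha` agrees with `I_gamma`, while `I_beta` has the opposite sign, in line with the two cases of an increasing and a decreasing change of parameter.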
The idea of an oriented curve explains, in some sense, the definition
\[ \int_a^b f(t)\,dt = -\int_b^a f(t)\,dt. \tag{9.1.7} \]
We can think of $[a, b]$ as the path given by the function defined by $\gamma(t) = t$, $\forall t \in [a, b]$. On the other hand, define $t(\tau) = a + b - \tau$, $\forall \tau \in [a, b]$. If we write $\beta(\tau) = \gamma \circ t(\tau)$ and $\omega(x) = f(x)\,dx$, then $\omega \circ \gamma(t) = f(t)\,dt$ and $\omega \circ \beta(\tau) = -f(a + b - \tau)\,d\tau$. Hence the line integral of $\omega$ over $\gamma$ is exactly the left side of (9.1.7), while
\[ \int_\beta \omega = -\int_a^b f(a + b - \tau)\,d\tau = -\int_\gamma \omega. \]
The right-hand equality follows from (9.1.6) or the transformation theorem, 8.5.8. Thus we may think of the integral on the right side of (9.1.7) as another name for the line integral of $\omega$ over $\beta$.
On the basis of all the previous motivation we are now ready to give
some formal definitions.
9.1.1 Definition. A function $\alpha$ with domain the closed interval $I \subset E^1$ and range in $E^n$ is said to be equivalent to a function $\beta$ with domain the closed interval $J \subset E^1$ $\Leftrightarrow$ there exists a strictly monotone function $t$, with domain $J$ and range $I$ so that $\beta(\tau) = \alpha \circ t(\tau)$. The function $\alpha$ is said to be orientably equivalent to $\beta$ $\Leftrightarrow$ the function $t$ is monotone increasing.
It is not difficult to check that the relations we have defined are indeed equivalence relations. Note that since the function $t$ is strictly monotone and maps a closed interval onto a closed interval, it follows from Theorem 2.3.6 that it must be continuous. The idea of equivalence among functions leads to the following definition.
9.1.2 Definition. A curve in $E^n$ is an equivalence class of functions (as defined in Definition 9.1.1) with ranges in $E^n$. An oriented curve in $E^n$ is an equivalence class of functions with ranges in $E^n$ under the orientable equivalence relation.
Since our interest in curves is mainly in connection with taking line integrals, we shall, in general, be only interested in those curves which display certain smoothness properties. Line integrals can be taken along (oriented) continuous curves of bounded variation, but for our purposes it will usually be enough to consider even smoother curves.
9.1.3 Definition. A function $f$ with domain the interval $I \subset E^1$ is said to be piecewise smooth $\Leftrightarrow$ $f$ is continuous and there is a decomposition $\Delta$ of $I$ so that $\forall J \in \Delta$, $f'$ is defined and continuous at every point of $J^0$ and $f' \mid J^0$ can be extended continuously to $J$. The function $f$ is said to be piecewise regular $\Leftrightarrow$ it is piecewise smooth and $f'$ vanishes only at a finite (possibly zero) number of points.
A curve consists of a whole equivalence class of functions and it may not be true that two equivalent functions always have the same smoothness properties. This may come about, for example, if the monotone function that gives the passage from one function to the other is not differentiable at a denumerable set of points. However, this is a relatively minor matter and we give the following definition.
9.1.4 Definition. A [oriented] curve is said to be continuous, of bounded variation, piecewise smooth, etc. $\Leftrightarrow$ the equivalence class (which is the curve!) contains a function with the indicated property, respectively. A continuous [oriented] curve is said to be closed $\Leftrightarrow$ there is a representative $\gamma$ with domain $[a, b]$ so that $\gamma(a) = \gamma(b)$.
Note that if we were to change the equivalence relation given in Definition 9.1.1 so that we consider only continuously differentiable functions, with nowhere vanishing derivatives, then if one function in an equivalence class is piecewise smooth, or piecewise regular, so are all the other functions in the same class. Also, if a function has range in $E^n$, it is said to be of bounded variation $\Leftrightarrow$ each real-valued component of the function is of bounded variation.
9.1.5 Definition. A first-order differential form $\omega$ is a function with domain in $E^n$ and range in the set of linear transformations (linear functionals) acting from $E^n$ to $E^1$. A first-order differential form is said to be continuous, of class $C^1$, etc. $\Leftrightarrow$ $\forall k \in (1, n)$ the function defined by $\omega_k(x) = \omega(x)(e_k)$ has the indicated property, respectively.
NOTE: We have already noted that a first-order differential form can be written in the form (9.1.2).
To give a formal definition of a line integral we shall make the usual convention that a curve may be designated by any suitable function in its equivalence class. When we do this in the future, we shall not comment about it and assume that the reader is aware of this convention.
9.1.6 Definition. If $\omega$ is a continuous first-order differential form and $\gamma$ is a continuous oriented curve of bounded variation with range in $\mathcal{D}(\omega)$, then the line integral of $\omega$ over $\gamma$ is defined by
\[ \int_\gamma \omega = \sum_{k=1}^{n} \int_a^b \omega_k \circ \gamma(t)\,d\gamma^k(t), \tag{9.1.4} \]
where $\mathcal{D}(\gamma) = [a, b]$.
As we have already noted, this integral depends only on the oriented curve and not on any particular representative from the equivalence class, so that the integral over the oriented curve is always well defined. It would be possible to define the integral of a differential form over unoriented curves by using the variation $|d\gamma^k(t)|$ in (9.1.4) instead of $d\gamma^k(t)$. This would make the integral independent of any particular representative of the curve. However, we shall not pursue this particular line of thought. We would like to emphasize that if $\gamma$ is piecewise smooth, then (9.1.4) can be written
\[ \int_\gamma \omega = \sum_{k=1}^{n} \int_a^b \omega_k \circ \gamma(t)\,\frac{d\gamma^k(t)}{dt}\,dt. \tag{9.1.4'} \]
Let us take a very simple example of the computation of a line
integral. Suppose ω is the differential form given by
$$\omega(x,y) = (x^2 - y^2)\,dx - 2xy\,dy,$$
and γ is the oriented curve given by
$$\gamma(t) = \Big(\cos\frac{\pi}{2}t,\ \sin\frac{\pi}{2}t\Big), \qquad t\in[0,1].$$
Note that the range of γ is that part of the unit circle which lies in the
first quadrant. Using the definition of a line integral we get
$$\int_\gamma \omega = \int_0^1 \Big(\cos^2\frac{\pi}{2}t - \sin^2\frac{\pi}{2}t\Big)\,d\gamma^1(t) - 2\int_0^1 \cos\frac{\pi}{2}t\,\sin\frac{\pi}{2}t\,d\gamma^2(t)$$
$$= -\frac{\pi}{2}\int_0^1\Big[\Big(\cos^2\frac{\pi}{2}t - \sin^2\frac{\pi}{2}t\Big)\sin\frac{\pi}{2}t + 2\cos^2\frac{\pi}{2}t\,\sin\frac{\pi}{2}t\Big]\,dt = -\frac{1}{3}.$$
On the other hand, let us take α as the oriented curve given by
$$\alpha(t) = \begin{cases}(1-t,\,0), & t\in[0,1],\\ (0,\,t-1), & t\in[1,2].\end{cases}$$
The range of this curve consists of portions of two straight lines, one
proceeding along the x axis from (1, 0) to (0, 0) and the other proceeding
along the y axis from (0, 0) to (0, 1). Hence this curve has the same
initial and final points as γ. Now
$$\frac{d\alpha^1(t)}{dt} = \begin{cases}-1, & t\in[0,1[,\\ 0, & t\in\,]1,2],\end{cases}\qquad \frac{d\alpha^2(t)}{dt} = \begin{cases}0, & t\in[0,1[,\\ 1, & t\in\,]1,2].\end{cases}$$
Hence we get
$$\int_\alpha \omega = -\int_0^1 (1-t)^2\,dt = -\frac{1}{3}.$$
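The two computations above can be checked numerically. The following is a minimal sketch, assuming a plain Riemann-Stieltjes sum in the spirit of (9.1.4); the helper name `line_integral` and the step count are illustrative choices, not part of the text.

```python
import math

def line_integral(omega, curve, a, b, steps=20000):
    """Riemann-Stieltjes sum: sum over k of omega_k(curve(t*)) times the increment of curve^k."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t0, t1 = a + i * h, a + (i + 1) * h
        p0, p1 = curve(t0), curve(t1)
        coeffs = omega(*curve(0.5 * (t0 + t1)))  # evaluate the form at an interior point
        total += sum(c * (q1 - q0) for c, q0, q1 in zip(coeffs, p0, p1))
    return total

def omega(x, y):
    # Components (omega_1, omega_2) of (x^2 - y^2) dx - 2xy dy.
    return (x * x - y * y, -2.0 * x * y)

def gamma(t):
    # Quarter circle from (1, 0) to (0, 1), t in [0, 1].
    return (math.cos(math.pi * t / 2), math.sin(math.pi * t / 2))

def alpha(t):
    # Two segments: (1, 0) -> (0, 0) for t in [0, 1], then (0, 0) -> (0, 1) for t in [1, 2].
    return (1.0 - t, 0.0) if t <= 1.0 else (0.0, t - 1.0)

I_gamma = line_integral(omega, gamma, 0.0, 1.0)
I_alpha = line_integral(omega, alpha, 0.0, 2.0)
print(I_gamma, I_alpha)  # both should be close to -1/3
```

Because the sum uses increments of the curve rather than derivatives, the corner of the second curve at t = 1 needs no special treatment.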
We see from the two computations that we have just made that
$$\int_\gamma \omega = \int_\alpha \omega.$$
This fact is not an accident and indeed we would get the same value
of the integral of ω over every piecewise smooth oriented curve which
proceeds from (1, 0) to (0, 1). We shall see a little later just why this
is so.
If {γ_k: k ∈ (1, m)} is a set of continuous oriented curves of bounded
variation, let us formally form the sum
$$\sum_{k=1}^m a_k\gamma_k, \tag{9.1.8}$$
where ∀k ∈ (1, m), a_k ∈ R. If ω is a first-order continuous differential
form and the range of each γ_k is in 𝒟(ω), then we shall define
$$\int_{\sum a_k\gamma_k} \omega = \sum_{k=1}^m a_k\int_{\gamma_k}\omega. \tag{9.1.9}$$
To be slightly more precise about things, the formal sum in (9.1.8) may
be considered as another name for a real-valued function whose domain
is the collection of continuous oriented curves of bounded variation.
This function takes the value a_k at γ_k ∀k ∈ (1, m), and takes the value
zero at every other curve. Such functions, defined on curves, are called
chains, and the integral of a first-order differential form over a chain is
defined by (9.1.9).
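As a small numerical illustration of (9.1.9), the sketch below represents a chain as a list of (coefficient, curve) pairs and integrates a form over it. The helper names `line_integral` and `chain_integral`, the sample form −y dx + x dy, and the two circular arcs are illustrative assumptions and do not come from the text.

```python
import math

def line_integral(omega, curve, a, b, steps=4000):
    """Riemann-Stieltjes sum approximating the line integral of omega over curve on [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t0, t1 = a + i * h, a + (i + 1) * h
        p0, p1 = curve(t0), curve(t1)
        coeffs = omega(*curve(0.5 * (t0 + t1)))
        total += sum(c * (q1 - q0) for c, q0, q1 in zip(coeffs, p0, p1))
    return total

def chain_integral(omega, chain):
    """Integral over a chain: the sum of a_k times the integral over gamma_k, as in (9.1.9)."""
    return sum(a_k * line_integral(omega, curve, lo, hi) for a_k, (curve, lo, hi) in chain)

def omega(x, y):
    return (-y, x)  # hypothetical sample form: -y dx + x dy

def circle(t):
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))  # full unit circle, t in [0, 1]

def quarter(t):
    return (math.cos(math.pi * t / 2), math.sin(math.pi * t / 2))  # quarter circle, t in [0, 1]

# The chain 2*circle - quarter: the circle counted twice, the quarter arc with coefficient -1.
chain = [(2.0, (circle, 0.0, 1.0)), (-1.0, (quarter, 0.0, 1.0))]
value = chain_integral(omega, chain)
print(value, 3.5 * math.pi)  # the two numbers should nearly agree
```

For this form the full circle contributes 2π and the quarter arc π/2, so the chain integral is 2·2π − π/2.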
Intuitively speaking, if a_k is a positive integer, then a_kγ_k may be
considered as the curve γ_k taken a_k times. If a_k is negative, say −1, then
we shall write a_kγ_k as −γ_k and think of this as a curve having the same
range as γ_k but with the opposite orientation. Let us briefly comment
about the meaning of "opposite orientation" of an oriented curve. From
an intuitive geometric point of view, any curve should have two different
orientations or two different directions of travel. From a more precise
point of view, this means that if the orientable equivalence relation is

applied to a curve, it should break up the curve into exactly two nonvoid
classes that are oriented curves. One oriented curve is then said to have
the negative or opposite orientation of the other.
Because of the generality with which we have been discussing curves,
it is not always true that every curve breaks into two oriented curves.
This is already seen to be true in the extreme case where the range of the
function is one point. Clearly every function in such a curve belongs to
the same equivalence class under the orientable equivalence relation.
However, this is not really a very good example, since such a curve plays
essentially the same role for curves as zero does for the real number
system. For a more pertinent example, let γ be the unoriented curve
defined by the function with values
$$\alpha(t) = (\cos 2\pi t,\ \cos 2\pi t), \qquad t\in[0,1].$$
In other words, γ is the collection of functions equivalent to the function
α. The range of this curve is the straight-line segment lying between
(−1, −1) and (1, 1). Suppose the curve γ breaks into more than one
class under the orientable equivalence relation. Suppose β is a function
in γ but not orientably equivalent to α. But since β is equivalent to α,
there is a continuous monotone decreasing function t with range [0, 1]
so that β(τ) = α∘t(τ). Let us set δ(σ) = α(1 − σ), ∀σ ∈ [0, 1]; then
δ(1 − t(τ)) = α∘t(τ) = β(τ). Since 1 − t(τ) is a monotone increasing
function, it follows that δ and β are orientably equivalent. But δ(σ)
= α(1 − σ) = α(σ) and thus δ and α are orientably equivalent. But
this means that α and β are orientably equivalent, which is a contradiction.
To avoid these types of pathologies, as far as the orientation question
is concerned, it is necessary to take a more restrictive class of curves and
somewhat more restrictive equivalence relations. We shall not develop
this point of view since it will not be important for the problems we
shall discuss. However, we have asked the reader to investigate the
situation in Exercise 5 below.
Exercises
1. Compute the value of the line integral of the first-order differential
form with domain E^2,
$$\omega(x,y) = x\,dx - y\,dy,$$
over the following curves:
(a) γ(t) = (cos πt, sin πt), t ∈ [0, 1].
(b) γ(t) = (1 − t, 0), t ∈ [0, 2].
(c) γ(t) = (1 − t, 1 − |1 − t|), t ∈ [0, 2].
2. Compute the value of the line integral of the first-order differential
form with domain E^3,
$$\omega(x,y,z) = yz\,dx + xz\,dy + xy\,dz,$$
over the following curves:
(a) γ(t) = (cos 2πt, sin 2πt, 2t), t ∈ [0, 3].
(b) γ(t) = (1, 0, t), t ∈ [0, 6].
3. Compute the value of the line integral of the first-order differential
form with domain E^2 \ {0},
$$\omega(x,y) = -\frac{y}{x^2+y^2}\,dx + \frac{x}{x^2+y^2}\,dy,$$
over the following curves:
(a) γ(t) = (cos πt, sin πt), t ∈ [0, 1].
(b) γ(t) = (cos πt, −sin πt), t ∈ [0, 1].
Note that these line integrals are not the same, even though the two
curves have the same initial and final points.
4.
An ellipse is the set of all vectors
(x,y) in E2
so that
Find a representative of a closed oriented curve whose range is this

ellipse and traverses it once in the "counterclockwise" direction. If this
oriented curve is γ, compute the line integral
5. A function with an interval domain in E^1 and range in E^n is said
to be regular <=> it has a continuous nonzero derivative at each point
of its domain (see Definition 9.1.3). Let us say that two regular functions
α and β are equivalent <=> there is a continuously differentiable strictly
monotone function t(τ) (increasing or decreasing) so that β(τ) = α∘t(τ),
and orientably equivalent <=> t'(τ) > 0. A regular curve is an equivalence
class of regular functions, and an oriented regular curve is an equivalence
class under the orientable equivalence relation. Show that every
regular curve contains exactly two regular oriented curves.
9.2 THE LENGTH OF A CURVE
Let γ be a curve in E^n. Using our convention that a curve can be named
by one of its representatives, we shall suppose that γ is a function with
domain [a, b] and range in E^n. Let Δ be a decomposition of [a, b],
Δ = {[a_k, b_k]: k ∈ (1, m)}, and set
$$l_\gamma(\Delta) = \sum_{k=1}^m |\gamma(b_k) - \gamma(a_k)|. \tag{9.2.1}$$
This defines a real-valued function l_γ on the set of all decompositions
of [a, b]. Following the terminology of Section 5.5 we say that γ is of
bounded variation <=> l_γ is bounded. We now formally define the length
of a curve as follows.
9.2.1 Definition. If γ is a curve of bounded variation in E^n, then the
length of the curve is the number
$$l(\gamma) = \text{l.u.b.}\,\{l_\gamma(\Delta): \Delta \text{ a decomposition of } [a,b]\}. \tag{9.2.2}$$
In line with the terminology of Section 5.5, it would also be reasonable
to say that l(γ) is the total variation of γ.
Suppose now that t ∈ [a, b], we restrict γ to [a, t], and we define
l(γ, t) by (9.2.2), where Δ is now to be taken as a decomposition of [a, t].
The function of t defined by l(γ, t) is monotone, nondecreasing, and
hence the Riemann-Stieltjes integral
$$\int_a^b g(t)\,dl(\gamma,t)$$
exists for every real-valued continuous function g. In particular, if
l(γ, t) defines a continuous function and f is a real-valued continuous
function whose domain contains the range of γ, then
$$\int_a^b f\circ\gamma(t)\,dl(\gamma,t) \tag{9.2.3}$$
exists. This last integral may be viewed as another type of line integral
where the function f is being integrated along γ with respect to the
length along γ. In particular, if f is the constant function 1, then (9.2.3)
is l(γ).
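If γ is a smooth curve, l(γ) can be computed in a particularly simple
way.
9.2.2 Theorem. If γ is of class C^1, then
$$l(\gamma) = \int_a^b |\gamma'(t)|\,dt.$$
Proof. Suppose Δ = {[a_k, b_k]: k ∈ (1, m)} is any decomposition of
[a, b], and write I_k = [a_k, b_k] and |I_k| = b_k − a_k. By the mean value
theorem we get
$$|\gamma(b_k) - \gamma(a_k)| = \Big\{\sum_{j=1}^n\Big[\frac{d\gamma^j(t_{jk})}{dt}\Big]^2\Big\}^{1/2}|I_k|, \qquad t_{jk}\in I_k.$$
Now, since γ' is continuous, ∀ε > 0, ∃Δ_ε so that ∀Δ > Δ_ε and ∀t_k ∈ I_k we have
$$|\gamma(b_k) - \gamma(a_k)| = \Big\{\sum_{j=1}^n\Big[\frac{d\gamma^j(t_k)}{dt}\Big]^2\Big\}^{1/2}|I_k| + \eta_k|I_k|/(b-a), \tag{9.2.4}$$
where |η_k| < ε; moreover,
$$\overline{S}_{|\gamma'|}(\Delta) \le \int_a^b |\gamma'(t)|\,dt + \varepsilon, \tag{9.2.5}$$
where $\overline{S}_{|\gamma'|}(\Delta)$ and $\underline{S}_{|\gamma'|}(\Delta)$ denote the upper and lower sums of |γ'| over Δ.
If m_k = inf{|γ'(t)|: t ∈ I_k}, then using (9.2.4) we get
$$m_k|I_k| \le \Big\{\sum_{j=1}^n\Big[\frac{d\gamma^j(t_k)}{dt}\Big]^2\Big\}^{1/2}|I_k| \le |\gamma(b_k)-\gamma(a_k)| + \varepsilon|I_k|/(b-a).$$
Thus, summing over k, we get
$$\underline{S}_{|\gamma'|}(\Delta) \le \sum_{k=1}^m|\gamma(b_k)-\gamma(a_k)| + \varepsilon.$$
Taking the l.u.b. as Δ varies over all refinements of Δ_ε, we get
$$\int_a^b|\gamma'(t)|\,dt \le l(\gamma)+\varepsilon.$$
In a similar way, and using (9.2.5), we get
$$\sum_{k=1}^m|\gamma(b_k)-\gamma(a_k)| \le \overline{S}_{|\gamma'|}(\Delta)+\varepsilon \le \int_a^b|\gamma'(t)|\,dt + 2\varepsilon.$$
Thus
$$l(\gamma) \le \int_a^b|\gamma'(t)|\,dt + 2\varepsilon.$$
Since ε is arbitrary we have completed the proof.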
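Theorem 9.2.2 can be illustrated numerically by comparing the polygonal sums l_γ(Δ) of (9.2.1) with the integral of |γ'|. The sketch below is only an illustration under assumed choices: the helix, the uniform decompositions, and the helper names `polygon_length` and `speed_integral` are not from the text.

```python
import math

def polygon_length(curve, a, b, m):
    """l_gamma(Delta) from (9.2.1) for the uniform decomposition of [a, b] into m intervals."""
    pts = [curve(a + (b - a) * k / m) for k in range(m + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def speed_integral(speed, a, b, m=2000):
    """Midpoint-rule approximation of the integral of |gamma'(t)| over [a, b]."""
    h = (b - a) / m
    return h * sum(speed(a + (k + 0.5) * h) for k in range(m))

def helix(t):
    return (math.cos(t), math.sin(t), t)

def helix_speed(t):
    return math.sqrt(2.0)  # |(-sin t, cos t, 1)| = sqrt(2) for every t

a, b = 0.0, 2 * math.pi
# Each decomposition refines the previous one, so the sums cannot decrease.
lengths = [polygon_length(helix, a, b, m) for m in (4, 16, 64, 256)]
integral = speed_integral(helix_speed, a, b)
print(lengths, integral)
```

The polygonal sums increase toward the value of the integral, in line with the definition of l(γ) as a least upper bound.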
It is possible to connect the line integral of Section 9.1 with the line
integral (9.2.3). Let us use the same physical example we used in Section
9.1, namely, we shall compute the work done on a particle acted on
by a force field ω(x) = (ω_1(x), ω_2(x), ω_3(x)). Let the position of the
particle at any time t be given by γ(t) = (γ^1(t), γ^2(t), γ^3(t)). Since we
have a physical system we may assume that the particle has an acceleration
at every time t and hence the velocity γ'(t) is continuous. Let us
make the simplifying assumption that in the time interval [a, b] the
velocity does not vanish. The vector γ'(t) is, by definition, a tangent
vector to γ at t. The vector
$$\theta(t) = \gamma'(t)/|\gamma'(t)| \tag{9.2.6}$$
is a tangent vector of unit length and is often called the orientation of
the oriented curve γ.
If we project the force vector ω∘γ(t) onto the one-dimensional space
generated by θ(t), the projection is of course given by
$$[\omega\circ\gamma(t)\cdot\theta(t)]\,\theta(t).$$
This is the component of the force that lies in the space generated by
θ(t). Let Δ = {[a_k, b_k]: k ∈ (1, m)} be a decomposition of [a, b] and
t_k ∈ [a_k, b_k]. Then
$$\sum_{k=1}^m[\omega\circ\gamma(t_k)\cdot\theta(t_k)]\,[l(\gamma,b_k)-l(\gamma,a_k)]$$
should be approximately the work done by the force field ω acting on
the particle in the time interval [a, b]. If we pass to the limit we get the
integral
$$\int_a^b\omega\circ\gamma(t)\cdot\theta(t)\,dl(\gamma,t). \tag{9.2.7}$$
To show that this actually coincides with the integral (9.1.1), we note that
$$\omega\circ\gamma(t)\cdot\theta(t) = \frac{1}{|\gamma'(t)|}\sum_{k=1}^3\omega_k\circ\gamma(t)\,\frac{d\gamma^k(t)}{dt},$$
and that
$$l(\gamma,t) = \int_a^t|\gamma'(\tau)|\,d\tau.$$
Thus
$$\frac{dl(\gamma,t)}{dt} = |\gamma'(t)|.$$
From Theorem 5.4.7 it follows that the integral (9.2.7) is the same as
(9.1.1).
Let us close this section by showing how a continuous oriented curve
of bounded variation may be "parameterized by its arc length." Suppose
γ is the oriented curve. Using our convention about the naming of
curves, we shall suppose that γ is a continuous function of bounded
variation with domain [a, b] and range in E^n. Let us set
$$s(t) = \int_a^t dl(\gamma,\tau) = l(\gamma,t).$$
Since γ is continuous it follows that s is continuous. Further, if t_1 < t_2,
then it is certainly clear that s(t_1) ≤ s(t_2). If we suppose γ is one to one,
then t_1 < t_2 ==> s(t_1) < s(t_2). For, if s(t_1) = s(t_2), it follows that γ|[t_1, t_2]
is a constant function, which contradicts the fact that it is one to one.
Thus, if γ is one to one, s is a monotone increasing continuous function
with domain [a, b] and range [0, l(γ)]. If we denote the values of the
inverse function by t(s) and set α(s) = γ∘t(s), then α is another representation
for the oriented curve γ. This is what we mean by parameterizing
a curve by its arc length. Notice also that if γ is of class C^1, then
$$s(t) = \int_a^t|\gamma'(\tau)|\,d\tau;$$
and if γ' never vanishes, s is monotone increasing and thus we can
parameterize γ by its arc length, even if γ is not one to one.
Exercises
1. Justify the remarks made in the last paragraph of this section;
that is, show that γ continuous implies s is continuous and s(t_1) = s(t_2)
implies γ|[t_1, t_2] is a constant function.
2. If γ is a curve of class C^1 that is regular (γ' does not vanish!) and
is "parameterized by its arc length s", show that |dγ(s)/ds| = 1.
3. Prove the following converse of Exercise 2. If γ is a function of
class C^1 defined on [a, b], and ∀t ∈ [a, b], |dγ(t)/dt| = 1, then
l(γ, t) = t − a.
4. Suppose α and β are curves with the same range. Without loss
of generality, we may suppose that α and β are functions having the
same interval domain. If α is of bounded variation, is it necessarily
true that β is of bounded variation, and l(α) = l(β)?
9.3
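A SPECIAL CASE OF STOKES' THEOREM
Let I = [0, 1] × [0, 1] and let γ be the piecewise smooth closed curve
given as follows:
$$\gamma(t) = \begin{cases}(t,\,0), & \forall t\in[0,1],\\ (1,\,t-1), & \forall t\in[1,2],\\ (3-t,\,1), & \forall t\in[2,3],\\ (0,\,4-t), & \forall t\in[3,4].\end{cases} \tag{9.3.1}$$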
Geometrically speaking, γ proceeds around the boundary of I in a
"counterclockwise direction."
Let ω be a first-order differential form of class C^1 whose domain is
an open set containing I. The integral of ω over γ is, by definition,
$$\int_\gamma\omega = \int_0^4\big[\omega_1\circ\gamma(t)\,d\gamma^1(t) + \omega_2\circ\gamma(t)\,d\gamma^2(t)\big] = \int_0^4\sum_{k=1}^2\omega_k\circ\gamma(t)\,\frac{d\gamma^k(t)}{dt}\,dt.$$
Note that since γ^k is piecewise smooth the last integral exists. If we note
that, except at four points where it is undefined, dγ^k/dt takes on the value
−1, 0, or 1, we get
$$\int_\gamma\omega = \int_0^1[\omega_2(1,\tau)-\omega_2(0,\tau)]\,d\tau - \int_0^1[\omega_1(t,1)-\omega_1(t,0)]\,dt. \tag{9.3.2}$$
On the other hand,
$$\int_I D_2\omega_1(x)\,dx = \int_0^1\Big[\int_0^1 D_2\omega_1(t,\tau)\,d\tau\Big]dt = \int_0^1[\omega_1(t,1)-\omega_1(t,0)]\,dt,$$
$$\int_I D_1\omega_2(x)\,dx = \int_0^1\Big[\int_0^1 D_1\omega_2(t,\tau)\,dt\Big]d\tau = \int_0^1[\omega_2(1,\tau)-\omega_2(0,\tau)]\,d\tau. \tag{9.3.3}$$
If we compare the formulas in (9.3.2) and (9.3.3) we see that
$$\int_I[D_1\omega_2(x)-D_2\omega_1(x)]\,dx = \int_\gamma\omega. \tag{9.3.4}$$
This simple formula is an analogue of the formula in Theorem 5.2.2.
Indeed, note that we used Theorem 5.2.2 in its proof.
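Formula (9.3.4) lends itself to a quick numerical sanity check on the unit square. The following sketch assumes a sample form, −y³ dx + x³ dy, chosen only for illustration; the helper names and the midpoint-rule quadrature are likewise assumptions rather than anything prescribed by the text.

```python
def gamma(t):
    """Boundary of the unit square I, traversed counterclockwise as in (9.3.1), t in [0, 4]."""
    if t <= 1.0:
        return (t, 0.0)
    if t <= 2.0:
        return (1.0, t - 1.0)
    if t <= 3.0:
        return (3.0 - t, 1.0)
    return (0.0, 4.0 - t)

def omega(x, y):
    return (-y ** 3, x ** 3)  # sample form: -y^3 dx + x^3 dy

def curl(x, y):
    return 3.0 * x ** 2 + 3.0 * y ** 2  # D_1 omega_2 - D_2 omega_1 for the sample form

def line_integral(omega, curve, a, b, steps=4000):
    """Riemann-Stieltjes sum approximating the line integral of omega over curve."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t0, t1 = a + i * h, a + (i + 1) * h
        p0, p1 = curve(t0), curve(t1)
        coeffs = omega(*curve(0.5 * (t0 + t1)))
        total += sum(c * (q1 - q0) for c, q0, q1 in zip(coeffs, p0, p1))
    return total

def square_integral(f, n=200):
    """Midpoint rule for the integral of f over the unit square."""
    h = 1.0 / n
    return h * h * sum(f((i + 0.5) * h, (j + 0.5) * h) for i in range(n) for j in range(n))

lhs = square_integral(curl)                  # integral of D_1 omega_2 - D_2 omega_1 over I
rhs = line_integral(omega, gamma, 0.0, 4.0)  # integral of omega over the boundary curve
print(lhs, rhs)  # both should be close to 2
```

For this form only the right and top edges contribute to the boundary integral, each giving 1.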
The curve γ may be identified with a chain in the following way.
For t ∈ [0, 1] let us set
$$I_0^1(t) = (0,t),\quad I_0^2(t) = (t,0),\quad I_1^1(t) = (1,t),\quad I_1^2(t) = (t,1). \tag{9.3.1'}$$
Let us then form the chain
$$\partial I = \sum_{j=0}^1\sum_{k=1}^2(-1)^{j+k}I_j^k.$$
Using the definition we gave in Section 9.1 for integration of a first-order
differential form over a chain, we get
$$\int_{\partial I}\omega = \sum_{j=0}^1\sum_{k=1}^2(-1)^{j+k}\int_{I_j^k}\omega = \int_\gamma\omega. \tag{9.3.5}$$
In higher dimensions it is sometimes much more convenient to con

sider integrals over higher-dimensional chains rather than over certain
surfaces. It is for this reason that we have noted the chain form of the
integration of w over y.
In applications it is usually the case that we want to apply the formula
(9.3.4) over more general regions than the unit cube I. Although this
can be carried out for rather general situations, we shall confine ourselves
at present to showing that the formula holds for regions that are
smooth mappings of I.
Let us suppose that φ is of class C^2 with I ⊂ 𝒟(φ) and ℛ(φ) ⊂ E^2.
Further, let us suppose that ∀x ∈ I^0, J_φ(x) > 0 and φ|I^0 is one to one.
Then we may apply the transformation theorem 8.5.8 to get
$$\int_{\varphi(I)}[D_1\omega_2(x)-D_2\omega_1(x)]\,dx = \int_I[D_1\omega_2(\varphi(x))-D_2\omega_1(\varphi(x))]\,J_\varphi(x)\,dx. \tag{9.3.6}$$
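On the other hand, ζ = φ∘γ defines an oriented piecewise smooth
curve and by definition
$$\int_\zeta\omega = \int_0^4\sum_{k=1}^2\omega_k\circ\zeta(t)\,\frac{d\zeta^k(t)}{dt}\,dt, \tag{9.3.7}$$
where, of course, the derivative of ζ^k does not exist at four points.
Using the chain rule we get
$$\frac{d\zeta^k(t)}{dt} = \sum_{j=1}^2D_j\varphi^k(\gamma(t))\,\frac{d\gamma^j(t)}{dt},$$
so that
$$\sum_{k=1}^2\omega_k\circ\zeta(t)\,\frac{d\zeta^k(t)}{dt} = \sum_{j=1}^2\Big[\sum_{k=1}^2\omega_k\circ\zeta(t)\,D_j\varphi^k(\gamma(t))\Big]\frac{d\gamma^j(t)}{dt}.$$
Let us set
$$\omega_j^*(x) = \sum_{k=1}^2\omega_k\circ\varphi(x)\,D_j\varphi^k(x), \tag{9.3.8}$$
$$\omega^*(x) = \sum_{j=1}^2\omega_j^*(x)\,dx^j. \tag{9.3.8'}$$
From (9.3.7) and the definition of ω* we get
$$\int_{\varphi\circ\gamma}\omega = \int_\gamma\omega^*. \tag{9.3.9}$$
Now, we have already proved [formula (9.3.4)] that
$$\int_I[D_1\omega_2^*(x)-D_2\omega_1^*(x)]\,dx = \int_\gamma\omega^*. \tag{9.3.10}$$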
To connect the left side of (9.3.10) with the right side of (9.3.6), we must
compute the quantity D_1ω_2^*(x) − D_2ω_1^*(x) in terms of the differential
form ω. Since φ is of class C^2 we can use (9.3.8) to compute this quantity
and a straightforward calculation gives
$$D_1\omega_2^*(x)-D_2\omega_1^*(x) = [D_1\omega_2(\varphi(x))-D_2\omega_1(\varphi(x))]\,J_\varphi(x). \tag{9.3.11}$$
Thus if we use (9.3.11) in (9.3.10), and make use of (9.3.6) and (9.3.9)
we get
$$\int_{\varphi(I)}[D_1\omega_2(x)-D_2\omega_1(x)]\,dx = \int_{\varphi\circ\gamma}\omega. \tag{9.3.12}$$
Parenthetically, let us introduce another notation for ω* which is
more or less standard and will arise again in higher dimensions. From
(9.3.8) and (9.3.8') we get
$$\omega^*(x) = \sum_{k=1}^2\omega_k\circ\varphi(x)\Big[\sum_{j=1}^2D_j\varphi^k(x)\,dx^j\Big].$$
Since dφ^k(x) = Σ_{j=1}^2 D_jφ^k(x) dx^j and dx^k∘φ(x) = dφ^k(x), we have
$$\omega^*(x) = \sum_{j=1}^2\omega_j^*(x)\,dx^j = \sum_{k=1}^2\omega_k\circ\varphi(x)\,dx^k\circ\varphi(x).$$
Some standard notation for ω* is [see (9.1.3)]
$$\omega^*(x) = \varphi^*\omega(x) = \omega\circ\varphi(x).$$
We have proved the following version of Stokes' theorem, to which
the names "Green" and "Gauss" may also be attached.
9.3.1 Theorem. Suppose φ is a function of class C^2 with an open domain
in E^2 and range also in E^2. Further suppose that I ⊂ 𝒟(φ), ∀x ∈ I^0,
J_φ(x) > 0, and φ|I^0 is one to one. If ω is a first-order differential form of
class C^1 with an open domain containing φ(I), then
$$\int_{\varphi(I)}[D_1\omega_2(x)-D_2\omega_1(x)]\,dx = \int_{\varphi\circ\gamma}\omega, \tag{9.3.12}$$
where γ is given by (9.3.1).
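We can also write the formula (9.3.12) in terms of a chain. Indeed, set
$$\partial\varphi(I) = \sum_{j=0}^1\sum_{k=1}^2(-1)^{j+k}\,\varphi\circ I_j^k,$$
where I_j^k is given by (9.3.1'). It is indeed a simple matter to verify that
$$\int_{\partial\varphi(I)}\omega = \sum_{j=0}^1\sum_{k=1}^2(-1)^{j+k}\int_{\varphi\circ I_j^k}\omega = \int_{\varphi\circ\gamma}\omega.$$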
Let us give a few examples that show how Theorem 9.3.1 can be used.
First, it is clear that if φ is any affine transformation with a positive
(constant) Jacobian, then the hypotheses of the theorem are satisfied
and (9.3.12) is satisfied. Thus, by composing any function φ, which
satisfies the hypotheses of Theorem 9.3.1, with a suitable affine transformation,
the formula (9.3.12) is satisfied for a rectangle I in any
position and γ its bounding curve proceeding in a "counterclockwise"
direction.
Now let φ be the polar coordinate transformation in E^2:
$$\varphi(r,\theta) = (r\cos\theta,\ r\sin\theta).$$
FIGURE 9.3.1
Let J be the rectangle [r_1, r_2] × [0, 2π], where r_1 ≥ 0. The function
φ is of class C^2, the Jacobian of φ is positive in the interior of J and
φ|J^0 is one to one. Hence formula (9.3.12) will hold when I is replaced
by J and γ is the boundary of J proceeding in a "counterclockwise
direction." The interval J maps under φ onto the annulus shown in
Fig. 9.3.1. Note that the two vertical edges of J map onto the circular
boundaries of the annulus, while the horizontal edges map onto the
portion of the positive x axis that lies between the two circular boundaries.
However, note that the two different horizontal boundaries map
in opposite directions so that when we integrate over φ∘γ this portion
of the line integral will cancel out. Hence Theorem 9.3.1 takes the form
$$\iint_R[D_1\omega_2-D_2\omega_1]\,dx\,dy = \int_{C_2}\omega - \int_{C_1}\omega,$$
where R is the annulus {(x, y): r_1 ≤ |(x, y)| ≤ r_2}, C_2 is the outer
circle, and C_1 is the inner circle, both proceeding in a "counterclockwise
direction."
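The annulus form of the theorem can also be checked numerically. This is a sketch under assumed choices: the sample form −y³ dx + x³ dy, the radii r_1 = 1 and r_2 = 2, and the helper names are all illustrative. The area integral is computed in polar coordinates, that is, over the rectangle J with the Jacobian factor r.

```python
import math

def omega(x, y):
    return (-y ** 3, x ** 3)  # sample form: -y^3 dx + x^3 dy

def curl(x, y):
    return 3.0 * x * x + 3.0 * y * y  # D_1 omega_2 - D_2 omega_1 for the sample form

def circle(radius):
    # Counterclockwise circle of the given radius, parameter t in [0, 2*pi].
    return lambda t: (radius * math.cos(t), radius * math.sin(t))

def line_integral(omega, curve, a, b, steps=4000):
    """Riemann-Stieltjes sum approximating the line integral of omega over curve."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t0, t1 = a + i * h, a + (i + 1) * h
        p0, p1 = curve(t0), curve(t1)
        coeffs = omega(*curve(0.5 * (t0 + t1)))
        total += sum(c * (q1 - q0) for c, q0, q1 in zip(coeffs, p0, p1))
    return total

def annulus_integral(f, r1, r2, nr=200, nth=200):
    """Midpoint rule over J = [r1, r2] x [0, 2*pi] of f(phi(r, theta)) * J_phi, with J_phi = r."""
    hr, hth = (r2 - r1) / nr, 2 * math.pi / nth
    total = 0.0
    for i in range(nr):
        r = r1 + (i + 0.5) * hr
        for j in range(nth):
            th = (j + 0.5) * hth
            total += f(r * math.cos(th), r * math.sin(th)) * r
    return total * hr * hth

r1, r2 = 1.0, 2.0
lhs = annulus_integral(curl, r1, r2)
rhs = (line_integral(omega, circle(r2), 0.0, 2 * math.pi)
       - line_integral(omega, circle(r1), 0.0, 2 * math.pi))
print(lhs, rhs, 22.5 * math.pi)  # all three should nearly agree
```

For this form the integral over a counterclockwise circle of radius r is (3π/2) r⁴, so both sides equal (3π/2)(16 − 1).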
As a second example, let us show that Stokes' theorem is valid for a
triangle and its interior. Let us set
$$\varphi(x,y) = (x,\ (1-x)y).$$
This clearly defines a function of class C^2 on E^2, and its Jacobian is
given by
$$J_\varphi(x,y) = \begin{vmatrix}1 & 0\\ -y & 1-x\end{vmatrix} = 1-x.$$
If (x, y) is in the interior of the unit cube I, then J_φ(x, y) > 0. Also, φ|I^0 is
a one-to-one function so that we may apply Theorem 9.3.1. The map of
I under φ is shown in Fig. 9.3.2.
FIGURE 9.3.2
Suppose now that u_1 and u_2 are linearly independent vectors in E^2
and set
$$\psi(e_1) = u_1,\qquad \psi(e_2) = u_2.$$
Then extend ψ by linearity to all of E^2. Suppose that u_1 and u_2 are
chosen so that
$$J_\psi(x,y) = \det(u_1, u_2) > 0.$$
Then ψ∘φ satisfies all the hypotheses of Theorem 9.3.1 and we have
the map of I as shown in Fig. 9.3.3. Finally, by an affine transformation
we may move the triangle on the right away from the origin so that
Stokes' theorem is valid for any triangle.
FIGURE 9.3.3
As a third example we shall show how to obtain a somewhat more
exotic region for which Stokes' theorem will hold. Let us set
$$\varphi(x,y) = \begin{cases}\big(x,\ y(x^5\sin\frac{1}{x}+1)\big), & x\ne 0,\\ (0,\,y), & x = 0.\end{cases}$$
It is not difficult to check that φ is of class C^2, and φ|I^0 is one to one and
has a positive Jacobian. Hence we may apply the formula (9.3.12). The
map of I under φ is shown in Fig. 9.3.4.
FIGURE 9.3.4
As a final example, let us suppose that φ satisfies the conditions of
Theorem 9.3.1 and ω is a first-order differential form of class C^1 of
the form
$$\omega(x,y) = -\frac{\partial u(x,y)}{\partial y}\,dx + \frac{\partial u(x,y)}{\partial x}\,dy.$$
Let us set R = φ(I) and ζ = φ∘γ and let us parameterize the oriented
curve ζ by its arc length s. As we have seen in Section 9.2, a tangent
vector to ζ of unit length is dζ(s)/ds. The vector
$$n = \Big(\frac{d\zeta^2(s)}{ds},\ -\frac{d\zeta^1(s)}{ds}\Big)$$
is called the outward normal to ζ at s. It is clearly orthogonal to the tangent
vector dζ(s)/ds. If we compute the directional derivative of u with
respect to n we get
$$D_nu(x,y)\,ds = \Big[\frac{\partial u(x,y)}{\partial x}n^1 + \frac{\partial u(x,y)}{\partial y}n^2\Big]ds = \Big[-\frac{\partial u(x,y)}{\partial y}\frac{d\zeta^1(s)}{ds} + \frac{\partial u(x,y)}{\partial x}\frac{d\zeta^2(s)}{ds}\Big]ds.$$
Thus
$$D_nu(\zeta(s))\,ds = \omega\circ\zeta(s).$$
If we use the symbol '∂u/∂n' for D_n u, from Stokes' theorem we get
$$\iint_R[D_1\omega_2-D_2\omega_1]\,dx\,dy = \int_0^{l(\zeta)}\frac{\partial u(\zeta(s))}{\partial n}\,ds.$$
The last integral is also very often simply written as
$$\int_{\partial R}\frac{\partial u}{\partial n}\,ds.$$
The operator
$$\Delta = \frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}$$
is called the Laplacean operator, and hence we may rewrite Stokes'
theorem for the particular ω we are using here as
$$\iint_R\Delta u(x,y)\,dx\,dy = \int_{\partial R}\frac{\partial u}{\partial n}\,ds, \tag{9.3.13}$$
where we are using '∂R' as another name for ζ.
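Formula (9.3.13) can be tried out on the simplest region, the unit square itself (φ the identity), whose boundary curve (9.3.1) is already parameterized by arc length. The sketch below assumes the sample function u(x, y) = x³ + xy²; the function and the helper names are illustrative and not taken from the text. The normal on each boundary piece is obtained from the tangent as n = (dζ²/ds, −dζ¹/ds).

```python
def boundary(s):
    """Boundary of the unit square, counterclockwise, parameterized by arc length s in [0, 4]."""
    if s <= 1.0:
        return (s, 0.0)
    if s <= 2.0:
        return (1.0, s - 1.0)
    if s <= 3.0:
        return (3.0 - s, 1.0)
    return (0.0, 4.0 - s)

def grad_u(x, y):
    return (3.0 * x * x + y * y, 2.0 * x * y)  # gradient of u = x^3 + x*y^2

def laplace_u(x, y):
    return 8.0 * x  # u_xx + u_yy = 6x + 2x

def normal_derivative_integral(grad, curve, length, steps=4000):
    """Approximate the integral of du/dn over the curve, with n = (dz2/ds, -dz1/ds)."""
    h = length / steps
    total = 0.0
    for i in range(steps):
        s0, s1 = i * h, (i + 1) * h
        p0, p1 = curve(s0), curve(s1)
        t1, t2 = (p1[0] - p0[0]) / h, (p1[1] - p0[1]) / h  # unit tangent on this piece
        n1, n2 = t2, -t1                                    # outward normal
        gx, gy = grad(*curve(0.5 * (s0 + s1)))
        total += (gx * n1 + gy * n2) * h
    return total

def square_integral(f, n=100):
    """Midpoint rule for the integral of f over the unit square."""
    h = 1.0 / n
    return h * h * sum(f((i + 0.5) * h, (j + 0.5) * h) for i in range(n) for j in range(n))

lhs = square_integral(laplace_u)
rhs = normal_derivative_integral(grad_u, boundary, 4.0)
print(lhs, rhs)  # both should be close to 4
```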
Exercises
1. Use the Stokes-Green-Gauss theorem to evaluate the following
line integrals:
(a) f. x2y dx x dy, where proceed "clockwise" around the
boundary of J = {(x, y): 3 ≤ x ≤ 5, 1 ≤ y ≤ 3}.
(b) ∫_ζ x dy − y dx, where ζ proceeds "counterclockwise" around
the boundary of the circle whose equation is (x − 2)^2 + y^2 = 4.
(c) ∫_ζ e^x sin y dx + e^x cos y dy, where ζ proceeds "clockwise"
around the boundary of any region where we may apply the
Stokes-Green-Gauss theorem.
2. Show that the Stokes-Green-Gauss theorem is valid for an

ellipse and its boundary.
3. Generalize Exercise 2 and show that the Stokes-Green-Gauss
theorem is valid for an "elliptical ring."
4. Use the Stokes-Green-Gauss theorem to find the area enclosed
by an ellipse by evaluating a line integral.
5. Let f be a real-valued function of class C^2 with domain ]−ε,
1 + ε[, ε > 0. Let D ⊂ E^2 be the set {(x, y): x ∈ [0, 1] & y ∈ [0, f(x)]}.
Show that the Stokes-Green-Gauss theorem is valid for D and its suitably
oriented boundary. Note that this generalizes some of the constructions
given at the end of Section 9.3.
6. Generalize Exercise 5 in the following way. Let f and g be real-valued
functions of class C^2 defined on ]−ε, 1 + ε[ so that g ≤ f. Let
D ⊂ E^2 be the set {(x, y): x ∈ [0, 1] & y ∈ [g(x), f(x)]}. Show that
the Stokes-Green-Gauss theorem is valid for D and its suitably oriented
boundary.
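7. Let ω be defined on E^2 \ {0} by
$$\omega(x,y) = -\frac{y}{x^2+y^2}\,dx + \frac{x}{x^2+y^2}\,dy.$$
If J is any interval containing the origin in its interior, show that
$$\iint_J[D_1\omega_2 - D_2\omega_1]\,dx\,dy = 0,$$
but that
$$\int_{\partial J}\omega = 2\pi,$$
where ∂J is the suitably oriented boundary of J. Why does this not
contradict Theorem 9.3.1?
8. Under suitable conditions on u, v, and R ⊂ E^2, show that
$$\iint_R\Big[u\,\Delta v + \frac{\partial u}{\partial x}\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial v}{\partial y}\Big]dx\,dy = \int_{\partial R}u\,\frac{\partial v}{\partial n}\,ds,$$
where s is the arc length along ∂R.
9. Under suitable conditions on u, v, and R ⊂ E^2, show that
$$\iint_R(u\,\Delta v - v\,\Delta u)\,dx\,dy = \int_{\partial R}\Big(u\frac{\partial v}{\partial n} - v\frac{\partial u}{\partial n}\Big)ds,$$
where s is the arc length along ∂R.
10. Under suitable conditions on u and R ⊂ E^2, show that if Δu = 0, then
$$\iint_R\Big[\Big(\frac{\partial u}{\partial x}\Big)^2 + \Big(\frac{\partial u}{\partial y}\Big)^2\Big]dx\,dy = \int_{\partial R}u\,\frac{\partial u}{\partial n}\,ds,$$
where s is the arc length along ∂R.
9.4 CLOSED AND EXACT DIFFERENTIALS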
Suppose ω is a first-order differential form with an open domain in
E^n and moreover ∀x ∈ 𝒟(ω), ω(x) = df(x), where f is a real-valued
function of class C^1. Suppose further that γ is an oriented curve in
𝒟(ω) of class C^1. Using the chain rule we may write
$$\sum_{k=1}^n\omega_k\circ\gamma(t)\,\frac{d\gamma^k(t)}{dt}\,dt = \sum_{k=1}^nD_kf(\gamma(t))\,\frac{d\gamma^k(t)}{dt}\,dt = \frac{d\,f\circ\gamma(t)}{dt}\,dt.$$
Hence, if the domain of γ is [a, b], and we set α = γ(a), β = γ(b), then
$$\int_\gamma\omega = \int_\gamma df = \int_a^b\frac{d\,f\circ\gamma(t)}{dt}\,dt = f(\beta)-f(\alpha). \tag{9.4.1}$$
This says that the integral of ω over any smooth oriented curve in its
domain, which goes from α to β, is independent of the curve. We shall
soon show that (9.4.1) is also true for piecewise smooth oriented curves
so that the integral of ω over any piecewise smooth closed curve in its
domain is zero.
9.4.1 Definition. A first-order differential form ω is called exact <=> there
exists a real-valued function f defined on 𝒟(ω) so that ∀x ∈ 𝒟(ω), ω(x)
= df(x).
The next theorem gives necessary and sufficient conditions that a
first-order differential form be exact.
9.4.2 Theorem. If ω is a first-order continuous differential form defined
on an open set in E^n, then ω is exact <=> for every piecewise smooth closed oriented
curve γ in 𝒟(ω) we have
$$\int_\gamma\omega = 0.$$
Proof. Let us first prove the necessity; that is, we assume ω = df. Let
{[a_k, a_{k+1}]: k ∈ (1, m)} be a decomposition of the domain of γ so
that γ' exists on ]a_k, a_{k+1}[. Using Theorems 5.2.1 (d) and 5.2.2 we get
$$\int_\gamma\omega = \int_\gamma df = \sum_{k=1}^m\int_{a_k}^{a_{k+1}}\frac{d\,f\circ\gamma(t)}{dt}\,dt = \sum_{k=1}^m[f\circ\gamma(a_{k+1}) - f\circ\gamma(a_k)] = f\circ\gamma(a_{m+1}) - f\circ\gamma(a_1).$$
Since γ is closed, γ(a_1) = γ(a_{m+1}), and it follows that the integral of
ω over γ is zero.
To prove the converse we may assume, without loss of generality,
that 𝒟(ω) is arcwise connected. Otherwise we can work with each open
component of 𝒟(ω). Fix a point x_0 ∈ 𝒟(ω) and ∀x ∈ 𝒟(ω) let γ_x
be a piecewise smooth oriented curve in 𝒟(ω) with x_0 the initial point
and x the final point of γ_x. Let us set
$$F(\gamma_x) = \int_{\gamma_x}\omega.$$
This defines a function of γ_x. We claim that for fixed x_0 and x it is
independent of the choice of γ_x. Indeed, suppose α_x is another piecewise
smooth oriented curve which proceeds from x_0 to x. If γ_x has
domain [a, b] suppose α_x is parameterized so that its domain is [b, c].
Define
$$\beta(t) = \begin{cases}\gamma_x(t), & \forall t\in[a,b],\\ \alpha_x(b+c-t), & \forall t\in[b,c].\end{cases}$$
Then β defines a piecewise smooth closed oriented curve in 𝒟(ω).
Hence
$$0 = \int_\beta\omega = \int_{\gamma_x}\omega - \int_{\alpha_x}\omega.$$
Thus for fixed x_0 we may set
$$f(x) = F(\gamma_x),$$
and this defines a real-valued function on 𝒟(ω). We shall show that
ω = df. Since 𝒟(ω) is open ∃δ > 0 so that B(x, δ) ⊂ 𝒟(ω). Now, let
u ∈ E^n and h ∈ R so that h ≠ 0 and x + hu ∈ B(x, δ). Let us set
$$\gamma_{x+hu}(t) = \begin{cases}\gamma_x(t), & \forall t\in[a,b],\\ x+h(t-b)u, & \forall t\in[b,b+1].\end{cases}$$
Hence we get
$$\frac{f(x+hu)-f(x)}{h} = \frac{1}{h}\Big\{\int_{\gamma_{x+hu}}\omega - \int_{\gamma_x}\omega\Big\} = \frac{1}{h}\int_b^{b+1}\sum_{k=1}^n\omega_k\circ\gamma_{x+hu}(t)\,\frac{d\gamma^k_{x+hu}(t)}{dt}\,dt$$
$$= \int_b^{b+1}\sum_{k=1}^n\omega_k(x+h(t-b)u)\,u^k\,dt.$$
As h → 0 we get
$$D_uf(x) = \sum_{k=1}^n\omega_k(x)u^k = \omega(x)(u).$$
Since ∀u ∈ E^n the right side is a continuous function of x, it follows
that df(x) exists, and hence ω(x) = df(x).
In case ω is of class C^1 it is possible to give conditions on the partials
of ω_k so that ω is "locally exact." For the moment we shall restrict ourselves
to the case where 𝒟(ω) is an open set in E^2. Later on we shall
consider the case where 𝒟(ω) ⊂ E^n. Let B be an open ball in 𝒟(ω)
and J an interval in B. From Stokes' theorem we get
$$\iint_J[D_1\omega_2 - D_2\omega_1]\,dx\,dy = \int_{\partial J}\omega.$$
Now, if ∀(x, y) ∈ B,
$$D_1\omega_2(x,y) = D_2\omega_1(x,y),$$
then we get that the line integral of ω along every oriented rectangle
in B is zero.
Now, if (a, b) is the center of B and (x, y) ∈ B, set
$$f_1(x,y) = \int_a^x\omega_1(t,b)\,dt + \int_b^y\omega_2(x,t)\,dt,$$
$$f_2(x,y) = \int_b^y\omega_2(a,t)\,dt + \int_a^x\omega_1(t,y)\,dt.$$
The number f_1(x, y) is the line integral of ω along the curve consisting
of the horizontal straight line proceeding from (a, b) to (x, b) and then
the vertical straight line from (x, b) to (x, y). The number f_2(x, y) is
the line integral consisting of the vertical line proceeding from (a, b)
to (a, y) and then the horizontal straight line proceeding from (a, y)
to (x, y). Since the line integral of ω around every oriented rectangle
in B is zero, it follows that ∀(x, y) ∈ B, f_1(x, y) = f_2(x, y). Now, a simple
calculation shows that
$$D_2f_1(x,y) = \omega_2(x,y),\qquad D_1f_2(x,y) = \omega_1(x,y).$$
Since ω_1 and ω_2 are continuous, if we set f = f_1 = f_2, then from Theorem
7.2.5 it follows that df(x, y) exists and of course ω(x, y) = df(x, y). Thus
ω|B is exact.
The discussion of the last paragraph prompts us to make the following

definition.
9.4.3 Definition. A differential form ω with domain an open set in E^n
is said to be closed <=> ω is of class C^1 and ∀j, k ∈ (1, n) and ∀x ∈ 𝒟(ω),
D_jω_k(x) = D_kω_j(x).
A little later on we shall present a definition of a closed differential
form in a much more compact and more easily remembered notation.

For now, let us remark that we have proved above that every closed
differential form with domain in E2 is "locally exact" in the sense that
the restriction of the closed form to any ball in its domain is exact.
However, it is not necessarily true that a closed form is "globally exact."
For example, the form
$$\omega(x,y) = -\frac{y}{x^2+y^2}\,dx + \frac{x}{x^2+y^2}\,dy$$
is defined on E^2 \ {0} and is closed. However, ω is not an exact form.
Indeed, if J is any interval in E^2 containing the origin in its interior,
then (see Exercise 7 of Section 9.3)
$$\int_{\partial J}\omega \ne 0.$$
It follows from Theorem 9.4.2 that ω is not exact, even though it is
locally exact.
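The behavior of this form can be observed numerically. The sketch below estimates D_1ω_2 − D_2ω_1 at a sample point by central differences and integrates ω around the boundary of the square [−1, 1] × [−1, 1]; the choice of square, the sample point, and the helper names are illustrative assumptions.

```python
import math

def omega(x, y):
    r2 = x * x + y * y
    return (-y / r2, x / r2)  # components of -y/(x^2+y^2) dx + x/(x^2+y^2) dy

def curl_estimate(x, y, eps=1e-5):
    """Central-difference estimate of D_1 omega_2 - D_2 omega_1 at (x, y)."""
    d1_w2 = (omega(x + eps, y)[1] - omega(x - eps, y)[1]) / (2 * eps)
    d2_w1 = (omega(x, y + eps)[0] - omega(x, y - eps)[0]) / (2 * eps)
    return d1_w2 - d2_w1

def square_boundary(t):
    """Counterclockwise boundary of [-1, 1] x [-1, 1], t in [0, 4]."""
    if t <= 1.0:
        return (-1.0 + 2.0 * t, -1.0)
    if t <= 2.0:
        return (1.0, -1.0 + 2.0 * (t - 1.0))
    if t <= 3.0:
        return (1.0 - 2.0 * (t - 2.0), 1.0)
    return (-1.0, 1.0 - 2.0 * (t - 3.0))

def line_integral(omega, curve, a, b, steps=8000):
    """Riemann-Stieltjes sum approximating the line integral of omega over curve."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t0, t1 = a + i * h, a + (i + 1) * h
        p0, p1 = curve(t0), curve(t1)
        coeffs = omega(*curve(0.5 * (t0 + t1)))
        total += sum(c * (q1 - q0) for c, q0, q1 in zip(coeffs, p0, p1))
    return total

curl_value = curl_estimate(0.3, -0.7)
loop_value = line_integral(omega, square_boundary, 0.0, 4.0)
print(curl_value, loop_value, 2 * math.pi)
```

The first number is close to zero, consistent with ω being closed, while the loop integral is close to 2π rather than zero.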
From the previous example it would seem that for a closed form to
be exact, there would need to be additional conditions on its domain.
This is actually the case, and for the purpose of obtaining these addi
tional conditions we introduce the following definition. To make the
notation easier we shall suppose, for the remainder of this section, that
we shall only work with representatives from a given curve that have
domain [0, 1].
9.4.4 Definition. Two closed, piecewise smooth, oriented curves γ_0 and
γ_1 in a set E ⊂ E^n are said to be homotopic in E <=> there exists a continuous
function Γ with domain [0, 1] × [0, 1] that is piecewise smooth in each
variable, has range in E, ∀τ ∈ [0, 1], Γ(τ, 0) = Γ(τ, 1), and ∀t ∈ [0, 1],
Γ(0, t) = γ_0(t) and Γ(1, t) = γ_1(t).
A piecewise smooth oriented curve γ in E is said to be homotopic to zero in
E <=> γ is homotopic in E to a constant curve; that is, a curve γ_0 so that ∀t
∈ [0, 1], γ_0(t) = γ_0(0).
It is not difficult to establish the fact that the homotopy relation is an

equivalence relation, so that the piecewise smooth oriented curves in
a given region break up into pairwise disjoint homotopy classes. It is
also not difficult to show (Exercise 7 of Section 9.4) that the homotopy
relation, 9.4.4, is independent of the piecewise smooth representatives
we pick from each curve.
If every closed, piecewise smooth oriented curve in an arcwise con
nected region is homotopic to zero, then we say that the region is simply
connected. From the point of view of the homotopy relation, the last
statement says that a region is simply connected if all the closed,
oriented, piecewise smooth curves belong to the same homotopy class.
9.4.5 Definition. An open set D ⊂ E^n is said to be simply connected
<=> D is connected and every piecewise smooth, closed oriented curve in D is
homotopic to zero.
Roughly speaking, a simply connected set in E^2 is one that has no
holes in it. Of course in higher dimensions we no longer have such a
simple interpretation. An example of an arcwise connected set in E^3
that is not simply connected is an anchor ring.
For the purpose of giving an example, let us note that if an open set
in E^n can be contracted to a point by means of straight lines, then the
set is simply connected. To be more precise, let us say that the set
S ⊂ E^n is star-shaped with respect to the point a ∈ S <=> ∀x ∈ S, the straight
line L = {y: y = (1 − t)a + tx & t ∈ [0, 1]} belongs to S. Of course,
every convex set is star-shaped with respect to every point in the set.
Suppose S is open and is star-shaped with respect to a ∈ S. If γ is
a piecewise smooth, oriented closed curve in S, set
$$\Gamma(\tau,t) = (1-\tau)a + \tau\gamma(t), \qquad \forall\tau\in[0,1].$$
Clearly, Γ has range in S, is continuous, is piecewise smooth in each
variable, Γ(τ, 0) = Γ(τ, 1), Γ(0, t) = a, and Γ(1, t) = γ(t). Thus γ is
homotopic to zero.
We want ultimately to prove that every closed first-order differential

form defined on a simply connected domain is exact. First, we shall
prove that every closed form on an open domain is locally exact. We
have proved this previously for closed forms having domains in
E2,
making use of the Stokes-Green-Gauss theorem. Although we could

still make use of this theorem in higher dimensions, it is much simpler
to proceed by a more direct method. Of course, since we know what we
are looking for, it is easy to discover a direct method of proof.
9.4.6 Theorem. Every closed first-order differential form with domain
an open set in E^n is locally exact; that is, its restriction to every open ball in
its domain is exact.
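Proof. Let $\omega$ be a first-order closed differential form with domain an open set in $E^n$ and suppose $B(a, r) \subset \mathcal{D}(\omega)$. Define the function $s$ on $B(a, r) \times [0,1]$ by
$$s(x,t) = (1 - t)a + tx.$$
Clearly, for fixed $x$, the range of $s(x, \cdot)$ is the straight line joining $x$ and $a$. Let us set
$$f(x) = \int_0^1 \sum_{k=1}^n \omega_k \circ s(x,t)\, \frac{\partial s^k(x,t)}{\partial t}\, dt.$$
If we apply the operator $D_j$ to both sides we may use Theorem 8.4.3 to move this operator from the outside to the inside of the integral. Now,
$$D_j\left[\omega_k \circ s(x,t)\, \frac{\partial s^k(x,t)}{\partial t}\right] = [D_j\, \omega_k \circ s(x,t)]\, \frac{\partial s^k(x,t)}{\partial t} + \omega_k \circ s(x,t)\, D_j \frac{\partial s^k(x,t)}{\partial t}.$$
Further, using the chain rule, and noting the form of $s$, we get
$$D_j\, \omega_k \circ s(x,t) = t\, D_j \omega_k(s(x,t)).$$
Also,
$$D_j \frac{\partial s^k(x,t)}{\partial t} = \delta_{jk} = \begin{cases} 1 & \iff j = k, \\ 0 & \iff j \ne k. \end{cases}$$
Now use the fact that $\omega$ is closed so that $D_j \omega_k = D_k \omega_j$. Thus we have
$$\sum_{k=1}^n D_j\left[\omega_k \circ s(x,t)\, \frac{\partial s^k(x,t)}{\partial t}\right] = t\, \frac{\partial\, \omega_j \circ s(x,t)}{\partial t} + \omega_j \circ s(x,t).$$
Consequently,
$$D_j f(x) = \int_0^1 t\, \frac{\partial\, \omega_j \circ s(x,t)}{\partial t}\, dt + \int_0^1 \omega_j \circ s(x,t)\, dt.$$
If we integrate the first integral by parts we get
$$D_j f(x) = \omega_j(x).$$
Since $\omega$ is continuous, it follows that $df(x)$ exists and, of course, is equal to $\omega(x)$.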
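The construction in this proof is easy to try numerically: integrate the form along the segment from $a$ to $x$ and differentiate the result. The sketch below is only an illustration under stated assumptions; the closed form `omega` $= 2xy\,dx + x^2\,dy$, the base point, the midpoint quadrature, and the step sizes are all hypothetical choices, not taken from the text.

```python
def potential(omega, a, x, steps=2000):
    """Midpoint-rule value of f(x) = int_0^1 sum_k omega_k(s(x,t)) (x^k - a^k) dt,
    where s(x,t) = (1 - t) a + t x, so that ds^k/dt = x^k - a^k."""
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) / steps
        s = [(1 - t) * ak + t * xk for ak, xk in zip(a, x)]
        w = omega(s)
        total += sum(wk * (xk - ak) for wk, xk, ak in zip(w, x, a))
    return total / steps

# Hypothetical closed form: omega = 2xy dx + x^2 dy (D_2 omega_1 = D_1 omega_2 = 2x).
omega = lambda p: (2 * p[0] * p[1], p[0] ** 2)
a = (0.0, 0.0)
x = (0.7, -0.4)

# Central differences approximate D_1 f(x) and D_2 f(x).
h = 1e-5
grad = []
for j in range(2):
    xp = list(x); xp[j] += h
    xm = list(x); xm[j] -= h
    grad.append((potential(omega, a, xp) - potential(omega, a, xm)) / (2 * h))

# Largest discrepancy between D_j f(x) and omega_j(x).
error = max(abs(g - w) for g, w in zip(grad, omega(x)))
```

For this choice the discrepancy `error` is far below $10^{-6}$, in agreement with $D_j f = \omega_j$.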
The theorem we have just proved is one of the crucial steps in showing that every closed first-order differential form in a simply connected region is exact. A second crucial step is the following lemma. For the purposes of this lemma we shall say a curve $\alpha$ is in a $\delta$-neighborhood of the curve $\gamma$ if there are representatives of each curve so that
$$\sup\{|\alpha(t) - \gamma(t)| : t \in [0,1]\} < \delta.$$
9.4.7 Lemma. Suppose $\omega$ is a closed first-order differential form with domain an open set in $E^n$. For every piecewise smooth, oriented curve $\gamma$ in $\mathcal{D}(\omega)$, $\exists \delta > 0$ so that for every piecewise smooth, oriented curve $\alpha$ in $\mathcal{D}(\omega)$ with $\alpha(0) = \gamma(0)$, $\alpha(1) = \gamma(1)$ and $\alpha$ in a $\delta$-neighborhood of $\gamma$ we have
$$\int_\alpha \omega = \int_\gamma \omega.$$
Proof. Let $2\delta$ be the distance from $\mathcal{R}(\gamma)$ to $\mathcal{D}(\omega)^c$. Since $\mathcal{R}(\gamma)$ is compact and $\mathcal{D}(\omega)^c$ is closed, it follows that $\delta > 0$. For every $\tau \in [0,1]$ and $\forall x \in B(\gamma(\tau), 2\delta)$ set
$$f_\tau(x) = \int_{\gamma_\tau} \omega + \int_{s_{\tau,x}} \omega,$$
where $\gamma_\tau$ is that oriented curve which has a representative defined on $[0, \tau]$ by $\gamma_\tau(t) = \gamma(t)$, and $s_{\tau,x}(t) = (1 - t)\gamma(\tau) + tx$, $t \in [0,1]$. As the proof of the last theorem shows, $\forall x \in B(\gamma(\tau), 2\delta)$, $df_\tau(x) = \omega(x)$.
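Next, $\forall x \in B(\alpha(\tau), \delta)$ let us set
$$g_\tau(x) = \int_{\alpha_\tau} \omega + \int_{r_{\tau,x}} \omega,$$
where $\alpha_\tau$ is defined in a manner similar to $\gamma_\tau$, and, of course, $r_{\tau,x}(t) = (1 - t)\alpha(\tau) + tx$. We also get that $\forall x \in B(\alpha(\tau), \delta)$, $dg_\tau(x) = \omega(x)$. Since $B(\alpha(\tau), \delta) \subset B(\gamma(\tau), 2\delta)$ it follows that $\forall x \in B(\alpha(\tau), \delta)$
$$d[f_\tau(x) - g_\tau(x)] = 0,$$
from which it follows that there is a number $c(\tau)$ so that $\forall x \in B(\alpha(\tau), \delta)$,
$$f_\tau(x) - g_\tau(x) = c(\tau).$$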

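We think it is clear that $c$ is a continuous function of $\tau$. Let us put
$$\tau_0 = \sup\{\tau : c(\tau) = 0\}.$$
The set on the right is nonvoid since $0$ certainly belongs to it. Thus $\tau_0$ is well defined and we claim $\tau_0 = 1$. For suppose to the contrary that $\tau_0 < 1$. Take $\tau_0 < \tau_1 \le 1$ so that $t \in [\tau_0, \tau_1] \Rightarrow \gamma(t) \in B(\gamma(\tau_0), 2\delta)$ and $\alpha(t) \in B(\alpha(\tau_0), \delta)$. It is possible to do this because of the continuity of $\gamma$ and $\alpha$. Now, $\forall x \in B(\gamma(\tau_0), 2\delta)$ let us define
$$\beta_x(t) = \begin{cases} \gamma(t), & \forall t \in [\tau_0, \tau_1], \\ s_{\tau_1,x}(t - \tau_1), & \forall t \in [\tau_1, \tau_1 + 1], \\ s_{\tau_0,x}(\tau_1 + 2 - t), & \forall t \in [\tau_1 + 1, \tau_1 + 2]. \end{cases}$$
The function $\beta_x$ is the representative of a piecewise smooth, closed, oriented curve in $B(\gamma(\tau_0), 2\delta)$ (Fig. 9.4.1). From Theorems 9.4.6 and 9.4.2 it follows that
$$\int_{\beta_x} \omega = \int_{\gamma_{\tau_1}} \omega - \int_{\gamma_{\tau_0}} \omega + \int_{s_{\tau_1,x}} \omega - \int_{s_{\tau_0,x}} \omega = 0.$$
FIGURE 9.4.1
This shows that $\forall x \in B(\gamma(\tau_0), 2\delta)$,
$$f_{\tau_0}(x) = f_{\tau_1}(x).$$
In exactly the same way we get that $\forall x \in B(\alpha(\tau_0), \delta)$
$$g_{\tau_0}(x) = g_{\tau_1}(x).$$
Thus $\forall x \in B(\alpha(\tau_0), \delta)$ we get
$$c(\tau_1) = f_{\tau_1}(x) - g_{\tau_1}(x) = f_{\tau_0}(x) - g_{\tau_0}(x) = c(\tau_0) = 0,$$
which, of course, is a contradiction.

Consequently, since $\gamma(1) = \alpha(1)$, we get
$$f_1(\gamma(1)) = g_1(\alpha(1)).$$
But this says nothing more than
$$\int_\gamma \omega = \int_\alpha \omega.$$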
9.4.8 Theorem. Suppose $\omega$ is a closed first-order differential form with domain an open set in $E^n$. If $\alpha$ and $\beta$ are oriented, piecewise smooth closed curves in $\mathcal{D}(\omega)$, and are homotopic in $\mathcal{D}(\omega)$, then
$$\int_\alpha \omega = \int_\beta \omega.$$
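Proof. We shall divide the proof into two parts.

(a) To begin with, we shall suppose that $\alpha(0) = \alpha(1) = \beta(0) = \beta(1)$. Let $\Gamma$ be a function on $[0,1] \times [0,1]$ which gives the homotopy between $\alpha$ and $\beta$, and let us suppose that $\forall \tau \in [0,1]$, $\Gamma(\tau, 0) = \Gamma(\tau, 1) = \alpha(0)$. Also, suppose that $\Gamma(0,t) = \alpha(t)$, $\Gamma(1,t) = \beta(t)$. Let $E$ be the set of all points $\sigma \in [0,1]$ with the property that $\forall \tau \in [0, \sigma]$
$$\int_\alpha \omega = \int_{\Gamma_\tau} \omega,$$
where $\Gamma_\tau(t) = \Gamma(\tau, t)$.

The set $E$ is a nonvoid set, since $0 \in E$. Further, it is clearly bounded. Hence, $\tau_0 = \sup E$ is well defined. We claim $\tau_0 = 1$. Indeed, by Lemma 9.4.7, $\exists \varepsilon > 0$ so that if $\gamma$ is in an $\varepsilon$-neighborhood of $\Gamma_{\tau_0}$, then
$$\int_{\Gamma_{\tau_0}} \omega = \int_\gamma \omega.$$
Now, since $\Gamma$ is uniformly continuous, $\exists \delta > 0$ so that $|\tau_0 - \tau| < \delta \Rightarrow \forall t \in [0,1]$, $|\Gamma(\tau,t) - \Gamma(\tau_0,t)| < \varepsilon$. Hence, for such $\tau$,
$$\int_{\Gamma_{\tau_0}} \omega = \int_{\Gamma_\tau} \omega.$$
Suppose $\tau_0 < 1$. If $\tau_0 = 0$, then $\Gamma_{\tau_0} = \alpha$ and we are led to a contradiction. If $\tau_0 > 0$, then $\exists \sigma < \tau_0$ so that $|\tau_0 - \sigma| < \delta$, and $\sigma \in E$; hence
$$\int_\alpha \omega = \int_{\Gamma_\sigma} \omega = \int_{\Gamma_{\tau_0}} \omega = \int_{\Gamma_\tau} \omega.$$
Again we get a contradiction. Thus $\tau_0 = 1$, and the last argument also shows that
$$\int_\alpha \omega = \int_{\Gamma_1} \omega = \int_\beta \omega.$$

(b) We shall now reduce the general case to the previous case. Let $\alpha$ and $\beta$ be piecewise smooth, closed homotopic curves in $\mathcal{D}(\omega)$. Let $\Gamma$ be a function on $[0,1] \times [0,1]$ which defines a homotopy between $\alpha$ and $\beta$ with $\Gamma(0,t) = \alpha(t)$, $\Gamma(1,t) = \beta(t)$ (Fig. 9.4.2). Let us set
$$\Lambda(t) = \begin{cases} \Gamma(3t, 0), & \forall t \in [0, 1/3], \\ \beta(3t - 1), & \forall t \in [1/3, 2/3], \\ \Gamma(3 - 3t, 1), & \forall t \in [2/3, 1], \end{cases}$$
and $\forall \tau \in [0,1]$,
$$\Lambda(\tau, t) = \begin{cases} \Lambda(t), & \forall t \in [0, \tau/3], \\ \Gamma\left(\tau, \dfrac{t - \tau/3}{1 - 2\tau/3}\right), & \forall t \in [\tau/3, 1 - \tau/3], \\ \Lambda(t), & \forall t \in [1 - \tau/3, 1]. \end{cases}$$
FIGURE 9.4.2
Now, $\Lambda$ defines a piecewise smooth, closed oriented curve in $\mathcal{D}(\omega)$ with $\Lambda(0) = \Lambda(1) = \alpha(0) = \alpha(1)$, and $\Lambda(\tau, t)$ defines a homotopy between $\alpha$ and $\Lambda$ so that $\forall \tau \in [0,1]$ we have $\Lambda(\tau, 0) = \alpha(0) = \Lambda(\tau, 1)$. From part (a) of the proof we have
$$\int_\alpha \omega = \int_\Lambda \omega.$$
But we think it is clear that
$$\int_\Lambda \omega = \int_\beta \omega,$$
since the integral of $\omega$ over that part of $\Lambda$ for which $t \in [0, 1/3]$ cancels the integral of $\omega$ over that part of $\Lambda$ for which $t \in [2/3, 1]$. Thus the proof is complete.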
9.4.9 Corollary. If $\mathcal{D}$ is an open, simply connected set in $E^n$, then the collection of closed first-order differential forms with domain $\mathcal{D}$ is identical with the collection of exact first-order differential forms which are of class $C^1$ and have domain $\mathcal{D}$.

Proof. If $\omega$ is exact and of class $C^1$, then there exists a function $f$ of class $C^2$ so that
$$\omega(x) = \sum_{k=1}^n D_k f(x)\, dx^k.$$
Since $D_j D_k f = D_k D_j f$, it follows that $\omega$ is closed.

Conversely, suppose $\omega$ is closed. Since $\mathcal{D}$ is simply connected, every piecewise smooth, closed oriented curve $\gamma$ in $\mathcal{D}$ is homotopic to zero. From Theorem 9.4.8 it follows that
$$\int_\gamma \omega = 0.$$
Thus from Theorem 9.4.2 we see that $\omega$ is exact.

Let us take a simple example that shows an application of Theorem 9.4.8. Let $B_0 = B(0,1) \setminus \{0\}$ in $E^2$ and
$$\omega(x,y) = \frac{-y}{x^2 + y^2}\, dx + \frac{x}{x^2 + y^2}\, dy,$$
which we have considered before. Let $\psi$ be any closed differential form defined on $B_0$ and let $J$ be any closed interval in $B(0,1)$ with $0 \in J^0$. Let $\alpha$ be that real number so that
$$\int_{\partial J} \psi = \alpha \int_{\partial J} \omega = 2\pi\alpha,$$
where $\partial J$ is the boundary of $J$ oriented in the "counterclockwise" direction. Now the oriented boundaries of any two closed intervals that have zero as an interior point are homotopic in $B_0$ (Exercise 8 of Section 9.4). Thus, by Theorem 9.4.8, the number $\alpha$ computed above is independent of $J$.

If $J$ is a closed interval in $B_0$, then $\partial J$ is homotopic to zero, and since $\psi - \alpha\omega$ is closed in $B_0$, it follows again from Theorem 9.4.8 that
$$\int_{\partial J} [\psi - \alpha\omega] = 0.$$
Thus the integral of $\psi - \alpha\omega$ around every rectangle in $B_0$ is zero.
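The two facts used here, that the form $\omega = (-y\,dx + x\,dy)/(x^2 + y^2)$ integrates to $2\pi$ around the boundary of an interval containing the origin and to zero around one that does not, can be observed numerically. The following is a rough sketch only; the two rectangles and the midpoint quadrature are hypothetical choices made for illustration.

```python
import math

def omega(x, y):
    """Components (omega_1, omega_2) of (-y dx + x dy) / (x^2 + y^2)."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def rectangle_integral(form, x0, x1, y0, y1, steps=4000):
    """Midpoint-rule line integral of form around the counterclockwise
    boundary of the closed interval [x0, x1] x [y0, y1]."""
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1), (x0, y0)]
    total = 0.0
    for (px, py), (qx, qy) in zip(corners, corners[1:]):
        dx, dy = qx - px, qy - py
        for i in range(steps):
            t = (i + 0.5) / steps
            w1, w2 = form(px + t * dx, py + t * dy)
            total += (w1 * dx + w2 * dy) / steps
    return total

# Hypothetical rectangles: the first has 0 as an interior point, the second lies in B_0.
around_origin = rectangle_integral(omega, -0.5, 0.3, -0.2, 0.6)
away_from_origin = rectangle_integral(omega, 0.1, 0.5, 0.1, 0.4)
```

Here `around_origin` agrees with $2\pi$ and `away_from_origin` with $0$ to within the quadrature error.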

Let $(a, b) \in B_0$ so that $a < 0$ and $b < 0$. For every $(x,y) \in B_0$ which is not of the form $(0, y)$ with $y > 0$ set
$$g(x,y) = \int_a^x [\psi_1(t,b) - \alpha\omega_1(t,b)]\, dt + \int_b^y [\psi_2(x,t) - \alpha\omega_2(x,t)]\, dt.$$
This is a line integral along a horizontal line from $(a, b)$ to $(x,b)$, and then along a vertical line from $(x,b)$ to $(x,y)$. It is always well defined and clearly $D_2 g(x,y) = \psi_2(x,y) - \alpha\omega_2(x,y)$. Also, $\forall (x,y) \in B_0$, which is not of the form $(x,0)$ with $x > 0$, set
$$h(x,y) = \int_b^y [\psi_2(a,t) - \alpha\omega_2(a,t)]\, dt + \int_a^x [\psi_1(t,y) - \alpha\omega_1(t,y)]\, dt.$$
This is a line integral first along a vertical line from $(a, b)$ to $(a, y)$ and then along a horizontal line from $(a, y)$ to $(x,y)$. It is also well defined, and $D_1 h(x,y) = \psi_1(x,y) - \alpha\omega_1(x,y)$.

Since the integral of $\psi - \alpha\omega$ is zero around every closed rectangle in $B_0$, it follows that on the common domain of $g$ and $h$ we have $g(x,y) = h(x,y)$. Now,
$$D_2 h(0,y) = \lim_{t \to 0} \frac{h(0, y + t) - h(0,y)}{t}.$$
If we use the fact that the integral of $\psi - \alpha\omega$ around any closed rectangle in $B_0$ is zero, we get
$$h(0, y + t) = h(0,y) + \int_y^{y+t} [\psi_2(0, \tau) - \alpha\omega_2(0, \tau)]\, d\tau.$$
Thus
$$D_2 h(0,y) = \psi_2(0,y) - \alpha\omega_2(0,y).$$
In the same way we get
$$D_1 g(x,0) = \psi_1(x,0) - \alpha\omega_1(x,0).$$
Thus if we extend $g$ to all of $B_0$ by defining $g(0,y) = h(0,y)$ and we extend $h$ to all of $B_0$ by defining $h(x,0) = g(x,0)$ the above computations show that by taking $f = g = h$ we have
$$df = \psi - \alpha\omega.$$
The result we have just proved can be interpreted in the following way. Suppose we say that two closed forms on $B_0$ are equivalent $\iff$ they differ by an exact form. It is not hard to verify that this is really an equivalence relation. The result we have just obtained tells us that the disjoint equivalence classes are determined by all the forms $\{\alpha\omega : \alpha \in R\}$. Another way of putting it is that the real vector space of equivalence classes is one-dimensional.
Exercises

1. Determine which of the following forms are exact, and if exact find a function $f$ for which $\omega = df$.
(a) $\omega(x,y) = (x^2 - y^2)\, dx - 2xy\, dy$.
(b) $\omega(x,y) = 2xy\, dx + (x^2 - y^2)\, dy$.
(c) $\omega(x,y,z) = xy\, dx + \tfrac{1}{2}x^2\, dy + z^3\, dz$.
(d) $\omega(x,y,z) = (2xyz^3 + z)\, dx + x^2 z^3\, dy + (3x^2 y z^2 + x)\, dz$.

2. Let
$$\omega(x,y) = \frac{y}{x^2 + y^2}\, dx + \frac{x}{x^2 + y^2}\, dy,$$
which has domain $E^2 \setminus \{0\}$. Is $\omega$ exact?

3. Let $\omega$ be defined on $[a, b] \times R$ by
$$\omega(x,y) = p(x)\, y\, dx + dy,$$
where $p$ is a continuous function on $[a,b]$. If we set
$$q(x) = \exp \int_a^x p(t)\, dt,$$
verify that the first-order differential form $q\omega$ is exact.

4. Compute the line integral of
$$\omega(x,y) = x\, dx + y^3\, dy$$
along the oriented curve defined by
$$\alpha(t) = (e^t \log \cos t,\ \sin t\ \mathrm{Arctan}\,(t - 1)), \qquad t \in [0,1].$$

5. Show that
$$\omega(x,y) = \frac{x^2 - y^2}{(x^2 + y^2)^2}\, dx + \frac{2xy}{(x^2 + y^2)^2}\, dy$$
is exact on $E^2 \setminus \{0\}$.

6. Show that if $\alpha_0$ and $\alpha_1$ are two different piecewise smooth parametric representations for the same piecewise smooth, oriented closed curve, then there is a function $\Gamma$, as in Definition 9.4.4, so that $\Gamma(0, t) = \alpha_0(t)$, $\Gamma(1, t) = \alpha_1(t)$.

7. If $\alpha$ and $\beta$ are piecewise smooth, closed oriented curves which are homotopic, show that the homotopy relation is independent of the piecewise smooth representatives which are picked from $\alpha$ and $\beta$.

8. Show that the suitably oriented boundaries of any two closed intervals in $E^2$, which have zero as an interior point, are homotopic in $E^2 \setminus \{0\}$.

9. Let $\mathcal{D} \subset E^2$ be an open simply connected set, $a \in \mathcal{D}$ and $\mathcal{D}(a) = \mathcal{D} \setminus \{a\}$. Show that there is a closed form $\omega$ defined on $\mathcal{D}(a)$ so that for every closed form $\psi$ defined on $\mathcal{D}(a)$, $\exists \alpha \in R$ so that $\psi - \alpha\omega$ is exact. (Hint: Combine the example at the end of Section 9.4 with the technique of proof of Theorem 9.4.8.)

10. Generalize Exercise 9 in the following way. Let $\mathcal{D}$ be a simply connected open set in $E^2$, $\{a_k : k \in (1, n)\} \subset \mathcal{D}$ and let $\mathcal{D}(a_1, \ldots, a_n) = \mathcal{D} \setminus \{a_k : k \in (1,n)\}$. Show that there exists a set $\{\omega_k : k \in (1,n)\}$ of linearly independent closed forms in $\mathcal{D}(a_1, \ldots, a_n)$ so that for every closed form $\psi$ in $\mathcal{D}(a_1, \ldots, a_n)$ there is a set $\{\alpha_k : k \in (1,n)\} \subset R$ so that
$$\psi - \sum_{k=1}^n \alpha_k \omega_k$$
is exact. The set $\{\omega_k : k \in (1,n)\}$ is said to be linearly independent $\iff$ $\forall \{\alpha_k : k \in (1,n)\} \subset R$,
$$\sum_{k=1}^n \alpha_k \omega_k = 0 \Rightarrow \alpha_k = 0.$$

11. Let $u$ be a real-valued harmonic function with domain $B(0,1) \subset E^2$; that is, $u$ is of class $C^2$ and
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0.$$
Show that there exists a harmonic function $v$ with domain $B(0,1)$ so that the Cauchy-Riemann equations are satisfied:
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}.$$

12. Extend Exercise 11 to the situation where $B(0,1)$ is replaced by any simply connected domain in $E^2$.
II. SURFACE INTEGRALS

9.5 MOTIVATION AND DEFINITIONS

Our purpose in the next few sections will be to extend to higher dimensions some of the facts we have proved for line integrals. In this connection it will be necessary to decide the meaning of an integral taken over a surface. This section will be devoted to making definitions together with the motivation for making these definitions.

The first thing we want to do is define a surface and the area of a surface. We think it will be best if in the beginning we give a rather discursive motivational discussion. Let $T$ be a linear transformation with domain $E^m$ and range in $E^n$, where $n \ge m$. If $n = m$, and $A \subset E^m$ is a Jordan measurable set, then the considerations of Section 8.5 tell us that $T(A)$ is Jordan measurable and
$$|T(A)| = |\det T|\, |A|. \qquad (9.5.1)$$
If $n > m$, it is still true that $\mathcal{R}(T)$ is in an $m$-dimensional subspace of $E^n$, which is identifiable with $E^m$ through an orthogonal transformation $U$ of $E^n$ onto itself. Hence we now define the $m$-dimensional content of $T(A)$ by
$$|T(A)| = |U \circ T(A)|. \qquad (9.5.2)$$
The content on the right is computable by (9.5.1). Of course, we must make sure that the definition (9.5.2) is independent of the orthogonal transformation $U$ which takes $\mathcal{R}(T)$ into $E^m$. Indeed, suppose $U_1$ is another such orthogonal transformation. Now, if $T$ is singular, the dimension of $\mathcal{R}(U \circ T)$ and $\mathcal{R}(U_1 \circ T)$ is less than $m$ and thus
$$|U \circ T(A)| = |U_1 \circ T(A)| = 0.$$
If $T$ is nonsingular, then $W = U_1 \circ U^{-1}|E^m$ is an orthogonal transformation of $E^m$ onto itself. Thus
$$|U \circ T(A)| = |\det U \circ T|\, |A| = |\det W \circ U \circ T|\, |A| = |\det U_1 \circ T|\, |A| = |U_1 \circ T(A)|.$$
Let us now give an effective way of computing the right side of (9.5.2) so that the operator $U$ does not intervene. Let us first note that $T^t \circ T$ is a nonnegative symmetric linear transformation from $E^m$ into itself, that is, $\forall u \in E^m$, $T^t \circ T(u) \cdot u \ge 0$. Thus this linear transformation has a matrix representation consisting of nonnegative eigenvalues down the main diagonal (see Section 6.5). Suppose we arrange them in nondecreasing order down the main diagonal. By taking the nonnegative square roots of these nonnegative eigenvalues and arranging these square roots in nondecreasing order down the main diagonal we get the matrix representation of a nonnegative symmetric linear transformation $B$ so that $B^2 = T^t \circ T$. We leave to the reader the easy task of showing that there is only one nonnegative symmetric linear transformation $B$ with this property (Exercise 1 of Section 9.5). Since $B$ is symmetric, the null space of $B$ is orthogonal to $\mathcal{R}(B)$ and moreover $B|\mathcal{R}(B)$ is one to one and takes $\mathcal{R}(B)$ onto $\mathcal{R}(B)$. For simplicity's sake, designate the inverse of $B|\mathcal{R}(B)$ by $B^{-1}$. Let $V$ be that linear transformation with domain $\mathcal{R}(B)$ and range $\mathcal{R}(T)$ and defined by
$$V = T \circ B^{-1}.$$
Now, $\forall u \in \mathcal{R}(B)$ we have
$$|V(u)|^2 = V^t \circ V(u) \cdot u = |u|^2,$$
since
$$V^t \circ V(u) = B^{-1} \circ T^t \circ T \circ B^{-1}(u) = B^{-1} \circ B^2 \circ B^{-1}(u) = u.$$
Thus $V$ preserves length and we may write
$$T = V \circ \sqrt{T^t \circ T}, \qquad (9.5.3)$$
where we have taken $\sqrt{T^t \circ T} = B$. If $T$ is nonsingular so is $B$, and $V$ is a length-preserving linear transformation from $E^m$ onto $\mathcal{R}(T)$. If $T$ is nonsingular, then $U \circ V$ is an orthogonal map from $E^m$ onto itself. It follows from (9.5.1), (9.5.2), and (9.5.3) that
$$|T(A)| = |U \circ V \circ \sqrt{T^t \circ T}(A)| = [\det \sqrt{T^t \circ T}]\, |A| = [\det T^t \circ T]^{1/2}\, |A|. \qquad (9.5.4)$$
In case $T$ is singular we have already noted that $|T(A)| = 0$. Since in that case $T^t \circ T$ is also singular, we have $\det T^t \circ T = 0$. Thus (9.5.4) is still valid. Hence we have succeeded in computing $|T(A)|$ only in terms of $T$ and $A$ and have eliminated $U$. Incidentally, this is another proof of the fact that $|T(A)|$ is independent of $U$.
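Formula (9.5.4) can be compared with a more familiar computation when $m = 2$, $n = 3$: the image of the unit square under $T$ is a parallelogram whose area is the length of the cross product of the two columns of $T$. The sketch below is an illustration only; the matrix `T` is a hypothetical example, and the cross-product comparison is our choice of check rather than something taken from the text.

```python
import math

def columns(T):
    """Split a 3x2 matrix (three rows of two entries) into its two columns."""
    return [row[0] for row in T], [row[1] for row in T]

def gram_determinant(T):
    """det(T^t T) for a 3x2 matrix T."""
    c1, c2 = columns(T)
    g11 = sum(u * u for u in c1)
    g12 = sum(u * v for u, v in zip(c1, c2))
    g22 = sum(v * v for v in c2)
    return g11 * g22 - g12 * g12

def cross_norm(T):
    """Length of the cross product of the two columns of T."""
    c1, c2 = columns(T)
    cx = c1[1] * c2[2] - c1[2] * c2[1]
    cy = c1[2] * c2[0] - c1[0] * c2[2]
    cz = c1[0] * c2[1] - c1[1] * c2[0]
    return math.sqrt(cx * cx + cy * cy + cz * cz)

# Hypothetical T from E^2 into E^3; A is the unit square, so |A| = 1.
T = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
area_from_formula = math.sqrt(gram_determinant(T))   # [det T^t T]^(1/2) |A|
area_from_cross_product = cross_norm(T)
```

For this `T` both quantities equal $\sqrt{59}$.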
Suppose now that $A$ is Jordan measurable in $E^m$ and $\varphi$ is a function with domain $A$ and range in $E^n$, where $n \ge m$. We may think of the range of $\varphi$ as a surface in $E^n$. Let us further assume that $\varphi|A^0$ is of class $C^1$. If $c \in A^0$, then $\varphi(c) + d\varphi(c)$ is a very close approximation to $\varphi$ in a small neighborhood of $c$. Thus if $C$ is a small neighborhood of $c$ in $A^0$, it is reasonable to say that $|d\varphi(c)(C)|$ is a close approximation to the surface area of that portion of the surface that is obtained by restricting $\varphi$ to $C$. Now, the Jacobian matrix of $d\varphi(c)$ is the matrix
$$\begin{pmatrix} D_1\varphi^1(c) & \cdots & D_m\varphi^1(c) \\ \vdots & & \vdots \\ D_1\varphi^n(c) & \cdots & D_m\varphi^n(c) \end{pmatrix}.$$
Thus, if we use the Binet-Cauchy formula (Theorem 6.6.8), we get that
$$\det d\varphi(c)^t \circ d\varphi(c) = \sum_{1 \le j_1 < \cdots < j_m \le n} \left[\frac{\partial(\varphi^{j_1}, \ldots, \varphi^{j_m})}{\partial(x^1, \ldots, x^m)}(c)\right]^2,$$
where
$$\frac{\partial(\varphi^{j_1}, \ldots, \varphi^{j_m})}{\partial(x^1, \ldots, x^m)}(c)$$
is the Jacobian at $c$ of the function $(\varphi^{j_1}, \ldots, \varphi^{j_m})$.

Suppose $\{A_k : k \in (1, q)\}$ is a decomposition of the Jordan measurable set $A$ into Jordan measurable sets that intersect at most on their boundaries. If $a_k \in A_k^0$, then it is not unreasonable to consider the sum
$$\sum_{k=1}^q [\det d\varphi(a_k)^t \circ d\varphi(a_k)]^{1/2}\, |A_k|$$
as an approximation to the surface area of the surface determined by $\varphi$. Thus in allowing these sums to pass to a limit, it is not unreasonable to define the surface area as
$$\int_A [\det d\varphi(x)^t \circ d\varphi(x)]^{1/2}\, dx. \qquad (9.5.5)$$
A little later we shall write this again as a formal definition and show that this definition is independent of the parameterization of the surface, but to do so we must first define an equivalence relation among functions.
9.5.1 Definition. The function $\alpha$ with domain in $E^m$ and range in $E^n$ is said to be equivalent to the function $\beta$ with domain in $E^m$ $\iff$ there exists a $C^1$ function with values $\chi(y)$ so that $\mathcal{D}(\beta) \subset \mathcal{D}(\chi)$, $\chi|\mathcal{D}(\beta)^0$ is one to one with range $\mathcal{D}(\alpha)^0$, and $\beta(y) = \alpha \circ \chi(y)$, $\forall y \in \mathcal{D}(\beta)^0$. The functions $\alpha$ and $\beta$ are said to be orientably equivalent $\iff$ the Jacobian of $\chi|\mathcal{D}(\beta)^0$ is always positive.

It is not difficult to check that what we have defined above are actually equivalence relations. We also think that the analogy with Definition 9.1.1 is quite clear.

9.5.2 Definition. A smooth surface in $E^n$ is an equivalence class of $C^1$ functions† with domains Jordan measurable sets in $E^m$, $m \le n$, where equivalence is taken in the sense of Definition 9.5.1, and the functions have the same range in $E^n$.
A smooth, oriented surface in $E^n$ is defined in the same way, with the difference that the equivalence relation is replaced by the orientable equivalence relation.
The common range of the equivalence class of functions that constitute a (oriented) surface is called the trace of the surface.
As we have done with curves, a (oriented) surface will usually be designated by any function in the equivalence class that constitutes the oriented surface. From the point of view of surface area this makes no difference, as we shall now show. Suppose $\varphi$ and $\psi$ are two functions in the same smooth surface and having domains $A$ and $B$, respectively. Then there is a one-to-one $C^1$ map $\chi$ from $B^0$ onto $A^0$ so that $\psi(y) = \varphi \circ \chi(y)$. Thus using the chain rule we get
$$d\psi(y)^t \circ d\psi(y) = d\chi(y)^t \circ d\varphi(\chi(y))^t \circ d\varphi(\chi(y)) \circ d\chi(y).$$
Using the fact that $\det d\chi(y)^t = \det d\chi(y)$ we get
$$[\det d\psi(y)^t \circ d\psi(y)]^{1/2} = [\det d\varphi(\chi(y))^t \circ d\varphi(\chi(y))]^{1/2}\, |J_\chi(y)|.$$
Thus using the transformation theorem for multiple integrals we get
$$\int_B [\det d\psi(y)^t \circ d\psi(y)]^{1/2}\, dy = \int_A [\det d\varphi(x)^t \circ d\varphi(x)]^{1/2}\, dx.$$
Thus we see that the following formal definition makes sense.
9.5.3 Definition. If $\varphi$ is a smooth surface in $E^n$ and the partials of $\varphi$ are bounded, then the surface area of $\varphi$ is defined as
$$\mathcal{A}(\varphi) = \int_{\mathcal{D}(\varphi)} [\det d\varphi(x)^t \circ d\varphi(x)]^{1/2}\, dx = \int_{\mathcal{D}(\varphi)} \left[\sum_{1 \le j_1 < \cdots < j_m \le n} \left(\frac{\partial(\varphi^{j_1}, \ldots, \varphi^{j_m})}{\partial(x^1, \ldots, x^m)}(x)\right)^2\right]^{1/2} dx. \qquad (9.5.6)$$
[We are supposing that $\mathcal{D}(\varphi) \subset E^m$, $m \le n$.]

† For this definition we shall say a function is $C^1$ if the restriction of the function to the interior of its domain is of class $C^1$.
As an example of the use of formula (9.5.6) let us compute the area of a well-known surface: a sphere in $E^3$. The parametric equations of a sphere of radius $r$ in $E^3$ are given by
$$\Sigma^1(x,y) = r \cos x, \qquad \Sigma^2(x,y) = r \sin x \cos y, \qquad \Sigma^3(x,y) = r \sin x \sin y,$$
where $(x,y)$ ranges over the interval $[0, \pi] \times [0, 2\pi]$. If we compute the various Jacobians we get
$$\frac{\partial(\Sigma^1, \Sigma^2)}{\partial(x,y)}(x,y) = r^2 \sin^2 x \sin y,$$
$$\frac{\partial(\Sigma^1, \Sigma^3)}{\partial(x,y)}(x,y) = -r^2 \sin^2 x \cos y,$$
$$\frac{\partial(\Sigma^2, \Sigma^3)}{\partial(x,y)}(x,y) = r^2 \sin x \cos x.$$
Squaring and adding, taking the square root, and integrating we get
$$\mathcal{A}(\Sigma) = r^2 \int_0^{2\pi} \int_0^\pi \sin x\, dx\, dy = 4\pi r^2.$$
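The sphere computation can be repeated numerically by summing the square root of the sum of squared Jacobians over a grid, as in the approximating sums that motivated (9.5.5). This is a minimal sketch under stated assumptions: the midpoint grid, its resolution, and the radius are hypothetical choices, and the three Jacobians are those of the parametrization $(r\cos x,\ r\sin x\cos y,\ r\sin x\sin y)$.

```python
import math

def sphere_area(r, nx=400, ny=400):
    """Midpoint-rule approximation of the surface-area integral over [0, pi] x [0, 2 pi]."""
    hx, hy = math.pi / nx, 2 * math.pi / ny
    total = 0.0
    for i in range(nx):
        x = (i + 0.5) * hx
        for j in range(ny):
            y = (j + 0.5) * hy
            # The three 2x2 Jacobians of the parametrization at (x, y).
            j12 = r * r * math.sin(x) ** 2 * math.sin(y)
            j13 = -r * r * math.sin(x) ** 2 * math.cos(y)
            j23 = r * r * math.sin(x) * math.cos(x)
            total += math.sqrt(j12 ** 2 + j13 ** 2 + j23 ** 2) * hx * hy
    return total

r = 2.0
approx_area = sphere_area(r)
exact_area = 4 * math.pi * r ** 2
```

With this grid `approx_area` agrees with `exact_area` to about three decimal places.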
Let us now turn to the problem of defining a surface integral which will be analogous to the definition of a line integral. For motivation let us again turn to a physical example. Let $\varphi$ be a smooth oriented surface with trace in $E^3$ and domain in $E^2$. Let $v$ be the velocity of a fluid flowing in $E^3$ which depends only on the position of the particle of fluid; that is, $v$ is a function with domain and range in $E^3$. We shall suppose that $v$ is continuous and its domain contains the trace of $\varphi$. We suppose that the trace of $\varphi$ is of such a nature that the fluid can flow through the surface without altering its velocity.
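If we suppose that $d\varphi(a)$ is nonsingular, then we can think of the range of the affine transformation $d\varphi(a) + \varphi(a)$ as being the tangent plane to $\varphi$ at $\varphi(a)$. Suppose $n(a)$ is a vector emanating from $\varphi(a)$ which is orthogonal to this tangent plane. Then $\forall u \in E^2$ we have
$$d\varphi(a)(u) \cdot n(a) = u \cdot d\varphi(a)^t(n(a)) = 0.$$
Thus $d\varphi(a)^t(n(a)) = 0$, and if we write this equation in component form we get
$$D_1\varphi^1(a)\, n^1(a) + D_1\varphi^2(a)\, n^2(a) + D_1\varphi^3(a)\, n^3(a) = 0,$$
$$D_2\varphi^1(a)\, n^1(a) + D_2\varphi^2(a)\, n^2(a) + D_2\varphi^3(a)\, n^3(a) = 0.$$
Assume that
$$\frac{\partial(\varphi^1, \varphi^2)}{\partial(x,y)}(a) \ne 0$$
and take $n^3(a)$ to be this Jacobian. Then we can use Cramer's rule on the previous two equations to solve for $n^1(a)$ and $n^2(a)$ to get
$$n(a) = \left(\frac{\partial(\varphi^2, \varphi^3)}{\partial(x,y)}(a),\ \frac{\partial(\varphi^3, \varphi^1)}{\partial(x,y)}(a),\ \frac{\partial(\varphi^1, \varphi^2)}{\partial(x,y)}(a)\right). \qquad (9.5.7)$$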
If
$$\frac{\partial(\varphi^1, \varphi^2)}{\partial(x,y)}(a) = 0,$$
then one of the other Jacobians does not vanish, since $d\varphi(a)$ has rank 2. Thus we could proceed in a similar way to get (9.5.7). The vector
$$n(a)/|n(a)| \qquad (9.5.8)$$
is called the outward normal to the surface $\varphi$ at $\varphi(a)$. Please note that if the partials of $\varphi$ are bounded, then
$$\mathcal{A}(\varphi) = \int_{\mathcal{D}(\varphi)} |n(x)|\, dx. \qquad (9.5.9)$$
The component of the velocity that is in the direction of the outward normal to $\varphi$ at $x$ is given by
$$v \circ \varphi(x) \cdot \frac{n(x)}{|n(x)|},$$
provided, of course, that $d\varphi(x)$ is nonsingular. Hence the amount of fluid that flows in a unit time through a small area of surface about $\varphi(x)$ in an outward direction is approximately
$$v \circ \varphi(x) \cdot \frac{n(x)}{|n(x)|}\, \mathcal{A}(V_x),$$
where $V_x$ is a small neighborhood about $\varphi(x)$ on $\mathcal{R}(\varphi)$. We may suppose $V_x = \varphi(U_x)$, where $U_x$ is a small neighborhood about $x$ in $\mathcal{D}(\varphi)$. But as we pointed out in (9.5.9), we have that $\mathcal{A}(V_x)$ is approximately $|n(x)|\, |U_x|$. Thus we are justified in saying that the amount of fluid that flows in a unit time through the surface $\varphi$ in an outward direction is the integral
$$\int_{\mathcal{D}(\varphi)} v \circ \varphi(x) \cdot n(x)\, dx, \qquad (9.5.10)$$
where we have taken
The integral we have just written down is called a
surface integral.
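To see the flux integral (9.5.10) at work, one can take the sphere parametrization $(r\cos x,\ r\sin x\cos y,\ r\sin x\sin y)$ and the velocity field $v(p) = p$; the normal of (9.5.7) is then the cross product of the two partial derivative vectors, and the flux should equal $4\pi r^3$. The sketch below is illustrative only: the velocity field, the radius, and the midpoint grid are hypothetical choices.

```python
import math

def partials(r, x, y):
    """D_1 phi and D_2 phi for phi(x, y) = (r cos x, r sin x cos y, r sin x sin y)."""
    d1 = (-r * math.sin(x), r * math.cos(x) * math.cos(y), r * math.cos(x) * math.sin(y))
    d2 = (0.0, -r * math.sin(x) * math.sin(y), r * math.sin(x) * math.cos(y))
    return d1, d2

def normal(r, x, y):
    """n = (d(phi2,phi3)/d(x,y), d(phi3,phi1)/d(x,y), d(phi1,phi2)/d(x,y)), i.e. D_1 phi x D_2 phi."""
    d1, d2 = partials(r, x, y)
    return (d1[1] * d2[2] - d1[2] * d2[1],
            d1[2] * d2[0] - d1[0] * d2[2],
            d1[0] * d2[1] - d1[1] * d2[0])

def flux(r, v, nx=300, ny=300):
    """Midpoint-rule approximation of the integral of v(phi(x)) . n(x) over [0, pi] x [0, 2 pi]."""
    hx, hy = math.pi / nx, 2 * math.pi / ny
    total = 0.0
    for i in range(nx):
        x = (i + 0.5) * hx
        for j in range(ny):
            y = (j + 0.5) * hy
            p = (r * math.cos(x), r * math.sin(x) * math.cos(y), r * math.sin(x) * math.sin(y))
            n = normal(r, x, y)
            total += sum(vk * nk for vk, nk in zip(v(p), n)) * hx * hy
    return total

r = 1.5
approx_flux = flux(r, lambda p: p)      # hypothetical velocity field v(p) = p
exact_flux = 4 * math.pi * r ** 3

# n is orthogonal to both tangent vectors at a sample point.
d1, d2 = partials(r, 0.8, 2.1)
n = normal(r, 0.8, 2.1)
orthogonality = max(abs(sum(a * b for a, b in zip(n, d))) for d in (d1, d2))
```

For this parametrization $n(x)$ points away from the origin, so the computed flux is positive and close to $4\pi r^3$.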
Because of the physical interpretation we have given to this integral, it should be true that it is independent of the parameterization of the oriented surface $\varphi$. This is actually true, but before we prove it let us try to make an analogy between surface integrals and line integrals.

In a very informal manner let us write down an expression
$$\omega(x) = \sum_{k=1}^n \sum_{j=1}^n \omega_{jk}(x)\, dx^j \wedge dx^k,$$
where $x$ is assumed to range over some set $\mathcal{D} \subset E^n$, $n \ge 2$, and $\omega_{jk}$ is a real-valued function defined on $\mathcal{D}$. Even though this has no meaning for us as yet, we shall call this a second-order differential form. Actually, it is clear that $\omega$ can be given a meaning in such a way that it is uniquely determined by the functions in the set $\{\omega_{jk} : j,k \in (1, n)\}$. However, since we shall discuss this in very great detail in Section 9.6 we shall not bother to do this here. We are using the symbolism $dx^j \wedge dx^k$ simply to indicate that we are not thinking of this as the composite linear transformation $dx^j \circ dx^k$ which would make $\omega$ a first-order differential form.

In an analogous manner we can write down an expression of an $m$th-order differential form as
$$\omega(x) = \sum_{j_m=1}^n \cdots \sum_{j_1=1}^n \omega_{j_1 \cdots j_m}(x)\, dx^{j_1} \wedge \cdots \wedge dx^{j_m}, \qquad (9.5.11)$$
where $x$ runs over a set $\mathcal{D} \subset E^n$, $n \ge m$, and $\omega_{j_1 \cdots j_m}$ is a real-valued function defined on $\mathcal{D}$. We shall say $\omega$ is continuous if and only if each one of these component functions is continuous.

Even though we have not given a formal definition of an $m$th-order differential form, we shall nevertheless give a formal definition of a surface integral of a continuous $m$th-order form. This is not really too serious a breach in the logical development since the functions $\omega_{j_1 \cdots j_m}$ are really the important things. Keeping formula (9.5.10) in mind, we have the following.
9.5.4 Definition. If $\omega$ is an $m$th-order bounded continuous differential form with $\mathcal{D}(\omega) \subset E^n$, $n \ge m$, and $\varphi$ is a smooth oriented surface with trace in $\mathcal{D}(\omega)$ and $\mathcal{D}(\varphi) \subset E^m$, and $\varphi$ has bounded partials, then the surface integral of $\omega$ over $\varphi$ is defined as
$$\int_\varphi \omega = \int_{\mathcal{D}(\varphi)} \sum_{j_m=1}^n \cdots \sum_{j_1=1}^n \omega_{j_1 \cdots j_m} \circ \varphi(x)\, \frac{\partial(\varphi^{j_1}, \ldots, \varphi^{j_m})}{\partial(x^1, \ldots, x^m)}(x)\, dx. \qquad (9.5.12)$$
In order for this really to be an integral over an oriented surface, it should be independent of the parameterization of the surface that is used. Suppose $\psi$ is orientably equivalent with $\varphi$. This means that there is a $C^1$ function $\gamma$ taking $\mathcal{D}(\varphi)^0$ one-to-one onto $\mathcal{D}(\psi)^0$, having a positive Jacobian and so that $\forall x \in \mathcal{D}(\varphi)^0$ we have
$$\varphi(x) = \psi \circ \gamma(x).$$
From the chain rule it follows that
$$\frac{\partial(\varphi^{j_1}, \ldots, \varphi^{j_m})}{\partial(x^1, \ldots, x^m)}(x) = \frac{\partial(\psi^{j_1}, \ldots, \psi^{j_m})}{\partial(y^1, \ldots, y^m)}(\gamma(x))\, J_\gamma(x).$$
If we use these last two formulas in (9.5.12), and note that $\gamma(\mathcal{D}(\varphi)^0) = \mathcal{D}(\psi)^0$ and that $\mathcal{D}(\varphi) \setminus \mathcal{D}(\varphi)^0$ and $\mathcal{D}(\psi) \setminus \mathcal{D}(\psi)^0$ have Jordan content zero, then it follows from the transformation theorem for integrals that we could use $\psi$ in place of $\varphi$ in (9.5.12).
Exercises

1. We have shown that if $A$ is a nonnegative symmetric linear transformation of $E^n$ into $E^n$, then there exists a nonnegative symmetric linear transformation $B$ of $E^n$ into $E^n$ so that $B^2 = A$. Show that $B$ is unique in the sense that it is the only nonnegative symmetric linear transformation with this property.

2. A portion of a cone in $E^3$ is a surface having a parametric representation
$$\varphi(x,y) = (x \cos y,\ x \sin y,\ x),$$
where $x \in [0, 1]$, $y \in [0, 2\pi]$. Show that
$$\psi(x,y) = (x,\ y,\ \sqrt{x^2 + y^2}),$$
where $\mathcal{D}(\psi) = B(0, 1) \subset E^2$, is another parametric representation for this surface.

3. Find the surface area of the surface given in Exercise 2.

4. A torus is a surface in $E^3$ having a parametric representation
$$\varphi^1(x,y) = (R + r \cos y) \cos x, \qquad \varphi^2(x,y) = (R + r \cos y) \sin x, \qquad \varphi^3(x,y) = r \sin y,$$
where $R > r$ and $(x,y) \in [-\pi, \pi] \times [-\pi, \pi]$. The boundary of a bagel or doughnut (which is really a soft bagel!) is the trace of a torus. Find its surface area.

5. In the text we gave a parametric representation of a 2-sphere of radius $r$ in $E^3$. Give a corresponding parametric representation for a 3-sphere of radius $r$ in $E^4$ and determine its surface area. Do the same for $E^n$. Compare with the corresponding exercise of Section 8.5.

6. Give a parametric representation for a smooth surface whose trace is the ellipsoid in $E^3$ given by the set of all points $(x, y, z)$ that satisfy the equation

Find its surface area.

7. If $\varphi$ is a smooth surface with $\mathcal{D}(\varphi) \subset E^2$ and $\varphi$ has bounded partials, prove that
$$\mathcal{A}(\varphi) = \int_{\mathcal{D}(\varphi)} [|D_1\varphi(x)|^2\, |D_2\varphi(x)|^2 - (D_1\varphi(x) \cdot D_2\varphi(x))^2]^{1/2}\, dx.$$
[Hint: Use Lagrange's identity (Exercise 1 of Section 6.2).]

8. Suppose a smooth surface in $E^3$ is given by
$$\varphi(x,y) = (x,\ y,\ f(x,y)).$$
Show that
$$\mathcal{A}(\varphi) = \iint_{\mathcal{D}(\varphi)} [1 + (D_1 f(x,y))^2 + (D_2 f(x,y))^2]^{1/2}\, dx\, dy.$$

9. Suppose $\varphi$ is the same as in Exercise 8 and $F$ is a real-valued function of class $C^1$ with domain an open set of $E^3$ so that $\mathcal{R}(\varphi) \subset \mathcal{D}(F)$, $F \circ \varphi = 0$, and $dF(\varphi(x,y)) \ne 0$, $\forall (x,y) \in \mathcal{D}(\varphi)$. Compute $\mathcal{A}(\varphi)$ in terms of the function $F$.

10. Suppose $\varphi$ is a smooth, oriented surface in $E^3$ with $\mathcal{D}(\varphi) \subset E^2$. If $d\varphi(x)$ is nonsingular, show that the outward normal to this surface at $\varphi(x)$ is independent of the parameterization of the surface.

11. Let $I$ be the cube in $E^n$ that is the Cartesian product of $[0, 1]$ with itself $n$ times. Show that $I$ is the trace of exactly two smooth oriented surfaces.

9.6 THE ALGEBRA OF DIFFERENTIAL FORMS
From the definition of a surface integral given by formula (9.5.12), it is clear that
$$\int_\varphi dx^j \wedge dx^k = -\int_\varphi dx^k \wedge dx^j.$$
Hence it is reasonable to want to define higher-dimensional forms in such a way so that
$$dx^j \wedge dx^k = -dx^k \wedge dx^j.$$
Indeed, if this is the case, and $\sigma$ is a permutation of $(1, m)$, then we must have the alternating commutation relation,
$$dx^{j_1} \wedge \cdots \wedge dx^{j_m} = (\operatorname{sgn} \sigma)\, dx^{j_{\sigma 1}} \wedge \cdots \wedge dx^{j_{\sigma m}}.$$
We have already run into this type of behavior in our study of determinants in Section 6.6. Actually this should not be a very surprising development for us in view of the fact that formula (9.5.12) defines a surface integral in terms of determinants. If we want to demand also that $\forall \alpha, \beta \in R$,
$$dx^{j_1} \wedge \cdots \wedge (\alpha\, dx^{j_r} + \beta\, dx^{j_s}) \wedge \cdots \wedge dx^{j_m} = \alpha[dx^{j_1} \wedge \cdots \wedge dx^{j_r} \wedge \cdots \wedge dx^{j_m}] + \beta[dx^{j_1} \wedge \cdots \wedge dx^{j_s} \wedge \cdots \wedge dx^{j_m}],$$
then we shall have a complete analogy with the alternating multilinear functionals used in the study of determinants.

Let us now make some of these things precise. The first thing we shall do is define an algebra. We have already come across the concept of a function algebra in Section 6.7.
9.6.1 Definition. A real associative algebra is a quadruple $(A, +, \wedge, \cdot)$, where the triple $(A, +, \cdot)$ is a real vector space and $\wedge$ is a function from $A \times A$ into $A$ that satisfies the conditions:
(a) $(x \wedge y) \wedge z = x \wedge (y \wedge z)$, $\forall x,y,z \in A$.
(b) $z \wedge (x + y) = (z \wedge x) + (z \wedge y)$ and $(x + y) \wedge z = (x \wedge z) + (y \wedge z)$, $\forall x,y,z \in A$.
(c) $\alpha \cdot (x \wedge y) = (\alpha \cdot x) \wedge y = x \wedge (\alpha \cdot y)$, $\forall \alpha \in R$ & $\forall x,y \in A$.
The algebra is said to be finite-dimensional $\iff$ the corresponding vector space is finite-dimensional.

As usual, we shall shorten the terminology and simply say that $A$ is an algebra. Also, we shall suppress the dot symbol and write '$\alpha x$' instead of '$\alpha \cdot x$.'
The question now arises as to whether it is possible to embed a real

vector space into an algebra for which an alternating commutation
relation holds. If this can be done, then by embedding the first-order
differential forms into such an algebra we could give a meaning to
higher-order differential forms. There is a standard way of doing this,
and in view of our discussion before Definition 9.6.1, the definitions and
constructions that follow should not be too surprising.
Let $V$ be a finite-dimensional vector space and let $M^k(V)$ be the set of all multilinear functionals with domain the $k$-fold Cartesian product of $V$ with itself. Under the usual definitions of addition of functions and multiplication by elements in $R$, $M^k(V)$ becomes a vector space.
If $\lambda \in M^p(V)$ and $\mu \in M^q(V)$, define
$$\lambda\mu(v_1, \ldots, v_p, u_1, \ldots, u_q) = \lambda(v_1, \ldots, v_p)\, \mu(u_1, \ldots, u_q). \qquad (9.6.1)$$
Also, we set $M^0(V) = R$, and if $\lambda \in M^0(V)$ we set $\lambda\mu = \lambda \cdot \mu$. It is quite clear that $\lambda\mu \in M^{p+q}(V)$. Note that it is not usually true that $\lambda\mu = \mu\lambda$. For historical reasons the elements of $M^k(V)$ are sometimes called covariant tensors of $V$ of order $k$.
We shall designate by $M(V)$ the collection of sequences $\lambda = (\lambda_n)$ so that $\forall k \in N_0$, $\lambda_k \in M^k(V)$. For every $\lambda$ and $\mu$ in $M(V)$, and $\forall \alpha \in R$, define
(a) $(\alpha\lambda)_n = \alpha\lambda_n$.
(b) $(\lambda + \mu)_n = \lambda_n + \mu_n$.
(c) $(\lambda\mu)_n = \sum_{k=0}^n \lambda_k\, \mu_{n-k}$.
With these operations it is a very simple matter to check that $M(V)$ becomes a real associative algebra. We shall leave the proofs of these simple facts for the exercises. The algebra $M(V)$ is usually called the covariant tensor algebra of $V$.
For our purposes the algebra $M(V)$ is much too large, since there are already too many elements in each $M^k(V)$ in the sense that they contain multilinear functionals that are not alternating. If $k \ge 1$ we designate by $A^k(V)$ that subspace of $M^k(V)$ which consists of all the alternating multilinear functionals in $M^k(V)$, and we set $A^0(V) = R$. Now, if $\lambda \in A^p(V)$ and $\mu \in A^q(V)$, it is not usually true that $\lambda\mu$ is an alternating multilinear functional. Thus we define a linear transformation $\mathfrak{A}_k$ with domain $M^k(V)$ and range $A^k(V)$ by means of the equations
$$\mathfrak{A}_k(\lambda) = \frac{1}{k!} \sum_{\sigma \in \pi_k} (\operatorname{sgn} \sigma)\, \sigma^*(\lambda), \qquad k \ge 1, \qquad (9.6.2)$$
$$\mathfrak{A}_0(\lambda) = \lambda,$$
where $\sigma^*(\lambda)(v_1, \ldots, v_k) = \lambda(v_{\sigma 1}, \ldots, v_{\sigma k})$, and where $\pi_k$ is the set of permutations of $(1,k)$ onto itself. If $k \ge 1$ and $\lambda \in A^k(V)$, then $\forall \sigma \in \pi_k$, $(\operatorname{sgn} \sigma)\, \sigma^*(\lambda) = \lambda$. Hence, since $\pi_k$ has $k!$ elements, it follows that $\lambda \in A^k(V) \Rightarrow \mathfrak{A}_k(\lambda) = \lambda$. This is one reason for the normalizing factor $1/k!$.
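The formula (9.6.2) for $k \ge 1$ is analogous to the formula (6.6.5) and indeed is modeled on the latter formula. To show that $\mathfrak{A}_k$ has range in $A^k(V)$, it is necessary to show that $\mathfrak{A}_k(\lambda)$ is alternating. The proof is essentially the same as the proof of Theorem 6.6.4. However, we shall repeat it. If $\tau \in \pi_k$, then
$$\tau^*(\mathfrak{A}_k(\lambda)) = \frac{1}{k!} \sum_{\sigma \in \pi_k} (\operatorname{sgn} \sigma)(\tau \circ \sigma)^*(\lambda).$$
Let us set $\rho = \tau \circ \sigma \in \pi_k$. As $\sigma$ ranges over $\pi_k$, so does $\rho$. Also $\sigma = \tau^{-1} \circ \rho$ and thus $\operatorname{sgn} \sigma = \operatorname{sgn} \tau^{-1} \operatorname{sgn} \rho = \operatorname{sgn} \tau \operatorname{sgn} \rho$. Hence
$$\tau^*(\mathfrak{A}_k(\lambda)) = (\operatorname{sgn} \tau)\, \frac{1}{k!} \sum_{\rho \in \pi_k} (\operatorname{sgn} \rho)\, \rho^*(\lambda) = (\operatorname{sgn} \tau)\, \mathfrak{A}_k(\lambda),$$
which proves our assertion that $\mathfrak{A}_k(\lambda)$ is alternating.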
d& on M (V) bysetting
We now define a linear transformation
(9.6.3)
The range ofd& is, ofcourse, M (V).
Let
,\k E
(a)
(b)
(c)
A(V) be the collection of sequences ,\ = (.\n) so that V

k
A (V). For every.\,, E A(V), and Va ER, define
(a .\)n=a,\n
{,\ + JL)n = ,\n + /.Ln
.\ A , = d& (.\,).
With these definitions it is not difficult to show that
E N0,
A (V) becomes a
real associative algebra. This algebra is sometimes called the covariant
alternating tensor algebra of V, or the covariant Grassman algebra of V,
or the covariant exterior algebra of V. The element ,\ A
, is called the
exterior product of,\ and ,.
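The alternation operator of (9.6.2) can be tried out on a small example. The following is only a sketch under stated assumptions: a multilinear functional is modeled as an ordinary Python function of $k$ vectors, and the names `sign`, `alternate` and the sample functional `lam` are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based indices."""
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s

def alternate(lam, k):
    """Return A_k(lam): (1/k!) * sum over sigma of sgn(sigma) * lam(v_sigma1, ..., v_sigmak)."""
    perms = list(permutations(range(k)))
    def alt(*vs):
        total = sum(sign(p) * lam(*(vs[i] for i in p)) for p in perms)
        return Fraction(total, factorial(k))
    return alt

# Hypothetical bilinear functional on R^2 that is NOT alternating.
def lam(u, v):
    return u[0] * v[1] + 3 * u[0] * v[0]

alt_lam = alternate(lam, 2)
u, v = (1, 2), (3, 5)
value_uv = alt_lam(u, v)                   # (lam(u, v) - lam(v, u)) / 2
value_vu = alt_lam(v, u)                   # sign flips when the arguments are swapped
value_twice = alternate(alt_lam, 2)(u, v)  # alternating functionals are left fixed
```

In this example `alt_lam` changes sign when its arguments are exchanged, and applying `alternate` a second time changes nothing, which is the role the normalizing factor $1/k!$ plays.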
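As a step in showing that the operation $\wedge$ is associative, it is first
convenient to prove a lemma.
9.6.2 Lemma. For every $\lambda$ and $\mu$ in $M(V)$,
$$\mathcal{A}(\mathcal{A}(\lambda)\mu) = \mathcal{A}(\lambda\mu) = \mathcal{A}(\lambda\,\mathcal{A}(\mu)).$$
Proof. From the definition of $\mathcal{A}$ given by (9.6.3) and the definition
of $\lambda\mu$, it is enough to prove that $\forall p, q \in N_0$, $\forall\lambda \in M^p(V)$, and $\forall\mu \in M^q(V)$ we have
$$\mathcal{A}_{p+q}(\mathcal{A}_p(\lambda)\mu) = \mathcal{A}_{p+q}(\lambda\mu) = \mathcal{A}_{p+q}(\lambda\,\mathcal{A}_q(\mu)).$$
Let us consider $\pi_p$ as a subset of $\pi_{p+q}$ in the sense that every $\sigma \in \pi_p$
is identified with that element of $\pi_{p+q}$ which leaves invariant all elements
in $(1, p+q) \setminus (1, p)$ and whose restriction to $(1, p)$ is $\sigma$. Then we may
write
$$\mathcal{A}_{p+q}(\mathcal{A}_p(\lambda)\mu) = \frac{1}{(p+q)!}\sum_{\tau\in\pi_{p+q}} (\operatorname{sgn}\tau)\,\tau^*\Big[\frac{1}{p!}\sum_{\sigma\in\pi_p} (\operatorname{sgn}\sigma)\,\sigma^*(\lambda)\,\mu\Big]$$
$$= \frac{1}{(p+q)!}\,\frac{1}{p!}\sum_{\sigma\in\pi_p} (\operatorname{sgn}\sigma)\sum_{\tau\in\pi_{p+q}} (\operatorname{sgn}\tau)\,(\tau\circ\sigma)^*(\lambda\mu).$$
Now for fixed $\sigma \in \pi_p$, as $\tau$ runs over $\pi_{p+q}$, $\rho = \tau\circ\sigma$ runs over $\pi_{p+q}$. Hence
$$\sum_{\tau\in\pi_{p+q}} (\operatorname{sgn}\tau)\,(\tau\circ\sigma)^*(\lambda\mu) = \operatorname{sgn}\sigma \sum_{\rho\in\pi_{p+q}} (\operatorname{sgn}\rho)\,\rho^*(\lambda\mu).$$
Since there are $p!$ elements in $\pi_p$, it follows that
$$\mathcal{A}_{p+q}(\mathcal{A}_p(\lambda)\mu) = \frac{1}{(p+q)!}\sum_{\rho\in\pi_{p+q}} (\operatorname{sgn}\rho)\,\rho^*(\lambda\mu) = \mathcal{A}_{p+q}(\lambda\mu).$$
A similar computation shows the second equality.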

REMARK. Notice that the normalizing factor $1/k!$ used in the definition
of $\mathcal{A}_k$ is essential for the validity of the last lemma.
9.6.3 Theorem. The tensor algebra $\Lambda(V)$ is a real associative algebra
which contains $\Lambda^k(V)$ (isomorphically) $\forall k \in N_0$. If $V$ has dimension $n$, then
$\Lambda(V)$ has dimension $2^n$ and $\Lambda^k(V) = \{0\}$, $\forall k > n$. Further, $\forall\alpha \in R$,
and $\forall\lambda \in \Lambda(V)$,
$$\alpha \wedge \lambda = \alpha\lambda = \lambda \wedge \alpha, \tag{9.6.4}$$
and $\forall\lambda \in \Lambda^p(V)$ and $\forall\mu \in \Lambda^q(V)$,
$$\lambda \wedge \mu = (-1)^{pq}\,\mu \wedge \lambda. \tag{9.6.5}$$
Proof. The fact that $\Lambda(V)$ satisfies all the conditions for a real
associative algebra can be very easily checked. Actually the only thing
that may cause some difficulty is the proof of associativity. However,
this is an immediate consequence of the last lemma. Indeed, using the
fact that $M(V)$ is an associative algebra we get
$$(\lambda \wedge \mu) \wedge \eta = \mathcal{A}((\lambda \wedge \mu)\eta) = \mathcal{A}(\mathcal{A}(\lambda\mu)\eta) = \mathcal{A}(\lambda\mu\eta) = \mathcal{A}(\lambda\,\mathcal{A}(\mu\eta)) = \lambda \wedge (\mu \wedge \eta).$$
We shall leave as an exercise for the reader the proofs of the other facts
needed to establish that $\Lambda(V)$ is an algebra.
The statement that $\Lambda^k(V)$ is contained isomorphically in $\Lambda(V)$ means
that there is a nonsingular linear transformation with domain $\Lambda^k(V)$
and range in $\Lambda(V)$. Since this fact is almost obvious, we shall leave it
for the reader.

It remains to prove the second and third statements of the theorem.
The formula (9.6.4) is so easy to prove that we leave it to the reader.
In proving the other things we shall proceed in a somewhat informal
manner, because the formal proofs of many of the things we say would
require an induction argument, and this would get rather tedious.
However, the induction arguments are not difficult and the reader
wishing to do so can easily fill in the formal details.
Let $\{e_k: k \in (1,n)\}$ be a basis for $V$. For every $j \in (1,n)$, let $\lambda^j \in \Lambda^1(V)$
be that linear functional so that $\forall k \in (1,n)$,
$$\lambda^j(e_k) = \begin{cases} 0, & j \ne k, \\ 1, & j = k. \end{cases}$$
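We shall first prove that $\forall k \in (1,n)$ the set of multilinear functionals
$\{\lambda^{j_1} \wedge \cdots \wedge \lambda^{j_k}: 1 \le j_1 < \cdots < j_k \le n\}$ generates $\Lambda^k(V)$. First,
writing $v_i = \sum_{j=1}^{n} v_i^j e_j$, we have
$$\lambda^{j_1} \cdots \lambda^{j_k}(v_1,\ldots,v_k) = v_1^{j_1} \cdots v_k^{j_k}.$$
Thus if $\sigma \in \pi_k$ we have
$$\sigma^*(\lambda^{j_1} \cdots \lambda^{j_k})(v_1,\ldots,v_k) = v_{\sigma 1}^{j_1} \cdots v_{\sigma k}^{j_k}.$$
Hence, by use of Lemma 9.6.2,
$$\lambda^{j_1} \wedge \cdots \wedge \lambda^{j_k}(v_1,\ldots,v_k) = \mathcal{A}_k(\lambda^{j_1} \cdots \lambda^{j_k})(v_1,\ldots,v_k) = \frac{1}{k!}\sum_{\sigma\in\pi_k} (\operatorname{sgn}\sigma)\, v_{\sigma 1}^{j_1} \cdots v_{\sigma k}^{j_k}.$$
Now, if $\lambda \in \Lambda^k(V)$, then since $\lambda$ is multilinear,
$$\lambda(v_1,\ldots,v_k) = \sum_{j_1=1}^{n} \cdots \sum_{j_k=1}^{n} v_1^{j_1} \cdots v_k^{j_k}\,\lambda(e_{j_1},\ldots,e_{j_k})$$
$$= \sum_{[j]} k!\,\lambda(e_{j_1},\ldots,e_{j_k})\,\frac{1}{k!}\sum_{\sigma\in\pi_k} (\operatorname{sgn}\sigma)\, v_{\sigma 1}^{j_1} \cdots v_{\sigma k}^{j_k}$$
$$= \sum_{[j]} k!\,\lambda(e_{j_1},\ldots,e_{j_k})\;\lambda^{j_1} \wedge \cdots \wedge \lambda^{j_k}(v_1,\ldots,v_k), \tag{9.6.6}$$
where $[j]$ means we are summing over all ordered $k$-tuples $(j_1,\ldots,j_k)$ for which
$1 \le j_1 < \cdots < j_k \le n$. Consequently, we have shown
what we set out to prove: The set $\{\lambda^{j_1} \wedge \cdots \wedge \lambda^{j_k}: 1 \le j_1 < \cdots < j_k \le n\}$
generates $\Lambda^k(V)$.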
Next let us prove that this is a linearly independent set. Indeed
suppose
$$\sum_{[j]} \alpha_{j_1,\ldots,j_k}\,\lambda^{j_1} \wedge \cdots \wedge \lambda^{j_k} = 0.$$
Then evaluating both sides at the $k$-tuple $(e_{i_1},\ldots,e_{i_k})$ we see that
$\alpha_{i_1,\ldots,i_k} = 0$. Thus the given set is a basis for $\Lambda^k(V)$, and consequently
this vector space has dimension $n!/k!\,(n-k)!$.
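The dimension count can be checked by listing the index tuples $1 \le j_1 < \cdots < j_k \le n$ that label the basis elements. This is a minimal sketch; the helper name `basis_indices` and the choice $n = 4$ are illustrative assumptions.

```python
from itertools import combinations
from math import comb

def basis_indices(n, k):
    """All strictly increasing k-tuples of integers between 1 and n."""
    return list(combinations(range(1, n + 1), k))

n = 4
# One count per order k = 0, ..., n; the single empty tuple for k = 0
# stands for the number 1, which spans the scalars.
counts = [len(basis_indices(n, k)) for k in range(n + 1)]
total = sum(counts)
```

Each entry of `counts` agrees with the binomial coefficient $n!/k!\,(n-k)!$, and summing over all orders gives $2^n$.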
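Note now that by the very definition of $\Lambda(V)$, if
$$\sum_{k=1}^{n} \sum_{[j]} \alpha_{j_1,\ldots,j_k}\,\lambda^{j_1} \wedge \cdots \wedge \lambda^{j_k} = 0,$$
then $\forall k$,
$$\sum_{[j]} \alpha_{j_1,\ldots,j_k}\,\lambda^{j_1} \wedge \cdots \wedge \lambda^{j_k} = 0.$$
Thus, if it is true that $\forall k > n$, $\Lambda^k(V) = \{0\}$, then $\Lambda(V)$ has the dimension
$$\sum_{k=0}^{n} \frac{n!}{k!\,(n-k)!} = (1+1)^n = 2^n.$$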
The fact that $\forall k > n$, $\Lambda^k(V) = \{0\}$ is very easy to prove. Indeed, if
$k > n$, then in every set $\{e_{j_i}: i \in (1,k)\ \&\ j_i \in (1,n)\}$ at least two of
the elements must be the same, since there are only $n$ elements in a
basis for $V$. Suppose $e_{j_r} = e_{j_s}$; then since every $\lambda \in \Lambda^k(V)$ is alternating,
if we permute $j_r$ and $j_s$ and leave the other indices fixed we get
$$\lambda(e_{j_1},\ldots,e_{j_k}) = -\lambda(e_{j_1},\ldots,e_{j_k}).$$
From this it follows that $\lambda(e_{j_1},\ldots,e_{j_k}) = 0$. If we now use the expansion
for $\lambda$ given by the first equality in (9.6.6), we get
$$\lambda(v_1,\ldots,v_k) = 0.$$
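To finish the proof of the theorem it remains to prove the relation
(9.6.5). We may as well assume that $p + q \le n$; otherwise the relation
is trivial. From the formula (9.6.6) we get
$$\lambda = \sum_{[j]} \alpha_{j_1,\ldots,j_p}\,\lambda^{j_1} \wedge \cdots \wedge \lambda^{j_p}, \qquad \mu = \sum_{[i]} \beta_{i_1,\ldots,i_q}\,\lambda^{i_1} \wedge \cdots \wedge \lambda^{i_q},$$
where recall that by the symbol $[j]$ we mean we are summing over all
$p$-tuples of integers $(j_1,\ldots,j_p)$, where $1 \le j_1 < \cdots < j_p \le n$, and
by $[i]$ we mean we are summing over all $q$-tuples with
$1 \le i_1 < \cdots < i_q \le n$. Hence we get
$$\lambda \wedge \mu = \sum_{[j]} \sum_{[i]} \alpha_{j_1,\ldots,j_p}\,\beta_{i_1,\ldots,i_q}\;\lambda^{j_1} \wedge \cdots \wedge \lambda^{j_p} \wedge \lambda^{i_1} \wedge \cdots \wedge \lambda^{i_q}.$$
Because of the alternating property, $\lambda^j \wedge \lambda^k = -\lambda^k \wedge \lambda^j$, we have
$$\lambda^{j_1} \wedge \cdots \wedge \lambda^{j_p} \wedge \lambda^{i_1} \wedge \cdots \wedge \lambda^{i_q} = (-1)^{pq}\,\lambda^{i_1} \wedge \cdots \wedge \lambda^{i_q} \wedge \lambda^{j_1} \wedge \cdots \wedge \lambda^{j_p}.$$
This gives (9.6.5).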
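The commutation rule (9.6.5) can be spot-checked numerically. The sketch below assumes the exterior product is computed as $\mathcal{A}_{p+q}$ applied to the ordinary product of the two functionals; the names `wedge` and `coord`, the test vectors, and the use of $R^3$ are illustrative assumptions rather than anything taken from the text.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based indices."""
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s

def alternate(lam, k):
    """A_k(lam) as in (9.6.2), for lam a function of k vectors."""
    perms = list(permutations(range(k)))
    def alt(*vs):
        total = sum(sign(p) * lam(*(vs[i] for i in p)) for p in perms)
        return Fraction(total, factorial(k))
    return alt

def wedge(lam, p, mu, q):
    """Exterior product of a p-functional and a q-functional: A_{p+q}(lam * mu)."""
    def product(*vs):
        return lam(*vs[:p]) * mu(*vs[p:])
    return alternate(product, p + q)

def coord(j):
    """Dual basis functional on R^3 picking out the j-th component (0-based)."""
    return lambda v: v[j]

l1, l2, l3 = coord(0), coord(1), coord(2)
a, b, c = (1, 2, 0), (0, 1, 4), (3, 1, 1)

# p = q = 1: the sign (-1)^(pq) is -1.
one_one = wedge(l1, 1, l2, 1)(a, b)
one_one_swapped = wedge(l2, 1, l1, 1)(a, b)

# p = 1, q = 2: the sign (-1)^(pq) is +1.
two_form = wedge(l2, 1, l3, 1)
left = wedge(l1, 1, two_form, 2)(a, b, c)
right = wedge(two_form, 2, l1, 1)(a, b, c)
```

With these vectors both triple products equal $\det[a, b, c]/3!$, in line with the normalization chosen in (9.6.2).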

We shall single out the following statement, which is actually a
corollary of the previous proof. It will be of some importance when we
discuss differential forms.
9.6.4 Corollary. Let $V$ be a real vector space of dimension $n$, $\{e_k: k \in (1,n)\}$
a basis for $V$, and $\lambda^j$ that linear functional with domain $V$ so that
$$\lambda^j(e_k) = \begin{cases} 0, & j \ne k, \\ 1, & j = k. \end{cases}$$
Then the set $\{\lambda^{j_1} \wedge \cdots \wedge \lambda^{j_k}: 1 \le j_1 < \cdots < j_k \le n\}$ is a basis for $\Lambda^k(V)$,
$k > 0$, which in turn implies that the set consisting of the number 1 together
with
$$\{\lambda^{j_1} \wedge \cdots \wedge \lambda^{j_k}: 1 \le j_1 < \cdots < j_k \le n,\ k \in (1,n)\}$$
is a basis for $\Lambda(V)$.
We are now in a position to give a definition of a kth-order differential
form. If the reader will review Definition 9.1.5 he will see that the
following definition is a direct generalization of the concept of a
first-order differential form.
9.6.5 Definition. A kth-order differential form $\omega$ is a function with
domain in $E^n$, $n \ge k$, and whose range is in $\Lambda^k(E^n)$.
The linear functionals in the set $\{dx^j: j \in (1,n)\}$ have the properties
of the functionals in the set $\{\lambda^j: j \in (1,n)\}$ of the previous corollary
with respect to the basis $\{e_j: j \in (1,n)\}$ for $E^n$, where, of course, $e_j$
is that vector whose $j$th component is 1 and all of whose other components
are zero. Thus, from formula (9.6.6), $\forall x \in \mathcal{D}(\omega)$ we may
write
$$\omega(x) = \sum_{[j]} \omega_{j_1,\ldots,j_k}(x)\,dx^{j_1} \wedge \cdots \wedge dx^{j_k}, \tag{9.6.7}$$
where we recall again that $[j]$ means we are summing over all ordered
$k$-tuples $(j_1,\ldots,j_k)$ for which $1 \le j_1 < \cdots < j_k \le n$.
To see how the right side of (9.6.7) changes if we choose a different
basis for $E^n$, we first prove the following proposition.
9.6.6 Proposition. Suppose $f$ is a function of class $C^1$ with domain
(an open set) in $E^n$ and range in $E^q$, $q \le n$. Suppose further that $g$ is a function
of class $C^1$ with domain in $E^n$ and range in $\mathcal{D}(f)$. Then
$$df^1 \circ g(x) \wedge \cdots \wedge df^q \circ g(x) = \sum_{[j]} \frac{\partial(f^1,\ldots,f^q)}{\partial(t^{j_1},\ldots,t^{j_q})}(g(x))\;dg^{j_1}(x) \wedge \cdots \wedge dg^{j_q}(x), \tag{9.6.8}$$
where
$$\frac{\partial(f^1,\ldots,f^q)}{\partial(t^{j_1},\ldots,t^{j_q})}(g(x)) = \det[D_{j_i} f^k(g(x))].$$
Proof. Using the chain rule we may write
$$df^k \circ g(x) = df^k(g(x)) \circ dg(x).$$
Also, we have
$$df^k(g(x)) = \sum_{j=1}^{n} D_j f^k(g(x))\,dt^j.$$
Hence, since $dt^j \circ dg(x) = dg^j(x)$,
$$df^k(g(x)) \circ dg(x) = \sum_{j=1}^{n} D_j f^k(g(x))\,dg^j(x).$$
Thus
$$df^1 \circ g(x) \wedge \cdots \wedge df^q \circ g(x) = \sum_{j_1=1}^{n} \cdots \sum_{j_q=1}^{n}\ \prod_{k=1}^{q} D_{j_k} f^k(g(x))\;dg^{j_1}(x) \wedge \cdots \wedge dg^{j_q}(x).$$
If $(j_1,\ldots,j_q)$ is a $q$-tuple of integers any two of which are the same,
then we have
$$dg^{j_1}(x) \wedge \cdots \wedge dg^{j_q}(x) = 0.$$
If $(j_1,\ldots,j_q)$ is a $q$-tuple of integers so that $1 \le j_1 < \cdots < j_q \le n$, let $\pi[j]$ be the
set of permutations of the set $\{j_k: k \in (1,q)\}$.
Any two different sets in the collection of sets of $q$-tuples
$$\{(\sigma j_1,\ldots,\sigma j_q): \sigma \in \pi[j]\}$$
are disjoint. Moreover, from the alternating property of multiplication
in $\Lambda(E^n)$, if $\sigma \in \pi[j]$ we have
$$dg^{\sigma j_1}(x) \wedge \cdots \wedge dg^{\sigma j_q}(x) = \operatorname{sgn}\sigma\;dg^{j_1}(x) \wedge \cdots \wedge dg^{j_q}(x).$$
Thus we get
$$df^1 \circ g(x) \wedge \cdots \wedge df^q \circ g(x) = \sum_{[j]} \Big[\sum_{\sigma\in\pi[j]} \operatorname{sgn}\sigma \prod_{k=1}^{q} D_{\sigma j_k} f^k(g(x))\Big]\,dg^{j_1}(x) \wedge \cdots \wedge dg^{j_q}(x).$$
However, the summation in brackets is precisely
$$\frac{\partial(f^1,\ldots,f^q)}{\partial(t^{j_1},\ldots,t^{j_q})}(g(x)).$$
This establishes the formula (9.6.8).

Let us use Proposition 9.6.6 to show how the right side of (9.6.7)
changes when we change the basis of $E^n$. Suppose that $\{e'_k: k \in (1,n)\}$
is any basis for $E^n$, and $\forall a \in E^n$ let us write
$$a = \sum_{k=1}^{n} \alpha^k e'_k = \sum_{k=1}^{n} a^k e_k.$$
For every $j \in (1,n)$ let $y^j$ be that function defined on $E^n$ by the equation
$$y^j(a) = \alpha^j.$$
Now, $\forall x \in E^n$, a simple calculation shows that
$$dy^j(x)(a) = D_a y^j(x) = \alpha^j.$$
Hence $dy^j(x)$ is independent of $x$ and we denote it simply by $dy^j$. Now
the set $\{dy^j: j \in (1,n)\}$ has the properties of the set $\{\lambda^j: j \in (1,n)\}$
of Corollary 9.6.4 with respect to the basis $\{e'_j: j \in (1,n)\}$. Thus from
formula (9.6.6) we may write
$$\omega(x) = \sum_{[j]} \omega'_{j_1,\ldots,j_k}(x)\,dy^{j_1} \wedge \cdots \wedge dy^{j_k}. \tag{9.6.7'}$$
Let $g$ be that linear transformation acting from $E^n$ onto itself so that
$\forall j \in (1,n)$, $g(e_j) = e'_j$. Let us define the function $y$ by means of the
equation
$$y(a) = \sum_{j=1}^{n} y^j(a)\,e_j.$$
Then we have
$$g \circ y(a) = \sum_{j=1}^{n} y^j(a)\,e'_j = a.$$
Thus we see that $g$ is the inverse of $y$ and
$$g^j \circ y(a) = a^j = x^j(a).$$
Thus we may apply formula (9.6.8) of Proposition 9.6.6 and we see that
$$dx^{j_1} \wedge \cdots \wedge dx^{j_q} = \sum_{[i]} \frac{\partial(g^{j_1},\ldots,g^{j_q})}{\partial(t^{i_1},\ldots,t^{i_q})}\,dy^{i_1} \wedge \cdots \wedge dy^{i_q}. \tag{9.6.9}$$
Since $g$ is a linear transformation, the Jacobian
$$\frac{\partial(g^{j_1},\ldots,g^{j_q})}{\partial(t^{i_1},\ldots,t^{i_q})}$$
is independent of the points in $E^n$, and thus we have deemed it unnecessary
to evaluate it at any point.
If we use the formula (9.6.9) in (9.6.7) we get
$$\omega(x) = \sum_{[i]} \Big[\sum_{[j]} \omega_{j_1,\ldots,j_q}(x)\,\frac{\partial(g^{j_1},\ldots,g^{j_q})}{\partial(t^{i_1},\ldots,t^{i_q})}\Big]\,dy^{i_1} \wedge \cdots \wedge dy^{i_q}.$$
Now, by Corollary 9.6.4, the set $\{dy^{i_1} \wedge \cdots \wedge dy^{i_q}: 1 \le i_1 < \cdots < i_q \le n\}$
is linearly independent. Hence if we compare the equation above
with equation (9.6.7') we get
$$\omega'_{i_1,\ldots,i_q}(x) = \sum_{[j]} \omega_{j_1,\ldots,j_q}(x)\,\frac{\partial(g^{j_1},\ldots,g^{j_q})}{\partial(t^{i_1},\ldots,t^{i_q})}. \tag{9.6.10}$$
The transformation formula (9.6.10) is in itself not of great importance.
However, it does help to show that the following definition is
quite independent of the coordinate system we choose.
9.6.7 Definition. If $\omega$ is a kth-order differential form with domain
(an open set) in $E^n$, $n \ge k$, then $\omega$ is said to be of class $C^m$ $\Leftrightarrow$ each function
$\omega_{j_1,\ldots,j_k}$ which appears in (9.6.7) is of class $C^m$. The differential of a
kth-order differential form of class $C^1$ is defined by
$$d\omega(x) = \sum_{[j]} d\omega_{j_1,\ldots,j_k}(x) \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_k}. \tag{9.6.11}$$
Let us now establish what we mentioned before the last definition:
The class to which a differential form belongs and the definition of
$d\omega(x)$ are quite independent of the coordinate system used.
9.6.8 Proposition. Suppose $\omega$ is a $q$th-order differential form with an
expansion given by (9.6.7'). Then $\omega$ is of class $C^m$ $\Leftrightarrow$ each function $\omega'_{j_1,\ldots,j_q}$ is
of class $C^m$, and moreover if $\omega$ is of class $C^1$, then
$$d\omega(x) = \sum_{[j]} d\omega'_{j_1,\ldots,j_q}(x) \wedge dy^{j_1} \wedge \cdots \wedge dy^{j_q}.$$
Proof. If $\omega_{j_1,\ldots,j_q}$ is of class $C^m$, it follows from formula (9.6.10)
that $\omega'_{j_1,\ldots,j_q}$ is of class $C^m$. Conversely, since there is clearly a formula
similar to (9.6.10) in which $\omega_{j_1,\ldots,j_q}$ can be written in terms of $\omega'_{j_1,\ldots,j_q}$,
it follows that if the latter functions are of class $C^m$, so are the former.
From formula (9.6.10) we get
$$d\omega'_{i_1,\ldots,i_q}(x) = \sum_{[j]} \frac{\partial(g^{j_1},\ldots,g^{j_q})}{\partial(t^{i_1},\ldots,t^{i_q})}\,d\omega_{j_1,\ldots,j_q}(x).$$
Note that we are not using Definition 9.6.7 here, since we are only
dealing with the differentials of real-valued functions as developed in
Chapter 7. Hence, by use of formula (9.6.9) we get
$$\sum_{[i]} d\omega'_{i_1,\ldots,i_q}(x) \wedge dy^{i_1} \wedge \cdots \wedge dy^{i_q} = \sum_{[j]} d\omega_{j_1,\ldots,j_q}(x) \wedge \sum_{[i]} \frac{\partial(g^{j_1},\ldots,g^{j_q})}{\partial(t^{i_1},\ldots,t^{i_q})}\,dy^{i_1} \wedge \cdots \wedge dy^{i_q}$$
$$= \sum_{[j]} d\omega_{j_1,\ldots,j_q}(x) \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_q} = d\omega(x).$$

Before we proceed, let us pause for just a moment and look at some
of the things we discussed about line integrals in terms of the language
of differentials of differential forms. First, let us look at a first-order
differential form of class $C^1$ whose domain is in $E^2$. We shall write this as
$$\omega(x) = \omega_1(x)\,dx^1 + \omega_2(x)\,dx^2.$$
By definition, the differential of $\omega$ is
$$d\omega(x) = d\omega_1(x) \wedge dx^1 + d\omega_2(x) \wedge dx^2.$$
Since
$$d\omega_k(x) = D_1\omega_k(x)\,dx^1 + D_2\omega_k(x)\,dx^2,$$
if we use the alternating
property for the exterior product of differentials, we get
$$d\omega(x) = [D_1\omega_2(x) - D_2\omega_1(x)]\,dx^1 \wedge dx^2.$$
We have already seen the coefficient of the right side in Section 9.3.
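The coefficient $D_1\omega_2 - D_2\omega_1$ is easy to approximate numerically. The sketch below uses central differences; the function names, the step size `h`, and the sample form $x^2 y\,dx + x\,dy$ (for which the coefficient is $1 - x^2$) are assumptions made for illustration only.

```python
def partial(f, i, x, h=1e-6):
    """Central-difference approximation of D_i f at the point x (0-based i)."""
    hi, lo = list(x), list(x)
    hi[i] += h
    lo[i] -= h
    return (f(hi) - f(lo)) / (2 * h)

def d_coefficient(w1, w2, x):
    """Approximate coefficient of dx^1 ^ dx^2 in d(w1 dx^1 + w2 dx^2) at x."""
    return partial(w2, 0, x) - partial(w1, 1, x)

# Sample form: w1 = x^2 * y, w2 = x, so the exact coefficient is 1 - x^2.
w1 = lambda p: p[0] ** 2 * p[1]
w2 = lambda p: p[0]
value = d_coefficient(w1, w2, (2.0, 3.0))

# A form with D_1 w2 = D_2 w1 everywhere: x dx + y dy.
closed_value = d_coefficient(lambda p: p[0], lambda p: p[1], (2.0, 3.0))
```

At the point $(2, 3)$ the first value approximates $1 - 4 = -3$, while the second is approximately zero.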
Suppose $\varphi$ is a function of class $C^2$ whose open domain contains the
interval $I = [0,1] \times [0,1]$, and let $\mathcal{R}(\varphi)$ be
contained in the domain of $\omega$.
 0
and if $\varphi|I^0$ is one to one, then by the transformation
theorem for integrals the integral on the right is
$$\int_{\varphi(I)} [D_1\omega_2(x) - D_2\omega_1(x)]\,dx.$$
Hence, Stokes' theorem, 9.3.1, takes the form
$$\int_{\varphi} d\omega = \int_{\partial\varphi} \omega,$$
where $\partial\varphi$ is another symbol for $\varphi \circ \gamma$ in formula (9.3.12).
If $\omega$ is a first-order differential form whose domain is in $E^n$, then
$$\omega(x) = \sum_{k=1}^{n} \omega_k(x)\,dx^k.$$
If $\omega$ has an open domain and is of class $C^1$, then
$$d\omega(x) = \sum_{j<k} [D_j\omega_k(x) - D_k\omega_j(x)]\,dx^j \wedge dx^k.$$
Recall that a first-order differential form is called closed $\Leftrightarrow$ $\forall x \in \mathcal{D}(\omega)$
& $\forall j,k \in (1,n)$, $D_j\omega_k(x) = D_k\omega_j(x)$. From the previous formula
we see that a first-order differential form is closed $\Leftrightarrow$ $\forall x \in \mathcal{D}(\omega)$,
$d\omega(x) = 0$.
9.6.9 Proposition. (a) If $\omega$ and $\nu$ are kth-order differential forms of
class $C^1$, then on their common domain
$$d(\omega + \nu)(x) = d\omega(x) + d\nu(x). \tag{9.6.12}$$
(b) If $\omega$ is a kth-order differential form of class $C^2$, then
$$d^2\omega(x) = 0. \tag{9.6.13}$$
(c) If $\omega$ and $\nu$ are differential forms of class $C^1$ of orders $p$ and $q$,
respectively, then on their common domain
$$d(\omega \wedge \nu)(x) = d\omega(x) \wedge \nu(x) + (-1)^p\,\omega(x) \wedge d\nu(x). \tag{9.6.14}$$
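Proof. The statement (a) is an immediate consequence of the definition
of the differential operator and there is no need to comment
further.
To prove (b) we write
$$d\omega(x) = \sum_{[j]} \sum_{r=1}^{n} D_r\omega_{j_1,\ldots,j_k}(x)\,dx^r \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_k}.$$
Hence
$$d^2\omega(x) = \sum_{[j]} \sum_{r=1}^{n} \sum_{s=1}^{n} D_s D_r\omega_{j_1,\ldots,j_k}(x)\,dx^s \wedge dx^r \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_k}.$$
Since $dx^s \wedge dx^r = -dx^r \wedge dx^s$ and $D_s D_r\omega_{j_1,\ldots,j_k}(x) = D_r D_s\omega_{j_1,\ldots,j_k}(x)$,
the conclusion of part (b) is immediate.
To prove part (c), because of the linearity property (9.6.12) it is
enough to suppose that
$$\omega(x) = f(x)\,dx^{i_1} \wedge \cdots \wedge dx^{i_p}, \qquad \nu(x) = g(x)\,dx^{j_1} \wedge \cdots \wedge dx^{j_q},$$
where $f$ and $g$ are real-valued functions of class $C^1$. Hence
$$d(\omega \wedge \nu)(x) = d(fg)(x) \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_p} \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_q}$$
$$= g(x) \sum_{k=1}^{n} D_k f(x)\,dx^k \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_p} \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_q} + f(x) \sum_{k=1}^{n} D_k g(x)\,dx^k \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_p} \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_q}$$
$$= d\omega(x) \wedge \nu(x) + f(x) \sum_{k=1}^{n} D_k g(x)\,dx^k \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_p} \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_q}.$$
Now, from the alternating commutation relation
$$dx^j \wedge dx^k = -dx^k \wedge dx^j,$$
we get
$$dx^k \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_p} \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_q} = (-1)^p\,dx^{i_1} \wedge \cdots \wedge dx^{i_p} \wedge dx^k \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_q}.$$
Hence
$$d(\omega \wedge \nu)(x) = d\omega(x) \wedge \nu(x) + (-1)^p\,\omega(x) \wedge d\nu(x).$$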
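The definition (9.6.11) and property (9.6.13) can be exercised on forms with polynomial coefficients. This is a sketch under stated assumptions: a polynomial is stored as a dictionary from exponent tuples to coefficients, a form as a dictionary from increasing index tuples to polynomials, and the names `poly_add`, `add_term` and `ext_d` are hypothetical helpers, not notation from the text.

```python
def poly_add(p, q):
    """Add two polynomials stored as {exponent tuple: coefficient}."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + c
        if out[mono] == 0:
            del out[mono]
    return out

def add_term(form, idx, poly):
    """Add poly * dx^idx to a form stored as {increasing index tuple: polynomial}."""
    merged = poly_add(form.get(idx, {}), poly)
    if merged:
        form[idx] = merged
    else:
        form.pop(idx, None)

def ext_d(form, n):
    """Exterior derivative: sum over [j] and r of D_r w_[j] dx^r ^ dx^[j]."""
    out = {}
    for idx, coeff in form.items():
        for r in range(1, n + 1):
            if r in idx:
                continue  # dx^r ^ dx^r = 0
            # Moving dx^r into increasing position costs one sign per smaller index.
            sgn = (-1) ** sum(1 for j in idx if j < r)
            dc = {}
            for mono, c in coeff.items():
                if mono[r - 1] > 0:
                    new = mono[:r - 1] + (mono[r - 1] - 1,) + mono[r:]
                    dc[new] = sgn * c * mono[r - 1]
            add_term(out, tuple(sorted(idx + (r,))), dc)
    return out

# x^2 y dx + x dy on E^2: its differential is (1 - x^2) dx ^ dy.
omega = {(1,): {(2, 1): 1}, (2,): {(1, 0): 1}}
d_omega = ext_d(omega, 2)

# A zero-order form f = x y^2 z on E^3: d(df) vanishes identically.
f = {(): {(1, 2, 1): 1}}
dd_f = ext_d(ext_d(f, 3), 3)
```

The cancellation in `dd_f` comes from exactly the two facts used in the proof of part (b): the equality of mixed partial derivatives and the sign change on interchanging $dx^s$ and $dx^r$.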

NOTE: In the previous proposition we have used the following definitions.
If $\omega$ and $\nu$ are differential forms, then on their common domain
$$(\omega + \nu)(x) = \omega(x) + \nu(x), \qquad (\omega \wedge \nu)(x) = \omega(x) \wedge \nu(x).$$
Suppose now that $\varphi$ is a function of class $C^1$. We shall set
$$\varphi^*\omega(x) = \omega \circ \varphi(x) = \sum_{[j]} \omega_{j_1,\ldots,j_q} \circ \varphi(x)\;dx^{j_1} \circ \varphi(x) \wedge \cdots \wedge dx^{j_q} \circ \varphi(x). \tag{9.6.15}$$
We shall leave it as an exercise for the reader to show that $\varphi^*\omega$ is
independent of the particular representation we use for $\omega$.
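9.6.10 Proposition. Suppose $\omega$ and $\nu$ are differential forms and $\varphi$ and
$\psi$ are of class $C^1$. Then
(a) $\varphi^*(\omega + \nu) = \varphi^*\omega + \varphi^*\nu$.
(b) $\varphi^*(\omega \wedge \nu) = \varphi^*\omega \wedge \varphi^*\nu$.
(c) $(\psi \circ \varphi)^*\omega = \varphi^* \circ \psi^*\omega$.
(d) If $\omega$ is of class $C^1$, $\mathcal{R}(\varphi) \subset \mathcal{D}(\omega)$ and $\varphi$ is of class $C^2$, then
$$d(\varphi^*\omega) = \varphi^* d\omega.$$
Proof. The proofs of the first three statements are rather easy and
we leave them as exercises. To prove (d) we first write
$$\varphi^*\omega(x) = \sum_{[j]} \omega_{j_1,\ldots,j_q} \circ \varphi(x)\;dx^{j_1} \circ \varphi(x) \wedge \cdots \wedge dx^{j_q} \circ \varphi(x).$$
Thus from formulas (9.6.12), (9.6.13), and (9.6.14) we get
$$d\varphi^*\omega(x) = \sum_{[j]} d\omega_{j_1,\ldots,j_q} \circ \varphi(x) \wedge dx^{j_1} \circ \varphi(x) \wedge \cdots \wedge dx^{j_q} \circ \varphi(x).$$
On the other hand, let us note that
$$d\omega_{j_1,\ldots,j_q}(x) = \sum_{k=1}^{n} D_k\omega_{j_1,\ldots,j_q}(x)\,dx^k,$$
and thus
$$\varphi^* d\omega_{j_1,\ldots,j_q}(x) = \sum_{k=1}^{n} D_k\omega_{j_1,\ldots,j_q}(\varphi(x))\,dx^k \circ \varphi(x) = \sum_{k=1}^{n} D_k\omega_{j_1,\ldots,j_q}(\varphi(x))\,d\varphi^k(x) = d\omega_{j_1,\ldots,j_q} \circ \varphi(x).$$
Hence from (a) and (b) and the previous equality we get
$$\varphi^* d\omega(x) = \sum_{[j]} \varphi^* d\omega_{j_1,\ldots,j_q}(x) \wedge \varphi^*[dx^{j_1} \wedge \cdots \wedge dx^{j_q}]$$
$$= \sum_{[j]} d\omega_{j_1,\ldots,j_q} \circ \varphi(x) \wedge dx^{j_1} \circ \varphi(x) \wedge \cdots \wedge dx^{j_q} \circ \varphi(x) = d\varphi^*\omega(x).$$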
Exercises
1. Prove that $M(V)$ is a real associative algebra.
2. Complete the proof of Theorem 9.6.3; that is, show that $\Lambda(V)$
is a real algebra and that $\Lambda^k(V)$ can be isomorphically embedded into
$\Lambda(V)$. Can we consider $V$ as embedded in $\Lambda(V)$?
3. If $V$ is a real finite-dimensional vector space and $\{e_k: k \in (1,n)\}$
is a basis for $V$, show that there exists a linear functional $\lambda^j$, acting on $V$,
so that
$$\lambda^j(e_k) = \begin{cases} 0, & j \ne k, \\ 1, & j = k. \end{cases}$$
4. Prove (a), (b) and (c) of Proposition 9.6.10.
5. If $\lambda, \mu, \nu \in \Lambda^3(V)$, show that
$$(\lambda - \mu) \wedge (\mu - \nu) = (\lambda \wedge \mu) + (\mu \wedge \nu) + (\nu \wedge \lambda).$$
6. Let $V$, $\{e_k: k \in (1,n)\}$, and $\{\lambda^k: k \in (1,n)\}$ be as in Exercise 3.
Show that
7. Suppose that
$$\lambda^j = \sum_{i=1}^{n} a_{ij}\,\mu^i, \quad j \in (1,n),$$
where $\forall j \in (1,n)$, $\lambda^j$ and $\mu^j$ are in $\Lambda^1(V)$, and $\forall i,j \in (1,n)$, $a_{ij} \in R$.
Show that
$$\lambda^1 \wedge \cdots \wedge \lambda^n = \det(a_{ij})\;\mu^1 \wedge \cdots \wedge \mu^n.$$
8. Find the differential of the following differential forms:
(a) $x^2 y\,dx + x\,dy$.
(b) $x\,dx + y\,dy$.
(c) $P\,dx^1 \wedge dx^2 + Q\,dx^2 \wedge dx^3 + R\,dx^3 \wedge dx^1$, where $P$, $Q$, and
$R$ are functions of class $C^1$ defined on an open set in $E^3$.
9. If $\omega$ is a differential form of order $p$ and $\eta$ is a differential form
of order $q$, both of class $C^2$ and having a common domain in $E^n$, find
the differential of
$$[(d\omega) \wedge \eta] - [\omega \wedge d\eta].$$
10. Suppose $f^1,\ldots,f^k$ are real-valued functions of class $C^1$ having
a common domain in $E^n$. Show that the Jacobian matrix with entries
$D_j f^i(x)$ has rank $k$ $\Leftrightarrow$
$$df^1(x) \wedge \cdots \wedge df^k(x) \ne 0.$$
9.7
CLOSED AND EXACT FORMS
Our object in this section is to prove the analogue of Theorem 9.4.6
for higher-order differential forms. We shall begin with a definition.
9.7.1 Definition. A kth-order differential form $\omega$ of class $C^1$ is said to
be closed $\Leftrightarrow$ $d\omega = 0$. A kth ($k \ge 1$)-order differential form $\omega$ is said to be
exact $\Leftrightarrow$ there exists a $(k-1)$st-order differential form $\eta$ so that $\omega = d\eta$.
It is quite clear that if an exact form is of class $C^1$, then it is closed,
since $d\omega = d^2\eta = 0$. We shall prove the analogue of Theorem 9.4.6
for slightly more general regions than open balls, called star-shaped
regions.
9.7.2 Definition. A set $S \subset E^n$ is called star-shaped $\Leftrightarrow$ $\exists a \in S$ so that
$\forall x \in S$, the set $\{y: y = (1-t)a + tx,\ t \in [0,1]\}$ is contained in $S$.
It is clear that every convex set is star-shaped, but of course not every
star-shaped set is convex. We shall now state and prove the analogue
of Theorem 9.4.6. The proof of the higher-dimensional theorem is a
direct generalization of the proof for first-order differential forms.
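9.7.3 Theorem. If $\omega$ is a closed differential form, of order greater than
zero, defined on an open star-shaped set in $E^n$, then $\omega$ is exact.
Proof. Since $\mathcal{D}(\omega)$ is star-shaped, $\exists a \in \mathcal{D}(\omega)$ so that the range of
the function $s$, defined on $\mathcal{D}(\omega) \times [0,1]$ by $s(x,t) = (1-t)a + tx$, is
in $\mathcal{D}(\omega)$. Let us suppose, for the moment, that $a = 0$, so that $s(x,t) = tx$.
By the chain rule, and the fact that $\omega$ is closed, we get
$$d\omega \circ s(x,t) = d\omega(s(x,t)) \circ ds(x,t) = 0. \tag{9.7.1}$$
Supposing that $\omega$ is a kth-order form we also have
$$d\omega \circ s(x,t) = \sum_{[j]} d\omega_{[j]} \circ s(x,t) \wedge dx^{j_1} \circ s(x,t) \wedge \cdots \wedge dx^{j_k} \circ s(x,t). \tag{9.7.2}$$
Now,
$$dx^j \circ s(x,t) = x^j\,dt + t\,dx^j,$$
and thus
$$dx^{j_1} \circ s(x,t) \wedge \cdots \wedge dx^{j_k} \circ s(x,t) = [x^{j_1}\,dt + t\,dx^{j_1}] \wedge \cdots \wedge [x^{j_k}\,dt + t\,dx^{j_k}]$$
$$= t^k\,dx^{j_1} \wedge \cdots \wedge dx^{j_k} + t^{k-1} \sum_{i=1}^{k} (-1)^{i-1} x^{j_i}\,dt \wedge dx^{j_1} \wedge \cdots \wedge \widehat{dx^{j_i}} \wedge \cdots \wedge dx^{j_k}, \tag{9.7.3}$$
where the caret over $dx^{j_i}$ indicates that this term does not appear in
the exterior product. Further, we have
$$d\omega_{[j]} \circ s(x,t) = \frac{\partial\omega_{[j]}(tx)}{\partial t}\,dt + t\,d\omega_{[j]}(tx). \tag{9.7.4}$$
If we use (9.7.3) and (9.7.4) in (9.7.2), and then use (9.7.1) we get
$$t^k \sum_{[j]} \sum_{i=1}^{k} (-1)^{i-1} x^{j_i}\,d\omega_{[j]}(tx) \wedge dt \wedge dx^{j_1} \wedge \cdots \wedge \widehat{dx^{j_i}} \wedge \cdots \wedge dx^{j_k}$$
$$+\; t^k \sum_{[j]} \frac{\partial\omega_{[j]}(tx)}{\partial t}\,dt \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_k} + t^{k+1} \sum_{[j]} d\omega_{[j]}(tx) \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_k} = 0.$$
Now, the last term on the left is $t^{k+1}\,d\omega(tx)$, which is zero since $d\omega(tx) = 0$.
Thus moving $dt$ through to the right on the remaining two terms
we get
$$\sum_{[j]} \sum_{i=1}^{k} (-1)^{i-1} x^{j_i}\,d\omega_{[j]}(tx) \wedge dx^{j_1} \wedge \cdots \wedge \widehat{dx^{j_i}} \wedge \cdots \wedge dx^{j_k} \wedge dt = \sum_{[j]} \frac{\partial\omega_{[j]}(tx)}{\partial t}\,dx^{j_1} \wedge \cdots \wedge dx^{j_k} \wedge dt. \tag{9.7.5}$$
Let us set
$$\eta(x) = \sum_{[j]} \sum_{i=1}^{k} (-1)^{i-1} \Big[\int_0^1 t^{k-1}\,\omega_{[j]}(tx)\,x^{j_i}\,dt\Big]\,dx^{j_1} \wedge \cdots \wedge \widehat{dx^{j_i}} \wedge \cdots \wedge dx^{j_k}.$$
In order to compute $d\eta(x)$ we must compute the differentials of the
coefficients. Using Theorem 8.4.3, which allows us to differentiate under
the integral sign, we get
$$d\Big[\int_0^1 t^{k-1}\,\omega_{[j]}(tx)\,x^{j_i}\,dt\Big] = \Big[\int_0^1 t^{k-1}\,\omega_{[j]}(tx)\,dt\Big]\,dx^{j_i} + \int_0^1 t^k\,x^{j_i}\,d\omega_{[j]}(tx)\,dt,$$
where the last integral is defined as the sum
$$\sum_{m=1}^{n} \Big[\int_0^1 t^k\,x^{j_i}\,D_m\omega_{[j]}(tx)\,dt\Big]\,dx^m.$$
Consequently, we get
$$d\eta(x) = \sum_{[j]} k \Big[\int_0^1 t^{k-1}\,\omega_{[j]}(tx)\,dt\Big]\,dx^{j_1} \wedge \cdots \wedge dx^{j_k}$$
$$+ \int_0^1 t^k \Big[\sum_{[j]} \sum_{i=1}^{k} (-1)^{i-1} x^{j_i}\,d\omega_{[j]}(tx) \wedge dx^{j_1} \wedge \cdots \wedge \widehat{dx^{j_i}} \wedge \cdots \wedge dx^{j_k}\Big]\,dt,$$
where the last integral must be given the obvious interpretation. Now,
integrating by parts, we get
$$k \int_0^1 t^{k-1}\,\omega_{[j]}(tx)\,dt = \omega_{[j]}(x) - \int_0^1 t^k\,\frac{\partial\omega_{[j]}(tx)}{\partial t}\,dt.$$
Putting this into the expression for $d\eta$ and using (9.7.5) we get
$$d\eta(x) = \omega(x).$$
We have done all of this under the assumption that $a = 0$, that is,
$s(x,t) = tx$. In case $a \ne 0$, set
$$\nu(x) = \omega(x + a), \quad x \in \mathcal{D}(\omega) - a.$$
Then $\nu$ is a closed differential form defined on a star-shaped set to which
we may apply the previous considerations. Hence there exists a differential
form $\zeta$ so that $d\zeta(x) = \nu(x)$. But if we set $\eta(x) = \zeta(x - a)$, we get
$d\eta(x) = \omega(x)$, and the proof is complete.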
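For forms with polynomial coefficients the integral defining $\eta$ can be evaluated in closed form, since for a monomial $c\,x^a$ of total degree $m$ one has $\int_0^1 t^{k-1}\,c\,(tx)^a\,dt = c\,x^a/(k+m)$. The sketch below relies on that observation and takes $a = 0$; the data layout (dictionaries keyed by exponent tuples and increasing index tuples) and the names `ext_d` and `homotopy` are illustrative assumptions. The sample form is the one from Exercise 2 of this section.

```python
from fractions import Fraction

def poly_add(p, q):
    """Add two polynomials stored as {exponent tuple: coefficient}."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + c
        if out[mono] == 0:
            del out[mono]
    return out

def add_term(form, idx, poly):
    """Add poly * dx^idx to a form stored as {increasing index tuple: polynomial}."""
    merged = poly_add(form.get(idx, {}), poly)
    if merged:
        form[idx] = merged
    else:
        form.pop(idx, None)

def ext_d(form, n):
    """Exterior derivative of a polynomial form on E^n."""
    out = {}
    for idx, coeff in form.items():
        for r in range(1, n + 1):
            if r in idx:
                continue
            sgn = (-1) ** sum(1 for j in idx if j < r)
            dc = {}
            for mono, c in coeff.items():
                if mono[r - 1] > 0:
                    new = mono[:r - 1] + (mono[r - 1] - 1,) + mono[r:]
                    dc[new] = sgn * c * mono[r - 1]
            add_term(out, tuple(sorted(idx + (r,))), dc)
    return out

def homotopy(form):
    """The form eta of the proof (case a = 0), for polynomial coefficients."""
    out = {}
    for idx, coeff in form.items():
        k = len(idx)
        for i, j in enumerate(idx):  # i is 0-based, so the sign is (-1)**i
            term = {}
            for mono, c in coeff.items():
                # Multiply by x^j and integrate t^(k-1+deg) over [0, 1].
                new = mono[:j - 1] + (mono[j - 1] + 1,) + mono[j:]
                term[new] = Fraction((-1) ** i * c, k + sum(mono))
            add_term(out, idx[:i] + idx[i + 1:], term)
    return out

# x dy^dz + y dx^dz + xy dx^dy on E^3, with variables ordered (x, y, z).
omega = {(2, 3): {(1, 0, 0): 1}, (1, 3): {(0, 1, 0): 1}, (1, 2): {(1, 1, 0): 1}}
eta = homotopy(omega)
```

Here `ext_d(omega, 3)` is empty, so the form is closed, and `ext_d(eta, 3)` reproduces `omega`. If the input form is not closed, `ext_d(homotopy(form), n)` generally differs from `form`, so the closedness hypothesis matters.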
The theorem we have just proved is usually referred to as Poincaré's
lemma, although the result in question seems to have been first proved
by V. Volterra. To connect the concept of a closed form with the concept
of an exact form for domains that are more general than star-shaped
sets, it is necessary to discuss certain "topological" properties of sets
in $E^n$. This is done in terms of the cohomology groups of a set. The relevant
theorem is due to G. de Rham. We shall forego a discussion of this
matter, because it would take us too far afield.
Exercises
1. Let $\omega$ be a second-order differential form of class $C^1$ defined on
an open set in $E^3$;
$$\omega(x) = \sum_{[j]} \omega_{j_1,j_2}(x)\,dx^{j_1} \wedge dx^{j_2}.$$
Show that $\omega$ is closed $\Leftrightarrow$ $\forall x \in \mathcal{D}(\omega)$,
$$D_3\omega_{1,2}(x) - D_2\omega_{1,3}(x) + D_1\omega_{2,3}(x) = 0.$$
2. Let $\omega(x,y,z)$ be the second-order differential form of class
$C^1$ defined on $E^3$ by
$$\omega(x,y,z) = x\,dy \wedge dz + y\,dx \wedge dz + xy\,dx \wedge dy.$$
Show that $\omega$ is closed and find a first-order form $\eta$ on $E^3$ so that $\omega = d\eta$.
3. Justify the use of formula (9.7.5) in the proof of Theorem 9.7.3.
4. Suppose $S$ is an open star-shaped region in $E^n$ and $\varphi$ is a one-to-one
function of class $C^2$ with domain $S$, range in $E^n$, and having a
nowhere-vanishing Jacobian. If $\omega$ is a closed form on $\varphi(S)$, show that
$\omega$ is exact.
5. Suppose that $\omega$ is an odd-order differential form of class $C^1$
defined on an open set in $E^n$ and $g$ is a nowhere-vanishing real-valued
function of class $C^1$ with domain $\mathcal{D}(\omega)$ so that $g\omega$ is closed. Show that
$\omega \wedge d\omega = 0$.
9.8
MANIFOLDS
We would like to prove a version of Stokes' theorem in higher dimensions
for objects that are more general than most surfaces. Thus we
wish to talk about certain geometric objects called manifolds, which are
essentially nothing more than surface elements that have been patched
together in a consistent way. We begin with some definitions.
9.8.1 Definition. Suppose $M$ is a set in $E^n$. An m-dimensional chart on
$M$ is a homeomorphism with an open domain in $E^m$ and range a (relatively
open) set in $M$. A collection $\Phi$ of m-dimensional charts on $M$ is called an atlas
for $M$ $\Leftrightarrow$ $M = \bigcup\{\mathcal{R}(\varphi): \varphi \in \Phi\}$. If the collection $\Phi$
of all m-dimensional charts on $M$ is an atlas for $M$, then the
ordered pair $(M, \Phi)$ is called an m-dimensional topological manifold. In the
latter case $\Phi$ is called the continuous structure for $M$, and $M$ is called the trace
of the manifold.
For the sake of simplicity we shall usually designate a manifold by its
trace $M$ rather than by the ordered pair consisting of $M$ and the structure
for $M$. Generally speaking, topological manifolds are not "smooth
enough" to be able to carry out very much analysis on them. Thus it
is necessary to describe classes of differentiable manifolds.
9.8.2 Definition. Let $(M, \Phi)$ be an m-dimensional topological manifold
and $\Phi_k$ the collection of all charts in the structure $\Phi$ which are of class $C^k$,
$k \ge 1$, and each of which has a nonsingular differential at each point of its
domain. The ordered pair $(M, \Phi_k)$ is called an m-dimensional, regular $C^k$
manifold $\Leftrightarrow$ $\Phi_k$ is an atlas for $M$.
Any subset of $\Phi_k$ that is an atlas for $M$ is called a regular $C^k$ atlas for $M$.
If $(M, \Phi_k)$ is an m-dimensional, regular $C^k$ manifold, then $\Phi_k$ is called the
$C^k$ structure for $M$.
As before, we shall often abuse the language and simply say that $M$
is an m-dimensional, regular $C^k$ manifold. In the previous definition
$k$ can take on the value $\infty$ also. If we replace $C^k$ by 'analytic,' then
we call $M$ a regular analytic manifold. For the sake of rounding out the
terminology we can designate a topological manifold as a $C^0$ manifold.
If $M$ is a $C^k$ manifold, and $\varphi, \psi \in \Phi_k$, then $\varphi^{-1} \circ \psi$
is a $C^k$ function. The proof is very similar to the proof of Corollary
7.5.6. Indeed, from Theorem 7.3.3 we know that $\varphi^{-1} \circ \psi$ has a differential
at every point of its domain and hence
$$d\varphi(\varphi^{-1} \circ \psi(x)) \circ d(\varphi^{-1} \circ \psi)(x) = d\psi(x).$$
Now, at every point $\varphi^{-1} \circ \psi(x)$, $d\varphi(\varphi^{-1} \circ \psi(x))$ has rank $m$ and hence by Cramer's
rule we can solve for the entries of the Jacobian matrix of $\varphi^{-1} \circ \psi$ in a
neighborhood of $x$ as a quotient of determinants that involve only the function
$\varphi^{-1} \circ \psi$ and the partials of $\varphi$ and $\psi$. We can then proceed as in Corollary
7.5.6. We shall ask the reader to give all the precise details in an exercise.
Let us look at some simple examples. Let us take $M$ as the unit circle
in $E^2$, $M = \{x: |x| = 1\}$. The functions
$$\varphi(t) = (\cos t, \sin t), \quad t \in\ ]-\pi, \pi[,$$
$$\psi(t) = (\cos t, \sin t), \quad t \in\ ]0, 2\pi[,$$
are $C^\infty$ homeomorphisms with nonsingular differentials at each point
of their respective domains. Thus $M$ is a one-dimensional, regular
$C^\infty$ manifold. Indeed, $M$ is even analytic.

Suppose now that $M$ is the unit sphere in $E^3$. We could again use
polar coordinates to get a parametric representation of the two-dimensional
sphere in $E^3$. However, for the sake of diversity, and also because
it is somewhat easier, we shall proceed in a different way. Let $S^2$ be the
unit sphere in $E^3$; that is, $S^2 = \{x: x \in E^3\ \&\ |x| = 1\}$. On $S^2$ we shall
consider the following relatively open sets:
$$V_j^+ = \{x: x \in S^2\ \&\ x^j > 0\},$$
$$V_j^- = \{x: x \in S^2\ \&\ x^j < 0\}.$$
We shall consider the functions $\pi_j$ taking $V_j^{\pm}$ onto open sets $U_j$ in $E^2$
as follows:
$$\pi_1(x) = (x^2, x^3), \qquad \pi_2(x) = (x^1, x^3), \qquad \pi_3(x) = (x^1, x^2).$$
It is a very easy matter to check that these are one-to-one functions and
the inverse functions are of class $C^\infty$ with nonsingular differentials at
each point of their respective domains. Thus $S^2$ is a regular,
two-dimensional $C^\infty$ manifold. Indeed it is even an analytic manifold. We
shall leave the proofs of these simple facts for the exercises at the end.
To have an integration theory on a manifold, it is necessary to limit
the differentiable structure for the manifold. We have already seen this
to be the case when we integrated over surface elements. We saw in
Section 9.5 that it was possible to define an integral of a differential
form over a surface, provided the surface was oriented. The same
situation persists for manifolds. We are thus led to the definition of
an orientable manifold.
9.8.3 Definition. A regular $C^k$ manifold $(M, \Phi_k)$, $k \ge 1$, is said to be
orientable $\Leftrightarrow$ there is an atlas $\Psi \subset \Phi_k$ so that $\forall\varphi, \psi \in \Psi$ and
$\forall x \in \mathcal{D}(\varphi^{-1} \circ \psi)$, $J_{\varphi^{-1} \circ \psi}(x) > 0$. Such an atlas is called an oriented atlas for $M$.
In defining an oriented manifold it is convenient to be able to specify
a maximal oriented atlas. If $M$ is an orientable manifold, then we put an
equivalence relation on the class of oriented atlases for $M$ by saying that
two oriented atlases $\Psi_1$ and $\Psi_2$ are equivalent if and only if $\Psi_1 \cup \Psi_2$
is an oriented atlas for $M$. We shall leave to the reader the easy task of
verifying that what we have said constitutes the definition of an
equivalence relation.
If $S$ is an equivalence class under the previous equivalence relation,
then we shall call
$$\Psi_k = \bigcup\{\Psi: \Psi \in S\}$$
an oriented structure for $M$. Clearly $\Psi_k$ is a maximal oriented atlas for $M$.
9.8.4 Definition. A regular, oriented, m-dimensional $C^k$ manifold is
a pair $(M, \Psi_k)$, where $\Psi_k$ is an oriented structure for $M$. An oriented structure
for $M$ is also called an orientation for $M$.
It is not always true that a regular $C^k$ manifold is orientable. As an
example we shall consider the manifold called the Möbius band. We shall
first describe the trace of this manifold geometrically, which will enable
us to get a parametric representation for it.
Let us take a circle of radius 2 in the $(x, y)$ plane as represented by
the dashed line in Fig. 9.8.1. Take a line segment of length 2 and keep
its center on this circle. Starting with the line segment on the x-axis in
a vertical position, move it continuously around the circle and rotate
the line continuously around its fixed midpoint so that it is always in
the plane perpendicular to the circle and will have completed a rotation
of 180 degrees when the circle has been completely traversed. The point
set in $E^3$ swept out by this moving line is the trace of the Möbius band.
[FIGURE 9.8.1]
To get a parametric representation for the trace of the Möbius band,
we first write down the parametric representation of the circle in $E^3$.
This is the function from $[0, 2\pi]$ to $E^3$ given by
$$c(\theta) = 2(\cos\theta, \sin\theta, 0).$$
The tangent vector to this curve at the point $c(\theta)$ is, by definition,
$$c'(\theta)/|c'(\theta)| = (-\sin\theta, \cos\theta, 0).$$
By definition, the plane perpendicular to the curve at the point $c(\theta)$
is the plane perpendicular to the tangent vector at $c(\theta)$. It is
given by
$$\{(x,y,z): -x\sin\theta + y\cos\theta = 0\}.$$
A ball of radius 1 with center at the point $c(\theta)$ is the collection of
all $(x, y, z)$ in $E^3$ given by
$$x(\tau, v, \zeta) = 2\cos\theta + \tau\sin v\cos\zeta,$$
$$y(\tau, v, \zeta) = 2\sin\theta + \tau\sin v\sin\zeta, \tag{9.8.2}$$
$$z(\tau, v, \zeta) = \tau\cos v,$$
where $\tau \in\ ]-1, 1[$, $v \in [0, \pi]$, and $\zeta \in [0, 2\pi]$.
If we fix, say $\zeta = \theta$, then it can be checked immediately that the set in $E^3$,
$$\{(x(\tau, v, \theta), y(\tau, v, \theta), z(\tau, v, \theta)): \tau \in\ ]-1, 1[,\ v \in [0, \pi]\},$$
is the open disk of radius 1 with center at $c(\theta)$ that lies in the plane
perpendicular to $c$ at $c(\theta)$. If we fix $v$ and allow $\tau$ to vary, we get a line
segment in this disk which goes through $c(\theta)$. If we make $v$ some continuous
function of $\theta$ so that $v(0) = 0$ and $v(2\pi) = \pi$, then we have a
parametric description of the trace of the Möbius band. The easiest
function to take for $v$ is, of course, the one given by $v(\theta) = \theta/2$.
If we set $\zeta = \theta$ and $v = \theta/2$ in (9.8.2), then the resulting function is
defined on the rectangle $\{(\tau, \theta): \tau \in\ ]-1, 1[,\ \theta \in [0, 2\pi]\}$, and the
range of this function is the trace of the Möbius band. However, just
using this one function will not justify the fact that this point set in $E^3$
is the trace of a $C^\infty$ manifold, since the function does not have an open
domain in $E^2$. Actually we need two functions, defined on open sets in $E^2$,
to show that this set is the trace of a manifold. Let us consider the
function $\varphi$ with domain $E^2$ and range in $E^3$ whose components are given
as follows:
$$\varphi^1(\tau, \theta) = \Big(2 + \tau\sin\frac{\theta}{2}\Big)\cos\theta,$$
$$\varphi^2(\tau, \theta) = \Big(2 + \tau\sin\frac{\theta}{2}\Big)\sin\theta, \tag{9.8.3}$$
$$\varphi^3(\tau, \theta) = \tau\cos\frac{\theta}{2}.$$
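The function of (9.8.3) has a symmetry that is worth seeing numerically: replacing $(\tau, \theta)$ by $(-\tau, \theta - 2\pi)$ leaves its value unchanged, because both half-angle factors change sign. The sketch below checks this at one sample point and estimates the Jacobian determinant of that substitution by central differences; the names `phi`, `transition` and `jacobian_det`, the sample point, and the step size are illustrative assumptions.

```python
import math

def phi(tau, theta):
    """The map of (9.8.3) from the (tau, theta) plane into E^3."""
    r = 2 + tau * math.sin(theta / 2)
    return (r * math.cos(theta), r * math.sin(theta), tau * math.cos(theta / 2))

def transition(tau, theta):
    """The substitution (tau, theta) -> (-tau, theta - 2*pi)."""
    return (-tau, theta - 2 * math.pi)

def jacobian_det(F, p, h=1e-6):
    """Central-difference Jacobian determinant of a map F: R^2 -> R^2 at p."""
    cols = []
    for i in range(2):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        fh, fl = F(*hi), F(*lo)
        cols.append([(fh[r] - fl[r]) / (2 * h) for r in range(2)])
    # cols[i][r] approximates the partial derivative of F^r with respect to x_i.
    return cols[0][0] * cols[1][1] - cols[1][0] * cols[0][1]

p = (0.4, 1.75 * math.pi)  # sample point with 3*pi/2 < theta < 2*pi
gap = max(abs(a - b) for a, b in zip(phi(*p), phi(*transition(*p))))
det = jacobian_det(transition, p)
```

The two parameter points give the same point of $E^3$ up to rounding, while the substitution reverses orientation in the parameter plane, since its Jacobian determinant is $-1$.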
Let $\varphi_1$ be $\varphi$ restricted to the open set $]-1, 1[\ \times\ ]0, 2\pi[$ and $\varphi_2$ be $\varphi$ restricted to the open set $]-1, 1[\ \times\ ]-\pi/2, \pi/2[$. The functions $\varphi_1$ and $\varphi_2$ are $C^\infty$ homeomorphisms and $\mathcal{R}(\varphi_1) \cup \mathcal{R}(\varphi_2)$ is all the trace of the Mobius band. Further, it is a rather easy matter to check that $d\varphi_1$ and $d\varphi_2$ are nonsingular at every point of $\mathcal{D}(\varphi_1)$ and $\mathcal{D}(\varphi_2)$, respectively. Thus we have all the ingredients for a manifold.
The function $\varphi_1$ restricted to the open set $U_1 =\ ]-1, 1[\ \times\ ]3\pi/2, 2\pi[$ and the function $\varphi_2$ restricted to the open set $U_2 =\ ]-1, 1[\ \times\ ]-\pi/2, 0[$ have the same range. The function $\varphi_2^{-1} \circ \varphi_1$, restricted to $U_1$, takes $U_1$ onto $U_2$ and

$$\varphi_2^{-1} \circ \varphi_1(r, \theta) = (-r, \theta - 2\pi), \qquad (r, \theta) \in U_1.$$

The Jacobian of this transformation is $-1$. Note also that $\varphi_1$ and $\varphi_2$ restricted to $]-1, 1[\ \times\ ]0, \pi/2[$ coincide.

Suppose the Mobius band is orientable and $\Psi^1$ is an oriented structure. Let $P$ be the set of all points $(r, \theta)$ for which $\exists \psi \in \Psi^1$ so that $(r, \theta) \in \mathcal{D}(\psi^{-1} \circ \varphi_1)$ and $J_{\psi^{-1} \circ \varphi_1}(r, \theta) > 0$. Also let $Q$ be the set of all points $(r, \theta)$ for which $\exists \psi \in \Psi^1$ so that $(r, \theta) \in \mathcal{D}(\psi^{-1} \circ \varphi_1)$ and $J_{\psi^{-1} \circ \varphi_1}(r, \theta) < 0$. It is clear that $P$ and $Q$ are open. Since $d\varphi_1(x)$ is always nonsingular and since $\forall \psi \in \Psi^1$, $d\psi(x)$ is nonsingular, it follows that the Jacobian of $\psi^{-1} \circ \varphi_1$ never vanishes. Hence the union of $P$ and $Q$ is $\mathcal{D}(\varphi_1)$.

Since $\Psi^1$ is an oriented structure, $P \cap Q = \emptyset$. For, if $\psi, \psi_1 \in \Psi^1$ and $\psi^{-1} \circ \varphi_1$ has a positive [negative] Jacobian at $(r, \theta)$, it follows that $\psi_1^{-1} \circ \varphi_1 = (\psi_1^{-1} \circ \psi) \circ (\psi^{-1} \circ \varphi_1)$ has a positive [negative] Jacobian at $(r, \theta)$. Finally, $P$ and $Q$ are both nonvoid. For suppose that $V$ is a connected open neighborhood of $(0, 0)$ and $\varphi_2(V)$ is in the domain of $\psi^{-1}$, where $\psi \in \Psi^1$. Clearly, there is a neighborhood $W \subset U_1 =\ ]-1, 1[\ \times\ ]3\pi/2, 2\pi[$ so that $\varphi_1(W) \subset \varphi_2(V)$. Now, let $(r_1, \theta_1) \in V \cap \mathcal{D}(\varphi_2)$ with $\theta_1 > 0$. Then the Jacobian of $\psi^{-1} \circ \varphi_1$ at $(r_1, \theta_1)$ is the same as the Jacobian of $\psi^{-1} \circ \varphi_2$ at this point, since $\varphi_1 = \varphi_2$ in a neighborhood of $(r_1, \theta_1)$. On the other hand, let $(r_2, \theta_2) \in W$. Now, $\psi^{-1} \circ \varphi_1 = (\psi^{-1} \circ \varphi_2) \circ (\varphi_2^{-1} \circ \varphi_1)$, and it follows that the Jacobian of $\psi^{-1} \circ \varphi_1$ at $(r_2, \theta_2)$ is the negative of the Jacobian of $\psi^{-1} \circ \varphi_2$ at $\varphi_2^{-1} \circ \varphi_1(r_2, \theta_2)$, since we have shown that the Jacobian of $\varphi_2^{-1} \circ \varphi_1$ is $-1$ at $(r_2, \theta_2)$. Since $V$ is connected, $\varphi_2^{-1} \circ \varphi_1(r_2, \theta_2) \in V$, and the Jacobian of $\psi^{-1} \circ \varphi_2$ never vanishes, it follows that the Jacobian of $\psi^{-1} \circ \varphi_2$ at $(r_1, \theta_1)$ has the same sign as the Jacobian of this function at $\varphi_2^{-1} \circ \varphi_1(r_2, \theta_2)$. Thus the Jacobian of $\psi^{-1} \circ \varphi_1$ at $(r_1, \theta_1)$ has sign opposite to that of the Jacobian of this function at $(r_2, \theta_2)$. This shows that $P$ and $Q$ are not void.

We have shown, under the hypothesis that the Mobius band is orientable, that $\mathcal{D}(\varphi_1)$ is the union of disjoint nonvoid, open sets. This contradicts the fact that $\mathcal{D}(\varphi_1)$ is connected. Hence the Mobius band is not orientable.
OPPOSITE ORIENTATIONS
If $\Psi^k$ is an orientation for $M$, then there is another orientation for $M$ that is canonically associated with $\Psi^k$ and which we label $-\Psi^k$. It can be called the orientation for $M$ that is opposite or negative to the orientation $\Psi^k$. Let us describe $-\Psi^k$. Suppose that $M$ is an n-dimensional manifold, so the elements of $\Psi^k$ are defined on open subsets of $E^n$. There is an atlas $\Psi \subset \Psi^k$ so that every element of $\Psi$ has as its domain the open unit ball. Indeed, $\forall \psi \in \Psi^k$ and $\forall x \in \mathcal{D}(\psi)$ let $B(x, \rho_x)$ be a ball contained in $\mathcal{D}(\psi)$. Let $T_x$ be the function with domain $B(0, 1)$ defined by

$$T_x(t) = \rho_x t + x.$$

It is clear that $T_x$ is a $C^\infty$ homeomorphism that takes $B(0, 1)$ onto $B(x, \rho_x)$. Let us set

$$\psi_x = \psi \circ T_x.$$

Since $T_x$ has a positive Jacobian, it is clear that $\psi_x \in \Psi^k$. Now take

$$\Psi = \{\psi_x: \psi \in \Psi^k,\ x \in \mathcal{D}(\psi)\}.$$

It is clear that $\Psi$ is an atlas with the prescribed properties: All elements in $\Psi$ have domain $B(0, 1)$.
For every $\psi \in \Psi$ let us set

$$\psi_-(x) = \psi(-x^1, x^2, \ldots, x^n).$$

Then since

$$\psi^{-1} \circ \psi_-(x) = (-x^1, x^2, \ldots, x^n),$$

it follows that $J_{\psi^{-1} \circ \psi_-}(x) < 0$, so that $\psi_- \notin \Psi$. On the other hand, if $\varphi \in \Psi$ and we set

$$\nu = \varphi_-^{-1} \circ \psi_- = (\varphi_-^{-1} \circ \varphi) \circ (\varphi^{-1} \circ \psi) \circ (\psi^{-1} \circ \psi_-),$$

it follows that $\forall x \in \mathcal{D}(\nu)$, $J_\nu(x) > 0$. Thus it follows that the set

$$-\Psi = \{\psi_-: \psi \in \Psi\}$$

is an oriented $C^k$ atlas for $M$. We shall designate the orientation for $M$ that contains the atlas $-\Psi$ by $-\Psi^k$. We have asked the reader to investigate the situation somewhat further in Exercises 5 and 6 below.
Exercises

1. Give all the details of the fact that if $(M, \Phi^k)$ is a $C^k$ manifold, then $\forall \varphi, \psi \in \Phi^k$, the transition function $\varphi^{-1} \circ \psi$ is a $C^k$ function.

2. Show that the six projection functions $\{\pi_k: k \in (1, 3)\}$ defined in Section 9.7 can be used to define a regular $C^\infty$ atlas for $S^2$, and indeed even a regular, analytic atlas for $S^2$.

3. Show that $S^2$ is the trace of an orientable $C^\infty$ manifold.

4. Show that the torus of Exercise 4 of Section 9.5 is the trace of a regular, orientable $C^\infty$ manifold.

5. If $M$ is a connected, regular, orientable $C^1$ manifold, show that there exist exactly two oriented structures for $M$. In other words, there exist exactly two orientations for $M$.

6. Show that the results of Exercise 5 are not valid if $M$ is not connected. Indeed, compute the number of orientations for $M$ in terms of the number of components in $M$.
9.9
INTEGRATION ON MANIFOLDS
In Definition 9.5.4 we defined the integral of a differential form over

an oriented surface element. To define the integral of a differential
form over a bounded subset of an oriented manifold, it is necessary to
patch together, in a coherent way, integrals over surface elements. The
easiest way to do this is through a device called a partition of unity. We have already run across this concept in Section 6.7 in connection with the Arzela-Ascoli theorem.
Let $K$ be a bounded set in $E^n$ and $\{B(x_k, \rho_k): k \in (1, p)\}$ a finite covering of $K$ by open balls. Let $e_a$ be that real-valued function with domain $E^1$ which is defined by

$$e_a(r) = \begin{cases} e^{1/(r^2 - a^2)}, & |r| < a, \\ 0, & |r| \ge a. \end{cases}$$

The function $e_a$ is of class $C^\infty$, and vanishes only for $|r| \ge a$. Let $\psi_k$ be that real-valued function with domain $E^n$ defined by

$$\psi_k(x) = e_{\rho_k}(|x - x_k|).$$

The function $\psi_k$ is of class $C^\infty$, vanishes only for $|x - x_k| \ge \rho_k$, and therefore, if $x \in K$,

$$\sum_{k=1}^{p} \psi_k(x) \ne 0.$$

Let us set $B = \bigcup \{B(x_k, \rho_k): k \in (1, p)\}$, and $\forall x \in B$ let us set

$$\pi_k(x) = \psi_k(x) \Big/ \sum_{j=1}^{p} \psi_j(x).$$

Then $\pi_k$ is a $C^\infty$ function with domain $B$ and $\forall x \in K$,

$$\sum_{k=1}^{p} \pi_k(x) = 1.$$

Such a collection of functions is a partition of unity for $K$. The formal definition is given below.
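The construction above can be tried out directly. The following is a minimal sketch under stated assumptions: the covering (three balls of radius 0.8 along a segment in $E^2$) and the names `bump` and `partition_of_unity` are hypothetical choices for illustration and do not come from the text.

```python
import math

def bump(a, r):
    """e_a(r) = exp(1/(r^2 - a^2)) for |r| < a, and 0 for |r| >= a."""
    d = r * r - a * a
    return math.exp(1.0 / d) if d < 0.0 else 0.0

def partition_of_unity(centers, radii):
    """Return the functions pi_k built from the open balls B(x_k, rho_k)."""
    def psi(k, x):
        # psi_k(x) = e_{rho_k}(|x - x_k|)
        return bump(radii[k], math.dist(x, centers[k]))

    def make_pi(k):
        def pi_k(x):
            total = sum(psi(j, x) for j in range(len(centers)))
            return psi(k, x) / total  # total > 0 when x lies in some ball
        return pi_k

    return [make_pi(k) for k in range(len(centers))]

# Hypothetical data: K is the segment [0, 2] x {0} in E^2.
centers = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
radii = [0.8, 0.8, 0.8]
pis = partition_of_unity(centers, radii)

points_of_K = [(0.02 * i, 0.0) for i in range(101)]
sums = [sum(p(x) for p in pis) for x in points_of_K]
values = [p(x) for p in pis for x in points_of_K]
```

Each `pi_k` is only defined on the union of the balls, which mirrors the restriction to the set $B$ in the text: outside $B$ the denominator vanishes.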
9.9.1 Definition. Let $K$ be a bounded set in $E^n$. A $C^k$ partition of unity for $K$ is a finite collection $\{\pi_j: j \in (1, p)\}$ of real-valued $C^k$ functions each having as domain an open set in $E^n$ containing $K$ and having the following properties:

(a) $\pi_j \ge 0$, $\forall j \in (1, p)$.

(b) $\sum_{j=1}^{p} \pi_j(x) = 1$, $\forall x \in K$.
For the theory of integration on a manifold it is necessary to have the notion of a partition of unity for $K$ which is subordinate to an open covering for $K$. In defining this concept it is convenient to talk about the support of a real-valued function, which is defined as the closure of the set of points where the function does not vanish.

9.9.2 Definition. Let $K$ be a bounded set in $E^n$, $\mathcal{U}$ an open covering for $K$, and $\{\pi_j: j \in (1, p)\}$ a partition of unity for $K$. The partition of unity is said to be subordinate to the open covering $\mathcal{U}$ $\Leftrightarrow$ $\forall j \in (1, p)$, $\exists U \in \mathcal{U}$ so that the support of $\pi_j$ is contained in $U$.
9.9.3 Proposition. If $K$ is a compact set in $E^n$ and $\mathcal{U}$ is an open covering for $K$, then there exists a $C^\infty$ partition of unity for $K$ which is subordinate to $\mathcal{U}$.

Proof. For every $x \in K$ there exists an open ball with center at $x$ whose closure is contained in an element of $\mathcal{U}$. Since $K$ is compact, there is a finite set $\{B(x_k, \rho_k): k \in (1, p)\}$ that covers $K$. We have previously constructed a $C^\infty$ partition of unity for $K$ that is subordinate to this finite set of open balls. Thus the proposition is proved.

We are now in a position to define the integral of a differential form over suitable sets in oriented manifolds. Let us first remark that if $M$ is a manifold in $E^n$ and $K \subset M$, then when we speak of the boundary of $K$ we shall mean those points $x \in M$ so that every relatively open set in $M$ that contains $x$ contains points of $K$ as well as points in $M \setminus K$. We shall designate this boundary of $K$ by '$\partial K$'. In general it will not coincide with the boundary $\beta K$, which is taken with respect to $E^n$.
9.9.4 Definition. If $(M, \Phi^k)$ is an m-dimensional, regular $C^k$ manifold in $E^n$ and $K \subset M$, then $K$ is called a Jordan domain in $M$ $\Leftrightarrow$ the closure of $K$ in $M$ is compact and $\forall \varphi \in \Phi^k$ the set $\varphi^{-1}(\partial K)$ has m-dimensional Lebesgue measure zero.
We should remark that if $M$ is a regular $C^1$ manifold and the elements of any atlas for $M$ satisfy the conditions of the previous definition with regard to $\partial K$, then the same is true for the structure $\Phi^1$ on $M$. This is essentially a consequence of Theorem 8.2.7 and we shall leave the details for the reader.

If $(M, \Phi^k)$ is a regular, m-dimensional $C^k$ manifold in $E^n$, then there is a covering $\mathcal{U}$ of $M$ by open sets in $E^n$ so that the intersection of every open set in $\mathcal{U}$ with $M$ is the range of an element in $\Phi^k$. This is simply a consequence of the fact that the range of each element of $\Phi^k$ is a relatively open set in $M$. If $K$ is a Jordan domain in $M$, then it can be covered by a finite number of elements in $\mathcal{U}$ and thus there is a partition of unity for $K$ that is subordinate to $\mathcal{U}$.

9.9.5 Definition. If $(M, \Phi^k)$ is a regular, m-dimensional, $C^k$ manifold in $E^n$, $\mathcal{U}$ is a covering of $M$ by open sets in $E^n$ so that the intersection of every element in $\mathcal{U}$ with $M$ is the range of an element in $\Phi^k$ which has a bounded domain, $K$ is a bounded set in $M$, and $\{\pi_j: j \in (1, p)\}$ is a partition of unity for $K$ which is subordinate to $\mathcal{U}$, then this partition of unity is called a partition of unity for $K$ subordinate to $\Phi^k$.
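We are now in a position to give the definition of the surface integral of a differential form.

9.9.6 Definition. Let $(M, \Psi^1)$ be a regular, oriented, m-dimensional $C^1$ manifold in $E^n$, $K$ a Jordan domain in $M$, and $\omega$ a continuous mth-order differential form with domain an open set in $E^n$ which contains $K$. Then if $\{\pi_j: j \in (1, p)\}$ is a continuous partition of unity for $K$ subordinate to $\Psi^1$, we define

$$\int_K \pi_j\omega = \int_{\varphi_j^{-1}(K)} \pi_j \circ \varphi_j(t) \sum_{[k]} \omega_{[k]} \circ \varphi_j(t)\, \frac{\partial(\varphi_j^{k_1}, \ldots, \varphi_j^{k_m})}{\partial(t^1, \ldots, t^m)}(t)\,dt,$$

$$\int_K \omega = \sum_{j=1}^{p} \int_K \pi_j\omega,$$

where $\varphi_j \in \Psi^1$ and $\pi_j|M$ has support in $\mathcal{R}(\varphi_j)$.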
Let us make some remarks about this definition. First, the Riemann integral used to define the integral of the differential form $\pi_j\omega$ over $K$ exists. This is one point where we make use of the fact that $\{\pi_j: j \in (1, p)\}$ is subordinate to $\Psi^1$. Indeed, if the function under the integral sign is defined to be zero outside $\varphi_j^{-1}(K)$, then the extended function is continuous except possibly at the points in the set $\varphi_j^{-1}(\partial K)$. Since, by hypothesis, this set has Lebesgue measure zero, the result follows from Theorem 8.3.7. Note that the definition of the integral of the differential form $\pi_j\omega$ coincides with Definition 9.5.4, although in the latter definition we did not demand that the function $\varphi$ have rank $m$.

The definition of the integral of $\omega$ over $K$ given in Definition 9.9.6 seems to depend upon the particular choice of a partition of unity. This would, of course, be an intolerable state of affairs, and we want to prove that the definition of the integral is independent of the particular choice of partition of unity for $K$; from that it will be clear that the definition of the integral of $\pi_j\omega$ is independent of the chart $\varphi_j$ provided, of course, that $\pi_j|M$ has support in $\mathcal{R}(\varphi_j)$. It is precisely at this point that we use the fact that we are integrating on an oriented manifold.
9.9.7 Proposition. The definition of the integral given in Definition 9.9.6 is independent of the choice of the partition of unity for $K$ subordinate to $\Psi^1$.

Proof. Suppose $\{\pi_j: j \in (1, p)\}$ and $\{\sigma_i: i \in (1, q)\}$ are partitions of unity for $K$, both subordinate to $\Psi^1$. Let us suppose that $\pi_j|M$ has support in $\mathcal{R}(\varphi_j)$ and $\sigma_i|M$ has support in $\mathcal{R}(\psi_i)$. For the sake of simplicity let us set

$$f_j(t) = \pi_j \circ \varphi_j(t) \sum_{[k]} \omega_{[k]} \circ \varphi_j(t)\, \frac{\partial(\varphi_j^{k_1}, \ldots, \varphi_j^{k_m})}{\partial(t^1, \ldots, t^m)}(t).$$
Then we have

$$\int_{\varphi_j^{-1}(K)} f_j(t)\,dt = \sum_{i=1}^{q} \int_{\varphi_j^{-1}(K)} \sigma_i \circ \varphi_j(t)\, f_j(t)\,dt = \sum_{i=1}^{q} \int_K \pi_j\sigma_i\omega. \qquad (9.9.1)$$
The last equality follows from the first formula in Definition 9.9.6, since we can think of $\pi_j$ as multiplying the form $\sigma_i\omega$. Let us set

$$g_{ij} = \psi_i^{-1} \circ \varphi_j.$$

This is a one-to-one function of class $C^1$ with domain $\varphi_j^{-1}(\mathcal{R}(\varphi_j) \cap \mathcal{R}(\psi_i))$. Since we are working with an oriented manifold, it has a positive Jacobian at each point of its domain. We shall apply the transformation theorem for integrals in the form

$$\int_{g(S)} f \circ g^{-1}(t)\, J_{g^{-1}}(t)\,dt = \int_S f(t)\,dt, \qquad (9.9.2)$$

which is possible if $g$ is a one-to-one $C^1$ function with $S \subset \mathcal{D}(g)$ and has a positive Jacobian. Let $K_{ij}$ be the support of $\sigma_i\pi_j$. Then if we take $S = \varphi_j^{-1}(K \cap K_{ij})$, $g = g_{ij}$, and $f(t) = \sigma_i \circ \varphi_j(t) f_j(t)$, and apply (9.9.2), the right side of (9.9.1) becomes

$$\sum_{i=1}^{q} \int_{\psi_i^{-1}(K)} \sigma_i \circ \psi_i(t)\, f_j \circ g_{ij}^{-1}(t)\, J_{g_{ij}^{-1}}(t)\,dt. \qquad (9.9.3)$$
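Note that we can use $\psi_i^{-1}(K)$ in place of $\psi_i^{-1}(K \cap K_{ij})$, since the support of the integrand is $\psi_i^{-1}(K_{ij})$. It is at this point that we make essential use of the fact that our partitions of unity are subordinate to $\Psi^1$. Now,

$$f_j \circ g_{ij}^{-1}(t)\, J_{g_{ij}^{-1}}(t) = \pi_j \circ \psi_i(t) \sum_{[k]} \omega_{[k]} \circ \psi_i(t)\, \frac{\partial(\varphi_j^{k_1}, \ldots, \varphi_j^{k_m})}{\partial(t^1, \ldots, t^m)}(g_{ij}^{-1}(t))\, J_{g_{ij}^{-1}}(t). \qquad (9.9.4)$$

Further, since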
it follows from the chain rule that
If we use the fact that
then it follows from (9.9.4) and (9.9.5) that
If we use this in (9.9.3), then we get from (9.9.1),
$$\sum_{j=1}^{p} \int_K \pi_j\omega = \sum_{j=1}^{p} \sum_{i=1}^{q} \int_K \sigma_i\pi_j\omega = \sum_{i=1}^{q} \sum_{j=1}^{p} \int_K \pi_j\sigma_i\omega = \sum_{i=1}^{q} \int_K \sigma_i\omega.$$

The first equality comes from the above computations. The first and third sums are equal by formula (9.9.1). The last equality is, of course, obtained by interchanging the roles of $\{\pi_j: j \in (1, p)\}$ and $\{\sigma_i: i \in (1, q)\}$ in the previous proof. This completes the proof.
9.9.8 Proposition. Suppose $\omega_1$ and $\omega_2$ are continuous differential forms so that the conditions of Definition 9.9.6 are satisfied. Then $\forall \alpha, \beta \in R$, we have

$$\int_K (\alpha\omega_1 + \beta\omega_2) = \alpha \int_K \omega_1 + \beta \int_K \omega_2. \qquad (9.9.6)$$

The proof is an immediate consequence of Definition 9.9.6 and we shall leave it as an exercise for the reader.

9.9.9 Proposition. Suppose the conditions of Definition 9.9.6 are satisfied and $K_1$ and $K_2$ are Jordan domains in $M$ with $K_1 \cap K_2 = \partial K_1 \cap \partial K_2$; that is, $K_1$ and $K_2$ have at most boundary points in $M$ in common. Then

$$\int_{K_1 \cup K_2} \omega = \int_{K_1} \omega + \int_{K_2} \omega. \qquad (9.9.7)$$

The proof of this is also an almost immediate consequence of Definition 9.9.6 and we again shall leave it for the reader.

Finally, let us establish the following proposition, which will be useful in Section 9.10.
9.9.10 Proposition. Suppose $\varphi$ is a $C^1$ homeomorphism with $\mathcal{D}(\varphi)$ (an open set) in $E^m$, $\mathcal{R}(\varphi) \subset E^n$, $m \le n$, and $d\varphi(x)$ is of rank $m$, $\forall x \in \mathcal{D}(\varphi)$ (see the Remark below). Further, suppose that $A$ is a Jordan measurable set with $A \subset \mathcal{D}(\varphi)$, and $\omega$ is a continuous mth-order differential form with domain in $E^n$ and $\mathcal{R}(\varphi) \subset \mathcal{D}(\omega)$. Let us consider $\varphi(A)$ as a Jordan domain in the regular oriented m-dimensional manifold $(\mathcal{R}(\varphi), \Psi^1)$ where $\varphi \in \Psi^1$, and let us consider $A$ as a Jordan domain in the regular m-dimensional oriented manifold $(E^m, \Phi^1)$, where $\Phi^1$ contains the identity transformation of $E^m$ onto itself. Then

$$\int_{\varphi(A)} \omega = \int_A \varphi^*\omega, \qquad (9.9.8)$$

where $\varphi^*\omega$ is defined by formula (9.6.15).

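Proof. By definition

$$\int_{\varphi(A)} \omega = \int_A \sum_{[j]} \omega_{[j]} \circ \varphi(t)\, \frac{\partial(\varphi^{j_1}, \ldots, \varphi^{j_m})}{\partial(t^1, \ldots, t^m)}(t)\,dt. \qquad (9.9.9)$$

On the other hand,

$$\varphi^*\omega(t) = \sum_{[j]} \omega_{[j]} \circ \varphi(t)\, d\varphi^{j_1}(t) \wedge \cdots \wedge d\varphi^{j_m}(t).$$

If in Proposition 9.6.6 we take $f = (\varphi^{j_1}, \ldots, \varphi^{j_m})$ and $g$ to be the identity transformation of $E^m$ onto itself we get

$$d\varphi^{j_1}(t) \wedge \cdots \wedge d\varphi^{j_m}(t) = \frac{\partial(\varphi^{j_1}, \ldots, \varphi^{j_m})}{\partial(t^1, \ldots, t^m)}(t)\, dt^1 \wedge \cdots \wedge dt^m.$$

Thus

$$\varphi^*\omega(t) = \sum_{[j]} \omega_{[j]} \circ \varphi(t)\, \frac{\partial(\varphi^{j_1}, \ldots, \varphi^{j_m})}{\partial(t^1, \ldots, t^m)}(t)\, dt^1 \wedge \cdots \wedge dt^m.$$

Consequently, by definition, the integral of $\varphi^*\omega$ over $A$ is exactly the right side of (9.9.9). This concludes the proof.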
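As an illustration of the right side of (9.9.9), the following minimal sketch evaluates it numerically for one concrete case. Everything specific here is an assumption made for the example: the chart $\varphi(t) = (t^1, t^2, (t^1)^2 + (t^2)^2)$, the form $\omega = z\,dx \wedge dy + 2x\,dy \wedge dz$, the set $A = [0, 1]^2$, and the use of central differences and the midpoint rule.

```python
def phi(t1, t2):
    """Hypothetical C^1 chart of rank 2 from E^2 into E^3 (a paraboloid)."""
    return (t1, t2, t1 * t1 + t2 * t2)

# Hypothetical form omega = z dx^dy + 2x dy^dz, keyed by index pairs (j1, j2)
# with 0-based coordinates: (0, 1) is dx^dy and (1, 2) is dy^dz.
omega = {(0, 1): lambda x, y, z: z,
         (1, 2): lambda x, y, z: 2.0 * x}

def minor(j1, j2, t1, t2, h=1e-5):
    """Central-difference value of d(phi^j1, phi^j2)/d(t1, t2) at (t1, t2)."""
    d1 = [(a - b) / (2 * h) for a, b in zip(phi(t1 + h, t2), phi(t1 - h, t2))]
    d2 = [(a - b) / (2 * h) for a, b in zip(phi(t1, t2 + h), phi(t1, t2 - h))]
    return d1[j1] * d2[j2] - d1[j2] * d2[j1]

def integrand(t1, t2):
    """Sum over [j] of omega_[j](phi(t)) times the corresponding Jacobian minor."""
    p = phi(t1, t2)
    return sum(w(*p) * minor(j1, j2, t1, t2) for (j1, j2), w in omega.items())

# Midpoint rule over A = [0, 1] x [0, 1].
n = 200
h = 1.0 / n
integral = h * h * sum(integrand((i + 0.5) * h, (k + 0.5) * h)
                       for i in range(n) for k in range(n))
```

For this choice the pulled-back form works out by hand to $((t^1)^2 + (t^2)^2 - 4(t^1)^2)\,dt^1 \wedge dt^2$, whose integral over $A$ is $-2/3$, and the numerical value agrees to within the error of the midpoint rule.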
REMARK: The formula (9.9.8) is valid without the assumption that $\varphi$ is a homeomorphism and $d\varphi(x)$ is of rank $m$. Indeed, the proof only uses these things to remain in the context of integration on a manifold. If we had used the concept of integration over a surface, as given in Section 9.5, we would have only had to make the assumption that $\varphi$ is a $C^1$ function.
Exercises

1. Prove Proposition 9.9.8.

2. Prove Proposition 9.9.9.

3. Suppose $(M, \Phi^1)$ is a regular, m-dimensional $C^1$ manifold in $E^n$ and $K$ is a Jordan domain in $M$. Give a definition of the surface area of $K$ that is independent of any partition of unity for $K$.
9.10
STOKES' THEOREM
It is the purpose of this section to generalize the results of Section 9.3. To do this the first thing to do is to prove a version of Stokes' theorem for an n-dimensional interval.

Let $I$ be the unit interval in $E^n$ given by

$$I = \{x: x^k \in [0, 1],\ \forall k \in (1, n)\}.$$

Let $I_0^k$ be the $(n-1)$-dimensional face of $I$ given by

$$I_0^k = \{x: x \in I\ \&\ x^k = 0\},$$

and let $I_1^k$ be the $(n-1)$-dimensional face of $I$ given by

$$I_1^k = \{x: x \in I\ \&\ x^k = 1\}.$$

The space $E^n$ may be considered as a regular, n-dimensional $C^1$ manifold, and the interval $I$ is certainly a Jordan domain in this manifold. Actually, $E^n$ is an orientable manifold. Indeed, if $\Psi^1$ is that oriented structure for $E^n$ which contains the identity transformation, then $(E^n, \Psi^1)$ is an oriented manifold. When we integrate a differential form over $I$ we shall always consider $I$ as a Jordan domain in this latter oriented manifold. Perhaps we should also note that according to Exercise 5 of Section 9.8, there is only one other oriented structure of class $C^1$ for $E^n$.
Each $(n-1)$-dimensional interval $I_0^k$ and $I_1^k$ is a Jordan domain in an $(n-1)$-dimensional, regular, oriented $C^1$ manifold. Indeed, let $M_k$ be the range of the function $\varphi_k$ that has domain $E^{n-1}$, and is given by

$$\varphi_k(t) = \sum_{j=1}^{k-1} t^j e_j + \sum_{j=k+1}^{n} t^{j-1} e_j, \qquad (9.10.1)$$

where the $e_j$ are the standard unit vectors in $E^n$. If $\Phi^1$ is that oriented structure on $M_k$ which contains the function $\varphi_k$ then $(M_k, \Phi^1)$ is an oriented manifold, and we shall always consider $I_0^k$ as a Jordan domain in this oriented manifold. In a similar manner, let $N_k$ be the range of the function $\psi_k$ with domain $E^{n-1}$ given by

$$\psi_k(t) = \sum_{j=1}^{k-1} t^j e_j + e_k + \sum_{j=k+1}^{n} t^{j-1} e_j. \qquad (9.10.2)$$

If $\Psi^1$ is the oriented structure on $N_k$ that contains $\psi_k$, then $(N_k, \Psi^1)$ is an oriented manifold and we shall always consider $I_1^k$ as a Jordan domain in this oriented manifold.
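Suppose now that $\omega$ is a $C^1$ differential form of order $n - 1$ whose (open) domain contains $I$. If we write

$$\omega(x) = \sum_{[j]} \omega_{[j]}(x)\, dx^{j_1} \wedge \cdots \wedge dx^{j_{n-1}},$$

then, by definition,

$$d\omega(x) = \sum_{[j]} d\omega_{[j]}(x) \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_{n-1}} = \sum_{[j]} \sum_{k=1}^{n} D_k\omega_{[j]}(x)\, dx^k \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_{n-1}}.$$

Now, since $1 \le j_1 < \cdots < j_{n-1} \le n$, it follows that

$$dx^k \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_{n-1}} = \begin{cases} (-1)^{k+1}\, dx^1 \wedge \cdots \wedge dx^n, & k \ne j_i,\ \forall i \in (1, n-1), \\ 0, & \exists j_i \text{ so that } k = j_i. \end{cases}$$

Hence

$$d\omega(x) = \sum_{k=1}^{n} (-1)^{k+1} D_k\omega_k(x)\, dx^1 \wedge \cdots \wedge dx^n,$$

where we have set

$$\omega_k(x) = \omega_{1, \ldots, k-1, k+1, \ldots, n}(x).$$

If $\iota$ is the identity transformation of $E^n$ onto itself, it follows that $\forall x \in E^n$, $J_\iota(x) = 1$. Thus, from the definition of a surface integral given in Section 9.9, we get

$$\int_I d\omega = \sum_{k=1}^{n} (-1)^{k+1} \int_I D_k\omega_k(x)\,dx.$$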

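If we use iterated integration on the integrals on the right we get

$$\int_I D_k\omega_k(x)\,dx = \int_{\hat{I}_k} \left[\int_0^1 D_k\omega_k(x)\,dx^k\right] d\hat{x}_k,$$

where $\hat{x}_k = (x^1, \ldots, x^{k-1}, x^{k+1}, \ldots, x^n)$ and $\hat{I}_k = \{\hat{x}_k: x \in I\}$. Thus

$$\int_I D_k\omega_k(x)\,dx = \int_{\hat{I}_k} \left[\omega_k \circ \psi_k(\hat{x}_k) - \omega_k \circ \varphi_k(\hat{x}_k)\right] d\hat{x}_k.$$

From the definition of the functions $\psi_k$ and $\varphi_k$, and the definition of a surface integral given in Section 9.9, we get

$$\int_{I_1^k} \omega = \int_J \sum_{[j]} \omega_{[j]} \circ \psi_k(t)\, \frac{\partial(\psi_k^{j_1}, \ldots, \psi_k^{j_{n-1}})}{\partial(t^1, \ldots, t^{n-1})}(t)\,dt,$$

where $J = \{t: t \in E^{n-1}\ \&\ 0 \le t^j \le 1,\ \forall j \in (1, n-1)\}$. Now, if $(j_1, \ldots, j_{n-1})$ contains the integer $k$, then

$$\frac{\partial(\psi_k^{j_1}, \ldots, \psi_k^{j_{n-1}})}{\partial(t^1, \ldots, t^{n-1})}(t) = 0,$$

otherwise this Jacobian is identically 1. Thus

$$\int_{I_1^k} \omega = \int_J \omega_k \circ \psi_k(t)\,dt = \int_{\hat{I}_k} \omega_k \circ \psi_k(\hat{x}_k)\,d\hat{x}_k.$$

In the same way we find

$$\int_{I_0^k} \omega = \int_{\hat{I}_k} \omega_k \circ \varphi_k(\hat{x}_k)\,d\hat{x}_k.$$

Consequently,

$$\int_I D_k\omega_k(x)\,dx = \int_{I_1^k} \omega - \int_{I_0^k} \omega,$$

and thus

$$\int_I d\omega = \sum_{k=1}^{n} (-1)^{k+1} \left[\int_{I_1^k} \omega - \int_{I_0^k} \omega\right] = \sum_{j=0}^{1} \sum_{k=1}^{n} (-1)^{j+k} \int_{I_j^k} \omega. \qquad (9.10.3)$$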
In Section 9.1 we formed formal sums of curves that we called chains and explained there how the meaning could be made precise. In the same way we shall form the formal sum

$$\partial I = \sum_{j=0}^{1} \sum_{k=1}^{n} (-1)^{j+k} I_j^k, \qquad (9.10.4)$$

and call this chain the oriented boundary of $I$. We shall leave to the reader the simple task of giving a precise definition of a chain in this context. We shall define the integral of $\omega$ over $\partial I$ as the last sum in (9.10.3). Thus we have proved the following.

9.10.1 Theorem. If $\omega$ is a $C^1$ differential form of order $n - 1$ with (an open) domain in $E^n$ that contains the n-dimensional unit interval $I$, then

$$\int_I d\omega = \int_{\partial I} \omega, \qquad (9.10.5)$$

where $\partial I$ is the chain given by (9.10.4) and the integral on the right is defined as the last sum in (9.10.3).
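For $n = 2$ formula (9.10.5) can be checked numerically. The following minimal sketch is an illustration only: the form $\omega = P\,dx^1 + Q\,dx^2$ with $P = -x y^2$ and $Q = x^2 y$, and the helper names `midpoint`, `partial` and `face_point`, are assumptions made for the example. In the notation of the text, $\omega_1 = Q$ and $\omega_2 = P$.

```python
def midpoint(f, n=400):
    """Midpoint-rule approximation of the integral of f over [0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

def partial(f, k, x, h=1e-5):
    """Central-difference approximation of the partial derivative of f in slot k at x."""
    up = list(x); up[k] += h
    dn = list(x); dn[k] -= h
    return (f(*up) - f(*dn)) / (2.0 * h)

# Hypothetical form on E^2: omega = P dx^1 + Q dx^2, so omega_1 = Q, omega_2 = P.
P = lambda x, y: -x * y * y
Q = lambda x, y: x * x * y
omega = {1: Q, 2: P}

# Left side of (9.10.5): sum over k of (-1)^(k+1) times the integral over I of D_k omega_k.
lhs = sum((-1) ** (k + 1) *
          midpoint(lambda y: midpoint(
              lambda x: partial(omega[k], k - 1, (x, y)), 100), 100)
          for k in (1, 2))

def face_point(j, k, t):
    """Point of the face I_j^k (where x^k = j) whose free coordinate is t."""
    return (j, t) if k == 1 else (t, j)

# Right side: the last sum in (9.10.3).
rhs = sum((-1) ** (j + k) *
          midpoint(lambda t: omega[k](*face_point(j, k, t)))
          for j in (0, 1) for k in (1, 2))
```

Here $d\omega = 4xy\,dx^1 \wedge dx^2$, so both sides equal 1; the signs $(-1)^{j+k}$ reproduce the counterclockwise orientation of the boundary of the square.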
To be able to obtain a formula like (9.10.5) for certain types of Jordan domains in manifolds, it is first necessary to obtain a version of (9.10.5) when $I$ is replaced by a homeomorphic image of $I$. This is described in the next theorem.

9.10.2 Theorem. Suppose that $\varphi$ is a $C^2$ homeomorphism with domain (an open set) in $E^n$ which contains $I$, has range in $E^p$, $n \le p$, and $d\varphi(x)$ has rank $n$, $\forall x \in \mathcal{D}(\varphi)$ (see the Remark below). Further, suppose that $\omega$ is a $C^1$ differential form of order $n - 1$ with domain (an open set) in $E^p$ that contains $\varphi(I)$. Then

$$\int_{\varphi(I)} d\omega = \int_{\partial\varphi(I)} \omega, \qquad (9.10.6)$$

where

$$\partial\varphi(I) = \sum_{j=0}^{1} \sum_{k=1}^{n} (-1)^{j+k} \varphi(I_j^k), \qquad (9.10.7)$$

$$\int_{\partial\varphi(I)} \omega = \sum_{j=0}^{1} \sum_{k=1}^{n} (-1)^{j+k} \int_{\varphi(I_j^k)} \omega. \qquad (9.10.8)$$
Proof. The differential form $\varphi^*\omega = \omega \circ \varphi$ is a $C^1$ differential form of order $n - 1$ whose open domain in $E^n$ contains the interval $I$. [See (9.6.15) for a definition of $\varphi^*\omega$.] From Theorem 9.10.1 we have

$$\int_I d\varphi^*\omega = \int_{\partial I} \varphi^*\omega = \sum_{j=0}^{1} \sum_{k=1}^{n} (-1)^{j+k} \int_{I_j^k} \varphi^*\omega. \qquad (9.10.9)$$

We claim that

$$\int_{I_j^k} \varphi^*\omega = \int_{\varphi(I_j^k)} \omega. \qquad (9.10.10)$$

Let us show this for $j = 1$, the situation for $j = 0$ being similar. As we have noted before, $I_1^k$ may be considered as a Jordan domain in $(N_k, \Psi^1)$, where $\Psi^1$ is the oriented structure on $N_k$ that contains the function $\psi_k$ given by (9.10.2). Let $J$ be the unit interval in $E^{n-1}$; then $\varphi(I_1^k) = \varphi \circ \psi_k(J)$. By Proposition 9.9.10 we get

$$\int_{\varphi(I_1^k)} \omega = \int_{\varphi \circ \psi_k(J)} \omega = \int_J (\varphi \circ \psi_k)^*\omega.$$

On the other hand, using Proposition 9.6.10(c) we get

$$\int_{I_1^k} \varphi^*\omega = \int_{\psi_k(J)} \varphi^*\omega = \int_J \psi_k^*\varphi^*\omega = \int_J (\varphi \circ \psi_k)^*\omega.$$

Thus the two integrals in (9.10.10) are the same. Consequently, the right side of (9.10.8) is the same as the right side of (9.10.9).

On the other hand, from Proposition 9.6.10(d) we get

$$d\varphi^*\omega = \varphi^* d\omega.$$

From Proposition 9.9.10 we get

$$\int_I d\varphi^*\omega = \int_I \varphi^* d\omega = \int_{\varphi(I)} d\omega.$$

Thus using the last equality, (9.10.8), (9.10.9), and (9.10.10) we see that (9.10.6) is valid.
REMARK: The conditions that $\varphi$ is a homeomorphism and $d\varphi(x)$ has rank $n$ are used only to remain within the context of integration on a manifold. Theorem 9.10.2 remains valid if we only assume that $\varphi$ is of class $C^2$ provided the concept of integration over a surface is used as given in Section 9.5. See the remark after the proof of Proposition 9.9.10.
We can now get a more general version of Stokes' theorem 9.10.2 by "patching together" formula (9.10.6). This more general version of Stokes' theorem is obtained for special types of Jordan domains. To describe these objects we shall introduce the half-space,

$$H^n = \{x: x \in E^n\ \&\ x^n \ge 0\}.$$

In what follows we shall identify $E^{n-1}$ with the subspace of $E^n$ given by

$$\{x: x \in E^n\ \&\ x^n = 0\}.$$
9.10.3 Definition. Suppose $(M, \Phi^k)$ is a regular, oriented, n-dimensional $C^k$ manifold in $E^p$ and $K$ is a bounded subset of $M$. The set $K$ is said to be a regular Jordan domain in $(M, \Phi^k)$ $\Leftrightarrow$ $\forall x \in \partial K$, $\exists \varphi \in \Phi^k$ so that

(a) $x \in \mathcal{R}(\varphi)$.

(b) $\mathcal{D}(\varphi) \cap H^n = \varphi^{-1}(K)$.

(b') $\mathcal{D}(\varphi) \cap E^{n-1} = \varphi^{-1}(\partial K)$.

(See Exercise 10.)

REMARK: We should note here that, as in the case of defining a Jordan domain in an oriented manifold, we have clearly abused the tenets of notational precision in that $K$ itself is a point set and thus could be a regular Jordan domain in different oriented manifolds having the same trace $M$. However, we think that no confusion will result.
9.10.4 Proposition. Suppose $K$ is a regular Jordan domain in the oriented manifold $(M, \Phi^k)$, and $\Phi$ is that subset of all $\varphi \in \Phi^k$ so that

$$\mathcal{R}(\varphi) \cap \partial K \ne \emptyset$$

and conditions (b) and (b') of Definition 9.10.3 are satisfied. Let $\Psi$ be that collection of all functions $\psi$ with the property that $\exists \varphi \in \Phi$ so that

$$\psi = \varphi\,|\,\mathcal{D}(\varphi) \cap E^{n-1}.$$

Then $\Psi$ is a regular, oriented $C^k$ atlas for $\partial K$.

Proof. Let $\varphi \in \Phi$ and $t \in \mathcal{D}(\varphi) \cap E^{n-1}$. Then, of course, $(t, 0) = (t^1, \ldots, t^{n-1}, 0) \in \mathcal{D}(\varphi)$ and we set

$$\psi(t) = \varphi(t, 0).$$

For every $u \in E^{n-1}$ we have

$$d\psi(t)(u) = \sum_{k=1}^{n-1} D_k\varphi(t, 0)\, u^k = d\varphi(t, 0)(u, 0).$$

Now, $d\varphi(t, 0)$ has rank $n$, and thus $d\varphi(t, 0)\,|\,E^{n-1}$ must have rank $n - 1$. This means that $d\psi(t)$ has rank $n - 1$. By conditions (a) and (b') of Definition 9.10.3,

$$\partial K = \bigcup \{\mathcal{R}(\psi): \psi \in \Psi\}.$$

Thus $\Psi$ is certainly a $C^k$ atlas for $\partial K$.

To show that $\Psi$ is an oriented atlas, let $\varphi_1, \varphi_2 \in \Phi$, and

$$\psi_j(t) = \varphi_j(t, 0), \qquad j = 1, 2.$$

Let us set

$$\mu(t) = \psi_2^{-1} \circ \psi_1(t) = \varphi_2^{-1} \circ \varphi_1(t, 0),$$
$$\nu(t, t^n) = \varphi_2^{-1} \circ \varphi_1(t, t^n).$$

Since $\mu(t) = \nu(t, 0)$, it follows that

$$D_j\mu^k(t) = D_j\nu^k(t, 0), \qquad \forall j, k \in (1, n-1).$$

On the other hand, $\nu^n(t, 0) = 0$, so that

$$D_j\nu^n(t, 0) = 0, \qquad \forall j \in (1, n-1).$$

Thus it follows that

$$J_\nu(t, 0) = D_n\nu^n(t, 0)\, J_\mu(t).$$

Now, since $(M, \Phi^k)$ is an oriented manifold, it follows that $J_\nu(t, 0) > 0$. On the other hand, since $\nu^n(t, t^n) > 0$ when $t^n > 0$, it follows that $D_n\nu^n(t, 0) \ge 0$. Thus we must have $J_\mu(t) > 0$. Hence, $\Psi$ is an oriented atlas and the proposition is proved.

What we have just proved shows that there is an oriented $C^k$ structure $\Psi^k$ containing the atlas $\Psi$ of Proposition 9.10.4, so that $(\partial K, \Psi^k)$ is a regular, $(n-1)$-dimensional, oriented $C^k$ manifold. For reasons that will soon become apparent, we want to consider the oriented manifold $(\partial K, (-1)^n\Psi^k)$ rather than $(\partial K, \Psi^k)$. The reader is advised to reread the last two paragraphs of Section 9.8 to refresh his memory on the notation $(-1)^n\Psi^k$.

9.10.5 Definition. Suppose $K$ is a regular Jordan domain in the regular, oriented, n-dimensional $C^k$ manifold $(M, \Phi^k)$. Suppose further that $\Psi$ is that oriented $C^k$ atlas for $\partial K$ given in Proposition 9.10.4 and $\Psi^k$ is the oriented structure for $\partial K$ that contains $\Psi$. The oriented, $(n-1)$-dimensional, regular $C^k$ manifold $(\partial K, (-1)^n\Psi^k)$ is called the regular oriented boundary of $K$ with orientation induced by the orientation $\Phi^k$. (See the last two paragraphs of Section 9.8 for a definition of $-\Psi^k$.) We shall usually abuse the notation and designate $(\partial K, (-1)^n\Psi^k)$ by '$\partial K$.'
Examples. As a first example of a regular Jordan domain, let us consider the closed unit ball in $E^3$. We shall take $M = E^3$ and $\Phi^k$ to be the structure on $E^3$ that contains the identity transformation. The set $K$ is the closure of $B(0, 1)$ and $\partial K$ is the unit sphere $\{x: |x| = 1\}$. Let $\varphi_a$ be that function with domain $]0, \pi[\ \times\ ]a, a + 2\pi[\ \times\ R^+$ given by

$$\varphi_a(t) = (t^3\cos t^1,\ t^3\sin t^1\cos t^2,\ t^3\sin t^1\sin t^2).$$

The Jacobian of $\varphi_a$ at $t$ is

$$J_{\varphi_a}(t) = (t^3)^2\sin t^1 > 0,$$

and thus $\forall a \in R$, $\varphi_a \in \Phi^k$. If $\pi(x) = (x^2, x^3, x^1)$, we also get $J_{\pi \circ \varphi_a}(t) > 0$. Let $T(t) = (t^1, -t^2, -t^3) + e_3$ and set

$$\psi_a = \varphi_a \circ T.$$

Then $\mathcal{D}(\psi_a) =\ ]0, \pi[\ \times\ ]-a - 2\pi, -a[\ \times\ ]-\infty, 1[$, and

$$\psi_a^{-1}(K) = \mathcal{D}(\psi_a) \cap H^3, \qquad \psi_a^{-1}(\partial K) = \mathcal{D}(\psi_a) \cap E^2.$$

We get the same results if we work with $\pi \circ \psi_a$ instead of $\psi_a$. Thus the functions in the set $\{\psi_a: a \in R\} \cup \{\pi \circ \psi_a: a \in R\}$ satisfy the conditions of Definition 9.10.3. If we restrict each $\psi_a$ and $\pi \circ \psi_a$ to $\mathcal{D}(\psi_a) \cap E^2$, then we get an oriented atlas for $\partial K$ and by adding the identity transformation restricted to $B(0, 1/2)$ we get that $K$ is a regular Jordan domain in $(E^3, \Phi^k)$.
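The Jacobian quoted in this example can be spot-checked numerically. The following minimal sketch uses central differences; the helper name `jacobian` and the sample points are assumptions made for illustration. It compares the determinant of the matrix of partial derivatives of $\varphi_a$ with $(t^3)^2 \sin t^1$, and does the same for $\psi_a = \varphi_a \circ T$, whose Jacobian should be $(1 - t^3)^2 \sin t^1$ because $T$ has Jacobian 1.

```python
import math

def phi(t):
    """The chart phi_a of the example (the formula does not depend on a)."""
    t1, t2, t3 = t
    return (t3 * math.cos(t1),
            t3 * math.sin(t1) * math.cos(t2),
            t3 * math.sin(t1) * math.sin(t2))

def psi(t):
    """psi_a = phi_a o T with T(t) = (t1, -t2, -t3) + e_3."""
    return phi((t[0], -t[1], 1.0 - t[2]))

def jacobian(f, t, h=1e-6):
    """Determinant of the 3x3 matrix of central-difference partials of f at t."""
    rows = []
    for k in range(3):
        up = list(t); up[k] += h
        dn = list(t); dn[k] -= h
        rows.append([(a - b) / (2.0 * h) for a, b in zip(f(up), f(dn))])
    # rows[k][i] approximates D_k f^i; a matrix and its transpose share a determinant.
    (a, b, c), (d, e, g), (p, q, r) = rows
    return a * (e * r - g * q) - b * (d * r - g * p) + c * (d * q - e * p)

# Hypothetical sample points with t1 in ]0, pi[ and t3 in ]0, 1[.
samples = [(0.4, 1.0, 0.7), (1.3, 2.5, 0.5), (2.8, 4.0, 0.2)]
phi_errors = [abs(jacobian(phi, t) - t[2] ** 2 * math.sin(t[0])) for t in samples]
psi_errors = [abs(jacobian(psi, t) - (1.0 - t[2]) ** 2 * math.sin(t[0]))
              for t in samples]
```

Both error lists stay at the level of the finite-difference error, which is consistent with $\psi_a$ having a positive Jacobian wherever $t^3 < 1$.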
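As a second example let us consider the case of an anchor ring or bagel in $E^3$. We shall again take $M = E^3$ and $\Phi^k$ the structure on $M$ that contains the identity transformation of $E^3$ onto itself. Let $I_\alpha =\ ]\alpha, \alpha + 2\pi[$, $I_\beta =\ ]\beta, \beta + 2\pi[$ and $I_R =\ ]0, R[$. Let $\varphi_{\alpha,\beta}$ be that function with domain $I_\alpha \times I_\beta \times I_R$ given by

$$\varphi_{\alpha,\beta}^1(t) = (R + t^3\cos t^2)\cos t^1,$$
$$\varphi_{\alpha,\beta}^2(t) = (R + t^3\cos t^2)\sin t^1, \qquad (9.10.11)$$
$$\varphi_{\alpha,\beta}^3(t) = t^3\sin t^2.$$

The functions $\varphi_{\alpha,\beta}$ are $C^\infty$ homeomorphisms and moreover

$$J_{\varphi_{\alpha,\beta}}(t) = t^3(R + t^3\cos t^2) > 0.$$

This shows that the collection of functions $\{\varphi_{\alpha,\beta}: \alpha, \beta \in R\}$ belong to $\Phi^k$. Now let $r \in\ ]0, R[$, let $\varphi$ be defined on $[0, 2\pi] \times [0, 2\pi] \times [0, r]$ by the right side of (9.10.11), and set $K = \mathcal{R}(\varphi)$. $K$ is an anchor ring in $E^3$ and $\partial K$ is a torus (Fig. 9.10.1) (see Exercise 4 of Section 9.5).

FIGURE 9.10.1

Let us set $T(t) = (t^1, -t^2, r - t^3)$ and

$$\psi_{\alpha,\beta} = \varphi_{\alpha,\beta} \circ T.$$

The domain of $\psi_{\alpha,\beta}$ is $I_\alpha \times J_\beta \times\ ]r - R, r[$, where $J_\beta =\ ]-\beta - 2\pi, -\beta[$. Clearly,

$$\psi_{\alpha,\beta}^{-1}(K) = \mathcal{D}(\psi_{\alpha,\beta}) \cap H^3, \qquad \psi_{\alpha,\beta}^{-1}(\partial K) = \mathcal{D}(\psi_{\alpha,\beta}) \cap E^2.$$

Since $T$ has a positive Jacobian at each point, it follows that the same is true for $\psi_{\alpha,\beta}$. Thus we see that $K$ is a regular Jordan domain in $(E^3, \Phi^k)$.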
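The Jacobian of the anchor-ring chart can be spot-checked in the same numerical way. This is a minimal sketch: the value $R = 2$, the sample points, and the helper name `jacobian` are assumptions made for illustration. It compares the central-difference determinant with $t^3(R + t^3\cos t^2)$.

```python
import math

R = 2.0  # hypothetical value of the large radius

def phi(t):
    """Right side of (9.10.11)."""
    t1, t2, t3 = t
    ring = R + t3 * math.cos(t2)
    return (ring * math.cos(t1), ring * math.sin(t1), t3 * math.sin(t2))

def jacobian(f, t, h=1e-6):
    """Determinant of the 3x3 matrix of central-difference partials of f at t."""
    rows = []
    for k in range(3):
        up = list(t); up[k] += h
        dn = list(t); dn[k] -= h
        rows.append([(a - b) / (2.0 * h) for a, b in zip(f(up), f(dn))])
    # rows[k][i] approximates D_k f^i; a matrix and its transpose share a determinant.
    (a, b, c), (d, e, g), (p, q, r) = rows
    return a * (e * r - g * q) - b * (d * r - g * p) + c * (d * q - e * p)

# Hypothetical sample points with t3 in ]0, R[.
samples = [(0.3, 0.9, 0.5), (2.0, 3.1, 1.5), (5.0, 4.4, 1.0)]
errors = [abs(jacobian(phi, t) - t[2] * (R + t[2] * math.cos(t[1])))
          for t in samples]
```

Since $0 < t^3 < R$ forces $R + t^3\cos t^2 > 0$, the computed determinants are positive at every sample point, in agreement with the inequality in the text.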
9.10.6 Theorem. Suppose that $(M, \Phi^2)$ is a regular, oriented, n-dimensional $C^2$ manifold in $E^p$, and $K$ is a regular Jordan domain in $(M, \Phi^2)$. Suppose also that $\omega$ is a $C^1$ differential form of order $n - 1$ with domain (an open set) in $E^p$ so that $K \subset \mathcal{D}(\omega)$. Then

$$\int_K d\omega = \int_{\partial K} \omega. \qquad (9.10.12)$$

Proof. Let $\Phi$ be the subset of $\Phi^2$ described in Proposition 9.10.4. For each $x \in \partial K$, $\exists \psi_x \in \Phi$ so that $x \in \mathcal{R}(\psi_x)$ and $I \subset \mathcal{D}(\psi_x)$, where $I$ is the unit interval in $E^n$. Let $B_x$ be an open ball about $\psi_x^{-1}(x)$ so that $B_x \cap H^n \subset I$, and let $\varphi_x = \psi_x|B_x$. The collection of the ranges of the latter elements is a relatively open covering for the compact set $\partial K$. Thus there is a finite set $\{\varphi_j: j \in (1, l)\}$ of these elements whose ranges cover $\partial K$. Now, $\forall x \in K_1 = K \setminus \bigcup \{\mathcal{R}(\varphi_j): j \in (1, l)\}$, $\exists \psi_x \in \Phi^2$ so that $x \in \mathcal{R}(\psi_x)$, $I \subset \mathcal{D}(\psi_x)$ and $\mathcal{R}(\psi_x) \cap \partial K = \emptyset$. Let $B_x$ be an open ball about $\psi_x^{-1}(x)$ so that $B_x \subset I$ and let $\varphi_x = \psi_x|B_x$. Since $K_1$ is compact there is a finite set $\{\varphi_j: j \in (l + 1, m)\}$ of these latter elements whose ranges cover $K_1$. Thus $\{\mathcal{R}(\varphi_j): j \in (1, m)\}$ covers $K$.

Let $\{\pi_j: j \in (1, m)\}$ be a $C^1$ partition of unity for $K$ so that $\pi_j|M$ has support in $\mathcal{R}(\varphi_j)$. If $\psi_j$ corresponds to $\varphi_j$, as described above, then $\pi_j \circ \psi_j(I_i^k) = 0$, $\forall k \in (1, n - 1)$, $i = 0, 1$, and also $\pi_j \circ \psi_j(I_1^n) = 0$. From the fact that $\forall x \in K \setminus \psi_j(I)$, $\pi_j\omega(x) = 0$, we get, from Theorem 9.10.2,

$$\int_K d\pi_j\omega = \int_{\psi_j(I)} d\pi_j\omega = \int_{\partial\psi_j(I)} \pi_j\omega = (-1)^n \int_{\psi_j(I_0^n)} \pi_j\omega.$$

Note that if $j \in (l + 1, m)$, then $\pi_j \circ \psi_j(I_0^n) = 0$, so that in this case all the above integrals are zero.

Recall that we are supposing that $\partial K$ has the orientation induced by $\Phi^2$, that is, $(-1)^n\Psi^2$. Thus the last equality is nothing more than

$$\int_K d\pi_j\omega = \int_{\partial K} \pi_j\omega.$$

Since $\{\pi_j: j \in (1, m)\}$ contains a partition of unity for $\partial K$ that is subordinate to $(-1)^n\Psi^2$, it follows that

$$\sum_{j=1}^{m} \int_{\partial K} \pi_j\omega = \int_{\partial K} \omega.$$

On the other hand,

$$\omega(x) = \sum_{j=1}^{m} \pi_j(x)\,\omega(x)$$

for all $x$ in some open set in $E^p$ that contains $K$. Thus, for the $x$ in this open set,

$$d\omega(x) = \sum_{j=1}^{m} d\pi_j\omega(x).$$

Hence from Proposition 9.9.8 we get

$$\int_K d\omega = \sum_{j=1}^{m} \int_K d\pi_j\omega.$$

Thus we see that formula (9.10.12) holds and the proof is complete.
The theorem we have just proved does not include Theorem 9.10.1 or 9.10.2 as special cases. Aside from the relatively minor fact that $\partial I$ and $\partial\varphi(I)$, when considered as chains, are logically different from manifolds, there is the more important fact that $\partial I$ and $\partial\varphi(I)$, when considered as boundaries of $I$ and $\varphi(I)$, respectively, cannot support structures that will make them into regular manifolds. The trouble, of course, lies in the fact that these boundaries have sharp corners. These things can be rectified by considering "piecewise $C^k$ manifolds," but we shall not go into these matters.
Exercises

1. Suppose $M$ is a compact, regular, oriented, m-dimensional $C^2$ manifold in $E^n$ and $\omega$ is a $C^1$ differential form of order $m - 1$ with domain an open set in $E^n$ which contains $M$. Show that

$$\int_M d\omega = 0.$$

2. Let $\omega$ be a $C^1$ differential form of order $m - 1$ with domain (an open set) in $E^n$, $m \le n$. Suppose that for every $(m-1)$-dimensional sphere $S \subset E^n$ we have

$$\int_S \omega = 0.$$

Show that $\omega$ is closed.

3. A fluid flows in $E^3$ with a velocity $v$ which depends only on the position vectors $(x, y, z)$. The flow is said to be irrotational if and only if $v$ is of class $C^1$ and

$$\int_\gamma (v^1\,dx + v^2\,dy + v^3\,dz) = 0,$$

for every smooth, regular, oriented, closed curve $\gamma$ in $E^3$. Show that the flow is irrotational if and only if there exists a real-valued function $\varphi$ of class $C^2$ so that

$$v = -\nabla\varphi.$$
4. Let $K$ be a regular Jordan domain in the regular, oriented, n-dimensional $C^1$ manifold $(E^n, \Psi^1)$, where $\Psi^1$ is the oriented structure for $E^n$ which contains the identity transformation. Show that the Jordan content of $K$ is given by

$$|K| = \frac{1}{n} \int_{\partial K} \sum_{j=1}^{n} (-1)^{j+1} x^j\, dx^1 \wedge \cdots \wedge \widehat{dx^j} \wedge \cdots \wedge dx^n.$$
5. Suppose $K_1$ and $K_2$ are regular Jordan domains in the regular, oriented, n-dimensional $C^2$ manifolds $(M_1, \Phi_1^2)$ and $(M_2, \Phi_2^2)$, respectively. If $(\partial K_1, (-1)^n\Psi_1^2) = (\partial K_2, (-1)^n\Psi_2^2)$, and $\omega$ is a continuous, exact, nth-order differential form, show that

$$\int_{K_1} \omega = \int_{K_2} \omega.$$

6. Suppose $K_1$ and $K_2$ are regular Jordan domains in a regular, oriented, n-dimensional $C^2$ manifold, and $K_2 \subset K_1$. Let $\omega$ be a closed $C^1$ differential form of order $n - 1$ so that $K_1 \setminus K_2 \subset \mathcal{D}(\omega)$. Show that

$$\int_{\partial K_1} \omega = \int_{\partial K_2} \omega.$$
7. Generalize formula (9.3.13) to higher dimensions.
8. Generalize Exercises 8, 9, and 10 of Section 9.3 to higher dimensions.
9. Suppose a fluid flows in an open subset of $E^3$ with a velocity $v$
of class $C^1$ which depends only on the position vectors $(x, y, z)$. Suppose
also that $\rho(x, y, z, t)$ is the density of the fluid at the position $(x, y, z)$
and at the time $t$. We shall also suppose that the flow has no sinks or
sources; that is, the rate of increase of mass of the fluid which lies inside
any fixed, closed surface equals the rate at which the mass flows through
the surface. Assuming that $\rho$ is of class $C^1$, obtain the hydrodynamic
equation of continuity,
$$\frac{\partial\rho}{\partial t} + \frac{\partial(\rho v_1)}{\partial x} + \frac{\partial(\rho v_2)}{\partial y} + \frac{\partial(\rho v_3)}{\partial z} = 0.$$
10. Show that condition (b') of Definition 9.10.3 is implied by
condition (b).
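A reader who wishes to test Exercises 3 and 4 numerically before proving them might use a short Python sketch such as the following. It is not part of the original text. The potential $\varphi = x^2 y + \sin z$, the curve `tilted_circle`, the ellipse with semi-axes 3 and 2, and all function names are assumptions chosen only for illustration. The sketch approximates the integral of $v_1\,dx + v_2\,dy + v_3\,dz$ around a closed curve for a field of the form $v = -\nabla\varphi$, and it evaluates the planar case $n = 2$ of the content formula, $\tfrac{1}{2}\int_{\partial K}(x\,dy - y\,dx)$, assuming the boundary is traversed counterclockwise.

```python
import math

def grad_field(x, y, z):
    # v = -grad(phi) for the illustrative potential phi = x^2 y + sin z,
    # differentiated by hand (an assumption, not taken from the text).
    return (-2.0 * x * y, -x * x, -math.cos(z))

def rotation_field(x, y, z):
    # A field that is not a gradient, included for contrast.
    return (-y, x, 0.0)

def circulation(field, curve, n=20000):
    """Approximate the integral of v1 dx + v2 dy + v3 dz over a closed curve.

    The curve is a map t -> (x, y, z) on [0, 1] with curve(0) == curve(1).
    It is replaced by an inscribed polygon, and the field is sampled at
    the midpoint of each chord.
    """
    total = 0.0
    prev = curve(0.0)
    for k in range(1, n + 1):
        cur = curve(k / n)
        mid = [(a + b) / 2.0 for a, b in zip(prev, cur)]
        v = field(*mid)
        total += sum(vi * (b - a) for vi, a, b in zip(v, prev, cur))
        prev = cur
    return total

def content_2d(boundary, n=20000):
    """Approximate (1/2) * integral over the boundary of (x dy - y dx).

    Over a chord from (x0, y0) to (x1, y1) the integral of x dy - y dx is
    exactly x0*y1 - x1*y0, so this returns the area of an inscribed polygon.
    """
    total = 0.0
    x0, y0 = boundary(0.0)
    for k in range(1, n + 1):
        x1, y1 = boundary(k / n)
        total += x0 * y1 - x1 * y0
        x0, y0 = x1, y1
    return 0.5 * total

def tilted_circle(t):
    # A closed curve in E^3 lying over the unit circle of the (x, y)-plane.
    a = 2.0 * math.pi * t
    return (math.cos(a), math.sin(a), 0.5 * math.cos(a))

def ellipse(t):
    # Counterclockwise boundary of the ellipse with semi-axes 3 and 2.
    a = 2.0 * math.pi * t
    return (3.0 * math.cos(a), 2.0 * math.sin(a))

if __name__ == "__main__":
    print(circulation(grad_field, tilted_circle))      # should be near 0
    print(circulation(rotation_field, tilted_circle))  # should be near 2*pi
    print(content_2d(ellipse), 6.0 * math.pi)          # should nearly agree
```

Such a computation illustrates only one direction of Exercise 3 (a field of the form $-\nabla\varphi$ has vanishing circulation) and a single planar instance of Exercise 4; it is a check on one's expectations and in no sense a proof.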
SYMBOLS
PAGE
2
negation
⇒
implication, if ..., then ...
&
conjunction, and
⇔
for every
(x)(Q(x))
for every
there exists
16, 20
element of
16, 18
intersection
17
{x: Q(x)}
AXB
A\B
ACB
An B
n {A:A E v(..}
AUB
U {A:A E vt}
Ac
17
not an element of
21
0
JFJ(R)
5ll(R)
R -
R-1(A)
null set
24
F0G
composition of the functions F and G
24
!IA
f(A)
the range of f restricted to
16
17
17
17
17
17
17
22
22
22
22
24
24
disjunction, either ... or ...

equivalence, if and only if
x, Q(x) is true
the set of all
x such that Q(x) is true
Cartesian product
A and not in B
A is contained in B
elements in
inclusion,
the intersection of all sets in .,(,

union
the union of all sets in vt
complement of the set
domain of the relation R
range of the relation R
R
A under R
inverse of the relation

inverse image of
equivalence relation
the function f restricted to the set
26
the natural numbers
31
the integers
35
the rationals
39
Q
Q+
(m, n)
45
No
the nonnegative integers
47
R
R+
the reals
48
52, 69
lim
limit
53
(x(n)), (xn)
a sequence
36
the positive rationals

the set of all integers between
m and n
the positive reals
54, 235
lxl
absolute value, the length of the
63
g.l.b., inf
greatest lower bound, infimum
63
l.u.b., sup
least upper bound, supremum
69
[a,b], ]a,b[
[a,b[, ]a,b]
I(x)
half-open intervals
vector
69
69
71
(AC)
closed, open intervals

an open interval containing
77
closure of the set
91
lim,lim
limit superior, limit inferior
98
CT
98
99
132
133
133
138
138
183, 354
183, 354
183, 354
184, 355
184, 355
184
(a)n
L ak
k=O
(a, u(a))
2: <ak)
k=O
n
II ak
k=O
(a, Il(a))
II (ak)
k=O
f', df /dx, DJ
pn>, djn/dx", D nJ
A
IAI
A* A
R1(A, {xk})
i5,(A)
Q,(A)
J:
r
f
proof requires axiom of choice
finite sum
infinite series
infinite series
finite product
infinite product
infinite product
derivative of f
nth order derivative of f
decomposition of an interval
norm of a decomposition
A* is a refinement of A
Riemann sum for f
upper Darboux sum for f

lower Darboux sum for f
f(x) dx
upper Darboux integral of f
f(x) dx
lower Darboux integral of f
f(x) dx
Riemann-Darboux integral of f
_a
185
201
( f, /(J ))
improper integral
211
Riemann-Stieltjes integral
J(x) dg(x)
236
xy
dot or inner product
239
240
Ll.
L EBLl.
direct sum of L and
242
B(a,r)
open ball with center at
247
boundary of the set A
247
{3A
Ao
259
[ai!]
matrix
orthogonal complement of
interior of the set
LJ.
a
and radius r
263
transpose of T
T'
sign of
275
sgn
276
det
309
D.J
310
af
Dk,-k
310
315
316
325
325
328
353
356
356
[a;;]
determinant of
[a;;]
partial derivative
ax
differential of f at
df(a)
],(a)
dxk
Du Dvf
Jacobian of f at
differential of projection
akf
ax' ... axk
$C^k$, $C^\infty$
d(I)
I dx
I f(x) dx
fA
f(x)
) <ix
higher-order partial
classes of differentiable functions
diameter of
upper Darboux integral of f

lower Darboux integral of f
Riemann-Darboux integral of f
361
'X( ),K(A)
IAI
Jordan content of
361
XA
characteristic function of
393,431
a(J1, ..., r) (a)

a(x1, ... ' xn)
Jacobian of f at
361
399,444
400
Jy
410
440
444
>.. /\ ,
(j]
outer and inner Jordan content of
A
A
usually a differential form

line integral of a differential form w
ip *w, w
xk
composition of directional derivatives
/C x
'P
composition of w and function 'P

exterior multiplication
special n-tuple of integers
447
dw(x )
differential of differential form w at
463
integral of a differential form over a
462,473
aK
Jordan domain
relative boundary; also oriented
boundary of a Jordan domain
INDEX
Abel's sum formula, 114

test, 114
theorem, 170
Bolzano-Weierstrass theorem, 65,

243
Borel, E., 77
Absolute value, 36, 50, 58
Boundary, 247
Accumulation point, 65, 242
Bounded set, 46
left, right, 85
Bounded variation, 218
Affine space, 306

transformation, 309
Algebra, 300, 438
Cantor, G., 44
alternating tensor, 440
diagonal process, 121
closed, 300
function, 126
of differential forms, 437
intersection theorem, 79
exterior, 440
set, 121
finite dimensional, 438
Cartesian product, 16
Grassman, 440
Cauchy, A., 45
separating, 300
tensor, 439
Alternating multilinear functional,
277, 439
Bunjakovsky-Schwarz inequality,
217, 236
condensation test, 113
net, 357
Analytic function, 163
product of series, 167, 168
Arc cos, arc sin, arc tan, 175, 176
remainder formula, 160
Archimedian ordering, 30, 50, 57
root test, 110
Arcwise connected, 251
sequence, 50
Area of a surface, 432
Cesaro summability, 109, 171
Arzela-Ascoli theorem, 297
Chain, 401, 469
Associative laws, 27, 56
Chain rule, 145, 321
Atlas, 455
Characteristic functions, 361
oriented, 457
Axioms, 2
for the natural numbers, 27
Axiom of choice, 71
Closed set, 77, 242

Cofactor, 284
Commutative laws, 27, 56
Compact set, 77, 242, 294
Comparison test, 110
Complement of a set, 17
Ball, 242
Component, 245
Basis, 230
Composition of functions, 24
ordered, 258
Bernstein, S., 178,180
Conjunction, 3
Connected set, 244
polynomials, 182
arcwise, 251
theorem, 162
simply, 420
Binet-Cauchy theorem, 280
Content,
Blumenthal, L. M., 159
Jordan, inner and outer, 361
Boas, R. P., Jr., 142
of an interval, 353
Continuity, 73, 249
uniform, 80, 250
Convex sets, 245
Corollaries, 2
Cosine, 173
addition formula, 173
Countability, 38, 41
Covering, 77, 242
open, 77, 242
Differential of a function, 310

of a differential form, 447
higher order, 325, 328
Differentiation, 138, 305
of integrals, 377
rules, 145, 319
Dini's theorem, 131
Directed set, 356
Directional derivative, 307, 309
Cramer's rule, 286
Disjunction, 3
Critical point, 345
Distributive law, 27, 56
Curves, 398
Domain
closed, 399
of a function, 23
homotopic, 419
of a relation, 22
length of, 403
Dot product, 236
oriented, 398
piecewise regular and smooth, 399
Eigenvalue, eigenvector, 270
Equality, 12, 16
D'Alembert's ratio test, 111
Equicontinuity, 297
Darboux lower and upper sums and
Equivalence, 5
integrals, 184
integrable functions, 184
relation, 24
Euclidean spaces, 235
Darboux's theorem, 158
Euler's relation, 318
Darboux-Stieltjes integrals, 211
Existential quantifier, 7
Decimal expansions, 118
Exponential function, 142
terminating, 120
Decompositions of intervals, 183
generalized, 87
Exterior product, 441
refinements of, 183, 354

Dedekind, R., 44
Definition, 5
inductive, 87
Fejer, L., 178

theorem, 304
Denumerable, 38
Finite set, 41
DeRham, G., 454
Functionals,
Derivative, 138
higher order, 138
of the logarithm, 139
of the trigonometric and inverse
alternating multilinear, 277, 439

linear, 262
multilinear, 276, 438
Functions, 23
trigonometric functions, 173,
analytic, 163
176
arc cos, arc sin, arc tan, 175, 176
Derived set, 242
bounded variation, 218
Determinants, 274, 277
Cantor, 126
expansion by cofactors, 282
characteristic, 361
of linear transformations, 287
class $C^k$, $C^\infty$, 328
Differential forms, 399, 444

algebra of, 437
closed, 419, 452
continuous, 73, 249

nowhere differentiable, 142
cosine, 174
differential of, 447
Darboux integrable, 184, 356
exact, 416, 452
differentiable, 138, 310
Functions (cont.)
equicontinuous, 297
Geometric-arithmetic means
inequality, 349, 351
exponential, 87, 142
Geometric series, 110
gradient of, 316
Gradient, 316
graph of, 305
Gram-Schmidt process, 283
harmonic, 429
Graph, 305
higher-order derivatives, 138
Green's theorem, 410, 414, 415
infinitely differentiable, 138

integrable, 184, 356, 371
integration of a sequence and
series of, 196
limits of, 69, 249
Hadamard, J., 166

Harmonic function, 429
Harmonic series, 100
linear, 257
Heine-Borel theorem, 77, 242
Lipschitz, 363
Heine, E., 47, 77
logarithm, 139
maximum and local maximum of,
81, 345
minimum and local minimum of,
81, 345
monotone, 84
theorem, 80
Hessian, 346
Homeomorphism, 250
Higher-order differences, 152
differentials, 325, 328
partial derivatives, 325
multivalued, 23
Holder's inequality, 352
one-to-one, 23
Homotopic curves, 419
open, 249
oscillation of, 366
periodic, 175
Implication, 3
permutations, 274
Implicit Function Theorem, 340
piecewise regular and smooth, 399
Improper integrals, 201
polynomial, 75
absolutely convergent, 202
product of, 72
Cauchy, principal value, 207
quotient of, 72
convergent, 202
Riemann-Darboux integrable,
divergent, 202
184, 371
of the first and second kind, 201
saltus, 224
Inclusion of sets, 17
sequences and series of, 126
Index, 482
differentiable, 156, 157

sine, 173
spaces of, 293
sum of, 72
support of, 462
tangent, 176
trigonometric, 172
uniformly continuous, 80, 250
variation of, 222
Fundamental theorem of the
calculus, 199
Induction, 27
Inductive
definition, 87
set, 57
Inequalities,
Cauchy-Bunjakovsky-Schwarz,
217, 236
geometric-arithmetic means, 349,
351
Holder, 352
Minkowski, 352
triangle, 36, 237
Inference, 6
Gauss's theorem, 410, 414, 415

Generalized Mean Value Theorem,
151
Infinite products, 133

absolutely and conditionally
convergent, 135
Infinite series, 98
Abel's test, 114
Intervals (cont.)
higher dimensional, 353
absolutely convergent, 101
Interior of a set, 247
comparison test, 110
Intersection of sets, 17
condensation test, 113
Inverse of a function and a
conditionally convergent, 103
relation, 22, 23
convergent, 99
Inverse Function Theorem, 335
decimal expansions, 118
Isomorphism, 32
of differentiable functions, 157
Iterated integration, 374
Dirichlet's test, 115

divergent, 99
of functions, 126
geometric, 110
Jacobian, 315
matrix, 315
grouping of, 103
Jordan content, 359, 361
harmonic, 100
Jordan domain, 462
of integrable functions, 196

integral test, 203
inner, outer, 361

regular, 471
Leibnitz' test, 115
Jordan measurable set, 361
limit comparison test, 117
Jump discontinuity, 85
p-series, 113
power, 165
Pringsheim's theorem, 109
Lachterman, S., 159
product of, 168
Lagrange,
Raabe's test, 112
identity, 217, 240
ratio test, 111
multipliers, 347, 349
rearrangement, 101
Riemann's theorem, 104
remainder formula, 160

Laplace operator, 414
root test, 110
Least upper bound, 63
sum of, 101
Lebesgue outer measure, 369
Infimum, 63
Infinite set, 41
Lebesgue's theorem on Riemann

integrable functions, 371
Inner product, 236
Leibnitz' differentiation formula, 149
Integers, 31, 60
Lemmas, 2
Integral mean value theorems, 195,
Length of a curve, 403
215
L'Hospital's rule, 154
Integral test for series, 203
Limit comparison test, 117
Integrals,
Limit,
Darboux-Stieltjes, 211
of a function, 69, 249
line, 400
generalized, 356
Riemann-Darboux, 184, 185, 353,
inferior and superior, 90
371
Riemann-Stieltjes, 211
Integration, 183, 353
of differential forms, 396
left and right, 85

of a sequence, 50
Line integrals, 396, 400
Linear,
iterated, 374
combination of vectors, 230
on manifolds, 461
functional, 262
by parts, 192
manifold, 230
Intervals, 69, 353

content or volume of, 353
decompositions of, 183, 354
space, 233
subspace, 230
Linear transformations, 257
Linear transformations (cont.)
eigenvalues and eigenvectors of,
270
matrix representation of, 258
nonsingular, 257
norm of, 261
orthogonal, 271
projections, 266
proper values and proper vectors
Natural numbers, 26, 57

Negation, 2, 8
Net, 357
Cauchy, 357
limit of, 357
Nondenumerable, 121
Nowhere differentiable continuous
function, 142
Null set, 21
of, 270
rank of, 257
symmetric, 269
transpose of, 263
Linearly dependent and independent vectors, 229
Lipschitz functions, 363
Logarithm, 139
natural, 142
Open,
ball, 242
covering, 77, 242
function or map, 249
set, 77, 242
Order preserving, 36
Ordered,
basis, 258
n-tuple, 228
Maak, W., 161

Maclaurin series, 165
Manifolds, 455
pair, 16
Oriented,
curves, 398
Ck structure for, 455
manifolds, 457
integration on, 461
surfaces, 432
orientable, oriented, orientation,

457
regular Ck, 455
topological, 455
trace of, 455
Map, 23, 249
Matrix, 258
adjoint, 285
Orientation,
of a curve, 405
induced, 473
of a manifold, 457
Orthogonal complement, 239
Orthogonal transformations, 271
Oscillation of a function, 366
Outward normal, 414, 434
Jacobian, 315
skew-symmetric, 292
symmetric, 269
transpose of, 264
Maximum, 45, 81, 345
local, 81, 150, 345
Mean Value Theorem, 150, 332
Generalized, 151
Merten's theorem, 168
Mikolas, M., 142
Minimum, 46, 81, 345
local, 81, 150, 345
Minkowski's inequality, 352
Partial derivatives, 310

higher order, 325
Partition of unity, 299, 462
subordinate to an open cover,
299, 462
Path, 305
Peano curves, 253
Perfect set, 123
Period, 175
Permutation, 274
Piecewise regular and smooth, 399
Mobius band, 457
Poincaré's lemma, 454
Monotone functions, 85
Power series, 165
Multilinear functionals, 276, 438

alternating, 277, 439
Abel's theorem, 170

Cauchy-Hadamard's theorem, 166
Power series (cont.)
interval of convergence, 166
Sequence, 45, 50, 58

bounded, 62
product of, 167
Cauchy, 50, 58
radius of convergence, 166
convergent, 50, 58
Tauberian theorems, 171
of differentiable functions, 156
Predicate calculus, 7
of functions, 126
Pringsheim's theorem, 109
limit of, 50, 58
Products, 131
monotone, 62
infinite, 133
subsequence, 53, 58
Projections, 266
Series (see Infinite series)
Proof, 6
Sets, 16
by contradiction, 14
arcwise connected, 251
Proper value and proper vector, 270
boundary of, 247
Propositional calculus, 7
bounded, 62, 242
Propositions, 2
Cantor, 122
Pythagorean theorem, 44
closed, closure of, 77, 242

compact, 77, 242, 294
connected, 244
Raabe's test, 112

Range of a function and a relation,
22, 23
convex, 245
countable, 60
covering for, 77, 242
Rank of a linear transformation, 257
dense, 60
Ratio test, 111
denumerable, 38, 60
Rationals, 35, 60
derived, 242
Reals, 44, 56
directed, 356
Rearrangement of infinite series,
finite, 41, 60
101, 104
greatest lower bound of, 63
Recursive definition, 132
infimum of, 63
Relation, 22
interior of, 247
equivalence, 24
least upper bound of, 63
reflexive, 24
open, 77
symmetric, 24
perfect, 123
transitive, 24
relatively open and closed, 244
Relatively open and closed sets, 244
simply connected, 420
Riemann-Darboux integrals, 183,
star-shaped, 420, 452
353
existence of, 197, 371
properties of, 190, 371
Riemann-Stieltjes integrals, 210
existence of, 219
properties of, 212
supremum of, 63
totally bounded, 294
Simply connected sets, 420
Sine, 173
addition formula, 173
Spherical coordinates, 268
Roche remainder formula, 160
Star-shaped sets, 420, 452
Rolle's theorem, 150
Statement, 3
Root test, 110
Stieltjes-Riemann integral, 210

Stokes' theorem, 410, 470, 475
Stone-Weierstrass theorem, 300
Saddle point, 345

Saltus function, 224
Structure, 455
oriented, 457
Schlomilch remainder formula, 160
Subsequence, 53, 58
Semicontinuity, lower and upper, 97
Subset, 18
Support of a function, 462
Trace (cont.)
Surface, 306
of a manifold, 455
area, 432
of a surface, 306, 432
integrals, 435
Transformation theorem for
oriented, 432
integrals, 390
Symbols, 479
Truth table, 2, 3
Symmetric,
matrix, 270
transformation, 269
Uncountable, 121
Uniform continuity, 80, 250
Uniform convergence, 126
Tangent,
Union of sets, 17
function, 176
Universal quantifier, 7
plane, 309
vector, 306
Tauberian theorems, 171
van der Waerden, B. L., 142
Taylor's,
Variable, 7
expansion, 163
Variation of a function, 222
series, 165
Vectors,
theorem, 330
components of, 229
Taylor's remainder formulas, 159
linearly dependent and
Cauchy form, 160
independent, 229
integral form, 192
orthogonal and orthonormal, 238
Lagrange form, 160
Vector space, 228, 233
Roche form, 160

Schlomilch form, 160
basis for, 230

finite dimensional, 233
Tensor algebra, 439
linear subspace of, 230
alternating, 440
Venn diagram, 17
Ternary expansions, 121
Volterra, V., 454
Theorems, 2
Volume of an interval, 353
Triangle inequality, 36, 237

Trichotomy, 27, 48, 57
Trigonometric functions, 172
inverses of, 175
Topological map, 250
Totally bounded set, 294
Trace,
of a curve or path, 305
Weierstrass, K., 47
approximation theorem, 129
M Test, 129
Well ordering, 29
Weyl, H., 1
