Mathematics A Chronicle of Human Endeavor

Mathematics/A
CHRONICLE
OF HUMAN
ENDEAVOR
Herbert I. Gross
Massachusetts Institute of Technology
Frank L. Miller
Orange Coast Community College
Holt, Rinehart and Winston, Inc.

New York Chicago San Francisco Atlanta
Dallas Montreal Toronto London Sydney
Copyright © 1971 by Holt, Rinehart and Winston, Inc.
All rights reserved
Library of Congress Catalog Card Number: 73-142326
SBN: 03-085406-7
Printed in the United States of America
9 8 7 6 5 4 3 2 1 090 4 3 2 1
PREFACE
The story is told that once when Calvin Coolidge returned home from
church, his wife asked him the title of the sermon. “Sin,” replied Mr.
Coolidge. “And what did the clergyman have to say about it?” asked
Mrs. Coolidge. “He was against it,” came the terse reply.
The above story, usually told as an anecdote to illustrate conciseness,
also serves to convey the message, in a certain sense, that most of the great
ideas have already been given. All that remains to be done is to style the
way in which they may be presented. In this vein, the present textbook
makes no claims of presenting new mathematics or of inventing new
theories. Rather, it is the hope of the authors to present mathematics with
sufficient clarity so that even the beginner can learn to understand the true
nature of mathematics, its role in the development of man and his society,
and its practical and esthetic aspects. No knowledge of mathematics
beyond that which is generally known by the junior high school student is
presupposed by the authors.
Of course, any such attempt to teach mathematics utilizes a highly
subjective approach on the part of the teacher; in this respect, the teaching
experience of the authors will greatly affect the style of the text.
The text was assembled by Herbert I. Gross while teaching a terminal
mathematics course at Corning Community College in Corning, New York,
and conducting numerous in-service seminars for teachers involved in
precollege mathematics at all levels of instruction. This material was revised
and specially edited by Frank L. Miller while teaching at Orange Coast
Community College, Costa Mesa, California.
The text takes into account the following four points: (1) The average
nonmathematician is blissfully unaware of the beauty of mathematics apart
from its obvious value as a computational tool. (2) Many people are adept
at performing the basic operations of arithmetic, but they seldom
understand how and why various “recipes” work. In short, people have
memorized certain mathematical principles but have not grown intellectually
in the process. (This in part explains why parents berate the new
mathematics. Previously, they could hand down the same recipes from
generation to generation. Now, since children are learning explanations
different from those memorized by their parents, the lack of genuine
understanding becomes more pronounced. In effect, the average parent who cannot
help his child with the “new” mathematics probably could not help him with the
“old” either—only he did not realize it!) (3) “Logic” is a frightening
word to many people. Some do not understand it at all, others
misinterpret it, and still others understand it in a mechanical way without any
significant feeling as to its great value. (4) Very often mathematics is
learned as a collection of “tricks” without any awareness of the basic
unifying threads that characterize the subject. Too often students tend
to learn many concepts each with one application rather than learning a
few concepts with many applications.
It turns out that these four points seem to permeate the core of the “new”
mathematics, and with so much lip service devoted to it, perhaps it would
be wise to say just a few words here about what we think the “new”
mathematics really is.
As simply as possible, let us observe that one usage of the word “new” is
as the opposite of “passé” (rather than of “old”). It is in this sense that
the “new” mathematics has its greatest moment. That is, we are not so
much interested in the “age” of a topic as we are interested in its
meaningfulness. The quest for meaningfulness is even more important
when we realize how rapidly the field of mathematics has expanded in the
past few years. Indeed, it has been said that mathematics is like the
enthusiastic young man who mounted his horse and rode off furiously in all
directions. With so much more material and no more time in which to
learn it, the curriculum must be streamlined. This involves emphasizing
major concepts. It was with this in mind that we decided to focus on
four general aspects of mathematics which we felt underlie virtually every
topic in mathematics. The text is comprised of these topics.
Chapter 1 is devoted to a development of our number system.
Beginning at the “Dawn of Consciousness” and the earliest tally systems,
we gradually explore the evolution of our present system of enumeration
and place value arithmetic. In this context, we introduce the idea of
different number bases so that we may better emphasize the concept of place
value and at the same time deemphasize the undue importance placed on
the concept of ten. (It is our contention that the development of
mathematics did not require the biological “coincidence” that we were born with
ten fingers.)
Once the whole numbers are adequately described, Chapter 1 continues
with the development of the rational numbers, irrational numbers, signed
numbers, and complex numbers. At all times, the major emphasis is on
the great ideas; and only enough computational devices necessary for the
appreciation of these ideas are introduced.
Chapter 2 discusses the concept of sets and tries to explain the
justification for placing such emphasis on this concept in the “modern”
curriculum. In particular, the beginning part of the chapter introduces the
necessary terminology in a slow and, hopefully, self-motivated manner.
Once important vocabulary is established, the remainder of the chapter
devotes itself to relating sets to particular areas of mathematical
investigations.
Chapter 3 strives to develop that aspect of mathematics most applicable
to all students, regardless of their field of interest—namely, the idea of
logical thought. It is readily conceivable that one can get through life
without being able to solve a “John-and-Bill marble problem”; but all of us,
at one time or another, are called upon to make decisions. In all fairness,
we cannot demand that our decisions be correct, but we can demand that
our decisions be as compatible as possible with the available knowledge;
and it is in this sense that logic is so important.
In order to make logic more palatable, we begin with the case against
intuition so that you will be convinced that, like it or not, there is a need
for logic. We then introduce the concept of a “game” and use this as a
springboard for explaining the idea of logic and the meaning of a
mathematical system.
At this point, we introduce the concept of Boolean algebra as an example
of a mathematical “game.” There were many other systems we could have
chosen to explore, but our choice of Boolean algebra was motivated by the
fact that this system has many practical applications; and also that it is a
fairly easy system to visualize. Moreover, the choice of Boolean algebra
provides us with an excellent vehicle for reviewing and furthering our
understanding of sets.
Chapter 4 introduces a discussion of what mathematics is all about—at
least from a computational point of view. In particular, we discuss the
idea that mathematics is the study of relationships, and to this end we
introduce the concept of functions and graphs. So much for a glimpse at
the contents!
In previous years, our preface would end here, but with the modern
emphasis on the relevancy of education, we feel that still a few more words
are in order. To clarify further, it is fair to say that the best proof that
brotherhood has failed is that we still have Brotherhood Week; and the best
proof that at least part of the educational process has failed is that students
ask that education be made relevant. In a Utopian system everything
would be relevant, or at least seem so.
Perhaps the cry for relevancy is a disguised cry for a generalization of the
“new” mathematics. What is sought is a meaningful education, and to be
meaningful we do not have to choose between human values and pragmatic
course content. Rather these two facets of human endeavor exist side by
side, and each is enhanced by the presence of the other.
As our environment changes, new problems replace old ones, but the
underlying causes are the same. In a manner of speaking, we spend our
time on urgent issues while we neglect the important ones. Thus, since
time has a way of changing what is urgent (and what is not), a good
education must focus on important truths, truths that not only characterize a
particular field of study, but hopefully, other fields as well.
While we lament that within the framework of the present text it was
not within our abilities to focus more attention on these points than we
did, the fact is that we have elected to choose three properties which we feel
separate man from the rest of the animal kingdom.
(1) Man seeks the simplest solution to any problem that plagues him, and
it is only when this solution becomes either too cumbersome or
outmoded that he seeks another solution. It is this assumption that
underlies our approach to the development of the number system.
Man started with nothing more advanced than a tally system and
gradually developed better systems of enumeration as he needed them.
Yet, it should not be too difficult to see that any subject in the
curriculum uses this property, no matter what the field of investigation.
(2) Man has the ability to think logically. He alone, in the animal
kingdom, can predict and plan the future based solely on his knowledge of
the present and the past. Thus, our desire to help teach logical thought
in this text is motivated by this property. Again, it must be
understood that this logic is the same whether we are in the world of the
physical scientist or the world of the social scientist. The job of
deducing inescapable conclusions from given information is part of every
field of study.
(3) Despite his drive for material gain and his tendency to place pragmatic
pursuits above idealistic values, man is still basically a humanitarian
and he is capable of appreciating art for the sake of art. In this
context, there are many facets of any subject that transcend practical
applications. One aim of the text is to show that there are aspects of
mathematics that have an esthetic value which is important apart
from any practical value. Indeed, as in any field, there are
mathematicians who study mathematics for the same reason that the
humanities major studies poetry—because it is beautiful.
In addition to any mathematical content that we hope can be learned
from this text, it is our hope that we will be able to emphasize the three
properties we have listed, in their own right. In this way the student will
see mathematics as a part of “the chronicle of human endeavor” rather
than as an important, but isolated, part of the overall curriculum.
Cambridge, Massachusetts H.I.G.
Costa Mesa, California F.L.M.
February 1971
CONTENTS
Preface v
chapter one / THE DEVELOPMENT OF OUR NUMBER SYSTEM

1.1 The Concept of Number versus Numeral 1
1.2 The Development of Place Value 4
1.3 Different Number Bases 22
A Note on Number versus Numeral 36
1.4 The Rational Numbers 43
Note 1— More on Geometric Arithmetic 53
Note 2— A Brief Glance at Number Theory 59
1.5 Decimal Fractions: Place Value Revisited 67
1.6 The Irrational Numbers 82
Note 1 87
Note 2 88
Note 3 89
1.7 Signed Numbers 91
1.8 Signed Numbers and the Number Line 97
1.9 The Complex Numbers 103
The Complex Numbers as an Extension of the Number Line
(An Introduction to Vectors) 113
chapter two / AN INTRODUCTION TO THE THEORY OF SETS
PART I THE MOOD SETTER 123

2.1 Introduction 123
2.2 A Set Is a Bunch of Dogs (An Introduction to Well-Defined
Sets) 124
A Note on the Excluded Middle 125
2.3 The Nuts ’N Bolts 127

2.4 Two Special Sets 131
2.5 Two Major Methods for Describing Sets 134
The Number of Subsets of a Given Set 138
Applications to Elementary Algebra 141
An Introduction to Matrices in Terms of Methods for
Describing Sets 150
PART II THE ARITHMETIC OF SETS 166

2.6 Unions, Intersections, and Complements 166
A Note on the Representation of Sets 174
The Chart Method for Viewing Sets 174
Circle Diagrams 181
2.7 The Contrast between Addition and Union 187
A Note on Unions, Intersections, and Complements
2.8 Simultaneous Equations 199
2.9 Intersections of Curves 201
2.10 Sets as a “Common Denominator” 205
PART III CARTESIAN PRODUCTS 206

2.12 A Geometric Motivation 206
2.13 Cartesian Products of Arbitrary Sets 213
2.14 Permutations and Other Computations 216
2.15 Combinations 225
2.16 The Binomial Theorem 236
2.17 A Glimpse at Probability 248
2.18 Equivalence Relations 253
chapter three / THE “GAME” OF MATHEMATICS: AN

INTRODUCTION TO ABSTRACT SYSTEMS
(WITH SPECIAL EMPHASIS ON BOOLEAN
ALGEBRA) 258
3.2 The Case Against Intuition 261
3.3 The Concept of a Game 268
What Is a Game? 268
3.4 Rote versus Reason 271
3.5 Can All Words Be Defined? 274
3.6 Are Rules Logical? 276
3.7 Can There Be Proof without Assumptions? 279
3.8 The General Game of Mathematics 280
3.9 Boolean Algebra 285

Introduction 285
The Set of Subsets of a Given Set 286
Some Fundamental Theorems 291
Boolean Algebra Defined 304
A Note on Proofs 307
Truth-Table Logic 308
A Note on Tautologies 319
Switches 322
chapter four / AN INTRODUCTION TO FUNCTIONS AND

GRAPHS 330
4.2 Functions and Sets 332
4.3 Inverse Functions 338
4.4 Counting Revisited 342
A Note on Countable Sets 349
4.5 An Introduction to Functions of a Real Variable 351
4.6 A Picture Is Worth a Thousand Words 356
Index 363
chapter one / THE DEVELOPMENT
OF OUR NUMBER SYSTEM
1.1 THE CONCEPT OF NUMBER VERSUS NUMERAL

In any language there is a considerable difference between a concept and
the word used to denote that concept. To paraphrase this idea on a lighter
level, consider the following Lewis Carroll-type syllogism:
(1) Cat is a three-letter word.
(2) All cats chase mice.
(3) Therefore, some three-letter words chase mice.
The ridiculousness of the above conclusion vanishes as soon as we realize
that statements (1) and (2) do not have the same subjects. It is the word
cat which is being described in (1), and the animal cat which is being
described in (2). In other words, had we employed proper grammatical
usage, (1) would have been written as
“Cat” is a three-letter word.
Let us observe a few more important asides concerning the above
example. For one thing, it illustrates a great deal about man’s ability to think
abstractly—certainly there is nothing about the conglomeration of letters
c-a-t that even resembles “a member of the feline family that was
introduced as a house pet in Egypt around 5000 B.C.” Indeed, it would be a
naive form of circular reasoning to say that we call it a cat because it looks
just like one.
Secondly, notice the power of the symbol “cat.” Once we get used to
the alphabet and learn to read, there comes a time early in our intellectual
development where, in fact, we never even notice the letters c-a-t but
rather we actually visualize the animal itself when we look at “cat.” In
short, not only can man think abstractly, but he can do it in such a natural
way that he often does not realize he is doing it.
Finally, let us observe that while languages have come and gone and while
changes have been made even within the same language, the animal cat
is the same as it was 5000 years ago. The point is that it is usually
language, not basic concepts, which changes.
With the above remarks in mind, we now turn our attention to a very
specific language—that of mathematics; and we shall observe that this
language inherits the problems of all other languages—no more or no less.
For example, it is reasonable to assume that virtually from the so-called
“dawn of consciousness” man had the ability to grasp and to utilize the
concept of “how-many-ness.” While he may not have used symbols such
as 2, he was aware of the idea of two. He knew that he had two arms and
one head; or at least he knew that he had more arms than heads.
Now just as it is quite likely that the first word that named the animal
cat was a picture of a cat (sign language, or even hieroglyphics), it is also
likely that the first symbol that denoted two (arms) was a picture of two
arms. Perhaps, after a while, ancient man recognized that two arms and
two apples had the concept of “two-ness” in common and so he might use
two tally marks to denote two of anything. (As we shall indicate by
examples in the next section, it was not a trivial matter for man to discover that
the idea of two-ness was the same whether he was talking about apples or
people.)
The point we are leading up to is that the symbols used to denote
numbers are called numerals. Just as there exist many synonyms to name the
same concept, many numerals may be used to name the same number.
For example, the number two is named by such numerals as 2, 5 − 3,
2 × 1, and II.
We shall not enter into a philosophical discussion here as to how one can
rigorously define what is meant by a number. Rather we shall assume
that each of us senses what a number is; and we shall use the convention
of spelling the word when we mean to refer to the number (such as when we
say the number two) and we shall call all other symbols for denoting
numbers (such as 2, II, 1 + 1) numerals.
In still other words, when we write 3 + 2 = 5, we certainly do not mean
that the symbols 3 + 2 and 5 look alike. What we do mean is that 3 + 2
and 5 are two different numerals which name the same number—namely,
five. This is perhaps the major reason why in the “new” mathematics
3 + 2 = 5 is read as 3 + 2 is 5; that is, 3 + 2 is a synonym for 5. (The
student might have a tendency to think of 3 + 2 as being two numerals.
Do not confuse numerals with digits. In other words, 3 + 2 indicates that
we are forming the sum of the two digits 2 and 3; but, nevertheless, 3 + 2
is a numeral which names five.)
Perhaps the following riddle will supply some pleasant diversion and also
shed light on our discussion.
A man goes into a store and buys some objects which are equally priced. The
clerk tells him the following.
4 will cost you 10 cents.

What was the man buying?
The answer to the question: he was buying house numerals at 10 cents
per digit. Thus, 23 and 76 at 10 cents per digit each cost 20 cents. Notice
how different, in terms of the above discussion, the problem would have
been had it said “Seventy-six and twenty-three each cost 20 cents.” In
still other words, while 23 and 76 name different numbers, they are both
two-digit numerals. We shall have reason to refer to this problem again
in the next section.
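The answer to the riddle can be restated as a small Python sketch. The function name `house_numeral_cost` and the constant below are hypothetical, chosen only for illustration; the point is that the price depends on the numeral as written (how many digits it has) and not on the number it names.

```python
# Hypothetical illustration: price a house numeral by its digits, not by its value.
PRICE_PER_DIGIT_CENTS = 10  # the rate given in the riddle's answer


def house_numeral_cost(numeral: str) -> int:
    """Cost in cents of a numeral as written: one charge per digit."""
    return PRICE_PER_DIGIT_CENTS * len(numeral)


for numeral in ["4", "23", "76"]:
    print(numeral, "costs", house_numeral_cost(numeral), "cents")
```

Here 23 and 76 cost the same because both are two-digit numerals, even though they name different numbers.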
If the above discussion seems to be overly elementary, the authors are the
first to admit that, at least in some respects, it is. However, the following
precautions are in order.
(1) Do not confuse elementary with simple. When Holmes says
“Elementary, my dear Watson,” the result is anything but simple or
self-evident to poor Watson.
(2) While we are talking about whole numbers, we are also talking about
the ground floor of our entire number system. Thus to understand
number-versus-numeral at the level of the whole numbers is to have
the foundation for extending the concept to all numbers.
(3) The concept of number-versus-numeral not only plays an important
and direct role in our number system; but in various disguises it plays an
important role in virtually every phase of mathematics. This we shall
see as the course begins to unfold. The important point is that the
same things which occur in elementary discussions occur on the more
complex levels.
(4) Just as we see the animal cat when we see the word “cat,” we also
tend to see numbers when we look at numerals. This is precisely
what happened in our house-numeral riddle. To overcome this rather
natural tendency, there will be times in which you will need great will
power if you are to try to capture the “spirit” of the discussion. In
short, because the present discussion centers around a topic that we
are very used to, there is a danger that the student may feel we are
making mountains out of molehills. Of course, to avoid this we could
have chosen a more complex illustration, but if difficulty arises, then
we have no way of knowing whether it was the concept or the
illustration which caused the trouble. In other words, since the whole num-
the distinction could be made by the use of such concepts as a one-to-one
correspondence, whereby we match members of the first collection with
members of the second, but such endeavors are at best cumbersome.
That tally marks were not conducive for representing large numbers can
be seen from our vocabulary. For example, we refer to a hundred
cattle as a herd, but two hundred cattle is not called two herds. It, too,
is called a herd. In short, while man could keep track of a small number
of cattle by use of tally numerals, after a while all he saw was “many”
or “several.” Thus, “herd” indicates that the concept of “more than a
few” perplexed ancient man. In the same context, notice also how difficult
it was for him to view “number” abstractly. The fact that he saw
“number” more as an adjective than as a noun is easily seen from the fact that
he did not refer to one hundred sheep as a herd. Sheep were different
from cattle, so a large collection of sheep had to be given a different name
than that for a large number of cattle. Thus, many sheep were referred
to as a “flock.”
How, then, was man going to keep all of the benefits of the tally system,
but yet refine it to eliminate some of its shortcomings? As is usually the
case, the innovations were gradual—enough innovations to get the job
done, but not so many as to be violently revolutionary. Perhaps the first
innovation was that of using geometric patterns to represent the array of
tallies. Even today we have remnants of this idea in our everyday
mathematical vocabulary. Thus, we refer to 3 × 3 as the “square” of three; yet
“square” seems to be much more suggestive of geometry than of arithmetic.
The point is that the array of nine dots (or tallies) could be arranged in the
form of a square which had three rows and three columns. Thus,
• • •
• • •
• • •
Moreover the first forms of number theory were of a geometric nature
rather than of an algebraic (or analytic) nature. For example, by 600 B.C.
the ancient Greek was already aware of the fact that the sum of consecutive
odd numbers, starting with 1, always represented a perfect square. That is,
1 = 1 × 1
1 + 3 = 4 = 2 × 2
1 + 3 + 5 = 9 = 3 × 3
1 + 3 + 5 + 7 = 16 = 4 × 4.
Such a result was easy (but ingenious) for him to see in terms of a picture.
Namely, every square could be subdivided into L-shaped regions, with
the numbers of dots in each L being consecutive odd numbers. (See
Figure 1.1.)
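The odd-number pattern is easy to check mechanically. The following Python sketch is only an illustration (the function names are hypothetical): one function adds the first n odd numbers, and the other counts the dots in each L-shaped layer of an n-by-n square by subtracting the inner square from the next larger one.

```python
def sum_of_first_odds(n: int) -> int:
    """Add the first n odd numbers: 1 + 3 + 5 + ... + (2n - 1)."""
    return sum(2 * k - 1 for k in range(1, n + 1))


def l_shaped_layers(n: int) -> list:
    """Dots in each L-shaped layer of an n-by-n square of dots.

    The k-th layer is what must be added to a (k-1)-by-(k-1) square
    to obtain a k-by-k square.
    """
    return [k * k - (k - 1) * (k - 1) for k in range(1, n + 1)]


for n in range(1, 5):
    print(n, sum_of_first_odds(n), n * n)

print(l_shaped_layers(4))  # [1, 3, 5, 7]
```

Each layer contains the next odd number of dots, which is the geometric reason the running sums are perfect squares.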
The idea of geometric arrays is still used today, for example, in playing
cards and dice. Without belaboring the point, notice that while we might
have to look twice to distinguish between ||||| and ||||||, we have
no such trouble in distinguishing between ⚄ and ⚅. In fact, after
we have played with playing cards or dice for just a little while, we do not
even bother to count the dots; rather, we just recognize the picture.
Although such geometric innovations helped a little, they were not very
good in helping people recognize the difference between, say, nine hundred
seventeen and nine hundred eighteen. Still other innovations were needed.
The next improvement came from the recognition that one did not need
a symbol that looked like the number it was to represent. This is precisely
what the Roman recognized when he let X represent ten. Notice that X
in no way at all suggests ten. In fact, X could have been used to denote
any number. Quite likely, it was more than coincidence that X was
chosen to denote ten and that we humans, on the average, have ten
fingers. Indeed, had we been born with twelve fingers, would there have
been anything special about the number ten? As another nice example of
number-versus-numeral, the Roman used V to denote five. Notice that
just as five is half of ten, V is half of the letter X. To put this idea in
more modern dress, it would be as if we first invented 8 to denote eight and
then invented 3 to denote four since 3 is half of 8 (that is, the symbol 3
looks like half of the symbol 8). In other
words, to say 3 is half of 8 may be used as a true numeral fact, but three is
half of eight is a false number fact. Also in this respect, it should be clear
that if our ancestors had invented 4 to denote three and 3 to denote four,
we would have memorized these facts just as we did the ones we were given
to memorize (in which case “3 is half of 8” would have been true, both as a
numeral statement and a number statement). We will not pursue the
derivation of the numerals any further. While such a pursuit is interesting
in its own right, it tends to obscure the very important point that man had
hit upon the abstract idea that a numeral did not have to look like the
number it represented.
Figure 1.1
Now to mimic our own decimal system as much as possible, let us ignore
the fact that the Roman invented symbols for five, fifty, and five hundred;
we shall use only those symbols that denote powers of ten. Thus, we have
X to denote ten, C to denote one hundred, and M to denote one thousand.
Let us also ignore the subtractive property that IX names nine while XI
names eleven. Historically, the original Roman numerals did not have
the subtractive property. Four was denoted exclusively by IIII, never
by IV. Both XI and IX meant eleven, and nine was denoted by VIIII
(although we are pretending that nine would have been written as IIIIIIIII
simply to conform with the decimal-place value system of today).
To continue, in this way the Roman could represent any number from
one to nine thousand nine hundred ninety-nine by using the symbols I,
X, C, and M; and no symbol would have to appear more than nine times.
(Obviously, if we could use a symbol as often as we liked, we would only
need the symbol I, used enough times to represent any number—which is
precisely the tally system.) Thus, the Roman would represent a number
such as twenty-three using but two different symbols and a total of five
characters; that is, XXIII. To be sure, this is not as compact as our
place-value system wherein all we need are the two characters, 23; but it is a
considerable improvement over the original tally system in which
twenty-three characters (tallies) would have had to have been used. In passing, let
us observe, in this regard, that whenever we evaluate any ancient system,
it is only fair that we compare it with what it replaced—not what it was
replaced by; that is, each innovation is brought about to improve the old,
and this is how it should be evaluated.
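The simplified system just described (only I, X, C, and M, with no subtractive rule) can be sketched in a few lines of Python. The names `to_additive_roman` and `to_tally` are hypothetical, introduced only to compare the number of characters each system needs.

```python
# Simplified Roman numerals as used in this section: powers of ten only,
# purely additive, each symbol repeated at most nine times.
SYMBOLS = [(1000, "M"), (100, "C"), (10, "X"), (1, "I")]


def to_additive_roman(n: int) -> str:
    """Write n (1 to 9999) using only M, C, X, and I, without subtraction."""
    if not 1 <= n <= 9999:
        raise ValueError("this sketch covers one through 9999 only")
    parts = []
    for value, symbol in SYMBOLS:
        count, n = divmod(n, value)  # how many of this denomination
        parts.append(symbol * count)
    return "".join(parts)


def to_tally(n: int) -> str:
    """The tally system: one mark per unit."""
    return "I" * n


print(to_additive_roman(23))       # XXIII
print(len(to_additive_roman(23)))  # 5
print(len(to_tally(23)))           # 23
```

Five characters against twenty-three is the improvement the text refers to; the two-character numeral 23 comes only with place value.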
Surprising as it may seem, the Roman-numeral system was adequate for
performing the four basic operations of arithmetic. For example, to
perform the problem which we indicate as 23 + 34, the Roman would merely
have to write XXIII XXXIIII. There would have been no need to
combine like terms since, for example, an X was an X no matter where it was
placed (in this respect the subtractive principle introduced later into
Roman numerals brought more inconvenience than it brought convenience).
To perform subtraction, the Roman actually took away what he was
supposed to. Thus, to perform the operation which we would write as
45 − 31, he would have written XXXXIIIII and then taken away three
X’s and one I. He would have denoted this operation quite concisely
merely by writing XXXXIIIII with three X’s and one I crossed out. In no
event would he have had to invoke
the notation XXXXIIIII − XXXI = XIIII.
In a problem such as 41 − 19, he saw that he could not have taken away
nine from one, so he used his numeral system to recognize that X and
IIIIIIIIII were synonyms. Thus he would have written XXXIIIIIIIIIII
instead of XXXXI; and in this form he could have taken away the desired
amount quite conveniently. This is precisely the idea behind what we
call regrouping in the modern elementary-school curriculum. We teach the
youngster to see 41 as 4(10) + 1 = 3(10) + 1(10) + 1 = 30 + 11; and
to see 19 as 10 + 9. Then we take 10 from 30 and 9 from 11. This is,
in a more open and logical form, what we adults knew as borrowing.
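Addition by pooling symbols and subtraction by taking symbols away (regrouping when necessary) can be sketched as follows. This is a hypothetical Python illustration, not a historical procedure: the function names are assumptions, and sorting the pooled symbols by denomination is a convenience for reading the result, not something the system requires.

```python
from collections import Counter

ORDER = "MCXI"                                # highest denomination first
NEXT_HIGHER = {"I": "X", "X": "C", "C": "M"}  # M has nothing above it here


def add(a: str, b: str) -> str:
    """Pool the symbols of both numerals; no exchanging is required."""
    pooled = Counter(a) + Counter(b)
    return "".join(symbol * pooled[symbol] for symbol in ORDER)


def regroup(have: Counter, symbol: str) -> None:
    """Trade one symbol of the next higher denomination for ten of `symbol`."""
    higher = NEXT_HIGHER.get(symbol)
    if higher is None:
        raise ValueError("cannot take away more than is there")
    if have[higher] == 0:
        regroup(have, higher)  # first obtain a higher symbol to trade
    have[higher] -= 1
    have[symbol] += 10


def subtract(a: str, b: str) -> str:
    """Take the symbols of b away from a, regrouping only when needed."""
    have = Counter(a)
    for symbol in reversed(ORDER):  # I first, then X, C, M
        need = b.count(symbol)
        while have[symbol] < need:
            regroup(have, symbol)
        have[symbol] -= need
    return "".join(symbol * have[symbol] for symbol in ORDER)


print(add("XXIII", "XXXIIII"))            # 23 + 34 -> XXXXXIIIIIII
print(subtract("XXXXIIIII", "XXXI"))      # 45 - 31 -> XIIII
print(subtract("XXXXI", "X" + "I" * 9))   # 41 - 19 -> XXII
```

In the last call, one X is traded for ten I's before nine I's are taken away, which is the regrouping step described above.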
The Roman could even have multiplied rather nicely in his system had he
thought of the idea. Namely, let us observe that multiplying by ten
merely involves trading a symbol for the next higher one. For example,
I ten times would be X, X ten times would be C, and C ten times would be
M. Suppose the Roman had wanted to solve the problem which we write
as 33 × 32. Of course, he could have written XXXIII thirty-two times
and that would have been his answer. However, an easier way would have
been to recognize that XXXIII ten times would be CCCXXX, since, as
we have just mentioned, multiplying by ten merely involves replacing each
symbol by the symbol which names the next higher denomination. In this
way, he could have written
CCCXXX (ten thirty-three’s)
CCCXXX (ten more thirty-three’s)
CCCXXX (ten more thirty-three’s)
XXXIII (one more thirty-three)
XXXIII (one more thirty-three).
The total accumulation of these symbols would denote the answer. Again,
observe that we do not have to line up like denominations; nor do we have
to exchange ten of one denomination for one of the next higher
denomination; although in the above problem, we could have exchanged ten of our
X’s for a C, and ten of our C’s for an M. Our answer would have been
what was left—namely, MXXXXXIIIIII. That is, the ten X’s become a C;
this C plus the other nine C’s become an M; and MXXXXXIIIIII remains.
In this way multiplication appears as “rapid addition.” In a similar
way, division can be performed as “rapid subtraction,” but we shall not
take the space to illustrate this now.
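The multiplication just described can be sketched in Python under the same simplifying assumptions (symbols I, X, C, and M only, no subtractive rule). All function names here are hypothetical. Multiplying by ten is a symbol-for-symbol replacement, the product is the pooled rows, and the final exchange of ten of one symbol for one of the next is optional tidying.

```python
from collections import Counter

ORDER = "MCXI"                              # highest denomination first
TIMES_TEN = {"I": "X", "X": "C", "C": "M"}  # each symbol's next higher one
SHIFTS = {"I": 0, "X": 1, "C": 2, "M": 3}   # how many times to multiply by ten


def times_ten(numeral: str) -> str:
    """Replace every symbol by the next higher denomination."""
    if "M" in numeral:
        raise ValueError("this sketch has no symbol beyond M")
    return "".join(TIMES_TEN[symbol] for symbol in numeral)


def rows_for_product(a: str, b: str) -> list:
    """One row per symbol of b: a itself for each I, ten a's for each X, ..."""
    rows = []
    for symbol in b:
        row = a
        for _ in range(SHIFTS[symbol]):
            row = times_ten(row)
        rows.append(row)
    return rows


def exchange(pooled: Counter) -> str:
    """Trade every ten of a symbol for one of the next higher symbol."""
    for lower, higher in (("I", "X"), ("X", "C"), ("C", "M")):
        carried, pooled[lower] = divmod(pooled[lower], 10)
        pooled[higher] += carried
    return "".join(symbol * pooled[symbol] for symbol in ORDER)


def multiply(a: str, b: str) -> str:
    """Multiplication as rapid addition: pool all rows, then tidy up."""
    return exchange(Counter("".join(rows_for_product(a, b))))


print(rows_for_product("XXXIII", "XXXII"))
print(multiply("XXXIII", "XXXII"))  # 33 x 32 -> MXXXXXIIIIII
```

The five rows printed first are the three CCCXXX's and two XXXIII's listed in the text.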
Let us instead continue tracing the development of place-value.
Certainly, a system such as the Roman numerals was a great improvement
over the original tally systems; but it had a particularly great drawback.
Namely, every time we want to introduce a new power of ten we must
invent a new symbol. Thus, to express a number which we would write,
say, as 1,834,231,345,984, the Roman would have needed a total of thirteen
different symbols (of which I, X, C, and M are but four). As awkward as
it would have been, notice what an improvement it still would be over the
tally method. For instance, had the Roman invented the required thirteen
symbols, the number above could have been represented by a total of
1 + 8 + 3 + 4 + 2 + 3 + 1 + 3 + 4 + 5 + 9 + 8 + 4 characters; yet
several lifetimes would not be enough to represent the above number in a
tally system. If we could write one tally mark per second and write
without interruption until the task were done, it would take in excess of thirty
years to write one billion tally marks! To verify this claim, simply divide
1,000,000,000 by 60 × 60 × 24 × 365.
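Both figures in this paragraph can be verified with a few lines of Python. The variable names are illustrative only; the arithmetic uses a 365-day year, as the text does.

```python
# Seconds in a 365-day year, as in the text's suggested division.
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

# Years needed to write one billion tally marks at one mark per second.
years_for_billion_tallies = 1_000_000_000 / SECONDS_PER_YEAR
print(round(years_for_billion_tallies, 1))  # 31.7

# Characters needed for 1,834,231,345,984 with one symbol per power of ten:
# each digit says how many times its symbol is repeated.
digits = "1834231345984"
distinct_symbols = len(digits)
characters_needed = sum(int(d) for d in digits)
print(distinct_symbols, characters_needed)  # 13 55
```

So the thirteen-symbol numeral would take 55 characters, while the tally numeral for a number less than a thousandth as large would already take over thirty years to write.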
We do not wish to stray too far from the prime objective of developing
our number system. Suffice it to say, then, while Roman numerals were
a big innovation, a better system was needed to avoid having to invent
a new symbol for each power of ten. The solution to this problem came
with the abacus. In its simplest form, the abacus may be viewed as being
a series of lines where, in order, the lines denote consecutive powers of ten.
A pebble (bead) could be placed on a given line. The pebble would always
represent one unit; but the unit would be determined by the line on which
the pebble was placed. For examples, see Figure 1.2. Observe, now, that
a pebble’s appearance is the same wherever it is placed. Essentially, it
is only how we position a pebble that allows us to recognize the
denomination it represents. In short, for the first time, we have a system of
enumeration wherein we can determine denominations only by position. Do
not confuse this with the subtractive principle of Roman numerals. In
Roman numerals, to be sure, XC names ninety while CX names one
hundred ten; but the important point is that the X still denotes ten
regardless of which position it occupies. The position does not affect the
denomination in Roman numerals, but only whether that denomination is
to be added or subtracted. In short, as we are defining place-value, the
Romans had no place-value.
[Figure 1.2: abacus diagrams showing (One), (Ten), (Hundred), (Thousand), and (Two thousand one hundred thirty-six).]
In any event, the mechanics of performing arithmetic using the abacus
were the same as for using Roman numerals. The only difference was
that now the various denominations were determined by the line on which
the pebbles were placed rather than by the appearance of the symbol, and
multiplication by ten meant moving all pebbles to the next line to the left.
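As a rough model of this idea, an abacus can be sketched in Python as a list of pebble counts, one entry per line, with the ones line first. The representation and the function names below are assumptions made for illustration; the text itself describes only pebbles on lines.

```python
def abacus_from_number(n):
    """Pebble counts per line for n, ones line first (index 0)."""
    lines = []
    while n > 0:
        lines.append(n % 10)
        n //= 10
    return lines


def value(lines):
    """Number named by the pebbles: each pebble is one unit of its line."""
    return sum(count * 10 ** i for i, count in enumerate(lines))


def times_ten(lines):
    """Move every pebble one line to the left (the next higher power of ten)."""
    return [0] + lines


abacus = abacus_from_number(2136)
print(abacus)                    # [6, 3, 1, 2]
print(value(times_ten(abacus)))  # 21360
```

Note that the pebbles themselves never change; only the line on which they sit determines the denomination.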
This sets the stage for the invention of our present place-value system.
To begin with, let us recall that in certain ways each generation inherits a
legacy from the past and tries to leave one for the future. In this context,
the invention of place-value did not occur overnight. All the ingredients
which are usually associated with place value, except one, were already
present when the Hindus and Arabs made their joint contribution toward
the invention of our present system. For example, the fact that the
Romans used such symbols as X and C established the facts that man was
already able to use abstract symbols to denote numbers and that he had
also hit upon the idea of trading in bundles of ten. The abacus paved
the way for place-value in the sense that it introduced the idea that
denominations could be represented merely by position.
The remaining ingredient that gave rise to our place-value system was
the fact that man devised a system whereby there was no need for any
concrete symbol to denote denomination. This in turn led to a problem
that was not encountered in any previous system of enumeration: namely,
in any system where the denominations are recognizable by their appearance,
there is no need to invent a symbol to indicate the absence of any
denomination. For example, the Roman would write CCIII to denote
the number two hundred three. There was no need for him to indicate
that no X’s were present, because if they had been there we would see
them. In a similar way, the abacus would have appeared as in Figure 1.3.
Here, too, the absence of any pebbles on the ten’s line indicates that there
are none of this denomination. In other words, prior to the invention of
place-value, the smallest whole number that was named was one. There
was no need to denote the idea of nothingness since this was clear from
the missing symbol which named the particular denomination.
[Figure 1.3: the abacus showing two hundred three, with no pebbles on the ten’s line.]

However, once our present place-value system was invented, all this
changed. Since there was no visible way of recognizing a denomination
other than by where a digit was placed, it became imperative to invent a
place holder, that is, a symbol to indicate that we had none of a particular
denomination. For example, without such a symbol how would we
recognize the numeral which names two hundred three? For, if we were
to write 2 3, how would we know what denominations were named by the
2 and the 3? If there is even the simplest visible way of recognizing the
denominations, this problem does not exist. For example, in the following
chart there is no danger of misreading the numerals, but if the column
headings were omitted all the numerals would have the form 23. In short,
as long as there is a label we do not have to indicate nothingness other
than by using nothing.
Thousands   Hundreds   Tens   Ones
    2           3                      (2300)
    2                    3             (2030)
    2                           3      (2003)
                2        3             (230)
                2               3      (203)
                         2      3      (23)
In a similar way, our present place-value system made it mandatory,
for the first time, that we never use more than nine of a single denomination.
Up to this time, using either Roman numerals or the abacus, man
did not have to use more than nine of any denomination; but he could, if
he so wished. Again, the reason for this was the visibility of the various
denominations. For example, with the columns labeled there would be
no danger of confusing

    100   10    1                100   10    1
      2    1    3       and              2   13
    (two hundred thirteen)        (thirty-three)

Yet with the labels omitted both of the above entries would look like 213.
This is similar to what a youngster tends to do when he first learns to
add. For example, given 78 + 96, he writes it as

     78
    +96

and then proceeds to solve it as two separate problems:

     7           8
    +9   and    +6.

He then writes

     7   8
    +9   6
    16  14.
We tend to call this wrong, yet it is almost correct. Indeed it would have
been correct had he written, say, (16) (14) to indicate that he had 16 tens
and 14 ones. In fact, the use of the parentheses is an excellent forerunner
to the idea of carrying. That is, 14 ones is the same as 1 ten and 4 ones.
In other words, with the labels present each of the following is a different
but correct way of representing 78 + 96.

    Hundreds   Tens   Ones
                16     14
                17      4
       1         7      4

Notice how the abacus serves as a missing link between our modern
place-value system and the Roman numeral system. Namely, just as in
the place-value system, the abacus uses the idea of position value; but just
as in the Roman system, the abacus allows for limitless amounts of any
denomination.
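The passage from (16) (14) to 174 is just a sequence of trade-ins, ten of one denomination for one of the next. A minimal Python sketch follows, assuming the columns are stored as a list of counts with the ones column first; this layout and the name `carry` are illustrative choices, not the text's.

```python
def carry(columns):
    """Trade ten of each denomination for one of the next higher one.

    `columns` holds the count in each column, ones first, and a count
    may exceed nine. The result has at most nine in every column.
    """
    result = []
    carried = 0
    for count in columns:
        total = count + carried
        result.append(total % 10)   # what stays in this column
        carried = total // 10       # what is traded to the next column
    while carried:
        result.append(carried % 10)
        carried //= 10
    return result


# 78 + 96 added column by column: 8 + 6 ones and 7 + 9 tens.
unreduced = [8 + 6, 7 + 9]
print(unreduced)         # [14, 16]
print(carry(unreduced))  # [4, 7, 1], that is, 1 hundred, 7 tens, 4 ones
```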
In any event, place-value forces us to invent a numeral to denote none,
and this numeral turns out to be 0 (zero).¹ In terms of number-versus-numeral,
we must not confuse none and zero. Referring to our house-numeral
riddle of the previous section, at 10 cents per digit, 20 costs as
much as 29. In still other words, 0 is a numeral that denotes none just the
same as 7 is a numeral which denotes seven. It is rather unfortunate that
except for zero and none, the place-value digit and the number it names
are pronounced the same way (for example, 7 is pronounced “seven”).
This further tends to obscure the difference between number and numeral.
Once these observations are made, we can perform arithmetic in the new
system in ways that are analogous to the old. We have already illustrated
how addition, subtraction, and multiplication can be done with Roman
numerals. These techniques translate in an almost natural way to place-value.
For example, in place-value we add on a zero to multiply by ten.
Observe that this is equivalent to exchanging a denomination for one of
the next higher since the annexation of a 0 serves to push each denomination
over by one place. For example, suppose we add on a zero to 23 and thus
form 230. To be sure, the 2 and 3 have the same face value (that is, 2
always denotes two while 3 always denotes three) but the denominations
named by 2 and 3 are different. In 23 we have 2 tens and 3 ones, while in
230 we have 2 hundreds and 3 tens. In our modern numeral system we
must distinguish between face value and place value.

¹ Since the symbol zero was necessitated by a numeral system (place value)
rather than by a number concept, we tend to “segregate” zero from the other
whole numbers. The natural numbers are the whole numbers excluding zero, and
the whole numbers are the natural numbers plus zero. The word “natural” is
suggested by the idea of trying to count none as being unnatural.
In this same context, observe how the traditional recipe for performing
33 × 32 is just a disguised form of the Roman numeral version which we
have already described. Most of us have memorized the recipe, or
algorithm.

      33
     ×32
      66
     99
    1056.

We may not know why we do it, but we do know that we get the right
answer. Yet, a little reflection about face value versus place value tells
us that while the face value of 3 in 32 is three, its place value makes it 3
tens, or thirty. Now thirty 33’s is not 99, but 990. In terms of number-versus-numeral,
it makes no difference whether we write the 0 or indicate
it in some other way (such as by indenting). In other words, our above
algorithm has a “missing” 0, as we could have written

      33
     ×32
      66   (two 33’s)
     990   (thirty 33’s)
    1056   (thirty-two 33’s).

Now, observe how easy it is to recognize multiplication as being rapid
addition (and also observe how this translates, word for word, into the same
computation, but with different numerals, that we used in our Roman-numeral
computation).
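The recipe with its “missing” zeros restored can be sketched in Python as follows. The function name `partial_products` is a hypothetical helper; the sketch simply forms one partial product per digit of the multiplier, using that digit's place value rather than its face value.

```python
def partial_products(multiplicand, multiplier):
    """One partial product per digit of the multiplier, ones digit first."""
    products = []
    place = 1                      # 1, 10, 100, ...: the digit's denomination
    while multiplier > 0:
        face_value = multiplier % 10
        products.append(face_value * place * multiplicand)
        multiplier //= 10
        place *= 10
    return products


rows = partial_products(33, 32)
print(rows)       # [66, 990]: two 33's and thirty 33's
print(sum(rows))  # 1056
```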
If the above example seems simple, let us further point out that the type
of problem involved here still occurs in more advanced mathematics. For
example, there are those who can divide polynomials but who do not know
why the recipe works; there are those who can mechanically take derivatives
and form integrals, but who do not know why the techniques are correct.
The point is that if we memorize the fundamentals without understanding
them, there is little hope that we can assimilate the finer points which
extend the frontiers of knowledge. In summary, the fundamentals must
become meaningful before one can hope to become proficient with the
phases of the subject he has not been taught to memorize. It is this concept
that motivated us to spend this much time in developing, or reconstructing,
our present place-value system.
Even though the point has been made, it might be well to complete our
study of place-value with specific mention of division and thus round out
the discussion of the four basic operations of arithmetic.
Moreover, such a decision enables us to introduce the important idea of
inverse operations. Roughly speaking, “inverse” indicates the idea of a
change in emphasis. For example, the statements

    Mickey Mantle was a great baseball player.

and

    A great baseball player was Mickey Mantle.

may be viewed as being inverses of one another. They say the same thing
but with a change of emphasis. That is, our first statement seems to be
an answer to the question “Who was Mickey Mantle?” and, thus, emphasizes
Mickey Mantle. Our second statement appears to be an answer to
“Name a great baseball player”; hence, it seems to emphasize great baseball
players.
In terms of a mathematical example, let us consider the idea that subtraction
is the inverse of addition. When we write, for example, 3 + 2 = 5,
we are emphasizing 5; at least the 5 stands alone and we seem to be saying
that 5 is the sum of 2 and 3. Now 5 − 2 = 3 conveys exactly the same
meaning as 3 + 2 = 5; only we seem to be emphasizing the 3. In short, we
can automatically subtract as soon as we know how to add. More precisely,
when we see the expression 5 − 2, we may read this as “that number
which must be added to 2 to yield 5.” This is what store clerks do when
they make change. If you pay for a $2.34 purchase with a $5 bill, the
clerk gives you change by adding on to $2.34 the amount necessary to make
$5.00.
In a similar way, division is the inverse of multiplication. That is, we
can divide as soon as we know how to multiply. By way of illustration,
let us consider 2821 ÷ 13.
In terms of multiplication, this asks us to find the number which when
multiplied by 13 yields 2821. Another way of looking at this is that if
multiplication is rapid addition then division is rapid subtraction. Thus,
we wish a quick way of determining the number of bundles we can make
from 2821 tallies if we place 13 tallies in each bundle.
To this end, we observe that one 13 is 13; ten 13’s is 130; one hundred
13’s is 1300; and so on. Hence, if we subtract 2600 from 2821 we are
immediately performing in one step the equivalent of subtracting 13 two
hundred times. In summary,

     2821
    −2600   (two hundred 13’s)
      221
     −130   (ten 13’s)
       91
      −91   (seven 13’s).

Thus, in all, there were 217 bundles of 13 each in 2821.

The recipe, known as long division and which some think is so mystical,
is exactly what we have done above, only with the use of place-holders to
supply us with short cuts. Namely, when we write

        217
    13)2821
       26
        22
        13
         91
         91

notice that we have left out a few 0’s and with these 0’s supplied, we obtain

        217
    13)2821
      −2600   (200)
        221
       −130   (10)
         91
        −91   (7)

which is just what we had done with rapid subtraction.
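Division as rapid subtraction can be sketched in Python as below. This is one possible rendering under stated assumptions: both numbers are positive whole numbers, and the helper name `rapid_subtraction` is invented for illustration. At each power of ten it removes as many bundles of that size as it can and records how many 13's were removed in that step.

```python
def rapid_subtraction(dividend, divisor):
    """Divide by repeatedly subtracting the divisor in batches of
    1, 10, 100, ... at a time. Returns (quotient, remainder, steps),
    where each step is (number of divisors removed, amount subtracted)."""
    batch = 1
    while divisor * batch * 10 <= dividend:
        batch *= 10                     # largest useful power of ten
    remainder, quotient, steps = dividend, 0, []
    while batch >= 1:
        count = 0
        while remainder >= divisor * batch:
            remainder -= divisor * batch
            count += 1
        if count:
            steps.append((count * batch, count * batch * divisor))
        quotient += count * batch
        batch //= 10
    return quotient, remainder, steps


print(rapid_subtraction(2821, 13))
# (217, 0, [(200, 2600), (10, 130), (7, 91)])
```

The recorded steps are exactly the three subtractions in the layout above: 2600, then 130, then 91.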
Again, while we may be dealing here at a very elementary level, the
principle of inverse operations permeates virtually every branch of mathematics.
For example, in algebra factoring is the inverse of multiplication;
logarithms are the inverse of exponentiation; and extracting roots is the
inverse of raising to powers. In trigonometry, as might be expected, the
inverse trigonometric functions are, indeed, the inverses of the regular
trigonometric functions. In calculus integration is the inverse of differentiation.
The important computational point is that once we have
studied an operation thoroughly, we immediately have a strong hold on
the inverse operation. We shall exploit this idea on many occasions
throughout the text; but for now, except for a brief glance at the idea of
scientific notation and the use of exponents, this completes our development
of the whole numbers.
Granted that place-value was a great improvement over earlier tally
systems, the fact remains that as man used his newer systems, he was able
to study vastness in a way previously unheard of. While by nature of his
physical knowledge there was little need for primitive man to comprehend
the number that we would write as, say, 10,000,000,000,000,000,000,000,
the fact remains that numbers this vast are frequently encountered in
modern-day science. The interesting paradox is that while the above
number is written in place-value notation, the number of 0’s behaves much
like tallies as we try to keep track of the various places.
Faced with the prospects of a numeral system that was becoming too
cumbersome for his modern needs, man invented the idea of natural-number
exponents. In this sense, the exponent was just a little code for
telling how many factors of a certain number we should take.
In abstract language, if $b$ is any number and $n$ is any natural number,
we define the symbol (numeral) $b^n$ (read as “$b$ raised to the $n$th power” or,
more concisely, “$b$ to the $n$th”) by

$$b^n = \underbrace{b \times b \times \cdots \times b}_{n \text{ factors of } b}.$$

$b$ is called the base, while $n$ is called the exponent.

By way of illustration, $2^7$ means $2 \times 2 \times 2 \times 2 \times 2 \times 2 \times 2$, or 128.
That is, $2^7 = 128$. Similarly, $3^4 = 3 \times 3 \times 3 \times 3 = 81$.
It should not be difficult to see that, for example, $2^{100}$ is much easier to
write than a hundred factors of two.
The concept of scientific notation enters the picture when we choose our
base to be ten. To see why, observe the following.

$$10^1 = 10$$
$$10^2 = 10 \times 10 = 100$$
$$10^3 = 10 \times 10 \times 10 = 1000, \text{ and so on.}$$

If we pursue this further, we readily discover that $10^n$ represents the number
which when written in place-value consists of the digit 1 followed by
$n$ 0’s.
Thus the number which we write as 10,000,000,000,000,000,000,000 in
place-value notation would be written quite simply in scientific notation
as $10^{22}$. To expound upon this idea further: recall that earlier
we mentioned that at the rate of one tally per second, working around the
clock, it would take over thirty years to count to a billion. Just for fun,
then, let us assume that we can write one digit per second. Consider the
number:

$$10^{10^{10}}.$$

As we have written it, at the rate of a digit per second, it took us about
six seconds to express in scientific notation. Now let $n = 10^{10}$. Then
the above number may be viewed in place-value notation as a 1 followed
by $10^{10}$ zeros. But $10^{10}$ means 10,000,000,000, which represents ten billion.
In other words,

$$10^{10^{10}}$$

if converted into place-value would be a 1 followed by ten billion 0’s. At
the rate of one digit per second, it would take us more than 300 years to
write this number of 0’s. In other words, scientific notation allows us to
express in six seconds a number which would have taken more than 300
years to express in terms of the usual system of place-value. It should
be clear now what we mean when we say that scientific notation is to
place-value what place-value was to the more primitive tally systems.
Indeed, as man progresses, it would be vane (and also vain) for us to try
even to predict the innovations that he shall invent in his quest to understand
his history and his heritage.
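Both claims in this passage are easy to check with a few lines of Python: that $10^n$ written out is a 1 followed by $n$ zeros, and that ten billion digits at one per second take more than 300 years. The names below are illustrative, and a 365-day year is assumed.

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365


def written_out(n):
    """Place-value numeral for 10**n, as a string of digits."""
    return str(10 ** n)


numeral = written_out(22)
print(numeral)                 # 10000000000000000000000
print(numeral.count("0"))      # 22 zeros after the leading 1

zeros_to_write = 10 ** 10      # ten billion 0's, one per second
print(zeros_to_write / SECONDS_PER_YEAR)  # a little over 317 years
```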
Since exponents play such a vital role in many scientific contexts, we
shall continue our digression and talk more about exponents. In particular,
we wish to point out that if $b$ and $c$ are any numbers and $n$ and $m$
are any natural numbers, it is rather easy to establish the following recipes:

$$b^m \times b^n = b^{m+n}$$
$$(b^m)^n = b^{m \times n}$$
$$(bc)^m = b^m \times c^m.$$

Moreover, if $n$ exceeds $m$, then

$$b^n \div b^m = b^{n-m}.$$
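Rather than try to prove these results, we shall merely look at a few
examples to see how these results come about.

$$\begin{aligned}
2^4 \times 2^3 &= (2 \times 2 \times 2 \times 2) \times (2 \times 2 \times 2) \\
&= 2 \times 2 \times 2 \times 2 \times 2 \times 2 \times 2 \\
&= 2^7 \\
&= 2^{4+3}.
\end{aligned}$$

$$\begin{aligned}
(2^4)^3 &= 2^4 \times 2^4 \times 2^4 \\
&= (2 \times 2 \times 2 \times 2) \times (2 \times 2 \times 2 \times 2) \times (2 \times 2 \times 2 \times 2) \\
&= 2^{12} \\
&= 2^{4 \times 3}.
\end{aligned}$$

$$\begin{aligned}
(2 \times 3)^4 &= (2 \times 3) \times (2 \times 3) \times (2 \times 3) \times (2 \times 3) \\
&= (2 \times 2 \times 2 \times 2) \times (3 \times 3 \times 3 \times 3) \\
&= 2^4 \times 3^4.
\end{aligned}$$

$3^7 \div 3^2$ means, in terms of the inverse of multiplication, the number which
when multiplied by $3^2$ yields $3^7$. We already know that $3^2 \times 3^5 = 3^7$.
Hence

$$3^7 \div 3^2 = 3^5 = 3^{7-2}.$$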
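The four recipes can be spot-checked numerically by building powers exactly as defined, as repeated factors of the base. The `power` function below is a sketch written for this purpose (Python's built-in `**` operator would serve equally well); it is restricted to natural-number exponents, as in the definition.

```python
def power(b, n):
    """b raised to the n-th power as n factors of b (n a natural number)."""
    if n < 1:
        raise ValueError("only natural-number exponents are defined here")
    result = b
    for _ in range(n - 1):
        result *= b
    return result


print(power(2, 4) * power(2, 3) == power(2, 4 + 3))      # True
print(power(power(2, 4), 3) == power(2, 4 * 3))          # True
print(power(2 * 3, 4) == power(2, 4) * power(3, 4))      # True
print(power(3, 7) // power(3, 2) == power(3, 7 - 2))     # True
```

A few true instances do not prove the recipes, of course; they only illustrate them, in the same spirit as the worked examples.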
These recipes show us the interesting result that in terms of exponents
multiplication problems are solved by addition, and division problems by
subtraction. In terms of scientific notation, we now have a convenient
way of keeping track of large numbers of zeros. For example, consider

    24,000,000,000,000 × 3,000,000,000.

In terms of scientific notation

$$24{,}000{,}000{,}000{,}000 = 24 \times 10^{12}$$
$$3{,}000{,}000{,}000 = 3 \times 10^{9}.$$

Thus,

$$\begin{aligned}
24{,}000{,}000{,}000{,}000 \times 3{,}000{,}000{,}000
&= 24 \times 10^{12} \times 3 \times 10^{9} \\
&= 24 \times 3 \times 10^{12} \times 10^{9} \\
&= (24 \times 3) \times (10^{12} \times 10^{9}) \\
&= 72 \times 10^{12+9} \\
&= 72 \times 10^{21} \\
&= 72{,}000{,}000{,}000{,}000{,}000{,}000{,}000.
\end{aligned}$$

Moreover, by studying the above procedure we understand why the number
of zeros in the product is the sum of the zeros in each of the separate
factors.
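The same computation can be sketched in Python by carrying each number as a pair (coefficient, exponent of ten). The pair representation and the function names are assumptions made for this illustration only.

```python
def multiply_scientific(x, y):
    """Multiply two numbers given as (coefficient, power of ten) pairs:
    coefficients are multiplied, exponents are added."""
    (c1, e1), (c2, e2) = x, y
    return (c1 * c2, e1 + e2)


def to_place_value(pair):
    """Ordinary whole number named by a (coefficient, power of ten) pair."""
    coefficient, exponent = pair
    return coefficient * 10 ** exponent


product = multiply_scientific((24, 12), (3, 9))
print(product)                  # (72, 21)
print(to_place_value(product))  # 72000000000000000000000
```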
Before concluding this section, it will be to our advantage in later work
to now define the meaning of $b^0$. So far, this has not been defined, since
we do not consider 0 to be a natural number (recall that 0 is referred to as a
whole number but not a natural number. The natural numbers start
with 1.)
It should be clear that we are at liberty to define $b^0$ in any way that we
wish since it has not previously been defined. Man elected to define $b^0 = 1$
(where we must assume that $b \neq 0$, lest logical difficulty beyond our present
interest arises); and the reason for this choice provides an excellent example
of man’s ability to think logically. Namely, he observed that the recipe

$$b^n \times b^m = b^{n+m}$$

was very convenient. Consequently, he wanted to define $b^0$ in such a way
that this recipe would still be applicable. He observed that

$$b^n \times b^0 = b^{n+0}$$

meant that $b^n \times b^0 = b^n$ since $n + 0 = n$. This told him that when $b^0$
multiplied $b^n$ the answer was still $b^n$; hence, he concluded that $b^0$ must be 1,
since 1 has the property of not changing the value of the number it multiplies.
However, $0 \times c = 0$ for any number $c$; this is why we must preclude
letting $b = 0$ when we define $b^0 = 1$.
Lest we tend to lose sight of the great importance of exponents, let us
point out here that just as place-value became a necessary improvement
over tally systems if the technology of man was to flourish, so also were
exponents a necessary improvement over place-value. In still other words,
quite apart from any computational aspects of exponents, the study of
exponents affords us another fine example of how each invention seems to
ultimately lead us to another plateau, at which time a new invention is
necessary if progress is to continue.
At any rate, this completes our treatment of the development of our
present place-value system. In the next section we shall attempt to generalize
the concept of place-value by showing that, strange as it may seem,
it is not important that we use the number ten as the basis of our study of
place-value.
Exercises
1. List six different numerals which name the number five.
2. In what sense are 4 + 1 and 3 + 2 synonyms? In what sense are
they not the same?
3. In terms of numerals rather than numbers, give an example in which 3
is “bigger than” 4.
4. Explain the flaw in the argument “In order to determine which of two
bags contains more pieces of candy, all we need do is weigh the bags.”
5. Suppose a number leaves a remainder of three when divided by five
and a second number leaves a remainder of two when divided by five.
Then the sum of these two numbers is divisible by five. Use tally
marks to explain why this result is true.
6. Use Roman numerals to form the product of twenty-four and thirteen.
7. Suppose the Romans had invented a new symbol for each power of ten,
and that they agreed never to use more than nine of any given symbol
in expressing a number. Consider the number which we write as
1,456,007,013.
(a) How many symbols would the Romans have had to invent to
express this number?
(b) How many characters would they have had to use?
8. Explain how the abacus was a bridge between the ancient tally systems
and our present place-value system.
9. Show how the abacus would have been used to perform
(a) 456 + 897 (b) 456 × 12.
10. Even though the abacus was a place-value system, its use did not
require the invention of a place holder. Why?
11. Suppose we allowed more than a single digit per place-value column.
Name three different numbers that could then be named by the
numeral 1234.
12. Solve the problem of subtracting ninety-seven from two hundred thirty-four
by each of the following methods:
(a) change-making (b) the abacus.
13. Use the principle of “rapid” addition to compute 789 × 312.
14. Use the principle of “rapid” subtraction to compute 15,129 ÷ 123.
15. Explain how the need for scientific notation arose.
16. The number known as Avogadro’s number and which represents the
number of molecules in eighteen grams of water is represented in
scientific notation as
$6 \times 10^{23}$.
How is this number written as a place-value numeral?
17. In terms of the answer to Exercise 16, explain how place-value managed
to become as cumbersome as the tally systems, and how scientific
notation simplified this problem.
18. Explain the difference between $3^6$ and $3 \times 6$.
19. Are $3^6$ and $6^3$ synonyms? Show why or why not.
20. Are $2^4$ and $4^2$ synonyms? Show why or why not.
21. Looking at Exercises 19 and 20, is there any general statement we can
make regarding the relationship between $a^b$ and $b^a$? Explain.
22. Observing that $9 = 3^2$, explain why $9^3 = 3^6$.
23. What number is named by each of the following numerals?
(a) $4^5$                (e) $3^4 \div 3^2$
(b) $5^4$                (f) $5^0$
(c) $3^2 \times 2^3$     (g) $0^5$
(d) $4^2 \times 4^3$
24. Use exponent notation to find the product of six billion and twelve
million. Express the product as a numeral in both scientific notation
and place-value. What difficulty arises when we name the product as
a number which did not arise when we were using the idea of exponents?
How does this compare with the problem involved when one wished to
name powers of ten using the Egyptian numeral system?
1.3 DIFFERENT NUMBER BASES

In this section we shall try to emphasize the role of the “trade-in” number
as opposed to the number which we call ten. Let us say for now that our
system of arithmetic hinges on the fact that we have 10 fingers, not on the
fact that we have ten fingers! It will be our aim throughout this section
to resolve this apparently paradoxical statement in a meaningful way.
As we have already indicated, ancient man could have elected to trade-in
in whatever batches he desired. Thus, had man been born with four
fingers, it is quite natural that four would have been the trade-in value
and then the Romans could have used X to denote four rather than ten.
Let us assume that we were still born with ten fingers, but that we wished
to trade-in in batches of four. There are several different models that we
could form to study this problem. However, let us consider only the
following.
The odometer is the name given to the part of the car, usually considered
as part of the speedometer, that serves as the mileage indicator,
the part that tells how many miles the car has traveled. The odometer
consists of interlocking gears, each with ten faces. The faces are named
by the numerals 0 through 9, inclusive. The interlocking is performed in
such a way that whenever 0 turns up on any gear, the gear immediately
to the left moves up one number. The exciting part, at least to most
children when they first become aware of the odometer, is when “all 9’s
are up,” for the next mile then creates plenty of action. Namely, the
next mile brings 0 to the first gear (by first, we mean furthest to the right);
this in turn makes the next gear change by one, thus bringing about a 0
here, and so the process continues as the next mile converts all the 9’s
to 0’s.
The important point here is that the gears of the odometer did not have
to have ten faces. For example, suppose that each gear had only four
faces. Suppose also that each of these faces were named by the numerals 0
through 3, inclusive; and that the interlocking were still carried out by the
process of a return to 0 on one gear causing the gear immediately to the
left to change by one. Such an odometer would serve just as well for
recording mileage. Of course, the two different odometers would result
in the same number of miles being represented by different combinations of
symbols. For instance, on our new odometer, the “action” would take
place when all 3’s appeared. The following table is a visual
representation of this process.

Number of   Number of miles         Reading on odometer   Reading on odometer
miles       expressed in            having four faces     having ten faces
            tally marks             per gear (1,2,3,0)    per gear
None                                000                   000
One         |                       001                   001
Two         ||                      002                   002
Three       |||                     003                   003
Four        ||||                    010                   004
Five        |||||                   011                   005
Six         ||||||                  012                   006
Seven       |||||||                 013                   007
Eight       ||||||||                020                   008
Nine        |||||||||               021                   009
Ten         ||||||||||              022                   010
Eleven      |||||||||||             023                   011
Twelve      ||||||||||||            030                   012
Thirteen    |||||||||||||           031                   013
Fourteen    ||||||||||||||          032                   014
Fifteen     |||||||||||||||         033                   015
Sixteen     ||||||||||||||||        100                   016
Seventeen   |||||||||||||||||       101                   017
Eighteen    ||||||||||||||||||      102                   018
Nineteen    |||||||||||||||||||     103                   019
Twenty      ||||||||||||||||||||    110                   020
Twenty-one  |||||||||||||||||||||   111                   021
Twenty-two  ||||||||||||||||||||||  112                   022

In other words, the number of miles traveled does not depend on the
odometer, but the dial reading does; or, the numeral, not the number,
depends on which odometer we use. Notice also that unless we tell someone
how our odometer is constructed, he has no way of converting the dial
reading into the correct number of miles. For example, the reading 012
denotes |||||| (six) miles on one odometer, but |||||||||||| (twelve) on the
other. To avoid this misinterpretation, we will refer to them as the
base-four and base-ten odometers, respectively. Observe that in what we
call base-four we mean a place-value system in which the trade-in value
is always four. However, as far as number-versus-numeral is concerned,
closer scrutiny of the base-four odometer shows that what we would
ordinarily write as 4 appears there as 10. Notice, then, the difference
between ten and 10. Here we see an excellent example in which failure
to distinguish between number and numeral can really haunt us. The
point we are trying to make is that in place-value, the second column
(from right to left) names the trade-in value. In our base-four odometer,
the trade-in value is four but it is represented by 10 to indicate that in
this case we have one batch of our trade-in value and no ones. In fact,
it should not be difficult to see that we could have built an odometer using
any natural number as base, provided that it is at least equal to two.
(What we are saying here is that if the odometer is to be useful there must
be at least two faces on each dial since we need symbols for at least zero
and one.) The point is that no matter how many faces there are to one
of the odometer’s dials, the reading will be 10 when we travel this number
of miles. On a base-five odometer, 10 will appear when we have traveled
five miles; on a base-twelve odometer, 10 will represent twelve miles.
Observe that a base-n dial must have n digits. In other words, on a base-twelve
odometer, ten and eleven would be expressed by single digits while
twelve would be the first number denoted by a multiple-digit numeral, namely 10.
In summary, our trade-in value in any base will be represented by the
numeral 10; but except for base-ten it will not be the number ten.
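An odometer whose gears have any number of faces can be simulated with a short Python sketch. Everything named here is an illustrative assumption: the function `odometer_reading`, the choice of three gears, and in particular the single-digit symbols `A` and `B` for ten and eleven on a base-twelve dial, which the text does not specify.

```python
# Face symbols; 'A' and 'B' are assumed single-digit names for ten and eleven.
FACES = "0123456789AB"


def odometer_reading(miles, base, gears=3):
    """Dial reading after `miles` miles on an odometer whose gears each
    have `base` faces. Like a real odometer, it rolls over to all 0's
    once every gear has shown its last face."""
    if not 2 <= base <= len(FACES):
        raise ValueError("base must be between 2 and 12 in this sketch")
    reading = ""
    for _ in range(gears):
        reading = FACES[miles % base] + reading   # face showing on this gear
        miles //= base                            # turns passed to the next gear
    return reading


print(odometer_reading(22, 4))    # 112
print(odometer_reading(22, 10))   # 022
print(odometer_reading(4, 4))     # 010
print(odometer_reading(5, 5))     # 010
print(odometer_reading(12, 12))   # 010
```

In every base the trade-in value itself reads 10 (here 010 on three gears), while the number of miles it stands for changes with the base.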
A major problem occurs once we introduce the possibilities of different
bases. For, when we assumed that ten was the only possible trade-in
value, it seemed that we were making too much of the fact that there was
a difference between ten and 10. To be sure, there is a difference between
numbers and numerals; but perhaps it looked like too big an issue was
being made over it, since everything seemed clear from context. However,
now, when we see the numeral 10, all we can conclude is that it names the
particular trade-in value. For this reason, we add some garnishing to
place-value numerals when the possibility of different bases exists. For
example, if we wished to indicate that the reading 10 occurred on a base-five
odometer, we would write $(10)_{\text{five}}$. We shall, in the interest of brevity,
often abbreviate $(10)_{\text{five}}$ by $(10)_5$. We shall also use the convention that
when no subscript occurs, the usual base-ten is assumed. Thus, if we write
23, we mean either $(23)_{10}$ or $(23)_{\text{ten}}$.
Perhaps we can explain different number bases most clearly by studying
a particular base in detail. Since we have already introduced a base-four
odometer in some detail, it would be to our advantage to begin a comprehensive
study of base-four arithmetic. Recall that base-four arithmetic
has only four digits: 0, 1, 2, and 3. To this end, we see that our numeral
system appears to be given by 1, 2, 3, 10, 11, 12, 13, 20, 21, 22, 23, 30, 31,
32, 33, 100, . . . . As things stand, there is no trouble even if the numeral
system seems a bit unfamiliar to us. The trouble occurs when we start to
pronounce these numerals as one, two, three, ten, eleven, and so on. Here
we begin to pay the price for not having properly distinguished between
number and numeral, and reading or pronouncing ten and 10 alike. Observe
that no harm comes as long as it is the single digit numerals we are
reading. That is, 3 is always three in any base in which 3 exists. Thus,
we elect to solve this problem by pronouncing $(12)_4$ as “one-two-base-four.”
Certainly we are not denouncing the pronunciation “twelve-base-four.”
However, the sound of twelve might prejudice our thinking. The important
thing to observe, however, is that, when written, $(12)_4$ does not depend
on how it is pronounced! Since we do not want to become more involved
with jargon than with ideas, our agreement is that the student may pronounce,
when necessary, $(12)_4$ in any way that he desires, provided that he
understands the meaning of the numeral $(12)_4$. In this text, whenever we
wish to pronounce, say, $(234)_5$, we shall say two-three-four-base-five rather
than two hundred thirty-four base-five.
Remember that in the base-four system we have in no way changed the
concept of the natural numbers. All we have changed is the numeral
system. In other words, the number six exists as a concept no matter
what base we are using. However, if the base is less than seven, we shall
never see the numeral 6. In still other words, among the various numerals
that name the number six we have
6, (10)6, (11)5, (12)4, and (20)3.
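The list of numerals that name six can be checked mechanically. The following is only a sketch; the function name to_base is our own choice for illustration, and we assume bases from two through ten so that the ordinary digits suffice.

```python
def to_base(n, base):
    """Return the place-value numeral (a string of digits) that names the
    natural number n in the given base. Assumes 2 <= base <= 10."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)  # trade in bundles of `base`
        digits.append(str(remainder))
    return "".join(reversed(digits))

# The same number, six, named in several bases.
for base in (10, 6, 5, 4, 3):
    print(f"({to_base(6, base)}){base}")
```

The number being named never changes inside the loop; only the numeral printed for it does.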
Let us pretend for the time being that we were born in the base-four
numeral system. As we have seen, our system of numerals, all other things
being equal, would look like this:
1, 2, 3, 10, 11, 12, 13, 20, 21, 22, 23, 30, 31, 32, 33, 100, . . . .
To the person who used this system, there would be nothing unnatural
about this procedure. As a final reminder, observe that the digits here
have the same value as always, but only the value representing the trade-in
is different.
In this system, 13 + 12 would denote
IIII III   IIII II
since 13 = IIII III and 12 = IIII II.
Now it is easy to see that this number of tallies, which is represented by
13 + 12 in our base-four system of enumeration, can also be written as
IIII IIII IIII I
and in base-four numerals this would be written as 31. In terms of
Roman numerals, if X represents IIII, all we are saying is that
XIII + XII = XX IIII I = XXXI.
Hopefully, our last illustration shows that no matter what base has been
chosen, the tally system serves as a marvelous visual aid for interpreting
place-value. Finally, to emphasize the difference between number and
numeral, let us observe that in the base-four system 13 is a numeral which
denotes the number seven; 12 is a numeral that denotes the number six;
and 31 is a numeral which denotes the number thirteen. Thus, the
base-four fact 13 + 12 = 31 simply denotes the number fact that the sum of
seven and six is thirteen. This would be written as 7 + 6 = 13 in terms of
base-ten numerals. Of course, when more than one base is under
consideration, subscripts are needed to avoid misinterpretation. In this event,
we could say that
(13 + 12 = 31)four and (7 + 6 = 13)ten
express the same number fact, but involve different numerals.
The next step in our base-four existence toward learning place-value
arithmetic would be to learn addition and multiplication tables. Using
our memorized numeral system and learning to count on our fingers (much
the same as we actually learned the base-ten tables), we would learn such
numeral facts as
3 + 2 = 11, 2 + 3 = 11, 2 X 3 = 12, 3 X 3 = 21, and so on.
(For those who are having difficulty making the transition from base-ten
to base-four, the following clarification should be of help. When we see
3 X 3 we are tempted to say “9.” Fine; now recalling that we are in
base-four, think of the 9 as naming nine; which, in turn, is 2(fours) +
1(one); and this is written as 21 in the base-four numeral system.)
We could then construct our tables as follows.
+ |  0   1   2   3          X |  0   1   2   3
--+----------------         --+----------------
0 |  0   1   2   3          0 |  0   0   0   0
1 |  1   2   3  10          1 |  0   1   2   3
2 |  2   3  10  11          2 |  0   2  10  12
3 |  3  10  11  12          3 |  0   3  12  21
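Tables such as these need not be memorized to be checked. As a sketch (the helper names to_base and table are ours, not part of any standard library), we can compute each sum or product as a number and only then write down its base-four numeral:

```python
def to_base(n, base):
    """Numeral (digit string) naming n in the given base, 2 <= base <= 10."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        digits.append(str(remainder))
    return "".join(reversed(digits))

def table(operation, base):
    """Rows of the single-digit table for `operation`, written in `base`."""
    return [[to_base(operation(a, b), base) for b in range(base)]
            for a in range(base)]

add4 = table(lambda a, b: a + b, 4)   # base-four addition table
mul4 = table(lambda a, b: a * b, 4)   # base-four multiplication table

for row in add4:
    print(" ".join(f"{entry:>3}" for entry in row))
print()
for row in mul4:
    print(" ".join(f"{entry:>3}" for entry in row))
```

Changing the 4 to any other base from two through ten produces the corresponding tables for that base.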
With these tables in mind, let us see how we could learn addition in
terms of place value. Suppose we were told to perform the operation
indicated by 12 + 21. We would write
 12
+21
and we might then, quite mechanically, view this as two separate simpler
problems, namely
 1         2
+2   and  +1
 3         3.
We could then superimpose these results to conclude that
 12
+21
 33.
(As a review of our own base-ten system and how to convert into other
bases, observe that the above numeral fact merely states that the sum of
six and nine is fifteen. In particular,
(12)4 = 6, (21)4 = 9, and (33)4 = 15.)
Now suppose we were given 13 + 12. In terms of tallies, we have
already obtained 31 as our answer and we wish to do this in terms of
base-four place-value. Again, we could write
 13
+12
and think of it as
 1         3
+1   and  +2
 2        11.
Thus we might write
 13
+12
2(11)
but to avoid misinterpretation and the need to always use parentheses, it
is easier to trade in four (which here of course is denoted by the numeral 10)
of the first column for one of the second. That is, “bring down the 1 and
carry a 1.”
In this way, we would write
 1
 13
+12
 31.
Thus, regrouping would exist in base-four just as in base-ten, so that
base-four addition could be carried out in the same way as in base-ten;
only the numerals would have changed. Indeed, if we liked this form
of computation, we would have every right to say “Isn’t it fortunate that
we decided to trade in in bundles of 10!” We would by no means be
saying that the number ten was important to us.
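The “bring down and carry” recipe can be written out once and used in every base. What follows is a minimal sketch under our own naming (add_in_base is a hypothetical helper); it works on the digits of the numerals, column by column from right to left, and assumes bases from two through ten.

```python
def add_in_base(x, y, base):
    """Add two numerals (digit strings) in the given base, column by column,
    trading in `base` of one column for one of the next. Assumes base <= 10."""
    a = [int(d) for d in reversed(x)]   # units column first
    b = [int(d) for d in reversed(y)]
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        da = a[i] if i < len(a) else 0
        db = b[i] if i < len(b) else 0
        carry, digit = divmod(da + db + carry, base)  # bring down, carry
        result.append(str(digit))
    if carry:
        result.append(str(carry))
    return "".join(reversed(result))

print(add_in_base("12", "21", 4))     # no trade-in is needed
print(add_in_base("13", "12", 4))     # one trade-in from the units column
print(add_in_base("678", "785", 10))  # the same recipe in base-ten
```

Only the argument naming the base changes from one call to the next; the recipe itself is untouched.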
We could continue on and show how all the recipes that we used in
multiplication, division, and subtraction when we did arithmetic in base-ten
apply to all bases and that only the numerals are different. Such
endeavors are left for exercises. It is certainly a valuable experience to
become adept at the manipulations involved in various bases, but such
an endeavor is peripheral to our major aim, which is simply to exploit the
number-versus-numeral concept.
Instead, let us return to the fundamental idea that in the use of any
language we study concepts but see only words. The number eighty, as
we hope is clear by now, does not depend on what base we are using.
However, the place-value numeral which denotes eighty does. In
base-four the column values are one, four, sixteen, sixty-four, and so on; and
thus eighty would be represented as 1(sixty-four) + 1(sixteen) + 0(four) +
0(one). In other words, (80)ten = (1100)four.
Here might be a good place to review our odometer idea. If two
odometers using different bases are used side by side during the same
trip, even though the mileage is the same the dial reading on the odometer
with the smaller base will appear to read a larger number than the other,
since we trade-in more often on the smaller-base odometer. Thus if a
base-ten odometer and a base-four odometer initially read 0000, at the
end of an eighty-mile trip the base-ten odometer will read 0080 and the
base-four odometer will read 1100.
In a similar way, when we think of base-eleven the denominations are
one, eleven, and so on, and we can then think of eighty as being 7(eleven) +
3(one). That is, (80)ten = (73)eleven.
By this time the student may have noticed that in any base he is using
powers of that base; for example,
(248)10 = (2 X 100) + (4 X 10) + 8 = (2 X 10²) + (4 X 10) + 8,
(243)5 = (2 X 5²) + (4 X 5) + 3,
(31)4 = (3 X 4) + 1,
(246)7 = (2 X 7²) + (4 X 7) + 6.
Converting these to base-ten, respectively, we get 248, 73, 13, and 132.
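The expansion in powers of the base translates directly into a short computation. This is a sketch only; value_of is a name of our own choosing, and we compare it with Python's built-in int(numeral, base), which performs the same conversion.

```python
def value_of(numeral, base):
    """Number named by a numeral (digit string) in the given base,
    found by multiplying each digit by the matching power of the base."""
    total = 0
    for power, digit in enumerate(reversed(numeral)):
        total += int(digit) * base ** power
    return total

for numeral, base in (("248", 10), ("243", 5), ("31", 4), ("246", 7)):
    # The built-in conversion should agree with the power-by-power sum.
    print(f"({numeral}){base} names", value_of(numeral, base),
          "and int() gives", int(numeral, base))
```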
Among other things, notice that the place-value numeral which names
eighty in base-eleven ends in an odd digit. Yet eighty is an even number.
(Recall that even and odd are number properties and do not depend on
any base. In other words, if we were to bundle eighty tallies in twos, we
would obtain forty bundles with none left over.)
This leads us to the question of which properties of numbers are really
number properties and which are actually numeral properties. In many
ways, this is the same as conveying ideas in different languages. Each
language has its own idioms and we often say with justification that
certain phrases lose something in translation. The ideas are the same but
how we express them varies from language to language. Certain
arithmetic properties are also easier to understand in one base than in another,
even though the concept has the same meaning in all bases. By way of
illustration, observe that in terms of our usual base-ten system of
enumeration, the numbers five and two seem to possess extra “virtue” with
regard to divisibility. For example, to see if a number is divisible by five,
we look at its unit digit when the numeral is written in base-ten. If this
digit is either 0 or 5 (that is, if the number denoted by the unit digit is
divisible by five) then the number is divisible by five; otherwise, it is not.
In a similar way, we may test for divisibility by two. We find that if
the unit digit of the number is either 0, 2, 4, 6, or 8, then the number is
divisible by two; otherwise, it is not.
This property is not shared by any other single digits in base-ten
arithmetic. For example, twenty-eight is divisible by seven; but written
as a base-ten numeral, twenty-eight is represented by 28, where the unit
digit is 8; and 8 is not divisible by 7.
The point to be made is that there is nothing special with regard to
division about the numbers two and five. Rather, it is that 2 and 5 play
a special role in the base-ten system; for observe that the only nontrivial
divisors of ten are two and five. (Since any natural number is divisible by
itself and by one, we call these two factors trivial divisors.) That is, both
two and five are divisors of all the denominations in base-ten, except for
the unit-digit column. In other words, in terms of numerals, 10, 100, 1000,
and so on are each divisible by both 2 and 5 in base-ten. Hence, all we
need test is the unit digit to see if the number is divisible by either 2 or 5.
By way of example, consider the number 2376:
2376 = 2(1000) + 3(100) + 7(10) + 6.
Hence, if we divide 2376 by 5, no remainder can occur until we get to the
unit digit, since 5 is a divisor of 1000, 100, and 10. We thus see that when
5 is divided into 2376 the remainder is 1. In a similar way, notice that
2376 is divisible by 2.
However, this use of numerals in testing for divisibility depends rather
strongly upon the base. For example, let us consider the number fact that
ten divided by five is two. This is indeed a number fact and does not
depend in any way on what base we are using. In fact, we do not even
have to use place-value to denote this fact. For example, in terms of tallies
we need only observe the array IIIII IIIII to see that when ten tallies are
bundled in bunches of five we get two bundles. Now, let us investigate
this number fact using base-seven numerals. In this event, we observe
that ten is written as (13)7; hence, the number fact is expressed by the
numeral fact
(13)7 ÷ (5)7 = (2)7.
(We are aware that (5)7 = 5 and (2)7 = 2, but we prefer the above notation
to emphasize the fact that we are using base-seven numerals.)
Here the unit digit of (13)7 is 3 and certainly (3)7 ÷ (5)7 is not a whole
number.
The point is that while number facts do not depend on the base, it may
be easier to recognize a particular number fact in one base than in another.
For example, as we have just seen, divisibility by five is viewed
conveniently in terms of base-ten numerals, since five is a divisor of ten. However,
divisibility by either five or two was not too convenient in terms of
base-seven numerals, since neither five nor two is a divisor of seven.
In short, the test for divisibility in terms of looking only at the unit
digit hinges on whether the divisor is also a divisor of the base. The more
divisors the base possesses, the more convenient tests are for divisibility;
while the fewer the divisors of the base, the fewer tests for divisibility in
terms of the unit digit. The extreme, in this respect, occurs when the base
is a prime number. A prime number is any natural number, excluding 1,
which is divisible only by 1 and by itself; for example, 2, 3, 5, 7, 11, and 13.
If a number is prime, then there are no nontrivial divisors of it. This was
the case in our study of base-seven numerals.
A number like twelve is a good base in this regard since it admits two,
three, four, and six as nontrivial divisors. This means that in base-twelve
numerals, the results can be expressed by the following statement:
A number is divisible by either two, three, four, or six if and only if its unit digit
is divisible by either two, three, four, or six, respectively (where, of course, the
digits refer to base-twelve numerals).
In still other words, in base-twelve
(1) A number is divisible by 2 if and only if its unit digit is 0, 2, 4, 6, 8,
or T. (T is a single digit to denote ten. Remember that twelve is
the first double digit in base-twelve.)
(2) A number is divisible by 3 if and only if its unit digit is 0, 3, 6, or 9.
(3) A number is divisible by 4 if and only if its unit digit is 0, 4, or 8.
(4) A number is divisible by 6 if and only if its unit digit is 0 or 6.
For example, the number which is written as (106)12 should be divisible by
six, since in base-twelve its unit digit is 6. To check this, observe that
(106)12 = 1(144) + 6 = 150
and
150 ÷ 6 = 25.
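The unit-digit test can be sketched in a few lines. The function name unit_digit_test is our own, and the sketch refuses to answer when the divisor is not a divisor of the base, since that is exactly the case in which the test tells us nothing.

```python
def unit_digit_test(n, divisor, base):
    """Decide whether n is divisible by `divisor` by looking only at the
    unit digit of n written in `base`. Valid only when divisor divides base."""
    if base % divisor != 0:
        raise ValueError("the unit-digit test needs a divisor of the base")
    unit_digit = n % base            # the rightmost digit in this base
    return unit_digit % divisor == 0

print(unit_digit_test(150, 6, 12))   # (106)12: unit digit 6
print(unit_digit_test(2376, 5, 10))  # unit digit 6 is not divisible by 5
print(unit_digit_test(2376, 2, 10))  # unit digit 6 is divisible by 2
```

Asking for unit_digit_test(10, 5, 7) raises the error, which mirrors the difficulty we met with (13)7.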
For the very reason that twelve has so many divisors, there are some who
feel that twelve should have been the base for arithmetic rather than ten.
There is a society, called the Duodecimal Society, which still maintains
this (duodecimal refers to base-twelve). The authors believe that while
base-twelve affords advantages not offered by base-ten, these advantages
hardly warrant the chaos of changing, especially since we understand that
the choice of base in no way affects the number facts. However, the very
vocabulary that we use to name numbers seems to indicate that even in the
Middle Ages man sensed the advantages of twelve over those of ten. For
example, aside from such definitions as 12 inches = 1 foot, or that there
are twelve, not ten, to a dozen, observe that the first “teen” is thirteen
(three-teen) not “one-teen.” That is, our own counting system seems to
indicate that we get to twelve before we start the teens.
Another interesting numeral fact is the role of nine in our base-ten
system. Namely, to test for divisibility by nine using base-ten numerals, we
need only form the sum of the digits and see whether this is divisible by
nine. For example, 126 is divisible by 9 since 1 + 2 + 6 (= 9) is divisible
by 9. We can go one step further and say that the remainder when a
number is divided by nine is the same as the remainder when the sum of its
digits is divided by nine. For example, consider 1,002,432. The sum of
the digits is 1 + 0 + 0 + 2 + 4 + 3 + 2 = 12. In turn, 1 + 2 = 3.
Thus, when 1,002,432 is divided by 9 the remainder is 3. This fact is
known as casting out nines and is used as a check in performing arithmetic
computations. For example, suppose we want to check that 33 X 32 =
1056. We observe that 33 and 3 + 3 leave the same remainder when
divided by 9, as do 32 and 3 + 2. Thus, the remainder when 33 X 32 is
divided by 9 should agree with the remainder obtained when 6 X 5 is
divided by 9. That is, we compare
  33           6
 X32   with   X5
1056          30.
Now 1 + 0 + 5 + 6 = 12, and 1 + 2 = 3; and 3 + 0 = 3. Thus, the
result checks.
Observe that the check by casting out nines is not foolproof. To see this
observe, for example, that if the answer is off by a multiple of nine the
incorrect answer will check. For instance, had we said 33 X 32 = 1956
(which is off by nine hundred), we would obtain
1 + 9 + 5 + 6 = 21, and 2 + 1 = 3. This agrees with the check.
Another way in which this method can fail to catch an error arises from
the recipe for multiplication. If we forget to indent, for example, and
perform the operation 33 X 32 as follows:
  33
 X32
  66
  99
 165
we obtain 1 + 6 + 5 = 12 and 1 + 2 = 3; and the result again checks. This
happens because adding on a zero does not change the sum of the digits.
That is, the sums of the digits of 99 and 990 are the same.
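Both the check and its blind spots can be seen in a short sketch. The names cast_out_nines and product_checks are ours, chosen only for illustration.

```python
def cast_out_nines(n):
    """Remainder of n on division by nine, found by repeatedly summing
    the base-ten digits until a single digit remains."""
    while n >= 10:
        n = sum(int(d) for d in str(n))
    return n % 9        # a final digit of 9 counts as remainder 0

def product_checks(a, b, claimed):
    """True if `claimed` passes the casting-out-nines check for a X b.
    Passing does not prove the product correct; failing proves it wrong."""
    expected = cast_out_nines(cast_out_nines(a) * cast_out_nines(b))
    return expected == cast_out_nines(claimed)

print(product_checks(33, 32, 1056))  # the correct product passes
print(product_checks(33, 32, 1956))  # off by 900, yet it still passes
print(product_checks(33, 32, 165))   # the unindented sum also passes
print(product_checks(33, 32, 1057))  # off by one: the check catches this
```

The check is therefore best thought of as a way of catching many errors cheaply, not as a proof that an answer is right.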
However, the main concern of our investigation is why this interesting
property of 9 seems to exist. It is due to the fact that 9 is the greatest
single digit in the base-ten system. That is, our trade-in values are written
as 10, 100, 1000, and so on:
10 = 9 + 1
100 = 99 + 1
1000 = 999 + 1.
Thus, consider the three-digit numeral, say, abc. This means that
a(100) + b(10) + c = a(99 + 1) + b(9 + 1) + c
                   = 99a + a + 9b + b + c
                   = (99a + 9b) + (a + b + c).
Now it should be fairly clear that (99a + 9b) is divisible by 9, so that any
remainder when we divide by 9 must come from (a + b + c); but this is
precisely the sum of the digits of abc.
However, the above argument could be reproduced in any number base
(since the trade-in value is always denoted by 10) but rather than 9, we
would use the greatest single digit in that particular base. For example,
in the base-seven system we could cast out sixes. This follows from the
numeral fact that (6 = 10 - 1)7. It is also worth noting here that any
divisor of the “casting-out number” is also a “casting-out number.” For
example, in base-ten we can cast out threes as well as nines; and in the
base-seven system, we can cast out twos and threes as well as sixes. In
fact, if the base is odd, the casting-out number will be even and every even
number is divisible by two (by definition). Hence, in an odd base, we can
always cast out twos.
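The claim that the argument carries over to every base can be tested by brute force. In this sketch the names digits_in_base and remainder_by_digit_sum are our own; the second accepts any divisor of the casting-out number (the base less one).

```python
def digits_in_base(n, base):
    """Digits of n in the given base, most significant digit first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        digits.append(remainder)
    return digits[::-1]

def remainder_by_digit_sum(n, base, divisor=None):
    """Remainder of n on division by `divisor`, computed from the digit
    sum alone. `divisor` must divide base - 1 (it defaults to base - 1)."""
    if divisor is None:
        divisor = base - 1
    if (base - 1) % divisor != 0:
        raise ValueError("divisor must divide the casting-out number")
    return sum(digits_in_base(n, base)) % divisor

# Casting out sixes, threes, and twos in base-seven agrees with division.
for n in range(200):
    for divisor in (6, 3, 2):
        assert remainder_by_digit_sum(n, 7, divisor) == n % divisor
print("casting out in base-seven agrees with ordinary remainders")
```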
This leads to a rather interesting fact. We have already observed that
in an even base we can test for divisibility by two just by observing whether
the unit digit is even. However, in an odd base this is not possible. For
example, eight is divisible by two since the product of two and four is
eight. Now writing eight as a base-five numeral, we obtain the numeral
(13)5 and here the unit digit is 3; but three is not an even number. In
other words, as the following shows, there is no correlation between even
and odd in terms of unit digits if the base is odd. Namely,
8 = (13)5 but 3 is odd;
12 = (22)5 and 2 is even.
Thus in base-five the unit digit of an even number may be either even or
odd. Similarly, the unit digit of an odd number, expressed in base-five, may
be either even or odd. Thus,
23 = (43)5 and 3 is odd;
27 = (102)5 and 2 is even.
Now recall that a number is even if it is divisible by two; and this does
not depend on the base. Our problem seems to stem numeral-wise from
the use of an odd base. As we have seen, in an odd base, the casting-out
number is even; hence, in any odd base, we can cast out twos. Thus, to
test for even or odd numbers using the numerals of an odd base, all we need
do is check the sum of the digits. In terms of our previous examples,
observe that
(13)5 yields the result 1 + 3 = 4, which is even;
(22)5 yields the result 2 + 2 = 4, which is even;
(43)5 yields the result 4 + 3 = 7 or (12)5 (1 + 2 = 3), which is odd;
(102)5 yields the result 1 + 0 + 2 = 3, which is odd.
In summary, then, whether a number is even or odd is a number fact,
not a numeral fact. However, the test for even or odd in terms of numerals
does depend on the base. In particular, when expressed in an even base,
a number is even if and only if the unit digit is even; while when expressed
in an odd base, a number is even if and only if the sum of the digits is
even.
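The two-part rule just summarized can be written as a single test that looks only at the digits of the numeral. This is a sketch under our own naming (digits_in_base and is_even_by_numeral are hypothetical helpers), compared against the number fact itself.

```python
def digits_in_base(n, base):
    """Digits of n in the given base, most significant digit first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        digits.append(remainder)
    return digits[::-1]

def is_even_by_numeral(n, base):
    """Test evenness from the numeral alone: the unit digit in an even
    base, the sum of the digits in an odd base."""
    digits = digits_in_base(n, base)
    if base % 2 == 0:
        return digits[-1] % 2 == 0
    return sum(digits) % 2 == 0

for n in (8, 12, 23, 27):
    print(n, digits_in_base(n, 5), is_even_by_numeral(n, 5))
```

The numeral test changes with the base, but its verdict always agrees with the number fact that n is or is not divisible by two.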
Here we see a very important phase of mathematics. Namely, certain
aspects of arithmetic depend only on the numbers involved, while other
aspects depend on the numerals. One problem of the mathematician is
the investigation of which is which. For example, the property of nine
in casting out nines does not depend on the number nine but on the fact
that 9 represents the greatest digit in the base-ten system. On the other
hand, the fact that nine is a perfect square does not depend on the number
base being used. That is, the product of three and three is nine, no matter
how we elect to write it, be it
III III III or 3 X 3 = 9 or (3 X 3 = 21)4
or (10 X 10 = 100)3, and so on.
While this investigation may possess aesthetic values from the purely
theoretical point of view, the problem of confusing numbers and numerals
may be a very real problem to the average student. That is, many
concepts of arithmetic are relatively easy to grasp; but the techniques
which one uses to obtain results in terms of numerals are what usually cause
the greatest difficulty. Thus, among other things, a proper distinction
between numbers and numerals is perhaps the greatest aid in learning to
understand arithmetic from a computational point of view.
Before concluding the present study of different number bases, it would
be an injustice not to summarize the many reasons for having studied them.
(1) The study of different number bases affords us the opportunity to
enhance our knowledge of the difference between numbers and
numerals; for this study shows us the structure of place-value and its
independence from the choice of the base.
(2) We see the difference between properties of numbers and properties
of numerals. In other words, number facts do not depend on the
choice of our base, but how we “measure” some of these facts does.
(3) One of the complaints about the new mathematics is that it does not
pay enough attention to drill. A fairer statement is that the new
mathematics stresses drill but in ways which might be more exciting
than sheer rote. In this respect, the study of different bases forces
the student to continually translate from one number base to another
(usually base-ten). This can only be done if the student has proper
mastery of the arithmetic tables, among other things. That is, in
addition to all else, the study of different number bases affords us
excellent drill and review with respect to our own base-ten system.
In concluding this section, let us say just a word about base-two (binary)
arithmetic. In terms of computers, observe that a switch is either off or
on. It is not partly on and partly off. Thus, an adequate coding system
would require two symbols: one to denote on and the other to denote off.
Thus, base-two, with its two digits, is very appropriate for this task. This
is the reason that base-two has such wide application in many pragmatic
endeavors.
Exercises
1. Indicate how the number twelve would be written in each of the
following bases.
(a) two (e) ten
(b) three (f) eleven
(c) four (g) twelve
(d) seven (h) fifteen
2. What number is named by (123)n if n is equal to each of the following?
(a) four (e) eleven
(b) five (f) fifteen
(c) six (g) twenty
(d) nine
3. Explain what we mean when we say that there is no 6 in base-four
arithmetic, but there is the number six.
4. Without presupposing a knowledge of algebra, explain how we could
tell by sight that x = 7 if it is true that (146)x = (123)8.
5. In each of the following, determine the number base if we assume that
each stated numeral fact is correct.
(a) 2 + 2 = 10 (f) 10 X 5 = 50
(b) 4 + 5 = 12 (g) 26 + 35 = 62
(c) 8 + 7 = 12 (h) 34 + 45 = 123
(d) 14 - 5 = 6 (i) 44 ÷ 8 = 5
(e) 4 X 5 = 26
6. Letting T represent ten and E represent eleven, write the addition and
multiplication tables (for all single digits) for base-twelve.
7. In base-ten we know that a number is divisible by five if and only if
its unit digit is 0 or 5. Is this true in base-twelve? What analogous
result is true in base-twelve?
8. Compute the product of twenty-three and eleven using base-twelve
numerals. Is this the same as computing (23)12 X (11)12? Explain.
9. Perform the indicated operations using the indicated bases.
(a) (236)8 + (754)8 (b) (234)5 X (32)5
10. Translate each of the problems in Exercise 5 into our more familiar
base-ten system and then check that the results you obtained in
Exercise 5 are correct.
11. Show how we could convert (123)4 into an equivalent base-six numeral
without first going through base-ten, but rather just by using only
base-six notation. For example, use such results as (16)ten = (24)six.
12. In base-ten arithmetic is it true that a number is divisible by six if
and only if the sum of its digits is divisible by six?
13. In base-seven arithmetic is it true that a number is divisible by six if
and only if the sum of its digits is divisible by six?
14. Use casting out nines to verify that 28 X 49 = 1372.
15. If we know that in a certain number base we can cast out fours, what
must the base be? (There is more than one correct answer.)
16. As mentioned in the text, base-two is particularly well suited for
computers since it requires only the two single digits 0 and 1.
(a) Write the addition and multiplication tables (for single digits only)
for base-two.
(b) Write eighty-seven and twenty-three as base-two numerals and
then compute their product using base-two arithmetic.
(c) Convert your answer in (b) to base-ten and see if you have obtained
the correct answer to 87 X 23.
A NOTE ON NUMBER VERSUS NUMERAL
Our discussion of different number bases should serve to free us from the
confinement of thinking in terms of a particular numeral system. Now that
this has been established, we feel free to return to our familiar base-ten
system in discussing various number concepts. Thus, unless otherwise
specified, all the remaining computations will be in base-ten. However,
just as there were different systems of enumeration, there were often
different systems of doing arithmetic, even within the framework of a
particular numeral system.
It is our purpose here to investigate a cross section of outdated
computational devices and compare them with our more familiar methods in terms
of the role played by place-value. At the same time, we shall see that the
older methods often reflect the feelings of their times, and this, too, is
important as we pursue our study of mathematics as a mirror of human
endeavor. We shall in no way try to give an exhaustive study of
techniques. Rather, we shall single out a few well-chosen examples, leaving it
to the interested reader to pursue the topic further.
Let us begin by observing that while we write from left to right, our
recipes in arithmetic frequently proceed from right to left. For example,
when we use place-value to add, we start at the units and proceed to the
tens, hundreds, and so on, going from right to left. Obviously, we could
have agreed to write the units, tens, and hundreds from left to right. That
is, there is no reason why we could not have invented the numeral 123 to
represent the number three hundred twenty-one. (If this seems backward
it is only because of what we are used to calling forward.) Had we used
this system, such things as carrying in addition would then have proceeded
from left to right. For example,
1 10 100 1000
8 7 6
5 8 7
13 15 13
3 16 13
3 6 14
3 6 4 1
illustrates how we would add six hundred seventy-eight and seven
hundred eighty-five to obtain one thousand four hundred sixty-three.
For a plausible explanation as to why we did not label increasing
denominations from left to right, recall that place-value was an Arabic
contribution and that the Semitic people write from right to left. Hence,
it is quite consistent that their arithmetic proceeded from right to left.
When the Europeans adopted this system, they decided not to tamper with
such a “good thing” and, consequently, they adopted it without a change,
complete with the right-to-left characteristics.
Perhaps man never cared whether he did arithmetic from left to right
or from right to left. Yet there seems to be ample evidence that he did
care. As a case in point, let us start our discussion of old-fashioned methods
of arithmetic with the scratch-out method of subtraction.
Consider the problem that we would write as
 4123
-2468.
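In the scratch-out method, we would begin by subtracting 2 from 4 (that
is, we begin at the left, just as we write) and we record the answer above
the 4. Thus,
[Figure: first stage. A 2 is written above the crossed-out 4 of 4123, and
the 2 of 2468 is crossed out.]
where we have crossed out the 4 and the 2 to indicate that we have used
these numbers. We next proceed to subtract 4 from 21 (the new minuend)
to obtain 17. We again cross out the numbers which have been
used. Thus, we obtain
[Figure: second stage. The digits 1 and 7 are written above the crossed-out
2 and 1, so that the part of the top not crossed out reads 1723; the 2 and 4
of 2468 are crossed out.]
We then subtract 6 from 172, and we continue in this way, to obtain
[Figure: final stage. Every digit of 2468 is crossed out, and the part of
the top not crossed out reads 1655.]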
The answer would be the portion not crossed out; that is, 1655. Stripped
of embellishment, this method is nothing different from viewing 2468 as
2000 + 400 + 60 + 8; we then subtract 2000 from 4123 and then 400
from this result. For example, if we refer to our first scratching-out, we
see that the top is 2123, the part which is not yet scratched out; and,
indeed, 2123 is what is left if we subtract 2000 from 4123. Similarly, if we
then subtract 400 from 2123, we obtain 1723, which is precisely the
unscratched-out part of the top in the second stage of our calculation.
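Stripped of the crossing-out, the method is a sequence of ordinary subtractions taken from the left. The following sketch (scratch_out_stages is a name of our own) records what remains unscratched on top after each stage; it assumes base-ten and a subtrahend no larger than the minuend.

```python
def scratch_out_stages(minuend, subtrahend):
    """Subtract the subtrahend one place-value part at a time, starting
    with its leftmost digit, and return what is left after each stage."""
    digits = str(subtrahend)
    remaining = minuend
    stages = []
    for position, digit in enumerate(digits):
        place_value = 10 ** (len(digits) - 1 - position)
        remaining -= int(digit) * place_value   # e.g. 2000, then 400, ...
        stages.append(remaining)
    return stages

print(scratch_out_stages(4123, 2468))
```

Each entry of the printed list corresponds to one stage of the scratching-out, and the last entry is the answer.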
The main point is that while the scratching-out method works, it is
awkward. Among other things, with so many things crossed out, it is
often hard when we are finished to remember what the problem was and
it is even harder to check the work. Why, then, did man bother to use
such a messy device when less messy devices were known to him? We can
only conclude that there was a certain peace of mind he got from doing
subtraction from left to right, enough satisfaction that he was willing
to put up with the other inconveniences!
As we turn our attention to multiplication, it is obvious that ancient
forms of multiplication could not be correlated with place-value if only
because place-value was as yet uninvented. Actually, we know rather
little about the methods of multiplication used by the ancients. Some
evidence leads us to believe that the Egyptians used the method known as
the duplation plan. In essence, this plan makes use of the process of
doubling and adding and/or subtracting. Let us illustrate this technique
by solving the problem
28 X 49.
To this end, we proceed as follows.
1 X 49 = 49
2 X 49 = 98
4 X 49 = 196
8 X 49 = 392
16 X 49 = 784
32 X 49 = 1568.
This system appears particularly convenient for multiplying a number
by 1, 2, 4, 8, 16, 32, and so forth. But the important point is that all
our numbers can be expressed as suitable combinations of these numbers.
This can be seen by thinking in terms of base-two arithmetic. That is,
the column values then represent 1, 2, 4, 8, 16, 32, and so forth, and the
only single digits are 0 and 1. Thus, the number which we write as 28
would appear in the base-two system as 11100. Of course, one does not
need base-two here. Without reference to base-two, all we are saying is
that 28 = 16 + 8 + 4. Thus, to ascertain the value of twenty-eight 49’s,
we need only take
16(49) + 8(49) + 4(49) =  784
                         +392
                         +196
                         1372.
We could also have said that 28 = 32 - 4; hence,
28(49) = 32(49) - 4(49) = 1568
                          -196
                          1372.
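The additive form of the duplation plan can be sketched as follows. The function name duplation is our own, and the way of choosing which doublings to add (taking the largest one that still fits, again and again) is one reasonable reading of the plan, not a claim about how the Egyptians themselves chose.

```python
def duplation(multiplier, multiplicand):
    """Multiply by doubling and adding only. Returns the product together
    with the doublings (1, 2, 4, 8, ...) whose rows were added."""
    rows = []                       # pairs (power of two, that many copies)
    power, value = 1, multiplicand
    while power <= multiplier:
        rows.append((power, value))
        power, value = power + power, value + value   # doubling only
    total, used, remaining = 0, [], multiplier
    for power, value in reversed(rows):               # largest doubling first
        if power <= remaining:
            remaining -= power
            total += value
            used.append(power)
    return total, used

product, used = duplation(28, 49)
print(product, used)   # the doublings used correspond to 28 = 16 + 8 + 4
```

The doublings that are used are exactly the column values holding a 1 when the multiplier is written in base-two.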
Clearly, either of these two methods gives us the same result as by the
usual method of multiplication. Notice that while one can explain this
method in terms of place-value using base-two, the fact remains that this
method could be (and, in fact, was) used by people who did not yet
have place-value. The important thing is that multiplication of natural
numbers is rapid addition, and this is a concept which makes sense
regardless of the particular system of numerals being used. Moreover, this
method is simple enough in concept that we should not be surprised to
find it used as late as the sixteenth century! In fact, the duplation plan
has a contemporary form found among Russian peasants today. The form
is even referred to as the peasant method of multiplication. This method
is an outgrowth of duplation in the sense that one doubles and halves at
the same time rather than just doubling. That is, we observe that the
product of two numbers remains the same if we double one of the numbers
and halve the other. For example,
16 X 64 = 8 X 128 = 4 X 256 = 2 X 512 = 1 X 1024 = 1024.
The problem with this method is that we do not always have powers of 2,
such as 16 and 64, with which to work. That is, while there is no problem
associated with doubling a number, there is one associated with halving a
number, since it is not true that every number is divisible by 2. The
peasant method avoids this problem in the following rather intriguing way.
We agree for the sake of simplicity to double the larger and halve the
smaller so that we can get one of the numbers to 1 as quickly as possible.
Notation-wise, we write the smaller number on top and the larger on the
bottom. Each time that we get a remainder when we divide by 2, we
neglect it, contrary to the intuition of many that we should alternate
between adding it and dropping it to secure a better balance. At any rate,
28 14 7 3 1
49 98 196 392 784.
We then circle the odd numbers in the top row. In this example, they
would be the numbers 1, 3, and 7. Thus (with the circled numbers shown
here in parentheses),
28   14   (7)  (3)  (1)
49   98   196  392  784.
Finally, we add all those numbers in the bottom row which appear under
the circled numbers. In this case, we add 784, 392, and 196, which is
exactly the same thing we did previously using the duplation plan. The
relationship between this method, base-two, and duplation is left as an
exercise for the reader. We should point out in passing the following.
(1) Whereas it is convenient to halve the smaller number, the peasant
method does not depend on this at all. Thus, the same problem could
have been solved by writing
(49)  24   12    6   (3)  (1)
 28   56  112  224   448  896
and adding 896 + 448 + 28 = 1372.
(2) We are not advocating a return to duplation and the peasant method;
we are not saying that these ancient methods are better than the
modern method. We wish only to show that these methods are both
logical and clever and indicate an understanding of arithmetic by some
of the ancients that is as sophisticated as the understanding of many
modern men.
(3) We also claim that this technique appears mystical enough to generate
a curiosity on the part of the student which might induce him to study
numbers and multiplication in more detail than he might otherwise
have done. In short, these old-fashioned methods often whet our
interest and at the same time give us a better appreciation of the
endeavors of our predecessors.
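For the reader who wishes to experiment, the peasant method can be sketched as follows. The function name `peasant_multiply` and its return values are our own choices for illustration; the only rules taken from the text are to halve (dropping remainders), to double, and to add the bottom entries that lie under odd top entries.

```python
def peasant_multiply(halved, doubled):
    """Russian-peasant multiplication: halve one number, double the other."""
    rows = []
    while halved >= 1:
        rows.append((halved, doubled))
        halved //= 2        # any remainder is simply dropped
        doubled += doubled  # doubling is just one addition
    # Keep the bottom-row entries that sit under odd ("circled") top entries.
    kept = [bottom for top, bottom in rows if top % 2 == 1]
    return sum(kept), rows, kept


product, rows, kept = peasant_multiply(28, 49)
print(rows)     # [(28, 49), (14, 98), (7, 196), (3, 392), (1, 784)]
print(kept)     # [196, 392, 784]
print(product)  # 1372

# Halving the larger number instead gives the same product.
print(peasant_multiply(49, 28)[0])  # 1372
```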
The first form of multiplication that resembled our modern form appeared
shortly after the advent of place-value (about 1150) in Bhaskara’s Lilavati.
It is distinguished from our present method only in that the arithmetic
proceeds from left to right rather than from right to left. We shall solve
135 × 12 in two different ways; one in which 12 is treated as a single-digit
number, and the second in which 12 is treated as a two-digit number.
 135         135
  12          12
12          135
 36          270
  60        1620
1620
Here is probably a good place to observe the arbitrariness of right-to-left
versus left-to-right. Granted that force of habit makes Bhaskara’s style
awkward to us, the awkwardness is only a matter of habit and nothing
more. Thus, in his first method, he is saying that
135 × 12 = (100 + 30 + 5) × 12
         = 100(12) + 30(12) + 5(12)
         = 1200 + 360 + 60
and this is exactly what our first illustration becomes when we supply the
missing 0’s. That is,
  135
   12
 1200
+ 360
+  60
 1620.
His second method is merely a rewording of the result that
135 × 12 = 135 × (10 + 2) = 135(10) + 135(2) = 1350 + 270.
For again, supplying the missing 0’s, we have
  135
   12
 1350
+ 270
 1620.
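The two left-to-right computations can be checked with a short sketch. The helper name `left_to_right_products` is hypothetical; it merely lists the partial products in the order in which a left-to-right worker would meet them, with the missing 0’s supplied.

```python
def left_to_right_products(number, multiplier):
    """Partial products of number * multiplier, taking the digits of
    `number` from the left and supplying the missing zeros."""
    digits = str(number)
    partials = []
    for position, digit in enumerate(digits):
        place = 10 ** (len(digits) - 1 - position)  # 100, 10, 1 for 135
        partials.append(int(digit) * place * multiplier)
    return partials


first = left_to_right_products(135, 12)   # 12 treated as a single "digit"
second = left_to_right_products(12, 135)  # 12 treated as two digits
print(first, sum(first))    # [1200, 360, 60] 1620
print(second, sum(second))  # [1350, 270] 1620
```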
In Pacioli’s Suma (1494), we find our common method of multiplying
for the first time; and it appears in the form illustrated in Figure 1.4.
More descriptively, this method is referred to as per scachiere because of
its resemblance to a chess board; per bericuocolo because of its resemblance
to cakes of this name sold at the fairs of Tuscany; per scalletta because of
the diagram’s resemblance to little stairs.
         9 8 7 6
         6 7 8 9
       8 8 8 8 4
     7 9 0 0 8
   6 9 1 3 2
 5 9 2 5 6
 6 7 0 4 8 1 6 4
Figure 1.4
In the same book, reference is made to the Gelosia method of
multiplication, which is exactly equivalent to our modern form except that
it facilitates the carrying process. We illustrate this method with the
problem 9876 × 6789 in Figure 1.5. This example shows why the name gelosia
is used. Namely, the pattern resembles the grating, or lattices, used in
windows; these were known as gelosia, eventually becoming the word
“jalousie” meaning “blind” in French. Notice that this method could be
used for multiplying any two numbers, no matter what number of digits
was contained in each. This method differs from our own only in the
sense that by adding between diagonals much of the actual carrying process
is eliminated.
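A minimal sketch of the lattice idea follows, assuming the usual layout in which each cell holds the product of one digit from each number, split into a tens part and a units part. The name `gelosia` and the bookkeeping by "diagonals" are our own illustration; note that a carry is still needed when a diagonal sum exceeds 9, so the method reduces the carrying rather than abolishing it.

```python
def gelosia(a, b):
    """Lattice (gelosia) multiplication of two whole numbers."""
    top = [int(d) for d in str(a)]
    side = [int(d) for d in str(b)]
    # diagonals[k] collects every lattice entry whose place value is 10**k.
    diagonals = [0] * (len(top) + len(side))
    for i, s in enumerate(side):
        for j, t in enumerate(top):
            tens, units = divmod(s * t, 10)  # the two halves of one cell
            k = (len(top) - 1 - j) + (len(side) - 1 - i)
            diagonals[k] += units
            diagonals[k + 1] += tens
    # Add along the diagonals from the right, carrying into the next one.
    digits, carry = [], 0
    for total in diagonals:
        carry, digit = divmod(total + carry, 10)
        digits.append(digit)
    return int("".join(str(d) for d in reversed(digits)))


print(gelosia(9876, 6789))  # 67048164
```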
While we are on the subject of the Gelosia method, notice the painstaking
detail that this method demands, even to the “scalloping” of the answer.
Obviously, such a tedious design shows that people had at least a good
amount of leisure time. In other words, one can often use such things as
arithmetic recipes as an artifact to determine the way of life of a particular
civilization. Thus, the history of mathematics can also supply the
archaeologist with another avenue of important data in reconstructing the
status of a given society.
Exercises
1. Use the scratching-out method to compute 2345 − 1846.
2. Suppose we had decided to have place-value proceed from top to bottom
rather than from right to left. Show how we would have computed
(a) the sum of eighty-seven and ninety-six,
(b) the product of eighty-seven and ninety-six.
3. Use the Gelosia method to compute the following.
(a) 9854 × 456 (b) 456 × 23 (c) 1004 × 208.
4. Do the computations in Exercise 3 using the duplation method.
5. Do the computations in Exercise 3 using the Russian-peasant method.
6. Do these same computations with the abacus.
7. Construct a base-four abacus and show how to solve (123)four + (323)four.
1.4 THE RATIONAL NUMBERS

In terms of counting it is not hard to see why and how man invented
the idea of the whole numbers. It is not so easy, however, to see why he
would have invented fractions. Indeed, most explanations turn out to be
excuses rather than reasons.
Virtually every elementary textbook conveys the idea that fractions
allow us to talk about parts of the whole. Yet it is quite common to refer
to one ounce rather than to one-sixteenth of a pound, or to a nickel rather
than to one-twentieth of a dollar. It is more likely that it was the concept
of division that led to the idea of rational numbers. (Here rational has
no connection with “sane”; rather it is a derivative of “ratio,” a concept
that is synonymous with division.)
Let us introduce our point with the following question: Which quotient
is more natural, 5 ÷ 3 or 6 ÷ 3? Most people feel much more at ease
with 6 ÷ 3 than with 5 ÷ 3, the reason being that 6 is divisible by 3,
but 5 is not. This attitude merely reflects the fact that just as ancient
man found it natural to think in terms of whole numbers, so do we. We
are in effect visualizing a tally system when we say that 5 is not divisible
by 3. Certainly in the tally context it is trivial to conclude that we cannot
divide five objects into a whole number of piles (and still use all the objects)
if we must put three objects into each pile. However, what about a person
who travels five miles in three minutes? Does he not have an average rate
of speed? The problem centers around the difference between physical
aspects of objects and, say, their length. For example, we cannot have
more than one but fewer than two people in a room. Yet we can have a
length (indeed, infinitely many lengths) greater than one inch
but less than two inches. In more formal language, we say that tallies
are a discrete measurement, while length is a continuous measurement.
We now arrive at a crossroad in the development of our number system.
Unless five divided by three was given a numerical meaning, we would
have to admit that a man who traveled five miles in three minutes had
no average speed (or if one desires a more intuitive comparison, let us
ask whether we would like to feel that it is possible to divide a five-inch
length into three pieces of equal length). Stated in this way, it is clear
that progress required that we extend our number system.
Recalling that when a simple system becomes outmoded we try to replace
it by the next most simple system which has all the advantages of the
original but lacks the shortcomings, we are faced now with the problem of
having to find a model that has all the advantages of the tally system, but
in addition allows the division of five by three. The solution comes with
the invention of the number line. In its simplest form, the number line is
similar to a ruler and the ruler is perhaps the most elementary illustration
of how arithmetic and geometry can coexist.
To construct a ruler we begin with a straight line (which is a geometric
concept); on this line we mark off successively some fixed length (which we
shall call a standard unit, or s. u.). This unit is arbitrary in length, but
once chosen it remains fixed as we mark it off on the given straight line,
thus locating a sequence of points (also a geometric concept), equally spaced
along the straight line. We then name these points 1, 2, 3, and so on
(these names are arithmetic). In this way, the geometric concept of point
is labeled by the arithmetic concept of numerals. This is summed up
pictorially in Figure 1.6.
[Figure 1.6: a ruler marked off in standard units, with lengths of 1 s. u. and 2 s. u. indicated from the origin and the points labeled 1, 2, 3, 4, 5]
This “ruler” gives us a new way to view whole numbers. We may, as
just described, view them as points on a line; or equivalently, we may
view them as lengths. The two interpretations are closely related and may
be seen as follows: if we take a length of 2 s. u. and mark it off on the ruler
starting at the origin of the ruler, this length terminates at the point 2.
Sometimes we shall use the point interpretation and at other times we
shall use the length interpretation, depending on which model better serves
our purposes; but in either interpretation notice that our new model allows
us to progress from a discrete to a continuous measurement. That is,
there are lengths between, say, 1 s. u. and 2 s. u., just as there are points on
the ruler between 1 and 2; but there is no number of objects that exceeds
one but is less than two.
In summary, we shall replace tallies by lengths (or points on a line) as
the means of visualizing numbers. However, we must remember that the
concept of number is still abstract, and that lengths are no more numbers
than tallies were. What is important is that lengths give us a visual
intuitive picture of numbers just as tallies did; but that with lengths we
can extend the number concept beyond whole numbers.
If we view numbers as lengths, it is natural that we shall equate equality
of numbers with equality of length. That is, two numbers will be equal in
terms of the number-line interpretation, provided they name the same
length. In a similar way, the ideas of “less than” and “greater than”
become translated into “shorter than” and “longer than.” In other words,
the idea that 1 < 2, in terms of our picture, says that 1 s. u. is shorter than
2 s. u. If we wanted to use the idea of points, we refer to “to the right of”
and “to the left of”; in other words, one number is greater than another if
the point it names lies to the right of the point named by the other.
An interesting aside is how the symbols < and > became the
abbreviations for “less than” and “greater than,” respectively. The idea was
that since = meant equal, the spacing between these two parallel lines would
be altered so that there was more distance between them at one end than
at the other. Then, if two numbers were unequal one would line up the
lesser number with the shorter spacing. Thus, the following would be
correct: 1 — 2, 4 — 7, and 8 = 5. The trouble was that with careless
handwriting one could never tell whether b = c meant that b was less than c
or that b equaled c (with a sloppy equal sign). To avoid such a dilemma,
the spacing was closed at one end; that is, the smaller distance was made
zero. In this way the symbols < and > were born, with the smaller
number being placed at the closed end of the symbol. In other words,
1 < 2, 4 < 7, and 8 > 5.
The model of length as a means for viewing numbers is so intuitive that
geometry is now being introduced early in the elementary school curriculum
to take advantage of the idea that a picture is worth a thousand words.
At a very early age the student can see the ideas of “shorter than” and
“longer than” or “to the right of” and “to the left of”; at least this is
possible at an earlier age than seems to be required to master the
arithmetic ideas of “greater than” and “less than.”
To continue, let us see how the number line allows us to perform division
problems in a way that the tally system precluded.
It is wiser that we start with a division problem that could have been
handled by the tally method so that we can see the transition more clearly.
Consider 6 ÷ 3. It is quite natural that we would interpret this as dividing
a length of 6 s. u. into three equal parts. If we were to do this it is easy
to see that each of the resulting pieces would have a length of 2 s. u. In
this way we would now have a visual way to view 6 ÷ 3 = 2 in terms of
lengths. If we now switch to the problem of 5 ÷ 3, which we could not
have done by means of tallies, we see that it is just as natural in terms of
lengths to talk about dividing a length of 5 s. u. into three equal parts.
Moreover, it is easy to see that if this division were accomplished the
length of each piece would have to be in excess of 1 s. u. but less than 2 s. u.
To emphasize once again that the number line is merely a good model,
notice that we do not have to refer to the number line to talk about dividing
5 by 3. Thus, we might ask the average speed of a particle if it traveled
5 miles in 3 minutes. One mile per minute would be too slow since then
the particle would have traveled only 3 miles during the 3 minutes; and
2 miles per minute would have been too fast since then the particle would
have traveled 6 miles during the 3 minutes. The number line allows us to
picture what is happening in a way which is perhaps more basic and
intuitive than that which we could obtain by the use of any other model.
Before continuing, we should point out that our model is so intuitive
that it is possible that we are taking certain things for granted that should
be explicitly stated at least once. One such assumption we are making is
that in terms of lengths, we add by laying the two lengths off consecutively,
side by side, and defining the sum of the two lengths to be the length thus
obtained. Figure 1.7 identifies this in terms of a picture.²
[Figure 1.7: lengths a and b laid off consecutively along a line, giving the total length a + b]
We do this not so much because it is self-evident but rather because it
helps to illustrate how we wish to define addition. In other words, since
we invented the model to help us visualize what was happening, we should
also invent our other definitions to capture the other aspects of what we
know is happening. By way of a specific illustration, notice how our
remarks apply to the number fact 3 + 2 = 5. Clearly, if we lay off a
length of 3 s. u. next to a length of 2 s. u. on the same straight line, the
composite length is 5 s. u., as shown in Figure 1.8.
[Figure 1.8: a length of 3 s. u. followed by a length of 2 s. u. on a number line marked 1 through 5, giving the total length 3 + 2 = 5]
² See Note 1, More on Geometric Arithmetic, on p. 53.

As in rapid addition, we view multiplication by a whole number as the
length obtained by laying off the given length that number of times. In
this way 4 × 5 can still be read as “four times five,” and this in turn means
the length obtained if a length of 5 s. u. is laid off four consecutive times.
With this discussion as background we are now ready to generalize the
idea of what m ÷ n means if m and n are whole numbers. We exclude the
case n = 0. Since m ÷ n means the number which when multiplied by n
equals m, m ÷ 0 would mean the number which when multiplied
by 0 yields m. However, any number times 0 is 0; hence, if m is not 0,
there is no number which multiplied by 0 is m. If m = 0, then any number
times 0 will equal m. Thus, in the first case there are no answers, and in
the second case there are too many answers; so in either case, we solve our
problem by decreeing that it is not permissible to divide by zero. For
n ≠ 0, we merely mean to take a length of m s. u. and to divide it into n
equal parts; then the length of each such piece represents m ÷ n.
Using the preceding discussion as motivation, we now define a rational
number to be the quotient of two whole numbers (the divisor being unequal
to 0). In some cases the quotient will be a whole number (in which case
the tally system would still have been a sufficient model). In other cases,
we need the number line if we are to view the quotient realistically.
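As a small illustration of this definition, the sketch below uses Python's standard `fractions.Fraction` type to stand for a rational number. The function name `quotient` and the choice to return `None` for a zero divisor are our own assumptions, made only to mirror the discussion of division by zero.

```python
from fractions import Fraction


def quotient(m, n):
    """Return the rational number q with q * n == m, or None when n == 0."""
    if n == 0:
        # No q satisfies q * 0 == m when m != 0, and every q does when m == 0,
        # so no single answer can be named.
        return None
    return Fraction(m, n)


print(quotient(6, 3))          # 2    (a whole number; tallies would suffice)
print(quotient(5, 3))          # 5/3  (needs the number line)
print(1 < quotient(5, 3) < 2)  # True
print(quotient(5, 3) * 3)      # 5
print(quotient(5, 0))          # None
```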
In summary, the number line provides us with a simple and intuitive
way of visualizing the quotient of two numbers. Namely, what is usually
called the dividend in arithmetic is the length of the line that is to be
divided; the part that is usually called the divisor in arithmetic is the
number of parts into which the given length is to be divided; and what is
called the quotient in arithmetic is the length of each piece obtained when
the given length is divided into the given number of equal parts.
Notice that so far we have avoided the use of the word “fraction” in our
discussion. The best reason for this can be explained in terms of
number-versus-numeral. That is, a fraction is a numeral which names a
rational number.
A particularly important type of fraction is known as a common fraction.
By a common fraction we mean the symbol m/n, where m and n are whole
numbers and n ≠ 0; m/n is used to name that number that must be
multiplied by n to obtain m; in other words, m/n is used to denote the
rational number that is obtained when m is divided by n.
In this context, when we write 6/2 = 3, we are saying that both 6/2
and 3 are numerals that name the same number (in this case, three).
However, observe that while both 6/2 and 3 name the rational number three, 3
is not called a common fraction while 6/2 is. That is, a common fraction
is a very specific numeral which by definition must have the form m/n.
Notice also that a fraction is just as valid if it is “top heavy.” That is,
the common fraction 5/3 is just as meaningful as the common fraction 2/3,
at least in the sense that one can just as logically divide a 5-inch line into
three equal parts as he can divide a 2-inch line into three equal parts.
Before proceeding further, we might explain why the names “numerator”
and “denominator” were invented to name the top and bottom numbers
of the common fraction. First of all, if “numerator” and “top” are
synonyms, then we need not use a fancy word to replace a straightforward
word. Obviously, “numerator” and “top” are not synonyms. In fact,
when we write m/n there is no top or bottom; rather m is to the left of n!
Ancient man recognized that the size of the standard unit was rather
arbitrary. He then realized that he could divide the standard unit of
length into any number of equal parts and the length of any one of the
resulting pieces could then be his new standard length. In this way, for
example, 1/3 would be the standard length that we would obtain by taking
our standard unit, dividing it into three equal parts, and then taking one
of these three equal parts. In general, 1/n was used to denote the length
of each resulting piece if a standard unit were divided into n pieces of equal
length.
The next observation was that given any denomination, 1/n, we could
generate arbitrarily long line segments by marking off the length 1/n
consecutively as often as we wished. For example, we might elect to mark
off 1/n, m times. Clearly, the name for such a length would be m × 1/n.
On the other hand, it should not be hard to see that the length named by
m × 1/n was precisely the same as the length named by m/n. In terms
of our usual usage of numerals, we are saying that m/n = m × 1/n. In
this form, it is easy to see that n names the denomination, while m
enumerates the number of times we mark off this denomination. Remember that
the greater the denominator, the smaller the denomination.
Given, say, 5/3, we now have two ways of viewing this as a length.
(1) It may be viewed as the length of each piece when a length of 5 s. u.
is divided into 3 equal parts, or (2) it may be viewed as the length
obtained when the denomination 1/3 is marked off 5 consecutive times. We
will use the interpretation that best accomplishes our purpose.
In the same way that tally marks gave us an excellent picture for
visualizing why the rules for the arithmetic of whole numbers were as they
were, the number line gives us equal insight into the arithmetic of rational
numbers. For example, let us consider the rule which says that we may
multiply numerator and denominator of a common fraction by any nonzero
number without changing the fraction. First of all, let us “clean up” the
language of the last statement. Obviously, we change the fraction.
Certainly 1/3 does not look like 2/6 any more than 4 + 1 looks like 5. What we
mean is that 1/3 and 2/6 are different common fractions that name the same
rational number. In terms of the number line we are saying that 2 pieces
whose size is 1/6 are necessary to form the same length as 1 piece whose size
(denomination) is 1/3. Or, in terms of our other interpretation, the length
obtained by dividing 1 s. u. into 3 equal parts is the same as the length
obtained by dividing a length of 2 s. u. into 6 equal parts.
We should also note that since the numerator and denominator of a
common fraction are whole numbers, we can explain such things as 1/3 = 2/6 in
terms of whole numbers. For example, all we are saying is that if one can
buy a bunch of equally priced objects for 3¢, he can buy two bunches for 6¢.
This interpretation also captures the idea of a ratio.
In summary, then,
1/3 = 2/6 = 3/9 = 4/12 = ...
simply indicates that different common fractions name the same ratio.

It is important that we do not allow ourselves to be tricked into believing
that 1/3 = 2/6 merely because we obtained one from the other by doing the
same thing to both the numerator and the denominator. For example, if
we add 2 to both the numerator and denominator of 1/3 we obtain 3/5; but it
is not true that 1/3 = 3/5, even though we obtained one from the other by
doing the same thing to both the numerator and the denominator.
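The distinction can be checked mechanically. In the sketch below, the helper `same_number` is our own; it relies on the standard fact that m1/n1 and m2/n2 name the same rational number exactly when m1 × n2 = m2 × n1. The sample values 1/3, 2/6, and 3/5 are chosen only for illustration.

```python
def same_number(m1, n1, m2, n2):
    """True when the common fractions m1/n1 and m2/n2 name the same number."""
    # m1/n1 = m2/n2 exactly when m1 * n2 = m2 * n1 (with n1 and n2 nonzero).
    return m1 * n2 == m2 * n1


# Multiplying numerator and denominator of 1/3 by 2, 3, 4 gives synonyms.
print([same_number(1, 3, 1 * k, 3 * k) for k in (2, 3, 4)])  # [True, True, True]

# Adding 2 to numerator and denominator of 1/3 gives 3/5, not a synonym.
print(same_number(1, 3, 1 + 2, 3 + 2))  # False
```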
When we deal with common denominators (such as when we want to
add two rational numbers in common fraction form) we are playing the
synonym game. It is important to emphasize here that a rational number
does not have a numerator and denominator. It is the numeral called
the common fraction which has a numerator and a denominator. Namely,
the only time that it is safe to compare amounts of money by the number of
coins is if the coins all have the same denomination. That is, a man with
four coins certainly has more coins than the man with three coins; however,
unless the denominations of the coins are all alike, we cannot be sure that
the man with four coins has more purchasing power than the man with
three coins; that is, three dimes out-purchase four pennies.
In this context, we may view 1/2 + 1/3 as denoting the length obtained
when a length of 1/2 and a length of 1/3 are laid off end to end. If we are
dealing with like denominations all we need do is add the total number of
parts. For instance, when we say that 2/7 + 3/7 = 5/7, we are really saying that
we have a common denomination, 1/7, and that 2 pieces of this size together
with 3 more pieces of this size are 5 pieces of this size, or 5 × 1/7. However,
we are not dealing with like denominations in our present problem.
Since man has the ability to think logically, he often tries to reduce
new, unsolved problems to old, solved problems. With this in mind, he
tries to express a problem with different denominations in terms of a
problem that has a like denomination. Thus, he observes that since 2 and
3 are both divisors of 6, it would be helpful to replace 1/2 by its synonym
3/6 and 1/3 by its synonym 2/6. In this way, he obtains that³
1/2 + 1/3 = 3/6 + 2/6 = 5/6.
³ See Note 2, A Brief Glance at Number Theory, on p. 59.
It might have seemed more natural that in adding 1/2 and 1/3 we should
have combined like parts. That is, why could we not have said
1/2 + 1/3 = (1 + 1)/(2 + 3) = 2/5?
We will answer this with the following example. If you are paid by
the hour and you work 1/2 hour one day and 1/3 hour the next day, then you
have worked 30 + 20 minutes in all. In terms of this real-life experience,
you have worked 50 minutes, which is 5/6 hour. If we had added
numerators and denominators, you would have worked 2/5 hour, or 24 minutes.
On the other hand, the baseball player who goes 1 for 2 in the first game
and then 1 for 3 in the second game has gone 2 for 5, not 5 for 6, in the two
games. In short, in any man-made system we invent rules and models
to conform with what we believe is true.
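The two competing rules can be placed side by side in a short sketch. Both function names are hypothetical, and the use of the least common denominator is our own choice (any common denominator would serve); the hour and minute figures are those of the example above.

```python
from math import gcd


def add_common_fractions(m1, n1, m2, n2):
    """Add m1/n1 and m2/n2 by first renaming both with a like denomination."""
    common = n1 * n2 // gcd(n1, n2)  # least common denominator
    pieces = m1 * (common // n1) + m2 * (common // n2)
    return pieces, common


def combine_like_parts(m1, n1, m2, n2):
    """The tempting but different rule: add tops and add bottoms."""
    return m1 + m2, n1 + n2


pieces, common = add_common_fractions(1, 2, 1, 3)
print(pieces, common, 60 * pieces // common)  # 5 6 50  (hours worked, in minutes)

top, bottom = combine_like_parts(1, 2, 1, 3)
print(top, bottom, 60 * top // bottom)        # 2 5 24  (the batting-average rule)
```

The second rule is not wrong in itself; it simply models a different situation, such as hits per times at bat.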
This also explains why it is not enough to say that m/n is a common
fraction. In mathematics, structure hinges on how the symbols are
combined, not on how they look. In other words, if 1/2 and 1/3 are to be common
fractions, it is mandatory that 1/2 + 1/3 = 5/6; otherwise the “realism” of the
rational numbers escapes us. In still other words, in any interpretation in
which it would be correct to say that 1/2 + 1/3 = 2/5, the symbols 1/2, 1/3,
and 2/5 are not being used as common fractions even though they may look
like common fractions.
If we now invoke the idea that subtraction is the inverse of addition, we
need spend no additional time discussing how we subtract rational
numbers. All we have to do, given a problem such as y — i = ?, is to view
it as i + ? = f. That is, ^ + ? = ^y; whereupon our previous knowledge
of addition allows us to conclude that ? = yy.
Let us now turn our attention to the question of multiplying rational
numbers. Clearly, if at least one of our factors is a whole number, there
is no trouble in still keeping the “rapid addition” idea. For example, if
we wish to compute 5 × 1/2, we may view this in terms of lengths as marking
off 1/2 five times, thus yielding a total length of 5/2. The difficulty lies in a
product such as 1/2 × 1/3.
Here we encounter for the first time the fact that we cannot view
multiplication as “rapid” addition; for, quite clearly, when a problem says to
perform an operation n times, n must be a whole number. That is, we
may mark off a length once, twice, or three or more times, but we
cannot mark it off one-half times.
Some of us may recall that in an expression such as 1/2 × 1/3, we no longer
read the “×” as “times,” but rather as “of.” We say 1/2 of 1/3, and this has
an immediate simple interpretation in terms of the number line.
Specifically, we think of the length which is 1/2 of 1/3. That is, we start with the
length 1/3 and divide it into two equal parts and we call the length of one of
these two equal parts 1/2 of 1/3. Of course, it is also easy to see that another
name for this length is 1/6 (since 2 × 1/6 = 1/3); and this is what we mean by
1/2 × 1/3 = 1/6.
Notice that we avoid saying that all we had to do was to multiply the
two numerators and the two denominators to get the correct answer. To
be sure, this procedure yields the right answer, but only as a short cut once
we know how to do the problem. In other words, it is not self-evident
that all we need to do is combine like parts; for if it were, then a similar
result should be true for addition, and it is not.
Notice also that this interpretation of multiplication conforms well with
other results we accept. For example, we know that 1/2 × 1 = 1/2. All
this says in terms of our interpretation is that 1/2 of 1 s. u. is 1/2. As another
example, notice that in an expression such as 2 × 3 = 6, the product is
greater than either of the factors; while in the expression 1/2 × 1/3 = 1/6 the
product is less than either of the factors. Why does this happen? Notice
that 1/2 × 1/3 indicates that we are to take but half of the length 1/3, and clearly
this is less than the length 1/3 itself. A similar interpretation holds for 1/3
of 1/2.
On the other hand, consider 5/3 × 1/4. This product will be in excess of 1/4
since we are taking 5/3 of 1/4. That is, we are dividing 1/4 into 3 equal parts
and then marking off one of these three equal parts five times. This yields
a greater length than that with which we started, as shown in Figure 1.9.
[Figure 1.9: the length 1/4 divided into 3 equal parts, with one of these parts marked off five times]
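The reading of “×” as “of” translates directly into a small sketch. The function name `of` and the sample values (1/2 of 1/3, and 5/3 of 1/4) are our own assumptions chosen for illustration; `fractions.Fraction` from the standard library stands for a length.

```python
from fractions import Fraction


def of(m, n, length):
    """Take m/n 'of' a length: divide it into n equal parts,
    then mark one of those parts off m times."""
    one_part = length / n
    return m * one_part


half_of_third = of(1, 2, Fraction(1, 3))
print(half_of_third)                    # 1/6
print(half_of_third < Fraction(1, 3))   # True: a proper fraction "of" shrinks

five_thirds_of_quarter = of(5, 3, Fraction(1, 4))
print(five_thirds_of_quarter)                   # 5/12
print(five_thirds_of_quarter > Fraction(1, 4))  # True: a top-heavy one stretches
```

The results agree with multiplying numerators and denominators, but here that rule is a consequence of the interpretation rather than its starting point.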
There is a second rather interesting interpretation of multiplication in
terms of areas. We do not give it as the major interpretation because it
does not conform with our interpretation of numbers as being lengths;
rather it views numbers as areas. The idea is this: we may always view the
product of two numbers as an area of a rectangle. For example, 3 × 4
would denote the area of a rectangle whose width was 3 and whose length
was 4. With this in mind, 1/2 × 1/3 could be viewed as the area of the
rectangle whose width was 1/2 and whose length was 1/3. Figure 1.10 puts this
in terms of a picture.
[Figure 1.10: a unit square, 1 s. u. on a side, divided into six equal rectangles, with the rectangle ABCD shaded. The area of ABCD is on one hand 1/2 × 1/3. On the other hand it is 1/6 of the area of the entire unit square; that is, ABCD is one of six equal rectangles, each of which has the dimension 1/2 × 1/3, into which the unit square is divided.]
As a final observation let us again point out that while the number line
affords us a nice way to visualize the rational numbers, the rules for
computation apply to the real world without any regard to the number line.
For example, in terms of distance being speed multiplied by time, notice
that if a particle were to travel at a rate of 1/2 mile per hour for 1/3 hour,
it would travel 1/6 mile.
We can now understand the last of the four basic operations of
arithmetic, division, in terms of its being the inverse of multiplication. For
example, given 2/3 ÷ 5/7 = ?, we may interpret this as 5/7 × ? = 2/3; or 5/7 of ? = 2/3.
In other words, we may view ? as a certain length, divide it into seven equal
parts, and then observe that five of these seven parts add up to 2/3. If five
parts add up to 2/3 then one part is 1/5 of 2/3, or 2/15 (we are assuming that we
already know how to multiply). All seven parts make up ?, hence ? is
7 × 2/15, or ? = 14/15. Figure 1.11 sums this up pictorially.
Figure 1.11
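The "parts" argument can be followed step by step in a sketch, assuming the example is 2/3 ÷ 5/7 (our reading of the text). The function name `divide_by_parts` is hypothetical.

```python
from fractions import Fraction


def divide_by_parts(dividend, m, n):
    """Solve (m/n) of ? = dividend by the parts argument."""
    one_part = dividend / m  # m of the n equal parts make up the dividend
    return n * one_part      # all n parts make up the unknown length


unknown = divide_by_parts(Fraction(2, 3), 5, 7)
print(unknown)                   # 14/15
print(Fraction(5, 7) * unknown)  # 2/3, so 5/7 of the answer is the dividend
```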
In terms of a non-number-line example, suppose a particle traveled 2/3
mile in 5/7 minute and we wanted to compute its speed in miles per minute.
The answer would be 2/3 ÷ 5/7 miles per minute. We could compute it as
follows: If it travels 2/3 mile in five 1/7’s of a minute, then it travels 1/5 of 2/3
mile in 1/7 of a minute; and hence it travels 7 × (1/5 of 2/3) miles in a minute
(since a minute is seven 1/7’s of a minute). In this way, we do exactly what
we performed pictorially in Figure 1.11.
If we do not wish to draw the number line, observe that there are other
ways of figuring out how to solve ? × 5/7 = 2/3. For instance, we may
observe that 5/7 × 7/5 = 1 and that 2/3 of 1 = 2/3. That is, to convert 5/7 into 2/3,
first multiply it by 7/5 (to obtain 1) and then multiply this by 2/3 to obtain 2/3:
(5/7 × 7/5) × 2/3 = 2/3.
This, in turn, says that 14/15 × 5/7 = 2/3; which, in turn, says that 2/3 ÷ 5/7 = 14/15.
This last example used the familiar “invert and multiply” recipe we all
learned for dividing fractions. However, our method yielded the recipe
in a logical way and not as a divinely inspired magic recipe. Just to
“invert and multiply” is not logical. That is, why is it logical that we
invert the divisor and not the dividend; why not the other way around, or
why not invert both? The answer is that the recipe yields the correct
answer. In still other words, if we try to compute 2/3 ÷ 5/7, then
2/3 × 7/5,   3/2 × 5/7,   3/2 × 7/5
are all forms of “invert and multiply,” all of which yield different answers.
How shall we determine which is correct?
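One way to settle the question is to test each candidate against the definition of division as the inverse of multiplication. The sketch below assumes the example is 2/3 ÷ 5/7 (our reading of the text), and the function name `invert_and_multiply_candidates` is hypothetical.

```python
from fractions import Fraction


def invert_and_multiply_candidates(dividend, divisor):
    """The three ways one might 'invert and multiply'."""
    return {
        "invert divisor": dividend * (1 / divisor),
        "invert dividend": (1 / dividend) * divisor,
        "invert both": (1 / dividend) * (1 / divisor),
    }


dividend, divisor = Fraction(2, 3), Fraction(5, 7)
for name, q in invert_and_multiply_candidates(dividend, divisor).items():
    # A correct quotient must give back the dividend when multiplied by the divisor.
    print(name, q, q * divisor == dividend)
# invert divisor 14/15 True
# invert dividend 15/14 False
# invert both 21/10 False
```

Only the candidate that inverts the divisor satisfies ? × 5/7 = 2/3, which is the sense in which the recipe is correct.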
NOTE 1: MORE ON GEOMETRIC ARITHMETIC
If we were asked to pick which of the following words was inappropriate
(arithmetic, geometry, trigonometry, and algebra) we would see that
“algebra” seems to be out of place (arithMETic, geoMETry, trigonoMETry,
algebra). In essence, the first three words are of Greek origin, while
“algebra” is basically of Arabic origin and was not used prior to 800 A.D.
Thus, in 600 B.C. the ancient Greek had considerable mastery of
geometry but none of algebra, if only because it was not yet invented. In spite
of this, however, the ancient Greek was able to obtain many algebraic
results in terms of geometry. We shall illustrate this in stages. Let us
begin with the ordinary concepts of elementary arithmetic. Recall that
the Greek is interpreting numbers as lengths and take into account the five
following considerations.
(1) He had a very simple way of viewing addition. Namely, given two
numbers a and b, he could mark them off consecutively, viewing them
as lengths, along the same line. The total length could then be labeled
a + b (Figure 1.12(a)). Notice that we can easily view 3 + 2 = 5 in
this way (Figure 1.12(b)). Namely, the total length, when a 2-inch
length is marked off on a straight line next to a 3-inch length, is a 5-inch
length.
Figure 1.12: (a) the length a + b; (b) the length 3 + 2
(2) Subtraction was still the inverse of addition. Thus a - b merely
means the length that had to be added to b to yield a. In this context,
b had to be a shorter length than a, just as was the case with tallies;
and also as with tallies, we could view this operation in terms of taking
b away from a (see Figure 1.13).
Figure 1.13 (the length a shown as b + (a - b))
(3) We have already seen that for whole numbers multiplication is rapid
addition, so we should have no trouble in trying to visualize the
multiplication of whole numbers in terms of the number line. However, as
a sidelight, notice that while we have been restricting arithmetic to
whole numbers thus far, there is no need to make this restriction in
terms of lengths. For example, as we have already mentioned, we can
think of lengths which are greater than three units but less than four
units. Consequently, we should not be too surprised that the ancient
Greek devised a way to find the product of any two lengths. If the
Greek had denoted the lengths by a and b and the length of the standard
unit by 1, he could then have drawn two lines through a common
point A. Along one of the lines he could mark off the length of his
standard unit, labeling its point of termination B. On this same line,
immediately adjacent to B, he could then mark off the length b and
label this point of termination C. Then on the other line, starting at A,
he could mark off a length equal to a, labeling the terminal point D.
He could then draw the line BD, and through point C he could draw a
line parallel to BD, meeting the second line at E. Then DE would
represent the length equal to the product of a and b. That is, DE =
a X b. This is diagrammed in Figure 1.14.
Figure 1.14
This result follows from
the properties of similar triangles. Two figures are similar if they have
exactly the same shape. More mathematically, the lengths of the
sides of one are proportional to the lengths of the sides of the other.
Namely, the above diagram yields the fact that
a:1 = DE:b
or
a = DE/b
or
a X b = DE.
We illustrate this result in Figure 1.15 with the example 2 X 3 = 6.
Figure 1.15
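The product construction can be checked numerically by placing the points in coordinates. The following is a minimal sketch under assumptions made here for illustration: A sits at the origin, the first line is the horizontal axis, and the second line leaves A at an arbitrary angle (`angle_deg`, a hypothetical parameter). The function name `construct_product` is likewise invented for this example.

```python
import math

def construct_product(a, b, angle_deg=40.0):
    """Return the length DE produced by the parallel-line construction."""
    # Direction of the second line through A (any angle other than 0 or 180).
    ux = math.cos(math.radians(angle_deg))
    uy = math.sin(math.radians(angle_deg))
    B = (1.0, 0.0)          # standard unit marked off on the first line
    C = (1.0 + b, 0.0)      # length b marked off immediately after B
    D = (a * ux, a * uy)    # length a marked off on the second line
    # E lies on the second line, E = t * (ux, uy), with CE parallel to BD.
    # Parallel means the cross product of (E - C) and (D - B) is zero.
    dx, dy = D[0] - B[0], D[1] - B[1]
    t = (C[0] * dy - C[1] * dx) / (ux * dy - uy * dx)
    E = (t * ux, t * uy)
    return math.dist(D, E)

print(construct_product(2, 3))
```

Changing `angle_deg` moves the picture but not the length DE, which is what the similar-triangle argument predicts.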
(4) We have already shown a method for visualizing a ÷ b if a and b are
natural numbers. However, Figure 1.16 shows a more general technique
that works for any lengths. Namely, we again choose two lines
passing through a common point A. On one of the lines we mark off b
and then a, labeling the terminal points B and C, respectively. Then
on the other line we mark off a length equal to 1, labeling this terminal
point D. We draw BD, and through C we draw a line parallel to BD,
meeting our second line at E. Then DE = a ÷ b. This result follows
from the observation that
DE:1 = a:b.
Had we wished to construct b ÷ a, we need only have reversed the
roles of a and b. Figure 1.17 shows this method by illustrating
12 ÷ 4 = 3.
Figure 1.16
Figure 1.17
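The quotient construction can be checked the same way. This sketch again assumes a coordinate placement chosen only for illustration (A at the origin, the first line horizontal, the second line at a hypothetical angle `angle_deg`); `construct_quotient` is a name invented here.

```python
import math

def construct_quotient(a, b, angle_deg=40.0):
    """Return the length DE = a / b from the parallel-line construction."""
    ux = math.cos(math.radians(angle_deg))
    uy = math.sin(math.radians(angle_deg))
    B = (float(b), 0.0)       # length b marked off on the first line
    C = (float(a + b), 0.0)   # length a marked off right after B
    D = (ux, uy)              # unit length marked off on the second line
    # E = t * (ux, uy) on the second line, with CE parallel to BD.
    dx, dy = D[0] - B[0], D[1] - B[1]
    t = (C[0] * dy - C[1] * dx) / (ux * dy - uy * dx)
    E = (t * ux, t * uy)
    return math.dist(D, E)

print(construct_quotient(12, 4))
```

Swapping the two arguments gives the construction for b ÷ a, as the text notes.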
(5) While it is not one of the four basic operations of arithmetic, the process
known as extracting square roots played a vital role in early Greek
mathematics. Recall that the square root of a number is a number
which when multiplied by itself yields the given number. For example,
the square root of nine is three since the product of three with itself
is nine. We write this as √9 = 3. In general, the numeral √b is
used to denote the fact that
√b X √b = b.
To construct √b, given the length b, the ancient Greek would on the
same line mark off b and then 1. Let us call A the point at which we
begin to mark off b; B the point at which b terminates; and C the point
at which 1 terminates (Figure 1.18). Thus, AC = b + 1. The Greek
then constructed a semicircle with AC as its diameter. At B he erected
a line perpendicular to AC which met the semicircle at D. Then
BD = √b. That is,
BD X BD = b.
Figure 1.18
While it is beyond both our scope and our need to prove this result, it
follows from the fact that triangles ADC, ADB, and DBC are all similar.
In particular, in Figure 1.18, writing h for the length BD, b:h = h:1, or h^2 = b.
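The semicircle construction can also be verified with a little coordinate arithmetic. In this sketch, which is an illustration rather than part of the original argument, A is assumed to sit at 0 and C at b + 1 on a horizontal axis; `semicircle_height` is a hypothetical name for the length BD.

```python
import math

def semicircle_height(b):
    """Height BD of the semicircle on diameter AC = b + 1, measured at B."""
    radius = (b + 1) / 2        # AC is the diameter
    center_x = radius           # A is at x = 0, C at x = b + 1
    offset = b - center_x       # horizontal distance from the center to B
    # D lies on the circle, so offset^2 + BD^2 = radius^2.
    return math.sqrt(radius**2 - offset**2)

print(semicircle_height(9))
```

Expanding radius^2 - offset^2 by hand gives exactly b, which agrees with h^2 = b.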
But the ancient Greek’s use of geometry was not restricted here to its
role in ordinary arithmetic. He also used it to show such algebraic results
as the distributive property,
a X (b + c) = (a X b) + (a X c)
and others such as
(a + b) X (c + d) = (a X c) + (a X d) + (b X c) + (b X d)
(a + b)^2 = a^2 + 2ab + b^2.
These results followed from the realization that the product of two lengths
could be viewed as the area of a rectangle. In other words, 3 X 4 may be
viewed as the area of a rectangle whose dimensions are 3 units by 4 units.
In this respect, treating numbers as lengths, the three equations above may
be interpreted as shown in Figure 1.19.
Figure 1.19
Observe that not only are the algebraic results corroborated by the
geometric interpretation, but they also seem more intuitive and less
ominous this way. Along these same lines, one could use three-dimensional
geometry to help visualize the result
(a + b)^3 = a^3 + 3a^2b + 3ab^2 + b^3.
We need only visualize a cube whose side is (a + b). The volume of the
cube would be given by (a + b)^3; that is, (a + b) X (a + b) X (a + b).
On the other hand, the volume is made up of eight pieces, one of which
has dimensions a by a by a; three of which have dimensions a by a by b;
three of which have dimensions a by b by b; and the eighth having
dimensions b by b by b. This is pictured in Figure 1.20.
Figure 1.20
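The count of eight pieces can be reproduced by letting each of the three edges of the cube contribute either a or b. The sketch below assumes the sample values a = 2 and b = 3 purely for illustration; `cube_pieces` is a name made up for this example.

```python
from collections import Counter
from itertools import product

def cube_pieces(a, b):
    """Tally the boxes obtained by cutting each edge of the cube into a and b."""
    # Each piece takes either a or b along each of the three edges.
    return Counter(tuple(sorted(dims)) for dims in product((a, b), repeat=3))

pieces = cube_pieces(2, 3)
total_volume = sum(x * y * z * count for (x, y, z), count in pieces.items())
print(pieces)
print(total_volume)
```

The tally shows one a-by-a-by-a box, three a-by-a-by-b boxes, three a-by-b-by-b boxes, and one b-by-b-by-b box, and their volumes add up to the volume of the whole cube.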
Geometry can be applied to our discussion of rational numbers to shed
further light on the subject. In particular, we refer to the concept of the
number line. As the name implies, the number line is a product of the
union of arithmetic and geometry.
By way of illustration, suppose we are given the line segment AB and
wish to divide it into five pieces of equal length.
We first draw any other line at all through the point A.
On this new line, starting at A, we mark off any chosen length five times,
labeling the last point of division C, for want of anything better. Virtually
by default (that is, because we constructed it that way), the segment AC is
divided into 5 parts of equal length. Unfortunately, however, it is AB, not AC,
which we wished to divide into 5 pieces of equal length. To this end, we
now draw the straight line which joins B and C.
We then draw lines through the other points of division on AC parallel
to BC. We label the points at which these lines intersect AB by D, E,
F, and G, respectively.
Then the lengths of the segments AD, DE, EF, FG, and GB are all equal.
That is, we have succeeded in dividing the segment AB into 5 pieces of
equal length. Note the properties of similar triangles that are used in
this example.
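The subdivision can be imitated in coordinates by actually intersecting the parallel lines with AB. This is only a sketch: the function name `divide_segment`, the `step` vector (the chosen length and direction marked off along the auxiliary line), and the sample endpoints are all assumptions introduced here, and `step` must not point along AB.

```python
def divide_segment(A, B, parts=5, step=(1.0, 2.0)):
    """Return the interior division points of AB found by the parallel-line method."""
    ax, ay = A
    bx, by = B
    # C is the last mark on the auxiliary line through A.
    cx, cy = ax + parts * step[0], ay + parts * step[1]
    ux, uy = bx - ax, by - ay      # direction of AB
    vx, vy = bx - cx, by - cy      # direction of BC (and of every parallel)
    det = ux * (-vy) - (-vx) * uy
    points = []
    for k in range(1, parts):
        mx, my = ax + k * step[0], ay + k * step[1]   # k-th mark on AC
        # Solve A + s*(B - A) = M + r*(B - C) for s by Cramer's rule.
        s = ((mx - ax) * (-vy) - (-vx) * (my - ay)) / det
        points.append((ax + s * ux, ay + s * uy))
    return points

print(divide_segment((0.0, 0.0), (10.0, 0.0)))
```

For a segment of length 10 the four interior points fall at distances 2, 4, 6, and 8 from A, whatever length is marked off on the auxiliary line.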
NOTE 2 — A BRIEF GLANCE AT NUMBER THEORY
The ancient Greek was quite captivated by the topic of divisibility and
proceeded to subdivide the natural numbers into various categories in
terms of their divisibility properties. We shall have more to say about
this in a later chapter, but for now we shall restrict this topic to those
points that will be of computational value to us in our present investigation
of the rational numbers. In particular, we shall discuss here the concepts
of prime and composite numbers.
To begin with, the Greek recognized that 1 denoted a rather special
natural number which he called a unit. For any other natural number,
he noticed two possibilities. Namely, since n = n X 1, he knew that every
natural number greater than 1 had at least two natural numbers as
divisors (itself and 1); he knew also that some natural numbers had only these
two natural numbers as divisors. For example,
2 = 2 X 1 represents the only way two can be the product of two

natural numbers.
3 = 3 X 1 represents the only way three can be the product of two
natural numbers.
5 = 5 X 1 represents the only way five can be the product of two
natural numbers.
On the other hand, for example, while we may write 4 = 4 X 1; 6 =
6 X 1; 8 = 8 X 1; 9 = 9 X 1; and so on, the fact remains that 4 = 2 X 2;
6 = 3 X 2; 8 = 4 X 2; 9 = 3 X 3.
Numbers such as 2, 3, 5, 7, and 11, that had only themselves and 1 as
natural number divisors, were called prime numbers; the other numbers,
such as 4, 6, 8, 9, and 10, were called composite numbers. Thus, the Greek
partitioned the natural numbers into three mutually exclusive (that is,
nonoverlapping) classes:
Units: 1
Primes: 2, 3, 5, 7, 11, 13, . . .
Composites: Everything else; that is, 4, 6, 8, 9, 10, 12, . . . .
He viewed the prime numbers as being the atoms, so to speak, of the
natural numbers, in the sense that they were indivisible, and he devoted
much time and effort to their study. It is interesting to note that the
investigation started by the ancient Greeks is still continuing in various
forms today, with many results still undetermined. For example, the
ancient Greek tried to find a simple, compact recipe for methodically
telling whether a given number was prime. While he found some techniques,
he failed to find one that did the job in the way he wished. Today, some
2500 years later, we still have not been able to find the right recipe.
With regard to the last point, let us hasten to say that the ancient Greek
made considerable progress. He found a type of recipe for finding primes,
known as the sieve method. In this technique, we effectively sieve the
primes through a special “strainer.” By way of illustration, suppose we
wish to find all the primes which are less than, say, fifty. We would then
write the first fifty natural numbers:
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50.
Since, by definition, 1 is not a prime (it is a unit), we strike it from our
list. The next number, 2, we circle because by definition it must be a
prime. That is, 2 cannot be divisible by any other natural number greater
than one since 2 is the first such natural number. Now we strike out
(sieve) all other members of the list that have 2 as a divisor. These include
such numbers as 4, 6, 8, and 10. These numbers cannot be primes since
they are greater than two but have two as a divisor. At this point, the
list now looks like this (circled numbers are shown in parentheses):
(2) 3 5 7 9 11 13 15
17 19 21 23 25 27 29 31 33
35 37 39 41 43 45 47 49.
Since 3 is now the next uncircled member of the list, we may circle it as
being prime; for if it were not prime, it would have been divisible by two,
and already struck from the list. We may now delete all other multiples
of three from the remaining list since these cannot be primes, by virtue of
being divisible by three. The list now looks like this:
(2) (3) 5 7 11 13 17 19 23
25 29 31 35 37 41 43 47 49.
We could scratch out the members rather than delete them. This would
emphasize the fact that we did not need our numeral system to make this
sieve; and this is important since the Greeks did not have our numeral
system. Without deleting, all we would have to do after circling 2 would
be to scratch out every second member from that point on. Then once we
circled 3 we would scratch out every third member which had not yet been
scratched. Continuing in this way, either by deleting or scratching out,
we eventually obtain
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47
as the only prime numbers which do not exceed 50. Summarized in a form
that emphasizes our “sieve” (circled primes are shown in parentheses; every
other entry is scratched out), we have:
  1   (2)  (3)   4   (5)   6   (7)   8    9   10
(11)  12  (13)  14   15   16  (17)  18  (19)  20
 21   22  (23)  24   25   26   27   28  (29)  30
(31)  32   33   34   35   36  (37)  38   39   40
(41)  42  (43)  44   45   46  (47)  48   49   50
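The sieve lends itself to a short program. The sketch below is one possible rendering, not the only one: `sieve_primes` is a name chosen here, "circling" a number is modeled by appending it to a list, and "scratching out" by setting a flag.

```python
def sieve_primes(limit):
    """Return the primes that do not exceed limit, by striking out multiples."""
    struck = [False] * (limit + 1)
    primes = []
    for n in range(2, limit + 1):       # 1 is a unit, so we start at 2
        if struck[n]:
            continue                    # already scratched out
        primes.append(n)                # n survived every earlier pass: circle it
        for multiple in range(2 * n, limit + 1, n):
            struck[multiple] = True     # scratch out every n-th member after n
    return primes

print(sieve_primes(50))
```

Note that the inner loop never divides; it only counts off every n-th member, which matches the remark that the sieve does not depend on our numeral system.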
While the sieve method works, it is not the type of recipe we ultimately
want because it involves testing all smaller numbers when we wish to test
a particular number. To put it in even more elementary terms, an obvious
test to see if a number is prime is to divide it by 2, 3, 4, 5, and so on, to
see if the quotient is ever a natural number. But this is not a “neat” recipe.
In more professional language, we can say that we want recipes that are
closed in form rather than open.
To illustrate this from a different view, consider the problem of finding
the sum of the first n natural numbers. To this end, we could write
1 + 2 + 3 + . . . + n
and compute the sum; but observe how awkward this would be if we wished
to find the sum of the first thousand natural numbers this way. We
would have to compute
1 + 2 + 3 + . . . + 1000
and this is, at best, tedious. Such a procedure is called an open form since,
in effect, we must find the answers to the previous cases before solving the
one we wish. Now, without going into the details, it can be shown that
1 + 2 + . . . + n = n(n + 1)/2.
That is, the sum of the first n natural numbers can be found simply by
multiplying the last number in the sum by its successor and dividing by
two. For example, to compute 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10,
we need only form the number represented by (10 X 11)/2 = 110/2 = 55.
A direct check verifies this result. While this is not a proof, observe that
one can visualize the above result by observing that we can form five pairs,
the sum of each of which is eleven. That is,
(1 + 10) + (2 + 9) + (3 + 8) + (4 + 7) + (5 + 6) = 5 X 11 = 55.
The formula is what we call closed form, for we can now find the sum
of the first thousand natural numbers directly without knowledge of
previous sums merely by computing (1000 X 1001)/2 = 1,001,000/2 =
500,500 (that is, we have 500 pairs, each with sum 1001). This is the type
of recipe we would like to find for determining primes.
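The contrast between the two forms can be made concrete in a few lines. In this sketch the names `sum_open` and `sum_closed` are invented for illustration: the first adds term by term, the second applies n(n + 1)/2 directly.

```python
def sum_open(n):
    """Open form: accumulate 1 + 2 + ... + n one term at a time."""
    total = 0
    for k in range(1, n + 1):
        total += k          # each step depends on the previous partial sum
    return total

def sum_closed(n):
    """Closed form: multiply the last number by its successor and halve."""
    return n * (n + 1) // 2

print(sum_open(10), sum_closed(10))
print(sum_closed(1000))
```

The open form takes n additions; the closed form takes one multiplication and one division no matter how large n is.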
Even without knowing the ultimate formula, there are still certain
shortcuts that we may use in testing whether a number is prime. As an
example of this, let us observe the following facts.
(1) Any composite number has prime divisors. (We leave this to the
reader’s intuition. Roughly, we demonstrate this by breaking down the
divisors into their indivisible parts. For example, 24 = 8 X 3 and
8 = 2 X 2 X 2. Thus, 24 = 2^3 X 3, and both 2 and 3 are primes.)
(2) If the natural number n is the product of the two natural numbers r
and s, and if r exceeds √n, then s must be less than √n. For example,
if 36 = r X s and we know that r and s are unequal, then it is impossible
for both r and s to exceed 6 or for both to be less than 6.
This result has a simple interpretation in terms of geometry. Think
of a 6 by 6 inch square, as shown in Figure 1.21. Its area is 36 square
inches. Now think of a rectangle of dimensions r by s where both
r and s are less than 6. This rectangle fits inside the square; hence,
it has a smaller area. That is, r X s would be less than 36. A similar
argument holds if both r and s exceed 6.
Figure 1.21
Combining (1) and (2), we see that to test a given natural number for
being a prime, we need divide by only those primes which do not exceed
the square root of the number. If none of these numbers divides the given
one, it is a prime. For example, consider 149. Since 12 X 12 = 144 and
13 X 13 = 169, we need only test primes that do not exceed 12; that is, 2,
3, 5, 7, and 11. Since none of these primes is a divisor of 149, 149 is itself
prime. In other words, if 13 were a factor, a prime less than 13 would also
have to be a factor; but we have just seen that this can’t happen since 2, 3,
5, 7, and 11 are the only primes less than 13.
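The square-root shortcut translates directly into a test. The sketch below is a simplification made here for brevity: it tries every natural number from 2 up to the whole-number part of the square root rather than only the primes, which does some redundant work but gives the same verdict. The name `is_prime` is an assumption of this example.

```python
import math

def is_prime(n):
    """Trial division by candidates that do not exceed the square root of n."""
    if n < 2:
        return False                      # 1 is a unit, not a prime
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False                  # found a divisor, so n is composite
    return True

print(is_prime(149))
```

The bound must include the square root itself; otherwise a number such as 49 = 7 X 7 would slip through.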
There is a difference between understanding concepts and being able to
master techniques for measuring the concepts. Certainly, the concept of
a prime number does not hinge on how well we master recipes such as the
one above. However, if extensive work is going to be done with primes,
recipes that enable us to test more efficiently are important. Our main
aim is emphasizing the concepts; but frequently this can be done effectively
only by stressing certain techniques.
From our immediate point of view, the most significant use of prime
numbers hinges on the result that every composite number can be written
as a product of powers of distinct primes in a unique way, not including, of
course, the order in which the factors may be written. If we agree to write
the factors in nondecreasing order, that is, by size, then the representation
is unique. This is another reason why 1 is not considered a prime.
Namely, multiplying by 1 does not change a product; hence, we could
“throw in” as many factors of 1 as we wished, and the factorization would
then not be unique. For example, 24 may be written as a product
in several ways involving natural numbers. Thus,
24 = 24 X 1 = 12 X 2 = 8 X 3 = 6 X 4 = 4 X 2 X 3.
However, when written in the form of powers of distinct primes, we can
only have
24 = 2^3 X 3.
As a second example,
720 = 2^4 X 3^2 X 5.
Here, again, we are faced with the problem of distinguishing between
recognizing the right answer when we see it and being able to find the right
answer in the first place. The recipe here is not difficult to understand. For
example, since the units digit in 720 is even, 720 is divisible by 2, and the quotient
is 360. This too is even, so we may again divide by 2, obtaining 180 as a
quotient. Again, we may divide by 2, obtaining 90; and again we may
divide by 2 to obtain 45. The number 45 is not divisible by 2, but it is
divisible by 3, yielding a quotient of 15, and 15 = 3 X 5. In terms of a
chart (read from the bottom up):
      1
5 |   5
3 |  15
3 |  45      720 = 2 X 2 X 2 X 2 X 3 X 3 X 5.
2 |  90
2 | 180
2 | 360
2 | 720
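The repeated-division chart can be mirrored in code. This is a minimal sketch with a name, `prime_factors`, chosen here; it keeps dividing by the smallest divisor that still works, so the factors come out in nondecreasing order.

```python
def prime_factors(n):
    """Return the prime factors of n (n >= 2) in nondecreasing order."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:     # divide out this prime as often as possible
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)           # whatever remains is itself prime
    return factors

print(prime_factors(720))
```

A composite trial divisor such as 4 can never succeed here, because its prime factors have already been divided out by the time it is tried.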
The above result is known as the unique factorization theorem and
illustrates the role of prime numbers as the “atoms” of our number system.
For example, once 720 is written as 2^4 X 3^2 X 5, we have a wealth of
information about the divisors of 720. In particular, we know that any
number having prime divisors other than 2, 3, or 5 cannot be a divisor of
720. Moreover, of the remaining numbers, any one that contains either
more than four factors of 2, more than two factors of 3, or more than one
factor of 5 cannot be a divisor of 720. Expressed symbolically, a divisor
of 720 must be of the form
2^m X 3^n X 5^r
where m must be either 0, 1, 2, 3, or 4; n must be either 0, 1, or 2; and
r must be either 0 or 1. That is, there are 5 choices for m, 3 for n, and
2 for r.
While it is not important for our purposes, the interested student should
have no great difficulty in showing that there are precisely 30 (that is, 5 X
3 X 2) such divisors and that 6 of these are given by
2^0 X 3^0 X 5^0 = 1; 2^0 X 3^0 X 5^1 = 5; 2^0 X 3^1 X 5^0 = 3;
2^0 X 3^1 X 5^1 = 15; 2^0 X 3^2 X 5^0 = 9; 2^0 X 3^2 X 5^1 = 45.
We then get 6 more divisors each by replacing 2^0 in the above six numerals
by 2^1, 2^2, 2^3, and 2^4. Leaving the details to the reader, we see that these
other divisors are
2 10 6 30 18 90
4 20 12 60 36 180
8 40 24 120 72 360
16 80 48 240 144 720.
(We shall say more about such computational techniques in Part III of
Chapter Two.) In summary, then, the 30 divisors of 720 are given by
1 3 5 9 15 45
2 6 10 18 30 90
4 12 20 36 60 180
8 24 40 72 120 360
16 48 80 144 240 720.
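The counting argument (5 choices for m, 3 for n, 2 for r) can be carried out mechanically. In this sketch the function name `divisors_from_exponents` and the dictionary layout are assumptions made for illustration; the dictionary maps each prime to its exponent in the factorization.

```python
from itertools import product

def divisors_from_exponents(prime_powers):
    """List every divisor of the number whose factorization is given.

    prime_powers maps each prime to its exponent, e.g. {2: 4, 3: 2, 5: 1} for 720.
    """
    primes = list(prime_powers)
    # For each prime, the exponent in a divisor may be 0, 1, ..., up to its maximum.
    exponent_choices = [range(prime_powers[p] + 1) for p in primes]
    result = []
    for exponents in product(*exponent_choices):
        d = 1
        for p, e in zip(primes, exponents):
            d *= p ** e
        result.append(d)
    return sorted(result)

divisors_720 = divisors_from_exponents({2: 4, 3: 2, 5: 1})
print(len(divisors_720))
print(divisors_720)
```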
In a similar way, we can also find the largest number that is a divisor of
the three numbers 420, 504, and 2100. We could make a list as above of
all the divisors of 420, a list containing all the divisors of 504, and,
thirdly, a list containing all the divisors of 2100. Then we would merely
have to look at the three lists and extract the largest number appearing
on all three lists. However, using unique factorization, there is an even
better way. We write each of the numbers as a product of powers of
distinct primes. Thus,
420 = 2^2 X 3 X 5 X 7
504 = 2^3 X 3^2 X 7
2100 = 2^2 X 3 X 5^2 X 7.
Now looking at the factorizations, we see at once that any divisor must
contain no prime factors other than 2, 3, 5, or 7. Moreover, since we want
a number that is a divisor of all three numbers, we cannot have an exponent
in excess of the smallest exponent of that prime among the three numbers.
For example, a common divisor could not contain 2^3 as a divisor
since this is more factors of 2 than either 420 or 2100 contains. Moreover,
our common divisor can contain no factors of 5 since 5 is not a prime
factor of 504. Continuing in this way, we construct
2^2 X 3^1 X 5^0 X 7^1 = 4 X 3 X 7 = 84.
Thus, 84 is the largest number that is a divisor of 420, 504, and 2100.
In a similar way, we could find the least number divisible by each of the
numbers 420, 504, and 2100. We would proceed precisely as above, only
we would now choose the largest rather than the smallest exponent which
occurred for each prime. Thus,
2^3 X 3^2 X 5^2 X 7^1 = 8 X 9 X 25 X 7 = 12,600.
Terminology-wise, we call 84 the greatest common divisor (gcd) of 420,
504, and 2100, while 12,600 is called the least common multiple (lcm) of
420, 504, and 2100.
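The smallest-exponent and largest-exponent rules can be sketched as follows. The helper names `factor_counts` and `gcd_and_lcm` are invented for this illustration; the first tabulates the exponent of each prime, the second applies the two rules prime by prime.

```python
from collections import Counter

def factor_counts(n):
    """Map each prime divisor of n to its exponent."""
    counts = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:
            counts[d] += 1
            n //= d
        d += 1
    if n > 1:
        counts[n] += 1
    return counts

def gcd_and_lcm(numbers):
    """Return (gcd, lcm) using smallest and largest exponents of each prime."""
    tables = [factor_counts(n) for n in numbers]
    primes = set().union(*tables)
    g = m = 1
    for p in primes:
        exps = [t[p] for t in tables]   # a Counter reports 0 for an absent prime
        g *= p ** min(exps)             # smallest exponent for the gcd
        m *= p ** max(exps)             # largest exponent for the lcm
    return g, m

print(gcd_and_lcm([420, 504, 2100]))
```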
The discussion in the last few pages should show that the ancient
Greeks’ knowledge of divisibility transcended the use of geometry alone.
More important, in terms of this section, the lcm can be used to find common
denominators. For example, had we been called upon to form the sum
of 1/420, 1/504, and 1/2100, the above procedure would have told us to
choose 1/12,600 as the unit of common denomination; that is,
1/420 = 30/12,600
1/504 = 25/12,600
1/2100 = 6/12,600.
Hence
1/420 + 1/504 + 1/2100 = 30/12,600 + 25/12,600 + 6/12,600
= 61/12,600.
Further practice is left for the exercises.
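The common-denominator computation can be checked with exact fractions. This sketch uses Python's standard `fractions`, `functools`, and `math` modules; the helper `least_common_multiple` is a name introduced here and builds the lcm from pairwise gcds rather than from prime factorizations.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def least_common_multiple(numbers):
    """lcm of several natural numbers, built up two at a time."""
    return reduce(lambda a, b: a * b // gcd(a, b), numbers)

denominators = [420, 504, 2100]
common = least_common_multiple(denominators)
# How many units of 1/common each fraction 1/d is worth.
numerators = [common // d for d in denominators]
total = Fraction(sum(numerators), common)

print(common, numerators, total)
```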
Exercises
1. Using 1 to denote a length of one inch, use the type of techniques
described in this section to construct the lengths named by the
following.
(a) 4 + 3 (e) f - i (i)|Xi (m) V i
(b) f + \ (f) f — I (j) 4 -i- 3 (n) \ / 7
(c) f + \ (g) 4 X 3 (k) f 4- \ (o) V f
(d) 6 - 3 (h) f X * (1) f * \ (p ) V 2
2. Explain the conceptual difference between the expressions f and
3 X i in terms of lengths; and then explain why we say th a t f =
3 X
3. In terms of m/n = m X 1/n, explain why it is natural to call m the
numerator and n the denominator of the common fraction m/n.
4. Show by an appropriate use of rectangles and their areas why
2 X (3 + 4 + 5) = (2 X 3) + (2 X 4) + (2 X 5).
5. Using a = 2 and b = 3, verify in this case that the recipes (a + b)^2 =
a^2 + 2ab + b^2 and (a + b)^3 = a^3 + 3a^2b + 3ab^2 + b^3 are obeyed.
6. Explain the relationship between the length %and the point
7. Write each of the following common fractions as an equivalent mixed
number.
(a) i (b) V- (c) V (d) W (e) m
8. State a division problem involving only natural numbers, to which each
of the common fractions listed in Exercise 7 denotes a correct answer.
9. Represent each of the following mixed numbers as an equivalent
common fraction.
(a) 2* (b) 7 i (c) I l f (d) 321*
10. In terms of lengths, explain why it is a misnomer to refer to f as
improper.
11. Add the following; ttVo-
1.5 DECIMAL FRACTIONS: PLACE VALUE REVISITED

Thus far, while all of our discussion has been about rational numbers, all
of our observations have been in terms of common fractions. In short,
we have once again become embroiled in the war of number-versus-numeral.
The point is that rational numbers would have existed even if
common fractions had never been invented. Moreover, other numerals
could be invented to denote rational numbers. Our aim now is to introduce
another type of numeral called the decimal fraction. Decimal fractions
afford an interesting way of introducing the concept of irrational numbers.
To begin with, we have commented on the fact that the length called
1 s. u. was arbitrarily chosen. That is, we could have chosen any length
and called it 1 s. u. However, once we chose this length, no other length
can also be called 1 s. u., since ambiguity would arise. For example, we
could have chosen the length which we named 1/5 to be the length of the
standard unit, but since we had already chosen a length five times this as
our standard unit, we could not call this other length one standard unit
also, without encouraging misinterpretation. In this section we would like
to investigate the idea of a defined standard unit from which to work in
terms of place-value. Let us carry out the investigation using our
monetary system as an example.
Suppose that we have chosen the $1 bill as our “standard unit.”
Certainly this was not mandatory, as witnessed by the fact that we have dimes
and pennies. In other words, we could have chosen the penny as our
standard unit, in which case the dime would still be equivalent to ten
pennies and the dollar equivalent to one hundred pennies. That is, the
monetary system would still be the same but there would be a different
standard unit. The important thing is that we do not want the penny
and the dollar called by the same name since they have different values.
The above example does not hinge on our monetary system as much as it
depends on place-value. In usual place-value notation, we have already
observed that as we move from right to left, we multiply the value of each
column by ten to get the value of the next column. That is,
        X10      X10     X10    X10
10,000  ←  1000  ←  100  ←  10  ←  1.
Inversely, however, it is easy to see that as we move from left to right we
divide the value of each column by ten (that is, we multiply by one-tenth)
to obtain the value of the next column. Thus,
       X1/10    X1/10   X1/10  X1/10
10,000  →  1000  →  100  →  10  →  1.
With this in mind, if we wish to invent a unit smaller than the standard
unit, we need only continue our place-value system from left to right, each
time multiplying by one-tenth. That is,
     X1/10   X1/10  X1/10  X1/10   X1/10
1000  →  100  →  10  →  1  →  1/10  →  1/100.
The important thing is that no matter which column we wish to view as
being the standard unit, the structure of place-value remains the same,
since we are still trading ten of one denomination for one of the next higher
denomination. For example, if we had
1000   100   10   1   1/10   1/100   1/1000
   6     9    4   2      7       8        5
we would read this as
6(1000) + 9(100) + 4(10) + 2(1) + 7(1/10)
+ 8(1/100) + 5(1/1000) or 6,942,785(1/1000).
That is, there are many different ways (all of which are equivalent) of
expressing the same number unambiguously in terms of place-value.
However, it is important to notice that it was because the columns were
explicitly labeled that no misinterpretation or ambiguity occurred. For
example, with the labels omitted, we see only the sequence of digits
6942785
from which we have no way of knowing which column represents the
standard unit. In other words, with no labels we cannot distinguish between
such numbers as
6(1000) + 9(100) + 4(10) + 2(1) + 7(1/10) + 8(1/100) + 5(1/1000);
6(100) + 9(10) + 4(1) + 2(1/10) + 7(1/100) + 8(1/1000) + 5(1/10,000);
6(100,000) + 9(10,000) + 4(1000) + 2(100) + 7(10) + 8(1) + 5(1/10).
What we do know, however, is that we have 6,942,785 of a particular
denomination represented by the column furthest to the right, which is not
necessarily the standard unit.
Thus, it becomes clear that we need some sort of symbolism to tell us
which column represents our standard unit. Just as with the case of zero
(place holder), it is the concept and not the actual symbol that is important.
A convenient place for a symbol might be between the 1 and 1/10 columns
so that, in a way, we separate the whole numbers from the fractions. In
terms of the above example, we might write
6942*785 or 6942:785 or 6942.785.
(Historically the last notation was adopted, and so was born the decimal
point.)
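The ambiguity of an unlabeled digit string can be demonstrated by computing its value for different choices of the units column. In this sketch `place_value` and its `unit_index` parameter (the position, counted from the left starting at 0, of the digit that sits in the units column) are hypothetical names introduced only for illustration.

```python
from fractions import Fraction

def place_value(digits, unit_index):
    """Value of a digit string when digits[unit_index] is in the units column."""
    total = Fraction(0)
    for i, ch in enumerate(digits):
        # Each step to the right multiplies the column value by one-tenth.
        total += int(ch) * Fraction(10) ** (unit_index - i)
    return total

for unit_index in (3, 2, 5):
    print(unit_index, place_value("6942785", unit_index))
```

The three calls correspond to the three readings listed above: the same seven digits name three different numbers until the units column is marked.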
In the same way that the recipes of arithmetic work for the whole
numbers, so also do they work for decimals. For example, suppose we wish to
multiply 3.142 by 2.13. Except for the decimal points, our claim is that
we do this problem in exactly the same way as we would do the problem
3142 X 213.
   3142         3.142
  X 213        X 2.13
   9426          9426
  3142          3142
 6284          6284
 669246       6.69246
We are usually mechanically taught to count off the decimal places, in this
case five, to obtain the correct answer. Yet such a result need not be
memorized. For, in the usual fraction notation,
3.142 = 3 + 1/10 + 4/100 + 2/1000
= 3142/1000.
Similarly,
2.13 = 2 + 1/10 + 3/100
= 200/100 + 10/100 + 3/100
= 213/100.
Therefore,
3.142 X 2.13 = 3142/1000 X 213/100 = (3142 X 213)/(1000 X 100).
The numerator accounts for the sequence of digits, and the denominator
accounts for moving the decimal point. That is,
1000 X 100 = 10 X 10 X 10 X 10 X 10.
In summary,
3.142 X 2.13 = (3142 X 213) X 1/10 X 1/10 X 1/10 X 1/10 X 1/10.
Each time we multiply by 1/10 (divide by 10), we move the decimal point
one column to the left.
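The recipe of multiplying the digit sequences and then moving the decimal point can be sketched with Python's standard `decimal` module. The function name `multiply_decimals` is an assumption of this example, and the inputs are taken as strings so that the number of decimal places can be counted directly.

```python
from decimal import Decimal

def multiply_decimals(x, y):
    """Multiply two decimal numerals given as strings, e.g. "3.142" and "2.13"."""
    places = 0
    whole = []
    for s in (x, y):
        int_part, _, frac_part = s.partition(".")
        places += len(frac_part)                  # decimal places to count off
        whole.append(int(int_part + frac_part))   # the bare sequence of digits
    digits_product = whole[0] * whole[1]
    # Each factor of 1/10 moves the decimal point one column to the left.
    return Decimal(digits_product).scaleb(-places)

print(multiply_decimals("3.142", "2.13"))
```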
Here again the number line affords us another rather intuitive way of
visualizing the proper placement of the decimal point. For, in terms of
lengths, 3.142 is shorter than 4 s. u. but longer than 3 s. u. That is, 3.142
lies between 3 and 4. In a similar way, we find that 2.13 lies between 2
and 3. Thus,
3.142 X 2.13 lies between 3 X 2 = 6 and 4 X 3 = 12.
It is nice to notice here the convenience of the number line in discussing
why 3.142 X 2.13 must lie between 6 and 12. Namely, a glance at
Figure 1.22 shows that 3.142 X 2.13 is the area of a rectangle AJGI while 6
and 12 represent the areas of rectangles ABKL and ACDH, respectively.
Figure 1.22
Since AJGI is sandwiched between ABKL and ACDH, it is easy to see
that its area must exceed the area of the smallest rectangle and in turn be
exceeded by the area of the largest rectangle.
Defining division to be the inverse of multiplication, we again see that
the usual division recipe applies virtually without variation.
Our last remark now enables us to write all rational numbers in decimal
form (called decimal fractions). For example, suppose we wish to write
1/4 as a decimal fraction. We have already seen that 1/4 means 1 ÷ 4.
Using the usual recipe, we have
    0.2500 . . .
4 | 1.0000
      8
      20
      20
       00 .
Thus, we see that in decimal form 1/4 becomes 0.25. As a quick check,
we observe that
0.25 = 2(1/10) + 5(1/100)
= 2/10 + 5/100
= 20/100 + 5/100
= 25/100
= 1/4.
Keep in mind that both 1/4 and 0.25 are names (numerals) for the rational
number 1 ÷ 4. Of course, we could write 0.250, 0.2500, 0.25000, and so on,
as synonyms for 0.25. In other words, we could write this decimal in a
nonterminating way merely by agreeing to write 0’s forever. Rather than
do this, we prefer to write 0.25 and call it a terminating decimal. Among
the rational numbers that are represented by terminating decimals are
1/2 = 0.5
1/8 = 0.125
3/8 = 0.375
5/4 = 1.25.
Whereas it may appear at first glance that most rational numbers will
lead to terminating decimals, such is not the case. In fact, if the
denominator has any prime factor other than 2 and 5, the resulting
decimal will not terminate. Rather than offer a
rigorous proof at this time, we merely point out that in decimal form we
are restricted to the denominators 1, 10, 100, 1000, . . . . (We assume here
that all fractions are in lowest terms. That is, we choose as a name for the
particular rational number the fraction in which numerator and denominator
share no common factor other than 1. For example, we would write
1/100 rather than 3/300.) But 10 = 5 X 2; 100 = 5 X 2 X 5 X 2; 1000 =
5 X 2 X 5 X 2 X 5 X 2; and so on. Thus, as long as the denominators
contain nothing more than factors of 2 and 5, we can build tens as we need
them. By way of illustration, consider 1/40. Since 40 = 2 X 2 X 2 X 5,
we have 1/40 = 1/(2 X 2 X 2 X 5). Our denominator has two 2’s that
are not matched by 5’s. So we must augment our denominator with two
more factors of five. However, whatever we do to the denominator, as
far as multiplication is concerned, must also be done to the numerator.
In this case, we obtain
1/40 = (1 X 5 X 5)/(2 X 2 X 2 X 5 X 5 X 5) = 25/1000 = 0.025.
As a second example,
1/16 = 1/(2 X 2 X 2 X 2)
= (1 X 5 X 5 X 5 X 5)/(2 X 2 X 2 X 2 X 5 X 5 X 5 X 5 )
= 625/10,000 = 0.0625.
On the other hand, if the denominator contains factors other than 2 or 5
we cannot build our tens since the only digits (excluding 1) that are divisors
of 10 are 2 and 5. For instance, 1/6 = 1/(2 X 3) = 5/(2 X 3 X 5); but
now we are at an impasse because there is nothing we can do in terms of
multiplying by a whole number to convert the 3 in the denominator into
a 10.
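The "building tens" recipe can be sketched in a few lines of Python. This is only an illustration under our own naming (the function terminating_form is hypothetical, not part of the text): it strips the 2's and 5's from the reduced denominator, and if nothing else is left it scales numerator and denominator up to a power of ten.

```python
from fractions import Fraction

def terminating_form(numerator, denominator):
    """Return (scaled_numerator, power_of_ten) if the fraction terminates
    in base ten, or None if the denominator has a prime other than 2 or 5."""
    frac = Fraction(numerator, denominator)  # Fraction reduces to lowest terms
    d = frac.denominator
    twos = fives = 0
    while d % 2 == 0:
        d //= 2
        twos += 1
    while d % 5 == 0:
        d //= 5
        fives += 1
    if d != 1:
        return None  # some factor other than 2 or 5 remains; no tens can be built
    k = max(twos, fives)                 # number of tens we need
    scale = 10**k // frac.denominator    # the unmatched 2's or 5's we must supply
    return frac.numerator * scale, 10**k

print(terminating_form(1, 40))  # (25, 1000), that is, 0.025
print(terminating_form(1, 16))  # (625, 10000), that is, 0.0625
print(terminating_form(1, 6))   # None
```

The three calls reproduce the examples 1/40, 1/16, and 1/6 worked out above.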
If the above comment seems too difficult to grasp intuitively, think of
the monetary example wherein we can purchase a certain item at the price
of 3 for $1. In this case, each item costs 1/3 of a dollar. Yet with our
decimally oriented monetary system, we cannot measure out 1/3 of a dollar.
For instance, we know that it is worth more than 3 dimes but less than 4
dimes; it is worth more than 33 cents but less than 34 cents; and no matter
how many more denominations we invented decimally, we could never
measure exactly 1/3 of a dollar. Notice that we stress decimally since we
could invent a coin that was worth 1/3 of a dollar, just as we invented
nondecimal coins to represent 1/2 of a dollar, 1/4 of a dollar, and 1/20 of a dollar.
Leaving out the monetary illustration, we are saying that 1/3 = 1 ÷ 3, or
        0.3333 . . .
    3 | 1.00000000
          9
          10
           9
           10
            9
            10
             9
             1 . . .
That is, we never get a zero remainder in this division problem. In fact,
the remainder, 1, repeats continuously. In other words, in decimal form
the rational number whose name is 1/3 is nonterminating. This leads to a
rather esthetic thought; namely, 0.3, 0.33, 0.333, and so on, never become
synonyms for 1/3. Yet, each member in the sequence becomes more nearly
equal to 1/3, as the following differences show. Thus,
0.3 = 3/10 = 9/30; 1/3 = 10/30;
hence,
1/3 - 0.3 = 1/30,
0.33 = 33/100 = 99/300; 1/3 = 100/300;
hence,
1/3 - 0.33 = 1/300.
In a similar way,
1/3 - 0.333 = 1/3 - 333/1000 = 1000/3000 - 999/3000 = 1/3000.
In other words, while 0.33333... is never exactly equal to 1/3 no matter
where we terminate, the endless sequence of 3’s does represent 1/3. The
esthetic point is that the mathematician is quite interested in the endless
progression, even though the word “endless” is highly subjective. For
example, it is physically impossible for man to write out an endless
progression, by virtue of the meaning of the word “endless.” Moreover, from
a purely practical point of view, we never have to worry about an endless
sequence, at least in the above example. For, as we have indicated, we
can make the error between 1/3 and 0.333... as small as we wish by using
enough 3’s in the sequence. For instance, then, if we could not measure
closer than to 1/100th of an inch, certainly, in decimal form we might just
as well call 0.33 a synonym for 1/3 since we will never notice the error. In
terms of the number line, this means that the difference between 1/3 and
33/100 is less than the thickness of a pencil lead. We shall say more about
this in the next section.
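The shrinking error can be checked with exact fractions. The following Python sketch is ours (the name truncation_error is hypothetical); it computes 1/3 minus the decimal 0.33...3 written with k threes.

```python
from fractions import Fraction

def truncation_error(k):
    """Exact difference between 1/3 and the decimal 0.33...3 with k threes."""
    approx = Fraction(10**k // 3, 10**k)  # 3/10, 33/100, 333/1000, ...
    return Fraction(1, 3) - approx

for k in range(1, 5):
    print(k, truncation_error(k))  # 1/30, 1/300, 1/3000, 1/30000
```

Each additional 3 divides the error by ten, which is why the error can be made as small as we wish.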
It goes without saying that it would have been nice had all rational
numbers been represented by terminating decimals. Certainly, in line
with our beliefs that man seeks the simplest possible explanations, this
would have been a step in the right direction. Nonetheless, our quest for
simplicity is still at work. To be sure, we must now admit that rational
numbers may be represented by nonterminating decimals. But among
the nonterminating decimals there are some which might have more
interesting properties than others. For example, granted that the decimal
expansion of 1/3 was nonterminating, the expansion repeated the same cycle
endlessly; in this case the endless repeating cycle consisted of the single
digit 3. This leads to the following definition: A decimal is said to be
repeating if it does not terminate, but rather repeats beyond a certain point
the same cycle of digits endlessly.
In light of some previous observations, we can see without performing
the actual divisions that 1/7, 8/15, and 59/90 will not be represented by
terminating decimals, since the denominators contain factors other than
powers of 2 and 5. Yet as we perform the divisions, we find that in each
of these cases the resulting decimals are repeating decimals! Specifically,
1/7 = 0.142857142857142857... (where 142857... means 142857 is the
endlessly repeating cycle); 8/15 = 0.533... (where 3... means that 3
constitutes the endlessly repeating cycle); and 59/90 = 0.655... (where 5...
means that 5 constitutes the endlessly repeating cycle).
As we study the division in each of the above examples, we find that
(1) 0 never occurs as a remainder, and (2) eventually a remainder repeats.
It is the repeating of a remainder that is unequal to 0 that guarantees a
repeating decimal. For example, when we investigate 1/7, we write
    7 | 1.0000000000000 . . .
where the dividend consists of a 1 followed by endlessly many 0’s. Our
first remainder is 1, followed in turn by 3, 2, 6, 4, and 5. Then we get 1
as a remainder again. This leaves us in the same position as at the start,
namely, with a 1 followed by endlessly many 0’s. In this way, the cycle
142857 must repeat endlessly. Thus,
        0.142857142857 . . .
    7 | 1.000000000000 . . .
    successive partial dividends: 10, 30, 20, 60, 40, 50, 10, 30, 20, 60, 40, 50, . . .
In this example it is an interesting coincidence that all possible nonzero
remainders occur before there is a repetition. This leads to an impractical
but highly interesting result. Namely, the decimal sequences for 1/7, 2/7, 3/7,
4/7, 5/7, and 6/7 must consist of the same cycle of digits only starting in different
positions. This should not be difficult to see, since in 2/7 the first remainder
is 2, which was the third remainder in the decimal form of 1/7. This fact
leads to an unusual coincidence that can be used as a trick. Simply ask a
person to multiply 142857 by 2, 3, 4, 5, or 6. We obtain the following
answers.
    142857 X 1 = 142857    142857 X 2 = 285714    142857 X 3 = 428571
    142857 X 4 = 571428    142857 X 5 = 714285    142857 X 6 = 857142
where each answer is essentially a variation of the cycle 142857.
If we multiply by 7, we obtain 142857 X 7 = 999999. How would you
explain this? If we multiply by 8, we see that 142857 X 8 = 1142856,
and the cycle seems to be broken. But if we start the cycle with the second
1, we have 1428561; and if we group this as 14285(61) and recall that
6 + 1 = 7, we obtain 142857, which is the same as we obtained when we
multiplied 142857 by 1. Is this just a coincidence or is there a reason why
this happened? If it is not a coincidence, can you explain how this result
generalizes? It is not very important for our purposes whether you can
answer these questions. It is important, however, to illustrate how keen
observation permits us to make interesting surmises even in the most
ordinary situations.
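The reader who wishes to check the trick without pencil and paper might use a short Python sketch such as the following. The helper is_rotation is our own hypothetical name; it merely tests whether one string of digits is the other with a different starting position.

```python
CYCLE = "142857"

def is_rotation(s, t):
    """True if s is t started at a different position in the cycle."""
    return len(s) == len(t) and s in t + t

# Multiples 1 through 6 are all rotations of the same six digits.
rotations_ok = [is_rotation(str(142857 * k), CYCLE) for k in range(1, 7)]
print(rotations_ok)

# The multiples by 7 and 8 discussed in the text.
print(142857 * 7)
print(142857 * 8)
```

The sketch only confirms the observations; it leaves the "why" as a question for the reader, just as the text does.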
Returning to our discussion, the only question that now remains is
whether it was luck or necessity that caused us to get a remainder that
repeated in the above examples. We intend to show that when we form
the quotient of two natural numbers a remainder must eventually repeat.
To this end, we observe that when we divide a number by 7 and we
continue the division until the remainder is less than 7, then there are only 7
possible remainders we can obtain: 1, 2, 3, 4, 5, 6, and 0 (this is the case
wherein the number is divisible by 7). Hence, if we divide eight numbers
by 7, at least two of the remainders must be the same, else there would be
at least eight different remainders. This simple principle, known more ominously
as Dirichlet's chest-of-drawers principle, is used quite often. For example,
it tells us that in a group of, say, 400 people, at least two celebrate their
birthday on the same day; otherwise there would be at least 400 days in a
year. It tells us that if there is an urn with marbles, some colored red
and the others black, and we wish to choose two of the same color; then we
need draw only 3 marbles from the urn to make sure that at least two have
the same color (of course, we may have matched a pair with 2, but we
cannot miss with 3).
This principle guarantees us that if our divisor is n (in the above case we
were considering n = 7), by the time we get to the (n + 1)th 0 after the
decimal point, we are assured that at least one remainder has repeated,
unless 0 occurs, in which case the decimal terminates. Summed up then,
using the chest-of-drawers principle, when we express a rational number as a
decimal, we must obtain either a terminating decimal (0 occurs as a
remainder) or a repeating decimal (0 never occurs as a remainder but some
other remainder repeats).
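The argument above is really a description of a procedure: carry out the division, remember every remainder, and stop as soon as a remainder is 0 or has been seen before. A minimal Python sketch follows; the function name decimal_expansion and its return format are our own choices for illustration.

```python
def decimal_expansion(numerator, denominator):
    """Long division of two natural numbers in base ten.

    Returns (integer_part, non_repeating_digits, repeating_cycle);
    the cycle is the empty string when the decimal terminates.
    """
    integer_part, remainder = divmod(numerator, denominator)
    digits = []
    seen = {}  # remainder -> position of the digit it produced
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        # "Bring down a 0": multiply the remainder by ten and divide again.
        digit, remainder = divmod(remainder * 10, denominator)
        digits.append(str(digit))
    if remainder == 0:
        return integer_part, "".join(digits), ""   # terminating decimal
    start = seen[remainder]                        # where the cycle began
    return integer_part, "".join(digits[:start]), "".join(digits[start:])

print(decimal_expansion(1, 4))    # (0, '25', '')
print(decimal_expansion(1, 7))    # (0, '', '142857')
print(decimal_expansion(8, 15))   # (0, '5', '3')
print(decimal_expansion(59, 90))  # (0, '6', '5')
```

Since there are at most n - 1 nonzero remainders for a divisor n, the loop cannot run more than n - 1 times before it stops, which is the chest-of-drawers principle at work.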
Although it is not important for our immediate needs, the converse
(which means in an “If . . . then . . .” statement, the statement which
is obtained by interchanging the clauses following the “If” and the “then”)
of the above result is also true. That is, if a decimal either terminates or
repeats, then it represents a rational number. Before demonstrating this,
however, let us point out that this statement is not as trivial as it may
appear at first glance. It is not just a restatement of the previous result.
Observe that there is a difference between saying that if a person is reading
this book, he is alive, and if a person is alive, he is reading this book!
Again, rather than offer a proof, we will proceed by examples. Consider
a terminating decimal. For example, let us choose 2.673. Certainly we
may write
2.673 = 2 + 6/10 + 7/100 + 3/1000
= 2000/1000 + 600/1000 + 70/1000 + 3/1000
= 2673/1000.
(In general, we may pretend that the decimal point is missing and read
the numeral as that number of whatever denomination is represented by
the last column.)
The more difficult case is when the decimal does not terminate. For
example, let us consider 0.3333333. . . , where 3 repeats endlessly, and let us
pretend that we do not know that this is the decimal form of the rational
number 1/3. We observe that if we move the decimal point one place to the
right, the decimal part does not change. But this means multiplying by ten.
In other words, if we let n = 0.33333333333333333333333333... , then
10n = 3.33333333333333333.... Writing this in a form more conducive
to subtraction, we have
   10n = 3.33333333333333333333...
 - (n = 0.33333333333333333333...)
    9n = 3.00000000000000000000...
or n = 3/9 = 1/3, as we already knew.

As for other examples, consider n = 0.4545454545454545... (where 45
constitutes the endlessly repeating cycle). We observe that the decimal
part remains the same if we move the decimal point two places to the right
(actually, we could move it four, six, or eight or more places to the right
but this would just complicate the arithmetic). Thus, we have
  100n = 45.4545454545. . .
 - (n = 0.4545454545. . .)
   99n = 45
     n = 45/99 = 5/11.
The key idea is that this device allows us to replace repeating decimals by
terminating decimals. In terms of numerators and denominators,
terminating decimals are more readily handled than repeating decimals since we
cannot pick such a “nice” denominator to represent an endless sequence
if only because there is no column furthest to the right. As a trivial check,
one need only consider
         0.45454545. . .
    11 | 5.00000000...
         44
Notice that the chest-of-drawers principle states in this case that a
remainder must repeat by the twelfth 0 after the decimal point, but it may
repeat sooner. In the above example the remainder repeats by the third
decimal place. Thus, with two people it is possible that they celebrate
their birthdays in the same month. With thirteen or more people,
however, at least two must celebrate their birthdays during the same month.
Note that we are not saying which month it will be though.
As a final example, consider the case wherein n = 0.41313... (where
the endlessly repeating cycle is 13). Here we run into a slight
computational snag that did not confront us before, simply because part of our
decimal is not contained in the repeating cycle. In this case, it is the 4
which causes the complication. All we need observe is that by moving the
decimal point one, three, five, seven, and so on, places to the right, we
obtain the repeating decimal part 0.13131313. . . . Thus,
  1000n = 413.1313131313...
 - (10n = 4.1313131313...)
   990n = 409
      n = 409/990.
To check, we observe the following.
          0.413131
    990 | 409.000000
          3960
           1300
            990
            3100
            2970
             130
Here 130 is the first remainder to repeat in this division. Again, notice
that the repetition took place at the third decimal place, not at
the 991st!
These examples demonstrate essentially that the quotient of two natural
numbers when expressed in decimal form will always yield a terminating or
repeating decimal. Conversely, any terminating or repeating decimal can
be obtained as a quotient of two natural numbers.
In passing, we wish to observe that one could certainly invent decimals
which are neither terminating nor repeating. For example, consider
n = 0.282882888288882888882888888... where each time we add one more
8 to the cycle. This decimal can never repeat the same cycle endlessly
because each time one more 8 is added to the cycle, thus preventing the
exact same cycle from being endlessly repeated. In other words, there are
decimals which cannot be obtained as the quotient of two natural numbers.
The next section will be devoted entirely to this topic.
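The subtraction device used in the three examples above can be written once and for all. In the Python sketch below (the name repeating_to_fraction and its three-argument form are our own assumptions), prefix holds the digits that precede the cycle and cycle holds the endlessly repeating digits.

```python
from fractions import Fraction

def repeating_to_fraction(integer_part, prefix, cycle):
    """Convert integer_part.prefix(cycle)(cycle)... to a common fraction.

    prefix and cycle are strings of digits; cycle may be "" for a
    terminating decimal. integer_part is a nonnegative whole number.
    """
    shift = 10 ** len(prefix)          # moves the point past the prefix
    head = int(prefix) if prefix else 0
    if not cycle:
        return integer_part + Fraction(head, shift)
    period = 10 ** len(cycle)          # moves the point past one full cycle
    # (shift * period) * x  -  shift * x  leaves no endless tail:
    difference = (head * period + int(cycle)) - head
    return integer_part + Fraction(difference, shift * (period - 1))

print(repeating_to_fraction(0, "", "3"))     # 1/3
print(repeating_to_fraction(0, "", "45"))    # 5/11
print(repeating_to_fraction(0, "4", "13"))   # 409/990
print(repeating_to_fraction(2, "673", ""))   # 2673/1000
```

For 0.41313... the sketch forms 1000n - 10n = 413 - 4 = 409 and divides by 990, exactly as in the text.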
Once again, let us emphasize the question of different number bases and
their effect upon whether a particular decimal fraction will either terminate
or repeat. For example, let us first observe that in the base-n system,
the first place to the right of the decimal point denotes the denomination
1/n; the next, 1/n X 1/n, or 1/n^2, and so on. Thus, (0.2)_6 means 2
(one-sixth) or 2/6 which, in turn, equals 1/3. Thus, while 1/3 is represented by
a nonterminating decimal in the base-ten system, it is represented by a
terminating decimal fraction (0.2)_6 in the base-six system.
This brings us to an interesting point; namely, that in our base-ten
system, we observe that when written in lowest terms, a common fraction
will be equivalent to a terminating decimal if and only if the denominator
can be represented by unique factorization as a product of powers of two
and/or five. Otherwise, the decimal fraction would be nonterminating.
Notice, again, the special properties of two and five in our base-ten system.
The point is that in any base, the terminating decimal fractions are
determined by whether the prime factors of the denominator happen to be
divisors of the number base. For example, in base-six arithmetic the
crucial primes would be two and three. This is because of the number
fact that the product of two and three is six. That is, 2 X 3 = 6 or
(2 X 3 = 10)_6.
By way of additional examples, when written as a decimal, 1/7 would
terminate in the base-fourteen system. In fact, (1/7 = 0.2)_14. This follows
from the fact that seven is a divisor of the base in this case. In particular,
(1/7 = (1 X 2)/(7 X 2) = 2/10 = 0.2)_14.
Thus, regardless of the base, a rational number when represented as a
decimal fraction either terminates or repeats. This does not depend on
the base; but whether it terminates or whether it repeats is dependent on
the base.
As a further example, when 1/2 is written as a decimal the decimal will
terminate in any even base since two is a divisor of every even natural
number; on the other hand, 1/2 will repeat but not terminate in any odd
base since no odd number is divisible by two. By way of illustration,
since 1/2 = 2/4 = 3/6 = 4/8 = 5/10 = 6/12, and so on, we have 1/2 = (0.2)_4 = (0.3)_6 =
(0.4)_8 = (0.5)_10 = (0.6)_12, and so on. On the other hand, 1/2 = (0.111...)_3 =
(0.222...)_5 = (0.333...)_7, and so on. The details are left to the reader
as an exercise. All it involves is dividing 2 into 1, but using the tables for
the base in question. For example, in base-three,
              0.11 . . .
    1/2 = 2 | 1.0000
                2
                10
                 2
                 1 . . .
(since 10_three = 3_ten and therefore 10 ÷ 2 = 1 and 1 remainder in
base-three numerals.)
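The same division recipe can be carried out in any base by "bringing down a 0" of that base, that is, by multiplying each remainder by the base rather than by ten. The Python sketch below uses our own hypothetical name expand_in_base and returns the digits as lists of whole numbers, since a single digit in base twelve or fourteen may exceed nine.

```python
def expand_in_base(numerator, denominator, base):
    """Digits after the point of numerator/denominator in the given base.

    Returns (non_repeating_digits, repeating_cycle) as lists of integers;
    the cycle is empty when the expansion terminates. Only the
    fractional part of the quotient is expanded.
    """
    remainder = numerator % denominator
    digits = []
    seen = {}  # remainder -> position of the digit it produced
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * base, denominator)
        digits.append(digit)
    if remainder == 0:
        return digits, []
    start = seen[remainder]
    return digits[:start], digits[start:]

print(expand_in_base(1, 3, 6))    # ([2], [])   1/3 = (0.2) in base six
print(expand_in_base(1, 7, 14))   # ([2], [])   1/7 = (0.2) in base fourteen
print(expand_in_base(1, 2, 3))    # ([], [1])   1/2 = (0.111...) in base three
print(expand_in_base(1, 3, 10))   # ([], [3])   1/3 = 0.333... in base ten
```

Whether the second list is empty depends on the base, but one of the two outcomes always occurs, in agreement with the discussion above.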
Aside from esthetic value, the above discussion throws an interesting
light on such questions as whether fractions such as 1/3 are “exact.” When a person says
that these are not exact, he is usually referring to the fact that they never
terminate as decimals. But this is only a fact in some number bases and
not in others. For example, in any number base which is divisible by
three 1/3 does indeed terminate. For example,
1/3 = (0.2)_6 = (0.3)_9 = (0.4)_12 = (0.5)_15.
(We omit base-three only because 1/3 is not a numeral in this system. It
is permissible to say 1/3 = (1/10)_3 = (0.1)_3.)
To continue, whether or not 1/3 is exact should depend only on the number
named by 1/3, not on the numeral 1/3. Thus, the use of different bases allows
us another interpretation of why 0.333. . . is exact even though it is endless.
In fact, it only becomes inexact when we chop off the decimal at a certain
point. For example, 0.3333333 is not exactly 1/3, but 0.333... is exactly
equal to 1/3. Here again, then, we see a nice distinction between numerals
and numbers. The rational number one-third is an “exact” number but
whether its “decimal” fraction “name” terminates or whether it repeats
depends on the base being used.
More about these points will be left for the exercises; but we would
like to conclude these brief remarks with a reference to prime and
composite numbers. In particular, we have observed that a common fraction
will be represented by a terminating decimal if and only if all the prime
divisors of the denominator are divisors of the base. However, a prime
number has no nontrivial divisors. Thus, unless the denominator is itself
a power of the base, a common fraction will never terminate as a decimal
when the number base is a prime number.
At the other extreme, observe that the more divisors the base possesses,
the more fractions will terminate as decimals. This is another reason why
there are many advocates of base-twelve arithmetic: Since 12 possesses
the nontrivial divisors 2, 3, 4, and 6, we find that among single-digit divisors
(where T denotes ten), 1/2, 1/3, 1/4, 1/6, 1/8, and 1/9 all terminate as decimals in
the base-twelve system; while only 1/2, 1/4, 1/5, and 1/8 terminate as decimals in
our base-ten system. In particular,
    1/2 = (0.6)_12;  1/3 = (0.4)_12;  1/4 = (0.3)_12;  1/6 = (0.2)_12
and
    1/8 = 1/2^3 = 6^3/(2^3 X 6^3) = (6^3/12^3)_ten
    6^3 = 216 = 160_twelve
    12^3 = (10^3)_twelve = 1000_twelve
    1/8 = (0.16)_12
We conclude this section with a discussion of percentages. “Percent,”
derived from the Latin per centum, means literally “per one hundred.”
From our point of view, “percent” is merely a synonym for “divided by
100,” or for indicating that the denominator of a fraction is 100. Thus,
when one says 18 percent of 50 (written 18% of 50) we think of it as saying
18/100 of 50 = 18/100 X 50 = 18/100 X 50/1
= 900/100
= 9.
Without consciously using the number line, we may think of 50 as being
divided into 100 equal parts, and then taking 18 of these equally sized
parts. That is, we may think of 18 pieces, each of size 50/100.
In a similar way, when one says that 16% of a certain number is 32,
he means that 16/100 X (that number) = 32. In other words, (that
number) = 32 X 100/16 = 3200/16 = 200. In fact, we are merely using
the inverse of multiplication, or division, here. Again, thinking
subconsciously of the number line we are saying that 16 of the 100 equal parts is
equal to 32; hence, one of these parts is equal to 2. Therefore, the entire
hundred pieces must be worth 100 X 2 = 200.
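Both percentage computations amount to multiplying or dividing by a fraction whose denominator is 100. A small Python sketch, with the hypothetical helper names percent_of and whole_from_part, makes the pairing of the two operations explicit.

```python
from fractions import Fraction

def percent_of(percent, amount):
    """percent% of amount, i.e. percent/100 X amount."""
    return Fraction(percent, 100) * amount

def whole_from_part(percent, part):
    """The number of which `part` is percent%; the inverse of percent_of."""
    return part / Fraction(percent, 100)

print(percent_of(18, 50))       # 9
print(whole_from_part(16, 32))  # 200
```

Applying whole_from_part to the result of percent_of returns the original amount, which is the sense in which division is the inverse of multiplication here.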
In final summary, if we are to study mathematics as intelligent, sensible
people, fractions must be more than just computational tools to us. There
should be a meaningful experience in human endeavor as we learn the “why”
as well as the “how.” For, in learning to master anything, one soon
learns that he does not choose between the “how” and the “why.” Rather
he learns to see the “how” and the “why” in a state of blissful coexistence!
Exercises
1. If we assume that “decimal” refers to 10 rather than ten, explain how
we can have a decimal system in any number base system.
2. In what sense may we read 62.3764 as if it were 623,764?
3. Assume that we were using place-value as usual, but that we forgot to
put in a decimal point when we expressed a certain number. In this
case, name five different numbers that could be named by 2345678.
In what sense are all these numbers like 2,345,678?
4. Distinguish between the use of 0 in such numerals as 4.1300 and 4.0013.
5. Find the sum of 45.0013, 123.261, 872.4456, and 9.003.
6. Compute 23.76 X 2.8, and interpret this result in terms of converting
the decimal fractions into equivalent common fractions.
7. In terms of areas of rectangles, explain why 23.76 X 2.8 must exceed
46, but be less than 72.
8. Express each of the following ratios as decimal fractions.
(a) 3:8 (b) 27:40 (c) 40:27 (d) 8:14 (e) 3:17
9. Without actually computing the equivalent decimal fraction, tell which
of the following common fractions are equivalent to terminating
decimals and which are equivalent to repeating, nonterminating decimals
(base-ten).
(a) 23/50 (b) 29/70 (c) 13/40 (d) 6/75
(e) 9/25 (f) 18/24 (g) 239/1250
10. Name three bases in which the decimal equivalent of \ does not
terminate. What decimal expresses \ in each of these cases?
11. Find three different number bases in which the decimal equivalent of f
terminates. What is the decimal in each of these three cases?
12. Is there a base in which the decimal equivalent of f will neither
terminate nor repeat? Explain.
13. Express as a common fraction the rational numbers named by each of
the following decimals.
(a) 0.4563 (b) 4.563 (c) 0.45634563... (d) 23.4343...
(e) 0.45666... (f) 0.64545... (g) 0.31247878... (h) 0.999...
14. Express £ as an equivalent decimal fraction in the base-twelve system.
15. Is it possible that there are two whole numbers whose quotient in
decimal form is given by
0.25255255525555...
(where . . . means add one more 5 to each cycle)? Explain.
16. Find the value of n in each of the following.
(a) 23% of n is 92.
(b) 23% of n is 1000.
(c) n is 23% of 92.
(d) n is 23% of 1000.
(e) n X 1.23 = 55.35.
(f) n + 7.983 = 11.052.
(g) If an object travels at a steady rate of 15 feet per second then it
is also traveling at n miles per hour.
17. In terms of number-versus-numeral, explain why it is wrong to say
that a rational number is any decimal which either terminates or
repeats.
1.6 THE IRRATIONAL NUMBERS
The concept of number-versus-numeral extends in a very natural way to
the number line, only in this environment it becomes known as
point-versus-dot. A number is denoted by a numeral and a point by a dot.
This is because conceptually we think of a point as having no thickness;
yet the fact that a dot is visible means that it has thickness. Our first
approximation is to say that we will make the dot so thin that its thickness
is negligible; but this really solves nothing, at least from one point of view,
since thickness is still thickness!
It was the difference between a dot and a point that allowed the ancient
Greek to believe that the rational numbers filled in the number line
completely. His argument hinged on the fact that the average of two rational
numbers was also a rational number. In terms of the number line this
meant that if m and n were points that named rational numbers, then the
point midway between them also represented a rational number. That
this is true is not hard to see when we realize that the average of m and n is
(m + n)/2; thus, if m and n are rational, so is their sum, and a rational
number divided by two is still a rational number.
The Greek’s argument now became a type of bisection. For example,
let m and n denote points which name rational numbers. Then as we have
just mentioned, if p denotes the midpoint of the segment that joins m
and n, p is also a rational number. We now take the midpoint of the
segments between m and p and between p and n. These points also denote
rational numbers, and we continue this process for a sufficiently long period
of time, each time constructing new rational numbers.
After a while, the spaces between our dots get so small that it seems
the dots all meld together and form a solid line. Surprisingly enough, this
is no optical illusion. The dots do run together. In effect, what happens
is that the thickness of the dot becomes greater than the distance between
consecutive dots (and notice that no matter how tiny we make the dots,
this problem eventually arises. All that the thickness of the dot affects is
how long it takes before the thickness of the dot exceeds the space between
two consecutive dots). On the other hand, the points never run together.4
Let us illustrate this more concretely. Suppose we take a line that is 1
unit in length, say, the interval from 0 to 1. Then the midpoint of this
segment is 1/2. If we bisect the two resulting segments, we obtain 1/4 and 3/4.
If we then bisect each of the four segments, we obtain 1/8, 3/8, 5/8, and 7/8.
Continuing in this way, the dots eventually fill in the segment; yet we do not
even come close to obtaining all rational numbers in this way. In fact,
it is clear that the only rational numbers we obtain are those which when
expressed in lowest terms have as their denominators powers of two. In
other words, we cannot obtain any fraction in this way which when
expressed in lowest terms has a denominator of 3, 5, 6, 7, 9, 10, 11, 12, 13,
14, 15, 17, and so on.
What, then, has happened to 1/3? Has it disappeared from the number
line? It is simply that one of the dots occupies a position that overlaps
the point 1/3. To see this in a more constructive light, observe that 1/3 must
appear either between 0 and 1/2 or between 1/2 and 1. Using common
denominators, it is easy to see that 2/6 appears between 0/6 and 3/6 rather than
between 3/6 and 6/6. Thus, we have located 1/3 to lie between 0 and 1/2. We
next bisect the interval from 0 to 1/2, thus obtaining the point 1/4. We next
ask whether 1/3 occurs between 0 and 1/4 or between 1/4 and 1/2. Again using
common denominators we see that 4/12 lies between 3/12 and 6/12 rather than
between 0/12 and 3/12. In this way, we have more accurately defined 1/3 to
lie between 1/4 and 1/2. We can continue in this way as shown in Figure 1.23,
and what eventually happens is that the interval into which we have
squeezed 1/3 becomes thinner than the pencil point.
Thus, as intuitive as it might seem, the dot-type argument of showing
that the rational numbers fill in the number line is erroneous. There is
such a difference between dots and points, in fact, that the erroneous
argument hides the fact that we fail to obtain most of the rational numbers by
this construction.
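The squeezing of 1/3 between bisection points can be followed step by step with exact fractions. In the Python sketch below, the name squeeze is our own; at each step it bisects the current interval and keeps the half that contains the target.

```python
from fractions import Fraction

def squeeze(target, steps):
    """Repeatedly bisect the interval from 0 to 1, keeping the half that
    contains target; return the list of (low, high) intervals obtained."""
    low, high = Fraction(0), Fraction(1)
    intervals = []
    for _ in range(steps):
        mid = (low + high) / 2
        if target < mid:
            high = mid   # target lies in the left half
        else:
            low = mid    # target lies in the right half
        intervals.append((low, high))
    return intervals

for low, high in squeeze(Fraction(1, 3), 5):
    print(low, "< 1/3 <", high)
```

Every endpoint printed has a power of two as its denominator, so 1/3 itself never appears as a bisection point; the intervals merely shrink around it.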
4 See Note 1 at the end of this section.

[Figure 1.23: successive bisections locating 1/3, first between 0/6 and 3/6, then between 3/12 and 6/12, then between 6/24 and 9/24]
Of course, from a legal point of view, all our present argument shows is
that there is an appreciable difference between a point and a dot, and that
the ancient Greek used a natural but naive argument. Notice that we
have not disproved his claim, only his proof.
However, now that we have at least established a need for more than
the given proof, we can proceed more confidently to seek a vehicle for our
claim that there must be more than rational numbers. The vehicle we
elect to use is that of decimals. We have already seen that the only
decimal numerals that can name rational numbers are those which either
terminate or repeat. Consequently, if we agree that all decimals name
numbers, then as soon as we can exhibit a decimal which neither terminates
nor repeats, we have established the existence of nonrational (irrational)
numbers. To this end, we offer as a candidate the number
m = 0.28288288828888...
(where . . . means that we add one more 8 to each cycle).

By its very construction m can neither terminate ( . . . indicates that it
goes on endlessly) nor repeat (for each time one more 8 is present than in
the previous cycle). Yet the definition of m is not random. It is as
precisely defined as is, say, 0.282828... . Indeed, we can objectively
determine each decimal place of m by observing that 2’s will appear in the
1st, 3d, 6th, and 10th places and so on, while 8’s will appear every place
else.
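To see that each decimal place of m really is determined, here is a short Python sketch under our own naming (digit_of_m is hypothetical). It uses the fact that the places 1, 3, 6, 10, . . . are obtained by adding 2, then 3, then 4, and so on.

```python
def digit_of_m(position):
    """Digit in the given decimal place (counting from 1) of
    m = 0.28288288828888...; 2's fall in places 1, 3, 6, 10, ..."""
    k = 1
    place_of_two = 1
    while place_of_two < position:
        k += 1
        place_of_two += k   # 1, 3, 6, 10, 15, ...
    return 2 if place_of_two == position else 8

print("0." + "".join(str(digit_of_m(i)) for i in range(1, 15)))
# 0.28288288828888
```

Any place, however far out, can be computed in this way, even though no cycle of digits ever repeats endlessly.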
Historically, the ancient Greek did not discover the irrational numbers
in this way (among other things, decimals had not yet been invented). He
discovered them in terms of his association of numbers with lengths. That
is, he felt that any length he could obtain by a geometric construction
should represent the name of a number. As an example, consider the
isosceles right triangle, each of whose sides has length 1. By the
Pythagorean theorem, it followed that the length of the hypotenuse of this triangle
was √2. Since this was a “real” triangle, his feeling was that √2 was
a “real” number. (We shall examine the idea of “real” numbers in more
detail in the section on complex numbers.) (As a footnote on the power
of geometric pictures in viewing abstract relations, an ancient proof of
this theorem which involves nothing more than subdividing a square in
two different ways is supplied as Note 2 at the end of this section.)
Since the only numbers he knew were rational numbers, he set out to
find two whole numbers whose quotient was √2. In other words, he
sought a rational number whose square was 2. His quest proved futile,
even though he managed to come close at times. For example, 7/5
almost works since 7/5 X 7/5 = 49/25, which is almost 2.
He eventually employed the well-known logical ruse used so often in
mathematics: that of pretending to know what the answer was, and then
working backwards to discover it. This is what we do in algebra problems
when we let x equal the unknown. We are pretending that x is the answer
and then we apply logic to deduce the numerical value x must have. In
this context, he probably said, “Okay, suppose we knew what the number
was. Let’s call it m/n. Then m^2/n^2 = 2, or m^2 = 2n^2. However, it is
impossible for there to be two whole numbers m and n such that m^2 = 2n^2,
since in terms of the unique factorization theorem m^2 has an even number
of prime factors, while 2n^2 has an odd number of prime factors. Thus, there can be no
rational number whose square is 2.”5 This was, indeed, a dilemma to
him (legend has it that the mathematician who discovered this fact
committed suicide); for now he either had to admit that √2 was not a number
or else he had to admit that the rational numbers, just as the whole
numbers before this, were in themselves not sufficiently complete to handle
the various fine points of mathematics.
To correlate this discussion of √2 with our decimal interpretation, we
are saying that we can find successively better rational approximations in
decimal form for √2, but that no such number can ever equal √2. For
example, let us try to find a decimal which names √2. Certainly such a
number must lie between 1 and 2 since the square of 1 is too small and the
square of 2 is too big. In a similar way, the fact that 1.4 × 1.4 = 1.96
and 1.5 × 1.5 = 2.25 means that 1.4 < √2 < 1.5. Carrying out this
procedure still further, it could be shown that 1.41 < √2 < 1.42; and
that 1.414 < √2 < 1.415. Moreover, such a procedure, provided only
that we had the patience, could be carried on as many times as we liked.
In this way we could continually squeeze √2 between two rational
numbers which became as nearly equal as we desired; but in no event would we
ever obtain a terminating or repeating decimal whose square was 2;
otherwise √2 would be rational, contrary to what we know is true.
5 See Note 3 at the end of this section for another proof.
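The squeezing procedure just described can be carried out mechanically. The sketch below is one possible way to do it; the function name squeeze_sqrt2 is hypothetical, and exact fractions are used so that no rounding enters the comparison.

```python
from fractions import Fraction

def squeeze_sqrt2(places: int):
    """Return (low, high), decimals with `places` digits, with low^2 < 2 < high^2."""
    low = Fraction(1)    # we already know 1 < sqrt(2) < 2
    step = Fraction(1)
    for _ in range(places):
        step /= 10
        # Raise the lower bound one step at a time while its square stays below 2.
        while (low + step) ** 2 < 2:
            low += step
    return low, low + step

for places in range(1, 4):
    low, high = squeeze_sqrt2(places)
    print(float(low), "< sqrt(2) <", float(high))
```

Each extra decimal place narrows the gap between the two rational bounds by a factor of ten, yet neither bound ever has a square equal to 2.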
Man has spent a great deal of time studying the irrational numbers.
In fact, more volumes have been devoted to irrational numbers than to
rational numbers. Yet, we never need irrational numbers in our real,
pragmatic world. This becomes apparent to us as soon as we are first
taught that π = 22/7. This is preposterous, for π happens to be irrational
while 22/7 clearly names a rational number. What we really mean is that
the decimal expressions for π and 22/7 both begin with 3.14, but the
resemblance soon ends.
π = 3.14159 . . .
22/7 = 3.142857 . . .
But if we could not measure beyond two-decimal-place accuracy it would
be impossible to distinguish between π and 22/7. It would not be that
they were equal, but rather that the difference between them was, so to
speak, less than the thickness of a pencil point. In short, in the real world
we can always find rational approximations to any desired degree of
accuracy for irrational numbers. Among other things, the extensive study
of irrational numbers is a tribute to man’s esthetic sense of values.
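To see how close 22/7 comes, and how easily a closer rational number can be found, one might run a short computation such as the following. This is only an illustration: math.pi is itself a finite (hence rational) approximation supplied by the machine, and the bound of 1000 on the denominator is an arbitrary choice of ours.

```python
import math
from fractions import Fraction

approx = Fraction(22, 7)
difference = abs(float(approx) - math.pi)

print(f"pi   = {math.pi:.6f}")
print(f"22/7 = {float(approx):.6f}")
print(f"difference = {difference:.6f}")

# A closer rational approximation, with denominator no larger than 1000.
better = Fraction(math.pi).limit_denominator(1000)
print(better, float(better))
```

To two decimal places the two numbers cannot be told apart; beyond that the difference shows, and a fraction with a larger denominator narrows it further.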
To illustrate this idea in terms of a different example, let us return to
m = 0.28288288828888 . . . .
Consider the following.
0.2 < m < 0.3
0.28 < m < 0.29
0.282 < m < 0.283
0.2828 < m < 0.2829
0.28288 < m < 0.28289
0.282882 < m < 0.282883
The left-hand entries form a sequence of rational numbers that becomes
progressively larger; the right-hand entries form a sequence of rational
numbers that becomes progressively smaller. The difference between the
two becomes arbitrarily small.
We see that we can find a rational number that will approximate m to as
great a degree of accuracy as we desire. In terms of the number line, we
are saying that we can find a dot that originates and terminates at a rational
number, as “thin” as we desire, that will engulf m.
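The table of bounds for m can be produced to any length by a short program. The sketch below assumes the pattern of m is a 2 followed by one 8, then a 2 followed by two 8's, and so on; the names digits_of_m and bounds are hypothetical.

```python
from fractions import Fraction

def digits_of_m(count: int) -> str:
    """First `count` decimal digits of m = 0.28288288828888..."""
    digits, run = "", 1
    while len(digits) < count:
        digits += "2" + "8" * run   # a 2 followed by a growing run of 8's
        run += 1
    return digits[:count]

def bounds(places: int):
    """Rational numbers with `places` decimal digits that trap m between them."""
    low = Fraction(int(digits_of_m(places)), 10 ** places)
    return low, low + Fraction(1, 10 ** places)

for places in range(1, 7):
    low, high = bounds(places)
    print(f"{float(low):.{places}f} < m < {float(high):.{places}f}")
```

The gap between the two bounds is one unit in the last decimal place, so it can be made as small as we please by asking for more places.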
Lest we think that irrational numbers include no more than √2 and π,
let us hasten to point out that there are even more irrational numbers than
there are rational numbers, even though there are infinitely many rational
numbers. Without supplying a rigorous proof at this time, the following
physical model might prove effective. Imagine that we have a ten-faced
die numbered with the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. We start with 0
and then roll the die. We record the numeral that turns up on the die as
the first decimal digit. We repeat this process, each time recording another
decimal digit; we assume that we will continue the experiment forever.
In this way we obtain a decimal numeral. For example, if our first eight
rolls turn up in succession 2, 5, 3, 4, 4, 9, 5, 1, our decimal starts out
0.25344951. In any event, it seems much more likely that if this
experiment were carried out endlessly we would get a random distribution of
digits rather than an endlessly repeating cycle. That is, it seems more
likely that a randomly constructed decimal will name an irrational number
rather than a rational number. In this interpretation it almost seems
that it should be impossible to obtain a rational number since it is hard to
imagine the same cycle of digits occurring endlessly. In a way this
indicates, at least, how the irrational numbers tend to “dwarf” the rational
numbers.
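The die experiment is easy to imitate on a computer, at least for a finite number of rolls. In the sketch below the function name roll_decimal and the seed value are our own assumptions; the seed merely makes a run repeatable, and a finite run can only suggest, not prove, what an endless one would do.

```python
import random

def roll_decimal(rolls: int, seed: int = 1971) -> str:
    """Simulate `rolls` throws of a ten-faced die and return the decimal numeral."""
    rng = random.Random(seed)   # seeded so that the experiment can be repeated
    digits = [str(rng.randrange(10)) for _ in range(rolls)]
    return "0." + "".join(digits)

print(roll_decimal(8))
print(roll_decimal(40))
```

Looking at a long run of such digits, one rarely sees anything resembling a cycle that repeats from some point on.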
Now lest we forget our primary mission (that of developing our number
system) let us curtail further discussion of irrational numbers and instead
point out that we have established that the rational numbers were not
sufficient to fill in the number line, and that we invented the irrational
numbers to fill in the gaps. Hence, a given number is either rational or
irrational, but not both. In terms of divisibility, we are saying that either
a number is the quotient of two whole numbers or else it is not. In this
way, the irrational numbers seem to “close” the number line. Or do they?
This will be clarified in Section 1.7.
NOTE 1
The following is a particularly simple way to see the difference between a
point and a dot.
Consider a group of uniformly sized marbles (this will serve as our
physical model of dots), and two lines S and L, as shown in Figure 1.24.
Figure 1.24
Clearly, L can hold more marbles than S. That is, L contains more
dots than S. However, L and S have exactly the same number of points,
as the following construction will indicate.
First, construct a figure similar to the one in Figure 1.25, joining the end
points of S and L to form the triangle whose vertex is A.
Figure 1.25
Pick any point p on line S. Then the line which joins A and p intersects
L at one and only one point, say, q. Conversely, starting at any point q
on line L, the line that joins A and q intersects S at exactly one point, p.
This construction exhibits a one-to-one correspondence between the points
on each line.
NOTE 2
With a, b, and c as in Figure 1.26 we would like to show that
a² + b² = c².
Figure 1.26
Let T denote the area of the triangle. We proceed to subdivide the
square whose side is (a + b) in the two different ways illustrated in Figure
1.27(a) and (b). From (a) the area of the square is
a² + b² + 4T.
Figure 1.27 (a) (b)
From (b) the area of the same square is
c² + 4T.
Since a given square has but one area, it follows that
a² + b² + 4T = c² + 4T,
whence
a² + b² = c².
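As a check on subdivision (a), the same area can be obtained algebraically. The lines below assume, as the figure suggests, that the triangle is a right triangle with legs a and b, so that T = ab/2; this is a supplementary verification and not part of the geometric proof itself.

```latex
\begin{align*}
(a+b)^2 &= a^2 + 2ab + b^2
         = a^2 + b^2 + 4\left(\tfrac{1}{2}ab\right)
         = a^2 + b^2 + 4T, \\
(a+b)^2 &= c^2 + 4T
  \quad\Longrightarrow\quad a^2 + b^2 = c^2 .
\end{align*}
```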
NOTE 3
The ancient Greek proof that √2 is irrational utilizes the indirect proof.
We assume that our conclusion is false and show that this leads to a
contradiction, whence we conclude that the conclusion was in fact true. To
this end suppose that m and n are whole numbers such that m/n = √2
and that m/n is in lowest terms.
Squaring both sides we obtain
m²/n² = 2 or m² = 2n².
Since m² = 2n² and 2n² is certainly divisible by 2, m² is also divisible
by 2. But 2 is a prime number; hence, if it divides m² it must divide m.
(Beware, this is true only for prime numbers. For example, 6 is not
divisible by 4, even though 6² = 36 is. We may write 4 as 2 × 2. Then each 2
divides one of the two factors of 6 in 6 × 6. However, primes cannot be
broken down into smaller factors.)
Since m is divisible by 2, we may write
m = 2k
where k is also a whole number. Thus,
m² = 4k².
Now m² = 2n², and m² = 4k². Therefore,
4k² = 2n²
since they both equal m². From 4k² = 2n² it follows that
n² = 2k².
From n² = 2k² we can now conclude that n is also divisible by 2. This
is the required contradiction; for since m and n are both divisible by 2,
m/n is not in lowest terms; yet we assumed that it was.
As a final remark, we might want to ask what made the ancient Greek
suspect that √2 was irrational since he already felt that the rational
numbers filled in the entire number line. Perhaps he believed that √2
was rational and proceeded to deduce what whole numbers had this
quotient. When the procedure yielded a known contradiction, he may
then have decided that he had invented an indirect proof that √2 was
irrational.
Exercises
1. In terms of decimals, explain why there must exist irrational numbers.
2. Let m denote the decimal 0.676776777 . . . (where each time one more 7
is added to the cycle) and let n denote 0.212112111 . . . (where each time
another 1 is added to the cycle).
(a) Are m and n rational numbers? Explain.
(b) Is m + n a rational number? Explain.
(c) Does the sum of two irrational numbers always yield an irrational
number? Explain.
(d) With m as above, find two rational numbers whose difference is less
than 0.0000001 such that one of them is more than m while the
other is less than m.
3. Find two irrational numbers whose difference is less than 0.00000001
such that one of them is greater than £ while the other is less than
4. From a practical point of view, what does it mean if we write
√2 = 1.41?
5. Highlight the difference between a point and a dot to explain why the
ancient Greek did not expect irrational numbers to exist.
6. Show by means of an example that the product of two irrational numbers
may be rational.
7. Using the proof in the text that √2 is irrational as an example, show
that √3 is irrational.
8. If we already know that √3 is irrational and that the sum of two
rational numbers is rational, show how we may conclude that √3 − 2
must be irrational.
1.7 SIGNED NUMBERS

The concept of negative numbers is introduced relatively early into the
elementary school curriculum by such means as profit and loss. On the
other hand, irrational numbers are rarely discussed until the later years of
high school and, even then, usually quite superficially. Thus, it may seem
strange that while the ancient Greek knew about irrational numbers as
early as 600 B.C., it was not until the thirteenth century A.D. that negative
numbers were first treated as a subject; and it was not until the sixteenth
century that they became an integral part of higher mathematics.
The reason that negative numbers were so hard for man to discover
provides an interesting forerunner to the study of complex numbers.
Namely, man has a tendency to restrict the meaning of “real” to things
that we can at least visualize. In this sense, then, “imaginary” is a
relative thing, dependent upon the known body of knowledge. For
example, consider the indicated operation, 5 − 7. Certainly we cannot
take seven tallies away from a collection that contains only five tallies.
To be sure, one who has a superficial understanding of negative numbers
gives the answer as −2 which, he says, means that the collection “owes”
us 2. But if, indeed, the collection still owes us 2, it must be only because
we did not take our seven. No matter how we disguise it, the Great
Society notwithstanding, we cannot take seven from five. In other words,
as long as the model for numbers was tallies, the idea of “less than none”
was “imaginary.” When tallies were replaced by lengths, it was equally
unrealistic to talk about lengths shorter than nothing. Thus, it should
become fairly obvious why the ancient Greek used the word “imaginary”
to denote the answer to 5 − 7.
The point is that a concept was being judged not in its own right but in
terms of a particular physical model. In terms of concepts already
discussed by us, it is like talking about the quotient of two fractions. That
is, we might not talk about dividing a fraction of a pie among a fraction of
a person, but it certainly makes sense to talk about a particle which moves
a fractional number of feet in a fractional number of seconds. In this same vein,
while we cannot take seven pieces of candy from a dish which contains only
five pieces, we can lower the temperature by 7°F of something whose
temperature is 5°F.
The aim of this section is to present the development of signed numbers
and to indicate the further importance of the number line. Let us begin
our study as it occurred historically, in terms of profit and loss.
It is clear that in any business transaction the difference between profit
and loss is crucial. That is, if we conclude a $7 transaction, we are
interested in knowing whether the $7 was a profit or whether it was a loss. The
magnitude of the transaction alone is not enough. Suppose we wish to
think of 0 as the symbol that indicates a transaction that was neither a
profit nor a loss. In other words, we think of 0 as denoting that we broke
even. Then we could think of profit as a number in excess of 0, and a loss
as a number less than 0. More specifically, instead of using the traditional
red and black inks to distinguish between profit and loss, or instead of
writing P7 or L7 to distinguish between a $7 profit or $7 loss, we elect to
use the symbol + to denote profit and the symbol − to denote loss. The
reasons for choosing + and − as our symbols will be explained in more
detail later.
Thus, we think of a signed number as being composed of two separate
parts: (1) the magnitude, or size, and (2) the signature, + or −, which we
use as a code to distinguish between profit and loss. Thus, we would write
(+7) to indicate a $7 profit and (−7) to indicate a $7 loss. The
parentheses are used to emphasize that both the signature and the magnitude are
necessary to describe the transaction fully. In other words, +7 or −7 is
treated as one entity.
Before proceeding further, notice that we usually read + as “plus” and
− as “minus.” This is quite natural because + and − are the usual
symbols to denote addition and subtraction, respectively. However, and we
shall show a connection later, the fact remains that conceptually there is a
difference between signature and the operations of addition and
subtraction. In other words, we have opened the doors for misinterpretation by
letting one symbol, +, exist in two different contexts. To help minimize
this problem we agree to read the signature + as “positive” and − as
“negative.”
Suppose we now wish to discuss the meaning of the sum of two signed
numbers. As we mentioned in our discussion of rational numbers, there
is no “natural way” to define the sum of two numbers. What is “natural”
will depend on what we want the sum to mean. In terms of business
transactions, it seems natural that the sum of two transactions be the net
result of the two transactions. In other words, we are saying that since
signed numbers have been introduced to denote profit and loss, the sum of
two signed numbers should represent the resultant profit or loss of the two
transactions. To illustrate these remarks, let us arbitrarily choose a pair
of magnitudes, say 7 and 3, and investigate.
(1) (+7) + (+3)
(2) (+7) + (−3)
(3) (−7) + (+3)
(4) (−7) + (−3).
Again notice the double use of both + and −. On the one hand, we
are using them to indicate positive and negative signatures. On the other
hand, + is also being used to denote the sum of a pair of signed numbers.
Using the interpretation previously discussed, we have:
(+7) + (+3) = (+10) since the net effect of a $7 profit and a $3 profit is
a $10 profit.
(+7) + (−3) = (+4) since the net effect of a $7 profit and a $3 loss is a
$4 profit.
(−7) + (+3) = (−4) since the net effect of a $7 loss and a $3 profit is a
$4 loss.
(−7) + (−3) = (−10) since the net effect of a $7 loss and a $3 loss is a
$10 loss.
This is summed up pictorially in Figure 1.28.
(+7) + (+3) = (+10)   (+7) + (−3) = (+4)   (−7) + (+3) = (−4)   (−7) + (−3) = (−10)
Debit | Credit        Debit | Credit       Debit | Credit       Debit | Credit
      |   7                 |   7            7   |                7   |
      |   3             3   |                    |   3            3   |
      |  10                 |   4            4   |               10   |
Figure 1.28
Of course, other than for the specific answer, the above discussion did
not depend on numbers having magnitudes 7 and 3. The debit-credit
motif makes it clear that if the two transactions are either both credits or
both debits, we find the net effect by adding the magnitudes; whereas, if we
have one credit and one debit, we subtract to find the balance. We state
this more formally as a rule.
Adding Signed Numbers
(1) If the numbers have like signatures, add the two magnitudes and affix
the common signature.
(2) If the numbers have unlike signatures, subtract the smaller magnitude
from the larger magnitude and affix the signature of the number which
has the larger magnitude.
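The rule can be followed step by step in a short program. In this sketch a signed number is written as a pair (signature, magnitude); that representation and the name add_signed are our own. The rule as stated does not cover unlike signatures with equal magnitudes, so writing the break-even result as ("+", 0) is an assumption made here.

```python
def add_signed(a, b):
    """Add two signed numbers given as (signature, magnitude) pairs."""
    (sign_a, mag_a), (sign_b, mag_b) = a, b
    if sign_a == sign_b:
        # (1) Like signatures: add the magnitudes, keep the common signature.
        return (sign_a, mag_a + mag_b)
    if mag_a == mag_b:
        return ("+", 0)   # assumption: a break-even result is written (+0)
    # (2) Unlike signatures: subtract the smaller magnitude from the larger
    # and affix the signature of the number with the larger magnitude.
    if mag_a > mag_b:
        return (sign_a, mag_a - mag_b)
    return (sign_b, mag_b - mag_a)

for a in (("+", 7), ("-", 7)):
    for b in (("+", 3), ("-", 3)):
        print(a, "+", b, "=", add_signed(a, b))
```

Notice that, like the formal rule itself, the program never mentions profit or loss.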
Whereas the formal rule may appear to be more awesome than our
previous remarks, notice that it merely captures the same meaning we have
already seen in a fairly intuitive way using our profit-loss interpretation.
However, before condemning the rule as being stilted and nonintuitive,
notice that the formal rule never mentions the words “profit” and “loss.”
In other words, the formal rule, while motivated by physical interpretation,
never makes mention of any physical interpretation. This idea plays an
important role later in the text.
Subtraction with signed numbers is the next operation we consider.
Here we can reinforce an important concept already studied. Since we
have previously introduced subtraction as the inverse of addition and since
we have already defined (a − b) by the relation b + (a − b) = a, we can
be consistent, and at the same time take the easy way out, by keeping this
same definition of subtraction. Just as before, when we agreed that 13 − 4
means the number that must be added to 4 to yield 13, we now agree that
(+7) − (+3) means the number that must be added to (+3) to yield (+7).
Once we know how to add signed numbers by using this definition, it
becomes fairly easy to see that (+7) − (+3) = (+4).
In terms of the bookkeeping analogy, we could think of addition as
finding the balance when the two transactions were known. Subtraction,
then, may be viewed as finding the second transaction if the first
transaction and the balance are known. We have the following.
(+7) − (+3) = (+4) since (+4) + (+3) = (+7)
(+7) − (−3) = (+10) since (+10) + (−3) = (+7)
(−7) − (+3) = (−10) since (−10) + (+3) = (−7)
(−7) − (−3) = (−4) since (−4) + (−3) = (−7).
These results may be summarized by the following formal rule.
Subtracting Signed Numbers
(1) Change the minus sign to a plus sign, and then change the signature
of the number which is being subtracted.
(2) Perform the resulting addition. For example, applying this to (+7) −
(−3), we obtain (+7) + (+3) = (+10), which agrees with our
previous result.
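That the rule agrees with the definition of subtraction can be checked case by case. The sketch below lets Python's built-in integers stand in for signed numbers, which is an assumption of convenience; the name subtract_signed is hypothetical.

```python
def subtract_signed(a: int, b: int) -> int:
    """Subtract b from a by the rule: change the signature of b, then add."""
    return a + (-b)

for a in (+7, -7):
    for b in (+3, -3):
        difference = subtract_signed(a, b)
        # The definition: the difference is the number which, added to b, yields a.
        assert b + difference == a
        print(f"({a:+d}) - ({b:+d}) = ({difference:+d})")
```

The four lines printed reproduce the four results obtained above from the bookkeeping analogy.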
Notice, again, that the formal rule tells us nothing that we had not
already known. However, it is a good rule in the sense that it, too, never
mentions the physical terms, profit and loss. Yet the same danger exists
here that existed when we spoke of the rule for dividing rational numbers.
Remember that the rule for division of rational numbers could be
summarized by the phrase “invert and multiply.” If we identify (+a) and
(−a) as being additive inverses in the sense that their sum is zero (that is,
(+a) + (−a) = 0), then the rule for subtracting signed numbers can be
summarized by the phrase “invert and add.” Because a rule is an
accepted convention and does not have to be logical, there seems to be
something mystical about it; and this, in turn, tends to make the rule seem
“unreal.” As we have said before, it makes more sense for us to visualize
what appears to be happening first and then to see that the rule is nothing
more than a convenient aid to our memory rather than to accept a result
blindly.
We are still left with the problem of defining multiplication and division
of signed numbers. However, if we once again agree to define division as
the inverse of multiplication, we need really only study the operation of
multiplication. As a change of pace, let us first observe the formal rule
and then look at the intuitive method.
Multiplying Signed Numbers The product of two signed numbers is defined
to be the signed number (1) whose magnitude is the product of the
magnitudes of the two signed numbers and (2) whose signature is positive if
both factors have the same signature, and negative if the factors have unlike
signatures.
(+7) × (+3) = (+21)
(+7) × (−3) = (−21)
(−7) × (+3) = (−21)
(−7) × (−3) = (+21).
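Stated as a program, the multiplication rule is even shorter than the addition rule. As before, the pair representation (signature, magnitude) and the name multiply_signed are our own choices for illustration.

```python
def multiply_signed(a, b):
    """Multiply two signed numbers given as (signature, magnitude) pairs."""
    (sign_a, mag_a), (sign_b, mag_b) = a, b
    # (2) Like signatures give a positive product, unlike signatures a negative one.
    sign = "+" if sign_a == sign_b else "-"
    # (1) The magnitude of the product is the product of the magnitudes.
    return (sign, mag_a * mag_b)

for a in (("+", 7), ("-", 7)):
    for b in (("+", 3), ("-", 3)):
        print(a, "x", b, "=", multiply_signed(a, b))
```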
Granted, again, that rules need no rationale to defend them, is there an
intuitive reason for accepting this rule of multiplication of signed numbers?
This question is rather difficult to answer because what is intuitive to one
need not be intuitive to another. In fact, not only does intuition vary
from person to person, but even within the same person it may vary from
time to time. Be this as it may, the following logic may suffice.
We may think of (+7) × (+3) as meaning that we made a profit of $3
seven times or a profit of $7 three times. This would account for a profit
of $21 or (+21).
As for (+7) × (−3), we may think of this as meaning that we lost $3
seven times. This would account for the answer (−21). Notice that the
interpretation of winning $7 “negative three” times has no “natural”
physical significance. In a similar way we think of (−7) × (+3) as
meaning that we lost $7 three times or (−21).
The problem occurs when we consider the expression (−7) × (−3).
To say that two negatives make a positive (hence (+21) must be the
answer) is begging the question. For if it is so natural that two negatives
yield a positive, how is it possible, for example, that (−7) + (−3) =
(−10)? In other words, is it natural to assume that the best way to incur
a profit is to suffer two losses? Only a strange form of subsidy could
guarantee such a situation. In the same way, the interpretation that we
lost $7 “negative three” times is equally shaky from a physical point of
view. We shall try here to supply a few observations as to why the product
of two negative numbers is positive. Observe, of course, that we have
this right by definition. It is simply that we prefer a better motivation.
Notice that in a random transaction, profit and loss are equally likely
possibilities. In a sense, one man’s loss is another’s profit. Since profit
and loss are equally likely, and since profit is denoted by + and loss by −,
it would seem that there should be an equal number of +’s and −’s in
the multiplication table. Had we defined the product of two negatives to
be negative, our table would have had the unbalanced result of three −’s
and only one +. In summary, then, we might say that our sense of
symmetry requires an equal number of +’s and −’s. However, notice
that we are appealing only to our sense of symmetry and that there is
nothing logically binding about this observation. To view this more
dramatically, consider the case of odd and even whole numbers. Given
a whole number at random, it is equally likely that it is odd as it is even.
Yet, a glance at the multiplication table shows that the even numbers
outnumber the odd three to one. That is,
even × even = even
even × odd = even
odd × even = even
odd × odd = odd.
This should convince us that it is very easy in mathematics (or, in any
other subject as well) to make up reasons once we know the right answer.
A rule of arithmetic that we shall study later is called the cancellation
rule. It states that if a ≠ 0 and if a × b = a × c, then b = c. Suppose
we would like this rule to be true. Then the indirect proof will show that
(−7) × (−3) = (+21), for we already know that (−7) × (+3) = (−21).
Hence, if we assume that (−7) × (−3) = (−21), we would have
(−7) × (+3) = (−7) × (−3).
Hence, by the cancellation rule it would follow that (+3) = (−3), which,
at least in terms of profit and loss, is a contradiction. That is, we certainly
would not call a $3 loss and a $3 profit the same transaction.
If a, b, c, and d are counting numbers, it is easy, geometrically speaking,
to see that (a + b) × (c + d) = ac + ad + bc + bd. This is illustrated
in Figure 1.29. Suppose that we would like this rule to be true even when
we have negative, or minus, signs as well. For example, consider
(8 − 7) × (4 − 3).
On the one hand, we know that the answer is 1 × 1, or 1. On the other
hand, we would have
(8 × 4) + [8 × (−3)] + [(−7) × 4] + [(−7) × (−3)],
which in turn is 32 − 24 − 28 + [(−7) × (−3)], and this must equal 1.
In turn, this is possible only if (−7) × (−3) = (+21).
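The last step may be written out in full. Here x is merely our own temporary name for the unknown product (−7) × (−3).

```latex
\begin{align*}
32 - 24 - 28 + x &= 1 \\
-20 + x &= 1 \\
x &= 21 .
\end{align*}
```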
[Diagram: a rectangle with sides (a + b) and (c + d), divided into four
smaller rectangles whose areas are ac, bc, ad, and bd.]
On the one hand the area of the rectangle is (a + b)(c + d). On the other
hand, it is ac + bc + ad + bd.
Figure 1.29
Once we accept the rule for multiplying signed numbers, the definition of
division, the inverse of multiplication, becomes immediately apparent. For,
if we continue to define m ÷ n to mean the number which must be
multiplied by n to yield m, we obtain the following:
(+21) ÷ (+3) = (+21)/(+3) = (+7) since (+7) × (+3) = (+21)
(+21) ÷ (−3) = (+21)/(−3) = (−7) since (−7) × (−3) = (+21)
(−21) ÷ (+3) = (−21)/(+3) = (−7) since (−7) × (+3) = (−21)
(−21) ÷ (−3) = (−21)/(−3) = (+7) since (+7) × (−3) = (−21).
In summary, then, the concept of profit and loss gives us a reasonable
physical motivation for accepting the rules of arithmetic for signed numbers.
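The definition of division as the inverse of multiplication can be turned directly into a search: try each signature for the quotient and keep the one that multiplies back correctly. The sketch below uses Python's built-in integers for signed numbers and assumes the magnitudes divide evenly; the name divide_signed is hypothetical.

```python
def divide_signed(m: int, n: int) -> int:
    """Return the number q which must be multiplied by n to yield m."""
    magnitude = abs(m) // abs(n)
    for q in (magnitude, -magnitude):   # try the positive, then the negative
        if q * n == m:
            return q
    raise ValueError("no whole-number quotient")

for m in (+21, -21):
    for n in (+3, -3):
        print(f"({m:+d}) / ({n:+d}) = ({divide_signed(m, n):+d})")
```

No separate rule of signs for division was needed; the signature of each quotient is forced by the rule for multiplication.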
1.8 SIGNED NUMBERS AND THE NUMBER LINE

From a purely computational point of view, the bookkeeping analogy
certainly led to the invention of signed numbers. Yet René Descartes is
often viewed as the inventor of signed numbers even though he lived in the
seventeenth century, some 400 years after the bookkeeping interpretation was
introduced.
Before we describe Descartes’ contribution, it might be well to talk about
the state of society at the time that he lived; for there is often a correlation
between the accomplishments of the times and the accepted beliefs of the
society. Although historical data for this period is abundant, we shall
attempt to create a brief but accurate account of the spirit of the times.
The era in which Descartes lived is known as the Age of Enlightenment.
This age was ushered in when the heliocentric theory of the universe
replaced the geocentric theory, the geocentric being that the Earth was the
center of the universe. Such a theory had much to do with the religious
beliefs of the time, for with the Earth as the center of the universe, it was
easy to view man as being next to the angels. In the sixteenth century,
the Polish astronomer Copernicus advanced the cause of the heliocentric
theory: the belief that the Sun, not the Earth, was the center of the
universe. This belief was considered to be heretical. In fact, Galileo was
almost burned at the stake for supporting the views of Copernicus.
However, by the time of Descartes, the heliocentric theory had been well
accepted as the successor to the geocentric theory.
It should not be difficult to see that the shift to the heliocentric theory
greatly changed the attitudes of men, socially as well as philosophically,
for man now began to re-evaluate his role in the nature of things. The
heliocentric theory made man feel less significant in many ways. He no
longer saw himself as the center of the universe, but merely as another
creature in a vast universe. Even the great Voltaire exclaims: “Is it
reasonable to assume that the laws of nature govern the entire universe, but
that a five-foot high animal, called man, remains immune to these laws?”
In this same vein, the religious beliefs of many also changed. With the
vastness of the universe established, man began to feel that God’s domain
was so vast He could not be concerned with each detail in the course of
everyday living. It was not that man felt less godly; it was just that he
felt more responsible for his own actions in this world. For example, many
who previously would not feed a starving neighbor now began to feel a
responsibility to this neighbor. This was not because people had become
“better”; it was only a change of attitude, for previously man felt that his
neighbor was starving as a punishment for offending God. Hence, any
interference would result in his being punished as well. However, now
he felt that such bad luck to his neighbor need not reflect Divine anger, but
might have been just that: bad luck!
In summary, then, the scientific achievement of replacing the geocentric
theory by the heliocentric theory had a profound effect on other facets
of society, for while the theory did not change man’s belief in God, it did
make him more conscious of his own role in the universe. In particular,
he now believed that he was more responsible for his own actions on Earth,
and that he also had some responsibility toward his fellow man. Notice
that it is during the Age of Enlightenment that such social contracts as
described by Locke and Rousseau begin to take shape.
It was into this environment that Descartes was born. He was a
theologian, philosopher, and mathematician. So great were his
accomplishments in each of these fields that he would probably still be famous today
had he been involved in only one of these three major pursuits. Perhaps
the influence of the Age of Enlightenment on the religious beliefs of
Descartes may be seen from his famous quotation: “Cogito ergo sum” (“I
think, therefore I am”). For a theologian to make such an assertion prior
to the general acceptance of the heliocentric theory might well have caused
him to be branded a heretic. But in the Age of Enlightenment, this
quotation merely reflects Descartes’ awe at man’s ability to think rationally.
His philosophical beliefs in conjunction with his theological beliefs gave
him faith that there must exist a key to the natural laws of the universe.
That is, he believed that there was a unifying thread to nature. Since his
quest was for quantitative relationships, he decided to study mathematics
to prepare himself better for this task.
Descartes proved to be an adept pupil of mathematics, and he mastered
both algebra and geometry. It was ironic that the vehicle he chose as
the backbone of his quest to find the unifying thread of nature turned out
to cause him considerable anguish: for in his study of geometry and algebra,
he found that these two topics were treated as being totally unrelated.
Perhaps it seemed paradoxical that the subject he had chosen to help him
seek certain unifying threads itself possessed no such unifying threads.
In other words, if he did not undertake the task of unifying algebra and
geometry, he might well have found himself in the awkward position of
saying that everything in the entire universe was related, except algebra
and geometry!
Thus, it should not be surprising that when Descartes began his quest
of unifying algebra and geometry, he was particularly pleased by the
ancient Greek concept of the number line. This, he felt, was a step in the
right direction; for, as we have previously mentioned, the idea of naming
points (a geometric concept) numerically (an arithmetic concept),
practically forces one to accept a rather natural relationship between algebra
and geometry.
With the above brief description of Descartes and his times as a
background, does it seem strange that he was annoyed by the bookkeeping
interpretation of signed numbers? Certainly, there was no unity between
the concept of the number line and the concept of profit and loss. Thus,
Descartes’ first endeavor was to explain signed numbers in terms of the
number line. Again, without worrying about precise detail, let us see
how such a study might have proceeded.
First of all, recall how the number line began. We drew a straight line,
chose a standard unit of length, and marked off this unit consecutively
along the line, as in Figure 1.30.
+-----+-----+-----+-----+-----+
1     2     3     4     5     6
Figure 1.30
In this way, certain points on the line
received names while others did not. Rational numbers were then
invented, at least in part, by the feeling that it did not seem natural that
certain points be named, but not others. In the same way, once the
rational numbers were motivated in terms of the number line, the same
question motivated the invention of the irrational numbers. Thus, along
these exact same lines, we might next ask why the point P in Figure 1.31
should be deprived of a numerical name merely because it appears to the
left of the beginning of the number line. After all, it was just chance or
whim that caused us to draw the straight line from left to right rather than
from right to left. In other words, had we by whim drawn the line from
right to left, P might have received a numerical name while the points
1, 2, 3, . . . would not have.
P    1    2    3    4    5    6    7    8    9
Figure 1.31
To free our concept from such a whim, let us imagine, instead, that we
chose a fixed starting point 0 and that we now draw a straight line so that
it passes through 0, but in such a way as to extend on either side of it,
as in Figure 1.32.
Figure 1.32
This now causes an ambiguity to exist. For example, suppose we wish
to locate the point on the line which is 1 unit from 0. We find that two,
not one, points now qualify. These points are indicated in Figure 1.33.
Figure 1.33 (the line through 0, with A marked to the left of 0 and B to the right)
A is the point on the line which lies 1 s. u. to the left of 0, while B is the
point on the line which lies 1 s. u. to the right of 0. In other words, we
find that both distance and direction are important when we wish to locate
a point on the line. Consequently, it might be well to name each point
with two symbols, one telling its distance from 0 (magnitude) and the other
telling the direction from 0 (signature). Thus, we might label A by L1
to show that it is 1 to the left of 0, and B by R1 to indicate that it is 1 to
the right of 0. (Compare this with the discussion of P7 and L7 in the section
on profit and loss.) However, for reasons that we shall soon discuss,
let us abandon R and L notation for + and - notation. Specifically,
we shall use + to indicate the left to right direction and - to indicate the
right to left direction. Just as we did with rational numbers, we shall
identify lengths and points by saying that if a length L is measured from 0
in either direction, the point at which the length terminates will be labeled
L. In this way we obtain Figure 1.34. Here, then, is the extension of the
number line that the Greeks did not visualize. Namely, + and - are being
used as a code to tell direction. In this sense, -3 does not indicate a line
that is 3 s. u. shorter than nothing! Rather, it means a line of length 3 s. u.
drawn in the right to left direction. In terms of the usual notation of signed
numbers, the Greeks viewed the concept of magnitude, but Descartes added
the concept of signature. The influence of the number line is felt also by
virtue of the fact that many textbooks refer to signed numbers as directed
numbers.
Figure 1.34 (the number line extended in both directions from 0, with points labeled by signed numbers)
Notice, also, from a physical point of view that direction is as important
as magnitude in the study of distances. Thus, if it is 100 miles from a
town A to a town B and a person leaves town A and travels at an average
speed of 25 miles per hour, he will reach town B in 4 hours only if he travels
in the direction from A to B. In other words, the idea of introducing directed
distances in terms of the number line has physical as well as philosophical
meaning.
We can now interpret the arithmetic of signed numbers in terms of the
number line. For example, since signed numbers are now being visualized
as representing directed distances, it is natural to define the sum of two
such numbers as the resultant distance traveled. That is,
(+7) + (+3) = +10 since a motion of 7 s. u. to the right followed by an
additional motion of 3 s. u. to the right is equivalent
to the single motion of 10 s. u. to the right.
(+7) + (-3) = +4 since a motion of 7 s. u. to the right followed by a
second motion of 3 s. u. to the left is equivalent to
the single motion of 4 s. u. to the right.
(-7) + (+3) = -4 since a motion of 7 s. u. to the left followed by a
motion of 3 s. u. to the right is equivalent to a
single motion of 4 s. u. to the left.
(-7) + (-3) = -10 since a motion of 7 s. u. to the left followed by
another motion of 3 s. u. to the left is equivalent to
the single motion of 10 s. u. to the left.
Adding Signed Numbers
1. If the numbers have like signatures, add the two magnitudes and affix
the common signature.
2. If the numbers have unlike signatures, subtract the smaller magnitude
from the larger and affix the signature of the number having the larger
magnitude.
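The recipe for adding signed numbers can be sketched as a short Python function. This is only an illustration and not part of the original development: the name add_signed and the choice to write a signed number as a pair (signature, magnitude) are our own assumptions.

```python
def add_signed(x, y):
    """Add two signed numbers written as (signature, magnitude) pairs.

    The signature is '+' or '-'; the magnitude is a nonnegative number.
    """
    sign_x, mag_x = x
    sign_y, mag_y = y
    if sign_x == sign_y:
        # Rule 1: like signatures. Add the magnitudes, keep the common signature.
        return (sign_x, mag_x + mag_y)
    # Rule 2: unlike signatures. Subtract the smaller magnitude from the larger
    # and keep the signature of the number having the larger magnitude.
    # (If the magnitudes are equal the result has magnitude 0, and the
    # signature attached to it carries no meaning.)
    if mag_x >= mag_y:
        return (sign_x, mag_x - mag_y)
    return (sign_y, mag_y - mag_x)


# The four motions along the number line discussed above.
print(add_signed(('+', 7), ('+', 3)))  # ('+', 10)
print(add_signed(('+', 7), ('-', 3)))  # ('+', 4)
print(add_signed(('-', 7), ('+', 3)))  # ('-', 4)
print(add_signed(('-', 7), ('-', 3)))  # ('-', 10)
```

Nothing in the function mentions profit, loss, left, or right; it follows the formal rule only, so it serves either interpretation equally well.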
This is precisely the same rule that we stated when we discussed signed
numbers in terms of profit and loss. Thus, from a purely formal point of
view, Descartes' interpretation of signed numbers causes no discrepancy
with the profit-and-loss interpretation. The recipes are the same, independent
of the choice of interpretation. The beauty of the formal rule
should now be apparent. Namely, no matter which of the two interpretations
we elect to use, the formal rule remains the same. On the other hand,
had we chosen to define signed numbers, say, in terms of profit and loss, then
trouble would have arisen when we talked about directed distances, since
there is no direct correlation between these two interpretations. We
avoided such labels as P-L or R-L, since these labels can prejudice the
interpretation. + and -, on the other hand, are equally well adjustable to
either interpretation. In fact, the formal definition allows us to use any
interpretation of signed numbers that we wish, subject only to the restriction
that the interpretation obeys the definition.
Thus Descartes supplied us with an interpretation of signed numbers that
completes our number line for us. That is, the natural, rational, irrational,
and signed numbers filled in the number line with no spaces left. The resulting
number line is called the real number line. If we wish to make no
reference to the number line, we define a real number as one whose square
is at least as great as 0; that is, x is real if and only if x^2 ≥ 0. This agrees
with our previous data in terms of the number line; for we know that a point
lies to the right of 0 (+), to the left of 0 (-), or coincides with 0. In each
case we have: (+) × (+) = +; (-) × (-) = +; 0 × 0 = 0. Thus, the formal
definition and the number-line definition agree.
In summary, the number line unifies the real numbers for us. All real
numbers can be viewed as either points or lengths. This is a tremendous
improvement over the idea that the whole numbers are used for counting,
the rational numbers are used in slicing pies, the negative numbers are for
profit and loss, and the irrational numbers are to make us nervous!
Exercises
1. Explain why we may refer to Descartes as the inventor of signed numbers
even though signed numbers were known some 400 years before
his time.
2. Explain how 5 - 7 has no real meaning in terms of tally systems but
does have meaning in terms of the number line.
3. Interpret each of the following results both in terms of profit-and-loss
and as directed lengths.
(a) (+3) + (+2) = +5.
(b) (+3) + (-2) = +1.
(c) (-3) + (+2) = -1.
(d) (-3) + (-2) = -5.
4. Use the result of Exercise 3(d) to criticize the following argument:
"It is natural that the product of two negative numbers be a positive
number because the rule of double negation asserts that two negatives
make an affirmative."
5. With the advent of signed numbers we agree to define a - b to mean
a + (-b). In terms of this interpretation show why 5 - 7 = -2.
6. Perform the indicated operations.
a ' -2 -3 ' - -7
b t - j + .♦
c ; -3 - 4 ; - -2
d -3 ' -4 - -2 ■
1.9 THE COMPLEX NUMBERS
After the last few sections it should be apparent what we mean when we
say that realness is in the eyes of the beholder. In summary, before the
rational numbers were invented 3 ÷ 2 named a "nonreal" number; before
irrational numbers were invented √2 named a "nonreal" number; and
before signed numbers were invented 5 - 7 named a "nonreal" number.
In a simple extension of this chain: after the real "real" numbers had been
invented, √(-1) named a "nonreal" number (since by definition of a real
number, the square of a real number cannot be negative). However, we
must remember that in the truest sense √(-1) is no more "unreal" to our
present system than was 3 ÷ 2 when all we had were the whole numbers.
To restate the above remarks in terms of a vehicle that we can further
exploit later, let us redevelop the origin of our number system with respect
to polynomial equations.
To this end, assume that we only know whole numbers and we want to
solve the equation
x - 3 = 4.
We know the meaning of x - 3 = 4 since it uses only symbols that we
have agreed we can comprehend. Moreover, it possesses a solution,
namely, x = 7. Since 7 is a member of the known number system, we
probably not only obtain the solution but we accept it in a very natural
way. Now, however, let us consider the equation
x + 4 = 3.
Certainly, it is as understandable to us as was x - 3 = 4, since, again,
it uses only symbols whose meaning we have agreed upon. However,
unlike the equation x - 3 = 4, there is no existing (whole) number that
serves as a solution. Thus, if we want to solve x + 4 = 3, we will have to
augment our number system; that is, we will have to invent more numbers.
(Actually, we must do more than just add more numbers; rather our new
number system must extend the old system so that what was true in the
old must still be true in the new; otherwise, we introduce grounds for
contradiction.) In brief, the equation x + 4 = 3 can be thought of as
introducing the necessity for inventing signed (positive and negative) whole
numbers, or as they are more commonly referred to, the integers. In this
sense, the number -1 is invented as a solution to x + 4 = 3.
Continuing with the idea of extending the number system through the
solution of equations, suppose that our new "real" number system consists
only of the integers. Then we can obtain "real" solutions to both equations.
Now consider the equation
2x = 3.
This equation cannot possess a solution that consists of an integer. For
if x denotes any integer, 2x denotes an even integer. Thus, since 3 is
odd and 2x is even and since no integer can be both even and odd, it follows
that 2x = 3 has no integral solution. Thus, we are confronted with the
choice of considering this problem unsolvable or else inventing an extended
number system. Again, in brief, suppose we wish to solve 2x = 3, and,
accordingly, we invent the system known as the rational numbers. Then,
x = 3/2 would be a "real" solution to the equation because we are now
conceding that our new "real" number system is the rational numbers.
Moreover, with the advent of the rational numbers, not only is 2x = 3
solvable, but so also do x - 3 = 4 and x + 4 = 3 remain solvable precisely
as before the rational numbers were invented.
Continuing one step further, if we were to use the rational numbers as
a new model for the real numbers, the equation
x^2 - 2 = 0
would be sensible to us, yet it could possess no "real" solution since the
"real" numbers are the rational numbers and √2 is irrational.
It should now be clear as to what we mean by realness depending on the
level of our knowledge. Suppose, then, that we now accept the real
number system as the system that actually bears that name; that is, x is
real if and only if x^2 ≥ 0. Let us now consider
x^2 + 1 = 0.
Certainly, this equation is just as understandable to us as were the four
previous equations. However, since any solution, x, of x^2 + 1 = 0 implies
that x^2 = -1, we cannot by definition have a real number as its solution.
Thus, if we wish to solve this equation we must again augment the real
number system. We elect to have x^2 + 1 = 0 possess a solution and we
denote one such solution by i. That is, i is defined by i^2 = -1 or
i = √(-1).
The key point now is that i is no more "imaginary" to our present real
number system than was 3/2 when only the whole numbers were known to man.
The major difference lies in the fact that we may feel more at ease with 3/2 than
with i.
We choose the name complex numbers to name the system that augments
the real numbers so that x^2 + 1 = 0 can have a solution. More specifically,
we define the complex numbers to be all numbers of the form
a + bi,
where a and b denote real numbers and i denotes one of the roots of
x^2 + 1 = 0.
Concerning the definition a + bi, one might wonder why it does not contain
additional terms, say, other powers of i. The answer lies in the fact
that we want the complex numbers to be an extension of the real numbers.
This means that everything that is true for real numbers should still remain
true when the real numbers are viewed as being part of the complex
numbers.
Actually this concept may seem more complicated than is really the case.
To see the problem at a more familiar level, consider the problem of adding
rational numbers. In the previous sections we agreed to define a/b + c/d to
be (ad + bc)/bd, even though it would have been easier and/or more natural to have
defined this in terms of adding like parts (numerator to numerator and
denominator to denominator). We can now give another reason
for this definition in terms of the rational numbers being an extension of the
whole numbers. Suppose we agreed that the definition of adding common
fractions was to add like parts. Then 3/1 + 2/1 = 5/2, where 3 = 3/1 and 2 = 2/1;
however, now the sum of two and three depends on whether we view these
numbers as whole numbers or as rational numbers. In one case the answer
is 5; in the other case the answer is 5/2. The cumbersome technique of
defining a/b + c/d = (ad + bc)/bd guarantees that the sum of two and
three does not depend on whether we view these as whole numbers or
rational numbers. In summary, we need the more cumbersome definition if
the rational numbers are to be an extension of the whole numbers.
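The comparison of the two candidate definitions can be tried out in a few lines of Python. The sketch below is ours, not the authors': the function names add_like_parts and add_standard are hypothetical, and a fraction a/b is written as the pair (a, b). Python's standard fractions module is used only to reduce and display the results.

```python
from fractions import Fraction


def add_like_parts(p, q):
    """The rejected rule: add numerators together and denominators together."""
    (a, b), (c, d) = p, q
    return Fraction(a + c, b + d)


def add_standard(p, q):
    """The adopted rule: a/b + c/d = (ad + bc)/bd."""
    (a, b), (c, d) = p, q
    return Fraction(a * d + b * c, b * d)


two = (2, 1)    # the whole number 2 viewed as the fraction 2/1
three = (3, 1)  # the whole number 3 viewed as the fraction 3/1

print(add_like_parts(two, three))  # 5/2, which disagrees with 2 + 3 = 5
print(add_standard(two, three))    # 5, which agrees with the whole numbers
```

Only the more cumbersome rule gives the same answer whether 2 and 3 are viewed as whole numbers or as fractions, which is what it means for the rational numbers to extend the whole numbers.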
To continue, one of the best ways to guarantee that the new system augments
and extends the old is to have the new system adhere to every rule
of the old. As an example, recall that if b is any real number, and, say,
m and n are any natural numbers, then b^m b^n = b^(m+n). Thus, it would be
wise for us to define b^m b^n = b^(m+n), even if b is a complex number. In this
way, if b happens to be real it will have this same exponent property regardless
of whether we view b as real or complex. This choice leads us
to the following inescapable conclusions.
i^1 = i
i^2 = -1
i^3 = i^2 i = (-1)i = -i (provided that we wish to keep the
real-number fact that (-1)a = -a)
i^4 = i^2 i^2 = (-1)(-1) = 1
i^5 = i^4 i = 1i = i.
Now the cycle is determined. In fact, for any whole number n, i^n must
reduce to either 1, -1, i, or -i. Moreover, again assuming that we insist
that the usual rules for exponents be obeyed, we can develop a very convenient
recipe for reducing i^n. For example, suppose we wish to reduce i^73.
We first note that 73 = 4(18) + 1 (we singled out 4 because i^4 = 1). We
now write
i^73 = i^(4(18)+1) = (i^4)^18 i^1 = (1)^18 i = 1i = i.
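The recipe for reducing a power of i amounts to keeping only the remainder of the exponent on division by 4. A minimal Python sketch follows; the function name reduce_power_of_i is our own, and the answer is returned as a text label rather than as a number.

```python
def reduce_power_of_i(n):
    """Reduce i**n, for a whole number n, to one of '1', 'i', '-1', '-i'."""
    # Since i**4 = 1, only the remainder of n on division by 4 matters:
    # remainder 0 gives 1, remainder 1 gives i, 2 gives -1, 3 gives -i.
    cycle = ['1', 'i', '-1', '-i']
    return cycle[n % 4]


for n in (1, 2, 3, 4, 5, 73):
    print(n, reduce_power_of_i(n))
```

For n = 73 the remainder is 1, so the function returns 'i', in agreement with the computation above.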
Thus, the definition of a complex number as given by a + bi begins to
seem reasonable as soon as we agree that the complex numbers must serve
as an extension of the real numbers. We must admit, however, even if we
would not agree, that a man has the right to say, "I don't care about being
able to solve x^2 + 1 = 0. It is not worth the trouble." Indeed, this same
remark could have been made about the equation 2x = 3 when the only
numbers were the whole numbers. The problem reduces quite nicely to
the philosophy that if the new concept gets the job done, it is worth developing.
Our claim is that there are many real places where the complex
numbers do get the job done. We shall return to this point shortly, but
for now we prefer to expose the reader to a trivial, but real, interpretation
of i.
For an interpretation of i consider the four parade commands, attention,
about face, right face, and left face. We shall invent a code for these,
letting 1 denote attention. Since about face is the opposite of attention
we shall use -1 to denote about face. Then i will denote right face and
-i, left face. Finally, we shall introduce the notation C to denote any
command and n, any whole number. Then C^n denotes the command that
is equivalent to performing C, n consecutive times. For example, 1^n = 1
because no matter how many times we carry out the command of attention,
it is the same as if we merely remained at attention. Then (-1)^2 = 1
since carrying out about face twice is equivalent to having responded to
the single command of attention. Similarly, i^3 = -i since three consecutive
executions of right face place us in the same position as the execution
of one left face. This interpretation follows all the rules that must be
followed by 1^n, (-1)^n, i^n, and (-i)^n. While this example may seem rather
superficial and a little trivial, it sufficiently serves for the purposes of
illustration.
Let us develop the arithmetic of complex numbers a bit further. As
we have mentioned, a mathematical system is governed not so much by its
members as it is by the rules that tell us how to relate and combine these
members. These rules need not be logical, "natural," nor self-evident.
Rather they are chosen to capture a particular mood or situation. With
respect to the system of complex numbers we have, in the last section,
committed ourselves to a particular course of action; namely, to ensure that
the complex numbers are an extension of the real-number system. In this
section we shall show how this commitment together with some fairly
elementary deductive logic allows us to derive the algebra of complex numbers
in a very straightforward way.
To begin with, let us establish some convenient nomenclature. We have
already seen that a complex number has the form a + bi, where a and b
are both real numbers. The standard language refers to a as being the
real part of the complex number, and to b as being the imaginary part of
the complex number. Notice, then, that by this definition both the real part
and the imaginary part of a complex number are real numbers. (In other
words, do not be thrown off by the name. The imaginary part of a complex
number is the coefficient of i and does not include i.)
We should be quite easily convinced that we need some way of equating,
or comparing, complex numbers. In other words, just as we can talk
about two real numbers being equal, we would like to be able to talk about
two complex numbers being equal. We know that we want the complex
numbers to be an extension of the real numbers. We already know what
it means for two real numbers to be equal, and that both the real and the
imaginary parts of a complex number are real. With this as background,
it would seem quite consistent to define two complex numbers to be equal
if and only if their real parts are equal and their imaginary parts are equal
(this involves statements of equality between real numbers).
Definition
Given the two complex numbers a + bi and c + di, we say that a + bi =
c + di if and only if a = c and b = d.
The logical beauty of our definition is that it defines equality of complex
numbers completely in terms of equality of real numbers. Hence, if we
were logicians and if we already knew the properties of equality when
applied to real numbers, we could then deduce which facts followed
inescapably concerning equality of complex numbers. An illustration might
prove informative.
It is well known that, with regard to real numbers, equality is
a transitive relation. In plain English, if the first real number is equal to
the second, and if the second real number is equal to the third, then the
first real number is equal to the third. If we use symbols: If x, y, and z
are real numbers and if x = y and y = z, then x = z. We would like to
know whether equality with regard to complex numbers is also transitive.
All we need do is merely apply our definition and our previous knowledge.
Namely, we assume that we are given three complex numbers a + bi,
c + di, and e + fi; and we assume that a + bi = c + di and c + di =
e + fi. By definition, we know that a = c and b = d, and c = e and d = f;
but a, b, c, d, e, and f are real numbers. Hence, since a = c and c = e, it
follows that a = e. In a similar way, since b = d and d = f, it follows
that b = f. But if a = e and b = f, the definition of equality for complex
numbers tells us that a + bi = e + fi. Thus, we have shown that if we
want the usual rules of equality of real numbers to still prevail, the definition
of equality of complex numbers guarantees that equality with respect to
the complex numbers is a transitive relation. In short, we initially had
the choice to preserve or ignore the "old" rules, but once we made the choice
to preserve the old rules we no longer were free to choose the rules to
govern the extension; they were forced on us. Similarly, although it was not
mandatory to define a foot to be 12 inches or a yard to be 3 feet, once these two
arbitrary definitions were accepted, we had to define 36 inches as being
equal to a yard. In other words, given any two of the three relations:
12 inches = 1 foot, 3 feet = 1 yard, 36 inches = 1 yard, the third follows
as an inescapable consequence.
As an aside, this example illustrates a good reason for using subscripts.
As we followed the above example, it was sometimes awkward to keep
track of a, b, c, d, e, and f. On the other hand, we are used to the notation
a + bi. Thus, it might prove convenient to label our numbers a_1 + b_1 i,
a_2 + b_2 i, and a_3 + b_3 i. In this way, the a or b indicates whether we are
talking about the real part or the imaginary part of the number, and the
subscript indicates to which number we are referring. Moreover, with
such a scheme, no matter how many complex numbers are being used in a
problem, we never have to panic at the thought of exhausting the alphabet.
Now let us create a definition for the process of adding complex numbers.
Suppose that we are given the two complex numbers a + bi and c + di,
and that we wish to define their sum. It is important to remember that
the complex numbers are an extension of the real numbers. As for the real
numbers, we know that addition is commutative (this means that the
sum does not depend on the order of addition; that is, if x and y are real
numbers, then x + y = y + x); that it is associative (addition does not
depend on voice inflection; in other words, a + b + c yields the same sum
whether we "pronounce" it (a + b) + c or a + (b + c)); and that the
distributive law holds (this is basically nothing more than the fundamental
principle of factoring). In other words, the distributive law merely states
that if x, y, and z are real numbers, then x(y + z) = xy + xz. Thus, if
these same rules are to prevail for the complex numbers, we must have
(a + bi) + (c + di) = a + bi + c + di
= a + c + bi + di
= (a + c) + (b + d)i.
The above demonstration is not a proof of anything. All we can say
is that if we want the complex numbers to be an extension of the real
numbers, then and only then are we compelled to define the addition of
two complex numbers in the following manner.
Definition
Given the two complex numbers a + bi and c + di, we define their sum
by (a + bi) + (c + di) = (a + c) + (b + d)i.
The above definition yields some inescapable conclusions. For example,
since a, b, c, and d are all real numbers, we know that a + c and b + d are
also real numbers. Hence, (a + c) + (b + d)i is, by definition, a complex
number. This shows us an inescapable consequence of our definition:
The sum of two complex numbers is also a complex number. In other
terminology, we say that the complex numbers are closed with respect to
addition, meaning that we do not get "new" things when we add complex
numbers. By contrast, the odd integers are not closed with respect to
addition since the sum of two odd numbers need not be odd; in fact, the
sum will always be even. On the other hand, the odd integers are closed
with respect to multiplication since the product of two odd numbers is
always odd. The above definition not only shows us that the sum of two
complex numbers is a complex number but also that its real part is the sum
of the real parts of the addends and its imaginary part is the sum of the
imaginary parts of its addends. Needless to say, this is a useful computational
fact.
We can deduce a few more facts about complex numbers from the above
definition. We know that for real numbers, 0 is characterized by being
the additive identity. In plain English, for any real number, b, b + 0 = b.
Suppose that we want to investigate the existence of an additive identity for
the complex numbers. If it exists, let us say that it has the form x + yi.
Now if a + bi is any given complex number, we must then have that
(a + bi) + (x + yi) = a + bi.
Otherwise, by definition, x + yi could not be the additive identity. Our
definition of addition tells us that
(a + bi) + (x + yi) = (a + x) + (b + y)i.
Comparing the last two equations, we see that
a + bi = (a + x) + (b + y)i.
Applying our definition of equality to this last equation, we see that
a = a + x and b = b + y.
However, the last two equations involve only real numbers, and assuming
that we can solve linear equations involving real numbers, these equations
force us to conclude that
x = 0 and y = 0.
This shows us that if an additive identity is to exist and if we still wish to
preserve the meaning of the additive identity for the real numbers, then
we have no choice other than to define it as 0 + 0i.
Let us next investigate multiplication. Omitting some details, suffice it
to say that if we wish to obey the usual rules of real numbers, we have no
choice but to carry out the following sequence of steps.
(a + bi)(c + di) = ac + a(di) + (bi)c + (bi)(di)
= ac + (ad)i + (bc)i + bd i^2
= ac + (ad + bc)i + bd(-1), (since i^2 = -1)
= (ac - bd) + (ad + bc)i.
Thus, we are led to the following definition.
Definition
Given the complex numbers a + bi and c + di, we define their product
by (a + bi)(c + di) = (ac - bd) + (ad + bc)i.
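The definitions of sum and product translate directly into code. In the Python sketch below, a complex number a + bi is written as the pair (a, b) of its real and imaginary parts; the function names add and multiply are our own choices. Python's built-in complex type, which writes i as j, is used only as an independent check.

```python
def add(z, w):
    """(a + bi) + (c + di) = (a + c) + (b + d)i, with numbers as (real, imaginary) pairs."""
    a, b = z
    c, d = w
    return (a + c, b + d)


def multiply(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


i = (0, 1)
print(multiply(i, i))             # (-1, 0), that is, i^2 = -1
print(add((3, 2), (5, -7)))       # (8, -5)
print(multiply((3, 2), (5, -7)))  # (29, -11)

# Independent check against Python's own complex arithmetic.
print(complex(3, 2) * complex(5, -7) == complex(29, -11))  # True
```

Both functions use only additions and multiplications of real numbers, so each result is again a pair of real numbers.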
This definition assures us that the complex numbers are closed with
respect to multiplication since both ac - bd and ad + bc must be real
numbers because a, b, c, and d are real numbers.
Before we continue further, a small but important idea confronts us.
Suppose that x denotes a real number. By the definition of an extension,
x must also be a complex number. Yet, we have agreed that a complex
number must have both a real and an imaginary part. That is, x must
have the form a + bi, where a and b are both real. The trick here is to
observe that for any real number, b, b × 0 = 0. So if we want this property
to remain true in the complex number system, we want 0i = 0,
whereupon x = x + 0i. This line of reasoning leads to the following
definition.
Definition
The complex number a + bi is called real if and only if b = 0. It is
called purely imaginary, or imaginary, if a = 0. (Thus, a complex number
is real if its imaginary part is 0, and imaginary if its real part is 0.) In
this sense only 0 is both real and imaginary.
We are now in a position to show that the definitions for addition and
multiplication extend from the real numbers to the complex numbers.
That is, suppose that x and y are real numbers. Then their sum is x + y.
However, as complex numbers they are represented by x + 0i and y + 0i.
According to our rule for adding complex numbers, the sum is now given
by (x + y) + (0 + 0)i = (x + y) + 0i, and this, of course, corresponds
to x + y. Thus, the sum of two real numbers is independent of whether
we view the numbers as real or complex. In a similar way, the product of
x and y would be written as
(x + 0i)(y + 0i) = (xy - 0·0) + (x·0 + 0·y)i = xy + 0i = xy.
The above discussion fulfills the aim of this section, which was to supply
insight into how a mathematical system is born and then nourished. However,
there is one more interesting definition to discuss.
Definition
If a + bi is any complex number, then a - bi is called the complex conjugate
of a + bi. In other words, if we take a complex number and change
the sign of its imaginary part, the resulting complex number is called the
complex conjugate of the first complex number.
The concept of the complex conjugate has many important roles, but
for the present our purpose will be served by the following: Notice that
by definition of complex numbers and the rule of multiplication
(a + bi)(a - bi) = a^2 + abi - abi - (bi)^2
= a^2 - (bi)^2 = a^2 - (b^2 i^2) = a^2 - (b^2[-1]) = a^2 - (-b^2) = a^2 + b^2.
However, since a and b are both real numbers, it follows from the
definition of real numbers that both a^2 and b^2 are nonnegative, and hence
that a^2 + b^2 is a real, nonnegative number. In other words, the definition
of a complex conjugate assures us of a way of transforming a complex
number into a real number by either addition or multiplication. That is,
if we are given a + bi, we form a - bi. Then
(a + bi) + (a - bi) = 2a
which is real and
(a + bi)(a - bi) = a^2 + b^2
which is not only real, but also nonnegative.

We shall point out additional uses of complex conjugates in later sections,
but one use can be presented within the framework of this section.
We know that the complex numbers are closed with respect to addition
and multiplication. It is also trivial to show that they are closed with
respect to subtraction. However, it is somewhat more difficult to show
closure with respect to division (excluding, as usual, division by 0). It is
in this respect that the complex conjugate can be used effectively. For
example, suppose we want to divide (3 + 2i) by (5 + 7i). One quick
way is to write
(3 + 2i)/(5 + 7i).
However, we have insisted the form of a complex number be a + bi
where a and b are real. That is, we are obliged to express (3 + 2i)/(5 + 7i)
in the form of a real number plus a real number times i. To this end we
transform the denominator into a real number by multiplying it by its
complex conjugate. Of course, to maintain equality we must also multiply
the numerator by the same thing. Thus, we rewrite (3 + 2i)/(5 + 7i)
in the form
(3 + 2i)(5 - 7i) / [(5 + 7i)(5 - 7i)].
Since
(3 + 2i)(5 - 7i) = (15 + 14) + (10 - 21)i = 29 - 11i
and
(5 + 7i)(5 - 7i) = 25 + 49 = 74,
it follows that
(3 + 2i)/(5 + 7i), or (3 + 2i)(5 - 7i) / [(5 + 7i)(5 - 7i)],
is equal to (29 - 11i)/74.
That is,
(3 + 2i)/(5 + 7i) = (29 - 11i)/74 = 29/74 - (11/74)i
(the desired form).
As a final check that this answer agrees with the meaning of division,
we need only check that the product of (29 - 11i)/74 and 5 + 7i yields
3 + 2i.
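The division procedure, including the final check, can be sketched in Python. As before, a + bi is written as the pair (a, b), and the function names divide and multiply are our own; the standard fractions module keeps the arithmetic exact, which assumes that the real and imaginary parts are integers or fractions.

```python
from fractions import Fraction


def multiply(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


def divide(z, w):
    """Divide z by w by multiplying numerator and denominator by the conjugate of w."""
    c, d = w
    conjugate = (c, -d)
    denominator = c * c + d * d  # (c + di)(c - di) = c^2 + d^2, a real number
    if denominator == 0:
        raise ZeroDivisionError("division by the complex number 0")
    real, imaginary = multiply(z, conjugate)
    return (Fraction(real, denominator), Fraction(imaginary, denominator))


quotient = divide((3, 2), (5, 7))
print(quotient)                    # (Fraction(29, 74), Fraction(-11, 74))
print(multiply(quotient, (5, 7)))  # (Fraction(3, 1), Fraction(2, 1)), that is, 3 + 2i
```

The last line carries out the check described above: the quotient times the divisor returns the original numerator.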
In the next section we shall do the same thing in a visual way. Specifically,
our aim in the next section is to invent complex numbers in terms of
the geometric concept of the number line.
Exercises
1. Compute the value of i^171.
2. What is the sum of (3 + 4i) and (7 - 8i)?
3. What is the product of (3 + 4i) and (7 - 8i)?
4. What is the quotient when (3 + 4i) is divided by (7 - 8i)?
5. What is the quotient when (7 - 8i) is divided by (3 + 4i)?
6. What complex number must be multiplied by (2 + 3i) to yield (3 + 2i)?
7. What is the complex conjugate of (4 - 3i)?
8. Consider the equation (x - 1)^2 + 3 = 0. Why is it impossible for any
real number x to satisfy this equation?
9. Express each of the following in the form a + bi where a and b are real
numbers.
(a) (3 + 4i)^2
(b) (3 + 4i)(3 - 4i)
(c) (3 + 4i)(7 - 8i)(2 + i)
(d) (3 + 4i)(7 - 8i)/(4 - 5i)
(e)
1.9.1 THE COMPLEX NUMBERS AS AN EXTENSION
OF THE NUMBER LINE (AN INTRODUCTION TO VECTORS)
Descartes wanted to do more than just give signed numbers an interpretation
in terms of length. He wanted to unify all of algebra and geometry.
Thus, once he completely developed the number line, he turned his attention
to those points in the plane that were not on the number line. Using
Figure 1.35 as an illustration, he might have asked, "Is P any less of a
real point in the plane merely because it happens not to be on the real line?
After all, the real line could have been drawn anywhere in the plane."
Figure 1.35 (a point P in the plane, off the number line)
Descartes thought of placing two number lines at right angles to each
other, and thus was born the Cartesian plane. (This concept has particularly
great importance in terms of graphs, something we shall discuss
at a later time.) These two number lines became known as the x axis and
the y axis; and points in the plane could now be identified with a pair of
real numbers, one of which was called the x coordinate of the point
(abscissa) and the other of which was called the y coordinate of the point
(ordinate). Moreover, a form of place-value was introduced as an
abbreviation; namely, it would be understood that the coordinates would be
listed as an ordered pair of numbers, the first of which would be the x
coordinate and the second of which would be the y coordinate. See
Figure 1.36.
The Cartesian plane supplies us with the identification we need. Namely,
suppose we decide to abbreviate the complex number x + iy by (x,y).
Then, since by definition of a complex number x and y are real numbers,
we have that (x,y) is an ordered pair of real numbers. Thus, we shall
now identify the complex number x + iy with the point (x,y) in the
Cartesian plane. As a matter of vocabulary, when the Cartesian plane is
used as a geometric model of the complex numbers, we refer to it as the
Argand diagram.
Figure 1.36 (points of the Cartesian plane labeled by ordered pairs)
However, now we must question whether the geometric model captures in
a natural way the properties we have required the complex numbers to
have. To this end, let us look at some of the algebraic requirements of
the complex numbers and see how they conform with the model.
(1) We have agreed that the complex numbers are an extension of the real
numbers in the sense that x + iy is real if and only if y = 0. However,
if y = 0 the associated point in the plane is (x,0), which denotes a
point on the x axis. Thus, in the Argand diagram the real numbers are
represented by the x axis, which is the number line, just as we want it to
be. In a similar way, the y axis becomes known as the purely imaginary
axis since if x = 0, x + iy is identified with the point (0,y), which
defines the y axis. In summary, then, in the Argand diagram the real
numbers are identified with the x axis, the imaginary numbers are
identified with the y axis, and the complex numbers that are neither
real nor (purely) imaginary are identified with points in the plane that
are on neither axis.
(2) We have agreed that the two complex numbers $(x_1 + iy_1)$ and $(x_2 + iy_2)$
are equal if and only if $x_1 = x_2$ and $y_1 = y_2$. Translating this into the
language of points-in-the-plane, we would say that $(x_1,y_1) = (x_2,y_2)$ if
and only if $x_1 = x_2$ and $y_1 = y_2$; but this is precisely the criterion that
two points in the plane coincide. Thus, the definition of equality
translates into the fact that complex numbers are equal if and only if
they name the same point; and this certainly agrees with what our
intuition would dictate.
(3) In discussing the number line, we emphasized that numbers could be
identified by either points or lengths. In our approach we emphasized
the length interpretation when we discussed the basic operations of
arithmetic. In particular, we defined addition in terms of laying off
the two lengths end-to-end to obtain the sum. For positive numbers
this posed no need for caution, since all numbers were represented by
lengths that had the same direction and the same sense. (By sense
we mean that while a particular straight line has a specific direction it
can be traversed in two ways. For example, with regard to a horizontal
line, we may move along it either from left to right or from
right to left. In terms of the number line, the sense of all positive
numbers is from left to right.)
The major question is how we shall add lengths when the lengths can
have different senses, or different directions. The answer to this question
gives us an excellent excuse to introduce the very important concept of a
vector. Stated as briefly as possible, certain physical quantities are
determined by their size (magnitude) alone, while others depend on both their
size and their direction. For example, the concept of time is measured in
size alone without regard to direction, in the sense that if we walk for two
hours it is still two hours no matter in what direction we walk. On the
other hand, suppose that we wish to push a table across the floor and that
we use a push of 20 pounds of force to do this. Then, certainly, the
direction in which we apply the force is important. For example, if the force
is applied vertically downward the table will not move. Thus, force is
dependent on direction as well as on magnitude. Quantities that depend
on magnitude alone are called scalar quantities (or simply, scalars), while
those that depend on both magnitude and direction are called vectors.
As an aside, the difference between scalars and vectors explains the
difference between speed and velocity, and between distance and
displacement. That is, when we say that we are traveling 30 miles per hour we
are referring to our speed, and this speed does not depend on the direction
in which we are moving. On the other hand, we sometimes say we are
traveling in a given direction, such as 30 miles per hour due east. When
direction is important we talk about velocity rather than speed. For
example, we talk about the speedometer on a car; if the speedometer had a
compass attached, we could call it a “velocitymeter.” In short, speed is
the scalar and velocity is the vector. Notice that which of the two we want
depends on the physical situation. For instance, if a car gets a certain
gas mileage at 30 miles per hour and a different gas mileage at 40 miles
per hour, we would talk about gas mileage being dependent upon speed.
On the other hand, if we wanted to get from town A to town B and if we
drove at 30 miles per hour and if the distance between the two towns were
60 miles, then the trip would take 2 hours, but only if we proceeded in the
direction from A to B. Thus, in this instance we would be interested in
velocity. In a similar way, distance is a scalar; but when we are interested
in distance in a given direction we call it displacement, and this is a vector
concept.
For now, we wish to restrict the discussion of vectors simply to the
difference between distance and displacement. When we were dealing with
only positive numbers there was no need to make this distinction since all
lengths had the same direction and the same sense. We now want to
discover how to add vectors. In terms of a picture, notice that scalars
were represented merely as lengths. By the nature of a scalar quantity,
there was no need to indicate either a direction or a sense for such a length.
However, to denote vectors, we have to be more precise. In this case, we
will use arrows to picture a vector. The length of the arrow denotes the
magnitude, the direction of the arrow represents the direction of the vector,
and the sense of the vector is denoted by the placement of the arrowhead.
To clarify this point, a vector is denoted by an arrow in the same way
that a number is denoted by a numeral.
Since the only factors that go into the definition of a vector are its
magnitude, sense, and direction, it is obvious that we should define two
vectors to be equal if and only if they have the same magnitude, sense,
and direction. In terms of a picture the two vectors are equal if their
arrows have the same lengths, are parallel, and have the same sense. (We
never refer to the “same sense” unless the directions are the same; that is,
sense is used only to determine in which direction we place the arrowhead
once the direction of the vector is known.) If we accept this definition,
it means that two vectors may be called equal (as arrows) even if they do
not coincide. All that is required is what we have stated above.
Recalling once again that we are free to make up any definitions we wish,
let us invent a way to add two vectors, thinking of the arrows rather than
of the vectors. We would like the sum of two vectors to also be a vector;
and we also wish that whatever definition we invent correctly represents
the process for adding real numbers. Therefore, we let A and B denote
vectors. We shall define (A + B) to be the following vector. We shift
B so that its tail coincides with the head of A. (Recall that we are allowed
to move a vector provided we do not alter its magnitude, direction, or
sense.) Then (A + B) will be defined as the vector that originates at the
tail of A and terminates at the head of B. This is illustrated in Figure 1.37.
Figure 1.37
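The head-to-tail rule can be sketched in a few lines of code. This is only an illustration under our own assumptions: an arrow is recorded by the coordinates of its tail and its head, and the names Arrow, same_vector, shift_to, and add are hypothetical helpers, not notation from the text.

```python
from typing import NamedTuple

class Arrow(NamedTuple):
    """An arrow drawn from a tail point to a head point (hypothetical helper)."""
    tail: tuple
    head: tuple

    def components(self):
        # Horizontal and vertical change from tail to head. Magnitude,
        # direction, and sense are all determined by this pair.
        return (self.head[0] - self.tail[0], self.head[1] - self.tail[1])

def same_vector(a, b):
    """Two arrows name the same vector even if they do not coincide."""
    return a.components() == b.components()

def shift_to(arrow, new_tail):
    """Slide an arrow so its tail sits at new_tail, leaving the vector unchanged."""
    dx, dy = arrow.components()
    return Arrow(new_tail, (new_tail[0] + dx, new_tail[1] + dy))

def add(a, b):
    """Shift b so its tail is at the head of a; the sum runs from a's tail to b's head."""
    shifted_b = shift_to(b, a.head)
    return Arrow(a.tail, shifted_b.head)

A = Arrow((0, 0), (3, 1))
B = Arrow((5, 5), (6, 8))   # drawn elsewhere in the plane
S = add(A, B)
print(S)
print(S.components())
```

Note that B is drawn far from A; the definition still applies because shifting B does not change which vector it denotes.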
While this definition is for the sum of two vectors, it can be extended to
sums of three or more vectors (just as in ordinary arithmetic we define the
sum of two numbers and then proceed to add three or more numbers).
For example, given the three vectors A, B, and C, we observe that A + B
is a vector; hence, (A + B) + C is the sum of two vectors, as is shown in
Figure 1.38.
Figure 1.38
Of course, we could also have formed the sum A + (B + C). Fortunately,
it turns out that A + (B + C) = (A + B) + C; and as a result
we can write A + B + C without worrying about voice inflection. Thus,
the definition of vector addition agrees with the definition of ordinary
addition in this respect (see Figure 1.39); that is, vector addition is associative.
Now that we have started to make comparisons between vector addition
and numerical addition, let us examine a few more properties of addition.
In ordinary addition if a and b are numbers then a + b = b + a. The
parallelogram in Figure 1.40 indicates that a similar result holds for the
sum of two vectors; that is, vector addition is commutative.
Figure 1.40
The number 0 is characterized as being the additive identity; that is, 0
does not change a number with respect to addition. In still other words,
b + 0 = b for all real numbers b. Is there a similar vector property?
To answer this, we let B denote any vector and observe what B + C looks
like. From Figure 1.41 it should be apparent that if C has any length,
B + C cannot equal B. Hence, if we are to have B + C = B, we are
required to have C be the vector whose length is 0. In other words, if we
define 0 to be the vector of zero magnitude, then for all vectors B,
B + 0 = B.
Figure 1.41
Finally, the arithmetic property of subtraction follows from the fact
that given any real number b, there exists another real number -b such
that b + (-b) = 0. From this we define subtraction by letting b - c
mean b + (-c). In terms of vectors we are asking the following: Given
any vector B is there a vector C such that B + C = 0? The diagram
in Figure 1.42 shows that C must have the same direction and magnitude
that B has, but the opposite sense.
Figure 1.42 (For B + C to have no length, the head of C must coincide
with the tail of B; this C is written -B.)
In this way, we can define A - B by A + (-B) (see Figure 1.43).

While we could check other properties if we so desired, we think the
point has been made, the point being that the definitions of vectors and
vector addition capture pictorially what we want to be true about the
arithmetic of numbers.
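The properties just discussed can be spot-checked numerically. The sketch below assumes that each vector is recorded by its horizontal and vertical components (an identification of our own choosing at this point), and the function names add, negate, and subtract are illustrative only. A check on three sample vectors is of course not a proof.

```python
def add(u, v):
    """Head-to-tail addition, written in terms of components."""
    return (u[0] + v[0], u[1] + v[1])

def negate(u):
    """Same magnitude and direction, opposite sense."""
    return (-u[0], -u[1])

def subtract(u, v):
    """A - B is defined as A + (-B)."""
    return add(u, negate(v))

ZERO = (0, 0)   # the vector of zero magnitude
A, B, C = (3, 1), (-2, 4), (5, -6)

associative = add(add(A, B), C) == add(A, add(B, C))
commutative = add(A, B) == add(B, A)
identity = add(B, ZERO) == B
inverse = add(B, negate(B)) == ZERO
difference = subtract(A, B)

print(associative, commutative, identity, inverse)
print(difference)
```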
Let us now return to the Argand diagram. Given the complex number
x + iy, we have agreed that we may view it as the point (x,y). In terms
of vectors, we could also view x + iy as the vector that originates at the
point (0,0) and terminates at the point (x,y), as shown in Figure 1.44.
Since we have agreed that vector equality depends only on magnitude,
direction, and sense, we may without loss of generality assume unless
otherwise stated that every vector originates at the point (0,0). If we
do this then we may abbreviate the vector by the symbol (x,y) also, the
idea being that then the vector (x,y) terminates at the point (x,y);
whether we mean the vector or the point should be clear from context.
Now observe that if we treat $(x_1 + iy_1)$ and $(x_2 + iy_2)$ as vectors and
form their sum, we obtain what is shown in Figure 1.45. In other words,
the sum is $(x_1 + x_2) + i(y_1 + y_2)$ or, using alternate notation,
$$(x_1,y_1) + (x_2,y_2) = (x_1 + x_2,\, y_1 + y_2).$$
Figure 1.45
In summary, the definition of complex numbers makes vectors a good
geometric model for them, and the method for adding vectors is the perfect
geometric analog for how we add complex numbers. In particular,
if the two complex numbers are real, the vector definition is precisely the
rule for adding signed numbers.
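As an illustration, we can compare the two kinds of addition directly. This sketch uses Python's built-in complex type (which writes i as j); the helper add_as_vectors and the sample numbers are our own choices.

```python
def add_as_vectors(p, q):
    """Add two vectors that both originate at (0,0), component by component."""
    return (p[0] + q[0], p[1] + q[1])

z1 = complex(2, 5)    # 2 + 5i
z2 = complex(-6, 1)   # -6 + i

total = z1 + z2       # complex addition
as_vector = add_as_vectors((z1.real, z1.imag), (z2.real, z2.imag))
print(total)
print(as_vector)

# When both numbers are real, the same rule is the rule for signed numbers.
signed = add_as_vectors((5, 0), (-8, 0))
print(signed)
```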
Before closing this section, we will introduce one more concept. In certain
applications, as we have already mentioned, scalars are preferred to
vectors. In this context, there are times when we are interested only in
the length of the arrow, or the magnitude of the vector. With this in mind,
we invent the notation |A| to denote the magnitude of A (|A| is called the
absolute value of A). Translating this into the Argand diagram, the
Pythagorean theorem tells us that $|x + iy| = \sqrt{x^2 + y^2}$. This is
illustrated in Figure 1.46. In particular, if the number is real (that is, y = 0)
we have $|x| = \sqrt{x^2}$.
Figure 1.46
Notice that by convention, unless otherwise stated, the square root is
always taken to be nonnegative. (This convention is explained in Chapter 4.)
Thus, $\sqrt{x^2}$ is not the same as x unless x is nonnegative. If x is negative
then $\sqrt{x^2} = -x$. (Remember here that -x need not be a negative number.
In fact if x is negative then -x is positive. For example, if x = 6,
then |x| = 6 which is x; if x = -4, then |x| = 4 which is -x.) For a real
number x, then,
$$|x| = \begin{cases} x & \text{if $x$ is nonnegative} \\ -x & \text{if $x$ is negative.} \end{cases}$$
In short, then, |x| yields the magnitude of x.
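A short numerical sketch may help fix these two formulas. The function names magnitude and absolute_value, and the sample number 5 + 12i, are our own illustrations.

```python
import math

def magnitude(x, y):
    """Length of the arrow from (0,0) to (x,y), by the Pythagorean theorem."""
    return math.sqrt(x * x + y * y)

def absolute_value(x):
    """The two-case rule for a real number x."""
    return x if x >= 0 else -x

m = magnitude(5, 12)
print(m)

# For real numbers, the square root of x squared agrees with the two-case rule.
for x in (6, -4):
    print(x, absolute_value(x), math.sqrt(x * x))
```

In both sample cases the last two printed values agree, even though the second value of x is negative.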

The idea of absolute value gives us another feature of the complex
conjugate of a number. Recall that x - iy is defined to be the complex
conjugate of x + iy. From the definition of multiplication
$$(x + iy)(x - iy) = x^2 + y^2 = |x + iy|^2.$$
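The identity can be checked on a sample number. In this sketch a complex number x + iy is recorded as the pair (x, y), and multiply and conjugate are hypothetical helper names.

```python
def multiply(p, q):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i, on (real, imaginary) pairs."""
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c)

def conjugate(p):
    """The complex conjugate of x + iy is x - iy."""
    return (p[0], -p[1])

z = (5, 12)
product = multiply(z, conjugate(z))
print(product)   # real part is 5*5 + 12*12; imaginary part is 0
```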

Exercises
1. Locate each of the following points in the Cartesian plane.
(a) (1,2) (d) (1,-2)
(b) (2,1) (e) (-2,1)
(c) (-1,2) (f) (-1,-2)
2. Explain how the Argand diagram affords us an interpretation wherein
the complex numbers are as real as the real numbers.
3. Add (3 + 4i) and (7 - 8i) as vectors.
4. What is the magnitude of (3 + 4i)? What does this tell us about the
location of (3 + 4i) in the Argand diagram?
5. In terms of the Argand diagram how are the positions of a complex
number and its complex conjugate related?
chapter two / AN INTRODUCTION
TO THE THEORY OF SETS
part I / The Mood Setter
2.1 INTRODUCTION
It is a toss-up, at least at the elementary level, as to whether sets or the
number line receives the most publicity in “modern” mathematics. The
number line can be traced back to 600 B.C., while the newer concept of
sets as a self-contained study can be traced back to about 1850 A.D.
This should serve as adequate evidence that “modern” is used not so
much as a synonym for “new,” but rather as a synonym for “meaningful”
or “useful.”
Why is the study of sets so meaningful? A complete answer to this
question would result in a multivolume text. For our immediate purposes,
it might suffice to say that in the same way that numbers are the building
blocks of arithmetic, sets are the building blocks of all mathematics.
Thus, sets can be used to help us examine every mathematical system;
they can be used to help us better understand the basic ideas of probability
theory, including the topics of permutations and combinations; they can
be used in the study of logic (Boolean algebra), including the designing of
computers; they can be used to enhance our ability in studying quantitative
relationships (known in the literature as the theory of functions); and
they can be used to help us gain an objective insight into the concept of
infinity. In fact, the study of sets allows us to study the entire concept of
counting in an extremely beautiful way.
The study of sets is the study of a single concept that has very many
applications. This is much more meaningful than having to learn many
concepts, each with a single application.
Actually, a set is nothing more than a collection. Thus, while the study
of sets may have had its formal origin in 1850, sets themselves must have
been known of intuitively since earliest times. That is, whether we
use the word “set” when we refer to a set of dishes or a set of books; or
whether we use a synonym for “set,” as when we refer to a flock (a
collection) of sheep or a herd (a collection) of cattle; or even more indirectly
when we talk about the World Series (a set of baseball games), or the
Boston Red Sox (a set of baseball players), the fact remains that we are
dealing with the basic, simple concept of a collection.
With these thoughts in mind, we shall proceed to find out why and how
the simple concept of a set has revitalized the entire teaching and learning
of mathematics.
2.2 A SET IS A BUNCH OF DOGS

(AN INTRODUCTION TO WELL-DEFINED SETS)
Since, from a practical point of view, mathematics is the handmaiden of
science and technology; and since, whenever possible, the scientist desires
quantitative, or at least objective, measurements rather than subjective
ones (for example, in an experiment, he would feel more comfortable
knowing that he wanted the water heated to 80°F instead of being told to
use lukewarm water), it should not be hard to understand that the
mathematician might want the idea of objectivity carried over to his concept of
a set.
For example, a mathematician might balk at the thought of studying
the set of all beautiful paintings because it is rather vague as to which
paintings are members of this set. Indeed, the winner of any beauty
contest depends on the panel of judges; and with a different panel there
might well be a different winner.
For scientific purposes, at least, we wish to study only those sets that
are more than just random collections, subjectively accumulated. We
want more precision than that of the first-grade youngster who summarized
what he had learned about sets by saying, “A set is a bunch of dogs.”
Instead, we would like to limit our study of sets to those for which the
members can be tested objectively; that is, in a precise way in which the
content of the set does not depend on the “judge.” More concretely,
consider the set of all people now living who were born on April 2, 1929.
If we wish to decide whether a person belongs to this set, we need only
ascertain the date of his birth; and surely the outcome of this task does
not depend on the person who is chosen to serve as judge. (To be sure,
the efficiency in ascertaining the date of birth may depend on who is the
judge, but the test for membership involves only the information, not how
it was obtained.) Thus, the set of all living people born on April 2, 1929,
has an objective test for membership.
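One way to picture an objective test for membership is as a rule that returns yes or no from the recorded facts alone. The sketch below is purely illustrative: the record layout, the names in it, and the function in_birthday_set are all invented for the example.

```python
from datetime import date

def in_birthday_set(person):
    """Objective test: the person is living and was born on April 2, 1929."""
    return person["alive"] and person["born"] == date(1929, 4, 2)

# Hypothetical records; the names and dates are made up for illustration.
people = [
    {"name": "Avery", "born": date(1929, 4, 2), "alive": True},
    {"name": "Blake", "born": date(1929, 4, 3), "alive": True},
    {"name": "Casey", "born": date(1929, 4, 2), "alive": False},
]

members = [p["name"] for p in people if in_birthday_set(p)]
print(members)
```

Whoever runs the test obtains the same list, which is the sense in which the outcome does not depend on the judge.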
A well-defined set is one in which for any given object there is an objective
rule that allows us to determine whether or not the object belongs to the
set. The object either belongs or it doesn’t belong, one or the other, but
not both.1
A set chosen at random is very likely a well-defined set; and this
likelihood increases very sharply if we are dealing with specific types of sets.
For example, most mathematics books refer to the set of rational
numbers. When we look up the definition of a rational number we find that
a rational number is any number that is the quotient of two integers.
Observe that this definition supplies us with a well-defined, objective test
for membership. Assuming that our judge can perform the operation
known as division, he has an objective test for determining whether a
given number is rational. To be sure, some judges may be able to find
out more quickly than others but this does not affect the test for
membership. In other words, one person might have a more efficient recipe for
performing the test than another (for example, in terms of decimals,
the rational numbers are precisely those decimals that either terminate or
repeat the same cycle endlessly). Again, this consideration does not alter
the test for membership.
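The remark about decimals can be illustrated by carrying out long division and watching the remainders. The following is a sketch under our own assumptions (a nonnegative numerator p and a positive denominator q); the function name decimal_expansion is hypothetical.

```python
def decimal_expansion(p, q):
    """Long division of p by q (p >= 0, q > 0).

    Returns (integer part, non-repeating digits, repeating digits).
    An empty repeating part means the decimal terminates.
    """
    integer_part, remainder = divmod(p, q)
    digits = []
    seen = {}   # remainder -> position of the digit it produced
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * 10, q)
        digits.append(str(digit))
    if remainder == 0:
        return integer_part, "".join(digits), ""
    start = seen[remainder]   # the cycle begins where this remainder first occurred
    return integer_part, "".join(digits[:start]), "".join(digits[start:])

print(decimal_expansion(3, 8))   # terminates
print(decimal_expansion(1, 3))   # repeats from the first digit
print(decimal_expansion(1, 6))   # one non-repeating digit, then a cycle
```

Since only q different remainders are possible, the loop must stop: either a remainder of 0 appears (the decimal terminates) or some remainder recurs (the digits cycle).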
As we said, the rational numbers are a well-defined set. In fact, in
virtually every situation in which we would want to use sets, the set will
probably be a well-defined set. And because we insist on dealing only with
well-defined sets, we need not concern ourselves with sets for which
membership is a highly subjective controversy.
We shall assume, unless specifically stated to the contrary, that all the
sets in the remainder of the text are well-defined sets.
1 While it might seem to be a truism that something is either true or false, one or the
other but not both, the fact remains that this demand gives rise to interesting paradoxes
that actually affect the study of the theory of sets. A note about this is supplied at the
end of this section.
A NOTE ON THE EXCLUDED MIDDLE
The rule of the excluded middle states that a proposition is either true
or false, one or the other, but not both. While this sounds logical enough
(and, even more importantly, is true in almost all cases), we must remember
that logic consists of applying man-made rules to man-made observations.
This means that logic by its very nature has built into it many of the flaws
of man.
As a case in point, consider the famous barber-of-Seville paradox. The
barber is the only barber of Seville. To advertise this fact in a way that
is as flattering as possible to himself, he puts the following sign in his
window:
I shave all men of Seville who do not shave themselves; but I shave no man who
shaves himself.
If we use his sign as the “test,” let us ask ourselves whether the barber
shaves himself. If he does, then he is a man who shaves himself; yet the
sign says that he shaves no man who shaves himself. Thus we see,
according to his sign, that if the barber shaves himself then he does not
shave himself. A similar thing happens if we assume that the barber
does not shave himself. In short, according to his sign we have the
paradox that if he does, he does not; and if he does not, he does!
The only “out” of the paradox is if the barber is not a resident of Seville;
for, after all, his sign refers only to men of Seville.
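The two cases above can be tried exhaustively. This sketch assumes, as the paradox does, that the barber is himself a man of Seville; the function name sign_is_satisfied is our own.

```python
def sign_is_satisfied(barber_shaves_himself):
    """Apply the sign to the barber himself.

    The sign says the barber shaves a man of Seville exactly when that
    man does not shave himself. For the barber, "the barber shaves him"
    and "he shaves himself" are the same statement.
    """
    barber_shaves_barber = barber_shaves_himself
    return barber_shaves_barber == (not barber_shaves_himself)

# Try both possibilities and keep those consistent with the sign.
consistent = [choice for choice in (True, False) if sign_is_satisfied(choice)]
print(consistent)
```

Neither possibility survives, which is the paradox stated in the language of true and false.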
The barber-of-Seville paradox is a riddle-type version of the serious
problem that the famous philosopher, Bertrand Russell, saw in the
definition of a set. In a manner analogous to our treatment of the
barber-of-Seville paradox, Russell claimed that to make sure that a similar paradox
could not result in the theory of sets, it should be explicitly stated in the
definition of a set that no set can ever be a member of itself.
While this might seem like a difficult and abstract condition, notice
that it imposes no hardship on us; for with the exception of
barber-of-Seville type paradoxes, every set that we study has this property. By
way of illustration, observe that New England refers to a set of states,
but that New England itself is not a state. The integers are a set (of
numbers), but the integers themselves are not a number. The National
Football League is a set of football teams, but the National Football League
itself is not a football team.
Thus, from a practical point of view the restriction that a set not be a
member of itself is almost automatically fulfilled. However, by imposing
this restriction we are able to remove the paradoxes from our considerations.
There are many barber-of-Seville type riddles that make their way into
popular magazines as puzzles from time to time in different forms. A
particularly popular version involves the case of a tribe of philosophical
cannibals. Before preparing their captive for the feast, they have him
utter a statement. If his statement is considered to be true, the reward
is that the captive is allowed to die a painless death. However, if his
statement is judged to be false, then the captive must die a painful death.
What statement can the captive make that presents a dilemma to the
cannibals? He says that he will die a painful death. If this statement is
true, then the rule says he must die a painless death; but dying a painless
death would make his statement false. This would mean he must die a
painful death; but then his statement would be true. In essence, if his
statement is true then it is false; and if it is false then it is true.
2.3 THE NUTS 'N BOLTS

With the idea of a set firmly entrenched in our minds, let us now turn to
the basic nomenclature. Like it or not, we must learn the basic vocabulary
before we can do anything else.
To begin with, unless it is otherwise specified, we denote sets by upper-case
letters; and we use lower-case letters to denote members (often called
elements) of a set. Thus, sets would be denoted by A, B, C, and so on;
and elements by a, b, c, and so on. As is the case in all studies, we invent
a convenient shorthand. Namely, if we wish to indicate that b is a member
of B (or an element of B, or more colloquially, b belongs to B) we write
$b \in B$.
If, on the other hand, we wish to indicate that b is not an element of B,
we use the standard mathematical technique of placing a slash mark
through the symbol that denotes the relation. In other words,
$b \notin B$
denotes that b does not belong to B.
As an illustration, we let W denote the set of whole numbers. Then
we could write $3 \in W$, since 3 is a whole number; we could also write
$\frac{1}{2} \notin W$ since $\frac{1}{2}$ is not a whole number.
To apply this new notation to our previous discussion, we may now say
that the definition for a well-defined set is as follows.
Definition
The set B is said to be well-defined if there is an objective test for
membership whereby for each b one and only one of the following two statements
is true: (1) $b \in B$ or (2) $b \notin B$. Moreover, $B \notin B$.
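Many programming languages carry this notation over almost verbatim. In the sketch below, Python's `in` and `not in` play the roles of the two membership symbols. The predicate is_whole is a hypothetical test for membership in W, and treating 0 as a whole number is an assumption made only for this example.

```python
from fractions import Fraction

def is_whole(n):
    """Hypothetical membership test for W (assumes 0 counts as a whole number)."""
    return Fraction(n).denominator == 1 and n >= 0

print(is_whole(3))                # 3 belongs to W
print(is_whole(Fraction(1, 2)))   # 1/2 does not belong to W

# For a finite collection the language supplies the test directly.
sample = {0, 1, 2, 3, 4, 5}
half = Fraction(1, 2)
print(3 in sample, 3 not in sample)
print(half in sample, half not in sample)
```

In each printed pair exactly one of the two statements is true, as the definition requires.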
Let us next observe that $\in$ is a relation between an element and a set.
It is not a relation between two sets. With this in mind, our next endeavor
is to define a somewhat similar concept for relating two sets. For this
purpose consider a statement of the form
All A’s are B’s.
Paraphrased, this says that every member of A is also a member of B.
We now invent a new bit of shorthand and write
$A \subset B$
as an abbreviation for all A’s are B’s.
Moreover, when we write $A \subset B$ we say that A is a subset of B, or if we
wish to place the emphasis on B we say that B is a superset of A. Notice
that when we say that all A’s are B’s, we cannot be sure that all B’s are A’s.
If we wish to emphasize that A is a subset of B but that there is at
least one member of B that is not a member of A, we often write $A \subsetneq B$.
By way of illustration, again let W denote the set of whole numbers
and let R denote the set of rational numbers. Clearly, every whole
number is a rational number. (Recall that a rational number is any
number that is the quotient of two integers; thus, if n is a whole number,
we have $n = n \div 1$. Therefore, n is the quotient of two integers.) On
the other hand, not every rational number is a whole number. For
instance, $\frac{1}{2}$ is not a whole number since there is no whole number whose
double is 1. The point is that this whole paragraph, if we use our new
language, can be elegantly abbreviated as
$W \subsetneq R$.
Also notice that, according to our definitions, we would never write $W \in R$
since $\in$ is reserved for relating an element to a set; but both R and W
denote sets.
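The same distinctions appear in Python's set operations, where `<=` means "is a subset of" and `<` means "is a subset of, with at least one member left over." The sketch uses small finite samples, since the full sets W and R are infinite; the names W_sample and R_sample are our own.

```python
from fractions import Fraction

# Finite samples standing in for the whole numbers and the rationals.
W_sample = {Fraction(n) for n in range(4)}
R_sample = W_sample | {Fraction(1, 2), Fraction(-3, 4)}

print(W_sample <= R_sample)   # every member of W_sample is in R_sample
print(W_sample < R_sample)    # and R_sample has a member not in W_sample
print(R_sample <= W_sample)   # the reverse statement fails

# Membership is a different question: the collection W_sample itself is
# not one of the numbers listed in R_sample.
print(frozenset(W_sample) in R_sample)
```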
Another interesting point is the resemblance between the symbol $\subset$ and
the symbol < (which denotes the relation between numbers of “is less
than”). While there is some similarity between the properties of these
two symbols there is an extremely important difference. Namely, if a
and b denote two unequal numbers then either a < b or b < a. However,
if A and B denote different sets, it need not be true that either $A \subset B$ or
$B \subset A$. For example, let A denote the set of Frenchmen and let B denote
the set of musicians. Since there is at least one musician who is not
French, it is false that $B \subset A$. Similarly, since there is at least one
Frenchman who is not a musician, it is also false that $A \subset B$. In fact, it
should be easy to generalize the given example and observe that quite
often if A and B are randomly chosen sets then both $A \subset B$ and $B \subset A$
are false. (If we wish to abbreviate the statement that it is false that A
is a subset of B, we utilize the slash mark and write $A \not\subset B$.)
A more interesting situation occurs when both $A \subset B$ and $B \subset A$ are
true. This tells us that each A is a B and that each B is an A. When we
put these two facts together, logic tells us that A and B consist of precisely
the same members. In this event we prefer to say that A and B are equal,
and we write A = B. In summary, given two sets A and B, we say that
A = B if $A \subset B$ and $B \subset A$ are both true statements. Paraphrased in
plain English, A = B means that A and B are two different names for
the same collection (set).
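Both situations can be seen with small sets. The members below are invented labels, chosen only so that each set has a member the other lacks; `<=` is Python's subset test.

```python
# Two sets, neither of which is a subset of the other.
A = {"french_musician", "french_painter"}
B = {"french_musician", "austrian_musician"}
print(A <= B, B <= A)

# Two names for the same collection: each is a subset of the other.
X = {1, 2, 3}
Y = {3, 2, 1}
print(X <= Y and Y <= X)
print(X == Y)
```

The last two lines agree, which mirrors the definition: two sets are equal exactly when each is a subset of the other.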
Why should we wish to give the same collection two different names?
The question is well taken, and a proper answer to it will unveil an
important aspect of how one finds unifying threads in the systematic study
of any subject. For example, let us take an illustration from plane
geometry. Suppose we let A denote the set of all triangles that have two
sides of equal length, and let B denote the set of all triangles that have
two angles of equal measure.2 There is no reason for those who have not
studied geometry to suspect that A and B are equal sets. However, in
the course of time one proves as theorems that all A’s are B’s and that
all B’s are A’s; and in this way we obtain the unifying thread that the
triangles that have two equal sides are precisely those that have two equal
angles.
As a final topic in this section, let us once again return to the idea that
a picture is worth a thousand words. In much the same way that the
number line and the Cartesian plane help us view certain types of numbers
more concretely, geometric ideas are frequently used for “viewing” sets.
These devices are known as circle diagrams, or Venn diagrams. Briefly
summarized, we view a set as being contained within a closed curve (closed
curve, as we shall soon see, is a far better term than “circle” in the present
context). In this context we would illustrate the fact that $A \subsetneq B$ by
Figure 2.1. If all we were given was $A \subset B$, we would then draw the
diagram as in Figure 2.2. The dotted lines warn us that we are sure only
that the A circle lies within the B circle; we are not certain whether it is
true that some B’s are not A’s.
Figure 2.1
2 At first glance the phrase “two angles of equal measure” may seem a stilted way of
saying “two equal angles.” In the new mathematics the point is made that it is not
the angles that are equal (since they are located in different parts of space), but the
measures (be it the units, degrees, or radians, or what have you). In a similar way, it
is not the two sides of an isosceles triangle that are equal (since the lines constitute two
different sets of points), but the lengths of the two sides. However, now that this is
understood, we shall allow ourselves to say such things as “two equal angles,” to avoid
cumbersome language.
Figure 2.2
Earlier we had mentioned that neither the statement
All A’s are B’s
nor the statement
All B’s are A’s
need be true. In terms of a circle diagram, this situation would be depicted
by Figure 2.3.
Figure 2.3 (the overlapping region is labeled “both A’s and B’s”)
Here we see by pictorial means a good reason for referring to “closed
curves” rather than to “circles.” Namely, suppose that we draw A and
B as circles and insist that all sets be circles. Then those objects that
belong to both A and B form a set; but, as the above diagram indicates,
that region would not be a genuine circle even though it would be a closed
curve.
In line with the representation of sets by circles, the idea that A and B
are equal would translate into the picture as the curves A and B that
enclose precisely the same region. Notice, indeed, that this is exactly
what is implied when we say that the A circle is contained within the B
circle, and that at the same time the B circle is contained within the A circle.
Notice that we have used the word “within” in the above paragraph.
It is certainly natural from a geometric point of view to consider any region
as being contained within itself. In terms of sets, this translates into the
fact that it is perfectly proper to say that all A’s are A’s. This gives us
another important difference between ∈ and ⊂. Namely, while a well-defined
set requires that the set not be a member of itself, there is no such
restriction placed on a well-defined set being a subset of itself. In other
words, if A is a well-defined set, A ∉ A but A ⊂ A. Another way of
visualizing this is to think of a set as being any collection, and a subset
as being anything we can form by choosing members of the collection. In
this context, if someone says to us, “Take whatever you want,” one choice
we can make is to take everything.
Of course, there is another extreme: we may elect to take none of the
collection. This has no bearing on what we have just discussed, but it
plays a large role in the next section.
2.4 TWO SPECIAL SETS

At the end of the last section we intimated that a set might have no
members. Why would we allow a set to have no members? After all,
if there are no members, why bother naming it; and, more important, if a
set is a collection, does not the term “collection” imply at least one member?
The answer to these questions centers around the idea of the test for
membership that is implied for any set. That is, it is possible that the
test for membership be so stringent that nothing can survive it. For
instance, let us consider the set of all numbers that are greater than 5 but
less than 3. Certainly, we have an objective and meaningful test for
membership. Namely, given any number we first see if it exceeds 5 and
then see if it is less than 3. If both of these things are true then the number
belongs to the above set; otherwise, it does not. Of course, no number
can survive the test for membership in this case; nevertheless, the test is
objective and well-defined. Moreover, it is significant to discover that a
test is so severe that no element can survive it.
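For the reader who likes to experiment, the test for membership just described can be sketched in Python. This is only an illustration under stated assumptions: the name `survives` and the finite list of candidates are choices made here, since a program can try only finitely many numbers, whereas the text speaks of all numbers.

```python
def survives(x):
    """Test for membership: x must exceed 5 and also be less than 3."""
    return x > 5 and x < 3

# A finite sample of candidates (an assumption made for this sketch).
candidates = [n / 2 for n in range(-20, 21)]   # -10.0, -9.5, ..., 10.0

# Keep only the candidates that pass the test.
members = {x for x in candidates if survives(x)}

print(members)        # set(): nothing passes the test
print(len(members))   # 0
```

The test is perfectly objective; it simply happens that no candidate survives it.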
To strengthen the case even more, notice how often, especially in
mathematics, we are more interested in the test for membership than we
are in knowing the names of all the members. For instance, we might be
given a particular number and we might wish to know whether this number
is a prime. At this point we are more interested in knowing the recipe
by which we can test the given number for being a prime than we are in
wanting to see a complete listing of all the primes—and this is indeed
fortunate since no complete list is available because the number of primes
is infinite!
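One possible “recipe” of this kind can be written as a short Python function. The text does not say which recipe it has in mind; trial division, and the name `is_prime`, are assumptions made here purely for illustration.

```python
def is_prime(n):
    """Return True if the natural number n is prime (trial division)."""
    if n < 2:
        return False
    d = 2
    # It is enough to try divisors d with d * d <= n.
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

print(is_prime(97))   # True
print(is_prime(91))   # False, since 91 = 7 * 13
```

Notice that the function answers the question for any one given number without ever needing a complete list of the primes.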
Later we shall supply even more reasons for allowing a set to contain
no members. For now we wish only to establish the fact that such a set
is meaningful. It should be clear that such a set is the smallest possible
collection. In other words, we do not say that a collection is so tiny that
even if it had three more members there still would not be any! Summed
up, we define the empty set to be a set that has no members; and we denote
the empty set (also called the null set) by ∅.
Do not confuse ∅ with 0. For example, consider the set of all numbers
that are neither positive nor negative. This particular set happens to have
one member, namely, 0. In other words, the set whose only member is 0
is not the empty set because the empty set has no members, not even 0.
What is true, however, is that 0 denotes the number of elements in ∅.
To see what the empty set means in terms of grammatical structures,
consider the sentence
No dog has two heads.
In the language of sets this says
The set of dogs that have two heads is the empty set.
Let us now turn our attention to the other extreme mentioned in the
last section on subsets. That is, one cannot take more than one has. In
other words, given a set A, the smallest subset of A is ∅ and the largest
subset is A itself.
This can be generalized by the following rather silly question: “Does
the color blue belong to the set of all lawyers?” At first glance the answer
is no. At second glance the answer is even more emphatically no! When
we talk about the set of all lawyers, it is implicitly understood that only
people are eligible for even the test for membership. In other words, at
certain times not only do we wish to limit membership in a set, but also
to limit even those things that are eligible for the test for membership.
To formalize this idea, we introduce the following definition.
Definition
By the universe of discourse or the universal set, usually denoted by I, we
mean the set such that for any element b and every set A that is being
considered, it is true that b ∈ I and A ⊂ I.
Thus, ∅ and I serve as lower and upper bounds in our discussion in the
sense that no set being studied can have fewer than no elements nor can
it contain any element not already contained in I. That is, for each set
A it is true that
∅ ⊂ A ⊂ I.
It turns out that without knowing it we have made use of the universal
set many times in the treatment of elementary mathematics. Consider,
for example, the fact that in ninth grade algebra we are told that x² + 1
cannot be factored; yet in the eleventh grade we are taught that x² + 1 =
(x + i)(x − i). What happened was that in the ninth grade we were
dealing with real numbers, while in the eleventh grade we were using the
complex numbers. In terms of the above discussion, what we were really
taught was that if the universe of discourse was the real numbers then
x² + 1 could not be factored; but if the universe of discourse was the
complex numbers then it could be factored.
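The dependence on the universe of discourse can be checked numerically. The following Python sketch is an illustration only: it uses the familiar quadratic formula (an assumption brought in here, not something the text has developed) and Python's built-in complex numbers, in which `1j` plays the role of i.

```python
import cmath

# x^2 + 1 = 0 has coefficients a = 1, b = 0, c = 1.
disc = 0 ** 2 - 4 * 1 * 1          # the discriminant b^2 - 4ac
print(disc < 0)                    # True: no root among the real numbers

# In the complex numbers a square root of the discriminant exists.
root = cmath.sqrt(disc) / 2
print(root)                        # 1j

# (x + i)(x - i) agrees with x^2 + 1 at each sample value tried.
samples = [0, 1, 2, -3]
agrees = all((x + 1j) * (x - 1j) == x * x + 1 for x in samples)
print(agrees)                      # True
```

A handful of sample values is of course a spot check, not a proof of the factorization.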
As a second example, let us refer to a problem in analytic geometry.
Suppose we wish to know the graph of the equation x = 1. If the universe
of discourse is the x axis, then the graph of the equation is precisely a
single point. On the other hand, if the universe of discourse is the xy
plane, then the graph is a straight line. Finally, if the universe of
discourse is three-dimensional space, the graph is a plane. These three
possibilities are pictured in Figure 2.4.
Figure 2.4 [graph of x = 1, panels (a) and (b); drawing not reproduced]
In Venn diagrams it is conventional to represent I by a rectangle and
to make sure that all sets under consideration are drawn within this
rectangle. We shall have more to say about this later as we develop
additional ideas about sets.
Exercises
1. Determine which of the following sets are well-defined.
(a) The set of even integers
(b) The set of interesting integers
(c) The set of paintings by Picasso
(d) The set of beautiful paintings
(e) The set of irrational numbers
In each case that a set is well-defined, state the test for membership.
2. Name five subsets of the complex numbers that we have studied in this
text.
3. Explain the basic difference between the concepts of “is an element of”
and “is a subset of.”
4. Let S = {1,2,3,4}.
(a) List all the subsets of S.
(b) How many of these have exactly two elements?
5. (a) Let S denote the set of numbers named by 1, 2, 3, and 4. Does
the number named by 7 − 5 belong to S?
(b) Let S denote the set of numerals 1, 2, 3, and 4. Does the numeral
7 − 5 belong to S?
6. Describe two sets A and B for which it is true that all A’s are B’s and
that all B’s are A’s. In this case how are A and B related?
7. Describe two sets A and B for which it is true that all A’s are B’s,
but not true that all B’s are A’s. In this case, using the language of
sets, how would we indicate the relationship of A to B?
8. Use a Venn diagram to illustrate that some A’s are B’s. How would
we state this fact in the language of sets?
9. Let A denote the set of men and B, the set of immortals. Use the
language of sets to express the fact that no man is immortal.
10. Show by means of a specific example that {0} and ∅ are different sets.
2.5 TWO MAJOR METHODS FOR DESCRIBING SETS

Now that we have established a few basic terms and the concept of a
well-defined set, let us describe some general methods for describing sets.
The two major methods are known as (1) the roster method and (2) the
set-builder method. Depending on the particular circumstances, one
method is more desirable than the other.
Let us begin the discussion with the roster method. As the name implies,
the roster method is nothing more than an explicit listing of the members
of a set. For example, if A were to denote the set of natural numbers that
are less than 10, then use of the roster method would yield
A = {1,2,3,4,5,6,7,8,9}.
Braces are conventionally used to enclose the set.
It should also be noted that in the definition of the equality of two sets,
we specified merely that the two sets contain exactly the same members.
We did not require that the elements be listed in any particular order.
This certainly agrees with our intuition and past experience. For example,
if we were talking about the set of all men who were senators of the United
States during 1964, we would not quibble about whether they should be
listed by age, by years of service, or by alphabetical order. Indeed, while
one listing may be more convenient than another for a particular purpose,
the fact remains that each is a listing of the required set. In terms of
the above example we would write, for instance, {1,2,3,4,5,6,7,8,9} =
{9,1,3,2,7,6,4,8,5} = A. In this same context, we agree that a set does
not change merely by counting the same element more than once. For
example, no matter how many times we count “Monday” there are still
only seven days in a week. Thus, for simplicity, we agree never to list
the same element more than once; that is, we would write {1,2,3} rather
than {1,1,2,2,3,3,3}. There are some exceptions; for example, suppose
we wish to list the set of all letters that occur in the word “Mississippi,”
and investigate the various rearrangements. In this case, we would most
likely not write (m,i,s,p), but rather (m,i,s,s,i,s,s,i,p,p,i) since we are
distinguishing between the first i and the second, and so on. That is, we
may for purposes of identification imagine the i’s to be colored differently
so that we can tell them apart. We will talk more about this at a more
appropriate time.
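These conventions about order and repetition happen to be mirrored by the built-in set type of the Python programming language, which makes for a convenient illustration. The variable names below are assumptions chosen for this sketch; nothing here is part of the roster method itself.

```python
# The same roster written in two different orders names the same set.
A = {1, 2, 3, 4, 5, 6, 7, 8, 9}
reordered = {9, 1, 3, 2, 7, 6, 4, 8, 5}
print(A == reordered)                        # True

# Listing an element more than once does not change the set.
print({1, 1, 2, 2, 3, 3, 3} == {1, 2, 3})    # True

# The letters of "mississippi" as a set, versus every occurrence kept apart.
letters_as_set = sorted(set("mississippi"))
print(letters_as_set)                        # ['i', 'm', 'p', 's']
occurrences = list("mississippi")
print(len(occurrences))                      # 11
```

The list of occurrences corresponds to the situation in which we imagine the repeated letters to be colored differently.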
In contrast to the roster method, the set-builder notation stresses the
members of the set implicitly rather than explicitly. In other words, while
the roster method actually lists the members of the set, the set-builder
method describes, or emphasizes, the test for membership.
Specifically, the set-builder notation utilizes the braces and a few other
symbols as well. Namely, we write {x: x . . .}, which is read as, “The
set of all elements x such that . . . .” For example, we might write
{x: x is a real number} to indicate that we are referring to the set of real
numbers. Or if we were going to make reference to the real numbers very
often, we might let R, say, denote the real numbers, and we could then
write very compactly that {x: x ∈ R} denotes the set of real numbers. As
a nonmathematical example, let I denote the set of all Americans who were
alive on January 1, 1960; and let B denote the set of all people who were
born on April 2, 1929. Then, by
{x: x ∈ I and x ∈ B}
we would mean the set of all Americans who were alive as of January 1,
1960, and who were born on April 2, 1929. Notice the compactness of the
set-builder method as it clearly emphasizes the test for membership.
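Set-builder notation has a close cousin in Python's set comprehensions, which also state a test for membership rather than a list. In the sketch below the finite `universe` and the two numerical stand-ins for I and B are hypothetical, chosen only because a program needs something concrete to work with.

```python
# A hypothetical finite universe of discourse for the sketch.
universe = range(1, 31)

# Numerical stand-ins for the sets I and B of the text.
I = {x for x in universe if x % 2 == 0}   # the even numbers in the universe
B = {x for x in universe if x % 3 == 0}   # the multiples of 3 in the universe

# {x: x ∈ I and x ∈ B}
both = {x for x in universe if x in I and x in B}
print(sorted(both))     # [6, 12, 18, 24, 30]
```

As in the text, the description of `both` says nothing about which elements it contains; it says only what an element must do to belong.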
Now that the two methods have been described, let us discuss their
relative strengths and weaknesses. We begin with the roster method.
The obvious strength of this method is that it tells us outright what the
members of the set are. That is, we have the easiest possible test for
membership: if an element appears on the list, it belongs to the set;
otherwise it does not.
However, there are basic weaknesses that can cause great difficulty.
For one thing, we cannot list an infinite set if only because our lives are
finite. That is, suppose we wish to list the set of natural numbers. We
could begin by writing 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, and so on, but we
would never come to an end. Or we could write 1, 2, 3, 4, . . . , where . . .
stands for “and so on.” However, “and so on” is vague and subjective,
to say the least. As an example, consider the numbers generated by the
following recipe
aₙ = n + (n − 1)(n − 2)(n − 3)(n − 4)
where aₙ is merely an abbreviation for indicating the nth number. For
example, if we wished the first number in the progression, we would choose
n = 1, and write
a₁ = 1 + (1 − 1)(1 − 2)(1 − 3)(1 − 4)
where we merely replace all n’s by 1. Thus,
a₁ = 1 + (0)(−1)(−2)(−3)
= 1 + 0
= 1
and then we see that 1 is the first member of the sequence.
To obtain the second member we replace each n by 2
a₂ = 2 + (2 − 1)(2 − 2)(2 − 3)(2 − 4)
= 2 + (1)(0)(−1)(−2)
= 2 + 0
= 2
and we see that the second term is 2.
Replacing n by 3 we next see that
a₃ = 3 + (3 − 1)(3 − 2)(3 − 3)(3 − 4)
= 3 + (2)(1)(0)(−1)
= 3 + 0
= 3.
Similarly,
a₄ = 4 + (4 − 1)(4 − 2)(4 − 3)(4 − 4)
= 4 + (3)(2)(1)(0)
= 4 + 0
= 4.
In this way we see that we have a well-defined test that tells us that the
first four members in the sequence, in order, are 1,2,3,4. Suppose that
we told this to a person but we did not tell him the recipe we were using.
What do you guess he would choose for the fifth member of the sequence?
It seems reasonable that when someone says 1,2,3,4, . . . , we expect 5 to
occur next. Yet,
a₅ = 5 + (5 − 1)(5 − 2)(5 − 3)(5 − 4)
= 5 + (4)(3)(2)(1)
= 5 + 24
= 29,
a far cry from being self-evident if the recipe is withheld!
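The arithmetic above is easy to repeat by machine. The short Python sketch below simply evaluates the recipe for n = 1 through 5; the function name `a` is chosen here to match the text's aₙ.

```python
def a(n):
    """The nth number given by the recipe n + (n - 1)(n - 2)(n - 3)(n - 4)."""
    return n + (n - 1) * (n - 2) * (n - 3) * (n - 4)

first_five = [a(n) for n in range(1, 6)]
print(first_five)   # [1, 2, 3, 4, 29]
```

For n = 1, 2, 3, 4 one of the four factors is zero, so the product vanishes; at n = 5 no factor is zero and the product 24 appears.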

As other examples of this type of confusion, consider the sequence
o,t,t,f,f,s,s, . . . . What letter comes next? We claim that this sequence
consists of the first letter in the name of each natural number starting with
one; that is, one, two, three, four, five, six, seven, eight, . . . . Thus, in
this example we are looking for e as the next entry; again, not so
self-evident when the rule is withheld!
As a final example, consider the next sequence and try to decide what
comes next.
31,30,31,30,31,31,30,31,30,31,31,... .
It is neither 30 nor 31, but rather 28! For the above list was the number
of days in a month starting with March in a non-leap year.
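Once the rule is known, the list can be regenerated mechanically. The Python sketch below uses the standard `calendar` module; the helper name `month_lengths` and the starting year 1970 are assumptions made here (any starting year whose following February has 28 days would serve equally well).

```python
import calendar

def month_lengths(start_year, start_month, count):
    """Number of days in `count` consecutive months, beginning at the given month."""
    lengths = []
    year, month = start_year, start_month
    for _ in range(count):
        # monthrange returns (weekday of the first day, number of days in the month).
        lengths.append(calendar.monthrange(year, month)[1])
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return lengths

print(month_lengths(1970, 3, 12))
# [31, 30, 31, 30, 31, 31, 30, 31, 30, 31, 31, 28]
```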
These examples show why we shun “and so on.” Moreover, they
illustrate, at least indirectly, one advantage of the set-builder notation.
Namely, notice how the “riddle” effect of these examples vanishes as soon
as we let the other fellow know the explicit rule we are following.
However, aside from the fact that the roster method cannot be used
without some ambiguity for trying to list infinite sets, there is still trouble
in listing noninfinite sets. For example, the people who are named in the
Boston telephone directory this year constitute a finite set, but nonetheless
a set large enough to be undesirable for the average man to list. But
even if a set has few elements, it might be still very difficult to list because
we may have trouble locating its members. To this end, consider the
set of all living people who were born in New England on May 14, 1865.
It is reasonable to assume that this set has relatively few members. In
fact, it might well be empty. Yet, look at the difficulty that could arise if
we were to try to determine the list of the actual members of this collection!
The very weakness of the roster method is the strength of the
set-builder method, for quite often we are more interested in the test for
membership than in the members themselves. For example, we might be
more interested in knowing whether a particular number is divisible by
7, 11, and 13 than in knowing the entire set of numbers that are divisible
by 7, 11, and 13. By the same token, the strength of the roster method
is the weakness of the set-builder method. That is, there are times when
we require the members of a set explicitly. For example, a lawyer might
be more interested in the names of the people mentioned in a will than in
the set-builder idea of all the living relatives.
However, as is so often the case in the chronicle of human endeavor, we
find that these two conflicting methods not only coexist but they are often
used side by side in the same problem. That is, we often start a problem
by getting the answer implicitly, and then converting it to a more explicit
form. This will be pursued in much greater detail in the next section.
We are now in a good position, at least as far as having developed the
proper concepts and terminology, to re-examine the Russell paradox
discussed in the section on well-defined sets, and to see how this paradox
applies to the study of sets. We have already agreed that to avoid the
Russell paradox we must agree that for any set B, B ∉ B. We shall show
that this convention can lead us directly from the frying pan into the fire!
Namely, let A denote the following set whose elements are themselves sets.
A = {X: X ∉ X}.
Since for any well-defined set X we have agreed that X ∉ X, it follows that
A = {X: X ∉ X} is just a fancy way of saying that A is the set of all
well-defined sets. In particular, since A is itself a set it makes sense to test A
for membership in itself. The question is: is A ∈ A, or is A ∉ A? If
A ∈ A, then by the definition of A it must be that A ∉ A (since A consists
precisely of those sets that are not members of themselves); on the other
hand, if we assume that A ∉ A, then A ∈ A, again, by the definition of A.
Thus, as a generalization of the barber-of-Seville paradox, we see that if A
belongs to A, then it does not; and if it does not, it does! For this reason
we agree that unless we wish to arrive at a logical paradox we must refrain
from talking about the set of all sets.
We have now learned the preliminary vocabulary necessary to discuss
the impact of sets on modern mathematics.
2.5.1 THE NUMBER OF SUBSETS OF A GIVEN SET
This note reviews the notation we have learned to date. We will also
mention a few interesting asides.
To begin with, let us recall that in our discussion of binary arithmetic
we observed that a switch is either on or off, one or the other, but not both;
this was conducive to building computers in a base-two system, where,
for example, 0 could denote off and 1 could denote on.
This same structure allows us to code subsets of a given set S in the
same way. Namely, given an element, s ∈ S, and a subset, T, of S;
either s ∈ T or s ∉ T, one or the other, but not both. This means that
we can now use the code, for example, of using 0 to denote that an element
of S does not belong to a given subset, while we can use 1 to denote that it
does.
Suppose that S has a single element. Let a denote the name of this
element. That is, in roster form,
S = {a}.
The two possibilities are then given by
a
0
1.
The first entry says that a is not in the subset; but since S has only the
element a in it, it follows that the first entry denotes the empty set. The
second entry indicates that a is to be in the subset; but since a is the only
element of S, this means that the indicated subset is S itself. This shows
that if S has only one element, then S has two subsets, namely, ∅ and S.
Suppose now that S has two elements. In roster form, let us say
S = {a, b}.
If we make a chart as if we were counting in base-two (that is, 00, 01, 10,
11), we would obtain
a b
0 0
0 1
1 0
1 1
This shows that S has four subsets. In fact, reading the entries from top
to bottom and using the code, we see that the four subsets are given
specifically by
∅, {b}, {a}, and {a,b} = S.
Notice the subtle difference between b and {b}. When we write b we mean
the element b. When we write {b} we mean the set whose only member
is b. In terms of notation, one would write
b ∈ B
but
{b} ⊂ B.
Aside from semantic differences there is a logical reason (which we shall
review in more detail a little later) for distinguishing an element from a set
whose membership consists of a single element. For example, suppose we
desire the solution set S of the equation x − 1 = 0. (A solution set refers
to the set of all numbers which satisfy the equation.) Then, if we write
1 ∈ S, all we are saying is that one member of the solution set is 1.
However, when we write S = {1}, we are saying that the entire solution set
consists of precisely the element 1.
Getting back to the main discussion, we have shown that if S has two
elements then S has four subsets. In a similar way, we may use the
binary number system to show that if S has three elements, then S has
eight subsets. Namely,
a b c
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1
and again reading from top to bottom, we see that the eight subsets are
∅, {c}, {b}, {b,c}, {a}, {a,c}, {a,b}, {a,b,c}.
Notice also that this procedure indicates that the number of subsets
doubles each time we add an element to the set. This follows by virtue
of the fact that if we let r denote the new element of S, then any subset
that could occur before r was added remains a subset whether or not we
include r. In terms of the above example, observe that under the b and c
columns we see that the sequence 00, 01, 10, 11 occurs twice, once with a
0 in the a column and once with a 1 in the a column.
A final remark in this connection is that the binary number code shows
us a rather nontechnical way of establishing the important result that if a
set has n elements, it has 2ⁿ subsets.
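The binary code lends itself directly to a short program. The following Python sketch counts from 0 to 2ⁿ − 1 and reads each binary digit as “in” or “out,” exactly as in the charts above; the function name `subsets_by_binary_code` is an assumption made for this illustration.

```python
def subsets_by_binary_code(elements):
    """List every subset of `elements`, in the top-to-bottom order of the chart."""
    n = len(elements)
    result = []
    for code in range(2 ** n):
        subset = set()
        for position, element in enumerate(elements):
            # The leftmost column of the chart is the most significant binary digit.
            if (code >> (n - 1 - position)) & 1:
                subset.add(element)
        result.append(subset)
    return result

three = subsets_by_binary_code(["a", "b", "c"])
print([sorted(s) for s in three])
# [[], ['c'], ['b'], ['b', 'c'], ['a'], ['a', 'c'], ['a', 'b'], ['a', 'b', 'c']]
print(len(three))   # 8
```

Adding a fourth element doubles the count to 16, in line with the doubling argument given above.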
Moreover, this offers us still another reason as to why we agree that
both ∅ and S are treated as subsets of S. Namely, no matter how trivial
it may seem, the fact still remains that if we want the simple result that a
set of n elements has exactly 2ⁿ subsets then we must agree that both ∅
and S are subsets of S. For example, if we wish to exclude ∅ and S, we
would have to say that if S has n elements, then it has 2ⁿ − 2 subsets.
The next section deals with more computational material than we have
previously encountered.
Exercises
1. Describe the basic differences between the set-builder method and the
listing, or roster, method for describing sets.
2. Use set-builder notation to describe each of the following sets.
(a) the set of even integers
(b) the set of all numbers that are divisible by three
(c) the set of odd integers
3. Describe the set of New England states both by the roster method and
the set-builder method.
2.5.2 APPLICATIONS TO ELEMENTARY ALGEBRA
In the previous section we introduced several basic concepts in the theory
of sets, but we still have many more concepts to discuss. However, let
us first pause to apply the new concepts to some old ideas. In particular,
we shall take a new look at some classical algebra.
To begin with, an equation such as
3x = 6
could be viewed as a fill-in-the-blank type open sentence:
3 × ______ = 6.
An open sentence is an outgrowth of the familiar “Fill in the blank”
type of exercise. For example, when we write
_______ is the capital of Massachusetts,
we do not have a complete sentence (hence the name open sentence).
Since we do not have a complete sentence, it does not make sense to discuss
the truth or falseness of it. What we mean by “fill in the blank” is that we
wish to replace the blank (in other words, even in English grammar we can
view the blank as a place holder, in the above case holding the place of the
name of a city) by the word or words that convert the open sentence into
a true statement. In terms of the above open sentence, if we replace the
blank by “Atlanta” we get a statement that happens to be false. Without
further ado it should be clear that “Boston” is the only substitution for the
blank that converts the above open sentence into a true statement.
In this context we would refer to {Boston} as being the solution set of
the open sentence.
In particular, we may have open sentences in mathematics, and if the
open sentence is in the form of an equality (as in the present example
3 × ______ = 6) we call the open sentence an equation, and as in the case of
any open sentence, we try to find the solution set of the equation.
Again, in terms of number-versus-numeral, it is not important what the
symbol is for the place holder but it is important that we have the place
holder. For example, instead of talking about “filling in the blank,” we
could have written
x is the capital of Massachusetts
and then asked what x should be replaced by to convert the open sentence
into a true statement. In the subject called algebra it is conventional to
use x (or other letters of the alphabet) to denote the place holder of our
open sentences, but once we see the key idea that only the symbolism is
different we should discover that algebraic equations are a part of (that is,
form a subset of) the more general idea of “filling-in-the-blanks.”
It is not difficult to determine that we must replace the blank by 2 if we
want to form a true statement out of 3 × ______ = 6.
Traditionally, the equation 3x = 6 is solved by the process of dividing
both sides by 3, thus yielding x = 2. Since division is the inverse of
multiplication, 3x = 6 may be read as, “x is the number which when
multiplied by 3 yields 6,” but we have already seen that the name of this
number is 6/3, or 2. At any rate, it should be clear that we could solve such
an equation even had we never heard of sets.
Moreover, we could have been much more general, and considered the
equation
bx = c
where b and c represent any real numbers, but b ≠ 0.
Observe that there is no law against b being equal to zero. It is just
that if b = 0, then bx = 0, in which event either c = 0, or else there is no
solution to the open sentence. In either event, b = 0 provides a rather
dull and sterile case, so we prohibit it from happening in our discussion.
Since b ≠ 0, we can divide both sides of the equation by b and obtain
x = c/b. (The previous example was simply a special case of this with
c = 6 and b = 3.)
Now we are ready to investigate the use of sets in this study. Suppose
someone were to ask us to find the set of all numbers that were
solutions of the equation 3x = 6. The job becomes trivial if we use the
set-builder notation. Namely, if we let S denote the set of all numbers
that satisfy the equation 3x = 6, quite simply by use of the set-builder
notation we have S = {x: 3x = 6}. Observe that while a person might
not be able to list explicitly the members of S, he is nonetheless provided
with a well-defined test for membership in S. Namely, given any number,
he need only multiply it by 3, and if the product is 6 then the number
belongs to S; otherwise it does not. For example, since 3 × 7 = 21 and
21 ≠ 6, we may objectively conclude that 7 ∉ S.
Observe that the set-builder method provides us with a way of
understanding the equation, and which properties a solution must have even if
we do not know objectively how to find the solution.
Algebra is the vehicle that allows us to proceed from the implicit
set-builder form to the explicit roster method. That is, by use of algebra we
know that the only number that satisfies the equation 3x = 6 is 2. In
other words, if we now let T denote the set whose only element is 2 (that
is, T = {2}), then T is also the solution set for the equation 3x = 6.
Since an equation has but one solution set, it must be that S and T are
synonymous; hence, S = T. But while S and T are synonyms, S is in
the set-builder notation while T is an example of the roster method.
It should be easy to see that S focuses our attention on the problem,
while T focuses our attention on the solution to the problem; and it is the
process of objectively converting from S to T that is known as algebra.
More generally, in set-builder notation, the solution set of bx = c (b ≠ 0)
is {x: bx = c}, while in roster form this set is {c/b}.
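The two descriptions of the solution set can be placed side by side in a short Python sketch. The names `in_S` and `solve_linear` are hypothetical, and exact fractions from the standard `fractions` module are used (with integer b and c assumed) so that c/b is not blurred by rounding.

```python
from fractions import Fraction

def in_S(x):
    """Test for membership in S = {x: 3x = 6}."""
    return 3 * x == 6

# Roster form of the same solution set, as found by algebra.
T = {2}

print(in_S(7))                      # False, since 3 * 7 = 21 and 21 != 6
print(all(in_S(x) for x in T))      # True: every member of T passes the test for S

def solve_linear(b, c):
    """Roster form {c/b} of {x: bx = c}, for integers b and c with b nonzero."""
    if b == 0:
        raise ValueError("b must be nonzero")
    return {Fraction(c, b)}

print(solve_linear(3, 6) == T)      # True
```

The function `in_S` plays the part of the set-builder form, while `T` and the value returned by `solve_linear` play the part of the roster form.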
Why make such a fuss over the use of sets here? After all, the solutions
of the equations do not change nor does the algebra get any easier when we
use sets. However, notice that the use of sets emphasizes that there are
two parts to solving a problem. We must first sense what the correct answer
means, and then we must be able to state what the correct answers are.
The set-builder form of the solution set emphasizes what a correct answer
means, while the roster form of the solution set lists all the correct answers.
In this context the greatly feared, and often misunderstood, concept of
algebra exists as the servant of man’s attempts to convert implicit solutions
to explicit solutions of problems. It is interesting and informative to note
that in many instances when a person cannot solve a problem, it is because
he cannot even sense what the solution must be, and hence would not be
able to recognize it even if he saw it. By proper use of set notation we
can separate the problem into these two distinct phases.
Let us now turn our attention to more complicated equations. Consider
the quadratic equation
x² − 4x + 3 = 0.
Suppose that we were required to find the set of all numbers that are
solutions of this equation. If we let A denote the solution set of the equation,
we obtain, virtually at a glance, the set-builder form
A = {x: x² − 4x + 3 = 0}.
We now have a fairly easy task, given any particular number, to determine
whether this number belongs to A. Namely, we square the number,
subtract four times the number from the square, and then add 3. If the
sum is 0, the number belongs to A; otherwise, it does not. For instance,
let us examine the number 5 for membership in A: 5² − 4(5) + 3 = 25 −
20 + 3 = 8; and 8 ≠ 0. Hence, 5 ∉ A. If we wish to use a chart we
write the following.
  x     x²     4x    (x² − 4x + 3)
 −3      9    −12         24
 −2      4     −8         15
 −1      1     −4          8
  0      0      0          3
 +1      1      4          0 ←
 +2      4      8         −1
 +3      9     12          0 ←
 +4     16     16          3
 +5     25     20          8
Studying the results of the chart, we see that 1 and 3 belong to the
solution set of the equation. In terms of our newly acquired language:
1 ∈ A and 3 ∈ A, or in still other words, {1,3} ⊂ A.
However, we have not as yet earned the right to conclude A = {1,3}.
That is, there may be other values of x that satisfy the given equation.
Notice in this respect that while the chart used only integers to replace x,
there was no specification concerning the universe of discourse. For
example, there is nothing, at least at the moment, that precludes
nonintegers from belonging to the solution set A. Moreover, even if we test
for 20 years by trial-and-error and never find any other members of A,
this does not mean that there are no other members. Recall that there is
a difference between saying that we cannot find an answer and saying that
there is no answer!
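The chart, and the limited conclusion we are entitled to draw from it, can be reproduced with a few lines of Python. The names `f`, `rows`, and `found` are assumptions made for this sketch, and the range of integers tried is the same one used in the chart.

```python
def f(x):
    """Value of x^2 - 4x + 3."""
    return x * x - 4 * x + 3

# One row per integer from -3 through +5: x, x^2, 4x, and x^2 - 4x + 3.
rows = [(x, x * x, 4 * x, f(x)) for x in range(-3, 6)]
for row in rows:
    print("{:>3} {:>3} {:>4} {:>4}".format(*row))

# The integers in the chart that pass the test for membership in A.
found = {x for x, _, _, value in rows if value == 0}
print(sorted(found))   # [1, 3]
```

The program establishes only that {1,3} ⊂ A; like the chart, it says nothing about numbers that were never tried.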
Once again, it is the subject called algebra that helps us replace
trial-and-error techniques by a more systematized approach. In general, if
a and b denote real numbers, we recall that
(x + a)(x + b) = x² + ax + bx + ab = x² + (a + b)x + ab.
In the form of a brief review, observe that in terms of geometry we have
the diagram pictured in Figure 2.5.
Figure 2.5 [a rectangle with top edge marked x and a, divided into four regions labeled x², ax, bx, and ab]
From this result it becomes rather easy to see that the sum and product
of a and b become important items. For example, translating this result
back to our original problem of x^2 - 4x + 3 = 0, we desire to find two
numbers, a and b, such that a + b = -4, and ab = 3. While this still
involves trial-and-error, without too much difficulty we find that the two
numbers we seek are -1 and -3. Indeed, (-1) + (-3) = (-4), and
(-1)(-3) = 3. Thus,
x^2 - 4x + 3 = [x + (-1)][x + (-3)] = (x - 1)(x - 3).
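The search for two numbers with a given sum and product can itself be sketched
as a short Python program. The function name find_pair and the search bound of
10 are our own assumptions, chosen only for illustration; the search is
restricted to integers, so it would fail for an equation whose factors do not
have integer constants.

```python
def find_pair(total, product, bound=10):
    """Search integers a <= b in [-bound, bound] with a + b == total and a * b == product."""
    for a in range(-bound, bound + 1):
        b = total - a  # b is forced once a is chosen
        if a <= b and abs(b) <= bound and a * b == product:
            return a, b
    return None

pair = find_pair(-4, 3)
print(pair)

a, b = pair
# Spot-check x^2 - 4x + 3 = (x + a)(x + b) on the integers -10, ..., 10.
identity_holds = all(x * x - 4 * x + 3 == (x + a) * (x + b) for x in range(-10, 11))
print(identity_holds)
```

The spot-check at the end tests only finitely many substitutions; the algebra
above is what establishes the factorization for every x.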
In the language of mathematics, an equation such as
x^2 - 4x + 3 = (x - 1)(x - 3)
is called an identity. An identity is an open sentence that is true for any
permissible substitution. In the language of sets an identity is an equation
whose solution set is the entire universe of discourse. That is,
I = {x: x^2 - 4x + 3 = (x - 1)(x - 3)}.
In any event, then, solving the equation
x^2 - 4x + 3 = 0
is equivalent to solving the equation
(x - 1)(x - 3) = 0.
Here, again, is a good place in which to observe the beauty of the
language of sets even though the actual computations do not vary in the old
and the new ways. Namely, the two equations x^2 - 4x + 3 = 0 and
(x - 1)(x - 3) = 0 do not look alike. Certainly, without a knowledge
of the structure of arithmetic, a person would not look at these two
equations and say, “Oh my! These are merely synonyms.” However, in
terms of sets, when we say that these equations are equivalent, we are
merely saying that they have the same (equal) solution sets. In symbols,
{x: x^2 - 4x + 3 = 0} = {x: (x - 1)(x - 3) = 0}.
Notice that this statement is delightfully void of any attempt to show that
the equations look alike. We have talked so long about synonymous
expressions because frequently in solving any problem or in trying to
communicate with others, we find that one word or statement is simpler to
grasp than an equivalent one. For example, notice how many times we ask
someone to put something into other words. When we write “in other
words,” the expression following this phrase, though different from the
original in wording, has the same meaning. Referring now to the equations
x^2 - 4x + 3 = 0 and (x - 1)(x - 3) = 0, observe that the rules of
arithmetic allow us to solve the second more readily than the first, even
though the equations are equivalent. To see this, recall that if the product
of two numbers is 0 then at least one of the two numbers must be 0. Thus,
since the product of (x - 1) and (x - 3) is 0 in (x - 1)(x - 3) = 0, it
follows that either x - 1 = 0 or x - 3 = 0; or, in other words, either
x = 1 or x = 3.
Let us digress to observe that the converse of a true statement need not
be true. With regard to our problem, all we have shown is that if
(x - 1)(x - 3) = 0, then either x = 1 or x = 3. We have not shown that
if either x = 1 or x = 3, then (x - 1)(x - 3) = 0. In terms of the classical
language of mathematics, it still remains for us to check whether x = 1 and
x = 3 are indeed solutions. In short, what is commonly known as the
check of the problem is in reality, from a logical point of view, part of the
solution itself.
In summary of our somewhat lengthy discussion of x^2 - 4x + 3 = 0,
let us simply observe that in the traditional (pre-sets) presentation, one
can find the problem:
Solve the equation x^2 - 4x + 3 = 0.
The solution proceeds as follows:
x^2 - 4x + 3 = (x - 1)(x - 3).
Hence, x^2 - 4x + 3 = 0 implies that (x - 1)(x - 3) = 0. This, in turn,
implies that either x = 1 or x = 3. Check:
If x = 1, then x^2 - 4x + 3 = 1^2 - 4(1) + 3 = 1 - 4 + 3 = 0,
which checks with the original equation.
If x = 3, then x^2 - 4x + 3 = 9 - 12 + 3 = 0, which also checks with
the original equation. Hence, the solutions of x^2 - 4x + 3 = 0 are
precisely x = 1 and x = 3. (Traditionally, the solutions were called roots
of the equation.)
The important point is that in terms of sets, the computational procedure
remains the same; only the emphasis on the problem versus the solution
is better highlighted. Thus, in terms of the new approach, the same
problem might be stated as:
Find the solution set S of the open sentence x^2 - 4x + 3 = 0.
We would proceed by immediately writing down the correct answer in
set-builder form; that is, S = {x: x^2 - 4x + 3 = 0}. Next we would
apply the same algebra as above to show
S = {x: x^2 - 4x + 3 = 0}
  = {x: (x - 1)(x - 3) = 0}
  = {x: x = 1 or x = 3}
  = {1,3}.
The set-builder form centered our attention on the problem and
emphasized the test for membership in S. The use of algebra allowed us to
convert S from the set-builder form to the roster form; and this was a
distinct practical advantage, since in real life we frequently wish to express
answers as explicitly as possible. Observe that the jargon alone is
unimportant. If we do not understand what we are trying to do, and if we
have not mastered enough computational skills, then it makes little
difference whether we say, “Solve the equation . . .” (traditional language)
or, “Find the solution set . . .” (modern language); we will not be able
to do either!
For the reader who is more interested in application than in theory, let
us point out again that there is interplay between theory and application
in mathematics (as well as in virtually all other subjects). Thus, the
theory of this section can be applied to practical situations. By way of
illustration, consider the following examples.
Example 1
It is desired to cut a 6-inch piece of string into two parts so that the longer
piece is twice the length of the shorter piece. What is the length of each
piece?
SOLUTION: As usual, we pretend to know the answer. Suppose we let x
denote the length of the shorter piece. Then, since the longer piece is
twice the length of the shorter, we will denote it by 2x. Since the sum of
the two lengths must equal 6, we arrive at the open sentence
x + 2x = 6 or 3x = 6.
Observe that 3x = 6 is precisely the equation with which we elected to begin
the theoretical points of this section. Notice that the equation is inanimate
and, hence, we have no notion as to what physically significant problems
it will be called upon to help solve. Knowing that x = 2 is the only
solution to 3x = 6, and recalling that x denotes the length of the shorter piece,
we have proven that the answer to the problem is that the shorter piece
is 2 inches and the longer piece is 4 inches. A simple check shows that
this is indeed a solution, and our problem is solved. This shows the
interplay between the theory and application in this problem, for we
translated the practical problem into an abstract equation, and the abstract
equation was solved by theoretical methods.
As a final note on this example, which may serve to place algebra in
proper perspective, observe that we have never insisted upon the use of
algebra to solve the problem. For instance, if we had enough skill in
arithmetic, we might see at a glance that the two pieces had to be in the
ratio of 2 to 1 (2:1). Thus, the longer piece would have to be 2/3 of the
entire length, while the shorter piece would have to be 1/3 of the entire
length. But 2/3 of 6 is 4 and 1/3 of 6 is 2, the required answer. Or, we could
have relied on brute-force trial-and-error, and hoped to stumble across
the answer. However, algebra serves as an objective “equalizer,” in
helping us solve practical problems by relatively simple recipes in the event
that our subjective insight is not sufficient for the task. More esthetically,
it supplies us with an objective vehicle for deducing what answers follow
inescapably (logically) from the given assumptions.
Example 2
It is known that a particular number has the property that if its square is
subtracted from four times the number, the answer is 3. What is the
number?
SOLUTION: Again, let x denote the number. Then x^2 denotes the square
of the number, while 4x denotes four times the number. Thus, the open
sentence becomes
4x - x^2 = 3.
Adding x^2 - 4x to both sides of the equation, it is converted into
x^2 - 4x + 3 = 0.
In terms of sets, all we are saying is that
{x: 4x - x^2 = 3} = {x: x^2 - 4x + 3 = 0}.
Our theory has already shown us that the only solutions are x = 1 and
x = 3. Thus, the required number must be either 1 or 3. (But unless
whoever posed the problem gives us additional information, we cannot tell
which of these two numbers he had in mind; we know only that it must have
been one of these two. More positively, we have arrived objectively at the
conclusion that any number other than 1 or 3 cannot be a solution.)
Example 3
We wish to construct a square whose perimeter exceeds its area by 3 units.
What must the length of the side of the square be?
SOLUTION: Letting the length of the side of the square be x, its area is x^2
and its perimeter is 4x. Thus, the open sentence becomes
4x - x^2 = 3.
Observe that while the physical situation discussed in this example is
very much different from the one discussed in Example 2, the equations
are identical. This highlights our assertion that the same theoretical
equation can occur in many diverse physical (real-life) situations. In any
event, we can now easily establish the fact that the length of the side of the
square is either 1 or 3, and the problem is solved.
It has been our aim in this section to help the student master the new
notation introduced with sets; and, at the same time, to place the review
in a productive environment by applying the concepts to some important
ideas of elementary mathematics. In particular, we chose to show how
the anatomy of algebra could be compactly summarized by proper use of
the set-builder versus roster methods for describing sets. In this procedure,
we do not arrive at any answers we did not already know, but we do find
a convenient device for both learning and teaching various concepts.
Exercises
1. Express the solution set S in set-builder form for each of the following
equations.
(a) 3x = 15
(b) 4x + 9 = 3x + 11
(c) 5x + 6 = 3x + 2
(d) x^2 + 7x + 10 = 0
(e) x^2 - 8x + 15 = 0
(f) x^2 + 2x + 10 = 0
(g) x^2 + 3x = 28
(h) 2x^2 - 7x + 3 = 0
(i) (x - 1)(x + 2)(x - 3)(x - 4) = 0
2. Use any means you desire to convert each of the solution sets obtained
in Exercise 1 into the roster form.
3. Verify that
x^3 + 3x^2 - x - 3 = (x + 1)(x - 1)(x + 3)
and use this result to find the solution set for
x^3 + 3x^2 - x - 3 = 0
both implicitly and explicitly.
4. Write an equation whose solution will solve the following problem.
Then solve the equation, thus solving the problem.
John, Bill, and Mary have, in all, 72 marbles. Bill has twice as many marbles
as Mary while John has three times as many as Mary. How many marbles does
Mary have?
5. Solve the problem in Exercise 4 using ratios and ordinary arithmetic.
6. Write the equation that will help solve the following problem and then
solve the equation.
The square of a particular natural number exceeds the number itself by 72. What
is the number?
2.5.3 AN INTRODUCTION TO MATRICES IN TERMS
OF METHODS FOR DESCRIBING SETS
In this section we shall continue to show how the roster method and the
set-builder method can be combined to give a modern meaning to traditional
topics.
Applying the discussion of solution sets to the topic of simultaneous linear
equations, consider, for example, the following pair of equations.
x + y = 8
x - y = 2
Here we can readily determine that x = 5 and y = 3 is the only solution
to the equations. One can see this by remembering that equals added to
equals are equal, equals multiplied by equals are equal, and so on. Thus
if we add the two equations, we obtain that 2x = 10; from whence it
follows that x = 5 and y = 3.
We can use the language of sets here very well. For instance, we could
write at once that the solution set S is given by
S = {(x,y): x + y = 8 and x - y = 2}.
At this point, even the student who has not been exposed to the various
computational techniques for solving simultaneous linear equations can
still use the test-for-membership idea inherent in set-builder notation to
decide whether a given ordered pair (x,y) belongs to S. The various
computational techniques merely give us an efficient way of expressing S in
the more explicit roster form: S = {(5,3)}.
Even the familiar devices, such as equals added to equals, can be
interpreted in terms of the language of sets. More explicitly, we learn
that we can multiply both sides of an equation by the same nonzero
constant. In terms of solution sets this says only that both equations have
the same solution set. By way of illustration, when we say that x + y = 4
is the same as 2(x + y) = 2(4), all we mean is that the two different
equations have the same solution set. In a similar way, when we say that
in a system of equations we can replace any equation by itself plus any
nonzero multiple of any other equation in the system, we again mean that
such an operation does not change the solution set of the given system.
Another operation that does not change the solution set of a given system
is changing the order of the equations that make up the system. In short,
we are saying that the solutions do not depend on the order in which the
equations are stated. If we apply these operations to the system
x + y = 8
x - y = 2
we obtain
x + y = 8      2x = 10         x = 5           x = 5         x = 5
x - y = 2  ~   x - y = 2   ~   x - y = 2   ~   -y = -3   ~   y = 3
   (a)            (b)             (c)            (d)          (e)
where ~ means that the systems have the same solution set. Observe
that (b) is obtained from (a) by replacing the first equation by the sum of
the first and the second; (c) is obtained from (b) by multiplying both sides
of the first equation by 1/2 and replacing the first equation by this result;
(d) is obtained from (c) by replacing the second equation by the second
minus the first (or equivalently, adding -1 times the first to the second);
and finally (e) is obtained from (d) by replacing the second by -1 times
the second equation.
Now utilizing the language of solution sets, (a), (b), (c), (d), and (e)
say that
{(x,y): x + y = 8 and x - y = 2} = {(x,y): 2x = 10 and x - y = 2}
                                 = {(x,y): x = 5 and x - y = 2}
                                 = {(x,y): x = 5 and -y = -3}
                                 = {(x,y): x = 5 and y = 3}.
While all of the above sets are equal, the last one is particularly easy to
convert into roster form; namely,
S = {(5,3)}.
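The steps from (a) to (e) can be retraced mechanically. The following Python
sketch is our own illustration and not part of the method itself: each equation
ax + by = c is stored as the list [a, b, c], and the helper names scale,
add_multiple, and satisfies are hypothetical labels for the operations
described above. Exact fractions are used so that multiplying by 1/2
introduces no rounding.

```python
from fractions import Fraction

def scale(eq, k):
    """Multiply both sides of the equation [a, b, c] (meaning ax + by = c) by k != 0."""
    return [k * t for t in eq]

def add_multiple(eq, other, k):
    """Replace eq by eq + k * other."""
    return [s + k * t for s, t in zip(eq, other)]

def satisfies(system, x, y):
    """Test for membership: does (x, y) satisfy every equation in the system?"""
    return all(a * x + b * y == c for a, b, c in system)

a_sys = [[1, 1, 8], [1, -1, 2]]                            # (a)
b_sys = [add_multiple(a_sys[0], a_sys[1], 1), a_sys[1]]    # (b) 2x = 10
c_sys = [scale(b_sys[0], Fraction(1, 2)), b_sys[1]]        # (c) x = 5
d_sys = [c_sys[0], add_multiple(c_sys[1], c_sys[0], -1)]   # (d) -y = -3
e_sys = [d_sys[0], scale(d_sys[1], -1)]                    # (e) y = 3

stages = [a_sys, b_sys, c_sys, d_sys, e_sys]
print(e_sys)
print([satisfies(s, 5, 3) for s in stages])
```

The pair (5,3) passes the membership test at every stage, which is what we
mean by saying that the five systems have the same solution set.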
Let us illustrate this approach with another example, only this time we
shall omit the reasons for each step. Let us find the solution to the pair
of simultaneous equations
3x + 2y = 8
4x + 3y = 15
We obtain
3x + 2y = 8         12x + 8y = 32         12x + 8y = 32
4x + 3y = 15   ~   -12x - 9y = -45   ~         -y = -13
      3x + 2y = 8        3x + 2y = 8
  ~         y = 13   ~      -2y = -26
and the roster form of the solution set, S = {(-6,13)}, follows at once
from the last pair of equivalent equations. In still other words, leaving
out the in-between steps, the set
{(x,y): 3x + 2y = 8 and 4x + 3y = 15}
is equal to the set
{(x,y): x = -6 and y = 13}.
Our algebraic computational devices merely allowed us to establish this
equality with a minimum of trial-and-error and other frustrations.
While the complexity of the computation may increase with the number
of equations in the system, the method itself is not affected by such changes.
By way of illustration, consider the system of three linear equations:
x + 2y + 3z = 4
2x + 5y + 7z = 9
4x + 9y + 9z = 15
In terms of our previous discussion, the solution set for this system will not
be altered if we replace the second equation by the second minus twice the
first, and the third equation by the third minus four times the first. The
resulting equivalent system now has the interesting property that x
appears in only the first equation. In other words, in the equivalent
system the last two equations involve two equations with two unknowns,
which is certainly easier to handle than three equations with three
unknowns. Once we find y and z from these last two equations we can find
x by replacing y and z in the first equation by their now known values.
Then we have
x + 2y + 3z = 4          x + 2y + 3z = 4
2x + 5y + 7z = 9     ~        y + z = 1
4x + 9y + 9z = 15             y - 3z = -1
      (a)                      (b)
While (b) is more convenient than (a), we may still reduce (b) by the
same technique that we used before. This time we may elect to eliminate
y from all but the second equation in (b). That is, we could have arrived
at the equivalent system
x + z = 2
y + z = 1
  -4z = -2
   (c)
by taking (b) and replacing the first equation by the first minus twice the
second, and the third by the third minus the second. The advantage of
(c) is that we can now find the value of z at once from the third equation.
With this knowledge we can return to the second equation to find y, and
to the first equation to find x. Of course, we could continue the reduction
still further by taking the third row of (c) and replacing it by the third
row multiplied by -1/4, thus obtaining the system (d). From (d) we could
obtain (e) by replacing the first row of (d) by the first minus the third, and
the second row by the second minus the third. In summary,
x + 2y + 3z = 4         x + 2y + 3z = 4         x + z = 2
2x + 5y + 7z = 9    ~        y + z = 1      ~   y + z = 1
4x + 9y + 9z = 15            y - 3z = -1          -4z = -2
      (a)                     (b)                  (c)
    x + z = 2          x = 3/2
~   y + z = 1      ~   y = 1/2
        z = 1/2        z = 1/2
       (d)               (e)
Thus, in a systematic way we have reduced the set (a) to the equivalent
set (e); but (e) is particularly convenient for showing that the solution
set is {(3/2, 1/2, 1/2)}. This system is straightforward and involves no
artificial devices, and it extends to any number of equations. Moreover, this
technique is self-contained and does not force us to invent such advanced
concepts as either “determinants” or “matrices” for a suitable discussion
of the process.
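Because the reduction uses only the three operations already described, it can
be sketched as a short Python routine. The sketch below is an illustration
under our own assumptions: each equation is stored as a list of its constants,
the name reduce_rows is hypothetical, and exact fractions are used to avoid
rounding. The order of the individual steps may differ slightly from the order
chosen in (a) through (e), but each step is one of the permitted operations.

```python
from fractions import Fraction

def reduce_rows(rows):
    """Reduce equations stored as [a1, ..., an, c], using exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols - 1):
        # Find an equation at or below pivot_row in which this unknown appears.
        src = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if src is None:
            continue
        m[pivot_row], m[src] = m[src], m[pivot_row]       # change the order
        lead = m[pivot_row][col]
        m[pivot_row] = [v / lead for v in m[pivot_row]]    # nonzero multiple of a row
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                k = m[r][col]
                # Replace row r by itself minus k times the pivot row.
                m[r] = [v - k * p for v, p in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return m

system = [[1, 2, 3, 4], [2, 5, 7, 9], [4, 9, 9, 15]]
reduced = reduce_rows(system)
solution = tuple(row[-1] for row in reduced)
print(solution)
```

Reading the last constant of each reduced equation gives the single member of
the solution set, in agreement with (e).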
We now introduce matrices (which are merely rectangular arrays of
numbers) by showing that they occur in a very natural way: simply as a
way of coding the reduction of a system of equations. Specifically, notice
that when we worked on our system all the changes were with respect to the
constants, not the variables. In other words, if we invented a type of
place-value, we could perform the operations using only the constants of
the equation. For example, we could abbreviate the equation ax + by = c
by writing [a b c] where we view the first column as holding the place of x,
the second column as holding the place of y, and the third column as
representing the constant on the right-hand side of the equation. Obviously,
we can continue this abbreviation for any number of variables. Thus,
[a b c d e f g] would be used to denote ax_1 + bx_2 + cx_3 + dx_4 +
ex_5 + fx_6 = g.
Applying this system of coding to some of our previous examples we
could code
x + y = 8
x - y = 2
by the two-row, three-column array (called a 2 by 3 matrix)
[ 1   1   8 ]
[ 1  -1   2 ].
Similarly,
[ 3   2   8 ]
[ 4   3  15 ]
would represent the system
3x + 2y = 8
4x + 3y = 15
If we want to invent the concept of equivalent matrices in terms of the
facts we know about systems of linear equations we define the concept
called row-equivalence as follows: If (1) we replace any row of a matrix
by a constant nonzero multiple of that row, or (2) we change the order of
the rows of the matrix, or (3) we replace any row by that row plus a nonzero
multiple of any other row, we call the resulting matrix row-equivalent to
the original matrix. In general, any two matrices are called row-equivalent
if one can be obtained from the other by operations of the type (1), (2),
and (3), above. Notice that the rules for generating equivalent matrices
are exactly the same as the rules for solving systems of equations.
With this definition of row-equivalence we are sure that when it is used
as our code, two row-equivalent matrices represent equivalent systems of
equations.
Let us illustrate these last remarks, but first let us agree that ~ will
now denote row-equivalent matrices. With this in mind, we have
α1 [ 1   1   8 ]   β1 [ 1   1   8 ]   γ1 [ 1   1   8 ]   δ1 [ 1  0  5 ]
α2 [ 1  -1   2 ] ~ β2 [ 0  -2  -6 ] ~ γ2 [ 0  -1  -3 ] ~ δ2 [ 0  1  3 ]
       (a)                (b)                (c)               (d)
where
β2 = α2 - α1
γ2 = (1/2)β2
δ1 = γ1 + γ2
δ2 = -γ2.
As a review of our code, (a), (b), (c), and (d) represent the systems
x + y = 8      x + y = 8            x + y = 8
x - y = 2      (0x) - 2y = -6          -y = -3
   (a)            (b)                  (c)
x + (0y) = 5          x = 5
       y = 3    or    y = 3
            (d)
In this way matrices allow us to obtain the same results as the method
presented on the previous pages, but without having to write the x's, y's,
+'s, and ='s with each equation. In a sense this does for systems of
equations what synthetic division does for polynomial division. To make
the concept of row-equivalence clearer, let us review the other problems
in this light.
[ 3   2   8 ]   [ -12  -8  -32 ]   [ -12  -8  -32 ]
[ 4   3  15 ] ~ [  12   9   45 ] ~ [   0   1   13 ]
     (a)              (b)                (c)
  [ -3  -2  -8 ]   [ -3   0  18 ]   [ 1   0  -6 ]
~ [  0   2  26 ] ~ [  0   2  26 ] ~ [ 0   1  13 ]
       (d)              (e)             (f)
In terms of our code (a) and (f) represent the equivalent systems
3x + 2y = 8              x = -6
4x + 3y = 15     and     y = 13
    (a)                   (f)
In a similar way
[ 1   2   3   4 ]   [ 1   2   3   4 ]   [ 1   0   1   2 ]
[ 2   5   7   9 ] ~ [ 0   1   1   1 ] ~ [ 0   1   1   1 ]
[ 4   9   9  15 ]   [ 0   1  -3  -1 ]   [ 0   0  -4  -2 ]
       (a)                 (b)                 (c)
  [ 1   0   1   2   ]   [ 1   0   0   3/2 ]
~ [ 0   1   1   1   ] ~ [ 0   1   0   1/2 ]
  [ 0   0   1   1/2 ]   [ 0   0   1   1/2 ]
        (d)                   (e)
shows that the two systems
x + 2y + 3z = 4               x = 3/2
2x + 5y + 7z = 9      and     y = 1/2
4x + 9y + 9z = 15             z = 1/2
      (a)                      (e)
are equivalent.
This matrix notation, while not giving us any information we could not
have otherwise obtained, still affords us a compact, convenient system of
notation.
However, there is still more that the matrix system can do for us,
provided we want to carry the code still further. For example, suppose
we had three equations with three unknowns and wished to see what
happened if we left the left-hand side of each equation alone and varied
only the constants on the right-hand side. Thus, instead of the system
x + 2y + 3z = 4
2x + 5y + 7z = 9
4x + 9y + 9z = 15
we might have wished to solve the more general system
x + 2y + 3z = a
2x + 5y + 7z = b
4x + 9y + 9z = c
To be sure, we could proceed as before using a 3 by 4 matrix, but a
suitable 3 by 6 matrix will simplify the cause even more. That is, we will
still let the first three columns denote x, y, and z, but now the next three
columns will represent a, b, and c. Thus, our matrix would be
  x   y   z   a   b   c
[ 1   2   3   1   0   0 ]
[ 2   5   7   0   1   0 ]
[ 4   9   9   0   0   1 ]
Our row-reducing operations continue exactly the same as before as we
reduce the first terms in the second and third rows to 0, then the second
terms in the first and third rows to 0, and finally, the third terms in the
first and second rows to 0. (Notice that adding more columns does not
appreciably change the amount of computation; however, as we add on
rows, the work seems to increase considerably.) In this way we obtain
  x   y   z   a   b   c
[ 1   2   3   1   0   0 ]   [ 1   2   3    1   0   0 ]
[ 2   5   7   0   1   0 ] ~ [ 0   1   1   -2   1   0 ]
[ 4   9   9   0   0   1 ]   [ 0   1  -3   -4   0   1 ]
          (a)                          (b)
  [ 1   0   1    5  -2   0 ]   [ 1   0   1    5    -2     0   ]
~ [ 0   1   1   -2   1   0 ] ~ [ 0   1   1   -2     1     0   ]
  [ 0   0  -4   -2  -1   1 ]   [ 0   0   1   1/2   1/4  -1/4  ]
            (c)                            (d)
    x   y   z     a      b      c
  [ 1   0   0    9/2   -9/4    1/4 ]
~ [ 0   1   0   -5/2    3/4    1/4 ]
  [ 0   0   1    1/2    1/4   -1/4 ]
               (e)
From our code, (e) now tells us that
x + 2y + 3z = a
2x + 5y + 7z = b
4x + 9y + 9z = c
has the solution
x = 9a/2 - 9b/4 + c/4
y = -5a/2 + 3b/4 + c/4
z = a/2 + b/4 - c/4.
We see that this system has a unique solution for each given set of values
a, b, and c. That is, x, y, and z are determined unambiguously once a, b,
and c are specified. Our earlier example was the special case in which
a = 4, b = 9, and c = 15.
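The reduction of the 3 by 6 matrix can be sketched in Python as follows. This
is an illustration only: the names reduce_augmented, right_half, and solve are
our own, and the routine assumes that a nonzero leading term can always be
found, which is true for this system but not for every system.

```python
from fractions import Fraction

def reduce_augmented(a):
    """Reduce [A | I] for a square array A; assumes a nonzero leading term always exists."""
    n = len(a)
    # Attach the columns that stand for the right-hand constants.
    m = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        src = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[src] = m[src], m[col]
        lead = m[col][col]
        m[col] = [v / lead for v in m[col]]
        for r in range(n):
            if r != col:
                k = m[r][col]
                m[r] = [v - k * p for v, p in zip(m[r], m[col])]
    return [row[n:] for row in m]  # the columns headed a, b, c in the final matrix

coeffs = [[1, 2, 3], [2, 5, 7], [4, 9, 9]]
right_half = reduce_augmented(coeffs)

def solve(a, b, c):
    """Read x, y, z off the reduced matrix for given constants a, b, c."""
    return tuple(r[0] * a + r[1] * b + r[2] * c for r in right_half)

print(right_half)
print(solve(4, 9, 15))
```

Once right_half has been computed, any new choice of a, b, and c requires only
three multiplications and additions per unknown; the reduction itself need not
be repeated.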
As is so often the case, we seldom get something for nothing. In the
same sense, if we are judicious in our choice of notation, it often happens
that what appears to be a more cumbersome notation brings with it a
few fringe benefits. For example, look at the second row in (b) which reads
0   1   1   -2   1   0
and which translates into
y + z = -2a + b.
Since b symbolizes the second equation and a the first (since these are the
constants on the right-hand side of each equation), -2a + b tells us that
we eliminate x by adding the second equation to minus twice the first (or
equivalently, by subtracting twice the first equation from the second).
A rather elementary check verifies that this is indeed the case.
The important point is that with the augmented form of the matrix, the
“second half” of it always gives us a quick way of checking how we
combined the original equations to obtain the result. For example, the last
row in (c) tells us that we can eliminate both x and y to obtain an expression
for -4z “simply” by adding minus twice the first equation and minus the
second equation to the third equation. That is,
0   0   -4   -2   -1   1
says that
-4z = -2a - b + c.
As a check notice that -2a refers to
-2x - 4y - 6z = -2a. (1)
-b says that
-2x - 5y - 7z = -b (2)
and c says that
4x + 9y + 9z = c. (3)
If we add (1), (2), and (3) we obtain that
-4z = -2a - b + c
just as our matrix tells us, but the matrix technique gives us a very
systematic way to derive the result, without recourse to guessing and other
trial-and-error techniques.
The concept of an inverse matrix plays a most vital role in matrix algebra
and linear algebra. In the simple coding system we have already supplied
an excellent but simple motivation for this concept. Namely, if we think
of inverse in the usual sense, that is, as a shift in emphasis, we see that the
inverse of expressing a, b, and c in terms of x, y, and z would be to express
x, y, and z in terms of a, b, and c. (Recall that in this sense subtraction is
the inverse of addition since it involves only a change in emphasis. For
example, 3 + 2 = 5 and 2 = 5 - 3 both convey the same number fact
only with a switch in emphasis.) In terms of the last problem of the
previous section we showed that the inverse of
[ 1   2   3 ]
[ 2   5   7 ]
[ 4   9   9 ]
was
[  9/2   -9/4    1/4 ]
[ -5/2    3/4    1/4 ]
[  1/2    1/4   -1/4 ]
and that this, in turn, meant that if
x + 2y + 3z = a
2x + 5y + 7z = b
4x + 9y + 9z = c
then
9a/2 - 9b/4 + c/4 = x
-5a/2 + 3b/4 + c/4 = y
a/2 + b/4 - c/4 = z.
To apply this to a 4 by 8 situation, observe that
    x   y   z   w    a   b   c   d
  [ 1   1   1   1    1   0   0   0 ]
  [ 2   3   3   4    0   1   0   0 ]
  [ 3   2   2   5    0   0   1   0 ]
  [ 4   5   6   5    0   0   0   1 ]
                (a)
  [ 1   1   1   1    1   0   0   0 ]
~ [ 0   1   1   2   -2   1   0   0 ]
  [ 0  -1  -1   2   -3   0   1   0 ]
  [ 0   1   2   1   -4   0   0   1 ]
                (b)
  [ 1   0   0  -1    3  -1   0   0 ]
~ [ 0   1   1   2   -2   1   0   0 ]
  [ 0   0   0   4   -5   1   1   0 ]
  [ 0   0   1  -1   -2  -1   0   1 ]
                (c)
  [ 1   0   0  -1    3  -1   0   0 ]
~ [ 0   1   1   2   -2   1   0   0 ]
  [ 0   0   1  -1   -2  -1   0   1 ]
  [ 0   0   0   4   -5   1   1   0 ]
                (d)
  [ 1   0   0  -1    3  -1   0   0 ]
~ [ 0   1   0   3    0   2   0  -1 ]
  [ 0   0   1  -1   -2  -1   0   1 ]
  [ 0   0   0   4   -5   1   1   0 ]
                (e)
  [ 1   0   0  -1     3     -1     0     0 ]
~ [ 0   1   0   3     0      2     0    -1 ]
  [ 0   0   1  -1    -2     -1     0     1 ]
  [ 0   0   0   1   -5/4    1/4   1/4    0 ]
                (f)
    x   y   z   w      a      b      c     d
  [ 1   0   0   0     7/4   -3/4    1/4    0 ]
~ [ 0   1   0   0    15/4    5/4   -3/4   -1 ]
  [ 0   0   1   0   -13/4   -3/4    1/4    1 ]
  [ 0   0   0   1    -5/4    1/4    1/4    0 ]
                (g)
A comparison of (a) and (g) shows that the matrices
[ 1   1   1   1 ]         [   7/4   -3/4    1/4    0 ]
[ 2   3   3   4 ]   and   [  15/4    5/4   -3/4   -1 ]
[ 3   2   2   5 ]         [ -13/4   -3/4    1/4    1 ]
[ 4   5   6   5 ]         [  -5/4    1/4    1/4    0 ]
are inverses. In terms of linear equations, (a) and (g) tell us that
x + y + z + w = a
2x + 3y + 3z + 4w = b
3x + 2y + 2z + 5w = c
4x + 5y + 6z + 5w = d
implies
7a/4 - 3b/4 + c/4 = x
15a/4 + 5b/4 - 3c/4 - d = y
-13a/4 - 3b/4 + c/4 + d = z
-5a/4 + b/4 + c/4 = w
Observe also that (b), (c), (d), (e), and (f) can be translated in terms of
our code to give other systems of linear equations that are equivalent to
those implied by (a) and (g). In other words, for example, comparing
(a) with (b) we see that
x + y + z + w = a              x + y + z + w = a
2x + 3y + 3z + 4w = b     ~        y + z + 2w = -2a + b
3x + 2y + 2z + 5w = c             -y - z + 2w = -3a + c
4x + 5y + 6z + 5w = d              y + 2z + w = -4a + d
Finally, observe that while the matrices may start to look involved,
they are still much more simple and systematic than other methods of
computation for developing the same results, including determinants!
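The shift in emphasis between (a) and (g) can be tested directly: choose x, y,
z, and w, compute a, b, c, and d from the original equations, and then recover
x, y, z, and w from the formulas read off (g). The Python sketch below does
this for a few sample values. The function names forward and backward and the
particular samples are our own assumptions, used only for illustration.

```python
from fractions import Fraction as F

def forward(x, y, z, w):
    """Express a, b, c, d in terms of x, y, z, w (the original system)."""
    a = x + y + z + w
    b = 2 * x + 3 * y + 3 * z + 4 * w
    c = 3 * x + 2 * y + 2 * z + 5 * w
    d = 4 * x + 5 * y + 6 * z + 5 * w
    return a, b, c, d

def backward(a, b, c, d):
    """Express x, y, z, w in terms of a, b, c, d (the rows of (g))."""
    x = F(7, 4) * a - F(3, 4) * b + F(1, 4) * c
    y = F(15, 4) * a + F(5, 4) * b - F(3, 4) * c - d
    z = -F(13, 4) * a - F(3, 4) * b + F(1, 4) * c + d
    w = -F(5, 4) * a + F(1, 4) * b + F(1, 4) * c
    return x, y, z, w

samples = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1), (2, -3, 5, 7)]
round_trips = [backward(*forward(*s)) == s for s in samples]
print(round_trips)
```

Each sample returns to its starting values, which is the sense in which the
two arrays undo one another.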
Up to now we would have been able to live without the new coding
system; and, until now there has been no appreciable advantage to matrices.
The major advantage in matrix notation comes when equations are
dependent. For example, consider the system
x + y = 3
2x + 2y = 6
Here we still have two equations with two unknowns but the two equations
are dependent in the sense that both have the same solution set. In
other words (x,y) is a solution of the first equation if and only if it is also
a solution of the second equation. Similarly, the system
x + y = 3
2x + 2y = 7
has no solutions since x + y = 3 implies 2x + 2y = 6; that is, 2x + 2y = 7
is incompatible with x + y = 3.
It is here that the matrix coding system is extremely useful. While the
method we are about to illustrate applies to any situation, we shall apply
it first to the case of three equations and three unknowns. Consider the
following system.
x + y - 2z = a
-11x + 4y + 7z = b
26x - 9y - 17z = c
If we now subject the above system to our coding system we obtain
    x    y    z    a   b   c
[   1    1   -2    1   0   0 ]   [ 1    1    -2     1   0   0 ]
[ -11    4    7    0   1   0 ] ~ [ 0   15   -15    11   1   0 ]
[  26   -9  -17    0   0   1 ]   [ 0  -35    35   -26   0   1 ]
            (a)                              (b)
  [ 1    1   -2      1       0      0   ]
~ [ 0    1   -1    11/15    1/15    0   ]
  [ 0   -1    1   -26/35     0     1/35 ]
                  (c)
    x    y    z      a        b      c
  [ 1    0   -1     4/15    -1/15    0   ]
~ [ 0    1   -1    11/15     1/15    0   ]
  [ 0    0    0    -1/105    1/15   1/35 ]
                  (d)
In terms of our code, (a), (b), (c), and (d) represent systems of equivalent
equations. The last row of (d) is particularly interesting, however. For
our code tells us that -a/105 + b/15 + c/35 = 0 (that is, 0x + 0y + 0z),
or, clearing denominators,
-a + 7b + 3c = 0 or a = 7b + 3c.
In other words, our code now tells us that the system cannot possess a
solution unless a = 7b + 3c. The matrix code provided us with an
excellent algorithm to replace trial-and-error. Specifically, the fact that
a must equal 7b + 3c tells us that the first equation is equal to the sum
of seven times the second plus three times the third. Now we can easily
see that seven times the second is -77x + 28y + 49z = 7b, while three
times the third is 78x - 27y - 51z = 3c.
Adding these two equations we see that x + y - 2z = 7b + 3c, but our
first equation stipulates that x + y - 2z = a; and this checks with the
result a = 7b + 3c, since x + y - 2z is a unique number for any specified
values of x, y, and z.
Observe that the same result might have been obtained by
trial-and-error, but that the matrix system makes guessing completely
unnecessary here. The matrix code also gives us excellent insight into just
what is happening with the system of equations.
Getting back to the system, observe that all we have shown is that
7b + 3c = a is a necessary condition for the system to possess solutions.
We have not shown that this condition is sufficient. However, if we now
return to (d) we see that this condition is sufficient since the first two rows
of (d) translate into
x - z = 4a/15 - b/15
y - z = 11a/15 + b/15
which in turn says, for example, that once 7b + 3c = a we can choose z
at random, whereupon the accompanying values of x and y are uniquely
determined by
x = z + 4a/15 - b/15
y = z + 11a/15 + b/15.
To summarize our results, the matrix coding system shows that
x + y - 2z = a
-11x + 4y + 7z = b
26x - 9y - 17z = c
is equivalent to the system
x - z = 4a/15 - b/15
y - z = 11a/15 + b/15
provided that 7b + 3c = a.
The condition 7b + 3c = a is often referred to as a constraint. That is,
unless 7b + 3c = a the system can have no solutions. When this happens
(that is, when a ≠ 7b + 3c), we say that the system of equations is
incompatible. Once a = 7b + 3c we have infinitely many solutions, one for
each choice of z. We say that the system has 1 degree of freedom since we
can specify one of the variables at random and still obtain unique solutions.
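The way in which the constraint falls out of the code can be retraced in a few
lines of Python. The sketch below is our own illustration: the helper names
combine and compatible are hypothetical, each row holds the six entries under
x, y, z, a, b, c, and the steps follow (a) through (d) as given above.

```python
from fractions import Fraction as F

# Each row is [x, y, z, a, b, c]: coefficients on the left, the code for a, b, c on the right.
m = [[F(1), F(1), F(-2), F(1), F(0), F(0)],
     [F(-11), F(4), F(7), F(0), F(1), F(0)],
     [F(26), F(-9), F(-17), F(0), F(0), F(1)]]

def combine(row, other, k):
    """Return row + k * other."""
    return [s + k * t for s, t in zip(row, other)]

m[1] = combine(m[1], m[0], 11)     # (b): second plus 11 times the first
m[2] = combine(m[2], m[0], -26)    #      third minus 26 times the first
m[1] = [v / 15 for v in m[1]]      # (c): divide the second row by 15
m[2] = [v / 35 for v in m[2]]      #      divide the third row by 35
m[0] = combine(m[0], m[1], -1)     # (d): first minus the second
m[2] = combine(m[2], m[1], 1)      #      third plus the second

left, right = m[2][:3], m[2][3:]
print(left, right)

def compatible(a, b, c):
    """The last row reads 0 = right[0]*a + right[1]*b + right[2]*c; it must read 0 = 0."""
    return right[0] * a + right[1] * b + right[2] * c == 0

print(compatible(10, 1, 1), compatible(1, 1, 1))
```

The left half of the last row is all zeros, so the right half must also
combine to zero; for a = 10, b = 1, c = 1 it does (10 = 7 + 3), while for
a = b = c = 1 it does not.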
Now
  x   y   z   w
[ 1   0  -3  -2   4 ]
[ 0   1  -1   1   3 ]          (1)
is used as a code for the system of two equations with four unknowns;
namely,
x - 3z - 2w = 4
y - z + w = 3
which indicates 2 degrees of freedom since we can choose w and z at random.
From this we obtain
x = 3z + 2w + 4
y = z - w + 3.
To pursue this point just a bit further, suppose that we were given the
following system of three equations and four unknowns:
     x +  y - 4z -  w =  7
    2x +  y - 7z - 3w = 11          (2)
    2x + 3y - 9z -  w = 17
Using the matrix code for reducing (2) we obtain

    (a)                       (b)
    1   1  -4  -1   7         1   1  -4  -1   7
    2   1  -7  -3  11         0  -1   1  -1  -3
    2   3  -9  -1  17         0   1  -1   1   3

    (c)                       (d)
    1   0  -3  -2   4         1   0  -3  -2   4
    0  -1   1  -1  -3         0   1  -1   1   3
    0   0   0   0   0         0   0   0   0   0
Since the last row of (d) is the identity 0x + 0y + 0z + 0w = 0, this
row is redundant; hence, (d) can be replaced by
    1   0  -3  -2   4
    0   1  -1   1   3
which is matrix (1).

We have thus shown that (2) is equivalent to (1); but the “echelon”
form of (1) allows us to quickly determine the degrees of freedom and how
to determine x and y once w and z are chosen. (In echelon form the
first nonzero entry of each row is 1; this 1 is “indented” if it comes in a row
later than the row in which another 1 appears; and in any column where such
a 1 appears all other entries are 0.) In summary, then, the solutions of (2)
are given by choosing, say, z and w at random, whereupon
    x = 3z + 2w + 4 and y = z - w + 3.
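The reduction of (2) to echelon form can also be carried out mechanically. The sketch below is our own illustration rather than the authors' procedure: it is a standard Gauss-Jordan reduction with exact fractions, the function name is hypothetical, and the intermediate matrices it passes through need not coincide with (b) and (c) above, although the final form agrees with (d).

```python
from fractions import Fraction

def reduce_to_echelon(rows):
    """Gauss-Jordan reduction of a coded system (last column = constants)."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivot_row = 0
    n_unknowns = len(m[0]) - 1              # never pivot on the constants column
    for col in range(n_unknowns):
        # Look for a row, at or below pivot_row, with a nonzero entry here.
        found = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if found is None:
            continue
        m[pivot_row], m[found] = m[found], m[pivot_row]
        lead = m[pivot_row][col]
        m[pivot_row] = [v / lead for v in m[pivot_row]]   # leading entry becomes 1
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [p - factor * q for p, q in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == len(m):
            break
    return m

system_2 = [[1, 1, -4, -1, 7],
            [2, 1, -7, -3, 11],
            [2, 3, -9, -1, 17]]

for row in reduce_to_echelon(system_2):
    print([str(v) for v in row])
```

The row of zeros in the output is the redundant identity, and the two remaining rows are the code (1).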
To continue the review further, in trying to solve (2) when we got to
(d) we might have heaved a sigh of relief because even though the equations
were dependent they were compatible. We might have desired to study
the more general situation
     x +  y - 4z -  w = a
    2x +  y - 7z - 3w = b          (3)
    2x + 3y - 9z -  w = c
In this event we would have obtained the following sequence of 3 by 7
matrices.
    (a)                                 (b)
    x   y   z   w   a   b   c           x   y   z   w   a   b   c
    1   1  -4  -1   1   0   0           1   1  -4  -1   1   0   0
    2   1  -7  -3   0   1   0           0  -1   1  -1  -2   1   0
    2   3  -9  -1   0   0   1           0   1  -1   1  -2   0   1

    (c)                                 (d)
    x   y   z   w   a   b   c           x   y   z   w   a   b   c
    1   0  -3  -2  -1   1   0           1   0  -3  -2  -1   1   0
    0  -1   1  -1  -2   1   0           0   1  -1   1   2  -1   0
    0   0   0   0  -4   1   1           0   0   0   0  -4   1   1
(d) is very informative. First of all, its last row tells us that our
constraint is -4a + b + c = 0, or 4a = b + c. Observe that this result
checks with the result obtained for (2). Namely, in (2), a = 7, b = 11,
and c = 17; whence it does follow that 4a = b + c. We also gain the
information that the equations in (2) were dependent but compatible
because the sum of the last two equations was equal to four times the first.
Continuing further, the first two rows of (d) tell us that to find x and y
once our constraint is met we may choose z and w at random, whereupon x
and y are determined by
    x = 3z + 2w - a + b and y = z - w + 2a - b.
Again, referring to (2) we have a = 7 and b = 11, whereupon these
equations reduce to
    x = 3z + 2w + 4 and y = z - w + 3
which agrees with our previous results!
An interesting special case in which the constraint must be obeyed is
when a = b = c = 0. In this event the system of equations is called
homogeneous. What this tells us with respect to our given illustration is
that if we are given the system
     x +  y - 4z -  w = 0
    2x +  y - 7z - 3w = 0
    2x + 3y - 9z -  w = 0
this system has 2 degrees of freedom. Namely, from x = 3z + 2w - a + b
and y = z - w + 2a - b with a = b = 0 we may choose z and w at
random; whence x = 3z + 2w and y = z - w. The check that these
are indeed solutions is left to the reader.
As a final observation concerning the result obtained in (d) we see that
the system
     x +  y - 4z -  w = 6
    2x +  y - 7z - 3w = 8
    2x + 3y - 9z -  w = 9
has no solutions (that is, the solution set is empty) because the constraint
4a = b + c is not met. (In other words, in this case 4a = 24 while
b + c = 8 + 9 = 17.)
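The three cases just discussed (the compatible system (2), the homogeneous system, and the incompatible system) can be tested with one short routine. This is only a sketch under the formulas read off from (d); the function names are ours.

```python
def solve_system_3(a, b, c, z, w):
    """Return (x, y, z, w) for system (3), or None when 4a != b + c."""
    if 4 * a != b + c:
        return None                      # constraint violated: no solutions
    x = 3 * z + 2 * w - a + b
    y = z - w + 2 * a - b
    return x, y, z, w

def check_system_3(x, y, z, w, a, b, c):
    """Substitute into the three original equations of (3)."""
    return (x + y - 4 * z - w == a
            and 2 * x + y - 7 * z - 3 * w == b
            and 2 * x + 3 * y - 9 * z - w == c)

print(solve_system_3(7, 11, 17, z=1, w=2))   # system (2): a solution exists
print(solve_system_3(0, 0, 0, z=1, w=1))     # homogeneous case
print(solve_system_3(6, 8, 9, z=0, w=0))     # 24 != 17, so None
```

Any pair z, w may be supplied in the first two calls, which is what is meant by 2 degrees of freedom.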
It is hoped that the above discussion has established that matrices are
an important device in the study of linear systems of equations; and that,
in turn, matrices have a simple interpretation in terms of solution sets for
systems of linear equations.
This illustration concludes the first part of our treatment of sets. In
Part II we shall turn our attention to the problem of how we invent
operations by which we can combine sets to form new sets.
Exercises
1. Use matrices to solve each of the following systems:
(a) 3z + 4y =
4x + by = 9j
(b) 6x + 5y = -3
    5x - 3y = 11
2. Use matrices to solve each of the following systems:
(a)  x +  y +  z = 3
    2x + 3y + 4z = 4
    6x + 7y + 9z = 5
(b)  x +  y +  z = 3
    3x + 4y + 5z = 4
    4x + 7y + 8z = 5
3. Solve for a, b, and c in terms of x, y, and z if:
(a)  x +  y +  z = a
    2x + 3y + 4z = b
    6x + 7y + 9z = c
(b)  x +  y +  z = a
    3x + 4y + 5z = b
    4x + 7y + 8z = c
4. Under what conditions (that is, for what values of a, b, and c) will the
following system have solutions? Describe the solution set in these
cases.
     x + 2y + 3z = a
    2x + 3y + 4z = b
    3x + 4y + 5z = c
5. Find the solution set of
     x +  y +  z +  w = 1
    2x + 2y +  z + 2w = 2
    3x +  y +  z + 2w = 3
     x + 7y + 3z + 4w = 1
part II / The Arithmetic of Sets
2.6 UNIONS, INTERSECTIONS, AND COMPLEMENTS

In Part I we introduced the concept of set along with a few basic properties
and pointed out that the study of sets served a multipurpose function in
mathematics. We began to illustrate this result later with respect to the
concept of systems of equations. We showed that while we could have
studied equations without the formal knowledge of sets, a study with this
knowledge simplified the underlying principle behind the solution of
equations. More specifically, we used the concept of sets to show the
inner mechanism whereby one proceeded from an implicit to an explicit
form of the answer, and how the recipes usually associated with algebra fit
into this overall pattern.
In this section we intend to introduce a few additional concepts in the
study of sets and we shall then show how sets may be used to unify
apparently unrelated topics in the study of mathematics.
We begin by pointing out that while we have applied the concept of
sets to arithmetic, we have not applied the concept of arithmetic to sets.
That is, we have not yet explained how we may combine sets to form new
sets. Thus, before proceeding further let us first introduce the arithmetic
of sets. For the present we shall study this idea for its own sake and then
later see how this applies to other topics in mathematics.
From a very informal point of view, suppose that two organizations A
and B wish to form a merger (this means forming a new organization that
incorporates A and B). Calling this newly formed organization C we see
that C consists precisely of those elements that belonged to at least one of the
original groups A or B. In a sense C may be viewed as the union of A
and B. Notice that if there are well-defined tests for membership in A
and B, then there is also a well-defined test for membership in C. Namely,
given any element in our universe of discourse, we test to see whether it
belongs to either A or B. If it belongs to neither then it will not belong
to C; otherwise it will.
We can generalize this idea to cover all sets. First of all, to minimize
any chance for paradoxes, we assume that we have a specific universe of
discourse which we shall denote by I.
Definition
Let A and B be subsets of I. Then by the union of A and B, written
A ∪ B, we mean the set of all elements that belong to either A or B (unless
otherwise specified, by “either . . . or . . .” we mean at least one³). In
the language of sets
    A ∪ B = {x: x ∈ A or x ∈ B}.
This is shown in terms of circle diagrams in Figure 2.6.
We could have combined the sets A and B in a way almost completely
opposite to that of union. For example, rather than a merger we could
have formed a “shrinker.” That is, we could have consolidated the two
organizations A and B by forming a new organization D, characterized by
the fact that its members were those that belonged simultaneously to both
A and B.
Definition
Again, let A and B be subsets of I. Then by the intersection of A and B,
written A ∩ B, we mean the set of all elements that belong to both A
and B or, symbolically,
    A ∩ B = {x: x ∈ A and x ∈ B}.
The word “intersection” can be understood from the circle diagrams by
observing that A ∩ B in Figure 2.7 is actually the intersection of the two
circles.
[Figure 2.7: the hatched region denotes A ∩ B.]
³ While this may seem strange, the fact remains that “either . . . or . . .” is often
used in this nonmutually exclusive sense. For example, when we say that either Tom
or Jerry will go to the store, we do not preclude the possibility that both boys might go.
When we reach into a deck of cards and say that we shall draw either a spade or a face
card, we do not feel that we have lied should we draw the king of spades. On the other
hand, there are times when “either . . . or . . .” means “one or the other, but not
both.” In this event there is still no contradiction, for “one or the other, but not both”
is covered by the expression “at least one.” Thus, our only precaution is that we will
say “either . . . or . . . , but not both” when we mean “either . . . or . . .” in the
exclusive sense.
Notice that these last two definitions introduce operations on sets that
are closed. That is, the union of two sets is again a set; and similarly, the
intersection of two sets is a set.
This leads to still another reason for introducing the empty set, ∅. For,
first observe that since A ∪ B = ∅ if and only if A = ∅ and B = ∅, it
would not be necessary to invent the empty set for unions, since the
only way for a union to be empty is for the sets forming the union to be
empty. However, it is possible for the intersection of two nonempty sets
to be empty. For example, if I denotes the integers and A denotes the
even integers while B denotes the odd integers, we see that both A and B
contain infinitely many elements, yet their intersection is empty since
there are no integers that are simultaneously even and odd. That is, each
integer is either even or odd but not both. This shows that if we did not
consider the empty set it might well happen that the intersection of two
sets might not be a set.
A ∩ B = ∅ is equivalent to the more familiar “no A’s are B’s,” and
translates into the circle diagram shown in Figure 2.8.
The final operation that we wish to introduce in this section is the
concept of the complement of a set. Observe that whenever we choose a
subset of the universe of discourse we actually choose two subsets. For
example, if I denotes the set of natural numbers and A denotes the subset
consisting of the perfect squares, then we have induced the set B whose
members are the natural numbers that are not perfect squares. If I
denotes all American citizens and A denotes the set of American citizens
who reside in New York, then we can induce a set B by allowing it to
denote all American citizens who do not reside in New York.
Definition
By the complement of A, written A′, we mean the set of all elements that
belong to the universe of discourse but not to A. That is,
    A′ = {x: x ∈ I but x ∉ A}.
Figure 2.9 shows this in terms of circle diagrams.
Logically, “but” has the same meaning as “and” in many contexts.
In this sense A′ = {x: x ∈ I and x ∉ A}. Thus, we may say that
    A′ = I ∩ A′.
This is not surprising since all elements belong to I. That is, if B is any
subset of I then B = I ∩ B.
Figure 2.9
The concept of complement does not depend on the concept of universal
set. However, the complement of a given set does depend on the universal
set. In other words, we do not say that A′ means all non-A’s but rather,
all I’s that are non-A’s.
One often extends the concept of complement to that of relative
complement. Given any two sets A and B, we define the relative complement
of A in B, written B - A, to be all B’s that are non-A’s. That is,
    B - A = {x: x ∈ B but x ∉ A}.
Now we can establish that
    B - A = B ∩ A′
since
    B - A = {x: x ∈ B but x ∉ A}
and
    B ∩ A′ = {x: x ∈ B and x ∈ A′ (that is, x ∉ A)}.
This is shown in terms of circle diagrams in Figure 2.10.
[Figure 2.10: A ∪ B = (A ∩ B′) ∪ (A ∩ B) ∪ (B ∩ A′). Why?]
In terms of a basic structure, observe that arithmetic seems to hinge
on rules of combination that are closed. We are in a good position to
introduce the arithmetic of sets, for we have the three basic operations of
union, intersection, and complement whereby we can form new sets from
old ones. For example, if A, B, and C are sets, we can form the new
set D = B ∪ C and then form A ∩ D. That is, we can talk about such
combinations as A ∩ (B ∪ C). This idea is shown in a circle diagram
in Figure 2.11.⁴
⁴ See the note at the end of this section.
[Figure 2.11: A is denoted by one hatching, B ∪ C by a second hatching, and A ∩ (B ∪ C) by the cross-hatched region. (Why?)]
Before concluding this section we wish to present one more way of
visualizing these three basic operations of sets. Consider the problem
of students enrolling at a college and registering for certain courses. One
type of registration card is of a punch-board variety. Specifically, the
card has several holes punched at the top. Each hole represents, for
example, a different course. Thus, the first hole might represent Math
101; the second, History 101; and so on. If a student enrolls in a certain
course we make a wedge-shaped punch in the hole corresponding to that
course. We might have the card of a student who takes both Math 101
and History 101 [Figure 2.12(a)], the card of a student who takes Math 101
but not History 101 [Figure 2.12(b)], the card of a student who takes
History 101 but not Math 101 [Figure 2.12(c)], and the card of a student
who takes neither of these two [Figure 2.12(d)].
Observe that a particular student’s card must be one of the four types
pictured in Figure 2.12, but it cannot be more than one such form. In
other words, the four diagrams in Figure 2.12 represent all possibilities in
a mutually exclusive manner.
Should the registrar wish to determine those students who are enrolled
in Math 101, he need only take a long needle and insert it through the
first hole. Those cards whose first hole is punched out will fall from the
needle, while the unpunched cards continue to hang on the needle. Thus,
if we wish to determine which students take both Math 101 and History
101, we can take two needles and insert one in the first hole and one in the
second. Then a card can fall off if and only if it is punched out in both
holes [Figure 2.12(a)]. If we had only one needle we could insert it first
in the first hole. Then the cards that fell off would still be “eligible,”
while the ones that did not could not possibly belong to students taking
both courses. We can then pick up the cards that fell off and run the
needle through the second hole. The ones that now fall off represent
students taking both courses.
[Figure 2.12: registration cards (a) through (d).]
In a similar way, if we wished to find those students who took at least
one of the two subjects, we could first run the needle through the first hole.
Then for the cards that did not fall off we could run the needle through
the second hole. The students that we wanted would be represented by
the total of cards that fell.
Finally, if we wished to find the students who were not enrolled in Math
101 we would insert the needle in the first hole; then the cards that had
not fallen off would represent the students who were not taking Math 101.
A little reflection should show that the above three paragraphs represent
the concepts of intersection, union, and complement, respectively.
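The needle procedure lends itself to a small simulation. The sketch below is our own illustration: the four student names are hypothetical, each card is modelled as the set of holes that have been punched out, and a card "falls" when the needle passes through a punched-out hole.

```python
# Hole 1 stands for Math 101 and hole 2 for History 101.
# The names are invented; one card for each type in Figure 2.12.
cards = {
    "Ann": {1, 2},    # type (a): both courses
    "Bob": {1},       # type (b): Math 101 only
    "Cal": {2},       # type (c): History 101 only
    "Dee": set(),     # type (d): neither course
}

def needle(names, hole):
    """Run a needle through `hole`; return (cards that fall, cards that hang)."""
    fallen = {n for n in names if hole in cards[n]}
    return fallen, set(names) - fallen

everyone = set(cards)
math_fell, math_hung = needle(everyone, 1)

both, _ = needle(math_fell, 2)           # intersection: needle the fallen cards again
hist_fell, _ = needle(math_hung, 2)      # needle the cards still hanging
at_least_one = math_fell | hist_fell     # union: the total of cards that fell
not_math = math_hung                     # complement: cards that never fell

print(sorted(both), sorted(at_least_one), sorted(not_math))
```

Each of the three results is obtained with a single needle, as in the three paragraphs above.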
As an example, let
    I = {1,2,3,4,5,6,7,8,9,10}
    A = {1,2,3,5,7,9}
    B = {1,3,5,6,8,9}
    C = {1,2,3,4,5,7}.
Then
    A ∪ B       = {1,2,3,5,6,7,8,9}
    A ∩ B       = {1,3,5,9}
    (A ∪ B) ∩ C = {1,2,3,5,7}
    (A ∪ B)′    = {4,10}
    A′          = {4,6,8,10}
    B′          = {2,4,7,10}
    A′ ∪ B′     = {2,4,6,7,8,10}
    A′ ∩ B′     = {4,10}.
In terms of circle diagrams we could have represented the above problem
as in Figure 2.13.
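The same example can be reproduced with a language that has sets built in. The following lines are merely a sketch using Python's set type; the helper name complement is our own, and the complement is always taken relative to the universe of discourse I.

```python
I = set(range(1, 11))
A = {1, 2, 3, 5, 7, 9}
B = {1, 3, 5, 6, 8, 9}
C = {1, 2, 3, 4, 5, 7}

def complement(S, universe=I):
    """Elements of the universe of discourse that do not belong to S."""
    return universe - S

print(sorted(A | B))                           # union
print(sorted(A & B))                           # intersection
print(sorted((A | B) & C))
print(sorted(complement(A | B)))
print(sorted(complement(A) | complement(B)))
print(sorted(complement(A) & complement(B)))
```

Here `|` plays the role of ∪ and `&` the role of ∩.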
Before concluding this example, it is worth observing that certain things
that we might expect to be true are not. For example, while one might
suspect that (A ∪ B)′ = A′ ∪ B′, the above example shows this to be false.
On the other hand, (A ∪ B)′ = A′ ∩ B′ seems to be true. We shall be
interested in those things that are true for all sets, not just for special
cases. Thus, we shall be interested in such statements as: For all sets A
and B, (A ∪ B)′ = A′ ∩ B′. That is, we are interested in finding
universally true recipes about sets. For example, in the case of ordinary
arithmetic it turns out that 1 × (2 + 3) = (1 × 2) + 3, yet this is not
true in general. That is, we can find numbers a, b, and c for which
a × (b + c) is not a synonym for (a × b) + c.
A NOTE ON THE REPRESENTATION OF SETS
If we hatch and cross-hatch the circle diagrams we must exercise some
degree of caution in reading the figures. That is, we must be sure not to
let the lines run over, nor must we allow large gaps to exist between the
lines. Thus, it would be helpful if there were a neater way of comparing
circle diagrams than by looking at hatched regions.
In addition, the concept of a circle diagram becomes very difficult to use
if we are dealing with more than three sets.
The purpose of this note is to introduce other ways of representing sets,
the first method being the chart method. Aside from the fact that it
allows us to treat collections of more than three sets much more easily
than we can by circle diagrams, this method is also a good forerunner to the
introduction of truth-table logic that we shall pursue further in our
investigation of Boolean algebra.
The second method of representation consists of a neater way of utilizing
circle diagrams. This will allow us to reap all of the advantages of circle
diagrams without becoming involved in the problems inherent in hatching
and cross-hatching.
2.6.1 THE CHART METHOD FOR VIEWING SETS
If x is any element in I and A is any subset of I, there are two mutually
exclusive possibilities which cover all cases:
    x ∈ A or x ∉ A.
(This is merely a restatement of a well-defined test for membership.)
If B denotes any other subset of I, the above two possibilities exist whether
x ∈ B or x ∉ B. This, in turn, leads to the four mutually exclusive
possibilities.
(1) x ∈ A and x ∈ B; that is, x ∈ A ∩ B.
(2) x ∈ A but x ∉ B; that is, x ∈ A ∩ B′.
(3) x ∉ A but x ∈ B; that is, x ∈ A′ ∩ B.
(4) x ∉ A and x ∉ B; that is, x ∈ A′ ∩ B′.
To employ the chart method we introduce the symbols 1 and 0 to indicate
that something is or is not an element of a given set, respectively, as follows.
    A
    0
    1
This says that if we have a subset A of I then there are two possibilities:
Either an element of I does not belong to A (0) or it does (1).
    A   B
    1   1   The element belongs to both A and B.
    1   0   The element belongs to A but not to B.
    0   1   The element belongs to B but not to A.
    0   0   The element belongs to neither A nor B.
These are the four mutually exclusive possibilities.

By the chart method we define union, intersection, and complement in
terms of a description of what happens in each of the mutually exclusive
cases. For example, if we wish to use the chart method to define A ∪ B
we would write the following.
    A   B   A ∪ B
    1   1     1
    1   0     1
    0   1     1
    0   0     0
This merely says that an element of I belongs to A ∪ B unless it belongs
to neither A nor B.
We would denote A ∩ B by the following.
    A   B   A ∩ B
    1   1     1
    1   0     0
    0   1     0
    0   0     0
This merely says that unless an element of I belongs to both A and B it
does not belong to A ∩ B.
We indicate the concept of complement by the following.
    A   A′
    1   0
    0   1
This says that if an element belongs to A it does not belong to A′, and if it
does not belong to A it does belong to A′.
I and ∅ have interesting interpretations in terms of the chart method.
Namely, I is characterized by the fact that all elements under consideration
belong to it.
    A   B   I
    1   1   1
    1   0   1
    0   1   1
    0   0   1
This says that no matter which case holds (that is, whether the element
belongs to A and B, or to neither, or to one but not the other), by definition
it belongs to I.
On the other hand, the empty set is characterized by the fact that no
matter where the chosen element is, it is never a member of ∅ by the very
definition of the empty set.
    A   B   ∅
    1   1   0
    1   0   0
    0   1   0
    0   0   0
That is, no matter which of the four mutually exclusive cases prevails, the
element can in no event be a member of ∅, which, of course, is the definition
of ∅.
In using the chart method for testing the equality of two sets, two sets
are equal if and only if their charts are identical, case for case. Translated
into more familiar language, this says nothing more than that an element
is a member of one set if and only if it is a member of the other; and this
was our definition of equality of sets.
For example, suppose we wish to investigate the two sets A ∪ B′ and
(A ∪ B)′. We could proceed as follows.
    A   B   B′   A ∪ B′   A ∪ B   (A ∪ B)′
    1   1   0      1        1        0
    1   0   1      1        1        0
    0   1   0      0        1        0
    0   0   1      1        0        1
                  (1)               (2)
The columns labeled (1) and (2) are not identical, case for case. Hence,
we conclude that A ∪ B′ and (A ∪ B)′ are not synonyms. Among other
things, this shows us that in the arithmetic of sets the parentheses are
essential; that is, in deleting them we may change the set.
The chart actually tells us much more than this, however. For example,
in the last illustration we see that (1) and (2) correspond except in the
first two lines. That is, if the first two rows of the chart could be deleted
then A ∪ B′ would equal (A ∪ B)′. Of course, we cannot randomly
cross out entries, but if it should happen that A = ∅ then the first two
cases could not occur (why?). In other words, the above chart allows us
to draw the following conclusion:
    A ∪ B′ = (A ∪ B)′ if and only if A is the empty set.
As a second illustration let us compare (A ∪ B)′ and A′ ∩ B′.
    A   B   A ∪ B   (A ∪ B)′   A′   B′   A′ ∩ B′   A′ ∪ B′
    1   1     1        0       0    0       0         0
    1   0     1        0       0    1       0         1
    0   1     1        0       1    0       0         1
    0   0     0        1       1    1       1         1
                      (1)                  (2)       (3)
Since (1) and (2) are identical, case for case, we conclude that
    (A ∪ B)′ = A′ ∩ B′.
However, (1) and (3) are different, and this also was accounted for in the
example of the previous section. The places of disagreement are the
second and third rows. These correspond to those elements that belong
to one but not the other of the two sets. In other words, if it were
impossible for an element to belong to one but not the other, then (1) and (3)
would have been the same. However, to delete the second and third rows
is equivalent to saying A = B. (Why? Hint: A = B means that the cases
    A   B
    1   0
    0   1
cannot occur.) Thus, we have demonstrated the impossibility of (A ∪ B)′
being equal to A′ ∪ B′ except in the trivial case that A = B.
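Since a chart is nothing more than a listing of four cases, it can be generated mechanically. In the sketch below (our own illustration; the function name chart is hypothetical) a set expression is written as a rule on the membership values 1 and 0, with `or`, `and`, and `not` standing for union, intersection, and complement.

```python
from itertools import product

def chart(rule, n_sets=2):
    """Column of 0's and 1's for `rule`, rows in the order 11, 10, 01, 00."""
    return [int(rule(*row)) for row in product([1, 0], repeat=n_sets)]

col_1 = chart(lambda a, b: not (a or b))           # (A ∪ B)'
col_2 = chart(lambda a, b: (not a) and (not b))    # A' ∩ B'
col_3 = chart(lambda a, b: (not a) or (not b))     # A' ∪ B'

print(col_1, col_2, col_3)

# Rows (numbered from 1) in which columns (1) and (3) disagree.
disagree = [i for i, (p, q) in enumerate(zip(col_1, col_3), start=1) if p != q]
print(disagree)
```

Two expressions are synonyms exactly when their columns are equal as lists, which is the case-for-case comparison described above.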
As the third example let us demonstrate that
    A = (A ∩ B) ∪ (A ∩ B′).
    A   B   B′   A ∩ B   A ∩ B′   (A ∩ B) ∪ (A ∩ B′)
    1   1   0      1       0              1
    1   0   1      0       1              1
    0   1   0      0       0              0
    0   0   1      0       0              0
   (1)                                   (2)
Columns (1) and (2) verify the assertion.
It should be observed here that just as certain algebra problems can be
solved by common sense rather than by technical recipes, it is also true
that certain equalities between sets can be sensed without resorting to the
charts. For example,
    (A ∩ B) ∪ (A ∩ B′)
merely says the set of elements that belong either to both A and B or else
to A but not to B. If we sense the logical structure of the grammar
involved, it is easy to see that this defines all of A. We can see this more
intuitively in terms of the circle diagram in Figure 2.14.
[Figure 2.14]
Just as in numerical arithmetic, the recipes give us an objective way to
solve certain problems should intuition fail us.
It should also be pointed out that the method of charts can be extended
to cases where there are three, four, five, or more subsets under consideration.
For example, we might want to compare the following sets.
    A ∩ (B ∪ C)
    (A ∩ B) ∪ (A ∩ C)
    (A ∩ B) ∪ C.
In the case of three subsets there are eight possibilities characterized by
the following list.
    A   B   C
    1   1   1
    1   1   0
    1   0   1
    1   0   0
    0   1   1
    0   1   0
    0   0   1
    0   0   0
The important point is that since the union, the intersection, or the
complement of a set is again a set, we can always proceed by never having
to combine more than two sets at a time.
    A  B  C   A ∩ B   A ∩ C   B ∪ C   A ∩ (B ∪ C)   (A ∩ B) ∪ (A ∩ C)   (A ∩ B) ∪ C
    1  1  1     1       1       1          1                 1                 1
    1  1  0     1       0       1          1                 1                 1
    1  0  1     0       1       1          1                 1                 1
    1  0  0     0       0       0          0                 0                 0
    0  1  1     0       0       1          0                 0                 1
    0  1  0     0       0       1          0                 0                 0
    0  0  1     0       0       1          0                 0                 1
    0  0  0     0       0       0          0                 0                 0
                                          (1)               (2)               (3)
The chart shows that (1) and (2) denote synonyms, but that (3) is not
the same as (1) or (2). The differences occur in the fifth and seventh
rows. This in turn means that only if an element belongs to B and C but
not to A, or to C but neither to A nor B will the columns be different.
For example, let A = {1,2,3,5,6}; B = {3,4,5}; and C = {5,6}. In this
case, B ∪ C = {3,4,5,6}; hence, A ∩ (B ∪ C) = {3,5,6}; while A ∩ B =
{3,5}. Therefore, (A ∩ B) ∪ C = {3,5,6}. But this is true only
because no element of C belongs to the complement of A.
On the other hand, for the sets A, B, and C in our example of the previous
section,
    A ∩ (B ∪ C) = {1,2,3,5,7,9},
while (A ∩ B) ∪ C = {1,2,3,4,5,7,9}. (The element 4 belongs to C but to
neither A nor B.)
In summary, the chart method shows us that whenever we are given
three sets A, B, and C, we may conclude that in any event A ∩ (B ∪ C) =
(A ∩ B) ∪ (A ∩ C); but it need not be true that A ∩ (B ∪ C) =
(A ∩ B) ∪ C.
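The eight-row chart can be produced in the same mechanical way. The sketch below is ours, not the authors'; it builds the three labeled columns, locates the rows in which they differ, and then repeats the first numerical example with sets.

```python
from itertools import product

rows = list(product([1, 0], repeat=3))          # 111, 110, 101, 100, 011, ...

col_1 = [int(a and (b or c)) for a, b, c in rows]            # A ∩ (B ∪ C)
col_2 = [int((a and b) or (a and c)) for a, b, c in rows]    # (A ∩ B) ∪ (A ∩ C)
col_3 = [int((a and b) or c) for a, b, c in rows]            # (A ∩ B) ∪ C

differ = [i for i, (p, q) in enumerate(zip(col_1, col_3), start=1) if p != q]
print(col_1 == col_2, differ)

# The first numerical example: no element of C lies outside A.
A, B, C = {1, 2, 3, 5, 6}, {3, 4, 5}, {5, 6}
print(A & (B | C), (A & B) | C, C - A)
```

For these particular sets C - A is empty, so rows five and seven of the chart describe no elements and the two expressions happen to agree.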
The success of this method does not depend on the concept of sets as
much as it does on the concept that there are only two possibilities and that
they are mutually exclusive. For example, we could use the chart method
to prove certain theorems about even and odd integers (since an integer
is either even or odd, one or the other, but not both). Consider the
following problem.
Choose two integers at random. Form their sum and their product, and then
multiply the sum by the product. For example, if we choose 7 and 12 the sum is
19 and the product is 84. We then multiply 19 by 84 and obtain 1596, which is
even. Our claim is that the answer will be even no matter which two integers we
choose!
The solution of this problem can be obtained from charts by recalling
that (1) any integer is either even or odd but not both; (2) odd × odd =
odd, but odd × even = even × odd = even × even = even; and (3)
odd + odd = even + even = even, while odd + even = even + odd =
odd. Then letting e stand for even, o for odd, and a and b for any two
integers, we see the following.
    a   b   (a + b)   ab   (a + b)(ab)
    e   e      e      e         e
    e   o      o      e         e
    o   e      o      e         e
    o   o      e      o         e
                               (1)
A glance at column (1) now shows us that in each of the four possible
cases the result is even, and our assertion is verified.
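The parity chart can be rebuilt by letting 0 represent an arbitrary even integer and 1 an arbitrary odd integer, which is legitimate because the parity of a sum or product depends only on the parities of the terms. The sketch below is our own; it also spot-checks the claim over a small range of integers.

```python
from itertools import product

def parity(n):
    """'e' for an even integer, 'o' for an odd one."""
    return "e" if n % 2 == 0 else "o"

# One representative for each of the four cases: 0 is even, 1 is odd.
parity_chart = {(parity(a), parity(b)): parity((a + b) * (a * b))
                for a, b in product([0, 1], repeat=2)}
print(parity_chart)

# The worked example from the text, and a brute-force spot check.
print((7 + 12) * (7 * 12))
always_even = all(((a + b) * (a * b)) % 2 == 0
                  for a in range(-20, 21) for b in range(-20, 21))
print(always_even)
```

The spot check is not a proof; the four-row chart is what covers all integers.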
The chart method can also help solve the problem of paraphrasing. It
is clear, for example, that we can combine two sets in endlessly many ways
using repeated intersections, unions, complements, and so on. However,
the chart of the newly formed set must belong to one of 16 basic types.
This follows from the fact that no matter what column we get, the first row
must have either 0 or 1 as an entry; so must the second, third, and fourth.
That is, we double the number of possibilities each time. The 16
possibilities are given below.
    A  B    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
    1  1    1  1  1  1  1  1  1  1  0  0  0  0  0  0  0  0
    1  0    1  1  1  1  0  0  0  0  1  1  1  1  0  0  0  0
    0  1    1  1  0  0  1  1  0  0  1  1  0  0  1  1  0  0
    0  0    1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0
Any other column consisting of 0’s and 1’s must look exactly like one
of these 16. We can now give names to each of these 16 columns.
    Column 1: I                          Column 16: I′
    Column 2: A ∪ B                      Column 15: (A ∪ B)′
    Column 3: A ∪ B′                     Column 14: (A ∪ B′)′
    Column 4: A                          Column 13: A′
    Column 5: A′ ∪ B                     Column 12: (A′ ∪ B)′
    Column 6: B                          Column 11: B′
    Column 7: (A ∪ B′) ∩ (B ∪ A′)        Column 10: [(A ∪ B′) ∩ (B ∪ A′)]′
    Column 8: A ∩ B                      Column 9: (A ∩ B)′
The columns are arranged to highlight the fact that 1 and 16, 2 and 15,
3 and 14, and so on, represent pairs of complementary sets. Remember
that the 16 names we have given take care of all possibilities. That is,
any other combination of A and B using unions, intersections, and
complements must be a synonym for one of these 16. For example, consider
the set (A ∪ B) ∩ (A ∩ B)′.
    A   B   A ∪ B   A ∩ B   (A ∩ B)′   (A ∪ B) ∩ (A ∩ B)′
    1   1     1       1        0               0
    1   0     1       0        1               1
    0   1     1       0        1               1
    0   0     0       0        1               0
This column coincides with column 10, and we thus see that
    [(A ∪ B′) ∩ (B ∪ A′)]′ = (A ∪ B) ∩ (A ∩ B)′.
This shows that [(A ∪ B′) ∩ (B ∪ A′)]′ and (A ∪ B) ∩ (A ∩ B)′ are
two different ways of expressing the same set. Notice that the latter
description talks about the set of elements that belong to either A or B
but not both, and this is a rather easy set to visualize. On the other hand,
the former description, while equivalent to the latter, gives the impression
in general of being far more complicated, as it describes the set of those
objects for which it is false that they belong to either A or not to B, and
to either B or not to A. Here, by use of the chart method we can make
an effective paraphrase.
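Paraphrasing by chart amounts to looking a column up in the list of 16. The sketch below is our own illustration; it assumes the columns are numbered as in the table above, which happens to be the order in which `itertools.product([1, 0], repeat=4)` generates them.

```python
from itertools import product

CASES = list(product([1, 0], repeat=2))           # rows 11, 10, 01, 00

def column(rule):
    """Chart column of a two-set expression, as a tuple of 0's and 1's."""
    return tuple(int(bool(rule(a, b))) for a, b in CASES)

# All 16 possible columns; position k (counting from 1) is column k of the text.
all_columns = list(product([1, 0], repeat=4))

former = column(lambda a, b: not ((a or not b) and (b or not a)))   # [(A ∪ B') ∩ (B ∪ A')]'
latter = column(lambda a, b: (a or b) and not (a and b))            # (A ∪ B) ∩ (A ∩ B)'

print(former == latter)
print(all_columns.index(latter) + 1)     # which of the 16 columns this is
```

Any other expression in A and B can be looked up the same way, so the simplest of the known names for its column may be chosen as the paraphrase.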
This idea of naming all possible columns can be extended to the case in
which we have more than two subsets; the problem being, however, that
things soon become quite messy. For example, with three sets the chart
has eight rows and as a result, we have 2^8 rather than 2^4 possible different
column values. Similarly, with four sets we have 16 rows; hence, 2^16
different column labels. In other words, while we have 16 columns to
name when we deal with two sets, we have 256 columns to name when we
deal with three sets, and 65,536 columns to name when we deal with four
sets. (That is, 2^4 = 16; 2^8 = 256; and 2^16 = 65,536.)
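The counting argument can be stated in two lines of arithmetic: n sets give 2^n rows, and each row may independently hold 0 or 1. A brief sketch (the function name is ours):

```python
def count_columns(n_sets):
    """Number of distinct chart columns for n_sets subsets."""
    n_rows = 2 ** n_sets        # one row per mutually exclusive case
    return 2 ** n_rows          # each row independently holds 0 or 1

for n in (2, 3, 4):
    print(n, count_columns(n))
```

The doubling at each row is what makes the list of names grow so quickly.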
2.6.2 CIRCLE DIAGRAMS
As we have already seen, in addition to using charts we can hatch and
cross-hatch circle diagrams. However, since it is often difficult to keep
track of what is hatched and what is not, we can resort, for example, to
using different colored pencils. In this section, however, we shall number
the diagrams. For example, when dealing with one subset we write what
appears in Figure 2.15. Here 1 and 2 are not used as elements but as labels
for the mutually exclusive sets A and A′ that make up I. That is, I =
1 ∪ 2, and 1 ∩ 2 = ∅. In still other words, 1 represents A while 2
represents A′.
Similarly, when we deal with two subsets the diagram will appear as
in Figure 2.16. In this case, the regions 1, 2, 3, and 4 make up I; and,
again, they are mutually exclusive. In particular,
    1 denotes A ∩ B.
    2 denotes A ∩ B′.
    3 denotes A′ ∩ B.
    4 denotes A′ ∩ B′.
Again, observe what we mean when we say that this is just a visual chart
method. In particular, to correlate the chart with the geometric regions
we have:
    A   B   Region   Set
    1   1     1      A ∩ B
    1   0     2      A ∩ B′
    0   1     3      A′ ∩ B
    0   0     4      A′ ∩ B′
In Figure 2.16 A = 1 ∪ 2 while B = 1 ∪ 3. Since the regions 1, 2, 3,
and 4 are mutually exclusive we shall take the liberty of writing
    A = {1,2} and B = {1,3}
merely for the sake of brevity. Again, we do not mean that A is the set
consisting of the two elements 1 and 2. In fact, if we insist on this
notational interpretation it is better if we say that A consists of the two
mutually exclusive subsets 1 and 2. In terms of circle diagrams, suppose we
wish to establish that
    (A ∪ B)′ = A′ ∩ B′.
We write
    A = {1,2}
    B = {1,3}.
Hence, A ∪ B = {1,2,3}; therefore, (A ∪ B)′ = {4}, or merely 4. On
the other hand,
    A′ = {3,4}
    B′ = {2,4}.
Hence, A′ ∩ B′ = {4}, or merely 4. This is shown in Figure 2.17.
[Figure 2.17: We could have obtained the same result by use of hatching, but notice how much more objective our new way is.]
Using the interpretation established earlier (that is, if two sets occupy the
same region they are equal) we see that (A ∪ B)′ = A′ ∩ B′ in the sense
that each is a synonym for region 4.
As a second example, suppose we wished to show that if A ∩ B = A
then A is a subset of B. In this event we have as before A = {1,2} and
B = {1,3}. Thus, A ∩ B = 1. On the other hand, since A = {1,2}
the condition A ∩ B = A implies that {1,2} = 1. Since 1 and 2 share no
elements in common, this is possible if and only if 2 = ∅. But if 2 = ∅,
A = {1,∅}, or more simply, A = 1. In this event A ⊂ B since 1 ⊂ {1,3}.
This is shown in Figure 2.18.
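The region bookkeeping can be imitated directly, provided we remember that the numbers are labels of regions and not elements. The sketch below is our own; the helper occupied is a hypothetical device for discarding regions assumed to be empty.

```python
I = {1, 2, 3, 4}      # labels of the four regions, not elements of the sets
A = {1, 2}
B = {1, 3}

def comp(S):
    """Regions of I lying outside S."""
    return I - S

left = comp(A | B)            # (A ∪ B)'
right = comp(A) & comp(B)     # A' ∩ B'
print(left, right)

def occupied(S, empty_regions):
    """Drop the regions that are assumed to contain no elements."""
    return S - empty_regions

# If A ∩ B = A, region 2 must be empty; then A shrinks to region 1.
A_occ = occupied(A, {2})
B_occ = occupied(B, {2})
print(A_occ == A & B, A_occ <= B_occ)
```

The comparison `<=` is Python's test for "is a subset of".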
We can continue the circle diagram idea to include the case of three
subsets, as in Figure 2.19.
Here the regions 1, 2, 3, 4, 5, 6, 7, and 8 can be related to sets and charts
as follows. (Our numbering refers to the above figure, but observe that no
particular rule governs how the regions are to be numbered.)
A   B   C   Region   Set denoted
1   1   1   1        A ∩ B ∩ C
1   1   0   2        A ∩ B ∩ C′
1   0   1   3        A ∩ B′ ∩ C
1   0   0   4        A ∩ B′ ∩ C′
0   1   1   5        A′ ∩ B ∩ C
0   1   0   6        A′ ∩ B ∩ C′
0   0   1   7        A′ ∩ B′ ∩ C
0   0   0   8        A′ ∩ B′ ∩ C′
For example, if we wished to indicate the region corresponding to
A ∩ (B ∪ C), we would write
A = {1,2,3,4}
B = {1,2,5,6}
C = {1,3,5,7},
as pictured in Figure 2.20. Then B ∪ C = {1,2,3,5,6,7} and A ∩ (B ∪ C)
= {1,2,3}. Similarly, A ∩ B = {1,2}; hence, (A ∩ B) ∪ C = {1,2,3,5,7}.
This is illustrated in Figure 2.21.
We see from the diagrams that A ∩ (B ∪ C) and (A ∩ B) ∪ C are
synonyms if and only if regions 5 and 7 are both empty. This condition
exists in Figure 2.22. With this as a guide, we could make up example
after example wherein these two sets would be equal even though the
result does not hold for sets in general. One such example is Figure 2.23.
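The region bookkeeping above can be replayed with Python's built-in sets. This is only a sketch: the region labels are the ones from the three-set chart, and the variable names are chosen here for illustration.

```python
# Region labels follow the three-set chart above.
A = {1, 2, 3, 4}
B = {1, 2, 5, 6}
C = {1, 3, 5, 7}

left = A & (B | C)     # A ∩ (B ∪ C)
right = (A & B) | C    # (A ∩ B) ∪ C

# Regions that belong to one expression but not to the other.
difference = left ^ right
```

The two expressions name the same region exactly when every region listed in `difference` is empty.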
Figure 2.21
Circle diagrams are more visual than the previously discussed chart
method, but these two methods are equivalent in that they say the same
thing. The only difference is that what one says by an appropriate listing
of a row the other says by a picture.
The geometric idea of a circle diagram is probably easier to visualize
than the abstract listing of the chart method. Observe that while the
chart method may be cumbersome, it can be used, awkwardly or not, for
three, four, five, or more subsets. On the other hand, the circle diagrams
get completely out of hand when we deal with more than three subsets.
Try, for example, to illustrate by a circle diagram the general case of four
subsets. If you do this correctly, there will be 16 regions in your diagram
and no two regions will represent the same set!
Notice that even though the chart method is easier to handle than
circle diagrams if we are dealing with more than three sets, even the chart
method gets out of hand when we have to deal with a fairly large number
of sets. For example, with 20 sets the chart method would have over a
million (2^20) rows. We shall ultimately want to develop certain rules and
identities that will allow us to work with sets abstractly, unimpeded by
either circle diagrams or charts. Thus, circle diagrams and charts are
most useful in helping us learn how sets are combined to form new sets.
Figure 2.23
A = {1, 2, 4, 5, 6, 7, 8, 9}
B = {1, 2, 3, 4, 6, 10, 11, 12}
C = {1, 2, 5}
A ∩ (B ∪ C) = {1, 2, 4, 5, 6}
(A ∩ B) ∪ C = {1, 2, 4, 5, 6}
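The chart method lends itself to mechanical checking. The sketch below is one possible rendering, not the authors' procedure: each row of the chart is a tuple of 1s and 0s, and two set expressions are treated as synonyms when they agree on every row. The helper names `chart_rows` and `same_region` are invented for this illustration.

```python
from itertools import product

def chart_rows(n):
    """Return every row of the chart for n sets: 2**n tuples of 1s and 0s."""
    return list(product((1, 0), repeat=n))

def same_region(expr1, expr2, n):
    """True when two membership rules agree on every row of the chart."""
    return all(bool(expr1(*row)) == bool(expr2(*row)) for row in chart_rows(n))

# (A ∪ B)′ versus A′ ∩ B′, written as rules on the membership flags a and b.
lhs = lambda a, b: not (a or b)
rhs = lambda a, b: (not a) and (not b)
de_morgan_holds = same_region(lhs, rhs, 2)

rows_for_20_sets = 2 ** 20  # 1,048,576 rows: far too many to list by hand
```

A machine can run through a chart of this size, but the point of the text stands: for a person, rules and identities are the practical tool.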
Exercises
1. Let
Then list the element

(a) A ∪ B
(b) A U B
(c) (A ∪ B)′
(d) A′ ∪ B′
(e) A′ ∩ B′
(f) A ∪ (B ∩ C)
2. Use the chart method to show that (A ∪ B ∪ C)′ = A′ ∩ B′ ∩ C′.
3. Use circle diagrams to show that A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).
2.7 THE CONTRAST BETWEEN ADDITION AND UNION

In many ways, especially since union seems to incorporate the idea of
combining sets, there is a tendency to associate unions with addition. That
is, suppose that we have two sets A and B and that we know the number
of members in each set. Say that A has m members and B has n members.
Then it is not necessarily true that A ∪ B has m + n members. As a
particular example let us consider the case wherein A has 20 members and
B has 30 members. We then wish to ascertain the number of members
in A ∪ B. Let us first introduce the following notation: if X denotes
any set, then N(X) denotes the number of elements that belong to X. In
other words, for the problem we are now describing we wish to compute
N(A ∪ B), knowing that N(A) = 20 and N(B) = 30.
Since often the first impulse is to think of union in terms of addition,
we may decide that N(A ∪ B) = 50 since 20 + 30 = 50. We do not
deny that 20 + 30 = 50. However, it is not clear that we want to add
20 and 30 in this problem. Why?
Let us refer to the enrollment card idea of the previous section. Suppose
the registrar finds that 20 students have enrolled in Math 209 and that 30
have enrolled in Biology 232, and he records the names of all these students.
It should not be difficult to see that the resulting list will contain 50 different
names if and only if no student takes both Math 209 and Biology 232.
However, let us assume that 7 students take both of the courses. Then
the list would contain only 43 different names. To understand this, suppose
that the registrar compiles the list by placing a needle through the
hole labeled Math 209. Then 20 cards fall from the pile. If these cards
remain outside the rest of the pack, there are no longer 30 cards with the
Biology 232 hole punched left in the pack, for 7 of these fell with the
Math 209 cards. In other words, only 23 of the Biology 232 cards remain
in the pack. Thus, when the needle is placed in the Biology 232 hole only
23 additional cards fall off, accounting for the figure 43.
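The registrar's two passes with the needle can be imitated with sets. The student ID numbers below are hypothetical, chosen only so that exactly 7 students appear in both courses.

```python
# Hypothetical student IDs: 0..19 take Math 209 and 13..42 take Biology 232,
# so IDs 13..19 (7 students) take both courses.
math_209 = set(range(0, 20))
biology_232 = set(range(13, 43))

first_pass = math_209                   # cards that fall on the first needle
second_pass = biology_232 - math_209    # Biology cards still left in the pack
names = first_pass | second_pass        # the registrar's final list
```

Here `second_pass` holds the 23 cards that fall the second time, and `names` holds the 43 different names.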
This same result can be visualized rather well in terms of circle diagrams.
Namely, if we assume that N(A) = 20 and N(B) = 30 we see that if
N(A ∩ B) = 7, then we get a picture like Figure 2.24.
Figure 2.24 shows us that 13 elements belong to A but not to B, 7 belong
to both, and 23 belong to B but not to A. In all there are 43 elements
despite the fact that separately A has 20 members and B has 30. The
controlling factor is A ∩ B, for if an element belongs to both A and B it
contributes 1 to N(A) by virtue of belonging to A, and 1 to N(B) by virtue
of belonging to B even though it is just one element. In summary, each
element in A ∩ B is counted twice when we form N(A) + N(B), but it is
only one element of A ∩ B. For example, in the problem we just
considered notice that the difference between the correct answer (43) and
what may have been a first-impulse answer (50) is exactly the number of
elements in the intersection of the two sets. That is, 50 − 43 = 7. It
thus appears that the recipe should have been
N(A ∪ B) = N(A) + N(B) − N(A ∩ B).
Applying this to the problem we have,
N(A ∪ B) = 20 + 30 − 7 = 43.
This agrees with the proper result. To carry the example one step
further, the given information that N(A) = 20 and N(B) = 30 does very
little to determine N(A ∪ B). We do know that N(A ∩ B) is between 0
and 20, but nothing else. These two extreme cases correspond to the
events that (1) A and B share no elements in common, and (2) A is a subset
of B. This is demonstrated pictorially in Figure 2.25. In terms of the
recipe these cases lead to
(a) N(A ∪ B) = 20 + 30 − 0 = 50
and
(b) N(A ∪ B) = 20 + 30 − 20 = 30.
In other words, with regard to the present problem, unless N(A ∩ B) is
given we can only conclude that N(A ∪ B) is at least as great as 30 but
in no event in excess of 50. Moreover, we can obtain any whole number
from 30 to 50 as the answer merely by appropriately choosing the value for
N(A ∩ B). In general, for this problem we need only let N(A ∪ B) =
50 − N(A ∩ B). For example, if we wish that N(A ∪ B) = 37 we let
N(A ∩ B) = 13. See Figure 2.26.
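The range of possible answers can be tabulated directly from the recipe. In this sketch the function name `union_size` is an assumption of the illustration; the check on the intersection size reflects the fact that A ∩ B can be no larger than the smaller of the two sets.

```python
def union_size(n_a, n_b, n_both):
    """N(A ∪ B) = N(A) + N(B) − N(A ∩ B) for finite sets."""
    if not 0 <= n_both <= min(n_a, n_b):
        raise ValueError("N(A ∩ B) must lie between 0 and min(N(A), N(B))")
    return n_a + n_b - n_both

# Every admissible value of N(A ∩ B) when N(A) = 20 and N(B) = 30.
possible = {k: union_size(20, 30, k) for k in range(0, 21)}
```

The dictionary `possible` pairs each choice of N(A ∩ B) from 0 to 20 with the resulting N(A ∪ B), which runs from 50 down to 30.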
Figure 2.26
In summary, the trouble with writing N(A ∪ B) = N(A) + N(B) is that
A and B need not be mutually exclusive. That is, it may happen that
A ∩ B ≠ ∅. In Figure 2.26 we had no trouble when we added 7, 13, and
17; but this was because the regions in question were mutually exclusive.
In other words, if X, Y, and Z are mutually exclusive in pairs (that is,
X ∩ Y = Y ∩ Z = X ∩ Z = ∅), then N(X ∪ Y ∪ Z) = N(X) + N(Y)
+ N(Z). We will discuss this later. With regard to Figure 2.26, think of
X = A ∩ B′, Y = A ∩ B, and Z = A′ ∩ B. Then N(A ∪ B) =
N(X ∪ Y ∪ Z) = N(X) + N(Y) + N(Z) since in this case X, Y, and Z
are mutually exclusive in pairs.
So far we have not required that the sets under consideration be finite.
To avoid ambiguity and/or misinterpretation, we shall now impose
the restriction for the remainder of this section that all sets under
consideration be finite. To see why this is necessary let us consider the
following situation. Look at the expression 5 − 3. We viewed 5 − 3 as
the number that must be added to 3 to yield 5. From a physical interpretation
point of view, we might have viewed 5 − 3 as the process of deleting
three tally marks from a collection of five. More generally, in terms of
sets suppose that B ⊂ A and that N(B) = 3, while N(A) = 5. Then
5 − 3 could be viewed as being the number of members in the set that
results when B is deleted from A. (Recall that this set is called A − B,
while A − B is merely another name for A ∩ B′.) In general, then, if
B ⊂ A we can view N(A) − N(B) as being N(A − B). The problem
occurs if B is a subset of A, and A is an infinite set, for in this case it is
not so easy to describe the number of members in A − B. For example,
let A denote the set of whole numbers. Then A is certainly an infinite
set. Now let B denote the set of even whole numbers. Then it is clear
that B is an infinite set that is also a subset of A. In this case A − B
would be the set of odd whole numbers, which is an infinite set; hence,
N(A − B) would be infinite. On the other hand, suppose B were the
set of all whole numbers greater than 10. That is,
B = {11,12,13,14,15,16,17,18, . . .}.
In this case B would also be an infinite subset of A. If we now delete B
from A to form A − B we find that
A − B = {1,2,3,4,5,6,7,8,9,10}
or
N(A − B) = 10.
In summary, if A and B are infinite sets and no further specifications are
made, then we cannot, without the risk of misinterpretation, give a
well-defined meaning to N(A) − N(B). In other words, the formula
N(A ∪ B) = N(A) + N(B) − N(A ∩ B)
becomes troublesome if A and B are infinite sets since (1) we cannot be
sure if A ∩ B is finite or infinite and (2) if A ∩ B is infinite we must, in
effect, subtract infinity from infinity, and as mentioned above, this is not
well-defined. For these reasons we shall only consider finite sets in this
section.
Now let us return to
N(A ∪ B) = N(A) + N(B) − N(A ∩ B).
We have already given two interpretations of this result: one in terms of
enrollment cards and the other in terms of circle diagrams. Simply for
more experience let us now use the chart method to illustrate this result.
                Number of times counted in
A   B   A ∪ B   arriving at N(A) + N(B)
1   1   1       2 (since these belong to both A and B)
1   0   1       1 (since these are in A but not in B)
0   1   1       1 (since these are in B but not in A)
0   0   0       0 (since these are in neither A nor B)
We thus see that N(A) + N(B) is wrong as a count of N(A ∪ B) only in that the
elements of A ∩ B were counted twice rather than once.
Whatever method we elect to use to visualize what is happening, the
important thing is to note that
N(A ∪ B) = N(A) + N(B)
is correct if and only if A ∩ B = ∅.
Later we shall spend some time discussing the concept of probability
theory. We shall see that this study entails clever ways of counting and
utilizes the results of this section. By way of preview consider the following
situation. We have a standard deck (52) of playing cards. We define
the face cards to be ace, king, queen, and jack. We are to reach into the
deck and randomly extract a card. What is the likelihood (probability)
that we choose either a face card or a heart? There are 13 hearts and 16
face cards in the deck (13 + 16 = 29). Hence, since there are 52 cards
in the entire deck and since 52 − 29 = 23, it appears that the likelihood
of drawing either a heart or a face card is 29 chances in favor to 23 chances
against. However, there are some errors in this chain of reasoning.
Certainly, we do not deny that 13 + 16 = 29 or that 52 − 29 = 23. We
merely point out that we did not want to perform these operations. Why?
To begin with, there are four cards that are counted both as hearts and
as face cards; namely, the ace, the king, the queen, and the jack of hearts.
In terms of a circle diagram, letting H denote the set of hearts and F the
set of face cards, we have Figure 2.27. More abstractly, if we let H and F
be as in the diagram, then the answer to the problem is represented by
N(H ∪ F). We see that N(H) = 13, N(F) = 16, and N(H ∩ F) = 4.
The formula now reads
N(H ∪ F) = 13 + 16 − 4 = 25.
In other words, an intuitive count might lead us to think that we have
29 chances out of 52 of accomplishing our objective, while a proper count
shows that the chances are only 25 out of 52. (Notice that this subtle
error is the difference between the odds really being 27 to 25 against us and
believing that they are 29 to 23 in our favor.) As we said, we shall discuss
this idea more later; for now we wish only to point out one more use of
the theory of sets in solving problems in other branches of mathematics.
Figure 2.27 (S = spades, D = diamonds, C = clubs)
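The card count can be confirmed by building the deck explicitly. This sketch follows the text's convention that the ace counts as a face card; the rank and suit labels are simply strings chosen for the illustration.

```python
from itertools import product

ranks = ["ace", "2", "3", "4", "5", "6", "7", "8", "9", "10",
         "jack", "queen", "king"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = set(product(ranks, suits))   # 52 (rank, suit) pairs

# The text's convention: ace, king, queen, and jack are the face cards.
face_ranks = {"ace", "king", "queen", "jack"}
H = {card for card in deck if card[1] == "hearts"}
F = {card for card in deck if card[0] in face_ranks}

favorable = len(H | F)                        # direct count of H ∪ F
by_formula = len(H) + len(F) - len(H & F)     # N(H) + N(F) − N(H ∩ F)
```

Both `favorable` and `by_formula` come to 25, in agreement with the formula in the text.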
It is next our endeavor to extend these results beyond the intersection
and the union of two sets. For example, suppose that we have three sets
A, B, and C; and we wish to compute
N(A ∪ B ∪ C).
We shall show that
N(A ∪ B ∪ C) = N(A) + N(B) + N(C) − N(A ∩ B)
 − N(A ∩ C) − N(B ∩ C) + N(A ∩ B ∩ C).
We shall show a few ways of visualizing this result.
(1) Suppose that A, B, and C were three lists containing names of people
and that we wished to amalgamate the three lists into one without
counting the same name twice. Then if we just stapled the lists
together we would see that:
(a) if a name appeared on one of the lists it was counted the correct
number of times,
(b) if the name appeared on two of the lists it was counted once too
many and hence should be subtracted once,
(c) if the name appeared on all three lists then it should be subtracted
twice.
Thus, for example, if an element belongs only to A it is counted only
once in the sum N(A) + N(B) + N(C). If it belongs to just A and B,
it is counted twice in the sum N(A) + N(B) + N(C) but it is
subtracted once in N(A ∩ B). Finally, if the element belongs to A and
B and C it is counted three times in N(A) + N(B) + N(C) and then
subtracted out three times in −N(A ∩ B) − N(A ∩ C) − N(B ∩ C).
Now, however, it is not counted at all, but N(A ∩ B ∩ C) counts it
once.
(2) In terms of circle diagrams (Figure 2.28) let us indicate by 1, 2, or 3
the number of times an element is counted in arriving at the sum
N(A) + N(B) + N(C). If now we subtract out those that appear
in the pairs of intersections, we have Figure 2.29. If we then add in
those that belong to all three sets, we have Figure 2.30. This is
precisely what we desire; namely, each element is counted once no
matter where it appears.
Figure 2.29
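The three-set formula can also be checked by machine. The sketch below is written for any number of finite sets, alternately adding and subtracting the sizes of intersections taken one at a time, two at a time, and so on. The function name `inclusion_exclusion` and the sample sets are assumptions made for the illustration.

```python
from itertools import combinations

def inclusion_exclusion(*sets):
    """Count the union by alternately adding and subtracting intersections."""
    total = 0
    for k in range(1, len(sets) + 1):
        sign = 1 if k % 2 == 1 else -1   # add odd-sized groups, subtract even
        for group in combinations(sets, k):
            total += sign * len(set.intersection(*group))
    return total

# Hypothetical sets, chosen only to exercise the formula.
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7}
C = {1, 5, 7, 8}
D = {2, 5, 8, 9, 10}
```

For example, `inclusion_exclusion(A, B, C)` agrees with `len(A | B | C)`, and the same holds when `D` is included.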
For more than three sets a rather interesting pattern prevails, which we
present without proof. For example,
N(A ∪ B ∪ C ∪ D) = N(A) + N(B) + N(C) + N(D)
 − [N(A ∩ B) + N(A ∩ C) + N(A ∩ D)
 + N(B ∩ C) + N(B ∩ D) + N(C ∩ D)]
 + [N(A ∩ B ∩ C) + N(A ∩ B ∩ D)
 + N(A ∩ C ∩ D) + N(B ∩ C ∩ D)]
 − N(A ∩ B ∩ C ∩ D).
The pattern is that we alternately add and subtract all possible combinations
of intersections ranging from taking the sets one at a time to all at
one time. We conclude this section with an example.
In a certain school, students are required in order to graduate to take at least one
of three languages: French, German, or Spanish. In a certain graduating class we
find that 20 students took all three languages, 35 took French and German, 40 took
French and Spanish, 50 took German and Spanish, 90 took French, 70 took German,
and 110 took Spanish. How many were in the graduating class?
In solving this problem we must be careful not to add the given numbers.
For if we do, we count certain students more than once. One solution
(letting F, G, and S denote the set of students taking French, German, and
Spanish, respectively) is to use our formula with N(F) = 90, N(G) = 70,
N(S) = 110, N(F ∩ G) = 35, N(F ∩ S) = 40, N(G ∩ S) = 50, and
N(F ∩ G ∩ S) = 20. The number in the graduating class is
N(F ∪ G ∪ S), and we have
N(F ∪ G ∪ S) = 90 + 70 + 110 − 35 − 40 − 50 + 20 = 165.
Thus, there were 165 members of the graduating class. We could have
used circle diagrams. In this event, since 20 students take all three, we
would have Figure 2.31. Then, since 35 take both French and German
(this number includes the 20 who take all three), we would draw Figure
2.32. Continuing in this way (the details are left to the reader), we get
Figure 2.33. Not only does this give us the same answer, but since the
regions in the diagram are mutually exclusive we have such other results
as: There were 5 members of the graduating class who took German but
neither French nor Spanish.
Figure 2.33
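The circle-diagram bookkeeping left to the reader can be sketched as arithmetic on the seven mutually exclusive regions. The variable names and region labels below are chosen for this illustration; the numbers are those given in the problem.

```python
n_f, n_g, n_s = 90, 70, 110        # N(F), N(G), N(S)
n_fg, n_fs, n_gs = 35, 40, 50      # pairwise intersections
n_fgs = 20                         # N(F ∩ G ∩ S)

# Work outward from the middle of the diagram, as in Figures 2.31 to 2.33.
regions = {
    "F, G, and S": n_fgs,
    "F and G only": n_fg - n_fgs,
    "F and S only": n_fs - n_fgs,
    "G and S only": n_gs - n_fgs,
    "F only": n_f - n_fg - n_fs + n_fgs,
    "G only": n_g - n_fg - n_gs + n_fgs,
    "S only": n_s - n_fs - n_gs + n_fgs,
}
class_size = sum(regions.values())
```

Because the regions are mutually exclusive, their counts may simply be added, and the entry for "G only" reproduces the 5 students mentioned in the text.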
Exercises
1. Determine N(A ∪ B) under each of the following sets of conditions.
(a) N(A) = 15, N(B) = 25, N(A ∩ B) = 7
(b) N(A) = 15, N(B) = 25, N(A ∩ B) = 15
(c) N(A) = 15, N(B) = 25, N(A ∩ B) = 0
(d) N(A) = 15, N(B) = 25, N(A ∩ B) = 20
2. Use circle diagrams to answer the following questions. We are given
the three sets A, B, and C, and the following information.
N(A ∩ B ∩ C) = 1    N(A) = 14
N(A ∩ B) = 5        N(B) = 13
N(A ∩ C) = 3        N(C) = 12
N(B ∩ C) = 4
Find the following numbers.
(a) N(A ∪ B ∪ C)      (d) JV([A C')
(b) N(A ∩ B′ ∩ C)     (e) N(A ∩ C′)
(c) N(A ∩ B ∩ C′)
3. In a survey of 50 students it was found that 14 studied German, 13
studied French, 12 studied Spanish, 5 studied both German and French,
4 studied both Spanish and French, 3 studied both German and Spanish,
and only 1 studied all three languages. How many students in the
survey were taking none of these languages? Explain.
A NOTE ON UNIONS, INTERSECTIONS, AND COMPLEMENTS
We shall apply the concepts of unions, intersections, and complements
to various topics in mathematics. In particular we shall discuss the
arithmetic topic of common multiples, the algebraic concept of simultaneous
equations, and the geometric concept of intersections of curves. We shall
first study each of these concepts separately, showing how their structure
is enhanced by a knowledge of sets.
Once the above topics have been studied individually, we shall show
still other advantages of sets by treating these three apparently
independent topics as applications of the same principle.
We have been using such concepts as unions and intersections long before
we were formally aware of them. For example, consider the problem of
finding the sum of two rational numbers in common fraction form. Suppose
that we wish to add two fractions whose denominators are 5 and 7. Recall
that our first step was to
determine a common denominator; that is, we wanted to find a number that
was divisible by both 5 and 7. The smallest natural number with this
property was 35. Other natural numbers with this property were 70, 105,
and 140, to name but a few.
Let us now investigate the anatomy of the above problem. The number
we sought had to be divisible by 5. This meant that it had to have the
form 5x, where x was some integer. At the same time it had to be divisible
by 7, which meant that it also had to have the form 7y, where y was some
integer.
Using set-builder notation we let
A = {5x: x ∈ N}
B = {7y: y ∈ N}
where N denotes the set of natural numbers. We have taken the liberty
of restricting our universe of discourse to the set of natural numbers.
In plain English, A merely names the set of all natural numbers that are
divisible by 5, while B names the set of all natural numbers divisible by 7.
We wish to find those natural numbers that are divisible by both 5 and 7,
but in the language of sets this is precisely the set A ∩ B. Thus, even
if a person did not know how to find such numbers systematically he could
at least recognize them if he saw them. For example, given the number
25 we see that it is divisible by 5 but not by 7. Hence, 25 ∈ A, but
25 ∉ B. In other words, 25 ∈ A ∩ B′. In fact, as a special case of the
more general results described in Section 2.6, observe that each natural
number belongs to exactly one of the four mutually exclusive sets: A ∩ B′,
A′ ∩ B, A′ ∩ B′, or A ∩ B. This is shown in terms of a diagram in
Figure 2.34.
Figure 2.34
In our case, A ∩ B′ denotes the set of natural numbers that are divisible
by 5 but not by 7; A′ ∩ B denotes the natural numbers that are divisible
by 7 but not by 5; A′ ∩ B′ denotes the set of natural numbers that are
divisible by neither 5 nor 7; while A ∩ B denotes the set of natural numbers
divisible by both 5 and 7, that is, the common multiples of 5 and 7.
Some further examples are
25 ∈ A ∩ B′
14 ∈ A′ ∩ B
33 ∈ A′ ∩ B′
70 ∈ A ∩ B.
This in terms of a circle diagram is shown in Figure 2.35.
Notice that by translating A and B into roster form, we can determine
A ∩ B by inspection. Thus,
A = {5,10,15,20,25,30,35,40, . . .}
B = {7,14,21,28,35,42,49, . . .}
and we see that 35 is the least natural number that belongs to both A and
B. Had we elected to proceed further, we would have found, in order of
increasing value, that 70, 105, 140, 175, 210, and so on, also were members
of A ∩ B. Perhaps our intuition would also have predicted that A ∩ B
was precisely the set of all natural numbers that are divisible by 35, but
this fact is not needed; it is only helpful. In fact, part of our treatment
of rational numbers in Chapter 1 should show how to perform the necessary
computation for finding the least positive number in A ∩ B (this was called
the least common denominator), but one did not have to know these
computational skills to understand the underlying concept.
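The test for membership described above needs no special computational skill, and a short sketch makes that plain. The function name `region` is invented here, and the rosters are cut off at 210, an arbitrary bound chosen only so that the sets are finite.

```python
def region(n):
    """Name the one of the four mutually exclusive sets that contains n."""
    in_a = n % 5 == 0   # n belongs to A = {5x : x in N}
    in_b = n % 7 == 0   # n belongs to B = {7y : y in N}
    return ("A" if in_a else "A'") + " ∩ " + ("B" if in_b else "B'")

# Roster forms, cut off at an arbitrary bound of 210 for illustration.
A = set(range(5, 211, 5))
B = set(range(7, 211, 7))
common = sorted(A & B)   # the common multiples of 5 and 7 up to 210
```

Calling `region(25)` names A ∩ B′, and `common` lists the members of A ∩ B in increasing order, beginning with 35.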
Exercises
1. Let A denote the natural numbers that are divisible by 9; and B, the
natural numbers that are divisible by 15.
(a) List three members of A ∩ B′.
(b) List three members of A′ ∩ B′.
(c) List three members of A′ ∩ B.
(d) What is the smallest natural number that belongs to A ∩ B?
(e) Describe A ∩ B by the set-builder notation in two different ways.
2. Among the sets of numbers studied by the ancient Greeks were the
perfect numbers. A number is said to be perfect if the sum of the natural
numbers (excluding the number itself) that divide it is equal to the
given number. For example, 6 is a perfect number since its proper
divisors are 1, 2, and 3; and 1 + 2 + 3 = 6. Use the test for membership
to determine which of the following are also perfect numbers.
(a) 12 (b) 16 (c) 43 (d) 28 (e) 65.
3. Even today it is not known whether any odd perfect numbers exist,
yet the even perfect numbers can be described completely.
Namely, Euclid showed that N is a perfect number if
N = 2^(p−1)(2^p − 1), where 2^p − 1 is a prime number, and Euler later
showed that every even perfect number has this form. For example, if
p = 2, the recipe yields N = 2 × 3 = 6. Use this recipe to find the
four smallest perfect numbers.
4. List the five smallest members of M, if M = {2^p − 1: p is a prime}.
2.8 SIMULTANEOUS EQUATIONS

Simultaneous equations, as we have seen, are two or more open sentences
that must be satisfied at the same time. For example, if we were to write
the equation x + y = 10 it should not be difficult to see that there are many
pairs of numbers that satisfy the equation. In fact, there are infinitely
many such pairs. For given any number x, we need only form 10 − x as
the second number to guarantee that the sum be 10. Thus,
6 + (10 − 6) = 6 + 4 = 10
1/2 + (10 − 1/2) = 1/2 + 9 1/2 = 10
14 + (10 − 14) = 14 + (−4) = 10.
In a similar way there would be infinitely many solutions to the equation
x − y = 6. Namely, given any number y, we determine x by adding 6
to y. Then x − y will obviously equal 6. For example, if we choose
y = 14, let x = 14 + 6 = 20. Then x − y = 20 − 14 = 6.
Notice that if we try a pair of numbers x and y to see if they are
solutions of the two given equations, we see that our particular pair of
numbers may serve as a solution of neither equation, or of one but not the
other, or of both.
To indicate that we want those pairs of numbers that are solutions to
both equations we usually write
x + y = 10 }
x − y = 6  }
where the brace around the equations indicates that we want pairs of
numbers that simultaneously satisfy both equations.
Granted that we do not need the language of sets here, still the
set-builder notation makes it easy to visualize the problem. Namely, let us
agree to denote by A the solution set for the equation x + y = 10, and
by B the solution set for the equation x − y = 6. In other words,
A = {(x,y): x + y = 10} and B = {(x,y): x − y = 6}.
Notice that the elements are pairs of numbers, but nothing in our definition
of sets excluded pairs of numbers from being elements. Moreover, the
pairs are called ordered pairs since their order may make a difference.
That is, x − y and y − x are not synonyms. Thus, (7,1) ∈ B since
7 − 1 = 6; but (1,7) ∉ B since 1 − 7 = −6 and −6 ≠ 6.
In the language of sets, asking for the simultaneous solutions to the
equations
x + y = 10 }
x − y = 6  }
is now no more complicated than asking one to find A ∩ B. (Notice that
we have dealt with this problem more computationally in our section on
matrices. All we are emphasizing here is the logical structure of the
concept.) A person without knowledge of algebra or its equivalent might not
know of a neat procedure for finding A ∩ B; but the important thing is
that given any ordered pair, he can decide objectively whether or not it
belongs to A ∩ B.
By way of illustration, consider the ordered pair (7,3). Since 7 +
3 = 10, then (7,3) ∈ A; however, since 7 − 3 ≠ 6, then (7,3) ∉ B. In
other words (7,3) ∈ A ∩ B′.
Again, in the form of a review, in this problem A ∩ B′ denotes those ordered
pairs of numbers (x,y) whose sum is 10 but whose difference is not 6; A ∩ B
denotes those ordered pairs whose sum is 10 and whose difference is 6.
Adding the two equations gives 2x = 16, which implies that for a solution we must
have that x = 8. Then, since x + y = 10, the knowledge that x = 8
forces us to conclude that y = 2. Thus, if there exists a solution it must
be (8,2), and a check shows that this is indeed a solution.
If we wish to analyze the above more in terms of the language of sets,
all we are saying by way of review is that by use of algebra we can show that
the following pairs of simultaneous equations have the same (equal) solution
sets.
(1) x + y = 10     (2) x + y = 10     (3) x = 8
    x − y = 6          2x = 16            y = 2
It is simply that (3) is easier to solve at sight than either (1) or (2).
To summarize the above problem: if we wish to express the set of
simultaneous solutions of
x + y = 10 }
x − y = 6  }
we need only let A = {(x,y): x + y = 10} and B = {(x,y): x − y = 6}.
Then the solution set is A ∩ B. We can then use algebra to convert
A ∩ B from its set-builder form to the perhaps more convenient form of
A ∩ B = {(8,2)}.
Again, observe that the role of sets in no way changes the solution to the
problem; it only affords a uniform vehicle for stating the problem in familiar
language and helps us separate the actual problem from the solution of the
problem.
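The separation between stating the problem and solving it can be mirrored in a short sketch. The two membership tests below state the problem; the helper `solve_sum_difference` (a name invented here) carries out the algebra of adding the two equations. Exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def in_A(pair):
    x, y = pair
    return x + y == 10      # membership test for A

def in_B(pair):
    x, y = pair
    return x - y == 6       # membership test for B

def solve_sum_difference(total, difference):
    """Add x + y = total and x - y = difference to isolate x, then find y."""
    x = Fraction(total + difference, 2)
    return (x, total - x)

solution = solve_sum_difference(10, 6)
```

The pair (7,3) passes `in_A` but fails `in_B`, while `solution` passes both tests.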
Exercises
1. Let A denote the solution set of 3x + 4y = 11 and let B denote the
solution set of 2x + 3y = 7.
(a) Find an ordered pair (x,y) that belongs to A ∩ B′.
(b) Find a member of A′ ∩ B.
(c) Find a member of A′ ∩ B′.
(d) Find all members of A ∩ B.
(e) Granted that A ∩ B, A′ ∩ B, A ∩ B′, and A′ ∩ B′ subdivide
the ordered pairs into mutually exclusive partitions, does it follow
that these four sets each have the same number of elements?
Explain.
2. Find the solution set for each of the following in both set-builder and
roster form.
2x + 7y = 32 }      7x + 4y = 34 }
2x + 5y = 18 }      2x + 3y = 24 }
3. Give an example of an equation whose solution set has no members in
common with the members of the solution set of 3x + 17y = 87.
2.9 INTERSECTIONS OF CURVES

One of the contributions made by Descartes’ coordinate geometry is that
it made it fairly natural to think of a curve as being a set. That is, while
we usually think of a curve as a drawing, we can also think of it as being
a set of points. Thus, in Figure 2.36 we may think of the curves A and B
as being sets of points. Certain points in the plane are on (belong to)
one curve but not the other, other points belong to both curves, and still
other points belong to neither curve. In this case, we would write
p ∈ A′ ∩ B′
q ∈ A ∩ B′
r ∈ A′ ∩ B
s ∈ A ∩ B.
Moreover, A ∩ B would correspond to what we usually mean by the
intersection of curves, since the intersection of the curves consists precisely
of the points that belong to both curves simultaneously.
Figure 2.36
This concept takes on more significance from a computational point of
view when we restrict our attention to coordinate geometry.
Consider, again, the equation x + y = 10. At least at first glance
there is nothing to suggest geometry as we look at the equation. Now
consider the solution set of this equation; namely, {(x,y): x + y = 10}.
In terms of Descartes’ naming of the points in the plane, we can view the
solution set of this equation as being the points (x,y) in the plane that
satisfy the relation x + y = 10. While we have no intention of proving
the following statement, the fact remains that the set of points having this
property lies on a straight line. (This is why x + y = 10 is often referred
to as a linear equation.) Two points that satisfy this equation are (10,0)
and (0,10). Thus, the set of points (x,y) for which x + y = 10 forms the
straight line that passes through the points (0,10) and (10,0), as well as
through many, many other points. When we treat the solution set of an
equation as a set of points in the Cartesian plane, the resulting picture is
called the graph of the equation. Thus, Figure 2.37 shows the graph of the
equation x + y = 10.
In a similar way, we can show in Figure 2.38 that the graph of x − y = 6
is the straight line that passes through (6,0) and (7,1). If in Figure 2.39
we superimpose the results of Figures 2.37 and 2.38, we see that the two
graphs intersect at one point (8,2). This should not be too much of a
surprise after the previous section.
Figure 2.38
Figure 2.39
Here, then, we see that the great feat of Descartes was to translate the
solution set of algebraic equations into pictures, and conversely, pictures
into the solution set of algebraic equations.
Thus, while an equation and its graph are conceptually very different
(as different as algebra and geometry), we see that we can get a better
intuitive grasp of algebraic results if we can visualize the appropriate
picture.
By way of further illustration in terms of graphs, the solution set of
x + y = 7 }
x + y = 3 }
is empty because the graphs of the equations are parallel lines and, hence,
share no points in common (Figure 2.40).
In fact, since it is relatively easy to visualize that a pair of straight lines
either coincide (in which case they share infinitely many points in common),
are parallel and distinct (in which case they have no points of intersection),
or else are not parallel (in which case they meet at exactly one point),
it should be easy to understand, if we translate from graphs to equations,
that the solution set for a pair of simultaneous linear equations contains
either one element or no elements or infinitely many elements, and that
there are no other possibilities.
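The three possibilities can be told apart from the coefficients alone. The sketch below is one way to do so, not a method taken from the text: it writes each equation as ax + by = c, assumes a and b are not both 0, and uses the standard determinant test for parallel lines. The function name `classify` is invented for the illustration.

```python
from fractions import Fraction

def classify(a1, b1, c1, a2, b2, c2):
    """Describe the solution set of a1*x + b1*y = c1 and a2*x + b2*y = c2.

    Assumes each equation really is a line (a and b are not both 0).
    """
    det = a1 * b2 - a2 * b1
    if det != 0:
        # The lines are not parallel: exactly one common point (Cramer's rule).
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        return "one point", (x, y)
    # Parallel lines coincide exactly when the two equations are proportional.
    if a1 * c2 == a2 * c1 and b1 * c2 == b2 * c1:
        return "infinitely many points", None
    return "no points", None
```

For instance, `classify(1, 1, 10, 1, -1, 6)` reports one point, (8,2), while `classify(1, 1, 7, 1, 1, 3)` reports that the parallel lines share no points.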
Again, it was not our purpose to present an extensive treatment of
curves here, but rather to point out how understanding sets helps us avoid
certain difficult concepts in a very nice way. We shall develop more ideas
about graphs in later sections.
Exercises
1. Let us accept the fact that the solution set of any equation of the form
ax + by = c (where a, b, and c are any numbers and not both a and b
are 0) has as its graph a straight line. Graph the following equations:
(a) 3x + 4y = 11
(b) 7x + 9y = 23.
Drawing the graphs to accurate scale, determine where the curves
intersect. Then solve the two equations simultaneously and see how
this result checks with the graphical result.
2. Repeat the instructions from Problem 1 for the following pairs of
equations:
(a) 2x + 5y = 10
(b) x + 2y = 6;
(c) x - 2y = 7
(d) 3x + y = 28.
2.10 SETS AS A “COMMON DENOMINATOR”

It should now be clear that the language of sets, if properly used, can
simplify the structure of many topics in the mathematics curriculum. We
saw this in our individual treatment of common multiples, simultaneous
equations, and curves.
In this section we shall go one step further with sets and show that we
can unify the three different topics (common denominator, simultaneous
equations, and intersection of curves) described in the previous sections.
This we shall do by taking excerpts from each of the three sections and
placing them side by side.
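$$\begin{array}{lll}
(1)\ \ 25 \in A \cap B' & (2)\ \ (7,3) \in A \cap B' & (3)\ \ q \in A \cap B' \\
\phantom{(1)}\ \ 14 \in A' \cap B & \phantom{(2)}\ \ (6,0) \in A' \cap B & \phantom{(3)}\ \ r \in A' \cap B \\
\phantom{(1)}\ \ 33 \in A' \cap B' & \phantom{(2)}\ \ (9,4) \in A' \cap B' & \phantom{(3)}\ \ p \in A' \cap B' \\
\phantom{(1)}\ \ 70 \in A \cap B & \phantom{(2)}\ \ (8,2) \in A \cap B & \phantom{(3)}\ \ s \in A \cap B.
\end{array}$$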
To someone who had not seen the previous three sections, (1), (2), and
(3) look like three ways of saying the same thing, only with the names of
the elements changed. But we know that (1), (2), and (3) were arrived
at from completely different mathematical situations, or at least so it would
appear.
Thus, it seems that the language of sets can be effectively used to unify
many different problems, at least in regard to a common language. In
other words, while it may initially be difficult for a student to learn the
terminology of the language of sets, once he has learned the language he
gets the bonus of being able to learn many different concepts in terms of a
common environment. This is a remarkable fringe benefit, and if the
language of sets had nothing to offer besides being a common denominator
of sorts this alone would make it worthy of being learned. Indeed, it
would be a big help if we could develop a uniform language such as in
English grammar. That is, notice that while grammar increases in
sophistication from year to year, a noun remains a noun and a verb remains a verb.
In the same way, if the use of sets could insure a uniform mathematical
language, it would not be nearly as upsetting for us to advance in
mathematical sophistication; for while the game might get tougher to play, we
would still have the psychological advantage of knowing that we are at
least still playing the same game.
However, we have hardly begun to show why sets are important. The
next section will show still another excellent use for the concept of sets.
part III / Cartesian Products
2.11 INTRODUCTION
Up to now we have mainly used sets to establish the proper vocabulary.
Now we present one other type of set, which in many ways is a dimension
above other sets. It is called a Cartesian product.
This type of set has many important applications. In the following
sections we shall study two such uses of Cartesian products. One use is
with regard to counting procedures in such problems as counting the
number of permutations and combinations of certain events. The other
use is more abstract, wherein we shall use Cartesian products to dissect
the concept known as relations. In particular, we shall examine the
important idea of an equivalence relation.
Before doing these things, however, it is best that we try to find some
motivation for introducing the idea of a Cartesian product; and, hopefully,
this motivation will be supplied in the next two sections.
2.12 A GEOMETRIC MOTIVATION

As we have mentioned, Descartes’ coordinate geometry served to unify
arithmetic (algebra) and geometry. In other words, it afforded a vehicle
whereby ordered pairs of numbers could be viewed as points in the plane,
and points in the plane could be viewed as ordered pairs of numbers. For
example, given the expression $x^2 + y^2 = 1$ we have the choice of defining
$\{(x,y) : x^2 + y^2 = 1\}$ as the solution set of the equation $x^2 + y^2 = 1$, or
as the set of points in the Cartesian plane whose coordinates $(x,y)$ satisfy
the relation $x^2 + y^2 = 1$. Obviously, the two interpretations are
independent of one another.
One beauty of this setup is that we can do geometry problems essentially
with arithmetic, needing no knowledge of geometry. That is, the “atoms”
of geometry are points, and points can be viewed as being ordered pairs of
real numbers. While the above comments make sense without the specific
mention of sets, let us introduce the concept to the discussion for the
purpose of motivation for the remainder of this section.
Recall that the number line is a geometric way of visualizing the set of
real numbers. Let us denote the set of real numbers by R. Then the
geometric concept of saying “x is a point on the number line” translates
simply into “$x \in R$.” In still other words, the number line is just a
pictorial representation of the set R, with points on the line corresponding
to elements of R.
How can we extend this concept to relate points in the plane to sets?
We have already seen that points in the plane are ordered pairs of numbers
or, more specifically, ordered pairs of elements of R. Order is important
since the points (3,4) and (4,3) are not the same. Let us introduce the
notation $R \times R$ (called the Cartesian product of R with itself) to denote
the set of all ordered pairs of real numbers. That is,
$$R \times R = \{(x,y) : x \in R \text{ and } y \in R\}.$$
Notice that while the definition of $R \times R$ was motivated by a view of the
plane, the concept in no way depends on geometry since one can easily
view the set of ordered pairs of real numbers without a knowledge of
coordinate geometry.
Carrying the analogy further, observe that $R \times R \times R$ can be viewed as
the set of all ordered triples of real numbers, that is,
$R \times R \times R = \{(x,y,z) : x \in R,\ y \in R, \text{ and } z \in R\}$, and that this would
correspond to the set of points that make up three-dimensional Cartesian space.
The concept being described here could still be studied without modern
set notation, but once we have mastered the basic terminology of sets we
thus minimize any chance of misinterpretation because this new language
is both concise and precise.
Next observe that the idea of Cartesian products gives us an analytical
way of viewing the geometric concept of dimension. By this we mean
that we have identified the number line with R and the number line is
one-dimensional. Thus, one-dimensional geometry involves a single
replica of R. Next, we have identified the geometry of the plane with $R \times R$,
but the plane is considered to be two-dimensional, while $R \times R$ involves
two replicas of R. Similarly, $R \times R \times R$ involves three replicas of R and
corresponds to the geometry of three-dimensional space. Observe that
$$R \times R \times R \times R = \{(x,y,z,w) : x \in R,\ y \in R,\ z \in R,\ w \in R\}$$
is perfectly well-defined as a set of 4-tuples of real numbers and hence would
be identified with four dimensions even though physically we cannot view
four-dimensional space in the usual sense of the term.
Let us visualize some sets pictorially. In terms of the number line, the
position of a point falls into one of three categories: (1) it may lie to the
right of the origin, (2) it may lie to the left of the origin, (3) it may coincide
with the origin. Recalling Descartes’ idea of signed numbers, this
translates into the fact that a real number is either (1) positive, (2) negative, or
(3) zero. Let us identify the positive numbers as those being greater than
zero, the negative numbers as those being less than zero, and zero just
being zero. In terms of sets:
$R^+ = \{x \in R : x > 0\}$ = the set of positive real numbers, (1)
$R^- = \{x \in R : x < 0\}$ = the set of negative real numbers, (2)
$\{0\}$, (3)
$R = R^+ \cup R^- \cup \{0\}$, where $R^+ \cap R^- = R^+ \cap \{0\} = R^- \cap \{0\} = \emptyset$.
These three mutually exclusive subsets whose union is R translate very
nicely into the language of the number line since $R^+$ corresponds to the
points that lie to the right of the origin, $R^-$ to the points that lie to the left
of the origin, and $\{0\}$ to the origin itself.
It should now be easy to see the difference between such sets as $R^+ \cap R^+$
and $R^+ \times R^+$. Observe that $R^+ \cap R^+$ merely yields $R^+$ again, as does
$R^+ \cup R^+$. On the other hand, $R^+ \times R^+ = \{(x,y) : x \in R^+ \text{ and } y \in R^+\}$,
or in terms of the definition of $R^+$, $R^+ \times R^+ = \{(x,y) : x > 0 \text{ and } y > 0\}$.
Translating this result into pictorial form, $R^+ \times R^+$ represents a
quadrant of the plane. Traditionally, the Cartesian plane is viewed as being
composed of four quadrants called I, II, III, and IV. Quadrant I denotes
the upper right-hand quarter of the plane; quadrant II, the upper left-hand
quarter; quadrant III, the lower left-hand quarter; and quadrant IV, the
lower right-hand quarter (see Figure 2.41).
To prevent ambiguities from arising, at least for the time being, we shall
exclude the x and y axes from being part of the quadrants. This guarantees
that no point can belong to two quadrants. Otherwise, (0,0) would belong to
all four quadrants. If we wished the axes to be included as part of a quadrant,
then what we now call a quadrant would be called the interior of that
enlarged quadrant. To go on, notice that the set $R^+ \times R^+$ translates
geometrically into quadrant I. This shows us visually a very basic and
important difference between unions or intersections, and Cartesian products.
[Figure 2.41: the four quadrants of the Cartesian plane, labeled by the signs of the coordinates: I (+,+), II (-,+), III (-,-), IV (+,-).]
Namely, unions and intersections of a set with itself preserve dimension
while a Cartesian product increases the dimension. That is, while such
things as $R^+ \cup R^-$ are subsets of the number line, $R^+ \times R^-$ is a quadrant.
Specifically, quadrant I corresponds to $R^+ \times R^+$, quadrant II corresponds
to $R^- \times R^+$, quadrant III corresponds to $R^- \times R^-$, and quadrant IV
corresponds to $R^+ \times R^-$.
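The correspondence between quadrants and Cartesian products of $R^+$ and $R^-$ amounts to nothing more than a check of two signs. The short Python sketch below is our own illustration (the function name quadrant is an assumption, not notation from the text); it returns no quadrant at all for points on the axes, in keeping with the convention adopted above.

```python
def quadrant(x, y):
    """Return "I", "II", "III", or "IV" for a point off the axes, else None."""
    if x > 0 and y > 0:
        return "I"      # (x, y) belongs to R+ x R+
    if x < 0 and y > 0:
        return "II"     # (x, y) belongs to R- x R+
    if x < 0 and y < 0:
        return "III"    # (x, y) belongs to R- x R-
    if x > 0 and y < 0:
        return "IV"     # (x, y) belongs to R+ x R-
    return None         # the point lies on an axis, so it is in no quadrant
```

Because the four conditions are mutually exclusive, no point can be assigned to two quadrants.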
It should also be observed that we can study subsets of the real numbers
other than positive, negative, or zero numbers. Special subsets of the
real numbers that play an important role in mathematical analysis are
called intervals.
Before we give a formal definition of an interval, let us notice that it is
not uncommon to refer to, for example, the set of all numbers that are
greater than 1 but less than 2. Such a set has infinitely many members.
To be sure, it contains no integers, but it does contain, for example,
$\sqrt{2}$ and certain fractions. In terms of the number line, this set corresponds to the
set of points described in Figure 2.42.
[Figure 2.42: number line marked 0, 1, 2, with the interval between 1 and 2 indicated.]
Looking at the picture, it should not be difficult to guess why this set
of numbers is called an interval. We can see that the collection is a
connected set of points. Notice that 1 and 2 are not themselves members
of the set being discussed. For this reason we refer to the interval as being
an open interval, since it is open at both ends. In other words, the end
points are not members of the collection. On the other hand, had we
referred to all numbers that are at least as great as 1 and no greater than
2, we would have called this a closed interval. The only difference between
an open and a closed interval is that we include the end points as members
in a closed interval but not in an open interval.
This leads us to another battle of point versus dot. Namely, if we were
to draw the closed interval from 1 to 2, we would not be able to distinguish
it from the drawing of the open interval from 1 to 2 because a point has
no thickness. The question is, how can we tell just by looking whether the
end points are included or excluded? Since we cannot, we invent some
new notation. If we wish to draw a picture indicating the open interval
from 1 to 2, we enclose the end points with parentheses; to indicate the
closed interval from 1 to 2, we enclose the interval in brackets. These
two ideas are illustrated in Figure 2.43. We use the notation (1,2) to
indicate the open interval from 1 to 2, and [1,2] to indicate the closed
interval from 1 to 2.
[Figure 2.43: two number lines marked 0, 1, 2; the first shows the open interval from 1 to 2 enclosed in parentheses, the second shows the closed interval from 1 to 2 enclosed in brackets.]
From this we are led to the following definition:
If a < b, then by (a,b) we mean the set of all real numbers that are
greater than a but less than b. This subset of the real numbers is called
the open interval from a to b. In the language of sets (a,b) denotes
$\{x \in R : x > a \text{ and } x < b\}$, and this is usually abbreviated by the
expression $a < x < b$.
We use the notation [a,b] to denote the set of real numbers that are no
less than a and no greater than b. This subset is called the closed interval
from a to b. In the language of sets, $[a,b] = \{x \in R : x \ge a \text{ and } x \le b\}$,
and this is often abbreviated by $a \le x \le b$.
Notice that an interval need not be either open or closed. For example,
we might write (1,2] to indicate that the interval excludes 1 but includes
2. Thus, (1,2] would denote the set written as $1 < x \le 2$; and we would
say that this interval is open on the left and closed on the right.
Obviously, we can talk about the union or the intersection of two or
more intervals since we can talk about the union and intersection of
any sets. In particular, intervals are sets. By way of illustration,
$(1,3) \cap (2,4)$ denotes those real numbers less than 3 and greater than 1,
while at the same time greater than 2 and less than 4. Once the language
is unscrambled, it is not too difficult to see that $(1,3) \cap (2,4) = (2,3)$.
Here again, we can use the number line as a visual aid. Namely, (1,3)
corresponds to Figure 2.44 while (2,4) corresponds to Figure 2.45.
Superimposing the two diagrams, we find that the answer is the cross-hatched
region, and that this region is clearly (2,3), as shown in Figure 2.46.
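[Figure 2.44: number line marked 0 through 4 with the interval (1,3) indicated.]
[Figure 2.45: number line marked 0 through 4 with the interval (2,4) indicated.]
[Figure 2.46: number line marked 0 through 5 with the cross-hatched overlap (2,3) indicated.]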
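The superimposing of the two diagrams can also be carried out numerically: the overlap begins at the larger of the two left end points and ends at the smaller of the two right end points. The sketch below is our own illustration, not a construction from the text; the name intersect_open and the choice to represent an open interval (a,b) by the Python pair (a, b) are assumptions.

```python
def intersect_open(i, j):
    """Intersection of two open intervals, each given as a pair (a, b) with a < b.

    Returns the resulting open interval as a pair, or None when the two
    intervals share no points.
    """
    lo = max(i[0], j[0])   # the overlap cannot start before either interval starts
    hi = min(i[1], j[1])   # and cannot extend past either interval's end
    return (lo, hi) if lo < hi else None

overlap = intersect_open((1, 3), (2, 4))   # the example worked in the text
disjoint = intersect_open((1, 2), (3, 4))  # two intervals with nothing in common
```

Note that when lo equals hi the result is reported as empty, since an open interval excludes its end points.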
Aside from the fact that intervals play an important role in
computational mathematics, we have also introduced this concept to highlight the
essential feature of a Cartesian product. Namely, while the intersection
of two intervals is again an interval (or is empty), the Cartesian product of
two intervals corresponds to a two-dimensional subset of the plane. Thus, $(1,3) \times (2,4)$
denotes the set of all ordered pairs of real numbers where the first member
of the pair belongs to the open interval (1,3) and the second member of
the pair belongs to the open interval (2,4). That is, $(1,3) \times (2,4) =
\{(x,y) : 1 < x < 3 \text{ and } 2 < y < 4\}$.
Again, using a graph as a visual aid, any point in the region indicated
in Figure 2.47 is characterized by $1 < x < 3$. On the other hand,
$2 < y < 4$ would be represented by Figure 2.48. Hence, the only way that
$1 < x < 3$ and $2 < y < 4$ can both hold is if the conditions exist as in
Figure 2.49, and we see that the Cartesian product of two intervals is a
two-dimensional subset of the plane.
Figure 2.47
Figure 2.48
Figure 2.49 (Required region is interior of cross-hatched region.)
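To decide whether a given point belongs to a Cartesian product of open intervals, we simply test each coordinate against its own interval. The following Python sketch is our own illustration (the name in_open_box is an assumption); it works for any number of coordinates, so the same test applies to $R \times R \times R$.

```python
def in_open_box(point, intervals):
    """True when every coordinate lies strictly inside its open interval.

    point     -- a tuple of numbers, e.g. (x, y)
    intervals -- one (a, b) pair per coordinate, each read as the open interval (a, b)
    """
    if len(point) != len(intervals):
        raise ValueError("need exactly one interval per coordinate")
    return all(a < coord < b for coord, (a, b) in zip(point, intervals))

box = [(1, 3), (2, 4)]               # the product (1,3) x (2,4) from the text
inside = in_open_box((2, 3), box)    # an interior point
on_edge = in_open_box((2, 2), box)   # y = 2 is an excluded end point
```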
Exercises
1. Explain the correspondence between sets of real numbers and subsets
of the number line.
2. Explain the correspondence between ordered pairs of real numbers and
subsets of the Cartesian plane.
3. In terms of Cartesian products explain why (0,0) is the same as
$\{0\} \times \{0\}$.
4. Indicate the intervals (3,7) and (4,9) on the x axis. Then describe
both the union and the intersection of these two sets.
5. In terms of the Cartesian plane describe $(3,7) \times (4,9)$.
6. Repeat the processes of Exercises 4 and 5 using the following pairs
of sets.
(a) [3,7] and [4,9].
(b) (3,7] and [4,9).
(c) (1,2) and (3,4).
7. Recalling that we may identify three-dimensional space with
$R \times R \times R$, and using $R^+$ and $R^-$ as described in the text, explain in terms
of subsets of $R \times R \times R$ that there are eight octants in three-dimensional
space (as contrasted to four quadrants in two-dimensional space).
8. In terms of three-dimensional geometry try to describe the set
$[1,2] \times [1,2] \times [1,2]$.
9. Using the meanings as given in the text, translate in terms of geometry
the following expression into plain English.
Ei X R* = $(R^+ \times R^-) \cup (R^+ \times \{0\}) \cup (\{0\} \times R^-) \cup (\{0\} \times \{0\})$.
2.13 CARTESIAN PRODUCTS OF ARBITRARY SETS
In the last section we saw that the idea of ordered pairs of numbers
motivated the concept of Cartesian products. It did not take long for
man to discover that many problems in life, aside from locating points
in the plane, hinge on ordered pairs, and thus he generalized this concept.
For example, whenever we give a concrete interpretation to a statement
of the form:
All A’s are B’s,
the truth of the statement hinges not only on the choice of A and B, but
also on the order in which they are stated. Thus, if A denotes the set of
bears and B the set of animals, the statement would be true; however, if
we reversed the order of A and B the resulting statement would be false.
Let us now proceed letting S and T denote any two sets (we do not
discount the possibility that S = T, but this need not be the case). Then
by $S \times T$ we mean the set of all ordered pairs where the first member of
the pair is an element of S, while the second member of the pair is an
element of T. In our language, $S \times T = \{(s,t) : s \in S \text{ and } t \in T\}$.
In terms of a numerical example, suppose $S = \{1,2,3\}$ and $T = \{4,5\}$.
Then $S \times T$ would consist of those ordered pairs in which the first member
came from S and the second from T. In other words, the first member
must be 1, 2, or 3; and no matter what the first member is the second
member must be either 4 or 5. It should not be difficult to see that
$$S \times T = \{(1,4), (1,5), (2,4), (2,5), (3,4), (3,5)\}.$$
Notice that $S \times T$ and $T \times S$ are not the same sets since the pairs of
one have a different order than the pairs of the other. That is,
$$T \times S = \{(4,1), (5,1), (4,2), (5,2), (4,3), (5,3)\}.$$
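For readers who wish to experiment, the two products can be generated mechanically. The sketch below is our own illustration using the product function from Python's standard itertools module; the variable names S_x_T and T_x_S are assumptions made for the example.

```python
from itertools import product

S = {1, 2, 3}
T = {4, 5}

# Each product is the set of all ordered pairs (first member, second member).
S_x_T = set(product(S, T))   # first member from S, second from T
T_x_S = set(product(T, S))   # first member from T, second from S

same_sets = (S_x_T == T_x_S)              # are the two products equal as sets?
same_size = (len(S_x_T) == len(T_x_S))    # do they have equally many pairs?
```

Here same_sets is False while same_size is True, which is exactly the distinction drawn in the text: reversing the order changes the pairs but not how many there are.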
If we graph $S \times T$ or $T \times S$ with the understanding that the first
member will be represented by the x axis and the second by the y axis, the
circled points in Figure 2.50 represent $S \times T$ while the boxed points
represent $T \times S$. A quick glance shows that the two sets of points are
not the same.
Figure 2.50
Still another way of seeing the difference is to think of the first member
as telling us the number of tens and the second as telling us the number of
ones. Thus, S X T would induce the numbers 14, 15, 24, 25, 34, and 35;
while T X S would induce 41, 51, 42, 52, 43, and 53. These are clearly
two different sets of numbers.
A third way of viewing the numbers in $S \times T$ is in terms of branch
diagrams. That is, starting with S we write the members of S on one
line: 1 2 3. Next we observe that starting with any member of S the
next member can be any element of T. Since T has two elements there are
two lines (branches) drawn from each element of S.
      1         2         3
     / \       / \       / \
    4   5     4   5     4   5
We could then read all possibilities by starting anywhere on the top line
and following a branch to the bottom.
Using any method that seems the most convenient, we should not find it
difficult to see that if S and T are arbitrary finite sets, then
$$N(S \times T) = N(S) \times N(T).$$
In terms of our example, $N(S) = 3$ and $N(T) = 2$. Thus,
$$N(S \times T) = 3 \times 2 = 6,$$
which checks with our previous result.
While $S \times T$ and $T \times S$ may well be different sets, $N(S \times T) = N(T \times S)$
since $N(S) \times N(T) = N(T) \times N(S)$.
(Recall that N(S) and N(T) are numbers and that the product of two
numbers does not depend on the order of the factors.)
More intuitively, we are saying that if we reverse the order of the
elements that make up the pairs, then we change the pairs but we do not
change the number of the pairs.
It makes sense to talk about $A \times B \times C$, where A, B, and C are sets.
Namely, $A \times B \times C = \{(a,b,c) : a \in A,\ b \in B,\ c \in C\}$. In this way
$N(A \times B \times C) = N(A) \times N(B) \times N(C)$, and we can extend this idea
to the Cartesian product of four or more sets as well.
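The counting rule for three or more sets can be checked by brute force on small cases. The sketch below is our own illustration: the function name count_product and the three sample sets are hypothetical, and math.prod requires Python 3.8 or later.

```python
from itertools import product
from math import prod

def count_product(*sets):
    """Count the elements of a Cartesian product by actually listing them."""
    return sum(1 for _ in product(*sets))

# Three hypothetical finite sets of sizes 3, 2, and 4.
A = {1, 2, 3}
B = {4, 5}
C = {"w", "x", "y", "z"}

listed = count_product(A, B, C)                   # count by enumeration
predicted = prod(len(s) for s in (A, B, C))       # N(A) x N(B) x N(C)
```

Both listed and predicted come to 24 here; the point of the rule is that the second computation never requires writing out a single triple.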
For example, suppose we are given the digits 1, 2, 3, 4, and 5, and we are
told to arrange them without repetitions to form a five-digit number.
The number of such arrangements is $5 \times 4 \times 3 \times 2 \times 1 = 120$. Thus, there
are 120 different five-digit numbers that can be formed in this way. How can we
see this in terms of Cartesian products? Let A denote the set of digits that
can be used in the ones column. Then $N(A) = 5$. Let B denote the set of digits
that can be used for the tens column once the digit for the ones column is chosen. Since
no digit can be used twice, $N(B) = 4$. Once the units and tens digits have
been selected let C denote the set of digits that can be used in the hundreds
column. Again, since no digit can be used more than once, $N(C) = 3$.
Continuing in this way, we define D and E so that $N(D) = 2$ and $N(E) = 1$.
Since the number is an element of $A \times B \times C \times D \times E$, and since
$N(A \times B \times C \times D \times E) = N(A) \times N(B) \times N(C) \times N(D) \times N(E) = 5 \times 4 \times
3 \times 2 \times 1$, we see how the answer to the problem was obtained. We could
have actually counted the 120 different numbers, but that is quite tedious.
Even so, it is still better than nothing. In many problems of everyday
life there is no way to replace tedious trial-and-error. Fortunately,
however, the invention of computers, which “read” quickly, enables us to
examine many cases in a brief interval of time.
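As a small instance of letting a computer do the tedious listing, the following Python sketch (our own illustration, using the standard itertools.permutations function) writes out every five-digit number that uses each of the digits 1 through 5 exactly once and then counts them.

```python
from itertools import permutations

digits = "12345"

# Every arrangement of the five digits, read as a five-digit number.
numbers = sorted(int("".join(p)) for p in permutations(digits))

how_many = len(numbers)     # should agree with 5 x 4 x 3 x 2 x 1
smallest = numbers[0]
largest = numbers[-1]
```

The count agrees with the 120 obtained from the Cartesian product argument, with 12345 the smallest such number and 54321 the largest.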
We shall pursue the development of computational skill in the use of
the number of elements in a Cartesian product in the next two sections.
However, our main point here was to show how the concept of a Cartesian
product, once extended from the realm of the plane, can be used as an aid
when we wish to quickly count large numbers of possibilities.
Exercises
1. Explain what we mean when we say that $A \times B$ and $B \times A$ are not
equal, but that $N(A \times B)$ and $N(B \times A)$ are.
2. If $N(A) = 4$ and $N(B) = 5$, show by graph technique that
$N(A \times B) = 20$.
3. Do as in Exercise 2, using branch diagrams.
4. Let $A = \{1,2,3,4\}$.
(a) How many elements belong to $A \times A$?
(b) List the elements of $A \times A$.
(c) How many two-digit numbers can be formed using the digits 1, 2, 3,
or 4, if the same digit may be used more than once?
(d) How many such numbers can be formed if we allow no digit to be
repeated?
5. A restaurant offers a choice of four different fruit juices, fifteen different
sandwiches, five different dinner beverages, and ten different desserts.
How many different meals can be served if a meal is to consist of juice, a
sandwich, a beverage, and a dessert?
6. In how many ways can the digits 1, 2, 3, 4, 5, 6, 7, 8, and 9 be arranged
to form a nine-digit number?
7. A baseball manager is not concerned with the order in which his nine
players bat. How many different batting orders can he invent once
his nine players are selected?
8. The numerals 1, 2, 3, 4, 5, 6, 7, 8, and 9 are placed into a bag. A person
is to draw the numerals from the bag, one at a time, reading each
numeral drawn. What is the possibility that he will draw the numerals
in the consecutive order from 1 through 9? Explain.
2.14 PERMUTATIONS AND OTHER COMPUTATIONS

As we have already seen, one important form of counting involves the
number of ways in which the elements of a finite set can be arranged. Any
such arrangement is called a permutation.
We have seen that in counting permutations we multiply all natural
numbers from 1 to the number of elements in the set. For example, when
the set had five elements we computed $5 \times 4 \times 3 \times 2 \times 1 = 120$ to find
the number of permutations, and we explained this in terms of counting
the number of elements in a Cartesian product. In the same way, we could
determine that if $N(S) = n$ then the number of permutations on S is
given by $1 \times 2 \times \dots \times n$. Since such expressions occur so often, we
will invent an abbreviation, a shorthand, if you will, in much the same
way that we invented exponents as a form of shorthand.
Specifically, for any natural number n we write n! (read as “n factorial”)
to denote the product of the first n natural numbers. For example,
1! = 1
2! = 1 X 2 = 2
3! = 1 X 2 X 3 = 6
4! = 1 X 2 X 3 X 4 = 24
5! = 1 X 2 X 3 X 4 X 5 = 120
6! = 1 X 2 X 3 X 4 X 5 X 6 = 720
7! = 1 X 2 X 3 X 4 X 5 X 6 X 7 = 5040
8! = 1 X 2 X 3 X 4 X 5 X 6 X 7 X 8 = 40,320
9! = 1 X 2 X 3 X 4 X 5 X 6 X 7 X 8 X 9 = 362,880
10! = 1 X 2 X 3 X 4 X 5 X 6 X 7 X 8 X 9 X 10 = 3,628,800.
Notice th a t we get from one factorial to the next by multiplying by the
next natural number.
2! = 2(1!) = 2 X 1 = 2
3! = 3(2!) = 3 X 2 = 6
4! = 4(3!) = 4 X 6 = 24
5! = 5(4!) = 5 X 24 = 120, and so on.
In general, for any natural number n, (n + 1)! = (n + 1) (n!).
While 0 is not a natural number, it is customary to define 0! to be 1.
Remember that definitions are man-made and do not have to be logical.
That is, we could have defined 0! to be anything at all, especially since
our original definition for n! required n to be a natural number and 0 is
not a natural number. However, it is also customary to choose definitions
that will preserve properties we wish to use. The key computational
fact about factorials is that, as seen above, $(n + 1)! = (n + 1)(n!)$. Suppose
we like this recipe to the extent that we would like it to remain true
even if n = 0. Then, logically, we have no choice but to replace n by
0 in the recipe and see what this implies. Replacing n by 0 in
$(n + 1)! = (n + 1)(n!)$, we see that this leads to
$$(0 + 1)! = (0 + 1)(0!)$$
or
$$1! = 1(0!)$$
or
$$1 = 0!$$
Thus, we see that if we wish to preserve this recipe (and the choice is ours
to make) then we must define 0! = 1. In terms of this new language, if
$N(S) = n$ then there are n! distinct permutations that can be defined on
S. Let us apply this idea to some problems.
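The recipe $(n + 1)! = (n + 1)(n!)$, together with the agreement that 0! = 1, is all that is needed to compute any factorial. The Python sketch below is our own illustration of that idea (Python's standard library already supplies math.factorial, which we use only as a cross-check).

```python
import math

def factorial(n):
    """Compute n! for n = 0, 1, 2, ... from 0! = 1 and (n + 1)! = (n + 1)(n!)."""
    if n < 0:
        raise ValueError("n must be 0 or a natural number")
    result = 1                 # this is 0!
    for k in range(1, n + 1):
        result = k * result    # k! = k x (k - 1)!
    return result

# The table from the text, recomputed, and compared with the library routine.
table = {n: factorial(n) for n in range(0, 11)}
agrees_with_library = all(table[n] == math.factorial(n) for n in table)
```

Starting the running product at 1 is precisely the definition 0! = 1 at work; with any other starting value the recipe would give wrong answers for every n.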
Example 1
How many nine-digit numbers can be formed by the numerals 1, 2, 3, 4,
5, 6, 7, 8, and 9 if each digit is to be used exactly once?
SOLUTION: Observe that this simply paraphrases asking us to find the
number of permutations that can be performed on a set of nine elements.
Thus, the answer to the problem is 9!, or 362,880. Such an example
serves to enhance our claim that man can think logically as well as seek
simple solutions to his problems; since it is much “neater” to proceed our
way than to have had to list all 362,880 numbers to obtain the answer!
Example 2
How many nine-digit numbers can be formed using the digits 1, 2, 3, 4, 5,
6, 7, 8, and 9 if we can use each digit as often as we wish?
SOLUTION: Here we come to an interesting example to show that while
recipes may become automatic in mathematics, we can virtually never
automatically apply a particular recipe to a given situation. In this
instance the reader who tries to use permutations will be in great difficulty
since permutations require that no member be used more than once. In
this problem we may use some digits more than once, and others less than
once. Let us visualize the problem as follows.
We are to fill in the nine spaces below with any digit from the given
collection.
_ _ _ _ _ _ _ _ _
Since we are not restricted to specific digits in each space, each space may
be filled in by any one of nine elements. Let us indicate this by writing
9 9 9 9 9 9 9 9 9
where here we are not indicating the number 999,999,999; but rather that
each space may be filled in in nine different ways. The answer to the
problem is given by
$$9 \times 9 \times 9 \times 9 \times 9 \times 9 \times 9 \times 9 \times 9 = 9^9.$$
Why is this the case? The answer is, again, best explained in terms of
Cartesian products. We will let $A = \{1,2,3,4,5,6,7,8,9\}$. Then the
required number corresponds to nothing more than an element of
$A \times A \times A \times A \times A \times A \times A \times A \times A$ (why?); and the correct answer
is now a consequence of the general result that the number of elements in a
Cartesian product is the product of the number of elements in each of the
sets that comprise the Cartesian product.
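Listing all $9^9$ numbers would be a long job even for a machine, but the pattern can be confirmed on shorter numbers. The sketch below is our own illustration (the function name count_with_repeats is an assumption); it lists every element of $A \times A \times \dots \times A$ for one to four factors and compares the count with the corresponding power of 9.

```python
from itertools import product

digits = "123456789"   # the set A of the example

def count_with_repeats(length):
    """Count digit strings of the given length by listing A x A x ... x A."""
    return sum(1 for _ in product(digits, repeat=length))

# For short lengths, enumeration and the power of 9 agree.
for length in range(1, 5):
    assert count_with_repeats(length) == 9 ** length

nine_digit_total = 9 ** 9   # the answer to Example 2, namely 387,420,489
```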
Example 3
We have the numerals 1, 2, 3, 4, 5, 6, 7, 8, and 9 at our disposal. We
wish to form a two-digit number from these numerals and to have no
numeral be repeated in a given number. How many such numbers can
we form?
SOLUTION: Again, we may begin by observing that we wish to fill in two
spaces. The first space can be filled in by any one of the nine digits, but
since a digit cannot occur twice in the same number we can fill in the
second space by a choice of any one of the remaining eight unused digits.
A diagrammatic example might be
9 8.
Thus, the answer to the problem will be $9 \times 8 = 72$. This follows from
Cartesian products: we let A denote the set containing the original nine
digits, and B the set of digits that remain after one of the original digits is
deleted. Then the two-digit numbers correspond to the members of
$A \times B$. The result follows since $N(A \times B) = N(A) \times N(B)$, and since
$N(A) = 9$ and $N(B) = 8$.
Had we wished to solve the same problem, but now allowing a digit to
be repeated, our answer would have been $9 \times 9 = 81$. The reader may
verify this result.
Observe once again that we do not need Cartesian products to solve
this problem; however, with this knowledge we can easily understand why
we multiply certain numbers the way we do to obtain an answer. For
example, the following pattern allows us to write down all two-digit
numbers formed by the use of all digits excluding 0.
(11) 12 13 14 15 16 17 18 19
21 (22) 23 24 25 26 27 28 29
31 32 (33) 34 35 36 37 38 39
41 42 43 (44) 45 46 47 48 49
51 52 53 54 (55) 56 57 58 59
61 62 63 64 65 (66) 67 68 69
71 72 73 74 75 76 (77) 78 79
81 82 83 84 85 86 87 (88) 89
91 92 93 94 95 96 97 98 (99)
While this technique may be tedious, it also allows us to visualize the
physical significance of the expression $9 \times 9$ used in finding the answer.
Finally, observe that the circled numbers on the diagonal (shown here in
parentheses) are precisely those that use repeated digits; when these nine
numbers are deleted from the list there are 81 - 9 = 72 numbers left, which
checks with the original result obtained with the recipe.
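The table and the subtraction 81 - 9 = 72 can be reproduced in a few lines. The sketch below is our own illustration; the names all_pairs, repeated, and distinct are assumptions made for the example.

```python
digits = range(1, 10)   # the numerals 1 through 9

# Every (tens digit, units digit) pair: the whole 9-by-9 table.
all_pairs = [(t, u) for t in digits for u in digits]

# The diagonal of the table: numbers such as 11, 22, ..., 99.
repeated = [(t, u) for (t, u) in all_pairs if t == u]

# What remains once the diagonal is deleted.
distinct = [(t, u) for (t, u) in all_pairs if t != u]

counts = (len(all_pairs), len(repeated), len(distinct))
```

The three counts are 81, 9, and 72, matching $9 \times 9$, the nine diagonal entries, and $9 \times 8$ respectively.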
Notice also that $9 \times 8$ is not a factorial since factorials must include
the product of all natural numbers from 1 to the number in question.
However, in terms of rational numbers we can rewrite $9 \times 8$ in terms of
factorials. Namely,
$$9! = 9 \times 8 \times 7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1
= (9 \times 8) \times (7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1)
= (9 \times 8) \times (7!).$$
Hence, $9 \times 8 = 9!/7!$. In still other words,
$$9 \times 8 = (9 \times 8)(7!)/(7!) = 9!/7!$$
Example 4
We wish to form four-digit numbers from the digits 1, 2, 3, 4, 5, 6, 7, 8, and
9 with no digit being repeated in a given number. We also want the
number to be greater than 6000. How many such numbers are there?
SOLUTION: This is quite similar to the other examples except that
looking at the four spaces now, we observe that the first (counting from left
to right) digit cannot be 1, 2, 3, 4, or 5, since in terms of place-value such
digits would guarantee that the number could not exceed 6000. Thus, we
have four first choices: 6, 7, 8, or 9. Once we have made the first choice,
and since there can be no repetitions, the second choice can be made in
eight ways. That is, as long as the first digit is at least as great as 6, there
are no further restrictions on how to choose the digits. The third digit
can be chosen in seven ways, and the fourth in six. Thus, the diagram
would be
4 8 7 6
and we could obtain $4 \times 8 \times 7 \times 6 = 1344$ such numbers. We could
check this result by actually writing all such numbers and performing a
count but such a chore might prove difficult and uninspiring. Frequently
enough wrinkles are thrown in so that the problems become even more
difficult to handle neatly. However, the principle always remains the
same, even though the techniques may become more trying (and it will
be more trying for some than for others!). To illustrate our last remark,
consider the next example.
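Before moving on, we note that the "difficult and uninspiring" count for Example 4 is quickly done by machine. The Python sketch below is our own illustration: it lists every four-digit number with distinct digits drawn from 1 through 9 and keeps those exceeding 6000.

```python
from itertools import permutations

def four_digit_numbers():
    """Yield every four-digit number whose digits are distinct and taken from 1-9."""
    for a, b, c, d in permutations(range(1, 10), 4):
        yield 1000 * a + 100 * b + 10 * c + d

# Keep only the numbers required by Example 4.
above_6000 = [n for n in four_digit_numbers() if n > 6000]
count = len(above_6000)
```

The list contains 1344 numbers, in agreement with $4 \times 8 \times 7 \times 6$.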
Example 5
All the requirements of this problem are the same as those in Example 4
except that we also want the number to be even.
SOLUTION: We still have four spaces; the second and third spaces can be
filled in by any remaining digits after the first and fourth spaces are filled
in. The first space can still be filled in by 6, 7, 8, or 9; but the choice for
the fourth space is now affected by the choice for the first space. Namely,
since the number is to be even, the last space must be filled in by 2, 4, 6, or 8.
If we choose either 7 or 9 for the first space, then there can be four numbers
(2, 4, 6, and 8) in the fourth space. However, if we choose either 6 or 8
for the first space, then whichever we choose cannot be used in the fourth
space since we do not allow repetitions. This forces us to consider two
mutually exclusive cases, one of which must happen.
Case 1: The first digit is either 7 or 9. In this event we may choose the
first digit in two ways and the fourth in four. (Notice that we fill in the
difficult spaces first and leave the straightforward ones for last.) This
leaves us with seven digits, any of which may be used for the second space,
while any one of the remaining six can be used for the third space. Thus,
2 7 6 4.
Hence, there are $2 \times 7 \times 6 \times 4 = 336$ of these required numbers.
Case 2: The first digit is either 6 or 8. In this event everything is as
before except that once we fill in the first space there are only three ways to
fill in the fourth. Thus,
2 7 6 3.
Hence, there are $2 \times 7 \times 6 \times 3 = 252$ solutions for this type of case.
Since Case 1 or Case 2 must occur and since they cannot occur at the
same time, the total number of ways that either Case 1 or Case 2 can occur
is the sum of 336 and 252, or 588, numbers in all.
Also notice th a t we used two different types of counting recipes in this
problem: the Cartesian product idea as we considered the two separate
cases, and the right to add these two results, which follows from the recipe
iV (4 U B) = N(A) + N(B) — N( A C\ B), where here A is the set of
solutions obtained in Case 1 and B is the set of solutions corresponding
to Case 2. Since A C\ B = <f>we merely have to add the number of ele
ments in each set.
Also notice th a t these recipes, while perhaps not intuitively easy to
use, help us avoid false answers, such as the urge to say th a t since there are
as many even numbers as there are odd, the answer to Example 5 should
be half the answer to Example 4.
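A brute-force count confirms both cases of Example 5 and shows that the answer is not half of 1344. This is a sketch only; the helper `valid` and the list names are our own labels.

```python
from itertools import permutations

def valid(p):
    # p holds four distinct digits; the number must exceed 6000 and be even.
    return p[0] >= 6 and p[3] % 2 == 0

arrangements = [p for p in permutations(range(1, 10), 4) if valid(p)]

# Split by the first digit, exactly as in Case 1 and Case 2.
case1 = [p for p in arrangements if p[0] in (7, 9)]
case2 = [p for p in arrangements if p[0] in (6, 8)]

print(len(case1), len(case2), len(arrangements))
print(1344 // 2)  # the tempting but false "half of Example 4" answer
```

Because the two cases share no arrangement, their counts simply add.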
Example 6
We wish to determine the number of nine-digit numbers that can be formed
by the permutations of the digits 1, 2, 3, 4, 5, 6, 7, 8, and 9, subject to the
condition that the first six places be a permutation of 1, 2, 3, 4, 5, and 6.
solution: We observe that the first six places can be filled in 6! ways
while the last three places can be filled in 3! ways. Thus, the answer is
6! X 3! = 720 X 6, or 4320, such numbers. Since the digits 1, 2, 3, 4, 5,
and 6 must occupy the first six places, the first place can be filled in by any
one of six choices; the second by five, the third by four, and so on. Then,
since the last three spaces can use only 7, 8, or 9, we have but three choices
for the seventh position, two for the eighth, and one for the ninth. Thus,
the diagram appears as
6 5 4 3 2 1 3 2 1.
With these six examples as an introductory background, let us now
hasten to point out that this concept is not restricted to sets of numbers.
For example, the technique of Example 6 can be applied exactly as is to a
physical situation. Suppose that a baseball team has six men whom the
coach wishes to occupy the first six positions in the batting order, and that
once he decides this he does not care about the specific order in which the
six men bat. He can invent 4320 batting orders, just as in the last example;
for, while the physical picture has changed, we are still involved with
6 5 4 3 2 1 3 2 1.
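The count 6! X 3! can also be checked by examining all 9! arrangements of the nine digits and keeping those whose first six places use only 1 through 6. The sketch below is illustrative; the variable names are ours.

```python
from itertools import permutations
from math import factorial

first_six = set("123456")

# Examine all 9! = 362,880 arrangements and keep those whose first
# six places are filled by the digits 1 through 6 in some order.
count = sum(1 for p in permutations("123456789") if set(p[:6]) == first_six)

print(count)
print(factorial(6) * factorial(3))
```

The same count applies to the batting-order version, since only the labels on the nine objects have changed.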
Example 7
Five people are to be seated in five chairs placed in a row. How many
different seating arrangements are possible?
solution: The answer is 5!, or 120. This is precisely the same problem,
but in a different environment, as the one that asks how many five-digit
numbers can be formed by a permutation of the digits 1, 2, 3, 4, and 5.
We see that the first chair can be occupied in five different ways; once the
first is occupied, the second can be occupied in four different ways, and
so on.
5 4 3 2 1
We can make this more difficult by adding the condition that two particular
people refuse to sit next to each other.
Example 8
How many seating arrangements are possible if five people are to occupy
five seats arranged in a row but two particular men refuse to sit next to one
another?
solution: Let us call the men who refuse to sit next to one another A and
B. One approach is to observe that if we agree to seat A first the choices
for seating B depend upon whether A is placed in an end chair. For
example, if A sits in the first chair then B can sit anywhere but the second;
and if A is seated in the fifth chair then B can sit anywhere except the fourth
chair. On the other hand, if A is seated in either the second, third, or
fourth chair, B is excluded from sitting in two places: immediately to
the left and immediately to the right of A. This indicates that we should
again consider two mutually exclusive cases.
Case 1: A sits in an end seat.
In this event there are two ways of seating A since there are two end
chairs. Once A is seated B can be seated in three ways. Namely, he
cannot occupy the same chair as A and he cannot occupy the vacant chair
on one side of A (there is no chair on the other side since A is occupying
an end seat). We call the other people C, D, and E. Since there are no
restrictions on C, D, and E, we see that once A and B are seated, C can
be seated in any one of the remaining three chairs, D in any one of the
remaining two chairs, and E in the one remaining chair. Pictorially,
2 3 3 2 1
A B C D E
Hence, there are 2 X 3 X 3 X 2 X 1 = 36 seating arrangements for
Case 1.
Case 2: If we let Case 2 denote the situation in which A does not occupy
an end seat, everything else remains the same except now once A is seated
B can be seated in only two ways rather than three, for now there are two
chairs next to A. Moreover, since there are three non-end seats A can
be seated in three ways rather than in two. Pictorially,
3 2 3 2 1
A B C D E
and we see that here, too, 36 seating arrangements satisfy the conditions
of Case 2. Hence, the answer to the problem is that 72 such seating
arrangements exist. Moreover, since there are 120 seating arrangements
in all and since A and B either do or do not sit next to each other, this must
mean that there are 120 - 72 = 48 ways in which to seat the people so
that A and B are next to each other.
As a concluding remark to this problem, observe that it makes a
difference whether the chairs are in a row or in a circle, for in a circle there
is no end seat. Hence, A can be seated in any of the five chairs; then if
B is not to sit next to A, he can sit in just two chairs, and we see
5 2 3 2 1
A B C D E
Thus, there would be 5 X 2 X 3 X 2 X 1 = 60 rather than 72 ways for
the people to be seated with A and B separated. The difference of 12
occurs because if A and B occupy end seats when the chairs are arranged
in a row, these chairs become adjacent when pulled into a circle. These
are the 12 ways in which A and B can occupy end seats:
A C D E B
A C E D B
A E D C B
A E C D B
A D E C B
A D C E B
and then six more with the positions of A and B reversed.
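The row and circle counts of Example 8 can both be checked by listing all 120 seatings. The sketch below assumes, as the example does, that the five chairs are distinguishable even when arranged in a circle; the function name `separated` is our own.

```python
from itertools import permutations

people = "ABCDE"

def separated(seating, circular):
    # seating[i] is the person in chair i; chairs are numbered 0 to 4.
    gap = abs(seating.index("A") - seating.index("B"))
    if circular:
        # In a circle, chairs 0 and 4 are also neighbors.
        return gap not in (1, len(seating) - 1)
    return gap != 1

row = sum(1 for s in permutations(people) if separated(s, circular=False))
circle = sum(1 for s in permutations(people) if separated(s, circular=True))

print(row, circle, row - circle)
```

The difference between the two counts is exactly the number of seatings with A and B in the two end chairs.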
All these examples point up that we need to find clever ways of counting
to handle certain types of situations that would be too cumbersome to
handle by brute force. While these clever techniques could have been
(and have been) developed without the use of Cartesian products, notice
how elegantly this concept unifies what appear to be many diverse types
of counting situations. This only emphasizes the idea that a proper study
and knowledge of sets helps us learn many mathematical skills meaningfully.
In the next section we shall continue this discussion but focus our
attention on counting events in which the order of happening is not
important. This study comes under the heading of combinations rather than
permutations. Briefly summarized, permutations are used when the
order of events is important and combinations are used when the order
is unimportant; but more about this in the next section.
Exercises
1. A restaurant has a menu consisting of five sandwiches and four
beverages. How many different lunches can be made up if each lunch
consists of a sandwich and a beverage?
2. Using the same conditions as in Exercise 1, how many lunches can
be made up if a lunch consists of either a beverage or a sandwich, but
not both?
3. Express each of the following as an equivalent place-value numeral.
(a) 5!/4! (b) 5!/1! (c) 5!/0! (d) 8!/5! (e) 7!/(5!2!)
4. We are given the anagram ARECUS and told to rearrange the letters
to form a six-letter word. If we decide to proceed by trial-and-error
and list all possible arrangements of the six letters, how many entries
will be on our list? Are any of these entries actual words?
5. In how many ways can six people be seated around a circular table?
(In questions of this type, it is usually understood that two seating
arrangements are different only if at least one person has a different
person next to him in the two arrangements.)
6. In how many ways can six people be seated if the chairs are arranged
in a straight line?
7. How can we account for the difference between the two answers
obtained in Exercise 5 and Exercise 6?
8. We are to form a four-digit numeral from the digits 1, 3, 5, and 6
with the understanding that no digit may occur more than once in
any numeral.
(a) How many such numerals can we form?
(b) How many numerals can we form if it is required that the resulting
number be greater than 5000?
(c) How many numbers can we form if the number is required to be
even?
(d) How many numbers can we form if the number must be even and
greater than 5000? List these numbers.
(e) Do parts (a), (b), (c), and (d) under the assumption that a digit
may be repeated as often as we wish in a given numeral.
9. Each person in a group is asked to pick a three-letter monogram as a
code name. The letters may be repeated. For example, one might
choose SAP and another might choose SSP. How many people must
be in the group before we can be positive that at least two of the people
chose the same monogram? Explain.
10. In a certain club 20 members are eligible to hold office. The offices
consist of a president, a vice-president, a secretary, and a treasurer.
If no person can hold more than one office, in how many ways can the
club’s officers be chosen?
11. In another club 20 members are eligible to fill four vacancies in the
board of directors. In how many different ways can these vacancies
be filled?
12. What is the basic conceptual difference between the situations
depicted in Exercise 10 and Exercise 11?
13. Four different math books and three different history books are to
be placed on a shelf in the bookcase. How many different shelving
arrangements are possible if
(a) the seven books may be placed in any order;
(b) the four math books must be together;
(c) the four math books must be together and the three history books
must be together;
(d) the books must alternate in the order: math, history, math,
history, math, history, math.
2.15 COMBINATIONS
There are some counting problems in which the correct answer depends
on the order in which certain events are performed, and other problems in
which the order makes no difference at all. For example, consider the
following problem.
There are ten men in a room. Each man shakes hands once with every other man
in the room. How many handshakes are made?
In answering this problem we might tend to proceed by saying that
one of the men can be any one of the ten, while the other can be any one
of the remaining nine. Using our previous theory, we might then conclude
that there are 10 X 9 = 90 handshakes. However, this answer is twice as
great as the correct one, for in this problem it makes no difference whether
A shakes hands with B, or B shakes hands with A. In other words, the
above procedure counts the same handshake twice: once with one member
of the pair listed first, and once with this same member listed last. The
correct answer here is 45, not 90, since order is unimportant.
Of course, we could do this problem by actually counting the number
of handshakes, case by case. Tedious as this might be, it is worth doing
if only to emphasize the cleverness and compactness of “the other way.”
To this end, let us name our ten men for purposes of identification: 0, 1, 2,
3, 4, 5, 6, 7, 8, and 9.
Then the answer 90 corresponds to the following pairs of handshakes.
01 02 03 04 05 06 07 08 09
10 12 13 14 15 16 17 18 19
20 21 23 24 25 26 27 28 29
30 31 32 34 35 36 37 38 39
40 41 42 43 45 46 47 48 49
50 51 52 53 54 56 57 58 59
60 61 62 63 64 65 67 68 69
70 71 72 73 74 75 76 78 79
80 81 82 83 84 85 86 87 89
90 91 92 93 94 95 96 97 98
In this setup the first digit names the first man, while the second digit
names the second man. No repeated digit exists since a man does not
shake hands with himself. Notice, however, that each number below the
diagonal running from 01 down to 89 names a handshake already described
by a number on or above the diagonal. Hence, the 90 entries name only 45
different handshakes.
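The double counting in the handshake problem is easy to see by machine. In this sketch (names ours), ordered pairs of men are collapsed into unordered pairs by storing each as a set.

```python
from itertools import combinations, permutations

men = range(10)  # the ten men, named 0 through 9

# Ordered pairs: "A shakes with B" and "B shakes with A" are listed separately.
ordered = list(permutations(men, 2))

# Forgetting the order merges each such pair of entries into one handshake.
distinct = {frozenset(pair) for pair in ordered}

# Unordered pairs generated directly.
handshakes = list(combinations(men, 2))

print(len(ordered), len(distinct), len(handshakes))
```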
To illustrate this claim in other ways, consider the situation in which
the numerals 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9 are placed in a bag. We are
to draw one of these from the bag. Then, without replacing the first
one, we are to draw another from the bag. Does the answer depend on
the order in which the digits are chosen? Obviously, before we can decide
anything about an answer, we must first know the question; and our claim
is that, depending on the specific question, the answer may or may not
depend on the order. For example, suppose that the two digits drawn
are to form a two-digit number with the first-drawn digit representing
tens and the second, ones. Then it makes a difference if we first choose
9 and then 1, or first 1 and then 9; for in the first case the number would be
91 and in the second, 19. Thus, if we viewed this problem as a game in
which the winner was the person who formed the greatest two-digit number,
the person who drew 9 and then 1 would beat the person who drew 1 and
then 9, even though both drew the same digits.
On the other hand, suppose conditions remained the same but that the
winner was now the man whose two digits represented the greatest sum.
Then the man who drew 9 and then 1 would receive 10 as his score since
9 + 1 = 10, while the man who chose 1 and then 9 would also receive
10 as a score since 1 + 9 = 10. In this case, then, the score would depend
only on the digits chosen, not on the order in which they were chosen.
With regard to this problem, if we restrict attention to the nine nonzero
digits 1 through 9, the number of ways of selecting the two digits is
9 X 8 = 72 if order is important, and half this amount, or 36, if
the order is unimportant. Specifically, these 36 are represented by the
following pairs if order is irrelevant.
12 13 14 15 16 17 18 19
23 24 25 26 27 28 29
34 35 36 37 38 39
45 46 47 48 49
56 57 58 59
67 68 69
78 79
89
If order is important, the other 36 pairs come from reversing the order
of the digits in the above 36 pairs.
While the listing may be long and drawn out, once done it affords us
the opportunity to make another interesting observation concerning the
difference between numbers and numerals. Namely, suppose the object
were to add the two numbers chosen. Then, as we have seen, 36 different
combinations of digits concern us. However, only 15 different scores
can be made. That is, the lowest score is 3, which results when 1 and 2
are chosen; the highest score is 17, which results when 8 and 9 are chosen;
and any natural number between 3 and 17 can be obtained as a score.
Thus, the scores are 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, and 17.
However, 3, 4, 16, and 17 can be obtained in only one way each while
10, for instance, can be scored by four different combinations: 1 and 9, 2
and 8, 3 and 7, 4 and 6. This, then, is an example that corresponds to a
situation in which 36 numerals represent 15 different numbers. The same
type of situation is involved in rolling an ordinary pair of dice. Namely,
there are 6 X 6 = 36 ways in which the dice may turn up. The following
is a listing of these ways.
1-1 1-2 1-3 1-4 1-5 1-6

2-1 2-2 2-3 2-4 2-5 2-6
3-1 3-2 3-3 3-4 3-5 3-6
4-1 4-2 4-3 4-4 4-5 4-6
5-1 5-2 5-3 5-4 5-5 5-6
6-1 6-2 6-3 6-4 6-5 6-6
The first digit corresponds to the first die, and the second digit to the
second die. We allow repeated digits since each die can turn up the same
face value; and if you feel that, say 2-4 and 4-2 are the same thing, imagine
the two dice to be colored differently. Certainly a red 4 and a green 2
denote a different toss than a red 2 and a green 4.
Thus, there are 36 ways in which the dice can turn up. Yet there are
only 11 different numbers that the dice can name: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
and 12.
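Both "many numerals, few numbers" counts can be tallied in a few lines. In this sketch we assume the drawn digits are 1 through 9, as in the 36-pair listing; the names `scores` and `totals` are ours.

```python
from collections import Counter
from itertools import combinations, product

# Unordered pairs of distinct digits from 1 through 9, scored by their sum.
pairs = list(combinations(range(1, 10), 2))
scores = Counter(a + b for a, b in pairs)

# Ordered tosses of two distinguishable dice, scored by their sum.
tosses = list(product(range(1, 7), repeat=2))
totals = Counter(a + b for a, b in tosses)

print(len(pairs), len(scores), scores[10])
print(len(tosses), len(totals))
```

A `Counter` records how many ways each score arises, so `scores[10]` is the number of digit pairs that add to 10.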
The word “permutation” indicates a situation in which order is
important. If the order is not important we use the term “combination.”
More specifically, if r and n are whole numbers and r ≤ n, then P(n,r)
denotes the number of ways in which r objects can be chosen from a set of
n objects if order is important and if the object chosen is not replaced
before each new choice. For example, P(5,3) means the number of ways
we can choose three objects from a set of five if different arrangements
mean different choices. We have already seen that there are 5 X 4 X 3 =
60 such choices. Thus, P(5,3) = 60. If we view the set as being
composed of the digits 1, 2, 3, 4, and 5, and if we view our choice as being a
three-digit numeral, then our 60 choices are the following.
123 124 125 134 135 145 234 235 245 345
132 142 152 143 153 154 243 253 254 354
213 214 215 314 315 451 324 325 425 435
231 241 251 341 351 415 342 352 452 453
312 412 521 413 513 514 423 523 524 534
321 421 512 431 531 541 432 532 542 543
In a similar way, we define C(n,r) to mean the number of ways of
choosing r elements from a set containing n elements without replacement if
order is unimportant. Thus, a glance at the above chart indicates C(5,3) =
10. In fact, the different combinations are given by the ten three-digit
numbers along the top row of the chart; the six members of each column
are all permutations of the top entry. In other words, in this illustration
we have exactly six times as many permutations as we have combinations;
and it is more than a coincidence that 6 = 3! It should not be too difficult
now to see that when we choose r elements from a collection of n, there
are exactly r! times as many permutations as combinations, since any
set of r elements can be arranged in r! different orders, even though the
elements form only one combination.
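The column structure of the chart, with six permutations under each combination, can be reproduced by grouping the permutations according to the set of digits they use. This is a sketch; `groups` is our own name.

```python
from itertools import combinations, permutations
from math import factorial

items = [1, 2, 3, 4, 5]
perms = list(permutations(items, 3))   # order matters
combs = list(combinations(items, 3))   # order ignored

# Collect the permutations under the set of digits each one uses;
# every group corresponds to one column of the chart.
groups = {}
for p in perms:
    groups.setdefault(frozenset(p), []).append(p)

print(len(perms), len(combs))
print({len(g) for g in groups.values()}, factorial(3))
```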
As a second example, P(9,4) denotes the number of ways that four
elements can be chosen from a collection of nine without replacement if
order is important. Thus,
P(9,4) = 9 X 8 X 7 X 6 = 3024.
On the other hand, C(9,4) must be 3024/4! = 3024/24 = 126. This
follows from the fact that each combination of the required type induces
4! permutations. For example, if one combination were labeled 1234, it
would induce these 24 permutations.
1234 1243 1423 4123
1324 1342 1432 4132
2134 2143 2413 4213
2314 2341 2431 4231
3124 3142 3412 4312
3214 3241 3421 4321
If we wish to generalize the results in terms of r and n we might proceed
as follows.
P(n,r) = n X (n - 1) X (n - 2) X ... X (n - [r - 1])    (r factors)
       = n(n - 1)(n - 2) ... (n - r + 1).
If we now multiply and divide this result by (n - r)!, we see the following.
P(n,r) = n(n - 1)(n - 2) ... (n - r + 1)(n - r)!/(n - r)!
       = n!/(n - r)!                                      (1)
It is usually a great help to work the results through by first choosing
specific values for r and n. For example, if we refer to P(9,4), we have
already seen
P(9,4) = 9 X 8 X 7 X 6.
We could then, if we wished, multiply and divide by 5! This would
convert the numerator 9 X 8 X 7 X 6 X 5! into 9! That is,
P(9,4) = 9 X 8 X 7 X 6 X 5!/5! = 9!/5!
Then, observe that in this case n = 9 and r = 4; thus, n - r = 5, which
agrees with the formula expressed in Equation (1).
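The two forms of P(n,r), the product of r factors and the quotient of factorials, can be compared directly. In this sketch the function `P` is our own helper; `math.perm` is the standard-library equivalent in Python 3.8 and later.

```python
from math import factorial, perm

def P(n, r):
    # Product of the r factors n, n - 1, ..., n - r + 1.
    result = 1
    for k in range(r):
        result *= n - k
    return result

print(P(9, 4))                        # 9 X 8 X 7 X 6
print(factorial(9) // factorial(5))   # 9!/5!, as in Equation (1)
print(perm(9, 4))                     # library value
```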
However, while Equation (1) is both general and compact, do not feel
compelled to memorize this formula. While there is no harm in retaining
the formula and using it to save time, it is very important to understand
the derivation of the formula so that (1) we can again derive the result
in the event we forget it and (2) we can generalize the results in a situation
where the formula has to be modified before it can be applied. We shall
say more of this later. To continue, in terms of our previous remarks,
Equation (1) almost immediately leads to Equation (2).
C(n,r) = [n!/(n - r)!] ÷ r!
       = n!/[r!(n - r)!]                                  (2)
By way of illustrating Equation (2) with n = 7 and r = 5, we have
C(7,5) = 7!/(2!5!)
       = (7 X 6 X 5 X 4 X 3 X 2 X 1)/[(1 X 2) X (1 X 2 X 3 X 4 X 5)]
or, after canceling common factors, C(7,5) = 7 X 3 = 21. Notice that we
never had to invoke Equation (2) directly because we already knew from
our knowledge of Cartesian products that
P(7,5) = 7 X 6 X 5 X 4 X 3.
We also know in this case that since five elements are involved in a
particular selection, there are 5! times as many permutations as there are
combinations. Thus, since division is the inverse of multiplication we have
C(7,5) = (7 X 6 X 5 X 4 X 3)/(5 X 4 X 3 X 2 X 1) = 21.
We are not claiming that it is easier not to use the recipe; obviously, it
is always quicker to just “plug in” numbers. But we wish to emphasize
that if we are thoughtful it is usually no great tragedy if we forget the
formula.
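The "no formula needed" route to C(n,r), counting the permutations and then dividing by the r! orderings of each selection, can be written out as a short sketch. The helper `C` is our own; `math.comb` is the standard-library value used for comparison.

```python
from math import comb, factorial

def C(n, r):
    # Count the ordered selections first: n(n - 1)...(n - r + 1).
    ordered = 1
    for k in range(r):
        ordered *= n - k
    # Each unordered selection was counted r! times, so divide by r!.
    return ordered // factorial(r)

print(C(7, 5), comb(7, 5))
print(C(9, 4), comb(9, 4))
```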
Also keep in mind that if we wished to compute C(7,5) directly we
could denote the seven elements by 1, 2, 3, 4, 5, 6, and 7, whence we could
obtain
12 13 14 15 16 17
23 24 25 26 27
34 35 36 37
45 46 47
56 57
67
It is apparent that the above chart is a solution for counting the number
involved in C(7,2). Now observe that Equation (2) remains the same if we
reverse the roles of r and n - r.
C(n,r) = C(n,n - r).                                      (3)
In terms of the illustration Equation (3) says
C(7,5) = C(7,2).
The reason for this is quite simple if we understand basic concepts.
Recall that every time we choose a subset of a set we really determine two
subsets; namely, the subset and its complement. Thus, whenever we
choose a subset of five elements from a set of seven elements we also
determine its complement which has two elements. Referring to the chart, we
see that 12 also determines 34567 since these elements form the complement
of the given subset. In general, we can replace each element of the chart
by its complement, and obtain the following chart.
34567 24567 23567 23467 23457 23456
14567 13567 13467 13457 13456
12567 12467 12457 12456
12367 12357 12356
12347 12346
12345
Another way of looking at this is to observe that if seven coins are tossed
there are the same number of ways of obtaining exactly five heads as there
are ways of obtaining exactly five tails, since heads or tails occur with
equal likelihood. Yet whenever five tails occur there are exactly two
heads since each coin must be either heads or tails. Thus, there are the
same number of ways of obtaining five heads as there are ways of obtaining
two heads when seven coins are tossed.
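The pairing of each two-element subset with its five-element complement can be carried out explicitly. The sketch below (names ours) checks that the complements are all different and that together they are exactly the five-element subsets.

```python
from itertools import combinations

S = frozenset(range(1, 8))  # the set {1, 2, 3, 4, 5, 6, 7}

pairs = [frozenset(c) for c in combinations(range(1, 8), 2)]
complements = [S - p for p in pairs]

fives = {frozenset(c) for c in combinations(range(1, 8), 5)}

print(len(pairs), len(set(complements)), len(fives))
print(set(complements) == fives)
```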
A final observation based on Equation (2): let us look at another reason
for defining 0! to be 1. Observe that by definition C(5,5) must equal 1
since there is only one way of choosing a subset containing five elements
from a set of five elements. On the other hand, by Equation (2) we see
that
C(5,5) = 5!/(0!5!) = 1/0!
This implies that 1 = 1/0! if we want Equation (2) to remain correct,
and this leads to the fact that we must define 0! = 1.
By way of further strengthening these new concepts, let us view the
subsets of a set having five elements. To this end let S = {1,2,3,4,5}.
We already know that S has 2^5, or 32, subsets. Moreover, a subset must
have either 0, 1, 2, 3, 4, or 5 elements. By the new definitions C(5,0)
denotes the number of ways in which we may choose no elements from the
five; C(5,1) denotes the number of ways in which we may choose one from
the five; C(5,2) denotes the number of ways in which we may choose two
elements; C(5,3), the number of ways to choose three; C(5,4), the number
of ways to choose four; and C(5,5), the number of ways to choose five. We
do not use permutations because we have already agreed that a set
depends only on its members, not the order in which the members are listed.
Now Equation (2) yields
C(5,0) = 5!/(0!5!) = 1
C(5,1) = 5!/(1!4!) = 5
C(5,2) = 5!/(2!3!) = (5 X 4 X 3 X 2 X 1)/[(1 X 2) X (1 X 2 X 3)] = 10
C(5,3) = C(5,2) = 10
C(5,4) = C(5,1) = 5
C(5,5) = C(5,0) = 1.
Thus, without having to perform the actual count, subset by subset, we
see that S has
1 subset with 0 members,
5 subsets with 1 member,
10 subsets with 2 members,
10 subsets with 3 members,
5 subsets with 4 members,
1 subset with 5 members,
and this accounts for the 32 subsets.
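The tally of subsets by size can be confirmed both from the formula and by actually listing the subsets. A brief sketch, with names of our own choosing:

```python
from itertools import combinations
from math import comb

S = [1, 2, 3, 4, 5]

# Number of subsets of each size, from the formula and by listing them.
by_formula = [comb(5, k) for k in range(6)]
by_listing = [len(list(combinations(S, k))) for k in range(6)]

print(by_formula)
print(by_listing)
print(sum(by_formula), 2 ** 5)
```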

Let us now return to an earlier remark in which we implied that
sometimes we had to be careful when using a formula and that mere
memorization of the recipes might not prove too helpful. We shall illustrate this
idea in terms of a problem. Consider the word “bark.” If we form
permutations of these four letters, we obtain 24 different arrangements.
bark bakr bkar kbar
brak brka bkra kbra
arbk arkb akrb karb
abrk abkr akbr kabr
rabk rakb rkab krab
rbak rbka rkba krba
This is in accord with the fact that there are 4! permutations of four
elements. In terms of a place diagram we have
4 3 2 1.
However, let us now consider the word “noon.” In this case there
seem to be only six permutations rather than twenty-four; namely, noon,
nono, nnoo, onno, onon, and oonn.
The point here is that, as far as a word is concerned, we do not distinguish
between permutations of multiple letters. For example, in our case we
could artificially distinguish between the two o’s by writing them as
o1 and o2 to indicate the first o and the second o (or we could imagine
them as having different colors). In a similar way we could write n1 and
n2 to distinguish between the two n’s. In this way, n1 o1 o2 n2,
n1 o2 o1 n2, n2 o1 o2 n1, and n2 o2 o1 n1 are four different permutations;
yet they are read as precisely the same word. That is, in terms of the
place diagram we may place the two n’s in any two of the four places
without regard to order. Thus, the n’s may be placed in C(4,2) = 6
different ways. Once the n’s are placed, the two o’s can be placed in only
one way; namely, they must fill the two spaces that remain. This accounts
for the fact that we have six answers rather than 24. We would not be able
to use the recipes in either Equation (1) or Equation (2) as they now stand.
Rather, we would begin by writing 4! which would be correct if the letters
were all different. Then we would divide by 2! to indicate that if we leave
the letters in place and just permute the o’s, we do not get anything new.
Then we must divide by 2! again to indicate the same thing about the n’s.
Thus, we obtain our answer:
4!/(2!2!) = 6.
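The collapse from 24 arrangements to 6 distinct words can be observed by forming all 4! arrangements of the letters of “noon” and discarding duplicates. A sketch, with names of our own:

```python
from itertools import permutations
from math import comb, factorial

# All 4! arrangements, treating the repeated letters as distinguishable.
arrangements = ["".join(p) for p in permutations("noon")]

# Reading them as words merges arrangements that look alike.
words = sorted(set(arrangements))

print(len(arrangements), len(words))
print(words)
print(factorial(4) // (factorial(2) * factorial(2)), comb(4, 2))
```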
As a more difficult example along these same lines, consider the word
“mississippi.” Here we have 1 m, 4 i’s, 4 s’s, and 2 p’s. Again imagine
that we are filling in blanks. There would be 11 blanks, and the m could
be placed in any one of these. Thus, there are C(11,1) ways of placing
the m. Notice that we use C rather than P because if we rearrange the
duplicate letters we do not change the word we see. Once this is done there
are 10 spaces left and the four i’s may be placed in any four of these ten
spaces. That is, we may place the four i’s in the remaining spaces in C(10,4)
different ways. With the m and the i’s in place there are six places left,
and the four s’s may be placed in any four of these six places. Thus, we may
place the s’s in C(6,4) different ways. We are then left with our two p’s
and just two open places. This means that the p’s may now be placed in
C(2,2) different ways. (Observe that C(2,2) = 1, and it is not surprising
that with two like objects left and only two spaces to be filled there is only
one way of doing this. We preferred writing C(2,2) rather than 1 to
emphasize the technique being used.) To go on, since the groups of letters
may be placed without restriction once the previous letters are in position,
we see from our knowledge of Cartesian products that the total number of
ways we can rearrange the letters is
C(11,1) X C(10,4) X C(6,4) X C(2,2).                      (4)
As a final point, let us note that Equation (4) once again emphasizes the
difference between overall knowledge and computational techniques.
That is, Equation (4) is the correct answer, although in an unusual form.
Once we apply a recipe such as Equation (2) to Equation (4) we get the
equivalent but more familiar
11 X (10 X 9 X 8 X 7)/(1 X 2 X 3 X 4) X (6 X 5)/(2 X 1)   (5)
and this in turn leads to
11 X 210 X 15                                             (6)
or
34,650.                                                   (7)
Equation (7) represents the answer in its most familiar form, but notice
that Equation (4) is just as precise as Equation (7). The failure to separate
the computational know-how from the theory of solving the problem is
tantamount to the oft-quoted cliche of failing to see the forest because of
the trees.
Referring again to the “mississippi problem,” there was no rule saying
that we had to place the m first. For example, we might have elected to
place the four i’s first. This could have been done in C(11,4) different
ways. If we next placed the s’s, this could have been done in C(7,4)
different ways; the p’s could then have been placed in C(3,2) ways, and the
m in C(1,1) ways. Had this been the case, the answer to the problem would
have been
C(11,4) X C(7,4) X C(3,2) X C(1,1)
and it should not be surprising that this also yields 34,650 as the answer!
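Both orders of placing the letters of “mississippi” can be evaluated and compared with the divide-by-repeats recipe used for “noon”. This is a sketch under our own variable names.

```python
from collections import Counter
from math import comb, factorial

word = "mississippi"

# Place the m first, then the i's, the s's, and the p's.
m_first = comb(11, 1) * comb(10, 4) * comb(6, 4) * comb(2, 2)

# Place the i's first, then the s's, the p's, and the m.
i_first = comb(11, 4) * comb(7, 4) * comb(3, 2) * comb(1, 1)

# Start from 11! and divide by the factorial of each letter's repeat count.
by_factorials = factorial(len(word))
for repeats in Counter(word).values():
    by_factorials //= factorial(repeats)

print(m_first, i_first, by_factorials)
```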
We leave additional examples for the exercises. In the next section we
shall show how combinations may be used to help us derive and understand
certain important algebraic recipes. In particular we shall study the
binomial theorem.
Exercises
1. A poker hand consists of 5 cards without regard to order, dealt from a
deck that contains 52 cards.
(a) How many different poker hands are possible?
(b) How many of these ways consist of hands in which all five cards are
of the same suit?
(c) How many of these hands consist of four of a kind (four cards of the
same face value)?
2. A bridge hand consists of 13 cards dealt from a standard deck of 52
cards. In how many ways can a hand of each of the following types be
obtained?
(a) suits of 5, 4, 3, and 1 card
(b) suits of 4, 4, 3, and 2 cards
(c) 13 of the same suit
3. It can be shown that the number named by C(52,13) can be
approximated in place-value notation to be slightly larger than 6 X 10^11.
If we assume that on the average one bridge hand per second is played
in America around the clock, how likely does it seem that a bridge hand
consisting of 13 cards in the same suit will occur?
4. Three people approach a row of six chairs and decide to sit down at
random with no more than one person per chair.
(a) In how many ways can the people be seated?
(b) In how many ways can they be seated if they agree not to leave
empty seats between them?
(c) In how many ways can they be seated if they agree that there must
be at least one empty chair between each pair of people?
6. Construct an example that will explain why
C(12,3)C(9,5)C(4,4) = C(12,5)C(7,4)C(3,3)
without multiplying both sides of the expression to verify this result.
2.16 THE BINOMIAL THEOREM

We have seen that the study of Cartesian products (in particular the
number of elements in such a product) serves as a fine background for
introducing the computational aspects of permutations and combinations. We
have presented many illustrations of these computations in action, and in
the process we have introduced some symbolism. Keep in mind that such
symbolism as C(n,r) and n! is used for convenience and that it has no
bearing on the concepts of permutations and combinations. In fact, this might
be a good time to once again point out the difference between numbers and
numerals. C(5,2) names a number: the number of ways in which two
objects may be chosen from a set of five without regard to order. Our
computational techniques, with or without factorial notation, simply
provide us with a means for converting this rather implicit numeral into the
more explicit numeral 10. The illustrations and exercises of the last few
sections have helped to clarify this point.
In this section we shall show how the study of combinations can be
applied to the abstract game of arithmetic as well as to the more practical
examples of the previous sections. In particular we shall study the
binomial theorem.
To this end, let us observe that the rules of ordinary arithmetic allow us
to develop recipes for computing such expressions as
(a_1 + a_2 + a_3)(b_1 + b_2).
For example, the use of areas of rectangles shows that
(a_1 + a_2 + a_3)(b_1 + b_2) = a_1b_1 + a_1b_2 + a_2b_1 + a_2b_2 + a_3b_1 + a_3b_2.
(See Figure 2.51.) This means that the product may be written down
mechanically as the sum of all products of two numbers, one from each set of
parentheses. In the above illustration the first factor could be chosen in
any one of three ways since the first set of parentheses contains three terms;
in a similar way we see that the second factor could be chosen in two ways.
Thus, as we have learned from the discussion of the number of elements in a
Cartesian product, there should be 3 X 2 = 6 terms in the product, and
a simple check shows this to be the case. We should also observe that
while we used a particular illustration, the above remarks (other than for
the specific number involved) do not depend on the number of terms in
each of the pairs of parentheses.
[Figure 2.51: a rectangle with sides a_1 + a_2 + a_3 and b_1 + b_2, divided
into six smaller rectangles of areas a_1b_1, a_2b_1, a_3b_1, a_1b_2, a_2b_2,
and a_3b_2.]
This result extends to a product involving more than just two sets of
parentheses. For example, we claim that
(a_1 + a_2 + a_3)(b_1 + b_2)(c_1 + c_2)
= a_1b_1c_1 + a_1b_1c_2 + a_1b_2c_1 + a_1b_2c_2
+ a_2b_1c_1 + a_2b_1c_2 + a_2b_2c_1 + a_2b_2c_2
+ a_3b_1c_1 + a_3b_1c_2 + a_3b_2c_1 + a_3b_2c_2.
That is, the product has 12 terms (3 X 2 X 2 = 12) and each term
consists of the product of three numbers, one from each of the three sets of
parentheses. To prove this, one could think in terms of the volume of the
parallelepiped (this is the three-dimensional analogue of a rectangle) whose
sides were (a_1 + a_2 + a_3), (b_1 + b_2), and (c_1 + c_2) (see Figure 2.52).
However, such a procedure cannot be generalized if more than three sets of
parentheses are involved since we have no physical concept of geometry
of more than three dimensions.
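The claim that each term takes one number from each set of parentheses can be checked mechanically. The following is a minimal sketch, not part of the original text; the function name expand_terms is our own, and the symbols are handled as plain strings rather than as numbers.

```python
from itertools import product

def expand_terms(*factors):
    """Return every term of the product: one symbol from each set of parentheses."""
    # itertools.product forms the Cartesian product of the sets of symbols.
    return ["".join(choice) for choice in product(*factors)]

terms = expand_terms(["a1", "a2", "a3"], ["b1", "b2"], ["c1", "c2"])
print(len(terms))          # 3 X 2 X 2 = 12
print(" + ".join(terms))   # a1b1c1 + a1b1c2 + ... + a3b2c2
```

Nothing in the sketch depends on there being three sets of parentheses, which is the point the text makes about generalizing beyond the geometric picture.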
From a more abstract point of view we may form the product using two
factors at a time. That is,
(a_1 + a_2 + a_3)(b_1 + b_2)(c_1 + c_2)
= [(a_1 + a_2 + a_3)(b_1 + b_2)](c_1 + c_2)
= [a_1b_1 + a_1b_2 + a_2b_1 + a_2b_2 + a_3b_1 + a_3b_2](c_1 + c_2)
and this reduces to the case in which we have only two sets of parentheses
since, of course, such expressions as a_1b_1 are also numbers.
[Figure 2.52: a parallelepiped with sides a_1 + a_2 + a_3, b_1 + b_2, and
c_1 + c_2.]
Again, observe that this procedure did not depend on the number of
terms within each set of parentheses, and that we can continue the
procedure for any number of sets of parentheses merely by the repeated
application of the above technique. To simplify this point, observe that what
we are doing is essentially no different from the way in which we form a
sum, such as 1 + 2 + 3 + 4. In effect, we use the associative rule
repeatedly to obtain
1 + 2 + 3 + 4 = (1 + 2) + 3 + 4 = 3 + 3 + 4
= (3 + 3) + 4 = 6 + 4 = 10.
We are now ready to discuss the binomial theorem. Observe that we
have not restricted the number of terms within each set of parentheses,
nor have we insisted that the symbols a_1, a_2, a_3, b_1, and so on, all represent
different numbers. Let us focus attention on a special case. We shall
assume that each set of parentheses represents the sum of two particular
numbers, which we shall denote by a and b. In other words, we shall
be interested in such expressions as
(a + b)
(a + b)(a + b)
(a + b)(a + b)(a + b)
(a + b)(a + b)(a + b)(a + b)
or, taking advantage of exponent notation, we are interested in studying
(a + b)^n
where n represents any whole number.

We can apply the above discussion in the following manner.
Case 1: n = 0.
Then (a + b)^n = (a + b)^0 = 1, by definition of the exponent 0.
Case 2: n = 1.
Then (a + b)^n = (a + b)^1 = (a + b).
Case 3: n = 2.
Then (a + b)^n = (a + b)^2 = (a + b)(a + b) = a^2 + ab + ba + b^2
= a^2 + 2ab + b^2.
Case 4: n = 3.
Then (a + b)^n = (a + b)^3 = (a + b)^2(a + b)
= (a^2 + 2ab + b^2)(a + b)
= a^3 + a^2b + 2a^2b + 2ab^2 + ab^2 + b^3
= a^3 + 3a^2b + 3ab^2 + b^3.
Case 5: n = 4.
Then (a + b)^n = (a + b)^4 = (a + b)^3(a + b)^1
= (a^3 + 3a^2b + 3ab^2 + b^3)(a + b)
= a^4 + a^3b + 3a^3b + 3a^2b^2 + 3a^2b^2 + 3ab^3 + ab^3 + b^4
= a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4.
When we write a result such as (a + b)^4 = a^4 + 4a^3b + 6a^2b^2 +
4ab^3 + b^4, we say that we are expanding by the binomial theorem.
Had we wished to pursue the five cases further we would have obtained
the following.
n    (a + b)^n
0    1
1    a + b
2    a^2 + 2ab + b^2
3    a^3 + 3a^2b + 3ab^2 + b^3
4    a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4
5    a^5 + 5a^4b + 10a^3b^2 + 10a^2b^3 + 5ab^4 + b^5
6    a^6 + 6a^5b + 15a^4b^2 + 20a^3b^3 + 15a^2b^4 + 6ab^5 + b^6
7    a^7 + 7a^6b + 21a^5b^2 + 35a^4b^3 + 35a^3b^4 + 21a^2b^5 + 7ab^6 + b^7
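The laborious step-by-step multiplication that produced this chart can be delegated to a short routine. The sketch below is our own illustration, not the authors' method as printed; it stores only the coefficients, listed from the a^n term down to the b^n term, and the names times_a_plus_b and binomial_coefficients are hypothetical.

```python
def times_a_plus_b(coeffs):
    """Multiply an expansion (coefficients listed from a^n down to b^n) by (a + b)."""
    # Multiplying by a leaves each coefficient in its slot;
    # multiplying by b moves each coefficient one slot to the right.
    return [x + y for x, y in zip(coeffs + [0], [0] + coeffs)]

def binomial_coefficients(n):
    """Coefficients of (a + b)^n, found by n repeated multiplications."""
    coeffs = [1]                      # (a + b)^0 = 1
    for _ in range(n):
        coeffs = times_a_plus_b(coeffs)
    return coeffs

for n in range(8):
    print(n, binomial_coefficients(n))
```

The line printed for n = 4 is [1, 4, 6, 4, 1], in agreement with Case 5.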
While the above task was quite laborious, it has resulted in our collecting
a considerable amount of data to analyze. The above chart indicates some
interesting trends. (Observe that any results obtained from the chart
are trends and not theorems. That is, much as in a laboratory, unless
we show that something follows inescapably, all that we have is a
conjecture, a hunch. However, many times a theorem is born after much data
indicates that a trend may be more than just coincidence. We shall
illustrate this remark by the manner in which we evaluate the above
chart.) Among the “obvious” trends are:
(1) If we look down the first column of the chart on the side headed by
(a + b)^n, we observe that the coefficient in each case is 1.
(2) The coefficients in the second column are successively given by 1, 2,
3, 4, 5, 6, and 7; and this appears to indicate the set of natural numbers.
(3) The third column leads to the coefficients 1, 3, 6, 10, 15, and 21; and
this begins to indicate the set of triangular numbers (that is, 1, 1 + 2,
1 + 2 + 3, 1 + 2 + 3 + 4, and so on).
(4) The fourth column indicates the coefficients 1, 4, 10, 20, and 35; and
this set of numbers, while perhaps not as familiar to us as the others,
is indeed a well-defined set that seems to be formed by sums of
consecutive triangular numbers.
While our observations might not be too indicative of any practical
results, notice that pure mathematics is often born merely of our interest in
studying sequences of numbers such as these. Perhaps we can now combine
the above observations and begin to look for a more general trend.
Namely, the first column of coefficients consisted of the sequence 1, 1, 1,
1, 1, 1, 1, 1; and if we form consecutive sums of these elements we obtain:
1, 1 + 1, 1 + 1 + 1, . . . , or 1, 2, 3, 4, 5, 6, 7, . . . , and these turned
out to be the coefficients of the second column. In turn, these successive
sums lead to 1, 1 + 2, 1 + 2 + 3, . . . , or 1, 3, 6, 10, 15, 21; which were
the numbers making up the coefficients of the third column. These, in
turn, lead to the successive sums 1, 1 + 3, 1 + 3 + 6, . . . , or 1, 4, 10,
20, 35; which formed the coefficients of the next column. Continuing this
process, we would next obtain the sequence 1, 1 + 4, 1 + 4 + 10, 1 + 4 +
10 + 20, . . . , or 1, 5, 15, 35; and this yields the coefficients of the next
column. We have not proven any theorems yet, but things are starting
to look quite suspicious as we begin to sense that this procedure will
continue endlessly.
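The successive-sum pattern is easy to reproduce by machine. What follows is a minimal sketch of our own; it begins with a column of eight 1's (one for each n from 0 to 7) and forms running sums repeatedly. Because it never stops at n = 7, each later column carries a few more entries than the chart shows.

```python
from itertools import accumulate

column = [1] * 8                       # first column of coefficients: 1, 1, 1, ...
columns = [column]
for _ in range(4):
    column = list(accumulate(column))  # running sums of the previous column
    columns.append(column)

for number, col in enumerate(columns, start=1):
    print("column", number, col)
```

The third list begins 1, 3, 6, 10, 15, 21 and the fifth begins 1, 5, 15, 35, matching the sequences found above. This is still only evidence of a trend, not a proof.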
Moreover, if we were endowed with enough geometric and artistic
intuition, we could use the results discussed above to invent the chart in
Figure 2.53, which is known as Pascal's triangle. The arrows in the figure
show how the coefficients in the various columns of our chart have been
arrayed along the diagonals of the triangle.

                    1
                  1   1
                1   2   1
              1   3   3   1
            1   4   6   4   1
          1   5  10  10   5   1
        1   6  15  20  15   6   1
      1   7  21  35  35  21   7   1
    1   8  28  56  70  56  28   8   1

Figure 2.53

The triangle itself is constructed by beginning with 1 and then flanking
this by two more 1’s on the next line. We then form the next line by placing
the sum of two consecutive numbers of the line above between the two
numbers forming the sum (see the dotted triangles in the figure to get the
idea; for instance, the 6 and the 4 of one line produce the 10 of the next)
and then flanking the line with two more 1’s, and so on.
Looking at the triangle, we might next observe that the sum of the
numbers comprising each line is a power of two. That is,
1 = 1 = 2^0
1 + 1 = 2 = 2^1
1 + 2 + 1 = 4 = 2^2
1 + 3 + 3 + 1 = 8 = 2^3, and so on.
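The construction rule for the triangle, together with the observation about powers of two, can be tried out as follows. This is a sketch under our own naming (pascal_rows is hypothetical); checking nine lines shows a trend and proves nothing.

```python
def pascal_rows(count):
    """Build the first `count` lines of Pascal's triangle."""
    rows = [[1]]
    while len(rows) < count:
        prev = rows[-1]
        # Each inner entry is the sum of two consecutive numbers of the line above.
        middle = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        rows.append([1] + middle + [1])   # flank the line with two more 1's
    return rows

for n, row in enumerate(pascal_rows(9)):
    print(row, "sum =", sum(row), "power of two:", sum(row) == 2 ** n)
```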
By now it should be clear that the study of the binomial theorem leads
to some interesting observations, and that we do not need any knowledge
of our previous study of combinations to work on this problem. However,
let us now begin to analyze the process of computing (a + b)^n, and see if
we can find some clues as to why our observations occurred and whether
they were coincidental or inescapable.
According to our earlier theory, when we form the product (a + b)^n
there should be 2^n terms, since there were two terms in each set of
parentheses and there are n sets of parentheses. By way of illustration, (a + b)^3
should have eight terms. From the chart, it seems that (a + b)^3 is
expressed only as the sum of four terms since we have written (a + b)^3 =
a^3 + 3a^2b + 3ab^2 + b^3, but this is only because certain of the eight terms
can be amalgamated into one. Indeed, in this case there are three terms
denoted by a^2b and three terms denoted by ab^2. It is in this sense that we
can begin to get a feeling for interpreting the coefficients, and see that we
arrive at our eight terms by taking the sum of the coefficients. This explains
at least the result that the sum of the numbers in any one line of Pascal’s
triangle must be a power of two.
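To watch the 2^n raw terms being amalgamated, we can list every way of picking an a or a b from each set of parentheses and then group the picks that name the same product. The sketch below is our own; collect_terms is a hypothetical name, and a term such as a^2b is written out as the string "aab".

```python
from collections import Counter
from itertools import product

def collect_terms(n):
    """Group the 2^n raw terms of (a + b)^n by the product they name."""
    raw_terms = product("ab", repeat=n)          # one letter from each set of parentheses
    # Sorting the letters makes 'aba', 'baa', and 'aab' all count as a^2b.
    return Counter("".join(sorted(term)) for term in raw_terms)

grouped = collect_terms(3)
print(grouped)                   # aaa: 1, aab: 3, abb: 3, bbb: 1
print(sum(grouped.values()))     # 8 raw terms in all
```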
We have also seen that each term in the product must consist of n factors,
one from each of the n sets of parentheses. Since the only terms in each
set of parentheses are one a and one b, the n factors must consist solely of
a’s and b’s. This explains another interesting property of the chart.
Notice that in the chart the expression (a + b)^n consists of terms of the
form a^r b^s, where r + s = n. For example, when n = 5 the chart shows
that (a + b)^5 consists of terms of the following type: a^5, a^4b, a^3b^2, a^2b^3, ab^4,
and b^5. Recalling that a = a^1, b = b^1, b^0 = 1, and a^0 = 1, we see that the
sum of the exponents of a and b in each term is 5.
The only question left to answer now, if we wish to have a completely
determined way to form our chart without recourse to the tedious
arithmetical techniques and without having to worry about whether our “trends”
are inescapable, is how the various coefficients must be formed.
For example, in the situation
(a + b)^4 = a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4
why is the coefficient of a^3b equal to 4, while the coefficient of a^4 is equal to 1?
This question is easy to answer in terms of combinations. Namely, we
form a term in the product by forming products consisting of one factor
from each of the four, in this case, sets of parentheses. Since there are but
four sets of parentheses, the only way we can choose a term containing
four factors of a is to choose four a’s from a possible maximum of four a’s;
and this can be done in only one way; namely, we must choose the a, not
the b, from each set of parentheses. On the other hand, to form a term of
the type a^3b, we need only choose three of the factors to be a’s and one to
be b. However, since there are four sets of parentheses we could choose
the b from either the first, second, third, or fourth set. That is, there are
four ways of forming a term of the type a^3b. To summarize, there are as
many ways of forming a term of type a^3b as there are ways of choosing three
a’s, or equivalently, one b from a collection of four. This number is
represented by none other than C(4,1) or its equivalent C(4,3), since C(4,1) =
C(4,3).
To generalize this idea, let us first review the binomial expansion of
(a + b)^6 in this light rather than by straightforward computation. We
first observe that the exponent tells us that there are really six sets of
parentheses in the expansion; hence, the sum of the exponents of each of the terms
must be 6. This tells us that we can only expect terms of the type a^6,
a^5b, a^4b^2, a^3b^3, a^2b^4, ab^5, and b^6. Moreover, the coefficient of a^6 must be
C(6,6) since we can get a term of this type only by choosing six a’s out of a
possible six. In the same way, we see that the coefficient of a^5b must be
C(6,5) since we can obtain this type of term only by choosing an a from
five of the six sets of parentheses. Proceeding in this way, we may write
down at once
(a + b)^6 = C(6,6)a^6 + C(6,5)a^5b + C(6,4)a^4b^2 + C(6,3)a^3b^3
+ C(6,2)a^2b^4 + C(6,1)ab^5 + C(6,0)b^6.
This result does not depend on whether we can find synonyms for C(6,4)
and the others. However, if we use our previous factorial formulas or
whatever we find convenient, we see that C(6,6) = 1, C(6,5) = 6, C(6,4) =
15, C(6,3) = 20, C(6,2) = 15, C(6,1) = 6, and C(6,0) = 1. This leads to
the result shown in the chart.
If we wish to extend this technique beyond an example contained in the
chart, consider the problem of finding the coefficient of a^8b^5 in the binomial
expansion of (a + b)^13. On sight we need only write C(13,8), for there are
as many terms of type a^8b^5 as there are ways of choosing eight a’s from a set
of 13. Observe that this answer is correct and is indeed exact. To be
sure, it might not be as familiar a numeral as the equivalent place-value
numeral, but this is irrelevant. The point is that it is now but a matter
of technique to compute C(13,8) by the recipe
C(13,8) = 13!/(8!5!) = (13!/8!)/5!
= (13 X 12 X 11 X 10 X 9)/(5 X 4 X 3 X 2 X 1) = 1287.
Thus, the recipe for combinations gives us a big advantage over computing
(a + b)^13 term by term, and then having to collect 1287 terms, each of the
type a^8b^5.
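The three numerals for C(13,8) used in this recipe can be compared directly. The sketch below assumes Python 3.8 or later, where the standard library supplies math.comb; the variable names are our own.

```python
from math import comb, factorial

by_comb = comb(13, 8)                                            # library synonym for C(13,8)
by_factorials = factorial(13) // (factorial(8) * factorial(5))   # 13!/(8!5!)
by_cancellation = (13 * 12 * 11 * 10 * 9) // (5 * 4 * 3 * 2 * 1) # after cancelling 8!

print(by_comb, by_factorials, by_cancellation)   # 1287 1287 1287
```

All three name the same number; only the amount of arithmetic differs.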
If we now desire to summarize the results in general we could write
(a + b)^n = C(n,n)a^n + C(n,n - 1)a^(n-1)b + C(n,n - 2)a^(n-2)b^2 + . . .
+ C(n,2)a^2b^(n-2) + C(n,1)ab^(n-1) + C(n,0)b^n.
Once again, let us hasten to point out that computational skill is not the
aim of the text, but hopefully by now computational techniques should
not look quite so ominous. Aside from showing us a good theoretical
application of combinations, the binomial theorem affords a rather nice vehicle
for exhibiting the multifaceted personality of mathematics.
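As a numerical spot check of the general formula, we can add up the terms C(n,k)a^k b^(n-k) for particular whole numbers and compare the total with (a + b)^n computed directly. This is only a sketch; binomial_sum is a name of our own, and agreement for a few values illustrates the theorem without proving it.

```python
from math import comb

def binomial_sum(a, b, n):
    """Add up C(n,k) * a^k * b^(n-k) for k = n, n - 1, ..., 0."""
    return sum(comb(n, k) * a**k * b**(n - k) for k in range(n, -1, -1))

print(binomial_sum(2, 3, 6), (2 + 3) ** 6)   # both numerals name 15625
```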
Exercises
1. (a + b)^9 is expanded by the binomial theorem. Find the coefficient of
a^6b^3.
2. Nine coins are tossed. In how many ways may we obtain six heads and
three tails?
3. How are Exercises 1 and 2 related?
4. Explain the device in Pascal’s triangle whereby we enter the sum of two
entries on one line to obtain an entry on the next.
5. In the binomial expansion of (x^3 + 2y)^5 find the coefficient of the term
whose form is x^9y^2.
6. 2^10 = 1024. In the binomial expansion of (a + b)^10 how does 1024
occur?
7. What is the coefficient of a^5b^5 in the binomial expansion of (a + b)^10?
8. Suppose we wished to compute (1.001)^10 to be correct to the third
decimal place but we did not wish to go through the work of actually raising
1.001 to the tenth power. Recall that 1.001 = 1 + 0.001. Use the
binomial theorem to expand (1 + 0.001)^10 and thus obtain the required
approximation for (1.001)^10.
2.17 A GLIMPSE AT PROBABILITY

One area in which combinations and permutations can be applied is
probability theory. We shall demonstrate the meaning of probability quite
briefly in the following way. Suppose that an event can be performed in n
different ways, all of which are equally likely to occur. The requirement
that the ways be equally likely is essential. For example, if
one million numbers are in a box and we draw one at random then there
are two possibilities: either we draw the number 6 or we do not. However,
unless half of the numbers in the box are labeled 6 these two outcomes are
not equally likely. Now suppose that m of these n equally likely outcomes
are favorable outcomes. We define the probability of a favorable outcome
to be m/n.
Using more mathematical language, if A denotes a particular event and
if A can occur in m equally likely ways out of n equally likely total
outcomes, then we define P(A), called the probability of A, by P(A) = m/n.
This idea of probability is closely connected with the structure of unions
and intersections in the sense that if A and B are two events, the probability
that at least one of the two events occurs is given by
P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
Thus, if we wish to compute the probability of drawing either a heart
or a face card (counted here as four cards in each suit, or 16 in all) from
an ordinary deck of playing cards, where we assume that
each of the 52 cards is equally likely to be drawn as any other, and if we
let A denote the event that we draw a heart and B the event that we draw
a face card we see that
P(A) = 13/52
P(B) = 16/52
P(A ∩ B) = 4/52.
Hence
P(A ∪ B) = (13 + 16 - 4)/52 = 25/52.
If we assume that an event is either favorable or unfavorable, one or the
other but not both, and if we let A denote the favorable events and A'
denote the unfavorable events we have A ∩ A' = ∅ and S = A ∪ A'.
Therefore, P(S) = N(S)/N(S) = 1 = P(A ∪ A') = P(A) + P(A') - 0,
where P(∅) = 0. Hence, P(A') = 1 - P(A). Thus, as we might well
have suspected, the probability that a favorable event occurs is just one
minus the probability that an unfavorable event occurs. For example, the
probability that we draw neither a heart nor a face card from an ordinary
deck of cards is 27/52, since the probability that one or the other will
happen is 25/52.
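Both the union formula and the complement rule can be checked by listing the 52 cards outright. The sketch below is our own and rests on one loudly stated assumption: to agree with the count of 16 face cards used in the text, it treats the ace, king, queen, and jack of each suit as face cards. The names deck, hearts, faces, and prob are hypothetical.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "K", "Q", "J", "10", "9", "8", "7", "6", "5", "4", "3", "2"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
# Assumption: four face ranks per suit, chosen to match the text's count of 16.
FACE_RANKS = {"A", "K", "Q", "J"}

deck = set(product(RANKS, SUITS))                   # 52 equally likely cards
hearts = {card for card in deck if card[1] == "hearts"}
faces = {card for card in deck if card[0] in FACE_RANKS}

def prob(event):
    """P(A) = m/n: favorable outcomes over equally likely total outcomes."""
    return Fraction(len(event), len(deck))

p_union = prob(hearts) + prob(faces) - prob(hearts & faces)
p_neither = 1 - p_union                             # complement rule

print(p_union, prob(hearts | faces))   # 25/52 25/52
print(p_neither)                       # 27/52
```

Counting the union directly and applying the formula give the same numeral, as they must.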
By now it may be clear why the study of Cartesian products plays such
a big role in the study of probability. Namely, we are always concerned
with counting the number of outcomes as well as the number of favorable
outcomes; and the clever ways of counting that are often needed are best
understood in terms of Cartesian products, including permutations and
combinations.
At the end of this section we shall supply some examples, but now we
wish to make a very important observation. It is probably not surprising
that the original study of probability arose in regard to games of chance.
Even in modern textbooks there is a preponderance of examples concerning
cards and dice when probability theory is discussed. However, it is wrong
to believe that such considerations are the only uses of the probability
theory. In truth, all of life in one way or another is concerned with the
study of probability. The actuary employed by the insurance company is,
in a sense, a bookmaker. Indeed, when we take out a life insurance policy,
we are betting that we will not live to a certain age while the company
bets that we do; and the actuary determines the odds to make it a fair bet.
More importantly, the whole world of science hinges on the concept of
probability, for in the world of science until an experiment is completed
all we have is a prediction. If, for example, the space agency is told not
to send a man into orbit until it is certain that he will return alive, then it
will never send a man into orbit; for the only way in which we can be sure
that he returns alive is for us either never to send him, or if we do send him,
to wait until he returns alive! All decisions must be made in the light of
available evidence; and granted that a well thought-out decision may have
bad results, we would like to feel that it was due to unforeseeable bad luck,
rather than to poor planning.
In laboratory work when there is some discrepancy between theoretical
results and experimental results, we must often decide how large an error
we can tolerate before we have to admit that the discrepancy is due to a
mistake in the theory rather than to a human error in measurement and
experimental design.
The biggest problem is that outside of such contrived examples as dice
and playing cards it is either very difficult or even impossible to determine
the number of possible, equally likely outcomes; and as a result, the study
known as statistics is born. Statistics in many ways simply (but not
easily) involves methods of collecting reliable data from which certain
probabilities can be determined for use in various computations. The use
of statistics and probability from either the theoretical or computational
points of view is not the purpose of this text. All we want to do is indicate
the importance of the topic and the use of the theory of sets in its proper
development. However, it is difficult to understand the importance of
probability without reference to at least a few problems; so we shall
conclude this section with a few problems, their solutions, and a brief discussion.
Example 1
A “fair” coin is tossed six times. What is the probability that exactly
three heads and three tails occur?
Solution: It should be clear that the result does not depend on whether
one coin is flipped six times or six coins are flipped once each. There is
not a “50-50” chance here. Indeed, since 2^6 = 64, it should be easy to
verify that there are 64 equally likely outcomes and that only C(6,3) of
these are favorable events. But C(6,3) = 20; hence, the probability of
this event is 20/64 or 5/16. In other words, for every five chances to win,
we have 11 ways to lose. (We can say that the odds against us are 11 to 5,
or 11-5.) Let us look at this in terms of a chart.
Distribution    Number of ways
6h, 0t          C(6,0) = 1
5h, 1t          C(6,1) = 6
4h, 2t          C(6,2) = 15
3h, 3t          C(6,3) = 20
2h, 4t          C(6,4) = 15
1h, 5t          C(6,5) = 6
0h, 6t          C(6,6) = 1
                64 ways in all
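What is true is that 3h, 3t is more likely than any other single distribution.
What is 50-50 is the contest between the two lopsided results: it is exactly
as likely that more heads than tails occur in our experiment (22 ways) as
that more tails than heads occur (22 ways).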
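The chart for six tosses can be rebuilt by listing all 64 outcomes and counting heads in each. This is a sketch of our own; the names outcomes and heads_count are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=6))             # 2^6 = 64 equally likely outcomes
heads_count = Counter(seq.count("H") for seq in outcomes)

for heads in range(6, -1, -1):
    print(f"{heads}h, {6 - heads}t: {heads_count[heads]} ways")

p_three_heads = Fraction(heads_count[3], len(outcomes))
more_heads = sum(heads_count[k] for k in (4, 5, 6))  # outcomes with more heads than tails
more_tails = sum(heads_count[k] for k in (0, 1, 2))  # outcomes with more tails than heads

print(p_three_heads)             # 5/16
print(more_heads, more_tails)    # 22 22
```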
Example 2
What is the probability when a pair of “fair” dice are rolled that the sum
will be seven?
Solution: There are only 11 different sums that can be rolled; namely,
any number between 2 and 12, inclusive. However, these numbers are
not equally likely; for, as we have already seen, numbers such as 2 and 12
can be obtained in only one way each, while 7 may be obtained in six ways;
that is, 1-6, 6-1, 2-5, 5-2, 3-4, and 4-3. Here we see the difference between
numbers and numerals once again, for while only 11 different sums can
occur, 36 (6 X 6) different numerals can name these 11 numbers. Thus,
the probability of obtaining a 7 is not 1/11 but rather 6/36 or 1/6. Observe
that by equally likely we mean that the combination 5-2 is no more likely
than the combination 6-6; both are equally likely. What makes a 7 more
likely than a 12 is that there are six different ways of rolling 7, and 5-2 is
just one of these; but 6-6 is the only way of obtaining a twelve.
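The distinction between the 11 possible sums and the 36 equally likely rolls shows up clearly when the rolls are tallied. The following sketch is our own illustration; rolls and sums are hypothetical names.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))   # 6 X 6 = 36 equally likely rolls
sums = Counter(first + second for first, second in rolls)

print(len(sums))                               # 11 different sums
for total in range(2, 13):
    print(total, sums[total])                  # 2 and 12 occur once, 7 occurs six times

p_seven = Fraction(sums[7], len(rolls))
print(p_seven)                                 # 1/6
```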
Example 3
A “fair” die is rolled twice. What is the probability that a 4 does not turn
up either time?
Solution: There are five ways in which the first die fails to turn up a 4.
Namely, it turns up either 1, 2, 3, 5, or 6. There are also five ways in
which the second die fails to turn up a 4. Thus, there are 5 X 5 ways in
which neither the first nor the second die turns up a 4. In all there are
6 X 6 equally likely ways in which the pair of rolls may terminate. The
probability of a 4 turning up neither time is (5 X 5)/(6 X 6), or 25/36. In
terms of odds, the odds are 25 to 11 that a 4 will not turn up on either die.
Example 4
What is the probability that when a pair of “fair” dice are rolled at least
one of the faces will turn up a 4?
Solution: Here we use the fact that P(A) = 1 - P(A'), for it should be
clear that the set of events in which at least one die turns up a 4 is the
complement of the set of events in which neither die turns up a 4. Hence,
since there is a probability of 25/36 that neither die reads 4, it must be
that the probability that at least one die turns up a 4 is 11/36.
The number of possible outcomes involved in the solutions of (3) and (4)
is quite small. In fact, the following is a complete listing.
1-1    2-1    3-1    4-1*   5-1    6-1
1-2    2-2    3-2    4-2*   5-2    6-2
1-3    2-3    3-3    4-3*   5-3    6-3
1-4*   2-4*   3-4*   4-4*   5-4*   6-4*
1-5    2-5    3-5    4-5*   5-5    6-5
1-6    2-6    3-6    4-6*   5-6    6-6
The 11 entries with asterisks verify directly the results discussed above.
It is worth noting that intuition can easily lead to misinterpretation
in such a problem as the above. For example, it is easy to employ the
reasoning that since there are only six equally likely ways in which a fair
die can turn up, then in two rolls there are two chances out of six that a 4
will occur. This reasoning leads to 2/6, or 1/3, as the probability of
obtaining at least one 4 with one roll of a pair of dice. We have seen
that the probability is 11/36. The error seems more suggestive if we
write 1/3 = 12/36, for then we see that we are off by 1/36; whence it is
not difficult to see that 4-4 is but one roll that we have counted as two
solutions (that is, as two favorable events).
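The results of Examples 3 and 4, and the source of the tempting answer 12/36, can all be read off from one listing of the 36 rolls. This is a minimal sketch of our own; every name in it is hypothetical.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))        # 36 equally likely rolls

no_four = [roll for roll in rolls if 4 not in roll]
at_least_one_four = [roll for roll in rolls if 4 in roll]

first_is_four = [roll for roll in rolls if roll[0] == 4]
second_is_four = [roll for roll in rolls if roll[1] == 4]
naive_count = len(first_is_four) + len(second_is_four)   # counts the roll 4-4 twice

print(Fraction(len(no_four), len(rolls)))             # 25/36
print(Fraction(len(at_least_one_four), len(rolls)))   # 11/36
print(naive_count, len(at_least_one_four))            # 12 11
```

The naive count exceeds the true count by exactly one, the roll 4-4.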
While the outcomes are more numerous as the number of rolls of the
die increases, the theory remains the same. For instance, if the die is
rolled three times (or if three dice are rolled once) there are 6 X 6 X 6
possible outcomes, and 5 X 5 X 5 ways in which a 4 never occurs. Thus,
the probability that no 4’s occur is (5 X 5 X 5)/(6 X 6 X 6) = 125/216.
This means that the probability of obtaining at least one 4 is 1 - 125/216, or
91/216, a far cry from the intuitive answer that the probability is 3/6. In
fact, a game called “chuck-a-luck” is based on this idea. Namely, a
player picks 1, 2, 3, 4, 5, or 6, and then three fair dice are rolled. If the
number picked by the player does not appear on any of the dice, the player
loses his wager. If his number appears on exactly one of the three dice, he
wins an amount equal to his wager. If it appears on exactly two of the
dice, he wins an amount equal to twice his wager; and finally, if his number
occurs on all three dice, he wins an amount equal to three times his wager.
At first glance (and to many, at second, third, and fourth glances as
well) it appears that this is a good bet for the player to accept. However,
the above discussion shows that the player has a better than even chance of
losing. Let us assume, for the sake of argument, that a player chooses
4. We have already seen that there are 125 ways for him to lose and 91
ways to win. Since some of these 91 ways pay an extra dividend, they
must be considered separately. It is easy to see that only one of these
91 winning situations involves three 4’s; namely, each die must turn up a
4. Now let us investigate the number of ways in which exactly two of the
dice turn up a 4. Those who are now adept at the game of clever counting
may already see that there are C(3,2) ways in which two of the dice turn
up 4’s. We can also count more concretely and see that it must be either
the first and second, or first and third, or second and third dice which
turn up the 4’s. Once the two 4’s occur the third die can record any
one of five faces; namely, anything but a 4, since if a 4 occurred we would
have three 4’s. Thus, there are C(3,2) X 5 ways of obtaining exactly two
4’s, and this can be translated by observing that C(3,2) = 3, and 3 X 5 =
15. That is, there are 15 ways of obtaining exactly two 4’s. This leaves
us with the fact that 75 of the 91 winning situations involve only one 4
turning up. (This can be checked directly by observing that C(3,1) X
5 X 5 = 3 X 5 X 5 = 75; and this is obtained by recalling that C(3,1)
denotes the number of ways in which one of the three dice turns up a
4, while 5 X 5 represents the number of ways in which neither of the
other two dice turns up a 4.) No matter how we proceed we would
eventually obtain the following:
125 = number of ways in which no 4’s occur (1)

75 = number of ways in which exactly one 4 occurs (2)
15 = number of ways in which exactly two 4’s occur (3)
1 = number of ways in which exactly three 4’s occur. (4)
For the sake of convenience, let us now assume that each wager consists
of $1. Since the 216 outcomes are assumed to be equally likely, let us
base our analysis on 216 tosses in which each of the 216 outcomes occurs.
Of course, such a distribution is unlikely, but any deviation from this
distribution is a matter of luck, so we disregard it. Then (1) indicates
that we lose $125, (2) indicates that we win $75, (3) indicates that we
win 15 X 2, or $30, and (4) indicates that we win 1 X 3, or $3. That
is, we can expect, in wagering $216, to win $75 + $30 + $3 = $108 while
losing $125. This means that we lose $125 - $108, or $17 per $216
invested. Certainly, a player may be an extremely lucky man and win
much money, but any deviation from an average loss of $17 per $216
played is due to luck alone; this in turn means that if the player wins he
should attribute this to good luck, but if he loses he should think twice
before being naive enough to believe that it was caused by bad luck alone!
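The whole chuck-a-luck analysis can be confirmed by listing the 216 outcomes and applying the payoff rule to each. In the sketch below, which is our own, the player is assumed to pick 4 and to wager $1 each time; the names matches and payoff are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=3))        # 6 X 6 X 6 = 216 outcomes
matches = Counter(roll.count(4) for roll in rolls)  # how many dice show the player's 4

# Net result of a $1 wager, by the number of dice showing a 4.
payoff = {0: -1, 1: 1, 2: 2, 3: 3}
net = sum(matches[k] * payoff[k] for k in payoff)

print(dict(sorted(matches.items())))     # {0: 125, 1: 75, 2: 15, 3: 1}
print(net)                               # -17, that is, $17 lost per $216 wagered
print(Fraction(net, len(rolls)))         # -17/216 per $1 wager on the average
```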
Again, notice that whether or not we are adept at counting, the theory is
not affected here; only the determination of the possible outcomes might
be. By miscounting the possibilities we can get into difficulty. In fact,
one reason that many gambling houses have “honest” wheels and “honest”
dealers is that they have too much to lose by being caught cheating.
Indeed, man’s ability to misinterpret data often allows the gambling house
to establish odds that are in its favor, but which the player believes are
in his favor. In short, the house plays the percentages, while the player
relies on luck, and it is fair to assume that this analogy permeates all phases
of our life!
We shall conclude this section with one more example which, in addition
to reinforcing some previous ideas, illustrates another living example of
the case against intuition.
Example 5
There are n people in a room, where n denotes a natural number. What
is the probability, if these n people were chosen at random, that at least
two of them celebrate their birthdays on the same day of the year?
(Notice that we are not naming a particular day, nor are we insisting that they
be born in the same year.)
Solution: It is clear that the answer depends on n. To see some extreme
cases, merely observe that if n = 1 the probability is 0, since with only
one person it is impossible for two of them to have the same birthday! At
the other extreme, if n is at least equal to 367 (this guards against leap
year problems), the chest-of-drawers principle guarantees that the
probability is 1, since then we have more people than dates in a year.
Let us tackle some specific values of n and see what the trend is, and
let us also assume that leap year is excluded (that is, there are 365 days
in a year).
(1) If n = 1, we have already seen that it is impossible for two to have
their birthdays on the same day.
(2) If n = 2, we see that the first person’s birthday can fall on any one
of the 365 days without conflict, but the second can have his birthday
only on any one of the 364 days if his birthday is to be different from
that of the first man. Thus, there are 365 X 364 ways in which the
two men do not celebrate their birthdays on the same day of the year;
and there are 365 X 365 possible ways for two men to have
birthdates. Thus, if n = 2, the probability that the two men have
different birthdays is given by (365 X 364)/(365 X 365), or 365/365 X
364/365, or 364/365. Hence, the probability that both celebrate
their birthdays on the same day of the year is 1 - 364/365, or 1/365.
That is, with two people, the odds against both celebrating their
birthdays on the same day are 364 to 1.
(3) If n = 3, we may use this same procedure. Namely, there are 365 ×
365 × 365 ways in which they may celebrate their birthdays. Moreover,
if no two are to have their birthdays on the same day, we see
that the first may celebrate his on any one of the 365 days, the second
can then celebrate his on any one of the remaining 364 days, and the
third, on any one of the remaining 363 days. Thus,
(365 × 364 × 363)/(365 × 365 × 365)
= (364 × 363)/(365 × 365)
= 132,132/133,225
is the probability that no two celebrate their birthdays on the same
date. In other words, with three people the odds are 132,132 to
1,093 (or about 120 to 1) against two people celebrating their
birthdays on the same day.
As we let n increase by one each time, we see that in determining the
probability that no two have their birthdays on the same day, we multiply
by more and more factors, each of which is not only less than one but
also less than the factor before it. For instance,
365/365 × 364/365 × 363/365 × 362/365 × 361/365 × 360/365
represents the probability that no two people have their birthdays on the
same day if there are six people in the room.
The amazing thing is that this product diminishes so rapidly that when
n = 23 the odds are about even that at least two people have their
birthdays on the same date! If a person relies on intuition here, he would
probably guess that it would require about 180 people before it is an even
bet that at least two people will have their birthdays on the same day.
Yet, with n = 100 the odds are approximately 3,300,000 to 1 that at least
two people have their birthdays on the same day! With 150 people, the
odds become 4,500,000,000,000,000 to 1! While we do not intend to
prove this assertion because it would require such tedious computation, let
us at least observe that by the time we get to about the hundredth person,
the factors look like 265/365, 264/365, and so on, and this involves talking
about 2/3 of 2/3 of 2/3 of 2/3 . . . . That is, we begin to multiply by
(2/3)^m (and even this is not quite exact since the terms get progressively
smaller). But to get a general idea of what is happening observe
(2/3)^2 = 4/9
(2/3)^3 = 8/27
(2/3)^4 = 16/81
(2/3)^5 = 32/243
(2/3)^6 = 64/729
and we see that the powers of 2/3 diminish very rapidly.
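The tedious computation mentioned above is easy to hand to a machine. The following is a minimal sketch in Python, not part of the original discussion; the function names are our own, and we assume, as in the text, a 365-day year with every birthday equally likely.

```python
from fractions import Fraction

DAYS = 365  # leap year excluded, as assumed in the text


def prob_all_different(n):
    """Probability that n people all have different birthdays."""
    p = Fraction(1)
    for k in range(n):
        # The (k+1)-th person must avoid the k days already taken.
        p *= Fraction(max(DAYS - k, 0), DAYS)
    return p


def prob_shared(n):
    """Probability that at least two of n people share a birthday."""
    return 1 - prob_all_different(n)


# Print the trend for a few room sizes.
for n in (2, 3, 6, 22, 23, 50, 100):
    print(n, float(prob_shared(n)))
```

Running the loop shows the probability of a shared birthday passing one half between n = 22 and n = 23, which is the even bet described above.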

We could present an almost endless collection of problems for the
purpose of illustrating elementary probability theory. Indeed, even with
this lengthy discussion we have hardly begun to scratch the surface of the
subject. However, the illustrations presented, together with the exercises
at the end of this section, should supply enough drill to insure that we
grasp the concept of probability and its significance, and that we see the
role of sets in the development of this topic.
Exercises
1. The letters A, B, C, D, and E are placed at random in a row. What
is the probability that A and B will be next to each other?
2. A person writes down three letters chosen at random and it is possible
that he writes the same letter more than once.
(a) What is the probability that none of the three letters is a vowel?
(b) What is the probability that all of the three letters are vowels?
(c) What is the probability that at least one of the three is a vowel?
(d) What is the probability that exactly one of the three is a vowel?
3. We are to form a four-digit numeral by arranging the digits 1, 3, 5,
and 6 in random order.
(a) What is the probability that the numeral will name a number
greater than 5000?
(b) What is the probability that the numeral will name an even
number?
(c) What is the probability that the numeral will name an even
number greater than 5000?
4. Four math books and three history books are shelved in random order.
What is the probability that the four math books will be together?
5. A committee of four is to be chosen by lot from a list of nine candidates.
Smith is one of the candidates. What is the probability that Smith
will be chosen for the committee?
6. The letters AAAAABBBBCCC are to be arranged in a row but chosen
at random.
(a) What is the probability that the arrangement will again yield
AAAAABBBBCCC?
(b) What is the probability that the sequence will begin with AA and
end with BB?
(c) What is the probability that the sequence will begin with CCC?
7. A flush is a poker hand in which all five cards are of the same suit.
What is the probability of being dealt a flush?
8. Assuming that C(52,13) = 6 × 10^11, what is the probability of a
bridge hand consisting of 13 cards in the same suit?
9. A penny is tossed 10 times. Assuming that the tosses are random,
what is the probability that we shall obtain exactly five heads?
10. A penny is tossed 15 times. Again assuming that the tosses are
random, what is the probability of obtaining
(a) exactly 13 heads?
(b) exactly 14 heads?
(c) exactly 15 heads?
(d) at least 8 heads?
(e) at least 8 tails?
11. A fair coin is tossed 20 times and each of the 20 times it turns up heads.
(a) What is the probability that tails will occur on the next toss?
(b) Suppose that we did not know the coin was fair.
(i) What was the probability that it would yield heads for each
of the 20 tosses?
(ii) Would we, then, be likely to conclude that the coin was fair?
Explain.
12. Two pennies are dented in such a way that the probability is f that
the first will turn up heads when flipped, and the probability is f that
the second will turn up tails when flipped. Assuming that both coins
are then flipped fairly, what is the probability that
(a) both coins will turn up heads?
(b) both will turn up tails?
(c) one will be heads and the other tails?
13. Two boys and two girls are placed at random in a row for a picture.
What is the probability that the boys and girls alternate in the picture?
14. Suppose that three people enter a restaurant that has a row of six
seats. If they choose their seats at random what is the probability
that
(a) they sit with no seats between them?
(b) they sit with at least one empty seat between each of them?
15. A certain college has 500 students and it is known that 300 read French,
200 read German, 50 read Russian, 20 read French and Russian, 30
read German and Russian, 20 read German and French, and 10 read
all three languages. If a student is now chosen at random from this
college, what is the probability that this student
(a) reads all three languages?
(b) reads exactly two of these languages?
(c) reads at least two of these languages?
(d) reads at least one of the three languages?
(e) reads none of these three languages?
16. What is the probability that a bridge hand will have suits of 4, 4, 3,
and 2 cards?
17. Which of the following two bridge hands has the better probability
of occurring and why:
(a) all 13 spades
(b) AK53 of spades
QJ72 of hearts
765 of diamonds
AJ of clubs.
2.18 EQUIVALENCE RELATIONS

In such statements as
John is taller than Bill,
Seven is greater than three,
Line m is parallel to line n,
we see examples of relations.

Thus, “is taller than” relates John to Bill; “is greater than” relates
seven to three; and “is parallel to” relates lines m and n.
Obviously, the study of relations is older than the study of Cartesian
products. However, Cartesian products offer us a very precise way of
defining relations.
Let S denote any set. Then we define a relation on S to be any subset
of S × S.
While the above definition may seem very abstract at the first few
glances, the fact is that it affords us an elegant, concise, and all-inclusive
description of a relation.
What put us on the track of using Cartesian products in this way? The
answer is that we observe that relations not only relate pairs of objects,
but that the relation depends on the order in which the pair is stated. Thus,
if “John is taller than Bill” is a true statement, then “Bill is taller than
John” is a false statement. That is, in general, the truth of a relation
depends on the order of the pair, although there are some relations in
which order makes no difference (such as “is the same height as”).
In other words, in just the same way that we could view points in the
plane as ordered pairs, so can we view any relation. For example, let S
denote the set consisting of the whole numbers 1, 2, 3, and 4; and let the
relation be “is less than.” Then we have
1 < 2, 1 < 3, 1 < 4, 2 < 3, 2 < 4, and 3 < 4.
Let us now agree to use the code (b,c) as an abbreviation for b < c. Then
the above inequalities translate into
{(1,2), (1,3), (1,4), (2,3), (2,4), (3,4)}
and this is clearly a subset of S × S.
While our definition of a relation is quite simple, its broadness allows a
set to have a great many relations. That is, suppose S has n elements.
Then, as we have already seen, S × S will have n^2 elements. Hence,
S × S will have 2^(n^2) subsets. (Why?) By way of illustration, in the
example above S has four members. In this case n^2 = 16; and
2^(n^2) = 2^16 = 65,536; and that is a great number of relations.
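For readers who like to experiment, the definition of a relation as a subset of a Cartesian product can be tried out directly. The following Python sketch is our own illustration (the variable names are assumptions, not notation from the text); it builds S × S for S = {1, 2, 3, 4}, picks out the subset that encodes “is less than,” and counts how many relations on S are possible.

```python
from itertools import product

S = {1, 2, 3, 4}

# The Cartesian product S x S: every ordered pair (b, c) with b and c in S.
S_cross_S = set(product(S, repeat=2))

# The relation "is less than" is the subset of pairs (b, c) with b < c.
less_than = {(b, c) for (b, c) in S_cross_S if b < c}

# Each of the n^2 pairs is either in a subset or out of it,
# so there are 2^(n^2) subsets, that is, 2^(n^2) relations on S.
n = len(S)
number_of_relations = 2 ** (n ** 2)

print(sorted(less_than))
print(len(S_cross_S), number_of_relations)
```

The comment on the counting step is one way to answer the “Why?” above: a subset is determined by an independent in-or-out choice for each of the 16 ordered pairs.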
Not all relations are equally interesting, but there is one type of relation
that occupies a great position in mathematical structures. These are
called equivalence relations and they will be the topic of discussion in the
remainder of this section.
To begin with, let us concede that Cartesian products are no longer
necessary in our conversation. They have served their purpose in helping
us motivate the idea of relations. Instead of using ordered pairs, such as
(b,c), we will write bRc to stand for b is related to c by the relation R. As a
specific example, let R denote the relation “is greater than.” In this
case we would write 7R3 as an abbreviation for “7 is greater than 3.”
In what follows, R will denote an arbitrary relation on some set S. It
would be comforting to know that for any b ∈ S, bRb; that is, we would
like to believe that every element “is related to itself.” However, this
is not the case. Indeed, in our above example wherein R denoted “is
greater than,” this is false. That is, bRb would mean here that b was
greater than b, and this is false. (In terms of a subset of S × S we are
saying that while (b,b) belongs to S × S, there is no reason why it must
belong to the particular subset of S × S we have chosen.) On the other
hand, if R denotes “is equal to,” then bRb would be a true statement.
This leads us to the following concept: The relation R is said to be reflexive
on S if bRb is true for all b ∈ S.
Among reflexive relations are “is equal to,” “is the same age as,” and
“is parallel to.” Among relations that are not reflexive are “is less than,”
“is shorter than,” and “is the father of” (since no person is his own father).
In fact, a relation chosen at random is probably not reflexive.
A second important feature of some relations hinges on a few things
we have already hinted at, especially the idea of order being important.
For example, as we have seen, the fact that bRc is true does not mean that
cRb will also be true, although this could certainly happen. Now we can
say that the relation R is symmetric on S if bRc implies cRb.
Examples of symmetric relations are “is equal to,” “is parallel to,” and
“lives next door to.” Examples of relations that are not symmetric are
“is less than,” “is shorter than,” “is the father of,” and “is the brother
of” (that is, if John is the brother of Mary, it is unlikely that Mary is the
brother of John).
Again, in terms of subsets of S × S, to be symmetric the subset must
satisfy the rather stringent condition that (c,b) belongs to the subset
whenever (b,c) does.
The final property of relations we wish to discuss here is known as
transitivity; namely, the relation R on S is called transitive if, whenever
aRb and bRc, then aRc.
For instance, “equality” is transitive; for if the first is equal to the second
and the second is equal to the third, then the first is also equal to the third.
On the other hand, “is the father of” is not transitive. That is, if the
first person is the father of the second and the second is the father of the
third, then the first is not the father, but the grandfather, of the third.
Again, in terms of subsets of S × S, transitivity means that whenever
(a,b) and (b,c) belong to the subset, so also does (a,c).
The next point to observe is that a particular relation can have any
combination of these three properties (reflexive, symmetric, transitive),
ranging from none to all three.
(1) “lives next door to” is symmetric but neither reflexive (unless the
person lives in two adjoining homes) nor transitive.
(2) “is less than” is transitive but neither reflexive nor symmetric.
(3) “is the father of” is neither reflexive, transitive, nor symmetric.
(4) “is divisible by” is reflexive and transitive but it need not be
symmetric (4 is divisible by 2 but 2 is not divisible by 4).
(5) “is equal to” is reflexive, symmetric, and transitive.
By an equivalence relation on S, we mean any relation that is reflexive,
symmetric, and transitive.
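On a finite set, each of the three properties can be tested mechanically from its subset-of-S × S description. The sketch below is our own Python illustration, not part of the original text. The helper names are hypothetical, the set S = {1, ..., 8} is chosen only for convenience, “differs by exactly 1” is our numerical stand-in for “lives next door to,” and “has the same remainder on division by 3” is an extra example we have added.

```python
def relation(S, rule):
    """The subset of S x S consisting of the pairs (b, c) for which rule(b, c) holds."""
    return {(b, c) for b in S for c in S if rule(b, c)}


def is_reflexive(S, R):
    # (b, b) must belong to R for every b in S.
    return all((b, b) in R for b in S)


def is_symmetric(R):
    # (c, b) must belong to R whenever (b, c) does.
    return all((c, b) in R for (b, c) in R)


def is_transitive(R):
    # (a, c) must belong to R whenever (a, b) and (b, c) do.
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)


def is_equivalence(S, R):
    return is_reflexive(S, R) and is_symmetric(R) and is_transitive(R)


S = range(1, 9)
examples = {
    "is less than": relation(S, lambda b, c: b < c),
    "is divisible by": relation(S, lambda b, c: b % c == 0),
    "is equal to": relation(S, lambda b, c: b == c),
    "differs by exactly 1": relation(S, lambda b, c: abs(b - c) == 1),
    "same remainder on division by 3": relation(S, lambda b, c: b % 3 == c % 3),
}

for name, R in examples.items():
    print(name, is_reflexive(S, R), is_symmetric(R), is_transitive(R))
```

A check of this kind only speaks for the finite set on which it is run; it illustrates the definitions rather than proving anything about the relations in general.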
The beauty of an equivalence relation is that “if you’ve seen one you’ve
seen them all,” at least as far as that relation is concerned. For example,
“is the same height as” is an equivalence relation. Suppose that we want
to find a person who is exactly six feet tall. Then once we have one such
person any other who is “related” to this person (by the relation “is the
same height as”) is also exactly six feet tall. However, these two persons
might be quite different in other respects. That is, two people might be
equivalent with respect to the relation “is the same height as” but not
equivalent with respect to the relation “has the same color hair as.”
The major point as far as mathematical structure is concerned is that
equivalence relations allow us to substitute anything for something to
which it is equivalent, taking heed that we always restrict these
substitutions to statements involving the equivalence relation.6
The preceding material on sets is ample for an introduction to the topic.
Later in the text there will be occasions to introduce new ideas about
sets as well as to reinforce old ideas.
In the next chapter we turn our attention to the general structure of
mathematics. The illustrations in Chapter 3 draw freely from the ideas
of this chapter. Among other things Chapter 3 will review many
important properties of this chapter.
Figure 2.54
6 To see this, consider the equilateral triangle ABC and let AD be perpendicular to
BC, as in Figure 2.54. Now by definition of equilateral, BC = AC; hence, if we
substitute equals for equals the fact that AD is perpendicular to BC means that it is also
perpendicular to AC; but this is preposterous. What went wrong? The answer is
that BC and AC were equal with respect to length; that is, with respect to the relation
“is the same length as.” This, in turn, means that wherever the length of BC is the
correct answer so also will the length of AC be the correct answer. However, in Figure
2.54 the relation “is perpendicular to” is a relation between lines, not the lengths of the
lines. This is one reason that the better geometry books write AC = BC to indicate
that it is the lengths of the lines that are equal, not the lines themselves. In short, it
is true that AD is perpendicular to BC, but it is not true that AC = BC. In fact, had
we used the notation BC and AC to denote lengths there would have been no problem.
Exercises
1. Tell which of the following relations are reflexive, which are symmetric,
and which are transitive:
(a) is greater than
(b) is the sister of
(c) is the friend of
(d) is divisible by
(e) has the same shape as
(f) lives next door to
2. Describe the importance of an equivalence relation.
chapter three / THE "GAME" OF
MATHEMATICS: An Introduction
to Abstract Systems (with Special
Emphasis on Boolean Algebra)
3.1 INTRODUCTION
There is an old saying that the only sure things are death and taxes.
Yet, with the exception of certain trivialities, man can be sure of nothing.
For example, if someone wanted to bet against us that the world would
not be here tomorrow we would have to wait until then before we could
collect. In short, in the world of science until an experiment is completed,
all we ever have is a conjecture. The systematic study of any serious topic
begins with certain assumptions, things we believe to be true. These
“truths” may be based on whims, intuition, or past experience. They
may appear self-evident or they may be acquired knowledge; but whatever
they are, they serve as the basis of further inquiries.
We then apply the rules of logical thought to these assumptions and
thus try to determine what facts follow in an inescapable manner from our
assumptions. We do not worry so much about whether our conclusions
are true in the real world (for we cannot even be sure that our assumptions
are absolutely true); but we do worry about whether our conclusions are
inescapable consequences of our assumptions. Herein lies a basic difference
between truth and validity. Truth is a subjective value judgment made
about individual statements. Validity is a more objective judgment that
applies to an entire argument. We say that an argument is valid if the
conclusion follows inescapably from the assumptions, independent of
whether the assumptions happen to be true.
Truth and validity are connected by the fact that if an argument is
valid and if the assumptions are true, then the conclusion also is true.
By way of illustration, we give four different arguments in which the
validity of the argument and the truth of the conclusion appear in different
combinations.
Example 1
All Texans are Martians.
All Martians are Bostonians.
Therefore, all Texans are Bostonians.
This is an example in which the conclusion is false but the argument
is valid. In other words, while the conclusion is false it is nonetheless
an inescapable consequence of the first two assumptions. In still other
words, if we were to invent some abstract game in which there were three
types of pieces, called Texans, Martians, and Bostonians; and if it were
a rule of this game that all Texans were Martians and that all Martians
were Bostonians, then in this game it would be true that all Texans are
Bostonians.
Example 2
All Frenchmen are Europeans.
All Germans are Europeans.
Therefore, all Frenchmen are Germans.
Conclusions like the above probably started World War I! We all
recognize that the above conclusion is false. The argument is also invalid
(that is, not valid). In terms of the language of sets, a set can have two
(or more) different subsets that are not subsets of one another. In other
words, the above example merely says in abstract form that F and G are
subsets of E. Certainly, this is not enough to insure that F and G are
equal subsets, nor that F must be a subset of G.
Notice that we do not preclude that F might be a subset of G; all we
are saying is that whether or not the conclusion is true, it does not follow
inescapably from the evidence presented in the assumptions. From a
different perspective, if the conclusion is true it is virtually in spite of the
assumptions. The next example is also along these same lines.
Example 3
All Parisians are Europeans.
All Frenchmen are Europeans.
Therefore, all Parisians are Frenchmen.
Here the conclusion is true, but this truth does not follow from the
form of our assumptions. In short, the fact that it is true that all Parisians
are Frenchmen requires more knowledge than that given in the above two
assumptions. Finally, it is possible that we have both truth and validity.
Example 4
All bears are animals.
All animals have four legs.
Therefore, all bears have four legs.
Here is a true conclusion that follows inescapably from the form of our
assumptions.
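Because validity depends only on the form of an argument, the set-language reading used in Example 2 lends itself to a mechanical search for counterexamples. The following Python sketch is our own illustration; the function names and the three-element universe are assumptions made for the sake of the demonstration. It reads “all A are B” as “A is a subset of B” and looks for sets that make the assumptions true and the conclusion false.

```python
from itertools import combinations, product


def subsets(universe):
    """All subsets of a small finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def find_counterexample(premises, conclusion, universe=(0, 1, 2)):
    """Return sets (A, B, C) with true premises and a false conclusion, or None."""
    for A, B, C in product(subsets(universe), repeat=3):
        if premises(A, B, C) and not conclusion(A, B, C):
            return A, B, C
    return None


# Form of Examples 1 and 4: all A are B, all B are C; therefore all A are C.
chain_form = find_counterexample(
    lambda A, B, C: A <= B and B <= C,
    lambda A, B, C: A <= C,
)

# Form of Examples 2 and 3: all A are C, all B are C; therefore all A are B.
shared_superset_form = find_counterexample(
    lambda A, B, C: A <= C and B <= C,
    lambda A, B, C: A <= B,
)

print("chain form counterexample:", chain_form)
print("shared-superset form counterexample:", shared_superset_form)
```

A single counterexample is enough to show that the form of Examples 2 and 3 is invalid. Finding none in a small universe does not by itself prove that the first form is valid; that follows from the fact that a subset of a subset of C is itself a subset of C.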
A word of warning is appropriate here. While true assumptions and
valid reasoning imply that the conclusions are true, it need not follow that
the assumptions are true whenever the argument is valid and the
conclusion is true. For example, consider the next statements.
Example 5
All bears are trees.
All trees have four legs.
Therefore, all bears have four legs.
The argument is still valid and the conclusion is still true, but the
assumptions are not true.
It is this last result that makes the scientific method so difficult to work
with. In the “game” of life we know nothing for sure; all we have is what
we can see. Thus, in many cases the scientist starts with a known
observation and then tries to invent a theory that explains the observation. In
essence, he applies valid reasoning to his assumptions in order to deduce a
true conclusion. As noted above, this in no way proves that his
assumptions are true. This is the problem that motivated Einstein to say when
asked if his theory of relativity was true: “All the experiments in the world
can never prove me right; but a single one may prove me wrong!”
Paraphrased, he was saying that false assumptions together with valid reasoning
could yield a true conclusion, but that true assumptions together with
valid reasoning could not yield a false conclusion.
Getting back to the major point of this section, we wish to show how all
mathematical systems are set up so that one can deduce conclusions from
given assumptions. It is not necessary that the conclusions be true,
although we keep in mind that the “truer” the assumptions, the “truer”
will be the valid conclusions.
In closing this part of the discussion we should point out that this type
of inquiry into validity is not restricted to mathematical systems, nor to
the natural and physical sciences. Indeed, every branch of human
endeavor (the social sciences as well as the physical and natural sciences) has
its foundations in the fact that we begin with man-made assumptions and
apply the rules of logical thought. These rules are the same in the social
sciences as they are in the physical sciences. The biggest difference
between the physical and the social sciences is that it seems much more
difficult in the complex social sciences to find “rules of the game” that are
acceptable to all players.
We now begin to study the development of mathematical systems and
to identify a mathematical structure. Because most people seem to have
a certain “faith” in intuition and common sense but seem to distrust logic
(probably because logic is a difficult subject), we shall in the next section
point out the weakness of intuition and common sense, in order to motivate
better the need for logic.
3.2 THE CASE AGAINST INTUITION

Quite frequently one hears remarks equivalent to, “Why do this by
logic when it is so much easier to do it by common sense?” It is certainly
true that on the frontiers of knowledge there is no substitute for common
sense or good intuition; for in the real world, unlike with some
textbook-type problem, we do not at the start know the answer to a problem and
merely try to prove that answer correct. Rather, we start with a problem
for which we have no answer; the “real-life textbook” has no answers in the
back. In short, then, one usually needs a certain insight just to know in
what direction to proceed, let alone how to solve a particular problem.
Such an approach is, in our opinion, realistic; for frequently one
introduces an abstract, logical study of a phenomenon only after he “feels” that
he understands the phenomenon being studied. Granted that the analogy
may be somewhat weak, it is like saying that man learned to hunt food
first for survival. It was only when he no longer faced the threat of
starvation that he devoted time to constructing a cookbook of exotic recipes.
The point is that if much of our study is going to depend on a good, basic
knowledge of our number system or sets, then we should make sure that
we have a good knowledge of these topics. One of the best ways, at least
on the elementary level, of learning to feel at home with these number
systems is to study them first on intuitive rather than abstract grounds,
and hopefully this is what we did in the first two chapters.
Moreover, while intuition and common sense play a large role in the
study of mathematics, it would be somewhat of a folly to try to teach these
topics, for they are subjective; and subjective topics not only seem different
to different people, but even within the same person they seem different
at different times. These statements lead us to recognize that if the
primary role is to teach everyone the subject, then the less appeal to intuition,
the better will be the overall teaching. If we can teach objectively there
will be less misinterpretation than if we had taught subjectively. In
fact, at one time or another, we have all probably felt somewhat chagrined
when an expert explains his topic to us in a way which is crystal clear to
him! We see that he is right but wonder how we could have learned this
ourselves had he not been around to show us. In summary, the paradox
seems to be that most topics at a sufficiently advanced level require that
the student have good common sense and intuition, ingredients of a
subjective nature. However, good teaching techniques should tend toward
objectivity; that is, they should be independent of intuition.
Certainly, if the above remarks completely characterized the problem,
the solution would be rather apparent. Namely, we would let the
outstanding student proceed intuitively, while developing objective criteria
for teaching the other students.
However, the problem is much more complex. Since intuition is
subjective it might even lead to the wrong answers! That is, there might be a
drastic difference between the right answer and the answer that seems
right. Moreover, if a wrong answer seems right it is difficult to see that
the answer is wrong; otherwise, the answer would not seem right in the
first place. What should we do if two persons arrive at contradictory
results and each is adamant that his answer is supported by common sense?
Let us look at some explicit problems.
Example 1
A man drives from town A to town C by way of town B. The distance
between A and B is the same as that between B and C. In going from A
to B he averages 20 mph, while in going from B to C he averages 30 mph.
What was his average speed for the entire trip from A to C via B?
SOLUTION: The average man would venture the answer 25 mph, since
25 is the average of 30 and 20 and since equal distances are traversed at
the two speeds. Let us choose a computationally convenient number for
the distance between A and B and, hence, between B and C, and let us
actually compute the average speed.
We should make the total distance traveled a multiple of both 20 and
30 so that we can work with integers rather than with fractions. Let us
agree that the distance between both A and B, and B and C is 60 miles.
In this event the total distance traveled would be 120 miles. Moreover,
at an average rate of 20 mph it would take three hours to get from A to B,
and at an average rate of 30 mph it would take two hours to get from B to
C. Thus, the 120-mile trip in this case would take five hours. But a
simple division shows that 120 miles per five hours is equivalent to 24, not
25, miles per hour!
It seems that a paradox has arisen, that somewhere along the line a
mile has disappeared! Yet such is not the case; a little reflection shows
that the two speeds were applied for different lengths of time. For
example, since the distance from A to B is the same as the distance from B to
C, we spend more time traveling at the slower speed (20 mph) than at the
faster speed (30 mph). Thus, it should not be surprising that the average
is closer to 20 than to 30. Likewise, if we were to figure average test scores
and if we scored 50 on the first test and 100 on the second, at first glance
we might be tempted to consider the average to be 75. But such an
answer presupposes that the two tests are equally weighted. For example,
if the score of 50 had been made in a two-minute homework quiz and the
score of 100 had been made in a final examination, the average would
probably be closer to 100 than to 50. In the travel problem, had the man
driven for the same length of time at each speed, then the average speed
would indeed have been 25 mph.
With respect to this same travel problem, there is still another surprising
aspect. The answer 24 mph depends only on the numbers 20 and 30 and
not on the distance traveled. There are many ways of checking this but
perhaps the most straightforward way is to observe that 20 mph means 1
mile per 3 minutes, whereas 30 mph means 1 mile per 2 minutes. Since
the distance traveled at 20 mph is the same as that traveled at 30 mph, we
see that we travel 3 minutes at one speed to cover the same distance that
we can cover in 2 minutes at the other speed. Since this means that the
time ratio is 3:2, we know that 3/5 of the time is spent at the lower speed,
while 2/5 of the time is spent at the greater speed. Notice that 24 is exactly
3/5 of the way between 30 and 20 or, equivalently, 2/5 of the way between 20
and 30, regardless of the distance between A and B. Notice that the
interpretation 1/2 + 1/3 = 2/5 gives the correct answer in the sense that 1 mile per
2 minutes and 1 mile per 3 minutes “averages out to” 2 miles per 5 minutes
(24 miles per hour).
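The claim that the answer does not depend on the distance is easy to test numerically. The following Python sketch is our own illustration with hypothetical function names; it computes the average speed as total distance divided by total time, using exact fractions so that no rounding intrudes.

```python
from fractions import Fraction


def average_speed(distance_each_leg, slow, fast):
    """Average speed over two legs of equal length driven at two speeds."""
    total_distance = 2 * distance_each_leg
    total_time = (Fraction(distance_each_leg, slow)
                  + Fraction(distance_each_leg, fast))
    return total_distance / total_time


def share_of_time_at_slow_speed(distance_each_leg, slow, fast):
    """Fraction of the total driving time spent at the slower speed."""
    time_slow = Fraction(distance_each_leg, slow)
    time_fast = Fraction(distance_each_leg, fast)
    return time_slow / (time_slow + time_fast)


# Try several distances between A and B; the average should not change.
for d in (1, 60, 1000):
    print(d, average_speed(d, 20, 30), share_of_time_at_slow_speed(d, 20, 30))
```

Weighting each speed by the share of time spent at it, 3/5 × 20 + 2/5 × 30, gives the same 24 mph, which is the test-score analogy above in numerical form.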
Example 2
Let towns A, B, and C all be 30 miles from each other. Just as before,
the driver goes from A to B at an average speed of 20 mph. How fast
must he travel from B to C if he wishes to average 60 mph for the entire
trip from A to C?
SOLUTION: The “usual” guess would be that the driver must average 100
mph between B and C since the average of 20 and 100 is 60. Yet the
answer is not 100 mph. Even if he were to travel at the speed of light
(in fact, even if he were to get from B to C with no time elapsing) it would be
impossible for him to reach a 60 mph average! To see this, we observe
that since the distance between both A and B, and B and C is 30 miles,
the driver must travel a total of 60 miles. If he is to average 60 mph
he must cover the entire 60 miles in 1 hour. Yet, in driving at 20 mph he
has already used up 1 1/2 hours just in going the 30 miles between A and B.
In other words, even if the driver could now get from B to C without using
any more time, he has still used 1 1/2 hours. Thus, the best he could do is
60 miles per 3/2 hours, or 40 mph. Notice that we are not saying that the
driver can never average 60 mph. We are saying that he cannot do it
within the prescribed distance. For example, since he drove for 1 1/2 hours
at 20 mph, he must drive at 100 mph for 1 1/2 hours also if he wishes to
average 60 mph. But 100 mph for 1 1/2 hours means a distance of 150, not 30
miles.
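The reasoning of this example amounts to a small time budget, which the following Python sketch works through. It is our own illustration; the function names are hypothetical, and returning None to signal “impossible” is simply a convention we have chosen.

```python
from fractions import Fraction


def required_second_leg_speed(leg_distance, first_leg_speed, target_average):
    """Speed needed on the second of two equal legs to reach the target
    average for the whole trip, or None if no speed can achieve it."""
    time_allowed = Fraction(2 * leg_distance, target_average)
    time_used = Fraction(leg_distance, first_leg_speed)
    time_left = time_allowed - time_used
    if time_left <= 0:
        # The first leg alone has already used up the whole time budget.
        return None
    return Fraction(leg_distance) / time_left


def best_possible_average(leg_distance, first_leg_speed):
    """Upper limit on the trip average: the second leg takes no time at all."""
    return Fraction(2 * leg_distance) / Fraction(leg_distance, first_leg_speed)


print(required_second_leg_speed(30, 20, 60))  # no speed suffices
print(best_possible_average(30, 20))          # the 40 mph ceiling
print(required_second_leg_speed(30, 20, 24))  # compare with Example 1
```

Note that the ceiling works out to twice the first-leg speed whatever the distance, since the whole trip is twice as long as the leg whose driving time can no longer be recovered.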
Example 3
A man has 30 jelly beans that he wishes to sell at 3 for 1¢. He has a better
grade of jelly bean that he wishes to sell at 2 for 1¢. He also has 30 of
these. Rather than keep them in two separate jars, he decides to place
them all in one jar and sell them at 5 for 2¢. Which way does he make
more money?
SOLUTION: What difference does it make? After all, 2/1¢ and 3/1¢ add
up to 5/2¢. Yet it does make a difference, as the following computation
shows.
At 2 for 1¢, 30 jelly beans bring in 15¢.
At 3 for 1¢, 30 jelly beans bring in 10¢.
Thus, selling them separately the man brings in 25¢ if all are sold. On
the other hand, selling the entire 60 at 5 for 2¢ brings in only 24¢! That
is, there are 12 piles of 5 in 60, and each pile of 5 brings in 2¢.
Again, with a little more thought the mystery is easily solved. Namely,
the profit would have been the same had he had equal batches, rather than
equal amounts of each. For example, at 2 for 1¢ the 30 jelly beans split
into 15 batches, whereas at 3 for 1¢ they split into 10 batches. In other
words, it is 2 of one kind and 3 of the other that are worth 5 for 2¢. For
example, had he sold 30 at 2 for 1¢ and 45 at 3 for 1¢ (since in this case
there are 15 batches of each) he would have made the same either way.
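A brief sketch of the jelly bean computation follows. The helper name `revenue_cents` is hypothetical, and prices are taken to be in cents per batch, as in the example.

```python
from fractions import Fraction

def revenue_cents(beans, beans_per_batch, cents_per_batch):
    """Money taken in when `beans` are sold in batches at a fixed batch price."""
    return Fraction(beans, beans_per_batch) * cents_per_batch

# Equal amounts of each kind: 30 and 30.
separate = revenue_cents(30, 2, 1) + revenue_cents(30, 3, 1)
mixed = revenue_cents(60, 5, 2)
print(separate, mixed)                     # 25 24

# Equal batches of each kind: 15 batches of 2 and 15 batches of 3.
balanced_separate = revenue_cents(30, 2, 1) + revenue_cents(45, 3, 1)
balanced_mixed = revenue_cents(75, 5, 2)
print(balanced_separate, balanced_mixed)   # 30 30
```

The two methods agree only in the second case, where the mixed jar really does contain 2 of one kind for every 3 of the other.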
The next illustration shows how intuition can lead to interesting paradoxes
when we deal with collections having infinitely many objects. (A
further discussion of this concept is presented in the next chapter.) This
is because in everyday life we are concerned with finite situations, and
hence our intuition may be said to be finitely oriented.
Example 4
Suppose that we have a certain number of objects in a collection. Then
it seems to be intuitively clear that by merely changing the name of each
object we in no way alter the total number in the collection. For example, a
collection of three men remains a collection of three men no matter how
the men decide to change their names. Let us start with the collection
of counting numbers (this is an infinite collection since the counting numbers
never come to an end) and change the name of each number by replacing
it by its double.
 1   2   3   4   5   6   7   8   9  10  11  12  . . .
 2   4   6   8  10  12  14  16  18  20  22  24  . . .
Intuition tells us that the top and bottom lines contain the same number
of members. Yet the top line contains both even and odd counting numbers
whereas the bottom line contains only the even counting numbers.
Certainly, it does not seem that there should be a 1 to 1 correspondence
between all the counting numbers and the even counting numbers.
Here is another way of viewing this example: Suppose there are infinitely
many men and infinitely many rooms in a hotel. We put the first man in
room 1, the second man in room 2, and so on, so that all the rooms are
occupied. Then infinitely many more men come to the hotel and each
man wants a room. The manager tells each of the men who are already
in the rooms to move into the room whose number is twice the original
number. Thus the man in room 1 moves to room 2, the man in room 2
moves to room 4, the man in room 3 moves to room 6, and so on. Each
of the original men is still in a room but now all the odd-numbered rooms
are vacant and the new men can move in as desired. This process can be
successfully completed over and over again as each new batch of men
comes to the hotel.
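The manager's instruction can be sketched for a finite snapshot of the hotel. The function names are hypothetical, and placing the k-th newcomer in room 2k - 1 is one assumed way of letting the new men "move in as desired"; a program can only display the first few rooms of an endless hotel.

```python
def move_residents(first_n):
    """Rooms of residents 1..first_n after each moves from room n to room 2n."""
    return {resident: 2 * resident for resident in range(1, first_n + 1)}

def seat_newcomers(first_n):
    """Assumed seating: newcomer k takes the k-th odd-numbered room, 2k - 1."""
    return {newcomer: 2 * newcomer - 1 for newcomer in range(1, first_n + 1)}

residents = move_residents(5)
newcomers = seat_newcomers(5)
print(residents)   # {1: 2, 2: 4, 3: 6, 4: 8, 5: 10}
print(newcomers)   # {1: 1, 2: 3, 3: 5, 4: 7, 5: 9}
```

No room is claimed twice: the residents hold only even-numbered rooms and the newcomers only odd-numbered ones.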
Another example of this problem is the following paradox. List the
counting numbers but not in the usual order; namely, list the first two
even numbers and then the first odd number. Then list the next two
even numbers and the next odd number, and continue this way. Thus, we
have 2, 4, 1, 6, 8, 3, 10, 12, 5, 14, 16, 7, 18, 20, 9, . . . . If we stop after
any odd number, no matter how far we go there will always be twice as
many even as odd numbers in the sequence. Since we will not run out
of even numbers before odd ones it appears that there are twice as many
even numbers as there are odd numbers—a fact completely contrary to
what we know to be true. If we reverse the role of even and odd numbers
and obtain 1, 3, 2, 5, 7, 4, 9, 11, 6, 13, 15, 8, . . . , then there are seemingly
twice as many odd numbers as even ones; and this is even more ridiculous
if we agree that there are also twice as many even as there are odd numbers.
Variations of this type are endless. For example, we could also “prove”
that there are three times as many even numbers as odd by writing 2, 4,
6, 1, 8, 10, 12, 3, 14, 16, 18, 5, 20, 22, 24, 7, . . .
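The rearranged listings above can be generated and tallied with a small sketch. The names `interleaved` and `tally` are illustrative, and any program can inspect only a finite beginning of each listing.

```python
from itertools import count, islice

def interleaved(evens_per_odd):
    """Yield counting numbers as `evens_per_odd` evens, then one odd, repeated."""
    evens = count(2, 2)   # 2, 4, 6, ...
    odds = count(1, 2)    # 1, 3, 5, ...
    while True:
        for _ in range(evens_per_odd):
            yield next(evens)
        yield next(odds)

def tally(prefix):
    """Return (number of evens, number of odds) in a finite list."""
    evens = sum(1 for n in prefix if n % 2 == 0)
    return evens, len(prefix) - evens

two_to_one = list(islice(interleaved(2), 15))
print(two_to_one)          # [2, 4, 1, 6, 8, 3, 10, 12, 5, 14, 16, 7, 18, 20, 9]
print(tally(two_to_one))   # (10, 5)

three_to_one = list(islice(interleaved(3), 16))
print(tally(three_to_one)) # (12, 4)
```

The ratio found in every finite beginning depends entirely on the order chosen, even though each listing eventually reaches every counting number.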
The paradox hinges on the fact that we talk about endlessness, but we always
stop the sequence somewhere. Notice that in terms of endlessness the concept
of “many” becomes nonintuitive. For example, 1,000,000,000,000,000
is a large number; for simplicity, call it N. But even if we ever counted
as far as N we would be no closer to the “end” of the number system
than when we began. For the next numbers would be N + 1, N + 2,
N + 3, N + 4, . . . , and we are back to the beginning of our number
system, only using N as a new reference point. It is important to understand
that we are making no attempt to “prove” that there are twice as
many even numbers as odd numbers. In Chapter 4 we shall show how the
mathematician demonstrates that there are an equal number of each, just
as we would like to believe. The point is simply that intuition can cause
dilemmas that are not easy to resolve. In other words, we must learn
to distinguish between the precise use of the word “endless” and its colorful
form, where the author writes, “our hero remained underwater endlessly.”
In reality, the author probably means that the hero held his breath for
two minutes—which might seem endless!
The concept of endlessness offers interesting intuition-defying examples
of a geometric nature as well.
Example 5
Consider the right triangle each of whose legs is 1 standard unit (s. u.). By
the theorem of Pythagoras, the length of the hypotenuse is √2 (see
Figure 3.1).
Figure 3.1
In other words, we know that the length of the line segment AB is between
1.4 and 1.5 s. u. Now let us connect B to A by a network of lines
that consist of horizontal motions to the left and vertical motions downward.
A few such networks might be those illustrated in Figure 3.2.
Figure 3.2
Notice that since each network consists of a composite motion of 1 s. u. to
the left and 1 s. u. downward, the perimeter of the network must be precisely
2 s. u. But by using short enough segments (Figure 3.3) we can
make the network look arbitrarily close to being the hypotenuse of the
right triangle. Hence, it appears that we have “proven” 2 = √2, a
result we know to be preposterous! The resolution of this dilemma is
left as an exercise for the reader.
Figure 3.3
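The two quantities in this example can be tabulated with a short sketch. It assumes, purely for illustration, that each network is made of equal steps; the function name `staircase` is hypothetical. The sketch only displays the numbers and does not settle the exercise.

```python
import math
from fractions import Fraction

def staircase(steps):
    """Return (total length, greatest distance from the hypotenuse) for a
    network of `steps` equal steps joining B to A across the unit right triangle."""
    run = Fraction(1, steps)          # each step goes `run` left and `run` down
    length = steps * (run + run)      # exact: always 2
    # The corner of each step is the point farthest from the hypotenuse;
    # it is the apex of a small isosceles right triangle with legs `run`.
    max_gap = float(run) / math.sqrt(2)
    return length, max_gap

for steps in (1, 10, 100, 1000):
    length, gap = staircase(steps)
    print(steps, length, gap)

print(math.sqrt(2))   # the length of the hypotenuse itself
```

The greatest gap shrinks as the steps get shorter, while the total length of the network stays fixed at 2 s. u.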
The list of such paradoxes is huge. But it is not our purpose to write a
chapter of such paradoxes, so suffice it to say that not only is the list large,
but the complexity of these paradoxes varies from those that can be solved
with a minimum of effort to those which are so sophisticated that they are
unsolved even today. Our main purpose was to show that there are situations
in which intuition and common sense seem to fail us. Not that we
can legally say our intuition is wrong, but we can say that it leads to incorrect
results. (By this we mean that if the right answer to a problem is
12 and our intuition tells us 6, the fact is that our intuitive answer is 6,
even though it is not the correct answer.)
If intuition can lead us astray on certain occasions, we must admit that
it could lead us astray on others. Thus, we must have access to a technique
that is more objective than mere intuition.
We are not saying that one should not be happy to possess a fine intuitive
sense. In fact, it may be the finest single virtue the creative researcher
possesses. In summary, we are saying
(1) It is difficult, if not impossible, to teach a meaningful course to a cross
section of people if the subject matter is highly subjective, since different
people see things in different ways. Intuition is certainly
highly subjective.
(2) Even if it were possible to teach intuitively, we should be prepared
with a powerful alternative. This alternative would be necessary if
two highly competent researchers obtain contradictory results, each
in accord with his own intuition.
3.3 THE CONCEPT OF A GAME
3.3.1 WHAT IS A GAME?

When we think of the wide variety of things correctly called games, it
seems a monumental task to make up an acceptable definition of a game.
Perhaps the best way of going about this is to think of a specific game and,
without naming the game, consider how we would go about teaching it to
others. That is, if we can describe a game in such a way that the description
does not depend on the particular game being described, then the
chances are good that we have captured the key ingredients of anything
called a game. To this end, then, we think of any game at which we are
adept. We are called upon to teach this game to another person, whom
we shall call our student. In order that we avoid certain subjective hindrances,
we shall make the following assumptions.
(1) Our student is motivated to learn the game; that is, he is willing to
learn and to do the things we feel are important for the development
of certain skills. For example, if we wish to teach him football we
assume that he will conscientiously attempt any calisthenics we prescribe.
In other words, we do not have to concern ourselves with the
subjective problem of how to motivate our student. He comes that
way!
(2) While willingness on the part of the student is desirable, it is not
enough. We shall further assume that the student has the mental
capacity to comprehend what we tell him, for no matter how eager he
is he cannot obey an instruction he does not understand. Thus, we
avoid the subjective problem of how to teach those with whom we
cannot effectively communicate.
(3) We shall also assume that the student knows absolutely nothing about
the game, even though he may be quite intelligent otherwise. For
example, if we wish to teach him bridge we shall assume that he has
not even heard of a deck of cards. If we assume certain advance
knowledge on the part of the student, we might well take certain
liberties and shortcuts in our presentation, thus obscuring the logical
structure of the game. In summary, then, we assume that our student
is a genuine beginner, not a mediocre or good player who is trying to
improve.
It is important to note that we are not minimizing the problems implied
by the above assumptions. Rather, it is our aim to correctly define a
game. Hence, we wish to avoid the important but subjective problems that
might lead us astray from this aim.
It is now established that we are experts at this particular game and
that our student is capable and well motivated, but a rank novice. To
begin, we might ask what we should teach him first. A number of stock
phrases now suggest themselves: the objective, the rules, the strategy, the
terminology. Let us analyze these concepts one by one. First, we look
at the objective. Every game may be characterized by the fact that the
objective is to win. Such a statement hardly distinguishes one game from
another. In other words, if every game has as its main objective to win,
then telling our student that the objective of our game is to win would not
help him to distinguish this game from any other game he might know.
Of course, one might argue that when we tell him to win, we mean to win
within the framework accorded by the rules. However, we are assuming
that our student knows nothing about the game—and this includes the
rules. Hence, if we agree to teach first things first, we must agree that
one should learn the rules before the objective.
Perhaps then we should teach the rules first. But, then again, what
are rules? We can call rules relationships between terms. For example, in
baseball the rule that states “three strikes is an out” serves to relate the
terms “strike” and “out.” If our student does not know the meaning of
these terms then certainly he cannot fully comprehend the meaning of the
rule itself. Moreover, since we have assumed that our student knows
nothing about the game we must assume that he does not know the meaning
of the terms.
In other words, it appears that the fundamental building block in the
learning of any game is the development of the necessary vocabulary for
proper communication. After the terminology is presented we then
present the rules of the game. That is, we show what makes this game
different from all other games that could have been played with exactly
the same equipment. For example, many different games can be played
with a deck of cards. The equipment is the same but the rules are different.
Then we can explain, in terms of the terminology and the rules, the
object of the game, or what we mean by a winning situation. How we
employ the rules to arrive at the winning situation is known as the strategy.
We can now define a game to be any system consisting of definitions,
rules, and a winning situation(s) that is carried out by strategy; that is, by
employing the rules in such a way that the winning situation follows in an
inescapable manner. This idea is diagrammed in Figure 3.4.
Figure 3.4 (diagram labels: Definitions, terminology (communication); Rules of the game (relationships between terms); Objective(s); strategy)
There might be other, or even better, definitions of a game, but we
shall agree to accept the above definition since it will serve our purpose
quite well. Observe, among other things, that because of the nature of
our definition, all things justifiably called “games” will fall within its
limits.
However, our definition, while it does characterize all things called games,
is much too general in the sense that it also characterizes many things
not usually called games. For example, according to our definition just
about every meaningful subject in the curriculum could be called a game.
Indeed, in a certain sense, our entire life could be called a game since we
accept certain rules and terminologies, and then see to what inescapable
conclusions our assumptions take us. We shall not view it as a setback
that our definition is too general. We see nothing wrong with the allegory
that most things are games, in the sense of our definition of the concept.
In particular, in due time we shall describe the game of mathematics.
When we describe mathematics in this way we are not making fun of the
subject, nor are we minimizing its importance, nor are we taking it too
lightly. We are merely agreeing that, like most other things, mathematics
may be viewed as a system consisting of definitions, rules, objectives, and
strategies. But before we return to the study of mathematics, we wish
to explore the general concept of a game in more detail. Specifically, we
should like to probe for the answers to such questions as, “How do we
distinguish between rote and reason?” “Can all words be defined?”
“Are rules logical?” and “Can there be any proof without assumptions?”
Once these questions are answered in general we shall look at mathematics
in particular. In this way our vehicle remains flexible and equally
applicable to topics other than mathematics. Thus, we shall see mathematics
not as a subject apart from most others but rather as a subject
very much like others.
3.4 ROTE VERSUS REASON

Earlier we mentioned that the new mathematics was primarily concerned
with attempts to view mathematics meaningfully rather than as sheer
rote. The question of logic versus memory plays a large role not only in
mathematics education, but in all forms of education as well. The point
that we wish to emphasize in this section is that the two concepts of rote
and reason coexist, and that it is only the degree, not the existence, of each
that we can moderate.
Here the game concept can help us see the situation more clearly. Let
us ask whether games are logical or memorizable. Scrutiny clearly shows
that both ingredients are present in large proportions. For example, not
only can the terminology, rules, and winning situations be memorized, but
there is probably no other way for a person to learn them. Thus,
there is no logical way for a person who has never played cards before to
know that a standard deck contains 52 cards. We can make the memorization
more palatable by relating the terms to the student’s previous experience.
For instance, he probably knows that there are 52 weeks in a
year and that there are 4 seasons, each of 13 weeks’ duration. By association
with this knowledge it might be easier for the student to learn that
there are 52 cards in a deck, and 4 suits each of 13 cards in the deck.
Likewise, memorization is necessary to learn rules. It is hardly logical
that 3 strikes should be an out. It is simply an agreed-upon rule of the
game. If a lad who is a poor hitter exclaims that the rule that 3 strikes
is an out is not logical to him, that it would be much more logical to have
18 strikes be an out, we might say, “Look, the rule that 3 strikes is an out
seems no more logical to me than it does to you. Nonetheless, it is one
of the rules of the game. If you don’t want to accept this rule, and the
choice is yours, find others who feel as you do and make up your own game,
but please don’t call it baseball since it will be confused with a game already
called baseball!”
The strategy of the game falls into an entirely different category. Generally,
memorizing strategy relegates one to the role of a less-than-brilliant
player, for strategy is the art of making a winning situation follow
inescapably from the rules of the game. In short, when we say that a
game is a high form of logic what we really mean is that once we memorize
the terminology, the rules, and the objective, the objective can be obtained
by logically applying the rules. For example, chess is a highly logical
game. But it is by memory, not logic, that we learn, for instance, how the
queen must move.
We stress this point to bring out the fact that it is usually quibbling
for one to argue over rote versus reason; or, in other forms, subjectivity
versus objectivity, and induction versus deduction. These topics are so
intertwined that it is often almost impossible to separate them.
In particular, in the game called life it is often vain for one to ask if
his way of life is right or wrong. Rather, each of us has a philosophy; in a
sense, a system of definitions, rules, and objectives. Our most meaningful
inquiry can only be whether or not our actions follow inescapably from
our beliefs. One might apply this idea even to religion. Thus, a religion
may be both dogmatic and logical: the dogma being the definitions, rules,
and objectives and the logic being the study of whether the way of life
of the religion is consistent with the dogma.
Obviously, it is not the role of this text to explore many of these topics.
But it is well within the scope of the text to discuss mathematics in terms
of human endeavor. Here logic and rote play a large role, for while man
is endowed with the ability to think logically he has few, if any, tenets
that he can call absolute. There are many things that we have strong
feelings about, but we cannot prove them in the truest sense of the word.
Taking the saying that the only sure things are death and taxes, we
cannot even prove the sureness of death. For if we were suddenly granted
immortality and not told about it, how old would we have to be before
we could be sure that we would never die? Granted that by relative
standards we might be highly suspicious by the age 1000, this is no proof
that we must live to the age 1001.
Certainly, man could take a pessimistic view and decide that since there
is nothing he is absolutely sure of, there is little reason to study logic, for
things that follow inescapably from doubts must themselves be no better
than doubts. But, let us instead take the attitude, “Granted, these
beliefs may eventually turn out to be false, but at the moment they are
the most sensible things we can imagine. Hence, since we must have
some beliefs in order to make predictions, we shall accept these until such
time as we no longer believe that they are the most sensible.” In a certain
way, this attitude is analogous to one in a rather well-known baseball
anecdote concerning three umpires discussing balls and strikes. The
youngest umpire said, “Some are strikes and some are balls. I call them
as I see them!” The more seasoned umpire said, “Some are strikes and
some are balls. I call them as they are!” However, the veteran umpire
pointed out, “Some are strikes and some are balls, but until I call them they
aren’t anything!” In short, most of life is spent making subjective judgments,
or else we have nothing. This is perhaps the best point of view for
supporting the idea that the scientist is the interpreter of nature. That
is, in the noblest form, science consists of interpreting and predicting
things, not in forming nature. (Perhaps this is the intent of the
statement that, according to the laws of aerodynamics, the bumble bee
can’t fly but, since the bumble bee doesn’t know about these principles, he
flies anyway.) Things happen and the scientist, in terms of things that
he believes to be true, tries to explain why these things happen. He then
uses his subjective explanation to make predictions. Of course, his
predictions could be right even if his beliefs were wrong, but if prediction
after prediction comes true, he would argue that right or wrong he will
accept his beliefs as long as the predictions hold true. But a scientist
never says that his theory is correct. He only says that, at the moment, it
explains certain phenomena and predicts other results better than any
other known explanation.
It is from this aspect of the story of human endeavor that logic becomes
a vital study in the life of each intelligent man. A course in mathematics
designed so that the average man can improve his outlook on the universe
would be sadly lacking without sufficient homage paid to logic. It appears
that mathematics affords the best vehicle for the study of logic. Many
eminent mathematicians and philosophers have even attempted to show
that logic and mathematics were synonyms. Without taking sides, we
dodge this claim and merely assert that logic is such a large part of mathematics,
or mathematics affords such a wonderful example of “living”
logic, that the man who says, “I love logic, but I don’t see why I should
want to study mathematics!” is in the same position as the man who says,
“I love beautiful paintings, but since I can visualize one I don’t see why
I have to look at one!”
Before concluding this section, let us use the game analogy to show how
subjectivity and objectivity are interrelated. We usually call baseball
an objective game because the rules are such that we know exactly how
certain situations that occur in the game must turn out. For example,
suppose a batter has 3 balls and 2 strikes and does not swing at the next
pitch. We know that if the umpire calls it a strike the batter is out. If
he calls it a ball, then the batter is awarded first base. This sequence is
highly objective, but the fact remains that the umpire must decide whether
the pitch was a ball or a strike. However, this is a subjective opinion in
all instances, even though certain pitches are easier to call than others.
In summary, then, the analogy of a game should provide a fairly good
first approximation to the interdependence of rote and reason, and subjectivity
and objectivity. Moreover, the analogy can be used still further,
academically speaking, and serve as a model whereby many serious
educationally oriented questions may be analyzed in a rather rewarding
way. Many of these points are probably good for us to keep in mind in
case we run into stumbling blocks in the sections that follow. For example,
consider such comments of lament as:
I love mathematics; but I just can’t do well in math courses.
I study much harder than he does, but he gets much better grades.
We can probably add many more laments based on our own experiences;
but, just for the sake of illustration, compare the above statements with
their “game” counterparts which somehow don’t seem as strange:
I love baseball; I watch it on TV, I go to the ball parks, I study books on the subject.
Still I’m a pretty awful player.
Joe has much more natural ability than I have and when it comes to wrestling, he
can beat me no matter how hard I practice.
Somehow or other, we tend to accept physical limitations as a rather
natural thing; but we become much more self-conscious when these aspects
are applied to our academic abilities!
It should be clear that the idea of a game can be very helpful in our
study of, and attitude toward, mathematics. We have no intention of
abandoning the topic now, but rather we intend to dissect various aspects
of a game in more detail.
3.5 CAN ALL WORDS BE DEFINED?

In earlier sections we discussed the difference between subjective and objective
definitions. In this section we shall look at this problem in more
detail, and we shall also look at other problems that arise when we attempt
to make up definitions.
While at first glance we may see the problem of defining words as fairly
straightforward, the following considerations may change this stand. Try
to define the color blue to a person who has never seen. Try to define the
concept of time without using the concept of time in the definition. Along
similar lines, try to visualize learning a foreign language about which you
know nothing when the only tool you have available for this task is a
dictionary in that language. For example, suppose that you want to learn
Russian and all you have is an unabridged Russian dictionary—not a
Russian-English dictionary. You do not know a single word in Russian.
About all you could do would be to learn alphabetical order. From still
another point of view, imagine that we are back at the “dawn of consciousness”
and that for the first time we want to invent a language. It comes
time to define our first word. But how can we do this when as yet we have
no other words? It appears rather likely that man chose a number of
primitive concepts which he made no attempt to define objectively and
defined other concepts in terms of these.
The point to be made is that in the formation of any language some subjectivity
must take place, for somehow we must arrive at a starting point.
In other words, if we start with a particular word and try to trace its meaning
through various synonyms we must eventually get back to the original
word. For example, let us look up any word in the dictionary. After
finding a synonym for this word we look up the meaning of the synonym
and continue to proceed in this way. Eventually, we will be led back to
the original word or else to one of the previous synonyms. Such a procedure
is called circular reasoning. Circular reasoning is not necessarily evil.
Frequently, one learns to understand a concept better just by hearing the
concept rephrased, even if no additional facts are introduced. But from a
logical point of view, circular reasoning does not help us get to the core of
things. An example of circular reasoning might be a warning lantern
placed on a pile of stones. We ask why the lantern is there and are told
that it is to warn people of the pile of stones. We ask why the stones were
there and are told that they are there to support the lantern! As a second
illustration of circular reasoning, consider the case of the king who issues
a decree stating that he is perfect. When he is asked to prove that he is
perfect the king replies “Read the decree!”
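The dictionary experiment described above can be sketched in a few lines. The toy dictionary is invented purely for illustration, and the sketch assumes that every word is "defined" by a single synonym that is itself an entry.

```python
def trace_definitions(dictionary, word):
    """Follow each word to its listed synonym until some word comes up again.

    Returns the chain of words looked up and the word that repeated."""
    seen = []
    while word not in seen:
        seen.append(word)
        word = dictionary[word]
    return seen, word

# Hypothetical entries: each word is defined by one synonym.
toy_dictionary = {
    "huge": "large",
    "large": "big",
    "big": "great",
    "great": "large",
}

print(trace_definitions(toy_dictionary, "large"))
# (['large', 'big', 'great'], 'large')
print(trace_definitions(toy_dictionary, "huge"))
# (['huge', 'large', 'big', 'great'], 'large')
```

Because the dictionary has only finitely many entries, the chain must eventually repeat a word: either the original word, as in the first call, or one of the earlier synonyms, as in the second.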
Our contention is that whereas it is possible to define all words in one
way or another, the possibility hinges on our willingness to accept the fact
that some of the words must be defined subjectively. There is nothing
wrong with subjective definitions; for instance, concepts such as beauty,
justice, and holiness are certainly of a subjective nature. Consequently,
any attempts to justify the various definitions of these concepts must
ultimately rest with the user of the concept. Yet, while it might well be
impossible to answer objectively such questions as “What is beauty?”
“What is holiness?” “Is justice just?”, many of our own values stem from
our honest, albeit subjective, efforts to answer such questions. In our
attempt to learn the game of mathematics we shall eventually have to face
the problem of defining certain subjective concepts with a fair degree of
objectivity.
But while the distinction between objective and subjective is important,
we must not let a study of this problem obscure other phases of the overall
problem. For example, we must also keep in mind that even after the
problem “Can all words be defined?” has been solved to our own satisfaction,
the problem of one word having too many definitions can also haunt
us; for if a word has too many definitions, the chances for misinterpretation
of meaning are greatly enhanced. As a classic example, consider the
statement
She was fair.
Without hinting at the context, we merely ask the reader to make use of
the dictionary definition of the word “fair,” and based on what information
he finds, to tell us the meaning of our sentence. A look at Webster’s
Collegiate Dictionary, Seventh Edition, reveals:
fair \'fa(ə)r, 'fe(ə)r\ adj [ME fager, fair, fr. OE faeger; akin to OHG fagar
beautiful and perh. to Lith puosti to decorate] 1: attractive in appearance:
BEAUTIFUL 2: superficially pleasing: SPECIOUS 3a: CLEAN, PURE
b: CLEAR, LEGIBLE 4a: not stormy or foul: CLOUDLESS b: free or
nearly free from precipitation 5: AMPLE (a ~ estate) 6a: marked by impartiality
and honesty: JUST b: conforming with the established rules:
ALLOWED 7a: PROMISING, LIKELY b: favorable to a ship’s course
(a ~ wind) 8: archaic: free of obstacles 9: not dark: BLOND 10: ADEQUATE
- fairness n
There are a multitude of ways to interpret the remark “She was fair.”
In summary, the problem of defining words objectively and unambiguously
can be difficult in any field of endeavor. In particular, we shall
experience this problem in our objective study of arithmetic. To understand
arithmetic one must grasp the concept of number. Yet number is
one of those primitive concepts that leads us to circular reasoning. Unless
we use the utmost care, an attempt to define the concept of number may
presuppose the knowledge of the concept of number in the definition.
3.6 ARE RULES LOGICAL?

It should already be quite evident from our previous discussion that rules
need not be logical (or intuitive, or self-evident). Rules give us an objective
set of criteria to apply in the course of playing the game. Notice
that we are not saying that the rules are made up objectively—only that
they can be applied objectively. That is, while any advocate of the game
of baseball accepts without question the rule that three strikes is an out,
there is no guarantee that if the rule book were to be rewritten the rule
would receive unanimous approval.
The danger with rules is that if the user is convinced that they must be
logical (which they need not be) and he fails to see the logic, he frequently
thinks that he must be at fault. This often leads to a complex that will
impede his mastery of the game.
Rather than center our preliminary remarks concerning rules around
mathematics, let us choose a subject from the curriculum with which the
average student has had more experience. For instance, let us talk about
English; in particular, some rules of grammar. One such rule is
Never end a sentence with a preposition.
Why not? Is the rule logical? Is one a barbarian if he ends a sentence
with a preposition? Actually, certain verbs derived from the German
language have separable prefixes and cause a great deal of difficulty to
those who insist on following this rule. There is an anecdote concerning
the late Sir Winston Churchill, who always ended sentences with prepositions.
Once when confronted with this “atrocity” by his secretary, Churchill
replied,
“Young man, correcting my grammar is something up with which I shall not put!”
Suppose that we should still like to accept this rule. To someone who
disagrees with us we might say, “Look, this rule is no more logical to me
than it is to you. In fact, it is no more logical to me than the rule that
three strikes is an out in baseball. But just as in any game, we must
accept certain rules if we are to play. In particular, in the game of good
grammar we agree to accept this rule.”
Perhaps of even more importance is that if one memorizes a rule he may
be guilty of either (1) misinterpreting the rule or (2) accepting something
that cannot be true. For an illustration of the first alternative, think of
the number of people who misinterpret the rule
The exception proves the rule.
Let us ask a person to prove that his theory is right. He cites an
illustration in which he is wrong and then gleefully exclaims, “Ahha! The
exception proves the rule!” How utterly ridiculous! Do we really believe that
the best way to establish a result is to show that it is wrong? Yet, the
rule in its original meaning is an excellent one. Originally the word
“proves” meant “tests.” Thus, the quotation should be
The exception tests the rule.
This makes fine sense. This says that if we think we have uncovered a
new rule, we should try it out. We should not try it in some simple
situation, but rather in a difficult (exceptional) situation. If the rule
works well in exceptional cases the chances are excellent that it is a good
rule.
As an example of accepting something that cannot be true, consider the
remark
Every rule has an exception.

This leads to an interesting contradiction if we assume that it is true.
Namely, assume that the statement is true. But since it itself is a rule it,
too, must have an exception. But if it has an exception, this means that
there is an exception to the rule that every rule has an exception. This
means that there must be a rule that has no exception, but this in turn
means that the original statement is false, contrary to our claim that it is
true.
Most subjects suffer from these same ailments and in Section 3.8 we
shall see these shortcomings in mathematics. We have chosen English
grammar as an example only because this subject serves as a common
denominator for all who read this text. Let us for now look at one more
example; namely,
the rule of double negation.
This rule outlaws the sentence
I haven’t no money.
Why is this sentence taboo? It is easily misleading because logically it
means
I have money.
But such reasoning reflects some sort of negative attitude. Who is to say
that the utterer of “I haven’t no money,” is not sufficiently endowed to
know that “I haven’t no money,” and “I have money,” are synonyms?
In fact, the statement, “It is false that I have no money,” seems so much
more emphatic than “I have money.” The rule of the double negative
in English grammar is more whim than logic. The double negative is a
perfectly acceptable construction in some other languages.
By the way, it is interesting to note that, good, bad, or indifferent, the
rule of double negation is used in “wild” situations where the inventor
never intended to use it. In our own course, notice our previous comment
concerning the logic that the product of two negative numbers must be
positive by the rule of the double negative. Yet, we conveniently forget
this rule as we let the sum of two negative numbers remain negative!
An additional question we should explore is, “How many rules should a
game have?” As a rough approximation to a more precise answer, we
could say that a game should have enough rules to make the strategy
exciting, but not so many as to make the game hopelessly complicated.
Moreover, we want to make sure that the rules are consistent and that they do
not contradict one another. For instance, think of the plight of baseball
players if there were a rule that four strikes were an out and at the same
time a rule that three strikes were an out. The first time a batter had
three strikes the game would end because of a dilemma that could not be
resolved within the framework of the rules of the game. Namely, the pitcher
would insist that the batter was out, while the batter would insist that he
still had one more strike. Yet, despite the fact that the two claims would
be contradictory, each would be supported by the rules.
Before we apply our remarks to mathematics we shall talk more about
the role of strategy in a game.
3.7 CAN THERE BE PROOF WITHOUT ASSUMPTIONS?

The title of this section, in the language of a game, might more
appropriately have been: Can There Be Strategy without Rules? The answer
to this must emphatically be no! Strategy is not merely plotting to win.
Strategy is plotting to win within the framework of the rules of the game.
Thus, if the object of the game of checkers is to capture all of your
opponent’s pieces, it is true that if you take all of his pieces off the
board while he leaves the room to find an ash tray, then you have captured
his pieces. Yet, this would not be a winning situation; certainly, you have
carried out the objective but, unfortunately, not within the framework of
the rules. There is no rule that states whenever a player leaves the room
his opponent has the right to take as many of his pieces off the board as
he desires. Notice that we are not concerned with whether such a rule would
be good or bad; we are not concerned with the ethics of the player who
steals his opponent’s pieces. These may be important topics, but they are
highly subjective. We are concerned only with the legality of the play, not
the ethics. That is, no matter how we may feel about it subjectively, we
must allow those things permitted by the rules and disallow those things
forbidden by the rules. (If subjectivity is to be allowed it must appear in
our choice of accepted rules, not in how the rules are applied logically.)
In short, strategy is the art of arriving at a winning situation as an
inescapable consequence of the rules of the game.
From another viewpoint, what might be an excellent strategy under one
set of assumptions might be a poor one under a different set. Remember
how we decided whether 1/2 + 1/3 = 5/6 or 2/5? We did not call one
definition more logical than another. However, we did say that we wanted
fractions to represent lengths, and if lengths were to be added by laying
them off end to end, and if two fractions were to be equal if they were
synonyms for the same length, only then were we compelled to accept the
rule that
1/2 + 1/3 = 5/6.
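The compulsion described here can be checked with a few lines of Python. This is only a sketch under the stated assumptions (fractions stand for lengths, and lengths are added end to end); the variable names are illustrative and are not taken from the text.

```python
from fractions import Fraction

half = Fraction(1, 2)
third = Fraction(1, 3)
sixth = Fraction(1, 6)

# Measure both lengths with the same unit, a sixth:
# 1/2 is 3 sixths and 1/3 is 2 sixths.
sixths_in_half = half / sixth
sixths_in_third = third / sixth

# Laid end to end, the two lengths cover 3 + 2 = 5 sixths.
end_to_end = (sixths_in_half + sixths_in_third) * sixth

# The rival rule "add numerators and add denominators" gives 2/5.
rival = Fraction(1 + 1, 2 + 3)
```

Under these assumptions end_to_end comes out as 5/6, while the rival value 2/5 is shorter than 1/2 alone, so it cannot describe two lengths laid end to end.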
From a simpler point of view, consider, as we did previously, the
arbitrariness of defining 12 inches as 1 foot and 3 feet as 1 yard. Granted
that these are merely assumptions and that we could have originally made
up different rules, it remains that once we accept these definitions we no
longer have any choice as to the number of inches in a yard. Any number
other than 36 causes a contradiction.
3.8 THE GENERAL GAME OF MATHEMATICS
We have already seen that certain topics in the study of mathematics are
basically practical, while others are considered nonpractical. We feel that
terms such as practical and esthetic are subjective; hence, we shall make
no attempt to distinguish between topics from this point of view. However,
no matter how we categorize the various topics we find that all
branches of mathematics are geared toward determining which statements
must be true once we assume that certain other statements are true. In
this respect we might say that we are given certain assumptions, usually
called hypotheses, together with a set of rules which we agree to accept,
usually called axioms and/or postulates; and using these in a logical fashion,
we discover what other results follow as inescapable consequences.
This, then, is the game of mathematics in its most general form. The
assumptions contain certain terms and the rules are relationships between
these terms. The terms may be defined either objectively or subjectively.
Some of the terms might have meaningful counterparts in the real world;
others are mere figments of the mathematician’s creative imagination,
having no connection with reality. But these points are of peripheral
importance to our claim that mathematics may be viewed as a game.
Let us begin with the game called arithmetic. What are the basic
“playing pieces” in this game? Certainly, the term number occupies a
prime role in this game. But just what is a number? While we may view
numbers in terms of lengths, tallies, and so on, the fact remains that the
concept of number is abstract and hence can be viewed in many different
concrete ways by different observers. Moreover, if we agree that number
is a subjective concept, where does that leave us with respect to such terms
as equality, addition, and multiplication since these are relations between
numbers?
The problem of basic definitions is not limited to the arithmetic side of
mathematics. Consider, for example, plane geometry. Euclid talks about
points and lines, but these concepts are very basic and their definitions
seem to be absurd. Consider Euclid’s definition:
A point is that which has no parts.
From a subjective point of view, this is perhaps his way of trying to project
the feeling of the “tinyness” of a point, but from an objective point of view
the definition says no more than that a point is nothing. In a way, the
felony is compounded when he defines a line as that which is generated by
a moving point.
In fact, we saw in the previous chapter that severe problems, almost
paradoxes, arose when one confused “point” with “dot.” For example,
the rule of plane geometry that says that one and only one line is
determined by a pair of distinct points really hinges on the concept of
“point” rather than on the physical nature of a “dot.” Figure 3.5 clearly
indicates that more than one line can be drawn between two “dots.”
Figure 3.5
Now, what about the rules of a particular game of mathematics? Years
ago no one spoke of the rules, but rather of axioms and postulates. Axioms
were often defined to be truths that were self-evident and could not be
proven. Such a description leaves much to be desired, to say the least.
Phrases such as “self-evident but cannot be proven” are almost
contradictions in terms. Most of us feel that for something to be
self-evident there must be some degree of proof, even if only of a highly
subjective nature. But let us suppose we reform somewhat and merely define
an axiom to be something that is self-evident. Notice that the subjectivity
now becomes enormous! What may be self-evident to one need not be
self-evident to another.
Suppose that someone told us that a certain axiom did not seem
self-evident to him. We could reply, “Look, this axiom need not seem any
more self-evident to me than it does to you. It need be no more self-evident
than the rule of grammar that tells us that we must not end a sentence with
a preposition, or the rule of baseball that classifies 3 strikes as an out. It
is merely a rule of the game. Whether or not you elect to accept it is up
to you. If you accept it we can still continue to learn the game. If you
do not accept it then we must stop playing, for my game hinges on this
rule that you will not accept.”
Of even more importance is the fact that in the most general use of the
term, an axiom need not be self-evident. We shall explore this in more
detail later. Now here is an axiom from a subject called projective
geometry, which we feel not only fails to qualify under the heading of
self-evident but even seems incredible! In fairness, however, we should point
out that the following axiom is not always accepted as a rule in all
treatments of projective geometry. In some presentations it is viewed as an
inescapable conclusion following from other rules. That is, there is no
absolute standard by which we can distinguish between cause and effect,
so to speak. Thus, when we say that a conclusion follows “inescapably,”
we are talking in relative terms; we mean inescapable providing that we
accept other things as being true. For instance, nothing inescapable makes
us accept that 12 inches equals 1 foot. This is an arbitrary assumption
that has been accepted as a standard. In the same way, we were not
compelled to accept 3 feet as equaling 1 yard. Nonetheless, once we accept
these two assumptions, then it follows inescapably that 36 inches equals
1 yard. We could have begun by defining 36 inches to equal 1 yard and
3 feet to equal 1 yard. Had we done this, the fact that 12 inches equals
1 foot would now be an inescapable consequence rather than just another
general assumption.
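The interchange of assumption and consequence can be spelled out as arithmetic. The following Python lines are a sketch only; the variable names are ours and merely label the two choices of starting rules.

```python
from fractions import Fraction

# First choice of rules: 12 inches = 1 foot and 3 feet = 1 yard.
inches_per_foot = 12
feet_per_yard = 3
inches_per_yard = inches_per_foot * feet_per_yard  # now forced on us

# Second choice of rules: 36 inches = 1 yard and 3 feet = 1 yard.
alt_inches_per_yard = 36
alt_feet_per_yard = 3
alt_inches_per_foot = Fraction(alt_inches_per_yard, alt_feet_per_yard)  # now forced on us
```

In the first game the 36 is a consequence and the 12 is an assumption; in the second game the roles are reversed, yet both games contain exactly the same three facts.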
To continue, the axiom to which we refer states the following. Through
any point O draw three different lines. On each of the three lines choose
two different points. Label the points on one line A and A'; on the second
line B and B'; and on the third line C and C'. In this way we obtain two
triangles ABC and A'B'C'. Extend corresponding sides of these two
triangles until they intersect. The points of intersection, as shown in
Figure 3.6, all lie on the same straight line! (By corresponding sides we
mean sides that are marked by the same letters. Thus, AB and A'B' are a
pair of corresponding sides, as are AC and A'C' or BC and B'C'.) Notice
that while the subject is called projective geometry, we do not have to
understand the subject to understand the statement of the above axiom.
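The construction just described (the statement usually known as Desargues' theorem) can be tried out numerically. The Python sketch below is one such trial, not a proof. The particular coordinates, and the use of homogeneous coordinates with cross products to find lines and intersection points, are choices made for this illustration and do not come from the text.

```python
def cross(p, q):
    """Cross product of two homogeneous triples: the line through two
    points, or the point in which two lines meet."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def det3(p, q, r):
    """Determinant of three triples; it is zero exactly when the three
    points lie on one line."""
    c = cross(q, r)
    return p[0] * c[0] + p[1] * c[1] + p[2] * c[2]


def on_line(o, direction, t):
    """The point o + t * direction, written as (x, y, 1)."""
    return (o[0] + t * direction[0], o[1] + t * direction[1], 1)


def meet(p, q, r, s):
    """Intersection of line pq with line rs."""
    return cross(cross(p, q), cross(r, s))


O = (0, 0, 1)
d1, d2, d3 = (1, 0), (0, 1), (1, 1)   # three different lines through O

A, A2 = on_line(O, d1, 1), on_line(O, d1, 3)   # A and A'
B, B2 = on_line(O, d2, 2), on_line(O, d2, 5)   # B and B'
C, C2 = on_line(O, d3, 1), on_line(O, d3, 4)   # C and C'

P = meet(A, B, A2, B2)   # AB meets A'B'
Q = meet(A, C, A2, C2)   # AC meets A'C'
R = meet(B, C, B2, C2)   # BC meets B'C'

collinear = det3(P, Q, R) == 0
```

Because only integers are used, the test for collinearity is exact. Changing the six chosen points (keeping them on the three lines through O) gives further trials of the same claim.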
We can carry out the indicated construction and verify the claim of the
axiom; yet the axiom will not seem self-evident to many. Let us also point
out that if we define an axiom to be a rule of the game we are in no way
bound to force the axiom to be true! Certainly, one has the right to make
up a rule as he sees fit independently of what exists in the real world. For
example, while it is true in the real world that the statement, “All Texans
are Martians” is false, there is nothing that prevents us from inventing a
game in which some pieces are named Texans and others are named
Martians, and then to make up the rule in this game that all Texans are
Martians.
However, before concluding this section, let us say something concerning
the old definition that an axiom is a self-evident statement of fact.
Actually, this definition is not in direct contradiction to our remarks that
an axiom is a rule of the game. The fact is that to most people mathematics is
a product of the real world. It is the language of the sciences. One
problem of the scientist is to make accurate predictions concerning the
world in which we live. Thus, while the rules of any game are at best
assumptions, at least in the game of science we try to make the rules as
self-evident as possible so that the predictions based on these rules will be as
“true” as possible.
In fact, this may be the best way to distinguish between pure and applied
mathematics, at least insofar as the game analog is concerned. That is,
let us agree that we call it applied mathematics when the game is based on
rules that conform to the real world, and pure or abstract mathematics if
the rules are compatible with one another but do not seem to serve as a
model for any physical entity. In summary, both pure and applied
mathematics as games have the same structure; the only difference is the
degree of realism upon which the rules are based.
Returning to the terminology of the game, we observe that strategy calls
for deducing inescapable consequences of the rules and other assumptions.
Such inescapable conclusions are called theorems. According to our
interpretation, the theorem is not merely the statement but its proof as well.
That is, any statement about our game, true or false, is called a conjecture
until it can be shown to follow as an inescapable consequence of the rules.
The conjecture together with the proof is the theorem. This is in keeping
with our main interest of seeing what things follow from the assumptions,
rather than including those truths which seem to be unrelated to the
assumptions. By way of illustration, in plane geometry the statement
that the base angles of an isosceles triangle are equal is only a conjecture
until we prove that it follows from the other definitions and rules. In fact
this was the beauty of the ancient Greeks’ contribution to geometry.
Certainly it must have been self-evident long before the time of Euclid that
the base angles of an isosceles triangle were equal. It was clear simply by
looking at the triangle. What Euclid did was establish a handful of
definitions and rules and then show that this result (together with several
volumes of other results) was an inescapable consequence of these
definitions and rules. The beauty of a theorem is that it contains a fact
that follows merely from the truth of other assumptions; and it does not
require any additional pieces of information. This is precisely what we
mean when we talk about capturing the structure of the subject.
Our definition of theorem allows us to make a strong connection between
truth and validity, a connection that we discussed a little earlier in this
chapter. Recall that we said that if the assumptions were true and the
argument valid then the conclusion was also true. Now, since in a theorem
the conclusion is a valid consequence of our rules, it follows that as soon as
the rules happen to be true so also will the conclusion be true. In
particular, then, as soon as we find a physical model that conforms to our
rules (and there may be no such model or there may be a large number of such
models, depending on the rules under consideration) the valid conclusions
of our game become “true facts” in the physical model.
We have chosen to study the arithmetic of sets as our first game. This
is our first choice because sets are new enough to most of us so that the study
will not seem dull; yet at the same time this subject is concrete enough so
that we can feel we are dealing with something “real.”
Before proceeding with the game of sets, however, a few additional words
are in order concerning our discussion of logic. While we may have made
logic seem sort of like a cure-all, the fact remains that the concept of
“inescapable” is rather subjective. How, for example, can we distinguish
between that which is really inescapable and that which isn’t, even though
we may think it is? From a different perspective this is much the
same as distinguishing between an unsolved problem and an unsolvable
problem. We do not know the difference until the problem is solved (in
which case it is no longer an unsolved problem); yet until the problem is
solved we usually have no way of making the distinction.
Our point is that, since logic is the process of deducing inescapable
conclusions, even the study of logic is a game based upon man-made rules.
We do not wish to enter into a discussion of logic at this time but in terms
of the structure of our game concept, we should point out that every game
has not one but two sets of rules. There are the special rules, peculiar to
that particular game, such as three strikes is an out in the game of baseball;
and there are those rules that apply to the strategy level of every game:
the rules of logic. For example, no matter what game we are playing, we
find ourselves using such rules as: If event A causes event B to happen
and if event B causes event C to happen, then event A also causes event
C to happen. We repeat, no matter what the game, we must come to
grips with the rules of logic.
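As one illustration, the rule quoted above can be examined by listing every case. The Python sketch below rests on an assumption that is ours rather than the text's: “A causes B” is modeled by the truth-table reading of “if A, then B.”

```python
from itertools import product


def implies(p, q):
    """Truth-table reading of 'if p, then q': false only when p is true
    and q is false."""
    return (not p) or q


# All eight ways of assigning true or false to A, B, and C.
rows = list(product([True, False], repeat=3))

# The rule: if (A implies B) and (B implies C), then (A implies C).
rule_always_holds = all(
    implies(implies(a, b) and implies(b, c), implies(a, c))
    for a, b, c in rows
)
```

Under this reading the rule holds in every one of the eight rows, which is what it means for it to be usable in any game whatever.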
For the immediate purposes of our course we shall take the liberty of
assuming that we all, to some extent or another, know the basic rules of
logic, and we shall feel free to draw upon these as we need them in the
development of any particular game we are playing.
3.9 BOOLEAN ALGEBRA
3.9.1 INTRODUCTION
We have explained the value of an abstract system in terms of our search
for determining which statements follow inescapably from other
statements that were assumed to be true. In this section we shall investigate
the structure of the set of subsets of a given set and see if we can isolate a
few properties that determine the entire structure of the system. That is,
we shall accept certain statements of fact about the set of all subsets in a
given set. Then we shall see what conclusions follow inescapably from
our assumptions. Next we shall try to find other models that obey the
same rules. Any such model must have the same properties as the original
since these properties depend only on the rules.
The term Boolean algebra is applied to any system that obeys the rules
we shall accept for the set of all subsets of a given set. This subject is
named for the Englishman George Boole (1815-1864) who developed the
algebra of sets in about 1850.
We shall show that two fairly important systems are examples of Boolean
algebra; namely, truth-table logic and the theory of switches in electrical
networks.
Before we talk about these systems we shall first undertake the
investigation of the subsets of a given set. For one thing, this will afford
an excellent review of sets; it will also serve as the most intuitive
introduction to the more abstract concept of a Boolean algebra.
3.9.2 THE SET OF SUBSETS OF A GIVEN SET

In the following discussion, let I denote any given set. We shall then
investigate the set of all subsets of I. In other words, let
A = {x : x ⊆ I}.
(Notice here that the elements of A are themselves sets, namely subsets of I.)
We wish to study the structure of A.
To begin with, we observe that N(A) ≥ 1, since no matter how we choose
I, ∅ ⊆ I; and hence, ∅ ∈ A. Moreover, if we assume that I ≠ ∅, then
N(A) ≥ 2; since then both ∅ and I are distinct subsets of I, and hence they
will be distinct elements of A. Since I = ∅ is a rather dull situation, we
shall assume in the following that I ≠ ∅.
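To make the set A concrete, the following Python sketch lists all subsets of a small set. The choice I = {1, 2, 3} and the helper name subsets are assumptions made only for this illustration.

```python
from itertools import combinations


def subsets(I):
    """Return the set of all subsets of I, each stored as a frozenset."""
    items = sorted(I)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}


I = frozenset({1, 2, 3})
A = subsets(I)

empty = frozenset()
empty_is_member = empty in A   # the empty set is a subset of every I
whole_is_member = I in A       # I is a subset of itself
count = len(A)                 # this plays the role of N(A)
```

For this I the count is 8, and both the empty set and I itself appear among the elements of A, in agreement with N(A) ≥ 2. When I is empty, subsets(I) contains only the empty set, the dull case set aside above.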
We have previously defined the operations of union, intersection, and
complementation of sets, as well as the relation of equality. We have seen
that our definition of equality for sets was an equivalence relation. That
is,
(1) For each x ∈ A, x = x. (reflexive)
(2) For x ∈ A and y ∈ A, if x = y, then y = x. (symmetric)
(3) For x, y, and z ∈ A, if x = y and y = z, then x = z. (transitive)
These three conditions then allow us to state
(4) If x = y, we may replace x by y (and vice versa) in any statement of
equality without changing the truth of the statement. (substitution)
Finally, our insistence on using only sets with a well-defined test for
membership allows us also to claim the following.
(5) Given any two elements, x and y, of A, then exactly one of the following
two statements is true.
x = y
x ≠ y (dichotomy)
In what follows we shall assume that we have accepted the above five
properties as “rules of the game” concerning the relation of equality for
elements in A.
Turning to unions and intersections, using either charts or circle diagrams
and basic definitions, we see that
(6) If x ∈ A and y ∈ A, then x ∪ y and x ∩ y are also elements of A.
(closure)
In plainer terms, all this says is that the union of two sets is again a set, as
is the case with the intersection.
(7) x ∪ y = y ∪ x
x ∩ y = y ∩ x. (commutative)
(8) (x ∪ y) ∪ z = x ∪ (y ∪ z)
(x ∩ y) ∩ z = x ∩ (y ∩ z). (associative)
Unions and intersections are related by:
(9) x ∩ (y ∪ z) = (x ∩ y) ∪ (x ∩ z)
x ∪ (y ∩ z) = (x ∪ y) ∩ (x ∪ z). (distributive)
The special subsets ∅ and I are characterized by the following.
(10) For each x ∈ A, x ∩ I = x; and x ∪ ∅ = x. (identity)
Finally, the principle of complements states that
(11) Given x ∈ A, there exists another element x' ∈ A such that:
x ∪ x' = I
x ∩ x' = ∅. (rule of complements)
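A player who prefers to see the rules at work may test rules (6) through (11) on a small example. The Python sketch below does so by trying every combination of subsets; the choice I = {1, 2, 3}, the use of frozensets, and all names are assumptions made for the illustration. Passing such a test in one model is a motivation for accepting the rules, not a proof of them.

```python
from itertools import combinations, product

I = frozenset({1, 2, 3})
EMPTY = frozenset()
A = [frozenset(c)
     for r in range(len(I) + 1)
     for c in combinations(sorted(I), r)]


def complement(x):
    """The element x' of rule (11): everything in I that is not in x."""
    return I - x


pairs = list(product(A, repeat=2))
triples = list(product(A, repeat=3))

# In Python, | is union and & is intersection.
closure = all((x | y) in A and (x & y) in A for x, y in pairs)
commutative = all(x | y == y | x and x & y == y & x for x, y in pairs)
associative = all((x | y) | z == x | (y | z) and
                  (x & y) & z == x & (y & z) for x, y, z in triples)
distributive = all(x & (y | z) == (x & y) | (x & z) and
                   x | (y & z) == (x | y) & (x | z) for x, y, z in triples)
identity = all(x & I == x and x | EMPTY == x for x in A)
complements = all(x | complement(x) == I and
                  x & complement(x) == EMPTY for x in A)
```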
At this point it might be well for us to digress and make some comments
about the rules. Keep in mind that we are not calling the rules
self-evident even though they may seem that way to some “players.” For
example, the associative rules merely tell us that in our game, “voice
inflection” (symbolically, we indicate voice inflection in mathematics by use
of parentheses and the like) does not matter when we deal exclusively with
unions or intersections. However, the distributive rules tell us that voice
inflection is important when unions and intersections appear within the
same expression.
This is not unlike the situation in ordinary arithmetic. For example,
given 3 + 4 + 5, we can pronounce it as (3 + 4) + 5 or as 3 + (4 + 5).
It happens that these two pronunciations yield the same result. On
the other hand, 3 + 4 × 5 also has two pronunciations. But in one
case we have (3 + 4) × 5 (= 35), while in the other case we have
3 + (4 × 5) (= 23). Thus, the value of 3 + 4 × 5 does depend on voice
inflection. Even with respect to the same operation, the use of parentheses
can make a difference. For example, given 9 − 3 − 1, we may view this
as (9 − 3) − 1, which is 5; or we may view it as 9 − (3 − 1), which is 7.
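The same arithmetic can be written out in Python, where parentheses play the part of voice inflection. The variable names are ours and serve only to label the different “pronunciations.”

```python
# Addition alone: the grouping does not matter.
sum_left = (3 + 4) + 5
sum_right = 3 + (4 + 5)

# Addition mixed with multiplication: the grouping matters.
mixed_first = (3 + 4) * 5
mixed_second = 3 + (4 * 5)

# Subtraction alone: the grouping matters even for a single operation.
diff_left = (9 - 3) - 1
diff_right = 9 - (3 - 1)
```

The two sums agree, while the other two pairs do not: the mixed expressions give 35 and 23, and the differences give 5 and 7.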
Such problems are not restricted to mathematics. Among other things
they occur right within the framework of grammar. By way of illustration,
consider the statement
They don’t know how good meat tastes.
As this sentence stands, we have no way of knowing whether “good” is an
adjective modifying meat or an adverb modifying tastes. That is, are
these people who have eaten only bad meat so that they do not know how
good meat tastes; or have they never eaten meat at all so that they do not
know how good meat tastes?
Turning now to the commutative rules, let us observe that they are
independent of the rules of closure. That is, for example, closure tells
us that both A ∪ B and B ∪ A are sets; but it does not tell us that these
two sets are equal. The same thing occurs in numerical arithmetic. For
example, the fact that the integers are closed with respect to subtraction
tells us that both 3 − 5 and 5 − 3 are integers; but we cannot conclude
from this that 5 − 3 and 3 − 5 are equal.
Moreover, since equality is a relation between two sets, the commutative
rules would not even make sense unless we had defined closure first. That
is, unless we know that A ∪ B is a set (just as 3 + 2 is one number) and
likewise that B ∪ A is a set, it is meaningless, in terms of our use of the
symbols, to even write A ∪ B = B ∪ A, since = only relates sets to sets.
As a final word of caution, there is a tendency by some to confuse the
rule of symmetry for equality with the rule of commutativity for unions and
intersections. Observe that symmetry merely tells us that the equality of
two sets does not depend on the order in which we name the two sets, while
commutativity tells us that two different ways of combining two given sets
yield the same set. In juxtaposition, it is commutativity that tells us that
x ∪ y = y ∪ x. Once we know this, it is the symmetric rule for equality
that lets us write y ∪ x = x ∪ y in place of x ∪ y = y ∪ x.
Returning to the list of rules, notice that many more than the above 11
rules are obeyed. However, we intend to show that these 11 rules serve
as a backbone from which many, many other statements follow inescapably.
Before doing this, it is important to note that while the 11 rules evolved
from the definition of equality, union, intersection, complement, ∅, and
I, nowhere are these terms defined in the 11 rules. That is, the player of
our game is not obliged to accept circle diagrams, charts, and so on. He
need only accept the 11 rules themselves.
For example, someone might accept rule (11) by considering
x    x'    x ∪ x'    x ∩ x'    I    ∅
1    0       1         0       1    0
0    1       1         0       1    0
Hence, comparing columns, x ∪ x' = I and x ∩ x' = ∅.
On the other hand, we do not call the chart a proof of rule (11) since
we never force any interpretation on a player. In short, we only require
that each player accept rule (11). We do not make him explain to us why
he does. This helps us to avoid basing our development in terms of
subjective models that might be acceptable to some players but not to others.
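For the player who does accept charts, the chart for rule (11) can be produced mechanically. In the Python sketch below, 1 means that a typical element belongs to the set and 0 means that it does not; reading union as the larger of two entries and intersection as the smaller is our way of encoding the chart method and is an assumption of this illustration.

```python
rows = []
for x in (1, 0):
    x_prime = 1 - x              # an element is in x' exactly when it is not in x
    union = max(x, x_prime)      # in x ∪ x' when it is in at least one of them
    inter = min(x, x_prime)      # in x ∩ x' when it is in both of them
    whole, empty = 1, 0          # every element is in I; none is in ∅
    rows.append((x, x_prime, union, inter, whole, empty))

union_matches_I = all(r[2] == r[4] for r in rows)
inter_matches_empty = all(r[3] == r[5] for r in rows)
```

The two comparisons at the end correspond to matching the columns of the chart. As the text stresses, this is a reason one player might give for accepting rule (11), not a proof forced on every player.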
As a further review, this time of the circle diagram method, suppose
we wish to verify that
(x ∪ y) ∪ z = x ∪ (y ∪ z).
We would draw Figure 3.7. Then x = {1,2,4,5}, y = {2,3,5,6}, z = {4,5,6,7}.
So (x ∪ y) = {1,2,3,4,5,6}; therefore (x ∪ y) ∪ z = {1,2,3,4,5,6,7}.
And (y ∪ z) = {2,3,4,5,6,7}; hence, x ∪ (y ∪ z) = {1,2,3,4,5,6,7}. See
Figures 3.8 and 3.9. Consequently, a believer in circle diagrams would
use this as his motivation for agreeing to accept rule (8), even though we
really don’t care what led him to the acceptance of the rule.
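The bookkeeping of the circle diagram can be repeated with Python sets, using the same numbered regions as in the text. The variable names left and right are ours.

```python
# Regions of the circle diagram that make up each set.
x = {1, 2, 4, 5}
y = {2, 3, 5, 6}
z = {4, 5, 6, 7}

x_union_y = x | y          # | is the union of two sets
y_union_z = y | z

left = x_union_y | z       # (x ∪ y) ∪ z
right = x | y_union_z      # x ∪ (y ∪ z)
```

Both groupings cover all seven regions, which is the observation a believer in circle diagrams would use as motivation for accepting rule (8).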
We leave it to the reader to verify the other rules stated above by any
method that he wishes.
Let us now proceed with the strategy part of the game, but from a fairly
informal point of view. Let us start with any element x ∈ A.
Figure 3.9
(1) x ∩ x' = ∅                     by rule (11).
(2) x ∪ (x ∩ x') = x ∪ ∅           by rule (4); that is, we have substituted ∅
                                   for x ∩ x', since they are equal by step (1).
(3) (x ∪ x) ∩ (x ∪ x') = x         by rules (9), (10), and (4) (that is,
                                   x ∪ (x ∩ x') = (x ∪ x) ∩ (x ∪ x')
                                   and x ∪ ∅ = x, so we may substitute
                                   these for one another).
(4) (x ∪ x) ∩ I = x                by rules (11) and (4).
(5) x ∪ x = x                      by rules (10) and (4).
We have thus demonstrated that anyone who has accepted the 11 rules
must now either accept that x ∪ x = x, or else contradict himself! That
is, x ∪ x = x follows inescapably from the 11 rules!
Before proceeding further, a few remarks are in order. First, observe
that to the person who has accepted, for example, the chart method, it
would be obvious that x ∪ x = x, since
x    x    x ∪ x
1    1      1
0    0      0
However, the chart above would not constitute a proof to the person
who did not understand the chart method. In contrast to this, notice
that our demonstration used no physical model, but only the fact that we
accepted the 11 rules. (In fact, to be more precise, the proof that x ∪ x =
x follows inescapably from the mere acceptance of rules 4, 9, 10, and 11.)
Secondly, we have made no attempt to show why the sequence of steps
that we used to demonstrate that x ∪ x = x was self-evident. The main
reason for this is that we do not feel that these steps need be self-evident
to the reader; for, as we have mentioned before, the strategy part of any
game is, at least in part, quite creative; and the ability to “create” certainly
varies from player to player. In this context we do not preclude the
possibility that different players would have found different, but equally
correct, proofs. In other words there can be more than one way to show
that a particular conclusion follows inescapably from a set of given
assumptions, even within the framework of the same rules of logic.
While we understand that players may experience great difficulty in
mapping out the “proper” strategy, we shall nonetheless expect the player
to work with sufficient diligence so that he learns to appreciate, or at least
understand, the structure of a proof when he sees one. In terms of a
game in general, we are saying that there are many chess players, for
example, who could not think up the same move that the expert creates;
but that once they see the expert make the move, they understand and
appreciate the brilliance of it!
We shall devote the next section to the development of other facts (that
is, theorems) that follow from the rules we have accepted for the game.
To the players of our game we have already demonstrated the following
theorem.
Theorem 1
For each x G A , x x = x.
Recall th a t the theorem is the statem ent, together with the proof of the
statem ent. The statem ent, until the proof is supplied, is merely a con
jecture. In other wTords, the theorem is the statem ent th a t follows in
escapably from the rules; and the demonstration of this is called the
proof. In no event do we say th a t the theorem is true, only th a t it is an
inescapable consequence of our rules.
Exercises
1. Use the circle diagrams to verify the distributive rule.
2. Use the chart method to verify the distributive rule.
3. Explain why we do not call either the circle diagram or the chart a proof
of the distributive rule.
3.9.3 SOME FUNDAMENTAL THEOREMS

In this section we shall show the logical side of deriving inescapable (valid)
conclusions from given assumptions. Thus, we shall assume the usual
rules of logic and the 11 rules of the previous section. We have already
established the theorem that x ∪ x = x, but we shall rederive this in the
more formal statement-reason format in order to emphasize the structure
of a proof. Later, to help prevent the boredom that usually accompanies
long explanations, we shall omit the statement-reason format and return
to the style of the proof as given in the preceding section.
Theorem 1
For each x ∈ A, x ∪ x = x.
proof:
Statement                                   Reason
(1) Given x ∈ A, there exists x′ ∈ A        (1) Rule of complements
    such that x ∩ x′ = ∅
(2) x ∪ (x ∩ x′) = x ∪ ∅                    (2) Substitution of ∅ for x ∩ x′
                                                in x ∪ (x ∩ x′)
(3) x ∪ (x ∩ x′) = (x ∪ x) ∩ (x ∪ x′)       (3) Distributive rule
(4) (x ∪ x) ∩ (x ∪ x′) = x ∪ ∅              (4) Substitution of (3) into (2)
(5) x ∪ x′ = I                              (5) Rule of complements
(6) x ∪ ∅ = x                               (6) Identity rule
(7) (x ∪ x) ∩ I = x                         (7) Substitution of (5) and (6)
                                                into (4)
(8) (x ∪ x) ∩ I = x ∪ x                     (8) Identity rule
(9) x ∪ x = x                               (9) Substitution of (8) into (7)
This proof merely spells out, step by step, the procedure that had
occurred previously when we wrote
    x ∩ x′ = ∅
    x ∪ (x ∩ x′) = x ∪ ∅
    (x ∪ x) ∩ (x ∪ x′) = x ∪ ∅
    (x ∪ x) ∩ I = x
    (x ∪ x) = x.
A rather strange property of the rules of our game of subsets is that
these rules are so designed that any rule will remain a rule if we interchange
∪ and ∩ as well as ∅ and I everywhere. For example, the rule that
x ∪ ∅ = x would then become x ∩ I = x, which is also a rule. The
implication of this phenomenon, known as the principle of duality, is that
each theorem will remain a theorem if ∪ is interchanged with ∩, and ∅
is interchanged with I.
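The purely mechanical character of this interchange can be sketched in a few lines of Python. This is only an illustration and not part of the game itself: the names DUAL and dual are our own inventions, and we assume that a statement is written as a plain string in which the capital letter I is used for nothing except the universal set.

```python
# Translation table for the duality principle: ∪ <-> ∩ and ∅ <-> I.
# Assumption: the letter "I" never appears in a statement for any other purpose.
DUAL = str.maketrans({"∪": "∩", "∩": "∪", "∅": "I", "I": "∅"})


def dual(statement: str) -> str:
    """Return the dual of a statement written with ∪, ∩, ∅ and I."""
    # translate() maps every character exactly once, so a symbol that has
    # been swapped is never swapped back within the same call.
    return statement.translate(DUAL)


print(dual("x ∪ ∅ = x"))  # x ∩ I = x
print(dual("x ∪ x = x"))  # x ∩ x = x
```

Applying dual twice returns the original statement, which mirrors the fact that the dual of a dual is the statement we started with.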
For example, using the principle of duality, we would have as a corollary
to Theorem 1 that for each x ∈ A, x ∩ x = x. We do not have to accept
the duality principle on blind faith, nor as another rule. Rather, we need
only observe that to prove the dual of a given theorem all we need do is
interchange ∪ and ∩, as well as ∅ and I everywhere in the proof of the
theorem.
As an application of this, we shall prove the dual of Theorem 1 by
copying the statement and the proof word for word, except that we shall
interchange ∪ and ∩, as well as ∅ and I.
Corollary 1¹
For each x ∈ A, x ∩ x = x.
proof:
Statement                                   Reason
(1) Given x ∈ A, there exists x′ ∈ A        (1) Rule of complements
    such that x ∪ x′ = I.
(2) x ∩ (x ∪ x′) = x ∩ I                    (2) Substitution
(3) x ∩ (x ∪ x′) = (x ∩ x) ∪ (x ∩ x′)       (3) Distributive rule
(4) (x ∩ x) ∪ (x ∩ x′) = x ∩ I              (4) Substitution
(5) x ∩ x′ = ∅                              (5) Rule of complements
(6) x ∩ I = x                               (6) Identity rule
(7) (x ∩ x) ∪ ∅ = x                         (7) Substitution
(8) (x ∩ x) ∪ ∅ = x ∩ x                     (8) Identity rule
(9) x ∩ x = x                               (9) Substitution
We merely copied the statement and the proof of Theorem 1 verbatim
except for the duality substitution, and this gave us a proof for Corollary
1. Yet, if we didn’t tell the reader our secret, notice that the resulting
statement-reason format is a valid proof; that is, the reader would not
know that all we did was make a verbatim translation to obtain the dual
from the original theorem. In other words, whenever we have proven a
theorem we may immediately write down its dual as another theorem, and
if called upon to submit a proof, we need merely copy the proof of the
theorem, replacing ∪ by ∩ and ∅ by I (and vice versa). Henceforth,
whenever we do this, we shall simply give duality as the reason.
Keep in mind that we never have tried to explain why we chose the
particular theorem that we did for our first theorem. We could have
chosen others. Nor do we try to explain why we chose the particular
proof that we did. There could have been others. Nor do we try to
explain the motivation behind either the theorem or the proof. In short,
much of this part of the game involves a certain amount of personal choice;
and this is far too subjective a concept to teach as some sort of science.

¹ A corollary is a theorem whose validity follows almost immediately from another
fact. That is, to indicate that the above result follows “quickly” from Theorem 1, we
call it a corollary rather than, say, Theorem 2. Certainly the distinction between a
corollary and a theorem may often be quite subjective.

Let us prove another theorem. By using either charts or circle
diagrams, it is easy to see that x ∩ ∅ = ∅. Yet, in our rules we mention
only the identity property of ∅. That is, self-evident or not, we have
not yet established the right to use x ∩ ∅ = ∅ in terms of the structure
of our game; all we know about ∅ is that x ∪ ∅ = x. So as our next
project let us try to establish that for each x ∈ A, x ∩ ∅ = ∅.
Granted that the structure of a proof is highly subjective, we shall
nonetheless try to motivate a plan of attack so that the reader might obtain
some idea as to how one might proceed. We observe that we must bring
∅ into the picture in terms of a rule that we have already accepted. But
the only rule that governs ∅ is that y ∪ ∅ = y for each y ∈ A. Thus, for
a given x ∈ A we have by substituting y for y ∪ ∅
    x ∩ (y ∪ ∅) = x ∩ y.
By the distributive rule and substitution, this yields
    (x ∩ y) ∪ (x ∩ ∅) = x ∩ y.                    (1)
This gives us a form that contains the desired x ∩ ∅; but things are
still fairly complicated. However, we assumed that y could be any element
of A. We now choose y to our advantage; namely, since we know that
x ∩ x′ = ∅, we shall choose y to be x′; whence (1) by substitution
becomes
    (x ∩ x′) ∪ (x ∩ ∅) = x ∩ x′
or
    ∅ ∪ (x ∩ ∅) = ∅.                              (2)
However, since z ∪ ∅ = ∅ ∪ z = z for each z ∈ A, and since x ∩ ∅ is an
element of A (that is, let z = x ∩ ∅), (2) becomes
    x ∩ ∅ = ∅.
Once we have the proof sketched as above, we can put it into the more
direct statement-reason form and proceed in the usual way. The reader
who has studied geometry may recall this as a common technique.
Namely, one makes a rough diagram, forms a plan of attack, and then
translates his approach into the formal statement-reason approach.
Theorem 2
For each x ∈ A, x ∩ ∅ = ∅.
proof:
Statement                                   Reason
(1) For the given x there exists x′         (1) Rule of complements
    such that x ∩ x′ = ∅.
(2) x′ ∪ ∅ = x′                             (2) Identity rule
(3) x ∩ (x′ ∪ ∅) = x ∩ x′                   (3) Substitution
(4) x ∩ (x′ ∪ ∅) = (x ∩ x′) ∪ (x ∩ ∅)       (4) Distributive rule
(5) (x ∩ x′) ∪ (x ∩ ∅) = x ∩ x′             (5) Substitution of (4) into (3)
(6) x ∩ x′ = ∅                              (6) Rule of complements
(7) ∅ ∪ (x ∩ ∅) = ∅                         (7) Substitution of (6) into (5)
(8) ∅ ∪ (x ∩ ∅) = (x ∩ ∅) ∪ ∅               (8) Commutative rule
(9) (x ∩ ∅) ∪ ∅ = ∅                         (9) Substitution
(10) (x ∩ ∅) ∪ ∅ = x ∩ ∅                    (10) Identity rule
(11) x ∩ ∅ = ∅                              (11) Substitution
Again, there may be other ways to prove this theorem; the proof above
is only one way. Also, the fact remains that even if the above
demonstration does not seem self-evident to a player, it still constitutes
a proof.
In support of an earlier remark, it should be noted here that nowhere
have we used Theorem 1 in the proof of Theorem 2. This implies that
the two theorems are independent. That is, Theorem 2 could just as
well have been proved first. Finally, to the student who feels that
charts or circle diagrams are an “easier” way of getting the answer, we wish
to emphasize, as strongly as possible, that our procedure does more than
show the answer. It shows how the answer is an inescapable consequence
of our 11 rules, thus emphasizing the role of structure.
Using the principle of duality, we now have an immediate consequence of
Theorem 2.
Corollary 2
For each x ∈ A, x ∪ I = I.
At this point we shall begin to prove theorems in a slightly more informal
way. It will be left as an exercise for the reader to translate these informal
proofs into the standard statement-reason format.
Theorem 3
For each x and y in A, x ∪ (x ∩ y) = x.
proof:
    x ∪ (x ∩ y) = (x ∩ I) ∪ (x ∩ y)    Identity rule and substitution
                = x ∩ (I ∪ y)          Distributive rule (and symmetry;
                                       that is, we usually write
                                       x ∩ (I ∪ y) = (x ∩ I) ∪ (x ∩ y)
                                       and symmetry allows us to
                                       reverse the order of the equality)
                = x ∩ I                Commutative rule and Corollary 2
                = x                    Identity rule
The proof as given above is usually abbreviated even more (with the
reader left to supply the step-by-step reasons):
    x ∪ (x ∩ y) = (x ∩ I) ∪ (x ∩ y) = x ∩ (I ∪ y) = x ∩ (y ∪ I)
                = x ∩ I = x.
The uninterrupted string of equalities establishes the proof of the theorem.
Then, by duality, we have the following corollary.
Corollary 3
For each x and y in A, x ∩ (x ∪ y) = x.
Again, we could have established the results stated in Theorem 3 and
its corollary merely by a suitable circle diagram. However, the proof
establishes the inescapability of these results from the 11 assumptions,
independently of the use of charts, circle diagrams, and so on. In still
other words, our proof shows not only the “truth” of Theorem 3 but also
the fact that no other properties of sets, other than those already
objectively stated, are necessary to “guarantee” this “truth.” We keep
repeating this idea because we feel it cannot be emphasized too strongly.
Moreover, let us also observe that we are not bound to invoke the
principle of duality merely because we have the right to. For example, once
Theorem 3 is established, we may write (by the distributive rule)
    x ∪ (x ∩ y) = (x ∪ x) ∩ (x ∪ y).
But by Theorem 1, x ∪ x = x; hence, x ∪ (x ∩ y) = x ∩ (x ∪ y). Now
since x ∪ (x ∩ y) = x and since x ∪ (x ∩ y) = x ∩ (x ∪ y), we have
x ∩ (x ∪ y) = x, and this gives us another proof of Corollary 3. This
further substantiates our remark that there is often more than one correct
proof of a particular theorem.
We have often mentioned that what seems natural to do may be wrong.
For example, it might seem natural that if x ∪ z = y ∪ z, then x = y.
That is, it might seem natural to cancel out the z’s just as we do in regular
arithmetic. But natural or not, the fact remains that it is wrong.
Consider, for example, the situation wherein I = {1,2,3}; x = {1,2}; y = {2,3};
and z = {1,3}. In this case x ∪ z = {1,2,3} = y ∪ z, yet x ≠ y. We
shall talk in more detail later about the differences between the arithmetic
of numbers and the arithmetic of sets.
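The reader who wishes to see this counterexample checked mechanically might use the following Python sketch, in which | and & play the roles of union and intersection. The sets I, x, y, and z are those of the text; the set w is a hypothetical choice of our own, added only to illustrate that cancellation can fail for intersection as well.

```python
# The sets named in the text.
I = {1, 2, 3}
x = {1, 2}
y = {2, 3}
z = {1, 3}

# Union with z gives the same result for x and y ...
same_union = (x | z) == (y | z)
# ... yet x and y are different sets, so z cannot be "canceled."
different = x != y

# Hypothetical set w (our choice, not the text's) for the intersection case.
w = {2}
same_intersection = (x & w) == (y & w)

print(same_union, different, same_intersection)  # True True True
```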
In a similar way, one can show that if x ∩ z = y ∩ z, it need not be
true that x = y. However, a modified cancellation law applies; namely,
the next theorem.
Theorem 4
Suppose that we are given elements x and y in A and that we can find an
element z in A such that
    x ∪ z = y ∪ z
and
    x ∩ z = y ∩ z.
Then x = y.
proof:
    x = x ∪ (x ∩ z)          Theorem 3
      = x ∪ (y ∩ z)          Given & substitution
      = (x ∪ y) ∩ (x ∪ z)    Distributive rule & substitution
      = (y ∪ x) ∩ (y ∪ z)    Commutative rule & substitution
      = y ∪ (x ∩ z)          Distributive rule
      = y ∪ (y ∩ z)          Substitution
      = y                    Theorem 3
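As a companion to the proof, the following Python sketch searches every triple of subsets of a small set for a violation of Theorem 4. Such a search is, of course, not a proof in the sense of our game (it examines one model only), and the names subsets and theorem4_holds, as well as the choice of the set {1,2,3}, are assumptions made purely for illustration.

```python
from itertools import combinations


def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def theorem4_holds(A):
    """Return True if no x, y, z in A violate the modified cancellation law."""
    for x in A:
        for y in A:
            for z in A:
                both_equal = (x | z == y | z) and (x & z == y & z)
                if both_equal and x != y:
                    return False  # a counterexample was found
    return True


A = subsets({1, 2, 3})
print(len(A))             # 8
print(theorem4_holds(A))  # True
```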
Theorem 4 is important in that it gives us a criterion for testing the
equality of two elements of A. Namely, given two elements, say x and
y, all we need do is exhibit an element z such that x ∪ z = y ∪ z and
x ∩ z = y ∩ z. (Of course, by the commutative rules it would be
sufficient that z ∪ x = z ∪ y and z ∩ x = z ∩ y.) Theorem 5 is an
application of this idea.
Theorem 5
I′ = ∅.
proof: We shall show that I ∪ I′ = I ∪ ∅ and I ∩ I′ = I ∩ ∅. Then
Theorem 5 will follow immediately from Theorem 4. By definition of
I′, we have that I ∪ I′ = I and I ∩ I′ = ∅; but by definition of ∅,
I ∪ ∅ = I, while Theorem 2 shows that I ∩ ∅ = ∅. Hence,
    I ∪ I′ = I ∪ ∅
    I ∩ I′ = I ∩ ∅
and the result follows.
Corollary 5
∅′ = I.
proof: We merely apply the duality principle to Theorem 5. Observe
that, in a sense, Theorem 5 could be viewed as a corollary to Theorem 4
since it is virtually an immediate consequence of the result stated therein.
However, we prefer to call this result a theorem rather than a corollary
for the purpose of emphasizing the result.
In a similar way, the following is easily shown to be another consequence
of Theorem 4.
Theorem 6
For each x ∈ A, (x′)′ = x. (This is a form of double negation.
Intuitively it says that the complement of the complement is the original set.)
proof: By Theorem 4 all we need show is that x′ ∪ (x′)′ = x′ ∪ x and
x′ ∩ (x′)′ = x′ ∩ x in order to establish the equality of x and (x′)′.
But by the definition of (x′)′ we have that (x′) ∪ (x′)′ = I and also
(x′) ∩ (x′)′ = ∅, while by the definition of x′ (and the commutative rule)
we have that (x′) ∪ x = I and (x′) ∩ x = ∅, and the theorem is proved!
Let us put this into statement-reason format.
Statement                                   Reason
(1) x′ ∪ (x′)′ = I                          (1) Rule of complements
    x′ ∩ (x′)′ = ∅
(2) x ∪ x′ = I                              (2) Rule of complements
    x ∩ x′ = ∅
(3) x ∪ x′ = x′ ∪ x                         (3) Commutative rule
    x ∩ x′ = x′ ∩ x
(4) x′ ∪ x = I                              (4) Substitution of (3) into (2)
    x′ ∩ x = ∅
(5) x′ ∪ (x′)′ = x′ ∪ x                     (5) Substitution of (4) into (1)
    x′ ∩ (x′)′ = x′ ∩ x
(6) (x′)′ = x                               (6) Theorem 4
As a final application of Theorem 4 we have the next theorem.

Theorem 7
For x, y in A; (x ∪ y)′ = x′ ∩ y′.
proof: By the definition of (x ∪ y)′, we know that (x ∪ y) ∪ (x ∪ y)′ = I
and (x ∪ y) ∩ (x ∪ y)′ = ∅. Hence, by Theorem 4 all we need show
is that (x ∪ y) ∪ (x′ ∩ y′) = I and (x ∪ y) ∩ (x′ ∩ y′) = ∅. To this
end:
    (x ∪ y) ∪ (x′ ∩ y′) = [(x ∪ y) ∪ x′] ∩ [(x ∪ y) ∪ y′]    (Distributive rule)
                        = [(y ∪ x) ∪ x′] ∩ [(x ∪ y) ∪ y′]    (Why?)
                        = [y ∪ (x ∪ x′)] ∩ [x ∪ (y ∪ y′)]    (Associative rule
                                                             and substitution)
                        = (y ∪ I) ∩ (x ∪ I)                  (Why?)
                        = I ∩ I                              (Why?)
                        = I                                  (Identity rule and
                                                             substitution)
    (x ∪ y) ∩ (x′ ∩ y′) = [(x ∪ y) ∩ x′] ∩ y′                (Associative rule)
                        = [(x ∩ x′) ∪ (y ∩ x′)] ∩ y′         (Why?)
                        = [∅ ∪ (y ∩ x′)] ∩ y′                (Why?)
                        = (y ∩ x′) ∩ y′                      (Why?)
                        = (x′ ∩ y) ∩ y′                      (Why?)
                        = x′ ∩ (y ∩ y′)                      (Why?)
                        = x′ ∩ ∅                             (Why?)
                        = ∅.                                 (Why?)
Corollary 7 (By duality)
For each x, y in A; (x ∩ y)′ = x′ ∪ y′.
These two results are known as De Morgan’s laws. They say that the
complement of a union (intersection) is the intersection (union) of the
complements. For example, in order not to belong to both x and y we must
either not belong to x or not belong to y. While the result seems to require
a feeling for properties of sets, notice that our derivation was quite abstract,
in the sense that it drew only on the stated 11 rules and their inescapable
consequences.
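For the model from which our rules were abstracted, De Morgan’s laws can also be checked by exhaustion. The Python sketch below does this for the subsets of a three-element set; the choice of I = {1,2,3} and the helper name comp (for the complement relative to I) are assumptions made for illustration, and the check says nothing about other models.

```python
from itertools import combinations

# The universal set I and the collection A of all its subsets.
I = frozenset({1, 2, 3})
A = [frozenset(c)
     for r in range(len(I) + 1)
     for c in combinations(sorted(I), r)]


def comp(x):
    """Complement of x relative to the universal set I."""
    return I - x


# Theorem 7: the complement of a union is the intersection of the complements.
union_law = all(comp(x | y) == comp(x) & comp(y) for x in A for y in A)
# Corollary 7: the complement of an intersection is the union of the complements.
intersection_law = all(comp(x & y) == comp(x) | comp(y) for x in A for y in A)

print(union_law, intersection_law)  # True True
```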
So far we have not introduced the idea of x ⊂ y into our game. Using
circle diagrams as an aid, observe that ⊂ can be defined in terms of the
previously introduced concepts.
Definition 1
For x and y in A, x ⊂ y shall mean that x ∪ y = y.
Using the circle diagram idea to interpret the meaning of x ⊂ y, we
see that x ⊂ y also implies that x ∩ y = x and that x ∩ y′ = ∅.
Consequently, had we desired we might have defined x ⊂ y quite differently.
That is, we could have said the following.
Definition 1a
For x, y in A; x ⊂ y means that x ∩ y = x
or
Definition 1b
For x, y in A; x ⊂ y means that x ∩ y′ = ∅.
Unless these three definitions are synonymous, we would be in rather
great difficulty, the type of difficulty that arises when one symbol can
have more than one precise meaning.
Thus, before we proceed further with this game we shall show that the
above three definitions of ⊂ are equivalent. Once this is done we may use
any of the three definitions. There will be times when, to prove a certain
result, it will be more convenient to use Definition 1b rather than
Definitions 1 or 1a; at other times it might be more convenient to use
Definitions 1 or 1a.
First we shall see that Definitions 1 and 1a are equivalent and then see
that Definitions 1 and 1b are equivalent. To accomplish this we must
first study Theorem 8 and Theorem 9.
Theorem 8
Given x and y in A; then x ∪ y = y if and only if x ∩ y = x. That is, if
x ∪ y = y, then x ∩ y = x; and if x ∩ y = x, then x ∪ y = y.
proof: Assume that x ∪ y = y. Then
    x ∩ y = x ∩ (x ∪ y)          (Substitution)
          = (x ∩ x) ∪ (x ∩ y)    (Why?)
          = x ∪ (x ∩ y)          (Why?)
          = x.                   (Why?)
Now assuming that x ∩ y = x, we have
    x ∪ y = (x ∩ y) ∪ y          (Substitution)
          = y ∪ (x ∩ y)          (Commutative rule and substitution)
          = y.                   (Theorem 3 and substitution)
Theorem 9
Given that x and y are in A; then x ∪ y = y if and only if x ∩ y′ = ∅.
That is, if x ∪ y = y, then x ∩ y′ = ∅; and if x ∩ y′ = ∅, then x ∪ y = y.
proof: Assume that x ∪ y = y. Then
    y′ ∩ (x ∪ y) = y′ ∩ y                  (Why?)
    (y′ ∩ x) ∪ (y′ ∩ y) = y′ ∩ y           (Why?)
    (y′ ∩ x) ∪ ∅ = ∅.                      (Why?)
Thus
    y′ ∩ x = ∅                             (Why?)
or
    x ∩ y′ = ∅.
Similarly, if x ∩ y′ = ∅, then
    y ∪ (x ∩ y′) = y ∪ ∅                   (Why?)
so
    (y ∪ x) ∩ (y ∪ y′) = y                 (Why?)
or
    (y ∪ x) ∩ I = y.                       (Why?)
Thus
    y ∪ x = y
or
    x ∪ y = y.
Theorems 8 and 9 together allow us to amalgamate Definitions 1, 1a,
and 1b into one general, noncontradictory form.
Definition
For any x and y in A, we say x ⊂ y if and only if
    (1) x ∪ y = y or
    (2) x ∩ y = x or
    (3) x ∩ y′ = ∅.
For any particular x and y either (1), (2), and (3) are all true, or else
all three are false.
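That the three conditions stand or fall together can be illustrated, for the model of subsets, by the following Python sketch. The function names def1, def1a, and def1b and the choice of I = {1,2,3} are our own assumptions; the comparison with Python’s built-in subset test x <= y is included only as a sanity check against the everyday meaning of “subset.”

```python
from itertools import combinations

# The universal set I and the collection A of all its subsets.
I = frozenset({1, 2, 3})
A = [frozenset(c)
     for r in range(len(I) + 1)
     for c in combinations(sorted(I), r)]


def def1(x, y):
    """Definition 1: x ∪ y = y."""
    return x | y == y


def def1a(x, y):
    """Definition 1a: x ∩ y = x."""
    return x & y == x


def def1b(x, y):
    """Definition 1b: x ∩ y′ = ∅, with y′ taken relative to I."""
    return x & (I - y) == frozenset()


# For every pair, the three conditions and the built-in test give one verdict.
all_agree = all(
    len({def1(x, y), def1a(x, y), def1b(x, y), x <= y}) == 1
    for x in A for y in A
)

print(all_agree)  # True
```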
This definition, coupled with our rules and previously proven theorems,
allows us to analyze the structure of the set of subsets of a given set in
more detail. In fact, there are several immediate corollaries to the above
definition worth noting.
To begin with, we must have that x ⊂ I and ∅ ⊂ x, for all x ∈ A. That
x ⊂ I follows from our definition of ⊂ and the identity rule, since we know
that x ∩ I = x, which in turn says that x ⊂ I. Similarly, x ∪ ∅ = x tells
us that ∅ ⊂ x. Because of the importance of these two results we shall
state them as theorems rather than as corollaries of our definition.
Theorem 10
For each element x ∈ A, we have that
    x ⊂ I
and
    ∅ ⊂ x.
Theorem 10 emphasizes the limiting values of ∅ and I in the sense that
each x is between ∅ and I. Also notice that Theorem 10 gives us another
reason for agreeing that the empty set is a subset of every set. Namely,
we have now shown that, under our accepted rules, it follows inescapably
that ∅ ⊂ x. In still other words the assumption that ∅ is not a subset of I
is incompatible with our accepted 11 rules.
Another pair of corollaries to our definition are given by the next two
theorems.
Theorem 11
If x ⊂ ∅, then x = ∅; and if I ⊂ x, then I = x.
proof: By definition, if x ⊂ ∅, then x ∪ ∅ = ∅; but x ∪ ∅ = x. Hence,
x ⊂ ∅ implies that x = ∅. Similarly, if I ⊂ x, then I ∪ x = x, but
I ∪ x = I; hence I = x.
Theorem 11, among other things, shows that there is no analog of
negative numbers when one deals with sets. That is, we have shown that the
only subset of the empty set is the empty set itself. (In other words x
denoted any subset of ∅ and we showed that this implied x = ∅.)
By this time the technique of deducing theorems and of studying the
structure of systems should be clearer. Thus, while there are many more
theorems we could prove, we shall prove just two more about sets. First
we shall demonstrate that ⊂ is a transitive relation.
Theorem 12
Given x, y, and z in A; if x ⊂ y and if y ⊂ z, then x ⊂ z.
proof: Paraphrasing the theorem in terms of our definition, we wish to
show that if x ∪ y = y and if y ∪ z = z, then x ∪ z = z. Using the given
equalities we see that x ∪ z = x ∪ (y ∪ z) = (x ∪ y) ∪ z = y ∪ z = z
and the theorem is proved.
Finally, we wish to prove a theorem that will give us another technique
for proving that two elements are equal.
Theorem 13
Given x and y in A, then x = y if and only if x ⊂ y and y ⊂ x. (A result
that we accepted as being true by definition in Chapter 2, but here we show
that it is an inescapable consequence of our structure.)
proof: Clearly if x = y, then x ⊂ y and y ⊂ x since in this event
    x ∪ y = y ∪ y = y
and
    x ∪ y = x ∪ x = x.
Next, suppose that x ⊂ y and y ⊂ x. Since x ⊂ y, we have that
x ∪ y = y; and since y ⊂ x, we have y ∪ x = x; but since x ∪ y = y ∪ x,
we have that x = y as asserted.
The reader should recall that the important part of this section is not to
memorize proofs; nor is it to see how few steps are needed to form a proof;
nor is it to discover that most of the theorems are simple consequences of
the circle diagrams.
The important point is that we have abstracted a small number of rules
that characterize a certain topic; and that from these rules we have derived
a set of 13 inescapable conclusions called theorems. While the rules we
accepted were in regard to our feelings as to the meanings of union,
intersection, and complement, the theorems were derived only from the stated
rules, not things felt to be true.
This sets the stage for our discussion of Boolean algebra. We shall
see how many apparently different systems obey the rules outlined in
these last two sections. Any such systems will automatically obey the
theorems of this section.
If nothing else, a self-contained lesson of this section is the demonstration
of man’s ability to think logically. It is in this sense that the material of
this section is designed to illustrate the same type of deductive logic that
typifies a study of Euclid’s geometry.
That is, the format usually reserved for the treatment of plane geometry
is applicable to all mathematical structures.
Observe that all we have done is develop a few facts that follow
inescapably from the rules. To be sure, our rules are quite concrete in
terms of how sets behave; but pay particular attention to the fact that
nowhere in the proofs did we use these concrete properties. All we used
were the rules themselves, not the reasons why the rules were chosen. We
shall emphasize this in the next section.
Exercises
1. In ordinary arithmetic, let us agree to call one statement the dual of
another if all we do to get from one to the other is interchange the words
multiplication and division (or the symbols × and ÷). State a rule
of arithmetic whose dual is not a rule.
2. Prove Corollary 2 directly from the given proof of Theorem 2 using the
principle of duality step-by-step.
3. Rewrite the proof of Theorem 3 in the more formal statement-reason
format.
4. Use circle diagrams to illustrate the results of Theorems 3, 4, 7, and 8.
5. Use charts to illustrate the results of Corollaries 3, 5, and 7.
3.9.4 BOOLEAN ALGEBRA DEFINED
Let S denote any nonempty set. Suppose there exists an equivalence
relation, denoted by =, between the elements of S. Suppose further
that there exist operations ∪ (cup), ∩ (cap), and ′ (prime).
Definition
S is called a Boolean algebra if and only if
(1) if a and b belong to S, so also does a ∪ b and        (Rules of closure)
(1a) if a and b belong to S, so also does a ∩ b
(2) for a and b in S, a ∪ b = b ∪ a and                   (Commutative rules)
(2a) for a and b in S, a ∩ b = b ∩ a
(3) if a, b, and c are in S, then (a ∪ b) ∪ c = a ∪ (b ∪ c) and
                                                          (Associative rules)
(3a) if a, b, and c are in S, then (a ∩ b) ∩ c = a ∩ (b ∩ c)
(4) for a, b, and c in S, a ∩ (b ∪ c) = (a ∩ b) ∪ (a ∩ c) and
                                                          (Left-sided distributive rules)
(4a) for a, b, and c in S, a ∪ (b ∩ c) = (a ∪ b) ∩ (a ∪ c)
(5) there exists an element of S, denoted by ∅, such that a ∪ ∅ = a for
each a ∈ S and                                            (Identity rules)
(5a) there exists an element of S, denoted by I, such that a ∩ I = a for
each a ∈ S.
(6) for each a ∈ S there exists an element, denoted by a′, such that
a ∪ a′ = I and a ∩ a′ = ∅.                                (Rule of complements)
Thus, a Boolean algebra is more than just a set. It is a set together
with certain operations that obey certain rules.
Compare this with our earlier remarks about common fractions.
Remember that common fractions are more than just ordered pairs of numbers.
They are ordered pairs that obey certain rules.
Also notice that we prefer to use the words cup, cap, and prime rather
than union, intersection, and complement, in order to stress the abstract
features of a Boolean algebra; that is, union is more suggestive than is cup.
Moreover, it is also easy to see here that the principle of duality applies.
In fact, if we consistently interchange ∪ and ∩ as well as ∅ and I, we see
that (1) becomes (1a); (2) becomes (2a); (3) becomes (3a); (4) becomes
(4a); (5) becomes (5a); and (6) remains (6).
Finally, the 13 theorems proven in the previous section (as well as many
other theorems that we did not prove) remain valid in any Boolean algebra
since these proofs depend only on the properties that constitute a Boolean
algebra. This is why we were so insistent on emphasizing the facts that (1)
we did not have to say why we accepted the rules, and (2) we only wanted
those conclusions that followed inescapably from our rules.
Thus, in any Boolean algebra we have the following.
(1) a ∪ a = a and a ∩ a = a.
(2) a ∩ ∅ = ∅ and a ∪ I = I.
(3) a ∪ (a ∩ b) = a ∩ (a ∪ b) = a.
(4) I′ = ∅ and ∅′ = I.
(5) (a′)′ = a.
(6) (a ∪ b)′ = a′ ∩ b′ and (a ∩ b)′ = a′ ∪ b′.
(7) a ∪ b = a ∪ c and a ∩ b = a ∩ c imply that b = c.
(8) a ∪ b = b if and only if a ∩ b = a if and only if a ∩ b′ = ∅.
Defining a ⊂ b to mean a ∪ b = b, we also have the following.
(9) For each a, ∅ ⊂ a and a ⊂ I.
(10) If a ⊂ ∅, then a = ∅; and if I ⊂ a, then I = a.
(11) If a ⊂ b and if b ⊂ c, then a ⊂ c.
(12) If a ⊂ b and if b ⊂ a, then a = b.
To see the similarities as well as the dissimilarities between this game
and the usual game of numerical arithmetic, the following device is useful.
Let us agree to rewrite the rules of a Boolean algebra with the following
changes.
(1) Replace ∅ by 0 and I by 1.
(2) Replace ∪ by +, ∩ by ×, ′ by 1 −, and “element” by “number.”
We then obtain the following.
(1) If a and b are numbers, so also is a + b.
(1a) If a and b are numbers, so also is a × b.
(2) a + b = b + a.
(2a) a × b = b × a.
(3) (a + b) + c = a + (b + c).
(3a) (a × b) × c = a × (b × c).
(4) a × (b + c) = (a × b) + (a × c).
(4a) a + (b × c) = (a + b) × (a + c).
(5) For each number a, a + 0 = a.
(5a) For each number a, a × 1 = a.
(6) For each number a there exists a number denoted by 1 − a such that
a + (1 − a) = 1 and a × (1 − a) = 0.
Since it is false that
    a + (b × c) = (a + b) × (a + c)
and that
    a × (1 − a) = 0,
we see that the rules of a Boolean algebra are different from those of the
usual numerical algebra. In other words, rules (4a) and (6) are rules in
Boolean algebra but not in ordinary numerical algebra. (Of course, if the
rules for a Boolean algebra were identical to those of ordinary arithmetic, it
would be of no interest to study Boolean algebra since structurally we would
not be able to distinguish it from ordinary arithmetic.)
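A single numerical instance is enough to show that the translated rules (4a) and (6) fail in ordinary arithmetic. The short Python sketch below exhibits one; the particular values a = 1, b = 2, c = 3 and the range of integers searched are arbitrary choices of our own.

```python
# Translated rule (4a): a + (b × c) = (a + b) × (a + c) fails for these values.
a, b, c = 1, 2, 3
lhs = a + (b * c)        # 1 + 6
rhs = (a + b) * (a + c)  # 3 * 4
print(lhs, rhs)          # 7 12

# Translated rule (6): a × (1 − a) = 0 holds for very few numbers.
# Among the integers from −5 to 5, only 0 and 1 satisfy it.
solutions = [n for n in range(-5, 6) if n * (1 - n) == 0]
print(solutions)         # [0, 1]
```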
There are also rules that are obeyed in ordinary numerical algebra but
not in a Boolean algebra. For example, in ordinary arithmetic, given any
number x we can find another number y such that x + y = 0. If this
were translated into the language of Boolean algebra, it would say: Given
a ∈ S, there exists b ∈ S such that a ∪ b = ∅; but if a ≠ ∅ no such b
exists. (Why?)
This means that any theorem in numerical arithmetic will be a theorem
in Boolean algebra provided the proofs use rules that are always acceptable
to both Boolean algebra and numerical arithmetic.
If a proof uses a result that is true in one but not the other, the theorem
may still be valid in both, but we will need different proofs in each case.
For example, it is a theorem of arithmetic that a × 0 = 0 for all numbers
a; and it is a theorem of Boolean algebra that a ∩ ∅ = ∅ for all a ∈ S.
Yet the numerical proof utilizes rules that are not permitted in Boolean
algebra, and vice versa. (See A Note on Proofs at the end of this section.)
While the mathematician can invent a mathematical system merely
for its own sake, he usually has a particular model in mind when he invents
a system. In this section we defined a Boolean algebra in terms of such
operations as cup, cap, and prime in order to remain abstract; but as we
were doing this we had in mind the properties of union, intersection, and
complement as described in the previous section. This move guarantees
the fact that there is at least one “real” model that obeys all the necessary
properties for a system to be a Boolean algebra. Namely, the set of all
subsets of a given set together with the usual meanings of union,
intersection, and complement is such an algebra, since this is where the rules
came from in the first place.
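For a finite system, the rules of the definition can be tested one by one by brute force. The Python sketch below is one possible way of doing so, offered under stated assumptions: the function name is_boolean_algebra, its parameter names, and the three-element universal set are all hypothetical choices of ours, and the check applies only to finite systems whose elements can be listed.

```python
from itertools import combinations, product


def power_set(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_boolean_algebra(S, cup, cap, prime, zero, one):
    """Check rules (1)-(6) of the definition on a finite system S."""
    S = list(S)
    members = set(S)
    for a, b in product(S, repeat=2):
        # (1), (1a) closure
        if cup(a, b) not in members or cap(a, b) not in members:
            return False
        # (2), (2a) commutative rules
        if cup(a, b) != cup(b, a) or cap(a, b) != cap(b, a):
            return False
    for a, b, c in product(S, repeat=3):
        # (3), (3a) associative rules
        if cup(cup(a, b), c) != cup(a, cup(b, c)):
            return False
        if cap(cap(a, b), c) != cap(a, cap(b, c)):
            return False
        # (4), (4a) distributive rules
        if cap(a, cup(b, c)) != cup(cap(a, b), cap(a, c)):
            return False
        if cup(a, cap(b, c)) != cap(cup(a, b), cup(a, c)):
            return False
    for a in S:
        # (5), (5a) identity rules
        if cup(a, zero) != a or cap(a, one) != a:
            return False
        # (6) rule of complements
        if prime(a) not in members:
            return False
        if cup(a, prime(a)) != one or cap(a, prime(a)) != zero:
            return False
    return True


# The model the rules came from: all subsets of I under union,
# intersection, and complement.
I = frozenset({1, 2, 3})
sets_model = is_boolean_algebra(
    power_set(I),
    lambda a, b: a | b, lambda a, b: a & b, lambda a: I - a,
    frozenset(), I)

# The numbers 0 and 1 under ordinary + and × are not a model (1 + 1 = 2).
arithmetic_model = is_boolean_algebra(
    [0, 1],
    lambda a, b: a + b, lambda a, b: a * b, lambda a: 1 - a,
    0, 1)

print(sets_model, arithmetic_model)  # True False
```

A test of this kind can confirm that a proposed finite system is a model, but it is the abstract proofs of the previous section that guarantee the theorems hold in every model.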
Finally, one often tries to determine the rules that completely
characterize an entire subject. One way of telling whether a set of rules does
this is to invent or discover other systems that obey the same rules. If
two such systems have essentially little in common, the chances are that
the rules we selected were not sensitive enough to describe the entire
picture. If the two systems are virtually equivalent, then we have found
a unifying link in terms of the structure of the rules. In any event, as a
brief summary, it is often the case in studying a particular system that the
mathematician tries to characterize it by a small number of rules; he then
sees how many models he can find that obey those same rules.
In this spirit of inquiry we shall devote the next two sections to
constructing other models of a Boolean algebra; and, as a by-product, we shall
show the interrelationships that exist between three subjects of rather
diverse natures.
A NOTE ON PROOFS
We mentioned that a theorem might be true in both Boolean and
numerical algebra, even in cases where the necessary rules are different. The
particular theorem that we mentioned was x ∩ ∅ = ∅ and a × 0 = 0.
To see this in more vivid detail let us take the proof that a × 0 = 0 and
translate it into the language of Boolean algebra. Then we will take the
proof that x ∩ ∅ = ∅ and translate it into the language of numerical
algebra. In each case we shall see where the difficulty lies.
Arithmetic                              Sets
0 + 0 = 0                               ∅ ∪ ∅ = ∅
a × (0 + 0) = a × 0                     x ∩ (∅ ∪ ∅) = x ∩ ∅
(a × 0) + (a × 0) = (a × 0)             (x ∩ ∅) ∪ (x ∩ ∅) = (x ∩ ∅)
Let a × 0 = b                           Let x ∩ ∅ = y
b + b = b                               y ∪ y = y
Therefore, by cancellation, b = 0       Therefore, y = ∅ (but here this step
                                        is invalid, since y ∪ y = y for each
                                        element y of a Boolean algebra).

Conversely,
Sets                                    Arithmetic
x′ ∪ ∅ = x′                             (1 − a) + 0 = 1 − a
x ∩ (x′ ∪ ∅) = x ∩ x′                   a × {(1 − a) + 0} = a × (1 − a)
(x ∩ x′) ∪ (x ∩ ∅) = x ∩ x′             [a × (1 − a)] + (a × 0) = a × (1 − a)
∅ ∪ (x ∩ ∅) = ∅                         0 + (a × 0) = 0 (but this is false, since
                                        a × (1 − a) = 0 is not true for all values
                                        of a, but only for a = 0 or a = 1)
x ∩ ∅ = ∅
In summary, the above shows how two different proofs were necessary
since the proof in one case does not translate into a proof in the other.
Exercises
1. Explain the difference between Boolean algebra and the arithmetic
of the set of subsets of a given set.
2. Why will certain theorems of Boolean algebra also be theorems in
ordinary arithmetic, while other theorems of Boolean algebra will not
be?
3. Prove that in any Boolean algebra
a ∪ (a' ∩ b) = a ∪ b.
4. Let S = {0, 1}. We define two binary operations on S that we denote
by ∪ and ∩, where

∪ | 0  1           ∩ | 0  1
0 | 0  1    and    0 | 0  0
1 | 1  1           1 | 0  1

Show that, with respect to our two binary operations, S is an example of a
Boolean algebra.
3.9.5 TRUTH-TABLE LOGIC

As we have already mentioned, validity involves judging the truth of one
statement based on the truth of other statements. In this context truth-table
logic is concerned with the truth of a compound statement based
on the truth of its constituent parts.
Aside from the fact that truth-table logic serves as a model for a Boolean
algebra, it also allows us to apply algebra to statements. In other words,
truth tables will afford us a completely nonmathematical but well known
model as to how to study structure in general.
To begin our study, we must first define what we mean by a statement.
A statement is any sentence that is either true or false, but not both. For
example,
There are eight days in a week.
is a false statement. On the other hand,
Why don’t you go home?
would not be called a statement since we cannot define it with respect to
truth or falsity. Without further embellishment, it is probably fair to
assume that, in general, we all know what a statement is. The next
question concerns the formation of compound statements. Again, based
on previous experience we sense that
Both . . . and . . .
Either . . . or . . .
If . . . then . . .
are very common constructions for combining statements to form new
statements. They are so common that they have been given special
names (conjunction, disjunction, and conditional, respectively). How
shall we judge the truth of such statements? Perhaps it is best to proceed
by example.
Example 1
Consider the conjunctive statement
There are eight days in a week and there are twelve months in a year.
We often reduce the size of conjunctions to make them seem simpler.
For example, the above conjunctive is usually written as
There are eight days in a week and twelve months in a year.
In a similar way, we often say
The weather is cloudy and warm.
as an abbreviation for
The weather is cloudy and the weather is warm.
Let us make the obvious assumptions that
There are eight days in a week.
is a false statement, while
There are twelve months in a year.
is a true statement.
We are, thus, forming a statement by the conjunction of two simple
statements, one of which is false and the other true. At first glance we
might be tempted to say that the conjunction is half true and half false.
However, our definition of a statement precludes our saying this, for we
agreed that a statement must be either true or false, one or the other, but
not both! Hence, we must define what we mean by the truth of a conjunctive
in terms of the truth of its constituent parts, and since we want
our results to conform with reality, we shall choose a definition to agree with
our experience.
Suppose we were to make a wager that both of two events will happen.
Then if we are to win our bet, both things must happen; otherwise we lose.
With this in mind, it seems natural that we define a conjunctive to be true
if and only if each of the constituent parts is true. But what has this to do
with the concept of truth-table logic? To answer this question, let us
introduce some convenient abbreviations and notations. We shall refer
to statements by letters such as p, q, r, s, and we shall use the conventional
notation
p ∧ q
to abbreviate the compound statement
(both) p and q.
The truth of p ∧ q certainly depends on the truth of the individual statements
p and q. Since p and q are any statements and since any statement
is either true or false but not both, we find that there are four possible
combinations concerning the truth of p and q separately. Letting t denote
true and f denote false we see that
p    q
t    t
t    f
f    t
f    f
where for a particular choice of p and q one and only one of the above four
happens; but which it is depends on the truth of p and q. Now we have
agreed that p ∧ q is true only when p and q are each true. Thus, in terms
of the truth of p and q we define the truth of p ∧ q (and, hence, the name
truth-table) by the following table.
p    q    p ∧ q
t    t    t
t    f    f
f    t    f
f    f    f
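For readers who like to see a table generated rather than written out, here is a minimal sketch in Python. The function names `truth_rows` and `show` are our own inventions, and we assume that Python's `True`/`False` and its `and` operator stand in for t/f and ∧.

```python
from itertools import product

def truth_rows(connective):
    """Return (p, q, value) for the four cases, in the order tt, tf, ft, ff."""
    return [(p, q, connective(p, q))
            for p, q in product([True, False], repeat=2)]

def show(rows, label):
    """Print the rows using the letters t and f, as in the text."""
    letter = lambda value: "t" if value else "f"
    print("p  q  " + label)
    for p, q, value in rows:
        print(letter(p), letter(q), letter(value), sep="  ")

conjunction = truth_rows(lambda p, q: p and q)
show(conjunction, "p and q")
```

Any other connective of two statements can be tabulated by passing a different function to `truth_rows`.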
Example 2
Consider the disjunctive statement
Either the pen is in the desk or it is on the table.
Here we are saying that the pen is in at least one of the two places and,
as a result, this statement is false only if each of the constituent parts
is false. In terms of traditional notation we write
p ∨ q
as an abbreviation for
(either) p or q.
Thus, the truth-table definition of p ∨ q is the following.
p    q    p ∨ q
t    t    t
t    f    t
f    t    t
f    f    f
The only troublesome spot occurs when both p and q are true, because of
the usual misinterpretation that “either . . . or . . .” precludes the possibility
that both may happen. However, a little reflection shows that if
someone asks us for a pen and we know that we leave our pen in the desk or
on the table, then we may say in all honesty that the pen is either in the
desk or on the table, even though we may have left a pen in each place.
Putting this idea in terms of a wager, if a card is drawn from a deck and
we bet that it is either a spade or a face card, then we do not expect to lose
our wager if the card is the king of spades. Notice that we have made this
point earlier when we talked about unions of sets, but we feel it is an important
enough point to warrant repetition here.
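The card wager can be played out in a few lines. This is only a sketch under our own assumptions: a card is represented as a hypothetical (rank, suit) pair, and Python's `or`, which is inclusive, plays the role of ∨.

```python
def is_spade(card):
    rank, suit = card
    return suit == "spades"

def is_face_card(card):
    rank, suit = card
    return rank in ("jack", "queen", "king")

def wager_wins(card):
    # Inclusive "or": true when at least one constituent part is true.
    return is_spade(card) or is_face_card(card)

# One draw for each row of the truth table: tt, ft, tf, ff.
draws = [("king", "spades"), ("king", "hearts"),
         ("two", "spades"), ("two", "hearts")]
results = [wager_wins(card) for card in draws]
print(results)
```

The king of spades wins the wager, just as the first row of the table for p ∨ q requires; only the two of hearts loses.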
Example 3
Consider the conditional statement
If I get a score of 100, (then) the professor gives me an A.
Letting p represent the statement following “if” and q represent the statement
following “then,” we abbreviate the conditional by writing
p → q.
How shall we define the truth of p → q in terms of p and q? Again, let us
proceed in terms of our own experience. Suppose we are on a jury and it
is our job to decide whether Jones was lying when he said “If I get a score
of 100, the professor gives me an A.” We shall assume that he is telling
the truth whenever we cannot prove that he was lying. Proceeding by
cases, we have
Case 1:    p    q
           t    t
In this event we are saying that we have evidence telling us that he got 100
and the professor gave him an A. In this case we should decide that his
statement was true. With this as motivation, we agree that
p    q    p → q
t    t    t
Case 2:    p    q
           t    f
In this event the evidence is that he got 100 but that the professor did not
give him an A; yet his statement indicates that if he got 100 he should get
an A. Hence, we conclude in this case that he is “lying” (where “lying”
includes being misinformed). Thus we agree that
p    q    p → q
t    f    f
Case 3:    p    q
           f    t
In this event our evidence is that he did not get 100, but that he did get an
A. Observe that Jones told us merely what would happen if he got 100.
He said nothing about what would happen if he did not get 100. For example,
it might well be true that he would get an A with a score of 99
rather than 100. In any case, we have no proof that Jones is lying. Hence,
we bring in the verdict of “truthful.”
p    q    p → q
f    t    t
Case 4:    p    q
           f    f
In this event we are saying that evidence shows that Jones got neither 100
nor an A. Here, again, we cannot prove that Jones was lying. From a
different perspective, if Jones had asked his professor what grade he would
get with a score of 100 and the professor replied, “If you get 100, I’ll give
you an A,” then once Jones fails to get 100 the professor is consistent no
matter what grade he gives Jones. For example, if Jones gets 30, the
professor might well not give him an A. In short, we agree that
p    q    p → q
f    f    t
In summary, we define p → q by
p    q    p → q
t    t    t
t    f    f
f    t    t
f    f    t
Do not get “hung up” on p and q. Think of p as the antecedent (the “if”
clause) and q as the consequent (the “then” clause). All we are saying is
that the conditional is true unless the antecedent is true and the consequent
is false. This definition also agrees with usage in scientific methods.
That is, if a scientist predicts that a certain thing must happen when condition
p occurs, he is not held to that prediction if condition p fails to occur.
We should notice that, intuitive or not, the above definition is just that: a
definition. All we have done is to try to make the definition more palatable
by showing that it agrees with our usual experience. We must not be
alarmed when confronted by strange statements. For instance, consider
the statement
If there are seven days in a week then Boston is the capital of Massachusetts.
This statement, while sounding silly, is nevertheless meaningful, and as
such it must be either true or false but not both. All we are saying is that
this corresponds to p → q where both p and q are true. Our definition
tells us that p → q is true in this case. That is,
If there are seven days in a week, Boston is the capital of Massachusetts.
is a true statement in terms of the truth-table definition. Observe that
we are not saying that “There are seven days in a week.” proves that
“Boston is the capital of Massachusetts.” All we are saying is that the
statement itself is called true. In other words, when we agree that “Boston
is the capital of Massachusetts” is true, it implies that another proof has
been given.
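The jury's four verdicts can be collected into a single function. The sketch below assumes one common encoding of the conditional, namely "not p, or q"; the name `implies` and the variable names are ours, not part of the text's notation.

```python
from itertools import product

def implies(p, q):
    # Assumed encoding: false only when the antecedent is true
    # and the consequent is false.
    return (not p) or q

# The four cases considered by the jury, keyed by (p, q).
verdicts = {(p, q): implies(p, q)
            for p, q in product([True, False], repeat=2)}

# The "silly" statement: both constituent parts are true.
seven_days_in_a_week = True
boston_is_the_capital = True
silly_statement = implies(seven_days_in_a_week, boston_is_the_capital)

print(verdicts)
print(silly_statement)
```

Only Case 2 (p true, q false) yields the verdict "lying"; the statement about Boston comes out true without any claim that one part proves the other.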
Two statements p and q are said to be materially equivalent if their
truth values are the same. We define p = q by
p    q    p = q
t    t    t
t    f    f
f    t    f
f    f    t
For a discussion of truth-table logic we need not have a formal knowledge
of sets. That is, in a logic course in which the concept of sets was never
mentioned, one could still talk meaningfully about truth-table logic. However,
the reader has already, perhaps, begun to see a connection between
truth-table logic and the chart method for describing sets. Indeed, just
as ∩ and ∪ are binary operations on sets, ∧ and ∨ are binary operations
on statements. In fact, if we identify ∩ with ∧, ∪ with ∨, 1 with t,
0 with f, A with p, and B with q, then the truth-table definition for p ∧ q
can be identified with the chart definition of A ∩ B; while the truth-table
definition for p ∨ q can be identified with A ∪ B. We shall comment
more on this later. For now we wish to stress that truth-table logic depends
in no way on the study of sets, but that such a study may enhance
the structure of truth tables.
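The identification just described can be tried on a small example. In this sketch the universal set and the sets A and B are hypothetical, chosen only so that each of the four membership combinations occurs exactly once.

```python
from itertools import product

universe = ["a", "b", "c", "d"]     # hypothetical universal set
A = {"a", "b"}
B = {"a", "c"}

# Chart method: 1 means "belongs to the set", 0 means "does not".
chart = [(int(x in A), int(x in B), int(x in (A & B)), int(x in (A | B)))
         for x in universe]

# Truth tables for p ∧ q and p ∨ q, with t written as 1 and f as 0.
table = [(int(p), int(q), int(p and q), int(p or q))
         for p, q in product([True, False], repeat=2)]

print(chart)
print(chart == table)
```

Row for row, the chart for A ∩ B and A ∪ B coincides with the tables for p ∧ q and p ∨ q.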
Next let us discuss the nature of the negation of a statement. Observe
that if we prefix a statement with “It is false that . . . ,” we form the
negation of the given statement. By this we mean a statement that is
true if the original is false, and false if the original is true. Let us introduce
the notation ~p, called the negation of p, to mean the statement “p is
false.” Then, by our definition of statement, we obtain the following
truth table.
p    ~p
t    f
f    t
Finally, there are statements whose form makes them true regardless of
the truth of the constituent parts. For example, if p denotes any statement
then the statement “Either p is true or p is false” is a true statement
regardless of the truth of p. This follows from our definition that any
statement is either true or false (but not both). This same fact makes it
clear that the statement “p is both true and false” is always false regardless
of the truth of p. A statement that is always true by its very form,
independently of the truth of its constituent parts, is called a tautology.
(See A Note on Tautologies at the end of this section.) We shall denote a
tautology by T. In the same vein, a statement whose form forces it always
to be false will be called self-contradictory and will be denoted by F.
Observe that in the sense that mathematics is a study of relationships,
truth-table logic qualifies as a type of mathematics. Namely, observe that
if we let S denote the set of statements, then ∧, ~, and ∨ are merely
rules that combine statements to form statements. We want only to indicate
that we can write down rules and definitions for truth-table logic and
then develop theorems. Indeed, this is frequently done. For example, it
is common to define two statements to be logically equivalent if and only
if they have the same truth table. This is in accord with our intuitive
feeling that to be logically equivalent, two statements should either both
be true or both be false in any given set of conditions.
By way of illustration, this affords us a way of using truth tables to
demonstrate the rule of double negation. Double negation has the form,
“It is false that it is false . . . .” That is, double negation involves the
negation of the negation. In symbols the double negation of p would be
~(~p). Let us look at the truth table.
p    ~p    ~(~p)
t    f     t
f    t     f
Since the columns in the table denoting p and ~(~p) have identical
entries, we conclude that p and ~(~p) are logically equivalent, and we
may write p = ~(~p). This captures precisely the structure of double
negation. Observe, also, that the above demonstration in no way presupposes
a knowledge of mathematics. The study of truth tables transcends
the field of mathematics, in the sense that every educated man needs
to understand logic. Among its many virtues, truth-table logic affords a
convenient, mechanical test to determine whether two statements are paraphrases
of one another. Moreover, to work with truth-table logic, we
need not even know the verbal description of such expressions as p ∧ q,
p ∨ q, p → q, or ~p. All we need are the truth-table definitions.
In summary, we can begin the game of truth-table logic by giving the
following chart and recalling that if two statements are equal they possess
identical truth-table columns.
p    q    p ∧ q    p ∨ q    p → q    ~p
t    t    t        t        t        f
t    f    f        t        f        f
f    t    f        t        t        t
f    f    f        f        t        t
Just from the given definitions we can now construct the truth table for
an expression such as
(p ∨ q) ∧ (~p).
Namely, we take the conjunction of the two statements (p ∨ q) and (~p).
p    q    p ∨ q    ~p    [(p ∨ q) ∧ (~p)]
t    t    t        f     f
t    f    t        f     f
f    t    t        t     t
f    f    f        t     f
We could also construct the truth table for (~p) ∧ q.
p    q    ~p    [(~p) ∧ q]
t    t    f     f
t    f    f     f
f    t    t     t
f    f    t     f
A comparison of the two charts shows us the following.
[(p ∨ q) ∧ (~p)]    =    [(~p) ∧ q]
f                   t    f
f                   t    f
t                   t    t
f                   t    f
Since the middle column consists entirely of t's, we recognize that the
compound statement [(p ∨ q) ∧ (~p)] = [(~p) ∧ q] is a tautology.
Of course, if we understand the English equivalent of the symbols we
begin to understand the meaning of a paraphrase. Namely, one statement
says
“Either p or q is true but p is false.”
The other says
“p is false and q is true.”
The second statement probably seems simpler than the first. Yet our
tables show that the two statements are logically equivalent; that is, they
have the same meaning.
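The "mechanical test" for paraphrases is easy to automate. The following is a minimal sketch; the helper names `column` and `logically_equivalent` are our own, and statements are assumed to be written as Python functions of their constituent parts.

```python
from itertools import product

def column(statement, n_parts):
    """Truth-table column of a statement with n_parts constituent parts."""
    return [statement(*values)
            for values in product([True, False], repeat=n_parts)]

def logically_equivalent(first, second, n_parts):
    """Two statements are logically equivalent when their columns agree."""
    return column(first, n_parts) == column(second, n_parts)

either_but_not_p = lambda p, q: (p or q) and (not p)   # (p ∨ q) ∧ (~p)
not_p_and_q = lambda p, q: (not p) and q               # (~p) ∧ q
double_negation = lambda p: not (not p)                # ~(~p)

print(column(either_but_not_p, 2))
print(logically_equivalent(either_but_not_p, not_p_and_q, 2))
print(logically_equivalent(double_negation, lambda p: p, 1))
```

The test compares columns only; it never consults the English meaning of the statements.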
We are not implying that those who understand grammatical structure
well would not have been able to decide this without the use of truth tables.
For instance, we might at a glance conclude that if either p or q is true but p
is false, then it must be that q is true. That is, p is false and q is true. But
the truth tables exhibit this result without an appeal to intuition, just on
structure alone. This comparison is similar to our earlier comparison of
arithmetic and algebra. A sufficiently clever person might solve a problem
using arithmetic where someone else might use algebra. How the problem
is solved is not important, but it is important to know that algebra stresses
the structure of the study of arithmetic.
Perhaps the reader has already noticed the perfect parallelism between
the truth tables and the chart method used in the study of sets. Now we begin
to see why truth-table logic may be viewed as a Boolean algebra (even
though it is not necessary to take this view). That is, every rule of a
Boolean algebra can be shown to be a rule of truth-table logic in the following
way.
a, b, c, . . .    <---->    p, q, r, . . .
I                 <---->    T
∅                 <---->    F
'                 <---->    ~
∪                 <---->    ∨
∩                 <---->    ∧
⊂                 <---->    →
We shall illustrate this with a few examples. To begin with, consider rule
(5) in Boolean algebra, a ∪ ∅ = a. Using the identifications described
above, this translates into
p ∨ F = p.
Truth tables yield
p    F    p ∨ F
t    f    t
f    f    f
and a comparison of the columns labeled p and p ∨ F verifies that
p ∨ F = p, as asserted. (In “plain English” the disjunction of a given
statement with a known false statement has the same truth value as the
given statement.)
As a second example, rule (4), a ∩ (b ∪ c) = (a ∩ b) ∪ (a ∩ c), becomes
under our identification
p ∧ (q ∨ r) = (p ∧ q) ∨ (p ∧ r).
We verify this result using truth tables.
p    q    r    q ∨ r    p ∧ (q ∨ r)    p ∧ q    p ∧ r    (p ∧ q) ∨ (p ∧ r)
t    t    t    t        t              t        t        t
t    t    f    t        t              t        f        t
t    f    t    t        t              f        t        t
t    f    f    f        f              f        f        f
f    t    t    t        f              f        f        f
f    t    f    t        f              f        f        f
f    f    t    t        f              f        f        f
f    f    f    f        f              f        f        f
                        (1)                              (2)
A glance at (1) and (2) shows us that, under our identification, rule (4) of
Boolean algebra becomes a rule for truth-table logic.
While we need not have learned about sets or Boolean algebra to study
truth-table logic, truth-table logic forms a physical model of a Boolean
algebra, and we can claim a knowledge of truth-table logic as a bonus once
we understand Boolean algebra. Stated in more mathematical terms,
every theorem of a Boolean algebra under the identification previously
discussed remains a theorem in truth-table logic, without our having to
use tables to verify the results.
For example, we proved that the theorem
a ∪ (a ∩ b) = a
held in any Boolean algebra. Using our identifications, this means that
p ∨ (p ∧ q) = p
is a theorem in truth-table logic. As a check we may, if we so desire, use
truth tables, in which case we obtain the following.
p    q    p ∧ q    p ∨ (p ∧ q)
t    t    t        t
t    f    f        t
f    t    f        f
f    f    f        f
(1)                (2)
A look at (1) and (2) verifies the result.
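The three translated rules discussed above can be checked in one pass. This is a sketch only: each rule is written by hand as a pair of Python functions (left side, right side), and the dictionary labels are ours.

```python
from itertools import product

F = False   # the self-contradictory statement, which is always false

# Each entry pairs the left side of a translated rule with its right side.
rules = {
    "rule (5): p v F = p": (
        lambda p, q, r: p or F,
        lambda p, q, r: p,
    ),
    "rule (4): p ^ (q v r) = (p ^ q) v (p ^ r)": (
        lambda p, q, r: p and (q or r),
        lambda p, q, r: (p and q) or (p and r),
    ),
    "theorem: p v (p ^ q) = p": (
        lambda p, q, r: p or (p and q),
        lambda p, q, r: p,
    ),
}

# A rule is verified when both sides agree in all eight rows.
verified = {
    name: all(left(*row) == right(*row)
              for row in product([True, False], repeat=3))
    for name, (left, right) in rules.items()
}

for name, holds in verified.items():
    print(name, "holds" if holds else "fails")
```

Using three letters throughout costs nothing: a rule that ignores r is simply checked twice for each choice of p and q.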

As another example, the theorem that a = b means a ⊂ b and b ⊂ a
translates into p = q means (p → q) ∧ (q → p). Hence, (p → q) ∧ (q → p)
will be abbreviated by p ↔ q. Using truth tables we see the following.
p    q    p → q    q → p    (p → q) ∧ (q → p), or p ↔ q    p = q
t    t    t        t        t                              t
t    f    f        t        f                              f
f    t    t        f        f                              f
f    f    t        t        t                              t
That is, if p implies q and if q implies p, then p and q are logically equivalent;
that is, p ↔ q means the same as p = q. In terms of sets, if all A’s are
B’s and if all B’s are A’s, then A and B are merely different names for the
same collection.
In other words, quite apart from Boolean algebra, the use of truth-table
logic does for sentence structure what ordinary algebra does for arithmetic.
In short, the student of grammar could use truth tables as a means for
determining whether two structures are grammatically equivalent, or
paraphrases of one another. Remember that to paraphrase means to put
into other words without changing the meaning.
The major point is that we can obtain all the benefits of truth-table logic
without having to pore through the entire subject once we have studied
Boolean algebra. In summary, the set of subsets of a given set and truth-table
logic, as different as they might be in other respects, both serve as
models for a Boolean algebra. Something is a theorem in one of these
subjects if and only if it is a theorem in the other.
Here might be a good place to distinguish between “if,” “only if,” and
“if and only if.” Briefly, they are as different as p → q, q → p, and p ↔ q,
respectively. That is, we read p → q as “q if p” (in other words, “If p then q”);
we read q → p as “q only if p”; and we read p ↔ q as “p is true if and only
if q is true.” In short, “if and only if” means that the two phrases being
related are synonyms. This distinction often occurs in terms of the words
“necessary,” “sufficient,” and “necessary and sufficient.” That is, if p is
sufficient to guarantee q, this means that p → q; if p is necessary in order
for q to happen, this means that q → p (in other words, in this case the fact
that q happened means that p also happened); and if p is both necessary
and sufficient to guarantee q, then p and q are equivalent happenings
(that is, p ↔ q).
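The difference between "sufficient," "necessary," and "necessary and sufficient" shows up plainly when the three columns are set side by side. The sketch assumes the encoding "not p, or q" for the conditional; all names are our own.

```python
from itertools import product

def implies(p, q):
    return (not p) or q          # assumed encoding of the conditional

rows = list(product([True, False], repeat=2))          # tt, tf, ft, ff

sufficient = [implies(p, q) for p, q in rows]          # p -> q
necessary = [implies(q, p) for p, q in rows]           # q -> p
both = [s and n for s, n in zip(sufficient, necessary)]    # p <-> q
same_truth_value = [p == q for p, q in rows]           # p = q

print(sufficient)
print(necessary)
print(both == same_truth_value)
```

The columns for p → q and q → p differ in the two middle rows, which is exactly why a statement and its converse must not be confused.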
A NOTE ON TAUTOLOGIES
As we have seen, a tautology is a statement that, based on its form alone,
is always true. The simplicity of this definition makes it easy for us to
overlook the tremendous impact of this concept.
To begin with, recall that in our discussion of logic being the art of
drawing inescapable conclusions from given assumptions, the problem centered
about the meaning of “inescapable.” “Inescapable” possesses a
subjective air. Can something be inescapable to one person but not to another?
Moreover, how can we ever be sure that something is inescapable?
Coupled with this problem is one of prejudice. That is, in many types
of arguments we accept the logic of the argument if we happen to like the
conclusion, rather than judging the argument on its own merits. For
example, if a person enjoys smoking he may tend to approve of any argument,
no matter what its flaws, if it concludes that smoking is beneficial.
If a person is a staunch Republican he may fail to seek flaws in arguments
that assail the Democrats.
Thus, it appears that we must appeal to rules of logic based on form
alone to decide problems of inescapability. Tautologies seem to be just
what we need. They depend only on their form, not on the specific topics
they relate; in addition, what better structure is there to trust than one
that is true no matter what the truths of its constituent parts are?
By way of illustration, let us look at a few forms of arguments in which
we tend to say that the conclusion follows inescapably from two or more
assumptions.
If p is true, then q is true. (1)
If q is true, then r is true. (2)
Therefore, if p is true then r is true. (3)
In (3) the word “therefore” indicates that an inescapable conclusion is
about to be drawn. In terms of truth-table logic, the above argument can
be stated as
[(p → q) ∧ (q → r)] → (p → r). (4)
By way of review, (1) says “p implies q”; (2) says “q implies r”; and,
according to (3), (1) and (2) together imply that “p implies r.”
We now construct the truth table for (4).
                              ①                    ②
p    q    r    p → q    q → r    (p → q) ∧ (q → r)    p → r    ① → ②
t    t    t    t        t        t                    t        t
t    t    f    t        f        f                    f        t
t    f    t    f        t        f                    t        t
t    f    f    f        t        f                    f        t
f    t    t    t        t        t                    t        t
f    t    f    t        f        f                    t        t
f    f    t    t        t        t                    t        t
f    f    f    t        t        t                    t        t
Therefore, (4) is a tautology.
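A tautology test is simply a check that the final column contains nothing but t's. In the sketch below the names `implies` and `is_tautology` are ours, and the last statement tested (inferring p from "p implies q" and q) is an extra example of our own, included only to show a form that fails the test.

```python
from itertools import product

def implies(p, q):
    return (not p) or q          # assumed encoding of the conditional

def is_tautology(statement, n_parts):
    """True when the statement is true in every row of its truth table."""
    return all(statement(*row)
               for row in product([True, False], repeat=n_parts))

# Statement (4): [(p -> q) ^ (q -> r)] -> (p -> r)
argument_4 = lambda p, q, r: implies(implies(p, q) and implies(q, r),
                                     implies(p, r))
p_or_not_p = lambda p: p or (not p)          # always true by its form
p_and_not_p = lambda p: p and (not p)        # self-contradictory
# Our own non-example: [(p -> q) ^ q] -> p
converse_error = lambda p, q: implies(implies(p, q) and q, p)

print(is_tautology(argument_4, 3))
print(is_tautology(p_or_not_p, 1))
print(is_tautology(p_and_not_p, 1))
print(is_tautology(converse_error, 2))
```

The non-example fails in the row where p is false and q is true, so its form alone does not make the conclusion inescapable.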

If we now switch our model from truth-table logic to sets, the translation
of (4) into the language of sets says that if all A’s are B’s and if all B’s are
C’s, then it follows inescapably that all A’s are C’s.
Arguments such as the above were studied long before Boolean algebra
was invented. For example, the subject known as formal logic tried to
tackle the idea of inescapability in the form of an argument. That subject
was to help us distinguish between truth and validity. In fact, the study
of logic in terms of form was known to Aristotle. Aristotelian logic, replete
with syllogisms, can be traced back well into the pre-Christian era.
What is interesting is that people studied sets without knowing they
were doing it. Of course, once we know the language of sets we can perform
certain translations, whereupon we discover the following types of
results.
Pre-Set Language            Set Language
All A’s are B’s.            A ⊂ B
Some A’s are B’s.           A ∩ B ≠ ∅
Some A’s are not B’s.       A ∩ B' ≠ ∅
No A’s are B’s.             A ∩ B = ∅
All that we are trying to show is that the arguments called valid in formal
logic are tautologies in truth-table logic.
As a second and final example, let us consider this argument: If either p
or q is true, but p is not true, then q is true. In the language of truth-table
logic, this statement becomes
[(p ∨ q) ∧ (~p)] → q. (5)
The truth table yields the following results.
                         ①
p    q    p ∨ q    ~p    (p ∨ q) ∧ (~p)    ① → q
t    t    t        f     f                 t
t    f    t        f     f                 t
f    t    t        t     t                 t
f    f    f        t     f                 t
Therefore, (5) is a tautology.
Tautologies, truth-table logic, and formal logic could and should be
studied for their own sakes in greater detail, but in this text they are means
toward an end. Therefore, we bring this note to a close, but with the
hope that the interested reader will pursue the topic further on his own.
Exercises
1. Let p = Today is Friday.
Let q = Jones is married.
Let r = School is fun.
Write each of the following compound statements in symbolic form.
(a) Today is Friday and school is fun.
(b) If today is not Friday, school is fun.
(c) Either today is Friday and Jones is married or school is not fun.
2. Decide upon the truth of the statement
“If there are eight days in a week then all grass is yellow.”
where we make the obvious assumption that there are not eight days
in a week and that some grass is not yellow.
3. Write the truth tables for each of the following.
(a) (p → q) ∨ (~p).
(b) (p ∨ q) → (~p).
4. Paraphrase (a) and (b) in Exercise 3 so that the symbol → does not
appear.
5. Use truth tables to show the statements ~(p ∧ q) and (~p) ∨ (~q)
are logically equivalent.
6. In terms of Boolean algebra explain how we could conclude that
~(p ∧ q) = (~p) ∨ (~q) without reference to truth tables.
7. Use truth tables to verify that (p ∧ q) ∨ (~p) ∨ (~q) is a tautology.
8. Show by use of truth tables that p → q and (~q) → (~p) are logically
equivalent. Then apply the result to a discussion of the indirect proof.
9. Construct a truth table for ~(p ∧ ~q) and one for ~p ∨ q.
(a) The truth values for ~(p ∧ ~q) are the same as which of the
basic rules (∨, ∧, or →)?
(b) The truth values for ~p ∨ q are the same as which of the basic
rules (∨, ∧, or →)?
(These expressions are often used to define implication.)
3.9.6 SWITCHES
While we presuppose no knowledge of the theory of electric circuits in
this section, let us assume that we are all familiar with the idea of current
flowing in a wire. By a switch we mean a device by which we can control
whether or not the current will flow. As an example take an ordinary
light switch. We flick the switch and the light goes on; we flick it again
and the light goes off. We usually view a switch as a break in the wire, as
shown in Figure 3.10.
Figure 3.10 (a switch drawn as a break in the wire between the current source and the outlet)
We say that the switch is open when it does not let current pass (Figure
3.11).
Figure 3.11
When the switch allows current to pass we say that it is closed (Figure
3.12).
Figure 3.12
Two switches are said to be connected in series if they appear as in Figure
3.13.
Figure 3.13
Switches are said to be connected in parallel if they appear as in Figure
3.14.
Figure 3.14
In series if either switch is open, current will not flow; while in parallel
one switch can be open and current will flow through the closed switch.
This is illustrated in Figure 3.15.
Figure 3.15 (a) Series: the open switch prevents current from reaching the outlet. (b) Parallel: current reaches the outlet through the “lower” branch.
Labeling the switches A, B, C, and so
on, and letting o and c denote open and closed, respectively, we can make
a chart of possibilities concerning two switches A and B.
A    B
c    c
c    o
o    c
o    o
This chart follows from the fact that a switch is either open or closed, one
or the other, but not both. If we now write A s B to denote the switch
formed when A and B are connected in series, and A p B to denote the switch
formed when they are connected in parallel, we can make the following
chart.
A    B    A s B    A p B
c    c    c        c
c    o    o        c
o    c    o        c
o    o    o        o
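The chart for s and p can be produced directly from the two verbal rules. In this sketch the letters "c" and "o" are kept exactly as in the text; the function names `series` and `parallel` are our own.

```python
from itertools import product

CLOSED, OPEN = "c", "o"

def series(a, b):
    # Current passes only if both switches are closed.
    return CLOSED if a == CLOSED and b == CLOSED else OPEN

def parallel(a, b):
    # Current passes if at least one switch is closed.
    return CLOSED if a == CLOSED or b == CLOSED else OPEN

chart = [(a, b, series(a, b), parallel(a, b))
         for a, b in product([CLOSED, OPEN], repeat=2)]

print("A  B  AsB  ApB")
for a, b, in_series, in_parallel in chart:
    print(a, b, in_series, in_parallel, sep="  ")
```

Nothing about sets or statements enters the two definitions; they describe only whether current can pass.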
It should be clear that s and p serve as binary operations whereby we
can combine switches to form new switches; and it should be equally clear
we need not study sets to understand the “arithmetic” of switches. However,
perhaps the astute reader has already begun to notice the identification.
∩    <---->    s
∪    <---->    p
1    <---->    c
0    <---->    o
This makes the study of switches another possible model of Boolean
algebra. In fact, with a little insight, we can almost guess that the analog
of I will be a switch that is always closed; the analog of ∅ will be a switch
that is always open; and the analog of ' (complement) will be a switch
that is in the opposite position of the given switch. That is, for a given
switch A we may define the switch A^c to be the switch that is open when
A is closed, and closed when A is open. Letting C denote a switch that
is always closed and O a switch that is always open, we can make the following
chart.
A    B    A^c    B^c    C    O    A s B    A p B
c    c    o      o      c    o    c        c
c    o    o      c      c    o    o        c
o    c    c      o      c    o    o        c
o    o    c      c      c    o    o        o
The resemblance between this chart and our previous use of charts and
truth tables should be obvious. If we now define two switches to be
equivalent if one is closed when the other is closed and open when the
other is open, we can see this is equivalent to saying that the tables for the
switches are identical.
To understand the last idea better, observe that when we flick a switch
to make a light go on we may have no idea as to why it comes on. In
other words, we need not know how the circuit is wired. In this respect, we
are saying that two switches are equivalent if either wiring would account
equally well for the way the switch behaves.
Suppose we have a switch such that whenever we flick it the light it
controls never goes on. Perhaps our first suspicion is that the bulb is
defective; we test the bulb and find that it works. We might next suspect
that there is a defect in the switch; we check that and, again, nothing is
wrong. We try to construct the nature of the switch based on this information.
For example, it might be that the switch has the form shown
in Figure 3.16.
Figure 3.16
For, in this event, if A is closed, A^c is open; and if A is open, A^c is closed.
Thus, in either event current cannot flow. However, it is also possible
that the switch has the form shown in Figure 3.17. In this case it is
obvious that current cannot flow.
Figure 3.17
If we cannot take the switch apart we have no way of deciding which
of the two wirings is correct, if either! This is frequently referred to as
the black box concept, which in a way is characteristic of the entire scientific
method. We can measure the input into the box and measure the
output from the box, but we cannot look into the box. Based on a study
of only inputs and outputs, we try to guess what the interior of the box
must look like, knowing that at any time the next measurement may
contradict our guess.
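The black box idea can be imitated by treating each wiring as a function whose interior we agree not to inspect. This is a sketch under stated assumptions: the first wiring follows the text's description of Figure 3.16 (A in series with A^c), while the second is only our guess at Figure 3.17, taken here to be a switch that is always open.

```python
CLOSED, OPEN = "c", "o"

def complement(a):
    return OPEN if a == CLOSED else CLOSED

def series(a, b):
    return CLOSED if a == CLOSED and b == CLOSED else OPEN

def wiring_one(a):
    # Figure 3.16 as the text describes it: A in series with A^c.
    return series(a, complement(a))

def wiring_two(a):
    # Assumed reading of Figure 3.17: a break that is always open.
    return OPEN

def observe(box):
    """Record only the outputs: flick the switch both ways and look."""
    return [box(position) for position in (CLOSED, OPEN)]

indistinguishable = observe(wiring_one) == observe(wiring_two)
print(observe(wiring_one), observe(wiring_two), indistinguishable)
```

Since `observe` sees outputs alone, no experiment of this kind can tell the two wirings apart.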
To continue, the practical aspect of such a study of switches lies in the
fact th a t if two switches give the same result, it is to our advantage to
use the simpler switch, where the meaning of simpler usually depends on
various engineering aspects.
We can easily verify that, under the given identifications, the switches
form a model of Boolean algebra, since all the necessary rules are obeyed.
This means th a t the student who wishes to learn about Boolean algebra
now has three choices for a model; namely, subsets of a given set, truth-
table logic, or the theory of switches. Moreover, if a person happens to
understand or enjoy switches, but if he dislikes truth-table logic, he can, if
called upon to solve a logic problem, translate it into an equivalent switch
problem and solve th a t instead.
As a case in point, let us consider the problem of demonstrating that
p ∧ (q ∨ r) = (p ∧ q) ∨ (p ∧ r). The corresponding statement in the
language of switches is
As(BpC) = (AsB)p(AsC).
That is, we are comparing the switches shown in Figure 3.18.
Figure 3.18
A chart or some other convenient method will establish the equivalence
of the two switches; we leave the details to the interested reader since
the procedure is almost identical to the one used in truth tables.
An interesting point is that we can now build electronic truth tables.
Specifically, if we desire to construct the truth table for p ∧ (q ∨ r), we
may construct either of the configurations depicted in Figure 3.18, but
now labeling the switches p, q, and r. This is shown in Figure 3.19.
Figure 3.19
If p is true we will indicate this by closing switch p; if p is false we will leave
switch p open. Similar considerations will determine whether we close
switches q and r. Under these conditions, whenever the bulb lights we
are saying that p ∧ (q ∨ r) is true and that otherwise it is false.
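The "electronic truth table" can also be imitated in a few lines of code. The following is only a sketch under our own assumptions: a closed switch is modeled as True, an open switch as False, and the helper names series, parallel, left_circuit, right_circuit, and truth_table are invented for this illustration.

```python
from itertools import product

def series(a, b):
    # Two switches in series pass current only if both are closed.
    return a and b

def parallel(a, b):
    # Two switches in parallel pass current if at least one is closed.
    return a or b

def left_circuit(p, q, r):
    # The switch As(BpC), relabeled p, q, r: p in series with (q parallel r).
    return series(p, parallel(q, r))

def right_circuit(p, q, r):
    # The switch (AsB)p(AsC), relabeled: (p series q) parallel (p series r).
    return parallel(series(p, q), series(p, r))

def truth_table(circuit):
    # Try every one of the 8 settings of the three switches.
    return {s: circuit(*s) for s in product([True, False], repeat=3)}

if __name__ == "__main__":
    for (p, q, r), lit in truth_table(left_circuit).items():
        print(p, q, r, "bulb lights" if lit else "bulb dark")
```

Comparing truth_table(left_circuit) with truth_table(right_circuit) plays the role of the chart mentioned above: the two switches are equivalent exactly when the two tables agree in every row.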
In concluding this section, we have provided an example wherein several
different models obey the same set of rules; thus, these models are the
“same” game with different vocabularies. This has been in accord with
our attempt to emphasize that, in the pursuit of logic, the mathematician
is frequently concerned with showing that apparently different things are
actually different models of the same “game.” This is what we mean by
the structure of a mathematical system. With regard to this endeavor, we
defined a Boolean algebra and then went on to investigate three entirely
different physical models of such an algebra, each important in its own
right independently of the concept of mathematics. However, each of
these three models could have been learned, in a manner of speaking, as a
single subject.
Perhaps the last few sections are most important because they give us a
fairly concrete example of what happens in the study of any abstract
system: how we make up definitions and rules (usually modeled after a
“real” system), and how we proceed to see what follows inescapably from
our assumptions. This is the structure of mathematics, and no matter
what topic we deal with, either elementary or sophisticated, the same idea
is still present. In this sense the idea of a “game” far transcends the
applications of this text. Indeed the game idea is a unifying thread of all
“human endeavor.”
Exercises
1. Sketch the switch that is expressed by each of the following.
(a) As(BpC).
(b) (AsB)p(A^c)p(B^c).
2. Find switches that are equivalent to the ones defined in Exercise 1(a)
and (b).
3. Use a chart to show that the following two switches are equivalent:
(AsB)^c and A^c p B^c.
4. Explain in terms of Boolean algebra how we could have established
the result of Exercise 3 without recourse to charts.
5. Three people are to serve on a panel and to vote either yes or no on
certain questions. A yes vote is recorded by depressing a switch, while
a no vote is registered by not voting. Show how we could wire a switch
such that a light goes on whenever a majority votes yes on a question.
6. Write the formula for each of the pairs of switches in Figure 3.20, and
name the rule of Boolean algebra being expressed.
Figure 3.20 [pairs of switch diagrams for parts (a), (b), and (c)]
7. Use charts to show that the two circuits in Figure 3.21 are equivalent.
Figure 3.21 [circuit diagrams (a) and (b)]
chapter four / AN INTRODUCTION
TO FUNCTIONS AND GRAPHS
4.1 INTRODUCTION
Mathematics has been variously described as the study of relationships,
the language of science, the basic tool of technology, the logical quest for
truth, and the study of exact measurement. More subjectively, it has
been called a strict discipline, a way of life, and a philosophy. Perhaps,
then, we might combine all of these descriptions into one neat package
and call the ensuing combination of phrases the definition of mathematics.
We could even send out questionnaires to everyone in the entire world in
order to ascertain other descriptive phrases about mathematics. We
could incorporate all of this new information into the original definition
and make it more comprehensive. But the results of such an undertaking
could lead at best to a very cumbersome definition of mathematics. Moreover,
the definition would not be particularly satisfying. For, while it
may be true that any concept can be defined by suitably incorporating
every appropriate descriptive phrase (assuming that such a feat is possible),
the fact remains that such a definition is almost anti-intellectual, at least
in the sense that one usually hopes to see a concept defined in terms of a
basic unifying thread from which the other properties follow.
In light of the above remarks it seems that we should look through the
various descriptions of mathematics and find one that characterizes the
subject well. However, any attempt to define mathematics in terms of
short phrases is destined for failure, for mathematics, and quite justifiably
so, means many different things to different people. In a sense, the attempt
to answer the question, “What is Mathematics?” can be compared
with the fable concerning the blind men and the elephant. Each man
touched a different part of the animal and gave a different description
of the elephant. Each man’s description served as a distortion of the
actual description of the elephant. Each man was partly correct but
none was entirely correct. In this vein, any single descriptive phrase of
mathematics is either (1) too comprehensive; that is, it is correct but it defines
more than just mathematics; (2) too specialized; that is, it defines certain
aspects of the subject very well but does not apply to other aspects; or (3)
too vague. One definition that is too comprehensive is
Mathematics is the study of relationships.
Certainly, this is a true statement. In geometry, for example, one studies
the relationship between the area of a region and its various dimensions.
In traditional algebra problems, we are usually investigating the relationships
of greater than and less than. Probably, whenever we deal with mathematics,
somehow or other we are concerned with the study of relationships.
However, it should not be difficult to see that to define mathematics
as “the study of relationships” would make virtually every academic
endeavor of man a branch of mathematics. For example, in physics
Galileo studied the relationship between the distance that an object fell
and the time during which it was falling, and Newton studied the force of
attraction between two objects in relation to their sizes and the distance
between them. Such quantitative studies of relationships are well known
in all of the physical sciences. Indeed, they are the backbone of many
investigations. However, an equally important point is that the study of
relationships is by no means restricted to mathematics and the physical
sciences. For example, the philosopher studies the relationship between
a concept and the word used to denote that concept, the economist studies
the relationship between various forms of supply and demand, the student
of literature studies writing in relationship to the society of the times, the
historian judges the success of a particular society in relationship to the
aims upon which the society was formed, and the psychologist studies the
scores of certain tests in relationship to the environmental background of
those who took the test.
Such examples are numerous, and the phrase, “the study of relationships,”
permeates almost every field of human endeavor. Thus, such a
definition of mathematics would be too comprehensive.
Notice, however, that such a definition of mathematics, even though
it does not separate mathematics from other subjects, is rather worthwhile,
at least in the sense that it reflects a general trend in one prevalent form of
mathematical usage of the day. Namely, more and more subjects, at
one time thought to have only a minimal need for mathematics, are beginning
to require the study of relationships in more precise quantitative
ways. Such a study brings mathematics into play as a very strong computational
tool.
It is our aim in this chapter to exploit the idea that mathematics is
the study of relationships. While such a definition might not apply to
every aspect of mathematics, and while it might be much too comprehensive,
the fact is that this aspect of mathematics, perhaps more than
any other aspect, has been the unifying thread by which man has tried to
explore and to understand the world around him.
In this context it is particularly easy to introduce the meaning of a
function. Stripped of all embellishment, a function is a rule. In the
classical sense it was a rule that assigned to one number another number.
In the modern sense it is a rule that assigns to an element of one set an
element of another set.
We shall study functions from both points of view, but introduce the
topic from the modern point of view. While this is not chronologically
correct, we prefer the generality of the modern approach. Afterwards, we
shall look into the classical viewpoint.
4.2 FUNCTIONS AND SETS

Since it is usually difficult to think in abstract terms, let us begin this
discussion by introducing a concrete environment. Consider a collection
of salesmen (whom we shall call set A) who are having a convention at a
particular hotel (the rooms of which we shall call set B). The entire hotel
has been reserved for the salesmen, and each salesman will reside at the
hotel for the duration of the convention. As each man enters the hotel
he is assigned a room by the room clerk. In the discussion that follows
we shall illustrate all of the definitions in terms of these particulars.
Definition
Let A and B denote sets. Then by a function f from A to B, written
f: A → B, we mean that f is a rule that assigns to each element a ∈ A
one element b ∈ B. The fact that f assigns a ∈ A to b ∈ B is denoted by
f(a) = b (read as f of a equals b). (By definition, if f is a function and
f(a) = b₁, then f(a) ≠ b₂ for any other element b₂ in B.)
In terms of the example of the salesmen, the room clerk plays the role
of the function from A to B; that is, he is a “rule” that assigns to each
element of A (each salesman) an element of B (a hotel room).
Definition
If f: A → B then we call A the domain of f (abbreviated dom f), while
B is called the range of f.
Again, in terms of the salesmen example, the domain is the salesmen
and the range is the hotel rooms.
It is possible that the salesmen would not completely fill the hotel rooms.
In this sense the notion of range is somewhat deceptive since it gives us
no hint as to how much of the range is “used up” by the function. In
still other words, we might be more interested in knowing which rooms
are being used by the salesmen than in knowing that all the rooms being
used are in the hotel.
Definition
Given f: A → B, we define the image of f (usually abbreviated by Im f)
to be the set {f(a): a ∈ A}. This set is also denoted by f(A).
In other words, f(A) is precisely that subset of B that is “used up” by
f. More precisely, instead of using such phrases as “used up,” we often
call b the image of a (with respect to f) if f(a) = b. In this context f(A)
denotes the subset of B that consists of those elements of B that are images
of elements in A (with respect to f). In terms of the salesmen example,
f(A) is the set of rooms to which salesmen are actually assigned. This is
diagrammed in Figure 4.1.
Of course, it is possible that all the salesmen use up all the hotel rooms.
That is, nothing excludes the possibility that f(A) = B. This leads to the
following definition.
Definition
Given f: A → B, we say that f is onto B if and only if f(A) = B; otherwise,
f is a function from A into B. (In other words a function is onto when the
image equals the range.)
In many situations one is more interested in the image of f than in its
range. That is, given f: A → B, we usually concentrate on f: A → f(A).
From still another point of view, we are saying that often we are dealing
with a set A and we study what happens to its members under the action
of the function f. From this perspective it is clear that the only range
of interest to us would be f(A) since no other elements would be mentioned.
The point is that if we replace B by f(A), we automatically convert f: A → B
into an “onto” function. In summary, all that is required for f: A → B
to be onto is that every element in the range (B) be the image, under f, of
at least one element in the domain (A).
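For a finite set, the definitions of domain, range, image, and onto can be tried out directly. The sketch below is only an illustration: the salesmen's names, the room numbers, and the helper names is_function, image, and is_onto are all our own inventions, and a Python dictionary stands in for the room clerk's rule.

```python
# Hypothetical data: three salesmen (set A) and four hotel rooms (set B).
A = {"Adams", "Baker", "Clark"}
B = {101, 102, 103, 104}

# The room clerk's rule f: A -> B, stored as a dictionary.
# A dictionary key has exactly one value, so each salesman gets one room.
f = {"Adams": 101, "Baker": 103, "Clark": 103}

def is_function(rule, domain, range_):
    # Every element of the domain is assigned a value, and it lies in the range.
    return set(rule) == set(domain) and all(v in range_ for v in rule.values())

def image(rule):
    # f(A) = {f(a) : a in A}, the part of the range that is "used up".
    return set(rule.values())

def is_onto(rule, range_):
    # f is onto exactly when the image equals the range.
    return image(rule) == set(range_)
```

Here image(f) is the set of rooms actually occupied. The function is not onto B, but it becomes onto as soon as B is replaced by image(f), which is the point made in the paragraph above.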
Returning once again to the salesmen, it is reasonable to assume that we
do not assign a salesman to more than one room. However, it is equally
reasonable that we might assign more than one salesman to the same room.
As a second example, consider the basketball coach telling players of his
team (A) how to guard members of the other team (B). In this context
the coach serves as a function from A to B. With respect to our present
discussion we are saying that it is reasonable to assign two players of team
A to guard the superstar of team B, but somewhat unreasonable to ask
one member of team A to guard two members of team B. This leads us
to still another basic definition.
Definition
The function f: A → B is called one-to-one (often written as 1-1) if no
element of B is the image of more than one element of A. More precisely,
f is 1-1 means that if a₁ and a₂ are elements of A then a₁ ≠ a₂ implies that
f(a₁) ≠ f(a₂); or, from a different emphasis, f(a₁) = f(a₂) implies that a₁ = a₂.
From an abstract point of view, our definition of a function forbids us to
allow an element of the domain to have more than one image in the range.
The classical definition of a function required only that each element of
the domain have at least one image in the range. That is, it required that
each element in the domain have an image, just as in the modern definition,
but the classical liberalness in allowing an element to have more than one
image led to a further refinement in vocabulary. Namely, if it happened
that there could only be one image per element in the domain then the
function was called single-valued; otherwise it was called multi-valued. To
correlate the modern vocabulary with the classical, what is called a function
in the modern language would have been called a single-valued function
in the classical language, and a multi-valued function is now called a relation.
We shall say more about this later, but for now, unless otherwise specified,
function will mean single-valued function.
Remember that single-valued and one-to-one are entirely different concepts.
All functions are, by our definition, single-valued, but not all are
1-1.
Figure 4.2 illustrates the meaning of 1-1 and onto. Observe that a
particular function can be both, neither, or one but not the other.
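The four possibilities can be checked mechanically for small finite functions. In the sketch below the four sample functions are hypothetical ones of our own choosing (they are not the ones drawn in Figure 4.2), and the helper names is_one_to_one, is_onto, and classify are likewise assumptions of this illustration.

```python
def is_one_to_one(rule):
    # No element of B is the image of more than one element of A.
    values = list(rule.values())
    return len(values) == len(set(values))

def is_onto(rule, range_):
    # Every element of the range is the image of at least one element.
    return set(rule.values()) == set(range_)

def classify(rule, range_):
    # Returns the pair (is 1-1?, is onto?).
    return (is_one_to_one(rule), is_onto(rule, range_))

# Each entry is (rule, range); the domain is the set of dictionary keys.
examples = {
    "1-1 but not onto":     ({1: "x", 2: "y"}, {"x", "y", "z"}),
    "onto but not 1-1":     ({1: "x", 2: "x", 3: "y"}, {"x", "y"}),
    "neither 1-1 nor onto": ({1: "x", 2: "x"}, {"x", "y"}),
    "both 1-1 and onto":    ({1: "x", 2: "y"}, {"x", "y"}),
}
```

Note that whether a function is onto depends on the range we declare for it, while whether it is 1-1 depends only on the rule itself.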
We shall discuss functions that are both 1-1 and onto in more detail
later. For now, we shall introduce just enough definitions so that we can
begin to view the study of functions as a mathematical structure.
In any mathematical system we must have a criterion for equality so
that we may distinguish between the elements of our system.
(a) 1-1 but not onto    (b) onto but not 1-1
(c) neither 1-1 nor onto    (d) both 1-1 and onto
Figure 4.2
Definition
Given two functions f and g, we say that f = g if and only if
(1) dom f = dom g and
(2) f(a) = g(a) for each a ∈ A (where A = dom f [or g]).
That is, we insist that we not compare functions unless they operate on
the same set of elements. Secondly, once the domains are the same, we
insist that the images be the same, element for element.
Referring to the salesmen example, observe that different groups of
salesmen can come to the hotel at different times. Moreover, for a given
group of salesmen there is more than one way for the room clerk to assign
rooms. If we now talk about equal functions (or, in terms of the example,
equal room assignments) we insist (1) that each assignment involves the
same set of people and (2) that each person who receives a room by one
assignment receives the same room by the other assignment.
Notice particularly that we require more than that the same rooms be
used in each assignment. In still other words, to say that for each a ∈ A,
f(a) = g(a) says much more than just f(A) = g(A). In terms of a more
mathematical illustration, let A = {1,2,3} and let B = {4,5,6}. Define
f: A → B by
f(1) = 4
f(2) = 5
f(3) = 6
and define g: A → B by
g(1) = 4
g(2) = 6
g(3) = 5.
In particular, since both f and g are onto, we have that f(A) = g(A) = B.
However, f ≠ g, for while they have the same domain, f(2) ≠ g(2). This
is shown in Figure 4.3.
Figure 4.3
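The distinction between "same image" and "equal functions" is easy to test for the f and g just defined. The sketch below stores each function as a dictionary; the helper names same_image and equal_functions are our own, chosen only for this illustration.

```python
# The two functions from the text, from A = {1,2,3} to B = {4,5,6}.
f = {1: 4, 2: 5, 3: 6}
g = {1: 4, 2: 6, 3: 5}

def same_image(f, g):
    # Compares f(A) with g(A) as sets, ignoring which element goes where.
    return set(f.values()) == set(g.values())

def equal_functions(f, g):
    # (1) the domains agree, and (2) the values agree element for element.
    return set(f) == set(g) and all(f[a] == g[a] for a in f)
```

For these two functions same_image holds while equal_functions fails, because the values at 2 (and at 3) differ.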
We now want to define how we may combine functions to form other
functions. To this end, suppose f: A → B and g: B → C. Then f and g
can be composed so as to induce a function from A to C. For example, we
can start with a ∈ A and then look at f(a). This is an element of B, and
we shall call it b; that is, b = f(a). By definition of g, g maps b into some
element of C, say c. That is, g(b) = c. Putting these two separate
operations into one symbol (specifically, by replacing b by f(a)), we obtain
g(f(a)) = c.
This is pictured in Figure 4.4.
Figure 4.4 (g ∘ f = h)
Definition
Let f: A → B and g: B → C be given. Then by the composition of f and
g, written g ∘ f, we mean the function h: A → C such that for each a ∈ A,
h(a) = g(f(a)).
Observe that by our definition g ∘ f has been defined so that its domain
is the domain of f and its range is the range of g. In this sense, then, we
must be very careful not to confuse g(f(x)) with f(g(x)). For example, by
definition of g, g(x) belongs to C, but C is not necessarily the domain of
f. In other words, f(g(x)) might not even make sense if f: A → B and
g: B → C.
While our definition refers to the sets A, B, and C, nothing excludes
the possibility that A = B = C. In this event, the problem described
in the previous paragraph cannot occur since the domain and range of each
function is then the same set, say, A. However, even then we must not
confuse f ∘ g with g ∘ f. By way of illustration let A = {1,2,3} and define
f and g as follows:
f(1) = 1    g(1) = 2
f(2) = 3    g(2) = 1
f(3) = 2    g(3) = 3.
Then
g(f(1)) = g(1) = 2
g(f(2)) = g(3) = 3
g(f(3)) = g(2) = 1
while
f(g(1)) = f(2) = 3
f(g(2)) = f(1) = 1
f(g(3)) = f(3) = 2.
Thus, both f ∘ g and g ∘ f are 1-1 and onto functions from A to itself, but
they are not equal.
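The computation above can be repeated mechanically. In this sketch the functions are stored as dictionaries and compose is a helper name of our own; it simply applies the inner function first and the outer function second.

```python
def compose(g, f):
    # (g ∘ f)(a) = g(f(a)): apply f first, then g.
    return {a: g[f[a]] for a in f}

# The functions f and g on A = {1,2,3} from the text.
f = {1: 1, 2: 3, 3: 2}
g = {1: 2, 2: 1, 3: 3}

g_after_f = compose(g, f)   # g ∘ f
f_after_g = compose(f, g)   # f ∘ g
```

The two results are different dictionaries, which is just another way of saying that composition of functions need not commute.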
Observe also that if f: A → B and g: B → C are both 1-1 and onto, then
g ∘ f is also 1-1 and onto from A to C. This is an important result; but
it is not as self-evident as it might seem, at least in the sense that its
converse is not true. That is, g ∘ f can be 1-1 and onto even though not
both f and g are. An example is shown in Figure 4.5.
In the next section, we shall explore a few properties of functions which
are both 1-1 and onto.
Here g ∘ f is both 1-1 and onto (g(f(a₁)) = c₁ and g(f(a₂)) = c₂).
But f is not onto and g is not 1-1.
Figure 4.5
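A small example of this kind can be written out and checked. The sets and arrows below are a hypothetical instance of our own (the labels follow the a₁, c₁ style of the figure, but we do not claim they reproduce it exactly), and the helper names are assumptions of this sketch.

```python
def compose(g, f):
    # (g ∘ f)(a) = g(f(a)).
    return {a: g[f[a]] for a in f}

def is_one_to_one(rule):
    values = list(rule.values())
    return len(values) == len(set(values))

def is_onto(rule, range_):
    return set(rule.values()) == set(range_)

B = {"b1", "b2", "b3"}
C = {"c1", "c2"}

f = {"a1": "b1", "a2": "b2"}               # 1-1, but b3 is never used
g = {"b1": "c1", "b2": "c2", "b3": "c2"}   # onto C, but b2 and b3 collide

h = compose(g, f)                          # g ∘ f, from A to C
```

Here the element b3 is both the reason f fails to be onto and the reason g fails to be 1-1, yet b3 never appears in g ∘ f, so the composition is unaffected.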
4.3 INVERSE FUNCTIONS

In many cases when we deal with functions, we start with a set and make
up some function that in turn induces an image or range. For example,
suppose that A denotes any nonempty set. Then we can easily invent at
least one function that will map A onto itself. Namely, let us define f
by f(a) = a for all a ∈ A. For fairly apparent reasons such a function
is called the identity function on A.
Definition
Let A be any set. Then the identity function on A, usually denoted by
I, is defined by I(a) = a for all a ∈ A.
With this definition in mind, suppose that we have a function f: A → B
that is both 1-1 and onto. This is pictured in Figure 4.6.
Figure 4.6
The point is that f induces a function g from B to A in a natural way, merely
by our reversing the direction of the arrows in Figure 4.6. Thus, we have
Figure 4.7.
Figure 4.7
In a manner of speaking, g “undoes” f. In still other words, g ∘ f is
the identity function on A. That is, g(f(a)) = a for all a ∈ A. In terms
of the above notation, g ∘ f = I. In a similar way, f ∘ g is the identity
function on B. In the special case that A = B, we may say that f ∘ g =
g ∘ f = I.
The function g defined in this way is called the inverse of f, and is usually
denoted by f⁻¹. This notation is like the usual notation for multiplication
of numbers. Recall that we often write a⁻¹ to denote the number which
when multiplied by a equals 1. That is, aa⁻¹ = 1. Notice that I plays
for function composition the role that 1 plays for ordinary arithmetic
multiplication (that is, the role of an “identity” element; meaning that, with
respect to the given operation, it does not change anything).
This idea of inverse is exactly the same as the more intuitive concept
of inverse as stated in Chapter 1. For example, let us consider the idea
that subtraction is the inverse of addition. Let R denote the set of real
numbers and define f: R → R by f(x) = x + 3 for all x ∈ R. Thus, in
essence, f is the process of adding 3 to any number. Now define g: R → R
by g(y) = y − 3 for all y ∈ R. Then g, in the sense of Chapter 1, is the
inverse of f since subtracting 3 is the inverse of adding 3. At the same time,
in terms of our above ideas g “undoes” f in the sense that g ∘ f is the identity
function on R. That is, g(f(x)) = g(x + 3) = (x + 3) − 3 = x.
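A quick numerical spot check of this example is sketched below. It is not a proof, since only a handful of sample values (chosen by us) are tried; exact fractions are used in place of decimals so that rounding plays no part.

```python
from fractions import Fraction

def f(x):
    # The process of adding 3.
    return x + 3

def g(y):
    # The process of subtracting 3.
    return y - 3

# A few sample real numbers, chosen arbitrarily for the check.
samples = [0, -7, 12, Fraction(5, 2), Fraction(-1, 3)]

undone_one_way = [g(f(x)) for x in samples]    # g ∘ f
undone_other_way = [f(g(x)) for x in samples]  # f ∘ g
```

Both lists reproduce the samples unchanged, in agreement with g ∘ f = f ∘ g = I on R.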
Perhaps the most important thing to notice here is that unless f is both
1-1 and onto we are unable to talk about f⁻¹ in any meaningful way. Pictorially,
we are saying that unless f is both 1-1 and onto we do not get a
function from B to A by reversing the arrows. To illustrate these results
pictorially, consider Figure 4.8.
In Figure 4.8(a), reversing the arrows does not give us a function from
B to A because not all of B is in the domain. In other words, if f: A → B
is not onto, then the domain of f⁻¹ is not B; hence, f⁻¹ is not a function from
B to A, even though the range of f⁻¹ is A.
In Figure 4.8(b), reversing the arrows does not give us a function from
B to A because f⁻¹ would then assign to an element of B more than one
image in A. In other words, in terms of classical notation, if f is not 1-1
then f⁻¹ is not single-valued.
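"Reversing the arrows" can be carried out literally for a finite function. The following sketch uses helper names of our own (reverse_arrows and inverse) and returns None whenever the reversed arrows fail to define a function from B to A.

```python
def reverse_arrows(rule):
    # Turn every arrow a -> b into b -> a, collecting all the a's that hit b.
    reversed_rule = {}
    for a, b in rule.items():
        reversed_rule.setdefault(b, set()).add(a)
    return reversed_rule

def inverse(rule, range_):
    rev = reverse_arrows(rule)
    if set(rev) != set(range_):
        return None   # f is not onto: some element of B has no arrow leaving it
    if any(len(sources) > 1 for sources in rev.values()):
        return None   # f is not 1-1: some element of B would get two images
    return {b: next(iter(sources)) for b, sources in rev.items()}
```

The two early exits correspond to cases (a) and (b) of Figure 4.8 as described above; only when neither occurs do we obtain f⁻¹.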
(a) Not in dom of f⁻¹    (b) Has 2 images under f⁻¹
Figure 4.8
We place special attention on functions from a set to itself that are both
1-1 and onto. Such functions are called permutations, and observe how
this coincides exactly with our earlier definition of a permutation. Our
original definition of a permutation was that it was a rearrangement of a
set of objects. What is a rearrangement other than a 1-1, onto function
from the set to itself?
In an attempt to show an application of a mathematical structure and
at the same time to use some of our new definitions, let us look at the set
of all permutations on a set of three elements. To this end, let S = {1,2,3}.
As described in Chapter 2, there are exactly six functions F: S → S such
that F is both 1-1 and onto.
I(1) = 1    f(1) = 1    g(1) = 2
I(2) = 2    f(2) = 3    g(2) = 1
I(3) = 3    f(3) = 2    g(3) = 3
h(1) = 2    k(1) = 3    m(1) = 3
h(2) = 3    k(2) = 1    m(2) = 2
h(3) = 1    k(3) = 2    m(3) = 1.
We use I to denote the identity function on S.

We may also check that the composition of any two such permutations
is again a permutation. For example, g ∘ h and h ∘ g yield:
g(h(1)) = g(2) = 1    h(g(1)) = h(2) = 3
g(h(2)) = g(3) = 3    h(g(2)) = h(1) = 2
g(h(3)) = g(1) = 2    h(g(3)) = h(3) = 1.
But,
f(1) = 1    m(1) = 3
f(2) = 3    m(2) = 2
f(3) = 2    m(3) = 1.
Hence, g ∘ h = f since they have the same domain and the same image,
element for element. In other words, in terms of the number-versus-numeral
concept, g ∘ h and f are two different names for the same permutation
on S; while h ∘ g and m are synonyms for a different permutation.
In a similar way, suppose that we wish to “undo” h. In terms of a
picture, we have Figure 4.9.
Figure 4.9
Thus to “undo” h we want to “map” 2 into 1, 3 into 2, and 1 into 3, as
shown in Figure 4.10.
h⁻¹(1) = 3
h⁻¹(2) = 1
h⁻¹(3) = 2
Figure 4.10
We next observe that this is precisely what k does. Thus, we see that
k = h⁻¹. As a final check, without reference to the figure, we obtain
k(h(1)) = k(2) = 1
k(h(2)) = k(3) = 2
k(h(3)) = k(1) = 3.
Similarly,
h(k(1)) = h(3) = 1
h(k(2)) = h(1) = 2
h(k(3)) = h(2) = 3.
In this way we can construct an “arithmetic” of functions, where our
“multiplication” (composition) table would be given by the following
table. The details are left as an exercise.
Composition of two functions is an operation, just as multiplication
is an operation between two numbers. (The entry in row x and column y
of the table is x ∘ y.)
∘    I  f  g  h  k  m
I    I  f  g  h  k  m
f    f  I  k  m  g  h
g    g  h  I  f  m  k
h    h  g  m  k  I  f
k    k  m  f  I  h  g
m    m  k  h  g  f  I
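The table can be regenerated by machine as a check on the hand computation. The sketch below stores the six permutations of S = {1,2,3} as dictionaries under the names used in the text; the helpers compose and name_of, and the convention that the entry for (row, column) is row ∘ column, are assumptions of this illustration.

```python
perms = {
    "I": {1: 1, 2: 2, 3: 3},
    "f": {1: 1, 2: 3, 3: 2},
    "g": {1: 2, 2: 1, 3: 3},
    "h": {1: 2, 2: 3, 3: 1},
    "k": {1: 3, 2: 1, 3: 2},
    "m": {1: 3, 2: 2, 3: 1},
}

def compose(outer, inner):
    # (outer ∘ inner)(x) = outer(inner(x)).
    return {x: outer[inner[x]] for x in inner}

def name_of(perm):
    # Look up which of the six names denotes this permutation.
    return next(name for name, p in perms.items() if p == perm)

# table[(row, col)] is the name of row ∘ col.
table = {(r, c): name_of(compose(perms[r], perms[c]))
         for r in perms for c in perms}

if __name__ == "__main__":
    print("∘  ", "  ".join(perms))
    for r in perms:
        print(r, " ", "  ".join(table[(r, c)] for c in perms))
```

That name_of never fails is itself the statement that the composition of two permutations is again one of the six permutations.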
In the next section we shall show one more illustration of 1-1 and onto
functions in mathematics. In particular, we shall discuss the concept of
counting and the idea of cardinality from the point of view of 1-1 and onto
functions. Later on we shall return to these ideas with respect to the
classical concept of functions.
Exercises
1. Let A = {1,2,3} and let B = {7,8,9}. Describe a function f: A → B
such that
(a) f is neither 1-1 nor onto.
(b) f is 1-1 and onto.
2. In terms of the problem above, is it possible that f can be 1-1 and not
onto? Explain.
3. Let A be the set of integers, and define f by f(a) = a² for each a ∈ A.
(a) Describe B if B is the image of A with respect to f.
(b) In this case is f: A → B onto? Explain.
(c) In this case is f: A → B 1-1? Explain.
4. In each of the following let R denote the set of real numbers and let
f: R → R. If g denotes the inverse of f, find g in each
of the following cases.
(a) f(r) = r + 3.
(b) f(r) = 3r.
(c) f(r) = 2r + 7.
5. Again, let R denote the set of real numbers and let f: R → R and g: R → R
be defined by f(r) = 3r and g(r) = r + 3. Describe the functions
f ∘ g and g ∘ f in this case.
6. Let I, f, g, h, j, and k denote the six permutations on a set A containing
three elements as outlined in the text. Compute the following.
(a) f ∘ (g ∘ h).
(b) g ∘ j⁻¹.
(c) (g ∘ j)⁻¹.
(d) (j ∘ g)⁻¹.
(e) j⁻¹ ∘ g⁻¹.
4.4 COUNTING REVISITED

In Chapter 1 we discussed the concept of counting, and briefly traced
the development of our numeral system. In Chapter 2 we discussed clever
ways of counting. However, in both chapters we made the assumption
that we all knew intuitively what numbers were. The 1-1 and onto functions
afford us a logical way to discuss the number concept. In this section
we shall try to indicate how this is so.
Let us first observe that one does not need a numeral system to capture
the meaning of “how-many-ness.” For example, suppose that we have
never learned to count and that we wish to decide which of two bags contains
more jelly beans. Clearly, we can eject simultaneously one bean
from each bag, and the bag that is emptied first has the smaller number of
beans. If both bags are emptied at the same time then the two bags have
the same number (even though we may not know how to name that
number).
We actually, in the above problem, constructed a function from one
set to another. More specifically, let A denote the set of beans in the first
bag and B the set of beans in the second bag. As we eject beans from
each bag we are actually matching a bean from one bag with a bean from
the other. This matching is the function, and by construction it is 1-1.
However, it is not necessarily onto. In fact, it would be onto if and only
if there were the same number of beans in each bag, at least for finite
collections.
To view this idea in terms of the salesmen and the hotel rooms, suppose
we insist that no two salesmen can be assigned to the same room (this
is the criterion that our function be 1-1). Then, if some rooms are unused
after all the salesmen have been assigned, we can conclude that there are
more rooms than salesmen; if salesmen are still left unassigned after the
rooms have been used up, we can conclude that there are more salesmen
than rooms; and if when all salesmen have been assigned there are no
rooms left vacant, we can conclude that there are the same number of
rooms as salesmen. Yet in each case we do not (and need not) know how
many of each there are.
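The jelly-bean procedure can be imitated in code without ever counting. In the sketch below, which is only an illustration, the function name compare_without_counting and the sample bags are our own; the only questions the code ever asks are "is this bag empty?" and "remove one bean."

```python
def compare_without_counting(bag_a, bag_b):
    # Work on copies so the caller's bags are left untouched.
    a, b = list(bag_a), list(bag_b)
    # Eject one bean from each bag at the same time, as long as both have beans.
    while a and b:
        a.pop()
        b.pop()
    if not a and not b:
        return "same number"
    return "first bag has fewer" if not a else "second bag has fewer"
```

For example, compare_without_counting(["red", "green"], ["blue", "pink", "white"]) reports that the first bag has fewer, although no numeral ever appears in the procedure.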
Definition
Two sets A and B are said to have the same cardinality (or to be equipotent)
if and only if there exists a function f: A → B such that f is both
1-1 and onto. In this case we write A ~ B.
While this definition may sound stilted and abstract, it captures the idea
of “how-many-ness” without using the concept of number in the definition.
Before we check some of the properties of our definition, let us make
reference to one more point. The reader may still feel that all we are
doing is counting. The answer to this is that if the collections are finite
then it is true that counting and 1-1 correspondences are equivalent.
The trouble enters when we deal with infinite collections. For example,
let us consider the set of whole numbers W. Clearly, this is an infinite
collection. Intuitively, we would like to believe that the size of a collection
does not depend on how the objects in the collection are named. That is,
if there are 10 people in a room and if each of the 10 changes his name,
there are still 10 people in the room. However, if we try this idea on W,
there arises an interesting dilemma. Let us replace each member of W
by its double. Thus, 0 remains 0, 1 becomes 2, 2 becomes 4, 3 becomes
6, and so on. In essence, all we have done is change the name of each
member of W.
Old name →  0  1  2  3  4   5   6   7   8   9  10
            ↓  ↓  ↓  ↓  ↓   ↓   ↓   ↓   ↓   ↓   ↓
New name →  0  2  4  6  8  10  12  14  16  18  20
While it appears that all we have done is change names, observe that
the top line lists the whole numbers W, but the bottom line names only
the set of even whole numbers E. Thus, on the one hand we want to be
able to say that W and E have the same number of elements (since we
got one from the other just by changing names); on the other hand, we
know that E ⊂ W. In fact, E seems to have only half as many members
as W.
The point is that the new definition of cardinality “works” the way we
would like it to, regardless of whether the sets are finite or infinite. However,
the more intuitive definition works well only for finite sets.
The above illustration gives us a tremendous hint as to how to define
infinity without using the concept of “how-many-ness.” Notice that E
is a proper subset of W (by proper subset we mean that all E’s are W’s
but not all W’s are E’s); yet we have constructed a function f: W → E that
is both 1-1 and onto. To be more specific, we have defined f by f(w) = 2w
for all w ∈ W. Such a function f is onto since each even whole number is
the double of a whole number. That is, if e is even, e = 2k where k is a
whole number; hence, e = f(k). Moreover, f is 1-1 since 2w₁ = 2w₂ if
and only if w₁ = w₂.
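No program can examine all of W, but the doubling rule and its inverse can at least be spot-checked on an initial stretch of the whole numbers. The sketch below does this for the first thousand; the bound of 1000 and the name f_inverse are arbitrary choices of ours, and the check illustrates the argument rather than proving it.

```python
def f(w):
    # The renaming rule f: W -> E, f(w) = 2w.
    return 2 * w

def f_inverse(e):
    # Undo the doubling; only even whole numbers belong to E.
    if e < 0 or e % 2 != 0:
        raise ValueError("not an even whole number")
    return e // 2

first_whole_numbers = range(1000)
first_evens = range(0, 2000, 2)

# 1-1 on the sample: distinct inputs give distinct doubles.
distinct_images = len({f(w) for w in first_whole_numbers})
```

Every even number in the sample is hit by f (namely by its half), and no two whole numbers in the sample share a double, which mirrors the onto and 1-1 arguments given above.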
However, our intuition tells us that this cannot happen for finite sets.
In other words, if A has 3 elements and B has 4 elements, then we cannot
find a function from A to B that is both 1-1 and onto. We do not use
intuition when we talk about infinite sets, for we have never seen infinity; but
we use intuition when we discuss finite sets. For example, the number
10^(10^10) is represented in place-value as a 1 followed by 10 billion zeros. Call
this number N. Then, N is certainly larger than any number we have
ever conjectured in the real world. Yet, large as it is, it is still painfully
finite. In other words, if we could ever manage to count to N, our next
numbers would be N + 1, N + 2, N + 3, . . . , and so we are no closer
to the “end” of the whole numbers than when we started to count. In
still other words, if we call N our new starting point (0), we are, in effect,
back at the beginning when we count N + 1, N + 2, and so on. The
following is the definition for an infinite set.
Definition
The set A is said to be infinite if and only if there exists a proper subset
B of A such that A ~ B.
Notice how this definition captures exactly what happened in our
discussion of E and W above, and that this definition nowhere uses the
intuitive meaning of infinity.
Let us return to our definition of cardinality and show that it has the
properties that we would intuitively believe it has.
For example, we feel that any set has the same number of elements as
itself. Given any set A, consider the identity function, I, on A.
Then I: A → A is both 1-1 and onto; hence, by our definition A has the
same cardinality as A; or in terms of our new notation, A ~ A.
Secondly, we would like to believe that if A has the same number of
elements as B, then B has the same number of elements as A. If A has
the same cardinality as B, our definition says that there exists f: A → B
such that f is both 1-1 and onto; but, as we saw in the last section, this is
precisely what is necessary to guarantee the existence of f^{-1}, which is
also 1-1 and onto. In other words, f^{-1}: B → A is a 1-1 and onto map from
B to A, and this is exactly what our definition requires for B to have the
same cardinality as A.
Thirdly, we would like to believe that if A has the same cardinality as
B and B has the same cardinality as C, then A has the same cardinality
as C.
Again, the fact that A ~ B means there exists a function f: A → B
which is both 1-1 and onto. Similarly, B ~ C implies that there exists
g: B → C that is 1-1 and onto. Also, as we saw in the last section, the
composition of two 1-1 onto functions is also 1-1 and onto. In other
words, g∘f is a 1-1 onto function from A to C. Thus, by our definition
A ~ C.
Summed up, we have shown for any sets A, B, and C the following
three things.
(1) A ~ A.
(2) If A ~ B, then B ~ A.
(3) If A ~ B and B ~ C, then A ~ C.
In other words, ~ is an equivalence relation. This, in turn, means that,
among sets with a given cardinality, when we have seen one such set we
have seen them all. To clarify this point let us now analyze the usual
counting process. Suppose that we have a set of three apples; how do we
know that there are three? We count them! But what was the structure of
the counting process? In essence we said, “Here is 1, here is 2, and here
is 3.” If we let J_3 = {1,2,3}, what we did was form a function from J_3
to the set of apples that was 1-1 and onto. This idea is pictured in Figure
4.11.
Figure 4-11
We generalize this idea as follows: Let J_n = {1,2,3, ..., n} for any
positive integer n.
Definition
A set S is said to be finite and of cardinality n if and only if S ~ J_n.
Thus, the set of apples would be said to have cardinality 3, which
coincides exactly with our intuitive feeling.
Here is a good place to point out the difference between cardinal and
ordinal numbers. In counting the three apples, notice that we could have
started the count with any of the apples. That is, we might have labeled
any one of the three apples as being the first, any one of the remaining
two as being second, and the last as being the third. In any event, the
cardinality of the set would be 3, but we have a choice of six different 1-1
onto functions from J_3 to the set of apples. That is, the actual cardinality
is independent of what 1-1 onto function we choose, but the choice of the
function will affect how we have elected to order the members of the
collection. This is why dictionaries usually define the ordinal numbers as
adjectives (first, second, third) and the cardinal numbers as nouns (one,
two, three).
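The six ways of counting three apples can be listed by machine. This is
a minimal sketch; the apple labels "red", "green", and "yellow" are
hypothetical names introduced only so that the apples can be told apart.

```python
from itertools import permutations

J3 = (1, 2, 3)
apples = ("red", "green", "yellow")   # hypothetical labels for the apples

# Each ordering of the apples gives one 1-1 onto function from J3 to the
# set of apples: 1 goes to the first apple, 2 to the second, 3 to the third.
countings = [dict(zip(J3, order)) for order in permutations(apples)]

for counting in countings:
    print(counting)

print(len(countings))   # 6
```

Every one of the six functions uses the same set J3, which is the
cardinal fact; the functions differ only in which apple is called first,
second, and third, which is the ordinal fact.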
Let us agree to define the cardinality of the empty set to be 0. Based
on our experience, we would have no reason to define a cardinality of less
than 0. In summary, we make the following definitions.
J_0 = ∅
J_1 = {1}
J_2 = {1,2}
J_n = {1,2, ..., n}.
Then A is said to be finite of cardinality n if and only if A ~ J_n.

At this point we leave the discussion of finite sets and return to the idea
of infinite sets. As we have seen (either by intuition or by our new
definition), the set of whole numbers is an infinite set. But it is a
“special” kind of infinity. Let us suppose that we are told to list the
whole numbers. We could begin
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...
and even though we could never complete the list, we would still know, in a
well-defined way, where each number would occur in our list had we elected
to carry it that far. In still other words, while it is impossible to
explicitly list all the whole numbers, each whole number occurs in a
well-defined way somewhere in the list. For example, 456 would be the four
hundred fifty-seventh (why?) entry in the list and we know this without
having to write the entire list. In brief, when we try to list the whole
numbers we fail because there are too many, not because we cannot find an
orderly way to list them.
Generally, our “favorite” infinite sets will be those that are also
“listable”; that is, sets in which we can say, “This is the first member,
this is the second,” and so on, and not skip any member of the collection.
This is not as simple as it sounds and a note to this effect is presented at
the conclusion of this section.
Definition
Let W denote the set of whole numbers. Then the infinite set A is called
countable (or denumerable) if and only if A ~ W.¹ In essence, a countable
set is the most orderly infinite collection.
By way of illustration, the set of common fractions is a countable
collection. This is an interesting observation, both in its own right and
also in terms of number-versus-numeral. We have seen that we cannot order
the rational numbers in the same way as we can order the whole numbers.
For instance, we can talk about unequal but consecutive whole numbers,
but this concept does not exist for rational numbers. In particular, as we
have seen, if r and s are two unequal rational numbers, there are infinitely
many real numbers between r and s.
However, with common fractions, there is an interesting but unusual
way to define size. Given the common fraction m/n, let us define its size
to be m + n. With respect to this definition, the common fraction 2/5
would have size 7 (that is, 2 + 5). Of course, there is more than one
common fraction whose size is 7. If we restrict our numerators and
denominators to being whole numbers, we find that there are exactly seven
common fractions of size 7. They are
0/7, 1/6, 2/5, 3/4, 4/3, 5/2, and 6/1.
7/0 is not a common fraction by our agreement that 0's are not allowed as
denominators.
¹ Some people are afraid of numbering something 0. To this end, we may
replace W by N (the set of natural numbers; that is, the whole numbers
excluding 0). The point is that W ~ N since the mapping f(w) = w + 1 is a
1-1 onto function from W to N.
Generally, there are n common fractions of size n. Specifically,
0/1 is the only common fraction of size 1.
0/2 and 1/1 are the only ones of size 2.
0/3, 1/2, and 2/1 are the only three of size 3.
0/4, 1/3, 2/2, and 3/1 are the only four of size 4.
Let us also agree that when we list common fractions according to size we
will list them in order of increasing numerators. Thus, our list would be:
0/1, 0/2, 1/1, 0/3, 1/2, 2/1, 0/4, 1/3, 2/2, 3/1, ...
and we could tell exactly where any common fraction would appear in this
listing. For example, 34/55 is the thirty-fifth entry when we are listing
common fractions of size 89 (that is, 34 + 55). In fact, if we recall the
recipe that 1 + 2 + ... + n = n(n + 1)/2, we can tell even more precisely
where 34/55 will occur in the list. Namely, since there is one fraction
of size 1, two are of size 2, and so on, our list would have 1 + 2 + ... + 88
entries in it before we even got to size 89 (that is, we would first list all
fractions from size 1 through 88 before we would begin listing those of
size 89); but this is precisely 88(89)/2 = 3916. Then we would have to
proceed an additional 35 places. Thus, 34/55 would be the 3951st entry in
the list, and so the set of common fractions is countable, since we can
compute where each member of the set will occur in our listing.
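The computation just made for 34/55 works for any common fraction, and
it can be written out as a short routine. The sketch below assumes the
conventions of this section (whole-number numerators, nonzero
denominators, increasing numerators within each size); the names position
and first_entries are our own.

```python
def position(m, n):
    """1-based place of the common fraction m/n in the size-ordered list."""
    if m < 0 or n <= 0:
        raise ValueError("need a whole-number numerator and a nonzero denominator")
    size = m + n
    before = (size - 1) * size // 2    # 1 + 2 + ... + (size - 1)
    return before + m + 1              # numerators run 0, 1, 2, ... within a size


def first_entries(count):
    """Return the first `count` fractions of the list as (m, n) pairs."""
    entries = []
    size = 1
    while len(entries) < count:
        for m in range(size):          # m = 0, 1, ..., size - 1
            entries.append((m, size - m))
        size += 1
    return entries[:count]


print(position(34, 55))    # 3951
print(first_entries(10))
```

The second function builds the list directly, so the formula in the
first can be checked against it entry by entry.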
The difference between number and numeral here is that our countable
list of common fractions includes infinitely many synonyms for the same
rational number (for example, 1/2, 2/4, 3/6, 4/8, and so on are all
different as common fractions, or numerals, but each names the same number).
This means that the cardinality of the rational numbers cannot exceed the
cardinality of the common fractions; so we can conclude that the rational
numbers are also countable.
Surprising as it may seem, not all infinite sets are countable. In
particular (as we shall show in the note at the end of this section), the
real numbers are not countable. Moreover, since the real numbers are made
up of the rational numbers and the irrational numbers, this implies that it
must be the irrational numbers that are not countable. In this sense there
are more irrational numbers than rational numbers. More precisely, we can
find a 1-1 onto function from the whole numbers to the rational numbers,
but no such function exists from the whole numbers to the irrational
numbers.
As a final way to identify the modern concept of counting with the
traditional method, let us merely observe that when we wrote in earlier
chapters, for example, that N(A) = 7, this was just another way of saying
that A was finite with cardinality 7. That is, A ~ J_7.
This concludes the modern introduction to functions. In the next
section we shall coordinate this discussion with the classical idea of a
function.
Exercises
1. Describe two well-known sets, each of which has the same cardinality
as J_10.
2. By discussing the function f(n) = 3n, demonstrate that the set of
multiples of three has the same cardinality as that of the whole numbers.
3. Using the discussion of this section, where on our list would the following
fractions appear if we define the size of m/n to be m + n?
(a) 7/3. (c) 99/101.
(b) 3/4. (d) 30/40.
4. Let N denote the natural numbers and let S denote the subset of N
consisting of the perfect squares.
(a) Find a function f: N → S that is 1-1.
(b) Is it possible that there exists a function from N to S that is 1-1
but not onto? Explain.
5. If A is an infinite set, is it possible to find a function f: A → A that is
1-1 but not onto? Explain.
6. Let N denote the set of natural numbers. Describe five subsets of N
that have the same cardinality as N.
7. How many permutations on A are there if N(A) = 5?
A NOTE ON COUNTABLE SETS
As we have mentioned before, since our intuition is based on finite studies,
we should be wary of any prediction it makes about infinite collections.
One such prediction it makes is that if we have seen one infinite collection,
we have seen them all. That is, there is a tendency to assume that if two
collections are infinite they can be placed in a 1-1 correspondence.
However, this is not the case. There are different orders of infinity; or
worded differently, the cardinality of two infinite sets need not be equal.
While it is not our purpose to explore this topic in great detail, there is
one remark we would like to make.
In particular, there is a famous result known as Cantor's diagonalization
principle, which is used to show that the real numbers are not countable.
In other words, Cantor proved that while the rational numbers and the
natural numbers had the same cardinality, the real numbers and the natural
numbers had different cardinalities. Specifically, the cardinality of the
real numbers is greater than that of the natural numbers. Cantor's
method involved the use of decimal notation. First of all, recall that every
real number may be viewed as a decimal, the rational numbers being
represented as either terminating or repeating decimals, and the irrational
numbers as endless nonrepeating decimals. Moreover, in decimal form, with
one exception, two numbers are equal if and only if they agree (are the
same) for each decimal place. The one exception occurs when a string of
9's is endlessly repeated. For example, 1.000000... and 0.9999... are
two different ways of writing 1 as a decimal. To avoid this ambiguity in
discussing Cantor's principle, let us agree to use the repeating 9
representation uniquely when the situation arises. In this way we make sure
that no two decimals can represent the same number and that no decimal
terminates.
Cantor then proceeds to use the indirect proof to show that even the
real numbers just between 0 and 1 are not countable. To do this he
assumes the opposite is true, and then proceeds to arrive at a contradiction.
Namely, he assumes that the real numbers from 0 through 1 are countable.
Under this assumption, it means that we can order the real numbers in a
well-defined way in such a manner that any real number will eventually
appear on the list. Let us use the notation r_1 to denote the first real
number on the list, r_2 to denote the second one, and so on. In this way the
assumption that the set of real numbers is countable allows us to conclude
that the real numbers may be arranged in such a way
r_1, r_2, r_3, r_4, ..., r_n, ...
that although the listing is endless, each real number must occur somewhere
on the list.
The diagonalization principle then takes on the following form. Cantor
defines a real number b as follows. We look at the first decimal place of
r_1, and we choose any digit except the digit that occurs in the first
decimal place of r_1 to be the first decimal place of b. For the second
decimal place of b we choose any digit other than the digit that occurs in
the second decimal place of r_2. In general, for the nth decimal place of b
we choose any digit except the one that occurs in the nth decimal place of
r_n. In this way b is certainly a real number, but it cannot be any real
number already on the list, for it differs from r_1 in at least the first
decimal place; it differs from r_2 in at least the second decimal place; and
so on. In other words, if the real numbers were countable, then b would
have to occur somewhere on the list; that is, it would have to occur in the
nth spot for some natural number n. However, the name of the real number in
this spot is r_n, and b cannot equal r_n since their nth decimal places are
unequal. This contradiction validated the claim that the real numbers were
not a countable collection.
If we use double subscript notation, it is easy to see why this was called
a diagonalization process. Namely, let r_{nm} denote the mth decimal digit
of r_n. For example, by r_{37} we would mean the digit in the seventh
decimal place of the real number that is listed third. More concretely, let
us assume that the listing appeared as follows.
r_1 = 0.14356980357...
r_2 = 0.4789024666790...
r_3 = 0.03485769687123...
r_4 = 0.98079666453123...
Then, r_{37} = 6, r_{42} = 8, and so on.

Cantor's method involved studying the nth digit of r_n. That is, he
looked at r_{11}, r_{22}, r_{33}, r_{44}, r_{55}, and so on, and found that
these elements form the diagonal of the above array of numbers. Moreover,
as a further illustration of Cantor's principle, using the above array we
would form b by putting any digit except 1 in the first decimal place, any
digit except 7 in the second decimal place, any digit except 4 in the third
decimal place, any digit except 7 in the fourth decimal place, and so on.
Thus, a possible b would be b = 0.3469...; no matter what b is equal to, it
certainly could not equal r_1, r_2, r_3, or r_4 by its construction.
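The construction of b can be imitated on the four sample decimals above.
The sketch is limited in an obvious way: it handles only a finite listing,
whereas Cantor's argument concerns an endless one. The particular rule
used to pick each digit (5, or 4 when the diagonal digit is already 5) is
our own assumption; any rule that avoids the diagonal digit would serve,
and this one also avoids 0 and 9, so the ambiguity of repeating 9's never
arises.

```python
def diagonal_number(listing):
    """Build b from a finite listing of decimal expansions.

    listing[n - 1] is a string holding the digits of r_n that follow the
    decimal point. The nth digit of b is chosen to differ from the nth
    digit of r_n.
    """
    digits = []
    for n, r in enumerate(listing, start=1):
        d = int(r[n - 1])                        # nth decimal digit of r_n
        digits.append("5" if d != 5 else "4")    # assumed rule: never equals d
    return "0." + "".join(digits)


listing = [
    "14356980357",       # r_1
    "4789024666790",     # r_2
    "03485769687123",    # r_3
    "98079666453123",    # r_4
]

b = diagonal_number(listing)
print(b)   # 0.5555
```

Here the diagonal digits are 1, 7, 4, and 7, so the rule yields 5 in each
of the first four places, and b differs from r_n in the nth place for
n = 1, 2, 3, 4.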
Finally, Cantor's demonstration gave a rigorous proof that there were
“more” irrational numbers than rational numbers. Namely, since the real
numbers are made up exclusively of the rational and the irrational numbers,
and since the rational numbers are countable and the real numbers are not,
then assuming that the union of two countable sets is again countable, it
must be that the irrational numbers are not countable.
4.5 AN INTRODUCTION TO FUNCTIONS OF A REAL VARIABLE
With the first four sections of this chapter as background, we are now
ready to begin a study of functions from the traditional point of view.
Our previous study was actually more general than what we now need.
That is, we have already discussed functions in terms of arbitrary domains
and images; in the study of real variables, all we do is restrict our domain
and range to subsets of the real numbers.
By way of illustration, consider an expression such as
f(x) = x^2.
As before, f denotes the rule, x denotes the input (domain), and x^2 denotes
the output (image). It is not clear from f(x) = x^2 what the domain of f is.
Let us assume, for lack of information to the contrary, that x can denote any
real number. Then the domain of f is the set of real numbers, while the
image of f is the set of all nonnegative real numbers (0 is included, since
0 = 0^2). Since x^2 = (-x)^2, f would not be 1-1 since for any nonzero real
number x, x and -x have the same image with respect to f. However, f is
single-valued since for each given input there is one and only one output.
If we wanted to study functions of a complex variable, all we would have
to do is consider something like
f(z) = z^2,
where z denotes a complex number. In terms of a function-machine, f may
be viewed as the same machine as in f(x) = x^2. All that is different is that
now both the input and the output are subsets of the complex numbers
rather than of the real numbers.
We may even have examples of function-machines where the input is a
complex number and the output a real number, or vice versa. For
example, suppose that x denotes any real number, and that we define f by
f(x) = xi where, as usual, i^2 = -1. In this case the input is real, while
the output is purely imaginary. As a second illustration, consider the case
in which z denotes a complex number and |z| denotes its magnitude.
Consider the function f now defined by f(z) = |z|. In this case the input is
complex, but the output, by definition of magnitude, is a nonnegative real
number. In this case we would say that f is a real-valued function of a
complex variable to indicate that the output is real while the input is
complex.
Later we shall study different combinations of inputs and outputs;
however, for the time being, our only aim is to study real-valued functions
of a real variable. That is, we are interested in the situation wherein the
input as well as the output are real numbers.
Let us introduce new terminology for input and output. In an
expression such as f(x), f is referred to as the function; x, as the
independent variable; and f(x), as the dependent variable. (There still is a
tendency by some to refer to f(x) as the function. We mention this only so
that the reader will not be confused if he runs across it.)
To understand this terminology better, observe that in our more
intuitive language both the input and the output are variables but that the
output depends on the input. That is, the dependent variable plays the role
of the output which depends on the input (independent variable). While
this new vocabulary should be learned and understood, it is not really an
improvement over the old. For example, it is often not clear (and in other
cases it is even immaterial) which is the output and which is the input.
Again, in terms of function-machines, we can interchange terminals, and
reverse the role of output and input. In fact, this is precisely the role of
the inverse function. By way of illustration consider
f(x) = 2x + 3, (1)
where the domain of f is the set of real numbers.

In plain English f is the rule that doubles any given real number and then
adds 3 to the result. To “undo” f we would have to subtract 3 from a
number and then divide by 2. To see this more concretely from an
algebraic point of view, think of (1) with y denoting the output or dependent
variable. Then we have y = 2x + 3. Algebraically, we may correlate
the idea that y is dependent and x is independent with the fact that in an
expression such as y = 2x + 3 it is more “natural” to start with an
arbitrary (hence, the idea of independent) value of x, and then compute y
(hence, the idea that y is dependent).
We can then solve for x in terms of y and we obtain
x = (y - 3)/2.
In this expression it is easier to think of y as the independent variable
from which we solve for x. In any event, the equations y = 2x + 3 and
x = (y - 3)/2 are called inverses of one another because each says the
same thing, only with a switch in emphasis. Translating these results into
the language of functions we are saying that f and g are inverses of each
other where
f(x) = 2x + 3   and   g(x) = (x - 3)/2,
so that
f(g(x)) = 2((x - 3)/2) + 3 = x;   ∴ f∘g = I.
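That the two rules undo one another can be checked on a few inputs. This
is a sketch only: the sample inputs are arbitrary, and exact fractions
are used (rather than rounded decimals) so that equality can be tested
without rounding error. A handful of samples illustrates the identity;
the algebra above is what establishes it for every real number.

```python
from fractions import Fraction


def f(x):
    """Double the input, then add 3."""
    return 2 * x + 3


def g(x):
    """Undo f: subtract 3, then divide by 2."""
    return (x - 3) / 2


samples = [Fraction(-7, 2), Fraction(0), Fraction(1, 3), Fraction(10)]

# Running g and then f returns the original input, and so does f then g.
assert all(f(g(x)) == x for x in samples)
assert all(g(f(x)) == x for x in samples)
```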
In terms of the function-machine, g is obtained from f by reversing the
input and output of the f-machine. In an expression such as f(x) =
2x + 3, there is a danger that it will be read too literally. The point is
that x serves merely as some sort of place holder to represent the input.
Perhaps it would be wiser, although much more cumbersome, to write
f([ ]) = 2[ ] + 3, where it would be easily understood that any number
could replace [ ]. That is, we should be emphasizing not the symbol x,
but rather the fact that f is the rule that doubles a given input and adds 3.
Thus, we would have
f(x^2 - 4) = 2(x^2 - 4) + 3
= 2x^2 - 5.
This, in turn, says that f(x^2 - 4) can also be looked upon as inducing a
new function g, where g is the rule that squares a given input, doubles the
result, and then subtracts 5 to yield the output.
The same type of problem occurs in the composition of any two functions.
In terms of the machine concept, all we have to do to represent
composition is to attach two function machines in sequence. In this way the
output from the first machine is the input for the second. By way of
illustration, consider
h(x) = x^2
and
k(x) = x + 1.
To form k∘h we run an input (x) through the h-machine and obtain x^2
as an output. We then make x^2 the input of the k-machine, whereby the
output is x^2 + 1. In short, the idea of following the h-machine by the
k-machine yields the equivalent of a new machine which has the effect of
squaring any input and adding 1 to the result.
We could have started with the k-machine, followed by the h-machine,
in which case if x was the input of the k-machine, the output would be
(x + 1), which would be the input of the h-machine; hence, the final output
would be (x + 1)^2, since the h-machine squares any given input. In terms
of a picture we would have Figure 4.12.
As a final observation, x^2 + 1 and (x + 1)^2 are equal if and only if x = 0.
Figure 4-12 (the k∘h-machine and the h∘k-machine). Note accompanying the
figure: In terms of the example in which f(x) = 2x + 3 and g(x) = (x - 3)/2,
we are saying that putting f and g in sequence would be a complicated way of
building a machine that “didn't do anything.” That is, f∘g = I means that
the input of the f-machine will be the output of the g-machine.
However, for two functions to be equal they must yield equal outputs for
each input, not just for some. Thus, we see again that h∘k and k∘h are
not equal functions.
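Attaching two machines in sequence is easy to imitate. In the sketch
below, compose is a helper of our own, and the names k_after_h and
h_after_k stand for k∘h and h∘k; the range of test inputs is arbitrary.

```python
def compose(outer, inner):
    """Return the machine that runs `inner` first and `outer` second."""
    return lambda x: outer(inner(x))


def h(x):
    return x ** 2


def k(x):
    return x + 1


k_after_h = compose(k, h)   # x -> x**2 + 1
h_after_k = compose(h, k)   # x -> (x + 1)**2

for x in (-2, -1, 0, 1, 2):
    print(x, k_after_h(x), h_after_k(x))

# Among the integers from -5 through 5, the two machines agree only at 0.
agreeing_inputs = [x for x in range(-5, 6) if k_after_h(x) == h_after_k(x)]
print(agreeing_inputs)   # [0]
```

Agreement at a single input is not enough for equality of functions,
which is why the two compositions are counted as different.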
We are saying that labels such as dependent and independent are
whimsical, and can be interchanged just by a reversal of the poles of the
function-machine. Moreover, observe that if we are given an algebraic
expression such as
x + y = 1,
there is no indication as to which of the two variables, x or y, will be
emphasized. To be sure, x + y = 1 can be correctly paraphrased as
either
x = 1 - y
or
y = 1 - x.
In x = 1 - y, y is the independent variable and in y = 1 - x, x is. Yet
we have no way of knowing which of these two is suggested by x + y = 1.
We shall accept such adjectives as dependent and independent at face
value, and use these expressions only when we feel that their context could
allow no misinterpretation.
As an illustration of these ideas let us consider the problem tackled by
Galileo of determining how far a freely falling body fell during a given
period of time. Galileo assumed that the object was close enough to the
Earth so that its gravitational acceleration could be taken as a constant,
and he also assumed that air resistance was absent, or at least negligible.
Under these conditions he discovered that
s = 16t^2,
where s was the number of feet the body fell in t seconds. From our point
of view, it is not important whether s = 16t^2 is true. It is important that
this equation defines a rule (function) that assigns to one real number t
another real number s. To review our new language, in s = 16t^2 we would
refer to t as the independent variable and to s as the dependent variable;
that is, we put t in and get s. If we wished, we could invert s = 16t^2 to
obtain
t = sqrt(s/16),
where these two equations are equivalent, except that they interchange the
roles of s and t as dependent and independent variables. In still other
words, s = 16t^2 would be the desired form if we were given t and told to
find s, while t = sqrt(s/16) would be the desired form if we were given s and
told to find t.
The next point to be made is that s = 16t^2 in a given physical situation
is not true forever. For example, it yields the result that when t = 3,
s = 16(3)^2 = 16(9) = 144, or that the object fell 144 feet during these
three seconds. However, suppose that we had never let go of the object
during these three seconds. Then the object would not have fallen at all.
As a second extreme, suppose that we released the object right away but
that it was, say, only 64 feet above the ground. In this case t = sqrt(s/16)
yields the fact that the object hit the ground at the end of the second
second, and that the object had still fallen only 64 feet after the third
second.
In other words, there is an earliest time t for which s = 16t^2 is true (the
instant we release the object) and a final time for which s = 16t^2 is true
(the time that the object hits the ground; that is, the time it is no longer
in free fall). For example, had the object been 64 feet above the ground,
perhaps s = 16t^2 should have been rewritten
s = 16t^2,  0 ≤ t ≤ 2
(where we are defining t = 0 to be the instant at which the object was
released).
In other words, the recipe given in s = 16t^2 is only true on the closed
interval [0,2] [that is, for t such that 0 ≤ t ≤ 2 (see Part III, Chapter 2
for a review of this notation)].
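The role of the interval [0,2] shows up clearly if the rule is written
as a routine. The sketch below takes the text's release height of 64 feet;
what the routine returns outside the interval (0 before release, 64 after
impact) is our reading of the two "extremes" described above, and the
names time_to_fall and distance_fallen are our own.

```python
import math

HEIGHT_FT = 64.0    # release height used in the example above


def time_to_fall(s):
    """Invert s = 16 t**2: the number of seconds needed to fall s feet."""
    return math.sqrt(s / 16)


def distance_fallen(t, height=HEIGHT_FT):
    """Feet fallen t seconds after release from `height` feet."""
    t_impact = time_to_fall(height)
    if t < 0:
        return 0.0              # not yet released: nothing has fallen
    if t <= t_impact:
        return 16 * t ** 2      # free fall: the recipe s = 16 t**2 applies
    return height               # on the ground: no further falling


print(time_to_fall(HEIGHT_FT))   # 2.0
print(distance_fallen(1))        # 16
print(distance_fallen(3))        # 64.0
```

Only the middle branch is Galileo's recipe; the other two branches mark
the ends of the domain on which that recipe describes the physical
situation.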
This is a key concept in many physical situations. That is, there is
nothing that excludes the domain from being any subset of the real
numbers, but in most real situations the domain will be either an interval
or a union of intervals. In terms of the more pictorial number line we are
saying that the domain will usually be a connected portion (a line segment)
of the number line, or perhaps a union of such segments.
With respect to the above remark, observe that we may view both the
domain and the image as subsets of real numbers (pictorially, as points on
the number line). This, in turn, might suggest the use of Cartesian
coordinates whereby we could use the x axis to denote the domain and the y
axis to denote the range.
This results in the concept of the graph of a function, and we shall turn
our attention to this idea in the next section.
4.6 A PICTURE IS WORTH A THOUSAND WORDS

Even on a subconscious level we frequently think of nongeometric ideas
in terms of geometric pictures. For example, consider the expression,
“Profits rose.” The only way profits can “rise” is if, for example, the
office safe blows up. Obviously, what we mean when we say that profits
rose is that profits increased. Why then did we interchange rise (geometric)
with increase (arithmetic)? The answer centers around the idea of a graph.
How is the concept of a graph related to the concept of a function? The
answer lies, at least for a first approximation, in the concept of ordered
pairs (which in turn suggests Cartesian products, and this in turn suggests
the Cartesian plane). Namely, the function-machine is determined once
we know the output for each given input. In short, we can abbreviate the
function by using ordered pairs where, for example, the first member of
the pair can name the input while the second member of the pair can name
the output. In terms of the Cartesian plane, this says that we use the
x axis to indicate the domain of the function, while we use the y axis to
indicate the range of the function. In still other words, we are saying that
we may use the point (x,f(x)) to represent the fact that for an input of x
the output is f(x).
Conceptually, the function and the graph are two entirely different
things. One is an analytic relation and the other is a picture of it. A
picture may be helpful where, for example, we know that for a certain real
number x, f(x) is positive (this is an arithmetic statement). But if f(x) is
positive, we know that the point (x,f(x)) lies above the x axis. In a similar
way, if f(x) is negative, the point (x,f(x)) is below the x axis; and if
f(x) = 0, the point (x,f(x)) is on the x axis. Thus, the idea of a graph
replaces the analytic terms “greater than 0,” “equal to 0,” and “less than
0” by “above the x axis,” “on the x axis,” and “below the x axis,”
respectively.
In a similar way, it replaces “increasing” by “rising,” “constant” by
“horizontal,” and “decreasing” by “falling” (that is, if f(x) does not vary,
the point (x,f(x)) always has the same height above the x axis. Hence,
all such points lie on a line parallel to the x axis, that is, a horizontal
line if we define the direction of the x axis as being horizontal).
Let us now turn to a specific situation. For example, let us discuss the
function f given by
f(x) = x^2.
We know at once that the input x yields the output x^2. This means that
in terms of a graph, the input x will give rise to the point in the plane
(x,x^2).
Some of the points on the graph would be (0,0), (1,1), (-1,1), (1/2,1/4).
In terms of a picture, we would have Figure 4.13.
If we now use our intuition we might conjecture² that the graph of
f(x) = x^2 is given by Figure 4.14.
² Actually we have only a conjecture, no matter how intuitive things may
seem. That is, as long as we locate points that have spaces between them, we
are only conjecturing as to what goes on in between.
Figure 4-13 (the points (-1,1), (0,0), and (1,1) plotted in the plane)
We now have a visual way for measuring “single-valuedness” and
“1-1ness.” Namely, if each line that is parallel to the y axis and that
passes through a point in the domain intersects the graph at only one point,
then the function is single-valued; and if each line that is parallel to the
x axis and that passes through a point in the image intersects the graph
in only one point then the function is 1-1. By way of illustration, Figure
4.15 indicates that if f(x) = x^2, then f is single-valued but not 1-1. In
fact, every positive number in the image is yielded by two members of the
domain.
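The test with lines parallel to the x axis amounts to asking which inputs
share an output, and that question can be put to a table of sample
values. The sketch below is limited to the sample points chosen (halves
from -3 through 3, an arbitrary choice); the helper preimages is our own
name, and exact fractions are used so that equal outputs compare as
equal.

```python
from collections import defaultdict
from fractions import Fraction


def preimages(f, inputs):
    """Map each output to the list of sampled inputs that produce it."""
    table = defaultdict(list)
    for x in inputs:
        table[f(x)].append(x)
    return dict(table)


def square(x):
    return x * x


samples = [Fraction(n, 2) for n in range(-6, 7)]   # -3, -5/2, ..., 5/2, 3
table = preimages(square, samples)

# Single-valued: each sampled input contributed exactly one output.
assert sum(len(xs) for xs in table.values()) == len(samples)

# 1-1 would require that no output be produced by two different inputs.
is_one_to_one = all(len(xs) == 1 for xs in table.values())
print(is_one_to_one)         # False
print(table[Fraction(4)])    # [Fraction(-2, 1), Fraction(2, 1)]
```

Each positive output in the table is listed with two inputs, x and -x,
while the output 0 is listed with the single input 0.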
Do not confuse the graph with the domain and the image of the function.
Recall that the domain is located as a subset of the x axis, while the image
is a subset of the y axis. For example, referring again to the function f
where f(x) = x^2, suppose that we take as the domain of f the closed interval
[2,3]. Then the image of f in this case would be the closed interval [4,9].
Figure 4-15 (graph of f(x) = x^2, with the image marked on the y axis).
Note accompanying the figure: Each value of x yields one and only one point
on the graph; but each nonzero value in the image comes from two numbers in
the domain.
In fact, Figure 4.15 also shows us that in this example $f$ is both
single-valued and 1-1. This result in no way depends on the graph, but the graph
gives us much important information at a glance. For example, the fact
that the graph is always rising tells us that the output increases as the
input increases. That is, $f(x)$ is increasing as $x$ increases. Notice that
not only does the graph rise but it also seems to be “accelerating”; that is,
it appears to be rising at a faster and faster rate. To illustrate this,
observe that the curves depicted in Figure 4.16 are both rising. However,
in the first case the curve seems to be rising faster and faster, while in the
second case it seems to rise more and more slowly. The first case
corresponds to acceleration, while the second case corresponds to deceleration.
Figure 4.16
Returning to $f(x) = x^2$, observe that we have a genuine acceleration.
For example, when $x$ changes from 1 to 2, $f(x)$ changes from 1 to 4; thus, in
this case an increase in $x$ of 1 unit produces an increase in $y$ of 3 units.
Yet, when $x$ changes from 2 to 3 (which is still only a 1 unit change in $x$),
$f(x)$ changes from 4 to 9, or a change of 5 units.
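The arithmetic above can be carried a few steps further. The short Python sketch below (the variable names are our own) lists the increase in $f(x) = x^2$ for each 1 unit increase in $x$, which is one informal way of seeing the acceleration.

```python
def f(x):
    return x * x


inputs = [1, 2, 3, 4, 5]
outputs = [f(x) for x in inputs]

# Increase in f(x) produced by each 1-unit increase in x.
increases = [later - earlier for earlier, later in zip(outputs, outputs[1:])]

print(outputs)    # [1, 4, 9, 16, 25]
print(increases)  # [3, 5, 7, 9]
```

Each equal step in the input produces a larger step in the output than the one before it, which is what we mean by saying the graph rises faster and faster.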
Graphs also supply us with a nice reason as to why we may deal with
single-valued functions without loss of generality. That is, the graph of a
multi-valued function has the property of doubling back. In other words,
a line parallel to the y axis intersects the graph in more than one point.
For the purpose of an illustration let us suppose that our graph is a smooth
curve. Intuitively, it should be clear that the places at which the graph
doubles back are those at which the curve possesses a vertical tangent line.
This idea is pictured in Figure 4.17.
The curves $C_1$ (from 0 to A), $C_2$ (from A to B), and $C_3$ (from B to D) are
single-valued, and the graph curve is $C_1 \cup C_2 \cup C_3$.
One x value yields 3 y values.
Figure 4.17
Notice that these points of tangency partition the curve into a union
of single-valued curves. As a specific example, we can now discuss the
convention that says $\sqrt{x}$ means $+\sqrt{x}$ unless otherwise specified. Let us
start with the equation $y^2 = x$, graphed in Figure 4.18.
Figure 4.18
For positive values of $x$ we see that the curve is double-valued, and that
the curve possesses a vertical tangent at the point $(0,0)$. We can thus
break the curve into two mutually exclusive subsets. These two pieces are
called branches. One branch lies above the x axis and the other, below it.
We shall denote the upper branch by $y = +\sqrt{x}$ and the lower one by
$y = -\sqrt{x}$. This is in accord with the usual idea that if $y^2 = x$, then
$y = \pm\sqrt{x}$. Once we know one branch in detail, we know that the other
is just the mirror image with respect to the x axis. See Figure 4.19.
$y^2 = x$ is then given by $C_1 \cup C_2$, where $C_1$ and $C_2$ are both single-valued.
Figure 4.19
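The two branches of $y^2 = x$ can also be tabulated numerically. In the Python sketch below, the names `upper_branch` and `lower_branch` are our own labels for $y = +\sqrt{x}$ and $y = -\sqrt{x}$; the inputs are perfect squares so that the check $y^2 = x$ is exact.

```python
import math


def upper_branch(x):
    """The branch above the x axis: y = +sqrt(x)."""
    return math.sqrt(x)


def lower_branch(x):
    """The branch below the x axis: y = -sqrt(x)."""
    return -math.sqrt(x)


for x in [0, 1, 4, 9]:
    up, low = upper_branch(x), lower_branch(x)
    # Both values satisfy y^2 = x, so one x yields two y's when x > 0.
    assert up * up == x and low * low == x
    # The lower branch is the mirror image of the upper one in the x axis.
    assert low == -up
    print(x, up, low)
```

Each branch taken alone assigns exactly one output to each input, and the two agree only at $x = 0$, where the curve has its vertical tangent.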
To summarize what we have said about “1-1ness” and “single-valuedness”:
(1) If a “continuous” graph doubles back, the function is not single-valued.
In this case we can partition the function into a union of single-valued
functions by noting the points at which the graph has vertical tangents.
(2) If the curve passes from rising to falling (and for a smooth curve this is
characterized by those points at which we have a horizontal tangent)
then the function is not 1-1.
(3) Therefore, if the graph is either always rising or always falling, the
function is both single-valued and 1-1.
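The third point of the summary can be explored on sampled outputs. The Python sketch below is only an illustration under our own naming (`is_always_rising`, `is_always_falling`, `square`); as with any sample of isolated points, it supports a conjecture about the graph and does not prove it.

```python
def is_always_rising(outputs):
    """True if each sampled output is larger than the one before it."""
    return all(a < b for a, b in zip(outputs, outputs[1:]))


def is_always_falling(outputs):
    """True if each sampled output is smaller than the one before it."""
    return all(a > b for a, b in zip(outputs, outputs[1:]))


def square(x):
    return x * x


# Sample in steps of 1/4, which floating-point numbers represent exactly.
on_2_to_3 = [square(2 + k / 4) for k in range(5)]          # domain [2, 3]
on_minus1_to_1 = [square(-1 + k / 4) for k in range(9)]    # domain [-1, 1]

print(is_always_rising(on_2_to_3))        # True
print(is_always_rising(on_minus1_to_1))   # False
print(is_always_falling(on_minus1_to_1))  # False
```

On the domain $[2,3]$ the sampled outputs only rise, in agreement with the earlier remark that $f$ is 1-1 there; on $[-1,1]$ they fall and then rise, so the same rule for $f$ is not 1-1 on that domain.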
While there is more that can be said about functions of a real variable,
the present discussion is sufficient for our needs. That is, our aim was to
lay the groundwork for this study in terms of the basic building blocks of
this course. It is of interest to note that while the concept of function can
be extended to include the subject known as calculus, the extension requires
primarily a good dosage of computational knowledge but relatively few new
concepts. In other words, no matter how sophisticated we might want to
become, the fact is that we are primarily concerned with the study of
how one variable (the dependent variable or the “output”) depends on
another (the independent variable or the “input”). In fact, in the more
serious real-life problems we often find that our functions involve measuring
a quantity in terms of several other variables; yet, again, while the
computational aspects become much more severe the basic concept remains
intact.
As a final note, we may have observed that this chapter was considerably
shorter than the others. The main reason for this is that the first three
chapters discuss the basic building blocks of mathematics; that is, the
numeral-versus-number concept, the fundamentals of sets, and the game
of mathematics. These are “grassroots” topics that require considerable
exploration. The last chapter, however, hopefully serves as an
introduction to how we “put it all together” when we apply our building blocks to
a particular mathematical topic.