CHRONICLE
OF HUMAN
ENDEAVOR
Herbert I. Gross
Massachusetts Institute of Technology
Frank L. Miller
Orange Coast Community College
The story is told that once when Calvin Coolidge returned home from
church, his wife asked him the title of the sermon. “Sin,” replied Mr.
Coolidge. “And what did the clergyman have to say about it?” asked
Mrs. Coolidge. “He was against it,” came the terse reply.
The above story, usually told as an anecdote to illustrate conciseness,
also serves to convey the message, in a certain sense, that most of the great
ideas have already been given. All that remains to be done is to style the
way in which they may be presented. In this vein, the present textbook
makes no claims of presenting new mathematics or of inventing new
theories. Rather, it is the hope of the authors to present mathematics with
sufficient clarity so that even the beginner can learn to understand the true
nature of mathematics, its role in the development of man and his society,
and its practical and esthetic aspects. No knowledge of mathematics
beyond that which is generally known by the junior high school student is
presupposed by the authors.
Of course, any such attempt to teach mathematics utilizes a highly
subjective approach on the part of the teacher; in this respect, the teaching
experience of the authors will greatly affect the style of the text.
The text was assembled by Herbert I. Gross while teaching a terminal
mathematics course at Corning Community College in Corning, New York,
and conducting numerous in-service seminars for teachers involved in
precollege mathematics at all levels of instruction. This material was revised
and specially edited by Frank L. Miller while teaching at Orange Coast
Community College, Costa Mesa, California.
The text takes into account the following four points: (1) The average
nonmathematician is blissfully unaware of the beauty of mathematics apart
from its obvious value as a computational tool. (2) Many people are adept
at performing the basic operations of arithmetic, but they seldom
understand how and why various “recipes” work. In short, people have
memorized certain mathematical principles but have not grown intellectually
in the process. (This in part explains why parents berate the new
mathematics. Previously, they could hand down the same recipes from
generation to generation. Now, since children are learning explanations
different from those memorized by their parents, the lack of genuine
understanding becomes more pronounced. In effect, the average parent who
cannot help his child with the “new” mathematics probably could not help
him with the “old” either; only he did not realize it!) (3) “Logic” is a
frightening word to many people. Some do not understand it at all, others
misinterpret it, and still others understand it in a mechanical way without
any significant feeling as to its great value. (4) Very often mathematics is
learned as a collection of “tricks” without any awareness of the basic
unifying threads that characterize the subject. Too often students tend
to learn many concepts each with one application rather than learning a
few concepts with many applications.
It turns out that these four points seem to permeate the core of the “new”
mathematics, and with so much lip service devoted to it, perhaps it would
be wise to say just a few words here about what we think the “new”
mathematics really is.
As simply as possible, let us observe that one usage of the word “new” is
as the opposite of “passé” (rather than of “old”). It is in this sense that
the “new” mathematics has its greatest moment. That is, we are not so
much interested in the “age” of a topic as we are interested in its
meaningfulness. The quest for meaningfulness is even more important
when we realize how rapidly the field of mathematics has expanded in the
past few years. Indeed, it has been said that mathematics is like the
enthusiastic young man who mounted his horse and rode off furiously in all
directions. With so much more material and no more time in which to
learn it, the curriculum must be streamlined. This involves emphasizing
major concepts. It was with this in mind that we decided to focus on
four general aspects of mathematics which we felt underlie virtually every
topic in mathematics. The text is comprised of these topics.
Chapter 1 is devoted to a development of our number system. Beginning
at the “Dawn of Consciousness” and the earliest tally systems,
we gradually explore the evolution of our present system of enumeration
and place value arithmetic. In this context, we introduce the idea of
different number bases so that we may better emphasize the concept of place
value and at the same time deemphasize the undue importance placed on
the concept of ten. (It is our contention that the development of
mathematics did not require the biological “coincidence” that we were born
with ten fingers.)
(1) Man seeks the simplest solution to any problem that plagues him, and
it is only when this solution becomes either too cumbersome or
outmoded that he seeks another solution. It is this assumption that
underlies our approach to the development of the number system.
Man started with nothing more advanced than a tally system and
gradually developed better systems of enumeration as he needed them.
Yet, it should not be too difficult to see that any subject in the
curriculum uses this property, no matter what the field of investigation.
(2) Man has the ability to think logically. He alone, in the animal
kingdom, can predict and plan the future based solely on his knowledge of
the present and the past. Thus, our desire to help teach logical thought
in this text is motivated by this property. Again, it must be
understood that this logic is the same whether we are in the world of the
physical scientist or the world of the social scientist. The job of
deducing inescapable conclusions from given information is part of every
field of study.
(3) Despite his drive for material gain and his tendency to place pragmatic
pursuits above idealistic values, man is still basically a humanitarian
and he is capable of appreciating art for the sake of art. In this
context, there are many facets of any subject that transcend practical
applications. One aim of the text is to show that there are aspects of
mathematics that have an esthetic value which is important apart
from any practical value. Indeed, as in any field, there are
mathematicians who study mathematics for the same reason that the
humanities major studies poetry: because it is beautiful.
Let us observe a few more important asides concerning the above example.
For one thing, it illustrates a great deal about man’s ability to think
abstractly; certainly there is nothing about the conglomeration of letters
c-a-t that even resembles “a member of the feline family that was
introduced as a house pet in Egypt around 5000 B.C.” Indeed, it would be a
naive form of circular reasoning to say that we call it a cat because it looks
just like one.
Secondly, notice the power of the symbol “cat.” Once we get used to
the alphabet and learn to read, there comes a time early in our intellectual
development where, in fact, we never even notice the letters c-a-t but
rather we actually visualize the animal itself when we look at “cat.” In
short, not only can man think abstractly, but he can do it in such a natural
way that he often does not realize he is doing it.
Finally, let us observe that while languages have come and gone and while
changes have been made even within the same language, the animal cat
is the same as it was 5000 years ago. The point is that it is usually
language, not basic concepts, which changes.
2 The Development of Our Number System
With the above remarks in mind, we now turn our attention to a very
specific language, that of mathematics; and we shall observe that this
language inherits the problems of all other languages, no more or no less.
For example, it is reasonable to assume that virtually from the so-called
“dawn of consciousness” man had the ability to grasp and to utilize the
concept of “how-many-ness.” While he may not have used symbols such
as 2, he was aware of the idea of two. He knew that he had two arms and
one head; or at least he knew that he had more arms than heads.
Now just as it is quite likely that the first word that named the animal
cat was a picture of a cat (sign language, or even hieroglyphics), it is also
likely that the first symbol that denoted two (arms) was a picture of two
arms. Perhaps, after a while, ancient man recognized that two arms and
two apples had the concept of “two-ness” in common and so he might use
two tally marks to denote two of anything. (As we shall indicate by
examples in the next section, it was not a trivial matter for man to discover
that the idea of two-ness was the same whether he was talking about apples
or people.)
The point we are leading up to is that the symbols used to denote
numbers are called numerals. Just as there exist many synonyms to name the
same concept, many numerals may be used to name the same number.
For example, the number two is named by such numerals as 2, 5 − 3,
2 × 1, and 11.
We shall not enter into a philosophical discussion here as to how one can
rigorously define what is meant by a number. Rather we shall assume
that each of us senses what a number is; and we shall use the convention
of spelling the word when we mean to refer to the number (such as when we
say the number two) and we shall call all other symbols for denoting
numbers (such as 2, 11, 1 + 1) numerals.
In still other words, when we write 3 + 2 = 5, we certainly do not mean
that the symbols 3 + 2 and 5 look alike. What we do mean is that 3 + 2
and 5 are two different numerals which name the same number, namely,
five. This is perhaps the major reason why in the “new” mathematics
3 + 2 = 5 is read as 3 + 2 is 5; that is, 3 + 2 is a synonym for 5. (The
student might have a tendency to think of 3 + 2 as being two numerals.
Do not confuse numerals with digits. In other words, 3 + 2 indicates that
we are forming the sum of the two digits 2 and 3; but, nevertheless, 3 + 2
is a numeral which names five.)
Perhaps the following riddle will supply some pleasant diversion and also
shed light on our discussion.
A man goes into a store and buys some objects which are equally priced. The
clerk tells him the following.
1.1 The Concept of Number versus Numeral 3
(1) Do not confuse elementary with simple. When Holmes says
“Elementary, my dear Watson,” the result is anything but simple or
self-evident to poor Watson.
(2) While we are talking about whole numbers, we are also talking about
the ground floor of our entire number system. Thus to understand
number-versus-numeral at the level of the whole numbers is to have
the foundation for extending the concept to all numbers.
(3) The concept of number-versus-numeral not only plays an important
and direct role in our number system; but in various disguises it plays an
important role in virtually every phase of mathematics. This we shall
see as the course begins to unfold. The important point is that the
same things which occur in elementary discussions occur on the more
complex levels.
(4) Just as we see the animal cat when we see the word “cat,” we also
tend to see numbers when we look at numerals. This is precisely
what happened in our house-numeral riddle. To overcome this rather
natural tendency, there will be times in which you will need great
willpower if you are to try to capture the “spirit” of the discussion. In
short, because the present discussion centers around a topic that we
are very used to, there is a danger that the student may feel we are
making mountains out of molehills. Of course, to avoid this we could
have chosen a more complex illustration but if difficulty arises, then
we have no way of knowing whether it was the concept or the
illustration which caused the trouble. In other words, since the whole num-
1 = 1 × 1
1 + 3 = 4 = 2 × 2
1 + 3 + 5 = 9 = 3 × 3
1 + 3 + 5 + 7 = 16 = 4 × 4.
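For the reader who likes to experiment, the pattern in these sums can be checked with a few lines of Python. This is only an illustrative sketch; the name `sum_of_first_odds` is a hypothetical one chosen for this aside and does not come from the text.

```python
def sum_of_first_odds(n):
    """Add the first n odd numbers: 1 + 3 + 5 + ... + (2n - 1)."""
    return sum(2 * k - 1 for k in range(1, n + 1))

for n in range(1, 5):
    total = sum_of_first_odds(n)
    # Each partial sum turns out to be the square n x n.
    print(n, total, total == n * n)
```

Running the loop further than four rows suggests, though of course it does not prove, that the pattern continues.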
Such a result was easy (but ingenious) for him to see in terms of a picture.
Namely, every square could be subdivided into L-shaped regions, with
1.2 The Development of Place Value 7
Figure 1.1
in its own right, it tends to obscure the very important point that man had
hit upon the abstract idea that a numeral did not have to look like the
number it represented.
Now to mimic our own decimal system as much as possible, let us ignore
the fact that the Roman invented symbols for five, fifty, and five hundred;
we shall use only those symbols that denote powers of ten. Thus, we have
X to denote ten, C to denote one hundred, and M to denote one thousand.
Let us also ignore the subtractive property that IX names nine while XI
names eleven. Historically, the original Roman numerals did not have
the subtractive property. Four was denoted exclusively by IIII, never
by IV. Both XI and IX meant eleven, and nine was denoted by VIIII
(although we are pretending that nine would have been written as IIIIIIIII
simply to conform with the decimal-place value system of today).
To continue, in this way the Roman could represent any number from
one to nine thousand nine hundred ninety-nine by using the symbols I,
X, C, and M; and no symbol would have to appear more than nine times.
(Obviously, if we could use a symbol as often as we liked, we would only
need the symbol I, used enough times to represent any number, which is
precisely the tally system.) Thus, the Roman would represent a number
such as twenty-three using but two different symbols and a total of five
characters; that is, XXIII. To be sure, this is not as compact as our
place-value system wherein all we need are the two characters, 23; but it is a
considerable improvement over the original tally system in which
twenty-three characters (tallies) would have had to have been used. In passing,
let us observe, in this regard, that whenever we evaluate any ancient system,
it is only fair that we compare it with what it replaced, not what it was
replaced by; that is, each innovation is brought about to improve the old,
and this is how it should be evaluated.
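The simplified Roman system described above (only I, X, C, and M, no subtractive rule) is easy to imitate on a computer. The following Python sketch is an illustration under that simplification only; the names `SYMBOLS`, `to_simplified_roman`, and `to_tally` are hypothetical and are not historical or part of the text.

```python
# Symbols for the powers of ten only, largest first
# (no V, L, or D, and no subtractive rule, as assumed in the discussion).
SYMBOLS = [(1000, "M"), (100, "C"), (10, "X"), (1, "I")]

def to_simplified_roman(n):
    """Write n (1 to 9999) with I, X, C, M, each used at most nine times."""
    if not 1 <= n <= 9999:
        raise ValueError("this sketch only covers 1 through 9999")
    numeral = ""
    for value, symbol in SYMBOLS:
        count, n = divmod(n, value)   # how many of this denomination are needed
        numeral += symbol * count
    return numeral

def to_tally(n):
    """The tally system: the single symbol I repeated n times."""
    return "I" * n

print(to_simplified_roman(23))                          # XXIII
print(len(to_simplified_roman(23)), len(to_tally(23)))  # 5 23
```

The last line makes the comparison in the text concrete: five characters against twenty-three tallies.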
Surprising as it may seem, the Roman-numeral system was adequate for
performing the four basic operations of arithmetic. For example, to
perform the problem which we indicate as 23 + 34, the Roman would merely
have to write XXIII XXXIIII. There would have been no need to
combine like terms since, for example, an X was an X no matter where it was
placed (in this respect the subtractive principle introduced later into
Roman numerals brought more inconvenience than it brought convenience).
To perform subtraction, the Roman actually took away what he was
supposed to. Thus, to perform the operation which we would write as
45 − 31, he would have written XXXXIIIII and then taken away three
X’s and one I. He would have denoted this operation quite concisely
merely by writing XXXXIIIII with three X’s and one I struck out. In no
event would he have had to invoke
the notation XXXXIIIII − XXXI = XIIII.
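Addition by pooling symbols and subtraction by taking symbols away can be sketched in Python as follows. This is a hedged illustration of the simplified system only (symbols I, X, C, M; no exchanging of denominations); the names `ORDER`, `roman_add`, and `roman_subtract` are hypothetical.

```python
from collections import Counter

ORDER = "MCXI"  # largest denomination first

def roman_add(a, b):
    """Add by pooling the symbols of both numerals (no exchanging)."""
    pooled = Counter(a) + Counter(b)
    return "".join(symbol * pooled[symbol] for symbol in ORDER)

def roman_subtract(a, b):
    """Subtract by striking out of a each symbol that appears in b.

    Assumes a holds at least as many of every symbol as b, as in 45 - 31.
    """
    remaining = Counter(a)
    remaining.subtract(Counter(b))
    if any(count < 0 for count in remaining.values()):
        raise ValueError("an exchange of denominations is needed first")
    return "".join(symbol * remaining[symbol] for symbol in ORDER)

print(roman_add("XXIII", "XXXIIII"))        # XXXXXIIIIIII  (23 + 34 = 57)
print(roman_subtract("XXXXIIIII", "XXXI"))  # XIIII         (45 - 31 = 14)
```

A problem such as 41 − 19 makes `roman_subtract` raise an error, which mirrors the point that an X must first be exchanged for ten I’s.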
In a problem such as 41 − 19, he saw that he could not have taken away
nine from one, so he used his numeral system to recognize that X and
The total accumulation of these symbols would denote the answer. Again,
observe that we do not have to line up like denominations; nor do we have
to exchange ten of one denomination for one of the next higher
denomination; although in the above problem, we could have exchanged ten of our
X’s for a C, and ten of our C’s for an M. Our answer would have been
what was left, namely, MXXXXXIIIIII. That is
Figure 1.2
Figure 1.3
Thousands  Hundreds  Tens  Ones
    2         3                   (2300)
    2                  3          (2030)
    2                        3    (2003)
              2        3          (230)
              2              3    (203)
                       2     3    (23)
Yet with the labels omitted both of the above entries would look like 213.
This is similar to what a youngster tends to do when he first learns to
add. For example, given 78 + 96, he writes it as
78
+96
He then writes
7 8
+9 6
16 14.
We tend to call this wrong, yet it is almost correct. Indeed it would have
been correct had he written, say, (16) (14) to indicate that he had 16 tens
and 14 ones. In fact, the use of the parentheses is an excellent forerunner
to the idea of carrying. That is, 14 ones is the same as 1 ten and 4 ones.
In other words, with the labels present each of the following is a different
but correct way of representing 78 + 96.
16 14
17 4
1 7 4
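The youngster’s two steps, adding column by column and only afterward carrying, can be separated in a short Python sketch. The function names `column_sums` and `carry` are hypothetical, chosen for this illustration of base-ten whole numbers.

```python
def column_sums(a, b):
    """Add two whole numbers digit by digit, without carrying.

    Returns the column totals from the highest denomination down to the
    ones, for example [16, 14] for 78 + 96 (16 tens and 14 ones).
    """
    digits_a = [int(d) for d in str(a)]
    digits_b = [int(d) for d in str(b)]
    width = max(len(digits_a), len(digits_b))
    digits_a = [0] * (width - len(digits_a)) + digits_a
    digits_b = [0] * (width - len(digits_b)) + digits_b
    return [x + y for x, y in zip(digits_a, digits_b)]

def carry(columns):
    """Exchange ten of any denomination for one of the next higher one."""
    result = []
    carried = 0
    for total in reversed(columns):       # start with the ones column
        carried, digit = divmod(total + carried, 10)
        result.append(digit)
    while carried:                        # leftovers open new leading columns
        carried, digit = divmod(carried, 10)
        result.append(digit)
    return list(reversed(result))

print(column_sums(78, 96))         # [16, 14]
print(carry(column_sums(78, 96)))  # [1, 7, 4]
```

Both printed lists name the same number; only the second obeys the rule that no column may hold more than nine.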
Notice how the abacus serves as a missing link between our modern
place-value system and the Roman numeral system. Namely, just as in
the place-value system, the abacus uses the idea of position value; but just
as in the Roman system, the abacus allows for limitless amounts of any
denomination.
In any event, place-value forces us to invent a numeral to denote none,
and this numeral turns out to be 0 (zero).1 In terms of
number-versus-numeral, we must not confuse none and zero. Referring to our
house-numeral riddle of the previous section, at 10 cents per digit, 20 costs as
much as 29. In still other words, 0 is a numeral that denotes none just the
same as 7 is a numeral which denotes seven. It is rather unfortunate that
except for zero and none, the place-value digit and the number it names
are pronounced the same way (for example, 7 is pronounced “seven”).
This further tends to obscure the difference between number and numeral.
Once these observations are made, we can perform arithmetic in the new
system in ways that are analogous to the old. We have already illustrated
how addition, subtraction, and multiplication can be done with Roman
numerals. These techniques translate in an almost natural way to
place-value. For example, in place-value we add on a zero to multiply by ten.
Observe that this is equivalent to exchanging a denomination for one of
the next higher since the annexation of a 0 serves to push each denomination
over by one place. For example, suppose we add on a zero to 23 and thus
form 230. To be sure, the 2 and 3 have the same face value (that is, 2
always denotes two while 3 always denotes three) but the denominations
named by 2 and 3 are different. In 23 we have 2 tens and 3 ones, while in
230 we have 2 hundreds and 3 tens. In our modern numeral system we
must distinguish between face value and place value.
1 Since the symbol zero was necessitated by a numeral system called place value
instead of number concept, we tend to “segregate” zero from the other whole numbers.
The natural numbers are the whole numbers excluding zero, and the whole numbers are
the natural numbers plus zero. The word “natural” is suggested by the idea of trying
to count none as being unnatural.
In this same context, observe how the traditional recipe for performing
33 × 32 is just a disguised form of the Roman numeral version which we
have already described. Most of us have memorized the recipe, or
algorithm.
    33
  × 32
  ----
    66
   99
  ----
  1056.
We may not know why we do it, but we do know that we get the right
answer. Yet, a little reflection about face value versus place value tells
us th a t while the face value of 3 in 32 is three, its place value makes it 3
tens, or thirty. Now thirty 33’s is not 99, but 990. In terms of number-
versus-numeral, it makes no difference whether we write the 0 or indicate
it in some other way (such as by indenting). In other words, our above
algorithm has a “missing” 0, as we could have written
    33
  × 32
  ----
    66   (two 33’s)
   990   (thirty 33’s)
  ----
  1056   (thirty-two 33’s).
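The role of the “missing” 0 can be made explicit by letting each digit of the multiplier keep its place value. The Python sketch below is illustrative only; `partial_products` is a hypothetical name, and the code assumes whole numbers written in base ten.

```python
def partial_products(a, b):
    """Multiply a by each digit of b, keeping that digit's place value."""
    rows = []
    for position, digit in enumerate(reversed(str(b))):
        place_value = int(digit) * 10 ** position   # the 3 in 32 is thirty
        rows.append((place_value, a * place_value))
    return rows

rows = partial_products(33, 32)
for multiplier, product in rows:
    print(multiplier, "x 33 =", product)     # 2 x 33 = 66, then 30 x 33 = 990
print(sum(product for _, product in rows))   # 1056
```

Indenting the second row in the handwritten recipe does exactly what multiplying by `10 ** position` does here.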
become meaningful before one can hope to become proficient with the
phases of the subject he has not been taught to memorize. It is this
concept that motivated us to spend this much time in developing, or
reconstructing, our present place-value system.
Even though the point has been made, it might be well to complete our
study of place-value with specific mention of division and thus round out
the discussion of the four basic operations of arithmetic.
Moreover, such a decision enables us to introduce the important idea of
inverse operations. Roughly speaking, “inverse” indicates the idea of a
change in emphasis. For example, the statements
and
may be viewed as being inverses of one another. They say the same thing
but with a change of emphasis. That is, our first statement seems to be
an answer to the question “Who was Mickey Mantle?” and, thus,
emphasizes Mickey Mantle. Our second statement appears to be an answer to
“Name a great baseball player”; hence, it seems to emphasize great
baseball players.
In terms of a mathematical example, let us consider the idea that
subtraction is the inverse of addition. When we write, for example, 3 + 2 = 5,
we are emphasizing 5; at least the 5 stands alone and we seem to be saying
that 5 is the sum of 2 and 3. Now 5 − 2 = 3 conveys exactly the same
meaning as 3 + 2 = 5; only we seem to be emphasizing the 3. In short, we
can automatically subtract as soon as we know how to add. More
precisely, when we see the expression 5 − 2, we may read this as “that number
which must be added to 2 to yield 5.” This is what store clerks do when
they make change. If you pay for a $2.34 purchase with a $5 bill, the
clerk gives you change by adding on to $2.34 the amount necessary to make
$5.00.
In a similar way, division is the inverse of multiplication. That is, we
can divide as soon as we know how to multiply. By way of illustration,
let us consider 2821 ÷ 13.
In terms of multiplication, this asks us to find the number which when
multiplied by 13 yields 2821. Another way of looking at this is that if
multiplication is rapid addition then division is rapid subtraction. Thus,
we wish a quick way of determining the number of bundles we can make
from 2821 tallies if we place 13 tallies in each bundle.
To this end, we observe that one 13 is 13; ten 13’s is 130; one hundred
13’s is 1300; and so on. Hence, if we subtract 2600 from 2821 we are
immediately performing in one step the equivalent of subtracting 13 two
hundred times. In summary,
   2821
  −2600   (two hundred 13’s)
    221
  − 130   (ten 13’s)
     91
  −  91   (seven 13’s).
notice that we have left out a few 0’s and with these 0’s supplied, we obtain
       217
 13 ) 2821
     −2600   (200)
       221
     − 130   (10)
        91
     −  91   (7)
which is just what we had done with rapid subtraction.
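Division as rapid subtraction can be written out step by step. The Python sketch below is one possible rendering, not a prescribed method; the name `divide_by_rapid_subtraction` is hypothetical, and the code assumes whole numbers with a positive divisor.

```python
def divide_by_rapid_subtraction(dividend, divisor):
    """Divide by subtracting the divisor in batches of hundreds, tens, ones, ...

    Returns (steps, quotient, remainder); each step is (batch, amount),
    meaning the divisor was subtracted `batch` times, `amount` in total.
    """
    if divisor <= 0 or dividend < 0:
        raise ValueError("expects a positive divisor and a non-negative dividend")
    # Find the largest power of ten whose batch of divisors still fits.
    unit = 1
    while divisor * unit * 10 <= dividend:
        unit *= 10
    steps = []
    quotient = 0
    remaining = dividend
    while unit >= 1:
        count = remaining // (divisor * unit)   # batches of this size that fit
        if count:
            batch = count * unit
            steps.append((batch, divisor * batch))
            remaining -= divisor * batch
            quotient += batch
        unit //= 10
    return steps, quotient, remaining

steps, quotient, remainder = divide_by_rapid_subtraction(2821, 13)
print(steps)                 # [(200, 2600), (10, 130), (7, 91)]
print(quotient, remainder)   # 217 0
```

The three steps printed are the three subtractions shown in the layout above, and the batches 200, 10, and 7 add up to the quotient 217.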
Again, while we may be dealing here at a very elementary level, the
principle of inverse operations permeates virtually every branch of
mathematics. For example, in algebra factoring is the inverse of multiplication;
logarithms are the inverse of exponentiation; and extracting roots is the
inverse of raising to powers. In trigonometry, as might be expected, the
inverse trigonometric functions are, indeed, the inverses of the regular
trigonometric functions. In calculus integration is the inverse of
differentiation. The important computational point is that once we have
studied an operation thoroughly, we immediately have a strong hold on
10^1 = 10
10^2 = 10 × 10 = 100
10^3 = 10 × 10 × 10 = 1000, and so on.
then, let us assume that we can write one digit per second. Consider the
number:
10^(10^10).
As we have written it, at the rate of a digit per second, it took us about
six seconds to express in scientific notation. Now let n = 10^10. Then
the above number may be viewed in place-value notation as a 1 followed
by 10^10 zeros. But 10^10 means 10,000,000,000, which represents ten billion.
In other words,
10^(10^10)
if converted into place-value would be a 1 followed by ten billion 0’s. At
the rate of one digit per second, it would take us more than 300 years to
write this number of 0’s. In other words, scientific notation allows us to
express in six seconds a number which would have taken more than 300
years to express in terms of the usual system of place-value. It should
be clear now what we mean when we say that scientific notation is to
place-value what place-value was to the more primitive tally systems.
Indeed, as man progresses, it would be vane (and also vain) for us to try
even to predict the innovations that he shall invent in his quest to
understand his history and his heritage.
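The figure of “more than 300 years” is easy to confirm. The short Python calculation below is a sketch that assumes a 365-day year and ignores leap years; the variable names are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # assumption: no leap years

zeros_to_write = 10 ** 10               # a 1 followed by ten billion 0's
years = zeros_to_write / SECONDS_PER_YEAR

print(SECONDS_PER_YEAR)   # 31536000
print(round(years))       # 317
```

At one digit per second, then, the zeros alone would occupy roughly 317 years of uninterrupted writing.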
Since exponents play such a vital role in many scientific contexts, we
shall continue our digression and talk more about exponents. In
particular, we wish to point out that if b and c are any numbers and n and m
are any natural numbers, it is rather easy to establish the following recipes:
b^m × b^n = b^(m+n)
(b^m)^n = b^(m×n)
(bc)^m = b^m × c^m.
Moreover, if n exceeds m, then
b^n ÷ b^m = b^(n−m).
Rather than try to prove these results, we shall merely look at a few
examples to see how these results come about.
2^4 × 2^3 = (2 × 2 × 2 × 2) × (2 × 2 × 2)
          = 2 × 2 × 2 × 2 × 2 × 2 × 2
          = 2^7
          = 2^(4+3).
(2^4)^3 = 2^4 × 2^4 × 2^4
        = (2 × 2 × 2 × 2) × (2 × 2 × 2 × 2) × (2 × 2 × 2 × 2)
        = 2^12
        = 2^(4×3).
(2 × 3)^4 = (2 × 3) × (2 × 3) × (2 × 3) × (2 × 3)
          = (2 × 2 × 2 × 2) × (3 × 3 × 3 × 3)
          = 2^4 × 3^4.
3^7 ÷ 3^2 means, in terms of the inverse of multiplication, the number which
when multiplied by 3^2 yields 3^7. We already know that 3^2 × 3^5 = 3^7.
Hence
3^7 ÷ 3^2 = 3^5 = 3^(7−2).
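The recipes for exponents can be spot-checked numerically. The Python sketch below tests them for a handful of small whole numbers; it is a check on examples, not a proof, and the name `check_exponent_recipes` is hypothetical.

```python
from itertools import product

def check_exponent_recipes(b, c, m, n):
    """Return True if the recipes hold for these particular numbers."""
    checks = [
        b ** m * b ** n == b ** (m + n),
        (b ** m) ** n == b ** (m * n),
        (b * c) ** m == b ** m * c ** m,
    ]
    if n > m:
        # Division read as the inverse of multiplication:
        # b^(n-m) is the number that, multiplied by b^m, yields b^n.
        checks.append(b ** (n - m) * b ** m == b ** n)
    return all(checks)

print(2 ** 4 * 2 ** 3 == 2 ** 7)    # True
print((2 ** 4) ** 3 == 2 ** 12)     # True
print(3 ** 7 // 3 ** 2 == 3 ** 5)   # True
print(all(check_exponent_recipes(b, c, m, n)
          for b, c, m, n in product(range(1, 6), repeat=4)))   # True
```

In Python, `**` denotes raising to a power, so `2 ** 4` is the numeral for sixteen.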
24,000,000,000,000 × 3,000,000,000.
24,000,000,000,000 = 24 × 10^12
3,000,000,000 = 3 × 10^9.
Thus,
24,000,000,000,000 × 3,000,000,000
  = 24 × 10^12 × 3 × 10^9
  = 24 × 3 × 10^12 × 10^9
  = (24 × 3) × (10^12 × 10^9)
  = 72 × 10^(12+9)
  = 72 × 10^21
  = 72,000,000,000,000,000,000,000.
b^n × b^0 = b^(n+0)
Exercises
1. List six different numerals which name the number five.
2. In what sense are 4 + 1 and 3 + 2 synonyms? In what sense are
they not the same?
3. In terms of numerals rather than numbers, give an example in which 3
is “bigger than” 4.
4. Explain the flaw in the argument “In order to determine which of two
bags contains more pieces of candy, all we need do is weigh the bags.”
5. Suppose a number leaves a remainder of three when divided by five
and a second number leaves a remainder of two when divided by five.
Then the sum of these two numbers is divisible by five. Use tally
marks to explain why this result is true.
6. Use Roman numerals to form the product of twenty-four and thirteen.
7. Suppose the Romans had invented a new symbol for each power of ten,
and that they agreed never to use more than nine of any given symbol
in expressing a number. Consider the number which we write as
1,456,007,013.
(a) How many symbols would the Romans have had to invent to
express this number?
6 × 10^23.
Number     Number of miles    Reading on odometer     Reading on odometer
of miles   expressed in       having four faces       having ten faces
           tally marks        per gear (1, 2, 3, 0)   per gear
reading into the correct number of miles. For example, the reading 012
denotes 111111 (six) miles on one odometer, but 111111111111 (twelve) on the
other. To avoid this misinterpretation, we will refer to them as the
base-four and base-ten odometers, respectively. Observe that in what we
call base-four we mean a place-value system in which the trade-in value
is always four. However, as far as number-versus-numeral is concerned,
closer scrutiny of the base-four odometer shows that what we would
ordinarily write as 4 appears there as 10. Notice, then, the difference
between ten and 10. Here we see an excellent example in which failure
to distinguish between number and numeral can really haunt us. The
point we are trying to make is that in place-value, the second column
(from right to left) names the trade-in value. In our base-four odometer,
the trade-in value is four but it is represented by 10 to indicate that in
this case we have one batch of our trade-in value and no ones. In fact,
it should not be difficult to see that we could have built an odometer using
any natural number as base, provided that it is at least equal to two.
(What we are saying here is that if the odometer is to be useful there must
be at least two faces on each dial since we need symbols for at least zero
and one.) The point is that no matter how many faces there are to one
of the odometer’s dials, the reading will be 10 when we travel this number
of miles. On a base-five odometer, 10 will appear when we have traveled
five miles; on a base-twelve odometer, 10 will represent twelve miles.
Observe that a base-n dial must have n digits. In other words, on a
base-twelve odometer, ten and eleven would be expressed by single digits while
twelve would be the first number denoted as a multiple digit, namely, 10.
In summary, our trade-in value in any base will be represented by the
numeral 10; but except for base-ten it will not be the number ten.
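Python’s built-in `int` accepts a numeral and a base, which makes it a convenient stand-in for reading an odometer. The helper name `miles_shown` below is hypothetical; the sketch merely illustrates that the numeral 10 names the base itself.

```python
def miles_shown(reading, base):
    """Number of miles named by an odometer reading in the given base."""
    return int(reading, base)   # built-in conversion of a place-value numeral

print(miles_shown("012", 4))    # 6
print(miles_shown("012", 10))   # 12

for base in (2, 4, 5, 10, 12):
    # The numeral 10 always names the trade-in value, that is, the base.
    print(base, miles_shown("10", base))
```

The same reading, 012, names six miles on the base-four odometer and twelve on the base-ten odometer, as the text observes.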
A major problem occurs once we introduce the possibilities of different
bases. For, when we assumed that ten was the only possible trade-in
value, it seemed that we were making too much of the fact that there was
a difference between ten and 10. To be sure, there is a difference between
numbers and numerals; but perhaps it looked like too big an issue was
being made over it, since everything seemed clear from context. However,
now, when we see the numeral 10, all we can conclude is that it names the
particular trade-in value. For this reason, we add some garnishing to
place-value numerals when the possibility of different bases exists. For
example, if we wished to indicate that the reading 10 occurred on a
base-five odometer, we would write (10)_five. We shall, in the interest of brevity,
often abbreviate (10)_five by (10)_5. We shall also use the convention that
when no subscript occurs, the usual base-ten is assumed. Thus, if we write
23, we mean either (23)_10 or (23)_ten.
Perhaps we can explain different number bases most clearly by studying
a particular base in detail. Since we have already introduced a base-four
odometer in some detail, it would be to our advantage to begin a
comprehensive study of base-four arithmetic. Recall that base-four arithmetic
has only four digits: 0, 1, 2, and 3. To this end, we see that our numeral
system appears to be given by 1, 2, 3, 10, 11, 12, 13, 20, 21, 22, 23, 30, 31,
32, 33, 100, . . . . As things stand, there is no trouble even if the numeral
system seems a bit unfamiliar to us. The trouble occurs when we start to
pronounce these numerals as one, two, three, ten, eleven, and so on. Here
we begin to pay the price for not having properly distinguished between
number and numeral, and reading or pronouncing ten and 10 alike.
Observe that no harm comes as long as it is the single digit numerals we are
reading. That is, 3 is always three in any base in which 3 exists. Thus,
we elect to solve this problem by pronouncing (12)_4 as “one-two-base-four.”
Certainly we are not denouncing the pronunciation “twelve-base-four.”
However, the sound of twelve might prejudice our thinking. The
important thing to observe, however, is that, when written, (12)_4 does not depend
on how it is pronounced! Since we do not want to become more involved
with jargon than with ideas, our agreement is that the student may pro
1.3 Different Number Bases 26
Let us pretend for the time being that we were born in the base-four
numeral system. As we have seen, our system of numerals, all other things
being equal, would look like this:
1, 2, 3, 10, 11, 12, 13, 20, 21, 22, 23, 30, 31, 32, 33, 100, . . . .
To the person who used this system, there would be nothing unnatural
about this procedure. As a final reminder, observe that the digits here
have the same value as always, but only the value representing the trade-in
is different.
In this system, 13 + 12 would denote
Hopefully, our last illustration shows that no matter what base has been
chosen, the tally system serves as a marvelous visual aid for interpreting
place-value. Finally, to emphasize the difference between number and
numeral, let us observe that in the base-four system 13 is a numeral which
denotes the number seven; 12 is a numeral that denotes the number six;
and 31 is a numeral which denotes the number thirteen. Thus, the
base-four fact 13 + 12 = 31 simply denotes the number fact that the sum of
seven and six is thirteen. This would be written as 7 + 6 = 13 in terms of
base-ten numerals. Of course, when more than one base is under con-
26 The Development of Our Number System
 + |  0   1   2   3          × |  0   1   2   3
---+----------------       ---+----------------
 0 |  0   1   2   3          0 |  0   0   0   0
 1 |  1   2   3  10          1 |  0   1   2   3
 2 |  2   3  10  11          2 |  0   2  10  12
 3 |  3  10  11  12          3 |  0   3  12  21
With these tables in mind, let us see how we could learn addition in
terms of place value. Suppose we were told to perform the operation
indicated by 12 + 21. We would write
   12
  +21
and we might then, quite mechanically, view this as two separate simpler
problems, namely
    1    and     2
   +2           +1
    3            3.
We could then superimpose these results to conclude that
   12
  +21
   33.
(As a review of our own base-ten system and how to convert into other
bases, observe that the above numeral fact merely states that the sum of
six and nine is fifteen. In particular,
(12)_4 = 6, (21)_4 = 9, and (33)_4 = 15.)
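The conversions in the last remark can be checked mechanically. The following Python sketch is offered only as an illustration; the helper names `from_base` and `to_base` are our own inventions, and only bases two through ten are handled.

```python
def from_base(numeral, base):
    """Return the number named by a numeral (a string of digits) in the given base."""
    value = 0
    for digit in numeral:
        d = int(digit)
        if d >= base:
            raise ValueError(f"the digit {d} does not exist in base {base}")
        value = value * base + d  # each new place is one more "trade-in"
    return value


def to_base(number, base):
    """Return the numeral that names a whole number in the given base (2 through 10)."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        number, remainder = divmod(number, base)
        digits.append(str(remainder))
    return "".join(reversed(digits))


six, nine = from_base("12", 4), from_base("21", 4)
print(six, nine, to_base(six + nine, 4))                     # 6 9 33
print(to_base(from_base("13", 4) + from_base("12", 4), 4))   # 31
```

The number facts (six plus nine is fifteen, seven plus six is thirteen) are the same in every base; only the numerals that name them change.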
The point is that while number facts do not depend on the base, it may
be easier to recognize a particular number fact in one base than in another.
For example, as we have just seen, divisibility by five is viewed
conveniently in terms of base-ten numerals, since five is a divisor of ten.
However, divisibility by either five or two was not too convenient in terms
of base-seven numerals, since neither five nor two is a divisor of seven.
In short, the test for divisibility in terms of looking only at the unit
digit hinges on whether the divisor is also a divisor of the base. The more
divisors the base possesses, the more unit-digit tests for divisibility it
offers; the fewer the divisors of the base, the fewer tests for divisibility in
terms of the unit digit. The extreme, in this respect, occurs when the base
is a prime number. A prime number is any natural number, excluding 1,
which is divisible only by 1 and by itself; for example, 2, 3, 5, 7, 11, and 13.
If a number is prime, then there are no nontrivial divisors of it. This was
the case in our study of base-seven numerals.
A number like twelve is a good base in this regard since it admits two,
three, four, and six as nontrivial divisors. This means that in base-twelve
numerals, the results can be expressed by the following statement:
A number is divisible by either two, three, four, or six if and only if its unit digit
is divisible by either two, three, four, or six, respectively (where, of course, the
digits refer to base-twelve numerals).
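The statement above is easy to test by machine. The sketch below is an illustration under our own naming (`passes_unit_digit_test` is a hypothetical helper, not something from the text); it compares the unit-digit test with true divisibility in base twelve and then shows how the same test misleads in base seven.

```python
def passes_unit_digit_test(number, divisor, base):
    """Look only at the unit digit of the numeral for `number` in `base`."""
    unit_digit = number % base
    return unit_digit % divisor == 0


# Base twelve: the test agrees with true divisibility for two, three, four, and six.
for divisor in (2, 3, 4, 6):
    assert all(
        passes_unit_digit_test(n, divisor, 12) == (n % divisor == 0)
        for n in range(1, 1000)
    )

# Base seven: ten is written 13, and the unit digit 3 is not divisible by five,
# yet ten itself is divisible by five.
print(passes_unit_digit_test(10, 5, 7), 10 % 5 == 0)   # False True
```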
     33    with     6
    ×32            ×5
   1056            30.

     33
    ×32
     66
     99
    165
[tally-mark diagram] or 3 × 3 = 9 or (3 × 3 = 21)_4
While this investigation may possess aesthetic values from the purely
theoretical point of view, the problem of confusing numbers and numerals
may be a very real problem to the average student. That is, many
concepts of arithmetic are relatively easy to grasp; but the techniques
which one uses to obtain results in terms of numerals are what usually cause
the greatest difficulty. Thus, among other things, a proper distinction
between numbers and numerals is perhaps the greatest aid in learning to
understand arithmetic from a computational point of view.
In concluding this section, let us say just a word about base-two (binary)
arithmetic. In terms of computers, observe that a switch is either off or
on. It is not partly on and partly off. Thus, an adequate coding system
would require two symbols, one to denote on and the other to denote off.
Base-two, with its two digits, is therefore very appropriate for this task.
This is the reason that base-two has such wide application in many pragmatic
endeavors.
Exercises
1. Indicate how the number twelve would be written in each of the
following bases.
(a) two (e) ten
(b) three (f) eleven
(c) four (g) twelve
(d) seven (h) fifteen
2. What number is named by (123)_n if n is equal to each of the following?
(a) four (e) eleven
(b) five (f) fifteen
(c) six (g) twenty
(d) nine
3. Explain what we mean when we say that there is no 6 in base-four
arithmetic, but there is the number six.
Our discussion of different number bases should serve to free us from the
confinement of thinking in terms of a particular numeral system. Now that
this has been established, we feel free to return to our familiar base-ten
system in discussing various number concepts. Thus, unless otherwise
specified, all the remaining computations will be in base-ten. However,
just as there were different systems of enumeration, there were often
different systems of doing arithmetic, even within the framework of a
particular numeral system.
It is our purpose here to investigate a cross section of outdated
computational devices and compare them with our more familiar methods in terms
of the role played by place-value. At the same time, we shall see that the
older methods often reflect the feelings of their times, and this, too, is
important as we pursue our study of mathematics as a mirror of human
endeavor. We shall in no way try to give an exhaustive study of
techniques. Rather, we shall single out a few well-chosen examples, leaving it
to the interested reader to pursue the topic further.
Let us begin by observing that while we write from left to right, our
recipes in arithmetic frequently proceed from right to left. For example,
when we use place-value to add, we start at the units and proceed to the
tens, hundreds, and so on, going from right to left. Obviously, we could
have agreed to write the units, tens, and hundreds from left to right. That
is, there is no reason why we could not have invented the numeral 123 to
represent the number three hundred twenty-one. (If this seems backward
it is only because of what we are used to calling forward.) Had we used
this system, such things as carrying in addition would then have proceeded
from left to right. For example,
    1    10   100  1000
    8     7     6
    5     8     7
   13    15    13
    3    16    13
    3     6    14
    3     6     4     1
illustrates how we would add six hundred seventy-eight and seven
hundred eighty-five to obtain one thousand four hundred sixty-three.
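The left-to-right carrying just illustrated can be written out as a short procedure. The following Python sketch is our own illustration (the function name `add_units_first` is an assumption); each number is given as a list of digits with the units written first, exactly as in the display above.

```python
def add_units_first(a_digits, b_digits, base=10):
    """Add two numerals whose digits are listed units first, e.g. 678 -> [8, 7, 6]."""
    result = []
    carry = 0
    for i in range(max(len(a_digits), len(b_digits))):
        a = a_digits[i] if i < len(a_digits) else 0
        b = b_digits[i] if i < len(b_digits) else 0
        # Trade in each full group of `base` for one unit of the next denomination.
        carry, digit = divmod(a + b + carry, base)
        result.append(digit)
    if carry:
        result.append(carry)
    return result


print(add_units_first([8, 7, 6], [5, 8, 7]))   # [3, 6, 4, 1]
```

Read units first, the result 3 6 4 1 names one thousand four hundred sixty-three, in agreement with the last row of the display.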
For a plausible explanation as to why we did not label increasing
denominations from left to right, recall that place-value was an Arabic
contribution and that the Semitic people write from right to left. Hence,
it is quite consistent that their arithmetic proceeded from right to left.
When the Europeans adopted this system, they decided not to tamper with
such a “good thing” and, consequently, they adopted it without a change,
complete with the right-to-left characteristics.
Perhaps man never cared whether he did arithmetic from left to right
or from right to left. Yet there seems to be ample evidence that he did
care. As a case in point, let us start our discussion of old-fashioned methods
of arithmetic with the scratch-out method of subtraction.
Consider the problem that we would write as
   4123
  −2468.
28 × 49.
 1 × 49 = 49
 2 × 49 = 98
 4 × 49 = 196
 8 × 49 = 392
16 × 49 = 784
32 × 49 = 1568.
exactly the same thing we did previously using the duplation plan. The
relationship between this method, base-two, and duplation is left as an
exercise for the reader. We should point out in passing the following.
135 × 12 = (100 + 30 + 5) × 12
         = 100(12) + 30(12) + 5(12)
         = 1200 + 360 + 60
and this is exactly what our first illustration becomes when we supply the
missing 0’s. That is,
   135
    12
  1200
 + 360
 +  60
  1620.

   135
    12
  1350
 + 270
  1620.
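Both ideas met above, the doubling table for 28 × 49 and the splitting of 135 by place value, can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names are hypothetical, and `duplation` follows the standard doubling scheme (keep the doubled rows whose multipliers add up to the desired multiplier), which we take to be the plan the text refers to.

```python
def duplation(multiplier, multiplicand):
    """Multiply by repeated doubling, adding the rows whose multipliers sum to `multiplier`."""
    rows = []
    power, doubled = 1, multiplicand
    while power <= multiplier:
        rows.append((power, doubled))            # e.g. (4, 196) for 4 x 49
        power, doubled = power * 2, doubled * 2
    total, remaining = 0, multiplier
    for power, doubled in reversed(rows):        # largest doubling first
        if power <= remaining:
            remaining -= power
            total += doubled
    return total


def partial_products(a, b):
    """Split `a` by place value (135 -> 100 + 30 + 5) and multiply each part by `b`."""
    products = []
    place = 1
    while a > 0:
        a, digit = divmod(a, 10)
        if digit:
            products.append(digit * place * b)
        place *= 10
    return list(reversed(products))


print(duplation(28, 49))                 # 1372, since 28 = 16 + 8 + 4
print(partial_products(135, 12))         # [1200, 360, 60]
print(sum(partial_products(135, 12)))    # 1620
```

Choosing the rows 4, 8, and 16 amounts to writing twenty-eight in base two, which is one way to approach the exercise left to the reader.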
[Figure 1.4: Gelosia grid for 9876 × 6789, with row products 88884, 79008,
69132, and 59256, and the result 67,048,164 read along the edge.]
While we are on the subject of the Gelosia method, notice the painstaking
detail that this method demands, even to the “scalloping” of the answer.
Obviously, such a tedious design shows that people had at least a good
amount of leisure time. In other words, one can often use such things as
arithmetic recipes as artifacts to determine the way of life of a particular
civilization. Thus, the history of mathematics can also supply the
archaeologist with another avenue of important data in reconstructing the status
of a given society.
Exercises
1. Use the scratching-out method to compute 2345 − 1846.
2. Suppose we had decided to have place-value proceed from top to bottom
rather than from right to left. Show how we would have computed
(a) the sum of eighty-seven and ninety-six,
(b) the product of eighty-seven and ninety-six.
1.4 The Rational Numbers 43
[Figure 1.6: number line marked 1, 2, 3, 4, 5, showing lengths of 1 s. u. and 2 s. u.]
[Figure 1.7: the lengths a − b and b]
[Figure 1.8: a length of 5 shown as 3 + 2]
at least in the sense that one can just as logically divide a 5-inch line into
three equal parts as he can divide a 2-inch line into three equal parts.
Before proceeding further, we might explain why the names “numerator”
and “denominator” were invented to name the top and bottom numbers
of the common fraction. First of all, if “numerator” and “top” are
synonyms, then we need not use a fancy word to replace a straightforward
word. Obviously, “numerator” and “top” are not synonyms. In fact,
when we write m/n there is no top or bottom; rather m is to the left of n!
Ancient man recognized that the size of the standard unit was rather
arbitrary. He then realized that he could divide the standard unit of
length into any number of equal parts and the length of any one of the
resulting pieces could then be his new standard length. In this way, for
example, 1/3 would be the standard length that we would obtain by taking
our standard unit, dividing it into three equal parts, and then taking one
of these three equal parts. In general, 1/n was used to denote the length
of each resulting piece if a standard unit were divided into n pieces of equal
length.
The next observation was that given any denomination, 1/n, we could
generate arbitrarily long line segments by marking off the length 1/n
consecutively as often as we wished. For example, we might elect to mark
off 1/n, m times. Clearly, the name for such a length would be m × 1/n.
On the other hand, it should not be hard to see that the length named by
m × 1/n was precisely the same as the length named by m/n. In terms
of our usual usage of numerals, we are saying that m/n = m × 1/n. In
this form, it is easy to see that n names the denomination, while m enumerates
the number of times we mark off this denomination. Remember that
the greater the denominator, the smaller the denomination.
Given, say, 5/3, we now have two ways of viewing this as a length.
(1) It may be viewed as the length of each piece when a length of 5 s. u.
is divided into 3 equal parts, or (2) it may be viewed as the length
obtained when the denomination 1/3 is marked off 5 consecutive times. We
will use the interpretation that best accomplishes our purpose.
In the same way that tally marks gave us an excellent picture for
visualizing why the rules for the arithmetic of whole numbers were as they
were, the number line gives us equal insight into the arithmetic of rational
numbers. For example, let us consider the rule which says that we may
multiply numerator and denominator of a common fraction by any nonzero
number without changing the fraction. First of all, let us “clean up” the
language of the last statement. Obviously, we change the fraction.
Certainly ^ does not look like f any more than 4 + 1 looks like 5. What we
mean is that £ and f are different common fractions that name the same
rational number. In terms of the number line we are saying that 2 pieces
whose size is | are necessary to form the same length as 1 piece whose size
1/2 + 1/3 = (1 + 1)/(2 + 3) = 2/5?
We will answer this with the following example. If you are paid by
the hour and you work 1/2 hour one day and 1/3 hour the next day, then you
have worked 30 + 20 minutes in all. In terms of this real-life experience,
you have worked 50 minutes, which is 5/6 hour. If we had added
numerators and denominators, you would have worked 2/5 hour, or 24 minutes.
On the other hand, the baseball player who goes 1 for 2 in the first game
and then 1 for 3 in the second game has gone 2 for 5, not 5 for 6, in the two
games. In short, in any man-made system we invent rules and models
to conform with what we believe is true.
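The two rules contrasted in this example can be placed side by side in a short Python sketch. It is an illustration only: `Fraction` comes from Python's standard `fractions` module, while `add_tops_and_bottoms` is a name of our own for the ball-player's rule.

```python
from fractions import Fraction

half_hour = Fraction(1, 2)
third_hour = Fraction(1, 3)

# Addition of rational numbers: the hours actually worked.
worked = half_hour + third_hour
print(worked, worked * 60)           # 5/6 50


def add_tops_and_bottoms(a, b):
    """The batting-average rule: add numerators and add denominators."""
    return Fraction(a.numerator + b.numerator, a.denominator + b.denominator)


# The same symbols combined by the other rule give a different answer.
tally = add_tops_and_bottoms(half_hour, third_hour)
print(tally, tally * 60)             # 2/5 24
```

Both rules are perfectly sensible; they simply model different situations, which is why the symbols alone do not tell us that we are dealing with common fractions.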
This also explains why it is not enough to say that m/n is a common
fraction. In mathematics, structure hinges on how the symbols are
combined, not on how they look. In other words, if 1/2 and 1/3 are to be common
fractions, it is mandatory that 1/2 + 1/3 = 5/6; otherwise the “realism” of the
rational numbers escapes us. In still other words, in any interpretation in
which it would be correct to say that 1/2 + 1/3 = 2/5, the symbols 1/2, 1/3,
and 2/5 are not being used as common fractions even though they may look
like common fractions.
If we now invoke the idea that subtraction is the inverse of addition, we
need spend no additional time discussing how we subtract rational
numbers. All we have to do, given a problem such as y − i = ?, is to view
it as i + ? = f. That is, ^ + ? = ^y; whereupon our previous knowledge
of addition allows us to conclude that ? = yy-
Let us now turn our attention to the question of multiplying rational
numbers. Clearly, if at least one of our factors is a whole number, there
is no trouble in still keeping the “rapid addition” idea. For example, if
we wish to compute 5 × f, we may view this in terms of lengths as marking
off \ five times, thus yielding a total length of The difficulty lies in a
product such as \ × y.
Here we encounter for the first time the fact that we cannot view
multiplication as “rapid” addition; for, quite clearly, when a problem says to
perform an operation n times, n must be a whole number. That is, we
may mark off a length once, twice, or three or more times, but we
cannot mark it off one-half times.
Some of us may recall that in an expression such as \ × y, we no longer
read the “×” as “times,” but rather as “of.” We say \ of and this has
[Figure 1.9]
[Figure 1.10: a unit square, 1 s. u. on a side, containing the rectangle ABCD
with sides 1/2 and 1/3. The area of ABCD is on one hand 1/2 × 1/3. On the
other hand it is 1/6 of the area of the entire unit square; that is, ABCD is
one of six equal rectangles, each of which has the dimension 1/2 × 1/3, into
which the unit square is divided.]
As a final observation let us again point out that while the number line
affords us a nice way to visualize the rational numbers, the rules for
computation apply to the real world without any regard to the number line.
For example, in terms of distance being speed multiplied by time, notice
that if a particle were to travel at a rate of I miles per hour for § hour,
it would travel £ miles.
We can now understand the last of the four basic operations of
arithmetic, division, in terms of its being the inverse of multiplication. For
example, given | ÷ f = ?, we may interpret this as f × ? = f; or ^ of ? = f.
In other words, we may view ? as a certain length, divide it into seven equal
parts, and then observe that five of these seven parts add up to f. If five
parts add up to f then one part is i of §, or t s (we are assuming that we
already know how to multiply). All seven parts make up ?, hence ? is
7 × t s, or ? = y-f- Figure 1.11 sums this up pictorially.
Figure 1.11
are all forms of “invert and multiply,” all of which yield different answers.
How shall we determine which is correct?
(1) He had a very simple way of viewing addition. Namely, given two
numbers a and b he could mark them off consecutively, viewing them
as lengths, along the same line. The total length could then be labeled
a + b (Figure 1.12(a)). Notice that we can easily view 3 + 2 = 5 in
this way (Figure 1.12(b)). Namely, the total length, when a 2-inch
length is marked off on a straight line next to a 3-inch length, is a 5-inch
length.
[Figure 1.12: (a) lengths a and b marked off consecutively, total a + b; (b) the case 3 + 2]
[Figure 1.13: the lengths a − b and b, with total b + (a − b)]
Figure 1.14
the properties of similar triangles. Two figures are similar if they have
exactly the same shape. More mathematically, the lengths of the
sides of one are proportional to the lengths of the sides of the other.
Namely, the above diagram yields the fact that
a : 1 = DE : b
or
a = DE/b
or
a × b = DE.
We illustrate this result in Figure 1.15 with the example 2 × 3 = 6.
Figure 1.15
(4) We have already shown a method for visualizing a ÷ b if a and b are
natural numbers. However, Figure 1.16 shows a more general
technique that works for any lengths. Namely, we again choose two lines
Figure 1.16
[Figure 1.17: scale diagram]
(5) While it is not one of the four basic operations of arithmetic, the process
known as extracting square roots played a vital role in early Greek
mathematics. Recall that the square root of a number is a number
which when multiplied by itself yields the given number. For example,
the square root of nine is three since the product of three with itself
[Figure 1.18]
But the ancient Greek’s use of geometry was not restricted here to its
role in ordinary arithmetic. He also used it to show such algebraic results
as the distributive property,
a × (b + c) = (a × b) + (a × c)
and others such as
(a + b) × (c + d) = (a × c) + (a × d) + (b × c) + (b × d)
(a + b)^2 = a^2 + 2ab + b^2.
These results followed from the realization that the product of two lengths
could be viewed as the area of a rectangle. In other words, 3 × 4 may be
viewed as the area of a rectangle whose dimensions are 3 units by 4 units.
In this respect, treating numbers as lengths, the three equations above may
be interpreted as shown in Figure 1.19.
[Figure 1.19: rectangles whose areas illustrate a × (b + c), (a + b) × (c + d), and (a + b)^2]
[Figure 1.20]
A B
On this new line, starting at A, we mark off any chosen length five times,
labeling the last point of division C, for want of anything better. Virtually
by default (that is, we constructed it that way), the segment AC is divided
into 5 parts of equal length. Unfortunately, however, it is AB, not AC,
which we wished to divide into 5 pieces of equal length. To this end, we
now draw the straight line which joins B and C.
We then draw lines through the other points of division on AC parallel
to BC. We label the points at which these lines intersect AB by D, E,
F, and G, respectively.
Then the lengths of the segments AD, DE, EF, FG, and GB are all equal.
That is, we have succeeded in dividing the segment AB into 5 pieces of
equal length. Note the properties of similar triangles that are used in
this example.
The ancient Greek was quite captivated by the topic of divisibility and
proceeded to subdivide the natural numbers into various categories in
terms of their divisibility properties. We shall have more to say about
this in a later chapter, but for now we shall restrict this topic to those
points that will be of computational value to us in our present investigation
of the rational numbers. In particular, we shall discuss here the concepts
of prime and composite numbers.
To begin with, the Greek recognized that 1 denoted a rather special
natural number which he called a unit. For any other natural number,
he noticed two possibilities. Namely, since n = n × 1, he knew that every
natural number greater than 1 had at least two natural numbers as
divisors, itself and 1; he knew also that some natural numbers had only these
two natural numbers as divisors. For example,
Units: 1
Primes: 2, 3, 5, 7, 11, 13, . . .
Composites: Everything else; that is, 4, 6, 8, 9, 10, 12, . . . .
wish to find all the primes which are less than, say, fifty. We would then
write the first fifty natural numbers:
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50.
(2) 3 5 7 9 11 13 15
17 19 21 23 25 27 29 31 33
35 37 39 41 43 45 47 49.
Since 3 is now the next uncircled member of the list, we may circle it as
being prime; for if it were not prime, it would have been divisible by two,
and already struck from the list. We may now delete all other multiples
of three from the remaining list since these cannot be primes, by virtue of
being divisible by three. The list now looks like this:
(2) (3) 5 7 11 13 17 19 23
25 29 31 35 37 41 43 47 49.
We could scratch out the members rather than delete them. This would
emphasize the fact that we did not need our numeral system to make this
sieve; and this is important since the Greeks did not have our numeral
system. Without deleting, all we would have to do after circling 2 would
be to scratch out every second member from that point on. Then once we
circled 3 we would scratch out every third member which had not yet been
scratched. Continuing in this way, either by deleting or scratching out,
we eventually obtain
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47
as the only prime numbers which do not exceed 50. Summarized in a form
that emphasizes our “sieve” we have:
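   1   (2)   (3)    4   (5)    6   (7)    8     9    10
 (11)   12  (13)   14    15   16  (17)   18   (19)   20
   21   22  (23)   24    25   26    27   28   (29)   30
 (31)   32    33   34    35   36  (37)   38     39   40
 (41)   42  (43)   44    45   46  (47)   48     49   50
(The numbers in parentheses are circled as primes; all others are scratched out.)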
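The circling and scratching-out procedure described above translates almost word for word into a program. The following Python sketch is our own illustration (the name `sieve` and the list `scratched` are assumptions), not a recipe taken from the text.

```python
def sieve(limit):
    """Return the primes not exceeding `limit` by circling and scratching out."""
    scratched = [False] * (limit + 1)
    circled = []
    for n in range(2, limit + 1):
        if not scratched[n]:
            circled.append(n)                        # next unscratched number is prime
            for multiple in range(2 * n, limit + 1, n):
                scratched[multiple] = True           # scratch out every n-th member
    return circled


print(sieve(50))
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
```

Notice that the program never divides; like the Greek method, it only counts off every second, every third, every fifth member, and so on.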
While the sieve method works, it is not the type of recipe we ultimately
want because it involves testing all smaller numbers when we wish to test
a particular number. To put it in even more elementary terms, an obvious
test to see if a number is prime is to divide it by 2, 3, 4, 5, and so on, to
see if the quotient is ever a natural number. But this is not a “neat” recipe.
In more professional language, we can say that we want recipes that are
closed in form rather than open.
To illustrate this from a different view, consider the problem of finding
the sum of the first n natural numbers. To this end, we could write
1 + 2 + 3 + . . . + n
and compute the sum; but observe how awkward this would be if we wished
to find the sum of the first thousand natural numbers this way. We
would have to compute
1 + 2 + 3 + . . . + 1000
and this is, at best, tedious. Such a procedure is called an open form since,
in effect, we must find the answers to the previous cases before solving the
one we wish. Now, without going into the details, it can be shown that
1 + 2 + 3 + . . . + n = n(n + 1)/2.
That is, the sum of the first n natural numbers can be found simply by
multiplying the last number in the sum by its successor and dividing by
two. For example, to compute 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10,
we need only form the number represented by (10 × 11)/2 = 110/2 = 55.
A direct check verifies this result. While this is not a proof, observe that
one can visualize the above result by observing that we can form five pairs,
the sum of each of which is eleven. That is,
(1 + 10) + (2 + 9) + (3 + 8) + (4 + 7) + (5 + 6) = 5 × 11 = 55.
The formula is what we call closed form, for we can now find the sum
of the first thousand natural numbers directly without knowledge of
previous sums merely by computing (1000 × 1001)/2 = 1,001,000/2 =
500,500 (that is, we have 500 pairs, each with sum 1001). This is the type
of recipe we would like to find for determining primes.
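The difference between the open and the closed form shows up clearly when each is written as a procedure. In the Python sketch below, which is an illustration with function names of our own choosing, the open form passes through every earlier sum, while the closed form performs one multiplication and one division.

```python
def sum_open_form(n):
    """Add 1 + 2 + ... + n one term at a time, passing through every earlier sum."""
    total = 0
    for k in range(1, n + 1):
        total += k
    return total


def sum_closed_form(n):
    """Multiply the last number by its successor and divide by two."""
    return n * (n + 1) // 2


print(sum_open_form(10), sum_closed_form(10))        # 55 55
print(sum_open_form(1000), sum_closed_form(1000))    # 500500 500500
```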
Even without knowing the ultimate formula, there are still certain
shortcuts that we may use in testing for whether a number is prime. As an
example of this, let us observe the following facts.
(1) Any composite number has prime divisors. (We leave this to the
reader’s intuition. Roughly we demonstrate by breaking down the
divisors into their indivisible parts. For example, 24 = 8 × 3 and
8 = 2 × 2 × 2. Thus, 24 = 2^3 × 3, and both 2 and 3 are primes.)
(2) If the natural number n is the product of the two natural numbers r
and s, and if r exceeds √n, then s must be less than √n. For example,
if 36 = r × s and we know that r and s are unequal, then it is impossible
for both r and s to exceed 6 or for both to be less than 6.
[Figure 1.21: a rectangle with sides r and s compared with a 6 × 6 square]
Combining (1) and (2), we see that to test a given natural number for
being a prime, we need divide by only those primes which do not exceed
the square root of the number. If none of these numbers divide the given
one, it is a prime. For example, consider 149. Since 12 × 12 = 144 and
13 × 13 = 169, we need only test primes that do not exceed 12; that is, 2,
3, 5, 7, and 11. Since none of these primes is a divisor of 149, 149 is itself
prime. In other words, if 13 were a factor, a prime less than 13 would also
have to be a factor; but we have just seen that this can’t happen since 2, 3,
5, 7, and 11 are the only primes less than 13.
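The shortcut just described can be sketched as follows. The Python code is an illustration only; the names `trial_primes` and `is_prime` are our own, and `isqrt` (from Python's standard `math` module) gives the largest whole number whose square does not exceed its argument.

```python
from math import isqrt


def trial_primes(n):
    """Return the primes that do not exceed the square root of n."""
    return [p for p in range(2, isqrt(n) + 1) if is_prime(p)]


def is_prime(n):
    """Test n by dividing only by the primes up to its square root."""
    if n < 2:
        return False
    return all(n % p != 0 for p in trial_primes(n))


print(trial_primes(149))            # [2, 3, 5, 7, 11]
print(is_prime(149), is_prime(49))  # True False
```

The case of 49 shows why the square root itself must be included among the trial divisors: 7 × 7 = 49, and 7 is the only prime that divides it.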
There is a difference between understanding concepts and being able to
master techniques for measuring the concepts. Certainly, the concept of
a prime number does not hinge on how well we master recipes such as the
The above result is known as the unique factorization theorem and
illustrates the role of prime numbers as the “atoms” of our number system.
For example, once 720 is written as 2^4 × 3^2 × 5, we have a wealth of
information about the divisors of 720. In particular, we know that any
number having prime divisors other than 2, 3, or 5 cannot be a divisor of
720. Moreover, of the remaining numbers, any one that contains either
more than four factors of 2, more than two factors of 3, or more than one
factor of 5 cannot be a divisor of 720. Expressed symbolically, a divisor
of 720 must be of the form
2^p × 3^q × 5^r
2^0 × 3^0 × 5^0 = 1; 2^0 × 3^0 × 5^1 = 5; 2^0 × 3^1 × 5^0 = 3;
2^0 × 3^1 × 5^1 = 15; 2^0 × 3^2 × 5^0 = 9; 2^0 × 3^2 × 5^1 = 45.
We then get 6 more divisors each by replacing 2^0 in the above six numerals
by 2^1, 2^2, 2^3, and 2^4. Leaving the details to the reader, we see that these
other divisors are
 2   10    6   30   18   90
 4   20   12   60   36  180
 8   40   24  120   72  360
16   80   48  240  144  720.
1 3 5 9 15 45
2 6 10 18 30 90
4 12 20 36 60 180
8 24 40 72 120 360
16 48 80 144 240 720.
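The table of divisors can be generated directly from the prime factorization of 720. The following Python sketch is an illustration under our own assumptions: the function name is hypothetical, and the factorization is supplied as a dictionary from each prime to its exponent.

```python
from itertools import product


def divisors_from_factorization(factorization):
    """List every divisor of a number given as {prime: exponent}."""
    primes = list(factorization)
    # For each prime, the allowed exponents run from 0 up to its exponent.
    exponent_ranges = [range(e + 1) for e in factorization.values()]
    divisors = []
    for choice in product(*exponent_ranges):
        divisor = 1
        for prime, exponent in zip(primes, choice):
            divisor *= prime ** exponent
        divisors.append(divisor)
    return sorted(divisors)


divisors = divisors_from_factorization({2: 4, 3: 2, 5: 1})
print(len(divisors))     # 30
print(divisors[:6])      # [1, 2, 3, 4, 5, 6]
print(divisors[-1])      # 720
```

The count of thirty comes from 5 choices for the exponent of 2, 3 choices for the exponent of 3, and 2 choices for the exponent of 5.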
have to look at the three lists and extract the largest number appearing
on all three lists. However, using unique factorization, there is an even
better way. We write each of the numbers as a product of powers of
distinct primes. Thus,
 420 = 2^2 × 3 × 5 × 7
 504 = 2^3 × 3^2 × 7
2100 = 2^2 × 3 × 5^2 × 7.
2^2 × 3^1 × 5^0 × 7^1 = 4 × 3 × 7 = 84.
2^3 × 3^2 × 5^2 × 7^1 = 8 × 9 × 25 × 7 = 12,600.
1/420 = 30/12,600
1/504 = 25/12,600
1/2100 = 6/12,600.
1.5 Decimal Fractions: Place Value Revisited 67
Hence
1/420 + 1/504 + 1/2100 = 30/12,600 + 25/12,600 + 6/12,600
                       = 61/12,600.
Further practice is left for the exercises.
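As a check on the computation above, the greatest common divisor, the least common multiple, and the sum of the three fractions can be found with Python's standard library. This sketch is an illustration only; `gcd`, `reduce`, and `Fraction` are standard, while the helper `lcm` is written out by us.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two natural numbers."""
    return a * b // gcd(a, b)


numbers = [420, 504, 2100]
print(reduce(gcd, numbers))                    # 84
print(reduce(lcm, numbers))                    # 12600
print(sum(Fraction(1, n) for n in numbers))    # 61/12600
```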
Exercises
1. Using 1 to denote a length of one inch, use the type of techniques
described in this section to construct the lengths named by the
following.
(a) 4 + 3   (e) f − i   (i) | × i   (m) √i
(b) f + \   (f) f − I   (j) 4 ÷ 3   (n) √7
(c) f + \   (g) 4 × 3   (k) f ÷ \   (o) √f
(d) 6 − 3   (h) f × *   (l) f * \   (p) √2
2. Explain the conceptual difference between the expressions f and
3 × i in terms of lengths; and then explain why we say that f =
3 ×
3. In terms of m/n = m × 1/n, explain why it is natural to call m the
numerator and n the denominator of the common fraction m/n.
4. Show by an appropriate use of rectangles and their areas why
2 × (3 + 4 + 5) = (2 × 3) + (2 × 4) + (2 × 5).
5. Using a = 2 and b = 3, verify in this case that the recipes (a + b)^2 =
a^2 + 2ab + b^2 and (a + b)^3 = a^3 + 3a^2 b + 3ab^2 + b^3 are obeyed.
6. Explain the relationship between the length % and the point
7. Write each of the following common fractions as an equivalent mixed
number.
(a) i (b) V- (c) V (d) W (e) m
8. State a division problem involving only natural numbers, to which each
of the common fractions listed in Exercise 7 denotes a correct answer.
9. Represent each of the following mixed numbers as an equivalent
common fraction.
(a) 2* (b) 7 i (c) I l f (d) 321*
10. In terms of lengths, explain why it is a misnomer to refer to f as
improper.
11. Add the following; ttVo-
With this in mind, if we wish to invent a unit smaller than the standard
unit, we need only continue our place-value system from left to right, each
time multiplying by one-tenth. That is,
      ×1/10   ×1/10  ×1/10  ×1/10   ×1/10
1000   →   100   →   10   →   1   →   1/10   →   1/100.
    3142          3.142
   × 213         × 2.13
    9426           9426
   3142           3142
  6284           6284
  669246        6.69246
We are usually mechanically taught to count off the decimal places, in this
case five, to obtain the correct answer. Yet such a result need not be
memorized. For, in the usual fraction notation,
3.142 = 3 + 1/10 + 4/100 + 2/1000
      = 3000/1000 + 100/1000 + 40/1000 + 2/1000
      = 3142/1000.
Similarly,
2.13 = 2 + 1/10 + 3/100
     = 200/100 + 10/100 + 3/100
     = 213/100.
Therefore,
3.142 × 2.13 = 3142/1000 × 213/100 = (3142 × 213)/(1000 × 100).
The numerator accounts for the sequence of digits, and the denominator
accounts for moving the decimal point. That is,
1000 × 100 = 10 × 10 × 10 × 10 × 10.
In summary,
3.142 × 2.13 = (3142 × 213) × 1/10 × 1/10 × 1/10 × 1/10 × 1/10.
Each time we multiply by 1/10 (divide by 10), we move the decimal point
one column to the left.
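The rule for counting off decimal places can be checked against exact fraction arithmetic. In the Python sketch below, offered as an illustration, each decimal is described by its whole-number digits and its number of decimal places; the function name `multiply_decimals` is our own.

```python
from fractions import Fraction


def multiply_decimals(a_digits, a_places, b_digits, b_places):
    """Multiply two decimals given as (whole-number digits, number of decimal places)."""
    digits = a_digits * b_digits     # the numerator supplies the sequence of digits
    places = a_places + b_places     # the denominator supplies one factor of 10 per place
    return digits, places


digits, places = multiply_decimals(3142, 3, 213, 2)
print(digits, places)                                 # 669246 5

exact = Fraction(3142, 1000) * Fraction(213, 100)
print(exact == Fraction(digits, 10 ** places))        # True
```

Moving the decimal point five columns to the left in 669246 gives 6.69246, which agrees with the product of the two fractions.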
Here again the number line affords us another rather intuitive way of
visualizing the proper placement of the decimal point. For, in terms of
lengths, 3.142 is shorter than 4 s. u. but longer than 3 s. u. That is, 3.142
lies between 3 and 4. In a similar way, we find that 2.13 lies between 2
and 3. Thus,
3.142 × 2.13 lies between 3 × 2 = 6 and 4 × 3 = 12.
Figure 1.1
      0.2500 . . .
 4 | 1.0000
       8
       20
       20
        00
Keep in mind that both 1/4 and 0.25 are names (numerals) for the rational
number 1 ÷ 4. Of course, we could write 0.250, 0.2500, 0.25000, and so on,
as synonyms for 0.25. In other words, we could write this decimal in a
nonterminating way merely by agreeing to write 0’s forever. Rather than
As a second example,
1/16 = 1/(2 × 2 × 2 × 2)
     = (1 × 5 × 5 × 5 × 5)/(2 × 2 × 2 × 2 × 5 × 5 × 5 × 5)
     = 625/10,000 = 0.0625.
For instance, we know that it is worth more than 3 dimes but less than 4
dimes; it is worth more than 33 cents but less than 34 cents; and no matter
how many more denominations we invented decimally, we could never
measure exactly 1/3 of a dollar. Notice that we stress decimally since we
could invent a coin that was worth 1/3 of a dollar, just as we invented
nondecimal coins to represent \ of a dollar, * of a dollar, and of a dollar.
Leaving out the monetary illustration, we are saying that 1/3 = 1 ÷ 3, or
      0.3333 . . .
 3 | 1.00000000
       9
       10
        9
        10
         9
         10
          9
          1
That is, we never get a zero remainder in this division problem. In fact,
the remainder, 1, repeats continuously. In other words, in decimal form
the rational number whose name is 1/3 is nonterminating. This leads to a
rather esthetic thought; namely, 0.3, 0.33, 0.333, and so on, never become
synonyms for 1/3. Yet, each member in the sequence becomes more nearly
equal to 1/3, as the following differences show. Thus,
0.3 = 3/10 = 9/30; 1/3 = 10/30;
hence,
1/3 - 0.3 = 1/30,
0.33 = 33/100 = 99/300; 1/3 = 100/300;
hence,
1/3 - 0.33 = 1/300.
In a similar way,
1/3 - 0.333 = 1/3 - 333/1000 = 1000/3000 - 999/3000 = 1/3000.
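The pattern in these differences continues for as many 3's as we care to write. The Python sketch below, an illustration using the standard `fractions` module and a helper name of our own, computes the exact gap between 1/3 and the decimal 0.33...3 with k threes.

```python
from fractions import Fraction


def gap_after(k):
    """Return 1/3 minus the decimal 0.33...3 written with k threes."""
    truncated = Fraction(int("3" * k), 10 ** k)
    return Fraction(1, 3) - truncated


for k in range(1, 6):
    print(k, gap_after(k))
# 1 1/30
# 2 1/300
# 3 1/3000
# 4 1/30000
# 5 1/300000
```

Each additional 3 makes the gap one-tenth as large, yet the gap is never zero.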
In other words, while 0.33333 . . . is never exactly equal to 1/3 no matter
where we terminate, the endless sequence of 3’s does represent 1/3. The
esthetic point is that the mathematician is quite interested in the endless
progression, even though the word “endless” is highly subjective. For
example, it is physically impossible for man to write out an endless
progression, by virtue of the meaning of the word “endless.” Moreover, from
    0.142857142857142857142857 . . .
7 | 1.000000000000000000000000
(successive remainders: 3, 2, 6, 4, 5, 1, 3, 2, 6, 4, 5, 1, . . .)
it tells us that in a group of, say, 400 people, at least two celebrate their
birthday on the same day; otherwise there would be at least 400 days in a
year. It tells us that if there is an urn with marbles, some colored red
and the others black, and we wish to choose two of the same color; then we
need draw only 3 marbles from the urn to make sure that at least two have
the same color (of course, we may have matched a pair with 2, but we
cannot miss with 3).
This principle guarantees us that if our divisor is n (in the above case we
were considering n = 7), by the time we get to the (n + 1)th 0 after the
decimal point, we are assured that at least one remainder has repeated,
unless 0 occurs, in which case the decimal terminates. Summed up then,
using the chest-of-drawers principle, when we express a rational number as a
decimal, we must obtain either a terminating decimal (0 occurs as a
remainder) or a repeating decimal (0 never occurs as a remainder but some
other remainder repeats).
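The argument can be watched in action by carrying out the long division while keeping a record of the remainders already met. What follows is only a sketch under our own naming (`decimal_expansion` is not a term from the text), and it assumes the numerator is a whole number smaller than the divisor.

```python
def decimal_expansion(numerator, divisor):
    """Long division of numerator/divisor, assuming 0 < numerator < divisor.

    Returns (digits, start): the digits found before a remainder recurs, and
    the position at which the repeating block begins. start is None when a
    zero remainder ends the division (a terminating decimal).
    """
    digits = []
    seen = {}                      # remainder -> position where it first arose
    remainder = numerator
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        remainder *= 10            # "bring down" the next 0
        digits.append(remainder // divisor)
        remainder %= divisor
    start = None if remainder == 0 else seen[remainder]
    return digits, start

print(decimal_expansion(1, 7))   # ([1, 4, 2, 8, 5, 7], 0)
print(decimal_expansion(1, 4))   # ([2, 5], None)
```

Because only the remainders 1 through n - 1 can occur without ending the division, the loop can never run more than n times for a divisor n.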
Although it is not important for our immediate needs, the converse
(which means in an If . . . then . . . statement, the statement which
is obtained by interchanging the clauses following the “If” and the “then”)
of the above result is also true. That is, if a decimal either terminates or
repeats, then it represents a rational number. Before demonstrating this,
however, let us point out that this statement is not as trivial as it may
appear at first glance. It is not just a restatement of the previous result.
Observe that there is a difference between saying that if a person is reading
this book, he is alive, and if a person is alive, he is reading this book!
Again, rather than offer a proof, we will proceed by examples. Consider
a terminating decimal. For example, let us choose 2.673. Certainly we
may write
2.673 = 2673/1000.
(In general, we may pretend that the decimal point is missing and read
the numeral as that number of whatever denomination is represented by
the last column.)
The more difficult case is when the decimal does not terminate. For
example, let us consider 0.3333333. . . , where 3 repeats endlessly, and let us
pretend that we do not know that this is the decimal form of the rational
number 1/3. We observe that if we move the decimal point one place to the
right, the digits that follow the decimal point do not change. But this means multiplying by ten.
In other words, if we let n = 0.33333333333333333333333333... , then
10n = 3.33333333333333333.... Writing this in a form more conducive
to subtraction, we have
  10n = 3.33333333333333333333...
-(  n = 0.33333333333333333333...)
   9n = 3.00000000000000000000...
decimal point one, three, five, seven, and so on, places to the right, we
obtain the repeating decimal 0.13131313. . . . Thus,
 1000n = 413.1313131313...
-( 10n =   4.1313131313...)
  990n = 409
     n = 409/990.
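The shift-and-subtract device works for any repeating decimal between 0 and 1. A minimal sketch follows; the function name `repeating_to_fraction` and the choice of passing the digits as strings are our own assumptions, made only for illustration.

```python
from fractions import Fraction

def repeating_to_fraction(prefix, block):
    """Fraction named by 0.<prefix><block><block>..., digits given as strings.

    prefix holds the digits that do not repeat (possibly empty), and block
    holds the digits that repeat endlessly.
    """
    shift_all = 10 ** (len(prefix) + len(block))  # e.g. 1000 for 0.4131313...
    shift_prefix = 10 ** len(prefix)              # e.g. 10 for 0.4131313...
    whole_after_all = int(prefix + block)         # e.g. 413
    whole_after_prefix = int(prefix) if prefix else 0  # e.g. 4
    # Subtracting the two shifted copies cancels the endless tail.
    return Fraction(whole_after_all - whole_after_prefix,
                    shift_all - shift_prefix)

print(repeating_to_fraction("4", "13"))  # 409/990
print(repeating_to_fraction("", "3"))    # 1/3
```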
To check, we observe the following.
        0.413131
990 | 409.000000
      396 0
       13 00
        9 90
        3 100
        2 970
          130
130 is the first remainder to repeat in this division. Again, notice
that the repetition took place at the third decimal place, not at
the 991st!
Notice, again, the special properties of two and five in our base-ten system.
The point is that in any base, the terminating decimal fractions are
determined by whether the prime factors of the denominator happen to be
divisors of the number base. For example, in base-six arithmetic the
crucial primes would be two and three. This is because of the number
fact that the product of two and three is six. That is, 2 X 3 = 6 or
(2 X 3 = 10) in base six.
By way of additional examples, when written as a decimal, 1/7 would
terminate in the base-fourteen system. In fact, (1/7 = 0.2) in base fourteen. This follows
from the fact that seven is a divisor of the base in this case. In particular,
          0.11 . . .
1/2 = 2 | 1.0000
          2
          10
           2
           1 . . .
and
1/8 = 1/2^3 = 6^3/(2^3 X 6^3) = (6^3/12^3) in base ten
6^3 = 216 = 160 in base twelve
12^3 = 10^3 in base twelve = 1000 in base twelve
1/8 = 0.16 in base twelve
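The same long division can be carried out in any base by "bringing down" a multiple of the base instead of a multiple of ten. The sketch below is an illustration only; the function name `expand_in_base` and the cutoff `max_digits` are our own assumptions. Digits are reported as ordinary whole numbers, so a base-twelve digit of value ten would appear as 10.

```python
def expand_in_base(numerator, denominator, base, max_digits=12):
    """Digits after the point of numerator/denominator (between 0 and 1) in a base.

    Returns (digits, terminated). The search stops after max_digits digits,
    so terminated is False for expansions that repeat.
    """
    digits = []
    remainder = numerator
    while remainder != 0 and len(digits) < max_digits:
        remainder *= base
        digits.append(remainder // denominator)
        remainder %= denominator
    return digits, remainder == 0

print(expand_in_base(1, 7, 14))  # ([2], True): 1/7 is 0.2 in base fourteen
print(expand_in_base(1, 8, 12))  # ([1, 6], True): 1/8 is 0.16 in base twelve
print(expand_in_base(1, 3, 10))  # twelve 3's, and no termination
```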
We conclude this section with a discussion of percentages. “Percent,”
derived from the Latin per centum, means literally “per one hundred.”
From our point of view, “percent” is merely a synonym for “divided by
100,” or for indicating that the denominator of a fraction is 100. Thus,
when one says 18 percent of 50 (written 18% of 50) we think of it as saying
18/100 of 50 = 18/100 X 50 = 18/100 X 50/1
= 900/100
= 9.
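Reading "percent" as "divided by 100" translates directly into a one-line computation. A minimal sketch, with a function name (`percent_of`) chosen by us for illustration:

```python
from fractions import Fraction

def percent_of(percent, amount):
    """percent% of amount, treating "percent" as a denominator of 100."""
    return Fraction(percent, 100) * amount

print(percent_of(18, 50))  # 9
```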
Exercises
1. If we assume that “decimal” refers to 10 rather than ten, explain how
we can have a decimal system in any number base system.
2. In what sense may we read 62.3764 as if it were 623,764?
3. Assume that we were using place-value as usual, but that we forgot to
put in a decimal point when we expressed a certain number. In this
case, name five different numbers that could be named by 2345678.
In what sense are all these numbers like 2,345,678?
4. Distinguish between the use of 0 in such numerals as 4.1300 and 4.0013.
5. Find the sum of 45.0013, 123.261, 872.4456, and 9.003.
6. Compute 23.76 X 2.8, and interpret this result in terms of converting
the decimal fractions into equivalent common fractions.
7. In terms of areas of rectangles, explain why 23.76 X 2.8 must exceed
46, but be less than 72.
8. Express each of the following ratios as decimal fractions.
(a) 3:8 (b) 27:40 (c) 40:27 (d) 8:14 (e) 3:17
9. Without actually computing the equivalent decimal fraction, tell which
of the following common fractions are equivalent to terminating
decimals and which are equivalent to repeating, nonterminating decimals
(base-ten).
(a) 23/50 (c) 13/40 (e) 9/25 (g) 239/1250
(b) 29/70 (d) 6/75 (f) 18/24
10. Name three bases in which the decimal equivalent of \ does not
terminate. What decimal expresses \ in each of these cases?
11. Find three different number bases in which the decimal equivalent of f
terminates. What is the decimal in each of these three cases?
12. Is there a base in which the decimal equivalent of f will neither
terminate nor repeat? Explain.
13. Express as a common fraction the rational numbers named by each of
the following decimals.
(a) 0.4563 (d) 23.4343... (g) 0.31247878...
(b) 4.563 (e) 0.45666... (h) 0.999...
(c) 0.45634563... (f) 0.64545...
14. Express £ as an equivalent decimal fraction in the base-twelve system.
15. Is it possible that there are two whole numbers whose quotient in
decimal form is given by
0.25255255525555...
(where ... means add one more 5 to each cycle)? Explain.
16. Find the value of n in each of the following.
(a) 23% of n is 92. (d) n is 23% of 1000.
(b) 23% of n is 1000. (e) n X 1.23 = 55.35.
(c) n is 23% of 92. (f) n + 7.983 = 11.052.
(g) If an object travels at a steady rate of 15 feet per second then it
is also traveling at n miles per hour.
17. In terms of number-versus-numeral, explain why it is wrong to say
that a rational number is any decimal which either terminates or
repeats.
Figure 1.28
Of course, from a legal point of view, all our present argument shows is
that there is an appreciable difference between a point and a dot, and that
the ancient Greek used a natural but naive argument. Notice that we
have not disproved his claim, only his proof.
However, now that we have at least established a need for more than
the given proof, we can proceed more confidently to seek a vehicle for our
claim that there must be more than rational numbers. The vehicle we
elect to use is that of decimals. We have already seen that the only
decimal numerals that can name rational numbers are those which either
terminate or repeat. Consequently, if we agree that all decimals name
numbers, then as soon as we can exhibit a decimal which neither terminates
nor repeats, we have established the existence of nonrational (irrational)
numbers. To this end, we offer as a candidate the number
m = 0.28288288828888...
isosceles right triangle, each of whose equal sides has length 1. By the
Pythagorean theorem, it followed that the length of the hypotenuse of this triangle
was √2. Since this was a “real” triangle, his feeling was that √2 was
a “real” number. (We shall examine the idea of “real” numbers in more
detail in the section on complex numbers.) (As a footnote on the power
of geometric pictures in viewing abstract relations, an ancient proof of
this theorem which involves nothing more than subdividing a square in
two different ways is supplied as Note 2 at the end of this section.)
Since the only numbers he knew were rational numbers, he set out to
find two whole numbers whose quotient was √2. In other words, he
sought a rational number whose square was 2. His quest proved futile,
even though he managed to come close at times. For example, 7/5
almost works since 7/5 X 7/5 = 49/25, which is almost 2.
He eventually employed the well-known logical ruse used so often in
mathematics: that of pretending to know what the answer was, and then
working backwards to discover it. This is what we do in algebra problems
when we let x equal the unknown. We are pretending that x is the answer
and then we apply logic to deduce the numerical value x must have. In
this context, he probably said, “Okay, suppose we knew what the number
was. Let’s call it m/n. Then m^2/n^2 = 2, or m^2 = 2n^2. However, it is
impossible for there to be two whole numbers m and n such that m^2 = 2n^2,
since in terms of the unique factorization theorem m^2 has an even number
of prime factors, while 2n^2 has an odd number of prime factors. Thus, there can be no
rational number whose square is 2.”5 This was, indeed, a dilemma to
him (legend has it that the mathematician who discovered this fact
committed suicide); for now he either had to admit that √2 was not a number
or else he had to admit that the rational numbers, just as the whole
numbers before this, were in themselves not sufficiently complete to handle
the various fine points of mathematics.
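A search by machine is no substitute for the argument above, since it can only examine finitely many cases, but it does show how stubbornly the equation m^2 = 2n^2 resists whole-number solutions. The following is a sketch only; the function name `rational_square_roots_of_two` is our own, and it assumes Python 3.8 or later for `math.isqrt`.

```python
from math import isqrt

def rational_square_roots_of_two(max_n):
    """List all pairs (m, n) with m*m == 2*n*n for 1 <= n <= max_n."""
    hits = []
    for n in range(1, max_n + 1):
        m = isqrt(2 * n * n)        # largest whole number with m*m <= 2*n*n
        if m * m == 2 * n * n:
            hits.append((m, n))
    return hits

print(rational_square_roots_of_two(100000))  # []
print(7 * 7, 2 * 5 * 5)                      # 49 50: the near miss 7/5
```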
To correlate this discussion of √2 with our decimal interpretation, we
are saying that we can find successively better rational approximations in
decimal form for √2, but that no such number can ever equal √2. For
example, let us try to find a decimal which names √2. Certainly such a
number must lie between 1 and 2 since the square of 1 is too small and the
square of 2 is too big. In a similar way, the fact that 1.4 X 1.4 = 1.96
and 1.5 X 1.5 = 2.25 means that 1.4 < √2 < 1.5. Carrying out this
procedure still further, it could be shown that 1.41 < √2 < 1.42; and
that 1.414 < √2 < 1.415. Moreover, such a procedure, provided only
that we had the patience, could be carried on as many times as we liked.
In this way we could continually squeeze √2 between two rational
numbers which became as nearly equal as we desired; but in no event would we
NOTE 1
Figure 1.24
Clearly, L can hold more marbles than S. That is, L contains more
dots than S. However, L and S have exactly the same number of points,
as the following construction will indicate.
First, construct a figure similar to the one in Figure 1.25, joining the
endpoints of S and L to form the triangle whose vertex is A.
Figure 1.25
Pick any point p on line S. Then the line which joins A and p intersects
L at one and only one point, say, q. Conversely, starting at any point q
on line L, the line that joins A and q intersects S at exactly one point, p.
This construction exhibits a one-to-one correspondence between the points
on each line.
NOTE 2
Figure 1.26
1.6 The Irrational Numbers 89
(a) (b)
Figure 1.27
NOTE 3
(Beware, this is true only for prime numbers. For example, 6 is not
divisible by 4, even though 6^2 is. We may write 4 as 2 X 2. Then each 2
is used with each factor of 6. However, primes cannot be broken down
into smaller factors.)
Since m is divisible by 2, we may write
m = 2k
where k is also a whole number. Thus,
m^2 = 4k^2.
Now m^2 = 2n^2, and m^2 = 4k^2. Therefore,
4k^2 = 2n^2
since they both equal m^2. From 4k^2 = 2n^2 it follows that
n^2 = 2k^2.
From n^2 = 2k^2 we can now conclude that n is also divisible by 2. This
is the required contradiction; for since m and n are both divisible by 2,
m/n is not in lowest terms; yet we assumed that it was.
As a final remark, we might want to ask what made the ancient Greek
suspect that √2 was irrational since he already felt that the rational
numbers filled in the entire number line. Perhaps he believed that √2
was rational and proceeded to deduce what whole numbers had this
quotient. When the procedure yielded a known contradiction, he may
then have decided that he had invented an indirect proof that √2 was
irrational.
Exercises
1. In terms of decimals, explain why there must exist irrational numbers.
2. Let m denote the decimal 0.676776777... (where each time one more 7
is added to the cycle) and let n denote 0.212112111... (where each time
another 1 is added to the cycle).
(a) Are m and n rational numbers? Explain.
(b) Is m + n a rational number? Explain.
(c) Does the sum of two irrational numbers always yield an irrational
number? Explain.
(d) With m as above, find two rational numbers whose difference is less
than 0.0000001 such that one of them is more than m while the
other is less than m.
3. Find two irrational numbers whose difference is less than 0.00000001
such that one of them is greater than £ while the other is less than
4. From a practical point of view, what does it mean if we write
√2 = 1.41?
1.7 Signed Numbers 91
5. Highlight the difference between a point and a dot to explain why the
ancient Greek did not expect irrational numbers to exist.
6. Show by means of an example that the product of two irrational numbers
may be rational.
7. Using the proof in the text that √2 is irrational as an example, show
that √3 is irrational.
8. If we already know that √3 is irrational and that the sum of two
rational numbers is rational, show how we may conclude that √3 - 2
must be irrational.
five pieces, we can lower the temperature by 7°F of something whose
temperature is 5°F.
The aim of this section is to present the development of signed numbers
and to indicate the further importance of the number line. Let us begin
our study as it occurred historically—in terms of profit and loss.
It is clear that in any business transaction the difference between profit
and loss is crucial. That is, if we conclude a $7 transaction, we are
interested in knowing whether the $7 was a profit or whether it was a loss. The
magnitude of the transaction alone is not enough. Suppose we wish to
think of 0 as the symbol that indicates a transaction that was neither a
profit nor a loss. In other words, we think of 0 as denoting that we broke
even. Then we could think of profit as a number in excess of 0, and a loss
as a number less than 0. More specifically, instead of using the traditional
red and black inks to distinguish between profit and loss, or instead of
writing P7 or L7 to distinguish between a $7 profit or $7 loss, we elect to
use the symbol + to denote profit and the symbol - to denote loss. The
reasons for choosing + and - as our symbols will be explained in more
detail later.
Thus, we think of a signed number as being composed of two separate
parts: (1) the magnitude, or size, and (2) the signature, + or -, which we
use as a code to distinguish between profit and loss. Thus, we would write
(+7) to indicate a $7 profit and (-7) to indicate a $7 loss. The
parentheses are used to emphasize that both the signature and the magnitude are
necessary to describe the transaction fully. In other words, +7 or -7 is
treated as one entity.
Before proceeding further, notice that we usually read + as “plus” and
- as “minus.” This is quite natural because + and - are the usual
symbols to denote addition and subtraction, respectively. However, and we
shall show a connection later, the fact remains that conceptually there is a
difference between signature and the operations of addition and
subtraction. In other words, we have opened the doors for misinterpretation by
letting one symbol, +, exist in two different contexts. To help minimize
this problem we agree to read the signature + as “positive” and - as
“negative.”
Suppose we now wish to discuss the meaning of the sum of two signed
numbers. As we mentioned in our discussion of rational numbers, there
is no “natural way” to define the sum of two numbers. What is “natural”
will depend on what we want the sum to mean. In terms of business
transactions, it seems natural that the sum of two transactions be the net
result of the two transactions. In other words, we are saying that since
signed numbers have been introduced to denote profit and loss, the sum of
two signed numbers should represent the resultant profit or loss of the two
Figure 1.28 (the four combinations of transactions of magnitudes 7 and 3, with balances of magnitudes 10, 4, 4, and 10)
Of course, other than for the specific answer, the above discussion did
not depend on numbers having magnitudes 7 and 3. The debit-credit
motif makes it clear that if the two transactions are either both credits or
both debits, we find the net effect by adding the magnitudes; whereas, if we
have one credit and one debit, we subtract to find the balance. We state
this more formally as a rule.
(1) If the numbers have like signatures, add the two magnitudes and affix
the common signature.
(2) If the numbers have unlike signatures, subtract the smaller magnitude
from the larger magnitude and affix the signature of the number which
has the larger magnitude.
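The two-part rule can be followed literally by keeping the signature and the magnitude as separate pieces of data. The sketch below is an illustration of ours and not part of the original text: the name `add_signed` and the representation of a signed number as a pair such as ('+', 7) are assumptions. The rule as stated does not say what to write when unlike signatures meet equal magnitudes, so the choice of ('+', 0) for a break-even result is also our assumption.

```python
def add_signed(a, b):
    """Add two signed numbers given as (signature, magnitude) pairs."""
    (sign_a, mag_a), (sign_b, mag_b) = a, b
    if sign_a == sign_b:
        # Rule (1): like signatures, add magnitudes, keep the common signature.
        return (sign_a, mag_a + mag_b)
    if mag_a == mag_b:
        # Assumption: a break-even balance is written as (+0).
        return ('+', 0)
    # Rule (2): unlike signatures, subtract the smaller magnitude from the
    # larger and keep the signature of the number with the larger magnitude.
    if mag_a > mag_b:
        return (sign_a, mag_a - mag_b)
    return (sign_b, mag_b - mag_a)

print(add_signed(('+', 7), ('-', 3)))  # ('+', 4)
print(add_signed(('-', 7), ('+', 3)))  # ('-', 4)
```

Notice that the code, like the formal rule, never mentions profit or loss.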
Whereas the formal rule may appear to be more awesome than our
previous remarks, notice that it merely captures the same meaning we have
already seen in a fairly intuitive way using our profit-loss interpretation.
However, before condemning the rule as being stilted and nonintuitive,
notice that the formal rule never mentions the words “profit” and “loss.”
In other words, the formal rule, while motivated by physical interpretation,
never makes mention of any physical interpretation. This idea plays an
important role later in the text.
Subtraction with signed numbers is the next operation we consider.
Here we can reinforce an important concept already studied. Since we
have previously introduced subtraction as the inverse of addition and since
we have already defined (a - b) by the relation b + (a - b) = a, we can
be consistent, and at the same time take the easy way out, by keeping this
same definition of subtraction. Just as before, when we agreed that 13 - 4
means the number that must be added to 4 to yield 13, we now agree that
(+7) - (+3) means the number that must be added to (+3) to yield (+7).
Once we know how to add signed numbers by using this definition, it
becomes fairly easy to see that (+7) - (+3) = (+4).
In terms of the bookkeeping analogy, we could think of addition as
finding the balance when the two transactions were known. Subtraction,
then, may be viewed as finding the second transaction if the first
transaction and the balance are known. We have the following.
(+7) - (+3) = (+4) since (+4) + (+3) = (+7)
(+7) - (-3) = (+10) since (+10) + (-3) = (+7)
(-7) - (+3) = (-10) since (-10) + (+3) = (-7)
(-7) - (-3) = (-4) since (-4) + (-3) = (-7).
These results may be summarized by the following formal rule.
(1) Change the minus sign to a plus sign, and then change the signature
of the number which is being subtracted.
(2) Perform the resulting addition. For example, applying this to (+7) -
(-3), we obtain (+7) + (+3) = (+10), which agrees with our
previous result.
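The subtraction rule can be checked against the defining relation b + (a - b) = a. In this sketch, which is ours and only illustrative, Python's built-in signed integers stand in for signed numbers, and the function name `subtract_signed` is hypothetical.

```python
def subtract_signed(a, b):
    """Compute a - b by changing the signature of b and then adding."""
    return a + (-b)

for a in (+7, -7):
    for b in (+3, -3):
        difference = subtract_signed(a, b)
        # The defining relation: b + (a - b) = a.
        assert b + difference == a
        print(f"({a:+}) - ({b:+}) = ({difference:+})")
```

The four printed lines reproduce the four subtractions listed above.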
mentions the physical terms, profit and loss. Yet the same danger exists
here that existed when we spoke of the rule for dividing rational numbers.
Remember that the rule for division of rational numbers could be
summarized by the phrase “invert and multiply.” If we identify (+a) and
(-a) as being additive inverses in the sense that their sum is zero (that is,
(+a) + (-a) = 0), then the rule for subtracting signed numbers can be
summarized by the phrase “invert and add.” Because a rule is an
accepted convention and does not have to be logical, there seems to be
something mystical about it; and this, in turn, tends to make the rule seem
“unreal.” As we have said before, it makes more sense for us to visualize
what appears to be happening first and then to see that the rule is nothing
more than a convenient aid to our memory rather than to accept a result
blindly.
We are still left with the problem of defining multiplication and division
of signed numbers. However, if we once again agree to define division as
the inverse of multiplication, we need really only study the operation of
multiplication. As a change of pace, let us first observe the formal rule
and then look at the intuitive method.
in Figure 1.29. Suppose that we would like this rule to be true even when
we have negative, or minus, signs as well. For example, consider
(8 - 7) X (4 - 3).
On the one hand, we know that the answer is 1 X 1, or 1. On the other
hand, we would have
(8 X 4) + [8 X (-3)] + [(-7) X 4] + [(-7) X (-3)],
Figure 1.29
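The example (8 - 7) X (4 - 3) leaves very little freedom for the value of (-7) X (-3). The sketch below is ours, and it rests on one stated assumption: that the mixed products 8 X (-3) and (-7) X 4 are already agreed to be -24 and -28. Under that assumption it computes the value the last term must have if the four-term expansion is to equal 1. The function name `forced_last_term` is hypothetical.

```python
def forced_last_term():
    """Value (-7) X (-3) must take for the expansion of (8 - 7) X (4 - 3) to be 1."""
    target = (8 - 7) * (4 - 3)                  # 1 X 1 = 1
    # Assumption: the mixed products are -24 and -28.
    known = (8 * 4) + (8 * (-3)) + ((-7) * 4)   # 32 + (-24) + (-28) = -20
    return target - known

print(forced_last_term())  # 21
```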
Once we accept the rule for multiplying signed numbers, the definition of
division, the inverse of multiplication, becomes immediately apparent. For,
if we continue to define m ÷ n to mean the number which must be
multiplied by n to yield m, we obtain the following:
Figure 1.30
vented, at least in part, by the feeling that it did not seem natural that
certain points be named, but not others. In the same way, once the
rational numbers were motivated in terms of the number line, the same
question motivated the invention of the irrational numbers. Thus, along
these exact same lines, we might next ask why the point P in Figure 1.31
Figure 1.31
Figure 1.32
Figure 1.33
A is the point on the line which lies 1 s. u. to the left of 0, while B is the
point on the line which lies 1 s. u. to the right of 0. In other words, we
find that both distance and direction are important when we wish to locate
1.8 Signed Numbers and the Number Line 101
Figure 1.34
number line that the Greek did not visualize. Namely, + and - are being
used as a code to tell direction. In this sense, -3 does not indicate a line
that is 3 s. u. shorter than nothing! Rather, it means a line of length 3 s. u.
drawn in the right to left direction. In terms of the usual notation of signed
numbers, the Greeks viewed the concept of magnitude, but Descartes added
the concept of signature. The influence of the number line is felt also by
virtue of the fact that many textbooks refer to signed numbers as directed
numbers.
Notice, also, from a physical point of view that direction is as important
as magnitude in the study of distances. Thus, if it is 100 miles from a
town A to a town B and a person leaves town A and travels at an average
speed of 25 miles per hour, he will reach town B in 4 hours, only if he travels
in the direction from A to B. In other words, the idea of introducing directed
distances in terms of the number line has physical as well as philosophical
meaning.
We can now interpret the arithmetic of signed numbers in terms of the
number line. For example, since signed numbers are now being visualized
as representing directed distances, it is natural to define the sum of two
such numbers as the resultant distance traveled. That is,
(+7) + (+3) = (+10) since a motion of 7 s. u. to the right followed by an
additional motion of 3 s. u. to the right is
equivalent to the single motion of 10 s. u. to the right.
(+7) + (-3) = (+4) since a motion of 7 s. u. to the right followed by a
second motion of 3 s. u. to the left is equivalent to
the single motion of 4 s. u. to the right.
(-7) + (+3) = (-4) since a motion of 7 s. u. to the left followed by a
motion of 3 s. u. to the right is equivalent to a
single motion of 4 s. u. to the left.
(-7) + (-3) = (-10) since a motion of 7 s. u. to the left followed by
another motion of 3 s. u. to the left is equivalent to
the single motion of 10 s. u. to the left.
(1) If the numbers have like signatures, add the two magnitudes and affix
the common signature.
(2) If the numbers have unlike signatures, subtract the smaller magnitude
from the larger and affix the signature of the number having the larger
magnitude.
This is precisely the same rule that we stated when we discussed signed
numbers in terms of profit and loss. Thus, from a purely formal point of
view, Descartes’ interpretation of signed numbers causes no discrepancy
with the profit-and-loss interpretation. The recipes are the same,
independent of the choice of interpretation. The beauty of the formal rule
should now be apparent. Namely, no matter which of the two
interpretations we elect to use, the formal rule remains the same. On the other hand,
had we chosen to define signed numbers, say, in terms of profit-loss, then
trouble would have arisen when we talked about directed distances, since
there is no direct correlation between these two interpretations. We
avoided such labels as P-L or R-L, since these labels can prejudice the
interpretation. + and -, on the other hand, are equally well adjustable to
either interpretation. In fact, the formal definition allows us to use any
interpretation of signed numbers that we wish, subject only to the
restriction that the interpretation obeys the definition.
Thus Descartes supplied us with an interpretation of signed numbers that
completes our number line for us. That is, the natural, rational, irrational,
and signed numbers filled in the number line with no spaces left. The
constructed number line is called the real number line. If we wish to make no
reference to the number line, we define a real number as one whose square
is at least as great as 0; that is, x is real if and only if x^2 ≥ 0. This agrees
with our previous data in terms of the number line; for we know that a point
lies to the right of 0 (+), to the left of 0 (-), or coincides with 0. In each
case we have: (+) X (+) = (+); (-) X (-) = (+); 0 X 0 = 0. Thus, the
formal definition and the number-line definition agree.
In summary, the number line unifies the real numbers for us. All real
numbers can be viewed as either points or lengths. This is a tremendous
improvement over the idea that the whole numbers are used for counting,
1.9 The Complex Numbers
the rational numbers are used in slicing pies, the negative numbers are for
profit and loss, and the irrational numbers are to make us nervous!
Exercises
1. Explain why we may refer to Descartes as the inventor of signed
numbers even though signed numbers were known some 400 years before
his time.
2. Explain how 5 - 7 has no real meaning in terms of tally systems but
does have meaning in terms of the number line.
3. Interpret each of the following results both in terms of profit-and-loss
and as directed lengths.
(a) (+3) + (+2) = (+5).
(b) (+3) + (-2) = (+1).
(c) (-3) + (+2) = (-1).
(d) (-3) + (-2) = (-5).
4. Use the result of Exercise 3(d) to criticize the following argument:
“It is natural that the product of two negative numbers be a positive
number because the rule of double negation asserts that two negatives
make an affirmative.”
5. With the advent of signed numbers we agree to define a - b to mean
a + (-b). In terms of this interpretation show why 5 - 7 = -2.
6. Perform the indicated operations.
a ' -2 -3 ' - -7
b t - j + .♦
c ; -3 - 4 ; - -2
d -3 ' -4 - -2 ■
After the last few sections it should be apparent what we mean when we
say that realness is in the eyes of the beholder. In summary, before the
rational numbers were invented 3 ÷ 2 named a “nonreal” number; before
irrational numbers were invented √2 named a “nonreal” number; and
before signed numbers were invented 5 - 7 named a “nonreal” number.
In a simple extension of this chain: after the real “real” numbers had been
invented, √-1 named a “nonreal” number (since by definition of a real
number, the square of a real number cannot be negative). However, we
must remember that in the truest sense √-1 is no more “unreal” to our
present system than was 3 ÷ 2 when all we had were the whole numbers.
To restate the above remarks in terms of a vehicle that we can further
exploit later, let us redevelop the origin of our number system with respect
to polynomial equations.
x^2 - 2 = 0
would be sensible to us, yet it could possess no “real” solution since the
“real” numbers are the rational numbers and √2 is irrational.
It should now be clear as to what we mean by realness depending on the
level of our knowledge. Suppose, then, that we now accept the real
number system as the system that actually bears that name; that is, x is
real if and only if x^2 ≥ 0. Let us now consider
x^2 + 1 = 0.
Certainly, this equation is just as understandable to us as were the four
previous equations. However, since any solution, x, of x^2 + 1 = 0 implies
that x^2 = -1, we cannot by definition have a real number as its solution.
Thus, if we wish to solve this equation we must again augment the real
number system. We elect to have x^2 + 1 = 0 possess a solution and we
denote one such solution by i. That is, i is defined by i^2 = -1 or
i = √-1.
The key point now is that i is no more “imaginary” to our present real
number system than was 3/2 when only the whole numbers were known to man.
The major difference lies in the fact that we may feel more at ease with 3/2 than
with i.
We choose the name complex numbers to name the system that augments
the real numbers so that x^2 + 1 = 0 can have a solution. More specifically,
we define the complex numbers to be all numbers of the form
a + bi,
where a and b denote real numbers and i denotes one of the roots of
x^2 + 1 = 0.
Concerning the definition a + bi, one might wonder why it does not
contain additional terms, say, other powers of i. The answer lies in the fact
that we want the complex numbers to be an extension of the real numbers.
This means that everything that is true for real numbers should still remain
true when the real numbers are viewed as being part of the complex
numbers.
Actually this concept may seem more complicated than is really the case.
To see the problem at a more familiar level, consider the problem of adding
rational numbers. In the previous sections we agreed to define \ ^ to
be f, even though it would have been easier and/or more natural to have
defined this in terms of adding like parts. We can now give another reason
for this definition in terms of the rational numbers being an extension of the
whole numbers. Suppose we agreed that the definition of adding common
fractions was to add like parts. Then 3/1 + 2/1 = 5/2. 3 = 3/1 and 2 = 2/1;
however, now the sum of two and three depends on whether we view these
Definition
Given the two complex numbers a + bi and c + di, we say that a + bi =
c + di if and only if a = c and b = d.
Definition
Given the two complex numbers a + bi and c + di, we define their sum
by (a + bi) + (c + di) = (a + c) + (b + d)i.
of the real parts of the addends and its imaginary part is the sum of the
imaginary parts of its addends. Needless to say, this is a useful
computational fact.
We can deduce a few more facts about complex numbers from the above
definition. We know that for real numbers, 0 is characterized by being
the additive identity. In plain English, for any real number b, b + 0 = b.
Suppose that we want to investigate the existence of an additive identity for
the complex numbers. If it exists, let us say that it has the form x + yi.
Now if a + bi is any given complex number, we must then have that
(a + bi) + (x + yi) = a + bi.
Otherwise, by definition, x + yi could not be the additive identity. Our
definition of addition tells us that
(a + bi) + (x + yi) = (a + x) + (b + y)i.
Comparing the last two equations, we see that
a + bi = (a + x) + (b + y)i.
Applying our definition of equality to this last equation, we see that
a = a + x and b = b + y.
However, the last two equations involve only real numbers, and assuming
that we can solve linear equations involving real numbers, these equations
force us to conclude that
x = 0 and y = 0.
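The conclusion just reached can be tried out mechanically. What follows is only a sketch: it models the complex number a + bi as the ordered pair (a, b), and the names add and ZERO are our own illustrative choices, not notation used elsewhere in this text.

```python
# A complex number a + bi is modeled here as the pair (a, b) of real numbers.

def add(z, w):
    """Sum by the definition: (a + bi) + (c + di) = (a + c) + (b + d)i."""
    a, b = z
    c, d = w
    return (a + c, b + d)

ZERO = (0, 0)  # the candidate additive identity x + yi with x = 0 and y = 0

samples = [(3, 4), (7, -8), (0, 1), (-2.5, 0)]

# Adding ZERO should leave every sample unchanged.
identity_holds = all(add(z, ZERO) == z for z in samples)
print(identity_holds)
```

A handful of samples proves nothing in general, of course; the argument in the text is what establishes the result for every a + bi.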
Definition
Given the complex numbers a + bi and c + di, we define their product
by (a + bi) (c + di) = (ac — bd) + (ad + bc)i.
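As a rough check that this definition of the product behaves as intended, the sketch below again treats a + bi as the pair (a, b). The function name multiply is an assumption made for illustration, and the comparison with Python's built-in complex type is only a cross-check.

```python
def multiply(z, w):
    """Product by the definition: (a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

I = (0, 1)                           # the number i, that is, 0 + 1i
i_squared = multiply(I, I)           # the definition should give -1 + 0i
product = multiply((3, 4), (7, -8))

# Cross-check against Python's built-in complex arithmetic.
builtin = complex(3, 4) * complex(7, -8)
print(i_squared, product, builtin)
```

Notice that the definition forces i times i to be -1, which is exactly the property by which i was introduced.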
1.9 The Complex Numbers 111
Definition
The complex number a + bi is called real if and only if b = 0. It is
called purely imaginary, or imaginary, if a = 0. (Thus, a complex number
is real if its imaginary part is 0, and imaginary if its real part is 0.) In
this sense only 0 is both real and imaginary.
Definition
If a + bi is any complex number, then a - bi is called the complex
conjugate of a + bi. In other words, if we take a complex number and change
the sign of its imaginary part, the resulting complex number is called the
complex conjugate of the first complex number.
The concept of the complex conjugate has many important roles, but
for the present our purpose will be served by the following: Notice that
by definition of complex numbers and the rule of multiplication
However, since a and b are both real numbers, it follows from the
definition of real numbers that both a^2 and b^2 are nonnegative, and hence
that a^2 + b^2 is a real, nonnegative number. In other words, the definition
of a complex conjugate assures us of a way of transforming a complex
number into a real number by either addition or multiplication. That is,
if we are given a + bi, we form a - bi. Then
(a + bi) + (a - bi) = 2a
which is real and
(a + bi)(a - bi) = a^2 + b^2
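The two facts above can be illustrated with a small sketch, again writing a + bi as the pair (a, b). The helper names conjugate, add, and multiply are illustrative assumptions; the arithmetic follows the definitions of sum and product given earlier.

```python
def conjugate(z):
    """Change the sign of the imaginary part: a + bi becomes a - bi."""
    a, b = z
    return (a, -b)

def add(z, w):
    a, b = z
    c, d = w
    return (a + c, b + d)

def multiply(z, w):
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

z = (3, 4)                      # the number 3 + 4i
z_bar = conjugate(z)            # the number 3 - 4i
total = add(z, z_bar)           # should be 2a + 0i
product = multiply(z, z_bar)    # should be (a^2 + b^2) + 0i
print(total, product)
```

In both results the imaginary part is 0, so both are real in the sense of the definition above.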
Exercises
1. Compute the value of i 17i.
2. What is the sum of (3 + 4i) and (7 - 8i)?
3. What is the product of (3 + 4i) and (7 - 8i)?
4. What is the quotient when (3 + 4i) is divided by (7 - 8i)?
5. What is the quotient when (7 - 8i) is divided by (3 + 4i)?
6. What complex number must be multiplied by (2 + 3i) to yield (3 + 2i)?
7. What is the complex conjugate of (4 - 3i)?
8. Consider the equation (x - 1)^2 + 3 = 0. Why is it impossible for any
real number x to satisfy this equation?
9. Express each of the following in the form a + bi where a and b are real
numbers.
(a) (3 + 4i)^2
(b) (3 + 4i)(3 - 4i)
(c) (3 + 4i)(7 - 8i)(2 + i)
(d) (3 + 4i)(7 - 8i)/(4 - 5i)
(e)
[Figure 1.35: a point P in the plane]
real point in the plane merely because it happens not to be on the real line?
After all, the real line could have been drawn anywhere in the plane.”
Descartes thought of placing two number lines at right angles to each
other, and thus was born the Cartesian plane. (This concept has
particularly great importance in terms of graphs, something we shall discuss
at a later time.) These two number lines became known as the x axis and
the y axis; and points in the plane could now be identified with a pair of
real numbers, one of which was called the x coordinate of the point
(abscissa) and the other of which was called the y coordinate of the point
(ordinate). Moreover, a form of place-value was introduced as an
abbreviation; namely, it would be understood that the coordinates would be
listed as an ordered pair of numbers, the first of which would be the x
coordinate and the second of which would be the y coordinate. See
Figure 1.36.
The Cartesian plane supplies us with the identification we need. Namely,
suppose we decide to abbreviate the complex number x + iy by (x,y).
Then, since by definition of a complex number x and y are real numbers,
we have that (x,y) is an ordered pair of real numbers. Thus, we shall
now identify the complex number x + iy with the point (x,y) in the
Cartesian plane. As a matter of vocabulary, when the Cartesian plane is
[Figure 1.36: points of the Cartesian plane labeled by ordered pairs, among them (3, 2), (-3, -2), and (2, -3)]
(1) We have agreed that the complex numbers are an extension of the real
numbers in the sense that x + iy is real if and only if y = 0. However,
if y = 0 the associated point in the plane is (x,0), which denotes a
point on the x axis. Thus, in the Argand diagram the real numbers are
represented by the x axis, which is the number line, just as we want it to
be. In a similar way, the y axis becomes known as the purely
imaginary axis since if x = 0, x + iy is identified with the point (0,y), which
defines the y axis. In summary, then, in the Argand diagram the real
numbers are identified with the x axis, the imaginary numbers are
identified with the y axis, and the complex numbers that are neither
real nor (purely) imaginary are identified with points in the plane that
are on neither axis.
(2) We have agreed that the two complex numbers (x1 + iy1) and (x2 + iy2)
are equal if and only if x1 = x2 and y1 = y2. Translating this into the
language of points-in-the-plane, we would say that (x1,y1) = (x2,y2) if
and only if x1 = x2 and y1 = y2; but this is precisely the criterion that
two points in the plane coincide. Thus, the definition of equality
translates into the fact that complex numbers are equal if and only if
they name the same point; and this certainly agrees with what our
intuition would dictate.
(3) In discussing the number line, we emphasized that numbers could be
identified by either points or lengths. In our approach we emphasized
the length interpretation when we discussed the basic operations of
arithmetic. In particular, we defined addition in terms of laying off
the two lengths end-to-end to obtain the sum. For positive numbers
this posed no need for caution, since all numbers were represented by
lengths that had the same direction and the same sense. (By sense
we mean that while a particular straight line has a specific direction it
can be traversed in two ways. For example, with regard to a
horizontal line, we may move along it either from left to right or from
right to left. In terms of the number line, the sense of all positive
numbers is from left to right.)
The major question is how we shall add lengths when the lengths can
have different senses, or different directions. The answer to this question
gives us an excellent excuse to introduce the very important concept of a
and the sense of the vector is denoted by the placement of the arrowhead.
To clarify this point, a vector is denoted by an arrow in the same way
that a number is denoted by a numeral.
Since the only factors that go into the definition of a vector are its
magnitude, sense, and direction, it is obvious that we should define two
vectors to be equal if and only if they have the same magnitude, sense,
and direction. In terms of a picture the two vectors are equal if their
arrows have the same lengths, are parallel, and have the same sense. (We
never refer to the “same sense” unless the directions are the same; that is,
sense is used only to determine in which direction we place the arrowhead
once the direction of the vector is known.) If we accept this definition,
it means that two vectors may be called equal (as arrows) even if they do
not coincide. All that is required is what we have stated above.
Recalling once again that we are free to make up any definitions we wish,
let us invent a way to add two vectors, thinking of the arrows rather than
of the vectors. We would like the sum of two vectors to also be a vector;
and we also wish that whatever definition we invent correctly represents
the process for adding real numbers. Therefore, we let A and B denote
vectors. We shall define (A + B) to be the following vector. We shift
B so that its tail coincides with the head of A. (Recall that we are allowed
to move a vector provided we do not alter its magnitude, direction, or
sense.) Then (A + B) will be defined as the vector that originates at the
tail of A and terminates at the head of B. This is illustrated in Figure 1.37.
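The head-to-tail construction just described can be sketched in a few lines. This is only an illustration under our own assumptions: an arrow is recorded by the coordinates of its tail and head, and the names Arrow, displacement, moved_to, and add are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Arrow:
    tail: tuple  # (x, y) coordinates of the tail
    head: tuple  # (x, y) coordinates of the head

    def displacement(self):
        """What the arrow carries: head minus tail, coordinate by coordinate."""
        return (self.head[0] - self.tail[0], self.head[1] - self.tail[1])

    def moved_to(self, new_tail):
        """Shift the arrow without altering its magnitude, direction, or sense."""
        dx, dy = self.displacement()
        return Arrow(new_tail, (new_tail[0] + dx, new_tail[1] + dy))

def add(a, b):
    """A + B: shift B so its tail is at the head of A, then join tail of A to head of B."""
    shifted_b = b.moved_to(a.head)
    return Arrow(a.tail, shifted_b.head)

A = Arrow((0, 0), (3, 4))
B = Arrow((1, 1), (8, -7))          # drawn elsewhere in the plane
S = add(A, B)

# Two arrows lying along the number line should add as real numbers do.
two = Arrow((0, 0), (2, 0))
three = Arrow((0, 0), (3, 0))
five = add(two, three)
print(S, five)
```

The last three lines check the requirement stated above, that the definition agree with ordinary addition when both arrows lie along the number line.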
Figure 1.37
While this definition is for the sum of two vectors, it can be extended to
sums of three or more vectors (just as in ordinary arithmetic we define the
sum of two numbers and then proceed to add three or more numbers).
For example, given the three vectors A, B, and C, we observe that A + B
is a vector; hence, (A + B) + C is the sum of two vectors, as is shown in
Figure 1.38.
Of course, we could also have formed the sum A + (B + C).
Fortunately, it turns out that A + (B + C) = (A + B) + C; and as a result
Figure 1.38
Figure 1.40
Figure 1.41
- B (or ? )
Figure 1.4®
Figure 1.45
(x1, y1) + (x2, y2) = (x1 + x2, y1 + y2).
Figure 1.46
Exercises
1. Locate each of the following points in the Cartesian plane.
(a) (1,2)     (d) (1,-2)
(b) (2,1)     (e) (-2,1)
(c) (-1,2)    (f) (-1,-2)
2. Explain how the Argand diagram affords us an interpretation wherein
the complex numbers are as real as the real numbers.
3. Add (3 + 4i) and (7 - 8i) as vectors.
4. What is the magnitude of (3 + 4i)? What does this tell us about the
location of (3 + 4i) in the Argand diagram?
5. In terms of the Argand diagram how are the positions of a complex
number and its complex conjugate related?
chapter two / AN INTRODUCTION
TO THE THEORY OF SETS
2.1 INTRODUCTION
It is a toss-up, at least at the elementary level, as to whether sets or the
number line receive the most publicity in “modern” mathematics. The
number line can be traced back to 600 B.C., while the newer concept of
sets as a self-contained study can be traced back to about 1850 A.D.
This should serve as adequate evidence that “modern” is used not so
much as a synonym for “new,” but rather as a synonym for “meaningful”
or “useful.”
Why is the study of sets so meaningful? A complete answer to this
question would result in a multivolume text. For our immediate purposes,
it might suffice to say that in the same way that numbers are the building
blocks of arithmetic, sets are the building blocks of all mathematics.
Thus, sets can be used to help us examine every mathematical system;
they can be used to help us better understand the basic ideas of probability
theory, including the topics of permutations and combinations; they can
be used in the study of logic (Boolean algebra), including the designing of
computers; they can be used to enhance our ability in studying quantitative
relationships (known in the literature as the theory of functions); and
they can be used to help us gain an objective insight into the concept of
infinity. In fact, the study of sets allows us to study the entire concept of
counting in an extremely beautiful way.
The study of sets is the study of a single concept that has very many
applications. This is much more meaningful than having to learn many
concepts, each with a single application.
Actually, a set is nothing more than a collection. Thus, while the study
of sets may have had its formal origin in 1850, sets themselves must have
124 An Introduction to the Theory of Sets
1 While it might seem to be a truism that something is either true or false, one or the
other but not both, the fact remains that this demand gives rise to interesting paradoxes
that actually affect the study of the theory of sets. A note about this is supplied at the
end of this section.
This means that logic by its very nature has built into it many of the flaws
of man.
As a case in point, consider the famous barber-of-Seville paradox. The
barber is the only barber of Seville. To advertise this fact in a way that
is as flattering as possible to himself, he puts the following sign in his
window:
I shave all men of Seville who do not shave themselves; but I shave no man who
shaves himself.
If we use his sign as the “test,” let us ask ourselves whether the barber
shaves himself. If he does, then he is a man who shaves himself; yet the
sign says that he shaves no man who shaves himself. Thus we see,
according to his sign, that if the barber shaves himself then he does not
shave himself. A similar thing happens if we assume that the barber
does not shave himself. In short, according to his sign we have the
paradox that if he does, he does not; and if he does not, he does!
The only “out” of the paradox is if the barber is not a resident of Seville;
for, after all, his sign refers only to men of Seville.
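The two cases of the paradox can be run through exhaustively. The sketch below assumes the barber is himself a man of Seville, so that the sign applies to him; the variable names are our own.

```python
# The sign, applied to the barber himself, demands:
# "the barber shaves this man exactly when this man does not shave himself."
consistent_answers = []

for barber_shaves_himself in (True, False):
    # What the sign requires of the barber in this case.
    sign_requires = not barber_shaves_himself
    if barber_shaves_himself == sign_requires:
        consistent_answers.append(barber_shaves_himself)

print(consistent_answers)
```

The list of consistent answers comes out empty: neither “he does” nor “he does not” can be reconciled with the sign, which is the paradox in miniature.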
The barber-of-Seville paradox is a riddle-type version of the serious
problem that the famous philosopher, Bertrand Russell, saw in the
definition of a set. In a manner analogous to our treatment of the
barber-of-Seville paradox, Russell claimed that to make sure that a similar paradox
could not result in the theory of sets, it should be explicitly stated in the
definition of a set that no set can ever be a member of itself.
While this might seem like a difficult and abstract condition, notice
that it imposes no hardship on us; for with the exception of
barber-of-Seville type paradoxes, every set that we study has this property. By
way of illustration, observe that New England refers to a set of states,
but that New England itself is not a state. The integers are a set (of
numbers), but the integers themselves are not a number. The National
Football League is a set of football teams, but the National Football League
itself is not a football team.
Thus, from a practical point of view the restriction that a set not be a
member of itself is almost automatically fulfilled. However, by imposing
this restriction we are able to remove the paradoxes from our considerations.
There are many barber-of-Seville type riddles that make their way into
popular magazines as puzzles from time to time in different forms. A
particularly popular version involves the case of a tribe of philosophical
cannibals. Before preparing their captive for the feast, they have him
utter a statement. If his statement is considered to be true, the reward
is that the captive is allowed to die a painless death. However, if his
statement is judged to be false, then the captive must die a painful death.
What statement can the captive make that presents a dilemma to the
The Mood Setter 127
Definition
The set B is said to be well-defined if there is an objective test for
membership whereby for each b one and only one of the following two statements
is true: (1) b ∈ B or (2) b ∉ B. Moreover, B ∉ B.
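One way to picture an “objective test for membership” is as a procedure that answers yes or no for every candidate b. The sketch below is an illustration only; the set of even integers and the name is_even_integer are our own choices.

```python
def is_even_integer(b):
    """Objective test for membership in the set of even integers."""
    # Anything that is not an integer fails the test outright.
    return isinstance(b, int) and b % 2 == 0

candidates = [4, 7, 0, -10, "blue", 2.5]
verdicts = [(b, is_even_integer(b)) for b in candidates]
print(verdicts)
```

For each candidate exactly one verdict is returned, so exactly one of b ∈ B and b ∉ B holds. No comparable procedure suggests itself for a description such as “the set of interesting integers,” where the verdict depends on who is asked.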
Let us next observe that ∈ is a relation between an element and a set.
It is not a relation between two sets. With this in mind, our next endeavor
is to define a somewhat similar concept for relating two sets. For this
purpose consider a statement of the form
Figure 2.1
2 At first glance the phrase “two angles of equal measure” may seem a stilted way of
saying “two equal angles.” In the new mathematics the point is made that it is not
the angles that are equal (since they are located in different parts of space), but the
measures (be it the units, degrees, or radians, or what have you). In a similar way, it
is not the two sides of an isosceles triangle that are equal (since the lines constitute two
different sets of points), but the lengths of the two sides. However, now that this is
understood, we shall allow ourselves to say such things as “two equal angles,” to avoid
cumbersome language.
Figure 2.2
All A’s are B’s
All B’s are A’s
Figure 2.3
even if it had three more members there still would not be any! Summed
up, we define the empty set to be a set that has no members; and we denote
the empty set (also called the null set) by φ.
Do not confuse φ with 0. For example, consider the set of all numbers
that are neither positive nor negative. This particular set happens to have
one member, namely, 0. In other words, the set whose only member is 0
is not the empty set because the empty set has no members, not even 0.
What is true, however, is that 0 denotes the number of elements in φ.
To see what the empty set means in terms of grammatical structures,
consider the sentence
Let us now turn our attention to the other extreme mentioned in the
last section on subsets. That is, one cannot take more than one has. In
other words, given a set A, the smallest subset of A is φ and the largest
subset is A itself.
This can be generalized by the following rather silly question: “Does
the color blue belong to the set of all lawyers?” At first glance the answer
is no. At second glance the answer is even more emphatically no! When
we talk about the set of all lawyers, it is implicitly understood that only
people are eligible for even the test for membership. In other words, at
certain times not only do we wish to limit membership in a set, but also
to limit even those things that are eligible for the test for membership.
To formalize this idea, we introduce the following definition.
Definition
By the universe of discourse or the universal set, usually denoted by I, we
mean the set such that for any element b and every set A that is being
considered, it is true that b ∈ I and A ⊂ I.
Thus, φ and I serve as lower and upper bounds in our discussion in the
sense that no set being studied can have fewer than no elements nor can
it contain any element not already contained in I. That is, for each set
A it is true that
φ ⊂ A ⊂ I.
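The chain of inclusions can be tried on a small case. In the sketch below the universe I is taken, purely as an assumption for illustration, to be the whole numbers 1 through 10; the names EMPTY, A, and bounds_hold are likewise our own. Python's `<=` on sets means “is a subset of,” with equality allowed, which matches the way ⊂ is used in this section.

```python
I = frozenset(range(1, 11))   # an assumed universe of discourse: 1 through 10
A = frozenset({2, 4, 6})      # one of the sets being studied
EMPTY = frozenset()           # the empty set

# The empty set is a subset of A, and A is a subset of the universe.
bounds_hold = EMPTY <= A <= I

# The extreme cases are allowed as well: A may be the empty set or I itself.
extremes_hold = (EMPTY <= EMPTY <= I) and (EMPTY <= I <= I)
print(bounds_hold, extremes_hold)
```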
It turns out that without knowing it we have made use of the universal
set many times in the treatment of elementary mathematics. Consider,
[Figure 2.4, panels (a) and (b)]
Exercises
1. Determine which of the following sets are well-defined.
(a) The set of even integers
(b) The set of interesting integers
(c) The set of paintings by Picasso
(d) The set of beautiful paintings
(e) The set of irrational numbers
In each case that a set is well-defined, state the test for membership.
2. Name five subsets of the complex numbers that we have studied in this
text.
3. Explain the basic difference between the concepts of “is an element of”
and “is a subset of.”
4. Let S = {1,2,3,4}.
(a) List all the subsets of S.
(b) How many of these have exactly two elements?
5. (a) Let S denote the set of numbers named by 1, 2, 3, and 4. Does
the number named by 7 - 5 belong to S?
(b) Let S denote the set of numerals 1, 2, 3, and 4. Does the numeral
7 - 5 belong to S?
6. Describe two sets A and B for which it is true that all A’s are B’s and
that all B’s are A’s. In this case how are A and B related?
7. Describe two sets A and B for which it is true that all A’s are B’s,
but not true that all B’s are A’s. In this case, using the language of
sets, how would we indicate the relationship of A to B?
8. Use a Venn diagram to illustrate that some A’s are B’s. How would
we state this fact in the language of sets?
9. Let A denote the set of men and B, the set of immortals. Use the
language of sets to express the fact that no man is immortal.
10. Show by means of a specific example that {0} and φ are different sets.
a3 = 3 + (3 - 1)(3 - 2) (3 - 3) (3 - 4)
= 3 + (2 )(1 )(0 )(-1 )
= 3+ 0
= 3.
Similarly,
a4 = 4 + (4 - 1)(4 - 2)(4 - 3)(4 - 4)
= 4 + (3) (2) (1) (0)
= 4 + 0
= 4.
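The two computations above follow one pattern, which suggests the general rule a_n = n + (n - 1)(n - 2)(n - 3)(n - 4). That rule is inferred here from the displayed cases and should be read as an assumption. Under it, a short sketch shows what happens at the next term:

```python
def a(n):
    """Assumed general term: n + (n - 1)(n - 2)(n - 3)(n - 4)."""
    return n + (n - 1) * (n - 2) * (n - 3) * (n - 4)

terms = [a(n) for n in range(1, 6)]
print(terms)
```

For n = 1, 2, 3, 4 one of the four factors is 0, so the product vanishes and a_n = n. At n = 5 no factor is 0, and the product contributes (4)(3)(2)(1) = 24.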
A = {X : X ∉ X}.
This note reviews the notation we have learned to date. We will also
mention a few interesting asides.
To begin with, let us recall that in our discussion of binary arithmetic
we observed that a switch is either on or off, one or the other, but not both;
this was conducive to building computers in a base-two system, where,
for example, 0 could denote off and 1 could denote on.
This same structure allows us to code subsets of a given set S in the
same way. Namely, given an element, s ∈ S, and a subset, T, of S,
either s ∈ T or s ∉ T, one or the other, but not both. This means that
we can now use the code, for example, of using 0 to denote that an element
of S does not belong to a given subset, while we can use 1 to denote that it
does.
S = {a}.
a
0
1.
The first entry says that a is not in the subset; but since S has only the
element a in it, it follows that the first entry denotes the empty set. The
second entry indicates that a is to be in the subset; but since a is the only
element of S, this means that the indicated subset is S itself. This shows
that if S has only one element, then S has two subsets, namely, φ and S.
Suppose now that S has two elements. In roster form, let us say
S = {a, b}.
If we make a chart as if we were counting in base-two (that is, 00, 01, 10,
11), we would obtain
a b
0 0
0 1
1 0
1 1
This shows that S has four subsets. In fact, reading the entries from top
to bottom and using the code, we see that the four subsets are given
specifically by
φ, {b}, {a}, and {a,b} = S.
Notice the subtle difference between b and {b}. When we write b we mean
the element b. When we write {b} we mean the set whose only member
is b. In terms of notation, one would write
b ∈ B
but
{b} ⊂ B.
Aside from semantic differences there is a logical reason (which we shall
review in more detail a little later) for distinguishing an element from a set
whose membership consists of a single element. For example, suppose we
desire the solution set S of the equation x - 1 = 0. (A solution set refers
to the set of all numbers which satisfy the equation.) Then, if we write
1 ∈ S, all we are saying is that one member of the solution set is 1.
However, when we write S = {1}, we are saying that the entire solution set
consists of precisely the element 1.
Getting back to the main discussion, we have shown that if S has two
elements then S has four subsets. In a similar way, we may use the
binary number system to show that if S has three elements, then S has
eight subsets. Namely,
a b c
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1
and again reading from top to bottom, we see that the eight subsets are
φ, {c}, {b}, {b,c}, {a}, {a,c}, {a,b}, {a,b,c}.
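The coding scheme used in these charts can be carried out mechanically: count in base two, and let each digit say whether the corresponding element is in the subset. The following sketch is one possible rendering; the function name subsets_by_binary_code is our own.

```python
def subsets_by_binary_code(elements):
    """List the subsets of `elements` in the order 00...0, 00...1, ..., 11...1."""
    n = len(elements)
    result = []
    for code in range(2 ** n):
        # Write the code as n binary digits, one digit per element.
        bits = format(code, "0{}b".format(n)) if n else ""
        # A digit 1 means the element belongs; a digit 0 means it does not.
        result.append({e for e, bit in zip(elements, bits) if bit == "1"})
    return result

subsets = subsets_by_binary_code(["a", "b", "c"])
for s in subsets:
    print(sorted(s))
print(len(subsets))
```

Reading the output from top to bottom reproduces the eight subsets in the same order as the chart, beginning with the empty set and ending with S itself.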
Notice also that this procedure indicates that the number of subsets
doubles each time we add an element to the set. This follows by virtue
of the fact that if we let r denote the new element of S, then any subset
that could occur before r was added remains a subset whether or not we
include r. In terms of the above example, observe that under the b and c
columns we see that the sequence 00, 01, 10, 11 occurs twice: once with a
0 in the a column and once with a 1 in the a column.
A final remark in this connection is that the binary number code shows
us a rather nontechnical way of establishing the important result that if a
set has n elements, it has 2^n subsets.
Moreover, this offers us still another reason as to why we agree that
both φ and S are treated as subsets of S. Namely, no matter how trivial
it may seem, the fact still remains that if we want the simple result that a
set of n elements has exactly 2^n subsets then we must agree that both φ
and S are subsets of S. For example, if we wish to exclude φ and S, we
would have to say that if S has n elements, then it has 2^n - 2 subsets.
The next section deals with more computational material than we have
previously encountered.
Exercises
1. Describe the basic differences between the set-builder method and the
listing, or roster, method for describing sets.
3x = 6
3 × ______ = 6.
An open sentence is an outgrowth of the familiar “Fill in the blank”
type of exercise. For example, when we write
and then asked what x should be replaced by to convert the open sentence
into a true statement. In the subject called algebra it is conventional to
use x (or other letters of the alphabet) to denote the place holder of our
open sentences, but once we see the key idea that only the symbolism is
different we should discover that algebraic equations are a part of (that is,
form a subset of) the more general idea of “filling-in-the-blanks.”
It is not difficult to determine that we must replace the blank by 2 if we
want to form a true statement out of 3 × ______ = 6.
Traditionally, the equation 3x = 6 is solved by the process of dividing
both sides by 3, thus yielding x = 2. Since division is the inverse of
multiplication, 3x = 6 may be read as, “x is the number which when
multiplied by 3 yields 6,” but we have already seen that the name of this
number is 6/3, or 2. At any rate, it should be clear that we could solve such
an equation even had we never heard of sets.
Moreover, we could have been much more general, and considered the
equation
bx = c
where b and c represent any real numbers, but b ≠ 0.
Observe that there is no law against b being equal to zero. It is just
that if b = 0, then bx = 0, in which event either c = 0, or else there is no
solution to the open sentence. In either event, b = 0 provides a rather
dull and sterile case, so we prohibit it from happening in our discussion.
Since b ≠ 0, we can divide both sides of the equation by b and obtain
x = c/b. (The previous example was simply a special case of this with
c = 6 and b = 3.)
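The rule x = c/b can be stated as a short procedure. This sketch restricts b and c to integers and uses exact fractions so that no rounding enters; the name solve_linear is a hypothetical one chosen for illustration.

```python
from fractions import Fraction

def solve_linear(b, c):
    """Return the one number x for which b*x = c, assuming b is not 0."""
    if b == 0:
        # The case b = 0 is the one excluded from the discussion.
        raise ValueError("b = 0 is not permitted")
    return Fraction(c, b)

x = solve_linear(3, 6)          # the special case 3x = 6
check = 3 * x == 6              # substituting back should give a true statement
other = solve_linear(4, 6)      # 4x = 6, whose solution is not a whole number
print(x, check, other)
```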
Now we are ready to investigate the use of sets in this study. Suppose
someone were to ask us to find the set of all numbers that were
solutions of the equation 3x = 6. The job becomes trivial if we use the
set-builder notation. Namely, if we let S denote the set of all numbers
that satisfy the equation 3x = 6, quite simply by use of the set-builder
notation we have S = {x: 3x = 6}. Observe that while a person might
not be able to list explicitly the members of S, he is nonetheless provided
with a well-defined test for membership in S. Namely, given any number,
he need only multiply it by 3, and if the product is 6 then the number
belongs to S; otherwise it does not. For example, since 3 × 7 = 21 and
21 ≠ 6, we may objectively conclude that 7 ∉ S.
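The test for membership in S = {x: 3x = 6} is simple enough to write down directly. The sketch below is only an illustration; the name belongs_to_S is our own, and the search over the whole numbers from -10 to 10 is an arbitrary choice that happens to be wide enough here.

```python
def belongs_to_S(x):
    """Test for membership in S = {x: 3x = 6}: multiply by 3 and compare with 6."""
    return 3 * x == 6

seven_in_S = belongs_to_S(7)    # 3 * 7 = 21, and 21 is not 6
two_in_S = belongs_to_S(2)      # 3 * 2 = 6

# Applying the test to a limited range of candidates lists the members found there.
found = [x for x in range(-10, 11) if belongs_to_S(x)]
print(seven_in_S, two_in_S, found)
```

The test tells us whether a given number belongs to S without telling us in advance which numbers to try.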
Observe that the set-builder method provides us with a way of
understanding the equation, and which properties a solution must have even if
we do not know objectively how to find the solution.
Algebra is the vehicle that allows us to proceed from the implicit
set-builder form to the explicit roster method. That is, by use of algebra we
know that the only number that satisfies the equation 3x = 6 is 2. In
other words, if we now let T denote the set whose only element is 2 (that
is, T = {2}), then T is also the solution set for the equation 3x = 6.
Since an equation has but one solution set, it must be that S and T are
synonymous; hence, S = T. But while S and T are synonyms, S is in
the set-builder notation while T is an example of the roster method.
It should be easy to see that S focuses our attention on the problem,
while T focuses our attention on the solution to the problem; and it is the
process of objectively converting from S to T that is known as algebra.
More generally, in set-builder notation, the solution set of bx = c (b ≠ 0)
is {x: bx = c}, while in roster form this set is {c/b}.
Why make such a fuss over the use of sets here? After all, the solutions
of the equations do not change nor does the algebra get any easier when we
use sets. However, notice that the use of sets emphasizes that there are
two parts to solving a problem. We must first sense what the correct answer
means, and then we must be able to state what the correct answers are.
The set-builder form of the solution set emphasizes what a correct answer
means, while the roster form of the solution set lists all the correct answers.
In this context the greatly feared, and often misunderstood, concept of
algebra exists as the servant of man’s attempts to convert implicit solutions
to explicit solutions of problems. It is interesting and informative to note
that in many instances when a person cannot solve a problem, it is because
he cannot even sense what the solution must be, and hence would not be
able to recognize it even if he saw it. By proper use of set notation we
can separate the problem into these two distinct phases.
Let us now turn our attention to more complicated equations. Consider
the quadratic equation
x^2 - 4x + 3 = 0.
Suppose that we were required to find the set of all numbers that are
solutions of this equation. If we let A denote the solution set of the equation,
we obtain, virtually at a glance, the set-builder form
A = {x: x^2 - 4x + 3 = 0}.
We now have a fairly easy task, given any particular number, to determine
whether this number belongs to A. Namely, we square the number,
subtract four times the number from the square, and then add 3. If the
sum is 0, the number belongs to A; otherwise, it does not. For instance,
let us examine the number 5 for membership in A: 5^2 - 4(5) + 3 = 25 -
 x    x^2    4x    (x^2 - 4x + 3)
-3     9    -12        24
-2     4     -8        15
-1     1     -4         8
 0     0      0         3
+1     1      4         0  <-
+2     4      8        -1
+3     9     12         0  <-
+4    16     16         3
+5    25     20         8
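A table of this kind is just the test for membership in A applied to one candidate after another, and it can be regenerated as follows. This is a sketch; the names p, rows, and members are illustrative.

```python
def p(x):
    """The left side of the equation: x^2 - 4x + 3."""
    return x ** 2 - 4 * x + 3

# One row per candidate: x, x^2, 4x, and the value of x^2 - 4x + 3.
rows = [(x, x ** 2, 4 * x, p(x)) for x in range(-3, 6)]

# The candidates that pass the test for membership in A.
members = [x for x, _, _, value in rows if value == 0]

for row in rows:
    print("{:>3} {:>4} {:>4} {:>4}".format(*row))
print(members)
```

Only the whole numbers from -3 to 5 are examined, so the table by itself cannot rule out members of A that lie outside this range or between whole numbers.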
[Figure 2.5: a rectangle with sides x + a and x + b, divided into four regions of areas x^2, ax, bx, and ab]
From this result it becomes rather easy to see that the sum and product
of a and b become important items. For example, translating this result
back to our original problem of x^2 - 4x + 3 = 0, we desire to find two
numbers, a and b, such that a + b = -4, and ab = 3. While this still
involves trial-and-error, without too much difficulty we find that the two
numbers we seek are -1 and -3. Indeed, (-1) + (-3) = (-4), and
(-1)(-3) = 3. Thus,
x^2 - 4x + 3 = [x + (-1)][x + (-3)] = (x - 1)(x - 3).
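The trial-and-error search for a and b can be handed to a short program. The sketch below assumes, for illustration only, that whole numbers between -10 and 10 are worth trying; the names pairs and identity_holds are our own.

```python
# Try whole-number pairs (a, b) with a <= b, keeping those with
# sum -4 and product 3.
pairs = [(a, b)
         for a in range(-10, 11)
         for b in range(a, 11)
         if a + b == -4 and a * b == 3]

a, b = pairs[0]

# Spot-check the factorization x^2 - 4x + 3 = (x + a)(x + b) at many values of x.
identity_holds = all(x ** 2 - 4 * x + 3 == (x + a) * (x + b)
                     for x in range(-20, 21))
print(pairs, identity_holds)
```

The search returns the single pair -3 and -1, the same two numbers found above, listed in the opposite order.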
In the language of mathematics, an equation such as
x^2 - 4x + 3 = (x - 1)(x - 3)
is called an identity. An identity is an open sentence that is true for any
permissible substitution. In the language of sets an identity is an equation
whose solution set is the entire universe of discourse. That is,
Here, again, is a good place in which to observe the beauty of the
language of sets even though the actual computations do not vary in the old
and the new ways. Namely, the two equations x^2 - 4x + 3 = 0 and
(x - 1)(x - 3) = 0 do not look alike. Certainly, without a knowledge
of the structure of arithmetic, a person would not look at these two
equations and say, “Oh my! These are merely synonyms.” However, in
terms of sets, when we say that these equations are equivalent, we are
merely saying that they have the same (equal) solution sets. In symbols,
arithmetic allow us to solve the second more readily than the first, even
though the equations are equivalent. To see this, recall that if the product
of two numbers is 0 then at least one of the two numbers must be 0. Thus,
since the product of (x - 1) and (x - 3) is 0 in (x - 1)(x - 3) = 0, it
follows that either x - 1 = 0 or x - 3 = 0; or, in other words, either
x = 1 or x = 3.
Let us digress to observe that the converse of a true statement need not
be true. With regard to our problem, all we have shown is that if
(x - 1)(x - 3) = 0, then either x = 1 or x = 3. We have not shown that
if either x = 1 or x = 3, then (x - 1)(x - 3) = 0. In terms of the classical
language of mathematics, it still remains for us to check whether x = 1 and
x = 3 are indeed solutions. In short, what is commonly known as the
check of the problem is in reality, from a logical point of view, part of the
solution itself.
In summary of our somewhat lengthy discussion of x^2 - 4x + 3 = 0,
let us simply observe that in the traditional (pre-sets) presentation, one
can find the problem:
x^2 - 4x + 3 = (x - 1)(x - 3).
If x = 1, then x^2 - 4x + 3 = 1^2 - 4(1) + 3 = 1 - 4 + 3 = 0,
S = {x: x^2 - 4x + 3 = 0}
  = {x: (x - 1)(x - 3) = 0}
  = {x: x = 1 or x = 3}
  = {1, 3}.
The set-builder form centered our attention on the problem and
emphasized the test for membership in S. The use of algebra allowed us to
convert S from the set-builder form to the roster form; and this was a
distinct practical advantage, since in real life we frequently wish to express
answers as explicitly as possible. Observe that the jargon alone is
unimportant. If we do not understand what we are trying to do, and if we
have not mastered enough computational skills, then it makes little
difference whether we say, “Solve the equation . . .” (traditional language)
or, “Find the solution set . . .” (modern language); we will not be able
to do either!
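For the reader who likes to experiment, the passage from set-builder form to roster form can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names solution_set and universe are our own, and we assume a small finite universe of discourse (the integers from -10 to 10) so that every element can be tested for membership.

```python
def solution_set(universe, is_member):
    """Roster form: keep each element of the universe that passes the membership test."""
    return {x for x in universe if is_member(x)}

# Assumed universe of discourse: the integers from -10 to 10.
universe = range(-10, 11)

# Set-builder form of each equation, written as a test for membership.
S_original = solution_set(universe, lambda x: x**2 - 4*x + 3 == 0)
S_factored = solution_set(universe, lambda x: (x - 1) * (x - 3) == 0)

print(S_original)                # {1, 3}
print(S_original == S_factored)  # True: the two equations are equivalent
```

The brute-force test replaces the algebra only because the assumed universe is small; the factoring argument in the text is what justifies the answer in general.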
For the reader who is more interested in application than in theory, let
us point out again that there is interplay between theory and application
in mathematics (as well as in virtually all other subjects). Thus, the
theory of this section can be applied to practical situations. By way of
illustration, consider the following examples.
Example 1
It is desired to cut a 6-inch piece of string into two parts so that the longer
piece is twice the length of the shorter piece. What is the length of each
piece?
Solution: As usual, we pretend to know the answer. Suppose we let x
denote the length of the shorter piece. Then, since the longer piece is
twice the length of the shorter, we will denote it by 2x. Since the sum of
the two lengths must equal 6, we arrive at the open sentence
x + 2x = 6 or 3x = 6.
Observe that 3x = 6 is precisely the equation with which we elected to
begin the theoretical points of this section. Notice that the equation is
inanimate and, hence, we have no notion as to what physically significant
problems it will be called upon to help solve. Knowing that x = 2 is the
only solution to 3x = 6, and recalling that x denotes the length of the
shorter piece, we have proven that the answer to the problem is that the
shorter piece is 2 inches and the longer piece is 4 inches. A simple check
shows that this is indeed a solution, and our problem is solved. This shows
the interplay between the theory and application in this problem, for we
translated the practical problem into an abstract equation, and the abstract
equation was solved by theoretical methods.
Example 2
It is known that a particular number has the property that if its square is
subtracted from four times the number, the answer is 3. What is the
number?
Solution: Again, let x denote the number. Then x^2 denotes the square
of the number, while 4x denotes four times the number. Thus, the open
sentence becomes
4x - x^2 = 3.
Adding x^2 - 4x to both sides converts the equation into
x^2 - 4x + 3 = 0.
Our theory has already shown us that the only solutions are x = 1 and
x = 3. Thus, the required number must be either 1 or 3. (But unless
whoever posed the problem gives us additional information, we cannot tell
which of these two numbers he had in mind, only that it must have been
one of these two. More positively, we have arrived objectively at the
conclusion that any number other than 1 or 3 cannot be a solution.)
Example 3
We wish to construct a square whose perimeter exceeds its area by 3 units.
What must the length of the side of the square be?
Solution: Letting the length of the side of the square be x, its area is x^2
and its perimeter is 4x. Thus, the open sentence becomes
4x - x^2 = 3.
Exercises
1. Express the solution set S in set-builder form for each of the following
equations.
(a) 3x = 15                  (f) x^2 + 2x + 10 = 0
(b) 4x + 9 = 3x + 11         (g) x^2 + 3x = 28
(c) 5x + 6 = 3x + 2          (h) 2x^2 - 7x + 3 = 0
(d) x^2 + 7x + 10 = 0        (i) (x - 1)(x + 2)(x - 3)(x - 4) = 0
(e) x^2 - 8x + 15 = 0
2. Use any means you desire to convert each of the solution sets obtained
in Exercise 1 into the roster form.
3. Verify that
x2 + 3a:2 — x — 3 = 0
John, Bill, and Mary have, in all, 72 marbles. Bill has twice as many marbles
as Mary while John has three times as many as Mary. How many marbles does
Mary have?
6. Write the equation that will help solve the following problem and then
solve the equation.
The square of a particular natural number exceeds the number itself by 72. What
is the number?
In this section we shall continue to show how the roster method and the
set-builder method can be combined to give a modern meaning to traditional
topics.
Applying the discussion of solution sets to the topic of simultaneous linear
equations, consider, for example, the following pair of equations.
x + y = 8
x - y = 2
Here we can readily determine that x = 5 and y = 3 is the only solution
to the equations. One can see this by remembering that equals added to
equals are equal, equals multiplied by equals are equal, and so on. Thus
if we add the two equations, we obtain 2x = 10; from whence it
follows that x = 5 and y = 3.
We can use the language of sets here very well. For instance, we could
write at once that the solution set S is given by
S = {(x,y): x + y = 8 and x - y = 2}.
At this point, even the student who has not been exposed to the various
computational techniques for solving simultaneous linear equations can
still use the test-for-membership idea inherent in set-builder notation to
decide whether a given ordered pair (x,y) belongs to S. The various
computational techniques merely give us an efficient way of expressing S in
the more explicit roster form: S = {(5,3)}.
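The test-for-membership idea for ordered pairs can be sketched as follows. The function name in_S and the finite grid of candidate pairs are our own assumptions, introduced only for illustration; the text itself places no such restriction on x and y.

```python
def in_S(pair):
    """Test for membership in S = {(x,y): x + y = 8 and x - y = 2}."""
    x, y = pair
    return x + y == 8 and x - y == 2

# Assumed finite universe of ordered pairs: integer coordinates from -10 to 10.
candidates = [(x, y) for x in range(-10, 11) for y in range(-10, 11)]

# Roster form: list every candidate that passes the test.
roster = [pair for pair in candidates if in_S(pair)]

print(in_S((5, 3)), in_S((6, 2)))  # True False
print(roster)                      # [(5, 3)]
```

Note that (6,2) satisfies the first equation but fails the second, so it is rejected; membership requires both conditions at once.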
Even the familiar devices, such as equals added to equals, can be
interpreted in terms of the language of sets. More explicitly, we learn
that we can multiply both sides of an equation by the same nonzero
constant. In terms of solution sets this says only that both equations have
the same solution set. By way of illustration, when we say that x + y = 4
is the same as 2(x + y) = 2(4), all we mean is that the two different
equations have the same solution set. In a similar way, when we say that
in a system of equations we can replace any equation by itself plus any
nonzero multiple of any other equation in the system, we again mean that
such an operation does not change the solution set of the given system.
Another operation that does not change the solution set of a given system
While all of the above sets are equal, the last one is particularly easy to
convert into roster form; namely,
S = {(5,3)}.
Let us illustrate this approach with another example, only this time we
shall omit the reasons for each step. Let us find the solution to the pair
of simultaneous equations
3x + 2y = 8
4x + 3y = 15
We obtain
3x + 2y = 8          12x + 8y = 32         12x + 8y = 32
4x + 3y = 15    ~   -12x - 9y = -45   ~          -y = -13

     3x + 2y = 8         3x + 2y = 8         3x = -18        x = -6
~          y = 13   ~       -2y = -26   ~     y = 13    ~    y = 13
and the roster form of the solution set, S = {(-6,13)}, follows at once
from the last pair of equivalent equations. In still other words, leaving
out the in-between steps, the set
{(x,y): 3x + 2y = 8 and 4x + 3y = 15}
is equal to the set
{(x,y): x = -6 and y = 13}.
Our algebraic computational devices merely allowed us to establish this
equality with a minimum of trial-and-error and other frustrations.
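The claimed equality of the two sets can be spot-checked by machine. The sketch below assumes a finite grid of integer pairs (our own choice, as are the function names) and confirms that, on that grid, the two membership tests pick out exactly the same pairs.

```python
def in_original(x, y):
    """Membership test for {(x,y): 3x + 2y = 8 and 4x + 3y = 15}."""
    return 3*x + 2*y == 8 and 4*x + 3*y == 15

def in_reduced(x, y):
    """Membership test for {(x,y): x = -6 and y = 13}."""
    return x == -6 and y == 13

# Assumed finite universe: integer pairs with coordinates from -20 to 20.
grid = [(x, y) for x in range(-20, 21) for y in range(-20, 21)]

original_roster = {p for p in grid if in_original(*p)}
reduced_roster = {p for p in grid if in_reduced(*p)}

print(original_roster)                    # {(-6, 13)}
print(original_roster == reduced_roster)  # True
```

Such a check is no substitute for the algebra, since it says nothing about pairs outside the grid, but it is a quick guard against arithmetic slips.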
While the complexity of the computation may increase with the number
of equations in the system, the method itself is not affected by such changes.
By way of illustration, consider the system of three linear equations:
x + 2y + 3z = 4
2x + 5y + 7z = 9
4x + 9y + 9z = 15
In terms of our previous discussion, the solution set for this system will not
be altered if we replace the second equation by the second minus twice the
first, and the third equation by the third minus four times the first. The
resulting equivalent system now has the interesting property that x
appears in only the first equation. In other words, in the equivalent
system the last two equations involve two equations with two unknowns,
which is certainly easier to handle than three equations with three
unknowns. Once we find y and z from these last two equations we can find
x by replacing y and z in the first equation by their now known values.
Then we have
 x + 2y + 3z = 4          x + 2y + 3z = 4
2x + 5y + 7z = 9     ~         y +  z = 1
4x + 9y + 9z = 15              y - 3z = -1
       (a)                       (b)
While (b) is more convenient than (a), we may still reduce (b) by the
same technique that we used before. This time we may elect to eliminate
y from all but the second equation in (b). That is, we could have arrived
at the equivalent system
x + z = 2
y + z = 1
  -4z = -2
   (c)
by taking (b) and replacing the first equation by the first minus twice the
second, and the third by the third minus the second. The advantage of
(c) is that we can now find the value of z at once from the third equation.
With this knowledge we can return to the second equation to find y, and
to the first equation to find x. Of course, we could continue the reduction
still further by taking the third row of (c) and replacing it by the third
row multiplied by -1/4, thus obtaining the system (d). From (d) we could
obtain (e) by replacing the first row of (d) by the first minus the third, and
the second row by the second minus the third. In summary,
 x + 2y + 3z = 4         x + 2y + 3z = 4         x + z = 2
2x + 5y + 7z = 9    ~         y +  z = 1    ~    y + z = 1
4x + 9y + 9z = 15             y - 3z = -1          -4z = -2
       (a)                      (b)                 (c)

     x + z = 2          x = 3/2
~    y + z = 1     ~    y = 1/2
         z = 1/2        z = 1/2
       (d)                (e)
Thus, in a systematic way we have reduced the set (a) to the equivalent
set (e); but (e) is particularly convenient for showing that the solution
set is {(3/2, 1/2, 1/2)}. This system is straightforward and involves no
artificial devices, and it extends to any number of equations. Moreover, this
technique is self-contained and does not force us to invent such advanced
concepts as either “determinants” or “matrices” for a suitable discussion
of the process.
We now introduce matrices (which are merely rectangular arrays of
numbers) by showing that they occur in a very natural way; simply as a
way of coding the reduction of a system of equations. Specifically, notice
that when we worked on our system all the changes were with respect to the
constants, not the variables. In other words, if we invented a type of
place-value, we could perform the operations using only the constants of
the equation. For example, we could abbreviate the equation ax + by = c
by writing [a b c] where we view the first column as holding the place of x,
the second column as holding the place of y, and the third column as
representing the constant on the right-hand side of the equation. Obviously,
we can continue this abbreviation for any number of variables. Thus,
[a b c d e f g] would be used to denote ax_1 + bx_2 + cx_3 + dx_4 +
ex_5 + fx_6 = g.
Applying this system of coding to some of our previous examples we
could code
x + y = 8
x - y = 2
by the two-row, three-column array (called a 2 by 3 matrix)
[1  1  8]
[1 -1  2]
Similarly,
[3  2  8]
[4  3 15]
would represent the system
3x + 2y = 8
4x + 3y = 15
If we want to invent the concept of equivalent matrices in terms of the
facts we know about systems of linear equations we define the concept
called row-equivalence as follows: If (1) we replace any row of a matrix
by a constant nonzero multiple of that row, or (2) we change the order of
the rows of the matrix, or (3) we replace any row by that row plus a nonzero
multiple of any other row, we call the resulting matrix row-equivalent to
the original matrix. In general, any two matrices are called row-equivalent
if one can be obtained from the other by operations of the type (1), (2),
and (3), above. Notice that the rules for generating equivalent matrices
are exactly the same as the rules for solving systems of equations.
With this definition of row-equivalence we are sure that when it is used
as our code, two row-equivalent matrices represent equivalent systems of
equations.
Let us illustrate these last remarks, but first let us agree that ~ will
now denote row-equivalent matrices. With this in mind, we have
α1 [1  1  8]     β1 [1  1  8]     γ1 [1  1  8]     δ1 [1  0  5]
α2 [1 -1  2]  ~  β2 [0 -2 -6]  ~  γ2 [0 -1 -3]  ~  δ2 [0  1  3]
      (a)              (b)              (c)              (d)
where
β2 = α2 - α1
γ2 = (1/2)β2
δ1 = γ1 + γ2
δ2 = -γ2.
As a review of our code, (a), (b), (c), and (d) represent the systems
x + y = 8         x + y = 8         x + y = 8
x - y = 2      (0x) - 2y = -6          -y = -3
   (a)               (b)               (c)
x + (0y) = 5          x = 5
       y = 3    or    y = 3
            (d)
In this way matrices allow us to obtain the same results as the method
presented on the previous pages, but without having to write the x’s, y’s,
+’s, and =’s with each equation. In a sense this does for systems of
equations what synthetic division does for polynomial division. To make
the concept of row-equivalence clearer, let us review the other problems
in this light.
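As one way of reviewing an earlier problem in this light, the row operations can be carried out on the coded rows by machine. The following is a minimal sketch, not a general-purpose solver: the helper names scale and add_multiple are our own, exact fractions are used to avoid rounding, and the sequence of operations simply retraces the reduction of the system x + 2y + 3z = 4, 2x + 5y + 7z = 9, 4x + 9y + 9z = 15 given earlier.

```python
from fractions import Fraction

def scale(rows, i, k):
    """Operation (1): replace row i by the nonzero multiple k of itself."""
    rows[i] = [k * v for v in rows[i]]

def add_multiple(rows, i, j, k):
    """Operation (3): replace row i by row i plus k times row j."""
    rows[i] = [v + k * u for v, u in zip(rows[i], rows[j])]

# Each row holds the coefficients of x, y, z followed by the right-hand constant.
rows = [[Fraction(v) for v in r]
        for r in ([1, 2, 3, 4], [2, 5, 7, 9], [4, 9, 9, 15])]

add_multiple(rows, 1, 0, -2)     # second row minus twice the first
add_multiple(rows, 2, 0, -4)     # third row minus four times the first
add_multiple(rows, 0, 1, -2)     # first row minus twice the second
add_multiple(rows, 2, 1, -1)     # third row minus the second
scale(rows, 2, Fraction(-1, 4))  # third row multiplied by -1/4
add_multiple(rows, 0, 2, -1)     # first row minus the third
add_multiple(rows, 1, 2, -1)     # second row minus the third

solution = tuple(r[3] for r in rows)
print([str(v) for v in solution])  # ['3/2', '1/2', '1/2']
```

Operation (2), the interchange of rows, is not needed for this particular system and is therefore omitted from the sketch.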
In terms of our code (a) and (f) represent the equivalent systems
3x + 2y = 8               x = -6
4x + 3y = 15     and      y = 13
     (a)                   (f)
In a similar way
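happened if we left the left-hand side of each equation alone and varied
only the constants on the right-hand side. Thus, instead of the system
 x + 2y + 3z = 4
2x + 5y + 7z = 9
4x + 9y + 9z = 15
we consider
 x + 2y + 3z = a
2x + 5y + 7z = b
4x + 9y + 9z = c
which we code by
 x  y  z    a  b  c
[1  2  3 |  1  0  0]
[2  5  7 |  0  1  0]
[4  9  9 |  0  0  1]
Reducing as before, we obtain
[1  2  3 |  1  0  0]       [1  2  3 |  1  0  0]
[2  5  7 |  0  1  0]   ~   [0  1  1 | -2  1  0]
[4  9  9 |  0  0  1]       [0  1 -3 | -4  0  1]
        (a)                        (b)

    [1  0  1 |  5 -2  0]       [1  0  1 |   5   -2     0 ]
~   [0  1  1 | -2  1  0]   ~   [0  1  1 |  -2    1     0 ]
    [0  0 -4 | -2 -1  1]       [0  0  1 | 1/2  1/4  -1/4 ]
        (c)                        (d)

     x  y  z     a     b     c
    [1  0  0 |  9/2  -9/4   1/4]
~   [0  1  0 | -5/2   3/4   1/4]
    [0  0  1 |  1/2   1/4  -1/4]
        (e)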
Since b symbolizes the second equation and a the first (since these are the
constants on the right-hand side of each equation), -2a + b tells us that
we eliminate x by adding the second equation to minus twice the first (or
equivalently, by subtracting twice the first equation from the second).
A rather elementary check verifies that this is indeed the case.
The important point is that with the augmented form of the matrix, the
“second half” of it always gives us a quick way of checking how we
combined the original equations to obtain the result. For example, the last
row in (c) tells us that we can eliminate both x and y to obtain an expression
for -4z “simply” by adding minus twice the first equation and minus the
second equation to the third equation. That is,
[0  0 -4 | -2 -1  1]
says that
-4z = -2a - b + c.
As a check notice that -2a refers to
-2x - 4y - 6z = -2a. (1)
-b says that
-2x - 5y - 7z = -b (2)
and c says that
4x + 9y + 9z = c. (3)
If we add (1), (2), and (3) we obtain that
-4z = -2a - b + c
just as our matrix tells us, but the matrix technique gives us a very
systematic way to derive the result, without recourse to guessing and other
trial-and-error techniques.
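The bookkeeping role of the “second half” of the augmented matrix can be seen in the following sketch. It assumes the system x + 2y + 3z = a, 2x + 5y + 7z = b, 4x + 9y + 9z = c discussed above; the helper name add_multiple and the variable names are our own.

```python
def add_multiple(m, i, j, k):
    """Replace row i by row i plus k times row j."""
    m[i] = [v + k * u for v, u in zip(m[i], m[j])]

original = [[1, 2, 3], [2, 5, 7], [4, 9, 9]]

# Augmented form: coefficients of x, y, z, then the columns tracking a, b, c.
m = [row + unit for row, unit in zip(original, ([1, 0, 0], [0, 1, 0], [0, 0, 1]))]

add_multiple(m, 1, 0, -2)   # second row minus twice the first
add_multiple(m, 2, 0, -4)   # third row minus four times the first
add_multiple(m, 0, 1, -2)   # first row minus twice the second
add_multiple(m, 2, 1, -1)   # third row minus the second

last_row = m[2]
print(last_row)             # [0, 0, -4, -2, -1, 1]

# The right half records how the original equations were combined.
weights = last_row[3:]
combined = [sum(w * row[j] for w, row in zip(weights, original)) for j in range(3)]
print(combined)             # [0, 0, -4]
```

The second printed list is obtained by applying the recorded weights -2, -1, 1 directly to the original equations, and it reproduces the left half of the row, which is precisely the check described in the text.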
The concept of an inverse matrix plays a most vital role in matrix algebra
and linear algebra. In the simple coding system we have already supplied
an excellent but simple motivation for this concept. Namely, if we think
of inverse in the usual sense, that is, as a shift in emphasis, we see that the
inverse of expressing a, b, and c in terms of x, y, and z would be to express
x, y, and z in terms of a, b, and c. (Recall that in this sense subtraction is
the inverse of addition since it involves only a change in emphasis. For
example, 3 + 2 = 5 and 2 = 5 - 3 both convey the same number fact
only with a switch in emphasis.) In terms of the last problem of the
previous section we showed that the inverse of
[1  2  3]
[2  5  7]
[4  9  9]
was
[ 9/2  -9/4   1/4]
[-5/2   3/4   1/4]
[ 1/2   1/4  -1/4]
That is, if
 x + 2y + 3z = a
2x + 5y + 7z = b
4x + 9y + 9z = c
then
 9a/2 - 9b/4 + c/4 = x
-5a/2 + 3b/4 + c/4 = y
  a/2 +  b/4 - c/4 = z
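That the two arrays really are inverse to one another can be checked directly. In the sketch below the entries of the second array are read off from the coefficients of a, b, and c in the three equations just given; the helper matmul is our own small routine for multiplying arrays row by column, an operation the text has not yet introduced and which we assume here only for the purpose of the check.

```python
from fractions import Fraction as F

A = [[1, 2, 3],
     [2, 5, 7],
     [4, 9, 9]]

# Coefficients of a, b, c in the expressions for x, y, z.
A_inv = [[F(9, 2),  F(-9, 4), F(1, 4)],
         [F(-5, 2), F(3, 4),  F(1, 4)],
         [F(1, 2),  F(1, 4),  F(-1, 4)]]

def matmul(P, Q):
    """Row-by-column product of two rectangular arrays given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(matmul(A, A_inv) == identity)   # True
print(matmul(A_inv, A) == identity)   # True

# With a = 4, b = 9, c = 15 we recover the earlier solution.
x, y, z = (row[0] for row in matmul(A_inv, [[4], [9], [15]]))
print(str(x), str(y), str(z))         # 3/2 1/2 1/2
```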
 x  y  z  w    a  b  c  d          x  y  z  w    a  b  c  d
[1  1  1  1 |  1  0  0  0]        [1  1  1  1 |  1  0  0  0]
[2  3  3  4 |  0  1  0  0]   ~    [0  1  1  2 | -2  1  0  0]
[3  2  2  5 |  0  0  1  0]        [0 -1 -1  2 | -3  0  1  0]
[4  5  6  5 |  0  0  0  1]        [0  1  2  1 | -4  0  0  1]
           (a)                               (b)

    [1  0  0 -1 |  3 -1  0  0]        [1  0  0 -1 |  3 -1  0  0]
~   [0  1  1  2 | -2  1  0  0]   ~    [0  1  1  2 | -2  1  0  0]
    [0  0  0  4 | -5  1  1  0]        [0  0  1 -1 | -2 -1  0  1]
    [0  0  1 -1 | -2 -1  0  1]        [0  0  0  4 | -5  1  1  0]
           (c)                               (d)

    [1  0  0 -1 |  3 -1  0  0]        [1  0  0 -1 |    3   -1    0   0]
~   [0  1  0  3 |  0  2  0 -1]   ~    [0  1  0  3 |    0    2    0  -1]
    [0  0  1 -1 | -2 -1  0  1]        [0  0  1 -1 |   -2   -1    0   1]
    [0  0  0  4 | -5  1  1  0]        [0  0  0  1 | -5/4  1/4  1/4   0]
           (e)                               (f)

     x  y  z  w       a     b     c    d
    [1  0  0  0 |   7/4  -3/4   1/4    0]
~   [0  1  0  0 |  15/4   5/4  -3/4   -1]
    [0  0  1  0 | -13/4  -3/4   1/4    1]
    [0  0  0  1 |  -5/4   1/4   1/4    0]
           (g)

[1  1  1  1]           [  7/4  -3/4   1/4    0]
[2  3  3  4]    and    [ 15/4   5/4  -3/4   -1]
[3  2  2  5]           [-13/4  -3/4   1/4    1]
[4  5  6  5]           [ -5/4   1/4   1/4    0]

 x +  y +  z +  w = a
2x + 3y + 3z + 4w = b
3x + 2y + 2z + 5w = c
4x + 5y + 6z + 5w = d
implies
  7a/4 - 3b/4 +  c/4     = x
 15a/4 + 5b/4 - 3c/4 - d = y
-13a/4 - 3b/4 +  c/4 + d = z
 -5a/4 +  b/4 +  c/4     = w
Observe also that (b), (c), (d), (e), and (f) can be translated in terms of
our code to give other systems of linear equations that are equivalent to
those implied by (a) and (g). In other words, for example, comparing
(a) with (b) we see that
 x +  y +  z +  w = a          x + y +  z +  w = a
2x + 3y + 3z + 4w = b    ~         y +  z + 2w = -2a + b
3x + 2y + 2z + 5w = c             -y -  z + 2w = -3a + c
4x + 5y + 6z + 5w = d              y + 2z +  w = -4a + d
[1   1  -2 |      1      0     0 ]
[0   1  -1 |  11/15   1/15     0 ]
[0  -1   1 | -26/35      0  1/35 ]
               (c)
 x   y   z        a      b     c
[1   0  -1 |   4/15  -1/15     0 ]
[0   1  -1 |  11/15   1/15     0 ]
[0   0   0 | -1/105   1/15  1/35 ]
               (d)
In terms of our code, (a), (b), (c), and (d) represent systems of equivalent
equations. The last row of (d) is particularly interesting, however. For
our code tells us that -a/105 + b/15 + c/35 = 0 (that is, 0x + 0y + 0z),
or, clearing denominators,
-a + 7b + 3c = 0 or a = 7b + 3c.
In other words, our code now tells us that the system cannot possess a
solution unless a = 7b + 3c. The matrix code provided us with an
excellent algorithm to replace trial-and-error. Specifically, the fact that
a must equal 7b + 3c tells us that the first equation is equal to the sum
of seven times the second plus three times the third. Now we can easily
see that seven times the second is -77x + 28y + 49z = 7b, while three
times the third is 78x - 27y - 51z = 3c.
Adding these two equations we see that x + y - 2z = 7b + 3c, but our
first equation stipulates that x + y - 2z = a; and this checks with the
result a = 7b + 3c, since x + y - 2z is a unique number for any specified
values of x, y, and z.
Observe that the same result might have been obtained by trial-and-error,
but that the matrix system makes guessing completely unnecessary
here. The matrix code also gives us excellent insight into just what is
happening with the system of equations.
Getting back to the system, observe that all we have shown is that
7b + 3c = a is a necessary condition for the system to possess solutions.
We have not shown that this condition is sufficient. However, if we now
return to (d) we see that this condition is sufficient since the first two rows
of (d) translate into
x - z = 4a/15 - b/15
y - z = 11a/15 + b/15
or
x = z + 4a/15 - b/15
y = z + 11a/15 + b/15
provided that 7b + 3c = a.
7b + 3c = a is often referred to as a constraint. That is, unless
7b + 3c = a the system can have no solutions. When this happens (that is,
when a ≠ 7b + 3c), we say that the system of equations is incompatible.
Once a = 7b + 3c we have infinitely many solutions, one for each choice
of z. We say that the system has 1 degree of freedom since we can specify
one of the variables at random and still obtain unique solutions.
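The role of the constraint can be illustrated numerically. The sketch below assumes that the system under discussion is x + y - 2z = a, -11x + 4y + 7z = b, 26x - 9y - 17z = c; the second and third equations are inferred by dividing the quoted multiples (-77x + 28y + 49z = 7b and 78x - 27y - 51z = 3c) by 7 and 3, respectively. The function names are our own.

```python
from fractions import Fraction as F

def left_sides(x, y, z):
    """Left-hand sides of the three equations (as inferred in the lead-in)."""
    return (x + y - 2*z, -11*x + 4*y + 7*z, 26*x - 9*y - 17*z)

def solve(a, b, c, z):
    """Return (x, y, z) for a freely chosen z, or None if the constraint fails."""
    if a != 7*b + 3*c:
        return None                  # incompatible system
    x = z + F(4*a - b, 15)           # x = z + 4a/15 - b/15
    y = z + F(11*a + b, 15)          # y = z + 11a/15 + b/15
    return (x, y, z)

a, b, c = 13, 1, 2                   # satisfies a = 7b + 3c
for z in (0, 1, F(5, 2)):            # one solution for each choice of z
    assert left_sides(*solve(a, b, c, z)) == (a, b, c)

print(solve(1, 1, 1, 0))             # None, since 1 is not 7(1) + 3(1)
```

Each choice of z yields a different solution, which is what is meant by 1 degree of freedom.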
Now
 x  y  z  w
[1  0 -3 -2  4]
[0  1 -1  1  3]
is used as a code for the system of two equations with four unknowns;
namely,
x - 3z - 2w = 4
y -  z +  w = 3
To pursue this point just a bit further, suppose that we were given the
following system of three equations and four unknowns
 x +  y - 4z -  w = 7
2x +  y - 7z - 3w = 11     (2)
2x + 3y - 9z -  w = 17
[1  1 -4 -1   7]       [1  1 -4 -1   7]
[2  1 -7 -3  11]   ~   [0 -1  1 -1  -3]
[2  3 -9 -1  17]       [0  1 -1  1   3]
       (a)                    (b)

    [1  0 -3 -2   4]       [1  0 -3 -2   4]
~   [0 -1  1 -1  -3]   ~   [0  1 -1  1   3]
    [0  0  0  0   0]       [0  0  0  0   0]
       (c)                    (d)
[1  0 -3 -2  4]
[0  1 -1  1  3]
x = 3z + 2w + 4 and y = z - w + 3.
 x +  y - 4z -  w = a
2x +  y - 7z - 3w = b     (3)
2x + 3y - 9z -  w = c
 x  y  z  w    a  b  c
[1  1 -4 -1 |  1  0  0]       [1  1 -4 -1 |  1  0  0]
[2  1 -7 -3 |  0  1  0]   ~   [0 -1  1 -1 | -2  1  0]
[2  3 -9 -1 |  0  0  1]       [0  1 -1  1 | -2  0  1]
          (a)                           (b)
     x  y  z  w    a  b  c
    [1  0 -3 -2 | -1  1  0]       [1  0 -3 -2 | -1  1  0]
~   [0 -1  1 -1 | -2  1  0]   ~   [0  1 -1  1 |  2 -1  0]
    [0  0  0  0 | -4  1  1]       [0  0  0  0 | -4  1  1]
          (c)                           (d)
(d) is very informative. First of all, its last row tells us that our
constraint is -4a + b + c = 0, or 4a = b + c. Observe that this result
checks with the result obtained for (2). Namely, in (2), a = 7, b = 11,
and c = 17; whence it does follow that 4a = b + c. We also gain the
information that the equations in (2) were dependent but compatible
because the sum of the last two equations was equal to four times the first.
Continuing further, the first two rows of (d) tell us that to find x and y
once our constraint is met we may choose z and w at random, whereupon x
and y are determined by
x = 3z + 2w - a + b and y = z - w + 2a - b.
Again, referring to (2) we have a = 7 and b = 11, whereupon these
equations reduce to
x = 3z + 2w + 4 and y = z - w + 3
which agrees with our previous results!
An interesting special case in which the constraint must be obeyed is
when a = b = c = 0. In this event the system of equations is called
homogeneous. What this tells us with respect to our given illustration is
that if we are given the system
 x +  y - 4z -  w = 0
2x +  y - 7z - 3w = 0
2x + 3y - 9z -  w = 0
this system has 2 degrees of freedom. Namely, from x = 3z + 2w - a + b
and y = z - w + 2a - b with a = b = 0 we may choose z and w at
random; whence x = 3z + 2w and y = z - w. The check that these
are indeed solutions is left to the reader.
As a final observation concerning the result obtained in (d) we see that
the system
 x +  y - 4z -  w = 6
2x +  y - 7z - 3w = 8
2x + 3y - 9z -  w = 9
has no solutions (that is, the solution set is empty) because the constraint
4a = b + c is not met. (In other words, in this case 4a = 24 while
b + c = 8 + 9.)
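The three cases just discussed (the system (2), the homogeneous system, and the incompatible system) can be handled by one short routine. This is a sketch under the formulas derived above; the names general_solution and left_sides are our own.

```python
def left_sides(x, y, z, w):
    """Left-hand sides of system (3)."""
    return (x + y - 4*z - w, 2*x + y - 7*z - 3*w, 2*x + 3*y - 9*z - w)

def general_solution(a, b, c):
    """Return a function giving (x, y, z, w) for free z and w, or None if 4a != b + c."""
    if 4*a != b + c:
        return None                  # constraint not met: empty solution set
    def solution(z, w):
        return (3*z + 2*w - a + b, z - w + 2*a - b, z, w)
    return solution

system_2 = general_solution(7, 11, 17)
print(system_2(0, 0))                # (4, 3, 0, 0)
print(left_sides(*system_2(5, -2)))  # (7, 11, 17)

homogeneous = general_solution(0, 0, 0)
print(left_sides(*homogeneous(1, 1)))  # (0, 0, 0)

print(general_solution(6, 8, 9))     # None, since 4(6) = 24 but 8 + 9 = 17
```

The two free arguments z and w correspond to the 2 degrees of freedom mentioned in the text.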
Exercises
1. Use matrices to solve each of the following systems:
(a) 3x + 4y =
    4x + 5y = 9
(b) 6x + 5y = -3
    5x - 3y = 11
2. Use matrices to solve each of the following systems:
(a)  x +  y +  z = 3
    2x + 3y + 4z = 4
    6x + 7y + 9z = 5
(b)  x +  y +  z = 3
    3x + 4y + 5z = 4
    4x + 7y + 8z = 5
3. Solve for a, b, and c in terms of x, y, and z if:
(a)  x +  y +  z = a
    2x + 3y + 4z = b
    6x + 7y + 9z = c
(b)  x +  y +  z = a
    3x + 4y + 5z = b
    4x + 7y + 8z = c
4. Under what conditions (that is, for what values of a, b, and c) will the
following system have solutions? Describe the solution set in these
cases.
 x + 2y + 3z = a
2x + 3y + 4z = b
3x + 4y + 5z = c
5. Find the solution set of
 x +  y +  z +  w = 1
2x + 2y +  z + 2w = 2
3x +  y +  z + 2w = 3
 x + 7y + 3z + 4w = 1
part II / The Arithmetic of Sets
Definition
Let A and B be subsets of I. Then by the union of A and B, written
A ∪ B, we mean the set of all elements that belong to either A or B (unless
otherwise specified, by “either . . . or . . .” we mean at least one³). In
the language of sets
A ∪ B = {x: x ∈ A or x ∈ B}.
This is shown in terms of circle diagrams in Figure 2.6.
Definition
Again, let A and B be subsets of I. Then by the intersection of A and B,
written A ∩ B, we mean the set of all elements that belong to both A
and B or, symbolically,
A ∩ B = {x: x ∈ A and x ∈ B}.
³ While this may seem strange, the fact remains that “either . . . or . . .” is often
used in this nonmutually exclusive sense. For example, when we say that either Tom
or Jerry will go to the store, we do not preclude the possibility that both boys might go.
When we reach into a deck of cards and say that we shall draw either a spade or a face
card, we do not feel that we have lied should we draw the king of spades. On the other
hand, there are times when “either . . . or . . .” means “one or the other, but not
both.” In this event there is still no contradiction, for “one or the other, but not both”
is covered by the expression “at least one.” Thus, our only precaution is that we will
say “either . . . or . . . , but not both” when we mean “either . . . or . . .” in the
exclusive sense.
The word “intersection” can be understood from the circle diagrams by
observing that A ∩ B in Figure 2.7 is actually the intersection of the two
circles.
[Figure 2.7: the marked region denotes A ∩ B]
Definition
By the complement of A, written A', we mean the set of all elements that
belong to the universe of discourse but not to A. That is,
A' = {x: x ∈ I but x ∉ A}.
Figure 2.9 shows this in terms of circle diagrams.
Logically, “but” has the same meaning as “and” in many contexts.
In this sense A' = {x: x ∈ I and x ∉ A}. Thus, we may say that
A' = I ∩ A'.
This is not surprising since all elements belong to I. That is, if B is any
subset of I then B = I ∩ B.
Figure 2.9
[Figure 2.10: A ∪ B = (A ∩ B') ∪ (A ∩ B) ∪ (B ∩ A'). Why?]
[Figure 2.11: separate markings denote A, B ∪ C, and A ∩ (B ∪ C). (Why?)]
[Figure 2.12]
while the ones that did not could not possibly belong to students taking
both courses. We can then pick up the cards that fell off and run the
needle through the second hole. The ones that now fall off represent
students taking both courses.
In a similar way, if we wished to find those students who took at least
one of the two subjects, we could first run the needle through the first hole.
Then for the cards that did not fall off we could run the needle through
the second hole. The students that we wanted would be represented by
the total of cards that fell.
Finally, if we wished to find the students who were not enrolled in Math
101 we would insert the needle in the first hole; then the cards that had
not fallen off would represent the students who were not taking Math 101.
A little reflection should show that the above three paragraphs represent
the concepts of intersection, union, and complement, respectively.
As an example, let
I = {1,2,3,4,5,6,7,8,9,10}
A = {1,2,3,5,7,9}
B = {1,3,5,6,8,9}
C = {1,2,3,4,5,7}.
Then
A ∪ B = {1,2,3,5,6,7,8,9}
A ∩ B = {1,3,5,9}
(A ∪ B) ∩ C = {1,2,3,5,7}
(A ∪ B)' = {4,10}
A' = {4,6,8,10}
B' = {2,4,7,10}
A' ∪ B' = {2,4,6,7,8,10}
A' ∩ B' = {4,10}.
In terms of circle diagrams we could have represented the above problem
as in Figure 2.13.
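The same computations can be reproduced with the set type built into Python, in which | denotes union, & denotes intersection, and - denotes set difference. The helper complement is our own name; it takes the complement relative to the universe of discourse I, as in the definition above.

```python
I = set(range(1, 11))        # universe of discourse {1, 2, ..., 10}
A = {1, 2, 3, 5, 7, 9}
B = {1, 3, 5, 6, 8, 9}
C = {1, 2, 3, 4, 5, 7}

def complement(S):
    """All elements of I that do not belong to S."""
    return I - S

print(sorted(A | B))                          # [1, 2, 3, 5, 6, 7, 8, 9]
print(sorted(A & B))                          # [1, 3, 5, 9]
print(sorted((A | B) & C))                    # [1, 2, 3, 5, 7]
print(sorted(complement(A | B)))              # [4, 10]
print(sorted(complement(A) | complement(B)))  # [2, 4, 6, 7, 8, 10]
print(sorted(complement(A) & complement(B)))  # [4, 10]
```

Notice that the fourth and sixth lines of output agree, while the fifth differs from both.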
x ∈ A or x ∉ A.
A  B
1  1   The element belongs to both A and B.
1  0   The element belongs to A but not to B.
0  1   The element belongs to B but not to A.
0  0   The element belongs to neither A nor B.

A  B   A ∪ B
1  1     1
1  0     1
0  1     1
0  0     0

A  B   A ∩ B
1  1     1
1  0     0
0  1     0
0  0     0
This merely says that unless an element of I belongs to both A and B it
does not belong to A ∩ B.
We indicate the concept of complement by the following.
A   A'
1   0
0   1
This says that if an element belongs to A it does not belong to A', and if it
does not belong to A it does belong to A'.
I and ∅ have interesting interpretations in terms of the chart method.
Namely, I is characterized by the fact that all elements under consideration
belong to it.
A  B   I
1  1   1
1  0   1
0  1   1
0  0   1
This says that no matter which case holds (that is, whether the element
belongs to A and B, or to neither, or to one but not the other), by definition
it belongs to I.
On the other hand, the empty set is characterized by the fact that no
matter where the chosen element is, it is never a member of ∅ by the very
definition of the empty set.
A  B   ∅
1  1   0
1  0   0
0  1   0
0  0   0
That is, no matter which of the four mutually exclusive cases prevails, the
element can in no event be a member of ∅, which, of course, is the definition
of ∅.
In using the chart method for testing the equality of two sets, two sets
are equal if and only if their charts are identical, case for case. Translated
into more familiar language, this says nothing more than that an element
is a member of one set if and only if it is a member of the other; and this
was our definition of equality of sets.
For example, suppose we wish to investigate the two sets A ∪ B' and
(A ∪ B)'. We could proceed as follows.
A  B  B'   A ∪ B'   A ∪ B   (A ∪ B)'
1  1  0      1        1        0
1  0  1      1        1        0
0  1  0      0        1        0
0  0  1      1        0        1
            (1)               (2)
The columns labeled (1) and (2) are not identical, case for case. Hence,
we conclude that A ∪ B' and (A ∪ B)' are not synonyms. Among other
things, this shows us that in the arithmetic of sets the parentheses are
essential; that is, in deleting them we may change the set.
The chart actually tells us much more than this however. For example,
in the last illustration we see that (1) and (2) correspond except in the
first two lines. That is, if the first two rows of the chart could be deleted
then A ∪ B' would equal (A ∪ B)'. Of course, we cannot randomly
cross out entries, but if it should happen that A = ∅ then the first two
cases could not occur (why?). In other words, the above chart allows us
to draw the following conclusion:
A ∪ B' = (A ∪ B)' if and only if A is the empty set.
As a second illustration let us compare (A ∪ B)' and A' ∩ B'.
A  B  A ∪ B  (A ∪ B)'  A'  B'  A' ∩ B'  A' ∪ B'
1  1    1       0      0   0     0        0
1  0    1       0      0   1     0        1
0  1    1       0      1   0     0        1
0  0    0       1      1   1     1        1
               (1)              (2)      (3)
Since (1) and (2) are identical, case for case, we conclude that
(A ∪ B)' = A' ∩ B'.
However, (1) and (3) are different, and this also was accounted for in the
example of the previous section. The places of disagreement are the
second and third rows. These correspond to those elements that belong
to one but not the other of the two sets. In other words, if it were
impossible for an element to belong to one but not the other, then (1) and (3)
would have been the same. However, to delete the second and third rows
is equivalent to saying A = B. (Why? Hint: A = B means
A  B
1  0
0  1
cannot occur.) Thus, we have demonstrated the impossibility of (A ∪ B)'
being equal to A' ∪ B' except in the trivial case that A = B.
As the third example let us demonstrate that
A = (A ∩ B) ∪ (A ∩ B').
A  B  B'  A ∩ B  A ∩ B'  (A ∩ B) ∪ (A ∩ B')
1  1  0     1      0             1
1  0  1     0      1             1
0  1  0     0      0             0
0  0  1     0      0             0
(1)                             (2)
Columns (1) and (2) verify the assertion.
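The chart method lends itself readily to mechanical checking. In the sketch below, each set expression is written as a function of two truth values (whether the element belongs to A, and whether it belongs to B), with “or,” “and,” and “not” playing the roles of union, intersection, and complement. The function name chart is our own.

```python
from itertools import product

def chart(expr):
    """Column of 0's and 1's for the cases (1,1), (1,0), (0,1), (0,0), in that order."""
    return [int(expr(bool(a), bool(b))) for a, b in product((1, 0), repeat=2)]

# First illustration: A ∪ B' against (A ∪ B)'.
print(chart(lambda a, b: a or not b))         # [1, 1, 0, 1]
print(chart(lambda a, b: not (a or b)))       # [0, 0, 0, 1]

# Second illustration: (A ∪ B)' against A' ∩ B' and A' ∪ B'.
print(chart(lambda a, b: not a and not b))    # [0, 0, 0, 1]
print(chart(lambda a, b: not a or not b))     # [0, 1, 1, 1]

# Third illustration: A against (A ∩ B) ∪ (A ∩ B').
same = chart(lambda a, b: a) == chart(lambda a, b: (a and b) or (a and not b))
print(same)                                   # True
```

Two expressions denote the same set exactly when their printed columns agree, case for case.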
It should be observed here that just as certain algebra problems can be
solved by common sense rather than by technical recipes, it is also true
Figure 2.14
The chart shows that (1) and (2) denote synonyms, but that (3) is not
the same as (1) or (2). The differences occur in the fifth and seventh
rows. This in turn means that only if an element belongs to B and C but
not to A, or to C but neither to A nor B will the columns be different.
For example, let A = {1,2,3,5,6}; B = {3,4,5}; and C = {5,6}. In this
case, B ∪ C = {3,4,5,6}; hence, A ∩ (B ∪ C) = {3,5,6}; while A ∩ B =
{3,5}. Therefore, (A ∩ B) ∪ C = {3,5,6}. But this is true only
because no element of C belongs to the complement of A.
On the other hand, in our example of the previous section
A ∩ (B ∪ C) = {1,2,3,5,7},
while (A ∩ B) ∪ C = {1,2,3,4,5,7,9}.
In summary, the chart method shows us that whenever we are given
three sets A, B, and C, we may conclude that in any event A ∩ (B ∪ C) =
(A ∩ B) ∪ (A ∩ C); but it need not be true that A ∩ (B ∪ C) =
(A ∩ B) ∪ C.
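With three sets there are eight mutually exclusive cases, and the comparison of the three columns can again be left to a machine. The following sketch assumes that the cases are listed in the order (1,1,1), (1,1,0), (1,0,1), (1,0,0), (0,1,1), (0,1,0), (0,0,1), (0,0,0); the variable names are our own.

```python
from itertools import product

# The eight cases for membership in A, B, and C, in the assumed order.
cases = list(product((1, 0), repeat=3))

col1 = [int(a and (b or c)) for a, b, c in cases]          # A ∩ (B ∪ C)
col2 = [int((a and b) or (a and c)) for a, b, c in cases]  # (A ∩ B) ∪ (A ∩ C)
col3 = [int((a and b) or c) for a, b, c in cases]          # (A ∩ B) ∪ C

print(col1 == col2)    # True: (1) and (2) are synonyms

# Rows (numbered from 1) in which (1) and (3) disagree.
differing = [i + 1 for i in range(8) if col1[i] != col3[i]]
print(differing)       # [5, 7]

# The numerical example from the text, in which the two sets happen to agree.
A, B, C = {1, 2, 3, 5, 6}, {3, 4, 5}, {5, 6}
print(A & (B | C) == (A & B) | C)   # True
```

The fifth case is (0,1,1) and the seventh is (0,0,1), that is, an element of C that does not belong to A; the numerical example agrees only because its C contains no such element.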
The success of this method does not depend on the concept of sets as
much as it does on the concept that there are only two possibilities and that
they are mutually exclusive. For example, we could use the chart method
to prove certain theorems about even and odd integers (since an integer
is either even or odd, one or the other, but not both). Consider the
following problem.
Choose two integers at random. Form their sum and their product, and then
multiply the sum by the product. For example, if we choose 7 and 12 the sum is
19 and the product is 84. We then multiply 19 by 84 and obtain 1596, which is
even. Our claim is that the answer will be even no matter which two integers we
choose!
    a    b    (a + b)    ab    (a + b)(ab)
    e    e       e        e         e
    e    o       o        e         e
    o    e       o        e         e
    o    o       e        o         e
                                   (1)
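The even/odd chart can be rebuilt by letting one representative integer stand for each parity class. This is a sketch only, not part of the original text; the helper `parity` and the choice of 2 and 1 as representatives are assumptions made for illustration.

```python
from itertools import product

def parity(n):
    """Return 'e' for an even integer and 'o' for an odd one."""
    return "e" if n % 2 == 0 else "o"

# One representative integer for each parity class is enough, because the
# parity of a sum or a product depends only on the parities of its terms.
representative = {"e": 2, "o": 1}

table = []
for pa, pb in product("eo", repeat=2):
    a, b = representative[pa], representative[pb]
    table.append((pa, pb, parity(a + b), parity(a * b), parity((a + b) * (a * b))))

for row in table:
    print(*row)

# Spot check over a range of actual integers, including the 7 and 12 of the text.
always_even = all(((a + b) * (a * b)) % 2 == 0
                  for a in range(-20, 21) for b in range(-20, 21))
print((7 + 12) * (7 * 12), always_even)
```

The spot check over a finite range is not a proof; the four-row chart is what settles the claim.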
A B 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
1 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0
0 1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0
0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
Any other column consisting of 0's and 1's must look exactly like one
of these 16. We can now give names to each of these 16 columns.
The columns are arranged to highlight the fact that 1 and 16, 2 and 15,
3 and 14, and so on, represent pairs of complementary sets. Remember
that the 16 names we have given take care of all possibilities. That is,
any other combination of A and B using unions, intersections, and
complements must be a synonym for one of these 16. For example, consider
the set (A ∪ B) ∩ (A ∩ B)'.

    A    B    A ∪ B    A ∩ B    (A ∩ B)'    (A ∪ B) ∩ (A ∩ B)'
    1    1      1        1         0                0
    1    0      1        0         1                1
    0    1      1        0         1                1
    0    0      0        0         1                0
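Finding which of the 16 columns a given combination is a synonym for can be sketched in a few lines of Python. This is not part of the original text. It relies on an observation read off the 16-column table above: column k, read from top to bottom, spells the number 16 − k in binary. The names `ROWS`, `COLUMNS`, `column_of`, and `name_of` are illustrative assumptions.

```python
# Rows of the chart in the order used in the text: (A, B) = (1,1), (1,0), (0,1), (0,0).
ROWS = [(1, 1), (1, 0), (0, 1), (0, 0)]

# Column k of the 16-column table, read top to bottom, spells 16 - k in binary.
COLUMNS = {k: tuple(int(bit) for bit in format(16 - k, "04b")) for k in range(1, 17)}

def column_of(expression):
    """Evaluate a membership rule on every row and return its column of 0s and 1s."""
    return tuple(int(bool(expression(a, b))) for a, b in ROWS)

def name_of(expression):
    """Return the number (1 to 16) of the column that the rule is a synonym for."""
    target = column_of(expression)
    return next(k for k, column in COLUMNS.items() if column == target)

# (A ∪ B) ∩ (A ∩ B)': in the union, but not in the intersection.
example = lambda a, b: (a or b) and not (a and b)
print(column_of(example), name_of(example))
```

The same two helpers work for any rule built from unions, intersections, and complements of A and B, since every such rule must produce one of the 16 columns.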
for the mutually exclusive sets A and A' that make up I. That is, I =
1 ∪ 2, and 1 ∩ 2 = ∅. In still other words, 1 represents A while 2
represents A'.
Similarly, when we deal with two subsets the diagram will appear as
in Figure 2.16. In this case, the regions 1, 2, 3, and 4 make up I; and,
again, they are mutually exclusive. In particular,
1 denotes A ∩ B.
2 denotes A ∩ B'.
3 denotes A' ∩ B.
4 denotes A' ∩ B'.
Again, observe what we mean when we say that this is just a visual chart
method. In particular, to correlate the chart with the geometric regions
we have:

    A    B    Region    Set
    1    1      1       A ∩ B
    1    0      2       A ∩ B'
    0    1      3       A' ∩ B
    0    0      4       A' ∩ B'
The Arithmetic of Sets 183
merely for the sake of brevity. Again, we do not mean that A is the set
consisting of the two elements 1 and 2. In fact, if we insist on this
notational interpretation it is better if we say that A consists of the two
mutually exclusive subsets 1 and 2. In terms of circle diagrams, suppose we
wish to establish that
We could have obtained the same result by use of hatching, but notice how
much more objective our new way is.
Figure 2.17
A = {1,2,3,4}
B = {1,2,5,6}
C = {1,3,5,7},
Figure 2.21
Circle diagrams are more visual than the previously discussed chart
method, but these two methods are equivalent in that they say the same
thing. The only difference is that what one says by an appropriate listing
of a row the other says by a picture.
The geometric idea of a circle diagram is probably easier to visualize
than the abstract listing of the chart method. Observe that while the
chart method may be cumbersome, it can be used, awkwardly or not, for
three, four, five, or more subsets. On the other hand, the circle diagrams
get completely out of hand when we deal with more than three subsets.
Try, for example, to illustrate by a circle diagram the general case of four
subsets. If you do this correctly, there will be 16 regions in your diagram
and no two regions will represent the same set!
Notice that even though the chart method is easier to handle than
circle diagrams if we are dealing with more than three sets, even the chart
method gets out of hand when we have to deal with a fairly large number
of sets. For example, with 20 sets the chart method would have over a
million (2^20) rows. We shall ultimately want to develop certain rules and
A = {1, 2, 4, 5, 6, 7, 8, 9}
B = {1, 2, 3, 4, 6, 10, 11, 12}
C = {1, 2, 5}
A ∩ (B ∪ C) = {1, 2, 4, 5, 6}
(A ∩ B) ∪ C = {1, 2, 4, 5, 6}
Figure 2.28
Exercises
1. Let
the list would contain only 43 different names. To understand this,
suppose that the registrar compiles the list by placing a needle through the
hole labeled Math 209. Then 20 cards fall from the pile. If these cards
remain outside the rest of the pack, there are no longer 30 cards with the
Biology 232 hole punched left in the pack, for 7 of these fell with the
Math 209 cards. In other words, only 23 of the Biology 232 cards remain
in the pack. Thus, when the needle is placed in the Biology 232 hole only
23 additional cards fall off, accounting for the figure 43.
This same result can be visualized rather well in terms of circle diagrams.
Namely, if we assume that N(A) = 20 and N(B) = 30 we see that if
N(A ∩ B) = 7, then we get a picture like Figure 2.24.
This agrees with the proper result. To carry the example one step
further, the given information that N(A) = 20 and N(B) = 30 does very
little to determine N(A ∪ B). We do know that N(A ∩ B) is between 0
and 20, but nothing else. These two extreme cases correspond to the
events that (1) A and B share no elements in common, and (2) A is a subset
of B. This is demonstrated pictorially in Figure 2.25. In terms of the
recipe these cases lead to
(a) N(A ∪ B) = 20 + 30 − 0 = 50
and
(b) N(A ∪ B) = 20 + 30 − 20 = 30.
In other words, with regard to the present problem, unless N(A ∩ B) is
given we can only conclude that N(A ∪ B) is at least as great as 30 but
in no event in excess of 50. Moreover, we can obtain any correct answer
between 30 and 50 merely by appropriately choosing the value for
N(A ∩ B). In general, for this problem we need only let N(A ∪ B) =
50 − N(A ∩ B). For example, if we wish that N(A ∪ B) = 37 we let
N(A ∩ B) = 13. See Figure 2.26.
In summary, the trouble with writing N(A ∪ B) = N(A) + N(B) is that
A and B need not be mutually exclusive. That is, it may happen that
A ∩ B ≠ ∅. In Figure 2.26 we had no trouble when we added 7, 13, and
Figure 2.26
17; but this was because the regions in question were mutually exclusive.
In other words, if X, Y, and Z are mutually exclusive in pairs (that is,
X ∩ Y = Y ∩ Z = X ∩ Z = ∅), then N(X ∪ Y ∪ Z) = N(X) + N(Y)
+ N(Z). We will discuss this later. With regard to Figure 2.26, think of
X = A ∩ B', Y = A ∩ B, and Z = A' ∩ B. Then N(A ∪ B) =
N(X ∪ Y ∪ Z) = N(X) + N(Y) + N(Z) since in this case X, Y, and Z
are mutually exclusive in pairs.
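The counting recipe for two sets, and the range of answers it allows when N(A ∩ B) is not given, can be sketched as follows. This is not part of the original text; the function name `union_size` and the particular integer labels used to build A and B are assumptions chosen only so that the regions have 7, 13, and 17 elements.

```python
def union_size(n_a, n_b, n_both):
    """N(A ∪ B) = N(A) + N(B) − N(A ∩ B) for finite sets."""
    if not 0 <= n_both <= min(n_a, n_b):
        raise ValueError("N(A ∩ B) must lie between 0 and the smaller of N(A), N(B)")
    return n_a + n_b - n_both

# The card example: 20 in Math 209, 30 in Biology 232, 7 in both.
print(union_size(20, 30, 7))

# Every value N(A ∩ B) could take, and the union size each one forces.
possible = {k: union_size(20, 30, k) for k in range(0, 21)}
print(min(possible.values()), max(possible.values()), possible[13])

# Concrete sets with regions of 7, 13, and 17 elements.
A = set(range(1, 21))    # 20 elements (the labels are arbitrary)
B = set(range(8, 38))    # 30 elements, 13 of them shared with A
X, Y, Z = A - B, A & B, B - A
print(len(X), len(Y), len(Z), len(A | B))
```

Because X, Y, and Z are mutually exclusive in pairs, their sizes may simply be added, which is why the last line reproduces the count for the union.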
So far we have not restricted the finiteness of the sets under
consideration. To avoid ambiguity and/or misinterpretation, we shall now impose
the restriction for the remainder of this section that all sets under
consideration be finite. To see why this is necessary let us consider the
following situation. Look at the expression 5 − 3. We viewed 5 − 3 as
the number that must be added to 3 to yield 5. From a physical
interpretation point of view, we might have viewed 5 − 3 as the process of deleting
three tally marks from a collection of five. More generally, in terms of
sets suppose that B ⊂ A and that N(B) = 3, while N(A) = 5. Then
5 − 3 could be viewed as being the number of members in the set that
results when B is deleted from A. (Recall that this set is called A − B,
while A − B is merely another name for A ∩ B'.) In general, then, if
B ⊂ A we can view N(A) − N(B) as being N(A − B). The problem
occurs if B is a subset of A, and A is an infinite set, for in this case it is
not so easy to describe the number of members in A − B. For example,
let A denote the set of whole numbers. Then A is certainly an infinite
set. Now let B denote the set of even whole numbers. Then it is clear
that B is an infinite set that is also a subset of A. In this case A − B
would be the set of odd whole numbers, which is an infinite set; hence,
N(A − B) would be infinite. On the other hand, suppose B were the
set of all whole numbers greater than 10. That is,
B = {11,12,13,14,15,16,17,18, ...}.
In this case B would also be an infinite subset of A. If we now delete B
from A to form A − B we find that
A − B = {1,2,3,4,5,6,7,8,9,10}
or
N(A − B) = 10.
In summary, if A and B are infinite sets and no further specifications are
made, then we cannot, without the risk of misinterpretation, give a
well-defined meaning to N(A) − N(B). In other words, the formula
N(A ∪ B) = N(A) + N(B) − N(A ∩ B)
becomes troublesome if A and B are infinite sets since (1) we cannot be
S = spades
D = diamonds
C = clubs
Figure 2.27
N(H ∪ F) = 13 + 16 − 4 = 25.
N(A ∪ B ∪ C).
We shall show that
N(A ∪ B ∪ C) = N(A) + N(B) + N(C) − N(A ∩ B)
− N(A ∩ C) − N(B ∩ C) + N(A ∩ B ∩ C).
(b) if the name appeared on two of the lists it was counted once too
many and hence should be subtracted once,
(c) if the name appeared on all three lists then it should be subtracted
twice.
Thus, for example, if an element belongs only to A it is counted only
once in the sum N(A) + N(B) + N(C). If it belongs to just B and A,
it is counted twice in the sum N(A) + N(B) + N(C) but it is
subtracted once in N(A ∩ B). Finally, if the element belongs to A and
B and C it is counted three times in N(A) + N(B) + N(C) and then
subtracted out three times in N(A ∩ B) − N(A ∩ C) − N(B ∩ C).
Now, however, it is not counted at all, but N(A ∩ B ∩ C) counts it
once.
(2) In terms of circle diagrams (Figure 2.28) let us indicate by 1, 2, or 3
Figure 2.29
For more than three sets a rather interesting pattern prevails, which we
present without proof. For example,
In a certain school, students are required in order to graduate to take at least one
of three languages: French, German, or Spanish. In a certain graduating class we
find that 20 students took all three languages, 35 took French and German, 40 took
French and Spanish, 50 took German and Spanish, 90 took French, 70 took German,
and 110 took Spanish. How many were in the graduating class?
In solving this problem we must be careful not to add the given numbers.
For if we do, we count certain students more than once. One solution
(letting F, G, and S denote the set of students taking French, German, and
Spanish, respectively) is to use our formula with N(F) = 90, N(G) = 70,
N(S) = 110, N(F ∩ G) = 35, N(F ∩ S) = 40, N(G ∩ S) = 50, and
N(F ∩ G ∩ S) = 20. The number in the graduating class is
N(F ∪ G ∪ S), and we have
N(F ∪ G ∪ S) = 90 + 70 + 110 − 35 − 40 − 50 + 20 = 165.
Thus, there were 165 members of the graduating class. We could have
used circle diagrams. In this event, since 20 students take all three, we
would have Figure 2.31. Then, since 35 take both French and German
(this number includes the 20 who take all three), we would draw Figure
2.32. Continuing in this way (the details are left to the reader), we get
Figure 2.33. Not only does this give us the same answer, but since the
regions in the diagram are mutually exclusive we have such other results
as: There were 5 members of the graduating class who took German but
neither French nor Spanish.
Figure 2.33
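Both routes to the answer, the three-set formula and the region-by-region circle diagram, can be sketched together. This is an illustration only and not part of the original text; the function name `union_size3` and the dictionary of region labels are assumptions.

```python
def union_size3(n_a, n_b, n_c, n_ab, n_ac, n_bc, n_abc):
    """N(A ∪ B ∪ C) by the three-set counting formula."""
    return n_a + n_b + n_c - n_ab - n_ac - n_bc + n_abc

# The language problem: F = French, G = German, S = Spanish.
F, G, S = 90, 70, 110
FG, FS, GS = 35, 40, 50
FGS = 20

class_size = union_size3(F, G, S, FG, FS, GS, FGS)

# Work outward from the centre of the circle diagram to get the seven
# mutually exclusive regions.
regions = {
    "all three": FGS,
    "French and German only": FG - FGS,
    "French and Spanish only": FS - FGS,
    "German and Spanish only": GS - FGS,
}
regions["French only"] = F - (FG - FGS) - (FS - FGS) - FGS
regions["German only"] = G - (FG - FGS) - (GS - FGS) - FGS
regions["Spanish only"] = S - (FS - FGS) - (GS - FGS) - FGS

print(class_size, regions["German only"], sum(regions.values()))
```

Since the seven regions are mutually exclusive, adding them must give the same total as the formula, which serves as a check on both.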
Exercises
1. Determine N(A ∪ B) under each of the following sets of conditions.
(a) N(A) = 15, N(B) = 25, N(A ∩ B) = 7
(b) N(A) = 15, N(B) = 25, N(A ∩ B) = 15
(c) N(A) = 15, N(B) = 25, N(A ∩ B) = 0
(d) N(A) = 15, N(B) = 25, N(A ∩ B) = 20
2. Use circle diagrams to answer the following questions. We are given
the three sets A, B, and C, and the following information.
N(A ∩ B ∩ C) = 1      N(A) = 14
N(A ∩ B) = 5          N(B) = 13
N(A ∩ C) = 3          N(C) = 12
N(B ∩ C) = 4
Find the following numbers.
(a) N(A ∪ B ∪ C)        (d) N(A ∪ C')
(b) N(A ∩ B' ∩ C)       (e) N(A ∩ C')
(c) N(A ∩ B ∩ C')
3. In a survey of 50 students it was found that 14 studied German, 13
studied French, 12 studied Spanish, 5 studied both German and French,
4 studied both Spanish and French, 3 studied both German and Spanish,
and only 1 studied all three languages. How many students in the
survey were taking none of these languages? Explain.
A = {5x: x ∈ N}
B = {7y: y ∈ N}
where N denotes the set of natural numbers. We have taken the liberty
of restricting our universe of discourse to the set of natural numbers.
In plain English, A merely names the set of all natural numbers that are
divisible by 5, while B names the set of all natural numbers divisible by 7.
We wish to find those natural numbers that are divisible by both 5 and 7,
but in the language of sets this is precisely the set A ∩ B. Thus, even
if a person did not know how to find such numbers systematically he could
at least recognize them if he saw them. For example, given the number
25 we see that it is divisible by 5 but not by 7. Hence, 25 ∈ A, but
25 ∉ B. In other words, 25 ∈ A ∩ B'. In fact, as a special case of the
more general results described in Section 2.6, observe that each natural
number belongs to exactly one of the four mutually exclusive sets: A ∩ B',
A' ∩ B, A' ∩ B', or A ∩ B. This is shown in terms of a diagram in
Figure 2.34.
Figure 2.34
In our case, A ∩ B' denotes the set of natural numbers that are divisible
by 5 but not by 7; A' ∩ B denotes the natural numbers that are divisible
by 7 but not by 5; A' ∩ B' denotes the set of natural numbers that are
divisible by neither 5 nor 7; while A ∩ B denotes the set of natural numbers
divisible by both 5 and 7, that is, the common multiples of 5 and 7.
Some further examples are
25 ∈ A ∩ B'
14 ∈ A' ∩ B
33 ∈ A' ∩ B'
70 ∈ A ∩ B.
A = {5,10,15,20,25,30,35,40, . . .}
B = {7,14,21,28,35,42,49, . . .}
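The test for membership described above translates directly into code. The following sketch is not part of the original text; the function name `region` and the cutoff of 150 used for listing members are assumptions made for illustration.

```python
def region(n):
    """Name the one set among A ∩ B', A' ∩ B, A' ∩ B', A ∩ B that contains n.

    A is the set of natural numbers divisible by 5, B those divisible by 7.
    """
    in_a = n % 5 == 0
    in_b = n % 7 == 0
    return ("A" if in_a else "A'") + " ∩ " + ("B" if in_b else "B'")

for n in (25, 14, 33, 70):
    print(n, "∈", region(n))

# Roster form of the first few members of A ∩ B, found by testing membership.
common = [n for n in range(1, 151) if region(n) == "A ∩ B"]
print(common)
```

Note that the code only recognizes members; it does not explain why the common multiples turn out to be the multiples of 35.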
Exercises
1. Let A denote the natural numbers that are divisible by 9; and B, the
natural numbers that are divisible by 15.
(a) List three members of A ∩ B'.
(b) List three members of A' ∩ B'.
(c) List three members of A' ∩ B.
(d) What is the smallest natural number that belongs to A ∩ B?
(e) Describe A ∩ B by the set-builder notation in two different ways.
2. Among the sets of numbers studied by the ancient Greeks were the
perfect numbers. A number is said to be perfect if the sum of the natural
numbers (excluding the number itself) that divide it is equal to the
given number. For example, 6 is a perfect number since its proper
divisors are 1, 2, and 3; and 1 + 2 + 3 = 6. Use the test for membership
to determine which of the following are also perfect numbers.
(a) 12 (b) 16 (c) 43 (d) 28 (e) 65.
3. Even today it is not known whether any odd perfect numbers exist,
yet Euclid was able to describe the even perfect numbers completely.
Namely, he showed that N would be a perfect number if and only if
N = 2^(p−1)(2^p − 1), where 2^p − 1 is a prime number. For example, if
p = 2, the recipe yields N = 2 × 3 = 6. Use this recipe to find the
four smallest perfect numbers.
4. List the five smallest members of M, if M = {2^p − 1: p is a prime}.
6 + (10 - 6) = 6 + 4 = 10
I + (10 — ^ + 9^ = 10
14 + (10 - 14) = 14 + ( - 4 ) = 10.
numbers may serve as a solution of neither equation, or of one but not the
other, or of both.
To indicate that we want those pairs of numbers that are solutions to
both equations we usually write
the following pairs of simultaneous equations have the same (equal)
solution sets.
(1) x + y = 10      (2) x + y = 10      (3) x = 8
    x − y = 6           2x = 16             y = 2
It is simply that (3) is easier to solve at sight than either (1) or (2).
To summarize the above problem: if we wish to express the set of
simultaneous solutions of
x + y = 10
x − y = 6
we need only let A = {(x,y): x + y = 10} and B = {(x,y): x − y = 6}.
Then the solution set is A ∩ B. We can then use algebra to convert
A ∩ B from its set-builder form to the perhaps more convenient form of
A ∩ B = {(8,2)}.
Again, observe that the role of sets in no way changes the solution to the
problem; it only affords a uniform vehicle for stating the problem in familiar
language and helps us separate the actual problem from the solution of the
problem.
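The idea that the simultaneous solutions are exactly A ∩ B can be made concrete by listing both sets and intersecting them. This is a sketch only, not part of the original text. The true solution sets contain infinitely many pairs, so the code restricts attention to integer pairs in a small window; that restriction, and the name `WINDOW`, are assumptions made so the sets can be written out.

```python
# Restrict attention to integer pairs in a small window; this finite window is
# an assumption made only so that the sets can be listed explicitly.
WINDOW = range(-20, 21)

A = {(x, y) for x in WINDOW for y in WINDOW if x + y == 10}
B = {(x, y) for x in WINDOW for y in WINDOW if x - y == 6}

solution_set = A & B          # A ∩ B
print(solution_set)

# Systems (2) and (3) of the text have the same solution set on this window.
system_2 = {(x, y) for x in WINDOW for y in WINDOW if x + y == 10 and 2 * x == 16}
system_3 = {(x, y) for x in WINDOW for y in WINDOW if x == 8 and y == 2}
print(solution_set == system_2 == system_3)
```

Listing pairs in this way finds the answer by recognition rather than by algebra, which mirrors the distinction the text draws between stating a problem and solving it.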
Exercises
1. Let A denote the solution set of 3x + 4y = 11 and let B denote the
solution set of 2x + 3y = 7.
(a) Find an ordered pair (x,y) that belongs to A ∩ B'.
(b) Find a member of A' ∩ B.
(c) Find a member of A' ∩ B'.
(d) Find all members of A ∩ B.
(e) Granted that A ∩ B, A' ∩ B, A ∩ B', and A' ∩ B' subdivide
the ordered pairs into mutually exclusive partitions, does it follow
that these four sets each have the same number of elements?
Explain.
2. Find the solution set for each of the following in both set-builder and
roster form.
2x + 7y = 32      7x + 4y = 34
2x + 5y = 18      2x + 3y = 24
3. Give an example of an equation whose solution set has no members in
common with the members of the solution set of 3x + 17y = 87.
Figure 2.36
Figure 2.38
Figure 2.39
Here, then, we see that the great feat of Descartes was to translate the
solution set of algebraic equations into pictures, and conversely, pictures
into the solution set of algebraic equations.
Thus, while an equation and its graph are conceptually very different
(as different as algebra and geometry), we see that we can get a better
intuitive grasp of algebraic results if we can visualize the appropriate
picture.
By way of further illustration in terms of graphs, the solution set of
x + y = 7
x + y = 3
is empty because the graphs of the equations are parallel lines and, hence,
share no points in common (Figure 2.40).
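Whether two lines of the form ax + by = c cross in a single point can be decided from their coefficients. The sketch below uses Cramer's rule, a standard technique that the text itself does not name, so it should be read as a supplementary illustration; the function name `solve_pair` is an assumption.

```python
from fractions import Fraction

def solve_pair(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 exactly.

    Returns the single point (x, y) where the two lines cross, or None when
    the determinant is zero, that is, when the lines are parallel or coincide
    and so do not meet in exactly one point.
    """
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return (x, y)

print(solve_pair(1, 1, 10, 1, -1, 6))   # x + y = 10, x - y = 6
print(solve_pair(1, 1, 7, 1, 1, 3))     # x + y = 7,  x + y = 3
```

Exact fractions are used so that a crossing point with non-integer coordinates is reported without rounding.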
Exercises
1. Let us accept the fact that the solution set of any equation of the form
ax + by = c (where a, b, and c are any numbers and not both a and b
are 0) has as its graph a straight line. Graph the following equations:
(a) 3x + 4y = 11
(b) 7x + 9y = 23.
Drawing the graphs to accurate scale, determine where the curves
intersect. Then solve the two equations simultaneously and see how
this result checks with the graphical result.
2. Repeat the instructions from Problem 1 for the following pairs of
equations:
(a) 2x + by = 10      (c) x − 2y = 7
(b) x + 2y = 6;       (d) 3x + y = 28.
In the same way, if the use of sets could ensure a uniform mathematical
language, it would not be nearly as upsetting for us to advance in
mathematical sophistication; for while the game might get tougher to play, we
would still have the psychological advantage of knowing that we are at
least still playing the same game.
However, we have hardly begun to show why sets are important. The
next section will show still another excellent use for the concept of sets.
2.11 INTRODUCTION
Up to now we have mainly used sets to establish the proper vocabulary.
Now we present one other type of set, which in many ways is a dimension
above other sets. It is called a Cartesian product.
This type of set has many important applications. In the following
sections we shall study two such uses of Cartesian products. One use is
with regard to counting procedures in such problems as counting the
number of permutations and combinations of certain events. The other
use is more abstract, wherein we shall use Cartesian products to dissect
the concept known as relations. In particular, we shall examine the
important idea of an equivalence relation.
Before doing these things, however, it is best that we try to find some
motivation for introducing the idea of a Cartesian product; and, hopefully,
this motivation will be supplied in the next two sections.
R × R = {(x,y): x ∈ R and y ∈ R}

    (−,+)      (+,+)
     II          I
     III         IV
    (−,−)      (+,−)
Figure 2.41
Figure 2.42
Looking at the picture, it should not be difficult to guess why this set
of numbers is called an interval. We can see that the collection is a
connected set of points. Notice that 1 and 2 are not themselves members
of the set being discussed. For this reason we refer to the interval as being
an open interval, since it is open at both ends. In other words, the end
points are not members of the collection. On the other hand, had we
referred to all numbers that are at least as great as 1 and no greater than
2, we would have called this a closed interval. The only difference between
Open interval from 1 to 2.
Closed interval from 1 to 2.
Figure 2.43
indicate the open interval from 1 to 2, and [1,2] to indicate the closed
interval from 1 to 2. From this we are led to the following definition:
If a < b, then by (a,b) we mean the set of all real numbers that are
greater than a but less than b. This subset of the real numbers is called
the open interval from a to b. In the language of sets (a,b) denotes
{x ∈ R: x > a and x < b}, and this is usually abbreviated by the
expression a < x < b.
We use the notation [a,b] to denote the set of real numbers that are no
less than a and no greater than b. This subset is called the closed interval
from a to b. In the language of sets, [a,b] = {x ∈ R: x ≥ a and x ≤ b},
and this is often abbreviated by a ≤ x ≤ b.
Notice that an interval need not be either open or closed. For example,
we might write (1,2] to indicate that the interval excludes 1 but includes
2. Thus, (1,2] would denote the set written as 1 < x ≤ 2; and we would
say that this interval is open on the left and closed on the right.
Obviously, we can talk about the union or the intersection of two or
more intervals since we can talk about the union and intersection of
any sets. In particular, intervals are sets. By way of illustration,
(1,3) ∩ (2,4) denotes those real numbers less than 3 and greater than 1,
while at the same time greater than 2 and less than 4. Once the language
is unscrambled, it is not too difficult to see that (1,3) ∩ (2,4) = (2,3).
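The definitions of open, closed, and half-open intervals, together with the intersection just computed, can be sketched as a small Python class. This is not part of the original text; the class name `Interval`, its field names, and the helper `intersect_open` are assumptions made for illustration, and the helper handles only the open-interval case discussed above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    """An interval from low to high; each end may be open or closed."""
    low: float
    high: float
    low_closed: bool = False
    high_closed: bool = False

    def __contains__(self, x):
        above = x >= self.low if self.low_closed else x > self.low
        below = x <= self.high if self.high_closed else x < self.high
        return above and below

def intersect_open(p, q):
    """Intersection of two open intervals, or None when they do not overlap."""
    low, high = max(p.low, q.low), min(p.high, q.high)
    return Interval(low, high) if low < high else None

print(intersect_open(Interval(1, 3), Interval(2, 4)))
print(1 in Interval(1, 2), 1 in Interval(1, 2, True, True), 2 in Interval(1, 2, False, True))
```

The membership test is the set-builder condition written out: a number belongs to (1,2] exactly when 1 < x ≤ 2.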
Cartesian Products 211
Here again, we can use the number line as a visual aid. Namely, (1,3)
corresponds to Figure 2.44 while (2,4) corresponds to Figure 2.45.
Superimposing the two diagrams, we find that the answer is the cross-hatched
region, and that this region is clearly (2,3), as shown in Figure 2.46.
Figure 2.44
Figure 2.45
Figure 2.46
Figure 2.47
2 < y < 4 would be represented by Figure 2.48. Hence, the only way
1 < x < 3 and 2 < y < 4 is if the conditions exist as in Figure 2.49, and
we see that the Cartesian product of two intervals is a two-dimensional
subset of the plane.
Figure 2.48
Required region is interior of cross-hatched region.
Figure 2.49
Exercises
1. Explain the correspondence between sets of real numbers and subsets
of the number line.
2. Explain the correspondence between ordered pairs of real numbers and
subsets of the Cartesian plane.
3. In terms of Cartesian products explain why (0,0) is the same as {0} ×
{0}.
4. Indicate the intervals (3,7) and (4,9) on the x axis. Then describe
both the union and the intersection of these two sets.
5. In terms of the Cartesian plane describe (3,7) × (4,9).
6. Repeat the processes of Exercises 4 and 5 using the following pairs
of sets.
Notice that S × T and T × S are not the same sets since the pairs of
one have a different order than the pairs of the other. That is,
Figure 2.50
Still another way of seeing the difference is to think of the first member
as telling us the number of tens and the second as telling us the number of
ones. Thus, S × T would induce the numbers 14, 15, 24, 25, 34, and 35;
while T × S would induce 41, 51, 42, 52, 43, and 53. These are clearly
two different sets of numbers.
A third way of viewing the numbers in S × T is in terms of branch
diagrams. That is, starting with S we write the members of S on one
line: 1 2 3. Next we observe that starting with any member of S the
next member can be any element of T. Since T has two elements there are
two lines (branches) drawn from each element of S.

      1           2           3
     / \         / \         / \
    4   5       4   5       4   5

We could then read all possibilities by starting anywhere on the top line
and following a branch to the bottom.
Using any method that seems the most convenient, we should not find it
difficult to see that if S and T are arbitrary sets, then
N(S × T) = N(S) × N(T).
N(S × T) = 3 × 2 = 6,
(Recall that N(S) and N(T) are numbers and that the product of two
numbers does not depend on the order of the factors.)
More intuitively, we are saying that if we reverse the order of the
elements that make up the pairs, then we change the pairs but we do not
change the number of the pairs.
It makes sense to talk about A × B × C, where A, B, and C are sets.
Namely, A × B × C = {(a,b,c): a ∈ A, b ∈ B, c ∈ C}. In this way
N(A × B × C) = N(A) × N(B) × N(C), and we can extend this idea
to the Cartesian product of four or more sets as well.
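The counting rule for Cartesian products can be tried out with the standard library function `itertools.product`. This is a sketch only, not part of the original text. The sets S = {1,2,3} and T = {4,5} are inferred from the numbers 14, 15, 24, ... and the branch diagram rather than stated here, and the three sets in the last step are hypothetical examples.

```python
from itertools import product

S = [1, 2, 3]
T = [4, 5]

S_cross_T = set(product(S, T))   # ordered pairs (s, t)
T_cross_S = set(product(T, S))   # ordered pairs (t, s)

# Read each pair as (tens digit, ones digit), as in the text.
print(sorted(10 * a + b for a, b in S_cross_T))
print(sorted(10 * a + b for a, b in T_cross_S))

# Different sets of pairs, but the same number of pairs.
print(S_cross_T == T_cross_S, len(S_cross_T), len(T_cross_S))

# Three factors: the count is the product of the three sizes.
A, B, C = "ab", "xyz", [0, 1]     # hypothetical sets chosen for illustration
print(len(set(product(A, B, C))), len(A) * len(B) * len(C))
```

Reversing the order of the factors changes every pair but leaves the count unchanged, which is the point made about N(S × T) and N(T × S).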
For example, suppose we are given the digits 1, 2, 3, 4, and 5, and we are
told to arrange them without repetitions to form a five-digit number.
5 × 4 × 3 × 2 × 1 = 120. Thus, there are 120 different five-digit
numbers that can be formed in this way. How can we see this in terms of
Cartesian products? Let A denote the set of digits that can be used in the
ones column. Then N(A) = 5. Let B denote the set of digits that can be
used for the tens column once the digit for the ones column is chosen. Since
no digit can be used twice, N(B) = 4. Once the units and tens digits have
been selected let C denote the set of digits that can be used in the hundreds
column. Again, since no digit can be used more than once, N(C) = 3.
Continuing in this way, we define D and E so that N(D) = 2 and N(E) = 1.
Since the number is an element of A × B × C × D × E, and since
N(A × B × C × D × E) = N(A) × N(B) × N(C) × N(D) × N(E) =
5 × 4 × 3 × 2 × 1, we see how the answer to the problem was obtained. We could
have actually counted the 120 different numbers, but that is quite tedious.
Even so, it is still better than nothing. In many problems of everyday
life there is no way to replace tedious trial-and-error. Fortunately,
however, the invention of computers, which “read” quickly, enables us to
examine many cases in a brief interval of time.
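The tedious count of the 120 five-digit numbers is exactly the kind of case examination a computer does quickly. The following sketch is not part of the original text; it uses the standard library function `itertools.permutations`, and the variable names are illustrative.

```python
from itertools import permutations

digits = (1, 2, 3, 4, 5)

# Every arrangement of the five digits with no repetition, read as a number.
numbers = [int("".join(map(str, arrangement))) for arrangement in permutations(digits)]

print(len(numbers), 5 * 4 * 3 * 2 * 1)
print(min(numbers), max(numbers))
```

Listing the arrangements confirms the count but gives no insight into why it equals 5 × 4 × 3 × 2 × 1; the Cartesian product argument supplies that.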
We shall pursue the development of computational skill in the use of
the number of elements in a Cartesian product in the next two sections.
However, our main point here was to show how the concept of a Cartesian
product, once extended from the realm of the plane, can be used as an aid
when we wish to quickly count large numbers of possibilities.
Exercises
1. Explain what we mean when we say that A × B and B × A are not
equal, but that N(A × B) and N(B × A) are.
2. If N(A) = 4 and N(B) = 5, show by graph technique that N(A × B) =
20.
3. Do as in Exercise 2, using branch diagrams.
4. Let A = {1,2,3,4}.
(a) How many elements belong to A × A?
(b) List the elements of A × A.
(c) How many two-digit numbers can be formed using the digits 1, 2, 3,
or 4, if the same digit may be used more than once?
(d) How many such numbers can be formed if we allow no digit to be
repeated?
5. A restaurant offers a choice of four different fruit juices, fifteen different
sandwiches, five different dinner beverages, and ten different desserts.
How many different meals can be served if a meal is to consist of juice, a
sandwich, a beverage, and a dessert?
6. In how many ways can the digits 1, 2, 3, 4, 5, 6, 7, 8, and 9 be arranged
to form a nine-digit number?
7. A baseball manager is not concerned with the order in which his nine
players bat. How many different batting orders can he invent once
his nine players are selected?
8. The numerals 1, 2, 3, 4, 5, 6, 7, 8, and 9 are placed into a bag. A person
is to draw the numerals from the bag, one at a time, reading each
numeral drawn. What is the possibility that he will draw the numerals
in the consecutive order from 1 through 9? Explain.
1! = 1
2! = 1 × 2 = 2
3! = 1 × 2 × 3 = 6
4! = 1 × 2 × 3 × 4 = 24
5! = 1 × 2 × 3 × 4 × 5 = 120
6! = 1 × 2 × 3 × 4 × 5 × 6 = 720
7! = 1 × 2 × 3 × 4 × 5 × 6 × 7 = 5040
8! = 1 × 2 × 3 × 4 × 5 × 6 × 7 × 8 = 40,320
9! = 1 × 2 × 3 × 4 × 5 × 6 × 7 × 8 × 9 = 362,880
10! = 1 × 2 × 3 × 4 × 5 × 6 × 7 × 8 × 9 × 10 = 3,628,800.
Notice that we get from one factorial to the next by multiplying by the
next natural number.
2! = 2(1!) = 2 × 1 = 2
3! = 3(2!) = 3 × 2 = 6
4! = 4(3!) = 4 × 6 = 24
5! = 5(4!) = 5 × 24 = 120, and so on.
In general, for any natural number n, (n + 1)! = (n + 1)(n!).
While 0 is not a natural number, it is customary to define 0! to be 1.
Remember that definitions are man-made and do not have to be logical.
That is, we could have defined 0! to be anything at all, especially since
our original definition for n! required n to be a natural number and 0 is
not a natural number. However, it is also customary to choose definitions
that will preserve properties we wish to use. The key computational
fact about factorials is that, as seen above, (n + 1)! = (n + 1)(n!).
Suppose we like this recipe to the extent that we would like it to remain true
even if n = 0. Then, logically, we have no choice but to replace n by
0 in the recipe and see what this implies. Replacing n by 0 in (n + 1)! =
(n + 1)(n!), we see that this leads to
(0 + 1)! = (0 + 1)(0!)
or
1! = 1(0!)
or
1 = 0!
Thus, we see that if we wish to preserve this recipe (and the choice is ours
to make) then we must define 0! = 1. In terms of this new language, if
N(S) = n then there are n! distinct permutations that can be defined on
S. Let us apply this idea to some problems.
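The recipe (n + 1)! = (n + 1)(n!) together with the convention 0! = 1 is all that is needed to compute any factorial. The sketch below is not part of the original text; the function name `factorial` is chosen for illustration, and the standard library's `math.factorial` is used only as a cross-check.

```python
import math

def factorial(n):
    """n! built from the recipe (n + 1)! = (n + 1)(n!), starting at 0! = 1."""
    if n < 0:
        raise ValueError("n must be 0 or a natural number")
    result = 1                      # 0! = 1 by definition
    for k in range(1, n + 1):
        result *= k                 # k! = k * (k - 1)!
    return result

for n in range(0, 11):
    print(f"{n}! = {factorial(n):,}")

# Cross-check against the standard library.
print(all(factorial(n) == math.factorial(n) for n in range(0, 11)))
```

Starting the running product at 1 is precisely the choice 0! = 1; with any other starting value the recipe would fail at n = 0.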
Example 1
How many nine-digit numbers can be formed by the numerals 1, 2, 3, 4,
5, 6, 7, 8, and 9 if each digit is to be used exactly once?
Example 2
How many nine-digit numbers can be formed using the digits 1, 2, 3, 4, 5,
6, 7, 8, and 9 if we can use each digit as often as we wish?
SOLUTION: Here we come to an interesting example to show that while
recipes may become automatic in mathematics, we can virtually never
automatically apply a particular recipe to a given situation. In this
instance the reader who tries to use permutations will be in great difficulty
since permutations require that no member be used more than once. In
this problem we may use some digits more than once, and others not at
all. Let us visualize the problem as follows.
We are to fill in the nine spaces below with any digit from the given
collection.
Since we are not restricted to specific digits in each space, each space may
be filled in by any one of nine elements. Let us indicate this by writing
9 9 9 9 9 9 9 9 9
where here we are not indicating the number 999,999,999; but rather that
each space may be filled in in nine different ways. The answer to the
problem is given by
9 × 9 × 9 × 9 × 9 × 9 × 9 × 9 × 9 = 9^9.
Why is this the case? The answer is, again, best explained in terms of
Cartesian products. We will let A = {1,2,3,4,5,6,7,8,9}. Then the
required number corresponds to nothing more than an element of
A × A × A × A × A × A × A × A × A (why?); and the correct answer
is now a consequence of the general result that the number of elements in a
Cartesian product is the product of the number of elements in each of the
sets that comprise the Cartesian product.
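The count for Example 2 can be computed directly, and the rule behind it can be checked by enumeration when there are fewer spaces. This is a sketch, not part of the original text; enumerating all nine spaces is avoided only because the list would be very long, and the variable names are illustrative.

```python
from itertools import product

A = range(1, 10)                     # the digits 1 through 9

# The full count is the size of A × A × ... × A with nine factors.
print(len(A) ** 9)

# Direct enumeration is feasible for fewer spaces and follows the same rule.
for spaces in (1, 2, 3, 4):
    count = sum(1 for _ in product(A, repeat=spaces))
    print(spaces, count, 9 ** spaces)
```

The `repeat` argument plays the role of the number of factors in the Cartesian product, one factor for each space to be filled.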
Exam ple 3
We have the numerals 1, 2, 3, 4, 5, 6, 7, 8, and 9 a t our disposal. We
wish to form a two-digit number from these numerals and to have no
numeral be repeated in a given number. How many such numbers can
we form?
@ 12 13 14 15 16 17 18 19
21 (g) 23 24 25 26 27 28 29
31 32 © 34 35 36 37 38 39
41 42 43 @ 45 46 47 48 49
51 52 53 54 © 56 57 58 59
61 62 63 64 65 © 67 68 69
71 72 73 74 75 76 © 78 79
81 82 83 84 85 86 87 © 89
91 92 93 94 95 96 97 98 @
9 X 8 = (9 X 8)(7!)/(7!) = 9!/7!
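As an illustration only (the variable names are ours, not the text's), a short Python sketch can list the two-digit numbers with no repeated numeral and confirm that the count agrees with both 9 X 8 and 9!/7!.

```python
from itertools import permutations
from math import factorial

digits = "123456789"

# Every ordered choice of two different numerals gives one two-digit number.
two_digit = ["".join(p) for p in permutations(digits, 2)]
count_two_digit = len(two_digit)

print(count_two_digit)  # 72
assert count_two_digit == 9 * 8 == factorial(9) // factorial(7)
```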
Example 4
We wish to form four-digit numbers from the digits 1, 2, 3, 4, 5, 6, 7, 8, and
9 with no digit being repeated in a given number. We also want the number
to be greater than 6000. How many such numbers are there?
solution: This is quite similar to the other examples except that, looking
at the four spaces now, we observe that the first (counting from left
to right) digit cannot be 1, 2, 3, 4, or 5, since in terms of place-value such
digits would guarantee that the number could not exceed 6000. Thus, we
have four first choices: 6, 7, 8, or 9. Once we have made the first choice,
and since there can be no repetitions, the second choice can be made in
eight ways. That is, as long as the first digit is at least as great as 6, there
are no further restrictions on how to choose the digits. The third digit
can be chosen in seven ways, and the fourth in six. Thus, the diagram
would be
4 8 7 6
and we could obtain 4 X 8 X 7 X 6 = 1344 such numbers. We could
check this result by actually writing all such numbers and performing a
count, but such a chore might prove difficult and uninspiring. Frequently
enough wrinkles are thrown in so that the problems become even more
difficult to handle neatly. However, the principle always remains the
same, even though the techniques may become more trying (and it will
be more trying for some than for others!). To illustrate our last remark,
consider the next example.
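The "difficult and uninspiring" chore of writing out all such numbers is one a machine does not mind. The following Python sketch (our own illustration, with hypothetical variable names) lists every four-digit arrangement without repetition and keeps those greater than 6000.

```python
from itertools import permutations

digits = "123456789"

# Build each four-digit number with no repeated digit, then apply the size test.
candidates = [int("".join(p)) for p in permutations(digits, 4)]
greater_than_6000 = [number for number in candidates if number > 6000]

print(len(greater_than_6000))  # 1344
assert len(greater_than_6000) == 4 * 8 * 7 * 6
```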
Example 5
All the requirements of this problem are the same as those in Example 4
except that we also want the number to be even.
solution: We still have four spaces; the second and third spaces can be
filled in by any remaining digits after the first and fourth spaces are filled
in. The first space can still be filled in by 6, 7, 8, or 9; but the choice for
the fourth space is now affected by the choice for the first space. Namely,
Case 1: The first digit is either 7 or 9. In this event we may choose the
first digit in two ways and the fourth in four. (Notice that we fill in the
difficult spaces first and leave the straightforward ones for last.) This
leaves us with seven digits, any of which may be used for the second space,
while any one of the remaining six can be used for the third space. Thus,
2 7 6 4.
Hence, there are 2 X 7 X 6 X 4 = 336 of these required numbers.
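A brute-force count offers a check on this case analysis. In the Python sketch below, which is our own illustration, the counter `case_1` follows Case 1 as described above, while `other_first_digit` is simply our bookkeeping for the remaining numbers (those whose first digit is 6 or 8).

```python
from itertools import permutations

digits = "123456789"
case_1 = 0             # first digit is 7 or 9
other_first_digit = 0  # first digit is 6 or 8

for p in permutations(digits, 4):
    number = int("".join(p))
    if number > 6000 and number % 2 == 0:
        if p[0] in "79":
            case_1 += 1
        else:
            other_first_digit += 1

total_even = case_1 + other_first_digit
print(case_1, other_first_digit, total_even)  # 336 252 588
```

The 252 agrees with the product 2 X 3 X 7 X 6 obtained by filling the first space with 6 or 8, the fourth space with one of the three even digits that remain, and the two middle spaces as before.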
Example 6
We wish to determine the number of nine-digit numbers that can be formed
by the permutations of the digits 1, 2, 3, 4, 5, 6, 7, 8, and 9, subject to the
condition that the first six places be a permutation of 1, 2, 3, 4, 5, and 6.
solution: We observe that the first six places can be filled in 6! ways
while the last three places can be filled in 3! ways. Thus, the answer is
6! X 3! = 720 X 6 = 4320.
Example 7
Five people are to be seated in five chairs placed in a row. How many
different seating arrangements are possible?
solution: The answer is 5!, or 120. This is precisely the same problem,
but in a different environment, as the one that asks how many five-digit
numbers can be formed by a permutation of the digits 1, 2, 3, 4, and 5.
We see that the first chair can be occupied in five different ways; once the
first is occupied, the second can be occupied in four different ways, and
so on.
5 4 3 2 1
We can make this more difficult by adding the condition that two particular
people refuse to sit next to each other.
Example 8
How many seating arrangements are possible if five people are to occupy
five seats arranged in a row but two particular men refuse to sit next to one
another?
solution: Let us call the men who refuse to sit next to one another A and B.
Case 2: If we let Case 2 denote the situation in which A does not occupy
an end seat, everything else remains the same except now once A is seated
B can be seated in only two ways rather than three, for now there are two
chairs next to A. Moreover, since there are three non-end seats, A can
be seated in three ways rather than in two. Pictorially,
3 2 3 2 1
A B C D E
and we see that here, too, 36 seating arrangements satisfy the conditions
of Case 2. Hence, the answer to the problem is that 72 such seating
arrangements exist. Moreover, since there are 120 seating arrangements
in all and since A and B either do or do not sit next to each other, this must
mean that there are 120 - 72 = 48 ways in which to seat the people so
that A and B are next to each other.
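The totals for the row of five chairs can be confirmed by listing all 120 seatings. The Python sketch below is our own illustration; the labels C, D, and E for the other three people follow the diagram above, and the counters `apart` and `together` are hypothetical names.

```python
from itertools import permutations

people = "ABCDE"
apart = 0     # A and B are not in neighboring chairs
together = 0  # A and B are in neighboring chairs

for seating in permutations(people):
    # seating[i] is the person in chair i, counting from the left.
    if abs(seating.index("A") - seating.index("B")) == 1:
        together += 1
    else:
        apart += 1

print(apart, together)  # 72 48
```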
As a concluding remark to this problem, observe that it makes a
difference whether the chairs are in a row or in a circle, for in a circle there
is no end seat. Hence, A can be seated in any of the five chairs; then if
B is not to sit next to A, he can sit in just two chairs, and we see
5 2 3 2 1
A B C D E
occurs because if A and B occupy end seats when the chairs are arranged
in a row, these chairs become adjacent when pulled into a circle. These
are the 12 ways in which A and B can occupy end seats.
A C D E B
A C E D B
Exercises
1. A restaurant has a menu consisting of five sandwiches and four beverages.
How many different lunches can be made up if each lunch
consists of a sandwich and a beverage?
2. Using the same conditions as in Exercise 1, how many lunches can
be made up if a lunch consists of either a beverage or a sandwich, but
not both?
3. Express each of the following as an equivalent place-value numeral.
(a) 5!/4! (b) 5!/1! (c) 5!/0! (d) 8!/5! (e) 7!/(5!2!)
4. We are given the anagram ARECUS and told to rearrange the letters
to form a six-letter word. If we decide to proceed by trial-and-error
and list all possible arrangements of the six letters, how many entries
will be on our list? Are any of these entries actual words?
5. In how many ways can six people be seated around a circular table?
(In questions of this type, it is usually understood that two seating
arrangements are different only if at least one person has a different
person next to him in the two arrangements.)
6. In how many ways can six people be seated if the chairs are arranged
in a straight line?
7. How can we account for the difference between the two answers obtained
in Exercise 5 and Exercise 6?
8. We are to form a four-digit numeral from the digits 1, 3, 5, and 6
with the understanding that no digit may occur more than once in
any numeral.
(a) How many such numerals can we form?
(b) How many numerals can we form if it is required that the resulting
number be greater than 5000?
(c) How many numbers can we form if the number is required to be
even?
(d) How many numbers can we form if the number must be even and
greater than 5000? List these numbers.
(e) Do parts (a), (b), (c), and (d) under the assumption that a digit
may be repeated as often as we wish in a given numeral.
9. Each person in a group is asked to pick a three-letter monogram as a
code name. The letters may be repeated. For example, one might
choose SAP and another might choose SSP. How many people must
be in the group before we can be positive that at least two of the people
chose the same monogram? Explain.
10. In a certain club 20 members are eligible to hold office. The offices
consist of a president, a vice-president, a secretary, and a treasurer.
If no person can hold more than one office, in how many ways can the
club’s officers be chosen?
11. In another club 20 members are eligible to fill four vacancies in the
board of directors. In how many different ways can these vacancies
be filled?
12. What is the basic conceptual difference between the situations depicted
in Exercise 10 and Exercise 11?
13. Four different math books and three different history books are to
be placed on a shelf in the bookcase. How many different shelving
arrangements are possible if
(a) the seven books may be placed in any order;
(b) the four math books must be together;
(c) the four math books must be together and the three history books
must be together;
(d) the books must alternate in the order: math, history, math, history,
math, history, math.
2.15 COMBINATIONS
There are some counting problems in which the correct answer depends
on the order in which certain events are performed, and other problems in
which the order makes no difference at all. For example, consider the
following problem.
There are ten men in a room. Each man shakes hands once with every other man
in the room. How many handshakes are made?
90 91 92 93 94 95 96 97 98
In this setup the first digit names the first man, while the second digit
names the second man. No repeated digit exists since a man does not
shake hands with himself. Notice, however, that each number below the
drawn diagonal names a handshake already described by a number on or
above the diagonal. Hence, the 90 entries name only 45 different
handshakes.
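The relation between the 90 entries and the 45 handshakes can be sketched in Python. This is our own illustration; labeling the men 0 through 9 follows the two-digit scheme described above, and the variable names are hypothetical.

```python
from itertools import combinations, permutations

men = range(10)  # the ten men, labeled 0 through 9

# Ordered pairs: (first man, second man), so (9, 1) and (1, 9) are both listed.
ordered_pairs = list(permutations(men, 2))
# Unordered pairs: each handshake is listed once.
handshakes = list(combinations(men, 2))

print(len(ordered_pairs), len(handshakes))  # 90 45
```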
To illustrate this claim in other ways, consider the situation in which
the numerals 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9 are placed in a bag. We are
to draw one of these from the bag. Then, without replacing the first
one, we are to draw another from the bag. Does the answer depend on
the order in which the digits are chosen? Obviously, before we can decide
anything about an answer, we must first know the question; and our claim
is that, depending on the specific question, the answer may or may not
depend on the order. For example, suppose that the two digits drawn
are to form a two-digit number with the first-drawn digit representing
tens and the second, ones. Then it makes a difference if we first choose
9 and then 1, or first 1 and then 9; for in the first case the number would be
91 and in the second, 19. Thus, if we viewed this problem as a game in
which the winner was the person who formed the greatest two-digit number,
the person who drew 9 and then 1 would beat the person who drew 1 and
then 9, even though both drew the same digits.
On the other hand, suppose conditions remained the same but that the
winner was now the man whose two digits represented the greatest sum.
Then the man who drew 9 and then 1 would receive 10 as his score since
9 + 1 = 10, while the man who chose 1 and then 9 would also receive
10 as a score since 1 + 9 = 10. In this case, then, the score would depend
only on the digits chosen, not on the order in which they were chosen.
With regard to this problem, if the two digits are drawn from 1, 2, 3, 4,
5, 6, 7, 8, and 9, the number of ways of selecting them is 9 X 8 = 72 if
order is important, and half this amount, or 36, if the order is
unimportant. Specifically, if order is irrelevant, these 36 are represented
by the following pairs. If order is important, the other 36 pairs come
from reversing the order of the digits in these 36 pairs.
12 13 14 15 16 17 18 19
23 24 25 26 27 28 29
34 35 36 37 38 39
45 46 47 48 49
56 57 58 59
67 68 69
78 79
89
While the listing may be long and drawn out, once done it affords us
the opportunity to make another interesting observation concerning the
difference between numbers and numerals. Namely, suppose the object
were to add the two numbers chosen. Then, as we have seen, 36 different
combinations of digits concern us. However, only 15 different scores
can be made. That is, the lowest score is 3, which results when 1 and 2
are chosen; the highest score is 17, which results when 8 and 9 are chosen;
and any natural number between 3 and 17 can be obtained as a score.
Thus, the scores are 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, and 17.
However, 3, 4, 16, and 17 can be obtained in only one way each while
10, for instance, can be scored by four different combinations: 1 and 9, 2
and 8, 3 and 7, 4 and 6. This, then, is an example that corresponds to a
The first digit corresponds to the first die, and the second digit to the
second die. We allow repeated digits since each die can turn up the same
face value; and if you feel that, say 2-4 and 4-2 are the same thing, imagine
the two dice to be colored differently. Certainly a red 4 and a green 2
denote a different toss than a red 2 and a green 4.
Thus, there are 36 ways in which the dice can turn up. Yet there are
only 11 different numbers that the dice can name: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
and 12.
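To see the 36 tosses and the 11 sums side by side, consider the following Python sketch. It is our own illustration; treating each toss as an ordered pair (red die, green die) mirrors the coloring device suggested above.

```python
from collections import Counter
from itertools import product

faces = range(1, 7)

# Each toss is an ordered pair: (red die, green die).
tosses = list(product(faces, repeat=2))

# Tally how many different tosses name each sum.
sum_counts = Counter(red + green for red, green in tosses)

print(len(tosses), len(sum_counts))  # 36 11
print(sum_counts[2], sum_counts[7], sum_counts[12])  # 1 6 1
```

The tally makes plain that the 11 numbers are not named equally often: 2 and 12 each arise from a single toss, while 7 arises from six.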
The word “permutation” indicates a situation in which order is important.
If the order is not important we use the term “combination.”
More specifically, if r and n are whole numbers and r ≤ n, then P(n,r)
denotes the number of ways in which r objects can be chosen from a set of
n objects if order is important and if the object chosen is not replaced
before each new choice. For example, P(5,3) means the number of ways
we can choose three objects from a set of five if different arrangements
mean different choices. We have already seen that there are 5 X 4 X 3 =
60 such choices. Thus, P(5,3) = 60. If we view the set as being composed
of the digits 1, 2, 3, 4, and 5, and if we view our choice as being a
three-digit numeral, then our 60 choices are the following.
123 124 125 134 135 145 234 235 245 345
132 142 152 143 153 154 243 253 254 354
213 214 215 314 315 451 324 325 425 435
231 241 251 341 351 415 342 352 452 453
312 412 521 413 513 514 423 523 524 534
321 421 512 431 531 541 432 532 542 543
P(9,4) = 9 X 8 X 7 X 6 = 3024.
$P(n,r) = \underbrace{n \times (n - 1) \times (n - 2) \times \cdots \times (n - [r - 1])}_{r \text{ factors}}$
$= n(n - 1)(n - 2) \cdots (n - r + 1).$
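The recipe for P(n,r) translates directly into a short loop. The Python sketch below is our own illustration (the function name `P` simply echoes the notation of the text); it multiplies r factors starting at n and compares the result with a direct listing for P(5,3).

```python
from itertools import permutations

def P(n, r):
    """Return n(n - 1)(n - 2)...(n - r + 1), a product of r factors."""
    result = 1
    for k in range(r):
        result *= n - k
    return result

print(P(5, 3), P(9, 4))  # 60 3024

# Cross-check against an explicit list of three-digit numerals from 1, 2, 3, 4, 5.
assert P(5, 3) == len(list(permutations("12345", 3)))
```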
$C(5,0) = \frac{5!}{0!\,5!} = 1$
$C(5,1) = \frac{5!}{1!\,4!} = 5$
$C(5,2) = \frac{5!}{2!\,3!} = \frac{5 \times 4 \times 3 \times 2 \times 1}{1 \times 2 \times 1 \times 2 \times 3} = 10$
$C(5,3) = C(5,2) = 10$
$C(5,4) = C(5,1) = 5$
$C(5,5) = C(5,0) = 1.$
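These values can be reproduced with a few lines of Python. The sketch below is our own illustration and assumes the factorial recipe C(n,r) = n!/(r!(n - r)!) that the computations above appear to use; the function name `C` echoes the notation of the text.

```python
from math import factorial

def C(n, r):
    """Number of ways to choose r objects from n when order is unimportant."""
    return factorial(n) // (factorial(r) * factorial(n - r))

row = [C(5, r) for r in range(6)]
print(row)  # [1, 5, 10, 10, 5, 1]

# The list reads the same forward and backward: C(5, r) = C(5, 5 - r).
assert row == row[::-1]
```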
This is in accord with the fact that there are 4! permutations of four elements.
In terms of a place diagram we have
4 3 2 1
However, let us now consider the word “noon.” In this case there
seem to be only six permutations rather than twenty-four; namely, noon,
nono, nnoo, onno, onon, and oonn.
The point here is that, as far as a word is concerned, we do not distinguish
between permutations of multiple letters. For example, in our case we
could artificially distinguish between the two o’s by writing them in two
different typefaces (or we could imagine them as having different colors,
or we could write them as o1 and o2 to indicate the first o and the second
o). In a similar way we could write n1 and n2 to distinguish between the
two n’s. In this way, n1o1o2n2, n1o2o1n2, n2o1o2n1, and n2o2o1n1 are four
different permutations; yet they are read as
precisely the same word. That is, in terms of the place diagram we may
place the two n’s in any two of the four places without regard to order.
Thus, the n’s may be placed in C(4,2) = 6 different ways. Once the n’s
are placed, the two o’s can be placed in only one way; namely, they must
fill the two spaces that remain. This accounts for the fact that we have
six answers rather than 24. We would not be able to use the recipes in
either Equation (1) or Equation (2) as they now stand. Rather, we would
begin by writing 4!, which would be correct if the letters were all different.
Then we would divide by 2! to indicate that if we leave the letters in place
and just permute the o’s, we do not get anything new. Then we must
divide by 2! again to indicate the same thing about the n’s. Thus, we
obtain our answer:
4!/(2!2!) = 6.
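The collapse from 24 permutations to 6 distinct words can be watched directly. In the Python sketch below, our own illustration, the 24 permutations of the four letters are generated and then placed in a set, which keeps only one copy of each word.

```python
from itertools import permutations
from math import factorial

all_permutations = ["".join(p) for p in permutations("noon")]
arrangements = sorted(set(all_permutations))  # duplicates collapse in the set

print(len(all_permutations))  # 24
print(arrangements)  # ['nnoo', 'nono', 'noon', 'onno', 'onon', 'oonn']
assert len(arrangements) == factorial(4) // (factorial(2) * factorial(2))
```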
As a more difficult example along these same lines, consider the word
“mississippi.” Here we have 1 m, 4 i’s, 4 s’s, and 2 p’s. Again imagine
that we are filling in blanks. There would be 11 blanks, and the m could
be placed in any one of these. Thus, there are C(11,1) ways of placing
the m. Notice that we use C rather than P because if we rearrange the
duplicate letters we do not change the word we see. Once this is done there
are 10 spaces left and the four i’s may be placed in any four of these ten
spaces. That is, we may place the four i’s in the remaining spaces in C(10,4)
different ways. With the m and the i’s in place there are six places left,
and the four s’s may be placed in any four of these six places. Thus, we may
place the s’s in C(6,4) different ways. We are then left with our two p’s
and just two open places. This means that the p’s may now be placed in
C(2,2) different ways. (Observe that C(2,2) = 1, and it is not surprising
that with two like objects left and only two spaces to be filled there is only
one way of doing this. We preferred writing C(2,2) rather than 1 to emphasize
the technique being used.) To go on, since the groups of letters
may be placed without restriction once the previous letters are in position,
we see from our knowledge of Cartesian products that the total number of
ways we can rearrange the letters is
C(11,1) X C(10,4) X C(6,4) X C(2,2). (4)
As a final point, let us note that Equation (4) once again emphasizes the
difference between overall knowledge and computational techniques.
That is, Equation (4) is the correct answer, although in an unusual form.
Once we apply a recipe such as Equation (2) to Equation (4) we get the
equivalent but more familiar
11 X (10 X 9 X 8 X 7)/(1 X 2 X 3 X 4) X (6 X 5)/(2 X 1) (5)
and this in turn leads to
11 X 210 X 15 (6)
or
34,650. (7)
Equation (7) represents the answer in its most familiar form, but notice
that Equation (4) is just as precise as Equation (7). The failure to separate
the computational know-how from the theory of solving the problem is
tantamount to the oft-quoted cliche of failing to see the forest because of
the trees.
Referring again to the “mississippi problem,” there was no rule saying
that we had to place the m first. For example, we might have elected to
place the four i’s first. This could have been done in C(11,4) different
ways. If we next placed the s’s, this could have been done in C(7,4) different
ways; the p’s could then have been placed in C(3,2) ways, and the
m in C(1,1) ways. Had this been the case, the answer to the problem would
have been
C(11,4) X C(7,4) X C(3,2) X C(1,1)
and it should not be surprising that this also yields 34,650 as the answer!
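Both orders of placement can be evaluated and compared in a few lines of Python. The sketch below is our own illustration; `math.comb` is the standard-library function for C(n,r) (available in Python 3.8 and later), and the third computation uses the divide-by-factorials idea from the “noon” example as an additional check.

```python
from math import comb, factorial

# Place the m first, then the i's, the s's, and the p's.
m_first = comb(11, 1) * comb(10, 4) * comb(6, 4) * comb(2, 2)
# Place the i's first, then the s's, the p's, and the m.
i_first = comb(11, 4) * comb(7, 4) * comb(3, 2) * comb(1, 1)
# Write 11! and divide out the rearrangements of the repeated letters.
by_factorials = factorial(11) // (
    factorial(1) * factorial(4) * factorial(4) * factorial(2)
)

print(m_first, i_first, by_factorials)  # 34650 34650 34650
```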
We leave additional examples for the exercises. In the next section we
shall show how combinations may be used to help us derive and understand
certain important algebraic recipes. In particular we shall study the
binomial theorem.
Exercises
1. A poker hand consists of 5 cards without regard to order, dealt from a
deck that contains 52 cards.
(a) How many different poker hands are possible?
(b) How many of these hands are ones in which all five cards are
of the same suit?
(c) How many of these hands consist of four of a kind (four cards of the
same face value)?
2. A bridge hand consists of 13 cards dealt from a standard deck of 52
cards. In how many ways can a hand of each of the following types be
obtained?
(a) suits of 5, 4, 3, and 1 card
(b) suits of 4, 4, 3, and 2 cards
(c) 13 of the same suit
3. It can be shown that the number named by C(52,13) can be approximated
in place-value notation to be slightly larger than $6 \times 10^{11}$.
If we assume that on the average one bridge hand per second is played
in America around the clock, how likely does it seem that a bridge hand
consisting of 13 cards in the same suit will occur?
4. Three people approach a row of six chairs and decide to sit down at
random with no more than one person per chair.
(a) In how many ways can the people be seated?
(b) In how many ways can they be seated if they agree not to leave
empty seats between them?
(c) In how many ways can they be seated if they agree that there must
be at least one empty chair between each pair of people?
6. Construct an example that will explain why
C(12,3)C(9,5)C(4,4) = C(12,5)C(7,4)C(3,3)
without multiplying both sides of the expression to verify this result.
$(a_1 + a_2 + a_3)(b_1 + b_2) = a_1b_1 + a_1b_2 + a_2b_1 + a_2b_2 + a_3b_1 + a_3b_2.$
(See Figure 2.51.) This means that the product may be written down
mechanically as the sum of all products of two numbers, one from each set of
parentheses. In the above illustration the first factor could be chosen in
any one of three ways since the first set of parentheses contains three terms;
in a similar way we see that the second factor could be chosen in two ways.
Thus, as we have learned from the discussion of the number of elements in a
Cartesian product, there should be 3 X 2 = 6 terms in the product, and
a simple check shows this to be the case. We should also observe that
while we used a particular illustration, the above remarks (other than for
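[Figure 2.51: a rectangle with sides $a_1 + a_2 + a_3$ and $b_1 + b_2$, divided into regions labeled $a_1b_1$, $a_2b_1$, $a_3b_1$, $a_1b_2$, $a_2b_2$, and $a_3b_2$.]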
That is, the product has 12 terms (3 X 2 X 2 = 12) and each term consists
of the product of three numbers, one from each of the three sets of
parentheses. To prove this, one could think in terms of the volume of the
parallelepiped (this is the three-dimensional analogue of a rectangle) whose
sides were $(a_1 + a_2 + a_3)$, $(b_1 + b_2)$, and $(c_1 + c_2)$ (see Figure 2.52).
However, such a procedure cannot be generalized if more than three sets of
parentheses are involved since we have no physical concept of geometry
of more than three dimensions.
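The claim that the product has 3 X 2 X 2 = 12 terms, one for each choice of a factor from each set of parentheses, can be sketched in Python. This is our own illustration; the symbol names and the particular numbers substituted for them are arbitrary choices made only to test the identity.

```python
from itertools import product

first = ["a1", "a2", "a3"]
second = ["b1", "b2"]
third = ["c1", "c2"]

# One term for each element of the Cartesian product of the three sets.
terms = ["".join(choice) for choice in product(first, second, third)]
print(len(terms))  # 12
print(terms[:4])   # ['a1b1c1', 'a1b1c2', 'a1b2c1', 'a1b2c2']

# Numerical check with arbitrary values: the sum of all 12 products
# equals the product of the three sums.
a, b, c = (2, 3, 5), (7, 11), (13, 17)
sum_of_terms = sum(x * y * z for x, y, z in product(a, b, c))
assert sum_of_terms == sum(a) * sum(b) * sum(c)
```

Nothing in the sketch depends on there being exactly three sets of parentheses, which is the point of the more abstract view taken next.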
From a more abstract point of view we may form the product using two
factors at a time. That is,
Figure 2.52
and this reduces to the case in which we have only two sets of parentheses
since, of course, such expressions as $a_1b_1$ are also numbers.
Again, observe that this procedure did not depend on the number of
terms within each set of parentheses, and that we can continue the procedure
for any number of sets of parentheses merely by the repeated application
of the above technique. To simplify this point, observe that what
we are doing is essentially no different from the way in which we form a
sum, such as 1 + 2 + 3 + 4. In effect, we use the associative rule repeatedly
to obtain
1 + 2 + 3 + 4 = (1 + 2) + 3 + 4 = 3 + 3 + 4
= (3 + 3) + 4 = 6 + 4 = 10.
$(a + b)$
$(a + b)(a + b)$
$(a + b)(a + b)(a + b)$
$(a + b)(a + b)(a + b)(a + b)$
$(a + b)^n$
Case 1: n = 0.
Case 2: n = 1.
Then $(a + b)^n = (a + b)^1 = (a + b)$.
Case 3: n = 2.
Then $(a + b)^n = (a + b)^2 = (a + b)(a + b) = a^2 + ab + ba + b^2$
$= a^2 + 2ab + b^2$.
Case 4: n = 3.
Case 5: n = 4.
Then $(a + b)^n = (a + b)^4 = (a + b)^3(a + b)^1$
$= (a^3 + 3a^2b + 3ab^2 + b^3)(a + b)$
$= a^4 + a^3b + 3a^3b + 3a^2b^2 + 3a^2b^2 + 3ab^3 + ab^3 + b^4$
$= a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4$.
n    $(a + b)^n$
0    $1$
1    $a + b$
2    $a^2 + 2ab + b^2$
3    $a^3 + 3a^2b + 3ab^2 + b^3$
4    $a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4$
5    $a^5 + 5a^4b + 10a^3b^2 + 10a^2b^3 + 5ab^4 + b^5$
6    $a^6 + 6a^5b + 15a^4b^2 + 20a^3b^3 + 15a^2b^4 + 6ab^5 + b^6$
7    $a^7 + 7a^6b + 21a^5b^2 + 35a^4b^3 + 35a^3b^4 + 21a^2b^5 + 7ab^6 + b^7$
While the above task was quite laborious, it has resulted in our collecting
a considerable amount of data to analyze. The above chart indicates some
interesting trends. (Observe that any results obtained from the charts
are trends and not theorems. That is, much as in a laboratory, unless
we show that something follows inescapably, all that we have is a conjecture,
a hunch. However, many times a theorem is born after much data
indicates that a trend may be more than just coincidence. We shall
illustrate this remark by the manner in which we evaluate the above
chart.) Among the “obvious” trends are:
(1) If we look down the first column of the chart on the side headed by
$(a + b)^n$, we observe that the coefficient in each case is 1.
(2) The coefficients in the second column are successively given by 1, 2,
3, 4, 5, 6, and 7; and this appears to indicate the set of natural numbers.
(3) The third column leads to the coefficients 1, 3, 6, 10, 15, and 21; and
While our observations might not be too indicative of any practical results,
notice that pure mathematics is often born merely of our interest in
studying sequences of numbers such as these. Perhaps we can now combine
the above three observations and begin to look for a more general trend.
Namely, the first column of coefficients consisted of the sequence 1, 1, 1,
1, 1, 1, 1, 1; and if we form consecutive sums of these elements we obtain:
1, 1 + 1, 1 + 1 + 1, . . . , or 1, 2, 3, 4, 5, 6, 7, . . . , and these turned
out to be the coefficients of the second column. In turn, these successive
sums lead to 1, 1 + 2, 1 + 2 + 3, . . . , or 1, 3, 6, 10, 15, 21; which were
the members making up the coefficients of the third column. These, in
turn, lead to the successive sums 1, 1 + 3, 1 + 3 + 6, . . . , or 1, 4, 10,
20, 35; which formed the coefficients of the next column. Continuing this
process, we would next obtain the sequence 1, 1 + 4, 1 + 4 + 10, 1 + 4 +
10 + 20, . . . , or 1, 5, 15, 35; and this yields the coefficients of the next
column. We have not proven any theorems yet, but things are starting
to look quite suspicious as we begin to sense that this procedure will
continue endlessly.
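The process of forming successive sums is easy to mechanize, which lets us watch the trend continue as far as we like. The following Python sketch is our own illustration; `itertools.accumulate` returns the running sums of a sequence, and the names `column` and `columns` are hypothetical.

```python
from itertools import accumulate

column = [1] * 8     # the first column of coefficients: 1, 1, 1, ...
columns = [column]

# Each new column consists of the running sums of the column before it.
for _ in range(4):
    column = list(accumulate(column))
    columns.append(column)

for col in columns:
    print(col)
```

As in the text, this produces evidence of a trend rather than a proof of it.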
Moreover, if we were endowed with enough geometric and artistic intuition,
we could use the results discussed above to invent the chart in
Figure 2.53, which is known as Pascal's triangle. The arrows show how the
coefficients in the various columns have been arrayed in the triangle. The
                1
              1   1
            1   2   1
          1   3   3   1
        1   4   6   4   1
      1   5  10  10   5   1
    1   6  15  20  15   6   1
  1   7  21  35  35  21   7   1
1   8  28  56  70  56  28   8   1
Figure 2.53
$1 = 1 = 2^0$
$1 + 1 = 2 = 2^1$
$1 + 2 + 1 = 4 = 2^2$
$1 + 3 + 3 + 1 = 8 = 2^3$, and so on.
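Both the triangle and the pattern in its row sums can be generated by a short program. The Python sketch below is our own illustration (the function name `pascal_rows` is hypothetical); each new row is built by adding neighboring entries of the row above, and each row sum is then compared with a power of 2.

```python
def pascal_rows(count):
    """Return the first `count` rows of Pascal's triangle (count >= 1)."""
    rows = [[1]]
    while len(rows) < count:
        prev = rows[-1]
        # Each interior entry is the sum of the two entries above it.
        inner = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        rows.append([1] + inner + [1])
    return rows

rows = pascal_rows(9)
for n, row in enumerate(rows):
    assert sum(row) == 2 ** n  # row n sums to 2 to the power n

print(rows[8])  # [1, 8, 28, 56, 70, 56, 28, 8, 1]
```

Here too the program confirms the pattern only for the rows it examines.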
are inescapable, is to figure out how the various coefficients must be formed.
For example, in the situation
$(a + b)^4 = a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4$
why is the coefficient of $a^3b$ equal to 4, while the coefficient of $a^4$ is equal to 1?
This question is easy to answer in terms of combinations. Namely, we
form a term in the product by forming products consisting of one factor
from each of the four, in this case, sets of parentheses. Since there are but
four sets of parentheses, the only way we can choose a term containing
four factors of a is to choose four a’s from a possible maximum of four a’s;
and this can be done in only one way; namely, we must choose the a, not
the b, from each set of parentheses. On the other hand, to form a term of
the type $a^3b$, we need only choose three of the factors to be a’s and one to
be b. However, since there are four sets of parentheses we could choose
the b from either the first, second, third, or fourth set. That is, there are
four ways of forming a term of the type $a^3b$. To summarize, there are as
many ways of forming a term of type $a^3b$ as there are ways of choosing three
a’s, or equivalently, one b from a collection of four. This number is represented
by none other than C(4,1) or its equivalent C(4,3), since C(4,1) =
C(4,3).
To generalize this idea, let us first review the binomial expansion of
$(a + b)^6$ in this light rather than by straightforward computation. We
first observe that the exponent tells us that there are really six sets of parentheses
in the expansion; hence, the sum of the exponents of each of the terms
must be 6. This tells us that we can only expect terms of the type $a^6$,
$a^5b$, $a^4b^2$, $a^3b^3$, $a^2b^4$, $ab^5$, and $b^6$. Moreover, the coefficient of $a^6$ must be
C(6,6) since we can get a term of this type only by choosing six a’s out of a
possible six. In the same way, we see that the coefficient of $a^5b$ must be
C(6,5) since we can obtain this type of term only by choosing an a from
five of the six sets of parentheses. Proceeding in this way, we may write
down at once
$(a + b)^6 = C(6,6)a^6 + C(6,5)a^5b + C(6,4)a^4b^2 + C(6,3)a^3b^3$
$+ C(6,2)a^2b^4 + C(6,1)ab^5 + C(6,0)b^6.$
This result does not depend on whether we can find synonyms for C(6,4)
and the others. However, if we use our previous factorial formulas or
whatever we find convenient, we see that C(6,6) = 1, C(6,5) = 6, C(6,4) =
15, C(6,3) = 20, C(6,2) = 15, C(6,1) = 6, and C(6,0) = 1. This leads to
the result shown in the chart.
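The agreement between the combinations and the straightforward computation can be checked by carrying out the multiplication mechanically. The Python sketch below is our own illustration; it represents a polynomial by the list of its coefficients, read from the highest power of a down to the highest power of b, and the function name `multiply_by_a_plus_b` is hypothetical.

```python
from math import comb

def multiply_by_a_plus_b(coeffs):
    """Coefficients of p(a, b)(a + b), listed from the highest power of a down."""
    times_a = coeffs + [0]  # multiplying by a keeps each position, adds a last slot
    times_b = [0] + coeffs  # multiplying by b moves each term one place to the right
    return [x + y for x, y in zip(times_a, times_b)]

coeffs = [1]  # (a + b) to the power 0
for _ in range(6):
    coeffs = multiply_by_a_plus_b(coeffs)

print(coeffs)  # [1, 6, 15, 20, 15, 6, 1]
assert coeffs == [comb(6, k) for k in range(6, -1, -1)]
```

The adding of two shifted copies of the list is the same operation by which each row of Pascal's triangle is obtained from the row above it.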
If we wish to extend this technique beyond an example contained in the
chart, consider the problem of finding the coefficient of $a^8b^5$ in the binomial
expansion of $(a + b)^{13}$. By sight we need only write C(13,8), for there are
as many terms of type $a^8b^5$ as there are ways of choosing eight a’s from a set
of 13. Observe that this answer is correct and is indeed exact. To be
Exercises
1. $(a + b)^9$ is expanded by the binomial theorem. Find the coefficient of
$a^6b^3$.
2. Nine coins are tossed. In how many ways may we obtain six heads and
three tails?
3. How are Exercises 1 and 2 related?
4. Explain the device in Pascal’s triangle whereby we enter the sum of two
entries on one line to obtain an entry on the next.
5. In the binomial expansion of $(x^3 + 2y)^5$ find the coefficient of the term
whose form is $x^9y^2$.
6. $2^{10} = 1024$. In the binomial expansion of $(a + b)^{10}$ how does 1024
occur?
7. What is the coefficient of $a^5b^5$ in the binomial expansion of $(a + b)^{10}$?
8. Suppose we wished to compute $(1.001)^{10}$ to be correct to the third decimal
place but we did not wish to go through the work of actually raising
1.001 to the tenth power. Recall that 1.001 = 1 + 0.001. Use the
binomial theorem to expand $(1 + 0.001)^{10}$ and thus obtain the required
approximation for $(1.001)^{10}$.
one million numbers are in a box and we draw one a t random then there
are two possibilities: either we draw the number 6 or we do not. However,
unless half of the numbers in the box are labeled 6 these two outcomes are
not equally likely. Now suppose that m of these n equally likely outcomes
are favorable outcomes. We define the probability of a favorable outcome
to be m/n.
Using more mathematical language, if A denotes a particular event and
if A can occur in m equally likely ways out of n equally likely total outcomes,
then we define P(A), called the probability of A, by P(A) = m/n.
This idea of probability is closely connected with the structure of unions
and intersections in the sense that if A and B are two events, the probability
that at least one of the two events occurs is given by
$P(A \cup B) = P(A) + P(B) - P(A \cap B).$
$P(A) = 13/52$
$P(B) = 16/52$
$P(A \cap B) = 4/52.$
Hence
$P(A \cup B) = (13 + 16 - 4)/52 = 25/52.$
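To make the computation concrete, here is a minimal Python sketch. The events it uses are an assumption on our part, chosen only because they match the numbers 13, 16, and 4: A is taken to be "the card drawn is a heart" and B to be "the card drawn is a jack, queen, king, or ace." The names `deck` and `prob` are likewise our own.

```python
from fractions import Fraction
from itertools import product

suits = ["hearts", "diamonds", "clubs", "spades"]
ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
deck = set(product(ranks, suits))  # 52 equally likely outcomes

# Assumed events (hypothetical, chosen to match 13/52, 16/52, and 4/52).
A = {card for card in deck if card[1] == "hearts"}
B = {card for card in deck if card[0] in {"J", "Q", "K", "A"}}

def prob(event):
    """Probability as favorable outcomes over total outcomes, m/n."""
    return Fraction(len(event), len(deck))

assert prob(A | B) == prob(A) + prob(B) - prob(A & B)
print(prob(A | B))  # 25/52
```

The subtraction of the intersection corrects for the four cards that would otherwise be counted twice, once as members of A and once as members of B.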
At the end of this section we shall supply some examples, but now we
wish to make a very important observation. It is probably not surprising
that the original study of probability arose in regard to games of chance.
Even in modern textbooks there is a preponderance of examples concerning
cards and dice when probability theory is discussed. However, it is wrong
to believe that such considerations are the only uses of the probability
theory. In truth, all of life in one way or another is concerned with the
study of probability. The actuary employed by the insurance company is,
in a sense, a bookmaker. Indeed, when we take out a life insurance policy,
we are betting that we will not live to a certain age while the company
bets that we do; and the actuary determines the odds to make it a fair bet.
More importantly, the whole world of science hinges on the concept of
probability, for in the world of science until an experiment is completed
all we have is a prediction. If, for example, the space agency is told not
to send a man into orbit until it is certain that he will return alive, then it
will never send a man into orbit; for the only way in which we can be sure
that he returns alive is for us either never to send him, or if we do send him,
to wait until he returns alive! All decisions must be made in the light of
available evidence; and granted that a well thought-out decision may have
bad results, we would like to feel that it was due to unforeseeable bad luck,
rather than to poor planning.
In laboratory work when there is some discrepancy between theoretical
results and experimental results, we must often decide how large an error
we can tolerate before we have to admit that the discrepancy is due to a
mistake in the theory rather than to a human error in measurement and
experimental design.
The biggest problem is that outside of such contrived examples as dice
and playing cards it is either very difficult or even impossible to determine
the number of possible, equally likely outcomes; and as a result, the study
known as statistics is born. Statistics in many ways simply (but not
easily) involves methods of collecting reliable data from which certain
probabilities can be determined for use in various computations. The use
of statistics and probability from either the theoretical or computational
points of view is not the purpose of this text. All we want to do is indicate
the importance of the topic and the use of the theory of sets in its proper
development. However, it is difficult to understand the importance of
probability without reference to at least a few problems; so we shall
conclude this section with a few problems, their solutions, and a brief discussion.
Example 1
A “fair” coin is tossed six times. What is the probability that exactly
three heads and three tails occur?
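As a hedged illustration of Example 1, the sketch below lists all sequences of six tosses, treats them as equally likely, and counts those with exactly three heads. It also compares the count with C(6,3)/2^6; using `math.comb` for C(n,r) is a choice made here, not something taken from the text.

```python
from fractions import Fraction
from itertools import product
from math import comb

# Every sequence of six tosses, e.g. ('H', 'T', 'T', 'H', 'H', 'T').
tosses = list(product("HT", repeat=6))

# Favorable outcomes: exactly three heads (and therefore three tails).
favorable = [t for t in tosses if t.count("H") == 3]

p_direct = Fraction(len(favorable), len(tosses))   # by listing outcomes
p_formula = Fraction(comb(6, 3), 2 ** 6)           # C(6,3) / 2^6
print(len(favorable), len(tosses), p_direct)
```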
Example 2
What is the probability when a pair of “fair” dice are rolled that the sum
will be seven?
Solution: There are only 11 different sums that can be rolled; namely,
any number between 2 and 12, inclusive. However, these numbers are
not equally likely; for, as we have already seen, numbers such as 2 and 12
can be obtained in only one way each, while 7 may be obtained in six ways;
that is, 1-6, 6-1, 2-5, 5-2, 3-4, and 4-3. Here we see the difference between
numbers and numerals once again, for while only 11 different sums can
occur, 36 (6 × 6) different numerals can name these 11 numbers. Thus,
the probability of obtaining a 7 is not 1/11 but rather 6/36 or 1/6. Observe
that by equally likely we mean that the combination 5-2 is no more likely
than the combination 6-6; both are equally likely. What we do mean is that
there are six different ways of rolling 7, and 5-2 is just one of these; but 6-6
is the only way of obtaining a twelve.
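The distinction between the 11 sums and the 36 equally likely ordered rolls can be made concrete with a short count. This is a minimal sketch; the names `rolls`, `ways`, and `p_sum` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The 36 equally likely ordered pairs (first die, second die).
rolls = list(product(range(1, 7), repeat=2))

# Number of ordered pairs that produce each sum.
ways = Counter(a + b for a, b in rolls)

# Probability of each sum: ways / 36.
p_sum = {total: Fraction(n, len(rolls)) for total, n in ways.items()}

for total in sorted(ways):
    print(total, ways[total], p_sum[total])
```

The printout shows the counts rising from one way (for 2) to six ways (for 7) and falling back to one way (for 12), which is why 1/11 is the wrong answer.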
Example 3
A “fair” die is rolled twice. What is the probability that a 4 does not turn
up either time?
Solution: There are five ways in which the first die fails to turn up a 4.
Namely, it turns up either 1, 2, 3, 5, or 6. There are also five ways in
which the second die fails to turn up a 4. Thus, there are 5 × 5 ways in
which neither the first nor the second die turns up a 4. In all there are
6 × 6 equally likely ways in which the pair of rolls may terminate. The
probability of a 4 turning up neither time is (5 × 5)/(6 × 6), or 25/36. In
terms of odds, the odds are 25 to 11 that a 4 will not turn up on either die.
Example 4
What is the probability that when a pair of “fair” dice are rolled at least
one of the faces will turn up a 4?
Solution: Here we use the fact that P(A) = 1 - P(A'), for it should be
clear that the set of events in which at least one die turns up a 4 is the
complement of the set of events in which neither die turns up a 4. Hence,
since there is a probability of 25/36 that neither die reads 4, it must be
that the probability that at least one die turns up a 4 is 11/36.
The number of possible outcomes involved in the solutions of (3) and (4)
is quite small. In fact, the following is a complete listing.
The 11 entries with asterisks verify directly the results discussed above.
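The printed listing did not survive in this copy, but an equivalent one can be regenerated. The sketch below prints the 36 ordered outcomes and stars those in which at least one die shows a 4; the layout is an assumption and need not match the original table.

```python
def listing(face=4):
    """Return six rows of outcomes 'first-second', starred when `face` appears."""
    rows = []
    for first in range(1, 7):
        cells = []
        for second in range(1, 7):
            mark = "*" if face in (first, second) else " "
            cells.append(f"{first}-{second}{mark}")
        rows.append(" ".join(cells))
    return rows


rows = listing()
print("\n".join(rows))

# Count starred entries (at least one 4) and unstarred entries (no 4).
starred = sum(row.count("*") for row in rows)
unstarred = 36 - starred
print(starred, unstarred)
```

The starred entries form the row and the column belonging to 4, which overlap in the single entry 4-4.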
It is worth noting that intuition can easily lead to misinterpretation
in such a problem as the above. For example, it is easy to employ the
reasoning that since there are only six equally likely ways in which a fair
die can turn up, then in two rolls there are two chances out of six that a 4
will occur. This reasoning leads to 2/6, or 1/3, as the probability of
obtaining at least one 4 with one roll of a pair of dice. We have seen
that the probability is 11/36. The error seems more suggestive if we
write 1/3 = 12/36, for then we see that we are off by 1/36; whence it is
not difficult to see that 4-4 is but one roll that we have counted as two
solutions (that is, as two favorable events).
While the outcomes are more numerous as the number of rolls of the
die increases, the theory remains the same. For instance, if the die is
rolled three times (or if three dice are rolled once) there are 6 × 6 × 6
possible outcomes, and 5 × 5 × 5 ways in which a 4 never occurs. Thus,
the probability that no 4’s occur is (5 × 5 × 5)/(6 × 6 × 6) = 125/216.
This means that the probability of obtaining at least one 4 is 1 - 125/216, or
91/216, a far cry from the intuitive answer that the probability is 3/6. In
fact, a game called “chuck-a-luck” is based on this idea. Namely, a
player picks 1, 2, 3, 4, 5, or 6, and then three fair dice are rolled. If the
number picked by the player does not appear on any of the dice, the player
loses his wager. If his number appears on exactly one of the three dice, he
wins an amount equal to his wager. If it appears on exactly two of the
dice, he wins an amount equal to twice his wager; and finally, if his number
occurs on all three dice, he wins an amount equal to three times his wager.
At first glance (and to many, at second, third, and fourth glances as
well) it appears that this is a good bet for the player to accept. However,
the above discussion shows that the player has an even better chance of
losing. Let us assume, for the sake of argument, that a player chooses
4. We have already seen that there are 125 ways for him to lose and 91
ways to win. Since some of these 91 ways pay an extra dividend, they
must be considered separately. It is easy to see that only one of these
91 winning situations involves three 4’s; namely, each die must turn up a
4. Now let us investigate the number of ways in which exactly two of the
dice turn up a 4. Those who are now adept at the game of clever counting
may already see that there are C(3,2) ways in which two of the dice turn
up 4’s. We can also count more concretely and see that it must be either
the first and second, or first and third, or second and third dice which
turn up the 4’s. Once the two 4’s occur the third die can record any
one of five faces; namely, anything but a 4, since if a 4 occurred we would
have three 4’s. Thus, there are C(3,2) × 5 ways of obtaining exactly two
4’s, and this can be translated by observing that C(3,2) = 3, and 3 × 5 =
15. That is, there are 15 ways of obtaining exactly two 4’s. This leaves
us with the fact that 75 of the 91 winning situations involve only one 4
turning up. (This can be checked directly by observing that C(3,1) ×
5 × 5 = 3 × 5 × 5 = 75; and this is obtained by recalling that C(3,1)
denotes the number of ways in which one of the three dice turns up a
4, while 5 × 5 represents the number of ways in which neither of the
other two dice turns up a 4.) No matter how we proceed we would
eventually obtain the following:
(1) No 4 turns up: 125 ways
(2) Exactly one 4 turns up: 75 ways
(3) Exactly two 4’s turn up: 15 ways
(4) All three dice turn up 4: 1 way
For the sake of convenience, let us now assume that each wager consists
of $1. Since the 216 outcomes are assumed to be equally likely, let us
base our analysis on 216 tosses in which each of the 216 outcomes occurs.
Of course, such a distribution is unlikely, but any deviation from this
distribution is a matter of luck, so we disregard it. Then (1) indicates
that we lose $125, (2) indicates that we win $75, (3) indicates that we
win 15 × 2, or $30, and (4) indicates that we win 1 × 3, or $3. That
is, we can expect, in wagering $216, to win 75 + 30 + 3 = $108 while
losing $125. This means that we lose $125 - $108, or $17 per $216
invested. Certainly, a player may be an extremely lucky man and win
much money, but any deviation from an average loss of $17 per $216
played is due to luck alone; this in turn means that if the player wins he
should attribute this to good luck, but if he loses he should think twice
before being naive enough to believe that it was caused by bad luck alone!
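The chuck-a-luck bookkeeping can be reproduced by running through all 216 rolls exactly once, as the analysis above assumes. This is a minimal sketch; the function name `chuck_a_luck` and its parameters are illustrative, and the payoff rule is the one described in the text (lose the stake on no match, win one stake per matching die otherwise).

```python
from collections import Counter
from itertools import product


def chuck_a_luck(pick=4, stake=1):
    """Play each of the 216 equally likely rolls once.

    Returns (counts, net), where counts[k] is the number of rolls on which
    `pick` appears on exactly k dice and net is the player's total gain.
    """
    counts = Counter()
    net = 0
    for dice in product(range(1, 7), repeat=3):
        hits = dice.count(pick)
        counts[hits] += 1
        # No match loses the stake; k matches win k times the stake.
        net += stake * hits if hits else -stake
    return counts, net


counts, net = chuck_a_luck()
print(dict(counts), net)
```

Because of the symmetry of the dice, the same totals result for any number the player picks, not only 4.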
Again, notice that whether or not we are adept at counting, the theory is
not affected here; only the determination of the possible outcomes might
be. By miscounting the possibilities we can get into difficulty. In fact,
one reason that many gambling houses have “honest” wheels and “honest”
dealers is that they have too much to lose by being caught cheating.
Indeed, man’s ability to misinterpret data often allows the gambling house
to establish odds that are in its favor, but which the player believes are
in his favor. In short, the house plays the percentages, while the player
relies on luck, and it is fair to assume that this analogy permeates all phases
of our life!
We shall conclude this section with one more example which, in addition
to reinforcing some previous ideas, illustrates another living example of
the case against intuition.
Example 5
There are n people in a room, where n denotes a natural number. What
is the probability if these n people were chosen at random that at least
two of them celebrate their birthdays on the same day of the year?
(Notice that we are not naming a particular day, nor are we insisting that
they be born in the same year.)
Solution: It is clear that the answer depends on n. To see some extreme
cases, merely observe that if n = 1 the probability is 0, since with only
one person it is impossible for two of them to have the same birthday! At
the other extreme, if n is at least equal to 367 (this guards against leap
year problems), the chest-of-drawers principle guarantees that the
probability is 1, since then we have more people than dates in a year.
Let us tackle some specific values of n and see what the trend is, and
let us also assume that leap year is excluded (that is, there are 365 days
in a year).
two people have their birthdays on the same day! With 150 people, the
odds become 4,500,000,000,000,000 to 1! While we do not intend to
prove this assertion because it would require such tedious computation, let
us at least observe that by the time we get to about the hundredth person,
the factors look like 265/365, 264/365, and so on, and this involves talking
about 2/3 of 2/3 of 2/3 of 2/3 . . . . That is, we begin to multiply by (2/3)^m
(and even this is not quite exact since the terms get progressively smaller).
But to get a general idea of what is happening observe
(2/3)^2 = 4/9
(2/3)^3 = 8/27
(2/3)^4 = 16/81
(2/3)^5 = 32/243
(2/3)^6 = 64/729
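The "tedious computation" is easy to delegate to a machine. The sketch below assumes, as the solution does, a 365-day year with all birthdays equally likely; it multiplies the factors 365/365, 364/365, 363/365, and so on to get the probability that all n birthdays differ, then takes the complement. The function name `p_shared_birthday` is illustrative.

```python
from fractions import Fraction


def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday."""
    if n > days:
        return Fraction(1)  # more people than dates: a match is certain
    p_all_different = Fraction(1)
    for k in range(n):
        # The (k+1)-th person must avoid the k dates already taken.
        p_all_different *= Fraction(days - k, days)
    return 1 - p_all_different


for n in (1, 10, 23, 50, 100):
    print(n, float(p_shared_birthday(n)))
```

Under these assumptions the probability first exceeds 1/2 at n = 23, far earlier than most people guess.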
Exercises
1. The letters A, B, C, D, and E are placed at random in a row. What
is the probability that A and B will be next to each other?
2. A person writes down three letters chosen at random and it is possible
that he writes the same letter more than once.
(a) What is the probability that none of the three letters is a vowel?
(b) What is the probability that all of the three letters are vowels?
(c) What is the probability that at least one of the three is a vowel?
(d) What is the probability that exactly one of the three is a vowel?
3. We are to form a four-digit numeral by arranging the digits 1, 3, 5,
and 6 in random order.
(a) What is the probability that the numeral will name a number
greater than 5000?
(b) What is the probability that the numeral will name an even
number?
(c) What is the probability that the numeral will name an even
number greater than 5000?
4. Four math books and three history books are shelved in random order.
What is the probability that the four math books will be together?
but that the relation depends on the order in which the pair is stated. Thus,
if “John is taller than Bill” is a true statement, then “Bill is taller than
John” is a false statement. That is, in general, the truth of a relation
depends on the order of the pair, although there are some relations in
which order makes no difference (such as “is the same height as”).
In other words, in just the same way that we could view points in the
plane as ordered pairs, so can we view any relation. For example, let S
denote the set consisting of the whole numbers 1, 2, 3, and 4; and let the
relation be “is less than.” Then we have
1 < 2, 1 < 3, 1 < 4, 2 < 3, 2 < 4, and 3 < 4.
Let us now agree to use the code (b,c) as an abbreviation for b < c. Then
the above inequalities translate into
{(1,2), (1,3), (1,4), (2,3), (2,4), (3,4)}
and this is clearly a subset of S × S.
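The idea that a relation on S is nothing more than a subset of the Cartesian product of S with itself translates directly into code. The following is a minimal sketch; the variable names `cartesian` and `less_than` are illustrative.

```python
from itertools import product

S = {1, 2, 3, 4}

# The Cartesian product of S with itself: all ordered pairs (b, c).
cartesian = set(product(S, repeat=2))

# The relation "is less than" is the set of pairs (b, c) with b < c.
less_than = {(b, c) for (b, c) in cartesian if b < c}

print(sorted(less_than))
print(less_than <= cartesian)  # subset test
```

Asking whether b is related to c then becomes a membership question such as `(2, 3) in less_than`.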
While our definition of a relation is quite simple, its broadness allows a
set to have a great many relations. That is, suppose S has n elements.
Then, as we have already seen, S × S will have n^2 elements. Hence,
S × S will have 2^(n^2) subsets. (Why?) By way of illustration, in the
example above S has four members. In this case n^2 = 16; and 2^(n^2) = 2^16 =
65,536; and that is a great number of relations.
Not all relations are equally interesting, but there is one type of relation
that occupies a great position in mathematical structures. These are
called equivalence relations and they will be the topic of discussion in the
remainder of this section.
To begin with, let us concede that Cartesian products are no longer
necessary in our conversation. They have served their purpose in helping
us motivate the idea of relations. Instead of using ordered pairs, such as
(b,c), we will write bRc to stand for b is related to c by the relation R. As a
specific example, let R denote the relation “is greater than.” In this
case we would write 7R3 as an abbreviation for “7 is greater than 3.”
In what follows, R will denote an arbitrary relation on some set S. It
would be comforting to know that for any b ∈ S, bRb; that is, we would
like to believe that every element “is related to itself.” However, this
is not the case. Indeed, in our above example wherein R denoted “is
greater than,” this is false. That is, bRb would mean here that b was
greater than b, and this is false. (In terms of a subset of S × S we are
saying that while (b,b) belongs to S × S, there is no reason why it must
belong to the particular subset of S × S we have chosen.) On the other
hand, if R denotes “is equal to,” then bRb would be a true statement.
This leads us to the following concept: The relation R is said to be reflexive
on S if bRb is true for all b ∈ S.
Among reflexive relations are “is equal to,” “is the same age as,” and
“is parallel to.” Among relations that are not reflexive are “is less than,”
“is shorter than,” and “is the father of” (since no person is his own father).
In fact, a relation chosen at random is probably not reflexive.
A second important feature of some relations hinges on a few things
we have already hinted at, especially the idea of order being important.
For example, as we have seen, the fact that bRc is true does not mean that
cRb will also be true, although this could certainly happen. Now we can
say that the relation R is said to be symmetric on S if bRc implies that cRb.
Examples of symmetric relations are “is equal to,” “is parallel to,” and
“lives next door to.” Examples of relations that are not symmetric are
“is less than,” “is shorter than,” “is the father of,” and “is the brother
of” (that is, if John is the brother of Mary, it is unlikely that Mary is the
brother of John).
Again, in terms of subsets of S × S, to be symmetric the subset must
have the rather stringent condition that (c,b) belongs to the subset
whenever (b,c) does.
The final property of relations we wish to discuss here is known as
transitivity; namely, the relation R on S is called transitive if, whenever aRb and
bRc then aRc.
For instance “equality” is transitive; for if the first is equal to the second
and the second is equal to the third, then the first is also equal to the third.
On the other hand, “is the father of” is not transitive. That is, if the
first person is the father of the second and the second is the father of the
third, then the first is not the father, but the grandfather, of the third.
Again, in terms of subsets of S × S transitivity means that whenever
(a,b) and (b,c) belong to the subset, so also does (a,c).
The next point to observe is that a particular relation can have any
combination of these three properties (reflexive, symmetric, transitive),
ranging from none to all three.
(1) “lives next door to” is symmetric but neither reflexive (unless the
person lives in two adjoining homes) nor transitive.
(2) “is less than” is transitive but neither reflexive nor symmetric.
(3) “is the father of” is neither reflexive, transitive, nor symmetric.
(4) “is divisible by” is reflexive and transitive but it need not be
symmetric (4 is divisible by 2 but 2 is not divisible by 4).
(5) “is equal to” is reflexive, symmetric, and transitive.
By an equivalence relation on S, we mean any relation that is reflexive, symmetric,
and transitive.
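For a finite set, the three properties can be tested mechanically from the subset-of-pairs description given above. The sketch below is one possible rendering; the helper names (`relation`, `is_reflexive`, and so on) are illustrative, and the set {1, 2, 3, 4} is reused only as a convenient test bed.

```python
def relation(S, holds):
    """Build the set of ordered pairs (b, c) from S for which holds(b, c)."""
    return {(b, c) for b in S for c in S if holds(b, c)}


def is_reflexive(S, R):
    # (b, b) must belong to R for every b in S.
    return all((b, b) in R for b in S)


def is_symmetric(R):
    # (c, b) must belong to R whenever (b, c) does.
    return all((c, b) in R for (b, c) in R)


def is_transitive(R):
    # (a, c) must belong to R whenever (a, b) and (b, c) do.
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)


def is_equivalence(S, R):
    return is_reflexive(S, R) and is_symmetric(R) and is_transitive(R)


S = {1, 2, 3, 4}
less = relation(S, lambda b, c: b < c)
equal = relation(S, lambda b, c: b == c)
divisible = relation(S, lambda b, c: b % c == 0)  # "b is divisible by c"

for name, R in [("less", less), ("equal", equal), ("divisible", divisible)]:
    print(name, is_reflexive(S, R), is_symmetric(R), is_transitive(R))
```

A check of this kind covers only the particular finite set tried; it illustrates items (2), (4), and (5) of the list but does not prove them for all numbers.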
Figure 2.54
6 To see this, consider the equilateral triangle ABC and let AD be perpendicular to
BC, as in Figure 2.54. Now by definition of equilateral, BC = AC; hence, if we
substitute equals for equals the fact that AD is perpendicular to BC means that it is also
perpendicular to AC; but this is preposterous. What went wrong? The answer is
that BC and AC were equal with respect to length; that is, with respect to the relation
“is the same length as.” This, in turn, means that wherever the length of BC is the
correct answer so also will the length of AC be the correct answer. However, in Figure
2.54 the relation “is perpendicular to” is a relation between lines, not the lengths of the
lines. This is one reason that the better geometry books write AC = BC to indicate
that it is the lengths of the lines that are equal, not the lines themselves. In short, it
is true that AD is perpendicular to BC, but it is not true that AC = BC. In fact, had
we used the notation BC and AC to denote lengths there would have been no problem.
Exercises
1. Tell which of the following relations are reflexive, which are symmetric,
and which are transitive:
(a) is greater than
(b) is the sister of
(c) is the friend of
(d) is divisible by
(e) has the same shape as
(f) lives next door to
2. Describe the importance of an equivalence relation.
chapter three / THE "GAME" OF
MATHEMATICS: An Introduction
to Abstract Systems (with Special
Emphasis on Boolean Algebra)
3.1 INTRODUCTION
There is an old saying that the only sure things are death and taxes.
Yet, with the exception of certain trivialities, man can be sure of nothing.
For example, if someone wanted to bet against us that the world would
not be here tomorrow we would have to wait until then before we could
collect. In short, in the world of science until an experiment is completed,
all we ever have is a conjecture. The systematic study of any serious topic
begins with certain assumptions: things we believe to be true. These
“truths” may be based on whims, intuition, or past experience. They
may appear self-evident or they may be acquired knowledge; but whatever
they are, they serve as the basis of further inquiries.
We then apply the rules of logical thought to these assumptions and
thus try to determine what facts follow in an inescapable manner from our
assumptions. We do not worry so much about whether our conclusions
are true in the real world (for we cannot even be sure that our assumptions
are absolutely true); but we do worry about whether our conclusions are
inescapable consequences of our assumptions. Herein lies a basic difference
between truth and validity. Truth is a subjective value judgment made
about individual statements. Validity is a more objective judgment that
applies to an entire argument. We say that an argument is valid if the
conclusion follows inescapably from the assumptions, independent of
whether the assumptions happen to be true.
Truth and validity are connected by the fact that if an argument is
valid and if the assumptions are true, then the conclusion also is true.
By way of illustration, we give four different arguments in which the
validity of the argument and the truth of the conclusion appear in different
combinations.
Example 1
All Texans are Martians.
All Martians are Bostonians.
Therefore, all Texans are Bostonians.
Example 2
All Frenchmen are Europeans.
All Germans are Europeans.
Therefore, all Frenchmen are Germans.
Example 3
All Parisians are Europeans.
All Frenchmen are Europeans.
Therefore, all Parisians are Frenchmen.
Here the conclusion is true, but this truth does not follow from the
form of our assumptions. In short, the fact that it is true that all Parisians
are Frenchmen requires more knowledge than that given in the above two
assumptions. Finally, it is possible that we have both truth and validity.
Example 4
All bears are animals.
All animals have four legs.
Therefore, all bears have four legs.
Example 5
All bears are trees.
All trees have four legs.
Therefore, all bears have four legs.
The argument is still valid and the conclusion is still true, but the
assumptions are not true.
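Since validity depends only on the form of an argument, it can be tested without knowing anything about Texans or bears. Examples 1, 4, and 5 share the form "All A are B; all B are C; therefore all A are C," while Examples 2 and 3 share the form "All A are C; all B are C; therefore all A are B." The sketch below searches for a counterexample to each form. It rests on two assumptions made here rather than in the text: "All X are Y" is read as "nothing is in X without being in Y" (so it holds vacuously when X is empty), and a "world" is described by which of the eight possible membership patterns (in A or not, in B or not, in C or not) actually occur.

```python
from itertools import product

# Each pattern is (in_A, in_B, in_C) for one kind of individual.
TYPES = list(product([False, True], repeat=3))
A, B, C = 0, 1, 2


def all_are(world, x, y):
    """'All x are y': no individual in the world is in x but not in y."""
    return all(t[y] for t in world if t[x])


def worlds():
    """Every choice of which membership patterns are inhabited (2^8 worlds)."""
    for mask in range(2 ** len(TYPES)):
        yield [t for i, t in enumerate(TYPES) if (mask >> i) & 1]


def is_valid(premises, conclusion):
    """Valid means: no world makes every premise true and the conclusion false."""
    for world in worlds():
        premises_hold = all(all_are(world, x, y) for x, y in premises)
        if premises_hold and not all_are(world, *conclusion):
            return False  # counterexample found
    return True


chain_form = is_valid([(A, B), (B, C)], (A, C))    # form of Examples 1, 4, 5
shared_form = is_valid([(A, C), (B, C)], (A, B))   # form of Examples 2, 3
print(chain_form, shared_form)
```

Notice that the check never asks whether any premise is true in the real world, which is exactly the distinction between truth and validity drawn above.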
It is this last result that makes the scientific method so difficult to work
with. In the “game” of life we know nothing for sure; all we have is what
we can see. Thus, in many cases the scientist starts with a known
observation and then tries to invent a theory that explains the observation. In
essence, he applies valid reasoning to his assumptions in order to deduce a
true conclusion. As noted above, this in no way proves that his
assumptions are true. This is the problem that motivated Einstein to say when
asked if his theory of relativity was true: “All the experiments in the world
can never prove me right; but a single one may prove me wrong!”
Paraphrased, he was saying that false assumptions together with valid reasoning
could yield a true conclusion, but that true assumptions together with
valid reasoning could not yield a false conclusion.
Getting back to the major point of this section, we wish to show how all
mathematical systems are set up so that one can deduce conclusions from
given assumptions. It is not necessary that the conclusions be true,
although we keep in mind that the “truer” the assumptions, the “truer”
will be the valid conclusions.
In closing this part of the discussion we should point out that this type
of inquiry into validity is not restricted to mathematical systems, nor to
the natural and physical sciences. Indeed, every branch of human
endeavor (the social sciences as well as the physical and natural sciences) has
its foundations in the fact that we begin with man-made assumptions and
apply the rules of logical thought. These rules are the same in the social
sciences as they are in the physical sciences. The biggest difference
between the physical and the social sciences is that it seems much more
difficult in the complex social sciences to find “rules of the game” that are
acceptable to all players.
We now begin to study the development of mathematical systems and
to identify a mathematical structure. Because most people seem to have
a certain “faith” in intuition and common sense but seem to distrust logic
(probably because logic is a difficult subject), we shall in the next section
point out the weakness of intuition and common sense, in order to motivate
better the need for logic.
Example 1
A man drives from town A to town C by way of town B. The distance
between A and B is the same as that between B and C. In going from A
to B he averages 20 mph, while in going from B to C he averages 30 mph.
What was his average speed for the entire trip from A to C via B?
ple, since the distance from A to B is the same as the distance from B to
C, we spend more time traveling at the slower speed (20 mph) than at the
faster speed (30 mph). Thus, it should not be surprising that the average
is closer to 20 than to 30. Likewise, if we were to figure average test scores
and if we scored 50 on the first test and 100 on the second, at first glance
we might be tempted to consider the average to be 75. But such an
answer presupposes that the two tests are equally weighted. For example,
if the score of 50 had been made in a two-minute homework quiz and the
score of 100 had been made in a final examination, the average would
probably be closer to 100 than to 50. In the travel problem, had the man
driven for the same length of time at each speed, then the average speed
would indeed have been 25 mph.
With respect to this same travel problem, there is still another surprising
aspect. The answer 24 mph depends only on the numbers 20 and 30 and
not on the distance traveled. There are many ways of checking this but
perhaps the most straightforward way is to observe that 20 mph means 1
mile per 3 minutes, whereas 30 mph means 1 mile per 2 minutes. Since
the distance traveled at 20 mph is the same as that traveled at 30 mph, we
see that we travel 3 minutes at one speed to cover the same distance that
we can cover in 2 minutes at the other speed. Since this means that the
time ratio is 3:2, we know that 3/5 of the time is spent at the lower speed,
while 2/5 of the time is spent at the greater speed. Notice that 24 is exactly
3/5 of the way between 30 and 20 or, equivalently, 2/5 of the way between 20
and 30, regardless of the distance between A and B. Notice that the
interpretation 1/2 + 1/3 = 2/5 gives the correct answer in the sense that 1 mile per
2 minutes and 1 mile per 3 minutes “averages out to” 2 miles per 5 minutes
(24 miles per hour).
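The travel problem can be checked by computing total distance over total time. The sketch below assumes equal-length legs, as in the example; the function name `average_speed` and the leg lengths tried are illustrative.

```python
from fractions import Fraction


def average_speed(speeds, leg_miles):
    """Average speed over consecutive legs of equal length `leg_miles`."""
    total_distance = leg_miles * len(speeds)
    total_time = sum(Fraction(leg_miles) / v for v in speeds)  # hours
    return total_distance / total_time


# The answer does not depend on how long each leg is.
for miles in (1, 30, 57):
    print(miles, average_speed([20, 30], miles))

# Weighting each speed by its share of the time gives the same figure.
time_weighted = Fraction(3, 5) * 20 + Fraction(2, 5) * 30
print(time_weighted)
```

The plain average of 20 and 30 would be right only if the two speeds were held for equal times rather than equal distances.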
Example 2
Let towns A, B, and C all be 30 miles from each other. Just as before,
the driver goes from A to B at an average speed of 20 mph. How fast
must he travel from B to C if he wishes to average 60 mph for the entire
trip from A to C?
In other words, even if the driver could now get from B to C without using
any more time, he has still used 1 1/2 hours. Thus, the best he could do is
60 miles per 3/2 hours, or 40 mph. Notice that we are not saying that the
driver can never average 60 mph. We are saying that he cannot do it
within the prescribed distance. For example, since he drove for 1 1/2 hours
at 20 mph, he must drive at 100 mph for 1 1/2 hours also if he wishes to
average 60 mph. But 100 mph for 1 1/2 hours means a distance of 150, not 30
miles.
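The impossibility in this example shows up as soon as the time budget is written out. In the sketch below (function and variable names are illustrative), the time allowed for the whole trip at the target average is compared with the time already spent on the first leg.

```python
from fractions import Fraction


def required_second_leg_speed(leg_miles, first_speed, target_average):
    """Speed needed on the second of two equal legs, or None if impossible."""
    total_time_allowed = Fraction(2 * leg_miles, target_average)  # hours
    time_used = Fraction(leg_miles, first_speed)                  # hours
    time_left = total_time_allowed - time_used
    if time_left <= 0:
        return None  # the first leg alone has used up the whole time budget
    return leg_miles / time_left


print(required_second_leg_speed(30, 20, 60))  # the example: no speed suffices
print(required_second_leg_speed(30, 20, 24))  # an attainable target

# Best conceivable average: cover the second leg in no time at all.
best_average = Fraction(2 * 30) / Fraction(30, 20)
print(best_average)
```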
Example 3
A man has 30 jelly beans that he wishes to sell at 3 for 1¢. He has a better
grade of jelly bean that he wishes to sell at 2 for 1¢. He also has 30 of
these. Rather than keep them in two separate jars, he decides to place
them all in one jar and sell them at 5 for 2¢. Which way does he make
more money?
Solution: What difference does it make? After all, 2/1¢ and 3/1¢ adds
up to 5/2¢. Yet it does make a difference, as the following computation
shows.
Thus, selling them separately the man brings in 25¢ if all are sold. On
the other hand, selling the entire 60 at 5 for 2¢ brings in only 24¢! That
is, there are 12 piles of 5 in 60, and each pile of 5 brings in 2¢.
Again, with a little more thought the mystery is easily solved. Namely,
the profit would have been the same had he had equal batches, rather than
equal amounts of each. For example, at 2 for 1¢ the 30 jelly beans split
into 15 batches, whereas at 3 for 1¢ they split into 10 batches. In other
words, it is 2 of one kind and 3 of the other that are worth 5 for 2¢. For
example, had he sold 30 at 2 for 1¢ and 45 at 3 for 1¢ (since in this case
there are 15 batches of each) he would have made the same either way.
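The jelly bean computation can be redone in a few lines. To avoid depending on the particular coin, the sketch below counts money in units of the price of one separately sold batch: the cheaper beans go 3 to a unit, the better beans 2 to a unit, and the mixed jar 5 for 2 units. The function names are illustrative.

```python
from fractions import Fraction


def revenue_separate(cheap_count, fine_count):
    """Revenue when the cheap beans sell 3 per unit and the fine beans 2 per unit."""
    return Fraction(cheap_count, 3) + Fraction(fine_count, 2)


def revenue_mixed(cheap_count, fine_count):
    """Revenue when all beans are pooled and sold 5 for 2 units."""
    return Fraction(2 * (cheap_count + fine_count), 5)


# Equal amounts (30 and 30): the mixed jar brings in less.
print(revenue_separate(30, 30), revenue_mixed(30, 30))

# Equal batches (45 cheap = 15 batches, 30 fine = 15 batches): no difference.
print(revenue_separate(45, 30), revenue_mixed(45, 30))
```

The mixed price is fair only when the jar holds the two kinds in the ratio 3 to 2, which is the point of the "equal batches" remark.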
The next illustration shows how intuition can lead to interesting
paradoxes when we deal with collections having infinitely many objects. (A
further discussion of this concept is presented in the next chapter.) This
is because in everyday life we are concerned with finite situations, and
hence our intuition may be said to be finitely oriented.
Example 4
Suppose that we have a certain number of objects in a collection. Then
it seems to be intuitively clear that by merely changing the name of each
object we in no way alter the total number in the collection. For example, a
collection of three men remains a collection of three men no matter how
the men decide to change their names. Let us start with the collection
of counting numbers (this is an infinite collection since the counting
numbers never come to an end) and change the name of each number by
replacing it by its double.
1  2  3  4  5   6   7   8   9   10  11  12  . . .
2  4  6  8  10  12  14  16  18  20  22  24  . . .
Intuition tells us that the top and bottom lines contain the same number
of members. Yet the top line contains both even and odd counting
numbers whereas the bottom line contains only the even counting numbers.
Certainly, it does not seem that there should be a 1 to 1 correspondence
between all the counting numbers and the even counting numbers.
Here is another way of viewing this example: Suppose there are infinitely
many men and infinitely many rooms in a hotel. We put the first man in
room 1, the second man in room 2, and so on, so that all the rooms are
occupied. Then infinitely many more men come to the hotel and each
man wants a room. The manager tells each of the men who are already
in the rooms to move into the room whose number is twice the original
number. Thus the man in room 1 moves to room 2, the man in room 2
moves to room 4, the man in room 3 moves to room 6, and so on. Each
of the original men is still in a room but now all the odd-numbered rooms
are vacant and the new men can move in as desired. This process can be
successfully completed over and over again as each new batch of men
comes to the hotel.
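No program can house infinitely many guests, but the manager's rule can be tried on the first few rooms. The sketch below is only a finite illustration: it moves the guests in rooms 1 through 12 and checks that no two of them collide and that every odd-numbered room up to 24 ends up free. The function name `rehouse` is illustrative.

```python
def rehouse(occupied_rooms):
    """Send the guest in room k to room 2k; return {old room: new room}."""
    return {room: 2 * room for room in occupied_rooms}


first_rooms = range(1, 13)
moves = rehouse(first_rooms)
new_rooms = set(moves.values())

# Odd-numbered rooms among the first 24 that no original guest now occupies.
vacated_odd = [r for r in range(1, 25) if r % 2 == 1 and r not in new_rooms]

print(moves)
print(vacated_odd)
```

In a finite hotel the rule fails, since the guest in room 12 would need a room 24 that might not exist; it is the endlessness of the rooms that makes the move possible.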
Another example of this problem is the following paradox. List the
counting numbers but not in the usual order; namely, list the first two
even numbers and then the first odd number. Then list the next two
even numbers and the next odd number, and continue this way. Thus, we
have 2, 4, 1, 6, 8, 3, 10, 12, 5, 14, 16, 7, 18, 20, 9, . . . . If we stop after
any odd number, no matter how far we go there will always be twice as
many even as odd numbers in the sequence. Since we will not run out
of even numbers before odd ones it appears that there are twice as many
even numbers as there are odd numbers, a fact completely contrary to
what we know to be true. If we reverse the role of even and odd numbers
and obtain 1, 3, 2, 5, 7, 4, 9, 11, 6, 13, 15, 8, . . . , then there are seemingly
twice as many odd numbers as even ones; and this is even more ridiculous
if we agree that there are also twice as many even as there are odd numbers.
Variations of this type are endless. For example, we could also “prove”
that there are three times as many even numbers as odd by writing 2, 4,
6, 1, 8, 10, 12, 3, 14, 16, 18, 5, 20, 22, 24, 7, . . .
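The rearranged listings are easy to generate, which makes it plain that the apparent ratio is a property of the ordering chosen and not of the numbers themselves. A minimal sketch follows; the generator name `interleave` and its parameter are illustrative.

```python
from itertools import count, islice


def interleave(evens_per_odd):
    """Yield `evens_per_odd` even numbers, then one odd number, repeatedly."""
    evens = count(2, 2)  # 2, 4, 6, ...
    odds = count(1, 2)   # 1, 3, 5, ...
    while True:
        for _ in range(evens_per_odd):
            yield next(evens)
        yield next(odds)


two_to_one = list(islice(interleave(2), 15))
three_to_one = list(islice(interleave(3), 16))
print(two_to_one)
print(three_to_one)

# Stopping after an odd number always shows the chosen ratio of evens to odds.
evens_seen = sum(1 for x in two_to_one if x % 2 == 0)
odds_seen = len(two_to_one) - evens_seen
print(evens_seen, odds_seen)
```

Every counting number still appears exactly once in each listing, so nothing has been added or lost; only the order has changed.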
The paradox hinges on the fact that we talk about endless, but we always
stop the sequence somewhere. Notice that in terms of endlessness the con
Example 5
Consider the right triangle each of whose legs is 1 standard unit (s.u.). By
the theorem of Pythagoras, the length of the hypotenuse is √2 (see
Figure 3.1).
Figure 3.1
Figure 3.2
Figure 3.3
The list of such paradoxes is huge. But it is not our purpose to write a
chapter of such paradoxes, so suffice it to say that not only is the list large,
but the complexity of these paradoxes varies from those that can be solved
with a minimum of effort to those which are so sophisticated that they are
unsolved even today. Our main purpose was to show that there are
situations in which intuition and common sense seem to fail us. Not that we
can legally say our intuition is wrong, but we can say that it leads to
incorrect results. (By this we mean that if the right answer to a problem is
12 and our intuition tells us 6; the fact is that our intuitive answer is 6,
even though it is not the correct answer.)
If intuition can lead us astray on certain occasions, we must admit that
it could lead us astray on others. Thus, we must have access to a
technique that is more objective than mere intuition.
We are not saying that one should not be happy to possess a fine intuitive
sense. In fact, it may be the finest single virtue the creative researcher
possesses. In summary, we are saying
(3) We shall also assume that the student knows absolutely nothing about
the game, even though he may be quite intelligent otherwise. For
example, if we wish to teach him bridge we shall assume that he has
not even heard of a deck of cards. If we assume certain advance
knowledge on the part of the student, we might well take certain
liberties and shortcuts in our presentation, thus obscuring the logical
structure of the game. In summary, then, we assume that our student
is a genuine beginner, not a mediocre or good player who is trying to
improve.
with a deck of cards. The equipment is the same but the rules are different.
Then we can explain in terms of the terminology and the rules, the
object of the game, or what we mean by a winning situation. How we
employ the rules to arrive at the winning situation is known as the strategy.
We can now define a game to be any system consisting of definitions,
rules, and a winning situation(s) that is carried out by strategy; that is, by
employing the rules in such a way that the winning situation follows in an
inescapable manner. This idea is diagrammed in Figure 3.4.
[Figure 3.4: diagram relating definitions and terminology (communication), strategy, and objective(s)]
We can probably add many more laments based on our own experiences;
but, just for the sake of illustration, compare the above statements with
their “game” counterparts which somehow don’t seem as strange:
I love baseball; I watch it on TV, I go to the ball parks, I study books on the subject.
Still I’m a pretty awful player.
Joe has much more natural ability than I have and when it comes to wrestling, he
can beat me no matter how hard I practice.
ness” and that for the first time we want to invent a language. It comes
time to define our first word. But how can we do this when as yet we have
no other words? It appears rather likely that man chose a number of
primitive concepts which he made no attempt to define objectively and
defined other concepts in terms of these.
The point to be made is that in the formation of any language some
subjectivity must take place, for somehow we must arrive at a starting point.
In other words, if we start with a particular word and try to trace its meaning
through various synonyms we must eventually get back to the original
word. For example, let us look up any word in the dictionary. After
finding a synonym for this word we look up the meaning of the synonym
and continue to proceed in this way. Eventually, we will be led back to
the original word or else to one of the previous synonyms. Such a procedure
is called circular reasoning. Circular reasoning is not necessarily evil.
Frequently, one learns to understand a concept better just by hearing the
concept rephrased, even if no additional facts are introduced. But from a
logical point of view, circular reasoning does not help us get to the core of
things. An example of circular reasoning might be a warning lantern
placed on a pile of stones. We ask why the lantern is there and are told
that it is to warn people of the pile of stones. We ask why the stones were
there and are told that they are there to support the lantern! As a second
illustration of circular reasoning, consider the case of the king who issues
a decree stating that he is perfect. When he is asked to prove that he is
perfect the king replies “Read the decree!”
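The dictionary experiment can be imitated on a small scale. The sketch below, in Python, uses a made-up four-word dictionary (purely hypothetical, and far smaller than any real one) in which every word is "defined" by a single synonym that is itself an entry. Since there are only finitely many words, following synonyms must sooner or later repeat a word.

```python
# Hypothetical toy dictionary: each word is "defined" by one synonym,
# and every synonym is itself an entry.
TOY_DICTIONARY = {
    "big": "large",
    "large": "great",
    "great": "grand",
    "grand": "big",
}

def trace_synonyms(word, dictionary):
    """Follow synonyms from word until some word repeats; return the chain."""
    chain = [word]
    seen = {word}
    while True:
        word = dictionary[word]
        chain.append(word)
        if word in seen:
            return chain
        seen.add(word)

print(trace_synonyms("big", TOY_DICTIONARY))
```

The last word of the chain is always one we have already met, which is exactly the circularity described above.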
Our contention is that whereas it is possible to define all words in one
way or another, the possibility hinges on our willingness to accept the fact
that some of the words must be defined subjectively. There is nothing
wrong with subjective definitions; for instance, concepts such as beauty,
justice, and holiness are certainly of a subjective nature. Consequently,
any attempts to justify the various definitions of these concepts must
ultimately rest with the user of the concept. Yet, while it might well be
impossible to answer objectively such questions as “What is beauty?”
“What is holiness?” “Is justice just?”, many of our own values stem from
our honest, albeit subjective, efforts to answer such questions. In our
attempt to learn the game of mathematics we shall eventually have to face
the problem of defining certain subjective concepts with a fair degree of
objectivity.
But while the distinction between objective and subjective is important,
we must not let a study of this problem obscure other phases of the overall
problem. For example, we must also keep in mind that even after the
problem “Can all words be defined?” has been solved to our own satisfaction,
the problem of one word having too many definitions can also haunt
us; for if a word has too many definitions, the chances for misinterpretation
Without hinting at the context, we merely ask the reader to make use of
the dictionary definition of the word “fair,” and based on what information
he finds, to tell us the meaning of our sentence. A look at Webster’s
Collegiate Dictionary, Seventh Edition, reveals:
fair \'fa(ə)r, 'fe(ə)r\ adj [ME fager, fair fr. OE faeger; akin to OHG fagar
beautiful and perh. to Lith puošti to decorate] 1: attractive in appearance:
BEAUTIFUL 2: superficially pleasing: SPECIOUS 3a: CLEAN, PURE
b: CLEAR, LEGIBLE 4a: not stormy or foul: CLOUDLESS b: free or
nearly free from precipitation 5: AMPLE (a ~ estate) 6a: marked by
impartiality and honesty: JUST b: conforming with the established rules:
ALLOWED 7a: PROMISING, LIKELY b: favorable to a ship’s course
(a ~ wind) 8: archaic: free of obstacles 9: not dark: BLOND 10: ADEQUATE
- fairness n
There are a multitude of ways to interpret the remark “She was fair.”
In summary, the problem of defining words objectively and unambiguously
can be difficult in any field of endeavor. In particular, we shall
experience this problem in our objective study of arithmetic. To understand
arithmetic one must grasp the concept of number. Yet number is
one of those primitive concepts that leads us to circular reasoning. Unless
we use the utmost care, an attempt to define the concept of number may
presuppose the knowledge of the concept of number in the definition.
“Young man, correcting my grammar is something up with which I shall not put!”
I haven’t no money.
I have money.
end to end, and if two fractions were to be equal if they were synonyms for
the same length, only then were we compelled to accept the rule that
1/2 + 1/3 = 5/6.
points and lines, but these concepts are very basic and their definitions
seem to be absurd. Consider Euclid’s definition:
From a subjective point of view, this is perhaps his way of trying to project
the feeling of the “tinyness” of a point, but from an objective point of view
the definition says no more than that a point is nothing. In a way, the
felony is compounded when he defines a line as that which is generated by
a moving point.
In fact, we saw in the previous chapter that severe problems, almost
paradoxes, arose when one confused “point” with “dot.” For example,
the rule of plane geometry that says that one and only one line is determined
by a pair of distinct points really hinges on the concept of “point”
rather than on the physical nature of a “dot.” Figure 3.5 clearly indicates
that more than one line can be drawn between two “dots.”
Figure 3.5
example, while it is true in the real world that the statement, “All Texans
are Martians” is false, there is nothing that prevents us from inventing a
game in which some pieces are named Texans and others are named
Martians, and then to make up the rule in this game that all Texans are
Martians.
However, before concluding this section, let us say something concerning
the old definition that an axiom is a self-evident statement of fact. Actually,
this definition is not in direct contradiction to our remarks that an
axiom is a rule of the game. The fact is that to most people mathematics is
a product of the real world. It is the language of the sciences. One
problem of the scientist is to make accurate predictions concerning the
world in which we live. Thus, while the rules of any game are at best
assumptions, at least in the game of science we try to make the rules as
self-evident as possible so that the predictions based on these rules will be as
“true” as possible.
In fact, this may be the best way to distinguish between pure and applied
mathematics, at least in so far as the game analog is concerned. That is,
let us agree that we call it applied mathematics when the game is based on
rules that conform to the real world, and pure or abstract mathematics if
the rules are compatible with one another but do not seem to serve as a
model for any physical entity. In summary, both pure and applied mathematics
as games have the same structure; the only difference is the degree
of realism upon which the rules are based.
Returning to the terminology of the game, we observe that strategy calls
for deducing inescapable consequences of the rules and other assumptions.
Such inescapable conclusions are called theorems. According to our
interpretation, the theorem is not merely the statement but its proof as well.
That is, any statement about our game, true or false, is called a conjecture
until it can be shown to follow as an inescapable consequence of the rules.
The conjecture together with the proof is the theorem. This is in keeping
with our main interest of seeing what things follow from the assumptions,
rather than including those truths which seem to be unrelated to the
assumptions. By way of illustration, in plane geometry the statement
that the base angles of an isosceles triangle are equal is only a conjecture
until we prove that it follows from the other definitions and rules. In fact
this was the beauty of the ancient Greeks’ contribution to geometry.
Certainly it must have been self-evident long before the time of Euclid that
the base angles of an isosceles triangle were equal. It was clear simply by
looking at the triangle. What Euclid did was establish a handful of definitions
and rules and then show that this result (together with several volumes
of other results) was an inescapable consequence of these definitions and rules.
The beauty of a theorem is that it contains a fact that follows merely from
the truth of other assumptions; and it does not require any additional pieces
of information. This is precisely what we mean when we talk about
capturing the structure of the subject.
Our definition of theorem allows us to make a strong connection between
truth and validity, a connection that we discussed a little earlier in this
chapter. Recall that we said that if the assumptions were true and the
argument valid then the conclusion was also true. Now, since in a theorem
the conclusion is a valid consequence of our rules, it follows that as soon as
the rules happen to be true so also will the conclusion be true. In particular,
then, as soon as we find a physical model that conforms to our rules
(and there may be no such model or there may be a large number of such
models, depending on the rules under consideration) the valid conclusions
of our game become “true facts” in the physical model.
We have chosen to study the arithmetic of sets as our first game. This
is our first choice because sets are new enough to most of us so that the study
will not seem dull; yet at the same time this subject is concrete enough so
that we can feel we are dealing with something “real.”
Before proceeding with the game of sets, however, a few additional words
are in order concerning our discussion of logic. While we may have made
logic seem something like a cure-all, the fact remains that the concept of
“inescapable” is rather subjective. How, for example, can we distinguish
between that which is really inescapable and that which isn’t, even though
we may think it is? From a different perspective this is much the
same as distinguishing between an unsolved problem and an unsolvable
problem. We do not know the difference until the problem is solved (in
which case it is no longer an unsolved problem); yet until the problem is
solved we usually have no way of making the distinction.
Boolean Algebra 286
Our point is that, since logic is the process of deducing inescapable
conclusions, even the study of logic is a game based upon man-made rules.
We do not wish to enter into a discussion of logic at this time but in terms
of the structure of our game concept, we should point out that every game
has not one but two sets of rules. There are the special rules, peculiar to
that particular game, such as three strikes is an out in the game of baseball;
and there are those rules that apply to the strategy level of every game:
the rules of logic. For example, no matter what game we are playing, we
find ourselves using such rules as: If event A causes event B to happen
and if event B causes event C to happen, then event A also causes event
C to happen. We repeat, no matter what the game, we must come to
grips with the rules of logic.
For the immediate purposes of our course we shall take the liberty of
assuming that we all, to some extent or another, know the basic rules of
logic, and we shall feel free to draw upon these as we need them in the
development of any particular game we are playing.
3.9.1 INTRODUCTION
(7) x ∪ y = y ∪ x
    x ∩ y = y ∩ x. (commutative)
(8) (x ∪ y) ∪ z = x ∪ (y ∪ z)
    (x ∩ y) ∩ z = x ∩ (y ∩ z). (associative)
(9) x ∩ (y ∪ z) = (x ∩ y) ∪ (x ∩ z)
    x ∪ (y ∩ z) = (x ∪ y) ∩ (x ∪ z). (distributive)
The special subsets φ and I are characterized by the following.
(10) For each x ∈ A, x ∩ I = x; and x ∪ φ = x. (identity)
Finally, the principle of complements states that
(11) Given x ∈ A, there exists another element x' ∈ A such that:
    x ∪ x' = I
    x ∩ x' = φ. (rule of complements)
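Rules (7) through (11) can be checked mechanically in one concrete model. The sketch below, in Python, takes A to be the set of all subsets of a small universal set (the three-element set and all the names are our own choices, made only for illustration) and tries every combination of x, y, and z. Such a check verifies the rules in this one model; it is not a proof, and it says nothing about why the rules were chosen.

```python
from itertools import combinations, product

I = frozenset({1, 2, 3})   # hypothetical universal set
PHI = frozenset()          # the empty set

def subsets(s):
    """Return every subset of s as a frozenset."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

A = subsets(I)

def complement(x):
    """The complement of x relative to I."""
    return I - x

def rules_hold():
    """Check rules (7) through (11) for every choice of x, y, z in A."""
    for x, y, z in product(A, repeat=3):
        ok = (
            x | y == y | x and x & y == y & x                        # (7)
            and (x | y) | z == x | (y | z)                            # (8)
            and (x & y) & z == x & (y & z)
            and x & (y | z) == (x & y) | (x & z)                      # (9)
            and x | (y & z) == (x | y) & (x | z)
            and x & I == x and x | PHI == x                           # (10)
            and x | complement(x) == I and x & complement(x) == PHI   # (11)
        )
        if not ok:
            return False
    return True

print(rules_hold())
```

Here `|` plays the role of ∪ and `&` the role of ∩.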
At this point it might be well for us to digress and make some comments
about the rules. Keep in mind that we are not calling the rules
self-evident even though they may seem that way to some “players.” For
example, the associative rules merely tell us that in our game, “voice
inflection” (symbolically, we indicate voice inflection in mathematics by use
of parentheses and the like) does not matter when we deal exclusively with
unions or intersections. However, the distributive rules tell us that voice
inflection is important when unions and intersections appear within the
same expression.
This is not unlike the situation in ordinary arithmetic. For example,
given 3 + 4 + 5, we can pronounce it as (3 + 4) + 5 or as 3 + (4 + 5).
It happens that these two pronunciations yield the same result. On
the other hand, 3 + 4 × 5 also has two pronunciations. But in one
case we have (3 + 4) × 5 (= 35), while in the other case we have
3 + (4 × 5) (= 23). Thus, the value of 3 + 4 × 5 does depend on voice
inflection. Even with respect to the same operation, the use of parentheses
can make a difference. For example, given 9 − 3 − 1, we may view this
as (9 − 3) − 1, which is 5; or we may view it as 9 − (3 − 1), which is 7.
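These "pronunciations" are easy to compare directly. The short Python sketch below (the function name is ours) pairs the two groupings of each expression from the paragraph above.

```python
def groupings():
    """Return the two parenthesized readings of each expression."""
    return {
        "3 + 4 + 5": ((3 + 4) + 5, 3 + (4 + 5)),
        "3 + 4 x 5": ((3 + 4) * 5, 3 + (4 * 5)),
        "9 - 3 - 1": ((9 - 3) - 1, 9 - (3 - 1)),
    }

for expression, (left_first, right_first) in groupings().items():
    print(expression, left_first, right_first, left_first == right_first)
```

Only the first expression gives the same value under both groupings.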
Such problems are not restricted to mathematics. Among other things
they occur right within the framework of grammar. By way of illustration,
consider the statement
We leave it to the reader to verify the other rules stated above by any
method that he wishes.
Let us now proceed with the strategy part of the game, but from a fairly
informal point of view. Let us start with any element x ∈ A.
Figure 3.9
Theorem 1
For each x ∈ A, x ∪ x = x.
Recall that the theorem is the statement, together with the proof of the
statement. The statement, until the proof is supplied, is merely a conjecture.
In other words, the theorem is the statement that follows inescapably
from the rules; and the demonstration of this is called the
proof. In no event do we say that the theorem is true, only that it is an
inescapable consequence of our rules.
Exercises
1. Use the circle diagrams to verify the distributive rule.
2. Use the chart method to verify the distributive rule.
3. Explain why we do not call either the circle diagram or the chart a proof
of the distributive rule.
Theorem 1
For each x ∈ A, x ∪ x = x.
Proof:
Statement                                    Reason
(1) Given x ∈ A, there exists x' ∈ A         (1) Rule of complements
    such that x ∩ x' = φ
(2) x ∪ (x ∩ x') = x ∪ φ                     (2) Substitution of φ for x ∩ x'
                                                 in x ∪ (x ∩ x')
(3) x ∪ (x ∩ x') = (x ∪ x) ∩ (x ∪ x')        (3) Distributive rule
(4) (x ∪ x) ∩ (x ∪ x') = x ∪ φ               (4) Substitution of (3) into (2)
(5) x ∪ x' = I                               (5) Rule of complements
(6) x ∪ φ = x                                (6) Identity rule
(7) (x ∪ x) ∩ I = x                          (7) Substitution of (5) and (6)
                                                 into (4)
(8) (x ∪ x) ∩ I = x ∪ x                      (8) Identity rule
(9) x ∪ x = x                                (9) Substitution of (8) into (7)
This proof merely spells out, step by step, the procedure that had
occurred previously when we wrote
x ∩ x' = φ
x ∪ (x ∩ x') = x ∪ φ
(x ∪ x) ∩ (x ∪ x') = x ∪ φ
(x ∪ x) ∩ I = x
(x ∪ x) = x.
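Each line of the chain above can be tested in a concrete model. In the following Python sketch (the universal set and the names are our own assumptions) every line is evaluated for every subset x of a three-element set. This confirms that no line fails in this model; it does not replace the statement-reason proof, which shows that each line follows from the rules alone.

```python
from itertools import combinations

I = frozenset({"a", "b", "c"})   # hypothetical universal set
PHI = frozenset()                # the empty set

def subsets(s):
    """Return every subset of s as a frozenset."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def theorem_1_steps(x):
    """Evaluate each line of the chain for one particular x."""
    xc = I - x   # the complement x'
    return [
        x & xc == PHI,
        x | (x & xc) == x | PHI,
        (x | x) & (x | xc) == x | PHI,
        (x | x) & I == x,
        x | x == x,
    ]

print(all(all(theorem_1_steps(x)) for x in subsets(I)))
```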
the duality principle on blind faith, nor as another rule. Rather, we need
only observe that to prove the dual of a given theorem all we need do is
interchange ∪ and ∩, as well as φ and I, everywhere in the proof of the
theorem.
As an application of this, we shall prove the dual of Theorem 1 by copying
the statement and the proof word for word, except that we shall interchange
∪ and ∩, as well as φ and I.
Corollary 1
For each x ∈ A, x ∩ x = x.
Proof:
Statement                                    Reason
(1) Given x ∈ A, there exists x' ∈ A         (1) Rule of complements
    such that x ∪ x' = I.
(2) x ∩ (x ∪ x') = x ∩ I                     (2) Substitution
(3) x ∩ (x ∪ x') = (x ∩ x) ∪ (x ∩ x')        (3) Distributive rule
(4) (x ∩ x) ∪ (x ∩ x') = x ∩ I               (4) Substitution
(5) x ∩ x' = φ                               (5) Rule of complements
(6) x ∩ I = x                                (6) Identity rule
(7) (x ∩ x) ∪ φ = x                          (7) Substitution
(8) (x ∩ x) ∪ φ = x ∩ x                      (8) Identity rule
(9) x ∩ x = x                                (9) Substitution
We merely copied the statement and the proof of Theorem 1 verbatim
except for the duality substitution, and this gave us a proof for Corollary
1. Yet, if we didn’t tell the reader our secret, notice that the resulting
statement-reason format is a valid proof; that is, the reader would not
know that all we did was make a verbatim translation to obtain the dual
from the original theorem. In other words, whenever we have proven a
theorem we may immediately write down its dual as another theorem, and
if called upon to submit a proof, we need merely copy the proof of the
theorem, replacing ∪ by ∩ and φ by I, and vice versa. Henceforth, whenever
we do this, we shall simply give duality as the reason.
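Because the translation is purely mechanical, it can be carried out by a symbol-for-symbol swap. The Python sketch below is only an illustration (the function name is ours): it interchanges ∪ with ∩ and φ with I in a written statement. It should be applied to formulas only, since it would also alter a capital I occurring in an ordinary word such as "Identity."

```python
# Swap table for the duality substitution: union <-> intersection, phi <-> I.
SWAP = str.maketrans({"∪": "∩", "∩": "∪", "φ": "I", "I": "φ"})

def dual(statement):
    """Return the dual of a formula written with the symbols ∪, ∩, φ, and I."""
    return statement.translate(SWAP)

print(dual("x ∪ (x ∩ x') = (x ∪ x) ∩ (x ∪ x')"))
print(dual("x ∩ x' = φ"))
```

Applying `dual` twice returns the original statement, which reflects the fact that the dual of the dual is the theorem we started with.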
Keep in mind that we never have tried to explain why we chose the
x ∩ (y ∪ φ) = x ∩ y.
This gives us a form that contains the desired x ∩ φ; but things are
still fairly complicated. However, we assumed that y could be any element
of A. We now choose y to our advantage; namely, since we know that
x ∩ x' = φ, we shall choose y to be x'; whence (1) by substitution becomes
(x ∩ x') ∪ (x ∩ φ) = x ∩ x'
or φ ∪ (x ∩ φ) = φ.
x ∩ φ = φ.
Once we have the proof sketched as above, we can put it into the more
direct statement-reason form and proceed in the usual way. The reader
who has studied geometry may recall this as a common technique.
Namely, one makes a rough diagram, forms a plan of attack, and then
translates his approach into the formal statement-reason approach.
Theorem 2
For each x ∈ A, x ∩ φ = φ.
Proof:
Statement                                    Reason
(1) For the given x there exists x'          (1) Rule of complements
    such that x ∩ x' = φ.
(2) x' ∪ φ = x'                              (2) Identity rule
(3) x ∩ (x' ∪ φ) = x ∩ x'                    (3) Substitution
(4) x ∩ (x' ∪ φ) = (x ∩ x') ∪ (x ∩ φ)        (4) Distributive rule
(5) (x ∩ x') ∪ (x ∩ φ) = x ∩ x'              (5) Substitution of (4) into
                                                 (3)
(6) x ∩ x' = φ                               (6) Rule of complements
(7) φ ∪ (x ∩ φ) = φ                          (7) Substitution of (6) into
                                                 (5)
(8) φ ∪ (x ∩ φ) = (x ∩ φ) ∪ φ                (8) Commutative rule
(9) (x ∩ φ) ∪ φ = φ                          (9) Substitution
(10) (x ∩ φ) ∪ φ = x ∩ φ                     (10) Identity rule
(11) x ∩ φ = φ                               (11) Substitution
Again, there may be other ways to prove this theorem; the proof above
is only one way. Also, the fact remains that even if the above demonstration
does not seem self-evident to a player, it still constitutes a proof.
In support of an earlier remark, it should be noted here that nowhere
have we used Theorem 1 in the proof of Theorem 2. This implies that
the two theorems are independent. That is, Theorem 2 could just as
well have been proved before Theorem 1. Finally, to the student who feels that
charts or circle-diagrams are an “easier” way of getting the answer, we wish
to emphasize, as strongly as possible, that our procedure does more than
show the answer. It shows how the answer is an inescapable consequence
of our 11 rules, thus emphasizing the role of structure.
Using the principle of duality, we now have an immediate consequence of
Theorem 2.
Corollary 2
For each x ∈ A, x ∪ I = I.
Theorem 3
For each x and y in A, x ∪ (x ∩ y) = x.
Proof:
Corollary 3
For each x and y in A, x ∩ (x ∪ y) = x.
Again, we could have established the results stated in Theorem 3 and
its corollary merely by a suitable circle diagram. However, the proof
establishes the inescapability of these results from the 11 assumptions,
independently of the use of charts, circle diagrams, and so on. In still
other words, our proof shows not only the “truth” of Theorem 3 but also
the fact that no other properties of sets, other than those already objectively
stated, are necessary to “guarantee” this “truth.” We keep repeating
this idea because we feel it cannot be emphasized too strongly.
Moreover, let us also observe that we are not bound to invoke the principle
of duality merely because we have the right to. For example, once
Theorem 3 is established, we may write
x ∪ (x ∩ y) = (x ∪ x) ∩ (x ∪ y).
But by Theorem 1, x ∪ x = x; hence, x ∪ (x ∩ y) = x ∩ (x ∪ y). Now
since x ∪ (x ∩ y) = x and since x ∪ (x ∩ y) = x ∩ (x ∪ y), we have
x ∩ (x ∪ y) = x, and this gives us another proof of Corollary 3. This
further substantiates our remark that there is often more than one correct
proof of a particular theorem.
Theorem 4
Suppose that we are given elements x and y in A and that we can find an
element z in A such that
x ∪ z = y ∪ z
and
x ∩ z = y ∩ z.
Then x = y.
x = x ∪ (x ∩ z)          Theorem 3
  = x ∪ (y ∩ z)          Given & substitution
  = (x ∪ y) ∩ (x ∪ z)    Distributive rule & substitution
  = (y ∪ x) ∩ (y ∪ z)    Commutative rule & substitution
  = y ∪ (x ∩ z)          Distributive rule
  = y ∪ (y ∩ z)          Substitution
  = y                    Theorem 3
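Theorem 4 can be tried out in the model of subsets of a small set. The Python sketch below (the universal set and the function names are hypothetical) searches every triple x, y, z for a violation of the theorem, and also looks for a triple showing that the first hypothesis by itself would not be enough. As always, the search checks one model only; the chain of equalities above is what establishes the result from the rules.

```python
from itertools import combinations, product

I = frozenset({1, 2, 3})   # hypothetical universal set

def subsets(s):
    """Return every subset of s as a frozenset."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

A = subsets(I)

def cancellation_holds():
    """True if x ∪ z = y ∪ z together with x ∩ z = y ∩ z always forces x = y."""
    for x, y, z in product(A, repeat=3):
        if x | z == y | z and x & z == y & z and x != y:
            return False
    return True

def union_alone_counterexample():
    """Find some x != y with x ∪ z = y ∪ z, or return None if there is none."""
    for x, y, z in product(A, repeat=3):
        if x | z == y | z and x != y:
            return set(x), set(y), set(z)
    return None

print(cancellation_holds())
print(union_alone_counterexample())
```

The second search succeeds, which suggests why the theorem asks for both equalities.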
Theorem 5
I' = φ.
Proof: We shall show that I ∪ I' = I ∪ φ and I ∩ I' = I ∩ φ. Then
Theorem 5 will follow immediately from Theorem 4. By definition of
Corollary 5
φ' = I.
Proof: We merely apply the duality principle to Theorem 5. Observe
that, in a sense, Theorem 5 could be viewed as a corollary to Theorem 4
since it is virtually an immediate consequence of the result stated therein.
However, we prefer to call this result a theorem rather than a corollary
for the purpose of emphasizing the result.
In a similar way, the following is easily shown to be another consequence
of Theorem 4.
Theorem 6
For each x ∈ A, (x')' = x. (This is a form of double negation. Intuitively
it says that the complement of the complement is the original set.)
Proof: By Theorem 4 all we need show is that x' ∪ (x')' = x' ∪ x and
x' ∩ (x')' = x' ∩ x in order to establish the equality of x and (x')'.
But by the definition of (x')' we have that (x') ∪ (x')' = I and also
(x') ∩ (x')' = φ, while by the definition of x' (and the commutative rule)
we have that (x') ∪ x = I and (x') ∩ x = φ, and the theorem is proved!
Let us put this into statement-reason format.
Statement                        Reason
(1) x' ∪ (x')' = I               (1) Rule of complements
    x' ∩ (x')' = φ
(2) x ∪ x' = I                   (2) Rule of complements
    x ∩ x' = φ
(3) x ∪ x' = x' ∪ x              (3) Commutative rule
    x ∩ x' = x' ∩ x
(4) x' ∪ x = I                   (4) Substitution of (3) into (2)
    x' ∩ x = φ
(5) x' ∪ (x')' = x' ∪ x          (5) Substitution of (4) into (1)
    x' ∩ (x')' = x' ∩ x
(6) (x')' = x                    (6) Theorem 4
Theorem 7
For x, y in A; (x ∪ y)' = x' ∩ y'.
Proof: By the definition of (x ∪ y)', we know that (x ∪ y) ∪ (x ∪ y)' =
I and (x ∪ y) ∩ (x ∪ y)' = φ. Hence, by Theorem 4 all we need show
is that (x ∪ y) ∪ (x' ∩ y') = I and (x ∪ y) ∩ (x' ∩ y') = φ. To this
end:
(x ∪ y) ∪ (x' ∩ y') = [(x ∪ y) ∪ x'] ∩ [(x ∪ y) ∪ y']   (Distributive rule)
                    = [(y ∪ x) ∪ x'] ∩ [(x ∪ y) ∪ y']   (Why?)
                    = [y ∪ (x ∪ x')] ∩ [x ∪ (y ∪ y')]   (Associative rule
                                                         and substitution)
                    = (y ∪ I) ∩ (x ∪ I)                 (Why?)
                    = I ∩ I                             (Why?)
                    = I                                 (Identity rule and substitution)
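For readers who like to see a result confirmed in a familiar setting, Theorem 7 and its dual can be checked on all subsets of a small set. The Python sketch below is such a check under our own assumptions (a three-element universal set and illustrative names); it verifies the statements in this one model and is not a substitute for the derivation from the rules.

```python
from itertools import combinations, product

I = frozenset({1, 2, 3})   # hypothetical universal set

def subsets(s):
    """Return every subset of s as a frozenset."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def complement(x):
    """The complement of x relative to I."""
    return I - x

def theorem_7_and_dual_hold():
    """Check (x ∪ y)' = x' ∩ y' and (x ∩ y)' = x' ∪ y' for all subsets x, y."""
    for x, y in product(subsets(I), repeat=2):
        if complement(x | y) != complement(x) & complement(y):
            return False
        if complement(x & y) != complement(x) | complement(y):
            return False
    return True

print(theorem_7_and_dual_hold())
```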
Definition 1
For x and y in A, x ⊂ y shall mean that x ∪ y = y.
Using the circle diagram idea to interpret the meaning of x ⊂ y, we
see that x ⊂ y also implies that x ∩ y = x and that x ∩ y' = φ. Conse
Definition 1a
For x, y in A; x ⊂ y means that x ∩ y = x or
Definition 1b
For x, y in A; x ⊂ y means that x ∩ y' = φ.
Theorem 8
Given x and y in A; then x ∪ y = y if and only if x ∩ y = x. That is, if
x ∪ y = y, then x ∩ y = x; and if x ∩ y = x, then x ∪ y = y.
x ∩ y = x ∩ (x ∪ y)          (Substitution)
      = (x ∩ x) ∪ (x ∩ y)    (Why?)
      = x ∪ (x ∩ y)          (Why?)
      = x.                   (Why?)
x ∪ y = (x ∩ y) ∪ y          (Substitution)
      = y ∪ (x ∩ y)          (Commutative rule and substitution)
      = y,                   (Theorem 3 and substitution)
Theorem 9
Given that x and y are in A; then x ∪ y = y if and only if x ∩ y' = φ.
That is, if x ∪ y = y, then x ∩ y' = φ; and if x ∩ y' = φ, then x ∪ y = y.
Definition
For any x and y in A, we say x ⊂ y if and only if
(1) x ∪ y = y or
(2) x ∩ y = x or
(3) x ∩ y' = φ.
For any particular x and y either (1), (2), and (3) are all true, or else
all three are false.
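The claim that conditions (1), (2), and (3) stand or fall together can be examined in the model of subsets of a small set. In the Python sketch below (the universal set and names are our own assumptions) the three conditions are evaluated for every pair x, y and compared with one another, and also with Python's built-in subset test `<=`.

```python
from itertools import combinations, product

I = frozenset({1, 2, 3})   # hypothetical universal set
PHI = frozenset()          # the empty set

def subsets(s):
    """Return every subset of s as a frozenset."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def conditions(x, y):
    """Evaluate conditions (1), (2), and (3) of the definition for x and y."""
    return (x | y == y, x & y == x, x & (I - y) == PHI)

def all_agree():
    """True if the three conditions always agree with each other and with x <= y."""
    for x, y in product(subsets(I), repeat=2):
        c = conditions(x, y)
        if len(set(c)) != 1 or c[0] != (x <= y):
            return False
    return True

print(all_agree())
```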
This definition, coupled with our rules and previously proven theorems,
allows us to analyze the structure of the set of subsets of a given set in
more detail. In fact, there are several immediate corollaries to the above
definition worth noting.
To begin with, we must have that x ⊂ I and φ ⊂ x, for all x ∈ A. That
x ⊂ I follows from our definition of ⊂ and the identity rule, since we know
that x ∩ I = x, which in turn says that x ⊂ I. Similarly, x ∪ φ = x tells
us that φ ⊂ x. Because of the importance of these two results we shall
state them as theorems rather than as corollaries of our definition.
Theorem 10
For each element x ∈ A, we have that
x ⊂ I
and
φ ⊂ x.
Theorem 10 emphasizes the limiting values of φ and I in the sense that
each x is between φ and I. Also notice that Theorem 10 gives us another
reason for agreeing that the empty set is a subset of every set. Namely,
we have now shown that, under our accepted rules, it follows inescapably
that φ ⊂ x. In still other words the assumption that φ is not a subset of I
is incompatible with our accepted 11 rules.
Another pair of corollaries to our definition are given by the next two
theorems.
Theorem 11
If x ⊂ φ, then x = φ; and if I ⊂ x, then I = x.
Proof: By definition, if x ⊂ φ, then x ∪ φ = φ; but x ∪ φ = x. Hence,
x ⊂ φ implies that x = φ. Similarly, if I ⊂ x, then I ∪ x = x, but
I ∪ x = I; hence I = x.
Theorem 11, among other things, shows that there is no analog of negative
numbers when one deals with sets. That is, we have shown that the
only subset of the empty set is the empty set itself. (In other words x
denoted any subset of φ and we showed that this implied x = φ.)
By this time the technique of deducing theorems and of studying the
structure of systems should be clearer. Thus, while there are many more
theorems we could prove, we shall prove just two more about sets. First
we shall demonstrate that ⊂ is a transitive relation.
Theorem 12
Given x, y, and z in A; if x ⊂ y and if y ⊂ z, then x ⊂ z.
Proof: Paraphrasing the theorem in terms of our definition, we wish to
show that if x ∪ y = y and if y ∪ z = z, then x ∪ z = z. Using the given
equalities we see that x ∪ z = x ∪ (y ∪ z) = (x ∪ y) ∪ z = y ∪ z = z
and the theorem is proved.
Finally, we wish to prove a theorem that will give us another technique
for proving that two elements are equal.
Theorem 13
Given x and y in A, then x = y if and only if x ⊂ y and y ⊂ x. (A result
that we accepted as being true by definition in Chapter 2, but here we show
that it is an inescapable consequence of our structure.)
x ∪ y = y ∪ y = y
and
x ∪ y = x ∪ x = x.
Exercises
1. In ordinary arithmetic, let us agree to call one statement the dual of
another if all we do to get from one to the other is interchange the words
multiplication and division (or the symbols × and ÷). State a rule
of arithmetic whose dual is not a rule.
2. Prove Corollary 2 directly from the given proof of Theorem 2 using the
principle of duality step-by-step.
Definition
S is called a Boolean algebra if and only if
other theorems that we did not prove) remain valid in any Boolean algebra
since these proofs depend only on the properties that constitute a Boolean
algebra. This is why we were so insistent on emphasizing the facts that (1)
we did not have to say why we accepted the rules, and (2) we only wanted
those conclusions that followed inescapably from our rules.
Thus, in any Boolean algebra we have the following.
(1) a ∪ a = a and a ∩ a = a.
(2) a ∩ φ = φ and a ∪ I = I.
(3) a ∪ (a ∩ b) = a ∩ (a ∪ b) = a.
(4) I' = φ and φ' = I.
(5) (a')' = a.
(6) (a ∪ b)' = a' ∩ b' and (a ∩ b)' = a' ∪ b'.
(7) a ∪ b = a ∪ c and a ∩ b = a ∩ c imply that b = c.
(8) a ∪ b = b if and only if a ∩ b = a if and only if a ∩ b' = φ.
Defining a ⊂ b to mean a ∪ b = b, we also have the following.
(9) For each a, φ ⊂ a and a ⊂ I.
(10) If a ⊂ φ, then a = φ; and if I ⊂ a, then I = a.
(11) If a ⊂ b and if b ⊂ c, then a ⊂ c.
(12) If a ⊂ b and if b ⊂ a, then a = b.
Since it is false that
a + (b × c) = (a + b) × (a + c)
and that
a × (1 − a) = 0,
we see that the rules of a Boolean algebra are different from those of the
usual numerical algebra. In other words, rules (4a) and (6) are rules in
Boolean algebra but not in ordinary numerical algebra. (Of course, if the
rules for a Boolean algebra were identical to those of ordinary arithmetic, it
would be of no interest to study Boolean algebra since structurally we would
not be able to distinguish it from ordinary arithmetic.)
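A single numerical counterexample is enough to show that each of these two statements fails as a general rule of arithmetic. The Python sketch below (function names ours, numbers chosen arbitrarily) tests one choice of values for each.

```python
def plus_distributes_over_times(a, b, c):
    """Test whether a + (b x c) equals (a + b) x (a + c) for these numbers."""
    return a + (b * c) == (a + b) * (a + c)

def product_with_one_minus_a_is_zero(a):
    """Test whether a x (1 - a) equals 0 for this number."""
    return a * (1 - a) == 0

print(plus_distributes_over_times(1, 2, 3))    # compares 7 with 12
print(product_with_one_minus_a_is_zero(2))     # 2 x (1 - 2) is -2
```

Both tests fail for these values. The equalities can still hold for special numbers (for instance, both hold when a = 0), which is why we speak of them as failing to be rules rather than as failing in every instance.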
There are also rules that are obeyed in ordinary numerical algebra but
not in a Boolean algebra. For example, in ordinary arithmetic, given any
number x we can find another number y such that x + y = 0. If this
were translated into the language of Boolean algebra, it would say: Given
a ∈ S, there exists b ∈ S such that a ∪ b = φ; but if a ≠ φ no such b
exists. (Why?)
This means that any theorem in numerical arithmetic will be a theorem
in Boolean algebra provided the proofs use rules that are always acceptable
to both Boolean algebra and numerical arithmetic.
If a proof uses a result that is true in one but not the other, the theorem
may still be valid in both, but we will need different proofs in each case.
For example, it is a theorem of arithmetic that a × 0 = 0 for all numbers
a; and it is a theorem of Boolean algebra that a ∩ φ = φ for all a ∈ S.
Yet the numerical proof utilizes rules that are not permitted in Boolean
algebra, and vice versa. (See A Note on Proofs at the end of this section.)
While the mathematician can invent a mathematical system merely
for its own sake, he usually has a particular model in mind when he invents
a system. In this section we defined a Boolean algebra in terms of such
operations as cup, cap, and prime in order to remain abstract; but as we
were doing this we had in mind the properties of union, intersection, and
complement as described in the previous section. This move guarantees
the fact that there is at least one “real” model that obeys all the necessary
properties for a system to be a Boolean algebra. Namely, the set of all
subsets of a given set together with the usual meanings of union, intersection,
and complement is such an algebra, since this is where the rules
came from in the first place.
Finally, one often tries to determine the rules that completely characterize
an entire subject. One way of telling whether a set of rules does
this is to invent or discover other systems that obey the same rules. If
two such systems have essentially little in common, the chances are that
the rules we selected were not sensitive enough to describe the entire
picture. If the two systems are virtually equivalent, then we have found
a unifying link in terms of the structure of the rules. In any event, as a
brief summary, it is often the case in studying a particular system that the
mathematician tries to characterize it by a small number of rules; he then
sees how many models he can find that obey those same rules.
In this spirit of inquiry we shall devote the next two sections to constructing
other models of a Boolean algebra; and, as a by-product, we shall
show the interrelationships that exist between three subjects of rather
diverse natures.
A NOTE ON PROOFS
Arithmetic          Sets
0 + 0 = 0           ∅ ∪ ∅ = ∅
Exercises
1. Explain the difference between Boolean algebra and the arithmetic
of the set of subsets of a given set.
2. Why will certain theorems of Boolean algebra also be theorems in
ordinary arithmetic, while other theorems of Boolean algebra will not
be?
3. Prove that in any Boolean algebra
a ∪ (a' ∩ b) = a ∪ b.
4. Let S = {0,1}. We define two binary operations on S that we denote
by ∪ and ∩, where
∪   0  1          ∩   0  1
0   0  1   and    0   0  0
1   0  1          1   0  1
would not be called a statement since we cannot define it with respect to
truth or falsity. Without further embellishment, it is probably fair to
assume that, in general, we all know what a statement is. The next
question concerns the formation of compound statements. Again, based
on previous experience we sense that
Both . . . and . . .
Either . . . or . . .
If . . . then . . .
are very common constructions for combining statements to form new
statements. They are so common that they have been given special
names (conjunction, disjunction, and conditional, respectively). How
shall we judge the truth of such statements? Perhaps it is best to proceed
by example.
Example 1
Consider the conjunctive statement
There are eight days in a week and there are twelve months in a year.
as an abbreviation for
agreed that a statement must be either true or false, one or the other, but
not both! Hence, we must define what we mean by the truth of a conjunctive
in terms of the truth of its constituent parts, and since we want
our results to conform with reality, we shall choose a definition to agree with
our experience.
Suppose we were to make a wager that both of two events will happen.
Then if we are to win our bet, both things must happen; otherwise we lose.
With this in mind, it seems natural that we define a conjunctive to be true
if and only if each of the constituent parts is true. But what has this to do
with the concept of truth-table logic? To answer this question, let us
introduce some convenient abbreviations and notations. We shall refer
to statements by letters such as p, q, r, s, and we shall use the conventional
notation
p ∧ q
as an abbreviation for
(both) p and q.
p    q
t    t
t    f
f    t
f    f
where for a particular choice of p and q one and only one of the above four
happens; but which it is depends on the truth of p and q. Now we have
agreed that p ∧ q is true only when p and q are each true. Thus, in terms
of the truth of p and q we define the truth of p ∧ q (and, hence, the name
truth-table) by the following table.
p    q    p ∧ q
t    t    t
t    f    f
f    t    f
f    f    f
Example 2
Consider the disjunctive statement
Here we are saying that the pen is in at least one of the two places and,
as a result, this statement is false only if each of the constituent parts
is false. In terms of traditional notation we write
p ∨ q
as an abbreviation for
(either) p or q.
p    q    p ∨ q
t    t    t
t    f    t
f    t    t
f    f    f
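The two tables above can also be generated mechanically. The following is a minimal sketch, assuming we represent "t" and "f" by Python's `True` and `False`; the helper names `truth_table` and `show` are our own and are used only for illustration.

```python
from itertools import product

def truth_table(connective):
    """List (p, q, value) for every combination of truth values,
    in the same row order as the tables in the text."""
    return [(p, q, connective(p, q))
            for p, q in product([True, False], repeat=2)]

def conjunction(p, q):
    """p ∧ q: true only when both parts are true."""
    return p and q

def disjunction(p, q):
    """p ∨ q: false only when both parts are false (inclusive 'or')."""
    return p or q

def show(rows):
    """Render each row with the letters t and f."""
    return [" ".join("t" if v else "f" for v in row) for row in rows]

for line in show(truth_table(conjunction)):
    print(line)
print()
for line in show(truth_table(disjunction)):
    print(line)
```

The only row on which the two connectives could be confused with the "exclusive or" of everyday speech is the first: with both parts true, the disjunction is still true.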
The only troublesome spot occurs when both p and q are true, because of
the usual misinterpretation that “either . . . or . . .” precludes the possibility
that both may happen. However, a little reflection shows that if
someone asks us for a pen and we know that we leave our pen in the desk or
on the table, then we may say in all honesty that the pen is either in the
desk or on the table, even though we may have left a pen in each place.
Putting this idea in terms of a wager, if a card is drawn from a deck and
we bet that it is either a spade or a face card, then we do not expect to lose
our wager if the card is the king of spades. Notice that we have made this
point earlier when we talked about unions of sets, but we feel it is an important
enough point to warrant repetition here.
Example 3
Consider the conditional statement
Letting p represent the statement following “if” and q represent the statement
following “then,” we abbreviate the conditional by writing
Case 1: p q
t t
p    q    p → q
t t t
Case 2: p q
t f
In this event the evidence is that he got 100 but that the professor did not
give him an A; yet his statement indicates that if he got 100 he should get
an A. Hence, we conclude in this case that he is “lying” (where “lying”
includes being misinformed). Thus we agree that
p    q    p → q
t f f
Case 3: p q
f t
In this event our evidence is that he did not get 100, but that he did get an
A. Observe that Jones told us merely what would happen if he got 100.
He said nothing about what would happen if he did not get 100. For example,
it might well be true that he would get an A with a score of 99
rather than 100. In any case, we have no proof that Jones is lying. Hence,
we bring in the verdict of “truthful.”
p    q    p → q
f t t
Case 4: p q
f f
In this event we are saying that evidence shows that Jones got neither 100
nor an A. Here, again, we cannot prove that Jones was lying. From a
different perspective, if Jones had asked his professor what grade he would
get with a score of 100 and the professor replied, “If you get 100, I’ll give
you an A,” then once Jones fails to get 100 the professor is consistent no
matter what grade he gives Jones. For example, if Jones gets 30, the
professor might well not give him an A. In short, we agree that
p    q    p → q
f f t
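The four cases can be collected into one rule: the conditional is false only when the antecedent is true and the consequent is false. The sketch below encodes that rule; the function name `implies` and the use of `True`/`False` for t/f are our own conventions, not the text's.

```python
def implies(p, q):
    """p → q: false only when p is true and q is false."""
    return (not p) or q

# The four cases, in the order discussed above.
cases = [(True, True), (True, False), (False, True), (False, False)]
conditional_table = [(p, q, implies(p, q)) for p, q in cases]

for p, q, value in conditional_table:
    print("t" if p else "f", "t" if q else "f", "t" if value else "f")
```

Writing the rule as `(not p) or q` is one of several equivalent choices; it simply restates that Jones can be caught "lying" in Case 2 only.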
Do not get “hung up” on p and q. Think of p as the antecedent (the “if”
clause) and q the consequent (the “then” clause). All we are saying is
that the conditional is true unless the antecedent is true and the consequent
is false. This definition also agrees with usage in scientific methods.
That is, if a scientist predicts that a certain thing must happen when condition
p occurs, he is not held to that prediction if condition p fails to occur.
We should notice that, intuitive or not, the above definition is just that: a
definition. All we have done is to try to make the definition more palatable
by showing that it agrees with our usual experience. We must not be
alarmed when confronted by strange statements. For instance, consider
the statement
If there are seven days in a week then Boston is the capital of Massachusetts.
p    q    p = q
t t t
t f f
f t f
f f t
p ~p
t f
f t
Finally, there are statements whose form makes them true regardless of
the truth of the constituent parts. For example, if p denotes any statement
then the statement “Either p is true or p is false” is a true statement
regardless of the truth of p. This follows from our definition that any
statement is either true or false (but not both). This same fact makes it
clear that the statement “p is both true and false” is always false regardless
of the truth of p. A statement that is always true by its very form, in
p    ~p    ~(~p)
t f t
f t f
t ____________________
P Aq P V q V
t f
t f
t f
t f
Just from the given definitions we can now construct the truth table for
an expression such as
(p ∨ q) ∧ (~p).
Namely, we take the conjunction of the two statements (p ∨ q) and
The second statement probably seems simpler than the first. Yet our
tables show that the two statements are logically equivalent; that is, they
have the same meaning.
We are not implying that those who understand grammatical structure
well would not have been able to decide this without the use of truth tables.
For instance, we might at a glance conclude that if either p or q is true but p
We shall illustrate this with a few examples. To begin with, consider rule
(5) in Boolean algebra, a ∪ ∅ = a. Using the identifications described
above, this translates into
p ∨ F = p.
Truth tables yield
p    F    p ∨ F
t    f    t
f    f    f
p ∧ (q ∨ r) = (p ∧ q) ∨ (p ∧ r).
p  q  r   q ∨ r   p ∧ (q ∨ r)   p ∧ q   p ∧ r   (p ∧ q) ∨ (p ∧ r)
t  t  t     t          t           t       t             t
t  t  f     t          t           t       f             t
t  f  t     t          t           f       t             t
t  f  f     f          f           f       f             f
f  t  t     t          f           f       f             f
f  t  f     t          f           f       f             f
f  f  t     t          f           f       f             f
f  f  f     f          f           f       f             f
                      (1)                               (2)
A glance at (1) and (2) shows us that, under our identification, rule (4) of
Boolean algebra becomes a rule for truth-table logic.
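Comparing columns (1) and (2) row by row is exactly the kind of task a short program can do for us. The following is a sketch under our own conventions: `equivalent` is a hypothetical helper (not notation from the text) that declares two statement forms logically equivalent when they agree on every assignment of truth values.

```python
from itertools import product

def equivalent(f, g, n):
    """True when f and g agree on every assignment of n truth values."""
    return all(f(*vals) == g(*vals)
               for vals in product([True, False], repeat=n))

def lhs(p, q, r):
    """Column (1): p ∧ (q ∨ r)."""
    return p and (q or r)

def rhs(p, q, r):
    """Column (2): (p ∧ q) ∨ (p ∧ r)."""
    return (p and q) or (p and r)

print(equivalent(lhs, rhs, 3))
```

With three letters there are 2 × 2 × 2 = 8 rows to compare; each additional letter doubles the number of rows, which is one reason it is convenient to have the theorems of Boolean algebra available instead.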
While we need not have learned about sets or Boolean algebra to study
truth-table logic, truth-table logic forms a physical model of a Boolean
algebra, and we can claim a knowledge of truth-table logic as a bonus once
we understand Boolean algebra. Stated in more mathematical terms,
every theorem of a Boolean algebra under the identification previously
discussed remains a theorem in truth-table logic—without our having to
use tables to verify the results.
For example, we proved that the theorem
a ∪ (a ∩ b) = a
held in any Boolean algebra. Using our identifications, this means that
p ∨ (p ∧ q) = p
is a theorem in truth-table logic. As a check we may, if we so desire, use
truth tables, in which case we obtain the following.
p    q    p ∧ q    p ∨ (p ∧ q)
t    t      t          t
t    f      f          t
f    t      f          f
f    f      f          f
(1)                   (2)
p    q    p → q    q → p    (p → q) ∧ (q → p) or p ↔ q    p = q
t    t      t        t                  t                    t
t    f      f        t                  f                    f
f    t      t        f                  f                    f
f    f      t        t                  t                    t
That is, if p implies q and if q implies p, then p and q are logically equivalent;
that is, p ↔ q means the same as p = q. In terms of sets, if all A’s are
B’s and if all B’s are A’s, then A and B are merely different names for the
same collection.
In other words, quite apart from Boolean algebra, the use of truth-table
logic does for sentence structure what ordinary algebra does for arithmetic.
In short, the student of grammar could use truth tables as a means for
determining whether two structures are grammatically equivalent, or
paraphrases of one another. Remember that to paraphrase means to put
into other words without changing the meaning.
The major point is that we can obtain all the benefits of truth-table logic
without having to pore through the entire subject once we have studied
Boolean algebra. In summary, the set of subsets of a given set and truth-table
logic, as different as they might be in other respects, both serve as
models for a Boolean algebra. Something is a theorem in one of these
subjects if and only if it is a theorem in the other.
Here might be a good place to distinguish between “if,” “only if,” and
“if and only if.” Briefly, they are as different as p → q, q → p, and p ↔ q,
respectively. That is, we read p → q as “If p then q” (or “q if p”); we read
q → p as “q only if p”; and we read p ↔ q as “p is true if and only if q is true.” In
short, “if and only if” means that the two phrases being related are synonyms.
This distinction often occurs in terms of the words “necessary,”
“sufficient,” and “necessary and sufficient.” That is, if p is sufficient to
guarantee q, this means that p → q; if p is necessary in order for q to happen,
this means that q → p (in other words, in this case the fact that q happened
means that p also happened); and if p is both necessary and sufficient to
guarantee q, then p and q are equivalent happenings (that is, p ↔ q).
A NOTE ON TAUTOLOGIES
As we have seen, a tautology is a statement that, based on its form alone,
is always true. The simplicity of this definition makes it easy for us to
overlook the tremendous impact of this concept.
To begin with, recall that in our discussion of logic being the art of
drawing inescapable conclusions from given assumptions, the problem centered
about the meaning of “inescapable.” “Inescapable” possesses a
subjective air. Can something be inescapable to one person but not to another?
Moreover, how can we ever be sure that something is inescapable?
Coupled with this problem is one of prejudice. That is, in many types
of arguments we accept the logic of the argument if we happen to like the
conclusion, rather than judging the argument on its own merits. For
example, if a person enjoys smoking he may tend to approve of any argument,
no matter what its flaws, if it concludes that smoking is beneficial.
If a person is a staunch Republican he may fail to seek flaws in arguments
that assail the Democrats.
Thus, it appears that we must appeal to rules of logic based on form
alone to decide problems of inescapability. Tautologies seem to be just
what we need. They depend only on their form, not on the specific topics
they relate; in addition, what better structure is there to trust than one
that is true no matter what the truths of its constituent parts are?
By way of illustration, let us look at a few forms of arguments in which
we tend to say that the conclusion follows inescapably from two or more
assumptions.
p  q  r   p → q   q → r   (p → q) ∧ (q → r)   p → r   [(p → q) ∧ (q → r)] → (p → r)
t  t  t     t       t             t             t                   t
t  t  f     t       f             f             f                   t
t  f  t     f       t             f             t                   t
t  f  f     f       t             f             f                   t
f  t  t     t       t             t             t                   t
f  t  f     t       f             f             t                   t
f  f  t     t       t             t             t                   t
f  f  f     t       t             t             t                   t
Arguments such as the above were studied long before Boolean algebra
was invented. For example, the subject known as formal logic tried to
tackle the idea of inescapability in the form of an argument. That subject
was to help us distinguish between truth and validity. In fact, the study
of logic in terms of form was known to Aristotle. Aristotelian logic, replete
with syllogisms, can be traced back well into the pre-Christian era.
What is interesting is that people studied sets without knowing they
were doing it. Of course, once we know the language of sets we can perform
certain translations, whereupon we discover the following types of
results.
p    q    p ∨ q    ~p    (p ∨ q) ∧ (~p)    [(p ∨ q) ∧ (~p)] → q
t    t      t      f           f                    t
t    f      t      f           f                    t
f    t      t      t           t                    t
f    f      f      t           f                    t
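To say that an argument form is valid is to say that the corresponding conditional is a tautology, and that can be tested by running through every row. The sketch below assumes the two forms being tabulated in this note are "if p then q, and if q then r; hence if p then r" and "p or q, and not p; hence q"; the helper names, and the third (invalid) form added for contrast, are our own and do not come from the text.

```python
from itertools import product

def implies(p, q):
    """p → q: false only when p is true and q is false."""
    return (not p) or q

def is_tautology(statement, n):
    """True when `statement` is true for all 2**n truth assignments."""
    return all(statement(*vals)
               for vals in product([True, False], repeat=n))

def chain(p, q, r):
    """[(p → q) ∧ (q → r)] → (p → r)."""
    return implies(implies(p, q) and implies(q, r), implies(p, r))

def elimination(p, q):
    """[(p ∨ q) ∧ (~p)] → q."""
    return implies((p or q) and (not p), q)

def converse_error(p, q):
    """[(p → q) ∧ q] → p: a tempting but invalid form (our example)."""
    return implies(implies(p, q) and q, p)

print(is_tautology(chain, 3), is_tautology(elimination, 2),
      is_tautology(converse_error, 2))
```

The third form fails in the row where p is false and q is true, which is why a conclusion we happen to like is no evidence that the argument leading to it is sound.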
Exercises
1. Let p = Today is Friday.
Let q = Jones is married.
Let r = School is fun.
where we make the obvious assumption that there are not eight days
in a week and that some grass is not yellow.
3. Write the truth tables for each of the following.
(a) (p → q) ∨ (~p).
(b) (p ∨ q) → (~p).
4. Paraphrase (a) and (b) in Exercise 3 so that the symbol → does not
appear.
5. Use truth tables to show the statements ~(p ∧ q) and (~p) ∨ (~q)
are logically equivalent.
6. In terms of Boolean algebra explain how we could conclude that
~(p ∧ q) = (~p) ∨ (~q) without reference to truth tables.
7. Use truth tables to verify that (p ∧ q) ∨ (~p) ∨ (~q) is a tautology.
8. Show by use of truth tables that p → q and (~q) → (~p) are logically
equivalent. Then apply the result to a discussion of the indirect proof.
9. Construct a truth table for ~(p ∧ ~q) and one for ~p ∨ q.
(a) The truth values for ~(p ∧ ~q) are the same as which of the
basic rules (∨, ∧, or →)?
(b) The truth values for ~p ∨ q are the same as which of the basic
rules (∨, ∧, or →)?
(These expressions are often used to define implication.)
3.9.6 SWITCHES
While we presuppose no knowledge of the theory of electric circuits in
this section, let us assume that we are all familiar with the idea of current
flowing in a wire. By a switch we mean a device by which we can control
whether or not the current will flow. As an example take an ordinary
light switch. We flick the switch and the light goes on; we flick it again
and the light goes off. We usually view a switch as a break in the wire, as
shown in Figure 3.10.
Figure 3.10
We say that the switch is open when it does not let current pass (Figure
3.11).
Figure 3.11
Figure 3.12
Figure 3.13
Figure 3.14
In series if either switch is open, current will not flow; while in parallel
one switch can be open and current will flow through the closed switch.
This is illustrated in Figure 3.15. Labeling the switches A, B, C, and so
on, and letting o and c denote open and closed, respectively, we can make
a chart of possibilities concerning two switches A and B.
Figure 3.15 (a) “This prevents current from reaching outlet.” (b) “Current reaches outlet through ‘lower’ branch.”
A    B
c c
c o
o c
o o
This chart follows from the fact that a switch is either open or closed, one
or the other, but not both. If we now write AsB to denote the switch
formed when A and B are connected in series, and ApB to denote the switch
formed when they are connected in parallel, we can make the following
chart.
A    B    AsB    ApB
c c c c
c o o c
o c o c
o o o o
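The chart can be reproduced by stating the two wiring rules directly. This is a minimal sketch: the function names `series` and `parallel` are ours, and the letters "c" and "o" are used for closed and open exactly as in the chart.

```python
CLOSED, OPEN = "c", "o"

def series(a, b):
    """AsB: current passes only if both switches are closed."""
    return CLOSED if a == CLOSED and b == CLOSED else OPEN

def parallel(a, b):
    """ApB: current passes if at least one switch is closed."""
    return CLOSED if a == CLOSED or b == CLOSED else OPEN

# One row per combination of A and B, in the order used in the chart.
chart = [(a, b, series(a, b), parallel(a, b))
         for a in (CLOSED, OPEN) for b in (CLOSED, OPEN)]

for row in chart:
    print(" ".join(row))
```

Reading "c" as "t" and "o" as "f", the AsB column has the same pattern as a conjunction and the ApB column the same pattern as a disjunction.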
∩ . . . s
∪ . . . p
1 . . . c
0 . . . o
A    B    Aᶜ    Bᶜ    C    O    AsB    ApB
c c o o c o c c
c o o c c o o c
o c c o c o o c
o o c c c o o o
The resemblance between this chart and our previous use of charts and
truth tables should be obvious. If we now define two switches to be
equivalent if one is closed when the other is closed and open when the
other is open, we can see this is equivalent to saying that the tables for the
switches are identical.
To understand the last idea better, observe that when we flick a switch
to make a light go on we may have no idea as to why it comes on. In
other words, we need not know how the circuit is wired. In this respect, we
are saying that two switches are equivalent if they would both account
for the reason the switch behaves as it does.
Suppose we have a switch such that whenever we flick it the light it
controls never goes on. Perhaps our first suspicion is that the bulb is
defective; we test the bulb and find that it works. We might next suspect
that there is a defect in the switch; we check that and, again, nothing is
wrong. We try to construct the nature of the switch based on this information.
For example, it might be that the switch has the form shown
in Figure 3.16.
Figure 3.16
Figure 3.17
Figure 3.18
Figure 3.19
Exercises
1. Sketch the switch that is expressed by each of the following.
(a) As(BpC).
(b) (AsB)p(Aᶜ)p(Bᶜ).
2. Find switches that are equivalent to the ones defined in Exercise 1(a)
and (b).
3. Use a chart to show that the following two switches are equivalent.
(a) [Circuit diagrams: two networks of switches A, B, and C.]
(b) [Circuit diagrams: two networks of switches A, B, and C.]
(c)
Figure 3.20
7. Use charts to show that the two circuits in Figure 3.21 are equivalent.
Figure 3.21
chapter four / AN INTRODUCTION
TO FUNCTIONS AND GRAPHS
4.1 INTRODUCTION
Mathematics has been variously described as the study of relationships,
the language of science, the basic tool of technology, the logical quest for
truth, and the study of exact measurement. More subjectively, it has
been called a strict discipline, a way of life, and a philosophy. Perhaps,
then, we might combine all of these descriptions into one neat package
and call the ensuing combination of phrases the definition of mathematics.
We could even send out questionnaires to everyone in the entire world in
order to ascertain other descriptive phrases about mathematics. We
could incorporate all of this new information into the original definition
and make it more comprehensive. But the results of such an undertaking
could lead at best to a very cumbersome definition of mathematics. Moreover,
the definition would not be particularly satisfying. For, while it
may be true that any concept can be defined by suitably incorporating
every appropriate descriptive phrase (assuming that such a feat is possible),
the fact remains that such a definition is almost anti-intellectual, at least
in the sense that one usually hopes to see a concept defined in terms of a
basic unifying thread from which the other properties follow.
In light of the above remarks it seems that we should look through the
various descriptions of mathematics and find one that characterizes the
subject well. However, any attempt to define mathematics in terms of
short phrases is destined for failure, for mathematics, and quite justifiably
so, means many different things to different people. In a sense, the attempt
to answer the question, “What is Mathematics?” can be compared
with the fable concerning the blind men and the elephant. Each man
touched a different part of the animal and gave a different description
of the elephant. Each man’s description served as a distortion of the
actual description of the elephant. Each man was partly correct but
none was entirely correct. In this vein, any single descriptive phrase of
mathematics is (1) too comprehensive; that is, it is correct but it defines
more than just mathematics; (2) too specialized; that is, it defines certain
aspects of the subject very well but does not apply to other aspects; and (3)
too vague. One definition that is too comprehensive is
Mathematics is the study of relationships.
Certainly, this is a true statement. In geometry, for example, one studies
the relationship between the area of a region and its various dimensions.
In traditional algebra problems, we are usually investigating the relationships
of greater than and less than. Probably, whenever we deal with mathematics,
somehow or other we are concerned with the study of relationships.
However, it should not be difficult to see that to define mathematics
as “the study of relationships” would make virtually every academic
endeavor of man a branch of mathematics. For example, in physics
Galileo studied the relationship between the distance that an object fell
and the time during which it was falling, and Newton studied the force of
attraction between two objects in relation to their sizes and the distance
between them. Such quantitative studies of relationships are well known
in all of the physical sciences. Indeed, they are the backbone of many
investigations. However, an equally important point is that the study of
relationships is by no means restricted to mathematics and the physical
sciences. For example, the philosopher studies the relationship between
a concept and the word used to denote that concept, the economist studies
the relationship between various forms of supply and demand, the student
of literature studies writing in relationship to the society of the times, the
historian judges the success of a particular society in relationship to the
aims upon which the society was formed, and the psychologist studies the
scores of certain tests in relationship to the environmental background of
those who took the test.
Such examples are numerous, and the phrase, “the study of relationships,”
permeates almost every field of human endeavor. Thus, such a
definition of mathematics would be too comprehensive.
Notice, however, that such a definition of mathematics, even though
it does not separate mathematics from other subjects, is rather worthwhile;
at least in the sense that it reflects a general trend in one prevalent form of
mathematical usage of the day. Namely, more and more subjects, at
one time thought to have only a minimal need for mathematics, are beginning
to require the study of relationships in more precise quantitative
ways. Such a study brings mathematics into play as a very strong computational
tool.
It is our aim in this chapter to exploit the idea that mathematics is
the study of relationships. While such a definition might not apply to
every aspect of mathematics, and while it might be much too comprehensive,
the fact is that this aspect of mathematics, perhaps more than
any other aspect, has been the unifying thread by which man has tried to
explore and to understand the world around him.
In this context it is particularly easy to introduce the meaning of a
function. Stripped of all embellishment, a function is a rule. In the
classical sense it was a rule that assigned to one number, another number.
In the modern sense it is a rule that assigns to an element of one set an
element of another set.
We shall study functions from both points of view, but introduce the
topic from the modern point of view. While this is not chronologically
correct, we prefer the generality of the modern approach. Afterwards, we
shall look into the classical viewpoint.
Definition
Let A and B denote sets. Then by a function f from A to B, written
f: A → B, we mean that f is a rule that assigns to each element a ∈ A
one element b ∈ B. The fact that f assigns a ∈ A to b ∈ B is denoted by
f(a) = b (read as f of a equals b). (By definition if f is a function and
f(a) = b₁ then f(a) ≠ b₂ for any other element b₂ in B.)
In terms of the example of the salesmen, the room clerk plays the role
of the function from A to B; that is, he is a “rule” that assigns to each
element of A (each salesman) an element of B (a hotel room).
Definition
If f: A → B then we call A the domain of f (abbreviated dom f), while
B is called the range of f.
Definition
Given f: A → B, we define the image of f (usually abbreviated by Im f)
to be the set {f(a): a ∈ A}. This set is also denoted by f(A).
In other words, f(A) is precisely that subset of B that is “used up” by
f. More precisely, instead of using such phrases as “used up,” we often
call b the image of a (with respect to f) if f(a) = b. In this context f(A)
denotes the subset of B that consists of those elements of B that are images
of elements in A (with respect to f). In terms of the salesmen example,
f(A) is the set of rooms to which salesmen are actually assigned. This is
diagrammed in Figure 4.1.
Of course, it is possible that all the salesmen use up all the hotel rooms.
That is, nothing excludes the possibility that f(A) = B. This leads to the
following definition.
Definition
Given f: A → B, we say that f is onto B if and only if f(A) = B; otherwise,
f is a function from A into B. (In other words a function is onto when the
image equals the range.)
Definition
The function f: A → B is called one-to-one (often written as 1-1) if no
element of B is the image of more than one element of A. More precisely,
f is 1-1 means that if a₁ and a₂ are elements of A then a₁ ≠ a₂ implies that
f(a₁) ≠ f(a₂); or from a different emphasis f(a₁) = f(a₂) implies that
a₁ = a₂.
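For finite sets, the definitions of image, onto, and one-to-one can be checked directly. The sketch below stores a function as a Python dictionary; the salesmen's names and room numbers are hypothetical stand-ins for the sets A and B of the salesmen example, and the helper names are ours.

```python
def image(f):
    """The image f(A) of a function stored as a dict {a: f(a)}."""
    return set(f.values())

def is_onto(f, B):
    """f is onto B when the image equals the range B."""
    return image(f) == set(B)

def is_one_to_one(f):
    """f is 1-1 when no element of B is the image of two elements of A."""
    return len(image(f)) == len(f)

# Hypothetical data: three salesmen (A) and three hotel rooms (B).
B = {"room 101", "room 102", "room 103"}
f = {"Adams": "room 101", "Baker": "room 102", "Clark": "room 101"}

print(sorted(image(f)), is_onto(f, B), is_one_to_one(f))
```

Here two salesmen share room 101 and room 103 stays empty, so this particular f is neither 1-1 nor onto; reassigning Clark to room 103 would make it both.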
Figure 4.2
Definition
Given two functions f and g, we say that f = g if and only if
(1) dom f = dom g and
(2) f(a) = g(a) for each a ∈ A (where A = dom f [or g]).
f(1) = 4
f(2) = 5
f(3) = 6
and define g: A → B by
g(1) = 4
g(2) = 6
g(3) = 5.
In particular, since both f and g are onto, we have that f(A) = g(A) = B.
However, f ≠ g, for while they have the same domain, f(2) ≠ g(2). This
is shown in Figure 4.3.
Figure 4.3
g(f(a)) = c.
This is pictured in Figure 4.4.
Figure 4.4 (g ∘ f = h)
Definition
Let f: A → B and g: B → C be given. Then by the composition of f and
g, written g ∘ f, we mean the function h: A → C such that for each a ∈ A,
h(a) = g(f(a)).
f(1) = 1        g(1) = 2
f(2) = 3        g(2) = 1
f(3) = 2        g(3) = 3.
Then
g(f(1)) = g(1) = 2
g(f(2)) = g(3) = 3
g(f(3)) = g(2) = 1
while
f(g(1)) = f(2) = 3
f(g(2)) = f(1) = 1
f(g(3)) = f(3) = 2.
Thus, both f ∘ g and g ∘ f are 1-1 and onto functions from A to itself, but
they are not equal.
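The computation above is easy to repeat mechanically. In this sketch the functions f and g are the ones just tabulated, stored as dictionaries; `compose` is our own helper name for forming g ∘ f.

```python
def compose(g, f):
    """Return g ∘ f, the function a -> g(f(a)), as a dict."""
    return {a: g[f[a]] for a in f}

f = {1: 1, 2: 3, 3: 2}
g = {1: 2, 2: 1, 3: 3}

g_after_f = compose(g, f)   # apply f first, then g
f_after_g = compose(f, g)   # apply g first, then f

print(g_after_f)
print(f_after_g)
print(g_after_f == f_after_g)
```

The order of the arguments mirrors the notation: `compose(g, f)` applies f first, just as g ∘ f does.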
Observe also that if f: A → B and g: B → C are both 1-1 and onto, then
g ∘ f is also 1-1 and onto from A to C. This is an important result; but
it is not as self-evident as it might seem, at least in the sense that its
converse is not true. That is, g ∘ f can be 1-1 and onto even though not
both f and g are. An example is shown in Figure 4.5.
In the next section, we shall explore a few properties of functions which
are both 1-1 and onto.
Figure 4.5
Definition
Let A be any set. Then the identity function on A, usually denoted by
I, is defined by I(a) = a for all a ∈ A.
With this definition in mind, suppose that we have a function f: A → B
that is both 1-1 and onto. This is pictured in Figure 4.6.
Figure 4.6
Figure 4.7
Figure 4.8 (a) Not in dom of f⁻¹. (b) Has 2 images under f⁻¹.
I(1) = 1        f(1) = 1        g(1) = 2
I(2) = 2        f(2) = 3        g(2) = 1
I(3) = 3        f(3) = 2        g(3) = 3
h(1) = 2        k(1) = 3        m(1) = 3
h(2) = 3        k(2) = 1        m(2) = 2
h(3) = 1        k(3) = 2        m(3) = 1.
Hence, g ∘ h = f since they have the same domain and the same image,
element for element. In other words, in terms of the number-versus-numeral
concept, g ∘ h and f are two different names for the same permutation
on S; while h ∘ g and m are synonyms for a different permutation.
Figure 4.9
h⁻¹(1) = 3
h⁻¹(2) = 1
h⁻¹(3) = 2
Figure 4.10
∘    I    f    g    h    k    m
I    I    f    g    h    k    m
f    f    I    k    m    g    h
g    g    h    I    f    m    k
h    h    g    m    k    I    f
k    k    m    f    I    h    g
m    m    k    h    g    f    I
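A table of this kind can be rebuilt by composing the permutations pair by pair. The sketch below assumes the six permutations of {1, 2, 3} are named I, f, g, h, k, m as we read them from the text, and that the entry in row x and column y is x ∘ y (apply y first); both the dictionary layout and the helper names are our own.

```python
perms = {
    "I": {1: 1, 2: 2, 3: 3},
    "f": {1: 1, 2: 3, 3: 2},
    "g": {1: 2, 2: 1, 3: 3},
    "h": {1: 2, 2: 3, 3: 1},
    "k": {1: 3, 2: 1, 3: 2},
    "m": {1: 3, 2: 2, 3: 1},
}

def compose(x, y):
    """Return x ∘ y: apply y first, then x."""
    return {a: x[y[a]] for a in y}

def name_of(p):
    """Find the name under which permutation p is listed."""
    return next(n for n, q in perms.items() if q == p)

# table[row][col] is the name of row ∘ col.
table = {r: {c: name_of(compose(perms[r], perms[c])) for c in perms}
         for r in perms}

# The inverse of x is the permutation y with x ∘ y = I.
inverse = {n: next(c for c in perms if table[n][c] == "I") for n in perms}

for r in perms:
    print(r, " ".join(table[r][c] for c in perms))
print(inverse)
```

Each name appears exactly once in every row and column, and the table is not symmetric about its diagonal, which reflects the fact that composition of permutations depends on the order.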
In the next section we shall show one more illustration of 1-1 and onto
functions in mathematics. In particular, we shall discuss the concept of
counting and the idea of cardinality from the point of view of 1-1 and onto
functions. Later on we shall return to these ideas with respect to the
classical concept of functions.
Exercises
1. Let A = {1,2,3} and let B = {7,8,9}. Describe a function f: A → B
such that
(a) f is neither 1-1 nor onto.
(b) f is 1-1 and onto.
2. In terms of the problem above, is it possible that f can be 1-1 and not
onto? Explain.
3. Let A be the set of integers, and define f by f(a) = a² for each a ∈ A.
(a) Describe B if B is the image of A with respect to f.
(b) In this case is f: A → B onto? Explain.
(c) In this case is f: A → B 1-1? Explain.
4. In each of the following let R denote the set of real numbers and let
f: R → R. If g denotes the inverse of f, find the inverse of f in each
of the following cases.
(a) f(r) = r + 3.
(b) f(r) = 3r.
(c) f(r) = 2r + 7.
5. Again, let R denote the set of real numbers and let f: R → R and g: R → R
be defined by f(r) = 3r and g(r) = r + 3. Describe the functions
f ∘ g and g ∘ f in this case.
6. Let I, f, g, h, j, and k denote the six permutations on a set A containing
three elements as outlined in the text. Compute the following.
(a) f ∘ (g ∘ h).
(b) g ∘ j⁻¹.
(c) (g ∘ j)⁻¹.
(d) (j ∘ g)⁻¹.
(e) j⁻¹ ∘ g⁻¹.
Definition
Two sets A and B are said to have the same cardinality (or to be equi-
potent) if and only if there exists a function f: A → B such that f is both
1-1 and onto. In this case we write A ~ B.
While this definition may sound stilted and abstract, it captures the idea
of “how-many-ness” without using the concept of number in the definition.
Before we check some of the properties of our definition, let us make
reference to one more point. The reader may still feel that all we are
doing is counting. The answer to this is that if the collections are finite
then it is true that counting and 1-1 correspondences are equivalent.
The trouble enters when we deal with infinite collections. For example,
let us consider the set of whole numbers W. Clearly, this is an infinite
collection. Intuitively, we would like to believe that the size of a collection
does not depend on how the objects in the collection are named. That is,
if there are 10 people in a room and if each of the 10 changes his name,
there are still 10 people in the room. However, if we try this idea on W,
Definition
The set A is said to be infinite if and only if there exists a proper subset
B of A such that A ~ B.
Notice how this definition captures exactly what happened in our
discussion of E and W above, and that this definition nowhere uses the
intuitive meaning of infinity.
Let us return to our definition of cardinality and show that it has the
properties that we would intuitively believe it has.
For example, we feel that any set has the same number of elements as
itself. Then, given any set A, consider the identity function, I, on A.
Then I: A → A is both 1-1 and onto; hence, by our definition A has the
same cardinality as A; or in terms of our new notation, A ~ A.
Secondly, we would like to believe that if A has the same number of
elements as B, then B has the same number of elements as A. If A has
the same cardinality as B, our definition says that there exists f: A → B
such that f is both 1-1 and onto; but, as we saw in the last section, this is
precisely what is necessary to guarantee the existence of f⁻¹, which is also
1-1 and onto. In other words, f⁻¹: B → A is a 1-1 and onto map from
B to A, and this is exactly what our definition requires for B to have the
same cardinality as A.
Thirdly, we would like to believe that if A has the same cardinality as
B and B has the same cardinality as C, then A has the same cardinality
as C.
Again, the fact that A ~ B means there exists a function f: A → B
which is both 1-1 and onto. Similarly, B ~ C implies that there exists
g: B → C that is 1-1 and onto. Also, as we saw in the last section, the
composition of two 1-1 onto functions is also 1-1 and onto. In other
words, g ∘ f is a 1-1 onto function from A to C. Thus, by our definition
A ~ C.
Summed up, we have shown for any sets, A, B, and C the following
three things.
(1) A ~ A.
(2) If A ~ B, then B ~ A.
(3) If A ~ B and B ~ C, then A ~ C.
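For finite sets the three properties can be checked directly. The following is a minimal sketch, assuming functions are stored as Python dictionaries; the sets A, B, C and the particular maps f and g are hypothetical examples, not taken from the text.

```python
def is_one_to_one_and_onto(f, A, B):
    """True if the dict f is defined on all of A, is 1-1, and is onto B."""
    images = [f[a] for a in A]
    defined_on_A = set(f) == set(A)
    one_to_one = len(set(images)) == len(images)
    onto = set(images) == set(B)
    return defined_on_A and one_to_one and onto


def inverse(f):
    """Reverse every pair (input, output) of a 1-1 onto function."""
    return {b: a for a, b in f.items()}


def compose(g, f):
    """Return g o f: apply f first, then g."""
    return {a: g[f[a]] for a in f}


# Hypothetical example sets and maps.
A = {1, 2, 3}
B = {7, 8, 9}
C = {"x", "y", "z"}
f = {1: 8, 2: 9, 3: 7}            # f: A -> B
g = {7: "z", 8: "x", 9: "y"}      # g: B -> C
identity_on_A = {a: a for a in A}

print(is_one_to_one_and_onto(identity_on_A, A, A))   # property (1)
print(is_one_to_one_and_onto(inverse(f), B, A))      # property (2)
print(is_one_to_one_and_onto(compose(g, f), A, C))   # property (3)
```

The checks only confirm the properties for these particular finite sets; the argument in the text is what establishes them for arbitrary sets.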
Figure 4-11
Definition
A set S is said to be finite and of cardinality n if and only if S ~ Jₙ.
Thus, the set of apples would be said to have cardinality 3, which
coincides exactly with our intuitive feeling.
Here is a good place to point out the difference between cardinal and
ordinal numbers. In counting the three apples, notice that we could have
started the count with any of the apples. That is, we might have labeled
any one of the three apples as being the first, any one of the remaining
two as being second, and the last as being the third. In any event, the
cardinality of the set would be 3, but we have a choice of six different 1-1
onto functions from J₃ to the set of apples. That is, the actual cardinality
is independent of what 1-1 onto function we choose, but the choice of the
function will affect how we have elected to order the members of the
collection. This is why dictionaries usually define the ordinal numbers as
adjectives (first, second, third) and the cardinal numbers as nouns (one,
two, three).
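The six choices can be listed explicitly. In this sketch the apple labels are hypothetical stand-ins; each dictionary is one 1-1 onto function from J₃ to the set of apples, that is, one way of deciding which apple is first, second, and third.

```python
from itertools import permutations

J3 = (1, 2, 3)
apples = ("apple A", "apple B", "apple C")  # hypothetical labels

# Each ordering assigns 1, 2, 3 to the apples in a different way.
orderings = [dict(zip(J3, arrangement)) for arrangement in permutations(apples)]

# Whatever ordering is chosen, the number of apples counted is the same.
cardinalities = {len(set(f.values())) for f in orderings}

for f in orderings:
    print(f)
print(len(orderings), cardinalities)
```

The orderings differ (the ordinal information), while the count of apples reached by every one of them is the same (the cardinal information).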
Let us agree to define the cardinality of the empty set to be 0. Based
on our experience, we would have no reason to define a cardinality of less
than 0. In summary, we make the following definitions.
J₀ = ∅
J₁ = {1}
J₂ = {1,2}
Jₙ = {1,2, . . . , n}.
Counting Revisited 347
and even though we could never complete the list, we would still know, in a
well-defined way, where each number would occur in our list had we elected
to carry it that far. In still other words, while it is impossible to explicitly
list all the whole numbers, each whole number occurs in a well-defined way
somewhere in the list. For example, 456 would be the four hundred fifty-
seventh (why?) entry in the list and we know this without having to write
the entire list. In brief, when we try to list the whole numbers we fail
because there are too many, not because we cannot find an orderly way to
list them.
Generally, our “favorite” infinite sets will be those that are also “list-
able”; that is, sets in which we can say, “This is the first member, this is
the second,” and so on, and not skip any member of the collection. This
is not as simple as it sounds and a note to this effect is presented at the
conclusion of this section.
Definition
Let W denote the set of whole numbers. Then the infinite set A is called
countable (or denumerable) if and only if A ~ W.¹ In essence, a countable
set is the most orderly infinite collection.
¹ Some people are afraid of numbering something 0. To this end, we may replace W
by N (the set of natural numbers; that is, the whole numbers excluding 0). The point
is that W ~ N since the mapping f(w) = w + 1 is a 1-1 onto function from W to N.
and we could tell exactly where any common fraction would appear in this
listing. For example, 34/55 is the thirty-fifth entry when we are listing
common fractions of size 89 (that is, 34 + 55). In fact, if we recall the
recipe that 1 + 2 + · · · + n = n(n + 1)/2, we can tell even more
precisely where 34/55 will occur in the list. Namely, since there is one fraction
of size 1, two are of size 2, and so on, our list would have 1 + 2 + · · · + 88
entries in it before we even got to size 89 (that is, we would first list all
fractions from size 1 through 88 before we would begin listing those of size 89);
but this is precisely 88(89)/2 = 3916. Then we would have to proceed an
additional 35 places. Thus, 34/55 would be the 3951st entry in the list,
and so the set of common fractions is countable, since we can compute
where each member of the set will occur in our listing.
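The computation for 34/55 can be turned into a general recipe. The sketch below assumes that, within each size s, the fractions are listed by increasing numerator beginning with 0/s; that ordering is an assumption, chosen because it agrees with the counts quoted above (one fraction of size 1, two of size 2, and 34/55 as the thirty-fifth entry of size 89). The function names are illustrative only.

```python
def position(m, n):
    """Place of the common fraction m/n in the list (first entry is place 1).

    Assumed ordering: sizes 1, 2, 3, ... in turn, and within size s the
    fractions 0/s, 1/(s-1), ..., (s-1)/1.
    """
    size = m + n
    entries_before = (size - 1) * size // 2   # 1 + 2 + ... + (size - 1)
    return entries_before + m + 1


def listing(max_size):
    """Generate the list itself, as (numerator, denominator) pairs."""
    for size in range(1, max_size + 1):
        for m in range(size):
            yield (m, size - m)


print(position(34, 55))   # 3916 entries come first, then 35 more places
```

Comparing position against the places produced by listing is a quick way to confirm that the formula and the list agree.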
The difference between number and numeral here is that our countable
list of common fractions includes infinitely many synonyms for the same
rational number (for example, 1/2, 2/4, 3/6, 4/8, and so on are all different as
common fractions, or numerals, but each names the same number).
This means that the cardinality of the rational numbers cannot exceed the
cardinality of the common fractions; so we can conclude that the rational
numbers are also countable.
Surprising as it may seem, not all infinite sets are countable. In
particular (as we shall show in the note at the end of this section), the real
numbers are not countable. Moreover, since the real numbers are made up of
the rational numbers and the irrational numbers, and since two countable
lists can be merged into one by alternating their entries, this implies that it must
be the irrational numbers that are not countable. In this sense there are
more irrational numbers than rational numbers. More precisely, we can
find a 1-1 onto function from the whole numbers to the rational numbers,
but no such function exists from the whole numbers to the irrational
numbers.
As a final way to identify the modern concept of counting with the
traditional method, let us merely observe that when we wrote in earlier chapters,
for example, that N(A) = 7, this was just another way of saying that A was
finite with cardinality 7. That is, A ~ J₇.
This concludes the modern introduction to functions. In the next
section we shall coordinate this discussion with the classical idea of a function.
Exercises
1. Describe two well-known sets, each of which has the same cardinality
as J₁₀.
2. By discussing the function f(n) = 3n, demonstrate that the set of
multiples of three has the same cardinality as that of the whole numbers.
3. Using the discussion of this section, where on our list would the following
fractions appear if we define the size of m/n to be m + n?
(a) 7/3. (c) 99/101.
(b) 3/4. (d) 30/40.
4. Let N denote the natural numbers and let S denote the subset of N
consisting of the perfect squares.
(a) Find a function f: N → S that is 1-1.
(b) Is it possible that there exists a function from N to S that is 1-1
but not onto? Explain.
5. If A is an infinite set, is it possible to find a function f: A → A that is
1-1 but not onto? Explain.
6. Let N denote the set of natural numbers. Describe five subsets of N
that have the same cardinality as N.
7. How many permutations on A are there if N(A) = 5?
ever, this is not the case. There are different orders of infinity; or worded
differently, the cardinality of two infinite sets need not be equal. While
it is not our purpose to explore this topic in great detail, there is one remark
we would like to make.
In particular, there is a famous result known as Cantor’s diagonalization
principle, which is used to show that the real numbers are not countable.
In other words, Cantor proved that while the rational numbers and the
natural numbers had the same cardinality, the real numbers and the natural
numbers had different cardinalities. Specifically, the cardinality of the
real numbers is greater than that of the natural numbers. Cantor’s
method involved the use of decimal notation. First of all, recall that every
real number may be viewed as a decimal, the rational numbers being
represented as either terminating or repeating decimals, and the irrational
numbers as endless nonrepeating decimals. Moreover, in decimal form, with
one exception, two numbers are equal if and only if they agree (are the
same) for each decimal place. The one exception occurs when a string of
9’s is endlessly repeated. For example, 1.000000... and 0.9999... are
two different ways of writing 1 as a decimal. To avoid this ambiguity in
discussing Cantor’s principle, let us agree to use the repeating 9
representation uniquely when the situation arises. In this way we make sure that
no two decimals can represent the same number and that no decimal
terminates.
Cantor then proceeds to use the indirect proof to show that even the
real numbers just between 0 and 1 are not countable. To do this he
assumes the opposite is true, and then proceeds to arrive at a contradiction.
Namely, he assumes that the real numbers from 0 through 1 are countable.
Under this assumption, it means that we can order the real numbers in a
well-defined way in such a manner that any real number will eventually
appear on the list. Let us use the notation r₁ to denote the first real
number on the list, r₂ to denote the second one, and so on. In this way the
assumption that the set of real numbers is countable allows us to conclude
that the real numbers may be arranged in such a way
r₁, r₂, r₃, r₄, . . . , rₙ, . . .
that although the listing is endless, each real number must occur somewhere
on the list.
The diagonalization principle then takes on the following form. Cantor
defines a real number b as follows. We look at the first decimal place of r₁,
and we choose any digit except the digit that occurs in the first decimal
place of r₁ to be the first decimal place of b. For the second decimal place
of b we choose any digit other than the digit that occurs in the second
decimal place of r₂. In general, for the nth decimal place of b we choose any
digit except the one that occurs in the nth decimal place of rₙ. In this way
b is certainly a real number, but it cannot be any real number already on the
A n Introduction to Functions of a Real Variable 361
list, for it differs from r₁ in at least the first decimal place; it differs from r₂
in at least the second decimal place; and so on. In other words, if the real
numbers were countable, then b would have to occur somewhere on the
list; that is, it would have to occur in the nth spot for some natural number
n. However, the name of the real number in this spot is rₙ, and b cannot
equal rₙ since their nth decimal places are unequal. This contradiction
validated the claim that the real numbers were not a countable collection.
If we use double subscript notation, it is easy to see why this was called
a diagonalization process. Namely, let rₙₘ denote the mth decimal digit
of rₙ. For example, by r₃₇ we would mean the digit in the seventh decimal
place of the real number that is listed third. More concretely, let us assume
that the listing appeared as follows.
r₁ = 0.14356980357. . .
r₂ = 0.4789024666790. . .
r₃ = 0.03485769687123. . .
r₄ = 0.98079666453123. . .
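The construction of b can be carried out on the four entries just listed. The sketch below reads off the diagonal digits and then chooses, for each place, a digit that differs from the diagonal one. The particular rule used (write 5 unless the diagonal digit is 5, in which case write 6) is only one hypothetical choice among many; the text requires nothing more than that the digits differ.

```python
# Decimal digits (after the decimal point) of r1, r2, r3, r4 as listed.
listing = [
    "14356980357",
    "4789024666790",
    "03485769687123",
    "98079666453123",
]


def diagonal_digits(rows):
    """The nth decimal digit of the nth number on the list."""
    return [int(row[n]) for n, row in enumerate(rows)]


def differing_digit(d):
    """One hypothetical rule for picking a digit other than d."""
    return 6 if d == 5 else 5


diagonal = diagonal_digits(listing)
b_digits = [differing_digit(d) for d in diagonal]

print(diagonal)
print("b = 0." + "".join(str(d) for d in b_digits) + "...")
```

Since b differs from the nth listed number in the nth place, it cannot equal any of the four; the same reasoning applies no matter how far the list is continued.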
With the first four sections of this chapter as background, we are now
ready to begin a study of functions from the traditional point of view.
Our previous study was actually more general than what we now need.
That is, we have already discussed functions in terms of arbitrary domains
and images; in the study of real variables, all we do is restrict our domain
and range to subsets of the real numbers.
By way of illustration, consider an expression such as
f(x) = x².
As before, f denotes the rule, x denotes the input (domain), and x² denotes
the output (image). It is not clear from f(x) = x² what the domain of f is.
Let us assume, for lack of information to the contrary, that x can denote any
real number. Then the domain of f is the set of real numbers, while the
image of f is the set of all nonnegative real numbers. Since x² = (−x)², f would
not be 1-1 since for any nonzero real number x, x and −x have the same image with
respect to f. However, f is single-valued since for each given input there is
one and only one output.
If we wanted to study functions of a complex variable, all we would have
to do is consider something like
f(z) = z²,
where z denotes a complex number. In terms of a function-machine, f may
be viewed as the same machine as in f(x) = x². All that is different is that
now both the input and the output are complex numbers
rather than real numbers.
We may even have examples of function-machines where the input is a
complex number and the output a real number, or vice versa. For
example, suppose that x denotes any real number, and that we define f by
f(x) = xi where, as usual, i² = −1. In this case the input is real, while
the output is purely imaginary. As a second illustration, consider the case
in which z denotes a complex number and |z| denotes its magnitude.
Consider the function f now defined by f(z) = |z|. In this case the input is
complex, but the output, by definition of magnitude, is a nonnegative, real
number. In this case we would say that f is a real-valued function of a
complex variable to indicate that the output is real while the input is
complex.
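Both illustrations can be tried with Python's built-in complex numbers, in which the imaginary unit is written 1j. The function names below are hypothetical and serve only to label the two machines.

```python
def times_i(x: float) -> complex:
    """Real input, purely imaginary output: f(x) = xi."""
    return x * 1j


def magnitude(z: complex) -> float:
    """Complex input, nonnegative real output: f(z) = |z|."""
    return abs(z)


print(times_i(2.0))        # real part is zero
print(magnitude(3 + 4j))   # a real number
```

The first machine accepts a real number and returns a value whose real part is 0; the second accepts a complex number and returns an ordinary nonnegative real number.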
Later we shall study different combinations of inputs and outputs;
however, for the time being, our only aim is to study real-valued functions of a
real variable. That is, we are interested in the situation wherein the input
as well as the output are real numbers.
Let us introduce new terminology for input and output. In an
expression such as f(x), f is referred to as the function; x, as the independent
variable; and f(x), as the dependent variable. (There still is a tendency by
some to refer to f(x) as the function. We mention this only so that the
reader will not be confused if he runs across it.)
To understand this terminology better, observe that in our more
intuitive language both the input and the output are variables but that the
output depends on the input. That is, the dependent variable plays the role
of the output which depends on the input (independent variable). While
this new vocabulary should be learned and understood, it is not really an
improvement over the old. For example, it is often not clear (and in other
cases it is even immaterial) which is the output and which is the input.
Again, in terms of function-machines, we can interchange terminals, and
reverse the role of output and input. In fact, this is precisely the role of
the inverse function. By way of illustration consider
f(x) = 2x + 3, (1)
f(x) = 2x + 3 and g(x) = (x − 3)/2; then
f(g(x)) = 2((x − 3)/2) + 3 = x, ∴ f ∘ g = I.
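The relation f ∘ g = I can be spot-checked numerically. This is a small sketch using exact fractions so that no rounding enters; the sample inputs are arbitrary choices, and a check on samples illustrates the identity without proving it.

```python
from fractions import Fraction


def f(x):
    """The rule of equation (1): double the input and add 3."""
    return 2 * x + 3


def g(x):
    """The proposed inverse: subtract 3, then halve."""
    return (x - 3) / Fraction(2)


# Arbitrary sample inputs: -5, -4.75, ..., 4.75, 5 as exact fractions.
samples = [Fraction(n, 4) for n in range(-20, 21)]

print(all(f(g(x)) == x for x in samples))   # f o g acts as the identity
print(all(g(f(x)) == x for x in samples))   # so does g o f
```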
f(x² − 4) = 2(x² − 4) + 3
= 2x² − 5.
This, in turn, says that f(x² − 4) can also be looked upon as inducing a
new function g, where g is the rule that squares a given input, doubles the
result, and then subtracts 5 to yield the output.
The same type of problem occurs in the composition of any two functions.
In terms of the machine concept, all we have to do to represent
composition is to attach two function machines in sequence. In this way the
output from the first machine is the input for the second. By way of
illustration, consider
h(x) = x²
and
k(x) = x + 1.
To form k ∘ h we run an input (x) through the h-machine and obtain x²
as an output. We then make x² the input of the k-machine, whereby the
output is x² + 1. In short, the idea of following the h-machine by the
k-machine yields the equivalent of a new machine which has the effect of
squaring any input and adding 1 to the result.
We could have started with the k-machine, followed by the h-machine,
in which case if x was the input of the k-machine, the output would be
(x + 1), which would be the input of the h-machine; hence, the final output
would be (x + 1)², since the h-machine squares any given input. In terms
of a picture we would have Figure 4.12.
As a final observation, x² + 1 and (x + 1)² are equal if and only if x = 0.
k ∘ h-machine
Figure 4-12
However, for two functions to be equal they must yield equal outputs for
each input, not just for some. Thus, we see again that h ∘ k and k ∘ h are
not equal functions.
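Attaching two machines in sequence corresponds to nesting two function calls. The sketch below builds both composite machines from h and k; the helper name compose and the range of test inputs are assumptions made for the illustration.

```python
def h(x):
    """The h-machine: square the input."""
    return x ** 2


def k(x):
    """The k-machine: add 1 to the input."""
    return x + 1


def compose(outer, inner):
    """Build the machine that runs inner first and outer second."""
    return lambda x: outer(inner(x))


k_after_h = compose(k, h)   # k o h: x -> x^2 + 1
h_after_k = compose(h, k)   # h o k: x -> (x + 1)^2

# Integer inputs from -10 through 10 on which the two machines agree.
agreeing_inputs = [x for x in range(-10, 11) if k_after_h(x) == h_after_k(x)]

print(k_after_h(3), h_after_k(3))
print(agreeing_inputs)
```

Among the inputs tried, the two machines agree only at 0, in keeping with the algebra: x² + 1 = (x + 1)² reduces to 2x = 0.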
We are saying that labels such as dependent and independent are
whimsical, and can be interchanged just by a reversal of the poles of the function-
machine. Moreover, observe that if we are given an algebraic expression
such as
x + y = 1,
there is no indication as to which of the two variables, x or y, will be
emphasized. To be sure, x + y = 1 can be correctly paraphrased as
either
x = 1 − y
or
y = 1 − x.
In x = 1 − y, y is the independent variable and in y = 1 − x, x is. Yet
we have no way of knowing which of these two is suggested by x + y = 1.
We shall accept such adjectives as dependent and independent at face
value, and use these expressions only when we feel that their context could
allow no misinterpretation.
As an illustration of these ideas let us consider the problem tackled by
Galileo of determining how far a freely falling body fell during a given
period of time. Galileo assumed that the object was close enough to the
Earth so that its gravitational acceleration could be taken as a constant,
and he also assumed that air resistance was absent, or at least negligible.
Under these conditions he discovered that
s = 16t²,
where s was the number of feet the body fell in t seconds. From our point
of view, it is not important whether s = 16t² is true. It is important that
this equation defines a rule (function) that assigns to one real number t
another real number s. To review our new language, in s = 16t² we would
refer to t as the independent variable and to s as the dependent variable;
that is, we put t in and get s. If we wished, we could invert s = 16t² to
obtain
t = √(s/16),
where these two equations are equivalent, except that they interchange the
roles of s and t as dependent and independent variables. In still other
words, s = 16t² would be the desired form if we were given t and told to
find s, while t = √(s/16) would be the desired form if we were given s and
told to find t.
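The two forms of Galileo's rule can be written as a pair of functions, one for each choice of independent variable. The function names are hypothetical, and the sketch assumes t ≥ 0 and s ≥ 0, which is the only range in which the inversion makes sense.

```python
import math


def distance_fallen(t):
    """Feet fallen after t seconds: s = 16 t^2 (t is the independent variable)."""
    return 16 * t ** 2


def time_to_fall(s):
    """Seconds needed to fall s feet: t = sqrt(s / 16) (s is the independent variable)."""
    return math.sqrt(s / 16)


print(distance_fallen(2))   # given t, find s
print(time_to_fall(64))     # given s, find t
```

Feeding the output of either function into the other returns the original input (up to floating-point rounding), which is what it means for the two equations to be equivalent with the roles of s and t interchanged.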
office safe blows up. Obviously, what we mean when we say that profits
rose is that profits increased. Why then did we interchange rise (geometric)
with increase (arithmetic)? The answer centers around the idea of a graph.
How is the concept of a graph related to the concept of a function? The
answer lies, at least for a first approximation, in the concept of ordered
pairs (which in turn suggests Cartesian products, and this in turn suggests
the Cartesian plane). Namely, the function-machine is determined once
we know the output for each given input. In short, we can abbreviate the
function by using ordered pairs where, for example, the first member of
the pair can name the input while the second member of the pair can name
the output. In terms of the Cartesian plane, this says that we use the
x axis to indicate the domain of the function, while we use the y axis to
indicate the range of the function. In still other words, we are saying that
we may use the point (x,f(x)) to represent the fact that for an input of x
the output is f(x).
Conceptually, the function and the graph are two entirely different
things. One is an analytic relation and the other is a picture of it. A
picture may be helpful where, for example, we know that for a certain real
number x, f(x) is positive (this is an arithmetic statement). But if f(x) is
positive, we know that the point (x,f(x)) lies above the x axis. In a similar
way, if f(x) is negative, the point (x,f(x)) is below the x axis; and if f(x) = 0,
the point (x,f(x)) is on the x axis. Thus, the idea of a graph replaces the
analytic terms “greater than 0,” “equal to 0,” and “less than 0” by “above
the x axis,” “on the x axis,” and “below the x axis,” respectively.
In a similar way, it replaces “increasing” by “rising,” “constant” by
“horizontal,” and “decreasing” by “falling.” (That is, if f(x) does not vary,
the point (x,f(x)) always has the same height above the x axis. Hence,
all such points lie on a line parallel to the x axis, or horizontal if we define the
direction of the x axis as being horizontal.)
Let us now turn to a specific situation. For example, let us discuss the
function f given by
f(x) = x².
We know at once that the input x yields the output x². This means that
in terms of a graph, the input x will give rise to the point in the plane (x,x²).
Some of the points on the graph would be (0,0), (1,1), (−1,1), (1/2,1/4).
In terms of a picture, we would have Figure 4.13.
If we now use our intuition we might conjecture² that the graph of
f(x) = x² is given by Figure 4.14.
² Actually we have only a conjecture, no matter how intuitive things may seem. That
is, as long as we locate points that have spaces between them, we are only conjecturing
as to what goes on in between.
Plotted points: (−1, 1), (1, 1), (0, 0)
Figure 4-13
x axis and that passes through a point in the image intersects the graph
in only one point then the function is 1-1. By way of illustration, Figure
4.15 indicates that if f(x) = x², then f is single-valued but not 1-1. In
fact, every positive number in the image is yielded by two members of the
domain.
Do not confuse the graph with the domain and the image of the function.
Recall that the domain is located as a subset of the x axis, while the image
is a subset of the y axis. For example, referring again to the function f
where f(x) = x², suppose that we take as the domain of f the closed interval
[2,3]. Then the image of f in this case would be the closed interval [4,9].
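The claim about the interval [2,3] can be illustrated by sampling. The sketch below evaluates f at evenly spaced points of the domain and records the smallest and largest outputs; the helper name sampled_image and the number of sample points are assumptions. Sampling only suggests the image; the conclusion rests on the fact that x² increases steadily as x goes from 2 to 3.

```python
from fractions import Fraction


def f(x):
    return x * x


def sampled_image(lo, hi, steps=1000):
    """Smallest and largest values of f at evenly spaced points of [lo, hi]."""
    xs = [lo + (hi - lo) * Fraction(i, steps) for i in range(steps + 1)]
    ys = [f(x) for x in xs]
    return min(ys), max(ys)


low, high = sampled_image(2, 3)
print(low, high)

# On a domain that contains both x and -x, two inputs share one output,
# which is why f is single-valued but not 1-1 there.
print(f(2) == f(-2))
```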
A Picture Is Worth a Thousand Words 369
This is the image of both
Each value of x yields one and only one point on the graph; but each nonzero
value in the image comes from two numbers in the domain.
Figure 4-15
Figure 4-16
in the first case the curve seems to be rising faster and faster, while in the
second case it seems to rise more and more slowly. The first case
corresponds to acceleration, while the second case corresponds to deceleration.
Returning to f(x) = x², observe that we have a genuine acceleration.
For example, when x changes from 1 to 2, f(x) changes from 1 to 4; thus, in
this case an increase in x of 1 unit produces an increase in y of 3 units.
Yet, when x changes from 2 to 3 (which is still only a 1 unit change in x),
f(x) changes from 4 to 9, or a change of 5 units.
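The pattern continues beyond the two steps worked out above. This short sketch tabulates the increase in f(x) = x² produced by each successive 1 unit change in x; the helper name unit_increases and the range of x values are illustrative assumptions.

```python
def f(x):
    return x ** 2


def unit_increases(func, start, stop):
    """Increase in func when x goes from n to n + 1, for n = start, ..., stop - 1."""
    return [func(x + 1) - func(x) for x in range(start, stop)]


increases = unit_increases(f, 1, 6)
print(increases)   # begins with the 3 and 5 found in the text
```

Each entry is larger than the one before it, which is the arithmetic counterpart of a curve that rises faster and faster.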
Graphs also supply us with a nice reason as to why we may deal with
single-valued functions without loss of generality. That is, the graph of a
multi-valued function has the property of doubling back. In other words,
a line parallel to the y axis intersects the graph in more than one point.
For the purpose of an illustration let us suppose that our graph is a smooth
curve. Intuitively, it should be clear that the places at which the graph
doubles back are those at which the curve possesses a vertical tangent line.
This idea is pictured in Figure 4.17.
Figure 4.17
Figure 4.18
y = x 1 is then given by
C₁ ∪ C₂, where C₁ and
C₂ are both single-valued.
Figure 4-19