
MAS1404/EEE2011.

Discrete mathematics for computing science


Sarah Rees
Sarah.Rees@ncl.ac.uk

Recommended books
J.Gersting Mathematical structures for computer science, 4th edition
(Freeman, 1999).
R.P.Grimaldi Discrete and combinatorial mathematics, 4th edition (Addison-Wesley, 1999).
J.K.Truss Discrete mathematics for computer scientists, 2nd edition
(Addison-Wesley, 1999).

Syllabus
Set Theory: defining sets by listing, by properties, and by induction;
unions, intersections, complements; Venn diagrams; power sets; Cartesian products.
Relations: definition and examples of unary, binary and ternary relations.
Representation of binary relations using directed graphs. Building more
useful structures: Cartesian products; lists; inductive definition of structures.
Functions: domain, codomain, range of a function; injections, surjections,
bijections; inverse functions; total and partial functions; recursive functions.
Graphs: directed, undirected. Basic properties. Trees, parsing trees.
Reasoning about structures using structural induction.
Propositional Calculus: propositional connectives, formulas, truth tables,
tautology.
Predicate Logic: existential and universal quantifiers; predicate calculus.
Boolean Algebra: axioms; examples; simplification of Boolean expressions.

Schedule of lectures and coursework


Lecture 1  Sets       1.1           Lecture 10  Graphs                4.1, 4.2
Lecture 2             1.2, 1.3      Lecture 11                        4.3-4.6
Lecture 3             1.4-1.6       Lecture 12  Recursion&induction   5.1, 5.2
Lecture 4  Relations  2.1-2.3       Lecture 13                        5.2 ctd.
Lecture 5             2.4, 2.5      Lecture 14  Prop. Logic           6.1-6.4
Lecture 6  Functions  3.1-3.3       Lecture 15                        6.5-6.8
Lecture 7             3.4, 3.5      Lecture 16  Predicate Logic       7.1-7.4
Lecture 8             3.5 ctd, 3.6  Lecture 17  Boolean Algebra       8.1-8.5
Lecture 9             3.7-3.9       Lecture 18  Boolean Algebra       8.6-8.9

The remaining hours will be used for recap sessions.

Homework 1  Sets and Relations               Chapters 1, 2
Homework 2  Functions                        Chapter 3
Homework 3  Graphs, recursion and induction  Chapters 4, 5
Homework 4  Logic                            Chapters 6, 7, 8

1 Sets
  1.1 Definition and description of a set
  1.2 Basic set operations and relations
  1.3 Rules relating the set operations
  1.4 Venn diagrams
  1.5 Power sets
  1.6 Direct products
2 Relations
  2.1 Definition and examples
  2.2 Binary relations; infix notation
  2.3 Diagram of a binary relation
  2.4 Properties of a binary relation
  2.5 Equivalence relations, partial orders and their diagrams
3 Functions
  3.1 Definition of a function, notation, examples
  3.2 The graphs of a function, directed and Cartesian
  3.3 Functions vs. non-functions
  3.4 Composing functions
  3.5 Inverse functions
  3.6 Properties that make a function invertible
  3.7 Computing an inverse for a function
  3.8 Partial functions
  3.9 Functions defined recursively
4 Graphs
  4.1 Definition and examples, notation
  4.2 Paths, connectivity, components
  4.3 Bipartite graphs
  4.4 Trees and forests
  4.5 Parsing trees
  4.6 Spanning trees
5 Recursion and induction
  5.1 Examples of some sets defined inductively
  5.2 Some examples of proof by induction
6 Propositional logic
  6.1 Statements, propositions, paradoxes
  6.2 Compound propositions, logical connectives
  6.3 Logical equivalence, truth tables
  6.4 Construction of truth tables
  6.5 Rules relating the operations of propositional logic
  6.6 Tautologies and contradictions
  6.7 Translating English sentences into the language of propositional logic
  6.8 The connectives , and
7 Predicate logic
  7.1 The universal quantifier
  7.2 The existential quantifier
  7.3 Multiple quantifiers
  7.4 Relating the existential and universal quantifiers
  7.5 Translating English sentences using quantifiers
  7.6 A systematic approach to translation
  7.7 First we find expressions for the components
  7.8 Quantifying more than one variable
  7.9 Translating whole statements
8 Boolean algebra
  8.1 Definition of a Boolean algebra
  8.2 Examples of Boolean algebras
  8.3 Principle of duality for Boolean algebras
  8.4 Rules deducible from the axioms
  8.5 Verifying the extra rules
  8.6 Boolean expressions
  8.7 Tidying up Boolean expressions
  8.8 Disjunctive normal form
  8.9 Relating disjunctive normal form and truth tables

1 Sets

1.1 Definition and description of a set

A set is a collection of objects (known as its elements) for which membership is decidable.
The definition is a bit mysterious: what do we mean by the phrase
"for which membership is decidable"?
Basically a set is a collection of things. It might be described explicitly,
by listing its elements, but it might be described differently.
We'll start by looking at some examples, described in various different
ways.


Examples 1.1 (Different types of sets)

Sets described by listing their elements, between brackets { and }
and separated by commas, e.g. {1, 3, 5, 7, 9}, the odd digits.
We read { as "the set (consisting) of", and keep quiet as we read }, so
read the above as "the set consisting of 1, 3, 5, 7, 9".
Each element is counted once only (if it were convenient, we might
write an element down more than once) and the order the elements are
listed in is irrelevant.
Two sets are equal if they contain the same elements. So
{1, 3, 5} = {3, 1, 5} = {1, 1, 3, 5}.
We read ∈ as "is an element of", ∉ as "is not an element of":
1 ∈ {1, 3, 5}, 3 ∈ {1, 3, 5}, 4 ∉ {1, 3, 5}
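These conventions are mirrored directly by the built-in set type of a programming language; here is a small Python sketch (not part of the printed notes) of the facts above:

```python
# Python's set type ignores repetition and order,
# just like the mathematical definition.
A = {1, 3, 5}
B = {3, 1, 5}
C = {1, 1, 3, 5}   # the repeated 1 is counted once only

print(A == B == C)   # all three describe the same set
print(1 in A)        # 1 ∈ {1, 3, 5}
print(4 not in A)    # 4 ∉ {1, 3, 5}
```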
Other examples of sets:

{cat, horse, dog}, {0, 1, cat, monkey, 57}

Some standard sets:

Z, the integers,
Q, the rationals (all fractions),
R, the reals (all numbers met in nature, positive or negative, rational
or irrational),
C, the complex numbers,
N, the natural numbers,
∅, the empty set.
NB: In this course, we shall define 0 to be a natural number, so
N = {0, 1, 2, 3, . . .}.

But notice that some books exclude 0 from N, and call 1 the first natural
number. We call the set {1, 2, 3, . . .}, formed from N by excluding 0,
the positive integers, or Z+.

Sets described by rules, such as {x ∈ Z : x > 10}, which we read
as "the set of x in the integers such that x is greater than 10". We read
: as "such that", "so that", "for which", as is convenient. The symbol |
may be used instead of :; it means exactly the same.

Sets described inductively/recursively:

The set of all integer powers of 2, {2^n : n ∈ N}, can also be defined as
the smallest set S such that S contains 1, and if x ∈ S then 2x ∈ S.
Similarly,
N can be defined by the statement:
N contains 0, and if x ∈ N then x + 1 ∈ N, but nothing else is in N.
Z+ can be defined by the statement:
Z+ contains 1, and if x ∈ Z+ then x + 1 ∈ Z+, but nothing else is in
Z+.
And the set S defined by the rule:
S contains 10, and if x ∈ S then x + 1 ∈ S, but nothing else is in S
is simply {x ∈ Z : x ≥ 10}.
The fact that these sets can be defined in this way means it is easy
to construct them, deduce and prove properties using induction (we'll
meet this later), and to define functions over them.

Similarly, inductive definitions of trees and linked list structures make
it easy to work with those.
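The "easy to construct" point can be made concrete. Here is a Python sketch (the helper name and the bound are ours, for illustration) that builds the inductively defined set of powers of 2, truncated so the computation terminates:

```python
# Build the smallest set S with 1 ∈ S and (x ∈ S implies 2x ∈ S),
# truncated at a bound so that the construction terminates.
def powers_of_two(bound):
    S = {1}            # base case: S contains 1
    frontier = {1}
    while frontier:
        # closure rule: if x is in S, then 2x is in S (while 2x <= bound)
        new = {2 * x for x in frontier if 2 * x <= bound}
        frontier = new - S
        S |= new
    return S

print(sorted(powers_of_two(100)))   # [1, 2, 4, 8, 16, 32, 64]
```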


A collection of elements that is not properly defined is not a set!

For a collection of elements to be a set, membership must be decidable.
Consider
Ω, the collection of all sets A such that A ∉ A.
Is Ω an element of Ω? We try to decide. Let's see what happens: if Ω is an element of Ω, then . . .

if Ω is not an element of Ω, then . . .

Both possibilities lead us to nonsense. So in fact, Ω can't be a set; the definition is inconsistent. This strange situation is known as Russell's paradox.


1.2 Basic set operations and relations

Definition 1.2 (Cardinality) We define the size (cardinality) of a set
to be the number of elements in it, and use the notation |A|.
A set might have infinite cardinality. Note too that even if an element is
listed more than once in the description of a set, it is only counted once.

Definition 1.3 (Subset) Suppose that A, B are sets. We say that
A is a subset of B, and write A ⊆ B or A ⊂ B, if every element of A is
also in B.

Notice that A ⊆ A and ∅ ⊆ A.

We can write the symbols ⊆ and ⊂ backwards as ⊇ and ⊃. Then B ⊇ A
and B ⊃ A mean exactly the same as A ⊆ B and A ⊂ B. If A is not a
subset of B, then we write A ⊈ B.
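In Python, cardinality is len and the subset relations are the comparison operators on sets, as this small sketch shows:

```python
A = {1, 3, 5}
B = {1, 2, 3, 4, 5}

print(len(A))        # 3, the cardinality |A|
print(A <= B)        # A ⊆ B: every element of A is also in B
print(A < B)         # A ⊂ B: proper subset
print(set() <= A)    # ∅ ⊆ A holds for every set A
print(A <= A)        # A ⊆ A holds for every set A
```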


Definition 1.4 (Binary operations) We define the intersection of A
and B, written A ∩ B, to be the set of all elements of A that are also in
B.
We define the union of A and B, written A ∪ B, to be the set whose
elements consist of all the elements of A together with all the elements
of B.
We define the set difference A \ B to be the subset of A formed from A
by deleting all elements also in B.
We define the symmetric difference A + B to be the union of A \ B and
B \ A.
Instead of + as a notation for symmetric difference, we also see △ and
⊕.

Definition 1.5 (Universal set, complement) We define a universal
set U to be the set that contains as an element everything we are currently
interested in (e.g. it might be the set of all people on a database). Then
we define the complement A^c of a set A to be the set U \ A.
We can illustrate the definitions of these operations with some examples:

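For instance, each of the operations just defined has a direct Python counterpart (the particular sets below are our own illustrative choice):

```python
U = set(range(10))        # a universal set for this example
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

print(A & B)    # intersection A ∩ B
print(A | B)    # union A ∪ B
print(A - B)    # set difference A \ B
print(A ^ B)    # symmetric difference A + B
print(U - A)    # complement A^c, relative to U
```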

Notice that some different combinations of operations always give the
same answer, e.g.
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)

as we can verify by working an example. Of course this isn't just a coincidence . . .

1.3 Rules relating the set operations

Commutative laws:    A ∪ B = B ∪ A                      A ∩ B = B ∩ A
Associative laws:    A ∪ (B ∪ C) = (A ∪ B) ∪ C          A ∩ (B ∩ C) = (A ∩ B) ∩ C
Distributive laws:   A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)    A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
U and ∅:             A ∩ U = A                          A ∪ ∅ = A
Complements:         A ∩ A^c = ∅                        A ∪ A^c = U
De Morgan's laws:    (A ∪ B)^c = A^c ∩ B^c              (A ∩ B)^c = A^c ∪ B^c
Idempotent laws:     A ∪ A = A                          A ∩ A = A
Double complement:   (A^c)^c = A

We can deduce further rules as simple consequences of these, such as

(A ∪ B ∪ C)^c = A^c ∩ B^c ∩ C^c,
which we get by applying the first of De Morgan's laws twice. As an
example, we verify this last rule for a particular choice of U, A, B, C.
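Such a verification for one particular choice of U, A, B, C can also be done mechanically; a Python sketch (the sets are our own illustrative choice):

```python
U = set(range(12))
A, B, C = {1, 2, 3}, {3, 4, 5}, {5, 6, 7}

lhs = U - (A | B | C)                # (A ∪ B ∪ C)^c
rhs = (U - A) & (U - B) & (U - C)    # A^c ∩ B^c ∩ C^c

print(lhs == rhs)   # the two sides agree for this choice
```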

Note that rules relating set operations generally come in pairs. Given one
rule, we get a second, usually distinct, rule by swapping ∪ with ∩ and ∅
with U. The principle of duality makes this work. That says that for any
statement that is true about sets in general, the statement that you get
from it by replacing each ∪ by a ∩, each ∩ by a ∪, each U by a ∅ and
each ∅ by a U is also true. The second statement is called the dual of the
first. If the second statement is equal to the first, then the statement is
said to be self-dual; e.g. (A^c)^c = A.
Example: The dual of
(A ∪ B ∪ C)^c = A^c ∩ B^c ∩ C^c
is
(A ∩ B ∩ C)^c = A^c ∪ B^c ∪ C^c

1.4 Venn diagrams

Venn diagrams provide a useful technique to verify rules that hold in set
theory. A Venn diagram for a collection of sets A, B, C, . . . is a diagram
that displays each set as an enclosed region within the plane. Usually
we draw a rectangular region representing the universal set U and then
represent each of the sets A, B, C, . . . as connected regions within the
rectangle. An intersection of two of those sets, such as A ∩ B, is displayed
as the region common to both the regions representing the two sets (A
and B).
As examples, we'll draw the Venn diagrams
(a) for a single set A within the universal set. In that diagram we can
easily identify A and A^c.

(b) for two subsets A, B of the universal set. In that diagram we can easily
identify A ∪ B, A ∩ B, A \ B, B \ A, A + B, A^c, etc.

(c) for three subsets A, B, C of the universal set.

(d) for four subsets A, B, C, D of the universal set. By now it's starting
to get a bit trickier . . .

We can use a Venn diagram to verify a general law about sets by drawing
the sets A, B, C, . . . in the most general configuration possible (i.e. so that
each possible intersection between the sets is represented by a distinct nonempty region), and then using shading to help us construct and compare
the various sets that are involved in the law.
Example: we verify the law
(A ∩ B)^c = A^c ∪ B^c

by drawing two Venn diagrams, as below.

In the first we identify the set (A ∩ B)^c. We shade A ∩ B, and then see
its complement as the unshaded portion of the diagram.
In the second we shade A^c one way, B^c another way, and then identify
their union as the total shaded area. The fact that the regions identified
in the two diagrams match up verifies that the two sets are equal.

Example: we verify the law

(A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C)

Again we use two diagrams, as below.

In the first, we mark A ∩ B and C by shading in two different ways, and
then identify (A ∩ B) ∪ C as the total shaded area.

In the second we mark A ∪ C by shading one way, B ∪ C by shading
another way, and then their intersection as the area that is shaded both
ways. Since we can see that the regions identified in the two diagrams
match up, we have verified that the two sets are equal.

We can also use Venn diagrams to deduce some properties of a particular
configuration of sets from others.
For example, suppose that we have three sets A, B, C for which B ⊆ A
and A ∩ C = ∅. Then we can deduce that B ∩ C = ∅.

We draw a Venn diagram for A, B, C that incorporates the information
that B ⊆ A by putting the region representing B within the region
representing A. Then since A ∩ C = ∅ we draw the region representing C
as disjoint from the region representing A. Then it is necessarily disjoint
from the region representing B, and hence B ∩ C = ∅.

But notice that if now D is a fourth set that contains A, D ∩ C need not
be empty!

1.5 Power sets

The power set of A, written P(A), or (sometimes) 2^A, is the set of all
subsets of A.
Where A has n elements, its power set has 2^n.
As examples we write down P({1, 2}) and P({1, 2, 3}) and count their
elements.
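The counting can be checked programmatically; a Python sketch (the helper name power_set is ours):

```python
from itertools import combinations

def power_set(A):
    """All subsets of A, as a list of frozensets (subsets of each size r)."""
    elems = list(A)
    return [frozenset(c) for r in range(len(elems) + 1)
                         for c in combinations(elems, r)]

print(len(power_set({1, 2})))      # 4 = 2^2 subsets
print(len(power_set({1, 2, 3})))   # 8 = 2^3 subsets
```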


1.6 Direct products

The direct product of two sets A, B, written A × B, is the set

{(a, b) : a ∈ A, b ∈ B}.

More generally, the direct product of n sets A1, A2, . . . , An is the set
{(a1, a2, . . . , an) : a1 ∈ A1, a2 ∈ A2, . . . , an ∈ An}.
Order of composition is important, i.e. in general A × B is not the same
as B × A.

Some examples:-

Direct products are also called Cartesian products.

The sets A1, . . . , An are called the components of the Cartesian product
A1 × . . . × An.

The size of A1 × A2 × . . . × An is equal to the product of the sizes of its
components, i.e.
|A1 × A2 × . . . × An| = |A1| × |A2| × · · · × |An|.
We often store data in sets that are, in effect, direct products, although
we may not call them such. We might call such sets records, or structures.
We usually access the components of a record using dot notation.
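Both the size law and the fact that order matters are easy to check; a Python sketch using itertools (the particular sets are our own choice):

```python
from itertools import product

A = {1, 2, 3}
B = {'x', 'y'}

AxB = set(product(A, B))   # the direct product A × B, as a set of pairs
BxA = set(product(B, A))

print(len(AxB))            # 6 = |A| × |B|
print(AxB == BxA)          # False: (1, 'x') is in A × B, ('x', 1) is not
```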


2 Relations

2.1 Definition and examples

Basically a relation is a rule or property that holds for some elements of
a set but not for others, or between some (ordered) pairs of elements,
or triples of elements, but not for others. In order to talk sensibly about
relations we need a proper, more formal definition.
Definition 2.1 (Relation) Given sets A1, A2, . . . , An, we define an n-ary
relation R between A1, A2, . . . , An to be a subset of A1 × A2 × . . . × An.

If we have just one set, that is, if n = 1, then R is a unary relation.
If n = 2, then we have a binary relation.
If n = 3, then we have a ternary relation.

Often A1, A2, . . . , An are all equal to the same set A, in which case we can
say our relation is on A. Where R is an n-ary relation and (x1, . . . , xn) ∈
R, we say that R holds at (x1, . . . , xn). Sometimes we write R(x1, . . . , xn)
rather than (x1, . . . , xn) ∈ R.

In this course most of the relations we shall meet will be either binary or
unary. A few will be ternary.
We also use the terms property and predicate to mean exactly the same
thing as relation. In particular we usually use the term property for a
unary relation, and we'll meet the term predicate later when we do some
logic.
Some unary relations:
On Z, "x is even", "x is positive". We can write the statement "x is even"
in set theory language as x ∈ EvenInt, or in the form EvenInt(x).

Similarly we can write the statement "x is positive" as x ∈ PosInt or as
PosInt(x).
For the set A = {1, 2, 3, 4, 5}, the relation EvenInt on A is the subset
{2, 4}.

Some binary relations:

On Z, we have "x + y is even", "x is greater than y". We can write the
statement "x + y is even" in set theory language as (x, y) ∈ EvenSum
or EvenSum(x, y). For the set A = {1, 2, 3, 4, 5}, the relation
EvenSum on A is the set of pairs

For the same set the relation IsGreaterThan is the set of pairs
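These sets of pairs can be listed by a comprehension; a Python sketch of both relations on A = {1, 2, 3, 4, 5}:

```python
A = {1, 2, 3, 4, 5}

# EvenSum = {(x, y) : x, y ∈ A, x + y is even}
even_sum = {(x, y) for x in A for y in A if (x + y) % 2 == 0}

# IsGreaterThan = {(x, y) : x, y ∈ A, x > y}
is_greater_than = {(x, y) for x in A for y in A if x > y}

print((1, 3) in even_sum)          # 1 + 3 is even, so the pair is in
print((1, 2) in even_sum)          # 1 + 2 is odd, so it is not
print((4, 2) in is_greater_than)   # 4 > 2
print(len(is_greater_than))        # 10 pairs in all
```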

Where Ω is any set, for x ∈ Ω, A ∈ P(Ω), "x is an element of A"
defines a binary relation between Ω and P(Ω). For Ω = {1, 2, 3}, we
have
P(Ω) =
and hence the relation ∈ is the set of pairs

A ternary relation:
On Z, SumEquals(x, y, z) to mean x + y = z. For the set A =
{1, 2, 3, 4, 5}, the relation SumEquals on A is the set of triples

2.2 Binary relations; infix notation

Binary relations are particularly useful, and since a binary relation relates
two variables, it is often pleasant to use a different kind of notation to
denote them, which we call infix notation. In this notation, given a binary
relation R, rather than write (a, b) ∈ R, or R(a, b) (prefix notation), we
write instead aRb.
There are some very common binary relations, many of which have special
non-alphabet characters as their most natural labels.
=, ≠, <, ≤ on Z, Q, R or any subset of those.
⊆ on the set of subsets of some set Ω, i.e. on P(Ω).
∈ relating Ω and P(Ω).
≡m on Z, defined (for m ∈ Z) by x ≡m y if y − x is a multiple of m.

2.3 Diagram of a binary relation

We often use diagrams to illustrate binary relations: points with directed
arcs joining them. (These are special cases of directed graphs, which we'll
meet later.)
Given a binary relation R on a set A, we draw a node for each element
of A, and then draw a directed arc from x ∈ A to y ∈ A if xRy.
As an example, we'll draw the diagram of ≤ on {1, 2, 3, 4, 5}.

2.4 Properties of a binary relation

A relation R on A is reflexive if for all x ∈ A, xRx. On the diagram for
R, this means that every node is connected directly to itself.
R is symmetric if for all x, y ∈ A with xRy, we also have yRx. On the
diagram for R, this means that whenever there is a directed edge from x
to y there is also one from y to x.
R is transitive if for all x, y, z ∈ A with xRy and yRz we also have xRz.
On the diagram for R this means that whenever there is a directed path
from x to z via y there is also a directed edge from x to z.
R is antisymmetric if whenever both xRy and yRx are true we have
x = y. On the diagram for R this means that we have at most one
directed edge between two distinct vertices.

We can examine these properties for some examples, such as <, ≤, ≡m:-
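For a relation stored as a finite set of pairs, each of the four properties is a direct translation of its definition; a Python sketch (the checker names are ours, for illustration):

```python
def is_reflexive(R, A):
    return all((x, x) in R for x in A)

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    return all((x, z) in R
               for (x, y) in R for (y2, z) in R if y == y2)

def is_antisymmetric(R):
    return all(x == y for (x, y) in R if (y, x) in R)

A = {1, 2, 3, 4, 5}
leq = {(x, y) for x in A for y in A if x <= y}   # the relation ≤ on A
lt  = {(x, y) for x in A for y in A if x < y}    # the relation < on A

print(is_reflexive(leq, A))       # ≤ is reflexive ...
print(is_antisymmetric(leq))      # ... antisymmetric ...
print(is_transitive(leq))         # ... and transitive
print(is_reflexive(lt, A))        # < is not reflexive
```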


2.5 Equivalence relations, partial orders and their diagrams

We call a relation that is reflexive, symmetric and transitive an equivalence
relation. As an example we have ≡m, for each m, on Z or any other set
of integers.
We call a relation that is reflexive, antisymmetric and transitive a partial
order.
As examples, we have ≤ on Z, and ⊆ on the set of subsets of a set Ω.
For equivalence relations and partial orders we have special, more compact
diagrams.

For an equivalence relation we form the equivalence graph by deleting all
the loops from the standard diagram of the relation, and replacing each
pair of directed edges between two related vertices by a single undirected
edge.

As an example we can draw both the standard diagram and the equivalence
graph of ≡3 on {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}.

For a partial order we form the Hasse diagram by deleting all the loops
from the standard diagram of the relation, and deleting any directed edge
from the diagram if it joins two vertices that are also connected by a directed
path of two or more edges. Further, if a is related to b we draw b higher
up the page than a (and so we don't need an arrow on the edge).
As an example we can draw both the standard diagram and the Hasse
diagram of ⊆ on the set of subsets of {1, 2, 3}.

3 Functions

3.1 Definition of a function, notation, examples

Definition 3.1 Given two sets X and Y , a function f from X to Y is
a rule that, given any element x ∈ X, defines an element f (x) in Y .
Three things are necessary to define a particular function: the set X (called
the domain), the set Y (called the codomain), and the rule itself. We
call f (x) the image of x under f .
In programming, the domain corresponds to the type of the input argument
of the function, and the codomain to the type of the returned value.
We usually use letters like f, g, h as names of functions, or sometimes
f1, f2, . . .. But it doesn't really matter.

Commonly we describe a function using notation like

f : Z → Z,  f (x) = x^2

or alternatively

f : Z → Z,  x ↦ x^2

My example above probably looks quite familiar, because the function
deals with numbers, but there's no reason why either X or Y should be
a set of numbers. They could be sets of anything.
The following all define valid functions.
X is the set of people in this room, Y = R, f (x) is the body temperature of x, for each x ∈ X.

X is the set of people in this room, Y the set of all female first names,
f (x) is the first name of the mother of person x.

X is a set of records, Y the set of possible values of one particular
component, f (x) is the value of that component for record x.

Any relation R on sets A1, . . . , An can be thought of as a function
from A1 × . . . × An to the set {True, False}. We define R(x1, . . . , xn)
to take the value True when the relation holds on (x1, . . . , xn) and
False otherwise. This explains why we're using the same notation for
functions as we already (sometimes) used for relations. A function with
codomain {True, False} is called a Boolean valued function.
The terms mapping and map mean exactly the same as function.

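The two views of a relation, as a set of tuples and as a Boolean valued function, carry the same information; a Python sketch (reusing the EvenSum relation from Chapter 2, with names of our choosing):

```python
A = {1, 2, 3, 4, 5}
EvenSum = {(x, y) for x in A for y in A if (x + y) % 2 == 0}

def even_sum(x, y):
    """The Boolean valued function associated with the relation EvenSum."""
    return (x, y) in EvenSum

print(even_sum(1, 3))   # the relation holds at (1, 3)
print(even_sum(1, 2))   # the relation does not hold at (1, 2)
```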

For any set X, we can define a special function, which we call the identity
function on X.
Definition 3.2 (Identity function) The identity function iX : X → X
is defined by the rule iX(x) = x for all x ∈ X.
So iZ is a function with domain and codomain Z, iR a function with
domain and codomain R, etc. Each identity function iX has the same
rule: iX(x) = x for each x ∈ X.

While the codomain of a function tells us where its output should lie (and
so specifies its type), the range specifies the set of outputs completely.
Definition 3.3 (Range) Where f : X → Y is a function, the set of
all elements of Y that are of the form f (x) for some x in X is called the
range of f . In set theory, we would describe this set as
{y ∈ Y : y = f (x) for some x ∈ X}
Some people use the word image instead of range.
For our first example f : Z → Z, x ↦ x^2, the range of f is the set of all
squares, {0, 1, 4, 9, 16, 25, . . .}.

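For a finite domain, the range is exactly the set of images, computed by one comprehension; a Python sketch on a finite slice of Z (the slice is our own illustrative choice):

```python
X = {-3, -2, -1, 0, 1, 2, 3}   # a finite slice of Z, for illustration

def f(x):
    return x ** 2

rng = {f(x) for x in X}        # {y : y = f(x) for some x ∈ X}

print(rng)                     # the range of f restricted to X
print(rng <= set(range(10)))   # the range sits inside the codomain
```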

3.2 The graphs of a function, directed and Cartesian

Since a function f is specified by the two sets X, Y and all the pairs
(x, f (x)), it is actually a binary relation on X × Y . Each element x of X
is related by f to its image f (x) in Y (but careful: the Boolean valued
function we associate with that binary relation is not f ). So it is natural to
use a diagram to represent the function, consisting of points and directed
arcs. A cluster of points on the left of the diagram represents the elements
of X, another cluster of points on the right represents the elements of Y .
For each element x of X, a directed arc joins it to its image f (x) in Y .
NB. If X and Y have some points in common we see each common point
twice, once as an element of X, and once as an element of Y .
We call this diagram the directed graph of the function.

We'll look at some examples, not forgetting the identity function. It is
easy to identify the range of a function from its directed graph.

Of course we are also used to representing functions by a different kind of
graph, which uses axes and coordinates. The points of X are marked off
on an x-axis, the points of Y on a y-axis, defining a coordinate system.
Then each point (x, f (x)) is plotted. We'll call this kind of graph the
Cartesian graph of the function.
Of course we can only use a Cartesian graph to represent a function that
maps a set of numbers to a set of numbers. A directed graph can be used
a bit more generally, but in practice we're unlikely to use it if X and Y
are infinite.
We'll look at some examples, not forgetting the identity function. It is
easy to identify the range of a function from its Cartesian graph.

3.3 Functions vs. non-functions

Let X be the set of student numbers of people in this room. Let Y1 be
the set of all female first names, let Y2 be the set of all first names.
If f1(x) is the first name of the person with student number x, then f1
doesn't define a function from X to Y1, since some people in this room
are male. f1 : X → Y1 is not a function because f1(x) is not defined as
an element of Y1 for every x ∈ X; i.e. f1(x) is outside the codomain for
some elements of X.
But the same rule f1 defines a function from X to Y2.
Now let f2(x) be the name of the friend of the person with student number
x. We'd like to use f2 to define a function from X to Y2, but we have
a problem since some people have more than one friend, while some may
have none. f2 : X → Y2 is not a function because f2(x) is undefined or
ambiguous for some x in X.
For f : X → Y to be a function we need exactly one value of f (x) for
each element x of X, and we need that value to be an element of Y .
Look at this pictorially. In the directed graph that represents a function,
we see one directed arc out of each element of X, leading to an element
of Y . Some elements of Y may be at the end of more than one arc, and
some elements of Y may be at the end of no arc.
Here are some examples of directed graphs, some of functions, some of
non-functions.

3.4 Composing functions

Sometimes we have more than one function and it makes sense to combine
them.
Definition 3.4 (Function composition) Given functions f : X → Y
and g : Z → W ,
if Y ⊆ Z, the composite g ∘ f is the function with domain X and
codomain W defined by the rule
g ∘ f (x) = g(f (x)).

If Y ⊈ Z, then g ∘ f is not defined.

Note that we can't always compose two functions. We can only define
the composite g ∘ f of functions f and g if the codomain of f is in the
domain of g. We might have both of f ∘ g and g ∘ f defined, we might
have either one but not the other defined, and we might have neither
defined. Where both are defined they are probably not equal.

Examples:
First let X = {1, 2, 3}, Y = {4, 5, 6, 7}, Z = {4, 5, 6, 7, 8}, W =
{1, 2, 3, 4, 5} and define f : X → Y, g : Z → W by f (1) = 4,
f (2) = 6, f (3) = 7, g(4) = 2, g(5) = 3, g(6) = 4, g(7) = 5, g(8) = 5.
Draw the directed graphs of f and g and study the composition graphically.
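The same composite can be computed rather than drawn; a Python sketch storing these finite functions as dictionaries:

```python
# The functions from the example, stored as dictionaries.
f = {1: 4, 2: 6, 3: 7}
g = {4: 2, 5: 3, 6: 4, 7: 5, 8: 5}

# g ∘ f is defined because the codomain of f, {4, 5, 6, 7},
# is a subset of the domain of g, {4, 5, 6, 7, 8}.
assert set(f.values()) <= set(g.keys())

g_after_f = {x: g[f[x]] for x in f}   # the rule g ∘ f(x) = g(f(x))
print(g_after_f)
```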


Next let R≥0 = {x ∈ R : x ≥ 0}, and define f : R → R≥0 by f (x) =
x^2, g : R → R by g(x) = x − 1, h : R≥0 → R≥0 by h(x) = x + 1.
Then compute the compositions (and see that h ∘ g doesn't work).

3.5 Inverse functions

Sometimes we want to undo the effect of a function. We know f (x) but
we want to know where it came from, what x was. That could be very
important if we wanted to detect an error in a program. Maybe we have
a program that runs through a large set of input values x, and for each
value of x we have an output value y, computed using some function f .
If the program is interrupted we may want to recover x. Maybe memory
constraints meant we couldn't store it. Can we recover x?
We can if f has an inverse.
Definition 3.5 (Inverse function) Given a function f : X → Y ,
a function g : Y → X is an inverse for f if g ∘ f (x) = x for all
x ∈ X and also f ∘ g(y) = y for all y ∈ Y .

Another way of phrasing the above is to say that g ∘ f = iX and f ∘ g = iY.

We say above that g is an inverse for f . In fact if f has an inverse, it
only has one. But it doesn't have to have one at all. If f has an inverse
we usually call it f^-1.
To explore, we'll draw the graphs (directed or Cartesian) of some functions.
If the function has an inverse we'll draw its graph too.
f1 : R → R, x ↦ 2x.

f2 : R → R≥0, x ↦ x^2. (We define R≥0 to be {x ∈ R : x ≥ 0}.)

f3 : Z → Z, x ↦ 2x.

f4 : {0, 1, 2, 3, 4, 5} → {0, 1, 2, 3, 4, 5}, x ↦ 5 − x.

f5 : R → R, x ↦ 5 − 2x.

Where f is invertible, we get the directed graph of f^-1 by reflecting the
directed graph of f left to right, and we get the Cartesian graph of f^-1
by reflecting the Cartesian graph of f in the line y = x.
If f isn't invertible the reflections of its graphs won't be the graphs of
functions.
To illustrate this, we'll draw the reflected graphs of each of our examples
that don't have inverses:

3.6 Properties that make a function invertible

We look at our five examples to see what can stop a function having an
inverse; f2 and f3 were not invertible.
We have problems defining an inverse for f2 because each of the values
in the codomain is the image of two different values in the domain. We
don't know which value the inverse should take it back to. E.g. f2(2) =
f2(−2) = 4, but we cannot have both f2^-1(4) = 2 and f2^-1(4) = −2.
We have problems defining an inverse for f3 because there are values of
the codomain that are not in the range. We cannot define f3^-1(3) because
3 is not in the range.
These problems are visible in the graphs of f2, f3. What we have observed
is that f2 is not injective, and f3 is not surjective.

Definition 3.6 (Injective, surjective, bijective) Let f : X → Y
be a function.
f is injective (or 1-1) if whenever x1, x2 are distinct elements of X, then
f (x1) ≠ f (x2).

f is surjective (or onto) if every element y ∈ Y is an image f (x) for at
least one x ∈ X.
f is bijective if it is both injective and surjective.

An injective mapping is called an injection, a surjective mapping a surjection, and a bijective mapping a bijection.
Let f : X → Y be a function. It turns out that
if f is injective, then there is a function g : Y → X for which g ∘ f = iX.
g is called a left inverse for f .
if f is surjective, then there is a function h : Y → X for which
f ∘ h = iY. h is called a right inverse for f .
Actually there could be many left or right inverses.
if f is bijective, then f actually has an inverse, which is both a left
inverse and a right inverse.
So the properties of injectivity and surjectivity together are what we need
to have an inverse.
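For finite functions, injectivity and surjectivity can be tested directly from the definitions; a Python sketch (the checker names are ours, and the functions tested are f4 and a squaring map from the examples above):

```python
def is_injective(f, X):
    images = [f[x] for x in X]
    return len(images) == len(set(images))   # no repeated image

def is_surjective(f, X, Y):
    return {f[x] for x in X} == set(Y)       # every element of Y is hit

# f4 : {0,...,5} → {0,...,5}, x ↦ 5 − x, a bijection
X = range(6)
f4 = {x: 5 - x for x in X}
print(is_injective(f4, X), is_surjective(f4, X, X))

# x ↦ x^2 on {−2,...,2} is neither injective nor surjective onto {−2,...,2}
X2 = range(-2, 3)
sq = {x: x * x for x in X2}
print(is_injective(sq, X2), is_surjective(sq, X2, X2))
```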

We can check these properties easily on the graphs of functions.

In the Cartesian graph:
for an injective function we see at most one point with any given y-coordinate,
for a surjective function we see at least one point with any given y-coordinate (for y in the codomain).
In the directed graph:
for an injective function we never see two arrows to the same point in
the codomain,
for a surjective function we see at least one arrow to each point in the
codomain.
We can draw some examples of graphs, and recognise which of them represent injective functions, which surjective and which bijective functions.

Examples (Cartesian graphs):

89

Examples (directed graphs):

90

3.7 Computing an inverse for a function

If we want to try and compute an inverse for a function, we start by looking
for a left inverse. Then we check to see if it is also a right inverse.
As an example, we compute the left inverse of
f : R → R, x ↦ 2x + 3
We rearrange the equation y = 2x + 3 to get x on its own, giving
x = (y − 3)/2. In this way we get a function g : R → R for which
g(y) = x, so g(f(x)) = x.
Now we can check that f(g(y)) = y for all y ∈ R.


91

3.8 Partial functions

Definition 3.7 Given two sets X and Y, a partial function f from X
to Y is a rule that defines an element f(x) ∈ Y for some elements x of
X, but may be undefined for others.
We write f : X ↛ Y to indicate that f is a partial function but not a
function from X to Y.
The subset of X consisting of those x for which f(x) is defined is called
the domain of f.
We sometimes use the term total function instead of just function, if we
want to stress the fact that we have a function, not just a partial function.

92

If f : X ↛ Y is a partial function with domain Z, we can define a (total)
function g : Z → Y by the rule g(z) = f(z) for all z ∈ Z. We call g the
restriction of f to Z, and write
g = f|Z.
Since every partial function has a total function closely associated with it,
a lot of what we have said already about functions remains true for partial
functions.
A partial function isn't much less than a function, and sometimes it's good
enough for our purposes.

93

Examples of partial functions:
• f from R to R defined by
f(x) = √x,
wherever this makes sense. This is a partial function with domain
{x ∈ R : x ≥ 0}.
• g from R to R defined by
g(x) = 1/(x² − 1)
wherever this makes sense. This is a partial function with domain
R \ {−1, 1}.

94

We can find a partial function g from Z to Z that is a left inverse of
the function
f : Z → Z, x ↦ 2x + 1
by rearranging y = 2x + 1 to get x on its own.
We get
x = (y − 1)/2.
The reason that
g : Z ↛ Z, y ↦ (y − 1)/2
is not a total function is that (y − 1)/2 is not an integer for all y ∈ Z. The
domain of g is the odd integers; it is equal to the range of f.
The partial function g satisfies g ◦ f = iZ.
95

Using g it's now easy to construct a (total) function that is a left inverse
of f. We simply define h(y) = g(y) whenever g is defined, and h(y) = 0
otherwise. Our construction guarantees that h ◦ f = iZ. So h : Z → Z is
a left inverse of f.
But f does not have an inverse: it cannot, since it is not surjective.

96

3.9 Functions defined recursively

The rule that defines a function is not always given explicitly. Sometimes
it could be given explicitly, but can be given much more simply in another
form, which describes the value of the function at each value of x in
terms of its value at smaller values of x. In many examples, the function
is defined on the positive integers, and its value at x can be defined in
terms of its value at positive integers less than x.
e.g. The factorial function, fact(n), also written n!. This is a function
from N to N, which is defined for any positive integer n to be the product
n(n − 1) · · · 1. It's defined very simply recursively by the pair of rules
fact(n) = n × fact(n − 1)
and
fact(1) = 1
97

So the easiest way to program it is as follows (in C):

int factorial(int n)   /* n assumed > 0 */
{
    if (n == 1) return 1;
    return n * factorial(n - 1);
}

e.g. The function f : N → N defined by f(n) = 2ⁿ can be defined
recursively by the rules
f(1) = 2,    f(n) = 2f(n − 1)
and similarly easily programmed.

98

A recursive definition of a function always has two components: the
recursive rule (which defines f(n) in terms of some of the earlier values of f)
and initial conditions, which fix the first few values of f: f(1) (or f(0)),
and possibly f(2), f(3), . . .. The number of initial conditions that are
necessary depends on the type of recursion. If the recursion is one-step
(that is, the rule defines f(n) in terms of f(n − 1) only), then only one
initial condition is necessary. But if the recursion is k-step, defining f(n)
in terms of f(n − 1), . . . , f(n − k), then k initial conditions are necessary.
Examples 3.8 Where f : N → N is the function given recursively by
the rule f(n) = 2f(n − 1) and f(1) = 2, we have
f(2) = 2 × 2 = 4, f(3) = 2 × 4 = 8, f(4) = 2 × 8 = 16. In fact the
function can be described very simply by the rule f(n) = 2ⁿ.

99

Where f : N → N is the function given recursively by the rule f(n) =
2nf(n − 1) and f(1) = 2, we have
f(2) = 2 × 2 × 2 = 8,
f(3) = 2 × 3 × 8 = 48,
f(4) = 2 × 4 × 48 = 384.
This function is in fact 2ⁿ fact(n) (also written 2ⁿn!).


Where f : N → N is given by f(n) = 12 × 3ⁿ, we have
f(n)/f(n − 1) = (12 × 3ⁿ)/(12 × 3ⁿ⁻¹) = 3.
So f(n) = 3f(n − 1). f can be described recursively by the rule f(n) =
3f(n − 1) and the initial condition f(1) = 36.
We can compute further examples.
100

101

4 Graphs

4.1 Definition and examples, notation

Definition 4.1 (directed graph) A directed graph consists of a set V
of vertices, and a set E of ordered pairs of vertices, called directed edges.
The directed edge e = (x, y) joins vertex x to vertex y; x is called the
initial vertex, y the terminal vertex of e.
If the two vertices of e coincide, then e is called a loop. Two distinct
edges cannot share both their initial and their terminal vertices.
We always think of directed graphs pictorially, representing each vertex as
a point in a plane, and a directed edge (x, y) as a directed arc from the point
x to the point y. We often abbreviate directed graph to digraph.
102

We understand that the same graph may be displayed on the page in
many different ways. It is defined simply by its set of vertices and the
connections between them, and it is irrelevant how these are placed on
the paper.
Hence the first two diagrams below represent the same directed graph.
Strictly speaking the third diagram represents a distinct (but isomorphic)
graph, since its vertices have been differently labelled, but we don't always
need to label the vertices of a graph.

103

We have already met directed graphs, used as diagrams for both relations
and functions (see sections 2 and 3); we can learn a lot about relations
and functions by studying the properties of the associated directed graphs.

104

Sometimes we prefer not to have arrows on our edges, and to outlaw
loops.
Definition 4.2 (graph) A graph consists of a set V of vertices, and a
set E of edges, each of which is a set of two distinct vertices of V. Where
e = {x, y} is an edge, we say that e joins x and y.
Of course we always think of graphs pictorially too, representing vertices
as points in a plane and edges as undirected arcs between pairs of vertices.
We have already met examples of graphs as the equivalence graphs of
equivalence relations (see section 2).

105

Given any digraph we can form a graph by deleting any loops and connecting by an edge any two vertices connected by 1 or 2 directed edges,
e.g.

106

We define the complete graph on n vertices, Kn, to be the graph on n
vertices in which any two vertices are joined by an edge. Kn has n(n − 1)/2
edges.
Examples:

107

108

We define the cycle graph Cn to be the graph consisting of n vertices
joined together in a single cycle of n edges.
Examples:

109

4.2 Paths, connectivity, components

Definition 4.3 We define a path (of length n) in a graph or directed
graph to be a sequence of vertices v0, v1, . . . , vn such that for each i,
(vi, vi+1) is an edge or directed edge. Such a path is called a circuit
(of length n) if v0 = vn, and a cycle if v0 = vn and all of v0, . . . , vn−1
are distinct. The length is the number of edges in the path/circuit. A
graph is said to be connected if any two vertices x, y are connected by a
path; a directed graph is said to be connected if its underlying graph is
connected.
Examples:

110

Definition 4.4 Suppose that G = (V, E) is a graph (or directed graph).
A subgraph of G is a graph (or directed graph) G′ = (V′, E′) whose
vertex set V′ is a subset of V, and whose edge set E′ is a subset of E.
A component of G is a connected subgraph G′ = (V′, E′) of G such that
no edge of G joins a vertex in V′ to a vertex outside V′.
Examples.

111

For the equivalence graph of an equivalence relation, the components
of the graph define the equivalence classes of the relation; two vertices
are in the same component if and only if they are related.
Example: the equivalence graph of ≡₃ (congruence mod 3) on {0,1,2,3,4,5,6,7,8,9}:

112

4.3 Bipartite graphs

Definition 4.5 A graph (or directed graph) is bipartite if its vertex set
V can be written as a disjoint union of two subsets V1 and V2 in such
a way that any edge in the graph joins a vertex in V1 to a vertex in V2.
More generally a graph (or directed graph) is k-partite if its vertex set V
can be written as a disjoint union of k subsets V1, . . . , Vk in such a way
that every edge joins two vertices in different subsets.
An alternative way of phrasing this is to say that a graph is k-partite if
the vertices can be coloured with k colours in such a way that any edge
joins two vertices of different colours. Of course if a graph is k-partite it's
also (k + 1)-partite.
We can draw lots of examples.
113

First some bipartite graphs (Note that the directed graph of a function is
always bipartite.)

114

Next some tripartite graphs

And some other multipartite graphs.

115

We define the complete bipartite graph Kr,s to be the bipartite graph on
r + s vertices with rs edges, each joining a vertex in a subset of r vertices
to one of the other s vertices.
We define complete k-partite graphs similarly, e.g. K2,3,4.

116

It can be proved that any graph that can be drawn on a piece of paper
with no edges crossing can be coloured with 4 colours (and so is 4-partite).
This is the famous four colour theorem.
Another well known theorem in graph theory tells us that a graph can
be drawn on a piece of paper with no edges crossing provided it doesn't
contain a subgraph that deforms to give either K5 or K3,3.
It is easy to check whether or not a graph is bipartite (2-partite). A graph
is bipartite if and only if all its cycles have even length.

117

4.4 Trees and forests

In computer science, certain graphs, known as trees, are important.
Definition 4.6 A tree is a connected graph without any cycles. It may
or may not have directions assigned to its edges. A vertex with just one
edge through it is called a leaf. Sometimes one vertex is selected to be called
the root of the tree; then the tree is known as a rooted tree. In that case,
the tree is called a binary tree if every non-leaf is joined to 2 vertices that
are further from the root than it. If x is a vertex in a rooted tree, then
the vertex joined to x and nearer to the root than x is called its parent,
and the other vertices joined to x (and further from the root than x) are
called its children.
A forest is a graph without any cycles, that is, a graph whose components
are trees.
118

Some diagrams of examples.

119

120

Rooted trees are used to define directory structure.

121

4.5 Parsing trees

Trees are also used for parsing, e.g. of arithmetic expressions, such as
(x + x × (y + z)) + y × (x − z)   or   x × [(x − y)³ + (x + y) × (x − y)]².
We see ×, + and − as binary operations, and exponentiation by n (for
each integer n) as a unary operation. Then any arithmetic expression is
built up using a combination of those binary and unary operations on a set
of constants and variables. We parse such an expression using a rooted
tree in which each vertex has either one or two children.
We form a parsing tree as follows. Each leaf is labelled by a constant
or variable, and each non-leaf by an operation. A non-leaf is then also
associated with the sub-expression that is formed by applying the operation
that labels that vertex to the one or two (maximal) subtrees whose root(s)
is/are the children of that vertex.
NB. Note that when we parse an arithmetic expression, we always parse
x + y × z as x + (y × z), x + y³ as x + (y³), and x × y³ as x × (y³), etc.
That is, exponentiation has priority over multiplication and addition, and
multiplication has priority over addition.

123

Examples:

124

4.6 Spanning trees

Definition 4.7 A spanning tree for a graph is a subgraph, involving all
the vertices, that is also a tree.
Given a graph G = (V, E) we can easily construct a spanning tree as
follows.
If G is not a tree, it contains a cycle. So we look for an edge that is in a
cycle, and delete it, to get a new graph G′ = (V, E′) with all the vertices
of G, but one fewer edge. If G′ is a tree, then it is a spanning tree for G.
If not, it contains a cycle, and we repeat the above procedure.
We continue until we arrive at a tree.
Example:
125

126

5 Recursion and induction

5.1 Examples of some sets defined inductively

• Z+ is defined as follows: (1) 1 ∈ Z+; (2) if x ∈ Z+ then x + 1 ∈ Z+.
Nothing is in Z+ that is not forced by either (1) or (2) to be in Z+.
• The set P of polynomials in x with real number coefficients is defined
as follows: (1) for any c ∈ R, c is in P; (2) if f is in P then so are xf
and f + c, for any c ∈ R.
Nothing is in P that is not forced by (1) or (2) to be in P.
• The set T of binary trees can be defined as follows:
(1) The graph K1 is in T. (2) If T is a binary tree and v is any leaf of
T, then the graph formed from T by attaching two edges and two new
vertices below v is in T. (We can draw a picture to illustrate this.)
Nothing is in T that is not forced by either (1) or (2) to be in T.


What are the advantages of recognising that a set can be defined inductively?
• We can give recursive definitions for functions that have that set as
their domain. Such definitions often give rise to very straightforward
programming, e.g. look at the way we could program the factorial
function on the positive integers.
• We can prove things about the set using the principle of mathematical
induction, e.g. we can prove:
  - The sum of the first n positive integers (i.e. 1 + 2 + 3 + . . . + n) is
    n(n + 1)/2.
  - If T is a binary tree with n leaves then T has 2(n − 1) edges.
How do we prove something using the principle of mathematical induction?
We split the argument into two pieces, the base of the induction and the
inductive step. The base of the induction deals with the first element
of the set, defined by the first rule, and the inductive step deals with the
second rule.
129

5.2 Some examples of proof by induction

Example 5.1 To prove that the sum of the first n positive integers (i.e.
1 + 2 + 3 + . . . + n) is n(n + 1)/2.
Proof:
Base of the induction We prove the statement for n = 1. The sum of the
first 1 positive integers is just the first positive integer, i.e. 1 itself. The
formula n(n + 1)/2 takes the value 1(1 + 1)/2 = 1 when we put n = 1
into it. So the formula gives the correct answer when n = 1. Hence the
base of the induction is proved.
The inductive step Suppose that the sum of the first n positive integers
is given by the formula n(n + 1)/2.
To prove the inductive step we need to deduce from this that the sum of
the first n + 1 positive integers is given by the formula
(n + 1)((n + 1) + 1)/2 = (n + 1)(n + 2)/2.
Now the sum of the first n + 1 positive integers is
1 + 2 + . . . + n + (n + 1) = (1 + 2 + . . . + n) + (n + 1)
and since we know the formula for the first n we can write this as
n(n + 1)/2 + (n + 1).
When we tidy this up we get
(n + 1)(n/2 + 1) = (n + 1)(n + 2)/2,
i.e. exactly the formula we were looking for. Hence the inductive step is
proved.
The principle of mathematical induction says that if we check just these
two things we have proved that the formula holds for every positive integer. □

132

Example 5.2 If T is a binary tree with n leaves then T has 2(n − 1)
edges.
Proof:
Base of the induction We need to prove the result for the binary tree with
just one leaf, i.e. for the tree that is a single vertex. This tree has no
edges. And this fits with the formula, since 2(1 − 1) = 0.
So the base of the induction is proved.
The inductive step Now suppose that we know that the result holds for
binary trees with n leaves. We want to deduce from this that the result
holds for binary trees with n + 1 leaves.
Now suppose that T is a binary tree with n + 1 leaves. Then we can
find two leaves in T that are joined by edges to a single vertex v; when
those two vertices and two edges are deleted from T we get a binary tree
T′ with 2 fewer edges and 2 fewer vertices than T. That is, T is formed
from T′ by adding two edges and two vertices below its leaf v.
(as in diagram)
The vertex v was a leaf of T′ but is not a leaf in T, but every other
leaf of T′ is also a leaf of T. So T′ has n leaves. Then since the result
holds for binary trees with n leaves, T′ must have 2(n − 1) edges.
Now since T has two more edges than T′, the number of edges in T is
2(n − 1) + 2 = 2n = 2((n + 1) − 1).
So the inductive step is proved.
The principle of mathematical induction says that by checking just these
two things we have proved that the formula holds for every binary tree. □
134

Example 5.3 2ⁿ ≥ n³ for any integer n with n ≥ 10.
NB: to do this example, we need to know the expansion
(n + 1)³ = n³ + 3n² + 3n + 1

135

136

Example 5.4 n³ − n is divisible by 3 for any integer n.

137

138

6 Propositional logic

6.1 Statements, propositions, paradoxes

The point of this section is to set up a framework for reasoning by
manipulating statements, in particular special kinds of statements that are
called propositions.
Definition 6.1 (statements, propositions) A statement is a sentence
that is not a question or a command. A proposition is a statement that
is either true or false, but not both. If a proposition is true we say it has
truth-value True. Otherwise it has truth-value False.
In programming, we usually use 1 to represent True and 0 to represent
False. In this course we'll often use T and F.
139

Examples 6.2
1. Some true propositions:
(a) 2 + 3 = 5. (b) Mumbai is in India. (c) Spurs won the FA cup in
1991.
2. Some false propositions.
(a) 10 > 87. (b) Newcastle is south of Leeds.
3. Propositions whose truth-value depends on one or more variables ("open statements").
(a) n is an even integer. (depends on n). (b) She is taller than me.
(Depends on she and me). (c) x + y > 7. (Depends on x and y).
4. The following are NOT propositions or statements.
(a) Well done! (b) How did you manage that?
140

5. The sentence "This sentence is false." is a statement, but not a
proposition. It cannot be true, and it cannot be false. It is in fact a
paradox.

141

6.2 Compound propositions, logical connectives

Definition 6.3 We make compound propositions by glueing together
simple propositions using the logical connectives and, or and not.
So from the two simple propositions
"Roses are always red."   "Violets are always blue."
we can make various compound propositions:
"Roses are always red and violets are always blue."
"Roses are always red or violets are always blue."
"Roses are not always red."
The truth-value of a compound proposition is determined both by the
truth-value of the constituent propositions, and by the way in which they
are connected together. We'll learn how to manipulate compound propositions;
the language and laws of this manipulation are known as the
propositional calculus.
To stop our explanation of this getting too wordy, we'll use symbols rather
than the English language.
We'll use letters like p, q, r to stand for propositions and call them
propositional variables.
∧ (called conjunction) means and, ∨ (disjunction) means or, and ¬
means not. T is a true proposition, F a false proposition.
We call an expression that we build out of these symbols a propositional
formula.

143

Strict rules define the way in which and, or and not should be
interpreted in logic (not necessarily as you might expect in English, where the
meanings are not always consistent).
Given propositional formulae f, g:
f ∧ g is defined to be true if both f is true and g is true, and is defined
to be false otherwise.
f ∨ g is defined to be true if f is true or g is true, including when both
are true, and false otherwise.
¬f is defined to be true when f is false, and false when f is true.
Note the meaning of ∨ as or is defined to be inclusive, as in "Would you
like milk or sugar with your tea?" rather than exclusive as in "Would you
like tea or coffee?".
144

6.3 Logical equivalence, truth tables

Definition 6.4 (Logical equivalence)
Two propositional formulae f, g are said to be logically equivalent if the
formula f is true whenever the formula g is true, and vice versa. We write
f ≡ g.
e.g. for any p, q, r, p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r).

We use truth tables to test for logical equivalence between propositional
formulae. Truth tables play the same role in logic that Venn diagrams do
in set theory.
The truth table of a formula has many rows and columns.
• There's a column for each variable in the formula, and then one for each
appearance of each of the (binary or unary) operations in the formula.
• There's a row for each possible combination of values for the variables
in the formula.
An iterative process (to be explained) computes the value of the whole
formula, and inserts it into the appropriate column of the truth table.
Two formulae are logically equivalent if and only if their truth tables
match, that is, if the values of the formulae match as the variables run
through all possible sets of values.

146

6.4 Construction of truth tables

The iterative process constructs the truth table for a formula by parsing
the formula, and then combining the truth tables for the basic operations
into which it breaks down.
147

Truth tables for the basic operations

The building blocks for any truth table are the truth tables for the three
basic operations, ∧, ∨ and ¬. These tables are as follows:

p q | p ∧ q      p q | p ∨ q      p | ¬p
T T |   T        T T |   T        T |  F
T F |   F        T F |   T        F |  T
F T |   F        F T |   T
F F |   F        F F |   F

Parsing the formula

When we have a formula involving more than one operation we need to
start by parsing it. In this way the operation at the root of the parsing tree
is identified. Each other operation is identified as the root of a subtree,
and hence corresponds to a subformula.
Within the truth table we have first a column for each variable, and then
a column for each operation appearing in the parsing tree (each operation
heading as many columns as it labels vertices of the parsing tree). We
use a column headed by an operation to store the value of the appropriate
subformula. We build up the value of the whole formula, working from
the leaves of the parsing tree until we reach the root.
We'll illustrate this process with an example.
149

Example 6.5 The truth table for (p ∧ q) ∨ (p ∧ r)
First we draw the parsing tree.
The root is labelled ∨, and the two non-leaves are labelled ∧,
corresponding to subformulae p ∧ q and p ∧ r. So we have a truth table with
6 columns, 3 labelled by the variables p, q, r and then 3 more, labelled
p ∧ q, ∨ and p ∧ r, where we store the values of the subformula p ∧ q,
the whole formula and the subformula p ∧ r.

p q r | p ∧ q | ∨ | p ∧ r
T T T |   T   | T |   T
T T F |   T   | T |   F
T F T |   F   | T |   T
T F F |   F   | F |   F
F T T |   F   | F |   F
F T F |   F   | F |   F
F F T |   F   | F |   F
F F F |   F   | F |   F

The values of the whole formula are in the 5th column.
NB: We should always run through all possible values of the variables in
a systematic order (as we have done here), to make it easier to compare
truth tables of different formulae to show their logical equivalence.
151

Example 6.6 We prove using truth tables that p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r).
We just built the truth table for (p ∧ q) ∨ (p ∧ r). Now we build the truth
table for p ∧ (q ∨ r).

p q r | p ∧ (q ∨ r)
T T T |      T
T T F |      T
T F T |      T
T F F |      F
F T T |      F
F T F |      F
F F T |      F
F F F |      F

We find the truth value for the whole formula in the fourth column of this
table, and compare it with the values in the fifth column of the other table.
Since the entries in those columns match (and the values of p, q, r have
been enumerated in the same order), we have proved the equivalence of
the two formulae.

153

6.5 Rules relating the operations of propositional logic

The equivalence we've just proved should look familiar; we've seen the
same equivalence for sets. In fact, using truth tables, we can also verify
all of the following, for propositional formulae f, g, h.

Commutative laws.   f ∧ g ≡ g ∧ f                     f ∨ g ≡ g ∨ f
Associative laws.   f ∧ (g ∧ h) ≡ (f ∧ g) ∧ h         f ∨ (g ∨ h) ≡ (f ∨ g) ∨ h
Distributive laws.  f ∧ (g ∨ h) ≡ (f ∧ g) ∨ (f ∧ h)   f ∨ (g ∧ h) ≡ (f ∨ g) ∧ (f ∨ h)
                    f ∧ T ≡ f                         f ∨ F ≡ f
                    f ∧ ¬f ≡ F                        f ∨ ¬f ≡ T
De Morgan's laws.   ¬(f ∧ g) ≡ (¬f) ∨ (¬g)            ¬(f ∨ g) ≡ (¬f) ∧ (¬g)
Idempotent laws.    f ∧ f ≡ f                         f ∨ f ≡ f
                    ¬(¬f) ≡ f

It's clear that there's a connection between set theory and propositional
calculus. It's like two different languages. Sets translate to propositional
formulae, ∩ to ∧, ∪ to ∨, U to T and ∅ to F. There's a reason for this,
which will become clearer later in the course.
We see again the duality that we saw with sets, where ∧ is paired with
∨, and T with F.

155

6.6 Tautologies and contradictions

Definition 6.7 (Tautology, contradiction)
A propositional formula that is always true is called a tautology. A
propositional formula that is always false is called a contradiction.
Example 6.8 Show that (¬(p ∧ q)) ∨ p is a tautology.
We simply construct the truth table. Since all values of p and q give a T
in the column for the whole formula (under the ∨), it is a tautology.

p q | p ∧ q | ¬(p ∧ q) | (¬(p ∧ q)) ∨ p
T T |   T   |    F     |       T
T F |   F   |    T     |       T
F T |   F   |    T     |       T
F F |   F   |    T     |       T

156

6.7 Translating English sentences into the language of propositional logic

We need to be able to break a proposition down into its constituent parts.
We have to think very hard about what the sentence really means. We'll
look at some examples:
• "Sue is a beautiful woman" can be broken down as "Sue is beautiful" and
"Sue is a woman". The negative of this is then equivalent to "Either Sue
is not beautiful or Sue is not a woman". (Remember de Morgan's law.)
• "Mike is a wonderful guitarist" is already as simple as it can get, i.e. it
does not mean the same as "Mike is wonderful and Mike is a guitarist".
Nobody said that Mike was wonderful.
• To translate "The work is easy to do but not satisfying" we first translate
"but" as "and not". Then the whole sentence can be broken down as
"The work is easy to do and the work is not satisfying". The negative of
this is equivalent to "Either the work is not easy to do or it is satisfying".
Soon we shall see that this negative is equivalent to the statement "If
the work is easy to do then it is satisfying".
We can build further propositional formulae with the introduction of
additional connectives.

158

6.8 The connectives →, ← and ↔

We define connectives →, ← and ↔ by specifying their truth tables, as
follows:

p q | p → q      p q | p ← q      p q | p ↔ q
T T |   T        T T |   T        T T |   T
T F |   F        T F |   T        T F |   F
F T |   T        F T |   F        F T |   F
F F |   T        F F |   T        F F |   T

We can verify, using truth tables, that
p → q ≡ ¬p ∨ q,
p → q ≡ q ← p,
p ↔ q ≡ (p → q) ∧ (q → p).
159

We read f → g as "f implies g", or "if f then g". Similarly we read
f ← g as "f is implied by g" and f ↔ g as "f if and only if g".
Note that by definition f → g is true provided that whenever f is true
then g is also true. Hence in logic, f → g is true whenever f is false,
irrespective of the value of g. This may not seem consistent with English
usage, but it is true in logic. Hence the proposition "If pigs can fly then
the pope is a Protestant" is true.
Note that the negative of p → q is equivalent to the negative of ¬p ∨ q,
and hence by de Morgan's law is equivalent to p ∧ ¬q. Hence the negative
of "If pigs can fly then the pope is a Protestant" must be equivalent to
"Pigs can fly and the pope is not a Protestant".
NB: we've used the notation →, ← and ↔. Often you see ⇒, ⇐ and
⇔, meaning the same.
160

161

7 Predicate logic

We've already spent a little time studying propositional logic, manipulating
formulae consisting of variables representing propositions linked by the
logical connectives ∧, ∨, ¬, →, ← and ↔. In propositional logic, we
work from the principle that a proposition is either true or false, and we
are interested in knowing which truth values of the propositional variables
involved in a formula make the compound proposition it represents true,
and which make it false.
But propositional logic can be a bit restrictive and doesn't allow very
sophisticated arguments. For more sophistication we need to allow
manipulation of formulae at a level lower than propositions. We need to
break each proposition down into its constituent parts, such as variables,
constants, logical connectives and predicates. Predicate logic allows us to
do this.
To illustrate this we'll explain how predicate calculus allows us to deduce
the statement
(3): My cat has blue eyes
from the pair of statements
(1): My cat is a Siamese cat and (2): Every Siamese cat has
blue eyes.
163

We need to write each of the three statements in terms of predicates
(relations).
We can write the first statement in terms of the binary predicate ∈.
Setting its two arguments equal to constants "my cat" and (the set)
"Siamese cats" we get a translation of the first statement as
(1): my cat ∈ Siamese cats.
We define a unary predicate Q to mean "has blue eyes". Then the third
statement translates as
(3): Q(my cat)
The second statement is a little more complicated. We can express it using
the two predicates ∈ and Q, and a little more. In fact we can phrase it as
(2): For every x ∈ Siamese cats, Q(x).
Now the logic should be clear. The second statement tells us that Q(x)
holds whenever x is an element of the set "Siamese cats". The first
statement tells us that my cat is in that set. Hence Q(my cat) holds.
Predicate calculus gives us a mechanism (including symbols) to express
the second statement properly, and hence to reason in this way.

165

7.1 The universal quantifier

We use the symbol ∀ in predicate logic to mean "for all", "for every",
"for each". ∀x means "for all x". Often "for all x" really means "for all
appropriate x", that is, for all x in some set of values which make sense
in the current context. In that case we write ∀x ∈ S, where S is that set
of appropriate values. The symbol ∀ is known as the universal quantifier.
Hence the statement above can be written in symbols as
∀x ∈ Siamese cats Q(x)
Note that although x appears as a variable in this proposition, it is not free,
that is, the proposition does not make sense if we replace x by a constant.
We say that x is a bound variable. Although x cannot be replaced by
a constant, it can be replaced by another variable, without changing the
value of the proposition. The propositions
∀y ∈ Siamese cats Q(y) and ∀x ∈ Siamese cats Q(x)
are logically equivalent to each other.
By contrast the variables x in the two statements x ∈ Siamese cats
and Q(x) are free; the statements are meaningful when those variables
are replaced by constants (as in the first and third statements).
In order to be able to manipulate formulae involving the universal quantifier,
we need to know precisely when a formula involving it is considered
(in logic) to be true. Given a set S and a unary predicate P, the proposition
∀x ∈ S P(x)
is defined to be true precisely if the proposition P(x) is true for every
element x of S. Otherwise the proposition is false.
167

Similarly the proposition
∀x P(x)
is defined to be true precisely if P(x) is true for all possible values of x
(whatever that means).
Examples 7.1
• Where, for x ∈ N, the statement P(x) means "x is a perfect square"
(that is, x = y² for some integer y), the statements P(0) and P(1) are
true, P(2) is false, P(3) is false, P(4) is true, . . ..
So the statement ∀x ∈ N P(x) is false.
• Where P(x) means "x loves the underdog", the statement ∀x P(x) means
"everybody loves the underdog".
168

7.2

The existential quantifier

We use the symbol in predicate logic to mean there exists, there is,
for some. x means there exists an x. We may use x or x S,
for some set S. Just like , the existential quantifier has the effect of
turning a free variable into a bound variable.
Given a set of variables S and a unary predicate P , the proposition
x S P (x)

is defined to be true precisely if the proposition P (x) is true for at least


one element x of S. The proposition is defined to be false if P (x) is false
for every element x of S.
Similarly the proposition
x P (x)
169

is defined to be true precisely if P (x) is true for at least one value of x.


Examples 7.2
Where, for x ∈ N, P(x) means "x is a perfect square", since P(0) is true,
the statement ∃x ∈ N P(x) is true.
Where P(x) means "x loves the underdog", ∃x, P(x) means "somebody
loves the underdog".
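The truth condition for ∃ corresponds in the same way to Python's built-in `any`: true as soon as one witness is found. Again a sketch over a finite stand-in for N:

```python
import math

def P(x):
    """P(x): x is a perfect square."""
    r = math.isqrt(x)
    return r * r == x

S = range(10)
exists_holds = any(P(x) for x in S)  # "there is an x in S with P(x)"
print(exists_holds)                  # True: x = 0 is already a witness
```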


7.3

Multiple quantifiers

It is perfectly possible to apply a (universal or existential) quantifier to
a predicate involving several variables, or even to apply more than one
quantifier. Each quantifier binds one variable, and of course two different
quantifiers cannot simultaneously bind the same variable.
e.g. We can write P(x) above as ∃y ∈ Z x = y², and so the (false)
statement ∀x ∈ N P(x) can be written as ∀x ∈ N ∃y ∈ Z x = y².


7.4

Relating the existential and universal quantifiers

Notice that, for any unary predicate P, and for any set S, the following
two identities hold.
¬(∀x ∈ S P(x)) ≡ ∃x ∈ S (¬P(x))
¬(∃x ∈ S P(x)) ≡ ∀x ∈ S (¬P(x))
Examples 7.3
Where P(x) is, as above, "x is a perfect square", ∃x ∈ N (¬P(x)) translates
as "there is a natural number which is not a perfect square".
We know this to be true, and we used this fact to prove the statement
∀x ∈ N, P(x) to be false.
∃x ∈ N P(x) is true.
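Over a finite set both identities can be verified exhaustively, since a unary predicate on S is just an assignment of truth values to its elements. A sketch checking every one of the 2³ predicates on a 3-element set:

```python
from itertools import product

S = [0, 1, 2]

# every unary predicate on S, represented by its table of truth values
for truth in product([False, True], repeat=len(S)):
    P = dict(zip(S, truth))
    # not(forall x in S. P(x))  ==  exists x in S. not P(x)
    assert (not all(P[x] for x in S)) == any(not P[x] for x in S)
    # not(exists x in S. P(x))  ==  forall x in S. not P(x)
    assert (not any(P[x] for x in S)) == all(not P[x] for x in S)

print("both identities hold for all 8 predicates on S")
```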


7.5

Translating English sentences using quantifiers

"The square of any integer is non-negative" translates as
∀x ∈ Z x² ≥ 0
"The sum of an odd integer and an even integer is always an odd integer"
translates as
∀x ∈ OddInt ∀y ∈ EvenInt x + y ∈ OddInt

"Between any two odd integers there is an even integer" translates as
∀x ∈ OddInt ∀y ∈ OddInt (x < y → (∃z ∈ EvenInt x < z < y))
or as
∀i ∈ Z ∀j ∈ Z (2i+1 < 2j+1 → (∃k ∈ Z 2i+1 < 2k < 2j+1))
The principle of mathematical induction uses the fact that the following
is a tautology:
(P(1) ∧ (∀n ∈ N (P(n) → P(n + 1)))) → (∀n ∈ N P(n))
7.6

A systematic approach to translation

Let's try to be a bit more systematic, and a bit more ambitious. I'd like
to be able to express some famous results about natural numbers using
predicate calculus:
The product of two odd integers is odd.
The square root of 2 is irrational. (Hippasus, 600 BC)
The only integer solutions to xⁿ + yⁿ = zⁿ with n > 1 occur when
n = 2 (Fermat's last theorem).
Every natural number can be written as the sum of four squares (due
to Lagrange, 1770).
An odd prime can be written as the sum of two squares if and only if
it is congruent to 1 mod 4. (Fermat)
7.7

First we find expressions for the components

Suppose that n ∈ N.
n is odd if ∃k ∈ Z (n = 2k + 1)
n is even if ∃k ∈ Z (n = 2k)
n is composite if ∃r, s ∈ Z⁺ \ {1} (n = rs)
n is prime if ¬(∃r, s ∈ Z⁺ \ {1} (n = rs)) or, alternatively, ∀r, s ∈ Z⁺ \ {1} (n ≠ rs)
n is congruent to 1 mod 4 if ∃k ∈ Z (n = 4k + 1)
n is a perfect square if ∃k ∈ Z (n = k²)
n is a sum of two squares if ∃k, l ∈ Z (n = k² + l²)
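These component predicates translate directly into executable tests if we replace the unbounded quantifiers over Z by bounded searches. A sketch (function names are ours, and we adopt the usual convention that primes are greater than 1):

```python
import math

def is_odd(n):        # exists k in Z: n = 2k + 1
    return n % 2 == 1

def is_prime(n):      # for all r, s in Z+ \ {1}: n != r*s  (and n > 1)
    return n > 1 and all(n % r != 0 for r in range(2, math.isqrt(n) + 1))

def is_sum_of_two_squares(n):   # exists k, l in Z: n = k^2 + l^2
    return any(math.isqrt(n - k * k) ** 2 == n - k * k
               for k in range(math.isqrt(n) + 1))

print([n for n in range(30) if is_prime(n)])
print([n for n in range(30) if is_sum_of_two_squares(n)])
```

Trial division up to √n suffices for primality, and for sums of two squares it is enough to search 0 ≤ k ≤ √n and test whether n − k² is itself a perfect square.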



7.8

Quantifying more than one variable

Note that ∀r, s is shorthand for ∀r ∀s and ∃r, s is shorthand for ∃r ∃s,
and that it's legitimate to use this shorthand, since ∀r ∀s means exactly
the same as ∀s ∀r, and ∃r ∃s means exactly the same as ∃s ∃r.
However when we mix two different quantifiers, the order of composition
is important; that is, ∀r ∃s and ∃s ∀r do not mean the same.
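The difference in meaning is easy to exhibit on a finite set. With P(r, s) meaning "r = s", ∀r ∃s P(r, s) holds (take s = r) but ∃s ∀r P(r, s) fails (no single s equals every r). A sketch:

```python
S = [0, 1, 2]
P = lambda r, s: r == s

forall_exists = all(any(P(r, s) for s in S) for r in S)  # forall r exists s
exists_forall = any(all(P(r, s) for r in S) for s in S)  # exists s forall r
print(forall_exists, exists_forall)  # True False
```

Nesting `all` and `any` mirrors the nesting of the quantifiers, so swapping them swaps the meaning.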


7.9

Translating whole statements

In general, translation between logic and English (in either direction) is
easiest done in stages. We parse the statement into components, and
then translate each component. We can translate "The product of two
odd integers is odd" first as
"For any m, n, if m and n are odd integers, then mn is an odd integer",
and then use the translation we know for "n is an odd integer".
In this way we arrive at
∀m, n ∈ Z ((∃i, j ∈ Z (m = 2i+1 ∧ n = 2j+1)) → (∃k ∈ Z (mn = 2k+1)))
Similarly, "The product of two even numbers is even" translates as
∀m, n ∈ Z ((∃i, j ∈ Z (m = 2i ∧ n = 2j)) → (∃k ∈ Z (mn = 2k)))
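Translations like these can be spot-checked over a finite range, reading the implication "hypothesis → conclusion" as "the conclusion holds whenever the hypothesis does". A sketch (the range is an arbitrary finite choice of ours):

```python
R = range(-20, 21)

# forall m, n: (m odd and n odd) -> mn odd, checked on R
odd_product = all((m * n) % 2 != 0
                  for m in R for n in R
                  if m % 2 != 0 and n % 2 != 0)

# forall m, n: (m even and n even) -> mn even, checked on R
even_product = all((m * n) % 2 == 0
                   for m in R for n in R
                   if m % 2 == 0 and n % 2 == 0)

print(odd_product, even_product)  # True True
```

The `if` clause filters to the cases where the hypothesis holds, which is exactly how a bounded universal implication is evaluated.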
In the other direction, the statement
∀m, n ∈ Z (∃k ∈ Z (mn = 2k) → ((∃i ∈ Z (m = 2i)) ∨ (∃j ∈ Z (n = 2j))))
translates to English first as
"For any m, n, if mn is an even integer then either m is an even integer
or n is an even integer",
and then as
"If a product of two integers is even then one of the two factors must be
even".
"The square root of 2 is irrational" translates as
¬(∃p, q ∈ Z⁺ (2q² = p²))
Fermat's last theorem translates as
∀x, y, z, n ∈ N, (xⁿ + yⁿ = zⁿ → n ≤ 2)
Lagrange's theorem about sums of squares translates as
∀n ∈ N (∃i, j, k, l ∈ N (n = i² + j² + k² + l²))
Fermat's theorem about sums of two squares is a little harder to express.
"p is an odd prime" translates as
(p ∈ Z ∧ p ≠ 2 ∧ ∀r, s ∈ Z⁺ \ {1} (p ≠ rs))
while "p is a sum of two squares" translates as
∃k, l ∈ Z (p = k² + l²)
We can translate "For any p, if p is a prime congruent to 1 mod 4, then
p is the sum of two squares" as
∀p ∈ Z⁺
((∃k (p = 4k + 1) ∧ (∀r, s ∈ Z⁺ \ {1} (p ≠ rs)))
→ (∃l, m ∈ Z (p = l² + m²))).
(We can leave out p ≠ 2, since 2 is not of the form 4k + 1.)
We can translate "For any positive integer p, if p is an odd prime and the
sum of two squares, then p is congruent to 1 mod 4" as
∀p ∈ Z⁺
(((p ≠ 2) ∧ (∀r, s ∈ Z⁺ \ {1} (p ≠ rs)) ∧ (∃l, m ∈ Z (p = l² + m²)))
→ (∃k ∈ Z (p = 4k + 1)))
The theorem is the "and" of those two statements.
Unfortunately it is not logically equivalent to the statement
"For any p, p is a prime congruent to 1 mod 4 if and only if p is the sum
of two squares",
nor is it logically equivalent to the statement
"For any p, p is an odd prime and the sum of two squares if and only if p
is congruent to 1 mod 4".
Neither of those last two statements is true.
For 20 = 2² + 4² is the sum of two squares, but is not prime, and is not
congruent to 1 mod 4.
And 9 = 4·2 + 1 is congruent to 1 mod 4, but it is not prime (it is the
sum of two squares, 3² + 0²). 21 = 4·5 + 1 is also congruent to 1 mod 4.
It's not prime, and it is not the sum of two squares.
The point is that we need (on both sides of the implication) to state that
p is prime.
Maybe the simplest way to express the theorem is as follows:
"For any positive integer p, p is an odd prime and the sum of two squares
if and only if p is a prime and is congruent to 1 mod 4."
That is,
∀p ∈ Z⁺
(((p ≠ 2) ∧ (∀r, s ∈ Z⁺ \ {1} (p ≠ rs)) ∧ (∃l, m ∈ Z (p = l² + m²)))
↔ ((∀r, s ∈ Z⁺ \ {1} (p ≠ rs)) ∧ (∃k ∈ Z (p = 4k + 1))))
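A computation cannot prove the theorem, but it can confirm the biconditional instance by instance for small p; a sketch (the helper names are ours):

```python
import math

def is_prime(p):
    return p > 1 and all(p % r != 0 for r in range(2, math.isqrt(p) + 1))

def is_sum_of_two_squares(p):
    return any(math.isqrt(p - k * k) ** 2 == p - k * k
               for k in range(math.isqrt(p) + 1))

# (p odd prime and sum of two squares) <-> (p prime and p = 4k + 1)
for p in range(1, 200):
    lhs = is_prime(p) and p != 2 and is_sum_of_two_squares(p)
    rhs = is_prime(p) and p % 4 == 1
    assert lhs == rhs

print("biconditional confirmed for all p < 200")
```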


8.1

Boolean algebra

Definition of a Boolean algebra

A Boolean algebra is a set S, containing 2 special elements, the zero (0)
and the unity (1), equipped with operations of addition (+), multiplication
(·) (both between pairs of elements) and complementation (′) (of single
elements), and satisfying the following axioms.
Commutative laws:         x · y = y · x              x + y = y + x
Associative laws:         x · (y · z) = (x · y) · z  x + (y + z) = (x + y) + z
Distributive laws:        x · (y + z) = (x · y) + (x · z)
                          x + (y · z) = (x + y) · (x + z)
Behaviour of 1 and 0:     x · 1 = x                  x + 0 = x
Behaviour of complements: x · x′ = 0                 x + x′ = 1
The symbols we've used, i.e. 0, 1, +, ·, ′, aren't a part of the definition.
We could just as easily use other symbols.
We've already met examples of Boolean algebras in this course.
8.2

Examples of Boolean algebras

For any set U, the set of all subsets of U is a Boolean algebra, for
which multiplication is given by ∩, addition by ∪, complementation by
ᶜ, 1 by U, and 0 by ∅.
For any set (say {p, q, r}) of propositional variables, the set of all
propositional formulae in these variables is a Boolean algebra, for which
multiplication is given by ∧, addition by ∨, complementation by ¬, the
zero by FALSE and the unity by TRUE.
The set {0, 1} forms a Boolean algebra, in which 0 is the zero and 1
the unity, when we define
0 + 0 = 0, 1 + 1 = 1, 0 + 1 = 1 + 0 = 1,
0 · 1 = 1 · 0 = 0 · 0 = 0, 1 · 1 = 1, 0′ = 1, 1′ = 0
NB. Notice that 1 + 1 = 1, NOT 1 + 1 = 0. This all makes sense if we
think of 1 as TRUE, 0 as FALSE, + as "or", · as "and", and ′ as "not".
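This two-element Boolean algebra is easy to model directly with Python's bitwise operators; a small sketch (the operation names are ours):

```python
badd = lambda x, y: x | y   # Boolean +  ("or"): note badd(1, 1) == 1
bmul = lambda x, y: x & y   # Boolean .  ("and")
bcomp = lambda x: 1 - x     # complement ("not")

# check the complement axioms: x . x' = 0 and x + x' = 1
for x in (0, 1):
    assert bmul(x, bcomp(x)) == 0
    assert badd(x, bcomp(x)) == 1

print(badd(1, 1), bmul(1, 1), bcomp(0))  # 1 1 1
```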
In this course we are studying Boolean algebras in order to learn some
useful algebraic techniques for manipulating propositional formulae or set-theoretic
expressions. It's often easier and quicker to manipulate propositional
formulae or expressions involving sets algebraically than it is to rely
on truth tables or Venn diagrams.
8.3

Principle of duality for Boolean algebras

Given any arithmetic expression in a Boolean algebra, the dual of that
expression is obtained by replacing any · by a +, any + by a ·, any 0 by
a 1 and any 1 by a 0.
The Principle of duality for Boolean algebras says that the dual of any
rule holding in a Boolean algebra must also hold. The Principle of duality
is basically a consequence of the fact that the Boolean algebra axioms
occur in dual pairs.
We have already observed the principle of duality holding in the Boolean
algebra of the set of subsets of a set. One consequence of the principle
of duality is that from any Boolean algebra S (as above) a second one
can be defined, whose multiplication is +, addition is ·, unity is 0 and
zero is 1.
8.4

Rules deducible from the axioms

The following rules hold in every Boolean algebra. They are straightforward
to deduce from the axioms.
(a) Uniqueness of complements: x + y = 1 and x · y = 0 imply y = x′.
(b) (x′)′ = x.
(c) x + 1 = 1 and x · 0 = 0.
(d) x · x = x and x + x = x (idempotent laws).
(e) (x + y)′ = x′ · y′ and (x · y)′ = x′ + y′ (de Morgan's laws).
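Although the point of these rules is that they follow from the axioms alone, in the two-element Boolean algebra they can also be confirmed by sheer enumeration; a sketch:

```python
B = (0, 1)
add = max                 # + on {0, 1}
mul = min                 # . on {0, 1}
comp = lambda x: 1 - x    # complement

for x in B:
    assert comp(comp(x)) == x                    # (b)
    assert add(x, 1) == 1 and mul(x, 0) == 0     # (c)
    assert mul(x, x) == x and add(x, x) == x     # (d)
    for y in B:
        assert comp(add(x, y)) == mul(comp(x), comp(y))  # (e)
        assert comp(mul(x, y)) == add(comp(x), comp(y))  # (e)

print("rules (b)-(e) hold in the two-element Boolean algebra")
```

Such a check is not a proof for general Boolean algebras, which is why the algebraic deductions in the next section matter.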


8.5

Verifying the extra rules

The extra rules can be deduced from the axioms as follows:
(a) Given x + y = 1, x · y = 0, we have
y = 1 · y = (x + x′) · y = x · y + x′ · y = 0 + x′ · y = x · x′ + x′ · y
  = x′ · (x + y) = x′ · 1 = x′
(b) We observe that x′ + x = x + x′ = 1 and x′ · x = x · x′ = 0, and
then apply (a) to get x = (x′)′.
(c) First we have
x + 1 = (x + 1) · 1 = (x + 1) · (x + x′) = x + (1 · x′) = x + x′ = 1
(the third equality follows from the distributive law).
The rule x · 0 = 0 follows from x + 1 = 1 by duality.

(d) We see first that
(x · x) + x′ = (x + x′) · (x + x′) = 1 · 1 = 1
(applying distributivity to get the first equality) and then that
(x · x) · x′ = x · (x · x′) = x · 0 = 0
So x · x is the complement of x′, and hence is equal to (x′)′ = x. The
second idempotent law follows from the first by duality.
(e) We'll verify the first of de Morgan's two laws, that
(x · y)′ = x′ + y′
This follows from
(x · y) + (x′ + y′) = ((x · y) + x′) + y′ = ((x + x′) · (y + x′)) + y′  (distrib.)
                    = (1 · (y + x′)) + y′ = (y + x′) + y′ = (x′ + y) + y′
                    = x′ + (y + y′) = x′ + 1 = 1, using (c),
together with
(x · y) · (x′ + y′) = x · (y · (x′ + y′))
                    = x · ((y · x′) + (y · y′))  (by distributivity)
                    = x · ((y · x′) + 0) = x · (y · x′) = x · (x′ · y)
                    = (x · x′) · y = 0 · y = 0
The second law follows from the first by duality.

8.6

Boolean expressions

The Boolean algebras we are interested in are in general finite. A finite
Boolean algebra must have 2ⁿ elements, for some n ∈ N, and any two
Boolean algebras of the same size are basically the same (except that their
elements may be named differently).
So from now on we shall focus on one particular Boolean algebra, the
Boolean algebra of all Boolean expressions in n variables. But anything
we prove about this Boolean algebra can be translated into a statement
about any of the other examples we've seen.
A Boolean expression in variables a, b, c, d, . . . is any expression which can
be formed by adding together strings of those symbols together with the
symbols a′, b′, c′, . . ., and applying the axioms of Boolean algebra. Multiplication
is written as juxtaposition, addition as +, 1 is the unity and 0 the
zero element, and x′ the complement of x (for any variable, or indeed any
expression, x). This is equivalent to the Boolean algebra of propositional
formulae in a, b, c, d, on the understanding that multiplication represents
"and", addition "or", 1 "true", 0 "false", and x′ the negation of x, and it is
often helpful to remember that.
8.7

Tidying up Boolean expressions

Using the axioms we can tidy up any element of the Boolean algebra until
it is a sum of products of distinct variables from X and their complements,
where no variable and its complement both appear in the same product.
Examples. Let X = {a, b, c}.

(a + b)(b′ + c)(c′ + a′) = (a + b)(b′c′ + b′a′ + cc′ + ca′)
                         = (a + b)(b′c′ + a′b′ + a′c)
                         = ab′c′ + aa′b′ + aa′c + bb′c′ + ba′b′ + ba′c
                         = ab′c′ + ba′c = ab′c′ + a′bc

(a + b + c)(a′ + b′ + c′) = a(a′ + b′ + c′) + b(a′ + b′ + c′) + c(a′ + b′ + c′)
                          = aa′ + ab′ + ac′ + ba′ + bb′ + bc′ + ca′ + cb′ + cc′
                          = ab′ + ac′ + a′b + bc′ + a′c + b′c
                          = ab′ + a′b + ac′ + a′c + bc′ + b′c

The ordering of the terms, and within the terms, in the final expression
has been chosen for aesthetic reasons only!
8.8

Disjunctive normal form

We can do something else with the second example above.
Since ab′ = ab′ · 1 = ab′(c + c′) = ab′c + ab′c′ etc., we can rewrite the
final expression as
ab′c + ab′c′ + a′bc + a′bc′ + abc′ + ab′c′ + a′bc + a′b′c + abc′ + a′bc′ + ab′c + a′b′c
which then simplifies (by reordering, and then using the idempotent law
to exclude duplicates of any term) to
ab′c + ab′c′ + a′bc + a′bc′ + abc′ + a′b′c
This expression is of course more complicated than the final expression
above. What makes it interesting is that it is a sum of terms each of
which is a product involving every element of X or its complement exactly
once.
The end result of the first example is already a sum of products of this
type.
Products like this are called the atoms (or sometimes the minterms) of
the Boolean algebra. Any element of the Boolean algebra can be written as
a sum of distinct atoms in one way only. This expression for the element is
known as its disjunctive normal form (d.n.f.) (or minterm normal form).
It's sometimes useful to be able to put an element into this form, e.g. it
allows us to compare elements, in particular to test whether two are equal.
More examples.
Let X = {a, b, c, d}. To put ab into disjunctive normal form, we write
ab = ab(c + c′)(d + d′) = abcd + abcd′ + abc′d + abc′d′
To put ab′ + ac′ into disjunctive normal form we write
ab′ + ac′ = ab′(c + c′)(d + d′) + a(b + b′)c′(d + d′)
          = ab′cd + ab′cd′ + ab′c′d + ab′c′d′ + abc′d + abc′d′ + ab′c′d + ab′c′d′
          = ab′cd + ab′cd′ + ab′c′d + ab′c′d′ + abc′d + abc′d′
8.9

Relating disjunctive normal form and truth tables

There is an exact correspondence between disjunctive normal forms and
truth tables. Each minterm appearing in the d.n.f. of a Boolean expression
corresponds to a row of the truth table for the expression at which the
value of the expression is non-zero. The corresponding row is the row at
which a variable x takes the value 1 if x itself appears in the minterm,
but the value 0 if x′ appears in the minterm.
Hence we can deduce the truth table of an expression from its d.n.f., or
the d.n.f. from the truth table.
e.g. Consider the two examples in variables {a, b, c, d} above, for which
we already computed the d.n.f. algebraically.
The truth table for ab has non-zero values on the rows at which (a, b, c, d)
takes the values (1, 1, 1, 1), (1, 1, 1, 0), (1, 1, 0, 1), (1, 1, 0, 0).
The truth table for ab′ + ac′ has non-zero values on the 6 rows where
(a, b, c, d) takes the values
(1, 0, 1, 1), (1, 0, 1, 0), (1, 0, 0, 1), (1, 0, 0, 0), (1, 1, 0, 1), (1, 1, 0, 0).
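This correspondence makes the d.n.f. computable: evaluate the expression on every row of its truth table and emit one minterm per non-zero row. A sketch for the second example above, in variables a, b, c, d (we write ' for the complement in the printed terms):

```python
from itertools import product

variables = "abcd"
expr = lambda a, b, c, d: (a and not b) or (a and not c)   # ab' + ac'

minterms = []
for row in product((1, 0), repeat=4):          # rows of the truth table
    if expr(*row):                             # expression non-zero on this row
        term = "".join(v if bit else v + "'"
                       for v, bit in zip(variables, row))
        minterms.append(term)

print(" + ".join(minterms))   # six minterms, matching the six rows above
```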
