
Discrete Mathematics I (CS127)

Lecture Notes
Alexander Tiskin
University of Warwick
Autumn Term 2004/05

This course introduces some of the fundamental mathematical ideas that


are used in the design and analysis of computer systems and software. The
course makes you familiar with basic concepts and notation, helps you to
develop a good understanding of mathematical proofs, and enables you to
apply mathematics to solving computer science problems.

Problem sheets and seminars


The course is accompanied by a series of problem sheets relating to topics
covered in the lectures. To develop proper understanding it is essential that
you try to solve these problems during your own private study. The seminars
provide an opportunity to get help with difficulties experienced in tackling
the problem sheets, or with understanding the material from lectures. Please
sign up for a weekly seminar at a time which suits you, and do attend it.
Your performance at seminars will not be assessed, so nothing can prevent
you from showing your solutions, whatever your confidence level in them
might be. Confidence tends to grow with practice, and so does your exam
potential.

Lecture notes and books


The lecture notes are self-contained, but you may find it helpful also to
consult some books. The library contains several which cover all or part
of the course syllabus, and exploration of the catalogue and shelves is recommended. The three books listed below are all suitable. They are in the
library and should be available in the University bookshop. They cover the
material in different ways and in different styles. It is suggested that you
look at them all, to find the one you find most accessible.
K. A. Ross and C. R. B. Wright, Discrete Mathematics (5th ed.),
Prentice Hall, 2003.



K. H. Rosen, Discrete Mathematics and Its Applications (5th ed.),
McGraw-Hill, 2003.
J. K. Truss, Discrete Mathematics for Computer Scientists (2nd ed.),
Addison-Wesley, 1999.
Another book well worth considering is
E. Bloch, Proofs and Fundamentals: A First Course in Abstract Mathematics, Birkhäuser, 2002.

It is less suitable as general reference for the course material, but instead
concentrates on what is arguably its most important aspect: the concept of
a proof. It is very clearly written, and in many respects complements the
books on the course's main reading list.

Electronic resources
As the course progresses, the material will be available on the course website:
http://www.dcs.warwick.ac.uk/~tiskin/teach/dm1.html . The Rosen
book has a website of its own: http://www.mhhe.com/rosen .
A forum (discussion group) on Warwick Forums has been set up to exchange messages relevant to the course. In the past, it proved to be a useful tool for communication within the CS127 student population, and also
between students and tutors. The forum is available at http://forums.
warwick.ac.uk . The University IT Services should be able to help in case
of any problems with accessing this forum. As with all discussion groups,
its abuse will not be tolerated.

Assessment
One of the main challenges of the course is the lack of continuous coursework
assessment. This means that you have to work hard, without being forced
to. The course is assessed by a two-hour examination in week 1 of Summer
Term. Results of this and other exams will be announced at the end of the
academic year.
A new element of the course introduced last year is the class test, which
will be held in week 7 of Autumn Term. The test will consist of a one-hour
paper with 20 true or false questions, to be answered on specially prepared
sheets, which then will be scanned and marked automatically. The resulting
mark will not contribute to your official course assessment, and the class
test itself is not mandatory. However, it is strongly recommended to take
the test, in order to get feedback on your progress and to prepare yourself
for the Summer Term exam.


A Brief Tour of the Discrete Mathematics Zoo

Mathematics studies concepts that are abstract, idealised images of the real
world. An example of such a concept is natural numbers:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, . . .
We all learn it in early childhood, yet nobody has ever seen "three", as
opposed to three oranges or the figure 3 in black ink in the top-right corner
of this page.
A philosopher would say here: well, our concept of three captures the
"threeness" of all three-element sets that we have seen before or may see in
the future: three apples, three penguins, or two sheep with a sheepdog in the
field. Number 0 can be accommodated by this view as well: it represents
an empty set, a set that contains nothing.
While the philosopher's answer makes a lot of sense, it is also true that
in mathematics, concepts depart from immediate reality, and start to live a
life of their own. Consider, for example, the notion of a set, which our friend
the philosopher has used to define natural numbers. We can have a set of
apples or penguins, so why not think about sets of numbers? Say, the set of
this week's National Lottery winning numbers: {14, 20, 25, 32, 47, 49}. (Note
the use of curly brackets to denote a set.) We could then think of some
more interesting (in my opinion) examples, such as the set N of all natural
numbers:
N = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, . . . },
the set of all integers (natural numbers and their negatives):
Z = {. . . , −6, −5, −4, −3, −2, −1, 0, 1, 2, 3, 4, 5, 6, . . . },
or the set of all even integers:
{. . . , −10, −8, −6, −4, −2, 0, 2, 4, 6, 8, 10, . . . }.
For a mathematician, the last three sets are just as legitimate as a set
of three apples. However, there is a crucial difference: the new sets are
infinite. Infinite sets do not occur in reality: even the number of atoms in
the Universe is finite. Yet, we have just imagined a few infinite sets. Even
if we cannot write down the elements of these sets without resorting to the
". . . " notation, we can capture these sets in our mind, and treat them as
we would treat any real-world set.
Of course, to make our theory of sets useful, we will have to answer some
important questions:
do infinite sets have a size? (yes they do, but of course these sizes
are beyond natural numbers);



can two infinite sets have different sizes? (yes, their sizes can vary so
greatly that it is hard to imagine, even for a mathematician);
can one form a set of all sets? (no, this is asking too much: one
cannot even form a set of all possible set sizes).

This sort of question cannot be answered from any empirical evidence:


infinite sets simply do not exist in reality. At this point, we are confronted
with a major distinction between mathematics and natural sciences: instead
of experiments, mathematics relies on proofs. The answers to the above
questions given in brackets can be formally and unambiguously proved to
be correct. Experimentation also plays a role in mathematics, but rather
a supporting one: it helps our intuition to understand the concepts and
find the right idea for a proof. For example, to answer the first question
above, we could think of various infinite sets that can be imagined, and ask
ourselves if they are likely to have a sensible notion of size. Then we would
formulate this notion precisely, and prove that it satisfies all the properties
that we associate with size: for example, by adding new elements to a set
we cannot decrease its size. The same approach can be applied to the other
two questions. For the final question, this approach has an additional twist:
we want to prove that a certain object (a set of all sets) does not exist. In
order to tackle this, we imagine that it does exist, and try to consider all
consequences of its existence. Somewhere in our reasoning we come to a
contradiction (it turns out that the set of all sets cannot be assigned any
sensible size). The contradiction proves that the object we imagined (the
set of all sets) cannot exist without violating the basic laws of logic.
To be able to write down our proofs, we need a language that is both
precise (does not allow any ambiguity) and concise (allows us to express complicated ideas relatively briefly). We should indicate exactly the concepts
that we consider basic, i.e. that require no definition. For example, a set and
a natural number are basic concepts. All concepts that are not basic must
be given a formal definition. For example, we will have to define "finite set"
and "even number". We should also indicate exactly the statements that we
consider to be axioms, i.e. that require no proof. For example, "two sets are
the same if they consist of the same elements" is an axiom. All statements
that we hold to be true, but that are not axioms, are called theorems; they
must be given formal proofs. For example, we will have to prove the answers
that we gave to the above list of questions on infinite sets.
This approach to mathematics is called the axiomatic method. It requires
a special language and a set of proof rules known as logic. Logic is a part of
mathematics both as a tool and an object of study; we will see some details
of it in the beginning of our Discrete Mathematics course. Concepts and
laws of logic allow us to formalise ways of reasoning that we learn together
with our mother tongue:


All eagles can fly;


Some pigs cannot fly.
Therefore, some pigs are not eagles.
The conclusion seems obvious, but in mathematics we must know a precise
reason why it follows from the two given conditions. Firstly, we must define
exactly the class of all things that these statements speak about: suppose
this is the class of all living creatures. The first condition can be reformulated as follows: "If a creature is an eagle, it can fly". The laws of logic
tell us that this is equivalent to saying: "If a creature cannot fly, it is not
an eagle". The second condition says that there is a creature, which is a pig
and cannot fly. Taken together with the previous statement, this leads to
the conclusion being proved: there is a creature, which is a pig and cannot
fly, and therefore is not an eagle. Note that we can only prove what logically follows from the given conditions. For example, we do not have enough
information to conclude that all pigs are not eagles.
Armed with logic, we will take a closer look at sets, and will introduce
two concepts that are central to all mathematics: relation between elements
of two sets, and function from one set to another. We will study different
types of relations and functions, and eventually will consider graphs, an
especially powerful concept in dealing with complicated sets, in particular
the ones occurring in computer science.
In summary, the basic ingredients of our course are sets and natural
numbers, glued together by logic. We will use these ingredients to build
more complicated structures, and will apply the axiomatic method to study
their properties. A lot of emphasis will be put on being able to prove facts,
rather than just memorise them. This ability, of course, comes only with
practice; hence the weekly problem sheets and seminars to discuss your
solutions. Please do attempt to solve the problems, and be active at the
seminars: our subject is discrete, rather than discreet, mathematics.


2 Logic

2.1 Statements and operators

We use all sorts of sentences in everyday speech. Our language has special
ways in which we can communicate information, ask a question, give a command, express our thoughts, feelings or emotions. In mathematics, however,
we restrict ourselves to only one type of sentence: statements, which must
be either true or false. Here are some examples of statements:
Five is less than ten.
Pigs can fly.
There is life on Mars.
Note that we know the last statement must be true or false, despite the fact
that we cannot decide between true and false from our present knowledge.
Here are some examples of sentences that are not statements:
Welcome to Tweedy's farm!
What's in the pies?
It's not as bad as it seems. . .
The last sentence will become a statement if we substitute the name of a
particular object for the pronoun "it". Of course, we must also give a clear,
unambiguous definition of "bad", "seems", etc.
Thus, every statement has a value taken from the set B = {F, T }. The
two elements of this set are called Boolean values. There are special operations, called Boolean operators, that one can perform on Boolean values
(rather like addition and multiplication on natural numbers):

negation (logical NOT), denoted ¬;
conjunction (logical AND), denoted ∧;
disjunction (logical OR), denoted ∨;
implication (IF . . . THEN . . . ), denoted ⇒;
equivalence (. . . IF AND ONLY IF . . . ), denoted ⇔.

The negation (NOT) operator simply reverts the value of a statement
to the opposite value. We can define the action of the operator ¬ applied to a
statement A by the following truth table:

A  ¬A
F   T
T   F

The conjunction (AND) operator applies to two separate statements.
The conjunction of A and B is true when both A and B are true; the conjunction is false when either A or B (or both) is false. Thus, the operator ∧
can be defined by the following truth table:

A  B  A ∧ B
F  F    F
F  T    F
T  F    F
T  T    T

The disjunction (OR) operator also applies to two separate statements,
and is complementary to conjunction. The disjunction of A and B is true
when either A or B (or both) are true; the disjunction is false when both A
and B are false. Here is the truth table for ∨:

A  B  A ∨ B
F  F    F
F  T    T
T  F    T
T  T    T

The two statements connected by conjunction or disjunction do not need
to be related in any way. Thus,

(5 < 10) ∧ (Pigs can fly)   means   T ∧ F,   i.e. F

(5 < 10) ∨ (Pigs can fly)   means   T ∨ F,   i.e. T

The same applies to statements connected by implication (IF . . . THEN
. . . ). In ordinary life, we usually think of implication as a cause-and-effect relationship: if the bird is happy, then it sings loudly. This relationship is
one-way: if the bird sings, it does not necessarily mean that it is happy;
perhaps there are other reasons for a bird to sing. And, if the bird is
not happy, we cannot conclude whether it should sing or not, so we must
accept both possibilities. The same reasoning applies in logic, but here the
statements connected by implication do not have to be related. For any two
statements A, B, the value of the implication is determined by the truth
table:

A  B  A ⇒ B
F  F    T
F  T    T
T  F    F
T  T    T
Thus, a false statement implies anything, no matter true or false, but a true
statement can only imply another true statement.
The equivalence operator (. . . IF AND ONLY IF . . . ) can be thought of
as the two-way version of implication: A is equivalent to B when A implies
B, and B implies A. In other words, the values of A and B must agree:
either both true, or both false. Here is the truth table:

A  B  A ⇔ B
F  F    T
F  T    F
T  F    F
T  T    T
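The five operators can be checked directly on a computer. The following is a minimal sketch, not part of the original notes: each operator is written as a Python function on the set B = {False, True}, and the truth table for implication is reproduced by trying all four combinations.

```python
# A sketch (not from the notes): the five Boolean operators as Python
# functions, with Python's False/True standing in for F/T.
from itertools import product

def NOT(a): return not a
def AND(a, b): return a and b
def OR(a, b): return a or b
def IMPLIES(a, b): return (not a) or b   # false only when a is T and b is F
def IFF(a, b): return a == b             # true exactly when the values agree

# Reproduce the truth table for implication, row by row:
for a, b in product([False, True], repeat=2):
    print(a, b, IMPLIES(a, b))
```

Note that `IMPLIES` is defined through `not` and `or`; this anticipates the law for ⇒ given later in the notes.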

Implication and equivalence play a special role in mathematics. Many
mathematical theorems have the form
if A then B
or
A implies B
sometimes disguised as
A is sufficient for B
or
B is necessary for A
The meaning of all these sentences is the same: A ⇒ B. A standard way of
proving such theorems is by a chain of implications:
A ⇒ P1 ⇒ P2 ⇒ . . . ⇒ Pn ⇒ B
where P1, P2, . . . , Pn are some statements, chosen so that every implication
in the chain can be proved in one step.
Another common form of theorems is
A if and only if B
often disguised as
A and B are equivalent
or
A is necessary and sufficient for B
or
B is necessary and sufficient for A
The meaning of all these is A ⇔ B. A standard way of proving such theorems
is by a chain of equivalences:
A ⇔ P1 ⇔ P2 ⇔ . . . ⇔ Pn ⇔ B
where P1, P2, . . . , Pn are some statements, chosen so that every equivalence
in the chain can be proved in one step.


2.2 Laws of logic

The truth tables completely define Boolean operators, so, in principle, the
truth value of any compound statement, however complicated, can be found
by a series of truth table lookups. In practice, we often want an easier
and more intuitive method of dealing with compound statements. One such
method consists in applying certain properties of Boolean operators, known
as the laws of logic. From the formal point of view, these laws do not add
anything new to the operator definitions: each of the laws follows directly
from the truth tables. However, the laws offer an alternative, complementary
approach to logic, and are widely applicable. Many of these laws are similar
to the properties of the arithmetic operators + and ·.
In the following formulas, letters A, B, C stand for arbitrary statements.
The statements of the laws are always true, irrespective of the truth values
of A, B, C.
The first group of laws involve only one operator and at most two elementary statements each.

¬¬A ⇔ A                              double negation law
A ∧ A ⇔ A        A ∨ A ⇔ A           idempotence of ∧, ∨
A ∧ B ⇔ B ∧ A    A ∨ B ⇔ B ∨ A       commutativity of ∧, ∨

The double negation law is similar to −(−a) = a, and the commutativity
laws correspond to a · b = b · a and a + b = b + a. The idempotence laws
have no direct counterparts in arithmetic.
The second group of laws involve more than one operator, and/or more
than two elementary statements each.

(A ∧ B) ∧ C ⇔ A ∧ (B ∧ C)            associativity of ∧
A ∧ (B ∨ C) ⇔ (A ∧ B) ∨ (A ∧ C)      distributivity of ∧ over ∨
(A ∨ B) ∨ C ⇔ A ∨ (B ∨ C)            associativity of ∨
A ∨ (B ∧ C) ⇔ (A ∨ B) ∧ (A ∨ C)      distributivity of ∨ over ∧

The associativity laws correspond to the arithmetic laws (a · b) · c = a · (b · c)
and (a + b) + c = a + (b + c). These laws allow us to write A ∧ B ∧ C and
A ∨ B ∨ C without any brackets, just as we write a · b · c and a + b + c.
The first distributivity law corresponds to a · (b + c) = a · b + a · c. The
second distributivity law has no counterpart in arithmetic, since, in general,
a + b · c ≠ (a + b) · (a + c).
Note that in all laws so far, we can replace all ∧ symbols by ∨, and,
simultaneously, all ∨ symbols by ∧. The resulting statement will still be
true for any A, B, C. This is a general rule that applies to all laws we
introduce in this section.


The following pair of laws, called De Morgan's laws, describes the close
relationship between the operators ∧, ∨.

¬(A ∧ B) ⇔ ¬A ∨ ¬B        ¬(A ∨ B) ⇔ ¬A ∧ ¬B

These two laws allow us to express ∧ via ¬ and ∨:

A ∧ B ⇔ ¬(¬A ∨ ¬B)

and ∨ via ¬ and ∧:

A ∨ B ⇔ ¬(¬A ∧ ¬B)

This means that any one of the two operators ∧, ∨ is redundant: we can
rewrite any statement without using either one or the other. Of course, it
is usually more convenient to use both.
Another group of laws deals with the case of known truth values appearing
explicitly in compound statements:

A ∧ T ⇔ A          A ∨ F ⇔ A          identity laws
A ∧ F ⇔ F          A ∨ T ⇔ T          annihilation laws
A ∧ ¬A ⇔ F         A ∨ ¬A ⇔ T         excluded middle
A ∧ (A ∨ B) ⇔ A    A ∨ (A ∧ B) ⇔ A    absorption laws

Identity laws correspond to a · 1 = a, a + 0 = a. An arithmetic annihilation
law does not hold for addition, but holds for multiplication: a · 0 = 0.
An arithmetic analogue of the law of excluded middle does not hold for
multiplication, but holds for addition: a + (−a) = 0.
Finally, the following two laws completely describe the two remaining
Boolean operators, ⇒ and ⇔:

(A ⇒ B) ⇔ (¬A ∨ B)

(A ⇔ B) ⇔ (A ⇒ B) ∧ (B ⇒ A) ⇔ (A ∧ B) ∨ (¬A ∧ ¬B)

Again, both ⇒ and ⇔ are formally redundant, but, as we mentioned before,
very useful in practice.
All the above laws are in fact theorems, and proving them is a good
exercise in applying truth tables. Here is a table that proves one of De
Morgan's laws, ¬(A ∧ B) ⇔ (¬A ∨ ¬B):
A  B  A ∧ B  ¬(A ∧ B)  ¬A  ¬B  ¬A ∨ ¬B
T  T    T       F       F   F     F
T  F    F       T       F   T     T
F  T    F       T       T   F     T
F  F    F       T       T   T     T
                ?                 ?


The columns for the two sides of the law (marked ?) are identical, hence
their truth values agree for any A, B.
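The same exhaustive check can be automated. The sketch below, not part of the original notes, compares the two sides of each of De Morgan's laws on every assignment of truth values to A and B; the helper name `is_law` is my own.

```python
# Sketch: verifying a law of Boolean logic by exhaustive truth-table
# lookup, as the text suggests. Two compound statements are equivalent
# as a law iff they agree on every assignment of A and B.
from itertools import product

def is_law(lhs, rhs):
    """True iff lhs and rhs agree on all four truth-value combinations."""
    return all(lhs(a, b) == rhs(a, b) for a, b in product([False, True], repeat=2))

# De Morgan: not (A and B)  <=>  (not A) or (not B)
print(is_law(lambda a, b: not (a and b), lambda a, b: (not a) or (not b)))
# De Morgan: not (A or B)   <=>  (not A) and (not B)
print(is_law(lambda a, b: not (a or b), lambda a, b: (not a) and (not b)))
```

Both checks print `True`; replacing either right-hand side by a non-equivalent statement would make `is_law` return `False`.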
We can use our laws of logic to prove new theorems. Here is an example.
Theorem 1 (Principle of proof by contradiction). For any statements
A, B, we have (A ⇒ B) ⇔ (¬B ⇒ ¬A).
Proof. We apply the law for ⇒, then the law of double negation, commutativity of ∨, and finally the law for ⇒ once again, this time in the opposite
direction:
(¬B ⇒ ¬A) ⇔ (¬¬B ∨ ¬A) ⇔ (B ∨ ¬A) ⇔ (¬A ∨ B) ⇔ (A ⇒ B)

The above theorem gives us a useful generic proof method. When we are
given a statement A, and we are asked to prove a statement B, we may start
by assuming that B is false (i.e. ¬B holds), and then show that a statement
contradicting A (i.e. ¬A) follows from our assumption. The principle of proof
by contradiction tells us that in this case, B must be a logical consequence
of A.
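Since the principle is itself a law of Boolean logic, it too can be checked by trying all truth-value combinations. A minimal sketch, not from the notes:

```python
# Sketch: the principle of proof by contradiction,
# (A => B) <=> (not B => not A), checked on all four combinations.
from itertools import product

def implies(a, b):
    return (not a) or b   # the law for => from the previous subsection

assert all(
    implies(a, b) == implies(not b, not a)
    for a, b in product([False, True], repeat=2)
)
print("contrapositive law holds")
```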

2.3 Predicates and quantified statements

Statements we have been making so far declared facts about specific objects:
Five is less than ten.
The pie is not as bad as it looks.
Often we need more than that: we want to declare a fact about a specific
set of objects. For example, we could say:
Some natural numbers are less than ten.
All pies are not as bad as they look.
In the first case, we could try to come up with a specific example that
proves it: say, five is less than ten. In the second case, we could restrict our
attention to a finite number of possible pies; let this set be {Chicken pie,
Mushroom pie, Cabbage pie}. Then the statement "All pies are not as bad
as they look" is a conjunction:
(Chicken pie is not as bad as it looks)
∧ (Mushroom pie is not as bad as it looks)
∧ (Cabbage pie is not as bad as it looks)

There are problems with both these approaches. In the first case, it was
easy to find a specific instance (five) that proved our statement; for other


statements, it could be much harder. We would like to have a way of saying


"some numbers are less than ten" without having to show a specific example.
In the second case, the chosen set of pies was too small; in reality, there are
millions of individual pies, so our statement has to be a conjunction of a
huge number of individual statements. This would be hard to deal with if
we were to use it in proofs. Furthermore, this approach would completely
fail if the statement were about all possible pies, and then it turned out that
this set is infinite. We would like to have a way of making a statement about
all elements or some elements of any set, including infinite ones.
We achieve the stated goal by using the notion of a predicate. A predicate
is simply a sentence containing variables ranging over a particular set. The
set of values for a variable is called the range of that variable. We will
always assume that the range is nonempty. The sentence must become true
or false when an element of the range is substituted for every variable. Here
are some examples:
Number x is less than ten.
Pie p is not as bad as it looks.
Here x is a variable that stands for a member of the set N (i.e. ranges over N),
and p is a variable that stands for a member of the set of all pies (i.e. ranges
over that set). Of course, in the latter case we must specify precisely what
we understand by "all pies".
A predicate may contain more than one variable. For example, these
are valid predicates:
Number x is less than number y.
Pie p is better than pie q.
Ordinary statements can be regarded as a special case of predicates, containing zero variables, for example:
Number 5 is less than number 10.
This chicken pie is better than that apple pie.
In the latter case, we assume that we are talking about two specific, well-defined pies.
A predicate can be made into a statement by substituting a specific
element of the range for every variable. A different way of making a statement
from a predicate is by using quantifiers. Let us denote by P (x) a predicate
with the variable x. There are two quantifiers:
existential (FOR SOME x, P (x)): ∃x : P (x);
universal (FOR ALL x, P (x)): ∀x : P (x).
Here, the range of x (i.e. the set from which x is taken) is implicit. Often, we
want to specify the range of a variable. The above examples can be written


as:

∃x ∈ N : x < 10        ∀p ∈ Pies : p is not as bad as it looks

The sign ∈ stands for "belongs to", and denotes the membership of an element
in a set. The general form is

∃x ∈ S : P (x)        ∀x ∈ S : P (x)

With predicates having more than one variable, we can write more complicated quantified statements:

∀x ∈ N : ∃y ∈ N : x < y        ∃y ∈ N : ∀x ∈ N : x < y

Note that the meaning, and even the truth value, of the above two statements is different: the first one is true (for every natural number, there is
a greater number), the second is false (it claims there is a natural number
greater than all natural numbers). In general, the meaning of a quantified statement depends on the order of the quantifiers.
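This effect is easy to demonstrate on a small finite range. The following sketch, not from the notes, uses the range S = {0, 1} and the predicate P (x, y) given by x ≠ y; the two quantifier orders give different truth values.

```python
# Sketch (example not in the notes): quantifier order matters.
# Over S = {0, 1} with P(x, y) = (x != y):
#   forall x exists y : x != y   -- true (each element differs from the other)
#   exists y forall x : x != y   -- false (no y differs from every x, incl. itself)
S = [0, 1]
forall_exists = all(any(x != y for y in S) for x in S)
exists_forall = any(all(x != y for x in S) for y in S)
print(forall_exists, exists_forall)  # True False
```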
The meaning of a quantified statement does not change if we change the
quantifier variable consistently throughout the statement. For example, we
can write:

∃z ∈ N : z < 10        ∀q ∈ Pies : q is not as bad as it looks

The variable in a quantified statement is only defined within the statement;
it is not visible from outside. In programming, such variables are called
local. In mathematics, we call them dummies, or bound variables. In the
examples above, the variables x, z are bound by the quantifier ∃, and the variables
p, q are bound by the quantifier ∀. In contrast, a variable in a predicate not
bound by any quantifier (such as in P (x) or z < 10) is called free. We have
the following laws of changing the bound variable:

∃x : P (x) ⇔ ∃y : P (y)        ∀x : P (x) ⇔ ∀y : P (y)

As we have seen before, a universally quantified statement with a finite
range S = {a1 , . . . , an } can be expressed by a conjunction:

∀x ∈ S : P (x) ⇔ P (a1 ) ∧ . . . ∧ P (an )

Similarly, an existentially quantified statement with a finite range can be
expressed by a disjunction:

∃x ∈ S : P (x) ⇔ P (a1 ) ∨ . . . ∨ P (an )
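Over a finite range, these conjunctions and disjunctions are exactly what Python's built-in `all()` and `any()` compute, so quantified statements over finite sets translate directly. A sketch, not from the notes:

```python
# Sketch: forall over a finite range is a conjunction (all), and
# exists is a disjunction (any).
S = {2, 4, 6, 8}
P = lambda x: x % 2 == 0        # the predicate "x is even"

# forall x in S : P(x)  ==  P(a1) and ... and P(an)
print(all(P(x) for x in S))     # True: every element is even
# exists x in S : x > 7
print(any(x > 7 for x in S))    # True: 8 witnesses the statement
```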


These equivalences do not hold for an infinite S, since their right-hand sides
would not be well-defined. However, the following laws will hold for any
nonempty range, finite or infinite:

∀x : T ⇔ T        ∀x : F ⇔ F

∃x : T ⇔ T        ∃x : F ⇔ F

∀x : P (x) ⇒ ∃x : P (x)

In predicate logic, we also have the following analogue of De Morgan's
laws:

¬(∀x : P (x)) ⇔ ∃x : ¬P (x)        ¬(∃x : P (x)) ⇔ ∀x : ¬P (x)

On a finite range, these laws can be proved by the laws of Boolean logic,
using properties of conjunction for ∀, and those of disjunction for ∃. On an
infinite range, the new laws must be taken as axioms.
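On a finite range, the quantifier version of De Morgan's laws can again be checked mechanically. A sketch, not from the notes, over the range {0, . . . , 9}:

```python
# Sketch: De Morgan's laws for quantifiers on a finite range.
# Negating "all x satisfy P" yields "some x fails P", and vice versa.
S = range(10)
P = lambda x: x < 5

# not (forall x : P(x))  <=>  exists x : not P(x)
assert (not all(P(x) for x in S)) == any(not P(x) for x in S)
# not (exists x : P(x))  <=>  forall x : not P(x)
assert (not any(P(x) for x in S)) == all(not P(x) for x in S)
print("quantifier De Morgan laws hold on this range")
```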
When several predicates are involved in a quantified statement, all the
usual laws of Boolean logic apply to these predicates. However, when we
introduce a quantifier, we must be careful not to "capture" inadvertently
any existing free variables, or any variables bound by other quantifiers. For
example, the statement (∃x : P (x)) ∧ (∃x : Q(x)) is, in general, not equivalent to ∃x : (P (x) ∧ Q(x)). This is because in the former statement, P (x)
and Q(x) may be satisfied by different values of x, whereas in the latter
statement the value of x must be the same for both P and Q. We can
make this argument even more forceful by replacing the first statement by
its logical equivalent: (∃x : P (x)) ∧ (∃y : Q(y)). By a similar reasoning,
there is no equivalence between the statements (∀x : P (x)) ∨ (∀x : Q(x)) and
∀x : (P (x) ∨ Q(x)), since the former is equivalent to (∀x : P (x)) ∨ (∀y : Q(y)).
However, the following equivalences hold:

(∀x : P (x)) ∧ (∀x : Q(x)) ⇔ ∀x : (P (x) ∧ Q(x))

(∃x : P (x)) ∨ (∃x : Q(x)) ⇔ ∃x : (P (x) ∨ Q(x))

As before, they can be proved by laws of Boolean logic for a finite range,
but must be taken as axioms when the range is infinite.
In general, a quantifier ∀x or ∃x is safe to capture a predicate Q, as
long as Q does not contain x as a free variable (in other words, as long as
all occurrences of x in Q are bound by other quantifiers). Therefore, we
have the following laws, where Q is always assumed to be a predicate not
containing x as a free variable:

(∀x : P (x)) ∧ Q ⇔ ∀x : (P (x) ∧ Q)        (∃x : P (x)) ∧ Q ⇔ ∃x : (P (x) ∧ Q)

(∀x : P (x)) ∨ Q ⇔ ∀x : (P (x) ∨ Q)        (∃x : P (x)) ∨ Q ⇔ ∃x : (P (x) ∨ Q)

(∀x : P (x)) ⇒ Q ⇔ ∃x : (P (x) ⇒ Q)        (∃x : P (x)) ⇒ Q ⇔ ∀x : (P (x) ⇒ Q)

Q ⇒ (∀x : P (x)) ⇔ ∀x : (Q ⇒ P (x))        Q ⇒ (∃x : P (x)) ⇔ ∃x : (Q ⇒ P (x))
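One of these laws can be sanity-checked on a finite, nonempty range (the notes assume all ranges nonempty, and the laws do rely on this). A sketch, not from the notes, checks (∀x : P (x)) ∧ Q ⇔ ∀x : (P (x) ∧ Q) for both possible values of the statement Q:

```python
# Sketch: checking (forall x : P(x)) and Q  <=>  forall x : (P(x) and Q)
# on a finite nonempty range, where Q does not mention x.
S = range(1, 6)
P = lambda x: x > 0

for Q in (False, True):
    lhs = all(P(x) for x in S) and Q
    rhs = all(P(x) and Q for x in S)
    assert lhs == rhs
print("law verified for both values of Q")
```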

Just like the laws of Boolean logic, which are useful in simplifying statements
involving Boolean operators, the above laws, along with the other laws introduced in this section, allow us to simplify statements involving quantifiers.
The ultimate purpose of all these laws, and of logic as a whole, is to allow
us to express and prove facts about objects and sets that we build across
all branches of mathematics. In the following sections of the course, we will
make extensive use of this section's language and ideas.


3 Sets

3.1 The naïve set theory

The notion of a set is central to mathematics. However, it was not until the
late 1800s and early 1900s that mathematicians began to study sets in their
own right. Sets and set elements are basic concepts, and, as such, are left
without a formal definition. Georg Cantor (1845–1918), one of the creators
of modern set theory, gave the following description:
By a set we shall understand any collection into a whole M of
definite, distinct objects of our intuition or of our thought. These
objects are called the elements of M .
The above is not a mathematical definition: it just describes our intuitive
idea of sets (collections) and their elements (objects). However, we can
formulate some characteristic properties that we associate with sets:
Any object can be an element of a set. For example, we can form the
following sets:
Planets = {Mercury, Venus, . . . , Pluto}
Neven = {0, 2, 4, 6, 8, 10, . . .}

Junk = {239, banana, ace of spades}


The order of elements in a set does not matter. For example,
Junk = {banana, 239, ace of spades}
Repetition of elements in a set does not matter. For example,
Junk = {banana, banana, ace of spades, 239, 239, 239}
A set can be an element of another set. For example,
SuperJunk = {239, Junk } = {239, {banana, ace of spades, 239}}
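These properties carry over directly to sets in a programming language. A sketch, not from the notes, using Python's built-in sets (with the caveat that Python requires an inner set to be a `frozenset`, an immutable set, before it can be an element of another set):

```python
# Sketch: Python's sets share the properties listed above.
junk1 = {239, "banana", "ace of spades"}
junk2 = {"banana", "banana", "ace of spades", 239, 239, 239}
print(junk1 == junk2)   # True: order and repetition do not matter

# A set inside a set: the inner set must be immutable (frozenset).
inner = frozenset({"banana", "ace of spades", 239})
super_junk = {239, inner}
print(len(super_junk))  # 2: the inner set counts as a single element
```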
There is a special set, which contains no elements. It is called the empty
set, and denoted ∅: ∅ = {}. Any set with exactly one element is called a
singleton. For example, we can form the following singletons:

MorningStars = {Venus}

NonpositiveNaturals = {0}

EmptySets = {∅}


Note that the set EmptySets is not empty: it contains an element, which
happens to be the set ∅. Likewise, the set MorningStars is distinct from the
planet Venus, and the set NonpositiveNaturals is distinct from the number
zero.
The fact that x is an element of set S is written as x ∈ S. Thus,
Jupiter ∈ Planets, orange ∉ Junk . A set A is called a subset of a set B
(A ⊆ B), if all elements of A are also elements of B (but not necessarily the
other way round). For example, Neven is a subset of N (Neven ⊆ N), since
every even natural number is a natural number. We can write the definition
formally as follows:

A ⊆ B ⇔ ∀x : x ∈ A ⇒ x ∈ B

By this definition, the empty set is a subset of any set (since the range
of the quantified statement in the definition is empty), and every set is a
subset of itself.
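The quantified definition of the subset relation translates literally into code. A sketch, not from the notes, with the helper name `is_subset` my own:

```python
# Sketch: A is a subset of B iff every element of A is an element of B,
# i.e. forall x : x in A => x in B (trivially true for x outside A).
def is_subset(A, B):
    return all(x in B for x in A)

evens = {0, 2, 4, 6}
naturals = set(range(10))
print(is_subset(evens, naturals))   # True
print(is_subset(set(), evens))      # True: the empty set is a subset of any set
print(is_subset(evens, evens))      # True: every set is a subset of itself
```

The empty-set case holds because `all()` over an empty collection is vacuously true, mirroring the quantifier argument in the text.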
It is very important to distinguish between the signs ∈ (element inclusion) and ⊆ (subset inclusion). Despite their superficial similarity, their
meaning is very different: the first indicates an individual member of a set,
the second, an arbitrary subset of a set, including the two possible extremes: the empty set and the whole working set. Element inclusion is
a basic concept, and therefore has no formal definition; the definition of
subset inclusion in terms of element inclusion was given in the previous
paragraph.
Our intuitive idea of a set is an arbitrary collection of elements, where
the order and any repetitions of elements are ignored. Can we make this
idea formal by giving to the basic concept of a set the appropriate axioms?
The fact that order and repetitions do not matter is easy to express:
Axiom (The Law of Extensionality). If two sets contain the same elements, they are equal.
In other words, for any sets A, B, we have

(A ⊆ B ∧ B ⊆ A) ⇒ A = B

In particular, any two sets without elements are equal, therefore there is
only one empty set ∅.
When dealing with sets, we often need to select from a given set a subset
that satisfies a certain property. For example, we could start from the set
N, and select from it only those numbers that are even. In general, let S
be our working set; then we can express any property of its elements by a
predicate P(x), where x is a variable ranging over S. The set of all elements
x of S for which P(x) is T is denoted {x ∈ S | P(x)}. For example,
Neven = {x ∈ N | x is even}


The variable x in the above expression is a dummy: the set Neven will not
change if we replace all occurrences of x in its definition by y, or by any
other variable.
For any set S, we have
{x ∈ S | T} = S        {x ∈ S | F} = ∅
Here are some more examples:
{x ∈ N | x > 0} = {1, 2, 3, 4, 5, 6, . . .}
{x ∈ Planets | x is red} = {Mars}
{x ∈ N | x ≥ 0} = N
{x ∈ Planets | x is a banana} = ∅
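The set-abstraction notation {x ∈ S | P(x)} has a direct counterpart in Python's set comprehensions; a small sketch over a finite fragment of N (the fragment is an illustrative choice):

```python
# Set abstraction {x ∈ S | P(x)} corresponds to the Python comprehension
# {x for x in S if P(x)}.  N is infinite, so we use a finite fragment.
N_fragment = set(range(20))

evens = {x for x in N_fragment if x % 2 == 0}
positives = {x for x in N_fragment if x > 0}

print(evens)      # the even numbers below 20
print(positives)  # 1 through 19

# {x ∈ S | T} = S and {x ∈ S | F} = ∅:
assert {x for x in N_fragment if True} == N_fragment
assert {x for x in N_fragment if False} == set()
```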
Using the predicate notation, we can attempt to formalise completely
our intuitive notion of a set. We have described a set as an arbitrary
collection of elements; that is, we can form a set of elements satisfying
any given predicate. We can now make this our second axiom.
Axiom (The Law of Abstraction). For any predicate P(x), there is a
set A = {x | P(x)}, such that an element x is in A if and only if P(x) is
true.
Our two axioms, the law of extensionality and the law of abstraction,
formalise our intuition about sets. We could try to base a whole theory
on these two axioms. Indeed, such attempts were made in the early stages
of set theory development. Unfortunately, it was soon realised that the
extensionality and abstraction laws, taken together, are inconsistent; that
is, a theory based on these laws leads to contradictions. The simplest of
these contradictions is called Russell's paradox, after the great logician and
philosopher Bertrand Russell (1872–1970).
Consider the following predicate: P(x) ≡ x ∉ x (note that it involves
element inclusion, rather than subset inclusion). In words, we could say that
P(x) means "x is not a member of itself". This is definitely true if
x is not a set; it is also true for all sets we have seen so far, and for all sets
we can think of (except perhaps an imaginary "set of all sets"). We may or
may not believe that P(x) is true for all x: whether this is the case or not
is irrelevant, since both possibilities will lead to a contradiction. What is
relevant is that P(x) is a well-formed predicate (i.e. is true or false for any
given x). Therefore, by the law of abstraction, we can form the set B of all
objects x that satisfy the predicate P(x):
B = {x | P(x)} = {x | x ∉ x}

In words, B is the set of all objects that are not their own members.
Now consider the following statement R: B ∈ B. It is a well-formed
statement, so it must be either true or false. Suppose statement R is true,
so B is a member of B, and, like all members of B, must not be a member
of itself. This makes the statement R false, which is impossible, since
we assumed it was true. Now suppose statement R is false, so B is not
a member of B. By definition of set B, everything that is not a member
of itself must be a member of B, so B itself must be a member of B. This
makes statement R true, which is impossible, since we assumed it was
false! Thus, statement R can be neither true nor false, so there must be
something wrong in our reasoning. The only thing that can be wrong is
There is an alternative, somewhat lighter form of Russell's paradox.
Imagine a village that has a single (male) barber with the following code of
practice: the barber will shave every man in the village, but only if this man
does not shave himself. Must the barber shave himself? The question has no
answer, since both choices of the answer lead to a contradiction. Therefore,
the barber's rule is inconsistent.
Because of Russell's paradox, the theory based on the laws of
extensionality and abstraction is often called naïve set theory. It captures our
intuitive notion of a set but, being inconsistent, cannot serve as a formal
foundation of mathematics. A lot of time and effort has been spent in
order to provide a sounder axiomatic system for sets. Several such
systems now exist; they are all significantly more complicated than naïve set
theory. We shall not go into their details in this course. For the rest of
the course, we will implicitly use the laws of extensionality and abstraction,
and in particular the convenient notation for set abstraction {x | P(x)}. At
the level of our course, no paradoxes similar to Russell's will arise. Indeed,
unless mathematicians create them artificially, they seldom arise at all.

3.2 Operations on sets

We have already studied the concept of set abstraction, which allows us
(ideally) to form a set {x | P(x)} from any predicate P(x). We will now use
this method to define operations that create new sets from existing ones.
Despite the problems with abstraction arising from Russell's paradox, these
new set operations will be completely non-controversial.
Let A, B be any sets. The intersection of A, B, denoted A ∩ B, is a set
that contains all elements which are members of both A and B:
A ∩ B = {x | (x ∈ A) ∧ (x ∈ B)}
The union of A, B, denoted A ∪ B, is a set that contains all elements which
are members of either A, or B (or both):
A ∪ B = {x | (x ∈ A) ∨ (x ∈ B)}

The difference of A, B, denoted A \ B, is a set that contains all elements
which are members of A, except those which are members of B:
A \ B = {x | (x ∈ A) ∧ (x ∉ B)}
As we see from the definitions, set operations are closely related to
Boolean operators. In particular, they have properties very similar to the
laws of Boolean logic, where ∩ is analogous to conjunction, and ∪ to
disjunction.
A ∩ A = A              A ∪ A = A              idempotence of ∩, ∪
A ∩ B = B ∩ A          A ∪ B = B ∪ A          commutativity of ∩, ∪

Also,

(A ∩ B) ∩ C = A ∩ (B ∩ C)                     associativity of ∩
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)               distributivity of ∩ over ∪
(A ∪ B) ∪ C = A ∪ (B ∪ C)                     associativity of ∪
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)               distributivity of ∪ over ∩
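These laws can be spot-checked on small concrete sets; a minimal Python sketch using the built-in operators & (intersection) and | (union):

```python
# Spot-checking the set laws on small concrete sets.
A = {1, 2, 3}
B = {2, 3, 4}
C = {3, 4, 5}

assert A & A == A and A | A == A                # idempotence
assert A & B == B & A and A | B == B | A        # commutativity
assert (A & B) & C == A & (B & C)               # associativity of intersection
assert A & (B | C) == (A & B) | (A & C)         # intersection over union
assert (A | B) | C == A | (B | C)               # associativity of union
assert A | (B & C) == (A | B) & (A | C)         # union over intersection
print("all laws hold on this example")
```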

Set difference does not directly correspond to negation, since it involves
two sets rather than one. In order to obtain an analogue of negation, let us
fix a particular set S (the universal set). We now restrict ourselves to sets
that are subsets of S. For any set A ⊆ S, the complement of A (with respect
to S) is the difference A̅ = S \ A. The laws of complement are analogous to
the laws of Boolean negation. We have the law of double complement:
A̿ = A
and De Morgan's laws:
(A ∩ B)̅ = A̅ ∪ B̅        (A ∪ B)̅ = A̅ ∩ B̅
Here, A, B are arbitrary subsets of S.


The universal set S corresponds to the statement T, and the empty set ∅
to the statement F. Note that ∅ ⊆ S. We have:

A ∩ S = A              A ∪ ∅ = A              identity laws
A ∩ ∅ = ∅              A ∪ S = S              annihilation laws
A ∩ A̅ = ∅              A ∪ A̅ = S              excluded middle
A ∪ (A ∩ B) = A = A ∩ (A ∪ B)                 absorption laws

All the above laws are theorems, and are easy to prove by the laws of Boolean
logic. Here is an example:

Theorem 2 (De Morgan's Law). For any universal set S, and for any
sets A, B ⊆ S, we have (A ∪ B)̅ = A̅ ∩ B̅.
Proof. We apply the definition of complement, the Boolean De Morgan's
law, the Boolean distributivity law, once again the definition of complement,
and finally the definition of set intersection:
(A ∪ B)̅ = S \ (A ∪ B)
        = {x | (x ∈ S) ∧ ¬((x ∈ A) ∨ (x ∈ B))}
        = {x | (x ∈ S) ∧ (x ∉ A) ∧ (x ∉ B)}
        = {x | ((x ∈ S) ∧ (x ∉ A)) ∧ ((x ∈ S) ∧ (x ∉ B))}
        = {x | (x ∈ S \ A) ∧ (x ∈ S \ B)}
        = {x | (x ∈ A̅) ∧ (x ∈ B̅)}
        = A̅ ∩ B̅
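Theorem 2 can likewise be spot-checked on a small concrete universal set; a Python sketch (the helper function complement is illustrative, not part of the notes):

```python
# Checking De Morgan's laws on a small universal set S.
S = set(range(10))
A = {1, 2, 3}
B = {3, 4, 5}

def complement(X):
    """Complement of X with respect to the universal set S."""
    return S - X

assert complement(A | B) == complement(A) & complement(B)  # Theorem 2
assert complement(A & B) == complement(A) | complement(B)  # the dual law
assert complement(complement(A)) == A                      # double complement
print("De Morgan's laws verified on this example")
```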

Let us compare once again the laws of Boolean logic with the laws of
sets. In logic, we have the set of Boolean values B = {F, T}, and Boolean
operators ¬, ∧, ∨. In set theory, we have a fixed universal set S, and set
operations ̅ (complement), ∩, ∪. The laws obeyed by these two structures
(the set B and the set of all subsets of S) are essentially the same. There are
many other similar structures in mathematics, with operations governed by
exactly the same laws. Such structures are called Boolean algebras.
The Boolean algebra formed by all subsets of a given set S is called the
powerset of S. Formally, the powerset of S is the set P(S) = {A | A ⊆ S}.
In other words, a set is a member of P(S), if and only if it is a subset of S:
∀A : A ∈ P(S) ⇔ A ⊆ S.
Let us consider some examples. The simplest case is S = ∅. The empty
set contains exactly one subset: the empty set itself. Hence, the powerset
of ∅ is a singleton: P(∅) = {∅}. Note: the powerset of the empty set is not
empty.
Now let S be a singleton, for example S = {Bunty}. Set S contains two
subsets: S itself, and the empty set. Hence, the powerset of S consists of
two elements: P(S) = {∅, {Bunty}}. In general, the powerset of any set S
contains, among other elements, the set S itself, and the empty set. For
example,
P({a, b, c}) = {∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}
When we form a subset of a given set S, we have two choices for each
element: either to include, or not to include this element in the subset.
Thus, for a finite set of n ∈ N elements, we make n independent choices,
leading to 2ⁿ different subsets. Therefore, the powerset of a finite set is
finite. Furthermore, the powerset of an n-element finite set consists of 2ⁿ
elements. Note that this also holds for P(∅), since 2⁰ = 1.
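The counting argument above suggests a direct way to compute powersets of small finite sets; a Python sketch using itertools.combinations (the helper function powerset is an illustrative name, not standard library):

```python
# Computing the powerset P(S) of a finite set: each subset is a choice of
# r elements for some r between 0 and |S|.
from itertools import combinations

def powerset(s):
    """Return the set of all subsets of s, as frozensets."""
    s = list(s)
    return {frozenset(c) for r in range(len(s) + 1)
            for c in combinations(s, r)}

P = powerset({'a', 'b', 'c'})
print(len(P))                           # 2^3 = 8 subsets
assert frozenset() in P                 # the empty set is always a member
assert frozenset({'a', 'b', 'c'}) in P  # and so is the set itself
assert len(powerset(set())) == 1        # P(empty set) is a singleton: 2^0 = 1
```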

If set S is infinite, then its powerset P(S) must also be infinite. This is
because P(S) contains, among other elements, all singletons {a}, such that
a ∈ S. Since S is infinite, the number of such singletons is also infinite.
The last set operation that we consider in this section is based on the
idea of a sequence. Let x1, x2, . . . , xn be any objects (n ∈ N). A (finite)
sequence (x1, x2, . . . , xn) is different from a set {x1, x2, . . . , xn} in that the
order and repetition of elements do matter in a sequence. For example,
the sequence JunkSeq1 = (239, banana, ace of spades) is different from the
sequence JunkSeq2 = (banana, 239, ace of spades, 239). The natural number
n is called the length of the sequence. For example, the length of JunkSeq1 is
three, and the length of JunkSeq2 is four. We will give a formal definition
of sequences later in the course.
A sequence of length two is called an ordered pair. Let A, B be any sets.
The Cartesian product of sets A, B, denoted A × B, is the set of all ordered
pairs (a, b), where a ∈ A, b ∈ B. In other words, A × B = {(a, b) | (a ∈
A) ∧ (b ∈ B)}.
The Cartesian product is named after the great philosopher and
mathematician René Descartes (1596–1650). Descartes lived long before sets
emerged as a separate mathematical concept. However, Descartes was the
first to realise that in geometry, a point in the plane can be represented by
a pair of numbers, called coordinates. Therefore, the whole plane is
represented by what we now call a Cartesian product of two lines.
Here are some examples of Cartesian products:
A × ∅ = ∅ × A = ∅ for any set A
{Bunty} × {Fowler} = {(Bunty, Fowler)}
{Fowler} × {Bunty} = {(Fowler, Bunty)}
{a, b, c} × {d, e} = {(a, d), (a, e), (b, d), (b, e), (c, d), (c, e)}
N × Planets = {(n, x) | (n ∈ N) ∧ (x ∈ Planets)} =
{(5, Saturn), (239, Earth), . . .}

The Cartesian product of a set A with itself is called the Cartesian square
of A, and denoted A² = A × A. For example,
{a, b}² = {(a, a), (a, b), (b, a), (b, b)}
N² = N × N = {(m, n) | m, n ∈ N}
Thus, the plane is the Cartesian square of a line.


When forming a pair (a, b) in the Cartesian product A × B, we make
two independent choices: we choose a ∈ A, and b ∈ B. For finite sets
A, B, with m and n elements respectively, there are m · n possible pairs.
Therefore, the Cartesian product of two finite sets is finite. Furthermore,
the Cartesian product of an m-element set and an n-element set consists
of m · n elements. Note that this also holds for products involving the
empty set: the Cartesian product of the empty set with any other set is
empty.
If one of the sets A, B is infinite, and the other is non-empty, then the
Cartesian product A × B must be infinite. This is because if, say, A is
infinite, and b ∈ B, then we can form an infinite number of distinct pairs
(x, b), where x ∈ A. Each such pair belongs to A × B.
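The counting argument for finite products can be checked directly; a Python sketch using itertools.product:

```python
# The Cartesian product A × B as a set of ordered pairs, built with
# itertools.product; its size is |A| * |B|.
from itertools import product

A = {'a', 'b', 'c'}
B = {'d', 'e'}

AxB = set(product(A, B))
print(sorted(AxB))                      # the six pairs (a,d), (a,e), ...
assert len(AxB) == len(A) * len(B)      # 3 * 2 = 6 pairs
assert set(product(A, set())) == set()  # product with the empty set is empty
assert AxB != set(product(B, A))        # the product is not commutative
```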
In general A × B ≠ B × A (the equality only holds when A = B, or
when one of A, B is empty). Hence, the Cartesian product operator is not
commutative. Furthermore, a nested pair ((a, b), c) is different from the
nested pair (a, (b, c)), hence (A × B) × C ≠ A × (B × C), so the Cartesian
product operator is not associative. However, it still has some distributive
properties with respect to other set operations:

A × (B ∩ C) = (A × B) ∩ (A × C)               distributivity of × over ∩
A × (B ∪ C) = (A × B) ∪ (A × C)               distributivity of × over ∪
A × (B \ C) = (A × B) \ (A × C)               distributivity of × over \
(A ∩ B) × C = (A × C) ∩ (B × C)
(A ∪ B) × C = (A × C) ∪ (B × C)
(A \ B) × C = (A × C) \ (B × C)

For a finite sequence of sets A1, A2, . . . , Ak, we can define the Cartesian
product A1 × A2 × · · · × Ak as the set of all finite sequences (a1, a2, . . . , ak),
where ai ∈ Ai for all i ∈ N, 1 ≤ i ≤ k. Alternatively, we can define the
multiple Cartesian product A1 × A2 × · · · × Ak as nested binary Cartesian
products (((A1 × A2) × A3) × · · · ) × Ak or A1 × (A2 × (A3 × (· · · × Ak))).
From the formal viewpoint, the above three definitions are not equivalent,
since the sequence (a1, a2, . . . , ak) is different from the nested pairs
(a1, (a2, (. . . , ak))) and (((a1, a2), . . . ), ak). However, the structure of the
resulting sets is similar, and in most applications we can treat the above as
three equivalent definitions of the Cartesian product of a sequence of sets.
If all sets in the sequence are finite, with set Ai having ni elements for every
i, then the Cartesian product A1 × A2 × · · · × Ak (by any of the three
definitions) has n1 · n2 · . . . · nk elements.
Similarly to the Cartesian square, we can define the k-th Cartesian power
of a set A as Aᵏ = A × A × · · · × A (k times). Thus, three-dimensional
space is the Cartesian cube (i.e. the third Cartesian power) of a line.
It is possible to define the Cartesian product of an infinite sequence of
sets by considering infinite sequences of elements, each element coming from
the corresponding set in the sequence. We will not use Cartesian products
of an infinite number of sets in this course.

4 Relations

4.1 Introduction to relations

We usually think of a relation between sets as a certain set of ordered
pairs, where each element of a pair is taken from its corresponding set. For
example, we could have the relation between the set of all cars and the set
of all people, which would consist of all pairs (x, y), where car x is driven
by person y. A car may be driven by more than one person, so there may
be several pairs with the same x; a person may drive more than one car, so
there may be several pairs with the same y. Some cars may have no drivers,
and some people may not drive any cars, therefore some set elements may
not be included in any of the pairs.
Thus, for any sets A, B, a relation between A and B is an arbitrary
subset of the Cartesian product A × B. In other words, a relation between
A and B is an arbitrary set of ordered pairs (a, b), where a ∈ A, b ∈ B.
Although ordinary set notation would be sufficient, there is an alternative,
more convenient notation for relations. We denote a relation Rp ⊆ A × B
by Rp : A ↔ B. Instead of writing (a, b) ∈ Rp, we write apb. This is in line
with the normal practice of mathematics, where we use e.g. x ≤ y instead
of (x, y) ∈ R≤.
Relation R≤ : N ↔ N is an example of a relation between the set N and
itself. For any set A, we say that relation Rp : A ↔ A is a relation on the
set A. From arithmetic, we already know several other relations on the set
N: R=, R<, R≤, R>, R≥. Another example of a relation on the set N is the
relation R| : N ↔ N, where m|n is true if m divides n (i.e. n is a multiple
of m).
We can define the following relations between any sets A, B:
- the empty relation ∅ ⊆ A × B
- the complete relation A × B ⊆ A × B
On any set A, we can define the equality relation R=A : A ↔ A as
R=A = {(a, a) | a ∈ A}. The equality relation consists of all pairs where
both elements are equal. When the set A is clear from the context, we drop
the subscript, so instead of a =A b we write simply a = b.
Let Rp : A ↔ B, Rq : B ↔ C. The composition of relations Rp and Rq is
a relation Rp∘q : A ↔ C, defined as follows:
∀(a, c) ∈ A × C : a(p∘q)c ⇔ ∃b ∈ B : (apb) ∧ (bqc)
In other words, an element a ∈ A is related to an element c ∈ C by the
composition Rp∘q, if there is (at least one) intermediate element b ∈ B, such
that a is related to b by Rp, and b is related to c by Rq.

Consider, for example, the relation Rq : People ↔ People, where xqy if
x is a child of y. The composition Rq∘q relates two elements x, y, if x is a
grandchild of y.
Let Rp : A ↔ B. The inverse of relation Rp is a relation Rp⁻¹ : B ↔ A,
defined as follows:
∀(b, a) ∈ B × A : b(p⁻¹)a ⇔ apb
In other words, an element b ∈ B is related to an element a ∈ A by the
inverse relation Rp⁻¹, if a is related to b by the original relation Rp. The
superscript −1 is just a symbol for inversion, not the number minus one.
For example, the inverse of the child relation Rq : People ↔ People is
the relation Rq⁻¹, that relates two elements x, y, if y is a child of x (that is,
if x is a parent of y).
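For small finite relations stored as sets of pairs, composition and inverse are easy to express; a Python sketch with a hypothetical child-of relation (the names and helper functions are invented for illustration):

```python
# Relations as Python sets of ordered pairs: composition and inverse.
child_of = {("Alice", "Carol"), ("Bob", "Carol"), ("Carol", "Eve")}

def compose(R, Q):
    """R composed with Q: all (a, c) with some b where (a,b) in R, (b,c) in Q."""
    return {(a, c) for (a, b1) in R for (b2, c) in Q if b1 == b2}

def inverse(R):
    """The inverse relation: all (b, a) with (a, b) in R."""
    return {(b, a) for (a, b) in R}

grandchild_of = compose(child_of, child_of)
print(grandchild_of)      # {('Alice', 'Eve'), ('Bob', 'Eve')}
print(inverse(child_of))  # the "parent of" relation
```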
We now switch our attention from relations between two arbitrary sets
A, B to relations on a given set A (the previous two examples were already
of this type). Let Rp : A ↔ A. Relation Rp is
- reflexive, if every element is related to itself: ∀a ∈ A : apa
- symmetric, if every two elements are related in both possible orders,
  as long as they are related at all: ∀a, b ∈ A : apb ⇒ bpa
- antisymmetric, if no two distinct elements are related in both possible
  orders: ∀a, b ∈ A : (apb ∧ bpa) ⇒ a = b
- transitive, if every two elements related via an intermediate third
  element are also related directly: ∀a, b, c ∈ A : (apb ∧ bpc) ⇒ apc
In other words, a relation Rp : A ↔ A is reflexive, if R=A ⊆ Rp; symmetric,
if Rp⁻¹ ⊆ Rp; antisymmetric, if Rp ∩ Rp⁻¹ ⊆ R=A; transitive, if Rp∘p ⊆ Rp.
Verifying these claims is left as an exercise.
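The four properties translate directly into predicates over a finite relation; a Python sketch, checked here on the divisibility relation restricted to the illustrative set {1, 2, 3}:

```python
# Direct translations of the four properties for a relation R on a set A,
# with R represented as a finite set of ordered pairs.
def is_reflexive(R, A):
    return all((a, a) in R for a in A)

def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)

def is_antisymmetric(R):
    return all(a == b for (a, b) in R if (b, a) in R)

def is_transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

A = {1, 2, 3}
divides = {(a, b) for a in A for b in A if b % a == 0}
print(is_reflexive(divides, A), is_antisymmetric(divides),
      is_transitive(divides))  # all True: divisibility is a partial order
```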
Note that a relation that is not symmetric need not be antisymmetric,
and vice versa. Any relation which contains simultaneously pairs (a, b) and
(b, a) for some, but not all, a, b ∈ A, a ≠ b, would be an example of a relation
that is neither symmetric nor antisymmetric. The equality relation is an
example of a relation which is both symmetric and antisymmetric (and also
reflexive and transitive).
The most interesting relations are those that satisfy more than one of
the above properties. In particular, a relation is
- an equivalence relation, if it is reflexive, symmetric and transitive;
- a partial order, if it is reflexive, antisymmetric and transitive.
The equality relation is both an equivalence relation and a partial order. In
the following sections, we shall see more examples of each type of relation.

4.2 Equivalence relations

An equivalence relation is a relation that is reflexive, symmetric and transitive. Examples of equivalence relations are abundant in mathematics and
in everyday life. For example, consider the relation on the set of all people,
where person a is related to person b, if a and b are of the same age (in
whole number of years). It is easy to check that all necessary properties in
the definition of an equivalence relation are satisfied. A relation where a is
related to b if a and b were born on the same day (but possibly in different
years) is another equivalence relation. In geometry, we can define an equivalence relation on the set of all straight lines in the plane, where line a is
related to line b, if a and b are parallel (every line is considered to be parallel
to itself).
In arithmetic, given a fixed number n ∈ Z, we can define the relation
R≡n : Z ↔ Z, where two numbers are related, if their difference is a multiple
of n: a ≡n b ⇔ n|(a − b). The relation R≡n is called congruence modulo
n. It is an equivalence relation for every natural n > 0.
Let A be any set, and R≡ : A ↔ A an equivalence relation (≡ is a
general mathematical sign for equivalence). For any element a ∈ A, the
equivalence class of a, denoted [a]≡, is the set of all elements in A related
to a: [a]≡ = {x ∈ A | x ≡ a}. Since R≡ is reflexive, every element belongs
to its own equivalence class: for all a ∈ A, a ∈ [a]≡. Sometimes an element
a is called a representative of the equivalence class [a]≡.
For example, if a ≡ b means that a and b are two people of the same
age, then the equivalence classes are all possible ages, and every person
represents all people of his or her age. If a ≡ b means that persons a
and b share a birthday, then the equivalence classes are all 366 possible
birthdays, and every person represents all people with the same birthday.
If a ≡ b means that lines a and b are parallel, then these lines share the
same direction, and we can think of all possible directions as the equivalence
classes. For the congruence relation R≡n, the equivalence class of any a ∈ Z
consists of all numbers that give the same remainder as a, when divided by
n. Thus, [2]5 = {. . . , −18, −13, −8, −3, 2, 7, 12, 17, . . . }.
The importance of equivalence classes is that in a set with an equivalence
relation, every element belongs to one, and only one, equivalence class. In
other words, we have the following theorem.
Theorem 3. Let R≡ : A ↔ A be an equivalence relation. The equivalence
classes of R≡ are pairwise disjoint. The union of all equivalence classes is
the whole set A.
Proof. To prove that the classes are pairwise disjoint, we need to show that
for all a, b ∈ A : ([a]≡ = [b]≡) ∨ ([a]≡ ∩ [b]≡ = ∅). Consider two cases:
Case a ≡ b. Consider any x ∈ [a]≡. By transitivity of R≡, we have:
x ≡ a, a ≡ b ⇒ x ≡ b ⇒ x ∈ [b]≡
Hence [a]≡ ⊆ [b]≡. Swapping a and b, we get [b]≡ ⊆ [a]≡, therefore
[a]≡ = [b]≡.
Case a ≢ b. Suppose [a]≡ ∩ [b]≡ ≠ ∅; then there is some x ∈ [a]≡ ∩ [b]≡.
By symmetry and transitivity of R≡, we have a ≡ x, x ≡ b ⇒ a ≡ b,
a contradiction. Therefore [a]≡ ∩ [b]≡ = ∅.
By the law of excluded middle, one of the above two cases must be true,
hence ∀a, b ∈ A : ([a]≡ = [b]≡) ∨ ([a]≡ ∩ [b]≡ = ∅).
Finally, by reflexivity of R≡, we have a ≡ a, therefore a ∈ [a]≡, so every
element of A belongs to some equivalence class. On the other hand, every
equivalence class is a subset of A, therefore the union of all equivalence
classes is the whole set A.

Theorem 3 allows us to think of any equivalence relation as a partitioning
of the set into disjoint subsets. In many cases, such partitioning has a
well-understood intuitive meaning:
- The equivalence relation "person a is of the same age as person b
  (in whole number of years)" has approximately 110–120 equivalence
  classes, corresponding to all possible ages. Note that these ages need
  not be a contiguous set of natural numbers, if e.g. there is a person of
  age 120, but no person of age 119.
- The equivalence relation "person a was born on the same day as person
  b (possibly in different years)" has exactly 366 equivalence classes,
  corresponding to every date in a year. Note that the sizes of all classes
  will be nearly equal, except the class corresponding to 29 February,
  which will be approximately four times smaller than the others.
- The equivalence relation "line a is parallel (or equal) to line b" has
  an infinite number of equivalence classes corresponding to all possible
  directions of a line in the plane. In fact, we can define "direction" as
  an equivalence class of this relation.
- The congruence modulo n relation R≡n has n equivalence classes,
  represented by the numbers 0, 1, . . . , n − 1. For example, for n = 5,
  we have:
  [0]5 = {. . . , −10, −5, 0, 5, 10, . . . }
  [1]5 = {. . . , −9, −4, 1, 6, 11, . . . }
  [2]5 = {. . . , −8, −3, 2, 7, 12, . . . }
  [3]5 = {. . . , −7, −2, 3, 8, 13, . . . }
  [4]5 = {. . . , −6, −1, 4, 9, 14, . . . }
  Although the number of classes is finite, each class is an infinite set.
  The classes [a]n are called residue classes modulo n.
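The residue classes above can be computed on any finite window of integers; a Python sketch (the helper function and the window are illustrative choices):

```python
# Residue classes modulo n on a finite window of Z: congruence modulo n
# partitions the integers into n classes by remainder.
def residue_classes(n, window=range(-10, 15)):
    """Group the integers in the window by their remainder modulo n."""
    classes = {r: [] for r in range(n)}
    for x in window:
        classes[x % n].append(x)  # Python's % always returns 0..n-1
    return classes

for r, members in residue_classes(5).items():
    print(f"[{r}]5 contains {members}")
# e.g. [2]5 contains [-8, -3, 2, 7, 12]
```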

For a given equivalence relation R≡ : A ↔ A, the set of all its equivalence
classes is called the quotient set of A with respect to R≡, and is denoted
by A/R≡ = {[a]≡ | a ∈ A}. In the examples above, the quotient sets are
respectively the set of all ages, the set of all birthdays, the set of all line
directions, and the set of all residue classes modulo n (for a given n ∈ N,
n > 0). The latter set is usually denoted by Zn = Z/R≡n = {[a]n | a ∈
Z}. The set Zn possesses very interesting arithmetic properties, which are
studied in number theory.
For a finite set A, the quotient set A/R≡ must be finite. In particular,
if A has n elements, and if all equivalence classes happen to be of equal size
m, then n must be a multiple of m, and the quotient set will have n/m
elements (i.e. equivalence classes). For an infinite set A, the quotient set
may be finite or infinite.

4.3 Partial orders

A partial order is a relation that is reflexive, antisymmetric and transitive.
Whereas an equivalence relation is an abstraction of equality or similarity
between objects, a partial order is an abstraction of one object being
in some sense "smaller" (or "greater") than another, or of one object
"preceding" (or "succeeding") another. Consider, for example, a relation on
the set of all people, where person a is related to person b, if a is a descendant
of b (i.e. a child, a grandchild, a great-grandchild, etc.). We count every
person as his or her own descendant, therefore the relation is reflexive. The
relation is also transitive, since a descendant of a descendant of a person is
a descendant of that person. Of course, the relation is not symmetric, since
"person a is a descendant of b" does not imply that "person b is a descendant
of a". Moreover, these two statements can both be true in one case only:
when a and b are the same person (who, by definition, is a descendant of
him/herself). Thus, we have the antisymmetry property, and our relation is
a partial order on the set of all people.
It is easy to check that the arithmetic relations R≤ and R≥, both on N
and on Z, are partial orders.
The divisibility relation R| : N ↔ N, which we mentioned several times
before, is formally defined as follows: for m, n ∈ N, we have m|n (m divides
n, n is a multiple of m), if there is a number k ∈ N, such that m · k = n.
Note that by this definition, the number 1 divides every number: to prove
1|n, we take k = n. Also, the number 0 is a multiple of every number: to
prove m|0, we take k = 0. We have the following theorem.
Theorem 4. The divisibility relation R| : N ↔ N is a partial order.
Proof. Let n ∈ N. We have n · 1 = n, hence n|n by definition of relation
R|. Therefore, relation R| is reflexive.
Let m, n ∈ N, m|n, n|m. By definition of relation R|, there are k, l ∈ N,
such that n = k · m, m = l · n. Hence, n = k · l · n. If n = 0, then
m = l · n = 0 = n; otherwise k · l = 1, and since k and l are natural
numbers, this can only be true if k = l = 1, hence n = 1 · m = m.
Therefore, relation R| is antisymmetric.
Let m, n, p ∈ N, m|n, n|p. By definition of relation R|, there are k, l ∈ N,
such that n = k · m, p = l · n. Hence, p = k · l · m, so m|p. Therefore, relation
R| is transitive.
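The three parts of the proof can be spot-checked mechanically on a finite fragment of N; a Python sketch (a sanity check under illustrative choices, not a substitute for the proof):

```python
# Spot-checking Theorem 4 on a finite fragment of N (including 0).
A = set(range(9))

def divides(m, n):
    """m | n in N: there is some natural k with m * k = n (so 0 | 0 holds)."""
    return any(m * k == n for k in range(max(A) + 1))

R = {(m, n) for m in A for n in A if divides(m, n)}

assert all((a, a) in R for a in A)                                 # reflexivity
assert all(a == b for (a, b) in R if (b, a) in R)                  # antisymmetry
assert all((a, d) in R for (a, b) in R for (c, d) in R if b == c)  # transitivity
print("R| is a partial order on this fragment")
```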

Another important example of a partial order is the subset inclusion
relation A ⊆ B, where A, B are both subsets of a given set S. Since
the objects being related are subsets of S, the subset inclusion relation is
defined on the powerset of S: R⊆ : P(S) ↔ P(S). The relation is reflexive,
since for any A ⊆ S, A ⊆ A; antisymmetric, since for any A, B ⊆ S,
(A ⊆ B) ∧ (B ⊆ A) ⇒ (A = B); transitive, since for any A, B, C ⊆ S,
(A ⊆ B) ∧ (B ⊆ C) ⇒ (A ⊆ C).
Note that in a partial order, some pairs of elements may be incomparable.
For example, for any two persons, one does not have to be an ancestor of the
other: they could be siblings, cousins, or not related at all. Likewise, there
are pairs of numbers neither of which divides the other (e.g. 4 and 5), and
pairs of sets neither of which is a subset of the other (e.g. {1, 2}, {1, 3} ⊆
{1, 2, 3}). On the other hand, relations R≤ and R≥ satisfy an additional
property: for any numbers a, b, we have either a ≤ b, or b ≤ a (or both,
if a = b). In general, a partial order R⪯ : A ↔ A is called total, if for all
a, b ∈ A, we have either a ⪯ b, or b ⪯ a. Thus, partial orders R≤ and R≥
are total; partial orders R| and R⊆ are not total.
Consider a partial (not necessarily total) order R⪯ : A ↔ A. Let a, b ∈ A.
We say that c ∈ A is an upper bound of a, if a ⪯ c. In particular, every
element is an upper bound of itself. An element c ∈ A is a (common) upper
bound of a and b, if a ⪯ c and b ⪯ c. An arbitrary pair of elements a, b
may have no common upper bound at all, or several common upper bounds.
In the latter case, one of the bounds may play a special role, being the
"closest" to a and b among all their common upper bounds. Formally, an
element c ∈ A is called the least upper bound of a, b, denoted lub⪯(a, b), if c
is an upper bound of a, b, and for any upper bound x of a, b, we have c ⪯ x.
In other words,
c = lub⪯(a, b) ⇔
(a ⪯ c) ∧ (b ⪯ c) ∧ ∀x ∈ A : (a ⪯ x) ∧ (b ⪯ x) ⇒ (c ⪯ x)
The least upper bound of a, b does not have to exist, even if elements a, b
have some common upper bounds.
All the above definitions can easily be restated for lower, rather than
upper, bounds. Thus, d ∈ A is a lower bound of a, if d ⪯ a. Every element
is a lower bound of itself. An element d ∈ A is a (common) lower bound of
a and b, if d ⪯ a and d ⪯ b. Two elements can have any number of common
lower bounds, or no common lower bounds at all. An element d ∈ A is called
the greatest lower bound of a, b, denoted glb⪯(a, b), if d is a lower bound of
a, b, and for any lower bound x of a, b, we have x ⪯ d. In other words,
d = glb⪯(a, b) ⇔
(d ⪯ a) ∧ (d ⪯ b) ∧ ∀x ∈ A : (x ⪯ a) ∧ (x ⪯ b) ⇒ (x ⪯ d)
Two elements may not have a greatest lower bound, even if they have
some common lower bounds. However, if two elements have a greatest
lower bound, then it is unique (why?). The same applies to the least upper
bound.
As an example, consider the partial order "a is a descendant of b". For
any two people, their common upper bound is any common ancestor, if one
exists. Thus, if two persons are cousins, then either of their two common
grandparents is a common upper bound. Neither of these upper bounds
is the least, since the two grandparents are not ancestors of each other.
There are many other common upper bounds, provided by ancestors of
these grandparents, but none of these upper bounds is the least.
In the same partial order, a common lower bound of any two people
is their common descendant, if one exists. Thus, if two persons are
in-laws, i.e. each of them is a parent of the other's child's partner, then each
of their common grandchildren is a common lower bound. There may
be many other common lower bounds, provided by descendants of common
grandchildren. If the two in-laws have exactly one common grandchild,
he/she is their greatest lower bound, since all other common lower bounds
would be that grandchild's descendants.
In arithmetic, the greatest lower bound of two numbers a, b ∈ N with
respect to the divisibility relation R| is the two numbers' greatest common
divisor: glb|(a, b) = gcd(a, b). (Sometimes the greatest common divisor is
called the highest common factor.) In the same partial order, the least upper
bound of two numbers a, b ∈ N is their least common multiple: lub|(a, b) =
lcm(a, b). In contrast with the previous example, every two non-zero natural
numbers have a greatest common divisor and a least common multiple,
and therefore a greatest lower bound and a least upper bound in R|.
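The glb/lub characterisation of gcd and lcm can be spot-checked directly using Python's math.gcd; a sketch (the helper lcm is defined here for illustration):

```python
# gcd and lcm as greatest lower / least upper bounds in the divisibility
# order: every common divisor divides the gcd, and the lcm divides every
# common multiple.
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

a, b = 12, 18
g, l = gcd(a, b), lcm(a, b)
print(g, l)  # 6 36

# gcd is a common lower bound, and the greatest one:
assert a % g == 0 and b % g == 0
assert all(g % d == 0 for d in range(1, a + 1) if a % d == 0 and b % d == 0)
# lcm is a common upper bound, and the least one:
assert l % a == 0 and l % b == 0
assert all(m % l == 0 for m in range(1, a * b + 1) if m % a == 0 and m % b == 0)
```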
Another example of an arithmetic partial order with guaranteed greatest
lower and least upper bounds is the total order R≤ : N ↔ N. Here, the
greatest lower bound of two numbers a, b is simply their minimum a ⊓ b
(a ⊓ b = a if a ≤ b, and a ⊓ b = b otherwise). The least upper bound of a, b
is their maximum a ⊔ b (a ⊔ b = b if a ≤ b, and a ⊔ b = a otherwise). In fact,
it is easy to see that greatest lower and least upper bounds are guaranteed
to exist in every totally ordered set.
Finally, consider the subset inclusion relation R : P(S) P(S) on
the subsets of any (note necessarily finite) set S. The greatest lower bound


of two subsets A, B ⊆ S is their intersection glb⊆(A, B) = A ∩ B, and
the least upper bound is their union lub⊆(A, B) = A ∪ B. For any two
sets, we can form their intersection and their union, therefore the relation
R⊆ is another example of a partial order where greatest lower and least
upper bounds always exist. In general, a partially ordered set where for
every two elements one can find their greatest lower bound and least upper
bound is called a lattice. The partial orders R| : N × N, R≤ : N × N and
R⊆ : P(S) × P(S) (for any set S) are examples of lattices.
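The lattice operations in R| can be computed directly; the following sketch (my own illustration, not part of the notes) uses the standard-library gcd to realise glb|(a, b) = gcd(a, b) and lub|(a, b) = lcm(a, b):

```python
from math import gcd

def glb_div(a: int, b: int) -> int:
    """Greatest lower bound of a, b in the divisibility order: their gcd."""
    return gcd(a, b)

def lub_div(a: int, b: int) -> int:
    """Least upper bound of a, b in the divisibility order: their lcm."""
    return a * b // gcd(a, b)

# Every common divisor of 12 and 18 divides glb_div(12, 18) = 6, and
# every common multiple of 12 and 18 is a multiple of lub_div(12, 18) = 36.
print(glb_div(12, 18), lub_div(12, 18))
```

The formula a · b / gcd(a, b) for the lcm works for any two non-zero naturals, mirroring the remark above that glb and lub always exist in this lattice.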
In many partially ordered sets, it is worthwhile to look for elements that
are in some sense extreme. Since the set may include incomparable elements, we have two possible notions of extremality. Consider a partial
(not necessarily total) order R : A × A. We say that a ∈ A is a maximal
element, if for all x ∈ A, we have (a ≼ x) ⇒ (a = x). In other words, the
only element higher than or equal to a is a itself. We say that c ∈ A is the
greatest element, if for all x ∈ A, we have x ≼ c. In other words, c is higher
than or equal to all elements of A. Note that by this definition, the greatest element must be comparable to (and higher than) all other elements,
whereas a maximal element may be comparable to (and higher than) some
elements and incomparable to others.
Both above definitions can be restated for the opposite extremes. We
say that b ∈ A is a minimal element, if for all x ∈ A, we have (x ≼ b) ⇒
(x = b). In other words, the only element lower than or equal to b is b itself.
We say that d ∈ A is the least element, if for all x ∈ A, we have d ≼ x. In
other words, d is lower than or equal to all elements of A. Again, the least
element is comparable to all other elements, whereas a minimal element may
be comparable to some elements and incomparable to others.
As an example, consider the partial order "a is a descendant of b". A
minimal element in this partial order is any person without children. There
is no least element, since no person is everyone's descendant.
In the total order R≤ : N × N, the number 0 is the least element, and the
only minimal element. There are no maximal or greatest elements. In the
partial order R| : N × N, the number 1 is the least element, since it divides
all natural numbers. The number 0 is (somewhat contrary to intuition) the
greatest element, since every natural number divides 0. It is also the only
maximal element.
An interesting variation of the previous example is the same partial order
R|, considered on the set of all natural numbers except 0 and 1. In this partial
order, every prime number is a minimal element, since it is not divisible
by any other natural number. There is no least element, since no number
(except 1, which is excluded) divides all natural numbers. There are no
maximal elements, since for every number other than 0, there is a distinct
multiple (e.g. x ≠ 2x and x | 2x for any x ∈ N, x ≠ 0). There is no greatest
element, since no positive number is a multiple of all other numbers.
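On a finite slice of this order, the minimal elements can be found by brute force; the sketch below (my own, not from the notes) confirms that among {2, ..., 30} the minimal elements of the divisibility order are exactly the primes:

```python
def minimal_elements(elems):
    """Elements of the set with no proper divisor inside the set (minimal in R|)."""
    s = set(elems)
    return sorted(x for x in s
                  if not any(y != x and x % y == 0 for y in s))

print(minimal_elements(range(2, 31)))
# Every composite number is divisible by some smaller member of the set,
# so only the primes remain minimal.
```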
In the subset inclusion relation R⊆ : P(S) × P(S), the least (and the


only minimal) element is ∅, and the greatest (and the only maximal) element
is S. If ∅ and S are excluded, and S is neither empty nor a singleton, then
there will be many minimal elements (all singletons {a}, where a ∈ S)
and many maximal elements (all complements of such singletons), but no
greatest or least element.
It is easy to prove that any greatest element is maximal, and that any
least element is minimal (try it!). As the above examples show, the converse
is not true: a maximal element need not be the greatest, and a minimal
element need not be the least. It is also easy to prove that if the greatest
(or the least) element exists, then it must be unique (try it!). However, if
a maximal or a minimal element is unique, it still does not have to be the
greatest or the least (why?).
The results of this section show us that the concept of a relation, and in
particular equivalence relations and partial orders, gives us a useful general
tool, applicable in various branches of mathematics and computer science.
We will apply our knowledge of relations in the following sections.

5 Functions

5.1 Introduction to functions

The word "function" takes on different meanings in different branches of
mathematics and computer science. One often thinks of a function as a
transformation rule, or a set of rules, that allow us to map, or transform,
objects into other objects. There are various ways to make this concept
of a function precise. In this course, we take the approach of ignoring the
process of transformation (which may not even be computable), and instead
we concentrate on the initial object the function was applied to, and the
final object that is the result of this application. In other words, we view a
function as a relation between the set of all possible inputs and all possible
outputs.
The special property of functions, which distinguishes them from other
relations, is that for every input, the function produces exactly one output (see Figure 1). Formally, a function f from set A to set B is a relation
Rf : A × B, where for every a ∈ A, there is exactly one b ∈ B such that
a f b (that is, (a, b) ∈ Rf). Set A is the domain of f, set B is the co-domain of
f. We say that function f maps A into B. We say that a function f : A → A
is a function on the set A.
There is special notation and terminology associated with functions. We
indicate that f is a function from A into B by writing f : A → B. As an
alternative notation to (a, b) ∈ Rf or a f b, we write f(a) = b. This notation
is unambiguous since, by definition of a function, for every a there is exactly
one b = f(a). We say that function f maps a to b. For a given function f,
the element b = f(a) is called the image of a, and a is called a pre-image of b.
We have already seen some examples of functions earlier in the course.
In particular, the equality relation on a set A, defined as R=A = {(a, a) | a ∈
A}, is a function on A. It is called the identity function on A, and denoted
idA : A → A. For all a ∈ A, we have idA(a) = a.
As an example of an arithmetic function, we can take the function sq :
Z → N, defined as the set of pairs Rsq = {(m, n) ∈ Z × N | m² = n}. This set
satisfies the definition of a function, since every integer has exactly

Figure 1: A function

one square. We have


Rsq = {..., (−3, 9), (−2, 4), (−1, 1), (0, 0), (1, 1), (2, 4), (3, 9), ...}
Consider any function f : A → B, and let H ⊆ A. The restriction of f
to the set H is the function f|H, defined as f|H = {(a, f(a)) | a ∈ H}. In other
words, the restriction agrees with the original function on all elements of H,
and is undefined on all elements not in H. For example, the restriction of
sq to the set of all natural numbers is the function sq|N : N → N. We have
Rsq|N = {(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), ...}
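As a quick illustration (mine, not from the notes), a function can be stored literally as its set of pairs, with the function property checked by counting images per input; restriction is then just filtering the pairs:

```python
def is_function(pairs, domain):
    """Check the definition: every element of the domain has exactly one image."""
    return all(sum(1 for (a, _) in pairs if a == x) == 1 for x in domain)

domain = range(-3, 4)
R_sq = {(m, m * m) for m in domain}   # the function sq on a finite slice of Z
print(is_function(R_sq, domain))      # True: each input has exactly one square

# Restriction to the natural numbers: keep only the pairs with a >= 0.
R_sq_restricted = {(a, b) for (a, b) in R_sq if a >= 0}
print(sorted(R_sq_restricted))
```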
Two other special cases of a function that we considered before are finite
and infinite sequences. Let A be any set, finite or infinite. A finite sequence
of elements of A is a function Nk → A, where k ∈ N is the length of the
sequence. The notation (a0, a1, ..., ak−1) ∈ Aᵏ is simply an alternative, shorter
way of writing

a : Nk → A    a(0) = a0    a(1) = a1    ...    a(k − 1) = ak−1

Similarly, an infinite sequence of elements of A is a function N → A. The notation (a0, a1, a2, a3, ...), where ∀i ∈ N : ai ∈ A, is an alternative to

a : N → A    a(0) = a0    a(1) = a1    a(2) = a2    a(3) = a3    ...

Thus, unlike sets, sequences need not be a basic, undefined concept: we
define sequences via functions. Since functions are a special case of relations, and
relations are a special case of sets, the concept of an ordered sequence is
ultimately reduced to the concept of an unordered set.
Since functions are relations, the operations of composition and inversion
can be applied to functions just like to any other relations. The result of
such application is a relation, but is not a priori guaranteed to be a function.
Still, it turns out that the result of function composition will always be a
function.
Theorem 5. Let f : A → B, g : B → C. The composite relation Rf∘g is a
function A → C.
Proof. Let a ∈ A. Since f is a function, there is a unique b = f(a) ∈ B.
Since g is a function, there is a unique c = g(b) = g(f(a)) ∈ C. By definition
of relation composition, we have (a, c) ∈ Rf∘g. Since such an element c is
unique, the relation Rf∘g is a function f ∘ g : A → C.
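Theorem 5 can be watched in action on small pair-sets; a sketch (mine, with hypothetical helper names) where composing two function-relations again yields a relation satisfying the function property:

```python
def compose(Rf, Rg):
    """Relation composition: (a, c) whenever (a, b) is in Rf and (b, c) is in Rg."""
    return {(a, c) for (a, b1) in Rf for (b2, c) in Rg if b1 == b2}

f = {(0, 'x'), (1, 'y'), (2, 'x')}   # f : {0,1,2} -> {'x','y'}
g = {('x', 10), ('y', 20)}           # g : {'x','y'} -> {10,20}

fg = compose(f, g)                   # f applied first, then g
print(sorted(fg))                    # each input 0, 1, 2 has exactly one output
```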


Figure 2: The range of a function


Thus, f ∘ g(a) = g(f(a)). This explains why in some books, the order of
the notation for function composition is inverted: g ∘ f instead of f ∘ g.
We prefer the latter notation, which indicates that in the expression g(f(a)),
function f is applied first, followed by function g.
In general, there is no analogue of Theorem 5 for function inversion. For
a function f : A → B, the inverse relation Rf⁻¹ : B × A need not be a
function. Consider, for example, the function sq : Z → N. Its inverse is the
square root relation Rsq⁻¹ = {(n, m) ∈ N × Z | m² = n}. We have
Rsq⁻¹ = {..., (9, −3), (4, −2), (1, −1), (0, 0), (1, 1), (4, 2), (9, 3), ...}
This set of pairs does not satisfy the definition of a function: some natural
numbers, such as 2, 3, 5, 6, ..., do not have an integer square root, whereas
other natural numbers, such as 1, 4, 9, 16, ..., have two integer square roots
of opposite signs. Thus, neither the existence nor the uniqueness condition
from the definition of a function is satisfied.
Let us go back to the definition of a function f : A → B. Note that
the domain A and the co-domain B play different, non-symmetric roles: for
every element of the domain, there must be a unique image in the co-domain,
but not vice versa. The set of all elements of the co-domain that do have a
pre-image in the domain (not necessarily a unique one) is called the range of
the function (see Figure 2). The range of a function f : A → B is denoted
f(A). For example, the range of the square function sq : Z → N is the set
of all squares.
Many important functions satisfy stronger conditions than just the existence and uniqueness of the image. Here we concentrate on two such
conditions.
A function f : A → B is called surjective, if its range is the whole
co-domain B:

f(A) = B

(see Figure 3). Such a function f is said to map the domain A onto B:
f : A ↠ B. An example of a surjective function is the function suit :
Cards → {♠, ♥, ♦, ♣}, which maps the finite set of cards in a standard pack
to the set of four suits. Since there is at least one card of every suit in the
pack, the function suit is surjective.

Figure 3: A surjective function

Figure 4: An injective function

A function f : A → B is called injective, if it maps different elements of
the domain A to different elements of the co-domain B:

∀x, y ∈ A : (f(x) = f(y)) ⇒ (x = y)

(see Figure 4). Such a function f is said to map A to B one-to-one: f : A ↣ B.
An example of an injective function is the square function on the set of
natural numbers: sq|N : N → N. Since every two different natural numbers
have different squares, the function sq|N is injective. The square function on the
set of all integers is not injective, since e.g. sq(−5) = sq(5) = 25.
The concepts of a surjective and an injective function are in a certain
sense complementary: for any pair of sets A, B, there is a surjective function
from A to B, if and only if there is an injective function from B to A. The
proof of this statement is left as an exercise.
A function f : A → B is called bijective, if it is both surjective and
injective. For every element of the co-domain B, such a function has a unique
pre-image in the domain A:

∀b ∈ B : ∃!a ∈ A : f(a) = b

(see Figure 5). A bijective function f from A to B is also called a one-to-one
correspondence between A and B: f : A ↔ B. An example of a bijective
function is the function "add five" on the set of all integers:

add5 : Z → Z    ∀a ∈ Z : add5(a) = a + 5

Figure 5: A bijective function

For every integer b ∈ Z, the number b − 5 is its pre-image, therefore the function
add5 is surjective. Adding five to two different integers produces different results, therefore the function add5 is injective. Thus, add5 is a
bijective function from the set Z to itself.
A bijective function from any set to itself is called a permutation on
that set. In the previous example, the function add5 is a permutation on Z. A
special case of a permutation is an involution, which is any bijection that
coincides with its own inverse. Under an involution, every element of the
domain is either left unchanged, or swapped with another element. An
example of an involution is the function that inverts the sign of an integer:

neg : Z → Z    ∀a ∈ Z : neg(a) = −a
Proof of the following properties of functions is left as an exercise:
• the composition of two surjective (respectively injective, bijective) functions is surjective (injective, bijective);
• the inverse relation of a bijective function is a bijective function.
A special example of a bijection puts the powerset of any given set S in
one-to-one correspondence with the set of all possible functions from S to the
two-element set B = {F, T}. For any subset A ∈ P(S), the corresponding
function is the indicator function of A, χA : S → B, defined as follows:

∀x ∈ S : χA(x) = T if x ∈ A, and χA(x) = F if x ∉ A

To prove that the mapping χ : A ↦ χA is a bijection between P(S) and the
set of functions B(S) = {f | f : S → B} is left as an exercise.
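This correspondence is easy to demonstrate concretely; below is a sketch (my own, with made-up helper names) that maps each subset of a small S to its indicator function, represented as a dict, and back again:

```python
from itertools import combinations

S = ['a', 'b', 'c']

def indicator(A):
    """Indicator function of subset A, as an explicit table S -> {False, True}."""
    return {x: (x in A) for x in S}

def subset_of(chi):
    """Inverse mapping: recover the subset from its indicator function."""
    return {x for x in S if chi[x]}

# Going there and back recovers every subset, so the mapping is a bijection:
subsets = [set(c) for r in range(len(S) + 1) for c in combinations(S, r)]
assert all(subset_of(indicator(A)) == A for A in subsets)
print(len(subsets))   # 2^3 = 8 subsets, matching the 2^3 indicator functions
```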

5.2 Set cardinality
Putting two sets in one-to-one correspondence is one of the most basic activities that can be performed on sets. Intuition tells us that it is possible
if and only if both sets have the same size. In fact, the idea of one-to-one correspondence, or bijection, allows us to define precisely what "size"
means, even for infinite sets.


We say that two sets A, B are equinumerous (A ≅ B), if there is a
bijective function f : A ↔ B. For any given set S, we can think of equinumerosity as a relation on the subsets of S: R≅ : P(S) × P(S). Since every
set can be put in one-to-one correspondence with itself by the identity function, the relation R≅ is reflexive. Since both the inverse of a bijective function
and a composition of two bijective functions are bijective, the relation R≅ is
symmetric and transitive. Thus, R≅ : P(S) × P(S) is an equivalence relation. Each of its equivalence classes is composed of sets of the same size; in
fact, every such class can be thought of as an abstraction of set size, either
finite or infinite. In mathematics, these set sizes are called cardinalities.
It is easy for us to get hold of finite cardinalities, since we accepted the
natural numbers as one of our basic concepts. For any n ∈ N, let Nn be
defined as the set of the first n natural numbers:

Nn = {x ∈ N | x < n}

Thus, N0 = ∅, N1 = {0}, N2 = {0, 1}, etc. Intuitively, the sets Nn are a
representative collection of what we would like to call finite sets: we define a
set to be finite, if it is equinumerous with the set Nn for some n ∈ N. No two
of the sets Nn with different values of n are equinumerous; we accept this
as one of the axiomatic properties of natural numbers. Given this property,
it is easy to prove that every finite set is equinumerous with exactly one of
the sets Nn.
Theorem 6. For every finite set A, there is a unique n ∈ N, such that
A ≅ Nn.
Proof. Suppose A is equinumerous with Nk and Nl, k, l ∈ N. We have the
bijections f : A ↔ Nk and g : A ↔ Nl. The function f⁻¹ ∘ g : Nk ↔ Nl is also
a bijection (why?). Therefore, the sets Nk and Nl are equinumerous. This can
only happen if k = l.

By the above theorem, every finite set has a uniquely defined natural
number as its cardinality. This fact gives some precision to our introductory
remark that natural numbers are an abstraction of finite set sizes.
We now turn our attention to cardinalities of infinite sets. A priori, it is
not obvious whether different infinite sets (e.g. N, Neven, N², N³, Z, P(N))
have different cardinalities. We begin our study of infinite cardinalities with
the set N. We call an infinite set countable, if it is equinumerous with the
set of all natural numbers N. Intuitively, such a set can be "counted", i.e.
put in one-to-one correspondence with N.
It may appear at first that by removing elements from N, we can obtain
infinite sets with a cardinality different from that of N. It turns out that
this is not the case. Let us look at some examples.
Theorem 7. Set N⁺ = N \ {0} is countable.


Proof. Consider function f : N N+ , which adds one to every natural


number: n : f (n) = n + 1.
With respect to function f , every element of N+ has a pre-image:
n N+ : n = (n 1) + 1 = f (n 1)
Therefore, function f is surjective.
Furthermore, function f maps different elements of N to different elements of N+ :
m, n N : (m 6= n) (m + 1 6= n + 1)
Therefore, function f is injective.
Since f is surjective and injective, f is bijective

The above proof can be represented graphically as follows:


0 1 2 3 4 5 6 7
l l l l l l l l

1 2 3 4 5 6 7 8

Theorem 8. Set Neven = {0, 2, 4, 6, ...} is countable.

Proof. Consider the function f : N → Neven, which doubles every natural number: ∀n : f(n) = 2n.
With respect to the function f, every element of Neven has a pre-image:

∀n ∈ Neven : n = 2 · (n/2) = f(n/2)

Therefore, the function f is surjective.
Furthermore, the function f maps different elements of N to different elements of Neven:

∀m, n ∈ N : (m ≠ n) ⇒ (2m ≠ 2n)

Therefore, the function f is injective.
Since f is surjective and injective, f is bijective.

The above proof can be represented graphically as follows:

0 1 2 3 4  5  6  7 ···
↕ ↕ ↕ ↕ ↕  ↕  ↕  ↕
0 2 4 6 8 10 12 14 ···

Theorems 7 and 8 suggest that, contrary to intuition, a part (i.e.
a proper subset) of an infinite set can be of the same size as the whole.
In fact, it can be proved that every subset of a countable set is either finite
or countable; in other words, the cardinality of N is the smallest among
infinite cardinalities. As a consequence, for any equivalence relation on a
countable set, the quotient set (i.e. the set of all equivalence classes) is
either finite or countable. This can be shown by selecting an arbitrary
representative from every equivalence class. The function that maps every
equivalence class to its representative is a bijection (why?), therefore the
quotient set is equinumerous with a subset of the initial set. Since the
initial set is countable, its quotient set must be finite or countable.
It turns out that not only subsets, but also certain supersets of N may
be countable.
Theorem 9. Set Z is countable.
Proof. Consider the function f : N → Z, which counts the negative integers by
even naturals, and the positive integers by odd naturals:

∀n : f(n) = (n + 1)/2 if n is odd, and f(n) = −n/2 if n is even

Function f is bijective (proof left as an exercise).

The above proof can be represented graphically as follows:

−4 −3 −2 −1  0  1  2  3  4 ···
 ↕  ↕  ↕  ↕  ↕  ↕  ↕  ↕  ↕
 8  6  4  2  0  1  3  5  7 ···
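This counting function is easy to program and to check on a finite prefix; a sketch (mine, not from the notes):

```python
def f(n: int) -> int:
    """Count negative integers by even naturals, positive ones by odd naturals."""
    return (n + 1) // 2 if n % 2 == 1 else -(n // 2)

images = [f(n) for n in range(9)]
print(images)   # [0, 1, -1, 2, -2, 3, -3, 4, -4]
# No repeats on any prefix, matching the claim that f is injective:
assert len(set(images)) == len(images)
```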

Perhaps taking a Cartesian square or a higher Cartesian power of a
countable set will produce a bigger set? It turns out that the answer is
no.
Theorem 10. Set Z² is countable.
Proof. We only give the main idea of the proof. Since Z ≅ N, it suffices to
show that N² is countable. The set N² can be represented as an infinite
two-dimensional table, where the entry in row i and
column j corresponds to the pair (i, j), i, j ∈ N. The entries in such a table
can be counted by diagonals:
     0   1   2   3   4  ···
0    0   1   3   6  10
1    2   4   7  11
2    5   8  12
3    9  13
4   14
···

This method gives us a bijection between N and N²; with a little extra effort,
the formula for this bijection can be given explicitly (left as an advanced
exercise).



The above theorem implies that any finite Cartesian power of a countable
set is countable. For instance,

N³ = (N × N) × N ≅ N × N ≅ N

In our quest for uncountable infinity, we may be tempted to extend the
set of natural numbers so that, roughly speaking, we would have an infinity
of numbers "everywhere". More precisely, we may want to consider the set
Q of rational numbers, defined as fractions m/n, where m, n ∈ Z, n ≠ 0.
Two fractions a/b and c/d are considered equal, i.e. representing the same
rational number, if a · d = b · c. Therefore, we have an equivalence relation
on the set of all integer pairs:

R≈ : Z² × Z²    (a, b) ≈ (c, d) ⇔ a · d = b · c

Every rational number is defined as an equivalence class of this relation.
The whole set of rational numbers is the quotient set Q = Z²/R≈.
In contrast with the sets N and Z, the set of rational numbers Q is dense:
between any two rational numbers, no matter how close, there is another
rational number. In fact, in every segment between two rational numbers,
no matter how tiny, there is an infinite number of other rational numbers.
Intuitively, it feels as if there must be many more rational numbers than
integers, in order to fill up all those segments. However, we already know
that the set Q must be countable, since it is defined as a quotient set of the
countable set Z².
Do uncountable sets exist at all? The answer to this question is given
by Cantor's theorem: no set can be equinumerous with its own powerset.
Theorem 11. For all sets A, A ≇ P(A).
Proof. The proof method is called Cantor's diagonal argument, and is reminiscent of Russell's paradox.
To prove the statement by contradiction, suppose that for some set A,
there exists a bijective function f : A ↔ P(A), which puts elements of A in
one-to-one correspondence with subsets of A. Consider the set of all elements
of A that are not in their corresponding subsets: D = {a ∈ A | a ∉ f(a)}.
Since D is a subset of A, it must, like all other subsets, have a corresponding
element d, such that f(d) = D.
Consider the statement d ∈ D. Suppose this statement is true. Then d
is an element of the set D of all elements that are not in their corresponding
subsets. But the corresponding subset of d is the set D itself, therefore, by the
definition of D, we have d ∉ D. Hence, the statement d ∈ D cannot be true.
Suppose the statement d ∈ D is false. Then d is not an element of its
corresponding set D. We have a special set for such elements, which happens
to be D itself! Therefore, by the definition of D, we have d ∈ D. Hence,
the statement d ∈ D cannot be false.


By the laws of logic, d ∈ D must be either true or false. As we have shown
above, both cases lead to a contradiction. Therefore, our initial assumption
must be false, and the bijective function f cannot exist.
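The diagonal construction can be watched in action on a finite set, where of course no mapping into the powerset can be surjective; a sketch (mine, with a made-up attempted correspondence):

```python
A = [0, 1, 2]

def diagonal_set(f):
    """Cantor's D = {a in A | a not in f(a)}: never in the image of f."""
    return {a for a in A if a not in f(a)}

# One particular attempted correspondence a -> f(a):
f = {0: set(), 1: {0, 1}, 2: {1, 2}}
D = diagonal_set(lambda a: f[a])
print(D)
# D differs from every f(a) at the element a itself:
assert all(D != f[a] for a in A)
```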

The above theorem implies that the set P(N) is uncountable. Since the
powerset of any set A is equinumerous with the set of all Boolean functions
A → B, the set of functions from N to B = {F, T} is also uncountable. By
replacing F with 0 and T with 1, we can obtain a simple bijection from the
latter set to the set of all functions N → {0, 1}. This set, in its turn, is a
subset of the set of all functions N → N, which can be regarded as the set of
all infinite integer sequences, or as the infinite Cartesian product N × N × ···.
Therefore, unlike finite Cartesian products, an infinite Cartesian product of
countable sets need not be countable.
The fact that the set P(N) is uncountable helps us in analysing the
cardinality of yet another numerical set. Consider extending the set of rational numbers Q by "filling in the gaps". We obtain the set of real
numbers, which, in addition to rational numbers, contains such numbers as
√2 = 1.414213... and π = 3.141592... . To formalise properly the idea of a
real number as a "gap" between rationals, we notice that every real number
can be approximated by rationals both from below and from above. For
example, π is approximated from below by the rationals

3, 31/10, 314/100, 3141/1000, ...

and from above by

4, 32/10, 315/100, 3142/1000, ...
Thus, a real number splits the set of all rationals into two subsets, "below"
and "above". More precisely, a real number is defined as a partitioning
Q = Q1 ∪ Q2, such that for all x ∈ Q1, y ∈ Q2, we have x < y. These
partitionings are traditionally called Dedekind cuts of Q. Taken together,
all possible Dedekind cuts form the set of real numbers R.
Since the real numbers are nothing else than "gaps" between rationals,
one might expect that there cannot be more gaps than rationals themselves. Here the intuition fails us once again: unlike Q, the set R is uncountable. Consider the set of real numbers between 0 and 1. Every such number
can be represented in the decimal (or binary, or any other positional) system, which is just another form of approximation by rationals. For example,
the number π − 3 = 0.141592... corresponds to the following infinite sequence
of decimal digits:

(1, 4, 1, 5, 9, 2, ...)

The set of all such sequences includes as a subset the set of all sequences
composed of the digits 0 and 1. We already know that this set is equinumerous with the set of all functions N → B, which is uncountable. Therefore
the whole set R is also uncountable.


In order to obtain larger infinite cardinalities, we can go beyond P(N)
by applying Cantor's theorem several times. The sets N, P(N), P(P(N)),
P(P(P(N))), ... all have different cardinalities. This sequence of cardinalities is only the beginning of an enormous tower of infinite cardinalities. In
fact, there are so many of them that they do not even form a set. However, in
this course, and in real life, we rarely need any sets bigger than P(N).

6 Induction

Let us take another look at the set of natural numbers:

N = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, ...}

We have accepted it as a basic concept, and have therefore used it without
definition. However, we have not said much about the axioms that describe
the basic properties of the set N. Here are these axioms in a simplified form:
• 0 is a natural number;
• if n is a natural number, then the next number next(n) is also a natural
number;
• every natural number can be obtained by applying the above axioms
a finite number of times.
These three axioms can be used to derive all known properties of natural
numbers. They describe exactly what a natural number is, by giving the
"first" natural number, and a method of obtaining "new" natural numbers
from "old" ones. This way of describing a mathematical object is usually called
an inductive definition. It is important to note that it is not a definition in
our original sense of the word: it does not reduce a concept (in this case,
that of a natural number) to other, more basic concepts. Instead, it reduces
instances of a concept (for example, the number 5) to other, more basic instances
(5 = next(4)).
Notice that our inductive definition of N is self-referential: it defines a
natural number by referring to the concept of natural number itself. Such
self-reference would not be allowed in a standard definition. Since, technically, an inductive definition is a collection of axioms, self-reference is
allowed, but care has to be taken to prevent such self-reference from going
in a vicious circle ("a natural number is a natural number"). In fact, the
first axiom of natural numbers is not self-referential: it provides the base of
induction, the number 0. The second axiom provides the inductive step, the rule
for constructing new natural numbers from objects already known to be
natural numbers. The base and the inductive step capture everything that is
a natural number; however, they still do not allow us to decide whether any
other object is a natural number. Therefore, we need the final axiom, which
provides the completeness statement. The concept of a natural number is
now settled: we know that 0 ∈ N by the first axiom;

3 = next(2) = next(next(1)) = next(next(next(0))) ∈ N

by the second axiom; "The Moon" ∉ N by the final axiom.
In general, every inductive definition will follow the above pattern:
• induction base, giving one or more initial elements of the set being
defined;
• inductive step, giving one or more rules for obtaining new elements of
the set from old ones;
• completeness statement.

For example, we can define a queue as follows:


the empty queue (no people queuing) is a queue;
if we take an existing queue, and put another person behind the last
person in the queue, the result is a queue;
every queue can be obtained by applying the above rules a finite number of times.
Provided that putting the person behind is precisely defined, we have an
unambiguous inductive definition of a queue.
Perhaps a more useful example is the inductive definition of a Boolean
statement:
• F, T are Boolean statements;
• if A, B are Boolean statements, then ¬A, A ∧ B, A ∨ B, A ⇒ B,
A ⇔ B are Boolean statements;
• every Boolean statement can be obtained by applying the above rules
a finite number of times.
Here, there are two initial Boolean statements in the base of induction, and
the inductive step gives us five rules for constructing new statements from
old ones.
Let us return to the general situation where we are given an inductive
definition for a set S. Suppose we need to prove that all elements of S share
some common property, given by a predicate P: ∀x ∈ S : P(x). The proof
has to follow the structure of the inductive definition:
• induction base: for each of the initial elements of S, predicate P is
true;
• inductive step: if element x ∈ S is obtained from some "simpler"
elements, and P is true for each of these simpler elements, then P is
true for x;
• by the completeness statement, P must be true for all elements of S.


The inductive step is an implication: assuming that P is true for the elements
from which x is obtained, we must prove that it logically follows that P(x)
is also true. The assumption that we make in the inductive step is
called the inductive hypothesis. Superficially, it may look as if we are assuming
what we are supposed to prove, but in fact, the inductive step reduces the
proof of P(x) to proving P for elements simpler than x. Ultimately, the
whole proof hinges on proving the induction base. Thus, both the induction
base and the inductive step are essential parts of the inductive proof; the
proof is not valid without one or the other.
The structure of the inductive proof, applied to the specific case of natural numbers, is as follows:
• induction base: prove P(0);
• inductive step: prove that for all n ∈ N, P(n) ⇒ P(next(n));
• therefore, ∀n ∈ N : P(n).
In this case, the inductive hypothesis is simply the statement P(n).
As an example, consider the following proof.
Theorem 12. Any amount of postage beginning from 8p can be paid by
postage stamps of value 3p and 5p.
Proof. Let us denote the amount of postage by n ∈ N, n ≥ 8.
Induction base: n = 8 = 3 + 5.
Inductive step. Suppose postage n is paid by 3p and 5p stamps. There
may be two cases:
• there is at least one 5p stamp. Replace it by two 3p stamps. The
amount has increased by −5 + (3 + 3) = 1 pence.
• there are only 3p stamps. Since n ≥ 8, there are at least three of
them. Replace three 3p stamps by two 5p stamps. The amount has
increased by −(3 + 3 + 3) + (5 + 5) = 1 pence.
In both cases, postage n + 1 has been paid by 3p and 5p stamps as required.
By induction, the statement is true for any postage.
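The inductive step above is effectively an algorithm; a sketch (mine, not from the notes) that follows it literally, building a multiset of stamps for every amount from 8p upwards:

```python
def stamps(n: int) -> list[int]:
    """Return a list of 3p/5p stamps paying exactly n pence, for n >= 8."""
    assert n >= 8
    pay = [3, 5]                      # induction base: 8 = 3 + 5
    for _ in range(n - 8):            # one inductive step per extra penny
        if 5 in pay:                  # swap a 5p stamp for two 3p stamps: +1p
            pay.remove(5)
            pay += [3, 3]
        else:                         # only 3p stamps: swap three for two 5p: +1p
            for _ in range(3):
                pay.remove(3)
            pay += [5, 5]
    return pay

for n in range(8, 20):
    assert sum(stamps(n)) == n        # every amount from 8p upwards is payable
```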

The above example may not strike one as a useful mathematical fact.
However, many much more useful properties of natural numbers and other
inductively defined sets can be proved by the induction principle. As an
example, consider the following theorem, which we give without proof.
Theorem 13. Every Boolean function Bn B, n N, can be expressed by
a Boolean statement with n free variables, using only operators , , .
Proof. Induction.

50

Discrete Mathematics I (CS127)

This theorem justifies our choice of Boolean operators: using just three
of them, we can express every possible Boolean function of n variables.
(Exercise: what statement should be the base of induction in the omitted
proof?)
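Independently of the omitted induction, the theorem can be verified constructively: build the disjunctive normal form of a function from its truth table, an "or" over the true rows of an "and" of literals. A Python sketch of this standard construction (our own code, not from the notes):

```python
from itertools import product

def to_dnf(f, n):
    """Express an n-ary Boolean function f as a statement over x0..x{n-1}
    using only 'not', 'and', 'or' (disjunctive normal form: one 'and'-term
    per row of the truth table on which f is true)."""
    terms = []
    for bits in product([False, True], repeat=n):
        if f(*bits):
            lits = [f"x{i}" if b else f"not x{i}" for i, b in enumerate(bits)]
            terms.append("(" + " and ".join(lits) + ")")
    # the constant-false function has no true rows; we return the literal
    # False for brevity (for n >= 1 one could use "x0 and not x0" instead)
    return " or ".join(terms) if terms else "False"

print(to_dnf(lambda a, b: a != b, 2))   # (not x0 and x1) or (x0 and not x1)
```

The number of terms can be exponential in n, which is acceptable here: the theorem claims only existence of an expression, not a short one.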
In the following chapter, we shall see more examples of induction.

7 Graphs

7.1 Motivating examples

Graphs were invented by Leonhard Euler (1707–1783). Figure 6 shows the
map of his native town of Königsberg (now Kaliningrad), consisting of four
islands connected by seven bridges. The question is: can one, starting from
any point on the map, make a tour of the town, crossing every bridge exactly
once and returning to the original point?
Figure 7 shows a graph representing this puzzle. The black nodes correspond
to islands, the white nodes to bridges. A black node is connected to a
white node by an edge, if the corresponding island and bridge are adjacent.
The puzzle now becomes: can one, starting from any node, make a tour
of the graph, visiting every edge exactly once and returning to the original
node? The puzzle is greatly simplified by the graph representation. In particular, the exact location of nodes and the shape of edges do not matter;
the edges may even cross, as long as we do not count the crossing point as
a new node.
It is hardly surprising that we can represent the above problem of a
geometric nature by a graph. However, graphs are applicable to a much
larger class of problems. Consider the following one, which has very little
geometry in it.
A farmer, who has in his possession a wolf, a goat and a cabbage, wants
to cross a river in a boat. The boat is only big enough for the farmer himself,
plus one other item. The wolf cannot be left alone with the goat, or the goat
alone with the cabbage. Is it possible to get to the other side of the river
satisfying all these restrictions?
Figure 8 shows a graph representing this puzzle. The nodes correspond
to different states of the game, the edges to transitions between states. It is
clear that the puzzle has two distinct solutions, which correspond to paths
from left to right in the picture. As before, the location of nodes and the

Figure 6: The Königsberg bridges

Figure 7: The Königsberg graph


Figure 8: The wolf/goat/cabbage graph


shape or intersections of edges do not matter.
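The state graph of Figure 8 can also be searched mechanically. A breadth-first search sketch in Python (the encoding is ours: a state is the set of items on the starting bank, and `solve` returns the shortest sequence of states):

```python
from collections import deque

ITEMS = {"farmer", "wolf", "goat", "cabbage"}

def safe(bank):
    """A bank without the farmer must not pair wolf/goat or goat/cabbage."""
    if "farmer" in bank:
        return True
    return not ({"wolf", "goat"} <= bank or {"goat", "cabbage"} <= bank)

def solve():
    """Breadth-first search over the state graph: a state is the set of
    items on the starting bank; edges are single boat crossings."""
    start, goal = frozenset(ITEMS), frozenset()
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []               # reconstruct the route back to the start
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        farmer_here = "farmer" in state
        here = state if farmer_here else ITEMS - state
        for item in [None] + sorted(here - {"farmer"}):
            moved = {"farmer"} | ({item} if item else set())
            nxt = state - moved if farmer_here else state | moved
            if safe(nxt) and safe(ITEMS - nxt) and nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None

print(len(solve()) - 1)   # 7 crossings
```

Breadth-first search finds a shortest solution; as the text notes, the puzzle has two distinct solutions, and which one is found depends on the order in which moves are tried.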
As a third example, consider the following puzzle. There are three houses
and three wells. The owner of each of the houses wants to build a path to
each of the wells. The paths must not cross.
Figure 9 shows a graph representing the puzzle. In contrast with the
previous examples, the layout of nodes and edges does matter here. The
layout shown in Figure 9 is not a solution, since the edge from H3 to W1
intersects with other edges.
Figure 9: The houses and wells graph


Figure 10: The complete graph on five nodes

7.2 Graphs as relations

From our first two motivating examples, it is clear that in our definition of a
graph, it should only matter which nodes are connected by edges; the layout
and shape of nodes and edges are irrelevant. Therefore, graphs for us are a
special type of relations.
Let Rρ : A ↔ A. We say that relation Rρ is
irreflexive, if no element is related to itself: ∀a ∈ A : ¬(a ρ a);
symmetric, if every two elements are related in both possible orders,
as long as they are related at all: ∀a, b ∈ A : a ρ b → b ρ a.
Let V be any finite set. We call elements of V nodes. An irreflexive,
symmetric relation E = R* : V ↔ V is called a graph on V . The pairs of
nodes that are elements of relation E are called edges. Two nodes that are
connected by an edge are called adjacent. A graph with set of nodes V and
set of edges E is usually denoted G = (V, E).
A special case of a graph on the set of nodes V is the empty graph, which
has no edges: (V, ∅). The other extreme is the complete graph, which contains
all possible edges: K(V ) = (V, E), where E = {(u, v) ∈ V² | u ≠ v}.
Figure 10 shows the complete graph K(N5 ).
When studying the structure of a graph, we usually want to identify
graphs which are the same up to a renaming of nodes. This informal idea
is captured by the following definition. Graphs G1 = (V1 , E1 ) and G2 =
(V2, E2) are called isomorphic, if there is a bijective function f : V1 → V2
which preserves the edges:
∀u, v ∈ V1 : (u, v) ∈ E1 ↔ (f(u), f(v)) ∈ E2
Bijective function f is called the isomorphism between G1 and G2 . Figure 11
shows three isomorphic graphs with different layouts.
The notion of isomorphism can be useful when the exact set of nodes
in the graph is irrelevant. For example, for every n N, there is, up to
isomorphism, just one complete graph on n nodes. We will denote this
graph by K(n). This can be read as "any graph isomorphic to K(Nn)".
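For small graphs, the definition of isomorphism can be checked directly by trying every bijection between the node sets; this takes n! attempts, so it is a demonstration of the definition rather than an efficient algorithm. A Python sketch (edges represented as two-element frozensets, an encoding of ours):

```python
from itertools import permutations

def isomorphic(V1, E1, V2, E2):
    """Test graph isomorphism by trying every bijection f : V1 -> V2.
    Edges are given as sets of two-element frozensets (undirected)."""
    V1, V2 = list(V1), list(V2)
    if len(V1) != len(V2) or len(E1) != len(E2):
        return False
    for perm in permutations(V2):
        f = dict(zip(V1, perm))      # candidate bijection
        if all(frozenset({f[u], f[v]}) in E2
               for u, v in (tuple(e) for e in E1)):
            return True
    return False

# a path 0-1-2 and a path 'a'-'c'-'b' are isomorphic
E1 = {frozenset({0, 1}), frozenset({1, 2})}
E2 = {frozenset({"a", "c"}), frozenset({"c", "b"})}
print(isomorphic([0, 1, 2], E1, ["a", "b", "c"], E2))   # True
```

Since the bijection maps edges injectively and |E1| = |E2|, checking that every image of an E1-edge lies in E2 suffices for edge preservation in both directions.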
A graph G = (V, E) is called bipartite (or two-coloured ), if the set of
nodes can be partitioned into two disjoint subsets V = V1 ∪ V2, such that



Figure 11: Isomorphic graphs


Figure 12: The complete bipartite graph on two sets of three nodes

every edge in E connects two nodes from different subsets. The subsets V1 ,
V2 are called colour classes. From Figures 7, 8, 9 it is clear that the three
graphs introduced in the previous subsection are bipartite, with the colour
classes indicated by black and white colouring of the nodes. The complete
graph K(5) in Figure 10 is not bipartite.
The bipartite graph that contains all possible edges between its colour
classes is called a complete bipartite graph: K(V1, V2) = (V1 ∪ V2, (V1 × V2) ∪
(V2 × V1)). Figure 12 shows a straightened picture of the houses and
wells graph, which is the complete bipartite graph K(H, W) on the sets of
houses H = {H1, H2, H3} and wells W = {W1, W2, W3}. When the exact
set of nodes is irrelevant, we will denote the complete bipartite graph by
K(m, n), where m, n ∈ N are the sizes of the colour classes. This can be
read as "any graph isomorphic to K(H, W) with m houses and n wells".
The definition of bipartite graphs can be generalised for any fixed number
of colour classes. A graph with k colour classes is called k-partite. The
k-colourability problem consists in determining whether a given graph is k-partite, for a fixed value of k. The 2-colourability problem can be solved
efficiently; however, nobody knows an efficient algorithm for 3-colourability.
In fact, deciding if such an algorithm exists amounts to a solution of the
famous P versus NP problem. A correct solution can bring the author,
apart from worldwide fame, a $1 000 000 prize from the Clay Mathematics
Institute. See www.claymath.org for details.
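The efficient 2-colourability test can be sketched as a breadth-first search that alternates colours along edges and reports failure when an edge joins two nodes of the same colour (our own implementation outline, not from the notes):

```python
from collections import deque

def two_colour(V, adj):
    """Try to 2-colour a graph given by adjacency lists; return the colour
    classes (V1, V2) if the graph is bipartite, or None if it is not."""
    colour = {}
    for start in V:                   # handle every connected component
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in colour:
                    colour[v] = 1 - colour[u]   # opposite colour class
                    queue.append(v)
                elif colour[v] == colour[u]:
                    return None                  # odd cycle: not bipartite
    return ([v for v in V if colour[v] == 0],
            [v for v in V if colour[v] == 1])

# K(3, 3), the houses-and-wells graph, is bipartite; a triangle is not
k33 = {h: ["W1", "W2", "W3"] for h in ["H1", "H2", "H3"]}
k33.update({w: ["H1", "H2", "H3"] for w in ["W1", "W2", "W3"]})
print(two_colour(list(k33), k33) is not None)   # True
```

The search runs in time proportional to the number of nodes plus edges, which is what "solved efficiently" means here; no analogous method is known for three colours.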

7.3 Graph connectivity

In real-life problems represented by graphs, a graph edge often corresponds


to a move from one node to another. Edges play such a role in
our Königsberg Bridges and wolf/goat/cabbage examples. A logical development of this idea is to consider a sequence of moves, which visits several
nodes in turn. The sequence may or may not be required to return to the
starting node, and repeated visits to the same nodes or edges may or may
not be allowed.
Let G = (V, E) be a graph. We begin by defining an unrestricted sequence of moves, which we call a walk. Formally, a walk is a sequence
of nodes (u0, u1, . . . , uk), such that every two consecutive nodes in the
sequence are connected by an edge:
(u0 * u1) ∧ (u1 * u2) ∧ · · · ∧ (uk−1 * uk)
The statement "nodes u and v are connected by a walk" will be denoted
by u # v. Sometimes we will also write u # v to denote a particular walk
from u to v. The above walk from u to v can be written in a compact form:
u = u0 * u1 * u2 * . . . * uk−1 * uk = v
A walk that returns back to the starting node is called a tour.
In Figure 13, the following are examples of a walk and a tour:
0 * 3 * 1 * 4 * 6 * 3 * 0 * 2 * 5
0 * 3 * 1 * 4 * 6 * 3 * 5 * 2 * 0
Nodes u, v in a graph are connected, if there is a walk u # v. A graph
is called connected, if every two of its nodes are connected.
We can regard node connectivity as a relation on the set of all nodes in
a graph: R# : V ↔ V . It is easy to check that this relation is
Figure 13: An example graph


reflexive: ∀u ∈ V : u # u;
symmetric: ∀u, v ∈ V : (u # v) → (v # u);
transitive: ∀u, v, w ∈ V : (u # v) ∧ (v # w) → (u # w).

Therefore, R# is an equivalence relation on V . The equivalence classes of
R# are called connected components of the graph G. A graph is connected,
if and only if it has exactly one connected component.
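The equivalence classes of the connectivity relation can be computed by a traversal from each not-yet-visited node. A depth-first sketch in Python (the adjacency-list encoding is ours):

```python
def components(V, adj):
    """Compute the connected components (equivalence classes of the
    connectivity relation) by depth-first search from each unseen node."""
    seen, comps = set(), []
    for start in V:
        if start in seen:
            continue
        comp, stack = [], [start]    # explore start's component
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(sorted(comp))
    return comps

adj = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3], 5: []}
print(components(range(6), adj))   # [[0, 1, 2], [3, 4], [5]]
```

The graph in this example has three components, so it is not connected; a connected graph would yield a single class containing every node.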
We can restrict the notion of a walk by forbidding repeated visits to the
same node. A walk where all nodes (and, therefore, all edges) are distinct
is called a path. The statement "nodes u and v are connected by a path"
will be denoted by u ∼ v. Sometimes we will also write u ∼ v to denote a
particular path from u to v. The path from u to v can be written as:
u = u0 * u1 * u2 * . . . * uk−1 * uk = v,   where ∀i, j ∈ Nk+1 : i ≠ j → ui ≠ uj
A tour with at least three nodes, where all nodes except the starting and the
final are distinct, is called a cycle. A cycle can be viewed as a path, followed
by an edge connecting the end and the beginning of the path: u ∼ v * u.
A graph without any cycles is called acyclic.
A graph without any cycles is called acyclic.
In Figure 13, the following are examples of a path and a cycle:
0 * 2 * 7 * 10 * 8 * 3 * 5
3 * 8 * 10 * 9 * 4 * 6 * 3
We now have another relation on the set of all nodes in a graph: R∼ :
V ↔ V . It is easy to check that this relation is reflexive and symmetric.
However, the transitivity is not so obvious: a path u ∼ v followed by a path
v ∼ w is not necessarily a path from u to w, since some nodes visited before
v may be re-visited after v. It turns out that R∼ is still transitive. In fact,
we can prove an even stronger result: R∼ is exactly the same relation as
R# .
Theorem 14. Consider a graph G = (V, E). For all u, v ∈ V , there is a
path u ∼ v, if and only if there is a walk u # v.
Proof. A path u ∼ v is also a walk u # v. Therefore, this direction of the
implication is trivial.
The opposite direction of the implication is proved by induction.
Induction base. Consider a walk u # u of length 0. This walk is a
sequence (u), which is also a path.
Inductive step. Consider a walk u # v, obtained by adding an edge to
a shorter walk: u # w * v. Since nodes u and w are connected by a walk,
by the induction hypothesis they are also connected by a path u ∼ w,
giving u ∼ w * v. There are two possible cases:

Figure 14: An example graph


path u ∼ w does not visit node v. Then u ∼ w * v is itself a path.
path u ∼ w visits node v: u ∼ v ∼ w * v. We now have a path
u ∼ v as an initial segment of u ∼ w.
In both cases, the existence of a walk u # v implies the existence of a path
u ∼ v.

By the above theorem, the notions of connectivity by walks and by paths
coincide: a graph is connected if and only if every two of its nodes are
connected by a path.
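The inductive proof of Theorem 14 is constructive: it shortens a walk into a path by cutting out the loop between two visits to the same node. A Python sketch of that shortening (our own code):

```python
def walk_to_path(walk):
    """Shorten a walk (a list of nodes) into a path between the same
    endpoints by cutting out the loop whenever a node is revisited."""
    path, pos = [], {}              # pos[x] = index of x in the current path
    for node in walk:
        if node in pos:
            keep = pos[node] + 1    # cut the loop back to `node`
            for cut in path[keep:]:
                del pos[cut]
            path = path[:keep]
        else:
            pos[node] = len(path)
            path.append(node)
    return path

# the example walk from Figure 13, shortened to a path from 0 to 5
print(walk_to_path([0, 3, 1, 4, 6, 3, 0, 2, 5]))   # [0, 2, 5]
```

Each step of the loop uses only edges already present in the walk, so the result is a path between the original endpoints, as the theorem guarantees.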
Let us now recall the Königsberg Bridges problem. It consists in finding
a tour that visits every edge in a graph exactly once. Such a tour is called
an Euler tour of the graph.
The graph in Figure 14 has the following Euler tour:
a * b * c * f * e * d * c * e * b * f * a
It turns out that the Euler tour problem has a simple solution for any
graph G = (V, E). The solution is based on the following definition. For
any node v ∈ V , its degree is the number of nodes adjacent to it: deg(v) =
|{u ∈ V | v * u}|. For example, in Figure 14:
deg(a) = deg(d) = 2
deg(b) = deg(c) = deg(e) = deg(f ) = 4
We are now able to describe a simple test for existence of the Euler tour
in a graph.
Theorem 15. Consider a graph G = (V, E). Graph G has an Euler tour,
if and only if
G is connected;
every node in V has even degree.


Proof. If G has an Euler tour, then it is connected, since the Euler tour
contains a walk between any pair of nodes. Consider any node v V .
Suppose node v is visited k times by the Euler tour. On every visit, the
tour uses two edges: an incoming and an outgoing edge. Since every edge
adjacent to v is used exactly once, the total number of edges adjacent to v
must be twice the number of visits: deg(v) = 2k. Therefore, deg(v) is even.
The proof of the opposite implication is done in several steps. First, we
build a tour that visits every edge at most once, but may miss some of the
edges. We then show that such a tour can be extended to cover all the edges.
Let G = (V, E) be a connected graph, where each node has even degree.
Let us fix any starting node u ∈ V . Consider any walk u # v, v ≠ u. The
final node of this walk v may have been visited by the walk several times;
on each such visit, the walk uses two edges adjacent to v. However, on the
final visit, only one incoming edge is used. Therefore, the number of visited
edges adjacent to v is odd. Since the total number of adjacent edges deg(v)
is even, there is at least one unvisited edge adjacent to v. Let us add this
edge to the walk: u # v * w. If w ≠ u, we can repeat the previous step,
extending the walk u # w by more edges. Eventually, the walk will return
back to node u.
At this point, we have a tour u # u that visits every edge at most once,
but may not visit some of the edges at all. Suppose there are some unvisited
edges. We now recall that graph G is connected. Consider all nodes in our
tour u # u. If all these nodes had no adjacent unvisited edges, then there
would be no path connecting every one of them to an unvisited edge, and
hence the graph would not be connected. Therefore, some node s in the
tour u # s # u has an adjacent unvisited edge s * t.
Let us now make s the initial node of our tour: s # u # s. The tour
still visits every edge at most once. Let us extend the tour by visiting the
previously unvisited edge s * t: s # u # s * t. As before, the final
node t has an odd number of adjacent visited edges, but the total number of
adjacent edges deg(t) is even. Therefore, there is an unvisited edge adjacent
to t, so we can extend the walk by another edge. As before, we can repeat
this process until the walk returns back to node s. If there still are any
unvisited edges in the graph, we can repeat the whole process once again.
Eventually, the walk will return back to the starting node, having visited all
edges in the graph. We have constructed an Euler tour of the graph G. 
Even though the above proof is longer than our previous proofs, it is less
formal: we use such phrases as "repeat the whole process until eventually it yields an Euler tour". This proof can be completely formalised using
induction.
To illustrate the tour-building procedure outlined in the proof, consider
the graph in Figure 14. Let us take a as the starting node, and begin the
walk by moving along the edge a * b. Node b has now one adjacent visited


edge; since deg(b) is even, it is also guaranteed to have at least one adjacent
unvisited edge. In fact, it has three such edges; let us take the edge b * c.
We now have the walk a * b * c. Node c, in its turn, is guaranteed to have
at least one adjacent unvisited edge, so we can keep extending the walk.
Eventually we will return back to node a. Suppose at this point our walk is
the tour a * b * c * f * a.
Since the graph is connected, and not all edges have been visited, at
least one node in the current tour must have an adjacent unvisited edge.
For instance, let us take node b with the unvisited edge b * f . We can now
make b the starting node in our existing tour, and extend the tour by a new
edge, making it into a walk:
b * c * f * a * b * f
We can keep extending the walk by more edges, until eventually we return
to node b:
b * c * f * a * b * f * e * d * c * e * b
At this point, all edges have been visited, so our current tour is an Euler
tour.
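The tour-building procedure in this proof is known as Hierholzer's algorithm. A Python sketch (our own code; it assumes the input graph is connected, and its stack-based splicing is an equivalent reformulation of the proof's "extend, then restart from a node with unvisited edges" steps):

```python
def euler_tour(adj):
    """Hierholzer's algorithm, following the proof of Theorem 15: grow a
    walk edge by edge, and splice in sub-tours at nodes that still have
    unvisited edges. `adj` maps each node to a list of its neighbours."""
    if any(len(nbrs) % 2 for nbrs in adj.values()):
        return None                      # some node has odd degree
    remaining = {u: list(nbrs) for u, nbrs in adj.items()}
    stack, tour = [next(iter(adj))], []
    while stack:
        u = stack[-1]
        if remaining[u]:                 # extend the walk by an unvisited edge
            v = remaining[u].pop()
            remaining[v].remove(u)
            stack.append(v)
        else:                            # dead end: u closes a sub-tour
            tour.append(stack.pop())
    return tour[::-1]

# the graph of Figure 14: every node has even degree
adj = {"a": ["b", "f"], "b": ["a", "c", "e", "f"],
       "c": ["b", "d", "e", "f"], "d": ["c", "e"],
       "e": ["b", "c", "d", "f"], "f": ["a", "b", "c", "e"]}
tour = euler_tour(adj)
print(tour[0] == tour[-1] and len(tour) == 11)   # True: 10 edges, 11 nodes
```

The resulting tour need not equal the one built by hand above, since it depends on the order in which edges are tried, but it visits every edge exactly once.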
The power of Theorem 15 is in replacing a complex global condition (existence of an Euler tour) by a much simpler global condition (connectivity),
plus a number of very simple local conditions (node degrees). By Theorem 15, the original Königsberg graph in Figure 7 has no Euler tour, since
some nodes (in fact, all nodes representing islands) have odd degree.
Inspired by our success in finding an efficient test for the existence of an
Euler tour, we may want to formulate an analogous (and practically more
important) problem for cycles. It consists in finding a cycle that visits every
node (but not necessarily every edge) in a graph exactly once. Such a cycle
is called a Hamiltonian cycle of the graph.
The graph in Figure 14 has the following Hamiltonian cycle:
a * b * e * d * c * f * a
It turns out that the Hamiltonian cycle problem, despite its similarity
with the Euler tour problem, is hard. In fact, nobody has managed so far
to find an efficient test for existence of a Hamiltonian cycle, or to prove that
no such test exists. The status of this problem is very similar to that of
the colourability problem introduced in the previous subsection. And, like
colourability, the Hamiltonian cycle problem is also worth $1 000 000!
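Lacking an efficient test, one can still search for a Hamiltonian cycle by backtracking, which is exponential in the worst case but fine for small graphs. A Python sketch (our own code, for graphs with at least three nodes):

```python
def hamiltonian_cycle(V, adj):
    """Search for a Hamiltonian cycle by backtracking: extend a path node
    by node, and close it into a cycle once every node has been visited."""
    V = list(V)
    start = V[0]
    path, used = [start], {start}

    def extend():
        if len(path) == len(V):
            return start in adj[path[-1]]     # can we close the cycle?
        for v in adj[path[-1]]:
            if v not in used:
                used.add(v)
                path.append(v)
                if extend():
                    return True
                used.discard(v)               # undo and try the next branch
                path.pop()
        return False

    return path + [start] if extend() else None

# the graph of Figure 14 has a Hamiltonian cycle
adj = {"a": ["b", "f"], "b": ["a", "c", "e", "f"],
       "c": ["b", "d", "e", "f"], "d": ["c", "e"],
       "e": ["b", "c", "d", "f"], "f": ["a", "b", "c", "e"]}
print(hamiltonian_cycle("abcdef", adj))
```

Every known general method is, like this one, exponential in the worst case; finding a polynomial-time test (or proving none exists) is the million-dollar question mentioned above.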

7.4 Trees

We begin this subsection by giving some definitions. Let G = (V, E) and
G′ = (V′, E′) be graphs. Graph G′ is called a subgraph of G (G′ ⊆ G), if



Figure 15: A graph and its subgraph


Figure 16: A graph and its spanning subgraph


V′ ⊆ V , E′ ⊆ E (see Figure 15). Graph G′ is called a spanning subgraph of
G (G′ ⊑ G), if V′ = V , E′ ⊆ E (see Figure 16). Every spanning subgraph
is also a subgraph, but not necessarily vice versa.
Let us denote the set of all graphs on node set V by G(V ). We can view
the notions of subgraphs and spanning subgraphs as relations on G(V ):
R⊆ , R⊑ : G(V ) ↔ G(V ). It is easy to see that the relations R⊆ , R⊑ are
reflexive, antisymmetric, and transitive. Therefore, these two relations are
partial orders on G(V ).
Recall that a graph is called connected, if every two of its nodes are
connected, and acyclic, if it has no cycle as a subgraph. A graph is called
a tree, if it is both connected and acyclic. Figure 17 shows an example of a
tree.
If a graph is acyclic, but not necessarily connected, then each of its connected components is a tree. Because of this, the term forest is often used as
a synonym of acyclic graph.

Figure 17: A tree


Note that a connected graph stays connected if we add some edges to it.
Therefore, a connected graph cannot have "too few" edges. Also note that
an acyclic graph stays acyclic if we remove some edges from it. Therefore,
an acyclic graph cannot have "too many" edges. A tree, being both connected and acyclic, must therefore have some "middling" number of edges.
It turns out that we can specify this number precisely: every tree with a
given number of nodes has the same number of edges.
Theorem 16. Let G = (V, E) be a tree. We have |V | = |E| + 1.
Proof. Induction.
Induction base. Consider graph G with one node and no edges. It is
connected and acyclic, and therefore a tree. We have |E| = 0, |V | = 1 =
|E| + 1.
Inductive step. Let G = (V, E) be any tree with at least one edge u * v.
Let G′ = (V, E \ {(u, v), (v, u)}) be the spanning subgraph of G obtained by
removing the edge u * v.
Consider the connectivity relation R# in the graph G′. A node w ∈ V
cannot be connected both to u and v in G′, otherwise we would have a cycle
u ∼ w ∼ v * u in G. However, every node w must be connected either
to u or to v in G′, otherwise graph G would not be connected. Therefore,
graph G′ has two connected components with node sets Vu = [u] (all nodes
connected to u) and Vv = [v] (all nodes connected to v). Let us denote
these components by Gu = (Vu, Eu) and Gv = (Vv, Ev).
Both Gu and Gv are connected and acyclic, therefore they are trees. By
the inductive hypothesis, we have
|Vu| = |Eu| + 1,   |Vv| = |Ev| + 1
We also have |V | = |Vu | + |Vv | (all nodes in G are nodes in Gu plus nodes
in Gv ), and |E| = |Eu | + |Ev | + 1 (all edges in G are edges in Gu plus edges
in Gv plus the edge u * v). Therefore,
|V| = |Vu| + |Vv| = (|Eu| + 1) + (|Ev| + 1) = (|Eu| + |Ev| + 1) + 1 = |E| + 1
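Theorem 16, together with the standard converse (a connected graph with |V| − 1 edges is necessarily acyclic), gives a quick tree test: check connectivity and count the edges. A Python sketch for a nonempty node set (our own code):

```python
def is_tree(V, edges):
    """Check the tree property via the edge count of Theorem 16: a graph
    is a tree iff it is connected and has exactly |V| - 1 edges."""
    V = list(V)
    if len(edges) != len(V) - 1:
        return False
    adj = {v: [] for v in V}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {V[0]}, [V[0]]         # traversal to test connectivity
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(V)

print(is_tree(range(5), [(0, 1), (0, 2), (2, 3), (2, 4)]))   # True
print(is_tree(range(5), [(0, 1), (1, 2), (2, 0), (3, 4)]))   # False
```

Note that the second example has the right number of edges but is disconnected (and contains a cycle), so the edge count alone is not enough.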
The above theorem does not simply give us an edge count for trees; we
can draw from it some important conclusions on the structure of a tree. Let
us call a node of degree 1 a leaf.
Theorem 17. Every tree with at least one edge has a leaf.
Proof. Let G = (V, E) be a tree with at least one edge. The sum of all
node degrees in any graph is twice the number of edges, since every edge


contributes to the degree of both its ends. Suppose every node in G has
degree at least 2. Then 2 · |E| ≥ 2 · |V|, therefore |E| ≥ |V|. But since G
is a tree, by the previous theorem |E| = |V| − 1. This is a contradiction, so
our assumption must be false, and G has some nodes of degree less than 2.
Since G is connected and has at least one edge, it cannot have any nodes of
degree 0. Therefore, there must be at least one node of degree 1.

To complete our study of trees, we characterise them in terms of the
spanning subgraph relation.
Recall that relation R⊑ is a partial order on the set G(V ) of all graphs
with node set V . This partial order has the least element (V, ∅) (the empty
graph), and the largest element K(V ) (the complete graph). Things become
more interesting if we restrict the relation R⊑ to the set of all connected, or
acyclic, graphs on V .
Theorem 18. Let V be any finite set. Consider the partial order R⊑ on
the set of all connected graphs on V . A graph G = (V, E) is minimal in
this partial order, if and only if it is a tree.
Proof. Since graph G is connected by the condition of the theorem, we need
to prove that G is minimal if and only if it is acyclic. Equivalently, we need
to prove that G has a cycle, if and only if it is not minimal in the partial
order.
Suppose that graph G has a cycle. Let u * v be any edge in the cycle.
Remove edge u * v from the graph. The graph stays connected, since every
walk that passed through the edge u * v can be redirected by the remaining
path u ∼ v. Since the graph stays connected after removing an edge, it is
not minimal.
To prove the opposite implication, suppose that graph G is not minimal
connected. This means that for some u, v V , removing the edge u * v
does not disconnect the graph. It can only happen, if nodes u and v are
connected, apart from the edge u * v, by some path u ∼ v. Therefore,
graph G has a cycle u * v ∼ u.

In short, trees are minimal among connected graphs. This can be viewed
as an alternative definition of a tree, not using the word "acyclic".
Theorem 19. Let V be any finite set. Consider the partial order R⊑ on
the set of all acyclic graphs on V . A graph G = (V, E) is maximal in this
partial order, if and only if it is a tree.
Proof. Since graph G is acyclic by the condition of the theorem, we need
to prove that G is maximal if and only if it is connected. Equivalently, we
need to prove that G is disconnected, if and only if it is not maximal in the
partial order.
Suppose that graph G is disconnected. Let u, v be any two unconnected
nodes. Add the edge u * v to the graph. The graph stays acyclic, since u


and v are not connected by any path apart from the new edge u * v. Since
the graph stays acyclic after adding an edge, it is not maximal.
To prove the opposite implication, suppose that graph G is not maximal
acyclic. This means that for some u, v V , adding the edge u * v does
not create a cycle. It can only happen, if nodes u and v are unconnected.
Therefore, graph G is disconnected.

In short, trees are maximal among acyclic graphs. This can be viewed
as an alternative definition of a tree, not using the word "connected".
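Theorem 19 suggests a way to build a spanning tree of a connected graph: keep adding edges as long as they do not close a cycle, until the acyclic spanning subgraph is maximal. A Python sketch using union-find to detect when an edge would join two already-connected nodes (the union-find bookkeeping is our own implementation choice):

```python
def spanning_tree(V, edges):
    """Grow a maximal acyclic spanning subgraph (Theorem 19): add each
    edge unless it would connect two already-connected nodes, i.e. close
    a cycle. Union-find tracks the connected components."""
    parent = {v: v for v in V}

    def find(v):                       # root of v's component
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    tree = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:                   # different components: no cycle
            parent[ru] = rv
            tree.append((u, v))
    return tree       # a spanning tree whenever the input graph is connected

edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)]
t = spanning_tree(range(5), edges)
print(len(t))   # 4 edges, as Theorem 16 requires for 5 nodes
```

On a disconnected graph the same procedure yields a maximal spanning forest, one tree per connected component.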
