Vasco Brattka
Cape Town
February 11, 2011
Contents
1 Mathematics
1.1 What is Mathematics about?
1.2 What are Proofs?
1.3 Indirect Proofs and the Principle of Excluded Middle
2 Sets
2.1 What is a Set?
2.2 Explicit Definitions of Sets
2.3 Subsets and Comprehension
2.4 Russell's Paradox
2.5 Union and Intersection of Sets
2.6 Difference and Complement of Sets
2.7 Union and Intersection of Indexed Families of Sets
2.8 Power Sets
2.9 Product of Sets
2.10 Disjoint Union of Sets
3 Logic
3.1 What is Logic?
3.2 Propositional Logic
3.3 First-Order Logic
3.4 Correspondence Between Logic and Set Theory
4.9 Infinite Products
5 Cardinality
5.1 What is the Cardinality of a Set?
5.2 The Theorem of Schröder-Bernstein
5.3 Cantor's Diagonalization Method
5.4 The Continuum Hypothesis
5.5 Cantor's Pairing Function
5.6 Induction Principle on Natural Numbers
5.7 Finite and Countable Sets
5.8 Dedekind Infinite Sets
5.9 Cardinality and Set Constructions
6 Order
6.1 What is Order?
6.2 Reflexivity, Symmetry and Transitivity
6.3 Equivalence Relations
6.4 Preorders, Partial Orders and Linear Orders
6.5 Monoids
6.6 Maximum and Minimum
6.7 Supremum and Infimum
Mathematicians
Greek Alphabet
Mathematical Symbols
Index
CHAPTER 1
Mathematics
In my own experience, mathematics in general and pure mathematics
in particular has always seemed like secret gardens, special places where
I could grow exotic and beautiful theories. You need a key to get in, a key
that you earn by letting mathematical structures turn in your head until they
are as real as the room you are sitting in.
David Mumford (Fields Medalist, Brown University)
1.1 What is Mathematics about?
[Figure: areas of mathematics (Algebra, Number Theory, Geometry, Algebraic Geometry, Analysis, Logic, Combinatorics, Theoretical Computer Science, Probability, Mathematical Physics)]
1.2 What are Proofs?
Now, what exactly is this activity of mathematicians about? What is a mathematical proof? Usually, a proof is considered to be a text that convinces the reader of a
certain result by means of rigorous logical reasoning about the underlying definitions
and concepts. But what is rigorous logical reasoning? The truth is that we cannot
present a proper definition of rigorous reasoning and that mathematics is learned
by doing. This is a bit like learning to ride a bicycle. It is very hard to describe in words
what you have to do, but somebody will show you how to do it and eventually you
will manage not to fall. Basically, everybody can learn how to reason logically and
rigorously in the mathematical sense, but it requires some years of practice under
the guidance of other mathematicians to achieve some mastery in this discipline.
So, let us start right away and look at some proofs.
We recall that the natural numbers are exactly the numbers 0, 1, 2, 3, .... We
write
N = {0, 1, 2, 3, ...}
for the set of natural numbers. Strictly speaking, this is not a good definition of
N, since it leaves the dots ... open to interpretation. However, we assume that
the reader has some intuitive understanding of the concept of natural numbers and
hence the definition above is clear enough. For the professional mathematician, the
most important information in this definition is that 0 is considered as a natural
number. Some authors also start with 1 here, but throughout this text we will
consider 0 as a natural number as well.
Now, among the natural numbers we single out the prime numbers as an interesting
subset. We recall the definition.
Definition 1.1 (Prime numbers) A natural number p ≥ 2 is called a prime number if it has no natural number as a divisor other than 1 and p itself. By
P = {p ∈ N : p is a prime number}
we denote the set of all prime numbers.
An easy calculation shows that the first few prime numbers are
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, ...
One obvious question is whether this sequence of numbers ends eventually or whether
it is infinite. Euclid already proved more than 2000 years ago that the set of prime
numbers is infinite. His proof is basically still the same one that we use nowadays. Let
us formulate our first theorem and proof to illustrate the mathematical activity.
Theorem 1.2 (Euclid 300 BC) The set P of prime numbers is not finite.
Proof. Let us assume the contrary, i.e. suppose that there are only finitely many
prime numbers p1, ..., pn for some n ∈ N. We know that there are prime numbers
such as 2 and hence n ≥ 1. That is, the finite set P = {p1, ..., pn} is the set of all
prime numbers and it is not empty. Now consider the product of all these numbers
plus one:
k = p1 · p2 · ... · pn + 1.
This number k > 1 has a prime divisor p and hence p ∈ P. Then p divides the
product p1 · p2 · ... · pn and the number k and hence it also divides 1 = k − p1 · p2 · ... · pn,
which is impossible. This means that we have a contradiction and our assumption
was wrong. Thus, the set P of prime numbers cannot be finite.
□
The little box □ indicates the position in the text where the proof ends. Some
authors use other symbols for this purpose or they write q.e.d. (which stands for
the Latin phrase quod erat demonstrandum, i.e. which was to be demonstrated).
This version of the proof can be found in many textbooks and it is considered to be
a logically rigorous example of a proof and a starting point of number theory.
Despite this fact, the proof raises a number of questions:
1. What does it exactly mean that a set is finite or not finite?
2. What is a set at all?
3. We have proved that the assumption that there are only finitely many prime
numbers leads to a contradiction. Is it admissible to conclude in this indirect
way that there are infinitely many prime numbers?
the course of this text in an appendix. A prime divisor is a divisor that is a prime
number. Now we can formulate the following lemma that closes the most essential
gap in the proof of Theorem 1.2. It shows why the number k in the proof has to
have a prime divisor.
Lemma 1.4 Any natural number n > 1 has a prime divisor p ∈ P.
Proof. Let n > 1 be a natural number. Let us consider the set
D = {d ∈ N : d > 1 and d is a divisor of n}
of all natural numbers d > 1 that divide n. This set is certainly not empty since
n ∈ D and hence there is a minimal number m ∈ D and we write m = min(D). If
this m is a prime number, then we have found the desired prime divisor of n. If m
is not a prime number, then it has a natural number d as divisor other than 1 and
m itself. This d divides m and hence also n, i.e. d ∈ D and d < m = min(D). But
this is a contradiction and hence this second case cannot occur.
□
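The argument of Lemma 1.4 can be tried out on concrete numbers. The following Python sketch is only an illustration and not part of the proof; the function names are our own choice. It computes min(D) by trying d = 2, 3, ... in turn and then checks, for a range of values of n, that this smallest divisor is indeed prime in the sense of Definition 1.1.

```python
def smallest_divisor_greater_than_one(n: int) -> int:
    """Return min(D) for D = {d in N : d > 1 and d divides n}; requires n > 1."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    d = 2
    while n % d != 0:  # stops at d = n at the latest, since n divides n
        d += 1
    return d


def is_prime(p: int) -> bool:
    """Definition 1.1 checked directly: p >= 2 and no divisor other than 1 and p."""
    return p >= 2 and all(p % d != 0 for d in range(2, p))


# The smallest divisor d > 1 of each tested n turns out to be prime.
for n in range(2, 200):
    assert is_prime(smallest_divisor_greater_than_one(n))
```

Such a finite check does not replace the proof, of course, since it covers only finitely many values of n.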
Note that we have proved more than claimed: we have even proved that each
natural number n > 1 has a prime number as smallest divisor d > 1. Once again
one could complain that this proof is not rigorous enough. For instance, we did not
prove that if d divides m and m divides n, then d also divides n. This property
is called transitivity of the divisor relation and we will discuss it in the exercises.
Besides this question and the questions raised above, this second proof that we have
seen provokes a number of further questions:
1. Are we allowed to form arbitrary sets or why can we just build a set like D?
2. What exactly is a minimum and why has D such a minimum?
We will address all these questions more carefully in this course. Below we
formulate a number of exercises that continue our little excursion into number theory.
Here we end with a big open problem.
Conjecture 1.5 (Twin primes) There are infinitely many twin primes, i.e. there
are infinitely many pairs (p, q) of prime numbers p, q ∈ P such that q − p = 2.
Here are the first few twin primes:
(3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43), (59, 61), (71, 73), (101, 103)
It has been conjectured for a very long time that there are infinitely many such twin
primes, but until today nobody has managed to prove this conjecture. In recent years
there has been some partial progress on this matter, but until today (February 2010)
there is no final solution to this problem.
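The list of twin primes given above is easy to reproduce by a direct search. The following Python sketch uses helper names of our own choosing and tests primality naively according to Definition 1.1. It can only ever list finitely many pairs and therefore says nothing about the conjecture itself.

```python
def is_prime(p: int) -> bool:
    """Definition 1.1 checked directly: p >= 2 and no divisor other than 1 and p."""
    return p >= 2 and all(p % d != 0 for d in range(2, p))


def twin_primes_below(limit: int) -> list:
    """All pairs (p, q) of primes with q - p = 2 and q < limit, in increasing order."""
    return [(p, p + 2) for p in range(2, limit - 2)
            if is_prime(p) and is_prime(p + 2)]


print(twin_primes_below(104))
```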
So, after the discussion of a proof that is several thousands of years old, we see
a very similar property as a conjecture that is still unsolved. Hence, the impression
Problems
1.1 Prove that the following properties of the divisor relation hold true for all natural
numbers a, b, d, n, m:
1. n | n (reflexivity)
2. d | m and m | n implies d | n (transitivity)
3. (linearity)
4. (cancellation)
5. 1 | n
6. n | 0
7. 0 | n implies n = 0
8. (antisymmetry)
Identify the places in the proofs of this section where some of these properties have been
implicitly used without further mentioning.
1.3 Indirect Proofs and the Principle of Excluded Middle
The proof of Euclid's result, Theorem 1.2, that the set of prime numbers is not
finite was presented as an indirect proof. An indirect proof is a proof that follows
the following logical pattern:
(¬A ⟹ ⊥) ⟹ A.
Here ⊥ stands for false and the symbol ⊥ is also called falsum. The symbol ⟹
stands for implies and ¬ stands for not. That is, if one uses the above logical
formula in order to prove A, then one first shows that the negation of A implies
something incorrect and then one concludes that this entire implication entails A.
Here one reads ¬A as not A. Why is this pattern of logical reasoning justified? It is
essentially based on the principle of excluded middle which we formulate separately.
Principle 1.6 (Excluded Middle, Aristotle 350 BC) Any well-defined mathematical statement A is either true or false. In particular, the statement (A ∨ ¬A) is
true.
Here ∨ is read as or and the principle says that (A ∨ ¬A) is true. For most
mathematicians the principle of excluded middle is clearly true. For instance, we
believe that the Twin Prime Conjecture 1.5 is either true or false, it is just that
we do not know which alternative is correct. There are some mathematicians who
would argue that the statement of the Twin Prime Conjecture 1.5 is not clearly true
or false. This direction of mathematics is called intuitionism and was essentially
founded by Brouwer. In intuitionistic logic the formula (A ∨ ¬A) is not considered
as correct, since our current knowledge does not suffice to say clearly whether a
statement A such as the Twin Prime Conjecture 1.5 is correct or whether its negation
¬A is correct. That is, intuitionists have an understanding of truth that is time
dependent and something that is not found to be true today might be recognized
as true tomorrow. Most mathematicians rather follow platonism and they believe
that any well-defined mathematical statement is either true or not. By the way,
we will say that a mathematical object or statement is well-defined if its definition
or specification has a clear and unambiguous mathematical interpretation that
actually leads to an object of the specified type. In case of a statement the type
would be a truth value that we can clearly assign.
Directions of Mathematical Philosophy: Platonism versus Intuitionism
1. Platonism: Any well-defined mathematical statement A is either true or false.
2. Intuitionism: For some well-defined mathematical statements A proofs have
been found, i.e. those A are true, others have been disproved, i.e. for them ¬A
is true. For some statements A currently neither A nor ¬A is known.
For intuitionists truth entails knowledge and hence they cannot relate to the
statement that (A ∨ ¬A) holds in cases where neither A nor ¬A is known. We
adopt the platonistic mainstream philosophy of mathematics for this course and we
assume that the Principle of Excluded Middle is correct. Using this principle we can
obtain a justification for indirect proofs.
Proposition 1.7 (Indirect proof) For each well-defined mathematical statement
A the reasoning ((¬A ⟹ ⊥) ⟹ A) is correct.
Proof. Let A be some well-defined mathematical statement for which we can show
(¬A ⟹ ⊥). This means that if A does not hold, then something false follows.
The Principle of Excluded Middle 1.6 tells us that either A or ¬A is correct. Since
something false follows from ¬A, we have no other option but to conclude that A
must be correct.
□
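Under the platonistic assumption that every statement A has exactly one of the two truth values, the pattern of Proposition 1.7 can also be checked mechanically by going through both cases. The following Python sketch does this. Reading ⟹ as the usual truth-functional implication and ⊥ as the value False is an assumption of this illustration (propositional logic is treated properly in Chapter 3), and the function names are our own.

```python
FALSUM = False  # the symbol ⊥, read as the truth value "false"


def implies(a: bool, b: bool) -> bool:
    """Truth-functional implication: a ⟹ b is false only if a is true and b is false."""
    return (not a) or b


def indirect_proof_pattern(a: bool) -> bool:
    """Truth value of ((¬A ⟹ ⊥) ⟹ A) for a given truth value of A."""
    return implies(implies(not a, FALSUM), a)


def excluded_middle(a: bool) -> bool:
    """Truth value of (A ∨ ¬A) for a given truth value of A."""
    return a or (not a)


# Both formulas are true no matter which of the two truth values A has.
for a in (True, False):
    assert indirect_proof_pattern(a)
    assert excluded_middle(a)
```

Note that the check itself relies on the case distinction between True and False, so it illustrates the proposition rather than convincing an intuitionist.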
We formulate the indirect proof method (also called proof by contradiction or
reductio ad absurdum) as a method again.
Problems
1.2 Revisit the proof of Euclid's Theorem 1.2 and show that essentially the same proof with
little modifications proves the following statement:
For any given finite number p1, ..., pn ∈ P of prime numbers with n ≥ 1 there
exists a prime number p ∈ P that is not among the numbers p1, ..., pn, i.e. such
that p ∉ {p1, ..., pn}.
Show that this proof is easily arranged such that it does not use any indirect reasoning!
In fact, this shows that Euclid's Theorem is a constructive theorem and the proof actually
contains an algorithm for computing a further prime number p, given any finite number
p1, ..., pn of prime numbers.
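One possible way to write down the algorithm hidden in the proof is sketched below in Python. It is only an illustration under the assumption that the input list really consists of prime numbers; the function name is our own. The function forms k = p1 · p2 · ... · pn + 1 as in the proof of Theorem 1.2 and returns the smallest divisor d > 1 of k, which is prime by Lemma 1.4.

```python
from math import prod


def further_prime(primes: list) -> int:
    """Given a non-empty list of primes, return a prime that is not in the list."""
    if not primes:
        raise ValueError("at least one prime number is required")
    k = prod(primes) + 1
    d = 2
    while k % d != 0:  # the smallest divisor d > 1 of k is prime by Lemma 1.4
        d += 1
    return d


print(further_prime([2, 3, 5]))              # k = 31 is itself prime
print(further_prime([2, 3, 5, 7, 11, 13]))   # k = 30031 = 59 * 509 is not prime
```

The second call shows that the number k itself need not be prime; only its prime divisors are guaranteed to be new.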
Bibliographic Remarks
We close this chapter with some bibliographic remarks on useful books. There exists a
huge number of textbooks which can be used together with this course. Most of them
complement the course in one way or another. We just mention a few of them.
[1] Martin Aigner and Günter M. Ziegler, Proofs from THE BOOK, 4th edition, Springer,
Berlin, 2009.
[2] Ethan D. Bloch, Proofs and Fundamentals, A First Course in Abstract Mathematics,
Birkhäuser, Boston, 2000.
[3] Mariana Cook, Mathematicians: An Outer View of the Inner World, Princeton University Press, 2009.
[4] Philip J. Davis and Reuben Hersh, The Mathematical Experience, Birkhäuser, Boston,
1981.
[5] Timothy Gowers (editor), The Princeton Companion to Mathematics, Princeton University Press, 2008.
[6] Paul R. Halmos, Naive Set Theory, Springer, New York, 1974.
[7] Kevin Houston, How to Think Like a Mathematician, A Companion to Undergraduate
Mathematics, Cambridge University Press, 2009.
The content of the first textbook by Aigner and Ziegler goes far beyond this course and
it is basically a collection of the gems of mathematics. It is a good companion throughout
the life of any professional mathematician, who will return to this book in order to
learn some of the most beautiful proofs in mathematics. The book by Cook is not a
textbook, but a collection of more than 90 photographic portraits of mathematicians, which
provides the reader with some authentic insights into what mathematicians think and feel about
their work. The book by Davis and Hersh tries to disclose the nature of mathematics and the philosophical grounds on which many mathematicians operate. It also raises
questions about the metaphysical status of truth in mathematics and dominant attitudes
of mathematicians in this respect, such as platonism, formalism and constructivism. The
companion edited by Gowers is an encyclopedic introduction to all areas of mathematics.
The content goes far beyond our course, but it is one of the best available introductions
to mathematics in general. Finally, the textbook by Houston is perhaps the most useful
and affordable companion for the reader of this course.
CHAPTER 2
Sets
No one shall expel us from the paradise that Cantor created for us.
David Hilbert (1862–1943)
2.1 What is a Set?
A set is a Many that allows itself to be thought of as a One.
Georg Cantor (1845–1918)
In the previous section we have already seen several examples of sets, among
them the set of natural numbers N and the set of prime numbers P. Although
mathematics is about rigorous reasoning, we will not present a formal definition
of what a set is here. There is a more rigorous development of set theory, which is
called axiomatic set theory, but this axiomatic approach is too difficult for beginners.
The way we will develop set theory here is called naive set theory, since it is based
on an intuitive understanding of the concept of a set. In other words, although we
want to develop mathematics rigorously, we have to start from somewhere and in
this case the starting point is the naive concept of a set. However, even this naive
concept has a number of features that we can make more precise:
Informal definition of a set
1. A set S is a well-defined collection of mathematical objects x.
2. The members x of a set are called elements. If x is an element of the set S,
then we write x ∈ S. Otherwise, we write x ∉ S.
3. Two sets S1 and S2 are equal if and only if they contain exactly the same
elements. If S1 and S2 are equal, then we write S1 = S2. Otherwise, we write
S1 ≠ S2.
There is a particular set ∅, which is called the empty set and it does not contain
any elements. Sometimes one writes ∅ = {}. We give a few further examples of sets
that are commonly used. The sets N and P have already been mentioned and used
in the previous chapter.
Some useful sets of numbers
1. N := {0, 1, 2, ...}, the set of natural numbers,
2. Nn := {1, ..., n}, the set of natural numbers from 1 to n for n ∈ N,
3. P, the set of prime numbers,
4. Z := {..., −2, −1, 0, 1, 2, ...}, the set of integers,
5. Q, the set of rational numbers,
6. A, the set of algebraic numbers,
7. R, the set of real numbers,
8. C, the set of complex numbers.
We do not define the integers, rational numbers, algebraic numbers, real numbers
and complex numbers precisely here. We assume that the reader has seen these sets
of numbers before and we leave a precise treatment to a later stage. We just want
to name some commonly used sets here in order to have some examples. We close
by emphasizing two important properties that members in a set do not have. They
have neither a position nor multiple appearances.
Multiplicity and Order
1. Multiplicity of an object x in a set S is not considered. Either x is an element
of S or not. No element can have multiple instances within a set.
2. Order of elements in a set does not play any role. That is, a member of a set
has no particular position within the set.
Later we will see the concept of an indexed family, where the position of elements
plays a role and one and the same element can also appear several times in different
positions. In certain special areas of mathematics also multisets are considered,
which are sets where multiplicity occurs, although order plays no role. Such multisets
have already been considered by Dedekind, but we will not use them here.
2.2 Explicit Definitions of Sets
In general, the curly brackets { and } are used to specify a particular set.
This can be done in at least two different ways, either by listing the elements explicitly or by comprehension. Only finite sets can be specified by listing their elements
explicitly.¹ For instance
{2, 7, 2010, 4, 2}
is a finite set with 4 elements. The particular listing of elements used to specify
the set names the elements in a particular order and some elements might even be
repeated in this list. Nevertheless, neither the order nor the repetition matters for
the set that is defined in this way, as pointed out before. So,
{2, 7, 2010, 4, 2} = {2010, 7, 4, 2}.
This is simply because we agreed that two sets are equal if and only if they contain
exactly the same elements. That is, order and multiplicity are features of the list that
specifies the set, but not properties of the set itself. Also, the naming of elements
can be done in very different ways. For instance, the following two singletons are
identical:
{1729} = {the smallest number expressible as the sum of two cubes in two different ways}.
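Both equalities of this section can be checked with a few lines of Python, whose built-in sets also ignore order and repetition. The sketch below is only an illustration: it reads "sum of two cubes" as a sum of two positive cubes, which is an assumption about the intended meaning, and the function names are our own.

```python
# Order and repetition in the listing do not matter for the set itself.
assert {2, 7, 2010, 4, 2} == {2010, 7, 4, 2}
assert len({2, 7, 2010, 4, 2}) == 4


def two_cube_sums(n: int) -> list:
    """All pairs (a, b) with 1 <= a <= b and a**3 + b**3 == n."""
    pairs = []
    a = 1
    while 2 * a**3 <= n:
        b = a
        while a**3 + b**3 < n:
            b += 1
        if a**3 + b**3 == n:
            pairs.append((a, b))
        a += 1
    return pairs


def smallest_two_way_cube_sum(search_limit: int):
    """Smallest n < search_limit that is a sum of two positive cubes in two ways, or None."""
    for n in range(1, search_limit):
        if len(two_cube_sums(n)) >= 2:
            return n
    return None


print(two_cube_sums(1729))              # the two representations of 1729
print(smallest_two_way_cube_sum(2000))
```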
A singleton is a set with exactly one element. In this case it is not so easy to
recognize that the two sets are actually identical. This requires some knowledge
about cubes and numbers and also some agreements. For instance, there is an
implicit agreement that the number 1729 is understood as base 10 expansion (this
is usually the case if not mentioned otherwise). The text in the set on the right
hand side is not read as a sequence of symbols, but as a mathematical definition
of the uniquely identified number, which is an element of this set. However, such
definitions are only acceptable if they have some clear mathematical interpretation.
See Problem 2.1 for a problematic example. Sometimes even concretely specified
sets are not so easy to understand. Here is an example.
Example 2.1 Let T = {the largest pair of twin primes (p, q)}. If the Twin Prime
Conjecture 1.5 is correct, then there is no largest pair (p, q) of twin primes and
¹ We made only one exception: we also defined the set of natural numbers N = {0, 1, 2, 3, ...}
by indicating an infinite list of elements.
2.3 Subsets and Comprehension
While the above sets are formed by an explicit listing of their objects, a more common
method is to specify a set by comprehension. Comprehension usually means that a
set is formed by specifying a subset of a given set using some property. An example
that we have already seen is
P := {p ∈ N : p is a prime number}.
Here the given set is the set N of natural numbers and we single out a subset P of it
by specifying which elements of N are members of this subset. So, the way to read
the above definition is that P is defined to be the set of those natural numbers p ∈ N
that have the property that they are prime. Some authors also write this set as
{p ∈ N | p is a prime number},
i.e. with a | instead of a :. In both cases : and | are read as "such that".
The symbol := is read as "is defined to be equal to". Let us now more formally
capture what a subset is in general.
Definition 2.2 (Subset) Let S be a set. We say that T is a subset of S if all
elements of T are also elements of S. If T is a subset of S, then we write T ⊆ S.
Otherwise, we write T ⊈ S.
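Comprehension and the subset relation have direct counterparts in many programming languages. The following Python sketch is an illustration only. Since a computer cannot hold all of N, it works with the finite set of natural numbers below 30, which is an assumption of this example, and the names N_30, P_30 and is_subset are our own.

```python
# The given set: the natural numbers below 30 as a finite stand-in for N.
N_30 = set(range(30))

# Comprehension: single out those p in N_30 that are prime (Definition 1.1).
P_30 = {p for p in N_30 if p >= 2 and all(p % d != 0 for d in range(2, p))}


def is_subset(T: set, S: set) -> bool:
    """Definition 2.2: T is a subset of S if all elements of T are elements of S."""
    return all(x in S for x in T)


assert is_subset(P_30, N_30)        # P_30 ⊆ N_30
assert not is_subset(N_30, P_30)    # N_30 ⊈ P_30, since for instance 0 is not prime
assert is_subset(set(), P_30)       # the empty set is a subset of every set
assert is_subset(P_30, N_30) == (P_30 <= N_30)  # agrees with Python's built-in test
print(sorted(P_30))
```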
Problems
2.1 Discuss the definition of the following set:
S = {the smallest natural number which cannot be defined with less than 100 symbols}.
Is this set S well-defined? If yes, can we determine the member of this set? If not, why not?
2.2 Prove that for any three sets R, S, T the following statements hold true:
1. S ⊆ S (reflexivity)
2. R ⊆ S and S ⊆ T implies R ⊆ T (transitivity)
2.3 Find out which of the following statements are correct!
1. {},
2. {},
3. {} ,
4. {} {N}.
2.4 Russell's Paradox
In formal logic, a contradiction is the signal of defeat,
but in the evolution of real knowledge it marks the
first step in progress toward a victory.
Alfred North Whitehead (1861–1947)
We have already seen that there are some limitations on how a set can be listed
explicitly. Namely, the listing has to be mathematically well-defined. Now we will see
another type of limitation that shows why comprehension has to be used carefully
as well. In the early years of set theory it was already recognised that there are sets
that lead to serious problems. One such construction is called Russell's paradox and
we present it as an example here.
Example 2.8 (Russell's paradox 1901) We consider the set
S = {X : X ∉ X}
of all sets X that do not contain themselves as an element. Now the question is whether
S is an element of itself. On the one hand, if S ∈ S, then S, by definition, has the
property that S ∉ S. This is a contradiction! On the other hand, if S ∉ S, then S,
by definition, has the property that S ∈ S. This is also a contradiction! Altogether,
we obtain
S ∈ S ⟺ S ∉ S.
This statement is clearly not correct, hence the existence of the set S leads to a
contradiction!
Cantor had discovered similar antinomies, but he did not publish them. The
way out of the problem of Russell's paradox is just to declare the formation of sets
such as S as illegal. We have to apply some restrictions with regard to which sets
we can actually build. The discovery of this paradox has led to the development of
axiomatic set theory, a discipline which explains in great formal detail which sets
can be formed and which sets cannot be formed. Essentially, the situation is as
follows.
Admissible constructions of sets
1. We have an empty set and an infinite set such as the set N of natural numbers.
2. We can form finite sets by explicit specification of their elements.
3. We can form subsets of already constructed sets by comprehension (using some
property that characterizes the elements in the subset).
4. We can apply certain well-defined operations to sets in order to form new sets
out of given sets. These operations are the union of sets and the power set
construction.
We will specify in subsequent sections what union and power set construction
exactly mean. The essential point is that the set in Russell's paradox has not been
built by any of these admissible tools. The condition X ∉ X could be considered as
a property, but in order to use comprehension to form the set S of Russell's paradox
one would first need the set U of all sets X and this set does not exist (something
that we will prove below exactly with the argument of Russell's paradox).
In some approaches to axiomatic set theory, the collection U of all sets is considered as a proper class, which is something like a set of second order. Already Cantor
considered such classes as the way to avoid antinomies. Then the class S of all sets,
which are not members of themselves can be formed, but the question of whether
S is a member of itself does not make any sense, since S contains only sets and not
proper classes. Outside of set theory, the term class is sometimes also used as a
synonym for sets.
One should note that a set can very well be a member of another set. For instance
the set {∅, N} is a set with exactly two elements: the empty set ∅ and the set of
natural numbers N. And in fact, Russell's paradox can be turned into a useful proof
of the fact that there is no set that contains all sets.
Proposition 2.9 (No universal set) There is no set U that contains all sets S.
Proof. Let U be some set. Now we consider the set
S := {X ∈ U : X ∉ X}
of all sets in U that are not members of themselves. This set S is well-defined by
comprehension. Now, the assumption S ∈ S implies S ∉ S. This is a contradiction.
Hence S ∉ S. But this implies S ∉ U, since otherwise S ∈ U and S ∉ S would yield
S ∈ S by the definition of S. So, no matter how we choose the set U to
start with, we can always construct a set S that is not a member of U. Hence no
set U can contain all sets S.
□
Despite other claims, this result was already known and proved by Cantor in
1899 before Russell presented his paradox. However, Cantor did not publish his
result, but he only reported it in letters to David Hilbert and Richard Dedekind.
Cantor's original proof was different from the proof presented here and we
will come back to his proof at a later stage (see Problem 5.5).
In computability theory, a branch of mathematical logic, one can use the idea
of Russell's paradox also in a constructive way to define sets such as the halting
problem or the self-applicability problem, which have been studied, for instance, by
Alan Turing and Kurt Gödel. These sets exhibit some interesting behaviour and
they play a crucial role in computability theory. It is not unusual in mathematics
that some paradox or contradiction has eventually been turned into a useful result.
In relation to Russell's paradox one can ask the question whether there can be
any set S with S ∈ S at all. Indeed, the axioms of formal set theory do not allow
the construction of such a set S. A set S is called well-founded, if an infinite chain
2.5 Union and Intersection of Sets
In this section we want to study operations that allow us to construct new sets from
given sets, in particular we will look at the union and the intersection of sets.
Definition 2.11 (Union and intersection) Let X and Y be sets.
1. We define the union X ∪ Y of X and Y by
X ∪ Y := {x : x ∈ X or x ∈ Y}.
2. We define the intersection X ∩ Y of X and Y by
X ∩ Y := {x : x ∈ X and x ∈ Y}.
Thus, the union X ∪ Y is the collection of all elements from X and Y together,
i.e. it is the set of all elements which are in X or in Y. The intersection X ∩ Y is
the collection of all elements that are simultaneously in both sets X and Y, i.e. it is
the set of all elements x which are in X and also in Y.
We note that the union is formally not a special case of comprehension, since
we do not define X ∪ Y as a subset of some given set U, but we only create this
set U by forming the union. In contrast to this, intersection can be considered as a
special case of comprehension, since we can prove the following lemma.
Lemma 2.12 For any two sets X and Y we have X ∩ Y = {x ∈ X : x ∈ Y}.
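Definition 2.11 and Lemma 2.12 can be illustrated with Python's built-in sets, where | denotes union and & denotes intersection. The two example sets and the range of test values below are arbitrary choices made for this sketch.

```python
X = {1, 2, 3, 4}
Y = {3, 4, 5}

union = X | Y          # X ∪ Y
intersection = X & Y   # X ∩ Y

# Membership in the union and the intersection follows Definition 2.11.
for x in range(10):
    assert (x in union) == (x in X or x in Y)
    assert (x in intersection) == (x in X and x in Y)

# Lemma 2.12: the intersection is obtained from X by comprehension.
assert intersection == {x for x in X if x in Y}

print(sorted(union), sorted(intersection))
```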
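The diagram in Figure 2.2 illustrates the intersection X ∩ Y and the union X ∪ Y
of two sets X and Y.
[Figure 2.2: the intersection X ∩ Y and the union X ∪ Y of two sets X and Y]
2. X ∪ Y = Y ∪ X and X ∩ Y = Y ∩ X, (commutativity)
3. Z ∪ (X ∪ Y) = (Z ∪ X) ∪ Y and
Z ∩ (X ∩ Y) = (Z ∩ X) ∩ Y, (associativity)
4. Z ∪ (X ∩ Y) = (Z ∪ X) ∩ (Z ∪ Y) and
Z ∩ (X ∪ Y) = (Z ∩ X) ∪ (Z ∩ Y). (distributivity)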
Proof.
1. If x ∈ X ∩ Y, then x ∈ X and x ∈ Y. Hence, in particular, x ∈ X. This proves
the first statement. If x ∈ X, then x ∈ X or x ∈ Y. Hence, x ∈ X ∪ Y. This
proves the second statement.
2. and 3. are left to the reader (see Problem 2.4).
4. We prove only the first equality and leave the second one to the reader (see
Problem 2.4). In order to prove the first equality, we convince ourselves that
x ∈ Z ∪ (X ∩ Y)
⟺ x ∈ Z or x ∈ X ∩ Y
⟺ x ∈ Z or (x ∈ X and x ∈ Y)
⟺ (x ∈ Z or x ∈ X) and (x ∈ Z or x ∈ Y)
⟺ x ∈ (Z ∪ X) ∩ (Z ∪ Y).
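For small finite sets, both distributive laws can also be confirmed by simply trying all cases. The Python sketch below checks them for all triples of subsets of a three-element set; the choice of this universe and the function names are our own. Such an exhaustive test is no substitute for the proof above, which works for arbitrary sets.

```python
from itertools import combinations, product


def all_subsets(universe) -> list:
    """All subsets of a finite set, represented as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def distributivity_holds(universe) -> bool:
    """Check both distributive laws for all triples X, Y, Z of subsets of universe."""
    subsets = all_subsets(universe)
    return all(
        Z | (X & Y) == (Z | X) & (Z | Y) and Z & (X | Y) == (Z & X) | (Z & Y)
        for X, Y, Z in product(subsets, repeat=3)
    )


assert distributivity_holds({0, 1, 2})  # 8 subsets, hence 512 triples
```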
An analogous definition holds for the union and intersection of any finite number
of sets in general. Now we prove a result on inclusion and its interaction with union
and intersection.
Proposition 2.15 (Inclusion, union and intersection) Let X, Y and Z be sets.
Then the following hold:
1. (X ⊆ Z and Y ⊆ Z) if and only if (X ∪ Y) ⊆ Z.
2. (Z ⊆ X and Z ⊆ Y) if and only if Z ⊆ (X ∩ Y).
Proof.
1. We prove both directions of the equivalence separately.
⟹ Let X ⊆ Z and Y ⊆ Z. We have to prove (X ∪ Y) ⊆ Z. Thus, let
x ∈ X ∪ Y. That is, x ∈ X or x ∈ Y. In the first case, it follows that x ∈ Z
since X ⊆ Z and in the second case it also follows that x ∈ Z since Y ⊆ Z. Thus,
in any case x ∈ Z, which was to be proved.
⟸ Now suppose (X ∪ Y) ⊆ Z. We have to prove X ⊆ Z and Y ⊆ Z. Let
x ∈ X. Then x ∈ X ∪ Y and hence x ∈ Z. For the second part, let x ∈ Y.
Then x ∈ X ∪ Y and x ∈ Z follows, which was to be proved.
2. We leave this proof to the reader (see Problem 2.5).
□
If two sets have no elements in common, then they are called disjoint.
Definition 2.16 (Disjoint) Let X and Y be sets. If X ∩ Y = ∅, then X and Y
are called disjoint.
The diagram in Figure 2.3 illustrates two disjoint sets.
Problems
2.4 Prove the remaining statements from Proposition 2.14. Let $X$, $Y$ and $Z$ be sets. Then
the following properties hold:
1. $X \cup Y = Y \cup X$ and $X \cap Y = Y \cap X$ (commutativity)
2. $Z \cup (X \cup Y) = (Z \cup X) \cup Y$ and
$Z \cap (X \cap Y) = (Z \cap X) \cap Y$ (associativity)
3. $Z \cap (X \cup Y) = (Z \cap X) \cup (Z \cap Y)$ (distributivity)
2.6
Difference and Complement of Sets
In this section we discuss another method to create new sets from given sets, by
considering the difference of sets.
Definition 2.17 (Difference) Let $X$ and $Y$ be sets. We define the difference $X \setminus Y$
of $X$ and $Y$ by
$$X \setminus Y := \{x : x \in X \text{ and } x \notin Y\}.$$
Some authors also write $X - Y$ instead of $X \setminus Y$. Figure 2.4 illustrates the
difference of two sets $X$ and $Y$.
[Figure 2.4: Venn diagram of the difference $X \setminus Y$.]
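Example 2.18 The following statements hold true (try to prove them!):
1. $\{1, 3, 5\} \setminus \{2, 3, 4\} = \{1, 5\}$,
2. $2\mathbb{N} \setminus 3\mathbb{N} = 2\mathbb{N} \setminus 6\mathbb{N}$,
3. $2\mathbb{N} \setminus \mathbb{P} = 2\mathbb{N} \setminus \{2\}$,
4. $\mathbb{N} \setminus 2\mathbb{N} = \{2k + 1 : k \in \mathbb{N}\}$.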
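The four statements of Example 2.18 can be sanity-checked on a finite initial segment of the natural numbers. The following Python sketch is only an illustration and not a proof: it assumes that $\mathbb{N}$ contains 0, that $\mathbb{P}$ denotes the set of prime numbers, and it cuts all sets off at a hypothetical bound `LIMIT` chosen for this example.

```python
# Finite sanity check of Example 2.18 (an illustration, not a proof).
LIMIT = 200  # hypothetical cut-off, chosen only for this sketch

N = set(range(LIMIT))  # the natural numbers 0, 1, ..., LIMIT - 1


def multiples(k):
    """The set kN = {k * n : n in N}, restricted to numbers below LIMIT."""
    return {n for n in N if n % k == 0}


def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    return n >= 2 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))


P = {n for n in N if is_prime(n)}  # prime numbers below LIMIT

check_1 = {1, 3, 5} - {2, 3, 4} == {1, 5}
check_2 = multiples(2) - multiples(3) == multiples(2) - multiples(6)
check_3 = multiples(2) - P == multiples(2) - {2}
check_4 = N - multiples(2) == {2 * k + 1 for k in N if 2 * k + 1 < LIMIT}

print(check_1, check_2, check_3, check_4)
```

Python's `-` on sets plays the role of the set difference $\setminus$ here. A finite check of this kind can reveal a wrong conjecture, but it cannot replace the proofs asked for in the example.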
In the following proposition we collect some useful basic properties of the set
difference.
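Proposition 2.19 (Difference) Let $X$ and $Y$ be sets. Then
1. $X \setminus Y \subseteq X$,
2. $(X \setminus Y) \cap Y = \emptyset$,
3. $(X \setminus Y) \cup Y = X \cup Y$.
Proof.
1. Let $x \in X \setminus Y$. Then $x \in X$ and $x \notin Y$. In particular, $x \in X$.
2. Let $x \in (X \setminus Y) \cap Y$. Then $x \in X \setminus Y$ and $x \in Y$. But $x \in X \setminus Y$ means $x \in X$
and $x \notin Y$. Hence, $x \in Y$ and $x \notin Y$, which is a contradiction. Thus, there is
no $x \in (X \setminus Y) \cap Y$ and hence $(X \setminus Y) \cap Y = \emptyset$.
3. We prove both inclusions separately.
"$\subseteq$" Let $x \in (X \setminus Y) \cup Y$. Then $x \in X \setminus Y$ or $x \in Y$. This means ($x \in X$ and
$x \notin Y$) or $x \in Y$. Altogether, $x \in X$ or $x \in Y$, i.e. $x \in X \cup Y$.
"$\supseteq$" Let $x \in X \cup Y$. Then $x \in X$ or $x \in Y$. Now we make a case distinction.
Case 1: $x \in Y$. In this case "$x \in X \setminus Y$ or $x \in Y$" is certainly correct.
Case 2: $x \notin Y$. In this case "$x \in X$ and $x \notin Y$" is correct. Hence "$x \in X \setminus Y$ or
$x \in Y$" is correct.
In both cases we obtain $x \in (X \setminus Y) \cup Y$.
□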
De Morgan's laws capture what happens if we subtract a union or an intersection of sets from another set. Basically, unions are turned into intersections and
intersections into unions in this case, as expressed more precisely in the following
proposition.
Proposition 2.20 (De Morgan's Laws) Let $X$, $Y$ and $Z$ be sets. Then
1. $Z \setminus (X \cup Y) = (Z \setminus X) \cap (Z \setminus Y)$,
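$$\begin{aligned}
& x \in X \text{ and } x \notin Y \setminus Z \\
\iff\ & x \in X \text{ and } (x \notin Y \text{ or } x \in Z) \\
\iff\ & (x \in X \text{ and } x \notin Y) \text{ or } (x \in X \text{ and } x \in Z) \\
\iff\ & x \in (X \setminus Y) \cup (X \cap Z).
\end{aligned}$$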
Problems
2.8 Let $X$, $Y$ and $Z$ be sets. Prove that $Z \setminus (X \cap Y) = (Z \setminus X) \cup (Z \setminus Y)$.
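2.9 Find sets $X$, $Y$ and $Z$ such that $(Z \setminus Y) \subseteq (Z \setminus X)$ and $X \not\subseteq Y$.
2.10 Let $X$ and $Y$ be sets with $X \subseteq Y$. Prove that the following statements are pairwise
equivalent to each other:
1. $X \subsetneq Y$,
2. $Y \not\subseteq X$,
3. $Y \setminus X \neq \emptyset$.
2.11 We consider double differences of sets.
1. Prove that $(X \setminus Y) \setminus Z = X \setminus (Y \cup Z)$ for all sets $X$, $Y$ and $Z$.
2. Prove that $(X \setminus Y) \setminus Z \subseteq X \setminus (Y \setminus Z)$ for all sets $X$, $Y$ and $Z$.
3. Show that there are sets $X$, $Y$ and $Z$ such that $(X \setminus Y) \setminus Z \not\supseteq X \setminus (Y \setminus Z)$.
2.12 Let $X$ and $Y$ be subsets of a fixed set $Z$. Prove that $X \setminus Y = X \cap Y^c$, where the
complement is taken with respect to $Z$.
2.13 Let $X$ and $Y$ be sets. The symmetric difference $X \Delta Y$ of $X$ and $Y$ is defined by
$$X \Delta Y := (X \setminus Y) \cup (Y \setminus X).$$
Let $X$, $Y$ and $Z$ be sets. Prove that the following holds:
1. $X \Delta \emptyset = X$,
2. $X \Delta X = \emptyset$,
3. $X \Delta Y = Y \Delta X$, (commutative)
4. $X \Delta (Y \Delta Z) = (X \Delta Y) \Delta Z$, (associative)
5. $X \Delta Y = (X \cup Y) \setminus (X \cap Y)$.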
2.7
Union and Intersection of Indexed Families of Sets
Often we want to work with the union and intersection of infinitely many sets and
not just of finitely many sets. Usually this is done by considering indexed families of
sets. If $I$ is a nonempty set and there is a set $X_i$ given for each $i \in I$, then $(X_i)_{i \in I}$
is called an indexed family of sets over $I$. Some authors also write $\{X_i\}_{i \in I}$ for an
indexed family of sets, but this is an unfortunate notation since the curly brackets
$\{$ and $\}$ are overloaded in this way with a different meaning, hence we will only
use round brackets in order to denote indexed families. Now we can define union
and intersection for indexed families.
Definition 2.27 (Union and intersection for indexed families) Let $I$ be a nonempty set and let $(X_i)_{i \in I}$ and $(Y_i)_{i \in I}$ be indexed families of sets over $I$. We define:
1. $\bigcup_{i \in I} X_i := \{x : (\exists i \in I)\; x \in X_i\}$,
2. $\bigcap_{i \in I} X_i := \{x : (\forall i \in I)\; x \in X_i\}$.
Here "$(\exists i \in I)$" is read as "there exists an $i$ in $I$ such that" and "$(\forall i \in I)$" is
read as "for all $i$ in $I$ it holds that". In some sense, the existential quantifier $\exists$ can
actually be read like a big "or" operation $\bigvee$ and the universal quantifier $\forall$ can be
read like a big "and" operation $\bigwedge$. Indeed, some authors write $\bigvee$ instead of $\exists$ and
$\bigwedge$ instead of $\forall$. Having this in mind it is easy to see that the union and intersection
for indexed families of sets actually generalise the union and intersection for two
sets (see Problem 2.14). The reader should also note that intersection could be
considered as a special case of comprehension again, whereas union yields a genuine
new type of set.
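The reading of $\exists$ as a big "or" and of $\forall$ as a big "and" can be made concrete for finite families. The sketch below is a hedged illustration: it represents an indexed family as a Python dictionary (an assumption of this example), and the names `family_union`, `family_intersection` and `universe` are hypothetical helpers, not notation from the text.

```python
def family_union(family):
    """Union of an indexed family given as a dict {i: X_i}."""
    return set().union(*family.values())


def family_intersection(family):
    """Intersection of a family over a nonempty index set."""
    sets = list(family.values())
    return set(sets[0]).intersection(*sets[1:])


# X_1 = {1,2,3,4}, X_2 = {2,3,4,5}, X_3 = {3,4,5,6}
family = {i: set(range(i, i + 4)) for i in (1, 2, 3)}
universe = set(range(10))  # finite stand-in for "all x" in this sketch

# "exists i" becomes any(...), "for all i" becomes all(...)
union_by_exists = {x for x in universe if any(x in X for X in family.values())}
inter_by_forall = {x for x in universe if all(x in X for X in family.values())}

print(family_union(family) == union_by_exists)
print(family_intersection(family) == inter_by_forall)
```

The built-in functions `any` and `all` are exactly the finite "big or" and "big and"; the requirement that the index set is nonempty shows up in `family_intersection`, which would fail on an empty dictionary.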
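In the special case that the index set $I = \mathbb{N}$ is the set of natural numbers, one
also uses the following notations:
$$\bigcup_{i=0}^{\infty} X_i := \bigcup_{i \in \mathbb{N}} X_i \quad\text{and}\quad \bigcap_{i=0}^{\infty} X_i := \bigcap_{i \in \mathbb{N}} X_i.$$
Similarly, for a finite index set of the form $I = \{n, n+1, \ldots, n+k\}$ one writes
$$\bigcup_{i=n}^{n+k} X_i := \bigcup_{i \in I} X_i \quad\text{and}\quad \bigcap_{i=n}^{n+k} X_i := \bigcap_{i \in I} X_i.$$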
Note that these notations can also be typeset inline like $\bigcup_{i=0}^{\infty} X_i$ and $\bigcup_{i=n}^{n+k} X_i$ with
indexes written at the side. We give some examples of sets formed with union and
intersection over the natural numbers (or a subset of natural numbers).
Problems
2.14 Let $X_1$ and $X_2$ be sets. Prove that the following hold:
1. $X_1 \cup X_2 = \bigcup_{i=1}^{2} X_i$,
2. $X_1 \cap X_2 = \bigcap_{i=1}^{2} X_i$.
$$\mathbb{P} = \mathbb{N} \setminus \Big(\{1\} \cup \bigcup_{k=2}^{\infty} (k\mathbb{N} \setminus \{k\})\Big).$$
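2.16 Let $I$ be a nonempty index set with nonempty subsets $J$ and $K$. Let $(X_i)_{i \in I}$ be an
indexed family of sets. Prove that the following holds:
1. $J \subseteq K \Longrightarrow \bigcup_{i \in J} X_i \subseteq \bigcup_{i \in K} X_i$,
2. $\bigcap_{i \in J} X_i \cap \bigcap_{i \in K} X_i = \bigcap_{i \in J \cup K} X_i$.
2.17 Let $I$ be a nonempty index set and let $(X_i)_{i \in I}$ be an indexed family of sets and let
$Y$ be another set. Prove that the following hold:
1. $(\forall k \in I)\; X_k \subseteq \bigcup_{i \in I} X_i$,
2. $(\forall k \in I)\; X_k \subseteq Y \iff \bigcup_{i \in I} X_i \subseteq Y$.
2.18 Let $I$ be a nonempty index set. Let $X$ be a set and $(Y_i)_{i \in I}$ an indexed family of sets.
Prove that:
1. $X \cap \bigcup_{i \in I} Y_i = \bigcup_{i \in I} (X \cap Y_i)$,
2. $X \setminus \bigcap_{i \in I} Y_i = \bigcup_{i \in I} (X \setminus Y_i)$.
2.19 Let $I$ be a nonempty index set. Let $Y$ be a set and $(X_i)_{i \in I}$ an indexed family of sets.
Prove that
1. $\big(\bigcup_{i \in I} X_i\big) \setminus Y = \bigcup_{i \in I} (X_i \setminus Y)$,
2. $\big(\bigcap_{i \in I} X_i\big) \setminus Y = \bigcap_{i \in I} (X_i \setminus Y)$.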
2.8
Power Sets
Besides the method of comprehension we have already seen that the union of sets
(or of an indexed family of sets) is a way to define new sets from given sets.
Another very important set theoretical construction that cannot be subsumed under
comprehension is the power set construction. Given a set $X$ the power set $2^X$ is the
set of all subsets of $X$, which is much larger than $X$ itself.
Definition 2.34 (Power set) Let $X$ be a set. Then
$$2^X := \{Y : Y \subseteq X\}$$
is called the power set of $X$.
Some authors also write $\mathcal{P}(X)$ instead of $2^X$. The power set $2^X$ of any set $X$ is
always nonempty, since $\emptyset \in 2^X$. We mention some examples.
Example 2.35 We obtain the following (try to verify these examples!):
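2. $2^{(2^{\emptyset})} = 2^{\{\emptyset\}} = \{\emptyset, \{\emptyset\}\}$,
3. $2^{(2^{(2^{\emptyset})})}$
(monotonicity)
2. $2^X \cap 2^Y = 2^{X \cap Y}$,
3. $2^X \cup 2^Y \subseteq 2^{X \cup Y}$.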
Proof.
1. We prove both implications separately.
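"$\Longrightarrow$" Let $X \subseteq Y$. We have to prove $2^X \subseteq 2^Y$. Let $A \in 2^X$. This means
$A \subseteq X$. Since $X \subseteq Y$, we get $A \subseteq Y$ by transitivity of the inclusion relation
(see Problem 2.2). This means $A \in 2^Y$.
"$\Longleftarrow$" Now let $2^X \subseteq 2^Y$. We have to prove $X \subseteq Y$. Let $x \in X$. Then
$\{x\} \subseteq X$, i.e. $\{x\} \in 2^X$ and hence $\{x\} \in 2^Y$ since $2^X \subseteq 2^Y$. But this means
$\{x\} \subseteq Y$ and hence $x \in Y$.
2. We prove both inclusions separately.
"$\subseteq$" Let $A \in 2^X \cap 2^Y$. Then $A \in 2^X$ and $A \in 2^Y$. That is $A \subseteq X$ and $A \subseteq Y$.
By Proposition 2.15 this implies $A \subseteq X \cap Y$, i.e. $A \in 2^{X \cap Y}$.
"$\supseteq$" Let $A \in 2^{X \cap Y}$. Then $A \subseteq X \cap Y$, which implies by Proposition 2.15 that
$A \subseteq X$ and $A \subseteq Y$. Hence $A \in 2^X$ and $A \in 2^Y$, i.e. $A \in 2^X \cap 2^Y$.
3. This proof is left to the reader (see Problem 2.20).
□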
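For small finite sets the power set and the statements about it can be computed directly. The following Python sketch is an illustration under assumptions: the function name `power_set` and the sample sets `X` and `Y` are hypothetical, and subsets are modelled as `frozenset` objects because ordinary Python sets cannot be elements of other sets.

```python
from itertools import chain, combinations


def power_set(X):
    """Return the set of all subsets of X as a set of frozensets."""
    elems = list(X)
    subsets = chain.from_iterable(
        combinations(elems, r) for r in range(len(elems) + 1)
    )
    return {frozenset(s) for s in subsets}


X = {1, 2}
Y = {2, 3}

# The power set of the intersection equals the intersection of the power sets.
intersection_law = power_set(X) & power_set(Y) == power_set(X & Y)

# For unions only one inclusion holds in general.
union_inclusion = power_set(X) | power_set(Y) <= power_set(X | Y)
witness = frozenset({1, 3})  # subset of X | Y, but subset of neither X nor Y
inclusion_is_strict = (
    witness in power_set(X | Y)
    and witness not in (power_set(X) | power_set(Y))
)

print(len(power_set(X)), intersection_law, union_inclusion, inclusion_is_strict)
```

The witness $\{1, 3\}$ shows concretely why the inverse inclusion for unions fails for this particular choice of sets: it is a subset of the union without being a subset of either set.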
We note that the inverse inclusion of the last statement (3) does not hold true in
general (see Problem 2.20). We close this section by introducing another notation
that is commonly used in mathematics in order to denote unions and intersections
and that is best expressed using the power set.
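Definition 2.37 (Union and intersection for sets of subsets) Let $X$ be a set
and let $\mathcal{S} \subseteq 2^X$. Then we define
1. $\bigcup \mathcal{S} := \bigcup_{S \in \mathcal{S}} S$,
2. $\bigcap \mathcal{S} := \bigcap_{S \in \mathcal{S}} S$.
In order to make this more precise, we consider $\mathcal{S}$ here as an index set. Then we
can define an indexed family of sets $(X_S)_{S \in \mathcal{S}}$ where $X_S := S$ and we obtain
$$\bigcup \mathcal{S} = \bigcup_{S \in \mathcal{S}} S = \bigcup_{S \in \mathcal{S}} X_S \quad\text{and}\quad \bigcap \mathcal{S} = \bigcap_{S \in \mathcal{S}} S = \bigcap_{S \in \mathcal{S}} X_S.$$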
This is the precise interpretation of the definition above and it shows that this is
not a new concept. It is just another way of writing the union and intersection of
indexed families of sets in a way that is sometimes more convenient.
Problems
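2.20 Let $X$ and $Y$ be sets. Prove that
1. $2^X \cup 2^Y \subseteq 2^{X \cup Y}$,
2. $2^X \cup 2^Y = 2^{X \cup Y} \iff X \subseteq Y$ or $Y \subseteq X$.
2.21 Let $(X_i)_{i \in I}$ be an indexed family of sets. Prove that
1. $2^{\bigcap_{i \in I} X_i} = \bigcap_{i \in I} 2^{X_i}$,
2. $2^{\bigcup_{i \in I} X_i} \supseteq \bigcup_{i \in I} 2^{X_i}$.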
2.9
Product of Sets
When we discussed the twin prime conjecture, we have already spoken about pairs
(p, q) of prime numbers. The essential idea of a pair (p, q) is that it is ordered,
i.e. it matters in which position p and q appear, respectively. This distinguishes an
ordered pair from a set $\{p, q\}$. We could leave the definition of a pair intuitive, but
it is also relatively simple to define a pair more precisely using sets. This idea of
formalizing pairs goes back to Kuratowski.
Definition 2.38 (Kuratowski pair) Let $X$ be a set with $x, y \in X$. Then we
define the Kuratowski pair, or for short the pair $(x, y)$, as follows:
$$(x, y) := \{\{x\}, \{x, y\}\}.$$
Essentially, this definition of a pair is not really used in practice in mathematics,
but only the following property of pairs is of importance. That is, as soon as you
have understood the following proposition and its proof, you can forget the previous
definition.
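The Kuratowski definition can be tried out with nested `frozenset` objects. The sketch below is a hedged illustration: the function name `kpair` and the small test domain are assumptions of this example. It checks, by brute force on that domain, the well-known characteristic property that two pairs are equal exactly if their first components agree and their second components agree.

```python
from itertools import product


def kpair(x, y):
    """Kuratowski pair (x, y) := {{x}, {x, y}}, modelled with frozensets."""
    return frozenset({frozenset({x}), frozenset({x, y})})


domain = [0, 1, 2]  # hypothetical small test domain

# Pairs are equal if and only if they agree componentwise.
property_holds = all(
    (kpair(a, b) == kpair(c, d)) == (a == c and b == d)
    for a, b, c, d in product(domain, repeat=4)
)

print(kpair(1, 2) != kpair(2, 1))  # order matters, unlike for the set {1, 2}
print(kpair(1, 1) == frozenset({frozenset({1})}))  # (x, x) collapses to {{x}}
print(property_holds)
```

The second printed line shows a detail that is easy to overlook: for $x = y$ the set $\{\{x\}, \{x, y\}\}$ has only one element, and the characteristic property still holds.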
(monotonicity)
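3. $X \times (Y \cup Z) = (X \times Y) \cup (X \times Z)$, (distributivity)
4. $X \times (Y \cap Z) = (X \times Y) \cap (X \times Z)$, (distributivity)
5. $(X \times Y) \cap (W \times Z) = (X \cap W) \times (Y \cap Z)$,
6. $(X \times Y) \cup (W \times Z) \subseteq (X \cup W) \times (Y \cup Z)$.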
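Proof.
1., 2. and 3. are left to the reader (see Problem 2.22).
4. We prove both inclusions separately.
"$\subseteq$" Let $(x, y) \in X \times (Y \cap Z)$. Then $x \in X$ and $y \in Y \cap Z$. The latter means
$y \in Y$ and $y \in Z$. Hence $(x, y) \in X \times Y$ and $(x, y) \in X \times Z$, which means
$(x, y) \in (X \times Y) \cap (X \times Z)$.
"$\supseteq$" Let $a \in (X \times Y) \cap (X \times Z)$. Then $a \in (X \times Y)$ and $a \in (X \times Z)$. This means
that there are $x \in X$, $y \in Y$ and $z \in Z$ such that $a = (x, y)$ and $a = (x, z)$.
This implies $y = z$. In particular, $y \in Y \cap Z$ and thus $a = (x, y) \in X \times (Y \cap Z)$.
5. We prove both inclusions separately.
"$\subseteq$" Let $a \in (X \times Y) \cap (W \times Z)$. Then $a \in (X \times Y)$ and $a \in (W \times Z)$. Hence
there are $x \in X$, $y \in Y$, $w \in W$ and $z \in Z$ such that $a = (x, y)$ and $a = (w, z)$.
This implies $x = w$ and $y = z$ and hence $x \in X \cap W$ and $y \in Y \cap Z$. Thus
$a = (x, y) \in (X \cap W) \times (Y \cap Z)$.
"$\supseteq$" Let now $(x, y) \in (X \cap W) \times (Y \cap Z)$. Then $x \in X \cap W$ and $y \in Y \cap Z$, i.e.
$x \in X$ and $x \in W$ and $y \in Y$ and $y \in Z$. Thus $(x, y) \in (X \times Y) \cap (W \times Z)$.
6. This proof is left to the reader (see Problem 2.22).
□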
We point out that the inverse inclusion in 6. does not hold true in general (see
Problem 2.22). The diagram in Figure 2.5 illustrates the products $X \times Y$ and
$W \times Z$ in a coordinate system (this is not a Venn diagram!). The first components
of pairs are illustrated on the horizontal axis whereas the second components are
illustrated on the vertical axis. One can see why the intersection is a product
(= rectangle) itself and why the union is not. However, this does not constitute a
formal proof.
[Figure 2.5: the products $X \times Y$ and $W \times Z$ drawn as rectangles in a coordinate system.]
The definition of pairs can easily be generalised to higher arities. By the arity
we mean the number $n$ of components in a tuple $(x_1, x_2, \ldots, x_n)$. For instance, we
could define triples by $(x_1, x_2, x_3) := (x_1, (x_2, x_3))$ and then we could prove that
$(x_1, x_2, x_3) = (y_1, y_2, y_3)$ if and only if $x_1 = y_1$ and $x_2 = y_2$ and $x_3 = y_3$. We will not
work this out formally here, but we will take an intuitive understanding of $n$-tuples
$(x_1, \ldots, x_n)$ for an $n \in \mathbb{N}$ from now on. That is, we assume
$$(x_1, \ldots, x_n) = (y_1, \ldots, y_n) \iff (\forall i \in \{1, \ldots, n\})\; x_i = y_i.$$
By the way, $n$-tuples are called pairs, triples, quadruples and quintuples for $n = 2, 3, 4$
and 5, respectively. There is also a tuple $()$ of arity 0, which is sometimes called
the empty tuple or empty word. We do not distinguish between tuples of arity 1 and
their only component, i.e. $(x) = x$. Using tuples of arbitrary arity $n$ we can now
also generalise the Cartesian product to higher arities.
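Definition 2.43 (Generalised Cartesian product) Let $X_1, \ldots, X_n$ be sets with
$n \in \mathbb{N}$. Then we define
$$\bigtimes_{i=1}^{n} X_i := X_1 \times \ldots \times X_n := \{(x_1, \ldots, x_n) : (\forall i \in \{1, \ldots, n\})\; x_i \in X_i\}.$$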
Later on, we can even further generalize this definition to products over families
of sets, but we first need to define what an infinite (or indexed) tuple is for this
purpose and we do not have such a definition at hand yet. An important special
case of the previous definition is the situation where all the sets $X_i$ are the same set.
In this case we simply write
$$X^n := \bigtimes_{i=1}^{n} X = \underbrace{X \times \ldots \times X}_{n \text{ times}}$$
and call this the $n$-fold product of the set $X$ with itself. We also allow the special
case $n = 0$ here, in which case we obtain a singleton $X^0 = \{()\}$ with the empty
tuple.
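Finite Cartesian products, including the $n$-fold product and the special case $n = 0$, can be enumerated with the standard library function `itertools.product`. The sample sets `X` and `Y` below are assumptions made for this sketch.

```python
from itertools import product

X = {0, 1}
Y = {"a", "b", "c"}

X_times_Y = set(product(X, Y))        # all pairs (x, y) with x in X and y in Y
X_cubed = set(product(X, repeat=3))   # the 3-fold product of X with itself
X_zero = set(product(X, repeat=0))    # the 0-fold product: only the empty tuple

print(len(X_times_Y))   # number of elements of X times Y
print(len(X_cubed))     # number of elements of the 3-fold product
print(X_zero == {()})   # singleton containing the empty tuple
```

One difference to the convention in the text should be kept in mind: Python distinguishes the 1-tuple `(x,)` from its only component `x`, whereas the text identifies $(x)$ with $x$.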
Problems
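2.22 Let $W$, $X$, $Y$ and $Z$ be sets. Prove that the following holds:
1. $X \times \emptyset = \emptyset \times X = \emptyset$,
2. $X \subseteq Y$ and $W \subseteq Z \Longrightarrow X \times W \subseteq Y \times Z$,
3. $X \times (Y \cup Z) = (X \times Y) \cup (X \times Z)$, (distributivity)
4. $(X \times Y) \cup (W \times Z) \subseteq (X \cup W) \times (Y \cup Z)$.
5. Prove that there are sets $X$, $Y$, $Z$, $W$ such that the inverse inclusion in the previous
statement does not hold.
2.23 Let $X$ be a set, $I$ a nonempty set and $(Y_i)_{i \in I}$ an indexed family of sets over $I$. Prove
that
1. $X \times \bigcup_{i \in I} Y_i = \bigcup_{i \in I} (X \times Y_i)$,
2. $X \times \bigcap_{i \in I} Y_i = \bigcap_{i \in I} (X \times Y_i)$.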
2.10
Disjoint Union of Sets
Sometimes one would like to define a union of two sets $X$, $Y$ such that one can keep
track of which set the elements actually originate from. This is important, in
particular, when $X$ and $Y$ have nonempty intersection.
Definition 2.44 (Disjoint union) Let $X$ and $Y$ be sets. Then we define the disjoint union by
$$X \sqcup Y := (\{1\} \times X) \cup (\{2\} \times Y).$$
Sometimes the disjoint union is also denoted by $X + Y$ or by $X \uplus Y$ and sometimes
it is called discriminated union or tagged union. Here the numbers 1 and 2 in the
first component are used like a label that indicates from which set, either $X$ or $Y$, the
elements originate. If one takes the ordinary union $X \cup Y$ of two sets $X$ and
$Y$ that are not disjoint, i.e. such that $X \cap Y \neq \emptyset$, then the information whether an
element of $X \cup Y$ originates from $X$ or $Y$ (or both) is lost in the set $X \cup Y$. Properties
of the disjoint union can easily be derived from properties of the ordinary union and
the set product and we are not going to study such properties here. We just mention
that the disjoint union can be generalized to families of sets.
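The effect of the labels 1 and 2 can be seen on a small example. In the following Python sketch the function name `disjoint_union` and the sample sets are assumptions of this illustration; the construction itself follows Definition 2.44.

```python
def disjoint_union(X, Y):
    """Tag elements of X with 1 and elements of Y with 2, then unite."""
    return {(1, x) for x in X} | {(2, y) for y in Y}


X = {"a", "b"}
Y = {"b", "c"}

ordinary = X | Y                 # the common element "b" appears only once
tagged = disjoint_union(X, Y)    # "b" appears twice, once per origin

print(len(ordinary), len(tagged))
print((1, "b") in tagged and (2, "b") in tagged)
```

The ordinary union of these two sets has three elements, whereas the disjoint union has four, since the common element is kept once with each label.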
Definition 2.45 (Disjoint union of a family of sets) Let $(X_i)_{i \in I}$ be an indexed
family of sets. Then we define the disjoint union of this family by
$$\bigsqcup_{i \in I} X_i := \bigcup_{i \in I} (\{i\} \times X_i).$$
The operation $*$ on sets is also called Kleene star operation. Strictly speaking,
any element of $X^*$ has the form $(i, x_1, \ldots, x_i)$ for some $i \in \mathbb{N}$. This includes the case
$i = 0$ with the element $(0)$. One usually only writes $(x_1, \ldots, x_i)$ for the element $(i, x_1, \ldots, x_i)$ in this situation with
the understanding that $i$ is defined implicitly by the number of arguments in the
tuple $(x_1, \ldots, x_i)$. In this abbreviated notation the element $(0)$ becomes the empty tuple $()$. The Kleene star
operation has many applications also in computer science, where it is used to describe
regular languages. We give some examples of how it can be used in mathematics.
Example 2.47 Here are some examples.
1. We want to create a set $E \subseteq \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}^*$ that contains all decimal
expansions of natural numbers without leading zeros. That is
$$E := \{(n_1, \ldots, n_k) \in \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}^* : k = 1 \text{ or } (k > 1 \text{ and } n_1 \neq 0)\}.$$
2. We want to create a set $D \subseteq \mathbb{N}^*$ that contains chains of numbers that divide
each other. That is
$$D := \{(n_1, \ldots, n_k) \in \mathbb{N}^* : k \geq 2 \text{ and } (\forall i \in \{2, \ldots, k\})\; n_{i-1} \mid n_i\}.$$
That is, $(2, 4)$ and $(3, 6, 12, 36)$ are examples of elements in $D$.
We have used the simplified way to denote elements of sets of finite words that has
been described above.
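Membership in the two sets of Example 2.47 can be tested mechanically if finite words are modelled as Python tuples. This is a sketch under assumptions: the function names `in_E`, `in_D` and `divides` are hypothetical, words use the abbreviated notation without the leading length label, and divisibility on the natural numbers is read as "there is a $c$ with $a \cdot c = b$", so that 0 divides only 0.

```python
def divides(a, b):
    """a | b on the natural numbers: there is a c with a * c = b."""
    return b == 0 if a == 0 else b % a == 0


def in_E(word):
    """Decimal expansions without leading zeros, as tuples of digits."""
    k = len(word)
    digits_ok = all(d in range(10) for d in word)
    return digits_ok and (k == 1 or (k > 1 and word[0] != 0))


def in_D(word):
    """Chains of length at least 2 in which each entry divides the next."""
    k = len(word)
    return k >= 2 and all(divides(word[i - 1], word[i]) for i in range(1, k))


print(in_E((0,)), in_E((1, 0, 7)), in_E((0, 7)), in_E(()))
print(in_D((2, 4)), in_D((3, 6, 12, 36)), in_D((2, 3)), in_D((5,)))
```

The condition "$k = 1$ or ($k > 1$ and $n_1 \neq 0$)" excludes the empty word as well as every word of length at least two that starts with 0, which is visible in the first printed line.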
CHAPTER 3
Logic
Logic is the anatomy of thought.
John Locke (1632-1704)
3.1
What is Logic?
Since ancient times logic was mostly considered as the art of proper and systematic
reasoning. Aristotle's work on analytics (as he called what we call logic nowadays)
was considered for a long time as the major work in logic and for almost 2000
years there was not much progress in this discipline. This changed radically at the
end of the 19th century when logic became an active field of research again within
mathematics. Nowadays logic is a rich subfield of mathematics that has many subdisciplines of its own, such as model theory, proof theory and computability theory.
There are many applications of particular branches of logic in other disciplines such
as computer science and philosophy, but also within algebra, analysis or other mathematical areas. Within mathematics logic is also the major foundational subdiscipline
which undertakes a reflection about mathematics with mathematical methods and
this is what has been called metamathematics. Gödel's results have spectacularly
contributed to the understanding of the limitations of mathematics and perhaps also
the scientific method in general and they are part of the jewels that 20th century
mathematics has produced.
The purpose of this section is neither to introduce any particular knowledge in
logic nor to introduce the subject as a foundational discipline. We will rather take
a naive approach to logic (similar as with set theory) and we will try to highlight the
relevance of logic as it is used on a day-to-day basis by any working mathematician.
Essentially, we will just look at how we have used logic so far and we will emphasize
and collect the rules of logical reasoning that we have already used.
3.2
Propositional Logic
Propositional logic is the part of logic that deals with the logical combination of
mathematical propositions without considering any particular mathematical objects.
Informal definition of a proposition
A proposition is a well-defined mathematical statement that is either true or false.
We will typically denote propositions by variables using letters $A, B, C, \ldots$. The
truth values "true" and "false" are sometimes denoted by "t" and "f". We will
denote them by 1 (for "true") and 0 (for "false"). If we have two propositions $A$
and $B$, which both are either true or false, then we have altogether $2^2 = 4$ different
possibilities of assigning truth values to the pair $(A, B)$ and correspondingly we
have $2^4 = 16$ different binary logical operations that we can consider. We will only
consider a small subset of these and we will define them in the following definition
via a truth table.
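Definition 3.1 (Logical operations) Let $A$ and $B$ be propositions. Then we
define the logical operations of negation $\neg A$, conjunction $A \wedge B$, disjunction $A \vee B$,
implication $A \Longrightarrow B$ and equivalence $A \iff B$ via the following table of truth
values:
$$\begin{array}{cc|ccccc}
A & B & \neg A & A \wedge B & A \vee B & A \Longrightarrow B & A \iff B \\ \hline
0 & 0 & 1 & 0 & 0 & 1 & 1 \\
0 & 1 & 1 & 0 & 1 & 1 & 0 \\
1 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 1 & 1 & 1 & 1
\end{array}$$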
The symbols $\bot$ and $\top$ can be read as constant logical operations with truth
values 0 and 1, respectively. The way to read this table is that it tells us what we
actually mean when we say "$A$ and $B$" is true: we mean that $A$ and $B$ both have the
truth value 1. Similarly, "$A$ or $B$" is true means that at least one (possibly both)
of the propositions $A$ and $B$ has the truth value 1. This should be distinguished
from the "exclusive or" that we occasionally mean when we say "or" in our daily
language and that excludes the option that both $A$ and $B$ are true. The exclusive
or operation is sometimes denoted by $A + B$ or $A \oplus B$ and it is true if and only
if exactly one of $A$ and $B$ is true. However, we will not further use this operation
here and hence we have not included it in the table above. Note that by definition
$A \Longrightarrow B$ is considered as true if and only if the statement "if $A$ is true, then $B$
is true" is true. By definition this is always correct, if $A$ is not true (no matter
what the truth value of $B$ is). That means that implications are closely related to
disjunctions and we capture this relation in the following proposition.
5. $\top$ is a tautology since it is true and there are no propositional variables involved.
How can we actually find out whether a logical formula is a tautology or not?
This can be done with the truth table method that we illustrate in the proof of the
next proposition. We collect a number of tautologies that involve implications.
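Proposition 3.4 (Implication) Let $A$, $B$ and $C$ be propositions. Then the following are tautologies:
1. $((A \Longrightarrow B) \wedge A) \Longrightarrow B$ (modus ponens)
2. $(A \Longrightarrow B) \iff (\neg A \vee B)$ (implication and disjunction)
3. $((A \Longrightarrow B) \wedge (B \Longrightarrow C)) \Longrightarrow (A \Longrightarrow C)$ (hypothetical syllogism)
4. $(A \Longrightarrow B) \iff (\neg B \Longrightarrow \neg A)$ (contraposition law)
5. $(A \iff B) \iff ((A \Longrightarrow B) \wedge (B \Longrightarrow A))$ (equivalence)
6. $((A \wedge B) \Longrightarrow C) \iff (A \Longrightarrow (B \Longrightarrow C))$ (currying)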
Proof. We only prove 2. and leave the other proofs to the reader (see Problem 3.1).
We prove that the given logical formula is a tautology by systematically writing
down its truth table:
$$\begin{array}{cc|cccc}
A & B & \neg A & (\neg A \vee B) & A \Longrightarrow B & (A \Longrightarrow B) \iff (\neg A \vee B) \\ \hline
0 & 0 & 1 & 1 & 1 & 1 \\
0 & 1 & 1 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 & 0 & 1 \\
1 & 1 & 0 & 1 & 1 & 1
\end{array}$$
The last column of this table indicates the truth value of the entire logical formula
$(A \Longrightarrow B) \iff (\neg A \vee B)$ depending on the truth values of $A$ and $B$ in the first
two columns. We see that the last column always carries the truth value 1 for "true",
irrespective of the truth values of $A$ and $B$. Hence the formula is a tautology. □
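The truth table method is mechanical and can therefore be delegated to a short program. The following Python sketch is one possible way to do this and not the only one: the names `IMPLIES`, `implies` and `is_tautology` are hypothetical, and the implication is deliberately stored as an explicit truth table (copied from Definition 3.1) so that the check of tautology 2. is not circular.

```python
from itertools import product

# Truth table of the implication from Definition 3.1, stored explicitly.
IMPLIES = {
    (False, False): True,
    (False, True): True,
    (True, False): False,
    (True, True): True,
}


def implies(a, b):
    return IMPLIES[(a, b)]


def is_tautology(formula, n):
    """Evaluate a formula in n variables on all 2**n rows of its truth table."""
    return all(formula(*row) for row in product([False, True], repeat=n))


def modus_ponens(a, b):
    return implies(implies(a, b) and a, b)


def implication_as_disjunction(a, b):
    return implies(a, b) == ((not a) or b)


def hypothetical_syllogism(a, b, c):
    return implies(implies(a, b) and implies(b, c), implies(a, c))


def contraposition(a, b):
    return implies(a, b) == implies(not b, not a)


def currying(a, b, c):
    return implies(a and b, c) == implies(a, implies(b, c))


def converse(a, b):
    # (A => B) <=> (B => A) is NOT a tautology; included as a negative example.
    return implies(a, b) == implies(b, a)


print(is_tautology(modus_ponens, 2), is_tautology(implication_as_disjunction, 2))
print(is_tautology(hypothetical_syllogism, 3), is_tautology(currying, 3))
print(is_tautology(contraposition, 2), is_tautology(converse, 2))
```

The call `product([False, True], repeat=n)` generates exactly the rows of the truth table, so the running time doubles with every additional propositional variable.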
In fact, the tautology 2. is sometimes exploited in order to prove implications
and we capture this as a proof method.
Proof Method (Implications as disjunctions)
In order to prove $A \Longrightarrow B$ for two well-defined mathematical statements $A$ and $B$
it is sufficient (and, in fact, logically equivalent) to prove $\neg A \vee B$.
In the previous proposition we have carefully used parentheses to indicate in
which order the logical operations are to be applied. Sometimes, parentheses are
omitted and the operations are ordered in the following priority list:
$$\iff,\ \Longrightarrow,\ \vee,\ \wedge,\ \neg$$
with increasing priority. That is, a logical formula such as
$$\neg A \vee B \iff C \wedge D$$
would be read as
$$((\neg A) \vee B) \iff (C \wedge D).$$
However, in cases of doubt it is better to use more rather than fewer parentheses. In the
following result we collect some very common other tautologies.
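Proposition 3.5 (Tautologies) Let $A$, $B$ and $C$ be propositions. Then the following are tautologies:
1. $((A \wedge B) \wedge C) \iff (A \wedge (B \wedge C))$ (associativity)
2. $((A \vee B) \vee C) \iff (A \vee (B \vee C))$
3. $(A \wedge B) \iff (B \wedge A)$ (commutativity)
4. $(A \vee B) \iff (B \vee A)$
5. $(A \wedge (B \vee C)) \iff ((A \wedge B) \vee (A \wedge C))$ (distributivity)
8. $\neg(A \vee B) \iff (\neg A \wedge \neg B)$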
Proof. One can use the truth table method to prove that all these formulas are
tautologies. We work number 5. out as an example and leave the rest to the reader
(see Problem 3.2). We denote the entire formula $(A \wedge (B \vee C)) \iff ((A \wedge B) \vee (A \wedge C))$
by $F$.
$$\begin{array}{ccc|cccccc}
A & B & C & B \vee C & (A \wedge (B \vee C)) & A \wedge B & A \wedge C & ((A \wedge B) \vee (A \wedge C)) & F \\ \hline
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 1 \\
0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
1 & 0 & 1 & 1 & 1 & 0 & 1 & 1 & 1 \\
1 & 1 & 0 & 1 & 1 & 1 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1
\end{array}$$
The last column of this table indicates the truth value of the entire logical formula
$F$, i.e. $(A \wedge (B \vee C)) \iff ((A \wedge B) \vee (A \wedge C))$, depending on the truth values of
$A$, $B$ and $C$ in the first three columns. We see that the last column always carries
the truth value 1 for "true", irrespective of the truth values of $A$, $B$ and $C$. Hence
the formula is a tautology.
□
The previous proof shows that checking whether a logical formula is a tautology
or not becomes increasingly time consuming as more propositional variables
are involved. Roughly speaking, the truth table method requires $2^n$ computational
steps (i.e. rows in the table) if the formula involves $n$ propositional variables
$A_1, \ldots, A_n$.
This observation is related to one of the challenging big open problems of mathematics, the so-called P-NP problem. Here P stands for the set of problems that
can be decided in polynomial time and NP stands for the set of problems that can
be verified in polynomial time. We cannot make the definitions of these sets precise
here, this would be subject of a course on computational complexity theory, but we
state that the big open problem is whether these two sets are equal or not. While
it can be proved easily that $\mathrm{P} \subseteq \mathrm{NP}$, it is not known whether the inverse inclusion
holds or not. The majority of experts believe that the inverse inclusion does not
hold. That is, we have the following conjecture (which has been open more or less since the
late 1960s).
Conjecture 3.6 (P-NP Problem) $\mathrm{P} \subsetneq \mathrm{NP}$.
This problem is among those few big open mathematical problems for which
the Clay Mathematics Institute offers one million US dollars to anybody who solves
the problem successfully in either way (i.e. by proving or disproving the conjecture
and by publishing the result properly). For the proof that indeed $\mathrm{P} \subsetneq \mathrm{NP}$ holds, it
would be sufficient to show that there is no significantly more efficient way to check
whether a given formula is a tautology or not than the truth table method discussed
above. For the proof that $\mathrm{P} = \mathrm{NP}$, it would be sufficient to provide a significantly
more efficient algorithm (that is one that does not require $2^n$ many steps, but rather
roughly $n^2$, $n^3$ or $n^k$ many steps for some fixed $k \in \mathbb{N}$).
Problems
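3.1 Let $A$, $B$ and $C$ be propositions. Prove that the following logical formulas are tautologies:
1. $((A \Longrightarrow B) \wedge A) \Longrightarrow B$ (modus ponens)
2. $((A \Longrightarrow B) \wedge (B \Longrightarrow C)) \Longrightarrow (A \Longrightarrow C)$ (hypothetical syllogism)
3. $(A \Longrightarrow B) \iff (\neg B \Longrightarrow \neg A)$ (contraposition law)
4. $(A \iff B) \iff ((A \Longrightarrow B) \wedge (B \Longrightarrow A))$ (equivalence)
5. $((A \wedge B) \Longrightarrow C) \iff (A \Longrightarrow (B \Longrightarrow C))$ (currying)
3.2 Let $A$, $B$ and $C$ be propositions. Prove that the following logical formulas are tautologies:
1. $((A \wedge B) \wedge C) \iff (A \wedge (B \wedge C))$ (associativity)
2. $((A \vee B) \vee C) \iff (A \vee (B \vee C))$
3. $(A \wedge B) \iff (B \wedge A)$ (commutativity)
4. $(A \vee B) \iff (B \vee A)$
5. $(A \vee (B \wedge C)) \iff ((A \vee B) \wedge (A \vee C))$ (distributivity)
6. $\neg(A \wedge B) \iff (\neg A \vee \neg B)$ (de Morgan's laws)
7. $\neg(A \vee B) \iff (\neg A \wedge \neg B)$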
3.3
First-Order Logic
Roughly speaking, first-order logic is an extension of logic where one does not only
consider mathematical propositions and their truth values, but also such propositions that depend on certain mathematical objects. Besides the ordinary logical
operations discussed in the previous section, first-order formulas also involve quantifications over such objects using universal and existential quantifiers. We start
with an example.
Example 3.7 The following first-order formula expresses the fact that $p \in \mathbb{N}$ is a
prime number:
$$p \geq 2 \wedge (\forall n \in \mathbb{N})(n \mid p \Longrightarrow (n = 1 \vee n = p)).$$
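The formula of Example 3.7 can be translated almost literally into a program. The sketch below is an illustration under assumptions: the names `divides` and `is_prime_formula` are hypothetical, and the unbounded quantifier over $\mathbb{N}$ is replaced by a bounded one, which is harmless here because every divisor $n$ of a number $p \geq 2$ satisfies $n \leq p$.

```python
def divides(n, p):
    """n | p on the natural numbers: there is a k with n * k = p."""
    return p == 0 if n == 0 else p % n == 0


def is_prime_formula(p):
    """p >= 2 and (for all n)(n | p  =>  (n = 1 or n = p)).

    The implication is written as "not premise or conclusion", and the
    quantifier ranges over 0, ..., p only, since larger n cannot divide p.
    """
    return p >= 2 and all(
        (not divides(n, p)) or (n == 1 or n == p) for n in range(p + 1)
    )


print([p for p in range(30) if is_prime_formula(p)])
```

The universal quantifier turns into `all(...)`, and the implication inside it is evaluated through the equivalent disjunction "not $n \mid p$, or $n = 1$, or $n = p$".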
4. $(\forall x \in X)(E \vee G(x)) \iff (E \vee (\forall x \in X)G(x))$
5. $\neg(\forall x \in X)F(x) \iff (\exists x \in X)\neg F(x)$
(quantifier order)
like a big "or" $\bigvee$. Correspondingly, some
authors write $\bigwedge_{x \in X}$ instead of $(\forall x \in X)$ and $\bigvee_{x \in X}$ instead of $(\exists x \in X)$. It is easy
to see that the exportation does not work if conjunctions are used with existential
quantifiers or if disjunctions are used with universal quantifiers (see Problem 3.3).
If one of the involved formulas does not involve the variable over which one quantifies, then one can export quantifiers also with incompatible logical operations, as
specified under "free quantifier exportation" above. Just as general quantifier
exportation is not valid for incompatible logical connectives, the order of quantifiers of different
type cannot be changed in general (see Problem 3.3).
In the section on propositional logic we have illustrated a simple method that
can be used to find out whether a given propositional formula is a tautology or
not. This truth table method was indicated as inefficient, but at least in principle
it is applicable to any formula whatsoever and it yields a clear result following the
specified algorithm. Unfortunately, there is no such method for first-order formulas
and the absence of such a method does not mean that one has not found such a
method yet, but it has been proved that there is no such method as a matter of principle.
Theorem 3.9 (Church 1936) There is no algorithm that can decide for a given
first-order formula whether the formula is valid or not.
However, this does not mean that we cannot prove that certain first-order formulas are valid. Indeed, there is an axiom system of valid first-order formulas from
which one can derive all the valid first-order formulas. This is the subject of Gödel's
Completeness Theorem and this is treated in a course on logic. That means, in particular, that for any valid first-order formula there is also a proof that the formula is
valid. It might just be that the proof is very intricate and lengthy. For the current
purposes we just treat first-order formulas intuitively and we do not formally prove
their correctness. We can, however, construct some counterexamples for first-order
formulas that are not valid.
Problems
3.3 We consider counterexamples for incompatible quantifier exportation.
3.4
Correspondence Between Logic and Set Theory
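[Table: correspondence between the quantifiers $(\exists i \in I)$, $(\forall i \in I)$ and the set operations $\bigcup_{i \in I}$, $\bigcap_{i \in I}$.]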
CHAPTER
4.1
Mathematics is not just about objects, but about relations between objects. If we
study natural numbers, then we are not just interested in them as such, but we
want to understand relations between natural numbers such as divisibility. Only
such relations are giving substance to a subject such as number theory. Similarly,
if we study real numbers, we want to understand relations between them such as
linear, continuous or differentiable functions. This is what brings substance to linear
algebra and analysis. All such relations can be considered as subsets of set products
in the following straightforward sense.
Definition 4.1 (Relation) A triple $(R, X, Y)$ is called a relation, if $X$ and $Y$ are
sets and $R \subseteq X \times Y$. We will call $X$ the source and $Y$ the target of the relation and
$R$ its graph.
Typically, we will just say that $R \subseteq X \times Y$ is a relation between $X$ and $Y$ and we
assume that the source $X$ and target $Y$ are defined in this way implicitly. However,
one should keep in mind that source $X$ and target $Y$ have to be specified as part of
the relation. It is not sufficient just to specify the graph alone. A relation is called
homogeneous if $X = Y$, i.e. if source and target are identical. A relation $R \subseteq X \times X$
[Figure 4.1: arrow diagram of a finite relation between a source set and a target set.]
Finite relations R are often illustrated in diagrams such as the one in Figure 4.1.
The source set and the target set are given separately with all their elements and an
arrow is added from each point x in the source space to each point y in the target
space with xRy.
4.2
Since relations are special sets, we can apply the machinery of set theory to relations,
i.e. we can form unions, intersections, differences and other operations on sets. This
has been illustrated in Example 4.3. However, there are also some operations that
are tailor-made for relations, the most important of which is composition, which we
define next.
Definition 4.4 (Composition) Let $R \subseteq X \times Y$ and $S \subseteq Y \times Z$ be relations. Then
we define a relation $S \circ R \subseteq X \times Z$, which is called the composition of the two given
relations by
$$S \circ R := \{(x, z) \in X \times Z : (\exists y \in Y)(xRy \text{ and } ySz)\}.$$
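For finite relations, represented as sets of pairs, the composition can be computed directly from Definition 4.4. The following Python sketch uses hypothetical sample relations `R`, `S` and `T` that are not taken from the text; the function name `compose` is likewise an assumption of this example.

```python
def compose(S, R):
    """S ∘ R = {(x, z) : there is a y with (x, y) in R and (y, z) in S}."""
    return {(x, z) for (x, y) in R for (y2, z) in S if y == y2}


# Hypothetical sample relations on small sets of natural numbers.
R = {(0, 1), (1, 2), (2, 2)}
S = {(y, y - 1) for y in range(1, 6)}   # predecessor: z = y - 1
T = {(0, "a"), (1, "b")}

SR = compose(S, R)
print(SR == {(0, 0), (1, 1), (2, 1)})

# Composition is associative: (T ∘ S) ∘ R = T ∘ (S ∘ R).
print(compose(compose(T, S), R) == compose(T, compose(S, R)))
```

Note the order of the arguments: in `compose(S, R)` the relation `R` is applied first and `S` second, in line with the notation $S \circ R$.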
We point out that two relations $(R, X, Y)$ and $(S, V, Z)$ can only be composed in
the order $S \circ R$ if the target $Y$ of $R$ is identical to the source $V$ of $S$, i.e. if $Y = V$.
Sometimes we will just write $SR := S \circ R$, for short. We illustrate composition in a
continuation of Example 4.3.
Example 4.5 We consider the relation $R \subseteq X \times Y$ from Example 4.3 and we let
$Z := X = Y$. Moreover, we consider the predecessor relation $S \subseteq Y \times Z$ with
$$S := \{(y, z) \in Y \times Z : z = y - 1\}.$$
The following diagram illustrates the composition $S \circ R$ of the two relations.
[Diagram: arrow diagrams of the relations $R$ and $S$ and of their composition $S \circ R$.]
The composition operation on relations satisfies a number of important properties. We mention that it is associative and that the diagonal acts as identity element
with respect to composition.
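Proposition 4.6 (Composition) Let $R \subseteq X \times Y$, $S \subseteq Y \times Z$ and $T \subseteq Z \times W$ be
relations. Then
1. $(T \circ S) \circ R = T \circ (S \circ R)$ (associativity)
2. $R \circ \Delta_X = \Delta_Y \circ R = R$ (identity element)
$$\begin{aligned}
& (x, w) \in (T \circ S) \circ R \\
\iff\ & (\exists y \in Y)\ \big((x, y) \in R \text{ and } (y, w) \in (T \circ S)\big) \\
\iff\ & (\exists y \in Y)\ \big((x, y) \in R \text{ and } (\exists z \in Z)((y, z) \in S \text{ and } (z, w) \in T)\big) \\
\iff\ & (\exists y \in Y)(\exists z \in Z)\ \big((x, y) \in R \text{ and } ((y, z) \in S \text{ and } (z, w) \in T)\big) \\
\iff\ & (\exists z \in Z)(\exists y \in Y)\ \big(((x, y) \in R \text{ and } (y, z) \in S) \text{ and } (z, w) \in T\big) \\
\iff\ & (\exists z \in Z)\ \big((\exists y \in Y)((x, y) \in R \text{ and } (y, z) \in S) \text{ and } (z, w) \in T\big) \\
\iff\ & (\exists z \in Z)\ \big((x, z) \in S \circ R \text{ and } (z, w) \in T\big) \\
\iff\ & (x, w) \in T \circ (S \circ R).
\end{aligned}$$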
None of the relations $R$, $S$ and $S \circ R$ in Example 4.3 is left total or right total.
The less or equal relation $\leq$ and the divisibility relation $\mid$ on $\mathbb{N}$ are examples of left
and right total relations. The strictly less relation $<$ on $\mathbb{N}$ is left total, but not right
total. The strictly larger relation $>$ on $\mathbb{N}$ is right total, but not left total. In fact,
$>$ is just the inversion of $<$. Inversion is another operation that can be performed
on relations in general and we define it next.
Definition 4.9 (Inverse relation) Let $R \subseteq X \times Y$ be a relation. Then we define
the inverse relation $R^{-1} \subseteq Y \times X$ by
$$R^{-1} := \{(y, x) \in Y \times X : xRy\}.$$
Inversion intuitively means to swap source and target space, but to leave the
relation as it is otherwise. For instance, the inverse $\leq^{-1}$ is nothing but $\geq$, the
inverse $<^{-1}$ is nothing but $>$ (where we consider all these relations on $\mathbb{N}$). Moreover,
the inverse $\subseteq^{-1}$ is nothing but $\supseteq$ (where we consider these relations on $2^X$ for an
arbitrary set $X$). Regarding a diagram as in Example 4.3 inversion means to reverse
all the arrows. We state a number of properties regarding inversion and composition.
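Proposition 4.10 (Inverse and composition) Let $R \subseteq X \times Y$ and $S \subseteq Y \times Z$
be relations. Then
1. $(R^{-1})^{-1} = R$.
2. $(S \circ R)^{-1} = R^{-1} \circ S^{-1}$.
3. $\Delta_{\mathrm{dom}(R)} \subseteq R^{-1} \circ R$.
4. $\mathrm{dom}(R^{-1}) = \mathrm{range}(R)$ and $\mathrm{range}(R^{-1}) = \mathrm{dom}(R)$.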
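Proof. We prove 2. and 3. and we leave the other statements to the reader (see
Problem 4.2). Let $x \in X$ and $z \in Z$. Then we obtain
$(z, x) \in (S \circ R)^{-1}$
$\iff (x, z) \in S \circ R$
$\iff (\exists y \in Y)\ (x, y) \in R$ and $(y, z) \in S$
$\iff (\exists y \in Y)\ (z, y) \in S^{-1}$ and $(y, x) \in R^{-1}$
$\iff (z, x) \in R^{-1} \circ S^{-1}$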
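The statements of Proposition 4.10 can be tried out on small finite relations. The following Python sketch is an illustration under our own assumptions: relations are plain sets of pairs, and the helper names `compose` and `inverse` and the sample relations `R` and `S` are hypothetical and not taken from the text.

```python
def compose(S, R):
    """Return S o R for relations given as sets of pairs."""
    return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}

def inverse(R):
    """Return the inverse relation: every pair is swapped."""
    return {(y, x) for (x, y) in R}

# Hypothetical sample relations R from X to Y and S from Y to Z.
R = {(0, 1), (0, 2), (1, 2)}
S = {(1, 'a'), (2, 'a'), (2, 'b')}

lhs = inverse(compose(S, R))             # (S o R)^-1
rhs = compose(inverse(R), inverse(S))    # R^-1 o S^-1
dom_R = {x for (x, _) in R}
diag_dom = {(x, x) for x in dom_R}       # diagonal on dom(R)

print(lhs == rhs)
print(diag_dom <= compose(inverse(R), R))
```

Note that the order of the two relations is reversed on the right-hand side of statement 2., exactly as in the code.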
1. If $R$ and $S$ are left total, then $S \circ R$ is left total.
2. If $R$ and $S$ are right total, then $S \circ R$ is right total.
3. $R$ is left total $\iff$ $R^{-1}$ is right total.
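Proof. We prove 1. and we leave 2. and 3. to the reader (see Problem 4.3). Let $R$
and $S$ be left total. Then $\mathrm{dom}(R) = X$ and $\mathrm{dom}(S) = Y$ and we obtain
$x \in \mathrm{dom}(S \circ R)$
$\iff (\exists z \in Z)\ x(S \circ R)z$
$\iff (\exists y \in Y)\ xRy$
$\iff x \in \mathrm{dom}(R) = X$.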
Problems
4.1 Let $R \subseteq X \times Y$ be a relation. Prove that
1. $R \circ \Delta_X = \Delta_Y \circ R = R$ (identity element)
4.3 Functions
Perhaps the most important relations that are considered in mathematics are functions.
The idea of a function $f : X \to Y$ is that each value $x \in X$ is mapped to one
and only one function value $f(x) \in Y$. Thus, the crucial property is the uniqueness
here. For symmetry reasons we have uniqueness on the left and uniqueness on the
right-hand side, where the right uniqueness is what is required for functions.
Definition 4.12 (Uniqueness) Let $R \subseteq X \times Y$ be a relation.
1. $R$ is called left unique, if for all $x_1, x_2 \in X$ and $y \in Y$
$x_1 R y$ and $x_2 R y \implies x_1 = x_2$.
The reader should be warned that this is an abuse of mathematical terminology that
sometimes creates confusion and mistakes. The function in the above example is
the object $f : \mathbb{N} \to \mathbb{N}$ and it is not fully specified without naming its source and
target set. Moreover, the object $f(n)$ is a natural number and not a function in this
case. It is advisable to avoid the above terminology and to keep functions and
their function values clearly separated in mathematical formulations. The equation
$f(n) = n^2$ can only be used to define $f$, given its source and target set. But neither
the equation nor $f(n)$ is the function. The following proposition justifies defining
a function $f$ by an equation as in Example 4.17 above.
Proposition 4.18 (Graph) Let $f : X \to Y$ be a function. Then
$\mathrm{graph}(f) = \{(x, y) \in X \times Y : f(x) = y\}$
Proof. If $f : X \to Y$ is a function, then that means that $R := \mathrm{graph}(f)$ is a left total
and right unique relation $R \subseteq X \times Y$. We prove $R = \{(x, y) \in X \times Y : f(x) = y\}$.
If $(x, y) \in R$, then $xRy$ and hence $\{f(x)\} = \{y' \in Y : xRy'\} = \{y\}$ due to right
uniqueness of $R$. This means $f(x) = y$, which proves "$\subseteq$". For the other inclusion
we consider $(x, y) \in X \times Y$ with $f(x) = y$. This means that $y$ is the only value
in $\{y' \in Y : xRy'\}$ and in particular $xRy$ holds, i.e. $(x, y) \in R$.
[Figure 4.4: part of the graph of a function $f$ in a Cartesian coordinate system, with horizontal axis $n$ and vertical axis $f(n)$.]
This proposition says that the graph of a function can essentially be characterized
by the function values. The diagram in Figure 4.4 is a typical illustration of (a part
of) a graph of a function. This illustration uses a Cartesian coordinate system to
illustrate the graph. The horizontal axis represents the input values n, whereas the
vertical axis represents the function values f (n). The previous proposition is the
basis of the following observation which says that two functions with identical source
and target set are equal if and only if all their function values coincide.
Proposition 4.19 (Equality of functions) Let $f : X \to Y$ and $g : X' \to Y'$ be
functions. Then
$f = g \iff X = X'$ and $Y = Y'$ and $(\forall x \in X)\ f(x) = g(x)$.
Proof. The fact that $f : X \to Y$ and $g : X' \to Y'$ are functions means that $f =
(\mathrm{graph}(f), X, Y)$ and $g = (\mathrm{graph}(g), X', Y')$ and $\mathrm{graph}(f) \subseteq X \times Y$ and $\mathrm{graph}(g) \subseteq
X' \times Y'$ are left total and right unique relations. Hence, it is clear that
$f = g \iff X = X'$ and $Y = Y'$ and $\mathrm{graph}(f) = \mathrm{graph}(g)$.
So let us assume now that $X = X'$ and $Y = Y'$. By Proposition 4.18 we obtain
$\mathrm{graph}(f) = \mathrm{graph}(g)$
$\iff (\forall x \in X)\ f(x) = g(x)$
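$(x, z) \in \mathrm{graph}(g \circ f)$
$\iff g(f(x)) = z$.
[Diagram: commutative triangle showing the composition $g \circ f$ with target $Z$, passing through $Y$ via $g : Y \to Z$.]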
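[Figure 4.6: types of relations and functions. A relation $R \subseteq X \times Y$ that is left total is a multivalued function $f : X \rightrightarrows Y$; one that is right unique is a partial function $f : X \rightharpoonup Y$; one that is both left total and right unique is a function $f : X \to Y$. A function that is right total is a surjection $f : X \twoheadrightarrow Y$; one that is left unique is an injection $f : X \hookrightarrow Y$; one that is both is a bijection.]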
Relations that are right unique are called partial functions and they are often denoted by $f : X \rightharpoonup Y$
or $f :\subseteq X \to Y$. Partial functions are not necessarily defined on the entire source
set $X$. We write $\mathrm{dom}(f)$ for the domain of a partial function. The diagram in
Figure 4.6 lists some common types of functions and relations that are often used
in mathematics. We study injections, surjections and bijections more closely in the
next section.
Problems
4.4 Let $R \subseteq X \times Y$ and $S \subseteq Y \times Z$ be relations. Prove that:
1. If $R$ and $S$ are left unique, then $S \circ R$ is left unique.
2. $R$ is left unique if and only if $R^{-1}$ is right unique.
4.4
In this section we discuss functions that have additional totality and uniqueness
properties. We start with a definition.
Definition 4.24 (Injective, surjective, bijective) Let $f : X \to Y$ be a function.
1. f is called injective, if f is left unique,
2. f is called surjective, if f is right total,
3. f is called bijective, if f is injective and surjective.
Injective, surjective and bijective functions are also called injection, surjection and
bijection, respectively.
An injection is sometimes denoted as $f : X \hookrightarrow Y$, where the arrow $\hookrightarrow$ is supposed
to indicate that this function is injective. Such an injection is also called a function $f$
from $X$ into $Y$. Similarly, surjections are sometimes denoted as $f : X \twoheadrightarrow Y$, where
the arrow $\twoheadrightarrow$ indicates that this function is surjective. Such a surjection is also
called a function $f$ from $X$ onto $Y$. For bijections one sometimes sees a special arrow
notation as well, but we will not use this here.
By definition an injective function is a function that cannot map two different
inputs to the same output and a surjective function is a function that yields all
values of the target space as output. We capture these characterizations in terms of
function values in the following proposition.
Proposition 4.25 (Injectivity, surjectivity and bijectivity) Let $f : X \to Y$
be a function. Then
1. $f$ is injective if and only if for all $x, y \in X$ we have that $f(x) = f(y)$ implies
$x = y$,
2. $f$ is surjective if and only if for all $y \in Y$ there exists an $x \in X$ with $f(x) = y$,
3. $f$ is bijective if and only if for all $y \in Y$ there exists exactly one $x \in X$ with
$f(x) = y$.
We leave the proof to the reader (see Problem 4.5). Often the above characterization
of injectivity is used in its contrapositive form, i.e. a function $f : X \to Y$ is
injective if and only if for all $x, y \in X$ we have that $x \neq y$ implies $f(x) \neq f(y)$. For
short: distinct inputs have to be mapped to distinct outputs. This is the reason why
some authors also call injective functions one-to-one functions. However, this terminology
is ambiguous, since it is also sometimes used to refer to bijective functions
and hence we will try to avoid it here.
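For functions on finite sets the characterizations of Proposition 4.25 translate directly into small tests. The following Python sketch is an illustration under our own assumptions: a function is represented by a dictionary whose keys form the source set, the target set is passed separately, and the helper names and sample functions are hypothetical.

```python
def is_injective(f):
    """Distinct inputs have distinct outputs (f is a dict on a finite source)."""
    values = list(f.values())
    return len(values) == len(set(values))

def is_surjective(f, target):
    """Every element of the target set occurs as a function value."""
    return set(f.values()) == set(target)

def is_bijective(f, target):
    return is_injective(f) and is_surjective(f, target)

X = [0, 1, 2, 3]
square = {x: x * x for x in X}         # considered with target {0, ..., 9}
parity = {x: x % 2 for x in X}         # considered with target {0, 1}
shift = {x: (x + 1) % 4 for x in X}    # considered with target X

print(is_injective(square), is_surjective(square, range(10)))
print(is_injective(parity), is_surjective(parity, [0, 1]))
print(is_bijective(shift, X))
```

The target set has to be supplied explicitly, which reflects the fact that surjectivity depends on the target set and not only on the function values.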
Figure 4.6 summarizes the different types of functions that we have seen. The
function in Example 4.15 is neither surjective nor injective. The diagrams in Figure 4.7
provide examples of injective, surjective and bijective functions. We provide
some further examples.
Example 4.26
1. The square function $f : \mathbb{N} \to \mathbb{N}, n \mapsto n^2$ is an example of a function that is
injective, but not surjective. Hence, $f$ is also not bijective.
2. The square function $f : \mathbb{Z} \to \mathbb{Z}, z \mapsto z^2$ on integers is an example of a function
that is neither injective nor surjective.
3. The predecessor function
$f : \mathbb{N} \to \mathbb{N},\ n \mapsto \begin{cases} 0 & \text{if } n = 0 \\ n - 1 & \text{otherwise} \end{cases}$
4.
The examples of the predecessor function and the maximum function illustrate
another way in which definitions (of functions) are often written in mathematics,
namely by case distinction.
Another way to characterize injective and surjective functions is by using their
behaviour under composition with other functions. Roughly speaking, we can divide
functional equations by injective functions on the left-hand side and by surjective
functions on the right-hand side. These properties actually characterize
injective and surjective functions and they explain why these types of functions play
a significant role.
Theorem 4.27 (Cancellation) Let $f : X \to Y$ be a function. Then
1. $f$ is injective if and only if for all sets $Z$ and all functions $g, h : Z \to X$ we
have that $f \circ g = f \circ h$ implies $g = h$,
2. $f$ is surjective if and only if for all sets $Z$ and all functions $g, h : Y \to Z$ we
have that $g \circ f = h \circ f$ implies $g = h$.
Proof.
1. "$\Longrightarrow$" Let $f$ be injective and let $g, h : Z \to X$ be two functions with $f \circ g = f \circ h$.
By Propositions 4.19 and 4.21 we obtain
$f(g(z)) = (f \circ g)(z) = (f \circ h)(z) = f(h(z))$
for all $z \in Z$ and hence $g(z) = h(z)$ follows for all $z \in Z$ due to injectivity of
$f$ by Proposition 4.25. Again by Proposition 4.19 we obtain $g = h$.
"$\Longleftarrow$" Now let us assume that for all sets $Z$ and all functions $g, h : Z \to X$ we have that
$f \circ g = f \circ h$ implies $g = h$. Let us now choose $Z = \{0\}$ (or any other nonempty
set) and let us consider for any $x \in X$ the constant function $c_x : Z \to X, z \mapsto x$.
Let now $x_1, x_2 \in X$ with $f(x_1) = f(x_2)$. Then by Proposition 4.21
$(f \circ c_{x_1})(z) = f(c_{x_1}(z)) = f(x_1) = f(x_2) = f(c_{x_2}(z)) = (f \circ c_{x_2})(z)$
follows for all $z \in Z$. By Proposition 4.19 this means $f \circ c_{x_1} = f \circ c_{x_2}$ and
hence by assumption $c_{x_1} = c_{x_2}$. This implies again by Proposition 4.19 that
we obtain $x_1 = c_{x_1}(z) = c_{x_2}(z) = x_2$ for any $z \in Z$. Hence we have proved by
Proposition 4.25 that $f$ is injective.
2. We leave this proof to the reader (see Problem 4.5).
The bijective functions of type $f : X \to X$ (with identical source and target set)
have particularly nice properties. They form what is called the symmetric group on
$X$. We mention all the relevant properties.
Corollary 4.31 (Symmetric group) Let $X$ be a set. Then we obtain for all bijective
$f, g, h : X \to X$ the following:
1. $(f \circ g) \circ h = f \circ (g \circ h)$ (associative)
2. $f \circ \mathrm{id}_X = \mathrm{id}_X \circ f = f$ (identity)
3. $f \circ f^{-1} = f^{-1} \circ f = \mathrm{id}_X$ (inverse)
The bijective functions $f : X \to X$ are also called permutations. This terminology
is in particular used if $X$ is a finite set. This is because a bijective map
$f : X \to X$ actually permutes the elements of $X$. The bijective function in Figure 4.7
is a typical example of a permutation on a finite set.
Problems
4.5 Let $f : X \to Y$ be a function. Prove the following:
1. $f$ is injective if and only if for all $x, y \in X$ we have that $f(x) = f(y)$ implies $x = y$,
2. $f$ is surjective if and only if for all $y \in Y$ there exists an $x \in X$ with $f(x) = y$,
3. $f$ is bijective if and only if for all $y \in Y$ there exists exactly one $x \in X$ with $f(x) = y$,
4. $f$ is surjective if and only if for all functions $g, h : Y \to Z$ we have that $g \circ f = h \circ f$
implies $g = h$.
4.6 Let $f : X \to Y$ be a function with $R := \mathrm{graph}(f)$. Prove that the inverse relation
$R^{-1} \subseteq Y \times X$ is a partial function of type $f^{-1} : Y \rightharpoonup X$ if and only if $f$ is injective. Prove
that for injective $f$ one obtains $\mathrm{dom}(f^{-1}) = \mathrm{range}(f)$. Hence, one can also consider this
partial function as a function $f^{-1} : \mathrm{range}(f) \to X$.
4.7 Let $X$ and $Y$ be nonempty sets. Prove that the canonical projections
$p_X : X \times Y \to X, (x, y) \mapsto x$ and $p_Y : X \times Y \to Y, (x, y) \mapsto y$
are both surjective.
4.8 Let $f_i : X_i \to Y_i$ be functions for $i \in \{1, 2\}$. Then we define the product function by
$f_1 \times f_2 : X_1 \times X_2 \to Y_1 \times Y_2, (x_1, x_2) \mapsto (f_1(x_1), f_2(x_2))$.
Prove the following:
1. $f_1$ and $f_2$ injective $\implies f_1 \times f_2$ injective,
2. $f_1$ and $f_2$ surjective $\implies f_1 \times f_2$ surjective,
3. $f_1$ and $f_2$ bijective $\implies f_1 \times f_2$ bijective.
4.5
In this section we just introduce some further terminology that is related to the
source set of a function. A sequence in $X$ is just another name for a function
$f : \mathbb{N} \to X$ and a family in $X$ indexed by $I$ is just a function $f : I \to X$. There are
special ways of denoting such functions.
Definition 4.32 (Family and sequence) Let $I$ and $X$ be nonempty sets and let
$x_i \in X$ for each $i \in I$. Then $(x_i)_{i \in I}$ is just another way of writing the function
$f : I \to X, i \mapsto x_i$
and this function is called a family in $X$ (indexed by $I$). A family $(x_n)_{n \in \mathbb{N}}$ in $X$ that
is indexed by $\mathbb{N}$ is called a sequence in $X$.
The notation $(x_n)_{n \in \mathbb{N}}$ for sequences can easily be read as a generalization of the
notation $(x_1, x_2, ..., x_n)$ for $n$-tuples, since a sequence $(x_n)_{n \in \mathbb{N}}$ can be considered in
some sense as the infinite tuple
$(x_0, x_1, x_2, x_3, ...)$.
However, one should keep in mind that formally we mean by $(x_n)_{n \in \mathbb{N}}$ the function
$f : \mathbb{N} \to X, n \mapsto x_n$. Sometimes sequences are also written as $\{x_n\}_{n \in \mathbb{N}}$, but this
notation is misleading since one has to distinguish a sequence $(x_n)_{n \in \mathbb{N}}$ (which is a
function $f : \mathbb{N} \to X$) from the set $\{x_n : n \in \mathbb{N}\}$ (which is, in fact, nothing but
$\mathrm{range}(f)$). See also Problem 4.12. We mention that the terminology of an indexed
family of sets $(X_i)_{i \in I}$ naturally falls under the terminology of a family introduced
here. If $X := \bigcup_{i \in I} X_i$, then $(X_i)_{i \in I}$ can be seen as a family in $2^X$ indexed by $I$. In
other words, what we mean by $(X_i)_{i \in I}$ is exactly the function $f : I \to 2^X, i \mapsto X_i$.
Occasionally, one is interested in changing the source set of a function. Since the
source set is part of what constitutes the function, this might change the properties
of the function.
Definition 4.33 (Restriction and extension) Let $f : X \to Y$ be a function and
let $A \subseteq X$. Then we define the restriction of $f$ to $A$ by
$f|_A : A \to Y, x \mapsto f(x)$.
In this situation $f$ is also called an extension of $f|_A$.
So,
f as it
In this
having
Example 4.34
1. We consider the function $f : X \to Y$ from Example 4.15. This function
$f$ is not injective, since $f(0) = f(1) = 1$. By restricting $f$ to either $A =
\{0, 2, 3, 4, 5\}$ or to $B = \{1, 2, 3, 4, 5\}$ we obtain restrictions $f|_A : A \to Y$ and
$f|_B : B \to Y$ that are both injective.
2. We consider the square function $f : \mathbb{Z} \to \mathbb{Z}, z \mapsto z^2$, which is not injective
since, for instance, $f(-1) = f(1) = 1$. If we restrict $f$ to $\mathbb{N}$, then the resulting
restriction $f|_{\mathbb{N}} : \mathbb{N} \to \mathbb{Z}$ is injective.
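The effect of a restriction on injectivity can be observed on a finite part of the square function. The following Python sketch is an illustration only: since the integers are infinite, we assume a finite window $\{-5, ..., 5\}$ of $\mathbb{Z}$, and the helper names `restrict` and `is_injective` are hypothetical.

```python
def restrict(f, A):
    """Restriction of a finite function (dict) to a subset A of its source."""
    return {x: f[x] for x in A}

def is_injective(f):
    values = list(f.values())
    return len(values) == len(set(values))

window = range(-5, 6)                          # finite window of the integers
square = {z: z * z for z in window}            # square function on the window
square_on_N = restrict(square, [z for z in window if z >= 0])

print(is_injective(square))        # the values at -1 and 1 coincide
print(is_injective(square_on_N))   # no collisions among non-negative inputs
```

The restriction has a different source set and is therefore a different function, even though all its function values agree with those of the original function.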
Later we will prove in Proposition 4.51 that any function $f : X \to Y$ has an injective
restriction $f|_A : A \to Y$ with the same range, i.e. such that $\mathrm{range}(f) = \mathrm{range}(f|_A)$.
However, this proof requires the Axiom of Choice.
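Problems
4.12 Show that there are two sequences $(x_i)_{i \in \mathbb{N}}$ and $(y_i)_{i \in \mathbb{N}}$ in $\mathbb{N}$ such that
$(x_i)_{i \in \mathbb{N}} \neq (y_i)_{i \in \mathbb{N}}$ and $\{x_i : i \in \mathbb{N}\} = \{y_i : i \in \mathbb{N}\}$.
4.13 Let $X$ be a set. We consider the union map
$U : 2^X \times 2^X \to 2^X, (A, B) \mapsto A \cup B$.
Find a restriction $U|_Y$ of $U$ that is bijective.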
4.6
When we work with functions $f : X \to Y$ we are often not just interested in single
function values, but we would like to know how a function $f$ behaves on certain
subsets $A \subseteq X$ or $B \subseteq Y$. In order to express such properties, we define the image
of a set under a function and the preimage of a set under a function.
[Figure: the image $f(A)$ of a set $A \subseteq X$ and the preimage $f^{-1}(B)$ of a set $B \subseteq Y$ under a function $f : X \to Y$.]
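$\iff (\exists i \in I)(\exists x \in A_i)\ f(x) = y$
$\iff (\exists i \in I)\ y \in f(A_i)$
$\iff y \in \bigcup_{i \in I} f(A_i)$.
$x \in f^{-1}\left(\bigcap_{i \in I} B_i\right)$
$\iff f(x) \in \bigcap_{i \in I} B_i$
$\iff (\forall i \in I)\ f(x) \in B_i$
$\iff (\forall i \in I)\ x \in f^{-1}(B_i)$
$\iff x \in \bigcap_{i \in I} f^{-1}(B_i)$.
This shows $f^{-1}\left(\bigcap_{i \in I} B_i\right) = \bigcap_{i \in I} f^{-1}(B_i)$.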
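The behaviour of images and preimages under unions and intersections can be explored on a small finite function. The following Python sketch is an illustration under our own assumptions: the function is a dictionary, and the helper names `image` and `preimage` as well as the sample sets are hypothetical.

```python
def image(f, A):
    """Image f(A) = {f(x) : x in A} of a subset A of the source set."""
    return {f[x] for x in A}

def preimage(f, B):
    """Preimage of B: all x in the source set with f(x) in B."""
    return {x for x in f if f[x] in B}

f = {0: 'a', 1: 'a', 2: 'b', 3: 'c'}   # not injective, since f(0) = f(1)
A1, A2 = {0, 2}, {1, 2}
B1, B2 = {'a', 'b'}, {'b', 'c'}

# The preimage commutes with intersections.
print(preimage(f, B1 & B2) == preimage(f, B1) & preimage(f, B2))
# The image commutes with unions.
print(image(f, A1 | A2) == image(f, A1) | image(f, A2))
# For intersections the image only satisfies an inclusion, which is proper here.
print(image(f, A1 & A2) < image(f, A1) & image(f, A2))
```

The last line shows why only an inclusion can be expected for the image of an intersection when the function is not injective.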
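4.14 Let $f : X \to Y$ be a function and let $(A_i)_{i \in I}$ be an indexed family of subsets of $X$ and
let $(B_i)_{i \in I}$ be an indexed family of subsets of $Y$. Let $A, B \subseteq X$ and $C, D \subseteq Y$. Prove the
following:
1. $A \subseteq B \implies f(A) \subseteq f(B)$,
2. $f\left(\bigcap_{i \in I} A_i\right) \subseteq \bigcap_{i \in I} f(A_i)$,
3. $f(A \setminus B) \supseteq f(A) \setminus f(B)$,
4. $f^{-1}\left(\bigcup_{i \in I} B_i\right) = \bigcup_{i \in I} f^{-1}(B_i)$,
5. $f^{-1}(C \setminus D) = f^{-1}(C) \setminus f^{-1}(D)$.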
(image map)
(preimage map)
4.7 Set of Functions
In this section we discuss the set $Y^X$ of all functions $f : X \to Y$ for two given sets
$X$ and $Y$. As we will see, this concept generalizes the concept of a power set in some
sense and it can be considered as an exponentiation operation for sets.
Definition 4.41 (Set of functions) Let $X$ and $Y$ be sets. Then we denote by
$Y^X$ the set of all functions $f : X \to Y$ and by $X!$ the set of bijective functions
$f : X \to X$.
Some authors denote the set of bijective functions also by $S_X$, since it is also
called the symmetric group on $X$, as mentioned in Corollary 4.31. There is one
important function that comes associated with the function set $Y^X$ and which is
called evaluation.
Definition 4.42 (Evaluation) Let $X$ and $Y$ be sets. Then we define the evaluation map by
$\mathrm{ev} : Y^X \times X \to Y, (f, x) \mapsto f(x)$
Sometimes the evaluation map is also called apply operation since it applies the
first argument (which is a function) to the second argument (which is a suitable
input). The next theorem is telling us that we can identify the set $Z^{X \times Y}$ with
the set $(Z^Y)^X$. This corresponds to the arithmetic rule that for natural numbers
$x, y, z \in \mathbb{N}$ we have $(z^y)^x = z^{xy}$. The bijection that maps $Z^{X \times Y}$ to $(Z^Y)^X$ is called
currying operation since it has been studied by Haskell Curry (and indeed already
earlier by others such as Moses Schönfinkel).
Theorem 4.43 (Currying) Let $X$, $Y$ and $Z$ be sets. Then the so-called currying
operation
$C : Z^{X \times Y} \to (Z^Y)^X$,
which is defined by $C(f)(x)(y) := f(x, y)$ for all functions $f : X \times Y \to Z$ and all
$x \in X$ and $y \in Y$, is bijective.
Proof. Let $g : X \to Z^Y$ be a function. Then we can define a function $f : X \times Y \to Z$
by
$f(x, y) := g(x)(y)$
for all $x \in X$ and $y \in Y$. For this function $f$ we obtain
$C(f)(x)(y) = f(x, y) = g(x)(y)$
for all $x \in X$ and $y \in Y$. Hence $C(f)(x) = g(x)$ for all $x \in X$, which means
$C(f) = g$. This shows that $C$ is surjective. Now, let $f_1, f_2 : X \times Y \to Z$ be two
functions with $C(f_1) = C(f_2)$. This implies $C(f_1)(x) = C(f_2)(x)$ for all $x \in X$ and
hence
$f_1(x, y) = C(f_1)(x)(y) = C(f_2)(x)(y) = f_2(x, y)$
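The currying operation of Theorem 4.43 has a direct counterpart in programming languages with first-class functions. The following Python sketch is an illustration only: the names `curry`, `uncurry` and `ev` and the sample function `add` are our own choices, and Python functions are of course not restricted to fixed source and target sets as functions in the sense of this chapter are.

```python
def curry(f):
    """Map f : X x Y -> Z to C(f) : X -> Z^Y with C(f)(x)(y) = f(x, y)."""
    return lambda x: lambda y: f(x, y)

def uncurry(g):
    """Map g : X -> Z^Y back to the function (x, y) |-> g(x)(y)."""
    return lambda x, y: g(x)(y)

def ev(f, x):
    """Evaluation map: apply the function f to the input x."""
    return f(x)

def add(x, y):
    return x + y

add3 = curry(add)(3)                  # the function y |-> 3 + y
print(add3(4))
print(uncurry(curry(add))(2, 5) == add(2, 5))
print(ev(curry(add)(1), 2) == add(1, 2))
```

The function `uncurry` plays the role of the construction $f(x, y) := g(x)(y)$ that is used in the surjectivity part of the proof.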
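Problems
4.22 Let $X$, $Y$ and $Z$ be sets. Prove that for any function $f : X \times Y \to Z$ we obtain
$\mathrm{ev} \circ (C(f) \times \mathrm{id}_Y) = f$.
Here $C$ denotes the currying operation as defined in Theorem 4.43 and $\mathrm{ev} : Z^Y \times Y \to Z$
denotes the evaluation map. The following diagram illustrates the situation.
[Diagram: commutative triangle with $C(f) \times \mathrm{id}_Y : X \times Y \to Z^Y \times Y$, $\mathrm{ev} : Z^Y \times Y \to Z$ and $f : X \times Y \to Z$.]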
4.8
There is a particularly important axiom in set theory, which is called the Axiom
of Choice. Perhaps it is the most controversial set-theoretical axiom and some
mathematicians prefer not to use it or at least to indicate whenever they use this
axiom. However, often this axiom is applied tacitly without even mentioning it. We
phrase this axiom in the form of a definition.
Definition 4.46 (Axiom of Choice) The Axiom of Choice is the statement that
for any set $X$ there exists a choice function
$C_X : 2^X \setminus \{\emptyset\} \to X$
with $C_X(A) \in A$ for all nonempty $A \subseteq X$.
What the choice function $C_X$ does is that for any nonempty set $A \subseteq X$ it selects
a point $x = C_X(A)$ with the property that $x \in A$. This seems to be a trivial task
since any nonempty set $A$ has to have some member $x$. This is the reason why
most mathematicians readily accept the Axiom of Choice. However, from a more
constructive point of view, the axiom is debatable, since it does not specify how
such a point $x$ shall be chosen in general. The other axioms of set theory, such
as the power set axiom, specify in some sense how the object whose existence is
By Corollary 4.30 the inverse $f^{-1}$ of a function $f$ is an inverse if it exists. However,
by Proposition 4.29 the inverse function $f^{-1}$ only exists if $f$ is bijective.
We generalize this observation. However, the proof requires the Axiom of Choice.
Theorem 4.49 (Left and right inverses) Let $X$ and $Y$ be nonempty sets and
let $f : X \to Y$ be a function. Then
1. $f$ has a right inverse if and only if $f$ is surjective.
2. $f$ has a left inverse if and only if $f$ is injective.
3. $f$ has an inverse if and only if $f$ is bijective.
The proof of statement 1. uses the Axiom of Choice.
Proof.
1. Let us assume that the Axiom of Choice holds. We need to show that for
every function $f : X \to Y$ it holds that $f$ has a right inverse if and only if $f$ is
surjective. Let us fix some function $f : X \to Y$.
"$\Longleftarrow$" Let $f$ be surjective. Then there is a choice function $C_X : 2^X \setminus \{\emptyset\} \to X$
with $C_X(A) \in A$ for each nonempty set $A \subseteq X$. We need to show that $f$ has
a right inverse $g : Y \to X$. Since $f$ is surjective, the preimage $f^{-1}(\{y\})$ is a
nonempty subset of $X$ for each $y \in Y$ and hence we can define $g$ by
$g(y) := C_X(f^{-1}(\{y\}))$
for every $y \in Y$. Then $g(y) \in f^{-1}(\{y\})$, i.e. $f \circ g(y) = y$ for all $y \in Y$, which
means $f \circ g = \mathrm{id}_Y$. Hence $f$ has a right inverse $g$.
"$\Longrightarrow$" If, on the other hand, $f$ has a right inverse $g : Y \to X$, then for each
$y \in Y$ we have that $f(g(y)) = f \circ g(y) = \mathrm{id}_Y(y) = y$, hence $f$ is surjective. We
do not need the Axiom of Choice for this direction.
2. We leave this proof to the reader (see Problem 4.26).
3. "$\Longleftarrow$" Let $f$ be bijective. Then the inverse function $f^{-1}$ of $f$ exists by Proposition 4.29
and by Corollary 4.30 we obtain $f \circ f^{-1} = \mathrm{id}_Y$ and $f^{-1} \circ f = \mathrm{id}_X$.
Hence, the inverse $f^{-1}$ is a right inverse as well as a left inverse of $f$.
"$\Longrightarrow$" Let $f$ have a right inverse $g : Y \to X$ and a left inverse $h : Y \to X$, i.e.
$f \circ g = \mathrm{id}_Y$ and $h \circ f = \mathrm{id}_X$. We obtain by associativity
$g = \mathrm{id}_X \circ g = (h \circ f) \circ g = h \circ (f \circ g) = h \circ \mathrm{id}_Y = h$.
Hence $g$ is a left inverse and a right inverse of $f$, hence it is an inverse.
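For finite sets the construction of a right inverse in the proof of statement 1. can be carried out explicitly. The following Python sketch is an illustration under our own assumptions: the function is a dictionary, the helper name `right_inverse` is hypothetical, and the built-in `min` plays the role of the choice function. For finite sets of comparable elements such a choice can be made explicitly, so no appeal to the Axiom of Choice is needed here.

```python
def right_inverse(f, Y, choose=min):
    """Return g : Y -> X with f(g(y)) = y for a surjective finite function f.

    The argument `choose` selects one element of each nonempty fibre and
    thus stands in for the choice function of the proof.
    """
    g = {}
    for y in Y:
        fibre = [x for x in f if f[x] == y]   # the preimage of {y}
        if not fibre:
            raise ValueError("f is not surjective")
        g[y] = choose(fibre)
    return g

f = {0: 'a', 1: 'a', 2: 'b', 3: 'c', 4: 'c'}   # surjective onto Y
Y = ['a', 'b', 'c']
g = right_inverse(f, Y)

print(g)
print(all(f[g[y]] == y for y in Y))   # f o g is the identity on Y
```

Passing `choose=max` yields a different right inverse, which illustrates that right inverses of surjections are in general not unique.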
The Axiom of Choice has not only lots of important applications in mathematics,
but also some counterintuitive consequences. One of those is the Banach-Tarski
Paradox, which is in fact a theorem that follows from the Axiom of Choice. It
states that a solid ball in the three-dimensional Euclidean space can be decomposed
into finitely many disjoint pieces that can be reassembled to two balls of the same
volume as the original ball. And this process can be performed by rotations and
other geometrical transformations that do not change the shape of the pieces. The
pieces themselves are, however, very complicated and not like solid physical objects.
Problems
4.26 Let $X$ and $Y$ be nonempty sets and let $f : X \to Y$ be a function. Prove that $f$ has a
left inverse if and only if $f$ is injective.
4.9 Infinite Products
Definition 4.52 (Product) Let $(X_i)_{i \in I}$ be an indexed family of sets. Then we
define the product
$\prod_{i \in I} X_i := \left\{ f : I \to \bigcup_{i \in I} X_i : (\forall i \in I)\ f(i) \in X_i \right\}$.
We note that in case that $X_i = X$ for all $i \in I$, we obtain $\prod_{i \in I} X = X^I$. In this
sense the exponentiation or function set construction is a special case of the product.
We will see in Problem 4.28 that the product also generalizes the finite Cartesian
product $\times_{i=1}^{n} X_i$ that we have considered earlier. Now we show that the Axiom of
Choice is equivalent to the statement that this product is nonempty whenever the
sets $X_i$ are all nonempty.
Theorem 4.53 (Nonempty products) The following statement is equivalent to
the Axiom of Choice. For all indexed families $(X_i)_{i \in I}$ we obtain
$\prod_{i \in I} X_i \neq \emptyset \iff (\forall i \in I)\ X_i \neq \emptyset$.
Proof. Let $X := \bigcup_{i \in I} X_i$. Let us assume the Axiom of Choice holds. We need to
show that the statement in the theorem holds too. We consider both directions.
"$\Longleftarrow$" Let $X_i \neq \emptyset$ for all $i \in I$. We need to prove that there exists a function
$f : I \to X$ with $f(i) \in X_i$ for all $i \in I$. By the Axiom of Choice there is a choice function
$C_X : 2^X \setminus \{\emptyset\} \to X$ for $X$. Since $X_i \neq \emptyset$ for all $i \in I$, we obtain $C_X(X_i) \in X_i$.
Hence, we can define a suitable $f$ by $f(i) := C_X(X_i)$ for all $i \in I$. For this $f$ we
obtain $f \in \prod_{i \in I} X_i$ and hence $\prod_{i \in I} X_i \neq \emptyset$.
"$\Longrightarrow$" We prove the contrapositive statement. Let $j \in I$ be such that $X_j = \emptyset$.
Then, obviously, there cannot be any function $f : I \to X$ with $f(j) \in X_j$. Hence
$\prod_{i \in I} X_i = \emptyset$. For this direction we have not used the Axiom of Choice.
Let us now assume that the statement in the theorem is correct. We prove that
under this assumption the Axiom of Choice follows. Let $X$ be a set. We consider
the indexed family of sets $(Y_A)_{A \in I}$ with $I := 2^X \setminus \{\emptyset\}$ and $Y_A := A$ for each $A \in I$.
We have to prove that there is a function $C_X : 2^X \setminus \{\emptyset\} \to X$ with $C_X(A) \in A$ for
each nonempty $A \subseteq X$. But if $\prod_{A \in I} Y_A$ is nonempty, then there exists a function
$f : I \to \bigcup_{A \in I} Y_A$ with $f(A) \in Y_A = A$ for each $A \in I = 2^X \setminus \{\emptyset\}$. Hence
$C_X(A) := f(A)$ is a suitable choice for each nonempty $A \subseteq X$. Thus, the Axiom of
Choice follows from the nonemptiness of $\prod_{A \in I} Y_A$.
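For a finite index set and finite sets $X_i$ the product of Definition 4.52 can be listed completely, and the statement of Theorem 4.53 can be observed directly. The following Python sketch is an illustration under our own assumptions: a family is a dictionary that maps each index to a list, every element of the product is returned as a dictionary $i \mapsto f(i)$, and the helper name `family_product` and the sample families are hypothetical.

```python
from itertools import product

def family_product(family):
    """List all functions f on the index set with f(i) in X_i for every i."""
    indices = list(family)
    return [dict(zip(indices, values))
            for values in product(*(family[i] for i in indices))]

nonempty = {'i': [0, 1], 'j': ['a'], 'k': [True, False]}
with_empty = {'i': [0, 1], 'j': []}      # one member of the family is empty

elements = family_product(nonempty)
print(len(elements))                     # 2 * 1 * 2 choices
print(all(f[i] in nonempty[i] for f in elements for i in nonempty))
print(family_product(with_empty))        # no function can pick from the empty set
```

In the finite case no axiom is needed to see that the product is nonempty; the Axiom of Choice only becomes relevant for arbitrary index sets.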
There are many other theorems in mathematics that are, in fact, equivalent to
the Axiom of Choice. We just mention two examples:
1. The statement that each vector space has a basis is equivalent to the Axiom of
Choice. This statement is a crucial and fundamental fact in Linear Algebra.
These maps are all surjective (see Problem 4.27). The product together with the
canonical projections satisfies a so-called universal property that we formulate in the
following result.
Theorem 4.54 (Product) Let $(X_i)_{i \in I}$ be an indexed family of sets. For each set
$Y$ and each family $(f_i)_{i \in I}$ of functions $f_i : Y \to X_i$ there exists exactly one function
$f : Y \to \prod_{i \in I} X_i$ such that
$f_j = \mathrm{pr}_j \circ f$
for all $j \in I$.
Proof. We first prove the existence of $f$. Let $Y$ be a set and let $(f_i)_{i \in I}$ be a family
of functions $f_i : Y \to X_i$. Then we define a function $f : Y \to \prod_{i \in I} X_i$ by
$f(y)(i) := f_i(y)$
for all $y \in Y$ and $i \in I$. This $f$ is well-defined, since the function $f(y) : I \to \bigcup_{i \in I} X_i$
has the property that $f(y)(i) = f_i(y) \in X_i$ for each $i \in I$, hence $f(y) \in \prod_{i \in I} X_i$ for
all $y \in Y$. Now we obtain
$f_j(y) = f(y)(j) = \mathrm{pr}_j(f(y)) = \mathrm{pr}_j \circ f(y)$
for each $y \in Y$ and $j \in I$, which means $f_j = \mathrm{pr}_j \circ f$ for each $j \in I$. Now we still
need to prove that $f$ is uniquely determined. Hence, let $g : Y \to \prod_{i \in I} X_i$ be some
function such that $f_j = \mathrm{pr}_j \circ g$ for all $j \in I$. We have to show that $f = g$. We obtain
$g(y)(j) = \mathrm{pr}_j \circ g(y) = f_j(y) = \mathrm{pr}_j \circ f(y) = f(y)(j)$
for all $y \in Y$ and $j \in I$, hence $g(y) = f(y)$ for all $y \in Y$ and this means $f = g$. This
completes the proof.
The diagram in Figure 4.10 illustrates the situation of the proof. It is an example
of a commutative diagram. For finite products (i.e. finite sets $I$) we do not have
to distinguish between the product $\prod_{i \in I} X_i$ and the product $\times_{i \in I} X_i$ that we have
introduced earlier. We leave the proof to the reader (see Problem 4.28). Thus, the
product $\prod_{i \in I} X_i$ introduced in this section actually generalizes the finite product
$\times_{i \in I} X_i$.
1
Compactness is a notion that is studied in Topology and Analysis. It plays a very important
role because compact sets have many properties in common with finite sets, although they are not
necessarily finite.
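[Figure 4.10: commutative diagram with $f : Y \to \prod_{i \in I} X_i$, $\mathrm{pr}_j : \prod_{i \in I} X_i \to X_j$ and $f_j : Y \to X_j$.]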
Problems
4.27 Let $(X_i)_{i \in I}$ be a family of sets. Prove that the canonical projections $\mathrm{pr}_j : \prod_{i \in I} X_i \to X_j$
are surjective for all $j \in I$. Prove that these maps are not injective in general.
4.28 Let $I := \mathbb{N}_n = \{1, 2, ..., n\}$ for some $n \geq 1$. We consider the projections
$p_j : \times_{i=1}^{n} X_i \to X_j, (x_1, x_2, ..., x_n) \mapsto x_j$.
$F : \times_{i=1}^{n} X_i \to \prod_{i \in I} X_i$
5 Cardinality
The infinite! No other question has ever moved so profoundly the spirit of man.
David Hilbert (1862–1943)
5.1 What is the Cardinality of a Set?
Obviously, for some considerations about sets the size of a set matters. This size of a
set $X$ is called the cardinality of $X$ and it is often denoted by $|X|$. Some authors also
write $\mathrm{card}(X) = |X|$. To define exactly what kind of quantity $|X|$ is, is somewhat
nontrivial and we will not do this here. There is a mathematically precise way
to interpret $|X|$ as a so-called cardinal number, which is a quantity that can take
natural number values but also many different infinite values. Perhaps surprisingly,
we do not have to specify what $|X|$ exactly is, in order to work with cardinalities.
We will just specify what expressions like $|X| \leq |Y|$ mean without saying what $|X|$
and $|Y|$ actually are. We make this precise in the following definition.
Definition 5.1 (Cardinality) Let $X$ and $Y$ be sets.
1. We write $|X| = |Y|$ and we say that $X$ has the same cardinality as $Y$ if there
is a bijective map $f : X \to Y$.
2. We write $|X| \leq |Y|$ and we say that $X$ has smaller or the same cardinality as
$Y$ if there is an injective map $f : X \to Y$.
3. We write $|X| < |Y|$ and we say that $X$ has strictly smaller cardinality than $Y$
if $|X| \leq |Y|$ and not $|X| = |Y|$.
To make this clear again: if we write $|X| \leq |Y|$, we are not saying that some
sort of number $|X|$ is less or equal to some sort of number $|Y|$ (although there is a
meaningful way to interpret things in this direction), but the expression $|X| \leq |Y|$
is simply a short way to say that there exists an injective function $f : X \to Y$. We
give a simple example.
Example 5.2 The two sets $\{1, 2, 3\}$ and $\{A, B, C\}$ (where we assume that $A$, $B$
and $C$ are pairwise distinct objects) are of the same cardinality, i.e. $|\{1, 2, 3\}| =
|\{A, B, C\}|$. We can easily verify this using a bijective function $f : \{1, 2, 3\} \to
\{A, B, C\}$ defined by $f(1) := A$, $f(2) := B$ and $f(3) := C$.
If one accepts the Axiom of Choice, then $|X| \leq |Y|$ is the same as saying that
there is a surjective function $g : Y \to X$.
Proposition 5.3 The following statement follows from the Axiom of Choice. Let
$X$ and $Y$ be nonempty sets. There exists an injective function $f : X \to Y$ if and
only if there exists a surjective function $g : Y \to X$.
Proof. Let $f : X \to Y$ be injective. Then $f$ has a left inverse $g : Y \to X$ by
Theorem 4.49 and hence $g \circ f = \mathrm{id}_X$. This implies that $g : Y \to X$ has to be surjective.
For the other direction we assume that there exists a surjective function $g : Y \to X$.
Then by Theorem 4.49 the Axiom of Choice implies that there exists a right inverse
$f : X \to Y$ of $g$, i.e. $g \circ f = \mathrm{id}_X$. This implies that $f$ has to be injective.
Hence, if one accepts the Axiom of Choice, then it does not matter whether
one defines $|X| \leq |Y|$ via injections $f : X \to Y$ or via surjections $g : Y \to X$. If,
however, one wants to be rather careful and independent of the Axiom of Choice,
then one should use injections here. The previous proposition is not correct for the
special case $X = \emptyset$ and $Y \neq \emptyset$. In this case, the only function $f : \emptyset \to Y$ is injective,
but there is no function whatsoever of type $g : Y \to \emptyset$, in particular, no surjective
one. As a first result we show that any subset $A \subseteq X$ of a set $X$ has smaller or the
same cardinality as $X$.
Proposition 5.4 (Inclusion and cardinality) Let $X$ and $Y$ be sets. Then
$X \subseteq Y \implies |X| \leq |Y|$.
Proof. Let $X$ and $Y$ be sets with $X \subseteq Y$. We consider the identity $\mathrm{id}_Y$ restricted
to $X$, i.e. the function
$f : X \to Y, x \mapsto x$.
This function is clearly injective, hence $|X| \leq |Y|$.
However, one should be careful since $X \subsetneq Y$ does not necessarily mean $|X| < |Y|$.
That is, a set $X$ can very well be smaller than $Y$ in the sense that it contains fewer
elements without having a smaller cardinality. We give an example.
This example is also known as the Hilbert Hotel Paradox, although it is not really
a paradox. The story goes as follows: imagine a hotel with infinitely many rooms
numbered by natural numbers 0, 1, 2, 3, .... Suppose all the rooms are occupied and
there are 5 new guests arriving. Then you can easily create space by asking all the
existing guests to move from room number $n$ into room number $n + 5$. Then the
rooms 0, 1, 2, 3, 4 become vacant and can host the 5 new guests. But even if there is
a bus arriving with infinitely many guests numbered by natural numbers 0, 1, 2, 3, ...
one can create enough space. One just asks each guest in room number $n$ to move into
room number $2n$. Then only the even room numbers are occupied and all the newly
arriving guests can move into the rooms with odd room numbers. One can even
continue this game if there are infinitely many buses arriving, one for each natural
number 0, 1, 2, 3, ..., but we do not go into this here. We close this section with a
number of further examples.
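The room assignments of the Hilbert Hotel are ordinary functions on the natural numbers and can be tried out on a finite window of rooms. The following Python sketch is an illustration only: the function names are our own, the assignment $k \mapsto 2k + 1$ of bus guest number $k$ to an odd room is one possible choice that the text does not fix, and a computer can of course only inspect finitely many rooms.

```python
def move_for_five_guests(n):
    """Guest in room n moves to room n + 5."""
    return n + 5

def move_for_a_bus(n):
    """Guest in room n moves to room 2n."""
    return 2 * n

def new_guest_room(k):
    """Bus guest number k takes the odd room 2k + 1 (an assumed assignment)."""
    return 2 * k + 1

N = range(1000)                                  # finite window of rooms

occupied = {move_for_five_guests(n) for n in N}
print(len(occupied) == len(N))                   # no two guests share a room
print(occupied.isdisjoint(range(5)))             # rooms 0, ..., 4 are free

old = {move_for_a_bus(n) for n in N}
new = {new_guest_room(k) for k in N}
print(old.isdisjoint(new))                       # old and new guests never collide
print(old | new == set(range(2000)))             # together they fill the rooms
```

Each of the three maps is injective but not surjective, which is only possible for a function from a set to itself if the set is infinite.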
Example 5.6 Let $X$, $Y$ and $Z$ be sets.
1. $|2^X| = |\{0, 1\}^X|$, i.e. the power set $2^X$ of $X$ and the set $\{0, 1\}^X$ of functions
$f : X \to \{0, 1\}$ have the same cardinality, which follows from Theorem 4.45.
2. $|(Z^Y)^X| = |Z^{X \times Y}|$, which follows from Theorem 4.43.
3. $|Y^X| < |2^{X \times Y}|$, if $X$ and $Y$ both have at least two elements, which follows
from Problem 4.25.
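5.1 Let $X_1$, $X_2$, $Y_1$ and $Y_2$ be sets. Prove the following:
1. ($|X_1| \leq |X_2|$ and $|Y_1| \leq |Y_2|$) $\implies |X_1 \sqcup Y_1| \leq |X_2 \sqcup Y_2|$,
2. ($|X_1| \leq |X_2|$ and $|Y_1| \leq |Y_2|$) $\implies |X_1 \times Y_1| \leq |X_2 \times Y_2|$,
3. $|X_1| \leq |X_2| \implies |2^{X_1}| \leq |2^{X_2}|$,
4. ($|X_1| = |X_2|$ and $|Y_1| \leq |Y_2|$) $\implies |Y_1^{X_1}| \leq |Y_2^{X_2}|$,
5. $|X_1| \leq |X_2| \implies |X_1!| \leq |X_2!|$.
5.2 Let $X$ and $Y$ be sets and let $x \in X$. Prove that $|\{x\} \times Y| = |Y|$.
5.3 Let $X$ and $Y$ be sets. Prove that the following map is bijective:
$F : X \sqcup Y \to (X \cup Y) \sqcup (X \cap Y),\ (i, z) \mapsto \begin{cases} (i, z) & \text{if } z \in X \cap Y \\ (1, z) & \text{if } z \in (X \cup Y) \setminus (X \cap Y) \end{cases}$
Conclude that $|X \sqcup Y| = |(X \cup Y) \sqcup (X \cap Y)|$. Prove also $|X \cap Y| \leq |X \cup Y| \leq |X \sqcup Y|$.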
5.2 The Theorem of Schröder-Bernstein
We claim that this set satisfies the claim. In order to prove this, it suffices to show
that $A$ is a fixed point of $F$, i.e.
$A = F(A)$,
since this implies $A = F(A) = X \setminus g(Y \setminus f(A))$, which implies $X \setminus A = g(Y \setminus f(A))$
and this is the claim. In order to prove $A = F(A)$, we firstly note that $B \subseteq A$ for
all $B \in M$ and hence monotonicity of $F$ yields
$F(B) \subseteq F(A) = F\left(\bigcup_{B \in M} B\right)$,
which implies
$A = \bigcup_{B \in M} B \subseteq \bigcup_{B \in M} F(B) \subseteq F\left(\bigcup_{B \in M} B\right) = F(A)$.
Since $F$ is monotone, this implies $F(A) \subseteq F(F(A))$ and hence $F(A) \in M$. Hence
$F(A) \subseteq \bigcup_{B \in M} B = A$.
Altogether, this proves $A = F(A)$, i.e. $A$ is a fixed point of $F$. This finishes the
proof.
The diagram in Figure 5.1 illustrates the claim of the previous proposition. Now
[Figure 5.1: the sets $X \setminus A$ in $X$ and $f(A)$, $Y \setminus f(A)$ in $Y$, together with the maps $f$ and $g$]
for all $x \in X$. The diagram in Figure 5.1 illustrates the idea of the construction.
We need to prove that $h$ is bijective.
We first prove that $h$ is surjective. Let $y \in Y$. If $y \in f(A)$, then clearly there
is an $x \in A$ such that $h(x) = f(x) = y$. Otherwise, $y \in Y \setminus f(A)$, but then
$x := g(y) \in X \setminus A$ and hence $h(x) = g^{-1}(x) = y$.
Next we prove that $h$ is injective. To this end, let $x, y \in X$ with $h(x) = h(y)$. If $x$
and $y$ are both in $A$, then we obtain $f(x) = h(x) = h(y) = f(y)$ and hence $x = y$,
since $f$ is injective. If $x$ and $y$ are both in $X \setminus A$, then $g^{-1}(x) = h(x) = h(y) = g^{-1}(y)$.
Since $g^{-1} : \mathrm{range}(g) \to Y$ is bijective and, in particular, injective, this implies $x = y$.
If $x$ and $y$ are not both in $A$ and not both in $X \setminus A$, then we can assume without
loss of generality $x \in A$ and $y \in X \setminus A$. In this case $h(x) = f(x) \in f(A)$ and
$h(y) = g^{-1}(y) \in g^{-1}(X \setminus A) = Y \setminus f(A)$. Hence this case is impossible, since
$h(x) = h(y)$. This finishes the proof that $h$ is injective and hence bijective.
□
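To see the construction at work, here is a small sketch for one concrete pair of injections. The choice $X = Y = \mathbb{N}$ with $f(n) = g(n) = n + 1$ is a hypothetical example, not taken from the text. For this example every chain of preimages terminates, so membership in $A$ can be decided by tracing preimages backwards: $x$ lies in $A$ exactly if the chain stops on the $X$ side.

```python
def f(n: int) -> int:
    """Example injection X -> Y (assumed for illustration)."""
    return n + 1


def g(n: int) -> int:
    """Example injection Y -> X (assumed for illustration)."""
    return n + 1


def in_A(x: int) -> bool:
    """Trace preimages backwards, alternating g^{-1} and f^{-1}.

    0 has no preimage under f or g, so the chain stops there.
    x is in A iff the chain stops on the X side.
    """
    on_x_side = True
    while x != 0:
        x -= 1                      # preimage under g or f
        on_x_side = not on_x_side
    return on_x_side


def h(x: int) -> int:
    """h = f on A and h = g^{-1} on X \\ A."""
    return f(x) if in_A(x) else x - 1   # g^{-1}(x) = x - 1
```

In this example $A$ is the set of even numbers and $h$ swaps $2m$ and $2m + 1$, so $h$ is a bijection although neither $f$ nor $g$ is surjective.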
Now we can conclude that the relations on cardinality that we have studied
satisfy the following important properties.
Corollary 5.9 The following holds for all sets $X$, $Y$ and $Z$:
1. $|X| = |X|$ (reflexivity)
2. $(|X| \le |Y| \text{ and } |Y| \le |X|) \implies |X| = |Y|$ (antisymmetry)
3. $(|X| \le |Y| \text{ and } |Y| \le |Z|) \implies |X| \le |Z|$ (transitivity)
The first statement clearly holds since the identity $\mathrm{id}_X : X \to X$ is bijective, the
second statement is the statement of the Theorem of Schröder-Bernstein 5.8 and the
third statement holds since the composition of two injective maps is injective
by Corollary 4.28. The above properties are basically those of an order relation
(except that the underlying class of all sets is not a set itself). We will study such
order relations later on. This particular order is even total (in a sense specified in
Definition 6.1), as the next result shows.
Theorem 5.10 (Trichotomy) The following statement is equivalent to the Axiom
of Choice. For any two sets $X$ and $Y$ we have $|X| < |Y|$ or $|X| = |Y|$ or $|Y| < |X|$.
The proof is beyond our scope here and we have to postpone it until later.
Problems
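5.4 Prove that the following two functions are injective:
1. $I : \{0,1\}^{\mathbb{N}} \to \mathbb{N}^{\mathbb{N}},\ f \mapsto f$,
2. $J : \mathbb{N}^{\mathbb{N}} \to \{0,1\}^{\mathbb{N}},\ f \mapsto (\underbrace{1, ..., 1}_{f(0)\text{ times}}, 0, \underbrace{1, ..., 1}_{f(1)\text{ times}}, 0, \underbrace{1, ..., 1}_{f(2)\text{ times}}, ...)$.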
5.3 Cantor's Diagonalization Method
In this section we want to prove that there are sets of many different sizes. We start
with a result of Cantor that shows that the power set $2^X$ of any set $X$ is larger
than the set $X$ itself. In some sense this is another instance of Russell's paradox (see
Example 2.8).
Theorem 5.11 (Cantor 1892) Let $X$ be a set. Then
$$|X| < |2^X|.$$
Proof. Let $X$ be a set. We have to prove $|X| \le |2^X|$ and $|X| \ne |2^X|$. That is, it is
sufficient to show that there is an injective function $f : X \to 2^X$ and that there is
no injective function $g : 2^X \to X$. It is easy to see that the function
$$f : X \to 2^X,\ x \mapsto \{x\}$$
is injective: if $x, y \in X$ with $f(x) = f(y)$, then we obtain $\{x\} = \{y\}$, which implies
$x = y$. Now let us assume that there is an injective function $g : 2^X \to X$. Then this
function has a left inverse $h : X \to 2^X$ by Theorem 4.49, which means $h \circ g = \mathrm{id}_{2^X}$.
Now we define the set
$$A := \{x \in X : x \notin h(x)\}.$$
Let $y := g(A)$. Then we obtain $h(y) = h \circ g(A) = A$, which implies
$$y \in A \iff y \notin h(y) \iff y \notin A.$$
This is clearly a contradiction. Hence the assumption was wrong and there cannot
be any injective function $g : 2^X \to X$.
□
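The core of the argument is that the set $A = \{x \in X : x \notin h(x)\}$ is never a value of $h$, whatever map $h : X \to 2^X$ is chosen. For a small finite set this can be checked exhaustively. The sketch below does so for a three-element set; the helper name `diagonal_set` and the dictionary representation of $h$ are illustrative choices, not part of the text.

```python
from itertools import product


def diagonal_set(X, h):
    """A := {x in X : x not in h(x)} for a map h : X -> 2^X given as a dict."""
    return frozenset(x for x in X if x not in h[x])


X = [0, 1, 2]
# All subsets of X, i.e. the power set 2^X.
subsets = [
    frozenset(x for x, bit in zip(X, bits) if bit)
    for bits in product([0, 1], repeat=len(X))
]

# Enumerate every map h : X -> 2^X and check that A is never a value of h.
for values in product(subsets, repeat=len(X)):
    h = dict(zip(X, values))
    assert diagonal_set(X, h) not in h.values()
```

The loop visits all $8^3 = 512$ maps, so for this $X$ no map $h : X \to 2^X$ is surjective.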
In Theorem 4.45 we have proved that the power set $2^X$ has the same cardinality
as the set of functions $\{0,1\}^X$. Hence we obtain the following corollary of Cantor's
Theorem.
Corollary 5.12 For any set $X$ we have $|X| < |\{0,1\}^X|$.
From Cantor's Theorem 5.11 we can deduce that there are infinite sets of different
cardinality. In particular, we get the following corollary.
Corollary 5.13 We have $|\mathbb{N}| < |2^{\mathbb{N}}|$.
That is, the power set $2^{\mathbb{N}}$ of the natural numbers is larger than the set $\mathbb{N}$ of
natural numbers itself, with respect to cardinality. Hence, we get an infinite chain
of larger and larger infinite sets:
$$|\mathbb{N}| < |2^{\mathbb{N}}| < |2^{2^{\mathbb{N}}}| < ...$$
5.5 Show as follows that Proposition 2.9 is also a consequence of Cantor's Theorem 5.11:
1. Assume that there is a universal set $U$ that contains all sets $X$.
2. Show that $X \subseteq U$ implies $2^X \subseteq U$.
3. Conclude that $X \subseteq U$ implies $|2^X| \le |U|$.
4. Show that this leads to a contradiction!
5.4
Since the power set $2^X$ of any set $X$ is strictly larger than the set $X$ itself, the
question arises whether there is any set of cardinality in between. For large enough
finite sets this is certainly the case. If $X$ is finite and has two or more elements then there is a set
$Y$ with $|X| < |Y| < |2^X|$, namely any set $Y$ that has exactly one element more than
$X$ will do the job. However, it was a matter of many mathematical investigations
whether there can be such a set $Y$ for infinite sets $X$ as well. It is the so-called
Continuum Hypothesis that there is no such set. It comes in a generalized form
which makes a statement for any infinite set and in a basic form which makes the
statement only for the set $X = \mathbb{N}$. We capture both in the following definition.
Definition 5.14 (Continuum Hypothesis) The Generalized Continuum Hypothesis is the
statement that for each set $X$ with $|\mathbb{N}| \le |X|$ there does not exist any set
$Y$ with
$$|X| < |Y| < |2^X|.$$
The (ordinary) Continuum Hypothesis is this statement for the special case $X = \mathbb{N}$.
Kurt Gödel proved in 1940 that consistency of Zermelo-Fraenkel set theory implies
consistency of Zermelo-Fraenkel set theory together with the Continuum
Hypothesis. This also holds in the presence of the Axiom of Choice. This implies that
the Continuum Hypothesis cannot be proved to be false using the Zermelo-Fraenkel
axioms together with the Axiom of Choice. In 1963 Paul Cohen proved that the
Continuum Hypothesis also cannot be derived from the Zermelo-Fraenkel axioms,
not even in the presence of the Axiom of Choice. This means that also the negation
of the Continuum Hypothesis together with the Zermelo-Fraenkel axioms and the
5.5
In the previous section we have seen that $|\mathbb{N}| = |2\mathbb{N}|$, i.e. cardinality-wise there are
exactly as many natural numbers as there are even numbers. Perhaps even more
surprisingly, we will show in this section that cardinality-wise there are as many
pairs of natural numbers as there are natural numbers, i.e. $|\mathbb{N} \times \mathbb{N}| = |\mathbb{N}|$. This
proof is due to Cantor and it is called Cantor's first diagonalization. The idea
is captured in the diagram in Figure 5.2. We systematically enumerate all pairs
$(n, k) \in \mathbb{N} \times \mathbb{N}$ of natural numbers in a coordinate system by moving diagonally
through this system. This enumeration yields a function $f : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ that assigns
[Figure 5.2: diagonal enumeration of the pairs $(n, k) \in \mathbb{N} \times \mathbb{N}$ in a coordinate system]
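A diagonal enumeration of $\mathbb{N} \times \mathbb{N}$ can be written down in closed form. The sketch below uses the standard Cantor pairing function; which coordinate increases along each diagonal is an assumption here and may differ from the enumeration used in the text. The names `pair` and `unpair` are illustrative.

```python
from math import isqrt


def pair(n: int, k: int) -> int:
    """Position of (n, k) when the diagonals n + k = 0, 1, 2, ... are listed in turn."""
    d = n + k                        # index of the diagonal
    return d * (d + 1) // 2 + k      # pairs on earlier diagonals, plus offset k


def unpair(m: int) -> tuple[int, int]:
    """Inverse of pair: recover (n, k) from its position m."""
    d = (isqrt(8 * m + 1) - 1) // 2  # largest d with d(d+1)/2 <= m
    k = m - d * (d + 1) // 2
    return d - k, k
```

Since `unpair` inverts `pair` in both directions, `pair` is a bijection between $\mathbb{N} \times \mathbb{N}$ and $\mathbb{N}$.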
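5.7 Prove that the following function is surjective:
$$f : \mathbb{N} \times \mathbb{N} \to \mathbb{Z},\ (n, k) \mapsto n - k.$$
Provide a concrete right inverse $g : \mathbb{Z} \to \mathbb{N} \times \mathbb{N}$ of $f$ (without using the Axiom of Choice).
Show that this implies $|\mathbb{N}| = |\mathbb{Z}|$.
5.8 Prove that the following function is surjective:
$$f : \mathbb{N} \times \mathbb{N} \times \mathbb{N} \to \mathbb{Q},\ (n, k, m) \mapsto \frac{n - k}{m + 1}.$$
Provide a concrete right inverse $g : \mathbb{Q} \to \mathbb{N} \times \mathbb{N} \times \mathbb{N}$ of $f$ (without using the Axiom of Choice).
Show that this implies $|\mathbb{N}| = |\mathbb{Q}|$.
5.9 This question requires some basic knowledge about real numbers. Prove that the following two functions are injective:
1. $F : \{0,1\}^{\mathbb{N}} \to \mathbb{R},\ f \mapsto \sum_{n=0}^{\infty} f(n)\, 3^{-n}$,
2. $G : \mathbb{R} \to 2^{\mathbb{Q} \times \mathbb{Q}},\ x \mapsto \{(a, b) \in \mathbb{Q} \times \mathbb{Q} : a < x < b\}$.
Show that this implies $|2^{\mathbb{N}}| = |\mathbb{R}|$.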
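5.10 Prove that the map
$$f : 2^{\mathbb{N}} \times 2^{\mathbb{N}} \to 2^{\mathbb{N} \sqcup \mathbb{N}},\ (A, B) \mapsto A \sqcup B$$
is bijective. Conclude that $|2^{\mathbb{N}} \times 2^{\mathbb{N}}| = |2^{\mathbb{N}}|$.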
5.11 Show with the help of the previous problems that $|\mathbb{R}^2| = |\mathbb{R}|$.
5.12 This question requires some basic knowledge about complex numbers. Prove that the
following function is bijective:
$$f : \mathbb{R}^2 \to \mathbb{C},\ (a, b) \mapsto a + bi.$$
Show that this implies $|\mathbb{C}| = |\mathbb{R}|$.
5.6
In the following section we want to discuss finite and infinite sets and prototypes
of such sets will be derived from the natural numbers. For this purpose we need to
clarify some further properties of natural numbers. The first property is called the
induction principle.
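Proposition 5.21 (Induction principle) Let $A \subseteq \mathbb{N}$ be a subset that satisfies the
following properties:
1. $0 \in A$ (induction base)
2. $(\forall n \in \mathbb{N})(n \in A \implies n + 1 \in A)$ (induction step)
then $A = \mathbb{N}$.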
We cannot really prove this proposition here, since we are working with an
intuitive concept of the natural numbers. We have just defined $\mathbb{N}$ as the set of numbers
0, 1, 2, ... and on the basis of this informal definition, the above induction principle
is just intuitively correct. The interpretation of the dots ... in the informal definition
of $\mathbb{N}$ is just that with any number $n$ its successor $n + 1$ also follows in that list.
Besides the above induction principle, there is a second principle that also follows
intuitively. This is called the recursion principle. The recursion principle allows us
to define functions inductively (or recursively) following the inductive structure
of natural numbers.
Proposition 5.22 (Recursion principle) Let $X$, $Y$ be sets and let $g : X \to Y$
and $h : Y \times X \times \mathbb{N} \to Y$ be functions. Then there exists exactly one function
$f : X \times \mathbb{N} \to Y$ with
1. $f(x, 0) := g(x)$,
2. $f(x, n + 1) := h(f(x, n), x, n)$
for all $x \in X$ and $n \in \mathbb{N}$.
Once again, we cannot prove this result here, at least not the existence claim,
since we do not use a precise definition of N. However, we can use the induction
principle in order to derive the uniqueness claim in the recursion principle. We
formulate this as an example here that shows how to use the induction method.
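The recursion principle has a direct computational reading: given $g$ and $h$, the value $f(x, n)$ is obtained by starting from $g(x)$ and applying $h$ exactly $n$ times. The following sketch illustrates this; the helper name `recurse` and the two example functions are assumptions made for illustration.

```python
def recurse(g, h):
    """Return f with f(x, 0) = g(x) and f(x, n + 1) = h(f(x, n), x, n)."""
    def f(x, n):
        value = g(x)                 # f(x, 0)
        for i in range(n):
            value = h(value, x, i)   # f(x, i + 1) from f(x, i)
        return value
    return f


# Addition: x + 0 = x and x + (n + 1) = (x + n) + 1.
add = recurse(lambda x: x, lambda y, x, n: y + 1)
# Multiplication: x * 0 = 0 and x * (n + 1) = (x * n) + x.
mul = recurse(lambda x: 0, lambda y, x, n: add(y, x))
```

Here addition is defined from the successor operation and multiplication from addition, following the inductive structure of the natural numbers.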
Example 5.23 Let $X$ and $Y$ be sets and let $g : X \to Y$ and $h : Y \times X \times \mathbb{N} \to Y$
be functions. Let us assume that we have two functions $f : X \times \mathbb{N} \to Y$ and
$f' : X \times \mathbb{N} \to Y$ that both satisfy the equations given in the Recursion Principle 5.22.
We claim that $f = f'$ follows, using the induction principle. In order to show this,
we prove the following claim:
$$(\forall n \in \mathbb{N})(\forall x \in X)\ f(x, n) = f'(x, n).$$
This claim clearly implies $f = f'$. More precisely, this claim is equivalent to the
statement that the set
$$A := \{n \in \mathbb{N} : (\forall x \in X)\ f(x, n) = f'(x, n)\}$$
is equal to $\mathbb{N}$. If we can show that $A$ satisfies both requirements of the Induction
Principle 5.21, then $A = \mathbb{N}$ follows. We prove this now.
Induction base: $n = 0$. In this case clearly $f(x, 0) = g(x) = f'(x, 0)$ for all $x \in X$.
That means $0 \in A$.
Induction step: $n \to n + 1$. Now we assume that $n \in \mathbb{N}$ is fixed and that for this fixed
$n$ we have $(\forall x \in X)\ f(x, n) = f'(x, n)$. This means $n \in A$, which is the so-called
induction hypothesis. We need to show $n + 1 \in A$. We obtain for all $x \in X$
$$f(x, n + 1) = h(f(x, n), x, n) = h(f'(x, n), x, n) = f'(x, n + 1),$$
where the induction hypothesis has been used in the middle equality. This now means
$n + 1 \in A$. Hence we have proved $n \in A \implies n + 1 \in A$.
This finishes the induction. Altogether, we have proved $A = \mathbb{N}$ and hence $f = f'$.
Hence, there can be at most one function $f$ that satisfies the requirements of the
Recursion Principle 5.22.
The above structure of a proof by induction is typical, including the terminology
of an induction base, an induction step and an induction hypothesis. Usually, we will
not formulate the set $A$ explicitly. The implicit understanding is that whenever we
want to prove a statement of the form $(\forall n \in \mathbb{N})\, P(n)$ with a proposition $P(n)$,
then we can achieve this by applying the Induction Principle to the set
$$A := \{n \in \mathbb{N} : P(n)\}.$$
The induction principle is the key idea for the so-called Peano axioms of natural
numbers. We formulate these axioms in the following definition.
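Definition 5.24 (Peano model) We say that a triple $(N, z, s)$ is a Peano model
of the natural numbers if the following holds:
1. $N$ is a set (natural numbers)
2. $z \in N$ (zero)
3. $s : N \to N$ is an injective function with $z \notin \mathrm{range}(s)$ (successor)
4. if $A \subseteq N$ satisfies
a) $z \in A$ (induction base)
b) $(\forall n \in N)(n \in A \implies s(n) \in A)$ (induction step)
then $A = N$.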
Since we are not going to develop set theory axiomatically here, we will not
prove that there are Peano models of the natural numbers at all. We keep on using
our intuitive model $(\mathbb{N}, 0, s)$ where $s : \mathbb{N} \to \mathbb{N},\ n \mapsto n + 1$ is the successor function.
The Induction Principle 5.21 essentially says that $(\mathbb{N}, 0, s)$ is a Peano model of the
natural numbers. We briefly sketch how one can construct a set theoretical model
of the natural numbers, namely by choosing the following sets:
$$0 := \emptyset,\quad 1 := 0 \cup \{0\},\quad 2 := 1 \cup \{1\},\quad 3 := 2 \cup \{2\},\quad ...$$
The set of natural numbers corresponds then to the set $N$ of all these sets and the
successor function corresponds to the function $s : N \to N,\ n \mapsto n \cup \{n\}$. One can
actually prove in a precise way that along these lines one can construct a Peano
model of the natural numbers. But we will not work this out in detail here.
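The first few of these sets can be built explicitly. The sketch below represents them as nested `frozenset` values; the function names `successor` and `von_neumann` are illustrative.

```python
def successor(n: frozenset) -> frozenset:
    """s(n) = n ∪ {n}."""
    return n | frozenset([n])


def von_neumann(k: int) -> frozenset:
    """The set that represents the natural number k, starting from 0 := ∅."""
    n = frozenset()
    for _ in range(k):
        n = successor(n)
    return n
```

In this model the set representing $k$ has exactly $k$ elements, namely the sets representing $0, ..., k - 1$.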
5.13 Use the Induction Principle 5.21 in order to prove the following statement by induction:
$$(\forall n \in \mathbb{N})\quad \sum_{i=0}^{n} i = \frac{n(n + 1)}{2}.$$
5.14 Use the Recursion Principle 5.22 in order to prove that there exists exactly one function
$f : \mathbb{N} \to \mathbb{N}$ such that
1. $f(0) := 1$,
2. $f(n + 1) := f(n) \cdot (n + 1)$,
for all $n \in \mathbb{N}$. This function $f$ is called the factorial function and usually one writes $n! := f(n)$
for all $n \in \mathbb{N}$.
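5.15 For $n, k \in \mathbb{N}$ with $k \le n$ we define the binomial coefficient
$$\binom{n}{k} := \frac{n!}{k!\,(n - k)!}$$
and we define $\binom{n}{k} := 0$ for $k > n$. Prove the following for all $n, k \in \mathbb{N}$:
1. $\binom{n+1}{k+1} = \binom{n}{k} + \binom{n}{k+1}$,
2. $\binom{n}{k}$ is a natural number.
5.7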
We have seen that many common sets are either of the same cardinality as the set
of natural numbers $\mathbb{N}$ or of the same cardinality as the power set $2^{\mathbb{N}}$. In fact, most
infinite sets that commonly occur in mathematics are of one of the two corresponding
cardinalities. We introduce some related terminology. We recall that for each $n \in \mathbb{N}$
with $n \ge 1$ we denote by $\mathbb{N}_n := \{1, 2, 3, ..., n\}$ the set of the natural numbers
$1, ..., n$. We define $\mathbb{N}_0 := \emptyset$ to be the empty set.
Definition 5.25 Let $X$ be a set. Then we say that
1. $X$ is finite if $|X| = |\mathbb{N}_n|$ for some $n \in \mathbb{N}$,
2. $X$ is infinite if $X$ is not finite,
3. $X$ is countable if $|X| \le |\mathbb{N}|$,
4. $X$ is countably infinite if $|X| = |\mathbb{N}|$,
5. $X$ is uncountable if $X$ is not countable.
We note that it follows directly from the definition that the empty set is finite,
each finite set is countable and each countably infinite set is countable. Countable
sets are sometimes also called denumerable. We give some examples.
Example 5.26 We discuss some examples of sets.
1. The empty set $\emptyset$ is finite and so are $2$ and $\mathbb{N}_n$ for each $n \in \mathbb{N}$.
2. The set $P$ of prime numbers is infinite. This is exactly what we proved in Theorem 1.2.
3. The sets $\mathbb{N}$, $\mathbb{Z}$ and $\mathbb{Q}$ are all countably infinite.
4. The sets $2^{\mathbb{N}}$, $\mathbb{R}$ and $2^{\mathbb{R}}$ are uncountable.
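Next we prove that the relation between the cardinalities of $\mathbb{N}_n$ and $\mathbb{N}_k$ can be
directly deduced from $n$ and $k$.
Proposition 5.27 (Finite sets) Let $n, k \in \mathbb{N}$. Then we obtain:
1. $|\mathbb{N}_n| \le |\mathbb{N}_k| \iff n \le k$,
2. $|\mathbb{N}_n| = |\mathbb{N}_k| \iff n = k$,
3. $|\mathbb{N}_n| < |\mathbb{N}_k| \iff n < k$.
Proof.
1. Let $|\mathbb{N}_n| \le |\mathbb{N}_k|$. Then there is an injective map $f : \mathbb{N}_n \to \mathbb{N}_k$. If $n > k$, then
the values $f(1), ..., f(n)$ cannot all be distinct, but one of the values $1, ..., k$
must occur twice among $f(1), ..., f(n)$. In this case $f$ is not injective. Hence
$n \le k$. Let us now assume that $n \le k$. Then $\mathbb{N}_n \subseteq \mathbb{N}_k$ and hence $|\mathbb{N}_n| \le |\mathbb{N}_k|$
by Proposition 5.4.
2. Let $|\mathbb{N}_n| = |\mathbb{N}_k|$. This means $|\mathbb{N}_n| \le |\mathbb{N}_k|$ and $|\mathbb{N}_k| \le |\mathbb{N}_n|$ and hence $n \le k$
and $k \le n$ by 1. Hence $n = k$ follows. If, on the other hand, $n = k$, then the
identity $\mathrm{id} : \mathbb{N}_n \to \mathbb{N}_k$ is clearly bijective and hence $|\mathbb{N}_n| = |\mathbb{N}_k|$.
3. This follows directly from 1. and 2.
□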
The previous result is the reason why one can actually consider $|X|$ as a natural
number for finite sets $X$.
Definition 5.28 (Cardinality) Let $X$ be a finite set and $n \in \mathbb{N}$. Then we define
$$|X| = n :\iff |X| = |\mathbb{N}_n|.$$
In this case, the natural number $n \in \mathbb{N}$ is called the cardinality of $X$.
5.16 Let $f : X \to Y$ be a function and let $A \subseteq X$ and $B \subseteq Y$. Then we obtain:
1. $A$ finite $\implies$ $f(A)$ finite,
2. $A$ countable $\implies$ $f(A)$ countable,
3. $B$ infinite and $f$ surjective $\implies$ $f^{-1}(B)$ infinite,
4. $B$ uncountable and $f$ surjective $\implies$ $f^{-1}(B)$ uncountable.
5.17 Let $X$ be a finite set. Prove that any surjective map $f : X \to X$ is bijective.
5.8
Even before Cantor defined finite sets in terms of Nn and infinite sets as sets that
are not finite, Richard Dedekind already suggested a concept of infinity that does
not refer to the natural numbers but that uses an intrinsic property of finite sets
to characterize them. This concept leads to further important characterizations of
finite and infinite sets (which, however, require the Axiom of Choice).
Definition 5.33 (Dedekind infinite sets) A set $X$ is called Dedekind infinite if
and only if there is a proper subset $A \subsetneq X$ such that $|A| = |X|$.
We give an example.
Example 5.34 The set $\mathbb{N}$ is Dedekind infinite, since $|\mathbb{N}| = |2\mathbb{N}|$ and $2\mathbb{N} \subsetneq \mathbb{N}$. The
set $P$ of prime numbers is also Dedekind infinite (see Problem 5.18).
In the following theorem we collect a number of conditions that are equivalent
to Dedekind infiniteness.
Theorem 5.35 (Dedekind infinite sets) Let $X$ be a set. Then the following are
equivalent:
1. $X$ is Dedekind infinite,
2. there is a function $f : X \to X$ which is injective but not bijective,
3. $|\mathbb{N}| \le |X|$,
4. $X$ has a countably infinite subset.
h( 21 h1 (x)) if x B
x
if x C
5.18 Prove that the set of prime numbers $P$ is Dedekind infinite, without using the Axiom
of Choice, directly by going back to the Theorem of Euclid 1.2 (see also Problem 1.2).
5.19 Let $A \subseteq \mathbb{N}$. Prove that it follows from the Axiom of Choice that the following are
equivalent:
1. $A$ is infinite,
2. $(\forall n \in \mathbb{N})(\exists k \in \mathbb{N})(k \ge n \text{ and } k \in A)$.
5.20 Let $X$ be a set. Prove that the following are equivalent (without the Axiom of Choice):
1. there exists a function $f : X \to X$ which is surjective but not injective,
2. there exists a surjective map $g : X \to Y$ with a countably infinite set $Y$,
3. the power set $2^X$ is Dedekind infinite.
5.21 Let $X$ be a set. Prove that it follows from the Axiom of Choice that the following are
equivalent:
1. $X$ is countably infinite,
2. $X$ is countable and infinite.
5.9
In this section we investigate how set constructions affect the cardinality of sets. We
will only treat finite and countable sets here. The first observation is that all finite
set constructions that we have considered preserve finiteness. That means that a
finite union, intersection, product, or the power set or function set of finite sets is
finite again. We can say more than this: we can determine formulas that allow us to
compute the size of the resulting sets if we know the size of the original sets. We
assume that $k^0 = 1$ for all $k \in \mathbb{N}$.
Theorem 5.39 (Constructions on finite sets) Let $X$ and $Y$ be finite sets. Then
we obtain
1. $|X \cup Y| + |X \cap Y| = |X \sqcup Y| = |X| + |Y|$,
2. $|X \times Y| = |X| \cdot |Y|$,
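where $(\mathbb{N}_n \times \mathbb{N}_k) \cap (\{n+1\} \times \mathbb{N}_k) = \emptyset$. Hence we obtain with 1. and Problem 5.2
$$|\mathbb{N}_{n+1} \times \mathbb{N}_k| = |\mathbb{N}_n \times \mathbb{N}_k| + |\{n + 1\} \times \mathbb{N}_k| = n \cdot k + k = (n + 1) \cdot k.$$
3. Since $|X| = |\mathbb{N}_n|$ and $|Y| = |\mathbb{N}_k|$ implies $|Y^X| = |\mathbb{N}_k^{\mathbb{N}_n}|$ (see Problem 5.1), it
suffices to show $|\mathbb{N}_k^{\mathbb{N}_n}| = k^n$ for all $n, k \in \mathbb{N}$. We prove this by induction on
$n \in \mathbb{N}$.
Induction base: $n = 0$. Then $\mathbb{N}_0 = \emptyset$ and there is only one function $f : \emptyset \to \mathbb{N}_k$,
i.e. $|\mathbb{N}_k^{\mathbb{N}_0}| = 1 = k^0$ for all $k \in \mathbb{N}$.
Induction step: $n \to n + 1$. We assume that we have a fixed $n \in \mathbb{N}$ with
$|\mathbb{N}_k^{\mathbb{N}_n}| = k^n$ for all $k \in \mathbb{N}$. Now we consider the map
$$F : \mathbb{N}_k^{\mathbb{N}_{n+1}} \to \mathbb{N}_k^{\mathbb{N}_n} \times \mathbb{N}_k,\ f \mapsto (f|_{\mathbb{N}_n}, f(n + 1)),$$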
$\bigcup_{i \in I} X_i$ and $\bigcap_{i \in I} X_i$ are countable,
$$\begin{cases} 2f(z) & \text{if } z \in X \\ 2g(z) + 1 & \text{if } z \in Y \setminus X \end{cases}$$
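2. It is sufficient to prove the statement for $I = \mathbb{N}_n$ with $n \in \mathbb{N}$ and for $I = \mathbb{N}$.
The case of a general $I$ can be reduced to either of these cases. The case
$I = \mathbb{N}_n$ with $n \in \mathbb{N}$ follows from 1. by induction over $n \in \mathbb{N}$. We only discuss
the case $I = \mathbb{N}$ here. We consider the function
$$m : \bigcup_{i \in \mathbb{N}} X_i \to \mathbb{N},\ x \mapsto \min\{i \in \mathbb{N} : x \in X_i\}$$
that determines for each $x \in \bigcup_{i \in \mathbb{N}} X_i$ the smallest index $i = m(x) \in \mathbb{N}$ such
that $x \in X_i$. Using this function, we define
$$F : \bigcup_{i \in \mathbb{N}} X_i \to \mathbb{N} \times \mathbb{N},\ x \mapsto (f_{m(x)}(x), m(x)).$$
This function $F$ is injective and hence $|\bigcup_{i \in \mathbb{N}} X_i| \le |\mathbb{N} \times \mathbb{N}| = |\mathbb{N}|$ by Corollary
5.17. This means that $\bigcup_{i \in \mathbb{N}} X_i$ is countable. Moreover, $\bigcap_{i \in \mathbb{N}} X_i \subseteq \bigcup_{i \in \mathbb{N}} X_i$
and hence $|\bigcap_{i \in \mathbb{N}} X_i| \le |\bigcup_{i \in \mathbb{N}} X_i| \le |\mathbb{N}|$ and this means that $\bigcap_{i \in \mathbb{N}} X_i$
is countable as well.
3. The function
$$f \times g : X \times Y \to \mathbb{N} \times \mathbb{N},\ (x, y) \mapsto (f(x), g(y))$$
is injective (see Problem 4.8). Hence $|X \times Y| \le |\mathbb{N} \times \mathbb{N}| = |\mathbb{N}|$ by Corollary 5.17.
This means that $X \times Y$ is countable. It follows that $X^n$ is countable for all
$n \in \mathbb{N}$ by an easy induction on $n$.
Other operations that are well-behaved with respect to countability are the disjoint
union (see Problem 5.24) and the Kleene star operation (see Problem 5.25).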
5.22 Let $X$ and $Y$ be finite sets. Prove that
1. $|X \setminus Y| = |X| - |X \cap Y|$,
2. $|X \setminus Y| = |X| - |Y|$ if $Y \subseteq X$,
3. $|X \triangle Y| = |X \cup Y| - |X \cap Y|$.
5.23 Prove that the set $\mathbb{N}!$ of bijections $f : \mathbb{N} \to \mathbb{N}$ is not countable.
5.24 Let $(X_i)_{i \in \mathbb{N}}$ be a sequence of countable sets, i.e. $X_i$ is countable for all $i \in \mathbb{N}$.
Prove that the disjoint union $\bigsqcup_{i \in \mathbb{N}} X_i$ is countable.
5.25 Let $X$ be a set. We consider the Kleene operation $X^*$ (i.e. the set of all finite words
over $X$). Prove the following:
1. $\{0,1\}^*$ is countably infinite,
$\frac{n!}{(n-k)!}$
5.29 Let $X$ be a set. Prove that the following are equivalent, without using the Axiom of
Choice:
1. $X$ is infinite,
2. $2^{2^X}$ is Dedekind infinite.
6 Order
The mathematical sciences particularly exhibit order, symmetry, and limitation;
and these are the greatest forms of the beautiful.
Aristotle (384-322 BC)
6.1
What is Order?
So far we have studied mainly uniqueness and totality properties of relations and
their consequences. The concepts of left and right totality and uniqueness are the
building blocks that we have used to define functions, injections, surjections and
bijections. These building blocks have also led to the concept of cardinality. Now
we want to focus on homogeneous relations that extend the identity relation. Such
relations can be used to order mathematical objects and to identify them according
to specific properties. Often there are different properties that we can use to identify
or order objects. For instance, the relation $\le$ on natural numbers orders the natural
numbers according to their appearance in the sequence 0, 1, 2, 3, ... (one could also
say that this is an additive property), whereas the divisibility relation $|$ orders natural
numbers according to their multiplicative properties. These two ways of ordering
natural numbers focus on different properties and they are not identical. However,
they share certain properties as we will see. As another example, the relation $\subseteq$
orders sets according to the containment of elements and the relation $|X| \le |Y|$
orders sets according to their cardinality. As we have seen, these different types of
ordering sets are not identical, but again they share certain properties. In a first
step we will identify the relevant properties that certain types of order relations have
in common.
6.2
[Table: which of the properties reflexive, irreflexive, symmetric, antisymmetric, transitive and total hold for a number of example relations, among them $\mathbb{N} \times \mathbb{N}$, $\neq$ and $<$]
A plus + in the table means that the relation has the corresponding property, a
minus means that it does not have the property.
[Figure 6.1: diagram relating homogeneous relation, irreflexive relation, reflexive relation, strict order, compatibility relation, preorder, equivalence relation, partial order, equality relation and linear order via the properties irreflexive, reflexive, symmetric, antisymmetric, transitive and total]
Symmetry is quite a restrictive property for relations. The equality relation,
the empty relation and the all relation are each the unique symmetric relation with
certain additional properties. For instance, equality is the only relation on a given set
that is reflexive, symmetric and antisymmetric (see Problem 6.2). It is interesting to point out that
the properties of homogeneous relations defined here can be characterized in terms
of composition and inversion and without mentioning points. We formulate a corresponding result.
Proposition 6.3 Let $R \subseteq X \times X$ be a relation. Then the following hold:
1. $R$ is reflexive if and only if $\Delta_X \subseteq R$.
2. $R$ is irreflexive if and only if $R \cap \Delta_X = \emptyset$.
3. $R$ is symmetric if and only if $R = R^{-1}$.
4. $R$ is antisymmetric if and only if $R \cap R^{-1} \subseteq \Delta_X$.
5. $R$ is transitive if and only if $R \circ R \subseteq R$.
6. $R$ is total if and only if $X \times X \subseteq R \cup R^{-1}$.
We leave the proof to the reader (see Problem 6.1). The diagram in Figure 6.1
shows how the building blocks of reflexivity, symmetry and transitivity can be used
to define certain common types of order relations. Those types of relations that are
highlighted in bold face are the most common ones in mathematics. We will, in
particular, focus on the right hand side of the diagram and study those relations
that contain the equality relation (i.e. the reflexive relations).
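The point-free characterizations of Proposition 6.3 translate almost literally into operations on finite sets of pairs. The following sketch assumes that a relation on a finite set $X$ is stored as a Python set of pairs; all function names are illustrative.

```python
def diagonal(X):
    """The equality relation (diagonal) on X."""
    return {(x, x) for x in X}


def inverse(R):
    """The inverse relation R^{-1}."""
    return {(y, x) for (x, y) in R}


def compose(R, S):
    """All pairs (x, z) with (x, y) in R and (y, z) in S for some y."""
    return {(x, z) for (x, y) in R for (y2, z) in S if y == y2}


def properties(R, X):
    """Evaluate the six conditions of Proposition 6.3 for R on X."""
    return {
        "reflexive": diagonal(X) <= R,
        "irreflexive": not (R & diagonal(X)),
        "symmetric": R == inverse(R),
        "antisymmetric": (R & inverse(R)) <= diagonal(X),
        "transitive": compose(R, R) <= R,
        "total": {(x, y) for x in X for y in X} <= (R | inverse(R)),
    }
```

For example, `properties({(x, y) for x in range(4) for y in range(4) if x <= y}, range(4))` reports the less or equal relation on $\{0, 1, 2, 3\}$ as reflexive, antisymmetric, transitive and total.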
6.1 Let $R \subseteq X \times X$ be a relation. Prove all the statements in Proposition 6.3.
6.2 Let $X$ be a set. Prove the following:
1. The equality relation $\Delta_X \subseteq X \times X$ is the only relation on $X$ that is reflexive, symmetric
and antisymmetric.
2. The empty relation $\emptyset \subseteq X \times X$ is the only relation on $X$ that is irreflexive, symmetric
and antisymmetric.
3. The all relation $X \times X$ is the only relation on $X$ that is symmetric and total.
6.3 Let $R \subseteq X \times X$ be a relation.
1. We define the reflexive closure of $R$ by $R^= := \Delta_X \cup R$. Prove the following:
a) $R^=$ is a reflexive relation.
b) If $S \subseteq X \times X$ is reflexive and $R \subseteq S$, then $R^= \subseteq S$.
c) $R^= = \bigcap \{S \subseteq X \times X : S \text{ reflexive and } R \subseteq S\}$.
d) If $R$ is reflexive, then $R = R^=$.
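2. We define the reflexive and symmetric closure of $R$ by $R^{\leftrightarrow} := \Delta_X \cup R \cup R^{-1}$. Prove the
following:
a) $R^{\leftrightarrow}$ is a reflexive and symmetric relation.
b) If $S \subseteq X \times X$ is reflexive and symmetric and $R \subseteq S$, then $R^{\leftrightarrow} \subseteq S$.
c) $R^{\leftrightarrow} = \bigcap \{S \subseteq X \times X : S \text{ reflexive and symmetric and } R \subseteq S\}$.
d) If $R$ is reflexive and symmetric, then $R = R^{\leftrightarrow}$.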
3. We define the transitive closure of $R$ by $R^+ := \bigcup_{n=1}^{\infty} R^n$. Here $R^n$ stands for the
$n$-fold composition of $R$ with itself. Prove the following:
a) $R^+$ is a transitive relation.
b) If $S \subseteq X \times X$ is transitive and $R \subseteq S$, then $R^+ \subseteq S$.
c) $R^+ = \bigcap \{S \subseteq X \times X : S \text{ transitive and } R \subseteq S\}$.
d) If $R$ is transitive, then $R = R^+$.
6.4 Let $X$ be a finite set with $n = |X|$ elements. Prove the following
2
reflexive relations $R \subseteq X \times X$.
6.3
Equivalence Relations
Often one needs to identify some mathematical objects that share certain properties.
Equivalence relations are a tool to express such identifications.
Definition 6.4 (Equivalence relation) Let $\equiv$ be a relation on $X$. Then $\equiv$ is
called an equivalence relation on $X$, if $\equiv$ is reflexive, symmetric and transitive.
Perhaps the most basic example of an equivalence relation is the equality relation
= on an arbitrary set X. Obviously, we use the equality to identify objects. However,
in general we can also identify objects which are not equal.
Example 6.5 We mention a few examples of equivalence relations.
1. Let $X$ be a set with the diagonal $\Delta_X \subseteq X \times X$. The diagonal can be seen as
equality relation on $X$ and this relation is an equivalence relation.
2. Let $X$ be a set with the all relation $X \times X$. This relation is also an equivalence
relation on $X$.
3. Let $S$ be a set of sets. We define the equinumerosity relation $\approx\ \subseteq S \times S$ by
$$X \approx Y :\iff |X| = |Y|$$
for all $X, Y \in S$. By Corollary 5.9 this relation is an equivalence relation.
4. Let $n \in \mathbb{N}$ be fixed. For integers $x, y \in \mathbb{Z}$ we define the relation
$$x \equiv_n y :\iff n \text{ divides } x - y.$$
In this case $x$ is called congruent to $y$ modulo $n$. The relation $\equiv_n$ is an
equivalence relation (see Problem 6.9).
5. Let $f : X \to Y$ be a function. Then we define the fiber relation $\equiv_f\ \subseteq X \times X$
of $f$ by
$$x \equiv_f y :\iff f(x) = f(y)$$
for all $x, y \in X$. This relation is an equivalence relation. The fiber relation
$\equiv_f$ is also called the equivalence kernel of $f$.
We will study the fiber relation $\equiv_f$ of functions somewhat more below, since it
is a particularly important equivalence relation. Since the purpose of equivalence
relations is to identify objects, we need a tool to combine those objects that we
identify. By $[x]$ we denote the equivalence class of $x$, which is the set of all those
objects that are identified with $x$ with respect to some given equivalence relation.
The map in the following definition assigns to each point its equivalence class.
Definition 6.8 (Canonical projection) Let $\equiv$ be an equivalence relation on a
set $X$. Then
$$p : X \to X/{\equiv},\ x \mapsto [x]$$
is called the canonical projection of the equivalence relation $\equiv$.
It is easy to see that $p$ is surjective and that the fiber relation $\equiv_p$ of $p$ is just $\equiv$
(see Problem 6.6). In particular, this shows that any equivalence relation $\equiv$ on $X$
is the fiber relation of some function $f : X \to Y$ for some suitable $Y$. Now we prove
that the canonical projection and the fiber relation can be used to decompose any
function into an injective, a bijective and a surjective function.
Theorem 6.9 (Canonical decomposition) Let $f : X \to Y$ be a function. Then
$f$ can be decomposed into an injective function $i$, a bijective function $b$ and a
surjective function $p$, i.e.
$$f = i \circ b \circ p.$$
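For a function on a finite set the three factors can be computed directly. The sketch below assumes the usual choice of factors, which is not spelled out in the surviving text: $p$ is the canonical projection onto the classes of the fiber relation of $f$, $b$ sends a class $[x]$ to $f(x)$, and $i$ is the inclusion of $\mathrm{range}(f)$ into $Y$. The function name `canonical_decomposition` is illustrative.

```python
def canonical_decomposition(f, X):
    """Return (p, b, i) with f(x) == i(b(p(x))) for all x in X."""
    def p(x):
        # Canonical projection: x is sent to its class under the fiber relation of f.
        return frozenset(z for z in X if f(z) == f(x))

    def b(cls):
        # All members of a class share one value of f, so any member will do.
        return f(next(iter(cls)))

    def i(y):
        # Inclusion of range(f) into the codomain.
        return y

    return p, b, i
```

For $f(x) = x^2$ on $X = \{-3, ..., 3\}$ the projection `p` identifies $x$ with $-x$, and `b` is a bijection between the four resulting classes and the four values $0, 1, 4, 9$.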
[Figure: the canonical decomposition of $f : X \to Y$ through $X/{\equiv_f}$ and $\mathrm{range}(f)$ with the bijection $b$]
6.5 Let $\equiv\ \subseteq X \times X$ be an equivalence relation. Then the following three statements are
equivalent to each other for all $x, y \in X$:
1. $x \equiv y$,
2. $[x] = [y]$,
3. $[x] \cap [y] \neq \emptyset$.
6.6 Let $\equiv$ be an equivalence relation on $X$ with canonical projection $p : X \to X/{\equiv}$. Prove
the following:
1. $p$ is surjective,
2. $\equiv$ is the same relation as $\equiv_p$.
Here $\equiv_p$ denotes the fiber relation of $p$.
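6.7 Let $X$ be a set and $R \subseteq X \times X$ a relation on $X$. We define the equivalence closure of
$R$ by $R^{\equiv} := (R^{\leftrightarrow})^+$, the transitive closure of the reflexive and symmetric closure of $R$ (see Problem 6.3). Prove the following:
1. $R^{\equiv}$ is an equivalence relation.
2. If $S \subseteq X \times X$ is an equivalence relation and $R \subseteq S$, then $R^{\equiv} \subseteq S$.
3. $R^{\equiv} = \bigcap \{S \subseteq X \times X : S \text{ is an equivalence relation and } R \subseteq S\}$.
4. If $R$ is an equivalence relation, then $R = R^{\equiv}$.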
6.8 Let $X$ be a set. Then $P \subseteq 2^X$ is called a partition of $X$, if the following hold:
1. $\emptyset \notin P$ (nonemptiness)
2. $\bigcup P = X$ (cover)
3. $A \neq B \implies A \cap B = \emptyset$ for all $A, B \in P$ (disjointness)
$$B_{n+1} = \sum_{k=0}^{n} \binom{n}{k} B_k$$
for all $n \in \mathbb{N}$.
6.4
The minimal requirements that an order relation should satisfy are reflexivity and
transitivity. The properties of antisymmetry and totality are additional properties
that are considered.
Definition 6.10 (Order) Let $\le$ be a relation on $X$.
1. $\le$ is called a preorder, if it is reflexive and transitive.
2. $\le$ is called a partial order if it is reflexive, antisymmetric and transitive.
3. $\le$ is called a linear order if $\le$ is reflexive, antisymmetric, transitive and total.
The pair $(X, \le)$ is called a preordered set, a partially ordered set or a linearly ordered
set if $\le$ is a preorder, a partial order or a linear order on $X$, respectively.
Preorders are sometimes also called quasi-orders and linear orders are also called
total orders.
Example 6.11 We provide a number of examples of order relations.
1. The less or equal relation $\le$ on $\mathbb{N}$ is a linear order.
2. The divisibility relation $|$ on $\mathbb{N}$ is a partial order, but not a linear order.
3. The divisibility relation $|$ on $\mathbb{Z}$, defined by
$$x \mid y :\iff (\exists z \in \mathbb{Z})\ x \cdot z = y$$
for all $x, y \in \mathbb{Z}$, is a preorder, but not a partial order.
4. The prefix relation $\sqsubseteq$ on $X^*$ for some set $X$, defined by
$$(n, u_1, ..., u_n) \sqsubseteq (k, v_1, ..., v_k) :\iff (n \le k \text{ and } (\forall i \le n)\ u_i = v_i)$$
for all $n, k \in \mathbb{N}$ and $u_1, ..., u_n, v_1, ..., v_k \in X$, is a partial order, but not a linear
order in general.
5. The subset relation $\subseteq$ on the power set $2^X$ of some set $X$ is a partial order,
but not a linear order in general (see Problem 6.12).
6. The relation $\preceq$ on a set of sets $S$, defined by
$$X \preceq Y :\iff |X| \le |Y|$$
for all $X, Y \in S$ is a preorder, but not a partial order in general. It follows
from the Axiom of Choice that it is total (see Theorem 5.10).
The graphs in Figure 6.4 and 6.8 illustrate the partially ordered spaces $(\mathbb{N}, \le)$,
$(\mathbb{N}, |)$ and $(2^{\mathbb{N}}, \subseteq)$. Such a graph is called a Hasse diagram. If $x$ is below $y$ and
connected to $y$ by an edge, then this means that $x \le y$. Edges that follow from
transitivity are omitted in these graphs. Linearly ordered sets like $(\mathbb{N}, \le)$ actually
correspond to linear graphs in this way.
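The edges of a Hasse diagram of a finite partially ordered set can be computed by discarding every pair that follows from transitivity. The sketch below does this for an arbitrary order given as a predicate; the names `hasse_edges` and `divides` are illustrative, and the divisibility example is restricted to the numbers 1 to 12 as an assumption to keep the set finite.

```python
def hasse_edges(elements, leq):
    """Pairs (x, y) with x strictly below y and no z strictly between them."""
    elements = list(elements)

    def lt(a, b):
        return a != b and leq(a, b)

    return {
        (x, y)
        for x in elements
        for y in elements
        if lt(x, y) and not any(lt(x, z) and lt(z, y) for z in elements)
    }


def divides(x: int, y: int) -> bool:
    """Divisibility for positive integers."""
    return y % x == 0


edges = hasse_edges(range(1, 13), divides)
```

Here `(2, 4)` is an edge, while `(1, 4)` is omitted because it follows from `(1, 2)` and `(2, 4)`. For a linear order such as the usual order on `range(5)` the edges form a single chain.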
Figure 6.4: Hasse diagrams of the partially ordered sets $(\mathbb{N}, \le)$ and $(\mathbb{N}, |)$
Whenever we have a preorder, we automatically get some equivalence relations.
If $R$ is a relation, then $\widehat{R} := R \cup R^{-1}$ is called the symmetric closure of $R$ (see
Problem 6.3). On the other hand, for a preorder $R$ the intersection $R \cap R^{-1}$ is also an equivalence
relation. This is what the following result says in other words.
Proposition 6.12 (Induced equivalence relation) Let $\le$ be a preorder on $X$.
Then we obtain an equivalence relation $\equiv$ on $X$ by
$$x \equiv y :\iff (x \le y \text{ and } y \le x)$$
This result is the reason why one usually studies partial orders and not preorders.
If a relation is just a preorder on a given set $X$, then one can replace it by the partial
order induced on the quotient set $X/{\equiv}$.
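As an illustration, consider the divisibility preorder on the integers, which is not antisymmetric because $x \mid -x$ and $-x \mid x$. The sketch below computes the classes of the induced equivalence relation on a finite range of integers; the function names and the chosen range are assumptions made for illustration.

```python
def divides(x: int, y: int) -> bool:
    """x | y  iff  x * z == y for some integer z."""
    return y == 0 if x == 0 else y % x == 0


def equivalent(x: int, y: int) -> bool:
    """Induced equivalence: x | y and y | x."""
    return divides(x, y) and divides(y, x)


def equivalence_class(x: int, X) -> frozenset:
    """The class [x] within the finite set X."""
    return frozenset(y for y in X if equivalent(x, y))


X = range(-4, 5)
classes = {equivalence_class(x, X) for x in X}
```

The classes are $\{0\}$ and the pairs $\{x, -x\}$, so on the quotient set divisibility behaves like divisibility on the natural numbers, which is a partial order.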
6.12 Let $X$ be a set and consider the subset relation $\subseteq$ on $2^X$. Prove that $\subseteq$ is linear if and
only if $X$ has fewer than two elements.
6.13 Let $\le$ be a partial order on a set $X$. Prove that the induced equivalence relation $\equiv$ is
the equality on $X$.
6.14 We call a relation $<$ on $X$ a strict order if it is irreflexive and transitive. Let $\le$ be a
preorder on $X$. Prove that by
$$x < y :\iff (x \le y \text{ and } y \not\le x)$$
we can define a strict order on $X$, which is called the induced strict order of $\le$. Prove that
1. the strict order $<$ induced by the usual less or equal relation $\le$ on $\mathbb{N}$ is the usual
strictly less relation,
2. the strict order $\subsetneq$ induced by the inclusion relation $\subseteq$ on the power set $2^X$ of some
set $X$ is the usual proper inclusion relation,
3. the strict order $\prec$ induced by the cardinality relation $\preceq$ on some set $S$ of sets satisfies
the property $X \prec Y \iff |X| < |Y|$ for all $X, Y \in S$.
6.15 Let $X$ be a finite set with $n = |X|$ elements. Prove that there are exactly $n!$ linear
orders on $X$.
6.5 Monoids
In the previous section we have seen that equivalence relations and partial orders
are often induced by preorders. But where do preorders come from? In this section
we show that many preorders are induced by monoids. A monoid is an algebraic
structure with one binary operation $\cdot : X \times X \to X$. In general, an algebraic
structure is a set together with operations that typically satisfy some additional
conditions. In the following definition we list some conditions that apply to a single
binary operation.
Definition 6.15 (Binary operations) Let $X$ be a set with a binary operation
$\cdot : X \times X \to X$. We usually write $x \cdot y$ instead of $\cdot(x, y)$ for all $x, y \in X$. Let $e \in X$.
We define the following properties of binary operations:
1. $\cdot$ is called associative if $x \cdot (y \cdot z) = (x \cdot y) \cdot z$ for all $x, y, z \in X$,
2. $\cdot$ is called commutative if $x \cdot y = y \cdot x$ for all $x, y \in X$,
3. $e$ is said to be an identity for $\cdot$ if $x \cdot e = x = e \cdot x$ for all $x \in X$,
4. if $e$ is an identity for $\cdot$, then $y \in X$ is said to be an inverse for $x \in X$ with respect
to $\cdot$, if $x \cdot y = e = y \cdot x$.
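On a finite set these properties can be checked by brute force. The following Python sketch is an illustration under our own assumptions: the function names are hypothetical, and as examples we take addition and subtraction modulo 4 on $\{0, 1, 2, 3\}$, which are not discussed in the text.

```python
from itertools import product


def is_associative(X, op):
    return all(op(x, op(y, z)) == op(op(x, y), z)
               for x, y, z in product(X, repeat=3))


def is_commutative(X, op):
    return all(op(x, y) == op(y, x) for x, y in product(X, repeat=2))


def identities(X, op):
    """All e in X with x . e == x == e . x for every x in X."""
    return [e for e in X if all(op(x, e) == x == op(e, x) for x in X)]


def inverses(X, op, e, x):
    """All y in X with x . y == e == y . x."""
    return [y for y in X if op(x, y) == e == op(y, x)]


Z4 = range(4)


def add(x, y):
    return (x + y) % 4


def sub(x, y):
    return (x - y) % 4


print(is_associative(Z4, add), is_commutative(Z4, add))  # True True
print(identities(Z4, add), inverses(Z4, add, 0, 1))      # [0] [3]
print(is_associative(Z4, sub), identities(Z4, sub))      # False []
```

Subtraction fails associativity, for instance $1 - (1 - 1) = 1$ while $(1 - 1) - 1 = 3$ modulo 4, and it has no two-sided identity.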
Algebraic structures with one binary operation $\cdot : X \times X \to X$ that satisfy
some combinations of these conditions have particular names. We give a survey of
some common structures of this kind in the diagram in Figure 6.5. Here, we are mainly
interested in so-called monoids.
Figure 6.5: Algebraic structures with one binary operation. Magma → semigroup (associative) → monoid (identity) → group (inverses); magma → quasigroup (inverses) → loop (identity) → group (associative).
6. $(X^X, \circ, \mathrm{id}_X)$ is a monoid, where $\circ : X^X \times X^X \to X^X$ is the composition.
We leave the proofs of these facts to Problem 6.17. The reason why we discuss
monoids here is that each monoid automatically comes with an induced preorder
that we mention in the following result.
Theorem 6.18 (Preorder of monoids) Let $(M, \cdot, e)$ be a monoid. Then we obtain a preorder $\le$ on $M$ by
$x \le y :\iff (\exists z \in M)\; x \cdot z = y$
for all $x, y \in M$. The preorder $\le$ is called the induced preorder of the monoid
$(M, \cdot, e)$.
Proof. Firstly, $\le$ is reflexive, since $\cdot$ has an identity $e$ and hence $x = x \cdot e$ for all
$x \in M$ and hence $x \le x$. Secondly, let $x \le y$ and $y \le z$ for $x, y, z \in M$. Then there
are $a, b \in M$ such that $y = x \cdot a$ and $z = y \cdot b$. Hence $z = (x \cdot a) \cdot b = x \cdot (a \cdot b)$
because $\cdot$ is associative. But this means $x \le z$. Hence $\le$ is transitive. $\Box$
The following example shows that many of the preorders that we have considered
here are actually induced by monoids.
Example 6.19 Let $X$ be a set. We consider some monoids and the induced preorders:
1. $(\mathbb{N}, +, 0)$ induces the usual less or equal relation $\le$ on $\mathbb{N}$,
2. $(\mathbb{N}, \cdot, 1)$ induces the usual divisibility relation $\mid$ on $\mathbb{N}$,
3. $(\mathbb{Z}, \cdot, 1)$ induces the usual divisibility relation $\mid$ on $\mathbb{Z}$,
4. $(X^*, \cdot, 0)$ induces the usual prefix relation $\sqsubseteq$ on $X^*$,
5. $(2^X, \cup, \emptyset)$ induces the usual inclusion relation $\subseteq$ on $2^X$.
We leave the proof to Problem 6.18. The example of the monoid $(\mathbb{Z}, \cdot, 1)$ shows
that the induced preorder of a monoid is not necessarily a partial order. A monoid
is just the right algebraic structure to yield an interesting preorder. If we have
too much algebraic structure, then the induced preorder can become trivial (see
Problem 6.21).
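The definition of the induced preorder can be tested on finite pieces of these monoids. The following Python sketch is only an illustration: the helper `induced_leq` is our own name, and we replace $\mathbb{N}$ and $\mathbb{Z}$ by finite windows, which is an assumption that is harmless here because all required witnesses $z$ lie inside the windows.

```python
from itertools import combinations


def induced_leq(carrier, op):
    """Return the relation x <= y :<=> (exists z in carrier) op(x, z) == y."""
    elems = list(carrier)

    def leq(x, y):
        return any(op(x, z) == y for z in elems)

    return leq


# Assumption: finite window {0, ..., 20} of N instead of all of N.
window = range(21)
leq_add = induced_leq(window, lambda x, y: x + y)  # from (N, +, 0)
leq_mul = induced_leq(window, lambda x, y: x * y)  # from (N, *, 1)
print(all(leq_add(x, y) == (x <= y) for x in window for y in window))  # True
print(leq_mul(3, 12), leq_mul(5, 12))  # True False

# (Z, *, 1) on the window {-20, ..., 20}: 2 and -2 divide each other,
# so the induced preorder is not antisymmetric.
leq_z = induced_leq(range(-20, 21), lambda x, y: x * y)
print(leq_z(2, -2), leq_z(-2, 2))  # True True

# (2^X, union, empty set) for X = {0, 1, 2} induces inclusion.
subsets = [frozenset(c) for r in range(4) for c in combinations(range(3), r)]
leq_union = induced_leq(subsets, lambda a, b: a | b)
print(all(leq_union(a, b) == (a <= b) for a in subsets for b in subsets))  # True
```

The third output shows concretely why the induced preorder of $(\mathbb{Z}, \cdot, 1)$ is not a partial order.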
6.16 Let $(M, \cdot, e)$ be a monoid and let $e' \in M$ be an element with the property that
$x \cdot e' = x = e' \cdot x$ for all $x \in M$. Prove that $e = e'$ follows.
6.17 Let $X$ be a set. Prove the statements in Example 6.17.
6.18 Let $X$ be a set. Prove the statements of Example 6.19.
[Figure: Hasse diagram with words over $\{a, b\}$ as nodes, among them aa, ab, ba, bb, aba, abb, baa, bab, bba, bbb]
6.6
Figure 6.7: The subset $A = \{1, 2, 3\}$ of the partially ordered set $(\mathbb{N}, \mid)$
Proof. We prove the claim for the minimum. Let us assume that $A \subseteq X$ has two
minima $m, m'$. Then $m \in A$ and $m' \in A$ and hence $m \le m'$ and $m' \le m$. This
implies $m = m'$, provided that $\le$ is antisymmetric. $\Box$
Since the maximum and the minimum of a set in a partially ordered set are uniquely
determined, if they exist at all, we can use a special notation for these elements.
Definition 6.24 (Minimum and maximum) Let $(X, \le)$ be a partially ordered
set and let $A \subseteq X$. Then we denote by $\max(A)$ the maximum of $A$, if it exists, and
by $\min(A)$ the minimum of $A$, if it exists.
We give some further examples. In particular, we show that the maximum and
the minimum of a two-element subset of a linear order always exist.
Example 6.25 We give some examples of maxima and minima.
1. We consider a linearly ordered set $(X, \le)$. Then for all $x, y \in X$
$$\max(\{x, y\}) = \begin{cases} y & \text{if } x \le y \\ x & \text{otherwise} \end{cases}$$
and
$$\min(\{x, y\}) = \begin{cases} x & \text{if } x \le y \\ y & \text{otherwise.} \end{cases}$$
Hence these notations correspond to our previous usage of max and min as
functions on natural numbers (see Example 4.26).
2. We consider the partially ordered set $(2^X, \subseteq)$ for some set $X$. Then $\max(2^X) = X$ and $\min(2^X) = \emptyset$.
In Example 6.22 we have seen that it can happen that a set like $\{1, 2, 3\}$ in $(\mathbb{N}, \mid)$
has no greatest element. Nevertheless, there are elements like 2 and 3 in this set,
for which there are no greater elements. Such elements are called maximal.
Definition 6.26 (Maximal and minimal elements) Let $(X, \le)$ be a partially
ordered set and let $A \subseteq X$.
1. Then an element $m \in A$ is called a maximal element of $A$, if $m \le x$ implies
$m = x$ for all $x \in A$.
2. Then an element $m \in A$ is called a minimal element of $A$, if $x \le m$ implies
$x = m$ for all $x \in A$.
Example 6.27 The elements 2 and 3 of the subset $A = \{1, 2, 3\}$ of the partially
ordered set $(\mathbb{N}, \mid)$ are maximal elements of $A$ and 1 is a minimal element.
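The difference between maximal elements and the maximum can be checked by a short computation. The following Python sketch is an illustration only: the function names are hypothetical, and returning `None` when no maximum or minimum exists is our own convention.

```python
def divides(a, b):
    """Divisibility on the natural numbers: a | b iff b = a * c for some c."""
    return b == 0 if a == 0 else b % a == 0


def maximal_elements(A, leq):
    """All m in A such that leq(m, x) implies m == x for all x in A."""
    return [m for m in A if all(m == x for x in A if leq(m, x))]


def minimal_elements(A, leq):
    """All m in A such that leq(x, m) implies x == m for all x in A."""
    return [m for m in A if all(x == m for x in A if leq(x, m))]


def maximum(A, leq):
    """The element above all elements of A, or None if there is none."""
    candidates = [m for m in A if all(leq(x, m) for x in A)]
    return candidates[0] if candidates else None


def minimum(A, leq):
    """The element below all elements of A, or None if there is none."""
    candidates = [m for m in A if all(leq(m, x) for x in A)]
    return candidates[0] if candidates else None


A = [1, 2, 3]
print(maximal_elements(A, divides))  # [2, 3]
print(minimal_elements(A, divides))  # [1]
print(maximum(A, divides))           # None
print(minimum(A, divides))           # 1
```

With respect to divisibility the set $\{1, 2, 3\}$ thus has two maximal elements but no maximum, while 1 is both the minimum and the only minimal element.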
If a partially ordered set has a maximum or a minimum, then this is the only
maximal or minimal element of the set, respectively.
Proposition 6.28 Let $(X, \le)$ be a partially ordered set and let $A \subseteq X$.
1. If $\max(A)$ exists, then it is the only maximal element of $A$.
2. If $\min(A)$ exists, then it is the only minimal element of $A$.
Proof. Let $A \subseteq X$ and let us assume that $\max(A)$ exists and let $m \in A$ be a
maximal element. Then $m \le \max(A)$ since $\max(A)$ is the maximum and hence
$m = \max(A)$ since $m$ is a maximal element. This shows that any maximal element
of $A$ is already the maximum. Moreover, if $\max(A) \le x$ for some $x \in A$ then we
also have $x \le \max(A)$ since $\max(A)$ is the maximum and hence $x = \max(A)$ follows
by antisymmetry of $\le$. Hence $\max(A)$ is actually a maximal element. The second
claim can be proved analogously. $\Box$
If one considers a linearly ordered set, then each maximal or minimal element of
a set is automatically the maximum or minimum of that set.
Proposition 6.29 Let $(X, \le)$ be a linearly ordered set and let $A \subseteq X$ and $m \in A$.
1. If $m$ is a maximal element of $A$, then $m = \max(A)$ follows.
2. If $m$ is a minimal element of $A$, then $m = \min(A)$ follows.
Proof. Let $m \in A$ be a maximal element of $A$. Then for each $x \in A$ we have $m \le x$
or $x \le m$ since $\le$ is total. If $m \le x$, then $m = x$ follows from the maximality of $m$.
In any case $x \le m$ holds. Hence $m$ is the maximum of $A$. $\Box$
Besides the least and the greatest element, there are often elements in the second
row that are also of some importance. These elements are called atoms and coatoms.
Definition 6.30 (Atoms) Let $(X, \le)$ be a partially ordered set and let $A \subseteq X$.
1. Then $a \in A$ is called an atom of $A$, if $\min(A)$ exists and $a$ is minimal in
$A \setminus \{\min(A)\}$.
2. Then $a \in A$ is called a coatom of $A$, if $\max(A)$ exists and $a$ is maximal in
$A \setminus \{\max(A)\}$.
We give some examples of atoms and coatoms.
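For a finite set, atoms and coatoms can also be computed directly from the definition. The following Python sketch is our own illustration with hypothetical helper names; as sample set we assume the divisors of 12, ordered by divisibility, which is not an example taken from the text.

```python
def divides(a, b):
    """Divisibility on the natural numbers: a | b iff b = a * c for some c."""
    return b == 0 if a == 0 else b % a == 0


def minimal_elements(A, leq):
    return [m for m in A if all(x == m for x in A if leq(x, m))]


def maximal_elements(A, leq):
    return [m for m in A if all(x == m for x in A if leq(m, x))]


def atoms(A, leq):
    """Minimal elements of A without min(A); empty if min(A) does not exist."""
    least = [m for m in A if all(leq(m, x) for x in A)]
    if not least:
        return []
    return minimal_elements([x for x in A if x != least[0]], leq)


def coatoms(A, leq):
    """Maximal elements of A without max(A); empty if max(A) does not exist."""
    greatest = [m for m in A if all(leq(x, m) for x in A)]
    if not greatest:
        return []
    return maximal_elements([x for x in A if x != greatest[0]], leq)


divisors_of_12 = [1, 2, 3, 4, 6, 12]
print(atoms(divisors_of_12, divides))    # [2, 3]
print(coatoms(divisors_of_12, divides))  # [4, 6]
```

Here the atoms are the prime divisors of 12, which sit directly above the least element 1, and the coatoms 4 and 6 sit directly below the greatest element 12.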
Figure 6.8: Hasse diagram of the partially ordered set $(2^{\mathbb{N}}, \subseteq)$
Problems
6.22 Let $(M, \cdot, e)$ be a monoid with induced preorder $\le$. Prove that $e$ is the least element
in $M$.
6.23 Prove the statements of Example 6.31.
6.7
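1. $x \sqcup (y \sqcup z) = (x \sqcup y) \sqcup z$ and $x \sqcap (y \sqcap z) = (x \sqcap y) \sqcap z$ (associative)
2. $x \sqcup y = y \sqcup x$ and $x \sqcap y = y \sqcap x$ (commutative)
3. $x \sqcup (x \sqcap y) = x$ and $x \sqcap (x \sqcup y) = x$ (absorption)
4. $x \sqcup x = x$ and $x \sqcap x = x$ (idempotent)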
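The lattice laws labelled associative, commutative, absorption and idempotent can be tested numerically in a concrete lattice. The following Python sketch assumes the lattice $(\mathbb{N}, \mid)$, in which the supremum of two numbers is their least common multiple and the infimum is their greatest common divisor. The names `join`, `meet` and `window` are our own, and the check only covers the finite sample $\{0, \dots, 12\}$, so it is a plausibility test and not a proof.

```python
from itertools import product
from math import gcd


def lcm(a, b):
    """Least common multiple, with lcm(a, 0) = 0 as in (N, |)."""
    return 0 if a == 0 or b == 0 else a * b // gcd(a, b)


join, meet = lcm, gcd  # supremum and infimum with respect to divisibility
window = range(13)

for x, y, z in product(window, repeat=3):
    # associativity of both operations
    assert join(x, join(y, z)) == join(join(x, y), z)
    assert meet(x, meet(y, z)) == meet(meet(x, y), z)
for x, y in product(window, repeat=2):
    # commutativity and absorption
    assert join(x, y) == join(y, x) and meet(x, y) == meet(y, x)
    assert join(x, meet(x, y)) == x and meet(x, join(x, y)) == x
for x in window:
    # idempotence
    assert join(x, x) == x and meet(x, x) == x
print("all lattice laws hold on the sample")
```

Note that 0 is the greatest element with respect to divisibility, which is why `lcm(a, 0)` is defined as 0 here.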
6.24 Let $(X, \le)$ be a lattice with least element $e \in X$. Prove that $(X, \sqcup, e)$ is a monoid.
Conclude that the following are monoids: $(\mathbb{N}, \max, 0)$, $(\mathbb{N}, \mathrm{lcm}, 1)$, $(2^{\mathbb{N}}, \cup, \emptyset)$.
6.25 Prove the statements of Example 6.36.
6.26 Prove Proposition 6.37.
In this section we want to present a brief survey of axiomatic set theory. Actually, there are many different versions of axiomatic set theory, and the version presented here is called Zermelo-Fraenkel set theory, often abbreviated as ZF. Zermelo-Fraenkel set theory plus the Axiom of Choice, abbreviated as ZFC, is the standard
framework in which most of modern mathematics is developed.
Zermelo-Fraenkel set theory ZFC
1. Axiom of the Empty Set. There exists an empty set $\emptyset$ without elements.
2. Axiom of Extensionality. Two sets $X$ and $Y$ are equal, if they contain
exactly the same elements.
3. Axiom of Comprehension. For each given set $X$ and each predicate $P$ for
$X$, there exists a set $S = \{x \in X : P(x)\}$ of all elements of $X$ that satisfy the
predicate $P$.
4. Axiom of Pairing. For all objects $x$ and $y$ there exists a set $\{x, y\}$ that
contains exactly $x$ and $y$.
5. Axiom of Union. For all sets of sets $S$ there exists a set
$\bigcup S = \{x : (\exists X \in S)\; x \in X\}$
that contains all elements that belong to at least one set $X \in S$.
6. Axiom of the Power Set. For all sets $X$ there exists a power set
$2^X = \{S : S \subseteq X\}$
that contains all subsets of $X$.
7. Axiom of Infinity. There exists an infinite set (such as $\mathbb{N}$).
8. Axiom of Replacement. If $C$ and $D$ are classes and $f : C \to D$ is a function,
then for each subset $X$ of $C$ the image $f(X)$ is a subset of the class $D$.
9. Axiom of Foundation. Any non-empty set $X$ of sets has the property that
it contains a member $Y$ such that $X \cap Y = \emptyset$.
10. Axiom of Choice. Any set $S$ that contains non-empty sets has a choice
function, i.e. a function $f : S \to \bigcup S$ such that $f(X) \in X$ for each $X \in S$.
The axioms as listed here are only informal and simplified versions of the formal
axioms of ZFC. The purpose is to give the reader an impression of what these
axioms are about. For instance, in case of the Axiom of Replacement one would have
to be more precise about what kind of functions $f$ are allowed here. As phrased here,
the axioms are certainly also redundant. For instance, the existence of the empty
set can be deduced from the Axiom of Infinity and the Axiom of Comprehension.
The Axiom of Comprehension itself can be deduced from the Axiom of Replacement.
Hence, the presentation of these axioms can be optimised. The Axiom of Foundation
(sometimes also called the Axiom of Regularity) is equivalent to the fact that the
element relation $\in$ is well-founded (provided one has the Axiom of Choice). There
are other versions of axiomatic set theory such as von Neumann-Bernays-Gödel set
theory.
Unfortunately, it is not known whether the ZFC axioms are consistent! This is
one of the big open problems in mathematics and is related to a problem that was
discussed by Hilbert in a famous talk he gave in Paris in 1900.
Conjecture 6.38 (Consistency) The ZFC axioms of Zermelo-Fraenkel set theory
together with the Axiom of Choice are consistent.
Usually one proves that some mathematical axioms are consistent by providing
an example of a mathematical object that satisfies all the axioms. However, in case
of set theory the problem is that nobody has any idea how to create a model for set
theory without using sets (and hence set theory). Nobody has been able to resolve this
circularity so far. Even worse, Gödel turned the observation that there is such
a circularity into the following negative result.
Theorem 6.39 (Gödel's Second Incompleteness Theorem 1931) ZFC is consistent if and only if one cannot prove its consistency within ZFC.
The "if" direction of this result is trivial: if ZFC is inconsistent, then one can
conclude everything from ZFC, using the principle of explosion. In particular, one
can prove the consistency of ZFC in ZFC in that case. The "only if" direction of
the proof requires a deeper insight into logic and computability. What this theorem
clearly shows is that a consistency proof for ZFC would require some meta theory
that goes beyond set theory, and then the question of consistency for this meta theory
would have to be resolved.
What is known, however, is that the axiom of choice is independent of the other
axioms of ZFC, i.e. ZFC is consistent if and only if ZF is consistent.
Mathematicians
Mathematicians are generally thought of as some kind of intellectual machine, a great
brain that crunches numbers and spits out theorems. In fact we are, as Hermann
Weyl said, more like creative artists. Although strongly constrained by the rules of
logic and by physical experience, we use our imagination to make great leaps into
the unknown. The development of mathematics over thousands of years is one of the
great achievements of civilization.
Sir Michael Francis Atiyah (Fields Medalist, Cambridge)
Leopold Kronecker (1823–1891) was a student of Dirichlet and worked
in algebra, number theory, analysis, and mathematical logic. He rejected
Cantor's set theory due to its non-constructive nature and himself proposed
finitism, which can be considered as a forerunner of intuitionism. Kronecker
proposed the idea that analysis and other branches of mathematics should
be based on natural numbers and he said "God made the integers; all else
is the work of man."
Georg Cantor (1845–1918) was a student of Kummer and Weierstraß and
is best known for his development of set theory. His own definition of a set
was roughly the following: "In its entirety we consider any collection $M$ of
well-defined and distinguished objects $m$ of our perception or of our thoughts
as a set. The objects $m$ are called the elements of the set $M$." Cantor was
led to his study of set theory by his investigation of bijective maps and the
concept of equinumerity. In this context Cantor's diagonalisation method
and Cantor's pairing function are well-known and named after him. He
also initiated the study of cardinality and of transfinite numbers. The set
$\{0, 1\}^{\mathbb{N}}$ as a metric space is often called Cantor space and plays a crucial
role in some fields of mathematics such as topology, descriptive set theory
and fractal geometry.
David Hilbert (1862–1943) was a student of Lindemann and is one of
the most well-known mathematicians of the 20th century. This is because
he made substantial contributions to various fields of mathematics such as
the theory of invariants, foundations of functional analysis and mathematical physics as well as mathematical logic. Many concepts and results are
named after him. For instance, Hilbert spaces play a central role in functional analysis and for the mathematical foundations of modern quantum
physics. Hilbert was a strong supporter of Cantor's set theory. Hilbert's
Basis Theorem caused a controversy around the fact that the proof was
highly non-constructive. Hilbert's Nullstellensatz is another fundamental
result that relates geometry to algebra and it is one of the starting points of
modern algebraic geometry. In a famous talk that Hilbert delivered at the
Sorbonne in Paris in 1900 he described 23 mathematical problems that he
considered as essential for mathematics of the 20th century. His problems
turned out to be very influential and some developments of mathematics
in the 20th century were motivated by attempts to solve some of his problems. Some of these problems are still unsolved. One of his problems was
to find a complete and sound axiomatic system for mathematics. Gödel's
First Incompleteness Theorem brought Hilbert's programme in its original
form to an end, since Gödel proved that even for arithmetic there cannot
be any reasonable axiom system that is sound and complete simultaneously.
Nevertheless, Hilbert's foundational ideas in mathematical logic have a substantial impact even on present-day mathematical logic.
Kurt Gödel (1906–1978) was a student of Hahn and certainly the most
influential mathematical logician of the 20th century. The Gödel Completeness Theorem shows that first-order logic can be axiomatised in a complete
and sound way, i.e. it shows that in some sense provability and truth correspond in pure logic. However, Gödel's First Incompleteness Theorem indicates that this is not true for mathematics in general. Even a simple fragment of mathematics such as arithmetic does not admit any axiom system
that is complete and sound simultaneously. This result brought Hilbert's
programme in its original form to an end. Gödel's Second Incompleteness
Theorem states that a sufficiently rich mathematical theory cannot prove
its own consistency and this applies, in particular, to set theory. Gödel
also proved that the Axiom of Choice and the Generalised Continuum Hypothesis are both consistent with Zermelo-Fraenkel set theory (i.e. if ZF is
consistent, then so is ZF together with the Axiom of Choice and the Generalised Continuum Hypothesis). Later, Paul Cohen was able to show that
this is also true for the negation of the Axiom of Choice and the negation
of the Generalised Continuum Hypothesis, such that both are independent
of Zermelo-Fraenkel set theory. Gödel also made notable contributions to
other areas of logic, such as intuitionistic logic, and to relativity theory.
