
PERSPECTIVES

IN
MATHEMATICS
DAVID E. PENNEY, The University of Georgia

W. A. BENJAMIN, INC.
Menlo Park, California
Copyright © 1972 by W. A. Benjamin, Inc. Philippines copyright 1972 by W. A. Benjamin,
Inc.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval
system, or transmitted, in any form or by any means, electronic, mechanical, photocopying,
recording, or otherwise, without the prior written permission of the publisher. Printed in
the United States of America. Published simultaneously in Canada. Library of Congress
Catalog Card No. 70-166540.
For B. J. Pettis
Harry H. Corson
A. C. Woods
Paul S. Mostert
G. O. Sabidussi
Paul F. Conrad
Fred B. Wright
A. D. Wallace
A. H. Clifford
G. S. Young, Jr.
Frank D. Quigley
and especially L. B. Treybig

PREFACE

This book was planned primarily as a text for a one year course in math-
ematics for students not intending to take calculus. However, it could be
used as a sourcebook by teachers and mathematicians or as outside reading
by undergraduate mathematics majors. The usual topics from collegiate
precalculus courses have specifically been excluded, as has calculus itself.
Each chapter contains a rather detailed study of a topic chosen from one of
the major branches of modern mathematics. Although some of this material
is available elsewhere, it has not been gathered before into a single volume
written for the reader who does not have an extensive background in
mathematics. I quite frankly admit to choosing topics I found particularly
interesting, and I hope that the reader will be pleased with some of these
choices. The problems at the end of each chapter are in many cases meant
to open up avenues of deeper exploration of the subject of the chapter. The
chapters themselves are almost wholly independent of one another, thus
enabling the student to learn about those topics most appropriate for him,
in any order, and in the necessary degree of mathematical rigor. This should
provide the flexibility desirable for so diverse an audience; even undergraduate
mathematics majors are not commonly exposed to the majority of
the material included, and I hope that they too might enjoy this book.
Although the formal prerequisites for one who wishes to profit from this
book are minimal (some high-school algebra, a little geometry, and aptitude
to do college-level work), it is not an easy book. It contains only incidental
references to the history of mathematics. It should acquaint the reader with
the various branches of modern mathematics, and each student is expected
to find out what mathematics is by doing mathematics: guessing patterns,
making conjectures, and (most important) proving theorems.


Most technical terms are defined upon the occasion of their first appear-
ance, and in this case the term appears in boldface. The statements of
theorems appear in italics. The reader is advised to have paper and pencil
always available, and should not hesitate to consult the index. Generally,
the last problems of each chapter are the most difficult, but there are numerous
exceptions; the difficult problems have not been so indicated in order to
increase the similarity to mathematical research.
I wish to thank for their patience those undergraduates who were exposed
to this book in its preliminary form, the Department of Mathematics of the
University of Georgia for making it possible for me to write this book, and
my former teachers of mathematics, to whom it is dedicated. Special thanks
are due Professor Gail S. Young of the University of Rochester and Professor
Frank D. Quigley of Tulane University for their help with the second
chapter, Professor F. A. Roach of the University of Houston for his help
with the third, Professor H. S. M. Coxeter of the University of Toronto for
his help with Chapter 5, and Professor D. B. Hinton of the University of
Tennessee for his help with Chapter 8. Finally, I owe a great deal to my wife
Carol, who read every word, suggested innumerable improvements, and
invented a number of the better problems.

Athens, Georgia D.E.P.


November 1971
CONTENTS

Chapter 1 The Bolyai-Gerwin Theorem

1.1 The fundamental definitions
1.2 The Bolyai-Gerwin theorem
1.3 How to cut a polygon into triangles
1.4 How to cut a triangle to make a parallelogram
1.5 How to cut a parallelogram to make a rectangle
1.6 How to cut a rectangle to make a square
1.7 How to cut several squares and reassemble the pieces to form a single square
1.8 How to cut a square into pieces which can be reassembled to form a given polygon of the same area
1.9 Some concluding remarks

Chapter 2 Brunnian Links

2.1 The simple closed curve
2.2 Links and their properties
2.3 A simple algebra
2.4 Brunnian 4-links
2.5 Notation and examples
2.6 Commutators
2.7 The final generalization

Chapter 3 The Well-Tempered Clavichord

3.1 Properties of logarithms
3.2 A peculiar manipulation
3.3 Continued fractions
3.4 The value of a continued fraction
3.5 Applications to baseball and grade distributions
3.6 Harmony
3.7 Tuning a piano, old style
3.8 Tuning a piano, new style
3.9 Improving the octave

Chapter 4 Group Theory

4.1 Some examples of groups
4.2 Subgroups
4.3 Cyclic groups and Abelian groups

Chapter 5 Polyhedra

5.1 The definition of polyhedron
5.2 Euler's formula
5.3 Regular solids
5.4 A converse of Euler's formula
5.5 Map coloring

Chapter 6 Infinite Sets

6.1 Sets
6.2 Functions
6.3 More on one-to-one correspondences
6.4 The Cantor-Schroeder-Bernstein theorem
6.5 Properties of finite and infinite sets
6.6 Nondenumerable infinite sets

Chapter 7 Number Theory

7.1 Divisibility
7.2 Well-ordering
7.3 The fundamental theorem of arithmetic
7.4 The greatest common divisor
7.5 Applications

Chapter 8 Animal Populations

8.1 Unrestricted growth of a single species
8.2 Growth of a single species under limiting conditions
8.3 The case of two competing species
8.4 The predator-prey case

Chapter 9 The Art Gallery Theorem

9.1 Convex sets
9.2 Intersections of convex sets
9.3 Hulls and kernels
9.4 Helly's theorem
9.5 Krasnoselskii's theorem
9.6 L-convexity

Chapter 10 The Real Number System

10.1 The rational numbers
10.2 Nested intervals of rational numbers
10.3 Construction of the real numbers
10.4 The order relation on R
10.5 Are there more numbers?
10.6 An unusual set of real numbers

Epilogue

Answers and Hints

Index

CHAPTER 1

THE BOLYAI-GERWIN THEOREM

You have probably seen some of the numerous puzzles and games involving
a small number of plastic pieces of various shapes. In one variety of these,
an accompanying booklet gives outlines of various figures, and one attempts
to form the pieces into these figures. Also enjoying some popularity are
board games in which one uses the varied shapes of his pieces to force or
prevent certain moves by his opponent. It should not surprise you to find
that many mathematicians like such puzzles, and moreover that some of the
mathematical ideas behind such puzzles have been studied by mathematicians
for many years.
The puzzles usually feature one extra problem when one tires of con-
structing figures; for example, the plastic pieces come in a flat rectangular
box, and we find it surprisingly difficult to return the pieces properly to the
box. This chapter will reveal, among other things, a method by which you
can construct such puzzles of your own, even ones which will fit into a
rectangular box, if you wish. You will be able to cut several plastic pieces
which will fit into the box, but which you can reassemble to form a triangle,
a hexagon, a symmetrical five-pointed star, or all three.
Of course, one restriction immediately becomes apparent. If one such
figure can be cut up and the pieces reassembled to form another figure, the
two must have the same area. On the other hand, it is hard to believe that a
square could be cut into a finite number of pieces which could be reassembled
to form a circular disk of the same area, but we cannot answer whether or
not this is possible. This question presently enjoys the status of an Unsolved
Problem in Mathematics. (Do not confuse this with the famous problem of
Squaring the Circle; the distinction between the two will be discussed in the
exercises.) Because circles, curved cuts, and curved figures in general lead
us beyond the frontiers of current mathematical knowledge, we shall restrict
our attention to polygons and straight-line cuts. In these cases the math-
ematical problems involved have been solved, and the solution is remarkably
simple. All that is required is that the two figures have the same area. If
so, then either may be cut into pieces and reassembled to form the other.
This is the essence of the Bolyai-Gerwin Theorem.

1.1 THE FUNDAMENTAL DEFINITIONS


We plan to lead you through a proof of the Bolyai-Gerwin Theorem for two
reasons. First, it is in some sense a very "rich" proof in that the proof itself
answers numerous questions which may already have occurred to you.
Second, the proof is a good model of mathematical thinking, and you should
develop the habit of mathematical thinking in order to profit from this book.
Have no fear, though, that we are trying to turn you into a mathematician;
we hope to convince you that mathematical thinking is not at all mysterious,
but merely ordinary careful reasoning. A little geometry will be helpful
to you but not essential. Your intuitive idea of the area of a plane figure will
serve, and we need but two definitions. Before the first definition, a small
bit of notation will be introduced for clarity. If L is a straight-line segment
in the two-dimensional plane, we will write L = [a, b] to indicate that a is
one endpoint of L, and b is the other. The interval notation is used to remind
us of what is in fact the case, that [a, b] consists of the two points a and b
together with all the points between them on the straight line through a and
b. And we call a the first point of L and b the last point of L.
A polygon is a plane figure of finite area bounded by a finite number of
straight line segments L_1, L_2, L_3, ..., L_n, such that the last point of L_1 is
the first point of L_2, the last point of L_2 is the first point of L_3, and so on,
and the last point of L_n is the first point of L_1; and moreover, other than as
indicated above, no two of these line segments have any points in common.
These line segments are called the edges of the polygon, and the endpoints
of these line segments are called the vertices of the polygon. The points
belonging to the edges of a polygon are to be considered part of the polygon;
that is, the polygon consists of all its interior points together with all its
boundary points. Consequently, when we speak of a point x belonging to the
polygon P, it is permissible that x be either an interior point or a boundary
point of P.
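
The edge condition in this definition can be tested mechanically. The following Python sketch is only an illustration under stated assumptions: a point is represented as a pair of numbers, consecutive vertices are assumed to be distinct, and the names `cross`, `segments_meet`, and `is_polygon` are invented for this example rather than taken from the text. It checks that edges which are not consecutive have no point in common, and that consecutive edges share only their common endpoint.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o, a, b (positive if counterclockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def within_box(p, q, r):
    """True if r lies in the bounding box of the segment [p, q]."""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))


def segments_meet(p1, p2, p3, p4):
    """True if the segments [p1, p2] and [p3, p4] have a point in common."""
    d1, d2 = cross(p3, p4, p1), cross(p3, p4, p2)
    d3, d4 = cross(p1, p2, p3), cross(p1, p2, p4)
    if d1 * d2 < 0 and d3 * d4 < 0:
        return True  # the two segments cross at an interior point of each
    # Otherwise they meet only if an endpoint of one lies on the other.
    return ((d1 == 0 and within_box(p3, p4, p1)) or
            (d2 == 0 and within_box(p3, p4, p2)) or
            (d3 == 0 and within_box(p1, p2, p3)) or
            (d4 == 0 and within_box(p1, p2, p4)))


def is_polygon(vertices):
    """Check the edge condition of the definition for a closed chain of vertices."""
    n = len(vertices)
    if n < 3:
        return False
    for i in range(n):
        a, b = vertices[i], vertices[(i + 1) % n]
        for j in range(i + 1, n):
            c, d = vertices[j], vertices[(j + 1) % n]
            if j == i + 1 or (i == 0 and j == n - 1):
                # Consecutive edges share one endpoint; reject them only if
                # they fold back along the same line and so overlap.
                shared, far1, far2 = (b, a, d) if j == i + 1 else (a, b, c)
                same_direction = ((far1[0] - shared[0]) * (far2[0] - shared[0])
                                  + (far1[1] - shared[1]) * (far2[1] - shared[1])) > 0
                if cross(shared, far1, far2) == 0 and same_direction:
                    return False
            elif segments_meet(a, b, c, d):
                return False
    return True


print(is_polygon([(0, 0), (1, 0), (1, 1), (0, 1)]))  # a square
print(is_polygon([(0, 0), (1, 1), (1, 0), (0, 1)]))  # a "bow tie" whose edges cross
```

The sketch examines only the boundary segments; deciding which points are interior points is a separate matter that it does not address.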
We require that the figure have finite area to avoid certain paradoxes
(to be discussed in the exercises) and also so that we may attach the common
meanings to many terms such as "square." We certainly want to mean by
"square" a square boundary together with what's inside, and the requirement
that a polygon have finite area spares us the necessity of dealing with
a square boundary together with what's outside.
Two polygons are said to be equidecomposable if it is possible to cut one
up, using straight cuts, into a finite number of pieces which may be reas-
sembled, omitting none, to form the other.
You may object that this definition is somewhat vague. But an attempt
to make it much more precise would also make it several times as long, and
require the introduction of several other terms which would also have to be
defined precisely. Again, as with the idea of area, we ask that you accept
your natural understanding of the concept of equidecomposability as the
correct one. At this point you may test your concept of area and your
understanding of equidecomposability by means of the following theorem.
Theorem 1.1 If two polygons are equidecomposable, then they have the same
area.
If you feel that this proposition is obvious, then you surely understand enough
to proceed. After all, whatever we may mean by "area," the area of a plane
figure should not be altered by rigid (congruence) motions, so all that would
be necessary for a proof of the above theorem is to cut the first polygon
into several pieces and measure the area of each such piece. Then observe
that as the pieces are assembled into each of the two given polygons the
resulting total areas must be equal, for each is the sum of the same set of
numbers. But on a deeper level there are always a few "obvious" things
which become less obvious when more closely examined. If we were to start
with certain axioms about the geometry of the two-dimensional plane and
attempt to define "area," the development of this definition would be lengthy,
and the above theorem would have a rather complicated proof. For example,
it is conceivable that some of the pieces into which a polygon might be cut
simply cannot have area. It would be necessary to establish that if only
straight cuts are used to form such pieces then each piece must in fact have
area, and then that the total area of the polygon is equal to the sum of the
areas of the individual pieces. Further consideration of these topics will be
found in the exercises.
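
The informal argument for Theorem 1.1 can be tried numerically. The sketch below is not a proof. It assumes the standard "shoelace" formula for the area of a polygon given by the coordinates of its vertices, and the names `area`, `translate`, and `rotate_quarter_turn` are chosen for this illustration only. It cuts a 2-by-1 rectangle along a diagonal, moves one of the two triangles by a rigid motion, and compares the total area of the pieces with the area of the rectangle.

```python
from fractions import Fraction


def area(vertices):
    """Area of a polygon from its vertices listed in order (shoelace formula)."""
    n = len(vertices)
    twice = sum(vertices[i][0] * vertices[(i + 1) % n][1]
                - vertices[(i + 1) % n][0] * vertices[i][1]
                for i in range(n))
    return abs(Fraction(twice)) / 2


def translate(vertices, dx, dy):
    """Slide every vertex by the same amount (a rigid motion)."""
    return [(x + dx, y + dy) for x, y in vertices]


def rotate_quarter_turn(vertices):
    """Rotate every vertex a quarter turn counterclockwise about the origin."""
    return [(-y, x) for x, y in vertices]


rectangle = [(0, 0), (2, 0), (2, 1), (0, 1)]
piece_1 = [(0, 0), (2, 0), (2, 1)]      # one side of the diagonal cut
piece_2 = [(0, 0), (2, 1), (0, 1)]      # the other side
moved = translate(rotate_quarter_turn(piece_2), 5, 3)

total = area(piece_1) + area(moved)
print(area(rectangle), total)
```

Exact fractions are used instead of floating-point numbers so that the comparison of areas is not blurred by rounding.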
Exercises

1.1 Which of the figures shown in Fig. 1.1 is a polygon? Exactly in what
way does each that is not a polygon fail to satisfy the definition?
1.2 What is the smallest number of vertices that a polygon can have? Prove
that your answer is correct.
1.3 Construct a polygon having one and only one vertex in each of the
four quadrants and not containing the origin, or prove that no such polygon
can exist.

[Fig. 1.1 Which of the above figures is a polygon?]

1.4 Need a polygon have positive, rather than zero, area? Why?
1.5 Show that a polygon must have at least one interior angle less than 180°.

1.6 The relation of equality between numbers enjoys the following three
properties.

a) The Reflexive Property: If x is a number, then x = x.


b) The Symmetric Property: If x and y are numbers such that x = y, then
y = x.
c) The Transitive Property: If x, y, and z are numbers such that x = y
and y = z, then x = z.

Such a relation is said to be an equivalence relation.


If P and Q are polygons, let us write P ~ Q provided that P and Q are
equidecomposable; that is, if it is possible to cut P up, using straight cuts,
into a finite number of pieces which can be reassembled, omitting none, to
form Q. Show that equidecomposability is an equivalence relation:
specifically, if P, Q, and R are polygons, then
a) P ~ P.
b) If P ~ Q, then Q ~ P.
c) If P ~ Q and Q ~ R, then P ~ R.
1.7 Two polygons are said to be nonoverlapping if they have, at most,
boundary segments in common. In particular, polygons with only one point,
or nothing, in common are nonoverlapping. We take for granted in our
definition and subsequent discussion of equidecomposability that the pieces
one assembles to form a given polygon are to be nonoverlapping, but that
they may have common edges.
Show that a triangle is not the union of nonoverlapping parallelograms
each pair of which are congruent. Is this still true if we drop the restriction
that each two parallelograms are congruent, and require only that finitely
many parallelograms be used?
1.8 Plane figures not of finite area can exhibit surprising behavior, even if
the boundaries are required to be straight-line segments meeting in common
endpoints, much as in the definition of a polygon. We might as well allow
the use of half-infinite lines as well, with only one endpoint, that is also an
endpoint of another such half-infinite line or line segment. Here, un-
fortunately, the notion of geometric congruence breaks down rather badly.
It turns out that there is an example of such a figure which is congruent to
a proper part of itself. Thus there can be no way to define "area" for such
figures so that congruence motions would preserve area, or so that pro-
portional figures with the same area would be congruent.
Find an example of such a plane figure with polygonal boundary congruent
to a proper part of itself. (The figure must be unbounded; that is,
contained in no circle, no matter how large the radius of the circle.)
1.9 In the terminology of the previous exercise, find an example of an un-
bounded plane figure with polygonal boundary such that the figure has finite
area. It is permissible to use infinitely many segments to form the boundary.
1.10 It is also very easy to find an example of a plane figure that has a finite
perimeter but infinite area. Please do so.

1.2 THE BOLYAI-GERWIN THEOREM


The Bolyai-Gerwin Theorem is just the converse of Theorem 1.1. The beauty
of this theorem lies in the extremely simple condition for equidecomposability
of two polygons and the curious fact that the theorem is unexpectedly true.
Theorem 1.2 (Bolyai-Gerwin) If two polygons have the same area then they
are equidecomposable.
The proof has its own beauty: it is constructive. The statement of the theorem
would be of little use to you should you wish to construct such puzzles as
were mentioned earlier, but the proof provides a "how to do it" recipe. If
the proof were not broken into a sequence of short steps, it would be easy to
lose the thread of the argument, so we shall present the proof in the following
sequence. We shall show, in order, how to cut
1) a polygon into triangles,
2) a triangle to make a parallelogram,
3) a parallelogram to make a rectangle,
4) a rectangle to make a square,
5) several squares to form a single square when the pieces are reassembled,
and
6) a square into pieces which can be reassembled to form a given polygon of
the same area.
Given two polygons of the same area, we can cut one into pieces which
can be assembled into a square, using the first five steps above. We can do
the same with the other polygon and assemble both into squares. The two
squares will have the same area, and so can be superimposed. If all cuts now
visible are made, then either square can be made into either polygon. Hence
the first polygon can be made into the second, with one of the squares as a
halfway point.
Finally, before we begin our sequence of six short proofs, note that in
each proof, each construction can be performed by so-called "pure" geometric
methods, that is, with unmarked straightedge and compass. This
is not necessary for the truth of the Bolyai-Gerwin Theorem, but it does
serve as an added attraction and makes the construction of puzzles considerably
simpler. Also note that in each proof all cuts are straight-line
cuts, as required in the definition of equidecomposability, and finally that it
is never necessary to lift a piece and turn it over.

[Fig. 1.2 The Texan Rectangle.]

Exercises

1.11 Let R be a rectangle with short side of length one unit and long side of
length two units, and let S be a square of the same area as R. Show how to cut
R into a finite number of pieces, using straight cuts, so that the pieces can be
reassembled to form S. Can you do this with only three cuts?
1.12 Show how to cut up the Texan Rectangle shown in Fig. 1.2, bounded
by two parallel rays and a segment perpendicular to each, into infinitely
many squares that can be reassembled to form a strip twice as wide. This
example is one reason why we restrict our attention to figures of finite area
when dealing with equidecomposability.
1.13 Show how to cut the Texan Rectangle into squares that can be
reassembled to form the entire two-dimensional plane.
1.14 Show how a given triangle can be divided into two right triangles with a
single cut.
1.15 Let T be a right triangle. Show how you can cut T into two pieces which
can be reassembled to form the mirror image of T, without turning any piece
over.
1.16 In the next section we will show how to cut any polygon into triangles.
Thus the two previous exercises show how turning pieces over in our study of
equidecomposability could be avoided, even if thought necessary in some
construction. Explain how.
1.17 What is a theorem? A corollary? A lemma? Look up the answers.
What is the difference between a mathematical definition and the sort of
definition found in a dictionary?
1.18 How would you phrase the Bolyai-Gerwin Theorem for the one-dimensional
case, that is, for sets which are subsets of the real number line?
In order to answer this question, you will have to decide on a one-dimensional
analogue of "polygon," and define equidecomposability for a figure like
this.
1.19 Do you believe that the theorem you phrased in the previous exercise
is true?
1.20 See Exercise 1.6. Given real numbers a and b, let us say that a is
equivalent to b, and write a ~ b, provided that a - b is a whole number.
Is the relation ~ an equivalence relation? If so, prove it; if not, show why
not.

1.3 HOW TO CUT A POLYGON INTO TRIANGLES


It will be much more convenient if the mathematical objects with which we
work have short and easily remembered names, so we begin by letting P be
a plane polygon. Let its vertices be called v_1, v_2, v_3, ..., v_n. We shall also
suppose that we have so named these vertices that the edges of P are the line
segments [v_1, v_2], [v_2, v_3], ..., [v_n, v_1]. If P has only three vertices, it is
already a triangle and no further construction is necessary. If P has four or
more vertices, there must be at least one at which the interior angle is less
than 180° (why?). We may suppose that we have named this vertex v_1, and
thus by the definition of polygon, v_1 lies on the two edges [v_1, v_2] and
[v_n, v_1] of P.
Consider the line segment K joining v_2 with v_n. Now K will not be an
edge of P itself, but it may happen that except for the endpoints of K, K lies
entirely within the interior of P. If so, we cut P along this new line segment,
and thus obtain two new polygons. One is, of course, the triangle T whose
vertices are v_1, v_2, and v_n, and the other is a new polygon P' with one vertex
fewer than P. On the other hand, what if K does not lie entirely within P?
In this case we can produce another new line segment L, also joining two
vertices of P, and lying entirely within P, as follows.
Imagine a straight line M, parallel to the segment K, and passing through
the vertex v_1. Then imagine this line M moving slowly toward K while remaining
always parallel to it. Since K does not lie entirely within the polygon
P, as M moves toward K it must eventually meet some part of the boundary
of P that lies between [v_1, v_2] and [v_n, v_1]. To be precise, M must intersect
some part of the boundary of P that lies within the triangle T whose vertices
are v_1, v_2, and v_n, as shown in Fig. 1.3. But when M first meets part of the
boundary in this fashion, during its journey from v_1 to K, then M must simultaneously
meet at least one vertex v_i of P. Let L be the line segment from
v_1 to v_i. Then L lies, except for its endpoints, wholly within the interior of P.

[Fig. 1.3 Triangulation of a polygon.]

Thus we can always produce some new line segment, either K or L,
joining two vertices of P not previously joined by an edge of P, and lying
except for its endpoints wholly within the interior of P. If we cut P along
this new line, we produce two new polygons with the following important
property: Each has fewer vertices than did P. We repeat this process on each
resulting polygon that has more than three vertices until the process is forced
to come to a halt by virtue of the fact that P has been cut entirely into
triangles. All the cuts are straight-line cuts, and each construction can be
performed with straightedge alone. The method is the "most efficient
possible" in that it does not require the creation of new vertices. Thus we
have shown how it is possible to cut a given polygon into triangles.
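
The construction of this section can be followed step by step in a short program. The Python sketch below is one possible reading of it, under assumptions that are not part of the text: the polygon is given as a Python list of coordinate pairs, the vertices are listed counterclockwise, and the names `cross`, `in_triangle`, and `triangulate` are invented for the example. The moving line M is replaced by a computation: among the vertices lying in the triangle T, the one that M meets first is the one farthest from K on the side of v_1.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o, a, b (positive if counterclockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_triangle(p, a, b, c):
    """True if p lies in the counterclockwise triangle a, b, c (boundary included)."""
    return cross(a, b, p) >= 0 and cross(b, c, p) >= 0 and cross(c, a, p) >= 0


def triangulate(poly):
    """Cut a polygon (counterclockwise list of vertices) into triangles."""
    n = len(poly)
    if n == 3:
        return [tuple(poly)]
    # Rename the vertices so that v_1 has interior angle less than 180 degrees.
    i = next(k for k in range(n)
             if cross(poly[k - 1], poly[k], poly[(k + 1) % n]) > 0)
    poly = poly[i:] + poly[:i]
    v1, v2, vn = poly[0], poly[1], poly[-1]
    # Other vertices of P that lie in the triangle T with vertices v_1, v_2, v_n.
    blocking = [j for j in range(2, n - 1) if in_triangle(poly[j], vn, v1, v2)]
    if not blocking:
        # K = [v_2, v_n] lies inside P: cut off T and continue with P'.
        return [(vn, v1, v2)] + triangulate(poly[1:])
    # The vertex first met by M is the one farthest from K on the side of v_1.
    j = max(blocking, key=lambda k: cross(v2, vn, poly[k]))
    # Cut along L = [v_1, v_j]; each of the two new polygons has fewer vertices.
    return triangulate(poly[:j + 1]) + triangulate(poly[j:] + [v1])


# An L-shaped hexagon, listed counterclockwise.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
for triangle in triangulate(l_shape):
    print(triangle)
```

Every triangle produced uses only vertices of the original polygon, in keeping with the remark that no new vertices need be created.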

Exercises

1.21 Suppose that P is a polygon with vertices v_1, v_2, v_3, ..., v_n. In the
construction of this section, in which a polygon such as P is decomposed into
triangles, how many cuts must be made, and how many triangles will be
obtained? The answer of course depends on the value of n, and should give
the correct value in particular for n = 3 and n = 4.
1.22 Suppose that S is a statement meaningful for natural numbers; for example,
S might be the statement "Every natural number is composite." In order to
be meaningful it is not necessary that S be true, merely that for each natural
number S is either true or false. Let us consider the statement S, as an
example, that "The sum of the first n natural numbers is n(n + 1)/2." In
this case S happens to be always true, and for such statements there is
frequently the possibility of proving them true by the method of induction.
To prove by the method of induction that such a statement is true, we
must first establish that the statement is true for n = 1; and then, assuming
the truth of the statement for n = k, show that it follows that the statement
is also true for n = k + 1. Consequently, since the statement is true for 1,
it is then true for 2; then, since it is true for 2, it is also true for 3, and so on.
Thus the statement must be true for each natural number. Use the method of
induction to prove that your answer to the previous exercise is correct.
1.23 Continuing the previous exercise, let S be the statement
1 + 2 + 3 + ... + n = n(n + 1)/2.
Prove S by the method of induction.
1.24 The method used in this section to prove that each polygon can be
triangulated is actually a proof by induction in disguise. Explain how to
rephrase this proof so that the use of the method of induction becomes
more apparent.
1.25 In the proof of the last section, we showed how to cut up a polygon
into triangles without the introduction of new vertices. Do you think an
analogous process will work for three-dimensional polyhedral solids? In
other words, can each polyhedral solid be cut up into tetrahedra (not neces-
sarily regular) without the introduction of new vertices?
1.26 See Exercise 1.22. Prove by induction: If 0 < a < 1 and n is a natural
number, then 0 < a^n < 1. Of course, you may use anything you know
about inequalities.
1.27 Prove by induction: If n is a natural number, then
(-1)^n = 1 if n is even,
(-1)^n = -1 if n is odd.
1.28 Prove by induction: If n is a natural number, then 4^n - 1 is evenly
divisible by 3. Note that if m is a natural number evenly divisible by 3, then
there exists a natural number k such that m = 3k.
1.29 Give an example of a pair of rectangles of the same area such that it is
not possible to cut one into finitely many squares that can be reassembled
to form the other.
1.30 Give an example of a rectangle that cannot be cut into finitely many
squares each two of which have the same area.
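
Before attempting an induction proof, it can be reassuring to check a statement for the first several natural numbers. The Python sketch below does this for the sum formula of exercise 1.23 and for the divisibility claim of exercise 1.28, taking the expression there to be the power 4^n minus 1. The function names and the choice of the first fifty natural numbers are assumptions made for this illustration.

```python
def sum_formula_holds(n):
    """Does 1 + 2 + ... + n equal n(n + 1)/2 for this particular n?"""
    return sum(range(1, n + 1)) == n * (n + 1) // 2


def power_of_four_minus_one_divisible_by_three(n):
    """Is 4^n - 1 evenly divisible by 3 for this particular n?"""
    return (4 ** n - 1) % 3 == 0


checked = range(1, 51)  # the natural numbers 1 through 50
print(all(sum_formula_holds(n) for n in checked))
print(all(power_of_four_minus_one_divisible_by_three(n) for n in checked))
```

A check of finitely many cases is evidence and nothing more; the statement for every natural number still requires the induction argument.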

1.4 HOW TO CUT A TRIANGLE TO MAKE A PARALLELOGRAM

This is much easier. In fact, you have probably already guessed how to do it.
If not, one hint is that it can be done with only one cut.
Let T be a triangle and let a and b be midpoints of two of its sides. Join
a with b by the straight-line segment [a, b], and cut T along [a, b]. This will
produce a small triangle and a trapezoid. Holding the trapezoid fixed,
rotate the triangle half a turn about either a or b. The resulting figure will be
a parallelogram. That it is indeed a parallelogram is the only fact not
immediately obvious. Establishing this fact is left for the next exercise.
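
The cut and the half turn can be carried out with coordinates. The Python sketch below is an illustration and not the proof requested in the exercise. It assumes that the triangle has vertices A, B, C with whole-number coordinates, takes a and b to be the midpoints of the sides [A, B] and [A, C], and rotates the small triangle about b; all function names are invented for the example.

```python
from fractions import Fraction


def midpoint(p, q):
    """Midpoint of the segment [p, q], computed exactly."""
    return (Fraction(p[0] + q[0], 2), Fraction(p[1] + q[1], 2))


def half_turn(p, center):
    """Image of p after rotating half a turn about center."""
    return (2 * center[0] - p[0], 2 * center[1] - p[1])


def area(vertices):
    """Area of a polygon from its vertices listed in order (shoelace formula)."""
    n = len(vertices)
    twice = sum(vertices[i][0] * vertices[(i + 1) % n][1]
                - vertices[(i + 1) % n][0] * vertices[i][1]
                for i in range(n))
    return abs(twice) / 2


def triangle_to_parallelogram(A, B, C):
    """Cut along [a, b] and rotate the small triangle A, a, b about b."""
    a, b = midpoint(A, B), midpoint(A, C)
    assert half_turn(A, b) == C          # the vertex A lands on C
    a_moved = half_turn(a, b)            # new position of the corner a
    return [a, B, C, a_moved]            # trapezoid plus rotated triangle


def is_parallelogram(quad):
    """True if one pair of opposite sides is equal and parallel."""
    p, q, r, s = quad
    return (q[0] - p[0], q[1] - p[1]) == (r[0] - s[0], r[1] - s[1])


A, B, C = (1, 3), (0, 0), (4, 0)
figure = triangle_to_parallelogram(A, B, C)
print(is_parallelogram(figure), area([A, B, C]) == area(figure))
```

The check succeeds for this one triangle; showing that it succeeds for every triangle is the content of the exercise that follows.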

Exercises

1.31 Establish, as indicated in the proof of this section, that the reassembled
triangle actually does form a parallelogram.
1.32 With what type of triangle will the method of this section produce a
rectangle rather than just a parallelogram?

1.5 HOW TO CUT A PARALLELOGRAM TO MAKE A RECTANGLE

This is almost as easy as the previous proof. It is possible to construct in the
parallelogram (which we suppose is not already a rectangle) a line from one
vertex where the interior angle exceeds 90° to the opposite side, such that this
line lies within the parallelogram and is perpendicular to the opposite side.
Cut along this line; you will obtain a trapezoid and a triangle. If the triangle
is moved without rotation to the opposite side of the trapezoid, the two will
fit together to form a rectangle. Again, the only problem is to show that the
new figure is indeed a rectangle, and we leave this for the exercises.
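
As a small illustration of this step, the Python sketch below places the parallelogram in a convenient position. The coordinates are an assumption of the example, not part of the text: the vertices are taken to be (0, 0), (b, 0), (b + s, h), and (s, h) with 0 < s < b, so that the perpendicular dropped from the vertex (s, h) meets the opposite side at (s, 0). The function names are likewise invented.

```python
def area(vertices):
    """Area of a polygon from its vertices listed in order (shoelace formula)."""
    n = len(vertices)
    twice = sum(vertices[i][0] * vertices[(i + 1) % n][1]
                - vertices[(i + 1) % n][0] * vertices[i][1]
                for i in range(n))
    return abs(twice) / 2


def parallelogram_to_rectangle(b, s, h):
    """Cut the parallelogram (0,0), (b,0), (b+s,h), (s,h) and slide the triangle."""
    if not (0 < s < b and h > 0):
        raise ValueError("this sketch assumes 0 < s < b and h > 0")
    # The cut runs from the vertex (s, h) straight down to (s, 0).
    triangle = [(0, 0), (s, 0), (s, h)]
    trapezoid = [(s, 0), (b, 0), (b + s, h), (s, h)]
    # Move the triangle, without rotation, to the opposite side of the trapezoid.
    moved_triangle = [(x + b, y) for x, y in triangle]
    rectangle = [(s, 0), (b + s, 0), (b + s, h), (s, h)]
    return trapezoid, moved_triangle, rectangle


trapezoid, moved_triangle, rectangle = parallelogram_to_rectangle(5, 2, 3)
print(area(trapezoid) + area(moved_triangle), area(rectangle))
```

When s exceeds b the perpendicular from that vertex misses the opposite side, and this sketch does not cover that case.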

Exercises

1.33 Show how to construct the perpendicular needed for the cut in the
proof of this section.

[Fig. 1.4 Finding the solution of x^2 = ab.]

1.34 Prove that the reassembled parallelogram in the proof of this section
actually does form a rectangle.
1.35 Under what circumstances will the reassembled parallelogram in the
last proof form a square, rather than just a rectangle?

1.6 HOW TO CUT A RECTANGLE TO MAKE A SQUARE


There is a trick to this construction. We must first determine the size of the
square, draw it, and then use it to determine where to make the cuts on the
rectangle. To draw the desired square, we use a method known to the ancient
Greek geometers. Let the rectangle be called R, let its short side have length
a, and let its long side have length b. (If R is already a square we are finished,
and no construction is necessary.)
Draw a straight line segment of length a + b, bisect it, and draw a circle with the
center at the point of bisection so that the segment of length a + b forms a
diameter of that circle. Since this diameter has length a + b, there is a point
p on it which divides the diameter into two segments, one of length a and one
of length b. Draw a perpendicular to the diameter at the point p. See Fig. 1.4.
This perpendicular must meet the circle at some point q. It turns out that
the length x of the segment [p, q] has the property that x^2 = ab (to be
established in an exercise), and hence a square of side length x will have the
same area as the rectangle R.

[Fig. 1.5 How to turn a rectangle into a square of the same area.]
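
The claim that x^2 = ab can be checked numerically before it is proved. The Python sketch below assumes a coordinate setup that is not in the text: the diameter is laid along a horizontal axis from 0 to a + b, so that the center of the circle is at distance (a + b)/2 from the left end and p is at distance a. The height of the circle above p is then found from the Pythagorean theorem. The function name `mean_proportional` is invented for the example.

```python
import math


def mean_proportional(a, b):
    """Length x of the segment [p, q] for a circle on a diameter of length a + b."""
    radius = (a + b) / 2
    offset = a - radius                 # signed distance from the center to p
    # q lies on the circle directly above p, so offset^2 + x^2 = radius^2.
    return math.sqrt(radius ** 2 - offset ** 2)


x = mean_proportional(1, 2)             # a rectangle with sides 1 and 2
print(x * x)                            # compare with the area a * b = 2
```

Because the computation uses floating-point arithmetic, the printed value may differ from ab in the last decimal places; the exact equality is what the exercise asks you to establish.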
There will now be a small technical problem if the rectangle R is more
than four times as long as it is wide, but in an exercise we will ask you to show
how such a rectangle can be converted into one which is less than four times
as long as it is wide. Hence we may assume that R itself is in fact less than
four times as long as it is wide.
Draw R and the square, which we shall call S, in such fashion that they
overlap as shown in Fig. 1.5. Draw the extra line L also shown in the same
figure, and cut R along that portion of L which lies in R. Then cut R along
the line segment M, that part of the right-hand side of S which lies below L.
Now R consists of several regions, numbered from 1 to 4 in the figure.
S consists of the regions numbered 1, 3, 5, and 6. We reassemble the pieces
of R to form S as follows:

• Leave piece 1 alone.

• Move piece 2 to the position of piece 5.

• Move the triangle composed of pieces 3 and 4 to the position of the triangle
composed of pieces 3 and 6.

No rotations are used in these motions. All we need to do is show that
triangle 2 is congruent to triangle 5 and that the triangle composed of pieces
3 and 4 is congruent to the triangle composed of pieces 3 and 6. Again, the
proof is left for the exercises.
We wanted the rectangle R to be less than four times as long as wide,
to ensure that the line L will angle down steeply enough to guarantee the
existence of triangle 3. In any case, we have shown that with two straight
cuts a rectangle can be reassembled into a square.
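The two cuts can also be tried out with coordinates. What follows is a sketch under assumptions of our own, since Fig. 1.5 is not reproduced here: we place R = [0, a] × [0, b] and S = [0, s] × [0, s] with a common lower left corner, and we take L to be the line joining the upper left corner of S to the lower right corner of R. The helper names are hypothetical. The check is a sanity test (the moved pieces have total area s² and stay inside S), not a proof of the congruences.

```python
import math

def shoelace_area(poly):
    """Area of a simple polygon given as a list of (x, y) vertices."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def translate(poly, dx, dy):
    """Slide a polygon without rotating it."""
    return [(px + dx, py + dy) for px, py in poly]

def rectangle_to_square(a, b):
    """Cut an a-by-b rectangle (b < a < 4b) and slide the pieces into a square.

    Assumed layout: R = [0, a] x [0, b], S = [0, s] x [0, s], and L joins
    (0, s) to (a, 0). M is the part of the line x = s lying below L.
    """
    if not (b < a < 4 * b):
        raise ValueError("this sketch needs b < a < 4b")
    s = math.sqrt(a * b)
    piece_1 = [(0, 0), (s, 0), (s, s - b), (a - s, b), (0, b)]  # stays put
    piece_2 = [(s, 0), (a, 0), (s, s - b)]      # below L, to the right of M
    piece_34 = [(a - s, b), (a, 0), (a, b)]     # the part of R above L
    moved = [
        piece_1,
        translate(piece_2, -s, b),              # piece 2 to position 5
        translate(piece_34, s - a, s - b),      # pieces 3, 4 to positions 3, 6
    ]
    return s, moved

side, pieces = rectangle_to_square(3.0, 1.0)
total_area = sum(shoelace_area(p) for p in pieces)
inside = all(-1e-9 <= x <= side + 1e-9 and -1e-9 <= y <= side + 1e-9
             for p in pieces for x, y in p)
print(side, total_area, inside)
```

With this layout L meets the top of R at the point (a - s, b), which lies to the left of the right-hand side of S exactly when a < 4b; this is the role played by the restriction on the shape of R.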

Exercises

1.36 In Section 1.6, an important preliminary to the construction is to show
how to construct a line segment of length x such that x² = ab (see Fig. 1.4).
It is shown in many courses in plane geometry that if the lines from q to the
endpoints of the diameter are drawn, then the angle formed at q is a right
angle. Use this fact to show that x² = ab.
1.37 In the proof of this section, it was necessary to use a rectangle less than
four times as long as wide. Suppose that R is a rectangle more than four
times as long as wide. Show how to convert R by straight line cuts into a
rectangle of equal area less than four times as long as wide. What if R is
exactly four times as long as wide?
1.38 The construction in this section states that the rectangle R has to be
less than four times as long as it is wide in order to ensure the existence of
triangle 3 (see Fig. 1.5). Establish this by showing the following to be true:
If the rectangle R is less than four times as long as it is wide, then the line L
angles down steeply enough to guarantee the existence of triangle 3, and not
otherwise.
1.39 The last step in the proof of this section is to show that triangle 2 is
congruent to triangle 5, and that the triangle composed of pieces 3 and 4 is
congruent to the triangle composed of pieces 3 and 6. Please do so.

1.7 HOW TO CUT SEVERAL SQUARES AND
REASSEMBLE THE PIECES TO FORM A SINGLE SQUARE
Actually all we will show is how to assemble two squares S₁ and S₂ into a
single square S. If there should happen to be a third square S₃, then we would
repeat the process with S and S₃ to obtain a single square, and so on.
Let S₁ and S₂ be two squares. If they should happen to have the same
area, then there is an extremely easy way to cut and reassemble them into a
single square, a way left for you to discover. So let us suppose that the area
of S₁ exceeds the area of S₂. As in Section 1.6, a preliminary construction is
needed to determine where the cuts should be made in S₁ and S₂ in order that
the resulting pieces can be reassembled to form the square S. Draw S₁ and
S₂ as shown in Fig. 1.6, as well as the auxiliary square A, and let S₁ and S₂
have side lengths a and b, respectively.

Fig. 1.6 How to turn two squares into a single square.

Note in the figure that each of the two points γ and δ is located at distance
b from the next vertex of A counterclockwise around A. Locate the two
points α and β similarly. Draw the segments [α, β], [β, γ], [γ, δ], and [δ, α],
shown as dashed lines in the figure. These segments form the boundary
of the desired square S. Cut S₁ and S₂ along the parts of the boundary of S
which lie in each. Now S₁ consists of three pieces numbered 1, 2, and 3;
S₂ consists of two pieces marked 4 and 5; and S itself must be formed by
rearranging these pieces so as to cover the regions numbered 1, 4, 6, and 8.
We may ignore the pieces numbered 7, 9, and 10, since they do not enter the
construction; they are merely artifacts introduced to help discover where
to make the cuts, and are not part of S₁, S₂, or S. Here is the recipe
for reassembling the pieces of S₁ and S₂ to form S:
• Leave piece 1 alone.
• Remove piece 4 and leave it on the side for a while.
• Pick piece 2 up and move it, without rotation, so that it covers region 4
and part of region 8; the point marked γ goes to position δ.
• Rotate piece 5 counterclockwise 90° about the point δ, thus covering the
top half of region 6.
• Use piece 3 to cover the rest of region 6; the point marked β on piece 3
goes to position γ.
• Find piece 4 (left on the side for a while) and use it to cover the rest of
region 8; the point marked δ on piece 4 goes to position α.

A little easy geometry involving lengths and right angles can be used to
show that the pieces actually fit correctly and do form a square. We chose
this particular construction because Fig. 1.6 can also be used to prove the
Pythagorean Theorem. There are other proofs of the Pythagorean Theorem,
just as there are other proofs, using different figures, that two squares can be
assembled into a single square, but we prefer this one because it does two
jobs instead of one.
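The "little easy geometry" can be previewed numerically. In the sketch below the coordinates are an assumption of ours: the auxiliary square A is given corners (0, 0), (a + b, 0), (a + b, a + b), and (0, a + b), listed counterclockwise, and on each side we mark the point at distance b from the next corner counterclockwise. Which of these points carries which Greek label in Fig. 1.6 is not assumed. The function names are hypothetical.

```python
import math

def inscribed_square(a, b):
    """Four marked points on the sides of the auxiliary square of side a + b.

    Each point lies at distance b from the next corner of A counterclockwise
    (assumed placement of A with one corner at the origin).
    """
    n = a + b
    return [(a, 0.0), (n, a), (b, n), (0.0, b)]

def check_square(points):
    """Return the side lengths and the dot products of consecutive sides."""
    m = len(points)
    sides = [(points[(i + 1) % m][0] - points[i][0],
              points[(i + 1) % m][1] - points[i][1]) for i in range(m)]
    lengths = [math.hypot(dx, dy) for dx, dy in sides]
    dots = [sides[i][0] * sides[(i + 1) % m][0] +
            sides[i][1] * sides[(i + 1) % m][1] for i in range(m)]
    return lengths, dots

lengths, dots = check_square(inscribed_square(4.0, 3.0))
print(lengths)  # four equal sides
print(dots)     # zero dot products mean right angles at every corner
```

For a = 4 and b = 3 every side has length 5, so the area of the tilted figure is 25 = 4² + 3², in agreement with the Pythagorean Theorem.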

Exercises

1.40 Show that the polygon S bounded by the segments [α, β], [β, γ],
[γ, δ], and [δ, α] is in fact a square. See Fig. 1.6.
1.41 Show that the reassembly in which the squares S₁ and S₂ are cut into
pieces which form S actually works; that is, that the implied congruences
are actual.
1.42 The Pythagorean Theorem states that if a right triangle has legs of
length a and b, respectively, and hypotenuse of length c, then a² + b² = c².
Show how ideas in the proof in this section and use of Fig. 1.6 can be used
to establish this theorem.

1.8 HOW TO CUT A SQUARE INTO PIECES WHICH CAN
BE REASSEMBLED TO FORM A GIVEN POLYGON OF
THE SAME AREA
We discussed this procedure immediately after we listed in Section 1.2 the
six steps in the proof of the Bolyai-Gerwin Theorem. What we do, of course,
is make the polygon into a square, using the first five steps. Clearly, then,
the actual process of assembling the pieces of the polygon to form a square
may be reversed. Or if you prefer, Exercise 1.6 has some bearing on this
problem, and can be used to show the existence of a method for cutting up a
square into pieces which can be reassembled to form a given polygon of the
same area.

To summarize, suppose that P and Q are two polygons of the same area.
Chop each into triangles. Make each triangle into a parallelogram, each
parallelogram into a rectangle, and each rectangle into a square. Assemble
all the squares obtained from P into one giant square S, and all the squares
obtained from Q into one giant square T. Then S and T will have the same
area, since each has the area common to P and Q. If you draw T together
with its cuts as a transparent overlay and place this overlay upon S, you will
see where to make additional cuts on S. With these additional cuts, the pieces
of S can be carried backward through the construction of T from Q, and so
reassembled to form Q. This maneuver shows that P and Q are equidecomposable, and
concludes the proof of the Bolyai-Gerwin Theorem.

Exercise

1.43 Exactly where in the proof of the Bolyai-Gerwin Theorem did we use
the hypothesis that the polygons P and Q have the same area?

1.9 SOME CONCLUDING REMARKS


We have derived a great deal of pleasure from the actual construction of
puzzles, using the techniques of the proof of the Bolyai-Gerwin Theorem.
If you think you too would enjoy such constructions, here are two pieces of
advice. One is aesthetic; one pertains to mechanical details.
For the aesthetic advice, an important point to remember is that the
puzzles will be unsatisfactory if some of the pieces are too small or if there is
a great disparity in their relative sizes. Constructions proposed for puzzles
should be drawn carefully first, perhaps using cardboard, to verify that there
are no tiny pieces. In addition, there is a code of honor among puzzle-
makers to the effect that no unnecessary cuts should be made merely for the
purpose of confusing the puzzle-worker. Since the only practical way to
apply the proof of the theorem to the invention of a puzzle is to turn some
pleasing polygon into a square, rather than the reverse, this is indeed the right
approach; but variations in where the cuts are made should be tried not
only to avoid very tiny pieces, but also to use the smallest possible number
of pieces in the puzzle. It is quite difficult to work such a puzzle with more
than fifteen pieces, and tremendously difficult to work one with more than
twenty-five.
Mechanically, a very satisfactory method of constructing the puzzle pieces
themselves is as follows: Cut a close-grained light hardwood about one-
quarter inch thick into the shape of some polygon, such as a regular hexagon.
Follow the steps of the construction, lightly drawing lines on the wood with a
pencil. Make the necessary cuts after each step using a fine-bladed jigsaw,
and the reassembled pieces may be glued in the new shapes to a piece of stiff
cardboard to hold them fast for subsequent drawing and cutting. The glue
and the pencil marks may be sanded off after the final cuts are made and the
cardboard removed. The grain of the wood may give a few clues to the puzzle-
worker, but he will appreciate this. Keep a record of your construction in
case your victim challenges you to work the puzzle.
A natural question that would occur at this point to a mathematician is
the following: Does a theorem analogous to the Bolyai-Gerwin Theorem
hold for a three-dimensional figure? That is, is it true that if we have two
solid polyhedra with the same volume, then we can show that they are
equidecomposable, still of course using only straight-line cuts? The answer
is no. Of course, there are some pairs of solid polyhedra with the same volume
which are equidecomposable; thickening up any two equidecomposable
polygons, as you would in effect do if you constructed a puzzle, would
provide such an example. That the proposed theorem is not true does not
mean that it is always false, merely that there do exist pairs of polyhedra
with the same volume which are not equidecomposable. In fact, what may
be the simplest possible example, a cube and a regular tetrahedron of the
same volume, is a pair of nonequidecomposable
polyhedra with the same volume. This example was discovered by the German
mathematician M. Dehn and first published in 1902.
One might next ask if, given two polyhedra with the same volume, it
is possible to tell by a simple test whether or not they are equidecomposable.
The Swiss mathematician H. Hadwiger found the answer to this question
in 1949, but to provide the necessary definitions here to express the answer
would take us too far afield. Let it suffice to say that, as one might expect,
the question is resolved on the basis of the dihedral angles of the two
polyhedra.
Exercises

1.44 A famous problem of antiquity, studied by early geometers, was the
problem known as Squaring the Circle. Given a circle, the problem was to
construct a square of equal area, using only an unmarked straightedge and
compass. This problem is usually mentioned along with two companions,
that of trisection of a given angle and the duplication of the cube (given a
cube, to construct with straightedge and compass the edge length of a cube
with double the volume of the first). These problems remained unsolved for
thousands of years, but the solutions are now known. The answer is that in
each of the three cases cited above no such construction can exist. Please do
not misinterpret this statement. We do not mean that nobody can square
the circle with straightedge and compass because nobody knows how.
What we do mean is that in each case it has been proved that no such con-
struction exists, nor ever can exist. Of course, mathematics departments of
various famous universities receive from time to time proposed constructions.
The tendency is to ignore them, or return them to the proposer with a note
to the effect that the construction is not correct, and that if the proposer
wants to know where the error is, he should be willing to pay a fee for pro-
fessional services. This has given mathematicians quite a reputation for
dogmatism and narrow-mindedness, but most would prefer to work on
the many problems to which the solution is yet unknown rather than search
for an error in a construction they know in advance to be incorrect.
The reason that these constructions are impossible has to do with the
fact that, given a line segment of length 1, unmarked straightedge, and com-
pass, certain lengths cannot be constructed. However, if a number can be
constructed, such as 2 or 1/4 (how?), its square root can also be constructed.
The technique is concealed in the proof of the Bolyai-Gerwin Theorem,
in Section 1.6 of this chapter. You are invited to discover the technique and
use it to construct a line segment whose length is the square root of 5, given
straightedge, compass, and a line segment of length 1.
Curiously enough, the impossibility of squaring the circle with straight-
edge and compass has little bearing, as far as we know, on the Unsolved
Problem in Mathematics mentioned early in this chapter: the problem of
whether a square and a circle of the same area are equidecomposable in the
general sense that curved cuts are allowed but only finitely many pieces
may be used. We do not require that only straightedge
and compass be used. The problem here is one of finding where to make the
cuts, or alternatively, of showing that there can be no way to make the cuts that
will work. This problem was first stated by A. Tarski in Fundamenta
Mathematicae in 1925.
1.45 The first three stages in the construction of the Snowflake Curve
are shown in Fig. 1.7. Some techniques of topology may be used to show that
it makes sense to talk about "the figure obtained by continuing this process,
over and over, once for each natural number." That is, of each point in the
plane it can be determined whether that point is eventually (and thus per-
manently) within the curve, or never within. The boundary of the figure,
which is what we mean by the snowflake curve itself, consists of those points
which eventually get on the boundary at some stage of the construction and
then stay on the boundary in each successive stage.
Since the figure is bounded its area is finite. However, the perimeter is
infinite; that is, given any whole number n no matter how large, at some
stage in the construction of the snowflake curve its perimeter exceeds n,
and the perimeter increases at each stage. In fact, the perimeter is multiplied
by 4/3 each time we pass to the next stage. So if the initial perimeter is 1,
we obtain the following sequence of perimeters for the successive stages of
the construction:
1, 4/3, (4/3)², (4/3)³, (4/3)⁴, (4/3)⁵, ...,
and this sequence of numbers increases without bound. Why?
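To get a feeling for how quickly these perimeters grow, one can simply compute them. The sketch below uses exact fractions; the function name first_stage_exceeding and the convention that the initial figure is stage 0 are our own choices for illustration. It reports the first stage at which the perimeter exceeds a given whole number, and leaves the question "Why?" to you.

```python
from fractions import Fraction

def first_stage_exceeding(bound, ratio=Fraction(4, 3)):
    """First stage whose perimeter exceeds `bound`, and that perimeter.

    Stage 0 is the initial figure with perimeter 1; each later stage
    multiplies the perimeter by `ratio`.
    """
    perimeter = Fraction(1)
    stage = 0
    while perimeter <= bound:
        perimeter *= ratio
        stage += 1
    return stage, perimeter

for n in (10, 100, 1000):
    stage, perimeter = first_stage_exceeding(n)
    print(n, stage, float(perimeter))
```

Each additional factor of 10 in the bound costs only about eight more stages, since (4/3)⁸ is just under 10.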

Fig. 1.7 First three stages in the construction of the Snowflake Curve.

If you know a little about infinite geometric series, you can use this
knowledge to calculate the area of the figure bounded by the snowflake
curve. Such a series is one in which each term is a fixed multiple, say r, of
the previous one; that is, the series has the form
a₁ + a₂ + a₃ + a₄ + ...,
where a₂ = ra₁, a₃ = ra₂, a₄ = ra₃, and so on. The sum of such a series is
the first term divided by 1 - r. This formula works only if -1 < r < 1,
but such will be the case if you find the right series for computing the area
bounded by the snowflake curve. See whether you can do this.
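The formula for the sum can be tested on a series unrelated to the snowflake, so as not to spoil the exercise. In this sketch the first term 1 and the ratio r = 1/2 are sample values of our own choosing, and partial_sums is a hypothetical helper; the partial sums should approach 1/(1 - 1/2) = 2.

```python
def partial_sums(first, r, terms):
    """Running totals of first + first*r + first*r**2 + ... (`terms` terms)."""
    sums = []
    total = 0.0
    term = first
    for _ in range(terms):
        total += term
        sums.append(total)
        term *= r
    return sums

first, r = 1.0, 0.5
sums = partial_sums(first, r, 60)
print(sums[:5])                    # the first few partial sums
print(sums[-1], first / (1 - r))   # partial sum after 60 terms vs. the formula
```

Trying a ratio outside the range -1 < r < 1, such as r = 4/3, shows the partial sums growing without bound instead of settling down.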
1.46 It is possible that a plane figure has no area. We do not mean "zero
area," nor do we mean "infinite area." A much stranger phenomenon can
take place.
What ought "area" to mean? Suppose we list the properties it should
have, and then ask first if any such thing exists. That is, does there exist a
function A which assigns to each bounded subset of the two-dimensional
plane a nonnegative real number, called the area of the subset, with the
following properties:

a) The area of a line segment, or a point, or the "empty set," is zero.
b) If R and S are congruent bounded plane sets, then they have the same
area.
c) If R and S are bounded plane sets which overlap in a set of zero area, then
the area of R ∪ S is the sum of the area of R and the area of S.
d) If R ⊂ S then the area of R does not exceed the area of S.
e) The area of a rectangle is the product of its length and its width.
The Polish mathematician S. Banach published in 1923 a proof that no
such function A can exist. The only way, then, to get an effective area-
measuring function such as we would like A to be is to drop one or more of
the above stipulations as to its behavior. If property (e) is dropped, then it
turns out that there is only one such function, and it is a rather dull one;
it assigns area zero to every set. So we might as well keep property (e). But
we surely do not want to give up any of the first four properties listed, and so
the only solution is to drop the stipulation that the area function A measures
every bounded subset of the plane. If we do this, it is then possible to con-
struct a function A with all the desirable properties listed above; in fact, A
will coincide with our intuitive notion of "area" for geometric figures. For
example, A will assign the "correct" value for the area of a circle. What
goes wrong, as shown by Banach, is that there will exist bounded plane
figures to which A cannot assign any area value at all while still satisfying
all the listed properties. Such a set is said to be nonmeasurable.
No one has ever constructed an example of such a set; we merely have a
proof of its existence. This is understandably not very satisfying, but it is
also unavoidable. Incidentally, nonmeasurable sets exist in every dimension.
Can you use the techniques of this chapter to construct an area function
A with the properties listed above so that A does assign an "area" to every
bounded subset of the plane with polygonal boundary?
1.47 The idea of volume in dimension three is a natural one, although we
have already mentioned that there are polyhedral figures of equal volume
which are not equidecomposable. On the other hand, S. Banach and A.
Tarski found that the situation is far worse than one might expect. It turns
out that the solid ball of radius 1 and the solid ball of radius 2 are
equidecomposable, though here we use the term in a much more general sense
than previously, for the cuts are not straight, and most of the pieces into
which the balls are cut are nonmeasurable (see the previous exercise). One
dramatic way of putting it is this: If matter were homogeneous, it would be
possible to cut the sun into pieces and reassemble these into a solid ball the
size of a pea.
This sort of thing cannot happen for bounded plane sets, but it is possible
to solve the following problem: Show how to cut into infinitely many pieces
a square of side 1 so that these pieces may be reassembled into a rectangle
of twice the area of the square.
1.48 One method of cutting into pieces a polygon P and reassembling them
into another polygon Q may be called more efficient than another such
method if fewer pieces are involved. Very little is known about most efficient
procedures in the general case other than the mere fact of their existence.
Our methods, applied to turning an equilateral triangle into a square,
involve cuts to form a total of seven pieces. Find a more efficient way,
using shortcuts.
1.49 Suppose that P and Q are rectangles with short sides a and c,
respectively, and long sides b and d, respectively; and suppose in addition that
P and Q have different areas and that b is larger than d. Show how to cut P
up into finitely many polygonal pieces which can be reassembled into a
rectangle which has one side equal to d. This shows that two rectangles can
be assembled into a single rectangle. How?
1.50 Can you use Exercise 1.49 and other information in this chapter to
provide a shorter proof of the Bolyai-Gerwin Theorem? How?
1.51 It was mentioned in the last section that there do exist polyhedral
solids of the same volume which are not equidecomposable. However, the
Bolyai-Gerwin Theorem may well hold for certain types of polyhedra. For
example, is it true that two rectangular parallelepipeds of the same volume
are equidecomposable? Explain.
1.52 At the beginning of this chapter, it was stated that one can make a
puzzle of a number of pieces that fit into a rectangular box, and which can
be reassembled to form a triangle, a hexagon, a symmetrical five-pointed
star, or all three. Of course, the Bolyai-Gerwin Theorem shows how to cut
a rectangle into pieces that can be reassembled to make a hexagon, but how
does it follow that a rectangle can be cut into pieces that can be reassembled
to form all three of the above figures?
1.53 Is it possible to cut up two cubes of equal volume into pieces which
can be reassembled to form a single cube? Explain.
1.54 Continuing the previous exercise, is it possible to cut up two cubes of
equal volume into a number of small cubes of equal volume which may be
reassembled to form a single cube? Reduce this problem to one of solving
a certain simple algebraic equation.
1.55 Is it possible to assemble finitely many squares, no two of which have
the same area, into a rectangle? Explain.

NOTES AND REFERENCES


The most convenient text covering the material of this chapter, as well as
associated topics, is V. G. Boltyanskii's Equivalent and Equidecomposable
Figures (Heath, 1963; translated by A. K. Henn and C. E. Watts from the
first Russian edition, Moscow, 1956). In particular, the results of Dehn and
Hadwiger about equidecomposable polyhedra are given in detail.
For those who wish to pursue these matters in the one-dimensional
case, W. Sierpinski's long article "On the Congruence of Sets and Their
Equivalence by Finite Decomposition" is excellent. This was originally
published in Vol. 20 of Lucknow University Studies (1954), and is available
in English in Monographs, published by Chelsea, containing the article by
Sierpinski as well as one by Klein, one by Runge, and one by Dickson on
other topics. One might also see Sierpinski's article, "Sur quelques
problèmes concernant la congruence des ensembles de points," in Elemente der
Mathematik, Vol. 5, pages 1-4 (1950).
With respect to Exercise 1.46, the paper of Banach's referred to is "Sur
le problème de la mesure," in Fundamenta Mathematicae, Vol. 4, pages 7-33
(1923). A general reference covering a wide variety of problems about plane
sets is the excellent text by Hadwiger, Debrunner, and Klee, Combinatorial
Geometry in the Plane, published by Holt, Rinehart, and Winston (1964).
A mention of equidecomposability is made on page 52, and several additional
references are given.
Additional topics on triangulation, construction of an area-measuring
function for the plane, and an alternate proof of the Bolyai-Gerwin Theorem
may be found in E. E. Moise's Elementary Geometry from an Advanced
Standpoint (Addison-Wesley, 1963), in Chapters 14 and 24. Chapter 19 also
contains some interesting material on straightedge and compass constructions.
In mathematics, as well as in the physical sciences, it frequently happens
that discoveries are made almost simultaneously by men working inde-
pendently. In the case of the Bolyai family, we have a double coincidence.
Bolyai Farkas (whose name is frequently rendered as Wolfgang Bolyai de
Bolya) was a Hungarian mathematician born in 1775. He studied at the
University at Göttingen, where he and Gauss, then also a student there,
became close friends. The elder Bolyai returned to teach at Maros-Vasarhely,
and during this time his son Bolyai Janos (or Johann Bolyai de Bolya)
acquired an interest in mathematics. In 1832 Farkas published his major
mathematical work, the Tentamen, covering a variety of geometrical ideas,
and in which he stated the theorem we have called the Bolyai-Gerwin
Theorem. At almost the same time (perhaps within a year) the German
officer and amateur mathematician Gerwin also published the same result.
Meanwhile, Janos had been working on the problem of the independence
of Euclid's famous parallel postulate, and in the process discovered that the
parallel postulate could be replaced by a nonequivalent alternative. The
resulting geometry, known as non-Euclidean geometry, was at that time a
major breakthrough in mathematical thought; indeed, so important that the
reader is urged to consult a history of mathematics for a full appreciation of
its implications. The work of the younger Bolyai was published as a 26-page
appendix to his father's Tentamen. Again, almost simultaneously, the
Russian mathematician Lobachevsky came up with much the same result;
meanwhile, when he heard of this work of the younger Bolyai, Gauss revealed
to Bolyai Farkas that he too had obtained such results, but had hesitated
to publish them because he thought they might not be well received.
This was quite disappointing to Bolyai Janos, who never again published
any mathematical results; curiously enough, he is by far the better known of
the two Bolyais. This is probably as it should be, inasmuch as his discovery
of non-Euclidean geometry has had major implications in both the founda-
tions of mathematics itself as well as in Einstein's Theory of Relativity.
Bolyai's work went almost unnoticed for thirty-five years, until it was
noted after his death by Richard Baltzer in 1867. Finally, in 1894, a memorial
stone was placed on Bolyai Janos' grave in Maros-Vasarhely.
CHAPTER 2

BRUNNIAN LINKS

Examine the three rings shown in Fig. 2.1, close the book, and try to reproduce
the drawing.
Many people have some difficulty in correctly drawing the three rings of
Fig. 2.1, known as the Borromean Rings, and will draw instead three rings
linked much as the three shown in Fig. 2.2. As soon as the difference between
the two is seen, though, it becomes easy to draw the Borromean Rings. The
distinguishing property of the Borromean Rings is that each of the rings lies
completely over one of the other two, and completely under the other. The
three rings shown in Fig. 2.2, on the other hand, have the property that
each actually links each of the other two. We shall examine in this chapter the
implications of this difference, as well as properties of such figures made of
other numbers of rings.
For the rest of this chapter, we shall assume that all such figures lie in
ordinary three-dimensional space, so that if you wish you may construct
examples made of string or wire. The circular shapes into which your material
is formed will not have to be perfectly circular, nor is the material itself
important with regard to the mathematical properties that will concern us.
All we care about is the manner in which the various curves link one another.
Hence we immediately pass to a convenient mathematical abstraction,
that of the simple closed curve. Just as Euclid saw in the real world various
rough approximations to a perfectly straight line (such as the line where the
wall meets the ceiling), and passed to the abstraction of the geometric straight
line without width or end, so also do mathematicians who wish to examine
properties of knots and linking curves pass to the abstract idea of a simple
closed curve.

Fig. 2.1 The Borromean rings.

Fig. 2.2 A (3, 1)-Brunnian link.

Fig. 2.3 A wild simple closed curve.

2.1 THE SIMPLE CLOSED CURVE


Let us consider the reality from which the concept of the simple closed curve
is abstracted. Imagine an ordinary piece of string with the ends woven to-
gether so as to form a homogeneous and continuous curve; then think of the
midline of this curve. This abstraction, the midline, is an example of a
simple closed curve. The important properties are these:
a) No point separates a simple closed curve. That is, no single scissors-cut
can separate it into two pieces.
b) Each set of two points does separate a simple closed curve. That is,
two scissors-cuts will separate the curve into two (why not more?)
pieces.
c) The simple closed curve is one-dimensional: every sufficiently small
connected piece of it has the same dimensional properties as a small
piece of Euclid's perfect geometric straight line.
d) The simple closed curve is bounded: it is contained in some sphere of
sufficiently large radius.
Mathematicians were surprised, as you might be, to discover that the
above four properties leave something to be desired if they are supposed to be
defining properties of the common notion of a simple closed curve; at least,
something to be desired if this definition is to be an accurate abstraction from
reality. For this definition does not exclude such an object as is shown in
Fig. 2.3.

Fig. 2.4 Tame simple closed curves.

The object shown in Fig. 2.3 does have the four properties listed above,
and so ought to be considered a simple closed curve, but is not an abstraction
from reality since it certainly cannot be tied with a piece of string. There are
infinitely many loops, decreasing in size; of course, the entire figure cannot
be drawn, and so most of it is concealed in the box. Such curves are said to
be wild, as opposed to the tame ones we shall study. We may prevent the
occurrence of any wild simple closed curves henceforth by the following
agreement: We consider a curve only if it can be continuously deformed
without self-intersection onto a polygonal curve (one formed of a finite
collection of straight-line segments). Mostly for aesthetic reasons we shall
continue to draw our tame simple closed curves without corner points;
just imagine if you wish that all the corners have been rounded off slightly.
Each of the simple closed curves shown in Fig. 2.4 is tame; the one on the
left is an approximation by a polygonal simple closed curve to the one on the
right.
Since the curves shown in Figs. 2.1 and 2.2 can be continuously deformed
without self-intersections onto polygonal curves, they too are tame; but the
curve shown in Fig. 2.3 does not have this property. (This is not obvious.)
Note that the Borromean Rings of Fig. 2.1 have the property that each
ring lies completely over the one immediately clockwise to it. This means
that if any one of these three simple closed curves is removed from the
Borromean Rings, the remaining two come apart. However, you will have
little trouble in convincing yourself that the three do not come apart. Because
of this interesting property, the Borromean Rings have been used for centuries
in the Christian religions as a symbol of the Holy Trinity. On a less reverent
note, they have also been used as the trademark of a well-known manufac-
turer of beer and ale. The mathematician H. Brunn may have been the
first to study generalizations of the Borromean Rings, about 1892, and these
generalizations are known to some mathematicians as Brunnian links for
this reason.
What generalizations? Why, one would naturally ask if it is possible to
construct four simple closed curves with the Borromean Property: The
four are linked together but removal of any one causes the remaining three
to fall apart. And why stop with four? Do there exist ten, or twenty, or even
a hundred simple closed curves, the totality linked together, but such that
the removal of any one causes the rest to fall apart? We shall examine
these generalizations as well as others in this chapter.
Exercises

2.1 We listed earlier the four properties of a simple closed curve:


a) No point separates a simple closed curve.
b) Each two-point set separates a simple closed curve.
c) A simple closed curve is one-dimensional.
d) A simple closed curve is bounded.
If there exists a figure in three-dimensional space which is not a simple closed
curve and has three of these properties but not the fourth, it shows that the
fourth property is essential for an abstract definition of simple closed curve.
Can you construct a figure having properties (a), (c), and (d), but not property
(b)? What about other combinations?
2.2 Which of the defining properties of a simple closed curve does a straight-
line segment have?
2.3 Which of the defining properties of a simple closed curve does an infinite
straight line have?
2.4 Which of the defining properties of a simple closed curve does a
theta-curve (a figure shaped like the Greek letter θ) have?
2.5 Which of the defining properties of a simple closed curve does a flat
two-dimensional circular disk have?
2.6 Construct an example of a wild simple closed curve essentially different
from that shown in Fig. 2.3.
2.7 Construct an example of four simple closed curves in three-dimensional
space with the property that the removal of a certain one of these causes
the other three to come apart, but removal of any one of the other three
leaves the remaining three linked together.
2.8 Each of the curves shown in Fig. 2.4 is said to be knotted because neither
can be continuously deformed into a circle in three-dimensional space
without self-intersection at some stage. On the other hand, a circle or a
square is an unknotted simple closed curve. Show that a square is unknotted.
2.9 Your intuition should tell you that a tame simple closed curve is un-
knotted (see Exercise 2.8) if and only if it can be continuously deformed in
three-dimensional space until it lies in a flat plane without self-intersection.
Show that this last condition is equivalent to the following: The tame simple
closed curve can be continuously deformed in three-dimensional space until
it lies on the surface of a round two-dimensional sphere without self-inter-
section.
2.10 It is difficult to show that each of the curves of Fig. 2.4 is knotted, but
you can show that either (and thus both) can be deformed so as to lie
on the surface of a two-dimensional torus without self-intersection. Please
do so. (A torus is the mathematician's abstraction of the surface of a one-
hole doughnut.)

2.2 LINKS AND THEIR PROPERTIES


We have already introduced some technical terms and concepts without
definition. We need these definitions to ensure that the writer and the reader
agree on precisely what is meant, but we will define these terms ("linking,"
"falling apart") in such a way as to be as near as possible to the common
and natural interpretations of these terms. This is how we have been able so
far to use these terms in a very nonmathematical fashion, without previous
definition, with some assurance that no confusion as to precise meaning has
yet arisen. However, after the definitions, we will begin to use the math-
ematical terms now current to remind you that these terms have been defined
in a certain way, and to prevent the connotations associated with the common
terms from creeping in and clouding the issues.
An n-link is a collection of n simple closed curves in three-dimensional
space. (Remember that we shall deal only with tame simple closed curves; n
is merely a positive whole number.) See Fig. 2.5.
An n-link is splittable if it is possible to deform it continuously in
three-dimensional space in such a way that part of the link lies within B_1 and
the rest of the link lies within B_2, where B_1 and B_2 are mutually exclusive
solid balls in three-dimensional space. Figure 2.6 shows an example of a
splittable 3-link.
Your intuition is in good working order if it tells you that if an n-link
composed of curves C_1, C_2, ..., C_n is splittable, then no one of the curves
C_i can lie partly within B_1 and partly within B_2. What does happen is this:
Some (but not all) of the curves lie entirely within B_1, some (but not all) of
the curves lie entirely within B_2, and each curve lies entirely within exactly
one of the two balls. For example, if a 3-link is splittable, then it can be
continuously deformed so that two of its curves lie entirely within B_1 and
the third entirely within B_2, where (as before) B_1 and B_2 are disjoint solid
balls.

Fig. 2.5 A 4-link.

Fig. 2.6 A splittable 3-link.

Fig. 2.7 A 4-link and one of its 3-sublinks.

An n-link composed of the curves C_1, C_2, ..., C_n is said to be completely
splittable provided that there exist mutually exclusive balls B_1, B_2, ..., B_n
in three-dimensional space such that the link can be deformed continuously
so that each C_i lies entirely within the ball B_i.
The example of Fig. 2.6 is a splittable 3-link which is not completely
splittable, because C_2 and C_3 cannot be split apart. So all that splittability
means is that the link comes at least partly apart; completely splittable links
come entirely apart. If we say a link is nonsplittable, we mean that not even
one of the curves involved, or any pair, or any combination, can be separated
from the rest without cutting.
Finally, given any n-link composed of the curves C_1, C_2, ..., C_n, a
sublink of this link is simply some subcollection of the curves C_1, C_2, ..., C_n
(obtained, if you like, by erasing the ones not in the collection). A k-sublink
of the link L is just a k-link which is a sublink of L. This of course
makes sense only for 1 ≤ k < n, where n is the number of curves used to
form L.
Figure 2.7 shows on the left a 4-link, and on the right one of its 3-sublinks.
Please note that a collection of curves in a sublink retains the linking properties
it enjoyed in the whole link, although removal of some curves to form a
sublink out of the remainder may cause splittability where none existed before.
For example, consider the 2-sublinks of the Borromean 3-link.
We may use our new terminology to describe the property of interest
that distinguishes the Borromean Ring example from the example shown in
Fig. 2.2: The Borromean Rings form an example of a nonsplittable 3-link
every 2-sublink of which is completely splittable. And the generalizations we
seek may be described in the following way:
a) Does there exist a nonsplittable 4-link such that each of its 3-sublinks
is completely splittable?
b) Given a whole number n ≥ 5, does there exist a nonsplittable n-link
with each of its (n - 1)-sublinks completely splittable?
c) Given whole numbers n and k with 1 ≤ k < n, does there exist a
nonsplittable n-link such that each of its k-sublinks is completely splittable
but each of its (k + 1)-sublinks is nonsplittable? In particular, does
there exist an example of a nonsplittable 4-link with every 3-sublink
also nonsplittable, but with every 2-sublink completely splittable?

Exercises

2.11 Is every splittable 2-link also completely splittable? Why?


2.12 Is every splittable 3-link also completely splittable? Explain your
answer.
2.13 Show how to construct, for each whole number n > 2, an example of a
nonsplittable n-link each of whose sublinks is also nonsplittable.
2.14 Show how to construct, for each whole number n > 3, an example
of a nonsplittable n-link one of whose (n - 1)-sublinks is completely
splittable. Can this be done in such a way that one and only one of the
(n - 1)-sublinks is completely splittable, and all the other (n - 1)-sublinks
are nonsplittable?
2.15 Construct infinitely many circles C_1, C_2, ... in three-dimensional
space forming a nonsplittable link L such that removal of any one of the
curves C_j from L produces a splittable link. For the purposes of this exercise,
how is the word "link" being used? What is a reasonable definition of
"splittable"?

2.3 A SIMPLE ALGEBRA


We shall answer the questions raised at the end of the preceding section by
use of an extremely simplified form of analytic geometry. Analytic geometry
may be described as the process of assigning algebraic equations or expres-
sions to geometric objects so that we can make deductions which may not
be geometrically apparent. The simplification we make lies in our new simple
algebra. We shall use the customary symbols a, b, c, ... of ordinary algebra,
but the rules of our algebra are very few in number. Here they are:
1. The only operation is "multiplication." We shall denote the product of
a and b by ab.
2. There is a multiplicative identity, which we shall denote by 1. It has the
property that for each a, 1a = a = a1.
3. The associative law for multiplication holds; that is, we may ignore
parentheses. It will always be true that (ab)c = a(bc), so that we may
simplify an expression such as a(b[cd(b)]) to abcdb.
4. Finally, for each symbol a in our algebra, there exists an element denoted
by a^{-1} (and pronounced "a-inverse") in our algebra such that
aa^{-1} = 1 = a^{-1}a.
To save space, we shall abbreviate such expressions as aaaa by a^4, and
the usual laws of exponents can be shown valid:
a^m a^n = a^{m+n} and (a^m)^n = a^{mn}

for all whole numbers m and n and each element a of our algebra. In
particular,
(a^{-1})^n = a^{-n} and aa^{-1} = a^0 = 1,

thus justifying our use of the notation a^{-1} for the inverse of the element a.
What we cannot use is the so-called commutative law; in general it will
not be true that ab = ba. The only complicated thing is to determine what
the symbols a, b, c, . .. represent. They are certainly not whole numbers,
nor are they real numbers. Moreover, the operation which we have so
blithely called "multiplication" certainly cannot be ordinary multiplication,
since the objects we are "multiplying" are not numbers. We need to explain
what the symbols a, b, c, and so on represent, and what the meaning of the
"product" ab is.
Let A be a circle in three-dimensional space, such as is shown in Fig. 2.8.
Warning: So far, "A" is just the name of a circle, and the symbol A will
not be an element of our algebra. The object a of our algebra will, however,
be closely associated with the circle A, and the association is so natural as to
cause us to use almost the same name (differing only in upper as opposed to
lower case) for two different things.
Imagine a curve in three-dimensional space that starts at the tip of your
nose, passes through the circle A once going away from you, curves around,
and returns to the tip of your nose. (The tip of your nose is denoted by p in
Fig. 2.8.) This new curve is almost the object a of our algebra. But all we
care about is the fact that this new curve passes through the circle A a net
of one time away from you, and it is the collection of all such curves that
deserves the name a. Thus what the element a of our algebra represents is the
net effect of passing through the circle A one time going away, or alternatively,
the set of all such curves that start and end at the tip of your nose and pass
through the circle A once going away.

Fig. 2.8 A and B.

Your intuition should tell you that if x is any other curve also passing
through A a net of one time going away, then the curve x can be continuously
deformed to the curve representing a (as shown in Fig. 2.8) without cutting x
or the circle A, and in such a way that x never touches the circle A, and the
ends of the curve x are never removed from the tip of your nose. Conversely,
any curve x which can be so continuously deformed must pass through the
circle A a net of one time going away. So if you imagine all possible curves
that start at your nose, pass through the circle A a net of one time going
away from you, and return to your nose, you may consider each of these as
representing the element a of our algebra, since any one can be deformed to
any other such in the manner described above.
Now that you know what the symbols a, b, c, ... of our algebra represent,
it is time to define the operation of multiplying two of them together.
Let A and B be circles in three-dimensional space as shown in Fig. 2.9. The
element a of our algebra can be represented by a curve that starts at p,
passes through the circle A once going away from you (and not passing
through B at all), and finally returns to p. Similarly, the element b can be
represented by a curve that behaves the same way with respect to the circle
B. We define the product ab to be represented by each and any curve that
has the net effect of "first doing a, then doing b." That is, a typical
representative of ab is a curve that starts at p, passes away from you through A,
then passes away from you through B, and finally returns to p. It is permissible
to remove the end of a and the beginning of b from the point p and then
join these ends together close by, as we have shown in Fig. 2.9, in order to
clarify the picture.

Fig. 2.9 A curve representing the "product" ab.

A little experimentation with wire circles and two pieces of string should
convince you that, at least in this example, ab ≠ ba; that is, the curve ab
cannot be deformed to the curve ba without cutting it or cutting the circles
A and B or removing its ends from p.
What sort of curve represents a^{-1}? Why, each curve that passes once
through the circle A toward you. What is the meaning of a^2? This stands for
those curves that have the net effect of passing through the circle A twice
away from you. What about the multiplicative identity 1 of our algebra?
That is represented by any curve that passes through no circle at all. These
phenomena are illustrated in Fig. 2.10.

Fig. 2.10 Some algebraic illustrations.

It is easy to see that 1a = a = a1 for each element a of our algebra,
and that aa^{-1} = 1 = a^{-1}a. We are, of course, interpreting equality as
"has the same effect as," or "can be continuously deformed to," or "passes
through the same curves in the same direction in the same order as." The
associativity, which allows us to write abc for either (ab)c or a(bc), will be
discussed in the exercises. You should now spend some time convincing
yourself that the laws of our simple little algebra in fact hold, when you
interpret the objects a, b, c, ... and the operation of "multiplication" as
we have done.
From our point of view, one of the most useful things about this algebra
is that we can tell when two "objects" or expressions in the algebra are
actually equal by a very simple test: First, perform the only algebraic
simplifications allowable in the algebra:
Replace aa^{-1} (or a^{-1}a) by 1.
Replace 1a (or a1) by a.
Simplify expressions such as aaaa to a^4.
Perform all these simplifications on each expression. If the resulting
expressions are identical, then the original expressions must have been equal.
For example, we may ask whether
aba^2a^{-1}bbcac^{-1}cc and [...]
are equal. Each simplifies to abab^2cac. Since they have identical
simplifications, they are equal.
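The test just described is mechanical enough to sketch in code. The Python fragment below is an illustration only and not part of the original text: it assumes an expression is typed as space-separated symbols with optional whole-number exponents (such as a^2 or a^-1), and the names parse, simplify, and show are hypothetical helpers. It merges adjacent powers of the same symbol and drops any power that becomes zero, which is all that the allowable simplifications do.

```python
def parse(text):
    """Read 'a b a^2 a^-1' as a list of (symbol, exponent) pairs."""
    word = []
    for token in text.split():
        symbol, _, exponent = token.partition("^")
        word.append((symbol, int(exponent) if exponent else 1))
    return word


def simplify(word):
    """Apply the allowable simplifications until none remains."""
    result = []
    for symbol, exponent in word:
        if exponent == 0:
            continue  # a^0 is the identity 1, and 1x = x = x1
        if result and result[-1][0] == symbol:
            # Adjacent powers of one symbol combine: a^m a^n = a^(m+n).
            total = result.pop()[1] + exponent
            if total != 0:
                result.append((symbol, total))
        else:
            result.append((symbol, exponent))
    return result


def show(word):
    """Format a word for printing; the empty word is the identity 1."""
    if not word:
        return "1"
    return " ".join(s if e == 1 else f"{s}^{e}" for s, e in word)


first = parse("a b a^2 a^-1 b b c a c^-1 c c")
print(show(simplify(first)))  # prints: a b a b^2 c a c
```

Two expressions pass the test exactly when simplify returns the same list for both.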
Exercises

2.16 The definition of "group" is given in Section 4.1. In our simple
algebra developed for the study of Brunnian links, we started with a set of
objects a, b, c, ..., each representing a way of sending a simple closed curve
through some fixed circle in three-dimensional space, and a "multiplication"
of these objects, by which the product ab was interpreted to mean the effect
of first doing a, then doing b. Verify informally that this set of objects
together with this multiplication does indeed satisfy the definition of a group.
2.17 We saw that our simple algebra has the following property: Two
strings of symbols are equal if they have identical simplifications. However,
there may be equal expressions without identical simplifications. If the
circles A and B are linked together as shown in Fig. 2.11, then ab = ba, but
this relation cannot be derived from the simplifications listed in the preceding
section. Why is ab = ba in this example?
2.18 Let us study just a little more the algebra associated with the 2-link
shown in Fig. 2.11. Show that every expression in this algebra is equal to one
of the form a^n b^m, where n and m are whole numbers (possibly zero; we
interpret x^0 as the identity, 1). For example, b^{-1}a^2b^3a^5 = a^7b^2. This shows
that each expression in this algebra has a "standard form" that looks like
a^n b^m. Are two expressions in this algebra equal if and only if their standard
forms are identical? Explain your answer.

Fig. 2.11 ab = ba.

2.19 We return in this exercise to the general case of an algebra known only
to satisfy the group axioms (see Exercise 2.16). Show that (xy)^{-1} = y^{-1}x^{-1}.
Your proof should work not only in the case that x and y are single symbols
in the algebra, but also in the case that x and y stand for more complicated
expressions.
2.20 Show that the two expressions
aab^2b^{-1}ca^2b^3b^{-1}a^{-1}b^3
and
[...]
are equal.

2.4 BRUNNIAN 4-LINKS


Although we are going to use a very simple algebra, it will turn out to be a
powerful tool in our examination of Brunnian links. You will now see how
a fairly complex geometric problem can be transformed into a remarkably
simple algebraic one.
Suppose you had the task of drawing four simple closed curves to form a
nonsplittable link each 3-sublink of which is to be completely splittable.
Since each 3-sublink is to be completely splittable, it should be possible to
draw the desired 4-link by first imagining one of the curves not present.
Since the remaining three split completely, they could be drawn as separated
circles much as in Fig. 2.12. The only problem would be to draw the fourth
curve in such a way as to guarantee that the resulting 4-link would be
nonsplittable, but such that all four possible 3-sublinks would be completely
splittable. One-fourth of the latter task has already been completed: Clearly,
if the fourth curve, not yet drawn, is removed, then the remaining three
come apart.

Fig. 2.12 Three fourths of a (4, 3)-Brunnian link.

You will not be convinced of the difficulty of finding a geometric solution
to this problem unless you give it a try, so spend some time trying to draw
that fourth curve. Remember, in order to obtain the desired example, it
must be true that if any one of the curves A, B, or C is removed from your
figure, then the remaining three, two circles and your curve, must come
completely apart. Do not try to draw a circle for the fourth curve; so far as
we know, it cannot be done that way. All you need is any tame simple closed
curve for the fourth curve.

Fig. 2.13 The Borromean rings again.

Let us see if the miniature algebra we have developed will serve to tell
us why the Borromean Rings form a nonsplittable 3-link each 2-sublink of
which is completely splittable. Perhaps this will provide some insight that
will help in the construction of other examples. If you carefully draw apart
the upper two rings of the Borromean example shown in Fig. 2.1, you will
obtain a figure much like the one shown in Fig. 2.13.
We have two separated circles A and B, which is what we might well
want to start with if we wanted to invent the example of the Borromean rings.
The tip of your nose is shown as the point p on the curve C, for we wish to
calculate the algebraic formula of this curve. It first passes away from you
through A, then toward you through B, then again toward you through A,
and finally away from you through B. The algebraic formula we have
learned to associate with such a curve is ab^{-1}a^{-1}b. This simple formula
tells us a great deal about the Borromean rings.
First, since no simplification of the formula ab^{-1}a^{-1}b is possible, it does
not reduce to 1. (This does not contradict Exercise 2.17, since the curves
A and B are not linked.) Hence the curve representing this formula actually
links the union of the curves A and B and cannot be removed without cutting.
Second, observe the effect of cutting and removing the circle A. The
effect on the formula ab^{-1}a^{-1}b is both simple and striking. When the circle
A is taken away, the effect on the formula is to delete all occurrences of the
symbol a (and a^{-1} as well). The formula becomes b^{-1}b, which simplifies to 1.
This shows that if A is removed, then B and C are completely splittable.
Similarly, if B is removed, then we obtain aa^{-1} = 1, and so A and C are
completely splittable. By our construction, since we drew A and B already
separated, A and B split completely if C is removed. This shows that every
2-sublink of the Borromean rings is completely splittable.
If the 3-link itself were splittable it would then be completely splittable
by the above discussion. But this is not the case since ab^{-1}a^{-1}b ≠ 1.
Hence the 3-link is nonsplittable. This demonstrates that the Borromean
rings actually do provide an example of a nonsplittable 3-link each 2-sublink
of which is completely splittable.
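The bookkeeping in this argument can be checked mechanically. The Python sketch below is an illustration under stated assumptions, not a proof: a formula is stored as a list of (symbol, exponent) pairs, and simplify and delete are hypothetical helper names. It confirms only the algebraic facts used above, namely that ab^{-1}a^{-1}b does not simplify to 1 but does so once every a, or every b, is deleted. The step from these facts to statements about the rings remains the geometric argument of the text.

```python
def simplify(word):
    """Cancel and merge adjacent powers of the same symbol."""
    result = []
    for symbol, exponent in word:
        if exponent == 0:
            continue
        if result and result[-1][0] == symbol:
            total = result.pop()[1] + exponent
            if total != 0:
                result.append((symbol, total))
        else:
            result.append((symbol, exponent))
    return result


def delete(word, symbol):
    """Drop every occurrence of a symbol, as when its circle is cut away."""
    return [(s, e) for s, e in word if s != symbol]


# The formula a b^-1 a^-1 b read off the curve C in Fig. 2.13.
borromean = [("a", 1), ("b", -1), ("a", -1), ("b", 1)]

# The empty list stands for the identity 1.
print(simplify(borromean) != [])               # the formula itself does not collapse
print(simplify(delete(borromean, "a")) == [])  # with A removed it collapses to 1
print(simplify(delete(borromean, "b")) == [])  # with B removed it collapses to 1
```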
The one item of crucial importance about the formula ab^{-1}a^{-1}b, the
one phenomenon that makes the example work, is this: The deletion of all
occurrences of any one symbol in the formula causes the formula to collapse
to 1. What this means is that if any one circle is removed from the example
of the Borromean rings, then the remaining two come apart. To apply
similar algebraic methods to the construction of a nonsplittable 4-link each
3-sublink of which is completely splittable, let us return to Fig. 2.12 and
simply ask what sort of formula the fourth curve we must draw should have.
First, the formula should involve all three of the symbols a, b, and c,
so that the four curves together will form a nonsplittable link.
Second, the formula must not collapse to 1. This too is needed to ensure
that the 4-link is nonsplittable.
Finally, the formula must have the property that the deletion of all
occurrences of any one symbol (either a, b, or c) will cause the formula to
collapse to 1. This will, as in the example of the Borromean rings discussed
above, guarantee that each 3-sublink is completely splittable.

Exercise

2.21 Can you write down a formula involving all three symbols a, b, and c,
such that the formula does not collapse to 1, but the deletion of all a's, or
all b's, or all c's, does cause the resulting formula to collapse to 1? Or,
alternatively, can you prove this impossible, thus demonstrating there is no
analogy to the Borromean rings with four simple closed curves?
We are sure you will be able to discover the answer for yourself, and in
order for you to enjoy fully the thrill of mathematical discovery, we ask you
not to read beyond this paragraph until you have worked on this problem
for a while. Go back and look at the formula associated with the third curve
in the Borromean rings; it will probably help.

2.5 NOTATION AND EXAMPLES


Let us introduce at this point a small bit of notation that will serve as a useful
abbreviation.
By an (n, k)-Brunnian link we mean a link of n simple closed curves in
three-dimensional space such that each k-sublink is completely splittable, but
each (k + 1)-sublink, and each (k + 2)-sublink, and so on up to the n-link
itself, is nonsplittable. Thus an (8, 5)-Brunnian link would be a collection of
eight simple closed curves such that the set of all eight, or any seven, or any
six, would be nonsplittable, but any 5-sublink would be completely splittable.
Clearly, if each 5-sublink is completely splittable, then so is any 4-sublink,
any 3-sublink, and any 2-sublink. Of course, it makes sense to talk about an
(n, k)-Brunnian link only if 1 ≤ k < n; if k = 1 the definition still makes
sense: Fig. 2.2 shows a (3, 1)-Brunnian link. And, of course, the Borromean
rings form a (3, 2)-Brunnian link.
We have previously asked you to construct, first geometrically and then
algebraically, an example of a (4, 3)-Brunnian link. If you have succeeded
in doing this, we invite you to try a much harder problem: the construction
of a (4, 2)-Brunnian link. Here is the hint or observation that may simplify
the task: If a (4, 2)-Brunnian link exists, then the removal of any one curve
from this link should produce a (3, 2)-Brunnian link. Thus one should per-
haps begin by drawing a (3, 2)-Brunnian link and then attempt to find where
to put the fourth curve so as to produce a (4, 2)-Brunnian link. Make use of
the algebra we developed specifically to solve problems of this sort.
Let us return to the problem of producing a (4, 3)-Brunnian link, starting
with the three circles A, B, and C as shown in Fig. 2.12. As we hope you
have discovered, there does indeed exist a formula using the symbols a, b, and
c such that the deletion of all occurrences of any one of these symbols causes
the resulting formula to collapse to 1. One such formula is
aba^{-1}b^{-1}cbab^{-1}a^{-1}c^{-1}.

The presence of the "subformula" aba^{-1}b^{-1}, so like the one associated
with the Borromean rings, is no coincidence. As soon as this is noted, the
pattern becomes clear for construction of even larger links, such as the
(5, 4)-Brunnian link.
To construct a (5, 4)-Brunnian link, first draw four separated circles
A, B, C, and D. Find a formula, such as
aba^{-1}b^{-1}cbab^{-1}a^{-1}c^{-1}dcaba^{-1}b^{-1}c^{-1}bab^{-1}a^{-1}d^{-1},

with the property that all four symbols a, b, c, and d appear in the formula,
and such that the deletion of all occurrences of any one of these symbols
causes the resulting formula to collapse to 1. We have not chosen the shortest
such formula above for the (5, 4)-Brunnian link, but it is one which continues
the pattern already established. If we then draw in a fifth curve with this
formula, we can argue (as we did in the case of the Borromean rings) that
what is produced is indeed a (5, 4)-Brunnian link. You will also have seen
that the number of curves in these constructions is not material, except that
large numbers of such curves produce rather long formulas. But since the
formulas have the desired properties it follows that for each whole number
n > 2, an (n, n - 1)-Brunnian link exists. Thus it is possible to construct,
say, one hundred simple closed curves in three-dimensional space forming a
nonsplittable link, but such that the removal of any one of these curves causes
the remaining ninety-nine to split completely.

Exercises
2.22 Draw a (2, 1)-Brunnian link.
2.23 Draw a (4, 3)-Brunnian link.
2.24 Draw a (5, 4)-Brunnian link.
2.25 Find a shorter formula than that given in Section 2.5 for the fourth
curve in a (4, 3)-Brunnian link.
2.26 Draw a (3, 2)-Brunnian link essentially different from the example of
the Borromean rings shown in Fig. 2.1.
2.27 Show that an (n, 1)-Brunnian link can be constructed for each whole
number n > 2. Compare this with Exercise 2.13.
2.28 If you have not already done so, construct an example of a (4, 2)-
Brunnian link.
2.29 For appropriate choice of x and y, the formula for the third curve in a
(3, 2)-Brunnian link was seen to have the form xyx^{-1}y^{-1}. Is this also true
of the formula given in Section 2.5 for the fourth curve in a (4, 3)-Brunnian
link? See also Exercise 2.19.
2.30 Construct four simple closed curves in three-dimensional space, say
A, B, C, and D, with the following rather unsymmetrical linking property:
The removal of A causes the remaining three to split completely, but of B, C,
and D, exactly two must be removed to cause A and the third to split, and if
only one of B, C, or D is removed, then the remaining three form a non-
splittable link.

2.6 COMMUTATORS
The frequent repetition of a mathematical concept justifies calling attention
to that concept and supplying it with a name. You have undoubtedly
noticed by now the frequent occurrence of a string of symbols of the form
xyx^{-1}y^{-1}. In addition, you may have noticed that in our simple algebra,
if we wish to construct the inverse of a string of symbols such as ab^2a^{-1}cab^{-3},
the proper way to do it is to write the string of symbols in the reverse order,
changing the sign of each exponent. (The exponent of a is 1.) Thus, in the
example just mentioned, the inverse would be b^3a^{-1}c^{-1}ab^{-2}a^{-1}, for the
product of the two will clearly produce 1. In general, we may state as a
useful law of exponents in our algebra that
(xy)^{-1} = y^{-1}x^{-1}

even if one or the other or both of x and y are strings of symbols rather than
a single symbol. Let us abbreviate by (x, y) the string of symbols xyx^{-1}y^{-1}.
This object is a victim of frequent study in modern mathematics, and is called
the commutator of x and y.
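The rule for inverting a string of symbols can be tried out in a few lines of Python. This is a minimal sketch, assuming a string of symbols is stored as a list of (symbol, exponent) pairs; the names simplify and inverse are hypothetical helpers introduced only for illustration.

```python
def simplify(word):
    """Cancel and merge adjacent powers of the same symbol."""
    result = []
    for symbol, exponent in word:
        if exponent == 0:
            continue
        if result and result[-1][0] == symbol:
            total = result.pop()[1] + exponent
            if total != 0:
                result.append((symbol, total))
        else:
            result.append((symbol, exponent))
    return result


def inverse(word):
    """Write the symbols in reverse order, changing the sign of each exponent."""
    return [(symbol, -exponent) for symbol, exponent in reversed(word)]


# The string a b^2 a^-1 c a b^-3 used as the example in the text.
word = [("a", 1), ("b", 2), ("a", -1), ("c", 1), ("a", 1), ("b", -3)]

print(inverse(word))                   # the pairs of b^3 a^-1 c^-1 a b^-2 a^-1
print(simplify(word + inverse(word)))  # prints: [] (the empty list stands for 1)
```

Because Python's + on lists is concatenation, word + inverse(word) plays the role of the product of the string with its inverse.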
Suppose we are given three symbols a, b, and c, and form first the com-
mutator of a and b, and then form the commutator of that string of symbols
with the element c. Symbolically, we would express this as ((a, b), c), which
expands to the expression
aba^{-1}b^{-1}cbab^{-1}a^{-1}c^{-1}.
It is no coincidence that this is the formula by which we earlier produced a
(4, 3)-Brunnian link, and indeed if you expand the multiple commutator
(((a, b), c), d), you will obtain the formula for the fifth curve in a
(5, 4)-Brunnian link. The reason that commutators produce this effect is this:
Any of the multiple commutators we have been writing down has the property
that if all occurrences of any one symbol are deleted, then the entire
commutator collapses to 1. This is precisely the property needed to produce
(n, n - 1)-Brunnian links.
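The collapsing property of multiple commutators can be checked by machine for small cases. The Python sketch below is an illustration under stated assumptions: strings of symbols are lists of (symbol, exponent) pairs, and simplify, inverse, commutator, delete, and nested_commutator are hypothetical helper names. It verifies only the algebraic property; the geometric conclusion about links still rests on the argument in the text.

```python
from functools import reduce


def simplify(word):
    """Cancel and merge adjacent powers of the same symbol."""
    result = []
    for symbol, exponent in word:
        if exponent == 0:
            continue
        if result and result[-1][0] == symbol:
            total = result.pop()[1] + exponent
            if total != 0:
                result.append((symbol, total))
        else:
            result.append((symbol, exponent))
    return result


def inverse(word):
    """Reverse the order and change the sign of each exponent."""
    return [(symbol, -exponent) for symbol, exponent in reversed(word)]


def commutator(x, y):
    """The commutator (x, y) = x y x^-1 y^-1, in simplified form."""
    return simplify(x + y + inverse(x) + inverse(y))


def delete(word, symbol):
    """Drop every occurrence of a symbol from a string."""
    return [(s, e) for s, e in word if s != symbol]


def nested_commutator(symbols):
    """Build (...((s1, s2), s3)..., sn) from two or more distinct symbols."""
    return reduce(commutator, [[(s, 1)] for s in symbols])


for symbols in ("ab", "abc", "abcd", "abcde"):
    formula = nested_commutator(symbols)
    collapses = all(simplify(delete(formula, s)) == [] for s in symbols)
    # Columns: symbols used, number of factors, formula is not 1, deletions collapse.
    print(symbols, len(formula), formula != [], collapses)
```

The number of factors roughly doubles with each added symbol, which matches the remark that large numbers of curves produce rather long formulas.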
Let us turn our attention to the problem of producing a (4, 2)-Brunnian
link. As we have already observed, a reasonable starting point would be
to draw a (3, 2)-Brunnian link; that is, the Borromean rings. We do this
because if any one curve is deleted from a (4, 2)-Brunnian link, what is left is
a (3, 2)-Brunnian link. So we draw the three, and puzzle over exactly how to
draw the fourth curve. In terms of our algebra, if the three we draw are
called A, B, and C, what is required is a formula using all the symbols a, b,
and c, such that the deletion of all occurrences of any one of these symbols
causes the formula to collapse, not to 1, but to the formula for a
(3, 2)-Brunnian link. In other words, can we use the symbols a, b, and c to write
a formula in which the deletion of all occurrences of any one symbol does not
cause the formula to collapse to 1, but one in which the deletion of all
occurrences of any two symbols does cause the formula to collapse to 1?
If we could find such a formula, and draw in a fourth curve representing this
formula, we would have a (4, 2)-Brunnian link as desired.
Here is where the concept of the commutator of two symbols proves
useful. If the formula involves the commutator (a, b), then at least this much
will remain if all occurrences of c in the formula are deleted, and some care is
taken in the construction of the formula. If in addition this were all that
remained, then after removing curve C, we would have exactly what we want:
a (3, 2)-Brunnian link. So let us first write down the commutator of a and b,
and be sure that c is involved in any other commutators we write down so
as to guarantee the disappearance of all these other commutators if all
occurrences of c are deleted. At this point you may have already guessed
what will happen next. About the only other sorts of things we can construct
that are commutators involving the symbol c are the commutators (a, c)
and (b, c). But then we notice that the product of all three has the desired
property
(a, b)(a, c)(b, c) = aba^{-1}b^{-1}aca^{-1}c^{-1}bcb^{-1}c^{-1},
which will collapse to one of these three commutators if all occurrences of any
one symbol are deleted. Whether we remove the curve A, or B, or C,
what is left will be a (3, 2)-Brunnian link. Hence drawing a fourth curve
representing the above formula will produce a (4, 2)-Brunnian link.
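The two deletion properties claimed for this product can be confirmed with a short Python sketch. As before, this is only an illustration under stated assumptions: strings of symbols are lists of (symbol, exponent) pairs, and simplify, inverse, commutator, and delete are hypothetical helper names. The check is purely algebraic and says nothing by itself about the drawn curves.

```python
from itertools import combinations


def simplify(word):
    """Cancel and merge adjacent powers of the same symbol."""
    result = []
    for symbol, exponent in word:
        if exponent == 0:
            continue
        if result and result[-1][0] == symbol:
            total = result.pop()[1] + exponent
            if total != 0:
                result.append((symbol, total))
        else:
            result.append((symbol, exponent))
    return result


def inverse(word):
    """Reverse the order and change the sign of each exponent."""
    return [(symbol, -exponent) for symbol, exponent in reversed(word)]


def commutator(x, y):
    """The commutator (x, y) = x y x^-1 y^-1, in simplified form."""
    return simplify(x + y + inverse(x) + inverse(y))


def delete(word, symbol):
    """Drop every occurrence of a symbol from a string."""
    return [(s, e) for s, e in word if s != symbol]


a, b, c = [("a", 1)], [("b", 1)], [("c", 1)]
formula = simplify(commutator(a, b) + commutator(a, c) + commutator(b, c))

# Deleting one symbol leaves the commutator of the other two.
print(simplify(delete(formula, "c")) == commutator(a, b))
print(simplify(delete(formula, "b")) == commutator(a, c))
print(simplify(delete(formula, "a")) == commutator(b, c))

# Deleting any two symbols collapses the formula to 1 (the empty list).
print(all(simplify(delete(delete(formula, s), t)) == []
          for s, t in combinations("abc", 2)))
```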

Exercises

2.31 Show that the inverse of (x, y) is (y, x).
2.32 Prove that (x, y) = 1 if and only if xy = yx.
2.33 Show that the deletion of any one symbol in all its occurrences from the
multiple commutator (((a, b), c), d) causes the resulting expression to
collapse to 1.
2.34 Use the ideas of this section to construct a (5, 2)-Brunnian link. Note:
you should start by drawing a (4, 2)-Brunnian link.
2.35 Use the methods of this section and your experience with Exercise
2.34 to show that an (n, 2)-Brunnian link can be constructed for any whole
number n > 3.

2.7 THE FINAL GENERALIZATION


At this point we have exhausted all possibilities for 4-links. As we have
shown, it is not too hard to produce a (4, 3)-Brunnian link; it is easy to
produce a (4, 1)-Brunnian link; and we finally succeeded in the last section
in producing a (4, 2)-Brunnian link. It would be reasonable at this stage to
guess that for any whole numbers n and k for which the notation makes
sense, an (n, k)-Brunnian link can be constructed. But let us first examine the
next case still open, that of a (5, 3)-Brunnian link.
As before, let us start with all but one of the curves already drawn. If
we imagine a (5, 3)-Brunnian link with one of the curves removed, what we
must have is a (4, 3)-Brunnian link. However, we know that a (4, 3)-
Brunnian link can be constructed, so let us draw one. Let the four curves
involved be called A, B, C, and D. Then, as before, the problem is to draw
a fifth curve E with certain properties; but this amounts only to finding a
formula for E, a formula involving all the symbols a, b, c, and d, such that the
deletion of all occurrences of any one of these symbols from the formula
produces the formula for a (4, 3)-Brunnian link.

Fig. 2.14
A (4, 3)-Brunnian link.

The formula for producing a (4, 3)-Brunnian link is itself a commutator,
but a slightly more complicated commutator than we used in the (n, 2)-
Brunnian examples (see Exercise 2.35). Recall that if separated circles A,
B, and C are drawn, then drawing a fourth simple closed curve D according
to the formula ((a, b), c) will produce a (4, 3)-Brunnian link, as in Fig. 2.14.
To produce a (5, 3)-Brunnian link, we must provide a formula using
the four symbols a, b, c, and d, so that, if possible, the deletion of all
occurrences of any one of these symbols produces a commutator in three
of these symbols, such as our original commutator ((a, b), c). For example,
if a were deleted, it would be sufficient to have the formula collapse to
((b, c), d); if b were deleted, we would like to have the formula collapse to
((a, c), d). But observe that even these more complicated commutators,
such as ((a, b), c), do collapse to 1 upon the deletion of any symbol present
in them. Therefore, as we did for the construction of the (n, 2)-Brunnian
links, we write the product of all possible commutators involving three of the
symbols a, b, c, and d, and obtain
((a, b), c)((a, b), d)((a, c), d)((b, c), d).
If all occurrences of any one symbol, such as c, are deleted, all the commutators
involving that symbol collapse to 1. What we have left is the single
commutator ((a, b), d), not involving the symbol c. But this shows that if we
draw the fifth curve with the above formula, we have indeed produced a
(5, 3)-Brunnian link.
We need but a single definition here, for the purpose of making our ideas
easier to express. If a, b, c, ... , are symbols in our algebra, and k is a whole
number at least 2, then a k-commutator of these symbols is simply one of the
multiple commutators we have been considering, one that involves exactly k
distinct symbols drawn from our algebra. For instance, using the above
symbols, one example of a 4-commutator is (((a, b), c), d). It is important
to note that a k-commutator has the property that if all occurrences of any
one symbol are deleted from it, then the formula that remains will collapse
to 1.
Let us see how the concept of a k-commutator will serve us in guessing
the general procedure necessary to produce an (n, k)-Brunnian link for any
meaningful choice of n and k. Suppose we first wished to draw a (3, 2)-
Brunnian link. We draw two separated circles, label them A and B, and
consider the associated symbols a and b in our algebra. Since we wish to
form a (something, 2)-Brunnian link, we write down all possible 2-commutators
using the symbols a and b. There is no distinction between $aba^{-1}b^{-1}$
and (for example) $b^{-1}aba^{-1}$ for our purposes. All we need to do is to write
down only one such, say $aba^{-1}b^{-1}$. We then draw in a third curve with this
formula and obtain a (3, 2)-Brunnian link.
Similarly, to produce a (4, 2)-Brunnian link, we first produce a (3, 2)-
Brunnian link, label the three curves A, B, and C, and consider the three
associated symbols a, b, and c in our algebra. We form all possible 2-
commutators using the symbols a, b, and c, and write their product
(a, b)(a, c)(b, c).
If a fourth curve is drawn according to this formula, we obtain a (4, 2)-
Brunnian link.
To produce a (4, 3)-Brunnian link, we draw three separated circles,
call them A, B, and C, and form the product of all possible 3-commutators
on the three symbols a, b, and c. There is only one, ((a, b), c). If a fourth
curve is drawn with this formula, then a (4, 3)-Brunnian link is formed.
To produce a (5, 3)-Brunnian link, we first construct a (4, 3)-Brunnian
link, call the four curves of this link A, B, C, and D, and then write the
product of all possible 3-commutators using the four symbols a, b, c, and d.
When a curve is drawn with this forty-symbol formula, a (5, 3)-Brunnian
link is produced.
The following is the general procedure, and you may verify for yourself
that it works. Given an (n, k)-Brunnian link, here is how to produce an
(n + 1, k)-Brunnian link. Label the n simple closed curves of the (n, k)-
Brunnian link $C_1, C_2, C_3, \ldots, C_n$. Let the algebraic symbols associated
with these curves be called $c_1, c_2, c_3, \ldots, c_n$. Form the product of all possible
k-commutators using these n symbols. Draw a curve with this formula. The
resulting link will be an (n + 1, k)-Brunnian link.
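The algebraic half of this procedure can be sketched in a few lines. What follows is only an illustration under stated assumptions: formulas are lists of (symbol, exponent) pairs, each k-commutator is nested to the left as in ((a, b), c), and the function names (`k_commutator`, `product_of_k_commutators`, and the rest) are hypothetical. The sketch checks that the formula for the fifth curve of a (5, 3)-Brunnian link has forty symbols, and that deleting one symbol leaves the product of the k-commutators on the remaining symbols.

```python
from itertools import combinations

def inverse(word):
    """Reverse the formula and negate every exponent."""
    return [(s, -e) for (s, e) in reversed(word)]

def commutator(x, y):
    """The commutator (x, y) = x y x^-1 y^-1."""
    return x + y + inverse(x) + inverse(y)

def collapse(word):
    """Cancel adjacent pairs such as a a^-1 until none remain."""
    out = []
    for s, e in word:
        if out and out[-1] == (s, -e):
            out.pop()
        else:
            out.append((s, e))
    return out

def delete(word, symbol):
    """Delete all occurrences of one symbol, then collapse."""
    return collapse([(s, e) for (s, e) in word if s != symbol])

def k_commutator(symbols):
    """Left-nested commutator (...((s1, s2), s3)..., sk) on the given symbols."""
    word = [(symbols[0], 1)]
    for s in symbols[1:]:
        word = commutator(word, [(s, 1)])
    return word

def product_of_k_commutators(symbols, k):
    """Product of one k-commutator for each k-element choice of symbols."""
    word = []
    for subset in combinations(symbols, k):
        word += k_commutator(subset)
    return word

# Formula for the fifth curve of a (5, 3)-Brunnian link.
formula = product_of_k_commutators("abcd", 3)
print(len(formula))                                  # number of symbols written
print(delete(formula, "c") == k_commutator("abd"))   # only ((a, b), d) survives

# The same check one stage further along: six curves from five.
bigger = product_of_k_commutators("abcde", 3)
print(delete(bigger, "c") == collapse(product_of_k_commutators("abde", 3)))
```

The choice of `combinations` keeps the subsets in the same order before and after a deletion, which is why the two sides of the last comparison agree symbol for symbol.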
For example, to produce a (7, 4)-Brunnian link, first construct a (5, 4)-
Brunnian link. Label the five curves A, B, C, D, and E. Form the product of
all possible 4-commutators using the five symbols a, b, c, d, and e. (Of the
five possible, one such 4-commutator would be (((a, b), c), d).) Draw a
sixth curve with this formula; label it F. This will give you a (6, 4)-Brunnian
link. Form all possible 4-commutators of the six symbols a, b, c, d, e, and f,
and multiply these commutators together. Draw a seventh curve with this
formula. You will then have a (7, 4)-Brunnian link, as desired.
In general, the method for constructing an (n, k)-Brunnian link is this:
Start with a (k + 1, k)-Brunnian link. By using k-commutators at each
stage, form next a (k + 2, k)-Brunnian link, form from that a (k + 3, k)-
Brunnian link, and continue the process until the desired (n, k)-Brunnian
link is obtained. This establishes the only theorem of this chapter, which in
conclusion we state below.
Theorem 2.1 For any whole numbers n and k for which the notation makes
sense ($1 \le k < n$), there does exist an (n, k)-Brunnian link.

Exercises

2.36 Let S be an expression in our algebra; that is, S is a string of symbols
with various exponents all multiplied together. Let a be an element of the
algebra. Show that even though $aSa^{-1}$ need not be the same as S, the two do
in some sense represent the same curve.
2.37 Find a shorter formula than that given in Section 2.5 for a (5, 4)-
Brunnian link.
2.38 Prove that the deletion of all occurrences of a single symbol from a k-
commutator causes the resulting formula to collapse to 1. (You may use
proof by induction on the integer k; see Exercise 1.22.)
2.39 In what sense is our proof of Theorem 2.1 a proof by induction?
2.40 Sometimes it is difficult or impossible to draw certain examples of
Brunnian links without having some of the curves pass over or under themselves.
This may cause such curves to be knotted. Does this matter? Can
it be prevented? How?
2.41 At certain points in the construction of Brunnian links one must write
all possible essentially distinct k-commutators using n symbols, where
2 < k < n. This number of different k-commutators is the same as the
number of different k-element subsets of a set with n elements. For sets, we
may even allow k to take on the values 0 and 1, and as an example, let us
consider a set with four elements. The number of subsets with 0, 1, 2, 3, and 4
elements is, respectively, 1, 4, 6, 4, and 1. In what other context have you seen
this sequence? Can you obtain a general formula for calculating the number
of k-element subsets of a set with n elements? See Exercise 1.22 for one
method of attack.
2.42 Make up another problem like Exercise 2.30, but harder.
2.43 Make a Mobius strip by cutting a fairly long and narrow strip of
paper, and then attaching the ends with glue or tape after giving the strip
a half-twist. Show that the strip has only one side and only one edge.
2.44 Continuing the previous exercise, start drawing a line on the Mobius
strip, starting at a point one-third of the way from one edge and continuing
around the strip, always staying one-third of the way from that edge, until
you reach the starting point. Predict what will happen if the Mobius strip is
now cut in two down its center line. How many edges will the new object or
objects have? How many strips will result? Verify your predictions by the
experimental method.
2.45 Continuing the previous two exercises, what do you think will happen
if a Mobius strip is cut in two by keeping the scissors always one third of the
way from one edge, until the starting point is reached?
2.46 Repeat the previous three exercises for a strip with two half-twists.
2.47 Show that a strip with three half-twists has but one edge and one side,
and that the edge is knotted.
2.48 Since a Mobius strip has but one edge, and that edge is a simple closed
curve, one could imagine sewing two such strips together along the edges.
One would obtain a surface with no boundary edge-like a sphere, but also
very unlike a sphere in one way. How would the resulting surface differ
from a sphere?
2.49 Can a surface have one side and two edges? Two sides and one edge?
What are the impossible combinations?
2.50 A currently unsolved problem in mathematics is to find the answer to the
following question: Does every simple closed curve in the plane contain the
vertices of a square? One method of attack might be to show that every
simple closed curve in the plane contains the vertices of a parallelogram;
another approach might be to restrict the sorts of simple closed curves under
consideration. For an example of the latter approach, can you show that
every triangular simple closed curve contains the vertices of a square? That
is, given a triangle, is it always possible to construct a square whose vertices
lie on the boundary of the triangle?

NOTES AND REFERENCES


The first examples of (n, n - 1)-Brunnian links were given by H. Brunn in
his paper "Über Verkettung", published in Sitzungsberichte der Bayerischen
Akademie der Wissenschaften, Mathematisch-Physikalische Klasse, Vol. 22
(1892), pages 77-99. Hans Debrunner published in Vol. 28 (1961) of the
Duke Mathematical Journal, pages 17-23, a paper entitled "Links of Brunnian
Type," in which he showed that Brunn's examples had the properties claimed
for them by Brunn, and further generalized these examples to show the
existence of (n, k)-Brunnian links for arbitrary n and k, 1 ≤ k < n. D. E.
Penney's paper "Generalized Brunnian Links," published in Vol. 36 (1969)
of the Duke Mathematical Journal, provided an alternative construction of
(n, k)-Brunnian links.
This chapter is in essence providing sufficient material on homotopy
theory for the elementary study of links. Additional material for the advanced
student may be found, for example, in I. M. Singer and John A. Thorpe's
Lecture Notes on Elementary Topology and Geometry, published by Scott-
Foresman in 1967. Other material on links and knots is found in Crowell
and Fox's Introduction to Knot Theory, published in 1963 by Ginn and
Company, and in Topology of 3-Manifolds (edited by M. K. Fort, Jr.),
published by Prentice-Hall in 1962.
The existence of wild knots, such as the one shown in Fig. 2.3, is a
consequence of the thesis of L. Antoine, published in 1921. The study of
knots, links, and wild curves and surfaces in general, has become in recent
years an exciting and rapidly advancing branch of modern mathematics.
In particular, although the structure of surfaces is now well understood, the
problem of describing high-dimensional figures is much more difficult, and
to date only very partial answers to most questions have been obtained.
CHAPTER 3

THE
WELL-TEMPERED
CLAVICHORD

As you probably know, the "octave" on a piano has this name because-
counting both ends-it contains eight keys; these and the black keys between
some of them are named as shown in Fig. 3.1. If we count black keys as
well, and only the essentially different notes, there are actually twelve different
keys in an octave, and most music of the Western Hemisphere is written
using these and only these twelve different notes. For this reason, such music
is referred to as twelve-tone music; but you are probably aware that music
can be written in other systems.
We shall try to answer three questions in this chapter. First, why are
there twelve different notes in an octave on the piano? Second, what was
the proposed modification in the twelve-tone system used in the time of
Johann Sebastian Bach, a modification that he sought to call attention to by
the publication of his Wohltemperiertes Klavier? Finally, are further
improvements in the scale system we now use possible, and if so, how can such
improvements be effected?
Curiously enough, some information pertaining to the answers to all
three of these questions is supplied by the methods of continued fractions,
not an especially new branch of mathematics, but one which has begun to
enjoy valuable applications in fields where accurate approximations to
irrational numbers are needed, such as in the use of high-speed computers.
Moreover, the use of continued fractions to help answer the above questions
is particularly simple, and the only prerequisite for this particular applica-
tion is a degree of familiarity with logarithms.

Fig. 3.1
The names of the notes on the piano. White keys: C D E F G A B C' D' E' F'. Black keys: C# D# F# G# A# C#' D#' F#'.

Partly to remind you of the properties of logarithms, and partly so that
the necessary properties will be listed in this book for your convenience, we
shall first take up some topics in logarithms, leaving some proofs for the
exercises.

3.1 PROPERTIES OF LOGARITHMS


First, if b is any positive number other than 1, and a is any positive number
whatsoever, the equation

$$b^x = a$$

always has a unique solution x, which may be positive, negative, or zero.
This number x is called the logarithm of a to the base b, and we write

$$x = \log_b a.$$

Thus, $\log_b a$ is the power to which the base b must be raised in order to
obtain the number a. For example,

$$\log_{10} 1000 = 3 \quad\text{and}\quad \log_2 32 = 5$$

because

$$10^3 = 1000 \quad\text{and}\quad 2^5 = 32.$$

Fig. 3.2
The graph of $y = \log_{10} x$.

The graph of $y = \log_{10} x$ is shown in Fig. 3.2. Note that the graph
of $y = \log_{10} x$ passes over only the positive x-axis; this indicates that
only positive numbers have logarithms. Also, the graph passes through the
point (1, 0), since $10^0 = 1$; the graph of $y = \log_b x$ will have this property
for any acceptable value of b (namely, b positive but not equal to 1). Also,
the graph of $y = \log_{10} x$ is always increasing; that is,

$$\text{if } 0 < x < y, \text{ then } \log_{10} x < \log_{10} y.$$

This is also the case for any other value of b so long as b exceeds 1.
One useful application of logarithms is in quickly finding approximate
answers to problems involving the products and quotients of many, or
large, numbers; another especially useful application is found in obtaining
good approximations to such numbers as $(17)^{35.61}$. Because of this application,
the base b = 10 is used for the tables of logarithms so frequently
found in the backs of textbooks in algebra, trigonometry, and engineering
mathematics. By choice of base 10, it becomes especially easy to locate the
decimal point in the final answer. However, the laws of logarithms which
make such shortcut computations possible are independent of the choice of
base, and theoretically any other positive number (except 1) will do as well.
The answers obtained by such computations in any case will usually be only
approximations, because although you may see from a table of logarithms
that $\log_{10} 2$ is given as 0.30103, actually this number has been rounded off
to fit in the table; $\log_{10} 2$ is in fact an infinite nonrepeating decimal. Hence
$10^{0.30103}$ is not equal to 2, but will be very close to 2.
The mathematical properties of logarithms which make such computations
possible follow from the ordinary laws of exponentiation for real
numbers, given below. For values for which the following expressions are
defined, we have always:

$$a^x a^y = a^{x+y}, \qquad (a^x)^y = a^{xy}, \qquad a^{-x} = \frac{1}{a^x}, \qquad a^0 = 1 \ (\text{for } a \neq 0).$$

From these laws may be derived the various laws of logarithms. We need
only two; the others are given in the exercises. Here are the two:
Let b be a positive number other than 1. Then, for all positive numbers
x and y,
$$\log_b xy = (\log_b x) + (\log_b y).$$

Also, for every positive number x and every number y,

$$\log_b x^y = y \log_b x.$$

It happens that these are exactly the two properties which also make the
approximate calculations mentioned earlier possible. For example, suppose
we wish to find, using logarithms, the approximate value of the product of
82,425 and 46,037. Using a table of logarithms to the base 10, we find that

$$\log_{10} 82{,}425 = 4.91606 \quad\text{and}\quad \log_{10} 46{,}037 = 4.66311.$$


Thus

$$\begin{aligned}
\log_{10} (82{,}425)(46{,}037) &= \log_{10} 82{,}425 + \log_{10} 46{,}037 \\
&= 4.91606 + 4.66311 \\
&= 9.57917.
\end{aligned}$$
Using our table of logarithms in reverse to find what number has its
logarithm equal to 9.57917, we find that number to be 3,794,600,000 (the
five zeroes at the end of this number mean that our table is not sufficiently
accurate for us to be sure of the last five digits). The correct answer to the
problem is 3,794,599,725. This problem takes almost as long to do with
logarithms as actually multiplying the original two numbers together, but in a
problem involving several products and quotients, logarithms can immensely
shorten the time needed for the computation. The price paid is, of course,
the sacrifice in accuracy.
For another example, suppose we wish to find an approximate value for
$2^{50}$. This would take quite a while to multiply out by hand, but we find the
logarithm of 2 in our table to be 0.30103, and thus

$$\log_{10} 2^{50} = 50 \log_{10} 2 = 50 \cdot 0.30103 = 15.05150.$$

Again using the logarithm table in reverse, we find that the number whose
logarithm is 15.05150 is 1,125,900,000,000,000 (where, again, the eleven
zeroes indicate that only the first five digits are reliable). Here the correct
answer is 1,125,899,906,842,624.
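The two table computations above can be imitated on a computer. This is a rough sketch, assuming that a five-place table behaves like rounding the logarithm to five decimals; the helper name `table_log` is hypothetical, and `math.log10` stands in for the printed table.

```python
import math

def table_log(x):
    """Imitate a five-place table of logarithms to the base 10."""
    return round(math.log10(x), 5)

# The product of 82,425 and 46,037 by adding logarithms.
log_of_product = table_log(82425) + table_log(46037)
approx_product = 10 ** log_of_product     # the table "used in reverse"
exact_product = 82425 * 46037
print(approx_product, exact_product)

# The power 2^50 by multiplying a logarithm.
log_of_power = 50 * table_log(2)
approx_power = 10 ** log_of_power
exact_power = 2 ** 50
print(approx_power, exact_power)
```

In both cases the approximate and exact values printed should agree in their leading digits only, which is the sacrifice in accuracy mentioned in the text.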

Exercises

3.1 Why can we not use the positive number 1 as a base for taking
logarithms?
3.2 Sketch the graph of $y = \log_2 x$.
3.3 Sketch the graph of $y = \log_{1/2} x$.
3.4 Show that if 0 < b < 1, then the graph of $y = \log_b x$ is always decreasing.
You may assume that the graph of $y = \log_b x$ is increasing if
1 < b.
3.5 Show that if 0 < b and b ≠ 1, then the graph of $y = \log_b x$ passes
through the point (1, 0).
3.6 Use logarithms to the base 2 to compute the value of 4 · 8.
3.7 Use logarithms to the base 2 to compute the value of $4^3$.
3.8 Use the laws of exponents stated in this section to prove that
$\log_b xy = \log_b x + \log_b y$.
3.9 Use the laws of exponents stated in this section to prove that
$\log_b x^y = y \log_b x$.
3.10 We defined $x = \log_b a$ to mean that $b^x = a$. Can you use this definition
in reverse, as a definition of the meaning of $b^x$, to prove the laws of exponents?
Of course, you may use the properties of logarithms given in the previous
two exercises.
3.11 Let a, b, and c be numbers for which the expressions below are meaningful.
Prove that
$(\log_a b) \cdot (\log_b c) = \log_a c$.
Hint: Let $x = \log_a b$, $y = \log_b c$, and $z = \log_a c$. Then $a^x = b$, and so on.
3.12 Let b be a positive number not equal to 1, c a positive number, and a
a positive number not equal to 1. Simplify the following expressions:
$\log_b bc$, $\log_b (1/b)$, $(\log_b a) \cdot (\log_a b)$.
3.13 Let b and c be positive numbers, neither equal to 1. Show that
$\log_c b = \frac{1}{\log_b c}$.
3.14 Let b be a number larger than 1; if you want to be specific, you may even
assume for the purpose of this exercise that b = 10. One law of exponents
is that if x and y are any two numbers such that x < y, then $b^x < b^y$. Use
this fact to show that also, if x < y and both x and y are positive, then
$\log_b x < \log_b y$.
3.15 You will need a table of logarithms for this exercise. Solve for x:
$2^x = 3$.

3.2 A PECULIAR MANIPULATION


Let us assume that all logarithms to be used from now on are to the base
b = 10, since tables of such logarithms are the most readily available, and
we may then abbreviate $\log_{10} x$ by log x.
Later, one of the most important computations we shall want to perform
will involve expressing a number such as (log 3)/(log 2), which exceeds 1,
as the sum of a whole number and a number between 0 and 1. We know that
(log 3)/(log 2) exceeds 1 by virtue of Exercise 3.14: since 2 < 3, also log 2 <
log 3, and hence (log 3)/(log 2) > 1.
Here is the computation by which we can express (log 3)/(log 2) as the
sum of a whole number and a number between 0 and 1. Note how the laws
of logarithms mentioned in Section 3.1 are used.

$$\begin{aligned}
\frac{\log 3}{\log 2} &= \frac{\log [(2) \cdot (3/2)]}{\log 2} \\
&= \frac{\log 2 + \log (3/2)}{\log 2} \\
&= \frac{\log 2}{\log 2} + \frac{\log (3/2)}{\log 2} \\
&= 1 + \frac{\log (3/2)}{\log 2}.
\end{aligned}$$

Now 3/2 < 2, and hence log (3/2) < log 2. Thus we have obtained the
desired result provided both log (3/2) and log 2 are positive, so that their
quotient does lie between 0 and 1. You will see in the exercises at the end of
this section that it is very easy to determine that both log (3/2) and log 2
are positive, and so we have achieved the desired result. Given the number
(log 3)/(log 2), we have expressed it as the sum of a whole number and a
number between 0 and 1, in the form

$$\frac{\log 3}{\log 2} = 1 + \frac{\log (3/2)}{\log 2}.$$
It may seem strange to you at this stage that these peculiar manipulations of
logarithms can provide us with information about problems involving
musical scale systems, but they will.
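For readers with a computer at hand, the manipulation can be checked numerically. The sketch below is only an illustration: the function name `split_log_ratio` is hypothetical, and floating-point logarithms stand in for the table. It pulls out the whole number n and leaves the remainder in the form log(a / bⁿ) / log b, just as was done above with a = 3, b = 2, and n = 1.

```python
import math

def split_log_ratio(a, b):
    """Write (log a)/(log b) as n + log(a / b**n)/(log b), n a whole number."""
    ratio = math.log10(a) / math.log10(b)
    n = math.floor(ratio)                              # the whole-number part
    remainder = math.log10(a / b**n) / math.log10(b)   # lies between 0 and 1
    return n, remainder

n, remainder = split_log_ratio(3, 2)
print(n, remainder)   # whole number, then log(3/2)/log 2
```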
Exercises

3.16 Show that log (3/2) and log 2 are both larger than 0; that is, that each
is positive. Hint: Use Exercise 3.14 and the fact that log 1 = 0.
3.17 Express the number 12/7 as the sum of a whole number and a number
between 0 and 1.
3.18 Express the number (log 5)/(log 2) as the sum of a whole number and
a number between 0 and 1.
3.19 Express (log 4)/(log 2) as the sum of a whole number and a number
between 0 and 1.
3.20 Since we now know that 3/2 < 2 and log (3/2) < log 2, and that
the latter numbers are positive, we can write

$$\frac{\log 2}{\log (3/2)} > 1.$$

Express

$$\frac{\log 2}{\log (3/2)}$$

as the sum of a whole number and a number between 0 and 1.

3.3 CONTINUED FRACTIONS


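Consider a fraction such as 12/7. We may convert this into a compound
fraction of a certain standard form as follows.

$$\begin{aligned}
\frac{12}{7} &= 1 + \frac{5}{7} \\
&= 1 + \cfrac{1}{\frac{7}{5}} \\
&= 1 + \cfrac{1}{1 + \frac{2}{5}} \\
&= 1 + \cfrac{1}{1 + \cfrac{1}{\frac{5}{2}}} \\
&= 1 + \cfrac{1}{1 + \cfrac{1}{2 + \frac{1}{2}}}
\end{aligned}$$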

The "standard form" into which 12/7 has been converted is this: It is a
compound fraction in which each numerator is 1, all signs are "+" rather
than "-", and all numbers are positive whole numbers. If we had begun
with a negative number, we could still obtain this standard representation
for it, except that the first number on the left would be a negative number.
It should be clear that every rational number can be so expressed, and that the
resulting compound fraction must be finite because the denominators decrease
at each stage. But if we allow nonterminating denominators, just as we allow
nonterminating decimal expansions, each real number can be expressed as
such a compound fraction; these are called continued fractions. For example,
the continued fraction for √2 is

$$1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2 + \cdots}}}}}$$

Since we have agreed that all numerators are to be 1 and all signs positive,
we can adopt a much more convenient abbreviation for a continued fraction.
We merely list in order the numbers to the left of each fraction, setting the
first such number off by a semicolon, since it, unlike the others, may be zero
or negative. The above continued fraction would then be abbreviated by
(1 ; 2, 2, 2, 2, 2, ... ),
and the continued fraction we previously obtained for 12/7 would be ex-
pressed as (1; 1, 2, 2). In the latter case, the case of a terminating continued
fraction, we need to give the last denominator.
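The conversion of a rational number into this abbreviated form is easily mechanized. Here is a minimal sketch, assuming exact arithmetic with Python's `fractions.Fraction`; the function name `continued_fraction` is a hypothetical one chosen for the illustration. At each stage it splits off the whole-number part and inverts what remains, exactly as was done by hand for 12/7.

```python
from fractions import Fraction
import math

def continued_fraction(x):
    """List the terms (a0; a1, a2, ...) of the continued fraction of a rational x."""
    x = Fraction(x)
    terms = []
    while True:
        whole = math.floor(x)    # the number to the left of the fraction
        terms.append(whole)
        x -= whole               # what remains lies between 0 and 1
        if x == 0:
            return terms         # the expansion has terminated
        x = 1 / x                # invert the remainder and repeat

print(continued_fraction(Fraction(12, 7)))   # terms of the expansion of 12/7
```

The loop must stop for every rational number, for the reason given in the text: the denominators decrease at each stage.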
Of course, there is no difficulty in evaluating a terminating continued
fraction, such as (1; 2, 3). Only the following simple arithmetic is needed.

$$(1; 2, 3) = 1 + \cfrac{1}{2 + \frac{1}{3}} = 1 + \cfrac{1}{\frac{7}{3}} = 1 + \frac{3}{7} = \frac{10}{7}.$$
Here is a method for "evaluating" nonterminating continued fractions in
which the numbers repeat periodically. This method will always produce the
correct value for the sort of continued fractions we will be dealing with. We
apply the method to the continued fraction (1; 2, 2, 2, ... ). Let x =
(1; 2, 2, 2, ... ). Then x = (1; 1 + x), and so

$$x = 1 + \frac{1}{1 + x}.$$

Thus

$$x \cdot (1 + x) = (1 + x) + 1,$$

so that

$$x + x^2 = x + 2 \quad\text{or}\quad x^2 = 2.$$

We discard the negative solution of this last equation for reasons to be discussed
later, and find that x = √2.
If you wish to work in the opposite direction and find the continued
fraction expansion of x = √2, first write

$$x^2 = 2, \quad\text{so that}\quad x^2 - 1 = 1,$$

or

$$(x + 1)(x - 1) = 1.$$

Since we know that x ≠ -1, we may divide both sides of this last equation
by x + 1, and thus obtain

$$x - 1 = \frac{1}{1 + x}.$$

Thus

$$x = 1 + \frac{1}{1 + x}.$$

Since the entire right-hand side of this last equation is equal to x, we substitute
the entire right-hand side for the x in the last denominator, and obtain

$$x = 1 + \cfrac{1}{1 + 1 + \cfrac{1}{1 + x}}$$

or

$$x = 1 + \cfrac{1}{2 + \cfrac{1}{1 + x}}.$$

If we continue this process of substitution for the x in the denominator of the
right-hand side, we obtain x = (1; 2, 2, 2, ... ), as desired.
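The repeated substitution can also be watched numerically. In the sketch below (an illustration only; the function name `substitute_repeatedly` and the starting guess are assumptions made here), a trial value is fed again and again into the right-hand side 1 + 1/(1 + x), and the result is compared with the equation x² = 2.

```python
def substitute_repeatedly(start=1.0, steps=20):
    """Apply x -> 1 + 1/(1 + x) the given number of times."""
    x = start
    for _ in range(steps):
        x = 1 + 1 / (1 + x)   # replace x by the entire right-hand side
    return x

x = substitute_repeatedly()
print(x, x * x)   # x squared should be very nearly 2
```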

Exercises
3.21 Evaluate the continued fraction (1; 2, 1, 3, 2).
3.22 Evaluate the continued fraction (1; 3, 3, 3, ... ).
3.23 Find the continued fraction for √5.
3.24 Evaluate the continued fraction (1 ; 3, 4, 3, 4, 3, 4, 3, 4, ... ).
3.25 Since √3 is between 1 and 2, the continued fraction expansion for √3
must be of the form
$\sqrt{3} = (1; a_1, a_2, a_3, a_4, \ldots)$,
where the numbers $a_1, a_2, a_3, a_4, \ldots$ are all positive whole numbers. Show
that the continued fraction expansion of √3 cannot be of the form
(1; a, a, a, a, ... ),
where a is a positive whole number.
3.26 Find the continued fraction expansion of .Ji


3.27 Evaluate (1; 1, 2, 2) and (1; 1, 2, 1, 1).
3.28 Evaluate (1; 1, 1, 1, 1, ... ). Call the resulting number Q. Show that if
a rectangle with sides of length 1 and Q is constructed, then divided into a
square and a rectangle by a line perpendicular to the long side, the new small
rectangle also has its sides in the proportion 1 : Q.
3.29 Show that any rectangle with the property of the one constructed in the
previous exercise must have its sides in the proportion 1 : Q, where
Q = (1; 1, 1, 1, 1, ... ).
3.30 Look up The Golden Mean in a book on the history of mathematics.

3.4 THE VALUE OF A CONTINUED FRACTION


We went through a process in the previous section by which we concluded
that the value of the continued fraction (1; 2, 2, 2, 2, ... ) was √2. But how
can we say that a number is "equal" to one of these nonterminating continued
fractions when such a continued fraction, because it is nonterminating, clearly
cannot be evaluated directly? It is impossible to actually perform the infinite
number of additions and divisions necessary to "evaluate" a nonterminating
continued fraction, so what we must do is give a definition of the value of a
continued fraction in terms of processes which actually can be carried out.
This is how it is done. Consider again our continued fraction representation
of √2; that is, √2 = (1; 2, 2, 2, 2, ... ). Let us imagine that we were
actually trying to evaluate this "infinite" fraction, and write the sequence of
so-called partial quotients that we obtain by evaluating it out to a certain
point and then forgetting about the rest. In this instance, we would first
evaluate (1; ), then (1 ; 2), then (1 ; 2, 2), and so on. The partial quotients
thus obtained form the following sequence:
1, 3/2, 7/5, 17/12, 41/29, 99/70, 239/169, ... , or
S1 = 1.000 000 000 ...
S2 = 1.500 000 000 ...
S3 = 1.400 000 000 ...
S4 = 1.416 666 666 ...
S5 = 1.413 793 103 ...
S6 = 1.414 285 714 ...
S7 = 1.414 201 183 ...

If you know that √2 = 1.414 213 562 ... , you notice an interesting
phenomenon. The sequence of partial quotients has bracketed the value of
√2 much as in artillery fire; the values are bouncing back and forth on
either side of the value of √2 and are getting closer and closer to it, as
indicated in Fig. 3.3. The technical term used here is that √2 is the limit
of the above sequence of numbers, and it is in this sense that we say that the
continued fraction (1; 2, 2, 2, 2, ... ) has the value √2. The method we used
for "evaluating" (1; 2, 2, 2, 2, ... ) in the previous section, the one in which
we substituted the x on the left-hand side of an equation for the x on the
right-hand side, is merely a device for avoiding examination of the sequence
of partial quotients. This device is very useful when it works, but as you can
see, it will work only on a continued fraction that repeats its terms periodically.

Fig. 3.3
√2 as the limit of a sequence.
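The sequence of partial quotients is easy to generate by machine. The following sketch is an illustration under stated assumptions: the names `evaluate` and `partial_quotients` are hypothetical, and only the first seven terms of (1; 2, 2, 2, ... ) are used. It evaluates each truncation exactly and reports on which side of √2 it falls.

```python
from fractions import Fraction
import math

def evaluate(terms):
    """Value of the terminating continued fraction (terms[0]; terms[1], ...)."""
    value = Fraction(terms[-1])
    for t in reversed(terms[:-1]):
        value = t + 1 / value
    return value

def partial_quotients(terms):
    """Values obtained by cutting the continued fraction off after 1, 2, 3, ... terms."""
    return [evaluate(terms[:i]) for i in range(1, len(terms) + 1)]

sequence = partial_quotients([1, 2, 2, 2, 2, 2, 2])
for s in sequence:
    side = "below" if s < math.sqrt(2) else "above"
    print(s, float(s), side)
```

The printed sides should alternate, which is the bracketing described in the text.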
You may well wonder whether or not all continued fractions have values
in the sense of the limit of the sequence of partial quotients. After all, it is
conceivable that the sequence increases without bound. Fortunately for our
purposes, there is a theorem in the theory of continued fractions which
guarantees that if $a_0$ is any real number whatsoever and $a_1, a_2, a_3, a_4, \ldots$
are all positive whole numbers, then the sequence of partial quotients of the
continued fraction $(a_0; a_1, a_2, a_3, a_4, \ldots)$ does indeed have a limit, and it is
this limit we mean when we speak of the value of the continued fraction
$(a_0; a_1, a_2, a_3, a_4, \ldots)$. Further topics on limits can be found in the next
set of exercises.
But the above-mentioned theorem is by no means the most interesting
fact about continued fractions. We shall use the following result to answer
some of our questions about musical scale systems: A continued fraction
provides us with a "best possible" sequence of rational approximations to its
limit. In the last example, you can see that 17/12 is a fairly good approximation
to √2. The next fraction in the sequence is 41/29. The continued
fraction for √2, namely (1; 2, 2, 2, 2, ... ), will yield up to us the information
as to which, if any, fractions with denominators between 12 and 29 are
better approximations to √2 than 17/12. (Sometimes the method will produce
a few which are not so good as 17/12, but these can be eliminated by
inspection.) The method will be easier to see if we first try a more complicated
example; say,
α = (1; 3, 1, 4, 1, 3, 1, 4, 1, ... ).

The value of ex is approximately 1.261 28. The sequence of partial


quotients associated with the continued fraction expansion of ex is
1/1, 4/3, 5/4, 24/19, 29/23, ....
Now 5/4 was obtained by calculating the value of (1; 3, 1) and 24/19 was
obtained by calculating the value of (1; 3, 1, 4). To find the possibly better
approximations to $\alpha$ with denominators between 4 and 19, one simply
calculates the values of the so-called intermediate fractions. One may think
of these as being obtained in the following way. We already have 5/4 =
(1; 3, 1). The next approximation given in the above sequence, 24/19, is
obtained by calculating the value of (1; 3, 1, 4). To get the intermediate
fractions we instead successively calculate the values of the fractions
(1; 3, 1, 1)
(1; 3, 1, 2)
(1; 3, 1, 3)

or, in more mathematical notation, we calculate the values of the fractions
(1; 3, 1, n), where n takes on all positive whole number values from 1 to
4 (since 4 is the last number in the fraction (1; 3, 1, 4) = 24/19); the first
three of these are the intermediate fractions, and the last is just 24/19 itself.
If the last term of 24/19 = (1; 3, 1, 4) had been some much larger number,
such as 30, then n would have to be allowed to take on twenty-nine values
rather than three, but the principle is the same.
In any case, in the above example the three intermediate fractions turn
out to be
9/7, 14/11, 19/15,
and each can be tested to see whether or not it is a better approximation to
$\alpha$ = 1.261 28 than 5/4. We insert these three new fractions into the sequence
of partial quotients, and obtain
..., 5/4, 9/7, 14/11, 19/15, 24/19, ...
These five fractions have the following decimal values:
5/4 = 1.250 00
9/7 = 1.285 71...
14/11 = 1.272 72...
19/15 = 1.266 66...
24/19 = 1.263 15...
Comparing these with 1.261 28, we see that 9/7 and 14/11 are not so good as
5/4 (14/11 misses only narrowly), so these two are eliminated by inspection;
19/15 is better than 5/4, and 24/19 is better still. What the theorem
guarantees is this: The intermediate fractions are the only candidates with
such denominators. Of all fractions with denominators between 4 and 19, only
19/15 is a better approximation to $\alpha$ than 5/4. Similarly, there may be
such intermediate fractions between other successive partial quotients. It
will be necessary to examine the intermediate fractions in order to answer
completely the question raised at the beginning of this chapter, on how
improvements in the twelve-tone system might be accomplished.
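
The bookkeeping in this example is mechanical enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the function names `convergents` and `intermediate_fractions` are our own, and 1.261 28 is the rounded value of $\alpha$ quoted above. The sketch prints each fraction from 5/4 through 24/19 with its distance from $\alpha$, so that the candidates can be compared by inspection.

```python
from fractions import Fraction

def convergents(terms):
    """Values of (a0), (a0; a1), (a0; a1, a2), ... as exact fractions."""
    p_prev, q_prev = 1, 0            # conventional starting values
    p, q = terms[0], 1               # the first value is a0/1
    values = [Fraction(p, q)]
    for a in terms[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        values.append(Fraction(p, q))
    return values

def intermediate_fractions(terms):
    """Values of (a0; ..., n) for n = 1, ..., last - 1, where last = terms[-1]."""
    head, last = terms[:-1], terms[-1]
    return [convergents(head + [n])[-1] for n in range(1, last)]

alpha = 1.26128                      # rounded value quoted in the text
between = intermediate_fractions([1, 3, 1, 4])
for frac in [Fraction(5, 4)] + between + [Fraction(24, 19)]:
    # fraction, its decimal value, and its distance from alpha
    print(frac, round(float(frac), 5), round(abs(float(frac) - alpha), 5))
```

Replacing the list `[1, 3, 1, 4]` by the opening entries of another continued fraction produces the corresponding intermediate fractions in the same way.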
Exercises

3.31 Recall that we evaluated the continued fraction x = (1; 2, 2, 2, 2, ...)
by algebraic methods in which we obtained the equation $x^2 = 2$. We then
stated that the value of x must be $\sqrt{2}$, since the negative root can be ignored.
If you know the definition of the value of a continued fraction in terms of its
sequence of partial quotients, you will now be able to explain why the negative
root can indeed be discarded. Please supply this explanation.
3.32 The sequence $s_1, s_2, s_3, s_4, \ldots$ of real numbers is said to have the
number L as its limit if the following is true: Given the positive number $\varepsilon$,
no matter how small, there exists a whole number N (probably dependent
on $\varepsilon$) such that for all values of n > N, also $|s_n - L| < \varepsilon$.
This is the precise meaning of the word "limit" as used in the preceding
section. With the aid of this definition, you can prove such theorems as this:
If a sequence of real numbers has a limit, then it has only one limit. There
are many others you can think of and prove. The concept of limit is fun-
damental to calculus, perhaps the best-known and most important branch
of higher mathematics.
3.33 You may have noticed that the sequence of partial quotients
$s_1, s_2, s_3, s_4, \ldots$ of the continued fraction expansion of $\sqrt{2}$ had the
property that
\[ s_1 < s_3 < s_5 < \cdots < \sqrt{2} \]
and
\[ \sqrt{2} < \cdots < s_6 < s_4 < s_2. \]
Try to give an informal proof why this should always be true, restricting your
attention to continued fractions with only positive whole number entries.
3.34 Insert the intermediate fractions in the appropriate places in the
sequence of partial quotients of the continued fraction (1; 2, 2, 2, 2, ...).
Indicate which of these are better approximations to $\sqrt{2}$ than their
predecessors in the new sequence.
3.35 A desk calculator would be very helpful in working this problem.
Suppose you wanted to find the continued fraction expansion of the square
root of 10, and you already know that the decimal expansion of the square
root of 10 looks like 3.162 277 660 .... You could then find the continued
fraction expansion of 3.162 277 660 instead, since this is a good approximation
to the square root of 10. You would expect that the answer would be
similar to the continued fraction expansion of the square root of 10 itself;
the entries should be the same until round-off error catches up with you.
Knowing the decimal expansion, you could proceed as follows:
\[
\begin{aligned}
3.162\,277\,660 &= 3 + 0.162\,277\,660 \\
&= 3 + \cfrac{1}{\cfrac{1}{0.162\,277\,660}} \\
&= 3 + \cfrac{1}{6.162\,277\,666} \\
&= 3 + \cfrac{1}{6 + 0.162\,277\,666} \cdots
\end{aligned}
\]
At each stage, you extract the largest whole number you can from the last
denominator, invert the remainder twice (to keep it equal to what it was
before while producing a number larger than 1), and continue the process.
After a few stages you will have a good guess as to the actual continued
fraction expansion of the square root of 10 itself. Continue the above
computations, and make that guess.
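
The extract-and-invert routine described in Exercise 3.35 can be sketched in Python. To leave the exercise itself untouched, the sketch is applied to the decimal 1.414 213 562 (an approximation to $\sqrt{2}$) rather than to the number in the exercise. The name `expand` and the use of exact `Fraction` arithmetic are our own choices, made so that no round-off beyond that of the nine-place decimal itself creeps in.

```python
from fractions import Fraction

def expand(x, max_terms=8):
    """Continued fraction entries of x: take out the whole part, invert, repeat."""
    x = Fraction(x)                  # exact arithmetic on the given decimal
    entries = []
    for _ in range(max_terms):
        whole = x.numerator // x.denominator   # largest whole number not exceeding x
        entries.append(whole)
        remainder = x - whole
        if remainder == 0:           # the expansion of a rational number terminates
            break
        x = 1 / remainder            # a number larger than 1
    return entries

print(expand("1.414213562"))         # entries agree with those of sqrt(2) at first
```

Asking for many more entries eventually shows the round-off "catching up": the entries stop being 2 once the nine-place decimal no longer resembles $\sqrt{2}$ closely enough.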

3.5 APPLICATIONS TO BASEBALL AND GRADE DISTRIBUTIONS

First, here is an example of how methods of continued fractions might be
used to answer a question about baseball. When a player's batting average
is given as, for example, 0.263, this number is computed by dividing the total
number of the player's official hits by the total number of official at-bats at
that point in the baseball season. The quotient is rounded off to three-place
accuracy. Suppose that we are given that a player's batting average is 0.263.
Can we draw any conclusions about his total number of hits and total number
of at-bats?
The number 0.263 is obtained by rounding off some longer decimal,
obtained perhaps by dividing 76 by 289. But it would be far too much
trouble to try all possible fractions with denominators less than one thousand
(a rather generous upper bound for the total number of at-bats available
to one player in a single season) to find out which ones might round off to
0.263. However, 0.263 is itself a reasonably good approximation to the
player's "true" or unrounded average, hence we can expand 0.263 into a
continued fraction in order to find the good rational approximations to it,
and thereby find the good rational approximations to the player's true
average. Such rational numbers are the candidates we examine, the numera-
tors providing us with possible numbers of hits, and the denominators
being the corresponding numbers of at-bats. The continued fraction ex-
pansion of 0.263 is given below:
\[ 0.263 = \frac{263}{1000} = (0; 3, 1, 4, 17, 3). \]
The corresponding sequence of partial quotients is
1/3, 1/4, 5/19, 86/327, 263/1000.
We insert the intermediate fractions and obtain
1/1, 1/2, 1/3, 1/4, 2/7, 3/11, 4/15, 5/19, 6/23, 11/42, 16/61, 21/80,
26/99, 31/118, 36/137,41/156,46/175,51/194, 56/213, 61/232,
66/251, 71/270, 76/289, 81/308, 86/327, 91/346, 177/673, 263/1000.
Of these, only 5/19 and those from 21/80 on give quotients which round
off to 0.263. Every fraction in the list is in lowest terms, so other forms of
the same fraction do not appear; since 5/19 qualifies, we should also include
10/38, 15/57, and so on. In any case, we now have the leading candidates for
the combinations of hits and at-bats that give an average of 0.263. (The list
is not quite exhaustive; 47/179, for example, also rounds off to 0.263.) If you
have additional
information-such as the fact that the player is a little-used pinch-hitter, or
that it is still early in the baseball season-this might be enough to conclude
that the player must have had five hits out of nineteen times at-bat. Of course,
with such advance knowledge we wouldn't have bothered to carry the listing
of the fractions out so far, but would have terminated our list when the
denominators became sufficiently large to take care of the maximum possible
number of at-bats.
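
The listing above can be reproduced with a short Python sketch. The function names `expand`, `with_intermediates`, and `rounds_to` are our own, and rounding is taken to be "half up" to three places, which is an assumption about how the averages are rounded.

```python
import math
from fractions import Fraction

def expand(x):
    """Continued fraction entries of the rational number x."""
    x = Fraction(x)
    entries = []
    while True:
        whole = math.floor(x)
        entries.append(whole)
        if x == whole:
            return entries
        x = 1 / (x - whole)

def with_intermediates(entries):
    """Partial quotients with the intermediate fractions inserted, in order."""
    p_prev, q_prev, p, q = 1, 0, entries[0], 1
    listing = []
    for a in entries[1:]:
        for n in range(1, a + 1):    # n = a gives the next partial quotient itself
            listing.append(Fraction(n * p + p_prev, n * q + q_prev))
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
    return listing

def rounds_to(frac, thousandths):
    """True if frac, rounded half up to three places, equals thousandths/1000."""
    return math.floor(frac * 1000 + Fraction(1, 2)) == thousandths

entries = expand(Fraction(263, 1000))
listing = with_intermediates(entries)
candidates = [f for f in listing if rounds_to(f, 263)]
print(entries)
print(", ".join(str(f) for f in candidates))
```

Each numerator in `candidates` is a possible number of hits and each denominator the corresponding number of at-bats; multiples such as 10/38 are not generated, because every fraction produced is in lowest terms.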
As a second example, suppose that you know that in a certain class of no
more than sixty students, an instructor's final grade distribution was given as
A's: 10.8%
B's: 24.3%
C's: 37.8%
D's: 8.1%
F's: 18.9%
Suppose that you wish to find exactly how many students were in the class.
Consider first only the percentage of students receiving "A" grades. We
find the continued fraction expansion of this number as follows:

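\[
\begin{aligned}
\frac{108}{1000} &= 0 + \cfrac{1}{\cfrac{1000}{108}}
 = 0 + \cfrac{1}{9 + \cfrac{28}{108}}
 = 0 + \cfrac{1}{9 + \cfrac{1}{\cfrac{108}{28}}} \\
&= 0 + \cfrac{1}{9 + \cfrac{1}{3 + \cfrac{24}{28}}}
 = 0 + \cfrac{1}{9 + \cfrac{1}{3 + \cfrac{6}{7}}}
 = 0 + \cfrac{1}{9 + \cfrac{1}{3 + \cfrac{1}{1 + \cfrac{1}{6}}}}
\end{aligned}
\]
or
(0; 9, 3, 1, 6).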

The associated sequence of partial quotients, together with their decimal
expansions, is
1/9 = 0.111 111...
3/28 = 0.107 142...
4/37 = 0.108 108...
27/250 = 0.108 000
Now 28 is too small a denominator, since the corresponding decimal does not
round off to 0.108, and 250 is too large, since the denominator represents the
class size and there are no more than 60 students. So any other possible
approximations that round off to 0.108 can only be found among those
intermediate fractions immediately after 3/28 and those immediately after
4/37.
But there are no intermediate fractions between 3/28 and 4/37, and the
first one after 4/37 is (0; 9, 3, 1, 1) = 7/65, which we reject since its denomina-
tor is too large. The following intermediate fractions can have only larger
denominators, hence we are already finished. In effect, the only fraction
with denominator between 1 and 60 which does round off to 0.108 is 4/37.
Thus we know that there must have been 37 students in the class, and that
four of these students received "A" grades.
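
With only sixty possible class sizes, the conclusion can also be checked by brute force, which a computer does easily even though it would be tedious by hand. The sketch below is such a check, not the continued fraction method itself; the function name `admissible` is our own, and percentages are assumed to be rounded half up to one decimal place.

```python
import math
from fractions import Fraction

def admissible(percent_tenths, max_class_size=60):
    """All (count, class_size) pairs whose percentage rounds to percent_tenths/10."""
    pairs = []
    for size in range(1, max_class_size + 1):
        for count in range(size + 1):
            # percentage in tenths of a percent, rounded half up, in exact arithmetic
            rounded = math.floor(Fraction(1000 * count, size) + Fraction(1, 2))
            if rounded == percent_tenths:
                pairs.append((count, size))
    return pairs

print(admissible(108))               # the "A" grades: 10.8%
```

For 10.8% the only pair found is four students out of thirty-seven, in agreement with the argument above.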

Exercises

3.36 Repeat the preceding example, using the "B" grades instead. You
should obtain a continued fraction in the form (0; 4, 8, 1, 2, 9) for 0.243,
and 9/37 the only admissible ratio.
3.37 Repeat the preceding example, using the "C" grades. You should
obtain two admissible fractions, 14/37 and 17/45.
3.38 Do this example again, this time using the "D" grades. You should
obtain only one admissible fraction.
3.39 Repeat the example using the "F" grades. You should obtain more than
one admissible fraction.
3.40 If you repeat Exercise 3.36, altering it only in that as many as 150
students may have been in the class, what results do you obtain for the
number of students that may or must have been in the class?
3.41 If a baseball player's batting average at the end of a season is 0.338,
what is the minimum number of times he could have batted (that is, what is
the minimum number of official at-bats)?
3.42 Use the definition of limit given in Exercise 3.32 to prove that your
guess as to the limit of the sequence
1, 1/2, 1/3, 1/4, 1/5, 1/6, ...
is correct.
3.43 The value of a continued fraction has been defined as the limit of its
sequence of partial quotients. How could one analogously define the "sum"
of an infinite series such as
1 + 1/2 + 1/4 + 1/8 + 1/16 + ... ?
3.44 What is your guess as to the sum of the infinite series given in the
previous exercise?
3.45 How would you prove that your answer to the previous exercise is
correct?

3.6 HARMONY
We are finally ready to turn our attention to the musical scale. A vibrating
string sets up corresponding vibrations in the air about it, vibrations which
are perceived as sound if they are sufficiently, but not excessively, rapid.
The human ear is usually capable of perceiving vibrations between thirty
and seventeen thousand hertz (the currently approved term for "cycles per
second"), although at the extremes only fairly loud sounds can be heard.
It was known to the ancient Greeks that the frequency of such a vibrating
string is inversely proportional to its length, all other pertinent factors
(tension, density, diameter, ... ) remaining the same. That is, doubling the
length of a string produced a frequency half that of the original string; to
triple the frequency it would be necessary to use a string one-third the original
length. In addition, it was probably known to the Greeks that a vibrating
string of length (say) twelve inches also vibrated to a certain extent as if it
were two six-inch strings joined at the middle, as well as three four-inch
strings, and so on. This phenomenon, which occurs in different proportion
with different musical instruments, is known as the production of the higher
harmonics of the so-called fundamental frequency of the vibrating string,
and it follows that the frequencies of these harmonics are the whole number
multiples of the fundamental frequency. See Fig. 3.4 for an indication of the
way in which the higher harmonics are produced. It is the production of
these higher harmonics in different proportions by the various orchestral
instruments that enables a composer to create music of such wide ranges of
sound. The particular combination of harmonics is one of the main reasons
why a violin sounds different from a saxophone.
It also follows that if two strings are set in simultaneous vibration and
the length of one is twice the length of the other, then the various harmonics
match up, producing a pleasant (or at least a nondissonant) effect. For in this
case the second harmonic of the longer string would be the fundamental
frequency of the shorter, and every harmonic of the shorter string would
coincide with one of the harmonics of the longer. However, though not unpleasant, the sound of two such
tones is not a very rich sound. You can produce such a harmony by striking
any two notes one octave apart on a properly tuned piano.
It must have been discovered early in the history of music that if two
strings were simultaneously plucked, and the length of the second was
two-thirds the length of the first, a pleasing harmony resulted. The shorter
string produces a frequency three-halves that of the longer one, and many of
the higher harmonics are equal in frequency. The mixture of two such tones
gives a richer sound than can be obtained from the harmonics of a single
string, as you can hear by comparing the sound of middle C on the piano
together with the G immediately above it with the sound produced by two
notes an octave apart. (The pitch of this G is, however, not precisely
three-halves that of middle C, but only very close to this ratio; later in this
chapter we shall see why this is so.)

Fig. 3.4 Production of harmonics by a vibrating string (fundamental and second through fifth harmonics).

Finally, the Greeks also discovered a principle which has come to be
known as the Law of Small Whole Numbers: If the ratios of the lengths of
two plucked strings could be expressed using small whole numbers, such as
2 : 1 in the case of an octave or 3 : 2 in the case of a fifth, then the sounds
were harmonic rather than dissonant. But as the numbers needed to express
such ratios become larger, the sounds become more dissonant; the ratios of
4 : 3 and 5: 4 produce harmonies which most people find quite pleasant,
whereas only the most avant-garde composers might consider that a 37 : 29
ratio produces a harmonious sound.
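
One way to see why small whole numbers matter is to ask how far up the two series of harmonics one must go before a harmonic of one tone coincides with a harmonic of the other. The Python sketch below does this for idealized strings whose harmonics are exact whole-number multiples of the fundamental; that idealization, the function name `first_shared_harmonic`, and the use of frequency ratios (the inverses of the length ratios) are assumptions made for illustration.

```python
from fractions import Fraction

def first_shared_harmonic(ratio, limit=2000):
    """For two tones whose frequencies are in the given ratio (higher : lower),
    return (k, m) such that the k-th harmonic of the lower tone is the first
    to coincide with a harmonic (the m-th) of the higher tone."""
    ratio = Fraction(ratio)
    for k in range(1, limit + 1):
        m = Fraction(k) / ratio      # k times the lower frequency, in units of the higher
        if m.denominator == 1:       # a whole number, so the harmonics coincide
            return k, int(m)
    return None

for name, ratio in [("octave", Fraction(2, 1)), ("fifth", Fraction(3, 2)),
                    ("fourth", Fraction(4, 3)), ("third", Fraction(5, 4)),
                    ("37 : 29", Fraction(37, 29))]:
    print(name, first_shared_harmonic(ratio))
```

For the fifth the third harmonic of the lower tone already matches the second harmonic of the higher, whereas for a 37 : 29 ratio no harmonics match until the thirty-seventh harmonic of the lower tone.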
We may now speculate about the origin of our present twelve-tone scale
system so widely used in our Western culture. If you will refer again to Fig.
3.1 at the beginning of this chapter, you will see that X# would be the
black key of the piano immediately above the white key X. There is no black
key between E and F, nor is there one between B and C, and so a musician
would interpret E# to mean F and B# to mean C.
If we have information about the frequencies of the notes in a single
octave, we may use this information to find the frequencies of every other
note on the piano, for each other note is some whole number of octaves
above or below one of these, and its frequency may be found by the ap-
propriate amount of doubling or halving the right one in our first octave.
Finally, it is common to tune middle C to about 256 hertz, but this fact need
concern us no further.
After a brief excursion into the art of tuning a piano, we will be able
to continue our study of the origins of the twelve-tone system.

Exercise

3.46 Strike the key middle C on a piano, while holding down the key C'
(the C one octave above middle C), and release middle C immediately. You
will hear C' sounding its natural frequency, showing that this frequency is
one of the harmonics of C. Can you find which other keys show the same
behavior, and make a table of the higher harmonics of middle C in terms of
the other notes on the piano? Hint: Of course, C' has twice the frequency of
C, and C" (the C immediately above C') has four times the frequency of C,
and so on. So, for example, you would look for the third harmonic of middle
C somewhere between C' and C".

3.7 TUNING A PIANO, OLD STYLE


We have already mentioned that the harmony produced by two strings in a
3 : 2 length ratio can almost, but not exactly, be produced by striking middle
C on the piano together with the G immediately above it. Why is this G not
tuned to exactly three-halves the frequency of middle C?
Suppose it were, and in fact that all such pairs of fifths-two notes seven
half steps apart on the piano-were so tuned. Let the frequency of middle
C be denoted by v. The argument that follows turns out to be independent of
the value of v, since it cancels out, but if you wish you may assume that
v = 256 hertz. Remember that if we move down one octave, or twelve half
steps, on the piano, the frequency will be halved. We proceed to fill in the
frequencies of all the other notes in our octave, using the fact (which we have
assumed in order to reach an eventual contradiction) that each note a fifth
above another has frequency exactly three-halves the frequency of the other.
C        G          D'
v        (3/2)v     (9/4)v
Since the note D' above has escaped our octave, we return it by dividing its
frequency by 2:
C        D          G
v        (9/8)v     (3/2)v
From the frequency of D we can calculate the frequency of A, then E'-we
then return the latter to our octave as before.
C        D          E            G          A
v        (9/8)v     (81/64)v     (3/2)v     (27/16)v
We continue this process for the other notes in the scale, and we finally end
up with the following frequencies for each:
C      $v$
C#     $(1/2)^4 (3/2)^7 v$
D      $(1/2)(3/2)^2 v$
D#     $(1/2)^5 (3/2)^9 v$
E      $(1/2)^2 (3/2)^4 v$
F      $(1/2)^6 (3/2)^{11} v$
F#     $(1/2)^3 (3/2)^6 v$
G      $(3/2) v$
G#     $(1/2)^4 (3/2)^8 v$
A      $(1/2)(3/2)^3 v$
A#     $(1/2)^5 (3/2)^{10} v$
B      $(1/2)^2 (3/2)^5 v$
C'     $(1/2)^6 (3/2)^{12} v$
The frequency of C', the last to be obtained and the last in the above list,
should give us the value v if divided by two, since C' is one octave above
middle C, our starting point. Thus
\[ (1/2)^7 (3/2)^{12} v = v. \]
We may cancel out v, and obtain next
\[ (1/2)^7 (3/2)^{12} = 1. \]
When we simplify this equation, we get
\[ 3^{12} = 2^{19}. \]
But this last equality is impossible, since $3^{12}$ is odd but $2^{19}$ is even. This
shows that it is impossible to have all the fifths on a piano tuned exactly
so that the frequency ratio in each is exactly 3 : 2. Of course, one could tune
middle C to the correct frequency v and then tune all the other C's on the
piano to the correct multiples of v. Then one could tune G above middle C
to the frequency (3/2)v (this can be done quite easily by ear alone with the
aid of a little experience), and then set all the other G's by that one. Proceeding
in this manner, one would finally obtain the frequency $(1/2)^6 (3/2)^{11} v$
for the F above middle C, and then all the other F's could be tuned accord-
ingly. All the fifths would have exactly the correct ratio except for one-the
fifth from F to C. Here the ratio would be a little less than 1.48 rather than
1.5, quite enough difference to sound very peculiar. Pianos were tuned in
this fashion for many years (actually, in Europe, this method of tuning
disappeared about the same time as the harpsichord did), and the interval
between F and C was known as the Wolf Interval, because wolves howl;
this was an allusion to the howling higher harmonics of the two notes, few
of which came anywhere near matching up, thus producing a number of
dissonances.
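
The table of frequencies and the size of the Wolf Interval can be checked with a short Python sketch. It is an illustration under our own naming (`NOTES`, `stacked_fifths`): frequencies are expressed as exact multiples of v, and eleven exact 3 : 2 fifths are tuned upward from C, each note being returned to the starting octave by halving.

```python
from fractions import Fraction

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def stacked_fifths():
    """Frequencies, as multiples of v, from eleven exact 3:2 fifths above C."""
    ratios = {"C": Fraction(1)}
    ratio, position = Fraction(1), 0
    for _ in range(11):
        ratio *= Fraction(3, 2)
        position = (position + 7) % 12   # a fifth is seven half steps
        while ratio >= 2:                # return the note to our octave
            ratio /= 2
        ratios[NOTES[position]] = ratio
    return ratios

ratios = stacked_fifths()
for name in NOTES:
    print(name, ratios[name], round(float(ratios[name]), 4))

wolf = 2 / ratios["F"]                   # from F up to the C one octave above middle C
print("F to C':", round(float(wolf), 4))
print(3 ** 12, 2 ** 19)                  # close, but not equal
```

The last fifth, from F up to C', comes out a little below 1.48, while the other eleven are exactly 1.5 by construction.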
Of course, if one were to play a composition in which the notes F and C
were used as little as possible, as in a piece written in the key of B major,
the Wolf Interval would be avoided and the music would sound quite good.
(If you wished to compose in C, though, you would have to have your
piano retuned so as to place the Wolf Interval on another, little-used, fifth.)
However, transposition, changing the key of a piece by moving all notes up
or down a fixed amount, was impossible in the modern sense, for various
harmonic ratios would be changed by transposition. The key of composition
was sufficiently important to composers of the seventeenth and eighteenth
centuries so that they commonly included it in the titles of their works.

Exercises

3.47 Repeat the argument of this section, using a seven-tone scale system,
and thus show that it is also impossible to have all fifths in exactly a 3 : 2
ratio in such a system. Note: In a seven-tone scale, a "fifth" would be an
interval of four, rather than seven, half steps.
3.48 Repeat the argument of this section, using an n-tone scale system
(where n is a positive whole number larger than 1), and thus show that it
is also impossible to have all "fifths" in exactly a 3 : 2 ratio.
3.49 Suppose that middle C on the piano is tuned to the frequency v, and
all other C's are set accordingly. Suppose also that each note on the piano
is then tuned a fixed multiple of the frequency of the note one half step below
it; that is, calling this multiple k, then the frequency of C# would be $kv$,
the frequency of D would be $k^2 v$, and so on. What is the value of k?
3.50 Since $2 < \sqrt{7} < 3$, we know the continued fraction expansion of $\sqrt{7}$
starts off like (2; ...). So we start by writing
\[ \sqrt{7} = 2 + (\sqrt{7} - 2), \]
because $\sqrt{7} - 2$ is a number between 0 and 1. We invert twice to obtain a
number larger than 1, and thus
\[ \sqrt{7} = 2 + \cfrac{1}{\cfrac{1}{\sqrt{7} - 2}}. \]
The formula $(a + b)(a - b) = a^2 - b^2$ can be used to eliminate $\sqrt{7}$
from the last denominator, as follows:
\[
\frac{1}{\sqrt{7} - 2}
 = \frac{\sqrt{7} + 2}{(\sqrt{7} - 2)(\sqrt{7} + 2)}
 = \frac{\sqrt{7} + 2}{7 - 2^2}
 = \frac{\sqrt{7} + 2}{3}.
\]
Thus
\[ \sqrt{7} = 2 + \cfrac{1}{\cfrac{\sqrt{7} + 2}{3}}. \]
Since $2 < \sqrt{7} < 3$, it follows that $4 < \sqrt{7} + 2 < 5$, and so
\[ \frac{4}{3} < \frac{\sqrt{7} + 2}{3} < \frac{5}{3}, \]
or
\[ 0 < \frac{\sqrt{7} + 2}{3} - 1 < 1. \]
Hence
\[ \sqrt{7} = 2 + \cfrac{1}{1 + \cfrac{\sqrt{7} - 1}{3}}. \]
If you continue this process, eventually a repetition will set in; you will
find that
\[ \sqrt{7} = 2 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{2 + \sqrt{7}}}}}. \]
Hence $\sqrt{7}$ = (2; 1, 1, 1, 4, 1, 1, 1, 4, 1, 1, 1, 4, ...). Repeat this process in
order to find the continued fraction expansion of ../i

3.8 TUNING A PIANO, NEW STYLE


Now $3^{12}$ is not equal to $2^{19}$, but the two numbers are relatively close, and
careless tuning might convince someone that it would be possible, using
twelve notes in an octave, to return to exactly the original frequency after
tuning twelve successive "perfect" fifths. This may well have something to
do with the genesis of the twelve-tone scale used almost exclusively in
Western music. Other cultures have developed scale systems with other
numbers of half steps in the octave, seven being one of the most common.
But the same argument as used before, in which we obtained the contradictory
result $3^{12} = 2^{19}$, can be repeated for such scale systems to show that an
attempt to produce perfect fifths will always result in a Wolf Interval.
Perhaps the first question of this chapter has been partially answered, as to
why there should be twelve notes in an octave. We say "partially," for what
we have done so far is to give a possible evolutionary justification for the
twelve-tone system. Later we shall see that there is a mathematical reason
as well.
Johann Sebastian Bach wrote his Well-Tempered Clavichord, a col-
lection of forty-eight piano compositions, to draw attention to an alternate
tuning method which was then being considered in European musical circles.
This new method, known as well-tempering, involved tuning all the C's on
the piano as before, but subsequently effecting a compromise. Rather than
have most of the fifths in perfect harmony and one very bad Wolf Interval, it
was proposed that all the fifths be tuned very slightly flat-with a ratio of
approximately 1.498 307 rather than exactly 1.5-the number 1.498 307
chosen because it is that number which will cause all the fifths, including the
last one tuned, to be in the same ratio. Rather than having a piano with one
howling interval, we would have a piano with twelve, but each of these twelve
wolves would howl so quietly and only at such high harmonics that they
would be undetectable except to electronic instruments or to an uncommonly
well-trained ear.
Avoiding the Wolf Interval was one of the reasons brought forth to
support well-tempering, but another must surely have had wide appeal:
Because all the intervals (not just the fifths) would have fixed ratios on a well-
tempered piano, transposition becomes possible. A piece written in C #
could be played in D or in G without changing any of the harmonic ratios,
and only a person with a well-developed sense of pitch could tell the difference.
The effect of transposition would be the same as if the piece were recorded
at 33⅓ rpm and played back slightly faster or slower. And with the possibility
of transposition, vocal music becomes much easier to write and to sing.
This advantage of well-tempering must have been an important factor con-
tributing to its eventual adoption; for adopted it was, and is now used on
virtually all pianos. And perhaps we have answered the second question
raised at the beginning of this chapter, by showing one reason for the com-
position of the Wohltemperiertes Klavier.
The drawback to well-tempering is, of course, that none of the harmonies
other than octaves are "perfect." As we have seen, a fifth is tuned with the
higher note about 1.498 307 times the frequency of the lower, rather than
1.5 exactly. We now turn to the third question mentioned at the beginning of
this chapter. Would further improvements in the scale system be possible,
and how might they be accomplished? Should it be possible, by changing the
number of notes in an octave from twelve to some other number, to well-
temper a piano, but have the ratios of the fifths much closer to 1.5 than they
are at present? It seems likely. Theoretically we could have 1200 notes in an
octave rather than 12, and the fifth might be better approximated by striking
two notes 702 half steps apart rather than the 700 which would correspond
to the sound of the twelve-tone fifth. Of course, the technical difficulties
involved in the construction of such an instrument, to say nothing of the
problem of playing the monstrosity, appear intimidating; but perhaps we
do not need so many notes in order to improve the ratios of the fifths. Well,
we seem to be asking for a "next best approximation," and this brings us
back to the methods of continued fractions.
Exercises

3.51 It was mentioned in this section that in well-tempering, the ratio of the
fifths is 1.498 307 rather than 1.5 exactly. How was this number obtained?
Hint: See Exercise 3.49.
3.52 Any scale system that improves fifths will also improve fourths (pairs
of notes five half steps apart in the twelve-tone system, with an ideal, but not
actual, frequency ratio of 4 : 3). Why will improving fifths also improve
fourths?
3.53 In music, thirds are used very extensively in producing harmonies.
In the twelve-tone piano, a third can be produced by striking middle C and
the E immediately above it, or any two notes four half steps apart. The
ratio of frequencies in an ideal third would be 5 : 4, or 1.25, but such cannot
be obtained for every third in the twelve-tone system for reasons similar to
those which show that not all fifths can have the perfect ratio of 1.5. Show
why this is so. In the well-tempered twelve-tone piano, what is the actual
ratio of a third?
3.54 In what keys are the forty-eight compositions in the Well-Tempered
Clavichord written?
3.55 You may need both a table of logarithms and a desk calculator for this
problem. Suppose that a piano were constructed with 1200 notes in each
octave. Would the note 702 half steps above middle C produce a harmony
with middle C closer to a perfect fifth than the note 700 half steps above
middle C? Would the former note be better than any other on such a piano
for this purpose?

3.9 IMPROVING THE OCTAVE


Now $2^{7/12}$ is very nearly equal to 1.5; in fact, $2^{7/12}$ is approximately equal
to 1.498 307. What we seek is some possibly larger number, say n, of notes
in an octave, in which a "better" fifth would be (say) m notes up, so that
$2^{m/n}$ would be a better approximation to 1.5 than $2^{7/12}$. As in our baseball
example, it would be both difficult and time-consuming to test all possible
fractions with denominators between 12 and 1000 to find such a better
approximation. Instead we seek rational approximations to the solution x
of the equation $2^x = 1.5$ by finding the continued fraction expansion of x.
Note that it will be unnecessary to find explicitly the decimal expansion of the
solution x in order to do this.
Using the laws of logarithms, we first transform the equation $2^x = 3/2$
into a more tractable one, as follows:
\[
\begin{aligned}
2^x &= 3/2, \\
\log 2^x &= \log (3/2), \\
x \log 2 &= \log (3/2), \\
x &= \frac{\log (3/2)}{\log 2}.
\end{aligned}
\]
We find the continued fraction expansion of x, much as we found the fraction
for $\sqrt{7}$ in Exercise 3.50. Since 3/2 < 2, then also log (3/2) < log 2, so
initially we invert to obtain a fraction larger than 1 to work with. We then
apply to the denominator the methods of Section 3.2, as follows:
\[
x = \frac{\log (3/2)}{\log 2}
  = \cfrac{1}{\cfrac{\log 2}{\log (3/2)}}
  = \cfrac{1}{\cfrac{\log [(3/2)(4/3)]}{\log (3/2)}}
  = \cfrac{1}{1 + \cfrac{\log (4/3)}{\log (3/2)}}.
\]
The fraction in the last denominator is less than 1, so again we doubly
invert it to obtain a number exceeding 1, and proceed as before. By replacing
log (3/2) with log [(4/3)(9/8)], we obtain as our second stage the fraction below:
\[
x = \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{\log (9/8)}{\log (4/3)}}}.
\]
Again, we doubly invert the fraction in the last denominator to obtain a
number larger than 1, but this time we obtain a number between 2 and 3, and
slightly trickier methods are now needed:
\[
\frac{\log (4/3)}{\log (9/8)}
  = \frac{\log [(9/8)^2 (256/243)]}{\log (9/8)}
  = 2 + \frac{\log (256/243)}{\log (9/8)}.
\]
Thus we obtain for the third stage of our continued fraction expansion of x
the following:
\[
x = \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{\log (256/243)}{\log (9/8)}}}}.
\]
The numbers involved get sufficiently large to cause some practical
difficulties even with the aid of a desk calculator in about two more stages,
so we carry the above no further but provide you with the results of our
labor.
x = (0; 1, 1, 2, 2, 3, 1, 5, 2, 23, ...).
The corresponding sequence of partial quotients is
1/1, 1/2, 3/5, 7/12, 24/41,31/53, ....
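
With floating-point logarithms in place of a desk calculator, the opening entries and the partial quotients can be recomputed in a few lines of Python. This is a numerical sketch, not the exact method of the text: the names `expand` and `partial_quotients` are our own, and round-off grows at every inversion, so only the first several entries obtained this way should be trusted.

```python
import math
from fractions import Fraction

def expand(x, n_terms):
    """First n_terms continued fraction entries of the floating-point number x."""
    entries = []
    for _ in range(n_terms):
        whole = math.floor(x)
        entries.append(whole)
        x = 1.0 / (x - whole)        # invert the remainder; round-off grows here
    return entries

def partial_quotients(entries):
    """Successive values (a0; a1), (a0; a1, a2), ... as exact fractions."""
    p_prev, q_prev, p, q = 1, 0, entries[0], 1
    values = []
    for a in entries[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        values.append(Fraction(p, q))
    return values

x = math.log(3 / 2) / math.log(2)    # the solution of 2**x = 3/2
entries = expand(x, 8)
print(entries)
print(", ".join(str(v) for v in partial_quotients(entries)))
```

The exact approach of the text avoids round-off entirely, at the cost of rapidly growing whole numbers inside the logarithms.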
As you recall, the reason that the fraction 7/12 appears in this sequence
is that 7/12 is a fairly good approximation to the solution x of the equation
$2^x = 3/2$. Actually, the value of x is 0.584 962 501 ... , and the difference
between 7/12 and x is 0.001 629 .... This is quite small, and provides us with
a good mathematical reason for using twelve notes in an octave, but we will
have to insert the intermediate fractions in the above sequence to find the
other possibly better approximations. Between 7/12 and 24/41 there are two
intermediate fractions, 10/17 and 17/29. Some arithmetic calculations provide
us with the information that the error between 10/17 and x is 0.003 272 ... ,
so that 10/17 is a poorer approximation to x than 7/12. Thus we cannot
improve the harmonies of fifths by using a seventeen-tone scale. For 17/29
the error is 0.001 244 ... , so that a twenty-nine-tone scale would be slightly
better than the twelve-tone scale, but not by much. It would seem hard to
justify using more than twice as many notes in an octave in order to reduce
the error by less than twenty-five percent.
However, the error between 24/41 and x is much smaller-only
0.000 403 ... , and we leave it to musicians to decide whether reducing the
error in the fifths to less than a quarter of its value in the twelve-tone scale
would justify building instruments with forty-one steps in each octave and
writing music for such instruments. The important idea is this: The theory
of continued fractions guarantees to us that the "next best" number of notes
in an octave, after twelve, is twenty-nine, next after that is forty-one, and so on.
Note finally that there is one intermediate fraction, namely 4/7, between
the two partial quotients 3/5 and 7/12. This suggests that a seven-tone
scale might be reasonably harmonious, and in fact some oriental music is
written in such a system.
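
The expansion and the errors quoted in this section can be checked by machine. The sketch below is not part of the original computation; it assumes exact rational arithmetic and uses the same device as the hand calculation (strip off whole powers, then invert the remaining ratio of logarithms). The names log_ratio_terms and convergents are our own.

```python
from fractions import Fraction
from math import log2

def log_ratio_terms(a, b, count):
    """First `count` entries of the continued fraction of log(a)/log(b).

    Assumes a >= 1 and b > 1; exact rationals avoid any rounding error.
    """
    a, b = Fraction(a), Fraction(b)
    terms = []
    for _ in range(count):
        k = 0
        while a >= b:          # log(a)/log(b) = k + log(a / b**k)/log(b)
            a /= b
            k += 1
        terms.append(k)
        if a == 1:             # the expansion has terminated
            break
        a, b = b, a            # invert what is left, as in the hand computation
    return terms

def convergents(terms):
    """Fractions obtained by cutting the expansion off after each entry."""
    p_prev, q_prev, p, q = 1, 0, terms[0], 1
    result = [Fraction(p, q)]
    for t in terms[1:]:
        p_prev, q_prev, p, q = p, q, t * p + p_prev, t * q + q_prev
        result.append(Fraction(p, q))
    return result

terms = log_ratio_terms(Fraction(3, 2), 2, 7)
print(terms)                   # entries of the expansion of x

x = log2(1.5)                  # floating-point value of x, for the errors only
for frac in convergents(terms)[1:] + [Fraction(10, 17), Fraction(17, 29)]:
    print(frac, abs(float(frac) - x))
```

The last loop prints each fraction discussed above next to its distance from x, so the comparison of the twelve-, seventeen-, twenty-nine-, and forty-one-tone scales can be read off directly.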

Exercises

3.56 It was stated in this section that the continued fraction expansion of the
solution x of 2^x = 3/2 is
x = (0; 1, 1, 2, 2, 3, 1, 5, 2, 23, ... ).
But only the first four numbers were found by computations shown in the
text. Verify that the next two are correct.
3.57 This is a very long problem. If you find the continued fraction ex-
pansion of the solution y of 2^y = 5/4, you will be able to find what number of
notes in an octave would produce better approximations to the "perfect
third" ratio of 1.25 (see Exercise 3.53) than are possible in the usual twelve-
tone scale system. With luck, one of these solutions might also produce
better fifths. Find the next larger number of notes after 12 which will improve
thirds. If possible, find the next larger number of notes after 12 which will
improve both fifths and thirds.
3.58 Think of an argument against the well-tempering system of tuning a
piano.
3.59 This is a highly speculative problem. All you have to do is think about
it, or perhaps discuss it. The human eye perceives just about an "octave"
of the spectrum of radiation frequencies, and it has been customary for
centuries to think of this "octave" as made of seven "notes": red, orange,
yellow, green, blue, indigo, and violet. Why seven? Is there a theory of color
harmony lurking about somewhere? Is the division of the spectrum into
seven colors a cultural phenomenon, or is there a good physical or math-
ematical reason for it? Note that 4/7 is a good approximation to the solution
of 2^x = 3/2. Do any cultures have twelve color names, or five, or forty-one?
3.60 Can a violinist play perfect fifths? Explain your answer.

NOTES AND REFERENCES


Continued Fractions, A. Ya. Khinchine's little book published in 1964 by
the University of Chicago Press in English translation, is still a good reference
work on the theory of continued fractions whose entries are real or positive
whole numbers.
Science and Music, by Sir James Jeans (Dover Publications 1968 edition),
contains many topics on harmony, a thorough discussion of the history of
scale systems, acoustics, and several other topics related to the material of
this chapter. There is a minor error on page 188 of this edition, in which Sir
James has neglected the possibility of a twenty-nine-tone scale system.
Analytic Theory of Continued Fractions, by H. S. Wall (van Nostrand,
1948) contains a wealth of information of a much more abstract sort, most
of which is of use only to students of advanced mathematics.
Horns, Strings, and Harmony, by Arthur H. Benade (Doubleday, 1960)
is another good book, one which should serve to show that this chapter deals
with only one very simple consideration in the mathematical and physical
theory of music.
CHAPTER 4

GROUP THEORY

Group theory is remarkable in that from such simple beginnings so much
can be deduced, and in the wide variety of applications it has found, not
only in almost every other branch of modern mathematics, but also in such
diverse fields as crystallography and quantum mechanics. We shall be
concerned with only a very few major theorems in this chapter, but even
in the introductory stages of group theory you will be able to appreciate its
elegance, as well as learn enough to prove a wide variety of theorems on
your own.

4.1 SOME EXAMPLES OF GROUPS


We first present some examples of mathematical systems which will turn out
to satisfy our subsequent definition of "group." You should ask yourself
what these examples have in common, for it is just these common properties
that will lead us to the abstract definition.

Example 4.1 Consider first the set W of all whole numbers, positive, negative,
and zero, together with the single operation of ordinary addition. This
operation is said to be closed with respect to the set W, for if m and n are
two whole numbers then so is m + n. In other words, we cannot get outside
the set W using the operation of addition.

Moreover, this operation also obeys the associative law: If m, n, and p
are all elements of the set W, then

(m + n) + p = m + (n + p).


Since we are dealing only with the operation of addition, this law is a great
convenience: it permits us to neglect parentheses in many cases, and we shall
do so. To put it another way, the value of an expression such as
m+n+p
is independent of either choice of placement of parentheses, and hence is
unambiguous.
Third, there is one element of W, namely 0, with the property that if m
is any element of W whatsoever, then
0 + m = m = m + 0.
For this reason the number 0 is called an identity element of the set W with
respect to the operation of addition. We must say "an" identity element, for
it is conceivable that there might be others. Also, if we were using a different
operation (such as multiplication) the identity element might well be some
other element of W.
Finally, with the existence of an identity element assured, we may speak
of an inverse of an element of W. In this example, each element of W does
have an inverse; that is, given an element m in W, there does exist at least one
element m' of W such that
m' + m = 0 = m + m'.
The element m' is called an inverse for m; again, it is conceivable that an
element has more than one inverse. In this particular case, m' is of course
just the so-called additive inverse -m of the whole number m.
To summarize, then, as Example 4.1 we have a pair (W, +) consisting
of a set W, a binary operation "+" on W, such that the operation is closed
and associative, W contains an identity element 0 with respect to this opera-
tion, and each element of W has an inverse with respect to this operation.
Example 4.2 For our second example, we begin with an equilateral triangle
in the plane. The elements of the set G that will form our group are going to
be the congruence motions of this triangle. There are a number of different
ways in which an equilateral triangle is congruent to itself, and the motions
that produce such congruences will be the elements of the group, not the
triangle itself; the triangle has been introduced for accounting purposes
only, to keep track of the effect of each congruence motion.
Three such motions are shown and labeled in Fig. 4.1. The motion which
has the effect of rotating the triangle one-third revolution clockwise will be
denoted by S. The motion of rotating the triangle one-third revolution
counterclockwise will be denoted by R, and the motion (if we may be
permitted use of such a term in this last case) of no movement at all will be
denoted by I. If we are prohibited from moving the triangle out of the plane,
these are all the possible motions. Now we have a set containing three rather
unusual elements, the motions I, R, and S; in order to have a group we need
an operation on this set, and so before we consider other possible motions
of the triangle, let us see how we are going to define the necessary operation.

Fig. 4.1 Three congruence motions of an equilateral triangle.

We shall always refer to our group operation as multiplication, except
when we happen to be using ordinary addition as in Example 4.1. To con-
tinue with this analogy, we shall write the product of the two group elements
x and y as x · y or even simply as xy. In the present example, we define the
multiplication of congruence motions in the following manner. Given
motions x and y, the product xy is that motion which has the net effect of
first doing x, then doing y. In Fig. 4.2, we have indicated the product of R
with itself, and we see that RR = S, since the net effect of first doing one
rotation one-third revolution counterclockwise and then another, has the
same effect on the triangle as a single rotation one-third revolution clockwise.
You can also see with the aid of figures similar to Fig. 4.2 that IR = R,
RS = I, and so on.

Fig. 4.2 Products of congruence motions: R · R = S.

If we are allowed to lift the triangle out of the plane in order to perform
congruence motions, three more motions become possible, as shown in
Fig. 4.3. We have called one of these motions by the name A because the
vertex labeled a of the triangle does not move as a consequence of this
motion, one which amounts to rotation of the triangle about the bisector
of the angle with vertex at a. For similar reasons B and C have received their
names.
The important thing to remember about this operation of "multiplica-
tion" of congruence motions is that the motion A does not mean to put the
triangle into the configuration shown as a consequence of the motion A in
Fig. 4.3, but means to rotate the triangle about its vertical axis, regardless of
the names pasted on its vertices. Similarly, the motion R does not mean to
put the triangle with the vertex labeled c at the top, a at the lower left, and b
at the lower right, but simply means to rotate the triangle one-third revolu-
tion counterclockwise.

Fig. 4.3 Congruence motions of an equilateral triangle by rotation about
an angle bisector.

If you wish, you might now cut an equilateral triangle from cardboard
and label its vertices a, b, and c (on both sides of the cardboard), and then
you can discover, for example, that AR = C, BC = S, and so on. We can
summarize such information about this mathematical system by saying that
the set of objects which will form the group consists of the six motions I, R,
S, A, B, and C; the multiplication is defined as above; and a "multiplication
table" for this system is shown below.
    | I  R  S  A  B  C
  --+------------------
  I | I  R  S  A  B  C
  R | R  S  I  B  C  A
  S | S  I  R  C  A  B
  A | A  C  B  I  S  R
  B | B  A  C  R  I  S
  C | C  B  A  S  R  I
This multiplication table is interpreted in the following way: The first
element of the product XY appears in the leftmost column, the second in
the topmost row, and the product XY itself is in the "obvious" place in the
body of the table. With the aid of this table we can examine this mathematical
system for those same four properties we abstracted from Example 4.1, the
example of whole numbers with the operation of ordinary addition.
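
If you would rather not cut out a cardboard triangle, the table can be generated mechanically. The following sketch is our own illustration, not part of the text: it assumes the three positions are numbered 0 (top), 1 (lower left), and 2 (lower right), and that A, B, and C flip the triangle about the bisectors through the top, lower left, and lower right positions respectively. The names MOTIONS, NAME_OF, and product are hypothetical.

```python
# Positions: 0 = top, 1 = lower left, 2 = lower right.
# MOTIONS[name][p] is the position to which the vertex now at p is carried.
MOTIONS = {
    "I": (0, 1, 2),
    "R": (1, 2, 0),   # counterclockwise: top -> lower left -> lower right -> top
    "S": (2, 0, 1),   # clockwise
    "A": (0, 2, 1),   # flip about the bisector through the top position
    "B": (2, 1, 0),   # flip about the bisector through the lower left position
    "C": (1, 0, 2),   # flip about the bisector through the lower right position
}
NAME_OF = {effect: name for name, effect in MOTIONS.items()}

def product(x, y):
    """Name of the motion 'first x, then y'."""
    fx, fy = MOTIONS[x], MOTIONS[y]
    return NAME_OF[tuple(fy[fx[p]] for p in range(3))]

names = "IRSABC"
print("   " + "  ".join(names))          # guide row
for x in names:                          # x is the first factor (guide column)
    print(x + "  " + "  ".join(product(x, y) for y in names))
```

Because each motion is stored as its effect on positions rather than on the labels a, b, c, the code respects the warning above: a motion means the same thing no matter where the labels currently sit.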
First, since each entry in the body of the table is in fact an element of the
set G = {I, R, S, A, B, C}, we have the desired closure property. That is,
if X and Y both belong to the set G, then so does their product XY.
Next, the operation is associative for a simple, though perhaps subtle,
reason: Both (XY)Z and X(YZ) mean, in effect, first do X, then do Y, and
then do Z. This is the interpretation no matter which way the parentheses
are placed, and the net effect on the triangle will be the same in either case.
Hence for all elements X, Y, and Z of the set G, (XY)Z = X(YZ). Thus
the operation is associative.
The appearance of a row identical to the topmost or guide row (that
is, the row with the element I at its left) and the appearance
of a column identical to the left-most column show us that
IX = X = XI
for all elements X of G, and hence the element I of G is an identity for this
operation.
Finally, the appearance of the identity element I at least once in each
row and column assures us of the existence of inverses for each element of G;
to find, for example, the inverse R' of the element R, locate I in the row to the
right of R in the body of the table. Then find the element at the head of that
column; in this case, the element is S, and so we know that RS = I. We
verify also that SR = I, and hence S = R', the desired inverse of R, and
similarly R = S' as well. If the other elements of G are also checked, you
will find that for each congruence motion X, there is a congruence motion
X' such that
X'X = I = XX'.
Thus each element of G has an inverse in G.
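
The table-reading procedures just described are mechanical enough to hand to a computer. In the sketch below (ours, with the hypothetical names ELEMENTS, ROWS, and TABLE) the table of Example 4.2 is entered row by row in the order I, R, S, A, B, C, and closure, the identity, and the inverses are then read off exactly as in the text.

```python
ELEMENTS = "IRSABC"
# One string per row of the table; the row for x lists xI, xR, xS, xA, xB, xC.
ROWS = ["IRSABC", "RSIBCA", "SIRCAB", "ACBISR", "BACRIS", "CBASRI"]
TABLE = {x: dict(zip(ELEMENTS, row)) for x, row in zip(ELEMENTS, ROWS)}

# Closure: every entry in the body of the table is an element of G.
closed = all(TABLE[x][y] in ELEMENTS for x in ELEMENTS for y in ELEMENTS)

# Identity: its row repeats the guide row and its column repeats the guide column.
identities = [e for e in ELEMENTS
              if all(TABLE[e][x] == x == TABLE[x][e] for x in ELEMENTS)]

# Inverses: locate I in the row of x, read the head of that column,
# and confirm that the product in the other order is also I.
inverse = {}
for x in ELEMENTS:
    for y in ELEMENTS:
        if TABLE[x][y] == "I" and TABLE[y][x] == "I":
            inverse[x] = y

print(closed, identities, inverse)
```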
There are thus four properties common to Examples 4.1 and 4.2. These
are just the properties that will lead us to the definition of group. If we denote
the multiplication in G by ".", then we have seen that in both the system
(W, +) and the system (G, .), the operation is closed and associative, an
identity exists with respect to this operation, and each element in the system
has an inverse in the system. However, there is very little else shared by
these two examples. For instance, the operation in (W, +) is commutative:
m + n = n + m is true for all values of m and n in W; but in the system
(G, .), RA = B whereas AR = C. Thus the order in which group elements
are multiplied is important, and we cannot assume without prior knowledge
that a given operation is commutative.
Also, the set W is infinite while the set G is finite. The system (W, +)
has an algebraic or numerical origin, but (G, .) has a geometric origin. Other
than the four properties we have noted that they do have in common, there
is very little that appears to suggest common properties of the two systems
(W, +) and (G, .). However, it is the aim of group theory to discover those
properties which must be held in common by all (or certain classes of)
groups.
Example 4.3 H = {0, 1, 2}. Now you know what set is to be used in this
example. It remains only to define an operation, which amounts to filling in
the following table.
    | 0  1  2
  --+---------
  0 |
  1 |
  2 |
We are using whole numbers for group elements here simply for con-
venience, since there may not be enough letters in our alphabet to cover all
the examples we might wish to consider. But how do we fill in this table?
We certainly cannot use ordinary addition or multiplication, for 2 + 2 = 4,
and 2·2 = 4, and 4 is not an element of H. Under these circumstances
our operation could not be closed. One way we might fill in the table is
shown below.
    | 0  1  2
  --+---------
  0 | 0  1  2
  1 | 1  2  0
  2 | 2  0  1

At this point you may be a little disturbed, if you are asking yourself
what this operation "is." But if you reflect a moment, you will see that you
already know everything there is to know about this operation on H; the
preceding table gives the answer to every conceivable question about the
operation, except possibly the one question that bothers many people at
this point. When they ask what this operation "is," they are really asking
how it came into being. One possible response is that this question belongs
to the realm of philosophy rather than mathematics. But mathematics
should be able to answer reasonable questions dealing with the origin and
existence of mathematical objects. So if you return to Example 4.2, and
consider only the three congruence motions in the set {I, R, S}, you can see
that the multiplication table for this subsystem is as follows.

    | I  R  S
  --+---------
  I | I  R  S
  R | R  S  I
  S | S  I  R

So our multiplication table for H = {0, 1, 2} is really the preceding table
traveling under an alias; we have simply replaced I by 0, R by 1, and S by 2.
So the above question about the origin of Example 4.3 may be answered in
this case by demonstrating a geometric origin for the elements of H and their
multiplication. It is appropriate to mention here that the other examples of
this section, too, as well as all groups, can be thought of as having a geometric
origin in much the above sense.
There does remain the question of checking Example 4.3 to see that the
four properties we are so interested in hold true. Closure, the existence of an
identity, and the existence of an inverse for each element of H are easily
verified by an examination of the multiplication table. And because we have
demonstrated that H is actually of geometric origin, so that the operation
in H may be thought of as combination of congruence motions as in Example
4.2, the operation in H is automatically associative. This is perhaps one of
the easiest ways to demonstrate associativity; without previous knowledge
of the origin of the operation in H, one would otherwise have to check for
associativity by verifying that a rather large number of equations hold true
in H. For example, one would test 0 · (1 · 2) and (0 · 1) · 2 for equality, and
continue with twenty-six other cases. There is, in fact, a test for associativity
of such a multiplication table that involves "looking at the table" in much the
same way that a table can be visually examined for the other three properties,
but it is quite complicated and generally takes longer than almost any
alternative.
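
Tedious as the twenty-seven checks are by hand, they are trivial for a machine. The sketch below is our own illustration (the names TABLE and op are hypothetical); it enters the table given for H and tests every choice of x, y, and z.

```python
from itertools import product

# TABLE[x][y] is the product xy in H = {0, 1, 2}, copied from the table above.
TABLE = [[0, 1, 2],
         [1, 2, 0],
         [2, 0, 1]]

def op(x, y):
    return TABLE[x][y]

triples = list(product(range(3), repeat=3))       # all (x, y, z)
failures = [(x, y, z) for x, y, z in triples
            if op(op(x, y), z) != op(x, op(y, z))]

print(len(triples), "cases checked;", len(failures), "failures")
```

The same loop works for any finite table, though the number of cases grows as the cube of the number of elements.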
We finally present, very briefly, a few more examples of groups. As
you have seen, it is really sufficient to list the elements of the set and give the
multiplication table. We will usually do this and no more in our examples.
Later, when you are making conjectures about properties of groups, you may
find it useful to refer to these examples to test your conjectures.
Example 4.4 J = {0, 1, 2, 3}.
    | 0  1  2  3
  --+------------
  0 | 0  1  2  3
  1 | 1  2  3  0
  2 | 2  3  0  1
  3 | 3  0  1  2
Example 4.5 K = {e, a, b, c}.
e a b c
e e a b c
a a e c b
b b c e a
c c b a e
Example 4.6 Z_n = {0, 1, 2, 3, ... , n - 1}. xy is the remainder upon
division of x + y by n.
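
A short sketch (ours; the function name zn_table is hypothetical) produces the multiplication table of Z_n for any n. Taking n = 3 and n = 4 gives tables that can be compared with those of Examples 4.3 and 4.4.

```python
def zn_table(n):
    """Row x, column y holds the remainder of x + y upon division by n."""
    return [[(x + y) % n for y in range(n)] for x in range(n)]

for n in (3, 4):
    print("Z_%d" % n)
    for row in zn_table(n):
        print("  ", *row)
```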
Example 4.7 E = {... , - 4, - 2, 0, 2, 4, ... }. The operation is ordinary
addition.
Example 4.8 R+ is the set of all positive real numbers. The operation is
ordinary multiplication.
Example 4.9 Let D be a circular disk of radius 1 in the plane. The elements
of the group L are to be all possible congruence motions of this disk, in-
cluding infinitely many different rotations and infinitely many different ways
of turning the disk over about a straight line through its center. We could
easily name these congruence motions; for example, we could denote by
R_30 the motion of rotating the disk 30 degrees counterclockwise, and by
T_90 the motion of flipping the disk over about a line making a 90 degree
angle with the x-axis. The operation on the elements of L will be the same
as in the case of the congruence motions of the equilateral triangle, Example
4.2; namely, the product (R_30) · (T_90) will be that motion which has the net
effect of first doing R_30, then T_90, to the disk. In this case the product is T_75,
as you can easily verify. Again, for the same reasons as in Example 4.2,
this operation is associative. It is only closure which is not so obvious.
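
One way to make closure plausible is to compute with the motions. The sketch below is our own model, not the text's: it assumes the disk is centered at the origin and records each motion as a pair (phase, flipped), standing for the map that sends the point z of the complex plane to e^(i·phase)·z, or to e^(i·phase)·conj(z) when the disk has been turned over. A flip about the line at angle t then has phase 2t. The names rotation, flip, then, and name are hypothetical.

```python
def rotation(deg):
    """Counterclockwise rotation through `deg` degrees."""
    return (deg % 360, False)

def flip(deg):
    """Turning the disk over about the line at `deg` degrees to the x-axis."""
    return ((2 * deg) % 360, True)

def then(x, y):
    """The motion 'first x, then y'."""
    (p1, f1), (p2, f2) = x, y
    # A flip applied second conjugates, and so negates, the earlier phase.
    phase = p2 - p1 if f2 else p2 + p1
    return (phase % 360, f1 != f2)

def name(motion):
    phase, flipped = motion
    return f"T_{phase / 2:g}" if flipped else f"R_{phase:g}"

print(name(then(rotation(30), flip(90))))   # first R_30, then T_90
print(name(then(flip(90), rotation(30))))   # first T_90, then R_30
```

Since `then` always returns a pair of the same kind, every product of rotations and flips is again a rotation or a flip, which is the closure property in this model.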
Example 4.10 Let a regular tetrahedron in three-dimensional space have its
vertices labeled as in Fig. 4.4. The set T is to consist of all the congruence
motions of this tetrahedron. We will name these motions in such fashion
as to make the effect of each easy to remember, and it will turn out that our
labeling method also makes computation of the product of two such motions
very simple.
First, consider the motion of rotation of the tetrahedron one-third
revolution counterclockwise about the altitude passing through the vertex
labeled 4. This has the effect of moving the vertex in position 1 to position 2,
the vertex in position 2 to position 3, and the vertex in position 3 to position 1.
The vertex labeled 4 remains fixed. Hence we abbreviate this motion by
(1 2 3), where it is the order of the symbols that tells us what is going on:
By the motion (1 2 3) we mean that "1 goes to 2, 2 goes to 3, and (because
here the parentheses close up) 3 goes to 1." The omission of 4 from the
symbol (1 2 3) means that the vertex at 4 is left fixed. The symbol (1 4)(2 3)
represents another possible congruence motion, the one in which the vertices
at 1 and 4 change places while the vertices at 2 and 3 also change places.
Caution: There are some symbols that seem meaningful, such as (1 2), which
do not represent possible congruence motions of the tetrahedron.

Fig. 4.4 A regular tetrahedron with vertex positions labeled.

As in Examples 4.2 and 4.9, the multiplication we use on this set T of all
possible congruence motions of the tetrahedron goes as follows: The product
xy of two such motions is that congruence motion which has the net effect of
first doing x, then doing y to the tetrahedron. As we mentioned before, our
method of labeling these congruence motions makes computation of products
particularly simple. If, for example, you wish to find the product

(1 2 3)(1 4 3),

all you have to do is follow each vertex through each of these motions to find
out where it finally ends up. Take the vertex in position 1. After applying
(1 2 3) to the tetrahedron, this vertex lies in position 2. Then, after applying
(1 4 3), the vertex in position 2 does not move. Hence the net effect of the
product
(1 2 3)(1 4 3)

on the vertex in position 1 is to move that vertex to position 2. So we can
begin writing down the answer; the product must look like the expression
below:
(1 2 3)(1 4 3) = (1 2

In the incomplete symbol on the right-hand side above, we next want to
write down the symbol following the 2; since the product is to tell us where
the vertex in position 2 ends up, we follow that vertex through the two motions
(1 2 3) and (1 4 3), and find that the vertex in position 2 first moves to
position 3, and then, under the action of (1 4 3), is moved to position 1.
We indicate this by closing up the parentheses:
(1 2 3)(1 4 3) = (1 2).
But our work is not yet complete, for it is quite possible that the vertices
in positions 3 and 4 are also moved by the product (1 2 3)(1 4 3), so we
follow each of them through the two motions (1 2 3) and (1 4 3), and finally
discover that the desired answer is
(1 2 3)(1 4 3) = (1 2)(3 4).
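
The vertex-following procedure just carried out by hand can be sketched in a few lines. This is our own illustration, assuming the four positions are numbered 1 through 4 and that the cycles within a single motion have no position in common; the names cycles_to_map and product are hypothetical.

```python
def cycles_to_map(cycles, points):
    """Dictionary sending each position to the position it is moved to."""
    mapping = {p: p for p in points}            # omitted positions stay fixed
    for cycle in cycles:
        for i, p in enumerate(cycle):
            mapping[p] = cycle[(i + 1) % len(cycle)]   # last entry closes up
    return mapping

def product(x, y, points=(1, 2, 3, 4)):
    """Cycle form of the motion 'first x, then y'."""
    fx, fy = cycles_to_map(x, points), cycles_to_map(y, points)
    net = {p: fy[fx[p]] for p in points}        # follow each vertex through both
    cycles, seen = [], set()
    for start in points:
        if start in seen or net[start] == start:
            continue                            # already recorded, or not moved
        cycle, p = [], start
        while p not in seen:
            seen.add(p)
            cycle.append(p)
            p = net[p]
        cycles.append(tuple(cycle))
    return cycles

print(product([(1, 2, 3)], [(1, 4, 3)]))
```

An empty list returned by `product` stands for the motion I, in which no vertex is moved.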
It turns out that the tetrahedron will admit twelve congruence motions,
which we list below; since each product can be computed directly from the
symbols involved, it is as unnecessary to give the multiplication table for T
as it would be to give the complete multiplication table for the set R+ of
Example 4.8. All one needs to know is that
T = {I, (1 2 3), (1 3 2), (1 2 4), (1 4 2), (1 3 4), (1 4 3), (2 3 4),
(2 4 3), (1 2)(3 4), (1 3)(2 4), (1 4)(2 3)},
where I stands for the "motion" in which no vertex is moved. As in the
previous example, only the closure property is not so obvious.
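
Closure of T may not be obvious, but it can be confirmed by brute force over all 144 products. The sketch below is ours (the names as_map and then are hypothetical); it stores each motion as a tuple m in which m[p] is the position that position p is moved to, with index 0 unused.

```python
def as_map(cycles):
    """Tuple m with m[p] = destination of position p (index 0 is a dummy)."""
    m = [0, 1, 2, 3, 4]
    for cycle in cycles:
        for i, p in enumerate(cycle):
            m[p] = cycle[(i + 1) % len(cycle)]
    return tuple(m)

def then(x, y):
    """The motion 'first x, then y'."""
    return tuple(y[x[p]] for p in range(5))

# The twelve symbols listed for T, with [] standing for I.
T = {as_map(c) for c in [
    [], [(1, 2, 3)], [(1, 3, 2)], [(1, 2, 4)], [(1, 4, 2)],
    [(1, 3, 4)], [(1, 4, 3)], [(2, 3, 4)], [(2, 4, 3)],
    [(1, 2), (3, 4)], [(1, 3), (2, 4)], [(1, 4), (2, 3)],
]}

closed = all(then(x, y) in T for x in T for y in T)
swap = as_map([(1, 2)])          # the symbol (1 2) from the earlier caution

print(len(T), closed, swap in T)
```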
To summarize, in all ten examples we have a set, together with an opera-
tion, such that the operation is both closed and associative, the set contains
an identity with respect to this operation, and each element of the set has an
inverse with respect to this operation. Any such mathematical system is
known as a group. And you now have at your disposal an infinite number of
different examples, because Example 4.6 is really infinitely many examples.
On the other hand, Examples 4.4 and 4.3 are really special cases of Example
4.6, and in some sense Example 4.1 is really the same as Example 4.7.

Exercises

4.1 Verify that Example 4.4 is an example of a group. Note: The associa-
tivity provides the only difficulty. This can be established by "realization"
of the example as a group of congruence motions, by direct computation,
or by use of the Euclidean algorithm: If m and n are two natural numbers,
then there exist unique integers q and r such that n = qm + r and 0 ≤ r < m.
Some careful use of this theorem will produce a proof that the group given
as Example 4.4 is associative, and in fact can also be used for Example 4.6.
4.2 You saw that Example 4.3 was really in some sense the "same" as the
group {I, R, S} of rotations of an equilateral triangle. The sameness comes
from the fact that it is possible to rename the elements of one group and
thus obtain the other group (the multiplication table, when renamed, must
also correspond to the multiplication table of the other group). Show that,
as stated at the end of this section, the groups of Examples 4.1 and 4.7 are
really the "same" in the above sense.
4.3 Even though the groups of Examples 4.4 and 4.5 have the same number
of elements, they are not the same in the sense of the previous exercise.
Why not?
4.4 Let a square be given in the two-dimensional plane. Name each of the
eight congruence motions of this square according to some reasonable
scheme, and exhibit the multiplication table for this system. (As usual, the
product of two motions is to be the motion that has the net effect of first
doing one, then the other.) Show that this system is a group.
4.5 Let (G, .) be an arbitrary group. Show that G cannot contain two
different identity elements; that is, if e and f are two elements of G such that

ex = x = xe,
and
fx = x = xf,
for all elements x of G, then e = f.
4.6 Let (G, .) be an arbitrary group. Show that each element of G has only
one inverse; that is, show that if x is an element of G, and y and z are elements
of G each of which acts like an inverse for x, then y = z. Your first line of
the proof might well be:
"Let x be an element of G, and suppose that y and z are elements of G
such that
yx = e = xy,
and
zx = e = xz,

where e denotes the identity of G."


4.7 Show that if x is an element of the group G such that xx = x, then
x = e, the identity of G.
4.8 By virtue of Exercise 4.6, we may refer to "the" (rather than "an")
inverse of a group element. Let us denote the inverse of x by x^{-1}. Prove
that if x is an element of the group G, then (x^{-1})^{-1} = x.
4.9 Let G be a group and w an element of G such that

wx = x

for some (not necessarily all) x in G. Prove that w must be the identity of G.
4.10 Let M = {0, 1}. Find all possible ways to fill in the multiplication
table shown below so as to make M into a group.
    | 0  1
  --+------
  0 |
  1 |

4.11 Let H = {0, 1, 2}. Find all possible ways to fill in the multiplication
table shown below so as to make H into a group. For simplicity, you should
assume that in any case, 0 is to be the identity for the resulting group.
    | 0  1  2
  --+---------
  0 |
  1 |
  2 |

4.12 Verify that the system shown in Example 4.5 actually is a group.
4.13 Of the examples considered in this section, some have a commutative
operation (xy = yx is always true) and some do not (xy = yx is sometimes
false). List those groups for which the operation is commutative and those
for which it is not.
4.14 See Exercise 4.13. Is there a way that one can decide if a group opera-
tion is commutative by simply looking at the group's multiplication table?
4.15 Write down the complete multiplication table for the group T given
in Example 4.10.
4.16 Show that, in the multiplication table for a group, each element of the
group appears exactly once in each row and column (excluding, of course,
the guide row and guide column).
4.17 See Exercise 4.16. Fill in the multiplication table shown below in such
a way that 0 is an identity element for the operation, each element appears
exactly once in each row and column, but the resulting system is not a group.
What must go wrong?
    | 0  1  2  3  4
  --+---------------
  0 |
  1 |
  2 |
  3 |
  4 |
4.18 Besides ordinary addition and multiplication, there are numerous other
associative operations on the set R of all real numbers. Verify that the
operation # defined by
x # y = x + y + 1
for all real numbers x and y is associative. See if you can find three other
associative operations on R. We ask that these operations be closed, but
not necessarily so chosen as to make R into a group.
4.19 An operation on a set S is said to be cancellative provided that both the
following are true:

a) If ax = ay, then x = y.
b) If xa = ya, then x = y.
Show that every group is cancellative.
4.20 An operation on a set S is said to be cross-cancellative provided that,
whenever xa = ay, then x = y. Give an example of a group in which the
cross-cancellative law does not hold.

4.2 SUBGROUPS
Consider the subset
U = {I, (1 2)(3 4), (1 3)(2 4), (1 4)(2 3)}

of the group T given in Example 4.10. One would generally expect a subset
of a group not to be a group in its own right if the same operation were used;
one thing that might go wrong would be that the subset does not contain an
identity (which would have to be the same as the identity of the original
group). However, the subset U shown above does contain the identity of T.
Another difficulty could be that the subset is not closed under the operation
used in the group. We test U for closure by writing down its multiplication
table as follows:
             | I           (1 2)(3 4)  (1 3)(2 4)  (1 4)(2 3)
  -----------+------------------------------------------------
  I          | I           (1 2)(3 4)  (1 3)(2 4)  (1 4)(2 3)
  (1 2)(3 4) | (1 2)(3 4)  I           (1 4)(2 3)  (1 3)(2 4)
  (1 3)(2 4) | (1 3)(2 4)  (1 4)(2 3)  I           (1 2)(3 4)
  (1 4)(2 3) | (1 4)(2 3)  (1 3)(2 4)  (1 2)(3 4)  I
We see immediately from this table that not only is the operation closed
with respect to U, but also each element of U has an inverse in U. Since the
operation is automatically associative on U, U together with the operation
previously defined for T becomes a group in its own right. Under such
circumstances U is said to be a subgroup of the group T, and the formal
definition is given next.
H is said to be a subgroup of the group G provided that H is a subset
of G and (H, .) is a group, where the operation is the same as that in the
group G.
Since the operation of G, when used on a subset H of G, is automatically
associative, in order to show that H is a subgroup of the group G it is only
necessary to show that
a) H is a subset of G,
b) the identity e of G belongs to H, and
c) if x and y are elements of H, then so are xy and x^{-1}.
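
The three conditions translate directly into a test that can be run on any finite example. The sketch below is our own; for concreteness it assumes the group Z_12 of Example 4.6, and the names is_subgroup, add, and neg are hypothetical.

```python
def is_subgroup(H, G, op, e, inv):
    """Check conditions a), b), and c) for a finite subset H of a group G."""
    H, G = set(H), set(G)
    return (H <= G                                        # a) H is a subset of G
            and e in H                                    # b) the identity is in H
            and all(op(x, y) in H for x in H for y in H)  # c) products stay in H
            and all(inv(x) in H for x in H))              # c) inverses stay in H

n = 12
G = range(n)

def add(x, y):
    return (x + y) % n       # the operation of Z_12

def neg(x):
    return (-x) % n          # the inverse of x in Z_12

print(is_subgroup({0, 3, 6, 9}, G, add, 0, neg))
print(is_subgroup({0, 1, 2}, G, add, 0, neg))
```

The second subset contains the identity but fails condition c), since the product of 2 with itself is 4, which lies outside the subset.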

As mentioned in Exercise 4.8, since each group element x has one and
only one inverse, we may refer to that inverse with the notation x^{-1}. Note
that a group G is always a subgroup of itself, and that the subset {e} of G
consisting of the identity e of G alone is also always a subgroup of G. These
two subgroups are called improper subgroups of G; all others, if any, are
called proper subgroups of G.
The order o(G) of a group G is just the number of elements in the set G.
If this set should be infinite, we say that the group G has infinite order. For
example, o(T) = 12, where T is the group of Example 4.10; the order of its
subgroup U discussed previously is 4. The group (W, +) of whole numbers
with the operation of addition provides an example of a group of infinite
order. A group consisting of a single element (which must necessarily act as
its identity) is the only possible way one can have a group of order 1. For
each natural number n, Example 4.6 provides an example of a group of order
n, namely Z_n. So there is at least one group of each given order. There may be
more than one; see Exercise 4.3.
If you now check some examples of groups, such as the ones previously
given in this chapter, you will discover no exception to the following rule:
If G is a finite group and H is a subgroup of G, then o(H) divides o(G)
evenly. For example, the group T of Example 4.10 has subgroups of order
1, 2, 3, 4, and 12, and all of these numbers are divisors of 12, the order of T
itself. This important principle is known as Lagrange's Theorem, and be-
cause it has so many useful consequences we now provide a proof.
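
Before reading the proof, you may like to see the rule confirmed for T by exhaustive search. The sketch below is ours and rests on two labeled assumptions: the positions 1 through 4 are renumbered 0 through 3, and T is taken to be the set of even permutations of the four positions, which is the same set of twelve motions as the list in Example 4.10 (the identity, eight symbols of the form (a b c), and three of the form (a b)(c d)). The names then, inverse, and is_even are hypothetical.

```python
from itertools import combinations, permutations

def then(x, y):
    """The motion 'first x, then y'; x[p] is where position p is moved."""
    return tuple(y[i] for i in x)

def inverse(x):
    inv = [0] * len(x)
    for p, q in enumerate(x):
        inv[q] = p                      # undo the move of p to q
    return tuple(inv)

def is_even(p):
    """True when p has an even number of out-of-order pairs."""
    pairs = combinations(range(len(p)), 2)
    return sum(p[i] > p[j] for i, j in pairs) % 2 == 0

identity = (0, 1, 2, 3)
T = [p for p in permutations(range(4)) if is_even(p)]
others = [p for p in T if p != identity]

orders = []
for size in range(len(others) + 1):
    for rest in combinations(others, size):
        H = set(rest) | {identity}      # every subgroup contains the identity
        if all(then(x, y) in H and inverse(x) in H for x in H for y in H):
            orders.append(len(H))

print(sorted(orders))
print(all(len(T) % m == 0 for m in orders))
```

The search examines all 2048 subsets of T that contain the identity, so it is practical only for very small groups; the theorem is what lets us dispense with it.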
Theorem 4.1 (Lagrange's Theorem) If G is a finite group of order n and H
is a subgroup of G of order m, then m | n.
Proof. The notation m | n is just a convenient abbreviation for the phrase
"m divides evenly into n." Note that m | n is true if and only if there exists
a whole number k such that n = mk; that is, if n is an integral multiple of m.
So let G be the group of order n, and let H be a subgroup of G of order m.
Now if x is any element of G whatsoever, we can form what is called a
left coset of H in G, obtained by multiplying each element of H on the left
by x, and thus obtaining a subset of G. Our notation for the left coset of
H by x in G will be xH, and an alternate way of stating the above definition is
xH = {xh | h ∈ H}.
We now establish the following things about the left cosets of H in G:
a) H is a left coset of H in G.
b) Any two left cosets of H in G contain the same number of elements.
c) Any two left cosets of H in G are either the same, or else have no elements
of G in common.
d) Each element of G belongs to some left coset of H in G.
Consequently, the left cosets of H in G actually have the effect of chopping
G up into a number, say k, of nonoverlapping subsets, each of which has the
same number of elements as H itself, namely m. Hence n = mk, and hence
m | n, as we wish to show. In the next set of exercises, outlines of the methods
of establishing the four propositions above will be given, with the details
left for you to provide. Contingent upon these four exercises, the proof of
LaGrange's Theorem will be completed.
To illustrate how this proof works in a special case, let us examine what
happens if we use this coset procedure on the group G of Example 4.2.
Now o(G) = 6, so we want to show that any subgroup of G has order 1, 2, 3,
or 6. Let us use for the subgroup H, the set H = {I, R, S}. Pretend you do
not immediately see that o(H) = 3, one of the divisors of o(G).
First we list the left cosets of H in G as follows:
IH = {II, IR, IS} = {I, R, S}.
RH = {RI, RR, RS} = {R, S, I}.
SH = {SI, SR, SS} = {S, I, R}.
AH = {AI, AR, AS} = {A, C, B}.
BH = {BI, BR, BS} = {B, A, C}.
CH = {CI, CR, CS} = {C, B, A}.
Note how each of the four propositions we need to establish for the proof
of LaGrange's Theorem is illustrated. First, H = {I, R, S} is indeed one
of the left cosets of H in G. Second, any two of the left cosets of H in G
indeed contain the same number of elements. Third, any two left cosets of
H in G are either the same or else have nothing in common. Finally, every
element of G appears in some left coset of H in G. Since there are exactly
two left cosets of H in G in this example, the number of elements in G must
be double the number in each coset. In particular, the number of elements of
G must be double the number in its subgroup H, and so since o(G) = n is
an integral multiple of o(H) = m, we see that m | n is true.
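The coset procedure can also be carried out mechanically. The following Python sketch is an illustration only. Since the operation table of Example 4.2 is not reproduced here, it assumes a stand-in group of order 6, namely the six permutations of {0, 1, 2} under composition, with the three cyclic shifts playing the role of H. The helper names compose and left_coset are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

# Assumed stand-in for a group of order 6: all permutations of {0, 1, 2}.
G = list(permutations(range(3)))

# The three cyclic shifts form a subgroup H of order 3.
H = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]

def left_coset(x, H):
    """The left coset xH = {xh | h in H}, stored as a frozenset."""
    return frozenset(compose(x, h) for h in H)

# Collecting the cosets in a set discards repeated cosets automatically.
cosets = {left_coset(x, H) for x in G}
k = len(cosets)

h_is_a_coset = frozenset(H) in cosets                    # proposition (a)
same_size = all(len(c) == len(H) for c in cosets)        # proposition (b)
covers_G = set().union(*cosets) == set(G)                # proposition (d)
# Together with (b) and (d), equal totals force the cosets to be disjoint (c).
disjoint = sum(len(c) for c in cosets) == len(G)

print(k, len(G) == len(H) * k)
```

Here k counts the distinct left cosets, and the final line confirms n = mk for this particular choice of G and H. The sketch checks the four propositions for one example; it does not prove them.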
Note that LaGrange's Theorem tells us that a group of order 30 (for
example) can have subgroups only of order 1, 2, 3, 5, 6, 10, 15, and 30, but
not that such subgroups must exist. The group T of Example 4.10 has order
12 but no subgroup of order 6, even though 6 | 12. So when you use
LaGrange's Theorem, be sure to use it with care.
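A brute-force search makes this failure of the converse concrete. The sketch below is a hedged illustration: the group T of Example 4.10 is not reproduced here, so the code assumes a stand-in group of order 12, the even permutations of {0, 1, 2, 3} under composition. The helper names are hypothetical. It relies on the fact that a nonempty finite subset closed under the operation is a subgroup.

```python
from itertools import combinations, permutations

def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

# Assumed stand-in for a group of order 12: the even permutations of 4 symbols.
A4 = [p for p in permutations(range(4)) if is_even(p)]
identity = tuple(range(4))

def is_subgroup(S):
    """In a finite group, a nonempty subset closed under the operation is a subgroup."""
    return all(compose(a, b) in S for a in S for b in S)

def subgroup_orders(G, e):
    """Return the set of sizes for which G has at least one subgroup."""
    others = [g for g in G if g != e]
    orders = set()
    for size in range(1, len(G) + 1):
        if len(G) % size != 0:   # sizes ruled out by LaGrange's Theorem
            continue
        for rest in combinations(others, size - 1):
            S = frozenset(rest) | {e}
            if is_subgroup(S):
                orders.add(size)
                break
    return orders

print(sorted(subgroup_orders(A4, identity)))  # [1, 2, 3, 4, 12]
```

For this stand-in group the divisor 6 is missing from the output, even though 6 divides 12.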
Suppose that G is a group and g is one of its elements. By g^1 we
mean g. Also, by g^2 we mean gg, by g^3 we mean ggg, and so on; so if n is a
natural number, g^n is just an abbreviation for the product of g^{n-1} and g.
In analogy to calling our group operation multiplication, and speaking of gg
as the product of g and g, we call n an exponent in the expression g^n, and
call g^n the nth power of g.
We can also define negative powers of g. If n is a natural number, then
by g^{-n} we mean (g^{-1})^n. Finally, we adopt the convention that g^0 = e, the
identity of G.
With these definitions of exponentiation, certain laws of exponents hold;
namely,
a) g^m g^n = g^{m+n},
b) (g^m)^n = g^{mn}.

However, as a general rule, only those laws of exponents hold which involve
but one element of G; it is not in general true that if g and h are elements of
G, then (gh)^n = g^n h^n. There is one very useful law of exponents, though,
that does hold when two elements of G are involved:
c) (gh)^{-1} = h^{-1} g^{-1}.

One item worth remembering is that the two laws (a) and (b) above hold for
all integral values of m and n, positive, negative, and zero; for example, from
(b) one can derive that

(g^n)^{-1} = g^{-n} = (g^{-1})^n

for all integral values of n.
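These laws can be tested numerically in a small noncommutative group. The following Python sketch is only an illustration under stated assumptions: group elements are modeled as permutations of {0, 1, 2} written as tuples, and compose, inverse, and power are hypothetical helper names. It checks laws (a) and (b) over a range of integer exponents, confirms law (c), and exhibits a pair g, h for which (gh)^2 differs from g^2 h^2.

```python
def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the permutation that undoes p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def power(g, n):
    """g^n for any integer n, with g^0 = e and g^(-n) = (g^(-1))^n."""
    base = g if n >= 0 else inverse(g)
    result = tuple(range(len(g)))  # the identity permutation
    for _ in range(abs(n)):
        result = compose(result, base)
    return result

g = (1, 0, 2)      # interchanges 0 and 1
h = (0, 2, 1)      # interchanges 1 and 2
k = compose(g, h)  # an element of order 3

exponents = range(-4, 5)
law_a = all(compose(power(k, m), power(k, n)) == power(k, m + n)
            for m in exponents for n in exponents)
law_b = all(power(power(k, m), n) == power(k, m * n)
            for m in exponents for n in exponents)
law_c = inverse(compose(g, h)) == compose(inverse(h), inverse(g))

# The two-element "law" (gh)^2 = g^2 h^2 fails for this g and h.
two_element_law = power(compose(g, h), 2) == compose(power(g, 2), power(h, 2))

print(law_a, law_b, law_c, two_element_law)  # True True True False
```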


Suppose we were to pick an element g from a group G, and examine all
positive powers of g. For example, in our group G of Example 4.2, the set
of all positive powers of R is
{R^1, R^2, R^3, R^4, R^5, R^6, R^7, ...} = {R, S, I, R, S, I, R, ...}.
As another example, in the group R+ of Example 4.8, the set of all positive
powers of the element 3 is
{3^1, 3^2, 3^3, 3^4, 3^5, 3^6, 3^7, ...} = {3, 9, 27, 81, 243, 729, 2187, ...}.
When we consider the set of all positive powers of the
element g of an arbitrary group G, the set
{g^1, g^2, g^3, g^4, g^5, g^6, g^7, ...}

may or may not contain e, the identity of G. In our first example above, the
set does contain the identity; R^3 = I, the identity of the group of congruence
motions of an equilateral triangle. In the second example, in the group R+,
no positive power of 3 is equal to 1, the identity of R+.
If no positive power of the element g in G is the identity, then g is said
to have infinite order. If some positive power of g is the identity, then there is
a least such positive power, and this exponent is called the order of g. So in
the above two examples, the order of R is 3, and the element 3 of R+ has
infinite order. It should be clear that in a finite group, each element has
finite order; an infinite group may contain only one element of finite order
(the identity), or may consist entirely of elements of finite order, or may even
be mixed.
However, in a finite group, the set consisting of all positive powers of an
element turns out to be a subgroup; for instance, in the first example above,
the set of all positive powers of R is {I, R, S}, which we have already seen is
a subgroup of the group G of Example 4.2. Moreover, this subgroup has
order 3, the same as the order of the element R which generates it. This is
no coincidence; if a group element g has finite order m, then the subset

{g^1, g^2, g^3, ..., g^m}

always turns out to be a subgroup of order m; and if the group involved is
finite of order n, then m | n by LaGrange's Theorem. Consequently, we have
another theorem, with details of the proof left for the exercises.

Theorem 4.2 If G is a finite group of order n and g is an element of G of order
m, then m | n.

In summary, the order of each subgroup of a group of order n, and the
order of each element of a group of order n, must always be a divisor of n.
These two facts will be found very useful in subsequent exercises.
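The order of an element, and the subgroup formed by its powers, can be computed directly from the definition. The Python sketch below is an illustration under assumptions: it uses the six permutations of {0, 1, 2} as a stand-in for a group of order 6, and the names order and powers are hypothetical helpers. It checks, for each element, that the set of its first m powers has exactly m elements, is closed under the operation, and that m divides the order of the group.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def order(g):
    """The least positive n with g^n = e."""
    e = tuple(range(len(g)))
    current, n = g, 1
    while current != e:
        current = compose(current, g)
        n += 1
    return n

def powers(g):
    """The set {g^1, g^2, ..., g^m}, where m is the order of g."""
    result, current = set(), g
    for _ in range(order(g)):
        result.add(current)
        current = compose(current, g)
    return result

# Assumed stand-in for a group of order 6.
G = list(permutations(range(3)))

orders = sorted(order(g) for g in G)
print(orders)  # [1, 2, 2, 2, 3, 3]

for g in G:
    P = powers(g)
    assert len(P) == order(g)                           # the subgroup has order m
    assert all(compose(a, b) in P for a in P for b in P)  # closed under the operation
    assert len(G) % order(g) == 0                       # m divides n (Theorem 4.2)
```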

Exercises

4.21 Find three different subgroups of the group (W, +) of Example 4.1.
Does this group contain any subgroups of finite order?
4.22 Find all the subgroups of the group G of Example 4.2. Give the order
of each subgroup of G and the order of each element of G.
4.23 Does the group G of Example 4.2 consist of all positive powers of some
one of its elements? Why?
4.24 Does the group H of Example 4.3 consist of all positive powers of some
one of its elements? Explain.
4.25 Give the order of each element of the group J of Example 4.4, and the
order of each element of the group K of Example 4.5. Does this information
provide a possible method of working Exercise 4.3?
4.26 What is the order of the identity e of a group G? Do any other elements
of G have the same order as e?
4.27 Does the group L of Example 4.9 contain elements of finite order other
than its identity? How many? Does the group L of Example 4.9 contain any
elements of infinite order? How many?
4.28 Does the group L of Example 4.9 contain any subgroups of order two?
How many? Does L contain a subgroup of order three? Four? Any natural
number? Does L contain any subgroups of infinite order in addition to L
itself?
4.29 Give the order of each element of the group T of Example 4.10. Is
every divisor of o(T) = 12 represented among these numbers?
4.30 Let G be a group and g an element of G of order n. Show that, for each
element x of G, x^{-1}gx also has order n. Warning: It is not generally true
that (x^{-1}gx)^n = (x^{-1})^n g^n x^n. Another warning: Just because y^n = e, it does
not immediately follow that y has order n.
4.31 Find an example of a group G in which the equation (gh)^2 = g^2 h^2
does not hold for all elements g and h of G.
4.32 Let G be a group in which, for each two elements g and h of G, (gh)^2 =
g^2 h^2. Prove that G must be commutative.
4.33 Let G be a group with identity e such that x^2 = e for all elements x in
G. Prove that G must be commutative.
4.34 Let G be a group with subgroups H and K. Prove that H ∩ K is also
a subgroup of G.
4.35 Can a group G contain exactly one element of order 3?
4.36 Let G be a group and a and b elements of G. Suppose that the order of
ab is finite. Prove that ba must then have the same order as ab.
4.37 Let G be a group and a and b elements of G, neither the identity of G.
Suppose that a^5 = e and aba^{-1} = b^2. What is the order of b?
4.38 Let G be a commutative group and let H be the set of all elements x of
G such that x^2 = e. Prove that H is a subgroup of G.
4.39 Prove that if a, b, and c are elements of a finite group, then the elements
abc, bca, and cab have the same order.
4.40 Find all the subgroups of the group T of Example 4.10. What is the
order of each of these subgroups? Is every divisor of o(T) = 12 represented
among these orders?
4.41 This is how to show part (a) of the proof of LaGrange's Theorem:
If G is a group and H is a subgroup of G, then H itself is a left coset of H in
G. It suffices to find an element x of G such that xH = H. What is a reason-
able candidate for x? Show that your candidate works.
4.42 This is how to show part (b) of the proof of LaGrange's Theorem:
Any two cosets of the subgroup H in the finite group G contain the same
number of elements. Let xH and yH be two left cosets of H in G (of course,
x and y are elements of G). Then a typical element of xH has the form xh
where h is some element of H. Let p = yx^{-1}. Simplify pxh. Show that
pxh is an element of yH. Show that the set yH can be obtained by applying
p as a left multiplier to each element of xH. Show that this transformation
of xH into yH by application of p as a left multiplier actually sets up a one-
to-one correspondence between the elements of xH and yH. It follows that
xH and yH have the same number of elements.
4.43 This is how to show part (c) of the proof of LaGrange's Theorem:
Any two left cosets of the subgroup H in the group G are either the same,
or else have no elements of G in common. Let xH and yH be two such cosets.
If xH and yH have no elements in common, we have no more to prove.
Suppose then that xH and yH have some element of G, say g, in common.
Then g must have the form xh_1 for some element h_1 of H; but g must also
have the form yh_2 for some element h_2 of H. Hence xh_1 = yh_2. From this
equation, derive the equation y^{-1}x = h_2 h_1^{-1}. Then show that y^{-1}x must
be an element of H. Table this result for a while.
Now, to show that xH = yH, choose a typical element, say xh (h ∈ H)
of xH, and prove that xh must belong also to yH. Use the fact that (y^{-1}x)H =
H, which follows from the fact that y^{-1}x ∈ H (why?). Of course, a similar
proof can be used to show that each element of yH is an element of xH,
and it follows that xH = yH. Hence if two left cosets of H in G overlap,
they must be equal.
4.44 This is how to show part (d) of the proof of LaGrange's Theorem:
Each element of G belongs to some left coset of the subgroup H in G. Let
g be an arbitrary element of G. What is a reasonable candidate for the left
coset xH to which g might belong? Show that, for your choice of x, g
actually does belong to xH.
4.45 Show that, in a group G with elements g and h, (gh)^{-1} = h^{-1}g^{-1}.
4.46 Show that each element of a finite group has finite order.
4.47 Let G be a finite group, and g an element of G of order m. Let H
consist of the first m powers of g; that is,
H = {g^1, g^2, g^3, ..., g^m}.
Show that H must be a subgroup of G. Hint: You may use the fact that for
each natural number n, the group Z_n of Example 4.6 actually is a group.
4.48 With respect to the previous exercise, show in addition that o(H) = m.
Hint: Clearly, o(H) ≤ m. What would happen if o(H) < m?
4.49 Let G be a finite group. Show that there exists a fixed natural number
k such that, for each element g of G, g^k = e.
4.50 Let G be a finite group of even order. Prove that G must contain an
element of order 2.
4.3 CYCLIC GROUPS AND ABELIAN GROUPS


The group G is said to be cyclic provided that G contains an element g such
that each element of G is a power of g.
With respect to this definition, one is allowed to use all integral powers of
g, positive, negative, and zero. For example, the group (W, +) of Example
4.1 is cyclic, because every element of W is a "power" of 1. Here, because of
the additive rather than multiplicative notation, we interpret 1^n as n·1, for
g^n means iteration of the group operation, using the element g, n times. So
in the case of the element 1 of the group (W, +), 1^n would mean the sum of
n 1's:
1 + 1 + 1 + ··· + 1 = n·1.
If G is a cyclic group, then an element g of G whose powers produce all
elements of G is called a generator of G, and G is said to be generated by g.
It is usually the case that a cyclic group has more than one generator. The
group (W, +) contains another generator in addition to 1. What is it?
The group G of Example 4.2 is not cyclic, as you were asked to verify
in Exercise 4.23. Of the two groups J and K of Examples 4.4 and 4.5, one is
cyclic and one is not. Which is which? See also Exercise 4.25.
The group G is said to be Abelian if the operation in G is commutative;
that is, if gh = hg for all elements g and h of G.
Numerous examples of Abelian groups have already been examined. See
Exercises 4.13, 4.32, and 4.33. Our next theorem is sometimes quite useful.
Theorem 4.3 Every cyclic group is Abelian.
Proof Let x and y be elements of the cyclic group G, and let g be a generator
of G. Then, by definition, there exist integers m and n such that x = g^m
and y = g^n. Hence
xy = (g^m)(g^n) = g^{m+n} = g^{n+m} = (g^n)(g^m) = yx.

Therefore G is Abelian.
As is frequently the case in group theory, the converse of this theorem
does not hold. That is, there do exist Abelian groups which are not cyclic.
However, there is one very important class of groups which must be cyclic
merely as a consequence of their order. Recall that a positive whole number
p is said to be prime if p has exactly two natural number divisors: itself and 1.
For example, the first ten prime numbers are
2, 3, 5, 7, 11, 13, 17, 19, 23, 29.
Theorem 4.4 Every group of prime order is cyclic.
Proof Let G be a group of prime order p. Since p ≥ 2, G contains at least
one element g not equal to the identity e of G.
Let H be the subgroup of G generated by g. That is,

H = {g^1, g^2, g^3, ..., g^n},

where, as we have seen, n is the order of g. Since g ≠ e, g has order at least 2.
But the order n of g must be a divisor of p = o(G), by Theorem 4.2. Since
p is prime, its only divisors are 1 and p, and hence n = p.
But H also has order n = p, and hence H is a subset of G containing as
many elements as G itself does. Hence H = G. So, since each element of H
is a power of g, so is each element of G. Therefore by definition, G is cyclic.
As usual, the converse of this theorem does not hold. It does not follow
that a cyclic group must be of prime order.
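A short computation illustrates Theorems 4.3 and 4.4 and both failed converses. The Python sketch below rests on assumptions: it takes Z_n to be the integers 0, 1, ..., n - 1 under addition modulo n (Example 4.6 is not reproduced here), and it introduces as an extra example the group of pairs of 0's and 1's under componentwise addition modulo 2, which does not appear in the text. The function names are hypothetical.

```python
from itertools import product

def generated(g, n):
    """Elements of Z_n reached as g, 2g, 3g, ... (the additive 'powers' of g)."""
    seen, x = set(), g % n
    while x not in seen:
        seen.add(x)
        x = (x + g) % n
    return seen

def generators(n):
    """All elements of Z_n whose multiples exhaust Z_n."""
    return [g for g in range(n) if len(generated(g, n)) == n]

# Prime order: every element other than the identity 0 is a generator.
print(generators(7))  # [1, 2, 3, 4, 5, 6]

# Z_6 is cyclic although 6 is not prime, so the converse of Theorem 4.4 fails.
print(generators(6))  # [1, 5]

# Assumed extra example: pairs of bits under componentwise addition mod 2.
V = list(product(range(2), repeat=2))

def add(a, b):
    return ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)

def span(g):
    """All multiples g, 2g, 3g, ... of the pair g."""
    seen, x = set(), g
    while x not in seen:
        seen.add(x)
        x = add(x, g)
    return seen

abelian = all(add(a, b) == add(b, a) for a in V for b in V)
cyclic = any(len(span(g)) == len(V) for g in V)

# Abelian but not cyclic, so the converse of Theorem 4.3 fails as well.
print(abelian, cyclic)  # True False
```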
Exercises

4.51 Is every subgroup of (W, +) cyclic?


4.52 Show that the group Z_n of Example 4.6 is cyclic for each natural
number n.
4.53 Show that if a group G has prime order, then G is generated by each
of its elements other than the identity. Hint: Look at the proof of Theorem
4.4.
4.54 Let G be an Abelian group, n an integer, and g and h elements of G.
Show that
(gh)^n = g^n h^n.

4.55 Let G be an Abelian group and H the subset of G consisting of all
elements of G of finite order. Show that H must be a subgroup of G. Hint:
Use the previous exercise.
4.56 Let G be a finite Abelian group and let k be a fixed natural number.
Let H be the set of all kth powers of all elements of G. Prove that H must
be a subgroup of G.
4.57 Let G be a group and n a natural number such that, for all elements
x and y of G,
(xy)^n = x^n y^n,
(xy)^{n+1} = x^{n+1} y^{n+1},
(xy)^{n+2} = x^{n+2} y^{n+2}.

Prove that G must be Abelian.


4.58 Prove that every subgroup of a cyclic group is cyclic. Hint: Let G be a
cyclic group and H a subgroup of G. Let g be a generator of G. Then each
element of H is a power of g. Some positive power of g must appear in H
(unless H = {e}; but then H is cyclic); let k be the least positive integer such
that g^k ∈ H. Prove that g^k generates H.
4.59 Let G be a group. The center Z of G is the set of all elements z of G
such that zg = gz for all g ∈ G. Prove that Z is a subgroup of G.
4.60 Let G be a finite group and g a fixed element of G. Let N(g) be the set
of all elements x of G such that gx = xg. Prove that N(g) is a subgroup
of G.
4.61 This is difficult. A semigroup consists of a set together with a closed
and associative binary operation; the set must be nonempty. Prove that a
finite cancellative semigroup is a group. See Exercise 4.19.
4.62 Suppose that G is a finite group with no proper subgroups. Prove that
G must have prime order. Warning: The converse of LaGrange's Theorem
is false.
4.63 Let G be a group, H a subgroup of G, and g a fixed element of G.
Let g^{-1}Hg consist of all elements of G of the form g^{-1}hg, where h is an
element of H. That is,
g^{-1}Hg = {g^{-1}hg | h ∈ H}.
Prove that g^{-1}Hg must also be a subgroup of G.
4.64 Let G be a group and g and h elements of G. Prove that the equation
xgx = h
has a solution x in G if and only if gh = a^2 for some element a in G. Warning:
Group G is not necessarily Abelian.
4.65 Suppose that g is the only element of order two in the group G. Prove
that g must belong to Z, the center of G. See Exercise 4.59.
4.66 Let G be a group containing elements x and y such that xy^2 = y^3x
and yx^2 = x^3y. Prove that x = e = y.
4.67 The subgroup H of the group G is said to be normal in G provided that
xH = Hx for all x in G. (Of course, Hx = {hx | h ∈ H}.) Prove that the
center Z of a group G is a normal subgroup of G. See Exercise 4.59.
4.68 Let G be a finite group of order 2n and suppose that G contains a sub-
group H of order n, where n is a natural number. Prove that H is a normal
subgroup of G. See Exercise 4.67.
4.69 Let G be a finite group of order 2n and suppose that G contains a
subgroup H of order n, where n is a natural number. Prove that H contains
all the elements of G of odd order.
4.70 Prove that the equation x^2 a x = a^{-1} has a solution x in the group G
if and only if a = g^3 for some element g of G.
4.71 Let A be a subset of the finite group G and suppose that A contains
more than half the elements of G. Prove that each element of G is the product
of two elements of A.
4.72 Prove that if H and K are two normal subgroups of the group G and
have only the identity of G in common, then every element of H commutes
with every element of K. That is, if h ∈ H and k ∈ K, then hk = kh. See
Exercise 4.67.
4.73 Let G be a finite group of order n, where n is not 1 and n is not prime.
Prove that G must contain at least one proper subgroup.
4.74 (Azriel Rosenfeld) Prove that a group with only finitely many sub-
groups must be finite.
4.75 (Mira Bhargava) Prove that the union of two subgroups of a group is
itself a subgroup if and only if one of the two original subgroups contains
the other.
4.76 (F. M. Sioson) Prove that any semigroup in which x^2 y = y = yx^2
for all its elements x and y must be a group. See Exercise 4.61.
4.77 (W. A. McWorter) Let G be a group of order n^2 and H a subgroup of G
of order n. Prove that for each element x of G, H ∩ x^{-1}Hx contains more
than one element. See Exercise 4.63.
4.78 (Michael Gemignani) Let G be a group with identity e and A a subgroup
of G such that (G - A) ∪ {e} is also a subgroup of G. Prove that either
A = {e} or A = G. Note: G - A is the subset of G consisting of those
elements of G not belonging to A.
4.79 (Alan Schwartz) Let G be a finite Abelian group of order n. Then n is
odd if and only if each element of G is a square. (The element g of G is a
square if g = h^2 for some h ∈ G.)
4.80 (Erwin Just and Mary R. Embry) Let G be a group in which no element
has order 2. Prove that if (xy)^2 = (yx)^2 for all elements x and y of G, then
G must be Abelian.
4.81 (W. A. Donnell) Let S be a semigroup satisfying the cross-cancellation
law. Prove that S is both Abelian and cancellative. Need S be a group?
See Exercises 4.20 and 4.61.
4.82 This and the next three exercises deal with one of the major theorems
about homomorphisms. A homomorphism φ: G → H is just a function φ
from the group G to the group H that "preserves the group operations";
that is, for each two elements g_1 and g_2 of G,
φ(g_1 g_2) = [φ(g_1)][φ(g_2)].
Of course, the operation between the square brackets is the group operation
in H. The image I of such a homomorphism φ: G → H is the set of elements
h of H such that
h = φ(g)

for some g ∈ G. Show that, in the above notation, I is a subgroup of H.


4.83 Continuing the previous exercise, we define the kernel K of a homomorphism
φ from the group G to the group H as that subset of G consisting
of those elements g such that
φ(g) = e,

where e denotes the identity of H. Show that, in the above notation, K is a
normal subgroup of G.
4.84 Continuing the two previous exercises, let φ: G → H be a group
homomorphism with kernel K. Let G/K denote the collection of left cosets
of K in G. For two such left cosets xK and yK, we define an operation on the
set G/K as follows:
(xK)(yK) = (xy)K.

Show that G/K together with this operation becomes a group.
4.85 Finally, in the notation of the previous exercises, let θ: G/K → I
be defined according to the rule
θ(xK) = φ(x),

where I denotes the image of φ. Show that θ is then also a homomorphism,
and that in fact θ is both one-to-one and onto (see Chapter 7 for definitions
of the last two terms).

NOTES AND REFERENCES


There are so many excellent texts and references on group theory that it
would be almost a hopeless task to list them all. A few are given below for
the convenience of the reader; however, most of those listed are fairly
advanced.

Birkhoff, G., and S. MacLane, A Survey of Modern Algebra (Macmillan, 1963).
Hall, M., The Theory of Groups (Macmillan, 1959).
Herstein, I. N., Topics in Algebra (Blaisdell, 1964).
Kurosh, A. G., The Theory of Groups (two volumes, translated by K. A.
Hirsch; Chelsea, 1955).
Jacobson, N., Lectures in Abstract Algebra, volume I (van Nostrand, 1951).
MacLane, S., and G. Birkhoff, Algebra (Macmillan, 1967).
Passman, D. S., Permutation Groups (Benjamin, 1968).
I wish to thank the editors of the American Mathematical Monthly for
permission to use several problems submitted as part of the regular problems
section of that journal. These are Exercises 4.74 (E 1522, Vol. 69, 1962),
4.75 (E 1592, Vol. 70, 1963), 4.76 (E 1629, Vol. 70, 1963), 4.77 (E 1874, Vol.
73, 1966), 4.78 (E 1764, Vol. 72, 1965), 4.79 (E 1794, Vol. 72, 1965), 4.80
(E 1996, Vol. 74, 1967, and solution in Vol. 75, 1968), and 4.81 (E 2007,
Vol. 74, 1967).
Joseph Louis LaGrange, who first formulated and proved the theorem
that bears his name, was born in Turin in 1736. He was recognized as an
exceptionally able mathematician while still in his teens, and many consider
him second only to Euler of the mathematicians of the eighteenth century.
He solved a large number of significant and difficult problems, was one of
the first to insist on accuracy in mathematical proofs, and late in his life he
became known as a great teacher of mathematics. His major researches,
curiously enough, were not in group theory, but in mechanics, the calculus
of variations, and number theory.
CHAPTER 5

POLYHEDRA

In this chapter we shall be concerned mostly with the numerical relations
that must exist between the numbers of faces, edges, and vertices of a solid
polyhedron in three-dimensional space, and with the connection between
such relations and the mapmaker's problem of coloring each face of such a
figure so that faces with a common boundary are colored differently. Although
the material of this chapter is much simpler than much of that in the previous
four, it is unusual and surprising that under such circumstances we shall
find ourselves on the very edge of the mathematically unknown. For example,
suppose you are given a map of connected countries on a large island, and
you wish to color different countries in such a way that countries with a
common boundary have different colors. It has been proved that five colors
are always sufficient, but no one has ever been able to construct such a map
in which all five colors are actually required. Since it is easy to construct a
map that requires four colors, one sees that the answer to the question of how
many colors are needed is either "four" or "five"; but although amateur and
professional mathematicians have worked on this problem for over a century,
no one has yet discovered which is the correct answer.
Apparently the answer has something to do with Euler's Formula, which
gives one numerical relation that must exist between the numbers of vertices,
edges, and faces of a polyhedral solid; "apparently," because the proof that
five colors are always sufficient in the case mentioned above does require
Euler's Formula. Hence we begin with some introductory material about
polyhedra.

5.1 THE DEFINITION OF POLYHEDRON


A peculiarity of much of mathematics is its instability under slight changes.
A theorem that is true may become false with only the slightest alteration
in its wording; it is even possible that a change in the order of two clauses
may alter the truth of the statement of the theorem. Hence it is quite important
that the things about which these theorems are proved are defined precisely,
not out of any innate desire for precision, but from the need for accuracy.
Although you may be quite sure that you know what is meant by a "solid
polyhedron in three-dimensional space," someone else may well have a
definition that encompasses a wholly different set of objects. So the problem
arises of formulating a definition of such objects, merely for purposes of
communication alone. Of course, it would be preferable if the resulting
definition were exactly in accord with every man's preconceived notion of the
term defined, but this is not essential and is frequently impossible. For
example, you may well feel that a polyhedron ought to be connected (that is,
come in a single lump rather than several), but you could easily imagine
how one could defend an alternative definition in which polyhedra were not
required to be connected.
If there were in the literature of mathematics a single and well-established
definition of the term "polyhedron," it would be the one used here; but since
there is not, we may define this term as we choose so long as, for practical
reasons, our definition is reasonably close to the commonly accepted meaning
of the term. So we choose a definition close to the common meaning, but
suited to the purposes of this chapter.
What we seek is a way of defining a solid three-dimensional object, a
subset of three-dimensional space bounded by "flats" with straight edges,
and connected so as to assure that the object "comes in a single piece."
Since we must begin with some primitive or undefined terms, we assume as
given and understood the terms "point," "line segment," and "plane" of
ordinary euclidean geometry. Part of the definition of polyhedron will
involve the notion of a polygon, so first we rephrase the definition of
"polygon" as given in Chapter 1.
A polygon is a plane figure of finite area bounded by a finite number of
straight-line segments with the property that each endpoint of each line
segment is in fact the endpoint of exactly two of the line segments, and such
that each two such line segments meet, if at all, in a single common endpoint
of each. Finally, each polygon is to be connected, and so is the boundary
of each polygon. The condition that the polygon be connected means
that
a) For each two points x and y within the polygon, there is a polygonal
line L with endpoints x and y lying entirely within the polygon.
And the condition that the boundary of each polygon be connected
means that
b) For each two points x and y on the boundary of the polygon, there is a
polygonal line L with endpoints x and y lying entirely within the boundary
of the polygon.
Fig. 5.1 The exterior of a polygonal boundary is not a polygon.

To understand this definition fully, you should determine exactly what
sorts of polygon-like objects are ruled out by this definition. First, the
condition that a polygon have finite area prohibits such an object as that
shown in Fig. 5.1 from being a polygon. Next, the condition that each endpoint
of each line segment be, in fact, the endpoint of exactly two line segments
is really a condition on the boundary of the polygon (by which we
mean the union of that collection of boundary segments) and will prevent
an object such as that shown in Fig. 5.2 from being a polygon.
That each two line segments meet, if at all, in a common endpoint of
each eliminates the set shown in Fig. 5.3 as a possible polygon. In this figure
the two crossing sides are understood to cross at a point not an endpoint
of either side. The condition that the polygon be connected will prevent
the polygonal-like object shown in Fig. 5.4 from being a single polygon:
this example is the union of three polygons.
The last condition, that the boundary be connected, will prevent a
polygon from having a hole in it, as shown in Fig. 5.5. This consequence is a
special case of the Schoenflies Theorem, in which it is proved that a simple
closed curve in the two-dimensional plane is the boundary of a topological
disk (a "topological disk" is one that can be continuously deformed into
a circular disk). It turns out to be surprisingly difficult to prove this "obvious"
theorem, and it is beyond the scope of this text even to attempt to prove it for
the special case of a polygonal simple closed curve.
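One of these conditions, that each endpoint be the endpoint of exactly two boundary segments, is easy to test mechanically. The Python sketch below is a partial illustration only: it assumes a boundary given as a list of segments, each a pair of coordinate points, and the name endpoints_ok and both sample figures are hypothetical. It does not test finite area, crossing sides, or connectedness.

```python
from collections import Counter

def endpoints_ok(segments):
    """True when every endpoint is the endpoint of exactly two segments."""
    counts = Counter(point for segment in segments for point in segment)
    return all(count == 2 for count in counts.values())

# The boundary of a unit square: each corner ends exactly two sides.
square = [((0, 0), (1, 0)), ((1, 0), (1, 1)),
          ((1, 1), (0, 1)), ((0, 1), (0, 0))]

# Two triangles touching at the point (1, 1): that point ends four segments.
pinched = [((0, 0), (1, 1)), ((1, 1), (0, 2)), ((0, 2), (0, 0)),
           ((1, 1), (2, 0)), ((2, 0), (2, 2)), ((2, 2), (1, 1))]

print(endpoints_ok(square), endpoints_ok(pinched))  # True False
```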
Fig. 5.2 The figure is not a polygon because one point lies on more than two boundary segments.

Fig. 5.3 The figure is not a polygon because two sides intersect in a point not a common endpoint.

Fig. 5.4 The figure is not a polygon because it is not connected.

Fig. 5.5 The figure is not a polygon because the boundary is not connected.

Fig. 5.6 A slight alteration to make the boundary connected.

These polygons, as defined above, are to be the faces of our polyhedral
solids. We could actually alter the definition and allow polygons to have
polygonal holes in them, but over and over again in the subsequent proofs
we would have to convert each such polygon into one fitting the above
definition by slight alterations in its structure, of the sort shown in Fig. 5.6.
Now by a polygon we will mean a plane figure as defined above together
with its boundary, so in future use of the term "polygon" it will be permissible
for a subset of a polygon to lie partly or wholly on the boundary of the
polygon.
Using the term "polygon" as defined above, we can now define exactly
what is meant by a polyhedron, or polyhedral solid.
A polyhedron K is a subset of three-dimensional space such that
a) K has positive but finite volume;
b) each two points of K can be joined by a polygonal line lying entirely
within (or partly on the boundary of) K; and
c) the boundary of K is the union of finitely many polygons, such that each
two of these polygons meet, if at all, in a single common vertex or a
single common edge; and such that each three such polygons meet, if
at all, in a single common vertex.
Condition (c) of this definition is meant to prevent such an object as that
shown in Fig. 5.7 from being a polyhedron; however, it will be permissible
for a polyhedron to have a hole in it, as shown in Fig. 5.8.
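One consequence of condition (c), that no edge may lie on more than two faces, can be checked from a list of faces. The Python sketch below is an illustration under assumptions: each face is given as a cycle of hypothetical vertex labels, the cube labeling is made up for the example, and the function name edge_face_counts is hypothetical. It tests only this one consequence, not the whole of condition (c).

```python
from collections import Counter

def edge_face_counts(faces):
    """Count, for each edge, how many faces have that edge on their boundary."""
    counts = Counter()
    for face in faces:
        # Pair each vertex with the next one around the face, wrapping at the end.
        for a, b in zip(face, face[1:] + face[:1]):
            counts[frozenset((a, b))] += 1
    return counts

# A cube: vertices 0-3 around the bottom, 4-7 around the top, 4 above 0, and so on.
cube = [(0, 1, 2, 3), (4, 5, 6, 7), (0, 1, 5, 4),
        (1, 2, 6, 5), (2, 3, 7, 6), (3, 0, 4, 7)]

cube_counts = edge_face_counts(cube)
print(len(cube_counts), max(cube_counts.values()))  # 12 2

# Attaching an extra square along the edge joining 0 and 1 puts three faces on it.
with_fin = cube + [(0, 1, 8, 9)]
fin_counts = edge_face_counts(with_fin)
print(fin_counts[frozenset((0, 1))])  # 3
```

In the cube every one of the 12 edges lies on exactly two faces; the figure with the extra square violates the requirement at the shared edge.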
114 Polyhedra 5.1

Fig. 5.7 The figure is not a polyhedron because more than two faces have an edge
in common.

Fig. 5.8 A polyhedron with a hole running all the way through.

Fig. 5.9 The polygonal closed curve in the boundary does not separate the boundary.

The vertices of the polygons forming the boundary of the polyhedron
K are called the vertices of K; the edges of the polygons are called the edges
of K; the polygons themselves are called the faces of K.
In some of the theorems which follow we shall need an additional
hypothesis that will assure us that our polyhedra have no holes in them.
Imagine a large cube with a small cube removed from its interior. We could
prevent this phenomenon by requiring that the boundary of each polyhedron
be connected; that is, each two points of the boundary could be joined by a
polygonal line lying entirely within the boundary. But this condition would
not prevent a hole running all the way through the polyhedron, as in Fig.
5.8. If we need to exclude such a possibility, so as to consider only such
polyhedra as could be molded from a cube of clay without cutting or poking
holes, we could add the following condition: that each polygonal closed
curve in the boundary of a polyhedron separates the boundary into two parts.
The polyhedron shown in Fig. 5.8 does not have this last property; Fig. 5.9
shows a polygonal closed curve in its boundary that does not separate the
boundary.
Since we may need both the conditions we have been discussing, in
order to prevent an internal hole or a hole all the way through, we combine
them into a single definition.

The connected polyhedron K is said to be 2-connected provided that K
has connected boundary and each closed polygonal curve on the boundary
of K separates the boundary of K into two parts.
By virtue of condition (b) of our definition of polyhedron, each time
we use the term "polyhedron" we mean one which is connected; but we
refer to a connected polyhedron in the above definition to emphasize that
this definition is unambiguous only for connected polyhedra. In any case, a
2-connected polyhedron is what most people think of as a "polyhedron";
the five regular solids, for example, are each 2-connected. A large piece of
Swiss cheese, even with polyhedral holes, is not.

Exercises
5.1 Draw a map of connected countries requiring four colors for a "proper"
coloring, that is, one in which two countries that have a boundary segment in
common must be colored differently.
5.2 Is there a solution to the previous exercise in which only four countries
are drawn? Why?
5.3 Can each of the countries drawn in Exercise 5.1 be shaped like a
rectangle? Explain your answer.
5.4 Can each of the countries drawn in Exercise 5.1 be square? Give a
reason for your answer.
5.5 Suppose that a map of countries is drawn in which each vertex lies on
the boundary of exactly four countries. How many colors are needed for a
proper coloring of such a map? Can you prove this?
5.6 Draw a map of connected countries on the surface of a sphere (use a
tennis ball if you wish). Consider each country to be a "face," each boundary
segment between two vertices to be an "edge," and each point where three
or more boundary segments meet to be a "vertex." Let F be the number of
faces, E the number of edges, and V the number of vertices. Evaluate the
number V - E + F. Repeat this experiment a number of times.
5.7 Repeat the previous exercise, but use a torus (a figure shaped like a
doughnut) instead of a sphere. Use only countries that have connected
boundaries.
5.8 What happens in the previous exercise if some countries do not have
connected boundaries?
5.9 Why was it required in the definition of "polyhedron" that a polyhedron
have finite volume?
5.10 A subset of three-dimensional space is said to be convex provided that
each two points of the set can be joined by a straight-line segment lying
entirely within (or on the boundary of) the subset. Must a convex polyhedron
be 2-connected? Must a 2-connected polyhedron be convex? Explain.

Fig. 5.10 A Möbius strip.

5.11 If countries are not required to be connected (such as Pakistan) but
the mapmaker still wishes to have all parts of a given country colored the
same color, the solution to the coloring problem becomes more complicated.
Suppose that you know that each country comes in no more than two
connected parts. Can you construct a map of such countries in the plane
requiring eight colors for a proper coloring? Ten? More?
5.12 Find the minimum number of colors necessary to properly color each
of the regular solids, considering each face as a different country.
5.13 A Möbius strip is formed by cutting a long rectangular piece of paper,
and then joining the short sides not in the expected fashion, but after giving
the strip a half twist. An example is shown in Fig. 5.10. Can you draw a
map of connected countries on a Möbius strip such that you need five colors
to color the map properly?
5.14 What happens to the mapmaker's problem in three dimensions?
Consider the "countries" to be three-dimensional solids, such as rectangular
polyhedra. We require that two "countries" be colored differently if they
have part of a boundary face in common. Can you construct a "map" of
such countries requiring six colors for a proper coloring? Eight? Ten?
More?
5.15 Repeat Exercise 5.11, but allow the countries to come in arbitrarily
large numbers of pieces. For example, you may have one country in two
pieces, a second in three, a third in four, and so on. Is there any upper limit
to the number of colors needed to color all such maps?

5.2 EULER'S FORMULA

Let K be a 2-connected polyhedron and let V, E, and F denote respectively
the number of vertices, edges, and faces of K. The Swiss mathematician
L. Euler discovered a simple relation that must always hold between the
numbers V, E, and F:
V - E + F = 2.

This is what you should have discovered in working Exercise 5.6. It is
the aim of this section to provide a proof of this relation, known as Euler's
Formula.
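
Before turning to the proof, the relation can be checked numerically on small examples. The following Python sketch counts V, E, and F for a tetrahedron and a cube; the vertex labels and the face lists are assumptions chosen for this illustration and do not come from the text.

```python
def euler_counts(faces):
    """Return (V, E, F) for a boundary given as a list of faces.

    Each face is a tuple of vertex labels listed in order around the face.
    """
    vertices = {v for face in faces for v in face}
    edges = set()
    for face in faces:
        for i, v in enumerate(face):
            w = face[(i + 1) % len(face)]   # next vertex around the face
            edges.add(frozenset((v, w)))    # an edge is an unordered pair
    return len(vertices), len(edges), len(faces)


# Hypothetical labelings: any consistent labeling would do.
TETRAHEDRON = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
CUBE = [(0, 1, 2, 3), (4, 5, 6, 7), (0, 1, 5, 4),
        (1, 2, 6, 5), (2, 3, 7, 6), (3, 0, 4, 7)]

for name, faces in (("tetrahedron", TETRAHEDRON), ("cube", CUBE)):
    V, E, F = euler_counts(faces)
    print(name, "V =", V, "E =", E, "F =", F, "V - E + F =", V - E + F)
```

Each edge is stored as an unordered pair so that an edge shared by two faces is counted only once.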
Since this formula has nothing to do with the interior of the polyhedron
K, we immediately forget its existence, and consider only the boundary of K,
as composed of a number of vertices, edges, and faces. Choose a face of K
that you particularly like (or dislike), and remove this face from K, leaving
only the edges that it had in common with other faces of K. Since K is 2-
connected, it is now possible to deform the cup-shaped figure composed
of the remaining boundary polygons in such a way that it lies in a plane,
without changing the number of vertices or edges or the way they are con-
nected. The remaining faces of K thus become polygons in the plane, bounded
by the same edges as before; of course, the new faces cannot remain con-
gruent to the old ones, nor can the edges retain their original lengths in this
deformation. Figure 5.11 shows a cube before such a deformation, and
Fig. 5.12 shows the resulting plane figure, which is sometimes known as the
net of K.
The net of K may be thought of as a number of polygons, some of which
may be triangles, but some of which may not. We need next to convert all
these polygons into triangles. But first we count the number of vertices,
edges, and faces in the net, denoting these numbers by V, E, and F respectively,
and form the sum V - E + F. Note that V and E are the same for K
and the net of K, while K has one more face than its net since the unbounded
exterior domain of the net is not counted as a face.
In Chapter 1 we provided a proof that any plane polygon can be trian-
gulated without introducing additional vertices. In this proof, straight-line
segments were drawn in the polygon, joining various pairs of its vertices,
until the polygon had been completely triangulated. If such a line segment
is drawn in one of the polygons in the net of K, thus joining two vertices of
some polygon of the net, the value of E will be increased by one. But the
value of F is simultaneously increased by one as well, and consequently the
value of V - E + F will be unchanged by this process. Thus we proceed to
triangulate each polygon in the net of K, and when we are done, though the
values of E and F will generally change, the value of the expression V - E + F
is not altered.
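
The bookkeeping in this step can be illustrated on the net of the cube. The Python sketch below is only an illustration under stated assumptions: the vertex labels (outer square 0 to 3, inner square 4 to 7) are hypothetical, and each face is split by diagonals drawn from its first vertex, which is adequate for the convex quadrilaterals used here but is not the general procedure of Chapter 1.

```python
def counts(faces):
    """Return (V, E, F) for a plane net given as a list of vertex cycles."""
    vertices = {v for f in faces for v in f}
    edges = {frozenset((f[i], f[(i + 1) % len(f)]))
             for f in faces for i in range(len(f))}
    return len(vertices), len(edges), len(faces)


def fan_triangulate(face):
    """Split one polygon into triangles by diagonals from its first vertex."""
    return [(face[0], face[i], face[i + 1]) for i in range(1, len(face) - 1)]


# Net of the cube after one face has been removed (assumed labeling).
CUBE_NET = [(4, 5, 6, 7), (0, 1, 5, 4), (1, 2, 6, 5), (2, 3, 7, 6), (3, 0, 4, 7)]

before = counts(CUBE_NET)
triangles = [t for face in CUBE_NET for t in fan_triangulate(face)]
after = counts(triangles)

print("before:", before, "V - E + F =", before[0] - before[1] + before[2])
print("after: ", after, "V - E + F =", after[0] - after[1] + after[2])
```

Each of the five quadrilaterals receives one diagonal, so E and F both grow by five while V stays fixed.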

Fig. 5.11 The boundary of a cube in three-dimensional space.

Fig. 5.12 After one face of the cube is removed, its boundary can be deformed
so as to lie in a plane.

Fig. 5.13 The net of the cube is now completely triangulated.

In Fig. 5.13 we show the net of K completely triangulated; we are
continuing with the cube as the model for this proof. We now attack this
collection of triangles with an eraser. Our aim is to erase all the triangles
but one, one at a time, without changing the value of V - E + F (though
we are certainly going to change the values of the individual terms). In the
erasing process, we also want to have a connected net of triangles at each
stage. Two types of erasing will be needed. Figure 5.14 shows that one can
erase a single edge, thus removing one edge and one face. The line to be
erased is shown as a dashed line in that figure. Since this erasure decreases
the value of each of E and F by 1, the value of V - E + F is unchanged.
Figure 5.15 shows an example of the other type of erasing needed; one can
also erase a vertex and two edges, shown as dashed lines in that figure. This
second type of erasing decreases each of V and F by 1 and simultaneously
decreases E by 2. Again, the value of V - E + F is unchanged.
At each stage of the erasing process, at least one of these two types of
erasures can be performed. We take care only to erase vertices and edges
lying on the "outside" of the net, so that the net remains connected at each
stage. Since at each erasure the value of F is decreased by 1, the process can
be continued until after a finite number of steps the value of F becomes 1
and a single triangle is all that remains.
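
The erasing process can be imitated on a small example. The Python sketch below works on a triangulated net of the cube; the vertex labels and the particular triangulation are assumptions made for the illustration. The rule used to pick the next triangle (two outside edges, or one outside edge whose opposite vertex is not on the outside) is one cautious reading of "erase only on the outside so that the net remains connected," not a procedure stated in the text.

```python
from itertools import combinations


def edges_of(tri):
    """The three edges of a triangle, as unordered pairs."""
    return [frozenset(e) for e in combinations(tri, 2)]


def counts(triangles):
    """Return (V, E, F) for the net formed by the given triangles."""
    vertices = {v for t in triangles for v in t}
    edges = {e for t in triangles for e in edges_of(t)}
    return len(vertices), len(edges), len(triangles)


def outside(triangles):
    """Edges lying on exactly one triangle, and the vertices on those edges."""
    use = {}
    for t in triangles:
        for e in edges_of(t):
            use[e] = use.get(e, 0) + 1
    b_edges = {e for e, n in use.items() if n == 1}
    return b_edges, {v for e in b_edges for v in e}


def erase_to_one_triangle(triangles):
    """Erase outside triangles one at a time; return (V, E, F) after each step."""
    triangles = list(triangles)
    history = [counts(triangles)]
    while len(triangles) > 1:
        b_edges, b_verts = outside(triangles)
        for t in triangles:
            on_outside = [e for e in edges_of(t) if e in b_edges]
            if len(on_outside) == 2:        # erase a vertex and two edges
                break
            if len(on_outside) == 1:        # erase a single edge
                (apex,) = set(t) - on_outside[0]
                if apex not in b_verts:
                    break
        else:
            raise ValueError("no erasable triangle found")
        triangles.remove(t)
        history.append(counts(triangles))
    return history


# Triangulated net of the cube (assumed labeling: outer square 0-3, inner 4-7).
NET = [(4, 5, 6), (4, 6, 7), (0, 1, 5), (0, 5, 4), (1, 2, 6),
       (1, 6, 5), (2, 3, 7), (2, 7, 6), (3, 0, 4), (3, 4, 7)]

for V, E, F in erase_to_one_triangle(NET):
    print("V =", V, "E =", E, "F =", F, "V - E + F =", V - E + F)
```

The counts are recomputed from the surviving triangles after every erasure, so the invariance of V - E + F is observed rather than assumed.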

Fig. 5.14 Removing one triangle by erasing an edge.

Fig. 5.15 Removing one triangle by erasing two edges and their common vertex.

The value of V - E + F is the same for this triangle as it was for the net
of K, but since a triangle has V = 3, E = 3, and F = 1, the value of
V - E + F is 3 - 3 + 1 = 1. Thus the value of V - E + F must also be 1
for the net of K. Now K itself has the same number of vertices and edges
as its net, and one more face, so the value of V - E + F for K must be 2.
This establishes Euler's Formula: for any 2-connected polyhedron,
V - E + F = 2.

Exercises

5.16 Calculate the value of V - E + F for a polyhedron shaped roughly
like a cube with a single hole running all the way through, such as the one
shown in Fig. 5.8. Repeat this for a polyhedron with two holes, three holes,
and four holes. Generalize. How might you go about proving your guess
is correct?
5.17 You can see from the proof of Euler's Formula that it is not actually
necessary for the edges involved to be straight. Hence for any map of
connected countries on the surface of a sphere, if F is the number of countries,
E the number of boundary segments, and V the number of vertices, then
it is still true that V - E + F = 2. We can also use the formula
V - E + F = 2 for a map of connected countries in the two-dimensional plane,
provided that we interpret the unbounded outside region as a large country.
You can use this somewhat generalized form of Euler's Formula to work a
number of problems, including this one: Suppose you are given five points
marked in the plane and lines are drawn from each to the other four, de-
touring around vertices. This will make ten lines in all. Prove that two
of these lines must intersect.
5.18 Prove that three houses cannot each be connected to each of three
wells by nonintersecting paths. Hint: See Exercise 5.17.
5.19 Prove that if each country of a map on a sphere has exactly three edges,
then the number of countries is even.
5.20 If a map of connected countries on a sphere has 60 vertices and each
country has exactly three edges, how many countries are there?
5.21 Suppose that a map of connected countries on a sphere has the property
that each vertex lies on an even number of edges. What is the minimum
number of colors required to properly color such a map?
5.22 Prove that there is no map covering the surface of a sphere such that
each vertex lies on exactly four edges and each country has exactly six sides.
5.23 Suppose a map is given in the plane such that each vertex lies on exactly
three edges and no two countries have in common more than a single vertex
or single edge. Show that at least one country in the map must have five or
fewer edges. Hint: For each natural number n, let F_n be the number of
countries that have exactly n edges. Suppose that a map such as the one
mentioned above has no country with five or fewer edges. Then
F_1 = F_2 = F_3 = F_4 = F_5 = 0,
and
F_6 + F_7 + F_8 + ... = F.

Moreover, since each vertex lies on exactly three edges, 3V = 2E. Finally,
the number of edges in the map can be counted as follows: For each country
with six edges, we count all the edges and get 6F_6. Similarly, we count the
edges using countries with exactly seven edges, and get 7F_7. The total
number of edges counted in this fashion is then
6F_6 + 7F_7 + 8F_8 + ...,
and this process counts each edge exactly twice, so
2E = 6F_6 + 7F_7 + 8F_8 + ...
≥ 6F_6 + 6F_7 + 6F_8 + ...
= 6F.
Now you have the following three relationships:
V - E + F = 2,
3V = 2E,
2E ≥ 6F.
Does this lead to a contradiction?
5.24 Figure 5.16 shows a map of countries on an island. Prove that this
map cannot be colored with only three colors in such a way that adjacent
countries are colored differently.
5.25 Suppose the Martians have divided their planet into two countries,
one occupying the Northern hemisphere and called Northia, and one
occupying the Southern hemisphere and called Southia. There is then a single
boundary edge, the Martian equator. Does Euler's Formula hold for this
sort of map? If not, what convention about boundary edges is necessary
to make the formula valid?

5.3 REGULAR SOLIDS


Fig. 5.16 A map that cannot be properly colored with only three colors.

A regular solid is a 2-connected polyhedron each of whose faces is congruent
to a given regular polygon and each of whose vertices lies on the same
number of edges. It was known to the ancient Greeks that there could be only
five regular solids, but a careful use of Euler's Formula will show a much
more general result. It turns out that the metric or geometric properties are
not in themselves what prevent the existence of a sixth regular solid, but
merely the numerical relationships between the numbers V, E, and F.
Theorem 5.1 There can be no more than five regular solids.
Proof. Suppose we have a regular solid; that is, a 2-connected polyhedron
with the property that the same number of edges meet at each vertex, and
such that each face is bounded by the same number of edges. The latter
condition means that we are actually considering figures that are much more
general than regular solids, for in a regular solid not only is each face bounded
by the same number of edges, but in fact each face is congruent to a fixed
regular polygon. But we shall show, nevertheless, that there can be but five
such figures, and it then follows that only five regular solids can exist.
Since the number of edges incident at each vertex is the same, let us
denote this number by p; similarly, let q denote the fixed number of edges
bounding each face. Let us first count the number of edges by multiplying
the number p incident at each vertex by the number V of vertices; we obtain
pV. But since this process counts each edge twice, we find that

pV = 2E.

Let us also count the number of edges by multiplying the number q
bounding each face by the number F of faces; we obtain qF. This process
also counts each edge twice, and hence

qF = 2E.

Using the above two equations, we express each of E and F in terms of V,
and we obtain
E = pV/2 and F = pV/q.

We substitute the above in Euler's Formula, V - E + F = 2, which holds
for such a figure as we are considering, and we find that

V - pV/2 + pV/q = 2,
or
2Vq - pqV + 2pV = 4q.

We solve for V, and we obtain the equation

V = 4q / (2p + 2q - pq).

Now V is a positive whole number, and so is 4q, so that the last denominator
must be positive. That is,
2p + 2q - pq > 0,
or
pq - 2p - 2q < 0.
We next add 4 to both sides of this last inequality in order to make it
possible to factor the left-hand side, and we get

pq - 2p - 2q + 4 < 4,
which, when factored, becomes

(p - 2)(q - 2) < 4.

Our definition of polyhedron implies that each face has at least three bounding
edges and that there are at least three edges incident at each vertex. Thus
each of p and q is no less than 3. Hence both p - 2 and q - 2 are positive
whole numbers, and their product is less than 4. The only possibilities are
these:
p = 3, q = 3: The tetrahedron
p = 4, q = 3: The octahedron
p = 3, q = 4: The cube
p = 5, q = 3: The icosahedron
p = 3, q = 5: The dodecahedron

The values of V, E, and F may be found by substitution in the equations

V = 4q / (2p + 2q - pq),
E = pV/2,
F = pV/q.

For example, for p = 5 and q = 3, we obtain
V = 12, E = 30, F = 20,
so that the figure in question must have 20 faces, each a triangle (since q = 3);
30 edges; and 12 vertices, each the meeting place of exactly five edges (since
p = 5). We have referred to such a figure above as an icosahedron, whereas
this is more properly the name of the regular solid with such a number of
vertices, edges, and faces. What we have in fact shown is this: If one wishes
to glue together a number of not necessarily equilateral triangles to form a
2-connected polyhedron with five edges incident at each vertex, then 20
triangles must be used, and no other construction is possible. In particular,
there can be at most five regular solids in the classical sense.
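
The case analysis in this proof is short enough to check by machine. The Python sketch below searches for all pairs p, q that are at least 3 and satisfy (p - 2)(q - 2) < 4, and then evaluates the three formulas for V, E, and F. The function name and the search limit of 50 are arbitrary choices made for this illustration; the inequality itself guarantees that nothing is missed beyond small values.

```python
from fractions import Fraction


def regular_types(limit=50):
    """List (p, q, V, E, F) for every admissible pair with 3 <= p, q < limit."""
    found = []
    for p in range(3, limit):
        for q in range(3, limit):
            if (p - 2) * (q - 2) < 4:
                V = Fraction(4 * q, 2 * p + 2 * q - p * q)
                E = p * V / 2
                F = p * V / q
                # In every admissible case the three values are whole numbers.
                assert V.denominator == E.denominator == F.denominator == 1
                found.append((p, q, int(V), int(E), int(F)))
    return found


for p, q, V, E, F in regular_types():
    print("p =", p, "q =", q, "V =", V, "E =", E, "F =", F)
```

Exact fractions are used so that a non-integer value of V would be detected rather than silently rounded.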

Exercise

5.26 We have shown in Theorem 5.1 that there can be at most five regular
solids, but the theorem does not show that the five solids actually exist.
Some very careful work with blocks of wood, solid geometry, and a bandsaw
may make it seem very likely that all five do exist, but there is a difficulty.
No matter how carefully the bandsaw is used, and no matter how carefully
you measure lengths and angles, there is always the possibility that the figure
you have constructed is almost, but not quite, regular, since errors in lengths
and angles may be beyond the limits of physical measurement.
Suppose you were given eight equilateral and congruent triangles, and
were asked to prove that these can be assembled to form the boundary of a
regular octahedron. It would not be difficult to show that the first seven
could be matched up, each edge coinciding with another edge in most cases,
but then you would have the problem of showing that the last triangle would
exactly fit the triangular hole left in the constructed figure. This would
probably involve some very difficult solid analytic geometry and all sorts of
lengthy equations. Can you think of a way to prove the last triangle would
fit without all this agony? Then proceed to prove the existence of the other
four regular solids.

5.4 A CONVERSE OF EULER'S FORMULA


Suppose you are given three positive whole numbers, say a, b, and c. Suppose
also that these numbers satisfy the relationship of Euler's Formula: a - b +
c = 2. Need there exist a 2-connected polyhedron with V = a, E = b,
and F = c? Obviously not, for it seems clear that in any polyhedron, the
numbers V and F must be at least 4. But even if this condition is met, does
such a polyhedron have to exist? If not, can we find conditions on the
numbers a, b, and c that will assure its existence? What we are asking for is
this: Given positive whole numbers V, E, and F, such that V - E + F = 2,
what conditions on V, E, and F will assure the existence of a 2-connected
polyhedron with V vertices, E edges, and F faces?
E. Steinitz answered this question in a paper published in 1906. It
turns out that the answer to the above question is the double inequality

V + 4 ≤ 2F ≤ 4V - 8.

This inequality says, roughly, that the number of faces must be a little more
than half the number of vertices, but not quite so much as twice the number
of vertices. If you wish, you may convert the above inequality to the
alternative form
F + 4 ≤ 2V ≤ 4F - 8.
This shows that there is some symmetry hidden in the original inequality.
Later we will show a more natural geometric interpretation of this inequality.
We state Steinitz's result as a theorem and proceed with its proof.

Theorem 5.2 Let V, E, and F be three positive whole numbers. Then there
exists a 2-connected polyhedron with V vertices, E edges, and F faces if and
only if both of the following relations hold:

V - E + F = 2,
V + 4 ≤ 2F ≤ 4V - 8.
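
The two relations of the theorem are easy to test for any particular triple of numbers. The following Python sketch is a direct transcription of the conditions; the function name is hypothetical, and the sample triples are well-known counts (tetrahedron, cube, square pyramid) chosen for illustration.

```python
def steinitz_admissible(V, E, F):
    """True when (V, E, F) satisfies both relations of Theorem 5.2."""
    return V - E + F == 2 and V + 4 <= 2 * F <= 4 * V - 8


# Counts belonging to familiar solids.
print(steinitz_admissible(4, 6, 4))    # tetrahedron
print(steinitz_admissible(8, 12, 6))   # cube
print(steinitz_admissible(5, 8, 5))    # square pyramid

# Euler's relation alone is not enough: with E = 7 the only triples of
# positive whole numbers satisfying it and having V, F >= 4 are these two.
print(steinitz_admissible(4, 7, 5))
print(steinitz_admissible(5, 7, 4))
```

The last two calls illustrate that a triple can satisfy V - E + F = 2 and still fail the double inequality.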

This is called an "if and only if" theorem, for we must really supply
two almost independent proofs for the two theorems that Theorem 5.2 really
is. First we have to show that if K is a 2-connected polyhedron, then both the
above relations hold. Then we must show that if the numbers V, E, and F
satisfy both the above relations, then there exists a 2-connected polyhedron
K with the corresponding numbers of vertices, edges, and faces. The second
part will be a constructive proof; that is, we will show how one actually would
go about building the needed polyhedron. But the first part is much easier,
and so that is where we begin the proof.

Proof. Suppose that K is a 2-connected polyhedron with V vertices, E edges,
and F faces. We have already shown in Section 5.2 that V - E + F = 2,
and so it remains only to show that
V + 4 ≤ 2F ≤ 4V - 8.
Now there are at least three edges incident at each vertex. If we were to
count all the edges at each vertex and there happened to be exactly three
edges incident at each vertex, we would obtain the number 3V, and we could
then say that 3V = 2E since in this process each edge would be counted
twice. However, since there may be more than three edges at some vertices,
let us only count three and ignore any others. The number we obtain by such
careless counting will still be 3V, but since we may have ignored some edges,
we can say only that
3V ≤ 2E.
Similarly, we count only three edges in the boundary of each face, even
though some faces may have more. If each face had exactly three edges,
we would find that 3F = 2E; since we ignore some edges in this process
we can say only that 3F ≤ 2E. We know that Euler's Formula holds for K,
and so our information to this point may be summarized as follows:
V - E + F = 2,
3V ≤ 2E,
3F ≤ 2E.
We transform the first equation into
4 + 2E = 2V + 2F,
and replace the quantity 2E which appears in the resulting equation with
quantities known to be smaller (or at least no larger): first 3V, then 3F.
We thus obtain the two inequalities
4 + 3V ≤ 2V + 2F,
4 + 3F ≤ 2V + 2F.
We simplify each; the first becomes
4 + V ≤ 2F,
and the second becomes
4 + F ≤ 2V,
or
F ≤ 2V - 4,
or, finally,
2F ≤ 4V - 8.

Fig. 5.17 A cut of type A.

We now have both 4 + V ≤ 2F and 2F ≤ 4V - 8. Combining these two
into a single relationship, we get
V + 4 ≤ 2F ≤ 4V - 8,
and the first part of the proof is complete.
To finish the proof of the theorem, we need to show that if V, E, and F
are three positive whole numbers satisfying both the relations
V - E + F = 2,
and
V + 4 ≤ 2F ≤ 4V - 8,
then there exists a 2-connected polyhedron K with V vertices, E edges, and F
faces. We shall construct K by starting with a regular tetrahedron and literally
sawing off various chunks until we obtain the desired polyhedron. You can
imagine these saw cuts as actually being done with a bandsaw. There are
three types of cuts that will be needed. Each is to be applied to a vertex
where exactly three edges of the polyhedron meet. Since we are beginning
with a tetrahedron, it will always be possible to make the first cut. We will
need to show that in the construction of K, each cut we perform leaves us with
at least one vertex where exactly three edges meet, so that the process can be
continued until K is obtained.
Cut of Type A: See Fig. 5.17. Let P be a 2-connected polyhedron and v a
vertex of P where exactly three edges meet. Choose three points a, b, and c,
one on each of these three edges, and much closer to v than to the other
vertices of P. Cut along the plane determined by a, b, and c, and discard the
small tetrahedron with vertices v, a, b, and c.
Cut of Type B: See Fig. 5.18. Let P be a 2-connected polyhedron and v a
vertex of P where exactly three edges meet. Choose two points a and b,
one on each of two of these edges, and much closer to v than to the other
vertices of P. Let c be the vertex other than v on the third edge incident at v.
Cut along the plane determined by a, b, and c, and discard the small
tetrahedron with vertices v, a, b, and c.

Fig. 5.18 A cut of type B.

Cut of Type C: Let P be a 2-connected polyhedron and v a vertex of P
where exactly three edges meet. Let a be a point on one of these edges much
closer to v than to the other vertices of P, and let b and c be, respectively, the
vertices of P other than v on the other two edges incident at v. As indicated
in Fig. 5.19, cut along the plane determined by a, b, and c, and discard the
small tetrahedron with vertices v, a, b, and c.

Fig. 5.19 A cut of type C.

In each case, a new vertex will be formed at the point a, and it is easy to
see that exactly three edges must be incident at this vertex. Hence if we use
only cuts of the three types described above, and begin with a tetrahedron,
we shall always have available at least one vertex where exactly three edges
meet. So this cutting process can be continued as long as we please. Moreover,
it should be at least intuitively clear (though it is not difficult to prove)
that since we begin with a tetrahedron, which is 2-connected, we produce
a 2-connected polyhedron at each stage as a result of each of these cuts.
It is easy to show the following important facts about the results of these
types of cuts. First, a cut of type A increases the number of vertices by two
and the number of faces by one. Second, a cut of type B increases the number
of vertices by one and the number of faces also by one. However, some care
must be used in considering what happens as a result of a cut of type C.
We must avoid using a cut of type C at a vertex v where three edges meet,
but where also the three faces meeting at v are all triangles. Figure 5.20 shows
what would happen if a cut of type C were applied in such a case. The
number of vertices, edges, and faces would not change, so a cut of type C
here would do us no good in constructing our desired polyhedron K. We
must be sure, when applying a cut of type C, that not only is this cut applied
at a vertex v where exactly three edges are incident, but in addition one of the
three faces incident at v must have four or more edges. Moreover, we must
choose the points b and c indicated in Fig. 5.19 both in the boundary of the
face with four or more edges. Then when a cut of type C is applied, the face
with four or more edges is divided by the line from b to c into two faces.

Fig. 5.20 The case when a cut of type C should not be used.

Hence, under the right conditions, a cut of type C will increase the number of
faces by one without changing the number of vertices.
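
The effect of each cut on the counts can be tabulated in a few lines. In the Python sketch below, the changes in V and F are the ones stated above; the names are hypothetical, and the value of E is not tracked cut by cut but is recovered from Euler's Formula as E = V + F - 2. The sketch records counts only and does not model the geometry, so it is up to the caller to apply a cut of type C only where the text permits it.

```python
# Change in (V, F) caused by one cut of each type, as stated in the text.
# The entry for "C" assumes the cut is made under the right conditions.
CUT_EFFECT = {"A": (2, 1), "B": (1, 1), "C": (0, 1)}


def apply_cut(v, f, kind):
    """Return (V, E, F) after one cut; E is recovered from Euler's Formula."""
    dv, df = CUT_EFFECT[kind]
    v, f = v + dv, f + df
    return v, v + f - 2, f


# Starting from the tetrahedron, which has V = 4 and F = 4:
print("A:", apply_cut(4, 4, "A"))
print("B:", apply_cut(4, 4, "B"))

# A cut of type C is not allowed on the tetrahedron itself (all of its faces
# are triangles), so it is applied here to the result of a cut of type B.
v, e, f = apply_cut(4, 4, "B")
print("B then C:", apply_cut(v, f, "C"))
```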
Note that we do not care what happens to the number of edges as a
result of any of these cuts, for the following reason: When we finally succeed
in constructing a polyhedron K with the desired number of vertices and
faces, the number of edges will take care of itself because Euler's Formula
must hold for K.
Now we are ready to proceed with our construction of the polyhedron
K, which is to have V vertices, E edges, and F faces. We begin with a tetrahe-
dron, which has four vertices, six edges, and four faces. Suppose first that
V = F. In this case, simply apply F - 4 cuts of type B. The number of
vertices and faces of the tetrahedron will then each be increased by F - 4,
so the resulting polyhedron will have 4 + (F - 4) vertices, or F vertices.
But in this case, V = F, so that the resulting number of vertices will be V,
as desired. Moreover, the resulting polyhedron will have 4 + (F - 4) = F
faces, also as desired. As we mentioned earlier, the value of E takes care of
itself, as in any case E = V + F - 2.
The second possibility is that F < V. If so, first apply to the tetrahedron
2F - V - 4 cuts of type B and then V - F cuts of type A. Again, it is not
hard to verify that the resulting polyhedron does indeed have V vertices and
F faces.

Fig. 5.21 The graphical interpretation of the formula V + 4 ≤ 2F ≤ 4V - 8
(values of F on the horizontal axis, values of V on the vertical axis).

Finally, the only complicated case: that in which V < F. Here we must
use cuts of type C, but we must make sure that such a cut is applied only to a
vertex where three edges meet such that, of the three faces also incident there,
at least one has four or more edges. We do so as follows: We apply one cut
of type B followed by one of type C, using for the second cut one of the new
vertices formed by the cut of type B. It is not hard to see that a cut of type B
will always produce at least one new face with four or more edges, and that
one of the new vertices created on the edges of this face will be a vertex
where exactly three edges meet. So we use this vertex for the next cut of
type C.
We perform this process of first applying a cut of type B, then a cut of
type C, exactly F - V times. This is possible since V < F. After this process
is completed, we next apply exactly 2 V - F - 4 cuts of type B. Again, it is
easy to verify that the resulting polyhedron does have exactly V vertices and
F faces, as desired. This concludes the proof of Steinitz's Theorem.
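
The three cases of the construction can be summarized as a schedule of cuts. The Python sketch below is an illustration only: the function names are hypothetical, it tracks the counts V and F rather than any actual geometry, and it takes for granted that each cut of type C is made at a suitable vertex, as arranged in the proof.

```python
# Change in (V, F) caused by one cut of each type, as stated in the proof.
EFFECT = {"A": (2, 1), "B": (1, 1), "C": (0, 1)}


def cut_schedule(V, F):
    """Return the sequence of cuts prescribed by the proof for given V and F."""
    if not (V + 4 <= 2 * F <= 4 * V - 8):
        raise ValueError("V and F violate V + 4 <= 2F <= 4V - 8")
    if V == F:
        return ["B"] * (F - 4)
    if F < V:
        return ["B"] * (2 * F - V - 4) + ["A"] * (V - F)
    # V < F: each pair is a cut of type B followed by a cut of type C.
    return ["B", "C"] * (F - V) + ["B"] * (2 * V - F - 4)


def run(schedule):
    """Apply the cuts to a tetrahedron and return the final (V, E, F)."""
    v, f = 4, 4
    for kind in schedule:
        dv, df = EFFECT[kind]
        v, f = v + dv, f + df
    return v, v + f - 2, f      # E from Euler's Formula


print(cut_schedule(8, 6), run(cut_schedule(8, 6)))   # counts of a cube
print(cut_schedule(6, 8), run(cut_schedule(6, 8)))   # counts of an octahedron
```

The polyhedron obtained in this way has the requested counts but need not be the familiar solid with those counts; the schedule for V = 8 and F = 6, for example, is not claimed to produce a cube.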

Exercises

5.27 The graph shown in Fig. 5.21 shows plotted dots, each representing
a possible value pair for V and F. The two lines are the graphs of
V + 4 = 2F,
and
2F = 4V - 8.

The shaded region between the lines represents possible value pairs for V
and F for which a polyhedron can exist; outside this region none can exist.
Why?
5.28 The proof of Steinitz's Theorem shows how one may begin with a
tetrahedron and construct a polyhedron with given values of V and F.
Note that the tetrahedron itself is represented by the point of intersection
of the two lines shown in Fig. 5.21. Give a geometric interpretation of the
construction in Steinitz's Theorem with the aid of Fig. 5.21.
5.29 Show that the application of a cut of any one of the three types used in
the proof of Steinitz's Theorem to a convex polyhedron produces a convex
polyhedron. See Exercise 5.10. As a consequence of this exercise and Exercise
5.10, the construction used in the proof of the second part of Steinitz's
Theorem works because each polyhedron used is convex. Why is this enough
to make the construction work?
5.30 If a 2-connected polyhedron has 20 edges, what is the maximum number
of vertices it can have? What is the minimum number?
5.31 Verify that a cut of type A does indeed increase the number of vertices
of a polyhedron by two and the number of faces by one.
5.32 Verify that a cut of type B does indeed increase the number of vertices
of a polyhedron by one and the number of faces by one as well.
5.33 Verify that application of a cut of type C in the proper place on a
polyhedron does increase the number of faces by one without changing the
number of vertices.
5.34 In the proof of Steinitz's Theorem, where the polyhedron K is con-
structed, it is shown only for the case V = F that the constructed polyhedron
actually has the desired number of vertices and faces. Verify that the values
of V and F also come out right in the other two cases, the cases of F < V
and V < F.
5.35 To continue the previous exercise, how many cuts are actually needed
to produce the polyhedron K from a tetrahedron? Your answer should be in
terms of the number F of faces that K is supposed to have.

5.5 MAP COLORING


As we mentioned at the beginning of this chapter, it is an unsolved problem
whether or not four colors are sufficient to color every map of connected
countries on the surface of a sphere (or, for that matter, on the surface of any
2-connected polyhedron). With the aid of Exercise 5.23, which depends
itself on Euler's Formula, it is possible to show that each such map can be
colored using no more than five colors; the answer to Exercise 5.1 shows that
four colors are necessary. But no one has ever been able to construct such a
map requiring five colors, nor has anyone ever been able to prove that four
colors are sufficient for every possible map. This appears to be a very difficult
problem.

Fig. 5.22 The dashed line shows which countries and edges to remove in order to make the torus into a cylinder.

However, in what should be a more complicated case, the problem has
been solved. We refer to the case of a map on the surface of a torus. Actually,
the problem has even been solved for a number of other surfaces as well,
including the Möbius strip and the surface of a two-hole torus. It seems quite
strange that the problem has not been solved in what really ought to be the
simplest case of all: the case of the plane or sphere (the two are equivalent
problems). We present next the proof that the answer to the coloring problem
for the torus is seven; that is, that every map on the surface of a torus can be
colored with seven or fewer colors, and that there do exist maps that require
seven colors. Of course, we always refer to a proper coloring, in which
adjacent countries have different colors; moreover, in the case of maps on
the torus, we require not only that each of the countries be connected, but
also that each has connected boundary and that no country encircle the torus
in either of the two possible fashions. We could really eliminate these last
conditions by a simple device, but we will make these assumptions for
simplicity.
As you might expect, it will first be necessary to derive Euler's Formula
for a torus. Let a map of countries be given on a torus, subject to the con-
ditions mentioned above. As in Fig. 5.22, draw a circle around the torus
the short way, avoiding vertices, and passing through a country no more than
once. Remove the countries and boundaries crossed by this curve. This
operation will decrease the values of F and E for this map by the same amount,
so that the value of V - E + F remains unchanged. If there should be any
free vertices in the resulting map, where only two edges meet, remove these
as well. Each such operation decreases the value of V and E by one each,
so that after this is done, the value of V - E + F is still unchanged.

Fig. 5.23 The plane map resulting after the cylinder is deformed.

What we now have is a map on a cylinder, which can then be deformed
into a map in the plane in the shape of an annulus, as in Fig. 5.23. We add
a dummy country in the hole in the middle of the annulus, and note that we
have here a map of the sort examined in the proof of Euler's Formula in
Section 5.2. There is no need to triangulate this map and remove triangles,
for we already know that for this map,
V - E + F = 1.
We remove the dummy country in the hole, thus decreasing F by 1, and find
that
V - E + F = 0.
But the value of V - E + F is the same for the annular map as for the
original map on the torus, and so we discover that Euler's Formula for a
torus is
V - E + F = 0.
This is a result you should have obtained in Exercise 5.7.
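
As a quick sanity check on this formula, consider a map that is not discussed in the text: an m-by-n grid of four-sided countries drawn on a rectangle whose opposite sides are glued together to form a torus. The counts used below are an assumption of this sketch: each country is charged with one corner and with two of its edges, so that V = mn, E = 2mn, and F = mn. The function names are illustrative.

```python
def torus_grid_counts(m, n):
    """Return (V, E, F) for an m-by-n grid of countries on a torus.

    Assumes m, n >= 3 so that no country borders itself.
    """
    if m < 3 or n < 3:
        raise ValueError("use at least three rows and three columns")
    vertices = m * n       # one corner charged to each country
    edges = 2 * m * n      # one horizontal and one vertical edge per country
    faces = m * n          # the countries themselves
    return vertices, edges, faces


def euler_value(vertices, edges, faces):
    """Return V - E + F."""
    return vertices - edges + faces


for m, n in [(3, 3), (4, 7), (10, 5)]:
    print(m, n, euler_value(*torus_grid_counts(m, n)))   # last value is 0 each time
```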
We are now ready to prove that seven is the "map-coloring number"
for the torus.

Fig. 5.24 Adjusting a map so that only three edges meet at each vertex.

Theorem 5.3 Each map of connected countries each with connected boundary
on the surface of a torus can be colored with seven or fewer colors.
Proof The theorem is clearly true for a map consisting of seven or fewer
countries. Suppose it is false for some value of F larger than seven. Then
there is a least such value of F for which the theorem is false; and so, letting
that value be itself denoted by F, there is a map on a torus with F countries
that cannot be colored with seven colors. Remember that F is the smallest
number for which we are supposing the theorem to be false, so that for
example, any map with F - 1 countries can be colored with seven colors.
We have a map of F countries on the surface of a torus, and this map
requires eight or more colors for a proper coloring. We next adjust this map
so that only three edges meet at each vertex. One way to do this is shown in
Fig. 5.24, in which a few edges are moved slightly to one side or the other.
This will increase the value of V and E, but F will not be changed.
Since three edges meet at each vertex, for this adjusted map we can
multiply the number of edges at each vertex by the number of vertices, and
obtain the number 3V. This counts each edge twice, hence
3V = 2E,
or
6V = 4E.
Also, for this map
V - E + F = 0,
and so
F = E - V,
so that
6F = 6E - 6V.
Since we also know that 6V = 4E, then
6F = 6E - 4E,
or
6F = 2E.
We now claim that one of these countries has fewer than seven boundary
segments, and hence is bounded by fewer than seven other countries. For
if not, we let F_n denote the number of countries of this map that have exactly
n boundary segments, for all natural number values of n. If no country has
fewer than seven boundary segments, then
F_1 = F_2 = F_3 = F_4 = F_5 = F_6 = 0,
and
F = F_7 + F_8 + F_9 + ....
We count the number of edges by counting the number bounding each
country. Since seven edges bound each of the F_7 countries, eight edges bound
each of the F_8 countries, and so on, and because this process counts each edge
exactly twice, we find that
2E = 7F_7 + 8F_8 + 9F_9 + ...
   ≥ 7F_7 + 7F_8 + 7F_9 + ...
   = 7 · (F_7 + F_8 + F_9 + ...)
   = 7F.
Hence 2E ≥ 7F > 6F. But we had previously established that, on the contrary,
2E = 6F. This contradiction shows that at least one country must have six or
fewer boundary segments.
Remove such a country temporarily from the map, and allow the coun-
tries which formerly bounded it to annex the now unclaimed territory, as in
Fig. 5.25. These six or fewer countries bound exactly the same countries as
before the annexation, provided that the annexation is carried out properly.
This new map has F - 1 countries, too, and hence can be colored using no
more than seven colors. Color it.
We now ask the countries that took part in the annexation to cede their
new territory back to the country that was temporarily removed, and we
replace that country. Since it is bounded by no more than six countries, it
touches only six other colors at most, and the seventh color is available to
color it properly. Color it that color.

Fig. 5.25 Annexation of a country by its neighbors.

At the beginning of the proof, we made an adjustment of the edges so
that only three edges met at each vertex. We now reverse that adjustment,
restoring the original map. No new boundaries between countries are set up
in this process; indeed, some countries may no longer bound ones they did
previously. Hence our coloring with seven colors will also work for the
original map. This contradicts the assumption that the map could not be
colored using only seven colors, and hence our supposition that the theorem
was false is itself a false assumption. Thus the theorem is true, and we have
completed the proof of Theorem 5.3.
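
The induction in this proof can also be read as a procedure: remove a country with six or fewer neighbors, color what is left, then put the country back and give it a free color. The sketch below is a graph-style restatement and not the author's construction. It assumes that a map is given as a dictionary sending each country to the set of countries it borders, and it simply deletes a removed country from that dictionary instead of modeling the annexation of territory. All names are illustrative.

```python
def color_by_removal(adjacency, max_colors=7):
    """Color countries so that neighbors differ, using at most max_colors colors.

    adjacency: dict mapping each country to the set of its neighbors
    (assumed symmetric, with no country bordering itself).
    """
    remaining = {c: set(n) for c, n in adjacency.items()}
    removed = []   # countries in the order they were temporarily removed
    while remaining:
        # Find a country with fewer than max_colors neighbors still on the map.
        country = next(
            (c for c, n in remaining.items() if len(n) < max_colors), None
        )
        if country is None:
            raise ValueError("every remaining country has too many neighbors")
        for neighbor in remaining[country]:
            remaining[neighbor].discard(country)
        del remaining[country]
        removed.append(country)
    colors = {}
    # Restore the countries in reverse order, giving each a color not yet
    # used by the neighbors that are already back on the map.
    for country in reversed(removed):
        used = {colors[n] for n in adjacency[country] if n in colors}
        colors[country] = next(k for k in range(max_colors) if k not in used)
    return colors


# Seven countries, each bordering the other six.
seven = {i: {j for j in range(1, 8) if j != i} for i in range(1, 8)}
coloring = color_by_removal(seven)
print(len(set(coloring.values())))   # 7
```

When a country is restored, the neighbors already on the map are exactly those that were still present when it was removed, so there are at most six of them and a seventh color is always free.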
All that is left is to establish the existence of a map on the torus that
actually does require seven colors for a proper coloring. One such, using
in fact only seven countries, is shown in Fig. 5.26. This figure requires some
explanation. It is too confusing to draw a view of a semitransparent torus
on a two-dimensional piece of paper and try to show the various countries.
What appears in Fig. 5.26 is instead a recipe for building such a torus; or,
if you prefer, a set of directions for painting such a map on the surface of an
old inner tube. If you make a copy of Fig. 5.26 on a flat but flexible rectangle
of paper and glue together the two long sides of the rectangle, you will have a
long cylinder. Then glue together the ends of the cylinder; you will obtain
a torus. Hence, in Fig. 5.26, opposite sides of the rectangle are to be thought
of as attached, and so a country such as number 2 (for example) has a com-
mon boundary with country number 7. You can verify that each of the seven
countries actually shares a boundary with each of the other six. We have
drawn the figure, for simplicity, so that the sides of the rectangle are also
boundaries of countries, except for the four corners, all of which belong to
country number 1.

Fig. 5.26 A map on the torus requiring seven colors.

This example, together with Theorem 5.3, shows that the map-coloring
number for the torus is indeed seven. The example shows that seven colors
are necessary; the theorem shows that more are unnecessary.

Exercises

5.36 Each face of a certain regular solid is a pentagon, and exactly three
edges meet at each of its vertices. Use the techniques of this chapter, but not
the results of Theorem 5.1, to find the possible number of faces such a figure
can have.
5.37 Suppose that a map on the surface of a torus has 11 vertices and each
country has three edges. How many countries can there be? Why?
5.38 What is the value of V - E + F for the Möbius strip? (See Exercise
5.13.) Prove that your answer is correct.
5.39 What is the map-coloring number for the Möbius strip? Hint: Use
Exercise 5.38.
5.40 A map on the surface of a sphere consists of pentagons and hexagons
attached as shown in Fig. 5.27. Note that:
a) five hexagons surround each pentagon;
b) three pentagons and three hexagons alternate, surrounding each hexagon;
c) each vertex necessarily lies on exactly three edges; and
d) not all of the map is shown in Fig. 5.27.
Find how many pentagons and hexagons there can be. Hint: Let P
denote the number of pentagons and H the number of hexagons. Then
P + H = F, where F is the number of faces; and, as usual,
V - E + F = 2.
Incidentally, the figure described one possible pattern for a geodesic
dome, such as has been used for large structural work, and also gives the
pattern of polypeptides in the beet virus molecule.

Fig. 5.27 The pattern of hexagons and pentagons for Exercise 5.40.

5.41 The author's wife has called to his attention the following generaliza-
tion of the previous exercise: Suppose that a map on a sphere consists of
countries each having either five or six sides, and such that each vertex lies
on exactly three edges. Then exactly 12 of the countries have five sides.
See whether you can prove this.
5.42 Continuing the previous exercise, what are the possible numbers of
countries with six sides, proceeding under the following "regularity" assump-
tion: Each pentagonal country bounds the same number of hexagonal
countries?
5.43 If you look over the proof of Steinitz's Theorem, you will see a number
of instances in which the next step is to perform (for example) 2V - F - 4
cuts of one type, and so on. Check each such statement in the proof to make
sure the number of cuts given is never negative, for if it were, this would
invalidate the proof. Doing this exercise will help show exactly why the
hypothesis
V + 4 ≤ 2F ≤ 4V - 8
is needed.
5.44 Suppose that a map on the surface of a sphere consists of a number T
of triangular countries and a number Q of four-sided countries, such that
each vertex lies on exactly three edges, each triangular country is bounded
by three four-sided countries, and each four-sided country is bounded by
exactly two triangular countries. How many triangular and how many
four-sided countries are there?
5.45 Can a map such as the one described in the previous exercise exist on
the surface of a torus?
5.46 Suppose that a map on the surface of a sphere consists of a number T
of triangular countries and a number Q of four-sided countries, such that
each vertex lies on exactly four edges, each triangular country is bounded
by exactly three four-sided countries, and each four-sided country is bounded
by exactly two triangular countries meeting it in opposite sides. How many
triangular countries and how many four-sided countries are there?
5.47 Can a map such as the one described in the previous exercise exist on
the surface of a torus? Explain your answer.
5.48 The situation involved in describing "regular solid tori" is in one way
simpler, in another way more complicated, than in the case of the sphere.
Assume that a map on the surface of a torus consists of a number of countries
such that
a) each country has connected boundary,
b) each vertex lies on the same number d ≥ 3 of edges, and
c) each country has the same number n ≥ 3 of sides.
(This is in analogy to the case for the ordinary regular solids.) Use the torus
formula
V - E + F = 0
to show that there are only three possibilities for the shapes of the countries:
triangles, quadrilaterals, or hexagons.
5.49 Show that at least one map for each of the three possibilities of the
previous exercise exists.
5.50 Show that, in continuation of the previous two exercises, more than
one map of each of the sorts discovered can exist. In fact, infinitely many
exist, of each type, and thus there are infinitely many "regular tori"; in-
finitely many are composed of triangular countries, infinitely many of
rectangular countries, and infinitely many of hexagonal countries.

NOTES AND REFERENCES


The paper of Steinitz mentioned in Section 5.4 is his "Über die Eulerschen
Polyederrelationen," Arch. Math. Phys. (3), 11 (1906), pp. 86-88. The proof
given in this text is new and possibly simpler. Some other general references
to polyhedra and map coloring are listed below:
Coxeter, H. S. M., Introduction to Geometry (Wiley, 1961).
Coxeter, H. S. M., Regular Polytopes, second edition (Macmillan, 1963).
Dynkin, E. B. and V. A. Uspenskii, Multicolor Problems, translated by
N. D. Whaland, Jr., and R. B. Brown (Heath, 1963).
Grünbaum, B., Convex Polytopes (Interscience, 1967).
Hilbert, D. and S. Cohn-Vossen, Geometry and the Imagination, translated
by P. Nemenyi (Chelsea, 1952).
Lyusternik, L. A., Convex Figures and Polyhedra, translated by T. Jefferson
Smith (Dover, 1963).
Ore, O., The Four-Color Problem (Academic Press, 1967).
Tietze, H., Famous Problems of Mathematics (Graylock, 1965).

A complete proof of the Five-Color Theorem for the sphere and plane
may be found in Sherman Stein's Mathematics: The Man-Made Universe,
second edition (Freeman, 1969), in the chapter on map-coloring.
The great Swiss mathematician Leonhard Euler was born in Basle in
1707. He was one of the most productive of all mathematicians, continuing
his research even after he became totally blind in his sixtieth year until his
death seventeen years later. He spent most of his life either at St. Petersburg
(Leningrad) or at Berlin, under the sponsorship of royalty, and contributed
to mechanics, the calculus of variations, the three-body problem, number
theory; indeed, to virtually every branch of mathematics. His interests
were not restricted to mathematics. In his stay in Russia he developed a
mathematical theory of investment (out of which our present theory of
annuities grew), wrote most of the mathematics textbooks for the Russian
school system (he was a superb textbook writer), and reformed the Russian
system of weights and measures.
The four-color problem is not so old as most people think. The best
evidence we have is that Francis Guthrie (later a professor of mathematics)
asked one of his teachers at University College, London, for a proof that
four colors were sufficient to color any map in the plane. The teacher, who
was the well-known mathematician Augustus de Morgan, communicated
this problem to his colleague, the famous Sir William Rowan Hamilton, in a
letter written in 1852. Apparently the problem did not formally appear in
print until about 1878; at that time incorrect proofs were published by Kempe
and Tait. P. J. Heawood found the flaw in Kempe's proof, and published in
1890 a paper showing how Kempe's proof could be modified to show that
five colors are sufficient.
Last-minute note: The author has just received a copy of the paper
"The Four Color Theorem," by Professor Emeritus (of Duke University)
Joseph Miller Thomas. In this paper Dr. Thomas states that each map on the
sphere can be properly colored using no more than four colors, and gives
a proof of some twelve printed pages. Perhaps this century-old problem
has finally been solved.
CHAPTER 6

INFINITE SETS

Imagine a hotel with so many rooms that it takes all the positive whole
numbers to number the rooms. That is, the hotel has its rooms numbered
1, 2, 3, 4, 5, ... , without any largest room number. If the hotel is filled with
guests, one in each room, it might seem difficult to provide space for an
additional guest. But should such an extra guest arrive at the hotel, a clever
room clerk can arrange for this guest to have a room to himself with only
minor inconvenience to the other guests: The room clerk can request the
guest in room 1 to move to room 2, the guest in room 2 to move to room 3,
and in general request the guest in room n to move to room n + 1. He can
then assign the new guest to room 1. The hotel is still full, all the guests still
have private rooms, and the new guest has been accommodated.
In order to repay all the other guests for their courtesy, the new guest
devises a scheme for improving the financial status of each. He posts a large
sign in the hotel lobby directing each guest in rooms numbered 1 through 10
to deliver one dollar to the guest in room 1, each guest in rooms numbered
11 through 20 to deliver one dollar to the guest in room 2, each guest in rooms
numbered 21 through 30 to deliver one dollar to the guest in room 3, and in
general, each guest in rooms numbered 10n + 1 through 10n + 10 to
deliver one dollar to the guest in the room numbered n + 1. (The number n
is supposed to take on all positive whole number values.) Now this is no
great inconvenience, for each guest need make no more than one trip, and the
net result is that each guest receives ten dollars while paying out one, thus
making a profit of nine dollars. Actually, our clever friend in room 1 does
best of all by this scheme, for he receives nine dollars just as everyone else
does, but he does not have to do any walking.
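
The payment scheme can be tried out on a finite stretch of the hotel. The sketch below is only an illustration: it inspects the first few rooms, which is possible because the payers for rooms 1 through N all live in rooms 1 through 10N. The function names are made up for this example.

```python
from collections import Counter


def recipient(room):
    """Room that receives the dollar paid by the guest in `room`."""
    return (room - 1) // 10 + 1


def profits(num_rooms):
    """Net gain in dollars for rooms 1..num_rooms under the posted scheme."""
    # Every guest in rooms 1..10*num_rooms pays one dollar to recipient(room).
    received = Counter(recipient(payer) for payer in range(1, 10 * num_rooms + 1))
    # Each guest also pays out exactly one dollar.
    return {room: received[room] - 1 for room in range(1, num_rooms + 1)}


print(recipient(10), recipient(11), recipient(30))   # 1 2 3
print(profits(5))   # {1: 9, 2: 9, 3: 9, 4: 9, 5: 9}
```

In a hotel with only finitely many rooms the last guests would have nobody to collect from, so the books could not come out ahead for everyone; it is the absence of a last room that makes the scheme work.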
Of course, this plan will not work properly unless each guest has at least
one dollar to start with, but if some do not, there is still a way to handle the
problem. Suppose we have the extreme case in which the guests in the odd-
numbered rooms have no money, but all the others have at least one dollar.
The guests in the even-numbered rooms could then be directed thus: Each is
to deliver one dollar to the guest in the room with the number half his room
number. Then the guest in room 2 would deliver a dollar to the guest in room
1, the guest in room 4 would deliver a dollar to the guest in room 2, and in
general the guest in room numbered 2n would deliver a dollar to the guest
in the room numbered n. You can easily verify for yourself that each guest
in an odd-numbered room receives one dollar, and each guest in an even-
numbered room pays out and receives one dollar. Thus there would be no
change in the financial status of the guests in the even-numbered rooms, while
each guest in an odd-numbered room would now have one dollar. Then the
original scheme proposed above could take place without any difficulty.
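
The preliminary redistribution can be tabulated in the same spirit. This sketch assumes, for definiteness, that each guest in an even-numbered room starts with exactly one dollar (the text requires only at least one) and that each guest in an odd-numbered room starts with none. The function name is illustrative.

```python
def balances_after_halving(num_rooms):
    """Dollars held in rooms 1..num_rooms after every room 2n pays room n."""
    balances = {}
    for room in range(1, num_rooms + 1):
        is_even = room % 2 == 0
        start = 1 if is_even else 0   # assumed starting money
        paid = 1 if is_even else 0    # only even-numbered rooms pay
        received = 1                  # from room 2 * room, which is even
        balances[room] = start - paid + received
    return balances


print(balances_after_halving(6))   # {1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1}
```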
Such peculiar manipulations are possible, of course, because we are
dealing with infinite sets: the hotel has infinitely
many rooms as well as infinitely many guests, and infinitely many dollars
are involved in the financial transactions. However, in spite of the apparent
contradictions involved here, we hope to show you that such phenomena
are natural, even common, when one deals with infinite sets. Throughout
much of the following material, the central idea is the concept of a one-to-
one correspondence between two sets. With this concept we will be able to
compare the numbers of elements of two infinite sets because it turns out to be
quite possible that one infinite set actually contains "more" elements than
another. The basic tools are some ideas about sets and functions.

6.1 SETS
By a set we mean a collection of objects, thought of as a whole. The objects
which make up the set are called the elements of the set.
We do not pretend that the above comprises a definition, at least in the
formal sense that previous definitions have been given in this book. Any
attempt to define a term requires the introduction of other more primitive
terms, and so it is easy to see that some terms must be taken as absolutely
primitive, their meaning assumed clear. However, it is helpful in such cases
to attempt to give synonyms and examples. For the term "set," which we
take as such a primitive term, we can supply a large list of synonyms. Some
of these could be "class," "collection," "aggregate," "conglomerate," or
just "bunch." We will soon give several examples, and ask you to use your
powers of abstraction. One final note: In the next section we shall encounter
the same problem with the definition of the term "function."
It is not necessary that the elements of a set have any particular property
in common. For example, it makes sense to talk of the set whose four elements
are the first two words on this page, the number 6, and the moon. But we do
ask two things:
First, that of any object potentially the member of a set under considera-
tion, it is at least theoretically possible to determine whether or not that
object belongs to the given set. In the above example, it is possible for you to
determine whether or not the word "however," or the positive square root of
36, or the eighth largest body in the Solar System, belongs to the set mentioned.
Second, we also require that we can at least theoretically tell apart any
two objects which do belong to a given set. One reason for this is that we will
follow the convention that objects of a given set are not to be listed in that
set more than once; otherwise, in the material which follows, we would run
into difficulties in counting the number of elements of a set. For example,
if W is to be the set whose elements are the first president of the United
States and George Washington, we want the list of elements of W to contain
only one entry, so that we can say that W contains one element.
These two considerations (that a set should be sufficiently well defined
to tell what its objects are and to tell them apart) are probably quite in
accord with your intuitive notion of what a set is, but they are also important
considerations for those mathematicians who study logic and the foundations
of mathematics.
There are basically two ways to specify a set. One method might be called
the listing method. One simply writes down the elements of the set. This
method is quite useful for sets with only a small number of objects or elements,
but if the number of elements is large (or if the set is infinite) we must resort
to the ellipsis:
A = {1, 2, 3, 4},
B = {1, 2, 3, 4, ..., 100},
C = {1, 2, 3, 4, ...}.
When the elements of a set are listed, the list is enclosed in braces for clarity.
The set A above consists of the first four positive whole numbers; B consists
of the first one hundred positive whole numbers; C consists of all the positive
whole numbers. It is certainly possible to tell whether or not any given object
is a positive whole number, and given two positive whole numbers it can be
decided whether or not they are equal. Thus these three examples satisfy
the two criteria previously mentioned.
As an alternative to the listing method for specifying a set, we can use the
descriptive method.
D = {x | x is a positive whole number}.
We read this notation as follows. As soon as you see the first brace on the
left, you realize that you are about to encounter a set, and you should think
to yourself "The set of," or "The set consisting of ...." The next word is
quite important, and it is surprising that it is omitted in the above notation;
the word is "all." The letter x is simply a dummy variable, to be used in the
sentence that will follow the vertical bar, so up to the vertical bar you can
translate as follows:
"The set of all objects ... "
or
"The set consisting of all elements x ...."
The vertical bar is merely translated "such that" or "with the property that."
Thus what follows that vertical bar must be a declarative sentence. It gives the
exact condition an object must satisfy in order that it belong to the set being
described. The brace at the right-hand side tells that the description has been
completed. Thus, the entire sentence
D = {x | x is a positive whole number}
may be correctly translated as
"D is the set consisting of all objects x such that x is a positive whole
number,"
or
"D is the set of all x such that x is a positive whole number,"
or
"D is the set of all positive whole numbers,"
or, finally,
"An object is an element of the set D if and only if the object is a
positive whole number."
Of course, you see that the set D above contains exactly the same objects
as the set
C = {1, 2, 3, 4, ...}.
Since it is the aggregate of objects itself which makes up a set, not the particular
method of describing the set, we have here an example of what it means for
two sets to be the same. We say that C and D are equal sets, and we write
simply C = D.
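
The two ways of specifying a set have close counterparts in Python, which can serve as an informal illustration. In the sketch below the listing method becomes a set literal and the descriptive method becomes a set comprehension. Since a program can hold only finitely many elements, the check is made for the finite set B rather than for C and D; the helper name and the range of candidates are assumptions of this example.

```python
def is_positive_whole(x):
    """The condition an object must satisfy to belong to D."""
    return isinstance(x, int) and not isinstance(x, bool) and x >= 1


# Listing method: write the elements down.
A = {1, 2, 3, 4}
B_listed = set(range(1, 101))          # 1, 2, 3, 4, ..., 100

# Descriptive method: all x (among some candidates) such that a condition holds.
candidates = range(-50, 151)
B_described = {x for x in candidates if is_positive_whole(x) and x <= 100}

print(B_listed == B_described)         # True: same elements, so equal sets

# An object is never listed twice in a set.
W = {"George Washington", "George Washington"}
print(len(W))                          # 1
```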
Using the symbol ∈ we can also supply a convenient shorthand for the
idea of an object's belonging to a set. If x is an element of the set C, then we
write x ∈ C; if not, we write x ∉ C. These two examples may be translated
as follows:
"x ∈ C" translates as "The object x is an element of the set C" or "x is
an element of C."
"x ∉ C" translates as "The object x is not an element of the set C" or
"x is not an element of C."
In the example of the set C given above, two true statements using this
notation are
2 ∈ C and -3 ∉ C.
Two false but meaningful statements are
0 ∈ C and 17 ∉ C.
Finally, using this symbol ∈ and the set C, we can use the descriptive notation
for the sets
A = {1, 2, 3, 4},
and
B = {1, 2, 3, 4, ..., 100},
by writing the equivalent statements
A = {x ∈ C | 1 ≤ x ≤ 4},
and
B = {x ∈ C | 1 ≤ x ≤ 100}.
This quantification of the dummy variable x has the property of removing
ambiguity, for without knowing what sorts of objects the values of x were
limited to, we could not tell whether the set A, as described above, contained
but four elements, or infinitely many (in case x were allowed to take on real
number values).
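
The effect of restricting the dummy variable can be seen in a small experiment. In this sketch membership in the infinite set C is represented by a test function rather than by a stored collection, and a short list of candidate objects, chosen arbitrarily for the example, stands in for "all objects."

```python
from fractions import Fraction


def in_C(x):
    """True when x is a positive whole number, that is, an element of C."""
    return isinstance(x, int) and not isinstance(x, bool) and x >= 1


print(in_C(2), in_C(-3), in_C(0))   # True False False

# A few candidate objects, some of them not whole numbers.
candidates = [-3, 1, Fraction(3, 2), 2, Fraction(5, 2), 3, 4, 5]

# With x limited to elements of C, the condition picks out four elements.
A = {x for x in candidates if in_C(x) and 1 <= x <= 4}

# Without that limit, the fractions between 1 and 4 are admitted as well.
unrestricted = {x for x in candidates if 1 <= x <= 4}

print(sorted(A))           # [1, 2, 3, 4]
print(len(unrestricted))   # 6
```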
There are important relations between sets, and one of the most important
is the idea of inclusion. The set A is said to be a subset of the set B provided
that each element of A is also an element of B. Symbolically, we write
A ⊂ B as a shorthand for "A is a subset of B," and it is customary to say,
if A ⊂ B, that A is contained in B, that B contains A, that A is a subset of
B, or that B is a superset of A. For A and B as in the examples above, it is
true that A ⊂ B and false that B ⊂ A; in this case we write B ⊄ A.
For a formal definition, we can state the following: Let A and B be sets.
Then A ⊂ B if, and only if, for each x ∈ A also x ∈ B. Moreover, using
the relation of set-inclusion, we can also define set-equality: Let A and B
be sets. Then A = B if and only if both A ⊂ B and B ⊂ A.
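
These two definitions translate almost word for word into code for finite sets. The following sketch uses illustrative function names; note that inclusion as defined here allows the two sets to be equal, which matches Python's built-in operator `<=` on sets.

```python
def is_subset(a, b):
    """A is a subset of B when each element of A is also an element of B."""
    return all(x in b for x in a)


def sets_equal(a, b):
    """A = B when A is a subset of B and B is a subset of A."""
    return is_subset(a, b) and is_subset(b, a)


A = {1, 2, 3, 4}
B = set(range(1, 101))

print(is_subset(A, B), is_subset(B, A))   # True False
print(sets_equal(A, {4, 3, 2, 1}))        # True
print(is_subset(A, B) == (A <= B))        # True
```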
New sets can also be manufactured from old ones. The union of the two
sets S and T is the set consisting of all elements which belong either to S
or to T (or both), and is denoted by S ∪ T. In symbols, then,
S ∪ T = {x | x ∈ S or x ∈ T}.
We can also form the intersection of S and T, consisting of all elements
common to S and T; that is,
S ∩ T = {x | x ∈ S and x ∈ T}.
Since it may happen that S and T have no elements in common, it turns out
to be convenient to accept the notion of the so-called empty set, denoted by
∅, which contains no elements. Thus, for example, if
S = {x | x is an even integer}
and
T = {x | x is an odd integer},
then
S ∩ T = ∅.
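
The definitions of union and intersection can be written out directly as set comprehensions. In the sketch below the infinite sets of even and odd integers are cut down to the integers from -10 to 10; that finite universe is an assumption made only so that the sets can be stored.

```python
universe = set(range(-10, 11))

S = {x for x in universe if x % 2 == 0}   # even integers in the universe
T = {x for x in universe if x % 2 != 0}   # odd integers in the universe

# Union: x belongs to S or to T (or both).
union = {x for x in universe if x in S or x in T}

# Intersection: x belongs to S and to T.
intersection = {x for x in universe if x in S and x in T}

print(union == universe)       # True
print(intersection == set())   # True: the empty set
print(union == S | T, intersection == S & T)   # True True
```

Python's operators `|` and `&` compute the same two sets, as the last line confirms.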
In the examples below, let
A = {1, 2, 3, 4},
B = {1, 2, 3, 4, ..., 100},
C = {1, 2, 3, 4, ...},
S = {..., -4, -2, 0, 2, 4, 6, ...},
and
T = {..., -3, -1, 1, 3, 5, 7, ...}.
Example 6.1 The following are true statements:
5 ∉ A,    5 ∉ S,
5 ∈ B,    5 ∈ T,
5 ∈ C,    5 ∉ ∅.
Example 6.2 The following are true statements:
A ⊂ B,    S ⊄ T,
B ⊂ C,    ∅ ⊂ S,
C ⊄ S,    T ⊄ C.
Example 6.3 The following are true statements:
A ∪ B = B,
A ∪ C = C,
S ∪ T = W,
where W is the set of all whole numbers.

Fig. 6.1 Venn diagrams illustrating A ∩ B and A ∪ B.

Example 6.4 The following are true statements:
A ∩ B = A,
A ∩ S = {2, 4},
C ∩ T = {1, 3, 5, 7, ...}.
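
Examples 6.1 through 6.4 can be spot-checked by machine, with one caveat: the infinite sets C, S, T, and W have to be cut down to a finite window of integers, so the check below is an illustration and not a proof. The window size N is an arbitrary choice for this sketch (it must be at least 100 so that B fits inside C).

```python
N = 200                      # half-width of the finite window
window = range(-N, N + 1)

A = {1, 2, 3, 4}
B = set(range(1, 101))
C = {x for x in window if x >= 1}        # positive whole numbers in the window
S = {x for x in window if x % 2 == 0}    # even integers in the window
T = {x for x in window if x % 2 == 1}    # odd integers in the window
W = set(window)                          # all whole numbers in the window
empty = set()

# Example 6.1
assert 5 not in A and 5 in B and 5 in C
assert 5 not in S and 5 in T and 5 not in empty
# Example 6.2
assert A <= B and B <= C and not C <= S
assert not S <= T and empty <= S and not T <= C
# Example 6.3
assert A | B == B and A | C == C and S | T == W
# Example 6.4
assert A & B == A and A & S == {2, 4}
assert C & T == {x for x in window if x >= 1 and x % 2 == 1}

print("Examples 6.1 through 6.4 hold on the window")
```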
For some people, Venn diagrams as shown in Fig. 6.1 are very useful in
visualizing the unions and intersections of sets. The rectangle U symbolizes
some fixed universe of elements under consideration, and the area within the
circle A is meant to symbolize the elements of the set A. The set A ∩ B is
shown on the left, shaded; A ∪ B is shaded in the right figure.

Exercises

6.1 Why are the sets {1, 2, 4} and {2, 4, 1} equal?
6.2 Let A = {1, 2, 3}, B = {2, 3, 5}, and C = {4, 5, 6}. Express the sets
below by listing their elements between braces, or by using ∅ if necessary.
a) A ∪ B    b) A ∩ B
c) A ∪ C    d) A ∩ C
e) (A ∪ B) ∪ C    f) A ∪ (B ∪ C)
g) (A ∩ B) ∩ C    h) A ∩ (B ∩ C)
6.3 Using A, B, and C as in the previous exercise, compare (A ∪ B) ∩ C
and A ∪ (B ∩ C) to see if they are equal.

Fig. 6.2 The correct way to draw three sets in a Venn diagram.

6.4 In drawing Venn diagrams involving three sets A, B, and C, they should
overlap as shown in Fig. 6.2 so as to allow for all possibilities of the various
intersections of these sets. Of course, it is possible that no elements lie in
some region, such as B ∩ C, but it does no harm to let B and C overlap.
Shade in the following sets in two copies of Fig. 6.2:
A ∩ (B ∪ C),
and
(A ∩ B) ∪ (A ∩ C).

6.5 Your result on the previous exercise should suggest that the formula
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
holds for all sets A, B, and C. The picture is, of course, no substitute for a
formal proof, but it can serve as a guide for construction of such a proof.
One way to invent such a proof is to choose an arbitrary element, say x,
from the set A ∩ (B ∪ C) and show that x must necessarily belong to the
set (A ∩ B) ∪ (A ∩ C). Then one would choose an arbitrary element x
(there is no harm in using the same symbol as before) from the set (A ∩ B) ∪
(A ∩ C) and show that x must also necessarily belong to the set A ∩ (B ∪ C).
These two proofs would show that, in order,
A ∩ (B ∪ C) ⊂ (A ∩ B) ∪ (A ∩ C),
and
(A ∩ B) ∪ (A ∩ C) ⊂ A ∩ (B ∪ C).
Hence by the definition of equality of sets,
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
Provide the necessary details of this proof.
6.6 The previous exercise should remind you of the distributive law of
multiplication over addition, which holds in the real number system. If
a, b, and c are real numbers, then
$a \cdot (b + c) = (a \cdot b) + (a \cdot c)$.
Addition does not distribute over multiplication in the real number system;
that is, it is generally false that
$a + (b \cdot c) = (a + b) \cdot (a + c)$.
However, union of sets does distribute over intersection, for given sets
A, B, and C, it is true that
$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.
Please prove this.
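
Like a Venn diagram, a computer check is no substitute for a proof, but it can build confidence before one is attempted. The sketch below tests both distributive laws on every triple of subsets of a small universe; the helper name subsets and the universe {1, 2, 3} are choices made only for this illustration.

```python
from itertools import combinations

def subsets(universe):
    """Return all subsets of a finite collection, as frozensets."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

all_sets = subsets({1, 2, 3})  # the 8 subsets of a hypothetical universe

# Intersection distributes over union.
law_1 = all(A & (B | C) == (A & B) | (A & C)
            for A in all_sets for B in all_sets for C in all_sets)
# Union distributes over intersection.
law_2 = all(A | (B & C) == (A | B) & (A | C)
            for A in all_sets for B in all_sets for C in all_sets)

print(law_1, law_2)  # True True
```

Passing this check for one small universe shows only that no counterexample exists there; the general statement still requires the element-by-element argument.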
6.7 The previous two exercises suggest that there is a formal "algebra" of
sets much as there is an algebra involving addition and multiplication of real
numbers. You may verify as many of the set-algebraic properties listed
below as you wish. Capital letters stand for sets.
$A = A$.
If $A = B$, then $B = A$.
If $A = B$ and $B = C$, then $A = C$.
If $A \subset B$ and $B \subset C$, then $A \subset C$.
$A \subset A$.
$\emptyset \subset A$.
$A \cup A = A = A \cap A$.
$A \cup B = B \cup A$, and $A \cap B = B \cap A$.
$(A \cup B) \cup C = A \cup (B \cup C)$.
$(A \cap B) \cap C = A \cap (B \cap C)$.
$\emptyset \cup A = A$ and $\emptyset \cap A = \emptyset$.
$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.
$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$.

6.8 In the above list of properties of the algebra of sets, one can make an
analogy with the algebra of real numbers, interpreting A, B, and C as real
numbers, intersection as multiplication, and union as addition. What role
does $\emptyset$ play in this analogy? How good is the analogy? How do you
interpret the relation of set-inclusion?
6.9 We define $A - B$ as
$A - B = \{x \in A \mid x \notin B\}$.
Draw a Venn diagram, similar to the ones shown in Fig. 6.1, to illustrate
the set $A - B$.
6.10 Let A, B, and C be sets. Prove that
$A - (B \cap C) = (A - B) \cup (A - C)$.
See Exercise 6.9.
6.11 Continuing the ideas in the previous exercise, discover and prove valid
a formula resembling the one above for $A - (B \cup C)$.
6.12 How would you define
$A_1 \cup A_2 \cup A_3 \cup A_4 \cup \cdots$,
and
$A_1 \cap A_2 \cap A_3 \cap A_4 \cap \cdots$,
where, for each natural number n, $A_n$ is a set?
6.13 Prove the formula
$A - (B_1 \cap B_2 \cap B_3 \cap B_4 \cap \cdots) = (A - B_1) \cup (A - B_2) \cup (A - B_3) \cup (A - B_4) \cup \cdots$.

Is a similar formula with unions and intersections interchanged also valid?


6.14 Let m and n be natural numbers, and let A be a set containing m
elements and B be a set containing n elements. What can be said about the
number of elements of $A \cup B$? What can be said about the number of
elements of $A \cap B$?
6.15 In the previous exercise, if you know that the number of elements of
$A \cap B$ is k, what can be said about the number of elements in $A \cup B$?

6.2 FUNCTIONS
If the concept of "set" is of first magnitude in the galaxy of mathematics,
so is the concept of a "function."
Let A and B be sets. A function f from A to B is a rule which assigns to
each element of A one and only one element of B. We write $f: A \to B$, call
A the domain of f, and B the range of f. If x is an element of A, and y is that
element of B assigned by f to x, we call y the value of f at x, and write

$y = f(x)$.
In this definition we encounter the same problem as in our definition
of set. We have simply substituted for the word "function" another primitive
term, "rule," as a sort of synonym. In Exercise 6.16 we shall discuss another
method of defining the term "function," using a somewhat clearer and more
natural primitive term.
If $f: A \to B$ is a function, the phrase "f assigns to each element of A
one and only one element of B" is not meant to be interpreted as meaning
that each element of B is used exactly once as a value of f. It is permissible
to use some elements of B more than once, and others not at all. For example,
let R denote the set of all real numbers, and let $f: R \to R$ according to the
rule $f(x) = x^2$. Then f is just the function which assigns to each real number
its square. Not every number in the range is used: $-4$ is not a value of f.
On the other hand, some elements in the range are used more than once:
$f(2) = 4$ and also $f(-2) = 4$, so that the number 4 in the range is used
twice.
For another example, suppose you are given a large collection of gummed
labels, say at least a hundred thousand with the phrase "five feet, two inches"
printed on them, another hundred thousand with the phrase "six feet, eleven
inches" printed on them, and so on, a hundred thousand for each of the
possible heights of students you might encounter at your school. Suppose
next that you walk around campus until you have met each student; upon
first meeting each, you paste on his forehead that label most nearly indicating
his height. (If a person claims to be exactly five feet, eleven and one-half
inches tall, just round this off to six feet.)
In this example the domain is the set of all students on campus, the
range is a set of heights, and you are the rule, assigning to each element
(student) in the domain one and only one element (his height) in the range.
The "one and only one" part of the definition comes in because no student
has two different labels pasted on him and because we presume you continue
this campaign sufficiently long so that each student receives a label. You,
as the paster, are operating as the rule part of this function. The function
itself consists of three things, a domain, a range, and a rule; in this case, the
function consists of the set of students on campus, the set of heights, and
yourself. Note again that it is permissible for some heights to be used more
than once-you will likely encounter at least two students almost exactly
the same height-and some labels, such as the "eight feet, ten inches" label,
may never be used at all, and thus the possible height "eight feet, ten inches"
would be an unused element of the range.
It is sometimes convenient to consider the set of all values in its range
that a function does use; this is called the image of the function, and we can
define it using set-theoretic notation as follows.
Let $f: A \to B$ be a function. Then the image of f is the set

$\mathrm{Im}(f) = \{y \in B \mid y = f(x) \text{ for some } x \in A\}$.
In the study of infinite sets, there are two properties of functions which
will be of particular interest to us. First, it may happen that a function does
indeed use each value in its range no more than once. Such a function is said
to be one-to-one. It may also happen that a function has its image equal to
its range; such a function is said to be onto. Again, we provide a formal
definition.
Let $f: A \to B$ be a function. The function f is said to be one-to-one
provided that, if $x_1$ and $x_2$ are elements of A and $x_1 \neq x_2$, then
$f(x_1) \neq f(x_2)$. If $\mathrm{Im}(f) = B$, then f is said to be onto. Moreover, if f is both
one-to-one and onto, then f is said to be a one-to-one correspondence from A to B.
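
When the domain and range are small finite sets, these three notions can be checked mechanically. The sketch below is only an illustration: it stores a function as a Python dictionary from domain elements to values, and the helper names image, is_one_to_one, and is_onto, as well as the finite domain and range, are assumptions made for the example.

```python
def image(f):
    """Set of values actually used by f (f is a dict: domain element -> value)."""
    return set(f.values())

def is_one_to_one(f):
    """True if no value is used more than once."""
    return len(image(f)) == len(f)

def is_onto(f, range_set):
    """True if every element of range_set is used as a value."""
    return image(f) == set(range_set)

# The squaring rule on a small domain, with a hypothetical finite range.
domain = [-2, -1, 0, 1, 2]
range_set = {0, 1, 2, 3, 4}
square = {x: x * x for x in domain}

print(sorted(image(square)))       # [0, 1, 4]
print(is_one_to_one(square))       # False, since (-2)**2 == 2**2
print(is_onto(square, range_set))  # False, since 2 and 3 are never used
```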
Example 6.5 Let $f: R \to R$ by $f(x) = x^2$. Then f is a function, because
a) each real number has a square, so that R is the domain and f can operate
on R;
b) the square of each real number is again a real number, so that the
"output" of f does lie in R, and hence R is the range of f; and
c) if $a \in R$, $b \in R$, and $a = b$, then $a^2 = b^2$, and hence $f(a) = f(b)$.
Hence even if a real number is known by two different names a and b,
the output of the function is fixed for that number. So f has the "one
and only one" property.
However, f is not one-to-one, and neither is it onto, as we have already
observed.
Example 6.6 Let $f: R \to R$ according to the rule $f(x) = 2^x$. Then f is a
function, for reasons similar to those above. In addition, f is one-to-one,
since if a and b are real numbers and $a \neq b$, then $2^a \neq 2^b$, and hence
$f(a) \neq f(b)$. But f is not onto, because for each real number x, $2^x$ is positive.
Although $-1 \in R$, the range of f, $-1 \notin \mathrm{Im}(f)$. The graph of the function f
of this example is shown in Fig. 6.3.
Example 6.7 Let $f: R \to R$ by $f(x) = x^3 - x$. The graph of f is shown in
Fig. 6.4. Again, f is a function, but f is not one-to-one, since 1 and $-1$ are
both numbers in the domain of f, but although $1 \neq -1$, $f(1) = 0 = f(-1)$.
However, f is onto, since given a number y in the range of f, there does exist
at least one value of x in the domain of f for which $y = x^3 - x$; that is,
$y = f(x)$.

Fig. 6.3 The graph of $f(x) = 2^x$, a one-to-one function from R to R.

Fig. 6.4 The graph of $f(x) = x^3 - x$, an onto function from R to R.

Example 6.8 This is an example of a non-function. Let the domain and
range each be the set R of all real numbers, as in the previous example, and
let f assign to each number its reciprocal. We might write $f(x) = 1/x$.
Then f is not a function, for there is at least one number, namely 0, in the
domain of f to which no value in the range is assigned by f.
Example 6.9 This is another example of a non-function. Again, let the
domain and range each be R. Let f assign to each irrational number the
value 17. Given a rational number, express it as a fraction $a/b$. Then f
assigns to this rational number the value $a + b$.
This is not a function, because there are equal numbers in the domain
to which are assigned unequal numbers in the range. For example, $1/2 = 2/4$,
but $f(1/2) = 3 \neq 6 = f(2/4)$.
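
The failure in Example 6.9 is easy to see concretely. In the sketch below, the helper name rule is an assumption for illustration; Python's standard Fraction type confirms that the two written fractions name the same number, while the rule "add numerator and denominator" gives two different outputs.

```python
from fractions import Fraction

def rule(a, b):
    """The rule of Example 6.9 applied to the written fraction a/b."""
    return a + b

# 1/2 and 2/4 are two names for the same rational number ...
print(Fraction(1, 2) == Fraction(2, 4))  # True
# ... but the rule gives a different output for each name.
print(rule(1, 2), rule(2, 4))            # 3 6
```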
Example 6.10 This is our last example of a non-function. Let the domain
be the set R of all real numbers, and the range be the set $[0, 1]$ of all real
numbers between 0 and 1 (including 0 and 1). Let f have the rule
$f(x) = 2x + 3$. Then f is not a function, for although the number 12 is in its domain,
$f(12)$ is technically undefined; in any case, the only "value" $f(12)$ could
have, according to the rule of f, is 27, and 27 does not belong to the range of
f. Hence f does not assign to the number 12 in its domain any value in its
range.

Example 6.11 Let $f: R \to R$ according to the rule $f(x) = 2x + 3$. Then
not only is f a function, but in fact f is a one-to-one correspondence from R
to itself. To show the latter, we need only show that f is both one-to-one
and onto. First, suppose that $x_1$ and $x_2$ are two numbers in the domain of f
such that $f(x_1) = f(x_2)$. Then
$2x_1 + 3 = 2x_2 + 3$,
so that
$2x_1 = 2x_2$,
and hence
$x_1 = x_2$.
Consequently, if $x_1 \neq x_2$, then $f(x_1) \neq f(x_2)$. Hence f is one-to-one.
Now suppose that y belongs to the range R of f. Let
$x = \frac{y - 3}{2}$.
Then
$f(x) = 2x + 3 = 2 \cdot \frac{y - 3}{2} + 3 = (y - 3) + 3 = y$.
Hence given the number y in the range of f, there does exist a value of x,
namely $x = (y - 3)/2$, such that $f(x) = y$ and x does lie in the domain
R of f. Hence f is also onto, and thus by definition is a one-to-one
correspondence from R to itself.
If $f: A \to B$ is a one-to-one correspondence from the set A to the set B,
there is then naturally associated with this function another function
$g: B \to A$ according to the rule
$g(y)$ is that element x of A such that $f(x) = y$.
The function g is called the inverse of the function f, and is sometimes
denoted by $f^{-1}$.

Example 6.12 Let $f: R \to R$ according to the rule $f(x) = 2x + 3$. We saw
in Example 6.11 that f is a one-to-one correspondence from R to R. So f
must have an inverse g. In the proof that f is onto, we saw that the element x
of the domain of f to which f assigns the element y in the range of f
has the value $x = (y - 3)/2$. Hence the rule of g is given by
$g(y) = \frac{y - 3}{2}$,

or, if you prefer,

$g(x) = \frac{x - 3}{2}$.
Note that we can use any reasonable symbol we please to specify how the rule
of a function acts, for in the above example, g is simply the function which
assigns to each real number that number obtained by halving the given
number diminished by three. Note also how much clearer it is, at least in this
case, to use symbols rather than words to describe the rule part of the
function.
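
As a quick numerical spot-check (not a proof), one can confirm on a few sample inputs that the two rules of Example 6.12 undo each other. The sample values below are arbitrary choices for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def f(x):
    return 2 * x + 3

def g(y):
    # Inverse rule from Example 6.12: halve the given number diminished by three.
    return (y - 3) / 2

# Arbitrary sample inputs, kept as exact fractions.
samples = [Fraction(-7, 2), Fraction(0), Fraction(1, 3), Fraction(12)]

print(all(g(f(x)) == x for x in samples))  # True
print(all(f(g(y)) == y for y in samples))  # True
```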
Finally, as you may have noticed, there is no reason why we should
restrict our attention to functions which have domains and ranges subsets of
R, and rules that look like algebraic formulas. The example with the stickers
giving a student's height shows that other sets and rules may be used. And
the rule does not have to have any special regularity about it; as a final
example, you could simply paste the stickers on the students' foreheads more
or less at random, so long as each student received one and only one sticker,
and this would be a different function from the height function previously
mentioned.

Exercises

6.16 We give here another primitive term, that of ordered pair, by which an
equivalent definition of "function" can be given. The ordered pair $(a, b)$
consists of two objects a and b together with the idea that a is the first, and
b the second, in the pairing.
The Cartesian product $A \times B$ of two sets is defined as follows:
$A \times B = \{(a, b) \mid a \in A \text{ and } b \in B\}$.
A familiar example of an application of the above concept is the ordinary
coordinate system for the two-dimensional plane used in analytic geometry.
One forms the Cartesian product $R \times R$, and then a is the so-called
x-coordinate, and b the y-coordinate, of the point $(a, b) \in R \times R$.
We can provide an alternate but equivalent definition of the term
"function." Let A and B be sets. A function f from A to B is a subset of
$A \times B$ such that
a) for each $x \in A$, there exists $y \in B$ such that $(x, y) \in f$; and
b) if $(x, y) \in f$ and $(x, z) \in f$, then $y = z$.
If f is a function in the sense of this definition, and $(a, b) \in f$, does our
old notation $f(a)$ make sense? What is the value of $f(a)$?
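
For finite sets, conditions (a) and (b) can be tested directly on a set of ordered pairs. The following is a minimal sketch; the function name is_function and the sample sets are assumptions made only for this illustration.

```python
def is_function(pairs, A, B):
    """Check conditions (a) and (b) of Exercise 6.16 for finite sets A and B."""
    pairs = set(pairs)
    # The pairs must form a subset of the Cartesian product A x B.
    if not all(x in A and y in B for (x, y) in pairs):
        return False
    # (a) every x in A appears as a first coordinate.
    covers_domain = all(any(p[0] == x for p in pairs) for x in A)
    # (b) no x is paired with two different second coordinates.
    single_valued = all(y == z
                        for (x, y) in pairs
                        for (w, z) in pairs if x == w)
    return covers_domain and single_valued

A = {1, 2, 3}
B = {"a", "b"}

print(is_function({(1, "a"), (2, "a"), (3, "b")}, A, B))            # True
print(is_function({(1, "a"), (2, "b")}, A, B))                      # False: 3 has no partner
print(is_function({(1, "a"), (1, "b"), (2, "a"), (3, "b")}, A, B))  # False: 1 has two partners
```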
6.17 Give an example unlike Example 6.6 of a function $f: R \to R$ that is
one-to-one but not onto.
6.18 Give an example unlike Example 6.7 of a function $f: R \to R$ that is
onto but not one-to-one.
6.19 Give an example unlike Example 6.11 of a function $f: R \to R$ that is a
one-to-one correspondence from R to R.
6.20 Find $f^{-1}$ for your example above.
6.21 The open interval $(-\pi/2, \pi/2)$ of all real numbers between $-\pi/2$
and $\pi/2$ can be thought of as a set of angles in radian measure, and with this
domain and with range R, the rule $f(x) = \tan x$ gives a function that is a
one-to-one correspondence from $(-\pi/2, \pi/2)$ to R. The graph of f is shown
in Fig. 6.5. Sketch the graph of $f^{-1}$.
6.22 Let A and B be sets, and f a one-to-one correspondence from A to B.
Prove that not only is $f^{-1}$ a function with domain B and range A, but also
that $f^{-1}$ is a one-to-one correspondence from B to A.
6.23 Let A be a set. Construct a one-to-one correspondence from A to
itself. Hint: This problem is very easy.
6.24 Let A, B, and C be sets, and let $f: A \to B$ and $g: B \to C$ be functions.
As indicated in Fig. 6.6, we can go directly from A to C by a new function,
called the composition of f and g, which we construct as follows. We denote
this function by $\{g(f)\}$, and define $\{g(f)\}: A \to C$ according to the rule
$\{g(f)\}(x) = g(f(x))$.
Verify that $\{g(f)\}: A \to C$ is indeed a function.

Fig. 6.5 The graph of $f(x) = \tan x$, $-\pi/2 < x < \pi/2$.

Fig. 6.6 The composition $\{g(f)\}$ of the functions f and g.

6.25 Using the notation and definitions of the previous exercise, verify that
if each of f and g is one-to-one, then so is $\{g(f)\}$.
6.26 Using the notation and definitions of Exercise 6.24, prove that if each
of f and g is a one-to-one correspondence from its domain to its range, then
$\{g(f)\}$ is a one-to-one correspondence from A to C.
6.27 Continuing the previous exercise, find a formula for $\{g(f)\}^{-1}$ in terms
of $f^{-1}$ and $g^{-1}$.
6.28 See Exercise 6.24. Let $f: R \to R$ by $f(x) = x^2$, and let $g: R \to R$
by $g(x) = x + 1$. To be equal, two functions must have the same domain,
the same range, and the effects of their rules must be the same, in that if x
is in their common domain, then $f(x) = g(x)$. Are the functions f and g
given in this exercise equal functions? Why?
6.29 Using the functions f and g of the previous exercise, find $\{g(f)\}$ and
$\{f(g)\}$. Are the latter two equal functions? Explain your answer.
6.30 Let $f: A \to B$ be a one-to-one correspondence from the set A to the
set B, and let $g = f^{-1}$. Show that $\{g(f)\}(x) = x$ for all $x \in A$ and that
$\{f(g)\}(x) = x$ for all $x \in B$.

6.3 MORE ON ONE-TO-ONE CORRESPONDENCES


Recall that we say there is a one-to-one correspondence from the set A to the
set B provided that there is a one-to-one and onto function $f: A \to B$.
If so, we shall use the notation $A \sim B$, and say that the sets A and B can be
put into one-to-one correspondence. By using the notion of one-to-one
correspondence between two sets, we shall be able to define what we mean by a
finite set, an infinite set, and what it means for two sets to have the same
number of elements even if this number should be infinite. First, however,
we examine the properties of one-to-one correspondences; specifically, we
want first to show that this relation is an equivalence relation (see Exercise
1.6). That is, we want to show that if A, B, and C are any sets, then
a) $A \sim A$.
b) If $A \sim B$, then $B \sim A$.
c) If $A \sim B$ and $B \sim C$, then $A \sim C$.
But part (a) has been taken care of in Exercise 6.23, part (b) has been
taken care of in Exercise 6.22, and part (c) has been proved in your work for
Exercise 6.26. So from this point on, we can eliminate a large number of
tedious constructions from many proofs and exercises. For example, if you
know that each of the two sets S and T can be put into one-to-one
correspondence with a third set U, then you know that S and T can also be put
into one-to-one correspondence with each other; knowing this, you know in
addition that there exists some one-to-one and onto function $f: S \to T$,
so you can simply "let" f be such a function, and use f where necessary in
your proofs.
Now we can give a precise definition of what it means for a set to be
infinite. We are going to say that a set is infinite provided that it is not
finite, so first we define what we mean by a finite set.
The set S is said to be finite provided that either $S = \emptyset$ or, for some
natural number n, $S \sim \{1, 2, 3, \ldots, n\}$.
The set S is said to be infinite provided that S is not finite.
The example of an infinite set that comes most easily to mind
is the set
$N = \{1, 2, 3, 4, \ldots\}$,

whose elements are all the positive whole numbers. But in order to show that
N is in fact an infinite set, it is necessary to show that $N \neq \emptyset$ (which is easy)
and that there can exist no one-to-one correspondence between N and any
set of the form $\{1, 2, 3, \ldots, n\}$. This is likely to be a formidable task, for
one must show for infinitely many different values of n that no such
one-to-one correspondence can exist. Moreover, it is also quite difficult to prove at
this point the obvious, necessary fact that the two sets $\{1, 2, 3, \ldots, n\}$ and
$\{1, 2, 3, \ldots, m\}$ can be put into one-to-one correspondence if and only if
$m = n$. In the next section we shall provide the Cantor-Schroeder-Bernstein
Theorem, an almost indispensable tool in dealing with problems of this sort.

Exercises

6.31 Supply, in addition to N, two more examples of infinite sets.

6.32 Show that the set N of positive whole numbers can be put into one-to-one
correspondence with its proper subset E of even positive whole numbers.
Note: In order to do this, you must show the existence, presumably by
construction, of a one-to-one correspondence between N and E; that is,
you must construct a one-to-one and onto function $f: N \to E$ (or $f: E \to N$).
6.33 Show that the set N of positive whole numbers can be put into one-to-one
correspondence with its proper subset M of all squares of positive whole
numbers.
6.34 Show that the set N of positive whole numbers can be put into one-to-one
correspondence with its subset T of all such numbers with two or more
digits.
6.35 Show that the set N of positive whole numbers can be put into one-to-one
correspondence with the set $X = \{-1, -2, -3, -4, \ldots\}$.
6.36 Give three different functions each of which is a one-to-one
correspondence from the set N to itself.
6.37 Give two different functions each of which is a one-to-one
correspondence from the set N to its proper subset $L = \{1, 3, 5, 7, 9, \ldots\}$.
6.38 Show that the set N of positive whole numbers can be put into one-to-one
correspondence with the set W of all whole numbers.
6.39 Show that the set of points on a line one unit long can be put into
one-to-one correspondence with the set of points on a line two units long.

6.40 Show that the set $(0, 1)$ of real numbers between 0 and 1 (not including
0 or 1) can be put into one-to-one correspondence with the set R of all real
numbers. Hint: In Exercise 6.21, an example was given of a one-to-one
correspondence between the set $(-\pi/2, \pi/2)$ and the set R. If you can show
that $(-\pi/2, \pi/2)$ and $(0, 1)$ can also be put into one-to-one correspondence,
then it will follow (why?) that $(0, 1)$ and R can also be put into one-to-one
correspondence.
Assuming that you have established a one-to-one correspondence, say f,
from $(0, 1)$ to R, can you now show the existence of a one-to-one
correspondence from $[0, 1]$ to R? ($[0, 1] = \{x \in R \mid 0 \leq x \leq 1\}$.) How?

6.4 THE CANTOR-SCHROEDER-BERNSTEIN THEOREM


There are many cases in which one desires to show that two sets can be put
into one-to-one correspondence, but all that can easily be accomplished in
practice is something such as this: S and T are two given sets, and it turns out
to be possible to show that S can be put into one-to-one correspondence with
some subset B of T, and that T can be put into one-to-one correspondence
with some subset A of S. In Fig. 6.7 this situation is diagrammed, with
$\Phi$ the one-to-one function from S onto B and $\Psi$ the one-to-one function
from T onto A.
This situation seems to suggest that since T has at least as "many"
elements as S, and S as many as T, then S and T themselves could be put
into one-to-one correspondence. However, though this is true, finding such a
correspondence in actual practice can sometimes be rather complicated.
The Cantor-Schroeder-Bernstein Theorem guarantees under these conditions
the existence of a one-to-one correspondence between S and T.

Fig. 6.7 The hypotheses of the Cantor-Schroeder-Bernstein Theorem: The functions are one-to-one.

Theorem 6.1 (Cantor-Schroeder-Bernstein) Let S and T be two sets, and
suppose that $S \sim B$, where $B \subset T$, and that $T \sim A$, where $A \subset S$. Then
$S \sim T$.
Proof Let $\Phi: S \to B$ and $\Psi: T \to A$ each be one-to-one and onto functions.
Both $\Phi$ and $\Psi$ exist because of our hypotheses that $S \sim B$ and $T \sim A$.
We shall produce a function $\Theta$ that is a one-to-one correspondence from
S to T. The proof of the existence of $\Theta$ will actually be constructive, so that
a formula for $\Theta$ could actually be written in a specific case; however, in such
cases the formula could be so complicated that we will generally be content
with the knowledge of the existence of $\Theta$.
Consider an element $x \in S$. If there exists an element $y \in T$ such that
$\Psi(y) = x$, then we will call y a parent of x. If such an element $y \in T$ in
addition has a parent $z \in S$, such that $\Phi(z) = y$, we will also call z a parent
of x, and in this case $\{\Psi(\Phi)\}(z) = x$. Note that since the subset A of S is the
image of $\Psi$, then each element of A has at least one parent; if $x \in S - A$
then x has no parents. Similar remarks hold for B and T.
Given that $x \in S$, there are exactly three possibilities:
a) x has infinitely many parents.
b) x has but finitely many parents, and the ancestry of x begins with a
parentless ancestor in S.
c) x has but finitely many parents, and the ancestry of x begins with a
parentless ancestor in T.
If x has no parent at all, which happens when $x \in S - A$, then x belongs
in case (b) above. So we can divide the set S up into three mutually exclusive
subsets:
$S_\infty = \{x \in S \mid x \text{ has infinitely many parents}\}$.
$S_\sigma = \{x \in S \mid \text{the ancestry of } x \text{ begins in } S\}$.
$S_\tau = \{x \in S \mid \text{the ancestry of } x \text{ begins in } T\}$.
Note that we have chosen the subscripts so as to help us remember which
elements of S belong to each of these subsets of S. Of course, $S - A \subset S_\sigma$,
but in fact $S_\sigma$ contains those elements of S with an even number of parents
(recall that 0 is an even number) and $S_\tau$ contains those elements of S with an
odd number of parents. See Fig. 6.8.

Fig. 6.8 The element $a \in S$ has infinitely many parents; the element $b \in S$ has ancestry beginning in S; the element $c \in S$ has ancestry beginning in T.

Similarly, we divide T up into three mutually exclusive subsets as follows:
$T_\infty = \{y \in T \mid y \text{ has infinitely many parents}\}$.
$T_\sigma = \{y \in T \mid \text{the ancestry of } y \text{ begins in } S\}$.
$T_\tau = \{y \in T \mid \text{the ancestry of } y \text{ begins in } T\}$.
We emphasize that $T_\infty$, $T_\sigma$, and $T_\tau$ are mutually exclusive (the intersection
of any two is the empty set) and that $T = T_\infty \cup T_\sigma \cup T_\tau$. Hence each
element of T belongs to exactly one of the sets described above. Similar
remarks hold for S.

We will establish that $\Phi$ is a one-to-one correspondence from $S_\infty$ to
$T_\infty$, that $\Phi$ is also a one-to-one correspondence from $S_\sigma$ to $T_\sigma$, and that $\Psi$
is a one-to-one correspondence from $T_\tau$ to $S_\tau$. This will prove that $S_\infty \sim T_\infty$,
$S_\sigma \sim T_\sigma$, and $S_\tau \sim T_\tau$. Then we can "glue together" the functions $\Phi$ on
$S_\infty \cup S_\sigma$ and $\Psi^{-1}$ on $S_\tau$ to obtain the one-to-one correspondence $\Theta$ from
S to T, as desired.
Now $S_\infty \subset S$, so $\Phi$ is defined on $S_\infty$. Moreover, if $x \in S_\infty$, then x has
infinitely many parents, so that $\Phi(x)$ is an element of T with infinitely many
parents as well. Hence $\Phi$ is a function with domain $S_\infty$ and range $T_\infty$.
(Actually, this is an improper use of terminology, since by restricting the
domain of $\Phi$ from S to $S_\infty$ we are actually considering a new function;
perhaps we should indicate this by calling this new function $\Phi_\infty$ instead
of just $\Phi$, but the additional notation hardly seems justified in this case.)
We need to show that $\Phi: S_\infty \to T_\infty$ is both one-to-one and onto. But
$\Phi$ is clearly one-to-one, since it is one-to-one on all of S. And if $y \in T_\infty$,
then y has infinitely many parents, so in particular it has an immediate parent
$x \in S$, for which $\Phi(x) = y$. But this element x must actually belong to $S_\infty$,
for if x had but finitely many parents then so would y. Hence there does
indeed exist an element $x \in S_\infty$ such that $\Phi(x) = y$. Thus $\Phi: S_\infty \to T_\infty$
is both one-to-one and onto, and thus $\Phi$ is indeed a one-to-one
correspondence from $S_\infty$ to $T_\infty$.
By the symmetry of the remaining two cases, it is sufficient to prove only
one of them, for example that $\Psi$ is a one-to-one correspondence from
$T_\tau$ to $S_\tau$. This is left for you, in the exercises at the end of this section. So
we may assume that we know that each of the following is a one-to-one
correspondence:
$\Phi: S_\infty \to T_\infty$,
$\Phi: S_\sigma \to T_\sigma$,
$\Psi: T_\tau \to S_\tau$.

See Fig. 6.9. By Exercise 6.22, $\Psi^{-1}$ is also a one-to-one correspondence
from $S_\tau$ to $T_\tau$, so that we have the following situation. The three functions
shown below are each one-to-one and onto:

$\Phi: S_\infty \to T_\infty$,
$\Phi: S_\sigma \to T_\sigma$,
$\Psi^{-1}: S_\tau \to T_\tau$.

The three domains shown above are disjoint and their union is all of S;
the three ranges shown above are also disjoint and their union is all of T.

Fig. 6.9 Each function is a one-to-one correspondence on the indicated subset.

So we define $\Theta: S \to T$ as follows:

If $x \in S_\infty$, then $\Theta(x) = \Phi(x)$.
If $x \in S_\sigma$, then $\Theta(x) = \Phi(x)$.
If $x \in S_\tau$, then $\Theta(x) = \Psi^{-1}(x)$.
It should be clear not only that $\Theta$ is a function, but in fact that it is a one-to-one
correspondence from S to T. Thus $S \sim T$, and this concludes the proof of
the Cantor-Schroeder-Bernstein Theorem.
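
Because the proof is constructive, the glued-together correspondence can be computed in simple cases. The sketch below is an illustration under stated assumptions, not a general-purpose implementation: the names make_theta, phi, phi_parent, psi_parent, and shift_parent are invented here, the two "parent" helpers must return None when an element has no parent, and any element whose ancestry has not ended after max_steps steps is simply treated as having infinitely many parents. The example takes both sets to be {0, 1, 2, ...} and both one-to-one functions to be "add one."

```python
def make_theta(phi, phi_parent, psi_parent, max_steps=10_000):
    """Glue phi and the inverse of psi together, following the proof.

    phi_parent(y): the z in S with phi(z) == y, or None if there is none.
    psi_parent(x): the y in T with psi(y) == x, or None if there is none.
    """
    def theta(x):
        current, in_S = x, True
        for _ in range(max_steps):
            parent = psi_parent(current) if in_S else phi_parent(current)
            if parent is None:
                # current is the parentless ancestor; in_S tells where it lives.
                return phi(x) if in_S else psi_parent(x)
            current, in_S = parent, not in_S
        # No parentless ancestor found within max_steps: treat x as having
        # infinitely many parents (an assumption of this sketch).
        return phi(x)
    return theta

# Hypothetical example: S = T = {0, 1, 2, ...}, and both maps send n to n + 1.
def phi(n):
    return n + 1

def shift_parent(v):
    # The parent of v under "add one" is v - 1, and 0 has no parent.
    return v - 1 if v >= 1 else None

theta = make_theta(phi, shift_parent, shift_parent)
values = [theta(x) for x in range(8)]
print(values)  # [1, 0, 3, 2, 5, 4, 7, 6]
```

In this example every ancestry ends at 0, in S for even x and in T for odd x, so the resulting correspondence swaps 0 with 1, 2 with 3, and so on.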

As an application, we show how this theorem can be used to show the
existence of a one-to-one correspondence between the set

$W = \{\ldots, -2, -1, 0, 1, 2, 3, \ldots\}$
of all whole numbers, and the set $X = W \times W$, or

$X = \{(m, n) \mid m \in W \text{ and } n \in W\}$,

the Cartesian product of W with itself. (This may appear to be a surprising
result, since X appears to be much "larger" than W.)
Let $\Phi: W \to X$ by the rule $\Phi(n) = (n, 0)$. Then $\Phi$ is clearly a one-to-one
function from W to X. Now all we need is a one-to-one function $\Psi$ from
X to W. Among other things, if $(m, n)$ is an ordered pair of whole numbers
(a typical element of X), we must so design $\Psi$ that $\Psi(m, n)$ is a whole number,
and thus an element of W.
One approach that almost works is to let $\Psi(m, n) = mn$. Then $\Psi$ is a
function from X to W, but unfortunately is not one-to-one, since

$\Psi(1, 6) = 6 = \Psi(2, 3)$

and
$(1, 6) \neq (2, 3)$.

Another approach that almost works is to let $\Psi(m, n) = 2^m \cdot 3^n$. Then
$\Psi$ would be one-to-one, but unfortunately not a function from X to W,
since $(1, -1) \in X$ but $\Psi(1, -1) = 2/3$ and $2/3 \notin W$.
But a modification of the last approach does work. In the last approach
it is only the fact that m or n might be negative that produces fractions, rather
than whole numbers, in the output of $\Psi$, so we define $\Psi$ as follows:

If $m \geq 0$ and $n \geq 0$, then $\Psi(m, n) = 2^m \cdot 3^n$.
If $m \geq 0$ and $n < 0$, then $\Psi(m, n) = 2^m \cdot 5^{-n}$.
If $m < 0$ and $n \geq 0$, then $\Psi(m, n) = 7^{-m} \cdot 3^n$.
If $m < 0$ and $n < 0$, then $\Psi(m, n) = 7^{-m} \cdot 5^{-n}$.

In each case, the output of $\Psi$ is a positive whole number, so that $\Psi: X \to W$
is a function. The fact that each positive whole number has a unique
factorization into a product of primes means that the output of $\Psi$ uniquely
determines its input; or, in other words, $\Psi$ is one-to-one.
Neither $\Phi$ nor $\Psi$ is onto, but this does not matter. We have exactly the
situation given in the hypotheses of the Cantor-Schroeder-Bernstein Theorem.
W and X are sets, $\Phi: W \to X$ is one-to-one, so that $W \sim B = \mathrm{Im}(\Phi) \subset X$,
and $\Psi: X \to W$ is one-to-one, so that $X \sim A = \mathrm{Im}(\Psi) \subset W$. Hence by
the Cantor-Schroeder-Bernstein Theorem, $W \sim X$.
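
The prime-power rule above can be tried out on a finite grid of pairs. The sketch below is only a spot-check, not a proof of one-to-oneness; the function name psi and the grid size are choices made for the illustration, and the sketch assumes that a zero coordinate is handled together with the positive ones (so the pair (0, 0) is sent to 1).

```python
def psi(m, n):
    """Send a pair of whole numbers to a positive whole number via prime powers."""
    left = 2 ** m if m >= 0 else 7 ** (-m)    # the sign of m selects the prime 2 or 7
    right = 3 ** n if n >= 0 else 5 ** (-n)   # the sign of n selects the prime 3 or 5
    return left * right

# Check that no output repeats on a small grid of pairs.
pairs = [(m, n) for m in range(-6, 7) for n in range(-6, 7)]
values = [psi(m, n) for (m, n) in pairs]

print(len(values) == len(set(values)))  # True: all 169 outputs are distinct
print(psi(1, -1))                       # 10, a whole number rather than 2/3
```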

Exercises

6.41 Show, in the proof of Theorem 6.1, that $\Psi$ actually is a one-to-one
correspondence from $T_\tau$ to $S_\tau$.
6.42 In the example following Theorem 6.1, in which it is shown that $W \sim X$,
show that the function $\Phi$ constructed therein is one-to-one and that neither
of the functions $\Phi$ or $\Psi$ is onto.
6.43 Use the Cantor-Schroeder-Bernstein Theorem to prove that if $A \sim C$
and $A \subset B \subset C$, then $B \sim C$.
6.44 At the conclusion of the proof of Theorem 6.1, it was stated that the
function $\Theta$ constructed therein was actually a one-to-one correspondence
from S to T. Verify this.
6.45 In Exercise 6.38, we asked you to prove that the set N of positive whole
numbers can be put into one-to-one correspondence with the set W of all
whole numbers. Prove this by use of the Cantor-Schroeder-Bernstein
Theorem. Hint: Use techniques similar to those in the example following
the proof of Theorem 6.1.

6.5 PROPERTIES OF FINITE AND INFINITE SETS


First we should show that both finite sets and infinite sets exist. It is clear
that the sets of the form {1, 2, 3, ..., n} are finite for each positive whole
number value of n. To show the existence of an infinite set, we prove that the
set N of positive whole numbers is in fact infinite.
Theorem 6.2 The set N = {1, 2, 3, ...} is infinite.
Proof. The proof is by contradiction: we suppose that the set N is finite.
Then, by definition, either N = ∅ or N can be put into one-to-one
correspondence with a set of the form {1, 2, 3, ..., n}, for some positive
whole number n. Since 17 ∈ N, N ≠ ∅, and hence there is a one-to-one and
onto function f from {1, 2, 3, ..., n} to N for some positive whole number n.
We can think of f as a one-to-one correspondence from the set {1, 2,
3, ..., n} to the set {f(1), f(2), f(3), ..., f(n)}, and it would not be difficult
to think of a method of selecting from the latter set its largest element, a
method which would require no more than n steps. So the set {f(1), f(2),
f(3), ..., f(n)} contains a largest element, say m, and m = f(j) for some j
such that 1 ≤ j ≤ n.
But m + 1 is an element of the set N, and hence since f is onto, there
must exist a positive whole number k ∈ {1, 2, 3, ..., n} such that f(k) =
m + 1. So f(k) is an element of {f(1), f(2), f(3), ..., f(n)} larger than m.
This contradicts the fact that m is the largest element of {f(1), f(2), f(3), ...,
f(n)}, as m + 1 > m. Hence our original supposition, that N is finite,
leads to a contradiction, and consequently N is infinite. This completes the
proof of the theorem.
Note that it was not necessary to use the fact that f is one-to-one.
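The heart of this proof can be tried out on any finite list of positive whole numbers. The short Python sketch below is only an illustration; the name missed_value is our own and does not come from the text. It takes the values f(1), ..., f(n) of a supposed onto function and returns m + 1, a positive whole number that cannot be among them.

```python
def missed_value(values):
    """Given the finite list f(1), ..., f(n) of positive whole numbers,
    return m + 1, where m is the largest value in the list."""
    m = max(values)   # the largest element, found in at most n steps
    return m + 1      # larger than every listed value, so not listed


# Repeated values are allowed: the argument never uses one-to-one-ness.
values = [17, 3, 3, 42, 8]
print(missed_value(values))  # 43
```

No matter which finite list is supplied, the returned number is missing from it, which is exactly why no function from {1, 2, 3, ..., n} to N can be onto.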
Our next result depends on Exercise 6.46, at the end of this section:
if S is any infinite set whatsoever and x ∈ S, then the set S - {x} is
also infinite, and therefore nonempty. Moreover, our next theorem also
implies that the set N is in some sense a "smallest" infinite set; the exact
meaning of this statement will be discussed in Exercise 6.47.
Theorem 6.3 If S is an infinite set, then S contains a subset M such that
M ~ N.
Proof. Choose x_1 ∈ S. Then by Exercise 6.46, S - {x_1} is infinite, and thus
nonempty. So we may choose x_2 ∈ S - {x_1}. Since S - {x_1} is infinite,
it follows again by Exercise 6.46 that S - {x_1, x_2} is infinite and nonempty.
So we next choose x_3 ∈ S - {x_1, x_2}. We continue this process. It will not
terminate, for if x_1, x_2, x_3, ..., x_k have been chosen, then still
S - {x_1, x_2, x_3, ..., x_k}
is infinite, thus nonempty, and we may choose x_{k+1} from it. Then
S - {x_1, x_2, x_3, ..., x_k, x_{k+1}}
is still infinite and nonempty, and the process can be continued. Thus for
each positive whole number m we can produce an element x_m ∈ S, and by
our construction, if j ≠ m then x_j ≠ x_m. So the set M = {x_1, x_2, x_3, ...}
is a set of distinct elements of S, one for each positive whole number.
Let f: N → M according to the rule f(n) = x_n. It is clear that f is a
one-to-one correspondence, and hence that M ~ N. This establishes the
theorem.
The next theorem seems "obvious," but it is not easy to prove, at least
without using the Cantor-Schroeder-Bernstein Theorem.
Theorem 6.4 Every subset of a finite set is finite.
Proof. Suppose that F is a finite set. The theorem is clearly true if F = ∅,
so we consider only the case in which F ~ {1, 2, 3, ..., n} for some positive
whole number n.
Suppose by way of contradiction that F contains an infinite subset S.
By Theorem 6.3, S contains a subset M such that N ~ M. Let f: N → M
be a one-to-one correspondence from N to M.
Now {1, 2, 3, ..., n} ⊂ N, so that, letting f(j) = m_j for each j ∈ N,
K = {m_1, m_2, m_3, ..., m_n}
is a subset of M such that K ~ {1, 2, 3, ..., n}. So K ~ F. Also, since
K ⊂ M and M ⊂ S, then K ⊂ S.
So we have the following situation: K ⊂ S ⊂ F and K ~ F. By
Exercise 6.43, which is an easy consequence of the Cantor-Schroeder-Bernstein
Theorem, it follows that S ~ F. But F ~ {1, 2, 3, ..., n}, and
hence S ~ {1, 2, 3, ..., n} as well. This is impossible because S was supposed
to be infinite; if S ~ {1, 2, 3, ..., n} were true, then S would be
finite by definition, and again by definition "infinite" means "not finite."
This contradiction shows that the finite set F can contain no infinite subsets,
and establishes the theorem.
We now prove a theorem sometimes known as the "Dedekind Box
Principle," which together with the succeeding theorem will enable us to give
an alternate, equivalent, and sometimes more useful definition of what it
means for a set to be infinite.

Theorem 6.5 No finite set can be put into one-to-one correspondence with one
of its proper subsets.

Proof. Let F be a finite set and G a proper subset of F. Suppose by way of
contradiction that G ~ F; then there exists a one-to-one correspondence
φ from F to G. Since G ⊂ F, then G must be finite by Theorem 6.4. Hence
either G = ∅ or else G ~ {1, 2, 3, ..., n} for some positive whole number
n. If G = ∅, then F ≠ ∅ since G is a proper subset of F; however, for
F ≠ ∅ and G = ∅, φ: F → G cannot be a function. So we can eliminate
the possibility that G = ∅.
In several previous exercises, it was seen that when a one-to-one
correspondence exists between two sets such as F and G, there are generally
several such correspondences. If we have one, such as φ, we can let

M_φ = {x ∈ F | φ(x) ≠ x}

be the subset of F consisting of the points moved under the action of φ. By
Theorem 6.4, for each such correspondence φ the set M_φ must be finite because
it is a subset of the finite set F. So the number of elements in the set M_φ is a
positive whole number or zero.
Choose a one-to-one correspondence φ from F to G so that M_φ contains
the smallest possible number of elements. With this choice of φ we will
seek to establish a contradiction.
Now M_φ ≠ ∅. For if M_φ = ∅, then φ(x) = x for all x ∈ F. Then
G = Im(φ) = F, so that G would not be a proper subset of F. Therefore,
since M_φ ≠ ∅, we may choose an element x ∈ M_φ ∩ G (how?).
Now φ(x) ≠ x, so φ(x) = y for some y ∈ F such that y ≠ x. Moreover,
φ is onto, so there exists w ∈ F such that φ(w) = x. And if w = x,
then φ(w) = φ(x) since φ is a function; but since φ(w) = x, this would
imply that x = φ(x), which is false by the choice of x ∈ M_φ. Hence w ≠ x.
Fig. 6.10 φ moves w to x and x to y.

Next, y ∈ M_φ. For if not, then φ(y) = y. But also φ(x) = y, and φ is
one-to-one. This implies that x = y, which we have shown false. Hence
y ∈ M_φ.
Also, w ∈ M_φ. For if not, then φ(w) = w. But φ(w) = x by choice of
w, so that w = x since φ is a function. But then φ(x) = x, again contrary
to our choice of x. So w ∈ M_φ as well.
Hence we have the situation shown in Fig. 6.10. All three of w, x, and y
belong to M_φ. Also w ≠ x and y ≠ x (although it is possible that w = y;
however, for clarity we have shown in Fig. 6.10 the more general case in which
w ≠ y). Finally, φ(w) = x and φ(x) = y.
We modify the function φ and obtain a slightly different function ψ as
follows:
Let ψ(w) = y.
Let ψ(x) = x.
Let ψ(z) = φ(z) if z ≠ w and z ≠ x.
Now ψ is almost the same function as φ; all that has been done to φ in
order to obtain ψ is to switch the values of φ at w and x, as indicated in Fig.
6.11. Hence it is easy to see that ψ is also a one-to-one correspondence from
F to G. But let us now consider the set M_ψ, given by

M_ψ = {x ∈ F | ψ(x) ≠ x}.

Fig. 6.11 φ is modified to become ψ, which moves w to y and leaves
x fixed.

The particular element x, previously chosen from M_φ, has the property
that φ(x) ≠ x; but for this element x, ψ(x) = x. So x ∈ M_φ, but x ∉ M_ψ.
Moreover, if v ∉ M_φ, then φ(v) = v and, in addition, v ≠ w and v ≠ x
since x and w are elements of M_φ. So, by definition of ψ, also ψ(v) = v.
Thus if v ∉ M_φ, then v ∉ M_ψ.
Hence M_ψ can contain no more elements of F than M_φ, and in fact M_φ
actually contains more elements of F than M_ψ because x ∈ M_φ but x ∉ M_ψ.
But φ was chosen so that its corresponding set of points M_φ would contain
the minimum possible number of elements of F, and here we have constructed
ψ, another one-to-one correspondence from F to G whose corresponding
set M_ψ contains fewer elements than M_φ. This is in contradiction to the
choice of φ, and this contradiction establishes that no such function as φ can
exist. Therefore no finite set can be put into one-to-one correspondence with
one of its proper subsets, and the theorem is proved.
The mathematician Richard Dedekind suggested an alternative, but
equivalent, definition of what it means for a set to be infinite; the term used
is that the set is Dedekind infinite, and this means that the set can be put
into one-to-one correspondence with one of its proper subsets. Theorem
6.6 shows that this is the same property as being infinite.
Theorem 6.6 The set S is infinite if and only if S is Dedekind infinite.
Proof. Suppose first that S is Dedekind infinite. Then, by definition, S can
be put into one-to-one correspondence with one of its proper subsets. By
the previous theorem, S cannot be finite; hence by definition, S must be
infinite.

Fig. 6.12 The function f is a one-to-one correspondence from S to S - {x}.

Next, suppose that S is an infinite set. By Theorem 6.3, S contains a
subset M such that M ~ N. Let θ be a one-to-one correspondence from
N to M, and let x = θ(1). Then x ∈ M ⊂ S, so that x is also an element of
S; let T = S - {x}. T is certainly a proper subset of S. We will show that S
can be put into one-to-one correspondence with T.
We define a function f from S to T as follows. If s ∈ S - M, just let
f(s) = s. If s ∈ M, then s = θ(n) for some positive whole number n since
θ: N → M is onto; in fact, n is uniquely determined by s since θ is also
one-to-one. Let f(s) = θ(n + 1).
Of course, what we are doing here is shoving each element of M up one
notch, sending x = θ(1) to θ(2), sending θ(2) to θ(3), and so on, while
leaving the elements of S not in M alone. This action of f is indicated in
Fig. 6.12. You may easily verify for yourself that f is indeed a one-to-one
correspondence from S to T. So we have shown that the infinite set S can
be put into one-to-one correspondence with one of its proper subsets, and
thus that S is Dedekind infinite. This completes the proof of Theorem 6.6.
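The "shove up one notch" function in this proof can be watched in action on a concrete case. The Python sketch below rests on choices of our own that are not in the text: S is taken to be the set of positive whole numbers, M the set of even numbers, and θ(n) = 2n, so that x = θ(1) = 2. Only a finite window of S can be examined, of course, so this is an illustration and not a proof.

```python
def theta(n):
    """An assumed one-to-one correspondence from N onto M = even numbers."""
    return 2 * n


def f(s):
    """Leave elements outside M alone; move theta(n) to theta(n + 1)."""
    if s % 2 == 1:       # s is odd, so s is in S - M
        return s
    n = s // 2           # s = theta(n), and n is uniquely determined by s
    return theta(n + 1)


window = range(1, 21)    # a finite window of S = {1, 2, 3, ...}
images = [f(s) for s in window]
print(images[:6])        # [1, 4, 3, 6, 5, 8]
```

Within the window no value is repeated, and the element x = 2 is never produced, in agreement with the claim that f carries S onto T = S - {x}.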
The Dedekind Box Principle (Theorem 6.5) can be phrased very in-
formally as follows: If a postman has n letters to put into m mailboxes and
m < n, then at least one box must get at least two letters.
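For small numbers of letters and boxes, the postman's predicament can be checked by sheer enumeration. The following Python sketch is our own illustration (the function name some_box_gets_two is hypothetical); it tries every possible way of placing n letters into m boxes and reports whether every placement puts at least two letters into some box.

```python
from collections import Counter
from itertools import product


def some_box_gets_two(n_letters, m_boxes):
    """Return True if EVERY placement of the letters into the boxes
    puts at least two letters into some box."""
    for placement in product(range(m_boxes), repeat=n_letters):
        counts = Counter(placement)      # number of letters in each used box
        if max(counts.values()) < 2:     # each box holds at most one letter
            return False
    return True


print(some_box_gets_two(4, 3))  # True: 4 letters, 3 boxes
print(some_box_gets_two(3, 3))  # False: one letter per box is possible
```

Enumeration quickly becomes impractical as n and m grow, which is one reason the general statement needs a proof like the one given for Theorem 6.5.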

Exercises

6.46 Prove that if S is an infinite set and x ∈ S, then S - {x} is infinite.
Note: This exercise is used to establish Theorem 6.3, so that theorem and its
successors may not be used in working this exercise. Hint: First suppose that
S - {x} is finite, in order to reach an eventual contradiction. Set up a
one-to-one correspondence f from S - {x} to the set
{1, 2, 3, ..., n}.
Use f to produce a one-to-one correspondence g from (S - {x}) ∪ {x} to
the set
{1, 2, 3, ..., n, n + 1}.
Explain why this is a contradiction. Draw the desired conclusion.
6.47 Show that if K is a subset of N, then either K is finite or K ~ N. Use
this fact to show that if S is a set such that S ~ N, and T is a subset of S,
then either T is finite or T ~ N. This is the sense in which N can be
considered a "smallest" infinite set, bearing in mind the next exercise as well.
6.48 Prove that if S is a set containing the set N of positive whole numbers
as a subset, then S must be infinite. Use this fact to show that if S is a set
containing a subset M such that M ~ N, then S must be infinite.
6.49 A set S is said to be denumerable provided that S ~ N. Prove that
N x N is denumerable. Hint: Use the Cantor-Schroeder-Bernstein Theorem
(Theorem 6.1) and the techniques used in the application immediately after
its proof. See also Exercise 6.45.
6.50 Use Exercise 6.49 to prove that if A and B are denumerable sets, then so
is A x B.
6.51 Show that the set Q of all rational numbers is denumerable. Hint:
Use Exercises 6.47 and 6.50.
6.52 Prove that if A and B are denumerable sets, then so is A u B.
6.53 For each prime p ∈ N, let A_p be the set of all positive integral powers
of p; that is,
A_p = {p, p^2, p^3, p^4, ...}.
Show that for each prime p, A_p is a denumerable set. Prove that the collection
{A_p | p is prime} = {A_2, A_3, A_5, A_7, A_11, ...}
is denumerable. Note: There are infinitely many primes.
6.54 Use the previous exercise to show that the set N of positive whole
numbers contains infinitely many infinite sets no two of which have any
element in common.

6.55 Use the ideas developed in the previous two exercises to show that if
each of B_1, B_2, B_3, ... is a denumerable set, then so is the set
B = B_1 ∪ B_2 ∪ B_3 ∪ ....

6.56 An equilateral triangle of side length 2 is drawn in the plane, and five
points are selected within this triangle. Prove that some two of these points
must lie within distance 1 of each other. Hint: Use the Dedekind Box
Principle.
6.57 Prove that at least one pair of people in Atlanta, Georgia have the same
number of hairs on their heads.
6.58 Show that the sets
{1, 2, 3, ..., n}
and
{1, 2, 3, ..., m}
can be put into one-to-one correspondence if and only if m = n.
6.59 Prove that if F is a finite set and S is an infinite set, then S ~ S ∪ F.
6.60 So far, every infinite set has turned out to be denumerable. Do you
believe that every infinite set is denumerable? For example, let P be the set
of all polynomials with whole number coefficients. Is P denumerable? Hint:
See Exercise 6.55.

6.6 NONDENUMERABLE INFINITE SETS


We have shown N infinite, and N is denumerable by definition. In Exercise
6.51, you were asked to show that the set Q of all rational numbers is also
denumerable, and in Exercise 6.60 that the set P of all polynomials with
coefficients in W is also denumerable. At this point you might wonder, quite
justifiably, if there might be only two kinds of sets-finite and denumerable-
and thus that all infinite sets could be put into one-to-one correspondence.
Our next theorem shows that this is not the case; the familiar set R of all
real numbers is certainly infinite, but it cannot be put into one-to-one
correspondence with the set N, and is thus nondenumerable. Hence, in a
very precise sense, the set R contains "more" elements than the set N. This
was proved by the mathematician Georg Cantor in 1873; he later discovered
a simpler proof which we next present. This proof depends on the fact that
each real number has a decimal expansion, and that two real numbers are
equal if and only if their decimal expansions are identical. (There is a minor
point here to be discussed in the exercises; 0.99999... and 1.00000... are
different decimal expansions for the same real number; we assume in the
proof to follow that if there should be such ambiguity, the latter form is to
be used.)
1    A_1 . a_11 a_12 a_13 a_14 a_15 ...
2    A_2 . a_21 a_22 a_23 a_24 a_25 ...
3    A_3 . a_31 a_32 a_33 a_34 a_35 ...
4    A_4 . a_41 a_42 a_43 a_44 a_45 ...
5    A_5 . a_51 a_52 a_53 a_54 a_55 ...
.    .
.    .
Fig. 6.13 The hypothetical one-to-one correspondence from N to R.

Theorem 6.7 The set R of all real numbers is nondenumerable.


Proof. We suppose, by way of contradiction, that R is denumerable. Then
there must exist a one-to-one correspondence f from N to R. We arrange a
"diagram" of the action of f as shown in Fig. 6.13.
The left column lists the positive whole numbers, the domain of f. To
the right of each is its value under the action of f. Now we have supposed that
the one-to-one correspondence f exists, but we do not know what it is;
we do not know whether f(1) is 1/4, -7, or π. So we must represent the
values of f according to their decimal expansion. Since it turns out that we will
not have to consider the whole number part of f(n), we have called this
A_n; thus f(n) is shown in the form
A_n . a_n1 a_n2 a_n3 a_n4 a_n5 ...
so that a_nj is the jth digit in the decimal expansion of f(n).
Because f is onto, each real number must appear somewhere in the
right-hand column. We proceed to reach a contradiction by producing a real
number that does not appear anywhere in the right-hand column.
We construct this real number α by going down the diagonal a_11, a_22,
a_33, ..., and changing each of these digits. Specifically, we construct α
as follows: The decimal expansion of α is to be
α = 0 . b_1 b_2 b_3 b_4 b_5 ...,
where b_i is obtained from a_ii as follows. If a_ii is less than 8, let b_i = a_ii + 1.
If a_ii is 8 or 9, let b_i = 0.

Now α does not appear anywhere in the right-hand column, for the
decimal expansion of α differs in at least one place from that of each number
listed in the right-hand column: for each n, the nth digit of α was chosen to
be different from the nth digit of f(n). But α has a decimal expansion, and
thus is indeed a real number. This contradicts the fact that f is onto. Hence
R is nondenumerable.
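The diagonal construction is easy to imitate on a finite table of digits. In the Python sketch below, which is only an illustration, the table rows and the name diagonal_digits are our own inventions; each row holds the first few decimal digits of one listed number, and the rule for changing a diagonal digit is the one used in the proof.

```python
def diagonal_digits(rows):
    """rows[i][j] plays the role of the digit a_(i+1)(j+1).
    Return the digits b_1, b_2, ..., b_n built from the diagonal."""
    b = []
    for i, row in enumerate(rows):
        a_ii = row[i]                           # the diagonal digit
        b.append(a_ii + 1 if a_ii < 8 else 0)   # the rule from the proof
    return b


rows = [
    [2, 5, 0, 0, 0],
    [1, 4, 1, 5, 9],
    [9, 9, 8, 9, 9],
    [3, 3, 3, 3, 3],
    [0, 0, 0, 0, 8],
]
b = diagonal_digits(rows)
print(b)  # [3, 5, 0, 4, 0]
```

The ith digit produced always differs from the ith digit of the ith row, so the new expansion cannot coincide with any row of the table. Note also that the rule never produces the digit 9, so the number built this way cannot end in an unbroken string of 9's.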
As you have seen, we count by means of one-to-one correspondence.
For example, if the set A can be put into one-to-one correspondence with the
set {1, 2, 3, ..., n}, then we say that the set A contains n elements. If A ~ ∅,
then we say that A contains no, or zero, elements. The numbers 0, 1, 2, ...,
that we use in counting the finite sets are called finite cardinal numbers. In
fact, if we consider the collection of all sets that can be put into one-to-one
correspondence with the set {I, 2}, there is but one significant property
common to all sets in this collection, "twoness" or the property of containing
two elements, and to be precise it is exactly this common property we mean
when we speak of the cardinal number 2.
There is nothing to prevent us from giving names to infinite cardinal
numbers as well. By tradition, the name ℵ_0 has been used for the cardinal
number of the set N of positive whole numbers, and thus is the cardinal
number of all denumerable sets. (ℵ is the first letter of the Hebrew alphabet,
and ℵ_0 is usually pronounced "aleph-null.") The German letter c is
customarily used for the cardinal number of the set R of all real numbers,
because c is the first letter of the German word for continuum, used
sometimes as a synonym for the real number line.
We shall not go into the arithmetic of cardinal numbers, but only mention
that it is possible to consider a natural order relation between them. Here
is how this is done.
First, if A is a set, then it has a cardinal number, finite or infinite, and
we denote this number sometimes by [A]. Next, given two sets A and B,
it may be possible to find a one-to-one function f: A → B. If so, then we say
that [A] ≤ [B]. If we can also show that there can be no one-to-one and
onto function from A to B, then we can say in fact that [A] < [B]. This
order relation obeys the expected properties (with one exception: If m and
n are cardinal numbers, it does not follow without an additional powerful
axiom of set theory that either m < n, m = n, or n < m). In particular,
the Cantor-Schroeder-Bernstein Theorem may be translated into the language
of cardinal numbers as follows:
If m and n are cardinal numbers and both m ≤ n and n ≤ m are true,
then m = n.
Finally, Theorem 6.7 may be translated very simply:
ℵ_0 < c.
But so far, the only two infinite cardinal numbers we have seen are just
ℵ_0 and c. Our final theorem says, in effect, that there are infinitely many.

Fig. 6.14 Either z ∈ K_z or z ∉ K_z, but either possibility leads to
a contradiction.


Theorem 6.8 Let S be any set and let 𝒮 be the collection of all subsets of S.
Then there exists a one-to-one function from S to 𝒮 but no such function can
be onto. Hence, in the terminology of cardinal numbers, [S] < [𝒮].
Proof. There clearly does exist a one-to-one function f from S to 𝒮; just
let f(x) = {x} for each x ∈ S. Suppose, by way of contradiction, that
there exists a function g that is a one-to-one correspondence from S to 𝒮.
What g must then do is assign to each element s ∈ S a subset of S, which we
will call K_s; thus K_s ∈ 𝒮, for K_s ⊂ S, and K_s is just another name for g(s).
Consider s ∈ S. Since K_s ⊂ S there are just two possibilities: either
s ∈ K_s or s ∉ K_s. We are particularly interested in the latter case; in fact,
we let
𝒦 = {s ∈ S | s ∉ K_s}.
Now 𝒦 ⊂ S, so that 𝒦 ∈ 𝒮. Since g: S → 𝒮 is onto, there must
exist an element z ∈ S such that g(z) = 𝒦; indeed, since g is one-to-one,
this element z is uniquely determined by 𝒦.
Since g(z) = 𝒦, then 𝒦 = K_z. But where is z? Since z ∈ S and K_z ⊂ S,
either z ∈ K_z or z ∉ K_z. Let us consider each of the two possibilities. Fig.
6.14 may be helpful.
If z ∈ K_z, then z ∈ 𝒦 since 𝒦 = K_z. But 𝒦 is the set of all s ∈ S such
that s ∉ K_s. Since z ∈ K_z, it follows by definition of 𝒦 that z ∉ 𝒦. This is
contrary to the fact that z ∈ 𝒦.
On the other hand, if z ∉ K_z, then z ∉ 𝒦 since 𝒦 = K_z. But by definition
of 𝒦, 𝒦 consists of all those elements s of S such that s ∉ K_s. Since z ∉ K_z,
then z ∈ 𝒦 by definition of 𝒦. This is contrary to the fact that z ∉ 𝒦.
Either way, the assumption that there exists a one-to-one correspondence
g from S to 𝒮 leads to a contradiction. Thus there can be no such function,
and this establishes the theorem.
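For a very small set S, the argument can be checked against every conceivable function g at once. The Python sketch below is our own illustration (the names subsets and no_function_is_onto are hypothetical). For each function g from S to the collection of subsets of S, it forms the set of all s not belonging to g(s) and confirms that this set is not a value of g.

```python
from itertools import combinations, product


def subsets(S):
    """Return the list of all subsets of the finite set S."""
    items = sorted(S)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def no_function_is_onto(S):
    """Check, for every function g from S to its subsets, that the set
    K = {s in S : s not in g(s)} is missed by g."""
    items = sorted(S)
    for choice in product(subsets(S), repeat=len(items)):
        g = dict(zip(items, choice))             # one candidate function g
        K = frozenset(s for s in items if s not in g[s])
        if K in g.values():                      # K would have to be some g(z)
            return False
    return True


print(no_function_is_onto({1, 2, 3}))  # True
```

With three elements there are 8 subsets and hence 512 functions to examine. The proof above is what guarantees the same outcome for every set, finite or infinite, where no enumeration is possible.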

Exercises

6.61 Prove that irrational real numbers exist, using results of this chapter.
6.62 It was mentioned just before the proof of Theorem 6.7 that the same
real number may have two different decimal expansions; the example given
was 0.99999... and 1.00000... . You can prove these two are equal by
using some of the techniques of infinite geometric series (see Exercise 1.45).
Of course, you use the fact that the correct interpretation of the decimal
expansion 0.99999 ... is the sum of the series
9/10 + 9/100 + 9/1000 + ....
6.63 Modify the proof of Theorem 6.7 to show that the set of real numbers
between 0 and 1 is nondenumerable.
6.64 Use the previous exercise and other results to show that the cardinal
number of the set of real numbers between 0 and 1 is c.
6.65 Show that the cardinal number of the set of points in the unit square S
in the plane is also c by using the previous exercise and setting up a one-to-one
function from [0, 1] into S and a one-to-one function from S into [0, 1],
then applying the Cantor-Schroeder-Bernstein Theorem. Note:
S = {(a, b) ∈ R × R | 0 ≤ a, b ≤ 1}.
Hint: If (a, b) ∈ S, then a and b have decimal expansions of the form
a_0 . a_1 a_2 a_3 a_4 ...
and
b_0 . b_1 b_2 b_3 b_4 ...
where a_0 and b_0 are each either 0 or 1. Where is the number
0 . a_0 b_0 a_1 b_1 a_2 b_2 a_3 b_3 ... ?
6.66 Use the definition of "Dedekind Infinite" to prove that the set R of all
real numbers is infinite.
6.67 Let E be the set of all even positive whole numbers. Translate the
statement "E is denumerable" into the language of cardinal numbers.
6.68 If 𝒩 is the collection of all subsets of R and [𝒩] = n, what is the
relation between c and n?
6.69 See Exercise 6.38; translate its statement into the language of cardinal
numbers.

6.70 Translate the statement of Theorem 6.3 into the language of cardinal
numbers.
6.71 Translate the statement of Theorem 6.4 into the language of cardinal
numbers.
6.72 Let ℱ denote the collection of all finite subsets of N. Prove that [ℱ] =
ℵ_0. Hint: See Exercise 6.55.
6.73 If the collection of all sets were a set 𝒮, then every subset of 𝒮 would
be an element of 𝒮. This is a contradiction, but to what? Hint: See
Theorem 6.8.
6.74 What is the cardinal number of the collection of all (unbounded)
straight lines in the plane?
6.75 Suppose that an urn is filled with ℵ_0 balls, numbered 1, 2, 3, .... Let
us call a "stage" in the following experiment the act of removing three balls
from the urn and then replacing two of the balls then outside the urn.
Imagine performing an experiment in which each "stage" is performed
ℵ_0 times. It is clear that after stage 1, one ball is outside the urn; after stage 2,
two balls are outside the urn; and, in general, after stage n, n balls are outside
the urn. The question is this: After ℵ_0 stages (that is, after stage 1, stage 2,
stage 3, ...), how many balls are in the urn?
Be careful. This is a trick question.

NOTES AND REFERENCES


Two good references are
Set Theory and Logic, by Robert R. Stoll (Freeman, 1963).
Set Theory, by Felix Hausdorff, translated by John R. Aumann (Chelsea,
1957).
Recall that we have scarcely discussed the following two questions:
Does there exist a cardinal number n such that ℵ_0 < n < c?
Given two cardinal numbers m and n, must one of the three relations
m < n,   m = n,   n < m
be true?
See Paul J. Cohen's excellent book Set Theory and the Continuum
Hypothesis (Benjamin, 1966) for a discussion of these matters.
Georg Cantor was born in 1845 in Russia; his father was a Dane, and he
grew up in Germany, so his background was somewhat international. He studied
at Zurich and Berlin, and gave promise of becoming a talented and conventional
mathematician. But in 1874 his first revolutionary paper was published, a
paper in which he was the first to attack the previously avoided problem of
the "infinite." In fact, most of the material of this chapter can be found in
Cantor's published works. These innovations shocked mathematicians of
the day, and stimulated violent attacks on Cantor by his colleagues. Cantor
was sensitive and unable to weather the criticisms thrown his way; he had
attacks of irrational anger or overwhelming depression, beginning when he
was about forty, and he died in a mental institution in 1918. By this time he
had been belatedly recognized as the genius he was.
The German mathematician Richard Dedekind was one of Cantor's
few allies during the above troubles. Dedekind had a long life-he lived from
1831 until 1916-and a very productive one; he suffered a mild form of the
same sort of attack directed against Cantor because of his own researches.
He is well known for putting the concept of an irrational number into a
logically sound structure, and his brilliant ideas were not at first universally
well received. However, he did live long enough to become recognized as one
of Germany's greatest mathematicians.
CHAPTER 7

NUMBER THEORY

The theory of numbers is meant principally to answer questions about the set
N = {1, 2, 3, 4, ...}
of positive whole numbers, or natural numbers, and sometimes about the set
Z = {..., -2, -1, 0, 1, 2, 3, ...}
of integers, or whole numbers. However, the techniques used to answer such
questions frequently involve the rational numbers, complex numbers, or even
calculus. The questions themselves have fascinated people for centuries-
indeed, number theory is one of the oldest branches of mathematics-perhaps
because the questions themselves are easy to pose, and the privilege of
searching for the answers is available to almost everyone. Indeed, for his
research in number theory Pierre de Fermat is known as the "Prince of
Amateurs"; he was a French jurist of the seventeenth century who made
many of the most important contributions to number theory.
We assume that you are familiar with the arithmetic properties of the
sets N and Z; that is, that addition and multiplication are both commutative
and associative, that multiplication distributes over addition, and so on.
With this background you may plunge immediately into number theory.
One of the cornerstone concepts is that of divisibility, for upon this concept
are built many of the other important definitions and theorems of number
theory-and so it is with divisibility that we begin.

7.1 DIVISIBILITY
The integer m is said to be divisible by the integer d provided that there
exists an integer k such that m = dk. If so, then d is said to be a divisor of
m and m is said to be a multiple of d. If d is a divisor of m, we write the
shorthand expression d | m and read this expression as "d divides m."
For example, the number 6 has exactly four natural number divisors;
namely, 1, 2, 3, and 6 itself. If our context is the set Z of integers, then the
number 6 would have twice as many integral divisors: the ones listed above
together with their negatives. Because there is such a close relationship
between the natural number divisors of an integer m and its integral divisors,
we will in the future always mean (unless otherwise stipulated) by a divisor d
of the integer m a natural number divisor.
The integers 0, -1, and 1 are rather special so far as divisibility is
concerned. First, 0 is divisible by all whole numbers, for given a number d
it is always true that d | 0. To see this, just take m = 0 and k = 0 in the
definition of divisibility; then, since 0 = 0 · d, it follows by definition that d
is a divisor of 0. Moreover, the only integer of which 0 is a divisor is 0 itself,
for the equation m = k · 0 has a solution k only for m = 0. On the other
hand, 1 and -1 are divisors of every integer, while they themselves are their
only divisors. Note that each natural number other than 1 has at least two
divisors, itself and 1. Some natural numbers, such as 5, have no other divisors;
others, such as 6, do.
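The definition of divisibility translates directly into a small computation. The Python sketch below is offered only as an illustration; the function name divisors is our own. It lists the natural number divisors of m by testing each candidate d from 1 to m, and then forms the integral divisors by adjoining the negatives.

```python
def divisors(m):
    """Return the natural number divisors of the natural number m."""
    return [d for d in range(1, m + 1) if m % d == 0]


print(divisors(6))   # [1, 2, 3, 6]
print(divisors(5))   # [1, 5]

# The integral divisors are the natural number divisors and their negatives.
integral = sorted(divisors(6) + [-d for d in divisors(6)])
print(integral)      # [-6, -3, -2, -1, 1, 2, 3, 6]
```

Testing every candidate up to m is wasteful for large m, but it follows the definition as closely as possible.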
Divisibility may be studied for its own sake. It should be easy to see that
if d and m are natural numbers and d | m, then 1 ≤ d ≤ m. With this in
mind we can establish our first theorem.

Theorem 7.1 If a, b, and c are natural numbers, then:

a) a | a.
b) If a | b and b | a then a = b.
c) If a | b and b | c then a | c.
To prove this theorem it is necessary only to look at the definition of
divisibility. The details of the proof are outlined in the exercises for you to
complete. Note the close similarity between the behavior of the symbols
| and ≤; the above theorem is true if the former symbol is replaced by the
latter.
The natural number 1 has but one divisor (remember, by "divisor" we
mean only natural number divisors); 1 is the only such natural number, so we
divide the others into two classes, as indicated in the next definition.
The natural number p is said to be prime provided that p has exactly
two divisors. If the natural number m has three or more divisors, then m is
said to be composite.
Note that 1 is neither prime nor composite. If you need a name for such a
situation, you may refer to 1 as a unit, as it is called in some branches of
mathematics. It is easy to see that there are infinitely many composite
numbers (why?). On the other hand, prime numbers do not seem so plentiful.
The first twenty, as you may easily verify, are as follows:
2 3 5 7 11 13 17 19 23 29
31 37 41 43 47 53 59 61 67 71

There are twenty-five primes between 1 and 100, sixteen between 1000 and
1100, eleven between 10,000 and 10,100, and only six between 100,000 and
100,100. Since primes tend to become less plentiful among the larger numbers,
you might suspect that there are only finitely many primes. However, it
has been known for several thousand years that the opposite is true-there
are in fact infinitely many primes, just as there are infinitely many composite
numbers. An outline of the simple proof of this fact appears as one of the
exercises at the end of this section; in order to supply the details of this proof
all that is needed is our next result.
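The counts of primes quoted above can be reproduced by machine. The following Python sketch is our own illustration; the names is_prime and count_primes are hypothetical, and the test for primality by trial division is a convenient choice of ours rather than anything prescribed by the text.

```python
def is_prime(p):
    """Trial division: p is prime if p > 1 and no d with d * d <= p divides p."""
    if p < 2:
        return False
    d = 2
    while d * d <= p:
        if p % d == 0:
            return False
        d += 1
    return True


def count_primes(lo, hi):
    """Count the primes p such that lo <= p <= hi."""
    return sum(1 for k in range(lo, hi + 1) if is_prime(k))


for lo, hi in [(1, 100), (1000, 1100), (10000, 10100), (100000, 100100)]:
    print(lo, hi, count_primes(lo, hi))
```

The four counts printed should agree with the figures twenty-five, sixteen, eleven, and six given in the text.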

Theorem 7.2 If m is a composite natural number, then m has a prime factor;
that is, m = kp where k and p are natural numbers and p is prime.
Proof. Suppose that m is a composite natural number. Then m does have
divisors between 1 and m, so that the set
D = {d ∈ N | 1 < d < m and d | m}
is a nonempty set of natural numbers. Moreover, this set is finite, since it can
contain no more than m - 2 numbers. Hence it is possible to select from the
set D its least element, which we will call p. It now suffices to show that p
must be prime.
Suppose by way of contradiction that p is not prime. Then, since p ∈ D,
also p > 1; hence p must be composite. Thus p = ab, where a and b are
natural numbers such that
1 < a < p and 1 < b < p.
We need consider only the number a in order to reach a contradiction.
Since a | p and p | m, it follows from Theorem 7.1 that also a | m.
Hence a ∈ D, since 1 < a < p < m. But p was chosen as the least element of
D, and a < p. This is a contradiction, and hence p cannot be composite. Thus
p is prime. This establishes the theorem, for p is thus a prime factor of m.
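The proof suggests a simple experiment: pick out the least divisor of m that exceeds 1 and check that it has exactly two divisors. The Python sketch below does this for a few composite numbers of our own choosing; the function names are hypothetical and the code is an illustration, not a substitute for the proof.

```python
def least_divisor_above_one(m):
    """Return the least element p of D, the least divisor of m exceeding 1."""
    for d in range(2, m + 1):
        if m % d == 0:
            return d


def divisor_count(p):
    """Return the number of natural number divisors of p."""
    return sum(1 for d in range(1, p + 1) if p % d == 0)


for m in [91, 221, 323, 1001]:
    p = least_divisor_above_one(m)
    # p is prime exactly when it has two divisors
    print(m, p, divisor_count(p) == 2)
```

For these four values the least divisors found are 7, 13, 17, and 7, and each has exactly two divisors.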
As we have already mentioned, our next theorem follows easily now that
Theorem 7.2 is established, and the proof is outlined in the exercises.
Theorem 7.3 There are infinitely many primes.
It is quite true that the primes do tend to become more and more sparsely
distributed among the larger numbers. This tendency is well illustrated by
the next theorem, whose proof is also outlined in the exercises.
Theorem 7.4 Given a natural number n, it is possible to find a sequence of n
consecutive natural numbers each of which is composite.
In other words, given a natural number n, there do exist two consecutive
primes whose difference is at least n.
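One way to exhibit such a sequence, assumed here for illustration, is the factorial construction: each of the numbers (n + 1)! + 2, (n + 1)! + 3, ..., (n + 1)! + (n + 1) has an obvious divisor. The Python sketch below is not from the original text, and the function names composite_run and witnesses are hypothetical. It lists the run for n = 5 together with a divisor of each member.

```python
from math import factorial


def composite_run(n):
    """Return the n consecutive numbers (n+1)! + 2, ..., (n+1)! + (n+1)."""
    base = factorial(n + 1)
    return [base + k for k in range(2, n + 2)]


def witnesses(n):
    """Pair each number in the run with a divisor k showing it is composite."""
    base = factorial(n + 1)
    return [(base + k, k) for k in range(2, n + 2)]


for number, k in witnesses(5):
    # k divides (n+1)! because 2 <= k <= n+1, so k also divides (n+1)! + k.
    assert number % k == 0 and 1 < k < number
    print(number, "is divisible by", k)
```

The run produced this way is usually far from the first run of n consecutive composite numbers; the construction only shows that some such run exists.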

Exercises

7.1 List the divisors of 60, 100, 117, and 119. In each case, circle the least
divisor that exceeds 1, and go on to the next exercise.
7.2 Is the least divisor of 60 (other than 1) prime? Does this also hold for
100, 117, and 119?
7.3 The proof of Theorem 7.2 actually "proves" a little more than is needed.
Rephrase Theorem 7.2 with the stronger conclusion that follows from the
proof given.
7.4 If the natural number n has d divisors, how many integral divisors has n?
7.5 Suppose that a and b are integers such that both a | b and b | a are true.
What can you say about the relationship between a and b? In particular,
must it be true that a = b? Why?
7.6 Let p and q be primes such that p | q. What can you say about the
relationship between p and q?
7.7 List all the even primes. How many odd primes are there?
7.8 The text stated that if d and m are natural numbers such that d | m,
then 1 ≤ d ≤ m. Fill in the details of the following possible proof of this
fact.
First, it is clear that 1 ≤ d (why?).
Let us suppose by way of contradiction that m < d. Then the difference
d - m is a positive whole number; say,
d - m = a.
Since d | m, m = dk for some natural number k (why?). Moreover,
k ≠ 1 (why?). Hence k ≥ 2. But then
k = m/d
= (m + a - a)/d
= d/d - a/d
= 1 - a/d.
(Justify each equality.) But a/d is positive, and hence
k = 1 - a/d < 1.
(Why?) This is a contradiction (why?), and hence d ≤ m (why?). Therefore
1 ≤ d ≤ m.
7.9 Suppose that a, b, and c are natural numbers such that both a | c and
b | c are true. Does it follow that ab | c?
7.10 Suppose that a, b, and c are natural numbers such that a | b and a | c.
Does it follow that a | bc?
7.11 Suppose that a, b, and c are natural numbers such that a | bc. Does
it follow that either a | b or a | c must be true?
7.12 Is it possible to have two composite natural numbers a and b such that
neither a | b nor b | a is true?
7.13 Here is the outline of the proof of Theorem 7.1. First, to show in part
(a) that a | a is true, it suffices to find an integer k such that a = ak. What
value do you choose for k?
Next, suppose that a | b and b | a. Apply Exercises 7.5 and 7.8 in order
to conclude that a = b; or, if you prefer, use the following approach: Since
a | b, there exists an integer k such that b = ak. Since b | a, there exists an
integer j such that a = bj. Substitute b = ak into the latter equation. What
conclusion can be drawn about the number kj? What must be the values of
k and j? Why does this imply that a = b?
Finally, for part (c), suppose that both a | b and b | c are true. Apply
the definition of divisibility to obtain two equations similar to the two above.
As above, substitute one into the other. Does the desired conclusion a | c
follow?
7.14 The text asked why there are infinitely many composite numbers.
Supply the reason.
7.15 Show that there are infinitely many odd composite natural numbers.
7.16 Here is the outline of the proof of Theorem 7.3, which states that there
are infinitely many primes. Supply the details.
First, suppose by way of contradiction that there are only finitely many
primes. If so, the primes could be listed in increasing order, as follows:
p_1, p_2, p_3, ..., p_n,
where p_i stands for the ith prime in the complete list of all n primes above.
Form the number
q = p_1 p_2 p_3 ... p_n + 1.
Now q > 1 (why?), so q must be either prime or composite. But q
cannot be prime (why?). So q must be composite. Hence q has a prime factor
p (why?).
None of the primes in the complete list
p_1, p_2, p_3, ..., p_n
is a divisor of q (why?). Hence since p | q, p cannot appear in this list.
This is a contradiction (why?). Hence there are indeed infinitely many
primes, and this establishes Theorem 7.3.
7.17 Give the reasons, where needed, in the following outline of a proof of
Theorem 7.4: Given a natural number n, it is possible to find a sequence of n
consecutive natural numbers each of which is composite.
Let the natural number n be given. By n! we mean the product of all the
natural numbers from 1 to n; that is,
n! = n·(n - 1)·(n - 2)·(n - 3) ... 3·2·1.

Show that there are n consecutive composite natural numbers in the sequence
(n + 1)!, (n + 1)! + 1, (n + 1)! + 2, (n + 1)! + 3, ...,
(n + 1)! + (n - 1), (n + 1)! + n, (n + 1)! + (n + 1).
7.18 In the proof of Theorem 7.3 outlined in Exercise 7.16, a finite number
of primes are multiplied together, the number 1 is added to the product,
and the fact that the resulting number is composite leads to a contradiction.
It would seem to follow that if the first n primes were multiplied together
and the number 1 added to the product, thus obtaining
1 + p_1 p_2 p_3 ... p_n,
then this number would be prime. Is this always so? If not, does this mean
that the proof given for Theorem 7.3 is invalid? Explain your answer.
7.19 It follows from Exercise 7.8 that no natural number can have infinitely
many divisors. Why not?
7.20 A surprising formula is that if a natural number n has k divisors, then
their product is
√(n^k).
Verify this formula for three different two-digit values of n.
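For readers who would like to test the formula of Exercise 7.20 on many values at once, here is a small Python sketch; it is not part of the original text, and the helper names divisors and check_divisor_product are our own. To stay within whole-number arithmetic, it compares the square of the product of the divisors with n^k instead of taking a square root.

```python
from math import prod


def divisors(n):
    """List all natural-number divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]


def check_divisor_product(n):
    """Check that (product of the k divisors of n) squared equals n**k."""
    ds = divisors(n)
    k = len(ds)
    return prod(ds) ** 2 == n ** k


# Every two-digit number passes the check.
print(all(check_divisor_product(n) for n in range(10, 100)))
```

A check of this kind verifies particular cases only; explaining why the formula holds is a separate matter.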

7.2 WELL-ORDERING
One property of the set N of natural numbers is quite important in establish-
ing many results in number theory. It is strange that this property has nothing
to do with the algebraic structure of the natural number system; at least on the
surface, it contains no mention of either addition or multiplication. We
state it next.

Well-Ordering Axiom Each nonempty subset of the set N of natural numbers
contains a least element.

For example, there is a least prime, a least composite natural number, a
least number which is the sum of two squares (of natural numbers), a least
number which can be expressed as the sum of two squares in two different
ways, a least common multiple of 11 and 13, and a least natural number
whose decimal representation uses all the odd digits. The existence of each
of these numbers is shown by establishing that the set in question is nonempty
in each case; finding the least element of that set may be more difficult in
practice. In the last case, the number 971,513 is a natural number whose
decimal representation uses all the odd digits; hence the set of all such
natural numbers is nonempty. The Well-Ordering Axiom guarantees that
this set contains a least element, and thus there does exist a least natural
number whose decimal representation uses all the odd digits.
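The axiom guarantees only that the least element exists; to find it we must search. As an illustration, here is a brute-force search in Python for the last example above. This sketch is not from the original text, and the name least_with_all_odd_digits is hypothetical. It examines 1, 2, 3, ... in order and stops at the first number whose decimal digits include every odd digit, so the number returned is necessarily the least one.

```python
from itertools import count

ODD_DIGITS = set("13579")


def least_with_all_odd_digits():
    """Return the least natural number whose digits include 1, 3, 5, 7, 9."""
    for n in count(1):
        if ODD_DIGITS <= set(str(n)):
            return n


print(least_with_all_odd_digits())  # prints 13579
```

The search is certain to stop because the set is known to be nonempty: it will halt no later than 971,513.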
In contrast, the Well-Ordering Axiom does not hold for some other
number systems. For example, you should have no difficulty in finding a
nonempty set of integers which does not contain a least element.
The Well-Ordering Axiom is logically equivalent to the very useful
Induction Principle for the set N. Induction was first discussed in Exercise
1.17; we next state the Induction Principle, and then prove its equivalence
to the Well-Ordering Axiom in our following two theorems.

Induction Principle for N Suppose that 𝒮 is a statement meaningful for
each natural number, and suppose moreover that both

a) 𝒮 is true of the number 1, and

b) whenever 𝒮 is true of the number n, then 𝒮 is also true of the number
n + 1.

Then 𝒮 is true for each natural number.

It is amusing and sometimes helpful to visualize the Induction Principle
by the device shown in Fig. 7.1. A row of dominoes is depicted, one for each
natural number, and the dominoes are so arranged that whenever the domino
numbered n falls over, it knocks over the domino numbered n + 1. This is
in analogy to the second hypothesis of the Induction Principle. We knock
over the domino numbered 1 (this is in analogy to the first hypothesis of the
Induction Principle) and we obtain the analogous conclusion: All the
dominoes fall over. Before showing the equivalence of the Induction Principle
and the Well-Ordering Axiom, we provide an example of how the Induction
Principle may be used to prove a theorem in number theory.

Fig. 7.1 Visualizing the induction principle as falling dominoes.

Example 7.1 In Exercise 7.8 we provided an outline of a proof that if d and
m are natural numbers such that d | m, then 1 ≤ d ≤ m. Here is a more
elegant proof, using the Induction Principle for N.
Let d and m be natural numbers such that d | m. Since d is a natural
number, d ≥ 1. Moreover, since d | m, then by definition m = dk for some
natural number k. Suppose by way of contradiction that d > m.
Then let 𝒮 be the statement about the natural number n, "m < dn."
Then 𝒮 is certainly a statement meaningful for each natural number
value of n, for given n, it can be determined whether it is true or false that
m < dn. Moreover, part (a) of the hypotheses of the Induction Principle
is satisfied, for the statement 𝒮 is true for the value n = 1. (Recall that we
have supposed, by way of contradiction, that m < d = d·1.)
Suppose that 𝒮 is true for the natural number n. Then m < dn. But
n < n + 1, and so dn < d·(n + 1). Consequently, we have also that
m < d·(n + 1). So part (b) of the hypotheses of the Induction Principle is
also satisfied, for if 𝒮 is true for n then 𝒮 is true for n + 1.
Therefore by the conclusion of the Induction Principle, it must be the
case that 𝒮 is true for all values of n. But recall that m = dk for some
natural number k. The fact that 𝒮 is true for all values of n and the fact that
k is a natural number together imply that m < dk. But we know that m = dk.
This is a contradiction, and hence the assumption that m < d must be false.
Hence if d | m, then 1 ≤ d ≤ m.
Other examples of applications of this principle to prove various theorems
may be found elsewhere in this book, in the next set of exercises, and in
George Polya's book Induction and Analogy in Mathematics (Princeton
University Press, 1954). We proceed to show the logical equivalence of the
Well-Ordering Axiom and the Induction Principle for N.

Theorem 7.5 The Well-Ordering Axiom implies the Induction Principle for N.

Proof We assume that the Well-Ordering Axiom is true; that is, every
nonempty set of natural numbers contains a least element. We assume also
the hypotheses of the Induction Principle: that 𝒮 is a statement meaningful
for each natural number, that 𝒮 is true of the number 1, and that whenever
𝒮 is true of the natural number n, it follows that 𝒮 is also true for the natural
number n + 1. We wish to show the conclusion of the Induction Principle,
that 𝒮 must be true for every natural number.
Suppose, by way of contradiction, that 𝒮 is not true for every natural
number. Let F be the set of natural numbers for which 𝒮 is false. Then F
is nonempty because of the above assumption, and hence contains a least
element k because of the Well-Ordering Axiom. Moreover, k ≠ 1 since 𝒮 is
true of the number 1 by hypothesis. So k > 1.
Hence k - 1 is also a natural number, and moreover, 𝒮 is true of the
number k - 1 because k is the least natural number for which 𝒮 is false.
But we have one more hypothesis to use: That whenever 𝒮 is true of a
given natural number n, then also 𝒮 must be true for the number n + 1.
In particular, since k - 1 is a natural number for which 𝒮 is true, then also
𝒮 must be true for (k - 1) + 1 = k. But 𝒮 is false for k, and so we have
reached a contradiction. Consequently, 𝒮 must be true for each natural
number. This establishes the Induction Principle for N, and completes the
proof of Theorem 7.5.
Now for the converse.

Theorem 7.6 The Induction Principle for N implies the Well-Ordering Axiom.

Proof To establish the Well-Ordering Axiom, we must begin with a nonempty
subset of N and use the Induction Principle to show the existence of
a least element of that set.
So let S be a nonempty set of natural numbers. We suppose by way of
contradiction that S contains no least element. Then, in particular, the
number 1 is not an element of S, for if it were then it would be the least
element of S.
Now let 𝒮 be the statement, "If 1 ≤ k ≤ n, then k is not an element of
S." Then 𝒮 is clearly a statement meaningful for each natural number value
of n, for given a natural number n we can certainly establish whether or not
it is true that each natural number between 1 and n fails to belong to S. We
now apply the Induction Principle to the statement 𝒮.
First, 𝒮 is certainly true for the value n = 1, for in that case the only
value of k such that 1 ≤ k ≤ n is k = 1 as well, and we have already seen
that 1 is not an element of S.
We suppose that 𝒮 is true of the natural number n; in order to obtain
the conclusion of the Induction Principle we need only show that it follows
that 𝒮 is necessarily also true of n + 1. Having supposed that 𝒮 is true of
n, we know that if 1 ≤ k ≤ n, then k is not an element of S; in particular,
n itself is not an element of S.
If 𝒮 were false for the number n + 1, then for some natural number k
between 1 and n + 1, k would be an element of S. But we have seen that
k cannot lie between 1 and n, so such a value of k could only be n + 1
itself. Thus n + 1 would have to be an element of S. But then n + 1 would
in fact be the least element of S, for no number between 1 and n belongs to S.
But since we have supposed that S contains no least element, this situation
is impossible, and hence the statement 𝒮 must be true for n + 1.
So we have shown that 𝒮 is true for 1, and that whenever 𝒮 is true for
n it must also be true for n + 1. Hence by the Induction Principle, 𝒮 must
be true for each natural number n. Now 𝒮 was chosen to be the statement,
"If 1 ≤ k ≤ n, then k is not an element of S." In particular, since 𝒮 is
known to be true for all values of n, it follows that no natural number is an
element of S. Since S is a set of natural numbers and natural numbers only,
S must be the empty set. This contradicts our assumption that S is nonempty,
and this contradiction establishes that S must contain a least element after all.
This concludes the proof of Theorem 7.6.

Exercises

7.21 Use the Induction Principle to prove that the sum of the first n natural
numbers is n(n + 1)/2; that is, that

1 + 2 + 3 + ... + n = n·(n + 1)/2.

7.22 Use the Induction Principle to prove that the following formula holds
for all natural numbers n:

1/(1·2) + 1/(2·3) + 1/(3·4) + ... + 1/(n·(n + 1)) = n/(n + 1).

7.23 In the proof of Theorem 7.6, what goes wrong if we replace the statement
𝒮 used there by the far simpler statement, "If n is a natural number
then n is not an element of S."?
7.24 It is frequently easier to apply the Induction Principle when it is stated
in the following form:
Suppose that 𝒮 is a statement meaningful for natural numbers, that 𝒮
is true of the natural number 1, and that whenever 𝒮 is true of every natural
number less than the number n, then 𝒮 is true for n as well. Then 𝒮 is true
for every natural number.
Show that this form of the Induction Principle follows from the form
previously stated. Note: Since the original form of the Induction Principle
has been shown equivalent to the Well-Ordering Axiom, it may be easier
(and is certainly sufficient) to show that the above form of the Induction
Principle follows from the Well-Ordering Axiom.
7.25 Let S be the set of all positive real numbers. Does S contain a least
element? Does the Well-Ordering Axiom hold for the real number system?
7.26 Suppose that 𝒯 is a statement meaningful for natural numbers, that
𝒯 is true of the natural number 5, and that whenever 𝒯 is true for the natural
number n, then also 𝒯 is true for the natural number n + 1. For what
natural numbers need 𝒯 be true? Can you prove this?
7.27 Compute the values of the successive sums
1, 1 + 3, 1 + 3 + 5, 1 + 3 + 5 + 7, ....
Guess a formula and prove it by induction.
7.28 Does the Well-Ordering Axiom hold for the set
E = {2, 4, 6, 8, 10, ... }
of all even natural numbers? Can you prove this?
7.29 Does the Well-Ordering Axiom hold for the set of all rational numbers?
Give a reason for your answers.
7.30 Compute the values of the successive sums
1, 1 + 1/2, 1 + 1/2 + 1/4, 1 + 1/2 + 1/4 + 1/8,
1 + 1/2 + 1/4 + 1/8 + 1/16, ....
Guess a formula and prove it by induction.
7.31 The number 125 can be expressed as the sum of two squares in two
different ways:
125 = 10^2 + 5^2 = 11^2 + 2^2.
Knowing this, could you prove that there is a least natural number which can
be expressed as the sum of two squares in two different ways? If so, what is it?
7.32 Prove by induction: If S is a finite set containing n elements, then S has
2^n subsets. Hint: Having assumed this proposition true for n, suppose that S
is a set containing n + 1 elements. Choose a ∈ S and let T = S - {a}.
Then T contains n elements, and hence has 2^n subsets. Note that every subset
of S is either a subset of T or a subset of T with the element a adjoined.
7.33 Let the sequence

s_1, s_2, s_3, ...

of real numbers be defined as follows:

s_1 = 1/1 = 1,
s_2 = 1 + 1/s_1 = 2/1,
s_3 = 1 + 1/s_2 = 3/2,
s_4 = 1 + 1/s_3 = 5/3,

and, in general, for each natural number n,

s_{n+1} = 1 + 1/s_n.

If you write a few more terms of this sequence you will notice that

s_1 < s_3 < s_5 < ... < s_6 < s_4 < s_2.

Prove this by induction. Note: It suffices to prove that

s_n < s_{n+2} if n is odd;
s_{n+2} < s_n if n is even; and
s_m < s_n if m is odd and n is even.

It may help if you first establish that if s_n = a/b, then s_{n+1} = (a + b)/a.
7.34 Prove by induction: If x and y are positive real numbers such that
x < y, then x^n < y^n for each natural number n.
7.35 Prove by induction: For each natural number n, 3 | (n^3 - n).
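Before attempting an inductive proof, it can be reassuring to test a stated formula on a range of values. The Python sketch below is not part of the original text, and the function name spot_check is our own. It tests three of the formulas stated outright in the exercises above: the sum 1 + 2 + ... + n = n(n + 1)/2 of Exercise 7.21, the sum 1/(1·2) + 1/(2·3) + ... + 1/(n(n + 1)) = n/(n + 1) of Exercise 7.22, and the divisibility claim 3 | (n^3 - n) of Exercise 7.35. Exact fractions are used so that no rounding error can creep in.

```python
from fractions import Fraction


def spot_check(n):
    """Return True if all three stated formulas hold for this value of n."""
    sum_ok = sum(range(1, n + 1)) == n * (n + 1) // 2
    partial = sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))
    fraction_ok = partial == Fraction(n, n + 1)
    divisible_ok = (n ** 3 - n) % 3 == 0
    return sum_ok and fraction_ok and divisible_ok


print(all(spot_check(n) for n in range(1, 101)))
```

Agreement for the first hundred values of n is encouraging but proves nothing about the hundred-and-first; that is precisely the gap the Induction Principle closes.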
7.3 THE FUNDAMENTAL THEOREM OF ARITHMETIC


If you factor a given composite number as much as possible, you will find
that it can be factored into the product of primes. Moreover, no matter how
you go about this factorization, you will always obtain, for a given composite
number, the same answer; that is, the prime factorization you obtain will be
unique except possibly for the order in which you write the primes down.
For example, you could factor 144 as follows:
144 = 2·72
= 2·2·36
= 2·2·6·6
= 2·2·2·3·2·3
= 2^4 · 3^2.
We establish this result as follows: We first show that each composite
number can be factored into the product of primes. Then we establish the
so-called Euclidean Algorithm. Finally, using the Euclidean Algorithm,
we show that the factorization that exists must be unique. Note the use of
the Well-Ordering Axiom in each of these three theorems.
Theorem 7.7 If m is a composite natural number, then m is the product of
primes; that is,
m = p_1 p_2 p_3 ... p_n,
where p_i is prime for 1 ≤ i ≤ n.
Proof Suppose by way of contradiction that the theorem is false. Then, by
the Well-Ordering Axiom, there exists a least composite natural number not
expressible as the product of primes; we denote that number by m.
Since m is composite, m = ab, where 1 < a < m and 1 < b < m.
Since each of a and b exceeds 1, each is either prime or composite. We shall
consider only one case, the one in which a is composite and b is prime; the
other cases are handled similarly.
Since m is the least composite number not expressible as the product of
primes and a is a composite number less than m, then a can be expressed as
the product of primes; thus
a = p_1 p_2 p_3 ... p_k,
where p_i is prime for 1 ≤ i ≤ k. Hence
m = ab
= p_1 p_2 p_3 ... p_k b,
and, since each number in the last expression is prime, we have expressed m
as the product of primes. This contradiction establishes the theorem.
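The proof suggests a procedure: split m as ab, then split the factors in turn until only primes remain. Here is a minimal recursive sketch in Python along those lines. It is not from the original text; the function names split and prime_factors are hypothetical, and the choice of the divisor nearest the square root of m is ours, made only so that both factors are often composite.

```python
from math import isqrt, prod


def split(m):
    """Return (a, b) with m == a * b and 1 < a <= b < m, or None if m is prime."""
    for a in range(isqrt(m), 1, -1):
        if m % a == 0:
            return a, m // a
    return None


def prime_factors(m):
    """Factor m >= 2 into primes by splitting m = ab and factoring a and b."""
    if m < 2:
        raise ValueError("m must be at least 2")
    parts = split(m)
    if parts is None:  # m has no factorization m = ab, so m is prime
        return [m]
    a, b = parts
    return sorted(prime_factors(a) + prime_factors(b))


print(prime_factors(144))  # prints [2, 2, 2, 2, 3, 3]
assert prod(prime_factors(144)) == 144
```

Here 144 is first split as 12·12, not as 2·72 as in the text, yet the same primes appear in the end.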

Fig. 7.2 23 = (4)·5 + 3, and m = dq + r.

Perhaps you have begun to notice a pattern in most of our proofs that
use the Well-Ordering Axiom. We first suppose that the theorem is false;
then the set of natural numbers for which the theorem does not hold is non-
empty, and thus contains a least element. Using this smallest exception to
the theorem, we produce smaller numbers for which the theorem must then
be true. We use the truth of the theorem about the smaller numbers (in
the above theorem, factors of the original number) to establish the truth
of the theorem about the original number, the number about which the
theorem was supposed to be false. This contradiction establishes the theorem.
We next use the Well-Ordering Axiom to establish the Euclidean
Algorithm, which says, quite simply, that it is possible to divide one integer
into another so as to obtain not only a quotient but also a nonnegative
"small" remainder. We shall treat only the case in which the divisor is
positive; the case in which it is negative can be handled similarly.
Theorem 7.8 Let m be an integer and d a natural number. Then there exist
integers q and r such that m = dq + r and 0 ≤ r < d.
Proof In spite of the fact that the statement of this theorem contains no
mention of divisibility, it really is a theorem about dividing one number into
another. See Fig. 7.2.
We take as an example the case where d = 5 and m = 23. You recall
from the chronological vicinity of the third grade the process by which one
divides 5 into 23, as shown in Fig. 7.2. The quotient is 4 and the remainder
is 3. Moreover, we hope you also recall the method by which your third-
grade teacher asked you to check your work. You were to multiply the
divisor by the quotient and add the remainder to this product; if you obtained
the dividend, your arithmetic was correct. Finally, you should also recall
that your answer was "wrong" unless your remainder was a nonnegative
integer less than the divisor; in this example,
0 ≤ 3 < 5
is true, and the arithmetic check looks like
(5·4) + 3 = 23.

Fig. 7.3 Arithmetically correct ways of dividing 5 into 23:
23 = (2)·5 + 13 and 23 = (6)·5 + (-7).

The general case is also shown in Fig. 7.2. The divisor d is divided into
the dividend m, obtaining the quotient q and the remainder r. The equivalent
check would be to verify that
0 ≤ r < d,
and that
dq + r = m.
But this is precisely the statement of this theorem. So the theorem really
does say that it is possible to divide one whole number by another and obtain
a quotient and a remainder, the remainder nonnegative and less than the
divisor.
Without the requirement that the remainder r satisfy the inequality
0 ≤ r < d, there would be infinitely many ways to divide d into m in which
the arithmetic would check. For example, we could divide 5 into 23 with a
quotient of 2 and a remainder of 13, as shown in Fig. 7.3; in this case we
would be informed by the third-grade teacher that our quotient is not large
enough because "5 goes into 23 more than twice." Or if you prefer, you
could obtain a quotient of 6 and a "remainder" of - 7; again, the arithmetic
checks, as indicated in Fig. 7.3. Here the third-grade teacher would inform
us that the remainder is supposed to be nonnegative. The restriction that the
remainder be nonnegative and less than the divisor provides not only the
answer referred to as "correct" by the third-grade teacher, but also provides
a unique choice of q and r, as we shall prove.
Our method of proof will be to look at all possible such divisions of d
into m, without the restriction that 0 ≤ r < d, and by use of the Well-
Ordering Axiom select the division that gives the least nonnegative value of r.
We will then show that 0 ≤ r < d; the quotient q will automatically "work"
in that m = dq + r. Finally, we show that this choice of q and r is indeed
unique. Now for the proof of Theorem 7.8.

Fig. 7.4 Possible remainders upon division of m by d: r_i = m - d·i for each
integer i (..., r_{-2}, r_{-1}, r_0, r_1, r_2, r_3, ...).

Recall that we are given the divisor d, which is a natural number, and the
dividend m, which is an integer. For each integer i, we let

r_i = m - di.

In other words, as indicated in Fig. 7.4, we look at all possible ways of
"dividing" d into m, and associate with each possible quotient i the
arithmetically correct remainder; since r_i = m - di, it follows that our
arithmetic check m = di + r_i will be automatically true. We have denoted the
remainder r_i with the subscript i because the value of the remainder depends
on the choice of the quotient i.
We first show the existence of a remainder r_i such that 0 ≤ r_i < d.
We do this by considering the set of all nonnegative remainders appearing
as in Fig. 7.4; if we can show this set is nonempty, then we can choose its
least element, which is certainly a reasonable candidate for the inequality
r_i < d.
If m ≥ 0, we let i = -1. Then the corresponding remainder r_{-1} =
m + d is also nonnegative. On the other hand, if m is negative, we let
i = m. Then
r_m = m - dm
= m(1 - d)
= (-m)(d - 1),
and since -m > 0 and d - 1 ≥ 0, r_m ≥ 0. In either case, there is at least
one nonnegative remainder. Hence the set R of all nonnegative remainders
is a nonempty subset of the nonnegative whole numbers. It should be clear
that the Well-Ordering Axiom holds for the set N ∪ {0} as well as N, so
we can select from R its least element, which we denote simply by r, and we
let the corresponding quotient be denoted by q.
We know that
r = m - dq,
and that
0 ≤ r;
hence
m = dq + r,
and
0 ≤ r.
If we can show that also r < d, this will complete the proof of the existence
of the desired numbers q and r.
Keep in mind that r is the least possible nonnegative remainder upon
dividing d into m. Suppose by way of contradiction that d ≤ r. Then the
number s = r - d is nonnegative, and s < r because 0 < d. Now
m = dq + r
= dq + d + r - d
= d(q + 1) + r - d
= d(q + 1) + s.
But since m = d(q + 1) + s, we have expressed the division of d into
m with a new quotient q + 1 and a remainder s such that 0 ≤ s < r. This
is in contradiction to the fact that r is the least possible nonnegative remainder.
Hence our supposition that d ≤ r leads to a contradiction, and therefore
the desired inequality r < d must be true. Thus we have shown the existence
of integers q and r such that both
m = dq + r and 0 ≤ r < d
are true.
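The existence argument can be imitated step by step. In the Python sketch below, which is not part of the original text (the function name divide is our own), we begin with the arithmetically correct but possibly "wrong" division m = d·0 + m and then adjust the quotient one unit at a time, exactly as the proof replaces q and r by q + 1 and r - d, until the remainder is the least nonnegative one.

```python
def divide(m, d):
    """Return (q, r) with m == d*q + r and 0 <= r < d, for a natural number d."""
    if d <= 0:
        raise ValueError("d must be a natural number")
    q, r = 0, m              # m = d*0 + m always checks arithmetically
    while r < 0:             # pass to smaller quotients until r is nonnegative
        q, r = q - 1, r + d
    while r >= d:            # the step in the proof: (q, r) -> (q + 1, r - d)
        q, r = q + 1, r - d
    return q, r


print(divide(23, 5))    # prints (4, 3)
print(divide(-23, 5))   # prints (-5, 2), since -23 = 5*(-5) + 2
```

For a positive divisor, Python's built-in divmod(m, d) returns the same pair, so the two can be compared as a check.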
Technically, this completes the proof of Theorem 7.8 as stated, but we
want to show one more useful fact while all this machinery is set up: that the
quotient q and the remainder r chosen above are unique as well; that is, if
x and y are two numbers such that both
m = dx + y
and
0 ≤ y < d

are true, then x = q and y = r. Suppose that x and y are integers satisfying
the above two relations.
Now r was chosen to be the least nonnegative integer such that m =
dq + r for some integer q; since y is such an integer, it must be true that
r ≤ y. Suppose that y = r. Then

r = m - dq,
and
y = m - dx,
hence
m - dq = m - dx,
and thus
dq = dx,
so that x = q. In this case, y = r and x = q, as we wish to show. On the
other hand, suppose that r < y.
Then we have the inequality
0 ≤ r < y < d.
Now
m = dq + r,
and
m = dx + y,
hence
dq + r = dx + y.
Thus
dq - dx = y - r,
or
d(q - x) = y - r.
Therefore d is a divisor of y - r. But since
0 ≤ r < y < d,
it follows that
0 < y - r < d - r ≤ d,
so that
0 < y - r < d.
By Exercise 7.8, no natural number (such as d) can divide evenly into a
smaller natural number (such as y - r). So the fact that d | (y - r) leads
to a contradiction, and shows that the inequality r < y is impossible. Only
the previous case considered, in which y = r, can occur; and we have seen
that in this case also x = q. This establishes the uniqueness of q and r,
as desired.
For example, while there are many ways to "divide" 5 into 23 to obtain
a "quotient" q and a "remainder" r such that 23 = 5q + r, there is one and
only one choice of q and r (namely, q = 4 and r = 3) such that both
23 = 5q + r and 0 ≤ r < 5.
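This uniqueness is easy to see in a small experiment. The following Python lines, which are our own illustration and not part of the original text, list the arithmetically correct divisions of 5 into 23 for every quotient from -20 to 20 (an arbitrary window chosen for the example) and keep only those whose remainder satisfies 0 ≤ r < 5.

```python
m, d = 23, 5

# Every quotient q has an arithmetically correct remainder r = m - d*q.
candidates = [(q, m - d * q) for q in range(-20, 21)]

# Only one candidate also satisfies the restriction 0 <= r < d.
valid = [(q, r) for q, r in candidates if 0 <= r < d]

print(valid)  # prints [(4, 3)]
```

The "wrong" divisions of Fig. 7.3, with q = 2, r = 13 and with q = 6, r = -7, both appear among the candidates but are rejected by the restriction on the remainder.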

Theorem 7.9 (The Fundamental Theorem of Arithmetic) If m is a natural
number other than 1, then m can be factored into the product of primes, and
this factorization is unique (apart from the order of the prime factors).

Proof If m is already prime, then we understand the theorem to mean that
m is its own unique factorization; certainly m cannot be equal to two different
primes, nor can m be factored any further. So we suppose that m is com-
posite; moreover, we suppose by way of contradiction that the theorem is
false. Since Theorem 7.7 guarantees that each composite natural number has
at least one prime factorization, it must then be the case that there exists
a composite natural number with a nonunique prime factorization. By
the Well-Ordering Axiom, we select the least composite natural number with
a nonunique prime factorization, and denote this number by m.

Hence
m = p_1 p_2 p_3 ... p_i
and
m = q_1 q_2 q_3 ... q_j,
where all the p's and q's are primes, and these two factorizations are different
in that the p's and q's differ in number or in kind, or both.
If p_1 = q_1, then
n = p_2 p_3 ... p_i = q_2 q_3 ... q_j,
and since n < m, these two prime factorizations of n must be the same.
Hence by properly rearranging subscripts, we have p_2 = q_2, p_3 = q_3, ...,
p_i = q_j, and i = j. But then the two factorizations of m shown above are
the same, for also p_1 = q_1. This contradicts the fact that the two given
factorizations of m are different. And by repeating this argument for other
p's and q's, we see that none of the p's can equal any of the q's.
Since p_1 ≠ q_1, we can suppose that the p's and q's are so named that
p_1 < q_1. By Theorem 7.8, there exist integers w and r such that
q_1 = p_1 w + r,
and
0 ≤ r < p_1.
We substitute p_1 w + r for q_1 in the equation
m = q_1 q_2 q_3 ... q_j
and obtain
m = (p_1 w + r) q_2 q_3 ... q_j
= p_1 w q_2 q_3 ... q_j + r q_2 q_3 ... q_j.
But m is also equal to p_1 p_2 p_3 ... p_i, so we have the equation
p_1 p_2 p_3 ... p_i = p_1 w q_2 q_3 ... q_j + r q_2 q_3 ... q_j.
Hence
r q_2 q_3 ... q_j = p_1 p_2 p_3 ... p_i - p_1 w q_2 q_3 ... q_j
= p_1 (p_2 p_3 ... p_i - w q_2 q_3 ... q_j).
So p_1 is a divisor of r q_2 q_3 ... q_j.
Now r < p_1 and p_1 < q_1, so r < q_1. Hence the natural number
r q_2 q_3 ... q_j
is less than m, since m = q_1 q_2 q_3 ... q_j. Because m is the least natural
number with a nonunique prime factorization, r q_2 q_3 ... q_j has a unique prime
factorization. Moreover, since r q_2 q_3 ... q_j is divisible by p_1, p_1 appears in
some factorization of r q_2 q_3 ... q_j. Since p_1 cannot equal any of the q's,
p_1 must appear in the prime factorization of r, and hence p_1 | r. But 0 ≤
r < p_1, so the only way that p_1 | r can be true is if r = 0. But if so, then the
previous equation
q_1 = p_1 w + r
becomes
q_1 = p_1 w,
and hence p_1 | q_1. This, too, is a contradiction, as we have shown that none
of the p's can equal any of the q's; but since p_1 and q_1 are primes and p_1 | q_1,
the only way this can be true is if p_1 = q_1. Thus our supposition that there
is a composite number with a nonunique prime factorization leads to a
contradiction, and this establishes the theorem.
An alternate version of stating the Fundamental Theorem of Arithmetic
is this: If m is a natural number other than 1, then there is one and only one
way of expressing m as the product
m = p_1^{n_1} p_2^{n_2} ... p_k^{n_k},
where the exponents n_1, n_2, ..., n_k are natural numbers, and p_1, p_2, ..., p_k
are primes arranged so that p_1 < p_2 < ... < p_k.
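This alternate form lends itself to computation, because it attaches to each natural number greater than 1 a single list of primes and exponents. A minimal Python sketch follows; it is not from the original text, the function name canonical_factorization is hypothetical, and trial division is merely one convenient way of finding the primes.

```python
from collections import Counter


def canonical_factorization(m):
    """Return [(p_1, n_1), ..., (p_k, n_k)] with p_1 < ... < p_k for m >= 2."""
    if m < 2:
        raise ValueError("m must be a natural number other than 1")
    exponents = Counter()
    p = 2
    while p * p <= m:
        while m % p == 0:    # divide out p as many times as it occurs
            exponents[p] += 1
            m //= p
        p += 1
    if m > 1:                # whatever remains is itself a prime
        exponents[m] += 1
    return sorted(exponents.items())


print(canonical_factorization(144))  # prints [(2, 4), (3, 2)]
```

The output for 144 matches the factorization 2^4 · 3^2 found by hand at the beginning of this section.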

Exercises

7.36 In the proof of Theorem 7.7, only one of four cases was considered-
the one in which m = ab, where a was composite and b was prime. Show
how to handle the other three cases.
7.37 Let m = -23 and d = -5. For what values of q and r is it true that

m = dq + r
and
0 ≤ r < -d?
(One handles the statement of Theorem 7.8 for the case of a negative divisor
by changing the sign of d to obtain a positive number.)
7.38 Factor 6300 into the product of primes. In how many essentially
different ways can this be done?
7.39 If we interpret the product of no numbers to be the number 1, and the
product of one number to be itself, can we then state that "Every natural
number has a prime factorization?" Can we state that "Every natural
number has a unique prime factorization?"
7.40 Let a, b, and c be integers. Prove that if a | b and a | c, then a | (b + c)
and a | (b - c).
7.41 Suppose that a and b are natural numbers and p is a prime such that
p | ab. Prove that either p | a or p | b.
7.42 Is it true that if n is a natural number such that 12 | n^2 then 12 | n?
7.43 Prove that every divisor of both m and n is also a divisor of both
3m + n and m + 2n.
7.44 Suppose that p is a prime and m and n are natural numbers such that
p | (m + n). Does it follow that one of the relations p | m or p | n is true?
7.45 Wilson's Theorem states that the natural number n > 1 is prime if and only if

n | (n - 1)! + 1.

Use Wilson's Theorem to prove that 7 is prime.
7.46 See Exercise 7.45. Use Wilson's Theorem to prove that 20 is not prime.

7.47 See Exercise 7.45. Use Wilson's Theorem to prove that 100! + 1 is
not prime.
7.48 Suppose that p and q are two different primes and n is a natural number
such that both p | n and q | n are true. Prove that pq | n.
7.49 How could you use the prime factorization of 144 in order to write
down all the divisors of 144?
7.50 Suppose that n is a natural number with the "canonical" prime
factorization
n = p_1^(α_1) · p_2^(α_2) ··· p_k^(α_k).
By the "canonical" factorization we mean the one in which the exponents
α_1, α_2, ..., α_k are natural numbers and the numbers p_1, p_2, ..., p_k are
primes so arranged that
p_1 < p_2 < ... < p_k.
Use this factorization to find a formula giving the number of divisors of n.
Hint: See the previous exercise.

7.4 THE GREATEST COMMON DIVISOR


If m and n are two integers not both zero, then by their greatest common
divisor g we mean the largest integer g such that both g | m and g | n are
true. The greatest common divisor of m and n will be denoted by (m, n).
It should be clear that if m and n are not both zero, then (m, n) exists,
for m and n do have common divisors (such as 1), and only finitely many
integers can divide evenly into both m and n. So there must be a largest such;
in fact, since 1 is a common divisor of m and n, then 1 ≤ (m, n), and hence
(m, n) is a positive integer. For example, (24, 28) = 4 and (8, 9) = 1.
The Euclidean Algorithm provides a method for computing the greatest
common divisor of two integers. The method depends on the next theorem.
Theorem 7.10 Let m and n be natural numbers, and let q and r be integers
such that m = qn + r and 0 ≤ r < n. Then (n, m) = (r, n).

The proof of this theorem is outlined in the next set of exercises. We
illustrate with an example the method of using this theorem to find the
greatest common divisor of 21 and 78.
As indicated in Fig. 7.5, we first divide 21 into 78, obtaining a quotient
of 3 and a remainder of 15. We ignore the quotient; Theorem 7.10 tells us
that (21,78) = (15,21). We continue successive divisions, and obtain the
chain of equalities
(21, 78) = (15, 21) = (6, 15) = (3, 6) = (0, 3).

[Fig. 7.5 Successive divisions to find (21, 78): 78 ÷ 21 = 3 remainder 15; 21 ÷ 15 = 1 remainder 6; 15 ÷ 6 = 2 remainder 3; 6 ÷ 3 = 2 remainder 0.]

[Fig. 7.6 Finding (67, 193): 193 ÷ 67 = 2 remainder 59; 67 ÷ 59 = 1 remainder 8; 59 ÷ 8 = 7 remainder 3; 8 ÷ 3 = 2 remainder 2; 3 ÷ 2 = 1 remainder 1. Note that each remainder is next divided into the corresponding divisor.]

Since the remainders must decrease with each division, we must eventually
reach the remainder zero after a finite number of steps. So in any problem
of this sort, our last entry in a chain such as the one above will have the form
(0, r), where r is the next-to-last remainder. But r ≠ 0; in fact, r > 0, and
it is easy to see that (0, r) = r. So the greatest common divisor of m and n
will be the last nonzero remainder in the sequence of successive divisions of
previous remainder into previous divisor.
For another example, we show the calculations to find (67, 193) in Fig.
7.6. We obtain the chain of equalities
(67, 193) = (59, 67) = (8, 59) = (3, 8) = (2, 3) = (1, 2) = (0, 1) = 1.
Hence the greatest common divisor (67, 193) of 67 and 193 is 1.
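The successive divisions are easy to mechanize. Here is a minimal Python sketch that records the same chain of pairs while it computes the greatest common divisor. The name gcd_chain is our own, and the sketch assumes that m and n are natural numbers.

```python
def gcd_chain(m, n):
    """Return (chain, g): the chain of pairs from Theorem 7.10 and g = (m, n)."""
    a, b = min(m, n), max(m, n)   # keep the smaller number first, as in the text
    chain = [(a, b)]
    while a != 0:
        a, b = b % a, a           # Theorem 7.10: (a, b) = (b mod a, a)
        chain.append((a, b))
    return chain, b               # the chain ends with the pair (0, g)

chain, g = gcd_chain(21, 78)
print(" = ".join(f"({a}, {b})" for a, b in chain), "=", g)
# (21, 78) = (15, 21) = (6, 15) = (3, 6) = (0, 3) = 3
```

Only the remainders are kept; the quotients are ignored, just as in the hand computation.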

Moreover, a sort of reverse of this process can be carried out once the
above computations are made. We take the example (21, 78) = 3. The
divisions shown in Fig. 7.5 can be expressed in the form below:
78 = (3)·21 + 15,
21 = (1)·15 + 6,
15 = (2)·6 + 3,
6 = (2)·3 + 0.
Recall that (21, 78) is 3, the last nonzero remainder above. We ignore the
last equation, and solve the one in which the remainder 3 appears for 3 itself,
thus obtaining
3 = (1)·15 + (-2)·6.
We solve the second equation in the above list for its remainder, 6, and
substitute
6 = (1)·21 + (-1)·15
in the previous equation to obtain
3 = (1)·15 + (-2)·[(1)·21 + (-1)·15]
= (3)·15 + (-2)·21.
The remainder previous to 6 is 15; we solve the equation 78 = (3)·21 + 15
for 15, and substitute the result for the 15 in the above equation:
15 = (1)·78 + (-3)·21,
so that
3 = (3)·[(1)·78 + (-3)·21] + (-2)·21
= (3)·78 + (-11)·21.
What we have accomplished is the expression of 3, the greatest common
divisor of 78 and 21, in terms of 78 and 21 themselves. It should be clear that
this process of back-substitution of remainders can always be carried out so
as to express (m, n) in terms of m and n. We give one more example, using
67 and 193. From the data shown in Fig. 7.6, we obtain the following
equations:
193 = (2)·67 + 59,
67 = (1)·59 + 8,
59 = (7)·8 + 3,
8 = (2)·3 + 2,
3 = (1)·2 + 1,
2 = (2)·1 + 0.

The last equation is unnecessary; we discard it. Our sequence of solutions
and substitutions goes as follows:
1 = (1)·3 + (-1)·2
2 = (1)·8 + (-2)·3
1 = (1)·3 + (-1)·[(1)·8 + (-2)·3]
= (3)·3 + (-1)·8
3 = (1)·59 + (-7)·8
1 = (3)·[(1)·59 + (-7)·8] + (-1)·8
= (3)·59 + (-22)·8
8 = (1)·67 + (-1)·59
1 = (3)·59 + (-22)·[(1)·67 + (-1)·59]
= (25)·59 + (-22)·67
59 = (1)·193 + (-2)·67
1 = (25)·[(1)·193 + (-2)·67] + (-22)·67
= (25)·193 + (-72)·67

Hence we have expressed the greatest common divisor of 67 and 193 in
terms of 67 and 193, as follows:
(67, 193) = (-72)·67 + (25)·193.
If you see that this process will always work, you have in effect seen the
proof of our next theorem.

Theorem 7.11 Let g = (m, n). Then there exist integers x and y such that
xm + yn = g.
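The back-substitution can also be carried along during the divisions themselves, so that no second pass is needed. The Python sketch below is one common way to arrange this bookkeeping; it is not the hand procedure of the text, and the name extended_gcd is our own. We assume that m and n are natural numbers.

```python
def extended_gcd(m, n):
    """Return (g, x, y) with g = (m, n) and x*m + y*n == g."""
    # Invariant: old_r == old_x*m + old_y*n  and  r == x*m + y*n.
    old_r, r = m, n
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

print(extended_gcd(67, 193))  # (1, -72, 25)
print(extended_gcd(78, 21))   # (3, 3, -11)
```

Both results agree with the hand computations above: (-72)·67 + (25)·193 = 1 and (3)·78 + (-11)·21 = 3.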

Exercises

7.51 Evaluate (8334, 9612).


7.52 Find integers x and y such that
43x + 91y = 1.

7.53 One can use the Well-Ordering Axiom to prove Theorem 7.11; how-
ever, this proof will merely show the existence of integers x and y such that
xm + yn = (m, n).
One proceeds as follows: Let
S = {xm + yn | x and y are integers and xm + yn > 0}.

First show that S is nonempty; then let g be the least element of S. Prove
that g = (m, n). Please fill in the details of this proof.
7.54 Fill in the details of the following proof of Theorem 7.10: If m and n
are natural numbers and q and r are integers such that m = qn + r and
0 ≤ r < n, then (n, m) = (r, n).
Let d be any common divisor of n and m. Show that d must be a divisor
of r and n. Then show that every common divisor of r and n is also a common
divisor of n and m. Since the common divisors of n and m are thus the same
set of natural numbers as the common divisors of r and n, it follows
immediately that (n, m) = (r, n).
7.55 Show that (36, 64) = 4, and find integers x and y such that 36x +
64y = 4.
7.56 Show that (11, 27) = 1, and find integers x and y such that 11x +
27y = 1.
7.57 Using the Fundamental Theorem of Arithmetic, we can factor 4200
into the prime product 2^3 · 3 · 5^2 · 7 and 4500 into the prime product 2^2 · 3^2 · 5^3.
Show how these factorizations can be used to find the greatest common
divisor (4200, 4500) of 4200 and 4500.
7.58 Is it possible to find integers m and n such that 12m + 16n = 1?
Explain.
7.59 Is it possible to find integers m and n such that 12m + 13n = 6?
7.60 Find integers m and n such that

31m + 231n = 1.

7.61 Let n be a natural number. Prove that

(n, n + 3) ≠ 2.

7.62 Let n be an integer. Prove that

(n, n + 2) ≤ 2.

7.63 Prove that 8 | (n^2 - 1) if n is an odd integer.
7.64 Let m and n be natural numbers. Prove that (m, m + n) | n.
7.65 We have seen that the set W of all whole numbers, together with the
two operations of ordinary multiplication and addition, satisfies the
Fundamental Theorem of Arithmetic, and it seems that Well-Ordering is essential
for establishing this theorem. This is indeed the case, for consider the
mathematical system K defined as follows:

K = {a + bξ | a and b are integers, and ξ^2 = -5}.

We define addition and multiplication on the set K as follows:
(a + bξ) + (c + dξ) = (a + c) + (b + d)ξ,
and
(a + bξ)·(c + dξ) = ac + adξ + bcξ + bdξ^2
= (ac - 5bd) + (ad + bc)ξ.
With these two operations K becomes an algebraic system satisfying the
same axioms as the system of integers; specifically,
if α, β, and γ are elements of K, then
α + β and αβ belong to K;
α + β = β + α and αβ = βα;
(α + β) + γ = α + (β + γ) and (αβ)γ = α(βγ);
α(β + γ) = αβ + αγ;
θ = 0 + 0ξ has the property that θ + α = α;
z = 1 + 0ξ has the property that zα = α; and
if α = a + bξ, then -α = (-a) + (-b)ξ has the property
that -α belongs to K and (-α) + α = θ.
However, no ordering of K compatible with addition and multiplication can
be a well-ordering; thus it is not surprising that the Fundamental Theorem
of Arithmetic does not hold for K. To see this, one defines an element of K
to be "prime" if its only factorizations are the obvious ones; that is, the
element α ∈ K is prime if it can only be expressed as a product of two elements
of K in the two forms
α = zα = (-z)(-α).
It is possible to find a "composite" element of K with two different prime
factorizations. Please do so.

7.5 APPLICATIONS
The equation ax + by = c, where a, b, and c are whole numbers, has of
course infinitely many solutions in the real number system (unless, for example,
a = 0 = b and c ≠ 0). But in number theory we are interested in only
those whole numbers r and s which are solutions to the equation ax +
by = c. The ability to find such solutions opens the way to solve a wide
variety of fascinating problems.
Let us consider the equation ax + by = c, where a, b, and c are given
fixed whole numbers. The only case giving difficulty occurs when none of
a, b, and c is zero. We begin by letting g = (a, b).
If g is not a divisor of c, then the equation ax + by = c has no solution
in whole numbers, so we further suppose that g | c.

We use the methods of Theorem 7.11 to solve the equation ax + by = g.
Let m and n be a pair of whole numbers that solve the equation, so that
am + bn = g.
Since g | c, there is a whole number k such that c = gk. Let r = km and
s = kn. Then r and s are solutions of the original equation ax + by = c,
for
ar + bs = a(km) + b(kn)
= k(am + bn)
= kg = c.
So there is no difficulty in finding one whole number solution to the equation
ax + by = c. However, we seek all possible solutions. Using the particular
solution ar + bs = c that we have found, it turns out to be possible to
express all possible solutions in terms of the numbers r and s. For suppose
that the whole numbers p and q give another solution, so that
ap + bq = c.
Then
ap + bq = ar + bs,
so that
a(p - r) = b(s - q).
Now g = (a, b), so g | a and g | b. We divide our last equation by g, and
obtain
(a/g)(p - r) = (b/g)(s - q).
Note that all quantities in this last equation are whole numbers. Moreover,
the greatest common divisor of a/g and b/g is 1, so that
(a/g) | (s - q) and (b/g) | (p - r).
Let t be a whole number such that (a/g)·t = s - q. If we substitute
(a/g)·t for s - q in our previous equation, we obtain
(a/g)(p - r) = t·(a/g)(b/g).
We just cancel a/g from both sides, and thus
p - r = (b/g)·t.
From the relations (a/g)·t = s - q and (b/g)·t = p - r we have obtained,
we can by solving for p and q also derive that
p = r + (b/g)·t,
q = s - (a/g)·t.

Now r, s, a, b, and g are all known, so that we have expressed the other
solution pair p, q in terms of known quantities and the "variable" t, which is
an integer. Since it is not hard to verify also that each integer choice of t
does indeed produce a solution pair p, q to the equation ax + by = c, we
have in effect established our next theorem.
Theorem 7.12 Let a, b, and c be nonzero whole numbers and let g = (a, b).
Consider the equation ax + by = c. If g does not divide c, there is no solution
to the equation. If g | c, then all possible integral solutions are given by
x = p and y = q, where
p = r + (b/g)·t,
q = s - (a/g)·t,
where r, s is one solution pair for the original equation and t is allowed to range
through all whole number values.
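The recipe behind Theorem 7.12 can be sketched in a few lines of Python. The sketch is only illustrative: the names extended_gcd, general_solution, and solution are our own, and we assume for simplicity that a and b are natural numbers, so that g = (a, b) is produced directly by the divisions.

```python
def extended_gcd(a, b):
    """Return (g, m, n) with g = (a, b) and a*m + b*n == g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def general_solution(a, b, c):
    """Return (r, s, b/g, a/g) as in Theorem 7.12, or None if g does not divide c."""
    g, m, n = extended_gcd(a, b)
    if c % g != 0:
        return None                # no whole number solution exists
    k = c // g                     # c = gk
    return k * m, k * n, b // g, a // g

def solution(a, b, c, t):
    """Return the pair (p, q) that Theorem 7.12 assigns to the integer t."""
    r, s, step_x, step_y = general_solution(a, b, c)
    return r + step_x * t, s - step_y * t

print(general_solution(6, 15, 9))  # (-6, 3, 5, 2): x = -6 + 5t, y = 3 - 2t
print(solution(6, 15, 9, 2))       # (4, -1), and 6*4 + 15*(-1) = 9
```

The particular pair r, s found this way depends on how the divisions happen to run; any other particular solution leads to the same family of solutions with a shifted value of t.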
Example 7.1 Find all solutions in whole numbers of the equation 8x +
9y = 10.
Now (8, 9) = 1 and 1 | 10, so solutions do exist. We first solve 8x +
9y = 1. In this simple case one can see a solution immediately: x = -1
and y = 1. Hence we multiply this solution pair by 10 to obtain a solution
to the original equation:
8·(-10) + 9·(10) = 10.
So we have a = 8, b = 9, c = 10, r = -10, s = 10, and g = 1. By
Theorem 7.12, all solutions are given by the formulas
x = -10 + 9t,
y = 10 - 8t,
where t is a whole number.
It frequently happens in such problems that we are interested only in the
positive solutions. If so, we must have both
-10 + 9t > 0,
and
10 - 8t > 0,
so that
8t < 10 < 9t,
and thus, since t is an integer, that
t ≤ 1 and 2 ≤ t.
But there is no integer t satisfying both these inequalities, and so we can
conclude that the equation 8x + 9y = 10 has no positive whole number
solutions.

Example 7.2 Find all integral solutions of the equation 3x + 12y = 100.
There is no solution, since (3, 12) = 3 but 3 is not a divisor of 100.
Example 7.3 Find all positive integral solutions of the equation 5x +
15y = 100.
A simplifying procedure is to divide each term by (5, 15) = 5, and the
resulting equation will have the same solutions as the original equation.
Thus we consider instead the equation
x + 3y = 20.
Now (1, 3) = 1, so we first solve instead
x + 3y = 1.
The obvious solution x = 1, y = 0 will of course do. We multiply each of
x and y by 20 to obtain the solution r = 20, s = 0 to
x + 3y = 20.
By Theorem 7.12, we can obtain all possible solutions from the equations
x = 20 + 3t,
y = 0 - t,
where t is an integer. For positive solutions, it is further necessary that
20 + 3t > 0,
and
0 - t > 0.
These relations lead to the inequalities
-20 < 3t and t < 0,
and thus
-6 ≤ t ≤ -1.
So only the values t = -6, -5, -4, -3, -2, and -1 can produce positive
solutions. In this case, those solutions are as follows:
x = 2, y = 6,
x = 5, y = 5,
x = 8, y = 4,
x = 11, y = 3,
x = 14, y = 2,
x = 17, y = 1.
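The list just obtained can be checked by letting a short program run through values of t and keep the positive pairs. This Python sketch is only a check, not a replacement for the inequalities; the name positive_solutions and the range of t searched are our own choices.

```python
def positive_solutions(r, s, step_x, step_y, t_values):
    """List the pairs x = r + step_x*t, y = s - step_y*t with x > 0 and y > 0."""
    pairs = []
    for t in t_values:
        x = r + step_x * t
        y = s - step_y * t
        if x > 0 and y > 0:
            pairs.append((x, y))
    return pairs

# Example 7.3 after dividing by 5: x + 3y = 20, with r = 20, s = 0, b/g = 3, a/g = 1.
found = positive_solutions(20, 0, 3, 1, range(-100, 101))
print(found)  # [(2, 6), (5, 5), (8, 4), (11, 3), (14, 2), (17, 1)]
```

Each pair found also satisfies the original equation 5x + 15y = 100.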

Example 7.4 If a man has ninety-five cents in dimes and quarters, how many
of each type of coin might he have?
If we let d denote the number of dimes and q the number of quarters he
has, then we need to find the nonnegative solutions of the equation
10d + 25q = 95.
As in the previous example, we solve instead the simpler equation
2d + 5q = 19.
Now (2, 5) = 1 and 1 | 19, so solutions do exist. We first solve
2d + 5q = 1.
One solution, by inspection, is d = 3 and q = -1. We multiply by 19 to
obtain the particular solution d = 57, q = -19 of the original equation.
Then all solutions have the form
d = 57 + 5t,
q = -19 - 2t
for t an integer. For q and d both to be nonnegative, we must also have
-57 ≤ 5t and 2t ≤ -19,
which lead to only the two values -11 and -10 for t. These lead to the two
possibilities
d = 2 and q = 3,
or
d = 7 and q = 1.
So the solution to the original problem is this: The man has either two dimes
and three quarters, or else seven dimes and one quarter.
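For a problem this small, the answer can also be confirmed by simply trying every possible number of quarters. The following Python sketch does so; it is a brute-force check rather than the method of Theorem 7.12, and the name coin_counts is our own.

```python
def coin_counts(total_cents, dime=10, quarter=25):
    """Return every (dimes, quarters) pair of nonnegative integers worth total_cents."""
    pairs = []
    for q in range(total_cents // quarter + 1):
        remainder = total_cents - quarter * q
        if remainder % dime == 0:          # the rest must be payable in dimes
            pairs.append((remainder // dime, q))
    return sorted(pairs)

print(coin_counts(95))  # [(2, 3), (7, 1)]
```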

Exercises

7.66 Find all positive whole number solutions of


4x + 6y = 100.
7.67 Find three different solutions of
10x - 7y = 23.
7.68 A man cashed a check at a bank for two hundred forty-five dollars,
and asked the teller for some ones, ten times as many twos, and the balance
in fives. In how many different ways can the teller oblige him? What are
these ways?

7.69 How would you go about solving the equation


ax + by = 0,
where a and b are whole numbers? Are the formulas given in Theorem 7.12
also valid in this case?
7.70 What can be said about the whole-number solutions of the equation

ax + by = c,
where a, b, and c are integers, in case one of a and b is zero? What if both
a and b are zero?
7.71 Let a, b, and c be whole numbers with neither a nor b equal to zero,
and let g = (a, b). Prove that if the equation ax + by = c has a solution
in integers, then g | c.
7.72 To prove Theorem 7.12, it is necessary to show that if a and b are
nonzero whole numbers and g = (a, b), then the greatest common divisor
of a/g and b/g is 1. Please prove this.
7.73 Here is the method for solving the equation

ax + by + cz = d

in whole numbers, where a, b, c, and d are integers, none of a, b, and c is
zero, and all solutions are desired. The method is much more complicated
than the case of only two unknowns, so we give only the method for finding
the solutions and omit the proof.
First, there is no solution unless g, the greatest common divisor of a, b,
and c, also divides d, so we suppose that g | d.
Let
β = c/(b, c)
and
δ = -b/(b, c).
Then (β, δ) = 1, hence we can find integers α and γ such that
αδ - βγ = 1.
Let
ζ = (a, bα + cγ),
and find integers μ and ν such that
μa + ν(bα + cγ) = ζ.
Then all possible solutions of
ax + by + cz = d
are given by the three equations below, where t and u are allowed to assume
all possible whole number values (and every solution can be obtained from
the equations below):
x = dμ/ζ - (bα + cγ)t/ζ,
y = αdν/ζ + αat/ζ + βu,
z = γdν/ζ + γat/ζ + δu.
Use this result to find all positive solutions of the equation

x + 2y + 3z = 12.
7.74 If four marks are worth one dollar, five zlotys are worth one dollar,
and eight pesos are worth one dollar, how may a dollar be exchanged fairly
for marks, zlotys, and pesos so that at least one unit in each of the foreign
currencies is obtained?
7.75 A man paid two dollars for 100 eggs, including some new-laid eggs at
ten cents each, some fresh eggs at two cents each, and some old eggs at one
cent each. He found that he had the same number of two kinds of these eggs.
How many of each did he buy?
7.76 A man bought pomegranates at 16 cents each, oranges at two cents
each, and tangerines at one cent each. If he bought twenty pieces of fruit,
including at least one of each kind, and spent 80 cents in all, how many of
each did he buy?
7.77 A man cashed a check for less than one hundred dollars at a bank.
The teller confused the number of dollars on the check for the number of
cents, and paid the man forty-three dollars and fifty-six cents more than he
deserved. In how many different amounts could the check have been written?
7.78 Find three different solutions of

8x + 9y + 10z = 12.

7.79 Verify that the solutions x and y given in Example 7.1 actually work for
all integral values of t.
7.80 Verify that the solutions p and q given in Theorem 7.12 actually work
for all integral values of t.

7.81 If 11 brass balls (of equal weight) weigh exactly 15 pounds, 11 copper
balls weigh 16 pounds, and 11 silver balls weigh 17 pounds, how many of
each are required to weigh exactly 11 pounds?
7.82 Prove that
1/2 + 1/3 + 1/4 + ... + 1/n
is never a whole number.
7.83 Each odd prime has the form 4n + 1 or else the form 4n + 3, where
n is a nonnegative integer. Prove that no prime of the latter form is the sum
of two squares of whole numbers.
7.84 Prove that if n is a cube of a natural number, then the product of the
three consecutive integers n - 1, n, and n + 1 is divisible by 504.
7.85 Find a natural number half of which is a square (of a natural number),
one-third of which is a cube, and one-fifth of which is a fifth power.
7.86 Show that if n is a natural number, then
10 | (n^5 - n).
7.87 What is the last digit of 7^355?
7.88 We consider the equation x^2 + y^2 = z^2, and seek natural number
solutions. It turns out that any solution has the form
x = m^2 - n^2,
y = 2mn,
z = m^2 + n^2,
(provided that, since one of x and y must be even, we let that be y) where m
and n are natural numbers with m > n. Verify that the formulas given above
actually do provide a solution for the equation x^2 + y^2 = z^2.
7.89 Find three different solutions to the equation
x^2 + y^2 = z^2.

7.90 In connection with Exercise 7.88, we are actually finding Pythagorean
right triangles, those right triangles with whole number sides. Find two
Pythagorean right triangles with the same hypotenuse.
7.91 Find two different Pythagorean right triangles with the same perimeter.
7.92 How many Pythagorean right triangles can have a hypotenuse of
length 10?
7.93 How many Pythagorean right triangles can have one side of length 10?
7.94 Find three different Pythagorean right triangles whose legs differ by the
number 1.

7.95 Prove that one leg of a Pythagorean right triangle always has length
divisible by 3 and that one side always has length divisible by 5.
7.96 Find three different Pythagorean right triangles such that the length
of the hypotenuse and the length of one leg differ by the number 1.
7.97 Prove that every natural number n can be expressed in the form
n = a^2 + b^2 - c^2,
where a, b, and c are integers.
7.98 Is 2^100 - 1 prime? Why or why not?
7.99 Let n be a natural number and p a prime not a factor of n. Fermat's
Theorem states that p | (n^(p-1) - 1). Use Fermat's Theorem to prove that
7 | 999,999.
7.100 Show that every prime other than 2 and 5 divides evenly into some
number of the form 999 ... 99 (k digits, all nines).

NOTES AND REFERENCES


The following books on number theory may be of interest:
Beiler, A., Recreations in the Theory of Numbers (Dover, 1964).
Dudley, U., Elementary Number Theory (Freeman, 1969).
Gelfond, A., The Solution of Equations in Integers, translated by J. B. Roberts
(Freeman, 1961).
Griffin, H., Elementary Theory of Numbers (McGraw-Hill, 1954).
Mordell, L., Diophantine Equations (Academic Press, 1969).
Niven, I. and H. Zuckerman, An Introduction to the Theory of Numbers
(Wiley, 1960).
Rademacher, H., Lectures on Elementary Number Theory (Blaisdell, 1964).
Please do not consider this chapter as containing more than a minute
fraction of what is known about the theory of numbers. We have not touched
on Farey series, the distribution of primes, twin primes, Fermat's "Last
Theorem," perfect numbers, or even Gauss' Law of Quadratic Reciprocity.
Since the latter is considered by many to be one of the most beautiful results
in the field, we state it below, and invite you to verify it for yourself in a
number of special cases.

Suppose that p and q are two different odd primes. The problem is in
finding a natural number n such that
p | (n^2 - q).

The Law of Quadratic Reciprocity states that such a natural number n can be
found if and only if there exists a natural number m such that
q | (m^2 - p),
with an exception: If p and q are both of the form 4k + 3, then the first
relation above has a solution if and only if the second does not.
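Verifying the law in special cases is tedious by hand but quick by machine. The Python sketch below tests the statement for every pair of different odd primes below 60 by direct search. The function names and the bound 60 are our own choices, and a check of this kind is of course evidence only, not a proof.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n >= 2 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

def has_square_root(q, p):
    """True if some natural number n satisfies p | (n^2 - q)."""
    # It suffices to try n = 1, ..., p, since n and n + p give the same remainder.
    return any((n * n - q) % p == 0 for n in range(1, p + 1))

def reciprocity_holds(p, q):
    """Check the statement of the law for one pair of different odd primes."""
    first = has_square_root(q, p)    # is there an n with p | (n^2 - q)?
    second = has_square_root(p, q)   # is there an m with q | (m^2 - p)?
    if p % 4 == 3 and q % 4 == 3:
        return first != second       # the exceptional case
    return first == second

odd_primes = [p for p in range(3, 60) if is_prime(p)]
print(all(reciprocity_holds(p, q)
          for p in odd_primes for q in odd_primes if p != q))  # True
```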
The Euclidean Algorithm of Section 7.4 is, of course, named for the Greek
mathematician Euclid, about whom very little is known other than that he
lived in the third century B.C. His Elements formed such a complete
systemization of geometry that his work has been used as a geometry text even in this
century.
On the other hand, a great deal is known about the life of Carl Friedrich
Gauss. He was born in 1777 of very poor parents in Braunschweig, Germany,
and only a succession of fortunate coincidences made it possible for him to
become a mathematician. He lived seventy-seven years, and though not so
prolific a writer as Euler, unquestionably made far greater contributions to
mathematics: His work in number theory is particularly impressive, but he
also laid the foundations for differential geometry, complex analysis, and
modern topology. He desired perfection in his publications, and thus it is
that he anticipated results of many later mathematicians: his journal contains
a large number of valuable results which he never published, and for this
reason many of these were credited to others. In any case, the consensus
seems to be that the world has produced three truly outstanding men of
genius: Archimedes, Newton, and Gauss.
CHAPTER 8

ANIMAL
POPULATIONS

We shall consider an important branch of ecology, the branch which deals
with the growth of animal populations, but we feel very strongly that two
important cautions should be issued at once.
First, we shall deal with only the simplest possible cases; we shall be
assuming, for example, that no more than two or three species are involved,
that the reproduction rate of each species is constant, that there are no changes
in the populations in question caused by migration, and numerous other
simplifying conditions that will become apparent as we proceed.
Second, there is the philosophical consideration that mathematics really
cannot prove anything about the real world, but only about mathematics
itself. We shall be passing from a physical reality to a mathematical abstrac-
tion; such a procedure always merits cautious interpretation because of
simplifying assumptions such as those mentioned above. In addition, we shall
be making essentially unprovable (in the mathematical sense) assumptions
about the way in which animals behave; here the usefulness of the math-
ematical model lies in the fact that it can offer predictions about the way in
which animal populations ought to fluctuate; and if such fluctuations are
indeed observed by biologists and ecologists in the field and laboratory, this
can be considered evidence in favor of the validity of such assumptions.
To make the last idea clear, mathematicians have been accused in the
past of "proving" things which are patently absurd; for example, there is a
rumor that a mathematician once "proved" that a bumblebee cannot fly.
It was likely the case that the prover in question made the assumption that
insect flight muscle could not metabolize its fuel any faster than mammalian
muscle, as well as a host of other such assumptions. Of course, some of these
assumptions must then have been false, provided that the "proof" itself was
mathematically valid. In this case, rather than being a futile exercise, perhaps


such a proof could give to biologists clues as to which such assumptions
should be experimentally verified.
As another example, it is said that a mathematician once "proved"
that it was impossible for a drag racer to turn a quarter mile in less than 9.0
seconds. But dragsters commonly beat this time these days; in this case, the
erroneous assumption may well have been that the coefficient of friction
between rubber and strip did not increase as the rubber temperature increased,
an assumption now known to be false.
In summary, then, one cannot do ecology with paper and pencil. What
mathematics can contribute to ecology, as it has contributed to the other
sciences, is the prediction of the behavior of a system after certain
assumptions (usually simplifying ones) have been made. If the predictions correlate
with field and laboratory research, this is evidence in favor of the validity of
the assumptions. If not, the assumptions should be examined; again, such
examination can be done only experimentally, not mathematically. With
these cautions in mind, we proceed to the simplest case (with the simplest
assumptions).

8.1 UNRESTRICTED GROWTH OF A SINGLE SPECIES


One simple assumption about the growth of an animal population is that the
growth rate is proportional to the number of individuals present. This
assumption is contingent upon the further assumption that there is nothing
to impede growth of the population; for example, there must be effectively
unlimited living space and food supply. It would seem reasonable that such
should be the case if the population were small relative to the amount of
living space and food supply available. For example, with a reasonably small
population of bacteria in a large culture medium, one would expect that if
the bacteria population were 1000 individuals increasing at the rate of 3
individuals per second, then a population of 2000 individuals would increase
at the rate of 6 individuals per second and a population of 10,000 individuals
would increase at the rate of 30 individuals per second. Here is the reason
why this should be the case.
Suppose we have such a situation, say of a small number of bacteria
in a large culture, and we denote the number of bacteria at a given time t by
N(t), or more simply merely by N. It seems very reasonable that the rate of
increase of this population, which we will denote by N'(t) or simply N',
depends on the value of N itself at the time t in question. This is why we
have chosen a notation N' for the rate of increase, in order to suggest that
the value of N itself has something to do with the rate N'. After all, if all
extraneous factors could be removed, it should be the case that the value of
N' depends only on the value of N, whether or not they happen to be
proportional.
We can try to make plausible the assumption that N' is proportional to N
in this simple case. We can by virtue of the above discussion at least write
the equation
N' = f(N)

to indicate that there is some function f that gives the manner in which
N' behaves with respect to the value of N. For example, it might be that
N' = N^3,
or
N' = 2N + (N^2 + N)/(N + 1).
Suppose our bacteria culture were divided into two equal populations,
each thus containing N/2 individuals. This division could be done either
physically, by removing half the bacteria to another culture medium, or it
could be done by drawing an imaginary line down the middle of the culture.
Since the actual rate of increase of each half of the population should be the
same in either case, but in the former case it should be f(N/2) and in the latter
case just N'/2, we could therefore write the equation
N'/2 = f(N/2).
Similar considerations with respect to dividing the population into
thirds, or multiplying it by four, lead us to the equation
a · N' = f(aN)
for all meaningful values of a. But since
N' = f(N),
we can substitute this into the former equation, and obtain
a · f(N) = f(aN).
This means that the function f has the property that f(ax) = af(x) for
all values of a and x. It turns out that the only such function with a smooth
graph must be of the form f(x) = kx, where k is a constant, and hence
N' = f(N) = k · N.
In other words, it would seem reasonable that the form of the function f is
extremely simple, and that N' is indeed proportional to N. Note that the
constant k must be positive in the case of bacterial growth, since we are
assuming N to be positive, and N' is also positive because the population is
increasing rather than decreasing.

Let us rewrite our equation as an equation about functions, by reinserting
the time variable t. Then
N'(t) = k · N(t).
By the end of a course in differential calculus, most students will have
learned how to "solve" this equation in order to find explicitly the form of
the function N(t), the real unknown in the above equation. Calculus enters
the picture here because differential calculus is itself the study of rates of
change of functions; although we will next present the formal manipulations
used to solve the above so-called differential equation, it is not necessary that
you understand these procedures. The reasons are these: First, many of our
subsequent equations representing growth rates of animal populations are
too complicated to be solved with pencil and paper alone; they are usually
solved by approximation methods involving use of high-speed electronic
computers. Second, in our later studies we shall not be interested so much in
the actual form of the function N(t), but rather in the steady-state or limiting
behavior of the animal population, and we can discover this steady-state
behavior without the necessity of solving any differential equations. All
that will be needed is some elementary work with inequalities.
However, given the differential equation
N'(t) = k· N(t),

the calculus student first divides by N(t), since we may assume that N(t)
is never zero:
$$\frac{N'(t)}{N(t)} = k.$$
Then
$$\int \frac{N'(t)}{N(t)}\,dt = \int k\,dt,$$
and hence
$$\log N(t) = kt + C,$$
where C is a constant and the logarithm function is to the base e (the approximate
value of e is 2.71828) rather than to the base 10. From this follows
$$N(t) = e^{kt + C} = e^{C} \cdot e^{kt}.$$
Since C is constant, so is $e^{C}$, and we evaluate it by introducing the further
experimental assumption that the population N(t) is known when t = 0;
say, $N(0) = N_0$. Then
$$N_0 = N(0) = e^{C} \cdot e^{0} = e^{C},$$
and hence
$$N(t) = N_0 \cdot e^{kt}.$$

[Fig. 8.1: The graph of $N(t) = N_0 e^{kt}$ (k > 0), which is increasing at an increasing rate. Horizontal axis: t; vertical axis: N.]

The graph of the function N(t) is shown in Fig. 8.1. In this graph, we
have assumed that the constant of proportionality k is positive, as we have
already mentioned. The graph increases more and more rapidly with in-
creasing values of t, indicating that the rate of population increase is itself
increasing. Of course, the value of k itself as well as the value of $N_0$ must
be determined individually for each experimental case; k, for example, should
depend not only on the species of animal involved, but also on the concentra-
tion of the food supply, the effectiveness of the food in promoting reproduc-
tion, the temperature of the environment, and numerous other experimental
factors.
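The formula for N(t) can also be compared with a purely stepwise calculation of the kind a computer would carry out. The following is a minimal sketch, not a definitive implementation: the function names and the values chosen for N_0 and k are hypothetical and serve only as an illustration.

```python
import math


def exact_population(n0, k, t):
    """Closed-form solution N(t) = N0 * e^(k t) of the equation N' = k N."""
    return n0 * math.exp(k * t)


def euler_population(n0, k, t_end, steps):
    """Approximate N(t_end) by many small time steps of length dt.

    At each step the population grows by (rate of increase) * dt,
    where the rate of increase is k * N.
    """
    dt = t_end / steps
    n = n0
    for _ in range(steps):
        n += k * n * dt
    return n


# Illustrative values only; they are not taken from any experiment.
n0, k = 100.0, 0.5
for t in (0.0, 1.0, 2.0, 4.0):
    exact = exact_population(n0, k, t)
    stepwise = euler_population(n0, k, t, steps=10_000)
    print(f"t = {t:3.1f}   formula: {exact:10.3f}   stepwise: {stepwise:10.3f}")
```

The two columns should agree closely, and the agreement should improve as the number of steps is increased.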
The information that we want to derive from our initial population
equation
N' = k·N
is available to us without use of the manipulations of calculus shown earlier.


In this chapter we shall be mostly concerned with the eventual or long-term
behavior of the population in question. In this example, we can reason very
simply from the preceding equation in the following manner. At the beginning
of the experiment, we assume N to be positive, and N' positive as well since
the value of N is assumed to be increasing, at least initially. As we have seen,
k too must then be positive. But then, so long as N remains positive, N'
itself must be positive; that is, the population will continue to increase.
Thus N will indeed remain positive, and the population will in fact always be
increasing. In addition, as N increases, the value of N' must also increase, so
the population will be increasing at a faster rate as time goes on. Thus the
population will not tend toward any steady state, but behave as the graph
in Fig. 8.1 indicates.
This is no proof that E. coli, or any other living species, will eventually
take over the whole world, for we have made one very important simplifying
assumption: that the increase in the population in no way impedes future
growth of the population. In actual practice (say, in a culture of E. coli in
a test tube) available space and food supply are both strictly limited, and we
shall see in the next section how certain very simple additional assumptions
will lead us to a better model of growth of a single species.
On the other hand, it is important that we mention that the form of the
function N(t) that we derived using calculus,
$$N(t) = N_0 \cdot e^{kt},$$
has been experimentally verified in a large number of cases. That is, for a
small population with a large amount of available food and space, the graph
of the actual population compares well with the graph of N(t) in Fig. 8.1,
so long as the value of N itself is small. In other words, so long as there are
not too many individuals, the rate of growth of the population does behave
as if it were proportional to the population. This will be an important
assumption in much of our later work. Fig. 8.2 compares a typical
population curve, drawn as a dashed line, with the graph
of $N(t) = N_0 e^{kt}$. The two curves match quite well for small values of N.
This should be considered as experimental justification for our assumption
that the rate of growth of animal population is proportional to the population
itself, so long as other factors are kept equal and so long as there is no in-
hibition of the growth rate by a too-large population.

Exercises

[Fig. 8.2: The population curve is well approximated by $N(t) = N_0 e^{kt}$ for small populations. Horizontal axis: t; vertical axis: N.]

8.1 Suppose for some reason we had been led to the equation
$$N' = \frac{k}{N},$$
where k is a constant. Assuming that the population N and its rate of in-
crease N' are both positive at some initial time t = 0, sketch the graph
showing, very roughly, the behavior of the function N(t).
8.2 If a population N is initially positive, and its rate of increase N' is
constant, what will be the behavior of the function N(t)? Treat three cases:
when the constant is positive, when it is zero, and when it is negative.
8.3 For reasons given in Section 8.1, it would be plausible to suppose that
an animal population of N individuals would have a constant birth rate b,
resulting in a rate of increase in the population B = bN proportional to the
value of N ; it is equally reasonable that the population would have a constant
death rate d, resulting in a rate of decrease D = dN in the population also
proportional to the value of N. The net rate of increase in the population,
which we have denoted by N', should then have the form

N' = B - D.
Show how the equation
N' = kN
can be derived from the above assumptions. What is the value of k? What
interpretation can be given to the constant k?

[Fig. 8.3: For negative k, the graph of $N(t) = N_0 e^{kt}$ is decreasing. Horizontal axis: t; vertical axis: N.]

8.4 A quantity of a radioactive substance can be thought of as a population,
say in terms of the mass present or the number of atoms present. It follows
from the assumption that radioactive decay is equally likely for any two
atoms of one substance that the decrease of the amount of the radioactive
substance is proportional to the number of atoms present. We again obtain
the differential equation
N' = k· N,
where N is the amount present and N' is the rate of increase of the amount.
Here, though, k is a negative constant; the rate of increase must be negative
since the amount of radioactive substance is actually decreasing. Neverthe-
less, we obtain the same solution
$$N(t) = N_0 \cdot e^{kt},$$
and the graph of this function, for negative k, is shown in Fig. 8.3.
The half-life of a radioactive substance is the time it takes for one-half
of the substance to decay; for example, in the case of Iodine-131, half will be
gone after eight days. Show that the "half-life" is a meaningful concept;
that is, that for a given radioactive substance, the half-life is independent
of the initial amount present. Hint: Let T be the half-life. Find some way
of "solving" the equation $N(t) = N_0 e^{kt}$ for T, and show that the solution
is independent of the initial quantity No.
8.5 Here is one method by which the graph of a function N(t) satisfying a
differential equation such as
N'(t) = k· N(t)
can be found approximately, without the necessity of actually obtaining an
explicit formula for the function N(t). We shall illustrate how the procedure
works with the equation above, assuming k > 0, and leave it to you to fill
in the details.
Imagine a point in the first quadrant, such as the one emphasized in Fig. 8.4.
Here both N and t are nonnegative, and since N' = k· N, N' is also nonnegative.
Also, the larger the value of N, the larger the corresponding value of
N'. So the point indicated on the graph not only represents a certain possible
population at a certain time, but also can be thought of as lying on the graph
of a solution to the above equation. As t increases the value of N must also
increase, as indicated by the arrow. For larger values of N the rate of increase
will be proportionately greater, as indicated by the equation N' = k· N.
Thus arrows of steeper slope are indicated for the larger values of N. Note
that smaller values of N have associated with them arrows of smaller slope
as well, but the slope is always positive because N' is positive in the first
quadrant (except when N = 0). If you select an initial population value
$N_0$, and sketch in a smooth curve following the trend of the arrows, you will
obtain a rough idea of the shape of the graph of N(t).
You can repeat this process with the first two exercises, as well as with
other differential equations such as

$$N' = \frac{N}{N + 1},$$

and other such examples of your own invention.
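The arrow-following procedure of this exercise can be imitated numerically. The sketch below is only one possible rendering of the idea: the helper name `follow_arrows`, the step count, and the value of k are all assumptions made for illustration. At each step the point moves a short distance to the right and upward by the local slope times that distance.

```python
def follow_arrows(slope, n0, t_end, steps=1000):
    """Trace an approximate solution curve of N' = slope(N).

    Starting at the point (0, n0), repeatedly move a short distance dt
    to the right and slope(N) * dt upward, that is, in the direction of
    the arrow at the current point. Returns the list of (t, N) points.
    """
    dt = t_end / steps
    t, n = 0.0, n0
    points = [(t, n)]
    for _ in range(steps):
        n += slope(n) * dt
        t += dt
        points.append((t, n))
    return points


k = 0.5  # hypothetical rate constant

# Slopes of the arrows at a few heights N: steeper for larger N.
for n in (0.0, 1.0, 2.0, 4.0):
    print(f"N = {n:3.1f}   arrow slope = {k * n:4.2f}")

curve = follow_arrows(lambda n: k * n, n0=1.0, t_end=4.0)
print("N' = kN:        N(4) is roughly", round(curve[-1][1], 2))

curve = follow_arrows(lambda n: n / (n + 1), n0=1.0, t_end=4.0)
print("N' = N/(N + 1): N(4) is roughly", round(curve[-1][1], 2))
```

Passing a different `slope` function is all that is needed to try the equations of the first two exercises or examples of your own.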

8.2 GROWTH OF A SINGLE SPECIES UNDER LIMITING CONDITIONS
Since the graph of $N(t) = N_0 e^{kt}$ does not give an accurate picture of the
growth of an animal population (at least, for large values of N) we should
try to improve our assumptions which led to the original differential equation

N' = k·N.

One very simple way to do this is to introduce into the equation a term which
will, in effect, decrease the value of the rate constant k as the value of N
increases. A very natural way of doing this is to assume that the population
is living in an environment that will support a certain maximum population
M of individuals, and the closer the value of N gets to the number M, the
smaller the value of k (and thus the smaller the value of N'). However, the
solution of the previous section turns out to be quite accurate for small
values of N, and hence we do not wish to modify the value of k when N
is small. What we need is a term that has the value 1 for N very small, and
which decreases to 0 as N increases; in fact, it should become 0 when N = M
and become negative for N > M.
The reason for the latter consideration is that we can imagine an animal
population in an environment of insufficient resources to support that
population. In such a case, it seems reasonable that the number of deaths
would exceed the number of births, and thus that the value of N' would be
negative, indicating a net decrease in the population.
Hence we need to multiply the constant k by a term that is nearly 1 for
N close to zero, a term that decreases to zero as N gets closer and closer to
the value M, and that becomes negative if N exceeds M. One of the simplest
ways of inventing a formula for such a term is to use

$$\frac{M - N}{M}.$$
For N = 0, the value of the above term is 1; for N between 0 and M,
its value is between 0 and 1; it becomes 0 when N = M and negative for
N > M. Now the above term can be regarded as a sort of degree to which
the potential increase of population is realized; indeed, when these concepts
were first introduced into ecology, the word equation

$$\{\text{actual rate of increase}\} = \{\text{potential rate of increase}\} \cdot \{\text{degree of realization of potential}\}$$
was used. This translates very naturally into the differential equation
$$N' = k \cdot N \cdot \frac{M - N}{M},$$

and this is the equation that we shall examine. There is only moderate
difficulty in actually finding the form of the function N(t), but we shall spare
you the details, and use instead the sort of analysis that will be appropriate
in later sections. Remember that, as before, k is a positive constant, and M
too is a positive constant, indicating a maximum population that can be
supported by the environment.
We can use the same sort of analysis as was used in Exercise 8.5. We
sketch in Fig. 8.5 the arrows indicating the direction of movement of the
value of the population. For small values of N (small, that is, relative to M)
the value of (M - N)/M is very close to the number 1, so that its effect on the
equation can be neglected. Hence, for small N, N' is approximately pro-
portional to N, and so for larger values of N the arrows become steeper.
However, somewhere along the line the effect of the term (M - N)/M
begins to make itself felt, and the arrows become no steeper; indeed, as N
increases, the term (M - N)/M becomes quite small, effectively reducing the
value of k and thus the value of N'. Hence the arrows begin to flatten out.
For N = M, in fact, the arrows must be horizontal, indicating no change in
the population, for when N = M, the equation
$$N' = kN \cdot \frac{M - N}{M}$$
becomes
N' = O.
That is, there is no change in the population.
Finally, for N > M the value of (M - N)/M is negative, and the arrows
must slope downward; as N continues to increase, the downward slope of the
arrows must increase since (M - N)/M is taking on values such as -1,
-2, -3, ....
In Fig. 8.6 we show three different population curves, each depending
on the initial choice $N_0$ of the population at time t = 0. These curves are
obtained by choosing a value of $N_0$ and then sketching in the graph of N(t)
using the arrows as a guide.

[Fig. 8.5: Approximating the graph of a solution to N' = kN·(M - N)/M. Arrows in the t-N plane, with the horizontal line N = M marked.]

[Fig. 8.6: The graphs of three typical solutions to the equation N' = kN·(M - N)/M.]

The hardest experimental test of this solution is in the case of the lowest
curve, which exhibits the most complicated behavior. But ultimately this
so-called sigmoid curve (because it is shaped like the letter "S") gives a
surprisingly good fit to actual experimental evidence. Thus the hypothesis
that the potential rate of increase of a population is appropriately modified
by multiplication by the degree of realization of that potential is justified,
as well as the form
$$\frac{M - N}{M}$$
for the degree of realization.
Actually, some researchers in the field believe that various modifications
of the equation

$$N' = kN \cdot \frac{M - N}{M}$$

would give a solution N(t) that would more accurately fit the experimental
data in more cases; for example, one might wish to consider instead the
equation

or even ask, in general, for the best possible exponent α in the equation

However, for reasonable choices of α, the steady-state behavior of the system
will generally be unchanged. We shall consider at present only the case of our
original equation, where α = 1. Suppose, then, we inquire into the behavior
of the function N(t) that solves

$$N' = kN \cdot \frac{M - N}{M},$$

as t increases without bound. The answer is already before us. The fact that
all the arrows in Fig. 8.6 are directed toward the horizontal line where N = M
indicates that, regardless of the initial population, its size will tend toward the
value M as t increases; unless, of course, $N_0 = 0$. This simple analysis will
not be much complicated in the more complex systems we take up in the
next sections.
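The conclusion that every positive initial population tends toward M can be checked by a stepwise calculation. What follows is a rough sketch under stated assumptions: the function names, the step size, and the values of k and M are hypothetical, chosen only to illustrate the behavior described above.

```python
def logistic_rate(n, k, m):
    """Right-hand side of the equation N' = k N (M - N) / M."""
    return k * n * (m - n) / m


def simulate_logistic(n0, k, m, t_end, steps=10_000):
    """Step the population forward in small time increments dt."""
    dt = t_end / steps
    n = n0
    for _ in range(steps):
        n += logistic_rate(n, k, m) * dt
    return n


# Hypothetical rate constant and maximum supportable population.
k, m = 0.5, 1000.0

# One start well below M, one nearer M, one above M, and the case N0 = 0.
for n0 in (10.0, 600.0, 1500.0, 0.0):
    final = simulate_logistic(n0, k, m, t_end=40.0)
    print(f"N0 = {n0:7.1f}   N(40) is about {final:8.2f}")
```

The first three starting values should all end near M, while a population that starts at zero stays at zero.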
Exercises
8.6 Repeat the analysis of this section for the differential equation

$$N' = kN \cdot \left(\frac{M - N}{M}\right)^{1/2}$$

and show that the population tends toward one of the two values 0 or M.
For what initial value of the population will it tend toward the value O?
8.7 Examine the behavior of the population equation

$$N' = k \cdot N^{1/2} \cdot \frac{M - N}{M}.$$
This equation has been observed to give accurate fits to experimental data
under certain conditions.
8.8 Construct, as an alternative to (M - N)/M, a formula for the so-called
degree of realization of potential population increase. Examine the corres-
ponding differential equation and see if it has the right sort of general
behavior as t increases without bound.
8.9 Suppose that a cylindrical tank with vertical axis has a small hole in its
bottom, is filled with water, and the water leaves the tank at a rate proportional
to the water pressure. Make the necessary assumptions about constants
(height of tank, density of water, and so on) and write down the differential
equation whose solution V(t) is the volume of water in the tank at time t.
8.10 Suppose that the increase in a certain animal population due to births
is proportional to the number of individuals present, but that the decrease in
the population due to deaths is constant. Write down the differential equation
describing the behavior of the population N as a function of time t, and
include the term (M - N)/M for degree of realization of potential. What is
the behavior of the solution of this differential equation as t increases with-
out bound?
8.3 THE CASE OF TWO COMPETING SPECIES
We now turn our attention to the case of two different species of animals with
a coexistence problem. We assume that they live in the same space and are
competing for much the same food supply. However, please remember that
we are still making a large number of simplifying assumptions; for example,
we assume that there is under such circumstances a maximum population
which the environment will support with respect to each species, and that
this maximum is constant; in practice, of course, this maximum is likely
to undergo variations because of the available food supply suffering seasonal
variations, and the like.
In order to make the ideas of this section more concrete, we shall consider
the case of two reasonably similar species of fish, bluegill and redear,
living in a pond free from predators and with a constant but limited food
supply. Since adults of these species are not predatory upon one another,
we shall ignore the effects of predation on immature fish, so that the most
important aspect of the interspecies competition is the sharing of food and
living space. The important thing here is that bluegill and redear have similar,
though not identical, food preferences. To make the significance of the
notation easy to remember, we shall use the following symbols:
B(t) or B will denote the number of bluegill present at time t.
R(t) or R will denote the number of redear present at time t.
B'(t) or B' will denote the rate of increase of the bluegill population;
as usual, if B' is negative this means that the population is actually decreasing.
R'(t) or R' will denote the rate of increase of the redear population.
With redear absent, we suppose that the pond will support a certain
maximum population of bluegill, and we denote this maximum by C (since
the letters B and C are alphabetically adjacent). Similarly, we let S denote
the maximum population of redear the pond would support in the absence
of bluegill. With redear absent, we have seen in Section 8.2 that a reasonable
differential equation describing the population B(t) of bluegill would be

$$B' = kB \cdot \frac{C - B}{C},$$
where k is a positive constant having to do with the birth and death rate of
bluegill. Similarly, with bluegill absent, we can describe the behavior of the
redear population by
$$R' = vR \cdot \frac{S - R}{S}.$$
Again, v is a positive constant.
With both species present, neither of the above equations will still be
appropriate, for the very existence of redear in the pond impinges upon
the available food and space for bluegill, and in effect decreases the value of C,
the maximum possible bluegill population. In fact, it seems reasonable to
suppose that the value of C would be decreased in direct proportion to the
number of redear present, and hence our degree of realization of potential
term would become
$$\frac{C - B - \alpha R}{C},$$
where α is a constant that can be thought of as representing the degree to
which redear interfere with the bluegills in the latter's quest for food and
space. If a wide variety of foods were available to the two species and their
food preferences overlapped only partially, one would expect the constant
α to be somewhere between 0 and 1; however, if there were only one type of
food available and the redear were more efficient in obtaining it than bluegill,
then it would be reasonable to suppose that α > 1. We shall treat all
possibilities, but remember that there are also numerous other such factors
of comparison between the two species that are concealed in the little
constant α.
To continue our analysis, we can write a reasonable (though oversimplified)
equation for the population of bluegill as
$$B' = kB \cdot \frac{C - B - \alpha R}{C},$$
and, similarly, the redear equation becomes
$$R' = vR \cdot \frac{S - R - \beta B}{S}.$$
Of course, β plays a role for redear analogous to the role of α with respect to
bluegill.
What we now ask is this: Given initial populations of the two species
of fish, and values of the various constants, what will be the eventual popula-
tion of the pond? Will one species inevitably dominate the other, so that the
latter becomes extinct and the former reaches its maximum population?
Or can the two species coexist, each at a certain percentage of its maximum
population?
In Fig. 8.7, rather than plotting either of the functions B(t) and R(t)
against the time variable t, we plot instead values of B(t) on the x-axis and
values of R(t) on the y-axis. The reason for this will shortly become clear.
We now ask for what values of B and R will B be increasing; that is, we
solve the inequality B' > 0. That is done by recalling that
$$B' = kB \cdot \frac{C - B - \alpha R}{C}.$$
Now if B' > 0, we have
$$kB \cdot \frac{C - B - \alpha R}{C} > 0.$$
Since we may assume each of k, B, and C positive (they are certainly not
negative, and if B = 0 there is no problem) each may be canceled from the
above inequality, and we see that
$$C - B - \alpha R > 0,$$
or
$$\alpha R < C - B,$$
or
$$R < \frac{C - B}{\alpha}.$$

[Fig. 8.7: B' = 0 on the straight line R = (C - B)/α. The line meets the R-axis at R = C/α and the B-axis at B = C, with B' > 0 below the line and B' < 0 above it.]


Thus the population of bluegill will be increasing when R < (C - B)/α;
this is the condition that will make B' > 0. If we plot the graph of R =
(C - B)/α, as we have done in Fig. 8.7, this will be a straight line; it must
cross the vertical axis where B = 0 (that is, where R = C/α) and it must
cross the horizontal axis where R = 0; that is, when
$$\frac{C - B}{\alpha} = 0,$$
which occurs exactly when B = C.
Below this line, we have
$$R < \frac{C - B}{\alpha},$$
and hence in the triangular region below this line, where each point represents
a possible population of bluegill and a possible population of redear, the
bluegill population will be increasing. We have indicated this in Fig. 8.7
by an arrow directed to the right, which indicates an increase in the value
of B; we are temporarily silent as to the behavior of R. Similarly, in the
region above the line R = (C - B)/α, we must have B' < 0, so that the
bluegill population will be decreasing in that region. This is indicated by an
arrow pointing to the left.

[Fig. 8.8: The four important cases (Case 1 through Case 4) for the positions of the lines on which B' = 0 and R' = 0.]

If we perform a similar analysis on the equation
$$R' = vR \cdot \frac{S - R - \beta B}{S},$$
we can expect to obtain another straight line, with redear population
increasing on one side and decreasing on the other; moreover, this line must
cross the vertical axis at S and the horizontal axis at S/β. There are, however,
four possibilities, as shown in Fig. 8.8, where the "redear line" is shown as a
dashed line. The redear line can lie entirely over the bluegill line, entirely
beneath it, or cross it in either of two ways. (We neglect the unlikely pos-
sibilities that the two lines coincide or cross at a point exactly on one of the
two axes, for reasons to be discussed in the exercises to come.)
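Which of the four arrangements occurs can be decided by comparing the points where the two lines meet the axes: C/α against S on the vertical axis, and C against S/β on the horizontal axis. The sketch below is an illustration only; the function name is hypothetical, the case numbering follows the inequalities stated in the text and exercises of this section, and α and β are assumed positive.

```python
def classify_case(c, s, alpha, beta):
    """Return which of the four line arrangements applies (1, 2, 3, or 4).

    The bluegill line meets the R-axis at C/alpha and the B-axis at C.
    The redear line meets the R-axis at S and the B-axis at S/beta.
    Assumes alpha > 0 and beta > 0.
    """
    if c / alpha == s or c == s / beta:
        # The neglected borderline cases: the lines meet on an axis or coincide.
        raise ValueError("lines coincide or cross on a coordinate axis")

    redear_higher_on_r_axis = s > c / alpha
    redear_farther_on_b_axis = s / beta > c

    if redear_higher_on_r_axis and redear_farther_on_b_axis:
        return 1  # redear line entirely above the bluegill line
    if not redear_higher_on_r_axis and not redear_farther_on_b_axis:
        return 2  # redear line entirely below the bluegill line
    if redear_higher_on_r_axis:
        return 3  # lines cross; C/alpha < S and S/beta < C
    return 4      # lines cross; S < C/alpha and C < S/beta


# Hypothetical constants with C = S, varying only alpha and beta.
for alpha, beta in [(2.0, 0.5), (0.5, 2.0), (2.0, 2.0), (0.5, 0.5)]:
    case = classify_case(100.0, 100.0, alpha, beta)
    print(f"alpha = {alpha}, beta = {beta}: case {case}")
```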

[Fig. 8.9: The case in which the redear line lies entirely above the bluegill line. On the B-axis the bluegill line ends at C and the redear line at S/β; above both lines, B' < 0 and R' < 0.]


Which of the cases shown in Fig. 8.8 actually occurs depends on
the experimentally obtained values of four of the constants we have
introduced; the first case shown in Fig. 8.8 corresponds to the case when
$$\frac{C}{\alpha} < S \quad\text{and}\quad C < \frac{S}{\beta}.$$
We begin our analysis with this case. Under the solid line (or bluegill
line) we have indicated with arrows pointing to the right the fact that B is
increasing, and above that line the arrows pointing to the left mean that in
that region B is decreasing. At the beginning of each of these arrows we have
attached another arrow, vertically upward if R is increasing and vertically
downward if R is decreasing. All this is shown in Fig. 8.9, and is just a
convenient way of indicating that in the small triangle, both populations are
increasing; in the middle region, the redear population is increasing while the
bluegill population is decreasing, and above both lines both populations are
decreasing. Moreover, note that on the bluegill line the values of B and R
are such that B' = 0; that is, the bluegill population is steady. Similarly,
the redear population is steady on the redear line.

[Fig. 8.10: Three typical curves showing population trends.]

Each point on the graph where B and R are nonnegative represents a
possible initial population of bluegill and redear. We have selected three
such points in Fig. 8.10, one in each of the three essentially different regions
determined by the two lines, and then imagined the value of t increasing from
its initial value of 0. The curves drawn represent, as one moves along each
curve from its initial point, the manner in which each of the two populations
must change, as indicated by the arrows in Fig. 8.9. It is easy to see what
must happen in each of the three cases; for any initially positive populations
of bluegill and redear, the curves lead inexorably with increasing values of t
to the point (0, S), where the redear line crosses the vertical axis. When the
curves reach that point, there they stay, for both B' and R' are zero at that
point. At this point, the population of bluegill is zero and the population of
redear is the maximum the pond can support, the number S. Hence, in this
case, the redear will eventually take over the pond.
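The same conclusion can be explored with a stepwise calculation on the pair of equations. This is a minimal sketch and not a definitive implementation: the function names, step size, and all numerical constants are hypothetical, chosen so that C/α < S and C < S/β as in the case just discussed.

```python
def competition_rates(b, r, k, v, c, s, alpha, beta):
    """Rates of increase (B', R') for the two-species competition model."""
    b_rate = k * b * (c - b - alpha * r) / c
    r_rate = v * r * (s - r - beta * b) / s
    return b_rate, r_rate


def simulate(b0, r0, params, t_end=100.0, steps=20_000):
    """Advance both populations together in small time steps dt."""
    dt = t_end / steps
    b, r = b0, r0
    for _ in range(steps):
        b_rate, r_rate = competition_rates(b, r, **params)
        b += b_rate * dt
        r += r_rate * dt
    return b, r


# Hypothetical constants: C/alpha = 50 < S = 100 and C = 100 < S/beta = 200.
params = dict(k=0.5, v=0.5, c=100.0, s=100.0, alpha=2.0, beta=0.5)

# One starting point in each of the three regions cut out by the two lines.
for b0, r0 in [(10.0, 10.0), (60.0, 40.0), (150.0, 150.0)]:
    b, r = simulate(b0, r0, params)
    print(f"start ({b0:5.1f}, {r0:5.1f})   end: B = {b:8.4f}, R = {r:8.4f}")
```

With these constants each run should finish close to B = 0 and R = S.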
One concept we have just encountered will be subsequently quite useful;
that is the concept of a critical point. The point (0, S) is a critical point for the
system of differential equations
$$B' = kB \cdot \frac{C - B - \alpha R}{C},$$
$$R' = vR \cdot \frac{S - R - \beta B}{S},$$
because when B = 0 and R = S, both B' and R' are zero, and the populations
should not be expected to change without outside influence. In general, we
refer to any such situation in which all rates of change involved are zero
as a critical point, and the location and nature of these critical points will
usually give us some idea as to the eventual or limiting value toward which
the populations are tending.
In the case above, the point (0, S) will be called a stable critical point
because, if the values of B and R are changed slightly from the values B = 0
and R = S, the tendency will be for the values of B and R to return to the
values 0 and S, respectively. However, the above system has two other
critical points as well, which can be found by inspection of the diagram shown
in Fig. 8.9, or simply by solving the differential equations so as to find when
both B' and R' are zero. In the latter choice of procedures, we have
$$kB \cdot \frac{C - B - \alpha R}{C} = 0$$
and
$$vR \cdot \frac{S - R - \beta B}{S} = 0.$$
The former equation holds when k = 0 (a solution which we ignore) or
when B = 0, or when
$$C - B - \alpha R = 0.$$
That is, when
$$R = \frac{C - B}{\alpha}.$$
Similarly, R' = 0 when (ignoring v = 0) R = 0 or when
$$R = S - \beta B.$$
To find when both B' and R' are zero, we have the four following cases:
1) B = 0 and R = 0. In this case the pond holds no fish. This critical point
is not stable because a slight change in the "population" will result in the
movement of the population away from this critical point, rather than
back to it.
2) B = 0 and R = S - βB. Since B = 0, the latter simplifies to R = S.
We have already seen that this is a stable critical point; the pond contains
only redear.
3) R = (C - B)/α and R = 0. Since R = 0, the former equation simplifies
to just B = C. The pond contains only bluegill; again, this critical point
is unstable, since introduction of a small number of redear will upset the
balance.
4) R = (C - B)/α and R = S - βB. The solution is the point where the
"bluegill line" and the "redear line" cross; in the present case they do not
cross in the first quadrant, and thus this case gives us no critical point.
To summarize, the population of the pond will be constant if it contains
no fish, only bluegill, or only redear, and only the last case is stable.
Recall our earlier statement that some physical interpretation could be
attached to the numbers α and β. Let us examine that interpretation in the
situation just considered. We can see from Fig. 8.9 that since the redear
line lies entirely above the bluegill line, we must have the inequalities
$$\frac{C}{\alpha} < S \quad\text{and}\quad C < \frac{S}{\beta}.$$
For simplicity, and because the two species are fairly similar in this example,
we suppose that C = S. Then
$$\frac{S}{\alpha} < S \quad\text{and}\quad S < \frac{S}{\beta},$$
or
$$1 < \alpha \quad\text{and}\quad \beta < 1.$$
Remember that α can be thought of, very roughly, as measuring the degree to
which the redears interfere with the success of bluegills, and β the degree to
which bluegills interfere with redears. That α > 1 and β < 1 thus could be
interpreted, very simply, as meaning that in this situation, redear is a "more
successful" species.
If we pass to the second case shown in Fig. 8.8, then the situation would
be exactly reversed. If it turned out, by experimental measurement, that C
and S were approximately equal and that α < 1 and β > 1, then the stable
critical point would be located at (C, 0), indicating that one would expect
that the population of the pond would tend toward the maximum of bluegill
and no redear.
The third case shown in Fig. 8.8 is the most complicated. If you fill in the
arrows indicating the direction of population trend as in Fig. 8.9, you will
obtain a diagram much like that shown in Fig. 8.11. Here, for C and S
approximately equal, it turns out that both α and β can be expected to exceed
the number 1, which should indicate that each species competes more
successfully with the other than with itself. This sounds like a peculiar situation,
and some typical lines of population flow are shown in Fig. 8.12.

[Fig. 8.11: Directions of population trends in the case of an unstable critical point in the first quadrant. The bluegill line meets the R-axis at C/α; on the B-axis, S/β lies to the left of C.]

We have here two stable critical points-one where B = 0 and R = S
(all redear) and one where B = C and R = 0 (all bluegill). The expected
critical point at (0, 0) is unstable and also rather uninteresting. But the
bluegill line and redear line cross at a point in the first quadrant, specifically at
$$B = \frac{\alpha S - C}{\alpha\beta - 1}$$
and
$$R = \frac{\beta C - S}{\alpha\beta - 1}.$$

At this point, both B' and R' are zero, indicating a steady-state popula-
tion, but this state is certainly unstable since a small change will generally
force a population movement toward one of the two stable critical points.
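The coordinates of this crossing point can be checked by solving the equations of the two lines simultaneously. The elimination below is a sketch added for illustration; it assumes αβ ≠ 1, so that the final division is legitimate.

```latex
\begin{align*}
C - B - \alpha R &= 0, \qquad S - R - \beta B = 0 \\
R &= S - \beta B \\
C - B - \alpha (S - \beta B) &= 0 \\
(\alpha\beta - 1)\,B &= \alpha S - C \\
B &= \frac{\alpha S - C}{\alpha\beta - 1} \\
R &= S - \beta B = \frac{\beta C - S}{\alpha\beta - 1}
\end{align*}
```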

[Fig. 8.12: Typical curves of population trend in the case of an unstable critical point in the first quadrant.]


The curves in Fig. 8.12 also tell us how to predict the eventual fish population.
If the pond starts with relatively few redear, the curves flow toward the point
of eventual extinction of redear; with relatively few bluegill, the redear will
eventually dominate the pond.
The most interesting case of all, shown as the fourth case in Fig. 8.8,
has been reserved for your enjoyment as the very next exercise.

Exercises

8.11 Consider the bluegill-redear competition in the fourth case shown in
Fig. 8.8; that is, when
$$S < \frac{C}{\alpha} \quad\text{and}\quad C < \frac{S}{\beta}.$$
Draw figures analogous to Figs. 8.11 and 8.12. Find the stable and
unstable critical points and discuss the physical interpretation of your
solution.
8.12 Give a reason why we can ignore such cases as those in which the
redear and bluegill lines coincide, or intersect on one of the coordinate axes.
8.13 What sort of critical points are obtained in case the redear and bluegill
lines do coincide?
8.14 See Exercise 8.11. Does it make sense to draw the arrows on the co-
ordinate axes? Does it make sense to draw curves of population movement
that intersect the axes? What interpretation would you give to such a curve?
8.15 Let us consider the case examined in Exercise 8.11, in which a critical
point is found where both B and R are positive. Assume it is plausible that a
fisherman fishing for bluegill and only bluegill has the effect of decreasing the
value of C, the maximum population of bluegill supportable by the pond.
What effect will his fishing have on the relative populations of bluegill and
redear in case the populations have previously stabilized at the critical point
mentioned above? What if the fisherman is so expert that he reduces the value
of C so much that C/α < S?
8.16 In analogy to the case of two competing species considered in this
section, write down reasonable differential equations describing the case of
three competing species.
8.17 Why must the "bluegill line" and the "redear line" each cross the
coordinate axes at a positive value?
8.18 In Section 8.2, the differential equation

$$N' = kN \cdot \frac{M - N}{M}$$

was discussed. Suppose N represents a population of one species of fish in a
pond, and M the maximum population that the pond will support. Find the
stable and unstable critical points for this system.
8.19 Continuing the previous exercise, what will be the effect on the stable
critical point if a fisherman catches fish from the pond in proportion to their
population?
8.20 Continuing the previous two exercises, what will be the effect on the
stable critical point if a fisherman catches fish from the pond at a constant
rate? In particular, what is a reasonable differential equation that describes
this system?

8.4 THE PREDATOR-PREY CASE


Let us now examine the interesting case in which there are again two species
involved, one of which preys on the other. Again, we shall suppose that this
situation exists in a pond containing two species of fish, bass and redear,
the former the predator and the latter the prey; for while redear account for a
very substantial portion of the food supply for bass in such a situation,
redear will not prey upon bass above a certain minimum size. We will make a
large number of simplifying assumptions here, including the following:
• We assume that neither population is so great as to necessitate the in-
troduction of the degree of realization factor. Consequently, the rate of
increase of redear population will be the increase due to births, minus those
redear consumed by bass; here we further assume that such consumption
completely accounts for attrition of redear.
• Moreover, we assume that the increase in the bass population due to
births is wholly dependent on the available food supply: redear and nothing
else. Thus we must introduce a term to account for the decrease in bass
population due to deaths.
• Finally, we assume that redear are consumed at a rate proportional to the
number of encounters between bass and redear. Since the number of such
encounters will thus be proportional to the number of bass as well as to the
number of redear, it is not hard to see that the number of encounters will be
proportional to the product of the number of bass and the number of redear.
For if the number of bass were doubled, the number of encounters would be
doubled. If then, in addition, the number of redear were doubled, the number
of encounters would again be doubled, resulting in four times as many
encounters in all.
Hence we assume that the following very simple differential equations
describe the predator-prey situation:
B' = kBR - dB,
R' = vR - wBR.
Here, B is the number of bass present, B' is as usual the rate of increase of the
bass population, R is the number of redear, and R' their rate of increase.
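
To make the two equations concrete, here is a minimal sketch that evaluates both rates at a single population. The function names and the numerical constants are invented for illustration; only the formulas B' = kBR - dB and R' = vR - wBR come from the text.

```python
def encounters(B, R):
    """Number of bass-redear encounters, up to a constant factor."""
    return B * R


def rates(B, R, k, d, v, w):
    """Return (B', R') for B' = kBR - dB and R' = vR - wBR."""
    bass_rate = k * encounters(B, R) - d * B
    redear_rate = v * R - w * encounters(B, R)
    return bass_rate, redear_rate


# Invented constants, chosen only so the arithmetic is easy to follow.
k, d, v, w = 0.01, 0.5, 1.0, 0.02
print(rates(40.0, 60.0, k, d, v, w))   # both rates positive at this point

# Doubling both populations quadruples the number of encounters.
print(encounters(80.0, 120.0) / encounters(40.0, 60.0))
```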
As we have said, we assume that the rate of increase of the bass population
is proportional to the number of encounters between bass and redear, so that
this rate of increase is kBR, where k is a positive constant. The positive
constant d represents the death rate of bass, presumably from old age; since
the number of deaths of bass is proportional to the bass population, the
attrition of the bass population is thus dB. Since this term represents a
decrease in the bass population, we subtract the number of deaths from the
number of births to obtain the net population increase of bass, and thus we
obtain the first equation
B' = kBR - dB.
We have also assumed that the increase of the redear population due to
births is proportional to the population of redear, so that this term becomes
vR, where v is a positive constant. Redear vanish from the redear population
at a rate proportional to the number of bass-redear encounters, and thus by
the amount wBR, where w is another positive constant. Again subtracting
deaths from births, we find that
R' = vR - wBR.

Fig. 8.13 Directions of trends of bass population in the predator-prey example. (Horizontal axis: B; vertical axis: R. B' > 0 above the horizontal line R = d/k, and B' < 0 below it.)

Fig. 8.14 Trends of bass and redear population. (The lines R = d/k and B = v/w divide the first quadrant into four regions: R' > 0, B' > 0 at upper left; R' < 0, B' > 0 at upper right; R' > 0, B' < 0 at lower left; R' < 0, B' < 0 at lower right.)

We now solve the inequality B' > 0, in order to find for what values of B and
R the bass population is increasing. We have
kBR - dB > 0.
Since we may assume that B > 0, the term B may be canceled from this
inequality, and thus we see that
kR > d,
or
$$R > \frac{d}{k},$$
since k, too, is positive.
We plot the straight line R = d/k in the graph shown in Fig. 8.13,
where values of B lie on the x-axis and values of R lie on the y-axis. Thus
the line R = d/k is a horizontal line passing through the point d/k on the
y-axis, and for values of R in excess of d/k (that is, above this straight line)
the bass population is increasing, since B' > 0 when R > d/k. This makes
good sense; when there are many redear, the bass population should be on
the increase, and when there are few redear the number of bass should be
decreasing. This information is indicated by the usual arrows in Fig. 8.13.
We next ask for what values of B and R the redear population is increasing;
that is, we solve R' > 0. We have
vR - wBR > 0,
so that, since we may assume that R > 0,
v - wB > 0,
and hence
$$B < \frac{v}{w}.$$

We plot the vertical line B = v/w in the graph shown in Fig. 8.14; for
values of B less than v/w (that is, to the left of this line) R' > 0, and so the
redear population is increasing. And if B > v/w, the redear population is
decreasing. The usual arrows have been drawn in Fig. 8.14, summarizing
our findings about the population changes of the two species.
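
The four regions cut out by the lines R = d/k and B = v/w can be summarized in a few lines of code. This is only a sketch: the function name and the constants are invented, and the test points are chosen to fall one in each region.

```python
def trend(B, R, k, d, v, w):
    """Direction of change of (bass, redear) at a point with B > 0 and R > 0."""
    bass = "increasing" if R > d / k else "decreasing" if R < d / k else "steady"
    redear = "increasing" if B < v / w else "decreasing" if B > v / w else "steady"
    return bass, redear


# Invented constants; here the two dividing lines are R = 50 and B = 50.
k, d, v, w = 0.01, 0.5, 1.0, 0.02
for B, R in [(25.0, 75.0), (75.0, 75.0), (25.0, 25.0), (75.0, 25.0)]:
    print((B, R), trend(B, R, k, d, v, w))
```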
Now on the line R = d/k, the horizontal line shown in Fig. 8.14,
B' = 0. This can be seen by substitution of R = d/k in the bass equation

B' = kBR - dB.

Fig. 8.15 Curves of population trend in the predator-prey case. (Horizontal axis: B, with v/w marked; vertical axis: R, with d/k marked.)

Similarly, R' = 0 on the vertical line B = v/w. Hence the point (v/w, d/k)
where the two lines intersect is a critical point. Its stability will be discussed
shortly. Another critical point is (0, 0), the case in which the two species are
absent from the pond. In practice this should probably be considered a stable
critical point, since if a few of each species are introduced the bass could be
expected to eliminate the redear and then die of starvation themselves;
remember, we are assuming that the bass have no other food supply.
There is no other critical point where R = 0, for again, in the absence of
redear, the bass population will decrease to zero. One can expect a critical
point where B = 0, somewhere high up on the vertical axis, where the
population of redear will stabilize at a point much higher than if bass were
present. This is not indicated by our differential equations, since for sim-
plicity we omitted the degree of realization term from each equation.
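
These statements about critical points are easy to confirm by direct substitution. The sketch below uses exact fractions so that a zero rate prints as exactly 0; the constants are invented for illustration.

```python
from fractions import Fraction


def rates(B, R, k, d, v, w):
    """Return (B', R') for B' = kBR - dB and R' = vR - wBR."""
    return k * B * R - d * B, v * R - w * B * R


# Invented constants, stored exactly to avoid rounding.
k, d, v, w = Fraction(1, 100), Fraction(1, 2), Fraction(1), Fraction(1, 50)
interior = (v / w, d / k)

print(rates(0, 0, k, d, v, w))        # both rates vanish at (0, 0)
print(rates(*interior, k, d, v, w))   # and again at (v/w, d/k)
print(rates(0, 200, k, d, v, w))      # with B = 0 the redear rate stays positive
```

The last line illustrates why these simplified equations show no critical point on the vertical axis: with no bass present, R' = vR is positive for every positive R.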
In Fig. 8.15 we have indicated the apparent behavior of the curves of
population trend. The behavior of the arrows in the previous figure very
strongly suggests that the curves are closed, and thus that the population
of each species will undergo periodic variations. First bass and redear in-
crease, then the larger number of bass cause the redear population to decrease;
next, the diminished food supply causes a decrease in the bass population,
and finally this last decrease permits an increase in redear until there are
sufficiently many to permit an increase in the number of bass.
It might actually be that the curves are slowly spiraling in to the point
(v/w, d/k), in which case the latter would be a stable critical point; or,
possibly, the curves could be spiraling slowly outward, and the point
(v/w, d/k) would then be an unstable critical point. For these particularly
simple differential equations, it turns out that the point (v/w, d/k) is neither.
The curves actually are closed, and hence a perturbation in a population
initially with values B = v/w, R = d/k will result in a new population which
can be thought of as having values circling about the point (v/w, d/k), and
thus not really moving either away from or closer to the point (v/w, d/k).
Perhaps some term could be invented here, such as calling this the case of a
quasi-stable critical point.
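
One way to see that the curves close up is to follow a quantity that does not change along a solution. For equations of this form it is a standard fact, not derived in the text, that H = kR - d ln R + wB - v ln B is constant along every solution; differentiating H and substituting the two equations gives zero. The sketch below checks this numerically with a classical Runge-Kutta integrator. The constants, the step size, and the starting population are all invented for illustration.

```python
import math


def rates(B, R, k, d, v, w):
    """Return (B', R') for B' = kBR - dB and R' = vR - wBR."""
    return k * B * R - d * B, v * R - w * B * R


def rk4_step(B, R, h, k, d, v, w):
    """One classical fourth-order Runge-Kutta step of size h."""
    a1, b1 = rates(B, R, k, d, v, w)
    a2, b2 = rates(B + h / 2 * a1, R + h / 2 * b1, k, d, v, w)
    a3, b3 = rates(B + h / 2 * a2, R + h / 2 * b2, k, d, v, w)
    a4, b4 = rates(B + h * a3, R + h * b3, k, d, v, w)
    return (B + h / 6 * (a1 + 2 * a2 + 2 * a3 + a4),
            R + h / 6 * (b1 + 2 * b2 + 2 * b3 + b4))


def invariant(B, R, k, d, v, w):
    """H = kR - d ln R + wB - v ln B, constant along exact solutions."""
    return k * R - d * math.log(R) + w * B - v * math.log(B)


k, d, v, w = 0.01, 0.5, 1.0, 0.02   # invented constants; critical point (50, 50)
B, R = 30.0, 50.0                    # a population perturbed away from (50, 50)
start = invariant(B, R, k, d, v, w)
drift = 0.0
for _ in range(20000):               # 20 time units with step 0.001
    B, R = rk4_step(B, R, 0.001, k, d, v, w)
    drift = max(drift, abs(invariant(B, R, k, d, v, w) - start))
print(drift)   # remains very small, consistent with closed curves
```

Since each level curve of H in the first quadrant is a closed curve around (v/w, d/k), a population that starts on one such curve neither spirals in nor spirals out.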
In addition, each closed curve of the sort shown in Fig. 8.15 can also be
thought of as a critical path ("path" seems a better term than "point"
here), since deviations from one curve simply place the population on a
nearby curve. It also is sensible to ask about the duration of such a path;
that is, how long does it take the population to go through one complete
cycle? For some species of fish and many mammals, the duration of a cycle
is usually measured in terms of a few years; curiously enough, even the much
"longer" paths do not seem to have much greater duration than the very
short ones. Of course, these very simple differential equations cannot pretend
to be an accurate representation of what actually takes place in nature, but
such cycles have been observed with sufficient frequency so that they deserve
some explanation, even a weak one.
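
The remark about duration can be explored numerically. The sketch below times one full cycle for a short path and for a path roughly five times as long, and compares both with 2π/√(vd), the standard small-cycle estimate obtained by linearizing the equations at (v/w, d/k); that estimate is not derived in the text. The constants, step size, and starting points are invented for illustration.

```python
import math


def rates(B, R, k, d, v, w):
    """Return (B', R') for B' = kBR - dB and R' = vR - wBR."""
    return k * B * R - d * B, v * R - w * B * R


def rk4_step(B, R, h, k, d, v, w):
    """One classical fourth-order Runge-Kutta step of size h."""
    a1, b1 = rates(B, R, k, d, v, w)
    a2, b2 = rates(B + h / 2 * a1, R + h / 2 * b1, k, d, v, w)
    a3, b3 = rates(B + h / 2 * a2, R + h / 2 * b2, k, d, v, w)
    a4, b4 = rates(B + h * a3, R + h * b3, k, d, v, w)
    return (B + h / 6 * (a1 + 2 * a2 + 2 * a3 + a4),
            R + h / 6 * (b1 + 2 * b2 + 2 * b3 + b4))


def cycle_time(B0, k, d, v, w, h=0.001, t_max=100.0):
    """Duration of one cycle starting at (B0, d/k), where 0 < B0 < v/w.

    At the start R is rising through the level d/k, so the cycle is
    complete the next time R climbs back up through that level.
    """
    level = d / k
    B, R, t = B0, level, 0.0
    while t < t_max:
        B_new, R_new = rk4_step(B, R, h, k, d, v, w)
        if R < level <= R_new:
            # Linear interpolation inside the final step.
            return t + h * (level - R) / (R_new - R)
        B, R, t = B_new, R_new, t + h
    raise RuntimeError("no complete cycle found before t_max")


k, d, v, w = 0.01, 0.5, 1.0, 0.02        # invented constants
short_path = cycle_time(45.0, k, d, v, w)
long_path = cycle_time(25.0, k, d, v, w)
small_cycle_estimate = 2 * math.pi / math.sqrt(v * d)
print(short_path, long_path, small_cycle_estimate)
```

In this experiment the longer path takes somewhat more time than the shorter one, but far less than five times as much.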
In summary, the mathematics predicts the existence of periodic cycles
in populations of predators and their prey, such cycles have been observed,
and there is apparently a good reason why such cycles should occur.

Exercises

8.21 In the predator-prey case involving bass and redear, it would seem to
the fisherman's advantage to fish so as to move the populations of each
species close to the quasi-stable critical point (v/w, d/k). Why?
8.22 Continuing the previous exercise, during which parts of the population
cycle should the fisherman fish for bass? For redear? Can he ever fish for
both? Should he ever fish for neither?
8.23 Suppose that there are two species of fish in a pond, say A and B, and
the members of species A eat the eggs of species B at a rate proportional to
the population of species A. Suppose also that the members of species B
eat the eggs of species A at a rate proportional to the population of species B.
Suppose, finally, that the two species are not in competition for food or space,
so that other than egg-eating the presence of each species will not inhibit
the growth of the other. Under such circumstances the degree of realization
term (M - N)/M should probably be introduced into each resulting differen-
tial equation.
Write down a plausible pair of differential equations describing this
situation. It should be the case that your equations are sufficiently simple
to make the next exercise feasible.
8.24 Use the equations invented in the previous exercise to go through a
process like that in the last two sections, finding stable and unstable critical
points and sketching curves representing population trends.
8.25 List at least five deficiencies in the oversimplified treatment of the
predator-prey situation given in the previous section. For example, do bass
reproduce continuously?
8.26 Suppose that species A is parasitic on species B, which it has no difficulty
in finding, and this takes place in a region of ample space and food supply.
It would then be reasonable to write

$$A' = \alpha A \cdot \frac{KB - A}{KB},$$
and
$$B' = \beta B \cdot \frac{M - B - kA}{M}.$$

Explain the origin of these equations.


8.27 Find critical points and sketch curves of population trends for the
equations in the previous exercise.
8.28 One can try to improve the predator-prey equations given in the last
section by introducing a degree of realization term for the redear. We write

B' = kBR - dB,

$$R' = vR \cdot \frac{S - R}{S} - wBR,$$

where S is the maximum population of redear the pond can support, and the
other constants and terms have the same meaning as in the last section.
Making reasonable assumptions about the relative sizes of these constants
where necessary, find critical points and sketch curves of population trend
for this system of differential equations.
8.29 Suppose the two species B and R are symbiotic, though neither is
completely dependent on the other, and there is no competition for food or
space. Explain the origin of the descriptive equations

$$B' = kB \cdot \frac{C - B + \alpha R}{C},$$
$$R' = vR \cdot \frac{S - R + \beta B}{S},$$

where α and β are positive constants. Hint: See Section 8.3.


8.30 Assuming that the constants α and β of the previous exercise are relatively
small, sketch curves of population trend for the equations of the previous
exercise and find any critical points.
8.31 If the human race were considered as two species-male and female-
competing for food and space, what sort of differential equations would
describe this system? You should assume that the birth rate of males and
females is the same. Analyze the consequences of your differential equations,
and see if the results seem to match up with reality.
8.32 In the predator-prey equations

B' = kBR - dB,


R' = vR - wBR

for bass and redear, suppose a poison is introduced into the pond which
kills members of each species at a rate proportional to the population of that
species. We could then write the modified equations
B' = kBR - dB - pB,
R' = vR - wBR - qR,

where p and q are the positive "poison" coefficients. Without the poison
terms, the most interesting stable critical point is (v/w, d/k), as we have seen.
Where is the corresponding critical point for the new system of equations?
In particular, will the number of prey rise or fall?
8.33 Ladybugs prey on aphids, and aphids are hard on many vegetables and
ornamentals. If you observe a stable population of ladybugs and aphids in
your garden, and you have an insecticide as effective in killing ladybugs as in
killing aphids, should you spray?
8.34 Suppose that a city of fixed area contains N(t) automobiles at time t.
Suppose also that new cars are being introduced into the city at a constant
rate, and that cars are being permanently removed by total destruction due to
two types of accidents: single-vehicle accidents and two-vehicle accidents.
Should the rate of attrition of cars due to single-vehicle accidents be proportional
to N? To what should the rate of attrition due to two-vehicle
accidents be proportional? What differential equation would describe the
behavior of N' under these circumstances?
8.35 In Section 8.1, we discussed the case of unrestricted growth of a single
organism. Under such circumstances, will there be a fixed time t_0 such
that the population will double in every interval of duration t_0?
8.36 Suppose that a cylindrical tank with vertical axis has a small hole in the
bottom of such size that if the tank were full of water, then water would pour
out of the hole at the rate of 100 gallons per minute. Suppose also that in any
case, water pours out of the hole at a rate proportional to the water pressure,
and that the tank is very large compared to the other amounts used in this
problem. If water is running into the tank from an overhead pipe at the rate
of 10 gallons per minute, toward what limiting volume is the amount of water
in the tank tending? (Use appropriate constants where necessary.)
8.37 How would one test the growth of the population of the United States
in the last hundred years to see if it fits the case of the unrestricted growth of a
single species treated in Section 8.1?
8.38 How would one test the growth of the population of the United States
in the last hundred years to see if it fits the case of the growth of a single
species under limiting conditions, as treated in Section 8.2?
8.39 Return to the predator-prey case of Section 8.4, and assume that the
death rate d of bass is so small that we might as well suppose that d = 0.
What happens to the bass-redear system in this case?
8.40 Continuing the previous exercise, a more realistic assumption might
be the following: If the death rate of bass is ignored, then a degree of realiza-
tion term should be introduced into the equation giving the rate of popula-
tion increase of the bass. In this case we might have for the bass equation

$$B' = kBR \cdot \frac{M - R}{M},$$

where M is some theoretical maximum population of bass the pond can
support. Should M be proportional to R, the number of redear present?
If so, how does the bass-redear system behave? What if M is assumed to be
constant?

NOTES AND REFERENCES


Three interesting books concerned with animal populations are:
Gause, G. F., The Struggle for Existence (Williams and Wilkins, 1934).
Lack, D. L., The Natural Regulation of Animal Numbers (Oxford Clarendon
Press, 1954).
Slobodkin, L. B., Growth and Regulation of Animal Populations (Holt,
Rinehart, and Winston, 1961).

Eugene Odum's well-known textbook Ecology (Holt, Rinehart, and
Winston, 1963) is an excellent general introduction to the whole field of
ecology.
If you have worked a number of the exercises, do not feel as if you have
been doing ecology. Field work and laboratory work are essential; ecology
cannot be done solely at the desk. The most one can hope for is the
formulation of new hypotheses susceptible to experimental verification or
invalidation.
CHAPTER 9

THE ART GALLERY THEOREM

The main purpose of this chapter is to develop enough of the theory of
convex sets to prove Krasnoselskii's Theorem, also known as the "Art
Gallery" Theorem. However, there will be considerable development of side
topics, particularly in the exercises.
Although the theory of convex sets has applications in game theory,
linear programming, and other branches of mathematics much used in the
social and managerial sciences, convex sets are quite interesting in their own
right. Moreover, numerous pictures can be drawn illustrating definitions and
theorems in this chapter; we recommend that you draw such pictures fre-
quently, for they will help greatly in constructing proofs and will not usually
mislead you.
A small amount of notation from the elementary theory of sets will be
used throughout this chapter, so if this notation is unfamiliar to you, it would
be very helpful to read over the first section of Chapter 6. On the other hand,
though the study of convex sets will be presented from a very geometric
point of view, plane geometry itself is not a prerequisite for this chapter.
We will usually restrict our attention to subsets of the ordinary two-dimen-
sional plane; although generalizations to higher dimensions are frequently
possible, such material will usually be reserved for the exercises.

9.1 CONVEX SETS


We assume that you are familiar with the properties of straight lines and
straight-line segments in the plane. We will denote the ordinary two-dimensional
plane by E^2, just an abbreviation for Euclidean space of two
dimensions.

Fig. 9.1 The segment [a, b] from the point a to the point b. (Horizontal axis: x; vertical axis: y.)

If a and b are points of E^2, the straight-line segment joining a and b
will be denoted by [a, b], and consists of a and b together with all points
between a and b on the straight line in E^2 through a and b. By [a, a] we
mean the set {a}, consisting of the point a alone.
Example 9.1 Let a = (1, 0) and b = (2, 2). Then [a, b] consists of all
points (x, y) of E^2 such that 1 ≤ x ≤ 2 and y = 2x - 2. The set [a, b]
of this example is shown in Fig. 9.1.
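
A segment can also be described with a parameter: the points of [a, b] are exactly the points a + t(b - a) with 0 ≤ t ≤ 1. This description is a standard one and is not used in the text. The sketch below takes the endpoints (1, 0) and (2, 2) of the line y = 2x - 2 over 1 ≤ x ≤ 2 and confirms that sampled points satisfy both conditions of the example; the function name is invented.

```python
from fractions import Fraction


def point_on_segment(a, b, t):
    """The point a + t*(b - a); t runs from 0 (at a) to 1 (at b)."""
    return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))


a = (Fraction(1), Fraction(0))
b = (Fraction(2), Fraction(2))
samples = [point_on_segment(a, b, Fraction(i, 4)) for i in range(5)]
for x, y in samples:
    # Each sampled point should satisfy 1 <= x <= 2 and y = 2x - 2.
    print(x, y, 1 <= x <= 2 and y == 2 * x - 2)
```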
It should be clear from the definition that [a, b] = [b, a]. You may use
anything you know about plane geometry to prove our first theorem.
Theorem 9.1 If a and b are points of E^2 and p ∈ [a, b], then
[a, p] ∪ [p, b] = [a, b].
Of course, to prove that the above two sets are equal, one shows that each
point of [a, p] ∪ [p, b] is a point of [a, b], and conversely.
The following is our most important definition. Let C be a subset of
E^2. We say that C is convex if [a, b] is a subset of C whenever a and b are
points of C. That is, for every possible pair of points a and b of C, it must
be true that the segment [a, b] belongs wholly to C. And thus, to prove a set
is convex, a usually fruitful approach is to select two arbitrary points a and b
of the set, and then prove that [a, b] is a subset of the given set.
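
The definition also suggests how to show that a set is not convex: exhibit two points of the set and a point of the segment between them that lies outside the set. The sketch below searches for such a witness by sampling. It is only an experiment, and all names in it are invented: finding a witness proves a set is not convex, but failing to find one proves nothing.

```python
from itertools import combinations


def segment_points(a, b, steps=20):
    """Evenly spaced sample points on the segment [a, b], endpoints included."""
    return [(a[0] + i / steps * (b[0] - a[0]), a[1] + i / steps * (b[1] - a[1]))
            for i in range(steps + 1)]


def convexity_witness(contains, candidates):
    """Look for a, b in the set and p on [a, b] with p outside the set.

    Returns (a, b, p) if such a triple is found among the samples,
    otherwise None. None is not a proof of convexity.
    """
    inside = [p for p in candidates if contains(p)]
    for a, b in combinations(inside, 2):
        for p in segment_points(a, b):
            if not contains(p):
                return a, b, p
    return None


def in_disk(p):
    """Closed circular disk of radius 2 about the origin."""
    return p[0] ** 2 + p[1] ** 2 <= 4.0


def in_ring(p):
    """The same disk with an open hole of radius 1 removed."""
    return 1.0 <= p[0] ** 2 + p[1] ** 2 <= 4.0


grid = [(x / 2, y / 2) for x in range(-4, 5) for y in range(-4, 5)]
print(convexity_witness(in_disk, grid))   # no witness is found for the disk
print(convexity_witness(in_ring, grid))   # a pair whose segment crosses the hole
```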
For some examples, the plane sets shown in Fig. 9.2 are convex; those
shown in Fig. 9.3 are not. Some special cases of convex sets are these:
• E^2 itself is convex.
• Each segment [a, b] is convex.
• A set consisting of just one point in E^2 is convex.
• A circular disk in E^2, containing all, part, or none of its boundary, is
convex.
• If C consists of a triangle together with all points within the triangle, then
C is convex. (We will call such a set a triangular region.)
• The empty set ∅ is convex.
The last assertion above is a consequence of the definition of convexity.
For if a set S is not convex, it must contain at least one pair of points a and
b such that [a, b] is not wholly contained in S. The empty set contains no
such pair of points, for it contains no points at all.
On the other hand, while it is at least intuitively clear that a triangular
region is convex, a formal proof of this is difficult and depends on a careful
definition of "triangle." You should assume whenever necessary that a
triangular region is convex, even though we will not supply a proof of this
fact.

Exercises

9.1 Professor Aardvark told his class that a convex set was one "each
two points of which could see each other." What did he mean?
9.2 Let a and b be points of E^2. Under what circumstances is the set {a, b}
convex?
9.3 How would you define convexity for subsets of three-dimensional
space E^3?
9.4 How would you define convexity for subsets of one-dimensional
space E^1?
9.5 Prove Theorem 9.1; that is, show that if a and b are points of E^2 and
p ∈ [a, b], then
[a, p] ∪ [p, b] = [a, b].

9.6 Draw three convex sets in E^2 such that any two of them have at least
one point in common, but such that there is no point common to all three of
the sets.
9.7 Prove that each segment [a, b] is convex.

Fig. 9.2 Two convex sets in E^2.

Fig. 9.3 Two nonconvex sets in E^2.

Fig. 9.4 Proving that C ∩ D is convex if C and D are.

9.8 Draw two overlapping convex sets C and D. In your example, is their
intersection C ∩ D also convex?
9.9 Prove that if C and D are any two convex sets whatsoever in E^2, then
C ∩ D must be convex.
9.10 Must the union of two convex sets be convex? Explain the reason for
your answers.

9.2 INTERSECTIONS OF CONVEX SETS


One might solve Exercise 9.9 (to show that the intersection of two convex
sets is convex) as follows. Let C and D be two convex sets in E^2. We can
dispose of the simplest cases first. If C ∩ D is empty, then it is convex; if
C ∩ D contains only one point, it is convex. So we might as well suppose that
C ∩ D contains at least two points, say p and q. To help us proceed with the
argument, we draw a picture like that shown in Fig. 9.4.
What do we know about the two points p and q? Only that each belongs
to C ∩ D. Put another way, this means that p and q belong to C, and also
that p and q belong to D.
But since p and q belong to C, and C is given as a convex set (here is where
that hypothesis is used), it follows by the definition of convex set that
[p, q] is a subset of C. For exactly the same sorts of reasons, [p, q] is also
a subset of D. But since [p, q] is a subset of both C and D, it follows that
[p, q] is a subset of C ∩ D.
What has happened? Two arbitrary points p and q were chosen in C ∩ D,
and it turned out that the segment [p, q] was consequently a subset of C ∩ D.
This means that C ∩ D is convex, by definition. The above argument thus
establishes our next theorem.
Theorem 9.2 The intersection of two convex sets is convex.

By mimicking this proof, you can show without difficulty that the
intersection of three convex sets must be convex. For a start, let A, B, and C be
convex sets, and let D = A ∩ B ∩ C. Stop here and try to prove that D
must be convex.
If you worked Exercise 9.10, you probably saw that the union of two
convex sets need not be convex. In fact, if p and q are two distinct points of E^2,
then each of the sets {p} and {q} is convex, but their union {p, q} is not,
since it does not contain all the points of the segment [p, q]. Even if two
convex sets have points in common, their union need not be convex, a fact
that you can establish by means of innumerable simple examples. However,
there is one situation in which the union of convex sets is convex; this
situation will be discussed in the next group of exercises.
Though you may have thought of one proof that the intersection of
three convex sets must be convex, there are two: one proof involving segments
(probably the one you thought of), and one not. The latter is the one
given for our next result.

Theorem 9.3 The intersection of three convex sets is convex.

Proof. Let A, B, and C be convex sets, and let D = A ∩ B ∩ C. Then

D = (A ∩ B) ∩ C,

since set-intersection obeys an associative law.
But since A and B are convex, so is A ∩ B, by the previous theorem.
Hence we have written D as the intersection of two convex sets, namely
A ∩ B and C. Again using the last theorem, it follows that D is convex.
Hence the intersection of any three convex sets must be convex.
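
The same regrouping idea can be carried out mechanically for any finite list of sets: apply the two-set rule repeatedly, working from left to right. As a small illustration, the sketch below uses closed intervals of the line as stand-ins for convex sets (an assumption made here only to keep the sets easy to represent) and folds a two-interval intersection over a list. The function names are invented.

```python
from functools import reduce


def intersect(i, j):
    """Intersection of two closed intervals (lo, hi); None is the empty set."""
    if i is None or j is None:
        return None
    lo, hi = max(i[0], j[0]), min(i[1], j[1])
    return (lo, hi) if lo <= hi else None


def intersect_all(intervals):
    """Fold the two-set rule over a nonempty list: ((A ∩ B) ∩ C) ∩ ..."""
    return reduce(intersect, intervals)


print(intersect_all([(0, 10), (3, 12), (-5, 7)]))   # (3, 7)
print(intersect_all([(0, 1), (2, 3), (0, 3)]))      # None, the empty set
```

Each intermediate result is again an interval or the empty set, just as each intermediate intersection in the proof is again convex.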

Exercises

9.11 Use the technique of the proof of Theorem 9.3 to show that the inter-
section of four convex sets must be convex.
9.12 Suppose that n is a natural number at least 2, and it is known that the
intersection of any collection of n convex sets is convex. Use this fact and
the technique of the proof of Theorem 9.3 to show that the intersection of any
collection of n + 1 convex sets is a convex set.
9.13 Does it follow from Theorem 9.2 and the previous exercise that the
intersection of any finite collection of convex sets is convex? Explain
carefully.
9.14 Does it follow from Theorem 9.2 and Exercise 9.12 that the intersection
of infinitely many convex sets is convex? Explain your answer.
9.15 In the previous section it was mentioned that there is one special
situation in which the union of convex sets must be convex. This occurs when
the collection of convex sets forms what is called a tower. That is, the
collection 𝒞 of sets (convex or not) is said to be a tower if, given any two
sets A and B in the collection 𝒞, it is true that either
A ⊂ B or B ⊂ A.

Prove that if 𝒞 is a tower of convex sets, then ∪𝒞 (the union of all sets
in the collection 𝒞) is convex. Hint: If p and q are two points of ∪𝒞, then
p ∈ A and q ∈ B for some sets A and B in 𝒞. How can you use the fact that
𝒞 is a tower?

9.16 Does Theorem 9.2 hold for convex sets in E^3? (The notation E^3 of
course stands for three-dimensional Euclidean space.) Explain.
9.17 Do Theorem 9.3 and the technique of its proof remain valid for
convex sets in E^3? Why?
9.18 Does Exercise 9.15 hold for convex sets in E^3? Give your reasons.
9.19 Let 𝒞 be a collection of sets each finite subcollection of which is a
tower. Does it follow that 𝒞 itself must be a tower? Explain your answer.
9.20 If 𝒞 is a tower of sets in E^2, need there be a "largest" element of 𝒞;
that is, need 𝒞 include a set L such that L contains all the other sets in 𝒞?
Explain carefully.

9.3 HULLS AND KERNELS


Although we can use our method of proving Theorem 9.3 to show that the
intersection of any finite collection of convex sets is convex, an alternate
proof can be given that shows that the intersection of any collection of convex
sets is convex. This is one of the more important results about convex sets,
and we present it as our next theorem.
Theorem 9.4 If 𝒞 is any collection of convex sets, then the intersection of all
the sets in 𝒞 (denoted by ∩𝒞) is convex.
Proof. If p and q are two points of ∩𝒞, then both p and q belong to every
set in the collection 𝒞. Since each such set is convex, the segment [p, q]
also belongs to every set in 𝒞. Hence [p, q] is a subset of ∩𝒞. Therefore
by definition, ∩𝒞 is convex.
It is fortunate that such an important theorem has such an easy proof.
Note also that Theorems 9.2 and 9.3 are superseded by this theorem, for they
are special cases of it. The proof seems to work even in the special cases
where 𝒞 contains only one convex set, or no sets at all; the latter somewhat
puzzling situation will be discussed in the next set of exercises.

Fig. 9.5 Forming the convex hull of the two sets S and T. (Panels: Hul (S) and Hul (T).)

In addition, if A is any subset of E^2 whatsoever, then it is meaningful
to consider the collection 𝒞 of all convex subsets of E^2 containing A. The
intersection of all these sets is called the convex hull of A, and is abbreviated
by Hul (A). It has the following properties, which we list in the form of a
theorem.
Theorem 9.5 Let A be a subset of E^2. Then
a) A ⊂ Hul (A).
b) Hul (A) is convex.
c) If C is any convex set containing A, then C also contains Hul (A).
d) A is convex if and only if A = Hul (A).
The proofs of each of these facts are quite easy, and are left for the
exercises. We should remark at this point that, as a consequence of this
theorem, there is a unique "smallest" convex set, namely Hul (A), containing
each subset A of E^2. Some examples of hull formation are shown in Fig. 9.5.
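
When A is a finite set of points, Hul (A) is a polygonal region and its corner points can be computed. The sketch below uses Andrew's monotone chain method, a standard algorithm that is not taken from the text; the function names and the sample points are invented for illustration.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o, a, b (positive if counterclockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_corners(points):
    """Corner points of Hul(points) for a finite set, in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:                      # lower boundary, left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):            # upper boundary, right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]     # drop the repeated end points


A = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2), (2, 0)]
print(hull_corners(A))   # [(0, 0), (4, 0), (4, 3), (0, 3)]
```

Here the hull is the rectangular region with the four printed corners; the other three points of A lie inside it or on its boundary, so they contribute nothing new.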
Associated with each subset A of E^2, in addition to its convex hull, is
another set called its convex kernel. The kernel can be defined as follows:
Ker (A) = {x ∈ A | if y ∈ A, then [x, y] ⊂ A}.
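
For a region bounded by a polygon, membership in the kernel can be tested directly. The sketch below relies on a standard fact that is not proved in the text: if A is the region bounded by a simple polygon whose corners are listed counterclockwise, then Ker (A) consists of the points lying on the inner side of every edge line. The function names and the L-shaped example are invented for illustration.

```python
def left_of_or_on(p, a, b):
    """True if p lies on or to the left of the directed line from a to b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0


def in_kernel(p, polygon):
    """Test whether p belongs to Ker(A) for the region A bounded by polygon.

    The polygon must be simple, with corners listed counterclockwise.
    """
    n = len(polygon)
    return all(left_of_or_on(p, polygon[i], polygon[(i + 1) % n])
               for i in range(n))


# An L-shaped region: a 2-by-2 square with its upper right quarter removed.
L_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
print(in_kernel((0.5, 0.5), L_shape))   # True
print(in_kernel((1.5, 0.5), L_shape))   # False
```

The second point fails because, for example, the segment from (1.5, 0.5) to (0.5, 2) leaves the region near the inner corner at (1, 1).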

Fig. 9.6 Forming the convex kernels of the two sets S and T.

Some examples of the formation of Ker (A) from A are shown in Fig. 9.6.
One convenient way to remember the definition of Ker (A) is to say that
Ker (A) is the set of all points of A that can "see" all the other points of A.
And, in analogy to our theorem about the convex hull, we have our next
result.

Theorem 9.6 Let A be a subset of E^2. Then
a) Ker (A) ⊂ A.
b) Ker (A) is convex.
c) A is convex if and only if A = Ker (A).


Again, the easy proof is left for the exercises. The analogy between
kernels and hulls is not perfect; while there is a unique "smallest" convex
set containing the given set A, the kernel need be neither the largest nor the
smallest convex subset of A. However, there is a connection between Ker (A)
and the "largest" convex subsets of A, and we will make this connection
explicit in our next theorem. But, before we even state that theorem, some
preliminaries are needed.
Let 𝒞 be a collection of sets, and let Π be a property meaningful for each
set A ∈ 𝒞. (That is, for each set A in 𝒞 it is either true or false that A has
property Π.) The set M in 𝒞 is said to be maximal with respect to property
Π if
a) M has property Π, and
b) M is not a proper subset of any other set in 𝒞 having property Π.
It is important that you do not confuse the property of being maximal
with the property of being maximum. A maximum set having property Π
would be a set having property Π and also containing every other set having
property Π. To illustrate the distinction, we give an example.
Let 𝒞 be the collection of all subsets of E². Let us say that a subset of the
plane is nonlinear if no three of its points lie on a straight line. Let Π be the
property of "being nonlinear." Then Π is certainly meaningful for sets in the
collection 𝒞. And 𝒞 contains sets maximal with respect to property Π; for
example, let M be a circle.
Then M is maximal with respect to being nonlinear, because
a) M itself is nonlinear, and
b) if M is a proper subset of L (where L ∈ 𝒞) then L cannot have property
Π, for since M is a proper subset of L, L contains at least one point
x ∉ M. A straight line through x and the center of the circle M (or any
line through x, if x is itself the center) contains x and two points of M,
thus three points of L, since M ⊂ L.
This example shows that a set that is maximal with respect to property Π
need not be unique, for there are numerous different circles in the plane,
and each is maximal with respect to being nonlinear. However, there is no
subset of E² that is a maximum with respect to being nonlinear, for the only
subset of E² containing all circles is E² itself, and E² is not nonlinear.
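The distinction between maximal and maximum can also be checked by brute force in a small finite setting. The Python sketch below is an illustration only: in place of all subsets of the plane it uses the subsets of a 3-by-3 grid of points (an assumption made so that every case can be examined), and the names GRID, nonlinear, maximal, and maximum are our own.

```python
from itertools import combinations

GRID = [(x, y) for x in range(3) for y in range(3)]

def collinear(p, q, r):
    """True if the three points lie on one straight line."""
    return (q[0] - p[0]) * (r[1] - p[1]) == (q[1] - p[1]) * (r[0] - p[0])

def nonlinear(s):
    """True if no three points of s lie on a straight line."""
    return not any(collinear(p, q, r) for p, q, r in combinations(s, 3))

# Every subset of the grid that has the property "being nonlinear".
good = [frozenset(c) for k in range(len(GRID) + 1)
        for c in combinations(GRID, k) if nonlinear(c)]

# Maximal: not a proper subset of any other set having the property.
maximal = [m for m in good if not any(m < other for other in good)]

# Maximum: has the property and contains every other set having it.
maximum = [m for m in good if all(other <= m for other in good)]

print(len(maximal), "maximal sets;", len(maximum), "maximum sets")
```

There are several maximal sets but no maximum set, just as with the circles: a maximum would have to contain every single grid point, and the whole grid is not nonlinear.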
The following axiom is one form of the so-called Zermelo Axiom, which
most mathematicians assume to be true. We will need this axiom
to prove our next theorem.
Axiom Let 𝒞 be a collection of sets and let Π be the property of "being a
tower." (Then Π is meaningful for each subcollection 𝒜 ⊂ 𝒞, since each
such subcollection either is or is not a tower.) Then 𝒞 contains a subcollection
ℳ maximal with respect to property Π.

As an example of an application of this axiom, let 𝒞 be the collection of
all circular disks in E² each of which contains its boundary. The above
axiom guarantees the existence of a maximal tower ℳ of such disks; that
is, ℳ is a collection of circular disks, each disk in ℳ either contains or is
contained in each other disk in ℳ, and if some circular disk D is either
contained in or contains each disk in ℳ then D ∈ ℳ.
(In this example it is not actually necessary to use the axiom to find the
maximal tower ℳ of circular disks, since it is not difficult to show that the
collection of all disks centered at the origin is a maximal tower. Note also
that this latter collection is maximal, but not a maximum.)
We mentioned that there was a connection between Ker (A) and the
convex subsets of A. We now proceed to demonstrate that connection, first
by proving the following lemma (which shows the existence of maximal
convex subsets of a given set A) and then immediately proceeding to the
theorem in question, which says that Ker (A) is the intersection of the maximal
convex subsets of A. The axiom just given is necessary to establish the lemma.
Lemma Let A be a subset of E² and let p be a point of A. Then there exists
a subset C of E² such that
a) p ∈ C;
b) C ⊂ A;
c) C is convex; and
d) C is maximal with respect to the above three properties.
Proof Let 𝒞 be the collection of all convex subsets of A containing p.
Then 𝒞 is not the empty collection, since {p} ∈ 𝒞. By the axiom, 𝒞 contains a
maximal tower ℳ. By Exercise 9.15, ∪ℳ is convex. Let C = ∪ℳ. It
is not difficult to show that C has the four desired properties listed in the
statement of the lemma.
Theorem 9.7 Let A be a subset of E². Let ℬ be the collection of all maximal
convex subsets of A. Then Ker (A) = ∩ℬ.
The proof is outlined in Exercise 9.40.

Exercises
9.21 This and the next three exercises provide a proof of Theorem 9.5.
Let A be a subset of E². Show that A ⊂ Hul (A).
9.22 Let A be a subset of E². Show that Hul (A) is convex.
9.23 Let A be a subset of E² and let C be a convex subset of E² such that
A ⊂ C. Prove that Hul (A) ⊂ C.
9.24 Let A be a subset of E². Prove that A is convex if and only if A =
Hul (A).

9.25 Let A be a subset of E², and let Λ(A) be the set obtained by adjoining
to A all points belonging to segments [p, q], where p and q are points of A.
For example, if A is a circle, this process produces a circular disk for Λ(A).
Need the set Λ(A) formed in this way be convex? Explain.
9.26 Continuing the previous exercise, show that A ⊂ Λ(A) and that
Λ(A) ⊂ Hul (A).
9.27 See the previous two exercises. Is it true that, given A ⊂ E², the
application of Λ a finite number of times will produce Hul (A)? That is, is it
true that one of the sets
A, Λ(A), Λ(Λ(A)), Λ(Λ(Λ(A))), ...
must be the convex hull of A? Is there a maximum number of applications of
Λ that will always suffice for the production of Hul (A)? What is this number?
9.28 Repeat the previous three exercises for subsets of E³ rather than E².
9.29 Repeat Exercises 9.25, 9.26, and 9.27 for subsets of E¹ rather than E².
9.30 Let A be a subset of E² and let 𝒯 be the collection of all triangular
regions whose vertices lie in A. Need Hul (A) = ∪𝒯? Why?
9.31 Let A be a subset of E³ and let 𝒯 be the collection of all triangular
regions whose vertices lie in A. Need Hul (A) = ∪𝒯? Explain.
9.32 Give an example of a tower 𝒞 of convex subsets of E² such that ∩𝒞
is nonempty. Then give an example such that ∩𝒞 is empty.
9.33 In the previous section we provided an example (being nonlinear)
of a property meaningful for subsets of E², and a set maximal with respect
to this property. Provide a different example of such a property and, if
possible, find a subset maximal with respect to that property.
9.34 Let 𝒞 consist of all convex subsets of E² no one of which contains the
origin (0, 0). Show that there exists a maximal convex subset of E² not
containing the origin. Is it possible to use the axiom of the last section?
Is it necessary?
9.35 If 𝒞 is the empty collection of subsets of E², what is ∩𝒞? Hint: If
p ∉ ∩𝒞, then p must fail to belong to some set in the collection 𝒞. What
points p have this property?
9.36 If 𝒞 is the collection of subsets of E² consisting of just one set, say A
(that is, 𝒞 = {A}), then what is ∩𝒞?
9.37 Let A be a subset of E². Show that the point p belongs to Hul (A)
if and only if p belongs to the convex hull of a set consisting of three or fewer
points of A. Hint: Use Exercise 9.30 if you wish.
9.38 The last line of the proof of the Lemma of the preceding section leaves
verification of the four properties of the Lemma for the reader. Please verify
them.

9.39 Prove Theorem 9.6.


9.40 Here is an outline of the proof of Theorem 9.7; please fill in the details.
We are given that A is a subset of E² and that ℬ is the collection of all
maximal convex subsets of A. Let B = ∩ℬ; we need to show that
Ker (A) = B.
First, we show that Ker (A) ⊂ B. There is no problem if Ker (A) = ∅
(why?), so let p ∈ Ker (A). It suffices to show that if M is a maximal convex
subset of A, then p ∈ M (why is this sufficient?). So let M be a maximal
convex subset of A.
Since p ∈ Ker (A), [p, q] ⊂ A for each point q ∈ M (why?). Let C
consist of all points belonging to all such segments [p, q], where q ∈ M.
Then C is convex (why?), C is a subset of A (why?), and M ⊂ C. Since M
is a maximal convex subset of A, M = C (why?). Since p ∈ C (why?), it
follows that p ∈ M. As we mentioned previously, this is sufficient to show
that p ∈ B. Hence, since p is an arbitrary point of Ker (A), it follows that
Ker (A) ⊂ B, and we are half done.
Next, there remains only the problem of showing that B ⊂ Ker (A).
If B = ∅, there is no problem, so let x be a point of B. To show that
x ∈ Ker (A), it is sufficient (why?) to show that [x, y] ⊂ A for each y ∈ A.
So let y be an arbitrary point of A.
By the Lemma of the last section, y is contained in a maximal convex
subset M of A (exactly how is the Lemma applied?). But since x ∈ B, it
follows (why?) that x ∈ M too. Thus [x, y] ⊂ M (why?). Hence [x, y] ⊂ A
(why?). It follows that x ∈ Ker (A), and thus that B ⊂ Ker (A).
We have shown that both
Ker (A) ⊂ B and B ⊂ Ker (A)
are true, and consequently Ker (A) = B. This establishes Theorem 9.7.

9.4 HELLY'S THEOREM


Helly's Theorem will be the principal tool we use to prove the Art Gallery
Theorem. First, suppose that 𝒞 is a nonempty collection of convex subsets
of E² such that each two sets in 𝒞 have nonempty intersection. Does it
follow that ∩𝒞 must be nonempty? Can we even conclude that the
intersection of each three sets in 𝒞 is nonempty? Try to answer these questions
before proceeding.
We can answer the first question very easily. In the coordinatized plane,
let the set Cₙ consist of all points on, or to the right of, the vertical line
through the point (n, 0) on the x-axis, where n is allowed to assume all whole
number values. Let 𝒞 consist of all the sets Cₙ. Then 𝒞 is a collection of plane
convex sets, and the intersection of each two sets in 𝒞 is clearly nonempty.
However, no point belongs to ∩𝒞, for in order that p ∈ ∩𝒞, it would be
necessary that the x-coordinate of p be greater than every integer, which is
impossible. So the answer to the first question above is a very emphatic
"No"; indeed, any finite subcollection of 𝒞 has nonempty intersection, but
still ∩𝒞 = ∅. However, this example sheds little light on the second
question, for in this case the intersection of each three sets from 𝒞 is indeed
nonempty.

Fig. 9.7 The line passing through (n, 0) has slope n.

On the other hand, the unbounded straight lines drawn as shown in Fig. 9.7
form a collection of convex subsets of E², and each pair of these lines
intersects, since no two of the lines are parallel. Yet with the help of a
little analytic geometry one can show that no three of these lines have a
common point. So the answer to the second question raised above is also
in the negative.
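The analytic geometry can be checked mechanically. The line through (n, 0) with slope n has the equation y = n(x - n); setting m(x - m) = n(x - n) for m different from n gives x = m + n and then y = mn. The Python sketch below is only an illustration, with names of our own choosing, and it verifies the claim for the slopes 1 through 7 rather than proving it in general.

```python
from fractions import Fraction
from itertools import combinations

def meet(m, n):
    """Common point of the lines y = m(x - m) and y = n(x - n), for m != n."""
    x = Fraction(m * m - n * n, m - n)   # solve m(x - m) = n(x - n) for x
    return (x, m * (x - m))

slopes = range(1, 8)
points = {(m, n): meet(m, n) for m, n in combinations(slopes, 2)}

# Each pair of lines meets, and the meeting point is (m + n, mn).
pairs_ok = all(pt == (m + n, m * n) for (m, n), pt in points.items())

# Lines a, b, c would be concurrent exactly when a meets b where a meets c.
no_triple = all(points[(a, b)] != points[(a, c)]
                for a, b, c in combinations(slopes, 3))

print(pairs_ok, no_triple)  # True True
```

Since the first coordinate m + n already determines n once m is fixed, two different lines can never cross the line of slope m at the same point, which is the reason no three lines are concurrent.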
But Helly's Theorem tells us that the answer must be affirmative if we
increase the number of sets mentioned in both the hypothesis and the
conclusion. The above examples serve to show why Helly's Theorem may be a
somewhat unexpected result.
Fig. 9.8 One case in the proof of Helly's Theorem. The points ABC, ABD, ACD, and BCD are the vertices of a convex quadrilateral.

Theorem 9.8 (Helly's Theorem) Suppose that 𝒞 is a collection of convex
subsets of E² such that any three sets in 𝒞 have a point in common. Then for
each n ≥ 4, each n sets in 𝒞 have a point in common.
This is just one version of Helly's Theorem; there are forms of Helly's
Theorem for three- and higher-dimensional Euclidean space, and forms that
guarantee even that ∩𝒞 itself is nonempty. However, the version stated
above will be quite sufficient for the proof of the Art Gallery Theorem, and
it is the easiest version to prove.
Proof We attack the simplest case first, the case in which the number n of
the theorem is 4. That is, suppose that 𝒞 is a collection of plane convex sets
each three of which have a point in common; we desire to show that each
four of the sets in 𝒞 must also have a common point. So let A, B, C, and D
be four sets in 𝒞.
Since each three sets in 𝒞 must have a point in common, there must in
particular be a point common to the three sets A, B, and C. For convenience,
we will denote this point by ABC, for this notation serves to remind us,
among other things, that the point ABC belongs to each of the three sets
A, B, and C.
Similarly, there are points ABD, ACD, and BCD common to the other
possible combinations of three of the four sets in question. In the most
general case, these points are four distinct points in E², and one possibility
is that they form the vertices of a convex quadrilateral, as shown in Fig. 9.8.

In this case, the diagonals of the quadrilateral must intersect in a point
we have called p. In Fig. 9.8, the diagonal from ABC to BCD has its end
points in the convex sets B and C, so this diagonal must be a subset of B ∩ C.
Similarly, the other diagonal must be a subset of A ∩ D. Hence the point p
must belong to all four of the sets A, B, C, and D. This shows that any four
sets in 𝒞 must have nonempty intersection in this case, the one in which the
four points form the vertices of a convex quadrilateral. Fortunately, there
are not many other cases, and all of the others are even simpler, so that we
have left their discussion for the exercises at the end of this section.
Assuming, then, that the other cases can similarly be disposed of, we have shown
that if each three sets in 𝒞 have a point in common, then also each four
sets in 𝒞 must have a point in common.
You may well guess what comes next. Knowing of 𝒞 more than we did
before (that each four sets in 𝒞 must have a common point), we proceed to
show that each five sets in 𝒞 must have a point in common. By continuing
this process, we may conclude that each finite subcollection of sets from 𝒞
must have nonempty intersection, since each finite value of n > 4 must
eventually be reached by this method. However, the "obvious" way to show
that each five sets in 𝒞 have a point in common is not the best way.
For suppose we are given five sets A, B, C, D, and E in 𝒞. What we
should not do is label the point known to lie in the four sets A, B, C, and D
by the symbol ABCD, and consider the possibilities for the five points we
would thus obtain. There are too many possibilities, and for larger values of
n the situation becomes far more complicated. Instead, we use a trick like
that in the proof of Theorem 9.3 and Exercises 9.11 and 9.12.
We consider the collection ℬ consisting of the five sets A, B, C, D, and E.
Since these sets come from 𝒞, we know that any three of them have a point
in common, and thus, by what we have already proved, also that any four
of them must have a point in common. Consider next the new collection
𝒜 = {A, B, C, D ∩ E}.
In 𝒜, we have four convex sets, and each three of them have a point in
common. For the only possible combinations of intersections of three of
them are
A ∩ B ∩ C, A ∩ C ∩ (D ∩ E),
A ∩ B ∩ (D ∩ E), B ∩ C ∩ (D ∩ E).
Because of what we have observed about ℬ, each of the above sets is
nonempty. Hence each three sets in 𝒜 have nonempty intersection. Since
each set in 𝒜 is convex, it follows by what we have already proved that any
four sets in 𝒜 have nonempty intersection. There are only four sets in 𝒜,
so we now know that
A ∩ B ∩ C ∩ (D ∩ E)

is nonempty. Thus the five sets A, B, C, D, and E have nonempty intersection.
Since these are five sets arbitrarily chosen from 𝒞, this shows that each five
sets from 𝒞 have nonempty intersection.
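The bookkeeping behind this trick can be checked by machine. The Python sketch below is an illustration only, with names of our own choosing. Each member of the merged collection is recorded as the set of original labels it intersects, and we ask how many original sets can be involved when a given number of members of the merged collection are intersected.

```python
from itertools import combinations

def merge_last_two(labels):
    """Replace the last two sets by their intersection.

    Each member of the result records which original sets it intersects.
    """
    return [frozenset([s]) for s in labels[:-2]] + [frozenset(labels[-2:])]

def largest_expansion(labels, k):
    """Most original sets involved in any intersection of k merged members."""
    merged = merge_last_two(labels)
    return max(len(frozenset().union(*combo))
               for combo in combinations(merged, k))

print(largest_expansion("ABCDE", 3))    # 4: triples from {A, B, C, D ∩ E}
print(largest_expansion("ABCDEF", 4))   # 5: quadruples from {A, B, C, D, E ∩ F}
```

In each case the count stays one below the total number of original sets, which is exactly why the result already proved for fewer sets can be applied to the merged collection.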
You can see how we continue application of this method. For example,
now that we know each five sets in 𝒞 have nonempty intersection, we let
A, B, C, D, E, and F be six sets arbitrarily chosen from 𝒞, and let
𝒟 = {A, B, C, D, E, F},
and
ℰ = {A, B, C, D, E ∩ F}.
Now 𝒟 is a subcollection of 𝒞, and so by what we have already shown,
each five sets in 𝒟 have nonempty intersection. Moreover, ℰ is a collection
of convex sets in E² and each four sets in ℰ have nonempty intersection;
some of the possibilities for intersections of four sets from ℰ are listed below:
A ∩ B ∩ C ∩ D,
A ∩ B ∩ D ∩ (E ∩ F),
B ∩ C ∩ D ∩ (E ∩ F).
In every case, any intersection of four sets from ℰ can be thought of as the
intersection of five or fewer sets from 𝒟, and hence any intersection of four
sets from ℰ is nonempty. Hence, by what we have already shown, the
intersection of any five sets from ℰ must also be nonempty. The only possibility
for such an intersection is
A ∩ B ∩ C ∩ D ∩ (E ∩ F),
but this is the same as the intersection of all six of the sets in 𝒟. So the
intersection of any six sets from 𝒞 itself is nonempty.
Thus, given n > 4, we will eventually reach the value of the integer n
after a number of repetitions of this idea, and thus we can conclude that the
intersection of any finite subcollection of sets from 𝒞 is nonempty. This
establishes our version of Helly's Theorem.

Exercises

9.41 Does the following statement describe what was actually shown in our
proof of Helly's Theorem?
Suppose that 𝒞 is a collection of convex sets in the plane, n is a whole
number at least 3, and any n sets in 𝒞 have nonempty intersection. Then any
n + 1 sets in 𝒞 have nonempty intersection.
9.42 Suppose that 𝒞 is a collection of 5000 unbounded straight lines in the
plane, any three of which have one point in common. What can you conclude?

9.43 Suppose that 𝒞 is a collection of 10,000 circles in the plane (here, by a
"circle" we mean the boundary of a circular disk, rather than the disk
itself), and any three circles in 𝒞 have at least one point in common. What
can you conclude?
9.44 Suppose that 𝒞 is a collection of 64 solid balls in three-dimensional
space, each three of which have a point in common. What can you conclude?
9.45 Suppose that n is an integer at least 4, and 𝒞 is a collection of n points
in E² such that each three of these points lie within a circle of radius 1. Can
you show that all n points must lie within a circle of radius 1?
9.46 What do you think is the correct version of Helly's Theorem for E³?
9.47 What do you think is the correct version of Helly's Theorem for E¹?
9.48 In the proof of Helly's Theorem, we began by letting 𝒞 be a collection
of convex sets in the plane each three of which are known to have a common
point, and we sought first to prove that each four sets in 𝒞 had a common
point. We chose four sets A, B, C, and D from 𝒞, and denoted the point
common to A, B, and C by ABC, and so on. There were several cases for the
location of the four points ABC, ABD, ACD, and BCD in E², and we
considered only the case in which these four points lay on the vertices of a convex
quadrilateral. List the other possibilities.
9.49 Continuing the previous exercise, show how in each of the cases listed
one can conclude that there is a point common to all four of the sets A, B, C,
and D.
9.50 Continue one step further the argument used in the proof of Helly's
Theorem; that is, knowing that 𝒞 is a collection of convex sets in E² each
six of which have a common point, show that each seven sets from 𝒞 must
also have a common point.

9.5 KRASNOSELSKII'S THEOREM


We will now prove the Art Gallery Theorem itself. Watch for the point in
the proof at which Helly's Theorem is used.
Theorem 9.9 (Krasnoselskii's Theorem) If, for each three paintings hung in
an art gallery, there is a spot from which those three can be viewed simul-
taneously, then there is a spot in the gallery from which all the paintings can
be viewed.
Of course, this statement is a little imprecise; we can phrase the theorem
as M. A. Krasnoselskii did, in the language of plane convex sets.
Theorem 9.9 Rephrased Let P be a plane polygon. Suppose that, given any
three points a, b, and c on the boundary of P, there exists a point x ∈ P such
that [x, a] ∪ [x, b] ∪ [x, c] ⊂ P. Then Ker (P) ≠ ∅.
Fig. 9.9 Construction of the squares in the Art Gallery proof.

From the second form of the theorem, you can see that the art gallery
must be polygonal in shape and with vertical walls. There are certain other
differences in the statements given above for the Art Gallery Theorem, but
these differences can be resolved by a careful study of the following proof.
We prove the second form, of course.
Proof As shown in Fig. 9.9, we first give the sides of P a counterclockwise
orientation. For each side σ of P thus oriented, there is an unbounded
straight line λ containing σ, and the orientation of σ induces a like orientation
on λ. This orientation of λ makes it possible to distinguish the points on the
"left side" of λ from those on the right; just imagine yourself standing on
λ, like a tightrope walker, facing in the direction of the orientation; the
"left side" of λ consists of those points of E² to your left.

The plane set consisting of λ together with all points to the left of λ is
called the closed left half-plane determined by λ. In this half-plane use a
segment of λ to construct a square S with the following properties:

a) S has one side on the line λ;
b) S is a subset of the closed left half-plane determined by λ; and
c) S is so large as to contain all the points of P lying on or to the left of λ.
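
The test "on or to the left of an oriented line" used in this construction can be carried out with a single cross product. The Python sketch below is an illustration under our own naming (side_of_line is a hypothetical helper, not something from the text); the oriented line is given by two of its points, a and b, taken in the direction of the orientation.

```python
def side_of_line(a, b, p):
    """Classify p relative to the line through a and b, oriented from a to b."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if cross > 0:
        return "left"
    if cross < 0:
        return "right"
    return "on"

# A unit square with its sides oriented counterclockwise.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
sides = list(zip(square, square[1:] + square[:1]))
center = (0.5, 0.5)

print([side_of_line(a, b, center) for a, b in sides])
# ['left', 'left', 'left', 'left']
```

A point belongs to the closed left half-plane exactly when the answer is "left" or "on". For the square, the center lies to the left of every side, as the counterclockwise orientation leads one to expect.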

We construct such a square S for each side σ of P in exactly the above
fashion. We want to show that each three of these squares have a common
point. Let S₁, S₂, and S₃ be three such squares, and let a, b, and c be points
lying, respectively, on the corresponding sides σ₁, σ₂, and σ₃ of the polygon
P. By our hypotheses, there then exists a point x ∈ P such that

[x, a] ∪ [x, b] ∪ [x, c] ⊂ P.

But x must lie on the left side of each of the corresponding three lines
λ₁, λ₂, and λ₃. For if x were to lie to the right of (for example) λ₁, then
[x, a] would also lie to the right of λ₁, and thus there would be points of P
arbitrarily close to σ₁ but to the right of σ₁. This cannot happen, for the
counterclockwise orientation of the sides of P guarantees that no points of P
lie immediately to the right of any side of P.
Hence x must lie to the left of each of the lines λ₁, λ₂, and λ₃, and x ∈ P
too. Since each of the three squares S₁, S₂, and S₃ is constructed so as to
contain all points of P to the left of the lines used to construct these squares,
it follows that x belongs to all three of the squares S₁, S₂, and S₃. So each
of the possible triples of squares we have constructed has nonempty
intersection, and in fact, given three such squares, there is a point of P belonging
to all three of them.
Each square is a convex subset of E² (here we understand that a "square"
consists of the boundary together with its interior). So we have the following
situation: We have finitely many convex sets in E², namely the squares,
and each three of these sets has nonempty intersection. By Helly's Theorem,
there must be a point common to all the squares. We call this point q, and
now we wish to show that q ∈ P.
If q were not a point of P, we could draw a straight line segment from q
to some interior point r of P, and so arrange matters that [q, r] does not
intersect a vertex of P. Then, as one moves along [q, r] in the direction from
q to r, one must first encounter a point of some side σ of P, as shown in Fig.
9.10. Call that point t. Since [q, t] meets P only at the point t, then each
point of [q, t] must lie to the right of the side σ and thus to the right of the
line λ drawn through σ. In particular, q itself would have to lie to the right
of λ, and thus q could not be a point of the square S built on the line λ.
Fig. 9.10 The situation if q ∉ P.

This is impossible, since q belongs to every square, and hence we may con-
clude that q is, after all, a point of P.
If we can show that q ∈ Ker (P), this will establish the theorem, for it
will show that Ker (P) ≠ ∅. But suppose that q ∉ Ker (P). Then there
exists some point z ∈ P that q cannot "see"; that is, such that the segment
[q, z] does not lie entirely within P. Let y be a point of [q, z] not contained
in P, as shown in Fig. 9.11.
Then, as one travels along the segment [y, z] from y to z, one first meets
the boundary of P at some point w on a side σ of P. Hence [q, w] lies to the
right of the side σ, as there are points of [q, w] arbitrarily close to σ but not in
P. But then, q must lie to the right of the straight line λ through σ, and hence
q cannot belong to the square S built on λ. This is in contradiction to the
fact that q lies in each of the squares, and this contradiction establishes
that q ∈ Ker (P).
Therefore Ker (P) ≠ ∅, and this establishes Krasnoselskii's Theorem.
Actually, Krasnoselskii's Theorem is true for any plane figure bounded
by a closed curve, and the method of proof is quite similar. However, this
version of the theorem requires the use of a form of Helly's Theorem for the
case of the intersection of infinitely many convex sets.
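For readers who wish to experiment, the kernel of a polygon can be computed directly. The Python sketch below rests on a standard fact that the text does not prove: for a simple polygon whose vertices are listed counterclockwise, the kernel is the intersection of the closed left half-planes determined by its sides. The function names, the use of integer coordinates, and the two sample polygons are assumptions made for illustration.

```python
from fractions import Fraction

def cross(a, b, p):
    """Positive when p is to the left of the line oriented from a to b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def clip_left(region, a, b):
    """Part of the convex polygon `region` on or to the left of the line a -> b."""
    out = []
    for i, p in enumerate(region):
        q = region[(i + 1) % len(region)]
        cp, cq = cross(a, b, p), cross(a, b, q)
        if cp >= 0:
            out.append(p)
        if cp * cq < 0:                      # edge pq crosses the line
            t = Fraction(cp) / (cp - cq)     # exact crossing parameter
            out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return out

def kernel(polygon):
    """Corner points of the kernel of a counterclockwise simple polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    # Start from a box containing the polygon, then clip by every side.
    region = [(min(xs), min(ys)), (max(xs), min(ys)),
              (max(xs), max(ys)), (min(xs), max(ys))]
    n = len(polygon)
    for i in range(n):
        region = clip_left(region, polygon[i], polygon[(i + 1) % n])
        if not region:
            return []
    return region

L_SHAPE = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
U_SHAPE = [(0, 0), (3, 0), (3, 3), (2, 3), (2, 1), (1, 1), (1, 3), (0, 3)]
print(kernel(L_SHAPE))   # the unit square with corners (0,0), (1,0), (1,1), (0,1)
print(kernel(U_SHAPE))   # []
```

The L-shaped gallery has a nonempty kernel, while the U-shaped gallery has none: no single spot sees the tops of both arms.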
Fig. 9.11 Showing that q ∈ Ker (P).

Exercises

9.51 The proof of the Art Gallery Theorem is so long that it is difficult to
understand without a summary. Try summarizing the proof of the theorem;
make your summary as condensed as possible, omitting most reasons. For
example, you might begin like this:
"First orient the sides of P in a counterclockwise direction, then draw
an unbounded straight line through each side. On each such line, construct
a square such that ... "
9.52 How could you phrase the Art Gallery Theorem for a three-dimensional
gallery, with pictures hung in all sorts of directions from an observer?
Should the word "three" in the statement of the theorem then be replaced by
the word "four"?
9.53 Is it possible that although S is a nonempty subset of E², Ker (S) = ∅?
Give your reasons.
9.54 An unbounded straight line λ in E² divides E² into three sets: λ itself,
the points of E² on one side of λ, and the points of E² on the other side of λ
(the latter two sets are to contain no points of λ itself). The last two sets are
called the open half-planes determined by λ.

Prove that if S is a subset of E², then Hul (S) is the intersection of all
open half-planes containing S. (Remember that the intersection of an empty
collection of sets is all of E².)
9.55 Is it possible to divide E² into two disjoint convex sets whose union
is E²?
9.56 Let 𝒞 be a finite collection of rectangles (including boundary and
interior) in E² each of which has sides parallel to the coordinate axes. There
does exist a natural number k such that, in such a situation, if each k
rectangles from 𝒞 have a point in common, then there must be a point
common to all the rectangles in 𝒞. What is the least value of k for which
this is so? Hint: It is clear that the answer is either k = 2 or k = 3; but
why is this clear?
9.57 In Section 9.4, there was shown in Fig. 9.7 a collection of unbounded
straight lines in E² such that any two intersected, and no three had a point in
common. The figure shows lines passing through the points (n, 0) on the
x-axis (where n is a natural number), and the slope of the line through the
point (n, 0) was to be n itself. If you know some analytic geometry, this
will help in constructing a proof that no three of the lines have a point in
common. If not, see if you can find an alternative proof of this fact.
9.58 See Exercise 9.46. Give an example of four convex sets in E³ such that
each three have a point in common but such that there is no point common to
all four.
9.59 This is a version of Helly's Theorem for E² in which the word "three"
can be replaced by the word "two," but proving the following version is not
easy. Try it anyway.
Let 𝒞 be a finite collection of convex sets in E² such that each two sets in
𝒞 have a point in common. Then, for each point p ∈ E², there exists an
unbounded straight line λ passing through p and through all the sets in 𝒞.
9.60 Here is another difficult problem, in which Helly's Theorem can be used
to give the solution.
Suppose that the set S consists of n points in E², where n is a natural
number. Then there exists a point p in E² such that, for each unbounded
straight line λ passing through p, at least n/3 of the points of S lie in each of
the closed half-planes determined by λ.

9.6 L-CONVEXITY
One frequently fruitful approach to mathematics is the generalization of
previous results. Sometimes such generalizations actually simplify the theory
by removing extraneous details, and they sometimes show connections
between branches of mathematics that were thought to be unrelated. One
possible generalization of the idea of convexity is L-convexity. We still
restrict our attention to subsets of E².
Let us say that the subset K of E² is L-convex provided that, for each
two points x and y of K, there exists a point z of K such that [x, z] ∪
[z, y] ⊂ K.
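To see the definition at work, consider a region made of two overlapping rectangles arranged in the shape of the letter L. The Python sketch below is an illustration only; the particular rectangles, the candidate point z, and all function names are our own assumptions. It relies on one simple observation: a rectangle is convex, so a segment lies in K whenever both of its end points lie in the same rectangle. This is a sufficient test, not a necessary one.

```python
R1 = (0, 0, 3, 1)   # (xmin, ymin, xmax, ymax): the horizontal arm of K
R2 = (0, 0, 1, 3)   # the vertical arm of K

def inside(rect, p):
    xmin, ymin, xmax, ymax = rect
    return xmin <= p[0] <= xmax and ymin <= p[1] <= ymax

def in_K(p):
    """Membership in K, the union of the two rectangles."""
    return inside(R1, p) or inside(R2, p)

def segment_in_one_arm(p, q):
    """Sufficient test for [p, q] to lie in K: both end points in one rectangle."""
    return any(inside(r, p) and inside(r, q) for r in (R1, R2))

def elbow(x, y, candidates=((0.5, 0.5),)):
    """Return z with [x, z] and [z, y] both in K, or None if no candidate works."""
    for z in candidates:
        if segment_in_one_arm(x, z) and segment_in_one_arm(z, y):
            return z
    return None

x, y = (3, 1), (1, 3)
mid = ((x[0] + y[0]) / 2, (x[1] + y[1]) / 2)
print(in_K(mid))     # False: the midpoint of [x, y] is outside K
print(elbow(x, y))   # (0.5, 0.5): yet x and y are joined through z
```

Because the candidate point lies in both rectangles, it serves as z for every pair of points of K, so this particular K is L-convex even though the segment [x, y] shows that it is not convex.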
The name "L-convex" comes, of course, from the idea that each two
points of K can be joined by a vaguely L-shaped figure lying entirely within K.
Every convex set is L-convex, but the converse is easily shown to be false.
So we do have here a generalization of the idea of convexity, and a number
of our previous theorems still hold after appropriate modifications have
been made. For example, even without modification, one can prove the
following two theorems.
Theorem 9.10 The union of a tower of L-convex sets is L-convex.

Theorem 9.11 Let K be a subset of E² and let p be a point of K. Then there
is an L-convex subset of K maximal with respect to containing the point p.

The L-convex kernel of a set has a very natural definition: If K is a subset
of E², then the point x of K is said to belong to the L-convex kernel of K
provided that, for each y ∈ K, there exists z ∈ K such that [x, z] ∪ [z, y] ⊂
K. We abbreviate the L-convex kernel of K by L-Ker (K).
At this point, the professional mathematician would raise questions
such as the following:
Need Ker (K) ⊂ L-Ker (K)?
Need L-Ker (K) be L-convex?
Need L-Ker (K) equal the intersection of the maximal L-convex subsets
of K?
Unfortunately, it is not so easy to define an L-convex hull for the set K.
We would want the L-hull to be L-convex, by analogy with properties of the
convex hull. But the "obvious" approach of forming the L-hull by inter-
secting all L-convex sets containing K does not work-as you will see if you
let K consist of three sides of a square in the plane. The real difficulty seems
to be that the intersection of L-convex sets need not be L-convex.
On the other hand, there are some constructions with L-convex sets
that do not work for convex sets. Given a subset S of E² and a point p ∈ E²,
we can form the join of p and S, denoted by p # S, as follows:

p # S = ∪ {[p, s] | s ∈ S}.
It would seem plausible that if S is any subset of E², then p # S would
be convex. Unfortunately, this is not the case. However, it is true that p # S
is L-convex. We leave further development of these ideas to the exercises.

Exercises

9.61 Give an example of an L-convex subset of E² which is not convex.
9.62 Prove that every convex set in E² is L-convex.
9.63 Prove Theorem 9.10: That the union of a tower of L-convex sets is
L-convex.
9.64 Prove Theorem 9.11: That if K is a subset of E² and p is a point of K,
then there exists an L-convex subset of K maximal with respect to
containing p.
9.65 Is it true that for every subset K of E², Ker (K) ⊂ L-Ker (K)?
9.66 Is it true that for every subset K of E², Ker (K) = L-Ker (K)?
9.67 Suppose that K is a polygonal region in E². Can you show that
L-Ker (K) must be L-convex? (So far as the author knows, this problem is
unsolved.)
9.68 Suppose that K is an arbitrary subset of E². Need it be true that
L-Ker (K) is L-convex? (Hint: See the previous exercise.) Explain.
9.69 Suppose that K is a subset of E². Need L-Ker (K) be equal to the
intersection of the maximal L-convex subsets of K? Give your reasons.
9.70 Need L-Ker (K) contain the intersection of the maximal L-convex
subsets of K for every subset K of E²? Why?
9.71 Let K consist of three sides of a square in E². Show that K is not
L-convex, and that the intersection of all L-convex sets containing K is
in fact equal to K.
9.72 Give an example of two L-convex sets in E² whose intersection is not
L-convex.
9.73 Give an example of a subset S of E² and a point p ∈ E² such that
p # S is not convex.
9.74 Prove that if S is a subset of E² and p is a point of E², then p # S
is L-convex.
9.75 Show that if C is a circle in the plane, then the closed circular disk
with boundary C is not a minimal L-convex set containing C.
9.76 Formulate an alternative generalization of the idea of convexity in the
plane.
9.77 Is the complement of a circular disk (containing its boundary) in E²
L-convex?
9.78 Let P be a polygonal region in E². Suppose that each two points
x and y of the boundary of P can be joined by the segment [x, y] with
[x, y] ⊂ P. Does it follow that P must be convex?
9.79 Let P be a polygonal region in $E^2$. Suppose that for each two points
x and y on the boundary of P, there exists a point z of P such that
$[x, z] \cup [z, y] \subset P$. Does it follow that P must be L-convex?
9.80 Is it true that the subset S of $E^2$ is L-convex if and only if $\mathrm{Ker}(S) \neq \emptyset$?

NOTES AND REFERENCES


The books by Hadwiger, Debrunner, and Klee and by Yaglom and
Boltyanskii listed below are collections of problems on convexity and
related topics, with discussion material liberally interspersed, and with the
level of the material not excessively high (with certain exceptions). The
books by Valentine and Grünbaum are rather advanced. The first study
of L-convexity known to the author can be found in the paper, "Some
properties of L sets in the plane," by Alfred Horn and F. A. Valentine,
published in Volume 16 (1949) of the Duke Mathematical Journal.
Benson, R., Euclidean Geometry and Convexity (McGraw-Hill, 1966).
Grünbaum, B., Convex Polytopes (Interscience, 1967).
Hadwiger, H., Debrunner, H., and Klee, V., Combinatorial Geometry in
the Plane (Holt, Rinehart, and Winston, 1964).
Lyusternik, L., Convex Figures and Polyhedra, translated by T. Jefferson
Smith (Dover Publications, 1963).
Valentine, F., Convex Sets (McGraw-Hill, 1964).
Yaglom, I. and Boltyanskii, V., Convex Figures, translated by Kelly and
Walton (Holt, Rinehart, and Winston, 1961).
CHAPTER 10

THE REAL NUMBER SYSTEM

We shall begin with the assumption that you already clearly understand the
system $(Q, \cdot, +, <)$ of rational numbers with ordinary multiplication and
addition and the usual order relation. We shall examine four main topics:
1) the inadequacies of the rational number system,
2) remedying such inadequacies by construction of the real number system,
3) the significance of decimal expansions for real numbers, and
4) some unusual and important properties of the real number system.

10.1 THE RATIONAL NUMBERS


The system of rational numbers is quite adequate, of course, for solving
many kinds of equations, such as
2x + 3 = 7 - 5x,
or even systems of simultaneous equations in more than one unknown;
for example,
5x + 3y - 2z = 0,
2x = 3z,
y/2 + 9x = 1001.
Moreover, it does appear that every "quantity" encountered can, in many
circumstances, be approximated numerically to within any desired degree
of accuracy by rational numbers; we have in mind "quantities" such as
length, weight, volume, velocity, and even the famous rational approximation
22/7 for $\pi$.

Fig. 10.1 The number $\sqrt{2}$ measures a length.

However, $\pi$ is not equal to 22/7, and such a simple equation as
$$x^2 = 2$$
cannot be solved using only rational numbers. But this equation certainly
ought to have an exact solution, for the positive "number" x such that $x^2 = 2$
can easily be visualized as the exact length of the hypotenuse of an isosceles
right triangle of leg length 1, as in Fig. 10.1. We see as a result of our first
theorem that this length cannot be a rational number.
Theorem 10.1 No rational number is a solution of the equation
$$x^2 = 2.$$
Proof We will use Theorem 7.9, the Fundamental Theorem of Arithmetic,
to give the simplest (but by no means the only) proof.
Suppose by way of contradiction that there were a rational number r
such that $r^2 = 2$. Whether or not r is positive, since it is rational there must
(by definition of "rational number") exist integers m and n with $n \neq 0$ such
that
$$r = \frac{m}{n}.$$
Hence
$$r^2 = \frac{m^2}{n^2}.$$
But $r^2 = 2$, and so
$$2 = \frac{m^2}{n^2},$$
and hence
$$2n^2 = m^2.$$
Now m and n are integers, but here we are concerned only with their
squares, so we may suppose that both m and n are positive. By the
Fundamental Theorem of Arithmetic, each of m and n has a unique prime
factorization, and so do $m^2$ and $n^2$. The prime factorization of m has some
number (possibly zero) of 2's in it, and hence $m^2$ has one factorization in
which twice as many 2's appear. That is, $m^2$ has a factorization with an even
number of 2's. Since this factorization is unique, we see that the prime
factorization of $m^2$ must contain an even number of 2's.
Similarly, the prime factorization of $n^2$ contains an even number of 2's.
So one factorization of $2n^2$ contains an odd number of 2's; again, this prime
factorization is unique, so the prime factorization of $2n^2$ must contain an
odd number of 2's.
But we have the equation
$$2n^2 = m^2.$$
We have seen that if the term $2n^2$ is factored into primes an odd number of
2's must appear in that factorization, while if $m^2$ is factored into primes
this factorization must contain an even number of 2's. Hence the natural
number with the two names above (the two names $2n^2$ and $m^2$) has two
different prime factorizations, one with an even number of 2's and one with
an odd number of 2's. This is in contradiction to the Fundamental Theorem
of Arithmetic, since that theorem guarantees a unique prime factorization
of each natural number. This contradiction means that our original
supposition, that the equation
$$x^2 = 2$$
has a rational solution, must be false, and hence no rational number can
solve the above equation. In other words, the square root of 2 is irrational,
our term for a real number (whatever that is) that is not rational.
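The parity argument in the proof can be spot-checked numerically. The following Python sketch is an illustration only and no substitute for the proof; the helper names `count_twos` and `parity_check` are our own. It counts the 2's in the prime factorizations of the square of k and of twice that square, for a range of natural numbers k.

```python
def count_twos(k: int) -> int:
    """Number of 2's in the prime factorization of the natural number k."""
    count = 0
    while k % 2 == 0:
        k //= 2
        count += 1
    return count


def parity_check(limit: int) -> bool:
    """True if, for every k from 1 to limit, k*k contains an even number
    of 2's and 2*k*k contains an odd number of 2's."""
    for k in range(1, limit + 1):
        if count_twos(k * k) % 2 != 0:
            return False
        if count_twos(2 * k * k) % 2 != 1:
            return False
    return True


print(parity_check(1000))
```

Because a square always carries an even number of 2's and twice a square always carries an odd number, no square can equal twice a square, which is exactly what the proof asserts.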
Since there then exist lengths which are not rational numbers (or, if
you prefer, there are polynomials such as $p(x) = x^2 - 2$ which have no
rational zeroes), it thus seems reasonable that there do exist numbers in
addition to the rational numbers, numbers yet to be discovered. (If you prefer,
you may say that such numbers have yet to be constructed. It depends on
whether you think a mathematician is an explorer or an inventor.)
An alternative formulation of the above situation is this: If S is the set
of all positive rational numbers with squares no less than 2; that is,
$$S = \{r \in Q^+ \mid r^2 \geq 2\},$$
(where $Q^+$ denotes the set of all positive rationals) then the set S contains
no smallest element. The method of showing this will be outlined in one of
the exercises at the end of this section.
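To see concretely why S can have no smallest element, one can experiment with the step used in Exercise 10.14, which replaces r by r - (r^2 - 2)/(2r). The Python sketch below uses exact rational arithmetic from the standard `fractions` module; the function names `smaller_element` and `descend` are our own, chosen only for this illustration.

```python
from fractions import Fraction


def smaller_element(r: Fraction) -> Fraction:
    """Given a positive rational r whose square exceeds 2, return
    s = r - (r*r - 2) / (2*r), the candidate from Exercise 10.14."""
    return r - (r * r - 2) / (2 * r)


def descend(r: Fraction, steps: int) -> list:
    """Apply smaller_element repeatedly, collecting the elements found."""
    found = [r]
    for _ in range(steps):
        r = smaller_element(r)
        found.append(r)
    return found


chain = descend(Fraction(3, 2), 4)
for bigger, smaller in zip(chain, chain[1:]):
    # each new element is positive, smaller, and still has square above 2
    assert 0 < smaller < bigger and smaller * smaller > 2
print(chain[:3])
```

Starting from 3/2, the first two steps give 17/12 and 577/408. Each is a member of S smaller than the one before, so no member of S reached this way can be the smallest.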
But you have undoubtedly approximated $\sqrt{2}$ by a sequence of numbers
such as
1.4, 1.41, 1.414, 1.4142, 1.41421, ....
And you have likely carried out the above decimal approximations sufficiently
far to obtain the necessary accuracy in the context of the problem you are
working. It is not difficult to find what such a sequence is tending to, if
there is some regularity or periodicity in the terms of the sequence. For
example, the sequence
0.3, 0.33, 0.333, 0.3333, 0.33333, ...
can easily be seen to be tending toward the number whose decimal expansion
is
0.333 333 333 ... ,
and the "value" of this decimal can be calculated using some elementary
knowledge about geometric series. The above nonterminating decimal is
actually just an abbreviation for
$$\frac{3}{10} + \frac{3}{100} + \frac{3}{1000} + \cdots,$$
which is a geometric series with ratio 1/10 (since each term is 1/10 the
previous) and first term 3/10. If the ratio is between -1 and 1, the sum of
such a series is given by the formula
$$\frac{a}{1 - r},$$
where a is the first term of the series and r is its ratio; in the case of the series
for the decimal 0.333 333 333 ... , we find its value to be
$$\frac{3/10}{1 - 1/10} = \frac{1}{3}.$$
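The geometric series computation can be checked with exact fractions. In the sketch below, `partial_sum` is a hypothetical helper of our own that adds the first few terms of a geometric series; we compare the partial sums with the closed form a/(1 - r) for a = 3/10 and r = 1/10.

```python
from fractions import Fraction


def partial_sum(a: Fraction, r: Fraction, terms: int) -> Fraction:
    """Sum of the first `terms` terms a + a*r + a*r^2 + ... of a geometric series."""
    total = Fraction(0)
    term = a
    for _ in range(terms):
        total += term
        term *= r
    return total


a, r = Fraction(3, 10), Fraction(1, 10)
closed_form = a / (1 - r)          # the formula a / (1 - r)

for terms in (1, 2, 3, 6):
    s = partial_sum(a, r, terms)
    print(terms, s, closed_form - s)   # the gap shrinks by a factor of 10 per term
```

The partial sums are exactly the decimals 0.3, 0.33, 0.333, and so on, and the gap between each of them and 1/3 is one tenth of the previous gap.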
Since no periodicity in the decimal expansion
1.414 213 562 ...
is apparent, such techniques cannot be applied, so the question naturally
arises at this point as to what the sequence
1.4, 1.41, 1.414, 1.4142, 1.41421, ...
has as its "value" toward which it is tending; that is, what is the "limit"
of the above sequence of rational numbers. Thus, the construction of the
real numbers can be thought of as giving a meaning to every possible decimal
expansion, and interpreting such a decimal expansion as measuring some
length.
Our next theorem may give some insight into the structure of the rational
number system; it almost says that the rational numbers are sufficient for
approximation of any quantity to any desired degree of accuracy.
Theorem 10.2 Let r and s be rational numbers with r < s. Then there are
infinitely many rational numbers between r and s.
Proof Choose integers m and n such that 0 < m < n. Then
$$0 < \frac{m}{n} < 1.$$
Since r < s, s - r is positive. Also s - r is rational, as you will be asked
to establish in the exercises, and so we have
$$0 < \frac{m}{n}(s - r) < s - r,$$
and hence
$$r < r + \frac{m}{n}(s - r) < s.$$
Since m/n and s - r are rational, so is their product, and so is the sum
$$r + \frac{m}{n}(s - r).$$
Again, the details are left for the exercises. But there are infinitely many
choices of integers m and n such that 0 < m < n, and m and n can be chosen
so as to give infinitely many different values of m/n, and thus infinitely many
different values of
$$r + \frac{m}{n}(s - r),$$
all of which lie between r and s. This establishes the theorem.
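The construction in this proof is easy to carry out mechanically. The following sketch (the function name `between` and the particular choice m = 1, n = 2, 3, 4, ... are our own assumptions; any choices with 0 < m < n giving distinct values of m/n would do) produces as many distinct rationals between r and s as requested.

```python
from fractions import Fraction


def between(r: Fraction, s: Fraction, count: int) -> list:
    """Return `count` distinct rationals strictly between r and s (with r < s),
    each of the form r + (m/n)(s - r) with m = 1 and n = 2, 3, ..., count + 1."""
    if not r < s:
        raise ValueError("need r < s")
    return [r + Fraction(1, n) * (s - r) for n in range(2, count + 2)]


values = between(Fraction(1, 3), Fraction(1, 2), 5)
print(values)
```

Since `count` may be taken as large as we please, no finite list can exhaust the rationals between r and s.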
Fig. 10.2 (number line showing the points -(3/2), 7/4, and 3)

By virtue of this theorem, it would appear that if the rational numbers
were indicated as points on an unbounded straight line, located according to
their values as indicated in Fig. 10.2, there could be no gaps in this line.
But we have seen that $\sqrt{2}$ is not a rational number, yet it could be located on
this line by a method such as that shown in Fig. 10.2. So the rational number
system does, after all, contain gaps, which we intend to fill with numbers we
will eventually call the irrational real numbers.

Exercises

10.1 Let a, b, c, and d be rational numbers. Show that the equation
$$ax + b = cx + d$$
has either no solution or a rational solution.
10.2 Let a and b be rational numbers. Show that the numbers
$$a + b, \quad a - b, \quad a \cdot b$$
are rational, and that a/b is rational if $b \neq 0$.
10.3 The number $\pi$ represents a length. What length? It also represents an
area. What area?
10.4 The value of $\pi$ accurate to 30 decimal places is
$$\pi = 3.14159\ 26535\ 89793\ 23846\ 26433\ 83279.$$
Can this information be used to show that $\pi$ is not a rational number? How?
10.5 Can the decimal expansion of $\pi$ be used to show that
$$\pi \neq \frac{22}{7}\,?$$
10.6 Give a better rational approximation to $\pi$ than 22/7.
10.7 For those who have studied Chapter 3. Use continued fractions and the
information in Exercise 10.4 to find the "next better" rational approximation
to $\pi$.
10.8 Show that $\sqrt{3}$ is irrational; that is, that the equation $x^2 = 3$ has no
rational solution x. Hint: Use the technique of Theorem 10.1.
10.9 Give the reasons for each step in the following alternative proof that
$\sqrt{2}$ is not rational.
Suppose that $\sqrt{2}$ is rational. Then $1 < \sqrt{2} < 2$ (why?). So there exist
natural numbers m and n, with $n \neq 1$, such that
$$\sqrt{2} = \frac{m}{n}$$
(why?) and the fraction m/n is in lowest terms. Hence
$$2 = \frac{m^2}{n^2},$$
and so
$$2n^2 = m^2.$$
Hence $m^2$ is even (why?), so m is even (why?). So m = 2r where r is a
natural number (why?). So
$$2n^2 = 4r^2$$
(why?), thus
$$n^2 = 2r^2$$
(why?). So $n^2$ is even (why?), and so n is even (why?). This is a contradiction
(to what?). Therefore $\sqrt{2}$ is not rational.
10.10 For those who have studied Chapter 3. The continued fraction
expansion of $\sqrt{2}$ is given by
$$\sqrt{2} = (1; 2, 2, 2, 2, \ldots),$$
as seen in Section 3.3. How can this fact be used to prove that $\sqrt{2}$ is irrational?
10.11 If the techniques of Theorem 10.1 are used in a (vain) attempt to prove
$\sqrt{4}$ is not rational, no contradiction is reached. Why not?
10.12 Prove that if p is prime, then $\sqrt{p}$ is not rational.
10.13 Exactly which natural numbers n have the property that $\sqrt{n}$ is not
rational?
10.14 Here is an outline of how to show that the set
$$S = \{r \in Q^+ \mid r^2 \geq 2\}$$
contains no smallest element. Fill in the details.
First, if $r \in S$ then $r^2 > 2$ (why?). Suppose, then, that r is the smallest
element of S. Let
$$s = r - \frac{r^2 - 2}{2r}.$$
Then 0 < s (why?), and $s^2 > 2$ (why?). Hence $s \in S$, since s is rational
(why?). But s < r (why?). This is a contradiction (to what?), and hence S
contains no smallest element (why?).
10.15 Express
0.888 888 888 ...

as a rational number; that is, in the form m/n, where m and n are integers
and $n \neq 0$.
10.16 Express
0.327 327 327 ...
as a rational number.
10.17 Express
0.999 999 999 ...
as a rational number.
10.18 Prove that $\sqrt{6}$ is not a rational number.
10.19 Prove that $\sqrt{2} + \sqrt{3}$ is not a rational number. Hint: Begin much as
in the proof of Theorem 10.1. Then use Exercise 10.18.
10.20 Prove that $\sqrt{2} + \sqrt{3}$ is not a solution of any equation of the form
$$ax^2 + bx + c = 0,$$
where a, b, and c are rational numbers. Hint: There is no loss of generality
in supposing that a = 1. Why not?
This exercise together with the previous one shows that there are irrational
numbers not solutions to any quadratic equation with rational coefficients.
10.21 Prove that no rational number is a solution of the equation
$$x^3 = 2.$$
10.22 Prove that the equation

has no rational solution x.


10.23 Prove that if $\alpha$ is not a rational number and r is a rational number,
then $\alpha + r$ is not a rational number.
10.24 Prove that if $\alpha$ is not a rational number and r is a rational number
other than 0, then $\alpha \cdot r$ is not a rational number.
10.25 If a, b, and c are integers with $a \neq 0$, and the equation
$$ax^2 + bx + c = 0$$
has two rational solutions, what can be said about the relationship between
the numbers a, b, and c?

10.2 NESTED INTERVALS OF RATIONAL NUMBERS


Let a and b be rational numbers with a < b. By the interval [a, b] we mean
the set
$$[a, b] = \{x \in Q \mid a \leq x \leq b\}.$$
That is, [a, b] consists of all rational numbers between a and b, including
a and b. As we have seen in Theorem 10.2, each such set contains infinitely
many rational numbers.
If [a, b] is given, then we denote its length by $\lambda([a, b])$, which is given by
the formula
$$\lambda([a, b]) = b - a.$$
Thus if I is an interval of rational numbers, then $\lambda(I)$ is a positive rational
number, and if $J \subset I$ and J is an interval of rational numbers, then
$$\lambda(J) \leq \lambda(I).$$
In fact, if J is a proper subset of I, then $\lambda(J) < \lambda(I)$, a fact whose proof
is outlined in the next set of exercises.
We shall be concerned with sequences of such intervals of the form
$$I_1, I_2, I_3, \ldots$$
where each interval contains the one immediately after it in the above
sequence, and where the sequence of numbers
$$\lambda(I_1), \lambda(I_2), \lambda(I_3), \ldots$$
is approaching zero. We need a precise definition of this last concept,
preceded only by a definition for convenience.
If r is a rational number, then we denote by |r| the absolute value of r,
and |r| has the value
$$|r| = r \quad \text{if } r \geq 0,$$
$$|r| = -r \quad \text{if } r < 0.$$
Thus if r is rational, |r| is nonnegative, and just measures the distance
from r to 0, when r is thought of as located in its natural position on an
unbounded straight line.
Let
$$s_1, s_2, s_3, \ldots$$
be a sequence of rational numbers, one for each natural number n. The
sequence $\{s_n\}$ is said to approach 0, or have limit 0, provided that given any
positive number $\varepsilon$, no matter how small, there exists a natural number k
such that, for all natural numbers n > k,
$$|s_n| < \varepsilon.$$
This definition does not mean merely that as n increases, the numbers $s_n$
get closer and closer to zero.

Example 10.1 For each n, let $s_n = 1 + (1/n)$. Thus we have
2, 3/2, 4/3, 5/4, 6/5, 7/6, ...
as indicated in Fig. 10.3. As n increases, the numbers $s_n$ are indeed getting
closer and closer to zero. But the sequence $\{s_n\}$ is not approaching zero.
To see this, let $\varepsilon = 1/4$. No matter what value of k is chosen, if n > k then
$$|s_n| = 1 + \frac{1}{n} > 1,$$
and hence $|s_n|$ is not less than $1/4 = \varepsilon$. Hence it does not have zero for a
limit, although this sequence is getting closer and closer to zero.
The definition of the limit of a sequence also does not mean that the terms
of the sequence must get steadily closer and closer to the limit, but only
that they tend to get closer "on the average." For consider the next example:
Example 10.2 For each n, let
$$s_n = \frac{1}{n} \quad \text{if } n \text{ is odd},$$
$$s_n = \frac{1}{2^n} \quad \text{if } n \text{ is even}.$$
Fig. 10.3 The sequence gets closer to 0 but does not have limit 0.

Fig. 10.4 The sequence has limit 0 though it does not steadily get closer to 0.

Thus we have the sequence
$$1, \frac{1}{4}, \frac{1}{3}, \frac{1}{16}, \frac{1}{5}, \frac{1}{64}, \frac{1}{7}, \ldots$$
as indicated in Fig. 10.4. The terms of this sequence do not steadily get
closer and closer to zero; however, the sequence does have limit zero. Here
is one proof of this fact; note how we follow exactly the pattern of the
definition.
Let $\varepsilon > 0$ be given. Since to this point, only rational numbers "exist,"
$\varepsilon$ itself is rational, and thus has the form
$$\varepsilon = \frac{a}{b},$$
where a and b are natural numbers. Let k = b + 1. Suppose that n > k.
For n odd,
$$|s_n| = \frac{1}{n} \leq \frac{1}{n},$$
and for n even,
$$|s_n| = \frac{1}{2^n} \leq \frac{1}{n}.$$
Hence, if n > k, then
$$|s_n| \leq \frac{1}{n} < \frac{1}{k} < \frac{1}{b} \leq \frac{a}{b} = \varepsilon.$$
That is, given $\varepsilon > 0$, we have shown the existence of a natural number k
such that for each n > k,
$$|s_n| < \varepsilon.$$
Therefore, by definition, the given sequence $\{s_n\}$ has limit zero.
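The choice k = b + 1 made in this proof can be tried out on particular values of the tolerance. The sketch below is only a finite spot check (a program cannot test all n > k), and the names `s`, `cutoff`, and `holds_up_to` are our own. Note that Python's `Fraction` stores the tolerance in lowest terms, so b here is the reduced denominator; the argument of the proof works for any representation a/b.

```python
from fractions import Fraction


def s(n: int) -> Fraction:
    """Term n of Example 10.2: 1/n for odd n, 1/2^n for even n."""
    return Fraction(1, n) if n % 2 == 1 else Fraction(1, 2 ** n)


def cutoff(eps: Fraction) -> int:
    """The k of the proof: write eps = a/b and take k = b + 1."""
    return eps.denominator + 1


def holds_up_to(eps: Fraction, last: int) -> bool:
    """Check |s_n| < eps for every n with cutoff(eps) < n <= last."""
    k = cutoff(eps)
    return all(abs(s(n)) < eps for n in range(k + 1, last + 1))


print([s(n) for n in range(1, 8)])          # the terms do not decrease steadily
print(cutoff(Fraction(1, 4)), holds_up_to(Fraction(1, 4), 500))
```

The first line of output shows the term for n = 3 exceeding the term for n = 2, while the second confirms that beyond the cutoff every term tested lies below the tolerance.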
An alternative way of putting the definition of the sequence $\{s_n\}$ having
limit zero is this: No matter how small an interval of the form $(-\varepsilon, \varepsilon)$ is
chosen around zero, from some point on all the terms of the sequence $\{s_n\}$
lie within that interval. This phenomenon is illustrated in Fig. 10.5.
We shall, in essence, construct the real numbers by locating their position
using sequences of closed intervals of rational numbers. A typical sequence
for locating $\sqrt{2}$ would be
$I_1 = [1, 2]$,
$I_2 = [1.4, 1.5]$,
$I_3 = [1.41, 1.42]$,
$I_4 = [1.414, 1.415]$,
$I_5 = [1.4142, 1.4143]$, ...
Fig. 10.5 All terms of the sequence from $s_{24}$ on lie in the interval $(-\varepsilon, \varepsilon)$.

Fig. 10.6 Locating $\sqrt{2}$ with a sequence of closed intervals of rational numbers.

These intervals are shown in Fig. 10.6. The behavior of these intervals is
such that only the number $\sqrt{2}$ can belong to all of them. However, since
$\sqrt{2}$ has not yet been "constructed," we cannot speak of the one number
belonging to all of these intervals; indeed, in our context of the rational
numbers and only the rational numbers, there is no number which belongs
to all the above intervals. Hence what we will do is simply say that the number
$\sqrt{2}$ is the sequence $\{I_n\}$. This is a subtle and ingenious idea, and the way it
succeeds is aesthetically very pleasing.
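Decimal intervals of this kind can be generated using nothing but integers and rational numbers. The sketch below is one possible way to do so, not the only one; the function names are our own, and we rely on `math.isqrt` (integer square root, available in Python 3.8 and later) so that no "real" square root is ever computed. For each n the interval has width one unit in decimal place n - 1, its left endpoint has square below 2, and its right endpoint has square above 2.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_interval(n: int):
    """Interval n: [a, a + 10^-(n-1)], where a is the largest rational with
    n - 1 decimal places whose square does not exceed 2."""
    scale = 10 ** (n - 1)
    numerator = isqrt(2 * scale * scale)      # largest integer j with j*j <= 2*scale^2
    return Fraction(numerator, scale), Fraction(numerator + 1, scale)


def is_nested_prefix(terms: int) -> bool:
    """Check, for the first `terms` intervals, that each pins down the point
    where squares cross 2 and that each properly contains the next."""
    for n in range(1, terms + 1):
        a, b = sqrt2_interval(n)
        c, d = sqrt2_interval(n + 1)
        if not (a * a < 2 < b * b):
            return False
        if not (a <= c and d <= b and (a, b) != (c, d)):
            return False
    return True


for n in range(1, 6):
    a, b = sqrt2_interval(n)
    print(n, float(a), float(b), b - a)
print(is_nested_prefix(20))
```

The lengths printed in the last column are 1, 1/10, 1/100, and so on, which illustrates a sequence of lengths with limit zero.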
What is necessary to make this idea succeed is the behavior of the above
sequence of intervals. The significant characteristics we need are these given
below, which you can check in the above example.
a) For each n, $I_{n+1}$ is a proper subset of $I_n$.
b) The sequence $\{\lambda(I_n)\}$ has limit zero.
We abstract such behavior to obtain our next definition.
A nested sequence of closed intervals of rational numbers is a sequence
$$I_1, I_2, I_3, \ldots$$
of closed intervals of rational numbers such that
a) for each n, $I_{n+1}$ is a proper subset of $I_n$; and
b) the sequence $\{\lambda(I_n)\}$ has limit zero.
Here is an outline of the construction of the real number system; the
details will be forthcoming.
First, we define what it means for two sequences of nested intervals to be
the same. (This should not be surprising, that two different-looking such
sequences could be considered the same; after all, the rational number 1/2 has
many different symbols, such as 2/4, 3/6, 4/8, .... In the same way, different-
looking sequences of nested intervals will be considered as different names
for the same real number.) We then define a real number to be a sequence of
nested intervals of rational numbers. (Technically, here, a real number will
be the set of all "same" sequences of nested intervals of rational numbers.)
We show that every rational number is, in this sense, a real number, so that
$Q \subset R$, where R denotes the set of all real numbers.
Next, we define addition, multiplication, and the order relation for R,
and show that these are the same as those already extant in Q. We also
show that all the familiar axioms concerning these operations hold true in R.
Finally, we show that when the elements of R are thought of as points
on an unbounded straight line, located according to their value, then there
are no gaps in this line; in particular, $\sqrt{2} \in R$. And we show that if we repeat
the above process, using sequences of nested intervals of real numbers, no
new numbers are obtained.
Exercises

10.26 If r is a rational number and $\{r_n\}$ is a sequence of rational numbers,
how would you define the statement "The sequence $\{r_n\}$ has limit r"?
10.27 Let $\{s_n\}$ be the sequence

1, - (1/2), 1/3, - (1/4), 1/5, ....

Prove that $\{s_n\}$ has limit zero.


10.28 Suppose that I and J are closed intervals of rational numbers, and J
is a proper subset of I. Fill in the details of the following outline of a proof
that $\lambda(J) < \lambda(I)$.
Let I = [a, b] and J = [c, d]. Since $J \subset I$, $a \leq c$ and $d \leq b$ (why?).
Suppose that a = c. Then we must have d < b (why?). Similarly, if d = b,
then a < c. Hence either a < c or d < b.
Now $\lambda(J) = d - c$ and $\lambda(I) = b - a$ (why?). If $\lambda(J) \geq \lambda(I)$, we
obtain a contradiction (how, and to what?). Therefore $\lambda(J) < \lambda(I)$.
10.29 Remove as many absolute value signs as possible from each of the
expressions below, without changing the value of each expression.

a) |19|     b) |-7|
c) |0|      d) |x|
e) |-x|     f) ||x||
10.30 Prove that the sequence

1, 1/4, 1/9, 1/16, 1/25, 1/36, ...


has limit zero.
10.31 What is the limit of the sequence in Example 10.1? Use your answer
to Exercise 10.26 to prove that your answer to this exercise is correct.
10.32 What is the limit of the sequence

1, 0, 1, 0, 1, 0, 1, 0, ... ?

Prove that your answer is correct.


10.33 Let r be a rational number such that |r| < 1. Prove that the sequence
$$r, r^2, r^3, r^4, \ldots$$
has limit zero.
10.34 Let a and r be rational numbers with $r \neq 1$, and n a natural number. Show that
$$a + ar + ar^2 + ar^3 + \cdots + ar^n = \frac{a(1 - r^{n+1})}{1 - r}.$$
10.35 If the sum of the infinite geometric series
$$a + ar + ar^2 + ar^3 + \cdots$$
is defined to be the limit of the sequence
$$a, \quad a + ar, \quad a + ar + ar^2, \quad a + ar + ar^2 + ar^3, \quad a + ar + ar^2 + ar^3 + ar^4, \quad \ldots,$$
prove that the sum of the infinite series above is then given by the formula
$$\frac{a}{1 - r},$$
if |r| < 1. Hint: Use the two previous exercises.
Now evaluate the sum of the infinite series
$$1/2 + 1/4 + 1/8 + 1/16 + 1/32 + \cdots.$$
10.3 CONSTRUCTION OF THE REAL NUMBERS


Let $\xi$ and $\eta$ each be a sequence of nested intervals of rational numbers. Then
$$\xi = \{I_1, I_2, I_3, \ldots\},$$
and
$$\eta = \{J_1, J_2, J_3, \ldots\},$$
where each $I_n$ and each $J_n$ is a closed interval of rational numbers. We say
that $\xi$ and $\eta$ are equivalent, and write $\xi = \eta$, provided that, for every
combination of natural numbers m and n, there is a point common to $I_m$ and $J_n$.
In other words, each interval in $\xi$ overlaps each interval in $\eta$. This is
just a way of saying that the two sequences $\xi$ and $\eta$ are "zeroing in" on the
same point. This point will be the eventual location of the real number
which, by virtue of the above definition, $\xi$ and $\eta$ are two names for. For if
some interval $I_m$ is disjoint from some interval $J_n$, then all the intervals after
$I_m$ in $\xi$ will be disjoint from all intervals after $J_n$ in $\eta$, and there will be a
positive distance between the point "defined" by $\xi$ and the point "defined"
by $\eta$.
As an illustration of the above definition, suppose that
$$I_n = \left[1 - \frac{1}{n},\ 1 + \frac{1}{n}\right]$$
and
$$J_n = \left[1 - \frac{1}{n},\ 1\right]$$
for each natural number n. If $\xi = \{I_n\}$ and $\eta = \{J_n\}$, then $\xi = \eta$; the reason
is that every interval $I_n$ as well as every interval $J_n$ contains the number 1,
and hence for every combination of natural numbers m and n, $I_m$ overlaps
$J_n$ in at least the number 1.

Fig. 10.7 A contradiction is reached because all the intervals $J_t$ have length at
least s - r.

Incidentally, we also have in the above example an illustration of the way
in which a rational number can be thought of as a real number; namely,
the number 1 is the only number common to all the intervals $I_n$ and the only
number common to all the intervals $J_n$. So in this case both $\xi$ and $\eta$ are new
names for the familiar rational number 1.
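The overlap condition in the definition of equivalence can be tested on finitely many intervals at a time. The sketch below does this for the two sequences of the illustration above, both closing in on 1, and for a third sequence, introduced here purely for contrast, that closes in on 2. All function names are our own. A finite check of this kind can refute equivalence but can never establish it, since the definition speaks of every pair m, n.

```python
from fractions import Fraction


def overlap(i, j) -> bool:
    """True if the closed intervals i = (a, b) and j = (c, d) share a point."""
    return max(i[0], j[0]) <= min(i[1], j[1])


def equivalent_prefix(xi, eta, terms: int) -> bool:
    """Check that each of the first `terms` intervals of xi overlaps each of
    the first `terms` intervals of eta."""
    return all(overlap(xi(m), eta(n))
               for m in range(1, terms + 1)
               for n in range(1, terms + 1))


def I(n):   # [1 - 1/n, 1 + 1/n]
    return (1 - Fraction(1, n), 1 + Fraction(1, n))


def J(n):   # [1 - 1/n, 1]
    return (1 - Fraction(1, n), Fraction(1))


def K(n):   # [2 - 1/n, 2 + 1/n], a contrasting sequence around 2
    return (2 - Fraction(1, n), 2 + Fraction(1, n))


print(equivalent_prefix(I, J, 30))   # every pair tested overlaps
print(equivalent_prefix(I, K, 5))    # some pair is disjoint
```

In the second comparison the third intervals are already disjoint: one ends at 4/3 and the other begins at 5/3, so there is a positive distance between the points the two sequences "define."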
We should now show that this "equality" (which we have temporarily
called an equivalence in the above definition) has the properties that equality
ought to have.
Theorem 10.3 Let $\xi$, $\eta$, and $\theta$ be sequences of nested intervals of rational
numbers. Then
a) $\xi = \xi$,
b) if $\xi = \eta$ then $\eta = \xi$,
c) if $\xi = \eta$ and $\eta = \theta$ then $\xi = \theta$.
Proof Parts (a) and (b) are obvious. To establish part (c), suppose by way of
contradiction that $\xi \neq \theta$. Then some interval $I_m$ in $\xi$ is disjoint from some
interval $K_n$ in $\theta$. Since $I_m$ and $K_n$ are intervals, we can suppose that each
number in $I_m$ is less than each number in $K_n$, as indicated in Fig. 10.7. In
particular, the right-hand endpoint r of $I_m$ is less than the left-hand endpoint
s of $K_n$.
Since $\xi = \eta$ and $\eta = \theta$, it is not hard to see that each interval $J_t$ in
the sequence $\eta$ must contain both r and s, since each interval $J_t$ must intersect
both $I_m$ and $K_n$. Since r < s, s - r is a positive number, and hence
$$\lambda(J_t) \geq s - r$$
for all natural numbers t, as the interval [r, s] is a subset of $J_t$ for each natural
number t. Hence the sequence $\{\lambda(J_t)\}$ cannot have limit zero. This
contradicts the fact that $\eta$ is a sequence of nested intervals of rational numbers,
as by definition the corresponding sequence of lengths must have limit zero.
This contradiction shows that $\xi = \theta$, and establishes the theorem.
We define a real number to be the collection of all equivalent sequences
of nested intervals of rational numbers. This justifies the use of the symbol
of equality in the definition of equivalent sequences, for then two real
numbers are equal if and only if they are represented by equivalent sequences
of nested intervals. We will use the symbol R to stand for the set of all real
numbers.
In order to define addition on the set R, we need first to define addition
for sequences of nested intervals of rational numbers. Let $\xi$ and $\eta$ be two
such sequences. Then
$$\xi = \{I_1, I_2, I_3, \ldots\}$$
and
$$\eta = \{J_1, J_2, J_3, \ldots\}.$$
We define $\xi + \eta$ to be the sequence
$$\{I_1 + J_1, I_2 + J_2, I_3 + J_3, \ldots\},$$
where, for each natural number n,
$$I_n + J_n = \{a + b \mid a \in I_n \text{ and } b \in J_n\}.$$
Thus in order to "add" two real numbers, we choose any sequences of
nested intervals representing those two real numbers; we add these sequences
by adding corresponding intervals, and intervals are added by adding each
number in one to each number in the other. This makes sense, since in the
latter case we are just adding rational numbers together, an operation already
taken for granted as defined.
There is no difficulty in calculating the result when two closed intervals
of rational numbers are added in the above way. For example, if
$$I = [2, 3]$$
and
$$J = [4, 7]$$
are two such intervals, then
$$I + J = \{a + b \mid a \in I \text{ and } b \in J\},$$
and this must be the interval [6, 10]. For if $a \in I$ and $b \in J$, then
$$2 \leq a \leq 3,$$
and
$$4 \leq b \leq 7,$$
and hence
$$6 \leq a + b \leq 10.$$
Moreover, if $c \in [6, 10]$, then
$$6 \leq c \leq 10.$$
In order to show that there exist numbers $a \in I$ and $b \in J$ such that c =
a + b, we must deal with cases. If $6 \leq c \leq 7$, we let
$$a = c - 4$$
and
$$b = c - a.$$
Then a and b are rational numbers, a + b = c, and moreover, since
$$6 \leq c \leq 7,$$
then
$$2 \leq c - 4 = a \leq 3$$
so that $a \in I$. And
$$b = c - a = c - (c - 4) = 4$$
so that $b \in J$. The other cases are handled similarly. Thus, indeed, [2, 3] +
[4, 7] = [6, 10]. In general, one can prove the following theorem:

Theorem 10.4 If I and J are closed intervals of rational numbers, then so is
I + J.
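The worked example suggests that two closed intervals can be added simply by adding their endpoints. The sketch below assumes exactly that (it is the content of the example, generalized, and not a proof of Theorem 10.4); the names `add_intervals` and `split` are our own. The function `split` handles all of the cases of the example at once, by making the first summand as large as its interval allows.

```python
from fractions import Fraction


def add_intervals(i, j):
    """Sum of closed intervals: [a, b] + [c, d] = [a + c, b + d]."""
    (a, b), (c, d) = i, j
    return (a + c, b + d)


def split(c, i, j):
    """Write a number c of i + j as a + b with a in i and b in j."""
    (a_lo, a_hi), (b_lo, b_hi) = i, j
    a = min(a_hi, c - b_lo)    # for 6 <= c <= 7 in the example this is c - 4
    b = c - a
    return a, b


I = (Fraction(2), Fraction(3))
J = (Fraction(4), Fraction(7))
print(add_intervals(I, J))            # endpoints 6 and 10
print(split(Fraction(13, 2), I, J))   # 13/2 = 5/2 + 4
print(split(Fraction(9), I, J))       # 9 = 3 + 6
```

For c between 6 and 7 the split agrees with the choice a = c - 4, b = 4 made in the text; for larger c it takes a = 3 and lets b absorb the remainder.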
But curiously enough, our definition of the sum $\xi + \eta$ of two real
numbers needs a justification theorem before it becomes a valid definition.
The reason is that in order to add the real numbers $\xi$ and $\eta$, one selects just
one of many possible sequences of nested intervals to represent $\xi$ and just
one of many possible sequences of nested intervals to represent $\eta$. There are
other choices; we may have $\xi$ represented by both
$$\{I_1, I_2, I_3, \ldots\}$$
and
$$\{X_1, X_2, X_3, \ldots\},$$
and $\eta$ too may be represented by both
$$\{J_1, J_2, J_3, \ldots\}$$
and
$$\{Y_1, Y_2, Y_3, \ldots\}.$$
Although $\{I_n\}$ and $\{X_n\}$ are equivalent, they need not be identical;
similar remarks hold for $\{J_n\}$ and $\{Y_n\}$. Thus you should not expect the two
sequences
$$\{I_n + J_n\}$$
and
$$\{X_n + Y_n\}$$
to be identical; the problem is, however, that they may not even be equivalent.
Since they are both supposed to determine the same sum $\xi + \eta$, the two
sequence sums above should be equivalent, or else there is an unacceptable
ambiguity in our definition of real number addition.
This problem can best be illustrated by an attempt to define a method
of combining rational numbers other than addition or multiplication.
Suppose, for each two rational numbers r and s, we define r # s as follows:
Represent each of r and s as quotients of whole numbers (with nonzero
denominator); thus
$$r = \frac{m}{n} \quad \text{and} \quad s = \frac{a}{b},$$
where m, n, a, and b are integers, and neither n nor b is zero. Then r # s
is to have the value
$$\frac{m + a}{n + b}.$$
This fails to be a "valid" operation, in that the definition of r # s is
ambiguous: If r = 1/2 and s = 2/5, then
$$r \# s = (1/2) \# (2/5) = 3/7.$$
But there are alternative representations of r and s as fractions; for
example, we could write r as 2/4 and s as 6/15. Then
$$r \# s = (2/4) \# (6/15) = 8/19.$$
Since $3/7 \neq 8/19$, we see that the result of combining rational numbers
with the operation # gives a result dependent on the numeral used to represent
each rational number, rather than on the actual number itself. This ambiguity
is just what we need to show cannot happen in the case of our definition of
addition for real numbers. To do so, it suffices to establish the next theorem.
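The ambiguity of the operation # is easy to exhibit by machine. In the sketch below, the function `sharp` (a name of our own) must be handed the numerators and denominators separately, since the result depends on the numerals rather than on the numbers; ordinary addition, by contrast, gives the same answer whichever numerals are used.

```python
from fractions import Fraction


def sharp(m: int, n: int, a: int, b: int) -> Fraction:
    """The ill-defined r # s of the text, computed from the numerals m/n and a/b."""
    return Fraction(m + a, n + b)


first = sharp(1, 2, 2, 5)      # r written as 1/2, s written as 2/5
second = sharp(2, 4, 6, 15)    # the same r and s written as 2/4 and 6/15
print(first, second, first == second)

# Ordinary addition does not care which numerals are chosen.
print(Fraction(1, 2) + Fraction(2, 5) == Fraction(2, 4) + Fraction(6, 15))
```

The first line of output shows two different values for what is supposed to be a single number r # s, while the second confirms that the sum of r and s is unambiguous.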
Theorem 10.5 Let $\xi$ and $\eta$ be real numbers, and let $\xi$ be represented by the two
sequences of nested intervals $\{I_n\}$ and $\{X_n\}$ and let $\eta$ be represented by the two
sequences of nested intervals $\{J_n\}$ and $\{Y_n\}$. Then the two sequences of nested
intervals $\{I_n + J_n\}$ and $\{X_n + Y_n\}$ are equivalent, and hence give rise to the
same real number $\xi + \eta$.
Proof Suppose that we have the conditions as given in the hypotheses to
this theorem. And also suppose, by way of contradiction, that the sequence
$\{I_n + J_n\}$ represents the real number $\gamma$, the sequence $\{X_n + Y_n\}$ represents
the real number $\delta$, and that $\gamma \neq \delta$.
Because of the last condition, the two sequences $\{I_n + J_n\}$ and $\{X_n + Y_n\}$
are not equivalent, so that there must be some interval of the form $I_k + J_k$
disjoint from some interval of the form $X_m + Y_m$. We suppose without loss
of generality that $m \geq k$, and then, since $I_m + J_m \subset I_k + J_k$, it follows
that $I_m + J_m$ and $X_m + Y_m$ are also disjoint.
But there does exist at least one rational number a belonging to both
$I_m$ and $X_m$, since $\{I_n\}$ and $\{X_n\}$ are equivalent sequences of nested intervals.
Similarly, there does exist a rational number b belonging to both $J_m$ and $Y_m$.
Hence the rational number a + b belongs to both $I_m + J_m$ and $X_m + Y_m$.
This contradiction establishes the theorem.
Of course, we have not yet established that if each of $\{I_n\}$ and $\{J_n\}$ is a
nested sequence of closed intervals of rational numbers, then so is the sequence
$\{I_n + J_n\}$; but the proof of this is very straightforward and is outlined in the
next set of exercises.
For convenience, we next establish that each rational number "is" a
real number; that is, that each rational number can be thought of as a
collection of equivalent sequences of nested intervals of rational numbers.
Theorem 10.6 Let r be a rational number. Then r is a real number.
Proof. For each natural number n, let
$I_n = [r - 1/n,\; r + 1/n]$.
Then, clearly, $\{I_n\}$ is a nested sequence of closed intervals of rational numbers.
It is easy to see that if $\{J_n\}$ is another such sequence, then the latter is
equivalent to the sequence $\{I_n\}$ if and only if every interval $J_n$ contains the
number r. Moreover, if so, then r is the only number common to all the
intervals in the sequence $\{J_n\}$ because of the condition that the sequence
$\{\lambda(J_n)\}$ has limit zero. Hence the rational number r is represented by a
sequence of nested intervals of the necessary sort, and consequently is a real
number.
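The intervals used in this proof are easy to tabulate. Here is a small sketch with exact rational arithmetic; the sample value r = 3/7 and the helper name `interval_for` are chosen only for illustration.

```python
from fractions import Fraction

def interval_for(r, n):
    """Return the closed interval [r - 1/n, r + 1/n] as a pair of Fractions."""
    step = Fraction(1, n)
    return (r - step, r + step)

r = Fraction(3, 7)
intervals = [interval_for(r, n) for n in range(1, 8)]

# Each interval lies inside the previous one, and every interval contains r.
nested = all(lo1 <= lo2 and hi2 <= hi1
             for (lo1, hi1), (lo2, hi2) in zip(intervals, intervals[1:]))
contains_r = all(lo <= r <= hi for lo, hi in intervals)

# The lengths are 2/n, which shrink toward zero as n grows.
lengths = [hi - lo for lo, hi in intervals]
print(nested, contains_r, lengths)
```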
We have two ways of adding rational numbers. If r and s are rational,
we can add them with our already given method of rational number addition,
or we can think of r and s as real numbers, and add them using the sequences
of nested intervals. The two results turn out to be the same, so our new
definition of real number addition agrees with the old rational
number addition when both methods apply to a pair of numbers. Also,
addition is commutative and associative, the rational number 0 (when thought
of as a real number) is the identity for this operation, and each real number $\zeta$
has an additive inverse, which we of course denote by $-\zeta$. The details of these
assertions can be found in the exercises.
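The agreement of the two additions can be seen on a small example. This is a sketch only, assuming that the sum of two closed intervals is formed endpoint by endpoint, [a, b] + [c, d] = [a + c, b + d] (compare Exercise 10.37); the helper names are ours.

```python
from fractions import Fraction

def interval_for(r, n):
    """The interval [r - 1/n, r + 1/n] representing the rational r."""
    step = Fraction(1, n)
    return (r - step, r + step)

def add_intervals(i, j):
    """Assumed rule: [a, b] + [c, d] = [a + c, b + d]."""
    return (i[0] + j[0], i[1] + j[1])

r, s = Fraction(1, 2), Fraction(2, 5)
sums = [add_intervals(interval_for(r, n), interval_for(s, n))
        for n in range(1, 6)]

# Every sum interval contains the ordinary rational sum r + s = 9/10,
# and the n-th sum interval has length 4/n.
for n, (lo, hi) in enumerate(sums, start=1):
    print(n, lo, hi, lo <= r + s <= hi, hi - lo)
```

Since the lengths 4/n approach zero and each interval contains r + s, the interval sum represents the same number as the ordinary sum in this example.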
An exactly similar procedure can be used to develop the idea of a
multiplication for real numbers, and exactly analogous results follow:
multiplication is commutative and associative and distributes over addition;
the rational number 1 is the multiplicative identity; each nonzero real number
has a multiplicative inverse. Since the multiplication defined for R is the
same as that given on Q when both can be applied, we have thus constructed
a natural extension of the rational number system $(Q, \cdot, +)$ to the real
number system $(R, \cdot, +)$; and we can think of the former as a subsystem
of the latter, for not only is it true that $Q \subset R$, but the operations are the
same on Q in either case.
There remains only the question of how the order relation on Q is to be
extended to R; we take up this topic in the next section.

Exercises

10.36 To prove Theorem 10.3, it is necessary to know that if $\{s_n\}$ is a
sequence of positive rational numbers each larger than the fixed positive
rational number a, then $\{s_n\}$ cannot have limit zero. Please prove this.
Hint: Let $\varepsilon = a/2$.
10.37 Supply the details of the proof that if I and J each are closed intervals
of rational numbers, then so is I + J. Hint: If I = [a, b] and J = [c, d],
what are the end points of the interval I + J? What if some of the numbers
involved are negative?
10.38 Suppose that each of $\{I_n\}$ and $\{J_n\}$ is a sequence of nested intervals of
rational numbers. By the previous exercise, the sequence $\{I_n + J_n\}$ is indeed
a sequence of closed intervals of rational numbers, but is it a nested sequence?
Fill in the details of the following outline of a proof that $\{I_n + J_n\}$ is nested.
First, for each natural number n,
$I_{n+1} + J_{n+1} \subset I_n + J_n$ (Why?).
Second, for each natural number n,
$\lambda(I_n + J_n) = \lambda(I_n) + \lambda(J_n)$ (Why?).
It follows that for each natural number n, not only is $I_{n+1} + J_{n+1}$ a
proper subset of $I_n + J_n$, but also that the sequence $\{\lambda(I_n + J_n)\}$ has limit
zero. (Why, to both?)
Therefore $\{I_n + J_n\}$ is a nested sequence of closed intervals of rational
numbers.
10.39 Given that $1 < \sqrt{2} < 2$, how would you construct a sequence of
nested intervals of rational numbers representing the real (and irrational)
number $\sqrt{2}$ without using decimals?
10.40 Prove that if $\zeta$ and $\eta$ are real numbers, then $\zeta + \eta = \eta + \zeta$.
Hint: Let $\zeta$ be represented by the nested sequence $\{I_n\}$ and $\eta$ by the
nested sequence $\{J_n\}$. Show that $I_n + J_n = J_n + I_n$ for each natural number
n. It follows (why?) that $\zeta + \eta = \eta + \zeta$.
10.41 Prove that if $\zeta$, $\eta$, and $\theta$ are real numbers, then
$\zeta + (\eta + \theta) = (\zeta + \eta) + \theta$. Hint: Do this much like the previous exercise.
10.42 Prove that the real number 0 is an additive identity; that is, that
$0 + \zeta = \zeta$ for each real number $\zeta$.
10.43 Suppose that the real number $\zeta$ is represented by the sequence $\{I_n\}$ of
nested intervals of rational numbers. In terms of this sequence, what is a
sequence representing $-\zeta$? Show that $-\zeta + \zeta = 0$ for all real numbers $\zeta$.
10.44 Show that if $\zeta$ is a real number such that some representation of $\zeta$
by a sequence $\{I_n\}$ contains an interval $I_k$ of positive rational numbers only,
then there exists a representation of $\zeta$ by a sequence $\{J_n\}$ in which every
interval contains only positive rational numbers. Show that a similar result
holds in the "negative" case.
10.45 Show how to define the product of two real numbers $\zeta$ and $\eta$ in
terms of products of sequences of nested closed intervals of rational numbers.
10.46 Prove that if $\zeta$ is a real number, then $0 \cdot \zeta = 0$ and $1 \cdot \zeta = \zeta$.
10.47 Show that multiplication of real numbers is both commutative and
associative.
10.48 Show that multiplication of real numbers distributes over addition;
that is, for any three real numbers $\zeta$, $\eta$, and $\theta$,
$\zeta \cdot (\eta + \theta) = \zeta \cdot \eta + \zeta \cdot \theta$.
10.49 Show that each real number other than zero has a multiplicative
inverse. Hint: Use Exercise 10.44.
10.50 Show that the multiplication for real numbers that you defined
in Exercise 10.45 coincides with ordinary rational number multiplication
when both can be applied to two numbers.

10.4 THE ORDER RELATION ON R


Let $\zeta$ and $\eta$ be real numbers, with representations $\{I_n\}$ and $\{J_n\}$ respectively
by sequences of nested closed intervals of rational numbers. If there exists
a rational number r and a natural number n such that each number in $I_n$
is less than r and each number in $J_n$ exceeds r, then we say that $\zeta < \eta$.
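The definition suggests a direct search for the separating pair n and r. The sketch below is an illustration under our own assumptions: each real number is supplied as a function from n to the endpoints of its n-th interval, and the search stops at an arbitrary cutoff `max_n`. A successful search confirms the relation; an unsuccessful one proves nothing.

```python
from fractions import Fraction

def separates(seq_i, seq_j, max_n=1000):
    """Look for n and a rational r with I_n entirely below r and J_n entirely above r.

    seq_i(n) and seq_j(n) return the endpoints (lo, hi) of the n-th intervals.
    Returns (n, r) for the first n that works, or None if none is found.
    """
    for n in range(1, max_n + 1):
        (_, hi_i), (lo_j, _) = seq_i(n), seq_j(n)
        if hi_i < lo_j:
            # Any rational strictly between the two intervals will do.
            return n, (hi_i + lo_j) / 2
    return None

def rational_as_real(r):
    """Represent the rational r by the intervals [r - 1/n, r + 1/n]."""
    return lambda n: (r - Fraction(1, n), r + Fraction(1, n))

result = separates(rational_as_real(Fraction(1, 3)), rational_as_real(Fraction(1, 2)))
print(result)
```

For 1/3 and 1/2 the intervals first separate at n = 13, and the midpoint 5/12 serves as r.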
Theorem 10.7 If a and b are rational numbers and a < b in the order relation
already given on Q, then a < b in the order relation on R defined above, and
conversely.
Theorem 10.8 If $\zeta$ and $\eta$ are real numbers, then exactly one of the three
relations
$\zeta < \eta, \qquad \zeta = \eta, \qquad \eta < \zeta$
is true.
Theorem 10.9 If $\zeta$, $\eta$, and $\theta$ are real numbers such that $\zeta < \eta$ and $\eta < \theta$,
then $\zeta < \theta$.
Note: The notation $\zeta \leq \eta$ means that either $\zeta < \eta$ or $\zeta = \eta$ is true. (By
virtue of Theorem 10.8, at most one can be true.)
Theorem 10.10 If $\zeta$ is a real number and $0 < \zeta$, then there exists a natural
number n such that
$1/n < \zeta$.
Theorem 10.11 (Archimedean Property of R) If $\varepsilon$ is a real number with
$0 < \varepsilon$, and $\gamma$ is a real number with $0 < \gamma$, then no matter how small $\varepsilon$ is and
no matter how large $\gamma$ is, there exists a natural number n such that
$\gamma < n \cdot \varepsilon$.
Let S be a nonempty set of real numbers. The number b is said to be an
upper bound for S provided that $x \leq b$ for all $x \in S$. If c is an upper bound
for S such that $c \leq b$ whenever b is also an upper bound for S, then c is
called a least upper bound for S.
The next theorem really says that the real number line contains no
"gaps."
Theorem 10.12 (Least Upper Bound Theorem) If S is a nonempty set of
real numbers with an upper bound, then S has a least upper bound.
The proofs of the above six theorems are outlined in the next set of
exercises. Using Theorem 10.12, we can prove that $\sqrt{2}$ is a real number;
that is, that there exists a real number $\gamma$ such that $\gamma^2 = 2$. Here is an
alternative method.
The method is to construct two sequences of positive rational numbers
of the form
$a_1, a_2, a_3, \ldots$
and
$b_1, b_2, b_3, \ldots$
such that, for every natural number n, $(a_n)^2 < 2$, $(b_n)^2 > 2$, $a_n < a_{n+1}$,
$b_{n+1} < b_n$, and such that the sequence $\{|b_n - a_n|\}$ is approaching zero.
This sequence looks much like that illustrated in Fig. 10.8, arranged so that
$a_1 < a_2 < a_3 < \cdots < \sqrt{2} < \cdots < b_3 < b_2 < b_1$
in actuality, although we have not yet shown the existence of $\sqrt{2}$. Since
$\{|b_n - a_n|\}$ has limit zero, the sequence of intervals $\{I_n\} = \{[a_n, b_n]\}$ will be
a nested sequence of closed intervals of rational numbers, thus representing
some real number $\gamma$. It is then natural to expect that $\gamma^2 = 2$, and this will
be proved.
Fig. 10.8 Illustrating the proof that $\sqrt{2}$ is a real number.

The first problem is the construction of the two sequences $\{a_n\}$ and
$\{b_n\}$. We begin by letting $a_1 = 1$ and $b_1 = 2$. Then we average:
$\frac{a_1 + b_1}{2} = \frac{1 + 2}{2} = \frac{3}{2}$.
This average is a positive rational number whose square is either less than 2,
or greater than 2. If less, we call it $a_2$; if greater, we call it $b_2$. In this case,
it turns out to be $b_2$. Now that we have $a_1$, $b_1$, and $b_2$, we average the last $a_n$
constructed with the last $b_n$ constructed, and obtain
$\frac{a_1 + b_2}{2} = \frac{1 + 3/2}{2} = \frac{5}{4}$.
Again this average is a positive rational number; if its square is less than
2, we call it $a_2$; if greater, we call it $b_3$.
We continue this process. In general, if $a_n$ and $b_m$ are the last numbers
constructed in each sequence, we form the average
$\frac{a_n + b_m}{2}$
and call this number $a_{n+1}$ if its square is less than 2; we call it $b_{m+1}$ if its
square is greater than 2.
It is not hard to see that $\{a_n\}$ and $\{b_n\}$ are sequences of positive rational
numbers so arranged that
$a_1 < a_2 < a_3 < \cdots < b_3 < b_2 < b_1$
and so that, for each natural number n,
$(a_n)^2 < 2 < (b_n)^2$.
The only real problem is in showing that there are actually infinitely
many a's and infinitely many b's. But suppose, for example, that there were
but finitely many b's. Then there must be infinitely many a's, and we would
have
$a_1 < a_2 < a_3 < \cdots < b_k$,
where $b_k$ is the last $b_n$ constructible by this process.


At each stage of the construction, the distance between the last a and
the last b constructed is half what it was at the previous stage; for example,
we have
$|b_1 - a_1| = 1$,
$|b_2 - a_1| = 1/2$,
$|b_2 - a_2| = 1/4$,
and so on. Hence the sequence $\{b_k - a_n\}$ is approaching zero (n is the variable
in this sequence; k is fixed). But then, the only rational number belonging
to all the intervals of the form $[a_n, b_k]$ is $b_k$ itself. However, by Exercise
10.14, $b_k$ cannot be the smallest rational number whose square exceeds 2.
So there is a smaller rational number, say c, whose square exceeds 2.
By construction, the square of each $a_n$ is less than 2, and the square of
$b_k$ exceeds 2. Hence the number c must belong to each interval of the form
$[a_n, b_k]$. This is a contradiction, since the only number in all such intervals
is $b_k$ itself and $c \neq b_k$. The case in which there are but finitely many a's is
handled similarly, and again, with this contradiction, we can conclude that
there must be infinitely many terms in both the sequences $\{a_n\}$ and $\{b_n\}$.
Consequently, $\{[a_n, b_n]\}$ is a sequence of nested closed intervals of
rational numbers, and thus represents some real number which we denote
by $\gamma$. The object now is to prove that $\gamma^2 = 2$.
But this is quite easy, for the sequence of nested intervals
$\{[(a_n)^2, (b_n)^2]\}$
not only represents the real number $\gamma^2$, but also has the property, again by
construction, that it represents the number 2; for $(a_n)^2 < 2 < (b_n)^2$ for
each natural number n. Therefore $\gamma^2 = 2$, and hence $\sqrt{2}$
exists.
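The averaging construction can be carried out exactly with rational arithmetic. The following is a minimal sketch; it records, after each step, the last a and the last b constructed, rather than the separately indexed sequences of the text, and the function name is ours.

```python
from fractions import Fraction

def sqrt2_pairs(steps):
    """Return the pairs (last a, last b) produced by repeated averaging."""
    a, b = Fraction(1), Fraction(2)
    pairs = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if mid * mid < 2:
            a = mid      # square below 2: the average becomes the next a
        else:
            b = mid      # square above 2: the average becomes the next b
        pairs.append((a, b))
    return pairs

pairs = sqrt2_pairs(10)
for a, b in pairs[:4]:
    print(a, b, b - a)
```

The first averages are 3/2 (which becomes a b) and 5/4 (which becomes an a), in agreement with the distances 1, 1/2, 1/4 listed above. The square of a rational number is never exactly 2, so the `else` branch is reached only when the square exceeds 2.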
Exercises

10.51 Part of the proof of Theorem 10.7 is presented below; please supply
the details.
Suppose that a and b are rational numbers and a < b in the order
relation on Q. First construct a rational number r such that a < r < b.
Represent a by the sequence $\{A_n\}$ and b by the sequence $\{B_n\}$ of nested
intervals of rational numbers. Show that, for some sufficiently large value
of n, each number in $A_n$ is less than r and each number in $B_n$ exceeds r.
It may be helpful to do this last step by contradiction. Remember that
$\{\lambda(A_n)\}$ and $\{\lambda(B_n)\}$ are sequences approaching zero.
The conclusion is then that a < b in the order relation on R.
10.52 The other part of the proof of Theorem 10.7 is presented below;
please supply the details.
Suppose that a and b are rational numbers such that a < b in the order
relation on R. Show why there must exist a rational number r such that
a < r and r < b in the order on Q. Conclude that a < b in the order on Q.
10.53 An outline of the proof of Theorem 10.8 is presented below; please
supply the details.
Suppose first that $\zeta < \eta$ and $\zeta = \eta$ are both true. Represent $\zeta$ by a
sequence of nested intervals $\{I_n\}$ and $\eta$ by a sequence of nested intervals
$\{J_n\}$. Since $\zeta = \eta$, each $I_n$ must intersect each $J_m$.
But since $\zeta < \eta$, there exists a natural number n and a rational number r
such that each number in $I_n$ is less than r and each number in $J_n$ is greater
than r. Why does this lead to a contradiction?
By similar treatment of other cases, conclude that at most one of the
three relations
$\zeta < \eta, \qquad \zeta = \eta, \qquad \eta < \zeta$
can be true.
Suppose that neither of the last two relations above is true, and show
that the first one must be true; since $\zeta \neq \eta$, there must be a natural number n
so large that $I_n$ and $J_n$ are disjoint. Show how to construct a rational number r
such that each number in $I_n$ is less than r and each number in $J_n$ exceeds r.
This shows that $\zeta < \eta$, and thus establishes Theorem 10.8.
10.54 Prove Theorem 10.9. Hint: Use techniques similar to those in the
above exercise, and use the fact that Theorem 10.9 does hold for rational
numbers.
10.55 Prove Theorem 10.10. Hint: Use Exercise 10.44.
10.56 Prove Theorem 10.11. Hint: Use Theorem 10.10.
10.57 Prove Theorem 10.12. Hint: Let S be a nonempty set of real numbers
with an upper bound. Let
$A = \{x \in R \mid x > s \text{ for all } s \in S\}$
and
$B = \{x \in R \mid x \leq s \text{ for some } s \in S\}$.
Then A and B are nonempty, $A \cup B = R$, and $S \subset B$. Moreover,
each number in B is less than each number in A. Follow the technique used
after the statement of Theorem 10.12, in proving that $\sqrt{2}$ exists, to construct
two sequences, one drawn from A and one from B. Construct a sequence of
nested intervals using the terms of these sequences for endpoints. If your
construction is much like that in showing that $\sqrt{2}$ exists, you should obtain a
sequence of nested intervals representing a real number $\zeta$ which can be shown
to be the least upper bound of B. Then, with a little care, $\zeta$ can also be shown
to be the least upper bound of S.
10.58 From this point on, we consider intervals to be intervals of real
numbers. That is,
$[a, b] = \{x \in R \mid a \leq x \leq b\}$,
$[a, b) = \{x \in R \mid a \leq x < b\}$,
$(a, b] = \{x \in R \mid a < x \leq b\}$,
and
$(a, b) = \{x \in R \mid a < x < b\}$.
Give three different upper bounds for the set [1, 2).
10.59 Prove that if a set of real numbers has a least upper bound, it has
only one.
10.60 What is the least upper bound of the set [1, 2)? What is the least
upper bound of the set (1, 2]?

10.5 ARE THERE MORE NUMBERS?


First let us try to answer the question above by showing how every length
can be represented by a real number. Such a length can be represented as a
length measured from 0 to some point on the unbounded straight line on
which the rational numbers can be thought of as already located according
to their values. If for some reason the length should be thought of as negative
(perhaps as representing a velocity, charge, or other signed physical quantity)
we shall nevertheless suppose that it has been measured off in the positive
direction, for if we can show that some real number $\zeta$ measures the positive
length, then $-\zeta$ will measure the negative length.
So, in essence, the problem is to show that each point to the right of 0
on the line is the location of some real number $\zeta$ already constructed. Let P
be such a point. There is certainly at least one rational number r to the right
of P and -1 is a rational number to the left of P. Hence the set
$S = \{x \in Q \mid x \text{ is to the left of } P\}$
is a nonempty set of real numbers with an upper bound. Let $\zeta$ be the least
upper bound of S.
If $\zeta$ is to the left of P, then by the construction of $\zeta$ there must exist some
closed interval I of rational numbers with $\zeta \in I$ and P to the right of each
number in I. In particular, the right-hand endpoint b of I is a rational number
to the left of P. So $b \in S$. But $\zeta < b$; this is a contradiction, since $\zeta$ is the
least upper bound of S.
Similarly, if $\zeta$ is to the right of P, a contradiction is obtained. Hence the
point P is the location of the real number $\zeta$. This establishes that each point
on the line is the location of some real number $\zeta$; moreover, this is the natural
location of $\zeta$ because this point is to the right of every rational number less
than $\zeta$ and to the left of every rational number greater than $\zeta$.
We now show how each real number can be located in a natural position
on the unbounded straight line of rational numbers. This will be easiest to
see with an example; we show how to locate the number
$\pi = 3.14159\,26535\ldots$.
We will not actually use this decimal expansion in order to locate the position
of $\pi$, but only to make it clear what intervals of rational numbers are to be
constructed.
Given $\pi$, let n be the greatest integer such that $n \leq \pi$, and let
$I_1 = [n, n + 1]$. In the case of $\pi$, we obtain the interval $I_1 = [3, 4]$.
Next, let $n_1$ be the greatest integer such that
$n + \frac{n_1}{10} \leq \pi$,
and let $I_2$ be the interval
$\left[\, n + \frac{n_1}{10},\; n + \frac{n_1 + 1}{10} \,\right]$.
In this case, using $\pi$, we obtain $I_2 = [3.1, 3.2]$.
We continue this process, obtaining the sequence

[3, 4], [3.1, 3.2], [3.14, 3.15], [3.141, 3.142], ....


The above process will work without the necessity of knowing the decimal
expansion of $\pi$ in advance; in fact, this process is just what we use to define
the decimal expansion of each real number; the decimal expansion is just the
limit of the sequence of left-hand endpoints of the above intervals (with a
minor exception to be dealt with in the exercises).
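The digit-by-digit process needs only the ability to decide whether a given rational number exceeds the real number in question. The sketch below illustrates this for $\sqrt{2}$ rather than $\pi$, because the comparison "q is at most $\sqrt{2}$" can be decided exactly by squaring; the function names and the choice of $\sqrt{2}$ are our assumptions, not part of the text.

```python
from fractions import Fraction

def decimal_intervals(is_at_most, start, count):
    """Build the intervals [n, n+1], [n + n1/10, n + (n1+1)/10], ... for a real number.

    is_at_most(q) reports whether the rational q is <= the number.
    start is the greatest integer not exceeding the number.
    """
    left, width = Fraction(start), Fraction(1)
    result = [(left, left + width)]
    for _ in range(count - 1):
        width /= 10
        # Greatest digit d for which left + d * width does not exceed the number.
        digit = max(d for d in range(10) if is_at_most(left + d * width))
        left += digit * width
        result.append((left, left + width))
    return result

def at_most_sqrt2(q):
    """Exact test for q <= sqrt(2) using rational arithmetic only."""
    return q < 0 or q * q <= 2

intervals = decimal_intervals(at_most_sqrt2, 1, 4)
for lo, hi in intervals:
    print(float(lo), float(hi))
```

The left-hand endpoints 1, 1.4, 1.41, 1.414 are the successive truncations of the decimal expansion, obtained without knowing that expansion in advance.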
In any case, we think of the closed intervals we have just obtained as
closed intervals of rational numbers. Since each has length 1/10 that of the
previous interval and each is contained in the previous interval, we have a
sequence of nested intervals of rational numbers, which "is" the real number
$\pi$. The location of $\pi$ is then the one point on the number line lying in all the
above intervals; it is clear that there can be at most one such point, and it
can be shown using some techniques of topology that at least one such point
must exist. This is a natural location for $\pi$, since we locate $\pi$ to the right of
every rational number less than $\pi$ and to the left of every rational number
greater than $\pi$.
This construction provides us with a bonus, as we have noted. We have
developed a decimal expansion for each real number, and the rules of arith-
metic by which these decimals can be added, multiplied, and so on, can be
developed as well.
This construction partly answers the question of the title of this section.
There are not more real numbers; that is, since we have established a one-to-
one correspondence between R and all the points on an unbounded straight
line, any physical quantity which can be interpreted as a length can be
measured exactly by one and only one real number. On the other hand,
even using real numbers the simple equation
$x^2 + 1 = 0$
has no solution $x \in R$; this problem is discussed in Exercise 10.65.

Exercises

10.61 Suppose that $\zeta$ is a real number in the interval [n, n + 1] where n is
an integer. Prove that there exists a nonnegative integer m such that
$\zeta \in \left[\, n + \frac{m}{10},\; n + \frac{m + 1}{10} \,\right]$.
What is the maximum possible value of m?
10.62 Suppose in the construction of a sequence of nested intervals of
rational numbers, as we did for the number $\pi$, the number $\zeta$ for which the
sequence is constructed lies at the right-hand end of each interval. For
example, suppose that $\zeta$ is the number 1. Then
$1 \in [0, 1]$,
$1 \in [0.9, 1]$,
$1 \in [0.99, 1]$,
$1 \in [0.999, 1]$,
and so on.
Has the above construction been carried out in the same way as was indicated
for the number $\pi$?
Since 1 lies in each of the above intervals, it would seem reasonable that a
decimal expansion for 1 might be given by 0.999 999.... Is this correct?
See Exercise 10.17.
10.63 What real numbers have two different decimal expansions? Hint:
See Exercise 10.62.
10.64 A question of some theoretical interest is this: If we were to repeat
the construction of this chapter, beginning with R rather than with Q,
would any new numbers be obtained? The answer is that none would, and
the reason is the truth of Theorem 10.12, the Least Upper Bound Theorem.
Let $\{I_n\}$ be a nested sequence of closed intervals of real numbers. Each
interval $I_n$ is of the form $[a_n, b_n]$, where $a_n$ and $b_n$ are real numbers and
$a_n < b_n$. Let $\zeta$ be the least upper bound of the set $\{a_n \mid n \in N\}$, and show that $\zeta$
is the only real number that belongs to all the intervals $I_n$. Since in this
alternative development, $\zeta$ is represented by the sequence $\{I_n\}$, this sequence
produces only a number that is already a real number. Fill in the details of
this argument.
10.65 In order to construct a solution to the equation
$x^2 + 1 = 0$,
a procedure can be used much simpler than the construction of R. Let
$C = \{(a, b) \mid a \in R \text{ and } b \in R\}$.
For (a, b) and (c, d) in C, let
$(a, b) + (c, d) = (a + c,\; b + d)$
and
$(a, b) \cdot (c, d) = (ac - bd,\; ad + bc)$.
Show that addition and multiplication are closed, commutative, and
associative; that (0, 0) is an additive identity and that (1, 0) is a multiplicative
identity; and that (a, b) has an additive inverse and, if $(a, b) \neq (0, 0)$, then
(a, b) has a multiplicative inverse.
Then show that if the real number a is identified with the number
$(a, 0) \in C$, the operations of addition and multiplication with respect to C
are the same as in R.
This shows that C, the complex number system, is a natural extension
of the real number system, and that R can be thought of as a subsystem of C.
Finally, show that the equation
$x^2 + 1 = 0$
has a solution in C.
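Before attempting the proofs in Exercise 10.65, it may help to experiment with the two operations. The following sketch uses integer pairs as a stand-in for pairs of real numbers; it is a numerical spot check, not a proof, and the function names are ours.

```python
def add(p, q):
    """(a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def mul(p, q):
    """(a, b) . (c, d) = (ac - bd, ad + bc)."""
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c)

# Pairs of the form (a, 0) behave like the real numbers a.
print(add((2, 0), (3, 0)), mul((2, 0), (3, 0)))

# A candidate for x in x^2 + 1 = 0, where 1 is read as (1, 0) and 0 as (0, 0).
x = (0, 1)
print(add(mul(x, x), (1, 0)))
```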
10.6 AN UNUSUAL SET OF REAL NUMBERS


If we used only the digits 0, 1, and 2 for counting, we would be counting in the
so-called ternary system, as shown below:

Ternary Numeral    Decimal Numeral

        0                  0
        1                  1
        2                  2
       10                  3
       11                  4
       12                  5
       20                  6
       21                  7
       22                  8
      100                  9
      101                 10
      102                 11
      110                 12

A fraction written with the numeral 1/4 in the decimal system would then
be written 1/11 in the ternary system, and so on. The development of the
real number system from the rationals could be carried out exactly as before,
and only the "decimal" expansions would look any different. The ternary
"decimal" for the above fraction could be computed by division:

     0.02020202...
11 ) 1.00000000...
       22
        100
         22
          100
           22
            100
             22
              1
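The long division above amounts to repeatedly multiplying the remainder by 3 and reading off the integer part. A minimal sketch, with a function name of our own choosing:

```python
from fractions import Fraction

def ternary_digits(x, count):
    """Return the first `count` ternary digits of x after the point, for 0 <= x < 1."""
    digits = []
    for _ in range(count):
        x *= 3
        d = int(x)        # the integer part is the next ternary digit
        digits.append(d)
        x -= d            # keep only the fractional part
    return digits

quarter = ternary_digits(Fraction(1, 4), 8)
third = ternary_digits(Fraction(1, 3), 4)
print(quarter, third)
```

For 1/4 the digits alternate 0, 2, 0, 2, ..., in agreement with the division. For a number with two ternary expansions, such as 1/3, this procedure produces the terminating one.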

As in the case of ordinary decimal representations of real numbers, some
ternary "decimals" may differ yet represent the same real number. For
example,
0.022222 ...
can be "evaluated" using the formula for the sum of a geometric series
(Exercise 10.35), as follows:
0.022222 ... = 0 + 0/3 + 2/9 + 2/27 + 2/81 + ....
This is a geometric series with "first" term 2/9 and ratio 1/3, hence its sum is
$\frac{2/9}{1 - 1/3} = \frac{1}{3}$,
or, in ternary notation, 0.022222 ... can be written as 1/10 or 0.100000 ....
Here is an example of a very unusual set of real numbers, known as the
Cantor Ternary Set. Let K be those real numbers in the interval [0, 1] not
requiring use of the digit 1 in their ternary expansion.
Thus the number 1/3 belongs to K, since although 1/3 does have a ternary
expansion
0.100000 ...
in which the digit 1 is used, it also has a ternary expansion
0.022222 ...
in which the digit 1 is not required.
If 1/3 < x < 2/3, x does require the use of the digit 1 in its ternary
expansion, since the ternary expansion of such a number must use the digit 1
in the first place after the decimal. So $K \subset [0, 1/3] \cup [2/3, 1]$.
Also, if 1/9 < x < 2/9 or if 7/9 < x < 8/9, it is necessary to use the
digit 1 in the second place after the decimal point in representing x as a
ternary decimal, and hence
$K \subset [0, 1/9] \cup [2/9, 1/3] \cup [2/3, 7/9] \cup [8/9, 1]$.
If this process of elimination is continued, it can be seen that K is that subset
of [0, 1] that remains after the middle third (except for endpoints) of [0, 1]
is deleted, then the middle third of each of the two resulting intervals is
deleted, then the middle third of each of the resulting four intervals is deleted,
and so on. This process is shown in Fig. 10.9.
In this deletion process, once a point becomes an "endpoint" of K, it
must remain in K in spite of all subsequent deletions; thus, for example, K
contains the infinite set
{1, 1/3, 1/9, 1/27, 1/81, ... }.
Fig. 10.9 First four stages in the construction of the Cantor Ternary Set.

Clearly, each endpoint of K has denominator a power of 3; however, K
contains other points as well, such as 1/4, since 1/4 has the ternary decimal
0.0202020 ....
Now let us calculate the "length" of the set K. This should be $1 - \mu$,
where $\mu$ is the total length of all the deleted intervals. But then
$\mu = \frac{1}{3} + \frac{2}{9} + \frac{4}{27} + \frac{8}{81} + \cdots$.
This series is geometric, with first term 1/3 and ratio 2/3, so its sum gives
us the value of $\mu$:
$\mu = \frac{1/3}{1 - 2/3} = 1$.
Since the length of K is $1 - \mu$, K has length zero.
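The partial sums of this series can be computed exactly. In the sketch below, stage k of the deletion removes $2^k$ intervals, each of length $1/3^{k+1}$; the function name is ours.

```python
from fractions import Fraction

def deleted_length(stages):
    """Total length removed after the given number of deletion stages."""
    total = Fraction(0)
    for k in range(stages):
        # Stage k removes 2**k middle thirds, each of length 1/3**(k+1).
        total += Fraction(2 ** k, 3 ** (k + 1))
    return total

for n in (1, 2, 4, 20):
    removed = deleted_length(n)
    print(n, removed, float(1 - removed))
```

After n stages the length remaining is $(2/3)^n$, which approaches zero as n grows.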
Let f be a function defined on K and real-valued, operating according
to the following rule:
Given x in K, express x in ternary decimal form without using the digit 1.
Replace each 2 in this ternary expansion by the digit 1. Interpret the resulting
numeral as the binary decimal numeral for a real number r. Then f(x) = r.
For example, given $1/4 \in K$, we proceed as follows to find f(1/4).
The ternary decimal 0.020202... represents 1/4. Convert this to
0.010101.... The latter is the binary decimal expansion for
$\frac{0}{2} + \frac{1}{4} + \frac{0}{8} + \frac{1}{16} + \cdots$.
This series is geometric, and its sum is 1/3. Hence f(1/4) = 1/3.
For another example, given $1 \in K$, we find f(1) as follows.
The ternary decimal 0.22222... represents 1. Convert this to
0.11111.... Sum the series 1/2 + 1/4 + 1/8 + 1/16 + .... The sum is 1.
Hence f(1) = 1.
Now each number in [0, 1] is a value of the function f. For, given
$z \in [0, 1]$, z has a binary decimal representation; each digit in this decimal
can be doubled, obtaining a numeral which can be thought of as the ternary
decimal of some real number. This ternary decimal contains no 1's, and has
the form 0.????? ..., hence it represents a number $x \in K$. It should be clear
that f(x) = z. Thus every number in [0, 1] is a value of f.
Since f is a function, it cannot have more values than the number of
elements of K; but since $K \subset [0, 1]$, K cannot contain more numbers than
are in the set [0, 1]. Hence K and [0, 1] contain the same number of points.
But K has length zero. This is what is unusual about the set K.
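The rule for f can be tried out on finite digit strings. The sketch below applies the rule to a finite list of ternary digits, so it yields only an approximation to f(x) for a point x with an infinite expansion; the function name and the truncation to finitely many digits are our assumptions.

```python
from fractions import Fraction

def cantor_f(ternary_digits):
    """Apply the rule for f to a finite list of ternary digits, each 0 or 2."""
    if any(d not in (0, 2) for d in ternary_digits):
        raise ValueError("digits of a point of K must be 0 or 2")
    binary_digits = [d // 2 for d in ternary_digits]   # replace each 2 by 1
    # Read the result as a binary "decimal": digit k counts for 1/2**(k+1).
    return sum(Fraction(b, 2 ** (k + 1)) for k, b in enumerate(binary_digits))

approx_quarter = cantor_f([0, 2] * 10)   # first 20 ternary digits of 1/4
approx_one = cantor_f([2] * 10)          # first 10 ternary digits of 1
print(float(approx_quarter), float(approx_one))
```

With 20 digits of 1/4 the value falls short of 1/3 by exactly $1/(3 \cdot 4^{10})$, and with 10 digits of 1 the value falls short of 1 by $1/2^{10}$, consistent with f(1/4) = 1/3 and f(1) = 1.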

Exercises
10.66 How do you count in the binary system, using only the digits 0 and 1?
10.67 Give the binary and ternary decimals for the numbers 1/2, 2/9, and 3/7.
10.68 Show that the Cantor Ternary Set contains infinitely many points
not endpoints.
10.69 Evaluate f(1/3) and f(2/3) for the function f constructed in this
section.
10.70 Suppose instead of the middle third, the middle fifth is deleted from
[0, 1], the middle fifth next deleted from each of the two resulting intervals,
and so on. What is the length of the resulting set?

NOTES AND REFERENCES


W. Rudin's Principles of Mathematical Analysis (second edition, McGraw-Hill,
1964) and M. J. Mansfield's Intermediate Real Analysis (Prindle, Weber,
and Schmidt, 1969) develop the real numbers from the rationals using
Dedekind cuts.
R. L. Wilder's Introduction to the Foundations of Mathematics (Wiley,
1952) gives, in addition, the development of the integers from the natural
numbers, then the development of the rational numbers from the integers.
B. Kripke's Introduction to Analysis (Freeman, 1968) gives some further
topics in the study of the real number system, and contains a valuable first
chapter on the approach to the study of mathematics.
C. Goffman's Real Functions (Rinehart, 1953), D. A. Sprecher's Elements
of Real Analysis (Academic Press, 1970), and R. R. Stoll's Set Theory and
Logic (Freeman, 1963) develop the real numbers from the rationals using
equivalent convergent sequences of rational numbers.
EPILOGUE

Mathematics can be thought of as being divided into several branches. The
branches are listed below, and we have indicated which chapters of this book
fall into each branch. In addition, the listing gives supplementary readings
on related topics. Some references duplicate those previously given at the
ends of the chapters. The level of difficulty of these books is quite variable,
but most of them would be appropriate for students who have the equivalent
of an undergraduate major in mathematics, while some of the books are
suitable for college freshmen.
Algebra. A very special kind of modern algebra is commonly taught in
high school. Chapter 4 (on group theory) and some of the material in
Chapter 2 belong in this category. Birkhoff and MacLane's A Survey of
Modern Algebra (revised edition, Macmillan, 1953) covers many of the topics
commonly thought of as "modern algebra," and has been used as a junior-
level textbook.
Number Theory. Perhaps only geometry antedates this very old branch
of mathematics. Of course, it is closely allied to algebra, but since about 1900
techniques of analysis have been very fruitful in producing advances in
number theory. Niven and Zuckerman's An Introduction to the Theory of
Numbers (Wiley, 1960) is frequently used as a junior- or senior-level text-
book. Beiler's Recreations in the Theory of Numbers (Dover, 1964) is written
for the layman with some familiarity with elementary mathematics, and is a
delightful book. Chapter 7 belongs in this category.
Analysis. Freshman calculus, calculus of several variables, and differential
equations form the backbone of the mathematical education of students
majoring in the physical sciences. These topics form the beginnings of
analysis, which together with its daughter, applied mathematics, has
produced most of the visible effects of mathematics in our culture. (For
example, almost all the mathematical problems involved in the flight plans
of space explorations belong in this category.) The use of continued fractions
in Chapter 3 is an example of an application of a topic in analysis; of course,
the differential equations of Chapter 8 are solved using techniques of analysis.
The material on convex sets in Chapter 9 is really geometry, but convex sets
have found their widest applications in analysis; and of course, Chapter 10
might best be described as an introduction to the foundations of analysis.
If you wish to study more mathematics of this sort, H. S. Wall's Creative
Mathematics (Texas, 1963) is an unusual book: a bright freshman with a
great deal of persistence can learn a great deal of calculus on his own with
the aid of this book. Bers' Calculus (Holt, Rinehart, and Winston, 1969)
is one excellent recent text, as is Spivak's Calculus (Benjamin, 1967).
Geometry. Chapter 1 on the Bolyai-Gerwin Theorem, Chapter 5 on
polyhedra, and Chapter 9 on convex sets are all highly geometric in content.
It is apparent that there is a great deal more to modern geometry than Euclid's
Elements. Some interesting references are Hadwiger, Debrunner, and Klee's
Combinatorial Geometry in the Plane (Holt, Rinehart, and Winston, 1964),
Coxeter's Introduction to Geometry (Wiley, 1961), and Hilbert and Cohn-
Vossen's Geometry and the Imagination (Chelsea, 1952).
Logic and Foundations. Chapter 6, about infinite sets, deals with part
of set theory and the latter is usually included as a part of foundations.
Many people would classify the material in Chapter 10 as belonging in
foundations rather than in analysis. Stoll's Set Theory and Logic (Freeman,
1963) can be used as a senior-level textbook.
Topology. Some of the material in Chapter 5 belongs in this branch of
mathematics, but only the second chapter, on Brunnian links, is really
mostly topological. Hocking and Young's Topology (Addison-Wesley,
1961) and Alexandroff's Elementary Concepts of Topology (Dover, 1960)
are introductory but not elementary.
Applied Mathematics. Perhaps probability and statistics belong in this
category; perhaps they should receive separate listings. In any case, Chapter 8
on animal populations is an example of an application of mathematics. So
are the topics treated in most physics books. Somewhere in the category
of applied mathematics belong such new branches of mathematics as game
theory, queueing theory, and others, each of which is quite likely to have
a profound effect on our lives and cultures, perhaps in the very near future.
With respect to game theory, Williams' The Compleat Strategyst (McGraw-
Hill, 1954) is written for the layman, and is a very entertaining book.
Two references dealing with problems in mathematics, mostly unsolved,
are given below:
Dorrie, H., 100 Great Problems of Elementary Mathematics (Dover,
1965, translated by David Antin).
Tietze, H., Famous Problems of Mathematics (Graylock, 1965).
That mathematics has far-reaching and surprisingly diverse applications
can be seen merely by examination of the two titles below:
Jakobson, R., Structure of Language and its Mathematical Aspects
(Proceedings of the Twelfth Symposium in Applied Mathematics, American
Mathematical Society, 1961).
Lomont, J. S., Applications of Finite Groups (Academic Press, 1959).

Finally, here is a list of more general books, some with intent similar
to this one, some not even intended as textbooks, but all appropriate for the
educated layman:
Aleksandrov, A. D., Kolmogorov, A. N., and Lavrent'ev, M. A.,
Mathematics: Its Content, Methods, and Meaning (M.I.T. Press, 1963,
translated by Gould, Bartha, and Hirsch).
Beck, A., Bleicher, M., and Crowe, D., Excursions into Mathematics
(Worth, 1969).
Crowdis, D. G. and Wheeler, B. W., Introduction to Mathematical
Ideas (McGraw-Hill, 1969).
Fraleigh, J.B., Mainstreams of Mathematics (Addison-Wesley, 1969).
Polya, G., Mathematics and Plausible Reasoning (Princeton, 1954).
Stein, S., Mathematics: The Man-Made Universe (second edition,
Freeman, 1969).
Wilder, R. L., Evolution of Mathematical Concepts (Wiley, 1968).
Wilder, R. L., Introduction to the Foundations of Mathematics (Wiley,
1952).
Of course, many fine books have been omitted from the above listing;
but the bibliographies that appear in many of those listed will serve as a
guide to additional reading.

GREEK ALPHABET

Capital  Small  Name       Pronunciation
Α        α      Alpha      ăl'fə
Β        β      Beta       bā'tə, bē'tə
Γ        γ      Gamma      găm'ə
Δ        δ      Delta      dĕl'tə
Ε        ε      Epsilon    ĕp'sə-lŏn'
Ζ        ζ      Zeta       zā'tə, zē'tə
Η        η      Eta        ā'tə, ē'tə
Θ        θ      Theta      thā'tə, thē'tə
Ι        ι      Iota       ī-ō'tə
Κ        κ      Kappa      kăp'ə
Λ        λ      Lambda     lăm'də
Μ        μ      Mu         myōō, mōō
Ν        ν      Nu         nōō, nyōō
Ξ        ξ      Xi         zī, sī
Ο        ο      Omicron    ŏm'ə-krŏn', ō'mə-krŏn'
Π        π      Pi         pī
Ρ        ρ      Rho        rō
Σ        σ      Sigma      sĭg'mə
Τ        τ      Tau        tou, tô
Υ        υ      Upsilon    ŭp'sə-lŏn'
Φ        φ      Phi        fī
Χ        χ      Chi, Khi   kī
Ψ        ψ      Psi        psī, sī
Ω        ω      Omega      ō-mĕg'ə, ō-mē'gə, ō-mā'gə

Pronunciation key: ă pat, ə about, ā pay, ĕ pet, ŏ pot, th thin, ī pie,
ō toe, ōō boot, ĭ pit, ou out, ô paw, ŭ cut.
ANSWERS
AND
HINTS

CHAPTER 1
1.1 The only polygon is (d).
1.2 Suppose that a polygon had fewer than three vertices, and reach a
contradiction.
1.3 It can be done.
1.4 Yes. Choose a point p on one edge but not a vertex. A sufficiently
small circular disk centered at p will be bisected by this edge, and part of this
disk must lie within the polygon. But a semicircular region has positive area.
1.5 See a plane geometry textbook for a formula concerning the sum of the
interior angles of a polygon.
1.6 See Section 1.8 for one approach.
1.7 In either case, since only finitely many parallelograms may be used,
there must be one with a vertex coinciding with a vertex of the triangle and
with one of the two sides incident at that vertex lying on one side of the
triangle.
1.8 See Fig. 1.2.
1.9 For each integer n ≥ 0, let R_n be the rectangle with vertices at the plane
coordinates
(n, 0), (n + 1, 0), (n + 1, 2^{-n}), (n, 2^{-n}),

and let P be the union of all these rectangles. Since R_n has area 2^{-n}, the total
area of P is
1 + 1/2 + 1/4 + 1/8 + ... = 2.

1.10 Try a figure with a square boundary.


1.11 Make R into two squares with one cut. Cut each resulting square
along a diagonal. Reassemble.
1.12 Cut the strip into squares each of which has side length the same as the
width of the strip. Reassemble the strip

1 2 3 4 5 6 ...
in the pattern
1 3 5 7 ...
2 4 6 8 ...

1.13 Use the same squares as in Exercise 1.12. Start at the origin and work
"circularly" outward.
1.14 Drop a perpendicular from the largest angle to the opposite side.
1.15 It can be done; you should really try to find the solution on your own.
1.18 One reasonable analogue to "polygon" in the one-dimensional case
might be a figure which can be expressed as the union of finitely many line
segments, each of which contains both its end points.
1.19 Test your theorem on the two line segments S and T, both of length 1,
where S consists of all numbers x such that 0 ≤ x < 1 and T consists of all
numbers y such that 2 ≤ y < 3.
1.20 The relation is an equivalence relation.
1.21 There should be n - 3 cuts, resulting in n - 2 triangles.
1.22 First rephrase the induction principle as follows:
If a statement meaningful for natural numbers is true for n = 3, and
whenever it is true for all integers k with 3 ≤ k < n then it is also true of n,
then the statement is true for all natural numbers n ≥ 3.
1.23 If you have assumed that

1 + 2 + 3 + ... + k = k(k + 1)/2,

then

1 + 2 + 3 + ... + k + (k + 1) = k(k + 1)/2 + (k + 1).

1.24 Suppose that if a polygon has k vertices and 3 ≤ k < n, then the
polygon can be triangulated. Let P be a polygon with n vertices. Use the
proof in Section 1.3 to divide P into two polygons each of which has fewer
than n vertices. Apply the above assumption, and then the answer to Exercise
1.22.

1.25 It cannot always be done. Try polyhedra of the sort that have a hole
running all the way through. This is not an easy problem.
1.26 You may wish to use the fact that if b < c and a > 0, then ab < ac.
1.27 This is easy; you may wish to split the proof into two cases.
1.28 If 4^n - 1 is divisible by 3, then there exists a natural number k such
that 4^n - 1 = 3k. Hence 4^n is always of the form 3k + 1. What do you
do to 4^n to turn it into 4^{n+1}?
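The induction step suggested here can be checked numerically. The following is only a sketch, and the helper name witness is ours, not the text's: if 4^n = 3k + 1, then multiplying by 4 gives 4^{n+1} = 3(4k + 1) + 1, so the "k" for n + 1 should be 4k + 1.

```python
# Check that 4**n - 1 is divisible by 3, and mirror the induction step:
# if 4**n = 3k + 1, then 4**(n+1) = 4(3k + 1) = 3(4k + 1) + 1.
def witness(n):
    """Return k with 4**n - 1 == 3*k (raises if no such integer exists)."""
    k, remainder = divmod(4**n - 1, 3)
    if remainder != 0:
        raise ValueError(f"4**{n} - 1 is not divisible by 3")
    return k

for n in range(1, 8):
    k = witness(n)
    # Induction step: the witness for n + 1 is 4k + 1.
    assert witness(n + 1) == 4 * k + 1
    print(n, 4**n - 1, k)
```

A finite check of this kind proves nothing for all n, of course; it only confirms that the algebra of the induction step has been carried out correctly.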
1.29 This is not only a difficult question, it is also a trick question. In his
book Mathematics: The Man-Made Universe (second edition, Freeman, 1969),
in Theorem 3 of Chapter 8, Sherman Stein shows that, for example, the
rectangle with sides of length 1 and √2 cannot be cut up into finitely many
squares in any way whatsoever!
1.30 Try the rectangle of sides of length 1 and √2, and suppose that, by way
of contradiction, it can be cut into squares of side length a > 0.
1.31 A 180° rotation of one line does not change the fact that it is parallel
to another line.
1.32 A right triangle, if cut the proper way.
1.33 There are really only two choices for which perpendicular to use.
Show that at least one choice must produce a segment that lies within the
parallelogram.
1.34 A motion without rotation will not change the fact that two lines are
parallel.
1.35 The answer is best couched in terms of the ratio of the length of the
altitude of the parallelogram to the length of its base.
1.36 In Fig. 1.4, the lines from q to the end points of the diameter are the
hypotenuses of two similar right triangles, whose sides are thus in proportion.
1.37 A little experience with inequalities of real numbers will be helpful
here.
1.38 This is a complicated but straightforward problem involving
inequalities.
1.39 Note that no rotations are used, and that x^2 = ab.
1.40 First show that the four right triangles in the "corners" of square A
are congruent to each other.
1.41 Again, this is a long but not difficult problem.

1.42 In Fig. 1.6, let the square S have side length c. Then show that a^2 +
b^2 = c^2.
1.43 Read the summary immediately preceding this exercise.
1.44 Take a = 1 and b = 5 in the proof given in Section 1.6.
1.45 First, (4/3)^3 > 2, so the numbers
(4/3)^3, (4/3)^6, (4/3)^9, (4/3)^12, ...

are, respectively, larger than

2, 4, 8, 16, ....
For the question about area, note that at each stage after the first, four
times as many triangles are added as in the previous stage, and each new
triangle has one-ninth the area of each triangle added in the previous stage.
So the ratio of the resulting geometric series will be 4/9.
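The two claims in this hint can be tested with exact fractions. The sketch below makes assumptions that the hint does not state: the starting triangle is given area 1, and three triangles, each of area 1/9, are added at the first stage; the function names are ours.

```python
from fractions import Fraction

def perimeter_factor(stage):
    """Perimeter after `stage` refinements, relative to the starting triangle."""
    return Fraction(4, 3) ** stage

def area(stage, initial_area=Fraction(1)):
    """Area after `stage` refinements, assuming 3 triangles are added first."""
    total = initial_area
    count = 3                      # triangles added at stage 1 (assumed)
    piece = initial_area / 9       # area of each triangle added at stage 1
    for _ in range(stage):
        total += count * piece
        count *= 4                 # four times as many triangles next stage
        piece /= 9                 # each with one-ninth the area
    return total

# (4/3)**(3j) exceeds 2**j, so the perimeter grows without bound.
assert all(perimeter_factor(3 * j) > 2 ** j for j in range(1, 10))

# The added areas form a geometric series with ratio 4/9, so the total
# area approaches 1 + (1/3) / (1 - 4/9) = 8/5 times the starting area.
limit = 1 + Fraction(1, 3) / (1 - Fraction(4, 9))
print(limit, float(area(30)))
```

Under these assumptions the perimeter is unbounded while the area stays below 8/5 of the original triangle.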
1.46 Use, axiomatically, that equidecomposable figures have the same
"area." A detailed formal development of the area function A is quite long.
1.47 Cut the square into vertical line segments. Hold the one at the left
fixed, and move each other segment to the right so that it ends up twice as
far from the leftmost segment.
1.48 Go directly to a rectangle by using just one cut.
1.49 Try a construction similar to the one shown in Fig. 1.5.
1.50 Yes; one stops with congruent rectangles rather than going all the way
to two equal squares. Details are given in the book of Boltyanskii mentioned
at the end of Chapter 1.
1.51 If the ratio of the length of one side of one of the parallelepipeds to the
length of one side of the other is a rational number, the construction is easy.
Otherwise, the answer is still in the affirmative, but the author knows of no
easy proof. That it is possible follows from a theorem mentioned in the
answer to Exercise 1.29.
1.52 All four can be cut up and reassembled into congruent squares. All
the cuts could then be superimposed onto a single such square.
1.53 It is possible; this follows from the answer to Exercise 1.51, but again
the author knows of no easy proof. Try Exercise 1.54 instead.
1.54 If there are n cubes along each edge of the smaller cubes, and m along
each edge of the larger cube, one must solve
2n^3 = m^3.
It follows from the Fundamental Theorem of Arithmetic (Section 7.3)
that this equation has no solution in integers.
1.55 Yes; the ingenious method of solving this problem is given in Chapter
7 of the text by Sherman Stein mentioned in the answer to Exercise 1.29.

CHAPTER 2
2.1 A circle together with one of its diameters is a candidate for an answer
to the first question. For other combinations, try finding figures that satisfy
two of the properties but not the other two.
2.2 Certainly not the first.
2.3 Certainly not the last.
2.4 See the answer to Exercise 2.1.

2.5 A disk certainly does have property (a).


2.6 There are infinitely many different correct answers to this problem.
2.7 This is easy.
2.8 Circumscribe a circle about the square, and then move the points of the
square outward along radii of the circle until the result is that the square has
been deformed onto the circle.
2.9 Deform the curve until it lies in a plane. It then bounds a circular disk
(after some possible additional deformation in the plane). The disk can
be deformed to a hemisphere, and the other hemisphere then supplied. The
curve has thus been deformed so as to become the "equator" of the sphere.
2.10 Procure an old inner tube, and draw a curve on it that goes around
twice the long way while going around three times the short way. Do you
think that every knotted simple closed curve can be deformed so as to lie
on the surface of a torus?
2.11 Yes; but why?
2.12 No; but why not? (Give an example.)
2.13 Can you link each curve with every other curve?
2.14 In the first construction, first draw a completely splittable (n - 1)-
link.
2.15 One solution might have the curves so arranged so as to look like an
infinite chain.

2.17 Since one, and thus any, curve representing ab can be deformed
continuously to a curve representing ba. The right convincing drawing can
be found after a little effort.
2.18 Yes; since ab = ba, then also xy = yx if x and y are any two
expressions whatsoever in the algebra.
2.19 Both (y^{-1}x^{-1})(xy) and (xy)^{-1}(xy) are equal to 1.

2.20 These two expressions do reduce to identical ones.


2.21 The answer is found in Section 2.5.
2.22 See Fig. 2.11.
2.23 Begin by copying Fig. 2.12.
2.24 Draw four separated circles, and then follow the formula found in
Section 2.5.
2.25 No shorter formula is known to the author.
2.28 See Section 2.6.
2.29 Yes: Let x = aba^{-1}b^{-1} and y = c.

2.30 Here is one solution. Draw three circles A, B, and C forming the
Borromean Rings, and then so arrange matters that D links A and B in the
Borromean manner as well as linking A and C in the same way. One possible
formula for D is thus
aba^{-1}b^{-1}aca^{-1}c^{-1}.

2.31 Expand and simplify (x, y)(y, x).


2.33 Use a case argument.
2.35 Show that an (n + 1, 2)-Brunnian link can be constructed from an
(n, 2)-Brunnian link by the methods of Section 2.6.

2.36 The two expressions represent the same curve if the point p is allowed
to move. This is not permissible in the algebra, but is allowable for the
purposes of constructing the various links of this chapter. This is the one way
in which the geometry and the algebra are not in perfect correspondence.

2.37 Try (a, b)(c, d)).


2.38 Use the form of the induction principle given in the answer to
Exercise 1.22.
2.39 Again, see Exercise 1.22.
2.40 Knottedness may be removed by reversing some of the crossings of a
given curve over itself. How?
2.41 Look up "Pascal's Triangle" in any college algebra textbook.
2.42 There are many correct answers to this problem; however, certain
arrangements are impossible.
2.43 To show the strip has only one side, start coloring at one spot with a
crayon, and keep expanding the colored region. The whole strip will
eventually be colored on "both" sides.
2.44 Note that the cut down the middle has the drawn line on both sides at
all times, and never crosses it.
2.45 Try the experimental approach.
2.48 The surface would be one-sided; it is called a Klein bottle. If you find
the surface difficult to visualize, this is probably because this figure cannot
be placed in ordinary three-dimensional space without an artificial
self-intersection, the same sort of artificial self-intersection you see when you
try to draw a knotted simple closed curve on a (two-dimensional) piece of
paper.
2.49 You can add any desired number of edges to a preexisting figure by
removing the interior of as many small circular disks as you need.

CHAPTER 3
3.1 Try solving the equation 1^x = 2.
3.2 The graph looks much like the one in Fig. 3.2.
3.3 Note that the number 1/2 must be taken to a negative power to give a
value larger than 1.
3.4 First establish this for b = 1/c, where c > 1.
3.5 What does y = log_b x mean in terms of exponents?
3.7 Note that log_2 (4^3) = 3 log_2 (4).
3.8 Let log_b xy = p, log_b x = q, and log_b y = r. Then
b^p = xy, b^q = x, and b^r = y.
3.9 Use the techniques of the previous exercise.
3.10 For b > 0, b ≠ 1, and x any real number, define b^x to be that real
number a such that a > 0 and x = log_b a. Show first that such a value of a
exists and is unique.
3.12 The given expressions may be simplified, in order, to
1
1 + log_b c
c
-1
1
3.13 Use Exercise 3.11.
3.14 Let log_b x = p and log_b y = q.
3.15 If 2^x = 3, then x log 2 = log 3; the logarithms may be to any fixed
base, such as 10. Why is this so? In any case, the answer correct to nine
places is 1.584 962 501.
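The remark that any fixed base will do can be tried out directly. This is a small sketch using Python's standard math module; the variable names are ours.

```python
import math

# 2**x = 3  implies  x * log 2 = log 3  in any fixed base.
x_natural = math.log(3) / math.log(2)       # natural logarithms
x_common = math.log10(3) / math.log10(2)    # base-10 logarithms

print(f"{x_natural:.9f}")
print(f"{x_common:.9f}")
print(2 ** x_natural)
```

Both quotients agree up to rounding error, because changing the base multiplies numerator and denominator by the same constant.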
3.17 1 + 5/7.
3.18 The correct answer is not
1 + log (5/2)/log 2,
since the fraction there exceeds 1. Why?
3.19 Note that log 4 = 2 log 2.
3.20 Attack the numerator:
log 2 = log [(3/2)(4/3)]
= log (3/2) + log (4/3).
3.21 34/25.
3.22 You should obtain the equation
x = 1 + 1/(2 + x).
3.23 If x^2 = 5, then x^2 - 4 = 1.
3.24 The answer is
(-1 + 4√3)/3.
If you got this unlikely-looking monster (which we have simplified, by the
way), you almost certainly worked the problem correctly.
3.25 If √3 = (1; a, a, a, ...), then you can obtain the equation
√3 = 1 + 1/(a - 1 + √3),
and this should lead to a contradiction.

3.26 If you are in utter despair, see Exercise 3.50.


3.30 There is a connection between this exercise and the previous two, and
you should have discovered it.
3.31 Can a sequence of positive numbers have a negative limit?
3.33 Increasing the denominator of a fraction with positive entries decreases
its value, and conversely.
3.35 You should obtain (3; 6, 6, 6, ...), and this is the correct answer.
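A continued fraction of the form (3; 6, 6, 6, ...) can be evaluated by truncating it and working from the inside out. The sketch below assumes, as the value x = 3 + 1/(3 + x) suggests, that the number being expanded is the square root of 10; the function name convergent is ours.

```python
from fractions import Fraction
import math

def convergent(first, repeated, depth):
    """Value of (first; repeated, repeated, ...) truncated after `depth` repeats."""
    tail = Fraction(0)
    for _ in range(depth):
        tail = 1 / (repeated + tail)
    return first + tail

for depth in range(1, 6):
    c = convergent(3, 6, depth)
    print(depth, c, float(c) - math.sqrt(10))
```

The printed differences shrink rapidly and alternate in sign, which is the usual behavior of the convergents of a continued fraction.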
3.42 First, of course, you must (correctly) guess that the limit is zero. For a
proof:
Let ε > 0 be given. Then 1/ε is a positive number, so we may choose a
whole number N such that N > 1/ε. Suppose that n is an arbitrary whole
number such that n > N. Then
|1/n - 0| = 1/n < 1/N < ε
and hence the sequence has limit 0.
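The choice of N in this proof is completely explicit, so it can be carried out for particular tolerances. The following is a sketch only; the function name choose_N and the sample tolerances are ours, and a finite spot-check is no substitute for the proof.

```python
import math

def choose_N(eps):
    """Return a whole number N with N > 1/eps, as in the proof."""
    return math.floor(1 / eps) + 1

for eps in (0.5, 0.01, 0.0003):
    N = choose_N(eps)
    # Spot-check the inequality |1/n - 0| < eps for some n > N.
    assert all(abs(1 / n - 0) < eps for n in range(N + 1, N + 1000))
    print(eps, N)
```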
3.43 The most natural definition of the sum, and the one used in
mathematics, is the limit of the sequence
1, 1 + 1/2, 1 + 1/2 + 1/4, ...
obtained by "adding up more and more terms of the series."
3.44 See Exercise 10.35 if you wish.
3.45 By use of the definition in Exercise 3.32.
3.46 The results may surprise you.
3.48 This problem can be reduced to showing that the equation
3^m = 2^n
has no solution if m and n are positive whole numbers.
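A brute-force search makes the claim plausible, and the parity check beneath it points to one reason why: a power of 3 is odd, while a power of 2 with positive exponent is even. This is a sketch over a small, arbitrarily chosen range of exponents.

```python
# Brute-force search over small exponents; none should be found, because
# every power 3**m is odd while every power 2**n (n >= 1) is even.
solutions = [(m, n)
             for m in range(1, 60)
             for n in range(1, 100)
             if 3 ** m == 2 ** n]
print(solutions)

assert all((3 ** m) % 2 == 1 for m in range(1, 60))
assert all((2 ** n) % 2 == 0 for n in range(1, 100))
```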
3.49 Note that k^12 = 2. Why is this so?
3.52 A fourth is an "inverted" fifth.
3.60 Well, theoretically, yes.

CHAPTER 4
4.3 No matter how the elements of (say) the group of Example 4.4 are
renamed, one cannot obtain the multiplication of Example 4.5, since in the
latter example the square of each element is the identity, and the former
example does not have this property. This observation enables one to avoid
consideration of the twenty-four cases corresponding to the twenty-four
different ways of renaming the four elements of the first group with the names
of the four elements of the second group.
4.4 Be sure to establish associativity the "easy way."
4.5 Consider the "value" of the product ef.
4.6 Examine the element yxz.
4.7 Use the fact that x has an inverse y in G.
4.8 What is the value of (x^{-1})^{-1} · x^{-1}?
4.10 Note that either 0 or 1 can stand for the identity of M.
4.11 There is only one way to fill in the table, if 0 is to be the identity. Why?
4.12 One can establish associativity by "realizing" the group of Example
4.5 as a group of (not necessarily all) motions of some geometric figure.
One figure that will work is a rectangular parallelepiped. What are the
appropriate motions to consider?
4.14 Yes. How?
4.16 If the element g appears twice in the row to the right of x, and in the
columns headed by y and z, then xy = g = xz.
4.17 If the table were associative then it would have to be a group table.
4.18 Some other associative operations are these:
x # y = x + y + 17,
x # y = x + y - xy,
x # y = y.
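Associativity of each of these operations can be spot-checked on a sample of integers. The sketch below is ours; subtraction is included only for contrast, as an operation that fails the test, and passing the test on a finite sample does not by itself prove associativity.

```python
from itertools import product

operations = {
    "x + y + 17": lambda x, y: x + y + 17,
    "x + y - xy": lambda x, y: x + y - x * y,
    "y": lambda x, y: y,
    "x - y (for contrast)": lambda x, y: x - y,
}

def is_associative_on(op, sample):
    """True if (x # y) # z == x # (y # z) for every triple in the sample."""
    return all(op(op(x, y), z) == op(x, op(y, z))
               for x, y, z in product(sample, repeat=3))

sample = range(-4, 5)
results = {name: is_associative_on(op, sample) for name, op in operations.items()}
for name, ok in results.items():
    print(name, ok)
```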
4.20 Try a group that is not commutative.
4.21 (W, +) contains just one subgroup of finite order-which one?
4.24 Yes; which one? Or are there more than one?
4.26 The identity has order 1. To answer the second question, what if
x^1 = e?

4.27 The group L contains infinitely many elements of finite order and
infinitely many elements of infinite order.
4.28 A rotation of the disk one-seventh of the way around generates a
subgroup of order 7. Generalize. The group L does contain many subgroups
of infinite order, but this may not be so obvious.
4.30 First show that (x^{-1}gx)^n = x^{-1}g^n x.

4.31 See the next exercise.


4.32 Note that (gh)^2 = ghgh.
4.33 To show that xy = yx, consider (xy)^2.
4.35 If g has order 3, what is the order of g^2? Can g and g^2 be the same?
4.36 Note that (ab)^n = a(ba)^{n-1}b.
4.37 Note that b = a^{-1}b^2 a. Substitute the b on the left-hand side for each
b on the right-hand side. Continue this process, and eventually use the
fact that a^5 = e.
4.38 Since G is commutative, (xy)^2 = x^2 y^2 for all x and y in G (why?).

4.39 See the answer to Exercise 4.36.


4.45 See the answer to Exercise 2.19.
4.46 Suppose by way of contradiction that g were an element of infinite
order in the finite group G.
4.49 Is there a fixed value of k such that k is a multiple of each element's
order?
4.50 One method is to show that the identity must appear at least twice on
the main diagonal of the group table for G. Suppose that it does not.
4.51 Yes; but why?
4.56 Use Exercise 4.54.
4.57 Try this first for n = 3. Generalize your proof to the case of arbitrary
natural number values of n.
4.59 Here is one way to show that Z is closed:
Suppose that y and z are elements of Z. Then yg = gy and zg = gz
for all elements g of G. Hence (yz)g = y(zg) = y(gz) = (yg)z = (gy)z =
g(yz). Hence (yz)g = g(yz) for all elements g of G. Hence yz is an element
of Z by definition of Z. Hence the operation is closed in Z.
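The closure argument can be watched at work in a small concrete group. The sketch below uses the eight symmetries of a square, written as permutations of its vertices 0, 1, 2, 3; this choice of group and all the names in the code are ours, not the text's.

```python
from itertools import product

def compose(p, q):
    """Permutation 'p after q' on {0, 1, 2, 3}, each stored as a tuple."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(generators):
    """Close a set of permutations under composition."""
    group = set(generators)
    while True:
        new = {compose(a, b) for a, b in product(group, repeat=2)} - group
        if not new:
            return group
        group |= new

rotation = (1, 2, 3, 0)      # quarter turn of a square's vertices
reflection = (0, 3, 2, 1)    # flip fixing vertices 0 and 2
G = generate({rotation, reflection})

# Z: the elements that commute with every element of G.
Z = {z for z in G if all(compose(z, g) == compose(g, z) for g in G)}

# Closure of Z, which is what the hint's chain of equalities proves.
assert all(compose(y, z) in Z for y, z in product(Z, repeat=2))
print(len(G), sorted(Z))
```

In this example Z consists of the identity and the half turn, and the product of any two of its elements is again in Z.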
4.60 This proof is similar to the previous exercise. Do you need to know
that G is finite?
4.62 Let n be the order of G. Among the orders of subgroups of G are
only the two natural numbers 1 and n. However, this does not show that
among all natural numbers, n has only the two divisors 1 and n. How would
you show that a group of order 12 has a proper subgroup? Generalize.
4.63 If x and y are elements of g^{-1}Hg, then for some h and k in H,
x = g^{-1}hg and y = g^{-1}kg.

4.64 Note that this is an "if and only if" proof.


4.66 If this is too easy, why don't you try a really tough problem:
Let G be a group containing elements x and y such that, for some fixed
natural number n,
and
Prove that x = e = y.
4.71 Even though A is not a subgroup, it has "cosets" such as gA, where g
is an element of G. See the proof of Lagrange's Theorem to see what
properties gA must have even though A is not a subgroup of G.

CHAPTER 5
5.2 Yes.
5.3 Yes.
5.4 Yes.
5.5 A man walking around a vertex passes through an even number of
countries. Will this fact help in showing that a two-color "checkerboard"
coloring pattern will work?
5.9 To prevent the boundary and exterior of (say) a cube from being a
polyhedron.
5.10 The answer to the second question is "no."
5.11 The maximum possible is 12. Finding a map that requires all 12 colors
is difficult; proving that 12 is sufficient for all such maps is very difficult.
5.13 It can be done. Curiously enough, it does not matter whether countries
"go all the way through" the strip or whether one has different countries
and non-coincident boundaries on the two "sides" of the strip. Can you see
why not?
5.14 Yes; you can construct such "maps" requiring any given number of
colors for a proper coloring.
5.15 There is no upper limit.
5.16 The proof might involve consideration of what happens to the value of
V - E + F when one hole is "plugged up."
5.17 If the lines do not intersect, you have a net in the plane for which
V - E + F = 1. What is the value of F? Reach a contradiction by
considering the possible number of boundary edges of each country.

5.19 First show that 3F = 2E.


5.21 See Exercise 5.5.
5.22 Use the fact that each vertex lies on four edges to show first that 4V =
2E. Then, since each country is to have six sides, you can show that 6F = 2E.
Since you also know that V - E + F = 2, you may be able to reach a
contradiction using these three formulas.
5.24 Name the countries, and then proceed with a coloring scheme chosen
so as to avoid cases.
5.25 A boundary edge must be a segment rather than a closed curve. So
one should just introduce two (artificial) vertices onto the equator.
5.27 The points in the shaded region are exactly those satisfying the
inequality of Steinitz's Theorem.
5.28 Try a few simple examples with small values of V and F.
5.30 Since E = 20, V + F = 22. Solve for F and use the inequality in
Steinitz's Theorem to find the desired conditions on V.
5.35 Curiously enough, the answer is always F - 4. Why?
5.36 First establish that 3V = 2E, and that 5F = 2E. Then use Euler's
Theorem.
5.42 Use a case argument.
5.44 Use the techniques of the solution to Exercise 5.40.

CHAPTER 6
6.13 The other formula is also valid. These formulas are known as
DeMorgan's laws.
6.14 If k is the larger of the two numbers m and n, then the answer may be
given in terms of an inequality involving k.
6.15 The answer is a formula involving k, m, and n.
6.16 Yes, the notation f(a) makes sense: If (a, b) ∈ f, then f(a) = b.
6.19 The function f: R → R according to the rule f(x) = x^3 is sufficiently
different from Example 6.11.
6.21 If you hold the page on which Fig. 6.5 is printed up to the light, and
look at the other side of the page so that the x-axis is vertical and the positive
y-axis is to the right, this has the effect of interchanging these two axes;
thus, what you see is the graph of f^{-1}.
6.22 Here is one way to show that f^{-1} is one-to-one:
Suppose that x and y are elements of B, and that f^{-1}(x) = f^{-1}(y).
Since f is a function, f(f^{-1}(x)) = f(f^{-1}(y)). Hence x = y. Therefore, if
x ≠ y, then f^{-1}(x) ≠ f^{-1}(y). Therefore f^{-1} is one-to-one.
6.27 The appropriate notation would be {f^{-1}(g^{-1})}.
6.29 Note that (x + 1)^2 is not always equal to x^2 + 1.
6.30 Is the converse true?
6.32 Try f: N → E according to the rule

f(x) = 2x.

6.34 What about the function g: N → T by

g(x) = x + 9?
6.36 One possibility is
f(1) = 2,
f(2) = 1,
f(x) = x if x ≥ 3.

Now find two more.


6.38 First devise the correspondence; then, if you have been sufficiently
systematic, you can devise a formula for the appropriate function.
6.39 Draw a triangle with base two units long. It has a parallel median which
must be one unit long. How do you correspond the points of the base with
the points of the median?
6.40 Modify the answer to the first question to answer the second.
6.43 Since you want to put B and C into one-to-one correspondence, the
trick is to draw two circles and label them B and C. Inside of each draw a
smaller circle, and determine what sets these two should represent in order
to be able to apply the Cantor-Schroeder-Bernstein Theorem.
6.45 One can let f: W → N according to the rule

f(x) = 2^x if x ≥ 1,
f(x) = 3^{-x} if x ≤ 0.
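That this rule never sends two different inputs to the same output can be spot-checked. The sketch below assumes that W denotes the integers and reads the rule as f(x) = 2^x for x ≥ 1 and f(x) = 3^{-x} for x ≤ 0; the sample range is arbitrary.

```python
def f(x):
    """Map an integer x to a natural number: 2**x for x >= 1, 3**(-x) for x <= 0."""
    return 2 ** x if x >= 1 else 3 ** (-x)

inputs = range(-30, 31)
values = [f(x) for x in inputs]

# No two integers in the sample share an image, since a power of 2
# (exponent >= 1) is even and a power of 3 is odd.
assert len(set(values)) == len(values)
print(values[28:33])
```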
6.52 You may wish to show this first in the special case in which A ∩ B = ∅,
and then apply some of the previous exercises or theorems.
6.59 Remember that S must contain a denumerable subset.
6.72 How many three-element subsets has N? How many five-element
subsets has N?
6.75 Compare the following two versions of the method of performing
the experiment:
First Method: At each stage, remove from the urn the three lowest-
numbered balls not previously moved, then replace in the urn the two lowest-
numbered balls outside the urn.
Second Method: At each stage, remove the three lowest-numbered
balls in the urn, then replace the two highest-numbered balls then outside
the urn.
Do you see a way to perform the experiment so that, at its conclusion,
exactly thirty-seven balls are in the urn? It can be done!

CHAPTER 7
7.2 Yes; see the next exercise.
7.4 Check your answer with a few experiments.
7.5 Remember that integers may be negative as well as positive.
7.6 Remember that each prime is positive.
7.9 This is true under certain conditions, but not always true.
7.10 Yes; supply a proof.
7.12 Yes; supply an example.
7.17 The last n numbers in the sequence are composite.
7.18 The answer to each question is "no."
7.21 See Exercise 1.23 and the answer to it.
7.25 If a were the least positive real number, what about a/2?
7.28 E is a subset of N.
7.29 What is the last positive rational number?
7.31 The number happens to be 65, but you should prove its existence using
the Well-Ordering Axiom.
7.33 This is a moderately long problem. If you have studied Chapter 3,
compare this exercise with Exercise 3.33.
7.34 If a < b, x < y, and all numbers involved are positive, then ax < by.
Why?
7.35 Simplify the expression
(n + 1)^3 - (n + 1).
7.38 The second question is much easier than the first!
7.41 Use the Fundamental Theorem of Arithmetic.
7.42 No; but why not?
7.43 Handle 3m + n and m + 2n separately if you wish.
7.44 No; but why not?
7.45 Take n = 7, of course.
7.46 Use Wilson's Theorem to show that 20 has a proper divisor.
7.47 Note that 101 is prime.
7.48 Compare this with Exercise 7.9.
7.50 The formula is
(α_1 + 1)(α_2 + 1)(α_3 + 1) ··· (α_k + 1).
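If, as the shape of this formula suggests, the exercise asks for the number of positive divisors of a number whose prime factorization has exponents α_1, ..., α_k, then the formula can be compared with a direct count. The sketch below rests on that assumption, and all function names are ours.

```python
def prime_exponents(n):
    """Return the exponents alpha_1, ..., alpha_k in the prime factorization of n."""
    exponents = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            alpha = 0
            while n % p == 0:
                n //= p
                alpha += 1
            exponents.append(alpha)
        p += 1
    if n > 1:
        exponents.append(1)       # one leftover prime factor
    return exponents

def divisor_count_by_formula(n):
    count = 1
    for alpha in prime_exponents(n):
        count *= alpha + 1
    return count

def divisor_count_by_search(n):
    return sum(1 for d in range(1, n + 1) if n % d == 0)

for n in (12, 97, 360, 1000):
    print(n, divisor_count_by_formula(n), divisor_count_by_search(n))
```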
7.53 This one is not easy.
7.58 The number 4 is always a divisor of the left-hand side and never a
divisor of the right-hand side.
7.59 First find integers m and n such that 12m + 13n = 1.
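Integers m and n of this kind can be found by running the Euclidean algorithm backwards. The following is a sketch of that standard procedure; the function name extended_gcd is ours.

```python
def extended_gcd(a, b):
    """Return (g, m, n) with a*m + b*n == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, m, n = extended_gcd(b, a % b)
    # b*m + (a % b)*n == g, and a % b == a - (a // b)*b
    return g, n, m - (a // b) * n

g, m, n = extended_gcd(12, 13)
print(g, m, n)
assert 12 * m + 13 * n == 1
```

For numbers as small as 12 and 13 a solution can also be spotted by inspection, but the same routine works for any pair of integers.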
7.61 If (n, n + 3) = 2, then 2 would be a divisor of (n + 3) - n. But
why?
7.63 If n is odd, then n has the form n = 2k + 1 for some integer k.
7.64 Show that every common divisor of m and m + n is a divisor of n.
It will then follow (why?) that (m, m + n) | n.
7.73 The author obtained the following solutions, but could persuade no
one else to check his answer to this laborious problem: The triple (x, y, z)
may be any of the following and no others.
(7, 1, 1) (3, 3, 1)
(5, 2, 1) (2, 2, 2)
(4, 1, 2) (1, 1, 3)
(1, 4, 1)
7.74 No way. It cannot be done.
7.75 If x is the number of new eggs, y the number of fresh eggs, and z the
number of old eggs, then
x + y + z = 100,
and
10x + 2y + z = 200.

If you solve the first equation for z and substitute the result in the second
equation, you obtain
9x + y = 100,

which is no problem to solve. Of the ten positive solutions, only x = 10,
y = 10, z = 80 satisfy the last condition of the problem.

7.76 Use the same sort of simplification as in the previous problem. There
is only one solution.
7.77 There is no need to list the solutions in order to count them; however,
there turn out to be 45 ways in which the check could have been written.
7.81 If there are b brass balls, c copper balls, and s silver balls, then one
obtains the equation
15b + 16c + 17s = 121.

This problem is quite long, and the author is reasonably sure that there is
only one solution. Hint: In that solution, no balls of one type were used.
This is not ruled out as a possible solution.
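Because the numbers involved are small, an exhaustive search settles the question of uniqueness. The sketch below assumes only that b, c, and s are whole numbers that may be zero.

```python
solutions = [(b, c, s)
             for b in range(121 // 15 + 1)
             for c in range(121 // 16 + 1)
             for s in range(121 // 17 + 1)
             if 15 * b + 16 * c + 17 * s == 121]
print(solutions)   # [(7, 1, 0)]
```

The search agrees with the hint: there is exactly one solution, and in it one type of ball is not used at all.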
7.82 You can get an easy proof if you know the following fact: If m > 2
is a whole number, then there exists at least one prime p such that m <
p < 2m. Observe that if

1/2 + 1/3 + ... + 1/n = k,

where k is a whole number, then n must be much larger than k. However,
there is a proof (hard to find) that does not use the above rather advanced
result of number theory.
7.83 If the prime p is of the form 4n + 3, and is the sum of two squares,
then one of the squares must be odd and the other even.
7.84 A small number of cases should be considered.
7.86 Use Exercise 7.48.
7.90 One solution is

7.91 This is not so easy. The author believes that the smallest perimeter
that solves the problem is 480.
7.92 This is easy, and can be answered without much trouble for any whole
number, not just 10.
7.93 There is only one solution.
7.94 This is quite difficult, unless you have found a short cut unknown to the
author. The smallest solutions he obtained are

3^2 + 4^2 = 5^2,
(696)^2 + (697)^2 = (985)^2,
(23,660)^2 + (23,661)^2 = (33,461)^2.

Might this problem have infinitely many solutions?


7.95 This is just a matter of checking a few cases.
7.96 Unlike Exercise 7.94, this is quite easy.
7.100 Use Fermat's Theorem.

CHAPTER 8
8.1 The conditions of the problem indicate that k is positive. Hence N'
is always positive, but as N increases, the value of N' gets closer to zero.
8.2 Since N' is constant, the graph of N(t) will be a straight line in each
case.
8.3 The equation N' = B - D becomes

N' = bN - dN,
so that
N' = (b - d)N.
Since b - d is constant, set k = b - d.
8.4 Note that
N_0/2 = N_0 · e^{kT}

by definition of the half-life T.
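Dividing the half-life relation N_0/2 = N_0 · e^{kT} by N_0 and taking natural logarithms gives k = -(ln 2)/T. The sketch below checks this numerically; the half-life and the initial amount are hypothetical values chosen only for illustration.

```python
import math

def decay_constant(half_life):
    """Solve N0/2 = N0 * exp(k*T) for k; N0 cancels, leaving k = -ln(2)/T."""
    return -math.log(2) / half_life

def population(N0, k, t):
    """Exponential model N(t) = N0 * exp(k*t)."""
    return N0 * math.exp(k * t)

T = 5.0          # hypothetical half-life, arbitrary units
N0 = 1000.0      # hypothetical initial amount
k = decay_constant(T)
print(k, population(N0, k, T), population(N0, k, 2 * T))
```

After one half-life the amount is halved, and after two it is quartered, whatever the starting amount.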


8.6 The graph of N(t) will look very much like one of those shown in Fig.
8.6; that is, taking the square root of the degree of realization term has little
effect on the long-term behavior of N.
8.9 Since the water flows out at the rate V'(t), and V'(t) is proportional to
the pressure at time t, and the pressure then is proportional to V(t) itself
because of the shape of the tank, we obtain the differential equation
V'(t) = kV(t),
where k is a negative constant. Now see Exercise 8.4.
8.10 An equation giving a satisfactory interpretation of the conditions of the
problem would be
$N' = bN \, \frac{M - N}{M} - d,$

where b and d are positive constants.


8.11 You should obtain a stable critical point where the bluegill and redear
lines cross; that is, the two species can coexist.
8.14 Arrows should definitely be drawn on the coordinate axes; it can happen
that some curve of population trend does meet an axis, indicating the dis-
appearance of one species.
8.16 If the three populations are A, B, and C, then one of the three equations
would be
$A' = kA \, \frac{M - A - \alpha B - \beta C}{M},$
where M is the maximum population of population A the pond will support
and $\alpha$ and $\beta$ are positive constants. The behavior of the system could be
examined by means of a graph in the first octant in three-dimensional space;
instead of lines where A' = 0, there would be planes, with A' positive beneath
the plane corresponding to it and negative above. There are a large number
of possibilities for the eventual behavior of the system; for example, one
species might disappear, followed by the coexistence of the remaining two.
8.18 The system is critical for N = M and for N = 0. The former is stable,
the latter unstable.
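
The stability claim can be illustrated numerically. The sketch below assumes the logistic form N' = kN(M - N)/M, which is not restated in this hint, and uses simple Euler steps; the function name `simulate` and the values of k, M, the step size, and the starting populations are all illustrative assumptions.

```python
def simulate(n0, k=0.5, m=100.0, dt=0.01, steps=5000):
    """Euler steps for N' = k*N*(M - N)/M starting from N(0) = n0."""
    n = n0
    for _ in range(steps):
        n += dt * k * n * (m - n) / m
    return n


# A start slightly above N = 0 moves away from 0 (unstable critical point);
# starts on either side of N = M move toward M (stable critical point).
near_zero = simulate(0.1)
below_m = simulate(60.0)
above_m = simulate(140.0)
at_zero = simulate(0.0)  # exactly at the critical point: stays put
print(near_zero, below_m, above_m, at_zero)
```

Under these assumptions every positive starting value ends up close to M, while a population that starts at exactly zero remains there.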
8.20 Compare this exercise with Exercise 8.10.
8.21 Wide variations in population may result in the elimination of one
species from the pond.
8.23 One reasonable set of equations is
$A' = (kA - \beta B) \, \frac{M - A}{M},$

$B' = (vB - \alpha A) \, \frac{C - B}{C},$



where all constants are positive, M and C representing the maximum popula-
tions of A and B, respectively, that can be supported by the pond.
8.28 The qualitative behavior of the solution is unchanged.
8.29 The fact that $\alpha$ and $\beta$ are positive describes the condition that each of the
two species contributes to the success of the other. A culture of yeast and
slime mold on nutrient agar in a large dish will exhibit this sort of behavior.
8.33 You should not spray unless you can spray enough to completely
eliminate aphids.
8.35 Yes; what is the value of $t_0$ in terms of the given constants?
8.37 One could obtain the solution of the differential equation, substitute
enough values of the population at various times in order to evaluate all
unknown constants, and then check the resulting solution with values of the
population at other times.
8.38 See the answer to the previous exercise.

CHAPTER 9
9.2 If and only if a = b.
9.3 Use the same definition.
9.5 Since $p \in [a, b]$, if p is different from a and different from b then the
same straight line contains {a, p} as contains {p, b}.
9.7 Use the previous two exercises if you wish.
9.8 See the next exercise.
9.9 This is Theorem 9.2.
9.10 No; give an example.
9.13 See the material on "proof by induction" in Chapter 7, or see Exercise
1.45.
9.14 No; proof by induction can only show something true "for each natural
number n." However, although it does not follow from Theorem 9.2 and
Exercise 9.12 that the intersection of infinitely many convex sets is convex,
this is nevertheless true.
9.15 To answer the last question of the exercise, you know that either
$A \subset B$ or $B \subset A$; there is no harm in supposing that the sets have been so
named that $A \subset B$.
9.16 Yes, and the proof is the same.

9.17 Yes; try proving it.


9.18 Again, yes.
9.19 Of course; the proof is very simple.
9.20 Consider the collection of all circular regions (including boundary
points) centered at the origin in $E^2$.
9.25 What if A consists of three noncollinear points in $E^2$?
9.27 The first three questions of the exercise have affirmative answers.
9.28 The number of applications of A must be increased by one.
9.30 This exercise turns out to be very useful in some later problems.
9.31 No. Instead of triangular regions, what sort of sets should ~ consist of?
9.32 The answer to the second question is too easy if $\emptyset \in \mathcal{C}$, so try to give
an example in which $\emptyset$ does not belong to $\mathcal{C}$.
9.33 Some candidates for the property might be "being linear" and "con-
taining no straight line segment of length exceeding one."
9.34 It is possible but not necessary; try finding a proof in which the axiom
is used, just for practice.
9.35 It turns out that the only consistent interpretation of $\bigcap \mathcal{C}$ is $E^2$. No
points p of $E^2$ have the property indicated in the exercise, since there are no
sets, and thus no such sets, in $\mathcal{C}$. Try showing that if $\mathcal{C}$ is the empty
collection of subsets of $E^2$, then $\bigcup \mathcal{C} = \emptyset$.
9.41 Yes. Hence our proof is really a disguised proof by induction. See the
Index for more on induction.
9.42 Helly's Theorem does apply here.
9.43 Although Helly's Theorem does not immediately apply, it is still
possible to reach some conclusion by considering the circular regions bounded
by the circles in $\mathcal{C}$.
9.44 Nothing, for the analogue of Helly's Theorem in $E^3$ requires that each
four sets have a point in common. There is a connection with Exercise 9.28
and the version of Exercise 9.30 for $E^3$, a connection which explains why the
number in the theorem must be increased by one.
9.45 This is not easy. See Exercise 9.43.
9.46 See Exercise 9.44 and the answer to it.
9.48 The easiest way to consider cases is to consider how many of the points
are collinear.

9.52 In order to apply Helly's Theorem in $E^3$, the number of sets that in-
tersect must be increased to four. Hence it would be necessary for each set of
four pictures to be visible from some point in the gallery.
9.53 Yes; give an example.
9.55 Yes; one way is to divide $E^2$ into the disjoint convex sets $E^2$ and $\emptyset$.
There are other solutions; find them.
9.56 By Helly's Theorem, it is clear that k < 3.
9.57 Note that the equation of the straight line through (n, 0) with slope n is

$y = nx - n^2$.

9.58 Examine a regular tetrahedron.


9.64 Use the axiom of Section 9.3.
9.66 No; give an example.
9.68 No; give an example.
9.72 This is possible even if the intersection is "connected"; see if you can
find such an example.
9.80 One of the implications is true, the other false.

CHAPTER 10
10.2 Use the fact that each rational number can be expressed in the form
m/n, where m and n are integers and $n \neq 0$.
10.4 No. Why not?
10.5 The decimal expansion given in the text can be so used; how?
10.6 How about 3.14159?
10.10 Every rational number has a finite continued fraction expansion.
This follows from the Euclidean Algorithm (Theorem 7.8).
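
The connection with the Euclidean Algorithm can be made concrete: each division with remainder produces one partial quotient, and the process stops when the remainder reaches zero. The following is a minimal sketch, assuming a positive denominator; the function name `continued_fraction` and the sample fraction 355/113 are illustrative choices, not taken from the text.

```python
def continued_fraction(m, n):
    """Partial quotients of m/n (n > 0), found by repeated division with remainder."""
    quotients = []
    while n != 0:
        q, r = divmod(m, n)   # m = q*n + r with 0 <= r < n
        quotients.append(q)
        m, n = n, r           # same step as the Euclidean Algorithm
    return quotients


example = continued_fraction(355, 113)
print(example)  # [3, 7, 16]
```

Because the remainders are whole numbers that strictly decrease, the loop must end after finitely many steps, which is the reason the expansion is finite.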
10.12 Use the technique of Theorem 10.1.
10.15 8/9.
10.16 327/999. Examine the previous exercise; do you see a pattern?
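
One way to see the pattern, assuming (as the two answers suggest) that these exercises concern the repeating decimals 0.888... and 0.327327..., is that a repeating block of d digits corresponds to the block divided by 10^d - 1. The helper name `repeating_block_to_fraction` below is hypothetical.

```python
from fractions import Fraction


def repeating_block_to_fraction(block):
    """Value of 0.(block)(block)... as a fraction: block / (10^len(block) - 1)."""
    return Fraction(int(block), 10 ** len(block) - 1)


eight = repeating_block_to_fraction("8")      # 0.888...
pattern = repeating_block_to_fraction("327")  # 0.327327...
print(eight, pattern, pattern == Fraction(327, 999))
```

Note that `Fraction` reduces to lowest terms automatically, so the second value is displayed as 109/333, which equals 327/999.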
10.17 Note that if 0.999 999 ... were less than 1, then there would be a
number r such that
0.999 999 ... < r < 1.

10.20 The techniques of Chapter 7 can be used to show that there are
irrational numbers not solutions to any equation of the form
p(x) = 0,
where p(x) is a polynomial with rational coefficients. These are called
transcendental numbers.
10.22 There is a connection with Exercise 3.48.
10.23 Suppose by way of contradiction that $\alpha + r$ is a rational number.
10.25 You can conclude that $b^2 - 4ac$ is the square of an integer.
10.27 The proof is similar to that given in the answer to Exercise 3.42.
10.29 Nothing can be done with either (d) or (e). Why not?
10.32 The sequence has no limit.
10.33 This is not easy.
10.34 Expand the product
$(1 - r)(1 + r + r^2 + r^3 + \cdots + r^n)$.
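
Carried out term by term, the expansion telescopes. The following worked step is a sketch of where the hint leads; the closing formula assumes $r \neq 1$.

```latex
(1 - r)(1 + r + r^2 + \cdots + r^n)
  = (1 + r + r^2 + \cdots + r^n) - (r + r^2 + \cdots + r^{n+1})
  = 1 - r^{n+1},
\qquad\text{so}\qquad
1 + r + r^2 + \cdots + r^n = \frac{1 - r^{n+1}}{1 - r} \quad (r \neq 1).
```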
10.39 See Section 10.5.
10.44 Let $J_1 = I_k$, $J_2 = I_{k+1}$, and so on.
10.45 Treat the case when both real numbers are positive first. Define the
product in the other cases by using absolute values.
10.59 If a set had two least upper bounds, one would have to exceed the
other.
10.61 Clearly, $0 \le m \le 9$. Or is it clear?
10.62 The construction differs in the choice of the first interval.
10.63 What about zero?
10.65 Both (0, 1) and (0, - 1) are solutions of the equation
$x^2 + 1 = 0$.
10.68 For example, show that 1/4 E K.
10.69 The function has the same value at the two numbers.
INDEX

Absolute value, 289
Aleksandrov, A. D., 318
Alexandroff, Paul, 317
American Mathematical Monthly, 107
Antoine, L., 51
Approximations by continued fractions, 63-65
Archimedean property, 303
Area, 21
Art Gallery Theorem, 254
Associative
    law, 82
    operations in real number system, 301

Bach, J. S., 52, 76
Banach, S., 21-23
Beck, Anatole, 318
Beet virus molecule, 142
Beiler, A. H., 218, 316
Benade, Arthur H., 81
Benson, R. V., 279
Bers, Lipman, 317
Birkhoff, Garrett, 106, 316
Bleicher, Michael N., 318
Boltyanskii, V. G., 23, 279
Bolyai
    Farkas, 23-24
    Janos, 24
Bolyai-Gerwin Theorem, 2, 6
Borromean Rings, 25
    generalizations, 29
    formula, 41
    see also Brunnian links
Brunn, H., 29, 51
Brunnian links, 29
    (4, 2)-links, 45-46
    (n, k)-links, 43, 49

Cancellative operations, 95
Cantor, Georg, 177, 182
Cantor-Schroeder-Bernstein Theorem, 164
    applications, 169-172, 176, 181
Cantor Ternary Set, 312
    length, 313
    number of elements, 314
Cardinal numbers, 179
    existence, 180
Center of group, 104
Cohen, Paul J., 182
Cohn-Vossen, S., 143, 317
Commutative
    law, 34
    operation in group, 87, 94
    see also Group, Abelian
Commutator, 45
    k-commutator, 48
Complex number system, 310
Composite numbers, 185
    consecutive, 187
    prime factors, 186
Congruence motions
    of disk, 90
    product, 84-85
    of square, 93
    of tetrahedron, 90
    of triangle, 83-87
Constructions with straightedge and compass, 6, 19, 23
Continued fractions, 59-61
    and batting averages, 66-67
    construction, 75-76
    evaluation, 62-63
    and grade distributions, 67-69
    and irrational numbers, 286
Continuum, 179
Convex hull, 261
    kernel, 261
    polyhedron in Steinitz's Theorem, 116
    sets, intersection of, 258-260
    sets, tower of, 260
Convexity, 255
    generalizations, 276-279
Coset of subgroup, 96
Coxeter, H. S. M., 143, 317
Critical point, 240
Cross-cancellative operation, 95
    semigroup, 105
Crowdis, David G., 318
Crowe, Donald W., 318
Crowell, R. H., 51
Curve, knotted, 30
    on torus, 30
    polygonal, 28
    simple closed, 27-30
    tame, 28
    wild, 28
Cycles in population, 249

Debrunner, H., 23, 51, 279, 317
Decimal expansion, 308-309
Dedekind, Richard, 183
Dedekind Box Principle, 172, 175
    applications, 177
Dedekind infinite, 174
Degree of realization, 230, 232
    in competing populations, 234
Dehn, M., 18
Differential equations, 223
    for competing populations, 234, 245
    in population growth, 224-226
    in radioactive decay, 227
Divisibility, 184
Divisors, 184
    product, 189
    number, 205
    see also Greatest common divisor
Dorrie, Heinrich, 317
Dudley, Underwood, 218
Duplication of cube, 18
Dynkin, E. B., 143

Equidecomposable figures, 3, 22, 23
Equivalence relation, 5, 162
    between nested sequences, 293, 295
Euclidean algorithm, 92, 196, 197, 219
    proof, 197-202
Euler, L., 118, 144
Euler's formula, 108, 118
    proof, 118-122
Exponent of group element, 98

Factorization into primes, 196, 202, 209-210
Fermat, Pierre de, 184
Fermat's Theorem, 218
Fifths, 73
    improving, 77
Five-Color Theorem, 144
Fort, M. K., 51
Fox, R. H., 51
Fraleigh, John B., 318
Frequencies, 70
    of notes on piano, 73
Functions, 154-162
Fundamental Theorem of Arithmetic, 202, 204
    applications, 281
    proof, 202-203

Gause, G. F., 252
Gauss, C. F., 218-219
Gelfond, A., 218
Generator of group, 102
Geodesic dome, 142
Geometric series, 283
Goffman, Caspar, 315
Golden Mean, 62
Greatest common divisor, 205
    computation, 205-206
Greek alphabet, 319
Griffin, Harriett, 218
Group, 92
    Abelian, 102
    associated with curves in space, 38
    center, 104
    cyclic, 102
    examples, 82-92
    of prime order, 102
    uniqueness of identity, 93
Grünbaum, Branko, 143, 279

Hadwiger, H., 18, 23, 279, 317
Half-plane, 273, 275
Hall, M., 106
Harmonics, 70
Hausdorff, Felix, 182
Helly's Theorem, 266, 268
    applications, 270-271, 273, 276
Herstein, I. N., 106
Hilbert, D., 143, 317
Hocking, John G., 317
Homomorphism, 105
Horn, Alfred, 279

Image of homomorphism, 105
    of function, 155-156
Induction Principle, 10, 190
    and well-ordering, 190-193
    applications, 10-11, 49, 193-195, 259
Infinite series, 69-70, 295
Integers, 184
Intermediate fractions, 64
Intervals, 288
    length, 288
    nested sequences of, 293
    sum of, 297
Inverse of group element, 83
    uniqueness of, 93

Jacobson, Nathan, 106
Jakobson, Roman, 317
Jeans, Sir James, 81
Join of point and set, 277

Kernel of homomorphism, 106
    see also Convex
Khinchine, A. Ya., 81
Klee, Victor, 23, 279, 317
Kolmogorov, A. N., 318
Krasnoselskii, M. A., 271
Krasnoselskii's Theorem, 254, 271
Kripke, Bernard, 315
Kurosh, A. G., 106

Lack, D. L., 253
LaGrange, J. L., 107
LaGrange's Theorem, 96
Lavrent'ev, M. A., 318
Law of Quadratic Reciprocity, 218
Law of Small Whole Numbers, 72
L-convexity, 277
Least upper bound, 303
Limit of sequence, 65, 69, 284, 289-291, 294-295
Links, 30, 32
    splittable, 30
    see also Brunnian links
Lobachevsky, N. I., 24
Logarithms, 53-56
Lomont, J. S., 317
Lyusternik, L. A., 143, 279

MacLane, Saunders, 106, 316
Mansfield, M. J., 314
Map on Mobius strip, 117
    on sphere, 116
    on torus, 116
Mobius strip, 50, 117
Moise, E. E., 23
Mordell, L., 218

n-link, 30
    completely splittable, 32
    splittable, 30
    sublink of, 32
Natural numbers, 184
    composite, 185
    prime, 185
Nested sequences of intervals, 293
    equivalence of, 295
Net of polyhedron, 118
Niven, Ivan, 218, 316
Nonmeasurable set, 21

Odum, Eugene, 253
One-to-one correspondence, 156, 160-162
Order of group, 96
    infinite, 96, 99
    of group element, 99
Ordered pair, 159
Ore, Oystein, 143

Parallelogram formed from triangle, 11
    reassembled into rectangle, 11
Partial quotients, 62
Passman, D. S., 106
Polya, George, 192, 318
Polygon, 2, 109
    connected, 109
    edge, 2
    equidecomposable, 3
    vertex, 2
Polyhedron, 113
    edge, 115
    face, 115
    2-connected, 116
    vertex, 115
Primes, 102, 185
Pythagorean right triangles, 217-218
Pythagorean Theorem, 16

Rademacher, Hans, 218
Radioactive decay, 227
Rational numbers, 280
Real numbers, 297
    Archimedean property, 303
    as nondenumerable set, 178
    decimal expansions, 308-309
    order relation, 302
Rectangle formed from parallelogram, 11
    reassembled into square, 12-14
Regular solid, 123
Rudin, Walter, 314

Schoenflies Theorem, 110
Semigroup, 104
Sets, 146
    algebra, 153
    and Venn diagrams, 151
    Cartesian product, 160
    convex, 255
    denumerable, 176
    descriptive definition, 147
    difference, 154
    distributive laws, 152-153
    element, 146
    empty, 150
    equality, 148
    finite, 163
    inclusion, 149
    infinite, 163
    intersection, 150
    listing, 147
    maximal convex, 264
    maximal tower, 263-264
    maximal with respect to a property, 263
    nondenumerable, 177
    nonlinear, 263
    notation, 147-148
    number of elements, 154
    subset, 149
    tower, 260
    union, 149
Sierpinski, W., 23
Sigmoid curve, 232
Singer, I. M., 51
Slobodkin, Lawrence B., 253
Snowflake curve, 19-21
Spivak, Michael, 317
Sprecher, David A., 315
Square formed from rectangle, 12-14
    formed from several squares, 14-16
    reassembled into given polygon, 16-17
Squaring the circle, 18
Stein, Sherman, 144, 318
Steinitz, E., 127, 143
Steinitz's Theorem, 127
    proof, 128-133
Stoll, Robert R., 182, 315, 317
Straight line segment, 255
Subgroup, 95
    improper, 96
    normal, 104
    proper, 96

Tarski, A., 19, 22
Ternary system, 311
Texan rectangle, 7
Thirds, 78
    improving, 81
Thomas, J. M., 144
Thorpe, John A., 51
Tietze, Heinrich, 143, 317
Torus, 30, 116
Transposition, 74
Triangle reassembled into parallelogram, 11
Trisection of angle, 18

Unbounded figure, 6
Uspenskii, V. A., 143

Valentine, F. A., 279
Venn diagrams, 151-152

Wall, H. S., 81, 317
Well-Ordering Axiom, 190
Well-tempering, 76-77
Wheeler, Brandon W., 318
Wilder, R. L., 315, 318
Williams, J. D., 317
Wilson's Theorem, 204, 205
Wolf Interval, 74

Yaglom, Y. A., 279
Young, G. S., 317

Zermelo Axiom, 263
Zuckerman, H. S., 218, 316
