MONTHLY
VOLUME 118, NO. 3 MARCH 2011
NOTES
Lines of Best Fit for the Zeros and for the Critical Points of a Polynomial
Grant Keady
REVIEWS
EDITOR
Daniel J. Velleman
Amherst College
ASSOCIATE EDITORS
William Adkins
Louisiana State University
David Aldous
University of California, Berkeley
Roger Alperin
San Jose State University
Anne Brown
Indiana University South Bend
Edward B. Burger
Williams College
Scott Chapman
Sam Houston State University
Ricardo Cortez
Tulane University
Joseph W. Dauben
City University of New York
Beverly Diamond
College of Charleston
Gerald A. Edgar
The Ohio State University
Gerald B. Folland
University of Washington, Seattle
Sidney Graham
Central Michigan University
Doug Hensley
Texas A&M University
Roger A. Horn
University of Utah
Steven Krantz
Washington University, St. Louis
C. Dwight Lahr
Dartmouth College
Bo Li
Purdue University
Jeffrey Nunemacher
Ohio Wesleyan University
Bruce P. Palka
National Science Foundation
Joel W. Robbin
University of Wisconsin, Madison
Rachel Roberts
Washington University, St. Louis
Judith Roitman
University of Kansas, Lawrence
Edward Scheinerman
Johns Hopkins University
Abe Shenitzer
York University
Karen E. Smith
University of Michigan, Ann Arbor
Susan G. Staples
Texas Christian University
John Stillwell
University of San Francisco
Dennis Stowe
Idaho State University, Pocatello
Francis Edward Su
Harvey Mudd College
Serge Tabachnikov
Pennsylvania State University
Daniel Ullman
George Washington University
Gerard Venema
Calvin College
Douglas B. West
University of Illinois, Urbana-Champaign
EDITORIAL ASSISTANT
Nancy R. Board
NOTICE TO AUTHORS
The MONTHLY publishes articles, as well as notes and other features, about
mathematics and the profession. Its readers span a broad spectrum of mathematical
interests, and include professional mathematicians as well as students of mathematics
at all collegiate levels. Authors are invited to submit articles and notes that bring
interesting mathematical ideas to a wide audience of MONTHLY readers.
The MONTHLY’s readers expect a high standard of exposition; they expect articles to
inform, stimulate, challenge, enlighten, and even entertain. MONTHLY articles are
meant to be read, enjoyed, and discussed, rather than just archived. Articles may be
expositions of old or new results, historical or biographical essays, speculations or
definitive treatments, broad developments, or explorations of a single application.
Novelty and generality are far less important than clarity of exposition and broad
appeal. Appropriate figures, diagrams, and photographs are encouraged.
Notes are short, sharply focused, and possibly informal. They are often gems that
provide a new proof of an old theorem, a novel presentation of a familiar theme, or a
lively discussion of a single issue.
Beginning January 1, 2011, submission of articles and notes is required via the
MONTHLY’s Editorial Manager System. Initial submissions in pdf or LaTeX form can be
sent to the Editor-Elect Scott Chapman at
http://www.editorialmanager.com/monthly
The Editorial Manager System will cue the author for all required information
concerning the paper. Questions concerning submission of papers can be addressed to
the Editor-Elect at monthly@shsu.edu. Authors who use LaTeX are urged to use
article.sty, or a similar generic style, and its standard environments with no custom
formatting. The style of citations for journal articles and books should match that
used on MathSciNet (see http://www.ams.org/mathscinet). Follow the link to Electronic
Publications Information for authors at http://www.maa.org/pubs/monthly.html for
information about figures and files, as well as general editorial guidelines.
Letters to the Editor on any topic are invited. Comments, criticisms, and suggestions
for making the MONTHLY more lively, entertaining, and informative can be forwarded to
the Editor-Elect at monthly@shsu.edu.
The online MONTHLY archive at www.jstor.org is a valuable resource for both authors
and readers; it may be searched online in a variety of ways for any specified
keyword(s). MAA members whose institutions do not provide JSTOR access may obtain
individual access for a modest annual fee; call 800-331-1622.
See the MONTHLY section of MAA Online for current information such as contents of
issues and descriptive summaries of forthcoming articles:
http://www.maa.org/
Proposed problems or solutions should be sent to:
DOUG HENSLEY, MONTHLY Problems
Department of Mathematics
Texas A&M University
3368 TAMU
College Station, TX 77843-3368
In lieu of duplicate hardcopy, authors may submit pdfs to
monthlyproblems@math.tamu.edu.
Advertising Correspondence:
MAA Advertising
1529 Eighteenth St. NW
Washington DC 20036
Phone: (866) 821-1221
Fax: (866) 387-1208
E-mail: advertising@maa.org
Further advertising information can be found online at www.maa.org
Change of address, missing issue inquiries, and other subscription correspondence:
MAA Service Center, maahq@maa.org
All at the address:
The Mathematical Association of America
1529 Eighteenth Street, N.W.
Washington, DC 20036
Recent copies of the MONTHLY are available for purchase through the MAA Service
Center.
maahq@maa.org, 1-800-331-1622
Microfilm Editions: University Microfilms International, Serial Bid coordinator,
300 North Zeeb Road, Ann Arbor, MI 48106.
The AMERICAN MATHEMATICAL MONTHLY (ISSN 0002-9890) is published monthly except
bimonthly June-July and August-September by the Mathematical Association of America
at 1529 Eighteenth Street, N.W., Washington, DC 20036 and Lancaster, PA, and
copyrighted by the Mathematical Association of America (Incorporated), 2011, including
rights to this journal issue as a whole and, except where otherwise noted, rights to
each individual contribution. Permission to make copies of individual articles, in
paper or electronic form, including posting on personal and class web pages, for
educational and scientific use is granted without fee provided that copies are not
made or distributed for profit or commercial advantage and that copies bear the
following copyright notice: [Copyright the Mathematical Association of America 2011.
All rights reserved.] Abstracting, with credit, is permitted. To copy otherwise, or to
republish, requires specific permission of the MAA’s Director of Publications and
possibly a fee. Periodicals postage paid at Washington, DC, and additional mailing
offices. Postmaster: Send address changes to the American Mathematical Monthly,
Membership/Subscription Department, MAA, 1529 Eighteenth Street, N.W., Washington, DC,
20036-1385.
Yueh-Gin Gung and Dr. Charles Y. Hu
Award for 2011 to Joseph A. Gallian for
Distinguished Service to Mathematics
Barbara Faires
Joe Gallian.
The two themes that run through Joseph A. Gallian’s service to mathematics are (a) en-
couraging young mathematicians and helping them to develop successful careers and
(b) communicating mathematics to the widest possible audience. He was one of the
early proponents of undergraduates conducting mathematical research, and his REU
at Duluth, which began in 1977, is widely regarded as the premier REU. The quality
of the work at Joe’s REU is evidenced by the 160 papers by participants that grew out
of their REU work. These papers have appeared in such journals as Crelle’s Journal,
Journal of Algebra, Journal of Combinatorial Theory, Discrete Mathematics, Applied
Discrete Mathematics, Annals of Discrete Mathematics, and Journal of Graph Theory.
The REU, along with Joe’s continuing contact with its participants, makes an important
contribution to developing the next generations of mathematicians. Participants have
included some prominent mathematicians, whose careers this REU helped to shape.
Joe is also an inspiration to a generation of mathematicians who involve students in
high-quality undergraduate research in mathematics. Not only is Joe successful with
his own REU, but he is generous with his time and advice to help others to set up
REUs. In 2002, Joe was recognized by the Council on Undergraduate Research with
their Fellow Award, given to members who have demonstrated sustained excellence in
research with undergraduates.
doi:10.4169/amer.math.monthly.118.03.195
March 2011] YUEH-GIN GUNG AND DR. CHARLES Y. HU AWARD FOR 2011 195
Project NExT is the MAA’s widely acclaimed professional development opportu-
nity for new or recent Ph.D.’s. Joe has been involved with Project NExT since its first
summer in 1994 when he gave the closing address. The address was so extraordi-
narily successful that he has given each subsequent closing address. Later, when Joe
became co-director of Project NExT in 1998, he assumed primary responsibility for
many parts of the program, participated in developing the workshop program, and of-
ten drafted articles for Focus and reports to the Board of Governors. His boundless
energy, enthusiasm, mathematical sophistication, and academic savvy have made him
the perfect person to work with the hundreds of new mathematics faculty who have
become Project NExT Fellows.
Joe’s Project NExT work illustrates that his service to mathematics ranges across
all the levels of work that needs to be done. Not only does Joe participate in long-
range planning and vision discussions, but he also does the small tasks that keep a
program functioning successfully. As with his REU, Joe does not treat Project NExT
as a job for which he has specified, narrowly defined duties, but as a program to which
he generously gives his time and to whose success he is committed.
In his talks, Joe combines thorough preparation, imaginative presentation, and a
showman’s flair with solid mathematical content. An indication of Joe’s success at
communicating mathematics was a standing ovation from the audience at his Pi Mu
Epsilon Frame Lecture. This audience included high school students as well as
professors, and all understood and were excited by Joe’s talk. Joe has given 24 addresses
at national meetings, 65 at MAA Section meetings, and over 200 at colleges and uni-
versities. Joe also communicates mathematics beyond the mathematical community.
Articles about his work have appeared in twenty-five news outlets in the United States
as well as in Europe and India. Four of these were in Science News and one in the New
York Times. In addition, he has more than 100 articles in mathematical journals
and other publications, including Math Horizons, the Macmillan Encyclopedia of
Chemistry, and the Mathematical Intelligencer. Joe Gallian was named by a Duluth
newspaper as one of the “100 Great Duluthians of the 20th Century.”
Joe has served professional organizations and the mathematical community at
large. Joe has been national coordinator for Mathematics Awareness Month (2003 and
2010); he has served on more than 50 national committees, chairing at least 10 of
them; he was a Council on Undergraduate Research Councilor for 11 years, serving
as chair of the mathematics and computer science division for part of that time; he has
served as associate editor for Mathematics Magazine and the American Mathematical
Monthly; and he has been director or co-director of five conferences. Joe has refereed
for 40 journals and is a reviewer for NSF, the Research Council of Canada, and the
Australian Research Council. Those who work with Joe know that he is always an
active contributor to a project in which he is involved—he is efficient and he moves
the project along.
Joe Gallian’s many awards and honors attest to his passion to serve undergraduates,
professional organizations, and the mathematical community. He has been honored
with teaching awards from the University of Minnesota Duluth, the Carnegie Founda-
tion for the Advancement of Teaching, and the Mathematical Association of America
(Haimo Award). Joe has received the MAA’s Trevor Evans and Carl B. Allendoerfer
Awards and has been an MAA Polya Lecturer. Joe served as second vice president and
then president of the MAA.
Joe completed his undergraduate degree at Slippery Rock University, M.A. at the
University of Kansas, and Ph.D. at Notre Dame. He is a professor of mathematics and
statistics at the University of Minnesota Duluth, where he was recognized in December
2009 with the Chancellor’s Award for Distinguished Research.
Was Cantor Surprised?
Fernando Q. Gouvêa
Abstract. We look at the circumstances and context of Cantor’s famous remark, “I see it, but
I don’t believe it.” We argue that, rather than denoting astonishment at his result, the remark
pointed to Cantor’s worry about the correctness of his proof.
Mathematicians love to tell each other stories. We tell them to our students too, and
they eventually pass them on. One of our favorites, and one that I heard as an under-
graduate, is the story that Cantor was so surprised when he discovered one of his the-
orems that he said “I see it, but I don’t believe it!” The suggestion was that sometimes
we might have a proof, and therefore know that something is true, but nevertheless still
find it hard to believe.
That sentence can be found in Cantor’s extended correspondence with Dedekind
about ideas that he was just beginning to explore. This article argues that what Can-
tor meant to convey was not really surprise, or at least not the kind of surprise that
is usually suggested. Rather, he was expressing a very different, if equally familiar,
emotion. In order to make this clear, we will look at Cantor’s sentence in the context
of the correspondence as a whole.
Exercises in myth-busting are often unsuccessful. As Joel Brouwer says in his poem
“A Library in Alexandria,”
Our only hope, then, lies in arguing not only that the standard story is false, but also
that the real story is more interesting.
1. THE SURPRISE. The result that supposedly surprised Cantor was the fact that
sets of different dimension could have the same cardinality. Specifically, Cantor
showed (of course, not yet using this language) that there was a bijection between the
interval I = [0, 1] and the n-fold product I^n = I × I × · · · × I.
There is no doubt, of course, that this result is “surprising,” i.e., that it is counter-
intuitive. In fact Cantor said so explicitly, pointing out that he had expected something
different. But the story has grown in the telling, and in particular Cantor’s phrase about
seeing but not believing has been read as expressing what we usually mean when we
see something happen and exclaim “Unbelievable!” What we mean is not that we
actually do not believe, but that we find what we know has happened to be hard to
believe because it is so unusual, unexpected, surprising. In other words, the idea is that
Cantor felt that the result was hard to believe even though he had a proof. His phrase
has been read as suggesting that mathematical proof may engender rational certainty
while still not creating intuitive certainty.
The story was then co-opted to demonstrate that mathematicians often discover
things that they did not expect or prove things that they did not actually want to prove.
For example, here is William Byers in How Mathematicians Think:
doi:10.4169/amer.math.monthly.118.03.198
Did Cantor’s comment suggest that he found it hard to believe his own theorem
even after he had proved it? Byers was by no means the first to say so.
Many mathematicians thinking about the experience of doing mathematics have
found Cantor’s phrase useful. In his preface to the original (1937) publication of the
Cantor-Dedekind correspondence, J. Cavaillès already called attention to the phrase:
Notice, however, that Cavaillès is still focused on the description of the result as
“surprising” rather than on the issue of Cantor’s psychology. It was probably Jacques
Hadamard who first connected the phrase to the question of how mathematicians think,
and so in particular to what Cantor was thinking. In his famous Essay on the Psychol-
ogy of Invention in the Mathematical Field, first published in 1945 (only eight years
after [14]), Hadamard is arguing about Newton’s ideas:
2. THE MAIN CHARACTERS. Our story plays out in the correspondence between
Richard Dedekind and Georg Cantor during the 1870s. It will be important to know
something about each of them.
Richard Dedekind was born in Brunswick on October 6, 1831, and died in the same
town, now part of Germany, on February 12, 1916. He studied at the University of
Göttingen, where he was a contemporary and friend of Bernhard Riemann and where
he heard Gauss lecture shortly before the old man’s death. After Gauss died, Lejeune
Dirichlet came to Göttingen and became Dedekind’s friend and mentor.
Dedekind was a very creative mathematician, but he was not particularly ambitious.
He taught in Göttingen and in Zurich for a while, but in 1862 he returned to his home
town. There he taught at the local Polytechnikum, a provincial technical university. He
lived with his brother and sister and seemed uninterested in offers to move to more
prestigious institutions. See [1] for more on Dedekind’s life and work.
Our story will begin in 1872. The first version of Dedekind’s ideal theory had ap-
peared as Supplement X to Dirichlet’s Lectures in Number Theory (based on actual
lectures by Dirichlet but entirely written by Dedekind). Also just published was one
of his best known works, “Stetigkeit und Irrationalzahlen” (“Continuity and Irrational
Numbers”; see [7]; an English translation is included in [5]). This was his account of
how to construct the real numbers as “cuts.” He had worked out the idea in 1858, but
published it only 14 years later.
1 Cavaillès misquotes Cantor’s phrase as “je le vois mais je ne le crois point.”
Allow me to put a question to you. It has a certain theoretical interest for me, but
I cannot answer it myself; perhaps you can, and would be so good as to write me
about it. It is as follows.
Take the totality of all positive whole-numbered individuals n and denote it
by (n). And imagine, say, the totality of all positive real numerical quantities x
and designate it by (x). The question is simply, Can (n) be correlated to (x) in
such a way that to each individual of the one totality there corresponds one and
only one of the other? At first glance one says to oneself no, it is not possible, for
(n) consists of discrete parts while (x) forms a continuum. But nothing is gained
by this objection, and although I incline to the view that (n) and (x) permit no
one-to-one correlation, I cannot find the explanation which I seek; perhaps it is
very easy.
Cantor first concedes that perhaps it is not that interesting, then immediately points
out an application that was sure to interest Dedekind! In fact, Dedekind’s notes indi-
cate that it worked: “But the opinion I expressed that the first question did not deserve
too much effort was conclusively refuted by Cantor’s proof of the existence of tran-
scendental numbers.” [8, p. 848]
These two letters are fairly typical of the epistolary relationship between the two
men: Cantor is deferential but is continually coming up with new ideas, new questions,
new proofs; Dedekind’s role is to judge the value of the ideas and the correctness of
the proofs. The very next letter, from December 7, 1873, contains Cantor’s first proof
of the uncountability of the real numbers. (It was not the “diagonal” argument; see [4]
or [9] for the details.)
As for the question with which I have recently occupied myself, it occurs to me
that the same train of thought also leads to the following question:
Can a surface (say a square including its boundary) be one-to-one correlated
to a line (say a straight line including its endpoints) so that to every point of the
surface there corresponds a point of the line, and conversely to every point of the
line there corresponds a point of the surface?
It still seems to me at the moment that the answer to this question is very
difficult—although here too one is so impelled to say no that one would like to
hold the proof to be almost superfluous.
Cantor’s letters indicate that he had been asking others about this as well, and that
most considered the question just plain weird, because it was “obvious” that sets of
different dimensions could not be correlated in this way. Dedekind, however, seems to
have ignored this question, and the correspondence went on to other issues. On May
18, 1874, Cantor reminded Dedekind of the question, and seems to have received no
answer.
The next letter in the correspondence is from May, 1877. The correspondence seems
to have been reignited by a misunderstanding of what Dedekind meant by “the essence
of continuity” in [7]. On June 20, 1877, however, Cantor returns to the question of
bijections between sets of different dimensions, and now proposes an answer:
Significantly, Cantor’s formulation of the question had changed. Rather than asking
whether there is a bijection, he posed the question of finding a bijection. This is, of
course, because he believed he had found one. By this point, then, Cantor knows the
right answer. It remains to give a proof that will convince others. He goes on to explain
his idea for that proof, working with the ρ-fold product of the unit interval with itself,
but for our purposes we can consider only the case ρ = 2.
The proof Cantor proposed is essentially this: take a point (x, y) in [0, 1] × [0, 1],
and write out the decimal expansions of x and y:
x = 0.abcde . . .
y = 0.αβγδ . . .
Some real numbers have more than one decimal expansion. In that case, we always
choose the expansion that ends in an infinite string of 9s. Cantor’s idea is to map (x, y)
to the point z ∈ [0, 1] given by
z = 0.aαbβcγdδe . . .
Since we can clearly recover x and y from the decimal expansion of z, this gives the
desired correspondence.
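The interleaving step can be sketched in a few lines of Python. This is only an illustration on finite digit prefixes, not Cantor's own formulation: the function names and the sample digit strings are assumptions introduced here, and a finite prefix can suggest the behavior of the infinite expansions but cannot prove anything about them.

```python
def interleave(x_digits, y_digits):
    """Interleave two equal-length digit strings as a, alpha, b, beta, ..."""
    if len(x_digits) != len(y_digits):
        raise ValueError("digit strings must have the same length")
    return "".join(a + b for a, b in zip(x_digits, y_digits))


def deinterleave(z_digits):
    """Recover the two digit strings: odd-position digits go to x, even to y."""
    return z_digits[0::2], z_digits[1::2]


# Hypothetical sample prefixes: x = 0.31415926..., y = 0.27182818...
x = "31415926"
y = "27182818"
z = interleave(x, y)
print("z = 0." + z)      # z = 0.3217411852982168
print(deinterleave(z))   # ('31415926', '27182818')
```

Recovering x and y from z is what makes the map one-to-one. Whether every z actually arises from some pair (x, y) is a separate question.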
Dedekind immediately noticed that there was a problem. On June 22, 1877 (one
cannot fail to be impressed with the speed of the German postal service!), he wrote
back pointing out a slight problem “which you will perhaps solve without difficulty.”
He had noticed that the function Cantor had defined, while clearly one-to-one, was not
onto. (Of course, he did not use those words.) Specifically, he pointed out that such
numbers as
z = 0.120101010101 . . .
did not correspond to any pair (x, y), because the only possible value for x is
0.100000 . . . , which is disallowed by Cantor’s choice of decimal expansion. He
was not sure if this was a big problem, adding “I do not know if my objection goes to
the essence of your idea, but I did not want to hold it back.”
Of course, the problem Dedekind noticed is real. In fact, there are a great many real
numbers not in the image, since we can replace the ones that separate the zeros with
any sequence of digits. The image of Cantor’s map is considerably smaller than the
whole interval.
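Dedekind's counterexample can be checked on a finite prefix with a short Python sketch. The helper names are assumptions made for this illustration, and the check only looks at the first 20 digits, so it illustrates the objection rather than proving it.

```python
def interleave(x_digits, y_digits):
    """Interleave two equal-length digit strings as a, alpha, b, beta, ..."""
    return "".join(a + b for a, b in zip(x_digits, y_digits))


def deinterleave(z_digits):
    """Split z into the digits in odd positions (x) and even positions (y)."""
    return z_digits[0::2], z_digits[1::2]


# First 20 digits of Dedekind's z = 0.120101010101...
z = "12" + "01" * 9
x, y = deinterleave(z)
print("x = 0." + x)   # x = 0.1000000000  (would have to end in zeros)
print("y = 0." + y)   # y = 0.2111111111

# Under Cantor's convention 0.1 must be written 0.0999..., and interleaving
# that expansion with the same y produces a different number.
x_allowed = "0" + "9" * 9
print("0." + interleave(x_allowed, y))   # 0.02919191919191919191
print(interleave(x_allowed, y) == z)     # False
```

The only candidate for x is the terminating expansion 0.1000..., which the convention forbids, so on this evidence z has no preimage.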
Cantor’s first response was a postcard sent the following day. (Can one envision him
reading the letter at the post office and immediately dispatching a postcard back?) He
acknowledged the error and suggested a solution:
Alas, you are entirely correct in your objection; but happily it concerns only the
proof, not the content. For I proved somewhat more than I had realized, in that I
bring a system x_1, x_2, . . . , x_ρ of unrestricted real variables (that are ≥ 0 and ≤ 1)
into one-to-one relationship with a variable y that does not assume all values of
This is a remarkable response. It suggests that Cantor was very confident that his
result was true. This confidence was due to the fact that Cantor was already thinking
in terms of what later became known as “cardinality.” Specifically, he expects that the
existence of a one-to-one mapping from one set A to another set B implies that the
size of A is in some sense “less than or equal to” that of B.
Cantor’s proof shows that the points of the square can be put into bijection with a
subset of the interval. Since the interval can clearly be put into bijection with a subset
of the square, this strongly suggests that both sets of points “are the same size,” or,
as Cantor would have said it, “have the same power.” All we need is a proof that the
“powers” are linearly ordered in a way that is compatible with inclusions.
That the cardinals are indeed ordered in this way is known today as the Schroeder-
Bernstein theorem. The postcard shows that Cantor already “knew” that the Schroeder-
Bernstein theorem should be true. In fact, he seems to implicitly promise a proof of
that very theorem. He was not able to find such a proof, however, then or (as far as I
know) ever.
His fuller response, sent two days later on June 25, contained instead a completely
different, and much more complicated, proof of the original theorem.
I sent you a postcard the day before yesterday, in which I acknowledged the
gap you discovered in my proof, and at the same time remarked that I am able
to fill it. But I cannot repress a certain regret that the subject demands more
complicated treatment. However, this probably lies in the nature of the subject,
and I must console myself; perhaps it will later turn out that the missing portion
of that proof can be settled more simply than is at present in my power. But since
I am at the moment concerned above all to persuade you of the correctness of
my theorem . . . I allow myself to present another proof of it, which I found even
earlier than the other.
Notice that what Cantor is trying to do here is to convince Dedekind that his theorem
is true by presenting him a correct proof.² There is no indication that Cantor had any
doubts about the correctness of the result itself. In fact, as we will see, he says so
himself.
Let’s give a brief account of Cantor’s proof; to avoid circumlocutions, we will ex-
press most of it in modern terms. Cantor began by noting that every real number x
between 0 and 1 can be expressed as a continued fraction
\[
x = \cfrac{1}{a + \cfrac{1}{b + \cfrac{1}{c + \cdots}}}
\]
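As a small illustration of such an expansion, the following Python sketch computes the partial quotients a, b, c, . . . for a rational x in (0, 1) using exact arithmetic. The function names are assumptions introduced here. Rational inputs give expansions that terminate; the infinite expansions relevant to Cantor's argument belong to irrational numbers, which this sketch does not handle.

```python
from fractions import Fraction


def partial_quotients(x, max_terms=20):
    """Return a, b, c, ... with x = 1/(a + 1/(b + 1/(c + ...))) for 0 < x < 1."""
    x = Fraction(x)
    if not 0 < x < 1:
        raise ValueError("x must lie strictly between 0 and 1")
    quotients = []
    while x != 0 and len(quotients) < max_terms:
        inverse = 1 / x
        a = inverse.numerator // inverse.denominator  # integer part of 1/x
        quotients.append(a)
        x = inverse - a                               # fractional part of 1/x
    return quotients


def from_quotients(quotients):
    """Rebuild x from its partial quotients, working from the innermost term out."""
    value = Fraction(0)
    for a in reversed(quotients):
        value = 1 / (a + value)
    return value


print(partial_quotients(Fraction(7, 16)))   # [2, 3, 2]
print(from_quotients([2, 3, 2]))            # 7/16
```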
2 Cantor claimed he had found this proof before the other. I find this hard to believe. In fact, the proof
looks very much like the result of trying to fix the problem in the first proof by replacing (nonunique) decimal
expansions with (unique) continued fraction expansions.
A number y that can assume all the values of the interval (0 . . . 1) with the soli-
tary exception of the value 0 can be correlated one-to-one with a number x that
takes on all values of the interval (0 . . . 1) without exception.
In other words, he claimed that there was a bijection between the half-open interval
(0, 1] and the closed interval [0, 1], and that “successive application” of this fact would
finish the proof. In the actual application he would need the intervals to be open on the
right, so, as we will see, he chose a bijection that mapped 1 to itself.
Cantor did not say exactly what kind of “successive application” he had in mind,
but what he says in a later letter suggests it was this: we have the interval [0, 1] minus
the sequence of the η_k. We want to “put back in” the η_k, one at a time. So we leave
the interval [0, η_1) alone, and look at (η_1, η_2). Applying the lemma, we construct a
bijection between that and [η_1, η_2). Then we do the same for (η_2, η_3) and so on. Putting
together these bijections produces the bijection we want.
Finally, it remained to prove the lemma, that is, to construct the bijection from [0, 1]
to (0, 1]. Modern mathematicians would probably do this by choosing a sequence x_n
in (0, 1), mapping 0 to x_1 and then every x_n to x_{n+1}. This “Hilbert hotel” idea was still
some time in the future, however, even for Cantor. Instead, Cantor chose a bijection
that could be represented visually, and simply drew its graph. He asked Dedekind to
consider “the following peculiar curve,” which we have redrawn in Figure 1 based on
the photograph reproduced in [4, p. 63].
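As an aside, the modern “Hilbert hotel” construction mentioned above can be sketched in Python. The particular sequence 1/2, 1/3, 1/4, . . . and the function names are assumptions chosen for this illustration; any sequence of distinct points in (0, 1) would serve. This is not Cantor's curve, only the shift that a modern reader might use in its place.

```python
from fractions import Fraction


def shift(x):
    """Bijection from [0, 1] onto (0, 1]: 0 -> 1/2, 1/n -> 1/(n+1) for n >= 2."""
    x = Fraction(x)
    if not 0 <= x <= 1:
        raise ValueError("x must lie in [0, 1]")
    if x == 0:
        return Fraction(1, 2)                   # 0 moves to the first term
    if x.numerator == 1 and x.denominator >= 2:
        return Fraction(1, x.denominator + 1)   # each term moves to the next
    return x                                    # every other point stays put


def unshift(y):
    """Inverse map from (0, 1] back onto [0, 1]."""
    y = Fraction(y)
    if not 0 < y <= 1:
        raise ValueError("y must lie in (0, 1]")
    if y == Fraction(1, 2):
        return Fraction(0)
    if y.numerator == 1 and y.denominator >= 3:
        return Fraction(1, y.denominator - 1)
    return y


print(shift(0), shift(Fraction(1, 2)), shift(Fraction(2, 3)), shift(1))
# 1/2 1/3 2/3 1
```

Like the bijection Cantor chose, this one maps 1 to itself, because 1 is not a term of the sequence.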
Such a picture requires some explanation, and Cantor provided it. The domain
has been divided by a geometric progression, so b = 1/2, b_1 = 3/4, and so on;
a = (0, 1/2), a′ = (1/2, 3/4), etc. The point C is (1, 1). The points d′ = (1/2, 1/2),
d″ = (3/4, 3/4), etc. give the corresponding subdivision of the main diagonal.
The curve consists of infinitely many parallel line segments ab, a′b′, a″b″ and
of the point c. The endpoints b, b′, b″, . . . are not regarded as belonging to the
curve.
The stipulation that the segments are open at their lower endpoints means that 0 is not
in the image. This proves the lemma, and therefore the proof is finished.
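[Figure 1. Cantor’s curve, with labeled points a, d, b, P, and O; the points b, b_1, b_2, b_3, b_4 mark the subdivision of the domain.]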
Cantor did not even add that last comment. As soon as he had explained his curve, he
moved on to make extensive comments on the theorem and its implications. He turns
on its head the objection that various mathematicians made to his question, namely
that it was “obvious” from geometric considerations that the number of variables is
invariant:
For several years I have followed with interest the efforts that have been made,
building on Gauss, Riemann, Helmholtz, and others, towards the clarification of
all questions concerning the ultimate foundations of geometry. It struck me that
all the important investigations in this field proceed from an unproven presuppo-
sition which does not appear to me self-evident, but rather to need a justification.
I mean the presupposition that a ρ-fold extended continuous manifold needs ρ
independent real coordinates for the determination of its elements, and that for
a given manifold this number of coordinates can neither be increased nor de-
creased.
This presupposition became my view as well, and I was almost convinced of
its correctness. The only difference between my standpoint and all the others
was that I regarded that presupposition as a theorem which stood in great need
of a proof; and I refined my standpoint into a question that I presented to several
colleagues, in particular at the Gauss Jubilee in Göttingen. The question was the
following:
“Can a continuous structure of ρ dimensions, where ρ > 1, be related one-
to-one with a continuous structure of one dimension so that to each point of the
former there corresponds one and only one point of the latter?”
Most of those to whom I presented this question were extremely puzzled that
I should ask it, for it is quite self-evident that the determination of a point in an
extension of ρ dimensions always needs ρ independent coordinates. But who-
ever penetrated the sense of the question had to acknowledge that a proof was
needed to show why the question should be answered with the “self-evident”
no. As I say, I myself was one of those who held it for the most likely that the
5. “JE LE VOIS . . . ” So now Dedekind had a lot to digest. The interleaving argu-
ment is not problematic in this case, and the existence of a bijection between the
rationals and the increasing sequence η_k had been established in 1872. But there were at
least two sticky points in Cantor’s letter.
First, there is the matter of what kind of “successive application” of the lemma
Cantor had in mind. Whatever it was, it would seem to involve constructing a bijection
by “putting together” an infinite number of functions. One can easily get in trouble.
For example, here is an alternative reading of what Cantor had in mind. Instead
of applying the lemma to the interval (η1 , η2 ), we could apply it to (0, η1 ) to put it
into bijection with (0, η1 ]. So now we have “put η1 back in” and we have a bijection
between [0, 1] − {η1 , η2 , η3 , . . . } and [0, 1] − {η2 , η3 , . . . }.
Now repeat: use the lemma on (0, η2 ) to make a bijection to (0, η2 ]. So we have “put
η2 back in.” If we keep doing that, we presumably get a bijection from (0, 1) minus
the ηk to all of (0, 1).
But do we? What is the image of, say, (1/3)η1? It is not fixed under any of our functions.
To determine its image in [0, 1], we would need to compose infinitely many functions,
and it’s not clear how to do that. If we manage to do it with some kind of limiting
process, then it is no longer clear that the overall function is a bijection.
The interpretation Cantor probably intended (and later stated explicitly) yields a
workable argument because the domains of the functions are disjoint, so it is clear
where to map any given point. But since Cantor did not indicate his argument in this
letter, one can imagine Dedekind hesitating. In any case, at this point in history the
idea of constructing a function out of infinitely many pieces would have been both
new and worrying.
The second sticky point was Cantor’s “application” of his theorem to undermine
the foundations of geometry. This is, of course, the sort of thing one has to be careful
about. And it is clear, from Dedekind’s eventual response to Cantor, that it concerned
him.
Dedekind took longer than usual to respond. Having already given one wrong proof,
Cantor was anxious to hear a “yes” from Dedekind, and so he wrote again on June 29:
3 The original reads “. . . bis ich vor ganz kurzer Zeit durch ziemlich verwickelte Gedankereihen zu der
Ueberzeugung gelangte, dass jene Frage ohne all Einschränkung zu bejahen ist.” Note Cantor’s Überzeugung:
conviction, belief, certainty.
So here is the phrase. The letter is, of course, in German, but the famous “I see it,
but I don’t believe it” is in French.4 Seen in its context, the issue is clearly not that
Cantor was finding it hard to believe his result. He was confident enough about that
to think he had rocked the foundations of the geometry of manifolds. Rather, he felt
a need for confirmation that his proof was correct. It was his argument that he saw
but had trouble believing. This is confirmed by the rest of the letter, in which Cantor
spelled out in detail the most troublesome step, namely, how to “successively apply”
his lemma to construct the final bijection.
So the famous phrase does not really provide an example of a mathematician having
trouble believing a theorem even though he had proved it. Cantor, in fact, seems to have
been confident [überzeugt!] that his theorem was true, as he himself says. He had in
hand at least two arguments for it: the first argument, using the decimal expansion,
required supplementation by a proof of the Schroeder-Bernstein theorem, but Cantor
was quite sure that this would eventually be proved. The second argument was correct,
he thought, but its complicated structure might have allowed something to slip by him.
He knew that his theorem was a radically new and surprising result—it would cer-
tainly surprise others!—and thus it was necessary that the proof be as solid as possible.
The earlier error had given Cantor reason to worry about the correctness of his argu-
ment, leaving Cantor in need of his friend’s confirmation before he would trust the
proof.
Cantor was, in fact, in a position much like that of a student who has proposed
an argument, but who knows that a proof is an argument that convinces his teacher.
Though no longer a student, he knows that a proof is an argument that will convince
others, and that in Dedekind he had the perfect person to find an error if one were
there. So he saw, but until his friend’s confirmation he did not believe.
6. WHAT CAME NEXT. So why did Dedekind take so long to reply? From the
evidence of his next letter, dated July 2, it was not because he had difficulty with the
proof. His concern, rather, was Cantor’s challenge to the foundations of geometry.
The letter opens with a sentence clearly intended to allay Cantor’s fears: “I have
examined your proof once more, and I have discovered no gap in it; I am quite certain
that your interesting theorem is correct, and I congratulate you on it.” But Dedekind
did not accept the consequences Cantor seemed to find:
c’est croire,” or for some other reason. I do not believe the phrase was already proverbial.
REFERENCES
1. K.-R. Biermann, Dedekind, in Dictionary of Scientific Biography, C. C. Gillispie, ed., Scribners, New
York, 1970–1981.
Abstract. Legendre was the first to state the law of quadratic reciprocity in the form in which
we know it and he was able to prove it in some but not all cases, with the first complete proof
being given by Gauss. In this paper we trace the evolution of Legendre’s work on quadratic
reciprocity in his four great works on number theory.
As is well known, Adrien-Marie Legendre (1752–1833) was the first to state the law
of quadratic reciprocity in the form in which we know it (though an equivalent result
had earlier been conjectured by Euler), and he was able to prove it in some but not
all cases, with the first complete proof being given by Gauss [3]. In this paper we
trace the evolution of Legendre’s work on quadratic reciprocity in his four great works
[10, 11, 12, 13] on number theory. These works span a 45 year period in Legendre’s
life, dating from 1785, 1797, 1808, and 1830 respectively.
Before beginning with our analysis here, we call the reader’s attention to several
other relevant works. [15] overlaps with our work here, though it has a somewhat
different focus. [2] is a brief survey, and [14] is an extended treatment of the early
history of reciprocity laws. A highly readable account of the development of number
theory around this era can be found in [9], which has excerpts from original works of
Euler, Legendre, Gauss, and others, translated into English.
In this paper, we will use Legendre’s language to the extent possible. In particular,
we will not use the terms quadratic residue/nonresidue or the notion of congruence in
the body of this article, as these were never used by Legendre.
We begin with Legendre’s 1785 paper [10]. In Article I of that paper he proves a
result due originally to Euler:
Theorem A. Let $c$ be an odd prime and let $d$ be any integer not divisible by $c$. Then
$d^{c-1} - 1$ is divisible by $c$.
Furthermore, $c$ divides the formula $x^2 + dy^2$ (i.e., there are integers $x$ and $y$ not
divisible by $c$ with $x^2 + dy^2$ divisible by $c$) if and only if $(-d)^{(c-1)/2}$ leaves a remainder
of 1 when divided by $c$; otherwise $(-d)^{(c-1)/2}$ leaves a remainder of $-1$ when divided
by $c$. If $-c/2 < d < c/2$, each possibility occurs for $(c-1)/2$ values of $d$.1
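Theorem A can be checked by brute force for small primes. The following is only an illustrative sketch, not anything from Legendre's paper: the function names `divides_formula`, `euler_remainder`, and `check_theorem_a` are hypothetical, and the search over $x, y$ is restricted to nonzero remainders modulo $c$, which suffices because only remainders matter.

```python
def divides_formula(c, d):
    """True if some x, y not divisible by c make x^2 + d*y^2 divisible by c."""
    return any((x * x + d * y * y) % c == 0
               for x in range(1, c) for y in range(1, c))


def euler_remainder(c, d):
    """Remainder of (-d)^((c-1)/2) on division by c, reported as +1 or -1."""
    r = pow(-d, (c - 1) // 2, c)  # three-argument pow reduces modulo c
    return -1 if r == c - 1 else r


def check_theorem_a(c):
    """Check every claim of Theorem A for the odd prime c.

    Returns (number of d with remainder +1, number of d with remainder -1),
    where d ranges over the nonzero integers with -c/2 < d < c/2.
    """
    half = (c - 1) // 2
    plus = minus = 0
    for d in range(-half, half + 1):
        if d == 0:
            continue
        assert pow(d, c - 1, c) == 1          # d^(c-1) - 1 is divisible by c
        r = euler_remainder(c, d)
        assert r in (1, -1)
        assert divides_formula(c, d) == (r == 1)
        if r == 1:
            plus += 1
        else:
            minus += 1
    return plus, minus
```

Calling `check_theorem_a(c)` for a small odd prime `c` should return the pair `((c - 1) // 2, (c - 1) // 2)`, matching the last sentence of the theorem.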
We follow Legendre’s notation throughout this paper and let a and A be distinct
primes of the form 4x + 1 and b and B be distinct primes of the form 4x − 1 (or
4x + 3; Legendre used both but preferred 4x − 1).
In Article IV he rewrites the above conclusions as $(-d)^{(c-1)/2} = 1$ or $(-d)^{(c-1)/2} = -1$,
with the convention here, as he explicitly states elsewhere, that this is true “omitting
multiples of $c$.” With this convention he then states the following 8 theorems:
doi:10.4169/amer.math.monthly.118.03.210
1 Note that “$c$ divides the formula $x^2 + dy^2$” if and only if $-d$ is a quadratic residue (mod $c$). To see this,
first suppose that $-d$ is a quadratic residue (mod $c$), and let $x$ be an integer with $x^2 \equiv -d$ (mod $c$). Then
$x^2 + d \cdot 1^2 \equiv 0$ (mod $c$). Conversely, if $x^2 + dy^2 \equiv 0$ (mod $c$), let $\bar{y}$ be an integer with $y\bar{y} \equiv 1$ (mod $c$). Then
$x^2\bar{y}^2 + d \equiv 0$ (mod $c$), so $-d \equiv (x\bar{y})^2$ (mod $c$).
Note that Theorems I and II are equivalent (being contrapositives of each other) as
are Theorems III and IV, and Theorems V and VI, leaving five essentially different
cases. Legendre makes this observation in the course of his proofs.
Legendre then proceeds to (attempt to) prove these theorems. As he observes, he is
successful in unconditionally proving Theorems I, II, and VII, but for the remaining
cases his proof is conditional on an auxiliary hypothesis that he cannot prove. We state
this as Hypothesis A below. His key tool in these (partial) proofs is the following result
of his from Article III:
$\dfrac{r\lambda^2 + s}{t}, \quad \dfrac{t\mu^2 - s}{r}, \quad \dfrac{t\nu^2 - r}{s}$
Note in particular that in this theorem r , s, and t are not required to be prime.
However, in his (partial) proofs of Theorems I–VIII he uses only the cases where r , s,
and t are primes or are equal to 1.
Now we come to Legendre’s 1797 book [11]. In this work we see two major changes
from his previous work.
The first change is the introduction of what we now call the Legendre symbol $\left(\frac{m}{n}\right)$.
Here $n$ is a prime and $m$ is an arbitrary integer not divisible by $n$. From Theorem A
he knows that $m^{(n-1)/2}$ leaves a remainder of $\pm 1$ when divided by $n$, and then he sets
$\left(\frac{m}{n}\right) = 1$ or $-1$ as this remainder is 1 or $-1$. Legendre occasionally uses this symbol
when $n = 1$ as well, in which case he sets $\left(\frac{m}{n}\right) = 1$ for every nonzero integer $m$.
In the opinion of the author, this is more than a notational convenience. In intro-
ducing this notion, Legendre reifies this concept, and makes it into an object of in-
dependent study. This line of thought later led to the Jacobi symbol and the Hilbert
symbol.
The second change is the introduction of the term “reciprocity.” We have in this
work [11, par. (164)], a paragraph entitled:
Legendre begins this paragraph by letting m and n be odd primes, and with this
implicit assumption he states the theorem:
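Hypothesis A.
(a) For any $a$ and $A$, there exists $b$ with $\left(\frac{b}{a}\right) = 1$ and $\left(\frac{A}{b}\right) = -1$.
(b) For any $a$ and $b$, there exists $A$ with $\left(\frac{A}{a}\right) = -1$ and $\left(\frac{A}{b}\right) = -1$.
(c) For any $b$ and $B$, there exists $a$ with $\left(\frac{a}{b}\right) = -1$ and $\left(\frac{a}{B}\right) = -1$.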
The only essential difference between the proofs in 1785 and 1797 is that in 1797
Legendre gives two proofs of cases I and II, the first of which is the same as his 1785
proof.
2 Nowadays it is common to define the Legendre symbol $\left(\frac{m}{n}\right)$ by $\left(\frac{m}{n}\right) = 1$ if $m$ is a quadratic residue (mod $n$)
and $\left(\frac{m}{n}\right) = -1$ if $m$ is a quadratic nonresidue (mod $n$), but this was not Legendre’s definition. Of course, the
modern definition and Legendre’s definition are equivalent, by Euler’s theorem. Note that the multiplicativity
of the Legendre symbol is immediate from Legendre’s definition, but takes some work to obtain from the
modern definition.
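The equivalence of the two definitions, and the multiplicativity that falls out of Legendre's, can be illustrated numerically. This is a minimal sketch under stated assumptions: `legendre` and `is_residue` are hypothetical helper names, and the check covers only the small primes passed in.

```python
def legendre(m, n):
    """Legendre's definition: the remainder of m^((n-1)/2) mod n, as +1 or -1.

    Assumes n is an odd prime and m is not divisible by n.
    """
    r = pow(m, (n - 1) // 2, n)
    return -1 if r == n - 1 else r


def is_residue(m, n):
    """Modern definition: m is a quadratic residue mod n if x^2 = m (mod n) is solvable."""
    return any((x * x - m) % n == 0 for x in range(1, n))


def definitions_agree(n):
    """Compare both definitions for every m = 1, ..., n-1."""
    return all((legendre(m, n) == 1) == is_residue(m, n) for m in range(1, n))


def is_multiplicative(n):
    """Check (mk/n) = (m/n)(k/n) for all nonzero remainders m, k."""
    return all(legendre(m * k, n) == legendre(m, n) * legendre(k, n)
               for m in range(1, n) for k in range(1, n))
```

With Legendre's definition, multiplicativity is just the identity $(mk)^{(n-1)/2} = m^{(n-1)/2} k^{(n-1)/2}$ read modulo $n$, which is why the footnote calls it immediate.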
Of course, by this time Gauss had proved the law of quadratic reciprocity in general.
Gauss gave two proofs in [3] and a third in [4]. (The author admits to not being able
to read Latin, and he consulted [3, 4, 5, 6] in the German translation [7].) In [12, par.
(381)], Legendre gives Gauss’s third proof as well.
In Legendre’s two-volume 1830 book [13] we again find his attempt to prove
quadratic reciprocity. Here his proof is the same as in 1808, with the same dependence
on Hypothesis B in some of the cases. He again gives Gauss’s third proof, but he also
gives a proof in [13, par. (679)] that is a modification of a proof due to Jacobi [8].
Jacobi’s proof is in part a simplification of Gauss’s sixth proof [6]. Jacobi begins with
what we now call the Gauss sum $P = \sum_{d=1}^{c-1} \left(\frac{d}{c}\right) \exp(2\pi i d/c)$. It is easy to show that
$P^2 = (-1)^{(c-1)/2} c$, so $P = \pm\sqrt{c}$ if $c$ is a prime of the form $4x + 1$ and $P = \pm i\sqrt{c}$ if
$c$ is a prime of the form $4x - 1$.
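The value of $P^2$ can be checked in floating point for small primes. The sketch below is illustrative only; `legendre` and `gauss_sum` are hypothetical names, and the Legendre symbol is computed from the remainder of $d^{(c-1)/2}$ as in Legendre's own definition.

```python
import cmath


def legendre(d, c):
    """Legendre symbol (d/c) for an odd prime c and d not divisible by c."""
    r = pow(d, (c - 1) // 2, c)
    return -1 if r == c - 1 else r


def gauss_sum(c):
    """P = sum over d = 1..c-1 of (d/c) * exp(2*pi*i*d/c), in floating point."""
    return sum(legendre(d, c) * cmath.exp(2j * cmath.pi * d / c)
               for d in range(1, c))


def expected_square(c):
    """The claimed value P^2 = (-1)^((c-1)/2) * c."""
    return (-1) ** ((c - 1) // 2) * c
```

Comparing `gauss_sum(c) ** 2` with `expected_square(c)` up to rounding error exercises only the easy statement about $P^2$; it does not, of course, prove anything about the sign of $P$.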
It is a celebrated theorem of Gauss [5] that the sign is always positive, and in that
work Gauss used this fact to provide his fourth proof of quadratic reciprocity. Jacobi
used the value of $P$. Legendre modified Jacobi’s proof to use only the value of $P^2$ so
that the difficult sign question could be bypassed. In fact, the observation that quadratic
reciprocity can be proved using only the value of $P^2$ and not the value of $P$ goes back
to Gauss in his sixth proof [6]. Legendre describes the proof he gives as the simplest
of all known proofs of quadratic reciprocity.
Legendre realized full well that his own two proofs of quadratic reciprocity were
incomplete. The change from his 1785/1797 proof to his 1808/1830 proof enabled
him to prove an additional case of reciprocity unconditionally. But this new proof also
involved a change of the auxiliary hypothesis on which the proof of the remaining
cases depended. Evidently, since he replaced Hypothesis A by Hypothesis B in his
later works, he regarded that as progress. In 1808 in [12, par. (169)], the first time
Hypothesis B appears, he observes that a, being of the form 4x + 1, is necessarily
of the form 8x + 1 or 8x + 5, and he can verify this hypothesis in the 8x + 5 case,
leaving the 8x + 1 case open. He further observes that that case splits into the two
cases 24x + 1 and 24x + 17, and he can verify this hypothesis in the 24x + 17 case,
choosing b = 3, leaving the 24x + 1 case open. In 1830 in [13, par. (171)] he observes
that in addition he has verified this hypothesis for each of the fifteen primes a of the
form 24x + 1 with a ≤ 1009. He verifies this by considering the remainders when a
is divided by 168 or 264, where he chooses b = 7 or b = 11 respectively.
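The count of fifteen primes of the form $24x + 1$ up to 1009 is easy to reproduce. The following sketch uses plain trial division; the names `is_prime` and `primes_24x_plus_1` are hypothetical and chosen only for this illustration.

```python
def is_prime(n):
    """Trial-division primality test, adequate for the small range used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def primes_24x_plus_1(limit):
    """All primes a <= limit with a of the form 24x + 1."""
    return [a for a in range(1, limit + 1, 24) if is_prime(a)]


CASES = primes_24x_plus_1(1009)
# Remainders on division by 168 and 264, the moduli mentioned in the text.
REMAINDERS = [(a, a % 168, a % 264) for a in CASES]
```

The list `CASES` begins with 73 and ends with 1009, and its length agrees with the count given in the text.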
In fact, Legendre’s earlier Hypothesis A is a consequence of Dirichlet’s theorem
on primes in an arithmetic progression, although, in the author’s opinion, this is cer-
tainly overkill. Ironically, there is no known proof of his Hypothesis B that does not
use quadratic reciprocity. Thus, what seemed to him to be an advance seems to us
Theorem C.
(a) An odd prime $c$ is of the form $y^2 + z^2$ if and only if $c$ is of the form $8x + 1$ or
$8x + 5$.
(b) An odd prime $c$ is of the form $y^2 + 2z^2$ if and only if $c$ is of the form $8x + 1$ or
$8x + 3$.
(c) An odd prime $c$ is of the form $y^2 - 2z^2$ if and only if $c$ is of the form $8x + 1$ or
$8x + 7$.
Legendre credits the discovery of all parts of this theorem to Fermat, with the first
proofs of parts (a) and (b) due to Euler and the first proof of (c) due to Lagrange.
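The three parts of Theorem C can be tested by exhaustive search for small primes. This is a sketch under stated assumptions: the helper names are hypothetical, and for part (c) the search simply tries every $z$ from 0 to $c$, a generous bound chosen for simplicity rather than taken from the source.

```python
from math import isqrt


def is_square(n):
    """True if n is a perfect square (including 0)."""
    return n >= 0 and isqrt(n) ** 2 == n


def is_sum_form(c, k):
    """True if c = y^2 + k*z^2 for some integers y, z (k = 1 or 2 here)."""
    return any(is_square(c - k * z * z) for z in range(isqrt(c // k) + 1))


def is_difference_form(c):
    """True if c = y^2 - 2*z^2 for some integers y and 0 <= z <= c."""
    return any(is_square(c + 2 * z * z) for z in range(c + 1))


def odd_primes(limit):
    """Odd primes up to limit, by trial division."""
    return [n for n in range(3, limit + 1, 2)
            if all(n % p for p in range(3, isqrt(n) + 1, 2))]


def check_theorem_c(limit):
    """Compare each representation with the stated remainders modulo 8."""
    return all(
        is_sum_form(c, 1) == (c % 8 in (1, 5))
        and is_sum_form(c, 2) == (c % 8 in (1, 3))
        and is_difference_form(c) == (c % 8 in (1, 7))
        for c in odd_primes(limit)
    )
```

For example, $7 = 3^2 - 2 \cdot 1^2$ and $11 = 3^2 + 2 \cdot 1^2$, while 7 is of neither of the forms in parts (a) and (b).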
This derivation of the value of $\left(\frac{2}{c}\right)$ is unchanged in 1808/1830 in [12, 13], although
in those works Legendre investigates the equations $Mx^2 - Ny^2 = \pm 2$, and some but
not all cases of $\left(\frac{2}{c}\right)$ follow more simply from those investigations, as he notes. (The
derivation of the value of $\left(\frac{2}{c}\right)$ was much simplified by Gauss. In [3] he proves this in
an elementary way and in [4] he gives a second proof, using “Gauss’s lemma,” as part
of his third proof of reciprocity. Both of these proofs were well known in the 19th
century, appearing in the famous textbook [1], written by Dirichlet but with revisions
and supplements due to Dedekind, which gave (a slight reformulation of) Gauss’s third
proof of reciprocity as well. But only the last of these proofs is well known today, when
it has become the standard proof of the value of $\left(\frac{2}{c}\right)$.)
We have concentrated here on Legendre’s work, but we would like to make a few
more historical remarks. We have mentioned that Euler stated a conjecture equivalent
to the law of quadratic reciprocity (though it takes a bit of work to see that), but Euler’s
statement seems not to have had any influence on either Legendre or Gauss. Euler
coined the terms quadratic residue and nonresidue, which were not used by Legendre
but were used by Gauss. Legendre coined the term quadratic reciprocity, but this was
never used by Gauss, who always referred to this result as the “fundamental theorem
(in the theory of quadratic residues),” nor did Gauss ever use the Legendre symbol in
any of his works on the subject.
In the introduction to [3] Gauss writes that his work there had been done without
knowledge of prior results in the subject. He also writes there that in the meanwhile,
the “excellent” work [11] of the “highly deserving” Legendre appeared, but that he
did not rewrite [3] to take it into account, only adding a few additional remarks in the
Appendix. He makes some historical remarks in [3, par. (151)] immediately after he
proves the fundamental theorem, in which he comments on the efforts of Euler and
Legendre. In particular, Gauss credits Legendre for having arrived at that theorem in
[10], without having been able to completely prove it (as Legendre himself admitted
there), and then claims (quite fairly, in the opinion of the author) that his own proof is
the first proof.
In the first paragraph of [4] Gauss writes that in number theory it is often easy to
inductively arrive at results whose proofs lie very deep, or even which defy proof.
ACKNOWLEDGMENTS. This work was done while the author was on sabbatical leave at the Mathematics
Institute of the University of Göttingen, which he would like to thank for its hospitality. He notes with particular
pleasure that the copy of [11] he consulted in Göttingen was the copy from Gauss’s personal library. Preparation
of this manuscript for publication was supported by a Faculty Research Grant from Lehigh University.
REFERENCES
1. P. G. L. Dirichlet, Vorlesungen über Zahlentheorie, Chelsea, New York, 1968; reprint of the 4th edition,
Braunschweig, 1893.
2. G. Frei, The reciprocity law from Euler to Eisenstein, in The Intersection of History and Mathematics,
Science Networks Historical Studies, vol. 15, S. Chikara, S. Mitsuo, and J. W. Dauben, eds., Birkhäuser-
Verlag, Basel, 1994, 67–90.
3. C.-F. Gauss, Disquisitiones Arithmeticae, privately printed, Leipzig, 1801.
4. C.-F. Gauss, Theorematis arithmetici demonstratio nova, Commentationes soc. reg. sc. Gottingensis XVI,
1808.
5. C.-F. Gauss, Summatio quarumdam serierum singularium, Commentationes soc. reg. sc. Gottingensis
recentiores I, 1811.
6. C.-F. Gauss, Theorematis fundamentalis in doctrina de residuis quadraticis demonstrationes et ampliationes
novae, Commentationes soc. reg. sc. Gottingensis recentiores IV, 1818.
7. C.-F. Gauss, Untersuchungen über höhere Arithmetik (trans. H. Maser), American Mathematical Society/
Chelsea, Providence, 2006.
8. C. G. J. Jacobi, Letter to Legendre of 5 August 1827, in Collected Works, vol. I, C. W. Borchardt, ed.,
Chelsea, New York, 1969, 390–396; reprint of the original edition, G. Reimer, Berlin, 1881.
9. A. Knoebel, R. Laubenbacher, J. Lodder, and D. Pengelley, Mathematical Masterpieces, Further Chron-
icles by the Explorers, Springer, New York, 2007.
10. A. M. Le Gendre, Recherches d’analyse indéterminée, Histoire de l’Académie royale des sciences avec
les Mémoires de Mathématique et de Physique pour la même Année, 1785, 465–559; also available at
http://gallica.bnf.fr.
11. A. M. Legendre, Essai sur la Théorie des Nombres, Paris, 1797.
12. A. M. Legendre, Essai sur la Théorie des Nombres, Paris, 1808.
STEVEN H. WEINTRAUB is Professor of Mathematics at Lehigh University. His research spans a range of
areas in algebra, geometry, and topology, although lately he has become interested in the history of mathematics
as well. When he is not thinking about mathematics he can often be found reading mystery novels or flying his
airplane (but not doing both simultaneously).
Department of Mathematics, Lehigh University, Bethlehem, PA 18015
steve.weintraub@lehigh.edu
We give a simple proof, based on the mean value theorem of calculus, that the
antiderivative of $1/(1 + x^2)$ is not rational.
Assume on the contrary that
$$R(x) = \frac{P(x)}{Q(x)} = \frac{p_0 x^n + \cdots + p_n}{q_0 x^m + \cdots + q_m}$$
is such that
$$R'(x) = \frac{1}{1 + x^2},$$
with $p_0, q_0 \neq 0$. Observe that $R$ must be strictly increasing. Next, using the mean
value theorem, we see that, if $x > 0$, there exists $y$ between $x$ and $2x$ such that
$$R(2x) - R(x) = R'(y)\,x \le \frac{x}{1 + x^2}.$$
Hence
$$R(2x) - R(x) = \frac{p_0 q_0 (2^n - 2^m) x^{n+m} + \cdots}{q_0^2\, 2^m x^{2m} + \cdots} \to 0$$
Abstract. We strengthen a result of Lehmer, obtaining a new necessary condition for the roots
of a complex polynomial to have equal modulus. From this we derive the famous theorem of
Feuerbach, as well as the less well-known theorems of Euler and Guinand on the tritangent
centers of a triangle. The latter theorems constrain the possible locations of the incenter and
excenters subject to fixed locations for the circumcenter and orthocenter.
2. THE THEOREMS E-F-G. Euler’s 1765 article [3] marks an important milestone
in triangle geometry. In it, he introduces the line now referred to as the Euler line,
as well as his formula for the distance between the circumcenter and the incenter of
a triangle. These points are two of the four classical triangle centers, illustrated in
Figure 1.
The circumcenter O is the center of the circumscribed circle, as well as the inter-
section point of the perpendicular bisectors of the sides.2 The incenter I is the center
of the inscribed circle or incircle, and also the intersection point of the angle bisec-
tors. The remaining classical centers are the orthocenter H , where the altitudes meet,
and the centroid G, where the medians meet. In a nonequilateral triangle, the points
O, G, and H lie on the Euler line in the order O-G-H , with GH = 2 · OG. A fifth
triangle center, apparently unknown to Euler, is the nine-point center N . This is the
doi:10.4169/amer.math.monthly.118.03.217
1 After a suggestion of John Conway.
2 Our notation and terminology are consistent with those of [2].
Figure 1. The classical triangle centers and the Euler segment OH.
center of the nine-point circle, which passes through the midpoints of the three sides
and the feet of the three altitudes. It turns out that N is also the midpoint of OH, so
that OG : GN : NH = 2 : 1 : 3. The nine-point circle is often attributed to Euler, but
no evidence has been found to support this attribution. According to [7], there are
precedents of the nine-point circle theorem dating as far back as the 1804 work of
Benjamin Bevan, but it is first explicitly described in the 1821 article [1] of Brianchon
and Poncelet. The term “nine-point circle” was coined in 1842 by O. Terquem [11].
Although the 1765 paper is historically important for its introduction of the Euler
line, Euler states that his primary aim is to compute the sides of a triangle in terms of
its central distances, OH, OI, and IH. He does this by showing that for a nonequilateral
triangle, the central distances determine the coefficients of a real cubic whose roots are
exactly the sides a, b, c. This leads to a pair of necessary conditions on OH, OI, IH,
which come from the fact that the cubic must have three positive real roots. Guinand [5]
shows that Euler’s necessary conditions are also sufficient to guarantee the existence
of a triangle with pre-specified central distances. This is now called Euler’s theorem.
In order to formulate Euler’s theorem, we begin by defining two functions
where O and H are any two points. In the archetypal case when O and H are the
circumcenter and orthocenter of a triangle, N (O, H ) and G(O, H ) coincide with the
nine-point center N and the centroid G. For brevity, we always write N = N (O, H )
and G = G(O, H ).
The Euler shield can also be described as the locus of a point X such that OX =
2 · NX. This equation defines a circle of Apollonius whose center lies on line ON.
Since OG : GN : NH = 2 : 1 : 3, the values X = G and X = H satisfy the equation.
As G and H lie on line ON, the circle of Apollonius has GH as a diameter. We may
now state Euler’s theorem.
Theorem E (Euler, 1765). Three points O, H , and I are the circumcenter, orthocen-
ter, and incenter of a triangle if and only if I is inside the Euler shield and differs
from N .
$OI^2 = R(R - 2r).$ (1)
Here $R$ is the circumradius, or radius of the circumscribed circle, while $r$ is the
inradius, or radius of the incircle. Equation (1) often appears in conjunction with
Feuerbach’s relation
$NI = \tfrac{1}{2}R - r,$ (2)
which expresses the famous result that the incircle and the nine-point circle are
internally tangent. The term $\tfrac{1}{2}R$ represents the radius of the nine-point circle.4
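Relations (1) and (2), together with the ratio $OG : GN : NH = 2 : 1 : 3$, can be checked numerically for any particular triangle. The sketch below is illustrative only: it represents points as Python complex numbers, and the names `circumcenter` and `triangle_centers` are hypothetical. The incenter is computed from the standard side-length weighting, which is an outside fact rather than something stated in the article.

```python
def circumcenter(A, B, C):
    """Circumcenter of the triangle with complex vertices A, B, C."""
    b, c = B - A, C - A                      # translate so that A sits at 0
    bb, cc = b.conjugate(), c.conjugate()
    # Solves |z| = |z - b| = |z - c| for z, then translates back.
    return A + b * c * (bb - cc) / (bb * c - b * cc)


def triangle_centers(A, B, C):
    """Classical centers and radii of a nondegenerate triangle."""
    A, B, C = complex(A), complex(B), complex(C)
    a, b, c = abs(B - C), abs(C - A), abs(A - B)   # side lengths
    O = circumcenter(A, B, C)
    H = A + B + C - 2 * O                          # orthocenter
    G = (A + B + C) / 3                            # centroid
    N = (O + H) / 2                                # nine-point center
    I = (a * A + b * B + c * C) / (a + b + c)      # incenter
    R = abs(A - O)                                 # circumradius
    area = abs(((B - A) * (C - A).conjugate()).imag) / 2
    r = 2 * area / (a + b + c)                     # inradius
    return {"O": O, "H": H, "G": G, "N": N, "I": I, "R": R, "r": r}


CENTERS = triangle_centers(0, 4, 1 + 3j)
```

For the sample triangle with vertices 0, 4, and $1 + 3i$, the circumcenter is $2 + i$ and the orthocenter is $1 + i$, and the computed distances satisfy (1) and (2) up to rounding error. The same data also show that $X = G$ and $X = H$ satisfy $OX = 2 \cdot NX$.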
Both (1) and (2) have analogues in which the incenter is replaced by one of the
three excenters. These are the centers of the excircles, which touch one side of $\triangle ABC$
internally and the other two sides externally (see Figure 2). Each is the intersection of
one internal angle bisector with two external angle bisectors. We denote the excenters
opposite A, B, C by E a , E b , E c . An arbitrary excenter will be denoted by E, with ρ
being the radius of the corresponding excircle. The incircle and excircles are collec-
tively called tritangent circles, and their centers tritangent centers. Theorems about
them are called “tritangency theorems”.
The analogues of formulas (1) and (2) for an excenter are as follows:
$OE^2 = R(R + 2\rho),$ (3)
$NE = \tfrac{1}{2}R + \rho.$ (4)
Together, (2) and (4) comprise Feuerbach’s theorem (illustrated in Figure 2):
3 Euler expresses the squared distances between the classical centers in terms of the area $\Delta$ and the elementary
symmetric functions of the sides $a, b, c$. His expression for $OI^2$ is converted to the form (1) by substitution
of the well-known formulas $2\Delta = r(a + b + c)$ and $4R\Delta = abc$.
4 Two circles are internally (externally) tangent if and only if the distance between their centers is equal to
the difference (sum) of their radii. The nine-point circle has radius $\tfrac{1}{2}R$ because it is the circumscribed circle of
the midpoint triangle.
Guinand sought constraints on the location of an excenter, just as Euler had for
the incenter. The region of possible excenters turns out to be the common exterior of
$\Omega$ and a rational algebraic curve $\Gamma$, which we call the Guinand shield (see Figure 3).
Guinand’s definition of this curve is stated in algebraic rather than geometric language.
As we will see, $\Gamma$ turns out to have a simple geometric characterization. Like that of
the Euler shield, it depends only on the points $O$ and $H$.
Theorem G (Guinand, 1984). Three points $O$, $H$, and $E$ are the circumcenter, the
orthocenter, and one excenter of a triangle if and only if $E$ lies outside both $\Omega$ and $\Gamma$.
$r^{2k} c_k = c_0\, \bar{c}_{n-k}, \quad k = 0, 1, \ldots, n.$ (5)
The only possible radius for which this may occur is $r = |c_0|^{1/n}$.
Proof. Inversion in the circle of radius $r$ is the map $z \mapsto r^2/\bar{z}$. This transforms $f(z)$
into $f(r^2/\bar{z})$, which has the same set of roots as
Taking moduli in (5) gives the following weaker condition of Lehmer [6]:
Now let each monic polynomial be identified with its vector of coefficients. This
puts the natural topology of Cn on the set of monic polynomials of degree n. When-
ever we apply topological terms to polynomial spaces, it is this topology that is to be
understood. We call a polynomial simple if it has no repeated roots. The following
lemma is a direct consequence of the implicit function theorem, and presents a key
topological property of simple polynomials.
Proof. For each root $z_0$ of $f_0$, the implicit function theorem gives a neighborhood $W$ of
$f_0$ and a continuous function $\rho : W \to \mathbb{C}$ such that $\rho(f_0) = z_0$ and $f(\rho(f)) = 0$ for
all $f \in W$. The $n$ distinct roots of $f_0$ give rise to $n$ such “root functions” $\rho_1, \ldots, \rho_n$,
each continuous near $f_0$. Let $V$ be a neighborhood of $f_0$ in which all the $\rho_j$ are
continuous. Write $\delta(f) = \min_{j \neq k} |\rho_j(f) - \rho_k(f)|$. Note that $\delta$ is continuous on $V$, and
$\delta(f_0) > 0$. Thus $\delta > 0$ in some neighborhood $U$ of $f_0$. For each $f \in U$, the
numbers $\rho_1(f), \ldots, \rho_n(f)$ furnish $n$ distinct solutions of $f(z) = 0$. As $\deg f = n$, these
solutions exhaust the roots of $f$.
We prove next that the “equimodular region” in a space of inversively stable simple
polynomials is both open and closed, and is therefore a union of connected compo-
nents. This will later allow us to identify the Euler and Guinand shields with the
boundaries of certain equimodular regions.
Theorem 1. In any space of simple, inversively stable polynomials, the set of equimod-
ular elements and the set of non-equimodular elements are each open.
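Theorem 2. Suppose $\triangle ABC$ has circumcenter 0. Exactly two of the eight sets of
square roots $\alpha, \beta, \gamma$ of $A, B, C$ make $\triangle\alpha\beta\gamma$ acute-angled. For either set,
$H = \alpha^2 + \beta^2 + \gamma^2$ (6)
$I = -(\beta\gamma + \gamma\alpha + \alpha\beta)$ (7)
$E_a = -\beta\gamma + \gamma\alpha + \alpha\beta, \quad E_b = \beta\gamma - \gamma\alpha + \alpha\beta, \quad E_c = \beta\gamma + \gamma\alpha - \alpha\beta.$ (8)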
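Formulas (6)–(8) lend themselves to a numerical spot check. The sketch below is illustrative and rests on assumptions not taken from the article: the sample points $\alpha, \beta, \gamma$ on the unit circle are arbitrary choices that make $\triangle\alpha\beta\gamma$ acute, the helper names are hypothetical, and the incenter and excenters are computed independently from the standard side-length weightings.

```python
import cmath


def is_acute(p, q, r):
    """True if the triangle with complex vertices p, q, r has three acute angles."""
    def dot(u, v):
        return (u * v.conjugate()).real
    return (dot(q - p, r - p) > 0 and dot(p - q, r - q) > 0
            and dot(p - r, q - r) > 0)


def centers_from_roots(alpha, beta, gamma):
    """H, I, E_a, E_b, E_c as given by formulas (6)-(8)."""
    bg, ga, ab = beta * gamma, gamma * alpha, alpha * beta
    return {
        "H": alpha ** 2 + beta ** 2 + gamma ** 2,
        "I": -(bg + ga + ab),
        "Ea": -bg + ga + ab,
        "Eb": bg - ga + ab,
        "Ec": bg + ga - ab,
    }


def centers_from_sides(A, B, C):
    """Incenter and excenters from side lengths (standard weighted averages)."""
    a, b, c = abs(B - C), abs(C - A), abs(A - B)
    return {
        "I": (a * A + b * B + c * C) / (a + b + c),
        "Ea": (-a * A + b * B + c * C) / (-a + b + c),
        "Eb": (a * A - b * B + c * C) / (a - b + c),
        "Ec": (a * A + b * B - c * C) / (a + b - c),
    }


# Sample data (an arbitrary choice): three points on the unit circle.
ALPHA, BETA, GAMMA = (cmath.exp(1j * t) for t in (0.3, 2.4, 4.5))
A, B, C = ALPHA ** 2, BETA ** 2, GAMMA ** 2
BY_ROOTS = centers_from_roots(ALPHA, BETA, GAMMA)
BY_SIDES = centers_from_sides(A, B, C)
```

Since the circumcenter is 0, formula (6) just says $H = A + B + C$; that this point is the orthocenter can be confirmed by checking that $H - A$ is perpendicular to $B - C$, and similarly for the other vertices.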
Euler’s relations (1) and (3) have several relatively straightforward proofs (see, e.g.,
[2]). Combining them with our condition (5) yields a short new proof of Feuerbach’s
theorem.
In this case, condition (5) and formula (3) yield $|N - E| = \tfrac{1}{2}R + \rho$, implying the
external tangency of the nine-point circle and the A-excircle.
Proof. By Lemma 1, a polynomial of the form (11) is inversively stable if and only if
$|Q|^{2/3}(-P) = Q\,\overline{\sqrt{H - 2P}}$. Taking moduli allows us to solve for $|Q|$, and substituting
the result back into the original equation gives
$Q = -\dfrac{|P|^2}{|H - 2P|^2}\, P\sqrt{H - 2P}.$ (12)
It turns out that there is no need to distinguish between $f_P^+$ and $f_P^-$, and we write $f_P$
for either of them. The reason is that $f_P^+$ and $f_P^-$ have the same equimodularity status,
and throughout this paper their roots only appear in homogeneous combinations of
degree 2, which are invariant under $z \mapsto -z$.
We call a polynomial square-simple if the squares of its roots are distinct.
Theorem 3. The point $P$ is a tritangent center of some admissible triangle if and only
if $f_P$ is equimodular and square-simple.
Proof. Suppose $P$ is a tritangent center for an admissible triangle $\triangle ABC$, and
consider $\alpha, \beta, \gamma$, the equimodular square roots of $A, B, C$ supplied by Theorem 2.
Formulas (9) and (10) show that the roots of $f_P$ are $\pm\alpha, \pm\beta, \pm\gamma$ (e.g., $-\alpha, \beta, \gamma$ when
$P = E_a$). Thus $f_P$ is equimodular, and since $A, B, C$ are clearly distinct, $f_P$ is also
square-simple.
Conversely, suppose $f_P$ has equimodular roots $a, b, c$ with distinct squares $A, B, C$.
Since $A, B, C$ are distinct points of a circle, they form a triangle. Now $a + b + c =
-\sqrt{H - 2P}$ and $bc + ca + ab = -P$, so that $H = a^2 + b^2 + c^2 = A + B + C$. By
(6), $\triangle ABC$ is admissible. Theorem 2 supplies a possibly different set of square roots
$\alpha, \beta, \gamma$ of $A, B, C$ satisfying (6)–(8). However, since $a = \pm\alpha$, $b = \pm\beta$, and $c = \pm\gamma$,
the point $P = -(bc + ca + ab)$ can only be one of
These are precisely the tritangent centers of $\triangle ABC$ by (7) and (8).
There is also a third region, of points ineligible to be a tritangent center. This follows
from Theorem 3, once it is noted that there are nonequimodular cubics $f_P$. Consider
for instance $P = -2G$ (conveniently chosen inside $\Gamma$). Choosing $\lambda \in \mathbb{C}$ such that
$7\lambda^2 = G$, formulas (11) and (12) give
Proof. Note that $f_P$ is equimodular and has a pair of mutually negative roots if and
only if $f_P(z) = (z - z_0)(z + z_0)(z - \omega z_0)$, where $z_0, \omega \in \mathbb{C}$ satisfy $z_0 \neq 0$ and
$|\omega| = 1$. Equating coefficients and writing $H = 2N$, this is equivalent to
$2N - 2P = \omega^2 z_0^2, \quad P = z_0^2.$ (13)
Since $z_0$ and $\omega$ are arbitrary subject to $z_0 \neq 0$ and $|\omega| = 1$, our two equations are
equivalent to the single condition $2|N - P| = |P|$. This is exactly the Apollonius
condition $2 \cdot NP = OP$, which characterizes $\Omega$.
Theorem 6. The polynomial $f_P$ is equimodular and has two equal roots if and only if
$P$ lies on the Guinand shield $\Gamma$.
Proof. Observe that $f_P$ is equimodular and has two equal roots if and only if $f_P(z) =
(z - z_0)^2 (z - \omega z_0)$, where $z_0, \omega \in \mathbb{C}$ have $z_0 \neq 0$ and $|\omega| = 1$. Equating coefficients
and putting $L = -P$, this amounts to
$H = (2 + \omega^2)X.$ (15)
$$\frac{H - X}{L - X} = \frac{(1 + \omega^2)X}{2\omega X} = \frac{\omega + \bar{\omega}}{2} = \operatorname{Re}(\omega).$$
If $X \neq H$, then $H - X$ and $L - X$ are related by a real nonzero scale factor, so $L$
must lie on line $HX$, which is denoted by $\ell_X$. If $X = H$, we have $\omega = \pm i$. Thus
$L - H = L - X = 2\omega X = \pm 2iH$, and $LH \perp OH$. This means that $L$ lies on the
tangent line of $\Omega$ at $H$, denoted by $\ell_H$. Therefore $P \in \Gamma$.
Incidentally, if we eliminate $X$ between (15) and the second equation of (14), and
if we put $\omega = e^{it}$, the following simple parametrization of $\Gamma$ results:
$$P = -\frac{1 + 2e^{it}}{2 + e^{2it}}\, H, \qquad 0 \le t \le 2\pi.$$
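The parametrization can be cross-checked against the double-root description of $\Gamma$. The sketch below is illustrative only and makes one assumption explicit: since equation (14) is not reproduced here, it takes $X = z_0^2$, which is what equating coefficients in $(z - z_0)^2(z - \omega z_0)$ suggests. It then uses the relations $P = -(bc + ca + ab)$ and $H = a^2 + b^2 + c^2$ for the roots $a, b, c$. The function names are hypothetical.

```python
import cmath


def guinand_point(H, t):
    """Point of the Guinand shield given by the parametrization in the text."""
    w = cmath.exp(1j * t)
    return -(1 + 2 * w) / (2 + w * w) * H


def double_root_triple(H, t):
    """Roots (z0, z0, w*z0) with w = exp(it), chosen so that H = (2 + w^2) * z0^2.

    Assumes X = z0^2 (an assumption; equation (14) is not shown in the text).
    """
    w = cmath.exp(1j * t)
    X = H / (2 + w * w)          # equation (15) solved for X
    z0 = cmath.sqrt(X)
    return z0, z0, w * z0


def point_from_roots(a, b, c):
    """P = -(bc + ca + ab) and H = a^2 + b^2 + c^2 for the roots a, b, c."""
    return -(b * c + c * a + a * b), a * a + b * b + c * c
```

At $t = 0$ the parametrization gives $P = -H$, and at $t = \pi$ it gives $P = H/3$, which is the centroid $G$ when $O = 0$.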
Proof. Consider the reflection of $\Gamma$ through $O$, which we call $\Lambda$ (Figure 5). For each
$X \in \Omega$, there are two points of $\ell_X$ at a distance of $2 \cdot OX$ from $X$. These are $L_1, L_2 \in
\Lambda$, labeled so that $L_1$ and $H$ are on the same side of $X$ whenever $X \neq H$. (When
$X = H$, the $L_i$ are interchangeable.)
Finally, we will use a continuity argument to show that the three connected regions formed by the Euler and Guinand shields coincide with the regions of incenters, excenters, and non-tritangent points.6 The argument requires a continuous choice of $\sqrt{H - 2P}$, and to achieve this we will cut the plane. For each point P, there are two possibilities for $f_P$, one for each square root of $H - 2P$. The branch point of these square roots is N. If we cut the plane along a continuous curve from N to ∞, then in the cut plane there is a choice of $\sqrt{H - 2P}$ which varies continuously with P. Formulas (11) and (12) supply a corresponding continuous choice of $f_P$. It will be convenient
to cut along the ray κ = NH, because this ray avoids Γ. Notice that we can modify κ slightly to obtain a cut $\kappa^*$ that avoids both Γ and any given point $P_0 \in \kappa - \{N\}$. We simply take a disc about $P_0$ that avoids Γ, and deform κ inside this disc so as to avoid $P_0$.
Proof. The set $C_\kappa = (\mathbb{C} - \kappa) - \Gamma - \{0\}$ has two components, one inside Γ and one outside. Choose the map $f : P \mapsto f_P$ to be continuous on $C_\kappa$. Let M be the equimodular subset of $f(C_\kappa)$. By Theorem 1, both M and $f(C_\kappa) - M$ are open. The continuity of f implies that both $U = f^{-1}(M)$ and its complement $V = C_\kappa - U$ are open. Thus U and V must be the components of $C_\kappa$.7 Since −2G lies between −H and G, inside Γ, in fact $V = \operatorname{Int}(\Gamma) - \{0\}$ and $U = \operatorname{Ext}(\Gamma) - \kappa$.
We have now accounted for all points $P \notin \kappa$. To account for a given point $P_0 \in \kappa - \{N\}$, the preceding argument may be repeated in the cut plane $\mathbb{C} - \kappa^*$, where $\kappa^*$ is a suitably deformed cut avoiding Γ and $P_0$.
We now know that the region inside Γ contains the non-tritangent points. To complete our proof of the theorems of Euler and Guinand, we need only identify the two regions outside Γ.
Proof of Theorems E and G. The set Dκ = Ext(0) − − κ has two components, one
inside and one outside. (Cutting along the ray NH does not disconnect the interior
of , as may be seen in Figure 3.) Again, choose $f : P \mapsto f_P$ so as to be continuous on $D_\kappa$. With each $P \in D_\kappa$ we may associate a triangle △ABC having one tritangent center at P (Theorems 3, 5, 6, 7). Recall that A, B, C are the squares of the roots of $f_P$. Since $P \mapsto f_P$ is continuous, the root continuity lemma implies that $P \mapsto (A, B, C)$ is continuous. Moreover, P is the incenter of △ABC if and only if P is inside △ABC.
Consider the partition $D_\kappa = I \cup E$, where I and E are the subsets of possible incenters and possible excenters (which are disjoint by Corollary 4). We claim that I and E are open. Take a point $P_0 \in I$. Being an incenter, $P_0$ is inside its triangle △$A_0B_0C_0$. By the continuity of $P \mapsto (A, B, C)$, there is a neighborhood U of $P_0$ such that each $P \in U$ lies inside its triangle △ABC. Hence each $P \in U$ is a possible incenter, and $U \subset I$. This shows that I is open. A similar argument shows that E is open.
6 In [5], Guinand refers to the region of non-tritangent points as the acentric lacuna.
7 A fixed admissible triangle will have some excenter $E \notin \kappa$, showing that U is nonempty. When κ is later replaced by the deformed cut $\kappa^*$, the deformation may be taken sufficiently small that $E \notin \kappa^*$, so that U is still nonempty. In either of these cases, $-2G \in V$.
REFERENCES
1. C. J. Brianchon and J.-V. Poncelet, Géométrie des courbes. Recherches sur la détermination d’une hyperbole équilatère, au moyen de quatres conditions données, Annales de Gergonne 11 (1820–1821) 205–220; also available at http://www.numdam.org/item?id=AMPA_1820-1821__11__205_0.
2. H. S. M. Coxeter and S. L. Greitzer, Geometry Revisited, Mathematical Association of America, Washington, DC, 1967.
3. L. Euler, Solutio facilis problematum quorundam geometricorum difficillimorum, Comm. Acad. Sci. Petropol. 11 (1765) 103–123; also in Opera Omnia, A. Speiser, ed., ser. I, vol. 26, no. 325, 139–157; available at Euler Archive, http://math.dartmouth.edu/~euler/.
4. K. W. Feuerbach, Eigenschaften einiger merkwürdigen Punkte des geradlinigen Dreiecks und mehrerer durch sie bestimmten Linien und Figuren: Eine analytisch-trigonometrische Abhandlung, Riegel und Wiesner, Nürnberg, 1822.
5. A. P. Guinand, Euler lines, tritangent centers, and their triangles, Amer. Math. Monthly 91 (1984) 290–300. doi:10.2307/2322671
6. D. H. Lehmer, The complete root-squaring method, J. Soc. Indust. Appl. Math. 11 (1963) 705–717. doi:10.1137/0111053
7. J. S. MacKay, History of the nine point circle, Proc. Edinb. Math. Soc. 11 (1892) 19–61. doi:10.1017/S0013091500031163
8. B. Scimemi, Paper-folding and Euler’s theorem revisited, Forum Geom. 2 (2002) 93–104.
9. G. C. Smith, Statics and the moduli space of triangles, Forum Geom. 5 (2005) 181–190.
10. J. Stern, Euler’s triangle determination problem, Forum Geom. 7 (2007) 1–9; also available at http://forumgeom.fau.edu/FG2007volume7.
11. O. Terquem, Considérations sur le triangle rectiligne, Nouv. Ann. de Math. 1 (1842) 196–200; also available at http://www.numdam.org/item?id=NAM_1842_1_1__196_1.
12. A. Várilly, Location of incenters and Fermat points in variable triangles, Math. Mag. 74 (2001) 123–129. doi:10.2307/2690626
13. P. Yiu, Conic solution of Euler’s triangle determination problem, J. Geom. Graph. 12 (2008) 75–80.
ALEX RYBA received his B.A. and Ph.D. from the University of Cambridge. His main area of interest is
finite group theory. After teaching at the University of Illinois at Chicago, the University of Michigan, and
Marquette University, he joined the faculty of Queens College CUNY in 1998. One of his first students at
Queens College was Joe Stern.
Department of Computer Science, Queens College, Flushing NY 11367
ryba@sylow.cs.qc.edu
JOE STERN is a Ph.D. student at Columbia University. After receiving the 1999 British Marshall Scholar-
ship, he joined the faculty of Stuyvesant High School, where he taught until 2009. He is a two-time recipient
of the MAA’s Edyth May Sliffe Award for Distinguished High School Teaching. Two of his last students at
Stuyvesant were Alex Ryba’s sons, Andrew and Nicholas.
Department of Mathematics, Columbia University, New York NY 10027
jstern@math.columbia.edu
Abstract. We give a natural and direct proof of a famous result by Sharkovsky that gives a
complete description of possible sets of periods for interval maps. The new ingredient is the
use of Štefan sequences.
1.1. The Sharkovsky Theorem. The Sharkovsky Theorem involves the following
ordering of the set N of positive integers, which is now known as the Sharkovsky
ordering:
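$$3 \triangleright 5 \triangleright 7 \triangleright \cdots \triangleright 2\cdot 3 \triangleright 2\cdot 5 \triangleright 2\cdot 7 \triangleright \cdots \triangleright 2^2\cdot 3 \triangleright 2^2\cdot 5 \triangleright 2^2\cdot 7 \triangleright \cdots \triangleright 2^3 \triangleright 2^2 \triangleright 2 \triangleright 1.$$
This is a total ordering; we write $l \triangleright r$ or $r \triangleleft l$ whenever l is to the left of r.
It is crucial that the Sharkovsky ordering has the following doubling property:
$$l \triangleright r \ \text{ if and only if } \ 2l \triangleright 2r. \tag{1}$$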
This is because the odd numbers greater than 1 appear at the left end of the list, the
number 1 appears at the right end, and the rest of N is included by successively dou-
bling these end pieces, and inserting these doubled strings inward:
Sharkovsky showed that this ordering describes which numbers can be periods for a
continuous map of an interval.
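The ordering can be made concrete with a small sort key. The following Python sketch (the names `sharkovsky_key` and `precedes` are illustrative, not from the article) writes n = 2^k · (odd part) and places non-powers of 2 first, sorted by k and then by odd part, followed by the powers of 2 in decreasing order.

```python
def sharkovsky_key(n):
    """Sort key for the Sharkovsky ordering: a smaller key means further to the left."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    if n == 1:
        # Pure powers of two come last, with larger exponents first.
        return (1, -k, 0)
    # Otherwise order by the number of factors of 2, then by the odd part.
    return (0, k, n)

def precedes(l, r):
    """True when l is strictly to the left of r in the Sharkovsky ordering."""
    return sharkovsky_key(l) < sharkovsky_key(r)

if __name__ == "__main__":
    print(sorted(range(1, 13), key=sharkovsky_key))
    # Spot-check the doubling property (1) on a finite range.
    print(all(precedes(l, r) == precedes(2 * l, 2 * r)
              for l in range(1, 50) for r in range(1, 50)))
```

The doubling property is visible in the key: doubling n raises k by one in either branch and never moves n between the two branches.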
doi:10.4169/amer.math.monthly.118.03.229
1 Dynamicists usually refer to m as the least period.
This shows that the set of periods of a continuous interval map is a tail of the Sharkovsky order. A tail is a set $T \subset \mathbb{N}$ such that $s \triangleright t$ for all $s \notin T$ and all $t \in T$. There are three types of tails: $\{m\} \cup \{l \in \mathbb{N} \mid l \triangleleft m\}$ for some $m \in \mathbb{N}$, the set $\{\ldots, 16, 8, 4, 2, 1\}$ of all powers of 2, and ∅.
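To illustrate the first type of tail, here is a minimal Python sketch that lists the part of the tail generated by m that lies below a chosen bound. The truncation at `bound` and the function names are assumptions made for illustration, since an actual tail is usually infinite.

```python
def sharkovsky_key(n):
    """Sort key for the Sharkovsky ordering: a smaller key means further to the left."""
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return (1, -k, 0) if n == 1 else (0, k, n)

def tail(m, bound):
    """Elements l <= bound with l = m or l to the right of m, listed left to right."""
    members = [l for l in range(1, bound + 1)
               if sharkovsky_key(l) >= sharkovsky_key(m)]
    return sorted(members, key=sharkovsky_key)

if __name__ == "__main__":
    print(tail(6, 12))   # tail generated by 6, truncated at 12
    print(tail(4, 20))   # tail generated by 4
```

The tail consisting of all powers of 2 has no leftmost element, so it is not of the form `tail(m, ...)` for any m. This is why it is listed as a separate type.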
The following complementary result is sometimes called the converse to the Shar-
kovsky Theorem, but is proved in Sharkovsky’s original papers.
Theorem 1.2 (Sharkovsky Realization Theorem [14, 16]). Every tail of the Shar-
kovsky order is the set of periods for some continuous map of an interval into itself.
The Sharkovsky Theorem is the union of Theorem 1.1 and Theorem 1.2: a subset
of N is the set of periods for a continuous map of an interval to itself if and only if the
set is a tail of the Sharkovsky order.
All proofs of the Sharkovsky Theorem that we know are elementary, no matter how
ingenious; the Intermediate-Value Theorem is the deepest ingredient. There is varia-
tion in the clarity of the proof strategy and its implementation. Our aim is to present,
with all details, a direct proof of the Forcing Theorem that is conceptually simple and
involves no artificial case distinctions. Indeed, its directness provides additional infor-
mation (Section 8). We also reproduce a proof of the Realization Theorem in Section
7 at the end of this note.
The standard proof of the Sharkovsky Forcing Theorem studies orbits of odd period
with the property that their period comes earlier in the Sharkovsky sequence than any
other period for that map. It shows that such an orbit is of a special type, known as
a Štefan cycle,2 and then that such a cycle forces the presence of periodic orbits with
Sharkovsky-lesser periods. The second stage of the proof considers various cases in
which the period that comes earliest in the Sharkovsky order is even. Finally, this
approach requires special treatment of the case in which the set of periods consists of
all powers of 2.
We extract the essence of the first stage of the standard proof to produce an argument
that does not need Štefan cycles, and we replace the second stage of the standard proof
by a simple and natural induction. Our main idea is to select a salient sequence of orbit
points and to prove that this sequence “spirals out” in essentially the same way as the
Štefan cycles considered in the standard proof.
1.2. History. A capsule history of the Sharkovsky Theorem is in [11], and [1] pro-
vides much context. The first result in this direction was obtained by Coppel [5] in the
1950s: every point converges to a fixed point under iteration of a continuous map of
a closed interval if the map has no periodic points of period 2; it is an easy corollary
that a continuous map must have 2 as a period if it has any periodic points that are not
fixed. This amounts to 2 being the penultimate number in the Sharkovsky ordering.
Sharkovsky obtained the results described above and reproved Coppel’s theorem in
a series of papers published in the 1960s [14, 16]. He also worked on other aspects
of one-dimensional dynamics (see, for instance, [13, 15, 17]). Sharkovsky appears to
have been unaware of Coppel’s paper. His work did not become known outside eastern
Europe until the second half of the 1970s. In 1975 this MONTHLY published a famous
paper, “Period three implies chaos” [10] by Li and Yorke, which included the result that
the presence of a periodic point of period 3 implies the presence of periodic points of
2 “Š” is pronounced “Sh.”
1.3. Related Work. There is a wealth of literature related to periodic points for one-
dimensional dynamical systems. [1] is a good source of pertinent information. There is
a characterization of the exact structure of a periodic orbit whose period comes earliest
in the Sharkovsky order for a specific map. There is also work on generalizations to
other permutation patterns (how particular types of periodic points force the presence
of others, and how intertwined periodic orbits do so), to different one-dimensional
spaces (that look like the letter “Y,” the letter “X,” or a star “∗”), and to multivalued
maps.
Lemma 2.3 (Itinerary Lemma). If $J_0, \ldots, J_{n-1}$ are closed bounded intervals and $J_0 \xrightarrow{f} \cdots \xrightarrow{f} J_{n-1} \xrightarrow{f} J_0$ (this is called a loop or n-loop of intervals) then there is a point x that follows the loop, that is, $f^i(x) \in J_i$ for $0 \le i < n$ and $f^n(x) = x$.
Figure 1. Finding K J .
Thus there is a closed bounded interval K n−1 ⊂ Jn−1 such that K n−1 J0 . Then
Jn−2 → K n−1 , and so there is K n−2 ⊂ Jn−2 such that K n−2 K n−1 . Inductively, there
are closed bounded intervals K i ⊂ Ji , 0 ≤ i < n, such that
K 0 K 1 · · · K n−1 J0 .
We wish to ensure that the period of the point x found in Lemma 2.3 is n and not a proper divisor of n, such as for the 2-loop $[-1, 0] \rightleftarrows [0, 1]$ of $f(x) = -2x$, which is followed only by the fixed point 0.
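This example can be checked directly. The Python sketch below uses hypothetical helper names (`follows_loop`, `least_period`) and exact rational arithmetic. It tests whether a point follows a given loop of intervals and computes its least period. Only a finite grid of sample points is tested, so this is an illustration and not a proof.

```python
from fractions import Fraction

def f(x):
    return -2 * x

def follows_loop(f, x, loop):
    """True if f^i(x) lies in the i-th interval of the loop and f^n(x) = x."""
    y = x
    for a, b in loop:
        if not (a <= y <= b):
            return False
        y = f(y)
    return y == x

def least_period(f, x, max_period=64):
    """Smallest n >= 1 with f^n(x) = x, or None if none is found up to max_period."""
    y = x
    for n in range(1, max_period + 1):
        y = f(y)
        if y == x:
            return n
    return None

if __name__ == "__main__":
    loop = [(-1, 0), (0, 1)]
    print(follows_loop(f, Fraction(0), loop), least_period(f, Fraction(0)))
    # No sampled nonzero point of [-1, 0] follows the loop, since f^2(x) = 4x.
    print(any(follows_loop(f, Fraction(-k, 100), loop) for k in range(1, 101)))
```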
This makes it interesting to give convenient criteria for being elementary. The sim-
plest is that any loop of length 1 is elementary (since the period of a point that follows
such a loop must be a factor of 1). A criterion with wider utility is:
Proof. If x follows the loop, then $x \in \operatorname{Int}(J_0)$ because $x \in J_0$ and it is not an endpoint. If $0 < i < n$ then $f^i(x) \notin \operatorname{Int}(J_0)$ because it is in $J_i$, and so $x \neq f^i(x)$. Thus x has period n.
2.2. Cycles Produce Coverings. A closed bounded interval whose endpoints belong
to a cycle O of f is called an O-interval.
4 This is a different use of the word “elementary” from the one in [1].
3. EXAMPLES. The first example is the most celebrated special case of the Shar-
kovsky Theorem: that period 3 implies all periods. The second and third examples
apply the same method to longer cycles and illustrate how our choice of O-intervals
differs from that made in the standard proof. The last example illustrates our induction
argument, which is built on the doubling structure of the Sharkovsky order.
3.1. Period 3 Implies All Periods. A 3-cycle comes in two versions that are mirror
images of one another. In Figure 2, the dashed arrows indicate that x1 = f (x0 ), x2 =
f (x1 ), and x0 = f (x2 ). In both pictures, I1 is the O-interval with endpoints x0 and x1 ,
and I0 is the O-interval with endpoints x0 and x2 . The endpoints of I1 are mapped to
the very left and right points of the cycle, so we have the O-forced covering relations
I1 → I1 and I1 → I0 . The endpoints of I0 are mapped to those of I1 , and so I0 → I1
is O-forced. We summarize these covering relations by writing I1 I0 .
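[Figure 2. 3-cycles. Left: the points appear in the order x2, x0, x1, with I0 between x2 and x0 and I1 between x0 and x1. Right: the mirror image, with the points in the order x1, x0, x2.]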
$$I_0 \to \underbrace{I_1 \to I_1 \to \cdots \to I_1}_{l-1 \text{ copies of } I_1} \to I_0$$
is elementary if $l > 3$. Thus, f has a periodic point of period l for each $l > 3$.
This shows a special case of the Sharkovsky Theorem: the presence of a period-3
point causes every positive integer to be a period.
[Figure 3. A 7-cycle, with the points in the order x6, x4, x2, x0, x1, x3, x5.]
(1) I1 → I1 and I0 → I1 ,
(2) I1 → I2 → I3 → I4 → I5 → I0 , and
(3) I0 → I5 , I3 , I1 .
This information can be summarized in a graph as follows:
[Graph (2): the vertices I0, I1, I2, I3, I4, I5 with the arrows listed in (1)–(3).]
[Figure 4. A 9-cycle; the labeled points appear in the order x6, x4, x2, x0, x1, x3, x5.]
[Figure: graph with both axes marked at x6, x4, x2, x0, c, x1, x3, x5; caption not recovered.]
In the next section we abstract the properties of the endpoints of the intervals
I0 , I1 , . . . that are essential to the above argument.
3.4. A 6-cycle. Consider the 6-cycle in Figure 6. The salient feature here is that the
3 points in the left half are mapped to the 3 points in the right half and vice versa.
Therefore, the 3 points in the right half form a cycle for the second iterate $f^2$. As in Subsection 3.1 we have the covering relations $I_1 \xrightarrow{f^2} I_1$, $I_1 \xrightarrow{f^2} I_0$, and $I_0 \xrightarrow{f^2} I_1$ for the intervals $I_0$ and $I_1$ shown in Figure 6. We can conclude as before that $f^2$ has elementary loops of all lengths.
[Figure 6. A 6-cycle, with the points x0 and x1 marked.]
For f itself we choose two additional intervals $I_0'$ and $I_1'$ by taking $I_j'$ to be the shortest O-interval that contains $f(I_j \cap O)$.
We now illustrate a recursive method we will use later: we show how to associate
with an elementary k-loop for f 2 an elementary 2k-loop for f itself. In the present
example this then tells us that every even number is a period.
Consider an elementary k-loop for $f^2$ made using the covering relations $I_1 \xrightarrow{f^2} I_1$, $I_1 \xrightarrow{f^2} I_0$, and $I_0 \xrightarrow{f^2} I_1$. Replace each occurrence of “$I_1 \xrightarrow{f^2}$” by “$I_1 \xrightarrow{f} I_1' \xrightarrow{f}$” and each occurrence of “$I_0 \xrightarrow{f^2}$” by “$I_0 \xrightarrow{f} I_0' \xrightarrow{f}$” and note that this produces a 2k-loop
for f that is not a k-loop traversed twice (which would cause difficulty with being
elementary). We show that it is elementary using the definition of elementary. Suppose
a point p follows the 2k-loop under f . We need to show that it has period 2k for f .
Observe that p follows the original elementary k-loop under f 2 and hence has period
k for f 2 . On the other hand, the iterates of p under f are alternately to the left and
the right of the middle interval (x0 , x1 ) since the 2k-loop for f alternates between
primed and unprimed intervals. Therefore, the orbit of p consists of 2k distinct points;
there are k even iterates on the right and k odd iterates on the left. This means that the
period of p for f is 2k. Since k was arbitrary, we infer that this 6-cycle forces all even
periods (as well as period 1 due to the interval [x0 , x1 ] in the center, which covers itself
under f ).
In the next 3 sections we prove the Sharkovsky Forcing Theorem 1.1. We first show
that the existence of a special sequence in an m-cycle O produces all desired cycles.
Next we construct such a sequence under a mild assumption on O. Finally we reduce
the general case to this latter one.
Definition 4.1. Let p be the rightmost of those points in O for which f ( p) > p, and
q the point of O to the immediate right of p.
We define the center of O by c := ( p + q)/2. For x ∈ O we denote by Ox ⊂ O the
set of points of O in the closed interval bounded by x and c. That is, Ox = O ∩ [x, p]
when x ≤ p, and Ox = O ∩ [q, x] when x ≥ q.
We say that a point x ∈ O switches sides if c is between x and f (x).
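The quantities in Definition 4.1 are easy to compute for a concrete cycle. The Python sketch below assumes the cycle is given as a dictionary mapping each point of O to its image under f. The function name and the sample 3-cycle on the points 1, 2, 3 are illustrative choices.

```python
def center_data(cycle_map):
    """Return (p, q, c, switches) for a cycle O with more than one point.

    cycle_map maps each point x of O to f(x).
    """
    pts = sorted(cycle_map)
    # p: the rightmost point of O that f moves to the right.
    p = max(x for x in pts if cycle_map[x] > x)
    # q: the point of O immediately to the right of p; it exists because
    # the largest point of O is necessarily moved to the left.
    q = pts[pts.index(p) + 1]
    c = (p + q) / 2
    # x switches sides when x and f(x) lie on opposite sides of c.
    switches = {x: (x < c) != (cycle_map[x] < c) for x in pts}
    return p, q, c, switches

if __name__ == "__main__":
    # A 3-cycle 2 -> 3 -> 1 -> 2 on the points 1 < 2 < 3.
    print(center_data({2: 3, 3: 1, 1: 2}))
```

In this 3-cycle the point 1 does not switch sides, so the cycle falls under the hypothesis "contains a point that does not switch sides" used later.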
Remark 4.3. The condition $x_{j+1} \in O_{f(x_j)}$ in (Š3) means that $c < x_{j+1} \le f(x_j)$ if $x_j < c$ and $c > x_{j+1} \ge f(x_j)$ if $x_j > c$.
(Š2) implies that x0 , . . . , xn are pairwise distinct. Hence n + 1 ≤ m and so n < m.
Figure 2 and Figure 3 show Štefan sequences that happen to consist of the entire cycle;
we have n + 1 = m in these cases. Figure 4 provides an illustration in which a Štefan
sequence is a proper subset of the cycle and n + 1 < m.
(Š1) and (Š4) together imply that n ≥ 2 and hence m ≥ 3. Note that for m = 1 the
Sharkovsky Forcing Theorem is vacuously true and for m = 2 it is an application of
Lemma 2.2.
Proposition 4.4. Suppose that the m-cycle O has a Štefan sequence. If $l \triangleleft m$, then f has an O-forced elementary l-loop of O-intervals and hence a periodic point with least period l.
Proposition 4.5. With I j chosen as above, we have the following O-forced covering
relations.
(1) I1 → I1 and I0 → I1 .
(2) I1 → I2 → · · · → In−1 → I0 .
(3) I0 → In−1 , In−3 , . . . .
They can be summarized in a graph as follows:
[Graph on the vertices I0, I1, ..., In−1 with the arrows listed in (1)–(3).]
From the graph in Proposition 4.5 we read off the following loops:
(L1) $I_1 \to I_1$;
(L2) $I_0 \to I_{n-(l-1)} \to I_{n-(l-2)} \to \cdots \to I_{n-2} \to I_{n-1} \to I_0$ for even $l \le n$;
(L3) $I_0 \to I_1 \to I_1 \to \cdots \to I_1 \to I_2 \to \cdots \to I_{n-1} \to I_0$ with $r \ge 1$ repetitions of $I_1$ (and hence of length $l = n - 1 + r$).
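The loop lengths listed in (L2) and (L3) can be cross-checked by a small graph computation. The Python sketch below builds the covering graph of Proposition 4.5 and records the lengths of closed walks through I0. It assumes that item (3) means I0 covers $I_{n-1}, I_{n-3}, \ldots$ down to the smallest such index that is at least 1. It only counts closed walks and does not verify that a loop is elementary, which the proof handles separately.

```python
def covering_graph(n):
    """Adjacency sets for I_0, ..., I_{n-1} (vertices are the indices); needs n >= 2."""
    edges = {j: set() for j in range(n)}
    edges[1].add(1)              # (1) I1 -> I1
    edges[0].add(1)              # (1) I0 -> I1
    for j in range(1, n - 1):    # (2) I1 -> I2 -> ... -> I_{n-1}
        edges[j].add(j + 1)
    edges[n - 1].add(0)          # (2) I_{n-1} -> I0
    j = n - 1
    while j >= 1:                # (3) I0 -> I_{n-1}, I_{n-3}, ... (assumed reading)
        edges[0].add(j)
        j -= 2
    return edges

def closed_walk_lengths(n, max_len):
    """Lengths l <= max_len for which some closed walk of length l passes through I0."""
    edges = covering_graph(n)
    reach = {0}
    lengths = []
    for l in range(1, max_len + 1):
        reach = {w for v in reach for w in edges[v]}
        if 0 in reach:
            lengths.append(l)
    return lengths

if __name__ == "__main__":
    print(closed_walk_lengths(6, 12))  # the 7-cycle of Figure 3 has n = 6
    print(closed_walk_lengths(2, 6))   # a 3-cycle has n = 2
```

For n = 6 the odd lengths 3 and 5 are absent, which matches the fact that 3 and 5 are to the left of 7 in the Sharkovsky ordering.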
Proposition 5.1. A cycle with more than one point contains a Štefan sequence unless
every point switches sides.
$$f(O_{f(x)}) \subset O_{f(\sigma(x))}.$$
We noted that x and σ(x) are on opposite sides of c, so $\sigma^2(x)$, if defined, is again on the same side as x. It is crucial for obtaining the outward spiraling in (Š2) that $\sigma^2(x)$ is further from c than x, i.e., that $\sigma^2(x) \notin O_x$.
Lemma 5.2. If there is an $x \in S$ such that $\sigma^2(x) \in O_x$, then all points of O switch sides.
Proof. In order to have $\sigma^2(x)$ defined and in $O_x$, we must have x, $y := \sigma(x)$, and $z := \sigma(y) = \sigma^2(x)$ all in S. Moreover σ(x) and σ(y) are both obtained using case (ii) in the definition of σ. Hence
and
$$O_{f(z)} \subset O_{f(x)}.$$
Combining the above inclusions shows that $O_{f(x)} \cup O_{f(y)}$ is mapped into itself by f. Since f is a cyclic permutation of O, the only nonempty f-invariant subset of O is O itself. Thus $O = O_{f(x)} \cup O_{f(y)}$. But all points of this set switch sides because x and y are in S.
To conclude the proof of Proposition 5.1 we now suppose that there is a point of O
that does not switch sides and show that this implies the existence of a Štefan sequence.
The contrapositive of Lemma 5.2 implies that we cannot have both σ(p) = q and σ(q) = p. Therefore we can choose $\{x_0, x_1\} = \{p, q\}$ in such a way that $x_2 := \sigma(x_1) \neq x_0$ and then continue to choose $x_{i+1} = \sigma(x_i)$ while $x_i \in S$. We now verify that this produces a Štefan sequence.
Our choice of {x0 , x1 } = { p, q} gives (Š1).
Proposition 5.1 and Proposition 4.4 give the following main case of the Sharkovsky
Theorem.
Proposition 5.3. If an m-cycle O with $m \ge 2$ contains a point that does not switch sides, then for each $l \triangleleft m$ there is an elementary, O-forced l-loop of O-intervals, and hence an l-cycle.
$$I_0 \xrightarrow{f^2} I_1 \xrightarrow{f^2} I_2 \xrightarrow{f^2} \cdots \xrightarrow{f^2} I_{k-1} \xrightarrow{f^2} I_0 \tag{3}$$
(a) $T_0$ has only one periodic point (the fixed point 0) while the tent map $T_1$ has a 3-cycle {2/7, 4/7, 6/7} and hence has all natural numbers as periods by the Sharkovsky Forcing Theorem 1.1.
(b) Any cycle O ⊂ [0, h) of Th is a cycle for T1 , and any cycle O ⊂ [0, h] of T1 is
a cycle for Th .
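The statements about the tent map can be checked with exact arithmetic. The definition of $T_h$ does not survive in this excerpt, so the Python sketch below assumes the usual truncated tent map $T_h(x) = \min(h, 1 - |2x - 1|)$ on [0, 1]. The function names are illustrative.

```python
from fractions import Fraction

def tent(x, h=Fraction(1)):
    """Assumed truncated tent map T_h(x) = min(h, 1 - |2x - 1|) on [0, 1]."""
    return min(h, 1 - abs(2 * x - 1))

def iterate(x, n, h=Fraction(1)):
    """Apply T_h to x a total of n times."""
    for _ in range(n):
        x = tent(x, h)
    return x

def fixed_points_of_iterate(n):
    """Fixed points of T_1^n, one on each of its 2^n linear branches (n >= 1)."""
    N = 2 ** n
    points = set()
    for j in range(N):
        if j % 2 == 0:
            points.add(Fraction(j, N - 1))      # increasing branch: N x - j = x
        else:
            points.add(Fraction(j + 1, N + 1))  # decreasing branch: j + 1 - N x = x
    return points

if __name__ == "__main__":
    x = Fraction(2, 7)
    print([iterate(x, k) for k in range(4)])     # the 3-cycle 2/7 -> 4/7 -> 6/7 -> 2/7
    print(len(fixed_points_of_iterate(3)))       # number of fixed points of T_1^3
    print(tent(Fraction(4, 7), Fraction(5, 7)))  # truncating below 6/7 breaks the cycle
```

With h = 6/7 the cycle {2/7, 4/7, 6/7} lies in [0, h] and survives truncation, in line with (b). With h = 5/7 the point 4/7 is sent to 5/7 instead of 6/7.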
What makes the proof so elegant is that h plays three roles: as a parameter, as the maxi-
mum value of Th , and as a point of an orbit. The key idea is to let h(m) := min{max O |
O is an m-cycle of T1 } for m ∈ N. (We can write “min” instead of “inf” because T1 has
a finite number of periodic points for each period.5 ) From this and (b) we obtain:
5 Inspection of the graph of $T_1^n$ shows that it has exactly $2^n$ fixed points.
8. CONCLUSION. It may be of interest to note that the proof given here provides
more information than the statement of the Sharkovsky Forcing Theorem 1.1. When in
the proof of Proposition 4.4 we treated the loops in (L3) on page 238 we only needed
to know that $n \le l \neq m$. Therefore Proposition 4.4 can be amplified to the following:
Furthermore we must have f (xi ) = xi+1 for 0 ≤ i < m − 1. These orbits are called
Štefan cycles. They are central to the standard proof of the Sharkovsky Theorem. Our
proof is more direct because we do not need these cycles, but they inspired our defini-
tion of Štefan sequences.
ACKNOWLEDGMENTS. We thank the Instituto Superior Técnico of the Universidade Técnica de Lisboa
and the Mittag-Leffler Institute of the Royal Swedish Academy of Sciences for their hospitality and support.
We are also grateful to Wah Kwan Ku for pointing out an error in a draft of this paper, Aaron Brown and
Stephen Sherman for expository suggestions, and the referee and Dan Velleman for extensive corrections and
improvements. Keith Burns was partially supported by NSF grant DMS-0701140.
REFERENCES
1. L. Alsedà, J. Llibre, and M. Misiurewicz, Combinatorial Dynamics and Entropy in Dimension One, 2nd
ed., Advanced Series in Nonlinear Dynamics, vol. 5, World Scientific, River Edge, NJ, 2000.
2. R. Barton and K. Burns, A simple special case of Sharkovskii’s theorem, Amer. Math. Monthly 107 (2000)
932–933. doi:10.2307/2695586
3. L. Block, J. Guckenheimer, M. Misiurewicz, and L.-S. Young, Periodic points and topological entropy
of one-dimensional maps, in Global Theory of Dynamical Systems, Z. Nitecki and C. Robinson, eds.,
Lecture Notes in Mathematics, vol. 819, Springer Verlag, Berlin, 1980, 18–34.
4. U. Burkart, Interval mapping graphs and periodic points of continuous functions, J. Combin. Theory Ser.
B 32 (1982) 57–68. doi:10.1016/0095-8956(82)90076-4
5. W. A. Coppel, The solution of equations by iteration, Proc. Cambridge Philos. Soc. 51 (1955) 41–43.
doi:10.1017/S030500410002990X
6. B.-S. Du, A simple proof of Sharkovsky’s theorem, Amer. Math. Monthly 111 (2004) 595–599. doi:10.2307/4145161
7. B.-S. Du, A simple proof of Sharkovsky’s theorem revisited, Amer. Math. Monthly 114 (2007) 152–155.
8. B.-S. Du, A collection of simple proofs of Sharkovsky’s theorem (2007), available at http://arxiv.org/abs/math/0703592.
9. C. W. Ho and C. Morris, A graph-theoretic proof of Sharkovsky’s theorem on the periodic points of
continuous functions, Pacific J. Math. 96 (1981) 361–370.
10. T.-Y. Li and J. A. Yorke, Period three implies chaos, Amer. Math. Monthly 82 (1975) 985–992. doi:10.2307/2318254
11. M. Misiurewicz, Remarks on Sharkovsky’s Theorem, Amer. Math. Monthly 104 (1997) 846–847. doi:10.2307/2975290
12. Z. Nitecki, Topological dynamics on the interval, in Ergodic Theory and Dynamical Systems II–College
Park, MD 1979–1980, Progr. Math., vol. 21, Birkhäuser, Boston, 1982, 1–73.
13. A. N. Sharkovskiı̆, The reducibility of a continuous function of a real variable and the structure of the
stationary points of the corresponding iteration process, Dokl. Akad. Nauk RSR 139 (1961) 1067–1070.
14. A. N. Sharkovskiı̆, Coexistence of cycles of a continuous map of the line into itself, Ukrain. Mat. Zh. 16 (1964) 61–71; trans. J. Tolosa, Proceedings of “Thirty Years after Sharkovskiı̆’s Theorem: New Perspectives” (Murcia, Spain 1994), Internat. J. Bifur. Chaos Appl. Sci. Engrg. 5 (1995) 1263–1273.
15. A. N. Sharkovskiı̆, Fixed points and the center of a continuous mapping of the line into itself, Dopovidi Akad. Nauk Ukr. RSR 1964 (1964) 865–868.
16. A. N. Sharkovskiı̆, On cycles and structure of a continuous mapping, Ukrain. Mat. Zh. 17 (1965) 104–111. doi:10.1007/BF02527365
17. A. N. Sharkovskiı̆, The set of convergence of one-dimensional iterations, Dopovidi Akad. Nauk Ukr. RSR 1966 (1966) 866–870.
18. P. Štefan, A theorem of Šarkovskii on the existence of periodic orbits of continuous endomorphisms of
the real line, Comm. Math. Phys. 54 (1977) 237–248. doi:10.1007/BF01614086
19. P. D. Straffin Jr., Periodic points of continuous functions, Math. Mag. 51 (1978) 99–105. doi:10.2307/2690145
BORIS HASSELBLATT is Professor and Chair of Mathematics at Tufts University. He obtained a Vordiplom
in Physics from the Technische Universität Berlin in 1981, an M.A. in mathematics from the University of
Maryland in College Park in 1984, and a Ph.D. in mathematics from the California Institute of Technology in
1989. He studies mainly hyperbolic dynamical systems, often of types that are geometrically motivated. When
he is not doing mathematics he enjoys singing. He dedicates this article to all those who do not dedicate their
articles to themselves.
Department of Mathematics, Tufts University, Medford, MA 02155
Boris.Hasselblatt@tufts.edu
Polanyi on Mathematics
“All these difficulties are but consequences of our refusal to see that mathemat-
ics cannot be defined without acknowledging its most obvious feature: namely,
that it is interesting. Nowhere is intellectual beauty so deeply felt and fastidi-
ously appreciated in its various grades and qualities as in mathematics, and only
the informal appreciation of mathematical value can distinguish what is math-
ematics from a welter of formally similar, yet altogether trivial statements and
operations.”
Abstract. This article is an introduction to the common algebraic methods used to study both
solutions to polynomial equations and solutions to differential equations: Galois theory and
differential Galois theory. We develop both theories simultaneously by studying the solutions
to the polynomial equation $x^5 - 4x^2 - 2 = 0$ and the solutions to the differential equation $u' = t - u^2$.
1. INTRODUCTION. The object of this paper is to prove that the differential equa-
tion
$$u' = t - u^2 \tag{1}$$
$$x^5 - 4x^2 - 2 = 0 \tag{2}$$
has no solutions which can be written as radicals of solutions to lower degree polyno-
mial equations.
The paper is written with a reader in mind who at some point studied Galois theory:
either very recently and is therefore not an expert, or long ago and has since forgotten
many of the finer points. The examples are chosen to illuminate the theorems: it is
scarcely possible to give all the details of every proof in this article. For further reading,
we recommend [4], [5], [7], [8], [9], or [10]. For the basics of Galois theory, differential
equations, or algebraic groups, see [1], [2], or [3], respectively.
2. SPLITTING FIELDS. Our first step will be to determine where solutions to equa-
tions (1) and (2) lie. Recall that a field is a set in which an addition, subtraction, multi-
plication, and division are defined, and that these operations satisfy the rules which one
expects from elementary arithmetic. Three standard examples are the rational numbers
Q, the real numbers R, and the complex numbers C. All fields in this paper will have
characteristic 0.
For the case of polynomials, we now have all of the background we need.
doi:10.4169/amer.math.monthly.118.03.245
The reason we can be sure that all solutions to a polynomial f lie in some subfield
of C is the fundamental theorem of algebra, which says that any degree-n polynomial
with coefficients in C has n (not necessarily distinct) roots in C.
Remark. The splitting field need not be thought of as a subfield of C. We need only fix an algebraic closure of Q and work in there (as any algebraic closure contains the roots of any polynomial, by definition). We find it easier to think of subfields of C, but that is just a crutch.
In fact, the results on the Galois theory of polynomials that we collect in this section
and the next are true in a much more general setting: if F is any characteristic-zero
field, we will have the same results for a polynomial f with coefficients in F, provided
that we fix an algebraic closure of F to begin with. This level of generality is not
appropriate in these introductory sections, but will be necessary in Proposition 4.6.
To deal with differential equations rather than polynomial equations, one must con-
sider fields with a bit more structure.
Definition 2.3. A differential field, here called a D-field, is a field F, together with a
derivation δ : F → F which satisfies the rules
The first example of a D-field is C(t), the set of all rational functions in one variable
with complex coefficients, with the usual addition and multiplication, and with the
derivation given by the ordinary derivative. The standard rules for differentiating say
has a unique solution in M(U ) for any t0 in U with any given initial conditions
$u(t_0), u'(t_0), \ldots, u^{(k-1)}(t_0)$.
Definition 2.4. If (F, δ) is a differential field, then the constants of F are the elements
u ∈ F such that δ(u) = 0.
In this paper, the field of constants will always be C. We now have the background
needed to define where the solutions to a differential equation lie.
Definition 2.5. Let $L(u) = \delta^n(u) + \alpha_{n-1}\delta^{n-1}(u) + \cdots + \alpha_0 u$ be a differential operator where the $\alpha_i \in F \subset M(U)$ are all analytic in some simply connected open set $U \subset \mathbb{C}$. Then the differential splitting field of L over F in M(U), denoted $E_L^U$, is the smallest D-subfield of M(U) containing F and all solutions of L(u) = 0 on U.
We will suppress U from the notation and write $E_L$ for $E_L^U$ when no confusion can arise.
Example 2.6. One of the simplest differential operators is $L(u) = u' - u$. In this case the only coefficients are the numbers ±1, which are certainly analytic on all of C, so we may consider the splitting field over C(t) to be the smallest D-subfield of M(C) containing the rational functions and the solutions of the differential equation $u' = u$, that is, all functions of the form $Ce^t$. It should be clear that this subfield, $E_L$, is precisely the space of functions of the form
where the $p_i$ and $q_j$ are polynomials with coefficients in C, with the denominator not identically zero. Indeed, this is a differential field (clearly closed under addition, multiplication, division, and differentiation), and more or less obviously the smallest such field containing the constants and $e^t$. One should think of this splitting field as a close analog of the numbers of the form $a + b\sqrt{2}$.
Definition 3.1. Let F ⊂ C be a field and suppose K /F is any field extension. The
Galois group, denoted Gal(K /F), is the group of all field automorphisms of K which
leave F fixed, where the group law is given by composition of automorphisms. If f is
a polynomial with coefficients in F, then we call Gal(E f /F) the Galois group of f .
We conclude that elements of the Galois group of f permute the roots of f . Conse-
quently, if we denote the set of roots of f by R f , then there is a group homomorphism
so that not only does Gal(E f /F) permute the roots of f , it permutes the roots of the
irreducible factors of f . Because of this, we can focus solely on the case where f
itself is irreducible for the remainder of the paper.
Example 3.2. If $f(x) = x^2 - 2$ as in Example 2.2, then the group $\operatorname{Gal}(E_f/\mathbb{Q})$ is the group of permutations of $\{\sqrt{2}, -\sqrt{2}\}$.
Example 3.3. Let $f(x) = x^5 - 1$. Then the set of roots is the set with the five elements
$\omega_k = e^{2k\pi i/5}$, with k = 0, . . . , 4. Clearly the Galois group is not the whole group of
permutations; no automorphism can map 1 to anything else. This is a particular case
of the following general statement: the Galois group acts transitively on the roots of a
polynomial f (i.e., given any two roots $a_1, a_2$ of f there exists $\sigma \in \mathrm{Gal}(E_f/\mathbb{Q})$ such
that $\sigma(a_1) = a_2$) if and only if f is irreducible.
In our case, $x^5 - 1 = (x^4 + x^3 + x^2 + x + 1)(x - 1)$, and the roots of the two factors
cannot get mixed up. How about the other roots? It is not quite obvious, but $\omega_1$ can
be mapped to any other root $\omega_k$, k = 1, 2, 3, 4, by an automorphism $\sigma \in \mathrm{Gal}(E_f/\mathbb{Q})$.
Knowing $\sigma(\omega_1)$ completely determines σ, since
\[ \sigma(\omega_k) = \sigma(\omega_1^{k}) = \sigma(\omega_1)^{k}. \]
Once you see that, it is not hard to see that the Galois group is the multiplicative group
of Z/5Z, which is a cyclic group of order 4. The same argument shows that the Galois
group of the polynomial $x^p - 1$ is the multiplicative group of the field Z/pZ for any
prime p.
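The identification with the multiplicative group of Z/pZ can be checked by brute force. The following is a minimal sketch, not part of the original argument: it represents the automorphism $\omega_1 \mapsto \omega_1^{k}$ by its action $m \mapsto km \bmod p$ on the exponents of the roots, and the function names are illustrative only.

```python
def multiplicative_order(k, p):
    """Smallest e >= 1 with k**e congruent to 1 modulo the prime p."""
    e, power = 1, k % p
    while power != 1:
        power = (power * k) % p
        e += 1
    return e


def generators(p):
    """Residues whose powers exhaust the nonzero residues modulo p."""
    return [k for k in range(1, p) if multiplicative_order(k, p) == p - 1]


def sigma(k, p):
    """Action of omega_1 -> omega_1**k on exponents: omega_m -> omega_(k*m mod p)."""
    return [(k * m) % p for m in range(p)]


def compose(s, t):
    """Apply t first, then s (both given as lists of images of 0..p-1)."""
    return [s[t[m]] for m in range(len(t))]


if __name__ == "__main__":
    p = 5
    print("generators of (Z/5Z)*:", generators(p))
    print("sigma_2 acts on exponents as", sigma(2, p))
    # Composition of automorphisms corresponds to multiplication mod p.
    print(compose(sigma(2, p), sigma(3, p)) == sigma((2 * 3) % p, p))
```

For p = 5 the residues 2 and 3 each generate all four nonzero residues, which is the cyclic group of order 4 described above.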
In Examples 3.2 and 3.3 we saw examples of Galois extensions. We now look at an
extension which is not a Galois extension.
\[ a + b \cdot 2^{1/3} + c \cdot 4^{1/3} \]
for any rational numbers a, b, c. In this case, $\mathrm{Gal}(F/\mathbb{Q}) = \{1\}$ since any element of
the group must send $2^{1/3}$ to a cube root of 2, and there are no other such roots in F.
Thus, F/Q is not a Galois extension.
Example 3.8. Consider the polynomial $f(x) = x^3 - 2$. This has three roots, and the
Galois group $\mathrm{Gal}(E_f/\mathbb{Q})$ is the full group of permutations of these roots³ and so has
order six. The splitting field, $E_f$, contains the ratios of the roots, which are cube roots
of unity. If we set $E_g$ to be the splitting field of $g(x) = x^2 + x + 1$, then $\mathrm{Gal}(E_f/E_g)$
is cyclic of order three. Therefore, $\mathrm{Gal}(E_f/E_g) \trianglelefteq \mathrm{Gal}(E_f/\mathbb{Q})$ since index-two
subgroups are always normal. According to Theorem 3.7, then, $E_g/\mathbb{Q}$ is a Galois
extension with $\mathrm{Gal}(E_g/\mathbb{Q}) \cong \mathrm{Gal}(E_f/\mathbb{Q})/\mathrm{Gal}(E_f/E_g)$.
More specifically, set $\omega = e^{2\pi i/3}$. Then $\mathrm{Gal}(E_f/\mathbb{Q})$ is generated by complex
conjugation and the unique field automorphism σ which maps $2^{1/3} \mapsto \omega \cdot 2^{1/3}$, whereas
$\mathrm{Gal}(E_f/E_g)$ is generated by just σ, since complex conjugation is not the identity on
$E_g$. This last statement shows that $\mathrm{Gal}(E_g/\mathbb{Q})$ is isomorphic to the quotient group,
which is generated by the coset containing complex conjugation, and so is of order
two.
Finally, we consider the order-two subgroup of Gal(E f /Q) generated by complex
conjugation. The field corresponding to this subgroup under the bijection in Theorem
3.7 is precisely the field considered in Example 3.6. There, we saw that this field was
not a Galois extension of Q. This corresponds to the fact that no subgroup of two
elements is normal in Gal(E f /Q).
3 In particular, the real cubic root of 2 cannot be algebraically distinguished from the other roots.
Definition 4.1. Let $(K, \Delta)$ be a differential extension of the differential field $(F, \delta)$.
The differential Galois group, DGal(K/F), is the group of field automorphisms
σ : K → K which restrict to the identity on F and satisfy $\sigma(\Delta(u)) = \Delta(\sigma(u))$ for all
u ∈ K, where the group law is given by composition of automorphisms. If L is a linear
differential operator with coefficients in F, then we call $\mathrm{DGal}(E_L/F)$ the differential
Galois group of L.
Example 4.2. Let L be the operator given by $L(u) = u' - u$; in Example 2.6, the
splitting field of L was determined. Any automorphism of $E_L$ must send one solution
of $u' = u$ to another, so in particular, it must send $e^{t}$ to $Ce^{t}$ for some nonzero complex
number C. Moreover, C completely determines the D-Galois automorphism.
Consequently, the D-Galois group is $\mathbb{C}^* = \mathrm{GL}_1(\mathbb{C})$, the multiplicative group of the complex
numbers.
Theorem 4.3. The differential Galois group DGal(E L /K ) of a linear differential op-
erator L is an algebraic subgroup of GL(VL ); that is, it is a subset defined by finitely
many polynomial equations.
Let us see what this says for a few examples. The additive group C has lots of
subgroups, isomorphic to Z, Z ⊕ Z, R, etc., but none of them are algebraic. For instance,
Z is defined by the equation $\sin \pi z = 0$, but $f(z) = \sin(\pi z)$ is not a polynomial (nor
is the function $f(z) = z - \bar{z}$, which vanishes exactly on R). The group $\mathbb{C}^*$ also has
lots of subgroups, but only those consisting of the nth roots of unity for some n are
algebraic (obviously defined by the single equation $z^n - 1 = 0$).
The next theorem is a strong result on algebraic groups, and will be necessary for
the proof of Theorem 8.1.
Example 4.5. Let U ⊂ C be the open unit disk and consider the linear differential
operator
\[ L(u) = u' + \frac{t}{1 - t^2}\,u, \]
whose coefficients are analytic on U. Let w be an analytic branch of $\sqrt{1 - t^2}$ on U,
for instance the one which is positive on (−1, 1). Then L(Cw) = 0 for any C ∈ C,
and the splitting field $E_L \subset M(U)$ of L over C(t) is the set of functions of the form
$u + v\sqrt{1 - t^2}$ with u, v ∈ C(t).
Since L is a first-order operator, we see that $\mathrm{DGal}(E_L/\mathbb{C}(t))$ is a subgroup of
$\mathrm{GL}_1(\mathbb{C}) = \mathbb{C}^*$. Let $\sigma \in \mathrm{DGal}(E_L/\mathbb{C}(t))$, so that σ(w) = Cw for some C ∈ C. Then
$\sigma(w)^2 = C^2 w^2 = C^2(1 - t^2)$. But also $\sigma(w)^2 = \sigma(w^2) = \sigma(1 - t^2) = 1 - t^2$, since
σ fixes $1 - t^2 \in \mathbb{C}(t)$. Therefore, $C^2 = 1$. Thus, $\mathrm{DGal}(E_L/\mathbb{C}(t))$ is the group of two
elements: the identity automorphism and the automorphism which exchanges $\sqrt{1 - t^2}$
and $-\sqrt{1 - t^2}$.
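As a numerical sanity check of this example (a sketch only, restricted to real t in (−1, 1) and using a central finite difference in place of the exact derivative), one can confirm that Cw is annihilated by L for C = ±1. The helper names below are illustrative and do not come from the text.

```python
import math


def w(t):
    """Branch of sqrt(1 - t^2) that is positive on (-1, 1)."""
    return math.sqrt(1.0 - t * t)


def residual(t, c=1.0, h=1e-6):
    """Approximate L(c*w)(t) = (c*w)'(t) + t/(1 - t^2) * c*w(t)."""
    derivative = c * (w(t + h) - w(t - h)) / (2.0 * h)
    return derivative + t / (1.0 - t * t) * c * w(t)


if __name__ == "__main__":
    for t in (-0.9, -0.3, 0.0, 0.5, 0.8):
        # Both w and -w should give residuals at the level of rounding error.
        print(t, residual(t, 1.0), residual(t, -1.0))
```

The residuals are small for both signs, consistent with the two automorphisms found above: each sends the solution w to another solution, w or −w.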
This illustrates the following fact: even if a linear differential operator is irreducible,
in the sense that it is not the composition of two linear differential operators of lower
degree, the differential Galois group of the splitting field may well not act transitively
on the nonzero solutions, which may have an “individuality” of their own.
The coefficients of this polynomial are fixed under DGal(E L /F), hence in F by The-
orem 4.8 below. Thus, f is a polynomial with coefficients in F which has v as a root,
so that v is algebraic over F, as desired.
where the sign is + if and only if the number of factors is divisible by four.
A priori, this looks like an element of $E_f$, but it is clearly fixed by $\mathrm{Gal}(E_f/F)$,
and is therefore an element of F by Theorem 3.5. In $E_f$, the discriminant $\Delta(f)$ is
the square of $\prod_{i<j}(x_i - x_j)$ (that is what the sign was for), but it is not necessarily
a square in F. If it is not a square, there is an intermediate field between F and $E_f$,
namely $F(\sqrt{\Delta(f)})$. It is fairly easy to understand the relation between the various
Galois groups.
Proposition 5.2. We have $\mathrm{Gal}(E_f/F(\sqrt{\Delta(f)})) = \mathrm{Gal}(E_f/F) \cap \mathrm{Alt}(R_f)$, where
$\mathrm{Alt}(R_f) \subset \mathrm{Perm}(R_f)$ is the subgroup of even permutations. In particular, the
discriminant is a square in F precisely when $\mathrm{Gal}(E_f/F)$ consists entirely of even permutations
of the roots of f.
Proof. An even permutation σ can be written as a product of an even number of
transpositions, and hence it will not change the sign of (and hence fixes) $\prod_{i<j}(x_i - x_j)$.
4 For the sake of comparison with the Wronskian, it might be better to define the discriminant in terms of
the resultant of f , but we avoid that here because it is longer and more technical.
Unfortunately, it would seem from the definition that one would have to know a
basis for the space of solutions to L(u) = 0 to compute the Wronskian. That, in gen-
eral, could be very difficult to come by. However, the following proposition gives us a
method of computing the Wronskian just by considering the matrix A L .
u 0 = Tr(A L )u.
Proof. To see the first fact, we use the Jacobi identity for invertible n × n matrices:
Suppose now that the Wronskian is 0 at $t_0 \in U$. Then there is some nontrivial linear combination
w of $u_1, \ldots, u_n$ such that $w(t_0) = w'(t_0) = \cdots = w^{(n-1)}(t_0) = 0$. However, L(w) =
0, so by the uniqueness theorem for solutions to L(u) = 0, we must have that w is the
constant function 0. This violates the linear independence of the solutions $u_1, \ldots, u_n$,
so we may conclude that $\mathrm{Wr}_L$ is nonvanishing on U. Thus, we may divide to get
\[ \frac{\mathrm{Wr}_L'}{\mathrm{Wr}_L} = \operatorname{Tr} A_L \]
as functions on the simply connected set U. Thus, if we pick any $t_0 \in U$, the integral
\[ \int_{t_0}^{t} \operatorname{Tr} A_L(s)\,ds = \int_{t_0}^{t} \frac{\mathrm{Wr}_L'(s)}{\mathrm{Wr}_L(s)}\,ds \]
is independent of the path chosen from $t_0$ to t and the second part of the proposition
follows.
Again, if the Wronskian is not in the original D-field, this gives an intermediate D-
field extension F ⊂ F(Wr L ) ⊂ E L , and it is not too difficult to understand the effect
on the D-Galois groups.
Proof. Let σ ∈ DGal(E L /F). Then, as discussed above, σ permutes the solutions of
L(u) = 0. As W L is determined by these solutions, we see that σ acts on W L by acting
on its entries. We will denote this action by σ (W L ). Since σ is a field homomorphism,
and as det W L is a sum of products of elements of F, we can conclude σ (Wr L ) =
σ (det W L ) = det(σ (W L )).
By identifying DGal(E L /F) and GL(VL ), we can find a matrix S ∈ GL(VL ) such
that σ (W L ) = W L S.5 Therefore,
then a cube root of an element of the field generated by the first extension.
{1} = G n ⊂ G n−1 ⊂ · · · ⊂ G 0 ⊂ G −1 = G,
such that each G j is a normal subgroup of G j−1 and the quotient groups G j−1 /G j are
all abelian.
Standard examples of solvable finite groups are the symmetric groups S3 and S4 ,
the latter via the chain
{1} ⊂ V ⊂ A4 ⊂ S4 ,
where V is the Klein four-group and A4 is the alternating group. The groups Sn and
An are not solvable, however, for n ≥ 5.
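The solvability claims for small symmetric groups can be verified by brute force. The sketch below uses the derived series (repeatedly passing to the subgroup generated by commutators), a standard equivalent test for finite groups rather than the chain definition given above: a finite group is solvable exactly when this series reaches the trivial group. All function names are illustrative.

```python
from itertools import permutations


def compose(p, q):
    """Permutation 'p after q' on {0, ..., n-1}, both given as tuples of images."""
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return tuple(inv)


def closure(generators, n):
    """Subgroup generated by the given permutations (finite, so products suffice)."""
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def derived_subgroup(group, n):
    """Subgroup generated by all commutators g h g^-1 h^-1."""
    commutators = {
        compose(compose(g, h), compose(inverse(g), inverse(h)))
        for g in group
        for h in group
    }
    return closure(commutators, n)


def derived_series_orders(n):
    """Orders of S_n, its derived subgroup, and so on until the series stabilizes."""
    group = set(permutations(range(n)))
    orders = [len(group)]
    while True:
        nxt = derived_subgroup(group, n)
        if len(nxt) == len(group):
            return orders
        orders.append(len(nxt))
        group = nxt


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, derived_series_orders(n))
```

For $S_4$ the orders 24, 12, 4, 1 match the chain $\{1\} \subset V \subset A_4 \subset S_4$; for $S_5$ the series stalls at order 60, so $S_5$ is not solvable.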
The next proposition shows that the similarity in naming is no coincidence.
F = F0 ⊂ F1 ⊂ F2 ⊂ · · · ⊂ Fn = K
such that each field Fi+1 is either finite algebraic over Fi , or generated by an antideriva-
tive or exponential of an antiderivative of an element of Fi .
Notice that if you are thinking of all these fields as subfields of M(U ) for an ap-
propriate U , then it may be necessary to restrict to some U1 ⊂ U : if v has a pole in U
with nonzero residue, then there will not be an antiderivative of v defined on all of U ,
but there will be one on any simply connected subset of U which avoids the poles of v.
Consider the extension $F_i/F_{i-1}$; there are three possibilities for $\mathrm{DGal}(F_i/F_{i-1})$
depending on how the extension is generated:
(i) If $F_i/F_{i-1}$ is generated by a finite algebraic element, then $\mathrm{DGal}(F_i/F_{i-1})$ is
finite.
(ii) If $F_i/F_{i-1}$ is generated by an antiderivative of an element $\alpha \in F_{i-1}$, then we
can think of $F_i$ as the splitting field of the linear operator $L(u) = u' - \alpha$. In
particular, the solution we use to generate $F_i$ is defined only up to addition of a
constant in C. Since $\mathrm{DGal}(F_i/F_{i-1})$ must permute the solutions of L(u) = 0,
we see that $\mathrm{DGal}(F_i/F_{i-1}) \cong \mathbb{C}$ (since C has no proper nontrivial algebraic subgroups).
Example 7.2. Let F ⊂ M(U ), where U ⊂ C is some simply connected open set.
Parts (ii) and (iii) above show that if L is a first-order linear operator defined over
F, then E L /F is a Liouvillian extension. What can we say about second-order linear
operators? The answer requires us to think more about first-order operators.
Suppose $L(u) = u' + \alpha u$ where α ∈ F. Let $\beta \in E_L$ and let v be a solution to the
nonhomogeneous equation L(u) = β. Solving for v first requires us to multiply both
sides of the equation $v' + \alpha v = \beta$ by the integrating factor $w := \exp\left(\int \alpha\right)$. Note that
w is contained in a Liouvillian extension of F.
This multiplication transforms our equation into $(vw)' = w\beta$. Integrating and
dividing gives
\[ v = \frac{1}{w}\int w\beta; \]
in particular, v is contained in a Liouvillian extension of F since w and β are.
We now have all we need to consider a second-order linear operator $L(u) = u'' +
\alpha_1 u' + \alpha_0 u$ with $\alpha_0, \alpha_1 \in F$. Suppose v satisfies L(v) = 0 and let $K \subset E_L$ be the
smallest D-subfield which contains v and F. It need not be the case that K/F is a
Liouvillian extension. However, we can show that $E_L/K$ is a Liouvillian extension.
Let w be another, linearly independent, solution to L(w) = 0; then, up to a scalar
multiple, we have
\[ \mathrm{Wr}_L = \det\begin{pmatrix} v & w \\ v' & w' \end{pmatrix} = vw' - wv'. \]
Dividing by v, we see that w satisfies the first-order equation
\[ w' - \frac{v'}{v}\,w = \frac{\mathrm{Wr}_L}{v}, \]
where $\mathrm{Wr}_L/v$ is contained in a Liouvillian extension of K. From what we saw above,
this shows that $E_L/K$ is a Liouvillian extension. It is worth noting that if K/F is a
Liouvillian extension, then $E_L/F$ will be a Liouvillian extension as well.
The following proposition says that Liouvillian extensions are the analogs of radical
extensions.
8. SOLUTIONS OF EQUATIONS (1) AND (2). We now restate our goals in the
new language we have developed over the previous sections:
(i) The polynomial $f(x) = x^5 - 4x - 2$ is not solvable by radicals.
(ii) No solution of the differential equation $u' = t - u^2$ is contained in a Liouvillian
extension.
We begin by showing (i). Recall that a polynomial is solvable by radicals only if
the Galois group of its splitting field is a solvable group; that is, it suffices to show that
Gal(E f /Q) is not solvable.
Since f has degree five, there are five roots defined over C. The Galois group
Gal(E f /Q) permutes these roots and is thus (naturally isomorphic to) a subgroup of
S5 . If a ∈ C is such that f (a) = 0, then, since f is irreducible, the field Q(a) is an ex-
tension of Q of degree five. Since Q(a) ⊂ E f , the degree of E f /Q must be divisible by
five. By Theorem 3.5, the order of Gal(E f /Q) is divisible by five as well. By Cauchy’s
theorem, there is therefore an element of order five (or 5-cycle) in Gal(E f /Q).
Note that f(−2) < 0 < f(−1) and f(0) < 0 < f(2), so that f has real roots
a, b, c satisfying
\[ -2 < a < -1 < b < 0 < c < 2. \]
By considering derivatives of f, one sees that these are the only real roots. Thus, there
are two complex conjugate roots of f.
Now consider the action of complex conjugation on the field E f . It is certainly an
automorphism, and it fixes Q ⊂ R. It is therefore an element of Gal(E f /Q) ⊂ S5 .
We just concluded that three of the roots of f are real, and thus fixed by complex
conjugation, and that the two remaining roots are swapped. We can therefore conclude
that Gal(E f /Q) contains a 2-cycle as well.
That is all we need, however. It is a fact from elementary group theory that a 2-
cycle and a 5-cycle generate all of S5 . We conclude that Gal(E f /Q) = S5 , and f is
not solvable by radicals.
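The two concrete facts about $f(x) = x^5 - 4x - 2$ used in this argument can be checked mechanically. In the sketch below, irreducibility is tested with Eisenstein's criterion at the prime 2 (our choice of method; the text only asserts that f is irreducible), and the real roots are counted by looking for sign changes on a fine rational grid, which is safe here because f has no rational roots.

```python
from fractions import Fraction

# Coefficients of f(x) = x^5 - 4x - 2, constant term first.
COEFFS = [-2, -4, 0, 0, 0, 1]


def f(x):
    return sum(c * x**i for i, c in enumerate(COEFFS))


def eisenstein(coeffs, p):
    """Eisenstein's criterion: p divides every non-leading coefficient,
    p does not divide the leading one, and p^2 does not divide the constant term."""
    *lower, leading = coeffs
    return (
        leading % p != 0
        and all(c % p == 0 for c in lower)
        and lower[0] % (p * p) != 0
    )


def sign_changes(lo, hi, steps):
    """Count sign changes of f on an evenly spaced rational grid in [lo, hi]."""
    xs = [Fraction(lo) + Fraction(hi - lo, steps) * i for i in range(steps + 1)]
    values = [f(x) for x in xs]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)


if __name__ == "__main__":
    print("Eisenstein at p = 2:", eisenstein(COEFFS, 2))
    print("f(-2), f(-1), f(0), f(2) =", f(-2), f(-1), f(0), f(2))
    print("sign changes on [-3, 3]:", sign_changes(-3, 3, 600))
```

Three sign changes correspond to the three real roots, leaving one pair of complex conjugate roots, which is exactly what produces the 2-cycle.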
We now proceed to our second goal. Up to this point, we have considered only
linear differential operators and their solutions. In light of this, our results so far would
seem to have little bearing on the solutions to the nonlinear equation $u' = t - u^2$.
However, there is still hope: recall the Airy differential operator
\[ L_A(u) = u'' - tu \]
from Example 5.7. It is certainly linear, and, since the function t has no poles, the
splitting field $E_{L_A}$ is a subfield of the meromorphic functions on C, M(C).
\[ w' = \frac{v \cdot v'' - [v']^2}{v^2}
      = \frac{t \cdot v^2 - [v']^2}{v^2}, \quad \text{since } v'' - tv = 0 \text{ by assumption,} \]
\[ \phantom{w'} = t - w^2; \]
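The computation above, with w = v′/v the logarithmic derivative of a solution v of the Airy equation, can be checked numerically. The following is only a sketch: it builds the solution with v(0) = 1, v′(0) = 0 from its power series (the recurrence $c_{n+3} = c_n/((n+3)(n+2))$ follows from $v'' = tv$), and compares a finite-difference derivative of w with $t - w^2$. The function names and the choice of initial conditions are ours.

```python
def airy_series(t, terms=40):
    """Return (v(t), v'(t)) for the solution of v'' = t v with v(0) = 1, v'(0) = 0."""
    coeffs = [1.0, 0.0, 0.0]
    for n in range(terms):
        # Matching powers of t in v'' = t v gives (n+3)(n+2) c_{n+3} = c_n.
        coeffs.append(coeffs[n] / ((n + 3) * (n + 2)))
    v = sum(c * t**n for n, c in enumerate(coeffs))
    dv = sum(n * c * t ** (n - 1) for n, c in enumerate(coeffs) if n >= 1)
    return v, dv


def w(t):
    """Logarithmic derivative v'/v."""
    v, dv = airy_series(t)
    return dv / v


def riccati_residual(t, h=1e-5):
    """w'(t) - (t - w(t)^2), with w' approximated by a central difference."""
    dw = (w(t + h) - w(t - h)) / (2.0 * h)
    return dw - (t - w(t) ** 2)


if __name__ == "__main__":
    for t in (-1.0, 0.3, 0.7):
        print(t, riccati_residual(t))
```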
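Proof. In Example 5.7, we found that $G \subset \mathrm{SL}_2(\mathbb{C})$. To prove equality, let $G^0$ be the
connected component of the identity. Then $G/G^0$ is finite by Theorem 4.4.
Since $\mathrm{SL}_2(\mathbb{C})$ is 3-dimensional, there are very few proper connected subgroups. In
particular, they are all conjugate to one of the following four:
(i) $\left\{ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right\}$,
(ii) $\left\{ \begin{pmatrix} a & 0 \\ 0 & 1/a \end{pmatrix} : a \in \mathbb{C}^* \right\}$,
(iii) $\left\{ \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} : b \in \mathbb{C} \right\}$, or
(iv) $\left\{ \begin{pmatrix} a & b \\ 0 & 1/a \end{pmatrix} : a \in \mathbb{C}^*,\ b \in \mathbb{C} \right\}$.
Note that for each of these groups, $(1\ 0)^{T}$ is a common eigenvector. Thus, if $G^0$
were a proper subgroup of $\mathrm{SL}_2(\mathbb{C})$, then all elements of $G^0$ would have a common
eigenvector $v \in E_{L_A}$ which satisfies $L_A(v) = 0$.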
Since differentiation commutes with the action of G on $E_{L_A}$, we then see that w =
v′/v is left fixed by $G^0$. Thus, the Picard-Vessiot extension generated by w is a
D-subfield, M say, of $E_{L_A}$ and $G^0 \subset \mathrm{DGal}(E_{L_A}/M)$ (because $G^0$ fixes w). In particular,
we can apply the Fundamental Theorem of Differential Galois Theory (Theorem 4.9)
so that $\mathrm{DGal}(M/\mathbb{C}(t)) \cong G/\mathrm{DGal}(E_{L_A}/M)$ is a quotient of G by a group containing
$G^0$; hence, $\mathrm{DGal}(M/\mathbb{C}(t))$ is finite, so that w is an algebraic function by Proposition
4.6 (and so has finitely many poles).
However, as we saw above, $w' = t - w^2$ since w is the logarithmic derivative of the
solution v of $L_A(u) = 0$. For any number $t_0 < -1 - \pi/2$, the solution w with $w(t_0) =
0$ has domain of definition (a, b) with $t_0 - \pi/2 < a < b < t_0 + \pi/2$, since the solution
is above $\tan(t + t_0)$ for $t > t_0$ and beneath $\tan(t + t_0)$ for $t < t_0$ (see Figure 1). Thus,
w has at least as many poles as tan(t), which has infinitely many poles. Hence w is not
algebraic and our guess that $G^0 \neq \mathrm{SL}_2(\mathbb{C})$ is false.
Figure 1. Solutions to $w'(t) = t - w^2$.
Corollary 8.2. No nonzero solution of the Airy equation belongs to a Liouvillian ex-
tension of C(t).
Proof. By Proposition 7.4, if one nonzero solution (and hence all solutions by Ex-
ample 7.2) of the Airy equation belonged to a Liouvillian extension, then G 0 =
SL2 (C) would be solvable. Since solvability is preserved by quotients (this is an
exercise using the definition and the isomorphism theorems for groups), we would
also have that PSL2 (C) = SL2 (C)/{± Id} would be solvable. By Theorem 8.4 of [6],
PSL2 (C) is simple. Thus, PSL2 (C) is solvable only if it is abelian. One can easily
check that this is not the case.
REFERENCES
1. D. Dummit and R. Foote, Abstract Algebra, 3rd ed., John Wiley, Hoboken, NJ, 2004.
2. J. Hubbard and B. West, Differential Equations: A Dynamical Systems Approach, Texts in Applied Math-
ematics, vol. 5, Springer-Verlag, New York, 1995; corrected reprint of the 1991 edition.
3. J. Humphreys, Linear Algebraic Groups, Graduate Texts in Mathematics, vol. 21, Springer-Verlag, New
York, 1975.
4. E. Kolchin, Algebraic matric groups and the Picard-Vessiot theory of homogeneous linear ordinary dif-
ferential equations, Ann. of Math. (2) 49 (1948) 1–42. doi:10.2307/1969111
5. E. Kolchin, Differential Algebra and Algebraic Groups, Pure and Applied Mathematics, vol. 54, Academic
Press, New York, 1973.
6. S. Lang, Algebra, 3rd ed., Graduate Texts in Mathematics, vol. 211, Springer-Verlag, New York, 2002.
7. A. Magid, Lectures on Differential Galois Theory, University Lecture Series, vol. 7, American Mathe-
matical Society, Providence, RI, 1994.
8. A. Magid, Differential Galois theory, Notices Amer. Math. Soc. 46 (1999) 1041–1049.
JOHN H. HUBBARD received his undergraduate degree from Harvard University and his doctorate from the
Université de Paris-Sud. He is currently a professor at Cornell University and the Université de Provence. His
research interests lie in differential equations and complex dynamics.
Department of Mathematics, Cornell University, Ithaca, NY 14850
jhh8@cornell.edu
BENJAMIN E. LUNDELL received his undergraduate degree from the University of Illinois at Urbana-
Champaign and a Certificate of Advanced Study in Mathematics from Cambridge University. He is currently
a doctoral candidate at Cornell University studying number theory and arithmetic geometry.
Department of Mathematics, Cornell University, Ithaca, NY, 14850
blundell@math.cornell.edu
Abstract. Combining results presented in two papers in this Monthly yields the following
elementary result. Any line of best fit for the zeros of a polynomial is a line of best fit for its
critical points.
This note gives a generalization of results on cubic polynomials presented in [1]. Our
notation will follow that paper. A line of best fit for a set of points in the plane is
defined, as in [1, p. 682], to be a line that minimizes the sum of squares of the per-
pendicular distances from the points to the line. (Sometimes, elsewhere, such a line is
called a “least-squares perpendicular-offsets” line.)
Let $\{z_j \mid 1 \le j \le n\}$ be a set of n ≥ 2 complex numbers. Define
\[ z_A = \frac{1}{n}\sum_{j=1}^{n} z_j \quad\text{and}\quad Z = \sum_{j=1}^{n} (z_j - z_A)^2. \tag{1} \]
Theorem 2 (Coolidge, n = 3). Suppose the complex numbers $z_1, z_2, z_3$ form the
vertices of a triangle which is nonequilateral. If $p(z) = \prod_{j=1}^{3}(z - z_j)$, then the unique
line of best fit for the three numbers is the line through the roots of $p'(z)$.
The next ingredient in the proof is taken from [2]. (See also Newton's identities.)
First note that $a_{n-1} = -\sum_{j=1}^{n} z_j = -n z_A$. Now suppose, as in [2], that $z_A = 0$: there is
no loss of generality in this translation of the points of the complex plane. Then $a_{n-1} =
0$. (It also follows that the coefficient of $z^{n-2}$ in $p'(z)$ is zero, so $(n-1)z'_A =
\sum_{k=1}^{n-1} z'_k = 0$.) Squaring the equation $\sum_{j=1}^{n} z_j = 0$, we find that
\[ Z = \sum_{j=1}^{n} z_j^2 = -2\sum_{1 \le j < k \le n} z_j z_k = -2a_{n-2}. \]
Theorem 4 ([2, equation (1.9)]). Denote the zeros of the polynomials p and p′ as
above, and suppose that their centroids are located at the origin. Then
\[ Z' = \sum_{k=1}^{n-1} (z'_k)^2 = \frac{n-2}{n}\sum_{j=1}^{n} z_j^2 = \frac{n-2}{n}\,Z. \tag{2} \]
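Identity (2) is easy to test numerically in the cubic case. The sketch below (names and sample zeros are ours) takes three zeros, finds the two critical points from the quadratic formula applied to $p'(z) = 3z^2 - 2s_1 z + s_2$, and compares the two sums of squares taken about the common centroid. Since Z′ is a positive multiple of Z, the two quantities have the same argument, which is what ties together the best-fit directions for zeros and critical points.

```python
import cmath


def centroid(points):
    return sum(points) / len(points)


def sum_of_squares(points):
    """Z = sum of (z_j - z_A)^2, with z_A the centroid of the points."""
    z_a = centroid(points)
    return sum((z - z_a) ** 2 for z in points)


def critical_points_of_cubic(z1, z2, z3):
    """Zeros of p'(z) for p(z) = (z - z1)(z - z2)(z - z3)."""
    s1 = z1 + z2 + z3
    s2 = z1 * z2 + z1 * z3 + z2 * z3
    root = cmath.sqrt(s1 * s1 - 3 * s2)
    return [(s1 + root) / 3, (s1 - root) / 3]


if __name__ == "__main__":
    zeros = [0j, 4 + 0j, 1 + 3j]
    crit = critical_points_of_cubic(*zeros)
    n = len(zeros)
    big_z = sum_of_squares(zeros)
    big_z_prime = sum_of_squares(crit)
    print("centroids:", centroid(zeros), centroid(crit))
    print("Z' =", big_z_prime, " (n-2)/n * Z =", (n - 2) / n * big_z)
```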
Proof. Using $z_A = 0$,
\[ \frac{p'}{n} = z^{n-1} + \sum_{j=1}^{n-2} \frac{j a_j}{n}\,z^{j-1}, \]
REFERENCES
1. D. Minda and S. Phelps, Triangles, ellipses, and cubic polynomials, Amer. Math. Monthly 115 (2008)
679–689.
2. I. J. Schoenberg, A conjectured analogue of Rolle’s theorem for polynomials with real or complex coeffi-
cients, Amer. Math. Monthly 93 (1986) 8–13. doi:10.2307/2322536
School of Mathematics and Statistics, University of Western Australia, Nedlands 6009, Western Australia,
and Department of Mathematics and Statistics, Curtin University, Bentley 6102, Western Australia
Grant.Keady@graduate.uwa.edu.au
Abstract. A matchstick graph is a plane geometric graph in which every edge has length 1 and
no two edges cross each other. It was conjectured that no 5-regular matchstick graph exists. In
this paper we prove this conjecture.
Proof. Suppose to the contrary that there is such a graph M which we consider also
as a planar map, that is, a crossing-free embedding of a planar graph in the plane. This
drawing consists of vertices, edges, and faces. Without loss of generality we assume
that this graph is connected and denote by V the number of its vertices, by E the
number of its edges, and by F the number of faces in the planar map M. By Euler’s
formula we have V − E + F = 2. For every k ≥ 3 we denote by $f_k$ the number of
faces in M with precisely k edges.
We observe that $2E = \sum k f_k = 5V$ and $F = \sum f_k$. Therefore
\[ -6 = -3V + E + 2E - 3F = -3V + \frac{5}{2}V + \sum k f_k - 3\sum f_k
      = -\frac{1}{2}V + \sum (k - 3) f_k. \tag{1} \]
The idea of the proof is to assign a charge to every vertex and every face in M so
that the total charge is negative. Then we will redistribute the charges according to a
simple local rule and reach a contradiction by showing that the charge of each vertex
and each face becomes nonnegative.
We begin by giving a charge of $-\frac{1}{2}$ to each vertex and by giving a charge of k − 3
to each face in M with precisely k edges. By (1) the total charge of all the vertices and
faces is negative.
We redistribute the charge in the following very simple way. Consider a face T of
M and a vertex x of T. Let α denote the measure of the internal angle of T at x. If
$\alpha > \frac{\pi}{3}$ we take a charge of $\min\left\{\frac{1}{2}, \frac{3}{2\pi}\alpha - \frac{1}{2}\right\}$ from T and move it to x (see Figure 2).
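The redistribution rule is a simple piecewise function of the angle. The sketch below encodes it as read from the text, assuming the garbled coefficient is $3/(2\pi)$, so that the transferred charge grows continuously from 0 at α = π/3 and is capped at 1/2 from α = 2π/3 onward; the function name is illustrative.

```python
import math


def transferred_charge(alpha):
    """Charge moved from a face to a vertex whose internal angle in that face is alpha."""
    if alpha <= math.pi / 3:
        return 0.0  # no charge moves for angles of at most pi/3
    return min(0.5, 3.0 * alpha / (2.0 * math.pi) - 0.5)


if __name__ == "__main__":
    for degrees in (60, 75, 90, 120, 150):
        alpha = math.radians(degrees)
        print(degrees, round(transferred_charge(alpha), 4))
```

In particular an equilateral triangular face, whose charge is 3 − 3 = 0, gives nothing away, since all of its angles equal π/3.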
We now show that after the redistribution of charges every vertex and every face has
a nonnegative charge. Consider a vertex x. Let ℓ denote the number of internal angles
at x that are greater than $\frac{\pi}{3}$. As the degree of x is equal to 5 we must have ℓ > 0. If
due to one of these ℓ angles we transferred a charge of $\frac{1}{2}$ to x, then the charge at x is
REFERENCES
1. A. Blokhuis, Regular finite planar maps with equal edges (2009), available at http://www.wm.
uni-bayreuth.de/fileadmin/Sascha/aartb.pdf.
2. H. Harborth, Match sticks in the plane, in The Lighter Side of Mathematics, R. K. Guy and R. E. Woodrow,
eds., MAA Spectrum, Washington, DC, 1994, 281–288.
3. N. Hartsfield and G. Ringel, Pearls in Graph Theory. A Comprehensive Introduction, Academic Press,
Boston, 1990.
4. S. Kurz, Fast recognition of planar non-unit distance graphs—searching the minimum 4-regular planar
unit distance graph (preprint).
Abstract. We have used Euler–Maclaurin summation to develop a recursive scheme for mod-
ifying the original approximation for the Euler–Mascheroni constant γ . Convergence to γ re-
sulting from successively employing the proposed scheme has been significantly accelerated
while the form of the approximation originally introduced by Euler is still preserved.
The Euler constant (also known as the Euler–Mascheroni constant) γ was introduced
by Euler [6] in 1734 (the paper [6], originally presented to the St. Petersburg Academy
of Sciences in 1734, was first published in 1740) as the limit of this difference (see
also [7]):
The Euler–Mascheroni constant is usually defined not by (3) but by a slightly
modified formula (see, e.g., [1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 13, 14, 17])
\[ \gamma = \lim_{n\to\infty} \gamma_n^{(E')} = \lim_{n\to\infty} (s_n - \ln n) \tag{4} \]
\[ \frac{1}{2(n+1)} < \gamma_n^{(E')} - \gamma < \frac{1}{2n} \tag{6} \]
doi:10.4169/amer.math.monthly.118.03.268
\[ \gamma_n^{(D)} = s_n - \ln\left(n + \frac{1}{2}\right) \tag{7} \]
significantly speeds up the rate of convergence to γ since
\[ \frac{1}{24(n+1)^2} < \gamma_n^{(D)} - \gamma < \frac{1}{24n^2}. \tag{8} \]
Accuracy estimates of the approximations $\gamma_n^{(E')}$ and $\gamma_n^{(D)}$ are also given in [1, 4, 8,
11, 13, 16]. Although some of them slightly differ from the aforementioned bounds
determined by Young and DeTemple, all the authors agree that the errors contributed
by (5) and (7) are equal to $\frac{1}{2n} + O\left(\frac{1}{n^2}\right)$ and $\frac{1}{24n^2} + O\left(\frac{1}{n^3}\right)$ as indicated by (6) and (8),
respectively.
Negoi [10] has adopted DeTemple's approach and further accelerated convergence
to γ by introducing
\[ \gamma_n^{(N)} = s_n - \ln\left(n + \frac{1}{2} + \frac{1}{24n}\right) \tag{9} \]
in place of $\gamma_n^{(E')}$ in (4) and proving that
\[ -\frac{1}{48n^3} < \gamma_n^{(N)} - \gamma < -\frac{1}{48(n+1)^3}. \tag{10} \]
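The three approximations and their error bounds (6), (8), and (10) are easy to compare in floating point. The following is a minimal sketch; the reference value of γ is hard-coded to double precision, $s_n$ is taken to be the nth harmonic number, and the function names are ours.

```python
import math

GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, rounded to double precision


def harmonic(n):
    """s_n = 1 + 1/2 + ... + 1/n."""
    return math.fsum(1.0 / j for j in range(1, n + 1))


def gamma_e_prime(n):
    return harmonic(n) - math.log(n)


def gamma_detemple(n):
    return harmonic(n) - math.log(n + 0.5)


def gamma_negoi(n):
    return harmonic(n) - math.log(n + 0.5 + 1.0 / (24.0 * n))


if __name__ == "__main__":
    for n in (5, 10, 50):
        print(
            n,
            gamma_e_prime(n) - GAMMA,
            gamma_detemple(n) - GAMMA,
            gamma_negoi(n) - GAMMA,
        )
```

At n = 10, for instance, the three errors are roughly 4.9e-2, 3.8e-4, and −1.9e-5, in line with the orders 1/(2n), 1/(24n²), and −1/(48n³).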
The following key questions naturally arise at this point:
• What is the origin of the approximations $\gamma_n^{(D)}$ and $\gamma_n^{(N)}$ that have such a profound
effect on the rate of convergence despite seemingly negligible modifications of (5)?
• How can we further speed up the rate of convergence to γ ?
Our objective is to address these problems.
\[ \ln\left(1 + \frac{1}{2n}\right) = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k(2n)^k} = \frac{1}{2n} - \frac{1}{8n^2} + \frac{1}{24n^3} - \cdots \tag{13} \]
Its accuracy $\gamma_n^{(1)} - \gamma = \frac{1}{24n^2} + O\left(\frac{1}{n^3}\right)$, explicitly determined by (14), is higher than
that of $\gamma_n^{(E')}$ given by (6) and can be easily further increased by applying the
Newton–Mercator series again and eliminating the term $\frac{1}{24n^2}$ from (14) in the same way as $\frac{1}{2n}$
has been eliminated from (12). This process can be continued until the desired kth
approximation is obtained.
The kth approximation contributes an error
\[ \gamma_n^{(k)} - \gamma = a_{2k}^{(k)} n^{-2k} + a_{2k+1}^{(k)} n^{-(2k+1)} + a_{2k+2}^{(k)} n^{-(2k+2)} + \cdots \tag{16} \]
depending on n and constant coefficients $a_{2k}^{(k)}, a_{2k+1}^{(k)}, a_{2k+2}^{(k)}, \ldots$. Note that −2k, not
−(k + 1), is the highest exponent of n for k ≥ 2 in (16). This choice seems to be
natural as there are no odd exponents of n (except for the first term) in the original
infinite expansion (11–12). It means in practice that for k ≥ 2 each subsequent
approximation $\gamma_n^{(k)}$ eliminates not one but two successive terms of the series still left on the
right-hand side of the formula for $\gamma_n^{(k-1)} - \gamma$.
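Here are the three subsequent approximations directly following $\gamma_n^{(1)}$ (15):
\[ \gamma_n^{(2)} = s_n - \ln\left[n\left(1 + \frac{1}{2n}\right)\left(1 + \frac{1}{24n^2}\right)\left(1 - \frac{1}{24n^3}\right)\right] \tag{17} \]
\[ \gamma_n^{(3)} = s_n - \ln\left[n\left(1 + \frac{1}{2n}\right)\left(1 + \frac{1}{24n^2}\right)\left(1 - \frac{1}{24n^3}\right)\left(1 + \frac{143}{5760n^4}\right)\left(1 - \frac{1}{160n^5}\right)\right] \tag{18} \]
\[ \gamma_n^{(4)} = s_n - \ln\left[n\left(1 + \frac{1}{2n}\right)\left(1 + \frac{1}{24n^2}\right)\left(1 - \frac{1}{24n^3}\right)\left(1 + \frac{143}{5760n^4}\right)\left(1 - \frac{1}{160n^5}\right)\left(1 - \frac{151}{290304n^6}\right)\left(1 - \frac{1}{896n^7}\right)\right] \tag{19} \]
and the respective errors they contribute:
\[ \gamma_n^{(2)} - \gamma = \frac{143}{5760n^4} - \frac{1}{160n^5} - \frac{151}{290304n^6} - \frac{1}{896n^7} + \frac{30893}{6635520n^8} + O\left(\frac{1}{n^9}\right) \tag{20} \]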
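To see the acceleration in practice, the product forms (15) and (17)–(19) can be evaluated directly. This is a sketch in ordinary double precision, with γ hard-coded and illustrative names; the correction factors are those appearing in (17)–(19), stored as (coefficient, exponent) pairs.

```python
import math

GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, rounded to double precision

# Factors (1 + c / n**e) appearing in (17)-(19), in order of appearance.
FACTORS = [
    (1 / 24, 2), (-1 / 24, 3),           # present from k = 2 on
    (143 / 5760, 4), (-1 / 160, 5),      # present from k = 3 on
    (-151 / 290304, 6), (-1 / 896, 7),   # present for k = 4
]


def harmonic(n):
    return math.fsum(1.0 / j for j in range(1, n + 1))


def gamma_k(n, k):
    """k-th approximation s_n - ln f(n), for k = 1, ..., 4."""
    f = n * (1.0 + 1.0 / (2.0 * n))
    for coeff, power in FACTORS[: 2 * (k - 1)]:
        f *= 1.0 + coeff / n**power
    return harmonic(n) - math.log(f)


if __name__ == "__main__":
    n = 10
    for k in range(1, 5):
        print(k, gamma_k(n, k) - GAMMA)
```

At n = 10 the leading term of (20) predicts an error of about 143/(5760·10⁴) ≈ 2.5e-6 for k = 2, and each further step shrinks the error by several orders of magnitude.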
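\[ a_l^{(1)} = \frac{(-1)^l - 2^l B_l}{l\,2^l} \tag{23} \]
for l ≥ 2 and
\[ a_l^{(k+1)} = a_l^{(k)} - \sum_{\substack{m \in \{2k,\,2k+1\} \\ l \equiv 0 \pmod{m}}} \frac{(-1)^{\frac{l}{m}+1}\left[a_m^{(k)}\right]^{\frac{l}{m}}}{l/m} \tag{24} \]
for k ≥ 1 and l ≥ 2(k + 1), where $B_l$ are the Bernoulli numbers with $B_3 = B_5 = B_7 =
\cdots = 0$.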
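Then we have
\[ \gamma_n^{(1)} - \gamma = s_n - \ln\left[n\left(1 + \frac{1}{2n}\right)\right] - \gamma = \sum_{l=2}^{\infty} \frac{a_l^{(1)}}{n^l} \tag{25} \]
and
\[ \gamma_n^{(k)} - \gamma = s_n - \ln\left\{n\left(1 + \frac{1}{2n}\right)\prod_{l=1}^{k-1}\left[\left(1 + \frac{a_{2l}^{(l)}}{n^{2l}}\right)\left(1 + \frac{a_{2l+1}^{(l)}}{n^{2l+1}}\right)\right]\right\} - \gamma = \sum_{l=2k}^{\infty} \frac{a_l^{(k)}}{n^l} \tag{26} \]
for k ≥ 2.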
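The recursion (23)–(24) can be run in exact rational arithmetic, which reproduces the coefficients appearing in (17)–(20). The sketch below computes the Bernoulli numbers from their standard recurrence; `bernoulli` and `a` are our own helper names, with `a(k, l)` standing for $a_l^{(k)}$.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def bernoulli(m):
    """Bernoulli number B_m via sum_{j<=m} C(m+1, j) B_j = 0 (so B_1 = -1/2)."""
    if m == 0:
        return Fraction(1)
    return -sum(comb(m + 1, j) * bernoulli(j) for j in range(m)) / (m + 1)


@lru_cache(maxsize=None)
def a(k, l):
    """Coefficient a_l^(k) of n^(-l) in the error of the k-th approximation."""
    if k == 1:
        return (Fraction((-1) ** l) - 2**l * bernoulli(l)) / (l * 2**l)  # (23)
    value = a(k - 1, l)
    for m in (2 * (k - 1), 2 * (k - 1) + 1):  # the two terms removed at this step
        if l % m == 0:
            j = l // m
            value -= Fraction((-1) ** (j + 1)) * a(k - 1, m) ** j / j  # (24)
    return value


if __name__ == "__main__":
    print("a_2^(1), a_3^(1):", a(1, 2), a(1, 3))
    print("a_4^(2), a_5^(2):", a(2, 4), a(2, 5))
    print("a_6^(3), a_7^(3):", a(3, 6), a(3, 7))
    print("a_8^(2):", a(2, 8))
```

The printed values are the correction coefficients 1/24, −1/24, 143/5760, −1/160, −151/290304, −1/896 used in (17)–(19), together with the coefficient 30893/6635520 of $n^{-8}$ in (20).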
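\[ \gamma_n^{(k)} - \gamma = s_n - \ln\left\{n\left(1 + \frac{1}{2n}\right)\prod_{l=1}^{k-1}\left[\left(1 + \frac{a_{2l}^{(l)}}{n^{2l}}\right)\left(1 + \frac{a_{2l+1}^{(l)}}{n^{2l+1}}\right)\right]\right\} - \gamma
   = \frac{a_{2k}^{(k)}}{n^{2k}} + O\left(\frac{1}{n^{2k+1}}\right). \tag{27} \]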
consisting of k 2 entries.
\[ \delta_n = \frac{\gamma_n^{(k)} - \gamma}{\gamma} \times 100\% \tag{28} \]
for all the derived approximations (15) and (17–19). Performance of the original
approximations (2) and (5), and Negoi's (9), is also illustrated for comparison.
Figure 1. Comparison of the rates of convergence to γ for various approximations. [Plot of $|\delta_n|$ [%] on a logarithmic scale against n, 5 ≤ n ≤ 35, for $\gamma_n^{(E')}$ (5), $\gamma_n^{(E)}$ (2), $\gamma_n^{(1)}$ (15), $\gamma_n^{(N)}$ (9), $\gamma_n^{(2)}$ (17), $\gamma_n^{(3)}$ (18), and $\gamma_n^{(4)}$ (19).]
The proposed scheme not only works well but also nicely explains the origin of
DeTemple's and Negoi's improvements of the original formula for γ. Note that
DeTemple's approximation $\gamma_n^{(D)}$ is equivalent to $\gamma_n^{(1)}$ whereas Negoi's approximation $\gamma_n^{(N)}$
is a simplified version of $\gamma_n^{(2)}$ which can be expressed as
\[ \gamma_n^{(2)} = s_n - \ln\left(n + \frac{1}{2} + \frac{1}{24n} - \frac{1}{48n^2} - \frac{1}{48n^3} - \frac{1}{576n^4} - \frac{1}{1152n^5}\right). \tag{29} \]
Negoi has considered in his formula (9) only the first three terms of the logarithm
given in (29). The orders of the original approximation (6) and DeTemple's
approximation (8) are also consistent with those given by Euler–Maclaurin summation, (12)
and (14), respectively.
In this paper we have analyzed only the first four approximations resulting from
employing the proposed algorithm. The rate of convergence to γ can be improved
even further if one keeps applying our scheme (27) until the desired approximation
accuracy is obtained.
29,844,489,545 decimal digits of the Euler–Mascheroni constant γ are currently
known. They were computed for $n = 2^{33}$ in March 2009 by Yee and Chan [15]
using the algorithm of Brent and McMillan [3] which is based on the modified Bessel
functions and avoids the need for computation of the Bernoulli numbers required for
Euler–Maclaurin summation. The error of this algorithm is $O(e^{-8n})$ [15].
Our approach, which still employs Euler–Maclaurin summation, is less
computationally efficient than the Sweeney [12] or Brent–McMillan [3] methods. The nice
feature of the approximations $\gamma_n^{(k)}$, however, is that they preserve the form $\gamma_n^{(k)} =
s_n - \ln[f(n)]$ introduced by Euler (2) with only the original function f(n) = n + 1
being modified in order to improve the rate of convergence to γ.
ACKNOWLEDGMENTS. The author wishes to thank the anonymous reviewer for helpful comments.
REFERENCES
1. H. Alzer, Inequalities for the gamma and polygamma functions, Abh. Math. Sem. Univ. Hamburg 68
(1998) 363–372. doi:10.1007/BF02942573
2. T. M. Apostol, An elementary view of Euler’s summation formula, Amer. Math. Monthly 106 (1999)
409–418. doi:10.2307/2589145
3. R. P. Brent and E. M. McMillan, Some new algorithms for high-precision computation of Euler’s con-
stant, Math. Comp. 34 (1980) 305–312.
4. C. -P. Chen, Inequalities for the Euler–Mascheroni constant, Appl. Math. Lett. 23 (2010) 161–164. doi:
10.1016/j.aml.2009.09.005
5. D. W. DeTemple, A quicker convergence to Euler’s constant, Amer. Math. Monthly 100 (1993) 468–470.
doi:10.2307/2324300
6. L. Euler, De progressionibus harmonicis observationes, Commentarii academiae scientiarum imperialis
Petropolitanae 7 (1740) 150–161; also available at http://www.math.dartmouth.edu/~euler.
7. J. Havil, Gamma: Exploring Euler’s Constant, Princeton University Press, Princeton, 2003.
8. E. A. Karatsuba, On the computation of the Euler constant γ , Numer. Algorithms 24 (2000) 83–97. doi:
10.1023/A:1019137125281
9. D. E. Knuth, Euler’s constant to 1271 places, Math. Comp. 16 (1962) 275–281.
10. T. Negoi, A faster convergence to Euler’s constant, Math. Gaz. 83 (1999) 487–489. doi:10.2307/
3620963
11. A. Sintamarian, A generalization of Euler’s constant, Numer. Algorithms 46 (2007) 141–151. doi:10.
1007/s11075-007-9132-0
Department of Computer Science, Illinois Institute of Technology, 10 West 31st St., Chicago, IL 60616
chlebus@iit.edu
[Figure: the construction, with labeled points X, U, P, C, O, T, V, F′, F and labels u, t, k.]
To check the correctness of the construction, we use the fact that the points $X$ of
the right branch of $H$ are characterized by the property $d(F', X) - d(F, X) = 2a$,
where $d$ denotes distance. According to our construction we have $d(F', U) = d(F', V)$,
$d(X, U) = d(X, T)$, and $d(F, V) = d(F, T)$. Thus we get
$$d(F', X) - d(F, X) = d(F', U) + d(X, U) - \bigl(d(F, T) + d(X, T)\bigr)
= d(F', V) - d(F, V) = (e + a) - (e - a) = 2a.$$
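The focal-distance characterization used above can be checked numerically. The following sketch rests on assumptions that are not in the text: the hyperbola is placed in standard position $x^2/a^2 - y^2/b^2 = 1$, the foci are $F' = (-e, 0)$ and $F = (e, 0)$ with $e = \sqrt{a^2 + b^2}$ (the reading of $e$ suggested by $d(F', V) = e + a$), and the right branch is parametrized by $(a\cosh t, b\sinh t)$. The function name `focal_difference` is hypothetical.

```python
import math

def focal_difference(a, b, t):
    """Return d(F', X) - d(F, X) for X = (a cosh t, b sinh t) on the right branch."""
    e = math.hypot(a, b)                  # assumed linear eccentricity
    x, y = a * math.cosh(t), b * math.sinh(t)
    d_far = math.hypot(x + e, y)          # distance to F' = (-e, 0)
    d_near = math.hypot(x - e, y)         # distance to F  = ( e, 0)
    return d_far - d_near

a, b = 3.0, 4.0
diffs = [focal_difference(a, b, t) for t in (-1.5, -0.3, 0.0, 0.7, 2.0)]
print(diffs)  # each entry should be close to 2a
```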
PROBLEMS
11558. Proposed by Andrew McFarland, Płock, Poland. Given four concentric circles,
find a necessary and sufficient condition that there be a rectangle with one corner on
each circle.
11559. Proposed by Michel Bataille, Rouen, France. For positive $p$ and $x \in (0, 1)$,
define the sequence $\langle x_n \rangle$ by $x_0 = 1$, $x_1 = x$, and, for $n \ge 1$,
$$x_{n+1} = \frac{p\,x_{n-1} x_n + (1 - p)\,x_n^2}{(1 + p)\,x_{n-1} - p\,x_n}.$$
Find positive real numbers $\alpha$, $\beta$ such that $\lim_{n\to\infty} n^{\alpha} x_n = \beta$.
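For readers who wish to experiment with the recurrence before attempting the problem, here is a minimal sketch that iterates it in exact rational arithmetic. It is an exploratory aid, not a solution; the function name `iterate` and the sample values $p = 1$, $x = 1/2$ are assumptions chosen for illustration.

```python
from fractions import Fraction

def iterate(p, x, steps):
    """Return [x_0, x_1, ..., x_{steps+1}] for the recurrence in Problem 11559."""
    seq = [Fraction(1), Fraction(x)]
    for _ in range(steps):
        prev, cur = seq[-2], seq[-1]
        num = p * prev * cur + (1 - p) * cur * cur
        den = (1 + p) * prev - p * cur
        seq.append(num / den)
    return seq

# Sample run with p = 1 and x = 1/2 (illustrative values only).
seq = iterate(Fraction(1), Fraction(1, 2), 8)
print(seq)
```

In this special case the recurrence reduces to $1/x_{n+1} = 2/x_n - 1/x_{n-1}$, so the terms are exactly $1/(n+1)$, which gives a convenient sanity check on the implementation.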
11560. Proposed by Gregory Galperin, Eastern Illinois University, Charleston, IL, and
Yury Ionin, Central Michigan University, Mount Pleasant, MI.
(a) The diagonals of a convex pentagon $P_0 P_1 P_2 P_3 P_4$ divide it into 11 regions, of which
10 are triangular. Of these 10, five have two vertices on the diagonal $P_0 P_2$. Prove that if
each of these has rational area, then the other five triangles, and the original pentagon,
all have rational areas.
(b) Let $P_0, P_1, \ldots, P_{n-1}$, $n \ge 5$, be points in the plane. Suppose no three are collinear,
and, interpreting indices on $P_k$ as periodic modulo $n$, suppose that for all $k$, $P_{k-1} P_{k+1}$
is not parallel to $P_k P_{k+2}$. Let $Q_k$ be the intersection of $P_{k-1} P_{k+1}$ with $P_k P_{k+2}$. Let $\alpha_k$
be the area of triangle $P_k Q_k P_{k+1}$, and let $\beta_k$ be the area of triangle $P_{k+1} Q_k Q_{k+1}$. For
$0 \le j \le 2n - 1$, let
$$\gamma_j = \begin{cases} \alpha_{j/2}, & \text{if } j \text{ is even;} \\ \beta_{(j-1)/2}, & \text{if } j \text{ is odd.} \end{cases}$$
Interpreting indices on $\gamma_j$ as periodic modulo $2n$, find the least $m$ such that if $m$
consecutive $\gamma_j$ are rational, then all are rational.
doi:10.4169/amer.math.monthly.118.03.275
$$\sum_{k=1}^{n} \int_0^1 f_k^2(x)\,dx \ge \left(\sum_{k=1}^{n} \int_0^1 f_k(x)\,dx\right)^{2}, \quad\text{and}$$
$$\sum_{k=1}^{n} \frac{\int_0^1 f_k^2(x)\,dx}{\left(\int_0^1 f_k(x)\,dx\right)^{2}} \ge n^2.$$
11562. Proposed by Pál Péter Dályay, Szeged, Hungary. For positive $a$, $b$, $c$, and $z$,
let $\Psi_{a,b,c}(z) = \Gamma((za + b + c)/(z + 2))$, where $\Gamma$ denotes the gamma function. Show
that $\Psi_{a,b,c}(z)\,\Psi_{b,c,a}(z)\,\Psi_{c,a,b}(z)$ is increasing in $z$ for $z \ge 1$.
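The claim in Problem 11562 can be explored numerically before one attempts a proof. The sketch below evaluates the product of the three gamma values on a grid of $z$ values for one sample triple; the triple $(a, b, c) = (1, 2, 3)$, the grid, and the function names `psi` and `psi_product` are assumptions made for illustration, and a numerical check of this kind is of course not a proof.

```python
import math

def psi(a, b, c, z):
    """Gamma((z*a + b + c) / (z + 2)), the function defined in the problem."""
    return math.gamma((z * a + b + c) / (z + 2))

def psi_product(a, b, c, z):
    """Product of the three cyclic shifts appearing in the problem."""
    return psi(a, b, c, z) * psi(b, c, a, z) * psi(c, a, b, z)

a, b, c = 1.0, 2.0, 3.0                      # sample values only
zs = [1 + 0.5 * i for i in range(19)]        # z = 1.0, 1.5, ..., 10.0
values = [psi_product(a, b, c, z) for z in zs]
is_increasing = all(u < v for u, v in zip(values, values[1:]))
print(values[0], values[-1], is_increasing)
```

At $z = 1$ all three arguments equal $(a + b + c)/3$, so for this triple the product starts at $\Gamma(2)^3 = 1$.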
11563. Proposed by Vlad Matei (student), University of Bucharest, Bucharest, Romania.
For each integer $k \ge 2$, find all nonconstant $f$ in $\mathbb{Z}[x]$ such that for every prime
$p$, $f(p)$ has no nontrivial $k$th-power divisor.
SOLUTIONS
Explaining a Polynomial
11403 [2008, 949]. Proposed by Yaming Yu, University of California–Irvine, Irvine,
CA. Let $n$ be an integer greater than 1, and let $f_n$ be the polynomial given by
$$f_n(x) = \sum_{i=0}^{n} \binom{n}{i} (-x)^{n-i} \prod_{j=0}^{i-1} (x + j).$$
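To get a feel for $f_n$, one can expand the definition for small $n$. The sketch below computes the integer coefficients of $f_n$ directly from the sum; the helper names `poly_mul` and `f_coeffs` are hypothetical, and polynomials are represented as coefficient lists in increasing degree.

```python
from math import comb

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def f_coeffs(n):
    """Coefficients of f_n(x) = sum_i C(n,i) (-x)^(n-i) prod_{j<i} (x+j)."""
    total = [0] * (n + 1)
    rising = [1]                       # empty product, used for i = 0
    for i in range(n + 1):
        sign = -1 if (n - i) % 2 else 1
        # Multiplying by (-x)^(n-i) shifts degrees by n-i and applies the sign.
        for deg, coef in enumerate(rising):
            total[deg + n - i] += sign * comb(n, i) * coef
        rising = poly_mul(rising, [i, 1])   # append the factor (x + i)
    while len(total) > 1 and total[-1] == 0:
        total.pop()                    # drop cancelled leading terms
    return total

print(f_coeffs(2), f_coeffs(3))
```

For example, expanding by hand gives $f_2(x) = x$ and $f_3(x) = 2x$, so a great deal of cancellation occurs among the top-degree terms.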
Mean Inequalities
11434 [2009, 463]. Proposed by Slavko Simic, Mathematical Institute SANU, Belgrade,
Serbia. Fix $n \in \mathbb{N}$ with $n \ge 2$. Let $x_1, \ldots, x_n$ be distinct real numbers, and
let $p_1, \ldots, p_n$ be positive numbers summing to 1. Let
$$S = \frac{\sum_{k=1}^{n} p_k x_k^3 - \left(\sum_{k=1}^{n} p_k x_k\right)^{3}}{3\left(\sum_{k=1}^{n} p_k x_k^2 - \left(\sum_{k=1}^{n} p_k x_k\right)^{2}\right)}.$$
Also solved by R. Bagby, R. Chapman (U.K.), M. P. Cohen, W. J. Cowieson, P. P. Dályay (Hungary),
H. Dehghan (Iran), P. J. Fitzsimmons, D. Grinberg, E. A. Herman, T. Konstantopoulos (U.K.), O. Kouba (Syria), J. H.
Lindsey II, O. P. Lossers (Netherlands), J. Posch, K. Schilling, B. Schmuland (Canada), R. Stong, M. Tetiva
(Romania), B. Tomper, BSI Problems Group (Germany), GCHQ Problem Solving Group (U.K.), Microsoft
Research Problems Group, and the proposer.
Extrema
11449 [2009, 647]. Proposed by Michel Bataille, Rouen, France. (corrected) Find the
maximum and minimum values of
$$\frac{(a^3 + b^3 + c^3)^2}{(b^2 + c^2)(c^2 + a^2)(a^2 + b^2)}$$
given that $a + b \ge c > 0$, $b + c \ge a > 0$, and $c + a \ge b > 0$.
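Solution by Jim Simons, Cheltenham, U.K. Call the expression above $X$. Since $X$ is
homogeneous, we may assume $a^2 + b^2 + c^2 = 1$. The feasible region then consists
of a triangular patch on the positive octant of the unit sphere, excluding the vertices
(where one of $a$, $b$, $c$ is zero), but including the interiors of the sides (where
one of $a$, $b$, $c$ equals the sum of the other two). Using spherical polar coordinates, we may set
$(a, b, c) = (\sin\alpha\cos\theta, \sin\alpha\sin\theta, \cos\alpha)$, so that
$$X = \frac{\left(\cos^3\alpha + \sin^3\alpha\,(\cos^3\theta + \sin^3\theta)\right)^{2}}{\sin^2\alpha\,\left(\cos^2\alpha + \sin^2\alpha\sin^2\theta\right)\left(\cos^2\alpha + \sin^2\alpha\cos^2\theta\right)}
= \frac{\left(\cos^3\alpha + \sin^3\alpha\,(\cos^3\theta + \sin^3\theta)\right)^{2}}{\sin^2\alpha\,\left(\cos^4\alpha + \cos^2\alpha\sin^2\alpha + \sin^4\alpha\sin^2\theta\cos^2\theta\right)}.$$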
If $f(\theta) = \cos^3\theta + \sin^3\theta$, then $f'(\theta) = 3\cos\theta\sin\theta\,(\sin\theta - \cos\theta)$. In the feasible
region for $\theta$, this is positive for $\theta > \pi/4$ and negative for $\theta < \pi/4$. Thus $f$, and with
it, the numerator of $X$ for fixed $\alpha$, is less at $\theta = \pi/4$ than at any other feasible $\theta$.
Similarly, if $g(\theta) = \sin^2\theta\cos^2\theta$, then $g$, and with it, the denominator of $X$ for fixed $\alpha$, is
increasing in $\theta$ for $\theta < \pi/4$ and decreasing in $\theta$ for $\theta > \pi/4$. Thus $X$ is, for fixed $\alpha$,
smallest at $\theta = \pi/4$, and greatest at an edge of the feasible region. By symmetry, the
minimum value of $X$ is $9/8$, attained when $a = b = c$.
From the foregoing, the maximum value of $X$ on the closure of the feasible region
occurs at a point where, with respect to any translation into spherical coordinates,
$\theta$ is extremal. The only such points are the corners of the region. At $(a, b, c) =
(2^{-1/2}, 2^{-1/2}, 0)$, $X = 2$. However, this maximum is not attained because these corners
are not in the feasible region.
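The two extreme values found above are easy to confirm by direct evaluation. The sketch below defines the expression as a function (the name `X` follows the solution; the sample triple $(3, 4, 5)$ is an assumption chosen as one arbitrary feasible point) and evaluates it at $a = b = c$, at an excluded corner, and at the sample point. Since $X$ is homogeneous of degree zero, the points need not be normalized.

```python
def X(a, b, c):
    """(a^3 + b^3 + c^3)^2 / ((b^2 + c^2)(c^2 + a^2)(a^2 + b^2))."""
    num = (a**3 + b**3 + c**3) ** 2
    den = (b * b + c * c) * (c * c + a * a) * (a * a + b * b)
    return num / den

x_center = X(1.0, 1.0, 1.0)   # a = b = c, where the minimum is attained
x_corner = X(1.0, 1.0, 0.0)   # excluded corner, direction (2^{-1/2}, 2^{-1/2}, 0)
x_sample = X(3.0, 4.0, 5.0)   # one feasible point, since 3 + 4 >= 5
print(x_center, x_corner, x_sample)
```

The first two values are $9/8$ and $2$, and the sample value lies strictly between them, in line with the solution.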
Also solved by R. Agnew, A. Alt, R. Bagby, D. Beckwith, H. Caerols & R. Pellicer (Chile), R. Chapman
(U.K.), H. Chen, C. Curtis, Y. Dumont (France), D. Fleischman, J.-P. Grivaux (France), E. A. Herman, F.
Holland (Ireland), T. Konstantopoulos (U.K.), O. Kouba (Syria), A. Lenskold, J. H. Lindsey II, P. Perfetti
(Italy), N. C. Singer, R. Stong, T. Tam, R. Tauraso (Italy), M. Tetiva (Romania), E. I. Verriest, Z. Vörös
(Hungary), S. Wagon, G. D. White, GCHQ Problem Solving Group (U.K.), Microsoft Research Problems
Group, and the proposer.
11450 [2009, 647]. Proposed by Cosmin Pohoata (student), National College “Tudor
Vianu,” Bucharest, Romania. Let $A$ be the unit ball in $\mathbb{R}^n$. Find
$$\max_{a \in A}\ \min_{1 \le i < j \le n} |a_i - a_j|.$$
Solution by Omran Kouba, Higher Institute for Applied Sciences and Technology,
Damascus, Syria. Let $M_n$ denote the desired maximum. It is implicit in the statement
of the problem that $n \ge 2$. We show that $M_n = \sqrt{12/(n(n^2 - 1))}$.
Let $(a_1, \ldots, a_n)$ be an element of $A$ at which the maximum is achieved, and
let $M_n = \min\{|a_i - a_j| : 1 \le i < j \le n\}$. There is a permutation $\sigma$ of the set
$\{1, 2, \ldots, n\}$ such that $a_{\sigma(1)} \le a_{\sigma(2)} \le \cdots \le a_{\sigma(n)}$. Write for simplicity $b_j = a_{\sigma(j)}$.
For $j > i$, we then have
$$b_j - b_i = \sum_{k=i+1}^{j} (b_k - b_{k-1}) \ge (j - i) M_n.$$
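Squaring these inequalities and summing over all pairs $(i, j)$, and using
$\sum_{1 \le i,j \le n} (b_j - b_i)^2 = 2n \sum_{k=1}^{n} a_k^2 - 2\left(\sum_{k=1}^{n} a_k\right)^2$, we obtain
$$M_n^2 \sum_{1 \le i,j \le n} (j - i)^2 \le 2n \sum_{k=1}^{n} a_k^2 - 2\left(\sum_{k=1}^{n} a_k\right)^{2} \le 2n,$$
while
$$\sum_{1 \le i,j \le n} (j - i)^2 = 2n \sum_{k=1}^{n} k^2 - 2\left(\sum_{k=1}^{n} k\right)^{2} = \frac{n^2(n^2 - 1)}{6}.$$
It follows that $M_n^2 \le 12/(n(n^2 - 1))$, so $M_n \le \sqrt{12/(n(n^2 - 1))}$.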
Conversely, if we consider $(a_1^{(0)}, a_2^{(0)}, \ldots, a_n^{(0)})$ defined by
$$a_k^{(0)} = \sqrt{\frac{12}{n(n^2 - 1)}} \left(k - \frac{n + 1}{2}\right), \quad k = 1, 2, \ldots, n,$$
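The candidate point defined above can be checked numerically: it should lie on the unit sphere, and its smallest gap should equal the upper bound just derived. The sketch below does this for one sample dimension; the value $n = 5$ and the helper names `candidate` and `min_gap` are assumptions made for illustration.

```python
import math

def candidate(n):
    """Equally spaced, centered point a^(0) from the solution."""
    scale = math.sqrt(12.0 / (n * (n * n - 1)))
    return [scale * (k - (n + 1) / 2) for k in range(1, n + 1)]

def min_gap(a):
    """Smallest |a_i - a_j| over all pairs i < j."""
    return min(abs(a[i] - a[j])
               for i in range(len(a)) for j in range(i + 1, len(a)))

n = 5                                   # sample dimension only
a0 = candidate(n)
norm_sq = sum(v * v for v in a0)        # should be 1 (point on the unit sphere)
gap = min_gap(a0)
bound = math.sqrt(12.0 / (n * (n * n - 1)))
print(norm_sq, gap, bound)
```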
A Cauchy–Schwarz Puzzle
11458 [2009, 747]. Proposed by Cezar Lupu (student), University of Bucharest,
Bucharest, Romania, and Vicenţiu Rădulescu, Institute of Mathematics “Simon Stoilow”
of the Romanian Academy, Bucharest, Romania. Let $a_1, \ldots, a_n$ be nonnegative
and let $r$ be a positive integer. Show that
$$\left(\sum_{1 \le i,j \le n} \frac{i^r j^r a_i a_j}{i + j - 1}\right)^{2} \le \sum_{m=1}^{n} m^{r-1} a_m \sum_{1 \le i,j,k \le n} \frac{i^r j^r k^r a_i a_j a_k}{i + j + k - 2}.$$
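The inequality can be tested on small cases in exact arithmetic. The sketch below evaluates both sides with rational numbers; the function names `lhs` and `rhs` and the sample data are assumptions made for illustration, and such a check is only a plausibility test, not a proof.

```python
from fractions import Fraction
from itertools import product

def lhs(a, r):
    """Square of the double sum on the left side."""
    n = len(a)
    s = sum(Fraction(i**r * j**r, i + j - 1) * a[i - 1] * a[j - 1]
            for i, j in product(range(1, n + 1), repeat=2))
    return s * s

def rhs(a, r):
    """Product of the single sum and the triple sum on the right side."""
    n = len(a)
    first = sum(m ** (r - 1) * a[m - 1] for m in range(1, n + 1))
    second = sum(Fraction(i**r * j**r * k**r, i + j + k - 2)
                 * a[i - 1] * a[j - 1] * a[k - 1]
                 for i, j, k in product(range(1, n + 1), repeat=3))
    return first * second

sample = [1, 2, 0, 3]          # nonnegative sample values
print(lhs(sample, 2), rhs(sample, 2))
print(lhs([1], 1), rhs([1], 1))  # n = 1: both sides equal 1
```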
An Orthocenter Inequality
11461 [2009, 844]. Proposed by Panagiote Ligouras, Leonardo da Vinci High School,
Noci, Italy. Let $a$, $b$, and $c$ be the lengths of the sides opposite vertices $A$, $B$, and $C$ of
an acute triangle. Let $H$ be the orthocenter. Let $d_a$ be the distance from $H$ to side $BC$,
and similarly for $d_b$ and $d_c$. Show that
$$\frac{1}{d_a + d_b + d_c} \ge \frac{2}{3} \left(\frac{3}{abc} \left(\frac{1}{\sqrt{bc}} + \frac{1}{\sqrt{ca}} + \frac{1}{\sqrt{ab}}\right)\right)^{1/4}.$$
An Erroneous Claim
11465 [2009, 845]. Proposed by Pantelimon George Popescu, Polytechnic University
of Bucharest, Bucharest, Romania, and José Luis Díaz-Barrero, Polytechnic University
of Catalonia, Barcelona, Spain. Consider three simple closed curves in the plane, of
lengths $p_1$, $p_2$, and $p_3$, enclosing areas $A_1$, $A_2$, and $A_3$, respectively. Show that if
$p_3 = p_1 + p_2$ and $A_3 = A_1 + A_2$, then $8\pi A_3 \le p_3^2$.
Solution by the Texas State University Problem Solvers Group, San Marcos, TX. The
problem as stated is false. Consider the following counterexample. Let the first curve
be a square of side 1, so $p_1 = 4$ and $A_1 = 1$. Let the second curve be a square with
$p_2 = 40$ and $A_2 = 100$. Let the third curve be a rectangle with sides $11 + 2\sqrt{5}$ and
$101/(11 + 2\sqrt{5})$ so that $p_3 = 44$ and $A_3 = 101$. These three curves fulfill the
requirements of the problem, and yet $8\pi A_3 > p_3^2$.
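The arithmetic of the counterexample can be verified in a few lines. The sketch below recomputes the perimeter and area of the rectangle and compares $8\pi A_3$ with $p_3^2$; the variable names `w` and `h` for the rectangle's sides are introduced only for this illustration.

```python
import math

p1, A1 = 4.0, 1.0        # unit square
p2, A2 = 40.0, 100.0     # square of side 10

w = 11 + 2 * math.sqrt(5)   # first side of the rectangle
h = 101 / w                 # second side, equal to 11 - 2*sqrt(5)
p3 = 2 * (w + h)
A3 = w * h

print(p3, A3, 8 * math.pi * A3, p3 ** 2)
```

Here $8\pi A_3 \approx 2538$ while $p_3^2 = 1936$, so the claimed inequality fails.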
Let us incorporate the additional requirement that $p_1^2 + p_2^2 = 2 p_1 p_2$. Then the
required inequality can be proved as follows. The isoperimetric inequality applied to any
of the curves is
$$A_i \le \pi \left(\frac{p_i}{2\pi}\right)^{2},$$
and thus $4\pi A_i \le p_i^2$. Therefore
$$4\pi A_3 = 4\pi A_1 + 4\pi A_2 \le p_1^2 + p_2^2.$$
With the newly-added condition we get
$$8\pi A_3 = 8\pi A_1 + 8\pi A_2 \le 2 p_1^2 + 2 p_2^2 = p_1^2 + p_2^2 + 2 p_1 p_2 = (p_1 + p_2)^2 = p_3^2.$$
Logical Labyrinths. By Raymond M. Smullyan. A K Peters, Ltd., Wellesley, MA, 2009, vii +
327 pp., ISBN 978-1-56881-443-8, $49.00.
Since I entered this world late in the Eisenhower administration, some of my earliest
introductions to the field of logic were through Raymond Smullyan’s work. I truly admire
the authors among us who have the talent to open a field to others in an interesting
and non-threatening manner, and Smullyan has that talent to spare. His logical puzzle
books, from What is the Name of this Book? to Forever Undecided and beyond, have
introduced thousands to the delights of formal logic, and his variations on knights and
knaves puzzles have been canonical material in logic classes for the past three decades,
and will be for decades to come.
Logical Labyrinths thus begins in familiar territory with the author guiding us into
propositional logic through informal reasoning and logic conundrums. This volume,
however, has a more ambitious goal than the author’s more popular works. Smullyan
says in the preface that it is “. . . a bridge from all my previous recreational puzzle books
to all my technical writing in the fascinating field of symbolic logic.” So this book aims
to be more of a textbook, helping the reader move from an intuitive understanding of
logical arguments to a mastery of specialized results that is on a par with that of first-
year graduate students. This is a daunting goal for a book of 320 pages, no matter
who the author. Although there is much to praise about this work, and indeed parts
of it are brilliant, I am left at the end with a feeling of disappointment. The overly
ambitious reach of the book seems to have left it without a core audience to whom the
author can pitch his prose. Smullyan treats the advanced topics at the conclusion of his
journey in much the same way that he treats the introductory material. Unfortunately,
Smullyan’s characteristic style, which works so well at the beginning of the book,
does not seem to fit as well at the end. In my opinion this makes Logical Labyrinths
somewhat unsatisfactory.
Think about your ideal mathematics class. Maybe for you it is a first proofs course,
or an analysis course, or the high school geometry class that your chair has not let you
teach for the last fifteen years. Whatever the subject is, now think about the students in
that class. Whether they are advanced students or raw beginners, most likely you have
thought of these students as bright, inquisitive, somewhat self-motivated, and eager to
join you as you explore the subject matter at hand. You are looking forward to sharing
your love of your field with them, exploring the main ideas as well as showing them
doi:10.4169/amer.math.monthly.118.03.283
Importance is placed on the problems and their solutions. The book contains
numerous problems of varying difficulty; over a third of its contents are devoted
to problem statements, hints, and complete solutions. The book can be used as
a textbook for geometry courses; as a source book for geometry and other
mathematics courses; for capstone, problem-solving, and enrichment courses; and for
independent study courses.
480 pp., 2010, ISBN 978-0-88385-763-2, Hardbound
List: $69.95 MAA Member: $55.95 Catalog Code: MEG
The centerpiece of the book is The Calculus as Algebra: J.-L. Lagrange, 1736-
1813. The book describes the achievements, setbacks, and influence of Lagrange’s
pioneering attempt to reduce the calculus to algebra. Nine additional articles round
out the book describing the history of the derivative; the origin of delta-epsilon proofs;
Descartes and problem solving; the contrast between the calculus of Newton and
Maclaurin, and that of Lagrange; Maclaurin’s way of doing mathematics and science
and his surprisingly important influence; some widely held “myths” about the history
of mathematics; Lagrange’s attempt to prove Euclid’s parallel postulate; and the central
role that mathematics has played throughout the history of western civilization.
Counterexamples in Calculus
Sergiy Klymchuk
As a robust repertoire of examples is essential for students
to learn the practice of mathematics, so a mental library of
counterexamples is critical for students to grasp the logic of
mathematics. Counterexamples are tools that reveal incorrect
beliefs. Without such tools, learners’ natural misconceptions
gradually harden into convictions that seriously impede
further learning. This slim volume brings the power of
counterexamples to bear on one of the largest and most important
courses in the mathematics curriculum.
—Professor Lynn Arthur Steen, St. Olaf College, Minnesota,
USA, Co-author of Counterexamples in Topology
This book aims to fill a gap in the literature and provide a resource for
using counterexamples as a pedagogical tool in the study of introductory
calculus. In that light it may well be useful for
• high school teachers and university faculty as a teaching resource
• high school and college students as a learning resource
• a professional development resource for calculus instructors
Catalog Code: CXC
101pp., Paperbound, 2010
ISBN: 978-0-88385-756-6
List: $45.95
MAA Member: $35.95