Version of 30 September 2017

Oxford University Mathematical Institute

Complex Numbers
Notes by Peter M. Neumann (Queen’s College)

1. Introduction
Students coming to Oxford to study mathematics arrive with a wide range of background
knowledge and training. In respect of complex numbers some will know much and have acquired
useful confidence, others will be almost beginners. But it is likely that, however much or little
you have already learned, there will be less unanimity of language and perception than is
desirable in a large group of students who are studying mathematics together to a deep level.
The purpose of the two lectures, and therefore the purpose of these lecture notes, is to serve
as a reminder to those who have met the topic before, to introduce the ideas to those new to
the topic, and—perhaps most importantly—to establish a common language and notation.

2. Complex numbers
For us a complex number is an entity of the form $a + bi$. Here $a$, $b$ are real numbers and
$i = \sqrt{-1}$. The real number system $\mathbb{R}$ (which should be familiar to us all) consists of rational
numbers $p/q$ and the irrational numbers like $\sqrt{2}$ or $e$ that fit between them to make a system
that many of us picture using the ‘number line’
\[ \ldots\ -27\ \ldots\ -10\sqrt{2}\ \ldots\ -1\ \ldots\ 0\ \ldots\ 1\ \ldots\ 7\sqrt{3}\ \ldots\ 27\ \ldots\ 65\,537\ \ldots \]
It is the ingredient i of a complex number that took mathematicians of the 16th to 19th
centuries much time and mental readjustment to come to terms with. We learn quite early in
our lives that ‘minus times minus is plus’. As a consequence, the square of any real number
is non-negative, so the equation $x^2 = -1$ has no real roots. Complex numbers result from
the imaginative idea of simply inventing an entity $i$ that is a root of this equation, that is, for
which $i^2 = -1$. Indeed, the notation $i$ is said to have been invented in recognition that this
must be an imaginary number having nothing to do with quantity and little to do with reality.

Note. Sometimes we’ll use the form a + bi, sometimes a + ib. There is no mathematical
difference. In some contexts there is a mild psychological difference.

As is well known, in the Real world, there are quadratic equations such as $x^2 + 1 = 0$ and
$x^2 - x + 1 = 0$ that have no roots. Once we have complex numbers, however, all quadratic
equations with real coefficients have roots:

If $a \neq 0$ and $ax^2 + bx + c = 0$ then
\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]
If $b^2 - 4ac$, the so-called discriminant, is non-negative then the roots are real numbers; if $b^2 - 4ac < 0$ then
$\sqrt{b^2 - 4ac} = \sqrt{4ac - b^2} \times i$. Thus if $b^2 \geq 4ac$ then the equation has real roots and
if $b^2 - 4ac < 0$ then roots exist as complex, non-real numbers.
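As a rough illustration of the paragraph above, the following Python sketch computes the discriminant and both roots of a quadratic with real coefficients. The function name `quadratic_roots` is an assumption made for this example; `cmath.sqrt` is the standard-library square root, which returns a complex value when its argument is negative.

```python
import cmath

def quadratic_roots(a, b, c):
    """Discriminant and roots of a*x**2 + b*x + c = 0 for real a, b, c with a != 0."""
    if a == 0:
        raise ValueError("a must be non-zero")
    disc = b * b - 4 * a * c          # the discriminant b^2 - 4ac
    sqrt_disc = cmath.sqrt(disc)      # for disc < 0 this is sqrt(4ac - b^2) times i
    return disc, (-b + sqrt_disc) / (2 * a), (-b - sqrt_disc) / (2 * a)

# x^2 - x + 1 = 0 has discriminant -3, so its roots are complex and non-real
disc, r1, r2 = quadratic_roots(1, -1, 1)
print(disc, r1, r2)
```

The two roots returned in the negative-discriminant case are complex conjugates of one another.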

Exercise 2.1 Which of the following quadratic equations have real roots, which do not?
$3x^2 + 2x - 1 = 0$; $2x^2 - 6x + 9 = 0$; $-4x^2 + 7x - 9 = 0$; $4x^2 - 9x + 5 = 0$.

3. Arithmetic of complex numbers


The collection of all complex numbers is written C (notation that is pretty much standard in
mathematics everywhere in the world). Addition and multiplication are basic binary operations
on C (in the precise sense of the technical term ‘binary operation’ that is to be explained in
other lecture courses). They are what you would expect:
(a + bi) + (c + di) = (a + c) + (b + d)i
(a + bi) × (c + di) = (ac − bd) + (ad + bc)i .

That rule for multiplication comes simply from the calculation

\[ (a + bi) \times (c + di) = ac + adi + bci + bdi^2 \]

and exploiting the fact (or assumption?) that $i^2 = -1$. We’ll come to division later.
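The two rules can be sketched in Python by treating a complex number $a + bi$ as the pair `(a, b)`. The helper names `add` and `multiply` are assumptions made for this illustration; Python's built-in `complex` type is used only as an independent check.

```python
def add(z, w):
    """(a + bi) + (c + di) = (a + c) + (b + d)i, with numbers given as pairs."""
    a, b = z
    c, d = w
    return (a + c, b + d)

def multiply(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i, with numbers given as pairs."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

product = multiply((2, 3), (4, -5))      # (2 + 3i)(4 - 5i)
print(product)                           # (23, 2)
print(complex(2, 3) * complex(4, -5))    # (23+2j), the built-in type agrees
```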

Exercise 3.1 Put each of the following complex numbers into standard form a + bi:

$(1 + 2i)(3 - i)$; $(2 + i)(1 - 2i)$; $(1 + i)^4$; $(1 - 3i)^3$.

4. The Argand diagram


The complex number $a + bi$ may be identified with the point $(a, b)$ of the euclidean plane $\mathbb{R}^2$.
For this reason, some people call a + bi the ‘cartesian form’ of a complex number. Just as we
speak of ‘the real line’ when we represent real numbers geometrically on the number line, so
we speak of ‘the complex plane’ (also known as ‘the Argand diagram’) when we picture C as
a 2-dimensional geometrical object.
If $z \in \mathbb{C}$ and $z = a + bi$ then $a$ is known as the real part $\mathrm{Re}\,z$ (or $\mathrm{Re}(z)$ or $\Re z$; notation
varies from book to book and from lecturer to lecturer but its meaning should always be clear
from the context) and $b$ is known as the imaginary part $\mathrm{Im}\,z$ (or $\mathrm{Im}(z)$ or $\Im z$). It is important
that $\mathbb{R} \subseteq \mathbb{C}$ in the sense that $\mathbb{R} = \{z \in \mathbb{C} \mid \mathrm{Im}\,z = 0\}$. In fact this is not just a containment
as sets, it is containment as arithmetical structure. For, $(a + 0i) + (c + 0i) = (a + c) + 0i$ and
$(a + 0i) \times (c + 0i) = (ac) + 0i$, so the arithmetic of real numbers identified as a special kind of
complex numbers is precisely the same as the arithmetic of $\mathbb{R}$. We often express this by writing
$\mathbb{R} \leq \mathbb{C}$ rather than $\mathbb{R} \subseteq \mathbb{C}$. With this identification $\mathbb{R}$ becomes the $x$-axis in the complex
plane.
The $y$-axis consists of numbers of the form $bi$. These are generally known
as ‘pure imaginary’ complex numbers. Clearly, any complex number is uniquely
the sum of a real number and a pure imaginary number.
Addition of complex numbers may be represented geometrically in the Argand
diagram by the so-called parallelogram law, as shown in the diagram. There is
also a geometrical description of multiplication, but it is more complicated and will
be discussed later.

Exercise 4.1. Show that there is no binary relation < on C for which the following
conditions are satisfied:
• for any z ∈ C exactly one of z < 0, z = 0, 0 < z holds;
• if z, w ∈ C, 0 < z and 0 < w then 0 < z + w ;
• if $z \in \mathbb{C}$ and $z \neq 0$ then $0 < z^2$.

Note. It is a consequence that there is no order-relation on C analogous to that on R.


Thus inequalities can only be valid between real numbers. Formulae such as i < 2i and
1 + i < 2 + i make no sense.

5. Conjugate and modulus of a complex number


For $z = a + bi \in \mathbb{C}$ the complex conjugate $\overline{z}$ (for which you might sometimes see the
notation $z^*$, though very rarely in the Oxford mathematics courses) is defined by $\overline{z} := a - bi$.
Interpreted geometrically, complex conjugation is reflection of the Argand diagram in the real
axis. Here is an easy, but disproportionately useful and important fact about it:

Theorem C1: If $z, w \in \mathbb{C}$ then $\overline{z + w} = \overline{z} + \overline{w}$ and $\overline{z w} = \overline{z}\,\overline{w}$.

Because this is a very important fact, it is important that you yourself have internalised the
reasons why it is true. That is why the proof is left as an exercise—if it gives you trouble then
please consult your Tutors or write me a note.

Exercise 5.1. Write out a proof of this theorem.

For $z = a + bi \in \mathbb{C}$ the absolute value or modulus (these terms are used interchangeably)
$|z|$ is defined by $|z| := \sqrt{a^2 + b^2}$. Since $a^2 + b^2 \geq 0$ this real number has a non-negative square
root and this is what is meant by the symbol $\sqrt{\ }$. Thus $|z| \geq 0$, and $|z| = 0$ if and only if $z = 0$.
Geometrically, $|z|$ is the distance of $z$ from $0$ in the complex plane.
Three details about the modulus that follow directly from the definition are worth noting:
(1) $|\overline{z}| = |z|$;
(2) if $x$ is real then $|x| = \sqrt{x^2}$, that is, $|x| = x$ if $x \geq 0$, $|x| = -x$ if $x < 0$, which accords
with the definition of absolute value for real numbers (in their own right);
(3) if $z = a + bi$ then, since $a^2 + b^2 \geq a^2$, $|z| \geq |a|$, that is, $|\mathrm{Re}\,z| \leq |z|$; similarly, of course,
$|\mathrm{Im}\,z| \leq |z|$.

Theorem C2: (1) If $z \in \mathbb{C}$ then $|z|^2 = z\overline{z}$.
(2) If $z, w \in \mathbb{C}$ then $|z w| = |z||w|$.

For the proof of (1) note that if $z = a + bi$ then $z\overline{z} = (a + bi)(a - bi) = a^2 - i^2 b^2 = a^2 + b^2$,
so that $|z| = \sqrt{z\overline{z}}$ by definition of the modulus. Now to prove (2) calculate as follows:

\begin{align*}
|z w|^2 &= (z w)\overline{(z w)} = (z w)(\overline{z}\,\overline{w}) && \text{[by Theorem C1]} \\
&= (z\overline{z})(w\overline{w}) = |z|^2 |w|^2 ,
\end{align*}

and the desired result comes by taking square roots.

Suggestion: Get into the habit of working with complex numbers as single entities. There
is rarely any need to write them in terms of their real and imaginary parts. Our proof of (2)
illustrates this.
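In the spirit of that suggestion, Theorems C1 and C2 can be spot-checked numerically while treating each complex number as a single object. The sketch below uses Python's built-in `complex` type; the particular values of `z` and `w` are arbitrary choices made for illustration, and a numerical check of one example is of course no substitute for the proofs.

```python
import math

z = complex(3, 4)
w = complex(1, -2)

# Theorem C1: conjugation respects sums and products
conj_sum_ok = (z + w).conjugate() == z.conjugate() + w.conjugate()
conj_prod_ok = (z * w).conjugate() == z.conjugate() * w.conjugate()

# Theorem C2(1): z times its conjugate is the real number |z|^2
z_times_conj = z * z.conjugate()          # (25+0j)

# Theorem C2(2): |z w| = |z| |w|, compared up to floating-point rounding
modulus_ok = math.isclose(abs(z * w), abs(z) * abs(w))

print(conj_sum_ok, conj_prod_ok, z_times_conj, modulus_ok)
```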

Theorem C3 [The Triangle Inequality]:
If $z, w \in \mathbb{C}$ then $|z + w| \leq |z| + |w|$.

Proof. Geometrically, this is the fact that the
length of any side of a triangle is always at most the
sum of the lengths of the other two sides. In this
case the relevant triangle has its vertices at $0$, $z$ and
$z + w$ in the complex plane. The side from $0$ to $z$
has length $|z|$, the side from $z$ to $z + w$ has length
$|w|$, and the side from $0$ to $z + w$ has length $|z + w|$.

It is perhaps worth seeing an alternative, more algebraic proof.

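\[ |z + w|^2 = (z + w)\overline{(z + w)} = (z + w)(\overline{z} + \overline{w}) = z\overline{z} + w\overline{z} + z\overline{w} + w\overline{w} . \]

Now $w\overline{z} = \overline{z\overline{w}}$ and so $w\overline{z} + z\overline{w} = 2\,\mathrm{Re}(z\overline{w})$. By the third of the notes that follow the definition
of absolute value, $|\mathrm{Re}(z\overline{w})| \leq |z\overline{w}|$. We also have $|z\overline{w}| = |z||\overline{w}| = |z||w|$. Therefore

\[ |z + w|^2 = |z|^2 + 2\,\mathrm{Re}(z\overline{w}) + |w|^2 \leq |z|^2 + 2|z||w| + |w|^2 . \]

Thus $|z + w|^2 \leq (|z| + |w|)^2$, and taking positive square roots we find that $|z + w| \leq |z| + |w|$.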

Exercise 5.2. Let $z, w \in \mathbb{C}$ and suppose that $z \neq 0$. Show that $|z + w| = |z| + |w|$
if and only if there is a real number $\lambda \geq 0$ such that $w = \lambda z$.

6. Division of complex numbers


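We have just seen that if $z \in \mathbb{C}$ then $z\overline{z} = |z|^2$ and that if $z \neq 0$ then $|z|^2 > 0$. It follows
that
\[ \text{if } z \neq 0 \text{ then } \frac{1}{z} = \frac{\overline{z}}{|z|^2}, \quad \text{that is,} \quad \frac{1}{a + bi} = \frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}\, i. \]
Therefore if $z, w \in \mathbb{C}$ and $z \neq 0$ then
\[ \frac{w}{z} = \frac{w \times \overline{z}}{z \times \overline{z}} = \frac{w\overline{z}}{|z|^2}. \]
Thus $\mathbb{C}$ is very similar to $\mathbb{Q}$ or
$\mathbb{R}$ in that it has addition, multiplication and division by non-zero numbers. These operations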
satisfy the usual rules of arithmetic—rules that are familiar, but only to those who have taken
the time to stop and think what they are. In language to be introduced in other courses, the
complex number system C is a field (just as Q and R are fields).
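The division rule $w/z = w\overline{z}/|z|^2$ can be sketched in Python with exact rational arithmetic. The function name `divide` and the representation of $a + bi$ as the pair `(a, b)` are assumptions made for this illustration, and the sketch assumes integer real and imaginary parts so that `fractions.Fraction` gives exact answers.

```python
from fractions import Fraction

def divide(w, z):
    """Return w / z for pairs (a, b) meaning a + bi, using w/z = w*conj(z)/|z|^2."""
    a, b = w
    c, d = z
    norm_sq = c * c + d * d            # |z|^2 = z * conj(z)
    if norm_sq == 0:
        raise ZeroDivisionError("division by the complex number 0")
    # w * conj(z) = (a + bi)(c - di) = (ac + bd) + (bc - ad)i
    return (Fraction(a * c + b * d, norm_sq), Fraction(b * c - a * d, norm_sq))

quotient = divide((1, 2), (3, -4))     # (1 + 2i) / (3 - 4i)
print(quotient)                        # (Fraction(-1, 5), Fraction(2, 5))
```

Multiplying the result $-\tfrac{1}{5} + \tfrac{2}{5}i$ back by $3 - 4i$ recovers $1 + 2i$, which is a convenient way to check such calculations by hand.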

Exercise 6.1. Work out $\dfrac{7 - 2i}{5 + 12i}$, $\dfrac{i}{1 - i}$ and a few other such complex number calculations
until you are confident that you fully understand division in $\mathbb{C}$.

Exercise 6.2. Let z1 := 1 + i and z2 := 2 − 3i. Put each of the following into standard
form a + bi:
z1 + z2 ; z1 − z2 ; z1 z2 ; z1 /z2 ; z1 z2 ; z1 z2 .

7. The argument of a complex number


In plane geometry and its applications it is often useful to switch between cartesian and polar
coordinates. In the Argand diagram the polar coordinates of a complex number z are (r, α)
where r is the distance from 0 and α is the angle from the positive x-axis (the positive real
line) anti-clockwise to the ray joining 0 to z . It should be clear from the definition of absolute
value that r = |z|. The angle α is known as the argument arg z . In principle it is many-valued.
For any integer k the angle α + 2kπ would serve just as well. Many mathematicians would
express this by saying that arg z is only determined “modulo 2π ”.
Two important details about the argument are worth noting: that $\arg 0$ is not defined, and
that $\arg \overline{z} = -\arg z$.
Sometimes one specifies that $-\pi < \arg z \leq \pi$ to make
the argument single-valued. This single-valued version of
$\arg z$ is known as its principal value. Another convention
that some authors use is that the principal value should
satisfy $0 \leq \arg z < 2\pi$, but this version does not combine
well with the equation $\arg \overline{z} = -\arg z$.
Switching back and forth between polar and cartesian
coordinates requires trigonometry. If $z = a + bi \in \mathbb{C} \setminus \{0\}$
and $r = |z|$, $\alpha = \arg z$ then $a = r\cos\alpha$, $b = r\sin\alpha$. So
in terms of modulus and argument, $z = r(\cos\alpha + i\sin\alpha)$.
Going the other way, if $z = a + bi \neq 0$ and $r := |z|$,
$\alpha := \arg z$ then $r = \sqrt{a^2 + b^2}$ (of course, this is how
$|z|$ was defined) and, since $b/a = r\sin\alpha / r\cos\alpha = \tan\alpha$,
we find $\alpha$ as $\tan^{-1}(b/a)$ where, however, the inverse tangent function has (as always) to be
interpreted with great care.
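One way to see why the inverse tangent needs care is to compare it with a quadrant-aware computation. The sketch below uses `math.atan2` from the Python standard library, which takes the imaginary and real parts separately; the wrapper name `principal_argument` is an assumption made for this illustration.

```python
import math

def principal_argument(a, b):
    """Principal value of arg(a + bi), in (-pi, pi] (ignoring IEEE signed zeros)."""
    if a == 0 and b == 0:
        raise ValueError("arg 0 is not defined")
    return math.atan2(b, a)            # uses the signs of both a and b

a, b = -1.0, -1.0                      # z = -1 - i lies in the third quadrant
naive = math.atan(b / a)               # pi/4: the quotient b/a has lost the signs
correct = principal_argument(a, b)     # -3*pi/4
print(naive, correct)
```

The two answers differ by $\pi$, because $b/a$ takes the same value at $z$ and at $-z$. The quadrant-aware version also copes with $a = 0$, where $b/a$ is undefined.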

Exercise 7.1. Find the modulus and argument of each of the following complex numbers:

$1 + 3i$; $(2 + i)(3 - i)$; $(1 + i)^5$.

Exercise 7.2. On separate Argand diagrams sketch each of the following subsets of C:
$A := \{z \mid |z| < 1\}$; $B := \{z \mid \mathrm{Re}\,z = 3\}$; $C := \{z \mid -\tfrac{1}{4}\pi < \arg z < \tfrac{1}{4}\pi\}$;
$D := \{z \mid \arg(z - i) = \tfrac{1}{2}\pi\}$; $E := \{z \mid |z - 3 - 4i| = 5\}$; $F := \{z \mid |z - 1| = |z - i|\}$.

We have seen in Theorem C2 that if $z, w \in \mathbb{C}$ then $|z w| = |z||w|$. How is $\arg(z w)$ related
to $\arg z$ and $\arg w$? Well, if $\alpha := \arg z$ and $\beta := \arg w$, so that $z = |z|(\cos\alpha + i\sin\alpha)$,
$w = |w|(\cos\beta + i\sin\beta)$, then

z w = |z||w|(cos α + i sin α)(cos β + i sin β).

Using the trigonometrical addition formulae for sine and cosine we see that

(cos α + i sin α)(cos β + i sin β) = (cos α cos β − sin α sin β) + i(sin α cos β + cos α sin β)
= cos(α + β) + i sin(α + β).

Thus we have proved the following companion to Theorem C2(2):

Theorem C4: If z, w ∈ C then arg(z w) = arg z + arg w .

Note. It is important here that $\arg z$ is not restricted to the principal value. Even if
$-\pi < \arg z \leq \pi$ and $-\pi < \arg w \leq \pi$, it could well happen that $\arg z + \arg w > \pi$ or
$\arg z + \arg w < -\pi$. For fundamental facts like this one, it is essential that the argument is
understood to be determined only “modulo $2\pi$”.
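A small numerical example of the point made in the Note: `cmath.phase` in the Python standard library returns the principal value of the argument, and for the (arbitrarily chosen) numbers below the sum of the principal values differs from the principal value of the product by exactly $2\pi$.

```python
import cmath
import math

z = complex(-1, 1)                               # principal argument 3*pi/4
w = complex(-1, 1)

sum_of_args = cmath.phase(z) + cmath.phase(w)    # 3*pi/2, which lies outside (-pi, pi]
arg_of_product = cmath.phase(z * w)              # z*w = -2i has principal argument -pi/2
difference = sum_of_args - arg_of_product        # 2*pi

print(sum_of_args, arg_of_product, difference)
```

So Theorem C4 holds here only when the argument is read modulo $2\pi$.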
The theorem indicates that the argument function behaves somewhat like a logarithmic
function. In fact, it is an important constituent of the complex logarithmic function—but that
is material for a more advanced study than can be offered in this short course.
Theorem C4 yields a geometric interpretation of multiplication as promised above, at the
end of §4. As a transformation of the Argand diagram, the operation ‘multiply by w ’ leaves the
point 0 unmoved, scales distances from 0 by a factor $|w|$, and rotates the plane anticlockwise
about 0 through an angle $\arg w$. It is worth remembering that this anticlockwise convention
for rotations is standard not just in the context of the complex plane, but also in geometry, in
mechanics—in most of mathematics. There is more to the convention, though: if α > 0 then
an anticlockwise rotation through angle −α is a clockwise rotation through angle α.

Exercise 7.3. For each of the following complex numbers w , what transformation of the
Argand diagram does multiplication by w represent?
i; (1 + i); (1 − i); (3 + 4i).

8. De Moivre’s Theorem
If $\theta \in \mathbb{R}$ then $\cos^2\theta + \sin^2\theta = 1$. Therefore complex numbers of the form $\cos\theta + i\sin\theta$
are of absolute value 1. Conversely, we have seen that if $|z| = 1$ and $\alpha := \arg z$ then
$z = \cos\alpha + i\sin\alpha$. Define

\[ S^1 := \{z \in \mathbb{C} \mid |z| = 1\}, \qquad \text{[standard notation]} \]

the unit circle in the complex plane. Described a little differently, $S^1 = \{\cos\theta + i\sin\theta \mid \theta \in \mathbb{R}\}$.
We know, since absolute values are multiplicative, that if $z, w \in S^1$ then also $z w \in S^1$,
that is, $S^1$ is closed under the multiplication of complex numbers; moreover, if $z \in S^1$ then
$z^{-1} = \overline{z} \in S^1$. In terms that some of you will already have met and all of you will meet before
long, this means that $S^1$ is a group. It is a very important group known as the circle group.

Theorem C5 [De Moivre’s Theorem]. If $z \in S^1$, $\theta := \arg z$ and $n \in \mathbb{Z}$ then $\arg(z^n) = n\theta$.
Equivalently, $(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta$.

For $n \geq 0$ this comes quickly by mathematical induction: it is certainly true if $n = 0$ (since
$z^0 = 1$ and $\arg 1 = 0$) or if $n = 1$; if it is true when $n = m$ then
$\arg(z^{m+1}) = \arg(z^m) + \arg z = m\arg z + \arg z$ (by inductive assumption) and so
$\arg(z^{m+1}) = (m + 1)\arg z$, that is, it is then true when $n = m + 1$.
If $n < 0$, say $n = -N$ where $N > 0$, then $z^n = (z^{-1})^N = \overline{z}^{\,N}$, and so
$\arg(z^n) = N\arg\overline{z} = N(-\arg z) = -N\arg z = n\arg z$. Thus the assertion is true for all integers $n$.

Note. Both for De Moivre’s Theorem and for its proof it is important that we do not
have to construe an argument θ as being an angle in the range (−π, π]. Even if θ lies in this
range, nθ need not.

De Moivre’s Theorem has many uses. One of them is to express the cosines and sines of
multiples of an angle θ as polynomials in cos θ and sin θ . For example:

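\begin{align*}
\cos 3\theta + i\sin 3\theta &= (\cos\theta + i\sin\theta)^3 \\
&= \cos^3\theta + 3i\cos^2\theta\sin\theta + 3i^2\cos\theta\sin^2\theta + i^3\sin^3\theta \\
&= (\cos^3\theta - 3\cos\theta\sin^2\theta) + i(3\cos^2\theta\sin\theta - \sin^3\theta) ,
\end{align*}

and, by equating real and imaginary parts, we find that

\begin{align*}
\cos 3\theta &= \cos^3\theta - 3\cos\theta\sin^2\theta = \cos^3\theta - 3\cos\theta(1 - \cos^2\theta) \\
&= 4\cos^3\theta - 3\cos\theta
\end{align*}
and
\begin{align*}
\sin 3\theta &= 3\cos^2\theta\sin\theta - \sin^3\theta = 3(1 - \sin^2\theta)\sin\theta - \sin^3\theta \\
&= -4\sin^3\theta + 3\sin\theta .
\end{align*}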
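The triple-angle formulae just derived can be checked numerically against De Moivre's Theorem. In the sketch below the helper name `unit_power` and the test angle are assumptions made for illustration; the comparison is made up to floating-point rounding.

```python
import math

def unit_power(theta, n):
    """(cos(theta) + i sin(theta))**n, computed with Python's complex power."""
    return complex(math.cos(theta), math.sin(theta)) ** n

theta = 0.7                                   # an arbitrary test angle, in radians
c, s = math.cos(theta), math.sin(theta)
power = unit_power(theta, 3)                  # should equal cos 3θ + i sin 3θ

cos_3theta = 4 * c**3 - 3 * c                 # the polynomial found for cos 3θ
sin_3theta = 3 * s - 4 * s**3                 # the polynomial found for sin 3θ

print(abs(power.real - cos_3theta) < 1e-12, abs(power.imag - sin_3theta) < 1e-12)
```

The same helper accepts negative integers `n`, which matches the statement of the theorem for all $n \in \mathbb{Z}$.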

Exercise 8.1. Show that if $\theta \in \mathbb{R}$ then
$\cos 5\theta = 16\cos^5\theta - 20\cos^3\theta + 5\cos\theta$; $\sin 5\theta = (16\cos^4\theta - 12\cos^2\theta + 1)\sin\theta$.

Exercise 8.2. Show that if $m$ is a positive integer then $(1 + i)^{4m} = (-1)^m 2^{2m}$.

9. Roots of unity
If $z \in \mathbb{C}$, $n \in \mathbb{N}$ and $z^n = 1$ then $z$ is known as a root of unity (an $n$th root of unity, to be
precise). For such numbers, since $|z|^n = |z^n| = 1$ and $|z| \geq 0$, we have $|z| = 1$. Thus complex
roots of unity lie in $S^1$.

Theorem C6. Let $z \in \mathbb{C}$. Then $z$ is an $n$th root of unity if and only if $|z| = 1$ and there
exists an integer $k$ such that $\arg z = 2k\pi/n$.

Proof. Suppose first that $|z| = 1$ and $\arg z = 2k\pi/n$ for some integer $k$. Then $|z^n| = 1$
and, by De Moivre’s Theorem, $\arg(z^n) = n \times 2k\pi/n = 2k\pi$. Thus $z^n$ has the same modulus
and the same argument as 1, and therefore $z^n = 1$.
Conversely, if $z^n = 1$ then, as we have already seen, $|z| = 1$, and moreover, if $\alpha := \arg z$
then by De Moivre’s Theorem again, $n\alpha = \arg 1 = 2k\pi$ for some integer $k$, and therefore
$\arg z = 2k\pi/n$.

Corollary: For any positive integer $n$ the number of $n$th roots of 1 is $n$.

For, by Theorem C6, the $n$th roots of 1 are the complex numbers $\cos(2k\pi/n) + i\sin(2k\pi/n)$
for $0 \leq k \leq n - 1$, and there are precisely $n$ of these.

Examples: The square roots of 1 in C are just its real square roots 1 and −1;
the fourth roots of 1 in C are 1, i, −1, −i.

Note. If $z$ is an $n$th root of unity and $z^m \neq 1$ for each natural number $m$ in the range
$0 < m < n$ then $z$ is said to be a primitive $n$th root of unity.

Theorem C7. Let $z \in \mathbb{C}$ and let $n$ be an integer $\geq 2$. If $z$ is an $n$th root of unity and
$z \neq 1$ then $z^{n-1} + z^{n-2} + \cdots + z^2 + z + 1 = 0$. This equation holds, in particular, for primitive
$n$th roots of unity.

For, let $z$ be an $n$th root of unity. Since

\[ 0 = z^n - 1 = (z - 1)(z^{n-1} + \cdots + z + 1) \]

and $z \neq 1$ by assumption, it must be the case that $z^{n-1} + \cdots + z + 1 = 0$.
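The description of the $n$th roots of 1 as $\cos(2k\pi/n) + i\sin(2k\pi/n)$, together with Theorem C7, can be checked numerically. The following Python sketch does this for $n = 5$; the function name `roots_of_unity` is an assumption made for this illustration and all comparisons are up to floating-point rounding.

```python
import math

def roots_of_unity(n):
    """The numbers cos(2k*pi/n) + i sin(2k*pi/n) for k = 0, 1, ..., n - 1."""
    return [complex(math.cos(2 * k * math.pi / n), math.sin(2 * k * math.pi / n))
            for k in range(n)]

n = 5
roots = roots_of_unity(n)

# each listed number satisfies z^n = 1, up to rounding
all_are_roots = all(abs(z**n - 1) < 1e-12 for z in roots)

# Theorem C7: for a root z != 1 we have z^(n-1) + ... + z + 1 = 0
z = roots[1]
geometric_sum = sum(z**k for k in range(n))

print(all_are_roots, abs(geometric_sum) < 1e-12)
```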

Example. Let $\omega$ be a primitive cube root of 1. Then $\omega^2 + \omega + 1 = 0$. Solving this
quadratic equation in the usual way we find that $\omega = \tfrac{1}{2}(-1 \pm \sqrt{-3}) = \tfrac{1}{2}(-1 \pm i\sqrt{3})$.
I find it comforting that this is consistent with the facts that

\[ \omega = \cos(2\pi/3) + i\sin(2\pi/3) \quad \text{or} \quad \omega = \cos(4\pi/3) + i\sin(4\pi/3), \]

and
\[ \cos(2\pi/3) = \cos(4\pi/3) = -\tfrac{1}{2}, \qquad \sin(2\pi/3) = -\sin(4\pi/3) = \tfrac{1}{2}\sqrt{3}. \]

Exercise 9.1. Write down the primitive 6th roots of unity and the primitive 8th roots of
unity in standard form a + bi.

Exercise 9.2. Is $3 + 4i$ a root of unity? Is $\tfrac{3}{5} + \tfrac{4}{5}i$ a root of unity?

Exercise 9.3 [Harder]. Find all pairs (a, b) of rational numbers for which a + bi is a root
of unity.

Exercise 9.4. Let $\phi := \cos(2\pi/5) + i\sin(2\pi/5)$, a primitive 5th root of 1. Define
\[ \alpha := \phi + \phi^4, \qquad \beta := \phi^2 + \phi^3 . \]
(i) Show that $\alpha$ and $\beta$ are real numbers.
(ii) Show that $\alpha + \beta = -1$ and $\alpha\beta = -1$ (so that $\alpha$, $\beta$ are the roots of $x^2 + x - 1 = 0$).
(iii) Deduce that $\cos(2\pi/5) = \tfrac{1}{4}(\sqrt{5} - 1)$.

Exercise 9.5. Let $p$ be a prime number, let $\omega$ be a primitive $p$th root of unity and let $m$
be a positive integer. Show that
\[ \sum_{k=0}^{p-1} \omega^{mk} = \begin{cases} 0 & \text{if $p$ does not divide $m$,} \\ p & \text{if $p$ divides $m$.} \end{cases} \]

10. A formula discovered by Euler

We observed in §7 above that Theorem C4 indicates that the argument behaves somewhat
like a logarithm. A famous formula published by Leonhard Euler in 1748,

\[ e^{i\theta} = \cos\theta + i\sin\theta , \]

explains, or at least is consistent with, the logarithmic nature of the argument. In this notation
Theorem C4 says that
\[ e^{i\alpha} e^{i\beta} = e^{i(\alpha + \beta)} , \]
just as we would expect of the exponential function.
The exponential function may be introduced in many different ways. An important approach
is to say that $e^x$ is that function $f(x)$ which is its own derivative and for which $f(0) = 1$.
That such a function exists is not obvious, however. An alternative is to use a power series,
defining $e^x$ to be the infinite sum (once a sense can be attached to the idea of adding infinitely
many numbers together)
\[ 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \cdots . \]
Substituting $x = i\theta$ and separating the series into real and imaginary parts we find two power
series, recognisable as those for $\cos\theta$ and for $\sin\theta$ respectively, so that $e^{i\theta} = \cos\theta + i\sin\theta$.
Some of you may have seen this derivation before you arrived at Oxford. It begs rather a lot of
questions, though. In what sense can we add infinitely many numbers together? Is it sensible
to take a formula that is valid for real numbers and expect it to hold for complex numbers?
Certainly that is not always a good idea: for example, $x^2 \neq -1$ is a formula that is true for
all real numbers $x$, and yet it fails trivially in $\mathbb{C}$ since by definition $i^2 = -1$. One of the great
successes of 19th century mathematics was the development of answers to awkward questions
and criticisms such as these. By the end of the Prelim Analysis courses you will have acquired
an understanding of what might, if mathematicians had been unlucky, have gone wrong with
the reasoning above and how we can be sure that it leads in fact to perfectly valid results.
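Without settling any of those questions, one can at least observe numerically that finite partial sums of the power series, evaluated at $x = i\theta$, come very close to $\cos\theta + i\sin\theta$. In the sketch below the function name `exp_series`, the number of terms and the test angle are all assumptions made for illustration; this is evidence, not a proof.

```python
import math

def exp_series(x, terms=30):
    """Partial sum 1 + x/1! + x^2/2! + ... with the given number of terms."""
    total = 0
    term = 1                         # the term x^0 / 0!
    for n in range(terms):
        total += term
        term = term * x / (n + 1)    # turn x^n / n! into x^(n+1) / (n+1)!
    return total

theta = 1.2
approx = exp_series(1j * theta)      # partial sum at x = i*theta
target = complex(math.cos(theta), math.sin(theta))
print(abs(approx - target) < 1e-12)
```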
As a matter of fact, the reasoning sketched above is a little different from Euler’s. In
some ways he was even more inventive. Starting from the equation $(\sin z)^2 + (\cos z)^2 = 1$,
he took what was at the time the amazing step of introducing $\sqrt{-1}$ and factorising it as
$(\cos z + \sqrt{-1}\,\sin z)(\cos z - \sqrt{-1}\,\sin z) = 1$. This historical matter is of great interest to some
of us but is not relevant to the Prelim course, so although I’ll say a few more words about it
I’ll put them into an appendix.
Let’s take Euler’s formula for granted. Then for a complex number z we have

\[ z = r e^{i\theta} \quad \text{where } r := |z| \text{ and } \theta := \arg z . \]

Although this formula expresses nothing more than the polar form of z , it does so in a notation
that is particularly suggestive and particularly easy to use in calculations. Here is one:

Theorem C8. Let $z \in \mathbb{C}$ and let $n$ be a positive integer. Then $z$ has $n$th roots in $\mathbb{C}$. In
fact, if $z \neq 0$ then there are precisely $n$ distinct complex numbers $w$ such that $w^n = z$.

The reason is this. If $z = 0$ the assertion is trivial, so now suppose that $z \neq 0$. Write
$z = r e^{i\alpha}$ where $r := |z|$ and $\alpha := \arg z$. Then $r > 0$ so we know (do we? Yes: methods for a proof
occur quite early in Analysis I) that there is a unique positive real number $s$ (written $\sqrt[n]{r}$) such
that $s^n = r$. Now let $\beta := \alpha/n$ and let $w := s e^{i\beta}$. Then $w^n = s^n (e^{i\beta})^n = s^n e^{in\beta} = r e^{i\alpha} = z$.
This proves the existence of one $n$th root of $z$. If $\phi$ is any one of the $n$th roots of 1, of which
we know (from §9) that there are precisely $n$, then $(\phi w)^n = \phi^n w^n = w^n = z$: therefore $z$ has
at least $n$ distinct $n$th roots. On the other hand, if $u^n = z$ then $(u/w)^n = u^n/w^n = z/z = 1$,
so $u/w$ is an $n$th root of unity and therefore $u = \phi w$, where $\phi$ is one of the $n$th roots of 1:
therefore $z$ has precisely $n$ distinct $n$th roots.
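The construction in this argument translates fairly directly into a short computation: take the modulus and argument of $z$, form one root $w = s e^{i\beta}$, and obtain the others by adding multiples of $2\pi/n$ to the argument (which amounts to multiplying $w$ by the $n$th roots of 1). The sketch below is one possible rendering in Python; the function name `nth_roots` is an assumption, while `cmath.polar` and `cmath.rect` are the standard-library conversions to and from polar form.

```python
import cmath
import math

def nth_roots(z, n):
    """All complex numbers w with w**n == z (up to rounding), for a positive integer n."""
    if z == 0:
        return [0j]                            # 0 is the only nth root of 0
    r, alpha = cmath.polar(z)                  # r = |z|, alpha = principal value of arg z
    s = r ** (1.0 / n)                         # the positive real nth root of r
    # arguments (alpha + 2k*pi)/n for k = 0, ..., n - 1 give the n distinct roots
    return [cmath.rect(s, (alpha + 2 * math.pi * k) / n) for k in range(n)]

roots = nth_roots(-8, 3)                       # the three cube roots of -8
print(roots)
```

Up to rounding, the three values returned are $1 + i\sqrt{3}$, $-2$ and $1 - i\sqrt{3}$.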

Exercise 10.1. For n := 2, 3, 4, 5 locate the nth roots of 16i on an Argand diagram.

11. Polynomial equations


Early in our acquaintance with algebra we learn that if $ax^2 + bx + c = 0$ and $a \neq 0$ then
\[ x = \frac{-b \pm \sqrt{D}}{2a} \quad \text{where } D := b^2 - 4ac. \]
When we were young, and when mathematics was several centuries more primitive (or innocent?)
than it is now, the number $D$ (known as the discriminant of the equation) was expected
to be positive. Once a square root $i$ of $-1$ had entered the scene every quadratic equation with
real coefficients turned out to have two roots in $\mathbb{C}$. Well, not quite. Only one root if $D = 0$.
But in that case the equation may be rewritten as $a(x + b/2a)^2 = 0$ and its single root $-b/2a$
may be deemed to be a root of multiplicity 2, so that in this sense it again has two roots.
Those were real quadratic equations. We now have more numbers available to serve as
coefficients. The algebraic manipulations involved in completing the square to find the classic
formula for the roots of a quadratic equation work just as well in $\mathbb{C}$ as they do in $\mathbb{R}$. Moreover,
by Theorem C8 every non-zero complex number $D$ has precisely 2 distinct square roots. Thus:

Theorem C9. Every quadratic equation with complex coefficients has either 2 distinct roots
in $\mathbb{C}$ or a single root with multiplicity 2.

Exercise 11.1. Find the square roots of $-7 + 24i$.
Now solve the equation $z^2 - (2 + 2i)z + (7 - 22i) = 0$.

By Theorem C8, if $c \neq 0$ then the polynomial equation $x^n = c$ has precisely $n$ roots in $\mathbb{C}$.
And if $c = 0$ then we think of the equation as having 0 as a root of multiplicity $n$. In this
sense the equation always has $n$ roots, counting multiplicities. Consider the problem of solving
the general polynomial equation of degree $n$,

\[ a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-2} x^2 + a_{n-1} x + a_n = 0 . \]

It is assumed that $a_0 \neq 0$ since otherwise the equation would be of lower degree. It is also now
assumed that $a_i \in \mathbb{C}$ for $0 \leq i \leq n$. Dividing through by $a_0$ does not alter the roots and so
we treat equations $f(x) = 0$ where $f(x)$ is a so-called monic polynomial,

\[ f(x) = x^n + c_1 x^{n-1} + \cdots + c_{n-2} x^2 + c_{n-1} x + c_n \]

with complex coefficients and leading coefficient equal to 1. The Factor Theorem (an elementary
application of the Division Algorithm for polynomials with complex coefficients) states
that if $u \in \mathbb{C}$ and $f(u) = 0$ then there is a polynomial $g(x)$ such that $f(x) = (x - u)g(x)$.
Applying the rules for multiplying polynomials we see that $g(x)$ is monic and of degree $n - 1$.
It follows easily that if $f(x) = 0$ has distinct roots $u_1, u_2, \ldots, u_m$ in $\mathbb{C}$ then there exists a
monic polynomial $g_m(x)$ of degree $n - m$ that has complex coefficients and for which

\[ f(x) = (x - u_1)(x - u_2) \cdots (x - u_m)\, g_m(x) . \]

Consequently $n \geq m$ and the equation $f(x) = 0$ of degree $n$ has at most $n$ roots in $\mathbb{C}$.
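The Factor Theorem can be made concrete with synthetic division, one standard way of carrying out the Division Algorithm when the divisor is $x - u$. The sketch below is an illustration under assumptions: the function name `divide_by_linear` and the convention of listing coefficients from the highest degree downwards are choices made here, not notation from these notes.

```python
def divide_by_linear(coeffs, u):
    """Divide f(x) by (x - u) using synthetic division.

    coeffs lists the coefficients of f from the highest degree down.
    Returns (coefficients of the quotient g, remainder); the remainder equals f(u).
    """
    values = []
    carry = 0
    for c in coeffs:
        carry = carry * u + c        # Horner's rule, keeping the intermediate values
        values.append(carry)
    remainder = values.pop()         # the last value is f(u)
    return values, remainder

# x^3 - 1 = (x - 1)(x^2 + x + 1): remainder 0 because 1 is a root
quotient, remainder = divide_by_linear([1, 0, 0, -1], 1)
print(quotient, remainder)           # [1, 1, 1] 0
```

The same function works with complex inputs: dividing $x^2 + 1$ by $x - i$ via `divide_by_linear([1, 0, 1], 1j)` gives the quotient $x + i$ with remainder 0. In each case the quotient is monic of degree one less, as the text asserts.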
So far so good. Nothing deeper than (advanced) High School mathematics here. This fact,
that a polynomial equation of degree n can have at most n roots, was seen, if through a
glass darkly, by Thomas Harriot in about 1605, by Albert Girard (1629), by René Descartes
(1637), by Isaac Newton (probably about 1667 but not published until 1707), and by many
other mathematicians in clearer and clearer form as the years went by∗ .
The belief that a polynomial equation of degree $n$ does have $n$ complex roots if multiplicities
are taken into account came fifty years later. It gradually grew from about the middle of
the 17th century. Its modern form could not properly be formulated until complex numbers
were introduced in the mid-18th century and became reasonably widely accepted early in the
19th century. Nevertheless, it was understood quite a long time before that as the assertion

(A): any polynomial with real coefficients may be expressed as a product of
linear and quadratic factors (with real coefficients).

In this form there were many attempted proofs, none of them sufficiently convincing. The following
result, however, with a remarkable number of reasonably convincing proofs (C. F. Gauss,
for example, published his first in 1799 and his fourth in 1849, fifty years later), is now a little
over 200 years old:

Theorem C10. When multiplicities are taken into account every complex polynomial of
degree $n$ has exactly $n$ roots. That is, if $f(x)$ is a monic polynomial $x^n + c_1 x^{n-1} + \cdots + c_{n-1} x + c_n$
with coefficients in $\mathbb{C}$ then there exist $u_1, u_2, \ldots, u_n \in \mathbb{C}$ (not necessarily distinct) such that
$f(x) = (x - u_1)(x - u_2) \cdots (x - u_n)$.

Although this is called The Fundamental Theorem of Algebra its natural placement
within mathematics lies in Analysis or Topology. Its proofs require ideas that lie a
long way beyond school mathematics, and somewhat beyond first-year university courses. A
very attractive proof will emerge from theorems proved in the Oxford second-year Complex
Analysis course. You may come across other proofs in courses on Topology. For your first year
at Oxford, however, we want you to take it on trust. That is, we want you to appreciate what
it says and believe it (provisionally, until you are in a position to understand a proof), but use
it only when you cannot make progress without it.

∗ For detail and sources see for example Jacqueline Stedall, Mathematics emerging: a Sourcebook
1540–1900, OUP Oxford 2008, Chapter 12, or Jacqueline Stedall, From Cardano’s great art to Lagrange’s
reflections: filling a gap in the history of algebra. European Mathematical Society, Zürich 2011.

The key to the connection between assertion (A) and the Fundamental Theorem of Algebra
is the following elementary fact.

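Theorem C11. Let $f(x)$ be a polynomial with real coefficients. If $u \in \mathbb{C}$ and $f(u) = 0$
then also $f(\overline{u}) = 0$.

The reason is this. Write $f(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-2} x^2 + a_{n-1} x + a_n$ where $a_m \in \mathbb{R}$
for $0 \leq m \leq n$. Using Theorem C1 and induction on the number of summands we find that

\[ \overline{f(u)} = \overline{a_0}\,\overline{u}^{\,n} + \overline{a_1}\,\overline{u}^{\,n-1} + \cdots + \overline{a_{n-2}}\,\overline{u}^{\,2} + \overline{a_{n-1}}\,\overline{u} + \overline{a_n} . \]

By assumption, however, $\overline{a_m} = a_m$ for $0 \leq m \leq n$ and so $\overline{f(u)} = f(\overline{u})$. Thus $f(u) = 0$ if and
only if $f(\overline{u}) = 0$.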

What Theorem C11 means is that the non-real roots of a polynomial with real coefficients
occur in complex conjugate pairs. Now if $u \in \mathbb{C}$ and $u \notin \mathbb{R}$, then $(x - u)(x - \overline{u})$ is the real
quadratic polynomial $x^2 - 2\,\mathrm{Re}(u)\,x + |u|^2$. It follows from the Fundamental Theorem of Algebra
that assertion (A) is true. As was mentioned above, this fact was appreciated (if not proved) in
ever clearer form from about the middle of the 17th century. By some time in the 18th century
(I do not know quite when) it was also realised that Theorem C10 is actually equivalent to
assertion (A) even although the former appears at first sight to be much stronger.
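The pairing of conjugate roots into a real quadratic factor, $(x - u)(x - \overline{u}) = x^2 - 2\,\mathrm{Re}(u)\,x + |u|^2$, can be illustrated with a short computation. In the Python sketch below the function names `real_quadratic_from_root` and `evaluate`, and the sample value of `u`, are assumptions made for this example.

```python
def real_quadratic_from_root(u):
    """Coefficients (1, p, q) of (x - u)(x - conj(u)) = x^2 - 2 Re(u) x + |u|^2."""
    return (1.0, -2.0 * u.real, u.real ** 2 + u.imag ** 2)

def evaluate(coeffs, x):
    """Evaluate a polynomial, coefficients listed from the highest degree down."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

u = complex(1, 2)                        # a non-real number, u = 1 + 2i
coeffs = real_quadratic_from_root(u)     # x^2 - 2x + 5, with real coefficients
print(coeffs, evaluate(coeffs, u), evaluate(coeffs, u.conjugate()))
```

Both $u$ and $\overline{u}$ are roots of the resulting real quadratic, in line with Theorem C11.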

Exercise 11.2. Let (A) be the assertion that any polynomial with real coefficients may
be expressed as a product of linear and quadratic factors (with real coefficients). Let (B) be
the assertion that any polynomial with complex coefficients may be expressed as a product of
linear factors (with complex coefficients). Write down straightforward proofs that if (A) is true
then (B) must also be true, and that if (B) is true then (A) must also be true.

Exercise 11.3. Show that $\sigma^8 = -1$ if and only if $\sigma$ is a primitive 16th root of 1. Find the
roots of the equation $x^8 = -1$, and use them to write $x^8 + 1$ as the product of four quadratic
factors with real coefficients.

Acknowledgements: I am grateful to Professor Frances Kirwan and her predecessors as
lecturers on Complex Numbers for many ideas, for the majority of the questions in the exercise
sheet, and for the suggestion that some historical comments should be included. I thank my
wife Sylvia for her critical reading and careful proofreading of early drafts of these notes and
of the accompanying exercise material.

Appendix: Notes on Euler’s formula (1748)
Note 1. Leonhard Euler (b. 1707 in Basel, Switzerland; d. 1783 in St Petersburg, Russia)
was a wonderfully inventive and prolific mathematician, and a great teacher. There is a commonly
held belief that the equation $e^{i\pi} = -1$ is something that Euler was proud of, as he
might well have been, except that he does not seem to have noticed it. The general formula
$e^{i\theta} = \cos\theta + i\sin\theta$ was a discovery that he published in 1748 but he never focussed on the
special case $\theta = \pi$.

Note 2. Euler’s argument starts from the description of $e^z$ as $\left(1 + \dfrac{z}{i}\right)^i$ where $i$ is an
infinite magnitude; in present times this is written $e^z = \lim_{n \to \infty} \left(1 + \dfrac{z}{n}\right)^n$. From this he moved
neatly to power series for $e^z$, $\cos z$ and $\sin z$, but his argument is nevertheless not quite what
I have sketched in Section 10 above.

These images come from the bottom of p. 103 and the top of p. 104 of Euler’s Introductio in
analysin infinitorum (Lausanne, 1748). Although the language is Latin and the typography is
different from modern usage, at the end of §138 the formulae $e^{+v\sqrt{-1}} = \cos v + \sqrt{-1}\,.\sin v$ and
$e^{-v\sqrt{-1}} = \cos v - \sqrt{-1}\,.\sin v$ can easily be recognised.

ΠMN: Queen’s: 30.ix.2017
