
Ordinary Differential Equations

Peter Philip
Lecture Notes
Originally Created for the Class of Spring Semester 2012 at LMU Munich,
Revised and Extended for the Classes of Spring Semesters 2013 and 2014

July 22, 2014

E-Mail: philip@math.lmu.de

Contents

1 Basic Notions
  1.1 Types and First Examples
  1.2 Equivalent Integral Equation
  1.3 Patching and Time Reversion

2 Elementary Solution Methods
  2.1 Geometric Interpretation, Graphing
  2.2 Linear ODE, Variation of Constants
  2.3 Separation of Variables
  2.4 Change of Variables

3 General Theory
  3.1 Equivalence Between Higher-Order ODE and Systems of First-Order ODE
  3.2 Existence of Solutions
  3.3 Uniqueness of Solutions
  3.4 Extension of Solutions, Maximal Solutions
  3.5 Continuity in Initial Conditions

4 Linear ODE
  4.1 Definition, Setting
  4.2 Gronwall's Inequality
  4.3 Existence, Uniqueness, Vector Space of Solutions
  4.4 Fundamental Matrix Solutions and Variation of Constants
  4.5 Higher-Order, Wronskian
  4.6 Constant Coefficients
    4.6.1 Linear ODE of Higher Order
    4.6.2 Systems of First-Order Linear ODE

5 Stability
  5.1 Qualitative Theory, Phase Portraits
  5.2 Stability at Fixed Points
  5.3 Constant Coefficients
  5.4 Linearization
  5.5 Limit Sets

A Differentiability

B $\mathbb{K}^n$-Valued Integration
  B.1 $\mathbb{K}^n$-Valued Riemann Integral
  B.2 $\mathbb{K}^n$-Valued Lebesgue Integral

C Metric Spaces
  C.1 Distance in Metric Spaces
  C.2 Compactness in Metric Spaces

D Local Lipschitz Continuity

E Maximal Solutions on Nonopen Intervals

F Paths in $\mathbb{R}^n$

G Operator Norms and Matrix Norms

H The Vandermonde Determinant

I Matrix-Valued Functions
  I.1 Product Rule
  I.2 Integration and Matrix Multiplication Commute

J Autonomous ODE
  J.1 Equivalence Between Autonomous and Nonautonomous ODE
  J.2 Integral for ODE with Discontinuous Right-Hand Side

K Polar Coordinates

References

1 Basic Notions

1.1 Types of Ordinary Differential Equations (ODE) and First Examples

A differential equation is an equation for some unknown function, involving one or more derivatives of the unknown function. Here are some first examples:

$$y' = y, \tag{1.1a}$$

$$y^{(5)} = (y')^2 + x, \tag{1.1b}$$

$$(y')^2 = c, \tag{1.1c}$$

$$\partial_t x = e^{2it} x^2, \tag{1.1d}$$

$$\partial_t^2 x = 3x + \begin{pmatrix} -1 \\ 1 \end{pmatrix}. \tag{1.1e}$$

One distinguishes between ordinary differential equations (ODE) and partial differential
equations (PDE). While ODE contain only derivatives with respect to one variable, PDE
can contain (partial) derivatives with respect to several different variables. In general,
PDE are much harder to solve than ODE. The equations in (1.1) all are ODE, and only
ODE are the subject of this class. We will see precise definitions shortly, but we can
already use the examples in (1.1) to get some first exposure to important ODE-related
terms and to discuss related issues.
As in (1.1), the notation for the unknown function varies in the literature, where the two variants presented in (1.1) are probably the most common ones: In the first three equations of (1.1), the unknown function is denoted $y$, usually assumed to depend on a variable denoted $x$, i.e. $x \mapsto y(x)$. In the last two equations of (1.1), the unknown function is denoted $x$, usually assumed to depend on a variable denoted $t$, i.e. $t \mapsto x(t)$. So one has to use some care due to the different roles of the symbol $x$. The notation $t \mapsto x(t)$ is typically favored in situations arising from physics applications, where $t$ represents time. In this class, we will mostly use the notation $x \mapsto y(x)$.

There is another, in a way slightly more serious, notational issue that one commonly encounters when dealing with ODE: Strictly speaking, the notation in (1.1b) and (1.1d) is not entirely correct, as functions and function arguments are not properly distinguished. Correctly written, (1.1b) and (1.1d) read

$$\forall_{x \in D(y)}: \quad y^{(5)}(x) = \big(y'(x)\big)^2 + x, \tag{1.2a}$$

$$\forall_{t \in D(x)}: \quad (\partial_t x)(t) = e^{2it} \big(x(t)\big)^2, \tag{1.2b}$$

where $D(y)$ and $D(x)$ denote the respective domains of the functions $y$ and $x$. However, one might also notice that the notation in (1.2) is more cumbersome and, perhaps, harder to read. In any case, the type of slight abuse of notation present in (1.1b) and (1.1d) is so common in the literature that one will have to live with it.

One speaks of first-order ODE if the equations involve only first derivatives, such as in (1.1a), (1.1c), and (1.1d). Otherwise, one speaks of higher-order ODE, where the precise order is given by the highest derivative occurring in the equation, such that (1.1b) is an ODE of fifth order and (1.1e) is an ODE of second order. We will see later in Th. 3.1 that ODE of higher order can be equivalently formulated and solved as systems of ODE of first order, where systems of ODE obviously consist of several ODE to be solved simultaneously. Such a system of ODE can, equivalently, be interpreted as a single ODE in higher dimensions: For instance, (1.1e) can be seen as a single two-dimensional ODE of second order or as the system

$$\partial_t^2 x_1 = 3x_1 - 1, \tag{1.3a}$$

$$\partial_t^2 x_2 = 3x_2 + 1 \tag{1.3b}$$

of two one-dimensional ODE of second order.


One calls an ODE explicit if it has been solved explicitly for the highest-order derivative, otherwise implicit. Thus, in (1.1), all ODE except (1.1c) are explicit. In general, explicit ODE are much easier to solve than implicit ODE (which include, e.g., so-called differential-algebraic equations, cf. Ex. 1.4(g) below), and we will mostly consider explicit ODE in this class.

As the reader might already have noticed, without further information, none of the equations in (1.1) makes much sense. Every function, in particular every function solving an ODE, needs a set as the domain where it is defined, and a set as the range it maps into. Thus, for each ODE, one needs to specify the admissible domains as well as the range of the unknown function. For an ODE, one usually requires a solution to be defined on a nontrivial (bounded or unbounded) interval $I \subseteq \mathbb{R}$. Prescribing the possible range of the solution is an integral part of setting up an ODE, and it often completely determines the ODE's meaning and/or its solvability. For example, for (1.1d), (a subset of) $\mathbb{C}$ is a reasonable range. Similarly, for (1.1a)-(1.1c), one can require the range to be either $\mathbb{R}$ or $\mathbb{C}$, where requiring range $\mathbb{R}$ for (1.1c) immediately implies there is no solution for $c < 0$. However, one can also specify (a subset of) $\mathbb{R}^n$ or $\mathbb{C}^n$, $n > 1$, as range for (1.1a), turning the ODE into an $n$-dimensional ODE (or a system of ODE), where $y$ now has $n$ components $(y_1, \dots, y_n)$ (note that, except in cases where we are dealing with matrix multiplications, we sometimes denote elements of $\mathbb{R}^n$ as columns and sometimes as rows, switching back and forth without too much care). A reasonable range for (1.1e) is (a subset of) $\mathbb{R}^2$ or $\mathbb{C}^2$.

One of the important goals regarding ODE is to find conditions under which one can guarantee the existence of solutions. Moreover, if possible, one would like to find conditions that guarantee the existence of a unique solution. Clearly, for each $a \in \mathbb{R}$, the function $y: \mathbb{R} \to \mathbb{R}$, $y(x) = a\,e^x$, is a solution to (1.1a), showing one cannot expect uniqueness without specifying further requirements. The most common additional conditions that often (but not always) guarantee a unique solution are initial conditions (e.g. requiring $y(x_0) = y_0$ with $x_0, y_0$ given) or boundary conditions (e.g. requiring $y(a) = y_a$, $y(b) = y_b$ for $y: [a, b] \to \mathbb{C}^n$ with $y_a, y_b \in \mathbb{C}^n$ given).

Let us now proceed to mathematically precise definitions of the abovementioned notions.

Notation 1.1. We will write $\mathbb{K}$ in situations where we allow $\mathbb{K}$ to be $\mathbb{R}$ or $\mathbb{C}$.

Definition 1.2. Let $k, n \in \mathbb{N}$.

(a) Given $U \subseteq \mathbb{R} \times \mathbb{K}^{(k+1)n}$ and $F: U \to \mathbb{K}^n$, call

$$F(x, y, y', \dots, y^{(k)}) = 0 \tag{1.4}$$

an implicit ODE of $k$th order. A solution to this ODE is a $k$ times differentiable function

$$\phi: I \to \mathbb{K}^n, \tag{1.5}$$

defined on a nontrivial (bounded or unbounded, open or closed or half-open) interval $I \subseteq \mathbb{R}$, satisfying the two conditions

(i) $\big\{\big(x, \phi(x), \phi'(x), \dots, \phi^{(k)}(x)\big) \in I \times \mathbb{K}^{(k+1)n} : x \in I\big\} \subseteq U$.

(ii) $F\big(x, \phi(x), \phi'(x), \dots, \phi^{(k)}(x)\big) = 0$ for each $x \in I$.

Note that condition (i) is necessary so that one can even formulate condition (ii).

(b) Given $G \subseteq \mathbb{R} \times \mathbb{K}^{kn}$ and $f: G \to \mathbb{K}^n$, call

$$y^{(k)} = f(x, y, y', \dots, y^{(k-1)}) \tag{1.6}$$

an explicit ODE of $k$th order. A solution to this ODE is a $k$ times differentiable function $\phi$ as in (1.5), defined on a nontrivial (bounded or unbounded, open or closed or half-open) interval $I \subseteq \mathbb{R}$, satisfying the two conditions

(i) $\big\{\big(x, \phi(x), \phi'(x), \dots, \phi^{(k-1)}(x)\big) \in I \times \mathbb{K}^{kn} : x \in I\big\} \subseteq G$.

(ii) $\phi^{(k)}(x) = f\big(x, \phi(x), \phi'(x), \dots, \phi^{(k-1)}(x)\big)$ for each $x \in I$.

Again, note that condition (i) is necessary so that one can even formulate condition (ii). Also note that $\phi$ is a solution to (1.6) if, and only if, $\phi$ is a solution to the equivalent implicit ODE $y^{(k)} - f(x, y, y', \dots, y^{(k-1)}) = 0$.
Definition 1.3. Let $k, n \in \mathbb{N}$.

(a) An initial value problem for (1.4) (resp. for (1.6)) consists of the ODE (1.4) (resp. of the ODE (1.6)) plus the initial condition

$$\forall_{j = 0, \dots, k-1}: \quad y^{(j)}(x_0) = y_{0,j} \tag{1.7}$$

with given $x_0 \in \mathbb{R}$ and $y_{0,0}, \dots, y_{0,k-1} \in \mathbb{K}^n$. A solution to the initial value problem is a $k$ times differentiable function $\phi$ as in (1.5) that is a solution to the ODE and that also satisfies (1.7) (with $y$ replaced by $\phi$); in particular, this requires $x_0 \in I$.

(b) A boundary value problem for (1.4) (resp. for (1.6)) consists of the ODE (1.4) (resp. of the ODE (1.6)) plus the boundary condition

$$\forall_{j \in J_a}: \; y^{(j)}(a) = y_{a,j} \qquad \text{and} \qquad \forall_{j \in J_b}: \; y^{(j)}(b) = y_{b,j} \tag{1.8}$$

with given $a, b \in \mathbb{R}$, $a < b$; $J_a, J_b \subseteq \{0, \dots, k-1\}$; $y_{a,j} \in \mathbb{K}^n$ for each $j \in J_a$, and $y_{b,j} \in \mathbb{K}^n$ for each $j \in J_b$. A solution to the boundary value problem is a $k$ times differentiable function $\phi$ as in (1.5) that is a solution to the ODE and that also satisfies (1.8) (with $y$ replaced by $\phi$); in particular, this requires $[a, b] \subseteq I$.

Under suitable hypotheses, initial and boundary value problems for ODE have unique
solutions (for initial value problems, we will see some rather general results in Cor. 3.10
and Cor. 3.16 below). However, in general, they can have infinitely many solutions or
no solutions, as shown by Examples 1.4(b),(c),(e) below.
Example 1.4. (a) Let $k \in \mathbb{N}$. The function $\phi: \mathbb{R} \to \mathbb{K}$, $\phi(x) = a\,e^x$, $a \in \mathbb{K}$, is a solution to the $k$th-order explicit initial value problem

$$y^{(k)} = y, \tag{1.9a}$$

$$\forall_{j = 0, \dots, k-1}: \quad y^{(j)}(0) = a. \tag{1.9b}$$

We will see later (e.g., as a consequence of Th. 4.8 combined with Th. 3.1) that $\phi$ is the unique solution to (1.9) on $\mathbb{R}$.

(b) Consider the one-dimensional explicit first-order initial value problem

$$y' = \sqrt{|y|}, \tag{1.10a}$$

$$y(0) = 0. \tag{1.10b}$$

Then, for every $c \geq 0$, the function

$$\phi_c: \mathbb{R} \to \mathbb{R}, \quad \phi_c(x) := \begin{cases} 0 & \text{for } x \leq c, \\ \frac{(x-c)^2}{4} & \text{for } x \geq c, \end{cases} \tag{1.11}$$

is a solution to (1.10): Clearly, $\phi_c(0) = 0$, $\phi_c$ is differentiable, and

$$\phi_c': \mathbb{R} \to \mathbb{R}, \quad \phi_c'(x) := \begin{cases} 0 & \text{for } x \leq c, \\ \frac{x-c}{2} & \text{for } x \geq c, \end{cases} \tag{1.12}$$

solving the ODE. Thus, (1.10) is an example of an initial value problem with uncountably many different solutions, all defined on the same domain.
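The non-uniqueness can also be observed numerically. The following Python snippet is a sketch added for illustration (not part of the original notes): it evaluates several members of the family (1.11) and checks that each satisfies (1.10a) up to the discretization error of a numerical derivative.

```python
import numpy as np

def phi(x, c):
    """The candidate solution phi_c from (1.11): 0 for x <= c, (x-c)^2/4 for x >= c."""
    return np.where(x <= c, 0.0, (x - c) ** 2 / 4.0)

x = np.linspace(-2.0, 5.0, 2001)
for c in [0.0, 1.0, 3.0]:
    y = phi(x, c)
    dy = np.gradient(y, x)  # numerical derivative of phi_c
    # phi_c(0) = 0 for every c >= 0, and phi_c' should agree with sqrt(|phi_c|)
    residual = np.max(np.abs(dy - np.sqrt(np.abs(y))))
    print(f"c = {c}: max residual of y' = sqrt(|y|) is {residual:.1e}")
```

All three functions satisfy the same initial value problem (the residual is of the order of the grid spacing, stemming from the kink of the numerical derivative at $x = c$).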

(c) As mentioned before, the one-dimensional implicit first-order ODE (1.1c) has no real-valued solution for $c < 0$. For $c \geq 0$, every function $\phi: \mathbb{R} \to \mathbb{R}$, $\phi(x) := a \pm x\sqrt{c}$, $a \in \mathbb{R}$, is a solution to (1.1c). Moreover, for $c < 0$, every function $\phi: \mathbb{R} \to \mathbb{C}$, $\phi(x) := a \pm xi\sqrt{-c}$, $a \in \mathbb{C}$, is a $\mathbb{C}$-valued solution to (1.1c). The one-dimensional implicit first-order ODE

$$e^{y'} = 0 \tag{1.13}$$

is an example of an ODE that does not even have a $\mathbb{C}$-valued solution. It is an exercise to find $f: \mathbb{R} \to \mathbb{R}$ such that the explicit ODE $y' = f(x)$ has no solution.

(d) Let $n \in \mathbb{N}$ and let $a, c \in \mathbb{K}^n$. Then, on $\mathbb{R}$, the function

$$\phi: \mathbb{R} \to \mathbb{K}^n, \quad \phi(x) := c + xa, \tag{1.14}$$

is the unique solution to the $n$-dimensional explicit first-order initial value problem

$$y' = a, \tag{1.15a}$$

$$y(0) = c. \tag{1.15b}$$

This situation is a special case of Ex. 1.6 below.

(e) Let $a, b \in \mathbb{R}$, $a < b$. We will see later in Example 4.12 that, on $[a, b]$, the 1-dimensional explicit second-order ODE

$$y'' = -y \tag{1.16}$$

has precisely the set of solutions

$$L = \big\{ (c_1 \sin + c_2 \cos): [a, b] \to \mathbb{K} \; : \; c_1, c_2 \in \mathbb{K} \big\}. \tag{1.17}$$

In consequence, the boundary value problem

$$y(0) = 0, \quad y(\pi/2) = 1, \tag{1.18a}$$

for (1.16) has the unique solution $\phi: [0, \pi/2] \to \mathbb{K}$, $\phi(x) := \sin x$ (using (1.18a) and (1.17) implies $c_2 = 0$ and $c_1 = 1$); the boundary value problem

$$y(0) = 0, \quad y(\pi) = 0, \tag{1.18b}$$

for (1.16) has the infinitely many different solutions $\phi_c: [0, \pi] \to \mathbb{K}$, $\phi_c(x) := c \sin x$, $c \in \mathbb{K}$; and the boundary value problem

$$y(0) = 0, \quad y(\pi) = 1, \tag{1.18c}$$

for (1.16) has no solution (using (1.18c) and (1.17) implies the contradictory requirements $c_2 = 0$ and $-c_2 = 1$).

(f) Consider

$$F: \mathbb{R} \times \mathbb{K}^2 \times \mathbb{K}^2 \to \mathbb{K}^2, \quad F\left(x, \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}, \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}\right) := \begin{pmatrix} z_2 \\ z_2 - 1 \end{pmatrix}. \tag{1.19a}$$

Clearly, the implicit $\mathbb{K}^2$-valued ODE

$$F(x, y, y') = \begin{pmatrix} y_2' \\ y_2' - 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \tag{1.19b}$$

has no solution on any nontrivial interval.

(g) Consider

$$F: \mathbb{R} \times \mathbb{C}^3 \times \mathbb{C}^3 \to \mathbb{C}^3, \quad F\left(x, \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}, \begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}\right) := \begin{pmatrix} z_2 + iy_3 - 2i \\ z_1 + y_2 - x^2 \\ y_1 - ie^{ix} \end{pmatrix}. \tag{1.20a}$$

It is an exercise to show the $\mathbb{C}^3$-valued implicit ODE

$$F(x, y, y') = \begin{pmatrix} y_2' + iy_3 - 2i \\ y_1' + y_2 - x^2 \\ y_1 - ie^{ix} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \tag{1.20b}$$

has a unique solution on $\mathbb{R}$ (note that, here, we do not need to provide initial or boundary conditions to obtain uniqueness). The implicit ODE (1.20b) is an example of a differential-algebraic equation since, read in components, only its first two equations contain derivatives, whereas its third equation is purely algebraic.

1.2 Equivalent Integral Equation

It is often useful to rewrite a first-order explicit initial value problem as an equivalent integral equation. We provide the details of this equivalence in the following theorem:

Theorem 1.5. If $G \subseteq \mathbb{R} \times \mathbb{K}^n$, $n \in \mathbb{N}$, and $f: G \to \mathbb{K}^n$ is continuous, then, for each $(x_0, y_0) \in G$, the explicit $n$-dimensional first-order initial value problem

$$y' = f(x, y), \tag{1.21a}$$

$$y(x_0) = y_0, \tag{1.21b}$$

is equivalent to the integral equation

$$y(x) = y_0 + \int_{x_0}^x f\big(t, y(t)\big) \, dt, \tag{1.22}$$

in the sense that a continuous function $\phi: I \to \mathbb{K}^n$, with $x_0 \in I \subseteq \mathbb{R}$ being a nontrivial interval, and satisfying

$$\big\{\big(x, \phi(x)\big) \in I \times \mathbb{K}^n : x \in I\big\} \subseteq G, \tag{1.23}$$

is a solution to (1.21) in the sense of Def. 1.3(a) if, and only if,

$$\forall_{x \in I}: \quad \phi(x) = y_0 + \int_{x_0}^x f\big(t, \phi(t)\big) \, dt, \tag{1.24}$$

i.e. if, and only if, $\phi$ is a solution to the integral equation (1.22).
Proof. Assume $I \subseteq \mathbb{R}$ with $x_0 \in I$ to be a nontrivial interval and $\phi: I \to \mathbb{K}^n$ to be a continuous function satisfying (1.23). If $\phi$ is a solution to (1.21), then $\phi$ is differentiable and the assumed continuity of $f$ implies the continuity of $\phi'$. In other words, each component $\phi_j$ of $\phi$, $j \in \{1, \dots, n\}$, is in $C^1(I, \mathbb{K})$. Thus, the fundamental theorem of calculus [Phi13a, Th. G.6(b)] applies, and [Phi13a, (G.16b)] yields

$$\forall_{j \in \{1, \dots, n\}} \; \forall_{x \in I}: \quad \phi_j(x) = \phi_j(x_0) + \int_{x_0}^x f_j\big(t, \phi(t)\big) \, dt \overset{(1.21b)}{=} y_{0,j} + \int_{x_0}^x f_j\big(t, \phi(t)\big) \, dt, \tag{1.25}$$

proving $\phi$ satisfies (1.24). Conversely, if $\phi$ satisfies (1.24), then the validity of the initial condition (1.21b) is immediate. Moreover, as $f$ and $\phi$ are continuous, so is the integrand function $t \mapsto f\big(t, \phi(t)\big)$ of (1.24). Thus, [Phi13a, Th. G.6(a)] applies to (the components of) $\phi$, proving $\phi'(x) = f\big(x, \phi(x)\big)$ for each $x \in I$, proving $\phi$ is a solution to (1.21). $\square$

Example 1.6. Consider the situation of Th. 1.5. In the particularly simple special case where $f$ does not actually depend on $y$, but merely on $x$, the equivalence between (1.21) and (1.22) can be directly exploited to actually solve the initial value problem: If $f: I \to \mathbb{K}^n$, where $I \subseteq \mathbb{R}$ is some nontrivial interval with $x_0 \in I$, then we obtain $\phi: I \to \mathbb{K}^n$ to be a solution of (1.21) if, and only if,

$$\forall_{x \in I}: \quad \phi(x) = y_0 + \int_{x_0}^x f(t) \, dt, \tag{1.26}$$

i.e. if, and only if, $\phi$ is the antiderivative of $f$ that satisfies the initial condition. In particular, in the present situation, $\phi$ as given by (1.26) is the unique solution to the initial value problem. Of course, depending on $f$, it can still be difficult to carry out the integral in (1.26).
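If the integral cannot be carried out in closed form, numerical quadrature still applies; the following is an added sketch (not part of the notes; the test function $f(t) = \cos t$, the data $x_0 = 0$, $y_0 = 1$, and the use of scipy.integrate.quad are illustrative assumptions):

```python
import numpy as np
from scipy.integrate import quad

def solve_by_quadrature(f, x0, y0, xs):
    """Evaluate (1.26): phi(x) = y0 + integral from x0 to x of f(t) dt."""
    return np.array([y0 + quad(f, x0, x)[0] for x in xs])

# Test with f(t) = cos(t); the exact solution is phi(x) = 1 + sin(x).
xs = np.linspace(0.0, 3.0, 7)
phi = solve_by_quadrature(np.cos, 0.0, 1.0, xs)
print(np.max(np.abs(phi - (1.0 + np.sin(xs)))))  # error near machine precision
```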

1.3 Patching and Time Reversion

If solutions defined on different intervals fit together, then they can be patched to obtain a solution on the union of the two intervals:

Lemma 1.7 (Patching of Solutions). Let $k, n \in \mathbb{N}$. Given $G \subseteq \mathbb{R} \times \mathbb{K}^{kn}$ and $f: G \to \mathbb{K}^n$, if $\phi: I \to \mathbb{K}^n$ and $\psi: J \to \mathbb{K}^n$ are both solutions to (1.6), i.e. to

$$y^{(k)} = f(x, y, y', \dots, y^{(k-1)}),$$

such that $I = \,]a, b]$, $J = [b, c[$, $a < b < c$, and such that

$$\forall_{j = 0, \dots, k-1}: \quad \phi^{(j)}(b) = \psi^{(j)}(b), \tag{1.27}$$

then

$$\chi: I \cup J \to \mathbb{K}^n, \quad \chi(x) := \begin{cases} \phi(x) & \text{for } x \in I, \\ \psi(x) & \text{for } x \in J, \end{cases} \tag{1.28}$$

is also a solution to (1.6).


Proof. Since $\phi$ and $\psi$ both are solutions to (1.6),

$$\big\{\big(x, \chi(x), \chi'(x), \dots, \chi^{(k-1)}(x)\big) \in (I \cup J) \times \mathbb{K}^{kn} : x \in I \cup J\big\} \subseteq G \tag{1.29}$$

must hold, where (1.27) guarantees that $\chi^{(j)}(b)$ exists for each $j = 0, \dots, k-1$. Moreover, $\chi$ is $k$ times differentiable at each $x \in I \cup J$, $x \neq b$, and

$$\forall_{x \in I \cup J, \, x \neq b}: \quad \chi^{(k)}(x) = f\big(x, \chi(x), \chi'(x), \dots, \chi^{(k-1)}(x)\big). \tag{1.30}$$

However, at $b$, we also have (using the left-hand derivatives for $\phi$ and the right-hand derivatives for $\psi$)

$$\phi^{(k)}(b) = f\big(b, \phi(b), \phi'(b), \dots, \phi^{(k-1)}(b)\big) = f\big(b, \psi(b), \psi'(b), \dots, \psi^{(k-1)}(b)\big) = \psi^{(k)}(b), \tag{1.31}$$

which shows $\chi$ is $k$ times differentiable and the equality of (1.30) also holds at $x = b$, completing the proof that $\chi$ is a solution. $\square$

It is sometimes useful to apply what is known as time reversion:

Definition 1.8. Let $k, n \in \mathbb{N}$, $G_f \subseteq \mathbb{R} \times \mathbb{K}^{kn}$, $f: G_f \to \mathbb{K}^n$, and consider the ODE

$$y^{(k)} = f(x, y, y', \dots, y^{(k-1)}). \tag{1.32}$$

We call the ODE

$$y^{(k)} = g(x, y, y', \dots, y^{(k-1)}), \tag{1.33}$$

where

$$g: G_g \to \mathbb{K}^n, \quad g(x, y) := (-1)^k f\big({-x}, y_1, -y_2, \dots, (-1)^{k-1} y_k\big), \tag{1.34a}$$

$$G_g := \big\{(x, y) \in \mathbb{R} \times \mathbb{K}^{kn} : \big({-x}, y_1, -y_2, \dots, (-1)^{k-1} y_k\big) \in G_f\big\}, \tag{1.34b}$$

the time-reversed version of (1.32).

Lemma 1.9 (Time Reversion). Let $k, n \in \mathbb{N}$, $G_f \subseteq \mathbb{R} \times \mathbb{K}^{kn}$, and $f: G_f \to \mathbb{K}^n$.

(a) The time-reversed version of (1.33) is the original ODE, i.e. (1.32).

(b) If $-\infty \leq a < b \leq \infty$, then $\phi: \,]a, b[ \to \mathbb{K}^n$ is a solution to (1.32) if, and only if,

$$\psi: \,]-b, -a[ \to \mathbb{K}^n, \quad \psi(x) := \phi(-x), \tag{1.35}$$

is a solution to the time-reversed version (1.33).

Proof. (a) is immediate from the definition of $g$ in (1.34).

(b): Due to (a), it suffices to show that, if $\phi$ is a solution to (1.32), then $\psi$ is a solution to (1.33). Clearly, if $x \in \,]-b, -a[$, then $-x \in \,]a, b[$. Moreover, noting

$$\forall_{j = 0, \dots, k} \; \forall_{x \in \,]-b, -a[}: \quad \psi^{(j)}(x) = (-1)^j \phi^{(j)}(-x), \tag{1.36a}$$

one has

$$\forall_{x \in \,]-b, -a[}: \quad \big(x, \psi(x), \psi'(x), \dots, \psi^{(k-1)}(x)\big) = \big(x, \phi(-x), -\phi'(-x), \dots, (-1)^{k-1} \phi^{(k-1)}(-x)\big) \in G_g, \tag{1.36b}$$

since $\big({-x}, \phi(-x), \phi'(-x), \dots, \phi^{(k-1)}(-x)\big) \in G_f$, and

$$\forall_{x \in \,]-b, -a[}: \quad \psi^{(k)}(x) = (-1)^k \phi^{(k)}(-x) = (-1)^k f\big({-x}, \phi(-x), \phi'(-x), \dots, \phi^{(k-1)}(-x)\big)$$
$$= (-1)^k f\big({-x}, \psi(x), -\psi'(x), \dots, (-1)^{k-1} \psi^{(k-1)}(x)\big) = g\big(x, \psi(x), \psi'(x), \dots, \psi^{(k-1)}(x)\big), \tag{1.36c}$$

thereby establishing the case. $\square$


2 Elementary Solution Methods for 1-Dimensional First-Order ODE

2.1 Geometric Interpretation, Graphing

Geometrically, in the 1-dimensional real-valued case, the ODE (1.21a) provides a slope $y' = f(x, y)$ for every point $(x, y)$. In other words, it provides a field of directions. The task is to find a differentiable function such that its graph has the prescribed slope in each point it contains. In certain simple cases, drawing the field of directions can help to guess the solutions of the ODE.

Example 2.1. Let $G := \mathbb{R}^+ \times \mathbb{R}$ and $f: G \to \mathbb{R}$, $f(x, y) := y/x$, i.e. we consider the ODE $y' = y/x$. Drawing the field of directions leads to the idea that the solutions are functions whose graphs constitute rays, i.e. $\phi_c: \mathbb{R}^+ \to \mathbb{R}$, $y = \phi_c(x) = c\,x$ with $c \in \mathbb{R}$. Indeed, one immediately verifies that each $\phi_c$ constitutes a solution to the ODE.
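Such a field of directions is easily drawn by computer. The following is an added illustrative sketch (not part of the original notes; matplotlib, the plot window, and the chosen values of $c$ are assumptions of this example):

```python
import numpy as np
import matplotlib.pyplot as plt

# Direction field for y' = y/x on R^+ x R, as in Example 2.1.
x, y = np.meshgrid(np.linspace(0.5, 4.0, 15), np.linspace(-3.0, 3.0, 15))
slope = y / x
norm = np.hypot(1.0, slope)          # normalize (1, slope) to equal arrow lengths
plt.quiver(x, y, 1.0 / norm, slope / norm, angles='xy')

xs = np.linspace(0.5, 4.0, 50)
for c in [-1.0, 0.0, 0.5, 1.0]:      # the rays phi_c(x) = c*x solve the ODE
    plt.plot(xs, c * xs)
plt.xlabel('x'); plt.ylabel('y'); plt.title("Direction field of y' = y/x")
plt.show()
```

The plotted rays are everywhere tangent to the arrows, which is precisely the geometric statement that each $\phi_c$ solves the ODE.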

2.2 Linear ODE, Variation of Constants

Definition 2.2. Let $I \subseteq \mathbb{R}$ be an open interval and let $a, b: I \to \mathbb{K}$ be continuous functions. An ODE of the form

$$y' = a(x)\,y + b(x) \tag{2.1}$$

is called a linear ODE of first order. It is called homogeneous if, and only if, $b \equiv 0$; it is called inhomogeneous if, and only if, it is not homogeneous.

Theorem 2.3 (Variation of Constants). Let $I \subseteq \mathbb{R}$ be an open interval and let $a, b: I \to \mathbb{K}$ be continuous. Moreover, let $x_0 \in I$ and $c \in \mathbb{K}$. Then the linear ODE (2.1) has a unique solution $\phi: I \to \mathbb{K}$ that satisfies the initial condition $y(x_0) = c$. This unique solution is given by

$$\phi: I \to \mathbb{K}, \quad \phi(x) = \phi_0(x) \left( c + \int_{x_0}^x \phi_0(t)^{-1}\, b(t) \, dt \right), \tag{2.2a}$$

where

$$\phi_0: I \to \mathbb{K}, \quad \phi_0(x) = \exp\left( \int_{x_0}^x a(t) \, dt \right) = e^{\int_{x_0}^x a(t)\,dt}. \tag{2.2b}$$

Here, and in the following, $\phi_0^{-1}$ denotes $1/\phi_0$ and not the inverse function of $\phi_0$ (which does not even necessarily exist).
Proof. We begin by noting that $\phi_0$ according to (2.2b) is well-defined, since $a$ is assumed to be continuous, i.e., in particular, Riemann integrable on $[x_0, x]$. Moreover, the fundamental theorem of calculus [Phi13a, Th. G.6(a)] applies, showing $\phi_0$ is differentiable with

$$\phi_0': I \to \mathbb{K}, \quad \phi_0'(x) = a(x) \exp\left( \int_{x_0}^x a(t) \, dt \right) = a(x)\,\phi_0(x), \tag{2.3}$$

where Lem. A.1 of the Appendix was used as well. In particular, $\phi_0$ is continuous. Since $\phi_0 \neq 0$ as well, $\phi_0^{-1}$ is also continuous. Moreover, as $b$ is continuous by hypothesis, $\phi_0^{-1} b$ is continuous and, thus, Riemann integrable on $[x_0, x]$. Once again, [Phi13a, Th. G.6(a)] applies, yielding $\phi$ to be differentiable with

$$\phi': I \to \mathbb{K}, \quad \phi'(x) = \phi_0'(x) \left( c + \int_{x_0}^x \phi_0(t)^{-1}\, b(t) \, dt \right) + \phi_0(x)\,\phi_0(x)^{-1}\, b(x)$$
$$= a(x)\,\phi_0(x) \left( c + \int_{x_0}^x \phi_0(t)^{-1}\, b(t) \, dt \right) + b(x) = a(x)\,\phi(x) + b(x), \tag{2.4}$$

where the product rule of [Phi13a, Th. 9.6(c)] was used as well. Comparing (2.4) with (2.1) shows $\phi$ is a solution to (2.1). The computation

$$\phi(x_0) = \phi_0(x_0)\,(c + 0) = 1 \cdot c = c \tag{2.5}$$

verifies that $\phi$ satisfies the desired initial condition. It remains to prove uniqueness. To this end, let $\psi: I \to \mathbb{K}$ be an arbitrary differentiable function that satisfies (2.1) as well as the initial condition $\psi(x_0) = c$. We have to show $\psi = \phi$. Since $\phi_0 \neq 0$, we can define $u := \psi/\phi_0$ and still have to verify

$$\forall_{x \in I}: \quad u(x) = c + \int_{x_0}^x \phi_0(t)^{-1}\, b(t) \, dt. \tag{2.6}$$

We obtain

$$a\,\phi_0\,u + b = a\,\psi + b = \psi' = (\phi_0\,u)' = \phi_0'\,u + \phi_0\,u' = a\,\phi_0\,u + \phi_0\,u', \tag{2.7}$$

implying $b = \phi_0\,u'$ and $u' = \phi_0^{-1}\, b$. Thus, the fundamental theorem of calculus in the form [Phi13a, Th. G.6(b)] implies

$$\forall_{x \in I}: \quad u(x) = u(x_0) + \int_{x_0}^x u'(t) \, dt = c + \int_{x_0}^x \phi_0(t)^{-1}\, b(t) \, dt, \tag{2.8}$$

thereby completing the proof. $\square$

Corollary 2.4. Let $I \subseteq \mathbb{R}$ be an open interval and let $a: I \to \mathbb{K}$ be continuous. Moreover, let $x_0 \in I$ and $c \in \mathbb{K}$. Then the homogeneous linear ODE (2.1) (i.e. with $b \equiv 0$) has a unique solution $\phi: I \to \mathbb{K}$ that satisfies the initial condition $y(x_0) = c$. This unique solution is given by

$$\phi(x) = c \exp\left( \int_{x_0}^x a(t) \, dt \right) = c\, e^{\int_{x_0}^x a(t)\,dt}. \tag{2.9}$$

Proof. One immediately obtains (2.9) by setting $b \equiv 0$ in (2.2). $\square$

Remark 2.5. The name "variation of constants" for Th. 2.3 can be understood from comparing the solution (2.9) of the homogeneous linear ODE with the solution (2.2) of the general inhomogeneous linear ODE: One obtains (2.2) from (2.9) by varying the constant $c$, i.e. by replacing it with the function $x \mapsto c + \int_{x_0}^x \phi_0(t)^{-1}\, b(t) \, dt$.
Example 2.6. Consider the ODE

$$y' = 2xy + x^3 \tag{2.10}$$

with initial condition $y(0) = c$, $c \in \mathbb{C}$. Comparing (2.10) with Def. 2.2, we observe we are facing an inhomogeneous linear ODE with

$$a: \mathbb{R} \to \mathbb{R}, \quad a(x) := 2x, \tag{2.11a}$$

$$b: \mathbb{R} \to \mathbb{R}, \quad b(x) := x^3. \tag{2.11b}$$

From Cor. 2.4, we obtain the solution $\phi_{0,c}$ to the homogeneous version of (2.10):

$$\phi_{0,c}: \mathbb{R} \to \mathbb{C}, \quad \phi_{0,c}(x) = c \exp\left( \int_0^x a(t) \, dt \right) = c\,e^{x^2}. \tag{2.12}$$

The solution $\phi$ to (2.10) is given by (2.2a):

$$\phi: \mathbb{R} \to \mathbb{C}, \quad \phi(x) = e^{x^2} \left( c + \int_0^x e^{-t^2}\, t^3 \, dt \right) = e^{x^2} \left( c + \left[ -\frac{1}{2}\,(t^2 + 1)\,e^{-t^2} \right]_0^x \right)$$
$$= e^{x^2} \left( c + \frac{1}{2} - \frac{1}{2}\,(x^2 + 1)\,e^{-x^2} \right) = \left( c + \frac{1}{2} \right) e^{x^2} - \frac{1}{2}\,(x^2 + 1). \tag{2.13}$$
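One can also let a computer algebra system confirm the result (2.13); the following sympy snippet is an added illustration (not part of the original notes):

```python
import sympy as sp

x, c = sp.symbols('x c')
# phi from (2.13): phi(x) = (c + 1/2) * e^{x^2} - (x^2 + 1)/2
phi = (c + sp.Rational(1, 2)) * sp.exp(x**2) - (x**2 + 1) / 2
# Check the ODE (2.10), y' = 2xy + x^3, and the initial condition phi(0) = c.
print(sp.simplify(phi.diff(x) - (2 * x * phi + x**3)))  # 0
print(phi.subs(x, 0))                                   # c
```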

2.3 Separation of Variables

If the ODE (1.21a) has the particular form

$$y' = f(x)\,g(y), \tag{2.14}$$

with one-dimensional real-valued continuous functions $f$ and $g$, and $g(y) \neq 0$, then it can be solved by a method known as separation of variables:

Theorem 2.7. Let $I, J \subseteq \mathbb{R}$ be (bounded or unbounded) open intervals and suppose that $f: I \to \mathbb{R}$ and $g: J \to \mathbb{R}$ are continuous with $g(y) \neq 0$ for each $y \in J$. For each $(x_0, y_0) \in I \times J$, consider the initial value problem consisting of the ODE (2.14) together with the initial condition

$$y(x_0) = y_0. \tag{2.15}$$

Define the functions

$$F: I \to \mathbb{R}, \quad F(x) := \int_{x_0}^x f(t) \, dt, \qquad G: J \to \mathbb{R}, \quad G(y) := \int_{y_0}^y \frac{dt}{g(t)}. \tag{2.16}$$

(a) Uniqueness: On each open interval $I' \subseteq I$ satisfying $x_0 \in I'$ and $F(I') \subseteq G(J)$, the initial value problem consisting of (2.14) and (2.15) has a unique solution. This unique solution is given by

$$\phi: I' \to \mathbb{R}, \quad \phi(x) := G^{-1}\big(F(x)\big), \tag{2.17}$$

where $G^{-1}: G(J) \to J$ is the inverse function of $G$ on $G(J)$.

(b) Existence: There exists an open interval $I' \subseteq I$ satisfying $x_0 \in I'$ and $F(I') \subseteq G(J)$, i.e. an $I'$ such that (a) applies.
Proof. (a): We begin by proving $G$ has a differentiable inverse function $G^{-1}: G(J) \to J$. According to the fundamental theorem of calculus [Phi13a, Th. 10.19(a)], $G$ is differentiable with $G' = 1/g$. Since $g$ is continuous and nonzero, $G$ is even $C^1$. If $G'(y_0) = 1/g(y_0) > 0$, then $G$ is strictly increasing on $J$ (due to the intermediate value theorem [Phi13a, Th. 7.56]; $g(y_0) > 0$, the continuity of $g$, and $g \neq 0$ imply that $g > 0$ on $J$). Analogously, if $G'(y_0) = 1/g(y_0) < 0$, then $G$ is strictly decreasing on $J$. In each case, $G$ has a differentiable inverse function on $G(J)$ by [Phi13a, Th. 9.8].

In the next step, we verify that (2.17) does, indeed, define a solution to (2.14) and (2.15). The assumption $F(I') \subseteq G(J)$ and the existence of $G^{-1}$ as shown above provide that $\phi$ is well-defined by (2.17). Verifying (2.15) is quite simple: $\phi(x_0) = G^{-1}(F(x_0)) = G^{-1}(0) = y_0$. To see $\phi$ to be a solution of (2.14), notice that (2.17) implies $F = G \circ \phi$ on $I'$. Thus, we can apply the chain rule to obtain the derivative of $F = G \circ \phi$ on $I'$:

$$\forall_{x \in I'}: \quad f(x) = F'(x) = G'\big(\phi(x)\big)\,\phi'(x) = \frac{\phi'(x)}{g\big(\phi(x)\big)}, \tag{2.18}$$

showing $\phi$ satisfies (2.14).

We now proceed to show that each solution $\phi: I' \to \mathbb{R}$ to (2.14) that satisfies (2.15) must also satisfy (2.17). Since $\phi$ is a solution to (2.14),

$$\frac{\phi'(x)}{g\big(\phi(x)\big)} = f(x) \quad \text{for each } x \in I'. \tag{2.19}$$

Integrating (2.19) yields

$$\int_{x_0}^x \frac{\phi'(t)}{g\big(\phi(t)\big)} \, dt = \int_{x_0}^x f(t) \, dt = F(x) \quad \text{for each } x \in I'. \tag{2.20}$$

Using the change of variables formula of [Phi13a, Th. 10.24] in the left-hand side of (2.20) allows one to replace $\phi(t)$ by the new integration variable $u$ (note that each solution $\phi: I' \to \mathbb{R}$ to (2.14) is in $C^1(I')$ since $f$ and $g$ are presumed continuous). Thus, we obtain from (2.20):

$$F(x) = \int_{\phi(x_0)}^{\phi(x)} \frac{du}{g(u)} = \int_{y_0}^{\phi(x)} \frac{du}{g(u)} = G\big(\phi(x)\big) \quad \text{for each } x \in I'. \tag{2.21}$$

Applying $G^{-1}$ to (2.21) establishes $\phi$ satisfies (2.17).

(b): During the proof of (a), we have already seen $G$ to be either strictly increasing or strictly decreasing. As $G(y_0) = 0$, this implies the existence of $\epsilon > 0$ such that $]-\epsilon, \epsilon[ \subseteq G(J)$. The function $F$ is differentiable and, in particular, continuous. Since $F(x_0) = 0$, there is $\delta > 0$ such that, for $I' := \,]x_0 - \delta, x_0 + \delta[$, one has $F(I') \subseteq \,]-\epsilon, \epsilon[ \subseteq G(J)$ as desired. $\square$

Example 2.8. Consider the ODE

$$y' = -\frac{y}{x} \quad \text{on } I \times J := \mathbb{R}^+ \times \mathbb{R}^+ \tag{2.22}$$

with the initial condition $y(1) = c$ for some given $c \in \mathbb{R}^+$. Introducing functions

$$f: \mathbb{R}^+ \to \mathbb{R}, \quad f(x) := -\frac{1}{x}, \qquad g: \mathbb{R}^+ \to \mathbb{R}, \quad g(y) := y, \tag{2.23}$$

one sees that Th. 2.7 applies. To compute the solution $\phi = G^{-1} \circ F$, we first have to determine $F$ and $G$:

$$F: \mathbb{R}^+ \to \mathbb{R}, \quad F(x) = \int_1^x f(t) \, dt = -\int_1^x \frac{dt}{t} = -\ln x, \tag{2.24a}$$

$$G: \mathbb{R}^+ \to \mathbb{R}, \quad G(y) = \int_c^y \frac{dt}{g(t)} = \int_c^y \frac{dt}{t} = \ln \frac{y}{c}. \tag{2.24b}$$

Here, we can choose $I' = I = \mathbb{R}^+$, because $F(\mathbb{R}^+) = \mathbb{R} = G(\mathbb{R}^+)$. That means $\phi$ is defined on the entire interval $I$. The inverse function of $G$ is given by

$$G^{-1}: \mathbb{R} \to \mathbb{R}^+, \quad G^{-1}(t) = c\,e^t. \tag{2.25}$$

Finally, we get

$$\phi: \mathbb{R}^+ \to \mathbb{R}, \quad \phi(x) = G^{-1}\big(F(x)\big) = c\,e^{-\ln x} = \frac{c}{x}. \tag{2.26}$$

The uniqueness part of Th. 2.7 further tells us the above initial value problem can have no solution different from $\phi$.

The advantage of using Th. 2.7 as in the previous example, by computing the relevant functions $F$, $G$, and $G^{-1}$, is that it is mathematically rigorous. In particular, one can be sure one has found the unique solution to the ODE with initial condition. However, in practice, it is often easier to use the following heuristic (not entirely rigorous) procedure. In the end, in most cases, one can easily check by differentiation that the function found is, indeed, a solution to the ODE with initial condition. However, one does not know uniqueness without further investigations (general results such as Th. 3.15 below can often help). One also has to determine on which interval the found solution is defined. On the other hand, as one is usually interested in choosing the interval as large as possible, the optimal choice is not always obvious when using Th. 2.7, either.

The heuristic procedure is as follows: Start with the ODE (2.14) written in the form

$$\frac{dy}{dx} = f(x)\,g(y). \tag{2.27a}$$

Multiply by $dx$ and divide by $g(y)$ (i.e. separate the variables):

$$\frac{dy}{g(y)} = f(x) \, dx. \tag{2.27b}$$

Integrate:

$$\int \frac{dy}{g(y)} = \int f(x) \, dx. \tag{2.27c}$$

Change the integration variables and supply the appropriate upper and lower limits for the integrals (according to the initial condition):

$$\int_{y_0}^y \frac{dt}{g(t)} = \int_{x_0}^x f(t) \, dt. \tag{2.27d}$$

Solve this equation for $y$, set $\phi(x) := y$, check by differentiation that $\phi$ is, indeed, a solution to the ODE, and determine the largest interval $I'$ such that $x_0 \in I'$ and such that $\phi$ is defined on $I'$. The use of this heuristic procedure is demonstrated by the following example:
Example 2.9. Consider the ODE

$$y' = -y^2 \quad \text{on } I \times J := \mathbb{R} \times \mathbb{R} \tag{2.28}$$

with the initial condition $y(x_0) = y_0$ for given values $x_0, y_0 \in \mathbb{R}$. We manipulate (2.28) according to the heuristic procedure described in (2.27) above:

$$\frac{dy}{dx} = -y^2 \;\Rightarrow\; -y^{-2}\, dy = dx \;\Rightarrow\; -\int_{y_0}^y t^{-2} \, dt = \int_{x_0}^x dt \;\Rightarrow\; \left[ \frac{1}{t} \right]_{y_0}^y = \frac{1}{y} - \frac{1}{y_0} = [t]_{x_0}^x = x - x_0$$
$$\Rightarrow\; \phi(x) = y = \frac{y_0}{1 + (x - x_0)\,y_0}. \tag{2.29}$$

Clearly, $\phi(x_0) = y_0$. Moreover,

$$\phi'(x) = -\frac{y_0^2}{\big(1 + (x - x_0)\,y_0\big)^2} = -\big(\phi(x)\big)^2, \tag{2.30}$$

i.e. $\phi$ does, indeed, provide a solution to (2.28). If $y_0 = 0$, then $\phi \equiv 0$ is defined on the entire interval $I' = \mathbb{R}$. If $y_0 \neq 0$, then the denominator of $\phi(x)$ has a zero at $x = (x_0 y_0 - 1)/y_0$, and $\phi$ is not defined on all of $\mathbb{R}$. In that case, if $y_0 > 0$, then $x_0 > (x_0 y_0 - 1)/y_0 = x_0 - 1/y_0$ and the maximal open interval for $\phi$ to be defined on is $I' = \,]x_0 - 1/y_0, \infty[$; if $y_0 < 0$, then $x_0 < (x_0 y_0 - 1)/y_0 = x_0 - 1/y_0$ and the maximal open interval for $\phi$ to be defined on is $I' = \,]-\infty, x_0 - 1/y_0[$. Note that the formula for $\phi$ obtained by (2.29) works for $y_0 = 0$ as well, even though not every previous expression in (2.29) is meaningful for $y_0 = 0$ and, also, Th. 2.7 does not apply to (2.28) for $y_0 = 0$. In the present example, the subsequent Th. 3.15 does, indeed, imply $\phi$ to be the unique solution to the initial value problem on $I'$.
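For comparison, one can let sympy carry out the separation and confirm the formula (2.29) as well as the pole bounding $I'$; this is an added sketch (not part of the notes; the concrete data $x_0 = 0$, $y_0 = 1$ are illustrative assumptions):

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')
x0, y0 = 0, 1                                        # illustrative data with y0 > 0
sol = sp.dsolve(sp.Eq(y(x).diff(x), -y(x)**2), y(x), ics={y(x0): y0}).rhs
print(sp.simplify(sol - y0 / (1 + (x - x0) * y0)))   # 0: matches (2.29)
# The denominator vanishes at x = x0 - 1/y0 = -1, so I' = ]-1, oo[ here.
print(sp.solve(sp.Eq(1 + (x - x0) * y0, 0), x))      # [-1]
```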

2.4 Change of Variables

To solve an ODE, it can be useful to transform it into an equivalent ODE, using a so-called change of variables. If one already knows how to solve the transformed ODE, then the equivalence allows one to also solve the original ODE. We first present the following Th. 2.10, which constitutes the base for the change of variables technique, followed by examples, where the technique is applied.
Theorem 2.10. Let $G \subseteq \mathbb{R} \times \mathbb{K}^n$ be open, $n \in \mathbb{N}$, $f: G \to \mathbb{K}^n$, and $(x_0, y_0) \in G$. Define

$$\forall_{x \in \mathbb{R}}: \quad G_x := \{ y \in \mathbb{K}^n : (x, y) \in G \} \tag{2.31}$$

and assume the change of variables function $T: G \to \mathbb{K}^n$ is differentiable and such that

$$\forall_{x: \, G_x \neq \emptyset}: \quad T_x := T(x, \cdot): G_x \to T_x(G_x), \quad T_x(y) := T(x, y), \text{ is a diffeomorphism}, \tag{2.32}$$

i.e. $T_x$ is invertible and both $T_x$ and $T_x^{-1}$ are differentiable. Then the first-order initial value problems

$$y' = f(x, y), \tag{2.33a}$$

$$y(x_0) = y_0, \tag{2.33b}$$

and

$$y' = \big(DT_x^{-1}(y)\big)^{-1}\, f\big(x, T_x^{-1}(y)\big) + \partial_x T\big(x, T_x^{-1}(y)\big), \tag{2.34a}$$

$$y(x_0) = T(x_0, y_0), \tag{2.34b}$$

are equivalent in the following sense:

(a) A differentiable function $\phi: I \to \mathbb{K}^n$, where $I \subseteq \mathbb{R}$ is a nontrivial interval, is a solution to (2.33a) if, and only if, the function

$$\psi: I \to \mathbb{K}^n, \quad \psi(x) := (T_x \circ \phi)(x) = T\big(x, \phi(x)\big), \tag{2.35}$$

is a solution to (2.34a).

(b) A differentiable function $\phi: I \to \mathbb{K}^n$, where $I \subseteq \mathbb{R}$ is a nontrivial interval, is a solution to (2.33) if, and only if, the function $\psi$ of (2.35) is a solution to (2.34).
Proof. We start by noting that the assumption of $G$ being open clearly implies each $G_x$, $x \in \mathbb{R}$, to be open as well, which, in turn, implies $T_x(G_x)$ to be open, even though this is not as obvious(*). Next, for each $x \in \mathbb{R}$ such that $G_x \neq \emptyset$, we can apply the chain rule [Phi13b, Th. 2.28] to $T_x \circ T_x^{-1} = \operatorname{Id}$ to obtain

$$\forall_{y \in T_x(G_x)}: \quad DT_x\big(T_x^{-1}(y)\big)\, DT_x^{-1}(y) = \operatorname{Id} \tag{2.36}$$

and, thus, each $DT_x^{-1}(y)$ is invertible with

$$\forall_{y \in T_x(G_x)}: \quad \big(DT_x^{-1}(y)\big)^{-1} = DT_x\big(T_x^{-1}(y)\big). \tag{2.37}$$

Consider $\phi$ and $\psi$ as in (a) and notice that (2.35) implies

$$\forall_{x \in I}: \quad \phi(x) = T_x^{-1}\big(\psi(x)\big). \tag{2.38}$$

Moreover, the differentiability of $\phi$ and $T$ imply differentiability of $\psi$ by the chain rule, which also yields

$$\forall_{x \in I}: \quad \psi'(x) = DT\big(x, \phi(x)\big) \begin{pmatrix} 1 \\ \phi'(x) \end{pmatrix} = DT_x\big(\phi(x)\big)\,\phi'(x) + \partial_x T\big(x, \phi(x)\big). \tag{2.39}$$

(*) If $T_x$ is a continuously differentiable map, then this is related to the inverse function theorem (see, e.g., [Phi13b, Cor. C.9]); it is still true if $T_x$ is merely continuous and injective, but then it is the invariance of domain theorem of algebraic topology [Oss09, 5.6.15], which is equivalent to the Brouwer fixed-point theorem [Oss09, 5.6.10], and is much harder to prove.

To prove (a), first assume $\phi: I \to \mathbb{K}^n$ to be a solution of (2.33a). Then, for each $x \in I$,

$$\psi'(x) \overset{(2.39),(2.33a)}{=} DT_x\big(\phi(x)\big)\, f\big(x, \phi(x)\big) + \partial_x T\big(x, \phi(x)\big)$$
$$\overset{(2.38)}{=} DT_x\big(T_x^{-1}(\psi(x))\big)\, f\big(x, T_x^{-1}(\psi(x))\big) + \partial_x T\big(x, T_x^{-1}(\psi(x))\big)$$
$$\overset{(2.37)}{=} \big(DT_x^{-1}(\psi(x))\big)^{-1} f\big(x, T_x^{-1}(\psi(x))\big) + \partial_x T\big(x, T_x^{-1}(\psi(x))\big), \tag{2.40}$$

showing $\psi$ satisfies (2.34a). Conversely, assume $\psi$ to be a solution to (2.34a). Then, for each $x \in I$,

$$\big(DT_x^{-1}(\psi(x))\big)^{-1} f\big(x, T_x^{-1}(\psi(x))\big) + \partial_x T\big(x, T_x^{-1}(\psi(x))\big)$$
$$\overset{(2.34a)}{=} \psi'(x) \overset{(2.39)}{=} DT_x\big(\phi(x)\big)\,\phi'(x) + \partial_x T\big(x, \phi(x)\big). \tag{2.41}$$

Using (2.38), one can subtract the second summand from (2.41). Multiplying the result by $DT_x^{-1}\big(\psi(x)\big)$ from the left and taking into account (2.37) then provides

$$\forall_{x \in I}: \quad \phi'(x) = f\big(x, T_x^{-1}(\psi(x))\big) \overset{(2.38)}{=} f\big(x, \phi(x)\big), \tag{2.42}$$

showing $\phi$ satisfies (2.33a).

It remains to prove (b). If $\phi$ satisfies (2.33), then $\psi$ satisfies (2.34a) by (a). Moreover, $\psi(x_0) = T\big(x_0, \phi(x_0)\big) = T(x_0, y_0)$, i.e. $\psi$ satisfies (2.34b) as well. Conversely, assume $\psi$ satisfies (2.34). Then $\phi$ satisfies (2.33a) by (a). Moreover, by (2.38), $\phi(x_0) = T_{x_0}^{-1}\big(\psi(x_0)\big) = T_{x_0}^{-1}\big(T(x_0, y_0)\big) = y_0$, showing $\phi$ satisfies (2.33b) as well. $\square$
As a first application of Th. 2.10, we prove the following theorem about so-called Bernoulli differential equations:

Theorem 2.11. Consider the Bernoulli differential equation

$$y' = f(x, y) := a(x)\,y + b(x)\,y^\alpha, \tag{2.43a}$$

where $\alpha \in \mathbb{R} \setminus \{0, 1\}$, the functions $a, b: I \to \mathbb{R}$ are continuous and defined on an open interval $I \subseteq \mathbb{R}$, and $f: I \times \mathbb{R}^+ \to \mathbb{R}$. For (2.43a), we add the initial condition

$$y(x_0) = y_0, \quad (x_0, y_0) \in I \times \mathbb{R}^+, \tag{2.43b}$$

and, furthermore, we also consider the corresponding linear initial value problem

$$y' = (1 - \alpha)\big(a(x)\,y + b(x)\big), \tag{2.44a}$$

$$y(x_0) = y_0^{1-\alpha}, \tag{2.44b}$$

with its unique solution $\psi: I \to \mathbb{R}$ given by Th. 2.3.

(a) Uniqueness: On each open interval $I' \subseteq I$ satisfying $x_0 \in I'$ and $\psi > 0$ on $I'$, the Bernoulli initial value problem (2.43) has a unique solution. This unique solution is given by

$$\phi: I' \to \mathbb{R}^+, \quad \phi(x) := \big(\psi(x)\big)^{\frac{1}{1-\alpha}}. \tag{2.45}$$

(b) Existence: There exists an open interval $I' \subseteq I$ satisfying $x_0 \in I'$ and $\psi > 0$ on $I'$, i.e. an $I'$ such that (a) applies.
Proof. (b) is immediate from Th. 2.3, since $\psi(x_0) = y_0^{1-\alpha} > 0$ and $\psi$ is continuous.

To prove (a), we apply Th. 2.10 with the change of variables

$$T: I \times \mathbb{R}^+ \to \mathbb{R}^+, \quad T(x, y) := y^{1-\alpha}. \tag{2.46}$$

Then $T \in C^1(I \times \mathbb{R}^+, \mathbb{R})$ with $\partial_x T \equiv 0$ and $\partial_y T(x, y) = (1 - \alpha)\,y^{-\alpha}$. Moreover,

$$\forall_{x \in I}: \quad T_x = S, \quad S: \mathbb{R}^+ \to \mathbb{R}^+, \quad S(y) := y^{1-\alpha}, \tag{2.47}$$

which is differentiable with the differentiable inverse function $S^{-1}: \mathbb{R}^+ \to \mathbb{R}^+$, $S^{-1}(y) = y^{\frac{1}{1-\alpha}}$, $DS^{-1}(y) = (S^{-1})'(y) = \frac{1}{1-\alpha}\, y^{\frac{\alpha}{1-\alpha}}$. Thus, (2.34a) takes the form

$$y' = \big(DT_x^{-1}(y)\big)^{-1} f\big(x, T_x^{-1}(y)\big) + \partial_x T\big(x, T_x^{-1}(y)\big)$$
$$= (1 - \alpha)\, y^{-\frac{\alpha}{1-\alpha}} \Big( a(x)\, y^{\frac{1}{1-\alpha}} + b(x)\, y^{\frac{\alpha}{1-\alpha}} \Big) + 0 = (1 - \alpha)\big(a(x)\,y + b(x)\big). \tag{2.48}$$

Thus, if $I' \subseteq I$ is such that $x_0 \in I'$ and $\psi > 0$ on $I'$, then Th. 2.10 says $\phi$ defined by (2.45) must be a solution to (2.43) (note that the differentiability of $\psi$ implies the differentiability of $\phi$). On the other hand, if $\tilde\phi: I' \to \mathbb{R}^+$ is an arbitrary solution to (2.43), then Th. 2.10 states $\tilde\psi := S \circ \tilde\phi = \tilde\phi^{1-\alpha}$ to be a solution to (2.44). The uniqueness part of Th. 2.3 then yields $\tilde\phi^{1-\alpha} = \psi|_{I'} = \phi^{1-\alpha}$, i.e. $\tilde\phi = \phi$. $\square$
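As an illustration of this strategy (an added sketch, not part of the original notes; the concrete equation $y' = y - y^2$, i.e. $\alpha = 2$, $a \equiv 1$, $b \equiv -1$, and the initial data are assumptions of this example), one can solve a Bernoulli equation by passing to the transformed linear problem (2.44) and back via (2.45):

```python
import sympy as sp

# Illustrative Bernoulli equation y' = y - y^2 (alpha = 2, a = 1, b = -1),
# solved via the substitution u = y^{1-alpha} = 1/y from the proof of Th. 2.11.
x = sp.symbols('x')
u = sp.Function('u')
x0, y0 = 0, sp.Rational(1, 2)
# Transformed linear problem (2.44): u' = (1 - alpha)(a*u + b) = -(u - 1).
usol = sp.dsolve(sp.Eq(u(x).diff(x), -(u(x) - 1)), u(x), ics={u(x0): 1 / y0}).rhs
phi = 1 / usol                                     # transform back via (2.45)
print(sp.simplify(phi.diff(x) - (phi - phi**2)))   # 0: phi solves the Bernoulli ODE
print(phi.subs(x, x0))                             # 1/2: the initial condition holds
```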

Example 2.12. Consider the initial value problem

$$y' = f(x, y) := i - \frac{1}{ix - y + 2}, \tag{2.49a}$$

$$y(1) = i, \tag{2.49b}$$

where $f: G \to \mathbb{C}$, $G := \{(x, y) \in \mathbb{R} \times \mathbb{C} : ix - y + 2 \neq 0\}$ ($G$ is open as the continuous preimage of the open set $\mathbb{C} \setminus \{0\}$). We apply the change of variables

$$T: G \to \mathbb{C}, \quad T(x, y) := ix - y. \tag{2.50}$$

Then, $T \in C^1(G, \mathbb{C})$. Moreover, for each $x \in \mathbb{R}$,

$$G_x = \{ y \in \mathbb{C} : (x, y) \in G \} = \mathbb{C} \setminus \{ix + 2\} \tag{2.51}$$

and we have the diffeomorphisms

$$T_x: \mathbb{C} \setminus \{ix + 2\} \to \mathbb{C} \setminus \{-2\}, \quad T_x(y) = ix - y, \tag{2.52a}$$

$$T_x^{-1}: \mathbb{C} \setminus \{-2\} \to \mathbb{C} \setminus \{ix + 2\}, \quad T_x^{-1}(y) = ix - y. \tag{2.52b}$$

To obtain the transformed equation, we compute the right-hand side of (2.34a):

$$\big(DT_x^{-1}(y)\big)^{-1} f\big(x, T_x^{-1}(y)\big) + \partial_x T\big(x, T_x^{-1}(y)\big) = (-1)\left( i - \frac{1}{y + 2} \right) + i = \frac{1}{y + 2}. \tag{2.53}$$

Thus, the transformed initial value problem is

$$y' = \frac{1}{y + 2}, \tag{2.54a}$$

$$y(1) = T(1, i) = i - i = 0. \tag{2.54b}$$

Using separation of variables, one finds the solution

$$\psi: \,]-1, \infty[ \to \,]-2, \infty[, \quad \psi(x) := \sqrt{2x + 2} - 2, \tag{2.55}$$

to (2.54). Then Th. 2.10 implies that

$$\phi: \,]-1, \infty[ \to \mathbb{C}, \quad \phi(x) := T_x^{-1}\big(\psi(x)\big) = ix - \sqrt{2x + 2} + 2, \tag{2.56}$$

is a solution to (2.49) (that $\phi$ is a solution to (2.49) can now also easily be checked directly). It will become clear from Th. 3.15 below that $\psi$ and $\phi$ are also the unique solutions to their respective initial value problems.
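The direct check mentioned above can be carried out symbolically; the following sympy snippet is an added illustration (not part of the original notes):

```python
import sympy as sp

x = sp.symbols('x', real=True)
phi = sp.I * x - sp.sqrt(2 * x + 2) + 2       # the solution (2.56)
f = sp.I - 1 / (sp.I * x - phi + 2)           # right-hand side of (2.49a) at y = phi
print(sp.simplify(phi.diff(x) - f))           # 0: phi solves (2.49a)
print(phi.subs(x, 1))                         # I, the initial value (2.49b)
```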

Finding a suitable change of variables to transform a given ODE such that one is in a position to solve the transformed ODE is an art, i.e. it can be very difficult to spot a useful transformation, and it takes a lot of practice and experience.

Remark 2.13. Somewhat analogous to the situation described in the paragraph before (2.27) regarding the separation of variables technique, in practice, one frequently uses a heuristic procedure to apply a change of variables, rather than appealing to the rigorous Th. 2.10. For the initial value problem $y' = f(x, y)$, $y(x_0) = y_0$, this heuristic procedure proceeds as follows:

(1) One introduces the new variable $z := T(x, y)$ and then computes $z'$, i.e. the derivative of the function $x \mapsto z(x) = T(x, y(x))$.

(2) In the result of (1), one eliminates all occurrences of the variable $y$ by first replacing $y'$ by $f(x, y)$ and then replacing $y$ by $T_x^{-1}(z)$, where $T_x(y) := T(x, y) = z$ (i.e. one has to solve the equation $z = T(x, y)$ for $y$). One thereby obtains the transformed initial value problem $z' = g(x, z)$, $z(x_0) = T(x_0, y_0)$, with a suitable function $g$.

(3) One solves the transformed initial value problem to obtain a solution $\psi$, and then $x \mapsto \phi(x) := T_x^{-1}\big(\psi(x)\big)$ yields a candidate for a solution to the original initial value problem.

(4) One checks that $\phi$ is, indeed, a solution to $y' = f(x, y)$, $y(x_0) = y_0$.
Example 2.14. Consider

$$f: \mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}, \quad f(x, y) := 1 + \frac{y}{x} + \frac{y^2}{x^2}, \tag{2.57}$$

and the initial value problem

$$y' = f(x, y), \quad y(1) = 0. \tag{2.58}$$

We introduce the change of variables $z := T(x, y) := y/x$ and proceed according to the steps of Rem. 2.13. According to (1), we compute, using the quotient rule,

$$z'(x) = \frac{y'(x)\,x - y(x)}{x^2}. \tag{2.59}$$

According to (2), we replace $y'(x)$ by $f(x, y)$ and then replace $y$ by $T_x^{-1}(z) = xz$ to obtain the transformed initial value problem

$$z' = \frac{1}{x}\left( 1 + \frac{y}{x} + \frac{y^2}{x^2} \right) - \frac{y}{x^2} = \frac{1}{x}\,(1 + z + z^2) - \frac{z}{x} = \frac{1 + z^2}{x}, \quad z(1) = 0/1 = 0. \tag{2.60}$$

According to (3), we next solve (2.60), e.g. by separation of variables, to obtain the solution

$$\psi: \,\big]e^{-\pi/2}, e^{\pi/2}\big[ \to \mathbb{R}, \quad \psi(x) := \tan \ln x, \tag{2.61}$$

of (2.60), and

$$\phi: \,\big]e^{-\pi/2}, e^{\pi/2}\big[ \to \mathbb{R}, \quad \phi(x) := x\,\psi(x) = x \tan \ln x, \tag{2.62}$$

as a candidate for a solution to (2.58). Finally, according to (4), we check that $\phi$ is, indeed, a solution to (2.58): Due to $\phi(1) = 1 \cdot \tan 0 = 0$, $\phi$ satisfies the initial condition, and due to

$$\phi'(x) = \tan \ln x + x \cdot \frac{1}{x}\,(1 + \tan^2 \ln x) = 1 + \tan \ln x + \tan^2 \ln x = 1 + \frac{\phi(x)}{x} + \frac{\big(\phi(x)\big)^2}{x^2}, \tag{2.63}$$

$\phi$ satisfies the ODE.

3 General Theory

3.1 Equivalence Between Higher-Order ODE and Systems of First-Order ODE

It turns out that each one-dimensional $k$th-order ODE is equivalent to a system of $k$ first-order ODE; more generally, that each $n$-dimensional $k$th-order ODE is equivalent to a $kn$-dimensional first-order ODE (i.e. to a system of $kn$ one-dimensional first-order ODE). Even though, in this class, we will mainly consider explicit ODE, we provide the equivalence also for the implicit case, as the proof is essentially the same (the explicit case is included as a special case).
Theorem 3.1. In the situation of Def. 1.2(a), i.e. $U \subseteq \mathbb{R} \times \mathbb{K}^{(k+1)n}$ and $F: U \to \mathbb{K}^n$, plus $(x_0, y_{0,0}, \dots, y_{0,k-1}) \in \mathbb{R} \times \mathbb{K}^{kn}$, consider the $k$th-order initial value problem

$$F(x, y, y', \dots, y^{(k)}) = 0, \tag{3.1a}$$

$$\forall_{j \in \{0, \dots, k-1\}}: \quad y^{(j)}(x_0) = y_{0,j}, \tag{3.1b}$$

and the first-order initial value problem

$$\begin{aligned} y_1' - y_2 &= 0, \\ y_2' - y_3 &= 0, \\ &\;\;\vdots \\ y_{k-1}' - y_k &= 0, \\ F(x, y_1, \dots, y_k, y_k') &= 0, \end{aligned} \tag{3.2a}$$

$$y(x_0) = \begin{pmatrix} y_{0,0} \\ \vdots \\ y_{0,k-1} \end{pmatrix} \tag{3.2b}$$

(note that the unknown function $y$ in (3.1) is $\mathbb{K}^n$-valued, whereas the unknown function $y$ in (3.2) is $\mathbb{K}^{kn}$-valued). Then both initial value problems are equivalent in the following sense:

(a) If $\phi: I \to \mathbb{K}^n$ is a solution to (3.1), then

$$\psi: I \to \mathbb{K}^{kn}, \quad \psi := \begin{pmatrix} \phi \\ \phi' \\ \vdots \\ \phi^{(k-1)} \end{pmatrix}, \tag{3.3}$$

is a solution to (3.2).

(b) If $\psi: I \to \mathbb{K}^{kn}$ is a solution to (3.2), then $\phi := \psi_1$ (which is $\mathbb{K}^n$-valued) is a solution to (3.1).

Proof. We rewrite (3.2a) as

$$G(x, y, y') = 0, \tag{3.4}$$

where

$$G: V \to \mathbb{K}^{kn}, \quad V := \big\{(x, y, z) \in \mathbb{R} \times \mathbb{K}^{kn} \times \mathbb{K}^{kn} : (x, y, z_k) \in U \big\} \subseteq \mathbb{R} \times \mathbb{K}^{kn} \times \mathbb{K}^{kn},$$
$$G_1(x, y, z) := z_1 - y_2, \quad G_2(x, y, z) := z_2 - y_3, \quad \dots, \quad G_{k-1}(x, y, z) := z_{k-1} - y_k,$$
$$G_k(x, y, z) := F(x, y, z_k). \tag{3.5}$$

(a): As a solution to (3.1), $\phi$ is $k$ times differentiable and $\psi$ is well-defined. Then (3.1b) implies (3.2b), since

$$\psi(x_0) = \begin{pmatrix} \phi(x_0) \\ \phi'(x_0) \\ \vdots \\ \phi^{(k-1)}(x_0) \end{pmatrix} \overset{(3.1b)}{=} \begin{pmatrix} y_{0,0} \\ \vdots \\ y_{0,k-1} \end{pmatrix}.$$

Next, Def. 1.2(a)(i) for $\phi$ implies Def. 1.2(a)(i) for $\psi$, since

$$\big\{\big(x, \psi(x), \psi'(x)\big) \in I \times \mathbb{K}^{kn} \times \mathbb{K}^{kn} : x \in I\big\}$$
$$\overset{(3.3)}{=} \big\{\big(x, \phi(x), \dots, \phi^{(k-1)}(x), \phi'(x), \dots, \phi^{(k)}(x)\big) \in I \times \mathbb{K}^{kn} \times \mathbb{K}^{kn} : x \in I\big\} \subseteq V, \tag{3.6}$$

where we used $\big(x, \phi(x), \dots, \phi^{(k)}(x)\big) \in U$. The definition of $\psi$ in (3.3) implies

$$\forall_{j \in \{1, \dots, k-1\}}: \quad \psi_j' = \big(\phi^{(j-1)}\big)' = \phi^{(j)} = \psi_{j+1},$$

showing $\psi$ satisfies the first $k - 1$ equations of (3.2a). As

$$\forall_{x \in I}: \quad \psi_k'(x) = \big(\phi^{(k-1)}\big)'(x) = \phi^{(k)}(x)$$

and, thus,

$$\forall_{x \in I}: \quad G_k\big(x, \psi(x), \psi'(x)\big) = F\big(x, \phi(x), \phi'(x), \dots, \phi^{(k)}(x)\big) \overset{(3.1a)}{=} 0,$$

$\psi$ also satisfies the last equation of (3.2a).

(b): As $\psi$ is a solution to (3.2), the first $k - 1$ equations of (3.2a) imply

$$\forall_{j \in \{1, \dots, k-1\}}: \quad \psi_{j+1} = \psi_j' \overset{\phi = \psi_1}{=} \phi^{(j)}, \tag{3.7}$$

i.e. $\phi$ is $k$ times differentiable and $\psi$ has, once again, the form (3.3) (note $\psi_1 = \phi$ by the definition of $\phi$). Then, clearly, (3.2b) implies (3.1b), and Def. 1.2(a)(i) for $\psi$ implies Def. 1.2(a)(i) for $\phi$:

$$\big\{\big(x, \psi(x), \psi'(x)\big) \in I \times \mathbb{K}^{kn} \times \mathbb{K}^{kn} : x \in I\big\} \subseteq V$$

and the definition of $V$ in (3.5) imply

$$\big\{\big(x, \phi(x), \dots, \phi^{(k)}(x)\big) \in I \times \mathbb{K}^{(k+1)n} : x \in I\big\} \subseteq U.$$

Finally, from the last equation of (3.2a), one obtains

$$\forall_{x \in I}: \quad F\big(x, \phi(x), \dots, \phi^{(k)}(x)\big) \overset{(3.3),(3.5)}{=} G_k\big(x, \psi(x), \psi'(x)\big) = 0,$$

proving $\phi$ satisfies (3.1a). $\square$

Example 3.2. The second-order initial value problem

$$y'' = -y, \quad y(0) = 0, \; y'(0) = r, \quad r \in \mathbb{R} \text{ given}, \tag{3.8}$$

is equivalent to the following system of two first-order ODE:

$$y_1' = y_2, \quad y_2' = -y_1, \quad y(0) = \begin{pmatrix} 0 \\ r \end{pmatrix}. \tag{3.9}$$

The solution to (3.9) is

$$\psi: \mathbb{R} \to \mathbb{R}^2, \quad \psi(x) = \begin{pmatrix} \psi_1(x) \\ \psi_2(x) \end{pmatrix} = \begin{pmatrix} r \sin x \\ r \cos x \end{pmatrix}, \tag{3.10}$$

and, thus, the solution to (3.8) is

$$\phi: \mathbb{R} \to \mathbb{R}, \quad \phi(x) = r \sin x. \tag{3.11}$$
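The reduction in Example 3.2 is also the standard way to treat higher-order ODE numerically: one integrates the equivalent first-order system. The following is an added sketch (not part of the original notes; scipy.integrate.solve_ivp and the value $r = 2$ are assumptions of this example):

```python
import numpy as np
from scipy.integrate import solve_ivp

r = 2.0                                   # illustrative choice of r

def rhs(x, y):
    """First-order system (3.9): y = (y1, y2) with y1' = y2, y2' = -y1."""
    return [y[1], -y[0]]

sol = solve_ivp(rhs, (0.0, 10.0), [0.0, r], rtol=1e-10, atol=1e-12,
                dense_output=True)
xs = np.linspace(0.0, 10.0, 5)
# The first component should reproduce the exact solution phi(x) = r sin x.
print(np.max(np.abs(sol.sol(xs)[0] - r * np.sin(xs))))  # close to 0
```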

As a consequence of Th. 3.1, one can carry out much of the general theory of ODE (such as results regarding existence and uniqueness of solutions) for systems of first-order ODE, obtaining the corresponding results for higher-order ODE as a corollary. This is the strategy usually pursued in the literature and we will follow suit in this class.

3.2 Existence of Solutions

It is a rather remarkable fact that, under the very mild assumption that $f: G \to \mathbb{K}^n$ is a continuous function defined on an open subset $G$ of $\mathbb{R} \times \mathbb{K}^{kn}$ with $(x_0, y_{0,0}, \dots, y_{0,k-1}) \in G$, every initial value problem (1.7) for the $n$-dimensional explicit $k$th-order ODE (1.6) has at least one solution $\phi: I \to \mathbb{K}^n$, defined on a, possibly very small, open interval. This is the contents of the Peano Th. 3.8 below and its Cor. 3.10. From Example 1.4(b), we already know that uniqueness of the solution cannot be expected without stronger hypotheses.

The proof of the Peano theorem requires some work. One of the key ingredients is the Arzelà-Ascoli Th. 3.7 that, under suitable hypotheses, guarantees a given sequence of continuous functions to have a uniformly convergent subsequence (the formulation in Th. 3.7 is suitable for our purposes; many different variants of the Arzelà-Ascoli theorem exist in the literature).

We begin with some preliminaries from the theory of metric spaces. At this point, the reader might want to review the definition of a metric, a metric space, and basic notions on metric spaces, such as the notion of compactness and the notion of continuity of functions between metric spaces. Also recall that every normed space is a metric space via the metric induced by the norm (in particular, if we use metric notions on normed spaces, they are always meant with respect to the respective induced metric). If you are not sufficiently familiar with metrics and norms, you might want to consult the relevant subsections of [Phi13b, Sec. 1]; for compactness and some related results see, e.g., Appendix C.2.
Notation 3.3. Let $(X, d)$ be a metric space. Given $x \in X$ and $r \in \mathbb{R}^+$, let

$$B_r(x) := \{ y \in X : d(x, y) < r \}$$

denote the open ball with center $x$ and radius $r$, also known as the $r$-ball with center $x$.

Definition 3.4. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. We say a sequence of functions $(f_m)_{m \in \mathbb{N}}$, $f_m: X \to Y$, converges uniformly to a function $f: X \to Y$ if, and only if,

$$\forall_{\epsilon > 0} \; \exists_{N \in \mathbb{N}} \; \forall_{m \geq N, \, x \in X}: \quad d_Y\big(f_m(x), f(x)\big) < \epsilon.$$

Theorem 3.5. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. If the sequence $(f_m)_{m \in \mathbb{N}}$ of continuous functions $f_m: X \to Y$ converges uniformly to the function $f: X \to Y$, then $f$ is continuous as well.

Proof. We have to show that $f$ is continuous at every $\xi \in X$. Thus, let $\xi \in X$ and $\epsilon > 0$. Due to the uniform convergence, we can choose $m \in \mathbb{N}$ such that $d_Y\big(f_m(x), f(x)\big) < \epsilon/3$ for every $x \in X$. Moreover, as $f_m$ is continuous at $\xi$, there exists $\delta > 0$ such that $x \in B_\delta(\xi)$ implies $d_Y\big(f_m(\xi), f_m(x)\big) < \epsilon/3$. Thus, if $x \in B_\delta(\xi)$, then

$$d_Y\big(f(\xi), f(x)\big) \leq d_Y\big(f(\xi), f_m(\xi)\big) + d_Y\big(f_m(\xi), f_m(x)\big) + d_Y\big(f_m(x), f(x)\big) < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon,$$

proving $f$ is continuous at $\xi$. $\square$

Definition 3.6. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces and let $\mathcal{F}$ be a set of functions from $X$ into $Y$. Then the set $\mathcal{F}$ (or the functions in $\mathcal{F}$) are said to be uniformly equicontinuous if, and only if, for each $\epsilon > 0$, there is $\delta > 0$ such that

$$\forall_{f \in \mathcal{F}} \; \forall_{x, \xi \in X}: \quad \Big( d_X(x, \xi) < \delta \;\Rightarrow\; d_Y\big(f(x), f(\xi)\big) < \epsilon \Big). \tag{3.12}$$

Theorem 3.7 (Arzelà-Ascoli). Let $n \in \mathbb{N}$, let $\|\cdot\|$ denote some norm on $\mathbb{K}^n$, and let $I \subseteq \mathbb{R}$ be some bounded interval. If $(f_m)_{m \in \mathbb{N}}$ is a sequence of functions $f_m: I \to \mathbb{K}^n$ such that $\{f_m : m \in \mathbb{N}\}$ is uniformly equicontinuous and such that, for each $x \in I$, the sequence $\big(f_m(x)\big)_{m \in \mathbb{N}}$ is bounded, then $(f_m)_{m \in \mathbb{N}}$ has a uniformly convergent subsequence $(f_{m_j})_{j \in \mathbb{N}}$, i.e. there exists $f: I \to \mathbb{K}^n$ such that

$$\forall_{\epsilon > 0} \; \exists_{N \in \mathbb{N}} \; \forall_{j \geq N, \, x \in I}: \quad \|f_{m_j}(x) - f(x)\| < \epsilon.$$

In particular, the limit function $f$ is continuous.


Proof. Let $(r_1, r_2, \dots)$ be an enumeration of the set of rational numbers in $I$, i.e. of $\mathbb{Q} \cap I$. Inductively, we construct a sequence $(F_m)_{m \in \mathbb{N}}$ of subsequences of $(f_m)_{m \in \mathbb{N}}$, $F_m = (f_{m,k})_{k \in \mathbb{N}}$, such that

(i) for each $m \in \mathbb{N}$, $F_m$ is a subsequence of each $F_j$ with $j \in \{1, \dots, m\}$,

(ii) for each $m \in \mathbb{N}$, $F_m$ converges pointwise at each of the first $m$ rational numbers $r_j$; more precisely, there exists a sequence $(z_1, z_2, \dots)$ in $\mathbb{K}^n$ such that, for each $m \in \mathbb{N}$ and each $j \in \{1, \dots, m\}$:

$$\lim_{k \to \infty} f_{m,k}(r_j) = z_j.$$

Actually, we construct the $(z_m)_{m \in \mathbb{N}}$ inductively together with the $(F_m)_{m \in \mathbb{N}}$: Since the sequence $(f_m(r_1))_{m \in \mathbb{N}}$ is, by hypothesis, a bounded sequence in $\mathbb{K}^n$, one can apply the Bolzano-Weierstrass theorem (cf. [Phi13b, Th. 1.16(b)]) to obtain $z_1 \in \mathbb{K}^n$ and a subsequence $F_1 = (f_{1,k})_{k \in \mathbb{N}}$ of $(f_m)_{m \in \mathbb{N}}$ such that $\lim_{k \to \infty} f_{1,k}(r_1) = z_1$. To proceed by induction, we now assume to have already constructed $F_1, \dots, F_M$ and $z_1, \dots, z_M$ for $M \in \mathbb{N}$ such that (i) and (ii) hold for each $m \in \{1, \dots, M\}$. Since the sequence $(f_{M,k}(r_{M+1}))_{k \in \mathbb{N}}$ is a bounded sequence in $\mathbb{K}^n$, one can, once more, apply the Bolzano-Weierstrass theorem to obtain $z_{M+1} \in \mathbb{K}^n$ and a subsequence $F_{M+1} = (f_{M+1,k})_{k \in \mathbb{N}}$ of $F_M$ such that $\lim_{k \to \infty} f_{M+1,k}(r_{M+1}) = z_{M+1}$. Since $F_{M+1}$ is a subsequence of $F_M$, it is also a subsequence of all previous subsequences, i.e. (i) now also holds for $m = M + 1$. In consequence, $\lim_{k \to \infty} f_{M+1,k}(r_j) = z_j$ for each $j = 1, \dots, M + 1$, such that (ii) now also holds for $m = M + 1$ as required.

Next, one considers the diagonal sequence $(g_m)_{m \in \mathbb{N}}$, $g_m := f_{m,m}$, and observes that this sequence converges pointwise at each rational number $r_j$ ($\lim_{m \to \infty} g_m(r_j) = z_j$), since, at least for $m \geq j$, $(g_m)_{m \geq j}$ is a subsequence of every $F_j$ (exercise); in particular, $(g_m)_{m \in \mathbb{N}}$ is also a subsequence of the original sequence $(f_m)_{m \in \mathbb{N}}$.

In the last step of the proof, we show that $(g_m)_{m \in \mathbb{N}}$ converges uniformly on the entire interval $I$ to some $f: I \to \mathbb{K}^n$. To this end, fix $\epsilon > 0$. Since $\{g_m : m \in \mathbb{N}\} \subseteq \{f_m : m \in \mathbb{N}\}$, the assumed uniform equicontinuity of $\{f_m : m \in \mathbb{N}\}$ yields $\delta > 0$ such that

$$\forall_{m \in \mathbb{N}} \; \forall_{x, \xi \in I}: \quad \Big( |x - \xi| < \delta \;\Rightarrow\; \big\|g_m(x) - g_m(\xi)\big\| < \frac{\epsilon}{3} \Big).$$

Since $I$ is bounded, it has finite length and, thus, it can be covered with finitely many intervals $I_1, \dots, I_N$, $I = \bigcup_{j=1}^N I_j$, $N \in \mathbb{N}$, such that each $I_j$ has length less than $\delta$. Moreover, since $\mathbb{Q}$ is dense in $\mathbb{R}$, for each $j \in \{1, \dots, N\}$, there exists $k(j) \in \mathbb{N}$ such that $r_{k(j)} \in I_j$. Define $M := \max\{k(j) : j = 1, \dots, N\}$. We note that each of the finitely many sequences $\big(g_m(r_1)\big)_{m \in \mathbb{N}}, \dots, \big(g_m(r_M)\big)_{m \in \mathbb{N}}$ is a Cauchy sequence. Thus,

$$\exists_{K \in \mathbb{N}} \; \forall_{k, l \geq K} \; \forall_{\mu = 1, \dots, M}: \quad \big\|g_k(r_\mu) - g_l(r_\mu)\big\| < \frac{\epsilon}{3}. \tag{3.13}$$

We now consider an arbitrary $x \in I$ and $k, l \geq K$. Let $j \in \{1, \dots, N\}$ such that $x \in I_j$. Then $r_{k(j)} \in I_j$, $|r_{k(j)} - x| < \delta$, and the estimate in (3.13) holds for $\mu = k(j)$. In consequence, we obtain the crucial estimate

$$\forall_{k, l \geq K}: \quad \big\|g_k(x) - g_l(x)\big\| \leq \big\|g_k(x) - g_k(r_{k(j)})\big\| + \big\|g_k(r_{k(j)}) - g_l(r_{k(j)})\big\| + \big\|g_l(r_{k(j)}) - g_l(x)\big\| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon. \tag{3.14}$$

The estimate (3.14) shows $\big(g_m(x)\big)_{m \in \mathbb{N}}$ is a Cauchy sequence for each $x \in I$, and we can define

$$f: I \to \mathbb{K}^n, \quad f(x) := \lim_{m \to \infty} g_m(x). \tag{3.15}$$

Since $K$ in (3.14) does not depend on $x \in I$, passing to the limit $k \to \infty$ in the estimate of (3.14) implies

$$\forall_{l \geq K, \, x \in I}: \quad \|g_l(x) - f(x)\| \leq \epsilon,$$

proving uniform convergence of the subsequence $(g_m)_{m \in \mathbb{N}}$ of $(f_m)_{m \in \mathbb{N}}$ as desired. The continuity of $f$ is now a consequence of Th. 3.5. $\square$

At this point, we have all preparations in place to state and prove the existence theorem.

Theorem 3.8 (Peano). If $G \subseteq \mathbb{R} \times \mathbb{K}^n$ is open, $n \in \mathbb{N}$, and $f: G \to \mathbb{K}^n$ is continuous, then, for each $(x_0, y_0) \in G$, the explicit $n$-dimensional first-order initial value problem

$$y' = f(x, y), \tag{3.16a}$$

$$y(x_0) = y_0, \tag{3.16b}$$

has at least one solution. More precisely, given an arbitrary norm $\|\cdot\|$ on $\mathbb{K}^n$, (3.16) has a solution $\phi: I \to \mathbb{K}^n$, defined on the open interval

$$I := \,]x_0 - \alpha, x_0 + \alpha[, \tag{3.17}$$

$\alpha = \alpha(b) > 0$, where $b > 0$ is such that

$$B := \big\{(x, y) \in \mathbb{R} \times \mathbb{K}^n : |x - x_0| \leq b \text{ and } \|y - y_0\| \leq b\big\} \subseteq G, \tag{3.18}$$

$$M := M(b) := \max\big\{\|f(x, y)\| : (x, y) \in B\big\} < \infty, \tag{3.19}$$

and

$$\alpha := \alpha(b) := \begin{cases} \min\{b, b/M\} & \text{for } M > 0, \\ b & \text{for } M = 0. \end{cases} \tag{3.20}$$

In general, the choice of the norm $\|\cdot\|$ on $\mathbb{K}^n$ will influence the possible sizes of $\alpha$ and, thus, of $I$.
Proof. The proof will be conducted in several steps. In the first step, we check $\alpha = \alpha(b) > 0$ is well-defined: Since $G$ is open, there always exists $b > 0$ such that (3.18) holds. Since $B$ is a closed and bounded subset of the finite-dimensional space $\mathbb{R} \times \mathbb{K}^n$, $B$ is compact (cf. [Phi13b, Cor. 3.5]). Since $f$ and, thus, $\|f\|$ is continuous (every norm is even Lipschitz continuous due to the inverse triangle inequality), it must assume its maximum on the compact set $B$ (cf. [Phi13b, Th. 3.8]), showing $M \in \mathbb{R}_0^+$ is well-defined by (3.19) and $\alpha$ is well-defined by (3.20).

In the second step of the proof, we note that it suffices to prove (3.16) has a solution $\phi_+$, defined on $[x_0, x_0 + \alpha[$: One can then apply the time reversion Lem. 1.9(b): The proof providing the solution $\phi_+$ also provides a solution $\psi_+: [-x_0, -x_0 + \alpha[ \to \mathbb{K}^n$ to the time-reversed initial value problem, consisting of $y' = -f(-x, y)$ and $y(-x_0) = y_0$ (note that the same $M$ and $\alpha$ work for the time-reversed problem). Then, according to Lem. 1.9(b), $\phi_-: \,]x_0 - \alpha, x_0] \to \mathbb{K}^n$, $\phi_-(x) := \psi_+(-x)$, is a solution to (3.16). According to Lem. 1.7, we can patch $\phi_-$ and $\phi_+$ together to obtain the desired solution

$$\phi: I \to \mathbb{K}^n, \quad \phi(x) := \begin{cases} \phi_-(x) & \text{for } x \leq x_0, \\ \phi_+(x) & \text{for } x \geq x_0, \end{cases} \tag{3.21}$$

defined on all of $I$. It is noted that one can also conduct the proof with the second step omitted, but then one has to perform the following steps on all of $I$, which means one has to consider additional cases in some places.
In the third step of the proof, we will define a sequence $(\phi_m)_{m \in \mathbb{N}}$ of functions

$$\phi_m: I_+ \to \mathbb{K}^n, \quad I_+ := [x_0, x_0 + \alpha], \tag{3.22}$$

that constitute approximate solutions to (3.16). To begin the construction of $\phi_m$, fix $m \in \mathbb{N}$. Since $B$ is compact and $f$ is continuous, we know $f$ is even uniformly continuous on $B$ (cf. [Phi13b, Th. 3.9]). In particular,

$$\exists_{\delta_m > 0} \; \forall_{(x, y), (\bar x, \bar y) \in B}: \quad \Big( |x - \bar x| < \delta_m, \; \|y - \bar y\| < \delta_m \;\Rightarrow\; \big\|f(x, y) - f(\bar x, \bar y)\big\| < \frac{1}{m} \Big). \tag{3.23}$$

We now form what is called a discretization of the interval $I_+$, i.e. a partition of $I_+$ into sufficiently many small intervals: Let $N \in \mathbb{N}$ and

$$x_0 < x_1 < \dots < x_{N-1} < x_N := x_0 + \alpha \tag{3.24}$$

such that

$$\forall_{j \in \{1, \dots, N\}}: \quad x_j - x_{j-1} < \delta := \begin{cases} \min\{\delta_m, \delta_m/M, 1/m\} & \text{for } M > 0, \\ \min\{\delta_m, 1/m\} & \text{for } M = 0 \end{cases} \tag{3.25}$$

(for example, one could make the equidistant choice $x_j := x_0 + jh$ with $h = \alpha/N$ and $N > \alpha/\delta$, but it does not matter how the $x_j$ are defined as long as (3.24) and (3.25) both hold). Note that we get a different discretization of $I_+$ for each $m \in \mathbb{N}$; however, the dependence on $m$ is suppressed in the notation for the sake of readability. We now define recursively

$$\phi_m: I_+ \to \mathbb{K}^n, \quad \phi_m(x_0) := y_0, \quad \phi_m(x) := \phi_m(x_j) + (x - x_j)\, f\big(x_j, \phi_m(x_j)\big) \quad \text{for each } x \in [x_j, x_{j+1}]. \tag{3.26}$$

Note that there is no conflict between the two definitions given for $x = x_j$ with $j \in \{1, \dots, N-1\}$. Each function $\phi_m$ defines a polygon in $\mathbb{K}^n$. This construction is known as Euler's method and it can be used to obtain numerical approximations to the solution of the initial value problem (while simple, this method is not very efficient, though). We still need to verify that the definition (3.26) does actually make sense: We need to check that $f$ can, indeed, be applied to $(x_j, \phi_m(x_j))$, i.e. we have to check $(x_j, \phi_m(x_j)) \in G$. We can actually show the stronger statement

$$\forall_{x \in I_+}: \quad \big(x, \phi_m(x)\big) \in B, \tag{3.27}$$

where $B$ is as defined in (3.18). First, it is pointed out that (3.20) implies $\alpha \leq b$, such that $x \in I_+$ implies $|x - x_0| \leq b$ as required in (3.18). One can now prove (3.27) by showing by induction on $j \in \{0, \dots, N-1\}$:

$$\forall_{x \in [x_j, x_{j+1}]}: \quad \big(x, \phi_m(x)\big) \in B. \tag{3.28}$$

To start the induction, note $\phi_m(x_0) = y_0$ and $(x_0, y_0) \in B$ by (3.18). Now let $j \in \{0, \dots, N-1\}$ and $x \in [x_j, x_{j+1}]$. We estimate

$$\|\phi_m(x) - y_0\| \overset{(3.26)}{\leq} \|\phi_m(x) - \phi_m(x_j)\| + \sum_{k=1}^j \|\phi_m(x_k) - \phi_m(x_{k-1})\|$$
$$= (x - x_j)\,\big\|f\big(x_j, \phi_m(x_j)\big)\big\| + \sum_{k=1}^j (x_k - x_{k-1})\,\big\|f\big(x_{k-1}, \phi_m(x_{k-1})\big)\big\|$$
$$\overset{(*)}{\leq} (x - x_j)\,M + \sum_{k=1}^j (x_k - x_{k-1})\,M = (x - x_0)\,M \overset{(3.20)}{\leq} \alpha M \leq b, \tag{3.29}$$

where, at $(*)$, it was used that $(x_k, \phi_m(x_k)) \in B$ by induction hypothesis for each $k = 0, \dots, j$, and, thus, $\big\|f\big(x_k, \phi_m(x_k)\big)\big\| \leq M$ by (3.19). Estimate (3.29) completes the induction and the third step of the proof.
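The construction (3.26) is exactly Euler's method and can be implemented in a few lines; the following sketch is added for illustration (not part of the original notes; the equidistant grid and the test problem $y' = y$, $y(0) = 1$, are assumptions of this example):

```python
import numpy as np

def euler_polygon(f, x0, y0, alpha, N):
    """Euler polygon (3.26) on [x0, x0 + alpha] with N equidistant steps.

    Returns the grid points x_j and the polygon values phi_m(x_j)."""
    xs = np.linspace(x0, x0 + alpha, N + 1)
    ys = np.empty((N + 1,) + np.shape(y0))
    ys[0] = y0
    for j in range(N):
        ys[j + 1] = ys[j] + (xs[j + 1] - xs[j]) * f(xs[j], ys[j])
    return xs, ys

# Test on y' = y, y(0) = 1: the polygons converge to exp as the grid is refined.
for N in [10, 100, 1000]:
    xs, ys = euler_polygon(lambda x, y: y, 0.0, 1.0, 1.0, N)
    print(N, abs(ys[-1] - np.e))  # error at x = 1 decreases roughly like 1/N
```

Refining the grid corresponds to increasing $m$ in the proof; the uniform limit of such polygons is the solution that the Peano theorem asserts to exist.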

In the fourth step of the proof, we establish several properties of the functions m . The
first two properties are immediate from (3.26), namely that m is continuous on I + and
differentiable at each x ]xj , xj+1 [, j {0, . . . , N 1}, where


m (x) = f xj , m (xj ) .
(3.30)
j{0,...,N 1}

x]xj ,xj+1 [

The next property to establish is

s,tI+

km (t) m (s)k |t s| M.

(3.31)

To prove (3.31), we may assume s < t without loss of generality. If s, t [xj , xj+1 ],
j {0, . . . , N 1}, then
km (t) m (s)k


(3.26)
= m (xj ) + (t xj ) f xj , m (xj ) m (xj ) (s xj ) f xj , m (xj )

 (3.19)
= |t s| f xj , m (xj ) |t s| M
(3.32a)

as desired. If s, t are not contained in the same interval [xj , xj+1 ], then fix j < k such
that s [xj , xj+1 ] and t [xk , xk+1 ]. Then (3.31) follows from an estimate analogous to
the one in (3.29):
km (t) m (s)k

(3.32a)

km (s) m (xj+1 )k +

|s xj+1 | M +

|t s| M,

k1
X

l=j+1

k1
X

l=j+1

km (xl ) m (xl+1 )k + km (xk ) m (t)k

|xl xl+1 | M + |t xk | M
(3.32b)

completing the proof of (3.31). The following property of the φ_m is the justification for
calling them approximate solutions to our initial value problem (3.16):

∀_{j∈{0,...,N−1}} ∀_{x∈]x_j,x_{j+1}[}  ‖φ_m′(x) − f(x, φ_m(x))‖ < 1/m.   (3.33)

Indeed, (3.33) is a consequence of (3.23), i.e. of the uniform continuity of f on B: First,
if M = 0, then f ≡ φ_m′ ≡ 0 and there is nothing to prove. So let M > 0. If x ∈ ]x_j, x_{j+1}[,
then, according to (3.30), we have φ_m′(x) = f(x_j, φ_m(x_j)). Thus, by (3.25),

|x − x_j| < δ ≤ min{δ_m, δ_m/M}   and   ‖φ_m(x) − φ_m(x_j)‖ ≤^{(3.31)} |x − x_j| M < δ_m,   (3.34a)

and

‖φ_m′(x) − f(x, φ_m(x))‖ = ‖f(x_j, φ_m(x_j)) − f(x, φ_m(x))‖ <^{(3.34a),(3.23)} 1/m,   (3.34b)

proving (3.33).

The last property of the φ_m we need is

∀_{x∈I_+}  ‖φ_m(x)‖ ≤ ‖φ_m(x) − φ_m(x_0)‖ + ‖φ_m(x_0)‖ ≤^{(3.31)} |x − x_0| M + ‖φ_m(x_0)‖ ≤ αM + ‖y_0‖,   (3.35)

which says that the φ_m are pointwise and even uniformly bounded.

In the fifth and last step of the proof, we use the Arzelà-Ascoli Th. 3.7 to obtain a
function φ_+ : I_+ → K^n, and we show that φ_+ constitutes a solution to (3.16). According
to (3.31), the φ_m are uniformly equicontinuous (given ε > 0, condition (3.12) is satisfied
with δ := ε/M for M > 0 and arbitrary δ > 0 for M = 0), and according to (3.35)
the φ_m are bounded, such that the Arzelà-Ascoli Th. 3.7 applies to yield a subsequence
(φ_{m_j})_{j∈N} of (φ_m)_{m∈N} converging uniformly to some continuous function φ_+ : I_+ → K^n.
So it merely remains to verify that φ_+ is a solution to (3.16).

As the uniform convergence of the (φ_{m_j})_{j∈N} implies pointwise convergence, we have
φ_+(x_0) = lim_{j→∞} φ_{m_j}(x_0) = y_0, showing φ_+ satisfies the initial condition (3.16b).
Next,

∀_{x∈I_+}  (x, φ_+(x)) = lim_{j→∞} (x, φ_{m_j}(x)) ∈ B,

since each (x, φ_{m_j}(x)) is in B and B is closed. In particular, f(x, φ_+(x)) is well-defined
for each x ∈ I_+.

To prove that φ_+ also satisfies the ODE (3.16a), by Th. 1.5, it suffices to show

∀_{x∈I_+}  φ_+(x) − φ_+(x_0) − ∫_{x_0}^{x} f(t, φ_+(t)) dt = 0.   (3.36)

Fixing x ∈ I_+ and using the triangle inequality for the umpteenth time, one obtains

‖φ_+(x) − φ_+(x_0) − ∫_{x_0}^{x} f(t, φ_+(t)) dt‖
 ≤ ‖φ_+(x) − φ_{m_j}(x)‖ + ‖φ_{m_j}(x) − φ_+(x_0) − ∫_{x_0}^{x} f(t, φ_{m_j}(t)) dt‖
  + ‖∫_{x_0}^{x} ( f(t, φ_{m_j}(t)) − f(t, φ_+(t)) ) dt‖,   (3.37)

holding for every j ∈ N. We will conclude the proof by showing that all three summands
on the right-hand side of (3.37) tend to 0 for j → ∞. As already mentioned above,
the uniform convergence of the (φ_{m_j})_{j∈N} implies pointwise convergence, implying the
convergence of the first summand. We tackle the third summand next, using

‖∫_{x_0}^{x} ( f(t, φ_{m_j}(t)) − f(t, φ_+(t)) ) dt‖ ≤ | ∫_{x_0}^{x} ‖f(t, φ_{m_j}(t)) − f(t, φ_+(t))‖ dt |,   (3.38)

which holds for every norm (cf. Appendix B), but can easily be checked directly for
the 1-norm, where ‖(z_1, . . . , z_n)‖_1 := Σ_{j=1}^{n} |z_j| (exercise). Given ε > 0, the uniform
continuity of f on B provides δ > 0 such that ‖f(t, φ_{m_j}(t)) − f(t, φ_+(t))‖ < ε/α for
‖φ_{m_j}(t) − φ_+(t)‖ < δ. The uniform convergence of (φ_{m_j})_{j∈N} then yields K ∈ N such
that ‖φ_{m_j}(t) − φ_+(t)‖ < δ for every j ≥ K and each t ∈ I_+. Thus,

∀_{j≥K}  | ∫_{x_0}^{x} ‖f(t, φ_{m_j}(t)) − f(t, φ_+(t))‖ dt | ≤ |x − x_0| ε/α ≤ ε,

thereby establishing the convergence of the third summand from the right-hand side
of (3.37). For the remaining second summand, we note that the fact that each φ_m is
continuous and piecewise differentiable (with piecewise constant derivative) allows to
apply the fundamental theorem of calculus in the form [Phi13a, Th. G.6(b)] to obtain

∀_{x∈I_+}  φ_m(x) = φ_m(x_0) + ∫_{x_0}^{x} φ_m′(t) dt.   (3.39)

Using (3.39) in the second summand of the right-hand side of (3.37) (recall φ_{m_j}(x_0) =
y_0 = φ_+(x_0)) provides

‖φ_{m_j}(x) − φ_+(x_0) − ∫_{x_0}^{x} f(t, φ_{m_j}(t)) dt‖ = ‖∫_{x_0}^{x} ( φ_{m_j}′(t) − f(t, φ_{m_j}(t)) ) dt‖
 ≤^{(3.33)} | ∫_{x_0}^{x} (1/m_j) dt | ≤ α/m_j,

showing the convergence of the second summand, which finally concludes the proof. □
Corollary 3.9. If G ⊆ R × K^n is open, n ∈ N, f : G → K^n is continuous, and C ⊆ G
is compact, then there exists α > 0 such that, for each (x_0, y_0) ∈ C, the explicit n-dimensional first-order initial value problem (3.16) has a solution φ : I → K^n, defined
on the open interval I := ]x_0 − α, x_0 + α[, i.e. always on an interval of the same length
2α.

Proof. Exercise. □

Corollary 3.10. If G ⊆ R × K^{kn} is open, k, n ∈ N, and f : G → K^n is continuous,
then, for each (x_0, y_{0,0}, . . . , y_{0,k−1}) ∈ G, the explicit n-dimensional kth-order initial value
problem consisting of (1.6) and (1.7), which, for convenience, we rewrite,

y^{(k)} = f( x, y, y′, . . . , y^{(k−1)} ),   (3.40a)
∀_{j∈{0,...,k−1}}  y^{(j)}(x_0) = y_{0,j},   (3.40b)

has at least one solution. More precisely, there exists an open interval I ⊆ R with
x_0 ∈ I and φ : I → K^n such that φ is a solution to (3.40). If C ⊆ G is compact,
then there exists α > 0 such that, for each (x_0, y_{0,0}, . . . , y_{0,k−1}) ∈ C, (3.40) has a solution
φ : I → K^n, defined on the open interval I := ]x_0 − α, x_0 + α[, i.e. always on an interval
of the same length 2α.


Proof. If f is continuous, then the right-hand side of the equivalent first-order system
(3.2a) (written in explicit form) is given by the continuous function

f̃ : G → K^{kn},   f̃(x, y_1, . . . , y_k) := ( y_2, y_3, . . . , y_k, f(x, y_1, . . . , y_k) ).   (3.41)

Thus, Th. 3.8 provides a solution ψ : I → K^{kn} to (3.2) and, then, Th. 3.1(b) yields
φ := ψ_1 to be a solution to (3.40). Moreover, if C ⊆ G is compact, then Cor. 3.9 provides
α > 0 such that, for each (x_0, y_{0,0}, . . . , y_{0,k−1}) ∈ C, (3.2) has a solution ψ : I → K^{kn},
defined on the same open interval I := ]x_0 − α, x_0 + α[. In particular, φ := ψ_1, the
corresponding solution to (3.40), is also defined on the same I. □

While the Peano theorem is striking in its generality, it does have several drawbacks:
(a) the interval on which the existence of a solution is proved can be unnecessarily short;
(b) the selection of the subsequence using the Arzelà-Ascoli theorem makes the proof
nonconstructive; (c) uniqueness of solutions is not provided, even in cases where unique
solutions exist; (d) it does not provide information regarding how the solution changes
with a change of the initial condition. We will subsequently address all these points,
namely (b) and (c) in Sec. 3.3 (we will see that the proof of the Peano theorem becomes
constructive in situations where the solution is unique; in general, a constructive proof
is not available), (a) in Sec. 3.4, and (d) in Sec. 3.5.

3.3 Uniqueness of Solutions

Example 1.4(b) shows that the hypotheses of the Peano Th. 3.8 are not strong enough
to guarantee the initial value problem (3.16) has a unique solution, not even in some
neighborhood of x_0. The additional condition that will yield uniqueness is local Lipschitz
continuity of f with respect to y.
Definition 3.11. Let m, n ∈ N, G ⊆ R × K^m, and f : G → K^n.

(a) The function f is called (globally) Lipschitz continuous or just (globally) Lipschitz
with respect to y if, and only if,

∃_{L≥0} ∀_{(x,y),(x,ȳ)∈G}  ‖f(x, y) − f(x, ȳ)‖ ≤ L ‖y − ȳ‖.   (3.42)

(b) The function f is called locally Lipschitz continuous or just locally Lipschitz with
respect to y if, and only if, for each (x_0, y_0) ∈ G, there exists a (relative) open set
U ⊆ G such that (x_0, y_0) ∈ U (i.e. U is a (relative) open neighborhood of (x_0, y_0))
and f is Lipschitz continuous with respect to y on U, i.e. if, and only if,

∀_{(x_0,y_0)∈G} ∃_{(x_0,y_0)∈U⊆G open} ∃_{L≥0} ∀_{(x,y),(x,ȳ)∈U}  ‖f(x, y) − f(x, ȳ)‖ ≤ L ‖y − ȳ‖.   (3.43)


The number L occurring in (a),(b) is called a Lipschitz constant. The norms on K^m and
K^n in (a),(b) are arbitrary. If one changes the norms, then one will, in general, change
L, but not the property of f being (locally) Lipschitz.

Caveat 3.12. It is emphasized that f : G → K^n, (x, y) ↦ f(x, y), being Lipschitz
with respect to y does not imply f to be continuous: Indeed, if I ⊆ R, ∅ ≠ A ⊆ K^m,
and g : I → K^n is an arbitrary discontinuous function, then f : I × A → K^n,
f(x, y) := g(x), is not continuous, but satisfies (3.42) with L = 0.

While the local neighborhoods U, where a function locally Lipschitz (with respect to y)
is actually Lipschitz continuous (with respect to y), can be very small, we will now show
that a continuous function is locally Lipschitz (with respect to y) on G if, and only if,
it is Lipschitz continuous (with respect to y) on every compact set K ⊆ G.

Proposition 3.13. Let m, n ∈ N, G ⊆ R × K^m, and f : G → K^n be continuous.
Then f is locally Lipschitz with respect to y if, and only if, f is (globally) Lipschitz with
respect to y on every compact subset K of G.
Proof. First, assume f is not locally Lipschitz with respect to y. Then there exists
(x_0, y_0) ∈ G such that

∀_{N∈N} ∃_{(x_N,y_{N,1}),(x_N,y_{N,2}) ∈ G∩B_{1/N}(x_0,y_0)}  ‖f(x_N, y_{N,1}) − f(x_N, y_{N,2})‖ > N ‖y_{N,1} − y_{N,2}‖.   (3.44)

The set

K := {(x_0, y_0)} ∪ { (x_N, y_{N,j}) : N ∈ N, j ∈ {1, 2} }

is clearly a compact subset of G (e.g. by the Heine-Borel property of compact sets (see
Th. C.19), since every open set containing (x_0, y_0) must contain all, but finitely many,
of the elements of K). Due to (3.44), f is not (globally) Lipschitz with respect to y on
the compact set K (so, actually, continuity of f was not used for this direction).
Conversely, assume f to be locally Lipschitz with respect to y, and consider a compact
subset K of G. Then, for each (x, y) ∈ K, there is some (relatively) open U_{(x,y)} ⊆ G
with (x, y) ∈ U_{(x,y)} and such that f is Lipschitz with respect to y in U_{(x,y)}. By the
Heine-Borel property of compact sets (see Th. C.19), there are finitely many U_1 :=
U_{(x_1,y_1)}, . . . , U_N := U_{(x_N,y_N)}, N ∈ N, such that

K ⊆ ⋃_{j=1}^{N} U_j.   (3.45)

For each j = 1, . . . , N, let L_j denote the Lipschitz constant for f on U_j and set L̃ :=
max{L_1, . . . , L_N}. As f is assumed continuous and K is compact, we have

M := max{ ‖f(x, y)‖ : (x, y) ∈ K } < ∞.   (3.46)


Using the compactness of K once again, there exists a Lebesgue number δ > 0 for the
open cover (U_j)_{j∈{1,...,N}} of K (cf. Th. C.21), i.e. δ > 0 such that

∀_{(x,y),(x,ȳ)∈K}  ( ‖y − ȳ‖ < δ  ⇒  ∃_{j∈{1,...,N}}  {(x, y), (x, ȳ)} ⊆ U_j ).   (3.47)

Define L := max{L̃, 2M/δ}. Then, for every (x, y), (x, ȳ) ∈ K:

‖y − ȳ‖ < δ  ⇒  ‖f(x, y) − f(x, ȳ)‖ ≤ L_j ‖y − ȳ‖ ≤ L ‖y − ȳ‖,   (3.48a)
‖y − ȳ‖ ≥ δ  ⇒  ‖f(x, y) − f(x, ȳ)‖ ≤ 2M = (2M/δ) δ ≤ L ‖y − ȳ‖,   (3.48b)

completing the proof that f is Lipschitz with respect to y on K. □

While, in general, the assertion of Prop. 3.13 becomes false if the continuity of f is omitted, for convex G, it does hold without the continuity assumption on f (see Appendix
D). The following Prop. 3.14 provides a useful sufficient condition for f : G → K^n,
G ⊆ R × K^m open, to be locally Lipschitz with respect to y:

Proposition 3.14. Let m, n ∈ N, let G ⊆ R × K^m be open, and f : G → K^n. A
sufficient condition for f to be locally Lipschitz with respect to y is f being continuously
(real) differentiable with respect to y, i.e., f is locally Lipschitz with respect to y provided
that all partials ∂_{y_k} f_l; k = 1, . . . , m, l = 1, . . . , n (∂_{y_{k,1}} f_l, ∂_{y_{k,2}} f_l for K = C), exist and are continuous.
Proof. We consider the case K = R; the case K = C is included by using the identifications C^m ≅ R^{2m} and C^n ≅ R^{2n}. Given (x_0, y_0) ∈ G, we have to show f is Lipschitz
with respect to y on some open set U ⊆ G with (x_0, y_0) ∈ U. As in the Peano Th. 3.8,
since G is open,

∃_{b>0}  B := { (x, y) ∈ R × R^m : |x − x_0| ≤ b and ‖y − y_0‖_1 ≤ b } ⊆ G,

where ‖·‖_1 denotes the 1-norm on R^m. Since the ∂_{y_k} f_l, (k, l) ∈ {1, . . . , m} × {1, . . . , n},
are all continuous on the compact set B,

M := max{ |∂_{y_k} f_l(x, y)| : (x, y) ∈ B, (k, l) ∈ {1, . . . , m} × {1, . . . , n} } < ∞.   (3.49)

Applying the mean value theorem (cf. [Phi13b, Th. 2.32]) to the n components of the
function

f_x : { y ∈ R^m : (x, y) ∈ B } → R^n,   f_x(y) := f(x, y),

we obtain η_1, . . . , η_n ∈ R^m such that

∀_{(x,y),(x,ȳ)∈B}  f_l(x, y) − f_l(x, ȳ) = Σ_{k=1}^{m} ∂_{y_k} f_l(x, η_l)(y_k − ȳ_k)   (3.50)

and, thus,

∀_{(x,y),(x,ȳ)∈B}  ‖f(x, y) − f(x, ȳ)‖_1 = Σ_{l=1}^{n} |f_l(x, y) − f_l(x, ȳ)|
 ≤^{(3.49),(3.50)} Σ_{l=1}^{n} Σ_{k=1}^{m} M |y_k − ȳ_k| = Σ_{l=1}^{n} M ‖y − ȳ‖_1 = nM ‖y − ȳ‖_1,   (3.51)

i.e. f is Lipschitz with respect to y on B (where

{ (x, y) ∈ R × R^m : |x − x_0| < b and ‖y − y_0‖_1 < b } ⊆ B

is an open neighborhood of (x_0, y_0)), showing f is locally Lipschitz with respect to y. □
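As a simple illustration of Prop. 3.14 (the concrete function is our choice): for f : R × R → R, f(x, y) := x sin y, the partial ∂_y f(x, y) = x cos y exists and is continuous, so f is locally Lipschitz with respect to y. Explicitly, on B := { (x, y) : |x − x_0| ≤ b, |y − y_0| ≤ b }, (3.49) yields M ≤ |x_0| + b, and (3.51) provides the Lipschitz constant L = nM = |x_0| + b with respect to y on B (here n = m = 1).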

Theorem 3.15. If G ⊆ R × K^n is open, n ∈ N, and f : G → K^n is continuous and
locally Lipschitz with respect to y, then, for each (x_0, y_0) ∈ G, the explicit n-dimensional
first-order initial value problem

y′ = f(x, y),   (3.52a)
y(x_0) = y_0,   (3.52b)

has a unique solution. More precisely, if I ⊆ R is an open interval and φ, ψ : I → K^n
are both solutions to (3.52a), then φ(x_0) = ψ(x_0) for one x_0 ∈ I implies φ(x) = ψ(x)
for all x ∈ I:

∀_{x_0∈I}  ( φ(x_0) = ψ(x_0)  ⇒  ∀_{x∈I}  φ(x) = ψ(x) ).   (3.53)

Proof. We first show that φ and ψ must agree in a small neighborhood of x_0:

∃_{ε>0} ∀_{x∈]x_0−ε,x_0+ε[}  φ(x) = ψ(x).   (3.54)

Since f is continuous and both φ and ψ are solutions to the initial value problem (3.52),
we can use Th. 1.5 to obtain

∀_{x∈I}  φ(x) − ψ(x) = ∫_{x_0}^{x} ( f(t, φ(t)) − f(t, ψ(t)) ) dt.   (3.55)

As f is locally Lipschitz with respect to y, there exists δ > 0 such that f is Lipschitz
with Lipschitz constant L ≥ 0 with respect to y on

U := { (x, y) ∈ G : |x − x_0| < δ, ‖y − y_0‖ < δ },

where we have chosen some arbitrary norm ‖·‖ on K^n (we may, and do, assume L > 0,
since enlarging a Lipschitz constant does no harm). The continuity of φ, ψ implies the
existence of λ > 0 such that B_λ(x_0) ⊆ I, φ(B_λ(x_0)) ⊆ B_δ(y_0) and ψ(B_λ(x_0)) ⊆ B_δ(y_0),
implying

∀_{x∈B_λ(x_0)}  ‖f(x, φ(x)) − f(x, ψ(x))‖ ≤ L ‖φ(x) − ψ(x)‖.   (3.56)

Next, define

ε := min{λ, 1/(2L)}

and, using the compactness of B̄_ε(x_0) = [x_0 − ε, x_0 + ε] plus the continuity of φ, ψ,

M := max{ ‖φ(x) − ψ(x)‖ : x ∈ B̄_ε(x_0) } < ∞.

From (3.55) and (3.56), we obtain

∀_{x∈B̄_ε(x_0)}  ‖φ(x) − ψ(x)‖ ≤ L | ∫_{x_0}^{x} ‖φ(t) − ψ(t)‖ dt | ≤ L |x − x_0| M ≤ M/2   (3.57)

(note that the integral in (3.57) can be negative for x < x_0). The definition of M
together with (3.57) yields M ≤ M/2, i.e. M = 0, finishing the proof of (3.54).
To prove φ(x) = ψ(x) for each x ≥ x_0, let

s := sup{ τ ∈ I : φ(x) = ψ(x) for each x ∈ [x_0, τ] }.

One needs to show s = sup I. If s = sup I does not hold, then there exists ε > 0 such
that [s, s + ε] ⊆ I. Then the continuity of φ, ψ implies φ(s) = ψ(s), i.e. φ and ψ satisfy
the same initial value problem at s, such that (3.54) must hold with s instead of x_0, in
contradiction to the definition of s. Finally, φ(x) = ψ(x) for each x ≤ x_0 follows in a
completely analogous fashion, which concludes the proof of the theorem. □

Corollary 3.16. If G ⊆ R × K^{kn} is open, k, n ∈ N, and f : G → K^n is continuous and
locally Lipschitz with respect to y, then, for each (x_0, y_{0,0}, . . . , y_{0,k−1}) ∈ G, the explicit
n-dimensional kth-order initial value problem consisting of (1.6) and (1.7), i.e.

y^{(k)} = f( x, y, y′, . . . , y^{(k−1)} ),
∀_{j∈{0,...,k−1}}  y^{(j)}(x_0) = y_{0,j},

has a unique solution. More precisely, if I ⊆ R is an open interval and φ, ψ : I → K^n
are both solutions to (1.6), then

∀_{j∈{0,...,k−1}}  φ^{(j)}(x_0) = ψ^{(j)}(x_0)   (3.58)

holding for one x_0 ∈ I implies φ(x) = ψ(x) for all x ∈ I.

Proof. Exercise. □

Remark 3.17. According to Th. 3.15, the condition of f being continuous and locally
Lipschitz with respect to y is sufficient for each initial value problem (3.52) to have a
unique solution. However, this condition is not necessary: It is an exercise to show that
the continuous function

f : R² → R,   f(x, y) := { 1 for y ≤ 0,   1 + √y for y ≥ 0 }   (3.59)

is not locally Lipschitz with respect to y, but that, for each (x_0, y_0) ∈ R², the initial
value problem (3.52) still has a unique solution in the sense that (3.53) holds for each
solution to (3.52a). And one can (can you?) even find simple examples of f being
defined on an open domain such that f is discontinuous at every point in its domain
and every initial value problem (3.52) still has a unique solution.

At the end of Sec. 3.2, it was pointed out that the proof of the Peano Th. 3.8 is nonconstructive due to the selection of a subsequence. The following Th. 3.18 shows that,
whenever the initial value problem has a unique solution, it becomes unnecessary to
select a subsequence, and the construction procedure (namely Euler's method) used in
the proof of Th. 3.8 becomes an effective (if not necessarily efficient) numerical approximation procedure for the unique solution.


Theorem 3.18. Consider the situation of the Peano Th. 3.8. Under the additional
assumption that the solution φ to the explicit n-dimensional first-order initial value problem (3.16) is unique on some interval J ⊆ [x_0, x_0 + α[, x_0 ∈ J, where α > 0 is
constructed as in Th. 3.8 (i.e. given by (3.18)–(3.20)), every sequence (φ_m)_{m∈N} of functions defined on J according to Euler's method as in the proof of Th. 3.8 (i.e. defined
as in (3.26)) converges uniformly to the unique solution φ : J → K^n. An analogous
statement also holds for J ⊆ ]x_0 − α, x_0], x_0 ∈ J.

Proof. Seeking a contradiction, assume (φ_m)_{m∈N} does not converge uniformly to the
unique solution φ. Then there exists ε > 0 and a subsequence (φ_{m_j})_{j∈N} such that

∀_{j∈N}  ‖φ_{m_j} − φ‖_{sup} = sup{ ‖φ_{m_j}(x) − φ(x)‖ : x ∈ J } ≥ ε.   (3.60)

However, as a subsequence, (φ_{m_j})_{j∈N} still has all the properties of the (φ_m)_{m∈N} (namely
pointwise boundedness, uniform equicontinuity, piecewise differentiability, being approximate solutions according to (3.33)) that guaranteed the existence of a subsequence
converging to a solution. Thus, since the solution is unique on J, (φ_{m_j})_{j∈N} must, in
turn, have a subsequence converging uniformly to φ, which is in contradiction to (3.60).
This shows the assumption that (φ_m)_{m∈N} does not converge uniformly to φ must have
been false. The proof of the analogous statement for J ⊆ ]x_0 − α, x_0], x_0 ∈ J, one obtains,
e.g., via time reversion (cf. the second step of the proof of Th. 3.8). □
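The convergence asserted by Th. 3.18 can also be observed numerically; the following sketch (our illustration, reusing the euler_polygon function from the sketch after (3.26)) applies Euler's method to y′ = y, y(0) = 1, whose solution e^x is unique since the right-hand side is Lipschitz, and prints the sup-error on [0, 1] for successively finer grids:

import numpy as np

for N in (10, 100, 1000):
    xs, ys = euler_polygon(lambda x, y: y, 0.0, 1.0, 1.0, N)
    sup_err = float(np.max(np.abs(ys - np.exp(xs))))
    print(N, sup_err)   # the sup-error tends to 0 as the grid is refined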

Remark 3.19. The argument used to prove Th. 3.18 is of a rather general nature: It
can be applied whenever a sequence is known to have a subsequence converging to some
solution of some equation (or some other problem), provided the same still holds for
every subsequence of the original sequence; in that case, the additional knowledge that
the solution is unique implies the convergence of the original sequence without the need
to select a subsequence.

3.4 Extension of Solutions, Maximal Solutions

The Peano Th. 3.8 and Cor. 3.10 show the existence of local solutions to explicit initial
value problems, i.e. the solutions' existence is proved on some, possibly small, interval
containing the initial point x_0. In the current section, we will address the question in
which circumstances such local solutions can be extended, we will prove the existence
of maximal solutions (solutions that cannot be extended), and we will learn how such
maximal solutions can be identified.
Definition 3.20. Let φ : I → K^n, n ∈ N, be a solution to some ODE (such as (1.6)
or (1.4) in the most general case), defined on some open interval I ⊆ R.

(a) We say φ has an extension or continuation to the right (resp. to the left) if, and
only if, there exists a solution ψ : J → K^n to the same ODE, defined on some
open interval J ⊇ I, such that ψ↾_I = φ and

sup J > sup I   (resp. inf J < inf I).   (3.61)

An extension or continuation of φ is a function ψ that is an extension to the right or
an extension to the left (or both).

(b) The solution φ is called a maximal solution if, and only if, it does not admit any
extensions in the sense of (a) (note that we require maximal solutions to be defined
on open intervals, cf. Appendix E).
Remark 3.21. As an immediate consequence of the time reversion Lem. 1.9(b), a
solution φ : I → K^n, n ∈ N, to (1.6), defined on some open interval I ⊆ R, has an
extension to the right (resp. to the left) if, and only if, ψ : (−I) → K^n, ψ(x) := φ(−x),
(solution to y′ = −f(−x, y)) has an extension to the left (resp. to the right).

The existence of maximal solutions is not trivial: a priori, it could be that every solution
had an extension (analogous to the fact that to every x ∈ [0, 1[ (or every x ∈ R) there
is some bigger element in [0, 1[ (respectively in R)).
Theorem 3.22. Every solution φ_0 : I_0 → K^n to (1.4) (resp. to (1.6)), defined on an
open interval I_0 ⊆ R, can be extended to a maximal solution of (1.4) (resp. of (1.6)).

Proof. The proof is carried out for solutions to (1.4) (the implicit ODE); the proof for
solutions to the explicit ODE (1.6) is analogous and can also be seen as a special case.
The idea is to apply Zorn's lemma. To this end, define a partial order ≤ on the set

S := {(I_0, φ_0)} ∪ { (I, φ) : φ : I → K^n is a solution to (1.4), extending φ_0 }   (3.62)

by letting

(I, φ) ≤ (J, ψ)  :⇔  I ⊆ J and ψ↾_I = φ.   (3.63)

Every chain C, i.e. every totally ordered subset of S, has an upper bound, namely
(I_C, φ_C) with I_C := ⋃_{(I,φ)∈C} I and φ_C(x) := φ(x), where (I, φ) ∈ C is chosen such that
x ∈ I (since C is a chain, the value of φ_C(x) does not actually depend on the choice of
(I, φ) ∈ C and is, thus, well-defined).

Clearly, I_C is an open interval, I_0 ⊆ I_C, and φ_C extends φ_0 as a function; we still need to
see that φ_C is a solution to (1.4). For this, we, once again, use that x ∈ I_C means there
exists (I, φ) ∈ C such that x ∈ I and φ is a solution to (1.4). Thus, using the notation
from Def. 1.2(a),

( x, φ_C(x), φ_C′(x), . . . , φ_C^{(k)}(x) ) = ( x, φ(x), φ′(x), . . . , φ^{(k)}(x) ) ∈ U

and

F( x, φ_C(x), φ_C′(x), . . . , φ_C^{(k)}(x) ) = F( x, φ(x), φ′(x), . . . , φ^{(k)}(x) ) = 0,

showing φ_C is a solution to (1.4) as defined in Def. 1.2(a). In particular, (I_C, φ_C) ∈ S. To
verify (I_C, φ_C) is an upper bound for C, note that the definition of (I_C, φ_C) immediately
implies I ⊆ I_C for each (I, φ) ∈ C and φ_C↾_I = φ for each (I, φ) ∈ C.

To conclude the proof, we note that all hypotheses of Zorn's lemma have been verified,
such that it yields the existence of a maximal element (I_max, φ_max) ∈ S, i.e. φ_max :
I_max → K^n must be a maximal solution extending φ_0. □



Proposition 3.23. Let k, n ∈ N. Given G ⊆ R × K^{kn} open and f : G → K^n
continuous, if φ : I → K^n is a solution to (1.6) such that I = ]a, b[, −∞ ≤ a < b, b < ∞
(resp. −∞ < a), then φ has an extension to the right (resp. to the left) if, and only if,

∃_{(b,η_0,...,η_{k−1})∈G}  lim_{x↑b} ( φ(x), φ′(x), . . . , φ^{(k−1)}(x) ) = (η_0, . . . , η_{k−1})   (3.64a)

( resp. ∃_{(a,η_0,...,η_{k−1})∈G}  lim_{x↓a} ( φ(x), φ′(x), . . . , φ^{(k−1)}(x) ) = (η_0, . . . , η_{k−1}) ).   (3.64b)

Proof. That the respective part of (3.64) is necessary for the existence of the respective
extension is immediate from the fact that, for each solution to (1.6), the solution and
all its derivatives up to order k−1 must exist and must be continuous.

We now prove that (3.64a) is also sufficient for the existence of an extension to the right
(the sufficiency of (3.64b) for the existence of an extension to the left is then immediate
from Rem. 3.21). So assume (3.64a) to hold and consider the initial value problem
consisting of (1.6) and the initial conditions

∀_{j=0,...,k−1}  y^{(j)}(b) = η_j.

By Cor. 3.10, there must exist ε > 0 such that this initial value problem has a solution
ψ : ]b − ε, b + ε[ → K^n. We now show that φ extended to b via (3.64a) is still a solution to
(1.6). First note the mean value theorem (cf. [Phi13a, Th. 9.17]) yields that φ^{(j)}(b) = η_j
exists for j = 1, . . . , k−1 as a left-hand derivative. Moreover,

lim_{x↑b} φ^{(k)}(x) = lim_{x↑b} f( x, φ(x), φ′(x), . . . , φ^{(k−1)}(x) ) = f(b, η_0, . . . , η_{k−1}),

showing φ^{(k)}(b) = f( b, φ(b), φ′(b), . . . , φ^{(k−1)}(b) ) (again employing the mean value theorem), which proves φ extended to b is a solution to (1.6). Finally, Lem. 1.7 ensures

χ : ]a, b + ε[ → K^n,   χ(x) := { φ(x) for x ≤ b,   ψ(x) for x ≥ b },

is a solution to (1.6) that extends φ to the right. □

Proposition 3.24. Let k, n ∈ N, let G ⊆ R × K^{kn} be open, let f : G → K^n be
continuous, and let φ : I → K^n be a solution to (1.6) defined on the open interval I.
Consider x_0 ∈ I and let gr_+(φ) (resp. gr_−(φ)) denote the graph of (φ, . . . , φ^{(k−1)}) for
x ≥ x_0 (resp. for x ≤ x_0):

gr_+(φ) := gr_+(φ, x_0) := { ( x, φ(x), . . . , φ^{(k−1)}(x) ) ∈ G : x ∈ I, x ≥ x_0 },   (3.65a)
gr_−(φ) := gr_−(φ, x_0) := { ( x, φ(x), . . . , φ^{(k−1)}(x) ) ∈ G : x ∈ I, x ≤ x_0 }.   (3.65b)

If there exists a compact set K ⊆ G such that gr_+(φ) ⊆ K (resp. gr_−(φ) ⊆ K), then φ
has an extension ψ : J → K^n to the right (resp. to the left) such that

∃_{x̃∈J}  ( x̃, ψ(x̃), . . . , ψ^{(k−1)}(x̃) ) ∉ K.   (3.66)

The statement can be rephrased by saying that gr_+(φ) (resp. gr_−(φ)) of each maximal
solution φ to (1.6) escapes from every compact subset of G when x approaches the right
(resp. the left) boundary of I (where the boundary of I can contain −∞ and/or +∞).


Proof. We conduct the proof for extensions to the right; extensions to the left can be
handled completely analogously (alternatively, one can apply the time reversion Lem.
1.9(b) as demonstrated in the last paragraph of the proof below). The proof for extensions to the right is divided into three steps. Let K ⊆ G be compact.

Step 1: We show that gr_+(φ) ⊆ K implies φ has an extension to the right: Since K is
bounded, so is gr_+(φ), implying

b := sup I < ∞   (3.67)

as well as

M_1 := sup{ ‖φ^{(j)}(x)‖ : j ∈ {0, . . . , k−1}, x ∈ [x_0, b[ } < ∞.

In the usual way, K compact and f continuous imply

M_2 := max{ ‖f(x, y)‖ : (x, y) ∈ K } < ∞.

Set

M := max{M_1, M_2}.

According to Prop. 3.23, we need to show (3.64a) holds. To this end, notice

∀_{j=0,...,k−1} ∀_{x,x̄∈[x_0,b[}  ‖φ^{(j)}(x) − φ^{(j)}(x̄)‖ ≤ M |x − x̄| :   (3.68)

Indeed,

∀_{x,x̄∈[x_0,b[}  ‖φ^{(k−1)}(x) − φ^{(k−1)}(x̄)‖ = ‖ ∫_{x̄}^{x} f( t, φ(t), . . . , φ^{(k−1)}(t) ) dt ‖ ≤ M |x − x̄|

and, for 0 ≤ j < k−1,

∀_{x,x̄∈[x_0,b[}  ‖φ^{(j)}(x) − φ^{(j)}(x̄)‖ = ‖ ∫_{x̄}^{x} φ^{(j+1)}(t) dt ‖ ≤ M |x − x̄|,

proving (3.68). Since K is compact, there exists a sequence (x_m)_{m∈N} in [x_0, b[ such that

∃_{(b,η_0,...,η_{k−1})∈K}  lim_{m→∞} ( x_m, φ(x_m), φ′(x_m), . . . , φ^{(k−1)}(x_m) ) = (b, η_0, . . . , η_{k−1}).   (3.69)

Using x̄ := x_m in (3.68) yields, for m → ∞,

∀_{j=0,...,k−1} ∀_{x∈[x_0,b[}  ‖φ^{(j)}(x) − η_j‖ ≤ M |x − b|,

implying

∀_{j=0,...,k−1}  lim_{x↑b} φ^{(j)}(x) = η_j,

i.e. (3.64a) holds, completing the proof of Step 1.


Step 2: We show that gr_+(φ) ⊆ K implies φ can be extended to the right to I ∪ ]x_0, b+α[,
where α > 0 does not depend on b := sup I: Since K is compact, Cor. 3.9 guarantees
every initial value problem

y^{(k)} = f( x, y, y′, . . . , y^{(k−1)} ),   (3.70a)
∀_{j=0,...,k−1}  y^{(j)}(ξ_0) = y_{0,j},   (ξ_0, y_0) ∈ K,   (3.70b)

has a solution defined on ]ξ_0 − α, ξ_0 + α[ with the same α > 0. As shown in Step
1, the solution φ can be extended into b = sup I such that it satisfies (3.70b) with
(ξ_0, y_0) = (b, η) ∈ K. Thus, using Lem. 1.7, it can be pieced together with the solution
to (3.70) given on [b, b + α[ by Cor. 3.9, completing the proof of Step 2.

Step 3: We finally show that gr_+(φ) ⊆ K implies φ has an extension ψ : J → K^n
to the right such that (3.66) holds: We set a := inf I and ψ_0 := φ. Then, by Step 2,
ψ_0 has an extension ψ_1 defined on ]a, b + α[. Inductively, for each m ≥ 1, either there
exists m_0 ≤ m such that ψ_{m_0} : ]a, b + m_0 α[ → K^n is an extension of φ that can be
used as ψ to conclude the proof (i.e. ψ := ψ_{m_0} satisfies (3.66)) or ψ_m can, once more,
be extended to ]a, b + (m+1)α[. As K is bounded, { x ≥ x_0 : (x, y) ∈ K } ⊆ R must
also be bounded, say by β. Thus, (3.66) must be satisfied for some ψ := ψ_m with
1 ≤ m ≤ (β − x_0)/α.

As mentioned above, one can argue completely analogously to the above proof to obtain
that gr_−(φ) ⊆ K implies φ to have an extension to the left, satisfying (3.66). Here we
show how one, alternatively, can use the time reversion Lem. 1.9 to this end: Consider
the map

h : R × K^{kn} → R × K^{kn},   h(x, y_1, . . . , y_k) := ( −x, y_1, −y_2, . . . , (−1)^{k−1} y_k ),

which clearly constitutes an R-linear isomorphism. Noting (1.6) and (1.32) are the same,
we consider the time-reversed version (1.33) and observe G_g = h(G) to be open, h(K) ⊆
G_g to be compact, and g : G_g → K^n, g = (−1)^k (f ∘ h), to be continuous. If gr_−(φ, x_0) ⊆ K,
then gr_+(ψ, −x_0) ⊆ h(K), where ψ is the solution to the time-reversed version (1.33),
given by Lem. 1.9(b). Then ψ has an extension to the right, satisfying (3.66) with φ
replaced by ψ and K replaced by h(K). Then, by Rem. 3.21, φ must have an extension
to the left, satisfying (3.66) with ψ replaced by φ. □



In Th. 3.28 below, we will show that, for continuous f : G Kn , each maximal
solution to (1.6) must go to the boundary of G in the sense of the following definition.
Definition 3.25. Let k, n ∈ N, let G ⊆ R × K^{kn} be open, let f : G → K^n, and let
φ : ]a, b[ → K^n, −∞ ≤ a < b ≤ ∞, be a solution to (1.6). We say that the solution φ
goes to the boundary of G for x → b (resp. for x → a) if, and only if,

∀_{K⊆G compact} ∃_{x_0∈]a,b[}  gr_+(φ, x_0) ∩ K = ∅   (resp. gr_−(φ, x_0) ∩ K = ∅),   (3.71)

where gr_+(φ, x_0) and gr_−(φ, x_0) are defined as in (3.65) (with I = ]a, b[). In other words,
φ goes to the boundary of G for x → b (resp. for x → a) if, and only if, the graph of
(φ, . . . , φ^{(k−1)}) escapes every compact subset K of G forever for x → b (resp. for x → a).


Proposition 3.26. In the situation of Def. 3.25, if the solution φ goes to the boundary
of G for x → b, then one of the following conditions must hold:

(i) b = ∞,

(ii) b < ∞ and L := lim sup_{x↑b} ‖( φ(x), . . . , φ^{(k−1)}(x) )‖ = ∞,

(iii) b < ∞, L < ∞ (L as defined in (ii)), G ≠ R × K^{kn} (i.e. ∂G ≠ ∅), and

lim_{x↑b} dist( ( x, φ(x), . . . , φ^{(k−1)}(x) ), ∂G ) = 0.   (3.72)

An analogous statement is valid for the solution φ going to the boundary of G for x → a.
Proof. The proof is carried out for x → b; the proof for x → a is analogous.

Assume (i) – (iii) are all false. Choose c ∈ ]a, b[. Since (i) and (ii) are false,

∃_{0≤M<∞} ∀_{x∈[c,b[}  ‖( φ(x), . . . , φ^{(k−1)}(x) )‖ ≤ M.

If (iii) is false because G = R × K^{kn}, then K := { (x, y) ∈ R × K^{kn} : x ∈ [c, b], ‖y‖ ≤ M }
is a compact subset of G that shows (3.71) does not hold. In the only remaining case,
(iii) must be false, since (3.72) does not hold. Thus,

∃_{ε>0} ∀_{x_0∈]a,b[} ∃_{x_1∈]x_0,b[}  dist( ( x_1, φ(x_1), . . . , φ^{(k−1)}(x_1) ), ∂G ) ≥ ε.

Clearly, the set

A := { (x, y) ∈ G : dist( (x, y), ∂G ) ≥ ε }

is closed (e.g. as the distance function d : R × K^{kn} → R_0^+, d(·) := dist(·, ∂G), is
continuous (see Th. C.4) and A = ( d^{−1}[ε, ∞[ ) ∩ (G ∪ ∂G)). In consequence, K ∩ A with
K as defined above is a compact subset of G that shows (3.71) does not hold. □

Remark 3.27. (a) Examples such as the second ODE of Ex. 3.30(b) below show that
the lim sup in Prop. 3.26(ii) cannot be replaced with a lim.

(b) If f : G → K^n is continuous, then the three conditions of Prop. 3.26 are also
sufficient for φ to go to the boundary of G (cf. Cor. 3.29 below).

(c) For discontinuous f : G → K^n, in general, (ii) of Prop. 3.26 is no longer sufficient
for φ to go to the boundary of G, as is shown by simple examples, whereas (i) and (iii)
remain sufficient, even for discontinuous f (exercise). Similarly, simple examples
show Prop. 3.24 becomes false without the assumption of f being continuous; and
it can also happen that a maximal solution escapes every compact set, but still does
not go to the boundary of G (exercise).
Theorem 3.28. In the situation of Def. 3.25, if f : G → K^n is continuous and
φ : ]a, b[ → K^n is a maximal solution to (1.6), then φ must go to the boundary of G for
both x → a and x → b, i.e., for both x → a and x → b, it must escape every compact
subset K of G forever and it must satisfy one of the conditions specified in Prop. 3.26
(and one of the analogous conditions for x → a).


Proof. We carry out the proof for x → b; the proof for x → a can be done analogously
or by applying the time reversion Lem. 1.9, as indicated at the end of the proof below.

Let φ : ]a, b[ → K^n be a maximal solution to (1.6). Seeking a contradiction, we assume
φ does not go to the boundary of G for x → b, i.e. (3.71) does not hold and there exists
a compact subset K of G and a strictly increasing sequence (x_m)_{m∈N} in ]a, b[ such that
lim_{m→∞} x_m = b < ∞ and

∀_{m∈N}  ( x_m, φ(x_m), . . . , φ^{(k−1)}(x_m) ) ∈ K.   (3.73)

We now define C to be another compact subset of G that is strictly between K and G,
i.e. K ⊊ C ⊊ G: More precisely, we choose r > 0 such that

C := { (x, y) ∈ R × K^{kn} : dist( (x, y), K ) ≤ r } ⊆ G,

where

dist( (x, y), K ) = inf{ ‖(x, y) − (x̄, ȳ)‖_2 : (x̄, ȳ) ∈ K },

‖·‖_2 denoting the Euclidean norm on R^{kn+1} for K = R and the Euclidean norm on
R^{2kn+1} for K = C (this choice of norm is different from previous choices and will be
convenient later during the current proof). As φ is a maximal solution, Prop. 3.24
guarantees the existence of another strictly increasing sequence (ξ_m)_{m∈N} in ]a, b[ such
that lim_{m→∞} ξ_m = b < ∞, x_1 < ξ_1 < x_2 < ξ_2 < . . . (i.e. x_m < ξ_m < x_{m+1} for each
m ∈ N) and such that

∀_{m∈N}  ( ξ_m, φ(ξ_m), . . . , φ^{(k−1)}(ξ_m) ) ∉ C.

Noting ( x_m, φ(x_m), . . . , φ^{(k−1)}(x_m) ) ∈ K by (3.73) and K ⊆ C, define

∀_{m∈N}  s_m := sup{ s ≥ x_m : ( x, φ(x), . . . , φ^{(k−1)}(x) ) ∈ C for each x ∈ [x_m, s] }.

By the definition of s_m as a sup, s_m ≤ ξ_m < x_{m+1} < b < ∞, and by the continuity of the
distance function d : R × K^{kn} → R_0^+, d(·) := dist(·, K) (see Th. C.4), one obtains

∀_{m∈N}  dist( ( s_m, φ(s_m), . . . , φ^{(k−1)}(s_m) ), K ) = r,

in particular,

∀_{m∈N} ∀_{x∈[x_m,s_m]}  ( x, φ(x), . . . , φ^{(k−1)}(x) ) ∈ C   (3.74)

and

∀_{m∈N}  ‖ ( x_m, φ(x_m), . . . , φ^{(k−1)}(x_m) ) − ( s_m, φ(s_m), . . . , φ^{(k−1)}(s_m) ) ‖_2 ≥ r.   (3.75)

We use the boundedness of the compact set C and (3.74) to provide

M_1 := sup{ ‖( φ(x), . . . , φ^{(k−1)}(x) )‖_2 : x ∈ [x_m, s_m], m ∈ N } < ∞,
M_2 := max{ ‖f(x, y)‖_2 : (x, y) ∈ C } < ∞

(as C is compact and f continuous),

M := max{M_1, M_2}.
We now notice that each function

J_m : [x_m, s_m] → R × K^{kn},   J_m(x) := ( x, φ(x), . . . , φ^{(k−1)}(x) ),

is a continuously differentiable curve or path (using the continuity of f), cf. Def. F.1
(for K = C, we consider J_m as a path in R^{2kn+1}). To finish the proof, we will have to
make use of the notion of arc length (cf. Def. F.5) of such a continuously differentiable
curve: Recall that each such continuously differentiable path is rectifiable, i.e. it has a
well-defined finite arc length l(J_m) (cf. Th. F.7). Moreover, l(J_m) satisfies

‖J_m(x_m) − J_m(s_m)‖_2 ≤^{(F.4)} l(J_m) =^{(F.17)} ∫_{x_m}^{s_m} ‖J_m′(x)‖_2 dx
 = ∫_{x_m}^{s_m} √( 1 + Σ_{j=1}^{k} ‖φ^{(j)}(x)‖_2² ) dx
 = ∫_{x_m}^{s_m} √( 1 + ‖( φ′(x), . . . , φ^{(k−1)}(x) )‖_2² + ‖f(J_m(x))‖_2² ) dx
 ≤ ∫_{x_m}^{s_m} √(1 + 2M²) dx,   (3.76)
where it was used that ‖·‖_2 was chosen to be the Euclidean norm. For each m ∈ N, we
estimate

0 < r ≤^{(3.75)} ‖ ( x_m, φ(x_m), . . . , φ^{(k−1)}(x_m) ) − ( s_m, φ(s_m), . . . , φ^{(k−1)}(s_m) ) ‖_2
 = ‖J_m(x_m) − J_m(s_m)‖_2 ≤^{(3.76)} ∫_{x_m}^{s_m} √(1 + 2M²) dx = (s_m − x_m) √(1 + 2M²).   (3.77)

However, lim_{m→∞} (s_m − x_m) √(1 + 2M²) = 0 due to lim_{m→∞} s_m = lim_{m→∞} x_m = b, in
contradiction to r > 0. This contradiction shows our initial assumption that φ does not
go to the boundary of G for x → b must have been wrong.

To obtain the remaining assertion that φ must go to the boundary of G for x → a,
one can proceed as in the last paragraph of the proof of Prop. 3.24, making use of the
function h defined there and of the time reversion Lem. 1.9: If K ⊆ G is a compact
set and ψ is the solution to the time-reversed version given by Lem. 1.9(b), then ψ
must be maximal as φ is maximal. Thus, for x → −a, ψ must escape the compact set
h(K) forever by the first part of the proof above, implying φ must escape K forever for
x → a. □


Corollary 3.29. Let k, n ∈ N, let G ⊆ R × K^{kn} be open, and let f : G → K^n
be continuous. If φ : ]a, b[ → K^n, a < b, is a solution to (1.6), then the following
statements are equivalent:

(i) φ is a maximal solution.

(ii) φ must go to the boundary of G for both x → a and x → b in the sense defined in
Def. 3.25.

(iii) φ satisfies one of the conditions specified in Prop. 3.26 and one of the analogous
conditions for x → a.

Proof. (i) implies (ii) by Th. 3.28, (ii) implies (iii) by Prop. 3.26, and it is an exercise
to show (iii) implies (i) (here, Prop. 3.23 is the clue). □

Example 3.30. The following examples illustrate the different kinds of possible behavior of maximal solutions listed in Prop. 3.26 (the different kinds of behavior can already
be seen for 1-dimensional ODE of first order):

(a) The initial value problem

y′ = 0,   y(0) = 1,

has the maximal solution φ : R → R, φ ≡ 1; here we have

G = R²,   f : G → R,   f(x, y) = 0,

solution interval I = R, b := sup I = ∞, i.e. we are in Case (i) of Prop. 3.26.

(b) The initial value problem

y′ = x^{−2},   y(−1) = 1,

has the maximal solution φ : ]−∞, 0[ → R, φ(x) = −x^{−1}; here we have

G = ]−∞, 0[ × R,   f : G → R,   f(x, y) = x^{−2},

solution interval I = ]−∞, 0[, b := sup I = 0, lim_{x↑0} |φ(x)| = ∞, i.e. we are in Case
(ii) of Prop. 3.26.
To obtain an example, where we are also in Case (ii) of Prop. 3.26, but where
lim_{x↑b} |φ(x)|, b := sup I, does not exist, consider the initial value problem

y′ = −x^{−2} sin(1/x) − x^{−3} cos(1/x),   y(−1/π) = 0,

which has the maximal solution φ : ]−∞, 0[ → R, φ(x) = x^{−1} sin(1/x) (here
lim sup_{x↑0} |φ(x)| = ∞, but, as φ(−1/(kπ)) = 0 for each k ∈ N, lim_{x↑0} |φ(x)| does
not exist); here we have

G = (R \ {0}) × R,   f : G → R,   f(x, y) = −x^{−2} sin(1/x) − x^{−3} cos(1/x).

To obtain an example, where we are again in Case (ii) of Prop. 3.26, but where
G = R², consider the initial value problem

y′ = y²,   y(−1) = 1,

which, as in the first example of (b), has the maximal solution φ : ]−∞, 0[ → R,
φ(x) = −x^{−1}; here we have

G = R²,   f : G → R,   f(x, y) = y².

(c) The initial value problem

y′ = −y^{−1},   y(−1) = 1,

has the maximal solution φ : ]−∞, −1/2[ → R, φ(x) = √(−2x − 1); here we have

G = R × (R \ {0}),   f : G → R,   f(x, y) = −y^{−1},

solution interval I = ]−∞, −1/2[, b := sup I = −1/2, ∂G = R × {0},

lim_{x↑b} (x, φ(x)) = (−1/2, 0) ∈ ∂G,

i.e. we are in Case (iii) of Prop. 3.26.

An example, where we are in Case (iii) of Prop. 3.26, but where lim_{x↑b} (x, φ(x))
does not exist, is given by the initial value problem

y′ = −x^{−2} cos(1/x),   y(−1/π) = 0,

which has the maximal solution φ : ]−∞, 0[ → R, φ(x) = sin(1/x); here we have

G = (R \ {0}) × R,   f : G → R,   f(x, y) = −x^{−2} cos(1/x),

solution interval I = ]−∞, 0[, b := sup I = 0, ∂G = {0} × R,

lim_{x↑0} dist( (x, φ(x)), ∂G ) = lim_{x↑0} |x| = 0.

As a final example, where we are again in Case (iii) of Prop. 3.26, reconsider the
initial value problem from (a), but this time with

G = ]−1, 1[ × ]−3, 5[,   f : G → R,   f(x, y) = 0.

Now the maximal solution is φ : ]−1, 1[ → R, φ ≡ 1, solution interval I =
]−1, 1[, b := sup I = 1, and lim_{x↑1} (x, φ(x)) = (1, 1) ∈ ∂G. This last example also
illustrates that, even though it is quite common to omit an explicit specification of
the domain G when writing an ODE (as we did in (a)), where it is usually assumed
that the intended domain can be guessed from the context, the maximal solution
will typically depend on the specification of G.


Example 3.31. We have already seen examples of initial value problems that admit
more than one maximal solution; for instance, the initial value problem of Ex. 1.4(b)
had infinitely many different maximal solutions, all of them defined on all of R. The following example shows that an initial value problem can have maximal solutions defined
on different intervals: Let

G := R × ]−1, 1[,   f : G → R,   f(x, y) := √|y| / (1 − √|y|),

and consider the initial value problem

y′ = f(x, y) = √|y| / (1 − √|y|),   y(0) = 0.   (3.78)

An obvious maximal solution is

φ : R → R,   φ ≡ 0.

However, another maximal solution (that can be found using separation of variables) is

ψ : ]−1, 1[ → R,   ψ(x) := { −(1 − √(1+x))² for −1 < x ≤ 0,   (1 − √(1−x))² for 0 ≤ x < 1 }.

To confirm the maximality of the solution ψ, note lim_{x↓−1} (x, ψ(x)) = (−1, −1) ∈ ∂G
and lim_{x↑1} (x, ψ(x)) = (1, 1) ∈ ∂G.
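The claim that ψ solves (3.78) can also be spot-checked numerically; the sketch below is entirely our illustration and compares a finite-difference derivative of ψ with f(x, ψ(x)) away from the boundary points ±1:

import numpy as np

xs = np.linspace(-0.9, 0.9, 1801)
psi = np.where(xs <= 0.0,
               -(1.0 - np.sqrt(1.0 + xs)) ** 2,
               (1.0 - np.sqrt(1.0 - xs)) ** 2)
f = lambda x, y: np.sqrt(np.abs(y)) / (1.0 - np.sqrt(np.abs(y)))
dpsi = np.gradient(psi, xs)   # central finite differences
print(float(np.max(np.abs(dpsi - f(xs, psi)))))  # small (up to finite-difference error)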

3.5 Continuity in Initial Conditions

The goal of the present section is to show that, under suitable conditions, small changes
in the initial condition for an ODE result in small changes in the solution. As, in
situations of nonuniqueness, we can change the solution without having changed the
initial condition at all, ensuring unique solutions to initial value problems is a minimal
prerequisite for our considerations in this section.
Definition 3.32. Let G ⊆ R × K^{kn}, k, n ∈ N, and f : G → K^n. We say that the
explicit n-dimensional kth-order ODE (1.6), i.e.

y^{(k)} = f( x, y, y′, . . . , y^{(k−1)} ),   (3.79a)

admits unique maximal solutions if, and only if, f is such that every initial value problem
consisting of (3.79a) and

∀_{j∈{0,...,k−1}}  y^{(j)}(ξ) = η_j ∈ K^n,   (3.79b)

with (ξ, η) ∈ G, has a unique maximal solution φ_{(ξ,η)} : I_{(ξ,η)} → K^n (combining Cor.
3.16 with Th. 3.22 yields that G being open and f being continuous and locally Lipschitz
with respect to y is sufficient for (3.79a) to admit unique maximal solutions, but we
know from Rem. 3.17 that this condition is not necessary). If f is such that (3.79a)
admits unique maximal solutions, then

Y : D_f → K^n,   Y(x, ξ, η) := φ_{(ξ,η)}(x),   (3.80)

defined on

D_f := { (x, ξ, η) ∈ R × G : x ∈ I_{(ξ,η)} },   (3.81)

is called the global or general solution to (3.79a). Note that the domain D_f of Y is
determined entirely by f, which is notationally emphasized by its lower index f.
Lemma 3.33. In the situation of Def. 3.32, the following holds:

(a) Y(ξ, ξ, η) = η_0 for each (ξ, η) ∈ G.

(b) If k = 1, then η = η_0 and Y( x, x̃, Y(x̃, ξ, η) ) = Y(x, ξ, η) for each (x, ξ, η), (x̃, ξ, η) ∈
D_f.

(c) If k = 1, then Y( ξ, x, Y(x, ξ, η) ) = η for each (x, ξ, η) ∈ D_f.

Proof. (a) holds as Y(·, ξ, η) is a solution to (3.79b). For (b) note (x̃, ξ, η) ∈ D_f implies
( x̃, Y(x̃, ξ, η) ) ∈ G, i.e. x̃, Y(x̃, ξ, η) are admissible initial data. Moreover,
Y( ·, x̃, Y(x̃, ξ, η) ) and Y(·, ξ, η) are both maximal solutions for some initial value problem for (3.79a). Since both solutions agree at x = x̃, both functions must be identical
by the assumed uniqueness of solutions. In particular, they are defined for the same x
and yield the same value at each x. Setting x := ξ in (b) yields (c). □


The core of the proof of continuity in initial conditions as stated in Cor. 3.36 below is
the following Th. 3.34(a), which provides continuity in initial conditions locally. As a
byproduct, we will also obtain a version of the Picard-Lindelöf theorem in Th. 3.34(b),
which states the local uniform convergence of the so-called Picard iteration, a method for
obtaining approximate solutions that is quite different from the Euler method considered
above.
Theorem 3.34. Consider the situation of Def. 3.32 for first-order problems, i.e. with
k = 1, and with f being continuous and locally Lipschitz with respect to y on G open.
Fix an arbitrary norm ‖·‖ on K^n.

(a) For each (ξ, η) ∈ G ⊆ R × K^n and each −∞ < a < b < ∞ such that [a, b] ⊆ I_{(ξ,η)}
(i.e., using the notation introduced in Def. 3.32, the maximal solution φ_{(ξ,η)} =
Y(·, ξ, η) is defined on [a, b]), there exists ε > 0 satisfying:

(i) For every point (τ, ζ) in the open set

U(ξ, η) := { (τ, ζ) ∈ G : τ ∈ ]a, b[, ‖ζ − Y(τ, ξ, η)‖ < ε },   (3.82)

the maximal solution φ_{(τ,ζ)} = Y(·, τ, ζ) is defined on ]a, b[ (i.e. ]a, b[ ⊆ I_{(τ,ζ)}).

(ii) The restriction of the global solution (x, τ, ζ) ↦ Y(x, τ, ζ) to the open set

W := ]a, b[ × U(ξ, η)   (3.83)

is continuous.

(b) (Picard-Lindelöf) For each (ξ, η) ∈ G, there exists β > 0 such that the Picard
iteration, i.e. the sequence of functions (φ_m)_{m∈N_0}, φ_m : ]ξ − β, ξ + β[ → K^n, defined
recursively by

φ_0(x) := η,   (3.84a)
∀_{m∈N_0}  φ_{m+1}(x) := η + ∫_{ξ}^{x} f(t, φ_m(t)) dt,   (3.84b)

converges uniformly to the solution of the initial value problem (3.79) (with k = 1
and initial data (ξ, η)) on ]ξ − β, ξ + β[.
Proof. We will obtain (b) as an aside while proving (a). To simplify notation, we
introduce the function

φ : [a, b] → K^n,   φ(x) := Y(x, ξ, η).

Since [a, b] is compact and φ is continuous,

Γ := (Id, φ)([a, b]) = { (x, φ(x)) ∈ R × K^n : x ∈ [a, b] }

is a compact subset of G (cf. C.14). Thus, Γ has a positive distance from the closed set
(R × K^n) \ G, implying

∃_{ε_1>0}  C := { (x, y) ∈ R × K^n : x ∈ [a, b], ‖y − φ(x)‖ ≤ ε_1 } ⊆ G.   (3.85)

Clearly, C is bounded and C is also closed (using the continuity of the distance function
d : R × K^n → R_0^+, d(·) := dist(·, Γ), the continuity of the projection to the first
component π_1 : R × K^n → R, and noting C = d^{−1}[0, ε_1] ∩ π_1^{−1}[a, b]). Thus, C is
compact, and the hypothesis of f being locally Lipschitz with respect to y implies f to
be globally Lipschitz with some Lipschitz constant L ≥ 0 on the compact set C by Prop.
3.13. We can now choose the number ε > 0 claimed to exist in (a) to be any number

0 < ε < e^{−L(b−a)} ε_1.   (3.86)

Since L(b − a) ≥ 0, we have

ε < ε_1.   (3.87)

Moreover, with d and π_1 as above, U(ξ, η) as defined in (3.82) can be written in the
form

U(ξ, η) = d^{−1}[0, ε[ ∩ π_1^{−1}]a, b[,

showing it is an open set ([0, ε[ is, indeed, open in R_0^+).


Even though we are mostly interested in what happens on the open set W, it will be
convenient to define functions on the slightly larger compact set

W̄ := [a, b] × Ū,
Ū := { (x, y) ∈ R × K^n : x ∈ [a, b], ‖y − φ(x)‖ ≤ ε } = d^{−1}[0, ε] ∩ π_1^{−1}[a, b].

To proceed with the proof, we now carry out a form of the Picard iteration, recursively
defining a sequence of functions (φ_m)_{m∈N_0}, φ_m : W̄ → K^n, by

φ_0(x, τ, ζ) := φ(x) + (ζ − φ(τ)),   (3.88a)
∀_{m∈N_0}  φ_{m+1}(x, τ, ζ) := ζ + ∫_{τ}^{x} f( t, φ_m(t, τ, ζ) ) dt.   (3.88b)

The proof will be concluded if we can show the (φ_m)_{m∈N_0} constitute a sequence of
continuous functions converging uniformly on W̄ to Y↾_{W̄}. As an intermediate step, we
establish the following properties of the φ_m (simultaneously) by induction on m ∈ N_0:

(1) φ_m is continuous for each m ∈ N_0.

(2) One has

∀_{m∈N_0, (x,τ,ζ)∈W̄}  ( ‖φ_m(x, τ, ζ) − φ(x)‖ < ε_1  and  (x, φ_m(x, τ, ζ)) ∈ C ).

In particular, since C ⊆ G, this shows the φ_m are well-defined by (3.88b).

(3) One has

∀_{m∈N_0, (x,τ,ζ)∈W̄}  ‖φ_{m+1}(x, τ, ζ) − φ_m(x, τ, ζ)‖ ≤ ε L^{m+1} |x − τ|^{m+1} / (m+1)!.

To start the induction proof, notice that the continuity of φ implies the continuity of
φ_0. Moreover, if (x, τ, ζ) ∈ W̄, then

‖φ_0(x, τ, ζ) − φ(x)‖ =^{(3.88a)} ‖ζ − φ(τ)‖ = ‖ζ − Y(τ, ξ, η)‖ ≤ ε < ε_1.   (3.89)
Also, from φ = Y(·, ξ, η) = φ_{(ξ,η)}, we know, for each x, τ ∈ [a, b],

φ(x) − φ(τ) = η + ∫_{ξ}^{x} f(t, φ(t)) dt − η − ∫_{ξ}^{τ} f(t, φ(t)) dt = ∫_{τ}^{x} f(t, φ(t)) dt,

and, for each (x, τ, ζ) ∈ W̄,

‖φ_1(x, τ, ζ) − φ_0(x, τ, ζ)‖ = ‖ ∫_{τ}^{x} ( f( t, φ_0(t, τ, ζ) ) − f(t, φ(t)) ) dt ‖
 ≤^{f L-Lip.} L | ∫_{τ}^{x} ‖φ_0(t, τ, ζ) − φ(t)‖ dt | = L | ∫_{τ}^{x} ‖ζ − φ(τ)‖ dt | ≤ ε L |x − τ|,
completing the proof of (1) – (3) for m = 0. For the induction step, let m ∈ N_0.

It is left as an exercise to prove the continuity of φ_{m+1}.

Using the triangle inequality, we estimate, for each (x, τ, ζ) ∈ W̄,

‖φ_{m+1}(x, τ, ζ) − φ(x)‖ ≤ Σ_{j=0}^{m} ‖φ_{j+1}(x, τ, ζ) − φ_j(x, τ, ζ)‖ + ‖φ_0(x, τ, ζ) − φ(x)‖
 ≤^{(3.89), ind.hyp. for (3)} Σ_{j=0}^{m} ε L^{j+1} |x − τ|^{j+1} / (j+1)! + ε
 ≤ ε e^{L|x−τ|} < ε e^{L(b−a)} ≤^{(3.86)} e^{L(b−a)} e^{−L(b−a)} ε_1 = ε_1,

establishing the estimate of (2) for m + 1. To prove the estimate in (3) for m replaced
by m + 1, one estimates, for each (x, τ, ζ) ∈ W̄,

‖φ_{m+2}(x, τ, ζ) − φ_{m+1}(x, τ, ζ)‖ = ‖ ∫_{τ}^{x} ( f( t, φ_{m+1}(t, τ, ζ) ) − f( t, φ_m(t, τ, ζ) ) ) dt ‖
 ≤ L | ∫_{τ}^{x} ‖φ_{m+1}(t, τ, ζ) − φ_m(t, τ, ζ)‖ dt |
 ≤^{ind.hyp.} ε L^{m+2} | ∫_{τ}^{x} |t − τ|^{m+1} / (m+1)! dt | = ε L^{m+2} |x − τ|^{m+2} / (m+2)!,

completing the induction proof of (1) – (3).
As a consequence of (3), for each l, m ∈ N_0 such that m > l:

∀_{(x,τ,ζ)∈W̄}  ‖φ_m(x, τ, ζ) − φ_l(x, τ, ζ)‖ ≤ ε Σ_{j=l+1}^{m} L^j (b−a)^j / j!.   (3.90)

The convergence of the exponential series, thus, implies that (φ_m(x, τ, ζ))_{m∈N_0} is a
Cauchy sequence for each (x, τ, ζ) ∈ W̄, yielding pointwise convergence of the φ_m to
some function φ̃ : W̄ → K^n. Letting m tend to infinity in (3.90) then shows

∀_{(x,τ,ζ)∈W̄}  ‖φ̃(x, τ, ζ) − φ_l(x, τ, ζ)‖ ≤ ε Σ_{j=l+1}^{∞} L^j (b−a)^j / j!,

where the independence of the right-hand side with respect to (x, τ, ζ) ∈ W̄ proves
φ_m → φ̃ uniformly on W̄. The uniform convergence together with (1) then implies
φ̃ to be continuous.
In the final step of the proof, we show φ̃ = Y on W̄, i.e. φ̃(·, τ, ζ) solves (3.79) (with
k = 1). By Th. 1.5, we need to show

∀_{(x,τ,ζ)∈W̄}  φ̃(x, τ, ζ) = ζ + ∫_{τ}^{x} f( t, φ̃(t, τ, ζ) ) dt   (3.91)

(then uniqueness of solutions implies φ̃(·, τ, ζ) = Y(·, τ, ζ)). To verify (3.91), given
ε̃ > 0, by the uniform convergence φ_m → φ̃, choose m ∈ N sufficiently large such that

∀_{k∈{m−1,m}} ∀_{(x,τ,ζ)∈W̄}  ‖φ̃(x, τ, ζ) − φ_k(x, τ, ζ)‖ < ε̃

and estimate, for each (x, τ, ζ) ∈ W̄,

‖φ̃(x, τ, ζ) − ζ − ∫_{τ}^{x} f( t, φ̃(t, τ, ζ) ) dt‖
 ≤ ‖φ̃(x, τ, ζ) − φ_m(x, τ, ζ)‖ + ‖ ∫_{τ}^{x} ( f( t, φ_{m−1}(t, τ, ζ) ) − f( t, φ̃(t, τ, ζ) ) ) dt ‖
 < ε̃ + L | ∫_{τ}^{x} ‖φ_{m−1}(t, τ, ζ) − φ̃(t, τ, ζ)‖ dt | ≤ ε̃ + L ε̃ (b − a).   (3.92)

As (3.92) holds for every ε̃ > 0, (3.91) must be true as well.

It is noted that we have, indeed, proved (b) as a byproduct, since we know (for example
from the Peano Th. 3.8) that φ must be defined on [ξ − β, ξ + β] for some β > 0, and
then φ_m = φ_m(·, ξ, η) on [ξ − β, ξ + β] for each m ∈ N_0. □
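The Picard iteration (3.84) is also easy to carry out numerically; the sketch below is our illustration only (the function name is our choice, and the integrals in (3.84b) are approximated by a cumulative trapezoidal rule, an additional discretization not present in the theorem):

import numpy as np

def picard_iterates(f, xi, eta, beta, m_max, num=201):
    """Picard iteration (3.84): phi_0 := eta and
    phi_{m+1}(x) := eta + int_xi^x f(t, phi_m(t)) dt on [xi-beta, xi+beta]."""
    ts = np.linspace(xi - beta, xi + beta, num)   # odd num, so ts[num//2] == xi
    i0 = num // 2
    phi = np.full(num, float(eta))                # phi_0 == eta
    iterates = [phi]
    for _ in range(m_max):
        g = f(ts, phi)
        # cumulative trapezoidal integral of g, shifted so that it starts at xi
        F = np.concatenate(([0.0], np.cumsum((g[1:] + g[:-1]) / 2 * np.diff(ts))))
        phi = eta + (F - F[i0])
        iterates.append(phi)
    return ts, iterates

# Example: y' = y, y(0) = 1; the m-th iterate is (up to quadrature error)
# the m-th Taylor polynomial of exp, converging uniformly on [-1/2, 1/2].
ts, its = picard_iterates(lambda t, y: y, 0.0, 1.0, 0.5, 5)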

Theorem 3.35. As in Th. 3.34, consider the situation of Def. 3.32 for first-order problems, i.e. with k = 1, and with f being continuous and locally Lipschitz with respect to
y on G open. Then the global solution (x, ξ, η) ↦ Y(x, ξ, η) as defined in Def. 3.32 is
continuous. Moreover, its domain D_f is open.

Proof. Let (x, ξ, η) ∈ D_f. Then, using the notation from Def. 3.32, x is in the domain of
the maximal solution φ_{(ξ,η)}, i.e. x ∈ I_{(ξ,η)}. Since I_{(ξ,η)} is open, there must be −∞ < a <
x < b < ∞ such that [a, b] ⊆ I_{(ξ,η)}, and then Th. 3.34(a) implies the global solution Y to
be continuous on W, where W as defined in (3.83) is an open neighborhood of (x, ξ, η).
In particular, (x, ξ, η) is an interior point of D_f and Y is continuous at (x, ξ, η). As
(x, ξ, η) was arbitrary, D_f must be open and Y must be continuous. □

Corollary 3.36. Consider the situation of Def. 3.32 with f being continuous and locally
Lipschitz with respect to y on G open. Then the global solution (x, ξ, η) ↦ Y(x, ξ, η) as
defined in Def. 3.32 is continuous. Moreover, its domain D_f is open.

Proof. It was part of the exercise that proved Cor. 3.16 to show that the right-hand side
F of the first-order problem equivalent to (3.79) in the sense of Th. 3.1 is continuous
and locally Lipschitz with respect to y, provided f is continuous and locally Lipschitz
with respect to y. Thus, according to Th. 3.35, the equivalent first-order problem has
a continuous global solution Φ : D_F → K^{kn}, defined on some open set D_F. As a
consequence of Th. 3.1(b), Y = Φ_1 : D_F → K^n is the global solution to (3.79a). So
we have D_f = D_F and, as Φ is continuous, so is Y. □

It is sometimes interesting to consider situations where the right-hand side f depends
on some (vector of) parameters μ in addition to depending on x and y:

Definition 3.37. If G ⊆ R × K^{kn} × K^l with k, n, l ∈ N, and f : G → K^n is such that,
for each (ξ, η, μ) ∈ G, the explicit n-dimensional kth-order initial value problem

y^{(k)} = f( x, y, y′, . . . , y^{(k−1)}, μ ),   (3.93a)
∀_{j∈{0,...,k−1}}  y^{(j)}(ξ) = η_j ∈ K^n,   (3.93b)

has a unique maximal solution φ_{(ξ,η,μ)} : I_{(ξ,η,μ)} → K^n, then

Y : D_f → K^n,   Y(x, ξ, η, μ) := φ_{(ξ,η,μ)}(x),   (3.94)

defined on

D_f := { (x, ξ, η, μ) ∈ R × G : x ∈ I_{(ξ,η,μ)} },   (3.95)

is called the global or general solution to (3.93a).

Corollary 3.38. Consider the situation of Def. 3.37 with f being continuous and locally
Lipschitz with respect to (y, μ) on G open. Then the global solution Y as defined in
(3.94) is continuous. Moreover, its domain D_f is open.

Proof. We consider k = 1 (i.e. (3.93a) is of first order); the case k > 1 can then, in the
usual way, be obtained by applying Th. 3.1. To apply Th. 3.35 to the present situation,
define the auxiliary function

F : G → K^{n+l},   F_j(x, y, μ) := { f_j(x, y, μ) for j = 1, . . . , n;   0 for j = n+1, . . . , n+l }.   (3.96)

Then, since f is continuous and locally Lipschitz with respect to (y, μ), F is continuous
and locally Lipschitz with respect to its extended variable (y, μ), and we can apply Th.
3.35 to

y′ = F(x, y),   (3.97a)
y(ξ) = (η, μ),   (3.97b)

where (ξ, η, μ) ∈ G. According to Th. 3.35, the global solution Ỹ : D_F → K^{n+l} of
(3.97a) is continuous on the open set D_F. Moreover, by the definition of F in (3.96),
we have

∀_{(x,ξ,η,μ)∈D_F}  Ỹ(x, ξ, η, μ) = ( Y(x, ξ, η, μ), μ ),

where Y is as defined in (3.94). In particular, D_f = D_F and the continuity of Ỹ implies
the continuity of Y. □

Example 3.39. As a simple example of a parametrized ODE, consider f : R × K² → K,
f(x, y, μ) := μy, i.e.

y′ = f(x, y, μ) = μy,   y(ξ) = η,

with the global solution

Y : R × R × K² → K,   Y(x, ξ, η, μ) = η e^{μ(x−ξ)}.
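A quick numerical spot-check of this formula against the identities of Lem. 3.33(b),(c), which apply here for each fixed μ (the variable names in the sketch are our choices):

import numpy as np

def Y(x, xi, eta, mu):
    # global solution of y' = mu*y, y(xi) = eta (Example 3.39)
    return eta * np.exp(mu * (x - xi))

x, xt, xi, eta, mu = 1.3, -0.4, 0.2, 2.0, 0.7
assert np.isclose(Y(x, xt, Y(xt, xi, eta, mu), mu), Y(x, xi, eta, mu))  # Lem. 3.33(b)
assert np.isclose(Y(xi, x, Y(x, xi, eta, mu), mu), eta)                 # Lem. 3.33(c)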

4 Linear ODE

4.1 Definition, Setting

In Sec. 2.2, we saw that the solution of one-dimensional first-order linear ODE was
particularly simple. One can now combine the general theory of ODE with some linear
algebra to obtain results for n-dimensional linear ODE and, equivalently, for linear ODE
of higher order.

Notation 4.1. For n ∈ N, let M(n, K) denote the set of all n × n matrices over K.

Definition 4.2. Let I ⊆ R be a nontrivial interval, n ∈ N, and let A : I → M(n, K)
and b : I → K^n be continuous. An ODE of the form

y′ = A(x)y + b(x)   (4.1)

is called an n-dimensional linear ODE of first order. It is called homogeneous if, and
only if, b ≡ 0; it is called inhomogeneous if, and only if, it is not homogeneous.

Using the notion of matrix norm (cf. Sec. G), it is not hard to show the right-hand side
of (4.1) is continuous and locally Lipschitz with respect to y and, thus, every initial
value problem for (4.1) has a unique maximal solution (exercise). However, to show
the maximal solution is always defined on all of I, we need some additional machinery,
which is developed in the next section.

4.2 Gronwall's Inequality

In the current section, we will provide Gronwall's inequality, which is also of interest
outside the field of ODE. Here, Gronwall's inequality will allow us to prove the global existence of maximal solutions for ODE with linearly bounded right-hand side, a corollary
being that maximal solutions of (4.1) are always defined on all of I.

As an auxiliary tool on our way to Gronwall's inequality, we will now briefly study
(one-dimensional) differential inequalities:
Definition 4.3. Given G ⊆ R × R = R², and f : G → R, call

y′ ≤ f(x, y)   (4.2)

a (one-dimensional) differential inequality (of first order). A solution to (4.2) is a differentiable function w : I → R, defined on a nontrivial interval I ⊆ R, satisfying the two
conditions

(i) { (x, w(x)) ∈ I × R : x ∈ I } ⊆ G,

(ii) w′(x) ≤ f(x, w(x)) for each x ∈ I.
58

4 LINEAR ODE

Proposition 4.4. Let G ⊆ R² be open, let f : G → R be continuous and locally
Lipschitz with respect to y, and let −∞ < a < b ≤ ∞. If w : [a, b[ → R is a solution
to the differential inequality (4.2) and φ : [a, b[ → R is a solution to the corresponding
ODE y′ = f(x, y), then

w(a) ≤ φ(a)  ⇒  ∀_{x∈[a,b[}  w(x) ≤ φ(x).   (4.3)

Proof. Consider the auxiliary function

g : G × R → R,   g(x, y, μ) := f(x, y) + μ   (4.4)

and the (parametrized) ODE

y′ = g(x, y, μ) = f(x, y) + μ.   (4.5)

Since f is continuous and locally Lipschitz with respect to y, g is continuous and locally
Lipschitz with respect to (y, μ). Thus, continuity in initial conditions as given by Cor.
3.38 applies, yielding the global solution Y : D_g → R, (x, ξ, η, μ) ↦ Y(x, ξ, η, μ), to
be continuous on the open set D_g.

We now consider an arbitrary compact subinterval [a, c] ⊆ [a, b[ with a < c < b, noting
that it suffices to prove w ≤ φ on every such interval [a, c]. The set

Γ := (Id, a, φ(a), 0)([a, c]) = { (x, a, φ(a), 0) : x ∈ [a, c] }   (4.6)

is a compact subset of D_g and, thus,

∃_{δ>0}  { (x, ξ, η, μ) ∈ R⁴ : dist( (x, ξ, η, μ), Γ ) < δ } ⊆ D_g.   (4.7)

If we choose the distance in (4.7) to be meant with respect to the max-norm on R⁴ and
if 0 < μ < δ, then (x, a, φ(a), μ) ∈ D_g for each x ∈ [a, c], such that φ_μ := Y(·, a, φ(a), μ)
is defined on (a superset of) [a, c]. We proceed to prove w ≤ φ_μ on [a, c]: Seeking a
contradiction, assume there exists x_0 ∈ [a, c] such that w(x_0) > φ_μ(x_0). Due to the
continuity of w and φ_μ, w > φ_μ must then hold in an entire neighborhood of x_0. On the
other hand, w(a) ≤ φ(a) = φ_μ(a), such that, for

x_1 := inf{ x < x_0 : w(t) > φ_μ(t) for each t ∈ ]x, x_0] },

a ≤ x_1 < x_0 and w(x_1) = φ_μ(x_1). But then, for each sufficiently small h > 0,

w(x_1 + h) − w(x_1) > φ_μ(x_1 + h) − φ_μ(x_1),

implying

w′(x_1) = lim_{h↓0} ( w(x_1+h) − w(x_1) ) / h ≥ lim_{h↓0} ( φ_μ(x_1+h) − φ_μ(x_1) ) / h = φ_μ′(x_1)
 = g( x_1, φ_μ(x_1), μ ) = f( x_1, φ_μ(x_1) ) + μ > f( x_1, φ_μ(x_1) ) = f( x_1, w(x_1) ),   (4.8)

in contradiction to w being a solution to (4.2).


Thus, w ≤ φ_μ on [a, c] holds for every 0 < μ < δ, and continuity of Y on D_g yields

∀_{x∈[a,c]}  w(x) ≤ lim_{μ↓0} φ_μ(x) = lim_{μ↓0} Y( x, a, φ(a), μ ) = Y( x, a, φ(a), 0 ) = φ(x),   (4.9)

concluding the proof. □

Theorem 4.5 (Gronwall's Inequality). Let I := [a, b[, where −∞ < a < b ≤ ∞. If
φ, α, β : I → R are continuous and β(x) ≥ 0 for each x ∈ I, then

∀_{x∈I}  φ(x) ≤ α(x) + ∫_{a}^{x} β(t) φ(t) dt   (4.10)

implies

∀_{x∈I}  φ(x) ≤ α(x) + ∫_{a}^{x} α(t) β(t) exp( ∫_{t}^{x} β(s) ds ) dt.   (4.11)

Proof. Defining the auxiliary functions ψ, w : I → R,

ψ(x) := φ(x) − α(x),   (4.12a)
w(x) := ∫_{a}^{x} β(t) φ(t) dt,   (4.12b)

(4.10) can be written as

∀_{x∈I}  ψ(x) ≤ w(x).

Moreover, this implies

∀_{x∈I}  w′(x) = β(x) φ(x) = β(x) (φ(x) − α(x)) + β(x) α(x) ≤ β(x) w(x) + β(x) α(x),

showing w satisfies the (linear) differential inequality

y′ ≤ β(x) y + α(x) β(x).   (4.13)

Continuously extending α and β to x < a (e.g. using the constant extensions α(x) := α(a)
and β(x) := β(a) for x < a), we can consider the linear ODE corresponding to (4.13) on
all of ]−∞, b[. Using the initial condition y(a) = w(a) = 0 yields the unique solution
(employing the variation of constants Th. 2.3)

φ̃ : ]−∞, b[ → R,
φ̃(x) := exp( ∫_{a}^{x} β(s) ds ) ∫_{a}^{x} exp( −∫_{a}^{t} β(s) ds ) α(t) β(t) dt
 = ∫_{a}^{x} α(t) β(t) exp( ∫_{t}^{x} β(s) ds ) dt.   (4.14)

Finally, we apply Prop. 4.4 to conclude

∀_{x∈I}  ψ(x) ≤ w(x) ≤^{(4.3)} φ̃(x) =^{(4.14)} ∫_{a}^{x} α(t) β(t) exp( ∫_{t}^{x} β(s) ds ) dt,   (4.15)

which, taking into account ψ = φ − α, establishes (4.11). □



Example 4.6. Let I := [a, b[, where −∞ < a < b ≤ ∞. If φ, β : I → R are
continuous, β(x) ≥ 0 for each x ∈ I, and C ∈ R, then

∀_{x∈I}  φ(x) ≤ C + ∫_{a}^{x} β(t) φ(t) dt   (4.16)

implies

∀_{x∈I}  φ(x) ≤ C exp( ∫_{a}^{x} β(t) dt ).   (4.17)

We apply Gronwall's inequality of Th. 4.5 with α ≡ C together with the fundamental
theorem of calculus to obtain the estimate

φ(x) ≤ C + ∫_{a}^{x} C β(t) exp( ∫_{t}^{x} β(s) ds ) dt
 = C − C [ exp( ∫_{t}^{x} β(s) ds ) ]_{t=a}^{t=x} = C − C + C exp( ∫_{a}^{x} β(s) ds )
 = C exp( ∫_{a}^{x} β(t) dt )   (4.18)

for each x ∈ I, proving (4.17).
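The bound (4.17) can also be observed numerically; the sketch below (our illustration, with all names our own choices) integrates φ′ = βφ, φ(a) = C, for which (4.16) holds with equality, and compares φ with the Gronwall bound C exp(∫_a^x β):

import numpy as np

a, C = 0.0, 2.0
beta = lambda x: 1.0 + np.sin(x) ** 2        # beta >= 0, continuous
xs = np.linspace(a, 3.0, 301)

# integrate phi' = beta*phi, phi(a) = C, by the explicit Euler method:
phi = np.empty_like(xs)
phi[0] = C
for j in range(len(xs) - 1):
    phi[j + 1] = phi[j] + (xs[j + 1] - xs[j]) * beta(xs[j]) * phi[j]

# Gronwall bound (4.17), with the integral of beta by the trapezoidal rule:
b_vals = beta(xs)
B = np.concatenate(([0.0], np.cumsum((b_vals[1:] + b_vals[:-1]) / 2 * np.diff(xs))))
bound = C * np.exp(B)
print(float(np.max(phi / bound)))  # approximately 1 (up to discretization error)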

The following Th. 4.7 will be applied to show maximal solutions to linear ODE are
always defined on all of I (with I as in Def. 4.2). However, Th. 4.7 is often also useful
to obtain the domains of maximal solutions for nonlinear ODE.

Theorem 4.7. Let n ∈ N, let I ⊆ R be an open interval, and let f : I × K^n → K^n be
continuous. If there exist nonnegative continuous functions α, β : I → R_0^+ such that

∀_{(x,y)∈I×K^n}  ‖f(x, y)‖ ≤ α(x) + β(x) ‖y‖,   (4.19)

where ‖·‖ denotes some arbitrary norm on K^n, then every maximal solution to

y′ = f(x, y)

is defined on all of I.
Proof. Let −∞ ≤ c < d ≤ ∞ and φ : ]c, d[ → K^n be a solution to y′ = f(x, y). We prove that
d < b := sup I implies φ can be extended to the right and a := inf I < c implies φ can
be extended to the left. First, assume d < b and let x_0 ∈ ]c, d[. The idea is to apply
Example 4.6 on the interval [x_0, d[. To this end, we estimate, for each x ∈ [x_0, d[:

‖φ(x)‖ = ‖ φ(x_0) + ∫_{x_0}^{x} f(t, φ(t)) dt ‖ ≤ ‖φ(x_0)‖ + ∫_{x_0}^{x} ‖f(t, φ(t))‖ dt
 ≤^{(4.19)} ‖φ(x_0)‖ + ∫_{x_0}^{x} α(t) dt + ∫_{x_0}^{x} β(t) ‖φ(t)‖ dt.   (4.20)


Since the continuous function α is uniformly bounded on the compact interval [x_0, d],

∃_{C≥0} ∀_{x∈[x_0,d[}  ‖φ(x)‖ ≤ C + ∫_{x_0}^{x} β(t) ‖φ(t)‖ dt.

Thus, Example 4.6 applies, providing

∀_{x∈[x_0,d[}  ‖φ(x)‖ ≤ C exp( ∫_{x_0}^{x} β(t) dt ) ≤ C e^{M(d−x_0)},   (4.21)

where M ≥ 0 is a uniform bound for the continuous function β on the compact interval
[x_0, d]. As (4.21) states that the graph

gr_+(φ) = { (x, φ(x)) ∈ G : x ∈ [x_0, d[ }

is contained in the compact set

K := [x_0, d] × { y ∈ K^n : ‖y‖ ≤ C e^{M(d−x_0)} },

Prop. 3.24 implies φ has an extension to the right.

Now assume a < c. The idea is to apply the time reversion Lem. 1.9(b): According to
Lem. 1.9(b), ψ : ]−d, −c[ → K^n, ψ(x) := φ(−x), is a solution to y′ = −f(−x, y), and
the first part of the proof above shows ψ to have an extension to the right. However,
then Rem. 3.21 tells us φ has an extension to the left. □


4.3

Existence, Uniqueness, Vector Space of Solutions

Theorem 4.8. Consider the setting of Def. 4.2 with an open interval I. Then every
initial value problem consisting of the linear ODE (4.1) and y(x0 ) = y0 , x0 I, y0 Kn ,
has a unique maximal solution : I Kn (note that is defined on all of I).
Proof. It is an exercise to show the right-hand side of (4.1) is continuous and locally
Lipschitz with respect to y. Thus, every initial value problem has a unique maximal
solution by using Cor. 3.16 and Th. 3.22. That each maximal solution is defined on I
follows from Th. 4.7, as




A(x)y + b(x) b(x) + A(x) kyk,
xI


where A(x) denotes the matrix norm of A(x) induced by the norm k k on Kn (cf.
Appendix G).

We will now proceed to study the solution spaces of linear ODE as it turns out, these
solution spaces inherit the linear structure of the ODE.
Notation 4.9. Again, we consider the setting of Def. 4.2. Define Li and Lh to be the
respective sets of solutions to (4.1) and its homogeneous version, i.e.
n
o
n

Li := ( : I K ) : = A + b ,
(4.22a)
n
o
Lh := ( : I Kn ) : = A .
(4.22b)

62

4 LINEAR ODE
Lemma 4.10. Using Not. 4.9, we have

Li

Li = + Lh = { + : Lh },

(4.23)

i.e. one obtains all solutions to the inhomogeneous equation (4.1) by adding solutions
of the homogeneous equation to a particular solution to the inhomogeneous equation
(note that this is completely analogous to what occurs for solutions to linear systems of
equations in linear algebra).
Proof. Exercise.

Theorem 4.11. Let I R be a nontrivial interval, n N, and let A : I M(n, K)


be continuous. Then the following holds:
(a) The set Lh defined in (4.22b) constitutes a vector space over K.
(b) For each k N and 1 , . . . , k Lh , the following statements are equivalent:
(i) The k functions 1 , . . . , k are linearly independent over K.
(ii) There exists x0 I such that the k vectors 1 (x0 ), . . . , k (x0 ) Kn are linearly
independent over K.
(iii) The k vectors 1 (x), . . . , k (x) Kn are linearly independent over K for every
x I.
(c) The dimension of Lh is n.
Proof. (a): Exercise.
(b): (iii) trivially implies (ii). That (ii) implies (i) can easily be shownP
by contraposition:
k
If (i) does not hold, then there is (1 , . . . , k ) K \ {0} such that kj=1 j j = 0, i.e.
Pk
j=1 j j (x) = 0 holds for each x I, i.e. (ii) does not hold. It remains to show (i)
implies (iii). Once again, we accomplish this via contraposition:
PkIf (iii) does not hold,
k
then there are (1 , . . . , k ) K \ {0} and x I such that j=1 j j (x) = 0. But
P
P
then, since kj=1 j j Lh by (a), kj=1 j j = 0 (using uniqueness of solutions). In
consequence, 1 , . . . , k are linearly dependent and (i) does not hold.

(c): Let (b1 , . . . , bn ) be a basis of Kn and x0 I. Let 1 , . . . , n Lh be the solutions


to the initial conditions y(x0 ) = b1 , . . . , y(x0 ) = bn , respectively. Then the 1 , . . . , n
must be linearly independent by (b) (as they are linearly independent at x0 ), proving
dim Lh n. On the other hand, if 1 , . . . , k Lh are linearly independent, k N,
then, once more by (b), 1 (x), . . . , k (x) Kn are linearly independent for each x Kn ,
showing k n and dim Lh n.

Example 4.12. In Example 1.4(e), we had claimed that the second-order ODE (1.16)
on [a, b], a < b, namely
y = y

63

4 LINEAR ODE
had the set of solutions L as in (1.17), namely
n

o
L=
(c1 sin +c2 cos) : [a, b] K : c1 , c2 K .

We are now in a position to verify this claim: The second-order ODE (1.16) is equivalent
to the homogeneous linear first-order ODE
 
  
 
y1
y1
0 1
y2
(4.24)
=
=

y2
1 0
y2
y1
with the vector space of solutions Lh of dimension 2 over K. Clearly, 1 , 2 Lh , where
1 , 2 : [a, b] K2 with




cos x
sin x
.
(4.25)
, 2 (x) :=
1 (x) :=
sin x
cos x


Moreover, 1 and 2 are linearly independent (e.g. since 1 (0) = 01 and 2 (0) = 10 are
linearly independent, so are 1 , 2 : R K2 by Th. 4.11(b), implying, again by Th.
4.11(b), the linear independence of 1 (a), 2 (a), finally implying the linear independence
of 1 , 2 : [a, b] K2 ). Thus,
n

o
2
Lh =
(c1 1 + c2 2 ) : [a, b] K : c1 , c2 K
(4.26)
and, since, according to Th. 3.1 the solutions to (1.16) are precisely the first components
of solutions to (4.24), the representation (1.17) is verified.

4.4

Fundamental Matrix Solutions and Variation of Constants

Definition and Remark 4.13. A basis (1 , . . . , n ), n N, of the n-dimensional


vector space Lh over K is called a fundamental system for the linear ODE (4.1). One
then also calls the matrix

11 . . . 1n

.. ,
(4.27)
:= ...
.
n1 . . . nn
where the kth column of the matrix consists of the component functions 1k , . . . , nk of
k , k {1, . . . , n}, a fundamental system or a fundamental matrix solution for (4.1). The
latter term is justified by the observation that : I M(n, K) can be interpreted as
a solution to the matrix-valued ODE
Y = A(x) Y :

(4.28)

Indeed,


= 1 , . . . , n = A(x) 1 , . . . , A(x) n = A(x) .

Corollary 4.14. Let 1 , . . . , n Lh , n N, and let be defined as in (4.27). Then


the following statements are equivalent:

64

4 LINEAR ODE
(i) is a fundamental system for (4.1).
(ii) There exists x0 I such that det (x0 ) 6= 0.
(iii) det (x) 6= 0 for every x I.

Proof. The equivalences are a direct consequence of the equivalences in Th. 4.11(b). 
Theorem 4.15 (Variation of Constants). Consider the setting of Def. 4.2. If : I
M(n, K) is a fundamental system for (4.1), then the unique solution : I Kn of
the initial value problem consisting of (4.1) and y(x0 ) = y0 , (x0 , y0 ) I Kn , is given
by
Z
x

: I Kn ,

(x) = (x)1 (x0 ) y0 + (x)

1 (t) b(t) dt .

(4.29)

x0

Proof. The initial condition is easily verified:


(x0 ) = (x0 )1 (x0 ) y0 + 0 = Id y0 = y0 .
To check that satisfies (4.1), one computes, for each x I,
Z x
(I.3)

(x) = (x) (x0 ) y0 + (x)


1 (t) b(t) dt + (x)1 (x) b(x)
x0
Z x
(4.28)
1
= A(x) (x) (x0 ) y0 + A(x) (x)
1 (t) b(t) dt + b(x)
x0

A(x) (x) + b(x),

thereby establishing the case.

(4.30)


Remark 4.16. The 1-dimensional variation of constants formula (2.2) is actually a


special case of (4.29): We note that, for n = 1 and A(x) = a(x), the solution := 0
to the 1-dimensional homogeneous equation as defined in (2.2b), i.e.
Z x

Rx
a(t) dt
0 : I K, 0 (x) = exp
a(t) dt = e x0
x0

constitutes a fundamental matrix solution in the sense of Def. and Rem. 4.13 (since 1/0
exists). Taking into account (x0 ) = 0 (x0 ) = 1, we obtain, for each x I,
Z x
(4.29)
1
(x) = (x) (x0 ) y0 + (x)
1 (t) b(t) dt
x0


Z x
1
= 0 (x) y0 +
0 (t) b(t) dt ,
(4.31)
x0

which is (2.2a).

65

4 LINEAR ODE

In Sec. 4.6, we will study methods for actually finding fundamental matrix solutions in
cases where A is constant. However, in general, fundamental matrix solutions are often
not explicitly available. In such situations, the following Th. 4.17 can sometimes help
to extract information about solutions.
Theorem 4.17 (Liouvilles Formula). Consider the setting of Def. 4.2 and recall the
trace of an n n matrix A = (akl ) is defined by
tr A :=

n
X

akk .

k=1

If : I M(n, K) is a fundamental system for (4.1), then


Z x





det (x) = det (x0 ) exp


tr A(t) dt .
x0 ,xI

Proof. Exercise.

4.5

(4.32)

x0

Higher-Order, Wronskian

In Th. 3.1, we saw that higher-order ODE are equivalent to systems of first-order ODE.
We can now combine Th. 3.1 with our findings regarding first-order linear ODE to help
with the solution of higher-order linear ODE.
Definition 4.18. Let I R be a nontrivial interval, n N. Let b : I K and
a0 , . . . , an1 : I K be continuous functions. Then a (1-dimensional) linear ODE of
nth order is an equation of the form
y (n) = an1 (x)y (n1) + + a1 (x)y + a0 (x)y + b(x).

(4.33)

It is called homogeneous if, and only if, b 0; it is called inhomogeneous if, and only if,
it is not homogeneous. Analogous to (4.22), define the respective sets of solutions
n1
n
o
X
Hi := ( : I K) : (n) = b +
ak (k) ,

(4.34a)

k=0

Hh := ( : I K) :

(n)

n1
X
k=0

ak

(k)

(4.34b)

Definition 4.19. Let I R be a nontrivial interval, n N. For each n-tuple of (n 1)


times differentiable functions 1 , . . . , n : I K, define the Wronskian

1 (x)
...
n (x)
1 (x)
...
n (x)

W (1 , . . . , n ) : I K, W (1 , . . . , n )(x) := det
.
..
..

.
.
(n1)
(n1)
1
(x) . . . n (x)
(4.35)

66

4 LINEAR ODE
Theorem 4.20. Consider the setting of Def. 4.18.

(a) If Hi and Hh are the sets defined in (4.34), then Hh is an n-dimensional vector
space over K and, if Hi is arbitrary, then
Hi = + Hh .

(4.36)

(b) Let 1 , . . . , n Hh . Then the following statements are equivalent:


(i) 1 , . . . , n are linearly independent over K (i.e. (1 , . . . , n ) forms a basis of
Hh ).

(ii) There exists x0 I such that the Wronskian does not vanish:
W (1 , . . . , n )(x0 ) 6= 0.

(iii) The Wronskian never vanishes, i.e. W (1 , . . . , n )(x) 6= 0 for every x I.


Proof. According to Th. 3.1, (4.33) is equivalent to the first-order linear ODE

0
1
0
...
0
0
y1
0
0

0
1
...
0
0

y2 0
..

..
.
.
... ...
.
.. ..

.
y =

0
yn2 0
0
0
.
.
.
1
0

0
0
0
...
0
1 yn1 0
a0 (x) a1 (x) a2 (x) . . . an2 (x) an1 (x)
yn
b(x)
y + b(x).
=: A(x)

Define
n

Li := ( : I K ) : = A + b ,
n
o
.
Lh := ( : I Kn ) : = A
n

(a): Let Hi and define

..
.

:=
.

(n1)

Then
Hh

Th. 3.1(a),(b)

{1 : Lh }

and
Hi

Th. 3.1(a),(b)

1 :
Li } (4.23)
1 :
+ Lh }
{
= {
{( + )1 : Lh } = + Hh .

(4.37)

67

4 LINEAR ODE

As a consequence of Th. 3.1, the map J : Lh Hh , J() := 1 , is a linear isomorphism, implying that Hh , like Lh , is an n-dimensional vector space over K.
(l1)

(b): For 1 , . . . , n Hh , define kl := k

xI

for each k, l {1, . . . , n} and

1 (x)
...
n (x)
1 (x)
...
n (x)

(x) := (1 (x), . . . , n (x)) = (kl (x)) =

..
..

.
.
(n1)
(n1)
1
(x) . . . n (x)

such that det (x) = W (1 , . . . , n )(x) for each x I. Since Th. 3.1 yields 1 , . . . , n
Lh if, and only if, 1 , . . . , n Hh , the equivalences of (b) follow from the equivalences
of Cor. 4.14.

Example 4.21. Consider a0 , a1 : R+ K, a1 (x) := 1/(2x), a0 (x) := 1/(2x2 ), and
y = a1 (x) y + a0 (x) y =

y
y
2.
2x 2x

One might be able to guess the solutions


1 , 2 : R+ K,

1 (x) := x,

2 (x) :=

x.

The Wronskian is
W (1 , 2 ) : R+ K,



x
x
x
x

x=
< 0,
W (1 , 2 )(x) = det
=
1 1/(2 x)
2
2
i.e. 1 and 2 span Hh according to Th. 4.20(b):
Hh = {c1 1 + c2 2 : c1 , c2 K}.

4.6

Constant Coefficients

For 1-dimensional first-order linear ODE, we obtained a solution formula in Th. 2.3 in
terms of integrals (of course, in general, evaluating integrals can still be very difficult,
and one might need effective and efficient numerical methods). In the previous sections,
we have studied systems of first-order linear ODE as well as linear ODE of higher order.
Unfortunately, there are no general solution formulas for these situations (one can use
(4.29) if one knows a fundamental system, but the problem is the absence of a general
procedure to obtain such a fundamental system). However, there is a more satisfying
solution theory for the situation of so-called constant coefficients, i.e. if A in (4.1) and
the a0 , . . . , an1 in (4.33) do not depend on x.

68

4 LINEAR ODE
4.6.1

Linear ODE of Higher Order

Definition 4.22. Let I R be a nontrivial interval, n N. Let b : I K be


continuous and a0 , . . . , an1 K. Then a (1-dimensional) linear ODE of nth order with
constant coefficients is an equation of the form
y (n) = an1 y (n1) + + a1 y + a0 y + b(x).

(4.38)

In the present context, it is useful to introduce the following notation:


Notation 4.23. Let n N0 .
(a) Let P denote the set of all polynomials over K, Pn := {P P : deg P n}. We
will also write P[R], P[C], Pn [R], Pn [C] if we need to be specific about the field of
coefficients.
(b) Let I R be a nontrivial interval. Let Dn (I) := Dn (I, K) denote the set of all n
times differentiable functions f : I K, and let
x : D1 (I) F(I, K) := {f : I K}, x f := f ,
P
and, for each P Pn with P (x) = nj=0 aj xj (a0 , . . . , an K) define the differential
operator
n

P (x ) : D (I) F(I, K),

P (x )f :=

n
X

aj xj f

j=0

n
X

aj f (j) .

(4.39)

j=0

Remark 4.24. Using Not. 4.23(b), the ODE (4.38) can be written concisely as
P (x )y = b(x),

where P (x) := x

n1
X

aj x j .

(4.40)

j=0

The following Prop. 4.25 implies that the differential operator P (x ) does not, actually,
depend on the representation of the polynomial P .
Proposition 4.25. Let P, P1 , P2 P.
(a) If P = P1 + P2 and n := max{deg P1 , deg P2 }, then

f D n (I)

P (x )f = P1 (x )f + P2 (x )f.

(b) If P = P1 P2 and n := max{deg P, deg P1 , deg P2 }, then


n

f D (I)


P (x )f = P1 (x ) P2 (x )f .

4 LINEAR ODE

69

Proof. Exercise.

Lemma 4.26. Let K and


f : R K,

f (x) := ex .

(4.41)

Then, for each P P,


P (x )f : R K,

P (x )f (x) = P () ex .

Proof. There exists n N0 and a0 , . . . , an K such that P (x) =


computes
n
n
X
X
P (x )f (x) =
aj xj ex =
aj j ex = ex P (),
j=0

(4.42)
Pn

j=0

aj xj . One

j=0

proving (4.42).

Pn1
Theorem 4.27. If a0 , . . . , an1 K, n N, and P (x) = xn j=0
aj xj has the distinct
zeros 1 , . . . , n K (i.e. P (1 ) = = P (n ) = 0), then (1 , . . . , n ), where

j{1,...,n}

j : I K,

j (x) := ej x ,

is a basis of the homogeneous solution space


n
o
Hh = ( : I K) : P (x ) = 0

(4.43)

(4.44)

to (4.38) (i.e. to (4.40)).

Proof. It is immediate from (4.42) and P (j ) = 0 that each j satisfies P (x )j = 0.


From Th. 4.20(a), we already know Hh is an n-dimensional vector space over K. Thus,
it merely remains to compute the Wronskian. One obtains (cf. (4.35)):


1

.
.
.
1


n1
1 . . . n

(H.2) Y
W (1 , . . . , n )(0) = ..
(k l ) 6= 0,
.. =
.

.
k,l=0
n1

1
k>l
. . . nn1

since the j are all distinct. We have used that the Wronskian, in the present case, turns
out to be a Vandermonde determinant. The formula (H.2) for this type of determinant is
provided and proved in Appendix H. We also used that the determinant of a matrix is the
same as the determinant of its transpose: det A = det At . From W (1 , . . . , n )(0) 6= 0
and Th. 4.20(b), we conclude that (1 , . . . , n ) is a basis of Hh .

Example 4.28. We consider the third-order linear ODE
y = 2y y + 2y,

(4.45)

70

4 LINEAR ODE
which can be written as P (x )y = 0 with
P (x) := x3 2x2 + x 2 = (x2 + 1)(x 2) = (x i)(x + i)(x 2),

(4.46)

i.e. P has the distinct zeros 1 = i, 2 = i, 3 = 2. Thus, according to Th. 4.27, the
three functions
1 (x) = eix ,

1 , 2 , 3 : R C,

2 (x) = eix ,

3 (x) = e2x ,

(4.47)

form a basis of the C-vector space Hh . If we consider (4.45) as an ODE over R, then
we are interested in a basis of the R-vector space Hh . We can use linear combinations
of 1 and 2 to obtain such a basis (cf. Rem. 4.33(b) below):
1 , 2 : R R,

1 (x) =

eix + eix
= cos x,
2

2 (x) =

eix eix
= sin x.
2i

(4.48)

As explained in Rem. 4.33(b) below, as (1 , 2 , 3 ) are a basis of Hh over C, (1 , 2 , 3 )


are a basis of Hh over R.

By working a bit harder, one can generalize Th. 4.27 to the case where P has zeros of
higher multiplicity. We provide this generalization in Th. 4.32 below after recalling the
notion of zeros of higher multiplicity in Rem. and Def. 4.29, and after providing two
preparatory lemmas.
Remark and Definition 4.29. According to the fundamental theorem of algebra (cf.
[Phi13a, Th. 8.32, Cor. 8.33]), for every polynomial P Pn with deg P = n, n N,
there exists r N with r n, k1 , . . . , kr N with k1 + + kr = n, and distinct
numbers 1 , . . . , r C such that
P (x) = (x 1 )k1 (x r )kr .

(4.49)

Clearly, 1 , . . . , r are precisely the distinct zeros of P and kj is referred to as the


multiplicity of the zero j , j = 1, . . . , r.
Lemma 4.30. Let I R be a nontrivial interval, K, k N0 , and f Dk (I). Then
we have

(x )k f (x) ex = f (k) (x) ex .
(4.50)
xI

Proof. The proof is


by induction. The case k = 0 is merely the identity
 carried out
0
x
x
(x ) f (x) e
= f (x) e . For the induction step, let k 1 and compute, using
the product rule,
 ind. hyp.

(x )k f (x) ex
=
(x ) f (k1) (x) ex
=
=

thereby establishing the case.

f (k) (x) ex + f (k1) (x) ex f (k1) (x) ex


f (k) (x) ex ,
(4.51)

71

4 LINEAR ODE

Lemma 4.31. Let P P and K such that P () 6= 0. Then, for each Q P with
deg Q = k, k N0 , it holds that

P (x ) Q(x) ex = R(x) ex ,
(4.52)
xR

where R P is still a polynomial of degree k.

Proof. We can rewrite P (cf. [Phi13a, Th. 6.5(a)]) in the form


P (x) =

n
X
j=0

bj (x )j ,

n N0 ,

(4.53)

where b0 = P () 6= 0 and the remaining bj K can also be calculated from the


coefficients of P according to [Phi13a, (6.8)]. We compute
P (x ) Q(x) ex

n
n
 (4.53) X
 (4.50) X
=
bj (x )j Q(x) ex =
bj Q(j) (x) ex ,
j=0

j=0

Pk

Q(j) and b0 6= 0 implies deg R = deg Q = k.



Pn1
Theorem 4.32. If a0 , . . . , an1 K, n N, and P (x) = xn j=0
aj xj has the distinct
zeros 1 , . . . , r K with respective multiplicities k1 , . . . , kr N, then the set
n
o
B := (jm : I K) : j {1, . . . , r}, m {0, . . . , kj 1} ,
(4.54a)
i.e. (4.52) holds with R :=

j=0 bj

where

j{1,...,r}

m{0,...,kj 1}

jm : I K,

jm (x) := xm ej x ,

(4.54b)

yields a basis of the homogeneous solution space


n
o
Hh = ( : I K) : P (x ) = 0 .
Proof. Since k1 + + kr = n implies #B = n and we know dim Hh = n, it suffices to
show that B Hh and the elements of B are linearly independent. Let jm be as in
(4.54b). As j is a zero of multiplicity kj of P , we can write P (x) = Qj (x)(x j )kj
with some Qj P. From the computation
P (x )jm (x) = Qj (x )(x j )kj xm ej x


 (4.50)
kj >m
= Qj (x ) xkj xm ej x = 0,

we gather B Hh . Linear independence of the jm is verified by showing


!
r
X
Qj (x) ej x = 0

Qj Pkj 1

Qj 0.
j=1

j=1,...,r

j=1,...,r

(4.55)

We prove (4.55) by induction on r. Since ej x =


6 0 for each x R, the case r = 1
is immediate. For the induction step, let r 2. If at least one Qj 0, then the

72

4 LINEAR ODE

remaining Qj 0 as well by the induction hypothesis. It only remains to consider the


case that none of the Qj vanishes identically. In that case, we apply (x r )kr to
P
r
j x
= 0, obtaining
j=1 Qj (x) e
r1
X

Rj (x) ej x = 0

(4.56)

j=1


(k )
with suitable Rj P, since Lem. 4.30 yields (x r )kr Qr (x) er x = Qr r (x) er x = 0
and, for j < r, Lem. 4.31 applies due to (j r )kr 6= 0, also providing deg Rj = deg Qj .
Thus, none of the Rj in (4.56) can vanish identically, violating the induction hypothesis.
This finishes the proof of Qj 0 for each j = 1, . . . , r and the proof of the theorem. 
As it can occur in Th. 4.32 that P P[R], but j C \ R for some or all of the zeros j ,
the question arises of how to obtain a basis of the R-vector space Hh from the basis of
the C-vector space Hh provided by Th. 4.32. The following Rem. 4.33(b) answers this
question.
Remark 4.33. (a) If 1 , 2 C, then complex conjugation has the properties (cf.
[Phi13a, Def. and Rem. 5.5])
1
2 , 1 2 =
1
2.
1 2 =

for each C. In particular, if


In consequence, if P P[R], then P () = P ()
6= is also a zero of P .
P P[R] and C \ R is a nonreal zero of P , then

(b) Consider the situation of Th. 4.32 with P P[R]. Using (a), if jm : I C,
jm (x) = xm ej x , j C \ R, occurs in a basis for the C-vector space Hh (with
m = 0 in the special case of Th. 4.27), then jm : I C, jm (x) = xm ej x , with
j will occur as well. Noting that, for each x R and each C,
j =

ex = ex(Re +i Im ) = ex Re cos(x Im ) + i sin(x Im ) ,
(4.57a)


x
x(Re i Im )
x Re
e =e
cos(x Im ) i sin(x Im ) ,
(4.57b)
=e
1 x

(e + ex ) = ex Re cos(x Im ),
(4.57c)
2
1 x

(e ex ) = ex Re sin(x Im ),
(4.57d)
2i
one can define
1
jm : I R, jm (x) := (jm (x) + jm (x)) = xm ex Re j cos(x Im j ), (4.58a)
2
1
jm : I R, jm (x) := (jm (x) jm (x)) = xm ex Re j sin(x Im j ).
2i
(4.58b)

If one replaces each pair jm , jm in the basis for the C-vector space Hh with the
corresponding pair jm , jm , then one obtains a basis for the R-vector space Hh :
This follows from

1



1 
1
jm
jm
2
2
with A := 1
=A
(4.59)
, det A = 6= 0.
1
jm
jm
2i
2i
2i

73

4 LINEAR ODE
Example 4.34. We consider the fourth-order linear ODE
y (4) = 8y 16y,

(4.60)

which can be written as P (x )y = 0 with


P (x) := x4 + 8x2 + 16 = (x2 + 4)2 = (x 2i)2 (x + 2i)2 ,

(4.61)

i.e. P has the zeros 1 = 2i, 2 = 2i, both with multiplicity 2. Thus, according to Th.
4.32, the four functions

10 (x) = e

2ix

10 , 11 , 20 , 21 : R C,

20 (x) = e2ix ,

11 (x) = x e2ix ,

21 (x) = x e2ix ,

form a basis of the C-vector space Hh . If we consider (4.60) as an ODE over R, we can
use (4.58) to obtain the basis (10 , 11 , 20 , 21 ) of the R-vector space Hh , where
10 (x) = cos(2x),

10 , 11 , 20 , 21 : R R,
11 (x) = x cos(2x), 20 (x) = sin(2x),

21 (x) = x sin(2x).

If (4.38) is inhomogeneous, then one can use Th. 4.32 and, if necessary, Rem. 4.33(b),
to obtain a basis of the homogeneous solution space Hh , then using the equivalence with
systems of first-order linear ODE and variation of constants according to Th. 4.15 to
solve (4.38). However, if the function b in (4.38) is such that the following Th. 4.35
applies, then one can avoid using the above strategy to obtain a particular solution
to (4.38) (and, thus, the entire solution space via Hi = + Hh ).
Pn1
Theorem 4.35. Let a0 , . . . , an1 K, n N, and P (x) = xn j=0
aj xj . Consider
P (x )y = Q(x)ex ,

Q P,

K.

(4.62)

(a) (no resonance): If P () 6= 0 and m := deg(Q) N0 , then there exists a polynomial


R P such that deg(R) = m and
: R K,

(x) := R(x) ex ,

(4.63)

is a solution to (4.62). Moreover, if Q 1, then one can choose R 1/P ().


(b) (resonance): If is a zero of P with multiplicity k N and m := deg(Q) N0 ,
then there exists a solution to (4.62) of the following form:
: R K,

(x) := R(x) e ,

R P,

R(x) =

m+k
X
j=k

c j xj ,

ck , . . . , cm+k K.
(4.64)

The reason behind the terms no resonance and resonance will be explained in the following Example 4.36.

4 LINEAR ODE

74

Proof. Exercise.

Example 4.36. Consider the second-order linear ODE


d2 x
+ 02 x = a cos(t), 0 , R+ ,
dt2
which can be written as P (t )x = a cos(t) with

a R \ {0},

P (t) := t2 + 02 = (t i0 )(t + i0 ).

(4.65)

(4.66)

Note that the unknown function is written as x depending on the variable t (instead of y
depending on x). This is due to the physical interpretation of (4.65), where x represents
the position of a so-called harmonic oscillator at time t, having angular frequency 0
and being subjected to a periodic external force of angular frequency and amplitude
a. We can find a particular solution to (4.65) by applying Th. 4.35 to
P (t )x = a eit .

(4.67)

We have to distinguish two cases:


(a) Case 6= 0 : In this case, one says that the oscillator and the external force are not
in resonance, which explains the term no resonance in Th. 4.35(a). In this case, we
can apply Th. 4.35(a) with := i and Q a, yielding R a/P (i) = a/(02 2 ),
i.e.
a
0 : R C,
0 (t) := R(t) et = 2
eit ,
(4.68a)
0 2
is a solution to (4.67) and
: R R,

(t) := Re 0 (t) =

02

is a solution to (4.65).

a
cos(t),
2

(4.68b)

(b) Case = 0 : In this case, one says that the oscillator and the external force are in
resonance, which explains the term resonance in Th. 4.35(b). In this case, we can
apply Th. 4.35(b) with := i and Q a, i.e. m = 0, k = 1, yielding R(t) = ct
for some c C. To determine c, we plug x(t) = R(t) et into (4.67):

P (t ) ct eit = t (c eit + cit eit ) + 02 ct eit
= ci eit + ci eit c 2 t eit + 02 ct eit

= 2ci eit = a eit

c = a/(2i).

(4.69)

Thus,
0 : R C,

0 (t) :=

a
t eit ,
2i

(4.70a)

is a solution to (4.67) and


: R R,
is a solution to (4.65).

(t) := Re 0 (t) =

a
t sin(t),
2

(4.70b)

75

4 LINEAR ODE
4.6.2

Systems of First-Order Linear ODE

Matrix Exponential Function


Definition 4.37. Let I R be a nontrivial interval, n N, A M(n, K) and b :
I Kn be continuous. Then a linear ODE with constant coefficients is an equation
of the form
y = A y + b(x),
(4.71)
i.e. a linear ODE, where the matrix A does not depend on x.

Recalling that the ordinary exponential function expa : R C, x 7 eax , a C,


is precisely the solution to the initial value problem y = a y, y(0) = 1, the following
definition constitutes a natural generalization:
Definition 4.38. Given n N, A M(n, C), define the matrix exponential function
expA : R M(n, C),

x 7 eAx ,

(4.72a)

to be the solution to the matrix-valued initial value problem


Y = AY,

Y (0) = Id,

(4.72b)

i.e. as the fundamental matrix solution of y = A y that satisfies Y (0) = Id (sometimes


called the principal maxtrix solution of y = A y).

The previous definition of the matrix exponential function is further justified by the
following result:
Theorem 4.39. For each A M(n, C), n N, it holds that

xR

eAx =

X
(Ax)k
k=0

k!

(4.73)

in the sense that the partial sums on the right-hand side converge pointwise to eAx on
R, where the convergence is even uniform on every compact interval.
2
Proof. By the equivalence of all norms on Cn
= M(n, C), we may choose a convenient
norm on M(n, C). So we let k k denote an arbitrary operator norm on M(n, C),
induced by some norm k k on Cn . We first show that the partial sums (Am (x))mN ,
P
(Ax)k
Am (x) := m
k=0 k! , in (4.73) form a Cauchy sequence in M(n, C): For M, N N,
N > M , one estimates, for each x R,
N

N
X (Ax)k (G.10) X
kAkk |x|k

kAN (x) AM (x)k =


.
(4.74)



k!
k!
k=M +1
k=M +1

76

4 LINEAR ODE

P
kAkk |x|k
= ekAk|x| is pointwise for x R and uniform
Since the convergence limm m
k=0
k!
on every compact interval, (4.74) shows each (Am (x))mN is a Cauchy sequence that
converges to some (x) M(n, C) (by the completeness of M(n, C)) pointwise for
x R and uniform on every compact interval. It remains to show is the solution to
(4.72b), i.e.
Z
x

xR

A(t) dt .

(x) = Id +

(4.75)

Using the identity

Am (x) = Id +

m
X
(Ax)k
k=1

k!

= Id +A

m1
X
k=0

Ak xk+1
= Id +
(k + 1)!

A
0

m1
X
k=0

Ak tk
dt ,
k!

we estimate, for each x R and each m N,






Z x
Z x






(x) Id
A(t) dt
A(t) dt

(x) Am (x) + Am (x) Id

0
0


!
Z x

m1

X Ak tk




A
+
(x)

A
(x)
=

A(t)
dt

m


0

k!
k=0

Z x







kAk Am1 (t) (t) dt 0 for m ,
(x) Am (x) +
0

which proves (4.75) and establishes the case.

The matrix exponential function has some properties that are familiar from the case
n = 1 (see Prop. 4.40(a),(b)), but also some properties that are, perhaps, unexpected
(see Prop. 4.42(a),(b)).
Proposition 4.40. Let A M(n, C), n N.
(a) eA(t+s) = eAt eAs holds for each s, t R.
(b) (eAx )1 = eA(x) = eAx holds for each x R.
t

(c) For the transpose At , one has eA x = (eAx )t for each x R.


Proof. (a): Fix s R. The function s : R M(n, C), s (t) := eA(t+s) is a solution
to Y = AY (namely the one for the initial condition Y (s) = Id). Moreover, the
function s : R M(n, C), s (t) := eAt eAs , is also a solution to Y = AY , since

t s (t) = t eAt eAs = AeAt eAs = As (t).

Finally, since s (0) = eA0 eAs = Id eAs = eAs = s (0), the claimed s = s follows by
uniqueness of solutions.
(b) is an easy consequence of (a), since
(a)

Id = eA0 = eA(xx) = eAx eAx .

77

4 LINEAR ODE

(c): Clearly, the map A 7 At is continuous on M(n, C) (since limk Ak = A implies


limk ak, = a for all components, which implies limk Atk = At ), providing, for
each x R,
!t
!t
m
m
m
k
k
t k
X
X
X
(Ax)
(Ax)
(A x)
t
= lim
= (eAx )t ,
= lim
eA x = lim
m
m
m
k!
k!
k!
k=0
k=0
k=0
completing the proof.

Proposition 4.41. Let A M(n, C), n N. Then


det eA = etr A .
Proof. Applying Liouvilles formula (4.32) to (x) := eAx , x R, yields
Z x

Ax
A0
det e = det e exp
tr A dt = 1 ex tr A ,

(4.76)

and setting x = 1 in (4.76) proves the proposition.

Proposition 4.42. Let A, B M(n, C), n N.


(a) BeAx = eAx B holds for each x R if, and only if, AB = BA.
(b) e(A+B)x = eAx eBx holds for each x R if, and only if, AB = BA.
Proof. (a): If BeAx = eAx B holds for each x R, then differentiation yields BAeAx =
AeAx B for each x R, and the case x = 0 provides BA Id = A Id B, i.e. BA = AB.
For the converse, assume BA = AB and define the auxiliary maps
fB : M(n, C) M(n, C),
gB : M(n, C) M(n, C),

fB (C) := BC,
gB (C) := CB.

If kk denotes an operator norm, then kBC1 BC2 k kBkkC1 C2 k and kC1 BC2 Bk
kBkkC1 C2 k, showing fB and gB to be (even Lipschitz) continuous. Thus,
!
!
m
m
k
k
X
X
(Ax)
(Ax)
BeAx = fB (eAx ) = fB lim
= lim fB
m
m
k!
k!
k=0
k=0
!
!
m
m
m
X
X
X
(Ax)k AB=BA
(Ax)k
(Ax)k
= lim B
=
lim
B = lim gB
m
m
m
k!
k!
k!
k=0
k=0
k=0
!
m
X
(Ax)k
= gB (eAx ) = eAx B,
= gB lim
m
k!
k=0
thereby establishing the case.
(b): Exercise (hint: use (a)).

78

4 LINEAR ODE
Eigenvalues and Jordan Normal Form

We will see that the solution theory of linear ODE with constant coefficients is related
to the eigenvalues of A. We recall the definition of this notion:
Definition 4.43. Let n N and A M(n, C). Then C is called an eigenvalue of
A if, and only if, there exists 0 6= v Cn such that
Av = v.

(4.77)

If (4.77) holds, then v 6= 0 is called an eigenvector for the eigenvalue .


Theorem 4.44. Let n N and A M(n, C).
(a) For each eigenvalue C of A with eigenvector v Cn \ {0}, the function
: I Cn ,

(x) := ex v,

(4.78)

is a solution to the homogeneous version of (4.71).


(b) If {v1 , . . . , vn } is a basis of eigenvectors for Cn , where vj is an eigenvector with
respect to the eigenvalue j C of A for each j {1, . . . , n}, then 1 , . . . , n with

j{1,...,n}

j : I Cn ,

j (x) := ej x vj ,

(4.79)

form a fundamental system for (4.71).


Proof. (a): One computes, for each x I,
(x) = ex v = ex Av = A(x),
proving that solves the homogeneous version of (4.71).
(b): Without loss of generality, we may consider I = R. We already know from (a) that
each j is a solution to the homogeneous version of (4.71). Thus, it merely remains
to check that 1 , . . . , n are linearly independent. As 1 (0) = v1 , . . . , n (0) = vn are
linearly independent by hypothesis, the linear independence of 1 , . . . , n is provided by
Th. 4.11(b).

To proceed, we need a few more notions and results from linear algebra:
Theorem 4.45. Let n N and A M(n, C). Then the following statements (i) and
(ii) are equivalent:
(i) There exists a basis B of eigenvectors for Cn , i.e. there exist v1 , . . . , vn Cn and
1 , . . . , n C such that B = {v1 , . . . , vn } is a basis of Cn and Avj = j vj for
each j = 1, . . . , n (note that the vj must all be distinct, whereas some (or all) of
the j may be identical).

79

4 LINEAR ODE
(ii) There exists an invertible matrix W M(n, C) such that

1
0

...
W 1 AW =
,
0
n

(4.80)

i.e. A is diagonalizable (if (4.80) holds, then the columns v1 , . . . , vn of W must


actually be the respective eigenvectors to the eigenvalues 1 , . . . , n ).

Proof. See, e.g., [Koe03, Th. 8.3.1].

Unfortunately, not every matrix A M(n, C) is diagonalizable. However, every A


M(n, C) can at least be transformed into Jordan normal form:
Theorem 4.46 (Jordan Normal Form). Let n N and A M(n, C). There exists an
invertible matrix W M(n, C) such that
B := W 1 AW
is in Jordan normal form, i.e. B has block diagonal form

B1
0

...
B=
,
0
Br

(4.81)

(4.82)

1 r n, where each block Bj is a so-called Jordan matrix or Jordan block, i.e.

j 1 0 . . . 0

j 1

... ...
,
(4.83)
Bj = (j ) or Bj =
0

0
j 1
j
where j is an eigenvalue of A.

Proof. See, e.g., [Koe03, Th. 9.5.6] or [Str08, Th. 27.13].

The reason Th. 4.46 regarding the Jordan normal form is useful for solving linear ODE
with constant coefficients is the following theorem:
Theorem 4.47. Let n N and A, W M(n, C), where W is assumed invertible.
(a) The following statements (i) and (ii) are equivalent:
(i) : I Cn is a solution to y = Ay.

(ii) := W 1 : I Cn is a solution to y = W 1 AW y.

80

4 LINEAR ODE
(b) eW

1 AW x

= W 1 eAx W for each x R.

Proof. (a): The equivalences


= A

W 1 = W 1 A

= W 1 AW

establish the case.


1

(b): By definition, x 7 eW AW x is the solution to the initial value problem Y =


W 1 AW Y , Y (0) = Id. Thus, noting W 1 eA0 W = Id and
(W 1 eAx W ) = W 1 AeAx W = W 1 AW W 1 eAx W
shows x 7 W 1 eAx W is a solution to the same initial value problem, establishing
(b).

Remark 4.48. To obtain a fundamental system for (4.71) with A M(n, C), it suffices
to obtain a fundamental system for y = By, where B := W 1 AW is in Jordan normal
form and W M(n, C) is invertible: If 1 , . . . , n are linearly independent solutions to
y = By, then A = W BW 1 , Th. 4.47(a), and W being a linear isomorphism yield that
1 := W 1 , . . . , n := W n are linearly independent solutions to y = Ay.
Moreover, since B is in block diagonal form with each block being a Jordan matrix
according to (4.82) and (4.83), it actually suffices to solve y = By assuming that
B = Id +N,

(4.84)

where C and N is a so-called canonical nilpotent

N = 0 (zero matrix) or N =

matrix, i.e.
1
0

0 ... 0

... ...
0 ,

0
0 1
0

(4.85)

where the case N = 0 is already covered by Th. 4.44. The remaining case is covered by
the following Th. 4.49.
Theorem 4.49. Let C, k N, k 2, and assume
nilpotent matrix according to (4.85). Then

2
1 x x2

x
0 1

1
0 0

: R M(k, C), (x) := ex .


..
..
..
.
.

0 0
0

0 6= N M(k, C) is a canonical
...
...
...

xk2
(k2)!
xk3
(k3)!

..
.

...

...

...

...

xk1
(k1)!
xk2
(k2)!

..
.
..
.

(4.86)

81

4 LINEAR ODE
is a fundamental matrix solution to
Y = ( Id +N )Y,

Y (0) = Id,

(4.87)

i.e.
(x) = e( Id +N )x ;

xR

(4.88)

in particular, the columns of provide k solutions to y = ( Id +N )y that are linearly


independent.
Proof. (0) = Id is immediate from (4.86). Since (x) has upper triangular form with
all 1s on the diagonal, we obtain det (x) = ekx 6= 0 for each x R, showing the
columns of are linearly independent. Let : R C denote the th component
function of the th column of , i.e.
(
x
ex ()!
for ,

: R C, (x) :=
,{1,...,k}
0
for > .
It remains to show that

,{1,...,k}

One computes,

,{1,...,k}

+ +1,
=

e
(x) = ex

x
()!
x
()!

+ ex

for < ,
for = ,
for > .

x(+1)
((+1))!

+0

(4.89)

for < ,
for = ,
for > ,

i.e. (4.89) holds, completing the proof.

Example 4.50. For a 2-dimensional real system of linear ODE


y = Ay,

A M(2, R),

(4.90)

there exist precisely the following three possibilities (i) (iii):


(i) The matrix A is diagonalizable with real eigenvalues 1 , 2 R (1 = 2 is
possible), i.e. there is a basis {v1 , v2 } of R2 such that vj is an eigenvector for j ,
j {1, 2}. In this case, according to Th. 4.44(b), the two functions
1 , 2 : R K2 ,

1 (x) := e1 x v1 ,

form a fundamental system for (4.90) (over K).

2 (x) := e2 x v2 ,

(4.91)

82

4 LINEAR ODE

(ii) The matrix A is diagonalizable with two complex conjugate eigenvalues 1 , 2


1 . Analogous to (i), one has a basis {v1 , v2 } of C2 such that vj
C \ R, 2 =
is an eigenvector for j , j {1, 2}, and the two functions in (4.91) still form a
fundamental system for (4.90), but with K replaced by C. However, one can still
obtain a real-valued fundamental system as follows: We have
1 = + i,

2 = i,

where R,

R \ {0}.

(4.92)

Thus, if Av1 = 1 v1 with 0 6= v1 = + i, where , R2 , then, letting


v2 := v1 = i, and taking complex conjugates
1 v1 = 2 v2
Av2 = A
v1 = Av1 = 1 v1 =
shows v2 is an eigenvector with respect to 2 . Thus, 2 = 1 and, similar to the
approach described in Rem. 4.33(b) above, we can let


1
2

1

2
1
2i

1
2

 
1
,
1
2
2i

to obtain a fundamental system {1 , 2 } for (4.90) over R, where 1 , 2 : R


R2 ,

1 (x) = Re(1 (x)) = Re e(+i)x ( + i)



= Re ex cos(x) + i sin(x) ( + i)

(4.93a)
= ex cos(x) sin(x) ,

2 (x) = Im(1 (x)) = ex sin(x) + cos(x) .
(4.93b)

(iii) The matrix A has precisely one eigenvalue R and the corresponding eigenspace
is 1-dimensional. Then there is an invertible matrix W M(2, R) such that
B := W 1 AW is in (nondiagonal) Jordan normal form, i.e.


1
1
.
B = W AW =
0
According to Th. 4.49, the two functions
2

1 , 2 : R K ,

1 (x) := e

 
1
,
0

2 (x) := e

 
x
,
1

(4.94)

form a fundamental system for y = By (over K). Thus, according to Th. 4.47,
the two functions
1 , 2 : R K2 ,

1 (x) := W 1 (x),

form a fundamental system for (4.90) (over K).

2 (x) := W 2 (x),

(4.95)

83

4 LINEAR ODE

Remark 4.51. One way of finding a fundamental matrix solution for y = A y, A


M(n, C), is to obtain eAx , using the following strategy based on Jordan normal forms:
(i) Determine the distinct eigenvalues 1 , . . . , s , 1 s n, of A, which are precisely
the zeros of the characteristic polynomial A (x) := det(A x Id) (the multiplicity
of the zero j is called its algebraic multiplicity, the dimension of the eigenspace
ker(A j Id) its geometric multiplicity).
(ii) Determine the Jordan normal form B of A and W such that B = W 1 AW . In
general, this means computing the (finitely many distinct) powers (A j Id)k and
(suitable bases of) ker(Aj Id)k (in general, this is a somewhat involved procedure
and it is referred to [Mar04, Sections 4.2,4.3] and [Str08, Sec. 27] for details the
lecture notes do not provide further details here, as they rather recommend using
the Putzer algorithm as described below instead).
(iii) For each Jordan block Bj (as in (4.83)) of B compute eBj x as in (4.86).
As step (ii) above tends to be complicated in practise, it is usually easier to obtain eAx
using the Putzer algorithm described next.
Putzer Algorithm
The Putzer algorithm due to [Put66] is a procedure for computing eAx that avoids the
difficulty of determining the Jordan normal form of A, and, thus, is often more efficient
to employ in practise than the procedure described in Rem. 4.51 above. The Putzer
algorithm is provided by the following theorem:
Theorem 4.52. Let A M(n, C), n N. If 1 , . . . , n C are precisely the eigenvalues of A (not necessarily distinct, each eigenvalue occurring possibly repeatedly according
to its multiplicity). Then
n1
X
eAx =
pk+1 (x) Mk ,
(4.96)
xR

k=0

where the functions p1 , . . . , pn : R C and matrices M0 , . . . , Mn1 M(n, C) are


defined recursively by
p1 = 1 p1 ,
pk = k pk + pk1 ,

p1 (0) = 1,
pk (0) = 0

for

k = 2, . . . , n

(4.97a)
(4.97b)

(i.e. each pk is a solution to a (typically nonhomogeneous) 1-dimensional first-order


linear ODE that can be solved using (2.2)) and
M0 := Id,
Mk = Mk1 (A k Id)

for

k = 1, . . . , n 1.

(4.98a)
(4.98b)

84

5 STABILITY
Proof. Note that (4.98) can be extended to k = n, yielding
Mn =

n
Y

(A k Id) = A (A) = 0,

k=1

since each matrix annihilates its characteristic polynomial according to the CayleyHamilton theorem (cf. [Koe03, Th. 8.4.6] or [Str08, Th. 26.6]). Also note

k=0,...,n1

AMk = Mk (A k+1 Id) + k+1 Mk = Mk+1 + k+1 Mk .

(4.99)

Pn1
We have to show that x 7 (x) := k=0
pk+1 (x) Mk solves the initial value problem
Y = AY , Y (0) = Id. The initial condition is satisfied, as (0) = p1 (0) M0 = Id, and
the ODE is satisfied, as, for each x R,

(x) A(x)

n1
X

pk+1 (x) Mk

k=0

(4.97), (4.99)

1 p1 (x) M0 +

n1
X
k=1

n1
X

n1
X
k=0


k+1 pk+1 (x) + pk (x) Mk

pk+1 (x) Mk+1 + k+1 Mk

k=0

pn (x) Mn = 0,

completing the proof.

5
5.1

pk+1 (x) Mk




Stability
Qualitative Theory, Phase Portraits

In the qualitative theory of ODE, which can be seen as part of the field of dynamical
systems, the idea is to understand the set of solutions to an ODE (or to a class of ODE),
if possible, without making use of explicit solution formulas, which, in most situations,
are not available anyway. Examples of qualitative questions are if, and under which
conditions, solutions to an ODE are constant, periodic, are unbounded, approach some
limit (more generally, the solutions asymptotic behavior), etc. One often thinks of the
solutions as depending on a time-like variable, and then qualitative theory typically
means disregarding the speed of change, but rather focusing on the shape/geometry of
the solutions image.
The topic of stability takes continuity in intial conditions further and investigates the
behavior of solutions that are, at least initially, close to some given solution. Under
which conditions do nearby solutions approach each other or diverge away from each
other, show the same or different asymptotic behavior etc.

85

5 STABILITY

Even though the abovedescribed considerations are not limited to this situation, a natural starting point is to consider first-order ODE where the right-hand side does not
depend on x. In the following, we will mostly be concerned with this type of ODE,
which has a special name:
Definition 5.1. If Kn , n N, and f : Kn , then the n-dimensional first-order
ODE
y = f (y)
(5.1)
is called autonomous and is called the phase space.
Remark 5.2. In fact, nonautonomous ODE are not really more general than autonomous ODE, due to the, perhaps, surprising Th. J.1 of the Appendix, which states
that every nonautonomous ODE is equivalent to an autonomous ODE. However, this fact
is of little practical relevance, since the autonomous ODE arising via Th. J.1 from nonautonomous ODE can never have bounded solutions on unbounded intervals, whereas the
theory of autonomous ODE is most powerful and useful for ODE that admit bounded
solutions on unbounded intervals (such as constant or periodic solutions, or solutions
approaching constant or periodic functions).
Lemma 5.3. If, in the context of Def. 5.1, : I Kn is a solution to (5.1), defined
on the interval I R, then

: I Kn ,

(x) := (x + ),

where I := {x R : x I},

(5.2)
is another solution to (5.1). In consequence, if is a maximal solution, then so is .
Proof. Clearly, I is an interval. Note x I x + I and, since is a
solution to (5.1), it is (I) , implying (I ) . Finally,



(x) = (x + ) = f (x + ) = f (x) ,
xI

completing the proof that is a solution. Since each extension of yields an extension
of and vice versa, is a maximal solution if, and only if, is a maximal solution. 
Lemma 5.4. If Kn , n N, and f : Kn is such that (5.1) admits unique
maximal solutions (f being locally Lipschitz on open is sufficient, but not necessary,
cf. Def. 3.32), then the global solution Y : Df Kn of (5.1) satisfies
(a) Y (x, , ) = Y (x , 0, ) for each (x, , ) Df .

n
(b) Y x, 0, Y (
x, 0, ) = Y (x+
x, 0, )
 x, 0, ) for each (x, x, ) RRK such that (
Df and x, 0, Y (
x, 0, ) Df .

Proof. (a): If : I, Kn and : I0, Kn denote the maximal solutions to


the initial data y() = and y(0) = , respectively, then (a) claims, using the notation
from Lem. 5.3, = . As a consequence of Lem. 5.3, : I0, + Kn , is some

86

5 STABILITY

maximal solution to (5.1) and, since () = (0) = = (), the assumed uniqueness
yields the claimed = , in particular, I, = I0, + .
(b): Let := Y (
x, 0, ). If : I0, Kn and : I0, Kn denote the maximal
solutions to the initial data y(0) = and y(0) = , respectively, then (b) claims = x .
As a consequence of Lem. 5.3, x : I0, x Kn , is some maximal solution to (5.1)
and, since x (0) = (
x) = = (0), the assumed uniqueness yields the claimed = x ,
in particular, I0, = I0, x.

Definition 5.5. Let I R be an interval and : I S (in principle, S can be
arbitrary).
(a) The image of I under , i.e.
O() := (I) = {(x) : x I} S

(5.3)

is often referred to as the orbit of in the present context of qualitative ODE theory.
(b) : R S (note I = R) is called periodic if, and only if, there exists a smallest
> 0 (called the period of ) such that

xR

(x + ) = (x).

(5.4)

The requirement > 0 means constant functions are not periodic in the sense of
this definition.
Lemma 5.6. Let : R Kn , n N.
(a) If is continuous and (5.4) holds for some > 0, then is either constant or
periodic in the sense of Def. 5.5(b).
(b) (a) is false without the assumption of being continuous.
Proof. Exercise.

Definition 5.7. Let Kn , n N, and f : Kn . In the context of the


autonomous ODE (5.1), the zeros of f are called the fixed points of the ODE (5.1) (cf.
Lem. 5.8 below). One then sometimes uses the notation
F := Ff := { : F () = 0}

(5.5)

for the set of fixed points.


Lemma 5.8. Let Kn , n N, f : Kn , . Then the following statements
are equivalent:
(i) f () = 0, i.e. is a fixed point of (5.1).
(ii) : R Kn , , is a solution to (5.1).

87

5 STABILITY

Proof. If f () = 0 and , then (x) = 0 = f ((x)) for each x R, i.e. (i) implies
(ii). Conversely, if is a solution to (5.1), then f () = f ((x)) = (x) = 0, i.e. (ii)
implies (i).

Proposition 5.9. If Kn , n N, and f : Kn is such that (5.1) admits
unique maximal solutions (f being locally Lipschitz on open is sufficient), then, for
maximal solutions 1 : I1 Kn , 2 : I2 Kn to (5.1), defined on open intervals
I1 , I2 , respectively, precisely one of the following two statements (i) and (ii) is true:
(i) O(1 ) O(2 ) = , i.e. the solutions have disjoint orbits.
(ii) There exists R such that
I2 = I1

and

xI2

2 (x) = 1 (x + ).

(5.6)

In particular, it follows in this case that O(1 ) = O(2 ), i.e. the solutions have
the same orbit.
Proof. Suppose (i) does not hold. Then there are x1 I1 and x2 I2 such that
1 (x1 ) = 2 (x2 ). Define := x1 x2 and consider
: I1 Kn ,

(x) := 1 (x + ).

(5.7)

Then is a maximal solution of (5.1) by Lem. 5.3 and (x2 ) = 1 (x1 ) = 2 (x2 ). By
uniqueness of maximal solutions, we obtain = 2 , in particular, I2 = I1 , proving
(5.6). Clearly, (5.6) implies O(1 ) = O(2 ).


Proposition 5.10. If Kn , n N, and f : Kn is such that (5.1) admits


unique maximal solutions (f being locally Lipschitz on open is sufficient), then, for
each maximal solution : I Kn to (5.1), defined on the open interval I, precisely
one of the following three statements is true:
(i) is injective.
(ii) I = R and is periodic.
(iii) I = R and is constant (in this case := (0) is a fixed point of (5.1)).
Proof. Clearly, (i) (iii) are mutually exclusive. Suppose (i) does not hold. Then there
exist x1 , x2 I, x1 < x2 , such that (x1 ) = (x2 ). Set := x2 x1 . According to Lem.
5.3, : I Kn , (x) := (x + ), must also be a maximal solution to (5.1).
Since (x1 ) = (x1 + ) = (x2 ) = (x1 ), uniqueness implies = and I = I .
As > 0, this means I = R and the validity of (5.4). As is also continuous, by Lem.
5.6(a), either (ii) or (iii) must hold.

Corollary 5.11. If Kn , n N, and f : Kn is such that (5.1) admits unique
maximal solutions (f being locally Lipschitz on open is sufficient), then the orbits
of maximal solutions to (5.1) partition the phase space into disjoint sets. Moreover,
every point is either a fixed point, or it belongs to some periodic orbit, or it belongs
to the orbit of some injective solution.

5 STABILITY

88

Proof. The corollary merely summarizes Prop. 5.9 and Prop. 5.10.

Definition 5.12. In the situation of Cor. 5.11, a phase portrait for (5.1) is a sketch
showing representative orbits. Thus, the sketch shows subsets of the phase space ,
including fixed points (if any) and representative periodic solutions (if any). Usually,
one also uses arrows to indicate the direction in which each drawn orbit is traced as the
variable x increases.
Example 5.13. Even though it is a main goal of qualitative theory to obtain phase
portraits without the need of explicit solution formulas, and we will study techniques
for accomplishing this below, we will make use of explicit solution formulas for our first
two examples of phase portraits.
(a) Consider the autonomous linear ODE
  

y1
y2
.
=
y2
y1

(5.8)

Here we have = R2 and f : , f (y1 , y2 ) = (y2 , y1 ). The only fixed point


is (0, 0). Clearly, for each r > 0, : R R2 , (x) := (r cos x, r sin x) is a solution
to (5.8) and its orbit is the circle with radius r around the origin. Since every point
of belongs to such a circle, every orbit is either the origin or a circle around the
origin. Thus, the phase portrait consists of such circles plus the origin and arrows
that indicate the circles are traversed counterclockwise.
(b) As compared to the previous one, the phase portrait of the autonomous linear ODE
   
y1
y
(5.9)
= 2

y2
y1
is more complicated: While (0, 0) is still the only fixed point, for each r > 0, all the
following functions 1 , 2 , 3 , 4 : R R2 are solutions:
1 (x) := (r cosh x, r sinh x),
2 (x) := (r cosh x, r sinh x),
3 (x) := (r sinh x, r cosh x),
4 (x) := (r sinh x, r cosh x),

(5.10a)
(5.10b)
(5.10c)
(5.10d)

each type describing a hyperbolic orbit in some section of the plane R2 . These
sections are separated by rays, forming the orbits of the solutions 5 , 6 , 7 , 8 :
R R2 :
5 (x) := (ex , ex ),
6 (x) := (ex , ex ),
7 (x) := (ex , ex ),
8 (x) := (ex , ex ).

(5.10e)
(5.10f)
(5.10g)
(5.10h)

89

5 STABILITY

The two rays on {(y1 , y1 ) : y1 6= 0} move away from the origin, whereas the two rays
on {(y1 , y1 ) : y1 6= 0} move toward the origin. The hyperbolic orbits asymptotically approach the ray orbits and are traversed such that the flow direction agrees
between approaching orbits.

The next results will be useful to obtain new phase portraits from previously known
phase portraits in certain situations.
Proposition 5.14. Let Kn , n N, let I R be some nontrivial interval, let
f : Kn , and let : I Kn be a solution to
y = (x) f (y),

(5.11)

where : I R is continuous. If (x) 6= 0 for each x I (if one thinks of x as


time, then one can think of as the velocity of ), then there exists a continuously
differentiable bijective map : J I, defined on some nontrivial interval J, such that
( ) : J Kn is a solution to y = f (y).
Proof. Since (x) 6= 0 for each x I, one has (x) 6= 0 for each x I. As is also
continuous, it must be either always
negative or always positive. In consequence, fixing
Rx
x0 I, : I R, (x) := x0 (t) dt , is continuous and either strictly increasing or
strictly decreasing. In particular, is injective, J := (I) is an interval, and : I J
is bijective. The desired function is := 1 : J I. Indeed, according to [Phi13a,
Th. 9.8], is differentiable and its derivative is the continuous function
: J R,

(x) =

implying, for each x J,

1
,
(x)




( ) (x) = (x) (x) = (x) f ((x))

showing is a solution to y = f (y) as required.


1
 = f ((x)) ,
(x)

Proposition 5.15. Let Kn , n N, and f : Kn . Moreover, consider a


continuous function h : R with the property that either h > 0 everywhere on
or h < 0 everywhere on .
(a) If f has no zeros (i.e. F = ), then the ODE
y = f (y),
y = h(y) f (y)

(5.12a)
(5.12b)

have precisely the same orbits, i.e. every orbit of a solution to (5.12a) is an orbit
of a solution to (5.12b) and vice versa.

90

5 STABILITY

(b) If f and h are such that the ODE (5.12) admit unique maximal solutions, then the
ODE (5.12) have precisely the same orbits (even if F 6= ).
Proof. (a): If : I Kn is a solution to (5.12b), then := h is well-defined
and continuous. Since F = implies 6= 0, we can apply Prop. 5.14 to obtain
the existence of a bijective 1 : J1 I such that 1 is a solution to (5.12a).
Thus, O() = O( 1 ). Conversely, if : I Kn is a solution to (5.12a), i.e. to
f (y), then := 1/(h ) is well-defined and continuous. Since F = implies
y = h(y)
h(y)

6= 0, we can apply Prop. 5.14 to obtain the existence of a bijective 2 : J2 I such


that 2 is a solution to (5.12b). Thus, O() = O( 2 ).
(b): We are now in the situation of Prop. 5.10 and Cor. 5.11, and from (a) we know every
nonconstant orbit of (5.12a) is a nonconstant orbit of (5.12b) and vice versa. However,
since h > 0 or h < 0, both ODE in (5.12) have precisely the same constant solutions,
concluding the proof.


Remark 5.16. We apply Prop. 5.15 to phase portraits (in particular, assume unique
maximal solutions). Prop. 5.15 says that overall multiplication with a continuous positive function h does not change the phase portrait at all. Moreover, Prop. 5.15 also
states that overall multiplication with a continuous negative function h does not change
the partition of into solution orbits. However, after multiplication with a negative h,
the orbits are clearly traversed in the opposite direction, i.e., for negative h, the arrows
in the phase portrait have to be reversed. For a general continuous h, this implies the
phase portrait remains the same in each region of , where h > 0; it remains the same,
except for the arrows reversed, in each region of , where h < 0; and the zeros of h
add additional fixed points, cutting some of the previous orbits. We summarize how to
obtain the phase portrait of (5.12b) from that of (5.12a):
(1) Start with the phase portrait of (5.12a).
(2) Add the zeros of h as additional fixed points (if any). Previous orbits are cut, where
fixed points are added.
(3) Reverse the arrows where h < 0.
Example 5.17. (a) Consider the ODE

y1 = y2 (y1 1)2 + y22 ,

y2 = y1 (y1 1)2 + y22 ,

(5.13)

which comes from multiplying the right-hand side of (5.8) by h(y) = (y1 1)2 + y22 .
The phase portrait is the same as the one for (5.8), except for the added fixed point
at {(1, 0)}.
(b) Consider the ODE
y1 = y1 y2 + y22 ,
y2 = y1 y2 + y12 ,

(5.14)

91

5 STABILITY

which comes from multiplying the right-hand side of (5.8) by h(y) = y1 y2 . The
phase portrait is obtained from that of (5.8), where additional fixed points are on
the line with y1 = y2 . This line cuts each previously circular orbit into two segments.
The arrows have to be reversed for y2 > y1 , that means above the y1 = y2 line.
Definition 5.18. Let Rn , n N, and f : Rn . A function E : R is
called an integral for the autonomous ODE (5.1), i.e. for y = f (y), if, and only if, E
is constant for every solution of (5.1).
Lemma 5.19. Let Rn be open, n N, and f : Rn such that each initial
value problem for (5.1) has at least one solution (f continuous is sufficient by Th. 3.8).
Then a differentiable function E : R is an integral for (5.1) if, and only if,

( E)(y) f (y) =

n
X

j E(y) fj (y) = 0.

(5.15)

j=1

Proof. Let : I Rn be a solution to y = f (y). Then, by the chain rule,

xI

(E ) (x) = ( E)((x)) (x) = ( E)((x)) f ((x)).

(5.16)

The differentiable function E : I R is constant on the interval I if, and only if,
(E ) 0. Thus, by (5.16), E being constant for every solution is equivalent to
( E) f (y) = 0 for each y such that at least one solution passes through y. 

Example J.2 of the Appendix, pointed out by Anton Sporrer, shows the hypothesis of
Lem. 5.19, that each initial value problem for (5.1) has at least one solution, can not be
omitted. The following Prop. 5.20 makes use of integrals and applies to phase portraits
of 2-dimensional real ODE:
Proposition 5.20. Let R2 be open, and let f : R2 be continuous and
such that (5.1) admits unique maximal solutions (f being locally Lipschitz is sufficient).
Assume E : R to be a continuously differentiable integral for (5.1), i.e. for
y = f (y), satisfying E(y) 6= 0 for each y . Then the following statements hold
for each maximal solution : I R2 of (5.1) (I R some open interval):
(a) If (xm )mN is a sequence in I such that limm (xm ) = , then F (i.e.
is a fixed point) or O() (i.e. there exists I with () = ).

(b) Let C R be such that E C (such a C exists, as E is an integral). If


E 1 {C} = {y : E(y) = C} is compact and E 1 {C} F = , then is
periodic.
Proof. Throughout the proof let C be as in (b), i.e. E ∘ φ ≡ C.

(a): The continuity of E yields E(η) = lim_{m→∞} E(φ(x_m)) = C. Moreover, by hypothesis, (ζ₁, ζ₂) := ∇E(η) ≠ (0, 0). We proceed with the proof for ζ₂ ≠ 0; if ζ₂ = 0 and ζ₁ ≠ 0, then the roles of the indices 1, 2 have to be switched in the following. We apply the implicit function theorem [Phi13b, Th. C.9] to the function f̃ : Ω → R, f̃(y) := E(y) − C at its zero η = (η₁, η₂). By [Phi13b, Th. C.9], there exist α, β > 0 and a continuously differentiable map g : I_g → R, I_g := ]η₁ − α, η₁ + α[, such that g(η₁) = η₂,

∀ s ∈ I_g :   E(s, g(s)) = C,   (5.17a)

and, having fixed some arbitrary norm ‖·‖ on R²,

∀ y ∈ Ω :   (‖y − η‖ < β ∧ E(y) = C)  ⇒  (∃ s ∈ I_g :  y = (s, g(s))).   (5.17b)

We now assume η ∉ F and show η ∈ O(φ). If η ∉ F, then f(η) ≠ 0 and the continuity of f and g imply there is α̃ > 0, α̃ ≤ α, such that, for each s ∈ Ĩ := ]η₁ − α̃, η₁ + α̃[, f(s, g(s)) ≠ 0. Define the auxiliary function ψ : Ĩ → Ω, ψ(s) := (s, g(s)). Since E ∘ ψ ≡ C, we can employ the chain rule to conclude

∀ s ∈ Ĩ :   0 = (E ∘ ψ)′(s) = (∇E)(ψ(s)) · ψ′(s),   (5.18)

i.e. the two-dimensional vectors (∇E)(ψ(s)) and ψ′(s) are orthogonal with respect to the Euclidean scalar product. As E is an integral, using (5.15), f(ψ(s)) is another vector orthogonal to (∇E)(ψ(s)) and, since all vectors in R² orthogonal to (∇E)(ψ(s)) form a 1-dimensional subspace of R² (recalling (∇E)(ψ(s)) ≠ 0), there exists μ(s) ∈ R such that

ψ′(s) = μ(s) f(ψ(s))   (5.19)

(note f(ψ(s)) ≠ 0 as s ∈ Ĩ). We can now apply Prop. 5.14, since (5.19) says ψ is a solution to (5.11), the function μ : Ĩ → R, s ↦ μ(s) = ψ′(s)/f(ψ(s)), is continuous, and ψ′(s) = (1, g′(s)) ≠ (0, 0) for each s ∈ Ĩ. Thus, Prop. 5.14 provides a bijective τ : J → Ĩ such that ψ ∘ τ is a solution to y′ = f(y).

As we assume lim_{m→∞} φ(x_m) = η, there exists M ∈ N such that ‖φ(x_m) − η‖ < β for each m ≥ M. Since E(φ(x_m)) = C also holds, (5.17b) implies the existence of a sequence (s_m)_{m≥M} in Ĩ such that φ(x_m) = (s_m, g(s_m)) for each m ≥ M. Then, for each m ≥ M and σ_m := τ⁻¹(s_m), (ψ ∘ τ)(σ_m) = ψ(s_m) = φ(x_m). On the other hand, for σ₀ := τ⁻¹(η₁), (ψ ∘ τ)(σ₀) = ψ(η₁) = η, showing φ(x_m), η ∈ O(ψ ∘ τ). Since φ(x_m) ∈ O(φ) as well, Prop. 5.9 implies O(ψ ∘ τ) ⊆ O(φ), i.e. η ∈ O(φ), which proves (a). In preparation for (b), we also observe that ‖φ(x_m) − η‖ < β for each m ≥ M implies the s_m for m ≥ M all are in some compact interval I₁ ⊆ Ĩ with η₁ ∈ I₁, implying the σ_m to be in the compact interval J₁ := τ⁻¹[I₁] with σ₀ ∈ J₁. We will use for (b) that J₁ is bounded.

(b): As we have O(φ) ⊆ E⁻¹{C} according to the choice of C, the assumed compactness of E⁻¹{C} and Prop. 3.24 show φ can only be maximal if it is defined on all of R (since (x, φ(x)) must escape every compact [−m, m] × E⁻¹{C}, m ∈ N, on the left and on the right). Using the compactness of E⁻¹{C} a second time, we obtain the existence of a sequence (x_m)_{m∈N} in R such that lim_{m→∞} x_m = ∞ and lim_{m→∞} φ(x_m) = η ∈ E⁻¹{C}. So we see that we are in the situation of (a). Let ψ̃ be the maximal extension of the solution ψ ∘ τ constructed in the proof of (a). Then we know η ∈ O(ψ̃) ∩ O(φ) ≠ ∅ from the proof of (a) and, since φ and ψ̃ both are maximal, Prop. 5.9 implies O(φ) = O(ψ̃) and, more importantly for us here, there exists Δ ∈ R such that ψ̃(x) = φ(x + Δ) for each x ∈ R. Let m ≥ M with M from the proof of (a). If Δ ≠ 0, then φ(x_m) = ψ̃(σ_m) = φ(σ_m + Δ) shows φ is not injective. If Δ = 0, then ψ̃ = φ and φ(x_m) = φ(σ_m). Since the σ_m are bounded, whereas the x_m are unbounded, x_m = σ_m cannot be true for all m, again showing φ is not injective. Since E⁻¹{C} ∩ F = ∅, φ cannot be constant, therefore it must be periodic by Prop. 5.10. ∎

Example 5.21. Using condition (5.15), i.e. ∇E · f ≡ 0, one readily verifies that the functions

E : R² → R,   E(y₁, y₂) := y₁² + y₂²,   (5.20a)
Ẽ : R² → R,   Ẽ(y₁, y₂) := y₁² − y₂²,   (5.20b)

are integrals for (5.8) and (5.9), i.e. for

(y₁′, y₂′) = (−y₂, y₁)   and   (y₁′, y₂′) = (y₂, y₁),

respectively, and we recover the respective phase portraits via the respective level curves E(y₁, y₂) = C and Ẽ(y₁, y₂) = C, C ∈ R.
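As a quick numerical sanity check (not part of the original argument), one can integrate (5.8) and observe that the integral E(y₁, y₂) = y₁² + y₂² stays constant along the trajectory; a minimal sketch, assuming scipy is available:

import numpy as np
from scipy.integrate import solve_ivp

def f(x, y):
    # right-hand side of (5.8): y1' = -y2, y2' = y1
    return [-y[1], y[0]]

def E(y):
    # the integral from (5.20a)
    return y[0]**2 + y[1]**2

sol = solve_ivp(f, (0.0, 20.0), [1.0, 0.5], rtol=1e-10, atol=1e-12,
                dense_output=True)
print([E(sol.sol(x)) for x in np.linspace(0.0, 20.0, 5)])
# all values agree with E(1, 0.5) = 1.25 up to solver tolerance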
Example 5.22. Consider the autonomous ODE

(y₁′, y₂′) = (2y₁y₂, 1 − 2y₁²).   (5.21)

We claim that

E : R² → R,   E(y₁, y₂) := y₁ e^{−(y₁² + y₂²)},   (5.22)

is an integral for (5.21) and intend to use Prop. 5.20 to establish (5.21) has orbits that are fixed points, orbits that are periodic, and orbits that are neither. To verify E is an integral, one computes, for each (y₁, y₂) ∈ R²,

∇E(y₁, y₂) · (2y₁y₂, 1 − 2y₁²)
= (e^{−(y₁² + y₂²)} − 2y₁² e^{−(y₁² + y₂²)}, −2y₁y₂ e^{−(y₁² + y₂²)}) · (2y₁y₂, 1 − 2y₁²)
= e^{−(y₁² + y₂²)} (1 − 2y₁², −2y₁y₂) · (2y₁y₂, 1 − 2y₁²) = 0.

Clearly, the set of fixed points is

F = {(1/√2, 0), (−1/√2, 0)}.

The level set of 0 is E⁻¹{0} = {(0, y₂) : y₂ ∈ R}, i.e. it is the y₂-axis. This is a nonperiodic orbit (actually, the orbit of solutions of the form φ : R → R², φ(x) := (0, x + c), c ∈ R). Now consider the level set

E⁻¹{e⁻¹} = {(y₁, y₂) : y₁ > 0, y₂² = ln y₁ − y₁² + 1}.

Using g : R⁺ → R, g(y₁) = ln y₁ − y₁² + 1, and its derivative, it is not hard to show g has precisely two zeros, namely ζ₁ = 1 and 0 < ζ₂ < 1, and g ≥ 0 precisely on the compact interval J := [ζ₂, 1], implying E⁻¹{e⁻¹} = {(y₁, ±√g(y₁)) : y₁ ∈ J}, showing E⁻¹{e⁻¹} is compact. According to Prop. 5.20(b), E⁻¹{e⁻¹} must consist of one or more periodic orbits.
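Both the computation ∇E · f ≡ 0 for (5.21) and the claim about the zeros of g are easy to double-check symbolically; a small sketch, assuming sympy:

import sympy as sp

y1, y2 = sp.symbols('y1 y2', real=True)
E = y1 * sp.exp(-(y1**2 + y2**2))        # the integral (5.22)
f = (2*y1*y2, 1 - 2*y1**2)               # right-hand side of (5.21)

Edot = sp.diff(E, y1)*f[0] + sp.diff(E, y2)*f[1]
print(sp.simplify(Edot))                 # 0, confirming condition (5.15)

g = sp.log(y1) - y1**2 + 1               # g from the level set E^{-1}{e^{-1}}
print(sp.nsolve(g, y1, 0.3), sp.nsolve(g, y1, 1.0))
# numeric values of the two zeros: zeta_2 (approx. 0.45) and zeta_1 = 1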

5.2 Stability at Fixed Points

Given an autonomous ODE with a fixed point p, we will investigate the question under what conditions a solution φ(x) starting out near p will remain near p as x increases or decreases.

To simplify notation, we will restrict ourselves to initial data y(0) = y₀, which, in light of Lem. 5.4(b), is not an essential restriction.
Notation 5.23. Let Ω ⊆ Kⁿ, n ∈ N, and f : Ω → Kⁿ such that

y′ = f(y)   (5.23)

admits unique maximal solutions (f being locally Lipschitz on Ω open is sufficient). Let Y : D_f → Kⁿ denote the general solution to (5.23) and define

Ỹ : D_{f,0} → Kⁿ,   Ỹ(x, ξ) := Y(x, 0, ξ),   D_{f,0} := {(x, ξ) ∈ R × Kⁿ : (x, 0, ξ) ∈ D_f}.   (5.24)
Definition 5.24. Let Ω ⊆ Kⁿ, n ∈ N, and f : Ω → Kⁿ such that (5.23) admits unique maximal solutions (f being locally Lipschitz on Ω open is sufficient). Moreover, assume the set of fixed points to be nonempty, F ≠ ∅, and let p ∈ F. The fixed point p is said to be positively (resp. negatively) stable if, and only if, the following conditions (i) and (ii) hold:

(i) There exists r > 0 such that, for each ξ ∈ Ω with ‖ξ − p‖ < r, the maximal solution x ↦ Ỹ(x, ξ) (cf. (5.24)) is defined on (a superset of) R₀⁺ (resp. R₀⁻).

(ii) For each ε > 0, there exists δ > 0 such that, for each ξ ∈ Ω,

‖ξ − p‖ < δ   ⇒   ∀ x ≥ 0 (resp. x ≤ 0) :  ‖Ỹ(x, ξ) − p‖ < ε.   (5.25)

The fixed point p is said to be positively (resp. negatively) asymptotically stable if, and only if, (i) and (ii) hold plus the additional condition

(iii) There exists δ > 0 such that, for each ξ ∈ Ω,

‖ξ − p‖ < δ   ⇒   lim_{x→∞} Ỹ(x, ξ) = p  (resp. lim_{x→−∞} Ỹ(x, ξ) = p).   (5.26)


The norm ‖·‖ on Kⁿ used in (i) – (iii) above is arbitrary. Due to the equivalence of norms on Kⁿ, changing the norm does not change the defined stability properties, even though, in general, it does change the sizes of r, δ, ε.
Remark 5.25. In the situation of Def. 5.24, consider the time-reversed version of (5.23), i.e.

y′ = −f(y).   (5.27)

According to Lem. 1.9(b), (5.27) has the general solution

Ỹ⁻ : D⁻_{f,0} → Kⁿ,   Ỹ⁻(x, ξ) := Ỹ(−x, ξ),   D⁻_{f,0} = {(x, ξ) ∈ R × Kⁿ : (−x, ξ) ∈ D_{f,0}}.   (5.28)

(a) Clearly, for a fixed point p ∈ F, we have the following equivalences:

p is positively stable for (5.23)  ⇔  p is negatively stable for (5.27),
p is negatively stable for (5.23)  ⇔  p is positively stable for (5.27).

(b) Clearly, for a fixed point p ∈ F, we have the following equivalences:

p is pos. asympt. stable for (5.23)  ⇔  p is neg. asympt. stable for (5.27),
p is neg. asympt. stable for (5.23)  ⇔  p is pos. asympt. stable for (5.27).

Lemma 5.26. Consider the situation of Def. 5.24 with f : Ω → Kⁿ continuous on Ω ⊆ Kⁿ open. Then the fixed point p is positively (resp. negatively) stable if, and only if, for each ε > 0, there exists δ > 0 such that, for each ξ ∈ Ω,

‖ξ − p‖ < δ   ⇒   ∀ x ∈ I_{(0,ξ)} ∩ R₀⁺ (resp. x ∈ I_{(0,ξ)} ∩ R₀⁻) :  ‖Ỹ(x, ξ) − p‖ < ε,   (5.29)

where I_{(0,ξ)} denotes the domain of the maximal solution Ỹ(·, ξ).


Proof. Clearly, stability in the sense of Def. 5.24 implies (5.29), and it merely remains to show that (5.29) implies Def. 5.24(i). As (5.29) holds, we can consider ε := 1 and obtain a corresponding δ =: r. Then (5.29) states that, for each ξ ∈ Ω with ‖ξ − p‖ < r, for x ≥ 0 (resp. for x ≤ 0), the maximal solution Ỹ(x, ξ) remains in the compact set B̄₁(p). Since f : Ω → Kⁿ is continuous on Ω ⊆ Kⁿ open, Th. 3.28 implies R₀⁺ ⊆ I_{(0,ξ)} (resp. R₀⁻ ⊆ I_{(0,ξ)}), proving Def. 5.24(i). ∎

It is an exercise to show Lem. 5.26 becomes false if the hypothesis that f be continuous is omitted.
Example 5.27. (a) Consider the 1-dimensional R-valued ODE

y′ = y(y − 1).   (5.30)

The set of fixed points is F = {0, 1}. Moreover, Ỹ′(·, ξ) < 0 for 0 < ξ < 1 and Ỹ′(·, ξ) > 0 for ξ ∈ ]−∞, 0[ ∪ ]1, ∞[. It follows that, for p = 0, the positive stability part of (5.29) holds (where, given ε > 0, one can choose δ := min{1, ε}). Moreover, for ξ < 0 and 0 < ξ < 1, one has lim_{x→∞} Ỹ(x, ξ) = 0. Thus, all three conditions of Def. 5.24 are satisfied and 0 is positively asymptotically stable. Analogously, one sees that 1 is negatively asymptotically stable.
(b) For the R²-valued ODE of (5.8), (0, 0) is a fixed point that is positively and negatively stable, but neither positively nor negatively asymptotically stable. For the R²-valued ODE of (5.9), (0, 0) is a fixed point that is neither positively nor negatively stable.
(c) Consider the 1-dimensional R-valued ODE

y′ = y².   (5.31)

The only fixed point is 0, which is neither positively nor negatively stable. Indeed, not even Def. 5.24(i) is satisfied: One obtains

Ỹ : D_{f,0} → R,   Ỹ(x, ξ) := ξ/(1 − ξx),

where

D_{f,0} = (R × {0}) ∪ {(x, ξ) ∈ R² : ξ > 0, x ∈ ]−∞, 1/ξ[} ∪ {(x, ξ) ∈ R² : ξ < 0, x ∈ ]1/ξ, ∞[},

showing every neighborhood of 0 contains ξ such that Ỹ(·, ξ) is not defined on all of R₀⁺ and ξ̃ such that Ỹ(·, ξ̃) is not defined on all of R₀⁻.
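The qualitative behavior in (a) and (c) is easy to observe numerically; the following sketch (assuming scipy) integrates (5.30) from ξ = ±1/2 and exhibits the finite-time blow-up of (5.31) for ξ > 0:

import numpy as np
from scipy.integrate import solve_ivp

# (5.30): solutions started near the fixed point 0 converge to 0 as x -> infinity
for xi in (0.5, -0.5):
    sol = solve_ivp(lambda x, y: y*(y - 1.0), (0.0, 30.0), [xi], rtol=1e-9)
    print(xi, sol.y[0, -1])        # both final values are close to 0

# (5.31): Y(x, xi) = xi/(1 - xi*x) leaves every bound as x approaches 1/xi
xi = 0.5
xs = np.linspace(0.0, 1.9, 5)
print(xi / (1.0 - xi*xs))          # blows up as x approaches 1/xi = 2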
Remark 5.28. There exist examples of autonomous ODE that show fixed points can satisfy Def. 5.24(iii) without satisfying Def. 5.24(ii). For example, [Aul04, Ex. 7.4.16] provides the following ODE in polar coordinates (r, φ):

r′ = r(1 − r),   (5.32a)
φ′ = sin²(φ/2) = (1 − cos φ)/2.   (5.32b)

Even though it is somewhat tedious, one can show that its fixed point (1, 0) satisfies Def. 5.24(iii) without satisfying Def. 5.24(ii) (see Claim 4 of Example K.2 in the Appendix).

We will now study a method that, in certain cases, allows one to determine the stability properties of a fixed point without having to know the solutions to an ODE. The method is known as Lyapunov's method. The key ingredient to this method is a test function V, known as a Lyapunov function. Once a Lyapunov function is known, stability is often easily tested. The catch, however, is that Lyapunov functions can be hard to find. From the literature, it appears there is no definition for an all-purpose Lyapunov function, as a suitable choice depends on the circumstances.


Definition 5.29. Let Ω₀ ⊆ Rⁿ be open, n ∈ N. A function V : Ω₀ → R is said to be positive (resp. negative) definite at p ∈ Ω₀ if, and only if, the following conditions (i) and (ii) hold:

(i) V(y) ≥ 0 (resp. V(y) ≤ 0) for each y ∈ Ω₀.

(ii) V(y) = 0 if, and only if, y = p.
Theorem 5.30 (Lyapunov). Consider the situation of Def. 5.24 with K = R, Ω ⊆ Rⁿ open, and f : Ω → Rⁿ continuous. Let Ω₀ be open with p ∈ Ω₀ ⊆ Ω ⊆ Rⁿ. Assume V : Ω₀ → R to be continuously differentiable and define

V̇ : Ω₀ → R,   V̇(y) := (∇V)(y) · f(y) = Σ_{j=1}^n ∂_j V(y) f_j(y).   (5.33)

If V is positive definite at p and V̇ ≤ 0 (resp. V̇ ≥ 0) on Ω₀, then p is positively (resp. negatively) stable. If, in addition, V̇ is negative (resp. positive) definite at p, then p is positively (resp. negatively) asymptotically stable.
Proof. The proof is carried out for the case of positive (asymptotic) stability; the proof for the case of negative (asymptotic) stability is then easily obtained by reversing time, i.e. by using Rem. 5.25 together with noting V̇ changing its sign when replacing f with −f. Fix your favorite norm ‖·‖ on Rⁿ. Let r > 0 such that B̄_r(p) = {y ∈ Rⁿ : ‖y − p‖ ≤ r} ⊆ Ω₀ (such an r > 0 exists, as Ω₀ is open). Define

k : ]0, r] → R⁺,   k(ε) := min{V(y) : ‖y − p‖ = ε},   (5.34)

where k is well-defined, since the continuous function V assumes its min on compact sets, and k(ε) > 0 by the positive definiteness of V. Given ε ∈ ]0, r], since V(p) = 0, k(ε) > 0, and V continuous,

∃ 0 < δ(ε) < ε  ∀ y ∈ B_{δ(ε)}(p) :   V(y) < k(ε),   (5.35)

where we used Not. 3.3 to denote an open ball with center p with respect to ‖·‖.

We now claim that, for each ξ ∈ B_{δ(ε)}(p), the maximal solution x ↦ φ(x) := Ỹ(x, ξ) must remain inside B_ε(p) for each x ≥ 0 in its domain I_{(0,ξ)} (implying p to be positively stable by Lem. 5.26). Seeking a contradiction, assume there exists τ ≥ 0 such that ‖φ(τ) − p‖ ≥ ε and let

s := sup{x ≥ 0 : φ(t) ∈ B_ε(p) for each t ∈ [0, x]} < ∞.   (5.36)

The continuity of φ then implies ‖φ(s) − p‖ = ε, i.e.

V(φ(s)) ≥ k(ε)   (5.37)

by the definition of k(ε). On the other hand, by the chain rule (V ∘ φ)′(x) = V̇(φ(x)) (cf. (5.16)), such that V̇ ≤ 0 implies

V(φ(s)) = V(ξ) + ∫₀ˢ V̇(φ(x)) dx ≤ V(ξ) < k(ε)   (by (5.35)),   (5.38)

in contradiction to (5.37), proving φ(x) ∈ B_ε(p) for each x ∈ I_{(0,ξ)} ∩ R₀⁺ and the positive stability of p.

For the remaining part of the proof, we additionally assume V̇ to be negative definite at p, while continuing to use the notation from above. Set δ := δ(r). We have to show lim_{x→∞} Ỹ(x, ξ) = p for each ξ ∈ B_δ(p), i.e.

∀ ε ∈ ]0, r]  ∃ x₀ ≥ 0  ∀ x ≥ x₀ :   ‖Ỹ(x, ξ) − p‖ < ε.   (5.39)

So fix ξ ∈ B_δ(p) and, as above, let φ(x) := Ỹ(x, ξ). Given ε ∈ ]0, r], we first claim that there exists τ ≥ 0 such that φ(τ) ∈ B_{δ(ε)}(p), where δ(ε) is as in the first part of the proof above. Indeed, seeking a contradiction, assume ‖φ(x) − p‖ ≥ δ(ε) for all x ≥ 0, and set

α := max{V̇(y) : δ(ε) ≤ ‖y − p‖ ≤ r}.   (5.40)

Then α < 0 due to the negative definiteness of V̇ at p. Moreover, due to the choice of ξ, we have δ(ε) ≤ ‖φ(x) − p‖ ≤ r for each x ≥ 0, implying

∀ x ≥ 0 :   0 ≤ V(φ(x)) = V(ξ) + ∫₀ˣ V̇(φ(t)) dt ≤ V(ξ) + αx,   (5.41)

which is the desired contradiction, as α < 0 implies the right-hand side to go to −∞ for x → ∞. Thus, we know the existence of τ ≥ 0 such that η := φ(τ) ∈ B_{δ(ε)}(p).

To finish the proof, we recall from the first part of the proof that ‖Ỹ(x, η) − p‖ < ε for each x ≥ 0. Using Lem. 5.4(a), we obtain

∀ x ≥ 0 :   φ(τ + x) = Y(τ + x, τ, η) = Ỹ(τ + x − τ, η) = Ỹ(x, η) ∈ B_ε(p),   (5.42)

showing ‖φ(x) − p‖ < ε for each x ≥ τ as needed. ∎
Example 5.31. Let k, m ∈ N and α, β > 0. We claim that (0, 0) is a positively asymptotically stable fixed point for each R²-valued ODE of the form

(y₁′, y₂′) = (−y₁^{2k−1} + α y₁ y₂², −y₂^{2m−1} − β y₁² y₂).   (5.43)

Indeed, (0, 0) is clearly a fixed point, and we consider the Lyapunov function

V : R² → R,   V(y₁, y₂) := y₁²/α + y₂²/β,   (5.44a)

which is clearly positive definite at (0, 0). Since V̇ : R² → R,

V̇(y₁, y₂) = ∇V(y₁, y₂) · (−y₁^{2k−1} + α y₁y₂², −y₂^{2m−1} − β y₁²y₂)
= (2y₁/α, 2y₂/β) · (−y₁^{2k−1} + α y₁y₂², −y₂^{2m−1} − β y₁²y₂)
= −2(y₁^{2k}/α + y₂^{2m}/β),   (5.44b)

is clearly negative definite at (0, 0), Th. 5.30 proves (0, 0) to be a positively asymptotically stable fixed point.
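For concreteness, the identity (5.44b) can be verified symbolically for sample parameters (here k = 2, m = 1, α = 3, β = 5 — arbitrary choices, not from the text); a sketch, assuming sympy:

import sympy as sp

y1, y2 = sp.symbols('y1 y2', real=True)
k, m, alpha, beta = 2, 1, 3, 5           # sample values for the claim in (5.43)

f = (-y1**(2*k - 1) + alpha*y1*y2**2,
     -y2**(2*m - 1) - beta*y1**2*y2)     # right-hand side of (5.43)
V = y1**2/alpha + y2**2/beta             # Lyapunov function (5.44a)

Vdot = sp.diff(V, y1)*f[0] + sp.diff(V, y2)*f[1]
expected = -2*(y1**(2*k)/alpha + y2**(2*m)/beta)
print(sp.simplify(Vdot - expected))      # 0, confirming (5.44b)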
Theorem 5.32. Consider the situation of Def. 5.24 with K = R. Let Ω₀ be open with p ∈ Ω₀ ⊆ Ω ⊆ Rⁿ. Assume V : Ω₀ → R to be continuously differentiable and assume there is an open set U ⊆ Ω₀ such that the following conditions (i) – (iii) are satisfied:

(i) p ∈ ∂U, i.e. p is in the boundary of U.

(ii) V > 0 and V̇ > 0 (resp. V̇ < 0) on U (where V̇ is defined as in (5.33)).

(iii) V(y) = 0 for each y ∈ Ω₀ ∩ ∂U.

Then the fixed point p is not positively (resp. negatively) stable.
Proof. We assume V and V̇ are positive, proving p not to be positively stable; the corresponding statement regarding p not to be negatively stable is then, once again, easily obtained by reversing time, i.e. by using Rem. 5.25 together with noting V̇ changing its sign when replacing f with −f.

Seeking a contradiction, assume p to be positively stable. Then there exists r > 0 such that B̄_r(p) = {y ∈ Rⁿ : ‖y − p‖ ≤ r} ⊆ Ω₀ and ξ ∈ B_r(p) implies Ỹ(x, ξ) is defined for each x ≥ 0. Moreover, positive stability and p ∈ ∂U also imply the existence of ξ ∈ U ∩ B_r(p) such that φ(x) := Ỹ(x, ξ) ∈ B_r(p) for all x ≥ 0 (note p ≠ ξ as p ∉ U). Set

s := sup{x ≥ 0 : φ(t) ∈ U for each t ∈ [0, x]}.   (5.45)

If s < ∞, then the maximality of φ implies φ(s) to be defined. Moreover, φ(s) ∈ ∂U by the definition of s, and φ(s) ∈ B̄_r(p) ⊆ Ω₀ by the choice of ξ. Thus, φ(s) ∈ Ω₀ ∩ ∂U and V(φ(s)) = 0. On the other hand, as V and V̇ are positive on U, we have

V(φ(s)) = V(ξ) + ∫₀ˢ V̇(φ(t)) dt > V(ξ) > 0,   (5.46)

which is a contradiction to V(φ(s)) = 0, implying s = ∞ and that φ(x) ∈ U as well as V(φ(x)) > V(ξ) > 0 hold for each x > 0.

To conclude the proof, consider the compact set

C := B̄_r(p) ∩ Ū ∩ V⁻¹([V(ξ), ∞[) ⊆ Ω₀.   (5.47)

Then the choice of ξ guarantees φ(x) ∈ C for all x ≥ 0. If y ∈ C, then V(y) ≥ V(ξ) > 0. If y ∈ Ω₀ ∩ ∂U, then V(y) = 0, showing C ∩ ∂U = ∅, i.e. C ⊆ U and

β := min{V̇(y) : y ∈ C} > 0.   (5.48)

Thus,

∀ x ≥ 0 :   V(φ(x)) = V(ξ) + ∫₀ˣ V̇(φ(t)) dt ≥ V(ξ) + βx.   (5.49)

But this means that the continuous function V is unbounded on the compact set C and this contradiction proves p is not positively stable. ∎

Example 5.33. Let h₁, h₂ : Ω → R be continuously differentiable functions defined on some open set Ω ⊆ R² with (0, 0) ∈ Ω and h₁(0, 0) > 0, h₂(0, 0) > 0. We claim that (0, 0) is not a positively stable fixed point for each R²-valued ODE of the form

(y₁′, y₂′) = (y₂ h₁(y₁, y₂), y₁ h₂(y₁, y₂)).   (5.50)

Indeed, (0, 0) is clearly a fixed point, and we let Ω₀ be some open neighborhood of (0, 0), where both h₁ and h₂ are positive (such an Ω₀ exists by continuity of h₁, h₂ and h₁, h₂ being positive at (0, 0)), and consider the Lyapunov function

V : Ω₀ → R,   V(y₁, y₂) := y₁y₂,   (5.51a)

with V̇ : Ω₀ → R,

V̇(y₁, y₂) = ∇V(y₁, y₂) · (y₂ h₁(y₁, y₂), y₁ h₂(y₁, y₂))
= (y₂, y₁) · (y₂ h₁(y₁, y₂), y₁ h₂(y₁, y₂))
= y₂² h₁(y₁, y₂) + y₁² h₂(y₁, y₂) > 0  on Ω₀ \ {(0, 0)}.   (5.51b)

Letting U := Ω₀ ∩ (R⁺ × R⁺), one has (0, 0) ∈ ∂U, both V and V̇ are positive on U, and V = 0 on Ω₀ ∩ ∂U ⊆ ({0} × R) ∪ (R × {0}). Thus, Th. 5.32 applies, yielding that (0, 0) is not positively stable.
Theorem 5.34. Let Ω ⊆ Rⁿ be open, n ∈ N. Let F : Ω → R be C² and consider

y′ = −∇F(y).   (5.52)

If p ∈ Ω is an isolated critical point of F (i.e. ∇F(p) = 0 and there exists an open set O ⊆ Ω with p ∈ O and ∇F ≠ 0 on O \ {p}), then p is a fixed point of (5.52) that is positively asymptotically stable, negatively asymptotically stable, or neither positively nor negatively stable according as p is a local minimum for F, a local maximum for F, or neither.
Proof. Note that F being C² implies ∇F to be C¹ and, in particular, locally Lipschitz, such that (5.52) admits unique maximal solutions. Suppose F has a local min at p. As p is an isolated critical point, the local min at p must be strict, i.e. there exists an open neighborhood Ω₀ of p such that F(p) < F(y) for each y ∈ Ω₀ \ {p}. Then the Lyapunov function V : Ω₀ → R, V(y) := F(y) − F(p), is clearly positive definite at p and V̇ : Ω₀ → R,

V̇(y) = ∇V(y) · (−∇F(y)) = −∇F(y) · ∇F(y) = −‖∇F(y)‖₂²,   (5.53)

is clearly negative definite at p. Thus, p is a positively asymptotically stable fixed point by Th. 5.30. If F has a local max at p, then the proof is conducted analogously, using V : Ω₀ → R, V(y) := F(p) − F(y), or, alternatively, by using time reversion (if F has a local max at p, then −F has a local min at p, i.e. p is positively asymptotically stable for y′ = ∇F(y), i.e. p is negatively asymptotically stable for y′ = −∇F(y) by Rem. 5.25(b)).

If p is neither a local min nor max for F, then let Ω₀ := O, and V : Ω₀ → R, V(y) := F(p) − F(y), where O was chosen such that ∇F ≠ 0 on O \ {p}, i.e. V̇ : Ω₀ → R, V̇(y) = ‖∇F(y)‖₂², is positive definite at p. Let U := {y ∈ Ω₀ : F(y) < F(p)}. Then U is open by the continuity of F, and p ∈ ∂U, as p is neither a local min nor max for F. By the continuity of F, F(y) = F(p) for each y ∈ Ω₀ ∩ ∂U, i.e. V = 0 on Ω₀ ∩ ∂U. Thus, Th. 5.32 applies, showing p is not positively stable. Analogously, using U := {y ∈ Ω₀ : F(y) > F(p)} and V(y) := F(y) − F(p) shows p is not negatively stable. ∎

Example 5.35. (a) The function F : Rⁿ → R, F(y) = ‖y‖₂², has an isolated critical point at 0, which is also a min for F. Thus, by Th. 5.34,

y′ = −∇F(y) = (−2y₁, ..., −2yₙ)   (5.54)

has 0 as a fixed point that is positively asymptotically stable.

(b) The function F : R² → R, F(y) = e^{y₁y₂}, has an isolated critical point at 0, which is neither a local min nor local max for F. Thus, by Th. 5.34,

y′ = −∇F(y) = (−y₂ e^{y₁y₂}, −y₁ e^{y₁y₂})   (5.55)

has 0 as a fixed point that is neither positively nor negatively stable.
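Both cases of Example 5.35 can be observed numerically; the sketch below (assuming scipy) integrates the two gradient systems from small initial points:

import numpy as np
from scipy.integrate import solve_ivp

# (5.54): y' = -grad F with F(y) = ||y||_2^2; trajectories decay to 0
sol_a = solve_ivp(lambda x, y: -2.0*y, (0.0, 10.0), [0.3, -0.2], rtol=1e-9)
print(sol_a.y[:, -1])                    # close to (0, 0)

# (5.55): y' = -grad F with F(y) = exp(y1*y2); the origin is not stable
def f_b(x, y):
    g = np.exp(y[0]*y[1])
    return [-y[1]*g, -y[0]*g]

sol_b = solve_ivp(f_b, (0.0, 30.0), [0.01, -0.01], rtol=1e-9)
print(np.linalg.norm(sol_b.y[:, -1]))    # has grown far away from the origin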

5.3 Constant Coefficients

The stability properties of systems of first-order linear ODE y′ = Ay (cf. Sec. 4.6.2) are closely related to the eigenvalues of the matrix A. As it turns out, the stability of the origin is essentially determined by the sign of the real part of the eigenvalues of A (cf. Th. 5.38 below). We start with a preparatory lemma:
Lemma 5.36. Let n ∈ N and W ∈ M(n, K) be invertible. Moreover, let ‖·‖ be some norm on M(n, K). Then

‖·‖_W : M(n, K) → R₀⁺,   ‖A‖_W := ‖W⁻¹AW‖,   (5.56)

also constitutes a norm on M(n, K).

Proof. If A = 0, then ‖A‖_W = ‖W⁻¹ 0 W‖ = ‖0‖ = 0. If ‖A‖_W = 0, then W⁻¹AW = 0, i.e. A = W 0 W⁻¹ = 0, showing ‖·‖_W is positive definite. Next,

∀ λ ∈ K  ∀ A ∈ M(n, K) :   ‖λA‖_W = |λ| ‖W⁻¹AW‖ = |λ| ‖A‖_W,

showing ‖·‖_W is homogeneous of degree 1. Finally,

∀ A, B ∈ M(n, K) :   ‖A + B‖_W = ‖W⁻¹(A + B)W‖ = ‖W⁻¹AW + W⁻¹BW‖ ≤ ‖A‖_W + ‖B‖_W,

showing ‖·‖_W satisfies the triangle inequality. ∎

Remark and Definition 5.37. Let n ∈ N, A ∈ M(n, C), and let λ ∈ C be an eigenvalue of A.

(a) Clearly, one has

{0} ⊆ ker(A − λ Id) ⊆ ker(A − λ Id)² ⊆ ...

and the inclusion can be strict for at most n times. Let

r(λ) := min{k ∈ N₀ : ker(A − λ Id)^k = ker(A − λ Id)^{k+1}}.

Then

∀ k ∈ N :   ker(A − λ Id)^{r(λ)} = ker(A − λ Id)^{r(λ)+k} :

Indeed, otherwise, let k₀ := min{k ∈ N : ker(A − λ Id)^{r(λ)} ⊊ ker(A − λ Id)^{r(λ)+k}}. Then there exists v ∈ Cⁿ such that (A − λ Id)^{r(λ)+k₀} v = 0, but (A − λ Id)^{r(λ)+k₀−1} v ≠ 0. However, that means w := (A − λ Id)^{k₀−1} v ∈ ker(A − λ Id)^{r(λ)+1}, but w ∉ ker(A − λ Id)^{r(λ)}, in contradiction to the definition of r(λ). The space

M(λ) := ker(A − λ Id)^{r(λ)}

is called the generalized eigenspace corresponding to the eigenvalue λ.

(b) Due to A(A − λ Id) = (A − λ Id)A, one has

∀ k ∈ N₀ :   A(ker(A − λ Id)^k) ⊆ ker(A − λ Id)^k,

i.e. all the kernels (in particular, the generalized eigenspace M(λ)) are invariant subspaces for A.

(c) As already mentioned in Rem. 4.51, the algebraic multiplicity of λ, denoted m_a(λ), is its multiplicity as a zero of the characteristic polynomial χ_A(x) = det(A − x Id), and the geometric multiplicity of λ is m_g(λ) := dim ker(A − λ Id). We call the eigenvalue λ semisimple if, and only if, its algebraic and geometric multiplicities are equal. We then have the equivalence of the following statements (i) – (iv):

(i) λ is semisimple.

(ii) M(λ) = ker(A − λ Id).

(iii) A restricted to M(λ) is diagonalizable.

(iv) All the Jordan blocks corresponding to λ are trivial, i.e. they all have size 1 (i.e. there are dim ker(A − λ Id) such blocks).

Indeed, note that m_a(λ) = dim ker(A − λ Id)^{m_a(λ)} (e.g., since, if A is in Jordan normal form, then m_a(λ) provides the size of the λ-block and, for A − λ Id, this block is canonically nilpotent). This shows the equivalence between (i) and (ii). Moreover, m_g(λ) = m_a(λ) means ker(A − λ Id) has a basis of m_a(λ) eigenvectors v₁, ..., v_{m_a(λ)} for the eigenvalue λ. The equivalence of (i),(ii) with (iii) and with (iv) is then given by Th. 4.45 and Th. 4.46, respectively.
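The quantities of Rem. and Def. 5.37 are easy to compute numerically via ranks, since dim ker(A − λ Id)^k = n − rank((A − λ Id)^k); a small sketch, assuming numpy, applied to a hypothetical test matrix in Jordan form (not from the text):

import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 2.0]])   # one 2x2 Jordan block and one 1x1 block for lam = 2
lam, n = 2.0, 3
N = A - lam*np.eye(n)

kernel_dims = [n - np.linalg.matrix_rank(np.linalg.matrix_power(N, k))
               for k in range(4)]
print(kernel_dims)   # [0, 2, 3, 3]: r(lam) = 2, M(lam) = C^3,
                     # m_g = 2 < 3 = m_a, so lam = 2 is not semisimple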
Theorem 5.38. Let n ∈ N and A ∈ M(n, C). Moreover, let ‖·‖ be some norm on M(n, C) and let λ₁, ..., λ_s ∈ C, 1 ≤ s ≤ n, be the distinct eigenvalues of A.

(a) The following statements (i) – (iii) are equivalent:

(i) There exists K > 0 such that ‖e^{Ax}‖ ≤ K holds for each x ≥ 0 (resp. x ≤ 0).

(ii) Re λ_j ≤ 0 (resp. Re λ_j ≥ 0) for every j = 1, ..., s and, if Re λ_j = 0 occurs, then λ_j is a semisimple eigenvalue (i.e. its algebraic and geometric multiplicities are equal).

(iii) The fixed point 0 of y′ = Ay is positively (resp. negatively) stable.

(b) The following statements (i) – (iii) are equivalent:

(i) There exist K, μ > 0 such that ‖e^{Ax}‖ ≤ K e^{−μ|x|} holds for each x ≥ 0 (resp. x ≤ 0).

(ii) Re λ_j < 0 (resp. Re λ_j > 0) for every j = 1, ..., s.

(iii) The fixed point 0 of y′ = Ay is positively (resp. negatively) asymptotically stable.
Proof. Let ‖·‖_max denote the max-norm on C^{n²} ≅ M(n, C), i.e.

‖(m_{kl})‖_max := max{|m_{kl}| : k, l ∈ {1, ..., n}}

(caveat: for n > 1, this is not the operator norm induced by the max-norm on Cⁿ). Moreover, using Th. 4.46, let W ∈ M(n, C) be invertible and such that B := W⁻¹AW is in Jordan normal form. Then, according to Lem. 5.36, ‖M‖^W_max := ‖W⁻¹MW‖_max also defines a norm on M(n, C). According to Th. 4.47(b),

∀ x ∈ R :   ‖e^{Ax}‖^W_max = ‖W⁻¹ e^{Ax} W‖_max = ‖e^{W⁻¹AWx}‖_max = ‖e^{Bx}‖_max.   (5.57)

According to Th. 4.44 and Th. 4.49, the entries φ_{kl}(x) of (φ_{kl}(x)) := e^{Bx} enjoy the following property:

∀ k, l ∈ {1, ..., n}  ∃ j ∈ {1, ..., s}  ∃ C > 0  ∃ m ∈ N₀  ∀ x ∈ R :   |φ_{kl}(x)| = C e^{Re λ_j x} |x|^m.   (5.58)

Moreover,

|φ_{kl}(x)| = C e^{Re λ_j x}|x|^m ∧ Re λ_j < 0  ⇒  lim_{x→∞} |φ_{kl}(x)| = 0,   (5.59a)
|φ_{kl}(x)| = C e^{Re λ_j x}|x|^m ∧ Re λ_j > 0  ⇒  lim_{x→∞} |φ_{kl}(x)| = ∞,   (5.59b)
|φ_{kl}(x)| = C e^{Re λ_j x}|x|^m ∧ Re λ_j = 0 ∧ m = 0  ⇒  |φ_{kl}| ≡ C,   (5.59c)
|φ_{kl}(x)| = C e^{Re λ_j x}|x|^m ∧ Re λ_j = 0 ∧ m > 0  ⇒  lim_{x→∞} |φ_{kl}(x)| = ∞.   (5.59d)

(a): We start with the equivalence between (i) and (ii): Suppose Re λ_j ≤ 0 for every j = 1, ..., s and, if Re λ_j = 0 occurs, then λ_j is a semisimple eigenvalue. Then, using Rem. and Def. 5.37(c) and (5.58), we are either in situation (5.59a) or in situation (5.59c). Thus, there exists K₀ > 0 such that |φ_{kl}(x)| ≤ K₀ for each x ≥ 0 and each k, l = 1, ..., n. Then there exists K₁ > 0 such that

∀ x ≥ 0 :   ‖e^{Ax}‖ ≤ K₁ ‖e^{Ax}‖^W_max = K₁ ‖e^{Bx}‖_max ≤ K₁K₀,   (5.60)

showing (i) holds with K := K₁K₀. Conversely, if there is j ∈ {1, ..., s} such that Re λ_j > 0, then there is φ_{kl} such that (5.59b) occurs; if there is j ∈ {1, ..., s} such that Re λ_j = 0 and λ_j is not semisimple, then, using Rem. and Def. 5.37(c), there is φ_{kl} such that (5.59d) occurs. In both cases,

lim_{x→∞} ‖e^{Ax}‖ = lim_{x→∞} ‖e^{Ax}‖^W_max = lim_{x→∞} ‖e^{Bx}‖_max = ∞,   (5.61)

i.e., the corresponding statement of (i) can not be true. The remaining case is handled via time reversion: ‖e^{Ax}‖ ≤ K holds for each x ≤ 0 if, and only if, ‖e^{−Ax}‖ ≤ K holds for each x ≥ 0, which holds if, and only if, Re(−λ_j) ≤ 0 for every j = 1, ..., s with −λ_j semisimple for Re(−λ_j) = 0, which is equivalent to Re λ_j ≥ 0 for every j = 1, ..., s with λ_j semisimple for Re λ_j = 0.

We proceed to the equivalence between (i) and (iii): Fix some arbitrary norm ‖·‖ on Cⁿ, and let ‖·‖_op denote the induced operator norm on M(n, C). Let C₁, C₂ > 0 be such that ‖M‖_op ≤ C₁‖M‖ and ‖M‖ ≤ C₂‖M‖_op for each M ∈ M(n, C). Suppose there exists K > 0 such that ‖e^{Ax}‖ ≤ K holds for each x ≥ 0. Given ε > 0, choose δ := ε/(C₁K). Then

∀ ξ ∈ B_δ(0)  ∀ x ≥ 0 :   ‖Ỹ(x, ξ) − 0‖ = ‖e^{Ax}ξ‖ ≤ ‖e^{Ax}‖_op ‖ξ‖ < C₁K · ε/(C₁K) = ε,   (5.62)

proving 0 is positively stable. Conversely, assume 0 to be positively stable. Then there exists δ > 0 such that ‖Ỹ(x, ξ)‖ = ‖e^{Ax}ξ‖ < 1 for each ξ ∈ B_δ(0) and each x ≥ 0. Thus,

∀ x ≥ 0 :   ‖e^{Ax}‖_op = sup{‖e^{Ax}ξ‖ : ξ ∈ Cⁿ, ‖ξ‖ = 1}  (by (G.1))
= (1/δ) sup{‖e^{Ax}ξ‖ : ξ ∈ Cⁿ, ‖ξ‖ = δ} ≤ 1/δ,   (5.63)

showing (i) holds with K := C₂/δ. The remaining case is handled via time reversion: ‖e^{Ax}‖ ≤ K holds for each x ≤ 0 if, and only if, ‖e^{−Ax}‖ ≤ K holds for each x ≥ 0, which holds if, and only if, 0 is positively stable for y′ = −Ay, which, by Rem. 5.25(a), holds if, and only if, 0 is negatively stable for y′ = Ay.

(b): As in (a), we start with the equivalence between (i) and (ii): Suppose Re λ_j < 0 for every j = 1, ..., s. We first show, using (5.58),

∀ k, l ∈ {1, ..., n}  ∃ K_{kl}, μ_{kl} > 0  ∀ x ≥ 0 :   |φ_{kl}(x)| ≤ K_{kl} e^{−μ_{kl} x} :   (5.64)

According to (5.58),

∃ C_{kl} > 0  ∀ x ≥ 0 :   |φ_{kl}(x)| = C_{kl} e^{Re λ_j x/2} e^{Re λ_j x/2} x^m.

Since Re λ_j < 0, one has lim_{x→∞} e^{Re λ_j x/2} x^m = 0, i.e. e^{Re λ_j x/2} x^m is uniformly bounded on [0, ∞[ by some M_{kl} > 0. Thus, (5.64) holds with K_{kl} := C_{kl}M_{kl} and μ_{kl} := −Re λ_j/2. In consequence, if K₁ is chosen as in (5.60), then ‖e^{Ax}‖ ≤ Ke^{−μ|x|} ≤ K for each x ≥ 0 holds with K := K₁ max{K_{kl} : k, l = 1, ..., n} and μ := min{μ_{kl} : k, l = 1, ..., n}. Conversely, if there is j ∈ {1, ..., s} such that Re λ_j ≥ 0, then there is φ_{kl} such that (5.59b) or (5.59c) or (5.59d) occurs. In each case,

lim_{x→∞} ‖e^{Ax}‖ = lim_{x→∞} ‖e^{Ax}‖^W_max = lim_{x→∞} ‖e^{Bx}‖_max ∈ ]0, ∞],   (5.65)

i.e., the corresponding statement of (i) can not be true. The remaining case is handled via time reversion: ‖e^{Ax}‖ ≤ Ke^{−μ|x|} holds for each x ≤ 0 if, and only if, ‖e^{−Ax}‖ ≤ Ke^{−μ|x|} holds for each x ≥ 0, which holds if, and only if, Re(−λ_j) < 0 for every j = 1, ..., s, which is equivalent to Re λ_j > 0 for every j = 1, ..., s.

It remains to consider the equivalence between (i) and (iii): Let ‖·‖_op and C₁, C₂ > 0 be as in the proof of the equivalence between (i) and (iii) in (a). Suppose there exist K, μ > 0 such that ‖e^{Ax}‖ ≤ Ke^{−μ|x|} holds for each x ≥ 0. Since ‖e^{Ax}‖ ≤ Ke^{−μ|x|} ≤ K for each x ≥ 0, 0 is positively stable by (a). Moreover,

∀ ξ ∈ Cⁿ  ∀ x ≥ 0 :   ‖Ỹ(x, ξ)‖ = ‖e^{Ax}ξ‖ ≤ ‖e^{Ax}‖_op ‖ξ‖ ≤ C₁K e^{−μ|x|} ‖ξ‖ → 0  for x → ∞,   (5.66)

showing 0 to be positively asymptotically stable. For the converse, we will actually show (iii) implies (ii). If 0 is positively asymptotically stable, then, in particular, it is positively stable, such that (ii) of (a) must hold. It merely remains to exclude the possibility of a semisimple eigenvalue λ with Re λ = 0. If there were a semisimple eigenvalue λ with Re λ = 0, then e^{Bx} had a Jordan block of size 1 with entry e^{λx}, i.e. φ_{kk}(x) = e^{λx} for some k ∈ {1, ..., n}. Let e_k be the corresponding standard unit vector of Cⁿ (all entries 0, except the kth entry, which is 1). Then, for ξ := W e_k,

∀ x ∈ R :   ‖W⁻¹ e^{Ax} ξ‖ = ‖W⁻¹ e^{Ax} W e_k‖ = ‖e^{Bx} e_k‖ = ‖e^{λx} e_k‖ = |e^{λx}| ‖e_k‖ = 1 · ‖e_k‖ > 0,   (5.67)

showing 0 were not positively asymptotically stable (e.g., since y ↦ ‖W⁻¹y‖ defines a norm on Cⁿ). The remaining case is, once again, handled via time reversion: ‖e^{Ax}‖ ≤ Ke^{−μ|x|} holds for each x ≤ 0 if, and only if, ‖e^{−Ax}‖ ≤ Ke^{−μ|x|} holds for each x ≥ 0, which holds if, and only if, 0 is positively asymptotically stable for y′ = −Ay, which, by Rem. 5.25(b), holds if, and only if, 0 is negatively asymptotically stable for y′ = Ay. ∎
Example 5.39. (a) The matrix

A = ( 2  1
      1  2 )

has eigenvalues 1 and 3 and, thus, the fixed point 0 of y′ = Ay is negatively asymptotically stable, but not positively stable.

(b) The matrix

A = ( 0  1
      0  0 )

has eigenvalue 0, which is not semisimple, i.e. the fixed point 0 of y′ = Ay is neither negatively nor positively stable.

(c) The matrix

A = ( i   1     2      2−3i
      0  −i     5      17
      0   0   −1+3i    0
      0   0     0     −5 )

has simple eigenvalues i, −i, −1+3i, −5, i.e. the fixed point 0 of y′ = Ay is positively stable (since all real parts are ≤ 0 and the purely imaginary eigenvalues are simple, hence semisimple), but neither negatively stable (since there are eigenvalues with negative real part) nor positively asymptotically stable (since there are eigenvalues with 0 real part).
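The criterion of Th. 5.38 is straightforward to implement numerically (up to the usual floating-point caveats when deciding Re λ = 0 and when grouping eigenvalues); a sketch, assuming numpy, applied to the matrices of Example 5.39(a),(b) and a hypothetical diagonal test matrix:

import numpy as np

def positively_stable(A, tol=1e-9):
    """Th. 5.38(a)(ii): Re(lam) <= 0, and eigenvalues with Re(lam) = 0 semisimple."""
    eigvals = np.linalg.eigvals(A)
    n = A.shape[0]
    for lam in np.unique(np.round(eigvals, 6)):
        alg = np.sum(np.abs(eigvals - lam) < 1e-6)          # algebraic multiplicity
        geo = n - np.linalg.matrix_rank(A - lam*np.eye(n))  # geometric multiplicity
        if lam.real > tol:
            return False
        if abs(lam.real) <= tol and alg != geo:             # not semisimple
            return False
    return True

print(positively_stable(np.array([[2, 1], [1, 2]])))     # False (eigenvalues 1, 3)
print(positively_stable(np.array([[0, 1], [0, 0]])))     # False (0 not semisimple)
print(positively_stable(np.array([[1j, 0], [0, -1j]])))  # True (Re = 0, semisimple)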

5.4 Linearization

If the right-hand side f of an autonomous ODE is differentiable and p is a fixed point (i.e. f(p) = 0), then one can sometimes use its linearization, i.e. its derivative A := Df(p) (which is an n × n matrix), to infer stability properties of y′ = f(y) at p from those of y′ = Ay at 0 (see Th. 5.44 below). We start with some preparatory results:
Lemma 5.40. Let n ∈ N and consider the bilinear function

β : Rⁿ × Rⁿ → R,   β(y, z) := y · (Bz) = yᵗBz = Σ_{k,l=1}^n y_k b_{kl} z_l,   (5.68)

where B = (b_{kl}) ∈ M(n, R), · denotes the Euclidean scalar product, and elements of Rⁿ are interpreted as column vectors when involved in matrix multiplications.

(a) The function β is differentiable (it is even a polynomial, deg(β) ≤ 2, and, thus, C^∞) and

∀ k ∈ {1, ..., n}  ∀ (y, z) ∈ Rⁿ × Rⁿ :   ∂_{y_k}β(y, z) = Σ_{l=1}^n b_{kl} z_l = (Bz)_k,   (5.69a)

∀ l ∈ {1, ..., n}  ∀ (y, z) ∈ Rⁿ × Rⁿ :   ∂_{z_l}β(y, z) = Σ_{k=1}^n y_k b_{kl} = (yᵗB)_l,   (5.69b)

∀ (y, z) ∈ Rⁿ × Rⁿ :   Dβ(y, z) = ∇β(y, z) : Rⁿ × Rⁿ → R,
∇β(y, z)(u, v) = β(y, v) + β(u, z) = yᵗBv + uᵗBz.   (5.69c)

(b) The function

V : Rⁿ → R,   V(y) := β(y, y) = y · (By) = yᵗBy = Σ_{k,l=1}^n y_k b_{kl} y_l,   (5.70)

is differentiable (it is also even a polynomial, deg(V) ≤ 2, and, thus, C^∞) and

∀ k ∈ {1, ..., n}  ∀ y ∈ Rⁿ :   ∂_{y_k}V(y) = Σ_{l=1}^n y_l(b_{kl} + b_{lk}) = (yᵗ(B + Bᵗ))_k,   (5.71a)

∀ y ∈ Rⁿ :   DV(y) = ∇V(y) : Rⁿ → R,
∇V(y)(u) = β(y, u) + β(u, y) = yᵗBu + uᵗBy = yᵗ(B + Bᵗ)u.   (5.71b)

Proof. (a): (5.69a) and (5.69b) are immediate from (5.68) and, then, imply (5.69c).

(b): (5.71a) is immediate from (5.70) and, then, implies (5.71b). ∎

Lemma 5.41. Let A, B ∈ M(n, R), n ∈ N, and V : Rⁿ → R as in (5.70). Then

∀ y ∈ Rⁿ :   ∇V(y) · (Ay) = yᵗ(BA + AᵗB)y.   (5.72)

Proof. We note

∀ y ∈ Rⁿ :   (yᵗBᵗ)(Ay) = yᵗBᵗAy = (yᵗBᵗAy)ᵗ = yᵗAᵗBy   (5.73)

(a 1 × 1 matrix equals its transpose) and, thus, obtain

∀ y ∈ Rⁿ :   ∇V(y) · (Ay) = yᵗ(B + Bᵗ)(Ay) = yᵗ(BA + AᵗB)y,   (5.74)

using (5.71b) and (5.73), proving (5.72). ∎
Definition 5.42. A matrix B ∈ M(n, R), n ∈ N, is called positive definite if, and only if, the function V of (5.70) is positive definite at p = 0 in the sense of Def. 5.29.
Proposition 5.43. Let A ∈ M(n, R), n ∈ N. Then the following statements (i) – (iii) are equivalent:

(i) There exist positive definite matrices B, C ∈ M(n, R), satisfying

BA + AᵗB = −C.   (5.75)

(ii) Re λ < 0 holds for each eigenvalue λ ∈ C of A.

(iii) For each given positive definite (symmetric) C ∈ M(n, R), there exists a positive definite (symmetric) B ∈ M(n, R), satisfying (5.75).
Proof. (iii) immediately implies (i) (e.g. by applying (iii) with C := Id).

For the proof that (i) implies (ii), let B, C ∈ M(n, R) be positive definite matrices, satisfying (5.75). By Th. 5.38(b), it suffices to show 0 is a positively asymptotically stable fixed point for y′ = Ay. To this end, we apply Th. 5.30, using V : Rⁿ → R of (5.70) as the Lyapunov function. Then, by Def. 5.42, B being positive definite means V being positive definite at 0. Since

V̇ : Rⁿ → R,   V̇(y) = ∇V(y) · (Ay) = yᵗ(BA + AᵗB)y = −yᵗCy   (5.76)

(using (5.72) and (5.75)) and C is positive definite, V̇ is negative definite at 0, i.e. Th. 5.30 yields 0 to be a positively asymptotically stable fixed point for y′ = Ay as desired.

It remains to show that (ii) implies (iii). If all eigenvalues of A have negative real part, then, as A and Aᵗ have the same eigenvalues, all eigenvalues of Aᵗ have negative real part as well. Thus, according to Th. 5.38(b),

∃ K, μ > 0  ∀ x ≥ 0 :   ‖e^{Ax}‖_max ≤ Ke^{−μx}  ∧  ‖e^{Aᵗx}‖_max ≤ Ke^{−μx},   (5.77)

where we have chosen the norm in (5.77) to mean the max-norm on M(n, R) (note that e^{Ax} is real if A is real, e.g. due to the series representation (4.73)). Given C ∈ M(n, R), define

B := ∫₀^∞ e^{Aᵗx} C e^{Ax} dx.   (5.78)

To verify that B ∈ M(n, R) is well-defined, note that each entry of the integrand matrix of (5.78) constitutes an integrable function on [0, ∞[: Indeed,

∃ M > 0  ∀ x ≥ 0 :   ‖e^{Aᵗx} C e^{Ax}‖_max ≤ M ‖e^{Aᵗx}‖_max ‖C‖_max ‖e^{Ax}‖_max ≤ M ‖C‖_max K² e^{−2μx},   (5.79)

which is integrable on [0, ∞[. Next, we compute

−C = lim_{x→∞} (e^{Aᵗx} C e^{Ax}) − C = lim_{x→∞} ∫₀ˣ ∂_s(e^{Aᵗs} C e^{As}) ds
= ∫₀^∞ ∂_s(e^{Aᵗs} C e^{As}) ds
= ∫₀^∞ (Aᵗ e^{Aᵗs} C e^{As} + e^{Aᵗs} C e^{As} A) ds   (by (I.3))
= Aᵗ (∫₀^∞ e^{Aᵗs} C e^{As} ds) + (∫₀^∞ e^{Aᵗs} C e^{As} ds) A   (by (I.5),(I.6))
= AᵗB + BA   (by (5.78)),   (5.80)

showing (5.75) is satisfied. If C is positive definite and 0 ≠ y ∈ Rⁿ, then yᵗCy > 0, implying

yᵗBy = yᵗ (∫₀^∞ e^{Aᵗx} C e^{Ax} dx) y = ∫₀^∞ yᵗ e^{Aᵗx} C e^{Ax} y dx
= ∫₀^∞ (e^{Ax}y)ᵗ C (e^{Ax}y) dx > 0   (by Prop. 4.40(c)),   (5.81)

showing B is positive definite as well. Finally, if C is symmetric, then

Bᵗ = ∫₀^∞ (e^{Aᵗx} C e^{Ax})ᵗ dx = ∫₀^∞ e^{Aᵗx} Cᵗ e^{Ax} dx = B   (by Prop. 4.40(c)),   (5.82)

showing B is symmetric as well. ∎
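In practice, (5.75) can be solved directly instead of evaluating the integral (5.78): scipy's solve_continuous_lyapunov solves aX + Xaᵗ = q, so with a := Aᵗ and q := −C it yields B. A sketch, assuming scipy, with a hypothetical test matrix:

import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[-1.0, 2.0],
              [ 0.0, -3.0]])       # hypothetical test matrix; eigenvalues -1, -3
C = np.eye(2)                      # choose C = Id, as in the proof of Th. 5.44

# Solve A^t B + B A = -C, i.e. (5.75) in the form of Prop. 5.43(iii)
B = solve_continuous_lyapunov(A.T, -C)

print(np.allclose(B @ A + A.T @ B, -C))             # True: (5.75) holds
print(np.all(np.linalg.eigvalsh((B + B.T)/2) > 0))  # True: B is positive definite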

Theorem 5.44. Let Ω ⊆ Rⁿ be open, n ∈ N, and f : Ω → Rⁿ continuously differentiable. Let p ∈ Ω be a fixed point (i.e. f(p) = 0) and A := Df(p) ∈ M(n, R) the derivative of f at p. If all eigenvalues of A have negative (resp. positive) real parts, then p is a positively (resp. negatively) asymptotically stable fixed point for y′ = f(y).

Proof. Let all eigenvalues of A have negative real parts. We first consider the special case p = 0, i.e. A = Df(0). By the equivalence between (ii) and (iii) of Prop. 5.43, we can choose C := Id in (iii) to obtain the existence of a positive definite symmetric matrix B ∈ M(n, R), satisfying

BA + AᵗB = −Id.   (5.83)

The idea is now to apply the Lyapunov Th. 5.30 with V of (5.70), i.e.

V : Ω → R,   V(y) := y · (By) = yᵗBy = Σ_{k,l=1}^n y_k b_{kl} y_l.   (5.84)

We already know V to be continuously differentiable and positive definite. We will conclude the proof of 0 being positively asymptotically stable by showing there exists α > 0, such that

V̇ : B_α(0) → R,   V̇(y) = (∇V)(y) · f(y),   (5.85)

is negative definite at 0, where we take B_α(0) with respect to the 2-norm ‖·‖₂ on Rⁿ.

The differentiability of f at 0 implies that (cf. [Phi13b, Lem. 2.21])

r : Ω → Rⁿ,   r(y) := f(y) − Ay,   (5.86)

satisfies

lim_{y→0} ‖r(y)‖₂ / ‖y‖₂ = 0.   (5.87)

Thus, we compute, for each y ∈ Ω,

V̇(y) = (∇V)(y) · f(y) = (∇V)(y) · (Ay) + yᵗ(B + Bᵗ) r(y)   (by (5.71b),(5.86))
= yᵗ(BA + AᵗB)y + 2yᵗB r(y)   (by (5.72), B = Bᵗ)
= −‖y‖₂² + 2 y · (B r(y))   (by (5.83)).   (5.88)

We can estimate the second summand via the Cauchy-Schwarz inequality to obtain

|y · (B r(y))| ≤ ‖y‖₂ ‖B r(y)‖₂ ≤ ‖y‖₂ ‖B‖ ‖r(y)‖₂,   (5.89)

and, thus, using (5.87),

lim_{y→0} |y · (B r(y))| / ‖y‖₂² = 0.   (5.90)

Now choose α > 0 such that B_α(0) ⊆ Ω and such that

∀ y ∈ B_α(0) \ {0} :   |2 y · (B r(y))| / ‖y‖₂² < 1/2.   (5.91)

Then, for each 0 ≠ y ∈ B_α(0),

V̇(y) = −‖y‖₂² + 2 y · (B r(y)) < −‖y‖₂² + ‖y‖₂²/2 = −‖y‖₂²/2 < 0   (by (5.91)),   (5.92)

showing V̇ to be negative definite at 0, and 0 to be positively asymptotically stable. If p ≠ 0, then consider the ODE y′ = g(y) := f(y + p), g : (Ω − p) → Rⁿ. Then 0 is a fixed point for y′ = g(y), Dg(0) = Df(p) = A, i.e. 0 is positively asymptotically stable for y′ = g(y). But, since φ is a solution to y′ = g(y) if, and only if, ψ = φ + p is a solution to y′ = f(y), p must be positively asymptotically stable for y′ = f(y). The remaining case that all eigenvalues of A have positive real parts is now treated via time reversion: If all eigenvalues of A have positive real parts, then all eigenvalues of −A = D(−f)(p) have negative real parts, i.e. p is positively asymptotically stable for y′ = −f(y), i.e., by Rem. 5.25(b), p is negatively asymptotically stable for y′ = f(y). ∎

Caveat 5.45. The following example shows that the converse of Th. 5.44 does not hold: A fixed point p can be positively (resp. negatively) asymptotically stable without A := Df(p) having only eigenvalues with negative (resp. positive) real parts. The same example shows that, in general, one can not infer anything regarding the stability of the fixed point p if A := Df(p) is merely stable, but not asymptotically stable: Consider

f : R² → R²,   f(y₁, y₂) := (−y₂ + α y₁³, y₁ + α y₂³),   α ∈ R.   (5.93)

Then, independently of α, (0, 0) is a fixed point and

Df(0, 0) = ( 0  −1
             1   0 )   (5.94)

with complex eigenvalues i and −i. Thus, the linearized system is positively and negatively stable, but not asymptotically stable, still independently of α. However, we claim that (0, 0) is a positively asymptotically stable fixed point for y′ = f(y) if α < 0 and a negatively asymptotically stable fixed point for y′ = f(y) if α > 0. Indeed, this can be seen by using the Lyapunov function V : R² → R, V(y₁, y₂) = y₁² + y₂², which has ∇V(y₁, y₂) = (2y₁, 2y₂) and

V̇(y₁, y₂) = ∇V(y₁, y₂) · f(y₁, y₂) = 2α(y₁⁴ + y₂⁴).   (5.95)

Thus, V is positive definite at (0, 0) and V̇ is negative definite at (0, 0) for α < 0 and positive definite at (0, 0) for α > 0.
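One can watch the slow nonlinear decay of Caveat 5.45 numerically; for α = −1, trajectories spiral into the origin even though the linearization is only a center. A sketch, assuming scipy:

import numpy as np
from scipy.integrate import solve_ivp

alpha = -1.0

def f(x, y):
    # (5.93): linear rotation plus the cubic perturbation alpha*(y1^3, y2^3)
    return [-y[1] + alpha*y[0]**3, y[0] + alpha*y[1]**3]

sol = solve_ivp(f, (0.0, 200.0), [0.5, 0.0], rtol=1e-9, atol=1e-12)
r = np.hypot(sol.y[0], sol.y[1])
print(r[0], r[-1])   # the radius decays (slowly, as V' = 2*alpha*(y1^4 + y2^4))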
Example 5.46. Consider (x, y, z)′ = f(x, y, z) with

f : R³ → R³,   f(x, y, z) = (−x cos y, −y e^z, x² − 2z).   (5.96)

The derivative is

Df : R³ → M(3, R),   Df(x, y, z) = ( −cos y   x sin y    0
                                      0       −e^z      −y e^z
                                      2x       0        −2    ).   (5.97)

Clearly, (0, 0, 0) is a fixed point and Df(0, 0, 0) has eigenvalues −1 and −2. Thus, (0, 0, 0) is a positively asymptotically stable fixed point for (x, y, z)′ = f(x, y, z) by Th. 5.44.
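The linearization step is easy to automate with a finite-difference Jacobian; a minimal sketch, assuming numpy, reproducing the eigenvalues −1, −1, −2 of Df(0, 0, 0):

import numpy as np

def f(u):
    x, y, z = u
    return np.array([-x*np.cos(y), -y*np.exp(z), x**2 - 2.0*z])   # (5.96)

def jacobian(f, p, h=1e-6):
    # central finite differences, column by column
    n = len(p)
    J = np.empty((n, n))
    for j in range(n):
        e = np.zeros(n); e[j] = h
        J[:, j] = (f(p + e) - f(p - e)) / (2.0*h)
    return J

A = jacobian(f, np.zeros(3))
print(np.linalg.eigvals(A))   # approx. [-1, -1, -2]: all real parts negative,
                              # so (0,0,0) is pos. asympt. stable by Th. 5.44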

5.5 Limit Sets

Limit sets are important when studying the asymptotic behavior of solutions, i.e. φ(x) for x → ∞ and for x → −∞. If a solution has a limit, then its corresponding limit set consists of precisely one point. In general, the limit set of a solution is defined to consist of all points that occur as limits of sequences taken along the solution's orbit (of course, the limit sets can also be empty):
Definition 5.47. Let Ω ⊆ Kⁿ, n ∈ N, and f : Ω → Kⁿ be such that y′ = f(y) admits unique maximal solutions. For each ξ ∈ Ω, we define the omega limit set and the alpha limit set of ξ as follows:

ω(ξ) := ω_f(ξ) := {y ∈ Ω : ∃ (x_k)_{k∈N} in R :  lim_{k→∞} x_k = ∞ ∧ lim_{k→∞} Ỹ(x_k, ξ) = y},   (5.98a)

α(ξ) := α_f(ξ) := {y ∈ Ω : ∃ (x_k)_{k∈N} in R :  lim_{k→∞} x_k = −∞ ∧ lim_{k→∞} Ỹ(x_k, ξ) = y}.   (5.98b)

Remark 5.48. In the situation of Def. 5.47, consider the time-reversed version of y′ = f(y), i.e. y′ = −f(y), with its general solution Ỹ⁻(x, ξ) = Ỹ(−x, ξ), cf. (5.28). Clearly, for each ξ ∈ Ω,

ω_{−f}(ξ) = α_f(ξ),   α_{−f}(ξ) = ω_f(ξ).   (5.99)
Proposition 5.49. In the situation of Def. 5.47, the following hold:

(a) If Ỹ(·, ξ) is defined on all of R₀⁺, then

ω(ξ) = ⋂_{m=0}^∞ cl{Ỹ(x, ξ) : x ≥ m};   (5.100a)

and if Ỹ(·, ξ) is defined on all of R₀⁻, then

α(ξ) = ⋂_{m=0}^∞ cl{Ỹ(x, ξ) : x ≤ −m}   (5.100b)

(cl denoting the closure).

(b) All points in the same orbit have the same omega and alpha limit sets, i.e.

∀ x ∈ I_{0,ξ} :   ω(ξ) = ω(Ỹ(x, ξ))  ∧  α(ξ) = α(Ỹ(x, ξ)).

Proof. Due to Rem. 5.48, it suffices to prove the statements involving the omega limit sets.

(a): Let y ∈ ω(ξ) and m ∈ N₀. Then there is a sequence (x_k)_{k∈N} in R such that lim_{k→∞} x_k = ∞ and lim_{k→∞} Ỹ(x_k, ξ) = y. Since, for sufficiently large k₀ ∈ N, the sequence (Ỹ(x_k, ξ))_{k≥k₀} is in {Ỹ(x, ξ) : x ≥ m}, the inclusion ⊆ of (5.100a) is proved. Conversely, assume y ∈ ⋂_{m=0}^∞ cl{Ỹ(x, ξ) : x ≥ m}. Then,

∀ k ∈ N  ∃ x_k ∈ [k, ∞[ :   ‖Ỹ(x_k, ξ) − y‖ < 1/k,

providing a sequence (x_k)_{k∈N} in R such that lim_{k→∞} x_k = ∞ and lim_{k→∞} Ỹ(x_k, ξ) = y, proving y ∈ ω(ξ) and the inclusion ⊇ of (5.100a).

(b): Let y ∈ ω(ξ) and x ∈ I_{0,ξ}. Choose a sequence (x_k)_{k∈N} in R such that lim_{k→∞} x_k = ∞ and lim_{k→∞} Ỹ(x_k, ξ) = y. Then lim_{k→∞} (x_k − x) = ∞ and

lim_{k→∞} Ỹ(x_k − x, Ỹ(x, ξ)) = lim_{k→∞} Ỹ(x_k, ξ) = y   (by Lem. 5.4(b)),   (5.101)

proving ω(ξ) ⊆ ω(Ỹ(x, ξ)). The reversed inclusion then also follows, since

Ỹ(−x, Ỹ(x, ξ)) = Ỹ(0, ξ) = ξ   (by Lem. 5.4(b)),   (5.102)

concluding the proof. ∎

Example 5.50. Let Ω ⊆ Kⁿ, n ∈ N, and f : Ω → Kⁿ be such that y′ = f(y) admits unique maximal solutions.

(a) If ξ ∈ Ω is a fixed point, then ω(ξ) = α(ξ) = {ξ}. More generally, if ξ ∈ Ω is such that lim_{x→∞} Ỹ(x, ξ) = y ∈ Kⁿ, then ω(ξ) = {y}; if ξ is such that lim_{x→−∞} Ỹ(x, ξ) = y ∈ Kⁿ, then α(ξ) = {y}.

(b) If A ∈ M(n, C) is such that the conditions of Th. 5.38(b) hold (all eigenvalues have negative real parts, 0 is positively asymptotically stable), then ω(ξ) = {0} for each ξ ∈ Cⁿ and α(ξ) = ∅ for each ξ ∈ Cⁿ \ {0}.

(c) If ξ ∈ Ω is such that the orbit O(φ) of φ := Ỹ(·, ξ) is periodic, then ω(ξ) = α(ξ) = O(φ). For example, for (5.8),

∀ ξ ∈ R² :   ω(ξ) = α(ξ) = S_{‖ξ‖₂}(0) = {y ∈ R² : ‖y‖₂ = ‖ξ‖₂}.   (5.103)

Example 5.51. As an example with nonperiodic orbits that have limit sets consisting of more than one point, consider

y₁′ = y₂ + y₁(1 − y₁² − y₂²),   (5.104a)
y₂′ = −y₁ + y₂(1 − y₁² − y₂²).   (5.104b)

We will show that, for each point except the origin (which is clearly a fixed point), the omega limit set is the unit circle, i.e.

∀ ξ ∈ R² \ {0} :   ω(ξ) = S₁(0) = {y ∈ R² : y₁² + y₂² = 1}.   (5.105)

We first verify that the general solution is

Ỹ : D_{f,0} → R²,   Ỹ(x, ξ₁, ξ₂) = (ξ₁ cos x + ξ₂ sin x, ξ₂ cos x − ξ₁ sin x) / √(ξ₁² + ξ₂² + (1 − ξ₁² − ξ₂²) e^{−2x}),   (5.106)

where, letting

∀ ξ ∈ {y ∈ R² : ‖y‖₂ > 1} :   x_ξ := (1/2) ln((‖ξ‖₂² − 1)/‖ξ‖₂²),   (5.107)

D_{f,0} = (R × {ξ ∈ R² : ‖ξ‖₂ ≤ 1}) ∪ {(x, ξ) ∈ R × R² : ‖ξ‖₂ > 1, x ∈ ]x_ξ, ∞[} :   (5.108)

For each (ξ₁, ξ₂) ∈ R², Ỹ(·, ξ₁, ξ₂) satisfies the initial condition:

Ỹ(0, ξ₁, ξ₂) = (ξ₁, ξ₂) / √(ξ₁² + ξ₂² + (1 − ξ₁² − ξ₂²)) = (ξ₁, ξ₂).   (5.109)

The following computations prepare the check that each Ỹ(·, ξ₁, ξ₂) satisfies (5.104): The 2-norm squared of the numerator in (5.106) is

‖(ξ₁ cos x + ξ₂ sin x, ξ₂ cos x − ξ₁ sin x)‖₂²
= ξ₁² cos²x + 2ξ₁ξ₂ cos x sin x + ξ₂² sin²x + ξ₂² cos²x − 2ξ₁ξ₂ cos x sin x + ξ₁² sin²x
= ξ₁² + ξ₂² = ‖ξ‖₂².   (5.110)

Thus,

‖Ỹ(x, ξ₁, ξ₂)‖₂ = ‖ξ‖₂ / √(‖ξ‖₂² + (1 − ‖ξ‖₂²) e^{−2x})   (5.111)

and

1 − Ỹ₁²(x, ξ₁, ξ₂) − Ỹ₂²(x, ξ₁, ξ₂) = 1 − ‖Ỹ(x, ξ₁, ξ₂)‖₂²
= (‖ξ‖₂² + (1 − ‖ξ‖₂²) e^{−2x} − ‖ξ‖₂²) / (‖ξ‖₂² + (1 − ‖ξ‖₂²) e^{−2x})
= (1 − ‖ξ‖₂²) e^{−2x} / (‖ξ‖₂² + (1 − ‖ξ‖₂²) e^{−2x}).   (5.112)

In consequence,

∂ₓỸ₁(x, ξ₁, ξ₂)
= [(−ξ₁ sin x + ξ₂ cos x)(‖ξ‖₂² + (1 − ‖ξ‖₂²) e^{−2x}) + (ξ₁ cos x + ξ₂ sin x)(1 − ‖ξ‖₂²) e^{−2x}] / (‖ξ‖₂² + (1 − ‖ξ‖₂²) e^{−2x})^{3/2}
= Ỹ₂(x, ξ₁, ξ₂) + Ỹ₁(x, ξ₁, ξ₂)(1 − Ỹ₁²(x, ξ₁, ξ₂) − Ỹ₂²(x, ξ₁, ξ₂)),   (5.113)

verifying (5.104a). Similarly,

∂ₓỸ₂(x, ξ₁, ξ₂)
= [(−ξ₂ sin x − ξ₁ cos x)(‖ξ‖₂² + (1 − ‖ξ‖₂²) e^{−2x}) + (ξ₂ cos x − ξ₁ sin x)(1 − ‖ξ‖₂²) e^{−2x}] / (‖ξ‖₂² + (1 − ‖ξ‖₂²) e^{−2x})^{3/2}
= −Ỹ₁(x, ξ₁, ξ₂) + Ỹ₂(x, ξ₁, ξ₂)(1 − Ỹ₁²(x, ξ₁, ξ₂) − Ỹ₂²(x, ξ₁, ξ₂)),   (5.114)

verifying (5.104b).

For ‖ξ‖₂ ≤ 1, Ỹ(·, ξ₁, ξ₂) is maximal, as it is defined on R (the denominator in (5.106) has no zero in this case). For ‖ξ‖₂ > 1, the denominator clearly has a zero at x_ξ < 0, where x_ξ is defined as in (5.107). For x > x_ξ, the expression under the square root in (5.106) is positive. Since lim_{x↓x_ξ} ‖Ỹ(x, ξ₁, ξ₂)‖₂ = ∞ for ‖ξ‖₂ > 1, Ỹ(·, ξ₁, ξ₂) is maximal in this case as well, completing the verification of Ỹ, defined as in (5.106) – (5.108), being the general solution of (5.104).

It remains to prove (5.105). From (5.111), we obtain

∀ ξ ∈ R² \ {0} :   lim_{x→∞} ‖Ỹ(x, ξ₁, ξ₂)‖₂² = 1,   (5.115)

which implies

∀ ξ ∈ R² \ {0} :   ω(ξ) ⊆ S₁(0).   (5.116)

Conversely, consider ξ = (ξ₁, ξ₂) ∈ R² \ {0} and y = (y₁, y₂) ∈ S₁(0). We will show y ∈ ω(ξ): Since ‖y‖₂ = 1,

∃ θ_y ∈ [0, 2π[ :   y = (sin θ_y, cos θ_y).   (5.117)

Analogously,

∃ θ_ξ ∈ [0, 2π[ :   ξ = ‖ξ‖₂ (sin θ_ξ, cos θ_ξ)   (5.118)

(the reader might note that, in (5.117) and (5.118), we have written y and ξ using their polar coordinates, cf. [Phi13b, Ex. 4.19]). Then, according to (5.106), we obtain, for each x ≥ 0,

Ỹ(x, ξ₁, ξ₂) = ‖ξ‖₂ (sin θ_ξ cos x + cos θ_ξ sin x, cos θ_ξ cos x − sin θ_ξ sin x) / √(‖ξ‖₂² + (1 − ‖ξ‖₂²) e^{−2x})
= ‖ξ‖₂ (sin(x + θ_ξ), cos(x + θ_ξ)) / √(‖ξ‖₂² + (1 − ‖ξ‖₂²) e^{−2x}).   (5.119)

Define

∀ k ∈ N :   x_k := θ_y − θ_ξ + 2πk ∈ R⁺.   (5.120)

Then lim_{k→∞} x_k = ∞ and

lim_{k→∞} Ỹ(x_k, ξ₁, ξ₂) = lim_{k→∞} ‖ξ‖₂ (sin(θ_y + 2πk), cos(θ_y + 2πk)) / √(‖ξ‖₂² + (1 − ‖ξ‖₂²) e^{−2x_k})   (by (5.119))
= lim_{k→∞} ‖ξ‖₂ (sin θ_y, cos θ_y) / √(‖ξ‖₂² + (1 − ‖ξ‖₂²) e^{−2x_k}) = y,   (5.121)

showing y ∈ ω(ξ) and S₁(0) ⊆ ω(ξ).
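The convergence of every nonzero orbit of (5.104) to the unit circle is also easy to see numerically; a sketch, assuming scipy, starting once inside and once outside the circle:

import numpy as np
from scipy.integrate import solve_ivp

def f(x, y):
    s = 1.0 - y[0]**2 - y[1]**2
    return [y[1] + y[0]*s, -y[0] + y[1]*s]   # (5.104)

for xi in ([0.1, 0.0], [2.0, 1.0]):
    sol = solve_ivp(f, (0.0, 15.0), xi, rtol=1e-9)
    print(np.hypot(*sol.y[:, -1]))           # both close to 1: omega(xi) = S_1(0)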


Proposition 5.52. In the situation of Def. 5.47, if f is locally Lipschitz, then orbits that intersect an omega or alpha limit set must entirely remain inside that same omega or alpha limit set, i.e., for each ξ ∈ Ω,

(∀ y ∈ ω(ξ)  ∀ x ∈ I_{0,y} :  Ỹ(x, y) ∈ ω(ξ))  ∧  (∀ y ∈ α(ξ)  ∀ x ∈ I_{0,y} :  Ỹ(x, y) ∈ α(ξ)).   (5.122)

Proof. Due to Rem. 5.48, it suffices to prove the statement involving the omega limit set. Let y ∈ ω(ξ) and x ∈ I_{0,y}. Choose a sequence (x_k)_{k∈N} in R such that lim_{k→∞} x_k = ∞ and lim_{k→∞} Ỹ(x_k, ξ) = y. Then lim_{k→∞} (x_k + x) = ∞ and,

lim_{k→∞} Ỹ(x_k + x, ξ) = lim_{k→∞} Ỹ(x, Ỹ(x_k, ξ))   (by Lem. 5.4(b))
=^{(∗)} Ỹ(x, y),   (5.123)

proving Ỹ(x, y) ∈ ω(ξ). At (∗), we have used that, due to f being locally Lipschitz by hypothesis, Ỹ is continuous by Th. 3.35. ∎

Proposition 5.53. In the situation of Def. 5.47, let ξ ∈ Ω be such that there exists a compact set K ⊆ Ω, satisfying

{Ỹ(x, ξ) : x ≥ 0} ⊆ K   (resp. {Ỹ(x, ξ) : x ≤ 0} ⊆ K).   (5.124)

Then the following hold:

(a) ω(ξ) ≠ ∅ (resp. α(ξ) ≠ ∅).

(b) ω(ξ) (resp. α(ξ)) is compact.

(c) ω(ξ) (resp. α(ξ)) is a connected set, i.e. if O₁, O₂ are disjoint open subsets of Kⁿ such that ω(ξ) ⊆ O₁ ∪ O₂ (resp. α(ξ) ⊆ O₁ ∪ O₂), then ω(ξ) ∩ O₁ = ∅ or ω(ξ) ∩ O₂ = ∅ (resp. α(ξ) ∩ O₁ = ∅ or α(ξ) ∩ O₂ = ∅).

Proof. Due to Rem. 5.48, it suffices to prove the statements involving the omega limit sets.

(a): Since, by hypothesis, (Ỹ(k, ξ))_{k∈N} is a sequence in the compact set K, it must have a subsequence, converging to some limit y ∈ K. But then y ∈ ω(ξ), i.e. ω(ξ) ≠ ∅.

(b): According to (5.100a) and (5.124), ω(ξ) is a closed subset of the compact set K, implying ω(ξ) to be compact as well.

(c): Seeking a contradiction, we suppose the assertion is false, i.e. there are disjoint open subsets O₁, O₂ of Kⁿ such that ω(ξ) ⊆ O₁ ∪ O₂, ω₁ := ω(ξ) ∩ O₁ ≠ ∅ and ω₂ := ω(ξ) ∩ O₂ ≠ ∅. Then ω₁ and ω₂ are disjoint since O₁, O₂ are disjoint. Moreover, ω₁ and ω₂ are both subsets of the compact set ω(ξ). Due to ω₁ = ω(ξ) ∩ (Kⁿ \ O₂) and ω₂ = ω(ξ) ∩ (Kⁿ \ O₁), ω₁ and ω₂ are also closed, hence, compact. Then, according to Prop. C.10, dist(ω₁, ω₂) > 0. If y₁ ∈ ω₁ and y₂ ∈ ω₂, then there are numbers 0 < s₁ < t₁ < s₂ < t₂ < ... such that lim_{k→∞} s_k = lim_{k→∞} t_k = ∞ and

∀ k ∈ N :   Ỹ(s_k, ξ) ∈ O₁  ∧  Ỹ(t_k, ξ) ∈ O₂.   (5.125)

Define

∀ k ∈ N :   τ_k := sup{x ≥ s_k : Ỹ(t, ξ) ∈ O₁ for each t ∈ [s_k, x]}.   (5.126)

Then s_k < τ_k < t_k and the continuity of Ỹ(·, ξ) yields η_k := Ỹ(τ_k, ξ) ∈ ∂O₁. Thus (η_k)_{k∈N} is a sequence in the compact set K ∩ ∂O₁ and, therefore, must have a convergent subsequence, converging to some z ∈ K ∩ ∂O₁. But then z ∈ ω(ξ), but not in O₁ ∪ O₂, in contradiction to ω(ξ) ⊆ O₁ ∪ O₂. ∎

Theorem 5.54 (LaSalle). Let Ω ⊆ Rⁿ, n ∈ N, and f : Ω → Rⁿ be such that y′ = f(y) admits unique maximal solutions. Moreover, let Ω₀ be an open subset of Ω, assume V : Ω₀ → R is continuously differentiable, K := {y ∈ Ω₀ : V(y) ≤ r} is compact for some r ∈ R, and V̇(y) ≤ 0 (resp. V̇(y) ≥ 0) for each y ∈ K, where V̇ is defined as in (5.33). If ξ ∈ Ω₀ is such that V(ξ) < r, then the following hold:

(a) Ỹ(·, ξ) is defined on all of R₀⁺ (resp. on all of R₀⁻).

(b) One has ω(ξ) ⊆ K (resp. α(ξ) ⊆ K) and V is constant on ω(ξ) (resp. on α(ξ)).

(c) If f is locally Lipschitz, then, letting

M := {y ∈ K : V̇(Ỹ(x, y)) = 0 for each x ≥ 0 (resp. for each x ≤ 0)},

one has ω(ξ) ⊆ M (resp. α(ξ) ⊆ M). In particular, V̇(y) = 0 for each y ∈ ω(ξ) (resp. for each y ∈ α(ξ)).

Proof. As usual, it suffices to prove the assertions for V̇(y) ≤ 0, as the assertions for V̇(y) ≥ 0 then follow via time reversion.

(a): We claim

∀ x ∈ I_{0,ξ} ∩ R₀⁺ :   V(Ỹ(x, ξ)) < r :   (5.127)

Indeed, if (5.127) does not hold, then let

0 < s := sup{x ≥ 0 : V(Ỹ(t, ξ)) < r for each t ∈ [0, x]} ∈ I_{0,ξ},   (5.128)

and

r = V(Ỹ(s, ξ)) = V(ξ) + ∫₀ˢ V̇(Ỹ(t, ξ)) dt ≤ V(ξ) < r,   (5.129)

which is impossible. Thus, (5.127) must hold. However, (5.127) implies R₀⁺ ⊆ I_{0,ξ}, since Ỹ(·, ξ) is a maximal solution and K = {y ∈ Ω₀ : V(y) ≤ r} is compact.

(b): Let φ := Ỹ(·, ξ). During the proof of (a) above, we have shown φ(x) ∈ K for each x ≥ 0. Since, then, (V ∘ φ)′(x) = V̇(φ(x)) ≤ 0 for each x ≥ 0, V ∘ φ is nonincreasing for x ≥ 0. Since V is also bounded on K,

∃ c ∈ R :   c = lim_{x→∞} V(φ(x)).   (5.130)

If y ∈ ω(ξ), then there exists a sequence (x_k)_{k∈N} in R such that lim_{k→∞} x_k = ∞ and lim_{k→∞} φ(x_k) = y. Thus, y ∈ K (since K is closed), and

V(y) = lim_{k→∞} V(φ(x_k)) = c   (by (5.130)),   (5.131)

proving (b).

(c): Let y ∈ ω(ξ) and ψ := Ỹ(·, y). Since f is assumed to be locally Lipschitz, Prop. 5.52 applies and we obtain ψ(x) = Ỹ(x, y) ∈ ω(ξ) for each x ∈ R₀⁺. Using (b), we know V to be constant on ω(ξ), i.e. V ∘ ψ must be constant on R₀⁺ as well, implying

∀ x ∈ R₀⁺ :   V̇(ψ(x)) = (V ∘ ψ)′(x) = 0   (5.132)

as claimed. ∎

Example 5.55. Let a < 0 < b and let h : ]a, b[ → R be continuously differentiable and such that

h(x) < 0 for x < 0,   h(0) = 0,   h(x) > 0 for x > 0.   (5.133)

Consider the autonomous ODE

y₁′ = y₂,   (5.134a)
y₂′ = −y₁² y₂ − h(y₁).   (5.134b)

The right-hand side is defined on Ω := ]a, b[ × R and is clearly C¹, i.e. the ODE admits unique maximal solutions. Due to (5.133), F = {(0, 0)}, i.e. the origin is the only fixed point of (5.134). We will use Th. 5.54(c) to show (0, 0) is positively asymptotically stable: We introduce

H : ]a, b[ → R,   H(x) := ∫₀ˣ h(t) dt,   (5.135)

and the Lyapunov function

V : Ω → R,   V(y₁, y₂) := H(y₁) + y₂²/2.   (5.136)

Since H is positive definite at 0 (H is actually strictly decreasing on ]a, 0] and strictly increasing on [0, b[), V is positive definite at (0, 0). We also obtain

V̇ : Ω → R,   V̇(y₁, y₂) = (h(y₁), y₂) · (y₂, −y₁² y₂ − h(y₁)) = −y₁² y₂² ≤ 0.   (5.137)

Thus, from the Lyapunov Th. 5.30, we already know (0, 0) to be positively stable. However, V̇ is not negative definite at (0, 0), i.e. we can not immediately conclude that (0, 0) is positively asymptotically stable. Instead, as promised, we apply Th. 5.54(c): To this end, using that H is continuous and positive definite at 0, we choose r > 0 and c, d ∈ R, satisfying

a < c < 0 < d < b  and  H(c) = H(d) = r,   (5.138)

and define

O := {(y₁, y₂) ∈ Ω : V(y₁, y₂) < r},   (5.139)
K := {(y₁, y₂) ∈ Ω : V(y₁, y₂) ≤ r}.   (5.140)

Then O is open since V is continuous, and it suffices to show

∀ ξ = (ξ₁, ξ₂) ∈ O :   lim_{x→∞} Ỹ(x, ξ₁, ξ₂) = (0, 0).   (5.141)

Moreover, the continuity of V implies K to be closed. Since K ⊆ [c, d] × [−√(2r), √(2r)], it is also bounded, i.e. compact. Thus, Th. 5.54 applies to each ξ ∈ O. So let ξ ∈ O. We will show that M = {(0, 0)}, where M is the set of Th. 5.54(c) (then ω(ξ) = {(0, 0)} by Th. 5.54(c), which implies (5.141) as desired). To verify M = {(0, 0)}, note V̇(y₁, y₂) < 0 for y₁ ≠ 0 and y₂ ≠ 0, showing (y₁, y₂) ∉ M. For y₁ = 0, y₂ ≠ 0, let φ := Ỹ(·, y₁, y₂). Then φ₂(0) = y₂ ≠ 0 and φ₁′(0) = y₂ ≠ 0, i.e. both φ₁ and φ₂ are nonzero on some interval ]0, ε[ with ε > 0, showing (y₁, y₂) ∉ M. Likewise, if y₁ ≠ 0, y₂ = 0, then let φ be as before. This time φ₁(0) = y₁ ≠ 0 and φ₂′(0) = −h(y₁) ≠ 0, again showing both φ₁ and φ₂ are nonzero on some interval ]0, ε[ with ε > 0, implying (y₁, y₂) ∉ M.
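As an illustration of Example 5.55 (with the hypothetical choice h(x) := x, which satisfies (5.133)), one can integrate (5.134) and watch V decrease while the trajectory approaches (0, 0); a sketch, assuming scipy:

import numpy as np
from scipy.integrate import solve_ivp

def f(x, y):
    # (5.134) with h(y1) = y1, i.e. H(y1) = y1^2/2
    return [y[1], -y[0]**2*y[1] - y[0]]

def V(y):
    return 0.5*y[0]**2 + 0.5*y[1]**2   # (5.136) for this choice of h

sol = solve_ivp(f, (0.0, 300.0), [0.8, 0.0], rtol=1e-9, atol=1e-12)
print(V(sol.y[:, 0]), V(sol.y[:, -1]))   # V decreases along the solution
print(np.hypot(*sol.y[:, -1]))           # close to 0: omega(xi) = {(0, 0)}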

A Differentiability

We provide a lemma used in the variation of constants Th. 2.3.

Lemma A.1. Let O ⊆ R be open. If the function a : O → K is differentiable, then

f : O → K,   f(x) := e^{a(x)}   (A.1a)

is differentiable with

f′ : O → K,   f′(x) := a′(x) e^{a(x)}.   (A.1b)

Proof. For K = R, the lemma is immediate from the one-dimensional chain rule for real-valued functions [Phi13a, (9.16)]. It remains to consider the case K = C. Note that we can not apply the chain rule for holomorphic (i.e. C-differentiable) functions, since a is only R-differentiable and it does not need to have a holomorphic extension. However, we can argue as follows, merely using the chain rule and the product rule for real-valued functions: Write a = b + ic with differentiable functions b, c : O → R. Then

f(x) = e^{a(x)} = e^{b(x)+ic(x)} = e^{b(x)} e^{ic(x)} = e^{b(x)} (cos c(x) + i sin c(x)).   (A.2)

Thus, one computes

f′(x) = b′(x) e^{b(x)} e^{ic(x)} + e^{b(x)} (−c′(x) sin c(x) + i c′(x) cos c(x))
= b′(x) e^{a(x)} + i c′(x) e^{b(x)} (cos c(x) + i sin c(x)) = b′(x) e^{a(x)} + i c′(x) e^{b(x)} e^{ic(x)}
= (b′(x) + i c′(x)) e^{a(x)} = a′(x) e^{a(x)},   (A.3)

proving (A.1b). ∎

B Kⁿ-Valued Integration

During the course of this class, we frequently need Kⁿ-valued integrals. In particular, for f : I → Kⁿ, I an interval in R, we make use of the estimate ‖∫_I f‖ ≤ ∫_I ‖f‖, for example in the proof of the Peano Th. 3.8. As mentioned in the proof of Th. 3.8, the estimate can easily be checked directly for the 1-norm on Kⁿ, but it does hold for every norm on Kⁿ. To verify this result is the main purpose of the present section. Throughout the class, it suffices to use Riemann integrals. However, some readers might be more familiar with Lebesgue integrals, which is a more general notion (every Riemann integrable function is also Lebesgue integrable). For convenience, the material is presented twice, first using Riemann integrals and arguments that make specific use of techniques available for Riemann integrals, then, second, using Lebesgue integrals and corresponding techniques. For Riemann integrals, the norm estimate is proved in Th. B.4, for Lebesgue integrals in Th. B.9.

B.1 Kⁿ-Valued Riemann Integral

Definition B.1. Let a, b ∈ R, I := [a, b]. We call a function f : I → Kⁿ, n ∈ N, Riemann integrable if, and only if, each coordinate function f_j = π_j ∘ f : I → K, j = 1, ..., n, is Riemann integrable. Denote the set of all Riemann integrable functions from I into Kⁿ by R(I, Kⁿ). If f : I → Kⁿ is Riemann integrable, then

∫_I f := (∫_I f₁, ..., ∫_I fₙ) ∈ Kⁿ   (B.1)

is the (Kⁿ-valued) Riemann integral of f over I.


Remark B.2. The linearity of the K-valued integral implies the linearity of the Kⁿ-valued integral.

Theorem B.3. Let a, b ∈ R, a ≤ b, I := [a, b]. If f ∈ R(I, Kⁿ), n ∈ N, and φ : f(I) → R is Lipschitz continuous, then φ ∘ f ∈ R(I, R).
Proof. If K = R, then φ ∘ f = φ̃ ∘ ι ∘ f, where ι : Rⁿ → Cⁿ is the canonical imbedding, and φ̃(z₁, ..., zₙ) := φ(Re z₁, ..., Re zₙ). Clearly, ι ∘ f ∈ R(I, Cⁿ), and, if φ is L-Lipschitz, L ≥ 0, then, due to

|φ̃(z) − φ̃(w)| = |φ(Re z) − φ(Re w)| ≤ L ‖Re z − Re w‖
≤^{(∗)} C̃L ‖Re z − Re w‖₁ = C̃L Σ_{j=1}^n |Re z_j − Re w_j|
≤ C̃L Σ_{j=1}^n |z_j − w_j| = C̃L ‖z − w‖₁ ≤^{(∗∗)} C C̃ L ‖z − w‖   ([Phi13a, Th. 5.11(d)]),   (B.2)

where the estimate at (∗) holds with C̃ ∈ R⁺, due to the equivalence of ‖·‖ and ‖·‖₁ on Rⁿ, and the estimate at (∗∗) holds with C ∈ R⁺, due to the equivalence of ‖·‖₁ and ‖·‖ on Cⁿ. Thus, by (B.2), φ̃ is Lipschitz as well, namely C C̃ L-Lipschitz, and it suffices to consider the case K = C, which we proceed to do next. Once again using the equivalence of ‖·‖₁ and ‖·‖ on Cⁿ, there exists c ∈ R⁺ such that ‖z‖ ≤ c‖z‖₁ for each z ∈ Cⁿ. Assume φ to be L-Lipschitz, L ≥ 0. If f ∈ R(I, Cⁿ), then Re f₁, ..., Re fₙ ∈ R(I, R) and Im f₁, ..., Im fₙ ∈ R(I, R), i.e., given ε > 0, Riemann's integrability criterion of [Phi13a, Th. 10.12] provides partitions Δ₁, ..., Δₙ of I and Δ̃₁, ..., Δ̃ₙ of I such that

∀ j = 1, ..., n :   R(Δ_j, Re f_j) − r(Δ_j, Re f_j) < ε/(2ncL),   R(Δ̃_j, Im f_j) − r(Δ̃_j, Im f_j) < ε/(2ncL),   (B.3)

where R and r denote upper and lower Riemann sums, respectively (cf. [Phi13a, (10.7)]). Letting Δ be a joint refinement of the 2n partitions Δ₁, ..., Δₙ, Δ̃₁, ..., Δ̃ₙ, we have (cf. [Phi13a, Def. 10.8(a),(b)] and [Phi13a, Th. 10.10(a)])

∀ j = 1, ..., n :   R(Δ, Re f_j) − r(Δ, Re f_j) < ε/(2ncL),   R(Δ, Im f_j) − r(Δ, Im f_j) < ε/(2ncL).   (B.4)

Recalling that, for each g : I → R and Δ = (x₀, ..., x_N) ∈ R^{N+1}, N ∈ N, a = x₀ < x₁ < ... < x_N = b, I_k := [x_{k−1}, x_k], it is

r(Δ, g) = Σ_{k=1}^N m_k |I_k| = Σ_{k=1}^N m_k(g)(x_k − x_{k−1}),   (B.5a)
R(Δ, g) = Σ_{k=1}^N M_k |I_k| = Σ_{k=1}^N M_k(g)(x_k − x_{k−1}),   (B.5b)

where

m_k(g) := inf{g(x) : x ∈ I_k},   M_k(g) := sup{g(x) : x ∈ I_k},   (B.5c)

we obtain, for each σ_k, τ_k ∈ I_k,

|(φ ∘ f)(σ_k) − (φ ∘ f)(τ_k)| ≤ L ‖f(σ_k) − f(τ_k)‖ ≤ cL ‖f(σ_k) − f(τ_k)‖₁
= cL Σ_{j=1}^n |f_j(σ_k) − f_j(τ_k)|
≤ cL Σ_{j=1}^n |Re f_j(σ_k) − Re f_j(τ_k)| + cL Σ_{j=1}^n |Im f_j(σ_k) − Im f_j(τ_k)|   ([Phi13a, Th. 5.11(d)])
≤ cL Σ_{j=1}^n (M_k(Re f_j) − m_k(Re f_j)) + cL Σ_{j=1}^n (M_k(Im f_j) − m_k(Im f_j)).   (B.6)

Thus,

R(Δ, φ ∘ f) − r(Δ, φ ∘ f) = Σ_{k=1}^N (M_k(φ ∘ f) − m_k(φ ∘ f)) |I_k|
≤^{(B.6)} cL Σ_{k=1}^N Σ_{j=1}^n (M_k(Re f_j) − m_k(Re f_j)) |I_k| + cL Σ_{k=1}^N Σ_{j=1}^n (M_k(Im f_j) − m_k(Im f_j)) |I_k|
= cL Σ_{j=1}^n (R(Δ, Re f_j) − r(Δ, Re f_j)) + cL Σ_{j=1}^n (R(Δ, Im f_j) − r(Δ, Im f_j))
<^{(B.4)} 2ncL · ε/(2ncL) = ε.   (B.7)

Thus, φ ∘ f ∈ R(I, R) by [Phi13a, Th. 10.12]. ∎

Theorem B.4. Let a, b ∈ R, a ≤ b, I := [a, b]. For each norm ‖·‖ on Kⁿ, n ∈ N, and each Riemann integrable f : I → Kⁿ, it is ‖f‖ ∈ R(I, R), and the following holds:

‖∫_I f‖ ≤ ∫_I ‖f‖.   (B.8)

Proof. From Th. B.3, we obtain ‖f‖ ∈ R(I, R), as the norm ‖·‖ is 1-Lipschitz by the inverse triangle inequality. Let Δ be an arbitrary partition of I. Recalling that, for each g : I → R and Δ = (x₀, ..., x_N) ∈ R^{N+1}, N ∈ N, a = x₀ < x₁ < ... < x_N = b, I_k := [x_{k−1}, x_k], τ_k ∈ I_k, the intermediate Riemann sums are

ρ(Δ, g) = Σ_{k=1}^N g(τ_k) |I_k| = Σ_{k=1}^N g(τ_k)(x_k − x_{k−1}),   (B.9)

we obtain, for τ_k ∈ I_k,

‖(ρ(Δ, Re f₁) + i ρ(Δ, Im f₁), ..., ρ(Δ, Re fₙ) + i ρ(Δ, Im fₙ))‖
= ‖(Σ_{k=1}^N Re f₁(τ_k)|I_k| + i Σ_{k=1}^N Im f₁(τ_k)|I_k|, ..., Σ_{k=1}^N Re fₙ(τ_k)|I_k| + i Σ_{k=1}^N Im fₙ(τ_k)|I_k|)‖
= ‖Σ_{k=1}^N (f₁(τ_k), ..., fₙ(τ_k)) |I_k|‖
≤ Σ_{k=1}^N ‖f(τ_k)‖ |I_k| = ρ(Δ, ‖f‖).   (B.10)

Since the intermediate Riemann sums in (B.10) converge to the respective integrals by [Phi13a, (10.24b)], one obtains

‖∫_I f‖ = lim_{|Δ|→0} ‖(ρ(Δ, Re f₁) + i ρ(Δ, Im f₁), ..., ρ(Δ, Re fₙ) + i ρ(Δ, Im fₙ))‖
≤^{(B.10)} lim_{|Δ|→0} ρ(Δ, ‖f‖) = ∫_I ‖f‖,   (B.11)

proving (B.8). ∎
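The estimate (B.8) is also easy to test numerically with Riemann sums; a sketch, assuming numpy, using the max-norm on R² and the arbitrary test function f(x) = (cos x, sin x) on [0, π]:

import numpy as np

xs, dx = np.linspace(0.0, np.pi, 10001, retstep=True)
f = np.stack([np.cos(xs), np.sin(xs)])        # f : [0, pi] -> R^2

int_f = f.sum(axis=1) * dx                    # componentwise integral, approx. (0, 2)
lhs = np.max(np.abs(int_f))                   # || int f ||_max
rhs = np.sum(np.max(np.abs(f), axis=0)) * dx  # int || f ||_max
print(lhs, rhs, lhs <= rhs)                   # approx. 2.0 <= 2.83 -> True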

B.2 Kⁿ-Valued Lebesgue Integral

Definition B.5. Let I ⊆ R be (Lebesgue) measurable, n ∈ N.

(a) A function f : I → Kⁿ is called (Lebesgue) measurable (respectively, (Lebesgue) integrable) if, and only if, each coordinate function f_j = π_j ∘ f : I → K, j = 1, ..., n, is (Lebesgue) measurable (respectively, (Lebesgue) integrable), which, for K = C, means if, and only if, each Re f_j and each Im f_j, j = 1, ..., n, is (Lebesgue) measurable (respectively, (Lebesgue) integrable).

(b) If f : I → Kⁿ is integrable, then

∫_I f := (∫_I f₁, ..., ∫_I fₙ) ∈ Kⁿ   (B.12)

is the (Kⁿ-valued) (Lebesgue) integral of f over I.


Remark B.6. The linearity of the K-valued integral implies the linearity of the Kn valued integral.
Theorem B.7. Let $I \subseteq \mathbb{R}$ be measurable, $n \in \mathbb{N}$. Then $f : I \to \mathbb{K}^n$ is measurable in the sense of Def. B.5(a) if, and only if, $f^{-1}(O)$ is measurable for each open subset $O$ of $\mathbb{K}^n$.

Proof. Assume $f^{-1}(O)$ is measurable for each open subset $O$ of $\mathbb{K}^n$. Let $j \in \{1,\dots,n\}$. If $O_j \subseteq \mathbb{K}$ is open in $\mathbb{K}$, then $O := \pi_j^{-1}(O_j) = \{z \in \mathbb{K}^n : z_j \in O_j\}$ is open in $\mathbb{K}^n$. Thus, $f_j^{-1}(O_j) = f^{-1}(O)$ is measurable, showing that each $f_j$ is measurable, i.e. $f$ is measurable. Now assume $f$ is measurable, i.e. each $f_j$ is measurable. Since every open $O \subseteq \mathbb{K}^n$ is a countable union of open sets of the form $O = O_1 \times \dots \times O_n$ with each $O_j$ being an open subset of $\mathbb{K}$, it suffices to show that the preimages of such open sets are measurable. So let $O$ be as above. Then $f^{-1}(O) = \bigcap_{j=1}^n f_j^{-1}(O_j)$, showing that $f^{-1}(O)$ is measurable.

Corollary B.8. Let $I \subseteq \mathbb{R}$ be measurable, $n \in \mathbb{N}$. If $f : I \to \mathbb{K}^n$ is measurable, then $\|f\| : I \to \mathbb{R}$ is measurable.

Proof. If $O \subseteq \mathbb{R}$ is open, then $\|\cdot\|^{-1}(O)$ is an open subset of $\mathbb{K}^n$ by the continuity of the norm. In consequence, $\|f\|^{-1}(O) = f^{-1}\big(\|\cdot\|^{-1}(O)\big)$ is measurable.


Theorem B.9. Let $I \subseteq \mathbb{R}$ be measurable, $n \in \mathbb{N}$. For each norm $\|\cdot\|$ on $\mathbb{K}^n$ and each integrable $f : I \to \mathbb{K}^n$, the following holds:
\[
\Big\| \int_I f \Big\| \le \int_I \|f\|. \tag{B.13}
\]

Proof. First assume that $B \subseteq I$ is measurable, $y \in \mathbb{K}^n$, and $f = y\,\chi_B$, where $\chi_B$ is the characteristic function of $B$ (i.e. the $f_j$ are $y_j$ on $B$ and $0$ on $I \setminus B$). Then
\[
\Big\|\int_I f\Big\| = \big\|\big(y_1\,\lambda(B), \dots, y_n\,\lambda(B)\big)\big\| = \lambda(B)\,\|y\| = \int_I \|f\|, \tag{B.14}
\]
where $\lambda$ denotes Lebesgue measure on $\mathbb{R}$. Next, consider the case that $f$ is a so-called simple function, that means $f$ takes only finitely many values $y_1,\dots,y_N \in \mathbb{K}^n$, $N \in \mathbb{N}$, and each preimage $B_j := f^{-1}\{y_j\} \subseteq I$ is measurable. Then
\[
f = \sum_{j=1}^N y_j\,\chi_{B_j}, \tag{B.15}
\]
where, without loss of generality, we may assume that the $B_j$ are pairwise disjoint. We obtain
\[
\Big\|\int_I f\Big\| = \Big\|\sum_{j=1}^N \int_I y_j\,\chi_{B_j}\Big\| \le \sum_{j=1}^N \Big\|\int_I y_j\,\chi_{B_j}\Big\| \overset{\text{(B.14)}}{=} \sum_{j=1}^N \int_I \|y_j\,\chi_{B_j}\| \overset{(*)}{=} \int_I \Big\|\sum_{j=1}^N y_j\,\chi_{B_j}\Big\| = \int_I \|f\|, \tag{B.16}
\]
where, at $(*)$, it was used that, as the $B_j$ are disjoint, the integrands of the two integrals are equal at each $x \in I$.

Now, if $f$ is integrable, then each $\operatorname{Re} f_j$ and each $\operatorname{Im} f_j$ is integrable (i.e. $\operatorname{Re} f_j, \operatorname{Im} f_j \in L^1(I)$) and there exist sequences of simple functions $\phi_{j,k} : I \to \mathbb{R}$ and $\psi_{j,k} : I \to \mathbb{R}$ such that $\lim_{k\to\infty} \|\phi_{j,k} - \operatorname{Re} f_j\|_{L^1(I)} = \lim_{k\to\infty} \|\psi_{j,k} - \operatorname{Im} f_j\|_{L^1(I)} = 0$. In particular,
\[
0 \le \lim_{k\to\infty} \Big| \int_I \phi_{j,k} - \int_I \operatorname{Re} f_j \Big| \le \lim_{k\to\infty} \|\phi_{j,k} - \operatorname{Re} f_j\|_{L^1(I)} = 0, \tag{B.17a}
\]
\[
0 \le \lim_{k\to\infty} \Big| \int_I \psi_{j,k} - \int_I \operatorname{Im} f_j \Big| \le \lim_{k\to\infty} \|\psi_{j,k} - \operatorname{Im} f_j\|_{L^1(I)} = 0, \tag{B.17b}
\]
and also
\[
0 \le \lim_{k\to\infty} \|\phi_{j,k} + i\psi_{j,k} - f_j\|_{L^1(I)} \le \lim_{k\to\infty} \|\phi_{j,k} - \operatorname{Re} f_j\|_{L^1(I)} + \lim_{k\to\infty} \|\psi_{j,k} - \operatorname{Im} f_j\|_{L^1(I)} = 0. \tag{B.18}
\]
Thus, writing $\phi_k := (\phi_{1,k},\dots,\phi_{n,k})$ and $\psi_k := (\psi_{1,k},\dots,\psi_{n,k})$, we obtain
\begin{align}
\Big\|\int_I f\Big\| &= \Big\|\Big( \int_I f_1, \dots, \int_I f_n \Big)\Big\| = \Big\|\Big( \lim_{k\to\infty}\int_I \phi_{1,k} + i \lim_{k\to\infty}\int_I \psi_{1,k}, \dots, \lim_{k\to\infty}\int_I \phi_{n,k} + i \lim_{k\to\infty}\int_I \psi_{n,k} \Big)\Big\| \notag\\
&= \lim_{k\to\infty} \Big\|\int_I (\phi_k + i\psi_k)\Big\| \le \lim_{k\to\infty} \int_I \|\phi_k + i\psi_k\| \overset{(*)}{=} \int_I \|f\|, \tag{B.19}
\end{align}
where the estimate holds since each $\phi_k + i\psi_k$ is a simple function (cf. (B.16)), and where the equality at $(*)$ holds due to $\lim_{k\to\infty} \big\| \|\phi_k + i\psi_k\| - \|f\| \big\|_{L^1(I)} = 0$, which, in turn, is verified by
\[
\Big| \int_I \|\phi_k + i\psi_k\| - \int_I \|f\| \Big| \le \int_I \big| \|\phi_k + i\psi_k\| - \|f\| \big| \le \int_I \|\phi_k + i\psi_k - f\| \le C \int_I \|\phi_k + i\psi_k - f\|_1 = C \int_I \sum_{j=1}^n |\phi_{j,k} + i\psi_{j,k} - f_j| \to 0 \quad\text{for } k \to \infty, \tag{B.20}
\]
with $C \in \mathbb{R}^+$ since the norms $\|\cdot\|$ and $\|\cdot\|_1$ are equivalent on $\mathbb{K}^n$.

C Metric Spaces

C.1 Distance in Metric Spaces

Lemma C.1. The following law holds in every metric space $(X,d)$:
\[
|d(x,y) - d(x',y')| \le d(x,x') + d(y,y') \quad\text{for each } x,x',y,y' \in X. \tag{C.1}
\]
In particular, (C.1) states the Lipschitz continuity of $d : X^2 \to \mathbb{R}_0^+$ (with Lipschitz constant 1) with respect to the metric $d_1$ on $X^2$ defined by
\[
d_1 : X^2 \times X^2 \to \mathbb{R}_0^+, \quad d_1\big((x,y),(x',y')\big) = d(x,x') + d(y,y'). \tag{C.2}
\]
Further consequences are the continuity and even uniform continuity of $d$, and also the continuity of $d$ in both components.

Proof. First, note $d(x,y) \le d(x,x') + d(x',y') + d(y',y)$, i.e.
\[
d(x,y) - d(x',y') \le d(x,x') + d(y',y). \tag{C.3a}
\]
Second, $d(x',y') \le d(x',x) + d(x,y) + d(y,y')$, i.e.
\[
d(x',y') - d(x,y) \le d(x',x) + d(y,y'). \tag{C.3b}
\]
Taken together, (C.3a) and (C.3b) complete the proof of (C.1).


Definition C.2. Let $(X,d)$ be a nonempty metric space. For each $A, B \subseteq X$ define the distance between $A$ and $B$ by
\[
\operatorname{dist}(A,B) := \inf\{d(a,b) : a \in A,\ b \in B\} \in [0,\infty] \tag{C.4}
\]
and
\[
\forall_{x\in X}\quad \operatorname{dist}(x,B) := \operatorname{dist}(\{x\},B) \quad\text{and}\quad \operatorname{dist}(A,x) := \operatorname{dist}(A,\{x\}). \tag{C.5}
\]

Remark C.3. Clearly, for $\operatorname{dist}(A,B)$ as defined in (C.4), we have
\[
\operatorname{dist}(A,B) < \infty \quad\Longleftrightarrow\quad A \ne \emptyset \text{ and } B \ne \emptyset. \tag{C.6}
\]

Theorem C.4. Let $(X,d)$ be a nonempty metric space. If $A \subseteq X$ and $A \ne \emptyset$, then the functions
\[
\phi, \tilde\phi : X \to \mathbb{R}_0^+, \quad \phi(x) := \operatorname{dist}(x,A), \quad \tilde\phi(x) := \operatorname{dist}(A,x), \tag{C.7}
\]
are both Lipschitz continuous with Lipschitz constant 1 (in particular, they are both continuous and even uniformly continuous).

Proof. Since $\operatorname{dist}(x,A) = \operatorname{dist}(A,x)$, it suffices to verify the Lipschitz continuity of $\phi$. We need to show
\[
\forall_{x,y\in X}\quad |\operatorname{dist}(x,A) - \operatorname{dist}(y,A)| \le d(x,y). \tag{C.8}
\]
To this end, let $x,y \in X$ and $a \in A$ be arbitrary. Then
\[
\operatorname{dist}(x,A) \le d(x,a) \le d(x,y) + d(y,a) \tag{C.9}
\]
and
\[
\operatorname{dist}(x,A) - d(x,y) \le d(y,a), \tag{C.10}
\]
implying
\[
\operatorname{dist}(x,A) - d(x,y) \le \operatorname{dist}(y,A) \tag{C.11}
\]
and
\[
\operatorname{dist}(x,A) - \operatorname{dist}(y,A) \le d(x,y). \tag{C.12}
\]
Since $x,y \in X$ were arbitrary, (C.12) also yields
\[
\operatorname{dist}(y,A) - \operatorname{dist}(x,A) \le d(x,y), \tag{C.13}
\]
where (C.12) and (C.13) together are precisely (C.8).
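As an aside, the Lipschitz estimate (C.8) is easy to test numerically; the following Python sketch (illustrative only, with a made-up finite set $A \subseteq \mathbb{R}^2$) checks it on random pairs of points:

```python
# Numerical sketch for Th. C.4: |dist(x, A) - dist(y, A)| <= d(x, y)
# for a finite set A in the Euclidean plane.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 2))                       # finite A ⊆ R^2

def dist_to_A(x):
    return min(np.linalg.norm(x - a) for a in A)  # inf over A, here a min

for _ in range(1000):
    x, y = rng.normal(size=2), rng.normal(size=2)
    lhs = abs(dist_to_A(x) - dist_to_A(y))
    assert lhs <= np.linalg.norm(x - y) + 1e-12   # (C.8) holds
print("Lipschitz estimate (C.8) verified on 1000 random pairs.")
```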

Definition C.5. Let $(X,d)$ be a metric space, $A \subseteq X$, and $\rho \in \mathbb{R}^+$. Define
\[
A_\rho := \{x \in X : \operatorname{dist}(x,A) < \rho\}, \tag{C.14a}
\]
\[
\bar A_\rho := \{x \in X : \operatorname{dist}(x,A) \le \rho\}. \tag{C.14b}
\]
We call $A_\rho$ the open $\rho$-fattening of $A$, and $\bar A_\rho$ the closed $\rho$-fattening of $A$.

Lemma C.6. Let $(X,d)$ be a metric space, $A \subseteq X$, and $\rho \in \mathbb{R}^+$. Then $A_\rho$, the open $\rho$-fattening of $A$, is, indeed, open, and $\bar A_\rho$, the closed $\rho$-fattening of $A$, is, indeed, closed.

Proof. Since the distance function $\phi : X \to \mathbb{R}_0^+$, $\phi(x) := \operatorname{dist}(x,A)$, is continuous by Th. C.4, $A_\rho = \phi^{-1}([0,\rho[)$ is open as the continuous preimage of an open set (note that $[0,\rho[$ is, indeed, (relatively) open in $\mathbb{R}_0^+$); $\bar A_\rho = \phi^{-1}([0,\rho])$ is closed as the continuous preimage of a closed set.

Lemma C.7. Let $(X,d)$ be a metric space, $A \subseteq X$, and $\rho \in \mathbb{R}^+$. If $A$ is bounded, then so are the fattenings $A_\rho$ and $\bar A_\rho$.

Proof. If $A$ is bounded, then there exist $x \in X$ and $r > 0$ such that $A \subseteq B_r(x)$. Let $s := r + \rho + 1$. If $y \in \bar A_\rho$, then there exists $a \in A$ such that $d(a,y) < \rho + 1$. Thus,
\[
d(x,y) \le d(x,a) + d(a,y) < r + \rho + 1 = s, \tag{C.15}
\]
showing $A_\rho \subseteq \bar A_\rho \subseteq B_s(x)$, i.e. $A_\rho$ and $\bar A_\rho$ are bounded.


Proposition C.8. Let $(X,d)$ be a metric space, $A \subseteq X$, and $0 < \rho_1 < \rho_2$.

(a) Then $A \subseteq A_{\rho_1} \subseteq \bar A_{\rho_1} \subseteq A_{\rho_2} \subseteq \bar A_{\rho_2}$ always holds.

(b) If $(X,\|\cdot\|)$ is a normed space with $d$ being the induced metric, $\emptyset \ne A \subseteq X$, and there exists $x \notin A$, satisfying $\delta := d(x,A) \ge \rho_2$, then all the inclusions in (a) are strict: $A \subsetneq A_{\rho_1} \subsetneq \bar A_{\rho_1} \subsetneq A_{\rho_2} \subsetneq \bar A_{\rho_2}$. Caveat: For general metric spaces $X$ and $A$ satisfying all the hypotheses, the inclusions do not need to be strict (consider discrete metric spaces for simple examples).

Proof. (a) is immediate from (C.14).

To prove (b), let $a \in A$ and consider the maps
\[
\gamma : [0,1] \to X, \quad \gamma(t) := t\,x + (1-t)\,a, \tag{C.16a}
\]
\[
f : [0,1] \to \mathbb{R}, \quad f(t) := d\big(\gamma(t), A\big). \tag{C.16b}
\]
If $(s_n)_{n\in\mathbb{N}}$ is a sequence in $[0,1]$ such that $\lim_{n\to\infty} s_n = s \in [0,1]$, then $\lim_{n\to\infty} \gamma(s_n) = s\,x + (1-s)\,a = \gamma(s)$, i.e. $\gamma$ is continuous. Then, using Th. C.4, $f$ is also continuous. Thus, since $f(0) = d(a,A) = 0$ and $f(1) = d(x,A) = \delta \ge \rho_2$, one can use the intermediate value theorem [Phi13a, Th. 7.56] to obtain, for each $\alpha \in [0,\rho_2]$, some $\tau_\alpha \in [0,1]$, satisfying $f(\tau_\alpha) = \alpha$. If $\alpha > 0$, then $d(\gamma(\tau_\alpha),A) = f(\tau_\alpha) = \alpha > 0$, i.e. $\gamma(\tau_\alpha) \in \bar A_\alpha \setminus A_\alpha$ and $\gamma(\tau_\alpha) \notin A$: for $\alpha \in\, ]0,\rho_1[$ this yields $\gamma(\tau_\alpha) \in A_{\rho_1} \setminus A$, for $\alpha = \rho_1$ it yields $\gamma(\tau_{\rho_1}) \in \bar A_{\rho_1} \setminus A_{\rho_1}$, and for $\alpha = \rho_2$ it yields $\gamma(\tau_{\rho_2}) \in \bar A_{\rho_2} \setminus A_{\rho_2}$, showing $A \subsetneq A_{\rho_1}$, $A_{\rho_1} \subsetneq \bar A_{\rho_1}$, and $A_{\rho_2} \subsetneq \bar A_{\rho_2}$. If $\alpha := (\rho_1 + \rho_2)/2$, then $\rho_1 < \alpha = f(\tau_\alpha) = d(\gamma(\tau_\alpha),A) < \rho_2$, i.e. $\gamma(\tau_\alpha) \in A_{\rho_2} \setminus \bar A_{\rho_1}$, showing $\bar A_{\rho_1} \subsetneq A_{\rho_2}$.


C.2 Compactness in Metric Spaces

Definition C.9. A subset $C$ of a metric space $X$ is called compact if, and only if, every sequence in $C$ has a subsequence that converges to some limit $c \in C$.

Proposition C.10. Let $(X,d)$ be a metric space, $C, A \subseteq X$. If $C$ is compact, $A$ is closed, and $A \cap C = \emptyset$, then $\operatorname{dist}(C,A) > 0$.

Proof. Proceeding by contraposition, we show that $\operatorname{dist}(C,A) = 0$ implies $A \cap C \ne \emptyset$. If $\operatorname{dist}(C,A) = 0$, then there exists a sequence $((c_k,a_k))_{k\in\mathbb{N}}$ in $C \times A$ such that
\[
\lim_{k\to\infty} d(c_k,a_k) = 0. \tag{C.17}
\]
As $C$ is compact, we may assume
\[
\lim_{k\to\infty} c_k = c \in C, \tag{C.18}
\]
also implying
\[
\lim_{k\to\infty} a_k = c, \tag{C.19}
\]
since
\[
\forall_{k\in\mathbb{N}}\quad d(a_k,c) \le d(a_k,c_k) + d(c_k,c).
\]
Since $A$ is closed, (C.19) yields $c \in A$, i.e. $c \in A \cap C$.


Proposition C.11. Let $(X,d)$ be a metric space and $C \subseteq X$.

(a) If $C$ is compact, then $C$ is closed and bounded.

(b) If $C$ is compact and $A \subseteq C$ is closed, then $A$ is compact.

Proof. (a): Suppose $C$ is compact. Let $(x_k)_{k\in\mathbb{N}}$ be a sequence in $C$ that converges in $X$, i.e. $\lim_{k\to\infty} x_k = x \in X$. Since $C$ is compact, $(x_k)_{k\in\mathbb{N}}$ must have a subsequence that converges to some $c \in C$, implying $x = c \in C$ and showing $C$ is closed. If $C$ is not bounded, then, for each $x \in X$, there is a sequence $(x_k)_{k\in\mathbb{N}}$ in $C$ such that $\lim_{k\to\infty} d(x,x_k) = \infty$. If $y \in X$, then $d(x,x_k) \le d(x,y) + d(y,x_k)$, i.e. $d(y,x_k) \ge d(x,x_k) - d(x,y)$, showing that $\lim_{k\to\infty} d(y,x_k) = \infty$ as well. Thus, $y$ can not be a limit of any subsequence of $(x_k)_{k\in\mathbb{N}}$. As $y$ was arbitrary, $C$ can not be compact.

(b): If $(x_k)_{k\in\mathbb{N}}$ is a sequence in $A$, then $(x_k)_{k\in\mathbb{N}}$ is a sequence in $C$. Since $C$ is compact, it must have a subsequence that converges to some $c \in C$. However, as $A$ is closed, $c$ must be in $A$, showing that $(x_k)_{k\in\mathbb{N}}$ has a subsequence that converges to some $c \in A$, i.e. $A$ is compact.

Corollary C.12. A subset $C$ of $\mathbb{K}^n$, $n \in \mathbb{N}$, is compact if, and only if, $C$ is closed and bounded.

Proof. Every compact set is closed and bounded by Prop. C.11(a). If $C$ is closed and bounded, and $(x_k)_{k\in\mathbb{N}}$ is a sequence in $C$, then the boundedness and the Bolzano-Weierstrass theorem yield a subsequence that converges to some $x \in \mathbb{K}^n$. However, since $C$ is closed, $x \in C$, showing that $C$ is compact.


The following examples show that, in general, sets can be closed and bounded without being compact.

Example C.13. (a) If $(X,d)$ is a noncomplete metric space, then it contains a Cauchy sequence that does not converge. It is not hard to see that such a sequence can not have a convergent subsequence, either. This shows that no noncomplete metric space can be compact. Moreover, the closure of every bounded subset of $X$ that contains such a nonconvergent Cauchy sequence is an example of a closed and bounded set that is noncompact. Concrete examples are given by $\mathbb{Q} \cap [a,b]$ for each $a,b \in \mathbb{R}$ with $a < b$ (these sets are $\mathbb{Q}$-closed, but not $\mathbb{R}$-closed!) and $]a,b[$ for each $a,b \in \mathbb{R}$ with $a < b$, in each case endowed with the usual metric $d(x,y) := |x-y|$.

(b) There can also be closed and bounded sets in complete spaces that are not compact. Consider the space $X$ of all bounded sequences $(x_n)_{n\in\mathbb{N}}$ in $\mathbb{K}$, endowed with the sup-norm $\|(x_n)_{n\in\mathbb{N}}\|_{\sup} := \sup\{|x_n| : n \in \mathbb{N}\}$. It is not too difficult to see that $X$ with the sup-norm is a Banach space: Let $(x^k)_{k\in\mathbb{N}}$ with $x^k = (x^k_n)_{n\in\mathbb{N}}$ be a Cauchy sequence in $X$. Then, for each $n \in \mathbb{N}$, $(x^k_n)_{k\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{K}$, and, thus, it has a limit $y_n \in \mathbb{K}$. Let $y := (y_n)_{n\in\mathbb{N}}$. Then
\[
\|x^k - y\|_{\sup} = \sup\{|x^k_n - y_n| : n \in \mathbb{N}\}.
\]
Let $\epsilon > 0$. As $(x^k)_{k\in\mathbb{N}}$ is a Cauchy sequence with respect to the sup-norm, there is $N \in \mathbb{N}$ such that $\|x^k - x^l\|_{\sup} < \epsilon$ for all $k,l > N$. Fix some $l > N$ and some $n \in \mathbb{N}$. Then $\lim_{k\to\infty} |x^k_n - x^l_n| = |y_n - x^l_n| \le \epsilon$. Since this is valid for each $n \in \mathbb{N}$, we get $\|x^l - y\|_{\sup} \le \epsilon$ for each $l > N$, showing $\lim_{l\to\infty} x^l = y$, i.e. $X$ is complete and a Banach space.

Now consider the sequence $(e^k)_{k\in\mathbb{N}}$ with
\[
e^k_n := \begin{cases} 1 & \text{for } k = n, \\ 0 & \text{otherwise.} \end{cases}
\]
Then $(e^k)_{k\in\mathbb{N}}$ constitutes a sequence in $X$ with $\|e^k\|_{\sup} = 1$ for each $k \in \mathbb{N}$. In particular, $(e^k)_{k\in\mathbb{N}}$ is a sequence inside the closed unit ball $\bar B_1(0)$, and, hence, bounded. However, if $k,l \in \mathbb{N}$ with $k \ne l$, then $\|e^k - e^l\|_{\sup} = 1$. Thus, neither $(e^k)_{k\in\mathbb{N}}$ nor any subsequence can be a Cauchy sequence. In particular, no subsequence can converge, showing that the closed and bounded unit ball $\bar B_1(0)$ is not compact.

Note: There is an important result that shows that a normed vector space is finite-dimensional if, and only if, the closed unit ball $\bar B_1(0)$ is compact (see, e.g., [Str08, Th. 28.14]).
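The unit-vector argument in (b) is easy to reproduce numerically; the following Python sketch (illustrative only, with the sequences truncated to finitely many entries) confirms the pairwise sup-distances of the $e^k$:

```python
# Sketch for Ex. C.13(b): the "unit vectors" e^k have pairwise sup-distance 1,
# so no subsequence is Cauchy. Truncating to 10 entries suffices here, since
# e^k is zero beyond position k.
import numpy as np

K = 10
E = np.eye(K)                                       # row k is e^{k+1}, truncated

for k in range(K):
    for l in range(k + 1, K):
        assert np.max(np.abs(E[k] - E[l])) == 1.0   # ||e^k - e^l||_sup = 1
print("All pairwise sup-distances equal 1; no Cauchy subsequence exists.")
```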
Theorem C.14. If $(X,d_X)$ and $(Y,d_Y)$ are metric spaces, $C \subseteq X$ is compact, and $f : C \to Y$ is continuous, then $f(C)$ is compact.

Proof. If $(y^k)_{k\in\mathbb{N}}$ is a sequence in $f(C)$, then, for each $k \in \mathbb{N}$, there is some $x^k \in C$ such that $f(x^k) = y^k$. As $C$ is compact, there is a subsequence $(a^k)_{k\in\mathbb{N}}$ of $(x^k)_{k\in\mathbb{N}}$ with $\lim_{k\to\infty} a^k = a$ for some $a \in C$. Then $(f(a^k))_{k\in\mathbb{N}}$ is a subsequence of $(y^k)_{k\in\mathbb{N}}$ and the continuity of $f$ yields $\lim_{k\to\infty} f(a^k) = f(a) \in f(C)$, showing that $(y^k)_{k\in\mathbb{N}}$ has a convergent subsequence with limit in $f(C)$. We have therefore established that $f(C)$ is compact.

Theorem C.15. If $(X,d)$ is a metric space, $C \subseteq X$ is compact, and $f : C \to \mathbb{R}$ is continuous, then $f$ assumes its max and its min, i.e. there are $x_m \in C$ and $x_M \in C$ such that $f$ has a global min at $x_m$ and a global max at $x_M$.

Proof. Since $C$ is compact and $f$ is continuous, $f(C) \subseteq \mathbb{R}$ is compact according to Th. C.14. Then, by [Phi13a, Lem. 7.52], $f(C)$ contains a smallest element $m$ and a largest element $M$. This, in turn, implies that there are $x_m, x_M \in C$ such that $f(x_m) = m$ and $f(x_M) = M$.

Theorem C.16. If $(X,d_X)$ and $(Y,d_Y)$ are metric spaces, $C \subseteq X$ is compact, and $f : C \to Y$ is continuous, then $f$ is uniformly continuous.

Proof. If $f$ is not uniformly continuous, then there must be some $\epsilon > 0$ such that, for each $k \in \mathbb{N}$, there exist $x^k, y^k \in C$ satisfying $d_X(x^k,y^k) < 1/k$ and $d_Y(f(x^k),f(y^k)) \ge \epsilon$. Since $C$ is compact, there are $a \in C$ and a subsequence $(a^k)_{k\in\mathbb{N}}$ of $(x^k)_{k\in\mathbb{N}}$ such that $a = \lim_{k\to\infty} a^k$. Then there is a corresponding subsequence $(b^k)_{k\in\mathbb{N}}$ of $(y^k)_{k\in\mathbb{N}}$ such that $d_X(a^k,b^k) < 1/k$ and $d_Y(f(a^k),f(b^k)) \ge \epsilon$ for all $k \in \mathbb{N}$. Using the compactness of $C$ again, there are $b \in C$ and a subsequence $(v^k)_{k\in\mathbb{N}}$ of $(b^k)_{k\in\mathbb{N}}$ such that $b = \lim_{k\to\infty} v^k$. Now there is a corresponding subsequence $(u^k)_{k\in\mathbb{N}}$ of $(a^k)_{k\in\mathbb{N}}$ such that $d_X(u^k,v^k) < 1/k$ and $d_Y(f(u^k),f(v^k)) \ge \epsilon$ for all $k \in \mathbb{N}$. Note that we still have $a = \lim_{k\to\infty} u^k$. Given $\epsilon' > 0$, there is $N \in \mathbb{N}$ such that, for each $k > N$, one has $d_X(a,u^k) < \epsilon'/3$, $d_X(b,v^k) < \epsilon'/3$, and $d_X(u^k,v^k) < 1/k < \epsilon'/3$. Thus, $d_X(a,b) \le d_X(a,u^k) + d_X(u^k,v^k) + d_X(v^k,b) < \epsilon'$, implying $d_X(a,b) = 0$ and $a = b$. Finally, the continuity of $f$ implies $f(a) = \lim_{k\to\infty} f(u^k) = \lim_{k\to\infty} f(v^k)$, in contradiction to $d_Y(f(u^k),f(v^k)) \ge \epsilon$.

Theorem C.17. If $(X,d_X)$ and $(Y,d_Y)$ are metric spaces, $C \subseteq X$ is compact, and $f : C \to Y$ is continuous and one-to-one, then $f^{-1} : f(C) \to C$ is continuous.

Proof. Let $(y^k)_{k\in\mathbb{N}}$ be a sequence in $f(C)$ such that $\lim_{k\to\infty} y^k = y \in f(C)$. Then there is a sequence $(x^k)_{k\in\mathbb{N}}$ in $C$ such that $f(x^k) = y^k$ for each $k \in \mathbb{N}$. Let $x := f^{-1}(y)$. It remains to prove that $\lim_{k\to\infty} x^k = x$. As $C$ is compact, there are $a \in C$ and a subsequence $(a^k)_{k\in\mathbb{N}}$ of $(x^k)_{k\in\mathbb{N}}$ such that $a = \lim_{k\to\infty} a^k$. The continuity of $f$ yields $f(a) = \lim_{k\to\infty} f(a^k) = \lim_{k\to\infty} y^k = y = f(x)$, since $(f(a^k))_{k\in\mathbb{N}}$ is a subsequence of $(y^k)_{k\in\mathbb{N}}$. It now follows that $a = x$, since $f$ is one-to-one. The same argument shows that every convergent subsequence of $(x^k)_{k\in\mathbb{N}}$ has to converge to $x$. If $(x^k)_{k\in\mathbb{N}}$ did not converge to $x$, then there would have to be some $\epsilon > 0$ such that infinitely many $x^k$ are not in $B_\epsilon(x)$. However, the compactness of $C$ would provide a convergent subsequence of these $x^k$ whose limit could not be $x$, in contradiction to $x$ having to be the limit of all convergent subsequences of $(x^k)_{k\in\mathbb{N}}$.

Definition C.18. A subset $A$ of a metric space $(X,d)$ is called precompact or totally bounded if, and only if, for each $\epsilon > 0$, $A$ can be covered by finitely many $\epsilon$-balls, i.e. if, and only if, there exist finitely many points $a_1,\dots,a_N \in A$, $N \in \mathbb{N}$, such that
\[
A \subseteq \bigcup_{j=1}^N B_\epsilon(a_j). \tag{C.20}
\]

Theorem C.19. For a subset $C$ of a metric space $(X,d)$, the following statements are equivalent:

(i) $C$ is compact as defined in Def. C.9.

(ii) $C$ has the Heine-Borel property, i.e. every open cover of $C$ has a finite subcover, i.e. if $(O_j)_{j\in I}$ is a family of open sets $O_j \subseteq X$, satisfying
\[
C \subseteq \bigcup_{j\in I} O_j, \tag{C.21}
\]
then there exist $j_1,\dots,j_N \in I$, $N \in \mathbb{N}$, such that $C \subseteq \bigcup_{l=1}^N O_{j_l}$.

(iii) $C$ is precompact (i.e. totally bounded) as defined in Def. C.18 and complete, i.e. every Cauchy sequence in $C$ converges to a limit in $C$.
Proof. We show (i) $\Rightarrow$ (iii) $\Rightarrow$ (ii) $\Rightarrow$ (i).

(i) $\Rightarrow$ (iii): Let $(c_n)_{n\in\mathbb{N}}$ be a Cauchy sequence in $C$. As $C$ is compact, $(c_n)_{n\in\mathbb{N}}$ has a subsequence $(c_{n_j})_{j\in\mathbb{N}}$ such that $\lim_{j\to\infty} c_{n_j} = c \in C$. Given $\epsilon > 0$, choose $K \in \mathbb{N}$ such that, for each $m,n \ge K$, $d(c_m,c_n) < \frac\epsilon2$, and such that, for each $n_j \ge K$, $d(c_{n_j},c) < \frac\epsilon2$. Then, fixing some $n_j \ge K$,
\[
\forall_{n\ge K}\quad d(c_n,c) \le d(c_n,c_{n_j}) + d(c_{n_j},c) < \frac\epsilon2 + \frac\epsilon2 = \epsilon, \tag{C.22}
\]
showing $\lim_{n\to\infty} c_n = c$ and the completeness of $C$. We now show $C$ to be also totally bounded. We proceed by contraposition and assume $C$ not to be totally bounded, i.e. there exists $\epsilon > 0$ such that $C$ is not contained in any finite union of $\epsilon$-balls. Inductively, we construct a sequence $(c_n)_{n\in\mathbb{N}}$ in $C$ such that
\[
\forall_{m,n\in\mathbb{N},\ m\ne n}\quad d(c_m,c_n) \ge \epsilon : \tag{C.23}
\]
To start with, we note $C \ne \emptyset$ and choose some arbitrary $c_1 \in C$. Assuming $c_1,\dots,c_k \in C$, $k \in \mathbb{N}$, have already been constructed such that $d(c_m,c_n) \ge \epsilon$ holds for each $m,n \in \{1,\dots,k\}$, $m \ne n$, there must be
\[
c \in C \setminus \bigcup_{j=1}^k B_\epsilon(c_j). \tag{C.24}
\]
Choosing $c_{k+1} := c$, (C.24) guarantees (C.23) now holds for each $m,n \in \{1,\dots,k+1\}$, $m \ne n$. Due to (C.23), no subsequence of $(c_n)_{n\in\mathbb{N}}$ can be a Cauchy sequence, i.e. $(c_n)_{n\in\mathbb{N}}$ does not have a convergent subsequence, proving $C$ is not compact.

(iii) $\Rightarrow$ (ii): Assume $C$ to be precompact and complete. For each $k \in \mathbb{N}$, the precompactness yields points $c^k_1,\dots,c^k_{N_k} \in C$, $N_k \in \mathbb{N}$, such that
\[
C \subseteq \bigcup_{j=1}^{N_k} B_{\frac1k}(c^k_j). \tag{C.25}
\]
Seeking a contradiction, assume $C$ does not have the Heine-Borel property, i.e. there exists an open cover $(O_j)_{j\in I}$ of $C$ which does not have a finite subcover. Inductively, we construct a decreasing sequence of subsets $C_k$ of $C$, $C \supseteq C_1 \supseteq C_2 \supseteq \dots$, such that no $C_k$ can be covered by finitely many of the $O_j$ and such that
\[
\forall_{k\in\mathbb{N}}\ \exists_{j\in\{1,\dots,N_k\}}\quad C_k \subseteq B_{\frac1k}(c^k_j) : \tag{C.26}
\]
To start out, we note that (C.25) implies at least one of the finitely many sets $C \cap B_1(c^1_1), \dots, C \cap B_1(c^1_{N_1})$ can not be covered by finitely many of the $O_j$, say, $C \cap B_1(c^1_{j_1})$. Define $C_1 := C \cap B_1(c^1_{j_1})$. Then, given $C_1,\dots,C_k$ have already been constructed for some $k \in \mathbb{N}$, since $C_k$ can not be covered by finitely many of the $O_j$ and
\[
C_k \subseteq C \subseteq \bigcup_{j=1}^{N_{k+1}} B_{\frac1{k+1}}(c^{k+1}_j), \tag{C.27}
\]
there exists $j_{k+1} \in \{1,\dots,N_{k+1}\}$ such that $C_k \cap B_{\frac1{k+1}}(c^{k+1}_{j_{k+1}})$ can not be covered by finitely many of the $O_j$, either. Define $C_{k+1} := C_k \cap B_{\frac1{k+1}}(c^{k+1}_{j_{k+1}})$. For each $k \in \mathbb{N}$, choose some $s_k \in C_k$ (note $C_k \ne \emptyset$, as it can not be covered by finitely many $O_j$). Given $\epsilon > 0$, there is $K \in \mathbb{N}$ such that $\frac2K < \epsilon$. If $k,l \ge K$, then $s_k, s_l \in C_K \subseteq B_{\frac1K}(c^K_j)$ for some suitable $j \in \{1,\dots,N_K\}$. In particular, $d(s_k,s_l) < \frac2K < \epsilon$, showing $(s_k)_{k\in\mathbb{N}}$ is a Cauchy sequence. As $(s_k)_{k\in\mathbb{N}}$ is a Cauchy sequence in $C$ and $C$ is complete, there exists $c \in C$ such that $\lim_{k\to\infty} s_k = c$. However, then there must exist some $j \in I$ such that $c \in O_j$ and, since $O_j$ is open, there is $\epsilon > 0$ with $B_\epsilon(c) \subseteq O_j$, and $B_\epsilon(c)$ must contain almost all of the $s_k$. Choose $k$ sufficiently large such that $\frac1k < \frac\epsilon4$ and $d(s_k,c) < \frac\epsilon2$. Then, since
\[
s_k \in C_k \subseteq B_{\frac1k}(c^k_j), \tag{C.28}
\]
one has
\[
\forall_{x\in B_{\frac1k}(c^k_j)}\quad d(x,c) \le d(x,s_k) + d(s_k,c) < \frac2k + \frac\epsilon2 < \frac\epsilon2 + \frac\epsilon2 = \epsilon, \tag{C.29}
\]
showing $C_k \subseteq B_{\frac1k}(c^k_j) \subseteq B_\epsilon(c) \subseteq O_j$, in contradiction to $C_k$ not being coverable by finitely many $O_j$.
(ii) $\Rightarrow$ (i): Assume $C$ has the Heine-Borel property. Seeking a contradiction, assume $C$ is not compact, that means there exists a sequence $(c_n)_{n\in\mathbb{N}}$ in $C$ such that no subsequence of $(c_n)_{n\in\mathbb{N}}$ converges to a limit in $C$. According to [Phi13b, Prop. 1.38(d)], no $c \in C$ can be a cluster point of $(c_n)_{n\in\mathbb{N}}$, i.e., for each $c \in C$, there exists $\epsilon_c > 0$ such that $B_{\epsilon_c}(c)$ contains only finitely many of the $c_n$. Since $C \subseteq \bigcup_{c\in C} B_{\epsilon_c}(c)$, the family $(B_{\epsilon_c}(c))_{c\in C}$ constitutes an open cover of $C$. As $C$ has the Heine-Borel property, there exist finitely many points $c_1,\dots,c_N \in C$, $N \in \mathbb{N}$, such that $C \subseteq \bigcup_{j=1}^N B_{\epsilon_{c_j}}(c_j)$, i.e. $C$ contains only finitely many of the $c_n$, in contradiction to $(c_n)_{n\in\mathbb{N}}$ being a sequence in $C$.


Caveat C.20. In general topological spaces, one defines compactness via the Heine-Borel property (a topological space $C$ is defined to be compact if, and only if, $C$ has the Heine-Borel property). Moreover, a topological space $C$ is defined to be sequentially compact if, and only if, every sequence in $C$ has a convergent subsequence. Using this terminology, one can rephrase the equivalence between (i) and (ii) in Th. C.19 by stating that a metric space is sequentially compact if, and only if, it is compact. However, in general topological spaces, neither implication remains true ((iii) of Th. C.19 does not even make sense in general topological spaces, as the concepts of boundedness, total boundedness, and Cauchy sequences are, in general, not available): For an example of a topological space that is compact, but not sequentially compact, see, e.g., [Pre75, 7.2.10(a)]; for an example of a topological space that is sequentially compact, but not compact, see, e.g., [Pre75, 7.2.10(c)].
Theorem C.21 (Lebesgue Number). Let $(X,d)$ be a metric space and $C \subseteq X$. If $C$ is compact and $(O_j)_{j\in I}$ is an open cover of $C$, then there exists a Lebesgue number $\delta$ for the open cover, i.e. some $\delta > 0$ such that, for each $A \subseteq C$ with $\operatorname{diam} A < \delta$, there exists $j_0 \in I$, where $A \subseteq O_{j_0}$. Recall that
\[
\operatorname{diam} A = \begin{cases} 0 & \text{for } A = \emptyset, \\ \sup\{d(x,y) : x,y \in A\} & \text{for } \emptyset \ne A. \end{cases} \tag{C.30}
\]

Proof. Seeking a contradiction, assume there is no Lebesgue number for the open cover $(O_j)_{j\in I}$. Then there is a sequence of pairs $(x_k,y_k) \in C^2$ such that
\[
\forall_{k\in\mathbb{N}}\quad d(x_k,y_k) < \frac1k \quad\text{but}\quad \forall_{j\in I}\quad \{x_k,y_k\} \not\subseteq O_j. \tag{C.31}
\]
As $C$ is compact, we may assume that $\lim_{k\to\infty} x_k = c \in C$, implying $\lim_{k\to\infty} y_k = c$ as well. But then there must be $O_j$ such that $c \in O_j$ and, due to the openness of $O_j$ and the convergence of both sequences to $c$, $O_j$ must contain almost all of the $x_k$ as well as almost all of the $y_k$, in contradiction to (C.31).


D Local Lipschitz Continuity

In Prop. 3.13, it was shown that a continuous function is locally Lipschitz with respect to $y$ if, and only if, it is globally Lipschitz with respect to $y$ on every compact set. The following Prop. D.1 shows that this equivalence holds even if $f$ is not continuous, provided that each projection $G_x$ as in (D.1) below is convex. On the other hand, Ex. D.2 shows that, in general, there exist discontinuous functions that are locally Lipschitz with respect to $y$ without being globally Lipschitz with respect to $y$ on every compact set.

Proposition D.1. Let $m,n \in \mathbb{N}$, $G \subseteq \mathbb{R} \times \mathbb{K}^m$, and $f : G \to \mathbb{K}^n$. If $G$ is such that each projection
\[
G_x := \{y \in \mathbb{K}^m : (x,y) \in G\}, \quad x \in \mathbb{R}, \tag{D.1}
\]
is convex (in particular, if $G$ itself is convex), then $f$ is locally Lipschitz with respect to $y$ if, and only if, $f$ is (globally) Lipschitz with respect to $y$ on every compact subset $K$ of $G$.
Proof. The proof of Prop. 3.13 shows, without making use of the continuity of $f$, that (global) Lipschitz continuity with respect to $y$ on every compact subset $K$ of $G$ implies local Lipschitz continuity on $G$. Thus, assume $f$ to be locally Lipschitz with respect to $y$ and assume each $G_x$ to be convex. The proof of Prop. 3.13 shows, without making use of the continuity of $f$, that, for each $K \subseteq G$ compact,
\[
\exists_{\delta>0,\ L\ge0}\ \forall_{(x,y),(x,\bar y)\in K}\quad \Big( \|y - \bar y\| < \delta \ \Rightarrow\ \|f(x,y) - f(x,\bar y)\| \le L\,\|y - \bar y\| \Big). \tag{D.2}
\]
If $(x,y),(x,\bar y) \in K$ are arbitrary with $y \ne \bar y$, then the convexity of $G_x$ implies
\[
\big\{ (x,\ (1-t)\,y + t\,\bar y) : t \in [0,1] \big\} \subseteq G. \tag{D.3}
\]
Choose $N \in \mathbb{N}$ such that $N > 2\,\|y-\bar y\|/\delta$ and set $h := \|y-\bar y\|/N$. Then
\[
h < \delta/2. \tag{D.4}
\]
Define
\[
\forall_{k=0,\dots,N}\quad y^k := \frac{kh}{\|y-\bar y\|}\,\bar y + \Big( 1 - \frac{kh}{\|y-\bar y\|} \Big)\,y. \tag{D.5}
\]
Then
\[
\forall_{k=0,\dots,N-1}\quad \|y^{k+1} - y^k\| = \Big\| \frac{h}{\|y-\bar y\|}\,\bar y - \frac{h}{\|y-\bar y\|}\,y \Big\| = h < \delta \tag{D.6}
\]
and
\[
\|f(x,y) - f(x,\bar y)\| \le \sum_{k=0}^{N-1} \|f(x,y^k) - f(x,y^{k+1})\| \overset{\text{(D.2)}}{\le} L \sum_{k=0}^{N-1} \|y^k - y^{k+1}\| = L\,N\,h = L\,\|y-\bar y\|, \tag{D.7}
\]
showing $f$ to be (globally) $L$-Lipschitz with respect to $y$ on $K$.

Example D.2. We provide two examples that show that, in general, a discontinuous function can be locally Lipschitz with respect to $y$ without being globally Lipschitz with respect to $y$ on every compact set.

(a) Consider
\[
G := \,]-2,2[\, \times \big( ]-4,-1[\, \cup\, ]1,4[ \big) \tag{D.8}
\]
and $f : G \to \mathbb{R}$,
\[
f(x,y) := \begin{cases} 1/x & \text{for } x \ne 0,\ y \in\, ]-4,-1[, \\ 0 & \text{for } x = 0,\ y \in\, ]-4,-1[, \\ 0 & \text{for } y \in\, ]1,4[. \end{cases} \tag{D.9}
\]
For the following open balls with respect to the max norm $\|(x,y)\| := \max\{|x|,|y|\}$, one has $B_1(x,y) \cap G \subseteq\, ]-2,2[\, \times\, ]-4,-1[$ for $y \in\, ]-4,-1[$, and $B_1(x,y) \cap G \subseteq\, ]-2,2[\, \times\, ]1,4[$ for $y \in\, ]1,4[$. Thus, $f(x,\cdot)$ is constant on each set $B_1(x,y) \cap G$ (either constantly equal to $1/x$ or constantly equal to $0$), i.e. $0$-Lipschitz with respect to $y$. In particular, $f$ is locally Lipschitz with respect to $y$. However, $f$ is not Lipschitz continuous with respect to $y$ on the compact set
\[
K := [-1,1] \times \big( [-3,-2] \cup [2,3] \big) : \tag{D.10}
\]
For the sequence $((x_k,y_k,\bar y_k))_{k\in\mathbb{N}}$, where
\[
\forall_{k\in\mathbb{N}}\quad x_k := 1/k, \quad y_k := -2, \quad \bar y_k := 2, \tag{D.11}
\]
one has
\[
\lim_{k\to\infty} \frac{|f(x_k,y_k) - f(x_k,\bar y_k)|}{|y_k - \bar y_k|} = \lim_{k\to\infty} \frac{k}{2-(-2)} = \infty, \tag{D.12}
\]
showing $f$ is not Lipschitz continuous with respect to $y$ on $K$.
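As an aside, the blow-up of the difference quotients in (D.12) is immediate to see numerically; the following short Python sketch (illustrative only) evaluates the ratio along the sequence (D.11):

```python
# Sketch for Ex. D.2(a): the difference quotients in (D.12) equal k/4,
# so no single Lipschitz constant works on K.
def f(x, y):
    if -4 < y < -1:
        return 1.0 / x if x != 0 else 0.0
    return 0.0                                  # y in ]1,4[

for k in (1, 10, 100, 1000):
    xk, yk, ybk = 1.0 / k, -2.0, 2.0
    ratio = abs(f(xk, yk) - f(xk, ybk)) / abs(yk - ybk)
    print(f"k = {k:5d}:  ratio = {ratio:.2f}")  # equals k/4, unbounded
```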

(b) If one increases the dimension by 1, then one can modify the example in (a) such that the set $G$ is even connected (this variant was pointed out by Anton Sporrer): Let
\[
A := \big( ]-4,-1[\, \times\, ]-2,2[ \big) \cup \big( ]-4,4[\, \times\, ]-2,0[ \big) \cup \big( ]1,4[\, \times\, ]-2,2[ \big) \subseteq \mathbb{R}^2. \tag{D.13}
\]
Then $A$ is open and connected (but not convex) and the same holds for
\[
G := \,]-2,2[\, \times A \subseteq \mathbb{R}^3. \tag{D.14}
\]
Define
\[
f : G \to \mathbb{R}, \quad f(x,y_1,y_2) := \begin{cases} 1/x & \text{for } x \ne 0,\ y_1 \in\, ]-4,-1[,\ y_2 > 0, \\ 0 & \text{otherwise.} \end{cases} \tag{D.15}
\]
Then everything works essentially as in (a) (it might be helpful to graphically visualize the set $A$ and the behavior of the function $f$): For the following open balls with respect to the max norm $\|(x,y)\| := \max\{|x|,|y_1|,|y_2|\}$, one has
\[
\forall_{(x,y_1,y_2)\in G,\ y_1\in\,]-4,-1[}\quad \Big( (\xi,\eta_1,\eta_2) \in B_1(x,y_1,y_2) \cap G \ \Rightarrow\ \eta_1 < -1+1 = 0 < 1 \Big). \tag{D.16}
\]
Analogous to (a), $f(x,\cdot)$ is constant on each set $B_1(x,y_1,y_2) \cap G$ (either constantly equal to $1/x$ or constantly equal to $0$), i.e. $0$-Lipschitz with respect to $y$. In particular, $f$ is locally Lipschitz with respect to $y$. However, $f$ is not Lipschitz continuous with respect to $y$ on the compact set
\[
K := [-1,1] \times \big( [-3,-2] \cup [2,3] \big) \times [-1,1] : \tag{D.17}
\]
For the sequence $((x_k,y_{1,k},\bar y_{1,k},y_{2,k}))_{k\in\mathbb{N}}$ with
\[
\forall_{k\in\mathbb{N}}\quad x_k := 1/k, \quad y_{1,k} := -2, \quad \bar y_{1,k} := 2, \quad y_{2,k} := 0, \tag{D.18}
\]
one has
\[
\lim_{k\to\infty} \frac{|f(x_k,y_{1,k},y_{2,k}) - f(x_k,\bar y_{1,k},y_{2,k})|}{\|(y_{1,k},y_{2,k}) - (\bar y_{1,k},y_{2,k})\|_{\max}} = \lim_{k\to\infty} \frac{k}{\max\{4,0\}} = \infty, \tag{D.19}
\]
showing $f$ is not Lipschitz continuous with respect to $y$ on $K$.

E Maximal Solutions on Nonopen Intervals

In Def. 3.20, we required a maximal solution to an ODE to be defined on an open interval. The following Ex. E.1 shows it can occur that such a maximal solution has an extension to a larger nonopen interval. In such cases, one might want to call the solution on the nonopen interval maximal rather than the solution on the smaller open interval. However, this would make the treatment of maximal solutions more cumbersome in some places, without adding any real substance, which is why we stick to our requirement for maximal solutions to always be defined on an open interval.

Example E.1. (a) Let
\[
G := [0,1] \times \mathbb{R}, \quad f : G \to \mathbb{R}, \quad f(x,y) := 0. \tag{E.1}
\]
Then, for each $(x_0,y_0) \in G$, the function
\[
\phi : [0,1] \to \mathbb{R}, \quad \phi \equiv y_0, \tag{E.2}
\]
is a solution to the initial value problem
\[
y' = f(x,y), \quad y(x_0) = y_0. \tag{E.3}
\]
However, the maximal solution of (E.3) according to Def. 3.20 is $\phi\restriction_{]0,1[}$.

(b) The following modification of (a) allows $f$ to be defined on all of $\mathbb{R}^2$: Let
\[
G := \mathbb{R}^2, \quad f : G \to \mathbb{R}, \quad f(x,y) := \begin{cases} 0 & \text{for } x \in [0,1], \\ 1 & \text{for } x \notin [0,1]. \end{cases} \tag{E.4}
\]
Then, for each $(x_0,y_0) \in [0,1] \times \mathbb{R}$, the function $\phi$ of (E.2) is a solution to the initial value problem (E.3), but, again, the maximal solution of (E.3) according to Def. 3.20 is $\phi\restriction_{]0,1[}$.

F Paths in $\mathbb{R}^n$

Definition F.1. A path or curve in $\mathbb{R}^n$, $n \in \mathbb{N}$, is a continuous map $\gamma : I \to \mathbb{R}^n$, where $I \subseteq \mathbb{R}$ is an interval. One calls the path differentiable, continuously differentiable, etc. if, and only if, the function $\gamma$ has the respective property.

Definition F.2. If $a,b \in \mathbb{R}$, $a \le b$, and $I := [a,b]$, then we call
\[
|I| := b - a = |a - b| \tag{F.1}
\]
the length of $I$.

Definition F.3. Given a real interval $I := [a,b] \subseteq \mathbb{R}$, $a,b \in \mathbb{R}$, $a < b$, the $(N+1)$-tuple $\Delta := (x_0,\dots,x_N) \in \mathbb{R}^{N+1}$, $N \in \mathbb{N}$, is called a partition of $I$ if, and only if, $a = x_0 < x_1 < \dots < x_N = b$. The set of all partitions of $I$ is denoted by $\Pi(I)$ or by $\Pi[a,b]$. Given a partition $\Delta$ of $I$ as above and letting $I_j := [x_{j-1},x_j]$, the number
\[
|\Delta| := \max\big\{ |I_j| : j \in \{1,\dots,N\} \big\} \tag{F.2}
\]
is called the mesh size of $\Delta$.

Notation F.4. Given $a,b \in \mathbb{R}$, $a < b$, a path $\gamma : [a,b] \to \mathbb{R}^n$, $n \in \mathbb{N}$, and a partition $\Delta = (x_0,\dots,x_N)$, $N \in \mathbb{N}$, of $[a,b]$, we consider the approximation of $\gamma$ by the polygon connecting the points $\gamma(x_0),\dots,\gamma(x_N)$, where we denote the polygon's length by
\[
p_\gamma(\Delta) := p_\gamma(x_0,\dots,x_N) := \sum_{k=0}^{N-1} \|\gamma(x_{k+1}) - \gamma(x_k)\|_2, \tag{F.3}
\]
using $\|\cdot\|_2$ to denote the 2-norm on $\mathbb{R}^n$, i.e. the Euclidean norm.

Definition F.5. Given $a,b \in \mathbb{R}$, $a \le b$, for each path $\gamma : [a,b] \to \mathbb{R}^n$, $n \in \mathbb{N}$, define
\[
l(\gamma) := \begin{cases} 0 & \text{for } a = b, \\ \sup\big\{ p_\gamma(\Delta) : \Delta \in \Pi[a,b] \big\} \in [0,\infty] & \text{for } a < b. \end{cases} \tag{F.4}
\]
The path $\gamma$ is called rectifiable with arc length $l(\gamma)$ if, and only if, $l(\gamma) < \infty$.

Proposition F.6. Let $a,b \in \mathbb{R}$, $a < b$, and let $\gamma : [a,b] \to \mathbb{R}^n$ be a path, $n \in \mathbb{N}$.

(a) If $\gamma$ is affine, i.e. there exist $y_0, y_1 \in \mathbb{R}^n$ such that
\[
\forall_{x\in[a,b]}\quad \gamma(x) = y_0 + x\,y_1, \tag{F.5}
\]
then $\gamma$ is rectifiable with arc length
\[
l(\gamma) = \|y_1\|_2\,(b-a) = \|\gamma(b) - \gamma(a)\|_2. \tag{F.6}
\]

(b) If the path $\gamma$ is $L$-Lipschitz with $L \ge 0$, then $\gamma$ is rectifiable and
\[
l(\gamma) \le L\,(b-a). \tag{F.7}
\]

(c) If the paths $\gamma, \tilde\gamma : [a,b] \to \mathbb{R}^n$ are both rectifiable, then
\[
|l(\gamma) - l(\tilde\gamma)| \le l(\gamma - \tilde\gamma). \tag{F.8}
\]

(d) For each $\xi \in [a,b]$, it holds that
\[
l(\gamma) = l(\gamma\restriction_{[a,\xi]}) + l(\gamma\restriction_{[\xi,b]}). \tag{F.9}
\]

Proof. (a): For each partition $(x_0,\dots,x_N)$, $N \in \mathbb{N}$, of $[a,b]$, we have
\[
p_\gamma(x_0,\dots,x_N) = \sum_{k=0}^{N-1} \|\gamma(x_{k+1}) - \gamma(x_k)\|_2 = \sum_{k=0}^{N-1} \|x_{k+1}\,y_1 - x_k\,y_1\|_2 = \|y_1\|_2 \sum_{k=0}^{N-1} (x_{k+1} - x_k) = \|y_1\|_2\,(b-a), \tag{F.10}
\]
proving (F.6).

(b): For each partition $(x_0,\dots,x_N)$, $N \in \mathbb{N}$, of $[a,b]$, we have
\[
p_\gamma(x_0,\dots,x_N) = \sum_{k=0}^{N-1} \|\gamma(x_{k+1}) - \gamma(x_k)\|_2 \le \sum_{k=0}^{N-1} L\,(x_{k+1} - x_k) = L\,(b-a), \tag{F.11}
\]
proving (F.7).

(c): For each partition $\Delta = (x_0,\dots,x_N)$, $N \in \mathbb{N}$, of $[a,b]$, we have
\begin{align}
p_\gamma(\Delta) - p_{\tilde\gamma}(\Delta) &= \sum_{k=0}^{N-1} \|\gamma(x_{k+1}) - \gamma(x_k)\|_2 - \sum_{k=0}^{N-1} \|\tilde\gamma(x_{k+1}) - \tilde\gamma(x_k)\|_2 \notag\\
&\le \sum_{k=0}^{N-1} \Big| \|\gamma(x_{k+1}) - \gamma(x_k)\|_2 - \|\tilde\gamma(x_{k+1}) - \tilde\gamma(x_k)\|_2 \Big| \notag\\
&\le \sum_{k=0}^{N-1} \big\| \big(\gamma(x_{k+1}) - \tilde\gamma(x_{k+1})\big) - \big(\gamma(x_k) - \tilde\gamma(x_k)\big) \big\|_2 = p_{\gamma-\tilde\gamma}(\Delta), \tag{F.12}
\end{align}
proving (F.8) (the last estimate in (F.12) holds true due to the inverse triangle inequality).

(d): If $\xi = a$ or $\xi = b$, then there is nothing to prove. Thus, assume $a < \xi < b$. If $\Delta_1 := (x_0,\dots,x_N)$ is a partition of $[a,\xi]$ and $\Delta_2 := (x_N,\dots,x_M)$ is a partition of $[\xi,b]$, $N,M \in \mathbb{N}$, $M > N$, then $\Delta := (x_0,\dots,x_M)$ is a partition of $[a,b]$. Moreover,
\[
p_\gamma(\Delta) = p_\gamma(\Delta_1) + p_\gamma(\Delta_2) \tag{F.13}
\]
is immediate from (F.3), implying
\[
l(\gamma) \ge l(\gamma\restriction_{[a,\xi]}) + l(\gamma\restriction_{[\xi,b]}). \tag{F.14}
\]
On the other hand, if $\Delta = (x_0,\dots,x_M)$, $M \in \mathbb{N}$, is a partition of $[a,b]$, then either there is $0 < N < M$ such that $\xi = x_N$, in which case (F.13) holds once again, where $\Delta_1$ and $\Delta_2$ are defined as before. Otherwise, there is $N \in \{0,\dots,M-1\}$ such that $x_N < \xi < x_{N+1}$ and, in this case, $\Delta_1 := (x_0,\dots,x_N,\xi)$ is a partition of $[a,\xi]$ and $\Delta_2 := (\xi,x_{N+1},\dots,x_M)$ is a partition of $[\xi,b]$. Moreover,
\begin{align}
p_\gamma(\Delta) &= \sum_{k=0}^{M-1} \|\gamma(x_{k+1}) - \gamma(x_k)\|_2 \notag\\
&= \sum_{k=0}^{N-1} \|\gamma(x_{k+1}) - \gamma(x_k)\|_2 + \|\gamma(x_{N+1}) - \gamma(x_N)\|_2 + \sum_{k=N+1}^{M-1} \|\gamma(x_{k+1}) - \gamma(x_k)\|_2 \notag\\
&\le p_\gamma(\Delta_1) + p_\gamma(\Delta_2), \tag{F.15}
\end{align}
showing
\[
l(\gamma) \le l(\gamma\restriction_{[a,\xi]}) + l(\gamma\restriction_{[\xi,b]}) \tag{F.16}
\]
and concluding the proof.


Theorem F.7. Given $a,b \in \mathbb{R}$, $a < b$, each continuously differentiable path $\gamma : [a,b] \to \mathbb{R}^n$, $n \in \mathbb{N}$, is rectifiable with arc length
\[
l(\gamma) = \int_a^b \|\gamma'(x)\|_2 \,\mathrm{d}x. \tag{F.17}
\]

Proof. Since $\gamma'$ is continuous, it follows from [Phi13b, Th. C.3] that $\gamma$ is Lipschitz continuous on $[a,b]$, i.e. $\gamma$ is rectifiable by Prop. F.6(b) above. To prove (F.17), according to the fundamental theorem of calculus [Phi13a, Th. 10.19(b)], it suffices to show the function
\[
\Lambda : [a,b] \to \mathbb{R}_0^+, \quad \Lambda(x) := l(\gamma\restriction_{[a,x]}), \tag{F.18}
\]
is differentiable with derivative $\Lambda'(x) = \|\gamma'(x)\|_2$. To this end, first note the continuous function $\gamma'$ is even uniformly continuous by Th. C.16. Thus,
\[
\forall_{\epsilon>0}\ \exists_{\delta>0}\ \forall_{x_0,x\in[a,b]}\quad \Big( |x_0 - x| < \delta \ \Rightarrow\ \|\gamma'(x_0) - \gamma'(x)\|_2 < \epsilon \Big). \tag{F.19}
\]
Fix $x_0 \in [a,b[$ and consider $x_1 \in\, ]a,b[$ such that $x_0 < x_1 < x_0 + \delta$. Define the affine path
\[
\beta : [x_0,x_1] \to \mathbb{R}^n, \quad \beta(x) := \gamma(x_0) + (x - x_0)\,\gamma'(x_0). \tag{F.20}
\]
According to Prop. F.6(a), we have
\[
l(\beta) = \|\gamma'(x_0)\|_2\,(x_1 - x_0). \tag{F.21}
\]
Moreover, for the path $\gamma - \beta$, we have
\[
\forall_{x\in[x_0,x_1]}\quad \|\gamma'(x) - \beta'(x)\|_2 = \|\gamma'(x) - \gamma'(x_0)\|_2 \overset{\text{(F.19)}}{<} \epsilon. \tag{F.22}
\]
Thus, it follows from [Phi13b, Th. C.3] that $\gamma - \beta$ is $\epsilon$-Lipschitz on $[x_0,x_1]$ and, then, Prop. F.6(b) yields
\[
l\big((\gamma-\beta)\restriction_{[x_0,x_1]}\big) \le \epsilon\,(x_1 - x_0). \tag{F.23}
\]
Prop. F.6(c), in turn, yields
\[
\big| l(\gamma\restriction_{[x_0,x_1]}) - l(\beta) \big| \le l\big((\gamma-\beta)\restriction_{[x_0,x_1]}\big) \le \epsilon\,(x_1 - x_0). \tag{F.24}
\]
Putting everything together, we obtain
\[
\bigg| \frac{l(\gamma\restriction_{[a,x_1]}) - l(\gamma\restriction_{[a,x_0]})}{x_1 - x_0} - \|\gamma'(x_0)\|_2 \bigg| \overset{\text{Prop. F.6(d), (F.21)}}{=} \bigg| \frac{l(\gamma\restriction_{[x_0,x_1]}) - l(\beta)}{x_1 - x_0} \bigg| \overset{\text{(F.24)}}{\le} \frac{\epsilon\,(x_1 - x_0)}{x_1 - x_0} = \epsilon, \tag{F.25}
\]
showing the function $\Lambda$ from (F.18) has a right-hand derivative at $x_0$ and the value of that right-hand derivative at $x_0$ is the desired $\|\gamma'(x_0)\|_2$. Repeating the above argument with $x_0, x_1 \in\, ]a,b]$ such that $x_0 - \delta < x_1 < x_0$ shows $\Lambda$ to have a left-hand derivative at each $x_0 \in\, ]a,b]$ with value $\|\gamma'(x_0)\|_2$, which completes the proof.
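As a quick numerical illustration (not part of the formal development), one can compare the polygon lengths $p_\gamma(\Delta)$ of (F.3) with the integral (F.17) for the unit circle, where both approach $2\pi$:

```python
# Sketch for Th. F.7: polygon lengths p_gamma converge to the arc-length
# integral (F.17); gamma parametrizes the unit circle, so l(gamma) = 2*pi.
import numpy as np

def gamma(x):                      # gamma : [0, 2*pi] -> R^2
    return np.array([np.cos(x), np.sin(x)])

for N in (4, 16, 64, 256):
    x = np.linspace(0.0, 2*np.pi, N + 1)
    pts = np.array([gamma(s) for s in x])
    p = np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1))  # p_gamma(Delta)
    print(f"N = {N:4d}:  p_gamma = {p:.6f}")
print(f"integral of ||gamma'||_2 = {2*np.pi:.6f}")
```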

Remark F.8. An example of a differentiable nonrectifiable path is given by (cf. [Wal02, Ex. 5.14.6])
\[
\gamma : [0,1] \to \mathbb{R}^2, \quad \gamma(x) := \begin{cases} \big( x,\ x^2 \cos\frac{\pi}{x^2} \big) & \text{for } x \ne 0, \\ (0,0) & \text{for } x = 0. \end{cases} \tag{F.26}
\]

G Operator Norms and Matrix Norms

For the present ODE class, we are mostly interested in linear maps from $\mathbb{K}^n$ into itself. However, introducing the relevant notions for linear maps between general normed vector spaces does not provide much additional difficulty, and, hopefully, even some extra clarity.

Definition G.1. Let $A : X \to Y$ be a linear map between two normed vector spaces $(X,\|\cdot\|_X)$ and $(Y,\|\cdot\|_Y)$ over $\mathbb{K}$. Then $A$ is called bounded if, and only if, $A$ maps bounded sets to bounded sets, i.e. if, and only if, $A(B)$ is a bounded subset of $Y$ for each bounded $B \subseteq X$. The vector space of all bounded linear maps between $X$ and $Y$ is denoted by $\mathcal{L}(X,Y)$.

Definition G.2. Let $A : X \to Y$ be a linear map between two normed vector spaces $(X,\|\cdot\|_X)$ and $(Y,\|\cdot\|_Y)$ over $\mathbb{K}$. The number
\[
\|A\| := \sup\Big\{ \frac{\|Ax\|_Y}{\|x\|_X} : x \in X,\ x \ne 0 \Big\} = \sup\big\{ \|Ax\|_Y : x \in X,\ \|x\|_X = 1 \big\} \in [0,\infty] \tag{G.1}
\]
is called the operator norm of $A$ induced by $\|\cdot\|_X$ and $\|\cdot\|_Y$ (strictly speaking, the term operator norm is only justified if the value is finite, but it is often convenient to use the term in the generalized way defined here).

In the special case, where $X = \mathbb{K}^n$, $Y = \mathbb{K}^m$, and $A$ is given via an $m \times n$ matrix, the operator norm is also called matrix norm.

From now on, the space index of a norm will usually be suppressed, i.e. we write just $\|\cdot\|$ instead of both $\|\cdot\|_X$ and $\|\cdot\|_Y$.

Theorem G.3. For a linear map $A : X \to Y$ between two normed vector spaces $(X,\|\cdot\|)$ and $(Y,\|\cdot\|)$ over $\mathbb{K}$, the following statements are equivalent:

(a) $A$ is bounded.

(b) $\|A\| < \infty$.

(c) $A$ is Lipschitz continuous.

(d) $A$ is continuous.

(e) There is $x_0 \in X$ such that $A$ is continuous at $x_0$.

Proof. Since every Lipschitz continuous map is continuous and since every continuous map is continuous at every point, (c) $\Rightarrow$ (d) $\Rightarrow$ (e) is clear.

(e) $\Rightarrow$ (c): Let $x_0 \in X$ be such that $A$ is continuous at $x_0$. Thus, for each $\epsilon > 0$, there is $\delta > 0$ such that $\|x - x_0\| < \delta$ implies $\|Ax - Ax_0\| < \epsilon$. As $A$ is linear, for each $x \in X$ with $\|x\| < \delta$, one has $\|Ax\| = \|A(x + x_0) - Ax_0\| < \epsilon$, due to $\|x + x_0 - x_0\| = \|x\| < \delta$. Moreover, one has $\|(\delta x)/2\| \le \delta/2 < \delta$ for each $x \in X$ with $\|x\| \le 1$. Letting $L := 2\epsilon/\delta$, this means that $\|Ax\| = \|A((\delta x)/2)\|/(\delta/2) < \epsilon/(\delta/2) = L$ for each $x \in X$ with $\|x\| \le 1$. Thus, for each $x,y \in X$ with $x \ne y$, one has
\[
\|Ax - Ay\| = \|A(x-y)\| = \|x-y\|\,\Big\| A\Big( \frac{x-y}{\|x-y\|} \Big) \Big\| < L\,\|x-y\|. \tag{G.2}
\]
Together with the fact that $\|Ax - Ay\| \le L\,\|x-y\|$ is trivially true for $x = y$, this shows that $A$ is Lipschitz continuous.

(c) $\Rightarrow$ (b): As $A$ is Lipschitz continuous, there exists $L \in \mathbb{R}_0^+$ such that $\|Ax - Ay\| \le L\,\|x-y\|$ for each $x,y \in X$. Considering the special case $y = 0$ and $\|x\| = 1$ yields $\|Ax\| \le L\,\|x\| = L$, implying $\|A\| \le L < \infty$.

(b) $\Rightarrow$ (c): Let $\|A\| < \infty$. We will show
\[
\|Ax - Ay\| \le \|A\|\,\|x-y\| \quad\text{for each } x,y \in X. \tag{G.3}
\]
For $x = y$, there is nothing to prove. Thus, let $x \ne y$. One computes
\[
\frac{\|Ax - Ay\|}{\|x-y\|} = \Big\| A\Big( \frac{x}{\|x-y\|} - \frac{y}{\|x-y\|} \Big) \Big\| \le \|A\|, \tag{G.4}
\]
as $\big\| \frac{x-y}{\|x-y\|} \big\| = 1$, thereby establishing (G.3).

(b) $\Rightarrow$ (a): Let $\|A\| < \infty$ and let $M \subseteq X$ be bounded. Then there is $r > 0$ such that $M \subseteq B_r(0)$. Moreover, for each $0 \ne x \in M$:
\[
\frac{\|Ax\|}{\|x\|} = \Big\| A\Big( \frac{x}{\|x\|} \Big) \Big\| \le \|A\| \tag{G.5}
\]
as $\big\| \frac{x}{\|x\|} \big\| = 1$. Thus $\|Ax\| \le \|A\|\,\|x\| \le r\,\|A\|$, showing that $A(M) \subseteq \bar B_{r\|A\|}(0)$. Thus, $A(M)$ is bounded, thereby establishing the case.

(a) $\Rightarrow$ (b): Since $A$ is bounded, it maps the bounded set $B_1(0) \subseteq X$ into some bounded subset of $Y$. Thus, there is $r > 0$ such that $A(B_1(0)) \subseteq B_r(0) \subseteq Y$. In particular, $\|Ax\| < r$ for each $x \in X$ satisfying $\|x\| = 1$, showing $\|A\| \le r < \infty$.


Remark G.4. For linear maps between finite-dimensional spaces, the equivalent properties of Th. G.3 always hold: Each linear map $A : \mathbb{K}^n \to \mathbb{K}^m$, $(n,m) \in \mathbb{N}^2$, is continuous (this follows, for example, from the fact that each such map is (trivially) differentiable, and every differentiable map is continuous). In particular, each linear map $A : \mathbb{K}^n \to \mathbb{K}^m$ has all the equivalent properties of Th. G.3.
Theorem G.5. Let $X$ and $Y$ be normed vector spaces over $\mathbb{K}$.

(a) The operator norm does, indeed, constitute a norm on the set of bounded linear maps $\mathcal{L}(X,Y)$.

(b) If $A \in \mathcal{L}(X,Y)$, then $\|A\|$ is the smallest Lipschitz constant for $A$, i.e. $\|A\|$ is a Lipschitz constant for $A$ and $\|Ax - Ay\| \le L\,\|x-y\|$ for each $x,y \in X$ implies $\|A\| \le L$.

Proof. (a): If $A = 0$, then, in particular, $Ax = 0$ for each $x \in X$ with $\|x\| = 1$, implying $\|A\| = 0$. Conversely, $\|A\| = 0$ implies $Ax = 0$ for each $x \in X$ with $\|x\| = 1$. But then $Ax = \|x\|\,A(x/\|x\|) = 0$ for every $0 \ne x \in X$, i.e. $A = 0$. Thus, the operator norm is positive definite. If $A \in \mathcal{L}(X,Y)$, $\lambda \in \mathbb{K}$, and $x \in X$, then
\[
\|(\lambda A)x\| = \|A(\lambda x)\| = \|\lambda\,(Ax)\| = |\lambda|\,\|Ax\|, \tag{G.6}
\]
yielding
\[
\|\lambda A\| = \sup\big\{ \|(\lambda A)x\| : x \in X,\ \|x\| = 1 \big\} = \sup\big\{ |\lambda|\,\|Ax\| : x \in X,\ \|x\| = 1 \big\} = |\lambda| \sup\big\{ \|Ax\| : x \in X,\ \|x\| = 1 \big\} = |\lambda|\,\|A\|, \tag{G.7}
\]
showing that the operator norm is homogeneous of degree 1. Finally, if $A,B \in \mathcal{L}(X,Y)$ and $x \in X$, then
\[
\|(A+B)x\| = \|Ax + Bx\| \le \|Ax\| + \|Bx\|, \tag{G.8}
\]
yielding
\begin{align}
\|A+B\| &= \sup\big\{ \|(A+B)x\| : x \in X,\ \|x\| = 1 \big\} \le \sup\big\{ \|Ax\| + \|Bx\| : x \in X,\ \|x\| = 1 \big\} \notag\\
&\le \sup\big\{ \|Ax\| : x \in X,\ \|x\| = 1 \big\} + \sup\big\{ \|Bx\| : x \in X,\ \|x\| = 1 \big\} = \|A\| + \|B\|, \tag{G.9}
\end{align}
showing that the operator norm also satisfies the triangle inequality, thereby completing the verification that it is, indeed, a norm.

(b): That $\|A\|$ is a Lipschitz constant for $A$ was already shown in the proof of (b) $\Rightarrow$ (c) of Th. G.3. Now let $L \in \mathbb{R}_0^+$ be such that $\|Ax - Ay\| \le L\,\|x-y\|$ for each $x,y \in X$. Specializing to $y = 0$ and $\|x\| = 1$ implies $\|Ax\| \le L\,\|x\| = L$, showing $\|A\| \le L$.


Remark G.6. Even though it is beyond the scope of the present class, let us mention as an outlook that one can show that $\mathcal{L}(X,Y)$ with the operator norm is a Banach space (i.e. a complete normed vector space) provided that $Y$ is a Banach space (even if $X$ is not a Banach space).

Lemma G.7. If $\operatorname{Id} : X \to X$, $\operatorname{Id}(x) := x$, is the identity map on a normed vector space $X \ne \{0\}$ over $\mathbb{K}$, then $\|\operatorname{Id}\| = 1$ (in particular, the operator norm of a unit matrix is always 1). Caveat: In principle, one can consider two different norms on $X$ simultaneously, and then the operator norm of the identity can differ from 1.

Proof. If $\|x\| = 1$, then $\|\operatorname{Id}(x)\| = \|x\| = 1$.

Lemma G.8. Let $X, Y, Z$ be normed vector spaces and consider linear maps $A \in \mathcal{L}(X,Y)$, $B \in \mathcal{L}(Y,Z)$. Then
\[
\|BA\| \le \|B\|\,\|A\|. \tag{G.10}
\]

Proof. Let $x \in X$ with $\|x\| = 1$. If $Ax = 0$, then $\|B(A(x))\| = 0 \le \|B\|\,\|A\|$. If $Ax \ne 0$, then one estimates
\[
\|B(Ax)\| = \|Ax\|\,\Big\| B\Big( \frac{Ax}{\|Ax\|} \Big) \Big\| \le \|A\|\,\|B\|, \tag{G.11}
\]
thereby establishing the case.

Example G.9. Let $m,n \in \mathbb{N}$ and let $A : \mathbb{R}^n \to \mathbb{R}^m$ be the linear map given by the $m \times n$ matrix $(a_{kl})_{(k,l)\in\{1,\dots,m\}\times\{1,\dots,n\}}$. Then
\[
\|A\|_\infty := \max\Big\{ \sum_{l=1}^n |a_{kl}| : k \in \{1,\dots,m\} \Big\} \tag{G.12a}
\]
is called the row sum norm of $A$, and
\[
\|A\|_1 := \max\Big\{ \sum_{k=1}^m |a_{kl}| : l \in \{1,\dots,n\} \Big\} \tag{G.12b}
\]
is called the column sum norm of $A$. It is an exercise to show that $\|A\|_\infty$ is the operator norm induced if $\mathbb{R}^n$ and $\mathbb{R}^m$ are endowed with the $\infty$-norm, and $\|A\|_1$ is the operator norm induced if $\mathbb{R}^n$ and $\mathbb{R}^m$ are endowed with the 1-norm.
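A small numerical sketch can accompany this exercise (it does not replace it): for the $\infty$-norm, the supremum in (G.1) is attained at a sign vector, so enumerating all $x \in \{-1,1\}^n$ recovers the row sum norm for a random matrix:

```python
# Sketch for Ex. G.9: the row sum norm (G.12a) equals the operator norm
# induced by the infinity-norm; for that norm the supremum in (G.1) is
# attained at a sign vector, so it suffices to test all x in {-1, +1}^n.
import numpy as np
from itertools import product

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 4))

row_sum_norm = np.max(np.abs(A).sum(axis=1))          # (G.12a)
best = max(np.max(np.abs(A @ np.array(signs)))
           for signs in product([-1.0, 1.0], repeat=4))

print(f"row sum norm   = {row_sum_norm:.6f}")
print(f"max ||Ax||_inf = {best:.6f}")                  # the two values agree
```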

H The Vandermonde Determinant

Theorem H.1. Let $n \in \mathbb{N}$ and $\lambda_0, \lambda_1, \dots, \lambda_n \in \mathbb{C}$. Moreover, let
\[
V := \begin{pmatrix} 1 & \lambda_0 & \dots & \lambda_0^n \\ 1 & \lambda_1 & \dots & \lambda_1^n \\ \vdots & \vdots & & \vdots \\ 1 & \lambda_n & \dots & \lambda_n^n \end{pmatrix} \tag{H.1}
\]
be the corresponding Vandermonde matrix. Then its determinant, the so-called Vandermonde determinant, is given by
\[
\det(V) = \prod_{\substack{k,l=0 \\ k>l}}^{n} (\lambda_k - \lambda_l). \tag{H.2}
\]

Proof. The proof can be conducted by induction with respect to $n$: For $n = 1$, we have
\[
\det(V) = \begin{vmatrix} 1 & \lambda_0 \\ 1 & \lambda_1 \end{vmatrix} = \lambda_1 - \lambda_0 = \prod_{\substack{k,l=0 \\ k>l}}^{1} (\lambda_k - \lambda_l), \tag{H.3}
\]
showing (H.2) holds for $n = 1$. Now let $n > 1$. We know from Linear Algebra that the value of a determinant does not change if we add a multiple of a column to a different column. Adding the $(-\lambda_0)$-fold of the $n$th column to the $(n+1)$st column, we obtain in the $(n+1)$st column
\[
\begin{pmatrix} 0 \\ \lambda_1^n - \lambda_1^{n-1}\lambda_0 \\ \vdots \\ \lambda_n^n - \lambda_n^{n-1}\lambda_0 \end{pmatrix}. \tag{H.4}
\]
Next, one adds the $(-\lambda_0)$-fold of the $(n-1)$st column to the $n$th column, and, successively, the $(-\lambda_0)$-fold of the $m$th column to the $(m+1)$st column. One finishes, in the $n$th step, by adding the $(-\lambda_0)$-fold of the first column to the second column, obtaining
\[
\det(V) = \begin{vmatrix} 1 & \lambda_0 & \dots & \lambda_0^n \\ 1 & \lambda_1 & \dots & \lambda_1^n \\ \vdots & \vdots & & \vdots \\ 1 & \lambda_n & \dots & \lambda_n^n \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 & \dots & 0 \\ 1 & \lambda_1 - \lambda_0 & \lambda_1^2 - \lambda_1\lambda_0 & \dots & \lambda_1^n - \lambda_1^{n-1}\lambda_0 \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \lambda_n - \lambda_0 & \lambda_n^2 - \lambda_n\lambda_0 & \dots & \lambda_n^n - \lambda_n^{n-1}\lambda_0 \end{vmatrix}. \tag{H.5}
\]
Applying the rule for determinants of block matrices to (H.5) yields
\[
\det(V) = 1 \cdot \begin{vmatrix} \lambda_1 - \lambda_0 & \lambda_1^2 - \lambda_1\lambda_0 & \dots & \lambda_1^n - \lambda_1^{n-1}\lambda_0 \\ \vdots & \vdots & & \vdots \\ \lambda_n - \lambda_0 & \lambda_n^2 - \lambda_n\lambda_0 & \dots & \lambda_n^n - \lambda_n^{n-1}\lambda_0 \end{vmatrix}. \tag{H.6}
\]
As we also know from Linear Algebra that determinants are linear in each row, for each $k$, we can factor out $(\lambda_k - \lambda_0)$ from the $k$th row of (H.6), arriving at
\[
\det(V) = \prod_{k=1}^n (\lambda_k - \lambda_0) \cdot \begin{vmatrix} 1 & \lambda_1 & \dots & \lambda_1^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & \lambda_n & \dots & \lambda_n^{n-1} \end{vmatrix}. \tag{H.7}
\]
However, the determinant in (H.7) is precisely the Vandermonde determinant of the $n$ numbers $\lambda_1,\dots,\lambda_n$, which is given according to the induction hypothesis, implying
\[
\det(V) = \prod_{k=1}^n (\lambda_k - \lambda_0) \prod_{\substack{k,l=1 \\ k>l}}^{n} (\lambda_k - \lambda_l) = \prod_{\substack{k,l=0 \\ k>l}}^{n} (\lambda_k - \lambda_l), \tag{H.8}
\]
completing the induction proof of (H.2).
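As an aside, (H.2) is easy to verify numerically for concrete values; the following Python sketch (illustrative only) compares numpy's determinant of the Vandermonde matrix with the product of differences:

```python
# Numerical check of (H.2) for random lambda_0, ..., lambda_n (n = 4).
import numpy as np

rng = np.random.default_rng(2)
lam = rng.normal(size=5)

V = np.vander(lam, increasing=True)      # rows (1, lam_k, ..., lam_k^n)
det_V = np.linalg.det(V)

prod = 1.0
for k in range(len(lam)):
    for l in range(k):                   # product over k > l
        prod *= lam[k] - lam[l]

print(f"det(V)                = {det_V:.10f}")
print(f"prod (lam_k - lam_l)  = {prod:.10f}")   # the two values agree
```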

I Matrix-Valued Functions

Notation I.1. Given $m,n \in \mathbb{N}$, let $\mathcal{M}(m,n,\mathbb{K})$ denote the set of $m \times n$ matrices over $\mathbb{K}$.

I.1 Product Rule

Proposition I.2. Let $I \subseteq \mathbb{R}$ be a nontrivial interval, let $m,n,l \in \mathbb{N}$, and suppose
\[
A : I \to \mathcal{M}(m,n,\mathbb{K}), \quad A(x) = \big( a_{\mu\kappa}(x) \big), \tag{I.1a}
\]
\[
B : I \to \mathcal{M}(n,l,\mathbb{K}), \quad B(x) = \big( b_{\kappa\nu}(x) \big), \tag{I.1b}
\]
are differentiable. Then
\[
C : I \to \mathcal{M}(m,l,\mathbb{K}), \quad C(x) := A(x)B(x), \tag{I.2}
\]
is differentiable, and one has the product rule
\[
\forall_{x\in I}\quad C'(x) = A'(x)B(x) + A(x)B'(x). \tag{I.3}
\]

Proof. Writing $C(x) = \big( c_{\mu\nu}(x) \big)$ and using the one-dimensional product rule together with the definition of matrix multiplication, one computes, for each $(\mu,\nu) \in \{1,\dots,m\} \times \{1,\dots,l\}$,
\[
c_{\mu\nu}'(x) = \Big( \sum_{\kappa=1}^n a_{\mu\kappa}(x)\,b_{\kappa\nu}(x) \Big)' = \sum_{\kappa=1}^n a_{\mu\kappa}'(x)\,b_{\kappa\nu}(x) + \sum_{\kappa=1}^n a_{\mu\kappa}(x)\,b_{\kappa\nu}'(x) = \big( A'(x)B(x) \big)_{\mu\nu} + \big( A(x)B'(x) \big)_{\mu\nu}, \tag{I.4}
\]
proving the proposition.

I.2 Integration and Matrix Multiplication Commute

Proposition I.3. Let $m,n,p \in \mathbb{N}$, let $I \subseteq \mathbb{R}$ be measurable (e.g. an interval), let $A : I \to \mathcal{M}(m,n,\mathbb{K})$, $x \mapsto A(x) = (a_{kl}(x))$, be integrable (i.e. all $\operatorname{Re} a_{kl}, \operatorname{Im} a_{kl} : I \to \mathbb{R}$ are integrable).

(a) If $B = (b_{jk}) \in \mathcal{M}(p,m,\mathbb{K})$, then
\[
B \int_I A(x)\,\mathrm{d}x = \int_I B\,A(x)\,\mathrm{d}x. \tag{I.5}
\]

(b) If $B = (b_{lj}) \in \mathcal{M}(n,p,\mathbb{K})$, then
\[
\Big( \int_I A(x)\,\mathrm{d}x \Big) B = \int_I A(x)\,B\,\mathrm{d}x. \tag{I.6}
\]

Proof. (a): One computes, for each $(j,l) \in \{1,\dots,p\} \times \{1,\dots,n\}$,
\[
\Big( B \int_I A(x)\,\mathrm{d}x \Big)_{jl} = \sum_{k=1}^m b_{jk} \int_I a_{kl}(x)\,\mathrm{d}x = \int_I \Big( \sum_{k=1}^m b_{jk}\,a_{kl}(x) \Big)\,\mathrm{d}x = \int_I \big( B\,A(x) \big)_{jl}\,\mathrm{d}x = \Big( \int_I B\,A(x)\,\mathrm{d}x \Big)_{jl}, \tag{I.7}
\]
proving (I.5).

(b): One computes, for each $(k,j) \in \{1,\dots,m\} \times \{1,\dots,p\}$,
\[
\bigg( \Big( \int_I A(x)\,\mathrm{d}x \Big) B \bigg)_{kj} = \sum_{l=1}^n \Big( \int_I a_{kl}(x)\,\mathrm{d}x \Big) b_{lj} = \int_I \Big( \sum_{l=1}^n a_{kl}(x)\,b_{lj} \Big)\,\mathrm{d}x = \int_I \big( A(x)\,B \big)_{kj}\,\mathrm{d}x = \Big( \int_I A(x)\,B\,\mathrm{d}x \Big)_{kj}, \tag{I.8}
\]
proving (I.6).
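As an aside, the product rule (I.3) admits a quick finite-difference sanity check; the following Python sketch uses made-up differentiable matrix functions $A$ and $B$:

```python
# Finite-difference check of the product rule (I.3) for matrix-valued
# functions (a numerical sketch, not a proof).
import numpy as np

def A(x):                                   # A : R -> M(2, 2, R), illustrative
    return np.array([[np.sin(x), x**2], [1.0, np.exp(x)]])

def B(x):                                   # B : R -> M(2, 2, R), illustrative
    return np.array([[np.cos(x), 0.0], [x, 1.0]])

def num_diff(F, x, h=1e-6):                 # central difference quotient
    return (F(x + h) - F(x - h)) / (2 * h)

x0 = 0.7
C_prime = num_diff(lambda x: A(x) @ B(x), x0)
rhs = num_diff(A, x0) @ B(x0) + A(x0) @ num_diff(B, x0)
print(np.max(np.abs(C_prime - rhs)))        # small: (I.3) holds numerically
```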

J Autonomous ODE

J.1 Equivalence Between Autonomous and Nonautonomous ODE

Theorem J.1. Let $G \subseteq \mathbb{R} \times \mathbb{K}^n$, $n \in \mathbb{N}$, and $f : G \to \mathbb{K}^n$. Then the nonautonomous ODE
\[
y' = f(x,y) \tag{J.1}
\]
is equivalent to the autonomous ODE
\[
y' = g(y), \tag{J.2}
\]
where
\[
g : G \to \mathbb{K}^{n+1}, \quad g(y_1,\dots,y_{n+1}) := \big( 1,\ f(y_1, y_2,\dots,y_{n+1}) \big), \tag{J.3}
\]
in the following sense:

(a) If $\phi : I \to \mathbb{K}^n$ is a solution to (J.1), then $\psi : I \to \mathbb{K}^{n+1}$, $\psi(x) := (x, \phi(x))$, is a solution to (J.2).

(b) If $\psi : I \to \mathbb{K}^{n+1}$ is a solution to (J.2) with the property
\[
\exists_{x_0\in I}\quad \psi_1(x_0) = x_0, \tag{J.4}
\]
then $\phi : I \to \mathbb{K}^n$, $\phi(x) := (\psi_2(x),\dots,\psi_{n+1}(x))$, is a solution to (J.1).

Proof. (a): If $\phi : I \to \mathbb{K}^n$ is a solution to (J.1) and $\psi : I \to \mathbb{K}^{n+1}$, $\psi(x) := (x, \phi(x))$, then
\[
\forall_{x\in I}\quad \psi'(x) = \big( 1, \phi'(x) \big) = \big( 1, f(x,\phi(x)) \big) = g\big( x, \phi(x) \big) = g\big( \psi(x) \big), \tag{J.5}
\]
showing $\psi$ is a solution to (J.2).

(b): If $\psi : I \to \mathbb{K}^{n+1}$ is a solution to (J.2) with the property (J.4) and $\phi : I \to \mathbb{K}^n$, $\phi(x) := (\psi_2(x),\dots,\psi_{n+1}(x))$, then $\psi_1' \equiv 1$ together with (J.4) implies $\psi_1(x) = x$ for each $x \in I$ and, thus,
\[
\forall_{x\in I}\quad \phi'(x) = \big( \psi_2'(x),\dots,\psi_{n+1}'(x) \big) = f\big( x, \psi_2(x),\dots,\psi_{n+1}(x) \big) = f\big( x, \phi(x) \big), \tag{J.6}
\]
showing $\phi$ is a solution to (J.1).

While Th. J.1 is somewhat striking and of theoretical interest, it has few useful applications in practice, due to the unbounded first component of solutions to (J.2) (cf. Rem. 5.2).
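The correspondence of Th. J.1 is easy to check numerically; the following Python sketch (illustrative only, with the made-up right-hand side $f(x,y) = y - x$) integrates the nonautonomous IVP and its autonomized version and compares the results:

```python
# Sketch for Th. J.1: solve y' = f(x, y) directly and via the autonomized
# system (J.2)/(J.3); f(x, y) = y - x is an illustrative choice.
import numpy as np
from scipy.integrate import solve_ivp

def f(x, y):
    return y - x

def g(_, z):                   # autonomous right-hand side, z = (y1, y2)
    return [1.0, f(z[0], z[1])]

x0, y0, x1 = 0.0, 0.5, 2.0
sol1 = solve_ivp(lambda x, y: [f(x, y[0])], (x0, x1), [y0], rtol=1e-10)
sol2 = solve_ivp(g, (x0, x1), [x0, y0], rtol=1e-10)   # psi_1(x0) = x0, cf. (J.4)

print(sol1.y[0, -1], sol2.y[1, -1])   # the two values agree up to tolerance
```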

J.2 Integral for ODE with Discontinuous Right-Hand Side

The following Example J.2, provided by Anton Sporrer, shows Lem. 5.19 becomes false if the hypothesis that every initial value problem for the considered ODE $y' = f(y)$ has at least one solution is omitted:

Example J.2. Consider
\[
f : \mathbb{R} \to \mathbb{R}, \quad f(y) := \begin{cases} 0 & \text{for } y \in \mathbb{Q}, \\ 1 & \text{for } y \in \mathbb{R}\setminus\mathbb{Q}, \end{cases} \tag{J.7}
\]
and the autonomous ODE $y' = f(y)$. If $(x_0,y_0) \in \mathbb{R} \times \mathbb{Q}$, then the initial value problem $y(x_0) = y_0$ has the unique solution $\phi : \mathbb{R} \to \mathbb{R}$, $\phi \equiv y_0 \in \mathbb{Q}$. However, if $(x_0,y_0) \in \mathbb{R} \times (\mathbb{R}\setminus\mathbb{Q})$, then the initial value problem $y(x_0) = y_0$ has no solution. Since $y' = f(y)$ has only constant solutions, every function $E : \mathbb{R} \to \mathbb{R}$ is an integral for this ODE according to Def. 5.18. However, not every differentiable function $E : \mathbb{R} \to \mathbb{R}$ satisfies (5.15): For example, if $E(y) := y$, then $E' \equiv 1$, i.e.
\[
\forall_{y\in\mathbb{R}\setminus\mathbb{Q}}\quad E'(y)\,f(y) = 1 \ne 0, \tag{J.8}
\]
showing that Lem. 5.19 does not hold for $y' = f(y)$ with $f$ according to (J.7).

K Polar Coordinates

Recall the following functions, used in polar coordinates of the plane:
\[
r : \mathbb{R}^2\setminus\{(0,0)\} \to \mathbb{R}^+, \quad r(y_1,y_2) := \sqrt{y_1^2 + y_2^2}, \tag{K.1a}
\]
\[
\theta : \mathbb{R}^2\setminus\{(0,0)\} \to [0,2\pi[, \quad \theta(y_1,y_2) := \begin{cases} 0 & \text{for } y_2 = 0,\ y_1 > 0, \\ \operatorname{arccot}(y_1/y_2) & \text{for } y_2 > 0, \\ \pi & \text{for } y_2 = 0,\ y_1 < 0, \\ \pi + \operatorname{arccot}(y_1/y_2) & \text{for } y_2 < 0. \end{cases} \tag{K.1b}
\]

Theorem K.1. Consider $f : \mathbb{R}^2\setminus\{(0,0)\} \to \mathbb{R}^2$ and the corresponding $\mathbb{R}^2$-valued autonomous ODE
\begin{align}
y_1' &= f_1(y_1,y_2), \tag{K.2a}\\
y_2' &= f_2(y_1,y_2), \tag{K.2b}
\end{align}
together with its polar coordinate version
\begin{align}
r' &= g_1(r,\theta), \tag{K.3a}\\
\theta' &= g_2(r,\theta), \tag{K.3b}
\end{align}
where $g : \mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}^2$,
\[
g_1 : \mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}, \quad g_1(r,\theta) := f_1(r\cos\theta,\, r\sin\theta)\,\cos\theta + f_2(r\cos\theta,\, r\sin\theta)\,\sin\theta, \tag{K.4a}
\]
\[
g_2 : \mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}, \quad g_2(r,\theta) := \frac1r\,\Big( f_2(r\cos\theta,\, r\sin\theta)\,\cos\theta - f_1(r\cos\theta,\, r\sin\theta)\,\sin\theta \Big). \tag{K.4b}
\]
Let $\psi : I \to \mathbb{R}^+ \times \mathbb{R}$ be a solution to (K.3).

(a) Then
\[
\phi : I \to \mathbb{R}^2, \quad \phi(x) := \big( \psi_1(x)\,\cos\psi_2(x),\ \psi_1(x)\,\sin\psi_2(x) \big), \tag{K.5}
\]
is a solution to (K.2).

(b) If $\psi$ satisfies the initial condition
\begin{align}
\psi_1(0) &= \rho, \quad \rho \in \mathbb{R}^+, \tag{K.6a}\\
\psi_2(0) &= \alpha, \quad \alpha \in \mathbb{R}, \tag{K.6b}
\end{align}
and if
\begin{align}
\eta_1 &= \rho\,\cos\alpha, \tag{K.7a}\\
\eta_2 &= \rho\,\sin\alpha, \tag{K.7b}
\end{align}
then $\phi$ satisfies the initial condition
\begin{align}
\phi_1(0) &= \eta_1, \tag{K.8a}\\
\phi_2(0) &= \eta_2. \tag{K.8b}
\end{align}
Note that $\rho > 0$ implies $(\eta_1,\eta_2) \ne (0,0)$, and that, for $(\rho,\alpha) \in \mathbb{R}^+ \times [0,2\pi[$, (K.7) is equivalent to
\begin{align}
r(\eta_1,\eta_2) &= \rho, \tag{K.9a}\\
\theta(\eta_1,\eta_2) &= \alpha \tag{K.9b}
\end{align}
(cf. the corresponding computations in [Phi13b, Ex. 4.19]).

Proof. Exercise.

Example K.2. Consider the autonomous ODE (K.2) with
\[
f_1 : \mathbb{R}^2\setminus\{(0,0)\} \to \mathbb{R}, \quad f_1(y_1,y_2) := y_1\,\big( 1 - r(y_1,y_2) \big) - \frac{y_2\,\big( r(y_1,y_2) - y_1 \big)}{2\,r(y_1,y_2)}, \tag{K.10a}
\]
\[
f_2 : \mathbb{R}^2\setminus\{(0,0)\} \to \mathbb{R}, \quad f_2(y_1,y_2) := y_2\,\big( 1 - r(y_1,y_2) \big) + \frac{y_1\,\big( r(y_1,y_2) - y_1 \big)}{2\,r(y_1,y_2)}, \tag{K.10b}
\]
where $r$ is the radial polar coordinate function as defined in (K.1a). Using $g : \mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}^2$ as defined in (K.4), one obtains, for each $(\rho,\alpha) \in \mathbb{R}^+ \times \mathbb{R}$,
\begin{align}
g_1(\rho,\alpha) &= \rho\cos\alpha\,(1-\rho)\cos\alpha - \frac{\rho\sin\alpha\,(\rho - \rho\cos\alpha)\,\cos\alpha}{2\rho} + \rho\sin\alpha\,(1-\rho)\sin\alpha + \frac{\rho\cos\alpha\,(\rho - \rho\cos\alpha)\,\sin\alpha}{2\rho} \notag\\
&= \rho\,(1-\rho), \tag{K.11a}\\
g_2(\rho,\alpha) &= \sin\alpha\,(1-\rho)\cos\alpha + \frac{\cos\alpha\,(\rho - \rho\cos\alpha)\,\cos\alpha}{2\rho} - \cos\alpha\,(1-\rho)\sin\alpha + \frac{\sin\alpha\,(\rho - \rho\cos\alpha)\,\sin\alpha}{2\rho} \notag\\
&= \frac{1 - \cos\alpha}{2}, \tag{K.11b}
\end{align}
such that the autonomous ODE
\begin{align}
r' &= r\,(1-r), \tag{K.12a}\\
\theta' &= \frac{1 - \cos\theta}{2} \overset{\text{[Phi13a, (D.1c)]}}{=} \sin^2\frac\theta2 \tag{K.12b}
\end{align}
is the polar coordinate version of (K.2) as defined in Th. K.1.

Claim 1. The general solution to
\[
r' = p(r), \quad p : \mathbb{R}^+ \to \mathbb{R}, \quad p(r) := r\,(1-r), \tag{K.13}
\]
is
\[
Y_p : D_{p,0} \to \mathbb{R}^+, \quad Y_p(x,\rho) := \frac{\rho}{\rho + (1-\rho)\,e^{-x}}, \tag{K.14}
\]
where
\[
D_{p,0} = \big( \mathbb{R}\, \times\, ]0,1] \big) \cup \Big\{ (x,\rho) : \rho > 1,\ x \in \Big] \ln\frac{\rho-1}{\rho},\ \infty \Big[ \Big\}. \tag{K.15}
\]

Proof. The initial condition is satisfied, since
\[
Y_p(0,\rho) = \frac{\rho}{\rho + (1-\rho)} = \rho. \tag{K.16}
\]
The ODE is satisfied, since
\[
\forall_{(x,\rho)\in D_{p,0}}\quad \partial_x Y_p(x,\rho) = \frac{\rho\,(1-\rho)\,e^{-x}}{\big( \rho + (1-\rho)\,e^{-x} \big)^2} = Y_p(x,\rho)\,\big( 1 - Y_p(x,\rho) \big). \tag{K.17}
\]
To verify the form of $D_{p,0}$, we note that the denominator in (K.14) is positive for each $x \in \mathbb{R}$ if $0 < \rho \le 1$. If $\rho > 1$, then the function $a : \mathbb{R} \to \mathbb{R}$, $a(x) := \rho + (1-\rho)\,e^{-x}$, is strictly increasing (note $a'(x) = (\rho-1)\,e^{-x} > 0$) and has a unique zero at $x = \ln\frac{\rho-1}{\rho}$. Thus $\lim_{x\downarrow\ln\frac{\rho-1}{\rho}} Y_p(x,\rho) = \infty$, proving the maximality of $Y_p(\cdot,\rho)$.
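As an aside, the closed form (K.14) for the logistic equation is easy to cross-check against a numerical solver; the following Python sketch is illustrative only:

```python
# Sketch for Claim 1: compare the closed form (K.14) for r' = r(1 - r)
# with a numerical solve of the same initial value problem.
import numpy as np
from scipy.integrate import solve_ivp

def Yp(x, rho):
    return rho / (rho + (1.0 - rho) * np.exp(-x))

rho = 0.2
xs = np.linspace(0.0, 6.0, 7)
sol = solve_ivp(lambda x, r: r * (1.0 - r), (0.0, 6.0), [rho],
                t_eval=xs, rtol=1e-10, atol=1e-12)
print(np.max(np.abs(sol.y[0] - Yp(xs, rho))))   # tiny: the formulas agree
```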
Claim 2. Letting, for each $k \in \mathbb{Z}$,
\[
x_0(\tilde\alpha) := \frac{2\cos\tilde\alpha + 2}{\sin\tilde\alpha} \quad\text{for } \tilde\alpha \in \mathbb{R}\setminus\{l\pi : l \in \mathbb{Z}\}, \tag{K.18a}
\]
\[
R_k := \,]0,\pi[\, + 2k\pi, \tag{K.18b}
\]
\[
L_k := \,]-\pi,0[\, + 2k\pi, \tag{K.18c}
\]
\[
A_0 := \mathbb{R} \times \{2k\pi : k \in \mathbb{Z}\}, \tag{K.18d}
\]
\[
A_{0,k} := \mathbb{R}^- \times \{\pi + 2k\pi\}, \tag{K.18e}
\]
\[
B_{0,k} := \mathbb{R}^+ \times \{\pi + 2k\pi\}, \tag{K.18f}
\]
\[
A_{1,k} := \big\{ (x,\tilde\alpha) \in \mathbb{R}^2 : x \in\, ]-\infty, x_0(\tilde\alpha)[,\ \tilde\alpha \in R_k \big\}, \tag{K.18g}
\]
\[
A_{2,k} := \big\{ (x,\tilde\alpha) \in \mathbb{R}^2 : x \in\, ]x_0(\tilde\alpha), \infty[,\ \tilde\alpha \in L_k \big\}, \tag{K.18h}
\]
\[
B_{1,k} := \big\{ (x,\tilde\alpha) \in \mathbb{R}^2 : x \in\, ]x_0(\tilde\alpha), \infty[,\ \tilde\alpha \in R_k \big\}, \tag{K.18i}
\]
\[
B_{2,k} := \big\{ (x,\tilde\alpha) \in \mathbb{R}^2 : x \in\, ]-\infty, x_0(\tilde\alpha)[,\ \tilde\alpha \in L_k \big\}, \tag{K.18j}
\]
\[
C_{1,k} := \big\{ (x,\tilde\alpha) \in \mathbb{R}^2 : x = x_0(\tilde\alpha),\ \tilde\alpha \in R_k \big\}, \tag{K.18k}
\]
\[
C_{2,k} := \big\{ (x,\tilde\alpha) \in \mathbb{R}^2 : x = x_0(\tilde\alpha),\ \tilde\alpha \in L_k \big\}, \tag{K.18l}
\]
the general solution to
\[
\theta' = q(\theta), \quad q : \mathbb{R} \to \mathbb{R}, \quad q(\theta) := \frac{1 - \cos\theta}{2} = \sin^2\frac\theta2, \tag{K.19}
\]
is
\[
Y_q : \mathbb{R}^2 \to \mathbb{R}, \quad Y_q(x,\alpha) := \begin{cases}
\alpha & \text{for } (x,\alpha) \in A_0, \\
2k\pi + 2\arctan\frac{-2}{x} & \text{for } (x,\alpha) \in A_{0,k},\ k \in \mathbb{Z}, \\
2(k+1)\pi + 2\arctan\frac{-2}{x} & \text{for } (x,\alpha) \in B_{0,k},\ k \in \mathbb{Z}, \\
\pi + 2k\pi & \text{for } (x,\alpha) = (0, \pi+2k\pi),\ k \in \mathbb{Z}, \\
2k\pi + 2\arctan\frac{2\sin\alpha}{2\cos\alpha - x\sin\alpha + 2} & \text{for } (x,\alpha) \in A_{1,k} \cup A_{2,k},\ k \in \mathbb{Z}, \\
\pi + 2k\pi & \text{for } (x,\alpha) \in C_{1,k},\ k \in \mathbb{Z}, \\
2(k+1)\pi + 2\arctan\frac{2\sin\alpha}{2\cos\alpha - x\sin\alpha + 2} & \text{for } (x,\alpha) \in B_{1,k},\ k \in \mathbb{Z}, \\
-\pi + 2k\pi & \text{for } (x,\alpha) \in C_{2,k},\ k \in \mathbb{Z}, \\
2(k-1)\pi + 2\arctan\frac{2\sin\alpha}{2\cos\alpha - x\sin\alpha + 2} & \text{for } (x,\alpha) \in B_{2,k},\ k \in \mathbb{Z}.
\end{cases} \tag{K.20}
\]

Proof. One observes that $Y_q$ is well-defined, since
\[
\mathbb{R}^2 = A_0 \cup \bigcup_{k\in\mathbb{Z}} \Big( \{(0,\pi+2k\pi)\} \cup A_{0,k} \cup B_{0,k} \cup A_{1,k} \cup A_{2,k} \cup B_{1,k} \cup B_{2,k} \cup C_{1,k} \cup C_{2,k} \Big) \tag{K.21}
\]
and, introducing the auxiliary function
\[
\sigma : \mathbb{R}^2 \to \mathbb{R}, \quad \sigma(x,\tilde\alpha) := 2\cos\tilde\alpha - x\sin\tilde\alpha + 2, \tag{K.22}
\]
one has
\[
\sigma(x,\tilde\alpha) \ne 0 \quad\text{for each } (x,\tilde\alpha) \in A_{1,k} \cup A_{2,k} \cup B_{1,k} \cup B_{2,k},\ k \in \mathbb{Z}. \tag{K.23}
\]
It remains to show that, for each $\alpha \in \mathbb{R}$, the function $x \mapsto Y_q(x,\alpha)$ is differentiable on $\mathbb{R}$, satisfying
\[
\forall_{x\in\mathbb{R}}\quad \partial_x Y_q(x,\alpha) = \frac{1 - \cos Y_q(x,\alpha)}{2}, \tag{K.24}
\]
and the initial condition
\[
Y_q(0,\alpha) = \alpha. \tag{K.25}
\]
The initial condition (K.25) is satisfied, since
\[
\forall_{\alpha\in\{k\pi :\, k\in\mathbb{Z}\}}\quad Y_q(0,\alpha) = \alpha, \tag{K.26a}
\]
\[
\forall_{\alpha\in R_k \cup L_k,\ k\in\mathbb{Z}}\quad Y_q(0,\alpha) = 2k\pi + 2\arctan\frac{2\sin\alpha}{2\cos\alpha + 2} \overset{\text{[Phi13a, (D.1d)]}}{=} 2k\pi + 2\arctan\tan\frac\alpha2 = 2k\pi + 2\Big( \frac\alpha2 - k\pi \Big) = \alpha. \tag{K.26b}
\]
Next, we show that, for each $\alpha \in \mathbb{R}$, the function $x \mapsto Y_q(x,\alpha)$ is differentiable on $\mathbb{R}$ and satisfies (K.24): For $\alpha \in \{2k\pi : k \in \mathbb{Z}\}$, $x \mapsto Y_q(x,\alpha)$ is constant, i.e. differentiability is clear, and
\[
\forall_{x\in\mathbb{R}}\quad \frac{1 - \cos Y_q(x,\alpha)}{2} = \frac{1 - \cos(2k\pi)}{2} = 0 = \partial_x Y_q(x,\alpha) \tag{K.27}
\]
proves (K.24).

For each $\alpha \in \{(2k+1)\pi : k \in \mathbb{Z}\}$, differentiability is clear in each $x \in \mathbb{R}\setminus\{0\}$. Moreover,
\[
\forall_{x\in\mathbb{R}\setminus\{0\}}\quad \Big( 2\arctan\frac{-2}{x} \Big)' = 2\,\frac{\frac{2}{x^2}}{1 + \frac{4}{x^2}} = \frac{4}{4+x^2}, \tag{K.28}
\]
and, thus, for each $x \in \mathbb{R}\setminus\{0\}$,
\[
\frac{1 - \cos Y_q(x,\alpha)}{2} = \frac12 - \frac12\,\cos\Big( 2\arctan\frac{-2}{x} \Big) \overset{\text{[Phi13a, (D.1e)]}}{=} \frac12 - \frac12\,\frac{1 - \frac{4}{x^2}}{1 + \frac{4}{x^2}} = \frac12 - \frac12\,\frac{x^2 - 4}{x^2 + 4} = \frac{4}{4+x^2} \overset{\text{(K.28)}}{=} \partial_x Y_q(x,\alpha), \tag{K.29}
\]
proving (K.24) for each $x \in \mathbb{R}\setminus\{0\}$. It remains to consider $x = 0$. One has, by L'Hôpital's rule [Phi13a, Th. 9.23(a)],
\[
\lim_{x\uparrow 0} \frac{Y_q(0,\alpha) - Y_q(x,\alpha)}{0 - x} = \lim_{x\uparrow 0} \frac{\pi + 2k\pi - \big( 2k\pi + 2\arctan\frac{-2}{x} \big)}{-x} \overset{\text{[Phi13a, (9.29)], (K.28)}}{=} \lim_{x\uparrow 0} \frac{\frac{4}{4+x^2}}{1} = 1 \tag{K.30}
\]
and
\[
\lim_{x\downarrow 0} \frac{Y_q(0,\alpha) - Y_q(x,\alpha)}{0 - x} = \lim_{x\downarrow 0} \frac{\pi + 2k\pi - \big( 2(k+1)\pi + 2\arctan\frac{-2}{x} \big)}{-x} \overset{\text{[Phi13a, (9.29)], (K.28)}}{=} \lim_{x\downarrow 0} \frac{\frac{4}{4+x^2}}{1} = 1, \tag{K.31}
\]
showing $x \mapsto Y_q(x,\alpha)$ to be differentiable in $x = 0$ with $\partial_x Y_q(0,\alpha) = 1$. Due to
\[
\frac{1 - \cos(\pi + 2k\pi)}{2} = 1 = \partial_x Y_q(0,\pi+2k\pi), \tag{K.32}
\]
(K.24) also holds.


For each $\tilde\alpha \in R_k \cup L_k$, the differentiability is clear in each $x \in \mathbb{R}\setminus\{x_0(\tilde\alpha)\}$. Moreover, recalling $\sigma(x,\tilde\alpha)$ from (K.22), one has, for each $x \in \mathbb{R}\setminus\{x_0(\tilde\alpha)\}$,
\[
\Big( 2\arctan\frac{2\sin\tilde\alpha}{\sigma(x,\tilde\alpha)} \Big)' = 2\,\frac{\frac{2(\sin\tilde\alpha)^2}{(\sigma(x,\tilde\alpha))^2}}{1 + \frac{4(\sin\tilde\alpha)^2}{(\sigma(x,\tilde\alpha))^2}} = \frac{4(\sin\tilde\alpha)^2}{4(\sin\tilde\alpha)^2 + (\sigma(x,\tilde\alpha))^2}, \tag{K.33}
\]
and, thus, for each $x \in \mathbb{R}\setminus\{x_0(\tilde\alpha)\}$,
\begin{align}
\frac{1 - \cos Y_q(x,\tilde\alpha)}{2} &= \frac12 - \frac12\,\cos\Big( 2\arctan\frac{2\sin\tilde\alpha}{\sigma(x,\tilde\alpha)} \Big) \overset{\text{[Phi13a, (D.1e)]}}{=} \frac12 - \frac12\,\frac{1 - \frac{4(\sin\tilde\alpha)^2}{(\sigma(x,\tilde\alpha))^2}}{1 + \frac{4(\sin\tilde\alpha)^2}{(\sigma(x,\tilde\alpha))^2}} \notag\\
&= \frac12 - \frac12\,\frac{(\sigma(x,\tilde\alpha))^2 - 4(\sin\tilde\alpha)^2}{(\sigma(x,\tilde\alpha))^2 + 4(\sin\tilde\alpha)^2} = \frac{4(\sin\tilde\alpha)^2}{4(\sin\tilde\alpha)^2 + (\sigma(x,\tilde\alpha))^2} \overset{\text{(K.33)}}{=} \partial_x Y_q(x,\tilde\alpha), \tag{K.34}
\end{align}
proving (K.24) for each $x \in \mathbb{R}\setminus\{x_0(\tilde\alpha)\}$. It remains to consider $x = x_0(\tilde\alpha)$. For $\tilde\alpha \in R_k$, we have $\sin\tilde\alpha > 0$ and $x_0(\tilde\alpha) > 0$, and, thus, by L'Hôpital's rule [Phi13a, Th. 9.23(a)],
\[
\lim_{x\uparrow x_0(\tilde\alpha)} \frac{Y_q(x_0(\tilde\alpha),\tilde\alpha) - Y_q(x,\tilde\alpha)}{x_0(\tilde\alpha) - x} = \lim_{x\uparrow x_0(\tilde\alpha)} \frac{\pi + 2k\pi - \big( 2k\pi + 2\arctan\frac{2\sin\tilde\alpha}{\sigma(x,\tilde\alpha)} \big)}{x_0(\tilde\alpha) - x} \overset{\text{[Phi13a, (9.29)], (K.33)}}{=} \lim_{x\uparrow x_0(\tilde\alpha)} \frac{4(\sin\tilde\alpha)^2}{4(\sin\tilde\alpha)^2 + (\sigma(x,\tilde\alpha))^2} = 1 \tag{K.35}
\]
and
\[
\lim_{x\downarrow x_0(\tilde\alpha)} \frac{Y_q(x_0(\tilde\alpha),\tilde\alpha) - Y_q(x,\tilde\alpha)}{x_0(\tilde\alpha) - x} = \lim_{x\downarrow x_0(\tilde\alpha)} \frac{\pi + 2k\pi - \big( 2(k+1)\pi + 2\arctan\frac{2\sin\tilde\alpha}{\sigma(x,\tilde\alpha)} \big)}{x_0(\tilde\alpha) - x} \overset{\text{[Phi13a, (9.29)], (K.33)}}{=} 1, \tag{K.36}
\]
showing $x \mapsto Y_q(x,\tilde\alpha)$ to be differentiable in $x = x_0(\tilde\alpha)$ with $\partial_x Y_q(x_0(\tilde\alpha),\tilde\alpha) = 1$. Due to
\[
\frac{1 - \cos(\pi + 2k\pi)}{2} = 1 = \partial_x Y_q(x_0(\tilde\alpha),\tilde\alpha), \tag{K.37}
\]
(K.24) also holds. For $\tilde\alpha \in L_k$, we have $\sin\tilde\alpha < 0$ and $x_0(\tilde\alpha) < 0$, and, thus, by L'Hôpital's rule [Phi13a, Th. 9.23(a)],
\[
\lim_{x\uparrow x_0(\tilde\alpha)} \frac{Y_q(x_0(\tilde\alpha),\tilde\alpha) - Y_q(x,\tilde\alpha)}{x_0(\tilde\alpha) - x} = \lim_{x\uparrow x_0(\tilde\alpha)} \frac{-\pi + 2k\pi - \big( 2(k-1)\pi + 2\arctan\frac{2\sin\tilde\alpha}{\sigma(x,\tilde\alpha)} \big)}{x_0(\tilde\alpha) - x} \overset{\text{[Phi13a, (9.29)], (K.33)}}{=} 1 \tag{K.38}
\]
and
\[
\lim_{x\downarrow x_0(\tilde\alpha)} \frac{Y_q(x_0(\tilde\alpha),\tilde\alpha) - Y_q(x,\tilde\alpha)}{x_0(\tilde\alpha) - x} = \lim_{x\downarrow x_0(\tilde\alpha)} \frac{-\pi + 2k\pi - \big( 2k\pi + 2\arctan\frac{2\sin\tilde\alpha}{\sigma(x,\tilde\alpha)} \big)}{x_0(\tilde\alpha) - x} \overset{\text{[Phi13a, (9.29)], (K.33)}}{=} 1, \tag{K.39}
\]
showing $x \mapsto Y_q(x,\tilde\alpha)$ to be differentiable in $x = x_0(\tilde\alpha)$ with $\partial_x Y_q(x_0(\tilde\alpha),\tilde\alpha) = 1$. Due to
\[
\frac{1 - \cos(-\pi + 2k\pi)}{2} = 1 = \partial_x Y_q(x_0(\tilde\alpha),\tilde\alpha), \tag{K.40}
\]
(K.24) also holds.

Claim 3. The general solution to (K.2) with $f_1, f_2$ according to (K.10) is
\[
Y : D_{f,0} \to \mathbb{R}^2, \quad Y(x,\eta_1,\eta_2) := \Big( Y_p\big( x, r(\eta_1,\eta_2) \big)\,\cos Y_q\big( x, \theta(\eta_1,\eta_2) \big),\ Y_p\big( x, r(\eta_1,\eta_2) \big)\,\sin Y_q\big( x, \theta(\eta_1,\eta_2) \big) \Big), \tag{K.41}
\]
where $r$ and $\theta$ are given by (K.1), and
\[
D_{f,0} = \Big( \mathbb{R} \times \{\eta \in \mathbb{R}^2 : 0 < \|\eta\|_2 \le 1\} \Big) \cup \Big\{ (x,\eta) \in \mathbb{R} \times \mathbb{R}^2 : \|\eta\|_2 > 1,\ x \in \Big] \ln\frac{\|\eta\|_2 - 1}{\|\eta\|_2},\ \infty \Big[ \Big\}. \tag{K.42}
\]

Proof. Since (K.12) is the polar coordinate version of (K.2) with $f_1, f_2$ according to (K.10), everything follows from combining Th. K.1 with Claims 1 and 2.
Claim 4. The autonomous ODE (K.2) with $f_1, f_2$ according to (K.10) has $(1,0)$ as its only fixed point, and $(1,0)$ satisfies Def. 5.24(iii) for $x \to \infty$ (even for each $\eta \in \mathbb{R}^2\setminus\{0\}$) without satisfying Def. 5.24(ii) (i.e. without being positively stable).

Proof. For each $\eta \in \mathbb{R}^2\setminus\{0\}$, it is $r(\eta) > 0$, and, thus,
\[
\forall_{\eta\in\mathbb{R}^2\setminus\{0\}}\quad \lim_{x\to\infty} Y_p\big( x, r(\eta) \big) = \lim_{x\to\infty} \frac{r(\eta)}{r(\eta) + \big( 1 - r(\eta) \big)\,e^{-x}} = 1. \tag{K.43}
\]
Fix $\eta \in \mathbb{R}^2\setminus\{0\}$. If $\theta(\eta) = 0$, then
\[
\lim_{x\to\infty} Y_q(x,0) = \lim_{x\to\infty} 0 = 0. \tag{K.44a}
\]
If $\theta(\eta) = \pi$, then
\[
\lim_{x\to\infty} Y_q(x,\pi) = \lim_{x\to\infty} \Big( 2\pi + 2\arctan\frac{-2}{x} \Big) = 2(\pi + 0) = 2\pi. \tag{K.44b}
\]
If $0 < \theta(\eta) < \pi$ or $\pi < \theta(\eta) < 2\pi$, then $\sin\theta(\eta) \ne 0$ and, thus,
\[
\lim_{x\to\infty} Y_q\big( x, \theta(\eta) \big) = \lim_{x\to\infty} \Big( 2\pi + 2\arctan\frac{2\sin\theta(\eta)}{2\cos\theta(\eta) - x\sin\theta(\eta) + 2} \Big) = 2(\pi + 0) = 2\pi. \tag{K.44c}
\]
Using (K.43) and (K.44) in (K.41) yields
\[
\forall_{\eta\in\mathbb{R}^2\setminus\{0\}}\quad \lim_{x\to\infty} Y(x,\eta) = (1,0). \tag{K.45}
\]
While $(1,0)$ is clearly a fixed point for (K.2) with $f_1, f_2$ according to (K.10), (K.45) shows that no other $\eta \in \mathbb{R}^2\setminus\{0\}$ can be a fixed point.

For each $\alpha \in\, ]0,\pi[$ and $\eta := (\cos\alpha, \sin\alpha)$, it is $\theta(\eta) = \alpha$ and $Y_q(0,\theta(\eta)) = \alpha$. Thus, due to (K.44c) and the intermediate value theorem, the continuous function $Y_q(\cdot,\theta(\eta))$ must attain every value between $\alpha$ and $2\pi$; in particular, there is $x_\alpha > 0$ such that $Y_q(x_\alpha,\theta(\eta)) = \pi$ and $Y(x_\alpha,\eta) = (\cos\pi, \sin\pi) = (-1,0)$. Since every neighborhood of $(1,0)$ contains points $\eta = (\cos\alpha, \sin\alpha)$ with $\alpha \in\, ]0,\pi[$, this shows that $(1,0)$ does not satisfy Def. 5.24(ii) for $x \ge 0$.
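The behavior established in Claim 4 can be observed numerically; the following Python sketch (illustrative only) integrates (K.2) with (K.10) from initial points on the unit circle, all of which are attracted to $(1,0)$ after first traveling around the circle past $(-1,0)$:

```python
# Sketch for Claim 4: every nonzero initial point is attracted to (1, 0),
# but points just above (1, 0) first travel around the unit circle, so
# (1, 0) is attractive without being positively stable.
import numpy as np
from scipy.integrate import solve_ivp

def F(_, y):
    y1, y2 = y
    r = np.hypot(y1, y2)
    return [y1*(1 - r) - y2*(r - y1)/(2*r),
            y2*(1 - r) + y1*(r - y1)/(2*r)]

for alpha in (0.3, 1.5, 3.0):                       # eta on the unit circle
    eta = [np.cos(alpha), np.sin(alpha)]
    sol = solve_ivp(F, (0.0, 200.0), eta, rtol=1e-9, atol=1e-12)
    print(f"alpha = {alpha}: Y(200, eta) ~ "
          f"({sol.y[0, -1]:+.4f}, {sol.y[1, -1]:+.4f})")
# All runs end near (1, 0), though each trajectory passes near (-1, 0).
```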

References

[Aul04] Bernd Aulbach. Gewöhnliche Differenzialgleichungen, 2nd ed. Spektrum Akademischer Verlag, Heidelberg, Germany, 2004 (German).

[Kön04] Konrad Königsberger. Analysis 2, 5th ed. Springer-Verlag, Berlin, 2004 (German).

[Koe03] Max Koecher. Lineare Algebra und analytische Geometrie, 4th ed. Springer-Verlag, Berlin, 2003 (German), 1st corrected reprint.

[Mar04] Nelson G. Markley. Principles of Differential Equations. Pure and Applied Mathematics, Wiley-Interscience, Hoboken, NJ, USA, 2004.

[Oss09] E. Ossa. Topologie. Vieweg+Teubner, Wiesbaden, Germany, 2009 (German).

[Phi13a] P. Philip. Calculus I for Computer Science and Statistics Students. Lecture Notes, Ludwig-Maximilians-Universität, Germany, 2012/2013, available in PDF format at http://www.math.lmu.de/~philip/publications/lectureNotes/calc1_forInfAndStatStudents.pdf.

[Phi13b] P. Philip. Calculus II for Statistics Students. Lecture Notes, Ludwig-Maximilians-Universität, Germany, 2013, available in PDF format at http://www.math.lmu.de/~philip/publications/lectureNotes/calc2_forStatStudents.pdf.

[Pre75] G. Preuß. Allgemeine Topologie, 2nd ed. Springer-Verlag, Berlin, 1975 (German).

[Put66] E.J. Putzer. Avoiding the Jordan Canonical Form in the Discussion of Linear Systems with Constant Coefficients. The American Mathematical Monthly 73 (1966), No. 1, 2–7.

[Str08] Gernot Stroth. Lineare Algebra, 2nd ed. Berliner Studienreihe zur Mathematik, Vol. 7, Heldermann Verlag, Lemgo, Germany, 2008 (German).

[Wal02] Wolfgang Walter. Analysis 2, 5th ed. Springer-Verlag, Berlin, 2002 (German).
