Peter Philip
Lecture Notes
Originally Created for the Class of Spring Semester 2012 at LMU Munich,
Revised and Extended for the Classes of Spring Semesters 2013 and 2014
Contents
1 Basic Notions
1.1 Types of Ordinary Differential Equations (ODE) and First Examples
1.2
1.3
2.1
2.2
2.3 Separation of Variables
2.4 Change of Variables
3 General Theory
3.1 Equivalence Between Higher-Order ODE and Systems of First-Order ODE
3.2 Existence of Solutions
3.3 Uniqueness of Solutions
3.4
3.5
E-Mail: philip@math.lmu.de
4 Linear ODE
4.1 Definition, Setting
4.2 Gronwall's Inequality
4.3
4.4
4.5 Higher-Order, Wronskian
4.6 Constant Coefficients
4.6.1
4.6.2
5 Stability
5.1
5.2
5.3
5.4 Linearization
5.5
A Differentiability
B $\mathbb{K}^n$-Valued Integration
F Paths in $\mathbb{R}^n$
I Matrix-Valued Functions
I.1
I.2
J Autonomous ODE
K Polar Coordinates
References
1 Basic Notions
1.1 Types of Ordinary Differential Equations (ODE) and First Examples
A differential equation is an equation for some unknown function, involving one or more
derivatives of the unknown function. Here are some first examples:
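\[ y' = y, \tag{1.1a} \]
\[ y^{(5)} = (y')^2 + \pi^x, \tag{1.1b} \]
\[ (y')^2 = c, \tag{1.1c} \]
\[ \partial_t x = e^{2\pi i t} - x^2, \tag{1.1d} \]
\[ x'' = -3x + \begin{pmatrix} -1 \\ 1 \end{pmatrix}. \tag{1.1e} \]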
One distinguishes between ordinary differential equations (ODE) and partial differential
equations (PDE). While ODE contain only derivatives with respect to one variable, PDE
can contain (partial) derivatives with respect to several different variables. In general,
PDE are much harder to solve than ODE. The equations in (1.1) all are ODE, and only
ODE are the subject of this class. We will see precise definitions shortly, but we can
already use the examples in (1.1) to get some first exposure to important ODE-related
terms and to discuss related issues.
As in (1.1), the notation for the unknown function varies in the literature, where the
two variants presented in (1.1) are probably the most common ones: In the first three
equations of (1.1), the unknown function is denoted y, usually assumed to depend on
a variable denoted $x$, i.e. $x \mapsto y(x)$. In the last two equations of (1.1), the unknown
function is denoted $x$, usually assumed to depend on a variable denoted $t$, i.e. $t \mapsto x(t)$.
So one has to use some care due to the different roles of the symbol $x$. The notation
$t \mapsto x(t)$ is typically favored in situations arising from physics applications, where $t$
represents time. In this class, we will mostly use the notation $x \mapsto y(x)$.
There is another, in a way a slightly more serious, notational issue that one commonly
encounters when dealing with ODE: Strictly speaking, the notation in (1.1b) and (1.1d)
is not entirely correct, as functions and function arguments are not properly distinguished. Correctly written, (1.1b) and (1.1d) read
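\[ \forall_{x \in D(y)}\quad y^{(5)}(x) = \big(y'(x)\big)^2 + \pi^x, \tag{1.2a} \]
\[ \forall_{t \in D(x)}\quad (\partial_t x)(t) = e^{2\pi i t} - \big(x(t)\big)^2, \tag{1.2b} \]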
where D(y) and D(x) denote the respective domains of the functions y and x. However,
one might also notice that the notation in (1.2) is more cumbersome and, perhaps,
harder to read. In any case, the type of slight abuse of notation present in (1.1b) and
(1.1d) is so common in the literature that one will have to live with it.
One speaks of first-order ODE if the equations involve only first derivatives such as in
(1.1a), (1.1c), and (1.1d). Otherwise, one speaks of higher-order ODE, where the precise
order is given by the highest derivative occurring in the equation, such that (1.1b) is
an ODE of fifth order and (1.1e) is an ODE of second order. We will see later in Th.
3.1 that ODE of higher order can be equivalently formulated and solved as systems of
ODE of first order, where systems of ODE obviously consist of several ODE to be solved
simultaneously. Such a system of ODE can, equivalently, be interpreted as a single ODE
in higher dimensions: For instance, (1.1e) can be seen as a single two-dimensional ODE
of second order or as the system
\[ x_1'' = -3x_1 - 1, \tag{1.3a} \]
\[ x_2'' = -3x_2 + 1 \tag{1.3b} \]
Notation 1.1. We will write $\mathbb{K}$ in situations where we allow $\mathbb{K}$ to be $\mathbb{R}$ or $\mathbb{C}$.
Definition 1.2. Let $k, n \in \mathbb{N}$.
(a) Given $U \subseteq \mathbb{R} \times \mathbb{K}^{(k+1)n}$ and $F : U \to \mathbb{K}^n$, call
\[ F(x, y, y', \dots, y^{(k)}) = 0 \tag{1.4} \]
Note that condition (i) is necessary so that one can even formulate condition (ii).
(b) Given $G \subseteq \mathbb{R} \times \mathbb{K}^{kn}$ and $f : G \to \mathbb{K}^n$, call
\[ y^{(k)} = f(x, y, y', \dots, y^{(k-1)}) \tag{1.6} \]
Again, note that condition (i) is necessary so that one can even formulate condition
(ii). Also note that $\varphi$ is a solution to (1.6) if, and only if, $\varphi$ is a solution to the
equivalent implicit ODE $y^{(k)} - f(x, y, y', \dots, y^{(k-1)}) = 0$.
Definition 1.3. Let $k, n \in \mathbb{N}$.
(a) An initial value problem for (1.4) (resp. for (1.6)) consists of the ODE (1.4) (resp.
of the ODE (1.6)) plus the initial condition
\[ \forall_{j=0,\dots,k-1}\quad y^{(j)}(x_0) = y_{0,j} \tag{1.7} \]
(b) A boundary value problem for (1.4) (resp. for (1.6)) consists of the ODE (1.4) (resp.
of the ODE (1.6)) plus the boundary condition
\[ \forall_{j \in J_a}\quad y^{(j)}(a) = y_{a,j} \qquad\text{and}\qquad \forall_{j \in J_b}\quad y^{(j)}(b) = y_{b,j} \tag{1.8} \]
Under suitable hypotheses, initial and boundary value problems for ODE have unique
solutions (for initial value problems, we will see some rather general results in Cor. 3.10
and Cor. 3.16 below). However, in general, they can have infinitely many solutions or
no solutions, as shown by Examples 1.4(b),(c),(e) below.
Example 1.4. (a) Let $k \in \mathbb{N}$. The function $\varphi : \mathbb{R} \to \mathbb{K}$, $\varphi(x) = a\,e^x$, $a \in \mathbb{K}$, is a
solution to the $k$th order explicit initial value problem
\[ y^{(k)} = y, \tag{1.9a} \]
\[ \forall_{j=0,\dots,k-1}\quad y^{(j)}(0) = a. \tag{1.9b} \]
We will see later (e.g., as a consequence of Th. 4.8 combined with Th. 3.1) that
$\varphi$ is the unique solution to (1.9) on $\mathbb{R}$.
(b) Consider the one-dimensional explicit first-order initial value problem
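\[ y' = \sqrt{|y|}, \tag{1.10a} \]
\[ y(0) = 0. \tag{1.10b} \]
\[ \varphi_c(x) := \begin{cases} 0 & \text{for } x \le c, \\ \dfrac{(x-c)^2}{4} & \text{for } x \ge c, \end{cases} \tag{1.11} \]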
(1.12)
solving the ODE. Thus, (1.10) is an example of an initial value problem with uncountably many different solutions, all defined on the same domain.
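The non-uniqueness in (1.10) can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes the family $\varphi_c$ of (1.11) with $c \ge 0$ (needed for $\varphi_c(0) = 0$) and compares a central difference quotient with $\sqrt{|\varphi_c|}$. The names `phi` and `residual` and the step size `h` are illustrative choices.

```python
import math

def phi(c, x):
    """Candidate solution phi_c from (1.11): zero left of c, a parabola right of c."""
    return 0.0 if x <= c else (x - c) ** 2 / 4.0

def residual(c, x, h=1e-6):
    """|phi_c'(x) - sqrt(|phi_c(x)|)|, with phi_c' approximated by a central difference."""
    slope = (phi(c, x + h) - phi(c, x - h)) / (2.0 * h)
    return abs(slope - math.sqrt(abs(phi(c, x))))

# Every c >= 0 gives a different function; all of them start at y(0) = 0
# and have a small residual in the ODE y' = sqrt(|y|) on the sampled grid.
for c in (0.0, 0.5, 1.0, 3.0):
    worst = max(residual(c, x / 10.0) for x in range(-50, 51))
    print(f"c = {c}: phi_c(0) = {phi(c, 0.0)}, max residual = {worst:.1e}")
```

The residual is not exactly zero only because of the finite difference; at the kink $x = c$ it is of order `h`.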
(c) As mentioned before, the one-dimensional implicit first-order ODE (1.1c) has no
real-valued solution for $c < 0$. For $c \ge 0$, every function $\varphi : \mathbb{R} \to \mathbb{R}$, $\varphi(x) := a \pm x\sqrt{c}$, $a \in \mathbb{R}$, is a solution
to (1.1c). Moreover, for $c < 0$, every function
$\varphi : \mathbb{R} \to \mathbb{C}$, $\varphi(x) := a \pm x\,i\sqrt{-c}$, $a \in \mathbb{C}$, is a $\mathbb{C}$-valued solution to (1.1c). The
one-dimensional implicit first-order ODE
\[ e^{y'} = 0 \tag{1.13} \]
\[ \varphi(x) := c + x\,a, \tag{1.14} \]
is the unique solution to the $n$-dimensional explicit first-order initial value problem
\[ y' = a, \tag{1.15a} \]
\[ y(0) = c. \tag{1.15b} \]
(1.17)
\[ y(0) = 0, \quad y(\pi/2) = 1, \tag{1.18a} \]
for (1.16) has the unique solution $\varphi : [0, \pi/2] \to \mathbb{K}$, $\varphi(x) := \sin x$ (using (1.18a)
and (1.17) implies $c_2 = 0$ and $c_1 = 1$); the boundary value problem
\[ y(0) = 0, \quad y(\pi) = 0, \tag{1.18b} \]
for (1.16) has the infinitely many different solutions $\varphi_c : [0, \pi] \to \mathbb{K}$, $\varphi_c(x) :=
c \sin x$, $c \in \mathbb{K}$; and the boundary value problem
\[ y(0) = 0, \quad y(\pi) = 1, \tag{1.18c} \]
for (1.16) has no solution (using (1.18c) and (1.17) implies the contradictory requirements $c_2 = 0$ and $-c_2 = 1$).
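(f) Consider
\[ F : \mathbb{R} \times \mathbb{K}^2 \times \mathbb{K}^2 \to \mathbb{K}^2, \quad F\left(x, \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}, \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}\right) := \begin{pmatrix} z_2 \\ z_2 - 1 \end{pmatrix}. \tag{1.19a} \]
\[ F(x, y, y') = \begin{pmatrix} y_2' \\ y_2' - 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \tag{1.19b} \]
\[ F\left(x, \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}, \begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}\right) := \begin{pmatrix} z_2 + i y_3 - 2i \\ z_1 + y_2 - x^2 \\ y_1 - i e^{ix} \end{pmatrix}. \tag{1.20a} \]
\[ F(x, y, y') = \begin{pmatrix} y_2' + i y_3 - 2i \\ y_1' + y_2 - x^2 \\ y_1 - i e^{ix} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \tag{1.20b} \]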
has a unique solution on R (note that, here, we do not need to provide initial
or boundary conditions to obtain uniqueness). The implicit ODE (1.20b) is an
example of a differential algebraic equation, since, read in components, only its first
two equations contain derivatives, whereas its third equation is purely algebraic.
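How the algebraic third equation pins down the solution of (1.20b) can be made concrete. The sketch below is an illustration under assumptions: it reads (1.20a) as $F(x,y,z) = (z_2 + i y_3 - 2i,\; z_1 + y_2 - x^2,\; y_1 - i e^{ix})$, and the candidate `psi` is obtained here by back-substitution (third equation for $y_1$, then second for $y_2$, then first for $y_3$); it is not quoted from the notes.

```python
import cmath

def psi(x):
    """Candidate solution of (1.20b), derived by back-substitution (an assumption)."""
    e = cmath.exp(1j * x)
    return (1j * e, x * x + e, 2 + 2j * x - e)

def F(x, y, z):
    """Left-hand side of (1.20b) with y' replaced by z, following (1.20a)."""
    return (z[1] + 1j * y[2] - 2j,
            z[0] + y[1] - x * x,
            y[0] - 1j * cmath.exp(1j * x))

def derivative(x, h=1e-5):
    """Componentwise central difference approximation of psi'(x)."""
    return tuple((p - q) / (2 * h) for p, q in zip(psi(x + h), psi(x - h)))

def max_residual(points):
    """Largest modulus of any component of F(x, psi(x), psi'(x)) over the points."""
    return max(abs(c) for x in points for c in F(x, psi(x), derivative(x)))

print(max_residual([k / 4.0 for k in range(-20, 21)]))
```

No initial condition enters anywhere: once $y_1$ is fixed algebraically, the other two components follow.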
1.2
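\[ y' = f(x, y), \tag{1.21a} \]
\[ y(x_0) = y_0, \tag{1.21b} \]
\[ y(x) = y_0 + \int_{x_0}^{x} f\big(t, y(t)\big)\,dt, \tag{1.22} \]
is a solution to (1.21) in the sense of Def. 1.3(a) if, and only if,
\[ \forall_{x \in I}\quad \varphi(x) = y_0 + \int_{x_0}^{x} f\big(t, \varphi(t)\big)\,dt, \tag{1.24} \]
i.e. if, and only if, $\varphi$ is a solution to the integral equation (1.22).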
Proof. Assume $I \subseteq \mathbb{R}$ with $x_0 \in I$ to be a nontrivial interval and $\varphi : I \to \mathbb{K}^n$
to be a continuous function, satisfying (1.23). If $\varphi$ is a solution to (1.21), then $\varphi$ is
differentiable and the assumed continuity of $f$ implies the continuity of $\varphi'$. In other
words, each component $\varphi_j$ of $\varphi$, $j \in \{1, \dots, n\}$, is in $C^1(I, \mathbb{K})$. Thus, the fundamental
theorem of calculus [Phi13a, Th. G.6(b)] applies, and [Phi13a, (G.16b)] yields
\[ \forall_{x \in I}\ \forall_{j \in \{1,\dots,n\}}\quad \varphi_j(x) = \varphi_j(x_0) + \int_{x_0}^{x} f_j\big(t, \varphi(t)\big)\,dt \overset{(1.21b)}{=} y_{0,j} + \int_{x_0}^{x} f_j\big(t, \varphi(t)\big)\,dt, \tag{1.25} \]
proving $\varphi$ satisfies (1.24). Conversely, if $\varphi$ satisfies (1.24), then the validity of the initial
condition (1.21b) is immediate.
Moreover, as $f$ and $\varphi$ are continuous, so is the integrand
function $t \mapsto f\big(t, \varphi(t)\big)$ of (1.24). Thus, [Phi13a, Th. G.6(a)] applies to (the components
of) $\varphi$, proving $\varphi'(x) = f\big(x, \varphi(x)\big)$ for each $x \in I$, proving $\varphi$ is a solution to (1.21).
Example 1.6. Consider the situation of Th. 1.5. In the particularly simple special
case, where $f$ does not actually depend on $y$, but merely on $x$, the equivalence between
(1.21) and (1.22) can be directly exploited to actually solve the initial value problem:
If $f : I \to \mathbb{K}^n$, where $I \subseteq \mathbb{R}$ is some nontrivial interval with $x_0 \in I$, then we obtain
$\varphi : I \to \mathbb{K}^n$ to be a solution of (1.21) if, and only if,
\[ \forall_{x \in I}\quad \varphi(x) = y_0 + \int_{x_0}^{x} f(t)\,dt, \tag{1.26} \]
i.e. if, and only if, $\varphi$ is the antiderivative of $f$ that satisfies the initial condition. In
particular, in the present situation, $\varphi$ as given by (1.26) is the unique solution to the
initial value problem. Of course, depending on $f$, it can still be difficult to carry out
the integral in (1.26).
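When the integral in (1.26) has no convenient closed form, it can be approximated by a quadrature rule. The following sketch is an illustration only: the composite Simpson rule, the function names and the test problem $y' = \cos x$, $y(0) = 2$ are choices made here, not taken from the notes.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule for the integral of f over [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def solve_by_quadrature(f, x0, y0, x):
    """phi(x) = y0 + integral of f from x0 to x, i.e. formula (1.26) for scalar f."""
    return y0 + simpson(f, x0, x)

# Hypothetical test problem: y' = cos x, y(0) = 2, with exact solution 2 + sin x.
approx = solve_by_quadrature(math.cos, 0.0, 2.0, 1.5)
print(approx, 2.0 + math.sin(1.5))
```

The same function also works for $x < x_0$, since the step `h` then becomes negative and the orientation of the integral is respected.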
1.3
If solutions defined on different intervals fit together, then they can be patched to obtain
a solution on the union of the two intervals:
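Lemma 1.7 (Patching of Solutions). Let $k, n \in \mathbb{N}$. Given $G \subseteq \mathbb{R} \times \mathbb{K}^{kn}$ and $f : G \to \mathbb{K}^n$, if $\varphi : I \to \mathbb{K}^n$ and $\psi : J \to \mathbb{K}^n$ are both solutions to (1.6), i.e. to
\[ y^{(k)} = f(x, y, y', \dots, y^{(k-1)}), \]
such that $I = ]a, b]$, $J = [b, c[$, $a < b < c$, and such that
\[ \forall_{j=0,\dots,k-1}\quad \varphi^{(j)}(b) = \psi^{(j)}(b), \tag{1.27} \]
then
\[ \sigma : I \cup J \to \mathbb{K}^n, \quad \sigma(x) := \begin{cases} \varphi(x) & \text{for } x \in I, \\ \psi(x) & \text{for } x \in J, \end{cases} \tag{1.28} \]
(1.29)
must hold, where (1.27) guarantees that $\sigma^{(j)}(b)$ exists for each $j = 0, \dots, k-1$. Moreover,
$\sigma$ is $k$ times differentiable at each $x \in I \cup J$, $x \neq b$, and
\[ \forall_{x \in I \cup J,\ x \neq b}\quad \sigma^{(k)}(x) = f\big(x, \sigma(x), \sigma'(x), \dots, \sigma^{(k-1)}(x)\big). \tag{1.30} \]
However, at $b$, we also have (using the left-hand derivatives for $\varphi$ and the right-hand
derivatives for $\psi$)
\[ \varphi^{(k)}(b) = f\big(b, \varphi(b), \varphi'(b), \dots, \varphi^{(k-1)}(b)\big) = f\big(b, \psi(b), \psi'(b), \dots, \psi^{(k-1)}(b)\big) = \psi^{(k)}(b), \tag{1.31} \]
which shows $\sigma$ is $k$ times differentiable and the equality of (1.30) also holds at $x = b$,
completing the proof that $\sigma$ is a solution.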
It is sometimes useful to apply what is known as time reversion:
Definition 1.8. Let $k, n \in \mathbb{N}$, $G_f \subseteq \mathbb{R} \times \mathbb{K}^{kn}$, $f : G_f \to \mathbb{K}^n$, and consider the ODE
\[ y^{(k)} = f(x, y, y', \dots, y^{(k-1)}). \tag{1.32} \]
(1.33)
(1.34a)
(1.34b)
\[ \psi(x) := \varphi(-x), \tag{1.35} \]
j=0,...,k
x]b,a[
(1.36a)
one has
x]b,a[
x, (x), (x), . . . , (k1) (x) Gf
and
x]b,a[
(1.36b)
(1.36c)
2.1
Geometrically, in the 1-dimensional real-valued case, the ODE (1.21a) provides a slope
$y' = f(x, y)$ for every point $(x, y)$. In other words, it provides a field of directions. The
task is to find a differentiable function $\varphi$ such that its graph has the prescribed slope in
each point it contains. In certain simple cases, drawing the field of directions can help
to guess the solutions of the ODE.
Example 2.1. Let $G := \mathbb{R}^+ \times \mathbb{R}$ and $f : G \to \mathbb{R}$, $f(x, y) := y/x$, i.e. we consider the
ODE $y' = y/x$. Drawing the field of directions leads to the idea that the solutions are
functions whose graphs constitute rays, i.e. $\varphi_c : \mathbb{R}^+ \to \mathbb{R}$, $y = \varphi_c(x) = c\,x$ with $c \in \mathbb{R}$.
Indeed, one immediately verifies that each $\varphi_c$ constitutes a solution to the ODE.
2.2
\[ y' = a(x)\,y + b(x) \tag{2.1} \]
is called a linear ODE of first order. It is called homogeneous if, and only if, $b \equiv 0$; it is
called inhomogeneous if, and only if, it is not homogeneous.
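where
\[ \varphi_0 : I \to \mathbb{K}, \quad \varphi_0(x) = \exp\left(\int_{x_0}^{x} a(t)\,dt\right) = e^{\int_{x_0}^{x} a(t)\,dt} \tag{2.2b} \]
\[ \varphi_0'(x) = a(x) \exp\left(\int_{x_0}^{x} a(t)\,dt\right) = a(x)\,\varphi_0(x), \tag{2.3} \]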
where Lem. A.1 of the Appendix was used as well. In particular, $\varphi_0$ is continuous. Since
$\varphi_0 \neq 0$ as well, $\varphi_0^{-1}$ is also continuous. Moreover, as $b$ is continuous by hypothesis, $\varphi_0^{-1} b$
is continuous and, thus, Riemann integrable on $[x_0, x]$. Once again, [Phi13a, Th. G.6(a)]
applies, yielding $\varphi$ to be differentiable with
\[ \varphi' : I \to \mathbb{K}, \quad \varphi'(x) = \varphi_0'(x) \left(c + \int_{x_0}^{x} \varphi_0(t)^{-1} b(t)\,dt\right) + \varphi_0(x)\,\varphi_0(x)^{-1}\, b(x) = a(x)\,\varphi_0(x) \left(c + \int_{x_0}^{x} \varphi_0(t)^{-1} b(t)\,dt\right) + b(x) = a(x)\,\varphi(x) + b(x), \tag{2.4} \]
where the product rule of [Phi13a, Th. 9.6(c)] was used as well. Comparing (2.4) with
(2.1) shows $\varphi$ is a solution to (2.1). The computation
\[ \varphi(x_0) = \varphi_0(x_0) \cdot (c + 0) = 1 \cdot c = c \tag{2.5} \]
verifies that $\varphi$ satisfies the desired initial condition. It remains to prove uniqueness. To
this end, let $\psi : I \to \mathbb{K}$ be an arbitrary differentiable function that satisfies (2.1) as
well as the initial condition $\psi(x_0) = c$. We have to show $\psi = \varphi$. Since $\varphi_0 \neq 0$, we can
define $u := \psi/\varphi_0$ and still have to verify
\[ \forall_{x \in I}\quad u(x) = c + \int_{x_0}^{x} \varphi_0(t)^{-1} b(t)\,dt. \tag{2.6} \]
We obtain
\[ a\,\varphi_0\,u + b = a\,\psi + b = \psi' = (\varphi_0 u)' = \varphi_0' u + \varphi_0 u' = a\,\varphi_0\,u + \varphi_0 u', \tag{2.7} \]
implying $b = \varphi_0 u'$ and $u' = \varphi_0^{-1} b$. Thus, the fundamental theorem of calculus in the
form [Phi13a, Th. G.6(b)] implies
\[ \forall_{x \in I}\quad u(x) = u(x_0) + \int_{x_0}^{x} u'(t)\,dt = c + \int_{x_0}^{x} \varphi_0(t)^{-1} b(t)\,dt, \tag{2.8} \]
Remark 2.5. The name variation of constants for Th. 2.3 can be understood from
comparing the solution (2.9) of the homogeneous linear ODE with the solution (2.2)
of the general inhomogeneous linear ODE: One obtains (2.2) from (2.9) by varying the
constant $c$, i.e. by replacing it with the function $x \mapsto c + \int_{x_0}^{x} \varphi_0(t)^{-1} b(t)\,dt$.
Example 2.6. Consider the ODE
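\[ y' = 2xy + x^3 \tag{2.10} \]
with initial condition $y(0) = c$, $c \in \mathbb{C}$. Comparing (2.10) with Def. 2.2, we observe we
are facing an inhomogeneous linear ODE with
\[ a : \mathbb{R} \to \mathbb{R}, \quad a(x) := 2x, \tag{2.11a} \]
\[ b : \mathbb{R} \to \mathbb{R}, \quad b(x) := x^3. \tag{2.11b} \]
From Cor. 2.4, we obtain the solution $\varphi_{0,c}$ to the homogeneous version of (2.10):
\[ \varphi_{0,c} : \mathbb{R} \to \mathbb{C}, \quad \varphi_{0,c}(x) = c \exp\left(\int_0^x a(t)\,dt\right) = c\,e^{x^2}. \tag{2.12} \]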
(2.13)
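The variation of constants formula can be evaluated numerically for Example 2.6. The sketch below is an illustration under assumptions: it uses the form $\varphi(x) = \varphi_0(x)\big(c + \int_{x_0}^{x} \varphi_0(t)^{-1} b(t)\,dt\big)$ that appears in (2.4) and (2.6), restricts to real $c$, and compares against a closed form $(c + \tfrac12)e^{x^2} - \tfrac12(x^2 + 1)$ that was worked out here by substituting $s = t^2$ in the integral; the function names are hypothetical.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule for the integral of f over [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def variation_of_constants(a, b, x0, c, x):
    """phi(x) = phi0(x) * (c + int_{x0}^x b(t)/phi0(t) dt), phi0(s) = exp(int_{x0}^s a)."""
    def phi0(s):
        return math.exp(simpson(a, x0, s))
    integral = simpson(lambda t: b(t) / phi0(t), x0, x)
    return phi0(x) * (c + integral)

def closed_form(c, x):
    """Closed form for y' = 2xy + x^3, y(0) = c (derived here, see lead-in)."""
    return (c + 0.5) * math.exp(x * x) - (x * x + 1.0) / 2.0

c, x = 1.0, 1.0
print(variation_of_constants(lambda t: 2.0 * t, lambda t: t ** 3, 0.0, c, x))
print(closed_form(c, x))
```

Differentiating the closed form gives $2x(c + \tfrac12)e^{x^2} - x$, which agrees with $2x\,\varphi(x) + x^3$, so the comparison value is itself a solution of (2.10).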
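2.3 Separation of Variables
\[ y' = f(x)\,g(y) \tag{2.14} \]
\[ F(x) := \int_{x_0}^{x} f(t)\,dt, \qquad G : J \to \mathbb{R}, \quad G(y) := \int_{y_0}^{y} \frac{dt}{g(t)}. \tag{2.16} \]
\[ \forall_{x \in I'}\quad f(x) = F'(x) = G'\big(\varphi(x)\big)\,\varphi'(x) = \frac{\varphi'(x)}{g\big(\varphi(x)\big)}, \tag{2.18} \]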
We now proceed to show that each solution $\varphi : I' \to \mathbb{R}$ to (2.14) that satisfies (2.15)
must also satisfy (2.17). Since $\varphi$ is a solution to (2.14),
\[ \frac{\varphi'(x)}{g\big(\varphi(x)\big)} = f(x) \quad\text{for each } x \in I'. \tag{2.19} \]
(2.20)
Using the change of variables formula of [Phi13a, Th. 10.24] in the left-hand side of
(2.20), allows one to replace $\varphi(t)$ by the new integration variable $u$ (note that each
solution $\varphi : I' \to \mathbb{R}$ to (2.14) is in $C^1(I')$ since $f$ and $g$ are presumed continuous).
Thus, we obtain from (2.20):
\[ F(x) = \int_{\varphi(x_0)}^{\varphi(x)} \frac{du}{g(u)} = \int_{y_0}^{\varphi(x)} \frac{du}{g(u)} = G\big(\varphi(x)\big) \quad\text{for each } x \in I'. \tag{2.21} \]
(b): During the proof of (a), we have already seen $G$ to be either strictly increasing
or strictly decreasing. As $G(y_0) = 0$, this implies the existence of $\epsilon > 0$ such that
$]-\epsilon, \epsilon[ \subseteq G(J)$. The function $F$ is differentiable and, in particular, continuous. Since
$F(x_0) = 0$, there is $\delta > 0$ such that, for $I' := ]x_0 - \delta, x_0 + \delta[$, one has $F(I') \subseteq ]-\epsilon, \epsilon[ \subseteq G(J)$
as desired.
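Example 2.8. Consider the ODE
\[ y' = -\frac{y}{x} \quad\text{on } I \times J := \mathbb{R}^+ \times \mathbb{R}^+ \tag{2.22} \]
with the initial condition $y(1) = c$ for some given $c \in \mathbb{R}^+$. Introducing functions
\[ f : \mathbb{R}^+ \to \mathbb{R}, \quad f(x) := -\frac{1}{x}, \qquad g : \mathbb{R}^+ \to \mathbb{R}, \quad g(y) := y, \tag{2.23} \]
one sees that Th. 2.7 applies. To compute the solution $\varphi = G^{-1} \circ F$, we first have to
determine $F$ and $G$:
\[ F : \mathbb{R}^+ \to \mathbb{R}, \quad F(x) = \int_1^x f(t)\,dt = -\int_1^x \frac{dt}{t} = -\ln x, \tag{2.24a} \]
\[ G : \mathbb{R}^+ \to \mathbb{R}, \quad G(y) = \int_c^y \frac{dt}{g(t)} = \int_c^y \frac{dt}{t} = \ln\frac{y}{c}. \tag{2.24b} \]
Here, we can choose $I' = I = \mathbb{R}^+$, because $F(\mathbb{R}^+) = \mathbb{R} = G(\mathbb{R}^+)$. That means $\varphi$ is
defined on the entire interval $I$. The inverse function of $G$ is given by
\[ G^{-1} : \mathbb{R} \to \mathbb{R}^+, \quad G^{-1}(t) = c\,e^t. \tag{2.25} \]
\[ \varphi : \mathbb{R}^+ \to \mathbb{R}, \quad \varphi(x) = G^{-1}\big(F(x)\big) = c\,e^{-\ln x} = \frac{c}{x}. \tag{2.26} \]
The uniqueness part of Th. 2.7 further tells us the above initial value problem can have
no solution different from $\varphi$.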
The advantage of using Th. 2.7 as in the previous example, by computing the relevant
functions F , G, and G1 , is that it is mathematically rigorous. In particular, one can be
sure one has found the unique solution to the ODE with initial condition. However, in
practice, it is often easier to use the following heuristic (not entirely rigorous) procedure.
In the end, in most cases, one can easily check by differentiation that the function found
is, indeed, a solution to the ODE with initial condition. However, one does not know
uniqueness without further investigations (general results such as Th. 3.15 below can
often help). One also has to determine on which interval the found solution is defined.
On the other hand, as one is usually interested in choosing the interval as large as
possible, the optimal choice is not always obvious when using Th. 2.7, either.
The heuristic procedure is as follows: Start with the ODE (2.14) written in the form
\[ \frac{dy}{dx} = f(x)\,g(y). \tag{2.27a} \]
Separate the variables:
\[ \frac{dy}{g(y)} = f(x)\,dx. \tag{2.27b} \]
Integrate:
\[ \int \frac{dy}{g(y)} = \int f(x)\,dx. \tag{2.27c} \]
Change the integration variables and supply the appropriate upper and lower limits for
the integrals (according to the initial condition):
\[ \int_{y_0}^{y} \frac{dt}{g(t)} = \int_{x_0}^{x} f(t)\,dt. \tag{2.27d} \]
Solve this equation for $y$, set $\varphi(x) := y$, check by differentiation that $\varphi$ is, indeed, a
solution to the ODE, and determine the largest interval $I'$ such that $x_0 \in I'$ and such
that $\varphi$ is defined on $I'$. The use of this heuristic procedure is demonstrated by the
following example:
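Example 2.9. Consider the ODE
\[ y' = -y^2 \quad\text{on } I \times J := \mathbb{R} \times \mathbb{R} \tag{2.28} \]
with the initial condition $y(x_0) = y_0$ for given values $x_0, y_0 \in \mathbb{R}$. We manipulate (2.28)
according to the heuristic procedure described in (2.27) above:
\[ \frac{dy}{dx} = -y^2 \ \rightsquigarrow\ -y^{-2}\,dy = dx \ \rightsquigarrow\ -\int y^{-2}\,dy = \int dx \ \rightsquigarrow\ -\int_{y_0}^{y} t^{-2}\,dt = \int_{x_0}^{x} dt \ \rightsquigarrow\ \left[\frac{1}{t}\right]_{y_0}^{y} = [t]_{x_0}^{x} \ \rightsquigarrow\ \frac{1}{y} - \frac{1}{y_0} = x - x_0 \ \rightsquigarrow\ \varphi(x) = y = \frac{y_0}{1 + (x - x_0)\,y_0}. \tag{2.29} \]
Clearly, $\varphi(x_0) = y_0$. Moreover,
\[ \varphi'(x) = -\frac{y_0^2}{\big(1 + (x - x_0)\,y_0\big)^2} = -\big(\varphi(x)\big)^2, \tag{2.30} \]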
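Since the heuristic procedure is not rigorous, the candidate has to be checked afterwards. The sketch below does this numerically, assuming the reading $y' = -y^2$ of (2.28) and the candidate $\varphi(x) = y_0 / (1 + (x - x_0)\,y_0)$ of (2.29); the helper names and sample values are illustrative.

```python
def phi(x, x0, y0):
    """Candidate from (2.29) for y' = -y^2, y(x0) = y0."""
    return y0 / (1.0 + (x - x0) * y0)

def residual(x, x0, y0, h=1e-6):
    """|phi'(x) + phi(x)^2|, with phi' approximated by a central difference."""
    slope = (phi(x + h, x0, y0) - phi(x - h, x0, y0)) / (2.0 * h)
    return abs(slope + phi(x, x0, y0) ** 2)

def pole(x0, y0):
    """Point where the denominator of phi vanishes (only defined for y0 != 0)."""
    return x0 - 1.0 / y0

x0, y0 = 0.0, 2.0
print(max(residual(k / 10.0, x0, y0) for k in range(0, 31)))
print(pole(x0, y0))
```

The pole shows why the interval of definition has to be determined separately: for $y_0 \neq 0$ the formula only defines a solution on the side of $x_0 - 1/y_0$ that contains $x_0$.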
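2.4 Change of Variables
(2.32)
i.e. $T_x$ is invertible and both $T_x$ and $T_x^{-1}$ are differentiable. Then the first-order initial
value problems
\[ y' = f(x, y), \tag{2.33a} \]
\[ y(x_0) = y_0, \tag{2.33b} \]
and
\[ y' = \big(DT_x^{-1}(y)\big)^{-1} f\big(x, T_x^{-1}(y)\big) + \partial_x T\big(x, T_x^{-1}(y)\big), \tag{2.34a} \]
\[ y(x_0) = T(x_0, y_0), \tag{2.34b} \]
\[ \forall_{y \in T_x(G_x)}\quad \big(DT_x^{-1}(y)\big)^{-1} = DT_x\big(T_x^{-1}(y)\big). \tag{2.37} \]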
xI
(2.38)
(x) = DT x, (x)
(x)
(2.39)
xI
If $T_x$ is a continuously differentiable map, then this is related to the inverse function theorem (see,
e.g. [Phi13b, Cor. C.9]); it is still true if $T_x$ is merely continuous and injective, but then it is the
invariance of domain theorem of algebraic topology [Oss09, 5.6.15], which is equivalent to the Brouwer
fixed-point theorem [Oss09, 5.6.10], and is much harder to prove.
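\[ \mu'(x) \overset{(2.39),(2.33a)}{=} DT_x\big(\varphi(x)\big)\, f\big(x, \varphi(x)\big) + \partial_x T\big(x, \varphi(x)\big) \overset{(2.38)}{=} DT_x\big(T_x^{-1}(\mu(x))\big)\, f\big(x, T_x^{-1}(\mu(x))\big) + \partial_x T\big(x, T_x^{-1}(\mu(x))\big) \overset{(2.37)}{=} \big(DT_x^{-1}(\mu(x))\big)^{-1} f\big(x, T_x^{-1}(\mu(x))\big) + \partial_x T\big(x, T_x^{-1}(\mu(x))\big), \tag{2.40} \]
Using (2.38), one can subtract the second summand from (2.41). Multiplying the result
by $DT_x^{-1}\big(\mu(x)\big)$ from the left and taking into account (2.37) then provides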
xI
(2.38)
= f x, (x) ,
(2.42)
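\[ y' = f(x, y) := a(x)\,y + b(x)\,y^{\alpha}, \tag{2.43a} \]
where $\alpha \in \mathbb{R} \setminus \{0, 1\}$, the functions $a, b : I \to \mathbb{R}$ are continuous and defined on an open
interval $I \subseteq \mathbb{R}$, and $f : I \times \mathbb{R}^+ \to \mathbb{R}$. For (2.43a), we add the initial condition
\[ y(x_0) = y_0, \quad (x_0, y_0) \in I \times \mathbb{R}^+, \tag{2.43b} \]
and, furthermore, we also consider the corresponding linear initial value problem
\[ y' = (1 - \alpha)\big(a(x)\,y + b(x)\big), \tag{2.44a} \]
\[ y(x_0) = y_0^{1-\alpha}, \tag{2.44b} \]
with its unique solution $\psi : I \to \mathbb{R}$ given by Th. 2.3.
\[ T(x, y) := y^{1-\alpha}. \tag{2.46} \]
\[ \forall_{x \in I}\quad T_x = S, \quad S : \mathbb{R}^+ \to \mathbb{R}^+, \quad S(y) := y^{1-\alpha}, \tag{2.47} \]
$S^{-1}(y) = y^{\frac{1}{1-\alpha}}$, $DS^{-1}(y) = (S^{-1})'(y) = \frac{1}{1-\alpha}\, y^{\frac{\alpha}{1-\alpha}}$. Thus, (2.34a) takes the form
\[ y' = \big(DT_x^{-1}(y)\big)^{-1} f\big(x, T_x^{-1}(y)\big) + \partial_x T\big(x, T_x^{-1}(y)\big) = (1 - \alpha)\, y^{-\frac{\alpha}{1-\alpha}} \Big( a(x)\, y^{\frac{1}{1-\alpha}} + b(x)\, y^{\frac{\alpha}{1-\alpha}} \Big) + 0 = (1 - \alpha)\big(a(x)\,y + b(x)\big). \tag{2.48} \]
Thus, if $I' \subseteq I$ is such that $x_0 \in I'$ and $\psi > 0$ on $I'$, then Th. 2.10 says $\varphi$ defined
by (2.45) must be a solution to (2.43) (note that the differentiability of $\psi$ implies the
differentiability of $\varphi$). On the other hand, if $\lambda : I' \to \mathbb{R}^+$ is an arbitrary solution to
(2.43), then Th. 2.10 states $\mu := S \circ \lambda = \lambda^{1-\alpha}$ to be a solution to (2.44). The uniqueness
part of Th. 2.3 then yields $\lambda^{1-\alpha} = \psi|_{I'} = \varphi^{1-\alpha}$, i.e. $\lambda = \varphi$.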
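The reduction of (2.43) to the linear problem (2.44) can be tried on a concrete case. The sketch below is an illustration under assumptions: it reads (2.43a) as $y' = a(x)\,y + b(x)\,y^{\alpha}$ (the form consistent with (2.48)), picks the hypothetical data $a \equiv 1$, $b \equiv -1$, $\alpha = 2$, solves the linear problem in closed form, and transforms back via $\varphi = \psi^{1/(1-\alpha)}$.

```python
import math

ALPHA = 2.0  # illustrative exponent; with a = 1, b = -1 the ODE is y' = y - y**ALPHA

def psi(x, x0, y0):
    """Solution of the linear problem (2.44): z' = (1 - ALPHA)(z - 1), z(x0) = y0**(1 - ALPHA)."""
    return 1.0 + (y0 ** (1.0 - ALPHA) - 1.0) * math.exp((1.0 - ALPHA) * (x - x0))

def phi(x, x0, y0):
    """Back-transformation phi = psi**(1 / (1 - ALPHA)); requires psi > 0."""
    return psi(x, x0, y0) ** (1.0 / (1.0 - ALPHA))

def residual(x, x0, y0, h=1e-6):
    """|phi'(x) - (phi(x) - phi(x)**ALPHA)| with a central difference for phi'."""
    slope = (phi(x + h, x0, y0) - phi(x - h, x0, y0)) / (2.0 * h)
    return abs(slope - (phi(x, x0, y0) - phi(x, x0, y0) ** ALPHA))

x0, y0 = 0.0, 0.5
print(phi(x0, x0, y0))
print(max(residual(k / 10.0, x0, y0) for k in range(0, 31)))
```

For these data $\psi(x) = 1 + e^{-x}$ stays positive on all of $\mathbb{R}$, so the back-transformation is defined everywhere; for other data the interval $I'$ on which $\psi > 0$ has to be determined first.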
Example 2.12. Consider the initial value problem
\[ y' = f(x, y) := i - \frac{1}{ix - y + 2}, \tag{2.49a} \]
\[ y(1) = i, \tag{2.49b} \]
\[ T(x, y) := ix - y. \tag{2.50} \]
(2.51)
\[ T_x(y) = ix - y, \tag{2.52a} \]
\[ T_x^{-1}(y) = ix - y. \tag{2.52b} \]
\[ y' = \frac{1}{y + 2}, \tag{2.54a} \]
\[ y(1) = T(1, i) = i - i = 0. \tag{2.54b} \]
\[ \mu(x) := \sqrt{2x + 2} - 2, \tag{2.55} \]
\[ \varphi(x) := T_x^{-1}\big(\mu(x)\big) = ix - \sqrt{2x + 2} + 2, \tag{2.56} \]
is a solution to (2.49) (that $\varphi$ is a solution to (2.49) can now also easily be checked
directly). It will become clear from Th. 3.15 below that $\mu$ and $\varphi$ are also the unique
solutions to their respective initial value problems.
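The direct check mentioned above can also be done numerically with complex arithmetic. This is a minimal sketch assuming the readings $f(x,y) = i - 1/(ix - y + 2)$, $\mu(x) = \sqrt{2x+2} - 2$ and $\varphi(x) = ix - \mu(x)$ of (2.49), (2.55) and (2.56); the sample points are arbitrary values with $x > -1$.

```python
import math

def mu(x):
    """Solution (2.55) of the transformed problem y' = 1/(y + 2), y(1) = 0."""
    return math.sqrt(2.0 * x + 2.0) - 2.0

def phi(x):
    """Back-transformed candidate (2.56): T_x^{-1}(mu(x)) = i x - mu(x)."""
    return 1j * x - mu(x)

def f(x, y):
    """Right-hand side of (2.49a)."""
    return 1j - 1.0 / (1j * x - y + 2.0)

def residual(x, h=1e-6):
    """|phi'(x) - f(x, phi(x))| with a central difference for phi'."""
    slope = (phi(x + h) - phi(x - h)) / (2.0 * h)
    return abs(slope - f(x, phi(x)))

print(phi(1.0))
print(max(residual(x) for x in (0.0, 0.5, 1.0, 2.0, 5.0)))
```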
Finding a suitable change of variables to transform a given ODE such that one is in a
position to solve the transformed ODE is an art, i.e. it can be very difficult to spot a
useful transformation, and it takes a lot of practice and experience.
Remark 2.13. Somewhat analogous to the situation described in the paragraph before
(2.27) regarding the separation of variables technique, in practice, one frequently uses a
heuristic procedure to apply a change of variables, rather than appealing to the rigorous
Th. 2.10. For the initial value problem $y' = f(x, y)$, $y(x_0) = y_0$, this heuristic procedure
proceeds as follows:
(1) One introduces the new variable $z := T(x, y)$ and then computes $z'$, i.e. the derivative of the function $x \mapsto z(x) = T(x, y(x))$.
(2) In the result of (1), one eliminates all occurrences of the variable $y$ by first replacing
$y'$ by $f(x, y)$ and then replacing $y$ by $T_x^{-1}(z)$, where $T_x(y) := T(x, y) = z$ (i.e. one has
to solve the equation $z = T(x, y)$ for $y$). One thereby obtains the transformed initial
value problem $z' = g(x, z)$, $z(x_0) = T(x_0, y_0)$, with a suitable function $g$.
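\[ f : \mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}, \quad f(x, y) := 1 + \frac{y}{x} + \frac{y^2}{x^2}, \tag{2.57} \]
\[ y(1) = 0. \tag{2.58} \]
We introduce the change of variables $z := T(x, y) := y/x$ and proceed according to the
steps of Rem. 2.13. According to (1), we compute, using the quotient rule,
\[ z'(x) = \frac{y'(x)\,x - y(x)}{x^2}. \tag{2.59} \]
According to (2), we replace $y'(x)$ by $f(x, y)$ and then replace $y$ by $T_x^{-1}(z) = xz$ to
obtain the transformed initial value problem
\[ z' = \frac{1}{x}\left(1 + \frac{y}{x} + \frac{y^2}{x^2}\right) - \frac{y}{x^2} = \frac{1}{x}\,(1 + z + z^2) - \frac{z}{x} = \frac{1 + z^2}{x}, \quad z(1) = 0/1 = 0. \tag{2.60} \]
According to (3), we next solve (2.60), e.g. by separation of variables, to obtain the
solution
\[ \mu : \left]e^{-\frac{\pi}{2}}, e^{\frac{\pi}{2}}\right[ \to \mathbb{R}, \quad \mu(x) := \tan \ln x, \tag{2.61} \]
of (2.60), and
\[ \varphi : \left]e^{-\frac{\pi}{2}}, e^{\frac{\pi}{2}}\right[ \to \mathbb{R}, \quad \varphi(x) := x\,\mu(x) = x \tan \ln x, \tag{2.62} \]
as a candidate for a solution to (2.58). Finally, according to (4), we check that $\varphi$ is,
indeed, a solution to (2.58): Due to $\varphi(1) = 1 \cdot \tan 0 = 0$, $\varphi$ satisfies the initial condition,
and due to
\[ \varphi'(x) = \tan \ln x + x\,\frac{1}{x}\,(1 + \tan^2 \ln x) = 1 + \tan \ln x + \tan^2 \ln x = 1 + \frac{\varphi(x)}{x} + \frac{\varphi^2(x)}{x^2}, \tag{2.63} \]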
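The final check of step (4) can be mirrored numerically. The sketch below assumes the readings $f(x,y) = 1 + y/x + y^2/x^2$ of (2.57) and $\varphi(x) = x \tan \ln x$ on $]e^{-\pi/2}, e^{\pi/2}[$; the sample points are arbitrary values inside that interval.

```python
import math

def phi(x):
    """Candidate x * tan(ln x), defined for exp(-pi/2) < x < exp(pi/2)."""
    return x * math.tan(math.log(x))

def f(x, y):
    """Right-hand side (2.57)."""
    return 1.0 + y / x + (y / x) ** 2

def residual(x, h=1e-6):
    """|phi'(x) - f(x, phi(x))| with a central difference for phi'."""
    slope = (phi(x + h) - phi(x - h)) / (2.0 * h)
    return abs(slope - f(x, phi(x)))

lower, upper = math.exp(-math.pi / 2.0), math.exp(math.pi / 2.0)
print(lower, upper)
print(phi(1.0), max(residual(x) for x in (0.5, 1.0, 2.0, 3.0)))
```

Near the endpoints of the interval the tangent blows up, so the finite difference check is only meaningful at points well inside it.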
3 General Theory
3.1 Equivalence Between Higher-Order ODE and Systems of First-Order ODE
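\[ F(x, y, y', \dots, y^{(k)}) = 0, \tag{3.1a} \]
\[ \forall_{j \in \{0,\dots,k-1\}}\quad y^{(j)}(x_0) = y_{0,j}, \tag{3.1b} \]
\[ y_1' - y_2 = 0, \quad \dots, \quad y_{k-1}' - y_k = 0, \quad F(x, y_1, \dots, y_k, y_k') = 0, \tag{3.2a} \]
\[ y(x_0) = \begin{pmatrix} y_{0,0} \\ \vdots \\ y_{0,k-1} \end{pmatrix} \tag{3.2b} \]
(note that the unknown function $y$ in (3.1) is $\mathbb{K}^n$-valued, whereas the unknown function
$y$ in (3.2) is $\mathbb{K}^{kn}$-valued). Then both initial value problems are equivalent in the following
sense:
(a) If $\varphi : I \to \mathbb{K}^n$ is a solution to (3.1), then
\[ \Phi : I \to \mathbb{K}^{kn}, \quad \Phi := \begin{pmatrix} \varphi \\ \varphi' \\ \vdots \\ \varphi^{(k-1)} \end{pmatrix} \tag{3.3} \]
is a solution to (3.2).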
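Proof. We rewrite (3.2a) as
\[ G(x, y, y') = 0, \tag{3.4} \]
where
\[ \begin{aligned} &G : V \to \mathbb{K}^{kn}, \\ &V := \big\{ (x, y, z) \in \mathbb{R} \times \mathbb{K}^{kn} \times \mathbb{K}^{kn} : (x, y, z_k) \in U \big\} \subseteq \mathbb{R} \times \mathbb{K}^{kn} \times \mathbb{K}^{kn}, \\ &G_1(x, y, z) := z_1 - y_2, \\ &G_2(x, y, z) := z_2 - y_3, \\ &\quad\vdots \\ &G_{k-1}(x, y, z) := z_{k-1} - y_k, \\ &G_k(x, y, z) := F(x, y, z_k). \end{aligned} \tag{3.5} \]
(a): As a solution to (3.1), $\varphi$ is $k$ times differentiable and $\Phi$ is well-defined. Then (3.1b)
implies (3.2b), since
\[ \Phi(x_0) = \begin{pmatrix} \varphi(x_0) \\ \vdots \\ \varphi^{(k-1)}(x_0) \end{pmatrix} \overset{(3.1b)}{=} \begin{pmatrix} y_{0,0} \\ \vdots \\ y_{0,k-1} \end{pmatrix}. \]
Next, Def. 1.2(a)(i) for $\varphi$ implies Def. 1.2(a)(i) for $\Phi$, since
\[ \big\{ \big(x, \Phi(x), \Phi'(x)\big) \in I \times \mathbb{K}^{kn} \times \mathbb{K}^{kn} : x \in I \big\} \overset{(3.3)}{=} \big\{ \big(x, \varphi(x), \varphi'(x), \dots, \varphi^{(k-1)}(x), \varphi'(x), \dots, \varphi^{(k)}(x)\big) \in I \times \mathbb{K}^{kn} \times \mathbb{K}^{kn} : x \in I \big\} \overset{(x, \varphi(x), \dots, \varphi^{(k)}(x)) \in U}{\subseteq} V. \tag{3.6} \]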
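\[ \forall_{j \in \{1,\dots,k-1\}}\ \forall_{x \in I}\quad G_j\big(x, \Phi(x), \Phi'(x)\big) = \Phi_j'(x) - \Phi_{j+1}(x) = 0, \]
\[ \forall_{x \in I}\quad G_k\big(x, \Phi(x), \Phi'(x)\big) = F\big(x, \varphi(x), \varphi'(x), \dots, \varphi^{(k)}(x)\big) \overset{(3.1a)}{=} 0, \]
\[ \forall_{j \in \{1,\dots,k-1\}}\quad \Phi_{j+1} = \Phi_j' \overset{\Phi_1 = \varphi}{=} \varphi^{(j)}, \tag{3.7} \]
i.e. $\varphi$ is $k$ times differentiable and $\Phi$ has, once again, the form (3.3) (note $\Phi_1 = \varphi$ by
the definition of $\varphi$). Then, clearly, (3.2b) implies (3.1b), and Def. 1.2(a)(i) for $\Phi$ implies
Def. 1.2(a)(i) for $\varphi$:
\[ \big\{ \big(x, \Phi(x), \Phi'(x)\big) \in I \times \mathbb{K}^{kn} \times \mathbb{K}^{kn} : x \in I \big\} \subseteq V \]
and the definition of $V$ in (3.5) imply
\[ \big\{ \big(x, \varphi(x), \dots, \varphi^{(k)}(x)\big) \in I \times \mathbb{K}^{(k+1)n} : x \in I \big\} \subseteq U. \]
\[ \forall_{x \in I}\quad F\big(x, \varphi(x), \dots, \varphi^{(k)}(x)\big) \overset{(3.3),(3.5)}{=} G_k\big(x, \Phi(x), \Phi'(x)\big) = 0, \]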
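\[ y'' = -y, \quad y(0) = 0, \quad y'(0) = r, \quad r \in \mathbb{R} \text{ given}, \tag{3.8} \]
\[ y_1' = y_2, \quad y_2' = -y_1, \quad y(0) = \begin{pmatrix} 0 \\ r \end{pmatrix}, \tag{3.9} \]
\[ \Phi : \mathbb{R} \to \mathbb{R}^2, \quad \Phi(x) = \begin{pmatrix} \Phi_1(x) \\ \Phi_2(x) \end{pmatrix} = \begin{pmatrix} r \sin x \\ r \cos x \end{pmatrix}, \tag{3.10} \]
\[ \varphi(x) = r \sin x. \tag{3.11} \]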
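The practical value of Th. 3.1 is that a solver written for first-order systems also handles higher-order equations. The sketch below illustrates this under assumptions: it takes the second-order problem to be $y'' = -y$, $y(0) = 0$, $y'(0) = r$ (consistent with the solution $r \sin x$ in (3.11)), rewrites it as $y_1' = y_2$, $y_2' = -y_1$, and integrates with the classical Runge-Kutta method of order four, which is a choice made here and not a method from the notes.

```python
import math

def rhs(x, y):
    """First-order system y1' = y2, y2' = -y1, equivalent to y'' = -y with y1 = y, y2 = y'."""
    return (y[1], -y[0])

def rk4(f, x0, y0, x_end, steps):
    """Classical fourth-order Runge-Kutta for a first-order system; returns y(x_end)."""
    h = (x_end - x0) / steps
    x, y = x0, tuple(y0)
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, tuple(a + h / 2 * b for a, b in zip(y, k1)))
        k3 = f(x + h / 2, tuple(a + h / 2 * b for a, b in zip(y, k2)))
        k4 = f(x + h, tuple(a + h * b for a, b in zip(y, k3)))
        y = tuple(a + h / 6 * (p + 2 * q + 2 * s + t)
                  for a, p, q, s, t in zip(y, k1, k2, k3, k4))
        x += h
    return y

r = 2.0
y1, y2 = rk4(rhs, 0.0, (0.0, r), 1.0, 200)
print(y1, r * math.sin(1.0))   # first component approximates phi(1) = r sin 1
print(y2, r * math.cos(1.0))   # second component approximates phi'(1) = r cos 1
```

The second component comes for free: it is the derivative of the solution of the original second-order problem.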
As a consequence of Th. 3.1, one can carry out much of the general theory of ODE
(such as results regarding existence and uniqueness of solutions) for systems of first-order ODE, obtaining the corresponding results for higher-order ODE as a corollary.
This is the strategy usually pursued in the literature and we will follow suit in this
class.
3.2 Existence of Solutions
It is a rather remarkable fact that, under the very mild assumption that $f : G \to \mathbb{K}^n$ is
a continuous function defined on an open subset $G$ of $\mathbb{R} \times \mathbb{K}^{kn}$ with $(x_0, y_{0,0}, \dots, y_{0,k-1}) \in G$,
every initial value problem (1.7) for the $n$-dimensional explicit $k$th-order ODE (1.6)
has at least one solution $\varphi : I \to \mathbb{K}^n$, defined on a, possibly very small, open interval.
This is the content of the Peano Th. 3.8 below and its Cor. 3.10. From Example 1.4(b),
we already know that uniqueness of the solution cannot be expected without stronger
hypotheses.
The proof of the Peano theorem requires some work. One of the key ingredients is
the Arzelà-Ascoli Th. 3.7 that, under suitable hypotheses, guarantees a given sequence
of continuous functions to have a uniformly convergent subsequence (the formulation
in Th. 3.7 is suitable for our purposes; many different variants of the Arzelà-Ascoli
theorem exist in the literature).
We begin with some preliminaries from the theory of metric spaces. At this point, the
reader might want to review the definition of a metric, a metric space, and basic notions
on metric spaces, such as the notion of compactness and the notion of continuity of
functions between metric spaces. Also recall that every normed space is a metric space
via the metric induced by the norm (in particular, if we use metric notions on normed
spaces, they are always meant with respect to the respective induced metric). If you
are not sufficiently familiar with metrics and norms, you might want to consult the
relevant subsections of [Phi13b, Sec. 1]; for compactness and some related results see,
e.g., Appendix C.2.
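Notation 3.3. Let $(X, d)$ be a metric space. Given $x \in X$ and $r \in \mathbb{R}^+$, let
\[ B_r(x) := \{ y \in X : d(x, y) < r \} \]
denote the open ball with center $x$ and radius $r$, also known as the $r$-ball with center $x$.
Definition 3.4. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. We say a sequence of functions $(f_m)_{m \in \mathbb{N}}$, $f_m : X \to Y$, converges uniformly to a function $f : X \to Y$ if, and
only if,
\[ \forall_{\epsilon > 0}\ \exists_{N \in \mathbb{N}}\ \forall_{m \ge N,\ x \in X}\quad d_Y\big(f_m(x), f(x)\big) < \epsilon. \]
Theorem 3.5. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. If the sequence $(f_m)_{m \in \mathbb{N}}$ of
continuous functions $f_m : X \to Y$ converges uniformly to the function $f : X \to Y$,
then $f$ is continuous as well.
Proof. We have to show that $f$ is continuous at every $\xi \in X$. Thus, let $\xi \in X$ and $\epsilon > 0$.
Due to the uniform convergence, we can choose $m \in \mathbb{N}$ such that $d_Y\big(f_m(x), f(x)\big) < \epsilon/3$
for every $x \in X$. Moreover, as $f_m$ is continuous at $\xi$, there exists $\delta > 0$ such that
$x \in B_\delta(\xi)$ implies $d_Y\big(f_m(\xi), f_m(x)\big) < \epsilon/3$. Thus, if $x \in B_\delta(\xi)$, then
\[ d_Y\big(f(\xi), f(x)\big) \le d_Y\big(f(\xi), f_m(\xi)\big) + d_Y\big(f_m(\xi), f_m(x)\big) + d_Y\big(f_m(x), f(x)\big) < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon, \]
proving $f$ is continuous at $\xi$.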
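Definition 3.6. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces and let $\mathcal{F}$ be a set of functions
from $X$ into $Y$. Then the set $\mathcal{F}$ (or the functions in $\mathcal{F}$) are said to be uniformly
equicontinuous if, and only if, for each $\epsilon > 0$, there is $\delta > 0$ such that
\[ \forall_{f \in \mathcal{F}}\ \forall_{x, \xi \in X}\quad \Big( d_X(x, \xi) < \delta \ \Rightarrow\ d_Y\big(f(x), f(\xi)\big) < \epsilon \Big). \tag{3.12} \]
Theorem 3.7 (Arzelà-Ascoli). Let $n \in \mathbb{N}$, let $\|\cdot\|$ denote some norm on $\mathbb{K}^n$, and let
$I \subseteq \mathbb{R}$ be some bounded interval. If $(f_m)_{m \in \mathbb{N}}$ is a sequence of functions $f_m : I \to \mathbb{K}^n$
such that $\{f_m : m \in \mathbb{N}\}$ is uniformly equicontinuous and such that, for each $x \in I$, the
sequence $\big(f_m(x)\big)_{m \in \mathbb{N}}$ is bounded, then $(f_m)_{m \in \mathbb{N}}$ has a uniformly convergent subsequence
$(f_{m_j})_{j \in \mathbb{N}}$, i.e. there exists $f : I \to \mathbb{K}^n$ such that
\[ \forall_{\epsilon > 0}\ \exists_{N \in \mathbb{N}}\ \forall_{j \ge N,\ x \in I}\quad \big\| f_{m_j}(x) - f(x) \big\| < \epsilon. \]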
Actually, we construct the $(z_m)_{m \in \mathbb{N}}$ inductively together with the $(F_m)_{m \in \mathbb{N}}$: Since the
sequence $\big(f_m(r_1)\big)_{m \in \mathbb{N}}$ is, by hypothesis, a bounded sequence in $\mathbb{K}^n$, one can apply the
Bolzano-Weierstrass theorem (cf. [Phi13b, Th. 1.16(b)]) to obtain $z_1 \in \mathbb{K}^n$ and a subsequence $F_1 = (f_{1,k})_{k \in \mathbb{N}}$ of $(f_m)_{m \in \mathbb{N}}$ such that $\lim_{k \to \infty} f_{1,k}(r_1) = z_1$. To proceed by
induction, we now assume to have already constructed $F_1, \dots, F_M$ and $z_1, \dots, z_M$ for
$M \in \mathbb{N}$ such that (i) and (ii) hold for each $m \in \{1, \dots, M\}$. Since the sequence
$\big(f_{M,k}(r_{M+1})\big)_{k \in \mathbb{N}}$ is a bounded sequence in $\mathbb{K}^n$, one can, once more, apply the Bolzano-Weierstrass theorem to obtain $z_{M+1} \in \mathbb{K}^n$ and a subsequence $F_{M+1} = (f_{M+1,k})_{k \in \mathbb{N}}$ of
$F_M$ such that $\lim_{k \to \infty} f_{M+1,k}(r_{M+1}) = z_{M+1}$. Since $F_{M+1}$ is a subsequence of $F_M$, it is
also a subsequence of all previous subsequences, i.e. (i) now also holds for $m = M + 1$.
In consequence, $\lim_{k \to \infty} f_{M+1,k}(r_j) = z_j$ for each $j = 1, \dots, M + 1$, such that (ii) now
also holds for $m = M + 1$ as required.
Next, one considers the diagonal sequence $(g_m)_{m \in \mathbb{N}}$, $g_m := f_{m,m}$, and observes that this
sequence converges pointwise at each rational number $r_j$ ($\lim_{m \to \infty} g_m(r_j) = z_j$), since,
at least for $m \ge j$, $(g_m)_{m \in \mathbb{N}}$ is a subsequence of every $F_j$ (exercise); in particular,
$(g_m)_{m \in \mathbb{N}}$ is also a subsequence of the original sequence $(f_m)_{m \in \mathbb{N}}$.
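In the last step of the proof, we show that $(g_m)_{m \in \mathbb{N}}$ converges uniformly on the entire
interval $I$ to some $f : I \to \mathbb{K}^n$. To this end, fix $\epsilon > 0$. Since $\{g_m : m \in \mathbb{N}\} \subseteq \{f_m :
m \in \mathbb{N}\}$, the assumed uniform equicontinuity of $\{f_m : m \in \mathbb{N}\}$ yields $\delta > 0$ such that
\[ \forall_{m \in \mathbb{N}}\ \forall_{x, \xi \in I}\quad \Big( |x - \xi| < \delta \ \Rightarrow\ \big\| g_m(x) - g_m(\xi) \big\| < \frac{\epsilon}{3} \Big). \]
\[ \exists_{K \in \mathbb{N}}\ \forall_{k, l \ge K}\ \forall_{\alpha = 1, \dots, M}\quad \big\| g_k(r_\alpha) - g_l(r_\alpha) \big\| < \frac{\epsilon}{3}. \tag{3.13} \]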
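\[ \forall_{k, l \ge K}\quad \big\| g_k(x) - g_l(x) \big\| \le \big\| g_k(x) - g_k(r_\alpha) \big\| + \big\| g_k(r_\alpha) - g_l(r_\alpha) \big\| + \big\| g_l(r_\alpha) - g_l(x) \big\| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon \quad\text{for } \alpha \in \{1, \dots, M\} \text{ with } |x - r_\alpha| < \delta. \tag{3.14} \]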
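The estimate (3.14) shows $\big(g_m(x)\big)_{m \in \mathbb{N}}$ is a Cauchy sequence for each $x \in I$, and we
can define
\[ f : I \to \mathbb{K}^n, \quad f(x) := \lim_{m \to \infty} g_m(x). \tag{3.15} \]
Since $K$ in (3.14) does not depend on $x \in I$, passing to the limit $k \to \infty$ in the estimate
of (3.14) implies
\[ \forall_{l \ge K,\ x \in I}\quad \big\| g_l(x) - f(x) \big\| \le \epsilon, \]
proving uniform convergence of the subsequence $(g_m)_{m \in \mathbb{N}}$ of $(f_m)_{m \in \mathbb{N}}$ as desired. The
continuity of $f$ is now a consequence of Th. 3.5.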
At this point, we have all preparations in place to state and prove the existence theorem.
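Theorem 3.8 (Peano). If $G \subseteq \mathbb{R} \times \mathbb{K}^n$ is open, $n \in \mathbb{N}$, and $f : G \to \mathbb{K}^n$ is continuous,
then, for each $(x_0, y_0) \in G$, the explicit $n$-dimensional first-order initial value problem
\[ y' = f(x, y), \tag{3.16a} \]
\[ y(x_0) = y_0, \tag{3.16b} \]
has at least one solution. More precisely, given an arbitrary norm $\|\cdot\|$ on $\mathbb{K}^n$, (3.16)
has a solution $\varphi : I \to \mathbb{K}^n$, defined on the open interval
\[ I := ]x_0 - \alpha, x_0 + \alpha[, \tag{3.17} \]
$\alpha = \alpha(b) > 0$, where $b > 0$ is such that
\[ B := \big\{ (x, y) \in \mathbb{R} \times \mathbb{K}^n : |x - x_0| \le b \text{ and } \|y - y_0\| \le b \big\} \subseteq G, \tag{3.18} \]
\[ M := M(b) := \max\{ \|f(x, y)\| : (x, y) \in B \} < \infty, \tag{3.19} \]
and
\[ \alpha := \alpha(b) := \begin{cases} \min\{b, b/M\} & \text{for } M > 0, \\ b & \text{for } M = 0. \end{cases} \tag{3.20} \]
In general, the choice of the norm $\|\cdot\|$ on $\mathbb{K}^n$ will influence the possible sizes of $\alpha$ and,
thus, of $I$.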
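To get a feeling for how small the interval guaranteed by (3.17) to (3.20) can be, one can evaluate $\alpha(b)$ for a concrete right-hand side. The sketch below uses the hypothetical example $f(x, y) = y^2$ with $x_0 = 0$, $y_0 = 1$ in $\mathbb{K} = \mathbb{R}$ with the absolute value as norm, where $M(b) = (1 + b)^2$ can be written down directly; the function names are illustrative.

```python
def alpha(b, M):
    """Half-length of the existence interval according to (3.20)."""
    return min(b, b / M) if M > 0 else b

def M_for_square(b, y0=1.0):
    """M(b) from (3.19) for the hypothetical example f(x, y) = y^2: max of y^2 on |y - y0| <= b."""
    return (abs(y0) + b) ** 2

# Scan b over a grid and pick the value giving the largest guaranteed interval.
candidates = [k / 100.0 for k in range(1, 501)]
best_b = max(candidates, key=lambda b: alpha(b, M_for_square(b)))
print(best_b, alpha(best_b, M_for_square(best_b)))
```

Here $\alpha(b) = b/(1+b)^2$ is maximal at $b = 1$ with value $1/4$, whereas the actual solution $x \mapsto 1/(1 - x)$ of this example exists on $]-\infty, 1[$; the theorem guarantees existence but makes no claim that the interval is optimal.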
Proof. The proof will be conducted in several steps. In the first step, we check $\alpha =
\alpha(b) > 0$ is well-defined: Since $G$ is open, there always exists $b > 0$ such that (3.18)
holds. Since $B$ is a closed and bounded subset of the finite-dimensional space $\mathbb{R} \times \mathbb{K}^n$,
$B$ is compact (cf. [Phi13b, Cor. 3.5]). Since $f$ and, thus, $\|f\|$ is continuous (every norm
is even Lipschitz continuous due to the inverse triangle inequality), it must assume its
maximum on the compact set $B$ (cf. [Phi13b, Th. 3.8]), showing $M \in \mathbb{R}_0^+$ is well-defined
by (3.19) and $\alpha$ is well-defined by (3.20).
In the second step of the proof, we note that it suffices to prove (3.16) has a solution $\varphi_+$,
defined on $[x_0, x_0 + \alpha[$: One can then apply the time reversion Lem. 1.9(b): The proof
providing the solution $\varphi_+$ also provides a solution $\psi_+ : [-x_0, -x_0 + \alpha[ \to \mathbb{K}^n$ to the
time-reversed initial value problem, consisting of $y' = -f(-x, y)$ and $y(-x_0) = y_0$ (note
that the same $M$ and $\alpha$ work for the time-reversed problem). Then, according to Lem.
1.9(b), $\varphi_- : ]x_0 - \alpha, x_0] \to \mathbb{K}^n$, $\varphi_-(x) := \psi_+(-x)$, is a solution to (3.16). According to
Lem. 1.7, we can patch $\varphi_-$ and $\varphi_+$ together to obtain the desired solution
\[ \varphi : I \to \mathbb{K}^n, \quad \varphi(x) := \begin{cases} \varphi_-(x) & \text{for } x \le x_0, \\ \varphi_+(x) & \text{for } x \ge x_0, \end{cases} \tag{3.21} \]
defined on all of $I$. It is noted that one can also conduct the proof with the second step
omitted, but then one has to perform the following steps on all of $I$, which means one
has to consider additional cases in some places.
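In the third step of the proof, we will define a sequence $(\phi_m)_{m \in \mathbb{N}}$ of functions
\[ \phi_m : I_+ \to \mathbb{K}^n, \quad I_+ := [x_0, x_0 + \alpha]. \tag{3.22} \]
The construction uses that the continuous function $f$ is uniformly continuous on the compact set $B$:
\[ \forall_{m \in \mathbb{N}} \; \exists_{\delta_m > 0} \; \forall_{(x,y),(\tilde{x},\tilde{y}) \in B} \quad \Big( |x - \tilde{x}| < \delta_m, \; \|y - \tilde{y}\| < \delta_m \;\Rightarrow\; \|f(x, y) - f(\tilde{x}, \tilde{y})\| < \frac{1}{m} \Big). \tag{3.23} \]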
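We now form what is called a discretization of the interval $I_+$, i.e. a partition of $I_+$ into sufficiently many small intervals: Let $N \in \mathbb{N}$ and
\[ x_0 < x_1 < \dots < x_{N-1} < x_N := x_0 + \alpha \tag{3.24} \]
such that
\[ \forall_{j \in \{1,\dots,N\}} \quad x_j - x_{j-1} < \beta := \dots \tag{3.25} \]
(for example one could make the equidistant choice $x_j := x_0 + jh$ with $h = \alpha/N$ and $N > \alpha/\beta$, but it does not matter how the $x_j$ are defined as long as (3.24) and (3.25) both hold). Note that we get a different discretization of $I_+$ for each $m \in \mathbb{N}$; however, the dependence on $m$ is suppressed in the notation for the sake of readability. We now define recursively
\[ \phi_m : I_+ \to \mathbb{K}^n, \quad \phi_m(x_0) := y_0, \quad \phi_m(x) := \phi_m(x_j) + (x - x_j)\, f\big(x_j, \phi_m(x_j)\big) \;\text{ for each } x \in [x_j, x_{j+1}], \; j \in \{0,\dots,N-1\}. \tag{3.26} \]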
Note that there is no conflict between the two definitions given for $x = x_j$ with $j \in \{1, \dots, N-1\}$. Each function $\phi_m$ defines a polygon in $\mathbb{K}^n$. This construction is known as Euler's method, and it can be used to obtain numerical approximations to the solution of the initial value problem (while simple, this method is not very efficient). We still need to verify that the definition (3.26) does actually make sense: We need to check that $f$ can, indeed, be applied to $(x_j, \phi_m(x_j))$, i.e. we have to check $(x_j, \phi_m(x_j)) \in G$. We can actually show the stronger statement
\[ \forall_{x \in I_+} \quad (x, \phi_m(x)) \in B, \tag{3.27} \]
where $B$ is as defined in (3.18). First, it is pointed out that (3.20) implies $\alpha \le b$, such that $x \in I_+$ implies $|x - x_0| \le \alpha \le b$ as required in (3.18). One can now prove (3.27) by showing by induction on $j \in \{0, \dots, N-1\}$:
\[ \forall_{x \in [x_j, x_{j+1}]} \quad (x, \phi_m(x)) \in B. \tag{3.28} \]
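To start the induction, note $\phi_m(x_0) = y_0$ and $(x_0, y_0) \in B$ by (3.18). Now let $j \in \{0, \dots, N-1\}$ and $x \in [x_j, x_{j+1}]$. We estimate
\[ \begin{aligned} \|\phi_m(x) - y_0\| &\le \|\phi_m(x) - \phi_m(x_j)\| + \sum_{k=1}^{j} \|\phi_m(x_k) - \phi_m(x_{k-1})\| \\ &\overset{(3.26)}{=} (x - x_j)\, \big\|f\big(x_j, \phi_m(x_j)\big)\big\| + \sum_{k=1}^{j} (x_k - x_{k-1})\, \big\|f\big(x_{k-1}, \phi_m(x_{k-1})\big)\big\| \\ &\overset{(*)}{\le} (x - x_j)\, M + \sum_{k=1}^{j} (x_k - x_{k-1})\, M = (x - x_0)\, M \le \alpha M \overset{(3.20)}{\le} b, \end{aligned} \tag{3.29} \]
where, at $(*)$, it was used that $(x_k, \phi_m(x_k)) \in B$ for each $k \in \{0, \dots, j\}$ by the induction hypothesis.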
In the fourth step of the proof, we establish several properties of the functions $\phi_m$. The first two properties are immediate from (3.26), namely that $\phi_m$ is continuous on $I_+$ and differentiable at each $x \in \,]x_j, x_{j+1}[$, $j \in \{0, \dots, N-1\}$, where
\[ \forall_{j \in \{0,\dots,N-1\}} \; \forall_{x \in ]x_j, x_{j+1}[} \quad \phi_m'(x) = f\big(x_j, \phi_m(x_j)\big). \tag{3.30} \]
Moreover,
\[ \forall_{s,t \in I_+} \quad \|\phi_m(t) - \phi_m(s)\| \le |t - s|\, M. \tag{3.31} \]
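To prove (3.31), we may assume $s < t$ without loss of generality. If $s, t \in [x_j, x_{j+1}]$, $j \in \{0, \dots, N-1\}$, then
\[ \begin{aligned} \|\phi_m(t) - \phi_m(s)\| &\overset{(3.26)}{=} \big\| \phi_m(x_j) + (t - x_j)\, f\big(x_j, \phi_m(x_j)\big) - \phi_m(x_j) - (s - x_j)\, f\big(x_j, \phi_m(x_j)\big) \big\| \\ &= |t - s|\, \big\|f\big(x_j, \phi_m(x_j)\big)\big\| \overset{(3.19)}{\le} |t - s|\, M \end{aligned} \tag{3.32a} \]
as desired. If $s, t$ are not contained in the same interval $[x_j, x_{j+1}]$, then fix $j < k$ such that $s \in [x_j, x_{j+1}]$ and $t \in [x_k, x_{k+1}]$. Then (3.31) follows from an estimate analogous to the one in (3.29):
\[ \begin{aligned} \|\phi_m(t) - \phi_m(s)\| &\le \|\phi_m(s) - \phi_m(x_{j+1})\| + \sum_{l=j+1}^{k-1} \|\phi_m(x_l) - \phi_m(x_{l+1})\| + \|\phi_m(t) - \phi_m(x_k)\| \\ &\overset{(3.32a)}{\le} |s - x_{j+1}|\, M + \sum_{l=j+1}^{k-1} |x_l - x_{l+1}|\, M + |t - x_k|\, M = |t - s|\, M, \end{aligned} \tag{3.32b} \]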
completing the proof of (3.31). The following property of the $\phi_m$ is the justification for calling them approximate solutions to our initial value problem (3.16):
\[ \forall_{j \in \{0,\dots,N-1\}} \; \forall_{x \in ]x_j, x_{j+1}[} \quad \big\|\phi_m'(x) - f\big(x, \phi_m(x)\big)\big\| < \frac{1}{m}. \tag{3.33} \]
(3.31)
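and
\[ \big\|\phi_m'(x) - f\big(x, \phi_m(x)\big)\big\| = \big\|f\big(x_j, \phi_m(x_j)\big) - f\big(x, \phi_m(x)\big)\big\| \overset{(3.34a),(3.23)}{<} \frac{1}{m}, \tag{3.34b} \]
proving (3.33).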
xI+
(3.35)
which says that the m are pointwise and even uniformly bounded.
In the fifth and last step of the proof, we use the Arzelà-Ascoli Th. 3.7 to obtain a function $\phi_+ : I_+ \to \mathbb{K}^n$, and we show that $\phi_+$ constitutes a solution to (3.16). According to (3.31), the $\phi_m$ are uniformly equicontinuous (given $\epsilon > 0$, condition (3.12) is satisfied with $\delta := \epsilon/M$ for $M > 0$ and arbitrary $\delta > 0$ for $M = 0$), and according to (3.35) the $\phi_m$ are bounded such that the Arzelà-Ascoli Th. 3.7 applies to yield a subsequence $(\phi_{m_j})_{j \in \mathbb{N}}$ of $(\phi_m)_{m \in \mathbb{N}}$ converging uniformly to some continuous function $\phi_+ : I_+ \to \mathbb{K}^n$. So it merely remains to verify that $\phi_+$ is a solution to (3.16).
As the uniform convergence of the $(\phi_{m_j})_{j \in \mathbb{N}}$ implies pointwise convergence, we have $\phi_+(x_0) = \lim_{j \to \infty} \phi_{m_j}(x_0) = y_0$, showing $\phi_+$ satisfies the initial condition (3.16b). Next,
\[ \forall_{x \in I_+} \quad (x, \phi_+(x)) \in B, \]
since each $(x, \phi_{m_j}(x))$ is in $B$ and $B$ is closed. In particular, $f(x, \phi_+(x))$ is well-defined for each $x \in I_+$.
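To prove that $\phi_+$ also satisfies the ODE (3.16a), by Th. 1.5, it suffices to show
\[ \forall_{x \in I_+} \quad \phi_+(x) - \phi_+(x_0) - \int_{x_0}^{x} f\big(t, \phi_+(t)\big)\, dt = 0. \tag{3.36} \]
Fixing $x \in I_+$ and using the triangle inequality once more, one obtains
\[ \begin{aligned} \Big\| \phi_+(x) - \phi_+(x_0) - \int_{x_0}^{x} f\big(t, \phi_+(t)\big)\, dt \Big\| &\le \|\phi_+(x) - \phi_{m_j}(x)\| + \Big\| \phi_{m_j}(x) - \phi_+(x_0) - \int_{x_0}^{x} f\big(t, \phi_{m_j}(t)\big)\, dt \Big\| \\ &\quad + \Big\| \int_{x_0}^{x} \Big( f\big(t, \phi_{m_j}(t)\big) - f\big(t, \phi_+(t)\big) \Big)\, dt \Big\|, \end{aligned} \tag{3.37} \]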
holding for every $j \in \mathbb{N}$. We will conclude the proof by showing that all three summands on the right-hand side of (3.37) tend to 0 for $j \to \infty$. As already mentioned above, the uniform convergence of the $(\phi_{m_j})_{j \in \mathbb{N}}$ implies pointwise convergence, implying the convergence of the first summand. We tackle the third summand next, using
\[ \Big\| \int_{x_0}^{x} \Big( f\big(t, \phi_{m_j}(t)\big) - f\big(t, \phi_+(t)\big) \Big)\, dt \Big\| \le \int_{x_0}^{x} \big\| f\big(t, \phi_{m_j}(t)\big) - f\big(t, \phi_+(t)\big) \big\|\, dt, \tag{3.38} \]
f
t,
(t)
f
t,
(t)
dt
mj
+
jK
x0
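thereby establishing the convergence of the third summand from the right-hand side of (3.37). For the remaining second summand, we note that the fact that each $\phi_m$ is continuous and piecewise differentiable (with piecewise constant derivative) allows one to apply the fundamental theorem of calculus in the form [Phi13a, Th. G.6(b)] to obtain
\[ \forall_{x \in I_+} \quad \phi_m(x) = \phi_m(x_0) + \int_{x_0}^{x} \phi_m'(t)\, dt. \tag{3.39} \]
Using (3.39) in the second summand of the right-hand side of (3.37), together with $\phi_{m_j}(x_0) = y_0 = \phi_+(x_0)$, provides
\[ \Big\| \phi_{m_j}(x) - \phi_+(x_0) - \int_{x_0}^{x} f\big(t, \phi_{m_j}(t)\big)\, dt \Big\| \le \int_{x_0}^{x} \big\| \phi_{m_j}'(t) - f\big(t, \phi_{m_j}(t)\big) \big\|\, dt \overset{(3.33)}{\le} \int_{x_0}^{x} \frac{1}{m_j}\, dt \le \frac{\alpha}{m_j} \to 0 \quad \text{for } j \to \infty, \]
showing the convergence of the second summand, which finally concludes the proof.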
Corollary 3.9. If $G \subseteq \mathbb{R} \times \mathbb{K}^n$ is open, $n \in \mathbb{N}$, $f : G \to \mathbb{K}^n$ is continuous, and $C \subseteq G$ is compact, then there exists $\alpha > 0$ such that, for each $(x_0, y_0) \in C$, the explicit $n$-dimensional first-order initial value problem (3.16) has a solution $\phi : I \to \mathbb{K}^n$, defined on the open interval $I := \,]x_0 - \alpha, x_0 + \alpha[$, i.e. always on an interval of the same length $2\alpha$.
Proof. Exercise.
j{0,...,k1}
(3.40b)
has at least one solution. More precisely, there exists an open interval $I \subseteq \mathbb{R}$ with $x_0 \in I$ and $\phi : I \to \mathbb{K}^n$ such that $\phi$ is a solution to (3.40). If $C \subseteq G$ is compact, then there exists $\alpha > 0$ such that, for each $(x_0, y_{0,0}, \dots, y_{0,k-1}) \in C$, (3.40) has a solution $\phi : I \to \mathbb{K}^n$, defined on the open interval $I := \,]x_0 - \alpha, x_0 + \alpha[$, i.e. always on an interval of the same length $2\alpha$.
Proof. If $f$ is continuous, then the right-hand side of the equivalent first-order system (3.2a) (written in explicit form) is given by the continuous function
\[ \tilde{f} : G \to \mathbb{K}^{kn}, \quad \tilde{f}(x, y_1, \dots, y_k) := \begin{pmatrix} y_2 \\ y_3 \\ \vdots \\ y_k \\ f(x, y_1, \dots, y_k) \end{pmatrix}. \tag{3.41} \]
Thus, Th. 3.8 provides a solution $\Phi : I \to \mathbb{K}^{kn}$ to (3.2) and, then, Th. 3.1(b) yields $\phi := \Phi_1$ to be a solution to (3.40). Moreover, if $C \subseteq G$ is compact, then Cor. 3.9 provides $\alpha > 0$ such that, for each $(x_0, y_{0,0}, \dots, y_{0,k-1}) \in C$, (3.2) has a solution $\Phi : I \to \mathbb{K}^{kn}$, defined on the same open interval $I := \,]x_0 - \alpha, x_0 + \alpha[$. In particular, $\phi := \Phi_1$, the corresponding solution to (3.40), is also defined on the same $I$.
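The reduction used in this proof, passing from a $k$th-order equation to a first-order system whose right-hand side shifts the components and appends $f$, can be sketched in Python as follows. The helper name `first_order_system` and the scalar example $y'' = -y$ are assumptions chosen here for illustration only.

```python
def first_order_system(f, k):
    """Return F with F(x, (y1, ..., yk)) = (y2, ..., yk, f(x, y1, ..., yk)).

    This mirrors the right-hand side of the equivalent first-order system
    for the kth-order equation y^(k) = f(x, y, y', ..., y^(k-1)).
    """
    def F(x, y):
        if len(y) != k:
            raise ValueError("state must have exactly k components")
        return tuple(y[1:]) + (f(x, *y),)
    return F


# Hypothetical scalar example (n = 1, k = 2): y'' = -y.
F = first_order_system(lambda x, y, yp: -y, 2)
value = F(0.0, (1.0, 3.0))  # (y2, f(x, y1, y2))
```

The first component of any solution of the system then plays the role of the solution of the original higher-order equation.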
While the Peano theorem is striking in its generality, it does have several drawbacks: (a) the interval on which the existence of a solution is proved can be unnecessarily short; (b) the selection of the subsequence using the Arzelà-Ascoli theorem makes the proof nonconstructive; (c) uniqueness of solutions is not provided, even in cases where unique solutions exist; (d) it does not provide information regarding how the solution changes with a change of the initial condition. We will subsequently address all these points, namely (b) and (c) in Sec. 3.3 (we will see that the proof of the Peano theorem becomes constructive in situations where the solution is unique; in general, a constructive proof is not available), (a) in Sec. 3.4, and (d) in Sec. 3.5.
3.3
Uniqueness of Solutions
Example 1.4(b) shows that the hypotheses of the Peano Th. 3.8 are not strong enough
to guarantee the initial value problem (3.16) has a unique solution, not even in some
neighborhood of x0 . The additional condition that will yield uniqueness is local Lipschitz
continuity of f with respect to y.
Definition 3.11. Let $m, n \in \mathbb{N}$, $G \subseteq \mathbb{R} \times \mathbb{K}^m$, and $f : G \to \mathbb{K}^n$.
(a) The function $f$ is called (globally) Lipschitz continuous or just (globally) Lipschitz with respect to $y$ if, and only if,
\[ \exists_{L \ge 0} \; \forall_{(x,y),(x,\bar{y}) \in G} \quad \|f(x, y) - f(x, \bar{y})\| \le L\, \|y - \bar{y}\|. \tag{3.42} \]
(b) The function $f$ is called locally Lipschitz continuous or just locally Lipschitz with respect to $y$ if, and only if, for each $(x_0, y_0) \in G$, there exists a (relative) open set $U \subseteq G$ such that $(x_0, y_0) \in U$ (i.e. $U$ is a (relative) open neighborhood of $(x_0, y_0)$) and $f$ is Lipschitz continuous with respect to $y$ on $U$, i.e. if, and only if,
\[ \forall_{(x_0,y_0) \in G} \; \exists_{(x_0,y_0) \in U \subseteq G \text{ open}} \; \exists_{L \ge 0} \; \forall_{(x,y),(x,\bar{y}) \in U} \quad \|f(x, y) - f(x, \bar{y})\| \le L\, \|y - \bar{y}\|. \tag{3.43} \]
The number $L$ occurring in (a),(b) is called Lipschitz constant. The norms on $\mathbb{K}^m$ and $\mathbb{K}^n$ in (a),(b) are arbitrary. If one changes the norms, then one will, in general, change $L$, but not the property of $f$ being (locally) Lipschitz.
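To make the difference between the global and the local Lipschitz condition concrete, the following Python sketch evaluates difference quotients for the scalar function $f(x, y) = y^2$. This example and the helper name `lipschitz_ratio` are assumptions introduced here and do not come from the notes. On the strip $|y| \le 1$ the quotient equals $|y + \bar{y}| \le 2$, whereas on all of $\mathbb{R}$ it is unbounded.

```python
def lipschitz_ratio(f, x, y, y_bar):
    """Difference quotient |f(x, y) - f(x, y_bar)| / |y - y_bar| for scalar y != y_bar."""
    return abs(f(x, y) - f(x, y_bar)) / abs(y - y_bar)


def f(x, y):
    # Hypothetical example, not taken from the notes.
    return y * y


# Largest quotient over a grid in the strip |y| <= 1: bounded by 2 there.
local = max(
    lipschitz_ratio(f, 0.0, a / 10, b / 10)
    for a in range(-10, 11)
    for b in range(-10, 11)
    if a != b
)

# Quotients for pairs (y, y + 1) far from the origin grow like 2 * y + 1.
far = [lipschitz_ratio(f, 0.0, y, y + 1.0) for y in (1.0, 10.0, 100.0)]
```

So this $f$ satisfies a condition of the form (3.43) near every point, but no single $L$ works in (3.42) on $\mathbb{R}^2$.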
Caveat 3.12. It is emphasized that $f : G \to \mathbb{K}^n$, $(x, y) \mapsto f(x, y)$, being Lipschitz with respect to $y$ does not imply $f$ to be continuous: Indeed, if $I \subseteq \mathbb{R}$, $\emptyset \ne A \subseteq \mathbb{K}^m$, and $g : I \to \mathbb{K}^n$ is an arbitrary discontinuous function, then $f : I \times A \to \mathbb{K}^n$, $f(x, y) := g(x)$ is not continuous, but satisfies (3.42) with $L = 0$.
While the local neighborhoods $U$, where a function that is locally Lipschitz (with respect to $y$) is actually Lipschitz continuous (with respect to $y$), can be very small, we will now show that a continuous function is locally Lipschitz (with respect to $y$) on $G$ if, and only if, it is Lipschitz continuous (with respect to $y$) on every compact set $K \subseteq G$.
Proposition 3.13. Let $m, n \in \mathbb{N}$, $G \subseteq \mathbb{R} \times \mathbb{K}^m$, and $f : G \to \mathbb{K}^n$ be continuous. Then $f$ is locally Lipschitz with respect to $y$ if, and only if, $f$ is (globally) Lipschitz with respect to $y$ on every compact subset $K$ of $G$.
Proof. First, assume $f$ is not locally Lipschitz with respect to $y$. Then there exist $(x_0, y_0) \in G$ and points $(x_N, y_{N,1}), (x_N, y_{N,2}) \in G$, $N \in \mathbb{N}$, converging to $(x_0, y_0)$ for $N \to \infty$, such that
\[ \forall_{N \in \mathbb{N}} \quad \|f(x_N, y_{N,1}) - f(x_N, y_{N,2})\| > N\, \|y_{N,1} - y_{N,2}\|. \tag{3.44} \]
The set
\[ K := \{(x_0, y_0)\} \cup \big\{ (x_N, y_{N,j}) : N \in \mathbb{N}, \; j \in \{1, 2\} \big\} \]
is clearly a compact subset of $G$ (e.g. by the Heine-Borel property of compact sets (see Th. C.19), since every open set containing $(x_0, y_0)$ must contain all, but finitely many, of the elements of $K$). Due to (3.44), $f$ is not (globally) Lipschitz with respect to $y$ on the compact set $K$ (so, actually, continuity of $f$ was not used for this direction).
Conversely, assume $f$ to be locally Lipschitz with respect to $y$, and consider a compact subset $K$ of $G$. Then, for each $(x, y) \in K$, there is some (relatively) open $U_{(x,y)} \subseteq G$ with $(x, y) \in U_{(x,y)}$ and such that $f$ is Lipschitz with respect to $y$ in $U_{(x,y)}$. By the Heine-Borel property of compact sets (see Th. C.19), there are finitely many $U_1 := U_{(x_1,y_1)}, \dots, U_N := U_{(x_N,y_N)}$, $N \in \mathbb{N}$, such that
\[ K \subseteq \bigcup_{j=1}^{N} U_j. \tag{3.45} \]
For each $j = 1, \dots, N$, let $L_j$ denote the Lipschitz constant for $f$ on $U_j$ and set $L := \max\{L_1, \dots, L_N\}$. As $f$ is assumed continuous and $K$ is compact, we have
\[ M := \max\{\|f(x, y)\| : (x, y) \in K\} < \infty. \tag{3.46} \]
Using the compactness of $K$ once again, there exists a Lebesgue number $\delta > 0$ for the open cover $(U_j)_{j \in \{1,\dots,N\}}$ of $K$ (cf. Th. C.21), i.e. $\delta > 0$ such that
ky yk <
(x,y),(x,
y )K
(3.48a)
ky yk
(3.48b)
2M
Lky yk,
While, in general, the assertion of Prop. 3.13 becomes false if the continuity of $f$ is omitted, for convex $G$, it does hold without the continuity assumption on $f$ (see Appendix D). The following Prop. 3.14 provides a useful sufficient condition for $f : G \to \mathbb{K}^n$, $G \subseteq \mathbb{R} \times \mathbb{K}^m$ open, to be locally Lipschitz with respect to $y$:
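where $\|\cdot\|_1$ denotes the 1-norm on $\mathbb{R}^m$. Since the $\partial_{y_k} f_l$, $(k, l) \in \{1, \dots, m\} \times \{1, \dots, n\}$, are all continuous on the compact set $B$,
\[ M := \max\big\{ |\partial_{y_k} f_l(x, y)| : (x, y) \in B, \; (k, l) \in \{1, \dots, m\} \times \{1, \dots, n\} \big\} < \infty. \tag{3.49} \]
Applying the mean value theorem (cf. [Phi13b, Th. 2.32]) to the $n$ components of the function
\[ f_x : \{y \in \mathbb{R}^m : (x, y) \in B\} \to \mathbb{R}^n, \quad f_x(y) := f(x, y), \]
we obtain $\eta_1, \dots, \eta_n \in \mathbb{R}^m$ such that
\[ f_l(x, y) - f_l(x, \bar{y}) = \sum_{k=1}^{m} \partial_{y_k} f_l(x, \eta_l)\, (y_k - \bar{y}_k), \tag{3.50} \]
and, thus,
\[ \forall_{(x,y),(x,\bar{y}) \in B} \quad \|f(x, y) - f(x, \bar{y})\|_1 = \sum_{l=1}^{n} |f_l(x, y) - f_l(x, \bar{y})| \overset{(3.49),(3.50)}{\le} \sum_{l=1}^{n} \sum_{k=1}^{m} M\, |y_k - \bar{y}_k| = \sum_{l=1}^{n} M\, \|y - \bar{y}\|_1 = nM\, \|y - \bar{y}\|_1, \tag{3.51} \]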
i.e. f is Lipschitz with respect to y on B (where
(x, y) R Rm : |x x0 | < b and ky y0 k1 < b B
(3.52a)
(3.52b)
(x0 ) = (x0 )
(x) = (x) .
(3.53)
x0 I
xI
>0
x]x0 ,x0 +[
(x) = (x).
(3.54)
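Since $f$ is continuous and both $\phi$ and $\psi$ are solutions to the initial value problem (3.52), we can use Th. 1.5 to obtain
\[ \forall_{x \in I} \quad \phi(x) - \psi(x) = \int_{x_0}^{x} \Big( f\big(t, \phi(t)\big) - f\big(t, \psi(t)\big) \Big)\, dt. \tag{3.55} \]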
As f is locally Lipschitz with respect to y, there exists > 0 such that f is Lipschitz
with Lipschitz constant L 0 with respect to y on
U := {(x, y) G : |x x0 | < , ky y0 k < },
where we have chosen some arbitrary norm kk on Kn . The continuity of , implies the
existence of > 0 such that B (x0 ) I, (B(x0 )) B (y0 ) and (B(x0 )) B (y0 ),
implying
f x, (x) f x, (x)
L
(x) (x)
.
(3.56)
xB(x0 )
Next, define
:= min{, 1/(2L)}
k(x) (x)k L
xB (x0 )
x
x0
M
k(t) (t)k dt L |x x0 | M
2
(3.57)
(note that the integral in (3.57) can be negative for $x < x_0$). The definition of $M$ together with (3.57) yields $M \le M/2$, i.e. $M = 0$, finishing the proof of (3.54).
To prove (x) = (x) for each x x0 , let
j{0,...,k1}
j{0,...,k1}
(3.58)
Remark 3.17. According to Th. 3.15, the condition of $f$ being continuous and locally Lipschitz with respect to $y$ is sufficient for each initial value problem (3.52) to have a unique solution. However, this condition is not necessary: It is an exercise to show that the continuous function
\[ f : \mathbb{R}^2 \to \mathbb{R}, \quad f(x, y) := \begin{cases} 1 & \text{for } y \le 0, \\ 1 + \sqrt{y} & \text{for } y \ge 0, \end{cases} \tag{3.59} \]
is not locally Lipschitz with respect to $y$, but that, for each $(x_0, y_0) \in \mathbb{R}^2$, the initial value problem (3.52) still has a unique solution in the sense that (3.53) holds for each solution to (3.52a). And one can (can you?) even find simple examples of $f$ being defined on an open domain such that $f$ is discontinuous at every point in its domain and every initial value problem (3.52) still has a unique solution.
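The failure of the local Lipschitz condition in Rem. 3.17 can be observed numerically. The sketch below reads (3.59) as $f(x, y) = 1$ for $y \le 0$ and $f(x, y) = 1 + \sqrt{y}$ for $y \ge 0$ (the root is an assumption about the intended formula, since only a continuous non-Lipschitz function fits the remark) and evaluates difference quotients at $y = 0$, which behave like $1/\sqrt{h}$.

```python
import math


def f(x, y):
    """Assumed reading of (3.59): 1 for y <= 0 and 1 + sqrt(y) for y >= 0."""
    return 1.0 if y <= 0 else 1.0 + math.sqrt(y)


# Difference quotients (f(x, h) - f(x, 0)) / h at y = 0 for shrinking h > 0.
steps = (1e-2, 1e-4, 1e-6)
quotients = [(f(0.0, h) - f(0.0, 0.0)) / h for h in steps]
```

Since the quotients grow without bound as $h$ shrinks, no Lipschitz constant can work on any neighborhood of a point $(x_0, 0)$.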
At the end of Sec. 3.2, it was pointed out that the proof of the Peano Th. 3.8 is nonconstructive due to the selection of a subsequence. The following Th. 3.18 shows that, whenever the initial value problem has a unique solution, it becomes unnecessary to select a subsequence, and the construction procedure (namely Euler's method) used in the proof of Th. 3.8 becomes an effective (if not necessarily efficient) numerical approximation procedure for the unique solution.
Theorem 3.18. Consider the situation of the Peano Th. 3.8. Under the additional assumption that the solution to the explicit $n$-dimensional first-order initial value problem (3.16) is unique on some interval $J \subseteq [x_0, x_0 + \alpha[$, $x_0 \in J$, where $\alpha > 0$ is constructed as in Th. 3.8 (i.e. given by (3.18)-(3.20)), every sequence $(\phi_m)_{m \in \mathbb{N}}$ of functions defined on $J$ according to Euler's method as in the proof of Th. 3.8 (i.e. defined as in (3.26)) converges uniformly to the unique solution $\phi : J \to \mathbb{K}^n$. An analogous statement also holds for $J \subseteq \,]x_0 - \alpha, x_0]$, $x_0 \in J$.
Proof. Seeking a contradiction, assume $(\phi_m)_{m \in \mathbb{N}}$ does not converge uniformly to the unique solution $\phi$. Then there exist $\epsilon > 0$ and a subsequence $(\phi_{m_j})_{j \in \mathbb{N}}$ such that
\[ \forall_{j \in \mathbb{N}} \quad \|\phi_{m_j} - \phi\|_{\sup} = \sup\big\{ \|\phi_{m_j}(x) - \phi(x)\| : x \in J \big\} \ge \epsilon. \tag{3.60} \]
However, as a subsequence, $(\phi_{m_j})_{j \in \mathbb{N}}$ still has all the properties of the $(\phi_m)_{m \in \mathbb{N}}$ (namely pointwise boundedness, uniform equicontinuity, piecewise differentiability, being approximate solutions according to (3.33)) that guaranteed the existence of a subsequence converging to a solution. Thus, since the solution is unique on $J$, $(\phi_{m_j})_{j \in \mathbb{N}}$ must, in turn, have a subsequence converging uniformly to $\phi$, which is in contradiction to (3.60). This shows the assumption that $(\phi_m)_{m \in \mathbb{N}}$ does not converge uniformly to $\phi$ must have been false. The proof of the analogous statement for $J \subseteq \,]x_0 - \alpha, x_0]$, $x_0 \in J$, one obtains, e.g., via time reversion (cf. the second step of the proof of Th. 3.8).
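The convergence asserted in Th. 3.18 can be observed on a simple test problem. The following Python sketch builds the nodes of the Euler polygon on an equidistant mesh and measures the largest deviation from the exact solution at the nodes. The problem $y' = y$, $y(0) = 1$ on $[0, 1/2]$ (with unique solution $e^x$), the mesh sizes, and the function names are assumptions made here for illustration.

```python
import math


def euler_polygon_nodes(f, x0, y0, alpha, N):
    """Nodes (x_j, phi(x_j)) of the Euler polygon on [x0, x0 + alpha], equidistant mesh."""
    h = alpha / N
    xs, ys = [x0], [y0]
    for j in range(N):
        # phi(x_{j+1}) = phi(x_j) + h * f(x_j, phi(x_j))
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))
        xs.append(x0 + (j + 1) * h)
    return xs, ys


def max_error_at_nodes(N):
    """Largest deviation from exp(x) at the nodes for y' = y, y(0) = 1 on [0, 1/2]."""
    xs, ys = euler_polygon_nodes(lambda x, y: y, 0.0, 1.0, 0.5, N)
    return max(abs(y - math.exp(x)) for x, y in zip(xs, ys))


errors = [max_error_at_nodes(N) for N in (10, 100, 1000)]
```

The errors shrink as the mesh is refined, without any selection of a subsequence, which is the behavior the theorem predicts when the solution is unique.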
Remark 3.19. The argument used to prove Th. 3.18 is of a rather general nature: It can be applied whenever a sequence is known to have a subsequence converging to some solution of some equation (or some other problem), provided the same still holds for every subsequence of the original sequence. In that case, the additional knowledge that the solution is unique implies the convergence of the original sequence without the need to select a subsequence.
3.4
The Peano Th. 3.8 and Cor. 3.10 show the existence of local solutions to explicit initial value problems, i.e. the solution's existence is proved on some, possibly small, interval containing the initial point $x_0$. In the current section, we will address the question under which circumstances such local solutions can be extended, we will prove the existence of maximal solutions (solutions that cannot be extended), and we will learn how such maximal solutions can be identified.
Definition 3.20. Let $\phi : I \to \mathbb{K}^n$, $n \in \mathbb{N}$, be a solution to some ODE (such as (1.6) or (1.4) in the most general case), defined on some open interval $I \subseteq \mathbb{R}$.
(a) We say $\phi$ has an extension or continuation to the right (resp. to the left) if, and only if, there exists a solution $\psi : J \to \mathbb{K}^n$ to the same ODE, defined on some open interval $J \supseteq I$ such that $\psi|_I = \phi$ and
\[ \sup J > \sup I \quad (\text{resp. } \inf J < \inf I). \tag{3.61} \]
The existence of maximal solutions is not trivial: a priori, it could be that every solution had an extension (analogous to the fact that to every $x \in [0, 1[$ (or every $x \in \mathbb{R}$) there is some bigger element in $[0, 1[$ (respectively in $\mathbb{R}$)).
Theorem 3.22. Every solution $\phi_0 : I_0 \to \mathbb{K}^n$ to (1.4) (resp. to (1.6)), defined on an open interval $I_0 \subseteq \mathbb{R}$, can be extended to a maximal solution of (1.4) (resp. of (1.6)).
Proof. The proof is carried out for solutions to (1.4) (the implicit ODE); the proof for solutions to the explicit ODE (1.6) is analogous and can also be seen as a special case. The idea is to apply Zorn's lemma. To this end, define a partial order on the set
\[ S := \{(I_0, \phi_0)\} \cup \big\{ (I, \phi) : \phi : I \to \mathbb{K}^n \text{ is solution to (1.4), extending } \phi_0 \big\} \tag{3.62} \]
by letting
\[ (I, \phi) \le (J, \psi) \quad :\Leftrightarrow \quad I \subseteq J, \; \psi|_I = \phi. \tag{3.63} \]
Every chain $C$, i.e. every totally ordered subset of $S$, has an upper bound, namely $(I_C, \phi_C)$ with $I_C := \bigcup_{(I,\phi) \in C} I$ and $\phi_C(x) := \phi(x)$, where $(I, \phi) \in C$ is chosen such that $x \in I$ (since $C$ is a chain, the value of $\phi_C(x)$ does not actually depend on the choice of $(I, \phi) \in C$ and is, thus, well-defined).
(k)
To conclude the proof, we note that all hypotheses of Zorn's lemma have been verified such that it yields the existence of a maximal element $(I_{\max}, \phi_{\max}) \in S$, i.e. $\phi_{\max} : I_{\max} \to \mathbb{K}^n$ must be a maximal solution extending $\phi_0$.
(k1)
resp.
xa
Proof. That the respective part of (3.64) is necessary for the existence of the respective extension is immediate from the fact that, for each solution to (1.6), the solution and all its derivatives up to order $k - 1$ must exist and must be continuous.
We now prove that (3.64a) is also sufficient for the existence of an extension to the right (the sufficiency of (3.64b) for the existence of an extension to the left is then immediate from Rem. 3.21). So assume (3.64a) to hold and consider the initial value problem consisting of (1.6) and the initial conditions
\[ \forall_{j = 0, \dots, k-1} \quad y^{(j)}(b) = \eta_j. \]
By Cor. 3.10, there must exist $\epsilon > 0$ such that this initial value problem has a solution $\psi : \,]b - \epsilon, b + \epsilon[ \to \mathbb{K}^n$. We now show that $\phi$ extended to $b$ via (3.64a) is still a solution to (1.6). First note the mean value theorem (cf. [Phi13a, Th. 9.17]) yields that $\phi^{(j)}(b) = \eta_j$ exists for $j = 1, \dots, k - 1$ as a left-hand derivative. Moreover,
\[ \lim_{x \uparrow b} \phi^{(k)}(x) = \lim_{x \uparrow b} f\big(x, \phi(x), \phi'(x), \dots, \phi^{(k-1)}(x)\big) = f(b, \eta_0, \dots, \eta_{k-1}), \]
showing $\phi^{(k)}(b) = f\big(b, \phi(b), \phi'(b), \dots, \phi^{(k-1)}(b)\big)$ (again employing the mean value theorem), which proves $\phi$ extended to $b$ is a solution to (1.6). Finally, Lem. 1.7 ensures
\[ \sigma : \,]a, b + \epsilon[ \to \mathbb{K}^n, \quad \sigma(x) := \begin{cases} \phi(x) & \text{for } x \le b, \\ \psi(x) & \text{for } x \ge b, \end{cases} \]
is a solution to (1.6) that extends $\phi$ to the right.
x, (
x), . . . , (k1) (
x)
/ K.
(3.66)
x
J
The statement can be rephrased by saying that $\operatorname{gr}_+(\phi)$ (resp. $\operatorname{gr}_-(\phi)$) of each maximal solution $\phi$ to (1.6) escapes from every compact subset of $G$ when $x$ approaches the right (resp. the left) boundary of $I$ (where the boundary of $I$ can contain $-\infty$ and/or $+\infty$).
Proof. We conduct the proof for extensions to the right; extensions to the left can be handled completely analogously (alternatively, one can apply the time reversion Lem. 1.9(b) as demonstrated in the last paragraph of the proof below). The proof for extensions to the right is divided into three steps. Let $K \subseteq G$ be compact.
Step 1: We show that $\operatorname{gr}_+(\phi) \subseteq K$ implies $\phi$ has an extension to the right: Since $K$ is bounded, so is $\operatorname{gr}_+(\phi)$, implying
\[ b := \sup I < \infty \tag{3.67} \]
as well as
\[ M_1 := \sup\big\{ \|\phi^{(j)}(x)\| : j \in \{0, \dots, k-1\}, \; x \in [x_0, b[ \big\} < \infty. \]
In addition, since $K$ is compact and $f$ is continuous, $M_2 := \max\{\|f(x, y)\| : (x, y) \in K\} < \infty$. Set
\[ M := \max\{M_1, M_2\}. \]
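According to Prop. 3.23, we need to show (3.64a) holds. To this end, notice
\[ \forall_{j = 0, \dots, k-1} \; \forall_{x, \tilde{x} \in [x_0, b[} \quad \|\phi^{(j)}(x) - \phi^{(j)}(\tilde{x})\| \le M\, |x - \tilde{x}|. \tag{3.68} \]
Indeed,
\[ \forall_{x, \tilde{x} \in [x_0, b[} \quad \|\phi^{(k-1)}(x) - \phi^{(k-1)}(\tilde{x})\| = \Big\| \int_{\tilde{x}}^{x} f\big(t, \phi(t), \dots, \phi^{(k-1)}(t)\big)\, dt \Big\| \le M\, |x - \tilde{x}|, \]
and, for each $j \in \{0, \dots, k-2\}$,
\[ \forall_{x, \tilde{x} \in [x_0, b[} \quad \|\phi^{(j)}(x) - \phi^{(j)}(\tilde{x})\| = \Big\| \int_{\tilde{x}}^{x} \phi^{(j+1)}(t)\, dt \Big\| \le M\, |x - \tilde{x}|, \]
proving (3.68). Since $K$ is compact, there exists a sequence $(x_m)_{m \in \mathbb{N}}$ in $[x_0, b[$ such that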
j=0,...,k1
x[x0 ,b[
implying
j=0,...,k1
Step 2: We show that $\operatorname{gr}_+(\phi) \subseteq K$ implies $\phi$ can be extended to the right to $I \cup \,]x_0, b + \alpha[$, where $\alpha > 0$ does not depend on $b := \sup I$: Since $K$ is compact, Cor. 3.9 guarantees every initial value problem
\[ y^{(k)} = f\big(x, y, y', \dots, y^{(k-1)}\big), \tag{3.70a} \]
\[ \forall_{j = 0, \dots, k-1} \quad y^{(j)}(\xi_0) = y_{0,j}, \quad (\xi_0, y_0) \in K, \tag{3.70b} \]
which clearly constitutes an $\mathbb{R}$-linear isomorphism. Noting (1.6) and (1.32) are the same, we consider the time-reversed version (1.33) and observe $G_g = h(G)$ to be open, $h(K) \subseteq G_g$ to be compact, $g : G_g \to \mathbb{K}^n$, $g = (-1)^k (f \circ h)$, to be continuous. If $\operatorname{gr}_-(\phi, x_0) \subseteq K$, then $\operatorname{gr}_+(\psi, -x_0) \subseteq h(K)$, where $\psi$ is the solution to the time-reversed version (1.33), given by Lem. 1.9(b). Then $\psi$ has an extension to the right, satisfying (3.66) with $\phi$ replaced by $\psi$ and $K$ replaced by $h(K)$. Then, by Rem. 3.21, $\phi$ must have an extension
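\[ \forall_{K \subseteq G \text{ compact}} \; \exists_{x_0 \in ]a, b[} \quad \operatorname{gr}_+(\phi, x_0) \cap K = \emptyset \quad (\text{resp. } \operatorname{gr}_-(\phi, x_0) \cap K = \emptyset), \tag{3.71} \]
where $\operatorname{gr}_+(\phi, x_0)$ and $\operatorname{gr}_-(\phi, x_0)$ are defined as in (3.65) (with $I = \,]a, b[$). In other words, $\phi$ goes to the boundary of $G$ for $x \uparrow b$ (resp. for $x \downarrow a$) if, and only if, the graph of $(\phi, \dots, \phi^{(k-1)})$ escapes every compact subset $K$ of $G$ forever for $x \uparrow b$ (resp. for $x \downarrow a$).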
Proposition 3.26. In the situation of Def. 3.25, if the solution $\phi$ goes to the boundary of $G$ for $x \uparrow b$, then one of the following conditions must hold:
(i) $b = \infty$,
(ii) $b < \infty$ and $L := \limsup_{x \uparrow b} \big\| \big(\phi(x), \dots, \phi^{(k-1)}(x)\big) \big\| = \infty$,
(3.72)
An analogous statement is valid for the solution $\phi$ going to the boundary of $G$ for $x \downarrow a$.
Proof. The proof is carried out for $x \uparrow b$; the proof for $x \downarrow a$ is analogous.
Assume (i)-(iii) are all false. Choose $c \in \,]a, b[$. Since (i) and (ii) are false,
\[ \exists_{0 \le M < \infty} \; \forall_{x \in [c, b[} \quad \big\| \big(\phi(x), \dots, \phi^{(k-1)}(x)\big) \big\| \le M. \]
If (iii) is false because $G = \mathbb{R} \times \mathbb{K}^{kn}$, then $K := \{(x, y) \in \mathbb{R} \times \mathbb{K}^{kn} : x \in [c, b], \; \|y\| \le M\}$ is a compact subset of $G$ that shows (3.71) does not hold. In the only remaining case, (iii) must be false, since (3.72) does not hold. Thus,
x0 ]a,b[
x1 ]x0 ,b[
A := (x, y) G : dist (x, y), G
Proof. We carry out the proof for $x \uparrow b$; the proof for $x \downarrow a$ can be done analogously or by applying the time reversion Lem. 1.9, as indicated at the end of the proof below. Let $\phi : \,]a, b[ \to \mathbb{K}^n$ be a maximal solution to (1.6). Seeking a contradiction, we assume $\phi$ does not go to the boundary of $G$ for $x \uparrow b$, i.e. (3.71) does not hold and there exists a compact subset $K$ of $G$ and a strictly increasing sequence $(x_m)_{m \in \mathbb{N}}$ in $]a, b[$ such that $\lim_{m \to \infty} x_m = b < \infty$ and
\[ \forall_{m \in \mathbb{N}} \quad \big(x_m, \phi(x_m), \dots, \phi^{(k-1)}(x_m)\big) \in K. \tag{3.73} \]
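\[ \operatorname{dist}\big((x, y), K\big) = \inf\big\{ \|(x, y) - (\tilde{x}, \tilde{y})\|_2 : (\tilde{x}, \tilde{y}) \in K \big\}, \]
$\|\cdot\|_2$ denoting the Euclidean norm on $\mathbb{R}^{kn+1}$ for $\mathbb{K} = \mathbb{R}$ and the Euclidean norm on $\mathbb{R}^{2kn+1}$ for $\mathbb{K} = \mathbb{C}$ (this choice of norm is different from previous choices and will be convenient later during the current proof). As $\phi$ is a maximal solution, Prop. 3.24 guarantees the existence of another strictly increasing sequence $(\xi_m)_{m \in \mathbb{N}}$ in $]a, b[$ such that $\lim_{m \to \infty} \xi_m = b < \infty$, $x_1 < \xi_1 < x_2 < \xi_2 < \dots$ (i.e. $x_m < \xi_m < x_{m+1}$ for each $m \in \mathbb{N}$) and such that
\[ \forall_{m \in \mathbb{N}} \quad \big(\xi_m, \phi(\xi_m), \dots, \phi^{(k-1)}(\xi_m)\big) \notin C. \]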
Noting xm , (xm ), . . . , (k1) (xm ) K by (3.73) and K C, define
sm := sup s xm : x, (x), . . . , (k1) (x) C for each x [xm , s] .
mN
By the definition of sm as a sup, sm < xm+1 < b < , and by the continuity of the
distance function d : R Kkn R+
0 , d() := dist(, K) (see Th. C.4), one obtains
in particular,
x[xm ,sm ]
and
mN
x, (x), . . . , (k1) (x) C
(3.74)
xm , (xm ), . . . , (k1) (xm ) sm , (sm ), . . . , (k1) (sm )
r.
2
(3.75)
(as C is compact and f continuous),
M := max{M1 , M2 }.
We now notice that each function
\[ J_m : [x_m, s_m] \to \mathbb{R} \times \mathbb{K}^{kn}, \quad J_m(x) := \big(x, \phi(x), \dots, \phi^{(k-1)}(x)\big), \]
is a continuously differentiable curve or path (using the continuity of $f$), cf. Def. F.1 (for $\mathbb{K} = \mathbb{C}$, we consider $J_m$ as a path in $\mathbb{R}^{2kn+1}$). To finish the proof, we will have to make use of the notion of arc length (cf. Def. F.5) of such a continuously differentiable curve: Recall that each such continuously differentiable path is rectifiable, i.e. it has a well-defined finite arc length $l(J_m)$ (cf. Th. F.7). Moreover, $l(J_m)$ satisfies
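\[ \begin{aligned} l(J_m) &= \int_{x_m}^{s_m} \|J_m'(x)\|_2\, dx = \int_{x_m}^{s_m} \sqrt{1 + \big\| \big(\phi'(x), \dots, \phi^{(k-1)}(x)\big) \big\|_2^2 + \big\| f(J_m(x)) \big\|_2^2}\; dx \\ &\le \int_{x_m}^{s_m} \sqrt{1 + 2M^2}\; dx \end{aligned} \tag{3.76} \]
(cf. (F.4) and (F.17)), where it was used that $\|\cdot\|_2$ was chosen to be the Euclidean norm. For each $m \in \mathbb{N}$, we estimate
\[ \begin{aligned} 0 < r &\overset{(3.75)}{\le} \big\| \big(x_m, \phi(x_m), \dots, \phi^{(k-1)}(x_m)\big) - \big(s_m, \phi(s_m), \dots, \phi^{(k-1)}(s_m)\big) \big\|_2 \\ &= \|J_m(x_m) - J_m(s_m)\|_2 \le l(J_m) \overset{(3.76)}{\le} \int_{x_m}^{s_m} \sqrt{1 + 2M^2}\; dx = (s_m - x_m) \sqrt{1 + 2M^2}. \end{aligned} \tag{3.77} \]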
(i) $\phi$ is a maximal solution.
(ii) $\phi$ must go to the boundary of $G$ for both $x \downarrow a$ and $x \uparrow b$ in the sense defined in Def. 3.25.
(iii) $\phi$ satisfies one of the conditions specified in Prop. 3.26 and one of the analogous conditions for $x \downarrow a$.
Proof. (i) implies (ii) by Th. 3.28, (ii) implies (iii) by Prop. 3.26, and it is an exercise to show (iii) implies (i) (here, Prop. 3.23 is the clue).
Example 3.30. The following examples illustrate the different kinds of possible behavior of maximal solutions listed in Prop. 3.26 (the different kinds of behavior can already be seen for 1-dimensional ODE of first order):
(a) The initial value problem
y = 0,
y(0) = 1,
f : G R,
f (x, y) = 0,
y(1) = 1,
f : G R,
f (x, y) = x2 ,
= 0,
y = 2 sin + 3 cos , y
x
x x
x
f : G R,
f (x, y) =
1
1
1
1
sin + 3 cos .
2
x
x x
x
To obtain an example, where we are again in Case (ii) of Prop. 3.26, but where $G = \mathbb{R}^2$, consider the initial value problem
\[ y' = y^2, \quad y(1) = 1, \]
\[ f : G \to \mathbb{R}, \quad f(x, y) = y^2. \]
y(1) = 1,
2x 1 here we have
f (x, y) = y 1 ,
f : G R,
y = 2 cos , y
= 0,
x
x
f : G R,
f (x, y) =
1
1
cos ,
2
x
x
x0
As a final example, where we are again in Case (iii) of Prop. 3.26, reconsider the initial value problem from (a), but this time with
\[ G = \,]-1, 1[ \,\times\, ]-3, 5[, \quad f : G \to \mathbb{R}, \quad f(x, y) = 0. \]
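Case (ii) of Prop. 3.26, where $b < \infty$ and the solution becomes unbounded, can be checked numerically on a quadratic right-hand side. The sketch below uses the initial value problem $y' = y^2$, $y(0) = 1$, whose maximal solution is $\phi(x) = 1/(1 - x)$ on $]-\infty, 1[$. The initial condition is an assumption chosen here for illustration and need not coincide with the one used in Ex. 3.30; the function names are likewise hypothetical.

```python
def phi(x):
    """Solution of the hypothetical problem y' = y**2, y(0) = 1 on ]-inf, 1[."""
    return 1.0 / (1.0 - x)


def residual(x, h=1e-6):
    """Central difference quotient of phi at x minus phi(x)**2; close to 0 for a solution."""
    return (phi(x + h) - phi(x - h)) / (2 * h) - phi(x) ** 2


residuals = [abs(residual(x)) for x in (-2.0, 0.0, 0.5)]

# Values of phi as x approaches the right end b = 1 of the maximal interval.
growth = [phi(1.0 - 10.0 ** (-k)) for k in (1, 3, 6)]
```

Although $f(x, y) = y^2$ is defined on all of $\mathbb{R}^2$, the solution leaves every compact set in finite time, so the maximal interval is bounded on the right.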
Example 3.31. We have already seen examples of initial value problems that admit more than one maximal solution; for instance, the initial value problem of Ex. 1.4(b) had infinitely many different maximal solutions, all of them defined on all of $\mathbb{R}$. The following example shows that an initial value problem can have maximal solutions defined on different intervals: Let
\[ G := \mathbb{R} \,\times\, ]-1, 1[, \quad f : G \to \mathbb{R}, \quad f(x, y) := \frac{\sqrt{|y|}}{1 - \sqrt{|y|}}, \]
and consider the initial value problem
\[ y' = f(x, y) = \frac{\sqrt{|y|}}{1 - \sqrt{|y|}}, \quad y(0) = 0. \tag{3.78} \]
: R R,
(0) = 0.
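However, another maximal solution (that can be found using separation of variables) is
\[ \psi : \,]-1, 1[ \to \mathbb{R}, \quad \psi(x) := \begin{cases} -\big(1 - \sqrt{1 + x}\big)^2 & \text{for } -1 < x \le 0, \\ \big(1 - \sqrt{1 - x}\big)^2 & \text{for } 0 \le x < 1. \end{cases} \]
To confirm the maximality of the solution $\psi$, note $\lim_{x \downarrow -1} (x, \psi(x)) = (-1, -1) \in \partial G$ and $\lim_{x \uparrow 1} (x, \psi(x)) = (1, 1) \in \partial G$.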
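That the second maximal solution of Ex. 3.31 really solves (3.78) can be verified numerically. The Python sketch below assumes the reading $f(x, y) = \sqrt{|y|}/(1 - \sqrt{|y|})$ and $\psi(x) = -(1 - \sqrt{1 + x})^2$ for $-1 < x \le 0$, $\psi(x) = (1 - \sqrt{1 - x})^2$ for $0 \le x < 1$, which is what separation of variables yields; the sample points and the step size are arbitrary choices made here.

```python
import math


def f(x, y):
    """Assumed right-hand side of (3.78): sqrt|y| / (1 - sqrt|y|) for |y| < 1."""
    r = math.sqrt(abs(y))
    return r / (1.0 - r)


def psi(x):
    """Candidate solution on ]-1, 1[, obtained by separation of variables."""
    if x <= 0:
        return -(1.0 - math.sqrt(1.0 + x)) ** 2
    return (1.0 - math.sqrt(1.0 - x)) ** 2


def residual(x, h=1e-6):
    """Central difference quotient of psi at x minus f(x, psi(x))."""
    return (psi(x + h) - psi(x - h)) / (2 * h) - f(x, psi(x))


worst = max(abs(residual(x)) for x in (-0.9, -0.5, -0.1, 0.1, 0.5, 0.9))
```

The residual stays at the level of the finite-difference error, and $\psi(x)$ tends to $\pm 1$ at the ends of $]-1, 1[$, matching the boundary behavior noted in the example.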
3.5
The goal of the present section is to show that, under suitable conditions, small changes
in the initial condition for an ODE result in small changes in the solution. As, in
situations of nonuniqueness, we can change the solution without having changed the
initial condition at all, ensuring unique solutions to initial value problems is a minimal
prerequisite for our considerations in this section.
Definition 3.32. Let $G \subseteq \mathbb{R} \times \mathbb{K}^{kn}$, $k, n \in \mathbb{N}$, and $f : G \to \mathbb{K}^n$. We say that the explicit $n$-dimensional $k$th-order ODE (1.6), i.e.
\[ y^{(k)} = f\big(x, y, y', \dots, y^{(k-1)}\big), \tag{3.79a} \]
admits unique maximal solutions if, and only if, $f$ is such that every initial value problem consisting of (3.79a) and
\[ \forall_{j \in \{0, \dots, k-1\}} \quad y^{(j)}(\xi) = \eta_j \in \mathbb{K}^n, \tag{3.79b} \]
with respect to $y$ is sufficient for (3.79a) to admit unique maximal solutions, but we know from Rem. 3.17 that this condition is not necessary). If $f$ is such that (3.79a) admits unique maximal solutions, then
\[ Y : D_f \to \mathbb{K}^n, \quad Y(x, \xi, \eta) := \phi_{(\xi,\eta)}(x), \tag{3.80} \]
defined on
\[ D_f := \{(x, \xi, \eta) \in \mathbb{R} \times G : x \in I_{(\xi,\eta)}\}, \tag{3.81} \]
is called the global or general solution to (3.79a). Note that the domain $D_f$ of $Y$ is determined entirely by $f$, which is notationally emphasized by its lower index $f$.
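Lemma 3.33. In the situation of Def. 3.32, the following holds:
(a) $Y(\xi, \xi, \eta) = \eta_0$ for each $(\xi, \eta) \in G$.
(b) If $k = 1$, then $\eta = \eta_0$ and $Y\big(x, \tilde{x}, Y(\tilde{x}, \xi, \eta)\big) = Y(x, \xi, \eta)$ for each $(x, \xi, \eta), (\tilde{x}, \xi, \eta) \in D_f$.
(c) If $k = 1$, then $Y\big(\xi, x, Y(x, \xi, \eta)\big) = \eta$ for each $(x, \xi, \eta) \in D_f$.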
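For an equation whose global solution is known in closed form, the identities of Lem. 3.33 can be checked directly. The sketch below uses the scalar ODE $y' = y$ (so $k = 1$), for which the solution with value $\eta$ at $\xi$ is $\eta\, e^{x - \xi}$; this example and the particular numbers are assumptions introduced here for illustration.

```python
import math


def Y(x, xi, eta):
    """Global solution of the hypothetical ODE y' = y: value at x of the solution with y(xi) = eta."""
    return eta * math.exp(x - xi)


xi, eta, x, x_tilde = 0.5, 2.0, 1.5, -1.0

initial = Y(xi, xi, eta)                      # property (a): equals eta
restart = Y(x, x_tilde, Y(x_tilde, xi, eta))  # property (b): equals Y(x, xi, eta)
back = Y(xi, x, Y(x, xi, eta))                # property (c): equals eta
```

Property (b) says that restarting the solution from any point on its own graph reproduces the same solution, which is exactly where uniqueness of maximal solutions enters.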
The core of the proof of continuity in initial conditions as stated in Cor. 3.36 below is the following Th. 3.34(a), which provides continuity in initial conditions locally. As a byproduct, we will also obtain a version of the Picard-Lindelöf theorem in Th. 3.34(b), which states the local uniform convergence of the so-called Picard iteration, a method for obtaining approximate solutions that is quite different from the Euler method considered above.
Theorem 3.34. Consider the situation of Def. 3.32 for first-order problems, i.e. with $k = 1$, and with $f$ being continuous and locally Lipschitz with respect to $y$ on $G$ open. Fix an arbitrary norm $\|\cdot\|$ on $\mathbb{K}^n$.
(a) For each $(\sigma, \zeta) \in G \subseteq \mathbb{R} \times \mathbb{K}^n$ and each $-\infty < a < b < \infty$ such that $[a, b] \subseteq I_{(\sigma,\zeta)}$ (i.e., using the notation introduced in Def. 3.32, the maximal solution $\phi_{(\sigma,\zeta)} = Y(\cdot, \sigma, \zeta)$ is defined on $[a, b]$), there exists $\alpha > 0$ satisfying:
(i) For every point $(\xi, \eta)$ in the open set
\[ U_\alpha(\sigma, \zeta) := \big\{ (\xi, \eta) \in G : \xi \in \,]a, b[, \; \|\eta - Y(\xi, \sigma, \zeta)\| < \alpha \big\}, \tag{3.82} \]
(ii) The restriction of the global solution $(x, \xi, \eta) \mapsto Y(x, \xi, \eta)$ to the open set
\[ W := \,]a, b[ \,\times\, U_\alpha(\sigma, \zeta) \tag{3.83} \]
is continuous.
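(b) (Picard-Lindelöf) For each $(\sigma, \zeta) \in G$, there exists $\alpha > 0$ such that the Picard iteration, i.e. the sequence of functions $(\phi_m)_{m \in \mathbb{N}_0}$, $\phi_m : \,]\sigma - \alpha, \sigma + \alpha[ \to \mathbb{K}^n$, defined recursively by
\[ \phi_0(x) := \zeta, \tag{3.84a} \]
\[ \forall_{m \in \mathbb{N}_0} \quad \phi_{m+1}(x) := \zeta + \int_{\sigma}^{x} f\big(t, \phi_m(t)\big)\, dt, \tag{3.84b} \]
converges uniformly to the solution of the initial value problem (3.79) (with $k = 1$ and $(\xi, \eta) := (\sigma, \zeta)$) on $]\sigma - \alpha, \sigma + \alpha[$.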
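The Picard iteration can be carried out exactly when the iterates are polynomials. The following Python sketch does this for the problem $y' = y$, $y(0) = 1$, where each step "constant plus integral of $f(t, \phi_m(t))$ from the initial point to $x$" simply integrates the previous polynomial. The test problem and the names `picard_step` and `evaluate` are assumptions made here for illustration; exact rational arithmetic is used so that the coefficients can be read off.

```python
from fractions import Fraction


def picard_step(coeffs):
    """One Picard step for the hypothetical problem y' = y, y(0) = 1.

    coeffs[i] is the coefficient of x**i in the current iterate; the result
    represents 1 + (integral from 0 to x of the current iterate).
    """
    return [Fraction(1)] + [c / (i + 1) for i, c in enumerate(coeffs)]


def evaluate(coeffs, x):
    """Value of the polynomial with the given coefficients at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


phi = [Fraction(1)]         # starting iterate: the constant initial value 1
for _ in range(3):
    phi = picard_step(phi)  # after three steps: 1 + x + x**2/2 + x**3/6
```

Here the iterates are the partial sums of the exponential series, so the uniform convergence on bounded intervals asserted by the theorem is visible directly.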
Proof. We will obtain (b) as an aside while proving (a). To simplify notation, we introduce the function
\[ \psi : [a, b] \to \mathbb{K}^n, \quad \psi(x) := Y(x, \sigma, \zeta). \]
Clearly, C is bounded and C is also closed (using the continuity of the distance function
d : R Kn R+
0 , d() := dist(, ), the continuity of the projection to the first
component 1 : R Kn R, and noting C = d1 [0, 1 ] 11 [a, b]). Thus, C is
compact, and the hypothesis of f being locally Lipschitz with respect to y implies f to
be globally Lipschitz with some Lipschitz constant L 0 on the compact set C by Prop.
3.13. We can now choose the number > 0 claimed to exist in (a) to be any number
(3.86)
< 1 .
(3.87)
Even though we are mostly interested in what happens on the open set W , it will be
convenient to define functions on the slightly larger compact set
W := [a, b] U ,
U := (x, y) R Kn : x [a, b],
y (x)
= d1 [0, ] 11 [a, b].
To proceed with the proof, we now carry out a form of the Picard iteration, recursively
defining a sequence of functions (m )mN0 , m : W Kn , defined recursively by
mN0
(3.88a)
(3.88b)
The proof will be concluded if we can show the (m )mN0 constitute a sequence of
continuous functions converging uniformly on W to Y W . As an intermediate step, we
establish the following properties of the m (simultaneously) by induction on m N0 :
(1) m is continuous for each m N0 .
(2) One has
mN0 ,
(x,,)W
m (x, , ) (x)
< 1
(x, m (x, , )) C .
mN0 ,
(x,,)W
m+1
|x |m+1
m+1 (x, , ) m (x, , )
L
.
(m + 1)!
To start the induction proof, notice that the continuity of implies the continuity of
0 . Moreover, if (x, , ) W , then
0 (x, , ) (x)
(3.88a)
=
()
=
Y (, , )
< 1 .
(3.89)
Also, from = Y (, , ) = (,) , we know, for each x, [a, b],
Z x
Z
Z
(x) () = +
f (t, (t)) dt
f (t, (t)) dt =
=
f L-Lip.
f (t, (t)) dt
Z x
f
t,
(t,
,
)
f
(t,
(t))
dt
0
Z x
0 (t, , ) (t)
dt
L
Z x
()
dt L |x | ,
L
completing the proof of (1) (3) for m = 0. For the induction step, let m N0 .
It is left as an exercise to prove the continuity of m+1 .
m
X
Lj+1 |x |j+1
(j + 1)!
j=0
(3.86)
establishing the estimate of (2) for m + 1. To prove the estimate in (3) for m replaced
by m + 1, one estimates, for each (x, , ) W ,
Z x
m+2 (x, , ) m+1 (x, , )
f t, m+1 (t, , ) f t, m (t, , )
dt
Z x
L
m+1 (t, , ) m (t, , ) dt
Z x m+1
m+1
ind.hyp.
L
|t
dt
L
(m + 1)!
m+2
L
|x |m+2
=
,
(m + 2)!
completing the induction proof of (1) (3).
As a consequence of (3), for each l, m N0 such that m > l:
(x,,)W
m
X
Lj (b a)j
m (x, , ) l (x, , )
.
j!
j=l+1
(3.90)
The convergence of the exponential series, thus, implies that (m (x, , ))mN0 is a
Cauchy sequence for each (x, , ) W , yielding pointwise convergence of the m to
some function : W Kn . Letting m tend to infinity in (3.90) then shows
(x,,)W
X
Lj (b a)j
, ) l (x, , )
(x,
,
j!
j=l+1
where the independence of the right-hand side with respect to (x, , ) W proves
m uniformly on W . The uniform convergence together with (1) then implies
to be continuous.
, ) solves (3.79) (with
In the final step of the proof, we show = Y on W , i.e. (,
k = 1). By Th. 1.5, we need to show
Z x
, ) dt
, ) = +
f t, (t,
(3.91)
(x,
(x,,)W
k{m1,m}
(x,,)W
Z
x
f t, m1 (t, , ) f t, (t, , ) dt
(x, , ) m (x, , )
+
Z x
, )
dt + L (b a).
m1 (t, , ) (t,
(3.92)
< + L
It is noted that we have, indeed, proved (b) as a byproduct, since we know (for example
from the Peano Th. 3.8) that must be defined on [ , + ] for some > 0 and
then m = m (, , ) on [ , + ] for each m N0 .
Theorem 3.35. As in Th. 3.34, consider the situation of Def. 3.32 for first-order problems, i.e. with $k = 1$, and with $f$ being continuous and locally Lipschitz with respect to $y$ on $G$ open. Then the global solution $(x,\xi,\eta) \mapsto Y(x,\xi,\eta)$ as defined in Def. 3.32 is continuous. Moreover, its domain $D_f$ is open.
Proof. Let $(x,\xi,\eta) \in D_f$. Then, using the notation from Def. 3.32, $x$ is in the domain of the maximal solution $\phi_{(\xi,\eta)}$, i.e. $x \in I_{(\xi,\eta)}$. Since $I_{(\xi,\eta)}$ is open, there must be $-\infty < a < x < b < \infty$ such that $[a,b] \subseteq I_{(\xi,\eta)}$ and then Th. 3.34(a) implies the global solution $Y$ to be continuous on $W$, where $W$ as defined in (3.83) is an open neighborhood of $(x,\xi,\eta)$. In particular, $(x,\xi,\eta)$ is an interior point of $D_f$ and $Y$ is continuous at $(x,\xi,\eta)$. As $(x,\xi,\eta)$ was arbitrary, $D_f$ must be open and $Y$ must be continuous.
Corollary 3.36. Consider the situation of Def. 3.32 with $f$ being continuous and locally Lipschitz with respect to $y$ on $G$ open. Then the global solution $(x,\xi,\eta) \mapsto Y(x,\xi,\eta)$ as defined in Def. 3.32 is continuous. Moreover, its domain $D_f$ is open.
Proof. It was part of the exercise that proved Cor. 3.16 to show that the right-hand side
F of the first-order problem equivalent to (3.79) in the sense of Th. 3.1 is continuous
and locally Lipschitz with respect to y, provided f is continuous and locally Lipschitz
with respect to y. Thus, according to Th. 3.35, the equivalent first-order problem has
a continuous global solution : DF Kkn , defined on some open set DF . As a
consequence of Th. 3.1(b), Y = 1 : DF Kn is the global solution to (3.79a). So
we have Df = DF and, as is continuous, so is Y .
It is sometimes interesting to consider situations where the right-hand side f depends
on some (vector of) parameters in addition to depending on x and y:
j{0,...,k1}
y (j) () = j Kn ,
(3.93b)
(3.94)
defined on
Df := {(x, , , ) R G : x I(,,) },
(3.95)
Corollary 3.38. Consider the situation of Def. 3.37 with f being continuous and locally
Lipschitz with respect to (y, ) on G open. Then the global solution Y as defined in Def.
3.94 is continuous. Moreover, its domain Df is open.
Proof. We consider k = 1 (i.e. (3.93a) is of first order) the case k > 1 can then, in the
usual way, be obtained by applying Th. 3.1. To apply Th. 3.35 to the present situation,
define the auxiliary function
(
fj (x, y) for j = 1, . . . , n,
(3.96)
F : G Kn+l , Fj (x, y) :=
0
for j = n + 1, . . . , n + l.
Then, since f is continuous and locally Lipschitz with respect to (y, ), F is continuous
and locally Lipschitz with respect to y, and we can apply Th. 3.35 to
y = F (x, y),
y() = (, ),
(3.97a)
(3.97b)
(x,,,)DF
Y (x, , , ) = e (x) .
4 Linear ODE
4.1 Definition, Setting
In Sec. 2.2, we saw that the solution of one-dimensional first-order linear ODE was
particularly simple. One can now combine the general theory of ODE with some linear
algebra to obtain results for n-dimensional linear ODE and, equivalently, for linear ODE
of higher order.
Notation 4.1. For $n \in \mathbb{N}$, let $\mathcal{M}(n,\mathbb{K})$ denote the set of all $n \times n$ matrices over $\mathbb{K}$.
Definition 4.2. Let $I \subseteq \mathbb{R}$ be a nontrivial interval, $n \in \mathbb{N}$, and let $A : I \longrightarrow \mathcal{M}(n,\mathbb{K})$ and $b : I \longrightarrow \mathbb{K}^n$ be continuous. An ODE of the form
\[ y' = A(x)\,y + b(x) \tag{4.1} \]
is called an $n$-dimensional linear ODE of first order. It is called homogeneous if, and only if, $b \equiv 0$; it is called inhomogeneous if, and only if, it is not homogeneous.
Using the notion of matrix norm (cf. Sec. G), it is not hard to show the right-hand side
of (4.1) is continuous and locally Lipschitz with respect to y and, thus, every initial
value problem for (4.1) has a unique maximal solution (exercise). However, to show
the maximal solution is always defined on all of I, we need some additional machinery,
which is developed in the next section.
4.2 Gronwall's Inequality
In the current section, we will provide Gronwall's inequality, which is also of interest outside the field of ODE. Here, Gronwall's inequality will allow us to prove the global existence of maximal solutions for ODE with linearly bounded right-hand side, a corollary being that maximal solutions of (4.1) are always defined on all of $I$.
As an auxiliary tool on our way to Gronwall's inequality, we will now briefly study (one-dimensional) differential inequalities:
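Definition 4.3. Given $G \subseteq \mathbb{R} \times \mathbb{R} = \mathbb{R}^2$, and $f : G \longrightarrow \mathbb{R}$, call
\[ y' \le f(x,y) \tag{4.2} \]
a (one-dimensional) differential inequality (of first order). A solution to (4.2) is a differentiable function $w : I \longrightarrow \mathbb{R}$ defined on a nontrivial interval $I \subseteq \mathbb{R}$ satisfying the two conditions
(i) $\{(x, w(x)) \in I \times \mathbb{R} : x \in I\} \subseteq G$,
(ii) $w'(x) \le f\big(x, w(x)\big)$ for each $x \in I$.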
w(x) (x).
(4.3)
x[a,b[
g(x, y, ) := f (x, y) +
(4.4)
(4.5)
Since $f$ is continuous and locally Lipschitz with respect to $y$, $g$ is continuous and locally Lipschitz with respect to $(y,\varepsilon)$. Thus, continuity in initial conditions as given by Cor. 3.38 applies, yielding the global solution $Y : D_g \longrightarrow \mathbb{R}$, $(x,\xi,\eta,\varepsilon) \mapsto Y(x,\xi,\eta,\varepsilon)$, to be continuous on the open set $D_g$.
We now consider an arbitrary compact subinterval [a, c] [a, b[ with a < c < b, noting
that it suffices to prove w on every such interval [a, c]. The set
:= (Id, a, (a), 0)[a, c] = (x, a, (a), 0) : x [a, c]
(4.6)
(4.7)
If we choose the distance in (4.7) to be meant with respect to the max-norm on R4 and
if 0 < < , then (x, a, (a), ) for each x [a, c], such that := Y (, a, (a), )
is defined on (a superset of) [a, c]. We proceed to prove w on [a, c]: Seeking a
contradiction, assume there exists x0 [a, c] such that w(x0 ) > (x0 ). Due to the
continuity of w and , w > must then hold in an entire neighborhood of x0 . On the
other hand, w(a) (a) = (a), such that, for
x1 := inf x < x0 : w(t) > (t) for each t ]x, x0 ] ,
a x1 < x0 and w(x1 ) = (x1 ). But then, for each sufficiently small h > 0,
w(x1 + h) w(x1 ) > (x1 + h) (x1 ),
implying
w(x1 + h) w(x1 )
(x1 + h) (x1 )
lim
= (x1 )
h0
h0
h
h
= g x1 , (x1 ), = f x1 , (x1 ) + > f x1 , (x1 ) = f x1 , w(x1 ) , (4.8)
w (x1 ) = lim
Thus, w on [a, c] holds for every 0 < < , and continuity of Y on Dg yields,
x[a,c]
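Theorem 4.5 (Gronwall's Inequality). Let $I := [a,b[$, where $-\infty < a < b \le \infty$. If $\alpha, \beta, \gamma : I \longrightarrow \mathbb{R}$ are continuous and $\beta(x) \ge 0$ for each $x \in I$, then
\[ \forall_{x\in I} \quad \gamma(x) \le \alpha(x) + \int_a^x \beta(t)\,\gamma(t)\,dt \tag{4.10} \]
implies
\[ \forall_{x\in I} \quad \gamma(x) \le \alpha(x) + \int_a^x \alpha(t)\,\beta(t)\,\exp\left(\int_t^x \beta(s)\,ds\right) dt. \tag{4.11} \]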
(4.12a)
(4.12b)
xI
(x) w(x).
xI
w (x) = (x)(x) = (x) (x) + (x) (x)w(x) + (x)(x),
(4.13)
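Continuously extending $\alpha$ and $\beta$ to $x < a$ (e.g. using the constant extensions $\alpha(x) := \alpha(a)$ and $\beta(x) := \beta(a)$ for $x < a$), we can consider the linear ODE corresponding to (4.13) on all of $]-\infty, b[$. Using the initial condition $y(a) = w(a) = 0$, yields the unique solution (employing the variation of constants Th. 2.3)
\[ \phi :\, ]-\infty, b[\, \longrightarrow \mathbb{R}, \]
\begin{align*}
\phi(x) &:= \exp\left(\int_a^x \beta(s)\,ds\right) \int_a^x \exp\left(-\int_a^t \beta(s)\,ds\right) \alpha(t)\,\beta(t)\,dt \\
&= \int_a^x \alpha(t)\,\beta(t)\,\exp\left(\int_t^x \beta(s)\,ds\right) dt. \tag{4.14}
\end{align*}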
Z
(s) ds
t
dt ,
(4.15)
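implies
\[ \forall_{x\in I} \quad \gamma(x) \le C \exp\left(\int_a^x \beta(t)\,dt\right). \tag{4.17} \]
We apply Gronwall's inequality of Th. 4.5 with $\alpha \equiv C$ together with the fundamental theorem of calculus to obtain the estimate
\begin{align*}
\gamma(x) &\le C + \int_a^x C\,\beta(t)\,\exp\left(\int_t^x \beta(s)\,ds\right) dt \\
&= C - C \int_a^x \big(-\beta(t)\big) \exp\left(-\int_x^t \beta(s)\,ds\right) dt = C - C \left[\exp\left(-\int_x^t \beta(s)\,ds\right)\right]_{t=a}^{t=x} \\
&= C \exp\left(\int_a^x \beta(t)\,dt\right). \tag{4.18}
\end{align*}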
The following Th. 4.7 will be applied to show maximal solutions to linear ODE are
always defined on all of I (with I as in Def. 4.2). However, Th. 4.7 is often also useful
to obtain the domains of maximal solutions for nonlinear ODE.
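Theorem 4.7. Let $n \in \mathbb{N}$, let $I \subseteq \mathbb{R}$ be an open interval, and let $f : I \times \mathbb{K}^n \longrightarrow \mathbb{K}^n$ be continuous. If there exist nonnegative continuous functions $\gamma, \beta : I \longrightarrow \mathbb{R}_0^+$ such that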
(x,y)IKn
(4.19)
x0
Since the continuous function is uniformly bounded on the compact interval [x0 , d],
Z x
k(x)k C +
(t) k(t)k dt .
C0
x[x0 ,d[
x0
x[x0 ,d[
k(x)k C exp
Z
(t) dt
x0
C eM (dx0 ) ,
(4.21)
where M 0 is a uniform bound for the continuous function on the compact interval
[x0 , d]. As (4.21) states that the graph
gr+ () = x, (x) G : x [x0 , d[
K := [x0 , d] y Kn : kyk C eM (dx0 ) ,
Now assume $a < c$. The idea is to apply the time reversion Lem. 1.9(b): According to Lem. 1.9(b), $\psi :\, ]-d,-c[\, \longrightarrow \mathbb{K}^n$, $\psi(x) = \phi(-x)$, is a solution to $y' = -f(-x,y)$ and the first part of the proof above shows $\psi$ to have an extension to the right. However, then Rem. 3.21 tells us $\phi$ has an extension to the left.
4.3
Theorem 4.8. Consider the setting of Def. 4.2 with an open interval $I$. Then every initial value problem consisting of the linear ODE (4.1) and $y(x_0) = y_0$, $x_0 \in I$, $y_0 \in \mathbb{K}^n$, has a unique maximal solution $\phi : I \longrightarrow \mathbb{K}^n$ (note that $\phi$ is defined on all of $I$).
Proof. It is an exercise to show the right-hand side of (4.1) is continuous and locally Lipschitz with respect to $y$. Thus, every initial value problem has a unique maximal solution by using Cor. 3.16 and Th. 3.22. That each maximal solution is defined on $I$ follows from Th. 4.7, as
\[ \forall_{x\in I} \quad \big\|A(x)\,y + b(x)\big\| \le \big\|b(x)\big\| + \big\|A(x)\big\|\,\|y\|, \]
where $\|A(x)\|$ denotes the matrix norm of $A(x)$ induced by the norm $\|\cdot\|$ on $\mathbb{K}^n$ (cf. Appendix G).
We will now proceed to study the solution spaces of linear ODE; as it turns out, these solution spaces inherit the linear structure of the ODE.
Notation 4.9. Again, we consider the setting of Def. 4.2. Define $\mathcal{L}_i$ and $\mathcal{L}_h$ to be the respective sets of solutions to (4.1) and its homogeneous version, i.e.
\[ \mathcal{L}_i := \big\{ (\phi : I \longrightarrow \mathbb{K}^n) : \phi' = A\phi + b \big\}, \tag{4.22a} \]
\[ \mathcal{L}_h := \big\{ (\phi : I \longrightarrow \mathbb{K}^n) : \phi' = A\phi \big\}. \tag{4.22b} \]
Lemma 4.10. Using Not. 4.9, we have
\[ \forall_{\phi\in\mathcal{L}_i} \quad \mathcal{L}_i = \phi + \mathcal{L}_h = \{\phi + \psi : \psi \in \mathcal{L}_h\}, \tag{4.23} \]
i.e. one obtains all solutions to the inhomogeneous equation (4.1) by adding solutions of the homogeneous equation to a particular solution to the inhomogeneous equation (note that this is completely analogous to what occurs for solutions to linear systems of equations in linear algebra).
Proof. Exercise.
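had the set of solutions $\mathcal{L}$ as in (1.17), namely
\[ \mathcal{L} = \big\{ (c_1 \sin + c_2 \cos) : [a,b] \longrightarrow \mathbb{K} : c_1, c_2 \in \mathbb{K} \big\}. \]
We are now in a position to verify this claim: The second-order ODE (1.16) is equivalent to the homogeneous linear first-order ODE
\[ \begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} y_2 \\ -y_1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \tag{4.24} \]
with the vector space of solutions $\mathcal{L}_h$ of dimension 2 over $\mathbb{K}$. Clearly, $\phi_1, \phi_2 \in \mathcal{L}_h$, where $\phi_1, \phi_2 : [a,b] \longrightarrow \mathbb{K}^2$ with
\[ \phi_1(x) := \begin{pmatrix} \sin x \\ \cos x \end{pmatrix}, \quad \phi_2(x) := \begin{pmatrix} \cos x \\ -\sin x \end{pmatrix}. \tag{4.25} \]
Moreover, $\phi_1$ and $\phi_2$ are linearly independent (e.g. since $\phi_1(0) = \binom{0}{1}$ and $\phi_2(0) = \binom{1}{0}$ are linearly independent, so are $\phi_1, \phi_2 : \mathbb{R} \longrightarrow \mathbb{K}^2$ by Th. 4.11(b), implying, again by Th. 4.11(b), the linear independence of $\phi_1(a), \phi_2(a)$, finally implying the linear independence of $\phi_1, \phi_2 : [a,b] \longrightarrow \mathbb{K}^2$). Thus,
\[ \mathcal{L}_h = \big\{ (c_1 \phi_1 + c_2 \phi_2) : [a,b] \longrightarrow \mathbb{K}^2 : c_1, c_2 \in \mathbb{K} \big\} \tag{4.26} \]
and, since, according to Th. 3.1 the solutions to (1.16) are precisely the first components of solutions to (4.24), the representation (1.17) is verified.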
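The two claims used in this example (that $\phi_1, \phi_2$ solve the system and that they are linearly independent) can be spot-checked numerically. The following is a rough sketch, assuming the system matrix $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ and the vector functions $(\sin, \cos)$ and $(\cos, -\sin)$; the function names `residual` and `det_fundamental` are hypothetical, and derivatives are approximated by central differences.

```python
import math

# Assumed system matrix of the first-order system equivalent to y'' = -y.
A = [[0.0, 1.0], [-1.0, 0.0]]


def phi1(x):
    return [math.sin(x), math.cos(x)]


def phi2(x):
    return [math.cos(x), -math.sin(x)]


def residual(phi, x, h=1e-6):
    """Max-norm of phi'(x) - A phi(x), with phi' from a central difference."""
    deriv = [(p - m) / (2 * h) for p, m in zip(phi(x + h), phi(x - h))]
    value = phi(x)
    a_phi = [sum(A[i][j] * value[j] for j in range(2)) for i in range(2)]
    return max(abs(deriv[i] - a_phi[i]) for i in range(2))


def det_fundamental(x):
    """Determinant of the matrix with columns phi1(x) and phi2(x)."""
    a, c = phi1(x)
    b, d = phi2(x)
    return a * d - b * c
```

The determinant equals $-\sin^2 x - \cos^2 x = -1$ for every $x$, so the columns are linearly independent at each point.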
4.4
\[ \Phi := \begin{pmatrix} \phi_{11} & \dots & \phi_{1n} \\ \vdots & & \vdots \\ \phi_{n1} & \dots & \phi_{nn} \end{pmatrix}, \tag{4.27} \]
where the $k$th column of the matrix consists of the component functions $\phi_{1k}, \dots, \phi_{nk}$ of $\phi_k$, $k \in \{1,\dots,n\}$, a fundamental system or a fundamental matrix solution for (4.1). The latter term is justified by the observation that $\Phi : I \longrightarrow \mathcal{M}(n,\mathbb{K})$ can be interpreted as a solution to the matrix-valued ODE
\[ Y' = A(x)\,Y : \tag{4.28} \]
Indeed,
\[ \Phi' = \big(\phi_1', \dots, \phi_n'\big) = \big(A(x)\,\phi_1, \dots, A(x)\,\phi_n\big) = A(x)\,\Phi. \]
(i) $\Phi$ is a fundamental system for (4.1).
(ii) There exists $x_0 \in I$ such that $\det \Phi(x_0) \neq 0$.
(iii) $\det \Phi(x) \neq 0$ for every $x \in I$.
Proof. The equivalences are a direct consequence of the equivalences in Th. 4.11(b).
Theorem 4.15 (Variation of Constants). Consider the setting of Def. 4.2. If $\Phi : I \longrightarrow \mathcal{M}(n,\mathbb{K})$ is a fundamental system for (4.1), then the unique solution $\psi : I \longrightarrow \mathbb{K}^n$ of the initial value problem consisting of (4.1) and $y(x_0) = y_0$, $(x_0,y_0) \in I \times \mathbb{K}^n$, is given by
\[ \psi : I \longrightarrow \mathbb{K}^n, \quad \psi(x) = \Phi(x)\,\Phi^{-1}(x_0)\,y_0 + \Phi(x) \int_{x_0}^x \Phi^{-1}(t)\,b(t)\,dt. \tag{4.29} \]
(4.30)
constitutes a fundamental matrix solution in the sense of Def. and Rem. 4.13 (since $1/\phi_0$ exists). Taking into account $\Phi(x_0) = \phi_0(x_0) = 1$, we obtain, for each $x \in I$,
\begin{align*}
\psi(x) &\overset{(4.29)}{=} \Phi(x)\,\Phi^{-1}(x_0)\,y_0 + \Phi(x) \int_{x_0}^x \Phi^{-1}(t)\,b(t)\,dt \\
&= \phi_0(x) \left( y_0 + \int_{x_0}^x \phi_0(t)^{-1}\,b(t)\,dt \right), \tag{4.31}
\end{align*}
which is (2.2a).
In Sec. 4.6, we will study methods for actually finding fundamental matrix solutions in
cases where A is constant. However, in general, fundamental matrix solutions are often
not explicitly available. In such situations, the following Th. 4.17 can sometimes help
to extract information about solutions.
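Theorem 4.17 (Liouville's Formula). Consider the setting of Def. 4.2 and recall the trace of an $n \times n$ matrix $A = (a_{kl})$ is defined by
\[ \operatorname{tr} A := \sum_{k=1}^n a_{kk}. \]
If $\Phi : I \longrightarrow \mathcal{M}(n,\mathbb{K})$ is a solution to $Y' = A(x)\,Y$, then
\[ \forall_{x_0,x\in I} \quad \det \Phi(x) = \det \Phi(x_0)\,\exp\left(\int_{x_0}^x \operatorname{tr} A(t)\,dt\right). \tag{4.32} \]
Proof. Exercise.
4.5 Higher-Order, Wronskian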
In Th. 3.1, we saw that higher-order ODE are equivalent to systems of first-order ODE.
We can now combine Th. 3.1 with our findings regarding first-order linear ODE to help
with the solution of higher-order linear ODE.
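Definition 4.18. Let $I \subseteq \mathbb{R}$ be a nontrivial interval, $n \in \mathbb{N}$. Let $b : I \longrightarrow \mathbb{K}$ and $a_0, \dots, a_{n-1} : I \longrightarrow \mathbb{K}$ be continuous functions. Then a (1-dimensional) linear ODE of $n$th order is an equation of the form
\[ y^{(n)} = a_{n-1}(x)\,y^{(n-1)} + \dots + a_1(x)\,y' + a_0(x)\,y + b(x). \tag{4.33} \]
It is called homogeneous if, and only if, $b \equiv 0$; it is called inhomogeneous if, and only if, it is not homogeneous. Analogous to (4.22), define the respective sets of solutions
\[ \mathcal{H}_i := \Big\{ (\phi : I \longrightarrow \mathbb{K}) : \phi^{(n)} = b + \sum_{k=0}^{n-1} a_k\,\phi^{(k)} \Big\}, \tag{4.34a} \]
\[ \mathcal{H}_h := \Big\{ (\phi : I \longrightarrow \mathbb{K}) : \phi^{(n)} = \sum_{k=0}^{n-1} a_k\,\phi^{(k)} \Big\}. \tag{4.34b} \]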
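\[ W(\phi_1,\dots,\phi_n) : I \longrightarrow \mathbb{K}, \quad W(\phi_1,\dots,\phi_n)(x) := \det \begin{pmatrix} \phi_1(x) & \dots & \phi_n(x) \\ \phi_1'(x) & \dots & \phi_n'(x) \\ \vdots & & \vdots \\ \phi_1^{(n-1)}(x) & \dots & \phi_n^{(n-1)}(x) \end{pmatrix}. \tag{4.35} \]
Theorem 4.20. Consider the setting of Def. 4.18.
(a) If $\mathcal{H}_i$ and $\mathcal{H}_h$ are the sets defined in (4.34), then $\mathcal{H}_h$ is an $n$-dimensional vector space over $\mathbb{K}$ and, if $\phi \in \mathcal{H}_i$ is arbitrary, then
\[ \mathcal{H}_i = \phi + \mathcal{H}_h. \tag{4.36} \]
(ii) There exists $x_0 \in I$ such that the Wronskian does not vanish: $W(\phi_1,\dots,\phi_n)(x_0) \neq 0$.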
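\[ y' = \begin{pmatrix} 0 & 1 & 0 & \dots & 0 & 0 \\ 0 & 0 & 1 & \dots & 0 & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & 1 & 0 \\ 0 & 0 & 0 & \dots & 0 & 1 \\ a_0(x) & a_1(x) & a_2(x) & \dots & a_{n-2}(x) & a_{n-1}(x) \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_{n-2} \\ y_{n-1} \\ y_n \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 0 \\ b(x) \end{pmatrix} =: \tilde{A}(x)\,y + \tilde{b}(x). \]
Define
\[ \mathcal{L}_i := \big\{ (\Phi : I \longrightarrow \mathbb{K}^n) : \Phi' = \tilde{A}\,\Phi + \tilde{b} \big\}, \quad \mathcal{L}_h := \big\{ (\Phi : I \longrightarrow \mathbb{K}^n) : \Phi' = \tilde{A}\,\Phi \big\}. \]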
n
..
.
:=
.
(n1)
Then
Hh
Th. 3.1(a),(b)
{1 : Lh }
and
Hi
Th. 3.1(a),(b)
1 :
Li } (4.23)
1 :
+ Lh }
{
= {
{( + )1 : Lh } = + Hh .
(4.37)
As a consequence of Th. 3.1, the map $J : \mathcal{L}_h \longrightarrow \mathcal{H}_h$, $J(\Phi) := \Phi_1$, is a linear isomorphism, implying that $\mathcal{H}_h$, like $\mathcal{L}_h$, is an $n$-dimensional vector space over $\mathbb{K}$.
(l1)
xI
1 (x)
...
n (x)
1 (x)
...
n (x)
..
..
.
.
(n1)
(n1)
1
(x) . . . n (x)
such that $\det \Phi(x) = W(\phi_1,\dots,\phi_n)(x)$ for each $x \in I$. Since Th. 3.1 yields $\Phi_1, \dots, \Phi_n \in \mathcal{L}_h$ if, and only if, $\phi_1, \dots, \phi_n \in \mathcal{H}_h$, the equivalences of (b) follow from the equivalences of Cor. 4.14.
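Example 4.21. Consider $a_0, a_1 : \mathbb{R}^+ \longrightarrow \mathbb{K}$, $a_1(x) := 1/(2x)$, $a_0(x) := -1/(2x^2)$, and
\[ y'' = a_1(x)\,y' + a_0(x)\,y = \frac{y'}{2x} - \frac{y}{2x^2}. \]
One checks that $\phi_1, \phi_2 : \mathbb{R}^+ \longrightarrow \mathbb{K}$,
\[ \phi_1(x) := x, \quad \phi_2(x) := \sqrt{x}, \]
are solutions. The Wronskian is
\[ W(\phi_1,\phi_2) : \mathbb{R}^+ \longrightarrow \mathbb{K}, \quad W(\phi_1,\phi_2)(x) = \det \begin{pmatrix} x & \sqrt{x} \\ 1 & 1/(2\sqrt{x}) \end{pmatrix} = \frac{\sqrt{x}}{2} - \sqrt{x} = -\frac{\sqrt{x}}{2} < 0, \]
i.e. $\phi_1$ and $\phi_2$ span $\mathcal{H}_h$ according to Th. 4.20(b):
\[ \mathcal{H}_h = \{c_1 \phi_1 + c_2 \phi_2 : c_1, c_2 \in \mathbb{K}\}. \]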
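The computation in Example 4.21 can be spot-checked numerically. The sketch below assumes the signs $a_1(x) = 1/(2x)$, $a_0(x) = -1/(2x^2)$ and the solutions $x$ and $\sqrt{x}$ (these signs are what make both functions solve the equation); the helper names `wronskian` and `ode_residual` are hypothetical, and derivatives in the residual are finite-difference approximations.

```python
import math


def phi1(x):
    return x


def phi2(x):
    return math.sqrt(x)


def wronskian(x):
    """det [[phi1, phi2], [phi1', phi2']] with the exact first derivatives."""
    return phi1(x) * (1.0 / (2.0 * math.sqrt(x))) - phi2(x) * 1.0


def ode_residual(phi, x, h=1e-4):
    """Approximate y'' - (y'/(2x) - y/(2x^2)) for y = phi at the point x > 0."""
    d1 = (phi(x + h) - phi(x - h)) / (2 * h)
    d2 = (phi(x + h) - 2 * phi(x) + phi(x - h)) / h**2
    return d2 - (d1 / (2 * x) - phi(x) / (2 * x**2))
```

At $x = 4$ the Wronskian evaluates to $4 \cdot \tfrac14 - 2 = -1 = -\sqrt{4}/2$, and both residuals are zero up to discretization error.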
4.6 Constant Coefficients
For 1-dimensional first-order linear ODE, we obtained a solution formula in Th. 2.3 in
terms of integrals (of course, in general, evaluating integrals can still be very difficult,
and one might need effective and efficient numerical methods). In the previous sections,
we have studied systems of first-order linear ODE as well as linear ODE of higher order.
Unfortunately, there are no general solution formulas for these situations (one can use
(4.29) if one knows a fundamental system, but the problem is the absence of a general
procedure to obtain such a fundamental system). However, there is a more satisfying
solution theory for the situation of so-called constant coefficients, i.e. if A in (4.1) and
the a0 , . . . , an1 in (4.33) do not depend on x.
4.6.1
(4.38)
\[ P(\partial_x) f := \sum_{j=0}^n a_j\,\partial_x^j f = \sum_{j=0}^n a_j\,f^{(j)}. \tag{4.39} \]
Remark 4.24. Using Not. 4.23(b), the ODE (4.38) can be written concisely as
\[ P(\partial_x)\,y = b(x), \quad \text{where} \quad P(x) := x^n - \sum_{j=0}^{n-1} a_j\,x^j. \tag{4.40} \]
The following Prop. 4.25 implies that the differential operator $P(\partial_x)$ does not, actually, depend on the representation of the polynomial $P$.
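Proposition 4.25. Let $P, P_1, P_2 \in \mathcal{P}$.
(a) If $P = P_1 + P_2$ and $n := \max\{\deg P_1, \deg P_2\}$, then
\[ \forall_{f\in D^n(I)} \quad P(\partial_x) f = P_1(\partial_x) f + P_2(\partial_x) f. \]
\[ \forall_{f\in D^n(I)} \quad P(\partial_x) f = P_1(\partial_x)\big(P_2(\partial_x) f\big). \]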
Proof. Exercise.
f (x) := ex .
(4.41)
P (x )f (x) = P () ex .
(4.42)
Pn
j=0
aj xj . One
j=0
proving (4.42).
Theorem 4.27. If $a_0, \dots, a_{n-1} \in \mathbb{K}$, $n \in \mathbb{N}$, and $P(x) = x^n - \sum_{j=0}^{n-1} a_j x^j$ has the distinct zeros $\lambda_1, \dots, \lambda_n \in \mathbb{K}$ (i.e. $P(\lambda_1) = \dots = P(\lambda_n) = 0$), then $(\phi_1, \dots, \phi_n)$, where
\[ \forall_{j\in\{1,\dots,n\}} \quad \phi_j : I \longrightarrow \mathbb{K}, \quad \phi_j(x) := e^{\lambda_j x}, \tag{4.43} \]
is a basis of $\mathcal{H}_h$.
(4.44)
since the $\lambda_j$ are all distinct. We have used that the Wronskian, in the present case, turns out to be a Vandermonde determinant. The formula (H.2) for this type of determinant is provided and proved in Appendix H. We also used that the determinant of a matrix is the same as the determinant of its transpose: $\det A = \det A^{\mathrm{t}}$. From $W(\phi_1,\dots,\phi_n)(0) \neq 0$ and Th. 4.20(b), we conclude that $(\phi_1,\dots,\phi_n)$ is a basis of $\mathcal{H}_h$.
Example 4.28. We consider the third-order linear ODE
\[ y''' = 2y'' - y' + 2y, \tag{4.45} \]
which can be written as $P(\partial_x)\,y = 0$ with
\[ P(x) := x^3 - 2x^2 + x - 2 = (x^2+1)(x-2) = (x-i)(x+i)(x-2), \tag{4.46} \]
i.e. $P$ has the distinct zeros $\lambda_1 = i$, $\lambda_2 = -i$, $\lambda_3 = 2$. Thus, according to Th. 4.27, the three functions
\[ \phi_1, \phi_2, \phi_3 : \mathbb{R} \longrightarrow \mathbb{C}, \quad \phi_1(x) = e^{ix}, \quad \phi_2(x) = e^{-ix}, \quad \phi_3(x) = e^{2x}, \tag{4.47} \]
form a basis of the $\mathbb{C}$-vector space $\mathcal{H}_h$. If we consider (4.45) as an ODE over $\mathbb{R}$, then we are interested in a basis of the $\mathbb{R}$-vector space $\mathcal{H}_h$. We can use linear combinations of $\phi_1$ and $\phi_2$ to obtain such a basis (cf. Rem. 4.33(b) below):
\[ \psi_1, \psi_2 : \mathbb{R} \longrightarrow \mathbb{R}, \quad \psi_1(x) = \frac{e^{ix} + e^{-ix}}{2} = \cos x, \quad \psi_2(x) = \frac{e^{ix} - e^{-ix}}{2i} = \sin x. \tag{4.48} \]
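The zeros and the real solutions of Example 4.28 can be verified directly. The sketch below assumes the characteristic polynomial $P(x) = x^3 - 2x^2 + x - 2$ and the equation $y''' = 2y'' - y' + 2y$; the names `P`, `roots`, and `residual` are hypothetical helpers. Derivatives are supplied analytically rather than approximated.

```python
import math


def P(z):
    """Assumed characteristic polynomial x^3 - 2x^2 + x - 2, in Horner form."""
    return ((z - 2) * z + 1) * z - 2


# Claimed zeros: i, -i, and 2.
roots = [1j, -1j, 2]


def residual(y, d1, d2, d3, x):
    """y''' - (2y'' - y' + 2y) at x, given y and its first three derivatives."""
    return d3(x) - (2 * d2(x) - d1(x) + 2 * y(x))


def neg_sin(x):
    return -math.sin(x)


def neg_cos(x):
    return -math.cos(x)


def exp2(x):
    return math.exp(2 * x)


# Residuals of cos, sin, and exp(2x) at a sample point.
cos_res = residual(math.cos, neg_sin, neg_cos, math.sin, 0.8)
sin_res = residual(math.sin, math.cos, neg_sin, neg_cos, 0.8)
exp_res = residual(exp2, lambda x: 2 * exp2(x), lambda x: 4 * exp2(x),
                   lambda x: 8 * exp2(x), 0.8)
```

All three residuals vanish up to rounding, consistent with $(\cos, \sin, e^{2x})$ being real solutions of (4.45).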
By working a bit harder, one can generalize Th. 4.27 to the case where P has zeros of
higher multiplicity. We provide this generalization in Th. 4.32 below after recalling the
notion of zeros of higher multiplicity in Rem. and Def. 4.29, and after providing two
preparatory lemmas.
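Remark and Definition 4.29. According to the fundamental theorem of algebra (cf. [Phi13a, Th. 8.32, Cor. 8.33]), for every polynomial $P \in \mathcal{P}_n$ with $\deg P = n$, $n \in \mathbb{N}$, there exists $r \in \mathbb{N}$ with $r \le n$, $k_1, \dots, k_r \in \mathbb{N}$ with $k_1 + \dots + k_r = n$, and distinct numbers $\lambda_1, \dots, \lambda_r \in \mathbb{C}$ such that
\[ P(x) = (x - \lambda_1)^{k_1} \cdots (x - \lambda_r)^{k_r}. \tag{4.49} \]
Lemma 4.31. Let $P \in \mathcal{P}$ and $\lambda \in \mathbb{K}$ such that $P(\lambda) \neq 0$. Then, for each $Q \in \mathcal{P}$ with $\deg Q = k$, $k \in \mathbb{N}_0$, it holds that
\[ P(\partial_x)\big(Q(x)\,e^{\lambda x}\big) = R(x)\,e^{\lambda x}, \tag{4.52} \]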
xR
n
X
j=0
bj (x )j ,
n N0 ,
(4.53)
n
n
(4.53) X
(4.50) X
=
bj (x )j Q(x) ex =
bj Q(j) (x) ex ,
j=0
j=0
Pk
j=0 bj
where
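\[ \forall_{j\in\{1,\dots,r\}} \; \forall_{m\in\{0,\dots,k_j-1\}} \quad \phi_{jm} : I \longrightarrow \mathbb{K}, \quad \phi_{jm}(x) := x^m e^{\lambda_j x}, \tag{4.54b} \]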
(4.50)
kj >m
= Qj (x ) xkj xm ej x = 0,
Qj Pkj 1
Qj 0.
j=1
j=1,...,r
j=1,...,r
(4.55)
Rj (x) ej x = 0
(4.56)
j=1
with suitable $R_j \in \mathcal{P}$, since Lem. 4.30 yields $(\partial_x - \lambda_r)^{k_r}\big(Q_r(x)\,e^{\lambda_r x}\big) = Q_r^{(k_r)}(x)\,e^{\lambda_r x} = 0$ and, for $j < r$, Lem. 4.31 applies due to $(\lambda_j - \lambda_r)^{k_r} \neq 0$, also providing $\deg R_j = \deg Q_j$. Thus, none of the $R_j$ in (4.56) can vanish identically, violating the induction hypothesis. This finishes the proof of $Q_j \equiv 0$ for each $j = 1, \dots, r$ and the proof of the theorem.
As it can occur in Th. 4.32 that $P \in \mathcal{P}[\mathbb{R}]$, but $\lambda_j \in \mathbb{C} \setminus \mathbb{R}$ for some or all of the zeros $\lambda_j$, the question arises of how to obtain a basis of the $\mathbb{R}$-vector space $\mathcal{H}_h$ from the basis of the $\mathbb{C}$-vector space $\mathcal{H}_h$ provided by Th. 4.32. The following Rem. 4.33(b) answers this question.
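Remark 4.33. (a) If $\lambda_1, \lambda_2 \in \mathbb{C}$, then complex conjugation has the properties (cf. [Phi13a, Def. and Rem. 5.5])
\[ \overline{\lambda_1 + \lambda_2} = \bar{\lambda}_1 + \bar{\lambda}_2, \quad \overline{\lambda_1 \lambda_2} = \bar{\lambda}_1\,\bar{\lambda}_2. \]
(b) Consider the situation of Th. 4.32 with $P \in \mathcal{P}[\mathbb{R}]$. Using (a), if $\phi_{jm} : I \longrightarrow \mathbb{C}$, $\phi_{jm}(x) = x^m e^{\lambda_j x}$, $\lambda_j \in \mathbb{C} \setminus \mathbb{R}$, occurs in a basis for the $\mathbb{C}$-vector space $\mathcal{H}_h$ (with $m = 0$ in the special case of Th. 4.27), then $\phi_{\tilde{j}m} : I \longrightarrow \mathbb{C}$, $\phi_{\tilde{j}m}(x) = x^m e^{\lambda_{\tilde{j}} x}$, with $\lambda_{\tilde{j}} = \bar{\lambda}_j$ will occur as well. Noting that, for each $x \in \mathbb{R}$ and each $\lambda \in \mathbb{C}$,
\[ e^{\lambda x} = e^{x(\operatorname{Re}\lambda + i \operatorname{Im}\lambda)} = e^{x \operatorname{Re}\lambda}\big(\cos(x \operatorname{Im}\lambda) + i \sin(x \operatorname{Im}\lambda)\big), \tag{4.57a} \]
\[ e^{\bar{\lambda} x} = e^{x(\operatorname{Re}\lambda - i \operatorname{Im}\lambda)} = e^{x \operatorname{Re}\lambda}\big(\cos(x \operatorname{Im}\lambda) - i \sin(x \operatorname{Im}\lambda)\big), \tag{4.57b} \]
\[ \tfrac{1}{2}\big(e^{\lambda x} + e^{\bar{\lambda} x}\big) = e^{x \operatorname{Re}\lambda} \cos(x \operatorname{Im}\lambda), \tag{4.57c} \]
\[ \tfrac{1}{2i}\big(e^{\lambda x} - e^{\bar{\lambda} x}\big) = e^{x \operatorname{Re}\lambda} \sin(x \operatorname{Im}\lambda), \tag{4.57d} \]
one can define
\[ \psi_{jm} : I \longrightarrow \mathbb{R}, \quad \psi_{jm}(x) := \tfrac{1}{2}\big(\phi_{jm}(x) + \phi_{\tilde{j}m}(x)\big) = x^m e^{x \operatorname{Re}\lambda_j} \cos(x \operatorname{Im}\lambda_j), \tag{4.58a} \]
\[ \psi_{\tilde{j}m} : I \longrightarrow \mathbb{R}, \quad \psi_{\tilde{j}m}(x) := \tfrac{1}{2i}\big(\phi_{jm}(x) - \phi_{\tilde{j}m}(x)\big) = x^m e^{x \operatorname{Re}\lambda_j} \sin(x \operatorname{Im}\lambda_j). \tag{4.58b} \]
If one replaces each pair $\phi_{jm}, \phi_{\tilde{j}m}$ in the basis for the $\mathbb{C}$-vector space $\mathcal{H}_h$ with the corresponding pair $\psi_{jm}, \psi_{\tilde{j}m}$, then one obtains a basis for the $\mathbb{R}$-vector space $\mathcal{H}_h$: This follows from
\[ \begin{pmatrix} \psi_{jm} \\ \psi_{\tilde{j}m} \end{pmatrix} = A \begin{pmatrix} \phi_{jm} \\ \phi_{\tilde{j}m} \end{pmatrix} \quad \text{with} \quad A := \begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2i} & -\frac{1}{2i} \end{pmatrix}, \quad \det A = -\frac{1}{2i} \neq 0. \tag{4.59} \]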
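Example 4.34. We consider the fourth-order linear ODE
\[ y^{(4)} = -8y'' - 16y, \tag{4.60} \]
which can be written as $P(\partial_x)\,y = 0$ with
\[ P(x) := x^4 + 8x^2 + 16 = (x^2+4)^2 = (x-2i)^2 (x+2i)^2, \tag{4.61} \]
i.e. $P$ has the zeros $\lambda_1 = 2i$, $\lambda_2 = -2i$, both with multiplicity 2. Thus, according to Th. 4.32, the four functions
\[ \phi_{10}, \phi_{11}, \phi_{20}, \phi_{21} : \mathbb{R} \longrightarrow \mathbb{C}, \]
\[ \phi_{10}(x) = e^{2ix}, \quad \phi_{11}(x) = x\,e^{2ix}, \quad \phi_{20}(x) = e^{-2ix}, \quad \phi_{21}(x) = x\,e^{-2ix}, \]
form a basis of the $\mathbb{C}$-vector space $\mathcal{H}_h$. If we consider (4.60) as an ODE over $\mathbb{R}$, we can use (4.58) to obtain the basis $(\psi_{10}, \psi_{11}, \psi_{20}, \psi_{21})$ of the $\mathbb{R}$-vector space $\mathcal{H}_h$, where
\[ \psi_{10}, \psi_{11}, \psi_{20}, \psi_{21} : \mathbb{R} \longrightarrow \mathbb{R}, \]
\[ \psi_{10}(x) = \cos(2x), \quad \psi_{11}(x) = x \cos(2x), \quad \psi_{20}(x) = \sin(2x), \quad \psi_{21}(x) = x \sin(2x). \]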
If (4.38) is inhomogeneous, then one can use Th. 4.32 and, if necessary, Rem. 4.33(b),
to obtain a basis of the homogeneous solution space Hh , then using the equivalence with
systems of first-order linear ODE and variation of constants according to Th. 4.15 to
solve (4.38). However, if the function b in (4.38) is such that the following Th. 4.35
applies, then one can avoid using the above strategy to obtain a particular solution
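$\phi$ to (4.38) (and, thus, the entire solution space via $\mathcal{H}_i = \phi + \mathcal{H}_h$).
Theorem 4.35. Let $a_0, \dots, a_{n-1} \in \mathbb{K}$, $n \in \mathbb{N}$, and $P(x) = x^n - \sum_{j=0}^{n-1} a_j x^j$. Consider
\[ P(\partial_x)\,y = Q(x)\,e^{\mu x}, \quad Q \in \mathcal{P}, \quad \mu \in \mathbb{K}. \tag{4.62} \]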
(x) := R(x) ex ,
(4.63)
(x) := R(x) e ,
R P,
R(x) =
m+k
X
j=k
c j xj ,
ck , . . . , cm+k K.
(4.64)
The reason behind the terms no resonance and resonance will be explained in the following Example 4.36.
Proof. Exercise.
a R \ {0},
P (t) := t2 + 02 = (t i0 )(t + i0 ).
(4.65)
(4.66)
Note that the unknown function is written as $x$ depending on the variable $t$ (instead of $y$ depending on $x$). This is due to the physical interpretation of (4.65), where $x$ represents the position of a so-called harmonic oscillator at time $t$, having angular frequency $\omega_0$ and being subjected to a periodic external force of angular frequency $\omega$ and amplitude $a$. We can find a particular solution to (4.65) by applying Th. 4.35 to
\[ P(\partial_t)\,x = a\,e^{i\omega t}. \tag{4.67} \]
\[ \phi(t) := \operatorname{Re} \phi_0(t) = \frac{a}{\omega_0^2 - \omega^2} \cos(\omega t), \tag{4.68b} \]
is a solution to (4.65).
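(b) Case $\omega = \omega_0$: In this case, one says that the oscillator and the external force are in resonance, which explains the term resonance in Th. 4.35(b). In this case, we can apply Th. 4.35(b) with $\mu := i\omega$ and $Q \equiv a$, i.e. $m = 0$, $k = 1$, yielding $R(t) = ct$ for some $c \in \mathbb{C}$. To determine $c$, we plug $x(t) = R(t)\,e^{\mu t}$ into (4.67):
\begin{align*}
P(\partial_t)\big(ct\,e^{i\omega t}\big) &= \partial_t\big(c\,e^{i\omega t} + ci\omega t\,e^{i\omega t}\big) + \omega_0^2\,ct\,e^{i\omega t} \\
&= ci\omega\,e^{i\omega t} + ci\omega\,e^{i\omega t} - c\,\omega^2 t\,e^{i\omega t} + \omega_0^2\,ct\,e^{i\omega t} = 2ci\omega\,e^{i\omega t},
\end{align*}
which equals $a\,e^{i\omega t}$ if, and only if,
\[ c = a/(2i\omega). \tag{4.69} \]
Thus,
\[ \phi_0 : \mathbb{R} \longrightarrow \mathbb{C}, \quad \phi_0(t) := \frac{a}{2i\omega}\,t\,e^{i\omega t}, \tag{4.70a} \]
\[ \phi(t) := \operatorname{Re} \phi_0(t) = \frac{a}{2\omega}\,t \sin(\omega t), \tag{4.70b} \]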
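The resonant particular solution can be checked numerically. The sketch below assumes that (4.65) is the forced oscillator equation $x'' + \omega_0^2 x = a \cos(\omega t)$ with $\omega = \omega_0$ (this form is inferred from $P(t) = t^2 + \omega_0^2$ and the forcing $a\,e^{i\omega t}$ in (4.67)); the function names are hypothetical, and the second derivative is a central-difference approximation.

```python
import math


def resonant_solution(t, a, omega):
    """Candidate particular solution a/(2 omega) * t * sin(omega t)."""
    return a / (2 * omega) * t * math.sin(omega * t)


def oscillator_residual(t, a, omega, h=1e-4):
    """Approximate x'' + omega^2 x - a cos(omega t) for the candidate solution.

    Assumes the resonant case, i.e. forcing frequency equal to omega.
    """
    def x(s):
        return resonant_solution(s, a, omega)

    x_dd = (x(t + h) - 2 * x(t) + x(t - h)) / h**2
    return x_dd + omega**2 * x(t) - a * math.cos(omega * t)
```

The residual is zero up to discretization error, while the amplitude factor $t$ shows the unbounded growth that is characteristic of resonance.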
4.6.2
x 7 eAx ,
(4.72a)
Y (0) = Id,
(4.72b)
The previous definition of the matrix exponential function is further justified by the
following result:
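Theorem 4.39. For each $A \in \mathcal{M}(n,\mathbb{C})$, $n \in \mathbb{N}$, it holds that
\[ \forall_{x\in\mathbb{R}} \quad e^{Ax} = \sum_{k=0}^{\infty} \frac{(Ax)^k}{k!} \tag{4.73} \]
in the sense that the partial sums on the right-hand side converge pointwise to $e^{Ax}$ on $\mathbb{R}$, where the convergence is even uniform on every compact interval.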
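The partial sums in (4.73) are straightforward to compute. The following is a minimal sketch using plain nested lists (no external libraries); the names `mat_mul`, `exp_partial_sum`, and `rotation` are hypothetical. As an illustrative test matrix, it uses $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, for which $A^2 = -\mathrm{Id}$ and hence $e^{Ax} = \cos(x)\,\mathrm{Id} + \sin(x)\,A$.

```python
import math


def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def exp_partial_sum(A, x, m):
    """Partial sum sum_{k=0}^{m} (A x)^k / k! of the exponential series."""
    n = len(A)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in identity]
    term = [row[:] for row in identity]  # (A x)^0 / 0!
    for k in range(1, m + 1):
        # (A x)^k / k! = ((A x)^(k-1) / (k-1)!) * A * (x / k)
        term = [[entry * x / k for entry in row] for row in mat_mul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total


def rotation(x):
    """Closed form of exp(A x) for the test matrix A = [[0, 1], [-1, 0]]."""
    return [[math.cos(x), math.sin(x)], [-math.sin(x), math.cos(x)]]
```

For a direct summation like this, accuracy degrades when $\|A\|\,|x|$ is large; production code would typically rely on a dedicated routine instead.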
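Proof. By the equivalence of all norms on $\mathbb{C}^{n^2} \cong \mathcal{M}(n,\mathbb{C})$, we may choose a convenient norm on $\mathcal{M}(n,\mathbb{C})$. So we let $\|\cdot\|$ denote an arbitrary operator norm on $\mathcal{M}(n,\mathbb{C})$, induced by some norm $\|\cdot\|$ on $\mathbb{C}^n$. We first show that the partial sums $(A_m(x))_{m\in\mathbb{N}}$, $A_m(x) := \sum_{k=0}^m \frac{(Ax)^k}{k!}$, in (4.73) form a Cauchy sequence in $\mathcal{M}(n,\mathbb{C})$: For $M, N \in \mathbb{N}$, $N > M$, one estimates, for each $x \in \mathbb{R}$,
\[ \big\|A_N(x) - A_M(x)\big\| = \Big\| \sum_{k=M+1}^N \frac{(Ax)^k}{k!} \Big\| \overset{(G.10)}{\le} \sum_{k=M+1}^N \frac{\|A\|^k\,|x|^k}{k!}. \tag{4.74} \]
Since the convergence $\lim_{m\to\infty} \sum_{k=0}^m \frac{\|A\|^k |x|^k}{k!} = e^{\|A\|\,|x|}$ is pointwise for $x \in \mathbb{R}$ and uniform on every compact interval, (4.74) shows each $(A_m(x))_{m\in\mathbb{N}}$ is a Cauchy sequence that converges to some $\Phi(x) \in \mathcal{M}(n,\mathbb{C})$ (by the completeness of $\mathcal{M}(n,\mathbb{C})$) pointwise for $x \in \mathbb{R}$ and uniform on every compact interval. It remains to show $\Phi$ is the solution to (4.72b), i.e.
\[ \forall_{x\in\mathbb{R}} \quad \Phi(x) = \mathrm{Id} + \int_0^x A\,\Phi(t)\,dt. \tag{4.75} \]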
\[ A_m(x) = \mathrm{Id} + \sum_{k=1}^m \frac{(Ax)^k}{k!} = \mathrm{Id} + A \sum_{k=0}^{m-1} \frac{A^k x^{k+1}}{(k+1)!} = \mathrm{Id} + \int_0^x A \sum_{k=0}^{m-1} \frac{A^k t^k}{k!}\,dt, \]
A
(x)
=
A(t)
dt
m
0
k!
k=0
Z x
kAk
Am1 (t) (t)
dt 0 for m ,
(x) Am (x)
+
0
The matrix exponential function has some properties that are familiar from the case
n = 1 (see Prop. 4.40(a),(b)), but also some properties that are, perhaps, unexpected
(see Prop. 4.42(a),(b)).
Proposition 4.40. Let $A \in \mathcal{M}(n,\mathbb{C})$, $n \in \mathbb{N}$.
(a) $e^{A(t+s)} = e^{At}\,e^{As}$ holds for each $s, t \in \mathbb{R}$.
(b) $(e^{Ax})^{-1} = e^{A(-x)} = e^{-Ax}$ holds for each $x \in \mathbb{R}$.
t
Finally, since s (0) = eA0 eAs = Id eAs = eAs = s (0), the claimed s = s follows by
uniqueness of solutions.
(b) is an easy consequence of (a), since
(a)
(4.76)
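\[ f_B(C) := BC, \quad g_B(C) := CB. \]
If $\|\cdot\|$ denotes an operator norm, then $\|BC_1 - BC_2\| \le \|B\|\,\|C_1 - C_2\|$ and $\|C_1 B - C_2 B\| \le \|B\|\,\|C_1 - C_2\|$, showing $f_B$ and $g_B$ to be (even Lipschitz) continuous. Thus,
\begin{align*}
B\,e^{Ax} &= f_B(e^{Ax}) = f_B\Big( \lim_{m\to\infty} \sum_{k=0}^m \frac{(Ax)^k}{k!} \Big) = \lim_{m\to\infty} f_B\Big( \sum_{k=0}^m \frac{(Ax)^k}{k!} \Big) \\
&= \lim_{m\to\infty} B \sum_{k=0}^m \frac{(Ax)^k}{k!} \overset{AB=BA}{=} \lim_{m\to\infty} \sum_{k=0}^m \frac{(Ax)^k}{k!}\,B = \lim_{m\to\infty} g_B\Big( \sum_{k=0}^m \frac{(Ax)^k}{k!} \Big) \\
&= g_B\Big( \lim_{m\to\infty} \sum_{k=0}^m \frac{(Ax)^k}{k!} \Big) = g_B(e^{Ax}) = e^{Ax}\,B,
\end{align*}
thereby establishing the case.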
(b): Exercise (hint: use (a)).
Eigenvalues and Jordan Normal Form
We will see that the solution theory of linear ODE with constant coefficients is related
to the eigenvalues of A. We recall the definition of this notion:
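Definition 4.43. Let $n \in \mathbb{N}$ and $A \in \mathcal{M}(n,\mathbb{C})$. Then $\lambda \in \mathbb{C}$ is called an eigenvalue of $A$ if, and only if, there exists $0 \neq v \in \mathbb{C}^n$ such that
\[ Av = \lambda v. \tag{4.77} \]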
(x) := ex v,
(4.78)
j{1,...,n}
j : I Cn ,
j (x) := ej x vj ,
(4.79)
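(ii) There exists an invertible matrix $W \in \mathcal{M}(n,\mathbb{C})$ such that
\[ W^{-1} A W = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}, \tag{4.80} \]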
B1
0
...
B=
,
0
Br
(4.81)
(4.82)
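\[ B_j = (\lambda_j) \quad \text{or} \quad B_j = \begin{pmatrix} \lambda_j & 1 & 0 & \dots & 0 \\ & \lambda_j & 1 & & \vdots \\ & & \ddots & \ddots & 0 \\ & & & \lambda_j & 1 \\ 0 & & & & \lambda_j \end{pmatrix}, \tag{4.83} \]
where $\lambda_j$ is an eigenvalue of $A$.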
The reason Th. 4.46 regarding the Jordan normal form is useful for solving linear ODE
with constant coefficients is the following theorem:
Theorem 4.47. Let $n \in \mathbb{N}$ and $A, W \in \mathcal{M}(n,\mathbb{C})$, where $W$ is assumed invertible.
(a) The following statements (i) and (ii) are equivalent:
(i) $\phi : I \longrightarrow \mathbb{C}^n$ is a solution to $y' = Ay$.
(ii) $\psi := W^{-1}\phi : I \longrightarrow \mathbb{C}^n$ is a solution to $y' = W^{-1} A W\,y$.
(b) eW
1 AW x
W 1 = W 1 A
= W 1 AW
(4.84)
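matrix, i.e.
\[ N = 0 \text{ (zero matrix)} \quad \text{or} \quad N = \begin{pmatrix} 0 & 1 & 0 & \dots & 0 \\ & 0 & 1 & & \vdots \\ & & \ddots & \ddots & 0 \\ & & & 0 & 1 \\ 0 & & & & 0 \end{pmatrix}, \tag{4.85} \]
where the case $N = 0$ is already covered by Th. 4.44. The remaining case is covered by the following Th. 4.49.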
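Theorem 4.49. Let $\lambda \in \mathbb{C}$, $k \in \mathbb{N}$, $k \ge 2$, and assume $0 \neq N \in \mathcal{M}(k,\mathbb{C})$ is a canonical nilpotent matrix according to (4.85). Then
\[ \Phi : \mathbb{R} \longrightarrow \mathcal{M}(k,\mathbb{C}), \quad \Phi(x) := e^{\lambda x} \begin{pmatrix} 1 & x & \frac{x^2}{2} & \dots & \frac{x^{k-2}}{(k-2)!} & \frac{x^{k-1}}{(k-1)!} \\ 0 & 1 & x & \dots & \frac{x^{k-3}}{(k-3)!} & \frac{x^{k-2}}{(k-2)!} \\ \vdots & \ddots & \ddots & \ddots & \vdots & \vdots \\ 0 & \dots & \dots & 0 & 1 & x \\ 0 & \dots & \dots & \dots & 0 & 1 \end{pmatrix} \tag{4.86} \]
is a fundamental matrix solution to
\[ Y' = (\lambda\,\mathrm{Id} + N)\,Y, \quad Y(0) = \mathrm{Id}, \tag{4.87} \]
i.e.
\[ \forall_{x\in\mathbb{R}} \quad \Phi(x) = e^{(\lambda\,\mathrm{Id} + N)x}; \tag{4.88} \]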
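The closed form (4.86) can be compared against the exponential series for a small Jordan block. The sketch below assumes the entries $\Phi_{\alpha\beta}(x) = e^{\lambda x}\,x^{\beta-\alpha}/(\beta-\alpha)!$ for $\alpha \le \beta$ and $0$ otherwise, restricts to real $\lambda$ for simplicity (the theorem allows complex $\lambda$), and uses hypothetical helper names throughout.

```python
import math


def jordan_exp_closed_form(lam, k, x):
    """Matrix with entries exp(lam x) x^(b-a)/(b-a)! for a <= b, else 0."""
    return [[math.exp(lam * x) * x ** (b - a) / math.factorial(b - a)
             if b >= a else 0.0
             for b in range(k)] for a in range(k)]


def jordan_block(lam, k):
    """lam * Id + N, with N having ones on the first superdiagonal."""
    return [[lam if a == b else (1.0 if b == a + 1 else 0.0)
             for b in range(k)] for a in range(k)]


def exp_series(A, x, terms=60):
    """Partial sum of sum_m (A x)^m / m! with the given number of terms."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for m in range(1, terms + 1):
        term = [[sum(term[i][l] * A[l][j] for l in range(n)) * x / m
                 for j in range(n)] for i in range(n)]
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return total
```

Agreement of the two matrices for sample values of $\lambda$, $k$, $x$ is a plausibility check only; it does not replace the proof.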
: R C, (x) :=
,{1,...,k}
0
for > .
It remains to show that
,{1,...,k}
One computes,
,{1,...,k}
+ +1,
=
e
(x) = ex
x
()!
x
()!
+ ex
for < ,
for = ,
for > .
x(+1)
((+1))!
+0
(4.89)
for < ,
for = ,
for > ,
A M(2, R),
(4.90)
1 (x) := e1 x v1 ,
2 (x) := e2 x v2 ,
(4.91)
2 = i,
where R,
R \ {0}.
(4.92)
1
2
1
2
1
2i
1
2
1
,
1
2
2i
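(iii) The matrix $A$ has precisely one eigenvalue $\lambda \in \mathbb{R}$ and the corresponding eigenspace is 1-dimensional. Then there is an invertible matrix $W \in \mathcal{M}(2,\mathbb{R})$ such that $B := W^{-1}AW$ is in (nondiagonal) Jordan normal form, i.e.
\[ B = W^{-1} A W = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}. \]
According to Th. 4.49, the two functions
\[ \psi_1, \psi_2 : \mathbb{R} \longrightarrow \mathbb{K}^2, \quad \psi_1(x) := e^{\lambda x} \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \psi_2(x) := e^{\lambda x} \begin{pmatrix} x \\ 1 \end{pmatrix}, \tag{4.94} \]
form a fundamental system for $y' = By$ (over $\mathbb{K}$). Thus, according to Th. 4.47, the two functions
\[ \phi_1, \phi_2 : \mathbb{R} \longrightarrow \mathbb{K}^2, \quad \phi_1(x) := W \psi_1(x), \quad \phi_2(x) := W \psi_2(x), \tag{4.95} \]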
k=0
p1 (0) = 1,
pk (0) = 0
for
k = 2, . . . , n
(4.97a)
(4.97b)
for
k = 1, . . . , n 1.
(4.98a)
(4.98b)
Proof. Note that (4.98) can be extended to $k = n$, yielding
\[ M_n = \prod_{k=1}^n (A - \lambda_k\,\mathrm{Id}) = \chi_A(A) = 0, \]
since each matrix annihilates its characteristic polynomial according to the Cayley-Hamilton theorem (cf. [Koe03, Th. 8.4.6] or [Str08, Th. 26.6]). Also note
k=0,...,n1
(4.99)
We have to show that $x \mapsto \Phi(x) := \sum_{k=0}^{n-1} p_{k+1}(x)\,M_k$ solves the initial value problem $Y' = AY$, $Y(0) = \mathrm{Id}$. The initial condition is satisfied, as $\Phi(0) = p_1(0)\,M_0 = \mathrm{Id}$, and the ODE is satisfied, as, for each $x \in \mathbb{R}$,
(x) A(x)
n1
X
pk+1 (x) Mk
k=0
(4.97), (4.99)
1 p1 (x) M0 +
n1
X
k=1
n1
X
n1
X
k=0
k+1 pk+1 (x) + pk (x) Mk
k=0
pn (x) Mn = 0,
$p_{k+1}(x)\,M_k$
5 Stability
5.1 Qualitative Theory, Phase Portraits
In the qualitative theory of ODE, which can be seen as part of the field of dynamical systems, the idea is to understand the set of solutions to an ODE (or to a class of ODE), if possible, without making use of explicit solution formulas, which, in most situations, are not available anyway. Examples of qualitative questions are if, and under which conditions, solutions to an ODE are constant, periodic, are unbounded, approach some limit (more generally, the solutions' asymptotic behavior), etc. One often thinks of the solutions as depending on a time-like variable, and then qualitative theory typically means disregarding the speed of change, but rather focusing on the shape/geometry of the solution's image.
The topic of stability takes continuity in initial conditions further and investigates the behavior of solutions that are, at least initially, close to some given solution. Under which conditions do nearby solutions approach each other or diverge away from each other, show the same or different asymptotic behavior, etc.?
Even though the above-described considerations are not limited to this situation, a natural starting point is to consider first-order ODE where the right-hand side does not depend on $x$. In the following, we will mostly be concerned with this type of ODE, which has a special name:
Definition 5.1. If $\Omega \subseteq \mathbb{K}^n$, $n \in \mathbb{N}$, and $f : \Omega \longrightarrow \mathbb{K}^n$, then the $n$-dimensional first-order ODE
\[ y' = f(y) \tag{5.1} \]
is called autonomous and $\Omega$ is called the phase space.
Remark 5.2. In fact, nonautonomous ODE are not really more general than autonomous ODE, due to the, perhaps, surprising Th. J.1 of the Appendix, which states
that every nonautonomous ODE is equivalent to an autonomous ODE. However, this fact
is of little practical relevance, since the autonomous ODE arising via Th. J.1 from nonautonomous ODE can never have bounded solutions on unbounded intervals, whereas the
theory of autonomous ODE is most powerful and useful for ODE that admit bounded
solutions on unbounded intervals (such as constant or periodic solutions, or solutions
approaching constant or periodic functions).
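Lemma 5.3. If, in the context of Def. 5.1, $\phi : I \to \mathbb{K}^n$ is a solution to (5.1), defined
on the interval $I \subseteq \mathbb{R}$, then, for each $\gamma \in \mathbb{R}$,
\[ \phi_\gamma : I - \gamma \to \mathbb{K}^n, \quad \phi_\gamma(x) := \phi(x + \gamma), \quad \text{where } I - \gamma := \{x - \gamma \in \mathbb{R} : x \in I\}, \tag{5.2} \]
is another solution to (5.1). In consequence, if $\phi$ is a maximal solution, then so is $\phi_\gamma$.
Proof. Clearly, $I - \gamma$ is an interval. Note $x \in I - \gamma \Rightarrow x + \gamma \in I$ and, since $\phi$ is a
solution to (5.1), it is $\phi(I) \subseteq \Omega$, implying $\phi_\gamma(I - \gamma) \subseteq \Omega$. Finally,
\[ \forall_{x \in I - \gamma} \quad \phi_\gamma'(x) = \phi'(x + \gamma) = f(\phi(x + \gamma)) = f(\phi_\gamma(x)), \]
completing the proof that $\phi_\gamma$ is a solution. Since each extension of $\phi$ yields an extension
of $\phi_\gamma$ and vice versa, $\phi_\gamma$ is a maximal solution if, and only if, $\phi$ is a maximal solution.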
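Lemma 5.4. If $\Omega \subseteq \mathbb{K}^n$, $n \in \mathbb{N}$, and $f : \Omega \to \mathbb{K}^n$ is such that (5.1) admits unique
maximal solutions ($f$ being locally Lipschitz on $\Omega$ open is sufficient, but not necessary,
cf. Def. 3.32), then the global solution $Y : D_f \to \mathbb{K}^n$ of (5.1) satisfies
(a) $Y(x, \xi, \eta) = Y(x - \xi, 0, \eta)$ for each $(x, \xi, \eta) \in D_f$.
(b) $Y\big(x, 0, Y(\tilde{x}, 0, \eta)\big) = Y(x + \tilde{x}, 0, \eta)$ for each $(x, \tilde{x}, \eta) \in \mathbb{R} \times \mathbb{R} \times \mathbb{K}^n$ such that $(\tilde{x}, 0, \eta) \in D_f$ and $\big(x, 0, Y(\tilde{x}, 0, \eta)\big) \in D_f$.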
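Proof. (a): If $\phi : I_{0,\eta} \to \mathbb{K}^n$ and $\psi : I_{\xi,\eta} \to \mathbb{K}^n$ denote the maximal
solutions to the initial data $y(0) = \eta$ and $y(\xi) = \eta$, respectively, then (a) claims $\psi = \phi_{-\xi}$.
As a consequence of Lem. 5.3, $\phi_{-\xi} : I_{0,\eta} + \xi \to \mathbb{K}^n$ is some
maximal solution to (5.1) and, since $\phi_{-\xi}(\xi) = \phi(0) = \eta = \psi(\xi)$, the assumed uniqueness
yields the claimed $\psi = \phi_{-\xi}$, in particular, $I_{\xi,\eta} = I_{0,\eta} + \xi$.
(b): Let $\tilde\eta := Y(\tilde{x}, 0, \eta)$. If $\phi : I_{0,\eta} \to \mathbb{K}^n$ and $\psi : I_{0,\tilde\eta} \to \mathbb{K}^n$ denote the maximal
solutions to the initial data $y(0) = \eta$ and $y(0) = \tilde\eta$, respectively, then (b) claims $\psi = \phi_{\tilde{x}}$.
As a consequence of Lem. 5.3, $\phi_{\tilde{x}} : I_{0,\eta} - \tilde{x} \to \mathbb{K}^n$ is some maximal solution to (5.1)
and, since $\phi_{\tilde{x}}(0) = \phi(\tilde{x}) = \tilde\eta = \psi(0)$, the assumed uniqueness yields the claimed $\psi = \phi_{\tilde{x}}$,
in particular, $I_{0,\tilde\eta} = I_{0,\eta} - \tilde{x}$.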
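The two identities of Lem. 5.4 can be tried out on a scalar example. The following is a minimal sketch, assuming the ODE $y' = y^2$ and its solution formula $Y(x, 0, \eta) = \eta/(1 - \eta x)$ (which one can check by differentiation); the function name `Y` and the sample values are chosen for illustration only.

```python
def Y(x, eta):
    """Solution of y' = y**2 with y(0) = eta, evaluated at x.

    The maximal interval of existence is exactly the set where 1 - eta*x > 0.
    """
    denom = 1.0 - eta * x
    if denom <= 0.0:
        raise ValueError("x lies outside the maximal interval of existence")
    return eta / denom


eta, x_tilde, x = 0.5, 0.5, 0.7

# Lem. 5.4(b): restarting at the state reached at x_tilde and moving on by x
# agrees with following the original solution up to x + x_tilde.
lhs = Y(x, Y(x_tilde, eta))
rhs = Y(x + x_tilde, eta)
print(lhs, rhs)
```

Both numbers agree up to rounding. Choosing `x + x_tilde` at or beyond `1/eta` raises the error, reflecting that the identity is only claimed on the domain $D_f$.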
Definition 5.5. Let $I \subseteq \mathbb{R}$ be an interval and $\phi : I \to S$ (in principle, $S$ can be
arbitrary).
(a) The image of $I$ under $\phi$, i.e.
\[ O(\phi) := \phi(I) = \{\phi(x) : x \in I\} \subseteq S \tag{5.3} \]
is often referred to as the orbit of $\phi$ in the present context of qualitative ODE theory.
(b) $\phi : \mathbb{R} \to S$ (note $I = \mathbb{R}$) is called periodic if, and only if, there exists a smallest
$\omega > 0$ (called the period of $\phi$) such that
\[ \forall_{x \in \mathbb{R}} \quad \phi(x + \omega) = \phi(x). \tag{5.4} \]
The requirement of a smallest $\omega > 0$ means constant functions are not periodic in the sense of
this definition.
Lemma 5.6. Let $\phi : \mathbb{R} \to \mathbb{K}^n$, $n \in \mathbb{N}$.
(a) If $\phi$ is continuous and (5.4) holds for some $\omega > 0$, then $\phi$ is either constant or
periodic in the sense of Def. 5.5(b).
(b) (a) is false without the assumption of $\phi$ being continuous.
Proof. Exercise.
(5.5)
Proof. If $f(\eta) = 0$ and $\phi \equiv \eta$, then $\phi'(x) = 0 = f(\phi(x))$ for each $x \in \mathbb{R}$, i.e. (i) implies
(ii). Conversely, if $\phi \equiv \eta$ is a solution to (5.1), then $f(\eta) = f(\phi(x)) = \phi'(x) = 0$, i.e. (ii)
implies (i).
Proposition 5.9. If $\Omega \subseteq \mathbb{K}^n$, $n \in \mathbb{N}$, and $f : \Omega \to \mathbb{K}^n$ is such that (5.1) admits
unique maximal solutions ($f$ being locally Lipschitz on $\Omega$ open is sufficient), then, for
maximal solutions $\phi_1 : I_1 \to \mathbb{K}^n$, $\phi_2 : I_2 \to \mathbb{K}^n$ to (5.1), defined on open intervals
$I_1$, $I_2$, respectively, precisely one of the following two statements (i) and (ii) is true:
(i) $O(\phi_1) \cap O(\phi_2) = \emptyset$, i.e. the solutions have disjoint orbits.
(ii) There exists $\gamma \in \mathbb{R}$ such that
\[ I_2 = I_1 - \gamma \quad \text{and} \quad \forall_{x \in I_2} \quad \phi_2(x) = \phi_1(x + \gamma). \tag{5.6} \]
In particular, it follows in this case that $O(\phi_1) = O(\phi_2)$, i.e. the solutions have
the same orbit.
Proof. Suppose (i) does not hold. Then there are $x_1 \in I_1$ and $x_2 \in I_2$ such that
$\phi_1(x_1) = \phi_2(x_2)$. Define $\gamma := x_1 - x_2$ and consider
\[ \phi : I_1 - \gamma \to \mathbb{K}^n, \quad \phi(x) := \phi_1(x + \gamma). \tag{5.7} \]
Then $\phi$ is a maximal solution of (5.1) by Lem. 5.3 and $\phi(x_2) = \phi_1(x_1) = \phi_2(x_2)$. By
uniqueness of maximal solutions, we obtain $\phi = \phi_2$, in particular, $I_2 = I_1 - \gamma$, proving
(5.6). Clearly, (5.6) implies $O(\phi_1) = O(\phi_2)$.
Proof. The corollary merely summarizes Prop. 5.9 and Prop. 5.10.
Definition 5.12. In the situation of Cor. 5.11, a phase portrait for (5.1) is a sketch
showing representative orbits. Thus, the sketch shows subsets of the phase space $\Omega$,
including fixed points (if any) and representative periodic solutions (if any). Usually,
one also uses arrows to indicate the direction in which each drawn orbit is traced as the
variable x increases.
Example 5.13. Even though it is a main goal of qualitative theory to obtain phase
portraits without the need of explicit solution formulas, and we will study techniques
for accomplishing this below, we will make use of explicit solution formulas for our first
two examples of phase portraits.
(a) Consider the autonomous linear ODE
\[ \begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} -y_2 \\ y_1 \end{pmatrix}. \tag{5.8} \]
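\[ \begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} y_2 \\ y_1 \end{pmatrix} \tag{5.9} \]
is more complicated: While $(0, 0)$ is still the only fixed point, for each $r > 0$, all the
following functions $\phi_1, \phi_2, \phi_3, \phi_4 : \mathbb{R} \to \mathbb{R}^2$ are solutions:
\[ \phi_1(x) := (r \cosh x, \, r \sinh x), \tag{5.10a} \]
\[ \phi_2(x) := (-r \cosh x, \, -r \sinh x), \tag{5.10b} \]
\[ \phi_3(x) := (r \sinh x, \, r \cosh x), \tag{5.10c} \]
\[ \phi_4(x) := (-r \sinh x, \, -r \cosh x), \tag{5.10d} \]
each type describing a hyperbolic orbit in some section of the plane $\mathbb{R}^2$. These
sections are separated by rays, forming the orbits of the solutions $\phi_5, \phi_6, \phi_7, \phi_8 : \mathbb{R} \to \mathbb{R}^2$:
\[ \phi_5(x) := (e^x, \, e^x), \tag{5.10e} \]
\[ \phi_6(x) := (-e^x, \, -e^x), \tag{5.10f} \]
\[ \phi_7(x) := (e^{-x}, \, -e^{-x}), \tag{5.10g} \]
\[ \phi_8(x) := (-e^{-x}, \, e^{-x}). \tag{5.10h} \]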
The two rays on $\{(y_1, y_1) : y_1 \neq 0\}$ move away from the origin, whereas the two rays
on $\{(y_1, -y_1) : y_1 \neq 0\}$ move toward the origin. The hyperbolic orbits asymptotically approach the ray orbits and are traversed such that the flow direction agrees
between approaching orbits.
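The solution formulas of part (b) can be checked numerically. The sketch below assumes the ODE reads $(y_1', y_2') = (y_2, y_1)$ and uses one choice of signs for the hyperbolic and ray solutions (the signs are an assumption made for this illustration); it compares a central-difference derivative of each candidate with the right-hand side.

```python
import math


def f(y):
    """Right-hand side of the ODE, assumed to be (y1, y2)' = (y2, y1)."""
    return (y[1], y[0])


def max_residual(phi, xs, h=1e-5):
    """Largest gap between a central-difference derivative of phi and f(phi)."""
    worst = 0.0
    for x in xs:
        deriv = [(a - b) / (2 * h) for a, b in zip(phi(x + h), phi(x - h))]
        rhs = f(phi(x))
        worst = max(worst, max(abs(d - v) for d, v in zip(deriv, rhs)))
    return worst


r = 2.0  # arbitrary positive parameter selecting one hyperbola
candidates = {
    "right branch": lambda x: (r * math.cosh(x), r * math.sinh(x)),
    "left branch": lambda x: (-r * math.cosh(x), -r * math.sinh(x)),
    "upper branch": lambda x: (r * math.sinh(x), r * math.cosh(x)),
    "lower branch": lambda x: (-r * math.sinh(x), -r * math.cosh(x)),
    "outgoing ray": lambda x: (math.exp(x), math.exp(x)),
    "incoming ray": lambda x: (math.exp(-x), -math.exp(-x)),
}

xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
residuals = {name: max_residual(phi, xs) for name, phi in candidates.items()}
for name, res in residuals.items():
    print(f"{name}: residual {res:.2e}")
```

All residuals are at the level of the finite-difference error, which is consistent with each candidate solving the ODE; it is a plausibility check, not a proof.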
The next results will be useful to obtain new phase portraits from previously known
phase portraits in certain situations.
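Proposition 5.14. Let $\Omega \subseteq \mathbb{K}^n$, $n \in \mathbb{N}$, let $I \subseteq \mathbb{R}$ be some nontrivial interval, let
$f : \Omega \to \mathbb{K}^n$, and let $\phi : I \to \mathbb{K}^n$ be a solution to
\[ y' = \gamma(x) \, f(y), \tag{5.11} \]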
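\[ \lambda'(x) = \frac{1}{\gamma(\lambda(x))}, \]
\[ (\phi \circ \lambda)'(x) = \phi'(\lambda(x)) \, \lambda'(x) = \gamma(\lambda(x)) \, f\big(\phi(\lambda(x))\big) \, \frac{1}{\gamma(\lambda(x))} = f\big((\phi \circ \lambda)(x)\big), \]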
(5.12a)
(5.12b)
have precisely the same orbits, i.e. every orbit of a solution to (5.12a) is an orbit
of a solution to (5.12b) and vice versa.
(b) If $f$ and $h$ are such that the ODE (5.12) admit unique maximal solutions, then the
ODE (5.12) have precisely the same orbits (even if $F \neq \emptyset$).
Proof. (a): If $\phi : I \to \mathbb{K}^n$ is a solution to (5.12b), then $\gamma := h \circ \phi$ is well-defined
and continuous. Since $F = \emptyset$ implies $\phi' \neq 0$, we can apply Prop. 5.14 to obtain
the existence of a bijective $\lambda_1 : J_1 \to I$ such that $\phi \circ \lambda_1$ is a solution to (5.12a).
Thus, $O(\phi) = O(\phi \circ \lambda_1)$. Conversely, if $\phi : I \to \mathbb{K}^n$ is a solution to (5.12a), i.e. to
$y' = \frac{h(y)}{h(y)} f(y)$, then $\gamma := 1/(h \circ \phi)$ is well-defined and continuous. Since $F = \emptyset$ implies
Remark 5.16. We apply Prop. 5.15 to phase portraits (in particular, assume unique
maximal solutions). Prop. 5.15 says that overall multiplication with a continuous positive function $h$ does not change the phase portrait at all. Moreover, Prop. 5.15 also
states that overall multiplication with a continuous negative function $h$ does not change
the partition of $\Omega$ into solution orbits. However, after multiplication with a negative $h$,
the orbits are clearly traversed in the opposite direction, i.e., for negative $h$, the arrows
in the phase portrait have to be reversed. For a general continuous $h$, this implies the
phase portrait remains the same in each region of $\Omega$ where $h > 0$; it remains the same,
except for the arrows reversed, in each region of $\Omega$ where $h < 0$; and the zeros of $h$
contribute additional fixed points, cutting some of the previous orbits. We summarize how to
obtain the phase portrait of (5.12b) from that of (5.12a):
(1) Start with the phase portrait of (5.12a).
(2) Add the zeros of h as additional fixed points (if any). Previous orbits are cut, where
fixed points are added.
(3) Reverse the arrows where h < 0.
Example 5.17. (a) Consider the ODE
\[ y_1' = -y_2 \big((y_1 - 1)^2 + y_2^2\big), \qquad y_2' = y_1 \big((y_1 - 1)^2 + y_2^2\big), \tag{5.13} \]
which comes from multiplying the right-hand side of (5.8) by $h(y) = (y_1 - 1)^2 + y_2^2$.
The phase portrait is the same as the one for (5.8), except for the added fixed point
at $(1, 0)$.
(b) Consider the ODE
\[ y_1' = -y_1 y_2 + y_2^2, \qquad y_2' = -y_1 y_2 + y_1^2, \tag{5.14} \]
which comes from multiplying the right-hand side of (5.8) by $h(y) = y_1 - y_2$. The
phase portrait is obtained from that of (5.8), where additional fixed points are on
the line with $y_1 = y_2$. This line cuts each previously circular orbit into two segments.
The arrows have to be reversed for $y_2 > y_1$, that means above the $y_1 = y_2$ line.
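The three-step recipe of Rem. 5.16 can be mirrored in a few lines of code. The sketch below assumes the rotation field $f(y) = (-y_2, y_1)$ and the factor $h(y) = y_1 - y_2$ of part (b); the helper names `hf` and `classify` are hypothetical and only serve the illustration.

```python
def f(y):
    """Original right-hand side, assumed to be (y1, y2)' = (-y2, y1)."""
    return (-y[1], y[0])


def h(y):
    """Scalar factor, assumed to be h(y) = y1 - y2."""
    return y[0] - y[1]


def hf(y):
    """Rescaled right-hand side h(y) * f(y)."""
    s = h(y)
    return tuple(s * c for c in f(y))


def classify(y):
    """Describe how the portrait of y' = h(y) f(y) differs from y' = f(y) at y."""
    s = h(y)
    if s == 0:
        return "new fixed point"        # step (2): zeros of h become fixed points
    if s > 0:
        return "same direction"         # step (1): portrait unchanged where h > 0
    return "reversed direction"         # step (3): arrows flip where h < 0


for y in [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]:
    print(y, hf(y), classify(y))
```

The point $(0.5, 0.5)$ lies on the line $y_1 = y_2$ and becomes a fixed point, while $(0, 1)$ lies above that line, where the circular orbits are traversed in the opposite direction.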
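Definition 5.18. Let $\Omega \subseteq \mathbb{R}^n$, $n \in \mathbb{N}$, and $f : \Omega \to \mathbb{R}^n$. A function $E : \Omega \to \mathbb{R}$ is
called an integral for the autonomous ODE (5.1), i.e. for $y' = f(y)$, if, and only if, $E \circ \phi$
is constant for every solution $\phi$ of (5.1).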
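Lemma 5.19. Let $\Omega \subseteq \mathbb{R}^n$ be open, $n \in \mathbb{N}$, and $f : \Omega \to \mathbb{R}^n$ such that each initial
value problem for (5.1) has at least one solution ($f$ continuous is sufficient by Th. 3.8).
Then a differentiable function $E : \Omega \to \mathbb{R}$ is an integral for (5.1) if, and only if,
\[ \forall_{y \in \Omega} \quad (\nabla E)(y) \bullet f(y) = \sum_{j=1}^{n} \partial_j E(y) \, f_j(y) = 0. \tag{5.15} \]
Proof. If $\phi : I \to \mathbb{R}^n$ is a solution to (5.1), then the chain rule yields
\[ \forall_{x \in I} \quad (E \circ \phi)'(x) = (\nabla E)(\phi(x)) \bullet \phi'(x) = (\nabla E)(\phi(x)) \bullet f(\phi(x)). \tag{5.16} \]
The differentiable function $E \circ \phi : I \to \mathbb{R}$ is constant on the interval $I$ if, and only if,
$(E \circ \phi)' \equiv 0$. Thus, by (5.16), $E \circ \phi$ being constant for every solution $\phi$ is equivalent to
$(\nabla E)(y) \bullet f(y) = 0$ for each $y \in \Omega$ such that at least one solution passes through $y$.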
Example J.2 of the Appendix, pointed out by Anton Sporrer, shows the hypothesis of
Lem. 5.19, that each initial value problem for (5.1) has at least one solution, cannot be
omitted. The following Prop. 5.20 makes use of integrals and applies to phase portraits
of 2-dimensional real ODE:
Proposition 5.20. Let $\Omega \subseteq \mathbb{R}^2$ be open, and let $f : \Omega \to \mathbb{R}^2$ be continuous and
such that (5.1) admits unique maximal solutions ($f$ being locally Lipschitz is sufficient).
Assume $E : \Omega \to \mathbb{R}$ to be a continuously differentiable integral for (5.1), i.e. for
$y' = f(y)$, satisfying $\nabla E(y) \neq 0$ for each $y \in \Omega$. Then the following statements hold
for each maximal solution $\phi : I \to \mathbb{R}^2$ of (5.1) ($I \subseteq \mathbb{R}$ some open interval):
(a) If $(x_m)_{m \in \mathbb{N}}$ is a sequence in $I$ such that $\lim_{m \to \infty} \phi(x_m) = \eta \in \Omega$, then $\eta \in F$ (i.e.
$\eta$ is a fixed point) or $\eta \in O(\phi)$ (i.e. there exists $\xi \in I$ with $\phi(\xi) = \eta$).
(a): The continuity of E yields E() = limm E((xm )) = C. Moreover, by hypothesis, (1 , 2 ) := E() 6= (0, 0). We proceed with the proof for 2 6= 0 if 2 = 0
and 1 6= 0, then the roles of the indices 1, 2 have to be switched in the following. We
apply the implicit function theorem [Phi13b, Th. C.9] to the function f : R,
f(y) := E(y) C at its zero = (1 , 2 ). By [Phi13b, Th. C.9], there exist , > 0 and
a continuously differentiable map g : Ig R, Ig :=]1 , 1 + [, such that g(1 ) = 2 ,
sIg
E s, g(s) = C,
ky k < E(y) = C
sIg
(5.17a)
y = s, g(s)
(5.17b)
We now assume
/ F and show O(). If
/ F, then f () 6= 0 and the continuity
1 + [,
f (s, g(s)) 6= 0. Define the auxiliary function : I , (s) = (s, g(s)). Since
E C, we can employ the chain rule to conclude
sI
(5.18)
i.e. the two-dimensional vectors ( E)((s)) and (s) are orthogonal with respect to
the Euclidean scalar product. As E is an integral, using (5.15), f ((s)) is another vector
orthogonal to ( E)((s)) and, since all vectors in R2 orthogonal to ( E)((s)) form
a 1-dimensional subspace of R2 (recalling ( E)((s)) 6= 0), there exists (s) R such
that
(s) = (s)f ((s))
(5.19)
We can now apply Prop. 5.14, since (5.19) says is a
(note f ((s)) 6= 0 as s I).
solution to (5.11), the function : I R, s 7 (s) = (s)/f ((s)) is continuous,
Thus, Prop. 5.14 provides a bijective
and (s) = (1, g (s)) 6= (0, 0) for each s I.
such that is a solution to y = f (y).
: J I,
As we assume limm (xm ) = , there exists M N such that k(xm )k < for each
m M . Since E((xm )) = C also holds, (5.17b) implies the existence of a sequence
(sm )mN in I such that (xm ) = (sm , g(sm )) for each m M . Then, for each m M
and m := 1 (sm ), ( )(m ) = (sm ) = (xm ). On the other hand, for 0 := 1 (1 ),
( )(0 ) = (1 ) = , showing (xm ), O( ). Since (xm ) O() as well,
Prop. 5.9 implies O( ) O(), i.e. O(), which proves (a). In preparation for
(b), we also observe that k(xm ) k < for each m M implies the sm for m M
all are in some compact interval I1 with 1 I1 , implying the m to be in the compact
interval J1 := 1 [I1 ] with 0 J1 . We will use for (b) that J1 is bounded.
(b): As we have O() E 1 {C} according to the choice of C, the assumed compactness
of E 1 {C} and Prop. 3.24 show can only be maximal if it is defined on all of R (since
(x, (x)) must escape every compact [m, m] E 1 {C}, m N, on the left and on the
right). Using the compactness of E 1 {C} a second time, we obtain the existence of a
sequence (xm )mN in R such that limm xm = and limm (xm ) = E 1 {C}.
So we see that we are in the situation of (a). Let be the maximal extension of the
solution constructed in the proof of (a). Then we know O() O() 6= from the
proof of (a) and, since and both are maximal, Prop. 5.9 implies O() = O() and,
more importantly for us here, there exists R such that (x) = (x+) for each x R.
Let m M with M from the proof of (a). If 6= 0, then (xm ) = (m ) = (xm + )
shows is not injective. If = 0, then = and (xm ) = (m ). Since the m are
bounded, whereas the xm are unbounded, xm = m cannot be true for all m, again
showing is not injective. Since E 1 {C} F = , cannot be constant, therefore it
must be periodic by Prop. 5.10.
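Example 5.21. Using condition (5.15), i.e. $\nabla E \bullet f \equiv 0$, one readily verifies that the
functions
\[ E : \mathbb{R}^2 \to \mathbb{R}, \quad E(y_1, y_2) := y_1^2 + y_2^2, \tag{5.20a} \]
\[ E : \mathbb{R}^2 \to \mathbb{R}, \quad E(y_1, y_2) := y_1^2 - y_2^2, \tag{5.20b} \]
are integrals for
\[ \begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} -y_2 \\ y_1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} y_2 \\ y_1 \end{pmatrix}, \]
respectively, and we recover the respective phase portraits via the respective level curves
$E(y_1, y_2) = C$, $C \in \mathbb{R}$.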
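The verification via (5.15) is a one-line computation in each case. The following worked check assumes the two integrals are $y_1^2 + y_2^2$ for the rotation field $(-y_2, y_1)$ and $y_1^2 - y_2^2$ for the field $(y_2, y_1)$:

```latex
\begin{align*}
E(y_1,y_2) &= y_1^2 + y_2^2, & f(y) &= (-y_2,\, y_1): &
  \nabla E(y) \bullet f(y) &= 2y_1\,(-y_2) + 2y_2\,y_1 = 0, \\
E(y_1,y_2) &= y_1^2 - y_2^2, & f(y) &= (y_2,\, y_1): &
  \nabla E(y) \bullet f(y) &= 2y_1\,y_2 + (-2y_2)\,y_1 = 0.
\end{align*}
```

The level curves are circles around the origin in the first case and hyperbolas (together with the two diagonals for $C = 0$) in the second, matching the orbits described for the two linear examples.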
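Example 5.22. Consider the autonomous ODE
\[ \begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} 2 y_1 y_2 \\ 1 - 2 y_1^2 \end{pmatrix}. \tag{5.21} \]
We claim that
\[ E : \mathbb{R}^2 \to \mathbb{R}, \quad E(y_1, y_2) := y_1 \, e^{-(y_1^2 + y_2^2)}, \tag{5.22} \]
is an integral for (5.21) and intend to use Prop. 5.20 to establish (5.21) has orbits that
are fixed points, orbits that are periodic, and orbits that are neither. To verify $E$ is an
integral, one computes, for each $(y_1, y_2) \in \mathbb{R}^2$,
\[ \nabla E(y_1, y_2) \bullet (2 y_1 y_2, \, 1 - 2 y_1^2) = \Big( e^{-(y_1^2 + y_2^2)} - 2 y_1^2 e^{-(y_1^2 + y_2^2)}, \; -2 y_1 y_2 e^{-(y_1^2 + y_2^2)} \Big) \bullet (2 y_1 y_2, \, 1 - 2 y_1^2) = 0. \]
The set of fixed points is
\[ F = \Big\{ \Big( \tfrac{1}{\sqrt{2}}, 0 \Big), \Big( -\tfrac{1}{\sqrt{2}}, 0 \Big) \Big\}. \]
The level set of 0 is $E^{-1}\{0\} = \{(0, y_2) : y_2 \in \mathbb{R}\}$, i.e. it is the $y_2$-axis. This is a
nonperiodic orbit (actually, the orbit of solutions of the form $\phi : \mathbb{R} \to \mathbb{R}^2$, $\phi(x) :=
(0, x + c)$, $c \in \mathbb{R}$). Now consider the level set
\[ E^{-1}\{e^{-1}\} = \big\{ (y_1, y_2) : y_1 > 0, \; y_2^2 = \ln y_1 - y_1^2 + 1 \big\}. \]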
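The claim that $E$ is an integral can also be probed numerically. The sketch below assumes the right-hand side $(2 y_1 y_2, 1 - 2 y_1^2)$ and $E(y_1, y_2) = y_1 e^{-(y_1^2 + y_2^2)}$; it evaluates $\nabla E \bullet f$ at a few points and monitors $E$ along a trajectory computed with the classical Runge-Kutta scheme (a standard integrator chosen here for convenience, not part of the text).

```python
import math


def f(y):
    """Right-hand side, assumed to be (2*y1*y2, 1 - 2*y1**2)."""
    return (2.0 * y[0] * y[1], 1.0 - 2.0 * y[0] ** 2)


def E(y):
    """Candidate integral, assumed to be y1 * exp(-(y1**2 + y2**2))."""
    return y[0] * math.exp(-(y[0] ** 2 + y[1] ** 2))


def grad_E(y):
    """Analytic gradient of E."""
    e = math.exp(-(y[0] ** 2 + y[1] ** 2))
    return (e - 2.0 * y[0] ** 2 * e, -2.0 * y[0] * y[1] * e)


def rk4_step(y, h):
    """One classical Runge-Kutta step for y' = f(y)."""
    k1 = f(y)
    k2 = f(tuple(a + 0.5 * h * b for a, b in zip(y, k1)))
    k3 = f(tuple(a + 0.5 * h * b for a, b in zip(y, k2)))
    k4 = f(tuple(a + h * b for a, b in zip(y, k3)))
    return tuple(a + h * (p + 2 * q + 2 * r + s) / 6.0
                 for a, p, q, r, s in zip(y, k1, k2, k3, k4))


# Condition (5.15) at a few sample points.
points = [(0.3, -0.7), (1.0, 0.0), (-0.4, 1.2)]
dots = [sum(g * v for g, v in zip(grad_E(p), f(p))) for p in points]

# Drift of E along a numerical solution starting at (1, 0), where E = exp(-1).
y = (1.0, 0.0)
E0 = E(y)
drift = 0.0
for _ in range(2000):
    y = rk4_step(y, 0.005)
    drift = max(drift, abs(E(y) - E0))

print(dots, drift)
```

The scalar products vanish up to rounding and $E$ stays numerically constant along the trajectory, which starts on the level set $E^{-1}\{e^{-1}\}$.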
5.2
Given an autonomous ODE with a fixed point p, we will investigate the question under
what conditions a solution (x) starting out near p will remain near p as x increases or
decreases.
To simplify notation, we will restrict ourselves to initial data y(0) = y0 , which, in light
of Lem. 5.4(b), is not an essential restriction.
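Notation 5.23. Let $\Omega \subseteq \mathbb{K}^n$, $n \in \mathbb{N}$, and $f : \Omega \to \mathbb{K}^n$ such that
\[ y' = f(y) \tag{5.23} \]
admits unique maximal solutions ($f$ being locally Lipschitz on $\Omega$ open is sufficient). Let
$Y : D_f \to \mathbb{K}^n$ denote the general solution to (5.23) and define
\[ Y : D_{f,0} \to \mathbb{K}^n, \quad Y(x, \eta) := Y(x, 0, \eta), \quad D_{f,0} := \{(x, \eta) \in \mathbb{R} \times \mathbb{K}^n : (x, 0, \eta) \in D_f\}. \tag{5.24} \]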
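\[ \forall_{x \geq 0} \ (\text{resp. } \forall_{x \leq 0}) \quad \|Y(x, \eta) - p\| < \epsilon. \tag{5.25} \]
The fixed point $p$ is said to be positively (resp. negatively) asymptotically stable if, and
only if, (i) and (ii) hold plus the additional condition
(iii) There exists $\gamma > 0$ such that, for each $\eta \in \Omega$,
\[ \|\eta - p\| < \gamma \quad \Rightarrow \quad \lim_{x \to \infty} Y(x, \eta) = p \ \ (\text{resp. } \lim_{x \to -\infty} Y(x, \eta) = p). \tag{5.26} \]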
The norm $\|\cdot\|$ on $\mathbb{K}^n$ used in (i) – (iii) above is arbitrary. Due to the equivalence of
norms on $\mathbb{K}^n$, changing the norm does not change the defined stability properties, even
though, in general, it does change the sizes of $r$, $\delta$, $\gamma$.
Remark 5.25. In the situation of Def. 5.24, consider the time-reversed version of (5.23),
i.e.
\[ y' = -f(y). \tag{5.27} \]
According to Lem. 1.9(b), (5.27) has the general solution
\[ \tilde{Y}(x, \eta) = Y(-x, \eta). \tag{5.28} \]
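\[ \forall_{x \in I_{(0,\eta)} \cap \mathbb{R}_0^+} \ (\text{resp. } \forall_{x \in I_{(0,\eta)} \cap \mathbb{R}_0^-}) \quad \|Y(x, \eta) - p\| < \epsilon, \tag{5.29} \]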
I
),
proving
Def.
5.24(i).
(0,)
0
It is an exercise to show Lem. 5.26 becomes false if the hypothesis that f be continuous
is omitted.
Example 5.27. (a) Consider the 1-dimensional $\mathbb{R}$-valued ODE
\[ y' = y(y - 1). \tag{5.30} \]
The set of fixed points is $F = \{0, 1\}$. Moreover, $Y'(\cdot, \eta) < 0$ for $0 < \eta < 1$ and
$Y'(\cdot, \eta) > 0$ for $\eta \in \, ]-\infty, 0[ \, \cup \, ]1, \infty[$. It follows that, for $p = 0$, the positive stability
part of (5.29) holds (where, given $\epsilon > 0$, one can choose $\delta := \min\{1, \epsilon\}$). Moreover,
for $\eta < 0$ and $0 < \eta < 1$, one has $\lim_{x \to \infty} Y(x, \eta) = 0$. Thus, all three conditions of
Def. 5.24 are satisfied and 0 is positively asymptotically stable. Analogously, one
sees that 1 is negatively asymptotically stable.
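The convergence toward the fixed point 0 can be observed numerically. The sketch below assumes the ODE $y' = y(y - 1)$ and integrates it with the classical Runge-Kutta scheme (step size and final time are arbitrary choices for this illustration) from initial values on both sides of 0.

```python
def f(y):
    """Right-hand side, assumed to be y' = y * (y - 1)."""
    return y * (y - 1.0)


def integrate(eta, h=0.01, steps=2000):
    """Approximate Y(steps * h, eta) with the classical Runge-Kutta scheme."""
    y = eta
    for _ in range(steps):
        k1 = f(y)
        k2 = f(y + 0.5 * h * k1)
        k3 = f(y + 0.5 * h * k2)
        k4 = f(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y


# Initial values below 1: all of them should be attracted to the fixed point 0.
end_values = {eta: integrate(eta) for eta in (-0.5, 0.5, 0.9)}
for eta, value in end_values.items():
    print(f"eta = {eta:+.1f}: Y(20, eta) is approximately {value:.3e}")
```

Starting values above 1 are not included, since the corresponding solutions leave every bounded set in finite positive time.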
(b) For the R2 -valued ODE of (5.8), (0, 0) is a fixed point that is positively and negatively stable, but neither positively nor negatively asymptotically stable. For the
R2 -valued ODE of (5.9), (0, 0) is a fixed point that is neither positively nor negatively stable.
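(c) Consider the 1-dimensional $\mathbb{R}$-valued ODE
\[ y' = y^2. \tag{5.31} \]
The only fixed point is 0, which is neither positively nor negatively stable. Indeed,
not even Def. 5.24(i) is satisfied: One obtains
\[ Y : D_{f,0} \to \mathbb{R}, \quad Y(x, \eta) := \frac{\eta}{1 - \eta x}, \]
where
\[ D_{f,0} = (\mathbb{R} \times \{0\}) \cup \big\{ (x, \eta) \in \mathbb{R}^2 : \eta > 0, \; x \in \, ]-\infty, 1/\eta[ \big\} \cup \big\{ (x, \eta) \in \mathbb{R}^2 : \eta < 0, \; x \in \, ]1/\eta, \infty[ \big\}, \]
showing every neighborhood of 0 contains $\eta > 0$ such that $Y(\cdot, \eta)$ is not defined on all
of $\mathbb{R}_0^+$ and $\eta < 0$ such that $Y(\cdot, \eta)$ is not defined on all of $\mathbb{R}_0^-$.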
Remark 5.28. There exist examples of autonomous ODE that show fixed points can
satisfy Def. 5.24(iii) without satisfying Def. 5.24(ii). For example, [Aul04, Ex. 7.4.16]
provides the following ODE in polar coordinates $(r, \varphi)$:
\[ r' = r \, (1 - r), \tag{5.32a} \]
\[ \varphi' = \frac{1 - \cos \varphi}{2} = \sin^2 \frac{\varphi}{2}. \tag{5.32b} \]
Even though it is somewhat tedious, one can show that its fixed point (1, 0) satisfies Def.
5.24(iii) without satisfying Def. 5.24(ii) (see Claim 4 of Example K.2 in the Appendix).
We will now study a method that allows one, in certain cases, to determine the stability
properties of a fixed point without having to know the solutions to an ODE. The method
is known as Lyapunov's method. The key ingredient to this method is a test function $V$,
known as a Lyapunov function. Once a Lyapunov function is known, stability is often
easily tested. The catch, however, is that Lyapunov functions can be hard to find. From
the literature, it appears there is no definition for an all-purpose Lyapunov function, as
a suitable choice depends on the circumstances.
\[ \dot{V}(y) := (\nabla V)(y) \bullet f(y) = \sum_{j=1}^{n} \partial_j V(y) \, f_j(y). \tag{5.33} \]
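\[ \exists_{0 < \delta(\epsilon) < \epsilon} \quad \forall_{y \in B_{\delta(\epsilon)}(p)} \quad V(y) < k(\epsilon), \tag{5.35} \]
where we used Not. 3.3 to denote an open ball with center $p$ with respect to $\|\cdot\|$.
We now claim that, for each $\eta \in B_{\delta(\epsilon)}(p)$, the maximal solution $x \mapsto \phi(x) := Y(x, \eta)$
must remain inside $B_\epsilon(p)$ for each $x \geq 0$ in its domain $I_{(0,\eta)}$ (implying $p$ to be positively
stable by Lem. 5.26). Seeking a contradiction, assume there exists $\xi \geq 0$ such that
$\|\phi(\xi) - p\| \geq \epsilon$ and let
\[ s := \sup \big\{ x \geq 0 : \phi(t) \in B_\epsilon(p) \text{ for each } t \in [0, x] \big\} \leq \xi < \infty. \tag{5.36} \]
The continuity of $\phi$ then implies $\|\phi(s) - p\| = \epsilon$, i.e.
\[ V(\phi(s)) \geq k(\epsilon) \tag{5.37} \]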
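by the definition of $k(\epsilon)$. On the other hand, by the chain rule $(V \circ \phi)'(x) = \dot{V}(\phi(x))$
(cf. (5.16)), such that $\dot{V} \leq 0$ implies
\[ V(\phi(s)) = V(\eta) + \int_0^s \dot{V}(\phi(x)) \, dx \leq V(\eta) \overset{(5.35)}{<} k(\epsilon), \tag{5.38} \]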
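\[ \forall_{\epsilon \in ]0, r]} \quad \exists_{\xi_\epsilon \geq 0} \quad \forall_{x \geq \xi_\epsilon} \quad \|Y(x, \eta) - p\| < \epsilon. \tag{5.39} \]
So fix $\eta \in B_\delta(p)$ and, as above, let $\phi(x) := Y(x, \eta)$. Given $\epsilon \in \, ]0, r]$, we first claim that
there exists $\xi_\epsilon \geq 0$ such that $\phi(\xi_\epsilon) \in B_{\delta(\epsilon)}(p)$, where $\delta(\epsilon)$ is as in the first part of the
proof above. Indeed, seeking a contradiction, assume $\|\phi(x) - p\| \geq \delta(\epsilon)$ for all $x \geq 0$,
and set
\[ \alpha := \max \big\{ \dot{V}(y) : \delta(\epsilon) \leq \|y - p\| \leq r \big\}. \tag{5.40} \]
Then $\alpha < 0$ due to the negative definiteness of $\dot{V}$ at $p$. Moreover, due to the choice of
$\delta$, we have $\delta(\epsilon) \leq \|\phi(x) - p\| \leq r$ for each $x \geq 0$, implying
\[ \forall_{x \geq 0} \quad 0 \leq V(\phi(x)) = V(\eta) + \int_0^x \dot{V}(\phi(t)) \, dt \leq V(\eta) + \alpha x, \tag{5.41} \]
which is the desired contradiction, as $\alpha < 0$ implies the right-hand side to go to $-\infty$ for
$x \to \infty$. Thus, we know the existence of $\xi_\epsilon$ such that $\eta_\epsilon := \phi(\xi_\epsilon) \in B_{\delta(\epsilon)}(p)$.
To finish the proof, we recall from the first part of the proof that $\|Y(x, \eta_\epsilon) - p\| < \epsilon$ for
each $x \geq 0$. Using Lem. 5.4(a), we obtain
\[ \forall_{x \geq 0} \quad \phi(\xi_\epsilon + x) = Y(\xi_\epsilon + x, \xi_\epsilon, \eta_\epsilon) = Y(\xi_\epsilon + x - \xi_\epsilon, \eta_\epsilon) = Y(x, \eta_\epsilon) \in B_\epsilon(p), \tag{5.42} \]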
V (y1 , y2 ) :=
y12 y22
+ ,
(5.44a)
which is clearly positive definite at (0, 0). Since V : R2 R,
(5.44b)
is clearly negative definite at (0, 0), Th. 5.30 proves (0, 0) to be a positively asymptotically stable fixed point.
Theorem 5.32. Consider the situation of Def. 5.24 with $\mathbb{K} = \mathbb{R}$. Let $\Omega_0$ be open with
$p \in \Omega_0 \subseteq \Omega \subseteq \mathbb{R}^n$. Assume $V : \Omega_0 \to \mathbb{R}$ to be continuously differentiable and assume
there is an open set $U \subseteq \Omega_0$ such that the following conditions (i) – (iii) are satisfied:
(i) $p \in \partial U$, i.e. $p$ is in the boundary of $U$.
(ii) $V > 0$ and $\dot{V} > 0$ (resp. $\dot{V} < 0$) on $U$ (where $\dot{V}$ is defined as in (5.33)).
(iii) $V(y) = 0$ for each $y \in \Omega_0 \cap \partial U$.
Then the fixed point $p$ is not positively (resp. negatively) stable.
Proof. We assume $V$ and $\dot{V}$ are positive, proving $p$ not to be positively stable; the corresponding statement regarding $p$ not to be negatively stable is then, once again, easily
obtained by reversing time, i.e. by using Rem. 5.25 together with noting $\dot{V}$ changing its
sign when replacing $f$ with $-f$.
Seeking a contradiction, assume p to be positively stable. Then there exists r > 0 such
that B r (p) = {y Rn : ky pk r} 0 and Br (p) implies Y (x, ) is defined
for each x 0. Moreover, positive stability and p U also imply the existence of
U Br (p) such that (x) := Y (x, ) Br (p) for all x 0 (note p 6= as p U ).
Set
s := sup x 0 : (t) U for each t [0, x]}.
(5.45)
(5.47)
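Then the choice of $\eta$ guarantees $\phi(x) \in C$ for all $x \geq 0$. If $y \in C$, then $V(y) \geq V(\eta) > 0$.
If $y \in \Omega_0 \cap \partial U$, then $V(y) = 0$, showing $C \cap \partial U = \emptyset$, i.e. $C \subseteq U$ and
\[ \alpha := \min \{ \dot{V}(y) : y \in C \} > 0. \tag{5.48} \]
Thus,
\[ \forall_{x \geq 0} \quad V(\phi(x)) = V(\eta) + \int_0^x \dot{V}(\phi(t)) \, dt \geq V(\eta) + \alpha x. \tag{5.49} \]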
But this means that the continuous function V is unbounded on the compact set C and
this contradiction proves p is not positively stable.
Example 5.33. Let $h_1, h_2 : \Omega \to \mathbb{R}$ be continuously differentiable functions defined
on some open set $\Omega \subseteq \mathbb{R}^2$ with $(0, 0) \in \Omega$ and $h_1(0, 0) > 0$, $h_2(0, 0) > 0$. We claim that
$(0, 0)$ is not a positively stable fixed point for each $\mathbb{R}^2$-valued ODE of the form
\[ \begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} y_2 \, h_1(y_1, y_2) \\ y_1 \, h_2(y_1, y_2) \end{pmatrix}. \tag{5.50} \]
Indeed, $(0, 0)$ is clearly a fixed point, and we let $\Omega_0$ be some open neighborhood of $(0, 0)$,
where both $h_1$ and $h_2$ are positive (such an $\Omega_0$ exists by continuity of $h_1$, $h_2$ and $h_1$, $h_2$
being positive at $(0, 0)$), and consider the Lyapunov function
\[ V : \Omega_0 \to \mathbb{R}, \quad V(y_1, y_2) := y_1 y_2, \tag{5.51a} \]
with $\dot{V} : \Omega_0 \to \mathbb{R}$,
\[ \dot{V}(y_1, y_2) = \nabla V(y_1, y_2) \bullet \big( y_2 h_1(y_1, y_2), \, y_1 h_2(y_1, y_2) \big) = (y_2, y_1) \bullet \big( y_2 h_1(y_1, y_2), \, y_1 h_2(y_1, y_2) \big) = y_2^2 \, h_1(y_1, y_2) + y_1^2 \, h_2(y_1, y_2) > 0 \text{ on } \Omega_0 \setminus \{(0, 0)\}. \tag{5.51b} \]
(5.52)
If $p \in \Omega$ is an isolated critical point of $F$ (i.e. $\nabla F(p) = 0$ and there exists an open set
$O$ with $p \in O \subseteq \Omega$ and $\nabla F \neq 0$ on $O \setminus \{p\}$), then $p$ is a fixed point of (5.52) that is
positively asymptotically stable, negatively asymptotically stable, or neither positively nor
negatively stable, according to whether $p$ is a local minimum for $F$, a local maximum for $F$, or neither.
Proof. Note that $F$ being $C^2$ implies $\nabla F$ to be $C^1$ and, in particular, locally Lipschitz,
such that (5.52) admits unique maximal solutions. Suppose $F$ has a local min at $p$.
As $p$ is an isolated critical point, the local min at $p$ must be strict, i.e. there exists an
open neighborhood $\Omega_0$ of $p$ such that $F(p) < F(y)$ for each $y \in \Omega_0 \setminus \{p\}$. Then the
Lyapunov function $V : \Omega_0 \to \mathbb{R}$, $V(y) := F(y) - F(p)$, is clearly positive definite at $p$
and $\dot{V} : \Omega_0 \to \mathbb{R}$,
\[ \dot{V}(y) = \nabla V(y) \bullet (-\nabla F(y)) = -\nabla F(y) \bullet \nabla F(y) = -\|\nabla F(y)\|_2^2, \tag{5.53} \]
(5.54)
(5.55)
5.3
Constant Coefficients
The stability properties of systems of first-order linear ODE (cf. Sec. 4.6.2) are closely
related to the eigenvalues of the matrix A. As it turns out, the stability of the origin is
essentially determined by the sign of the real part of the eigenvalues of A (cf. Th. 5.38
below). We start with a preparatory lemma:
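Lemma 5.36. Let $n \in \mathbb{N}$ and $W \in M(n, \mathbb{K})$ be invertible. Moreover, let $\|\cdot\|$ be some
norm on $M(n, \mathbb{K})$. Then
\[ \|\cdot\|^W : M(n, \mathbb{K}) \to \mathbb{R}_0^+, \quad \|A\|^W := \|W^{-1} A W\|, \tag{5.56} \]
also defines a norm on $M(n, \mathbb{K})$.
\[ \|\lambda A\|^W = |\lambda| \, \|W^{-1} A W\| = |\lambda| \, \|A\|^W, \]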
showing $\|\cdot\|^W$ is homogeneous of degree 1. Finally,
A,BM(n,K)
Then
kN
\[ \forall_{k \in \mathbb{N}_0} \quad A \big( \ker(A - \lambda \operatorname{Id})^k \big) \subseteq \ker(A - \lambda \operatorname{Id})^k, \]
i.e. all the kernels (in particular, the generalized eigenspace $M(\lambda)$) are invariant
subspaces for $A$.
(c) As already mentioned in Rem. 4.51, the algebraic multiplicity of $\lambda$, denoted $m_a(\lambda)$,
is its multiplicity as a zero of the characteristic polynomial $\chi_A(x) = \det(A - x \operatorname{Id})$,
and the geometric multiplicity of $\lambda$ is $m_g(\lambda) := \dim \ker(A - \lambda \operatorname{Id})$. We call the
eigenvalue $\lambda$ semisimple if, and only if, its algebraic and geometric multiplicities
are equal. We then have the equivalence of the following statements (i) – (iv):
(i) $\lambda$ is semisimple.
(ii) $M(\lambda) = \ker(A - \lambda \operatorname{Id})$.
(iii) $A{\restriction}_{M(\lambda)}$ is diagonalizable.
(iv) All the Jordan blocks corresponding to $\lambda$ are trivial, i.e. they all have size 1
(i.e. there are $\dim \ker(A - \lambda \operatorname{Id})$ such blocks).
(caveat: for $n > 1$, this is not the operator norm induced by the max-norm on $\mathbb{C}^n$).
Moreover, using Th. 4.46, let $W \in M(n, \mathbb{C})$ be invertible and such that $B := W^{-1} A W$
is in Jordan normal form. Then, according to Lem. 5.36, $\|M\|_{\max}^W := \|W^{-1} M W\|_{\max}$
also defines a norm on $M(n, \mathbb{C})$. According to Th. 4.47(b),
\[ \forall_{x \in \mathbb{R}} \quad \|e^{Ax}\|_{\max}^W = \|W^{-1} e^{Ax} W\|_{\max} = \|e^{W^{-1} A W x}\|_{\max} = \|e^{Bx}\|_{\max}. \tag{5.57} \]
According to Th. 4.44 and Th. 4.49, the entries kl (x) of (kl (x)) := eBx enjoy the
following property:
k,l{1,...,n}
j{1,...,s}
C>0
mN0
xR
(5.58)
Moreover,
|kl (x)| = C eRe j x |x|m Re j < 0
|kl (x)| = C eRe j x |x|m Re j > 0
|kl | C,
(5.59a)
(a): We start with the equivalence between (i) and (ii): Suppose, $\operatorname{Re} \lambda_j \leq 0$ for every
$j = 1, \dots, s$ and if $\operatorname{Re} \lambda_j = 0$ occurs, then $\lambda_j$ is a semisimple eigenvalue. Then, using
Rem. and Def. 5.37(c) and (5.58), we are either in situation (5.59a) or in situation
(5.59c). Thus, there exists $K_0 > 0$ such that $|\beta_{kl}(x)| \leq K_0$ for each $x \geq 0$ and each
$k, l = 1, \dots, n$. Then there exists $K_1 > 0$ such that
\[ \forall_{x \geq 0} \quad \|e^{Ax}\| \leq K_1 \|e^{Ax}\|_{\max}^W = K_1 \|e^{Bx}\|_{\max} \leq K_1 K_0, \tag{5.60} \]
(5.61)
i.e., the corresponding statement of (i) cannot be true. The remaining case is handled
via time reversal: $\|e^{Ax}\| \leq K$ holds for each $x \leq 0$ if, and only if, $\|e^{-Ax}\| \leq K$ holds
for each $x \geq 0$, which holds if, and only if, $\operatorname{Re}(-\lambda_j) \leq 0$ for every $j = 1, \dots, s$ with
$-\lambda_j$ semisimple for $\operatorname{Re}(-\lambda_j) = 0$, which is equivalent to $\operatorname{Re} \lambda_j \geq 0$ for every $j = 1, \dots, s$
with $\lambda_j$ semisimple for $\operatorname{Re} \lambda_j = 0$.
We proceed to the equivalence between (i) and (iii): Fix some arbitrary norm $\|\cdot\|$ on
$\mathbb{C}^n$, and let $\|\cdot\|_{\mathrm{op}}$ denote the induced operator norm on $M(n, \mathbb{C})$. Let $C_1, C_2 > 0$ be
such that $\|M\|_{\mathrm{op}} \leq C_1 \|M\|$ and $\|M\| \leq C_2 \|M\|_{\mathrm{op}}$ for each $M \in M(n, \mathbb{C})$. Suppose
there exists $K > 0$ such that $\|e^{Ax}\| \leq K$ holds for each $x \geq 0$. Given $\epsilon > 0$, choose
$\delta := \epsilon / (C_1 K)$. Then
(5.63)
Ax
1
1
x0
n
= sup ke k : C , kk = ,
showing (i) holds with $K := C_2 \, \epsilon / \delta$. The remaining case is handled via time reversal:
$\|e^{Ax}\| \leq K$ holds for each $x \leq 0$ if, and only if, $\|e^{-Ax}\| \leq K$ holds for each $x \geq 0$, which
holds if, and only if, 0 is positively stable for $y' = -Ay$, which, by Rem. 5.25(a), holds
if, and only if, 0 is negatively stable for $y' = Ay$.
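(b): As in (a), we start with the equivalence between (i) and (ii): Suppose, $\operatorname{Re} \lambda_j < 0$
for every $j = 1, \dots, s$. We first show, using (5.58),
\[ \forall_{k,l = 1, \dots, n} \quad \exists_{K_{kl}, \mu_{kl} > 0} \quad \forall_{x \geq 0} \quad |\beta_{kl}(x)| \leq K_{kl} \, e^{-\mu_{kl} x}. \tag{5.64} \]
According to (5.58),
\[ \exists_{C_{kl} > 0} \quad \forall_{x \geq 0} \quad |\beta_{kl}(x)| = C_{kl} \, e^{\operatorname{Re} \lambda_j x} x^m = C_{kl} \, e^{\operatorname{Re} \lambda_j x / 2} \, e^{\operatorname{Re} \lambda_j x / 2} x^m. \]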
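Since $\operatorname{Re} \lambda_j < 0$, one has $\lim_{x \to \infty} e^{\operatorname{Re} \lambda_j x / 2} x^m = 0$, i.e. $e^{\operatorname{Re} \lambda_j x / 2} x^m$ is uniformly bounded
on $[0, \infty[$ by some $M_{kl} > 0$. Thus, (5.64) holds with $K_{kl} := C_{kl} M_{kl}$ and $\mu_{kl} := -\operatorname{Re} \lambda_j / 2$.
In consequence, if $K_1$ is chosen as in (5.60), then $\|e^{Ax}\| \leq K e^{-\mu |x|} \leq K$ for each $x \geq 0$
holds with $K := K_1 \max\{K_{kl} : k, l = 1, \dots, n\}$ and $\mu := \min\{\mu_{kl} : k, l = 1, \dots, n\}$.
Conversely, if there is $j \in \{1, \dots, s\}$ such that $\operatorname{Re} \lambda_j \geq 0$, then there is $\beta_{kl}$ such that
(5.59b) or (5.59c) or (5.59d) occurs. In each case,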
Bx
lim keAx k = lim keAx kW
kmax ]0, ],
max = lim ke
(5.65)
i.e., the corresponding statement of (i) cannot be true. The remaining case is handled
via time reversal: $\|e^{Ax}\| \leq K e^{-\mu |x|}$ holds for each $x \leq 0$ if, and only if, $\|e^{-Ax}\| \leq
K e^{-\mu |x|}$ holds for each $x \geq 0$, which holds if, and only if, $\operatorname{Re}(-\lambda_j) < 0$ for every
$j = 1, \dots, s$, which is equivalent to $\operatorname{Re} \lambda_j > 0$ for every $j = 1, \dots, s$.
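For $2 \times 2$ matrices, the eigenvalue conditions used in (a) and (b) can be evaluated with the standard library alone. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the tolerance is an arbitrary choice, and the semisimplicity test relies on the fact that a double eigenvalue $\lambda$ of a $2 \times 2$ matrix is semisimple only if the matrix equals $\lambda \operatorname{Id}$.

```python
import cmath


def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial."""
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return ((tr + root) / 2.0, (tr - root) / 2.0)


def classify_origin(a, b, c, d, tol=1e-12):
    """Positive-time stability of 0 for y' = A y, following the eigenvalue criterion."""
    l1, l2 = eigenvalues_2x2(a, b, c, d)
    largest = max(l1.real, l2.real)
    if largest < -tol:
        return "positively asymptotically stable"
    if largest > tol:
        return "not positively stable"
    # Some eigenvalue has real part 0, so it has to be semisimple.
    double = abs(l1 - l2) <= tol
    scalar_matrix = abs(b) <= tol and abs(c) <= tol and abs(a - d) <= tol
    if double and not scalar_matrix:
        return "not positively stable"
    return "positively stable"


print(classify_origin(0.0, -1.0, 1.0, 0.0))   # rotation, eigenvalues +i and -i
print(classify_origin(0.0, 1.0, 1.0, 0.0))    # saddle, eigenvalues +1 and -1
print(classify_origin(0.0, 1.0, 0.0, 0.0))    # double eigenvalue 0, not semisimple
print(classify_origin(-1.0, 1.0, 0.0, -2.0))  # eigenvalues -1 and -2
```

The statements for negative time follow by applying the same function to $-A$, in line with the time reversal argument above.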
It remains to consider the equivalence between (i) and (iii): Let $\|\cdot\|_{\mathrm{op}}$ and $C_1, C_2 > 0$
be as in the proof of the equivalence between (i) and (iii) in (a). Suppose, there exist
$K, \mu > 0$ such that $\|e^{Ax}\| \leq K e^{-\mu |x|}$ holds for each $x \geq 0$. Since $\|e^{Ax}\| \leq K e^{-\mu |x|} \leq K$
for each $x \geq 0$, 0 is positively stable by (a). Moreover,
Cn
x0
(5.66)
showing 0 to be positively asymptotically stable. For the converse, we will actually
show (iii) implies (ii). If 0 is positively asymptotically stable, then, in particular, it
is positively stable, such that (ii) of (a) must hold. It merely remains to exclude the
possibility of a semisimple eigenvalue $\lambda$ with $\operatorname{Re} \lambda = 0$. If there were a semisimple
eigenvalue $\lambda$ with $\operatorname{Re} \lambda = 0$, then $e^{Bx}$ would have a Jordan block of size 1 with entry $e^{\lambda x}$, i.e.
$\beta_{kk}(x) = e^{\lambda x}$ for some $k \in \{1, \dots, n\}$. Let $e_k$ be the corresponding standard unit vector
of $\mathbb{C}^n$ (all entries 0, except the $k$th entry, which is 1). Then, for $\eta := W e_k$,
xR
(5.67)
2 1
1 2
\[ A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \]
has eigenvalue 0, which is not semisimple, i.e. the fixed point 0 of $y' = Ay$ is neither
negatively nor positively stable.
(c) The matrix
i 1
2
2 3i
0 i
5
17
A=
0 0 1 + 3i
0
0 0
0
5
5.4
Linearization
If the right-hand side $f$ of an autonomous ODE is differentiable and $p$ is a fixed point (i.e.
$f(p) = 0$), then one can sometimes use its linearization, i.e. its derivative $A := Df(p)$
(which is an $n \times n$ matrix), to infer stability properties of $y' = f(y)$ at $p$ from those of
$y' = Ay$ at 0 (see Th. 5.44 below). We start with some preparatory results:
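Lemma 5.40. Let $n \in \mathbb{N}$ and consider the bilinear function
\[ \beta : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}, \quad \beta(y, z) := y \bullet (Bz) = y^t B z = \sum_{k,l=1}^{n} y_k b_{kl} z_l, \tag{5.68} \]
where $B = (b_{kl}) \in M(n, \mathbb{R})$, $\bullet$ denotes the Euclidean scalar product, and elements of
$\mathbb{R}^n$ are interpreted as column vectors when involved in matrix multiplications.
(a) The function $\beta$ is differentiable (it is even a polynomial, $\deg(\beta) \leq 2$, and, thus,
$C^\infty$) and
\[ \forall_{k \in \{1, \dots, n\}} \quad \partial_{y_k} \beta : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}, \quad \partial_{y_k} \beta(y, z) = \sum_{l=1}^{n} b_{kl} z_l = (Bz)_k, \tag{5.69a} \]
\[ \forall_{l \in \{1, \dots, n\}} \quad \partial_{z_l} \beta : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}, \quad \partial_{z_l} \beta(y, z) = \sum_{k=1}^{n} y_k b_{kl} = (y^t B)_l, \tag{5.69b} \]
\[ \forall_{(y,z) \in \mathbb{R}^n \times \mathbb{R}^n} \quad D\beta(y, z) = \nabla \beta(y, z) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}, \quad \nabla \beta(y, z)(u, v) = \beta(y, v) + \beta(u, z) = y^t B v + u^t B z. \tag{5.69c} \]
(b) The function
\[ V : \mathbb{R}^n \to \mathbb{R}, \quad V(y) := \beta(y, y) = y^t B y = \sum_{k,l=1}^{n} y_k b_{kl} y_l, \tag{5.70} \]
is differentiable (it is also even a polynomial, $\deg(V) \leq 2$, and, thus, $C^\infty$) and
\[ \forall_{k \in \{1, \dots, n\}} \quad \partial_{y_k} V : \mathbb{R}^n \to \mathbb{R}, \quad \partial_{y_k} V(y) = \sum_{l=1}^{n} y_l (b_{kl} + b_{lk}) = \big( y^t (B + B^t) \big)_k, \tag{5.71a} \]
\[ \forall_{y \in \mathbb{R}^n} \quad DV(y) = \nabla V(y) : \mathbb{R}^n \to \mathbb{R}, \quad \nabla V(y)(u) = \beta(y, u) + \beta(u, y) = y^t B u + u^t B y = y^t (B + B^t) u. \tag{5.71b} \]
Proof. (a): (5.69a) and (5.69b) are immediate from (5.68) and, then, imply (5.69c).
(b): (5.71a) is immediate from (5.70) and, then, implies (5.71b).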
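\[ \forall_{y \in \mathbb{R}^n} \quad (\nabla V)(y) \bullet (Ay) = y^t (BA + A^t B) \, y. \tag{5.72} \]
Proof. We note
\[ \forall_{y \in \mathbb{R}^n} \quad (y^t B^t) \bullet (Ay) = y^t B^t A y = (y^t B^t A y)^t = y^t A^t B y, \tag{5.73} \]
which, together with (5.71b), yields
\[ \forall_{y \in \mathbb{R}^n} \quad (\nabla V)(y) \bullet (Ay) = y^t (B + B^t) A y = y^t B A y + y^t A^t B y, \tag{5.74} \]
proving (5.72).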
Definition 5.42. A matrix B M(n, R), n N, is called positive definite if, and only
if, the function V of (5.70) is positive definite at p = 0 in the sense of Def. 5.29.
Proposition 5.43. Let $A \in M(n, \mathbb{R})$, $n \in \mathbb{N}$. Then the following statements (i) – (iii)
are equivalent:
(i) There exist positive definite matrices $B, C \in M(n, \mathbb{R})$, satisfying
\[ BA + A^t B = -C. \tag{5.75} \]
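(5.70) as the Lyapunov function. Then, by Def. 5.42, $B$ being positive definite means
$V$ being positive definite at 0. Since
\[ \dot{V} : \mathbb{R}^n \to \mathbb{R}, \quad \dot{V}(y) = \nabla V(y) \bullet (Ay) \overset{(5.72)}{=} y^t (BA + A^t B) y \overset{(5.75)}{=} -y^t C y \tag{5.76} \]
\[ \exists_{K, \mu > 0} \quad \forall_{x \geq 0} \quad \|e^{Ax}\|_{\max} \leq K e^{-\mu x}, \tag{5.77} \]
where we have chosen the norm in (5.77) to mean the max-norm on $\mathbb{R}^{n^2}$ (note that $e^{Ax}$
is real if $A$ is real, e.g. due to the series representation (4.73)). Given $C \in M(n, \mathbb{R})$,
define
\[ B := \int_0^\infty e^{A^t x} \, C \, e^{Ax} \, dx. \tag{5.78} \]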
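To verify that $B \in M(n, \mathbb{R})$ is well-defined, note that each entry of the integrand matrix
of (5.78) constitutes an integrable function on $[0, \infty[$: Indeed,
\[ \exists_{M > 0} \quad \forall_{x \geq 0} \quad \big\| e^{A^t x} C e^{Ax} \big\|_{\max} \leq M \, \|e^{A^t x}\|_{\max} \, \|C\|_{\max} \, \|e^{Ax}\|_{\max} \leq M \, \|C\|_{\max} \, K^2 e^{-2 \mu x}, \tag{5.79} \]
which is integrable on $[0, \infty[$. Next, we compute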
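\[ \begin{aligned} -C &\overset{(5.79)}{=} \lim_{x \to \infty} e^{A^t x} C e^{Ax} - C = \lim_{x \to \infty} \int_0^x \partial_s \big( e^{A^t s} C e^{As} \big) \, ds = \int_0^\infty \partial_s \big( e^{A^t s} C e^{As} \big) \, ds \\ &\overset{(I.3)}{=} \int_0^\infty \big( A^t e^{A^t s} C e^{As} + e^{A^t s} C e^{As} A \big) \, ds \\ &\overset{(I.5),(I.6)}{=} A^t \Big( \int_0^\infty e^{A^t s} C e^{As} \, ds \Big) + \Big( \int_0^\infty e^{A^t s} C e^{As} \, ds \Big) A \overset{(5.78)}{=} A^t B + BA, \end{aligned} \tag{5.80} \]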
showing $B$ is positive definite as well. Finally, if $C$ is symmetric, then
\[ B^t = \int_0^\infty \big( e^{A^t x} C e^{Ax} \big)^t \, dx \overset{\text{Prop. 4.40(c)}}{=} \int_0^\infty e^{A^t x} C e^{Ax} \, dx = B, \tag{5.82} \]
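A small hand-computed instance may help to see the matrix equation $BA + A^t B = -C$ at work. The sketch below assumes $C = \operatorname{Id}$ and a sample matrix $A$ with eigenvalues $-1$ and $-2$ (both chosen only for illustration); the symmetric matrix $B$ was obtained by solving the three linear equations for its entries by hand, and the code merely verifies the result.

```python
def matmul(X, Y):
    """Product of two matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    """Transpose of a matrix given as nested lists."""
    return [list(row) for row in zip(*X)]


A = [[-1.0, 1.0], [0.0, -2.0]]          # sample matrix, eigenvalues -1 and -2
B = [[1.0 / 2.0, 1.0 / 6.0],            # candidate solution for C = Id
     [1.0 / 6.0, 1.0 / 3.0]]

BA = matmul(B, A)
AtB = matmul(transpose(A), B)

# Residual of B A + A^t B + Id, which should vanish entrywise.
residual = [[BA[i][j] + AtB[i][j] + (1.0 if i == j else 0.0)
             for j in range(2)] for i in range(2)]

# Positive definiteness of the symmetric B via its leading principal minors.
det_B = B[0][0] * B[1][1] - B[0][1] * B[1][0]
print(residual, B[0][0] > 0.0 and det_B > 0.0)
```

With this $B$, the quadratic form $V(y) = y^t B y$ is a Lyapunov function for $y' = Ay$ whose derivative along solutions is $-\|y\|_2^2$.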
Theorem 5.44. Let $\Omega \subseteq \mathbb{R}^n$ be open, $n \in \mathbb{N}$, and $f : \Omega \to \mathbb{R}^n$ continuously differentiable. Let $p \in \Omega$ be a fixed point (i.e. $f(p) = 0$) and $A := Df(p) \in M(n, \mathbb{R})$ the
derivative of $f$ at $p$. If all eigenvalues of $A$ have negative (resp. positive) real parts, then
$p$ is a positively (resp. negatively) asymptotically stable fixed point for $y' = f(y)$.
Proof. Let all eigenvalues of $A$ have negative real parts. We first consider the special
case $p = 0$, i.e. $A = Df(0)$. By the equivalence between (ii) and (iii) of Prop. 5.43,
we can choose $C := \operatorname{Id}$ in (iii) to obtain the existence of a positive definite symmetric
matrix $B \in M(n, \mathbb{R})$, satisfying
\[ BA + A^t B = -\operatorname{Id}. \tag{5.83} \]
The idea is now to apply the Lyapunov Th. 5.30 with $V$ of (5.70), i.e.
\[ V : \Omega \to \mathbb{R}, \quad V(y) := y \bullet (By) = y^t B y = \sum_{k,l=1}^{n} y_k b_{kl} y_l. \tag{5.84} \]
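\[ \lim_{y \to 0} \frac{\|r(y)\|_2}{\|y\|_2} = 0. \tag{5.87} \]
\[ \dot{V}(y) = (\nabla V)(y) \bullet f(y) \overset{(5.71b),(5.86)}{=} (\nabla V)(y) \bullet (Ay) + y^t (B + B^t) \, r(y) \overset{(5.72),\, B = B^t,\, (5.83)}{=} -\|y\|_2^2 + 2 \, y \bullet B \, r(y). \tag{5.88} \]
We can estimate the second summand via the Cauchy-Schwarz inequality to obtain
\[ |y \bullet B \, r(y)| \leq \|y\|_2 \, \|B \, r(y)\|_2 \leq \|y\|_2 \, \|B\| \, \|r(y)\|_2, \tag{5.89} \]
implying
\[ \lim_{y \to 0} \frac{y \bullet B \, r(y)}{\|y\|_2^2} = 0. \tag{5.90} \]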
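Now choose $\delta > 0$ such that $B_\delta(0) \subseteq \Omega$ and such that
\[ \forall_{y \in B_\delta(0) \setminus \{0\}} \quad \frac{|2 \, y \bullet B \, r(y)|}{\|y\|_2^2} < \frac{1}{2}. \tag{5.91} \]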
(5.92)
R.
(5.93)
(5.94)
with complex eigenvalues $i$ and $-i$. Thus, the linearized system is positively and negatively stable, but not asymptotically stable, still independently of $\alpha$. However, we claim that $(0,0)$ is a positively asymptotically stable fixed point for $y' = f(y)$ if $\alpha < 0$ and a negatively asymptotically stable fixed point for $y' = f(y)$ if $\alpha > 0$. Indeed, this can be seen by using the Lyapunov function $V: \mathbb{R}^2 \to \mathbb{R}$, $V(y_1,y_2) = y_1^2 + y_2^2$, which has $\nabla V(y_1,y_2) = (2y_1, 2y_2)$ and
$$\dot V(y_1,y_2) = \nabla V(y_1,y_2) \cdot f(y_1,y_2) = 2\alpha\, (y_1^4 + y_2^4). \tag{5.95}$$
Thus, $V$ is positive definite at $(0,0)$ and $\dot V$ is negative definite at $(0,0)$ for $\alpha < 0$ and positive definite at $(0,0)$ for $\alpha > 0$.
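To see numerically how the nonlinear terms decide the stability when the linearization has purely imaginary eigenvalues, the following sketch integrates a system of this type. The right-hand side used here, $f(y_1,y_2) = (y_2 + a\,y_1^3,\ -y_1 + a\,y_2^3)$ with a real parameter `a`, is an assumption chosen only because it has linearization eigenvalues $\pm i$ and satisfies $\nabla V \cdot f = 2a\,(y_1^4 + y_2^4)$ for $V(y_1,y_2) = y_1^2 + y_2^2$; the initial value, time horizon, and function names are likewise hypothetical.

```python
def final_lyapunov_value(a, y1=0.5, y2=0.0, T=1.0, steps=1000):
    """Integrate y' = f(y) with classical RK4 and return V = y1^2 + y2^2 at time T."""

    def f(u, v):
        # Assumed right-hand side: rotation plus cubic terms scaled by a.
        return (v + a * u ** 3, -u + a * v ** 3)

    h = T / steps
    for _ in range(steps):
        k1 = f(y1, y2)
        k2 = f(y1 + h / 2 * k1[0], y2 + h / 2 * k1[1])
        k3 = f(y1 + h / 2 * k2[0], y2 + h / 2 * k2[1])
        k4 = f(y1 + h * k3[0], y2 + h * k3[1])
        y1 += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y2 += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return y1 * y1 + y2 * y2


V_start = 0.5 ** 2
V_negative = final_lyapunov_value(-1.0)  # V decreases along the solution
V_positive = final_lyapunov_value(+1.0)  # V increases along the solution
```

The short time horizon is deliberate: for a positive parameter, solutions of such a system can leave every bounded set in finite time, so the comparison is only made on an interval where both solutions exist.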
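Example 5.46. Consider $(x,y,z)' = f(x,y,z)$ with
$$f:\ \mathbb{R}^3 \to \mathbb{R}^3,\quad f(x,y,z) := \begin{pmatrix} -x \cos y \\ -y\, e^z \\ x^2 - 2z \end{pmatrix}. \tag{5.96}$$
The derivative is
$$Df:\ \mathbb{R}^3 \to \mathcal{M}(3,\mathbb{R}),\quad Df(x,y,z) = \begin{pmatrix} -\cos y & x \sin y & 0 \\ 0 & -e^z & -y\, e^z \\ 2x & 0 & -2 \end{pmatrix}. \tag{5.97}$$
Clearly, $(0,0,0)$ is a fixed point and $Df(0,0,0)$ has eigenvalues $-1$ and $-2$. Thus, $(0,0,0)$ is a positively asymptotically stable fixed point for $(x,y,z)' = f(x,y,z)$ by Th. 5.44.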
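The linearization step of Th. 5.44 can be checked numerically with central differences. The sketch below assumes the right-hand side $f(x,y,z) = (-x\cos y,\ -y\,e^z,\ x^2 - 2z)$, which is consistent with the Jacobian (5.97) and the fixed point $(0,0,0)$; the helper `jacobian` and the step size are illustrative choices.

```python
import math


def f(x, y, z):
    # Assumed right-hand side, consistent with the Jacobian in (5.97).
    return (-x * math.cos(y), -y * math.exp(z), x * x - 2.0 * z)


def jacobian(func, point, step=1e-6):
    """Central-difference approximation of the Jacobian of func at point."""
    n = len(point)
    columns = []
    for j in range(n):
        forward = list(point)
        backward = list(point)
        forward[j] += step
        backward[j] -= step
        fp = func(*forward)
        fm = func(*backward)
        columns.append([(p - m) / (2 * step) for p, m in zip(fp, fm)])
    # columns[j][i] approximates the partial derivative of f_i with respect to variable j.
    return [[columns[j][i] for j in range(n)] for i in range(n)]


J = jacobian(f, (0.0, 0.0, 0.0))
# At the origin the Jacobian is diagonal, so its eigenvalues are the diagonal entries.
diagonal = [J[i][i] for i in range(3)]
```

Since all diagonal entries are negative and the off-diagonal entries vanish at the origin, all eigenvalues of the linearization have negative real parts, as required by Th. 5.44.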
5.5
Limit Sets
Limit sets are important when studying the asymptotic behavior of solutions, i.e. $\phi(x)$ for $x \to \infty$ and for $x \to -\infty$. If a solution has a limit, then its corresponding limit set consists of precisely one point. In general, the limit set of a solution is defined to consist of all points that occur as limits of sequences taken along the solution's orbit (of course, the limit sets can also be empty):
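Definition 5.47. Let $\Omega \subseteq \mathbb{K}^n$, $n \in \mathbb{N}$, and $f: \Omega \to \mathbb{K}^n$ be such that $y' = f(y)$ admits unique maximal solutions. For each $\eta \in \Omega$, we define the omega limit set and the alpha limit set of $\eta$ as follows:
$$\omega(\eta) := \omega_f(\eta) := \Big\{ y \in \Omega :\ \exists_{(x_k)_{k\in\mathbb{N}} \subseteq \mathbb{R}}\ \ \lim_{k\to\infty} x_k = \infty\ \wedge\ \lim_{k\to\infty} Y(x_k,\eta) = y \Big\},$$
$$\alpha(\eta) := \alpha_f(\eta) := \Big\{ y \in \Omega :\ \exists_{(x_k)_{k\in\mathbb{N}} \subseteq \mathbb{R}}\ \ \lim_{k\to\infty} x_k = -\infty\ \wedge\ \lim_{k\to\infty} Y(x_k,\eta) = y \Big\}.$$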
Remark 5.48. In the situation of Def. 5.47, consider the time-reversed version of $y' = f(y)$, i.e. $y' = -f(y)$, with its general solution $\tilde Y(x,\eta) = Y(-x,\eta)$, cf. (5.28). Clearly, for each $\eta \in \Omega$,
$$\omega_f(\eta) = \alpha_{-f}(\eta),\quad \alpha_f(\eta) = \omega_{-f}(\eta). \tag{5.99}$$
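Proposition 5.49. In the situation of Def. 5.47, the following hold:
(a) If $Y(\cdot,\eta)$ is defined on all of $\mathbb{R}_0^+$, then
$$\omega(\eta) = \bigcap_{m=0}^\infty \overline{\{Y(x,\eta) : x \ge m\}}; \tag{5.100a}$$
if $Y(\cdot,\eta)$ is defined on all of $\mathbb{R}_0^-$, then
$$\alpha(\eta) = \bigcap_{m=0}^\infty \overline{\{Y(x,\eta) : x \le -m\}}. \tag{5.100b}$$
(b) All points in the same orbit have the same omega and alpha limit sets, i.e.
$$\forall_{x \in I_{0,\eta}}\quad \omega(\eta) = \omega\big(Y(x,\eta)\big)\ \wedge\ \alpha(\eta) = \alpha\big(Y(x,\eta)\big).$$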
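Proof. Due to Rem. 5.48, it suffices to prove the statements involving the omega limit sets.
(a): Let $y \in \omega(\eta)$ and $m \in \mathbb{N}_0$. Then there is a sequence $(x_k)_{k\in\mathbb{N}}$ in $\mathbb{R}$ such that $\lim_{k\to\infty} x_k = \infty$ and $\lim_{k\to\infty} Y(x_k,\eta) = y$. Since, for sufficiently large $k_0 \in \mathbb{N}$, the sequence $(Y(x_k,\eta))_{k \ge k_0}$ is in $\{Y(x,\eta) : x \ge m\}$, the inclusion "$\subseteq$" of (5.100a) is proved. Conversely, assume $y \in \bigcap_{m=0}^\infty \overline{\{Y(x,\eta) : x \ge m\}}$. Then,
$$\forall_{k \in \mathbb{N}}\ \exists_{x_k \in [k,\infty[}\quad \|Y(x_k,\eta) - y\| < \frac{1}{k},$$
providing a sequence $(x_k)_{k\in\mathbb{N}}$ in $\mathbb{R}$ such that $\lim_{k\to\infty} x_k = \infty$ and $\lim_{k\to\infty} Y(x_k,\eta) = y$, proving $y \in \omega(\eta)$ and the inclusion "$\supseteq$" of (5.100a).
(b): Let $y \in \omega(\eta)$ and $x \in I_{0,\eta}$. Choose a sequence $(x_k)_{k\in\mathbb{N}}$ in $\mathbb{R}$ such that $\lim_{k\to\infty} x_k = \infty$ and $\lim_{k\to\infty} Y(x_k,\eta) = y$. Then $\lim_{k\to\infty} (x_k - x) = \infty$ and
$$\lim_{k\to\infty} Y\big(x_k - x,\ Y(x,\eta)\big) \overset{\text{Lem. 5.4(b)}}{=} \lim_{k\to\infty} Y(x_k,\eta) = y, \tag{5.101}$$
proving $\omega(\eta) \subseteq \omega\big(Y(x,\eta)\big)$. The reversed inclusion then also follows, since
$$Y\big(-x,\ Y(x,\eta)\big) \overset{\text{Lem. 5.4(b)}}{=} Y(0,\eta) = \eta. \tag{5.102}$$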
Example 5.51. As an example with nonperiodic orbits that have limit sets consisting of more than one point, consider
$$y_1' = y_2 + y_1\,(1 - y_1^2 - y_2^2), \tag{5.104a}$$
$$y_2' = -y_1 + y_2\,(1 - y_1^2 - y_2^2). \tag{5.104b}$$
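We will show that, for each point except the origin (which is clearly a fixed point), the omega limit set is the unit circle, i.e.
$$\forall_{\eta \in \mathbb{R}^2 \setminus \{0\}}\quad \omega(\eta) = S_1(0) = \{y \in \mathbb{R}^2 : \|y\|_2 = 1\}. \tag{5.105}$$
We first verify that the general solution is
$$Y:\ D_{f,0} \to \mathbb{R}^2,\quad Y(x,\eta_1,\eta_2) = \frac{(\eta_1 \cos x + \eta_2 \sin x,\ \eta_2 \cos x - \eta_1 \sin x)}{\sqrt{\eta_1^2 + \eta_2^2 + (1 - \eta_1^2 - \eta_2^2)\, e^{-2x}}}, \tag{5.106}$$
where, letting
$$\forall_{\eta \in \{y \in \mathbb{R}^2 :\ \|y\|_2 > 1\}}\quad x_\eta := \frac{1}{2} \ln \frac{\|\eta\|_2^2 - 1}{\|\eta\|_2^2}, \tag{5.107}$$
$$D_{f,0} = \big( \mathbb{R} \times \{\eta \in \mathbb{R}^2 : \|\eta\|_2 \le 1\} \big) \cup \bigcup_{\eta \in \mathbb{R}^2,\ \|\eta\|_2 > 1} \big( ]x_\eta, \infty[ \times \{\eta\} \big). \tag{5.108}$$
The initial condition is satisfied, since
$$Y(0,\eta_1,\eta_2) = \frac{(\eta_1,\eta_2)}{\sqrt{\eta_1^2 + \eta_2^2 + (1 - \eta_1^2 - \eta_2^2)}} = (\eta_1,\eta_2). \tag{5.109}$$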
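The following computations prepare the check that each $Y(\cdot,\eta_1,\eta_2)$ satisfies (5.104): The 2-norm squared of the numerator in (5.106) is
\begin{align*}
\big\| (\eta_1 \cos x + \eta_2 \sin x,\ \eta_2 \cos x - \eta_1 \sin x) \big\|_2^2
&= \eta_1^2 \cos^2 x + 2\eta_1\eta_2 \cos x \sin x + \eta_2^2 \sin^2 x + \eta_2^2 \cos^2 x - 2\eta_1\eta_2 \cos x \sin x + \eta_1^2 \sin^2 x \\
&= \eta_1^2 + \eta_2^2 = \|\eta\|_2^2. \tag{5.110}
\end{align*}
Thus,
$$\| Y(x,\eta_1,\eta_2) \|_2 = \frac{\|\eta\|_2}{\sqrt{\|\eta\|_2^2 + (1 - \|\eta\|_2^2)\, e^{-2x}}} \tag{5.111}$$
and
\begin{align*}
1 - Y_1^2(x,\eta_1,\eta_2) - Y_2^2(x,\eta_1,\eta_2) &= 1 - \| Y(x,\eta_1,\eta_2) \|_2^2 = \frac{\|\eta\|_2^2 + (1 - \|\eta\|_2^2)\, e^{-2x} - \|\eta\|_2^2}{\|\eta\|_2^2 + (1 - \|\eta\|_2^2)\, e^{-2x}} \\
&= \frac{(1 - \|\eta\|_2^2)\, e^{-2x}}{\|\eta\|_2^2 + (1 - \|\eta\|_2^2)\, e^{-2x}}. \tag{5.112}
\end{align*}
In consequence,
\begin{align*}
\partial_x Y_1(x,\eta_1,\eta_2)
&= \frac{(-\eta_1 \sin x + \eta_2 \cos x) \big( \|\eta\|_2^2 + (1 - \|\eta\|_2^2)\, e^{-2x} \big) + (\eta_1 \cos x + \eta_2 \sin x)(1 - \|\eta\|_2^2)\, e^{-2x}}{\big( \|\eta\|_2^2 + (1 - \|\eta\|_2^2)\, e^{-2x} \big)^{3/2}} \\
&= Y_2(x,\eta_1,\eta_2) + Y_1(x,\eta_1,\eta_2) \big( 1 - Y_1^2(x,\eta_1,\eta_2) - Y_2^2(x,\eta_1,\eta_2) \big), \tag{5.113}
\end{align*}
verifying (5.104a). Similarly,
\begin{align*}
\partial_x Y_2(x,\eta_1,\eta_2)
&= \frac{(-\eta_2 \sin x - \eta_1 \cos x) \big( \|\eta\|_2^2 + (1 - \|\eta\|_2^2)\, e^{-2x} \big) + (\eta_2 \cos x - \eta_1 \sin x)(1 - \|\eta\|_2^2)\, e^{-2x}}{\big( \|\eta\|_2^2 + (1 - \|\eta\|_2^2)\, e^{-2x} \big)^{3/2}} \\
&= -Y_1(x,\eta_1,\eta_2) + Y_2(x,\eta_1,\eta_2) \big( 1 - Y_1^2(x,\eta_1,\eta_2) - Y_2^2(x,\eta_1,\eta_2) \big), \tag{5.114}
\end{align*}
verifying (5.104b).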
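The computations above can be cross-checked numerically. The following sketch compares the closed-form expression, written with initial value $(e_1, e_2)$ as $(e_1\cos x + e_2\sin x,\ e_2\cos x - e_1\sin x)$ divided by $\sqrt{e_1^2+e_2^2 + (1-e_1^2-e_2^2)\,e^{-2x}}$, against a classical Runge-Kutta integration of (5.104), and evaluates the Euclidean norm at a large time. The initial values, step count, and function names are illustrative assumptions.

```python
import math


def rhs(y1, y2):
    """Right-hand side of (5.104)."""
    s = 1.0 - y1 * y1 - y2 * y2
    return (y2 + y1 * s, -y1 + y2 * s)


def closed_form(x, e1, e2):
    """Closed-form solution with initial value (e1, e2) at x = 0."""
    n2 = e1 * e1 + e2 * e2
    d = math.sqrt(n2 + (1.0 - n2) * math.exp(-2.0 * x))
    return ((e1 * math.cos(x) + e2 * math.sin(x)) / d,
            (e2 * math.cos(x) - e1 * math.sin(x)) / d)


def rk4_flow(e1, e2, T, steps):
    """Classical RK4 approximation of the solution at time T."""
    h = T / steps
    y1, y2 = e1, e2
    for _ in range(steps):
        k1 = rhs(y1, y2)
        k2 = rhs(y1 + h / 2 * k1[0], y2 + h / 2 * k1[1])
        k3 = rhs(y1 + h / 2 * k2[0], y2 + h / 2 * k2[1])
        k4 = rhs(y1 + h * k3[0], y2 + h * k3[1])
        y1 += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y2 += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return (y1, y2)


# One initial value inside and one outside the unit circle.
checks = []
for e1, e2 in [(0.2, 0.1), (2.0, -1.0)]:
    exact = closed_form(10.0, e1, e2)
    numeric = rk4_flow(e1, e2, 10.0, 10000)
    error = max(abs(exact[0] - numeric[0]), abs(exact[1] - numeric[1]))
    checks.append((error, math.hypot(exact[0], exact[1])))
```

Both orbits approach the unit circle while continuing to rotate, which is the behavior behind the claim that the omega limit set of every nonzero initial value is the whole circle rather than a single point.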
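Moreover, (5.111) yields
$$\forall_{\eta \in \mathbb{R}^2 \setminus \{0\}}\quad \lim_{x \to \infty} \| Y(x,\eta) \|_2 = 1, \tag{5.115}$$
which implies
$$\forall_{\eta \in \mathbb{R}^2 \setminus \{0\}}\quad \omega(\eta) \subseteq S_1(0). \tag{5.116}$$
To prove the remaining inclusion, let $\eta \in \mathbb{R}^2 \setminus \{0\}$ and $y \in S_1(0)$. Then
$$\exists_{\varphi_y \in [0,2\pi[}\quad y = (\sin \varphi_y,\ \cos \varphi_y). \tag{5.117}$$
Analogously,
$$\exists_{\varphi_\eta \in [0,2\pi[}\quad \eta = \|\eta\|_2\, (\sin \varphi_\eta,\ \cos \varphi_\eta) \tag{5.118}$$
(the reader might note that, in (5.117) and (5.118), we have written $y$ and $\eta$ using their polar coordinates, cf. [Phi13b, Ex. 4.19]). Then, according to (5.106), we obtain, for each $x \ge 0$,
\begin{align*}
Y(x,\eta_1,\eta_2) &= \frac{\|\eta\|_2\, (\sin \varphi_\eta \cos x + \cos \varphi_\eta \sin x,\ \cos \varphi_\eta \cos x - \sin \varphi_\eta \sin x)}{\sqrt{\|\eta\|_2^2 + (1 - \|\eta\|_2^2)\, e^{-2x}}} \\
&= \frac{\|\eta\|_2\, \big( \sin(x + \varphi_\eta),\ \cos(x + \varphi_\eta) \big)}{\sqrt{\|\eta\|_2^2 + (1 - \|\eta\|_2^2)\, e^{-2x}}}. \tag{5.119}
\end{align*}
Define
$$\forall_{k \in \mathbb{N}}\quad x_k := \varphi_y - \varphi_\eta + 2\pi k \in \mathbb{R}^+. \tag{5.120}$$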
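Then $\lim_{k\to\infty} x_k = \infty$ and
$$\lim_{k\to\infty} Y(x_k,\eta_1,\eta_2) \overset{(5.119)}{=} \lim_{k\to\infty} \frac{\|\eta\|_2\, (\sin \varphi_y,\ \cos \varphi_y)}{\sqrt{\|\eta\|_2^2 + (1 - \|\eta\|_2^2)\, e^{-2x_k}}} = y, \tag{5.121}$$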
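$$\forall_{y \in \omega(\eta)}\ \forall_{x \in I_{0,y}}\quad Y(x,y) \in \omega(\eta) \qquad\wedge\qquad \forall_{y \in \alpha(\eta)}\ \forall_{x \in I_{0,y}}\quad Y(x,y) \in \alpha(\eta). \tag{5.122}$$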
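Proof. Due to Rem. 5.48, it suffices to prove the statement involving the omega limit set. Let $y \in \omega(\eta)$ and $x \in I_{0,y}$. Choose a sequence $(x_k)_{k\in\mathbb{N}}$ in $\mathbb{R}$ such that $\lim_{k\to\infty} x_k = \infty$ and $\lim_{k\to\infty} Y(x_k,\eta) = y$. Then $\lim_{k\to\infty} (x_k + x) = \infty$ and,
$$\lim_{k\to\infty} Y(x_k + x,\ \eta) \overset{\text{Lem. 5.4(b)}}{=} \lim_{k\to\infty} Y\big(x,\ Y(x_k,\eta)\big) \overset{(*)}{=} Y(x,y), \tag{5.123}$$
proving $Y(x,y) \in \omega(\eta)$. At $(*)$, we have used that, due to $f$ being locally Lipschitz by hypothesis, $Y$ is continuous by Th. 3.35.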
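Proposition 5.53. In the situation of Def. 5.47, let $\eta \in \Omega$ be such that there exists a compact set $K \subseteq \Omega$, satisfying
$$\{Y(x,\eta) : x \ge 0\} \subseteq K \quad\text{resp.}\quad \{Y(x,\eta) : x \le 0\} \subseteq K. \tag{5.124}$$
Then the following hold:
(a) $\omega(\eta) \ne \emptyset$ (resp. $\alpha(\eta) \ne \emptyset$).
(b) $\omega(\eta)$ (resp. $\alpha(\eta)$) is compact.
(c) $\omega(\eta)$ (resp. $\alpha(\eta)$) is a connected set, i.e. if $O_1, O_2$ are disjoint open subsets of $\mathbb{K}^n$ such that $\omega(\eta) \subseteq O_1 \cup O_2$ (resp. $\alpha(\eta) \subseteq O_1 \cup O_2$), then $\omega(\eta) \cap O_1 = \emptyset$ or $\omega(\eta) \cap O_2 = \emptyset$ (resp. $\alpha(\eta) \cap O_1 = \emptyset$ or $\alpha(\eta) \cap O_2 = \emptyset$).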
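Proof. Due to Rem. 5.48, it suffices to prove the statements involving the omega limit sets.
(a): Since, by hypothesis, $(Y(k,\eta))_{k\in\mathbb{N}}$ is a sequence in the compact set $K$, it must have a subsequence, converging to some limit $y \in K$. But then $y \in \omega(\eta)$, i.e. $\omega(\eta) \ne \emptyset$.
(b): According to (5.100a) and (5.124), $\omega(\eta)$ is a closed subset of the compact set $K$, implying $\omega(\eta)$ to be compact as well.
(c): Seeking a contradiction, we suppose the assertion is false, i.e. there are disjoint open subsets $O_1, O_2$ of $\mathbb{K}^n$ such that $\omega(\eta) \subseteq O_1 \cup O_2$, $\omega_1 := \omega(\eta) \cap O_1 \ne \emptyset$ and $\omega_2 := \omega(\eta) \cap O_2 \ne \emptyset$. Then $\omega_1$ and $\omega_2$ are disjoint since $O_1, O_2$ are disjoint. Moreover, $\omega_1$ and $\omega_2$ are both subsets of the compact set $\omega(\eta)$. Due to $\omega_1 = \omega(\eta) \cap (\mathbb{K}^n \setminus O_2)$ and $\omega_2 = \omega(\eta) \cap (\mathbb{K}^n \setminus O_1)$, $\omega_1$ and $\omega_2$ are also closed, hence, compact. Then, according to Prop. C.10, $\delta := \operatorname{dist}(\omega_1,\omega_2) > 0$. If $y_1 \in \omega_1$ and $y_2 \in \omega_2$, then there are numbers $0 < s_1 < t_1 < s_2 < t_2 < \dots$ such that $\lim_{k\to\infty} s_k = \lim_{k\to\infty} t_k = \infty$ and
$$\forall_{k \in \mathbb{N}}\quad Y(s_k,\eta) \in O_1\ \wedge\ Y(t_k,\eta) \in O_2. \tag{5.125}$$
Define
$$\forall_{k \in \mathbb{N}}\quad \sigma_k := \sup \big\{ x \ge s_k : Y(t,\eta) \in O_1 \text{ for each } t \in [s_k,x] \big\}. \tag{5.126}$$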
Proof. As usual, it suffices to prove the assertions for $\dot V(y) \le 0$, as the assertions for $\dot V(y) \ge 0$ then follow via time reversion.
(a): We claim
$$\forall_{x \in I_{0,\eta} \cap \mathbb{R}_0^+}\quad V\big(Y(x,\eta)\big) < r: \tag{5.127}$$
(5.128)
and
$$r = V\big(Y(s,\eta)\big) = V(\eta) + \int_0^s \dot V\big(Y(t,\eta)\big)\, dt \le V(\eta) < r, \tag{5.129}$$
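(b): Let $\phi := Y(\cdot,\eta)$. During the proof of (a) above, we have shown $\phi(x) \in K$ for each $x \ge 0$. Since, then, $(V \circ \phi)'(x) = \dot V(\phi(x)) \le 0$ for each $x \ge 0$, $V \circ \phi$ is nonincreasing for $x \ge 0$. Since $V$ is also bounded on $K$,
$$\exists_{c \in \mathbb{R}}\quad c = \lim_{x \to \infty} V\big(\phi(x)\big). \tag{5.130}$$
If $y \in \omega(\eta)$, then there exists a sequence $(x_k)_{k\in\mathbb{N}}$ in $\mathbb{R}$ such that $\lim_{k\to\infty} x_k = \infty$ and $\lim_{k\to\infty} \phi(x_k) = y$. Thus, $y \in K$ (since $K$ is closed), and
$$V(y) = \lim_{k\to\infty} V\big(\phi(x_k)\big) \overset{(5.130)}{=} c, \tag{5.131}$$
proving (b).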
$$\forall_{x \in \mathbb{R}_0^+}\quad \dot V\big(\phi(x)\big) = (V \circ \phi)'(x) = 0 \tag{5.132}$$
as claimed.
Example 5.55. Let a < 0 < b and let h : ]a, b[ R be continuously differentiable and
such that
$$y_1' = y_2, \tag{5.134a}$$
$$y_2' = -y_1^2\, y_2 - h(y_1). \tag{5.134b}$$
The right-hand side is defined on $\Omega := ]a,b[ \times \mathbb{R}$ and is clearly $C^1$, i.e. the ODE admits unique maximal solutions. Due to (5.133), $F = \{(0,0)\}$, i.e. the origin is the only fixed point of (5.134). We will use Th. 5.54(c) to show $(0,0)$ is positively asymptotically stable: We introduce
$$H:\ ]a,b[ \to \mathbb{R},\quad H(x) := \int_0^x h(t)\, dt, \tag{5.135}$$
$$V(y_1,y_2) := H(y_1) + \frac{y_2^2}{2}. \tag{5.136}$$
Thus, from the Lyapunov Th. 5.30, we already know $(0,0)$ to be positively stable. However, $\dot V$ is not negative definite at $(0,0)$, i.e. we cannot immediately conclude that $(0,0)$ is positively asymptotically stable. Instead, as promised, we apply Th. 5.54(c): To this end, using that $H$ is continuous and positive definite at $0$, we choose $r > 0$ and $c, d \in \mathbb{R}$, satisfying
$$a < c < 0 < d < b \quad\text{and}\quad H(c) = H(d) = r, \tag{5.138}$$
(5.139)
(5.140)
and define
(1 ,2 )O
(5.141)
Moreover, the continuity of $V$ implies $K$ to be closed. Since $K \subseteq [c,d] \times [-\sqrt{2r}, \sqrt{2r}]$, it is also bounded, i.e. compact. Thus, Th. 5.54 applies to each $\eta \in O$. So let $\eta \in O$. We will show that $M = \{(0,0)\}$, where $M$ is the set of Th. 5.54(c) (then $\omega(\eta) = \{(0,0)\}$ by Th. 5.54(c), which implies (5.141) as desired). To verify $M = \{(0,0)\}$, note $\dot V(y_1,y_2) < 0$ for $y_1, y_2 \ne 0$, showing $(y_1,y_2) \notin M$. For $y_1 = 0$, $y_2 \ne 0$, let $\phi := Y(\cdot,y_1,y_2)$. Then $\phi_2(0) = y_2 \ne 0$ and $\phi_1'(0) = y_2 \ne 0$, i.e. both $\phi_1$ and $\phi_2$ are nonzero on some interval $]0,\epsilon[$ with $\epsilon > 0$, showing $(y_1,y_2) \notin M$. Likewise, if $y_1 \ne 0$, $y_2 = 0$, then let $\phi$ be as before. This time $\phi_1(0) = y_1 \ne 0$ and $\phi_2'(0) = -h(y_1) \ne 0$, again showing both $\phi_1$ and $\phi_2$ are nonzero on some interval $]0,\epsilon[$ with $\epsilon > 0$, implying $(y_1,y_2) \notin M$.
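The behavior described in Example 5.55 can be observed numerically. The sketch below assumes the concrete choice $h(y) = y$ (so that $H(y) = y^2/2$), an initial value $(1,0)$, and a classical Runge-Kutta integrator; all of these, as well as the function names, are illustrative assumptions. It records $V(y_1,y_2) = H(y_1) + y_2^2/2$ along the solution of $y_1' = y_2$, $y_2' = -y_1^2 y_2 - h(y_1)$ at regular times.

```python
def h_fun(y):
    # Assumed concrete choice of h; it is C^1 and vanishes only at 0.
    return y


def H(y):
    # Antiderivative of h_fun with H(0) = 0.
    return 0.5 * y * y


def rhs(y1, y2):
    """Right-hand side of (5.134) for the assumed h."""
    return (y2, -y1 * y1 * y2 - h_fun(y1))


def V(y1, y2):
    return H(y1) + 0.5 * y2 * y2


def lyapunov_samples(y1=1.0, y2=0.0, T=50.0, steps=50000, every=5000):
    """Integrate with classical RK4 and sample V every `every` steps."""
    step = T / steps
    samples = [V(y1, y2)]
    for i in range(1, steps + 1):
        k1 = rhs(y1, y2)
        k2 = rhs(y1 + step / 2 * k1[0], y2 + step / 2 * k1[1])
        k3 = rhs(y1 + step / 2 * k2[0], y2 + step / 2 * k2[1])
        k4 = rhs(y1 + step * k3[0], y2 + step * k3[1])
        y1 += step / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y2 += step / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if i % every == 0:
            samples.append(V(y1, y2))
    return samples


samples = lyapunov_samples()
```

The sampled values decrease, but only slowly once the solution is close to the origin: the derivative of $V$ along solutions equals $-y_1^2 y_2^2$, which vanishes on both coordinate axes. This is exactly why the example relies on Th. 5.54(c) rather than on negative definiteness.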
A Differentiability
$$f:\ O \to \mathbb{K},\quad f(x) := e^{a(x)}, \tag{A.1a}$$
is differentiable with
$$f':\ O \to \mathbb{K},\quad f'(x) = a'(x)\, e^{a(x)}. \tag{A.1b}$$
Proof. For $\mathbb{K} = \mathbb{R}$, the lemma is immediate from the one-dimensional chain rule for real-valued functions [Phi13a, (9.16)]. It remains to consider the case $\mathbb{K} = \mathbb{C}$. Note that we cannot apply the chain rule for holomorphic (i.e. $\mathbb{C}$-differentiable) functions, since $a$ is only $\mathbb{R}$-differentiable and it does not need to have a holomorphic extension. However, we can argue as follows, merely using the chain rule and the product rule for real-valued functions: Write $a = b + ic$ with differentiable functions $b, c: O \to \mathbb{R}$. Then
$$f(x) = e^{a(x)} = e^{b(x) + i c(x)} = e^{b(x)}\, e^{i c(x)} = e^{b(x)} \big( \cos c(x) + i \sin c(x) \big). \tag{A.2}$$
Thus, one computes
\begin{align*}
f'(x) &= b'(x)\, e^{b(x)}\, e^{i c(x)} + e^{b(x)} \big( -c'(x) \sin c(x) + i c'(x) \cos c(x) \big) \\
&= b'(x)\, e^{a(x)} + i c'(x)\, e^{b(x)} \big( \cos c(x) + i \sin c(x) \big) = b'(x)\, e^{a(x)} + i c'(x)\, e^{b(x)}\, e^{i c(x)} \\
&= \big( b'(x) + i c'(x) \big)\, e^{a(x)} = a'(x)\, e^{a(x)}, \tag{A.3}
\end{align*}
proving (A.1b).
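A quick numerical check of (A.1b) for a complex-valued exponent of a real variable: the sketch below compares a central difference quotient of $f = e^{a}$ with $a'\,e^{a}$. The concrete function $a(x) = x^2 + 3ix$, the evaluation point, and the step size are illustrative assumptions.

```python
import cmath


def a(x):
    # Assumed example: a = b + i c with b(x) = x^2 and c(x) = 3x.
    return complex(x * x, 3.0 * x)


def a_prime(x):
    return complex(2.0 * x, 3.0)


def f(x):
    return cmath.exp(a(x))


def central_difference(g, x, step=1e-6):
    """Symmetric difference quotient of a complex-valued function of a real variable."""
    return (g(x + step) - g(x - step)) / (2.0 * step)


x0 = 0.7
numeric = central_difference(f, x0)
exact = a_prime(x0) * f(x0)   # right-hand side of (A.1b)
deviation = abs(numeric - exact)
```

Only real increments of $x$ are used, in line with the proof above, which does not require $a$ to have a holomorphic extension.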
B $\mathbb{K}^n$-Valued Integration
During the course of this class, we frequently need $\mathbb{K}^n$-valued integrals. In particular, for $f: I \to \mathbb{K}^n$, $I$ an interval in $\mathbb{R}$, we make use of the estimate $\| \int_I f \| \le \int_I \|f\|$, for example in the proof of the Peano Th. 3.8. As mentioned in the proof of Th. 3.8, the estimate can easily be checked directly for the 1-norm on $\mathbb{K}^n$, but it does hold for every norm on $\mathbb{K}^n$. To verify this result is the main purpose of the present section.
Throughout the class, it suffices to use Riemann integrals. However, some readers might be more familiar with Lebesgue integrals, which is a more general notion (every Riemann integrable function is also Lebesgue integrable). For convenience, the material is presented twice, first using Riemann integrals and arguments that make specific use of techniques available for Riemann integrals, then, second, using Lebesgue integrals and corresponding techniques. For Riemann integrals, the norm estimate is proved in Th. B.4, for Lebesgue integrals in Th. B.9.
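The norm estimate $\| \int_I f \| \le \int_I \|f\|$ can be illustrated with Riemann sums, for which it reduces to the triangle inequality. The sketch below uses midpoint sums for the assumed example $f(t) = (e^{it}, t)$ on $[0, 2\pi]$ with values in $\mathbb{C}^2$ and compares both sides for the 1-norm, the 2-norm, and the max-norm; the example function and all names are illustrative.

```python
import cmath
import math


def f(t):
    # Assumed example with values in C^2.
    return (cmath.exp(1j * t), complex(t, 0.0))


NORMS = {
    "1": lambda v: sum(abs(z) for z in v),
    "2": lambda v: math.sqrt(sum(abs(z) ** 2 for z in v)),
    "max": lambda v: max(abs(z) for z in v),
}


def midpoint_sums(norm, a=0.0, b=2.0 * math.pi, N=2000):
    """Return (norm of the Riemann sum of f, Riemann sum of norm(f))."""
    width = (b - a) / N
    integral_of_f = [0j, 0j]
    integral_of_norm = 0.0
    for k in range(N):
        t = a + (k + 0.5) * width
        value = f(t)
        integral_of_f = [s + z * width for s, z in zip(integral_of_f, value)]
        integral_of_norm += norm(value) * width
    return norm(integral_of_f), integral_of_norm


results = {name: midpoint_sums(norm) for name, norm in NORMS.items()}
```

In this example the first component integrates to approximately zero over a full period, so the left-hand side is close to $2\pi^2$ for all three norms, while the right-hand side is strictly larger.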
B.1
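Remark B.2. The linearity of the $\mathbb{K}$-valued integral implies the linearity of the $\mathbb{K}^n$-valued integral.
Theorem B.3. Let $a, b \in \mathbb{R}$, $a \le b$, $I := [a,b]$. If $f \in \mathcal{R}(I,\mathbb{K}^n)$, $n \in \mathbb{N}$, and $\phi: f(I) \to \mathbb{R}$ is Lipschitz continuous, then $\phi \circ f \in \mathcal{R}(I,\mathbb{R})$.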
Proof. If K = R, then f = f , where : Rn Cn is the canonical imbedding,
and : Cn R, (z1 , . . . , zn ) := (Re z1 , . . . , Re zn ). Clearly, f R(I, Cn ), and,
if is L-Lipschitz, L 0, then, due to
z,wCn
CL
n
X
j=1
(B.2)
()
j=1,...,n
,
2ncL
(B.3)
where R and r denote upper and lower Riemann sums, respectively (cf. [Phi13a, (10.7)]).
Letting be a joint refinement of the 2n partitions 1 , . . . , n , 1 , . . . , n , we have (cf.
[Phi13a, Def. 10.8(a),(b)] and [Phi13a, Th. 10.10(a)])
j=1,...,n
,
2ncL
.
R(, Im fj ) r(, Im fj ) <
2ncL
R(, Re fj ) r(, Re fj ) <
(B.4)
N
X
k=1
N
X
k=1
mk |Ik | =
Mk |Ik | =
N
X
k=1
N
X
k=1
mk (g)(xk xk1 ),
(B.5a)
Mk (g)(xk xk1 ),
(B.5b)
where
mk (g) := inf{g(x) : x Ik },
Mk (g) := sup{g(x) : x Ik },
(B.5c)
n
n
X
X
Im fj (k ) Im fj (k )
Re fj (k ) Re fj (k ) + cL
cL
cL
j=1
j=1
n
X
j=1
Mk (Re fj ) mk (Re fj ) + cL
n
X
j=1
Mk (Im fj ) mk (Im fj ) .
(B.6)
Thus,
R(, f ) r(, f ) =
(B.6)
cL
n
N X
X
k=1 j=1
+ cL
k=1 j=1
= cL
j=1
k=1
Mk ( f ) mk ( f ) |Ik |
Mk (Re fj ) mk (Re fj ) |Ik |
n
N X
X
n
X
N
X
Mk (Im fj ) mk (Im fj ) |Ik |
R(, Re fj ) r(, Re fj ) + cL
< 2ncL
= .
2ncL
(B.4)
n
X
j=1
R(, Im fj ) r(, Im fj )
(B.7)
Proof. From Th. B.3, we obtain kf k R(I, R), as the norm k k is 1-Lipschitz by the
inverse triangle inequality. Let be an arbitrary partition of I. Recalling that, for
each g : I R and = (x0 , . . . , xN ) RN +1 , N N, a = x0 < x1 < < xN = b,
Ik := [xk1 , xk ], k Ik , the intermediate Riemann sums
(, f ) =
N
X
k=1
f (tk ) |Ik | =
N
X
k=1
(B.9)
we obtain, for k Ik ,
(, Re f1 ), (, Im f1 ) , . . . , (, Re fn ), (, Im fn )
!!
!
N
N
N
N
X
X
X
X
Im fn (k ) |Ik |
Re fn (k ) |Ik |,
Im f1 (k ) |Ik | , . . . ,
Re f1 (k ) |Ik |,
=
k=1
k=1
k=1
k=1
N
X
Re f1 (k ) |Ik |, Im f1 (k ) |Ik | , . . . , Re fn (k ) |Ik |, Im fn (k ) |Ik |
=
k=1
N
X
Re f1 (k ), Im f1 (k ) , . . . , Re fn (k ), Im fn (k )
|Ik |
k=1
N
X
k=1
kf (k )k |Ik | = (, kf k).
(B.10)
Since the intermediate Riemann sums in (B.10) converge to the respective integrals by
[Phi13a, (10.24b)], one obtains
Z
f
=
lim
(,
Re
f
),
(,
Im
f
)
,
.
.
.
,
(,
Re
f
),
(,
Im
f
)
1
1
n
n
||0
I
Z
(B.10)
lim (, kf k) = kf k,
(B.11)
||0
proving (B.8).
B.2
(B.12)
Proof. Assume $f^{-1}(O)$ is measurable for each open subset $O$ of $\mathbb{K}^n$. Let $j \in \{1,\dots,n\}$. If $O_j \subseteq \mathbb{K}$ is open in $\mathbb{K}$, then $O := \pi_j^{-1}(O_j) = \{z \in \mathbb{K}^n : z_j \in O_j\}$ is open in $\mathbb{K}^n$. Thus, $f_j^{-1}(O_j) = f^{-1}(O)$ is measurable, showing that each $f_j$ is measurable, i.e. $f$ is measurable. Now assume $f$ is measurable, i.e. each $f_j$ is measurable. Since every open $O \subseteq \mathbb{K}^n$ is a countable union of open sets of the form $O = O_1 \times \dots \times O_n$ with each $O_j$ being an open subset of $\mathbb{K}$, it suffices to show that the preimages of such open sets are measurable. So let $O$ be as above. Then $f^{-1}(O) = \bigcap_{j=1}^n f_j^{-1}(O_j)$, showing that $f^{-1}(O)$ is measurable.
Corollary B.8. Let $I \subseteq \mathbb{R}$ be measurable, $n \in \mathbb{N}$. If $f: I \to \mathbb{K}^n$ is measurable, then $\|f\|: I \to \mathbb{R}$ is measurable.
Proof. If $O \subseteq \mathbb{R}$ is open, then $\|\cdot\|^{-1}(O)$ is an open subset of $\mathbb{K}^n$ by the continuity of the norm. In consequence, $\|f\|^{-1}(O) = f^{-1}\big( \|\cdot\|^{-1}(O) \big)$ is measurable.
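where $\lambda$ denotes Lebesgue measure on $\mathbb{R}$. Next, consider the case that $f$ is a so-called simple function, that means $f$ takes only finitely many values $y_1,\dots,y_N \in \mathbb{K}^n$, $N \in \mathbb{N}$, and each preimage $B_j := f^{-1}\{y_j\} \subseteq I$ is measurable. Then
$$f = \sum_{j=1}^N y_j\, \chi_{B_j}, \tag{B.15}$$
where, without loss of generality, we may assume that the $B_j$ are pairwise disjoint. We obtain
$$\left\| \int_I f \right\| = \left\| \int_I \sum_{j=1}^N y_j\, \chi_{B_j} \right\| = \left\| \sum_{j=1}^N y_j \int_I \chi_{B_j} \right\| \le \sum_{j=1}^N \|y_j\| \int_I \chi_{B_j} = \int_I \sum_{j=1}^N \|y_j\|\, \chi_{B_j} \overset{(*)}{=} \int_I \left\| \sum_{j=1}^N y_j\, \chi_{B_j} \right\| = \int_I \|f\|, \tag{B.16}$$
where, at $(*)$, it was used that, as the $B_j$ are disjoint, the integrands of the two integrals are equal at each $x \in I$.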
and also
(B.18)
Thus, we obtain
Z
Z
Z
f
=
f1 , . . . , fn
I
I
I Z
Z
Z
Z
lim
+
i
lim
,
.
.
.
,
lim
=
+
i
lim
1,k
1,k
n,k
n,k
k
k I
k I
k I
I
Z
Z
Z
()
lim
= lim
(
+
i
)
kk + ik k = kf k,
(B.19)
k
k
k
where the equality at () holds due to limk
k(1,k , . . . , n,k )kkf k
L1 (I)
= 0, which,
in turn, is verified by
Z
Z
Z
kk + ik k kf k kk + ik f k C kk + ik f k1
0
I
Z X
n
j,k + ij,k fj 0 for k ,
=C
(B.20)
I j=1
C Metric Spaces
C.1 Distance in Metric Spaces
Lemma C.1. The following law holds in every metric space $(X,d)$:
$$|d(x,y) - d(x',y')| \le d(x,x') + d(y,y') \quad \text{for each } x, x', y, y' \in X. \tag{C.1}$$
Proof. First, note $d(x,y) \le d(x,x') + d(x',y') + d(y',y)$, i.e.
$$d(x,y) - d(x',y') \le d(x,x') + d(y',y). \tag{C.3a}$$
(C.3b)
Definition C.2. Let $(X,d)$ be a nonempty metric space. For each $A, B \subseteq X$ define the distance between $A$ and $B$ by
$$\operatorname{dist}(A,B) := \inf\{d(a,b) : a \in A,\ b \in B\} \in [0,\infty] \tag{C.4}$$
and
xX
(C.5)
A 6= and B 6= .
(C.6)
Theorem C.4. Let $(X,d)$ be a nonempty metric space. If $A \subseteq X$ and $A \ne \emptyset$, then the functions
$$\delta, \tilde\delta:\ X \to \mathbb{R}_0^+,\quad \delta(x) := \operatorname{dist}(x,A),\quad \tilde\delta(x) := \operatorname{dist}(A,x), \tag{C.7}$$
are both Lipschitz continuous with Lipschitz constant 1 (in particular, they are both continuous and even uniformly continuous).
Proof. Since $\operatorname{dist}(x,A) = \operatorname{dist}(A,x)$, it suffices to verify the Lipschitz continuity of $\delta$. We need to show
(C.9)
(C.10)
(C.11)
(C.12)
(C.13)
and
implying
and
Since x, y X were arbitrary, (C.12) also yields
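Definition C.5. Let $(X,d)$ be a metric space, $A \subseteq X$, and $\epsilon \in \mathbb{R}^+$. Define
$$A_\epsilon := \{x \in X : d(x,A) < \epsilon\}, \tag{C.14a}$$
$$\overline{A}_\epsilon := \{x \in X : d(x,A) \le \epsilon\}. \tag{C.14b}$$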
(C.15)
f : [0, 1] R,
(t) := tx + (1 t)a,
f (t) := d (t), A .
(C.16a)
(C.16b)
If (sn )nN is a sequence in [0, 1] such that limn sn = s [0, 1], then limn (sn ) =
sx+(1s)a = (s), i.e. is continuous. Then, using Th. C.4, f is also continuous. Thus,
since f (0) = d(a, A) = 0 and f (1) = d(x, A) = 2 , one can use the intermediate
value theorem [Phi13a, Th. 7.56] to obtain, for each [0, 2 ], some [0, 1], satisfying
f ( ) = . If > 0, then d(( ), A) = f ( ) = > 0, i.e ( ) A \ A and ( ) A \ A ,
showing A ( A1 , A1 ( A1 , and A2 ( A2 . If := (1 + 2 )/2, then 1 < = f ( ) =
d(( ), A) < 2 , i.e. A2 \ A1 , showing A1 ( A2 .
C.2
Definition C.9. A subset $C$ of a metric space $X$ is called compact if, and only if, every sequence in $C$ has a subsequence that converges to some limit $c \in C$.
Proposition C.10. Let $(X,d)$ be a metric space, $C, A \subseteq X$. If $C$ is compact, $A$ is closed, and $A \cap C = \emptyset$, then $\operatorname{dist}(C,A) > 0$.
Proof. Proceeding by contraposition, we show that $\operatorname{dist}(C,A) = 0$ implies $A \cap C \ne \emptyset$. If $\operatorname{dist}(C,A) = 0$, then there exists a sequence $((c_k,a_k))_{k\in\mathbb{N}}$ in $C \times A$ such that
$$\lim_{k\to\infty} d(c_k,a_k) = 0. \tag{C.17}$$
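As $C$ is compact, passing to a subsequence, we may assume
$$\lim_{k\to\infty} c_k = c \in C, \tag{C.18}$$
also implying
$$\lim_{k\to\infty} a_k = c,$$
since
$$\forall_{k \in \mathbb{N}}\quad d(a_k,c) \le d(a_k,c_k) + d(c_k,c). \tag{C.19}$$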
The following examples show that, in general, sets can be closed and bounded without being compact.
Example C.13. (a) If $(X,d)$ is a noncomplete metric space, then it contains a Cauchy sequence that does not converge. It is not hard to see that such a sequence cannot have a convergent subsequence, either. This shows that no noncomplete metric space can be compact. Moreover, the closure of every bounded subset of $X$ that contains such a nonconvergent Cauchy sequence is an example of a closed and bounded set that is noncompact. Concrete examples are given by $\mathbb{Q} \cap [a,b]$ for each $a, b \in \mathbb{R}$ with $a < b$ (these sets are $\mathbb{Q}$-closed, but not $\mathbb{R}$-closed!) and $]a,b[$ for each $a, b \in \mathbb{R}$ with $a < b$, in each case endowed with the usual metric $d(x,y) := |x - y|$.
(b) There can also be closed and bounded sets in complete spaces that are not compact. Consider the space $X$ of all bounded sequences $(x_n)_{n\in\mathbb{N}}$ in $\mathbb{K}$, endowed with the sup-norm $\|(x_n)_{n\in\mathbb{N}}\|_{\sup} := \sup\{|x_n| : n \in \mathbb{N}\}$. It is not too difficult to see that $X$ with the sup-norm is a Banach space: Let $(x^k)_{k\in\mathbb{N}}$ with $x^k = (x^k_n)_{n\in\mathbb{N}}$ be a Cauchy sequence in $X$. Then, for each $n \in \mathbb{N}$, $(x^k_n)_{k\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{K}$, and, thus, it has a limit $y_n \in \mathbb{K}$. Let $y := (y_n)_{n\in\mathbb{N}}$. Then
$$\|x^k - y\|_{\sup} = \sup\{|x^k_n - y_n| : n \in \mathbb{N}\}.$$
Let $\epsilon > 0$. As $(x^k)_{k\in\mathbb{N}}$ is a Cauchy sequence with respect to the sup-norm, there is $N \in \mathbb{N}$ such that $\|x^k - x^l\|_{\sup} < \epsilon$ for all $k, l > N$. Fix some $l > N$ and some $n \in \mathbb{N}$. Then $\lim_{k\to\infty} |x^k_n - x^l_n| = |y_n - x^l_n| \le \epsilon$. Since this is valid for each $n \in \mathbb{N}$, we get $\|x^l - y\|_{\sup} \le \epsilon$ for each $l > N$, showing $\lim_{l\to\infty} x^l = y$, i.e. $X$ is complete and a Banach space.
Now consider the sequence $(e^k)_{k\in\mathbb{N}}$ with
$$e^k_n := \begin{cases} 1 & \text{for } k = n, \\ 0 & \text{otherwise.} \end{cases}$$
Then $(e^k)_{k\in\mathbb{N}}$ constitutes a sequence in $X$ with $\|e^k\|_{\sup} = 1$ for each $k \in \mathbb{N}$. In particular, $(e^k)_{k\in\mathbb{N}}$ is a sequence inside the closed unit ball $\overline{B}_1(0)$, and, hence, bounded. However, if $k, l \in \mathbb{N}$ with $k \ne l$, then $\|e^k - e^l\|_{\sup} = 1$. Thus, neither $(e^k)_{k\in\mathbb{N}}$ nor any subsequence can be a Cauchy sequence. In particular, no subsequence can converge, showing that the closed and bounded unit ball $\overline{B}_1(0)$ is not compact.
Note: There is an important result that shows that a normed vector space is finite-dimensional if, and only if, the closed unit ball $\overline{B}_1(0)$ is compact (see, e.g., [Str08, Th. 28.14]).
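The key observation of Example C.13(b), namely that distinct unit sequences have sup-norm distance exactly 1, can be made concrete on finite truncations. The sketch below represents the first `LENGTH` coordinates of each unit sequence as a list; the truncation length and all names are illustrative assumptions, and the truncation does not change the distances between the sequences considered.

```python
def unit_sequence(k, length):
    """First `length` coordinates of the k-th unit sequence (indices start at 1)."""
    return [1.0 if n == k else 0.0 for n in range(1, length + 1)]


def sup_norm(x):
    return max(abs(value) for value in x)


def sup_distance(x, y):
    return sup_norm([a - b for a, b in zip(x, y)])


LENGTH = 8
vectors = [unit_sequence(k, LENGTH) for k in range(1, LENGTH + 1)]
norms = [sup_norm(v) for v in vectors]
distances = {
    (k, l): sup_distance(vectors[k], vectors[l])
    for k in range(LENGTH)
    for l in range(LENGTH)
    if k != l
}
```

Since all pairwise distances equal 1, no subsequence can be a Cauchy sequence, no matter how many unit sequences are considered.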
Theorem C.14. If $(X,d_X)$ and $(Y,d_Y)$ are metric spaces, $C \subseteq X$ is compact, and $f: C \to Y$ is continuous, then $f(C)$ is compact.
Proof. If $(y^k)_{k\in\mathbb{N}}$ is a sequence in $f(C)$, then, for each $k \in \mathbb{N}$, there is some $x^k \in C$ such that $f(x^k) = y^k$. As $C$ is compact, there is a subsequence $(a^k)_{k\in\mathbb{N}}$ of $(x^k)_{k\in\mathbb{N}}$ with $\lim_{k\to\infty} a^k = a$ for some $a \in C$. Then $(f(a^k))_{k\in\mathbb{N}}$ is a subsequence of $(y^k)_{k\in\mathbb{N}}$ and the continuity of $f$ yields $\lim_{k\to\infty} f(a^k) = f(a) \in f(C)$, showing that $(y^k)_{k\in\mathbb{N}}$ has a convergent subsequence with limit in $f(C)$. We have therefore established that $f(C)$ is compact.
Theorem C.15. If $(X,d)$ is a metric space, $C \subseteq X$ is compact, and $f: C \to \mathbb{R}$ is continuous, then $f$ assumes its max and its min, i.e. there are $x_m \in C$ and $x_M \in C$ such that $f$ has a global min at $x_m$ and a global max at $x_M$.
Proof. Since $C$ is compact and $f$ is continuous, $f(C) \subseteq \mathbb{R}$ is compact according to Th. C.14. Then, by [Phi13a, Lem. 7.52], $f(C)$ contains a smallest element $m$ and a largest element $M$. This, in turn, implies that there are $x_m, x_M \in C$ such that $f(x_m) = m$ and $f(x_M) = M$.
Theorem C.16. If $(X,d_X)$ and $(Y,d_Y)$ are metric spaces, $C \subseteq X$ is compact, and $f: C \to Y$ is continuous, then $f$ is uniformly continuous.
Proof. If $f$ is not uniformly continuous, then there must be some $\epsilon > 0$ such that, for each $k \in \mathbb{N}$, there exist $x^k, y^k \in C$ satisfying $d_X(x^k,y^k) < 1/k$ and $d_Y(f(x^k),f(y^k)) \ge \epsilon$. Since $C$ is compact, there is $a \in C$ and a subsequence $(a^k)_{k\in\mathbb{N}}$ of $(x^k)_{k\in\mathbb{N}}$ such that $a = \lim_{k\to\infty} a^k$. Then there is a corresponding subsequence $(b^k)_{k\in\mathbb{N}}$ of $(y^k)_{k\in\mathbb{N}}$ such that $d_X(a^k,b^k) < 1/k$ and $d_Y(f(a^k),f(b^k)) \ge \epsilon$ for all $k \in \mathbb{N}$. Using the compactness of $C$ again, there exist $b \in C$ and a subsequence $(v^k)_{k\in\mathbb{N}}$ of $(b^k)_{k\in\mathbb{N}}$ such that $b = \lim_{k\to\infty} v^k$. Now there is a corresponding subsequence $(u^k)_{k\in\mathbb{N}}$ of $(a^k)_{k\in\mathbb{N}}$ such that $d_X(u^k,v^k) < 1/k$ and $d_Y(f(u^k),f(v^k)) \ge \epsilon$ for all $k \in \mathbb{N}$. Note that we still have $a = \lim_{k\to\infty} u^k$. Given $\alpha > 0$, there is $N \in \mathbb{N}$ such that, for each $k > N$, one has $d_X(a,u^k) < \alpha/3$, $d_X(b,v^k) < \alpha/3$, and $d_X(u^k,v^k) < 1/k < \alpha/3$. Thus, $d_X(a,b) \le d_X(a,u^k) + d_X(u^k,v^k) + d_X(b,v^k) < \alpha$, implying $d_X(a,b) = 0$ and $a = b$. Finally, the continuity of $f$ implies $f(a) = \lim_{k\to\infty} f(u^k) = \lim_{k\to\infty} f(v^k)$ in contradiction to $d_Y(f(u^k),f(v^k)) \ge \epsilon$.
Theorem C.17. If $(X,d_X)$ and $(Y,d_Y)$ are metric spaces, $C \subseteq X$ is compact, and $f: C \to Y$ is continuous and one-to-one, then $f^{-1}: f(C) \to C$ is continuous.
Proof. Let $(y^k)_{k\in\mathbb{N}}$ be a sequence in $f(C)$ such that $\lim_{k\to\infty} y^k = y \in f(C)$. Then there is a sequence $(x^k)_{k\in\mathbb{N}}$ in $C$ such that $f(x^k) = y^k$ for each $k \in \mathbb{N}$. Let $x := f^{-1}(y)$. It remains to prove that $\lim_{k\to\infty} x^k = x$. As $C$ is compact, there is $a \in C$ and a subsequence $(a^k)_{k\in\mathbb{N}}$ of $(x^k)_{k\in\mathbb{N}}$ such that $a = \lim_{k\to\infty} a^k$. The continuity of $f$ yields $f(a) = \lim_{k\to\infty} f(a^k) = \lim_{k\to\infty} y^k = y = f(x)$ since $(f(a^k))_{k\in\mathbb{N}}$ is a subsequence of $(y^k)_{k\in\mathbb{N}}$. It now follows that $a = x$ since $f$ is one-to-one. The same argument shows that every convergent subsequence of $(x^k)_{k\in\mathbb{N}}$ has to converge to $x$. If $(x^k)_{k\in\mathbb{N}}$ did not converge to $x$, then there had to be some $\epsilon > 0$ such that infinitely many $x^k$ are not in $B_\epsilon(x)$. However, the compactness of $C$ would provide a convergent subsequence whose limit could not be $x$, in contradiction to $x$ having to be the limit of all convergent subsequences of $(x^k)_{k\in\mathbb{N}}$.
Definition C.18. A subset $A$ of a metric space $(X,d)$ is called precompact or totally bounded if, and only if, for each $\epsilon > 0$, $A$ can be covered by finitely many $\epsilon$-balls, i.e. if, and only if, there exist finitely many points $a_1,\dots,a_N \in A$, $N \in \mathbb{N}$, such that
$$A \subseteq \bigcup_{j=1}^N B_\epsilon(a_j). \tag{C.20}$$
Theorem C.19. For a subset $C$ of a metric space $(X,d)$, the following statements are equivalent:
(i) $C$ is compact as defined in Def. C.9.
(ii) $C$ has the Heine-Borel property, i.e. every open cover of $C$ has a finite subcover, i.e. if $(O_j)_{j \in I}$ is a family of open sets $O_j \subseteq X$, satisfying
$$C \subseteq \bigcup_{j \in I} O_j, \tag{C.21}$$
then there exist $j_1,\dots,j_N \in I$, $N \in \mathbb{N}$, such that $C \subseteq \bigcup_{k=1}^N O_{j_k}$.
(iii) $C$ is precompact (i.e. totally bounded) as defined in Def. C.18 and complete, i.e. every Cauchy sequence in $C$ converges to a limit in $C$.
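Proof. We show (i) $\Rightarrow$ (iii) $\Rightarrow$ (ii) $\Rightarrow$ (i).
(i) $\Rightarrow$ (iii): Let $(c_n)_{n\in\mathbb{N}}$ be a Cauchy sequence in $C$. As $C$ is compact, $(c_n)_{n\in\mathbb{N}}$ has a subsequence $(c_{n_j})_{j\in\mathbb{N}}$ such that $\lim_{j\to\infty} c_{n_j} = c \in C$. Given $\epsilon > 0$ choose $K \in \mathbb{N}$ such that, for each $m, n \ge K$, $d(c_m,c_n) < \frac{\epsilon}{2}$, and such that, for each $n_j \ge K$, $d(c_{n_j},c) < \frac{\epsilon}{2}$. Then, fixing some $n_j \ge K$,
$$\forall_{n \ge K}\quad d(c_n,c) \le d(c_n,c_{n_j}) + d(c_{n_j},c) < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon, \tag{C.22}$$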
m,nN,
m6=n
d(cm , cn ) :
(C.23)
Choosing ck+1 := c, (C.24) guarantees (C.23) now holds for each m, n {1, . . . , k + 1}.
Due to (C.23), no subsequence of (cn )nN can be a Cauchy sequence, i.e. (cn )nN does
not have a convergent subsequence, proving C is not compact.
(iii) $\Rightarrow$ (ii): Assume $C$ to be precompact and complete. For each $k \in \mathbb{N}$, the precompactness yields points $c^k_1,\dots,c^k_{N_k} \in C$, $N_k \in \mathbb{N}$, such that
$$C \subseteq \bigcup_{j=1}^{N_k} B_{\frac{1}{k}}(c^k_j). \tag{C.25}$$
Seeking a contradiction, assume C does not have the Heine-Borel property, i.e. there
exists an open cover (Oj )jI of C which does not have a finite subcover. Inductively, we
construct a decreasing sequence of subsets Ck of C, C C1 C2 . . . , such that no
Ck can be covered by a finite subcover of (Oj )jI and such that
kN
j{1,...,Nk }
Ck B 1 (ckj ) :
k
(C.26)
To start out, we note that (C.25) implies at least one of the finitely many sets C
B1 (c11 ), . . . , CB1 (c1N1 ) can not be covered by a finite subcover of (Oj )jI , say, CB1 (c1j1 ).
Define C1 := C B1 (c1j1 ). Then, given C1 , . . . , Ck have already been constructed for some
k N, since Ck can not be covered by a finite subcover of (Oj )jI and
Nk+1
Ck C
1
k+1
(ck+1
),
j
(C.27)
j=1
1
k+1
(ck+1
jk+1 ) can not be covered by a
(C.28)
one has
xB 1 (ckj )
k
2
2
+ <
+ = ,
k 2
4
2
(C.29)
Caveat C.20. In general topological spaces, one defines compactness via the Heine-Borel property (a topological space $C$ is defined to be compact if, and only if, $C$ has the Heine-Borel property). Moreover, a topological space $C$ is defined to be sequentially compact if, and only if, every sequence in $C$ has a convergent subsequence. Using this terminology, one can rephrase the equivalence between (i) and (ii) in Th. C.19 by stating that a metric space is sequentially compact if, and only if, it is compact. However, in general topological spaces, neither implication remains true ((iii) of Th. C.19 does not even make sense in general topological spaces, as the concepts of boundedness, total boundedness, and Cauchy sequences are, in general, not available): For an example of a topological space that is compact, but not sequentially compact, see, e.g. [Pre75, 7.2.10(a)]; for an example of a topological space that is sequentially compact, but not compact, see, e.g. [Pre75, 7.2.10(c)].
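Theorem C.21 (Lebesgue Number). Let $(X,d)$ be a metric space and $C \subseteq X$. If $C$ is compact and $(O_j)_{j \in I}$ is an open cover of $C$, then there exists a Lebesgue number $\delta$ for the open cover, i.e. some $\delta > 0$ such that, for each $A \subseteq C$ with $\operatorname{diam} A < \delta$, there exists $j_0 \in I$, where $A \subseteq O_{j_0}$. Recall that
$$\operatorname{diam} A = \begin{cases} 0 & \text{for } A = \emptyset, \\ \sup\{d(x,y) : x, y \in A\} & \text{for } \emptyset \ne A. \end{cases} \tag{C.30}$$
Proof. Seeking a contradiction, assume there is no Lebesgue number for the open cover $(O_j)_{j \in I}$. Then there is a sequence of pairs $(x_k,y_k) \in C^2$ such that
$$\forall_{k \in \mathbb{N}}\quad d(x_k,y_k) < \frac{1}{k} \quad\text{but}\quad \forall_{j \in I}\quad \{x_k,y_k\} \not\subseteq O_j. \tag{C.31}$$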
In Prop. 3.13, it was shown that a continuous function is locally Lipschitz with respect
to y if, and only if, it is globally Lipschitz with respect to y on every compact set.
The following Prop. D.1 shows that this equivalence holds even if f is not continuous,
provided that each projection Gx as in (D.1) below is convex. On the other hand, Ex.
D.2 shows that, in general, there exist discontinuous functions that are locally Lipschitz
with respect to y without being globally Lipschitz with respect to y on every compact
set.
Proposition D.1. Let $m, n \in \mathbb{N}$, $G \subseteq \mathbb{R} \times \mathbb{K}^m$, and $f: G \to \mathbb{K}^n$. If $G$ is such that each projection
$$G_x := \{y \in \mathbb{K}^m : (x,y) \in G\},\quad x \in \mathbb{R}, \tag{D.1}$$
is convex (in particular, if $G$ itself is convex), then $f$ is locally Lipschitz with respect to $y$ if, and only if, $f$ is (globally) Lipschitz with respect to $y$ on every compact subset $K$ of $G$.
Proof. The proof of Prop. 3.13 shows, without making use of the continuity of $f$, that (global) Lipschitz continuity with respect to $y$ on every compact subset $K$ of $G$ implies local Lipschitz continuity on $G$. Thus, assume $f$ to be locally Lipschitz with respect to $y$ and assume each $G_x$ to be convex. The proof of Prop. 3.13 shows, without making use of the continuity of $f$, that, for each $K \subseteq G$ compact
(x,y),(x,
y )K
If (x, y), (x, y) K are arbitrary with y 6= y, then the convexity of Gx implies
{(x, (1 t)y) + (x, t
y ) : t [0, 1]} G.
(D.3)
Choose N N such that N > 2ky yk/ and set h := ky yk/N . Then
h < /2.
Define
k=0,...,N
Then
k=0,...,N 1
and
(D.4)
kh
kh
y.
yk :=
y+ 1
ky yk
ky yk
(D.5)
h
h
=h<
y+
y
kyk+1 yk k =
ky yk
ky yk
N
1
X
k=0
(D.2)
N
1
X
k=0
= L N h = L ky yk,
(D.6)
kyk yk+1 k
(D.7)
Example D.2. We provide two examples that show that, in general, a discontinuous
function can be locally Lipschitz with respect to y without being globally Lipschitz with
respect to y on every compact set.
(a) Consider
and f : G R,
G :=] 2, 2[ ] 4, 1[]1, 4[
0
for y ]1, 4[.
(D.8)
(D.9)
For the following open balls with respect to the max norm $\|(x,y)\| := \max\{|x|,|y|\}$, one has $B_1(x,y) \cap G \subseteq\ ]-2,2[\ \times\ ]-4,-1[$ for $y \in\ ]-4,-1[$, and $B_1(x,y) \cap G \subseteq\ ]-2,2[\ \times\ ]1,4[$ for $y \in\ ]1,4[$. Thus, $f(x,\cdot)$ is constant on each set $B_1(x,y) \cap G$ (either constantly equal to $1/x$ or constantly equal to $0$), i.e. $0$-Lipschitz with respect to $y$. In particular, $f$ is locally Lipschitz with respect to $y$. However, $f$ is not Lipschitz continuous with respect to $y$ on the compact set
$$K := [-1,1] \times \big( [-3,-2] \cup [2,3] \big): \tag{D.10}$$
For the sequence ((xk , yk , y k ))kN , where
kN
xk := 1/k,
yk := 2,
y k := 2,
(D.11)
one has
k0
|f (xk , yk ) f (xk , y k )|
= lim
= ,
k
k 2 (2)
|yk y k |
showing f is not Lipschitz continuous with respect to y on K.
lim
(D.12)
(b) If one increases the dimension by 1, then one can modify the example in (a) such that the set $G$ is even connected (this variant was pointed out by Anton Sporrer): Let
$$A := \big( ]-4,-1[\ \times\ ]-2,2[ \big) \cup \big( ]-4,4[\ \times\ ]-2,0[ \big) \cup \big( ]1,4[\ \times\ ]-2,2[ \big) \subseteq \mathbb{R}^2. \tag{D.13}$$
Then $A$ is open and connected (but not convex) and the same holds for
$$G :=\ ]-2,2[\ \times A \subseteq \mathbb{R}^3.$$
Define
f : G R,
f (x, y1 , y2 ) :=
(D.14)
(D.15)
kN
xk := 1/k,
y1,k := 2,
y 1,k := 2,
y2,k := 0,
(D.18)
one has
|f (xk , y1,k , y2,k ) f (xk , y 1,k , y2,k )|
k0
= ,
= lim
k
k
k(y1,k , y2,k ) (y 1,k , y2,k )kmax
max{4, 0}
lim
(D.19)
f : G R,
f (x, y) := 0.
(E.1)
y0 ,
(E.2)
y(x0 ) = y0 .
(E.3)
(E.4)
Then, for each (x0 , y0 ) [0, 1] R, the function of (E.2) is a solution to the initial
value problem (E.3), but, again, the maximal solution of (E.3) according to Def.
3.20 is ]0,1[ .
F Paths in $\mathbb{R}^n$
(F.1)
Definition F.3. Given a real interval $I := [a,b] \subseteq \mathbb{R}$, $a, b \in \mathbb{R}$, $a < b$, the $(N+1)$-tuple $\Delta := (x_0,\dots,x_N) \in \mathbb{R}^{N+1}$, $N \in \mathbb{N}$, is called a partition of $I$ if, and only if, $a = x_0 < x_1 < \dots < x_N = b$. The set of all partitions of $I$ is denoted by $\Pi(I)$ or by $\Pi[a,b]$. Given a partition $\Delta$ of $I$ as above and letting $I_j := [x_{j-1},x_j]$, the number
$$|\Delta| := \max\big\{ |I_j| : j \in \{1,\dots,N\} \big\}, \tag{F.2}$$
is called the mesh size of $\Delta$.
N
1
X
k=0
(F.3)
The path $\gamma$ is called rectifiable with arc length $l(\gamma)$ if, and only if, $l(\gamma) < \infty$.
x[a,b]
(x) = y0 + x y1 ,
(F.5)
(F.6)
(F.7)
(F.8)
(F.9)
N
1
X
k=0
= ky1 k2
N
1
X
k=0
N
1
X
k=0
kxk+1 y1 xk y1 k2
(F.10)
proving (F.6).
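(b): For each partition $(x_0,\dots,x_N)$, $N \in \mathbb{N}$, of $[a,b]$, we have
$$p_\gamma(x_0,\dots,x_N) = \sum_{k=0}^{N-1} \| \gamma(x_{k+1}) - \gamma(x_k) \|_2 \le \sum_{k=0}^{N-1} L\, |x_{k+1} - x_k| = L \sum_{k=0}^{N-1} (x_{k+1} - x_k) = L\, (b - a), \tag{F.11}$$
proving (F.7).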
(c): For each partition = (x0 , . . . , xN ), N N, of [a, b], we have
1
N
1
X
X
N
p () p () =
k(xk+1 ) (xk )k2
k(xk+1 ) (xk )k2
k=0
N
1
X
k=0
k(xk+1 ) (xk )k2 k(xk+1 ) (xk )k2
k=0
N
1
X
k=0
(xk+1 ) (xk+1 ) (xk ) (xk )
= p (),
(F.12)
proving (F.8) (the last estimate in (F.12) holds true due to the inverse triangle inequality).
(d): If $\xi = a$ or $\xi = b$, then there is nothing to prove. Thus, assume $a < \xi < b$. If $\Delta_1 := (x_0,\dots,x_N)$ is a partition of $[a,\xi]$ and $\Delta_2 := (x_N,\dots,x_M)$ is a partition of $[\xi,b]$, $N,M \in \mathbb{N}$, $M > N$, then $\Delta := (x_0,\dots,x_M)$ is a partition of $[a,b]$. Moreover,
$$p_\gamma(\Delta) = p_{\gamma\restriction_{[a,\xi]}}(\Delta_1) + p_{\gamma\restriction_{[\xi,b]}}(\Delta_2)$$
(F.13)
(F.14)
On the other hand, if $\Delta = (x_0,\dots,x_M)$, $M \in \mathbb{N}$, is a partition of $[a,b]$, then, either there is $0 < N < M$ such that $\xi = x_N$, in which case (F.13) holds once again, where $\Delta_1$ and $\Delta_2$ are defined as before. Otherwise, there is $N \in \{0,\dots,M-1\}$ such that $x_N < \xi < x_{N+1}$.
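Then, with $\Delta_1 := (x_0,\dots,x_N,\xi)$ and $\Delta_2 := (\xi,x_{N+1},\dots,x_M)$, the triangle inequality yields
$$p_\gamma(\Delta) = \sum_{k=0}^{M-1} \|\gamma(x_{k+1}) - \gamma(x_k)\|_2 \le \sum_{k=0}^{N-1} \|\gamma(x_{k+1}) - \gamma(x_k)\|_2 + \|\gamma(\xi) - \gamma(x_N)\|_2 + \|\gamma(x_{N+1}) - \gamma(\xi)\|_2 + \sum_{k=N+1}^{M-1} \|\gamma(x_{k+1}) - \gamma(x_k)\|_2$$
$$= p_{\gamma\restriction_{[a,\xi]}}(\Delta_1) + p_{\gamma\restriction_{[\xi,b]}}(\Delta_2),$$
showing $l(\gamma) \le l(\gamma\restriction_{[a,\xi]}) + l(\gamma\restriction_{[\xi,b]})$ and concluding the proof.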
(F.16)
Proof. Since $\gamma$ is continuously differentiable, it follows from [Phi13b, Th. C.3] that $\gamma$ is Lipschitz continuous on $[a,b]$, i.e. $\gamma$ is rectifiable by Prop. F.6(b) above. To prove (F.17), according to the fundamental theorem of calculus [Phi13a, Th. 10.19(b)], it suffices to show the function
$$\lambda : [a,b] \to \mathbb{R}_0^+, \quad \lambda(x) := l(\gamma\restriction_{[a,x]}), \tag{F.18}$$
is differentiable with derivative $\lambda'(x) = \|\gamma'(x)\|_2$. To this end, first note the continuous function $\gamma'$ is even uniformly continuous by Th. C.16. Thus,
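$$\forall_{\varepsilon>0}\ \exists_{\delta>0}\ \forall_{x_0,x\in[a,b]} \quad \Bigl( |x - x_0| < \delta \ \Rightarrow\ \|\gamma'(x) - \gamma'(x_0)\|_2 < \varepsilon \Bigr). \tag{F.19}$$
Fix $x_0 \in [a,b[$ and consider $x_1 \in ]a,b[$ such that $x_0 < x_1 < x_0 + \delta$. Define the affine path
$$\alpha : [x_0,x_1] \to \mathbb{R}^n, \quad \alpha(x) := \gamma(x_0) + (x - x_0)\,\gamma'(x_0). \tag{F.20}$$
Then, by Prop. F.6(a),
$$l(\alpha) = (x_1 - x_0)\,\|\gamma'(x_0)\|_2, \tag{F.21}$$
and
$$\forall_{x\in[x_0,x_1]} \quad \|\gamma'(x) - \alpha'(x)\|_2 = \|\gamma'(x) - \gamma'(x_0)\|_2 \overset{(F.19)}{<} \varepsilon. \tag{F.22}$$
Thus, it follows from [Phi13b, Th. C.3] that $\gamma - \alpha$ is $\varepsilon$-Lipschitz on $[x_0,x_1]$ and, then, Prop. F.6(b) yields
$$l(\gamma\restriction_{[x_0,x_1]} - \alpha) \le \varepsilon\,(x_1 - x_0). \tag{F.23}$$
For the function $\lambda$ from (F.18), Prop. F.6(d) and (F.21) imply
$$\frac{\lambda(x_1) - \lambda(x_0)}{x_1 - x_0} - \|\gamma'(x_0)\|_2 = \frac{l(\gamma\restriction_{[x_0,x_1]})}{x_1 - x_0} - \frac{l(\alpha)}{x_1 - x_0}, \tag{F.24}$$
and, thus, by Prop. F.6(c),
$$\left| \frac{\lambda(x_1) - \lambda(x_0)}{x_1 - x_0} - \|\gamma'(x_0)\|_2 \right| \overset{(F.24)}{\le} \frac{l(\gamma\restriction_{[x_0,x_1]} - \alpha)}{x_1 - x_0} \overset{(F.23)}{\le} \frac{\varepsilon\,(x_1 - x_0)}{x_1 - x_0} = \varepsilon, \tag{F.25}$$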
showing the function $\lambda$ from (F.18) has a right-hand derivative at $x_0$ and the value of that right-hand derivative at $x_0$ is the desired $\|\gamma'(x_0)\|_2$. Repeating the above argument with $x_0, x_1 \in ]a,b]$ such that $x_0 - \delta < x_1 < x_0$ shows $\lambda$ to have a left-hand derivative at each $x_0 \in ]a,b]$ with value $\|\gamma'(x_0)\|_2$, which completes the proof.
Remark F.8. An example of a differentiable nonrectifiable path is given by (cf. [Wal02,
Ex. 5.14.6])
(
x2 cos x2 for x 6= 0,
2
: [0, 1] R , (x) :=
(F.26)
0
for x = 0.
For the present ODE class, we are mostly interested in linear maps from Kn into itself.
However, introducing the relevant notions for linear maps between general normed vector
spaces does not provide much additional difficulty, and, hopefully, even some extra
clarity.
Definition G.1. Let $A : X \to Y$ be a linear map between two normed vector spaces $(X,\|\cdot\|_X)$ and $(Y,\|\cdot\|_Y)$ over $\mathbb{K}$. Then $A$ is called bounded if, and only if, $A$ maps bounded sets to bounded sets, i.e. if, and only if, $A(B)$ is a bounded subset of $Y$ for each bounded $B \subseteq X$. The vector space of all bounded linear maps between $X$ and $Y$ is denoted by $\mathcal{L}(X,Y)$.
Definition G.2. Let $A : X \to Y$ be a linear map between two normed vector spaces $(X,\|\cdot\|_X)$ and $(Y,\|\cdot\|_Y)$ over $\mathbb{K}$. The number
$$\|A\| := \sup\left\{ \frac{\|Ax\|_Y}{\|x\|_X} : x \in X,\ x \ne 0 \right\} = \sup\bigl\{ \|Ax\|_Y : x \in X,\ \|x\|_X = 1 \bigr\} \in [0,\infty] \tag{G.1}$$
is called the operator norm of $A$ induced by $\|\cdot\|_X$ and $\|\cdot\|_Y$ (strictly speaking, the term operator norm is only justified if the value is finite, but it is often convenient to use the term in the generalized way defined here).
In the special case, where $X = \mathbb{K}^n$, $Y = \mathbb{K}^m$, and $A$ is given via a real $m \times n$ matrix, the operator norm is also called matrix norm.
From now on, the space index of a norm will usually be suppressed, i.e. we write just $\|\cdot\|$ instead of both $\|\cdot\|_X$ and $\|\cdot\|_Y$.
Theorem G.3. For a linear map $A : X \to Y$ between two normed vector spaces $(X,\|\cdot\|)$ and $(Y,\|\cdot\|)$ over $\mathbb{K}$, the following statements are equivalent:
(a) $A$ is bounded.
(b) $\|A\| < \infty$.
(c) $A$ is Lipschitz continuous.
(d) $A$ is continuous.
(e) There is $x_0 \in X$ such that $A$ is continuous at $x_0$.
Proof. Since every Lipschitz continuous map is continuous and since every continuous map is continuous at every point, "(c) $\Rightarrow$ (d) $\Rightarrow$ (e)" is clear.
"(e) $\Rightarrow$ (c)": Let $x_0 \in X$ be such that $A$ is continuous at $x_0$. Thus, for each $\varepsilon > 0$, there is $\delta > 0$ such that $\|x - x_0\| < \delta$ implies $\|Ax - Ax_0\| < \varepsilon$. As $A$ is linear, for each $x \in X$ with $\|x\| < \delta$, one has $\|Ax\| = \|A(x + x_0) - Ax_0\| < \varepsilon$, due to $\|x + x_0 - x_0\| = \|x\| < \delta$. Moreover, one has $\|(\delta x)/2\| \le \delta/2 < \delta$ for each $x \in X$ with $\|x\| \le 1$. Letting $L := 2\varepsilon/\delta$, this means that $\|Ax\| = \|A((\delta x)/2)\|/(\delta/2) < 2\varepsilon/\delta = L$ for each $x \in X$ with $\|x\| \le 1$. Thus, for each $x,y \in X$ with $x \ne y$, one has
$$\|Ax - Ay\| = \|A(x-y)\| = \|x-y\| \left\| A\left( \frac{x-y}{\|x-y\|} \right) \right\| < L\,\|x-y\|. \tag{G.2}$$
Together with the fact that $\|Ax - Ay\| \le L\,\|x-y\|$ is trivially true for $x = y$, this shows that $A$ is Lipschitz continuous.
"(c) $\Rightarrow$ (b)": As $A$ is Lipschitz continuous, there exists $L \in \mathbb{R}_0^+$ such that $\|Ax - Ay\| \le L\,\|x-y\|$ for each $x,y \in X$. Considering the special case $y = 0$ and $\|x\| = 1$ yields $\|Ax\| \le L\,\|x\| = L$, implying $\|A\| \le L < \infty$.
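"(b) $\Rightarrow$ (c)": Let $\|A\| < \infty$. We will show
$$\forall_{x,y\in X} \quad \|Ax - Ay\| \le \|A\|\,\|x-y\|. \tag{G.3}$$
For $x = y$, there is nothing to prove. For $x \ne y$, one has
$$\frac{\|Ax - Ay\|}{\|x-y\|} = \left\| A\left( \frac{x-y}{\|x-y\|} \right) \right\| \le \|A\| \tag{G.4}$$
as $\left\| \frac{x-y}{\|x-y\|} \right\| = 1$, thereby establishing (G.3).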
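"(b) $\Rightarrow$ (a)": Let $\|A\| < \infty$ and let $M \subseteq X$ be bounded. Then there is $r > 0$ such that $M \subseteq B_r(0)$. Moreover, for each $0 \ne x \in M$:
$$\frac{\|Ax\|}{\|x\|} = \left\| A\left( \frac{x}{\|x\|} \right) \right\| \le \|A\| \tag{G.5}$$
as $\left\| \frac{x}{\|x\|} \right\| = 1$. Thus $\|Ax\| \le \|A\|\,\|x\| \le r\,\|A\|$, showing that $A(M) \subseteq B_{r\|A\|}(0)$. Thus, $A(M)$ is bounded, thereby establishing the case.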
"(a) $\Rightarrow$ (b)": Since $A$ is bounded, it maps the bounded set $\overline{B}_1(0) \subseteq X$ into some bounded subset of $Y$. Thus, there is $r > 0$ such that $A(\overline{B}_1(0)) \subseteq B_r(0) \subseteq Y$. In particular, $\|Ax\| < r$ for each $x \in X$ satisfying $\|x\| = 1$, showing $\|A\| \le r < \infty$.
Remark G.4. For linear maps between finite-dimensional spaces, the equivalent properties of Th. G.3 always hold: Each linear map $A : \mathbb{K}^n \to \mathbb{K}^m$, $(n,m) \in \mathbb{N}^2$, is continuous (this follows, for example, from the fact that each such map is (trivially) differentiable, and every differentiable map is continuous). In particular, each linear map $A : \mathbb{K}^n \to \mathbb{K}^m$ has all the equivalent properties of Th. G.3.
Theorem G.5. Let $X$ and $Y$ be normed vector spaces over $\mathbb{K}$.
(a) The operator norm does, indeed, constitute a norm on the set of bounded linear maps $\mathcal{L}(X,Y)$.
(b) If $A \in \mathcal{L}(X,Y)$, then $\|A\|$ is the smallest Lipschitz constant for $A$, i.e. $\|A\|$ is a Lipschitz constant for $A$ and $\|Ax - Ay\| \le L\,\|x-y\|$ for each $x,y \in X$ implies $\|A\| \le L$.
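Proof. (a): If $A = 0$, then, in particular, $Ax = 0$ for each $x \in X$ with $\|x\| = 1$, implying $\|A\| = 0$. Conversely, $\|A\| = 0$ implies $Ax = 0$ for each $x \in X$ with $\|x\| = 1$. But then $Ax = \|x\|\,A(x/\|x\|) = 0$ for every $0 \ne x \in X$, i.e. $A = 0$. Thus, the operator norm is positive definite. If $A \in \mathcal{L}(X,Y)$, $\lambda \in \mathbb{K}$, and $x \in X$, then
$$\|(\lambda A)x\| = \|A(\lambda x)\| = \|\lambda (Ax)\| = |\lambda|\,\|Ax\|, \tag{G.6}$$
yielding
$$\|\lambda A\| = \sup\bigl\{ \|(\lambda A)x\| : x \in X,\ \|x\| = 1 \bigr\} = \sup\bigl\{ |\lambda|\,\|Ax\| : x \in X,\ \|x\| = 1 \bigr\} = |\lambda| \sup\bigl\{ \|Ax\| : x \in X,\ \|x\| = 1 \bigr\} = |\lambda|\,\|A\|, \tag{G.7}$$
showing that the operator norm is homogeneous of degree 1. Moreover, for $A, B \in \mathcal{L}(X,Y)$, the triangle inequality in $Y$ yields
$$\|A + B\| = \sup\bigl\{ \|(A+B)x\| : x \in X,\ \|x\| = 1 \bigr\} \le \sup\bigl\{ \|Ax\| + \|Bx\| : x \in X,\ \|x\| = 1 \bigr\}$$
$$\le \sup\bigl\{ \|Ax\| : x \in X,\ \|x\| = 1 \bigr\} + \sup\bigl\{ \|Bx\| : x \in X,\ \|x\| = 1 \bigr\} = \|A\| + \|B\|, \tag{G.9}$$
showing that the operator norm also satisfies the triangle inequality, thereby completing the verification that it is, indeed, a norm.
(b): That $\|A\|$ is a Lipschitz constant for $A$ was already shown in the proof of "(b) $\Rightarrow$ (c)" of Th. G.3. Now let $L \in \mathbb{R}_0^+$ be such that $\|Ax - Ay\| \le L\,\|x-y\|$ for each $x,y \in X$. Specializing to $y = 0$ and $\|x\| = 1$ implies $\|Ax\| \le L\,\|x\| = L$, showing $\|A\| \le L$.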
Remark G.6. Even though it is beyond the scope of the present class, let us mention
as an outlook that one can show that L(X, Y ) with the operator norm is a Banach space
(i.e. a complete normed vector space) provided that Y is a Banach space (even if X is
not a Banach space).
Lemma G.7. If $\operatorname{Id} : X \to X$, $\operatorname{Id}(x) := x$, is the identity map on a normed vector space $X$ over $\mathbb{K}$, then $\|\operatorname{Id}\| = 1$ (in particular, the operator norm of a unit matrix is always 1). Caveat: In principle, one can consider two different norms on $X$ simultaneously, and then the operator norm of the identity can differ from 1.
Proof. If $\|x\| = 1$, then $\|\operatorname{Id}(x)\| = \|x\| = 1$.
Lemma G.8. Let $X, Y, Z$ be normed vector spaces and consider linear maps $A \in \mathcal{L}(X,Y)$, $B \in \mathcal{L}(Y,Z)$. Then
$$\|BA\| \le \|B\|\,\|A\|. \tag{G.10}$$
Proof. Let $x \in X$ with $\|x\| = 1$. If $Ax = 0$, then $\|B(A(x))\| = 0 \le \|B\|\,\|A\|$. If $Ax \ne 0$, then one estimates
$$\|B(Ax)\| = \|Ax\| \left\| B\left( \frac{Ax}{\|Ax\|} \right) \right\| \le \|A\|\,\|B\|, \tag{G.11}$$
thereby establishing the case.
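Example G.9. Let $m,n \in \mathbb{N}$ and let $A : \mathbb{R}^n \to \mathbb{R}^m$ be the linear map given by the $m \times n$ matrix $(a_{kl})_{(k,l)\in\{1,\dots,m\}\times\{1,\dots,n\}}$. Then
$$\|A\|_\infty := \max\left\{ \sum_{l=1}^{n} |a_{kl}| : k \in \{1,\dots,m\} \right\} \tag{G.12a}$$
is called the row sum norm of $A$, and
$$\|A\|_1 := \max\left\{ \sum_{k=1}^{m} |a_{kl}| : l \in \{1,\dots,n\} \right\} \tag{G.12b}$$
is called the column sum norm of $A$. It is an exercise to show that $\|A\|_\infty$ is the operator norm induced if $\mathbb{R}^n$ and $\mathbb{R}^m$ are endowed with the $\infty$-norm, and $\|A\|_1$ is the operator norm induced if $\mathbb{R}^n$ and $\mathbb{R}^m$ are endowed with the 1-norm.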
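The row sum and column sum formulas can be compared with the defining supremum on a small example. The sketch below uses plain nested lists; the matrices `A`, `B` and all helper names are illustrative assumptions. It relies on the fact that a convex function on a polytope attains its maximum at a vertex, so for the max-norm it suffices to test sign vectors, and for the 1-norm it suffices to test the standard unit vectors. It also checks the estimate (G.10) for one pair of matrices.

```python
from itertools import product

def row_sum_norm(a):
    """Maximum over the rows of the sum of absolute entries."""
    return max(sum(abs(v) for v in row) for row in a)

def col_sum_norm(a):
    """Maximum over the columns of the sum of absolute entries."""
    return max(sum(abs(row[l]) for row in a) for l in range(len(a[0])))

def apply(a, x):
    """Matrix-vector product."""
    return [sum(v * xi for v, xi in zip(row, x)) for row in a]

def matmul(b, a):
    """Matrix product b*a."""
    return [[sum(b[i][k] * a[k][j] for k in range(len(a)))
             for j in range(len(a[0]))] for i in range(len(b))]

def max_norm(x):
    return max(abs(v) for v in x)

def one_norm(x):
    return sum(abs(v) for v in x)

A = [[1.0, -2.0, 3.0],
     [-4.0, 0.5, 0.0]]
B = [[2.0, -1.0],
     [0.0, 3.0],
     [1.0, 1.0]]

# Supremum of ||Ax||_max over the unit sphere of the max-norm (vertices only).
sup_max = max(max_norm(apply(A, s)) for s in product((-1.0, 1.0), repeat=3))

# Supremum of ||Ax||_1 over the unit sphere of the 1-norm (unit vectors only).
units = [[1.0 if i == l else 0.0 for i in range(3)] for l in range(3)]
sup_one = max(one_norm(apply(A, e)) for e in units)

print(row_sum_norm(A), sup_max, col_sum_norm(A), sup_one)
print(row_sum_norm(matmul(B, A)) <= row_sum_norm(B) * row_sum_norm(A))
```

This is only a spot check on one matrix, not a proof of the exercise.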
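$$V := \begin{pmatrix} 1 & \lambda_0 & \dots & \lambda_0^n \\ 1 & \lambda_1 & \dots & \lambda_1^n \\ \vdots & \vdots & & \vdots \\ 1 & \lambda_n & \dots & \lambda_n^n \end{pmatrix} \tag{H.1}$$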
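be the corresponding Vandermonde matrix. Then its determinant, the so-called Vandermonde determinant, is given by
$$\det(V) = \prod_{\substack{k,l=0 \\ k>l}}^{n} (\lambda_k - \lambda_l). \tag{H.2}$$
Proof. The proof can be conducted by induction with respect to $n$: For $n = 1$, we have
$$\det(V) = \begin{vmatrix} 1 & \lambda_0 \\ 1 & \lambda_1 \end{vmatrix} = \lambda_1 - \lambda_0 = \prod_{\substack{k,l=0 \\ k>l}}^{1} (\lambda_k - \lambda_l), \tag{H.3}$$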
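showing (H.2) holds for $n = 1$. Now let $n > 1$. We know from Linear Algebra that the value of a determinant does not change if we add a multiple of a column to a different column. Adding the $(-\lambda_0)$-fold of the $n$th column to the $(n+1)$st column, we obtain in the $(n+1)$st column
$$\begin{pmatrix} 0 \\ \lambda_1^n - \lambda_1^{n-1}\lambda_0 \\ \vdots \\ \lambda_n^n - \lambda_n^{n-1}\lambda_0 \end{pmatrix}. \tag{H.4}$$
Next, one adds the $(-\lambda_0)$-fold of the $(n-1)$st column to the $n$th column, and, successively, the $(-\lambda_0)$-fold of the $m$th column to the $(m+1)$st column. One finishes, in the $n$th step, by adding the $(-\lambda_0)$-fold of the first column to the second column, obtaining
$$\det(V) = \begin{vmatrix} 1 & \lambda_0 & \dots & \lambda_0^n \\ 1 & \lambda_1 & \dots & \lambda_1^n \\ \vdots & \vdots & & \vdots \\ 1 & \lambda_n & \dots & \lambda_n^n \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 & \dots & 0 \\ 1 & \lambda_1 - \lambda_0 & \lambda_1^2 - \lambda_1\lambda_0 & \dots & \lambda_1^n - \lambda_1^{n-1}\lambda_0 \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \lambda_n - \lambda_0 & \lambda_n^2 - \lambda_n\lambda_0 & \dots & \lambda_n^n - \lambda_n^{n-1}\lambda_0 \end{vmatrix}. \tag{H.5}$$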
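Applying the rule for determinants of block matrices to (H.5) yields
$$\det(V) = 1 \cdot \begin{vmatrix} \lambda_1 - \lambda_0 & \lambda_1^2 - \lambda_1\lambda_0 & \dots & \lambda_1^n - \lambda_1^{n-1}\lambda_0 \\ \vdots & \vdots & & \vdots \\ \lambda_n - \lambda_0 & \lambda_n^2 - \lambda_n\lambda_0 & \dots & \lambda_n^n - \lambda_n^{n-1}\lambda_0 \end{vmatrix}. \tag{H.6}$$
As we also know from Linear Algebra that determinants are linear in each row, for each $k$, we can factor out $(\lambda_k - \lambda_0)$ from the $k$th row of (H.6), arriving at
$$\det(V) = \prod_{k=1}^{n} (\lambda_k - \lambda_0) \begin{vmatrix} 1 & \lambda_1 & \dots & \lambda_1^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & \lambda_n & \dots & \lambda_n^{n-1} \end{vmatrix}. \tag{H.7}$$
The determinant in (H.7) is the Vandermonde determinant of the $n$ numbers $\lambda_1,\dots,\lambda_n$, such that the induction hypothesis yields
$$\det(V) = \prod_{k=1}^{n} (\lambda_k - \lambda_0) \prod_{\substack{k,l=1 \\ k>l}}^{n} (\lambda_k - \lambda_l) = \prod_{\substack{k,l=0 \\ k>l}}^{n} (\lambda_k - \lambda_l), \tag{H.8}$$
completing the induction proof of (H.2).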
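The product formula (H.2) can be cross-checked in exact arithmetic for small matrices. The following sketch uses the Leibniz expansion of the determinant, which is only practical for small sizes; the function names and the sample nodes are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations
from math import prod

def det(matrix):
    """Determinant via the Leibniz formula (sum over all permutations)."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        sign = -1 if inversions % 2 else 1
        total += sign * prod(matrix[i][perm[i]] for i in range(n))
    return total

def vandermonde(nodes):
    """Matrix with rows (1, t, t^2, ..., t^n) for each node t."""
    return [[t ** j for j in range(len(nodes))] for t in nodes]

def product_formula(nodes):
    """Product of (nodes[k] - nodes[l]) over all index pairs with k > l."""
    return prod(nodes[k] - nodes[l]
                for k in range(len(nodes)) for l in range(k))

nodes = [Fraction(2), Fraction(-1), Fraction(1, 2), Fraction(3)]
lhs = det(vandermonde(nodes))
rhs = product_formula(nodes)
print(lhs, rhs)
```

Using `Fraction` avoids rounding, so the two sides can be compared with `==`. If two nodes coincide, both sides vanish.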
Matrix-Valued Functions
Notation I.1. Given $m,n \in \mathbb{N}$, let $\mathcal{M}(m,n,\mathbb{K})$ denote the set of $m \times n$ matrices over $\mathbb{K}$.
I.1
Product Rule
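$$C : I \to \mathcal{M}(m,l,\mathbb{K}), \quad C(x) := A(x)B(x), \tag{I.2}$$
is differentiable with
$$\forall_{x\in I} \quad C'(x) = A'(x)B(x) + A(x)B'(x). \tag{I.3}$$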
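The product rule (I.3) can be checked numerically with a central difference quotient. The sketch below is an illustration only: the $2 \times 2$ sample functions `A`, `B`, their hand-computed derivatives `dA`, `dB`, the evaluation point and the step size are all assumptions made for this example.

```python
import math

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matadd(a, b):
    """Entrywise sum of two matrices of equal shape."""
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]

# Hypothetical sample functions with derivatives computed by hand.
def A(x):
    return [[math.sin(x), x * x], [1.0, math.exp(x)]]

def dA(x):
    return [[math.cos(x), 2.0 * x], [0.0, math.exp(x)]]

def B(x):
    return [[x, 1.0], [math.cos(x), x ** 3]]

def dB(x):
    return [[1.0, 0.0], [-math.sin(x), 3.0 * x * x]]

def C(x):
    return matmul(A(x), B(x))

def central_difference(f, x, h=1e-6):
    """Entrywise central difference quotient of a matrix-valued function."""
    fp, fm = f(x + h), f(x - h)
    return [[(p - m) / (2.0 * h) for p, m in zip(rp, rm)]
            for rp, rm in zip(fp, fm)]

x = 0.7
numeric = central_difference(C, x)
exact = matadd(matmul(dA(x), B(x)), matmul(A(x), dB(x)))
max_error = max(abs(n - e)
                for rn, re in zip(numeric, exact) for n, e in zip(rn, re))
print(max_error)
```

Note that the order of the factors matters: since matrix multiplication is not commutative, $A'(x)B(x)$ may not be replaced by $B(x)A'(x)$.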
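Proof. Writing $C(x) = (c_{\alpha\gamma}(x))$ and using the one-dimensional product rule together with the definition of matrix multiplication, one computes, for each $(\alpha,\gamma) \in \{1,\dots,m\} \times \{1,\dots,l\}$,
$$c_{\alpha\gamma}'(x) = \left( \sum_{\beta=1}^{n} a_{\alpha\beta}(x)\, b_{\beta\gamma}(x) \right)' = \sum_{\beta=1}^{n} a_{\alpha\beta}'(x)\, b_{\beta\gamma}(x) + \sum_{\beta=1}^{n} a_{\alpha\beta}(x)\, b_{\beta\gamma}'(x) = \bigl( A'(x)B(x) \bigr)_{\alpha\gamma} + \bigl( A(x)B'(x) \bigr)_{\alpha\gamma}, \tag{I.4}$$
proving the proposition.
I.2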
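(a) If $B = (b_{jk}) \in \mathcal{M}(p,m,\mathbb{K})$, then
$$B \int_I A(x)\,dx = \int_I B\,A(x)\,dx. \tag{I.5}$$
(b) If $B = (b_{lj}) \in \mathcal{M}(n,p,\mathbb{K})$, then
$$\left( \int_I A(x)\,dx \right) B = \int_I A(x)\,B\,dx. \tag{I.6}$$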
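Proof. (a): One computes, for each $(j,l) \in \{1,\dots,p\} \times \{1,\dots,n\}$,
$$\left( B \int_I A(x)\,dx \right)_{jl} = \sum_{k=1}^{m} b_{jk} \int_I a_{kl}(x)\,dx = \int_I \sum_{k=1}^{m} b_{jk}\,a_{kl}(x)\,dx = \int_I \bigl( B A(x) \bigr)_{jl}\,dx = \left( \int_I B\,A(x)\,dx \right)_{jl}, \tag{I.7}$$
proving (I.5).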
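(b): One computes, for each $(k,j) \in \{1,\dots,m\} \times \{1,\dots,p\}$,
$$\left( \left( \int_I A(x)\,dx \right) B \right)_{kj} = \sum_{l=1}^{n} \left( \int_I a_{kl}(x)\,dx \right) b_{lj} = \int_I \sum_{l=1}^{n} a_{kl}(x)\,b_{lj}\,dx = \int_I \bigl( A(x) B \bigr)_{kj}\,dx = \left( \int_I A(x)\,B\,dx \right)_{kj}, \tag{I.8}$$
proving (I.6).
Autonomous ODE
J.1
Equivalence Between Autonomous and Nonautonomous ODE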
(J.2)
where
$$g : G \to \mathbb{K}^{n+1}, \quad g(y_1,\dots,y_{n+1}) := \bigl( 1, f(y_1,y_2,\dots,y_{n+1}) \bigr), \tag{J.3}$$
in the following sense:
x0 I
1 (x0 ) = x0 ,
(J.4)
xI
(J.6)
While Th. J.1 is somewhat striking and of theoretical interest, it has few useful applications in practice, due to the unbounded first component of solutions to (J.2) (cf. Rem. 5.2).
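The passage from a nonautonomous ODE to an autonomous one with an extra first component can be illustrated with a simple numerical scheme. The sketch below applies the explicit Euler method to both formulations; the right-hand side `f`, the step size and all function names are assumptions chosen for illustration, and the Euler method itself is not part of the notes.

```python
import math

def euler_nonautonomous(f, x0, y0, h, steps):
    """Explicit Euler for y' = f(x, y), y(x0) = y0 (scalar case)."""
    x, y = x0, y0
    for _ in range(steps):
        y = y + h * f(x, y)
        x = x + h
    return x, y

def euler_autonomous(g, z0, h, steps):
    """Explicit Euler for the autonomous system z' = g(z), z(0) = z0."""
    z = list(z0)
    for _ in range(steps):
        dz = g(z)
        z = [zi + h * dzi for zi, dzi in zip(z, dz)]
    return z

# Hypothetical right-hand side; the exact solution with y(0) = 1
# is y(x) = x - 1 + 2*exp(-x).
f = lambda x, y: x - y

# Autonomous version: the first component has derivative 1 and thus carries x.
g = lambda z: (1.0, f(z[0], z[1]))

x_end, y_end = euler_nonautonomous(f, 0.0, 1.0, 0.01, 200)
z_end = euler_autonomous(g, (0.0, 1.0), 0.01, 200)
exact = x_end - 1.0 + 2.0 * math.exp(-x_end)
print(x_end, y_end, z_end, exact)
```

Both loops perform the same arithmetic, so the second component of `z_end` reproduces `y_end`, while the first component of `z_end` simply grows with `x`, which is the unbounded component mentioned above.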
J.2
The following Example J.2, provided by Anton Sporrer, shows Lem. 5.19 becomes false if the hypothesis that every initial value problem for the considered ODE $y' = f(y)$ has at least one solution is omitted:
Example J.2. Consider
$$f : \mathbb{R} \to \mathbb{R}, \quad f(y) := \begin{cases} 0 & \text{for } y \in \mathbb{Q}, \\ 1 & \text{for } y \in \mathbb{R} \setminus \mathbb{Q}, \end{cases} \tag{J.7}$$
and the autonomous ODE $y' = f(y)$. If $(x_0,y_0) \in \mathbb{R} \times \mathbb{Q}$, then the initial value problem $y(x_0) = y_0$ has the unique solution $\phi : \mathbb{R} \to \mathbb{R}$, $\phi \equiv y_0 \in \mathbb{Q}$. However, if $(x_0,y_0) \in \mathbb{R} \times (\mathbb{R} \setminus \mathbb{Q})$, then the initial value problem $y(x_0) = y_0$ has no solution. Since $y' = f(y)$ has only constant solutions, every function $E : \mathbb{R} \to \mathbb{R}$ is an integral for this ODE according to Def. 5.18. However, not every differentiable function $E : \mathbb{R} \to \mathbb{R}$ satisfies (5.15): For example, if $E(y) := y$, then $E' \equiv 1$, i.e.
$$\forall_{y\in\mathbb{R}\setminus\mathbb{Q}} \quad E'(y)\,f(y) = 1 \ne 0, \tag{J.8}$$
showing that Lem. 5.19 does not hold for $y' = f(y)$ with $f$ according to (J.7).
Polar Coordinates
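(K.1a)
$$\varphi : \mathbb{R}^2 \setminus \{(0,0)\} \to [0,2\pi[, \quad \varphi(y_1,y_2) := \begin{cases} 0 & \text{for } y_2 = 0,\ y_1 > 0, \\ \operatorname{arccot}(y_1/y_2) & \text{for } y_2 > 0, \\ \pi & \text{for } y_2 = 0,\ y_1 < 0, \\ \pi + \operatorname{arccot}(y_1/y_2) & \text{for } y_2 < 0. \end{cases} \tag{K.1b}$$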
(K.2a)
(K.2b)
(K.3a)
(K.3b)
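where $g : \mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}^2$,
$$g_1 : \mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}, \quad g_1(r,\varphi) := f_1(r\cos\varphi, r\sin\varphi)\cos\varphi + f_2(r\cos\varphi, r\sin\varphi)\sin\varphi, \tag{K.4a}$$
$$g_2 : \mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}, \quad g_2(r,\varphi) := \frac{1}{r}\Bigl( f_2(r\cos\varphi, r\sin\varphi)\cos\varphi - f_1(r\cos\varphi, r\sin\varphi)\sin\varphi \Bigr). \tag{K.4b}$$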
is a solution to (K.2).
(x) := 1 (x) cos 2 (x), 1 (x) sin 2 (x) ,
(K.5)
R+ ,
R,
(K.6a)
(K.6b)
and if
1 = cos ,
2 = sin ,
(K.7a)
(K.7b)
then satisfies the initial condition
1 (0) = 1 ,
2 (0) = 2 .
(K.8a)
(K.8b)
Note that > 0 implies (1 , 2 ) 6= (0, 0), and that, for (, ) R+ [0, 2[, (K.7)
is equivalent to
r(1 , 2 ) = ,
(1 , 2 ) =
(K.9a)
(K.9b)
(K.10a)
(K.10b)
g1 (, ) = cos (1 ) cos
= (1 ),
(K.11a)
g2 (, ) = sin (1 ) cos +
(K.11b)
1 cos
2
(K.12a)
[Phi13a, (D.1c)]
sin2
,
2
(K.12b)
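Claim 1. The general solution to
$$r' = p(r), \quad p : \mathbb{R}^+ \to \mathbb{R}, \quad p(r) := r\,(1-r), \tag{K.13}$$
is
$$Y_p : D_{p,0} \to \mathbb{R}^+, \quad Y_p(x,\rho) := \frac{\rho}{\rho + (1-\rho)\,e^{-x}}, \tag{K.14}$$
where
$$D_{p,0} = \bigl( \mathbb{R} \times ]0,1] \bigr) \cup \left\{ (x,\rho) : \rho > 1,\ x \in \left] -\ln\frac{\rho}{\rho-1}, \infty \right[ \right\}. \tag{K.15}$$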
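$$Y_p(0,\rho) = \frac{\rho}{\rho + (1-\rho)} = \rho. \tag{K.16}$$
$$\forall_{(x,\rho)\in D_{p,0}} \quad \partial_x Y_p(x,\rho) = \frac{\rho\,(1-\rho)\,e^{-x}}{\bigl( \rho + (1-\rho)\,e^{-x} \bigr)^2} = Y_p(x,\rho)\,\bigl( 1 - Y_p(x,\rho) \bigr). \tag{K.17}$$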
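The closed form in Claim 1 can be sanity-checked numerically. The sketch below assumes the formula $Y_p(x,\rho) = \rho/(\rho + (1-\rho)e^{-x})$ and compares a central difference quotient in $x$ with the right-hand side $p(r) = r(1-r)$; the sample points, the step size and the helper `left_endpoint` are illustrative assumptions.

```python
import math

def Y_p(x, rho):
    """Candidate solution of r' = r*(1 - r) with initial value rho at x = 0."""
    return rho / (rho + (1.0 - rho) * math.exp(-x))

def p(r):
    """Right-hand side of the ODE."""
    return r * (1.0 - r)

def left_endpoint(rho):
    """For rho > 1: the point where the denominator of Y_p vanishes."""
    return -math.log(rho / (rho - 1.0))

h = 1e-6
residuals = []
for rho in (0.2, 1.0, 3.0):
    for x in (0.0, 0.5, 2.0):
        derivative = (Y_p(x + h, rho) - Y_p(x - h, rho)) / (2.0 * h)
        residuals.append(abs(derivative - p(Y_p(x, rho))))
max_residual = max(residuals)

# Denominator of Y_p at the left endpoint of the domain for rho = 3.
denominator_at_endpoint = 3.0 + (1.0 - 3.0) * math.exp(-left_endpoint(3.0))
print(max_residual, denominator_at_endpoint)
```

The residuals stay at the level of the discretization error, and the denominator vanishes (up to rounding) at the left endpoint, which is why the domain has to be restricted for initial values larger than 1.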
To verify the form of $D_{p,0}$, we note that the denominator in (K.14) is positive for each $x \in \mathbb{R}$ if $0 < \rho \le 1$. If $\rho > 1$, then the function $a : \mathbb{R} \to \mathbb{R}$, $a(x) := \rho + (1-\rho)\,e^{-x}$, is
x0 ( ) :=
(K.18a)
(K.18b)
(K.18c)
(K.18d)
(K.18e)
(K.18f)
(K.18g)
(K.18h)
(K.18i)
(K.18j)
(K.18k)
(K.18l)
$$\varphi' = q(\varphi), \quad q : \mathbb{R} \to \mathbb{R}, \quad q(\varphi) := \frac{1 - \cos\varphi}{2} = \sin^2\frac{\varphi}{2}, \tag{K.19}$$
is
Yq : R2 R+ ,
2
k
+
arctan
2 (k + 1) + arctan x2
+ 2k
sin
Yq (x, ) := 2 k + arctan 2 cos 2x
sin
+2
+ 2k
sin
2 (k + 1) + arctan 2 cos 2x
sin +2
+ 2k
2 sin
2 (k 1) + arctan
2 cos x sin +2
for
for
for
for
for
for
for
for
for
(x, ) A0 ,
(x, ) A0,k , k Z,
(x, ) B0,k , k Z,
(x, ) = (0, + 2k), k Z,
(x, ) A1,k A2,k , k Z,
(x, ) C1,k , k Z,
(x, ) B1,k , k Z,
(x, ) C2,k , k Z,
(x, ) B2,k , k Z.
(K.20)
R = A0
{(0, + 2k)} A0,k B0,k A1,k A2,k B1,k B2,k C1,k C2,k
kZ
(K.21)
(K.22)
one has
(x, ) 6= 0 for each (x, ) A1,k A2,k B1,k B2,k , k Z.
(K.23)
{k: kZ}
Rk Lk , kZ
Yq (0, ) = ,
(K.26a)
2 sin [Phi13a, (D.1d)]
Yq (0, ) = 2k + 2 arctan
=
2k + 2 arctan tan
2 cos + 2
2
= 2k + 2
k = .
(K.26b)
2
Next, we show that, for each R, the function x 7 Yq (x, ) is differentiable on R and
satisfies (K.24): For {2k : k Z}, x 7 Yq (x, ) is constant, i.e. differentiability is
clear, and
1 cos(2k)
1 cos Yq (x, )
=
= 0 = Yq (x, )
(K.27)
xR
2
2
proves (K.24).
For each {2(k +1) : k Z}, differentiability is clear in each x R\{0}. Moreover,
4
4
1
2
= 2
,
(K.28)
2 arctan
4 =
xR\{0}
x
x 1 + x2
4 + x2
and, thus, for each x R \ {0},
1 cos Yq (x, )
1 1
2
1 1
[Phi13a, (D.1e)] 1
=
= cos 2 arctan
2
2 2
x
2 2 1+
2
1 1 x 4
8
4 (K.28)
=
=
=
= Yq (x, ),
2
2
2 2 x +4
2(4 + x )
4 + x2
4
x2
4
x2
(K.29)
and
+ 2k 2 (k + 1) + arctan x2
Yq (0, ) Yq (x, )
lim
= lim
x0
x0
0x
x
4
2
[Phi13a, (9.29)],(K.28)
=
lim 4+x = 1,
x0
1
(K.31)
(K.32)
=
2
2 2
(x, )
2 2 1+
=
=
2 2 ((x, ))2 + 4(sin )2
2(4(sin )2 + ((x, ))2 )
4(sin )2
4(sin )2 + ((x, ))2
(K.33)
= Yq (x, ),
4(sin )2
((x, ))2
4(sin )2
((x, ))2
(K.34)
[Phi13a, (9.29)],(K.33)
)
4(sin 4(sin
)2 +((x, ))2
lim
=1
xx0 ( )
(K.35)
and
+ 2k 2 (k + 1) + arctan
Yq x0 ( ), Yq (x, )
lim
= lim
xx0 ( )
xx0 ( )
x0 ( ) x
x0 ( ) x
2 sin
2 cos x sin +2
[Phi13a, (9.29)],(K.33)
lim
)
4(sin 4(sin
)2 +((x, ))2
xx0 ( )
=1
(K.36)
(K.37)
(K.24) also holds. For Lk , we have sin < 0 and x0 ( ) < 0, and, thus, by L'Hôpital's
rule [Phi13a, Th. 9.23(a)],
Yq x0 ( ), Yq (x, )
lim
xx0 ( )
x0 ( ) x
sin
+ 2k 2 (k 1) + arctan 2 cos 2x
sin +2
=
lim
xx0 ( )
x0 ( ) x
2
[Phi13a, (9.29)],(K.33)
lim
)
4(sin 4(sin
)2 +((x, ))2
xx0 ( )
=1
(K.38)
and
+ 2k 2 k + arctan
Yq x0 ( ), Yq (x, )
= lim
lim
xx0 ( )
xx0 ( )
x0 ( ) x
x0 ( ) x
2
[Phi13a, (9.29)],(K.33)
lim
)
4(sin 4(sin
)2 +((x, ))2
xx0 ( )
=1
2 sin
2 cos x sin +2
(K.39)
(K.40)
N
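Claim 3. The general solution to (K.2) with $f_1, f_2$ according to (K.10) is
$$Y : D_{f,0} \to \mathbb{R}^2, \quad Y(x,\eta_1,\eta_2) := \Bigl( Y_p\bigl(x, r(\eta_1,\eta_2)\bigr) \cos Y_q\bigl(x, \varphi(\eta_1,\eta_2)\bigr),\ Y_p\bigl(x, r(\eta_1,\eta_2)\bigr) \sin Y_q\bigl(x, \varphi(\eta_1,\eta_2)\bigr) \Bigr),$$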
h
kk2
, .
kk2 1
(K.41)
(K.42)
Proof. Since (K.12) is the polar coordinate version of (K.2) with f1 , f2 according to
(K.10), everything follows from combining Th. K.1 with Claims 1 and 2.
N
Claim 4. The autonomous ODE (K.2) with $f_1, f_2$ according to (K.10) has $(1,0)$ as its only fixed point, and $(1,0)$ satisfies Def. 5.24(iii) for $x \to \infty$ (even for each $\eta \in \mathbb{R}^2 \setminus \{0\}$) without satisfying Def. 5.24(ii) (i.e. without being positively stable).
Proof. For each R2 \ {0}, it is r() > 0, and, thus
2
R \{0}
r()
= 1.
x r() + 1 r() ex
lim Yp x, r() = lim
lim Yq x, 0 = lim 0 = 0.
If () = , then
2
lim Yq x, = lim 2 + arctan
x
x
x
= 2( + 0) = 2.
(K.43)
(K.44a)
(K.44b)
(K.44c)
R2 \{0}
(K.45)
While $(1,0)$ is clearly a fixed point for (K.2) with $f_1, f_2$ according to (K.10), (K.45) shows that no other $\eta \in \mathbb{R}^2 \setminus \{0\}$ can be a fixed point.
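For each $\varepsilon \in ]0,\pi[$ and $\eta_\varepsilon := (\cos\varepsilon, \sin\varepsilon)$, it is $\varphi(\eta_\varepsilon) = \varepsilon$ and $Y_q(0,\varphi(\eta_\varepsilon)) = \varepsilon$. Thus, due to (K.44c) and the intermediate value theorem, the continuous function $Y_q(\cdot,\varphi(\eta_\varepsilon))$ must attain every value between $\varepsilon$ and $2\pi$, in particular, there is $x_\varepsilon > 0$ such that $Y_q(x_\varepsilon,\varphi(\eta_\varepsilon)) = \pi$ and $Y(x_\varepsilon,\eta_\varepsilon) = (\cos\pi, \sin\pi) = (-1,0)$. Since every neighborhood of $(1,0)$ contains points $\eta_\varepsilon = (\cos\varepsilon, \sin\varepsilon)$ with $\varepsilon \in ]0,\pi[$, this shows that $(1,0)$ does not satisfy Def. 5.24(ii) for $x \ge 0$.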
N
References
[Aul04] Bernd Aulbach. Gewöhnliche Differenzialgleichungen, 2nd ed. Spektrum Akademischer Verlag, Heidelberg, Germany, 2004 (German).
[Kön04] Konrad Königsberger. Analysis 2, 5th ed. Springer-Verlag, Berlin, 2004 (German).
[Koe03] Max Koecher. Lineare Algebra und analytische Geometrie, 4th ed. Springer-Verlag, Berlin, 2003 (German), 1st corrected reprint.
[Mar04] Nelson G. Markley. Principles of Differential Equations. Pure and Applied
Mathematics, Wiley-Interscience, Hoboken, NJ, USA, 2004.
[Oss09] E. Ossa. Topologie. Vieweg+Teubner, Wiesbaden, Germany, 2009 (German).
[Phi13a] P. Philip. Calculus I for Computer Science and Statistics Students. Lecture Notes, Ludwig-Maximilians-Universität, Germany, 2012/2013, available in PDF format at http://www.math.lmu.de/~philip/publications/lectureNotes/calc1_forInfAndStatStudents.pdf.
[Phi13b] P. Philip. Calculus II for Statistics Students. Lecture Notes, Ludwig-Maximilians-Universität, Germany, 2013, available in PDF format at http://www.math.lmu.de/~philip/publications/lectureNotes/calc2_forStatStudents.pdf.
[Pre75]
[Put66] E.J. Putzer. Avoiding the Jordan Canonical Form in the Discussion of Linear Systems with Constant Coefficients. The American Mathematical Monthly 73 (1966), No. 1, 2–7.
[Str08] Gernot Stroth. Lineare Algebra, 2nd ed. Berliner Studienreihe zur Mathematik, Vol. 7, Heldermann Verlag, Lemgo, Germany, 2008 (German).
[Wal02] Wolfgang Walter. Analysis 2, 5th ed. Springer-Verlag, Berlin, 2002 (German).