
Contents

0 Solving Linear Equation Systems with the Gauss-Algorithm

1 Linear Algebra and Vector Spaces
  1.1 Vector spaces
    1.1.1 Vector Spaces
    1.1.2 Linear Independence
    1.1.3 Dimension and Basis
    1.1.4 Scalar Product
    1.1.5 Orthonormal Systems
    1.1.6 Norms
  1.2 Matrices and Linear Maps
    1.2.1 Matrices
    1.2.2 Linear Maps
    1.2.3 Linear Equations
    1.2.4 Inverse Map and Inverse Matrix
    1.2.5 Changing the Basis
    1.2.6 Some Special Linear Maps in R^2
    1.2.7 Examples
  1.3 Operations with matrices
    1.3.1 Matrix-algebra
    1.3.2 Scalar Product
    1.3.3 Homogeneous Coordinates
    1.3.4 Norms
  1.4 Gauss Algorithm and LU-Decomposition
    1.4.1 Numerical Stability
    1.4.2 Special Operations
    1.4.3 Properties of C(k, l; α), D(k; α) and F(k, l)
    1.4.4 Standard Algorithm
    1.4.5 LU-Decomposition
    1.4.6 Example
    1.4.7 Summary of LU-Decomposition
    1.4.8 Example of LU-Decomposition
    1.4.9 Solving a Linear Equation System
    1.4.10 Short Form
    1.4.11 Example
  1.5 Eigenvalues and Eigenvectors
    1.5.1 Definition and properties
    1.5.2 More properties
    1.5.3 Lemma
    1.5.4 Theorem: Schur Form
    1.5.5 Consequences
    1.5.6 Jordan-Form
    1.5.7 Example
  1.6 Special Properties of Symmetric Matrices
    1.6.1 Properties of Symmetric and Hermitian Matrices
    1.6.2 Orthogonal Matrices
  1.7 Singular Value Decomposition (SVD)
    1.7.1 Preparations
    1.7.2 Existence and Construction of the SVD
  1.8 Generalized Inverses
    1.8.1 Special Case: A Injective
  1.9 Applications to linear equation systems
    1.9.1 Errors
    1.9.2 Numerical Rank Deficiency
    1.9.3 Application: Best-Fit Functions
  1.10 Symmetric Matrices and Quadratic Forms
  1.11 QR-Decomposition
  1.12 Numerics of eigenvalues

2 Ordinary Differential Equations
  2.1 General Definitions
  2.2 Linear differential equations with constant coefficients
    2.2.1 Inhomogeneous Equations
  2.3 Linear differential equations of higher order
    2.3.1 General Case
    2.3.2 ODEs with Constant Coefficients
    2.3.3 Special Inhomogeneities

3 Calculus in Several Variables
  3.1 Differential Calculus in R^n
    3.1.1 Definitions
    3.1.2 Examples and Properties of Open and Closed Sets
    3.1.3 Main Rule for Vector-Valued Functions
    3.1.4 Definition: Limits and Continuous Functions
    3.1.5 Definition: Partial Derivatives
    3.1.6 Theorem of H. A. Schwarz
    3.1.7 Definition: Derivative of f
    3.1.8 Higher Derivatives
    3.1.9 Examples
    3.1.10 Directional Derivative, Gâteaux Derivative
    3.1.11 Rules
  3.2 Inverse and Implicit Functions
    3.2.1 Inverse Function Theorem
    3.2.2 Application: Newton's Method
    3.2.3 Implicit Function Theorem
  3.3 Taylor Expansions
    3.3.1 Nabla Operator
    3.3.2 Construction of Taylor Expansions
    3.3.3 Taylor's Theorem
    3.3.4 Calculation in the Two-Dimensional Case
  3.4 Extreme Values
    3.4.1 Definition
    3.4.2 Necessary Criterion
    3.4.3 Sufficient Criterion
    3.4.4 Saddle Points

4 Integral Transforms
  4.1 Laplace Transform
    4.1.1 Method of Calculation
    4.1.2 Convolution
    4.1.3 Some Important Examples
    4.1.4 Solution of Initial Value Problems
  4.2 Fourier Series
    4.2.1 Theorem
    4.2.2 Definition
    4.2.3 Theorem
    4.2.4 Properties of the Coefficients
    4.2.5 Real Form of the Fourier Series
  4.3 Fourier Transform
    4.3.1 Definition
    4.3.2 Inverse Transform
    4.3.3 Convolution
    4.3.4 Rules
    4.3.5 Sine and Cosine Transform
    4.3.6 More Properties
    4.3.7 Calculation of the Fourier Transform
    4.3.8 Gauss Functions
    4.3.9 Consequences
    4.3.10 Definition: Dirac Sequence
    4.3.11 Main Property of Dirac Sequences
    4.3.12 Delta Distribution

5 Stability of Ordinary Differential Equations
  5.1 Remarks
  5.2 Definition
  5.3 Flow-Box Theorem
  5.4 Remarks
  5.5 Theorem: Linear Case
  5.6 Linearisation
  5.7 Poincaré-Ljapunov Theorem
  5.8 Example
  5.9 Ljapunov Functions
    5.9.1 Definition
    5.9.2 Theorem
0 Solving Linear Equation Systems with the Gauss-Algorithm

A linear equation system with m equations and n unknowns is given by

a_11 x_1 + a_12 x_2 + ... + a_1n x_n = b_1
   ...
a_m1 x_1 + a_m2 x_2 + ... + a_mn x_n = b_m
Omitting the plus signs and the variables this is written down in the short form

[ a_11  a_12  ...  a_1n | b_1 ]
[  ...   ...  ...  ...  | ... ]
[ a_m1  a_m2  ...  a_mn | b_m ]

In the case of a homogeneous equation system (all b_j are equal to zero) the last column is omitted, too.
Allowed operations are
- multiply a row by a number unequal to zero,
- exchange two rows,
- add a multiple of a row to another row.

The exchange of columns is only allowed if there is a row 0 added that contains the names of the variables. Naturally the last column containing the b_j-values must not be exchanged with other columns.
The simplest form of the Gauss algorithm is to perform these steps:

Step 1: Try to get a 1 into the upper left corner. If this is not possible, the algorithm stops.
Step 2: By adding suitable multiples of the first row to the rows below (and above), generate zeroes in the rest of the column.
Step 3: Repeat the process in the subscheme without the first row and column.
In the end (possibly after exchanging rows and columns) one has

[ x_j1  x_j2 ... x_jk   x_j(k+1) ... x_jn |         ]
[  1     0   ...  0        *     ...  *   | c_1     ]
[  0     1   ...  0        *     ...  *   | c_2     ]
[                ...                      | ...     ]
[  0     0   ...  1        *     ...  *   | c_k     ]
[  0     0   ...  0        0     ...  0   | c_(k+1) ]
[                ...                      | ...     ]
[  0     0   ...  0        0     ...  0   | c_m     ]
The first row contains the names of the variables.
The number k is called the rank of the equation system. The following cases are possible:

(i) At least one of the values c_(k+1), ..., c_m is unequal to zero. Then the system is not solvable.
(ii) If k = n = m, then the system is uniquely solvable with x_j1 = c_1, ..., x_jn = c_n.
(iii) It is k < n and c_(k+1) = ... = c_m = 0. Then we can take the last n - k variables x_j(k+1) to x_jn as parameters in the solution. With this the values of x_j1 to x_jk are uniquely determined for each choice of the parameters.
Example

2x_1 + 6x_2        + 2x_4 = 10
 x_1 + 3x_2 +  x_3 + 2x_4 =  7
3x_1 + 9x_2 + 4x_3        = 16
3x_1 + 9x_2 +  x_3 + 4x_4 = 17

or

[ x_1  x_2  x_3  x_4 |    ]
[  2    6    0    2  | 10 ]
[  1    3    1    2  |  7 ]
[  3    9    4    0  | 16 ]
[  3    9    1    4  | 17 ]
Step 1: Exchange rows 1 and 2.

[ x_1  x_2  x_3  x_4 |    ]
[  1    3    1    2  |  7 ]
[  2    6    0    2  | 10 ]
[  3    9    4    0  | 16 ]
[  3    9    1    4  | 17 ]
Step 2: Add row 1 multiplied by (-2) to row 2, multiplied by (-3) to row 3 and multiplied by (-3) to row 4. This results in

[ x_1  x_2  x_3  x_4 |    ]
[  1    3    1    2  |  7 ]
[  0    0   -2   -2  | -4 ]
[  0    0    1   -6  | -5 ]
[  0    0   -2   -2  | -4 ]
Step 3: Now swap columns 2 and 4.

[ x_1  x_4  x_3  x_2 |    ]
[  1    2    1    3  |  7 ]
[  0   -2   -2    0  | -4 ]
[  0   -6    1    0  | -5 ]
[  0   -2   -2    0  | -4 ]
Step 4: Add row 2 to row 1, add row 2 multiplied by (-3) to row 3 and multiplied by (-1) to row 4. Then divide row 2 by (-2).

[ x_1  x_4  x_3  x_2 |   ]
[  1    0   -1    3  | 3 ]
[  0    1    1    0  | 2 ]
[  0    0    7    0  | 7 ]
[  0    0    0    0  | 0 ]
Step 5: Leave row 4 away, divide row 3 by 7, add row 3 to row 1 and subtract it from row 2. Then we reach the final form

[ x_1  x_4  x_3  x_2 |   ]
[  1    0    0    3  | 4 ]
[  0    1    0    0  | 1 ]
[  0    0    1    0  | 1 ]
Step 6: The system is solvable. The variables behind the columns that form an identity matrix are parameters; here this applies to x_2. With x_2 = t one sees x_1 = 4 - 3t, x_4 = 1 and x_3 = 1. So we can write the general solution as follows:

[ x_1 ]   [ 4 - 3t ]   [ 4 ]       [ -3 ]
[ x_2 ] = [   t    ] = [ 0 ] + t * [  1 ]
[ x_3 ]   [   1    ]   [ 1 ]       [  0 ]
[ x_4 ]   [   1    ]   [ 1 ]       [  0 ]
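As a quick cross-check of the example, the following short Python sketch (using NumPy; the tool is an assumption of this note, not part of the original script) verifies that the particular solution and the homogeneous direction found above indeed satisfy the system.

```python
import numpy as np

# coefficient matrix and right-hand side of the example
A = np.array([[2, 6, 0, 2],
              [1, 3, 1, 2],
              [3, 9, 4, 0],
              [3, 9, 1, 4]], dtype=float)
b = np.array([10, 7, 16, 17], dtype=float)

x_part = np.array([4, 0, 1, 1], dtype=float)   # particular solution (t = 0)
x_hom  = np.array([-3, 1, 0, 0], dtype=float)  # homogeneous direction

print(np.allclose(A @ x_part, b))              # True
print(np.allclose(A @ x_hom, np.zeros(4)))     # True: lies in the kernel
```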
1 Linear Algebra and Vector Spaces

1.1 Vector spaces

1.1.1 Vector Spaces

1.1.1.1 Definition
A real vector space (short: VS) is a set in which two operations, addition and multiplication by scalars, are defined and in which the following rules hold (u, v and w are elements of the vector space, α and β are real numbers):

(i) u + v = v + u,  u + (v + w) = (u + v) + w
(ii) There is a zero vector 0 with v + 0 = v.
(iii) For each v there is a vector -v with v + (-v) = 0.
(iv) (α + β)v = αv + βv,  (αβ)v = α(βv),  α(v + w) = αv + αw,  1·v = v

If one admits complex scalars, one gets a complex vector space instead of a real VS.
The elements of the vector space are called vectors. The elements of the field R or C are called scalars.
Often there is no difference whether one has R or C as field. In this case we use K as a symbol.
1.1.1.2 Definition: Subspace
Let V be a VS and U ⊂ V. U is called a subspace of V if U is itself a VS with the operations induced by V. This is fulfilled iff (short for "if and only if") U contains for each pair of elements x and y the sum x + y and all vectors of the form λx with λ ∈ K.
This property is called closedness under sums and scalar multiples.
V always has the trivial subspaces V and {0}.
1.1.2 Linear Independence

1.1.2.1 Definition
Let v_1 to v_k be vectors and λ_1, ..., λ_k ∈ K. The expression λ_1 v_1 + ... + λ_k v_k is called a linear combination. The numbers λ_j are called coefficients. Please notice that a linear combination is always a finite sum, even in infinite-dimensional spaces.

The vectors v_1 to v_k are linearly dependent (l.d.) if there are coefficients λ_1 to λ_k with λ_1 v_1 + ... + λ_k v_k = 0, and not all of the λ_i are zero. If this is not the case, the vectors are called linearly independent (l.i.).

Therefore, if v_1 to v_k are linearly independent and λ_1 v_1 + ... + λ_k v_k = 0, it follows that λ_1 = λ_2 = ... = λ_k = 0.

On the other hand, if v_1 to v_k are l.d., then it is possible to write λ_1 v_1 + ... + λ_k v_k = 0 with at least one λ_j ≠ 0, say λ_1 ≠ 0. Then one has

v_1 = -(1/λ_1) (λ_2 v_2 + ... + λ_k v_k),

so one of the vectors is a linear combination of the others.
1.1.2.2 Criteria for Linear Dependence

- A single vector is linearly dependent iff it is the zero vector.
- Two vectors u and v are linearly dependent iff they lie on a straight line through zero, or iff one of them is a multiple of the other.
- Three vectors u, v and w are linearly dependent iff they lie in a plane through zero, or if one of them is a linear combination of the others. In R^3 there is a criterion with the volume of the parallelepiped spanned by these vectors:
  v_1, v_2, v_3 l.d.  ⇔  vol(v_1, v_2, v_3) = det(v_1, v_2, v_3) = 0
- k vectors v_1 to v_k are linearly dependent iff the rank of the matrix with the columns v_1 to v_k is less than k (rank will be explained later).
- More than n vectors in K^n are always linearly dependent.
- Criterion for n vectors in K^n: v_1 to v_n are linearly dependent  ⇔  det(v_1, ..., v_n) = 0.
1.1.3 Dimension and Basis

1.1.3.1 Definition: Span, Dimension and Basis
Let V be a vector space.

(i) Let M ⊂ V be a (finite or infinite) non-empty subset of V. The set of all linear combinations is called the span of M,
    span M = { Σ_{k=1}^{m} λ_k v_k | λ_j ∈ K, v_j ∈ M }.
    The span is always a subspace.
(ii) If there is a system M of n vectors in V so that V is the span of M, and there is no such system consisting of less than n vectors, then V has dimension n.
    If there is no finite set M with span M = V, V is said to be infinite-dimensional.
(iii) A set M = {v_1, v_2, ..., v_n} ⊂ V is called a basis of V iff every vector v ∈ V has a unique representation v = Σ_{k=1}^{n} λ_k v_k.
1.1.3.2 Remarks
(i) If V has dimension n, then every basis consists of n elements.
(ii) If V has dimension n, then every linearly independent set of n
vectors forms a basis.
(iii) The elements of a basis are always linearly independent.
1.1.3.3 Coordinates
Let M = {v_1, v_2, ..., v_n} ⊂ V be a basis of V. For each v ∈ V there is a unique representation v = Σ_{k=1}^{n} λ_k v_k. The numbers (λ_1, ..., λ_n) are called the coordinates of v with respect to M. The vector (λ_1, ..., λ_n)^T (always a column!) is called the coordinate vector of v.
1.1.4 Scalar Product

1.1.4.1 Complex scalar product
Let V be a complex vector space. A scalar product is a mapping V × V → C, (v, w) ↦ < v, w > with the properties

(i) < αu + βv, w > = α< u, w > + β< v, w > for α, β ∈ C, u, v, w ∈ V (linearity in the first argument)
(ii) < u, αv + βw > = conj(α)< u, v > + conj(β)< u, w > for α, β ∈ C, u, v, w ∈ V (anti-linearity in the second argument)
(iii) < u, v > = conj(< v, u >)
(iv) < u, u > ≥ 0 and < u, u > = 0 ⇔ u = 0 (positive definiteness)

Especially the scalar product of a vector with itself is always real and non-negative.

1.1.4.2 Real scalar product
If V is a real vector space, the same properties shall hold with a real-valued scalar product, α, β ∈ R, and (naturally) without complex conjugation.

1.1.4.3 Standard scalar product
The standard real resp. complex scalar product of two vectors in K^n is defined by

v · w = < v, w > := Σ_{k=1}^{n} v_k w_k            for v, w ∈ R^n
v · w = < v, w > := Σ_{k=1}^{n} v_k conj(w_k)      for v, w ∈ C^n

In this case we define:

(i) ‖u‖ := sqrt(< u, u >) is the length or (euclidean) norm of the vector u (also denoted by |u|).
(ii) The angle φ ∈ [0, π] of two non-zero vectors u, v ∈ R^n is defined by cos φ = < u, v > / (‖u‖ ‖v‖).
1.1.5 Orthonormal Systems

With the Kronecker symbol δ_ij = 1 for i = j and δ_ij = 0 for i ≠ j we define

1.1.5.1 Definition
(i) Two vectors having scalar product zero are called orthogonal or perpendicular.
(ii) A set of vectors {v_i} with < v_i, v_j > = δ_ij is called an orthonormal system (ONS). A basis that is an ONS is called an orthonormal basis (ONB).

1.1.5.2 Lemma
ONS are linearly independent.

The importance of an ONB lies in the following theorem, which allows an expansion of a given vector in the basis with the aid of scalar products:
1.1.5.3 Expansion Theorem
Let v_1, ..., v_k be an ONS and V the span of these vectors.

(i) If u ∈ V, then the following holds:
    u = < u, v_1 > v_1 + < u, v_2 > v_2 + ... + < u, v_k > v_k = Σ_{j=1}^{k} < u, v_j > v_j
(ii) For V ⊂ U and u ∈ U there exists a decomposition u = u_1 + u_2 with u_1 ∈ V and < u_1, u_2 > = 0. u_1 is called the orthogonal projection of u, and the map u ↦ u_1 is the orthogonal projection onto V.
1.1.5.4 Gram-Schmidt Orthonormalisation Process
Let u_1, ..., u_k be a set of vectors in which at least one non-zero vector exists.

Step 1: Choose u_1 ≠ 0, let v_1 = u_1 and set w_1 = v_1 / ‖v_1‖.

Step 2: If w_1 to w_(j-1) are already constructed, let

    v_j = u_j - < u_j, w_1 > w_1 - ... - < u_j, w_(j-1) > w_(j-1) = u_j - Σ_{i=1}^{j-1} < u_j, w_i > w_i.

Then span{u_1, ..., u_j} = span{v_1, ..., v_j} and

    < v_j, v_1 > = ... = < v_j, v_(j-1) > = < v_j, u_1 > = ... = < v_j, u_(j-1) > = 0.

In manual computations it is often easier to use the v_i instead of the w_i:

    v_j = u_j - (< u_j, v_1 > / < v_1, v_1 >) v_1 - ... - (< u_j, v_(j-1) > / < v_(j-1), v_(j-1) >) v_(j-1)
        = u_j - Σ_{i=1}^{j-1} (< u_j, v_i > / ‖v_i‖²) v_i.

As the v_j will be normed later, it is allowed to replace a v_j by a multiple of it. With this technique one can sometimes avoid the use of fractions.

Step 3: If v_j ≠ 0 then let w_j = v_j / ‖v_j‖ and go on with Step 2. If one is calculating with the v_j instead of the w_j this step can be carried out in the end.

If v_j = 0 then u_j was linearly dependent on u_1 to u_(j-1). In this case u_j is deleted from the starting set of vectors and the algorithm goes on with the next vector.
If the u_i are linearly independent this case cannot occur.
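A minimal sketch of the process in Python/NumPy (the library and the example vectors are assumptions for illustration, not part of the original notes); it follows Steps 1 to 3 with the normalised vectors w_i and drops dependent vectors:

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal system spanning the same space as `vectors`."""
    ons = []
    for u in vectors:
        v = u.astype(float).copy()
        for w in ons:                       # subtract the projections < u, w_i > w_i
            v -= np.dot(u, w) * w
        if np.linalg.norm(v) > tol:         # v = 0 means u was linearly dependent
            ons.append(v / np.linalg.norm(v))
    return ons

ons = gram_schmidt([np.array([1., 1., 0.]),
                    np.array([1., 0., 1.]),
                    np.array([2., 1., 1.])])   # third vector is dependent and is dropped
print(len(ons))                                                   # 2
print(np.round([[np.dot(a, b) for b in ons] for a in ons], 10))   # 2x2 identity
```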
1.1.6 Norms

1.1.6.1 Definition
A norm on a vector space V is a function ‖·‖ : V → R, x ↦ ‖x‖ ∈ R with the following properties:

(i) ‖x‖ ≥ 0 and ‖x‖ = 0 ⇔ x = 0 (definiteness)
(ii) ‖λx‖ = |λ| ‖x‖ (homogeneity)
(iii) ‖x + y‖ ≤ ‖x‖ + ‖y‖ (triangle inequality)

1.1.6.2 Examples
(i) The euclidean norm on K^n: ‖x‖_2 = |x| = sqrt(< x, x >)
(ii) The 1-norm: ‖x‖_1 = |x_1| + |x_2| + ... + |x_n|
(iii) The ∞-norm: ‖x‖_∞ = max{ |x_1|, |x_2|, ..., |x_n| }
(iv) On C([a, b]) we define ‖f‖_2 := ( ∫_a^b |f(x)|² dx )^(1/2)

Remark: In (i)-(iii) we have ‖e_k‖ = 1.
1.1.6.3 Lemma: Cauchy-Schwarz and Minkowski Inequalities
Let <·,·> be a real or complex scalar product, i.e. <·,·> is linear in the first argument and < u, v > = conj(< v, u >) with < u, u > = 0 ⇔ u = 0.

(i) |< u, v >| ≤ < u, u >^(1/2) < v, v >^(1/2)
(ii) With ‖u‖ := < u, u >^(1/2), ‖·‖ is a norm; especially ‖u + v‖ ≤ ‖u‖ + ‖v‖.

(i) is called the Cauchy-Schwarz inequality, (ii) is the Minkowski inequality.

1.1.6.4 Comparison of Norms
It is easy to see that ‖x‖_∞ ≤ ‖x‖_2 ≤ ‖x‖_1 ≤ n‖x‖_∞ holds. Therefore one can define: a sequence (x_k) approaches zero if the real sequence ‖x_k‖ has the limit zero, and the choice of the norm doesn't make a difference. Naturally x_k → x  ⇔  (x_k - x) → 0  ⇔  ‖x_k - x‖ → 0.
1.2 Matrices and Linear Maps

1.2.1 Matrices

1.2.1.1 Definition
In most cases it is sufficient to regard a matrix as a rectangular scheme consisting of column vectors:

A = (a_ij), i = 1..m, j = 1..n,

A = [ a_11  a_12  ...  a_1n ]
    [ a_21  a_22  ...  a_2n ]   =  [ a_1  a_2  ...  a_n ]   (the a_j are the columns)
    [  ...   ...  ...  ...  ]
    [ a_m1  a_m2  ...  a_mn ]

A matrix with an equal number of rows and columns is called a square matrix.
1.2.1.2 Special types of square matrices

Identity matrix E_n or I_n or E or I:
[ 1 0 0 ... 0 ]
[ 0 1 0 ... 0 ]
[ 0 0 1 ... 0 ]
[    ...      ]
[ 0 0 0 ... 1 ]

Diagonal matrix:
[ d_1  0   0  ...  0  ]
[ 0   d_2  0  ...  0  ]
[ 0    0  d_3 ...  0  ]
[        ...          ]
[ 0    0   0  ... d_n ]

Lower triangular matrix (all entries above the diagonal are zero):
[ *  0  0 ... 0 ]
[ *  *  0 ... 0 ]
[      ...      ]
[ *  *  * ... * ]

Upper triangular matrix (all entries below the diagonal are zero):
[ *  *  * ... * ]
[ 0  *  * ... * ]
[      ...      ]
[ 0  0  0 ... * ]
Two matrices of the same size can be added by adding all entries. A matrix is multiplied by a scalar λ by multiplying each entry by λ.

A = [ a_11 a_12 ... ]      B = [ b_11 b_12 ... ]
    [ a_21 a_22 ... ]          [ b_21 b_22 ... ]
    [      ...      ]          [      ...      ]

λA = [ λa_11 λa_12 ... ]      A + B = [ a_11+b_11  a_12+b_12 ... ]
     [ λa_21 λa_22 ... ]              [ a_21+b_21  a_22+b_22 ... ]
     [       ...       ]              [           ...            ]
1.2.1.3 Multiplication of Matrices and Vectors
Let A be a matrix with k columns and b be an element of K^k. The product of the matrix A and the vector b = (b_1, ..., b_k)^T is the linear combination of the column vectors of A with the coefficients b_1 to b_k:

[ a_1 ... a_k ] (b_1, ..., b_k)^T = b_1 a_1 + ... + b_k a_k

The matrix A is multiplied with the matrix B by decomposing B into column vectors and forming the corresponding matrix-vector products. These products are written down in order:

A [ b_1 ... b_k ] = [ A b_1 ... A b_k ]

So the matrix product C = AB is calculated in concrete situations entry by entry: c_ij is the product of row i of A with column j of B,

c_ij = a_i1 b_1j + a_i2 b_2j + ... + a_in b_nj = Σ_{k=1}^{n} a_ik b_kj.

On the other hand, if you define matrices as an (ordered) collection of row vectors, the product bA (b a row vector) consists of linear combinations of the rows of A with coefficients in b. Observe the order of multiplication!

This leads to: the product AB of the matrices A and B is a matrix where
- the k-th column is a linear combination of the columns of A with coefficients in the k-th column of B,
- the k-th row is a linear combination of the rows of B with coefficients in the k-th row of A.
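The two viewpoints can be checked directly in a few lines of Python/NumPy (an assumed illustration, not part of the original notes): the k-th column of AB equals A times the k-th column of B, i.e. a linear combination of the columns of A.

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])
C = A @ B

# column view: k-th column of AB = A @ (k-th column of B)
print(np.allclose(C[:, 0], A @ B[:, 0]))                          # True
# ... which is the linear combination b_11*a_1 + b_21*a_2
print(np.allclose(C[:, 0], B[0, 0]*A[:, 0] + B[1, 0]*A[:, 1]))    # True
# row view: k-th row of AB = (k-th row of A) @ B
print(np.allclose(C[0, :], A[0, :] @ B))                          # True
```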
1.2.2 Linear Maps

1.2.2.1 Definition: Linear map
Let U and V be vector spaces. A map L : U → V is called linear if for all x, y ∈ U and α, β ∈ K the following equation holds:

L(αx + βy) = αL(x) + βL(y)

If u_1, ..., u_n is a basis of U then L is completely determined by its action on the basis:

L u = L( Σ_{i=1}^{n} λ_i u_i ) = Σ_{i=1}^{n} λ_i L(u_i)

Suppose that V has a basis v_1, ..., v_m. Then each L u_i has a representation L u_i = Σ_{j=1}^{m} a_ji v_j. The matrix A = (a_ji), j = 1..m, i = 1..n, is called the matrix associated to the linear map L. Note that this matrix depends not only on L itself but also on the choice of the bases in U and V.

Resuming this for the special case U = K^n and V = K^m with the standard bases we have: the matrix of the linear map L : U → V has in the k-th column the image of e_k.

On the other hand every matrix with n rows and m columns defines a linear map K^m → K^n through L(x) := Ax.

1.2.2.2 Definition: Rank
The rank of a matrix is the rank of the corresponding homogeneous equation system defined in chapter 0.

1.2.2.3 Rank theorem
Let A be a matrix. Then the maximum number of linearly independent columns is equal to the maximum number of linearly independent rows.
A matrix is said to have full rank if its rank equals the minimum of the number of rows and the number of columns.
1.2.2.4 Definition: Multilinear Maps
(i) Let U_1, ..., U_n and V be vector spaces. A map
    L : U_1 × U_2 × ... × U_n → V,  (u_1, u_2, ..., u_n) ↦ L(u_1, ..., u_n) ∈ V
    is called multilinear if L is linear in each component, i.e. L is linear in each u_j if one fixes all other u_k.
(ii) Most important case: U_1 = ... = U_n.
    For n = 2 we have bilinear maps. They are called symmetric if L(u, v) = L(v, u) and hermitian if L(u, v) = conj(L(v, u)).
    A multilinear map with the property
    L(..., u_j, ..., u_k, ...) = -L(..., u_k, ..., u_j, ...)
    is called alternating.
(iii) Properties of alternating maps:
    (1) L(..., u, ..., u, ...) = 0
    (2) If one of the vectors is a linear combination of the others, L(...) = 0.
    (3) For U_1 = ... = U_n = K^n and V = K there is exactly one alternating multilinear L with L(e_1, ..., e_n) = 1.
        In this case we have L(u_1, ..., u_n) ≠ 0  ⇔  u_1, ..., u_n linearly independent.
        This L is called the determinant, L(u_1, ..., u_n) = det(u_1, ..., u_n) (and is the well-known determinant with the usual properties).
(iv) Application: Cramer's Rule
    Let a_1, ..., a_n ∈ K^n be a basis, A = [a_1, ..., a_n] an n×n matrix and b ∈ K^n.
    Then the equation system Ax = b is uniquely solvable with
    x_j = det A_j / det A,
    where A_j is A with a_j replaced by b.
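As a small illustration of Cramer's rule, here is a sketch with NumPy (an assumed tool; for larger systems the Gauss algorithm treated below is far more efficient):

```python
import numpy as np

def cramer_solve(A, b):
    """Solve Ax = b for a square invertible A via Cramer's rule."""
    detA = np.linalg.det(A)
    x = np.empty(len(b))
    for j in range(len(b)):
        Aj = A.copy()
        Aj[:, j] = b                    # replace column j by the right-hand side
        x[j] = np.linalg.det(Aj) / detA
    return x

A = np.array([[2., 1.], [1., 3.]])
b = np.array([3., 5.])
print(cramer_solve(A, b))               # [0.8 1.4]
print(np.linalg.solve(A, b))            # same result
```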
1.2.3 Linear Equations

1.2.3.1 Some Definitions
A linear map L : U → K is called a linear functional.
Let L : U → V be linear. Then L is called
- epimorphism, if L is surjective,
- isomorphism, if L is bijective (one-to-one and onto),
- endomorphism, if U = V,
- automorphism, if U = V and L is bijective.

The rank of L is the dimension of the range of L in V. As the range is spanned by the column vectors of the matrix representation, the rank of L is the rank of the corresponding matrix.

1.2.3.2 Definition: Linear equation
Let L : U → V be a linear map, b ∈ V. An equation Lx = b is called a linear equation. For b = 0 the equation is called homogeneous, otherwise inhomogeneous. The set of all solutions of the homogeneous equation is called the kernel of L, written ker L.

From now on we assume that L is represented by the matrix A.

1.2.3.3 Immediate Properties
(i) The kernel is a subspace of U.
(ii) For the homogeneous equation the dimension formula holds:
    dim ker L = dim U - rank L.
    That means that one can choose freely n - k parameters in the solution of the equation Lx = 0 (with n = dim U and k = rank L).
(iii) The general solution of the inhomogeneous equation is obtained by adding one particular solution to all solutions of the homogeneous equation.
(iv) The inhomogeneous equation is solvable iff the rank of A is equal to the rank of the extended matrix (A | b).
(v) For square n×n matrices A the following statements are equivalent:
    - The inhomogeneous equation is solvable for each right side b.
    - The homogeneous equation is uniquely solvable.
    - det A ≠ 0
    - A has rank n.
    - ker A = {0}
    - A⁻¹ exists (A⁻¹ is defined below).
1.2.4 Inverse map and Inverse Matrix
Let Lx = b be a linear equation that is uniquely solvable for all b. Then the map b ↦ x is well defined, and this map is called L⁻¹, the inverse map of L.

1.2.4.1 Consequences
(i) L⁻¹ is a linear map from V to U.
(ii) In the finite-dimensional case the matrix associated to L must be square.

Let A be an n×n square matrix with rank n. Then each equation system Ax = b is uniquely solvable. The matrix B = [v_1, ..., v_n] containing the solutions of Av_j = e_j is called the inverse of A, A⁻¹ = B.
A is called regular or invertible.

1.2.4.2 Properties
A⁻¹A = AA⁻¹ = E

From now on we restrict ourselves to the case that the linear map is defined between R^n and R^m or between C^n and C^m.
1.2.4.3 Correspondences between Linear Maps and Matrices

Linear map L                    | Matrix A
--------------------------------|---------------------------------
Application to a vector L(x)    | Multiplication matrix-vector Ax
Identity map I(x) = x           | Identity matrix E with Ex = x
Zero map O(x) = 0               | Zero matrix 0 with 0x = 0
Composition L_1 ∘ L_2           | Matrix multiplication A_1 A_2
Inverse map L⁻¹                 | Inverse matrix A⁻¹
1.2.5 Changing the Basis
In the beginning of the section it was mentioned that the matrix of a given map L : K^n → K^m contains in the columns the coordinates of the images of the basis of K^n with respect to the basis of K^m. Now we can ask how the matrix changes when we choose other bases in K^n or K^m.

1.2.5.1 Coordinates with Respect to a Basis
Let u_1, ..., u_n be a basis of K^n. Then the matrix U = (u_1 ... u_n) is invertible. To gain the coordinates a of a point x with respect to u_1, ..., u_n we write

x = U a  ⇔  a = U⁻¹ x.

If v_1, ..., v_n is another basis of K^n, we have with V = (v_1 ... v_n)

x = U a = V b  ⇔  b = V⁻¹ U a  ⇔  a = U⁻¹ V b.

1.2.5.2 Matrix and Change of Coordinates
This uses the same method as in the paragraph above: x ∈ K^n has the representations x = U a = V b, and y ∈ K^m has the representations y = W c = Z d.
Let A be the matrix of L with respect to the bases U and W. Using the last paragraph we have

L(x) = y  ⇔  A a = c  ⇔  A U⁻¹ V b = W⁻¹ Z d  ⇔  Z⁻¹ W A U⁻¹ V b = d.

A special case is the change of basis of an endomorphism: with W = U and Z = V the last formula reduces to

A a = c  ⇔  V⁻¹ U A U⁻¹ V b = d  ⇔  (V⁻¹U) A (V⁻¹U)⁻¹ b = d.

In the even more special case U = W = E we have

A a = c  ⇔  V⁻¹ A V b = d.
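A small numerical sketch of the last special case (U = W = E), with NumPy as an assumed tool: the matrix of the same endomorphism with respect to a new basis V is V⁻¹AV.

```python
import numpy as np

A = np.array([[2., 1.],
              [0., 3.]])            # matrix of L in the standard basis
V = np.array([[1., 1.],
              [0., 1.]])            # columns = new basis vectors

A_new = np.linalg.inv(V) @ A @ V    # matrix of L in the basis V

x = np.array([3., 2.])              # a point in standard coordinates
b = np.linalg.inv(V) @ x            # its coordinates with respect to V

# applying L and converting coordinates commute:
print(np.allclose(np.linalg.inv(V) @ (A @ x), A_new @ b))   # True
```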
1.2.6 Some Special Linear Maps in R^2

(i) Identity and zero maps E and 0.
(ii) Homogeneous scaling λE = [ λ 0 ]
                              [ 0 λ ]
(iii) Rotation by the angle φ: [ cos φ  -sin φ ]
                               [ sin φ   cos φ ]
(iv) Shears such as [ 1 1 ]
                    [ 0 1 ]
(v) Reflections.
    Let ‖a‖ = 1 and g be the straight line through zero with direction a. The reflection at g has the matrix
    S_g = [ 2a_1² - 1    2a_1 a_2  ]  =  2 a a^T - E
          [ 2a_1 a_2     2a_2² - 1 ]
(vi) The reflection at zero has the matrix -E = [ -1  0 ]
                                                [  0 -1 ].
1.2.7 Examples

[Figure: eight pictures, numbered 1 to 8, showing the image of a test shape under eight linear maps of R^2. The associated matrices, as far as recoverable from the source (several minus signs were lost in extraction), are:
1: [1 0; 0 1] (identity), 2: [1.5 0; 0 0.5], 3: [0.75 0; 0 1], 4: [1 1; 0 1] (shear),
5-7: three further maps whose matrices print as [1 0; 0 1] (presumably reflections with lost signs),
8: [a a; a a] with a = 1/√2.]
1.3 Operations with matrices

The transpose A^T of a matrix A is the matrix with columns and rows exchanged. The transpose of an m×n matrix is an n×m matrix. For square matrices this means that everything is mirrored at the main diagonal.
The adjoint A^* of a (complex) matrix A is constructed by replacing all entries of the transpose by their complex conjugates.
A square matrix is called symmetric if it is equal to its transpose. It is called self-adjoint or hermitian if it is equal to its adjoint. For real matrices these terms coincide.
A matrix with A = -A^T is called skew-symmetric; if A = -A^*, A is called skew-hermitian.
Often it is useful to regard vectors as matrices with one column and n rows. The numbers in R or C correspond to the 1×1 matrices.
1.3.1 Matrix-algebra

A + B = B + A      λ(A + B) = λA + λB      (A + B) + C = A + (B + C)
(A + B)C = AC + BC      A(B + C) = AB + AC      (AB)C = A(BC)

Attention! In general AB ≠ BA.

Let A and B be invertible n×n matrices. Then AB is invertible and the following rules hold:

(AB)⁻¹ = B⁻¹A⁻¹      (λA)⁻¹ = (1/λ)A⁻¹
AE = EA = A          A0 = 0A = 0
(A⁻¹)⁻¹ = A          (A^T)^T = A          (A^*)^* = A
(A + B)^T = A^T + B^T      (λA)^T = λA^T          (AB)^T = B^T A^T
(A + B)^* = A^* + B^*      (λA)^* = conj(λ)A^*     (AB)^* = B^* A^*
(A⁻¹)^T = (A^T)⁻¹          (A⁻¹)^* = (A^*)⁻¹
1.3.1.1 Block Matrices
If a matrix is divided into blocks by horizontal or vertical lines one can calculate with these blocks as if they were entries in a common matrix (exception: determinants!). The blocks have to fit in size. Example:

[ A_1  A_2 ] [ B_1  0   ]   [ A_1 B_1 + A_2   A_2 B_2 ]
[ A_3  A_4 ] [ E_k  B_2 ] = [ A_3 B_1 + A_4   A_4 B_2 ]

Here 0 denotes a matrix consisting only of zeroes and E_k a k×k identity matrix.
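The block rule can be verified numerically; a small NumPy sketch (an assumed illustration) with 2×2 blocks:

```python
import numpy as np

rng = np.random.default_rng(0)
A1, A2, A3, A4 = (rng.standard_normal((2, 2)) for _ in range(4))
B1, B2 = (rng.standard_normal((2, 2)) for _ in range(2))
Ek, O = np.eye(2), np.zeros((2, 2))

A = np.block([[A1, A2], [A3, A4]])
B = np.block([[B1, O], [Ek, B2]])

blockwise = np.block([[A1 @ B1 + A2, A2 @ B2],
                      [A3 @ B1 + A4, A4 @ B2]])
print(np.allclose(A @ B, blockwise))   # True
```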
1.3.2 Scalar Product
The role of the transpose resp. adjoint matrix becomes clearer if we regard the scalar product as a matrix product:

< u, v > = Σ_{i=1}^{n} u_i conj(v_i) = v^* u   (complex case)
< u, v > = Σ_{i=1}^{n} u_i v_i = v^T u         (real case)

So we have

< Au, v > = v^T A u = v^T (A^T)^T u = (A^T v)^T u = < u, A^T v >

and analogously in the complex case < Au, v > = < u, A^* v >.

This property characterizes the transpose matrix: let < Au, v > = < u, Bv > for all u, v ∈ R^n. If one chooses u = e_i and v = e_j one has < Ae_i, e_j > = a_ji and < e_i, Be_j > = b_ij, so B = A^T.
1.3.3 Homogeneous Coordinates
With matrix multiplication one can describe rotations, stretchings, shearings or reflections (and combinations of these), but as the origin always remains fixed, translations are not possible. This difficulty can be overcome by using homogeneous coordinates. Homogeneous coordinates in R^3 consist of four coordinates, where the fourth coordinate must not be zero. A point (x, y, z) ∈ R^3 is represented by any vector of the form [ax, ay, az, a]^T. Especially [x, y, z, 1]^T is a representative of [x, y, z]^T.
Then we have the following correspondences:

cartesian coordinates          homogeneous coordinates

x = (x_1, x_2, x_3)^T          y = (x_1, x_2, x_3, 1)^T  or  y = (ax_1, ax_2, ax_3, a)^T

x ↦ Ax                         y ↦ By  with  B = [ A      0 ]   (A in the upper left 3×3 block,
                                                 [ 0 0 0  1 ]    last row and column (0, 0, 0, 1))

x ↦ x + v                      y ↦ By  with  B = [ 1 0 0 v_1 ]
                                                 [ 0 1 0 v_2 ]
                                                 [ 0 0 1 v_3 ]
                                                 [ 0 0 0  1  ]
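A short NumPy sketch (an assumed illustration) combining a rotation about the z-axis with a translation in one 4×4 homogeneous matrix:

```python
import numpy as np

def rotation_z(phi):
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])

def homogeneous(A, v):
    """4x4 matrix acting as x -> A x + v on points [x, 1]."""
    B = np.eye(4)
    B[:3, :3] = A
    B[:3, 3] = v
    return B

B = homogeneous(rotation_z(np.pi / 2), np.array([1., 0., 0.]))
p = np.array([1., 0., 0., 1.])    # the point (1, 0, 0) in homogeneous form
print(B @ p)                      # approx [1., 1., 0., 1.] -> the point (1, 1, 0)
```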
1.3.4 Norms

Definition of norms of linear maps. Let U and V be normed vector spaces and let L(U, V) denote the vector space of all linear maps from U to V. A norm on L(U, V) is a real-valued function with the following properties: if A, B ∈ L(U, V) then

(i) ‖A‖ ≥ 0 and ‖A‖ = 0 ⇔ A = 0, the zero map (definiteness)
(ii) ‖λA‖ = |λ| ‖A‖ (homogeneity)
(iii) ‖A + B‖ ≤ ‖A‖ + ‖B‖ (triangle inequality)
(iv) ‖AB‖ ≤ ‖A‖ ‖B‖

In the finite-dimensional case linear maps are represented by matrices, and the norm is called a matrix norm. Other notation: operator norm.

In general, a vector norm ‖·‖_a and a matrix norm ‖·‖_b are compatible if for each vector x and each matrix A the inequality ‖Ax‖_a ≤ ‖A‖_b ‖x‖_a holds. The norm definition below produces compatible matrix norms.

Definition. Let ‖·‖ be a (vector) norm in K^n and A be an n×n matrix. We define the matrix norm ‖A‖ generated by ‖·‖ by

‖A‖ = max{ ‖Ax‖ : ‖x‖ = 1 } = max{ ‖Ax‖ : ‖x‖ ≤ 1 }.

Then one has ‖A‖ = min{ C : for all x ∈ U one has ‖Ax‖ ≤ C‖x‖ }.
The norms generated by the vector norms ‖·‖_1 and ‖·‖_∞ above are denoted by the same symbols.

Lemma
(i) ‖A‖_1 = max over j of Σ_{i=1}^{n} |a_ij| (largest column sum)
(ii) ‖A‖_2 is the first (and largest) singular value of A (will be defined later)
(iii) ‖A‖_∞ = max over i of Σ_{j=1}^{n} |a_ij| (largest row sum)
(iv) ‖A‖_S = sqrt( Σ_{i,j=1}^{n} |a_ij|² ) is compatible with ‖·‖_2 (Frobenius norm).
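The quantities in the lemma can be computed directly; a small NumPy check (an assumed illustration):

```python
import numpy as np

A = np.array([[1., -2.],
              [3.,  4.]])

col_sum = np.abs(A).sum(axis=0).max()       # ||A||_1   (largest column sum) -> 6
row_sum = np.abs(A).sum(axis=1).max()       # ||A||_inf (largest row sum)    -> 7
spectral = np.linalg.norm(A, 2)             # ||A||_2   = largest singular value
frob = np.linalg.norm(A, 'fro')             # Frobenius norm

print(col_sum, np.linalg.norm(A, 1))        # both 6.0
print(row_sum, np.linalg.norm(A, np.inf))   # both 7.0
print(spectral <= frob)                     # True: ||A||_2 <= ||A||_S
```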
1.4 Gauss Algorithm and LU-Decomposition

1.4.1 Numerical Stability
We will study some small equation systems and the effect of rounding errors on the solutions.

Example system:

10⁻⁴ x + y = 1
     x + y = 2

Solution with the Gauss algorithm, exact calculation:

[ 1/10000  1 | 1 ]    [ 1/10000   1    |  1    ]    [ 1/10000  1 | 1         ]
[ 1        1 | 2 ] -> [ 0        -9999 | -9998 ] -> [ 0        1 | 9998/9999 ]

-> [ 1/10000  0 | 1/9999    ]    [ 1  0 | 10000/9999 ]
   [ 0        1 | 9998/9999 ] -> [ 0  1 | 9998/9999  ]

and so x ≈ 1 and y ≈ 1.

Now the same calculation with three significant digits, i.e. all numbers are rounded to the nearest number with three digits of the form x = 0.abc · 10^p:

[ 0.0001  1 | 1 ]    [ 0.0001   1      | 1      ]    [ 0.0001  1 | 1 ]
[ 1       1 | 2 ] -> [ 0       -10000  | -10000 ] -> [ 0       1 | 1 ]

-> [ 0.0001  0 | 0 ]    [ 1  0 | 0 ]      x = 0
   [ 0       1 | 1 ] -> [ 0  1 | 1 ]  =>  y = 1

This solution is unusable.
This can be avoided by pivoting: choose the entry in the first column below the diagonal (the diagonal included) with the largest absolute value and put it into the diagonal by exchanging rows. Then go on with the Gauss algorithm. If A is invertible then the pivot elements are unequal to zero. This results in the following:

[ 0.0001  1 | 1 ]    [ 1       1 | 2 ]    [ 1  1      | 2      ]    [ 1  1 | 2 ]    [ 1  0 | 1 ]
[ 1       1 | 2 ] -> [ 0.0001  1 | 1 ] -> [ 0  0.9999 | 0.9998 ] -> [ 0  1 | 1 ] -> [ 0  1 | 1 ]

Other problems may arise. Example two is example one after multiplying row 1 by 20000. Again the calculations use three significant digits.

[ 2  20000 | 20000 ]    [ 2  20000  | 20000  ]    [ 2  20000 | 20000 ]    [ 2  0 | 0 ]      x = 0
[ 1  1     | 2     ] -> [ 0  -10000 | -10000 ] -> [ 0  1     | 1     ] -> [ 0  1 | 1 ]  =>  y = 1

So this solution is unusable, too.
This effect can be avoided by equilibration. This means that each equation is multiplied by a factor so that the sum of the absolute values of the row, Σ_{k=1}^{n} |a_ik|, is equal to one.
Applying this one gets

[ 2  20000 | 20000 ]    [ 2/20002  20000/20002 | 20000/20002 ]    [ 0.0001  1   | 1 ]
[ 1  1     | 2     ] -> [ 1/2      1/2         | 1           ] -> [ 0.5     0.5 | 1 ]

Then pivoting gives

-> [ 0.5     0.5 | 1 ]    [ 0.5  0.5 | 1 ]      x = 1
   [ 0.0001  1   | 1 ] -> [ 0    1   | 1 ]  =>  y = 1

Conclusion: pivoting and equilibration can help to avoid problems caused by rounding errors.
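The same effect can be reproduced in floating-point arithmetic. The following sketch (NumPy assumed; single precision and a very small pivot are used only to make the rounding visible) eliminates once without and once with the row exchange:

```python
import numpy as np

def solve_2x2(A, b):
    """Naive elimination on a 2x2 system, no pivoting."""
    A, b = A.copy(), b.copy()
    f = A[1, 0] / A[0, 0]
    A[1] -= f * A[0]
    b[1] -= f * b[0]
    y = b[1] / A[1, 1]
    x = (b[0] - A[0, 1] * y) / A[0, 0]
    return np.array([x, y])

A = np.array([[1e-8, 1.], [1., 1.]], dtype=np.float32)
b = np.array([1., 2.], dtype=np.float32)

print(solve_2x2(A, b))              # [0., 1.] - the tiny pivot makes the result unusable
print(solve_2x2(A[::-1], b[::-1]))  # rows exchanged (pivoting): [1., 1.]
```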
1.4.2 Special Operations
We always assume that the sizes of the matrices fit so that the products can be performed.
Let α be a real or complex number and k ≠ l. Now define the following n×n square matrices:

Definition

C(k, l; α) = (c_ij), i, j = 1..n, with c_ij = 1 for i = j, c_ij = α for i = k, j = l, and c_ij = 0 otherwise.

D(k; α) = (d_ij), i, j = 1..n, with d_ij = 1 for i = j and i ≠ k, d_ij = α for i = j = k, and d_ij = 0 otherwise.

F(k, l) = (f_ij), i, j = 1..n, with f_ij = 1 for i = j and i ≠ k and i ≠ l, f_ij = 1 for i = k, j = l, f_ij = 1 for i = l, j = k, and f_ij = 0 otherwise.

Decompose A into column vectors a_1, ..., a_n and row vectors b_1, ..., b_n.

Multiplication from the left side does operations with rows:

C(k, l; α) A : row k is replaced by b_k + α b_l, all other rows are unchanged.
D(k; α) A    : row k is replaced by α b_k.
F(k, l) A    : rows k and l are exchanged.

Multiplication from the right side does operations with columns:

A C(k, l; α) : column l is replaced by a_l + α a_k.
A D(k; α)    : column k is replaced by α a_k.
A F(k, l)    : columns k and l are exchanged.

Observe that multiplication with C(k, l; α) from the right changes column l, while multiplication from the left changes row k.
1.4.3 Properties of C(k, l; α), D(k; α) and F(k, l)
(i) C(k, l; α)⁻¹ = C(k, l; -α)
(ii) C(k, l; α) C(k, m; β) = C(k, m; β) C(k, l; α)
(iii) C(k, l; 0) = E
(iv) For α ≠ 0 we have D(k; α)⁻¹ = D(k; 1/α)
(v) F(k, l)⁻¹ = F(k, l) = F(k, l)^T
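A short sketch (NumPy assumed, 0-based indices instead of the 1-based ones used in the text) constructing these elementary matrices and checking the row-operation and inverse properties:

```python
import numpy as np

def C(n, k, l, alpha):          # from the left: adds alpha times row l to row k
    M = np.eye(n)
    M[k, l] = alpha
    return M

def F(n, k, l):                 # from the left: exchanges rows k and l
    M = np.eye(n)
    M[[k, l]] = M[[l, k]]
    return M

A = np.arange(9.).reshape(3, 3)
print(C(3, 2, 0, -4.0) @ A)     # row 2 of A minus 4 times row 0
print(np.allclose(C(3, 2, 0, 0.5) @ C(3, 2, 0, -0.5), np.eye(3)))  # C(k,l;a)^-1 = C(k,l;-a)
print(np.allclose(F(3, 0, 2) @ F(3, 0, 2), np.eye(3)))             # F(k,l)^-1 = F(k,l)
```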
1.4.4 Standard Algorithm
Standard operations in the Gauss algorithm are
(i) adding row l multiplied by α to row k,
(ii) multiplying row k by α ≠ 0,
(iii) exchanging rows k and l.

These operations can be described with the aid of the fundamental matrices C(k, l; α), D(k; α) and F(k, l). To see this we write the system Ax = b as an augmented matrix S = (A | b).
Then the operations (i) to (iii) from above are
(i) multiply S with C(k, l; α) from the left,
(ii) multiply S with D(k; α) from the left,
(iii) multiply S with F(k, l) from the left.

As all appearing matrices are invertible we see that the Gauss algorithm gives equivalent transformations and so preserves the set of solutions.
If the system Ax = b is uniquely solvable it is sufficient to reach the form

[ d_11   *   ...   *   | * ]
[ 0     d_22 ...   *   | * ]
[          ...         | * ]
[ 0      0   ...  d_nn | * ]

From the last equation one can read directly the value of x_n, and by substituting the already determined variables the solution is calculated recursively from the bottom to the top.
1.4.5 LU-Decomposition
The LU-decomposition
- is an effective method for solving many equation systems with the same left side,
- decomposes a given square matrix A as A = PLU with
  (i) P a permutation matrix, i.e. P has exactly one 1 in each column and row, and all other entries are zero,
  (ii) L a lower triangular matrix,
  (iii) U an upper triangular matrix.

1.4.5.1 Description of the Algorithm - simple case with P = E
The algorithm consists of a series of transformations of the matrix A. With L_0 = E, U_0 = A we calculate

A = EA = L_0 U_0 = ... = L_k U_k = ... = L_n U_n =: LU.

The matrices L_k and U_k have the block structure shown below:
L_k is lower triangular with ones on the diagonal; below the diagonal only its first k columns are (possibly) non-zero. U_k is upper triangular in its first k rows, with the pivots u_1, ..., u_k on the diagonal; its rows k+1 to n are zero in the first k columns and still arbitrary in the remaining columns. Schematically:

L_k = [ 1                  ]      U_k = [ u_1  *  ...  *  | * ... * ]
      [ *  1               ]            [ 0   ...      *  | * ... * ]
      [ *  ...  1          ]            [ 0   ...  u_k    | * ... * ]
      [ *  ...  *  1       ]            [ 0   ...  0      | * ... * ]
      [ *  ...  *  0 ... 1 ]            [ 0   ...  0      | * ... * ]
       (entries only in the               (first k rows triangular,
        first k columns)                   lower right block arbitrary)

L_{k-1} and U_{k-1} have the same form with k replaced by k-1; the entry of U_{k-1} in position (k, k) is denoted by z.
Step 1: We start with A = L_{k-1} U_{k-1}. Let U_{k-1} = (u_ij). In this simple case we assume that z = u_kk ≠ 0.
To each row from row k+1 to the last in U_{k-1} we add row k multiplied by λ_j := -u_jk / u_kk. This results in zeroes in column k from row k+1 to the bottom.
These actions expressed with matrices: U_{k-1} is multiplied from the left with C(j, k; λ_j).
Recall the facts that the inverse of C(j, k; λ_j) is C(j, k; -λ_j) and that matrices C(j, k; α) and C(i, k; β) commute. So we have

A = L_{k-1} U_{k-1}
  = L_{k-1} C(k+1, k; -λ_{k+1}) C(k+1, k; λ_{k+1}) U_{k-1}
  = L_{k-1} C(k+1, k; -λ_{k+1}) ... C(n, k; -λ_n) C(n, k; λ_n) ... C(k+1, k; λ_{k+1}) U_{k-1}
  = [ L_{k-1} C(k+1, k; -λ_{k+1}) ... C(n, k; -λ_n) ] [ C(n, k; λ_n) ... C(k+1, k; λ_{k+1}) U_{k-1} ]
  =: L_k U_k.

How is L_k built from L_{k-1}?
The action of the matrices C(j, k; -λ_j) from the right is adding multiples of the columns k+1 to n to column k. Obviously only column k is changed by this process, and it contains in the places k+1 to n the negatives of the factors used in the transformation of U_{k-1}.
As an example we write down L_1 in the case U_0 = A = (a_ij):

L_1 = [ 1          0  ...  0 ]
      [ a_21/a_11  1  ...  0 ]
      [ ...                  ]
      [ a_n1/a_11  0  ...  1 ]

Step 2: Recursively repeat Step 1.
When the algorithm ends we have A = LU with

L = [ 1   0   ...  0 ]        U = [ u_11  *    ...  *    ]
    [ *   1   ...  0 ]            [ 0     u_22 ...  *    ]
    [ ...            ]            [ ...                  ]
    [ *   *   ...  1 ]            [ 0     0    ...  u_nn ]

The u_ii are non-zero.

Step 3: Now we have Ax = LUx = L(Ux) = Ly = b with y := Ux.
(i) Ly = b is solved recursively beginning with the first component of y.
(ii) Ux = y is solved recursively beginning with the last component of x.

1.4.5.2 Remark
det A = det L · det U = u_11 · ... · u_nn.
1.4.6 Example

Let A = [ 1 2 4 ]   and   b = [ 3 ]
        [ 2 3 8 ]             [ 6 ]
        [ 1 3 1 ]             [ 0 ].   Solve Ax = b.

Step 1: Start with the LU-decomposition of A.

[L_0 | U_0] = [ 1 0 0 | 1 2 4 ]
              [ 0 1 0 | 2 3 8 ]
              [ 0 0 1 | 1 3 1 ]

[L_1 | U_1] = [ 1 0 0 | 1  2  4 ]
              [ 2 1 0 | 0 -1  0 ]
              [ 1 0 1 | 0  1 -3 ]

[L_2 | U_2] = [ 1  0 0 | 1  2  4 ]
              [ 2  1 0 | 0 -1  0 ]
              [ 1 -1 1 | 0  0 -3 ]

Step 2: Solve Ly = b.
Line by line one has y_1 = 3, y_2 = 0 and y_3 = -3.

Step 3: Solve Ux = y.
Line by line (from the bottom to the top) one has x_3 = 1, x_2 = 0 and x_1 = -1, so x = [ -1  0  1 ]^T.
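The factorization can be cross-checked with SciPy (an assumed tool here; scipy.linalg.lu uses the same A = PLU convention as above but always applies partial pivoting, so its factors may differ from the hand computation by row exchanges):

```python
import numpy as np
from scipy.linalg import lu, lu_factor, lu_solve

A = np.array([[1., 2., 4.],
              [2., 3., 8.],
              [1., 3., 1.]])
b = np.array([3., 6., 0.])

P, L, U = lu(A)                     # A = P @ L @ U
print(np.allclose(A, P @ L @ U))    # True

x = lu_solve(lu_factor(A), b)       # solve via the factorization
print(x)                            # [-1.  0.  1.]
```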
1.4.6.1 LU-decomposition, general case
This general case brings two extensions:
- A may be singular,
- pivoting is possible.

Now we construct a decomposition A = PLU. We start with P_0 := E, L_0 = E and U_0 = A.

If the element z in U_{k-1} is zero and in the rest of the column k there are only zeroes, too, then the matrix A is singular. In this case let U_k := U_{k-1} and L_k := L_{k-1}. We will get an LU-decomposition of A with some diagonal elements of U being zero. This can only happen if A is singular.

If in a row l > k of the column k there is an entry with a larger absolute value, then exchange the rows k and l of U_{k-1}.
This is a multiplication of U_{k-1} from the left with F(k, l). Remembering F(k, l) F(k, l) = E we get

A = P_{k-1} L_{k-1} U_{k-1} = ( P_{k-1} L_{k-1} F(k, l) ) ( F(k, l) U_{k-1} ) =: ( P_{k-1} L_{k-1} F(k, l) ) Ũ_{k-1}.

The matrix Ũ_{k-1} is U_{k-1} with rows k and l exchanged and therefore has a non-zero element in position z.

The action of right multiplication with F(k, l) on L_{k-1} is interchanging columns k and l. As these columns consist of zeroes with only one 1 in each case, this can be undone by interchanging the rows k and l, i.e. multiplying L_{k-1} with F(k, l) from the left. But doing so interchanges the first k-1 positions of these rows too, so that one has to undo this.

Resuming this we have this step in the algorithm: set P_k := P_{k-1} F(k, l), and let L̃_{k-1} be L_{k-1} with the first k-1 entries of the rows k and l interchanged.

[Figure: sketch of P_{k-1} → P_k (columns k and l exchanged), L_{k-1} → L̃_{k-1} (first k-1 entries of rows k and l exchanged) and U_{k-1} → Ũ_{k-1} (rows k and l exchanged).]

Now construct U_k and L_k as in the simple case from Ũ_{k-1} and L̃_{k-1} and get A = P_k L_k U_k.

In the end we have P⁻¹ = P^T. As P is a product of matrices F(k, l) and F(k, l)⁻¹ = F(k, l)^T, this is true for P, too, because of: let A^T = A⁻¹ and B^T = B⁻¹. Then (AB)^T = B^T A^T = B⁻¹ A⁻¹ = (AB)⁻¹.
1.4.7 Summary of LU-decomposition
Solving a linear equation system Ax = b with LU-decomposition consists of the following steps:

Step 1: Start with P_0 = L_0 = E_n, U_0 = A.

Step 2: For each k from 1 to n perform:
- Exchanging rows: Ũ_{k-1} is U_{k-1} with rows k and l > k exchanged, L̃_{k-1} is L_{k-1} where the first k-1 entries in rows k and l are exchanged (only if k > 1), and exchanging columns k and l in P_{k-1} gives P_k.
  If you skip this step just put P_k := P_{k-1}, L̃_{k-1} := L_{k-1} and Ũ_{k-1} := U_{k-1}.
- Adding multiples of row k to the rows below: adding in Ũ_{k-1} the λ_l-fold row k to the rows l with l > k gives U_k, and L_k is L̃_{k-1} with the entries -λ_l in row l of column k.

With P := P_n, L := L_n and U := U_n this gives the decomposition A = PLU.
In the case of different right sides b_j in the equation system, this step has to be carried out only once.

Step 3: Solve Pz = b by z = P^T b.
Step 4: Solve Ly = z recursively starting with y_1.
Step 5: Solve Ux = y recursively starting with x_n.

At an arbitrary point you can make a crosscheck whether you made mistakes during the calculation: P_k L_k U_k and P_k L̃_{k-1} Ũ_{k-1} must always be equal to A.

1.4.7.1 Remarks
(i) The first step in the LU-decomposition can be used to do pivoting; i.e. you can always put the entry with the largest absolute value into the u_mm position. This results in higher numerical stability.
(ii) P arises from the identity matrix by interchanging rows. Therefore it is not necessary to write down the complete matrix. One only has to keep track of which coordinates are interchanged.
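The summary translates almost line by line into code. The following Python/NumPy sketch (an assumed illustration, using 0-based indices instead of the 1-based ones in the text) computes P, L, U with partial pivoting and then solves Ax = b by the three substitution steps; it is applied to the matrix of the example in the next subsection:

```python
import numpy as np

def plu(A):
    """Decompose A = P @ L @ U with partial pivoting."""
    n = len(A)
    P, L, U = np.eye(n), np.eye(n), A.astype(float).copy()
    for k in range(n):
        l = k + np.argmax(np.abs(U[k:, k]))      # pivot row
        if l != k:                               # Step 2a: exchange rows
            U[[k, l]] = U[[l, k]]
            L[[k, l], :k] = L[[l, k], :k]        # only the first k entries
            P[:, [k, l]] = P[:, [l, k]]          # exchange columns of P
        for j in range(k + 1, n):                # Step 2b: eliminate below the pivot
            if U[k, k] != 0:
                L[j, k] = U[j, k] / U[k, k]
                U[j] -= L[j, k] * U[k]
    return P, L, U

def solve(A, b):
    P, L, U = plu(A)
    z = P.T @ b                                  # Step 3
    y = np.zeros_like(z)
    for i in range(len(b)):                      # Step 4: forward substitution
        y[i] = z[i] - L[i, :i] @ y[:i]
    x = np.zeros_like(z)
    for i in reversed(range(len(b))):            # Step 5: back substitution
        x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

A = np.array([[6., 5., -3., -10.], [3., 7., 3., 5.],
              [12., 4., -4., 4.], [0., 12., 0., -8.]])
b = np.array([-10., 14., 8., -8.])
print(solve(A, b))                               # [1. 0. 2. 1.]
```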
1.4.8 Example of LU-Decomposition

A = [  6  5 -3 -10 ]
    [  3  7  3   5 ]
    [ 12  4 -4   4 ]
    [  0 12  0  -8 ]

[P_0 | L_0 | U_0] = [E | E | A]:

[ 1 0 0 0 | 1 0 0 0 |  6  5 -3 -10 ]
[ 0 1 0 0 | 0 1 0 0 |  3  7  3   5 ]
[ 0 0 1 0 | 0 0 1 0 | 12  4 -4   4 ]
[ 0 0 0 1 | 0 0 0 1 |  0 12  0  -8 ]

[P_1 | L̃_0 | Ũ_0] (exchange rows 1 and 3):

[ 0 0 1 0 | 1 0 0 0 | 12  4 -4   4 ]
[ 0 1 0 0 | 0 1 0 0 |  3  7  3   5 ]
[ 1 0 0 0 | 0 0 1 0 |  6  5 -3 -10 ]
[ 0 0 0 1 | 0 0 0 1 |  0 12  0  -8 ]

[P_1 | L_1 | U_1]:

[ 0 0 1 0 | 1   0 0 0 | 12  4 -4   4 ]
[ 0 1 0 0 | 1/4 1 0 0 |  0  6  4   4 ]
[ 1 0 0 0 | 1/2 0 1 0 |  0  3 -1 -12 ]
[ 0 0 0 1 | 0   0 0 1 |  0 12  0  -8 ]

[P_2 | L̃_1 | Ũ_1] (exchange rows 2 and 4):

[ 0 0 1 0 | 1   0 0 0 | 12  4 -4   4 ]
[ 0 0 0 1 | 0   1 0 0 |  0 12  0  -8 ]
[ 1 0 0 0 | 1/2 0 1 0 |  0  3 -1 -12 ]
[ 0 1 0 0 | 1/4 0 0 1 |  0  6  4   4 ]

[P_2 | L_2 | U_2]:

[ 0 0 1 0 | 1   0   0 0 | 12  4 -4   4 ]
[ 0 0 0 1 | 0   1   0 0 |  0 12  0  -8 ]
[ 1 0 0 0 | 1/2 1/4 1 0 |  0  0 -1 -10 ]
[ 0 1 0 0 | 1/4 1/2 0 1 |  0  0  4   8 ]

[P_3 | L̃_2 | Ũ_2] (exchange rows 3 and 4):

[ 0 0 0 1 | 1   0   0 0 | 12  4 -4   4 ]
[ 0 0 1 0 | 0   1   0 0 |  0 12  0  -8 ]
[ 1 0 0 0 | 1/4 1/2 1 0 |  0  0  4   8 ]
[ 0 1 0 0 | 1/2 1/4 0 1 |  0  0 -1 -10 ]

[P_3 | L_3 | U_3]:

[ 0 0 0 1 | 1   0    0   0 | 12  4 -4   4 ]
[ 0 0 1 0 | 0   1    0   0 |  0 12  0  -8 ]
[ 1 0 0 0 | 1/4 1/2  1   0 |  0  0  4   8 ]
[ 0 1 0 0 | 1/2 1/4 -1/4 1 |  0  0  0  -8 ]

A = PLU = P_3 L_3 U_3 with

P = [ 0 0 0 1 ]    L = [ 1   0    0   0 ]    U = [ 12  4 -4   4 ]
    [ 0 0 1 0 ]        [ 0   1    0   0 ]        [  0 12  0  -8 ]
    [ 1 0 0 0 ]        [ 1/4 1/2  1   0 ]        [  0  0  4   8 ]
    [ 0 1 0 0 ]        [ 1/2 1/4 -1/4 1 ]        [  0  0  0  -8 ]
1.4.9 Solving a Linear Equation System

Ax = b with A = [  6  5 -3 -10 ]   and   b = [ -10 ]
                [  3  7  3   5 ]             [  14 ]
                [ 12  4 -4   4 ]             [   8 ]
                [  0 12  0  -8 ]             [  -8 ]

Step 1: Solve Pz = b:

z = P^T b = [ 0 0 1 0 ] [ -10 ]   [   8 ]
            [ 0 0 0 1 ] [  14 ] = [  -8 ]
            [ 0 1 0 0 ] [   8 ]   [  14 ]
            [ 1 0 0 0 ] [  -8 ]   [ -10 ]

Step 2: Solve Ly = z, i.e.

[ 1   0    0   0 ] [ y_1 ]   [   8 ]
[ 0   1    0   0 ] [ y_2 ] = [  -8 ]
[ 1/4 1/2  1   0 ] [ y_3 ]   [  14 ]
[ 1/2 1/4 -1/4 1 ] [ y_4 ]   [ -10 ]

Line by line one has y_1 = 8, y_2 = -8, then 2 - 4 + y_3 = 14 => y_3 = 16, and 4 - 2 - 4 + y_4 = -10 => y_4 = -8.

Step 3: Solve Ux = y, i.e.

[ 12  4 -4   4 ] [ x_1 ]   [  8 ]
[  0 12  0  -8 ] [ x_2 ] = [ -8 ]
[  0  0  4   8 ] [ x_3 ]   [ 16 ]
[  0  0  0  -8 ] [ x_4 ]   [ -8 ]

Line by line one has -8x_4 = -8 => x_4 = 1, 4x_3 + 8 = 16 => x_3 = 2, 12x_2 - 8 = -8 => x_2 = 0 and 12x_1 - 8 + 4 = 8 => x_1 = 1, so

x = [ 1  0  2  1 ]^T.
1.4.10 Short Form
(i) Use the zeroes in the U-matrix to store the elements below the diagonal of the L-matrix. Divide these areas of the U-matrix by a line.
(ii) Instead of the P-matrix use a vector (initially p = [1 2 3 4]^T) containing the numbers of the rows of the right-side vector b.

Then a pivoting operation results in exchanging whole rows in U and p.
1.4.11 Example

[P_0 | L_0 | U_0] = [E | E | A]:

[  6  5 -3 -10 ]        [ 1 ]
[  3  7  3   5 ]    p = [ 2 ]
[ 12  4 -4   4 ]        [ 3 ]
[  0 12  0  -8 ]        [ 4 ]

[P_1 | L̃_0 | Ũ_0] (exchange rows 1 and 3):

[ 12  4 -4   4 ]        [ 3 ]
[  3  7  3   5 ]    p = [ 2 ]
[  6  5 -3 -10 ]        [ 1 ]
[  0 12  0  -8 ]        [ 4 ]

[P_1 | L_1 | U_1]:

[ 12   4 -4   4 ]        [ 3 ]
[ 1/4  6  4   4 ]    p = [ 2 ]
[ 1/2  3 -1 -12 ]        [ 1 ]
[  0  12  0  -8 ]        [ 4 ]

[P_2 | L̃_1 | Ũ_1] (exchange rows 2 and 4):

[ 12   4 -4   4 ]        [ 3 ]
[  0  12  0  -8 ]    p = [ 4 ]
[ 1/2  3 -1 -12 ]        [ 1 ]
[ 1/4  6  4   4 ]        [ 2 ]

[P_2 | L_2 | U_2]:

[ 12   4   -4   4 ]        [ 3 ]
[  0  12    0  -8 ]    p = [ 4 ]
[ 1/2  1/4 -1 -10 ]        [ 1 ]
[ 1/4  1/2  4   8 ]        [ 2 ]

[P_3 | L̃_2 | Ũ_2] (exchange rows 3 and 4):

[ 12   4   -4   4 ]        [ 3 ]
[  0  12    0  -8 ]    p = [ 4 ]
[ 1/4  1/2  4   8 ]        [ 2 ]
[ 1/2  1/4 -1 -10 ]        [ 1 ]

[P_3 | L_3 | U_3]:

[ 12   4    -4    4 ]        [ 3 ]
[  0  12     0   -8 ]    p = [ 4 ]
[ 1/4  1/2   4    8 ]        [ 2 ]
[ 1/2  1/4  -1/4 -8 ]        [ 1 ]

Decompose this and put the L- and U-parts into the right form:

L = [ 1   0    0   0 ]    and    U = [ 12  4 -4   4 ]
    [ 0   1    0   0 ]               [  0 12  0  -8 ]
    [ 1/4 1/2  1   0 ]               [  0  0  4   8 ]
    [ 1/2 1/4 -1/4 1 ]               [  0  0  0  -8 ]

In z = P^T b one has b = [ b_1, b_2, b_3, b_4 ]^T = [ -10, 14, 8, -8 ]^T, so z = [ b_3, b_4, b_2, b_1 ]^T = [ 8, -8, 14, -10 ]^T, and the rest is as above.

If one wants P explicitly, one has from p: P = [ e_3, e_4, e_2, e_1 ].
1.5 Eigenvalues and Eigenvectors

1.5.1 Definition and properties
Let A be a square matrix.

(i) If λ ∈ C and v ≠ 0 is a vector with Av = λv, then v is called an eigenvector of A to the eigenvalue λ.
(ii) It is Av = λv with v ≠ 0
     ⇔ there is a vector v ≠ 0 with (A - λE)v = 0
     ⇔ the kernel of A - λE is non-trivial
     ⇔ A - λE is not regular
     ⇔ det(A - λE) = 0.
     As det(A - λE) is a polynomial of degree n in λ, we define: p(λ) = det(A - λE) is called the characteristic polynomial of A.
     Therefore a (complex) number λ is an eigenvalue of A if λ is a zero of the characteristic polynomial.
(iii) A has at least one eigenvalue and at least one eigenvector to each eigenvalue.
(iv) If λ is a k-fold zero of p, then o(λ) = k is called the algebraic multiplicity of λ.
     The geometric multiplicity γ(λ) is the dimension of the kernel of A - λE, that is the dimension of the eigenspace of A and λ.
(v) A vector v is called a generalized eigenvector of k-th order to λ if the following holds:
     (A - λE)^k v = 0, but (A - λE)^(k-1) v ≠ 0.
(vi) Because of (A - λE)^0 v = Ev = v the eigenvectors are just the generalized eigenvectors of first order. If v is a generalized eigenvector of k-th order then (A - λE)v is a generalized eigenvector of order (k - 1).
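Numerically, eigenvalues and eigenvectors are obtained from library routines rather than from the characteristic polynomial; a short NumPy sketch (an assumed illustration):

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 2.]])

eigvals, eigvecs = np.linalg.eig(A)     # columns of eigvecs are eigenvectors
print(eigvals)                          # eigenvalues 3 and 1 (order may vary)
for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))  # True: A v = lambda v

# the characteristic polynomial det(A - lambda E) vanishes at the eigenvalues
print(np.linalg.det(A - eigvals[0] * np.eye(2)))   # ~0
```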
1.5.2 More properties
(i) Let C = PAP⁻¹. Then A and C have the same characteristic polynomial.
(ii) If v is a (generalized) eigenvector of A then Pv is a (generalized) eigenvector of C (of the same order).
(iii) Let A be a square k×k matrix with the property that the diagonal and everything below the diagonal is zero. Then A^k = 0.
(iv) Let A be an (upper or lower) triangular matrix. Then the eigenvalues of A are the diagonal elements.

This shows that eigenvalues are properties of the linear map rather than of the representing matrix.
1.5.3 Lemma
Let C be an m×m matrix. Then there exists an invertible m×m matrix P so that

S = P⁻¹CP = [ λ  *  ...  * ]
            [ 0  *  ...  * ]
            [ ...          ]
            [ 0  *  ...  * ]

where λ is an eigenvalue of C.

1.5.4 Theorem: Schur Form
Let A be an n×n matrix. Then there exists an invertible matrix P and an upper triangular matrix U with A = PUP⁻¹.
U has the same characteristic polynomial as A, so the diagonal of U contains the eigenvalues of A with the same multiplicities.
1.5.5 Consequences
(i) Always 1 ≤ γ(λ) ≤ o(λ) ≤ n holds.
    If γ(λ) < o(λ) then for sufficiently large k the dimension of the kernel of (A - λE)^k is equal to the algebraic multiplicity o(λ).
(ii) The generalized eigenspace to λ is the span of all generalized eigenvectors to λ. Its dimension is o(λ), i.e. there are in total as many linearly independent generalized eigenvectors to λ as the order of λ as a zero of the characteristic polynomial.
    In particular for a simple zero of the characteristic polynomial we have: there is a one-dimensional eigenspace and there are no generalized eigenvectors of higher order.
(iii) (Generalized) eigenvectors to distinct eigenvalues are linearly independent.
(iv) A real matrix is called (real) diagonalisable if
    (1) the characteristic polynomial has only real zeroes, and
    (2) for each zero the algebraic and the geometric multiplicity are equal.
    This means that there is a basis of R^n consisting of eigenvectors of A, resp. that there are no generalized eigenvectors of higher order.
(v) Accordingly a complex matrix is called complex diagonalisable if for every eigenvalue the algebraic and geometric multiplicity are the same.
(vi) The spectrum of A is the set of eigenvalues, denoted by σ(A).
1.5.6 Jordan-Form
If λ is an eigenvalue of the matrix A and v is a corresponding eigenvector, then Av = λv.
If v is a generalized eigenvector of order k + 1 then u = (A - λE)v is a generalized eigenvector of order k. In this case we have Av = λv + u.
Putting these two cases together we get the important theorem on the Jordan form of a matrix:

1.5.6.1 Jordan-Form
Let L be an endomorphism of C^n. Then there exists a basis of C^n so that in this basis L has a block-matrix representation

J = [ J_1  0   ...  0  ]                      [ λ_r  1    0   ...  0   ]
    [ 0    J_2 ...  0  ]       where J_r =    [ 0    λ_r  1   ...  0   ]
    [ ...          ... ]                      [ ...            ...  1  ]
    [ 0    0   ...  J_p ]                     [ 0    0    0   ...  λ_r ]

The numbers λ_r are (not necessarily distinct) eigenvalues. The blocks J_r are called Jordan blocks.

If J_r has size k and u_1, ..., u_k are the basis vectors associated to the block J_r, then we have

L u_1 = λ_r u_1,  and for 2 ≤ s ≤ k we have L u_s = λ_r u_s + u_(s-1).   (*)

That means that u_1 is an eigenvector and the u_s are generalized eigenvectors of order s. The (ordered) set u_1, ..., u_k is called a Jordan chain.

Now let u_{1,1} ... u_{1,k_1}, u_{2,1} ... u_{2,k_2}, ..., u_{p,1} ... u_{p,k_p} be the Jordan chains associated with the Jordan blocks J_1, ..., J_p. The matrix

U = [ u_{1,1} ... u_{1,k_1} ... u_{p,1} ... u_{p,k_p} ]

fulfills

AU = UJ  ⇔  A = UJU⁻¹  ⇔  J = U⁻¹AU.

This is easily seen by looking at the column vectors in the products, because this is just the equation (*) in each column.

1.5.6.2 Remark
If each J_r has size 1 then there exists a basis of eigenvectors and there are no generalized eigenvectors of order greater than one. In this case the matrix is diagonalisable.
1.5.7 Example

A := [ 2 0 1 0 0 0 0 0 0 0 ]
     [ 1 2 0 0 0 0 0 0 0 0 ]
     [ 0 0 2 0 0 0 0 0 0 0 ]
     [ 0 0 0 2 0 0 0 1 0 0 ]
     [ 0 0 0 0 2 0 0 0 1 0 ]
     [ 0 0 0 1 0 2 0 0 0 0 ]
     [ 0 0 0 0 0 0 2 0 0 0 ]
     [ 0 0 0 0 0 0 0 2 0 0 ]
     [ 0 0 0 0 0 0 0 0 2 0 ]
     [ 0 0 0 0 0 0 0 0 0 2 ]

p(λ) = (2 - λ)^10, so 2 is a 10-fold eigenvalue of A.

B := A - 2E = [ 0 0 1 0 0 0 0 0 0 0 ]
              [ 1 0 0 0 0 0 0 0 0 0 ]
              [ 0 0 0 0 0 0 0 0 0 0 ]
              [ 0 0 0 0 0 0 0 1 0 0 ]
              [ 0 0 0 0 0 0 0 0 1 0 ]
              [ 0 0 0 1 0 0 0 0 0 0 ]
              [ 0 0 0 0 0 0 0 0 0 0 ]
              [ 0 0 0 0 0 0 0 0 0 0 ]
              [ 0 0 0 0 0 0 0 0 0 0 ]
              [ 0 0 0 0 0 0 0 0 0 0 ]

B² = [ 0 0 0 0 0 0 0 0 0 0 ]
     [ 0 0 1 0 0 0 0 0 0 0 ]
     [ 0 0 0 0 0 0 0 0 0 0 ]
     [ 0 0 0 0 0 0 0 0 0 0 ]
     [ 0 0 0 0 0 0 0 0 0 0 ]
     [ 0 0 0 0 0 0 0 1 0 0 ]
     [ 0 0 0 0 0 0 0 0 0 0 ]
     [ 0 0 0 0 0 0 0 0 0 0 ]
     [ 0 0 0 0 0 0 0 0 0 0 ]
     [ 0 0 0 0 0 0 0 0 0 0 ]

Furthermore B³ = 0.
[Diagram: the nested kernels ker B⁰ ⊂ ker B ⊂ ker B² ⊂ ker B³ with dimensions r_0 = 0, r_1 = 5, r_2 = 8, r_3 = 10; the complements U_1, U_2, U_3 (with ker B^k = ker B^(k-1) ⊕ U_k) have dimensions s_1 = 5, s_2 = 3, s_3 = 2.]

One has, for v ∈ ker B^ℓ with v ≠ 0: B^(ℓ-1)(Bv) = 0, i.e. Bv ∈ ker B^(ℓ-1).
So B is injective between U_3, U_2 and U_1.
[Diagram: the basis vectors b_31, b_32 of U_3 are mapped by B to b_21, b_22 in U_2, which is completed by b_23; these in turn are mapped by B to b_11, b_12, b_13 in U_1, which is completed by b_14 and b_15.]
Choose a basis b_31 and b_32 of U_3. From this define

(i) B b_31 = b_21, B b_21 = b_11 and B b_11 = 0 (Jordan chain of length 3).
(ii) B b_32 = b_22, B b_22 = b_12 and B b_12 = 0 (Jordan chain of length 3).

In the 3-dimensional space U_2 the vectors b_21 and b_22 are completed to a basis by b_23. So one has

(iii) B b_23 = b_13, B b_13 = 0 (Jordan chain of length 2).

In the end the vectors in U_1 that are already determined are completed to a basis:

(iv) B b_14 = 0 (Jordan chain of length 1).
(v) B b_15 = 0 (Jordan chain of length 1).

With this the map B is uniquely described in the basis b_ij.
If one observes that

Bv = 0  ⇔  (A - λI)v = 0  ⇔  Av = λv
Bv = w  ⇔  (A - λI)v = w  ⇔  Av = λv + w,

one has, with respect to the basis b_11, b_21, b_31, b_12, b_22, b_32, b_13, b_23, b_14 and b_15, the following matrix representation of A (here λ = 2):

J := [ 2 1 0 0 0 0 0 0 0 0 ]
     [ 0 2 1 0 0 0 0 0 0 0 ]
     [ 0 0 2 0 0 0 0 0 0 0 ]
     [ 0 0 0 2 1 0 0 0 0 0 ]
     [ 0 0 0 0 2 1 0 0 0 0 ]
     [ 0 0 0 0 0 2 0 0 0 0 ]
     [ 0 0 0 0 0 0 2 1 0 0 ]
     [ 0 0 0 0 0 0 0 2 0 0 ]
     [ 0 0 0 0 0 0 0 0 2 0 ]
     [ 0 0 0 0 0 0 0 0 0 2 ]

J is the (better: a) Jordan form of the map A.
Gather the vectors b_11, b_21, ..., b_15 in a matrix C. Then it follows that AC = CJ, so A = CJC⁻¹.
Calculation with numbers

U_1 is the kernel of B. It consists of all vectors having a zero in positions 1, 3, 4, 8 and 9. Because in general there is no canonical choice of bases we describe U_1 as

U_1 = [ e_2 - e_5, e_2 + e_5, e_6 - e_2, e_7 - e_2, e_10 - e_2 ].

The kernel of B² consists of all vectors having a zero in positions 3 and 8. So U_1 is completed by

U_2 = [ e_1 + e_4, e_1 - e_4, e_1 + e_9 ]

to a basis of ker B².
ker B³ consists of all vectors. So we choose

U_3 := [ e_3, e_8 ].

Now construct the Jordan chains:

B e_3 = e_1, B e_1 = e_2, B e_2 = 0; these are b_31, b_21 and b_11.
B e_8 = e_4, B e_4 = e_6, B e_6 = 0; these are b_32, b_22 and b_12.

These are the chains of length 3.

In U_2 we have to complete the images of the vectors of U_3 (e_1 and e_4) to a basis. So we choose b_23 = e_1 + e_9 and build the next Jordan chain:

B(e_1 + e_9) = e_2 + e_5, B(e_2 + e_5) = 0; these are b_23 and b_13.

In U_1 the span of e_2, e_6 and e_2 + e_5 has to be completed to a basis. Therefore we choose b_14 = e_10 - e_2 and b_15 = e_7 - e_2.
With this we have: in the basis b_11, b_21, b_31, b_12, b_22, b_32, b_13, b_23, b_14 and b_15, A has the form J stated above.
Here we have C = (e_2, e_1, e_3, e_6, e_4, e_8, e_2+e_5, e_1+e_9, e_10-e_2, e_7-e_2) and so

C = [ 0 1 0 0 0 0 0 1  0  0 ]
    [ 1 0 0 0 0 0 1 0 -1 -1 ]
    [ 0 0 1 0 0 0 0 0  0  0 ]
    [ 0 0 0 0 1 0 0 0  0  0 ]
    [ 0 0 0 0 0 0 1 0  0  0 ]
    [ 0 0 0 1 0 0 0 0  0  0 ]
    [ 0 0 0 0 0 0 0 0  0  1 ]
    [ 0 0 0 0 0 1 0 0  0  0 ]
    [ 0 0 0 0 0 0 0 1  0  0 ]
    [ 0 0 0 0 0 0 0 0  1  0 ]

and

C⁻¹ = [ 0 1 0 0 -1 0 1 0  0 1 ]
      [ 1 0 0 0  0 0 0 0 -1 0 ]
      [ 0 0 1 0  0 0 0 0  0 0 ]
      [ 0 0 0 0  0 1 0 0  0 0 ]
      [ 0 0 0 1  0 0 0 0  0 0 ]
      [ 0 0 0 0  0 0 0 1  0 0 ]
      [ 0 0 0 0  1 0 0 0  0 0 ]
      [ 0 0 0 0  0 0 0 0  1 0 ]
      [ 0 0 0 0  0 0 0 0  0 1 ]
      [ 0 0 0 0  0 0 1 0  0 0 ]

The Jordan theorem now tells us that A = CJC⁻¹ and J = C⁻¹AC.
1.5.7.1 Algorithm
We look for the Jordan form and transformation matrices of an endo-
morphism A on R
n
(or C
n
), so A = CJC
1
.
(i) Calculate p() = det(A E) and nd all zeroes. These are the
eigenvalues.
(ii) For each eigenvalue perform the following process:
m
1 For construct B := A E and determine the spaces U
i
,
until the dimension of the kernel of B
k
(this is equal to the
sum of the dimensions of the U
i
) is equal to the algebraic
multiplicity of .
This is done iteratively: rst nd (with aid of the Gaua-
algorithm) a basis of the kernel of B. This is U
1
.
Then compute B
2
and nd a basis of its kernel by completing
the basis of U
1
by other vectors. These completing vectors
form a basis of U
2
.
Now nd a basis of U
3
by completing the basis of ker B
2
by
some vectors to a basis of ker B
3
and so on.
m
2 Now construct the Jordan chains:
the basis of U
3
(in general: U
k
with the highest k) is mapped
by B in U
2
; and then is completed to a basis of U
2
by vectors
that have been computed in
m
1 .
This basis is mapped by B; and the images are completed to
a basis of U
1
.
Each j -tuple v, Bv, . . . B
j 1
v of basis vectors with a starting
vector v U
j
forms a Jordan chain of length j .
m
3 When in total

basis vectors are found, the work is done


for this eigenvalue.
AEM 1- 51
(iii) Each Jordan chain v, Bv, ..., B^{j−1}v is written down in reverse order (so starting with the eigenvector) B^{j−1}v, B^{j−2}v, ..., v and gathered into the matrix C.

In the Jordan matrix J each chain corresponds to a Jordan block of size j × j having the form

  J(j, λ) = [ λ 1 0 ... 0 ]
            [ 0 λ 1 ... 0 ]
            [   ...       ]
            [ 0 0 ... λ 1 ]
            [ 0 0 ... 0 λ ]

with the eigenvalue λ.

The Jordan matrix J is then the block diagonal matrix consisting of the single Jordan blocks.
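A computer algebra system can serve as a cross-check for this algorithm. The following minimal sketch uses sympy's jordan_form on a small assumed example matrix (not the 10 × 10 matrix of the worked example above).

```python
# Sketch: Jordan form with sympy; the 4x4 matrix is an assumed illustration.
from sympy import Matrix, pprint

A = Matrix([[5, 4, 2, 1],
            [0, 1, -1, -1],
            [-1, -1, 3, 0],
            [1, 1, -1, 2]])

# jordan_form() returns C and J with A = C * J * C**(-1)
C, J = A.jordan_form()
pprint(J)                                   # block diagonal Jordan matrix
print((C * J * C.inv() - A).is_zero_matrix) # True: verifies A = C J C^{-1}
```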
1.6 Special Properties of Symmetric Matrices

A matrix is called orthogonal iff the columns form an orthonormal basis. Equivalently one can say
  A^T = A^{-1}   or   A^T A = A A^T = E_n.
In the complex case a matrix is called unitary if
  A* = A^{-1}   or   A* A = A A* = E_n.

The importance of these notions lies in the fact that for arbitrary vectors v and w and an orthogonal or unitary matrix A the following holds:
  ‖Av‖ = ‖v‖   and   <Av, Aw> = <A^T A v, w> = <v, w>.
An orthogonal transformation changes neither angles nor lengths. The proof of these facts is given below.

This subsection contains facts about symmetric or hermitian matrices. Recall that a real matrix is called symmetric if A = A^T and a complex matrix hermitian if A = A*. For real matrices these definitions coincide.

The following statements are formulated for the complex case, because the (more important) real case is contained in it.
1.6.1 Properties of Symmetric and Hermitian Matrices

Let A be a hermitian n × n-matrix.

(i) The eigenvalues of A are real.
(ii) If λ ≠ μ are eigenvalues and v_1 and v_2 are eigenvectors to λ resp. μ, then <v_1, v_2> = 0.
(iii) For each eigenvalue the geometrical and the algebraic multiplicity are equal.
(iv) There exists an ON-basis of eigenvectors of A.
(v) There is a unitary matrix U and a real diagonal matrix D with A = U D U*. (Remember: U unitary ⇔ U* = U^{-1}.)
1.6.2 Orthogonal Matrices

A square matrix is called orthogonal (or unitary in the complex case) if A^T A = E resp. A* A = E. As the real case is more important, we restrict our further results to this case. The complex case can be proved analogously.

1.6.2.1 Properties of Orthogonal Matrices

The following statements are equivalent:

(i) A is orthogonal.
(ii) A^T = A^{-1}.
(iii) The columns of A form an orthonormal basis.
(iv) The rows of A form an orthonormal basis.
(v) For v, w ∈ R^n we have <v, w> = <Av, Aw>.
(vi) For each v ∈ R^n we have ‖Av‖ = ‖v‖.

1.6.2.2 Further Properties

Let A be orthogonal.

(i) For v, w one has ∠(v, w) = ∠(Av, Aw).
(ii) The determinant of A is either 1 or −1.
1.7 Singular Value Decomposition (SVD)

1.7.1 Preparations

We start with studying the properties of the product A^T A.

Lemma

(i) The kernels of A and A^T A are equal.
(ii) rank A = rank A^T A.
(iii) From the last point and the well-known fact rank A = rank A^T we have that all four matrices A, A^T, A A^T and A^T A have the same rank.
(iv) From (A^T A)^T = A^T (A^T)^T = A^T A it follows that A^T A is symmetric.
(v) Immediately it follows that the eigenvalues of A^T A are real and that there exists an ON-basis of R^n consisting of eigenvectors. Putting these together into a matrix we get an orthogonal matrix V = (v_1, ..., v_n).
(vi) The eigenvalues of A^T A are non-negative.
(vii) If v is an eigenvector to the eigenvalue λ ≠ 0 of A^T A, then w = Av is an eigenvector to the same eigenvalue of A A^T.
(viii) Analogously it follows: if u is an eigenvector of A A^T to λ ≠ 0, then A^T u is an eigenvector of A^T A to λ.
(ix) Let v_i and v_j be elements of the ON-basis of the eigenspace of A^T A to λ ≠ 0. Then we have
  λ δ_ij = λ <v_i, v_j> = <λ v_i, v_j> = <A^T A v_i, v_j> = <A v_i, A v_j>.
This shows that the A v_i form an orthogonal system and hence the dimension of the eigenspace of A A^T to λ must be greater or equal than the dimension of the corresponding eigenspace of A^T A. By symmetry it follows that these two numbers are equal.
1.7.2 Existence and Construction of the SVD

1.7.2.1 Theorem

Let A be an m × n-matrix. Then there exist an orthogonal n × n-matrix V, an orthogonal m × m-matrix U, and an m × n-matrix S = (s_ij) with s_ii ≥ 0 so that
  A = U S V^T.
The matrix S = (s_ij) is a matrix of diagonal type, i.e. for i ≠ j one has s_ij = 0.

1.7.2.2 Algorithm

Step 1: Form B = A^T A. This is an n × n-matrix.

Step 2: Compute the eigenvalues of B. These are non-negative and are numbered in the sequence
  λ_1 ≥ λ_2 ≥ ... ≥ λ_k > λ_{k+1} = ... = λ_n = 0.
The fact that k is the rank of the matrix A (and the rank of A^T A, too) can be used as a crosscheck.

Step 3: Find an ON-basis v_1, ..., v_n of R^n; here v_i is an eigenvector to the eigenvalue λ_i. V := [v_1, ..., v_n] becomes an orthogonal matrix (V^T = V^{-1}).

Step 4: The singular values of A are defined as s_i = √(λ_i). The matrix S = (s_ij) is a matrix of diagonal type, i.e. for i ≠ j one has s_ij = 0. S has the same shape as A, i.e. n columns and m rows. The elements in the diagonal are given by the singular values: s_ii = s_i.

Step 5: For i ≤ k define the vectors u_i = (1/s_i) A v_i. They form an orthonormal system. Complete these vectors to an ON-basis u_1, ..., u_m of R^m and gather them into the matrix U = [u_1, ..., u_m].

Step 6: The singular value decomposition of A is
  A = U S V^T.
1.7.2.3 Remark

In many cases the vectors in V and U belonging to the eigenvalue zero are not needed. In this case the corresponding entries are denoted by stars (*) and are not explicitly calculated. This is called the simplified version of the SVD.
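As a quick numerical illustration of Steps 1-6 (in the simplified version, i.e. without the star entries), the following sketch builds the SVD of a small assumed 3 × 2 matrix with numpy and checks it against numpy's own routine.

```python
# Sketch: SVD construction via the eigenvalues of A^T A (assumed example matrix).
import numpy as np

A = np.array([[2.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])

# Steps 1-3: eigen-decomposition of B = A^T A (eigh sorts ascending, so reverse)
lam, V = np.linalg.eigh(A.T @ A)
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]

s = np.sqrt(np.clip(lam, 0.0, None))   # Step 4: singular values s_i = sqrt(lambda_i)
U = A @ V / s                           # Step 5 (both singular values are nonzero here)

print(np.allclose(A, U @ np.diag(s) @ V.T))   # simplified SVD: A = U_r S_r V^T
print(np.linalg.svd(A, compute_uv=False))     # cross-check with numpy's own SVD
```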
1.7.2.4 Further Properties

If A = U S V^T is the SVD of A, then A^T has the SVD A^T = V S^T U^T. If A is invertible, then A^{-1} = V S^{-1} U^T.
1.8 Generalized Inverses

The singular value decomposition can be used to construct approximate solutions of (possibly) non-square linear equation systems.

Given an m × n-matrix A and a vector b ∈ R^m we are looking for a vector x ∈ R^n so that the norm
  ‖Ax − b‖_2 = min!

Substituting the SVD of A and remembering that for the orthogonal matrix U the matrix U^T = U^{-1} is orthogonal too, with ‖u‖ = ‖U^T u‖ for each u ∈ R^m, we get
  ‖Ax − b‖_2 = ‖U S V^T x − b‖_2 = ‖U^T U S V^T x − U^T b‖_2 = ‖S (V^T x) − U^T b‖_2 =: ‖S z − d‖_2   (*)
with z := V^T x and d := U^T b.

The solutions of this equation are given by
  z_j = (1/s_j) d_j   for j = 1, ..., k,
  z_j arbitrary       for j > k.

As V is orthogonal we get all solutions x as
  x = V z = Σ_{j=1}^{k} (1/s_j) d_j v_j + Σ_{j=k+1}^{n} z_j v_j.

Because V is orthogonal, the norm of x is given by ( Σ_{j=1}^{n} z_j² )^{1/2}. Therefore the solution with the smallest norm is
  x^+ = V z = Σ_{j=1}^{k} (1/s_j) d_j v_j.

This solution is called the pseudo-normal solution.

One sees that the mapping b ↦ x^+ is given by the matrix A^+ := V S̃ U^T with the diagonal-type matrix S̃ := (σ_j δ_ij), where σ_j is defined by
  σ_j = 1/s_j   for j ≤ k,
  σ_j = 0       for j > k.

1.8.0.5 Definition

The so defined matrix A^+ is called the generalized inverse or Moore-Penrose inverse of A.
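A minimal numerical sketch of the pseudo-normal solution, using numpy's built-in Moore-Penrose inverse on assumed data; the second computation repeats the SVD construction from above.

```python
# Sketch: pseudo-normal solution x+ = A^+ b for an assumed 3x2 system.
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

x_plus = np.linalg.pinv(A) @ b             # x+ via the Moore-Penrose inverse
U, s, Vt = np.linalg.svd(A, full_matrices=False)
x_svd = Vt.T @ ((U.T @ b) / s)             # same x+ from the SVD construction
print(np.allclose(x_plus, x_svd))          # True
```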
1.8.0.6 Further Properties

We have (A^T)^+ = (A^+)^T.

1.8.1 Special case: A injective

If A is injective then A has rank n and the pseudo-normal solution of every equation Ax = b is unique. Furthermore in this case A^T A is invertible (because rank A = n and the rank of A^T A is equal to the rank of A).

In this case we can calculate x^+ without explicit construction of the SVD: using A^T A = V S^T U^T U S V^T = V S^T S V^T we get from the equation (*) above:
  S V^T x^+ = U^T b
  ⇒ V S^T S V^T x^+ = V S^T U^T b
  ⇒ A^T A x^+ = A^T b
  ⇒ x^+ = (A^T A)^{-1} A^T b.

So in this case
  A^+ = (A^T A)^{-1} A^T.

If one wants only x^+ it is sufficient to solve A^T A x^+ = A^T b.
1.9 Applications to linear equation systems

1.9.1 Errors

1.9.1.1 Introductory example

Ax = b with
  A = [ 2 3 4     ]      b = ( 1 )
      [ 2 3 4.001 ]          ( 1 )
      [ 3 4 5     ]          ( 1 )

One easily sees that A is invertible. The solution x is uniquely determined: the exact solution is x = (−1, 1, 0)^T. On the other hand y = (−0.5, 0, 0.5)^T looks almost like a solution, because A y = (1, 1.0005, 1)^T. From this one sees that the given equation system is very unstable with respect to perturbations.

If one calculates the solution of the slightly perturbed system A x_1 = b_1 with b_1 = (1, 0.9, 1)^T, one gets x_1 = (−101.0000, 201.0000, −100.0000)^T.
1.9.1.2 Theorem

Let x be the solution of Ax = b. If we compare the solution x + Δx of the disturbed system A(x + Δx) = b + Δb with x, we get the relative error
  ‖Δx‖ / ‖x‖ ≤ ‖A‖ ‖A^{-1}‖ · ‖Δb‖ / ‖b‖.
The number κ(A) = cond A = ‖A‖ ‖A^{-1}‖ is called the condition of A.

With a little more effort it is possible to prove:

Theorem. If x is the solution of Ax = b and x + Δx the solution of (A + ΔA)(x + Δx) = b + Δb, then the following estimate for the relative error holds:
  ‖Δx‖ / ‖x‖ ≤ κ(A) / ( 1 − κ(A) ‖ΔA‖/‖A‖ ) · ( ‖Δb‖/‖b‖ + ‖ΔA‖/‖A‖ ).
For small values of ‖ΔA‖ the right side is approximately equal to
  κ(A) ( ‖Δb‖/‖b‖ + ‖ΔA‖/‖A‖ ).
1.9.2 Numerical Rank Deficiency

Numerical rank deficiency appears if a matrix is close to another matrix with smaller rank. This leads to a very large condition number: small variations in the initial data of Ax = b lead to large variations in the result x.

The SVD of A is A = U S V^T with the singular values s_1 ≈ 10, s_2 ≈ 0.4 and s_3 ≈ 1/3000.

To avoid these effects one can proceed as follows:

Step 1: Decompose A = U S V^T.

Step 2: The matrix S_1 is built out of S by replacing all entries smaller than a given bound by zero, and A_1 = U S_1 V^T.
This is reasonable: one can prove that entries in S that are smaller than the machine accuracy multiplied by the Frobenius norm of the matrix have no influence on the result.

Step 3: Instead of the solutions of Ax = b find the pseudo-normal solutions of A_1 x = b with
  x^+ = A_1^+ b = V S_1^+ U^T b.
In the example one has A = U S V^T with
  S = diag( 10.3873, 0.3338, 0.0003 )
and orthogonal matrices U and V.

We change the third singular value to zero and get
  S_1 = diag( 10.3873, 0.3338, 0 )   and   S_1^+ = diag( 0.0963, 2.9961, 0 ).

Then
  A_1^+ = [ −1.1633 −1.1674  1.8314 ]
          [  0.1669  0.1676 −0.3342 ]
          [  0.8316  0.8344 −1.1662 ]
and
  x^+  = A_1^+ (1, 1, 1)^T   = ( −0.4992, 0.0002, 0.4997 )^T,
  x_1^+ = A_1^+ (1, 0.9, 1)^T = ( −0.3825, −0.0165, 0.4163 )^T.

In the original problem we have
  A x^+ = ( 0.9998, 0.9999, 1.0001 )^T   and   A x_1^+ ≈ ( 0.9499, 0.9499, 1.0003 )^T.
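A short numpy sketch of Steps 1-3 for this example; the data are the ones above, and the printed vectors should agree with x^+ and x_1^+ up to rounding.

```python
# Sketch: truncated-SVD regularisation for the example of section 1.9.
import numpy as np

A = np.array([[2, 3, 4],
              [2, 3, 4.001],
              [3, 4, 5]])
b  = np.array([1.0, 1.0, 1.0])
b1 = np.array([1.0, 0.9, 1.0])

U, s, Vt = np.linalg.svd(A)
s_inv = np.array([1.0/si if si > 1e-2 else 0.0 for si in s])  # drop the tiny singular value
A1_plus = Vt.T @ np.diag(s_inv) @ U.T                          # A_1^+ = V S_1^+ U^T

print(A1_plus @ b)    # pseudo-normal solution for b
print(A1_plus @ b1)   # stays close to it although b was perturbed
```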
1.9.3 Application: Best Fit Functions

Other name: Gauss method of least squares

1.9.3.1 Most important case: best fit straight line

Starting point are n > 2 pairs of coordinates (x_i, y_i), such that at least two different x-values occur.

We search for a line y = ax + b with the property that the quadratic error
  Σ_{i=1}^{n} ( (a x_i + b) − y_i )²
is as small as possible.

The solution of this problem is the pseudo-normal solution of
  b + a x_1 = y_1
  ...
  b + a x_n = y_n,
or A (b, a)^T = y with
  A = [ 1 x_1 ]           y = ( y_1 )
      [ ...   ]               ( ... )
      [ 1 x_n ]               ( y_n )

As the matrix is injective, the solution is obtained with the aid of the transposed matrix:
  (b, a)^T = (A^T A)^{-1} A^T y.

The coefficient of correlation r measures the quality of the approximation. We always have |r| ≤ 1, and for |r| = 1 the line goes through all points.
Algorithm

All sums run from i = 1 to n.

Step 1: Δ = n Σ x_i² − ( Σ x_i )².

Step 2: The best fit straight line y = ax + b has the coefficients
  a = (1/Δ) ( n Σ x_i y_i − Σ x_i Σ y_i )   and   b = (1/Δ) ( Σ x_i² Σ y_i − Σ x_i y_i Σ x_i ).

Step 3: r = ( n Σ x_i y_i − Σ x_i Σ y_i ) / √( ( n Σ x_i² − (Σ x_i)² ) ( n Σ y_i² − (Σ y_i)² ) ).
Second method

Find the mean values x̄ = (1/n) Σ_{k=1}^{n} x_k and ȳ = (1/n) Σ_{k=1}^{n} y_k. Shift the coordinate system so that (x̄, ȳ) is the new origin by replacing x_k by x_k − x̄ resp. y_k by y_k − ȳ. Then the best fit straight line is given by
  y = v x   with   v = ( Σ_{k=1}^{n} x_k y_k ) / ( Σ_{k=1}^{n} x_k² )
and
  r = ( Σ_{k=1}^{n} x_k y_k ) / ( ( Σ_{k=1}^{n} x_k² )^{1/2} ( Σ_{k=1}^{n} y_k² )^{1/2} ) = <x, y> / ( ‖x‖ ‖y‖ ).

Here it is easy to see that the coefficient of correlation describes the relative error of the approximation:
  ( Σ_{k=1}^{n} (v x_k − y_k)² ) / ( Σ_{k=1}^{n} y_k² ) = 1 − r².
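A minimal sketch of the normal-equation solution and the correlation coefficient with numpy; the data points are assumed sample values.

```python
# Sketch: best fit straight line via (b, a) = (A^T A)^{-1} A^T y.
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1])

A = np.column_stack([np.ones_like(x), x])        # columns: 1, x_i
b, a = np.linalg.solve(A.T @ A, A.T @ y)         # intercept b, slope a

n = len(x)
r = (n*np.sum(x*y) - np.sum(x)*np.sum(y)) / np.sqrt(
        (n*np.sum(x*x) - np.sum(x)**2) * (n*np.sum(y*y) - np.sum(y)**2))
print(a, b, r)                                   # |r| close to 1 for these data
```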
1.9.3.2 General problem

Let (x_i, y_i), i = 1, ..., n, be n pairs of data. Furthermore let f_1, ..., f_k be k < n functions. We look for a linear combination f(x) = Σ_{j=1}^{k} α_j f_j(x) of the f_j so that the sum of the squares of the deviations of f(x_i) from y_i becomes minimal:
  F = Σ_{i=1}^{n} ( f(x_i) − y_i )² = Σ_{i=1}^{n} ( Σ_{j=1}^{k} α_j f_j(x_i) − y_i )²  =  min.

Solution: Solve A a = y in the pseudo-normal sense. Here a = (α_1, ..., α_k)^T contains the coefficients we look for, and
  A = [ f_1(x_1) f_2(x_1) ... f_k(x_1) ]        y = ( y_1 )
      [ f_1(x_2) f_2(x_2) ... f_k(x_2) ]            ( y_2 )
      [   ...                          ]            ( ... )
      [ f_1(x_n) f_2(x_n) ... f_k(x_n) ]            ( y_n )
1.10 Symmetric Matrices and Quadratic Forms

A quadratic form on R^n is a map of the form
  x = (x_1, ..., x_n)^T  ↦  Q(x) = Σ_{i,j=1}^{n} c_ij x_i x_j.
The c_ij are real numbers with c_ij = c_ji. With the symmetric matrix C = (c_ij)_{i,j=1...n} this is written as
  Q(x) = x^T C x.
On the other hand, Q is the quadratic form that belongs to C.

Let C = U D U^T with a real diagonal matrix D containing the eigenvalues of C and an orthogonal matrix U. Then
  Q_C(x) = x^T C x = x^T U D U^T x = (U^T x)^T D (U^T x).
If the columns of U are the (ON-)vectors u_1, ..., u_n, then U^T x contains the coefficients of x in this basis. If these are denoted by y_1, ..., y_n, then with y = U^T x one has
  Q_C(x) = y^T D y = Σ_{k=1}^{n} λ_k y_k².

From this one sees immediately, e.g., that Q_C(x) is positive for all non-zero vectors iff all eigenvalues of C are positive.

This leads to the definition: a quadratic form is called

positive definite      if Q(x) > 0 for x ≠ 0    ⇔  λ > 0 for all eigenvalues of C,
positive semidefinite  if Q(x) ≥ 0 for all x    ⇔  λ ≥ 0 for all eigenvalues of C,
negative definite      if Q(x) < 0 for x ≠ 0    ⇔  λ < 0 for all eigenvalues of C,
negative semidefinite  if Q(x) ≤ 0 for all x    ⇔  λ ≤ 0 for all eigenvalues of C,
definite               if Q is negative or positive definite,
indefinite             if there are x and y with Q(x) < 0 < Q(y)  ⇔  C has positive and negative eigenvalues.

(Dangerous) notation: C positive definite: C > 0, C positive semidefinite: C ≥ 0, C negative (semi)definite: C < 0 (C ≤ 0).

A symmetric matrix is called positive/negative (semi)definite or indefinite if this is true for the corresponding quadratic form.

Remark
A is positive [semi]definite ⇔ −A is negative [semi]definite.
Hurwitz Criterion

The Hurwitz criterion is useful to determine the definiteness of a matrix without calculating the eigenvalues.

In the symmetric n × n-matrix A one forms, starting from the left upper corner, submatrices of the sizes 1, 2, ..., n. The determinants of these submatrices are called D_1 to D_n. We have D_1 = a_11, D_2 = a_11 a_22 − a_12 a_21, and D_n is the determinant of A at last:

  [ a_11  a_12  a_13  ...  a_1n ]
  [ a_21  a_22  a_23  ...       ]
  [ a_31  a_32  a_33  ...       ]
  [  ...                        ]
  [ a_n1  ...             a_nn  ]

(the nested submatrices share the left upper corner). Then the following holds:

(i) D_1 > 0, D_2 > 0, D_3 > 0, D_4 > 0 etc.  ⇔  A pos. definite.  (D_k > 0)
(ii) D_1 < 0, D_2 > 0, D_3 < 0, D_4 > 0 etc.  ⇔  A neg. definite.  ((−1)^k D_k > 0)
(iii) A pos. semidefinite  ⇒  D_1 ≥ 0, D_2 ≥ 0, D_3 ≥ 0, D_4 ≥ 0 etc.  (D_k ≥ 0)
(iv) A neg. semidefinite  ⇒  D_1 ≤ 0, D_2 ≥ 0, D_3 ≤ 0, D_4 ≥ 0 etc.  ((−1)^k D_k ≥ 0)
(v) If neither (iii) nor (iv) holds, A is indefinite.

Especially A is indefinite if D_k < 0 holds for an even number k. Please pay attention to the fact that A may be indefinite even if always D_k ≥ 0 or (−1)^k D_k ≥ 0 holds. In this case at least one D_k has to be zero.
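A small numpy sketch of the criterion: compute the leading principal minors D_1, ..., D_n and test the sign patterns (the test matrix is an assumed example).

```python
# Sketch: Hurwitz criterion via leading principal minors.
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])

D = [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]
pos_def = all(d > 0 for d in D)
neg_def = all(((-1) ** k) * d > 0 for k, d in enumerate(D, start=1))
print(D, pos_def, neg_def)       # here: all D_k > 0, so A is positive definite
```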
Quadratic Completion

Another possibility to determine the definiteness of a quadratic form is quadratic completion. The method is explained with the example
  Q(x) = x² + 4xy + 2xz + 8y² + 16yz + 9z².

Step 1: Choose one variable x_j with a non-vanishing coefficient of x_j². Here we choose x. If such a choice is impossible, the quadratic form is indefinite.

Step 2: Gather all terms that contain x:
  Q(x) = (x² + 4xy + 2xz) + (8y² + 16yz + 9z²).

Step 3: Use the following to complete to a square:
  (a + b + c + d + ...)² = a² + b² + c² + d² + ... + 2(ab + ac + ad + ... + bc + bd + ... + cd + ...)
  Q(x) = (x + 2y + z)² + ...

Step 4: Subtract the terms that are not contained in the bracket of Step 2:
  Q(x) = (x + 2y + z)² + (−4y² − z² − 4yz) + (8y² + 16yz + 9z²)
       = (x + 2y + z)² + (4y² + 12yz + 8z²).

Step 5: Now the second bracket contains no x. Continue with Step 1 applied to the second bracket. Choose y.

Step 6:
  Q(x) = (x + 2y + z)² + (4y² + 12yz) + 8z²
       = (x + 2y + z)² + (2y + 3z)² − 9z² + 8z²
       = (x + 2y + z)² + (2y + 3z)² − z².

This is a sum of squares with two plus and one minus sign. This means that there are two positive and one negative eigenvalue in the corresponding matrix, and Q is indefinite.

Further examples

Q(x) = x² + 4xy + 2xz + 8y² + 16yz + 10z² is positive semidefinite, and
Q(x) = x² + 4xy + 2xz + 8y² + 16yz + 11z² is positive definite.
1.11 QR-Decomposition

Theorem

Let A be a matrix with m rows and n ≤ m columns. Then there exist an orthogonal matrix Q and an upper triangular matrix R with A = QR. Upper triangular means that for R = (r_ij) one has r_ij = 0 for j < i.

Proof 1 - Jacobi method, Givens rotations

The case n = 1 or m = 1 is trivial. Now let us first look at the case m = 2.

We are looking for an orthogonal 2 × 2-matrix Q with A = QR and r_21 = 0. With
  Q = [ u −v ]  (u² + v² = 1),   R = [ r_11 r_12 ]   and   A = [ a b ]
      [ v  u ]                       [ 0    r_22 ]             [ c d ]
the condition Q^T A = R gives
  [ u  v ] [ a b ]  =  [ r_11 r_12 ]   ⇒   [   *        *   ]  =  [ r_11 r_12 ]
  [ −v u ] [ c d ]     [ 0    r_22 ]       [ −av + uc   *   ]     [ 0    r_22 ]

So this can be fulfilled with
  u = a / √(a² + c²)   and   v = c / √(a² + c²).
In the case c = 0 one simply takes Q = E_2.
With Q_0 = E and R_0 = A, for each element below the diagonal an operation is performed:
  Q_i^T R_i = R_{i+1},
where Q_i is the identity matrix except for the four entries u, −v, v, u, which are placed in the two rows and columns that contain the diagonal element a and the element c below the diagonal of R_i:

  Q_i = [ E_k              ]           R_i = [ ...  a  b ... ]         R_{i+1} = [ ...  r  * ... ]
        [     u   ...  −v  ]                 [ ...           ]                   [ ...           ]
        [     :   E     :  ]                 [ ...  c  d ... ]                   [ ...  0  * ... ]
        [     v   ...   u  ]
        [              E_p ]

From this one sees: the same values of u and v as above eliminate the c-element with an orthogonal matrix Q_i, and the rest of the column that contains a and c is not changed.

So we have A = Q_0 R_0 = Q_0 Q_1 R_1 = ... = Q_0 ··· Q_k R_k =: QR with Q = Q_0 ··· Q_k and R = R_k.
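A compact numpy sketch of this Givens-rotation QR; the 3 × 2 test matrix is an assumed example, and u, v are chosen exactly as in the 2 × 2 case above.

```python
# Sketch: QR decomposition with Givens rotations.
import numpy as np

def givens_qr(A):
    m, n = A.shape
    R = A.astype(float).copy()
    Q = np.eye(m)
    for j in range(n):                      # eliminate entries below the diagonal
        for i in range(j + 1, m):
            a, c = R[j, j], R[i, j]
            if c == 0:
                continue                    # nothing to do: take Q_i = E
            r = np.hypot(a, c)
            u, v = a / r, c / r
            G = np.eye(m)
            G[[j, i], [j, i]] = u           # the 2x2 rotation sits in rows/cols j, i
            G[j, i], G[i, j] = -v, v
            R = G.T @ R
            Q = Q @ G
    return Q, R

A = np.array([[2.0, 1.0], [1.0, 3.0], [0.0, 1.0]])
Q, R = givens_qr(A)
print(np.allclose(A, Q @ R), np.allclose(Q.T @ Q, np.eye(3)))   # True True
```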
Proof 2 - Householder Transformations

The Jacobi method needs about n²/2 steps. This method uses only n − 1 steps.

The idea is to use a series of reflections that map the parts of the columns below the diagonal to zeroes.

After some steps we have a matrix R_k whose first k − 1 columns are already in upper triangular form; the part of column k from row k downwards is denoted by b_k.

The lower part of column k, b_k, shall be mapped onto a multiple of e_k. Let c_k be the vector equal to b_k, but with zeroes in the first k − 1 positions. So define v_k = |c_k| e_k − c_k, normalize u_k = v_k / |v_k|, and use the orthogonal reflection matrix
  Q_k = E − 2 u_k u_k^T.
(Observe that the first k − 1 entries of u_k are zero.) Multiplication with Q_k turns the b_k-part of R_k into a multiple of e_k and leaves the first k − 1 columns unchanged.

As above we have A = Q_0^T R_0 = Q_0^T Q_1^T R_1 = ... = Q_0^T ··· Q_k^T R_k =: QR with Q = Q_0^T ··· Q_k^T and R = R_k.
Proof 3 - Gram-Schmidt method

Let A = [ a_1, ..., a_n ] be the decomposition of A into columns.

Apply the Gram-Schmidt procedure to these vectors, and if a resulting vector is zero, skip it. This gives an ONS q_1, ..., q_l. Complete this to an ONB q_1, ..., q_m. This basis has the property that for 1 ≤ k ≤ n the vector a_k is contained in the span of the vectors q_1, ..., q_k, and so can be written as a linear combination of these vectors. As the q_i form an ONS, we have
  a_k = <a_k, q_1> q_1 + ... + <a_k, q_k> q_k,
and so
  [ a_1, ..., a_n ] = [ q_1, ..., q_l ] · [ <a_1, q_1>  <a_2, q_1>  ...  <a_n, q_1> ]
                                          [     0       <a_2, q_2>  ...  <a_n, q_2> ]
                                          [    ...                                  ]
                                          [     0           0       ...  <a_n, q_l> ]
1.12 Numerics of eigenvalues

Theorem (Gerschgorin)

Let A be a square n × n-matrix and λ ∈ C an eigenvalue of A. Then there is an index i with
  |λ − a_ii| ≤ Σ_{j≠i} |a_ij|.

Corollary

As A and A^T have the same characteristic polynomial, we have the following: for each i define
  r_i := min{ Σ_{j≠i} |a_ij|, Σ_{j≠i} |a_ji| }.
Then all eigenvalues of A lie in the union of the Gerschgorin circles with centers a_ii and radii r_i.
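A short numpy sketch that computes the Gerschgorin centers and radii r_i for an assumed example matrix and compares them with the actual eigenvalues.

```python
# Sketch: Gerschgorin circles for an assumed 3x3 matrix.
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [0.5, 3.0, 0.5],
              [0.0, 1.0, -2.0]])

centers = np.diag(A)
row_sums = np.sum(np.abs(A), axis=1) - np.abs(centers)
col_sums = np.sum(np.abs(A), axis=0) - np.abs(centers)
radii = np.minimum(row_sums, col_sums)      # r_i = min(row sum, column sum)

print(list(zip(centers, radii)))            # circles containing all eigenvalues
print(np.linalg.eigvals(A))                 # cross-check
```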
Eigenvalues - direct method

We define
  x_{k+1} = (1/‖x_k‖) A x_k   and   λ_k = ‖A x_k‖ / ‖x_k‖.
Then (for a generic starting vector and a dominant eigenvalue λ_1 with eigenvector v_1)
  x_k → c_1 v_1   and   λ_k → |λ_1|.
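A minimal sketch of this power iteration for an assumed symmetric 2 × 2 matrix; λ_k is taken as the norm of the new iterate.

```python
# Sketch: power iteration (direct method) for the dominant eigenvalue.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

x = np.array([1.0, 0.0])
for _ in range(50):
    y = A @ x
    lam = np.linalg.norm(y)          # lambda_k -> |lambda_1|
    x = y / lam                      # x_k -> eigenvector direction
print(lam, x)
print(np.linalg.eigvalsh(A))         # cross-check: dominant eigenvalue approx 3.618
```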
Eigenvalues - Wielandt's inverse method

Assume λ_0 is not an eigenvalue of A. Let B = (A − λ_0 E)^{-1} and apply the direct method. If μ is an eigenvalue of B with eigenvector v, then v is an eigenvector of A, too, for the eigenvalue 1/μ + λ_0.
Eigenvalues - QR-method

Let A have real eigenvalues with distinct absolute values. Apply recursively the QR-decomposition to A:
  A =: A_0,   A_k = Q_k R_k,   A_{k+1} = R_k Q_k.
Then one can prove that A_k → D, D being an upper triangular matrix with the same eigenvalues as A.
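A minimal sketch of this basic (unshifted) QR iteration for an assumed symmetric test matrix; the diagonal of A_k approaches the eigenvalues.

```python
# Sketch: QR iteration A_{k+1} = R_k Q_k.
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])

Ak = A.copy()
for _ in range(100):
    Q, R = np.linalg.qr(Ak)
    Ak = R @ Q                              # similarity transform: same eigenvalues
print(np.round(np.diag(Ak), 4))             # approximates the eigenvalues of A
print(np.round(np.linalg.eigvalsh(A), 4))   # cross-check
```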
2 Ordinary Differential Equations

In this section we often omit the arrow symbols over vectors. A dot always denotes differentiation with respect to the real (time-) variable t.

2.1 General Definitions

Definition. Let I ⊂ R be an interval and Ω ⊂ K^n an open set. Let v : I × Ω → K^n be a continuous function of all n + 1 variables that is continuously differentiable with respect to all variables of the K^n.

An [ordinary] differential equation [system of first order] (for short: ode) is given by
  ẋ = v(t, x).
A solution is a function u : J → K^n, J ⊂ I, satisfying u̇(t) = v(t, u(t)).

An initial value problem is a differential equation together with an initial condition u(t_0) = u_0, where t_0 ∈ I and u_0 ∈ K^n.

Theorem. Each initial value problem has a unique maximal solution, i.e. a solution that cannot be continued to a larger interval of definition. From now on we will only regard maximal solutions.

Definition. If the function v is independent of t, the ode is called autonomous.
Remark. If u(t) is a solution on (t_1, t_2) of a given autonomous system, then for a ∈ R the function u(t + a) is a solution, too, on (t_1 − a, t_2 − a).

Definition. A differential equation of the form ẋ = A(t)x + b(t) with an n × n-matrix A(t) is called linear, in the case b(t) ≡ 0 homogeneous, otherwise inhomogeneous.

Theorem: Solutions of homogeneous linear odes. Let ẋ = A(t)x be a linear ode with an n × n-matrix A(t). Then the solutions of the ode form an n-dimensional vector space.

Theorem. Let u_1 and u_2 be solutions of the inhomogeneous ode. Then u_1 − u_2 solves the homogeneous ode.

Consequence: all solutions of the inhomogeneous ode have the form
  u = u_h + u_p,
where u_p is a fixed solution of the inhomogeneous ode and u_h ranges over all solutions of the homogeneous ode.

Definition. Any system of n linearly independent solutions of ẋ = A(t)x is called a fundamental system. A matrix whose columns are a fundamental system is called a fundamental matrix.

If u_1, ..., u_n is a fundamental system, the general solution is
  u = α_1 u_1 + ... + α_n u_n = [ u_1 ... u_n ] (α_1, ..., α_n)^T =: Y (α_1, ..., α_n)^T.
Y is the fundamental matrix.
Proposition: Properties of the fundamental matrix

(i) The domain of definition of the solutions is the whole of I.
(ii) Y is a fundamental matrix ⇔ Y is regular and Y solves the matrix differential equation Ẏ = A(t)Y.
(iii) Y fundamental matrix, C regular ⇒ Y C is a fundamental matrix.
(iv) If the columns of Y consist of solutions and Y is regular at some t_0 ∈ I, then Y is regular everywhere in I.

Remark. If n > 1 there is no general algorithm to solve ẋ = A(t)x.

Theorem: Solutions of inhomogeneous linear differential equations - variation of constants

Let Y be a fundamental matrix of ẋ = A(t)x. Then all solutions of ẋ = A(t)x + b(t) are given by u(t) = Y(t)c + u_0(t), where c ∈ K^n is constant and the particular solution u_0 is given by
  u_0(t) = Y(t) ∫ Y^{-1}(t) b(t) dt.

2.2 Linear differential equations with constant coefficients

Theorem. Let A be a (constant real or complex n × n) matrix. A fundamental matrix of ẋ = Ax is given by
  Y = exp(tA) = Σ_{n=0}^{∞} (t^n / n!) A^n.
Proof. Clearly one has Y(0) = E. Now show that Ẏ = AY:
  Ẏ = d/dt ( E + (t/1!) A + (t²/2!) A² + (t³/3!) A³ + ... )
     = (1/1!) A + 2 (t/2!) A² + 3 (t²/3!) A³ + ...
     = A + (t/1!) A² + (t²/2!) A³ + ...
     = A ( E + (t/1!) A + (t²/2!) A² + ... ) = A Y.

Computation of exp(tA). Let A = P J P^{-1} with
  J = diag( J_1, J_2, ..., J_m ),
where each J_k is either the 1 × 1 block [λ_k] or a Jordan block with λ_k on the diagonal and ones on the upper secondary diagonal. So
  exp(tA) = P diag( exp(tJ_1), exp(tJ_2), ..., exp(tJ_m) ) P^{-1}.

Now, exp(t[λ_k]) = [exp(tλ_k)]. Omitting the subscript k we consider the case J = D + N, where D = λE and N is the matrix with ones only on the upper secondary diagonal.

Observe that (D + N)^n = D^n + C(n,1) D^{n−1} N + C(n,2) D^{n−2} N² + ..., and there are at most as many terms as the size of D and N.

  exp(tJ) = Σ_{n=0}^{∞} (t^n/n!) (D + N)^n
          = Σ_{n=0}^{∞} (t^n/n!) D^n + Σ_{n=1}^{∞} (t^n/n!) n D^{n−1} N + Σ_{n=2}^{∞} (t^n/n!) (n(n−1)/2) D^{n−2} N² + ...

Now
  Σ_{n=0}^{∞} (t^n/n!) D^n = exp(tD),
  Σ_{n=1}^{∞} (t^n/n!) n D^{n−1} N = tN Σ_{n=1}^{∞} (t^{n−1}/(n−1)!) D^{n−1} = tN exp(tD),
  Σ_{n=2}^{∞} (t^n/n!) (n(n−1)/2!) D^{n−2} N² = (t²/2!) N² Σ_{n=2}^{∞} (t^{n−2}/(n−2)!) D^{n−2} = (t²/2!) N² exp(tD),
and so on. Hence
  exp(tJ) = exp(tD) + t exp(tD) N + (t²/2) exp(tD) N² + ...
          = exp(λt) · [ 1  t  t²/2  ... ]
                      [    1  t     ... ]
                      [       1     ... ]
                      [          ...    ]
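Numerically the fundamental matrix exp(tA) can be obtained directly, e.g. with scipy; the 2 × 2 system below is an assumed example.

```python
# Sketch: fundamental matrix Y(t) = exp(tA) with scipy for an assumed system.
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])          # x' = A x, eigenvalues -1 and -2

t = 0.5
Y = expm(t * A)                        # fundamental matrix at time t
x0 = np.array([1.0, 0.0])
print(Y @ x0)                          # solution of the IVP x(0) = x0 at t = 0.5
```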
Concrete Calculation

(i) It suffices to compute P exp(tJ) = exp(tA) P, because this is a fundamental matrix, too. This means that if v is an eigenvector to the eigenvalue λ, then e^{λt} v is a solution. Especially, if there exists a basis consisting of eigenvectors of A, a fundamental system may be determined by this method.

(ii) Using the fact that linear combinations of solutions are again solutions, and that for an eigenvalue λ the generalized eigenvectors form a basis of the kernel of (A − λE)^p for suitable p, we conclude the following algorithm:

Step 1: If the algebraic and geometric multiplicities of an eigenvalue λ are equal (say = k), compute k linearly independent eigenvectors v_1, ..., v_k. The functions
  u_j = e^{λt} v_j
are linearly independent solutions.

Step 2: If the algebraic multiplicity of λ is m_1 and the geometric multiplicity is m_2 < m_1, then compute powers of B = A − λE until the dimension of the kernel of B^p is equal to m_1. Determine a basis v_1, ..., v_{m_1} of this kernel. The functions
  u_j = e^{λt} ( v_j + t B v_j + (t²/2) B² v_j + ... )
are linearly independent solutions. The series in the brackets stops after at most m_1 terms because of B^{m_1} v_j = 0.

(iii) If A is a real matrix and λ is a non-real eigenvalue with imaginary part > 0 (then λ̄ is an eigenvalue of the same multiplicities, too), then the real and imaginary parts of the (non-real) solutions belonging to λ are real solutions. The corresponding solutions belonging to λ̄ are the same (with exception of the sign) and need not be regarded.
2.2.1 Inhomogeneous Equations

The method of variation of constants is always applicable.

If the inhomogeneity is of the type b(t) = e^{λt} v(t) with a vector v whose entries consist of polynomials of order k, then a particular solution may be constructed by an ansatz:

If λ is no eigenvalue of A then there is a solution of the form u(t) = e^{λt} w(t), and the entries of w(t) are polynomials of order at most k.

If λ is an eigenvalue of (algebraic) order m, then there exists a solution of the form u(t) = e^{λt} w(t), where the entries of w(t) are polynomials of order at most k + m.

If the inhomogeneity is a sum of functions of this type, we use the principle of superposition: if u_1(t) and u_2(t) are particular solutions to the right sides b_1(t) and b_2(t), then u_1(t) + u_2(t) is a particular solution to the right side b_1(t) + b_2(t).

Because of
  sin ωt = (1/2i) ( exp(iωt) − exp(−iωt) ),   cos ωt = (1/2) ( exp(iωt) + exp(−iωt) ),
this method is applicable to trigonometric functions, too.
2.3 Linear differential equations of higher order

2.3.1 General Case

A linear differential equation of order n is an equation of the form
  x^(n) + a_{n−1}(t) x^(n−1) + ... + a_2(t) ẍ + a_1(t) ẋ + a_0(t) x = f(t).   (*)
The coefficients a_i(t) and the inhomogeneity f(t) are assumed to be continuous.

To apply the theoretic results of the last section the ode is transformed to a first order system: with
  y = (y_1, y_2, ..., y_n)^T = (x, ẋ, ..., x^(n−1))^T
one has
  d/dt y = (ẋ, ẍ, ..., x^(n))^T
         = ( y_2, y_3, ..., −a_0(t) y_1 − a_1(t) y_2 − a_2(t) y_3 − ... − a_{n−1}(t) y_n + f(t) )^T
         = [  0       1       0      ...  0        ] y + ( 0    )
           [  0       0       1      ...  0        ]     ( 0    )
           [  ...                                   ]     ( ...  )
           [ −a_0(t) −a_1(t) −a_2(t) ... −a_{n−1}(t)]     ( f(t) )
         =: A(t) y + b(t).

The solution of the ode of order n corresponds to the first entry in the solution of the corresponding n × n-system.
Definition. A fundamental system of a linear ode of order n is a set of n linearly independent solutions of the ode.

If u_1, ..., u_n is a fundamental system of (*), then
  Y = [ u_1        u_2        ...  u_n        ]
      [ u̇_1        u̇_2        ...  u̇_n        ]
      [ ...                                   ]
      [ u_1^(n−1)  u_2^(n−1)  ...  u_n^(n−1)  ]
is a fundamental matrix of the corresponding system.

Solution of inhomogeneous equations

This is done using the variation-of-constants formula:

Step 1: Starting with a fundamental system u_1, ..., u_n, form the fundamental matrix Y as above and b(t) = (0, 0, ..., f(t))^T.

Step 2: Calculate d(t) = Y^{-1}(t) b(t) as the solution of Y(t) d(t) = b(t).

Step 3: Integrate c(t) = ∫ d(t) dt.

Step 4: A particular solution is the first component of Y(t) c(t).
2.3.2 Ode with Constant Coefficients

  x^(n) + a_{n−1} x^(n−1) + ... + a_2 ẍ + a_1 ẋ + a_0 x = f(t)   (**)

Again we look first at the homogeneous equation with f(t) ≡ 0:
  x^(n) + a_{n−1} x^(n−1) + ... + a_2 ẍ + a_1 ẋ + a_0 x = 0.   (***)

In this case the corresponding matrix A is constant:
  A = [  0    1    0   ...  0       ]
      [  0    0    1   ...  0       ]
      [  ...                        ]
      [ −a_0 −a_1 −a_2 ... −a_{n−1} ]

Lemma

(i) The characteristic polynomial of the corresponding system is
  p(λ) = (−1)^n ( a_0 + a_1 λ + a_2 λ² + ... + a_{n−1} λ^{n−1} + λ^n ).
(ii) If λ is an eigenvalue, then the kernel of A − λE has dimension 1.
(iii) For each eigenvalue λ of algebraic multiplicity k there is a unique Jordan block of the form J = λ E_k + N.

Definition. The polynomial p(λ) = a_0 + a_1 λ + a_2 λ² + ... + λ^n is called the characteristic polynomial of (**) resp. (***).
Solutions. The solutions of (***) are found in the following way:

Step 1: Find the zeroes of the characteristic polynomial by factorisation:
  p(λ) = (λ − λ_1)^{k_1} ··· (λ − λ_m)^{k_m}.

Step 2: A fundamental system is given by
  e^{λ_1 t}, t e^{λ_1 t}, ..., t^{k_1 − 1} e^{λ_1 t}, ..., e^{λ_m t}, ..., t^{k_m − 1} e^{λ_m t}.

This means that if λ_0 is a simple zero of p(λ), then e^{λ_0 t} is a solution, and if λ_0 is a k-fold zero, then the k functions e^{λ_0 t}, t e^{λ_0 t}, ..., t^{k−1} e^{λ_0 t} are solutions.

  Zeroes of p                                   Solution of the ode
  (a) λ = a                                     e^{at}
  (b) λ_1 = λ_2 = ... = λ_k = a                 e^{at}, t e^{at}, ..., t^{k−1} e^{at}
  (c) λ = α + iβ, λ̄ = α − iβ                    e^{αt} sin βt, e^{αt} cos βt
  (d) λ_1 = ... = λ_k = α + iβ,                 e^{αt} sin βt, t e^{αt} sin βt, ..., t^{k−1} e^{αt} sin βt,
      λ_{k+1} = ... = λ_{2k} = α − iβ           e^{αt} cos βt, t e^{αt} cos βt, ..., t^{k−1} e^{αt} cos βt
  (e) λ = 0                                     1
  (f) λ_1 = λ_2 = ... = λ_k = 0                 1, t, ..., t^{k−1}
2.3.3 Special Inhomogeneities

For right sides of the form (P(t) sin βt + Q(t) cos βt) e^{αt}, or sums of terms of this type, we again use a fitting ansatz.

If P and Q are polynomials of degree at most m, let R and S be polynomials of degree m. A, B, C and D are constants. The column with the label μ denotes the factor that appears in the exponential function if one writes the right side in the form P(t) e^{μt}. The following table gives the ansatz for a particular solution u_p. The factor of resonance k is the multiplicity of μ as a zero of the characteristic polynomial p(λ). If p(μ) ≠ 0 we have k = 0 and the term t^k is omitted in the ansatz.

  right side                               μ         ansatz
  e^{αt}                                   α         C t^k e^{αt}
  P(t)                                     0         t^k R(t)
  P(t) e^{αt}                              α         t^k R(t) e^{αt}
  A sin βt + B cos βt                      ±iβ       t^k ( C sin βt + D cos βt )
  P(t) sin βt + Q(t) cos βt                ±iβ       t^k ( R(t) sin βt + S(t) cos βt )
  (A sin βt + B cos βt) e^{αt}             α ± iβ    t^k ( C sin βt + D cos βt ) e^{αt}
  (P(t) sin βt + Q(t) cos βt) e^{αt}       α ± iβ    t^k ( R(t) sin βt + S(t) cos βt ) e^{αt}

Actually only the last row is needed. All other cases result from replacing α or β (or both) with zero, or regarding constants as polynomials of degree zero.

Even if only the sine (or cosine) function appears on the right side, both sine and cosine have to appear in the ansatz.

Mnemonic: ansatz like the right side, multiplied by t^k.
Examples for the two odes
  x^(4) − 8x^(3) + 23ẍ − 28ẋ + 12x = ...   (zeroes of the char. polynomial: 1, 2, 2, 3)
  x^(4) − 5x^(3) + 6ẍ = ...                (zeroes of the char. polynomial: 0, 0, 2, 3)

  right side           μ        k  ansatz (zeroes 1, 2, 2, 3)             k  ansatz (zeroes 0, 0, 2, 3)
  3                    0        0  A                                      2  At²
  2t                   0        0  At + B                                 2  At³ + Bt²
  2e^{2t}              2        2  At² e^{2t}                             1  At e^{2t}
  e^{−t}               −1       0  A e^{−t}                               0  A e^{−t}
  4 sin 2t             ±2i      0  A sin 2t + B cos 2t                    0  A sin 2t + B cos 2t
  (t² + 4) e^{t}       1        1  (At³ + Bt² + Ct) e^{t}                 0  (At² + Bt + C) e^{t}
  t cos 3t + 5 sin 3t  ±3i      0  (At + B) sin 3t + (Ct + D) cos 3t      0  (At + B) sin 3t + (Ct + D) cos 3t
  2 e^{t} cos 5t       1 ± 5i   0  (A sin 5t + B cos 5t) e^{t}            0  (A sin 5t + B cos 5t) e^{t}

Second example:
  x^(7) + x^(6) + 18x^(5) + 16x^(4) + 81x^(3) + 45ẍ − 162x = ...
  (zeroes of the char. polynomial: 1, 3i, 3i, −3i, −3i, −1 + i, −1 − i)

  right side           μ         k  ansatz
  2t + 5               0         0  At + B
  t e^{t}              1         1  (At² + Bt) e^{t}
  4 sin t              ±i        0  A sin t + B cos t
  4t sin t             ±i        0  (At + B) sin t + (Ct + D) cos t
  4 sin 3t             ±3i       2  At² sin 3t + Bt² cos 3t
  4t sin 3t            ±3i       2  (At³ + Bt²) sin 3t + (Ct³ + Dt²) cos 3t
  2(t + 2) e^{−2t}     −2        0  (At + B) e^{−2t}
  6 e^{−t} cos t       −1 ± i    1  (At sin t + Bt cos t) e^{−t}
  6t e^{−t} sin t      −1 ± i    1  ((At² + Bt) sin t + (Ct² + Dt) cos t) e^{−t}
3 Calculus in Several Variables

Throughout this chapter let f (resp. a vector-valued f) be a function that is defined on an open set Ω, having differentiability properties of sufficient order.

3.1 Differential Calculus in R^n

3.1.1 Definitions

(i) Let x ∈ R^n and ε > 0. Then the open ball with radius ε and center x is defined by
  K_ε(x) := { y ∈ R^n : |y − x| < ε }.
(ii) A subset U ⊂ R^n is called a neighbourhood of x ∈ R^n iff U contains a set K_ε(x) for some ε > 0.
(iii) A subset Ω ⊂ R^n is called open iff for each point x ∈ Ω there exists a neighbourhood of x that is contained in Ω. This is equivalent to: for each x ∈ Ω there is an ε > 0 so that K_ε(x) ⊂ Ω.
(iv) Let U ⊂ R^n. The boundary ∂U of U consists of all points x of R^n with the property that each neighbourhood of x contains at least one point of U and one point of R^n \ U.
(v) A subset U ⊂ R^n is called closed if R^n \ U is open.
3.1.2 Examples and Properties of Open and Closed Sets

(i) In one dimension: [a, b] is closed, ]a, b[ is open, [a, b[ is neither open nor closed. ]−∞, a] and [a, ∞[ are closed, ]−∞, a[ and ]a, ∞[ are open.
(ii) ∅ and R^n are the only sets that are both open and closed.
(iii) U open ⇔ U ∩ ∂U = ∅.
(iv) U closed ⇔ ∂U ⊂ U.

3.1.3 Main Rule for Vector-Valued Functions

Let Ω ⊂ R^n and f be a function f : Ω → R^m, f(x) = ( f_1(x), ..., f_m(x) )^T.

f is continuous / differentiable / bounded / integrable  ⇔  each f_j is continuous / differentiable / bounded / integrable.

Limits, derivatives and integrals are taken in each component separately. A consequence is that most of the properties need only be defined for real-valued functions, since the definition is extended to vector-valued functions by this main rule.
3.1.4 Definition - Limits and Continuous Functions

(i) For each n ∈ N let x_n = (x_{1,n}, ..., x_{m,n})^T ∈ R^m. The sequence x_n converges to a limit point x = (x_1, ..., x_m)^T ∈ R^m iff for each j ∈ {1, ..., m} the sequence of components x_{j,n} converges to x_j. This is denoted by x_n → x.
(ii) Equivalent definition: x_n → x iff ‖x_n − x‖ → 0 with some (or each) norm.
(iii) f : Ω → R is continuous in a ∈ Ω iff for each sequence (x_n) we have: x_n → a implies f(x_n) → f(a).
(iv) Equivalent definitions: f : Ω → R is continuous in a
  iff the inverse image of each neighbourhood of f(a) is a neighbourhood of a,
  iff for each ε > 0 there is a δ > 0 so that for all x with |x − a| < δ one has |f(x) − f(a)| < ε.
(v) f is continuous on Ω iff f is continuous in each a ∈ Ω.
3.1.5 Definition - Partial Derivatives

(i) f is partially differentiable with respect to x_j in a ∈ Ω if the scalar function t ↦ f(a_1, ..., a_j + t, ..., a_n) is differentiable in t = 0. This is equivalent to the property that t ↦ f(a + t e_j) is differentiable in t = 0.
(ii) f is partially differentiable in a if all partial derivatives exist in a.
(iii) f is partially differentiable in Ω if f is partially differentiable in each point a ∈ Ω.
(iv) The partial derivative is denoted by ∂f/∂x_j (a) or by D_j f(a) or by ∂_j f(a). In two or three dimensions we write f_x, f_y and f_z.
(v) The gradient of f is the row vector grad f = ( ∂f/∂x_1, ..., ∂f/∂x_n ). Sometimes the gradient is defined as a column vector. Our definition has the advantage that for scalar functions the derivative and the gradient coincide.
(vi) Higher partial derivatives are formed by differentiating partial derivatives:
  ∂²f / (∂x_i ∂x_j) = ∂/∂x_i ( ∂f/∂x_j ).
3.1.6 Theorem of H. A. Schwarz

Let f be twice partially differentiable in an open set Ω and let the partial derivatives be continuous in Ω. Then
  ∂/∂x_i ( ∂f/∂x_j ) = ∂/∂x_j ( ∂f/∂x_i ),
i.e. the order of differentiation does not matter.
3.1.7 Definition: Derivative of f

f : Ω → R is (totally) differentiable in a if there exists a linear mapping f'(a) from R^n to R with
  f(x) = f(a) + f'(a)(x − a) + r(x)   with   lim_{x→a} r(x)/|x − a| = 0.

In the scalar case the derivative f'(a) = Df(a) is given by the gradient; in the general case of a function f : Ω → R^m with f(x) = ( f_1(x), ..., f_m(x) )^T we have
  f' = Df = Jf = ( grad f_1 )
                 (   ...    )
                 ( grad f_m )
This matrix with m rows and n columns is also called the Jacobian of f.
3.1.8 Higher derivatives

The derivative of a function is an n-dimensional vector valued function. If the entries are differentiable again, we say that the function is twice differentiable. By induction we may define derivatives of a function of arbitrary order.

C^m(Ω) is the set of all functions defined on Ω that are m times differentiable and whose (partial) derivatives of order ≤ m are continuous. If all derivatives exist this is denoted by C^∞(Ω).

3.1.9 Examples

All functions composed of the standard functions like polynomials, exponential and trigonometric functions, and inverse functions of these, are differentiable in open subsets of their domains of definition.

Remark: The property of openness has to be fulfilled in each step of composing, e.g. |x − 1| = √((x − 1)²) is not differentiable in x = 1.
3.1.10 Directional derivative, Gâteaux derivative

Let f be differentiable in a ∈ Ω. For v ∈ R^n we define the directional derivative of f in direction v by
  ∂f/∂v (a) := D_v f(a) := lim_{t→0} ( f(a + tv) − f(a) ) / t.
For v = e_j the directional derivative is the j-th partial derivative of f.

If f is differentiable, then from the chain rule below it follows easily that
  ∂f/∂v (a) = f'(a) · v.
3.1.11 Rules

(i) (f + g)' = f' + g',  (λ f)' = λ f'.
(ii) Chain rule:
  (f ∘ g)'(a) = f'(g(a)) · g'(a).
Here the dot is the matrix multiplication of the Jacobians of f and g.
(iii) Generalized Product Rule
Let φ(t) be a real valued function, f(t), g(t) vector valued functions and A(t), B(t) matrix-valued functions of the real variable t (which will be omitted in the following formulas):
  (1) (φ f)' = φ' f + φ f'
  (2) (f · g)' = f' · g + f · g'          (scalar product)
  (3) (f × g)' = f' × g + f × g'          (cross product)
  (4) (A f)' = A' f + A f'
  (5) (A B)' = A' B + A B'
  (6) (A^{-1})' = −A^{-1} A' A^{-1}
3.2 Inverse and Implicit Functions

3.2.1 Inverse Function Theorem

Let f : R^n → R^n, a ∈ Ω, be continuously differentiable and f(a) = b. If det f'(a) ≠ 0 (so the Jacobian of f is invertible in a), then there is a neighbourhood U of a and a neighbourhood V of b so that f is one-to-one between U and V. Furthermore f^{-1} : V → U is differentiable of the same order as f, and the Jacobian of f^{-1} in b is the inverse of the Jacobian of f in a:
  (f^{-1})'(f(x)) = ( f'(x) )^{-1}   or   (f^{-1})'(y) = ( f'(f^{-1}(y)) )^{-1}.
3.2.2 Application: Newton's method

The solution of f(x) = a is the zero of g(x) = f(x) − a. So it is sufficient to look for zeroes of functions.

Let f(x_0) = 0 and x_n be near x_0. We have f(x_0) − f(x_n) ≈ f'(x_n)(x_0 − x_n) and so x_0 − x_n ≈ (f'(x_n))^{-1} ( f(x_0) − f(x_n) ). With f(x_0) = 0 we get
  x_0 ≈ x_n − (f'(x_n))^{-1} f(x_n),
and so
  x_{n+1} := x_n − (f'(x_n))^{-1} f(x_n)
is in many cases a better approximation for x_0.

In most cases the Newton method is quadratically convergent, i.e. the number of correct digits in the result is doubled in each step.
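A minimal sketch of the n-dimensional Newton iteration for an assumed 2 × 2 system (intersection of the unit circle with the line y = x).

```python
# Sketch: Newton's method x_{n+1} = x_n - f'(x_n)^{-1} f(x_n) for an assumed system.
import numpy as np

def f(x):
    return np.array([x[0]**2 + x[1]**2 - 1.0,   # unit circle
                     x[0] - x[1]])              # line y = x

def Jf(x):                                      # Jacobian of f
    return np.array([[2*x[0], 2*x[1]],
                     [1.0,   -1.0]])

x = np.array([1.0, 0.0])                        # starting point
for _ in range(6):
    x = x - np.linalg.solve(Jf(x), f(x))
print(x)                                        # approx (0.7071, 0.7071)
```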
3.2.3 Implicit Function Theorem

Here only the most simple version of this theorem will be presented.

Let Ω ⊂ R² be open and F : Ω → R continuously differentiable. If for some point (x_0, y_0) ∈ Ω we have F(x_0, y_0) = 0 and F_y(x_0, y_0) ≠ 0, then there exists a neighbourhood of (x_0, y_0) of the form U = ]x_0 − δ, x_0 + δ[ × ]y_0 − ε, y_0 + ε[ so that the equation F(x, y) = 0 has a unique differentiable resolution y = φ(x) in U: F(x, y) = 0 ⇔ y = φ(x).

The derivatives of the resolution may be calculated by the chain rule:
  F(x, φ(x)) = 0  ⇒  d/dx F(x, φ(x)) = 0  ⇒  F_x(x, φ(x)) + F_y(x, φ(x)) φ'(x) = 0
  ⇒  φ'(x) = − F_x(x, φ(x)) / F_y(x, φ(x)).

Omitting the arguments, the higher derivatives are obtained by successive differentiation:
  F_x + F_y φ' = 0  ⇒  F_xx + F_xy φ' + [ F_xy + F_yy φ' ] φ' + F_y φ'' = 0
  ⇒  φ'' = − (1/F_y) ( F_xx + 2 F_xy φ' + F_yy (φ')² ) = − (1/F_y³) ( F_xx F_y² − 2 F_xy F_x F_y + F_yy F_x² ).
3.3 Taylor Expansions

3.3.1 Nabla-Operator

The Nabla operator is defined by ∇ = ( ∂/∂x_1, ..., ∂/∂x_n ). Then f'(a) = (∇f)(a).

3.3.2 Construction of Taylor Expansions

Now consider the following situation: let Ω ⊂ R^n, I ⊂ R an interval, and let p : I → Ω and f : Ω → R be differentiable functions. Let a ∈ Ω and h ∈ R^n so that the line between a and a + h is contained in Ω. Let g = f ∘ p, so g(t) = f(p(t)).

We are going to apply the one-dimensional Taylor formula to g. The derivatives of g are calculated by the chain rule:
  g'(t) = (f ∘ p)'(t) = f'(p(t)) · p'(t).

Now choose p(t) = a + t h. Then p'(t) = h and
  g'(t) = f'(a + t h) · h = ( ∂f/∂x_1 h_1 + ... + ∂f/∂x_n h_n )(a + t h) = ( (∇·h) f )(a + t h).

Differentiating again yields
  g''(t) = ( (∇·h)² f )(a + t h),   g'''(t) = ( (∇·h)³ f )(a + t h),   and in general   g^(k)(t) = ( (∇·h)^k f )(a + t h).

The one-dimensional Taylor formula is
  g(u) = g(0) + g'(0)/1! · u + ... + g^(m)(0)/m! · u^m + g^(m+1)(s)/(m+1)! · u^{m+1}   with s = θu, 0 < θ < 1.

The result follows by evaluating this for u = 1: with g(1) = f(a + h) and g^(m+1)(s) = ( (∇·h)^{m+1} f )(a + s h) we have:
3.3.3 Taylor's Theorem

Let Ω ⊂ R^n be open, ε > 0, K_ε(a) ⊂ Ω, ‖h‖ < ε and f ∈ C^{m+1}(Ω). Then
  f(a + h) = f(a) + (∇·h) f(a) + (1/2) (∇·h)² f(a) + ... + (1/m!) (∇·h)^m f(a)
             + (1/(m+1)!) (∇·h)^{m+1} f(a + s h),   0 < s < 1.

Alternative formulation: with x = a + h and h = x − a one gets
  f(x) = Σ_{k=0}^{m} (1/k!) ( (∇·(x − a))^k f )(a) + (1/(m+1)!) ( (∇·(x − a))^{m+1} f )(a + s(x − a)).

The special case a = 0 is
  f(x) = Σ_{k=0}^{m} (1/k!) ( (∇·x)^k f )(0) + (1/(m+1)!) ( (∇·x)^{m+1} f )(s x).

Observe that the nabla operator is applied to f only and not to x!
3.3.4 Calculation in the Two-Dimensional Case

We use the variables x and y. Then ∇ = ( ∂/∂x, ∂/∂y ) and h = (h_1, h_2)^T. So the product ∇·h has the form
  ∇·h = ∂/∂x h_1 + ∂/∂y h_2,
and the power (∇·h)^k is evaluated with the binomial formula:
  (∇·h)^k f(x, y) = Σ_{j=0}^{k} C(k,j) ∂^k f / (∂x^j ∂y^{k−j}) (x, y) · h_1^j h_2^{k−j}.

So the Taylor formula has the form
  f(a + h_1, b + h_2) = Σ_{k=0}^{m} (1/k!) Σ_{j=0}^{k} C(k,j) ( ∂^k f / (∂x^j ∂y^{k−j}) )(a, b) h_1^j h_2^{k−j}
    + (1/(m+1)!) Σ_{j=0}^{m+1} C(m+1,j) ( ∂^{m+1} f / (∂x^j ∂y^{m+1−j}) )(a + s h_1, b + s h_2) h_1^j h_2^{m+1−j},
where s is in the interval (0, 1).
We get the following algorithm, written down for the case m = 2 (i.e. Taylor expansion up to order two; the remainder contains the derivatives of order three).

Step 1: Write down all partial derivatives up to order m + 1 in a scheme built like Pascal's triangle (triangle 1):

  f                                    row 0
  f_x    f_y                           row 1
  f_xx   f_xy   f_yy                   row m, here row 2
  f_xxx  f_xxy  f_xyy  f_yyy           row m + 1, here row 3

Step 2: In rows 0 to m the values of the derivatives at the point of development are written down. In row m + 1, i.e. in the last row, are the values at an intermediate point (ã, b̃) with ã = a + s h_1 and b̃ = b + s h_2 (triangle 2):

  f(a, b)                                                      row 0
  f_x(a, b)    f_y(a, b)                                       row 1
  f_xx(a, b)   f_xy(a, b)   f_yy(a, b)                         row m
  f_xxx(ã, b̃)  f_xxy(ã, b̃)  f_xyy(ã, b̃)  f_yyy(ã, b̃)           row m + 1

Below (or beside, if there is room enough) one writes down the terms that arise in the binomial expansion of (h_1 + h_2)^k, and beside this the reciprocal values of k! (triangle 3):

  1                                            1                   row 0
  h_1     h_2                                  1                   row 1
  h_1²    2 h_1 h_2    h_2²                    1/m! = 1/2           row m
  h_1³    3 h_1² h_2   3 h_1 h_2²   h_2³       1/(m+1)! = 1/6       row m + 1

Step 3: Each term of the second triangle (containing the values at the point of development or at the intermediate point) is multiplied with the corresponding term of the third triangle (containing the parts of (h_1 + h_2)^k and the reciprocal of k!). All these terms are added.

In the example (development up to order 2) this results in

  f(a + h_1, b + h_2)
    = f(a, b) · 1 · 1
    + f_x(a, b) · h_1 · 1 + f_y(a, b) · h_2 · 1
    + f_xx(a, b) · h_1² · 1/2 + f_xy(a, b) · 2 h_1 h_2 · 1/2 + f_yy(a, b) · h_2² · 1/2
    + f_xxx(a + s h_1, b + s h_2) · h_1³ · 1/6 + f_xxy(a + s h_1, b + s h_2) · 3 h_1² h_2 · 1/6
    + f_xyy(a + s h_1, b + s h_2) · 3 h_1 h_2² · 1/6 + f_yyy(a + s h_1, b + s h_2) · h_2³ · 1/6
3.4 Extreme Values

3.4.1 Definition

f : Ω → R has a relative maximum (minimum) in a ∈ Ω if there is a neighbourhood U of a so that for each x ∈ U \ {a} one has f(x) ≤ f(a) (f(x) ≥ f(a)). If < (>) holds we have a strict maximum (minimum). If f has a maximum or minimum, we say that f has an extremum.

3.4.2 Necessary Criterion

Let f have an extremum in a. Then f'(a) = 0.

Proof. If f has an extremum in a, the restriction of f to the lines parallel to the coordinate axes through a has an extremum, too. By one-dimensional calculus the directional derivatives of f with respect to these directions are zero. As these are the partial derivatives, the gradient of f is zero.

Hessian. Let f ∈ C²(Ω). The n × n-matrix of the second partial derivatives of f is called the Hessian Hf:
  Hf = ( ∂²f / (∂x_i ∂x_j) )_{i,j = 1,...,n}.
By Schwarz's theorem this is a symmetric matrix. The Taylor expansion of order two of f is
  T_2 f(a + h) = f(a) + f'(a) h + (1/2) h^T Hf(a) h.

3.4.3 Sufficient Criterion

Let f fulfill f'(a) = 0 and let the Hessian Hf(a) be positive (negative) definite. Then f has a strict minimum (maximum) in a.

Proof. As the entries of the Hessian are continuous functions, there is a neighbourhood U of a so that Hf is positive definite in each point of U.
By Taylor's theorem we have for h ≠ 0
  f(a + h) = f(a) + f'(a) h + (1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} ∂²f / (∂x_i ∂x_j)(a + s h) h_i h_j
           = f(a) + (1/2) h^T Hf(a + s h) h,
and this is greater than f(a).

3.4.4 Saddle Points

If in the situation above Hf is indefinite, then we have no extremum at a. These points are called saddle points. (Without proof.)
4 Integral Transforms

4.1 Laplace Transform

Let f be a (complex-valued) function, integrable over each finite interval (e.g. f continuous or piecewise continuous). If the integral converges, the function
  L[f(t)](s) = ∫_0^∞ f(t) e^{−st} dt
is called the Laplace transform of f.

f is called admissible if there are numbers γ and M with |f(t)| ≤ M e^{γt}; so e.g. exponential functions, sine, cosine, polynomials and bounded functions are admissible. If the estimate above holds, then the Laplace transform of f exists and is complex differentiable for Re z > γ.

4.1.1 Method of Calculation

Let all functions be admissible. For t < 0 set f(t) = 0. Let c > 0 be a constant.
  L[f(t) + g(t)](s) = L[f(t)](s) + L[g(t)](s)                              Linearity
  L[f(ct)](s) = (1/c) L[f(t)](s/c)                                          Similarity
  L[e^{−ct} f(t)](s) = L[f(t)](s + c)                                       Damping
  L[f(t − c)](s) = e^{−cs} L[f(t)](s)                                       Displacement
  L[∫_0^t f(u) du](s) = (1/s) L[f(t)](s)                                    Integration
  L[f(t)/t](s) = ∫_s^∞ L[f](u) du
  L[t^n f(t)](s) = (−1)^n d^n/ds^n L[f(t)](s)                               Differentiation
  L[f'(t)](s) = s L[f(t)](s) − f(0)
  L[f''(t)](s) = s² L[f(t)](s) − s f(0) − f'(0)
  L[f^(n)(t)](s) = s^n L[f(t)](s) − s^{n−1} f(0) − ... − s f^{(n−2)}(0) − f^{(n−1)}(0)

If there is no risk of confusion we will omit the arguments s and t.

For a periodic function f with period T (f(t) = f(t + T)) and
  f_0(t) = f(t) for 0 ≤ t ≤ T,  f_0(t) = 0 otherwise,
the following rule holds:
  L[f(t)](s) = L[f_0(t)](s) / ( 1 − e^{−Ts} ).
4.1.2 Convolution

The convolution of two admissible functions f and g is defined by
  (f * g)(t) := ∫_0^t f(t − u) g(u) du.

(i) f * g = g * f                     commutativity
(ii) f * (g + h) = f * g + f * h      linearity
(iii) f * (g * h) = (f * g) * h       associativity
(iv) L[f * g] = L[f] · L[g]           convolution theorem

4.1.3 Some important Examples

As a rule the Laplace transform and its inverse are calculated with the aid of tables.

  f          L[f]             f          L[f]              f          L[f]
  e^{αt}     1/(s − α)        sin(ωt)    ω/(s² + ω²)       cos(ωt)    s/(s² + ω²)
  1          1/s              t          1/s²              t^n        n!/s^{n+1}

The Laplace transform of the δ-distribution is the function 1.

Many more transforms may be derived with the aid of the rules above, e.g.
  L[e^{αt} sin ωt] = ω / ( (s − α)² + ω² )   (damping theorem)
or
  1/(s − α)² = −d/ds 1/(s − α) = −d/ds L[e^{αt}] = L[t e^{αt}].
Alternatively one gets this from L[t] = 1/s² and the damping theorem.
4.1.4 Solution of Initial Value Problems

The Laplace transform is well adapted for the solution of initial value problems with ordinary differential equations, if the following prerequisites are fulfilled:

  The differential equation is linear with constant coefficients.
  The initial values are given at t = 0.
  The right side of the equation (the inhomogeneity) has a known Laplace transform.

Step 1: Transformation of the equation or the equation system.
Step 2: Resolve for L[y] resp. L[y_1], ..., L[y_n] (possibly with Cramer's rule).
Step 3: Partial fraction decomposition of L[y] resp. L[y_i].
Step 4: Back-transformation of the parts.

Schematically:
  DE with initial values  -- Laplace transform (Step 1) -->  algebraic equation
  algebraic equation      -- solve (Steps 2, 3)          -->  sum of simple terms
  sum of simple terms     -- inverse transform (Step 4)  -->  solution of the initial value problem
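A sketch of the four steps with sympy for the assumed initial value problem y'' + y = 1, y(0) = y'(0) = 0; the transformed equation in Step 1 is written down by hand.

```python
# Sketch: solving y'' + y = 1, y(0) = y'(0) = 0 via the Laplace transform.
import sympy as sp

t, s = sp.symbols('t s', positive=True)
Y = sp.symbols('Y')

# Step 1: transformed equation (s^2 + 1) Y = 1/s  (zero initial values)
eq = sp.Eq((s**2 + 1) * Y, 1 / s)

# Steps 2 and 3: resolve for Y and decompose into partial fractions
Ysol = sp.apart(sp.solve(eq, Y)[0], s)          # 1/s - s/(s**2 + 1)

# Step 4: back-transformation
y = sp.inverse_laplace_transform(Ysol, s, t)
print(sp.simplify(y))   # 1 - cos(t)  (sympy may attach a Heaviside(t) factor)
```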
4.2 Fourier Series

Reminder

(i) The complex exponential function e^{a+ib} = e^a (cos b + i sin b) has the same properties with respect to derivative, integration and products as the real exponential function. The complex conjugate of e^z is e^{z̄}.
(ii) cos nπ = (−1)^n, sin nπ = 0, e^{inπ} = e^{−inπ} = (−1)^n.
(iii) f : R → R is even if f(−x) = f(x) and odd if f(−x) = −f(x).

4.2.1 Theorem

The functions e^{inx}, n ∈ Z, form an ONS with respect to the scalar product
  <f, g> = (1/2π) ∫_{−π}^{π} f(x) g(x)‾ dx.

4.2.2 Definition

Let f be a 2π-periodic integrable function. The numbers
  c_n = <f, e^{inx}> = (1/2π) ∫_{−π}^{π} f(x) e^{−inx} dx
are called the Fourier coefficients of f.
  Σ_{n=−∞}^{∞} c_n e^{inx}
is the Fourier series associated with f.
4.2.3 Theorem

(i) If f is square-integrable ( ∫_{−π}^{π} |f(x)|² dx exists ), then
  ∫_{−π}^{π} | f(x) − Σ_{n=−∞}^{∞} c_n e^{inx} |² dx = 0,
i.e. the function is represented by the series in the norm that is induced by the scalar product.
(ii) Let f be continuous and piecewise continuously differentiable. Then the series converges uniformly and one has
  f(x) = Σ_{n=−∞}^{∞} c_n e^{inx}.
(iii) Let f be piecewise continuously differentiable. If the discontinuities of f have the property that on both sides the limit exists, then in these points x the series converges to (1/2)( f(x−) + f(x+) ).
(iv) Parseval equality: Σ_{n=−∞}^{∞} |c_n|² = <f, f>.

4.2.4 Properties of the Coefficients

(i) If f is real, then c_{−n} = c_n‾.
(ii) If f is real and even, then c_n is real.
(iii) If f is real and odd, then c_n is purely imaginary.
4.2.5 Real Form of the Fourier Series

Let f be real-valued. Decomposing the exponential functions into sine and cosine parts gives
  Σ_{n=−∞}^{∞} c_n e^{inx} = a_0/2 + Σ_{n=1}^{∞} ( a_n cos nx + b_n sin nx )
with
  a_n = (1/π) ∫_{−π}^{π} f(x) cos nx dx,   b_n = (1/π) ∫_{−π}^{π} f(x) sin nx dx.

f even ⇒ b_n = 0;  f odd ⇒ a_n = 0. The connection between the coefficients is given by
  c_0 = a_0/2,   c_n = (1/2)(a_n − i b_n),   c_{−n} = (1/2)(a_n + i b_n),
  a_0 = 2 c_0,   a_n = c_n + c_{−n},   b_n = i (c_n − c_{−n}).
4.3 Fourier Transform

4.3.1 Definition

Let f be a (complex valued) function, defined on R, with the property that for each finite interval the integral exists. The Fourier transform is defined by
  F(f)(t) := f̂(t) := (1/√(2π)) ∫_{−∞}^{∞} f(x) e^{−itx} dx.
If |f| is integrable, then the Fourier transform exists for all values of t.

4.3.2 Inverse Transform

Let |f| be integrable, and let f fulfill the Dirichlet condition:
  each finite interval is decomposable into finitely many intervals in which f is monotone and continuous;
  in each point of discontinuity the limits of f from both sides exist.

Then for x ∈ R one has
  ( f(x+) + f(x−) ) / 2 = (1/√(2π)) ∫_{−∞}^{∞} F(f(x))(t) e^{itx} dt.

So if f and F(f) are absolutely integrable and if f has the property f(x) = (1/2)( f(x−) + f(x+) ), then the transform and its inverse are linked by
  F^{-1}(f(t))(x) = F(f(t))(−x),   F(f(x))(t) = F^{-1}(f(x))(−t).
4.3.3 Convolution

For f and g absolutely integrable the convolution is defined by
  (f * g)(x) := ∫_{−∞}^{∞} f(x − t) g(t) dt.

4.3.4 Rules

(i) ( F( f(x) + g(x) ) )(t) = (Ff(x))(t) + (Fg(x))(t)
(ii) (Ff(ax))(t) = (1/a) (Ff(x))(t/a)
(iii) (Ff(x − a))(t) = e^{−iat} (Ff(x))(t)
(iv) If f is differentiable and |f'| integrable on R, then (Ff'(x))(t) = it (Ff(x))(t).
(v) If g(x) = ∫_{−∞}^{x} f(s) ds is absolutely integrable on R, then (Fg(x))(t) = (1/(it)) (Ff(x))(t), t ≠ 0.
(vi) (f * g)(x) = (g * f)(x)
(vii) ( F(f * g)(x) )(t) = √(2π) (Ff(x))(t) · (Fg(x))(t)
4.3.5 Sine and Cosine Transform

Under the same assumptions as for the Fourier transform we define
  F_c(t) := √(2/π) ∫_0^∞ f(x) cos tx dx   (cosine transform of f),
  F_s(t) := √(2/π) ∫_0^∞ f(x) sin tx dx   (sine transform of f).

4.3.6 More Properties

(i) If f is absolutely integrable, then Ff, F_c f and F_s f are continuous and lim_{t→±∞} Ff(t) = 0.
(ii) If Ff = Fg, then f(x) = g(x) in all common points of continuity of f and g.
(iii) If f is even (f(−x) = f(x)), then Ff(t) = F_c f (cosine transform).
(iv) If f is odd (f(−x) = −f(x)), then Ff(t) = −i F_s f (sine transform).
(v) For f(x) = f_e(x) + f_o(x) (decomposition of f into even and odd part, cf. below)
  Ff(t) = F_c f_e − i F_s f_o.
4.3.7 Calculation of the Fourier Transform

1. Tables
In some tables only the sine and cosine transforms are included. So one has to decompose f into an even part f_e and an odd part f_o:
  f_e = (1/2)( f(x) + f(−x) ),   f_o = (1/2)( f(x) − f(−x) ).

2. Direct calculation of the integral
Sometimes helpful:  ∫_{−∞}^{∞} e^{−a²x²} dx = √π / a,  a > 0.

3. Use of residue calculus
For a rational function f(x) = P(x)/Q(x), where the denominator Q has no real zeroes and the degree of Q is at least as great as the degree of P plus 2, one has:
  F(f)(t) = −√(2π) i Σ_{Im z_k < 0} Res( e^{−itz} f(z), z_k )   for t ≥ 0,
  F(f)(t) =  √(2π) i Σ_{Im z_k > 0} Res( e^{−itz} f(z), z_k )   for t ≤ 0.
4.3.8 Gauss Functions

(i) ∫_R e^{−x²} dx = √π.
(ii) φ_α(x) := e^{−αx²}.
(iii) ∫_R e^{−αx²} dx = √(π/α).
(iv) ψ_α(x) := √(α/π) e^{−αx²} = √(α/π) φ_α(x) has the property ∫_R ψ_α(x) dx = 1.
(v) The Fourier transform of a Gauss function is again a Gauss function: Fφ_α(t) = (1/√(2α)) e^{−t²/(4α)}.

[Figure: the functions ψ_α(x) for α = 16, α = 1 and α = 1/16, and φ_α with its Fourier transform for α = 1, 4, 16 and 64.]
4.3.9 Consequences

(i) For each (fixed) x ∈ R we have lim_{α→0} φ_α(x) = 1.
(ii) For each x ≠ 0 we have lim_{α→∞} ψ_α(x) = 0.
(iii) lim_{α→∞} ψ_α(0) = ∞.

This shows that the sequence ψ_n fulfills the following definition:

4.3.10 Definition: Dirac Sequence

A sequence of integrable functions f_n : R → R is called a Dirac sequence if
(i) f_n ≥ 0,
(ii) ∫_R f_n(x) dx → 1,
(iii) for each a > 0 one has ∫_{R\(−a,a)} f_n(x) dx → 0.

4.3.11 Main Property of Dirac Sequences

For any continuous admissible function g one has
  lim_{n→∞} (f_n * g)(x) = g(x).

4.3.12 Delta Distribution

(i) The delta distribution is defined by δ = lim_{n→∞} f_n(x) for a Dirac sequence f_n.
(ii) This limit is only valid if the f_n resp. δ are inside an integral. From the above we have δ * f = f for any continuous function.
(iii) The Fourier transform of δ is 1/√(2π), and the Laplace transform of δ is 1.
(iv) Another example of a Dirac sequence is
  f_n(x) = n for |x| < 1/(2n), 0 otherwise,
or
  f_n(x) = n for 0 ≤ x < 1/n, 0 otherwise.
5 Stability of Ordinary Differential Equations

An autonomous system is a differential equation system of the form
  ẋ = v(x),
i.e. the right side of the ode contains no t. Let the domain of definition of the ode be Ω, and let v be a continuously differentiable function v : Ω → R^n. (This ensures that the solution of each initial value problem exists locally and is unique, i.e. has no branching points.)

Let u(t) be a solution defined on the maximal interval of definition (t_−, t_+) with u(t_0) = x_0. Then we call the set { u(t) | t_− < t < t_+ } the orbit through x_0.

5.1 Remarks

(i) As the right side of the differential equation contains no "t" we have: if u(t) is a solution on (a, b), then u(t − t_0) is a solution on (a + t_0, b + t_0).
(ii) u(t) ≡ x_0 is a solution ⇔ x_0 is a critical point.
5.2 Definition

(i) A point x_0 with v(x_0) = 0 is called a critical point.
(ii) The critical point x_0 is called stable if: for each neighbourhood U of x_0 there exists a neighbourhood V of x_0 with the property that for each orbit starting in V one has t_+ = ∞ and u(t) ∈ U for all t > 0.
(iii) A critical point that is not stable is called unstable.
(iv) A critical point x_0 is called asymptotically stable if it is stable and, moreover, for each such orbit one has lim_{t→∞} u(t) = x_0.
(v) An asymptotically stable point x_0 is exponentially stable if |u(t) − x_0| ≤ M e^{λt} for some λ < 0.

5.3 Flow-box Theorem

Let x_0 be a non-critical point of ẋ = v(x). Then x_0 has a neighbourhood U with the property that there exists a diffeomorphism from U onto U' ⊂ R^n so that the image of v is the constant vector field w = e_1 (and so the images of the orbits are straight lines parallel to the x_1-axis).

5.4 Remarks

(i) The flow-box theorem tells us that the critical points are the only interesting points.
(ii) The fact u(0) ∈ V ⇒ u(t) → x_0 is not sufficient for the stability of x_0.
5.5 Theorem: Linear Case

In the system ẋ = Ax, A ≠ 0, the origin is the only critical point.

(i) If all eigenvalues λ of A satisfy Re λ < 0, then x_0 = 0 is an exponentially stable critical point.
(ii) If all eigenvalues λ of A satisfy Re λ ≤ 0, and for the eigenvalues with Re λ = 0 the algebraic and geometric multiplicities are equal, then x_0 = 0 is a stable critical point.
(iii) In all other cases x_0 = 0 is unstable.

5.6 Linearisation

Let v be a C² vector field with v(x_0) = 0. Taylor expansion yields
  v(x) = v(x_0) + Dv(x_0)(x − x_0) + O(‖x − x_0‖²),   with v(x_0) = 0.
The matrix A = Dv(x_0) is called the linearisation of v at x_0.

5.7 Poincaré-Ljapunov Theorem

Let x_0 be a critical point of ẋ = v(x).

(i) If Re λ < 0 for all eigenvalues λ of Dv(x_0), then x_0 is asymptotically stable.
(ii) If one eigenvalue of Dv(x_0) has a positive real part, then x_0 is unstable.
5.8 Example

Let v = −grad f (gradient flow).

  f has an isolated minimum at x_0  ⇒  x_0 is asymptotically stable.
  f has an isolated maximum at x_0  ⇒  x_0 is unstable.
  f has a saddle point at x_0       ⇒  x_0 is unstable.

5.9 Ljapunov Functions

The Poincaré-Ljapunov theorem contains no result for the case that the linearisation has eigenvalues with real part zero. Here the theory of Ljapunov functions may be useful.

Let V ⊂ R^n be open, x_0 ∈ V, v : V → R^n a (continuously differentiable) vector field, and let x_0 ∈ V be the only critical point of v (so v(x) = 0 ⇔ x = x_0).

5.9.1 Definition

A continuously differentiable function L : V → R is called a Ljapunov function (at the critical point x_0 of the ode ẋ = v(x)) if

(i) L(x) ≥ 0 for x ∈ V and L(x) = 0 ⇔ x = x_0,
(ii) ∇L(x) · v(x) ≤ 0.
(iii) The Ljapunov function is said to be strict if ∇L(x) · v(x) < 0 holds for x ≠ x_0.

5.9.2 Theorem

(i) If a Ljapunov function L exists at the critical point x_0 of v, then x_0 is stable.
(ii) If L is strict, then x_0 is asymptotically stable.