You are on page 1of 93

Numerical Linear Algebra

A Solution Manual
Georg Muntingh and Christian Schulz

Preface
This solution manual gradually appeared over the course of several years, when first
Christian Schulz (20052008) and then I (20092013) were guiding the exercise sessions
of a numerical linear algebra course at the University of Oslo.
We would like to thank Tom Lyche for providing solutions to some of the exercises. Several students contributed by pointing out mistakes and improvements to the
exercises and solutions. Any remaining mistakes are, of course, our own.
Blindern, November 2013,

Georg Muntingh.

Contents
Preface

Chapter 0. Preliminaries
Exercise 0.24: The AT A inner product
Exercise 0.25: Angle between vectors in complex case
Exercise 0.41: The inverse of a general 2 2 matrix
Exercise 0.42: The inverse of a 2 2 matrix
Exercise 0.43: Sherman-Morrison formula
Exercise 0.44: Cramers rule; special case
Exercise 0.45: Adjoint matrix; special case
Exercise 0.47: Determinant equation for a plane
Exercise 0.48: Signed area of a triangle
Exercise 0.49: Vandermonde matrix
Exercise 0.50: Cauchy determinant
Exercise 0.51: Inverse of the Hilbert matrix

1
1
1
1
1
2
2
2
2
3
3
4
5

Chapter 1. Examples of Linear Systems


Exercise 1.2: Gaussian elimination example
Exercise 1.8: Strict diagonal dominance
Exercise 1.9: LU factorization of 2nd derivative matrix
Exercise 1.10: Inverse of 2nd derivative matrix
Exercise 1.11: Central difference approximation of 2nd derivative
Exercise 1.12: Two point boundary value problem (TODO)
Exercise 1.13: Two point boundary value problem; computation (TODO)
Exercise 1.14: Matrix element as a quadratic form
Exercise 1.15: Outer product expansion of a matrix
Exercise 1.16: The product AT A
Exercise 1.17: Outer product expansion
Exercise 1.18: System with many right hand sides; compact form
Exercise 1.19: Block multiplication example
Exercise 1.20: Another block multiplication example

7
7
7
7
8
8
9
9
9
9
9
10
10
10
10

Chapter 2. LU Factorizations
Exercise 2.3: Column oriented backsolve (TODO)
Exercise 2.6: Computing the inverse of a triangular matrix
Exercise 2.15: Row interchange
Exercise 2.16: LU of singular matrix
Exercise 2.17: LU and determinant
Exercise 2.18: Diagonal elements in U
Exercise 2.20: Finite sums of integers
Exercise 2.21: Operations

11
11
11
11
12
12
12
13
14

ii

Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise

2.22:
2.30:
2.39:
2.61:
2.62:
2.63:
2.67:

Multiplying triangular matrices


Making a block LU into an LU (TODO)
Positive definite characterizations
Using PLU of A to solve AT x = b
Using PLU to compute the determinant
Using PLU to compute the inverse (TODO)
Direct proof of Theorem 2.64 (TODO)

14
14
14
15
15
15
15

Chapter 3. The Kronecker Product


Exercise 3.2: 2 2 Poisson matrix
Exercise 3.5: Properties of Kronecker products
Exercise 3.11: 2nd derivative matrix is positive definite
Exercise 3.12: 1D test matrix is positive definite?
Exercise 3.13: Eigenvalues 2 2 for 2D test matrix
Exercise 3.14: Nine point scheme for Poisson problem
Exercise 3.15: Matrix equation for nine point scheme
Exercise 3.16: Biharmonic equation

16
16
16
17
17
18
18
19
20

Chapter 4. Fast Direct Solution of a Large Linear System


Exercise 4.5: Fourier matrix
Exercise 4.6: Sine transform as Fourier transform
Exercise 4.7: Explicit solution of the discrete Poisson equation
Exercise 4.8: Improved version of Algorithm 4.1
Exercise 4.9: Fast solution of 9 point scheme
Exercise 4.10: Algorithm for fast solution of 9 point scheme
Exercise 4.11: Fast solution of biharmonic equation
Exercise 4.12: Algorithm for fast solution of biharmonic equation
Exercise 4.13: Check algorithm for fast solution of biharmonic equation
Exercise 4.14: Fast solution of biharmonic equation using 9 point rule (TODO)

22
22
22
23
23
24
25
25
26
26
27

Chapter 5. Matrix Reduction by Similarity Transformations


Exercise 5.3: Idempotent matrix
Exercise 5.4: Nilpotent matrix
Exercise 5.5: Eigenvalues of a unitary matrix
Exercise 5.6: Nonsingular approximation of a singular matrix
Exercise 5.7: Companion matrix
Exercise 5.16: Schur decomposition example
Exercise 5.19: Skew-Hermitian matrix
Exercise 5.20: Eigenvalues of a skew-Hermitian matrix
Exercise 5.31: Eigenvalue perturbation for Hermitian matrices
Exercise 5.33: Hoffman-Wielandt
Exercise 5.42: Find eigenpair example
Exercise 5.45: Jordan example
Exercise 5.46: Big Jordan example
Exercise 5.49: Properties of the Jordan form
Exercise 5.50: Powers of a Jordan block
Exercise 5.52: Minimal polynomial example
Exercise 5.53: Similar matrix polynomials
Exercise 5.54: Minimal polynomial of a diagonalizable matrix

28
28
28
28
28
29
29
29
30
30
30
30
30
31
31
31
32
32
32

iii

Exercise 5.59: Biorthogonal expansion


Exercise 5.60: Generalized Rayleigh quotient

32
33

Chapter 6. The Singular Value Decomposition


Exercise 6.14: SVD examples
Exercise 6.15: More SVD examples
Exercise 6.17: Counting dimensions of fundamental subspaces
Exercise 6.18: Rank and nullity relations
Exercise 6.19: Orthonormal bases example
Exercise 6.20: Some spanning sets
Exercise 6.21: Singular values and eigenpair of composite matrix
Exercise 6.27: Rank example
Exercise 6.28: Another rank example

34
34
35
35
36
36
37
37
37
38

Chapter 7. Matrix Norms


Exercise 7.4: Consistency of sum norm?
Exercise 7.5: Consistency of max norm?
Exercise 7.6: Consistency of modified max norm?
Exercise 7.8: The sum norm is subordinate to?
Exercise 7.9: The max norm is subordinate to?
Exercise 7.16: Spectral norm
Exercise 7.17: Spectral norm of the inverse
Exercise 7.18: p-norm example
Exercise 7.21: Unitary invariance of the spectral norm
Exercise 7.22: kAUk2 rectangular A
Exercise 7.23: p-norm of diagonal matrix
Exercise 7.24: Spectral norm of a column vector
Exercise 7.25: Norm of absolute value matrix
Exercise 7.32: Sharpness of perturbation bounds
Exercise 7.33: Condition number of 2nd derivative matrix
Exercise 7.44: When is a complex norm an inner product norm?
Exercise 7.45: p-norm for p = 1 and p =
Exercise 7.46: The p-norm unit sphere
Exercise 7.47: Sharpness of p-norm inequality
Exercise 7.48: p-norm inequalities for arbitrary p

40
40
40
40
41
42
42
42
43
43
43
43
44
44
45
45
47
48
48
48
49

Chapter 8. The Classical Iterative Methods


Exercise 8.2: Richardson and Jacobi
Exercise 8.13: Convergence of the R-method when eigenvalues have positive real
part (TODO)
Exercise 8.16: Example: GS converges, J diverges
Exercise 8.17: Divergence example for J and GS
Exercise 8.18: Strictly diagonally dominance; The J method
Exercise 8.19: Strictly diagonally dominance; The GS method
Exercise 8.23: Convergence example for fix point iteration
Exercise 8.24: Estimate in Lemma 8.22 can be exact
Exercise 8.25: Slow spectral radius convergence
Exercise 8.31: A special norm (TODO)
Exercise 8.33: When is A + E nonsingular?

50
50

iv

50
50
51
51
51
52
52
53
55
55

Chapter 9. The Conjugate Gradient Method


Exercise 9.1: Paraboloid
Exercise 9.4: Steepest descent iteration
Exercise 9.7: Conjugate gradient iteration, II
Exercise 9.8: Conjugate gradient iteration, III
Exercise 9.9: The cg step length is optimal
Exercise 9.10: Starting value in cg
Exercise 9.15: The A-inner product
Exercise 9.17: Program code for testing steepest descent
Exercise 9.18: Using cg to solve normal equations
Exercise 9.20: Maximum of a convex function
Exercise 9.25: Krylov space and cg iterations
Exercise 9.28: Another explicit formula for the Chebyshev polynomial

56
56
56
57
57
58
58
59
59
61
62
62
63

Chapter 10. Orthonormal and Unitary Transformations


Exercise 10.2: Reflector
Exercise 10.5: What does Algorithm housegen do when x = e1 ?
Exercise 10.6: Examples of Householder transformations
Exercise 10.7: 2 2 Householder transformation
Exercise 10.16: QR decomposition
Exercise 10.17: Householder triangulation
Exercise 10.20: QR using Gram-Schmidt, II
Exercise 10.22: Plane rotation
Exercise 10.23: Solving upper Hessenberg system using rotations

64
64
64
65
65
65
66
66
67
68

Chapter 11. Least Squares


Exercise 11.7: Straight line fit (linear regression)
Exercise 11.8: Straight line fit using shifted power form
Exercise 11.9: Fitting a circle to points
Exercise 11.15: The generalized inverse
Exercise 11.16: Uniqueness of generalized inverse
Exercise 11.17: Verify that a matrix is a generalized inverse
Exercise 11.18: Linearly independent columns and generalized inverse
Exercise 11.19: The generalized inverse of a vector
Exercise 11.20: The generalized inverse of an outer product
Exercise 11.21: The generalized inverse of a diagonal matrix
Exercise 11.22: Properties of the generalized inverse
Exercise 11.23: The generalized inverse of a product
Exercise 11.24: The generalized inverse of the conjugate transpose
Exercise 11.25: Linearly independent columns
Exercise 11.26: Analysis of the general linear system
Exercise 11.27: Fredholms Alternative
Exercise 11.33: Condition number
Exercise 11.34: Equality in perturbation bound (TODO)
Exercise 11.36: Problem using normal equations

69
69
69
70
71
72
72
72
72
73
73
73
74
75
75
75
76
76
77
77

Chapter 12. Numerical Eigenvalue Problems


Exercise 12.5: Continuity of eigenvalues
Exercise 12.6: Nonsingularity using Gerschgorin

79
79
79

Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise
Exercise

12.7: Gerschgorin, strictly diagonally dominant matrix


12.12: -norm of a diagonal matrix
12.15: Number of arithmetic operations
12.17: Number of arithmetic operations
12.18: Tridiagonalize a symmetric matrix
12.22: Counting eigenvalues
12.23: Overflow in LDLT factorization
12.24: Simultaneous diagonalization
12.25: Program code for one eigenvalue
12.26: Determinant of upper Hessenberg matrix (TODO)

Chapter 13. The QR Algorithm


Exercise 13.4: Orthogonal vectors
Exercise 13.14: QR convergence detail (TODO)

vi

79
80
80
80
81
81
82
82
83
84
85
85
85

CHAPTER 0

Preliminaries
Exercise 0.24: The AT A inner product
Assume that A Rmn has linearly independent columns. We show that h, iA :
(x, y) 7 xT AT Ay satisfies the axioms of an inner product on a real vector space
V, as described in Definition 0.20. Let x, y, z V and a, b R, and let h, i be the
standard inner product on V.
Positivity. One has hx, xiA = xT AT Ax = hAx, Axi 0, with equality holding
if and only if Ax = 0. Since Ax is a linearly combination of the columns of A with
coefficients the entries of x, and since the columns of A are assumed to be linearly
independent, one has Ax = 0 if and only if x = 0.
Symmetry. One has hx, yiA = xT AT Ay = (xT AT Ay)T = yT AT Ax = hy, xiA .
Linearity. One has hax + by, ziA = (ax + by)T AT Az = axT AT Az + byT AT Az =
ahx, ziA + bhy, ziA .
Exercise 0.25: Angle between vectors in complex case
By the Cauchy-Schwarz inequality for a complex inner product space,
|hx, yi|
1.
0
kxkkyk
Note that taking x and y perpendicular yields zero, taking x and y equal yields one,
and any value in between can be obtained by picking an appropriate affine combination
of these two cases.
Since the cosine decreases monotonously from one to zero on the interval [0, /2],
there is a unique argument [0, /2] such that
cos =

|hx, yi|
.
kxkkyk

Exercise 0.41: The inverse of a general 2 2 matrix


A straightforward computation yields




 

1
1
d b a b
ad bc
0
1 0
=
=
,
c d
0
ad bc
0 1
ad bc c a
ad bc
showing that the two matrices are inverse to each other.
Exercise 0.42: The inverse of a 2 2 matrix
By Exercise 0.41, and using that cos2 + sin2 = 1, the inverse is given by


cos sin
.
sin cos
1

Exercise 0.43: Sherman-Morrison formula


A direct computation yields
(A + BCT ) A1 A1 B(I + CT A1 B)1 CT A1

= I B(I + CT A1 B)1 CT A1 + BCT A1 BCT A1 B(I + CT A1 B)1 CT A1


= I + BCT A1 B(I + CT A1 B)(I + CT A1 B)1 CT A1
= I + BCT A1 BCT A1
= I,
showing that the two matrices are inverse to each other.
Exercise 0.44: Cramers rule; special case
Cramers rule yields



3 2 1 2



= 3,
x1 =
/
6 1 2 1




1 3 1 2



= 0.
x2 =
/
2 6 2 1

Exercise 0.45: Adjoint matrix; special case


We are given the

2
A = 3
6

matrix

6 3
2 6 .
3
2

Computing the cofactors of A gives







1+2 3
1+1 2 6
(1)
(1) 3

6
2

T
2+2 2
2+1 6 3

(1)
adjA = (1)
6
3 2

6
3
(1)3+2 2
(1)3+1
3
2 6

T
14
21 42
= 42 14 21 .
21 42 14






6
1+3 3 2
(1)

6 3
2




3
2+3 2 6
(1)
6 3
2

3
3+3 2 6
(1)
3 2
6

One checks directly that adjA A = det(A)I, with det(A) = 343.


Exercise 0.47: Determinant equation for a plane
Let ax + by + cz + d = 0 be an equation for a plane through the points (xi , yi , zi ), with
i = 1, 2, 3. There is precisely one such plane if and only if the points are not colinear.
Then axi + byi + czi + d = 0 for i = 1, 2, 3, so that


x y z 1 a
0
x1 y1 z1 1 b 0


x2 y2 z2 1 c = 0 .
x3 y3 z3 1
d
0
2

Since the coordinates a, b, c, d of the plane are not all zero, the above matrix is singular, implying that its determinant is zero. Computing this determinant by cofactor
expansion of the first row gives the equation








y1 z1 1
x1 z1 1
x1 y1 1
x1 y1 z1








+ y2 z2 1 x x2 z2 1 y + x2 y2 1 z x2 y2 z2 = 0
y3 z3 1
x3 z3 1
x3 y3 1
x3 y3 z3
of the plane.
Exercise 0.48: Signed area of a triangle
Let T denote the triangle with vertices P1 , P2 , P3 . Since the area of a triangle is
invariant under translation, we can assume P1 = A = (0, 0), P2 = (x2 , y2 ), P3 = (x3 , y3 ),
B = (x3 , 0), and C = (x2 , 0). As is clear from Figure 2, the area A(T ) can be expressed
as
A(T ) = A(ABP3 ) + A(P3 BCP2 ) A(ACP2 )
1
1
1
= x3 y3 + (x2 x3 )y2 + (x2 x3 )(y3 y2 ) x2 y2
2
2
2

1 1 1

1
= 0 x2 x3 ,
2 0 y y
2
3
which is what needed to be shown.
Exercise 0.49: Vandermonde matrix
For any n = 1, 2, . . ., let

1 x1 x21

1 x2 x22

2
Dn := 1 x3 x3
..
... ...
.

1 x x2
n
n


x1n1


xn1
2
n1
x3
..
..
.
.
n1
x
n

be the determinant of the Vandermonde matrix in the Exercise. Clearly the formula
Y
(?)
DN =
(xi xj )
1j<iN

holds for N = 1 (in which case the product is empty and defined to be 1) and N = 2.
Let us assume (?) holds for N = n 1 > 2. Since the determinant is an alternating
multilinear form, adding a scalar multiple of one column to another does not change
the value of the determinant. Subtracting xkn times column k from column k + 1 for
k = n 1, n 2, . . . , 1, we find


1 x1 xn x21 x1 xn x1n1 x1n2 xn


1 x2 xn x22 x2 xn x2n1 x2n2 xn


n1
n2
2
Dn = 1 x3 xn x3 x3 xn x3 x3 xn .
..
..
..
..
...

.
.
.
.


1 x x x2 x x xn1 xn2 x
n
n
n n
n
n
n
n
3

Next, by cofactor expansion along the last row and by



x 1 xn
x21 x1 xn

x 2 xn
x22 x2 xn
Dn = (1)n1 1
..
..
.
.

x
2

xn1 xn

x
x
n1
n
n1

the multilinearity in the rows,



x1n1 x1n2 xn

x2n1 x2n2 xn

..
..

.
.

n1
n2
xn1
xn1
xn

= (1)n1 (x1 xn )(x2 xn ) (xn1 xn )Dn1


Y
= (xn x1 )(xn x2 ) (xn xn1 )
(xi xj )
1j<in1

(xi xj ).

1j<in

By induction, we conclude that (?) holds for any N = 1, 2, . . .


Exercise 0.50: Cauchy determinant
(a) Let [1 , . . . , n ]T , [1 , . . . , n ]T Rn and let
1
1
1 +1


A = (ai,j )i,j =

1
i + j


i,j

1
+
= 2. 1
..

1
n +1

Multiplying the ith row of A by


C = (ci,j )i,j ,

ci,j =

Qn

n
Y

k=1 (i

1 +2
1
2 +2

..
.

1
n +2

...

1
1 +n
1

2 +n

1
n +n

..
.

+ k ) for i = 1, 2, . . . , n gives a matrix

(i + k ).

k=1
k6=j

The determinant of an n n matrix is a homogeneous polynomial of degree n in the


entries of the matrix. Since each entry of C is a polynomial of degree n 1 in the
variables i , j , the determinant of C must be a homogeneous polynomial of degree
n(n 1) in i , j .
Q
By the multilinearity of the determinant, det C = ni,j=1 (i + j ) det A. Since A
vanishes whenever i = j or i = j for i 6= j, the homogeneous polynomial det C
contains
 factors (i j ) and (i j ) for 1 i < j n. As there are precisely
n
2 2 = (n 1)n such factors, necessarily
Y
Y
(?)
det C = k
(i j )
(i j )
1i<jn

1i<jn

for some constant k. To determine k, we can evaluate det C at a particular value, for
instance any {i , j }i,j satisfying 1 + 1 = = n + n = 0. In that case C becomes
a diagonal matrix with determinant
det C =

n Y
n
Y

(i + k ) =

i=1 k=1
k6=i

n Y
n
Y

(i k ) =

i=1 k=1
k6=i

(i k )

1i<kn

Y
1i<kn

(k i ).

Comparing with (?) shows that k = 1. We conclude that


Y
Y
(i j )
(i j )
(??)

det A =

1i<jn

1i<jn
n
Y

(i + j )

i,j=1

(b) Deleting row l and column k from A, results in the matrix Al,k associated to
the vectors [1 , . . . , l1 , l+1 , . . . , n ] and [1 , . . . , k1 , k+1 , . . . , n ]. By the adjoint
formula for the inverse A1 = (bk,l ) and by (??),
det Al,k
bk,l := (1)k+l
det A
n
Y
Y
Y
(i + j )
(i j )
(i j )
i,j=1

= (1)

k+l
n
Y

1i<jn
i,j6=l

1i<jn
i,j6=k

(i + j )

i,j=1
i6=l
j6=k
n
Y

(i j )

1i<jn

(i j )

1i<jn

n
Y
(s + l )
(s + k )

= (l + k )

s=1
s6=l
n
Y

s=1
s6=k
n
Y

s=1
s6=l
n
Y

s=1
s6=k
n
Y

s6=l

s6=k

(s l )

= (l + k )

s + k
l
s=1 s

(s k )

s + l
,

s
k
s=1

which is what needed to be shown.


Exercise 0.51: Inverse of the Hilbert matrix
If we write
= [1 , . . . , n ] = [0, 1, . . . , n 1],

then the Hilbert matrix matrix is of the form Hn = (hi,j ) = 1/(i + j ) . By Exercise
0.50.(b), its inverse Tn = (tni,j ) := H1
n is given by
= [1 , . . . , n ] = [1, 2, . . . , n],

tni,j

= (i + j 1)

n
n
Y
s+i1Ys+j1
s=1
s6=j

sj

si

s=1
s6=i

We wish to show that


f (i)f (j)
(?)
tni,j =
,
1 i, j n,
i+j1
where f : N Q is the sequence defined by
 2

i n2
f (1) = n,
f (i + 1) =
f (i),
i2
5

1 i, j n.

for i = 1, 2, . . . .

Clearly (?) holds when i = j = 1. Suppose that (?) holds for some (i, j). Then
n
n
Y
s+1+i1 Y s+j1
tni+1,j = (i + j)
sj
s1i
s=1
s=1
s6=j

s6=i+1

n+1
Y

n
Y
(s + i 1) (s + j 1)

= (i + j)

1
s=2
n
2
(i + j) Y
(s j)

s=1
n1
Y

(s i)

s=1
s6=j

s=0
s6=i
n
n
Y
Y
(s + i 1) (s + j 1)
s=1

(i + j 1)2 (n + i)(n i) s6=j


n
Y
(i + j)i(i)
(s j)
s=1
s6=j

s=1
s6=i
n
Y
(s i)
s=1
s6=i

n
n
Y
Y
1 i2 n2
s+i1
s+j1
=
(i + j 1)
(i + j 1)
2
i+j i
sj
si
s=1
s=1
s6=j

s6=i

1 i n
f (i)f (j)
i + j i2
f (i + 1)f (j)
,
=
(i + 1) + j 1
so that (?) holds for (i + 1, j). Carrying out a similar calculation for (i, j + 1), or using
the symmetry of Tn , we conclude by induction that (?) holds for any i, j.
=

CHAPTER 1

Examples of Linear Systems


Exercise 1.2: Gaussian elimination example
Applying the recipe of

1 1 1
1 1 3
2 8 3

Example 1.1

1
1

0
1
1
0

to the augmented matrix

1 1 1
1 1

2 2
0 2
2
6 5 1
0 0

[A|b], yields

1 1
2
2 .
1 7

Using back substitution we find that x = [14, 6, 7]T .


Exercise 1.8: Strict diagonal dominance
Suppose that the matrix A = tridiag(ai , di , ci ) Cnn as in (1.3) is strictly diagonally
dominant. To show that an LU factorization of A exists, we again need to show that
u1 , . . . , un1 in (1.4) are nonzero. It therefore suffices to show by induction that
(?)

|uk | > |ck |,

k = 1, . . . , n 1.

This clearly holds for k = 1, since A is strictly diagonally dominant. Assume (?) holds
for some k satisfying 1 k n 2. Using (1.4), the reverse triangle inequality, the
induction hypothesis, and strict diagonal dominance of A,




a
c
k
k
|dk+1 | |ck | |ak | |dk+1 | |ak | > |ck+1 |,
|uk+1 | = dk+1
uk
|uk |
and it follows by induction that an LU factorization exists. Moreover, since the LU
factorization is completely determined by (1.4), it is unique.
Exercise 1.9: LU factorization of 2nd derivative matrix
Let L = (lij )ij , U = (rij )ij and T be as in the exercise. Clearly L is unit lower triangular
and U is upper triangular. We compute the product LU by separating cases for its
entries. There are several ways to carry out and write down this computation, some
more precise than others. For instance,
(LU)11 = 1 2 = 2;
i1
i+1
1 + 1
= 2,
i
i
i1
i
=

= 1,
i
i1
= 1 1 = 1,

(LU)ii =

for i = 2, . . . , m;

(LU)i,i1

for i = 2, . . . , m;

(LU)i1,i

for i = 2, . . . , m;
for |i j| 2.

(LU)ij = 0,
It follows that T = LU is an LU factorization.
7

Another way to show that T = LU is by induction. For m = 1, one has L1 U1 =


1 2 = T1 . Now let m > 1 be arbitrary and assume that Lm Um = Tm . With
m T
a := [0, . . . , 0,
] ,
b := [0, . . . , 0, 1]T ,
m+1
block multiplication yields



Lm 0 Um b
Lm+1 Um+1 = T
m+2 =
0 m+1
a
1
 


Tm
Lm b
Tm b
=
= Tm+1 .
aT Um aT b + m+2
bT 2
m+1
By induction, we can then conclude that Tm = Lm Um for all m 1.
Exercise 1.10: Inverse of 2nd derivative matrix
Let S = (sij )ij be defined by


i
sij = sji = 1
j,
m+1

for 1 j i m.

In order to show that S = T1 , we multiply S by T and show that the result is the
identity matrix. To simplify notation we define sij := 0 whenever i = 0, i = m + 1,
j = 0, or j = m + 1. With 1 j < i m, we find
m
X

si,k Tk,j = si,j1 + 2si,j si,j+1
ST i,j =
k=1


= 1
ST


j,i

m
X

i
m+1


(j + 1 + 2j j 1) = 0,

sj,k Tk,i = sj,i1 + 2sj,i sj,i+1

k=1




i
i+1
j+2 1
j 1
j
m+1
m+1
i 1 2i + i + 1
= j + 2j j + j
= 0,
m+1
m
X

ST i,i =
si,k Tk,i = si,i1 + 2si,i si,i+1
i1
= 1
m+1

k=1


= 1

i
m+1


(i 1) + 2 1

i
m+1

i+1
i 1
m+1


i=1

which means that ST = I. Moreover, since S, T, and I are symmetric, transposing


this equation yields TS = I. We conclude that S = T1 .
Exercise 1.11: Central difference approximation of 2nd derivative
If all hi equal to the same number h, then
2h
yi+1 yi
i = i =
= 1,
i =
,
h+h
h
which is what needed to be shown.
8

i = 3(i1 + i ) = 3

yi+1 yi1
,
h

Exercise 1.12: Two point boundary value problem (TODO)

Exercise 1.13: Two point boundary value problem; computation (TODO)

Exercise 1.14: Matrix element as a quadratic form


Write A = (aij )ij and ei = (ik )k , where

1 if i = k,
ik =
0 otherwise,
is the Kronecker delta. Then, by the definition of the matrix product,
!
X
X
alk jk = eTi (alj )l =
il alj = aij .
eTi Aej = eTi (Aej ) = eTi
k

Exercise 1.15: Outer product expansion of a matrix


Let ij denote the Kronecker delta. For any indices 1 k m and 1 l n, the
(k, l)-th entry of the matrix ei eTj satisfies
X

(ei )ko (eTj )ol = (ei )k1 (eTj )1l = ik jl .
ei eTj kl =
o

It follows that
!
XX
i

aij ei eTj

XX
i

kl

aij ei eTj


kl

XX
i

aij ik jl = akl

for any indices k, l, implying the statement of the Exercise.


Exercise 1.16: The product AT A
A matrix product is defined as long as the dimensions of the matrices are compatible.
More precisely, for the matrix product AB to be defined, the number of columns in A
must equal the number of rows in B.
Let now A be an n m matrix. Then AT is an m n matrix, and as a consequence
the product B := AT A is well defined. Moreover, the (i, j)-th entry of B is given by
T

(B)ij = A A


ij

n
X

aki akj = aT.i a.j = ha.i , a.j i,

k=1

which is what needed to be shown.


9

Exercise 1.17: Outer product expansion


Recall that the matrix product of A Cm,n and BT = C Cn,p is defined by
n
n
X
X
(AC)ij =
aik ckj =
aik bjk .
k=1

k=1

For the outer


 product expansion of the columns of A and B, on the other hand, we
find a:k bT:k ij = aik bjk . It follows that
AB


ij

n
X

aik bjk =

k=1

n
X

a:k bT:k


ij

k=1

Exercise 1.18: System with many right hand sides; compact form
Let A, B, and X be as in the Exercise.
(=): Suppose AX = B. Multiplying this equation from the right by ej yields
Ax.j = b.j for j = 1, . . . , p.
(=): Suppose Ax.j = b.j for j = 1, . . . , p. Let I = Ip denote the identity matrix.
Then
AX = AXI = AX[e1 , . . . , ep ] = [AXe1 , . . . , AXep ]
= [Ax.1 , . . . , Ax.p ] = [b.1 , . . . , b.p ] = B.
Exercise 1.19: Block multiplication example
The product AB of two matrices A and B is defined precisely when the number of
columns of A is equal to the number of rows of B. For both sides in the equation
AB = A1 B1 to make sense, both pairs (A, B) and (A1 , B1 ) need to be compatible in
this way. Conversely, if the number of columns of A equals the number of rows of B
and the number of columns of A1 equals the number of rows of B1 , then there exists
integers m, p, n, and s with 1 s p such that
A Cm,p , B Cp,n , A1 Cm,s , A2 Cm,ps , B1 Cs,n .
Then
(AB)ij =

p
X
k=1

aik bkj =

s
X
k=1

p
X

aik bkj +

aik 0 = (A1 B1 )ij .

k=s+1

Exercise 1.20: Another block multiplication example


Since the matrices have compatible dimensions, a direct computation gives



 

 

1 0T aT 1 0T

aT
1 0T

aT B1
CAB =
=
=
.
0 C1 0 A1 0 B1
0 C1 A1 0 B1
0 C1 A1 B1

10

CHAPTER 2

LU Factorizations
Exercise 2.3: Column oriented backsolve (TODO)

Exercise 2.6: Computing the inverse of a triangular matrix


This exercise introduces an efficient method for computing the inverse B of a triangular
matrix A.
Let us solve the problem for an upper triangular matrix (the lower triangular case
is similar). By the rules of block multiplication,
[Ab1 , . . . , Abn ] = A[b1 , . . . , bn ] = AB = I = [e1 , . . . , en ].
By Lemma 1.22, the matrix B is upper triangular, implying that the other entries bk+1,k ,
. . . , bn,k in bk are zero. The kth column in this matrix equation can be partioned into
blocks, as

0
b1k
a11 a1,k a1,k+1 a1,n
..
..
..
..
...
...
.. .

.
.
.
.

0
ak,k ak,k+1 ak,n bkk

= 1 .
ak+1,k+1 ak+1,n 0

. 0

..
...
.. .

0
.
..
an,n
0
0
Evaluating the upper block matrix multiplication then yields (2.4). Solving the above
system thus yields the kth column of B.
Performing this block multiplication for k = n, n 1, . . . , 1, we see that the computations after step k only use the first k 1 leading principal submatrices of A. It
follows that the column bk computed at step k can be stored in row (or column) k of
A without altering the remaining computations.
Exercise 2.15: Row interchange
Suppose we are given an LU factorization

 


1 1
1 0 u11 u12
=
.
0 1
l21 1
0 u22
Carrying out the matrix multiplication on the right hand side, one finds that

 

1 1
u11
u12
=
,
0 1
l21 u11 l21 u12 + u22
11

implying that u11 = u12 = 1. It follows that necessarily l21 = 0 and u22 = 1, and the
pair




1 0
1 1
L=
,
U=
0 1
0 1


1 1
is the only possible LU factorization of the matrix
. One directly checks that
0 1
this is indeed an LU factorization.
Exercise 2.16: LU of singular matrix
Suppose we are given an LU factorization

 


1 1
1 0 u11 u12
=
.
1 1
l21 1
0 u22
Carrying out the matrix multiplication on the right hand side, one finds that

 

1 1
u11
u12
=
,
1 1
l21 u11 l21 u12 + u22
implying that u11 = u12 = 1. It follows that necessarily l21 = 1/u11 = 1 and u22 =
1 l21 u12 = 0, and the pair




1 0
1 1
L=
,
U=
1 1
0 0


1 1
. One directly checks that
is the only possible LU factorization of the matrix
1 1
this is indeed an LU factorization.
Exercise 2.17: LU and determinant
Suppose A has an LU factorization A = LU. Then, by Lemma 2.11, A[k] = L[k] U[k]
is an LU factorization for k = 1, . . . , n. By induction, the cofactor expansion of the
determinant yields that the determinant of a triangular matrix is the product of its
diagonal entries. One therefore finds that det(L[k] ) = 1, det(U[k] ) = u11 ukk and
det(A[k] ) = det(L[k] U[k] ) = det(L[k] ) det(U[k] ) = u11 ukk
for k = 1, . . . , n.
Exercise 2.18: Diagonal elements in U
From Exercise 2.17, we know that det(A[k] ) = u11 ukk for k = 1, . . . , n. Since A
is nonsingular, its determinant det(A) = u11 unn is nonzero. This implies that
det(A[k] ) = u11 ukk 6= 0 for k = 1, . . . , n, yielding a11 = u11 for k = 1 and a
well-defined quotient
det(A[k] )
u1,1 uk1,k1 uk,k
=
= uk,k ,
det(A[k1] )
u1,1 uk1,k1
for k = 2, . . . , n.
12

Exercise 2.20: Finite sums of integers


There are many ways to prove these identities. While the quickest way to prove these
identities is by induction, we choose a generating function approach because it is a
powerful method that works in a wide range of circumstances.
It is easily checked that the identities hold for m = 1, 2, 3. So let m 4 and let
Pm := 1 + x + + xm =

1 xm+1
.
1x

Then
Pm0 =

1 (m + 1)xm + mxm+1
,
(x 1)2

2 + (m2 + m)xm1 + 2(1 m2 )xm + (m2 m)xm+1


.
(x 1)3
Applying lHopitals rule twice, we find
Pm00 =

1 + 2 + + m = Pm0 (1)
1 (m + 1)xm + mxm+1
x1
(x 1)2
m(m + 1)xm1 + m(m + 1)xm
= lim
x1
2(x 1)
1
= m(m + 1),
2
establishing (2.10). In addition it follows that
= lim

1 + 3 + + 2m 1 =

m
X

(2k 1) = m + 2

m
X

k = m + m(m + 1) = m2 ,

k=1

k=1

which establishes (2.12). Next, applying lHopitals rule three times, we find that
1 2 + 2 3 + + (m 1) m = Pm00 (1)
is equal to
2 + (m2 + m)xm1 + 2(1 m2 )xm + (m2 m)xm+1
x1
(x 1)3
2
m2
(m 1)(m + m)x
+ 2m(1 m2 )xm1 + (m + 1)(m2 m)xm
= lim
x1
3(x 1)2
2
m3
(m 2)(m 1)(m + m)x
+ 2(m 1)m(1 m2 )xm2 + m(m + 1)(m2 m)xm1
= lim
x1
6(x 1)
1
= (m 1)m(m + 1),
3
lim

establishing (2.13). Finally,


2

1 + 2 + + m =

m
X
k=1

k =

m
X

(k 1)k + k =

k=1

m
X
k=1

(k 1)k +

m
X

k=1

1
1
1
1
= (m 1)m(m + 1) + m(m + 1) = (m + 1)(m + )m,
3
2
3
2
which establishes (2.11).
13

Exercise 2.21: Operations


Solving a k k upper triangular system for k = n, n 1, . . . , 1, takes
n
X
1
1
1
k 2 = n(n + )(n + 1) Gn
3
2
2
k=1
arithmetic operations.
Exercise 2.22: Multiplying triangular matrices
Computing the (i, j)-th entry of the matrix AB amounts to computing the inner product of the ith row aTi: of A and the jth column b:j of B. Because of the triangular
nature of A and B, only the first i entries of aTi: can be nonzero and only the first
j entries of b:j can be nonzero. The computation aTi: b:j therefore involves min{i, j}
multiplications and min{i, j} 1 additions. Carrying out this calculation for all i and
j, amounts to a total number of
!
n X
n
n
i
n
X
X
X
X
(2 min{i, j} 1) =
(2j 1) +
(2i 1)
i=1 j=1

n
X

i=1

j=1


i + i(i + 1) + (n i)(2i 1) =

i=1

j=i+1
n
X


i2 + 2ni n + i

i=1

= n2 + (2n + 1)

n
X
i=1

n
X

i2

i=1

1
1
= n2 + n(n + 1)(2n + 1) n(n + 1)(2n + 1)
2
6
1
2
1
1
= n2 + n(n + 1)(2n + 1) = n3 + n = n(2n2 + 1)
3
3
3
3
arithmetic operations. A similar calculation gives the same result for the product BA.
Exercise 2.30: Making a block LU into an LU (TODO)

Exercise 2.39: Positive definite characterizations


We check the equivalent statements of Theorem 2.38 for the matrix


2 1
A=
.
1 2
1. Obviously A is symmetric. In addition A is positive definite, because

 

 2 1 x
x y
= 2x2 2xy + 2y 2 = (x y)2 + x2 + y 2 > 0
1 2
y
for any nonzero vector [x, y]T R2 .
2. The eigenvalues of A are the roots of the characteristic equation
0 = det(A I) = (2 )2 1 = ( 1)( 3).
Hence the eigenvalues are = 1 and = 3, which are both positive.
3. The leading principal submatrices of A are [2] and A itself, which both have
positive determinants.
14

4. One checks that A = BBT , for the nonsingular matrix




2 p0
B=
.
1/ 2
3/2
Exercise 2.61: Using PLU of A to solve AT x = b
If A = PLR, then AT = RT LT PT . The matrix LT is upper triangular and the matrix
RT is lower triangular, implying that RT LT is an LU factorization of AT P. Since A
is nonsingular, the matrix RT must be nonsingular, and we can apply Algorithms 2.1
and 2.2 to economically solve the systems RT z = b, LT y = z, and PT x = y, to find a
solution x to the system RT LT PT x = AT x = b.
Exercise 2.62: Using PLU to compute the determinant
If A = PLU, then
det(A) = det(PLU) = det(P) det(L) det(U)
and the determinant of A can be computed from the determinants of P, L, and U.
Since the latter two matrices are triangular, their determinants are simply the products
of their diagonal entries. The matrix P, on the other hand, is a permutation matrix,
so that every row and column is everywhere 0, except for a single entry (where it is 1).
Its determinant is therefore quickly computed by cofactor expansion.
Exercise 2.63: Using PLU to compute the inverse (TODO)

Exercise 2.67: Direct proof of Theorem 2.64 (TODO)

15

CHAPTER 3

The Kronecker Product


Exercise 3.2: 2 2 Poisson matrix
For m = 2, the Poisson matrix A is the 22 22 matrix given by

4 1 1
0
1
4
0 1
.

1
0
4 1
0 1 1
4
P
In every row i, one has |aii | = 4 > 2 = | 1| + | 1| + |0| = j6=i |aij |. In other words,
A is strictly diagonally dominant.

Exercise 3.5: Properties of Kronecker products


Let be given matrices A, A1 , A2 Rpq , B, B1 , B2 Rrs , and C Rtu . Then
(A) (B) = (A B) by definition of the Kronecker product and since

(A)b11 (A)b12 (A)b1s


Ab11 Ab12 Ab1s
(A)b21 (A)b22 (A)b2s
Ab
Ab22 Ab2s

= . 21
.
.
.
.
..
..
.
..

..
..
..
..
..
.
.
.
(A)br1 (A)br2 (A)brs
Abr1 Abr2 Abrs
The identity (A1 + A2 ) B = (A1 B) + (A2 B) follows from

(A1 + A2 )b11 (A1 + A2 )b12 (A1 + A2 )b1s


(A1 + A2 )b21 (A1 + A2 )b22 (A1 + A2 )b2s

..
..
..
..

.
.
.
.
(A1 + A2 )br1 (A1 + A2 )br2 (A1 + A2 )brs

A1 b11 + A2 b11 A1 b12 + A2 b12 A1 b1s + A2 b1s


A1 b21 + A2 b21 A1 b22 + A2 b22 A1 b2s + A2 b2s

=
..
..
..
..

.
.
.
.
A1 br1 + A2 br1 A1 br2 + A2 br2 A1 brs + A2 brs

A1 b11 A1 b12 A1 b1s


A2 b11 A2 b12 A2 b1s
A1 b21 A1 b22 A1 b2s A2 b21 A2 b22 A2 b2s
=
+ .
.
..
..
..
..
..
..
...
.
.
.
. ..
.
.
A1 br1 A1 br2

A2 br1 A2 br2

A1 brs

A2 brs

A similar argument proves A (B1 + B2 ) = (A B1 ) + (A B2 ), and therefore the


bilinearity of the Kronecker product. The associativity (A B) C = A (B C)
16

follows from

Ab11 Ab1s
.. C
= ...
.
Abr1 Abrs

Ab11 c11 Ab1s c11


..
..

.
.

Abr1 c11 Abrs c11

..
=
.

Ab c
Ab1s ct1

11 t1

.
..
..

.
Abr1 ct1 Abrs ct1

b11 c11 b1s c11


..
..
.
.

br1 c11 brs c11

..
=A
.

b c
11 t1 b1s ct1
.
..
..
.
br1 ct1

Bc11
..

=A
.
Bct1

Ab11 c1u
..
.
Abr1 c1u
..
.

Ab11 ctu
..
.
Abr1 ctu
b11 c1u
..
.

br1 c1u
..
.
b11 ctu
..
.

br1 ctu

brs ct1

Bc1u
.. .
.
Bctu

Ab1s c1u
..

Abrs c1u

Ab1s ctu

..

.
Abrs ctu

b1s c1u
..
.

brs c1u

b1s ctu

..
.
brs ctu

Exercise 3.11: 2nd derivative matrix is positive definite


Applying Lemma 3.8 to the case that a = 1 and d = 2, one finds that the eigenvalues
j of the matrix tridiag(1, 2, 1) Rm,m are





j
j
j = d + 2a cos
= 2 1 cos
,
m+1
m+1
for j = 1, . . . , m. Moreover, as |cos(x)| < 1 for any x (0, ), it follows that j > 0 for
j = 1, . . . , m. Since, in addition, tridiag(1, 2, 1) is symmetric, Lemma 2.41 implies
that the matrix tridiag(1, 2, 1) is symmetric positive definite.
Exercise 3.12: 1D test matrix is positive definite?
The statement of this exercise is a generalization of the statement of Exercise 3.11.
Consider a matrix M = tridiag(a, d, a) Rm,m for which d > 0 and d 2|a|. By
Lemma 3.8, the eigenvalues j , with j = 1, . . . , m, of the matrix M are


j
j = d + 2a cos
.
m+1
If a = 0, then all these eigenvalues are equal to d and therefore positive. If a 6= 0, write
sgn(a) for the sign of a. Then






a
j
j
cos
= 2|a| 1 + sgn(a) cos
> 0,
j 2|a| 1 +
|a|
m+1
m+1
17

again because |cos(x)| < 1 for any x (0, ). Since, in addition, M is symmetric,
Lemma 2.41 implies that M is symmetric positive definite.
Exercise 3.13: Eigenvalues 2 2 for 2D test matrix
One has



1
2d + 2a
1
2d a a 0
1
a 2d 0 a 1 2d + 2a



Ax =
a 0 2d a 1 = 2d + 2a = (2d + 2a) 1 = x,
1
2d + 2a
0 a a 2d 1

which means that (, x) is an eigenpair of A. For j = k = 1 and m = 2, Theorem


3.10.1 implies that

3/4
1
   
3/4 1
3/2
3/2

x1,1 = s1 s1 =

=
3/4 1 = x.
3/2
3/2
3/4
1
Equation (3.20), on the other hand, implies that
 
= 2d + 2a = .
1,1 = 2d + 4a cos
3
We conclude that the eigenpair (, x) agrees with the eigenpair (1,1 , x1,1 ).
Exercise 3.14: Nine point scheme for Poisson problem
(a) If m = 2, the boundary condition yields

0 0 0 0
v00 v01 v02 v03
v10

0
v13
,
= 0

v20
0
v23 0
0 0 0 0
v30 v31 v32 v33
leaving four equations
to determine the interior points v11 , v12 , v21 , v22 . As 6h2 /12 =

1/ 2(m + 1)2 = 1/18 for m = 2, we obtain
20v11 4v01 4v10 4v21 4v12 v00 v20 v02 v22
1
= (8f11 + f01 + f10 + f21 + f12 ),
18
20v21 4v11 4v20 4v31 4v22 v10 v30 v12 v32
1
= (8f21 + f11 + f20 + f31 + f22 ),
18
20v12 4v02 4v11 4v22 4v13 v01 v21 v03 v23
1
= (8f12 + f02 + f11 + f22 + f13 ),
18
20v22 4v12 4v21 4v32 4v23 v11 v31 v13 v33
1
= (8f22 + f12 + f21 + f32 + f23 ),
18
Using the values known from the boundary condition, these equations can be simplified
to
1
20v11 4v21 4v12 v22 = (8f11 + f01 + f10 + f21 + f12 ),
18
18

1
(8f21 + f11 + f20 + f31 + f22 ),
18
1
20v12 4v11 4v22 v21 = (8f12 + f02 + f11 + f22 + f13 ),
18
1
20v22 4v12 4v21 v11 = (8f22 + f12 + f21 + f32 + f23 ).
18
(b) For f (x, y) = 2 2 sin(x) sin(y), one finds

f00 f01 f02 f03


0
0
0
0
f10 f11 f12 f13 0 3 2 /2 3 2 /2 0

f20 f21 f22 f23 = 0 3 2 /2 3 2 /2 0 .


f30 f31 f32 f33
0
0
0
0
20v21 4v11 4v22 v12 =

Substituting these

20 4
4 20

4 1
1 4

values in our linear system, we obtain


2

5 /6
1
4 1 v11
v21 8 + 1 + 1 3 2 1 5 2 /6
1 4
.
=
=
20 4 v12
18
2 1 5 2 /6
5 2 /6
1
v22
4 20

Solving this system we find that v11 = v12 = v21 = v22 = 5 2 /66.
Exercise 3.15: Matrix equation for nine point scheme
(a) Let

2 1 0

1 2 1

0 ... ... ...


,
T=

1 2 1
0 1 2

v11 v1m
..
...
V = ...
.
vm1 vmm

be of equal dimensions. Implicitly assuming the boundary condition


(?)

v0,k = vm+1,k = vj,0 = vj,m+1 = 0,

for j, k = 0, . . . , m + 1,

the (j, k)-th entry of TV + VT can be written as


4vj,k vj1,k vj+1,k vj,k1 vj,k+1 .
(Compare Equations (3.4) (3.5).) Similarly, writing out two matrix products, the
(j, k)-th entry of TVT = T(VT) is found to be
1(1vj1,k1 +2vj1,k 1vj1,k+1 )
+vj1,k1 2vj1,k +vj1,k+1
+2(1vj,k1
+2vj,k
1vj,k+1 )
2vj,k+1 .
= 2vj,k1 +4vj,k
1(1vj+1,k1 +2vj+1,k 1vj+1,k+1 )
+vj+1,k1 2vj+1,k +vj+1,k+1
Together, these observations yield that the System (3.22) is equivalent to (?) and
1
TV + VT TVT = h2 F.
6
(b) It is a direct consequence of properties 7 and 8 of Theorem 3.7 that this equation
can be rewritten to one of the form Ax = b, where
1
A = T I + I T T T, x = vec(V), b = h2 vec(F).
6
19

Exercise 3.16: Biharmonic equation


(a) Writing v = 2 u, the second line in Equation (3.24) is equivalent to
u(s, t) = v(s, t) = 0,

for (s, t) ,

while the first line is equivalent to



f (s, t) = 4 u(s, t) = 2 2 u(s, t) = 2 v(s, t),

for (s, t) .

(b) By property 8 of Theorem 3.7,


(A I + I B)vec(V) = vec(F) AV + VBT = F,
whenever A Rr,r , B Rs,s , F, V Rr,s (the identity matrices are assumed to be of
the appropriate dimensions). Using T = TT , this equation implies
TV + VT = h2 F (T I + I T)vec(V) = h2 vec(F),
TU + UT = h2 V (T I + I T)vec(U) = h2 vec(V).
Substituting the equation for vec(V) into the equation for vec(F), one obtains the
equation
Avec(U) = h4 vec(F),

where A := (T I + I T)2 ,

which is a linear system of m2 equations.


(c) The equations h2 V = (TU + UT) and TV + VT = h2 F together yield the
normal form
T(TU + UT) + (TU + UT)T = T2 U + 2TUT + UT2 = h4 F.
The vector form is given in (b). Using the distributive property of matrix multiplication
and the mixed product rule of Lemma 3.6, the matrix A = (T I + I T)2 can be
rewritten as
A = (T I)(T I) + (T I)(I T) + (I T)(T I) + (I T)(I T)
= T2 I + 2T T + I T2 .
Writing x := vec(U) and b := h4 vec(F), the linear system of (b) can be written as
Ax = b.
(d) Since T and I are symmetric positive definite, property 6 of Theorem 3.7 implies
that M := T I + I T is symmetric positive definite as well. The square of any
symmetric positive definite matrix is symmetric positive definite as well, implying
that A = M2 is symmetric positive definite. Let us now show this more directly by
calculating the eigenvalues of A.
By Lemma 3.8, we know the eigenpairs (i , si ), where i = 1, . . . , m, of the matrix T.
By property 5 of Theorem 3.7, it follows that the eigenpairs of M are (i + j , si sj ),
for i, j = 1, . . . , m. If B is any matrix with eigenpairs (i , vi ), where i = 1, . . . , m, then
B2 has eigenpairs (2i , vi ), as
B2 vi = B(Bvi ) = B(i vi ) = i (Bvi ) = 2i vi ,

for i = 1, . . . , m.

It follows that A = M2 has eigenpairs (i + j )2 , si sj , for i, j = 1, . . . , m. (Note
that we can verify this directly by multiplying A by si sj and using the mixed product
rule.) Since the i are positive, the eigenvalues of A are positive. We conclude that A
is symmetric positive definite.
20

Writing A = T2 I + 2T T + I T2 and computing the block structure of each


of these terms, one finds that A has bandwidth 2m, in the sense that any row has at
most 4m + 1 nonzero elements.
(e) One can expect to solve the system of (b) faster, as it is typically quicker to
solve two simple systems instead of one complex system.

21

CHAPTER 4

Fast Direct Solution of a Large Linear System


Exercise 4.5: Fourier matrix
The Fourier matrix FN has entries
(FN )j,k =

(j1)(k1)
N
,

2
i
N

N := e


= cos

2
N


i sin

2
N


.

In particular for N = 4, this implies that 4 = i and

1 1
1
1
1 i 1 i

F4 =
1 1 1 1 .
1 i 1 i
Computing the transpose and Hermitian transpose gives

1 1
1
1
1 1
1
1
1 i 1 i
1 i 1 i
H

= F4 ,

F
=
FT4 =
4
1 1 1 1 6= F4 ,
1 1 1 1
1 i 1 i
1 i 1 i
which is what needed to be shown.
Exercise 4.6: Sine transform as Fourier transform
According to Lemma 4.2, the Discrete Sine Transform can be computed from the
Discrete Fourier Transform by (Sm x)k = 2i (F2m+2 z)k+1 , where
z = [0, x1 , . . . , xm , 0, xm , . . . , x1 ]T .
For m = 1 this means that
i
and S1 x1 = (F4 z)2 .
2
1
1
Since h = m+1 = 2 for m = 1, computing the DST directly gives
 
S1 x1 = sin(h)x1 = sin
x 1 = x1 ,
2
while computing the Fourier transform gives

1 1
1
1
0
0
1 i 1 i x1 2ix1

F4 z =
1 1 1 1 0 = 0 .
1 i 1 i
x1
2ix1
z = [0, x1 , 0, x1 ]T

Multiplying the Fourier transform with 2i , one finds


i
S1 x1 = x1 = (F4 z)2 ,
2
22

which is what needed to be shown.


Exercise 4.7: Explicit solution of the discrete Poisson equation

2
For any integer m 1, let h = 1/(m + 1). For j = 1,
 . . . , m, let j = 4 sin jh/2 ,
D = diag(1 , . . . , m ), and S = (sjk )jk = sin(jkh) jk . By Section 4.2, the solution
to the discrete Poisson equation is V = SXS, where X is found by solving DX+XD =
4h4 SFS. Since D is diagonal, one has
m X
m
X
(SFS)pr
spk fkl slr
xpr = 4h4
= 4h4
p + r
p + r
k=1 l=1
so that
vij =

m X
m
X

sip xpr srj = 4h

m X
m X
m X
m
X
sip spk slr srj

fkl
p + r




m X
m X
m X
m
ip
pk
rj
lr
X
sin
sin
sin
sin
m+1
m+1
m+1
m+1




= h4
fkl ,
p
r
sin2 2(m+1)
+ sin2 2(m+1)
p=1 r=1 k=1 l=1
p=1 r=1 k=1 l=1

p=1 r=1

which is what needed to be shown.


Exercise 4.8: Improved version of Algorithm 4.1
Given is that
(?)

TV + VT = h2 F.

Let T = SDS1 be the orthogonal diagonalization of T from Equation (4.4), and write
X = VS and C = h2 FS.
(a) Multiplying Equation (?) from the right by S, one obtains
TX + XD = TVS + VSD = TVS + VTS = h2 FS = C.
(b) Writing C = [c1 , . . . , cm ], X = [x1 , . . . , xm ] and applying the rules of block
multiplication, we find
[c1 , . . . , cm ] =
=
=
=
=
=

C
TX + XD
T[x1 , . . . , xm ] + X[1 e1 , . . . , m em ]
[Tx1 + 1 Xe1 , . . . , Txm + m Xem ]
[Tx1 + 1 x1 , . . . , Txm + m xm ]
[(T + 1 I)x1 , . . . , (T + m I)xm ],

which is equivalent to System (4.9). To find X, we therefore need to solve the m tridiagonal linear systems of (4.9). Since the eigenvalues 1 , . . . , m are positive, each matrix
T + j I is diagonally dominant. By Theorem 1.7, every such matrix is nonsingular and
has a unique LU factorization. Algorithms 1.3 and 1.4 then solve the corresponding
system (T + j I)xj = cj in O(m) operations for some constant . Doing this for all
m columns x1 , . . . , xm , one finds the matrix X in O(m2 ) operations.
(c) To find V, we first find C = h2 FS by performing O(2m3 ) operations. Next we
find X as in step b) by performing O(m2 ) operations. Finally we compute V = 2hXS
by performing O(2m3 ) operations. In total, this amounts to O(4m3 ) operations.
23

(d) As explained in Section 4.3, multiplying by the matrix S can be done in


O(2m2 log2 m) operations by using the Fourier transform. The two matrix multiplications in c) can therefore be carried out in
O(4m2 log2 m) = O(4n log2 n1/2 ) = O(2n log2 n)
operations.
Exercise 4.9: Fast solution of 9 point scheme
Analogously to Section 4.2, we use the relations between the matrices T, S, X, D to
rewrite Equation (3.23).
1
TV + VT TVT = h2 F
6
1

TSXS + SXST TSXST = h2 F


6
1

STSXS2 + S2 XSTS STSXSTS = h2 SFS


6
1

S2 DXS2 + S2 XS2 D S2 DXS2 D = h2 SFS


6
1

DX + XD DXD = 4h4 SFS = 4h4 G


6
Writing D = diag(1 , . . . , m ), the (j, k)-th entry of DX + XD 61 DXD is equal to
j xjk + xjk k 61 j xjk k . Isolating xjk and writing j = 4j = 4 sin2 (jh/2) then
yields


h4 gjk
4h4 gjk
jh
2
=
,
j = sin
xjk =
.
2
j + k 16 j k
j + k 32 j k
Defining := jh/2 and = kh/2, one has 0 < , < /2. Note that
2
j + k j k > j + k j k
3
= 2 cos2 cos2 (1 cos2 )(1 cos2 )
= 1 cos2 cos2
1 cos2
0.
Let A = T I + I T 61 T T be as in Exercise 3.15.(b) and si as in Section
4.2. Applying the mixed-product rule, one obtains
1
A(si sj ) = (T I + I T)(si sj ) (T T)(si sj ) =
6
1
1
(i + j )(si sj ) i j (si sj ) = (i + j i j )(si sj ).
6
6
The matrix A therefore has eigen vectors si sj , and counting them shows that these
must be all of them. As shown above, the corresponding eigen values i + j 16 i j
are positive, implying that the matrix A is positive definite. It follows that the System
(3.22) always has a (unique) solution.
24

Exercise 4.10: Algorithm for fast solution of 9 point scheme


The following describes an algorithm for solving System (3.22).
Algorithm 1 A method for solving the discrete Poisson problem (3.22)
Require: An integer m denoting the grid size, a matrix F Rm,m of function values.
Ensure: The solution V to the discrete Poisson problem (3.22).
1
1: h m+1
m
2: S sin(jkh) j,k=1
m
jh
3: sin2 2
j=1
4: G 
SFS
m
h4 g
5: X + i,j
2

6:

i j

j,k=1

V SXS

For the individual steps in this algorithm, the time complexities are shown in the
following table.
step

1
O(1)

complexity

3
2

O(m )

O(m)

5
3

O(m )

6
2

O(m )

O(m3 )

Hence the overall complexity is determined by the four matrix multiplications and
given by O(m3 ).
Exercise 4.11: Fast solution of biharmonic equation
From Exercise 3.16 we know that T Rmm is the second derivative matrix. According
to Lemma 3.8, the eigenpairs (j , sj ), with j = 1, . . . , m, of T are given by
sj = [sin(jh), sin(2jh), . . . , sin(mjh)]T ,
j = 2 2 cos(jh) = 4 sin2 (jh/2),
and satisfy sTj sk = j,k /(2h) for all j, k, where h := 1/(m + 1). Using, in order, that
U = SXS, TS = SD, and S2 = I/(2h), one finds that
h4 F = T2 U + 2TUT + UT2

h4 F = T2 SXS + 2TSXST + SXST2

h4 SFS = ST2 SXS2 + 2STSXSTS + S2 XST2 S

h4 SFS = S2 D2 XS2 + 2S2 DXS2 D + S2 XS2 D2

h4 SFS = ID2 XI/(4h2 ) + 2IDXID/(4h2 ) + IXID2 /(4h2 )

4h6 G = D2 X + 2DXD + XD2 ,


where G := SFS. The (j, k)-th entry of the latter matrix equation is
4h6 gjk = 2j xjk + 2j xjk k + xjk 2k = xjk (j + k )2 .
Writing j := sin2 (jh/2) = j /4, one obtains
xjk =

4h6 gjk
4h6 gjk
h6 gjk
=
=
.

2
(j + k )2
4(j + k )2
4 sin2 (jh/2) + 4 sin2 (kh/2)
25

Exercise 4.12: Algorithm for fast solution of biharmonic equation


In order to derive an algorithm that computes U in Problem 3.16, we can adjust
Algorithm 4.1 by replacing the computation of the matrix X by the formula from
Exercise 4.11. This adjustment does not change the complexity of Algorithm 4.1, which
therefore remains O(n3/2 ). The new algorithm can be implemented in Matlab as in
Listing 4.1.
Listing 4.1. A simple fast solution to the biharmonic equation
1
2
3
4
5
6
7
8
9
10
11

function U = simplefastbiharmonic(F)
m = length(F);
h = 1/(m+1);
hv = pi*h*(1:m);
sigma = sin(hv/2).2;
S = sin(hv*(1:m));
G = S*F*S;
X = (h6)*G./(4*(sigma*ones(1,m)+ones(m,1)*sigma).2);
U = zeros(m+2,m+2);
U(2:m+1,2:m+1) = S*X*S;
end

Exercise 4.13: Check algorithm for fast solution of biharmonic equation


The Matlab function from Listing 4.2 directly solves the standard form Ax = b of
Equation (3.26), making sure to return a matrix of the same dimension as the implementation from Listing 4.1.
Listing 4.2. A direct solution to the biharmonic equation
1
2
3
4
5
6
7
8
9
10

function V = standardbiharmonic(F)
m = length(F);
h = 1/(m+1);
T = gallery(tridiag, m, -1, 2, -1);
A = kron(T2, eye(m)) + 2*kron(T,T) + kron(eye(m),T2);
b = h.4*F(:);
x = A\b;
V = zeros(m+2, m+2);
V(2:m+1,2:m+1) = reshape(x,m,m);
end

After specifying m = 4 by issuing the command F = ones(4,4), the commands simplefastbiharmonic(F) and standardbiharmonic(F) both return
the matrix

0
0
0
0
0
0
0 0.0015 0.0024 0.0024 0.0015 0

0 0.0024 0.0037 0.0037 0.0024 0

.
0 0.0024 0.0037 0.0037 0.0024 0
0 0.0015 0.0024 0.0024 0.0015 0
0
0
0
0
0
0
For large m, it is more insightful to plot the data returned by our Matlab functions.
For m = 50, we solve and plot our system with the commands in Listing 4.3.
26

Listing 4.3. Solving the biharmonic equation and plotting the result
1
2
3
4
5

F = ones(50, 50);
U = simplefastbiharmonic(F);
V = standardbiharmonic(F);
surf(U);
surf(V);
simplefastbiharmonic

standardbiharmonic

x 10

x 10

0
60

0
60
60

40

60

40

40
20

40
20

20
0

20
0

On the face of it, these plots seem to be virtually identical. But exactly how close are
they? We investigate this by plotting the difference with the command surf(U-V),
which gives
simplefastbiharmonic minus standardbiharmonic

14

x 10
2

1.5

0.5

0
60
60

40
40
20

20
0

We conclude that their maximal difference is of the order of 1014 , which makes them
indeed very similar.
Exercise 4.14: Fast solution of biharmonic equation using 9 point rule
(TODO)

27

CHAPTER 5

Matrix Reduction by Similarity Transformations


Exercise 5.3: Idempotent matrix
Suppose that (, x) is an eigenpair of a matrix A satisfying A2 = A. Then
x = Ax = A2 x = Ax = 2 x.
Since any eigenvector is nonzero, one has = 2 , from which it follows that either
= 0 or = 1. We conclude that the eigenvalues of any idempotent matrix can only
be zero or one.
Exercise 5.4: Nilpotent matrix
Suppose that (, x) is an eigenpair of a matrix A satisfying Ak = 0 for some natural
number k. Then
0 = Ak x = Ak1 x = 2 Ak2 x = = k x.
Since any eigenvector is nonzero, one has k = 0, from which it follows that = 0. We
conclude that any eigenvalue of a nilpotent matrix is zero.
Exercise 5.5: Eigenvalues of a unitary matrix
Let x be an eigenvector corresponding to . Then Ax = x and, as a consequence,
x A = x . To use that A A = I, it is tempting to multiply the left hand sides of
these equations, yielding
||2 kxk2 = x x = x A Ax = x Ix = kxk2 .
Since x is an eigenvector, it must be nonzero. Nonzero vectors have nonzero norms, and
we can therefore divide the above equation by kxk2 , which results in ||2 = 1. Taking
square roots we find that || = 1, which is what needed to be shown. Apparently the
eigenvalues of any unitary matrix reside on the unit circle in the complex plane.
Exercise 5.6: Nonsingular approximation of a singular matrix
Let 1 , . . . , n be the eigenvalues of the matrix A. As the matrix A is singular, its
determinant det(A) = 1 n is zero, implying that one of its eigenvalues is zero.
If all the eigenvalues of A are zero let 0 := 1. Otherwise, let 0 := mini 6=0 |i | be
the absolute value of the eigenvalue closest to zero. By definition of the eigenvalues,
det(AI) is zero for = 1 , . . . , n , and nonzero otherwise. In particular det(AI)
is nonzero for any (0, 0 ), and A I will be nonsingular in this interval. This is
what we needed to prove.
28

Exercise 5.7: Companion matrix


(a) To show that (1)n f is the characteristic polynomial A of the matrix A, we
need to compute

qn1 qn2 q1 q0

1

0
0

0
1

0
0 .
A () = det(A I) = det
..
..
..
..
...

.
.
.
.
0
0

By the rules of determinant evaluation, we can substract from any column a linear
combination of the other columns without changing the value of the determinant.
Multiply columns 1, 2, . . . , n 1 by n1 , n2 , . . . , and adding the corresponding
linear combination to the final column, we find

qn1 qn2 q1 f ()

1

0
0

0
1

0
0 = (1)n f (),
A () = det
..
..
..
..
..

.
.
.
.
.
0
0

1
0
where the second equality follows from cofactor expansion along the final column.
Multiplying this equation by (1)n yields the statement of the Exercise.
(b) Similar to (a), by multiplying rows 2, 3, . . . , n by , 2 , . . . , n1 and adding the
corresponding linear combination to the first row.
Exercise 5.16: Schur decomposition example
The matrix U is unitary, as U U = UT U = I. One directly verifies that


1 1
T
R := U AU =
.
0
4
Since this matrix is upper triangular, A = URUT is a Schur decomposition of A.
Exercise 5.19: Skew-Hermitian matrix
By definition, a matrix C is skew-Hermitian if C = C.
=: Suppose that C = A + iB, with A, B Rm,m , is skew-Hermitian. Then
A iB = C = C = (A + iB) = AT iBT ,
which implies that AT = A and B = BT (use that two complex numbers coincide
if and only if their real parts coincide and their imaginary parts coincide). In other
words, A is skew-Hermitian and B is real symmetric.
=: Suppose that we are given matrices A, B Rm,m such that A is skewHermitian and B is real symmetric. Let C = A + iB. Then
C = (A + iB) = AT iBT = A iB = (A + iB) = C,
meaning that C is skew-Hermitian.
29

Exercise 5.20: Eigenvalues of a skew-Hermitian matrix


Let A be a skew-Hermitian matrix and consider a Schur triangularization A = URU
of A. Then
R = U AU = U (A )U = U A U = (U AU) = R .
Since R differs from A by a similary transform, their eigenvalues coincide (use the
multiplicative property of the determinant to show that
det(A I) = det(U ) det(URU I)) det(U) = det(R I).)
As R is a triangular matrix, its eigenvalues i appear on its diagonal. From the equation
R = R it then follows that i = i , implying that each i is purely imaginary.
Exercise 5.31: Eigenvalue perturbation for Hermitian matrices
Since a positive semidefinite matrix has no negative eigenvalues, one has n 0. It
immediately follows from i + n i that in this case i i .
Exercise 5.33: Hoffman-Wielandt
The matrix A has eigenvalues 0 and 4, and the matrix B has eigenvalue 0 with algebraic multiplicity two. Independently of the choice of the permutation i1 , . . . , in , the
Hoffman-Wielandt Theorem would yield
n X
n
n
X
X
|aij bij |2 = 12,
16 =
|ij j |2
i=1 j=1

j=1

which clearly cannot be valid. The Hoffman-Wielandt Theorem cannot be applied to


these matrices, because B is not normal,

 

2 2
2 2
H
B B=
6=
= BBH .
2 2
2 2
Exercise 5.42: Find eigenpair example
As A is a triangular matrix, its eigenvalues correspond to the diagonal entries. One
finds two eigenvalues 1 = 1 and 2 = 2, the latter with algebraic multiplicity two.
Solving Ax1 = 1 x1 and Ax2 = 2 x2 , one finds (valid choices of) eigenpairs, for
instance


1
2
(1 , x1 ) = (1, 0),
(2 , x2 ) = (2, 1).
0
0
Exercise 5.45: Jordan example
Given matrices

3 0 1
A = 4 1 2 ,
4 0 1

1 1 0
J = 0 1 0 ,
0 0 1

we are asked to find a matrix S = [s1 , s2 , s3 ] satisfying




(?)
[As1 , As2 , As3 ] = AS = SJ = [s1 , s2 , s3 ]J = s1 , s1 + s2 , s3 .
30

Let s1 = (AI)s2 , with s2


/ ker(AI). Then s1 , s2 6= 0, and As2 = s1 +s2 as required
by the second column in (?). Moreover, since (AI)2 = 0, the vector s1 ker(AI) is
an eigenvector of A and satisfies the first column in (?). Any eigenvector s3 ker(AI)
will satisfy the third column in (?), and choosing s3 to be linearly independent of s1 , s2
will guarantee that the matrix S is invertible. For instance, we could choose

2 1 1
S = 4 0 0 .
4 0 2
Exercise 5.46: Big Jordan example
The matrix A has Jordan form A = SJS1 , with

3 1 0 0 0 0 0 0
0 3 0 0 0 0 0 0

0 0 2 1 0 0 0 0

0 0 0 2 1 0 0 0
1
, S =
J=

0 0 0 0 2 0 0 0
9

0 0 0 0 0 2 1 0

0 0 0 0 0 0 2 0
0 0 0 0 0 0 0 2

14
28
42
56
70
84
98
49

9
18
27
36
45
54
63
0

5
10
15
20
16
12
8
4

6
12
18
24
12
9
6
3

0
0
0
0
9
0
0
0

8
7
6
5
4
3
2
1

9 9
0 0

0 9

0 0
.
0 0
0 0

0 0
0 0

Exercise 5.49: Properties of the Jordan form


Let J = S1 AS be the Jordan form of the matrix A as in Theorem 5.44. Items 1.
3. are easily shown by induction, making use of the rules of block multiplication in 2.
and 3. For Item 4., write Em := Jm () Im , with Jm () the Jordan block of order m.
By the binomial theorem,
r  
r  
X
X
r
r rk k
r
r
k
rk
Jm () = (Em + Im ) =
Em (Im )
=
Em .
k
k
k=0
k=0
Since Ekm = 0 for any k m, we obtain
min{r,m1}  
X
r rk k
r
Jm () =
Em .
k
k=0
Exercise 5.50: Powers of a Jordan block
Let S be as in Exercise 5.45. We show by induction on n that

1 1 0
1 n 0
(?)
Jn = 0 1 0 = 0 1 0 .
0 0 1
0 0 1
Clearly (?) holds for n = 1. Suppose (?)

1 n 0 1 1
Jn+1 = Jn J = 0 1 0 0 1
0 0 1 0 0

holds for some n 1. Then


0
1 n+1 0
0 = 0
1
0 ,
1
0
0
1
31

implying that (?) holds for n + 1. In particular

2
100
1 100
100 1

A = (SJS ) = SJ S = 4
4

2 1 1
1 100 0 0

0 1 0 1
= 4 0 0
4 0 2 0 0 1 0

we find J 100 . It follows that

1
1 1
1 100 0
2 1 1
0 0 0 1 0 4 0 0
0 2 0 0 1 4 0 2

41 0
201 0 100
1
0
= 400 1 200 .
2
1
1
400 0 199
2
2

Exercise 5.52: Minimal polynomial example


The matrix J has characteristic polynomial J () = det(J I) = (2 )6 (3 )2
and minimal polynomial () = (2 )3 (3 )2 .

Exercise 5.53: Similar matrix polynomials


P
For a given polynomial p(x) = k ak xk , substituting B = S1 AS for x gives
X
X
k X
p(B) =
ak S1 AS =
ak S1 Ak S = S1
ak Ak S = S1 p(A)S.
k

Exercise 5.54: Minimal polynomial of a diagonalizable matrix


A matrix A is diagonalizable precisely when its Jordan form only has Jordan blocks
of size one. Writing 1 , . . . , k for the (distinct) eigenvalues of A, it follows that A
has minimal polynomial A () = (1 ) (k ). In particular, for any postive
integer n, the identity matrix I Rnn has minimal polynomial I () = 1 .
Exercise 5.59: Biorthogonal expansion
The matrix A has characteristic polynomial det(A I) = ( 4)( 1) and right
eigenpairs (1 , x1 ) = (4, [1, 1]T ) and (2 , x2 ) = (1, [1, 2]T ). Since the right eigenvectors
x1 , x2 are linearly independent, there exists vectors y1 , y2 satisfying hyi , xj i = ij . The
set {x1 x2 } forms a basis of C2 , and the set {y1 , y2 } is called the dual basis.
How do we find such vectors y1 , y2 ? Any vector [x1 , x2 ]T is orthogonal to the vector
[x2 , x1 ]T for any . Choosing appropriately, one finds y1 = 31 [1, 1]T , y2 =
1
[2, 1]T . By Theorem 5.58, y1 and y2 are left eigenvectors of A. For any vector
3
v = [v1 , v2 ]T C2 , Equation (5.20) then gives us the biorthogonal expansions
1
1
v = hy1 , vix1 + hy2 , vix2 = (v1 v2 )x1 + (2v1 + v2 )x2
3
3
= hx1 , viy1 + hx2 , viy2 = (v1 + v2 )y1 + (v1 2v2 )y2 .

32

Exercise 5.60: Generalized Rayleigh quotient


Suppose (, x) is a right eigenpair for A, so that Ax = x. Then the generalized
Rayleight quotient for A is
y Ax
y x
R(y, x) := = = ,
y x
y x
which is well defined whenever y x 6= 0. On the other hand, if (, y) is a left eigenpair
for A, then y A = y and it follows that
y Ax
y x
R(y, x) := = = .
y x
y x

33

CHAPTER 6

The Singular Value Decomposition


Exercise 6.14: SVD examples
(a) For A = [3, 4]T we find a 1 1 matrix AT A = 25,which has the eigenvalue
1 = 25. This provides us with the singular value 1 = + 1 = 5 for A. Hence the
matrix A has rank 1 and a SVD of the form
 

 5  
V1 ,
A = U1 U2
with U1 , U2 R2,1 , V = V1 R.
0
The eigenvector of AT Athat
 corresponds to the eigenvalue 1 = 25 is given by v1 = 1,
providing us with V = 1 . Using Theorem 6.7.3, one finds u1 = 51 [3, 4]T . Extending
u1 to an orthonormal basis for R2 gives u2 = 15 [4, 3]T . A SVD of A is therefore

 
1 3 4 5  
1 .
A=
0
5 4 3
(b) One has

1 1
A = 2 2 ,
2 2



1 2 2
,
A =
1 2 2
T



9 9
A A=
.
9 9
T

The eigenvalues of AT A are the zeros


of det(AT A I) = (9 )2 81, yielding
1 = 18 and 2 = 0, and therefore 1 = 18 and 2 = 0. Note that since there is only
one nonzero singular value, the rank of A is one. Following the dimensions of A, one
finds

18 0
= 0 0 .
0 0
The normalized eigenvectors v1 , v2 of AT A corresponding to the eigenvalues 1 , 2 are
the columns of the matrix


1 1 1
.
V = [v1 v2 ] =
2 1 1
Using Theorem 6.7.3 one finds u1 , which can be extended to an orthonormal basis
{u1 , u2 , u3 } using Gram-Schmidt Orthogonalization (see Theorem 0.29). The vectors
u1 , u2 , u3 constitute a matrix

1 2 2
1
U = [u1 u2 u3 ] = 2 2 1 .
3 2 1 2
34

A SVD of A is therefore

1 2
1
2 2
A=
3 2 1

given by


2
18 0 1 
1
1
1 0 0
.
2 1 1
2
0 0

Exercise 6.15: More SVD examples


 
(a) We have A = e1 and AT A = eT1 e1 = 1 . This gives the eigenpair (1 , v1 ) =
(1, 1) of AT A. Hence 1 = 1 and = e1 = A. As = A and V = I1 we must have
U = Im yielding a singular value decomposition
A = Im e1 I1 .
(b) For A = eTn ,

0
...
AT A =
0
0

the matrix

0 0
. . . .. ..
. . .
0 0

0 1

has eigenpairs
(0, ej ) for j = 1, . . . , n 1 and (1, en ). Then = eT1 R1,n and


V = en , en1 , . . . , e1 Rn,n . Using Theorem 6.7.3 we get u1 = 1, yielding U = 1 .
A SVD for A is therefore given by
  

A = eTn = 1 eT1 en , en1 , . . . , e1 .
(c) In this exercise


1 0
A=
,
0 3

A = A,



1 0
A A=
.
0 9
T

The eigenpairs of AT A are given by (1 , v1 ) = (9, e2 ) and (2 , v2 ) = (1, e1 ), from


which we find




3 0
0 1
=
,
V=
.
0 1
1 0
Using Theorem 6.7.3, one finds u1 = e2 and u2 = e1 , which constitute the matrix


0 1
.
U=
1 0
A SVD of A is therefore given by




0 1 3 0 0 1
A=
.
1 0
0 1 1 0
Exercise 6.17: Counting dimensions of fundamental subspaces
Let A have singular value decomposition UV .
1. By items 1. and 3. of Theorem 6.16, span(A) and span(A ) are vector spaces
of the same dimension r, implying that rank(A) = rank(A ).
2. This statement is known as the rank-nullity theorem, and it follows immediately
from combining items 1. and 4. in Theorem 6.16.
3. As rank(A ) = rank(A) by 1., this follows by replacing A by A in 2.
35

Exercise 6.18: Rank and nullity relations


Let A = UV be a singular value decomposition of a matrix A Cmn .
1. By Theorem 6.5.5, rank(A) is the number of positive eigenvalues of
AA = UV V U = UDU ,
where D := is a diagonal matrix with real nonnegative elements. Since UDU
is an orthogonal diagonalization of AA , the number of positive eigenvalues of AA is
the number of nonzero diagonal elements in D. Moreover, rank(AA ) is the number
of positive eigenvalues of
AA (AA ) = AA AA = U V = UD2 U ,
which is the number of nonzero diagonal elements in D2 , so that rank(A) = rank(AA ).
From a similar argument for rank(A A), we conclude that
rank(A) = rank(AA ) = rank(A A).
2. Let r := rank(A) = rank(A ) = rank(AA ) = rank(A A). Applying Theorem
6.5, parts 3 and 4, to the singular value decompositions
A = UV , A = VU , AA = U U , A A = V V ,
one finds that {vr+1 , . . . , vn } is a basis for both ker(A) and ker(A A), while {ur+1 , . . . um }
is a basis for both ker(A ) and ker(AA ). In particular it follows that
dim ker(A) = dim ker(A A),

dim ker(A ) = dim ker(AA ),

which is what needed to be shown.

Exercise 6.19: Orthonormal bases example


Given is the matrix


1 14 4 16
A=
.
15 2 22 13
From Example 6.11 we know that B = AT and hence A = UVT and B = VT UT ,
with





2
1 2
1
1
2
0
0
3
4
2 2 1 ,
V=
=
,
U=
.
0 1 0
3 2 1 2
5 4 3
From Theorem 6.16 we know that V1 forms an orthonormal basis for span(AT ) =
span(B), V2 an orthonormal basis for ker(A) and U2 an orthonormal basis for ker(AT ) =
ker(B). Hence
span(B) = v1 + v2 ,

ker(A) = v3

36

and

ker(B) = 0.

Exercise 6.20: Some spanning sets


The matrices A Cmn and A A have the same rank r since they have the same
number of singular values, so that the vector spaces span(A A) and span(A ) have
the same dimension. It is immediate from the definition that span(A A) span(A ),
and therefore span(A A) = span(A ).
Let A = U1 1 V1 be a singular value factorization of A. Taking the Hermitian
transpose A = V1 1 U1 one finds span(A ) span(V1 ). Moreover, since V1 Cnr
has orthonormal columns, it has the same rank as A , and we conclude span(A ) =
span(V1 ).
Exercise 6.21: Singular values and eigenpair of composite matrix
Given is a singular value decomposition A = UV . Let r = rank(A), so that
1 r > 0 and r+1 = = n = 0. Let U = [U1 , U2 ] and V = [V1 , V2 ]
be partitioned accordingly and 1 = diag(1 , . . . , r ) as in Equation (6.7), so that
A = U1 1 V1 forms a singular value factorization of A.
By Theorem 6.16,

  
 
0 A ui
Avi
i pi for i = 1, . . . , r
Cpi =
=
=

A 0 vi
A ui
0 pi for i = r + 1, . . . , n



 
 
0 A
ui
Avi
i qi for i = 1, . . . , r
Cqi =
=
=

A 0 vi
A ui
0 qi for i = r + 1, . . . , n


  
  
0 A uj
0
0
Crj =
=
=
= 0 rj , for j = n + 1, . . . , m.
A 0
0
A uj
0
This gives a total of n + n + (m n) = m + n eigen pairs.
Exercise 6.27: Rank example
We are given the singular value decomposition

1
1
1
1

6
2
2
2
2
1
1
1
1

0
2
2
2
2
A = UVT =
1
1
1
1
0
2 2
2
2
1
1
1
1
0

2
2
2
2

0
6
0
0

0
0

0
0

2
3
2
3
1
3

2
3
31
32

1
3
23
2
3

Write U = [u1 , u2 , u3 , u4 ] and V = [v1 , v2 , v3 ]. Clearly r = rank(A) = 2.


(a) A direct application of Theorem 6.16 with r = 2 gives
{u1 , u2 } is an orthonormal basis for span(A),
{u3 , u4 } is an orthonormal basis for ker(AT ),
{v1 , v2 } is an orthonormal basis for span(AT ),
{v3 , v4 } is an orthonormal basis for ker(A).
Since U is orthogonal, {u1 , u2 , u3 , u4 } is an orthonormal basis for R4 . In particular
u3 , u4 are orthogonal to u1 , u2 , so that they span the orthogonal complement span(A)
to span(A) = span{u1 , u2 }.
(b) Applying Theorem 6.26 with r = 1 yields
q

kA BkF 22 + 32 = 62 + 02 = 6.
37

(c) Following Section 6.4.2, with D0 := diag(1 , 0, . . . , 0) Rn,n , take

2
2
1
 0
2 2 1
D
0

A1 = A := U
VT =
2 2 1 .
0
2 2 1

Exercise 6.28: Another rank example


(a) The matrix B = (bij )ij Rn,n is defined by

1
if i = j;

1
if i < j;
bij =
2n
2
if (i, j) = (n, 1);

0
otherwise.
while the column vector x = (xj )j Rn is given by

1
if j = n;
xj =
2n1j otherwise.
For the final entry in the matrix product Bx one finds that
(Bx)n =

n
X

bnj xj = bn1 x1 + bnn xn = 22n 2n2 + 1 1 = 0.

j=1

For any of the remaining indices i 6= n, the i-th entry of the matrix product Bx can
be expressed as
(Bx)i =

n
X

bij xj = bin +

j=1

= 1 + 2

n1
X

2n1j bij

j=1

n1i

bii +

n1
X

2n1j bij

j=i+1

= 1 + 2

n1i

n1
X

2n1j

j=i+1

= 1 + 2

n1i

n2i

n2i
X 
j 0 =0

1
= 1 + 2n1i 2n2i

1
2

j 0


1 n1i
2
12

= 1 + 2n1i 2n1i 1 2(n1i)

= 0.
As B has a nonzero kernel, it must be singular. The matrix A, on the other hand,
is nonsingular, as its determinant is (1)n 6=p0. The matrices A and B differ only in
their (n, 1)-th entry, so one has kA BkF = |an1 bn1 |2 = 22n . In other words, the
tiniest perturbation can make a matrix with large determinant singular.
38

(b) Let 1 n 0 be the singular values of A. Applying Theorem 6.26 for


r = rank(B) < n, we obtain
s
2

2
q
r+1
n1
2
n n
+ + n2
+ +
+ 1 = r+1
n
n
=

min

CRn,n
rank(C)=r

kA CkF kA BkF = 22n .

We conclude that the smallest singular value n can be at most 22n .

39

CHAPTER 7

Matrix Norms
Exercise 7.4: Consistency of sum norm?
Observe that the sum norm is a matrix norm. This follows since it is equal to the
l1 -norm of the vector v = vec(A) obtained by stacking the columns of a matrix A on
top of each other.
Let A = (aij )ij and B = (bij )ij be matrices for which the product AB is defined.
Then


X
X X

kABkS =
aik bkj
|aik | |bkj |



i,j

i,j,k

|aik | |blj | =

i,j,k,l

X
i,k

|aik |

|blj | = kAkS kBkS ,

l,j

where the first inequality follows from the triangle inequality and multiplicative property of the absolute value | |. Since A and B where arbitrary, this proves that the
sum norm is consistent.
Exercise 7.5: Consistency of max norm?
Observe that the max norm is a matrix norm. This follows since it is equal to the
l -norm of the vector v = vec(A) obtained by stacking the columns of a matrix A on
top of each other.
To show that the max norm is not consistent we use a counter example. Let
A = B = (1)2i,j=1 . Then






 

1 1 1 1
2 2
1 1 1 1







1 1 1 1 = 2 2 = 2 > 1 = 1 1 1 1 ,
M
M
M
M
contradicting kAB||M kAkM kBkM .
Exercise 7.6: Consistency of modified max norm?
Exercise 7.5 shows that the max norm is not consistent. In this Exercise we show that
the max norm can be modified so as to define
a consistent matrix norm.

(a) Let A Cm,n and define kAk := mnkAkM as in the Exercise. To show that
k k defines a consistent matrix norm we have to show that it fulfills the three matrix
norm properties and that it is submultiplicative. Let A, B Cm,n be any matrices and
any scalar.

Positivity: Clearly kAk = mnkAkM 0. Moreover,


kAk = 0 ai,j = 0 i, j A = 0.

Homogeneity: kAk = mnkAkM = || mnkAkM = ||kAk.


40

Subadditivity: One has


kA + Bk =


nmkA + BkM nm kAkM + kBkM = kAk + kBk.

Submultiplicativity: One has


q

X



kABk = mn max
ai,k bk,j
1im

1jn

k=1
q

mn max

1im
1jn

|ai,k ||bk,j |

k=1

mn max

max |bk,j |

1im

1kq
1jn

q
X

!
|ai,k |

k=1

q mn

max |ai,k |

1im
1kq

!
max |bk,j |

1kq
1jn

= kAkkBk.
(b) For any A Cm,n , let
kAk(1) := mkAkM

and

kAk(2) := nkAkM .

Comparing with the solution of part (a) we see, that the points of positivity, homogeneity and subadditivity are fulfilled here as well, making kAk(1) and kAk(2) valid
matrix norms. Furthermore, for any A Cm,q , B Cq,n ,
q

!
!
X



kABk(1) = m max
ai,k bk,j m max |ai,k | q max |bk,j |
1im
1im
1kq

1jn
1kq
1jn
(1)

k=1
(1)

= kAk kBk ,

kABk(2) = n max |
1im
1jn

q
X

!
ai,k bk,j | q

k=1
(2)

max |ai,k | n

1im
1kq

!
max |bk,j |

1kq
1jn

= kAk(2) kBk ,
which proves the submultiplicativity of both norms.

Exercise 7.8: The sum norm is subordinate to?


For any matrix A = (aij )ij Cm,n and column vector x = (xj )j Cn , one has
n

m X
m X
n
m X
n
n
X
X
X
X


kAxk1 =
aij xj
|aij ||xj |
|aij |
|xk | = kAkS kxk1 ,



i=1

j=1

i=1 j=1

i=1 j=1

k=1

which shows that the matrix norm k kS is subordinate to the vector norm k k1 .
41

Exercise 7.9: The max norm is subordinate to?


Let A = (aij )ij Cm,n be a matrix and x = (xj )j Cn a column vector.
(a) One has


n
n
n

X
X
X


|aij | |xj | max |aij |
aij xj max
|xj |
kAxk = max
i=1,...,m
i=1,...,m
i=1,...,m
j=1

j=1

j=1,...,n

j=1

= kAkM kxk1 .
(b) Assume that the maximum in the definition of kAkM is attained in column l,
implying that kAkM = |ak,l | for some k. Let el be the lth standard basis vector. Then
kel k1 = 1 and
kAel k = max |ai,l | = |ak,l | = |ak,l | 1 = kAkM kel k1 ,
i=1,...,m

which is what needed to be shown.


(c) By (a), kAkM kAxk /kxk1 for all nonzero vectors x, implying that
kAkM max
x6=0

kAxk
.
kxk1

By (b), equality is attained for any standard basis vector el for which there exists a k
such that kAkM = |ak,l |. We conclude that
kAkM = max
x6=0

kAxk
,
kxk1

which means that k kM is the (, 1)-operator norm (see Definition 7.10).


Exercise 7.16: Spectral norm
Let A = UV be a singular value decomposition of A, and write 1 := kAk2 for
the biggest singular value of A. Since the orthogonal matrices U and V leave the
Euclidean norm invariant,
max

|y Ax| =

max

kxk2 =1=kyk2

kxk2 =1=kyk2

1 |y x|

max

|y UV x| =

max

1 kyk2 kxk = 1 .

kxk2 =1=kyk2
kxk2 =1=kyk2

max

kxk2 =1=kyk2

|y x|

Moreover, this maximum is achieved for x = y = e1 , and we conclude


kAk2 = 1 =

max

kxk2 =1=kyk2

|y Ax|.

Exercise 7.17: Spectral norm of the inverse


Let 1 n be the singular values of A. Since A is nonsingular, n must be
nonzero. Using Equations (7.10) and (6.16), we find
kA1 k2 =

1
=
n

1
kxk2
= max n
,
06=xC kAxk2
kAxk2
min
06=xCn kxk2

which is what needed to be shown.


42

Exercise 7.18: p-norm example


We have



2 1
A=
,
1 2



1 2 1
.
=
3 1 2

Using Theorem 7.12, one finds kAk1 = kAk = 3 and kA1 k1 = kA1 k = 1. The
singular values 1 2 of A are the square roots of the zeros of
0 = det(AT A I) = (5 )2 16 = 2 10 + 9 = ( 9)( 1).
Using Theorem 7.14, we find kAk2 = 1 = 3 and kA1 k2 = 21 = 1. Alternatively,
since A is symmetric positive definite, we know from (7.11) that kAk2 = 1 and
kA1 k2 = 1/2 , where 1 = 3 is the biggest eigenvalue of A and 2 = 1 is the
smallest.
Exercise 7.21: Unitary invariance of the spectral norm
Suppose V is a rectangular matrix satisfying V V = I. Then
kVAk22 = max kVAxk22 = max x A V VAx
kxk2 =1

kxk2 =1

= max x A Ax = max kAxk22 = kAk22 .


kxk2 =1

kxk2 =1

The result follows by taking square roots.


Exercise 7.22: kAUk2 rectangular A
Let u = [u1 , u2 ]T be any column vector satisfying 1 = uT u = kuk22 . Then Au,
considered as a matrix, has operator 2-norm
max kAuxk2 = max{kAuk2 , k Auk2 } = kAuk2
|x|=1

equal to its Euclidean norm, considered as a vector. In order for kAuk2 < kAk2
to hold, there needs to exist another unit vector v for which one has the inequality
kAuk2 < kAvk2 of Euclidean norms. In other words, we need to pick a matrix A that
scales more in the direction v than in the direction u. For instance, if


 
 
2 0
0
1
A=
,
u=
,
v=
,
0 1
1
0
then
kAk2 = max kAxk2 kAvk2 = 2 > 1 = kAuk2 .
kxk2 =1

Exercise 7.23: p-norm of diagonal matrix


The eigenpairs of the matrix A = diag(1 , . . . , n ) are (1 , e1 ), . . . , (n , en ). For (A) =
max{|1 |, . . . , |n |}, one has
kAkp =

(|1 x1 |p + + |n xn |p )1/p
(x1 ,...,xn )6=0
(|x1 |p + + |xn |p )1/p
max

((A)p |x1 |p + + (A)p |xn |p )1/p


max
= (A).
(x1 ,...,xn )6=0
(|x1 |p + + |xn |p )1/p
43

On the other hand, for ej such that (A) = |j |, one finds


kAkp = max
x6=0

kAxkp
kAej kp

= (A).
kxkp
kej kp

Together, the above two statements imply that kAkp = (A) for any diagonal matrix
A and any p satisfying 1 p .
Exercise 7.24: Spectral norm of a column vector
We write A Cm,1 for the matrix corresponding to the column vector a Cm . Write
kAkp for the operator p-norm of A and kakp for the vector p-norm of a. In particular
kAk2 is the spectral norm of A and kak2 is the Euclidean norm of a. Then
kAkp = max
x6=0

kAxkp
|x|kakp
= max
= kakp ,
x6=0
|x|
|x|

proving (b). Note that (a) follows as the special case p = 2.


Exercise 7.25: Norm of absolute value matrix
(a) One finds

 

|1 + i| | 2|
2 2
|A| =
=
.
|1|
|1 i|
1
2
(b) Let bi,j denote the entries of |A|. Observe that bi,j = |ai,j | = |bi,j |. Together
with Theorem 7.12, these relations yield
! 21
! 12
m X
n
m X
n
X
X
kAkF =
=
= k |A| kF ,
|ai,j |2
|bi,j |2
i=1 j=1

kAk1 = max

1jn

kAk = max

1im

m
X
i=1
n
X
j=1

i=1 j=1

!
|ai,j |

= max

1jn

!
|ai,j |

= max

1im

m
X

!
|bi,j |

i=1
n
X

= k |A| k1 ,
!

|bi,j |

= k |A| k ,

j=1

which is what needed to be shown.


(c) To show this relation between the 2-norms of A and |A|, we first examine
the connection between the l2 -norms of Ax and |A| |x|, where x = (x1 , . . . , xn ) and
|x| = (|x1 |, . . . , |xn |). We find

2 ! 1
!2 ! 12
m
n
m X
n
2
X
X
X

|ai,j ||xj |
= k |A| |x| k2 .
kAxk2 =
ai,j xj



i=1

i=1

j=1

j=1

Now let x with kx k2 = 1 be a vector for which kAk2 = kAx k2 . That is, let x be
a unit vector for which the maximum in the definition of 2-norm is attained. Observe
that |x | is then a unit vector as well, k |x | k2 = 1. Then, by the above estimate of
l2 -norms and definition of the 2-norm,
kAk2 = kAx k2 k |A| |x | k2 k |A| k2 .
44

(d) By Theorem 7.12, we can solve this exercise by finding a matrix A for which A
and |A| have different largest singular values. As A is real and symmetric, there exist
a, b, c R such that




a b
|a| |b|
A=
,
|A| =
,
b c
|b| |c|
 2

 2

a + b2 ab + bc
a + b2 |ab| + |bc|
T
T
A A=
,
|A| |A| =
.
ab + bc b2 + c2
|ab| + |bc| b2 + c2
To simplify these equations we first try the case a + c = 0. This gives
 2

 2

a + b2
0
a + b2 2|ab|
T
T
A A=
,
|A| |A| =
.
0
a2 + b 2
2|ab| a2 + b2
To get different norms we have to choose a, b in such a way that the maximal eigenvalues
of AT A and |A|T |A| are different. Clearly AT A has a unique eigenvalue := a2 + b2
and putting the characteristic polynomial () = (a2 + b2 )2 4|ab|2 of |A|T |A| to
zero yields eigenvalues := a2 + b2 2|ab|. Hence |A|T |A| has maximal eigenvalue
+ = a2 + b2 + 2|ab| = + 2|ab|. The spectral norms of A and |A| therefore differ
whenever both a and b are nonzero. For example, when a = b = c = 1 we find



1 1
A=
,
kAk2 = 2,
k |A| k2 = 2.
1 1
Exercise 7.32: Sharpness of perturbation bounds
Suppose Ax = b and Ay = b + e. Let K = K(A) = kAkkA1 k be the condition
number of A. Let yA and yA1 be unit vectors for which the maxima in the definition
of the operator norms of A and A1 are attained. That is, kyA k = 1 = kyA1 k,
kAk = kAyA k, and kA1 k = kA1 yA1 k. If b = AyA and e = yA1 , then
kA1 ek
kA1 yA1 k
kyA1 k
kek
ky xk
=
=
= kA1 k = kAkkA1 k
=K
,
1
kxk
kA bk
kyA k
kAyA k
kbk
showing that the upper bound is sharp. If b = yA1 and e = AyA , then
ky xk
kA1 ek
kyA k
1
1
kAyA |
1 kek
=
=
=
=
=
,
kxk
kA1 bk
kA1 yA1 k
kA1 k
kAkkA1 k kyA1 k
K kbk
showing that the lower bound is sharp.
Exercise 7.33: Condition number of 2nd derivative matrix
Recall that T = tridiag(1, 2, 1) and, by Exercise 1.10, T1 is given by


1
T1 ij = T1 ji = (1 ih)j > 0, 1 j i m, h =
.
m+1
From Theorems 7.12 and 7.14, we have the following explicit expressions for the 1-, 2and -norms
n
m
X
X
1
kAk1 = max
|ai,j |, kAk2 = 1 , kA1 k2 =
, kAk = max
|ai,j |
1jn
1im

m
i=1
j=1
for any matrix A Cm,n , where 1 is the largest singular value of A, m the smallest
singular value of A, and we assumed A to be nonsingular in the third equation.
45

(a) For the matrix T this gives kTk1 = kTk = m + 1 for m = 1, 2 and kTk1 =
kTk = 4 for m 3. For the inverse we get kT1 k1 = kT1 k = 21 = 18 h2 for m = 1
and




1 2 1
1 2 1
1






= kT1 k
kT k1 =
= 1 =
3 1 2 1
3 1 2
for m = 2. For m > 2, one obtains
m
m
j1
X
X
1  X
(1 jh)i +
(1 ih)j
T ij =
i=1

i=1

i=j

j1

m
X

(1 jh)i +

i=1

j1
X
(1 ih)j
(1 ih)j
i=1

i=1

(j 1)j jm
(j 1)j
= (1 jh)
+
(2 jh)
2
2
2
j
= (m + 1 j)
2
1
1
=
j j 2,
2h
2
1
which is a quadratic function in j that attains its maximum at j = 2h
= m+1
. For
2
odd m > 1, this function takes its maximum at integral j, yielding kT1 k1 = 81 h2 .
For even m > 2, on the other hand, the maximum over all integral j is attained at
j = m2 = 1h
or j = m+2
= 1+h
, which both give kT1 k1 = 18 (h2 1).
2h
2
2h
Similarly, we have for the infinity norm of T1
m
m
i1
X
X
1
1
1  X
(1 ih)j +
(1 jh)i =
i i2 ,
T i,j =
2h
2
j=1
j=1
j=i

and hence kT1 k = kT1 k1 . This is what one would expect, as T (and therefore
T1 ) is symmetric. We conclude that the 1- and -condition numbers of T are

2
m = 1;

1 6
m = 2;
cond1 (T) = cond (T) =
2
h
2
m odd, m > 1;

2
h 1 m even, m > 2.
(b) Since the matrix T is symmetric, TT T = T2 and the eigenvalues of TT T are
the squares of the eigenvalues 1 , . . . , n of T. As all eigenvalues of T are positive,
each singular value of T is equal to an eigenvalue. Using that i = 2 2 cos(ih), we
find
1 = |m | = 2 2 cos(mh) = 2 + 2 cos(h),
m = |1 | = 2 2 cos(h).
It follows that
1
1 + cos(h)
=
= cot2
cond2 (T) =
m
1 cos(h)
46

h
2


.

(c) From tan x > x we obtain cot2 x =


we find
4
4
2
< cond2 (T) < 2 2 .
2
2
h
3
h

1
tan2 x

<

1
.
x2

Using this and cot2 x > x2

2
3

Exercise 7.44: When is a complex norm an inner product norm?


As in the Exercise, we let
kx + yk2 kx yk2
.
4
We need to verify the three properties that define an inner product. Let x, y, z be
arbitrary vectors in Cm and a C be an arbitrary scalar.
(1) Positive-definiteness. One has s(x, x) = kxk2 0 and
hx, yi = s(x, y) + is(x, iy),

s(x, y) =

kx + ixk2 kx ixk2
k(1 + i)xk2 k(1 i)xk2
=
4
4
(|1 + i| |1 i|)kxk2
= 0,
=
4
so that hx, xi = kxk2 0, with equality holding precisely when x = 0.
(2) Conjugate symmetry. Since s(x, y) is real, s(x, y) = s(y, x), s(ax, ay) =
|a|2 s(x, y), and s(x, y) = s(x, y),
s(x, ix) =

hy, xi = s(y, x)is(y, ix) = s(x, y)is(ix, y) = s(x, y)is(x, iy) = hx, yi.
(3) Linearity in the first argument. Assuming the parallelogram identity,
1
1
1
1
2s(x, z) + 2s(y, z) = kx + zk2 kz xk2 + ky + zk2 kz yk2
2
2
2
2

2

2



1
x
+
y
x

y
1
x
+
y
x

y
z
+
=
z+
+




2
2
2
2
2
2

2

2



1
z + x + y x y 1 z x + y + x y



2
2
2
2
2
2
2

2




x y 2
x y 2
x + y
x + y








= z +
+
z

2
2
2
2
2
2





x
+
y
x
+
y
z

=
z
+


2
2


x+y
=4s
,z ,
2
implying that s(x + y, z) = s(x, z) + s(y, z). It follows that
hx + y, zi = s(x + y, z) + is(x + y, iz)
= s(x, z) + s(y, z) + is(x, iz) + is(y, iz)
= s(x, z) + is(x, iz) + s(y, z) + is(y, iz)
= hx, zi + hy, zi.
That hax, yi = ahx, yi follows, mutatis mutandis, from the proof of Theorem 7.42.
47

Exercise 7.45: p-norm for p = 1 and p =


We need to verify the three properties that define a norm. Consider arbitrary vectors
x = [x1 , . . . , xn ]T and y = [y1 , . . . , yn ] in Rn and a scalar a R. First we verify that
k k1 is a norm.
(1) Positivity. Clearly kxk1 = |x1 | + + |xn | 0, with equality holding precisely
when |x1 | = = |xn | = 0, which happens if and only if x is the zero vector.
(2) Homogeneity. One has
kaxk1 = |ax1 | + + |axn | = |a|(|x1 | + + |xn |) = |a|kxk1 .
(3) Subadditivity. Using the triangle inequality for the absolute value,
kx+yk1 = |x1 +y1 |+ +|xn +yn | |x1 |+|y1 |+ + |xn |+|yn | = kxk1 +kyk1 .
Next we verify that k k is a norm.
(1) Positivity. Clearly kxk = max{|x1 |, . . . , |xn |} 0, with equality holding
precisely when |x1 | = = |xn | = 0, which happens if and only if x is the
zero vector.
(2) Homogeneity. One has
kaxk = max{|a||x1 |, . . . , |a||xn |} = |a| max{|x1 |, . . . , |xn |} = |a|kxk .
(3) Subadditivity. Using the triangle inequality for the absolute value,
kx + yk = max{|x1 + y1 |, . . . , |xn + yn |} max{|x1 | + |y1 |, . . . , |xn | + |yn |}
max{|x1 |, . . . , |xn |} + max{|y1 |, . . . , |yn |} = kxk + kyk
Exercise 7.46: The p-norm unit sphere
In the plane, unit spheres for the 1-norm, 2-norm, and -norm are

-1

0.5

0.5

0.5

-0.5

0.5

-1

-0.5

0.5

-1

-0.5

0.5

-0.5

-0.5

-0.5

-1

-1

-1

Exercise 7.47: Sharpness of p-norm inequality


Let 1 p . The vector xl = [1, 0, . . . , 0]T Rn satisfies
kxl kp = (|1|p + |0|p + + |0|p )1/p = 1 = max{|1|, |0|, . . . , |0|} = kxl k ,
and the vector xu = [1, 1, . . . , 1]T Rn satisfies
kxu kp = (|1|p + + |1|p )1/p = n1/p = n1/p max{|1|, . . . , |1|} = n1/p kxu k .

48

Exercise 7.48: p-norm inequalities for arbitrary p


Let p and q be integers satisfying 1 q p, and let x = [x1 , . . . , xn ]T Cn . Since
p/q 1, the function f (z) = z p/q is convex on [0, ). For any z1 , . . . , zn [0, ) and
1 , . . . , n 0 satisfying 1 + + n = 1, Jensens inequality gives
!p/q
!
n
n
n
n
X
X
X
X
p/q
i zi
=f
i zi
i f (zi ) =
i zi .
i=1

i=1

i=1

i=1

In particular for zi = |xi | and 1 = = n = 1/n,


!p/q
!p/q
n
n
n
n
X
X
X
X

1
1
q
q p/q
1
p/q
q
|xi |p .
|xi |

|xi |
=n
n
|xi |
=
n
n
i=1
i=1
i=1
i=1
Since the function x 7 x1/p is monotone, we obtain
!1/q
!1/p
n
n
X
X
n1/q kxkq = n1/q
|xi |q
n1/p
|xi |p
= n1/p kxkp ,
i=1

i=1

from which the right inequality in the exercise follows.


The left inequality clearly holds for x = 0, so assume x 6= 0. Without loss of
generality we can then assume kxk = 1, since kaxkp kaxkq if and only if kxkp
kxkq for any nonzero scalar a. Then, for any i = 1, . . . , n, one has |xi | 1, implying
that |xi |p |xi |q . Moreover, since |xi | = 1 for some i, one has |x1 |q + + |xn |q 1,
so that
!1/p
!1/p
!1/q
n
n
n
X
X
X
kxkp =
|xi |p

|xi |q

|xi |q
= kxkq .
i=1

i=1

i=1

Finally we consider the case p = . The statement is obvious for q = p, so assume


that q is an integer. Then
!1/q
!1/q
n
n
X
X
kxkq =
|xi |q

kxkq
= n1/q kxk ,
i=1

i=1

proving the right inequality. Using that the map x 7 x1/q is monotone, the left
inequality follows from
n
X
q
q
kxk = (max |xi |)
|xi |q = kxkqq .
i

i=1

49

CHAPTER 8

The Classical Iterative Methods


Exercise 8.2: Richardson and Jacobi
If a11 = = ann = d 6= 0 and = 1/d, Richardsons method (8.1) yields, for
i = 1, . . . , n,
!
n
X
1
bi
aij xk (j)
xk+1 (i) = xk (i) +
d
j=1
!
n
X
1
=
dxk (i)
aij xk (j) + bi
d
j=1
!
n
X
1
aii xk (i)
aij xk (j) + bi
=
aii
j=1
!
i1
n
X
X
1
=

aij xk (j)
aij xk (j) + bi ,
aii
j=1
j=i+1
which is identical to Jacobis method (8.3).
Exercise 8.13: Convergence of the R-method when eigenvalues have
positive real part (TODO)

Exercise 8.16: Example: GS converges, J diverges


The eigenvalues of A are the zeros of det(A I) = ( + 2a + 1)( + a 1)2 . We find
eigenvalues 1 := 2a + 1 and 2 := 1 a, the latter having algebraic multiplicity two.
Whenever 1/2 < a < 1 these eigenvalues are positive, implying that A is positive
definite for such a.
Lets compute the spectral radius of GJ = I D1 A, where D is the diagonal part
of A. The eigenvalues of GJ are the zeros of the characteristic polynomial


a a


det(GJ I) = a a = ( 2a)(a )2 ,
a a
and we find spectral radius (GJ ) = max{|a|, |2a|}. It follows that (GJ ) > 1 whenever
1/2 < a < 1, in which case Theorem 8.27 implies that the Jacobi method does not
converge (even though A is symmetric positive definite).
50

Exercise 8.17: Divergence example for J and GS


We compute the matrices GJ and G1 from A and show that that the spectral radii
(GJ ), (G1 ) 1. Once this is shown, Theorem 8.10 implies that the Jacobi method
and Gauss-Seidels method diverge.
Write A = D AL AR as in the book. From Equation (8.13), we find


 

 
1 0 1 2
0 2
1 0
1
1
,
GJ = I MJ A = I D A =

=
0 1
0 41 3 4
34 0

 


1 0 1 2
1 0
1
1
G1 = I M1 A = I (D AL ) A =

0 1
34 41 3 4


0 2
=
.
0 32
p
From this, we find (GJ ) = 3/2 and (G1 ) = 3/2, both of which are bigger than 1.
Exercise 8.18: Strictly diagonally dominance; The J method
If A = (aij )ij is strictly diagonally dominant, then it is nonsingular and a11 , . . . , ann 6=
0. For the Jacobi method, one finds

a1n
12
0
aa11
aa13

a11
11
a2n
a21
0
aa23

a22
a22
22
a3n
a31 a32
1
0

G = I diag(a11 , . . . , ann ) A = a33


a33
a33 .
..

.
.
.
.
.
.
.
.
.
.
.
.
.
n2
n3
n1
aann
aann

0
aann
By Theorem 7.12, the -norm can be expressed as the maximum, over all rows, of
the sum of absolute values of the entries in a row. Using that A is strictly diagonally
dominant, one finds
P
X aij
|a |
= max j6=i ij < 1.
kGk = max
aii 1in |aii |
i
j6=i

As by Lemma 7.11 the -norm is consistent, Corollary 8.9 implies that the Jacobi
method converges for any strictly diagonally dominant matrix A.
Exercise 8.19: Strictly diagonally dominance; The GS method
Let A = AL + D AR be decomposed as a sum of a lower triangular, a diagonal,
and an upper triangular part. By Equation (8.14), the approximate solutions x(k) are
related by
Dx(k+1) = AL x(k+1) + AR x(k) + b
in the Gauss Seidel method. Let x be the exact solution of Ax = b. It follows that
the errors (k) := x(k) x are related by
D(k+1) = AL (k+1) + AR (k) .
Let r and ri be as in the exercise. Let k 0 be arbitrary. We show by induction on i
that
(?)

(k+1)

|j

| rk(k) k ,

for j = 1, . . . , i.
51

For i = 1, the relation between the errors translates to






(k+1)
(k)
(k)
(k)
|1
| = |a11 |1 a12 2 a1n (k)
n r1 k k rk k .
Fix i 2 and assume that Equation (?) holds for all smaller i. The relation between
(k+1)
the residuals then bounds |j
| as


(k+1)
(k+1)
(k)
1
(k)
|ajj | aj1 1
aj,j1 j1 aj,j+1 j+1 ajn n
rj max{rk(k) k , k(k) k } = rj k(k) k rk(k) k .
Equation (?) then follows by induction.
If A is strictly diagonally dominant, then r < 1 and
lim k(k) k k(0) k lim rk = 0.

We conclude that the Gauss Seidel method converges for strictly diagonally dominant
matrices.
Exercise 8.23: Convergence example for fix point iteration
(k)

(k)

We show by induction that x1 = x2 = 1 ak for every k 0. Clearly the formula


holds for k = 0. Assume the formula holds for some fixed k. Then


 
 

0 a 1 ak
1a
1 ak+1
(k+1)
(k)
x
= Gx + c =
+
=
,
a 0 1 ak
1a
1 ak+1
It follows that the formula holds for any k 0. When |a| < 1 we can evaluate the limit
(k)

lim xi

= lim 1 ak = 1 lim ak = 1,
k

(k)

for i = 1, 2.

(k)

When |a| > 1, however, |x1 | = |x2 | = |1 ak | becomes arbitrary large with k and
(k)
limk xi diverges.
The eigenvalues of G are the zeros of the characteristic polynomial 2 a2 =
( a)( + a), and we find that G has spectral radius (G) = 1 , where := 1 |a|.
Equation (8.32) yields an estimate k = log(10)s/(1 |a|) for the smallest number of
iterations k so that (G)k 10s . In particular, taking a = 0.9 and s = 16, one
expects at least k = 160 log(10) 368 iterations before (G)k 1016 . On the other
hand, 0.9k = |a|k = 10s = 1016 when k 350, so in this case the estimate is fairly
accurate.
Exercise 8.24: Estimate in Lemma 8.22 can be exact
As the eigenvalues of the matrix GJ are the zeros of 2 1/4 = ( 1/2)( + 1/2) = 0,
one finds the spectral radius (GJ ) = 1/2. In this example, the Jacobi iteration process
is described by
 1

1    1 
2 0
1
0 2
(k+1)
(k)
,
c=
= 21 .
x
= GJ x + c,
GJ = 1
0 2
1
0
2
2
The initial guess
 
0
(0)
x =
0
52

(k)

(k)

satisfies the formula x1 = x2 = 1 2k for k = 0. Moreover, if this formula holds


for some k 0, one finds

 1 
 1 
1 2k
0 2 1 21k
(k+1)
(k)
2
,
+ 1 =
x
= GJ x + c = 1
0 1 21k
1 2k
2
2
which means that it must then hold for k + 1 as well. By induction we can conclude
that the formula holds for all k 0.
At iteration k, each entry of the approximation x(k) differs by 2k from the fixed
point, implying that k(k) k = 2k . Therefore, for given s, the error k(k) k 10s
for the first time at k = ds log(10)/ log(2)e. The bound from Lemma 8.22, on the other
hand, yields k = 2s log(10) in this case.
Exercise 8.25: Slow spectral radius convergence
In this exercise we show that the convergence of
lim kAk k1/k

can be quite slow. This makes it an impractical method for computing the spectral
radius of A.
(a) The Matlab code
1
2
3
4
5
6
7
8
9

n = 5
a = 10
l = 0.9
for k = n-1:200
L(k) = nchoosek(k,n-1)*a(n-1)*l(k-n+1);
end
stairs(L)

yields the following stairstep graph of f :


7

x 10
2.5

1.5

0.5

0
0

20

40

60

80

100

120

140

160

180

200

The command max(L) returns a maximum of 2.0589 107 of f on the interval


n 1 k 200. Moreover, the code
53

1
2
3
4

k = n-1;

while nchoosek(k,n-1)*a(n-1)*l(k-n+1) >= 10(-8)


k = k + 1;
5 end
6
7

finds that f (k) dives for the first time below 108 at k = 470. We conclude that the
matrix Ak is close to zero only for a very high power k.
(b) Let E = E1 := (A I)/a be the n n matrix in the exercise, and write


0 Ink
Ek :=
Rn,n .
0 0
Clearly Ek = Ek for k = 1. Suppose that Ek = Ek for some k satisfying 1 k n 1.
Using the rules of block multiplication,
Ek+1 = Ek E1



0k,1 , Ik 0k,nk1
0nk,k Ink
=
nk1
0k,k 0k,nk
0nk,k+1 0I1,nk1


0nk,k+1 Ink1
=
0k,k+1 0k,nk1
= Ek+1 .
Alternatively, since

1 if j = i + 1,
(E)ij =
0 otherwise,

(E )ij =

1 if j = i + k,
0 otherwise,

one has
(Ek+1 )ij = (Ek E)ij =

(Ek )i` (E)`j = (Ek )i,i+k (E)i+k,j = 1 (E)i+k,j


=

1 if j = i + k + 1,
0 otherwise,

By induction we conclude that Ek = Ek for any k satisfying 1 k n, with the


convention that En = En = 0n,n . We summarize that the matrix E is nilpotent of
degree n.
(c) Since the matrices E and I commute, the binomial theorem and (b) yield
min{k,n1}  
X
k kj j j
k
k
A = (aE + I) =
aE.
j
j=0
Since (Ej )1,n = 0 for 1 j n 2 and (En1 )1,n = 1, it follows that


min{k,n1}  
X
k
k kj j j
k
(A )1,n =
a (E )1,n =
kn+1 an1 = f (k),
j
n

1
j=0
which is what needed to be shown.
54

Exercise 8.31: A special norm (TODO)

Exercise 8.33: When is A + E nonsingular?



Suppose (A1 E) = A1 (E) < 1. By Theorem 8.32.2, I + A1 E is nonsingular
and therefore so is the product A(I + A1 E) = A + E.
Conversely, suppose A + E is nonsingular. Then the inverse C of I A1 (E) =
k
P
1
A1 (A + E) exists, implying that the series
converges (namely to
k=0 A (E)
C). By Theorem 8.32.1,

(A1 E) = A1 (E) < 1.

55

CHAPTER 9

The Conjugate Gradient Method


Exercise 9.1: Paraboloid
Given is a quadratic function Q(y) = 12 yT Ay bT y, a decomposition A = UDUT
with UT U = I and D = diag(1 , . . . , n ), new variables v = [v1 , . . . , vn ]T := UT y, and
a vector c = [c1 , . . . , cn ]T := UT b. Then
n

X
1
1
1X
Q(y) = yT UDUT y bT y = vT Dv cT v =
j vj2
cj vj ,
2
2
2 j=1
j=1
which is what needed to be shown.
Exercise 9.4: Steepest descent iteration
In the method of Steepest Descent we choose, at the kth iteration, the search direction
pk = rk = b Axk and optimal step length
k :=

rTk rk
.
rTk Ark

Given is a quadratic function


 
 

1
x
T x
x y A
Q(x, y) =
b
,
y
y
2


2 1
A=
,
1 2

 
0
b=
,
0

and an initial guess x0 = [1, 1/2]T of its minimum. The corresponding residual is
  

  
0
2 1
1
3/2
r0 = b Ax0 =

=
.
0
1 2
1/2
0
Performing the steps in Equation (9.6) twice yields

  

rT r0
9/4
1
2 1 3/2
3
= ,
t0 = Ar0 =
=
, 0 = 0T =
1 2
0
3/2
9/2
2
r0 t0
  

 

  


1 3/2
1
1
1/4
3/2
3
0
x1 =
+
=
, r1 =

=
1/2
0
1/2
0
3/2
3/4
2
2
  

rT r1
9/16
1
2 1
0
3/4
t1 = Ar1 =
=
, 1 = 1T =
= ,
1 2
3/4
3/2
9/8
2
r1 t1


  

 

  
1 0
1 3/4
1/4
1/4
0
3/8
x2 =
+
=
, r2 =

=
.
1/2
1/8
3/4
0
2 3/4
2 3/2
Moreover, assume that for some k 1 one has


 
 
1
k 0
1k
k 1
, x2k1 = 4
, r2k1 = 3 4
,
(?)
t2k2 = 3 4
2
1
1/2


56

(??)

t2k1 = 3 4

 
1
,
2

x2k = 4


1
,
1/2

r2k = 3 4

 
1/2
.
0

Then


t2k = 3 4



 
1
2 1 1/2
1(k+1)
,
=34
1/2
1 2
0

9 42k ( 21 )2
rT2k r2k
1
=
=
,
1
T
2
r2k t2k
9 42k 2
 
 
 
1
1
(k+1) 1
k
k 1/2
,
+ 34
= 4
x2k+1 = 4
2
1/2
0
2
 


 
1
1
1(k+1)
(k+1) 0
k 1/2
34
r2k+1 = 3 4
=34
,
0
1/2
1
2

 
 
2 1 0
(k+1)
(k+1) 1
t2k+1 = 3 4
=34
,
1 2
1
2
2k =

rT2k+1 r2k+1
9 42(k+1)
1
= ,
=
2k+1 = T
2(k+1)
94
2
2
r2k+1 t2k+1
 
 
 
1
1
(k+1) 1
(k+1) 0
(k+1)
x2k+2 = 4
+ 34
= 4
,
2
1
1/2
2
 
 
 
1
(k+1) 1
(k+1) 0
(k+1) 1/2
r2k+2 = 3 4
=34
34
,
2
1
0
2
Using the method of induction, we conclude that (?), (??), and k = 1/2 hold for any
k 1.
Exercise 9.7: Conjugate gradient iteration, II
Using x(0) = 0, one finds
x(1) = x(0) +

(b Ax(0) , b Ax(0) )
(b, b)
(b Ax(0) ) =
b.
(0)
2
(0)
(b Ax , Ab A x )
(b, Ab)

Exercise 9.8: Conjugate gradient iteration, III


By Exercise 9.7,
x

(1)

   
(b, b)
9 0
0
=
=
.
b=
3/2
(b, Ab)
18 3

We find, in order,
(0)

(0)

=r

1
0 = ,
4

3
1 (1)
0 = , r = 2 ,
0
2
3
 
2
1
(2)
= 23 ,
1 = , x =
.
2
3
4

 
0
=
,
3
p(1)

Since the residual vectors r(0) , r(1) , r(2) must be orthogonal, it follows that r(2) = 0 and
x(2) must be an exact solution. This can be verified directly by hand.
57

Exercise 9.9: The cg step length is optimal


For any fixed search direction pk , the step length k is optimal if Q(xk+1 ) is as small
as possible, that is
Q(xk+1 ) = Q(xk + k pk ) = min f (),
R

where, by (9.3),
1
f () := Q(xk + pk ) = Q(xk ) pTk rk + 2 pTk Apk
2
is a quadratic polynomial in . Since A is assumed to be positive definite, necessarily
pTk Apk > 0. Therefore f has a minimum, which it attains at
=

pTk rk
.
pTk Apk

Applying (9.15) repeatedly, one finds that the search direction pk for the conjugate
gradient method satisfies


rTk1 rk1
rTk rk
rTk rk
pk = rk + T
pk1 = rk + T
rk1 + T
pk2 =
rk1 rk1
rk1 rk1
rk2 rk2
As p0 = r0 , the difference pk rk is a linear combination of the vectors rk1 , . . . , r0 ,
each of which is orthogonal to rk . It follows that pTk rk = rTk rk and that the step length
is optimal for
=

rTk rk
= k .
pTk Apk
Exercise 9.10: Starting value in cg

As in the exercise, we consider the conjugate gradient method for Ay = r0 , with


r0 = b Ax0 . Starting with
y0 = 0,

s0 = r0 Ay0 = r0 ,

q0 = s0 = r0 ,

one computes, for any k 0,


k :=

sTk sk
,
qTk Aqk

yk+1 = yk + k qk ,

sk+1 = sk k Aqk ,

sTk+1 sk+1
,
qk+1 = sk+1 + k qk .
k :=
sTk sk
How are the iterates yk and xk related? As remarked above, s0 = r0 and q0 = r0 = p0 .
Suppose sk = rk and qk = pk for some k 0. Then
sk+1 = sk k Aqk = rk

rTk rk
Apk = rk k Apk = rk+1 ,
pTk Apk

rTk+1 rk+1
pk = pk+1 .
rTk rk
It follows by induction that sk = rk and qk = pk for all k 0. In addition,
qk+1 = sk+1 + k qk = rk+1 +

yk+1 yk = k qk =

rTk rk
pk = xk+1 xk ,
pTk Apk
58

for any k 0,

so that yk = xk x0 .
Exercise 9.15: The A-inner product
We verify the axioms of Definition 0.20.
Positivity: Since A is positive definite, x 6= 0 = hx, xi = xT Ax > 0. On the
other hand, x = 0 = hx, xi = xT Ax = 0T A0 = 0. It follows that hx, xi 0
for all x, with equality if and only if x = 0.
Symmetry: One has hx, yi = xT Ay = (xT Ay)T = yT AT x = yT Ax = hy, xi
for all vectors x and y.
Linearity: One has hax + by, zi = (ax + by)T Az = axT Az + byT Az = ahx, zi +
bhy, zi for all real numbers a, b and vectors x, y, z.
Exercise 9.17: Program code for testing steepest descent
Replacing the steps in (9.16) by those in (9.6), Algorithm 9.13 changes into the following
algorithm for testing the method of Steepest Descent.
Listing 9.1. Testing the method of Steepest Descent
1
2
3
4
5
6
7
8
9
10
11
12
13

function [V,K] = sdtest(m, a, d, tol, itmax)


R = ones(m)/(m+1)2; rho = sum(sum(R.*R)); rho0 = rho;
V = zeros(m,m);
T1=sparse(toeplitz([d, a, zeros(1,m-2)]));
for k=1:itmax
if sqrt(rho/rho0) <= tol
K = k; return
end
T = T1*R + R*T1;
a = rho/sum(sum(R.*T)); V = V + a*R; R = R - a*T;
rhos = rho; rho = sum(sum(R.*R));
end
K = itmax + 1;

To check that this program is correct, we compare its output with that of cgtest.
1
2
3

[V1, K] = sdtest(50, -1, 2, 10(-8), 1000000);


[V2, K] = cgtest(50, -1, 2, 10(-8), 1000000);
surf(V2 - V1);

Running these commands yields Figure 1, which shows that the difference between
both tests is of the order of 109 , well within the specified tolerance.
As in Tables 9.12 and 9.14, we let the tolerance be tol = 108 and run sdtest for
the m m grid for various m, to find the number of iterations Ksd required before
||rKsd ||2 tol ||r0 ||2 . Choosing a = 1/9 and d = 5/18 yields the averaging matrix, and
we find the following table.
n
Ksd

2 500

10 000

40 000

1 000 000

4 000 000

37

35

32

26

24

Choosing a = 1 and d = 2 yields the Poisson matrix, and we find the following
table.
59

10

x 10
8

0
50
40

50
30

40
30

20
20

10

10
0

Figure 1. For a 50 50 Poisson matrix and a tolerance of 108 , the


figure shows the difference of the outputs of cgtest and sdtest.

n
Ksd /n
Ksd

100

400

1 600

2 500

10 000

40 000

4.1900
419

4.0325
1 613

3.9112
6 258

3.8832
9 708

3.8235
38 235

3.7863
151 451

KJ

385

8 386

KGS

194

4 194

KSOR

35

164

324

645

Kcg

16

94

188

370

37

75

Here the number of iterations KJ , KGS , and KSOR of the Jacobi, Gauss-Seidel and
SOR methods are taken from Table 8.1, and Kcg is the number of iterations in the
Conjugate Gradient method.
Since Ksd /n seems to tend towards a constant, it seems that the method of Steepest
Descent requires O(n) iterations
for solving the Poisson problem for some given accu
racy, as opposed to the O( n) iterations required by the Conjugate Gradient method.
The number of iterations in the method of Steepest Descent is comparable to the number of iterations in the Jacobi method, while the number of iterations in the Conjugate
Gradient method is of the same order as in the SOR method.

The spectral
condition number of the mm Poisson matrix is = 1+cos(h) /(1

cos h) . Theorem 9.16 therefore states that

(?)

||x xk ||A

||x x0 ||A

1
+1

k

= cos

m+1

60


.

Listing 9.2. Conjugate gradient method for least squares


1
2
3
4
5
6
7
8
9
10
11
12
13
14

function [x,K]=cg_leastSquares (A,b,x,tol,itmax)


r=b-A*A*x; p=r;
rho=r*r; rho0=rho;
for k=0:itmax
if sqrt(rho/rho0)<= tol
K=k;
return
end
t=A*p; a=rho /(t*t);
x=x+a*p; r=r-a*A*t;
rhos=rho; rho=r*r;
p=r+(rho/rhos)*p;
end
K=itmax+1;

How can we relate this to the tolerance in the algorithm, which is specified in terms
of the Euclidean norm? Since
kxk2A
xT Ax
=
kxk22
xT x
is the Rayleigh quotient of x, Lemma 5.26 implies the bound
min kxk22 kxk2A max kxk22 ,


with min = 4 1cos(h) the smallest and max = 4 1+cos(h) the largest eigenvalue
of A. Combining these bounds with Equation (?) yields


k s



1 + cos m+1
1

kx xk k2
k
 cos

=
.

kx x0 k2
+1
m+1
1 cos m+1
Replacing k by the number of iterations Ksd for the various values of m shows that
this estimate holds for the tolerance of 108 .
Exercise 9.18: Using cg to solve normal equations
We need to perform Algorithm 9.11 with AT A replacing A and AT b replacing b. For
the system AT Ax = AT b, Equations (9.13), (9.14), and (9.15) become
x(k+1) = x(k) + k p(k) ,

k =

r(k) T r(k)
r(k) T r(k)
=
,
p(k) T AT Ap(k)
(Ap(k) )T Ap(k)

k =

r(k+1) T r(k+1)
,
r(k) T r(k)

r(k+1) = r(k) k AT Ap(k) ,


p(k+1) = r(k+1) + k p(k) ,

with p(0) = r(0) = b AT Ax(0) . Hence we only need to change the computation of
r(0) , k , and r(k+1) in Algorithm 9.11, which yields the implementation in Listing 9.2.
61

Exercise 9.20: Maximum of a convex function


This is a special case of the maximum principle in convex analysis, which states that
a convex function, defined on a compact convex set , attains its maximum on the
boundary of .
Let f : [a, b] R be a convex function. Consider an arbitrary point x = (1 )a +
b [a, b], with 0 1. Since f is convex,


f (x) = f (1 )a + b (1 )f (a) + f (b) = f (a) + f (b) f (a) .
Since 0 1, the right hand side is bounded by f (a) if f (b) f (a) is negative, and
by f (b) if f (b) f (a) is positive. It follows that f (x) max{f (a), f (b)} and that f
attains its maximum on the boundary of its domain of definition.
Exercise 9.25: Krylov space and cg iterations
(a) The Krylov spaces Wk are defined as


Wk := span r(0) , Ar(0) , . . . , Ak1 r(0) .
Taking A, b, x = 0, and r(0) = b Ax = b as in the Exercise, these vectors can be
expressed as


8
20
4

 (0)
 
 
r , Ar(0) , A2 r(0) = b, Ab, A2 b = 0 , 4 , 16 .
0
4
0
(b) As x(0) = 0 we have p(0) = r(0) = b. We have for k = 0, 1, 2, . . . Equations
(9.13), (9.14), and (9.15),
x(k+1) = x(k) + k p(k) ,

k =

r(k) T r(k)
,
p(k) T Ap(k)

k =

r(k+1) T r(k+1)
,
r(k) T r(k)

r(k+1) = r(k) k Ap(k) ,


p(k+1) = r(k+1) + k p(k) ,
which determine the approximations

2
1
(1)

0 = ,
x = 0 ,
2
0

8
2
1
(2)
4 ,
1 = ,
x =
3
3 0

3
3
(3)

2 = ,
x = 2 ,
4
1

x(k) . For k = 0, 1, 2 these give




0
1
1
(1)
(1)

p = 2 ,
r = 2 ,
0 = ,
4
0
0


0
4
1
4
1
(2)
(2)
0 ,
8 ,
r =
1 = ,
p =
3 4
9
9 12


0
0
(3)
(3)

0
r =
,
2 = 0,
p = 0 .
0
0

(c) By definition we have W0 = {0}. From the solution of part (a) we know
that Wk = span(b0 , Ab0 , . . . , Ak1 b0 ), where the vectors b, Ab and A2 b are linearly
independent. Hence we have dim Wk = k for k = 0, 1, 2, 3.
62

From (b) we know that the residual r(3) = b Ax(3) = 0. Hence x(3) is the exact
solution to Ax = b.
We observe that r(0) = 4e1 , r(1) = 2e2 and r(2) = (4/3)e3 and hence the r(k) for
k = 0, 1, 2 are linear independent and orthogonal to each other. Thus we are only left to
show that Wk is the span of r(0) , . . . , r(k1) . We observe that b = r(0) , Ab = 2r(0) 2r(1)
and A2 b = 5r(0) 8r(1) + 3r(2) . Hence span(b, Ab, . . . , Abk1 ) = span(r(0) , . . . , r(k1) )
for k = 1, 2, 3. We conclude that, for k = 1, 2, 3, the vectors r(0) , . . . , r(k1) form an
orthogonal basis for Wk .
One can verify directly that p(0) , p(1) , and p(2) are A-orthogonal. Moreover, observing that b = p(0) , Ab = (5/2)p(0) 2p(1) , and A2 b = 7p(0) (28/3)p(1) + 3p(2) ,
it follows that
span(b, Ab, . . . , Abk1 ) = span(p(0) , . . . , p(k1) ),

for k = 1, 2, 3.

We conclude that, for k = 1, 2, 3, the vectors p(0) , . . . , p(k1) form an A-orthogonal


basis for Wk .
By computing the Euclidean norms of r(0) , r(1) , r(2) , r(3) , we get
(0)
(1)
(2)
(3)
r = 4,
r = 2,
r = 4/3,
r = 0.
2
2
2
2
It follows that the sequence (kr(k) k)k is monotonically decreasing. Similarly, one finds
p
(k)


x x 3 =
10, 6, 14/9, 0 ,
2 k=0
which is clearly monotonically decreasing.
Exercise 9.28: Another explicit formula for the Chebyshev polynomial
It is well known, and easily verified,
 that cosh(x+y) = cosh(x) cosh(y)+sinh(x) sinh(y).
Write Pn (t) = cosh n arccosh(t) for any integer n 0. Writing = arccosh(t), and
using that cosh is even and sinh is odd, one finds
Pn+1 (t) + Pn1 (t)


= cosh (n + 1) + cosh (n 1)
= cosh(n) cosh() + sinh(n) sinh() + cosh(n) cosh() sinh(n) sinh()
= 2 cosh() cosh(n)
= 2tPn (t).
It follows that Pn (t) satisfies the same recurrence relation as Tn (t). Since in addition
P0 (t) = 1 = T0 (t), necessarily Pn (t) = Tn (t) for any n 0.

63

CHAPTER 10

Orthonormal and Unitary Transformations


Exercise 10.2: Reflector
Suppose x, y Rn are column vectors with equal length l := kxk2 = kyk2 , and write
v := x y.
(a) Since
vT v = xT x yT x xT y + yt y = 2l + 2yT x = 2vT x,
we find


vvT v
v(2vT x)
vvT
=
x

= x v = y.
x=x
I2 T
v v
vT v
vT v
(b) Since kxk22 = kyk22 ,
hx y, x + yi = hx, xi hy, xi + hx, yi hy, yi = kxk22 kyk22 = 0,
which means that x y and x + y are orthogonal. Px is the orthogonal projection of
x into span(x + y), because it satisfies
 T
  T

v xv
1v v
hx Px, (x + y)i =
, (x + y) =
v, (x + y)
vT v
2 vT v


1
=
(x y), (x + y) = 0,
2
for an arbitrary element (x + y) in span(x + y).

Exercise 10.5: What does Algorithm housegen do when x = e1 ?


If x = e1 , then the algorithm yields = ke1 k2 = 1,

x/ e1
e1 /(1) e1
u= p
=p
= 2e1 ,
1 x1 /
1 1/(1)
and

0
H = I uuT =
...
0

0
1
..
.
0

0
0
.
. . . ..
.
1
64

Exercise 10.6: Examples of Householder transformations


(a) Let x and y be as in the exercise. As kxk2 = kyk2 , we can apply Exercise 10.2
to obtain a vector v and a matrix Q,
 


vvt
1 3 4
2
v =xy =
,
Q=I2 t =
,
4
vv
5 4 3
such that Qx = y. As explained in the
text above Exercise 10.2, this matrix Q is a
Householder transformation with u := 2v/kvk2 .
(b) Let x and y be as in the exercise. As kxk2 = kyk2 , we can apply Exercise 10.2
to obtain a vector v and a Householder transformation Q,

2
1 2 2
t
vv
1
v = x y = 1 ,
Q = I 2 t = 2 2 1 ,
vv
3 2 1 2
1
such that Qx = y.
Exercise 10.7: 2 2 Householder transformation
Let Q = IuuT R2,2 be any Householder transformation. Then u = [u1 u2 ]T R2 is
a vector satisfying u21 +u22 = kuk22 = 2, implying that the components of u are related via
u21 1 = 1 u22 . Moreover, as 0 u21 , u22 kuk2 = 2, one has 1 u21 1 = 1 u22 1,
and there exists an angle 0 [0, 2) such that cos(0 ) = u21 1 = 1 u22 . For such an
angle 0 , one has
p
p
p
u1 u2 = 1 + cos 0 1 cos 0 = 1 cos2 0 = sin(0 ).
We thus find an angle := 0 for which

 
 

cos() sin()
cos(0 ) sin(0 )
1 u21 u1 u1
.
=
=
Q=
sin() cos()
sin(0 ) cos(0 )
u1 u2 1 u22
Furthermore, we find
  2
 



 
cos(2)
cos
cos sin cos
sin cos2
.
=
=
Q
=
sin cos sin
sin(2)
sin
2 sin cos
When applied to the vector [cos , sin ]T , therefore, Q doubles the angle and reflects
the result in the y-axis.
Exercise 10.16: QR decomposition
That Q is orthonormal, and therefore unitary, can be shown directly by verifying that
QT Q = I. A direct computation shows that QR = A. Moreover,

2 2


0 2
R
1

R=
0 0 =: 02,2 ,
0 0
where R1 is upper triangular. It follows that A = QR is a QR decomposition.
65

A QR factorization is obtained by removing the parts of Q and R that dont


contribute anything to the product QR. Thus we find a QR factorization

1 1



1
1
1
2
2
,
A = Q1 R 1 ,
Q1 :=
R1 :=
.
0 2
2 1 1
1 1

Exercise 10.17: Householder triangulation


(a) Let

1
0 1
A = [a1 , a2 , a3 ] = 2 1 0
2
2 1
be as in the Exercise. We wish to find Householder transformations Q1 , Q2 that produce
zeros in the columns a1 , a2 , a3 of A. Applying Algorithm 10.4 to the first column of
A, we find


3 2 1
2
1
0
1 .
Q1 A := (I u1 uT1 )A = 0
u1 = 1 ,
3 1
0
1
0
Next we need to map the bottom element (Q1 A)3,2 of the second column to zero,
without changing the first row of Q1 A. For this, we apply Algorithm 10.4 to the
vector [0, 1]T to find
 


1
0 1
0
T
u2 =
and
Q2 := I u2 u2 =
,
1
1 0
which is a Householder transformation of size 2 2. Since



3 2 1
1 0
Q2 Q1 A :=
Q1 A = 0 1 0 ,
0 H2
0
0 1
it follows that the Householder transformations Q1 and Q2 bring A into upper triangular form.
(b) Clearly the matrix Q3 := I is orthogonal and R := Q3 Q2 Q1 A is upper
triangular with positive diagonal elements. It follows that
A = QR,

Q := QT1 QT2 QT3 = Q1 Q2 Q3 ,

is a QR factorization of A of the required form.


Exercise 10.20: QR using Gram-Schmidt, II
Let

1 3
1
1 3
7

A = [a1 , a2 , a3 ] =
1 1 4 .
1 1 2
66

Applying Gram-Schmidt orthogonalization, we find



1
1

v1 = a1 =
1 ,
1

2
2

v2 = a2 12 v1 =
2 ,
2

12 =

13 =

aT2 v1
= 1,
v1T v1

aT3 v1
3
= ,
T
2
v1 v1

23 =

Hence we have

1 2 3
1 2
3

V=
1 2 3 ,
1 2 3

5
aT3 v2
= ,
T
4
v2 v2

4 4 6
= 1 0 4 5 ,
R
4 0 0 4


3
3

v3 = a3 13 v1 23 v2 =
3 .
3

and

2 0 0
D = 0 4 0 .
0 0 6

we obtain
Using Q1 = VD1 and R1 = DR,

1 1 1
2
2
3
1 1 1
1
0 4 5 .
A = Q1 R 1 =
2 1 1 1 0 0 6
1 1 1

Exercise 10.22: Plane rotation


Suppose



r cos
x=
,
r sin


cos sin
P=
.
sin cos

Using the angle difference identities for the sine and cosine functions,
cos( ) = cos cos + sin sin ,
sin( ) = sin cos cos sin ,
we find


 

cos cos + sin sin
r cos( )
Px = r
=
.
sin cos + cos sin
r sin( )
67

Exercise 10.23: Solving upper Hessenberg system using rotations


To determine the number of arithmetic operations of Algorithm 10.24, we first consider
the arithmetic operations in each step. Initially the algorithm stores the length of the
matrix and adds the right hand side as the (n + 1)-th column to the matrix. Such
copying and storing operations do not count as arithmetic operations.
The second big step is the loop. Let us consider the arithmetic operations at the
k-th iteration of this loop. First we have to compute the norm of a two dimensional
vector, which comprises 4 arithmetic operations: two multiplications, one addition and
one square root operation. Assuming r > 0 we compute c and s each in one division,
adding 2 arithmetic operations to our count. Computing the product of the Givens
rotation and A includes 2 multiplications and one addition for each entry of the result.
As we have 2(n + 1 k) entries, this amounts to 6(n + 1 k) arithmetic operations.
The last operation in the loop is just the storage of two entries of A, which again does
not count as an arithmetic operation.
The final step of the whole algorithm is a backward substitution, known to require
O(n2 ) arithmetic operations. We conclude that the Algorithm uses
2

O(n ) +

n1
X

4 + 2 + 6(n + 1 k) = O(n ) + 6

k=1
2

n1
X
k=1

= O(n2 ) + 3n + 9n 12 = O(n2 )
arithmetic operations.

68

(n + 2 k)

CHAPTER 11

Least Squares
Exercise 11.7: Straight line fit (linear regression)
In each case, we are given an over-determined system Ax = b with corresponding
normal equations A Ax = A b.
(a) In this case A = [1, 1, . . . , 1]T , x = [x1 ], and b = [y1 , y2 . . . , ym ]T , implying that

A A = [m] and A b = [y1 + y2 + + ym ]. The normal equation


mx1 = y1 + y2 + + ym
has the unique solution
y1 + y2 + + ym
,
x1 =
m
which is the average of the values y1 , y2 , . . . , ym .
(b) In this case


1 t1
y1
 
1 t2

y
x1
,
.2 ,
A=
x
=
,
b
=
.
.
.. ..
..
x2
1 tm
ym
so that

m
t1 + + tm
,
A A=
t1 + + tm t21 + + t2m


y1 + + ym
A b=
.
t1 y1 + + tm ym

The solution x = [x1 , x2 ]T to the normal equations describes the line y(t) = x2 t + x1
closest to the points (t1 , y1 ), . . . , (tm , ym ), in the sense that the total error
kAx

bk22

m
X

x1 + ti x2 yi

2

i=1

m
X

y(ti ) yi

2

i=1

is minimal.
Exercise 11.8: Straight line fit using shifted power form
We are given an over-determined system Ax = b, with


1 t1 t
y1


1 t t

y2
2
x

A = ..
x= 1 ,
b=
.. ,
... ,
x2
.
.
ym
1 tm t

t1 + t2 + + tm
,
t =
m

(a) The corresponding system of normal equations is A Ax = A b, where




m
t1 + + tm mt

A A=
,
t1 + + tm mt (t1 t)2 + + (tm t)2
69


y1 + + ym
A b=
.
(t1 t)y1 + + (tm t)ym

The solution x = [x1 , x2 ]T to the normal equations describes the line y(t) = x2 (t t)+x1
closest to the points (t1 , y1 ), . . . , (tm , ym ), in the sense that the total error
kAx

bk22

m
X

x1 + (ti t)x2 yi

2

i=1

m
X

y(ti ) yi

2

i=1

is minimal.
(b) Solving the normal equations, we find x1 = 2.375 and x2 = 0.87. Plotting the
data (ti , yi ) and the fitted linear function y(t) = x2 (t t) + x1 gives

4
3.5
3
2.5
2
1.5
1
998

999

1000

1001

1002

Exercise 11.9: Fitting a circle to points


We are given the (in general overdetermined) system
(ti c1 )2 + (yi c2 )2 = r2 ,

i = 1, . . . , m.

(a) Let c1 = x1 /2, c2 = x2 /2, and r2 = x3 + c21 + c22 as in the Exercise. Then, for
i = 1, . . . , m,
0 = (ti c1 )2 + (yi c2 )2 r2

 x 2  x 2
x1  2 
x2  2
1
2
= ti
+ yi
x3

2
2
2
2
= t2i + yi2 ti x1 yi x2 x3 ,
from which Equation (11.6) follows immediately. Once x1 , x2 , and x3 are determined,
we can compute
r
x2
1 2 1 2
x1
c2 = ,
r=
x + x + x3 .
c1 = ,
2
2
4 1 4 2
(b) The linear least square problem is to minimize kAx bk22 , with


t1 y1 1
t1 + y12
x1
.
.
.
.

.. .. ,
..
A = ..
b=
,
x = x2 .
2
x3
tm ym 1
t2m + ym
(c) Whether or not A has independent columns depends on the data ti , yi . For
instance, if ti = yi = 1 for all i, then the columns of A are clearly dependent. In
70

general, A has independent columns whenever we can find three points (ti , yi ) not on
a straight line.
(d) For these points the matrix A becomes

1 4 1
A = 3 2 1 ,
1 0 1
which clearly is invertible. We find

1
x1
1 4 1
17
2

13 = 4 .
x = x2 = 3 2 1
x3
1 0 1
1
1
It follows that c1 = 1, c2 = 2, and r = 2. The points (t, y) = (1, 4), (3, 2), (1, 0)
therefore all lie on the circle
(t 1)2 + (y 2)2 = 4,
as shown in the following picture.

4
3
2
1
0

-1

Exercise 11.15: The generalized inverse


Let A Cm,n be a matrix of rank r with singular value decomposition A = UV
and corresponding singular value factorization A = U1 1 V1 . Define
 1

0r,mr
1

B = A := V U ,
:=
Rn,m .
0nr,r 0nr,mr
Note that = and = . Let us use this and the unitarity of U and
V to show that B satisfies the first two properties from the Exercise.
(1) ABA = UV V U UV = U V = UV = A
(2) BAB = V U UV V U = V U = V U = B
Moreover, since in addition the matrices and are Hermitian,
(3) (BA) = A B = V U U V = V V
= V( ) V = V V = V U UV = BA

(4) (AB) = B A = U V V U = U U
= U( ) U = U U = UV V U = AB
71

Exercise 11.16: Uniqueness of generalized inverse


Denote the Properties to the left by (1B ), (2B ), (3B ), (4B ) and the Properties to the
right by (1C ), (2C ), (3C ), (4C ). Then one uses, in order, (2B ), (4B ), (1C ), (4C ), (4B ),
(1B ) or (2C ), (1C ) or (2C ), (3C ), (3B ), (1B ), (3C ), (2C ).
Exercise 11.17: Verify that a matrix is a generalized inverse
Let

1 1
A = 1 1 ,
0 0



1 1 1 0
B=
4 1 1 0

be as in the Exercise. One finds



1 1
1
1
1
0
AB = 1 1
=
1 1 0
4
0 0


 1 1
1 1 1 0
1 1 =
BA =
4 1 1 0 0 0

1 1 0
1
1 1 0 ,
2 0 0 0


1 1 1
,
2 1 1

so that (AB) = AB and (BA) = BA. Moreover,



1 1
1 1
1 1 1
= 1 1 = A,
ABA = A(BA) = 1 1
1 1
2
0 0
0 0

 



1 1 1 1 1 1 0
1 1 1 0
BAB = (BA)B =
=
= B.
2 1 1 4 1 1 0
4 1 1 0
By Exercises 11.15 and 11.16, we conclude that B must be the pseudoinverse of A.
Exercise 11.18: Linearly independent columns and generalized inverse
If A Cm,n has independent columns then both A and A have rank n m. Then,
by Theorem 6.18, A A must have rank n as well. Since A A is an n n-matrix of
maximal rank, it is nonsingular and we can define B := (A A)1 A . We verify that
B satisfies the four axioms of Exercise 11.15.
(1) ABA = A(A A)1 A A = A
(2) BAB = (A A)1 A A(A A)1 A = (A A)1 A = B

(3) (BA) = (A A)1 A A = In = In = (A A)1 A A = BA




(4) (AB) = A(A A)1 A = A (A A)1 A
= A(A A)1 A = AB
It follows that B = A . The second claim follows similarly.
Alternatively, one can use the fact that the unique solution of the least squares
problem is A b and compare this with the solution of the normal equation.
Exercise 11.19: The generalized inverse of a vector
This is a special case of Exercise 11.18. In particular, if $u$ is a nonzero vector, then
$u^* u = \langle u, u \rangle = \|u\|_2^2$ is a nonzero number and $(u^* u)^{-1} u^*$ is defined. One can again
check the axioms of Exercise 11.15 to show that this vector must be the pseudoinverse
of $u$.

Exercise 11.20: The generalized inverse of an outer product


Let $A = uv^*$ be as in the Exercise. Since $u$ and $v$ are nonzero, the matrix
\[
B := \frac{v u^*}{\|u\|_2^2 \|v\|_2^2} = \frac{A^*}{\|u\|_2^2 \|v\|_2^2}
\]
is well defined. We verify the four axioms of Exercise 11.15 to show that $B$ must be
the pseudoinverse of $A$.
\begin{align*}
(1)\quad & ABA = \frac{uv^*\, vu^*\, uv^*}{\|u\|_2^2 \|v\|_2^2} = \frac{u \|v\|_2^2 \|u\|_2^2 v^*}{\|u\|_2^2 \|v\|_2^2} = uv^* = A; \\
(2)\quad & BAB = \frac{vu^*\, uv^*\, vu^*}{\|u\|_2^4 \|v\|_2^4} = \frac{v \|u\|_2^2 \|v\|_2^2 u^*}{\|u\|_2^4 \|v\|_2^4} = \frac{vu^*}{\|u\|_2^2 \|v\|_2^2} = B; \\
(3)\quad & (BA)^* = \left(\frac{vu^*\, uv^*}{\|u\|_2^2 \|v\|_2^2}\right)^* = \frac{vu^*\, uv^*}{\|u\|_2^2 \|v\|_2^2} = BA; \\
(4)\quad & (AB)^* = \left(\frac{uv^*\, vu^*}{\|u\|_2^2 \|v\|_2^2}\right)^* = \frac{uv^*\, vu^*}{\|u\|_2^2 \|v\|_2^2} = AB.
\end{align*}
This proves that $B$ is the pseudoinverse of $A$.
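A quick numerical confirmation (ours) for an arbitrary choice of u and v:

u = [1; 2; 0];  v = [3; -1];            % arbitrary nonzero vectors
A = u * v';                             % the rank-one matrix u*v^*
B = v * u' / (norm(u)^2 * norm(v)^2);
norm(B - pinv(A))                       % essentially zero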

Exercise 11.21: The generalized inverse of a diagonal matrix


Let $A := \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ and $B := \mathrm{diag}(\lambda_1^\dagger, \dots, \lambda_n^\dagger)$ be as in the exercise. Note that,
by definition, $\lambda_j^\dagger$ indeed represents the pseudoinverse of the number $\lambda_j$ for any $j$. It
therefore satisfies the axioms of Exercise 11.15, something we shall use below. We now
verify the axioms for $B$ to show that $B$ must be the pseudoinverse of $A$.
\begin{align*}
(1)\quad & ABA = \mathrm{diag}(\lambda_1 \lambda_1^\dagger \lambda_1, \dots, \lambda_n \lambda_n^\dagger \lambda_n) = \mathrm{diag}(\lambda_1, \dots, \lambda_n) = A; \\
(2)\quad & BAB = \mathrm{diag}(\lambda_1^\dagger \lambda_1 \lambda_1^\dagger, \dots, \lambda_n^\dagger \lambda_n \lambda_n^\dagger) = \mathrm{diag}(\lambda_1^\dagger, \dots, \lambda_n^\dagger) = B; \\
(3)\quad & (BA)^* = \big(\mathrm{diag}(\lambda_1^\dagger \lambda_1, \dots, \lambda_n^\dagger \lambda_n)\big)^* = \mathrm{diag}(\lambda_1^\dagger \lambda_1, \dots, \lambda_n^\dagger \lambda_n) = BA; \\
(4)\quad & (AB)^* = \big(\mathrm{diag}(\lambda_1 \lambda_1^\dagger, \dots, \lambda_n \lambda_n^\dagger)\big)^* = \mathrm{diag}(\lambda_1 \lambda_1^\dagger, \dots, \lambda_n \lambda_n^\dagger) = AB.
\end{align*}

This proves that B is the pseudoinverse of A.


Exercise 11.22: Properties of the generalized inverse
Let $A = U\Sigma V^*$ be a singular value decomposition of $A$ and $A = U_1 \Sigma_1 V_1^*$ the corresponding singular value factorization. By definition of the pseudoinverse, $A^\dagger := V_1 \Sigma_1^{-1} U_1^*$.

(a) One has $(A^\dagger)^* = (V_1 \Sigma_1^{-1} U_1^*)^* = U_1 \Sigma_1^{-1} V_1^*$. On the other hand, the matrix $A^*$
has singular value factorization $A^* = V_1 \Sigma_1 U_1^*$, so that its pseudoinverse is $(A^*)^\dagger :=
U_1 \Sigma_1^{-1} V_1^*$ as well. We conclude that $(A^\dagger)^* = (A^*)^\dagger$.

(b) Since $A^\dagger := V_1 \Sigma_1^{-1} U_1^*$ is a singular value factorization, it has pseudoinverse
$(A^\dagger)^\dagger = (U_1^*)^* (\Sigma_1^{-1})^{-1} V_1^* = U_1 \Sigma_1 V_1^* = A$.

(c) Let $\alpha \ne 0$. Since the matrix $\alpha A$ has singular value factorization $U_1 (\alpha \Sigma_1) V_1^*$,
it has pseudoinverse
\[
(\alpha A)^\dagger = V_1 (\alpha \Sigma_1)^{-1} U_1^* = \alpha^{-1} V_1 \Sigma_1^{-1} U_1^* = \alpha^{-1} A^\dagger.
\]
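The three properties can be illustrated numerically as follows (a sketch, using a random rank-deficient matrix and the scalar value 2.5):

A = randn(4, 3) * randn(3, 5);              % a random 4-by-5 matrix of rank at most 3
norm(pinv(A') - pinv(A)')                   % (a): pseudoinverse of A^* equals (A^dagger)^*
norm(pinv(pinv(A)) - A)                     % (b): (A^dagger)^dagger = A
alpha = 2.5;
norm(pinv(alpha*A) - pinv(A)/alpha)         % (c): (alpha*A)^dagger = alpha^{-1} A^dagger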


Exercise 11.23: The generalized inverse of a product


(a) From the condition that $A$ has linearly independent columns we can deduce
that $n \le m$. Similarly it follows that $n \le k$, hence $n \le \min\{m, k\}$ and both matrices
have maximal rank. As a consequence,
\[
A = U_A \Sigma_A V_A^* = \begin{bmatrix} U_{A,1} & U_{A,2} \end{bmatrix} \begin{bmatrix} \Sigma_{A,1} \\ 0 \end{bmatrix} V_A^*,
\qquad
B = U_B \begin{bmatrix} \Sigma_{B,1} & 0 \end{bmatrix} \begin{bmatrix} V_{B,1} & V_{B,2} \end{bmatrix}^*.
\]
This gives
\[
A^\dagger A = V_A \begin{bmatrix} \Sigma_{A,1}^{-1} & 0 \end{bmatrix} \begin{bmatrix} U_{A,1} & U_{A,2} \end{bmatrix}^* \begin{bmatrix} U_{A,1} & U_{A,2} \end{bmatrix} \begin{bmatrix} \Sigma_{A,1} \\ 0 \end{bmatrix} V_A^*
= V_A \Sigma_{A,1}^{-1} \Sigma_{A,1} V_A^* = I
\]
and
\[
B B^\dagger = U_B \begin{bmatrix} \Sigma_{B,1} & 0 \end{bmatrix} \begin{bmatrix} V_{B,1} & V_{B,2} \end{bmatrix}^* \begin{bmatrix} V_{B,1} & V_{B,2} \end{bmatrix} \begin{bmatrix} \Sigma_{B,1}^{-1} \\ 0 \end{bmatrix} U_B^*
= U_B \Sigma_{B,1} \Sigma_{B,1}^{-1} U_B^* = I.
\]
Moreover we get
\begin{align*}
(AA^\dagger)^* &= \left( \begin{bmatrix} U_{A,1} & U_{A,2} \end{bmatrix} \begin{bmatrix} \Sigma_{A,1} \\ 0 \end{bmatrix} V_A^*\, V_A \begin{bmatrix} \Sigma_{A,1}^{-1} & 0 \end{bmatrix} \begin{bmatrix} U_{A,1} & U_{A,2} \end{bmatrix}^* \right)^* \\
&= \left( \begin{bmatrix} U_{A,1} & U_{A,2} \end{bmatrix} \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} U_{A,1} & U_{A,2} \end{bmatrix}^* \right)^*
 = \begin{bmatrix} U_{A,1} & U_{A,2} \end{bmatrix} \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} U_{A,1} & U_{A,2} \end{bmatrix}^* \\
&= \begin{bmatrix} U_{A,1} & U_{A,2} \end{bmatrix} \begin{bmatrix} \Sigma_{A,1} \\ 0 \end{bmatrix} V_A^*\, V_A \begin{bmatrix} \Sigma_{A,1}^{-1} & 0 \end{bmatrix} \begin{bmatrix} U_{A,1} & U_{A,2} \end{bmatrix}^* = AA^\dagger,
\end{align*}
where the middle expression is Hermitian (it equals $U_{A,1} U_{A,1}^*$), and similarly
\begin{align*}
(B^\dagger B)^* &= \left( \begin{bmatrix} V_{B,1} & V_{B,2} \end{bmatrix} \begin{bmatrix} \Sigma_{B,1}^{-1} \\ 0 \end{bmatrix} U_B^*\, U_B \begin{bmatrix} \Sigma_{B,1} & 0 \end{bmatrix} \begin{bmatrix} V_{B,1} & V_{B,2} \end{bmatrix}^* \right)^* \\
&= \begin{bmatrix} V_{B,1} & V_{B,2} \end{bmatrix} \begin{bmatrix} \Sigma_{B,1}^{-1} \\ 0 \end{bmatrix} U_B^*\, U_B \begin{bmatrix} \Sigma_{B,1} & 0 \end{bmatrix} \begin{bmatrix} V_{B,1} & V_{B,2} \end{bmatrix}^* = B^\dagger B,
\end{align*}
since this expression equals the Hermitian matrix $V_{B,1} V_{B,1}^*$.

We now let $E := AB$ and $F := B^\dagger A^\dagger$. Hence we want to show that $E^\dagger = F$. We
do that by showing that $F$ satisfies the properties given in Exercise 11.15:
\begin{align*}
EFE &= A B B^\dagger A^\dagger A B = A (B B^\dagger)(A^\dagger A) B = AB = E, \\
FEF &= B^\dagger A^\dagger A B B^\dagger A^\dagger = B^\dagger (A^\dagger A)(B B^\dagger) A^\dagger = B^\dagger A^\dagger = F, \\
(FE)^* &= (B^\dagger A^\dagger A B)^* = (B^\dagger B)^* = B^\dagger B = B^\dagger A^\dagger A B = FE, \\
(EF)^* &= (A B B^\dagger A^\dagger)^* = (A A^\dagger)^* = A A^\dagger = A B B^\dagger A^\dagger = EF.
\end{align*}
(b) We have
\[
A = \begin{bmatrix} a & b \end{bmatrix}
\qquad \text{and} \qquad
B = \begin{bmatrix} c \\ d \end{bmatrix}.
\]
This gives
\[
A^T A = \begin{bmatrix} a^2 & ab \\ ab & b^2 \end{bmatrix}
\qquad \text{and} \qquad
B^T B = c^2 + d^2.
\]
Hence we want to choose $a$ and $b$ such that $A^T A$ is diagonal, and $c$ and $d$ such that $B^T B$
is the square of a nice number. Thus we set $b = 0$, $c = 3$ and $d = 4$, yielding
\[
A = \begin{bmatrix} a & 0 \end{bmatrix}, \qquad
A^T A = \begin{bmatrix} a^2 & 0 \\ 0 & 0 \end{bmatrix}, \qquad
B = \begin{bmatrix} 3 \\ 4 \end{bmatrix}, \qquad
B^T B = 25 = 5^2.
\]
We hence can derive the following singular value decompositions and pseudoinverses:
\[
A = 1 \cdot \begin{bmatrix} a & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad
A^\dagger = \frac{1}{a} \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad
B = \frac{1}{5}\begin{bmatrix} 3 & -4 \\ 4 & 3 \end{bmatrix} \begin{bmatrix} 5 \\ 0 \end{bmatrix} \cdot 1, \qquad
B^\dagger = \frac{1}{25} \begin{bmatrix} 3 & 4 \end{bmatrix}.
\]
We thus get
\[
(AB)^\dagger = (3a)^\dagger = \frac{1}{3a}
\qquad \text{and} \qquad
B^\dagger A^\dagger = \frac{3}{25a},
\]
and have
\[
(AB)^\dagger = \frac{1}{3a} \ne \frac{9}{25} \cdot \frac{1}{3a} = \frac{3}{25a} = B^\dagger A^\dagger
\]
for all nonzero $a \in \mathbb{R}$.
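Numerically, for instance with a = 2 (our own check):

a = 2;
A = [a 0];  B = [3; 4];
pinv(A*B)             % 1/(3a) = 0.1667
pinv(B)*pinv(A)       % 3/(25a) = 0.0600, a different value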

Exercise 11.24: The generalized inverse of the conjugate transpose


Let $A$ have singular value factorization $A = U_1 \Sigma_1 V_1^*$, so that $A^* = V_1 \Sigma_1 U_1^*$ and
$A^\dagger = V_1 \Sigma_1^{-1} U_1^*$. Then $A^\dagger = A^*$ if and only if $\Sigma_1^{-1} = \Sigma_1$, which happens precisely
when all nonzero singular values of $A$ are one.
Exercise 11.25: Linearly independent columns
By Exercise 11.18, if $A$ has rank $n$, then $A^\dagger = (A^* A)^{-1} A^*$. Then $A(A^* A)^{-1} A^* b =
A A^\dagger b$, which is the orthogonal projection of $b$ into $\mathrm{span}(A)$ by Theorem 11.10.
Exercise 11.26: Analysis of the general linear system
In this exercise, we can write
\[
\Sigma = \begin{bmatrix} \Sigma_1 & 0 \\ 0 & 0 \end{bmatrix}, \qquad
\Sigma_1 = \mathrm{diag}(\sigma_1, \dots, \sigma_r), \qquad
\sigma_1 \ge \cdots \ge \sigma_r > 0.
\]
(a) As $U$ is unitary, we have $U^* U = I$. We find the following sequence of equivalences:
\[
Ax = b \iff U \Sigma V^* x = b \iff U^* U \Sigma (V^* x) = U^* b \iff \Sigma y = c,
\]
which is what we needed to show.

(b) By (a), the linear system $Ax = b$ has a solution if and only if the system
\[
\Sigma y = \begin{bmatrix} \Sigma_1 & 0 \\ 0 & 0 \end{bmatrix} y =
\begin{bmatrix} \sigma_1 y_1 \\ \vdots \\ \sigma_r y_r \\ 0 \\ \vdots \\ 0 \end{bmatrix} =
\begin{bmatrix} c_1 \\ \vdots \\ c_r \\ c_{r+1} \\ \vdots \\ c_n \end{bmatrix} = c
\]
has a solution $y$. Since $\sigma_1, \dots, \sigma_r \ne 0$, this system has a solution if and only if
$c_{r+1} = \cdots = c_n = 0$. We conclude that $Ax = b$ has a solution if and only if $c_{r+1} =
\cdots = c_n = 0$.

(c) By (a), the linear system $Ax = b$ has a solution if and only if the system
$\Sigma y = c$ has a solution. Hence we have the following three cases.

$r = n$: Here $y_i = c_i/\sigma_i$ for $i = 1, \dots, n$ provides the only solution to the system
$\Sigma y = c$, and therefore $x = Vy$ is the only solution to $Ax = b$. It follows that
the system has exactly one solution.

$r < n$, $c_i = 0$ for $i = r + 1, \dots, n$: Here each solution $y$ must satisfy $y_i = c_i/\sigma_i$ for $i = 1, \dots, r$. The remaining
$y_{r+1}, \dots, y_n$, however, can be chosen arbitrarily. Hence we have infinitely many
solutions to $\Sigma y = c$ as well as to $Ax = b$.

$r < n$, $c_i \ne 0$ for some $i$ with $r + 1 \le i \le n$: In this case it is impossible to find a $y$ that satisfies $\Sigma y = c$, and therefore
the system $Ax = b$ has no solution at all.

Exercise 11.27: Fredholms Alternative


Suppose that the system $Ax = b$ has a solution, i.e., $b \in \mathrm{span}(A)$. Suppose in
addition that $A^* y = 0$ has a solution, i.e., $y \in \ker(A^*)$. Since $(\mathrm{span}(A))^\perp = \ker(A^*)$,
one has $\langle y, b \rangle = y^* b = 0$. Thus if the system $Ax = b$ has a solution, then we cannot
find a solution $y$ of $A^* y = 0$ with $y^* b \ne 0$. Conversely, if $y \in \ker(A^*)$ and $y^* b \ne 0$,
then $b \notin (\ker(A^*))^\perp = \mathrm{span}(A)$, implying that the system $Ax = b$ does not have a
solution.

Exercise 11.33: Condition number


Let
\[
A = \begin{bmatrix} 1 & 2 \\ 1 & 1 \\ 1 & 1 \end{bmatrix}, \qquad
b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
\]
be as in the Exercise.
(a) By Exercise 11.18, the pseudoinverse of $A$ is
\[
A^\dagger = (A^T A)^{-1} A^T = \begin{bmatrix} -1 & 1 & 1 \\ 1 & -\tfrac12 & -\tfrac12 \end{bmatrix}.
\]
Theorem 11.10 tells us that the orthogonal projection of $b$ into $\mathrm{span}(A)$ is
\[
b_1 := A A^\dagger b =
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac12 & \tfrac12 \\ 0 & \tfrac12 & \tfrac12 \end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
= \frac{1}{2}\begin{bmatrix} 2 b_1 \\ b_2 + b_3 \\ b_2 + b_3 \end{bmatrix},
\]
so that the orthogonal projection of $b$ into $\ker(A^T)$ is
\[
b_2 := (I - A A^\dagger) b =
\begin{bmatrix} 0 & 0 & 0 \\ 0 & \tfrac12 & -\tfrac12 \\ 0 & -\tfrac12 & \tfrac12 \end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
= \frac{1}{2}\begin{bmatrix} 0 \\ b_2 - b_3 \\ b_3 - b_2 \end{bmatrix},
\]
where we used that $b = b_1 + b_2$.
(b) By Theorem 7.12, the 2-norms $\|A\|_2$ and $\|A^\dagger\|_2$ can be found by computing the
largest singular values of the matrices $A$ and $A^\dagger$. The largest singular value $\sigma_1$ of $A$ is
the square root of the largest eigenvalue $\lambda_1$ of $A^T A$, which satisfies
\[
0 = \det(A^T A - \lambda_1 I) = \det\begin{bmatrix} 3 - \lambda_1 & 4 \\ 4 & 6 - \lambda_1 \end{bmatrix}
  = \lambda_1^2 - 9\lambda_1 + 2.
\]
It follows that $\sigma_1 = \tfrac12\sqrt{2}\,\sqrt{9 + \sqrt{73}}$. Similarly, the largest singular value $\sigma_2$ of $A^\dagger$ is
the square root of the largest eigenvalue $\lambda_2$ of $(A^\dagger)^T A^\dagger$, which satisfies
\[
0 = \det\big((A^\dagger)^T A^\dagger - \lambda_2 I\big)
  = \det\left(\frac{1}{4}\begin{bmatrix} 8 & -6 & -6 \\ -6 & 5 & 5 \\ -6 & 5 & 5 \end{bmatrix} - \lambda_2 I\right)
  = -\frac{1}{2}\lambda_2\big(2\lambda_2^2 - 9\lambda_2 + 1\big).
\]
Alternatively, we could have used that the largest singular value of $A^\dagger$ is the inverse of
the smallest singular value of $A$ (this follows from the singular value factorization). It
follows that $\sigma_2 = \tfrac12\sqrt{9 + \sqrt{73}} = \sqrt{2}/\sqrt{9 - \sqrt{73}}$. We conclude
\[
K(A) = \|A\|_2 \|A^\dagger\|_2 = \sqrt{\frac{9 + \sqrt{73}}{9 - \sqrt{73}}} = \frac{9 + \sqrt{73}}{2\sqrt{2}} \approx 6.203.
\]
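This value can be checked numerically (our own sketch):

A = [1 2; 1 1; 1 1];
s = svd(A);
K = s(1)/s(end)                  % = norm(A,2)*norm(pinv(A),2), approximately 6.203
(9 + sqrt(73)) / (2*sqrt(2))     % the exact expression derived above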
Exercise 11.34: Equality in perturbation bound (TODO)

Exercise 11.36: Problem using normal equations


(a) Let $A$, $b$, and $\epsilon$ be as in the exercise. The normal equations $A^T A x = A^T b$ are
then
\[
\begin{bmatrix} 3 & 3 + \epsilon \\ 3 + \epsilon & (\epsilon + 1)^2 + 2 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} 7 \\ 7 + 2\epsilon \end{bmatrix}.
\]
If $\epsilon \ne 0$, inverting the matrix $A^T A$ yields the unique solution
\[
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= \frac{1}{2\epsilon^2}
\begin{bmatrix} (\epsilon + 1)^2 + 2 & -(3 + \epsilon) \\ -(3 + \epsilon) & 3 \end{bmatrix}
\begin{bmatrix} 7 \\ 7 + 2\epsilon \end{bmatrix}
= \begin{bmatrix} \frac{5}{2} + \frac{1}{2\epsilon} \\[4pt] -\frac{1}{2\epsilon} \end{bmatrix}.
\]
If $\epsilon = 0$, on the other hand, then any vector $x = [x_1, x_2]^T$ with $x_1 + x_2 = 7/3$ is a
solution.

(b) For $\epsilon = 0$, we get the same solution as in (a). For $\epsilon \ne 0$, however, the solution
to the system
\[
\begin{bmatrix} 3 & 3 + \epsilon \\ 3 + \epsilon & 3 + 2\epsilon \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} 7 \\ 7 + 2\epsilon \end{bmatrix}
\]
is
\[
\begin{bmatrix} x_1' \\ x_2' \end{bmatrix}
= -\frac{1}{\epsilon^2}
\begin{bmatrix} 3 + 2\epsilon & -(3 + \epsilon) \\ -(3 + \epsilon) & 3 \end{bmatrix}
\begin{bmatrix} 7 \\ 7 + 2\epsilon \end{bmatrix}
= \begin{bmatrix} 2 - \frac{1}{\epsilon} \\[4pt] \frac{1}{\epsilon} \end{bmatrix}.
\]
We can compare this to the solution of (a) by comparing the residuals,
\[
\left\| A \begin{bmatrix} \frac{5}{2} + \frac{1}{2\epsilon} \\[2pt] -\frac{1}{2\epsilon} \end{bmatrix} - b \right\|_2
= \left\| \begin{bmatrix} \tfrac12 \\ -\tfrac12 \\ 0 \end{bmatrix} \right\|_2
= \frac{1}{2}\sqrt{2}
\;<\; \sqrt{2}
= \left\| \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix} \right\|_2
= \left\| A \begin{bmatrix} 2 - \frac{1}{\epsilon} \\[2pt] \frac{1}{\epsilon} \end{bmatrix} - b \right\|_2,
\]
which shows that the solution from (a) is more accurate.
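As an informal numerical illustration, one may compare the two residuals for a concrete small ε; the data A and b below are our own choice, picked so that they reproduce the normal equations stated in (a).

e = 1e-8;
A = [1 1; 1 1; 1 1+e];  b = [2; 3; 2];   % consistent with A'*A and A'*b above
xa = [5/2 + 1/(2*e); -1/(2*e)];          % solution from (a)
xb = [2 - 1/e; 1/e];                     % solution from (b)
norm(A*xa - b)                           % 1/sqrt(2), about 0.7071
norm(A*xb - b)                           % sqrt(2),   about 1.4142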


CHAPTER 12

Numerical Eigenvalue Problems


Exercise 12.5: Continuity of eigenvalues
In this exercise $\|\cdot\|$ denotes the Euclidean norm. For a given matrix $A = (a_{ij})_{ij} \in \mathbb{R}^{n \times n}$,
write
\[
A(t) := D + t(A - D), \qquad D := \mathrm{diag}(a_{11}, \dots, a_{nn}), \qquad t \in \mathbb{R},
\]
for the affine combinations of $A$ and its diagonal part $D$. Let $t_1, t_2 \in [0, 1]$, with
$t_1 < t_2$, so that $A(t_1), A(t_2)$ are convex combinations of $A$ and $D$. For any eigenvalue $\mu$
of $A(t_2)$, we are asked to show that $A(t_1)$ has an eigenvalue $\lambda$ such that
\[
|\mu - \lambda| \le C (t_2 - t_1)^{1/n}, \qquad C := 2\big(\|D\| + \|A - D\|\big). \tag{$\star$}
\]
In particular, every eigenvalue of $A(t)$ is a continuous function of $t$.

Applying Theorem 12.8 with $A(t_1)$ and $E = A(t_2) - A(t_1)$, one finds that $A(t_1)$
has an eigenvalue $\lambda$ such that
\[
|\mu - \lambda| \le \big(\|A(t_1)\| + \|A(t_2)\|\big)^{1 - 1/n} \|A(t_2) - A(t_1)\|^{1/n}.
\]
Applying the triangle inequality to the definition of $A(t_1)$ and $A(t_2)$, and using that
the function $x \mapsto x^{1 - 1/n}$ is monotone increasing,
\[
|\mu - \lambda| \le \big(2\|D\| + (t_1 + t_2)\|A - D\|\big)^{1 - 1/n} \|A - D\|^{1/n} (t_2 - t_1)^{1/n}.
\]
Finally, using that $t_1 + t_2 \le 2$, that the function $x \mapsto x^{1/n}$ is monotone increasing,
and that $\|A - D\| \le 2\|D\| + 2\|A - D\|$, one obtains $(\star)$.
Exercise 12.6: Nonsingularity using Gerschgorin
We compute the Gerschgorin disks
\begin{align*}
R_1 = R_4 = C_1 = C_4 &= \{z \in \mathbb{C} : |z - 4| \le 1\}, \\
R_2 = R_3 = C_2 = C_3 &= \{z \in \mathbb{C} : |z - 4| \le 2\}.
\end{align*}
Then, by Gerschgorin's Circle Theorem, each eigenvalue of $A$ lies in
\[
(R_1 \cup \cdots \cup R_4) \cap (C_1 \cup \cdots \cup C_4) = \{z \in \mathbb{C} : |z - 4| \le 2\}.
\]
In particular $A$ has only nonzero eigenvalues, implying that $A$ must be nonsingular.
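The exercise's matrix is not restated in this solution, but the conclusion is easy to check numerically for any matrix with these Gerschgorin disks; a sketch using the tridiagonal matrix of Exercise 12.22, whose disks coincide with the ones above:

A = [4 1 0 0; 1 4 1 0; 0 1 4 1; 0 0 1 4];
centers = diag(A);
radii = sum(abs(A), 2) - abs(centers);   % row disk radii: [1 2 2 1]'
eig(A)'                                  % all eigenvalues lie in |z - 4| <= 2, hence are nonzero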
Exercise 12.7: Gerschgorin, strictly diagonally dominant matrix
Suppose $A$ is a strictly diagonally dominant matrix. For such a matrix, one finds
Gerschgorin disks
\[
R_i = \Big\{ z \in \mathbb{C} : |z - a_{ii}| \le \sum_{j \ne i} |a_{ij}| \Big\}.
\]
Since $|a_{ii}| > \sum_{j \ne i} |a_{ij}|$ for all $i$, the origin is not an element of any of the $R_i$, and
therefore neither of the union $\bigcup_i R_i$, nor of the intersection $\big(\bigcup_i R_i\big) \cap \big(\bigcup_i C_i\big)$ (which is
smaller). Then, by Gerschgorin's Circle Theorem, $A$ only has nonzero eigenvalues,
implying that $\det(A) = \det(A - 0 \cdot I) \ne 0$ and $A$ is nonsingular.
Exercise 12.12: ∞-norm of a diagonal matrix
Let $A = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ be a diagonal matrix. The spectral radius $\rho(A)$ is the absolute
value of the biggest eigenvalue, say $\lambda_i$, of $A$. One has
\[
\|A\|_\infty = \max_{\|x\|_\infty = 1} \|Ax\|_\infty = \max_{\|x\|_\infty = 1} \max\{|\lambda_1 x_1|, \dots, |\lambda_n x_n|\} \le \rho(A),
\]
as $|\lambda_1|, \dots, |\lambda_n| \le |\lambda_i| = \rho(A)$ and since the components of any vector $x$ satisfy $|x_1|, \dots, |x_n| \le
\|x\|_\infty$. Moreover, this bound is attained for the standard basis vector $x = e_i$, since
$\|A e_i\|_\infty = |\lambda_i| = \rho(A)$.
Exercise 12.15: Number of arithmetic operations
An arithmetic operation is a floating point operation, so we need not bother with
any integer operations, like the computation of $k + 1$ in the indices. As we are only
interested in the overall complexity, we count only terms that can contribute to this.

For the first line involving C, the multiplication v'*C involves $(n - k)^2$ floating
point multiplications and about $(n - k)^2$ floating point sums. Next, computing the outer
product v*(v'*C) involves $(n - k)^2$ floating point multiplications, and subtracting
C - v*(v'*C) needs $(n - k)^2$ subtractions. This line therefore involves $4(n - k)^2$
arithmetic operations. Similarly we find $4n(n - k)$ arithmetic operations for the line
after that.

These $4(n - k)^2 + 4n(n - k)$ arithmetic operations need to be carried out for $k =
1, \dots, n - 2$, meaning that the algorithm requires on the order of
\[
N := \sum_{k=1}^{n-2} \big( 4(n - k)^2 + 4n(n - k) \big)
\]
arithmetic operations. This sum can be computed by either using the formulae for
$\sum_{k=1}^{n-2} k$ and $\sum_{k=1}^{n-2} k^2$, or using that the highest order term can be found by evaluating
an associated integral. One finds that the algorithm requires on the order of
\[
N \approx \int_0^n \big( 4(n - k)^2 + 4n(n - k) \big)\, \mathrm{d}k = \frac{10}{3} n^3
\]
arithmetic operations.
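A quick sanity check of the leading constant (our own sketch):

n = 200;  k = 1:n-2;
N = sum(4*(n-k).^2 + 4*n*(n-k));   % the exact operation count
N / n^3                            % close to 10/3 for large n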
Exercise 12.17: Number of arithmetic operations
The multiplication v'*C involves $(n - k)^2$ floating point multiplications and about
$(n - k)^2$ floating point sums. Next, computing the outer product v*(v'*C) involves
$(n - k)^2$ floating point multiplications, and subtracting C - v*(v'*C) needs $(n - k)^2$
subtractions. In total we find $4(n - k)^2$ arithmetic operations, which have to be carried
out for $k = 1, \dots, n - 2$, meaning that the algorithm requires on the order of
\[
N := \sum_{k=1}^{n-2} 4(n - k)^2
\]
arithmetic operations. This sum can be computed by either using the formulae for
$\sum_{k=1}^{n-2} k$ and $\sum_{k=1}^{n-2} k^2$, or using that the highest order term can be found by evaluating
an associated integral. One finds that the algorithm requires on the order of
\[
N \approx \int_0^n 4(n - k)^2\, \mathrm{d}k = \frac{4}{3} n^3
\]
arithmetic operations.

Exercise 12.18: Tridiagonalize a symmetric matrix


From $w = Ev$, $\beta = \tfrac12 v^T w$ and $z = w - \beta v$ we get $z = w - \beta v = Ev - \tfrac12 v v^T E v$ and
$z^T = v^T E - \tfrac12 v^T E v v^T$. Using this yields
\begin{align*}
G = (I - v v^T) E (I - v v^T) &= E - v v^T E - E v v^T + v v^T E v v^T \\
&= E - v\big(v^T E - \tfrac12 v^T E v v^T\big) - \big(Ev - \tfrac12 v v^T E v\big) v^T \\
&= E - v z^T - z v^T.
\end{align*}
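The identity is easy to confirm numerically; a sketch for a random symmetric E and an arbitrary vector v:

n = 5;
E = randn(n);  E = E + E';                  % a symmetric matrix
v = randn(n, 1);
w = E*v;  beta = v'*w/2;  z = w - beta*v;
G = (eye(n) - v*v') * E * (eye(n) - v*v');
norm(G - (E - v*z' - z*v'))                 % essentially zero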

Exercise 12.22: Counting eigenvalues


Let
\[
A = \begin{bmatrix}
4 & 1 & 0 & 0 \\
1 & 4 & 1 & 0 \\
0 & 1 & 4 & 1 \\
0 & 0 & 1 & 4
\end{bmatrix}, \qquad x = 4.5.
\]
Applying the recursive procedure described in Corollary 12.21, we find the diagonal
elements $d_1(x), d_2(x), d_3(x), d_4(x)$ of the matrix $D$ in the factorization $A - xI = LDL^T$,
\begin{align*}
d_1(x) &= 4 - 9/2 = -1/2, \\
d_2(x) &= 4 - 9/2 - 1^2/(-1/2) = +3/2, \\
d_3(x) &= 4 - 9/2 - 1^2/(+3/2) = -7/6, \\
d_4(x) &= 4 - 9/2 - 1^2/(-7/6) = +5/14.
\end{align*}
As precisely two of these are negative, Corollary 12.21 implies that there are precisely
two eigenvalues of $A$ strictly smaller than $x = 4.5$. As
\[
\det(A - 4.5\, I) = \det(LDL^T) = d_1(x)\, d_2(x)\, d_3(x)\, d_4(x) \ne 0,
\]
the matrix A does not have an eigenvalue equal to 4.5. We conclude that the remaining
two eigenvalues must be bigger than 4.5.
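A small numerical cross-check (ours), using Matlab's eig and the function count from Exercise 12.25 below:

A = [4 1 0 0; 1 4 1 0; 0 1 4 1; 0 0 1 4];
sum(eig(A) < 4.5)               % returns 2, matching the sign count above
count([1 1 1], [4 4 4 4], 4.5)  % also returns 2 (requires count.m from Exercise 12.25)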

Exercise 12.23: Overflow in LDLT factorization


Since $A_n$ is tridiagonal and strictly diagonally dominant, it has a unique LU factorization by Exercise 1.8. From Equations (1.4), one can determine the corresponding $LDL^T$
factorization. For $n = 1, 2, \dots$, let $d_{n,k}$, with $k = 1, \dots, n$, be the diagonal elements of
the diagonal matrix $D_n$ in a symmetric factorization of $A_n$.

(a) We proceed by induction. Let $n \ge 1$ be any positive integer. For the first
diagonal element, corresponding to $k = 1$, Equations (1.4) immediately yield $5 + \sqrt{24} <
d_{n,1} = 10 \le 10$. Next, assume that $5 + \sqrt{24} < d_{n,k} \le 10$ for some $1 \le k < n$. We
show that this implies that $5 + \sqrt{24} < d_{n,k+1} \le 10$. First observe that $(5 + \sqrt{24})^2 =
25 + 10\sqrt{24} + 24 = 49 + 10\sqrt{24}$. From Equations (1.4) we know that $d_{n,k+1} = 10 - 1/d_{n,k}$,
which yields $d_{n,k+1} < 10$ since $d_{n,k} > 0$. Moreover, $5 + \sqrt{24} < d_{n,k}$ implies
\[
d_{n,k+1} = 10 - \frac{1}{d_{n,k}} > 10 - \frac{1}{5 + \sqrt{24}}
          = \frac{50 + 10\sqrt{24} - 1}{5 + \sqrt{24}}
          = \frac{49 + 10\sqrt{24}}{5 + \sqrt{24}}
          = 5 + \sqrt{24}.
\]
Hence $5 + \sqrt{24} < d_{n,k+1} \le 10$, and we conclude that $5 + \sqrt{24} < d_{n,k} \le 10$ for any $n \ge 1$
and $1 \le k \le n$.

(b) We have $A = LDL^T$ with $L$ triangular and with ones on the diagonal. As a
consequence,
\[
\det(A) = \det(L)\det(D)\det(L) = \det(D) = \prod_{i=1}^n d_i > \big(5 + \sqrt{24}\big)^n.
\]
In Matlab an overflow is indicated by Matlab returning Inf as the result. On my computer
this happens at n = 310.
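A short experiment reproducing this (a sketch; it assumes the recursion d_1 = 10, d_{k+1} = 10 - 1/d_k as above and multiplies up the determinant until it overflows):

d = 10;  p = 10;  n = 1;
while ~isinf(p)
    n = n + 1;
    d = 10 - 1/d;       % next diagonal element
    p = p * d;          % running determinant prod(d_k)
end
n                       % first n with overflow; about 310 in IEEE double precision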
Exercise 12.24: Simultaneous diagonalization
Let $A$, $B$, $U$, $D$, $\tilde{A}$, and $D^{1/2}$ be as in the Exercise.

(a) Since $D^{-1/2}$, like any diagonal matrix, and $A$ are symmetric, one has
\[
\tilde{A}^T = \big(D^{-1/2}\big)^T U A^T U^T \big(D^{-1/2}\big)^T = D^{-1/2} U A U^T D^{-1/2} = \tilde{A}.
\]

(b) Since $\tilde{A}$ is symmetric, it admits an orthogonal diagonalization $\tilde{A} = \tilde{U}^T \tilde{D} \tilde{U}$. Let
$E := U^T D^{-1/2} \tilde{U}^T$. Then $E$, as the product of three nonsingular matrices, is nonsingular.
Its inverse is given explicitly by $F := \tilde{U} D^{1/2} U$, since
\[
FE = \tilde{U} D^{1/2} U U^T D^{-1/2} \tilde{U}^T = \tilde{U} D^{1/2} D^{-1/2} \tilde{U}^T = \tilde{U} \tilde{U}^T = I
\]
and similarly $EF = I$. Hence $E^{-1} = F$ and $E$ is nonsingular. Moreover, from $\tilde{A} = \tilde{U}^T \tilde{D} \tilde{U}$
it follows that $\tilde{U} \tilde{A} \tilde{U}^T = \tilde{D}$, which gives
\[
E^T A E = \tilde{U} D^{-1/2} U A U^T D^{-1/2} \tilde{U}^T = \tilde{U} \tilde{A} \tilde{U}^T = \tilde{D}.
\]
Similarly $B = U^T D U$ implies $U B U^T = D$, which yields
\[
E^T B E = \tilde{U} D^{-1/2} U B U^T D^{-1/2} \tilde{U}^T = \tilde{U} D^{-1/2} D D^{-1/2} \tilde{U}^T = \tilde{U} \tilde{U}^T = I.
\]
We conclude that for a symmetric matrix $A$ and symmetric positive definite matrix $B$,
the congruence transformation $X \mapsto E^T X E$ simultaneously diagonalizes the matrices
$A$ and $B$, and even maps $B$ to the identity matrix.
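A numerical sketch of the construction (our own; the notation follows the solution above):

n = 4;
A = randn(n);  A = A + A';               % a symmetric matrix
B = randn(n);  B = B'*B + n*eye(n);      % a symmetric positive definite matrix
[V, D] = eig(B);  U = V';                % B = U'*D*U with U orthogonal
Dmh = diag(1 ./ sqrt(diag(D)));          % D^(-1/2)
At = Dmh * U * A * U' * Dmh;             % the matrix A-tilde
At = (At + At')/2;                       % symmetrize against rounding errors
[Vt, Dt] = eig(At);  Ut = Vt';           % A-tilde = Ut'*Dt*Ut
E = U' * Dmh * Ut';
norm(E'*A*E - Dt)                        % essentially zero: E'*A*E is diagonal
norm(E'*B*E - eye(n))                    % essentially zero: E'*B*E is the identity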

Exercise 12.25: Program code for one eigenvalue


(a) Let A = tridiag(c, d, c) and x be as in the Exercise. The following Matlab program counts the number k of eigenvalues of A strictly less than x.
function k = count(c,d,x)
n = length(d);
k = 0; u = d(1)-x;
if u < 0
    k = k+1;
end
for i = 2:n
    umin = abs(c(i-1))*eps;
    if abs(u) < umin
        if u < 0
            u = -umin;
        else
            u = umin;
        end
    end
    u = d(i)-x-c(i-1)^2/u;
    if u < 0
        k = k+1;
    end
end

(b) Let A = tridiag(c, d, c) and m be as in the Exercise. The following Matlab program computes a small interval [a, b] around the mth eigenvalue $\lambda_m$ of A and returns
the point in the middle of this interval.
function lambda = findeigv(c,d,m)
n = length(d);
a = d(1)-abs(c(1)); b = d(1)+abs(c(1));
for i = 2:n-1
    a = min(a, d(i)-abs(c(i-1))-abs(c(i)));
    b = max(b, d(i)+abs(c(i-1))+abs(c(i)));
end
a = min(a, d(n)-abs(c(n-1)));
b = max(b, d(n)+abs(c(n-1)));
h = b-a;
while abs(b-a) > eps*h
    c0 = (a+b)/2;
    k = count(c,d,c0);
    if k < m
        a = c0;
    else
        b = c0;
    end
end
lambda = (a+b)/2;

(c) The following table shows a comparison between the values and errors obtained
by the different methods.
method        value                  error
exact         0.02413912051848666    0
findeigv      0.02413912051848621    4.44 · 10^{-16}
Matlab eig    0.02413912051848647    1.84 · 10^{-16}

Exercise 12.26: Determinant of upper Hessenberg matrix (TODO)


CHAPTER 13

The QR Algorithm
Exercise 13.4: Orthogonal vectors
In the Exercise it is implicitly assumed that $u^* u \ne 0$ and therefore $u \ne 0$. If $u$ and
$Au - \lambda u$ are orthogonal, then
\[
0 = \langle u, Au - \lambda u \rangle = u^*(Au - \lambda u) = u^* A u - \lambda\, u^* u.
\]
Dividing by $u^* u$ yields
\[
\lambda = \frac{u^* A u}{u^* u}.
\]
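A two-line numerical check (ours):

A = randn(5);  u = randn(5, 1);     % an arbitrary matrix and nonzero vector
lambda = (u'*A*u) / (u'*u);
abs(u' * (A*u - lambda*u))          % essentially zero: A*u - lambda*u is orthogonal to u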
Exercise 13.14: QR convergence detail (TODO)

