Abstract
This is a solution manual for Linear Algebra and Its Applications, 2nd edition, by Peter Lax [8]. This version omits the following problems: exercises 2 and 9 of Chapter 8; exercises 3 and 4 of Chapter 17; the exercises of Chapter 18; exercise 3 of Appendix 3; and the exercises of Appendices 4, 5, 8 and 11.
If you would like to correct any typos/errors, please send email to zypublic@hotmail.com.
Contents

1 Fundamentals
2 Duality
3 Linear Mappings
4 Matrices
5 Determinant and Trace
6 Spectral Theory
7 Euclidean Structure
8 Spectral Theory of Self-Adjoint Mappings
9 Calculus of Vector- and Matrix-Valued Functions
10 Matrix Inequalities
11 Kinematics and Dynamics
12 Convexity
16 Positive Matrices
18 How to Calculate the Eigenvalues of Self-Adjoint Matrices
A Appendix
A.1 Special Determinants
A.2 The Pfaffian
A.3 Symplectic Matrices
A.4 Tensor Product
A.5 Lattices
A.6 Fast Matrix Multiplication
A.7 Gershgorin's Theorem
A.8 The Multiplicity of Eigenvalues
A.9 The Fast Fourier Transform
A.10 The Spectral Radius
A.11 The Lorentz Group
A.12 Compactness of the Unit Ball
A.13 A Characterization of Commutators
A.14 Liapunov's Theorem
A.15 The Jordan Canonical Form
A.16 Numerical Range
1 Fundamentals
The book's own solution gives answers to Ex 1, 3, 7, 10, 13, 14, 16, 19, 20, 21.
T(p) = p(x),

where p on the left side of the equation is regarded as a polynomial over R while p(x) on the right side of the equation is regarded as a function defined on S = {s_1, ..., s_n}. To prove T is an isomorphism, it suffices to prove T is one-to-one. This is seen through the observation that

\[
\begin{pmatrix}
1 & s_1 & s_1^2 & \cdots & s_1^{n-1}\\
1 & s_2 & s_2^2 & \cdots & s_2^{n-1}\\
\vdots & \vdots & \vdots & & \vdots\\
1 & s_n & s_n^2 & \cdots & s_n^{n-1}
\end{pmatrix}
\begin{pmatrix} a_1\\ a_2\\ \vdots\\ a_n \end{pmatrix}
=
\begin{pmatrix} p(s_1)\\ p(s_2)\\ \vdots\\ p(s_n) \end{pmatrix},
\]

and the Vandermonde matrix on the left is invertible whenever s_1, ..., s_n are distinct.
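A quick numerical check of this observation (our sketch, not part of the original manual; the nodes and target values below are arbitrary choices):

import numpy as np

s = np.array([0.5, 1.0, 2.0, 3.5])          # hypothetical distinct nodes
V = np.vander(s, increasing=True)            # rows [1, s_i, s_i^2, s_i^3]
print(np.linalg.det(V))                      # nonzero since the nodes are distinct

values = np.array([1.0, -2.0, 0.0, 4.0])     # arbitrary target values p(s_i)
a = np.linalg.solve(V, values)               # coefficients of the interpolant
print(np.allclose(np.polyval(a[::-1], s), values))  # True: p(s_i) recovers values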
Proof. For any y, y′ ∈ Y, z, z′ ∈ Z and any scalar k, we have

(y + z) + (y′ + z′) = (z + y) + (y′ + z′) = z + (y + (y′ + z′)) = z + ((y + y′) + z′) = z + (z′ + (y + y′)) = (z + z′) + (y + y′) = (y + y′) + (z + z′) ∈ Y + Z,

and

k(y + z) = ky + kz ∈ Y + Z.

So Y + Z is a linear subspace of X if Y and Z are.
I 7. (page 4) Prove that if Y and Z are linear subspaces of X, so is Y ∩ Z.
I 10. (page 5) Show that if the vectors x_1, ..., x_j are linearly independent, then none of the x_i is the zero vector.

Proof. We prove by contradiction. Without loss of generality, assume x_1 = 0. Then 1·x_1 + 0·x_2 + ... + 0·x_j = 0. This shows x_1, ..., x_j are linearly dependent, a contradiction. So x_1 ≠ 0. We can similarly prove x_2 ≠ 0, ..., x_j ≠ 0.
I 11. (page 7) Prove that if X is finite dimensional and the direct sum of Y_1, ..., Y_m, then dim X = Σ_j dim Y_j.

Proof. Suppose Y_i has a basis y_1^i, ..., y_{n_i}^i. Then it suffices to prove y_1^1, ..., y_{n_1}^1, ..., y_1^m, ..., y_{n_m}^m form a basis of X. By definition of direct sum, these vectors span X, so we only need to show they are linearly independent. In fact, if not, then 0 has two distinct representations: 0 = 0 + ... + 0 and 0 = Σ_{i=1}^m (a_1^i y_1^i + ... + a_{n_i}^i y_{n_i}^i), where not all a_j^i are zero. This contradicts the definition of direct sum. So we must have linear independence, which implies y_1^1, ..., y_{n_1}^1, ..., y_1^m, ..., y_{n_m}^m form a basis of X. Consequently, dim X = Σ_{i=1}^m dim Y_i.
I 12. (page 7) Show that every finite-dimensional space X over K is isomorphic to K^n, n = dim X. Show that this isomorphism is not unique when n is > 1.

Proof. Fix a basis x_1, ..., x_n of X; any element x ∈ X can be uniquely represented as Σ_{i=1}^n a_i(x) x_i for some a_i(x) ∈ K, i = 1, ..., n. We define the isomorphism as x ↦ (a_1(x), ..., a_n(x)). Clearly this isomorphism depends on the basis, and by varying the choice of basis, we can have different isomorphisms.
I 13. (page 7) Prove (i)-(iii) above. Show furthermore that if x_1 ≡ x_2, then kx_1 ≡ kx_2 for every scalar k.

Proof. For any x_1, x_2 ∈ X, if x_1 ≡ x_2, i.e. x_1 − x_2 ∈ Y, then x_2 − x_1 = −(x_1 − x_2) ∈ Y, i.e. x_2 ≡ x_1. This is symmetry. For any x ∈ X, x − x = 0 ∈ Y. So x ≡ x. This is reflexivity. Finally, if x_1 ≡ x_2 and x_2 ≡ x_3, then x_1 − x_3 = (x_1 − x_2) + (x_2 − x_3) ∈ Y, i.e. x_1 ≡ x_3. This is transitivity.
I 14. (page 7) Show that two congruence classes are either identical or disjoint.

Proof. For any x_1, x_2 ∈ X, we can find y ∈ {x_1} ∩ {x_2} if and only if x_1 − y ∈ Y and x_2 − y ∈ Y. Then x_1 − x_2 = (x_1 − y) − (x_2 − y) ∈ Y, so the two congruence classes coincide whenever they share a point.
I 15. (page 8) Show that the above definition of addition and multiplication by scalars is independent of the choice of representatives in the congruence class.

Then it's easy to see dim Y = n − j and dim X/Y = dim X − dim Y = j.
I 17. (page 10) Prove Corollary 6.

Proof. By Theorem 6, dim X/Y = dim X − dim Y = 0, which implies X/Y = {{0}}. So X = Y.
I 18. (page 11) Show that

dim X_1 ⊕ X_2 = dim X_1 + dim X_2.

Proof. Define Y_1 = {(x, 0) : x ∈ X_1, 0 ∈ X_2} and Y_2 = {(0, x) : 0 ∈ X_1, x ∈ X_2}. Then Y_1 and Y_2 are linear subspaces of X_1 ⊕ X_2. It is easy to see Y_1 is isomorphic to X_1, Y_2 is isomorphic to X_2, and Y_1 ∩ Y_2 = {(0, 0)}. So by Theorem 7, dim X_1 ⊕ X_2 = dim Y_1 + dim Y_2 − dim(Y_1 ∩ Y_2) = dim X_1 + dim X_2 − 0 = dim X_1 + dim X_2.
I 19. (page 11) X is a linear space, Y is a subspace. Show that Y ⊕ X/Y is isomorphic to X.

Proof. By Exercise 18 and Theorem 6, dim(Y ⊕ X/Y) = dim Y + dim(X/Y) = dim Y + dim X − dim Y = dim X. Since linear spaces of the same finite dimension are isomorphic (by a one-to-one mapping between their bases), Y ⊕ X/Y is isomorphic to X.
I 20. (page 12) Which of the following sets of vectors x = (x_1, ..., x_n) in R^n are a subspace of R^n? Explain your answer.
(a) All x such that x_1 ≥ 0.
(b) All x such that x_1 + x_2 = 0.
(c) All x such that x_1 + x_2 + 1 = 0.
(d) All x such that x_1 = 0.
(e) All x such that x_1 is an integer.

Proof. (a) is not, since {x : x_1 ≥ 0} is not closed under scalar multiplication by −1. (b) is. (c) is not, since x_1 + x_2 + 1 = 0 and x_1′ + x_2′ + 1 = 0 imply (x_1 + x_1′) + (x_2 + x_2′) + 1 = −1 ≠ 0. (d) is. (e) is not, since x_1 being an integer does not guarantee rx_1 is an integer for every r ∈ R.
I 21. (page 12) Let U, V, and W be subspaces of some finite-dimensional vector space X. Is the statement

dim(U + V + W) = dim U + dim V + dim W − dim(U ∩ V) − dim(U ∩ W) − dim(V ∩ W) + dim(U ∩ V ∩ W)

true? It is not; a counterexample is

X = R² = (x, y) space, U = {y = 0}, V = {x = 0}, W = {x = y}.

Then U + V + W = R², U ∩ V = {0}, U ∩ W = {0}, V ∩ W = {0}, U ∩ V ∩ W = {0}, so the right side equals 3 while the left side equals 2.
2 Duality
The book's own solution gives answers to Ex 4, 5, 6, 7.
I 1. (page 15) Given a nonzero vector x_1 in X, show that there is a linear function l such that

l(x_1) ≠ 0.

Proof. We let Y = {kx_1 : k ∈ K}. Then Y is a 1-dimensional linear subspace of X. By Theorem 2 and Theorem 4,

dim Y⊥ = dim X′ − dim Y < dim X′.

So there must exist some l ∈ X′ \ Y⊥, and any such l satisfies l(x_1) ≠ 0.
Remark 1. When K is R or C, the proof can be constructive. Indeed, assume e_1, ..., e_n is a basis for X and x_1 = Σ_{i=1}^n a_i e_i. In the case of K = R, define l by setting l(e_i) = a_i, i = 1, ..., n; in the case of K = C, define l by setting l(e_i) = ā_i (the conjugate of a_i), i = 1, ..., n. Then in both cases, l(x_1) = Σ_{i=1}^n |a_i|² > 0.

l(y) = Σ_{i=1}^m k_i l(x_i) = 0.
Proof. Suppose three linearly independent polynomials p_1, p_2 and p_3 are applied to formula (9). Then m_1, m_2 and m_3 must satisfy the linear equations

\[
\begin{pmatrix} p_1(t_1) & p_1(t_2) & p_1(t_3)\\ p_2(t_1) & p_2(t_2) & p_2(t_3)\\ p_3(t_1) & p_3(t_2) & p_3(t_3) \end{pmatrix}
\begin{pmatrix} m_1\\ m_2\\ m_3 \end{pmatrix}
=
\begin{pmatrix} \int_{-1}^1 p_1(t)\,dt\\ \int_{-1}^1 p_2(t)\,dt\\ \int_{-1}^1 p_3(t)\,dt \end{pmatrix}.
\]

So, taking p_1(t) = 1, p_2(t) = t, p_3(t) = t² and the points t_1 = −a, t_2 = 0, t_3 = a,

\[
\begin{pmatrix} m_1\\ m_2\\ m_3 \end{pmatrix}
=
\begin{pmatrix} 1 & 1 & 1\\ -a & 0 & a\\ a^2 & 0 & a^2 \end{pmatrix}^{-1}
\begin{pmatrix} 2\\ 0\\ 2/3 \end{pmatrix}
=
\begin{pmatrix} \frac{1}{3a^2}\\[2pt] 2 - \frac{2}{3a^2}\\[2pt] \frac{1}{3a^2} \end{pmatrix}.
\]

Then it's easy to see that for a² > 1/3, all three weights are positive.
To show formula (9) holds for all polynomials of degree < 6 when a² = 3/5, we note for any odd n ∈ N,

∫_{−1}^1 x^n dx = 0, and m_1 p(−a) + m_3 p(a) = 0 since m_1 = m_3 and p(−x) = −p(x), and m_2 p(0) = 0.

So (9) holds for any x^n of odd degree n, in particular for p(x) = x³ and p(x) = x⁵. For p(x) = x⁴, we have

∫_{−1}^1 x⁴ dx = 2/5, while m_1 p(t_1) + m_2 p(t_2) + m_3 p(t_3) = 2m_1 a⁴ = (2/3)a².

So formula (9) holds for p(x) = x⁴ exactly when a² = 3/5. Combined, we conclude that for a² = 3/5, (9) holds for all polynomials of degree < 6.
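The rule just derived (nodes −a, 0, a with a² = 3/5, weights 1/(3a²) = 5/9 and 2 − 2/(3a²) = 8/9) is the classical three-point Gauss rule; a numerical spot check we add here (not part of the manual):

import numpy as np

a = np.sqrt(3.0 / 5.0)
nodes = np.array([-a, 0.0, a])
weights = np.array([5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0])
for k in range(6):
    exact = (1 - (-1) ** (k + 1)) / (k + 1)    # integral of x^k over [-1, 1]
    approx = np.sum(weights * nodes ** k)
    print(k, np.isclose(exact, approx))         # True for every k = 0, ..., 5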
Remark 2. In this exercise and in Exercise 5 below, "Theorem 6" should be corrected to "Theorem 7".
I 5. (page 18) In Theorem 7 take the interval I to be [−1, 1], and take n = 4. Choose the four points to be −a, −b, b, a.
(i) Determine the weights m_1, m_2, m_3, and m_4 so that (9) holds for all polynomials of degree < 4.
(ii) For what values of a and b are the weights positive?

Proof. We take p_1(t) = 1, p_2(t) = t, p_3(t) = t², and p_4(t) = t³. Then m_1, m_2, m_3, and m_4 solve the following equation:

\[
\begin{pmatrix} 1 & 1 & 1 & 1\\ -a & -b & b & a\\ a^2 & b^2 & b^2 & a^2\\ -a^3 & -b^3 & b^3 & a^3 \end{pmatrix}
\begin{pmatrix} m_1\\ m_2\\ m_3\\ m_4 \end{pmatrix}
=
\begin{pmatrix} 2\\ 0\\ 2/3\\ 0 \end{pmatrix}.
\]
Then

\[
\begin{pmatrix} m_1\\ m_2\\ m_3\\ m_4 \end{pmatrix}
=
\begin{pmatrix} 1 & 1 & 1 & 1\\ -a & -b & b & a\\ a^2 & b^2 & b^2 & a^2\\ -a^3 & -b^3 & b^3 & a^3 \end{pmatrix}^{-1}
\begin{pmatrix} 2\\ 0\\ 2/3\\ 0 \end{pmatrix}
=
\begin{pmatrix} \frac{1-3b^2}{3(a^2-b^2)}\\[2pt] \frac{3a^2-1}{3(a^2-b^2)}\\[2pt] \frac{3a^2-1}{3(a^2-b^2)}\\[2pt] \frac{1-3b^2}{3(a^2-b^2)} \end{pmatrix}.
\]

So the weights are positive if and only if one of the following two mutually exclusive cases holds:
1) a² > b², b² < 1/3, a² > 1/3;
2) a² < b², b² > 1/3, a² < 1/3.
In other words, exactly one of a², b² must exceed 1/3.
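A numerical sketch (ours, not the manual's) that solves for the weights and illustrates the sign condition:

import numpy as np

def weights(a, b):
    V = np.array([[1, 1, 1, 1],
                  [-a, -b, b, a],
                  [a**2, b**2, b**2, a**2],
                  [-a**3, -b**3, b**3, a**3]], dtype=float)
    return np.linalg.solve(V, np.array([2.0, 0.0, 2.0 / 3.0, 0.0]))

print(weights(0.9, 0.4))   # b^2 < 1/3 < a^2: all four weights positive
print(weights(0.4, 0.3))   # both squares below 1/3: mixed signs appear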
Consider polynomials p(x) = a_0 + a_1 x + a_2 x² with real coefficients and degree ≤ 2. Let t_1, t_2, t_3 be three distinct real numbers, and then define l_j(p) = p(t_j) for j = 1, 2, 3.

(a)

Proof. Suppose a l_1 + b l_2 + c l_3 = 0. Set p_1(x) = (x − t_2)(x − t_3). Then p_1(t_2) = p_1(t_3) = 0 and p_1(t_1) ≠ 0; so we get from the above relation that a = 0. Similarly b = 0, c = 0.
(b)

Proof. Since dim P_2 = 3, dim P_2′ = 3. Since l_1, l_2, l_3 are linearly independent, they span P_2′.
(c1)

Proof. We define l_1 by setting

l_1(e_j) = 1 if j = 1, and l_1(e_j) = 0 if j ≠ 1,

and extending l_1 to V by linearity, i.e. l_1(Σ_{j=1}^n c_j e_j) := Σ_{j=1}^n c_j l_1(e_j) = c_1. l_2, ..., l_n can be constructed similarly. If there exist a_1, ..., a_n such that a_1 l_1 + ... + a_n l_n = 0, we have

0 = a_1 l_1(e_j) + ... + a_n l_n(e_j) = a_j, j = 1, ..., n.

So l_1, ..., l_n are linearly independent. Since dim V′ = dim V = n, {l_1, ..., l_n} is a basis of V′.
(c2)

Proof. We define

\[
p_1(x) = \frac{(x-x_2)(x-x_3)}{(x_1-x_2)(x_1-x_3)}, \quad p_2(x) = \frac{(x-x_1)(x-x_3)}{(x_2-x_1)(x_2-x_3)}, \quad p_3(x) = \frac{(x-x_1)(x-x_2)}{(x_3-x_1)(x_3-x_2)}.
\]
I 7. (page 18) Let W be the subspace of R⁴ spanned by (1, 0, −1, 2) and (2, 3, 1, 1). Which linear functions l(x) = c_1x_1 + c_2x_2 + c_3x_3 + c_4x_4 are in the annihilator of W?

Proof. (From the textbook's solutions, page 280) l(x) has to be zero for x = (1, 0, −1, 2) and x = (2, 3, 1, 1). These yield two equations for c_1, ..., c_4:

c_1 − c_3 + 2c_4 = 0, 2c_1 + 3c_2 + c_3 + c_4 = 0.

We express c_1 and c_2 in terms of c_3 and c_4. From the first equation, c_1 = c_3 − 2c_4. Setting this into the second equation gives c_2 = −c_3 + c_4.
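The parametrization just found can be verified numerically (a sketch we add; the two test values of (c_3, c_4) are arbitrary):

import numpy as np

A = np.array([[1.0, 0.0, -1.0, 2.0],
              [2.0, 3.0, 1.0, 1.0]])     # rows span W
for c3, c4 in [(1.0, 0.0), (0.0, 1.0)]:
    c = np.array([c3 - 2 * c4, -c3 + c4, c3, c4])  # c1, c2 per the solution
    print(np.allclose(A @ c, 0))                    # True: l annihilates W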
3 Linear Mappings
The book's own solution gives answers to Ex 1, 2, 4, 5, 6, 7, 8, 10, 11, 13.

Comments: To memorize Theorem 5 (R_T^⊥ = N_{T′}), recall that for a given l ∈ U′, (l, Tx) = 0 for any x ∈ X if and only if T′l = 0.
Proof. (From the textbook's solution, page 280) Suppose we drop the ith equation; if the remaining equations do not determine x uniquely, there is an x ≠ 0 that is mapped into a vector whose components, except possibly the ith, are zero. If this were true for all i = 1, ..., m, the range of the mapping x ↦ u would be m-dimensional; but according to Theorem 2, the dimension of the range is ≤ n < m. Therefore one of the equations may be dropped without losing uniqueness; by induction m − n of the equations may be omitted.

Alternative solution: Uniqueness of the solution x implies the column vectors of the matrix T = (t_ij) are linearly independent. Since the column rank of a matrix equals its row rank (see Chapter 3, Theorem 6 and Chapter 4, Theorem 2), it is possible to select a subset of n of these equations which uniquely determines the solution.

Remark 3. The textbook's solution is, in effect, a proof that the column rank of a matrix equals its row rank.
I 3. (page 25) Prove Theorem 3.

(i)

Proof. S ∘ T(ax + by) = S(T(ax + by)) = S(aT(x) + bT(y)) = aS(T(x)) + bS(T(y)) = aS ∘ T(x) + bS ∘ T(y). So S ∘ T is also a linear mapping.

(ii)

Proof. (R + S) ∘ T(x) = (R + S)(T(x)) = R(T(x)) + S(T(x)) = (R ∘ T + S ∘ T)(x) and S ∘ (T + P)(x) = S((T + P)(x)) = S(T(x) + P(x)) = S(T(x)) + S(P(x)) = (S ∘ T + S ∘ P)(x).
I 4. (page 25) Show that S and T in Examples 8 and 9 are linear and that ST ≠ TS.

Proof. For Example 8, the linearity of S and T is easy to see. To see the non-commutativity, consider the polynomial p(s) = s. We have TS(s) = T(s²) = 2s ≠ s = S(1) = ST(s). So ST ≠ TS.

For Example 9, ∀x = (x_1, x_2, x_3) ∈ X, S(x) = (x_1, −x_3, x_2) and T(x) = (x_3, x_2, −x_1). So it's easy to see S and T are linear. To see the non-commutativity, note ST(x) = S(x_3, x_2, −x_1) = (x_3, x_1, x_2) and TS(x) = T(x_1, −x_3, x_2) = (x_2, −x_3, −x_1). So ST ≠ TS in general.

Remark 4. Note the problem does not specify the direction of the rotation, so it is also possible that S(x) = (x_1, x_3, −x_2) and T(x) = (−x_3, x_2, x_1). There are a total of four choices of (S, T), and each of the corresponding proofs is similar to the one presented above.
Proof. Suppose T: X → U is invertible. Then for any y, y′ ∈ U, there exist a unique x ∈ X and a unique x′ ∈ X such that T(x) = y and T(x′) = y′. So T(x + x′) = T(x) + T(x′) = y + y′ and, by the injectivity of T, T⁻¹(y + y′) = x + x′ = T⁻¹(y) + T⁻¹(y′). For any k ∈ K, since T(kx) = kT(x) = ky, injectivity of T implies T⁻¹(ky) = kx = kT⁻¹(y). Combined, we conclude T⁻¹ is linear.

(ii)

Proof. Suppose T: X → U and S: U → V. First, by the definition of multiplication, ST is a linear map. Second, if x ∈ X is such that ST(x) = 0 ∈ V, the injectivity of S implies T(x) = 0 ∈ U and the injectivity of T further implies x = 0 ∈ X. So ST is one-to-one. For any z ∈ V, there exists y ∈ U such that S(y) = z. Also, we can find x ∈ X such that T(x) = y. So ST(x) = S(y) = z. This shows ST is onto. Combined, we conclude ST is invertible.

By associativity, we have (ST)(T⁻¹S⁻¹) = ((ST)T⁻¹)S⁻¹ = (S(TT⁻¹))S⁻¹ = SS⁻¹ = id_V. Replacing S with T⁻¹ and T with S⁻¹, we also have (T⁻¹S⁻¹)(ST) = id_X. Therefore, we can conclude (ST)⁻¹ = T⁻¹S⁻¹.
I 7. (page 26) Show that whenever meaningful,

(ST)′ = T′S′, (T + R)′ = T′ + R′, and (T⁻¹)′ = (T′)⁻¹.

(i)

Proof. Suppose T: X → U and S: U → V are linear maps. Then for any given l ∈ V′, ((ST)′l, x) = (l, STx) = (S′l, Tx) = (T′S′l, x), ∀x ∈ X. Therefore (ST)′l = T′S′l. Letting l run through every element of V′, we conclude (ST)′ = T′S′.

(ii)

Proof. Suppose T and R are both linear maps from X to U. For any given l ∈ U′, we have ((T + R)′l, x) = (l, (T + R)x) = (l, Tx + Rx) = (l, Tx) + (l, Rx) = (T′l, x) + (R′l, x) = ((T′ + R′)l, x), ∀x ∈ X. Therefore (T + R)′l = (T′ + R′)l. Letting l run through every element of U′, we conclude (T + R)′ = T′ + R′.

(iii)

Proof. Suppose T is an isomorphism from X to U; then T⁻¹ is a well-defined linear map. We first show T′ is an isomorphism from U′ to X′. Indeed, if l ∈ U′ is such that T′l = 0, then for any x ∈ X, 0 = (T′l, x) = (l, Tx). As x varies and goes through every element of X, Tx goes through every element of U. By considering the identification of U with U″, we conclude l = 0. So T′ is one-to-one. For any given m ∈ X′, define l = mT⁻¹; then l ∈ U′. For any x ∈ X, we have (m, x) = (m, T⁻¹(Tx)) = (l, Tx) = (T′l, x). Since x is arbitrary, m = T′l and T′ is therefore onto. Combined, we conclude T′ is an isomorphism from U′ to X′ and (T′)⁻¹ is hence well-defined.

By part (i), (T⁻¹)′T′ = (TT⁻¹)′ = (id_U)′ = id_{U′} and T′(T⁻¹)′ = (T⁻¹T)′ = (id_X)′ = id_{X′}. This shows (T⁻¹)′ = (T′)⁻¹.
I 8. (page 26) Show that if X is identified with X″ and U with U″ via (5) in Chapter 2, then

T″ = T.

Proof. Suppose φ: X → X″ and ψ: U → U″ are the isomorphisms defined in Chapter 2, formula (5), which identify X with X″ and U with U″, respectively. Then for any x ∈ X and l ∈ U′, we have

(T″φ(x), l) = (φ(x), T′l) = (T′l, x) = (l, Tx) = (ψ(Tx), l).

Since x and l are arbitrary, T″φ = ψT, i.e. T″ = T under the identifications.
I 9. (page 28) Show that if A in L(X, X) is a left inverse of B in L(X, X), that is AB = I, then it is also a right inverse: BA = I.
I 10. (page 30) Show that if M is invertible, and similar to K, then K also is invertible, and K 1 is similar
to M 1 .
Proof. Suppose A is invertible; we have AB = AB(AA⁻¹) = A(BA)A⁻¹. So AB and BA are similar. The case of B being invertible can be proved similarly.
I 12. (page 31) Show that P defined above is a linear map, and that it is a projection.

Proof. For x = (x_1, ..., x_n) and y = (y_1, ..., y_n),

P(x + y) = P((x_1 + y_1, ..., x_n + y_n)) = (0, 0, x_3 + y_3, ..., x_n + y_n) = (0, 0, x_3, ..., x_n) + (0, 0, y_3, ..., y_n) = P(x) + P(y),

and similarly P(kx) = kP(x), so P is linear. Moreover P²(x) = P((0, 0, x_3, ..., x_n)) = (0, 0, x_3, ..., x_n) = P(x). So P is a projection.
I 13. (page 31) Prove that P defined above is linear, and that it is a projection.

S ∘ T ∘ S⁻¹(k) = S ∘ T ∘ S⁻¹(k·1) = k·c, ∀k ∈ K.

(b)

Proof. If c ≠ 1, it's easy to verify that I + \frac{1}{1-c}T is the inverse of I − T.
I 15. (page 31) Suppose T and S are linear maps of a finite dimensional vector space into itself. Show that the rank of ST is less than or equal to the rank of S. Show that the dimension of the nullspace of ST is less than or equal to the sum of the dimensions of the nullspaces of S and of T.

Proof. Because R_{ST} ⊂ R_S, rank(ST) = dim R_{ST} ≤ dim R_S = rank(S). Moreover, since the column rank of a matrix equals its row rank (see Chapter 3, Theorem 6 and Chapter 4, Theorem 2), we have rank(ST) = rank(T′S′) ≤ rank(T′) = rank(T). Combined, we conclude rank(ST) ≤ min{rank(S), rank(T)}.

Also, we note N_{ST}/N_T is isomorphic to N_S ∩ R_T, with the isomorphism defined by φ({x}) = Tx, where {x} := x + N_T. It's easy to see φ is well-defined, linear, and both injective and surjective. So by Theorem 6 of Chapter 1,

dim N_{ST} = dim N_T + dim N_{ST}/N_T = dim N_T + dim(N_S ∩ R_T) ≤ dim N_T + dim N_S.

Remark 6. The result rank(ST) ≤ min{rank(S), rank(T)} is used in econometrics. Cf. Greene [4, page 985], Appendix A.
4 Matrices
The book's own solution gives answers to Ex 1, 2, 4.

Show that the ith row of DA equals d_i times the ith row of A, and show that the jth column of AD equals d_j times the jth column of A.

Proof. It appears the phrasing of the exercise is problematic: when m ≠ n, AD or DA may not be well-defined, so we will assume m = n in the following. We can write A in the row form

\[
A = \begin{pmatrix} r_1\\ r_2\\ \vdots\\ r_n \end{pmatrix}.
\]

Then DA can be written as

\[
DA = \begin{pmatrix} d_1 & 0 & \cdots & 0\\ 0 & d_2 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & d_n \end{pmatrix}\begin{pmatrix} r_1\\ r_2\\ \vdots\\ r_n \end{pmatrix} = \begin{pmatrix} d_1 r_1\\ d_2 r_2\\ \vdots\\ d_n r_n \end{pmatrix}.
\]

We can also write A in the column form [c_1, c_2, ..., c_n]; then AD can be written as

\[
AD = [c_1, c_2, \dots, c_n]\begin{pmatrix} d_1 & 0 & \cdots & 0\\ 0 & d_2 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & d_n \end{pmatrix} = [d_1 c_1, d_2 c_2, \dots, d_n c_n].
\]
I 2. (page 37) Look up in any text the proof that the row rank of a matrix equals its column rank, and
compare it to the proof given in the present text.
Proof. Proofs in most textbooks are lengthy and complicated. For a clear, although still lengthy, proof, see
[12, page 112], Theorem 3.5.3.
I 3. (page 38) Show that the product of two matrices in 2 × 2 block form can be evaluated as

\[
\begin{pmatrix} A_{11} & A_{12}\\ A_{21} & A_{22} \end{pmatrix}\begin{pmatrix} B_{11} & B_{12}\\ B_{21} & B_{22} \end{pmatrix} = \begin{pmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22}\\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{pmatrix}.
\]

Proof. The calculation is a bit messy. We refer the reader to [12, page 190], Theorem 4.6.1.
I 5. (page 40) Show that x_1, x_2, x_3, and x_4 given by (20)_j satisfy all four equations (20).

Proof.

\[
\begin{pmatrix} 1 & 2 & 3 & 1\\ 2 & 5 & 4 & 3\\ 2 & 3 & 4 & 1\\ 1 & 4 & 2 & 2 \end{pmatrix}\begin{pmatrix} 1\\ 2\\ -2\\ -1 \end{pmatrix} = \begin{pmatrix} 1\cdot1 + 2\cdot2 + 3\cdot(-2) + 1\cdot(-1)\\ 2\cdot1 + 5\cdot2 + 4\cdot(-2) + 3\cdot(-1)\\ 2\cdot1 + 3\cdot2 + 4\cdot(-2) + 1\cdot(-1)\\ 1\cdot1 + 4\cdot2 + 2\cdot(-2) + 2\cdot(-1) \end{pmatrix} = \begin{pmatrix} -2\\ 1\\ -1\\ 3 \end{pmatrix}.
\]
I 6. (page 41) Choose values of u_1, u_2, u_3, u_4 so that condition (23) is satisfied, and determine all solutions of equations (22).

Proof. We choose u_1 = u_2 = u_3 = 1 and u_4 = 2. Then x_3 = −5x_4 − u_3 − u_2 + 3u_1 = −5x_4 + 1, x_2 = 7x_4 + u_4 − 3u_1 = 7x_4 − 1, and x_1 = u_1 − x_2 − 2x_3 − 3x_4 = 1 − (7x_4 − 1) − 2(−5x_4 + 1) − 3x_4 = 0.
lM = 0.

Proof.

\[
[1, -2, -1, 1]\begin{pmatrix} 1 & 1 & 2 & 3\\ 1 & 2 & 3 & 1\\ 2 & 1 & 2 & 3\\ 3 & 4 & 6 & 2 \end{pmatrix} = [1\cdot1 - 2\cdot1 - 1\cdot2 + 1\cdot3,\; 1\cdot1 - 2\cdot2 - 1\cdot1 + 1\cdot4,\; 1\cdot2 - 2\cdot3 - 1\cdot2 + 1\cdot6,\; 1\cdot3 - 2\cdot1 - 1\cdot3 + 1\cdot2] = 0.
\]
I 8. (page 42) Show by Gaussian elimination that the only left nullvectors of M are multiples of l in Exercise 7, and then use Theorem 5 of Chapter 3 to show that condition (23) is sufficient for the solvability of the system (22).

Proof. Suppose a row vector x = (x_1, x_2, x_3, x_4) satisfies xM = 0. Then we can proceed by Gaussian elimination:

\[
\begin{cases} x_1 + x_2 + 2x_3 + 3x_4 = 0\\ x_1 + 2x_2 + x_3 + 4x_4 = 0\\ 2x_1 + 3x_2 + 2x_3 + 6x_4 = 0\\ 3x_1 + x_2 + 3x_3 + 2x_4 = 0 \end{cases}
\Rightarrow
\begin{cases} x_2 - x_3 + x_4 = 0\\ x_2 - 2x_3 = 0\\ -2x_2 - 3x_3 - 7x_4 = 0 \end{cases}
\Rightarrow
\begin{cases} x_3 + x_4 = 0\\ -5x_3 - 5x_4 = 0. \end{cases}
\]
So we have x_1 = x_4, x_2 = −2x_4, and x_3 = −x_4, i.e. x = x_4(1, −2, −1, 1), a multiple of l in Exercise 7.

Equation (22) has a solution if and only if u = (u_1, u_2, u_3, u_4)^T is in R_M. By Theorem 5 of Chapter 3, this is equivalent to yu = 0, ∀y ∈ N_{M′} (elements of N_{M′} are seen as row vectors). We have proved every such y is a multiple of l. Hence condition (23), which is just lu = 0, is sufficient for the solvability of the system (22).
5 Determinant and Trace

Comments:
1) For a more intuitive proof of Theorem 2 (det(BA) = det A det B), see Munkres [10, page 18], Theorem 2.10.
2) The following proposition is one version of Cramer's rule and will be used in the proof of Lemma 6, Chapter 6 (formula (21) on page 68).
Proposition 5.1. Let A be an n × n matrix and B defined as the matrix of cofactors of A; that is, B_{ij} = (−1)^{i+j} det A_{ji}. Then BA = AB = (det A)·I_{n×n}.

Proof. Suppose A has the column form A = (a_1, ..., a_n). By replacing the jth column with the ith column in A, we obtain

M = (a_1, ..., a_{j−1}, a_i, a_{j+1}, ..., a_n).

On one hand, Property (i) of a determinant gives det M = δ_{ij} det A; on the other hand, Laplace expansion of a determinant gives

\[
\det M = \sum_{k=1}^n (-1)^{k+j} a_{ki} \det A_{kj} = \sum_{k=1}^n a_{ki} B_{jk} = (B_{j1}, \dots, B_{jn})\begin{pmatrix} a_{1i}\\ \vdots\\ a_{ni} \end{pmatrix}.
\]

Combined, we can conclude (det A)·I_{n×n} = BA. Applying the same argument to rows instead of columns, we get the similar result for AB.
(b)

Proof. By definition, we have

P(p_1 ∘ p_2 (x_1, ..., x_n)) = P(p_1(p_2(x_1, ..., x_n))) = σ(p_1) P(p_2(x_1, ..., x_n)) = σ(p_1)σ(p_2) P(x_1, ..., x_n).
Proof. To see (c) is true, we suppose t interchanges i_0 and j_0. Without loss of generality, we assume i_0 < j_0. Then a direct count shows the number of inversions of t is 2(j_0 − i_0 − 1) + 1, which is odd. So σ(t) = −1.

To see (d) is true, note formula (9) is equivalent to id = t_k ∘ ··· ∘ t_1 ∘ p⁻¹. Acting these operations on (1, ..., n), we have (1, ..., n) = t_k ∘ ··· ∘ t_1 (p⁻¹(1), ..., p⁻¹(n)). Then the problem is reduced to proving that a sequence of transpositions can sort an array of numbers into ascending order. There are many ways to achieve that. For example, we can let t_1 be the transposition that interchanges p⁻¹(1) and p⁻¹(i_0), where i_0 satisfies p⁻¹(i_0) = 1. That is, t_1 puts 1 in the first position of the sequence. Then we let t_2 be the transposition that puts 2 in the second position. We continue this procedure until we sort out the whole sequence. This shows sorting can be accomplished by a sequence of transpositions.
I 3. (page 48) Show that the decomposition (9) is not unique, but that the parity of the number k of factors is unique.

Proof. For any transposition t, we have t ∘ t = id. So if p = t_k ∘ ··· ∘ t_1, we can get another decomposition p = t_k ∘ ··· ∘ t_1 ∘ t ∘ t. This shows the decomposition is not unique.

Suppose the permutation p has two different decompositions into transpositions: p = t_k ∘ ··· ∘ t_1 = τ_m ∘ ··· ∘ τ_1. By formula (7), part (b) and formula (8), σ(p) = (−1)^k = (−1)^m. So k − m is an even number. This shows the parity of the number of factors is unique.
I 4. (page 49) Show that D defined by (16) has Properties (ii), (iii) and (iv).

Proof. To verify Property (ii), note for any index j and α, β ∈ K, we have

\[
D(a_1, \dots, \alpha a_j + \beta a_j', \dots, a_n) = \sum_p \sigma(p)\, a_{p_1 1} \cdots (\alpha a_{p_j j} + \beta a_{p_j j}') \cdots a_{p_n n} = \alpha D(a_1, \dots, a_j, \dots, a_n) + \beta D(a_1, \dots, a_j', \dots, a_n).
\]

To verify Property (iii), note e_{p_1 1} ··· e_{p_n n} is non-zero if and only if p_i = i for every 1 ≤ i ≤ n. In this case the product is 1.

To verify Property (iv), note for any i ≠ j, if we denote by t the transposition that interchanges i and j, then p ↦ p ∘ t is a one-to-one and onto map from the set of all permutations to itself. Therefore, we have

\[
D(a_1, \dots, a_i, \dots, a_j, \dots, a_n) = \sum_p \sigma(p)\, a_{p_1 1} \cdots a_{p_i i} \cdots a_{p_j j} \cdots a_{p_n n} = \sum_p (-1)\sigma(p\circ t)\, a_{(p\circ t)_1 1} \cdots a_{(p\circ t)_j i} \cdots a_{(p\circ t)_i j} \cdots a_{(p\circ t)_n n} = (-1)\sum_q \sigma(q)\, a_{q_1 1} \cdots a_{q_i j} \cdots a_{q_j i} \cdots a_{q_n n} = -D(a_1, \dots, a_j, \dots, a_i, \dots, a_n).
\]
I 5. (page 49) Show that Property (iv) implies Property (i), unless the field K has characteristic two, that is, 1 + 1 = 0.

Proof. By Property (iv), D(a_1, ..., a_i, ..., a_i, ..., a_n) = −D(a_1, ..., a_i, ..., a_i, ..., a_n). So, adding D(a_1, ..., a_i, ..., a_i, ..., a_n) to both sides of the equation, we have 2D(a_1, ..., a_i, ..., a_i, ..., a_n) = 0. If the characteristic of the field K is not two, we can conclude D(a_1, ..., a_i, ..., a_i, ..., a_n) = 0.

Remark 7. This exercise and Exercise 4 together show formula (16) is equivalent to Properties (i)-(iii), provided the characteristic of K is not two. Therefore, for K = R or C, we can either use (16) or Properties (i)-(iii) as the definition of the determinant.
I 6. (page 52) Verify that C(A_{11}) has properties (i)-(iii).

Proof. If two column vectors a_i and a_j (i ≠ j) of A_{11} are equal, we have \begin{pmatrix}0\\ a_i\end{pmatrix} = \begin{pmatrix}0\\ a_j\end{pmatrix}. So C(A_{11}) = 0 and property (i) is satisfied. Since any linear operation on a column vector a_i of A_{11} can be naturally extended to \begin{pmatrix}0\\ a_i\end{pmatrix}, property (ii) is also satisfied. Finally, we note when A_{11} = I_{(n−1)×(n−1)}, \begin{pmatrix}1 & 0\\ 0 & A_{11}\end{pmatrix} = I_{n×n}. So property (iii) is satisfied.
I 7. (page 52) Deduce Corollary 5 from Lemma 4.

Proof. We first move the jth column to the position of the first column. This can be done by interchanging neighboring columns (j − 1) times. The determinant of the resulting matrix A_1 is (−1)^{j−1} det A. Then we move the ith row to the position of the first row. This can be done by interchanging neighboring rows (i − 1) times. The resulting matrix A_2 has a determinant equal to (−1)^{i−1} det A_1 = (−1)^{i+j} det A. On the other hand, A_2 has the form

\[
\begin{pmatrix} 1 & *\\ 0 & A_{ij} \end{pmatrix}.
\]

By Lemma 4, we have det A_{ij} = det A_2 = (−1)^{i+j} det A. So det A = (−1)^{i+j} det A_{ij}.

Remark 8. Rigorously speaking, we only proved that swapping two neighboring columns gives a minus sign to the determinant (Property (iv)), but we haven't proved this property for neighboring rows. This can be made rigorous by using det A = det A^T (Exercise 8 of this chapter).
[Hint: Use formula (16) and show that for any permutation p, σ(p) = σ(p⁻¹).]

Proof. We first show for any permutation p, σ(p) = σ(p⁻¹). Indeed, by formula (7)(b), we have 1 = σ(id) = σ(p ∘ p⁻¹) = σ(p)σ(p⁻¹). By formula (7)(a), we conclude σ(p) = σ(p⁻¹). Second, we denote by b_{ij} the (i, j)-th entry of A^T; then b_{ij} = a_{ji}. By formula (16) and the fact that p ↦ p⁻¹ is a one-to-one and onto map from the set of all permutations to itself, we have

\[
\det A^T = \sum_p \sigma(p)\, b_{p_1 1} \cdots b_{p_n n} = \sum_p \sigma(p)\, a_{1 p_1} \cdots a_{n p_n} = \sum_p \sigma(p^{-1})\, a_{p^{-1}(p_1)\, p_1} \cdots a_{p^{-1}(p_n)\, p_n} = \sum_p \sigma(p^{-1})\, a_{p^{-1}(1)\, 1} \cdots a_{p^{-1}(n)\, n} = \det A.
\]
I 9. (page 54) Given a permutation p of n objects, we define an associated so-called permutation matrix P as follows:

P_{ij} = 1 if j = p(i), and P_{ij} = 0 otherwise.

Show that the action of P on any vector x performs the permutation p on the components of x. Show that if p, q are two permutations and P, Q are the associated permutation matrices, then the permutation matrix associated with p ∘ q is the product P Q.

Proof. By Exercise 2, it suffices to prove the property for transpositions. Suppose p interchanges i_1, i_2 and q interchanges j_1, j_2. Denote by P and Q the corresponding permutation matrices, respectively. Then for any x = (x_1, ..., x_n)^T ∈ R^n, we have (δ_{ij} is the Kronecker sign)

\[
(Px)_i = \sum_j P_{ij} x_j = \sum_j \delta_{p(i)\, j}\, x_j = \begin{cases} x_{i_2} & \text{if } i = i_1\\ x_{i_1} & \text{if } i = i_2\\ x_i & \text{otherwise.} \end{cases}
\]

This shows the action of P on any column vector x performs the permutation p on the components of x. Similarly, we have

\[
(Qx)_i = \begin{cases} x_{j_2} & \text{if } i = j_1\\ x_{j_1} & \text{if } i = j_2\\ x_i & \text{otherwise.} \end{cases}
\]

Since (PQ)(x) = P(Q(x)), the action of the matrix PQ on x performs first the permutation q and then the permutation p on the components of x. Therefore, the permutation matrix associated with p ∘ q is the product of P and Q.
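A small numerical illustration (ours, not the manual's; note the 0-based index convention of the arrays, unlike the text's 1-based one):

import numpy as np

def perm_matrix(p):
    n = len(p)
    P = np.zeros((n, n))
    P[np.arange(n), p] = 1.0           # row i has its 1 in column p(i)
    return P

p = np.array([1, 0, 2])                # transposition swapping positions 0, 1
q = np.array([0, 2, 1])                # transposition swapping positions 1, 2
x = np.array([10.0, 20.0, 30.0])
print(perm_matrix(p) @ x)              # [20. 10. 30.]: components permuted by p
r = q[p]                               # composed index map i -> q(p(i))
print(np.allclose(perm_matrix(p) @ perm_matrix(q), perm_matrix(r)))  # True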
I 10. (page 56) Let A be an m × n matrix, B an n × m matrix. Show that

tr AB = tr BA.

Proof.

\[
\operatorname{tr}(AB) = \sum_{i=1}^m (AB)_{ii} = \sum_{i=1}^m \sum_{j=1}^n a_{ij} b_{ji} = \sum_{j=1}^m \sum_{i=1}^n a_{ji} b_{ij} = \sum_{i=1}^n \sum_{j=1}^m b_{ij} a_{ji} = \sum_{i=1}^n (BA)_{ii} = \operatorname{tr}(BA),
\]

where the third equality is obtained by interchanging the names of the indices i, j.
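A one-line numerical spot check (our addition) with rectangular factors:

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))        # m x n
B = rng.standard_normal((5, 3))        # n x m
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))  # True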
I 11. (page 56) Let A be an n × n matrix, A^T its transpose. Show that

tr AA^T = Σ_{i,j} a_{ij}².

The square root of the double sum on the right is called the Euclidean, or Hilbert-Schmidt, norm of the matrix A.

Proof.

\[
\operatorname{tr}(AA^T) = \sum_i (AA^T)_{ii} = \sum_i \sum_j A_{ij} A^T_{ji} = \sum_i \sum_j a_{ij} a_{ij} = \sum_{i,j} a_{ij}^2.
\]

is D = ad − bc.
Proof. Apply Laplace expansion of a determinant according to its columns (Theorem 6).
I 13. (page 56) Show that the determinant of an upper triangular matrix, one whose elements are zero
below the main diagonal, equals the product of its elements along the diagonal.
Proof. Apply Laplace expansion of a determinant according to its columns (Theorem 6) and work by induction.
I 14. (page 57) How many multiplications does it take to evaluate det A by using Gaussian elimination to bring it into upper triangular form?

Proof. Denote by M(n) the number of multiplications needed to evaluate det A of an n × n matrix A by using Gaussian elimination to bring it into upper triangular form. To use the first row to eliminate a_{21}, a_{31}, ..., a_{n1}, we need n(n − 1) multiplications. So M(n) = n(n − 1) + M(n − 1) with M(1) = 0, and

\[
M(n) = \sum_{k=1}^n k(k-1) = \frac{n(n+1)(2n+1)}{6} - \frac{n(n+1)}{2} = \frac{(n-1)n(n+1)}{3}.
\]
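The count can be confirmed by instrumenting a toy elimination (our sketch; it counts one multiplication per multiplier and per updated entry, matching the bookkeeping above):

def count_mults(n):
    count = 0
    for k in range(n - 1):             # eliminate below pivot k
        for i in range(k + 1, n):      # each of the n-k-1 rows below
            count += 1                 # forming the multiplier
            count += n - k - 1         # updating the remaining row entries
    return count

for n in range(2, 7):
    print(n, count_mults(n), (n - 1) * n * (n + 1) // 3)   # the two agree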
I 15. (page 57) How many multiplications does it take to evaluate det A by formula (16)?
Proof. Denote by M (n) the number of multiplications needed to evaluate the determinant of an n n matrix
by formula (16). Then M (n) = nM (n 1). So M (n) = n!.
can be calculated as follows. Copy the first two columns of A as a fourth and fifth column:

\[
\begin{array}{ccc|cc} a & b & c & a & b\\ d & e & f & d & e\\ g & h & i & g & h \end{array}
\]

Proof. We apply Laplace expansion of a determinant according to its columns (Theorem 6):

\[
\det\begin{pmatrix} a & b & c\\ d & e & f\\ g & h & i \end{pmatrix} = a\det\begin{pmatrix} e & f\\ h & i \end{pmatrix} - d\det\begin{pmatrix} b & c\\ h & i \end{pmatrix} + g\det\begin{pmatrix} b & c\\ e & f \end{pmatrix} = a(ei - fh) - d(ib - ch) + g(bf - ce) = aei + bfg + cdh - gec - afh - idb.
\]
6 Spectral Theory

The book's own solution gives answers to Ex 2, 5, 7, 8, 12.

Comments:
1) λ ∈ C is an eigenvalue of a square matrix A if and only if it is a root of the characteristic polynomial p_A(a) = det(aI − A) (Corollary 3 of Chapter 5). The spectral mapping theorem (Theorem 4) extends this result further to polynomials of A.
2) The proof of Lemma 6 in this chapter (formula (21) on page 68) uses Proposition 5.1 (see the Comments of Chapter 5).
3) On p.72, the fact that from a certain index on, the N_d's become equal can be seen from the following line of reasoning. Assume N_{d−1} ≠ N_d while N_d = N_{d+1}. For any x ∈ N_{d+2}, we have (A − aI)x ∈ N_{d+1} = N_d. So x ∈ N_{d+1} = N_d. Then we work by induction.
4) Theorem 12 can be enhanced to a statement on necessary and sufficient conditions, which leads to the Jordan canonical form (see Appendix A.15 for details).

Supplementary notes:
1) The minimal polynomial is defined from the algebraic point of view, as the generator of the polynomial ideal {p : p(A) = 0}. So the powers of its linear factors are given algebraically. Meanwhile, the index of an eigenvalue is defined from the geometric point of view. Theorem 11 says they are equal.
2) As a corollary of Theorem 11, we claim an n × n matrix A can be diagonalized over the field F if and only if its minimal polynomial can be decomposed into the product of distinct linear factors (polynomials of degree 1 over the field F). Indeed, by the uniqueness of the minimal polynomial, we have

m_A is the product of distinct linear factors ⟺ F^n = ⊕_{j=1}^k N_1(a_j) ⟺ F^n has a basis {x_i}_{i=1}^n consisting of eigenvectors of A.
Because U⁻¹U = I, we must have

\[
U^{-1}AU = \begin{pmatrix} \lambda I_{s\times s} & B\\ 0 & C \end{pmatrix}
\]

and

\[
\det(\mu I - A) = \det(\mu I - U^{-1}AU) = \det\begin{pmatrix} (\mu-\lambda) I_{s\times s} & -B\\ 0 & \mu I_{(n-s)\times(n-s)} - C \end{pmatrix} = (\mu - \lambda)^s \det(\mu I - C).^1
\]

So s ≤ m.

We continue to use the notation from Proposition 6.1, and we define d(λ) as the index of λ. Then we have

Proposition 6.3 (Algebraic multiplicity and the dimension of the space of generalized eigenvectors). m(λ) = dim N_{d(λ)}(λ).

In words: the algebraic multiplicity of an eigenvalue equals the dimension of its space of generalized eigenvectors.
I 2. (page 65) (a) Prove that if A has n distinct eigenvalues a_j and all of them are less than one in absolute value, then for all h in C^n,

A^N h → 0, as N → ∞,

that is, all components of A^N h tend to zero.
(b) Prove that if all a_j are greater than one in absolute value, then for all h ≠ 0,

|A^N h| → ∞, as N → ∞.

(b)

¹ For the last equality, see, for example, Munkres [10, page 24], Problem 6, or [6, page 173].
Proof. We use the same notation as in part (a): write h = Σ_{j=1}^n h_j with h_j in the eigenspace of a_j, so that A^N h = Σ_j a_j^N h_j. Since h ≠ 0, there exists some k_0 ∈ {1, ..., n} so that the k_0-th coordinate of h satisfies h_{k_0} = Σ_j (h_j)_{k_0} ≠ 0. Then |(A^N h)_{k_0}| = |Σ_j a_j^N (h_j)_{k_0}|. Define b_1 = max_{1≤i≤n}{|a_i| : (h_i)_{k_0} ≠ 0}. Then b_1 > 1 and

|(A^N h)_{k_0}| = |b_1|^N · |Σ_{i=1}^n (a_i/b_1)^N (h_i)_{k_0}| → ∞ as N → ∞.
that the sum of the eigenvalues equals the trace, and their product is the determinant of the matrix.
Proof. The verication is straightforward.
Proof. By Lemma 9, N_{p_1···p_k} = N_{p_1} ⊕ N_{p_2···p_k} = N_{p_1} ⊕ (N_{p_2} ⊕ N_{p_3···p_k}) = N_{p_1} ⊕ N_{p_2} ⊕ N_{p_3···p_k} = ··· = N_{p_1} ⊕ N_{p_2} ⊕ ··· ⊕ N_{p_k}.
where the last equality comes from the observation N_{m_j}(a_j) ⊂ N_{m_j + d_j}(a_j) = N_{d_j}(a_j), by the definition of d_j. This shows the polynomial Π_{j=1}^k (s − a_j)^{d_j} belongs to {polynomials p : p(A) = 0}. By the definition of the minimal polynomial, r_j ≤ d_j for j = 1, ..., k.

Assume for some j, r_j < d_j; we can then find x ∈ N_{d_j}(a_j) \ N_{r_j}(a_j) with x ≠ 0. Define q(s) = Π_{i=1, i≠j}^k (s − a_i)^{r_i}; then by Corollary 10, x can be uniquely decomposed into x = x′ + x″ with x′ ∈ N_q and x″ ∈ N_{r_j}(a_j). We have 0 = (A − a_j I)^{d_j} x = (A − a_j I)^{d_j} x′ + 0. So x′ ∈ N_q ∩ N_{d_j}(a_j) = {0}. This implies x = x″ ∈ N_{r_j}(a_j), a contradiction. So r_j = d_j for every j.
Remark 9. Along the way, we have shown that the index d of an eigenvalue is no greater than the algebraic multiplicity of the eigenvalue in the characteristic polynomial. See also Proposition 6.2.
I 9. (page 75) Prove Corollary 15.

Proof. The extension is straightforward, as the key feature of the proof, that B maps N^{(j)} into N^{(j)}, remains the same regardless of the number of linear maps, as long as they commute pairwise.
(b)

Proof. We note (h_1, h_1) = 1 + a_1² = (5 + √5)/2 and (h_2, h_2) = 1 + a_2² = (5 − √5)/2. For x = \begin{pmatrix}0\\1\end{pmatrix}, we have (h_1, x) = a_1 and (h_2, x) = a_2. So using formulas (44) and (45), x = c_1 h_1 + c_2 h_2 with

c_1 = a_1 / ((5 + √5)/2) = 1/√5, c_2 = a_2 / ((5 − √5)/2) = −1/√5.

This agrees with the expansion obtained in Example 2.
I 12. (page 76) In Example 1 we have determined the eigenvalues and corresponding eigenvectors of the matrix

\[
\begin{pmatrix} 3 & 2\\ 1 & 4 \end{pmatrix}
\]

as a_1 = 2, h_1 = \begin{pmatrix}2\\-1\end{pmatrix}, and a_2 = 5, h_2 = \begin{pmatrix}1\\1\end{pmatrix}. Determine eigenvectors l_1 and l_2 of its transpose and show that

\[
(l_i, h_j) \begin{cases} = 0 & \text{for } i \neq j\\ \neq 0 & \text{for } i = j. \end{cases}
\]

Proof. The transpose of the matrix has the same eigenvalues a_1 = 2, a_2 = 5. Solving the equation

\[
\begin{pmatrix} 3 & 1\\ 2 & 4 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = 2\begin{pmatrix} x\\ y \end{pmatrix},
\]

we have l_1 = [1, −1]. Solving the equation

\[
\begin{pmatrix} 3 & 1\\ 2 & 4 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = 5\begin{pmatrix} x\\ y \end{pmatrix},
\]

we have l_2 = [1, 2]. Then it's easy to calculate (l_1, h_1) = 3, (l_1, h_2) = 0, (l_2, h_1) = 0, and (l_2, h_2) = 3.
I 13. (page 76) Show that the matrix

\[
A = \begin{pmatrix} 0 & 1 & 1\\ 1 & 0 & 1\\ 1 & 1 & 0 \end{pmatrix}
\]

has −1 as an eigenvalue. What are the other two eigenvalues?

Solution.

\[
\det(\lambda I - A) = \det\begin{pmatrix} \lambda & -1 & -1\\ -1 & \lambda & -1\\ -1 & -1 & \lambda \end{pmatrix} = \lambda(\lambda^2 - 1) - (\lambda + 1) - (1 + \lambda) = \lambda^3 - 3\lambda - 2 = (\lambda + 1)^2(\lambda - 2).
\]

So the other two eigenvalues are −1 and 2.
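A numerical check of the factorization (our addition):

import numpy as np

A = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])
print(np.sort(np.linalg.eigvalsh(A)))   # [-1. -1.  2.], as derived above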
7 Euclidean Structure
The book's own solution gives answers to Ex 1, 2, 3, 5, 6, 7, 8, 14, 17, 19, 20.

Erratum: In the Note on page 92, the infinite-dimensional version of Theorem 15 is Theorem 5 in Chapter 15, not Theorem 15.
I 3. (page 89) Construct the matrix representing reflection of points in R³ across the plane x_3 = 0. Show that the determinant of this matrix is −1.

Proof. Under the reflection across the plane {(x_1, x_2, x_3) : x_3 = 0}, the point (x_1, x_2, x_3) is mapped to (x_1, x_2, −x_3). So the corresponding matrix is

\[
\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & -1 \end{pmatrix},
\]

whose determinant is −1.
Proof. Suppose the plane L is determined by the equation Ax + By + Cz = D. For any point x = (x_1, x_2, x_3) ∈ R³, we first find y = (y_1, y_2, y_3) ∈ L such that the line segment xy ⊥ L. Then y must satisfy the following equations:

\[
\begin{cases} Ay_1 + By_2 + Cy_3 = D\\ (y_1 - x_1,\, y_2 - x_2,\, y_3 - x_3) = k(A, B, C), \end{cases}
\]

where k is some constant. Solving the equations gives us k = \frac{D - (Ax_1 + Bx_2 + Cx_3)}{A^2 + B^2 + C^2} and

\[
\begin{pmatrix} y_1\\ y_2\\ y_3 \end{pmatrix} = \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} + k\begin{pmatrix} A\\ B\\ C \end{pmatrix} = \left(I - \frac{1}{A^2+B^2+C^2}\begin{pmatrix} A^2 & AB & AC\\ AB & B^2 & BC\\ CA & CB & C^2 \end{pmatrix}\right)\begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} + \frac{D}{A^2+B^2+C^2}\begin{pmatrix} A\\ B\\ C \end{pmatrix}.
\]

The reflected point is R(x) = 2y − x. To make the reflection R a linear mapping, it is necessary and sufficient that D = 0. So the problem's statement should be corrected to "let R be reflection across any plane in R³ that contains the origin." Then

\[
R = \frac{1}{A^2+B^2+C^2}\begin{pmatrix} -A^2+B^2+C^2 & -2AB & -2AC\\ -2AB & A^2-B^2+C^2 & -2BC\\ -2CA & -2CB & A^2+B^2-C^2 \end{pmatrix}.
\]
Proof. The proof is the same as the one for the real Euclidean space.
I 11. (page 96) Show that a unitary map M satisfies the relation

M*M = I.

Proof. If M is a unitary map, then by the polarization identity, M preserves the inner product. So ∀x, y, (x, M*My) = (Mx, My) = (x, y). Since x is arbitrary, M*My = y, ∀y ∈ X. So M*M = I. Conversely, if M*M = I, then (x, x) = (x, M*Mx) = (Mx, Mx). So M is an isometry.
I 12. (page 96) Show that if M is unitary, so are M⁻¹ and M*.

I 13. (page 96) Show that the unitary maps form a group under multiplication.

i.e. |det M| = 1.
I 15. (page 96) Let X be the space of continuous complex-valued functions on [−1, 1] and define the scalar product in X by

(f, g) = ∫_{−1}^1 f(s) \overline{g(s)} ds.
I 16. (page 98) Prove the following analogue of (51) for matrices with complex entries:

||A|| ≤ (Σ_{i,j} |a_{ij}|²)^{1/2}.

Proof. The proof is very similar to that of the real case, so we omit the details. Note we need the complex version of the Schwarz inequality (Exercise 8).
I 17. (page 98) Show that

Σ_{i,j} |a_{ij}|² = tr AA*.
Proof. We have

\[
(AA^*)_{ij} = [a_{i1}, \dots, a_{in}]\begin{pmatrix} \bar a_{j1}\\ \vdots\\ \bar a_{jn} \end{pmatrix} = \sum_{k=1}^n a_{ik} \bar a_{jk}.
\]

So (AA*)_{ii} = Σ_{k=1}^n |a_{ik}|² and tr(AA*) = Σ_{i,j} |a_{ij}|².

Solution. Suppose λ_1 and λ_2 are the two eigenvalues of A. Then by Theorem 3 of Chapter 6, λ_1 + λ_2 = tr A = 4 and λ_1λ_2 = det A = 3. Solving the equations gives us λ_1 = 1, λ_2 = 3. By formula (46), ||A|| ≥ 3. According to formula (51), we have ||A|| ≤ √(1² + 2² + 3²) = √14. Combined, we have 3 ≤ ||A|| ≤ √14 ≈ 3.7417.
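A numerical sketch with a hypothetical matrix (the exercise statement did not survive this extraction, but A = [[1, 2], [0, 3]] matches the data used above: tr A = 4, det A = 3, entry squares summing to 14):

import numpy as np

A = np.array([[1.0, 2.0], [0.0, 3.0]])
norm = np.linalg.norm(A, 2)              # spectral norm = largest singular value
print(norm)                              # about 3.65
print(3 <= norm <= np.sqrt(14))          # True: the two bounds bracket it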
I 20. (page 99) (i) w is a bilinear function of x and y. Therefore we write w as a product of x and y, denoted as

w = x × y,

and called the cross product.
(ii) Show that the cross product is antisymmetric:

y × x = −x × y.
(i)
Proof. For any a_1, a_2 ∈ R, we have

(w(a_1x_1 + a_2x_2, y), z) = det(a_1x_1 + a_2x_2, y, z) = a_1 det(x_1, y, z) + a_2 det(x_2, y, z) = (a_1 w(x_1, y) + a_2 w(x_2, y), z).

Since z is arbitrary, we necessarily have w(a_1x_1 + a_2x_2, y) = a_1w(x_1, y) + a_2w(x_2, y). Similarly, we can prove w(x, b_1y_1 + b_2y_2) = b_1w(x, y_1) + b_2w(x, y_2). Combined, we have proved w is a bilinear function of x and y.

(ii)

Proof. We note (y × x, z) = det(y, x, z) = −det(x, y, z) = −(x × y, z) for every z; hence y × x = −x × y.

(iii)

Proof. Since (w(x, y), x) = det(x, y, x) = 0 and (w(x, y), y) = det(x, y, y) = 0, x × y is orthogonal to both x and y.

(iv)

Proof. We suppose every vector is in column form and R is the matrix that represents a rotation. Then

(Rx × Ry, z) = det(Rx, Ry, z) = det(Rx, Ry, R(R^T z)) = det R · det(x, y, R^T z) = det(x, y, R^T z),

and

(R(x × y), z) = (R(x × y))^T z = (x × y)^T R^T z = (x × y, R^T z) = det(x, y, R^T z).

A rotation is isometric, so R^T = R⁻¹ and det R = 1. Combining the above two equations gives us (Rx × Ry, z) = (R(x × y), z). Since z is arbitrary, we must have Rx × Ry = R(x × y).
(v)

Proof. In the equation det(x, y, z) = (x × y, z), we set z = x × y. Since the geometric meaning of det(x, y, z) is the signed volume of the parallelepiped determined by x, y, z, and since z = x × y is perpendicular to x and y, we have det(x, y, z) = ||x|| ||y|| sin θ ||z||, where θ is the angle between x and y. Then by (x × y, z) = ||z||², we conclude ||x × y|| = ||z|| = ||x|| ||y|| sin θ.
(vi)

Proof.

\[
1 = \det\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix} = \left(\begin{pmatrix}1\\0\\0\end{pmatrix} \times \begin{pmatrix}0\\1\\0\end{pmatrix},\, \begin{pmatrix}0\\0\\1\end{pmatrix}\right).
\]

So e_1 × e_2 = (a, b, 1)^T for some a, b. By part (iii), we necessarily have a = b = 0. Therefore, we can conclude

\[
\begin{pmatrix}1\\0\\0\end{pmatrix} \times \begin{pmatrix}0\\1\\0\end{pmatrix} = \begin{pmatrix}0\\0\\1\end{pmatrix}.
\]

(vii)
Proof. By Exercises 8 and 16 of Chapter 5,

\[
\det\begin{pmatrix} a & d & g\\ b & e & h\\ c & f & i \end{pmatrix} = \det\begin{pmatrix} a & b & c\\ d & e & f\\ g & h & i \end{pmatrix} = aei + bfg + cdh - gec - hfa - idb = (bf - ec)g + (cd - fa)h + (ae - db)i = [bf - ce,\; cd - af,\; ae - bd]\begin{pmatrix} g\\ h\\ i \end{pmatrix}.
\]

So we have

\[
\begin{pmatrix} a\\ b\\ c \end{pmatrix} \times \begin{pmatrix} d\\ e\\ f \end{pmatrix} = \begin{pmatrix} bf - ce\\ cd - af\\ ae - bd \end{pmatrix}.
\]
I 21. (page 100) Show that in a Euclidean space every pair of vectors satisfies

Proof.
8 Spectral Theory of Self-Adjoint Mappings

Comments:
1) In Theorem 4, the eigenvectors of H can be complex (the proof did not show they are real), although the eigenvalues of H are real.
2) The following result will help us understand some details in the proof of Theorem 4 (page 108, "It follows from this easily that we may choose an orthonormal basis consisting of real eigenvectors in each eigenspace N_a.")
Proposition 8.1. Let X be a conjugate invariant subspace of C^n (i.e. X is invariant under the conjugate operation). Then we can find a basis of X consisting of real vectors.

Proof. We work by induction. First, assume dim X = 1. ∀v ∈ X with v ≠ 0, we must have Re v ∈ X and Im v ∈ X. At least one of them is non-zero and can be taken as a basis. Suppose for all conjugate invariant subspaces with dimension no more than k the claim is true. Let dim X = k + 1 and take v ∈ X with v ≠ 0. If Re v and Im v are (complex) linearly dependent, there must exist c ∈ C and v_0 ∈ R^n such that v = cv_0, and we let Y = span{v_0}; if Re v and Im v are (complex) linearly independent, we let Y = span{v, v̄} = span{Re v, Im v}. In either case, Y is conjugate invariant. Let Y^⊥ = {x ∈ X : Σ_{i=1}^n x_i y_i = 0, ∀y = (y_1, ..., y_n) ∈ Y}. Then clearly X = Y ⊕ Y^⊥ and Y^⊥ is also conjugate invariant. By assumption, we can choose a basis of Y^⊥ consisting exclusively of real vectors. Combined with the real basis of Y, we get a real basis of X.
3) For an elementary proof of Theorem 4 by mathematical induction, see [12, page 297], Theorem 5.9.4.
4) Theorem 5 (the spectral resolution representation of self-adjoint operators) can be extended to infinite dimensional spaces and is phrased as: any self-adjoint operator can be decomposed as an integral with respect to orthogonal projections. See any textbook on functional analysis for details.
5) For the second proof of Theorem 4, compare Spivak [13, page 122], Exercise 5-17 and Keener [5, page 15], Theorem 1.6 (the maximum principle).
Supplementary notes

In view of the spectral theorem (Theorem 7 of Chapter 6, p.70), the diagonalization of a self-adjoint matrix A is reduced to showing that in the decomposition

C^n = N_{d_1}(λ_1) ⊕ ··· ⊕ N_{d_k}(λ_k),

we must have d_i = 1, i = 1, ..., k. Indeed, assume for some λ, d(λ) > 1. Then for any x ∈ N_2(λ) \ N_1(λ), we have

(λI − A)x ≠ 0, (λI − A)²x = 0.

But the second equation implies

0 = ((λI − A)²x, x) = ((λI − A)x, (λI − A)x) = ||(λI − A)x||²,

a contradiction. So we must have d(λ) = 1. This is the substance of the proof of Theorem 4, part (b).
I 2. (page 104) We have described above an algorithm for diagonalizing q; implement it as a computer
program.
Solution. Skipped for this version.
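As a partial substitute (our sketch, not the manual's program), here is a minimal congruence diagonalization of a quadratic form in Python; it assumes all pivots encountered are nonzero (no pivoting or zero-pivot handling):

import numpy as np

def diagonalize_form(H):
    # Return (C, D) with C^T H C = D diagonal, for symmetric H with nonzero pivots.
    n = H.shape[0]
    C = np.eye(n)
    D = H.astype(float).copy()
    for k in range(n):
        for i in range(k + 1, n):
            m = D[i, k] / D[k, k]
            D[i, :] -= m * D[k, :]     # row operation ...
            D[:, i] -= m * D[:, k]     # ... and the matching column operation
            C[:, i] -= m * C[:, k]     # record the change of coordinates
    return C, np.diag(np.diag(D))

H = np.array([[2.0, 1.0], [1.0, 3.0]])
C, D = diagonalize_form(H)
print(np.allclose(C.T @ H @ C, D))     # True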
Proof. We prove p_+ + p_0 = max_{q(S)≥0} dim S; p_− + p_0 = max_{q(S)≤0} dim S can be proved similarly. We shall use representation (11) for q in terms of the coordinates z_1, ..., z_n; suppose we label them so that d_1, ..., d_p are nonnegative, where p = p_+ + p_0, and the rest are negative. Define the subspace S_1 to consist of all vectors for which z_{p+1} = ··· = z_n = 0. Clearly, dim S_1 = p and q is nonnegative on S_1. This shows p_+ + p_0 = p ≤ max_{q(S)≥0} dim S. If < held, there would exist a subspace S_2 such that q ≥ 0 on S_2 and dim S_2 > p = p_+ + p_0. Define P: S_2 → S_3 := {z : z_{p+1} = ··· = z_n = 0} by P(z) = (z_1, ..., z_p, 0, ..., 0). Since dim S_2 > p = dim S_3, there exists some z ∈ S_2 such that z ≠ 0 and P(z) = 0. This implies z_1 = ··· = z_p = 0. So q(z) = Σ_{i=1}^p d_i z_i² + Σ_{i=p+1}^n d_i z_i² = Σ_{i=p+1}^n d_i z_i² < 0, a contradiction. Therefore, our assumption is not true and < cannot hold.
I 4. (page 109) Show that the columns of M are the eigenvectors of H.
Proof. Write M in the column form M = [c_1, ..., c_n] and multiply M on both sides of formula (24); we get

\[
HM = [Hc_1, \dots, Hc_n] = MD = [c_1, \dots, c_n]\begin{pmatrix} \lambda_1 & 0 & \cdots & 0\\ 0 & \lambda_2 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} = [\lambda_1 c_1, \dots, \lambda_n c_n],
\]

i.e. Hc_i = λ_i c_i for each i: the columns of M are eigenvectors of H.
I 5. (page 118) (a) Show that the minimum problem (47) has a nonzero solution f.
(b) Show that a solution f of the minimum problem (47) satisfies the equation

Hf = bMf,

\[
\min_{(y, Mf) = 0} \frac{(y, Hy)}{(y, My)}
\]

Hg = cMg,

Applying the second proof of Theorem 4, with (·,·) replaced by ⟨·,·⟩ and H replaced by M⁻¹H, we can verify claims (a)-(d) are true.

Solution. This is just Theorem 11 with (·,·) replaced by ⟨·,·⟩ and H replaced by M⁻¹H, where ⟨·,·⟩ is defined in the solution of Exercise 5.
Solution. Skipped for this version.
I 10. (page 119) Prove Theorem 12. (Hint: Use Theorem 8.)
Proof. By Theorem 8, we can assume N has an orthonormal basis v_1, ..., v_n consisting of genuine eigenvectors. We assume the eigenvalue corresponding to v_j is n_j. Then by letting x = v_j, j = 1, ..., n, and by the definition of ||N||, we can conclude ||N|| ≥ max_j |n_j|. Meanwhile, ∀x ∈ X with ||x|| = 1, there exist a_1, ..., a_n ∈ C so that Σ_j |a_j|² = 1 and x = Σ_j a_j v_j. So

\[
\frac{\|Nx\|}{\|x\|} = \Big\|\sum_j a_j n_j v_j\Big\| = \Big(\sum_j |a_j n_j|^2\Big)^{1/2} \le \max_{1\le j\le n}|n_j| \Big(\sum_j |a_j|^2\Big)^{1/2} = \max_j |n_j|.
\]

This implies ||N|| ≤ max_j |n_j|. Combined, we can conclude ||N|| = max_j |n_j|.
Remark 10. Compare the above result with formula (48) and Theorem 18 of Chapter 7.
I 11. (page 119) We define the cyclic shift mapping S, acting on vectors in C^n, by S(a_1, a_2, ..., a_n) = (a_n, a_1, ..., a_{n−1}).
(a) Prove that S is an isometry in the Euclidean norm.
(b) Determine the eigenvalues and eigenvectors of S.
(c) Verify that the eigenvectors are orthogonal.

Proof. |S(a_1, ..., a_n)| = |(a_n, a_1, ..., a_{n−1})| = |(a_1, ..., a_n)|. So S is an isometry in the Euclidean norm. To determine the eigenvalues and eigenvectors of S, note under the canonical basis e_1, ..., e_n, S corresponds to the matrix

\[
A = \begin{pmatrix} 0 & 0 & \cdots & 0 & 1\\ 1 & 0 & \cdots & 0 & 0\\ 0 & 1 & \cdots & 0 & 0\\ \vdots & & \ddots & & \vdots\\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix},
\]

whose characteristic polynomial is p(s) = det(A − sI) = (−s)^n + (−1)^{n+1}. So the eigenvalues of S are the solutions of the equation s^n = 1, i.e. λ_k = e^{2πki/n}, k = 1, ..., n. Solving the equation Sx_k = λ_k x_k, we obtain the general solution x_k = (λ_k^{n−1}, λ_k^{n−2}, ..., λ_k, 1)^T. After normalization, x_k = (1/√n)(λ_k^{n−1}, λ_k^{n−2}, ..., λ_k, 1)^T. Therefore, for i ≠ j,

\[
(x_i, x_j) = \frac{1}{n}\sum_{k=1}^n \lambda_i^{k-1} \bar\lambda_j^{k-1} = \frac{1}{n}\sum_{k=1}^n (\lambda_i \bar\lambda_j)^{k-1} = \frac{1}{n}\cdot\frac{1 - (\lambda_i \bar\lambda_j)^n}{1 - \lambda_i \bar\lambda_j} = 0.
\]
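A numerical sketch (ours) that builds the cyclic shift on C^5 and checks the spectrum and the orthogonality of the eigenvectors:

import numpy as np

n = 5
S = np.roll(np.eye(n), 1, axis=0)        # (Sx)_0 = x_{n-1}, (Sx)_1 = x_0, ...
lam = np.sort_complex(np.linalg.eigvals(S))
roots = np.sort_complex(np.exp(2j * np.pi * np.arange(n) / n))
print(np.allclose(lam, roots))           # True: eigenvalues are nth roots of unity
_, vecs = np.linalg.eig(S)
G = vecs.conj().T @ vecs                 # Gram matrix of the unit eigenvectors
print(np.allclose(G, np.eye(n)))         # True: they are orthonormal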
I 13. (page 120) What is the norm of the matrix

\[
\begin{pmatrix} 1 & 0 & 1\\ 2 & 3 & 0 \end{pmatrix}?
\]
9 Calculus of Vector- and Matrix-Valued Functions

Erratum: In Exercise 6 (p.129), we should have det e^A = e^{tr A} instead of det e^A = e^{A}.

Comments: In the proof of Theorem 11, to see why (sI − A)^{d−1} ≠ 0 (∗) holds, see Lemma 3 of Appendix 6, formula (9) (p.369).
I 1. (page 122) Prove the fundamental lemma for vector valued functions. (Hint: Show that for every vector y, (x(t), y) is a constant.)

Proof. Following the hint, note d/dt (x(t), y) = (ẋ(t), y) = 0. So (x(t), y) is a constant by the fundamental lemma for scalar valued functions. Therefore (x(t) − x(0), y) = 0, ∀y ∈ K^n. This implies x(t) ≡ x(0).
I 2. (page 124) Derive formula (3) using product rule (iii).

Proof. A⁻¹(t)A(t) = I. So

0 = d/dt [A⁻¹(t)A(t)] = [d/dt A⁻¹(t)] A(t) + A⁻¹(t)Ȧ(t),

and therefore d/dt A⁻¹(t) = −A⁻¹(t)Ȧ(t)A⁻¹(t).
Therefore, we have

\[
\exp\{A + B\} = \sum_{n=0}^\infty \frac{1}{n!}\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}^n = \sum_{k=0}^\infty \frac{1}{(2k)!}\, I_{2\times2} + \sum_{k=0}^\infty \frac{1}{(2k+1)!}\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} = \frac{e + e^{-1}}{2}\, I_{2\times2} + \frac{e - e^{-1}}{2}\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}.
\]
Proof. For any ε > 0, there exists M > 0 so that ∀m ≥ M, sup_t ||E′_m(t) − F(t)|| < ε. So ∀m ≥ M and ∀t, h,

\[
\left\| \frac{1}{h}[E_m(t+h) - E_m(t)] - F(t) \right\| = \left\| \frac{1}{h}\int_t^{t+h} [E_m'(s) - F(s)]\,ds + \frac{1}{h}\int_t^{t+h} F(s)\,ds - F(t) \right\| \le \frac{\int_t^{t+h} \|E_m'(s) - F(s)\|\,ds}{h} + \left\| \frac{1}{h}\int_t^{t+h} F(s)\,ds - F(t) \right\| < \varepsilon + \left\| \frac{1}{h}\int_t^{t+h} F(s)\,ds - F(t) \right\|.
\]
I 5. (page 129) Carry out the details of the argument that Ė_m(t) converges.

Proof. By formula (12), Ė_m(t) = Σ_{k=1}^m (1/k!) Σ_{i=0}^{k−1} A^i(t)Ȧ(t)A^{k−i−1}(t). So for m and n with m < n,

\[
\|\dot E_m(t) - \dot E_n(t)\| \le \sum_{k=m+1}^n \frac{1}{k!} \sum_{i=0}^{k-1} \|A^i(t)\dot A(t)A^{k-i-1}(t)\| \le \sum_{k=m+1}^n \frac{1}{k!} \sum_{i=0}^{k-1} \|A(t)\|^{k-1}\|\dot A(t)\| = \sum_{k=m+1}^n \frac{\|A(t)\|^{k-1}}{(k-1)!}\, \|\dot A(t)\| = \|\dot A(t)\|\,[e_n(\|A(t)\|) - e_m(\|A(t)\|)] \to 0,
\]

where e_m denotes the mth partial sum of the exponential series.
I 6. (page 129) Apply formula (10) to Y(t) = e^{At} and show that

det e^A = e^{tr A}.
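A numerical spot check of the identity (our addition; uses scipy for the matrix exponential):

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
print(np.isclose(np.linalg.det(expm(A)), np.exp(np.trace(A))))  # True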
I 8. (page 142) (a) Show that the set of all complex, self-adjoint n × n matrices forms an N = n²-dimensional linear space over the reals.
(b) Show that the set of complex, self-adjoint n × n matrices that have one double and n − 2 simple eigenvalues can be described in terms of N − 3 real parameters.

(a)

Proof. The total number of free entries is n(n+1)/2. The entries on the diagonal must be real. So the dimension is (n(n+1)/2)·2 − n = n².

(b)

Proof. Similar to the argument in the text, the total number of complex parameters that determine the eigenvectors is (n − 1) + ··· + 2 = n(n−1)/2 − 1. This is equivalent to n(n − 1) − 2 real parameters. The number of distinct (real) eigenvalues is n − 1. So the dimension is n² − n − 2 + n − 1 = n² − 3.
I 9. (page 142) Choose in (41) at random two self-adjoint 10 × 10 matrices M and B. Using available software (MATLAB, MAPLE, etc.) calculate and graph at suitable intervals the 10 eigenvalues of B + tM as functions of t over some t-segment.

Solution. See the Matlab/Octave program aoc.m below.

function aoc
  % graph the 10 eigenvalues of B + t*M for random self-adjoint M, B
  A = randn(10) + 1i*randn(10); M = (A + A')/2;
  C = randn(10) + 1i*randn(10); B = (C + C')/2;
  ts = linspace(-2, 2, 201); ev = zeros(10, numel(ts));
  for k = 1:numel(ts)
    ev(:,k) = sort(real(eig(B + ts(k)*M)));  % self-adjoint: real spectrum
  end
  plot(ts, ev); xlabel('t'); ylabel('eig(B + tM)'); hold off;
end
10 Matrix Inequalities
The book's own solution gives answers to Ex 1, 3, 4, 5, 6, 7, 10, 11, 12, 13, 14, 15.
I 1. (page 146) How many square roots are there of a positive mapping?
I 2. (page 146) Formulate and prove properties of nonnegative mappings similar to parts (i), (ii), (iii), (iv), and (vi) of Theorem 1.

Proposition 10.1. (i) The identity I is nonnegative. (ii) If M and N are nonnegative, so is their sum M + N, as well as aM for any nonnegative number a. (iii) If H is nonnegative and Q is invertible, we have Q*HQ ≥ 0. (iv) H is nonnegative if and only if all its eigenvalues are nonnegative. (vi) Every nonnegative mapping has a nonnegative square root, uniquely determined.

Proof. (i) and (ii) are obvious. For part (iii), we write the quadratic form associated with Q*HQ as

(x, Q*HQx) = (Qx, HQx) = (y, Hy) ≥ 0,

where y = Qx. For part (iv), by the self-adjointness of H, there exists an orthogonal basis of eigenvectors. Denote these by h_j and the corresponding eigenvalues by a_j. Then any vector x can be expressed as a linear combination of the h_j's: x = Σ_j x_j h_j. So (x, Hx) = Σ_{i,j}(x_i h_i, x_j a_j h_j) = Σ_{j=1}^n a_j |x_j|². From the formula it is clear that (x, Hx) ≥ 0 for any x if and only if a_j ≥ 0, ∀j. For part (vi), the proof is similar to that for positive mappings and we omit the lengthy details. Cf. also the solution to Exercise 10.1.
I 3. (page 149) Construct two real, positive 2 2 matrices whose symmetrized product is not positive.
Solution. Let A be a mapping that maps the vector (0, 1) to (0, 2 ) with 2 > 0 suciently small and
(1, 0) to (1 , 0) with 1 > 0 suciently large. Let B be a mapping that maps the vector (1, 1) to (1 , 1 )
with 1 > 0 suciently small and (1, 1) to (2 , 2 ) with 2 > 0 suciently large. Then both A and B
are positive mappings, and we can nd x between (1, 1) and (0, 1) so that (Ax, Bx) < 0. By(the analysis
) in
0
the paragraph below formula (14) , AB + BA is not positive. More precisely, we have A = 1
and
0 2
( )
1 + 2 1 2
B = 21 .
1 2 1 + 2
I 4. (page 151) Show that if 0 < M < N, then (a) M^{1/4} < N^{1/4}. (b) M^{1/m} < N^{1/m}, m a power of 2. (c) log M ≤ log N.

Proof. By Theorem 5 and induction, it is easy to prove (a) and (b). For (c), we follow the hint. If M has the spectral resolution M = Σ_{i=1}^k λ_i P_i, log M is defined as

\[
\log M = \sum_{i=1}^k (\log \lambda_i) P_i = \lim_{m\to\infty} \sum_{i=1}^k m(\lambda_i^{1/m} - 1) P_i = \lim_{m\to\infty} m\Big(\sum_{i=1}^k \lambda_i^{1/m} P_i - \sum_{i=1}^k P_i\Big) = \lim_{m\to\infty} m(M^{1/m} - I).
\]

So log M = lim_{m→∞} m(M^{1/m} − I) ≤ lim_{m→∞} m(N^{1/m} − I) = log N.
I 5. (page 151) Construct a pair of mappings 0 < M < N such that M² is not less than N². (Hint: Use Exercise 3.)

Solution. (From the textbook's solution, page 291) Choose A and B as in Exercise 3, that is, positive matrices whose symmetrized product is not positive. Set

M = A, N = A + tB;

then

N² = A² + t(AB + BA) + t²B²;

for t small the term t²B² is negligible compared with the linear term. Therefore for t small, N² is not greater than M².
I 6. (page 151) Verify that (19) defines f(z) for a complex argument z as an analytic function, as well as that Im f(z) > 0 for Im z > 0.

Proof. For f(z) = az + b − ∫_0^∞ dm(t)/(z + t), we have

f(z + Δz) − f(z) = aΔz + Δz ∫_0^∞ dm(t)/((z + Δz + t)(z + t)).

So if we can show lim_{Δz→0} ∫_0^∞ dm(t)/((z + Δz + t)(z + t)) exists and is finite, f(z) is analytic by definition. Indeed, if Im z > 0, for Δz sufficiently small we have

1/|z + Δz + t| ≤ 1/(|z + t| − |Δz|) ≤ 1/(Im z − |Δz|) ≤ 2/Im z.

So by the Dominated Convergence Theorem, lim_{Δz→0} ∫_0^∞ dm(t)/((z + Δz + t)(z + t)) exists and is equal to ∫_0^∞ dm(t)/(z + t)². Moreover,

Im f(z) = a Im z + Im z ∫_0^∞ dm(t)/|z + t|²,

which is positive for Im z > 0.

Remark 11. This exercise can be used to verify formula (19) on p.151.
Proof. Consider the Euclidean space L²((−∞, 1]) with the inner product (f, g) := ∫_{−∞}^1 f(t)g(t) dt. Choose f_j = e^{r_j(t−1)}, j = 1, ..., m; then the associated Gram matrix is

\[
G_{ij} = (f_i, f_j) = \int_{-\infty}^1 e^{(r_i + r_j)(t-1)}\,dt = \frac{1}{r_i + r_j}.
\]

Clearly, (f_j)_{j=1}^m are linearly independent. So G is positive.
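A numerical illustration (ours; the exponents r_j below are arbitrary positive values):

import numpy as np

r = np.array([0.5, 1.0, 2.0, 3.0])       # hypothetical distinct positive exponents
G = 1.0 / (r[:, None] + r[None, :])      # G_ij = 1/(r_i + r_j), the Gram matrix
print(np.linalg.eigvalsh(G) > 0)         # all True: G is positive definite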
I 9. (page 162) Extend Theorem 14 to the case when dim V = dim U − m, where m is greater than 1.

Proof. The extension is straightforward: just replace the paragraph (on page 161) "If S is a subspace of V, then T = S and dim T = dim S. ... It follows that dim T ≥ dim S − 1, as asserted." with the following one: Let T = S ∩ V and T_1 = S ∩ V^⊥, where V^⊥ stands for the complement of V in U. Then dim T + dim T_1 ≥ dim S. Since dim T_1 ≤ dim V^⊥ = n − (n − m) = m, dim T ≥ dim S − m. The rest of the proof is the same as the proof of Theorem 14.
I 12. (page 168) Prove that if the self-adjoint part of Z is positive, then Z is invertible, and the self-adjoint part of Z⁻¹ is positive.

Proof. Assume Z is not invertible. Then there exists x ≠ 0 such that Zx = 0. In particular, this implies (x, Zx) = (x, Z*x) = 0. Summing these two, we get (x, (Z + Z*)x) = 0, contradictory to the assumption that the self-adjoint part of Z is positive. So Z is invertible. For any x ≠ 0, there exists y ≠ 0 so that x = Zy. So

(x, Z⁻¹x) + (Z⁻¹x, x) = (Zy, y) + (y, Zy) = (y, (Z + Z*)y) > 0,

which shows the self-adjoint part of Z⁻¹ is positive.
I 13. (page 170) Let A be any mapping of a Euclidean space into itself. Show that AA* and A*A have the same eigenvalues with the same multiplicity.

Proof. Exercise 14 has proved the claim for non-zero eigenvalues. Since the dimensions of the spaces of generalized eigenvectors of AA* and A*A are both equal to the dimension of the underlying Euclidean space, we conclude by the Spectral Theorem that their zero eigenvalues must have the same multiplicity.
I 14. (page 171) Let A be a mapping of a Euclidean space into another Euclidean space. Show that AA* and A*A have the same nonzero eigenvalues with the same multiplicity.
I 16. (page 171) Verify that the commutator (50) of two self-adjoint matrices is anti-self-adjoint.

Proof. Suppose A and B are self-adjoint. Then for any x and y,

(x, (AB − BA)*y) = ((AB − BA)x, y) = (ABx, y) − (BAx, y) = (x, BAy) − (x, ABy) = −(x, (AB − BA)y).
11 Kinematics and Dynamics

I 2. (page 174) Suppose that A is independent of t; show that the solution of equation (11) satisfying the initial condition (5) is

M(t) = e^{tA}.

Proof. lim_{h→0} (M(t+h) − M(t))/h = lim_{h→0} e^{tA}(e^{hA} − I)/h = Ae^{tA}, i.e. Ṁ(t) = AM(t). Clearly M(0) = I.
I 3. (page 174) Show that when A depends on t, equation (11) is not solved by

M(t) = e^{∫_0^t A(s)ds},

unless A(t) commutes with ∫_0^t A(s)ds.

Proof. The reason we need commutativity is that the following identity is required in the calculation of the derivative:

\[
\frac{1}{h}(M(t+h) - M(t)) = \frac{1}{h}\Big( e^{\int_0^{t+h} A(s)ds} - e^{\int_0^t A(s)ds} \Big) = \frac{1}{h}\Big( e^{\int_0^t A(s)ds + \int_t^{t+h} A(s)ds} - e^{\int_0^t A(s)ds} \Big) = \frac{1}{h}\, e^{\int_0^t A(s)ds}\Big( e^{\int_t^{t+h} A(s)ds} - I \Big),
\]

i.e. e^{∫_0^t A(s)ds + ∫_t^{t+h} A(s)ds} = e^{∫_0^t A(s)ds} e^{∫_t^{t+h} A(s)ds}. So when this commutativity holds,

Ṁ(t) = lim_{h→0} (1/h)(M(t+h) − M(t)) = M(t)A(t).
I 4. (page 175) Show that if A in (15) is not equal to 0, then all vectors annihilated by A are multiples of (16).

Proof. By discussing the various possibilities (a, b, c = 0 or not), we can check f is a multiple of (c, −b, a)^T.

I 5. (page 175) Show that the two other eigenvalues of A are ±i√(a² + b² + c²).

Proof.

\[
\det(\lambda I - A) = \det\begin{pmatrix} \lambda & -a & -b\\ a & \lambda & -c\\ b & c & \lambda \end{pmatrix} = \lambda(\lambda^2 + c^2) + a(a\lambda + bc) - b(ac - b\lambda) = \lambda^3 + (a^2 + b^2 + c^2)\lambda.
\]
Proof. Since A = \begin{pmatrix} 0 & a & b\\ -a & 0 & c\\ -b & -c & 0 \end{pmatrix} is anti-symmetric, M(t)M(t)^T = e^{tA}(e^{tA})^T = e^{tA}e^{tA^T} = e^{tA}e^{−tA} = I. By Exercise 7 of Chapter 9, all eigenvalues of e^{At} have the form e^{at}, where a is an eigenvalue of A. Since the eigenvalues of A are 0 and ±iκ with κ = √(a² + b² + c²) (Exercise 5), the eigenvalues of e^{At} are 1 and e^{±iκt}. This implies det e^{At} = 1·e^{iκt}·e^{−iκt} = 1. By Theorem 1, M = e^{At} is a rotation. Let f be given by formula (16). From Af = 0 we deduce that e^{At}f = f; thus f is the axis of the rotation e^{At}. The trace of e^{At} is 1 + e^{iκt} + e^{−iκt} = 2cos(κt) + 1. According to formula (4), the angle of rotation θ of e^{At} satisfies 2cos θ + 1 = tr e^{At}. This shows that θ = κt = t√(a² + b² + c²).
[A, B] = AB − BA
I 8. (page 177) Let A denote the 3 × 3 matrix (15); we denote the associated null vector (16) by f_A. Obviously, f_A depends linearly on A.
(a) Let A and B denote two 3 × 3 antisymmetric matrices. Show that

tr AB = −2(f_A, f_B).

(b) Show that

f_{[A,B]} = f_A × f_B.

Proof. It suffices to note that the set of 2n functions {(cos c_j t)h_j, (sin c_j t)h_j}_{j=1}^n is linearly independent, since any two of them are orthogonal when their subscripts are distinct.
12 Convexity
The book's own solution gives answers to Ex 2, 6, 7, 8, 10, 16, 19, 20.

Comments:
1) The following results will help us understand some details in the proofs of Theorem 6 and Theorem 10.

Proposition 12.1. Let S be an arbitrary subset of X and x an interior point of S. For any real linear function l defined on X with l ≢ 0, l(x) is an interior point of Γ = l(S) in the topological sense.

Proof. We can find y ∈ X so that l(y) ≠ 0. Then for t sufficiently small, l(x) + t l(y) = l(x + ty) ∈ Γ. So Γ contains an interval which contains l(x), i.e. l(x) is an interior point of Γ under the topology of R¹.

Corollary 1. If K is an open convex set and l is a linear function with l ≢ 0, then Γ = l(K) is an open interval.
Proposition 12.2. Let K be a convex set and K0 the set of all interior points of K. Then K0 is convex
and open.
Proof. (Convexity) Fix x, y ∈ K_0 and a ∈ [0, 1]. For any z ∈ X, [ax + (1−a)y] + tz = a(x + tz) + (1−a)(y + tz) ∈ K when t is sufficiently small, since x, y are interior points of K and K is convex.

(Openness) Fix x ∈ K_0 and y_1 ∈ X. We need to show that for t sufficiently small, x + ty_1 ∈ K_0. Indeed, for every y_2 ∈ X we can find a common δ > 0 so that whenever (t_1, t_2) ∈ [−δ, δ] × [−δ, δ], x + t_1y_1 ∈ K and x + t_2y_2 ∈ K. Fix any t′ ∈ [−δ/2, δ/2]; by the convexity of K, x + t′y_1 + t″y_2 = ½(x + 2t′y_1) + ½(x + 2t″y_2) ∈ K when t″ ∈ [−δ/2, δ/2]. This shows x + t′y_1 ∈ K_0. Since t′ is arbitrarily chosen from [−δ/2, δ/2], we conclude that for t sufficiently small, x + ty_1 ∈ K_0. That is, x is an interior point of K_0. By the arbitrariness of x, K_0 is open.
2) Regarding Theorem 10 (Carathéodory): (i) Among the three conditions, convexity is the essential one; closedness and boundedness are there to guarantee that K has extreme points. (ii) The solution to Exercise 14 may help us understand the proof of Theorem 10. When a convex set has no interior points, it's often useful to realize that the dimension can be reduced by 1. (iii) To understand "... then all points x on the open segment bounded by x_0 and x_1 are interior points of K", we note that if this is not true, then we can find y such that for all t > 0 or for all t < 0, x + ty ∉ K. Without loss of generality, assume x + ty ∉ K for all t > 0. For t sufficiently small, x_0 + ty ∈ K, so the segment [x_1, x_0 + ty] ⊂ K. But this segment necessarily intersects the ray {x + ty : t > 0}. A contradiction. (iv) We can summarize the idea of the proof as follows. Dimension one is clear, so we work by induction and have two scenarios. Scenario one, K has no interior points. Then the dimension is reduced by 1 and we are done. Scenario two, K has interior points. Then intuition shows any interior point lies on a segment with one endpoint an extreme point and the other a boundary point; a boundary point resides on a hyperplane, whose dimension is reduced by 1. By induction, we are done.
Proof. Fix a point x ∈ {z : l(z) < c}. For any y ∈ X, f(t) = l(x + ty) = l(x) + t·l(y) is a continuous function of t, with f(0) = l(x) < c. By continuity, f(t) < c for t sufficiently small. So x + ty ∈ {z : l(z) < c} for t sufficiently small, i.e. x is an interior point. Since x is arbitrarily chosen, we have proved {z : l(z) < c} is open.
I 4. (page 188) Show that if A is an open convex set and B is convex, then A + B is open and convex.
Proof. The convexity of A + B is Theorem 1(b). To see the openness, fix x ∈ A and y ∈ B. For any z ∈ X, (x + y) + tz = (x + tz) + y. For t sufficiently small, x + tz ∈ A, so (x + y) + tz ∈ A + B for t sufficiently small. This shows A + B is open.
I 5. (page 188) Let X be a Euclidean space, and let K be the open ball of radius a centered at the origin:
||x|| < a.
(i) Show that K is a convex set.
(ii) Show that the gauge function of K is p(x) = ||x||/a.
Proof. That K is a convex set is trivial to see. It's also clear that p(0) = 0. For any x ∈ Rⁿ \ {0} and ε ∈ (0, a), r = ||x||/(a − ε) satisfies r > 0 and x/r ∈ K. So p(x) ≤ ||x||/(a − ε). By letting ε → 0, we conclude p(x) ≤ ||x||/a. If "<" held, we could find r > 0 such that r < ||x||/a and x/r ∈ K. But r < ||x||/a implies ||x||/r > a and hence x/r ∉ K. Contradiction. Combined, we conclude p(x) = ||x||/a.
I 6. (page 188) In the (u, v) plane take K to be the quarter-plane u < 1, v < 1. Show that the gauge
function of K is
p(u, v) =
  0           if u ≤ 0, v ≤ 0,
  v           if 0 < v, u ≤ 0,
  u           if 0 < u, v ≤ 0,
  max(u, v)   if 0 < u, 0 < v.
q_S(m + l) = sup_{x∈S} (l + m)(x) < (l + m)(x(ε)) + ε ≤ sup_{x∈S} l(x) + sup_{x∈S} m(x) + ε = q_S(m) + q_S(l) + ε.
Proof. q_{S+T}(l) = sup_{x∈S, y∈T} l(x + y) = sup_{x∈S, y∈T} [l(x) + l(y)] ≤ q_S(l) + q_T(l). Conversely, for every ε > 0 there exist x_0 ∈ S and y_0 ∈ T s.t. q_S(l) < l(x_0) + ε/2 and q_T(l) < l(y_0) + ε/2. So q_S(l) + q_T(l) < l(x_0 + y_0) + ε ≤ q_{S+T}(l) + ε. By the arbitrariness of ε, q_S(l) + q_T(l) ≤ q_{S+T}(l). Combined, we get

q_{S+T}(l) = q_S(l) + q_T(l).
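For finite point sets, where the sup is a max, the additivity is easy to test numerically; a small sketch (random sets of our choosing):

    import numpy as np

    rng = np.random.default_rng(1)
    S = rng.standard_normal((5, 2))     # the points of S (one per row)
    T = rng.standard_normal((7, 2))     # the points of T
    l = rng.standard_normal(2)          # a linear functional l(x) = l . x

    q = lambda P: np.max(P @ l)         # support function q_P(l) for finite P
    ST = (S[:, None, :] + T[None, :, :]).reshape(-1, 2)   # the sum set S + T
    print(np.isclose(q(ST), q(S) + q(T)))                 # True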
I 10. (page 193) Show that q_{S∪T}(l) = max{q_S(l), q_T(l)}.

Proof. q_{S∪T}(l) = sup_{x∈S∪T} l(x) ≥ sup_{x∈S} l(x) = q_S(l). Similarly, q_{S∪T}(l) ≥ q_T(l). Therefore q_{S∪T}(l) ≥ max{q_S(l), q_T(l)}. For any ε > 0 sufficiently small, we can find x_ε ∈ S∪T such that q_{S∪T}(l) ≤ l(x_ε) + ε. But l(x_ε) ≤ max{q_S(l), q_T(l)}. So q_{S∪T}(l) ≤ max{q_S(l), q_T(l)} + ε. Letting ε → 0, we get q_{S∪T}(l) ≤ max{q_S(l), q_T(l)}. Combined, we conclude q_{S∪T}(l) = max{q_S(l), q_T(l)}.
I 11. (page 194) Show that a closed half-space as defined by (4) is a closed convex set.
Proof. Convexity: if l(x) ≤ c and l(y) ≤ c, then l(ax + (1−a)y) = a·l(x) + (1−a)·l(y) ≤ c for any a ∈ [0, 1]. Closedness in the linear sense: if l(ax + (1−a)y) ≤ c for all a ∈ (0, 1), then by continuity (letting a → 1 and a → 0) we have l(x) ≤ c and l(y) ≤ c. This shows {x : l(x) ≤ c} is a closed convex set.
I 12. (page 194) Show that the closed unit ball in Euclidean space, consisting of all points ||x|| ≤ 1, is a closed convex set.
Proof. Convexity is obvious. For closedness, note f(t) = ||tx + (1−t)y|| is a continuous function of t. So if f(t) ≤ 1 for all t ∈ (0, 1), then f(0) = ||y|| ≤ 1 and f(1) = ||x|| ≤ 1. So the unit ball B(0, 1) is closed. Combined, we conclude B(0, 1) = {x : ||x|| ≤ 1} is a closed convex set.
I 13. (page 194) Show that the intersection of closed convex sets is a closed convex set.
Proof. Suppose H and K are closed convex sets. Theorem 1(a) says H ∩ K is also convex. Moreover, if for all a ∈ (0, 1), ax + (1−a)y ∈ H ∩ K, then the closedness of H and K implies x, y ∈ H and x, y ∈ K, i.e. x, y ∈ H ∩ K. So H ∩ K is closed.
I 14. (page 194) Complete the proof of Theorems 7 and 8.
Proof. Proof of Theorem 7: Suppose K has an interior point x_0. If a linear function l and a real number c determine a closed half-space that contains K − x_0 but not y − x_0, i.e. l(x − x_0) ≤ c for all x ∈ K and l(y − x_0) > c, then l and c + l(x_0) determine a closed half-space that contains K but not y, i.e. l(x) ≤ c + l(x_0) and l(y) > c + l(x_0). So without loss of generality, we can assume x_0 = 0. Note that convexity and closedness are preserved under translation, so this simplification is all right for this problem's purpose.

Define the gauge function p_K as in (5). Then we can show p_K(x) ≤ 1 if and only if x ∈ K. Indeed, if x ∈ K, then p_K(x) ≤ 1 by definition. Conversely, if p_K(x) ≤ 1, then for any ε > 0 there exists r(ε) < 1 + ε so that x/r(ε) ∈ K. We choose r(ε) > 1 and note x/r(ε) = a(ε)·0 + (1 − a(ε))x with a(ε) = 1 − 1/r(ε). As r(ε) can be as close to 1 as we want when ε → 0, a(ε) can be as close to 0 as we want. Meanwhile, 0 is an interior point of K, so for r large enough, x/r ∈ K; this shows that for a close enough to 1, a·0 + (1−a)x ∈ K. Combined, we conclude K contains the open segment {a·0 + (1−a)x : 0 < a < 1}. By the definition of closedness, x ∈ K. The rest of the proof is completely analogous to that of Theorem 3, with p(x) < 1 replaced by p(x) ≤ 1.

If K has no interior point, we have two possibilities. Case one, y and K are not on the same hyperplane. In this case, there exist a linear function l and a real number c such that l(x) = c (x ∈ K) but l(y) ≠ c. By considering −l if necessary, we can have l(x) = c (x ∈ K) and l(y) > c. So the half-space {x : l(x) ≤ c} contains K, but not y. Case two, y and K reside on the same hyperplane. Then the dimension of the ambient space for y and K can be reduced by 1. Working by induction and noting that the space is of finite dimension, we can finish the proof.
Proof of Theorem 8: By the definition (16), if x ∈ K, then l(x) ≤ q_K(l) for all l ∈ X′. Conversely, suppose y is not in K; then there exist l ∈ X′ and a real number c such that l(x) ≤ c for all x ∈ K and l(y) > c. This implies l(y) > q_K(l). Combined, we conclude x ∈ K if and only if l(x) ≤ q_K(l) for all l ∈ X′.
Remark: From the above solution and the proof of Theorem 3, we can see a useful routine for proving results on convex sets: first assume the convex set has an interior point and use the gauge function, which often helps to construct the desired linear functionals via the Hahn–Banach Theorem. If there exists no interior point, reduce the dimension by 1 and work by induction. Such a use of interior points as the criterion for a dichotomy is also present in the proof of Theorem 10 (Carathéodory).
I 15. (page 195) Prove Theorem 9.
Proof. Denote by Ŝ the closed convex hull of S, and define Σ_l = {x : l(x) ≤ q_S(l)} for l ∈ X′. Then it is easy to see each Σ_l is a closed convex set containing S, so Ŝ ⊂ ∩_{l∈X′} Σ_l. For the other direction, suppose ∩_{l∈X′} Σ_l \ Ŝ ≠ ∅ and choose a point x from ∩_{l∈X′} Σ_l \ Ŝ. By Theorem 8, there exists l_0 ∈ X′ such that l_0(x) > q_Ŝ(l_0) ≥ q_S(l_0). So x ∉ Σ_{l_0}, a contradiction. Combined, we conclude Ŝ = ∩_{l∈X′} Σ_l = {x : l(x) ≤ q_S(l) for all l ∈ X′}.
I 16. (page 195) Show that if x1 , , xm belong to a convex set, then so does any convex combination of
them.
Proof. Suppose α_1, …, α_m ∈ (0, 1) satisfy ∑_{i=1}^m α_i = 1. We need to show ∑_{i=1}^m α_i x_i ∈ K, where K is the convex set to which x_1, …, x_m belong. Indeed, since

∑_{i=1}^m α_i x_i = (α_1 + ⋯ + α_{m−1}) ∑_{i=1}^{m−1} [α_i/(α_1 + ⋯ + α_{m−1})] x_i + α_m x_m,

it suffices to show ∑_{i=1}^{m−1} [α_i/(α_1 + ⋯ + α_{m−1})] x_i ∈ K. Working by induction, we are done.
I 17. (page 195) Show that an interior point of K cannot be an extreme point.
I 18. (page 197) Verify that every permutation matrix is a doubly stochastic matrix.
Proof. Let S be a permutation matrix as defined in formula (25). Then clearly S_ij ≥ 0. Furthermore, ∑_{i=1}^n S_ij = ∑_{i=1}^n δ_{p(i)j}, where j is fixed and equals p(i_0) for exactly one i_0; so ∑_{i=1}^n S_ij = 1. Finally, ∑_{j=1}^n S_ij = ∑_{j=1}^n δ_{i p^{−1}(j)}, where i is fixed and equals p^{−1}(j_0) for exactly one j_0; so ∑_{j=1}^n S_ij = 1. Combined, we conclude S is a doubly stochastic matrix.
I 19. (page 199) Show that, except for two dimensions, the representation of doubly stochastic matrices as
convex combinations of permutation matrices is not unique.
Proof. The textbook's solution demonstrates the case of dimension 3. Counterexamples for higher dimensions can be obtained by building permutation matrices upon the case of dimension 3.
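For instance, in dimension 3 the doubly stochastic matrix with all entries 1/3 admits two different representations, one by the cyclic permutations and one by the transpositions. A numerical check of this standard example (our choice of counterexample, not necessarily the textbook's):

    import numpy as np

    I = np.eye(3)
    C = np.roll(I, 1, axis=0)           # a 3-cycle; C @ C is the other one
    T12, T13, T23 = I[[1, 0, 2]], I[[2, 1, 0]], I[[0, 2, 1]]  # transpositions

    D = np.full((3, 3), 1 / 3)          # doubly stochastic
    print(np.allclose(D, (I + C + C @ C) / 3))     # True: first representation
    print(np.allclose(D, (T12 + T13 + T23) / 3))   # True: a different one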
I 20. (page 201) Show that if a convex set in a finite-dimensional Euclidean space is open, or closed, or bounded in the linear sense defined above, then it is open, or closed, or bounded in the topological sense, and conversely.
Proof. Suppose K is a convex subset of an n-dimensional linear space X. We have the following properties.

(1) If x is an interior point of K in the linear sense, then x is an interior point of K in the topological sense. Consequently, being open in the linear sense is the same as being open in the topological sense.

Indeed, let e_1, …, e_n be a basis of X. There exists ε > 0 so that for any t_i ∈ (−ε, ε), x + t_i e_i ∈ K, i = 1, …, n. For any y ∈ X close enough to x, the norm of y − x is so small that if we write y = x + ∑_{i=1}^n a_i e_i, then |a_i| < ε/n. Since for t_i ∈ (−ε/n, ε/n) (i = 1, …, n) we have x + ∑_{i=1}^n t_i e_i = ∑_{i=1}^n (1/n)(x + n t_i e_i) ∈ K by the convexity of K, we conclude y ∈ K if y is sufficiently close to x. This shows x is an interior point of K in the topological sense.
(2) If K is closed in the linear sense, it is closed in the topological sense.

Indeed, suppose (x_k)_{k=1}^∞ ⊂ K and x_k → x in the topological sense; we need to show x ∈ K. We work by induction on the dimension. The case n = 1 is trivial, because x is necessarily an endpoint of a segment contained in K. Assume the property is true for any n ≤ N. For n = N + 1, we have two cases to consider. Case one, K has no interior points. Then, as argued in the proof of Theorem 10, K is contained in a subspace of X of dimension less than n. By induction, K is closed in the topological sense and hence x ∈ K. Case two, K has at least one interior point x_0. In this case, all the points on the open segment (x_0, x) must be in K. Indeed, assume not; then there exists x′ ∈ (x_0, x) such that the open segment (x_0, x′) ⊂ K but (x′, x] ∩ K = ∅. Since x_0 is an interior point of K, we can find n linearly independent vectors e_1, …, e_n so that x_0 + e_i ∈ K, i = 1, …, n. For any x_k sufficiently close to x, the cone with x_k as the vertex and x_0 + e_1, …, x_0 + e_n as the base necessarily intersects (x′, x]. So such an x′ ∈ (x_0, x) with (x′, x] ∩ K = ∅ does not exist. Therefore (x_0, x) ⊂ K, and by the definition of being closed in the linear sense, we conclude x ∈ K.
(3) If K is bounded in the linear sense, it is bounded in the topological sense.

Indeed, assume K is not bounded in the topological sense; then we can find a sequence (x_k)_{k=1}^∞ ⊂ K such that ||x_k|| → ∞. We shall show K is not bounded in the linear sense. If the dimension n = 1, this is clearly true. Assume the claim is true for any n ≤ N. For n = N + 1, we have two cases to consider. Case one, K has no interior points. Then, as argued in the proof of Theorem 10, K is contained in a subspace of X of dimension less than n. By induction, K is not bounded in the linear sense. Case two, K has at least one interior point x_0. Denote by y_k the intersection of the segment [x_0, x_k] with the sphere S(x_0, 1) = {z : ||z − x_0|| = 1}. For k large enough, y_k always exists. Since a sphere in a finite-dimensional space is compact, we can assume without loss of generality that y_k → y ∈ S(x_0, 1). Then, by an argument similar to that of part (2) (the argument based on the cone), the ray starting at x_0 and going through y is contained in K. So K is not bounded in the linear sense.
Comments: For a linear equation Ax = y to have a solution, it is necessary and sufficient that y ∈ R(A) = N(A′)^⊥. This observation of duality helps us determine the existence of solutions. In optimization theory, if the collection of points satisfying certain constraints is a convex set, we use the hyperplane separation theorem to find and state necessary and sufficient conditions for the existence of solutions.
ty + (1−t)y′ = ∑_{j=1}^m [t p_j + (1−t) p′_j] y_j ∈ Y.

So Y is a convex set.
I 2. (page 205) Show that if x ≥ z and γ ≥ 0, then γx ≥ γz.

Proof. γx − γz = γ(x − z) ≥ 0.
I 3. (page 208) Show that the sup and inf in Theorem 3 are in fact a maximum and a minimum. [Hint: The sign of equality holds in (21).]
Proof. In the proof of Theorem 3, we already showed that there is an admissible p* for which p*·ξ ≥ S (formula (21)). Since p·ξ ≤ S for every admissible p by formulas (16) and (20), the sup in Theorem 3 is attained at p*, hence a maximum. To see that the inf in Theorem 3 is a minimum, note that under the condition that there are admissible p and η, Theorem 3 can be written as

sup{ξ·p : y ≥ Y p, p ≥ 0} = inf{y·η : Y′η ≥ ξ, η ≥ 0} = S.

This is equivalent to the statement that the common value S is attained on both sides.
Comments: The geometric intuition of Theorem 9 is clear if we identify X with X′ and assume X is an inner product space.
I 1. (page 214) (a) Show that the open and closed unit balls are convex.
(b) Show that the open and closed unit balls are symmetric with respect to the origin, that is, if x belongs to the unit ball, so does −x.
Proof. Trivial; the proof is omitted.
I 2. (page 215) Prove the triangle inequality, that is, for all x, y, z in X,
|x − z| ≤ |x − y| + |y − z|.
I 4. (page 216) Prove or look up a proof of Hölder's inequality.
Proof. f(x) = ln x is a strictly concave function on (0, ∞). So for any a, b > 0 and λ ∈ [0, 1],

f(λa + (1−λ)b) ≥ λf(a) + (1−λ)f(b),

where "=" holds if and only if λa + (1−λ)b = a or b; that is, one of the following three cases occurs: 1) λ = 0; 2) λ = 1; 3) a = b.

We note the inequality f(λa + (1−λ)b) ≥ λf(a) + (1−λ)f(b) is equivalent to a^λ b^{1−λ} ≤ λa + (1−λ)b. By letting λ = 1/p, a_i = |x_i|^p/|x|_p^p and b_i = |y_i|^q/|y|_q^q, we get

|x_i y_i| / (|x|_p |y|_q) = a_i^{1/p} b_i^{1/q} ≤ a_i/p + b_i/q,

and summing over i gives ∑_i |x_i y_i| ≤ |x|_p |y|_q (1/p + 1/q) = |x|_p |y|_q. Here "=" holds if and only if a_i = b_i for each i, i.e. |x_i|^p/|x|_p^p = |y_i|^q/|y|_q^q; that is, (|x_1|^p, …, |x_n|^p) and (|y_1|^q, …, |y_n|^q) are proportional. For ∑_i x_i y_i = ∑_i |x_i y_i| to hold, we need x_i y_i = |x_i y_i| for each i. This is the same as sign(x_i) = sign(y_i) for each i. In summary, we conclude x·y ≤ |x|_p |y|_q, and "=" holds if and only if (|x_1|^p, …, |x_n|^p) and (|y_1|^q, …, |y_n|^q) are proportional to each other and sign(x_i) = sign(y_i) (i = 1, …, n).
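A random numerical check of the inequality and of the stated equality case (a sketch; y is rebuilt from x so that |y_i|^q is proportional to |x_i|^p with matching signs):

    import numpy as np

    rng = np.random.default_rng(2)
    p = 3.0
    q = p / (p - 1)                        # conjugate exponent, 1/p + 1/q = 1
    x, y = rng.standard_normal(6), rng.standard_normal(6)

    print(np.sum(x * y) <= np.linalg.norm(x, p) * np.linalg.norm(y, q))  # True

    # equality case: |y_i|^q proportional to |x_i|^p, signs matching
    y_eq = np.sign(x) * np.abs(x) ** (p - 1)
    print(np.isclose(np.sum(x * y_eq),
                     np.linalg.norm(x, p) * np.linalg.norm(y_eq, q)))    # True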
I 5. (page 216) Prove that

|x|_∞ = lim_{p→∞} |x|_p.
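The convergence is quick numerically; a small sketch:

    import numpy as np

    x = np.array([3.0, -4.0, 1.0])
    for p in (1, 2, 8, 32, 128):
        print(p, np.linalg.norm(x, p))    # decreases toward max_i |x_i| = 4
    print(np.linalg.norm(x, np.inf))      # 4.0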
I 6. (page 219) Prove that every subspace of a finite-dimensional normed linear space is closed.

Proof. Every linear subspace of a finite-dimensional normed linear space is again a finite-dimensional normed linear space. So the problem is reduced to proving that any finite-dimensional normed space is complete. Fix a basis e_1, …, e_n and introduce the following norm: if x = ∑_j a_j e_j, ||x|| := (∑_j a_j²)^{1/2}. Then the original norm |·| is equivalent to ||·||. So (x_k)_{k=1}^∞ is a Cauchy sequence under |·| if and only if {(a_{k1}, …, a_{kn})}_{k=1}^∞ is a Cauchy sequence in Cⁿ or Rⁿ, where x_k = ∑_{j=1}^n a_{kj} e_j. Since Cⁿ and Rⁿ are complete, we conclude there exists x = ∑_{j=1}^n a_j e_j with a_{kj} → a_j, so that x_k → x and the subspace is closed.
I 8. (page 223) Show that |l| defined by (23) satisfies all postulates for a norm listed in (1).

Proof. (i) Positivity: |l| = 0 implies l(x) = 0 for all x with |x| = 1. So for any y ≠ 0, l(y) = |y|·l(y/|y|) = 0, i.e. l ≡ 0. So |l| = 0 implies l = 0, which is equivalent to: l ≠ 0 implies |l| > 0. That |0| = 0 is obvious.

(ii) Subadditivity: |l_1 + l_2| = sup_{|x|=1} (l_1 + l_2)(x) ≤ sup_{|x|=1} l_1(x) + sup_{|x|=1} l_2(x) = |l_1| + |l_2|.

(iii) Homogeneity: |kl| = sup_{|x|=1} |(kl)(x)| = |k| sup_{|x|=1} |l(x)| = |k| |l|.
I 9. (page 228) (i) Show that for all rational r, (rx, y) = r(x, y).

Proof. Write r = q/p with integers p, q, p ≠ 0. Therefore

p(rx, y) = (prx, y) = (qx, y) = q(x, y), i.e. (rx, y) = (q/p)(x, y) = r(x, y).
(ii)
Proof. For any given k, we can find a sequence of rational numbers (r_n)_{n=1}^∞ such that r_n → k as n → ∞. Then k(x, y) = lim_n r_n (x, y) = lim_n (r_n x, y) = (lim_n r_n x, y) = (kx, y), where the third "=" uses the fact that (·, y) defines a continuous linear functional on X.
Erratum: In Exercise 7 (p. 236), it should be "defined by formulas (3) and (5) in Chapter 14" instead of "defined by formulas (3) and (4) in Chapter 14".
I 1. (page 230) Show that every linear map T : X → Y is continuous, that is, if lim x_n = x, then lim T x_n = T x.
I 2. (page 235) Show that if for every x in X, |T_n x − T x| tends to zero as n → ∞, then |T_n − T| tends to zero.
Proof. Suppose |T_n − T| does not tend to zero. Then there exist ε > 0 and a sequence (x_n)_{n=1}^∞ such that |x_n| = 1 and |(T_n − T)x_n| ≥ ε. By Theorem 3(ii), we can without loss of generality assume (x_n)_{n=1}^∞ converges to some point x*. Then

|(T_n − T)x_n| ≤ |(T_n − T)x*| + |T||x_n − x*| + |T_n||x_n − x*|.

For n sufficiently large, |(T_n − T)x_n| − |(T_n − T)x*| will be greater than ε/2, while |T||x_n − x*| + |T_n||x_n − x*| will be as small as we want, provided that we can prove (|T_n|)_{n=1}^∞ is bounded. Indeed, this is the principle of uniform boundedness (see, for example, Lax [7], Chapter 10, Theorem 3). Thus we have arrived at a contradiction, which shows our assumption is wrong.
Remark 13. Can we find an elementary proof that avoids the principle of uniform boundedness from functional analysis, especially since we are working with finite-dimensional spaces?
I 3. (page 235) Show that T_n = ∑_{k=0}^n R^k converges to S^{−1} in the sense of definition (16).

Proof. First of all, ∑_{k=0}^∞ R^k is well-defined, since by |R| < 1, ( ∑_{k=0}^K R^k )_{K=0}^∞ is a Cauchy sequence. Then we note S ∑_{k=0}^∞ R^k = (I − R) ∑_{k=0}^∞ R^k = ∑_{k=0}^∞ R^k − ∑_{k=1}^∞ R^k = I, and similarly ( ∑_{k=0}^∞ R^k ) S = ∑_{k=0}^∞ R^k − ∑_{k=1}^∞ R^k = I. So S is invertible and S^{−1} = ∑_{k=0}^∞ R^k.
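A numerical illustration of the Neumann series with S = I − R (a sketch; the scaling of R is arbitrary):

    import numpy as np

    rng = np.random.default_rng(3)
    R = rng.standard_normal((4, 4))
    R *= 0.5 / np.linalg.norm(R, 2)    # rescale so that |R| < 1
    S = np.eye(4) - R

    T, term = np.zeros((4, 4)), np.eye(4)
    for _ in range(200):               # T_n = sum_{k=0}^{n} R^k
        T += term
        term = term @ R
    print(np.allclose(T, np.linalg.inv(S)))   # True: the series sums to S^{-1}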
I 4. (page 235) Deduce Theorem 5 from Theorem 6 by factoring S = T + (S − T) as T[I − T^{−1}(T − S)].

Proof. Assume all the conditions in Theorem 5. Define R = T^{−1}(T − S); then |R| ≤ |T^{−1}||S − T| < 1. So by Theorem 6, I − R = T^{−1}S is invertible, hence S = T·(T^{−1}S) is invertible.
I 5. (page 235) Show that Theorem 6 remains true if the hypothesis (17) is replaced by the following
hypothesis. For some positive integer m,
|R^m| < 1.

Proof. If |R^m| < 1 for some m, define U = ∑_{k=0}^∞ R^{km}; U is well-defined and satisfies U = I + R^m U. The following linear map is then also well-defined: V = U + RU + ⋯ + R^{m−1}U. Then SV = (U + RU + ⋯ + R^{m−1}U) − (RU + R²U + ⋯ + R^m U) = U − R^m U = I. This shows S is invertible.
I 6. (page 235) Take X = Y = Rⁿ, and T : X → X the matrix (t_ij). Take for the norm |x| the maximum norm |x|_∞ defined by formula (3) of Chapter 14. Show that the norm |T| of the matrix (t_ij), regarded as a mapping of X into X, is

|T| = max_i ∑_j |t_ij|.

Proof. For any x ∈ Rⁿ, |Tx|_∞ = max_i |∑_{j=1}^n t_ij x_j| ≤ max_i (∑_{j=1}^n |t_ij|)·|x|_∞. So |T| = sup_{x≠0} |Tx|_∞/|x|_∞ ≤ max_i ∑_j |t_ij|. For the other direction, suppose ∑_j |t_{i_0 j}| = max_i ∑_j |t_ij|, and choose x with x_j = sign(t_{i_0 j}); then |x|_∞ = 1 and (Tx)_{i_0} = ∑_j |t_{i_0 j}|. So |T| = sup_{|x|_∞=1} |Tx|_∞ ≥ max_i ∑_j |t_ij|. Combined, |T| = max_i ∑_j |t_ij|.
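This is the familiar maximum-row-sum (∞-operator) norm; a brute-force numerical confirmation over sign vectors (sketch):

    import numpy as np
    from itertools import product

    rng = np.random.default_rng(4)
    T = rng.standard_normal((5, 5))
    row_sum = np.abs(T).sum(axis=1).max()       # max_i sum_j |t_ij|

    # the sup over |x|_inf = 1 is attained at a vector of signs
    best = max(np.abs(T @ np.array(s)).max() for s in product((-1, 1), repeat=5))
    print(np.isclose(best, row_sum))                        # True
    print(np.isclose(np.linalg.norm(T, np.inf), row_sum))   # True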
I 8. (page 236) X is any finite-dimensional normed linear space over C, and T is a linear mapping of X into X. Denote by t_j the eigenvalues of T, and denote by r(T) its spectral radius: r(T) = max_j |t_j|.
Proof. The proof is very similar to the content on page 97, Chapter 7, the material up to Theorem 18. So
we omit the proof.
16 Positive Matrices
The book's own solution gives answers to Ex 1, 2.
Comments: To see the property of complex numbers mentioned on p. 240, we note that if z_1, z_2 ∈ C \ {0}, then |z_1 + z_2| = |z_1| + |z_2| if and only if arg z_1 = arg z_2. For n ≥ 3, if |∑_{i=1}^n z_i| = ∑_{i=1}^n |z_i|, then

∑_{i=1}^n |z_i| = |∑_{i=1}^n z_i| ≤ |z_1 + z_2| + ∑_{i=3}^n |z_i| ≤ |z_1| + |z_2| + ∑_{i=3}^n |z_i| = ∑_{i=1}^n |z_i|.

So |z_1 + z_2| = |z_1| + |z_2| and hence arg z_1 = arg z_2. Then we work by induction.
λ(P) = min_{λ ∈ t(P)} λ,  where t(P) = {λ ≥ 0 : Px ≤ λx for some x ≥ 0, x ≠ 0}.
Proof. Let x = (1, …, 1)^T and μ = max_{1≤i≤n} ∑_{j=1}^n p_ij; then Px ≤ μx. So μ ∈ t(P), and t′(P) = {λ ∈ t(P) : 0 ≤ λ ≤ μ} is a bounded, nonempty set. We show further that t′(P) is closed. Suppose (λ_m)_{m=1}^∞ ⊂ t′(P) converges to a point λ. Denote by x^m a nonnegative, nonzero vector such that Px^m ≤ λ_m x^m. Without loss of generality, we assume ∑_{i=1}^n x_i^m = 1. Then (x^m)_{m=1}^∞ is bounded and we can assume x^m → x for some x ≥ 0 with ∑_{i=1}^n x_i = 1. Passing to the limit gives Px ≤ λx. Clearly 0 ≤ λ ≤ μ, so λ ∈ t′(P). This shows t′(P) is compact and t(P) has a minimum κ.

Denote by x a nonzero, nonnegative vector such that Px ≤ κx. We show that we actually have Px = κx. Assume not; then there must exist some k ∈ {1, …, n} such that ∑_{j=1}^n p_ij x_j ≤ κx_i for i ≠ k and ∑_{j=1}^n p_kj x_j < κx_k. Consider the vector x̂ = x − εe_k, where ε > 0 and e_k has the k-th component equal to 1 and all other components zero. Then in the inequality Px ≤ κx, each component of the left side is decreased when x is replaced by x̂, while only the k-th component of the right side is decreased, by the amount κε. So for ε small enough, Px̂ < κx̂, and we can find a κ̂ < κ such that Px̂ ≤ κ̂x̂. Note κ > 0 (otherwise Px ≤ 0, which forces x = 0, a contradiction), so we can also take κ̂ > 0. This contradicts κ = min t(P). We have shown that κ > 0 is an eigenvalue of P which has a nonzero, nonnegative eigenvector. By Theorem 1(iv), κ = λ(P).
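Numerically, the characterization says that for positive x the smallest admissible λ is max_i (Px)_i/x_i, that this quantity is minimized at the Perron vector, and that the minimum is λ(P); a sketch:

    import numpy as np

    rng = np.random.default_rng(5)
    P = rng.uniform(0.1, 1.0, (4, 4))              # a positive matrix
    w, V = np.linalg.eig(P)
    k = np.argmax(w.real)
    lam = w[k].real                                # dominant eigenvalue lambda(P)
    perron = np.abs(V[:, k].real)                  # positive eigenvector

    t = lambda x: (P @ x / x).max()                # least t with Px <= t x, x > 0
    print(np.isclose(t(perron), lam))              # True: the minimum is attained
    print(all(t(rng.uniform(0.1, 1.0, 4)) >= lam - 1e-9
              for _ in range(1000)))               # True: lambda(P) is a lower bound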
I 2. (page 243) Show that if some power P^m of P is positive, then P has a dominant positive eigenvalue.
Proof. P^m has a dominant positive eigenvalue λ_0. By the Spectral Mapping Theorem, there is an eigenvalue λ of P such that λ^m = λ_0. Suppose λ is real; then we can further assume λ > 0 by replacing λ with −λ if necessary. Then for any other eigenvalue μ of P, the Spectral Mapping Theorem implies μ^m is an eigenvalue of P^m. So |μ^m| < λ_0 = λ^m, i.e. |μ| < λ.

To show we can take λ to be real, denote by x the eigenvector of P^m associated with λ_0. Then

P^m x = λ_0 x.
17 How to Solve Systems of Linear Equations

The three-term recursion method of this chapter is introduced by Rutishauser et al. [1]. See Papadrakakis [11] for a survey on a family of vector iterative methods with three-term recursion formulae, and Golub and van Loan [2] for a gentle introduction to the Chebyshev semi-iterative method (Section 10.1.5).
I 4. (page 261) Use the computer program to solve a system of equations of your choice.
Solution. We solve the following problem from the first edition of this book: Use the computer program in Exercise 3 to solve the system of equations
Ax = f,  A_ij = c + 1/(i + j + 1),  f_i = 1/i!,

c some nonnegative constant. Vary c between 0 and 1, and the order K of the system between 5 and 20.
TO BE CONTINUED ...
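In the meantime, a sketch of what such an experiment could look like (our code; a direct solve plus a condition-number report, rather than the iterative program of Exercise 3):

    import numpy as np
    from math import factorial

    def experiment(K, c):
        i = np.arange(1, K + 1)
        A = c + 1.0 / (i[:, None] + i[None, :] + 1)     # A_ij = c + 1/(i+j+1)
        f = np.array([1.0 / factorial(k) for k in i])   # f_i = 1/i!
        x = np.linalg.solve(A, f)
        return np.linalg.cond(A), np.linalg.norm(A @ x - f)

    for K in (5, 10, 20):
        for c in (0.0, 0.5, 1.0):
            cond, res = experiment(K, c)
            print(f"K={K:2d} c={c:.1f} cond={cond:.2e} residual={res:.1e}")

The matrix is Hilbert-like and becomes severely ill-conditioned as K grows, which is presumably the point of varying K and c.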
18 How to Calculate the Eigenvalues of Self-Adjoint Matrices
I 1. (page 266) Show that the off-diagonal entries of A_k tend to zero as k tends to ∞.
(d/dt) a_k = 2(b_k² − b_{k−1}²),
(d/dt) b_k = b_k(a_{k+1} − a_k),

where k = 1, …, n and b_0 = b_n = 0.
A Appendix
A.1 Special Determinants
I 1. (page 304) Let

p(s) = x_1 + x_2 s + ⋯ + x_n s^{n−1}

be a polynomial of degree less than n. Let a_1, …, a_n be n distinct numbers, and let p_1, …, p_n be n arbitrary complex numbers; we wish to choose the coefficients x_1, …, x_n so that

p(a_i) = p_i,  i = 1, …, n.

This is a system of n linear equations for the n coefficients x_i. Find the matrix of this system of equations, and show that its determinant is ≠ 0.
Solution. Denote the matrix by A. We claim

det A = ∏_{j>i} (a_j − a_i)² / ∏_{i,j} (1 + a_i a_j).

Indeed, by subtracting column 1 from columns 2 through n, then applying the Laplace expansion and extracting the common factor (a_i − a_1) from rows 2 through n and the common factor 1/(1 + a_1 a_j) from columns 2 through n, we get

det A = [ ∏_{j=2}^n (a_j − a_1)² / ( ∏_{i=1}^n (1 + a_i a_1) · ∏_{j=2}^n (1 + a_1 a_j) ) ] · det [ 1/(1 + a_i a_j) ]_{2≤i,j≤n},

and iterating this reduction (induction on n) yields the claim.
A.2 The Pfaffian

Proof. By the Laplace expansion and Exercise 16 of Chapter 5, we have

det [ 0, a, b, c ; −a, 0, d, e ; −b, −d, 0, f ; −c, −e, −f, 0 ]
 = −a det [ −a, d, e ; −b, 0, f ; −c, −f, 0 ] + b det [ −a, 0, e ; −b, −d, f ; −c, −e, 0 ] − c det [ −a, 0, d ; −b, −d, 0 ; −c, −e, −f ]
 = −a(bef − cdf − af²) + b(be² − cde − aef) + c(adf − bde + cd²)
 = −2abef + 2acdf − 2bcde + a²f² + b²e² + c²d²
 = (af − be + cd)².
A.3 Symplectic Matrices

and F is defined accordingly as the product of the block factors L and U above with the block diag(I_{2×2}, F_1^{−1}) term. Then a direct block computation gives

F^{−1} A (F^{−1})^T = [ 0, I_{(n+1)×(n+1)} ; −I_{(n+1)×(n+1)}, 0 ].
By induction, we have proved the claim.
I 2. (page 310) Prove the converse.
Proof. For any given x and y, define f(t) = (S(t)x, JS(t)y). Then we have

f′(t) = ((d/dt)S(t)x, JS(t)y) + (S(t)x, J(d/dt)S(t)y)
 = (G(t)S(t)x, JS(t)y) + (S(t)x, JG(t)S(t)y)
 = (JL(t)S(t)x, JS(t)y) + (S(t)x, J²L(t)S(t)y)
 = (L(t)S(t)x, J^T J S(t)y) − (S(t)x, L(t)S(t)y)
 = (S(t)x, L(t)S(t)y) − (S(t)x, L(t)S(t)y)
 = 0.
So f(t) ≡ f(0) = (S(0)x, JS(0)y) = (x, Jy). Since x and y are arbitrary, we conclude S(t) is a family of symplectic matrices.
I 3. (page 311) Prove that plus or minus 1 cannot be an eigenvalue of odd multiplicity of a symplectic
matrix.
Proof. Skipped for this version.
I 4. (page 312) Verify Theorem 6.
Proof. We note

dv/dt = (∂v/∂u)(du/dt) = (∂v/∂u) J H_u,

where (∂v/∂u)_{ij} = ∂v_i/∂u_j. H(u) can be seen as a function of v: K(v) = H(u(v)). So

∂K/∂v_i = ∑_{j=1}^{2n} (∂H/∂u_j)(∂u_j/∂v_i),  i.e.  K_v = (∂u/∂v)^T H_u,  equivalently  H_u = (∂v/∂u)^T K_v.

Since ∂v/∂u is symplectic, by Theorem 2, ∂u/∂v and (∂v/∂u)^T are also symplectic. So using formula (4) gives us

dv/dt = (∂v/∂u) J H_u = (∂v/∂u) J (∂v/∂u)^T K_v = J K_v.
A.4 Tensor Product
I 1. (page 313) Establish a natural isomorphism between tensor products defined with respect to two pairs of distinct bases.
A.5 Lattices
I 1. (page 318) Show that a_1 is a rational number.
A.7 Gershgorin's Theorem

I 1. (page 324) Show that if C_i is disjoint from all the other Gershgorin discs, then C_i contains exactly one eigenvalue of A.
Proof. Using the notation of the Gershgorin Circle Theorem, let B(t) = D + tF, t ∈ [0, 1]. The eigenvalues of B(t) are continuous functions of t (Theorem 6 of Chapter 9). For t = 0, the eigenvalues of B(0) are the diagonal entries of A. As t goes from 0 to 1, the radii of the Gershgorin discs corresponding to B(t) become bigger while the centers remain the same. So we can find for each d_i a continuous path λ_i(t) such that λ_i(0) = d_i and λ_i(t) is an eigenvalue of B(t) (0 ≤ t ≤ 1, i = 1, …, n). Moreover, by the Gershgorin Circle Theorem, each path λ_i(t) (0 ≤ t ≤ 1) is contained in the disc C_i = {x : |x − d_i| ≤ |f_i|_{ℓ¹}}. If for some i_1 ≠ i_2 the path λ_{i_2}(t) entered C_{i_1}, then necessarily C_{i_1} ∩ C_{i_2} ≠ ∅. This implies that any Gershgorin disc that is disjoint from all the other Gershgorin discs contains one and only one eigenvalue of A.
Remark 14. There's a strengthened version of the Gershgorin Circle Theorem that can be found at Wikipedia (http://en.wikipedia.org/wiki/Gershgorin_circle_theorem). The above solution is an adaptation of the proof therein. The claim: if the union of k Gershgorin discs is disjoint from the union of the other (n − k) Gershgorin discs, then the former union contains exactly k and the latter exactly (n − k) eigenvalues of A.
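A small numerical illustration (a sketch with a hand-picked matrix whose three discs are pairwise disjoint):

    import numpy as np

    A = np.array([[10.0, 0.1, 0.2],
                  [ 0.1, 2.0, 0.4],
                  [ 0.0, 0.3, 3.0]])
    eigs = np.linalg.eigvals(A)
    radii = np.abs(A).sum(axis=1) - np.abs(np.diag(A))   # Gershgorin radii
    for d, r in zip(np.diag(A), radii):
        count = int(np.sum(np.abs(eigs - d) <= r))
        print(f"disc |z - {d}| <= {r:.1f} contains {count} eigenvalue(s)")  # 1 each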
A.10 The Spectral Radius

I 1. (page 337) Prove that the eigenvalues of an upper triangular matrix are its diagonal entries.

Proof. Let T be upper triangular with diagonal entries a_1, …, a_n. Then λ_0 is an eigenvalue of T ⟺ det(λ_0 I − T) = ∏_{i=1}^n (λ_0 − a_i) = 0 ⟺ λ_0 is equal to one of a_1, …, a_n.
I 2. (page 338) Show that the Euclidean norm of a diagonal matrix is the maximum of the absolute value
of its eigenvalues.
Proof. Let D = diag{a_1, …, a_n}. Then De_i = a_i e_i, so ||De_i|| = |a_i|. This shows ||D|| ≥ max_{1≤i≤n} |a_i|. For any x ∈ Cⁿ, Dx = (a_1 x_1, …, a_n x_n)^T, so ||Dx||² = ∑_{i=1}^n |a_i x_i|² ≤ (max_{1≤i≤n} |a_i|)² ||x||². So ||D|| ≤ max_{1≤i≤n} |a_i|. Combined, we conclude ||D|| = max_{1≤i≤n} |a_i|.
I 3. (page 339) Prove the analogue of relation (2) when A is a linear mapping of any finite-dimensional normed linear space X (see Chapters 14 and 15).
Proof. By examining the proof for Euclidean space, we see the inner product is not really used; all that has been exploited is the norm. So the proof for any finite-dimensional normed linear space is entirely identical to that for finite-dimensional Euclidean space.
I 4. (page 339) Show that the two definitions are equivalent.

Proof. It suffices to note that a sequence (A_n)_{n=1}^∞ converges to A in matrix norm iff each entry sequence ((A_n)_{ij})_{n=1}^∞ converges to A_ij (Exercise 16 of Chapter 7).
I 5. (page 339) Let A(z) be an analytic matrix function in a domain G, invertible at every point of G. Show that then A^{−1}(z), too, is an analytic matrix function in G.
Proof. By formula (16) of Chapter 5, D(a_1, …, a_n) = ∑_p σ(p) a_{p(1)1} ⋯ a_{p(n)n}, the determinant of any analytic matrix (i.e. matrix-valued analytic function) is analytic. By Cramer's rule and det A(z) ≠ 0 in G, we conclude A^{−1}(z) is also analytic in G.
I 6. (page 339) Show that the Cauchy integral theorem holds for matrix-valued functions.
Proof. By Exercise 4, the Cauchy integral theorem for matrix-valued functions is reduced to the Cauchy integral theorem for each entry of an analytic matrix.
where d is the dimension of the Euclidean space in which G resides. This shows the elements of D are equicontinuous in G.

(ii)

Proof. From (i), we know each element of D is uniformly continuous. So they can be extended to Ḡ, the closure of G. Then Theorem 3 is the result of the following version of the Arzelà–Ascoli Theorem (see, for example, Yosida [14]): Let S be a compact metric space, and C(S) the Banach space of (real- or) complex-valued continuous functions x(s) normed by ||x|| = sup_{s∈S} |x(s)|. Then a sequence {x_n(s)} ⊂ C(S) is relatively compact in C(S) if the following two conditions are satisfied: (a) {x_n(s)} is uniformly bounded; (b) {x_n(s)} is equicontinuous.
A.14 Liapunov's Theorem

I 1. (page 360) Show that the sums (14) tend to a limit as the size of the subintervals Δ_j tends to zero. (Hint: Imitate the proof for the scalar case.)

Proof. This is basically about how to extend the Riemann integral to Banach-space-valued functions. The theory is essentially the same as in the scalar case: just replace the Euclidean norm with an arbitrary norm. So we omit the details.
I 2. (page 360) Show that the two definitions are equivalent.

Proof. It suffices to note that A_n → A in matrix norm if and only if each entry of A_n converges to the corresponding entry of A (see Exercise 7 and formula (51) of Chapter 7).
I 3. (page 360) Show, using Lemma 4, that the integral (12),

lim_{T→∞} ∫_0^T e^{W∗t} e^{W t} dt,

exists.

Proof. By Lemma 4, lim_{T,T′→∞} ∫_T^{T′} e^{W∗t} e^{W t} dt = 0. So by Cauchy's criterion, we conclude the integral (12) exists.
A.16 Numerical Range

||A|| ≤ w(A). By Theorem 13(ii) of Chapter 7, w(A) ≤ ||A||. Combined, we conclude w(A) = ||A||.
I 3. (page 369) Verify (7) and (8).
Proof. To verify (7), we note ∏_k r_k = e^{(2πi/n) ∑_{k=1}^n k} = e^{(n+1)πi} = (−1)^{n+1}. Therefore

∏_k (1 − r_k z) = ∏_k (−r_k)(z − r_k^{−1}) = (−1)^n ∏_k r_k · ∏_k (z − r_k) = (−1)^{2n+1}(z^n − 1) = 1 − z^n,

where we used the fact that {r_k^{−1}} is again the set of all n-th roots of unity.

To verify (8), note ∑_j 1/(1 − r_j z) is a rational function over the complex plane C, which can be assumed to have the form P(z)/Q(z) with P(z) and Q(z) polynomials without common factors. Since r_1^{−1}, …, r_n^{−1} are singularities of degree 1 for ∑_j 1/(1 − r_j z), we conclude Q(z) = ∏_k (1 − r_k z) = 1 − z^n, up to the difference of a constant factor. Since ∑_j 1/(1 − r_j z) has no zeros in the complex plane, we conclude P(z) must be a constant. Combined, we conclude

∑_{j=1}^n 1/(1 − r_j z) = C/(1 − z^n)

for some constant C. By letting z → 0, we see C = n. This finishes the verification of (8).
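Both identities are easy to confirm numerically; a sketch:

    import numpy as np

    n, z = 7, 0.3 + 0.4j
    r = np.exp(2j * np.pi * np.arange(1, n + 1) / n)         # n-th roots of unity
    print(np.isclose(np.prod(1 - r * z), 1 - z**n))           # (7): True
    print(np.isclose(np.sum(1 / (1 - r * z)), n / (1 - z**n)))  # (8): True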
I 4. (page 370) Determine the numerical range of A = [ 1, 1 ; 0, 1 ] and of A² = [ 1, 2 ; 0, 1 ].

Solution. If x = (x_1, x_2)^T, (Ax, x) = x_1² + x_2² + x_1 x_2. If x_1² + x_2² = 1, we have (Ax, x) = 1 + x_1 sign(x_2)√(1 − x_1²). Calculus shows f(α) = 1 + α√(1 − α²) (−1 ≤ α ≤ 1) achieves its maximum at α_0 = √2/2. So w(A) = 1 + 1/2 = 3/2. Similarly, a plain calculation shows w(A²) = 2.
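A brute-force numerical estimate of w(A) = sup over unit vectors x of |(Ax, x)|, sampling random complex unit vectors, is consistent with these values (sketch):

    import numpy as np

    rng = np.random.default_rng(6)

    def num_radius(A, trials=200_000):
        # sample random complex unit vectors and maximize |(Ax, x)|
        X = rng.standard_normal((trials, 2)) + 1j * rng.standard_normal((trials, 2))
        X /= np.linalg.norm(X, axis=1, keepdims=True)
        return np.abs(np.einsum('ti,ij,tj->t', X.conj(), A, X)).max()

    A = np.array([[1.0, 1.0], [0.0, 1.0]])
    print(num_radius(A))       # approx 1.5 = w(A)
    print(num_radius(A @ A))   # approx 2.0 = w(A^2)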
References
[1] M. Engeli, T. Ginsburg, H. Rutishauser and E. Stiefel. Refined iterative methods for computation of the solution and the eigenvalues of self-adjoint boundary value problems, Birkhäuser Verlag, Basel/Stuttgart, 1959. 51
[2] Gene H. Golub and Charles F. van Loan. Matrix computations, 3rd Edition. Johns Hopkins University Press, 1996. 51
[3] Gong Sheng and Gong Youhong. Concise complex analysis, Revised Edition, World Scientific, 2007. 57
[4] William H. Greene. Econometric Analysis, 7th ed., Prentice Hall, 2011. 13
[5] James P. Keener. Principles of applied mathematics: Transformation and approximation, revised edition.
Westview Press, 2000. 30
[6] Lan Yi-Zhong. A concise course of advanced algebra (in Chinese), Volume 1, Peking University Press, Beijing, 2002. 21
[7] P. Lax. Functional analysis, Wiley-Interscience, 2002. 48
[8] P. Lax. Linear algebra and its applications, 2nd Edition, Wiley-Interscience, 2007. 1
[9] Steven Huss-Lederman, Elaine M. Jacobson, Anna Tsao, Thomas Turnbull, Jeremy R. Johnson. Implementation of Strassen's algorithm for matrix multiplication. Proceedings of the 1996 ACM/IEEE Conference on Supercomputing, Pittsburgh, Pennsylvania, United States, 1996. 56
[10] J. Munkres. Analysis on manifolds, Westview Press, 1997. 15, 21
[11] M. Papadrakakis. A family of methods with three-term recursion formulae. International Journal for
Numerical Methods in Engineering, Vol. 18, 1785-1799 (1982). 51
[13] Michael Spivak. Calculus on manifolds: A modern approach to classical theorems of advanced calculus. Perseus Books Publishing, 1965. 30

[14] K. Yosida. Functional analysis, Springer-Verlag. 58