Abstract
This is a solution manual for Linear Algebra and Its Applications, 2nd edition, by Peter Lax [8]. This version omits the following problems: exercises 2 and 9 of Chapter 8; exercises 3 and 4 of Chapter 17; the exercises of Chapter 18; exercise 3 of Appendix 3; and the exercises of Appendices 4, 5, 8 and 11.
If you would like to correct any typos/errors, please send email to zypublic@hotmail.com.
Contents

1 Fundamentals
2 Duality
3 Linear Mappings
4 Matrices
5 Determinant and Trace
6 Spectral Theory
7 Euclidean Structure
8 Spectral Theory of Self-Adjoint Mappings
9 Calculus of Vector- and Matrix-Valued Functions
10 Matrix Inequalities
11 Kinematics and Dynamics
12 Convexity
16 Positive Matrices
18 How to Calculate the Eigenvalues of Self-Adjoint Matrices
A Appendix
A.1 Special Determinants
A.2 The Pfaffian
A.3 Symplectic Matrices
A.4 Tensor Product
A.5 Lattices
A.6 Fast Matrix Multiplication
A.7 Gershgorin's Theorem
A.8 The Multiplicity of Eigenvalues
A.9 The Fast Fourier Transform
A.10 The Spectral Radius
A.11 The Lorentz Group
A.12 Compactness of the Unit Ball
A.13 A Characterization of Commutators
A.14 Liapunov's Theorem
A.15 The Jordan Canonical Form
A.16 Numerical Range
1 Fundamentals
The book's own solution gives answers to Ex 1, 3, 7, 10, 13, 14, 16, 19, 20, 21.
T(p) = p(x),

where p on the left side of the equation is regarded as a polynomial over R while p(x) on the right side of the equation is regarded as a function defined on S = {s_1, ..., s_n}. To prove T is an isomorphism, it suffices to prove T is one-to-one. This is seen through the observation that

\[
\begin{pmatrix}
1 & s_1 & s_1^2 & \cdots & s_1^{n-1}\\
1 & s_2 & s_2^2 & \cdots & s_2^{n-1}\\
\vdots & \vdots & \vdots & & \vdots\\
1 & s_n & s_n^2 & \cdots & s_n^{n-1}
\end{pmatrix}
\begin{pmatrix} a_1\\ a_2\\ \vdots\\ a_n \end{pmatrix}
=
\begin{pmatrix} p(s_1)\\ p(s_2)\\ \vdots\\ p(s_n) \end{pmatrix},
\]

and the Vandermonde matrix on the left is invertible whenever s_1, ..., s_n are distinct.
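A quick numerical check of this observation (our sketch, not part of the original manual; the nodes and target values below are arbitrary choices):

import numpy as np

s = np.array([0.5, 1.0, 2.0, 3.5])          # hypothetical distinct nodes
V = np.vander(s, increasing=True)            # rows [1, s_i, s_i^2, s_i^3]
print(np.linalg.det(V))                      # nonzero since the nodes are distinct

values = np.array([1.0, -2.0, 0.0, 4.0])     # arbitrary target values p(s_i)
a = np.linalg.solve(V, values)               # coefficients of the interpolant
print(np.allclose(np.polyval(a[::-1], s), values))  # True: p(s_i) recovers values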
Proof. For any y, y′ ∈ Y, z, z′ ∈ Z and any scalar k, we have

(y + z) + (y′ + z′) = (z + y) + (y′ + z′) = z + (y + (y′ + z′)) = z + ((y + y′) + z′) = z + (z′ + (y + y′)) = (z + z′) + (y + y′) = (y + y′) + (z + z′) ∈ Y + Z,

and

k(y + z) = ky + kz ∈ Y + Z.

So Y + Z is a linear subspace of X if Y and Z are.
I 7. (page 4) Prove that if Y and Z are linear subspaces of X, so is Y ∩ Z.
I 10. (page 5) Show that if the vectors x_1, ..., x_j are linearly independent, then none of the x_i is the zero vector.

Proof. We prove by contradiction. Without loss of generality, assume x_1 = 0. Then 1·x_1 + 0·x_2 + ... + 0·x_j = 0. This shows x_1, ..., x_j are linearly dependent, a contradiction. So x_1 ≠ 0. We can similarly prove x_2 ≠ 0, ..., x_j ≠ 0.
I 11. (page 7) Prove that if X is finite dimensional and the direct sum of Y_1, ..., Y_m, then dim X = Σ_j dim Y_j.

Proof. Suppose Y_i has a basis y_1^i, ..., y_{n_i}^i. Then it suffices to prove y_1^1, ..., y_{n_1}^1, ..., y_1^m, ..., y_{n_m}^m form a basis of X. By definition of direct sum, these vectors span X, so we only need to show they are linearly independent. In fact, if not, then 0 has two distinct representations: 0 = 0 + ... + 0 and 0 = Σ_{i=1}^m (a_1^i y_1^i + ... + a_{n_i}^i y_{n_i}^i), where not all a_j^i are zero. This contradicts the definition of direct sum. So we must have linear independence, which implies y_1^1, ..., y_{n_1}^1, ..., y_1^m, ..., y_{n_m}^m form a basis of X. Consequently, dim X = Σ_{i=1}^m dim Y_i.
I 12. (page 7) Show that every finite-dimensional space X over K is isomorphic to K^n, n = dim X. Show that this isomorphism is not unique when n is > 1.

Proof. Fix a basis x_1, ..., x_n of X; any element x ∈ X can be uniquely represented as Σ_{i=1}^n a_i(x) x_i for some a_i(x) ∈ K, i = 1, ..., n. We define the isomorphism as x ↦ (a_1(x), ..., a_n(x)). Clearly this isomorphism depends on the basis, and by varying the choice of basis, we can have different isomorphisms.
I 13. (page 7) Prove (i)-(iii) above. Show furthermore that if x_1 ≡ x_2, then kx_1 ≡ kx_2 for every scalar k.

Proof. For any x_1, x_2 ∈ X, if x_1 ≡ x_2, i.e. x_1 − x_2 ∈ Y, then x_2 − x_1 = −(x_1 − x_2) ∈ Y, i.e. x_2 ≡ x_1. This is symmetry. For any x ∈ X, x − x = 0 ∈ Y. So x ≡ x. This is reflexivity. Finally, if x_1 ≡ x_2 and x_2 ≡ x_3, then x_1 − x_3 = (x_1 − x_2) + (x_2 − x_3) ∈ Y, i.e. x_1 ≡ x_3. This is transitivity.
I 14. (page 7) Show that two congruence classes are either identical or disjoint.

Proof. For any x_1, x_2 ∈ X, we can find y ∈ {x_1} ∩ {x_2} if and only if x_1 − y ∈ Y and x_2 − y ∈ Y. Then x_1 − x_2 = (x_1 − y) − (x_2 − y) ∈ Y, so the two congruence classes coincide whenever they share a point.
I 15. (page 8) Show that the above definition of addition and multiplication by scalars is independent of the choice of representatives in the congruence class.

Then it's easy to see dim Y = n − j and dim X/Y = dim X − dim Y = j.
I 17. (page 10) Prove Corollary 6.

Proof. By Theorem 6, dim X/Y = dim X − dim Y = 0, which implies X/Y = {{0}}. So X = Y.
I 18. (page 11) Show that

dim X_1 ⊕ X_2 = dim X_1 + dim X_2.

Proof. Define Y_1 = {(x, 0) : x ∈ X_1, 0 ∈ X_2} and Y_2 = {(0, x) : 0 ∈ X_1, x ∈ X_2}. Then Y_1 and Y_2 are linear subspaces of X_1 ⊕ X_2. It is easy to see Y_1 is isomorphic to X_1, Y_2 is isomorphic to X_2, and Y_1 ∩ Y_2 = {(0, 0)}. So by Theorem 7, dim X_1 ⊕ X_2 = dim Y_1 + dim Y_2 − dim(Y_1 ∩ Y_2) = dim X_1 + dim X_2 − 0 = dim X_1 + dim X_2.
I 19. (page 11) X is a linear space, Y is a subspace. Show that Y ⊕ X/Y is isomorphic to X.

Proof. By Exercise 18 and Theorem 6, dim(Y ⊕ X/Y) = dim Y + dim(X/Y) = dim Y + dim X − dim Y = dim X. Since linear spaces of the same finite dimension are isomorphic (by a one-to-one mapping between their bases), Y ⊕ X/Y is isomorphic to X.
I 20. (page 12) Which of the following sets of vectors x = (x_1, ..., x_n) in R^n are a subspace of R^n? Explain your answer.
(a) All x such that x_1 ≥ 0.
(b) All x such that x_1 + x_2 = 0.
(c) All x such that x_1 + x_2 + 1 = 0.
(d) All x such that x_1 = 0.
(e) All x such that x_1 is an integer.

Proof. (a) is not, since {x : x_1 ≥ 0} is not closed under scalar multiplication by −1. (b) is. (c) is not, since x_1 + x_2 + 1 = 0 and x_1′ + x_2′ + 1 = 0 imply (x_1 + x_1′) + (x_2 + x_2′) + 1 = −1 ≠ 0. (d) is. (e) is not, since x_1 being an integer does not guarantee rx_1 is an integer for every r ∈ R.
I 21. (page 12) Let U, V, and W be subspaces of some finite-dimensional vector space X. Is the statement

dim(U + V + W) = dim U + dim V + dim W − dim(U ∩ V) − dim(U ∩ W) − dim(V ∩ W) + dim(U ∩ V ∩ W)

true? It is not; a counterexample is

X = R² = (x, y) space, U = {y = 0}, V = {x = 0}, W = {x = y}.

Then U + V + W = R², U ∩ V = {0}, U ∩ W = {0}, V ∩ W = {0}, U ∩ V ∩ W = {0}, so the right side equals 3 while the left side equals 2.
2 Duality
The book's own solution gives answers to Ex 4, 5, 6, 7.
I 1. (page 15) Given a nonzero vector x_1 in X, show that there is a linear function l such that

l(x_1) ≠ 0.

Proof. We let Y = {kx_1 : k ∈ K}. Then Y is a 1-dimensional linear subspace of X. By Theorem 2 and Theorem 4,

dim Y⊥ = dim X′ − dim Y < dim X′.

So there must exist some l ∈ X′ \ Y⊥, and any such l satisfies l(x_1) ≠ 0.
Remark 1. When K is R or C, the proof can be constructive. Indeed, assume e_1, ..., e_n is a basis for X and x_1 = Σ_{i=1}^n a_i e_i. In the case of K = R, define l by setting l(e_i) = a_i, i = 1, ..., n; in the case of K = C, define l by setting l(e_i) = ā_i (the conjugate of a_i), i = 1, ..., n. Then in both cases, l(x_1) = Σ_{i=1}^n |a_i|² > 0.

l(y) = Σ_{i=1}^m k_i l(x_i) = 0.
Proof. Suppose three linearly independent polynomials p_1, p_2 and p_3 are applied to formula (9). Then m_1, m_2 and m_3 must satisfy the linear equations

\[
\begin{pmatrix} p_1(t_1) & p_1(t_2) & p_1(t_3)\\ p_2(t_1) & p_2(t_2) & p_2(t_3)\\ p_3(t_1) & p_3(t_2) & p_3(t_3) \end{pmatrix}
\begin{pmatrix} m_1\\ m_2\\ m_3 \end{pmatrix}
=
\begin{pmatrix} \int_{-1}^1 p_1(t)\,dt\\ \int_{-1}^1 p_2(t)\,dt\\ \int_{-1}^1 p_3(t)\,dt \end{pmatrix}.
\]

So, taking p_1(t) = 1, p_2(t) = t, p_3(t) = t² and the points t_1 = −a, t_2 = 0, t_3 = a,

\[
\begin{pmatrix} m_1\\ m_2\\ m_3 \end{pmatrix}
=
\begin{pmatrix} 1 & 1 & 1\\ -a & 0 & a\\ a^2 & 0 & a^2 \end{pmatrix}^{-1}
\begin{pmatrix} 2\\ 0\\ 2/3 \end{pmatrix}
=
\begin{pmatrix} \frac{1}{3a^2}\\[2pt] 2 - \frac{2}{3a^2}\\[2pt] \frac{1}{3a^2} \end{pmatrix}.
\]

Then it's easy to see that for a² > 1/3, all three weights are positive.
To show formula (9) holds for all polynomials of degree < 6 when a² = 3/5, we note for any odd n ∈ N,

∫_{−1}^1 x^n dx = 0, and m_1 p(−a) + m_3 p(a) = 0 since m_1 = m_3 and p(−x) = −p(x), and m_2 p(0) = 0.

So (9) holds for any x^n of odd degree n, in particular for p(x) = x³ and p(x) = x⁵. For p(x) = x⁴, we have

∫_{−1}^1 x⁴ dx = 2/5, while m_1 p(t_1) + m_2 p(t_2) + m_3 p(t_3) = 2m_1 a⁴ = (2/3)a².

So formula (9) holds for p(x) = x⁴ exactly when a² = 3/5. Combined, we conclude that for a² = 3/5, (9) holds for all polynomials of degree < 6.
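The rule just derived (nodes −a, 0, a with a² = 3/5, weights 1/(3a²) = 5/9 and 2 − 2/(3a²) = 8/9) is the classical three-point Gauss rule; a numerical spot check we add here (not part of the manual):

import numpy as np

a = np.sqrt(3.0 / 5.0)
nodes = np.array([-a, 0.0, a])
weights = np.array([5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0])
for k in range(6):
    exact = (1 - (-1) ** (k + 1)) / (k + 1)    # integral of x^k over [-1, 1]
    approx = np.sum(weights * nodes ** k)
    print(k, np.isclose(exact, approx))         # True for every k = 0, ..., 5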
Remark 2. In this exercise and in Exercise 5 below, "Theorem 6" should be corrected to "Theorem 7".
I 5. (page 18) In Theorem 7 take the interval I to be [−1, 1], and take n = 4. Choose the four points to be −a, −b, b, a.
(i) Determine the weights m_1, m_2, m_3, and m_4 so that (9) holds for all polynomials of degree < 4.
(ii) For what values of a and b are the weights positive?

Proof. We take p_1(t) = 1, p_2(t) = t, p_3(t) = t², and p_4(t) = t³. Then m_1, m_2, m_3, and m_4 solve the following equation:

\[
\begin{pmatrix} 1 & 1 & 1 & 1\\ -a & -b & b & a\\ a^2 & b^2 & b^2 & a^2\\ -a^3 & -b^3 & b^3 & a^3 \end{pmatrix}
\begin{pmatrix} m_1\\ m_2\\ m_3\\ m_4 \end{pmatrix}
=
\begin{pmatrix} 2\\ 0\\ 2/3\\ 0 \end{pmatrix}.
\]
Then

\[
\begin{pmatrix} m_1\\ m_2\\ m_3\\ m_4 \end{pmatrix}
=
\begin{pmatrix} 1 & 1 & 1 & 1\\ -a & -b & b & a\\ a^2 & b^2 & b^2 & a^2\\ -a^3 & -b^3 & b^3 & a^3 \end{pmatrix}^{-1}
\begin{pmatrix} 2\\ 0\\ 2/3\\ 0 \end{pmatrix}
=
\begin{pmatrix} \frac{1-3b^2}{3(a^2-b^2)}\\[2pt] \frac{3a^2-1}{3(a^2-b^2)}\\[2pt] \frac{3a^2-1}{3(a^2-b^2)}\\[2pt] \frac{1-3b^2}{3(a^2-b^2)} \end{pmatrix}.
\]

So the weights are positive if and only if one of the following two mutually exclusive cases holds:
1) a² > b², b² < 1/3, a² > 1/3;
2) a² < b², b² > 1/3, a² < 1/3.
In other words, exactly one of a², b² must exceed 1/3.
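A numerical sketch (ours, not the manual's) that solves for the weights and illustrates the sign condition:

import numpy as np

def weights(a, b):
    V = np.array([[1, 1, 1, 1],
                  [-a, -b, b, a],
                  [a**2, b**2, b**2, a**2],
                  [-a**3, -b**3, b**3, a**3]], dtype=float)
    return np.linalg.solve(V, np.array([2.0, 0.0, 2.0 / 3.0, 0.0]))

print(weights(0.9, 0.4))   # b^2 < 1/3 < a^2: all four weights positive
print(weights(0.4, 0.3))   # both squares below 1/3: mixed signs appear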
Consider polynomials p(x) = a_0 + a_1 x + a_2 x² with real coefficients and degree ≤ 2. Let t_1, t_2, t_3 be three distinct real numbers, and then define l_j(p) = p(t_j) for j = 1, 2, 3.

(a)

Proof. Suppose a l_1 + b l_2 + c l_3 = 0. Set p_1(x) = (x − t_2)(x − t_3). Then p_1(t_2) = p_1(t_3) = 0 and p_1(t_1) ≠ 0; so we get from the above relation that a = 0. Similarly b = 0, c = 0.
(b)

Proof. Since dim P_2 = 3, dim P_2′ = 3. Since l_1, l_2, l_3 are linearly independent, they span P_2′.
(c1)

Proof. We define l_1 by setting

l_1(e_j) = 1 if j = 1, and l_1(e_j) = 0 if j ≠ 1,

and extending l_1 to V by linearity, i.e. l_1(Σ_{j=1}^n c_j e_j) := Σ_{j=1}^n c_j l_1(e_j) = c_1. l_2, ..., l_n can be constructed similarly. If there exist a_1, ..., a_n such that a_1 l_1 + ... + a_n l_n = 0, we have

0 = a_1 l_1(e_j) + ... + a_n l_n(e_j) = a_j, j = 1, ..., n.

So l_1, ..., l_n are linearly independent. Since dim V′ = dim V = n, {l_1, ..., l_n} is a basis of V′.
(c2)

Proof. We define

\[
p_1(x) = \frac{(x-x_2)(x-x_3)}{(x_1-x_2)(x_1-x_3)}, \quad p_2(x) = \frac{(x-x_1)(x-x_3)}{(x_2-x_1)(x_2-x_3)}, \quad p_3(x) = \frac{(x-x_1)(x-x_2)}{(x_3-x_1)(x_3-x_2)}.
\]
I 7. (page 18) Let W be the subspace of R⁴ spanned by (1, 0, −1, 2) and (2, 3, 1, 1). Which linear functions l(x) = c_1x_1 + c_2x_2 + c_3x_3 + c_4x_4 are in the annihilator of W?

Proof. (From the textbook's solutions, page 280) l(x) has to be zero for x = (1, 0, −1, 2) and x = (2, 3, 1, 1). These yield two equations for c_1, ..., c_4:

c_1 − c_3 + 2c_4 = 0, 2c_1 + 3c_2 + c_3 + c_4 = 0.

We express c_1 and c_2 in terms of c_3 and c_4. From the first equation, c_1 = c_3 − 2c_4. Setting this into the second equation gives c_2 = −c_3 + c_4.
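The parametrization just found can be verified numerically (a sketch we add; the two test values of (c_3, c_4) are arbitrary):

import numpy as np

A = np.array([[1.0, 0.0, -1.0, 2.0],
              [2.0, 3.0, 1.0, 1.0]])     # rows span W
for c3, c4 in [(1.0, 0.0), (0.0, 1.0)]:
    c = np.array([c3 - 2 * c4, -c3 + c4, c3, c4])  # c1, c2 per the solution
    print(np.allclose(A @ c, 0))                    # True: l annihilates W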
3 Linear Mappings
The book's own solution gives answers to Ex 1, 2, 4, 5, 6, 7, 8, 10, 11, 13.

Comments: To memorize Theorem 5 (R_T^⊥ = N_{T′}), recall that for a given l ∈ U′, (l, Tx) = 0 for any x ∈ X if and only if T′l = 0.
Proof. (From the textbook's solution, page 280) Suppose we drop the ith equation; if the remaining equations do not determine x uniquely, there is an x ≠ 0 that is mapped into a vector whose components, except possibly the ith, are zero. If this were true for all i = 1, ..., m, the range of the mapping x ↦ u would be m-dimensional; but according to Theorem 2, the dimension of the range is ≤ n < m. Therefore one of the equations may be dropped without losing uniqueness; by induction m − n of the equations may be omitted.

Alternative solution: Uniqueness of the solution x implies the column vectors of the matrix T = (t_ij) are linearly independent. Since the column rank of a matrix equals its row rank (see Chapter 3, Theorem 6 and Chapter 4, Theorem 2), it is possible to select a subset of n of these equations which uniquely determines the solution.

Remark 3. The textbook's solution is, in effect, a proof that the column rank of a matrix equals its row rank.
I 3. (page 25) Prove Theorem 3.

(i)

Proof. S ∘ T(ax + by) = S(T(ax + by)) = S(aT(x) + bT(y)) = aS(T(x)) + bS(T(y)) = aS ∘ T(x) + bS ∘ T(y). So S ∘ T is also a linear mapping.

(ii)

Proof. (R + S) ∘ T(x) = (R + S)(T(x)) = R(T(x)) + S(T(x)) = (R ∘ T + S ∘ T)(x) and S ∘ (T + P)(x) = S((T + P)(x)) = S(T(x) + P(x)) = S(T(x)) + S(P(x)) = (S ∘ T + S ∘ P)(x).
I 4. (page 25) Show that S and T in Examples 8 and 9 are linear and that ST ≠ TS.

Proof. For Example 8, the linearity of S and T is easy to see. To see the non-commutativity, consider the polynomial p(s) = s. We have TS(s) = T(s²) = 2s ≠ s = S(1) = ST(s). So ST ≠ TS.

For Example 9, ∀x = (x_1, x_2, x_3) ∈ X, S(x) = (x_1, −x_3, x_2) and T(x) = (x_3, x_2, −x_1). So it's easy to see S and T are linear. To see the non-commutativity, note ST(x) = S(x_3, x_2, −x_1) = (x_3, x_1, x_2) and TS(x) = T(x_1, −x_3, x_2) = (x_2, −x_3, −x_1). So ST ≠ TS in general.

Remark 4. Note the problem does not specify the direction of the rotation, so it is also possible that S(x) = (x_1, x_3, −x_2) and T(x) = (−x_3, x_2, x_1). There are a total of four choices of (S, T), and each of the corresponding proofs is similar to the one presented above.
Proof. Suppose T: X → U is invertible. Then for any y, y′ ∈ U, there exist a unique x ∈ X and a unique x′ ∈ X such that T(x) = y and T(x′) = y′. So T(x + x′) = T(x) + T(x′) = y + y′ and, by the injectivity of T, T⁻¹(y + y′) = x + x′ = T⁻¹(y) + T⁻¹(y′). For any k ∈ K, since T(kx) = kT(x) = ky, injectivity of T implies T⁻¹(ky) = kx = kT⁻¹(y). Combined, we conclude T⁻¹ is linear.

(ii)

Proof. Suppose T: X → U and S: U → V. First, by the definition of multiplication, ST is a linear map. Second, if x ∈ X is such that ST(x) = 0 ∈ V, the injectivity of S implies T(x) = 0 ∈ U and the injectivity of T further implies x = 0 ∈ X. So ST is one-to-one. For any z ∈ V, there exists y ∈ U such that S(y) = z. Also, we can find x ∈ X such that T(x) = y. So ST(x) = S(y) = z. This shows ST is onto. Combined, we conclude ST is invertible.

By associativity, we have (ST)(T⁻¹S⁻¹) = ((ST)T⁻¹)S⁻¹ = (S(TT⁻¹))S⁻¹ = SS⁻¹ = id_V. Replacing S with T⁻¹ and T with S⁻¹, we also have (T⁻¹S⁻¹)(ST) = id_X. Therefore, we can conclude (ST)⁻¹ = T⁻¹S⁻¹.
I 7. (page 26) Show that whenever meaningful,

(ST)′ = T′S′, (T + R)′ = T′ + R′, and (T⁻¹)′ = (T′)⁻¹.

(i)

Proof. Suppose T: X → U and S: U → V are linear maps. Then for any given l ∈ V′, ((ST)′l, x) = (l, STx) = (S′l, Tx) = (T′S′l, x), ∀x ∈ X. Therefore (ST)′l = T′S′l. Letting l run through every element of V′, we conclude (ST)′ = T′S′.

(ii)

Proof. Suppose T and R are both linear maps from X to U. For any given l ∈ U′, we have ((T + R)′l, x) = (l, (T + R)x) = (l, Tx + Rx) = (l, Tx) + (l, Rx) = (T′l, x) + (R′l, x) = ((T′ + R′)l, x), ∀x ∈ X. Therefore (T + R)′l = (T′ + R′)l. Letting l run through every element of U′, we conclude (T + R)′ = T′ + R′.

(iii)

Proof. Suppose T is an isomorphism from X to U; then T⁻¹ is a well-defined linear map. We first show T′ is an isomorphism from U′ to X′. Indeed, if l ∈ U′ is such that T′l = 0, then for any x ∈ X, 0 = (T′l, x) = (l, Tx). As x varies and goes through every element of X, Tx goes through every element of U. By considering the identification of U with U″, we conclude l = 0. So T′ is one-to-one. For any given m ∈ X′, define l = mT⁻¹; then l ∈ U′. For any x ∈ X, we have (m, x) = (m, T⁻¹(Tx)) = (l, Tx) = (T′l, x). Since x is arbitrary, m = T′l and T′ is therefore onto. Combined, we conclude T′ is an isomorphism from U′ to X′ and (T′)⁻¹ is hence well-defined.

By part (i), (T⁻¹)′T′ = (TT⁻¹)′ = (id_U)′ = id_{U′} and T′(T⁻¹)′ = (T⁻¹T)′ = (id_X)′ = id_{X′}. This shows (T⁻¹)′ = (T′)⁻¹.
I 8. (page 26) Show that if X is identified with X″ and U with U″ via (5) in Chapter 2, then

T″ = T.

Proof. Suppose φ: X → X″ and ψ: U → U″ are the isomorphisms defined in Chapter 2, formula (5), which identify X with X″ and U with U″, respectively. Then for any x ∈ X and l ∈ U′, we have

(T″φ(x), l) = (φ(x), T′l) = (T′l, x) = (l, Tx) = (ψ(Tx), l).

Since x and l are arbitrary, T″φ = ψT, i.e. T″ = T under the identifications.
I 9. (page 28) Show that if A in L(X, X) is a left inverse of B in L(X, X), that is AB = I, then it is also a right inverse: BA = I.
I 10. (page 30) Show that if M is invertible, and similar to K, then K also is invertible, and K 1 is similar
to M 1 .
Proof. Suppose A is invertible; we have AB = AB(AA⁻¹) = A(BA)A⁻¹. So AB and BA are similar. The case of B being invertible can be proved similarly.
I 12. (page 31) Show that P defined above is a linear map, and that it is a projection.

Proof. For x = (x_1, ..., x_n) and y = (y_1, ..., y_n),

P(x + y) = P((x_1 + y_1, ..., x_n + y_n)) = (0, 0, x_3 + y_3, ..., x_n + y_n) = (0, 0, x_3, ..., x_n) + (0, 0, y_3, ..., y_n) = P(x) + P(y),

and similarly P(kx) = kP(x), so P is linear. Moreover P²(x) = P((0, 0, x_3, ..., x_n)) = (0, 0, x_3, ..., x_n) = P(x). So P is a projection.
I 13. (page 31) Prove that P defined above is linear, and that it is a projection.

S ∘ T ∘ S⁻¹(k) = S ∘ T ∘ S⁻¹(k·1) = k·c, ∀k ∈ K.

(b)

Proof. If c ≠ 1, it's easy to verify that I + \frac{1}{1-c}T is the inverse of I − T.
I 15. (page 31) Suppose T and S are linear maps of a finite dimensional vector space into itself. Show that the rank of ST is less than or equal to the rank of S. Show that the dimension of the nullspace of ST is less than or equal to the sum of the dimensions of the nullspaces of S and of T.

Proof. Because R_{ST} ⊂ R_S, rank(ST) = dim R_{ST} ≤ dim R_S = rank(S). Moreover, since the column rank of a matrix equals its row rank (see Chapter 3, Theorem 6 and Chapter 4, Theorem 2), we have rank(ST) = rank(T′S′) ≤ rank(T′) = rank(T). Combined, we conclude rank(ST) ≤ min{rank(S), rank(T)}.

Also, we note N_{ST}/N_T is isomorphic to N_S ∩ R_T, with the isomorphism defined by φ({x}) = Tx, where {x} := x + N_T. It's easy to see φ is well-defined, linear, and both injective and surjective. So by Theorem 6 of Chapter 1,

dim N_{ST} = dim N_T + dim N_{ST}/N_T = dim N_T + dim(N_S ∩ R_T) ≤ dim N_T + dim N_S.

Remark 6. The result rank(ST) ≤ min{rank(S), rank(T)} is used in econometrics. Cf. Greene [4, page 985], Appendix A.
4 Matrices
The book's own solution gives answers to Ex 1, 2, 4.

Show that the ith row of DA equals d_i times the ith row of A, and show that the jth column of AD equals d_j times the jth column of A.

Proof. It appears the phrasing of the exercise is problematic: when m ≠ n, AD or DA may not be well-defined, so we will assume m = n in the following. We can write A in the row form

\[
A = \begin{pmatrix} r_1\\ r_2\\ \vdots\\ r_n \end{pmatrix}.
\]

Then DA can be written as

\[
DA = \begin{pmatrix} d_1 & 0 & \cdots & 0\\ 0 & d_2 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & d_n \end{pmatrix}\begin{pmatrix} r_1\\ r_2\\ \vdots\\ r_n \end{pmatrix} = \begin{pmatrix} d_1 r_1\\ d_2 r_2\\ \vdots\\ d_n r_n \end{pmatrix}.
\]

We can also write A in the column form [c_1, c_2, ..., c_n]; then AD can be written as

\[
AD = [c_1, c_2, \dots, c_n]\begin{pmatrix} d_1 & 0 & \cdots & 0\\ 0 & d_2 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & d_n \end{pmatrix} = [d_1 c_1, d_2 c_2, \dots, d_n c_n].
\]
I 2. (page 37) Look up in any text the proof that the row rank of a matrix equals its column rank, and
compare it to the proof given in the present text.
Proof. Proofs in most textbooks are lengthy and complicated. For a clear, although still lengthy, proof, see
[12, page 112], Theorem 3.5.3.
I 3. (page 38) Show that the product of two matrices in 2 × 2 block form can be evaluated as

\[
\begin{pmatrix} A_{11} & A_{12}\\ A_{21} & A_{22} \end{pmatrix}\begin{pmatrix} B_{11} & B_{12}\\ B_{21} & B_{22} \end{pmatrix} = \begin{pmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22}\\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{pmatrix}.
\]

Proof. The calculation is a bit messy. We refer the reader to [12, page 190], Theorem 4.6.1.
I 5. (page 40) Show that x_1, x_2, x_3, and x_4 given by (20)_j satisfy all four equations (20).

Proof.

\[
\begin{pmatrix} 1 & 2 & 3 & 1\\ 2 & 5 & 4 & 3\\ 2 & 3 & 4 & 1\\ 1 & 4 & 2 & 2 \end{pmatrix}\begin{pmatrix} 1\\ 2\\ -2\\ -1 \end{pmatrix} = \begin{pmatrix} 1\cdot1 + 2\cdot2 + 3\cdot(-2) + 1\cdot(-1)\\ 2\cdot1 + 5\cdot2 + 4\cdot(-2) + 3\cdot(-1)\\ 2\cdot1 + 3\cdot2 + 4\cdot(-2) + 1\cdot(-1)\\ 1\cdot1 + 4\cdot2 + 2\cdot(-2) + 2\cdot(-1) \end{pmatrix} = \begin{pmatrix} -2\\ 1\\ -1\\ 3 \end{pmatrix}.
\]
I 6. (page 41) Choose values of u_1, u_2, u_3, u_4 so that condition (23) is satisfied, and determine all solutions of equations (22).

Proof. We choose u_1 = u_2 = u_3 = 1 and u_4 = 2. Then x_3 = −5x_4 − u_3 − u_2 + 3u_1 = −5x_4 + 1, x_2 = 7x_4 + u_4 − 3u_1 = 7x_4 − 1, and x_1 = u_1 − x_2 − 2x_3 − 3x_4 = 1 − (7x_4 − 1) − 2(−5x_4 + 1) − 3x_4 = 0.
lM = 0.

Proof.

\[
[1, -2, -1, 1]\begin{pmatrix} 1 & 1 & 2 & 3\\ 1 & 2 & 3 & 1\\ 2 & 1 & 2 & 3\\ 3 & 4 & 6 & 2 \end{pmatrix} = [1\cdot1 - 2\cdot1 - 1\cdot2 + 1\cdot3,\; 1\cdot1 - 2\cdot2 - 1\cdot1 + 1\cdot4,\; 1\cdot2 - 2\cdot3 - 1\cdot2 + 1\cdot6,\; 1\cdot3 - 2\cdot1 - 1\cdot3 + 1\cdot2] = 0.
\]
I 8. (page 42) Show by Gaussian elimination that the only left nullvectors of M are multiples of l in Exercise 7, and then use Theorem 5 of Chapter 3 to show that condition (23) is sufficient for the solvability of the system (22).

Proof. Suppose a row vector x = (x_1, x_2, x_3, x_4) satisfies xM = 0. Then we can proceed by Gaussian elimination:

\[
\begin{cases} x_1 + x_2 + 2x_3 + 3x_4 = 0\\ x_1 + 2x_2 + x_3 + 4x_4 = 0\\ 2x_1 + 3x_2 + 2x_3 + 6x_4 = 0\\ 3x_1 + x_2 + 3x_3 + 2x_4 = 0 \end{cases}
\Rightarrow
\begin{cases} x_2 - x_3 + x_4 = 0\\ x_2 - 2x_3 = 0\\ -2x_2 - 3x_3 - 7x_4 = 0 \end{cases}
\Rightarrow
\begin{cases} x_3 + x_4 = 0\\ -5x_3 - 5x_4 = 0. \end{cases}
\]
So we have x_1 = x_4, x_2 = −2x_4, and x_3 = −x_4, i.e. x = x_4(1, −2, −1, 1), a multiple of l in Exercise 7.

Equation (22) has a solution if and only if u = (u_1, u_2, u_3, u_4)^T is in R_M. By Theorem 5 of Chapter 3, this is equivalent to yu = 0, ∀y ∈ N_{M′} (elements of N_{M′} are seen as row vectors). We have proved every such y is a multiple of l. Hence condition (23), which is just lu = 0, is sufficient for the solvability of the system (22).
5 Determinant and Trace

Comments:
1) For a more intuitive proof of Theorem 2 (det(BA) = det A det B), see Munkres [10, page 18], Theorem 2.10.
2) The following proposition is one version of Cramer's rule and will be used in the proof of Lemma 6, Chapter 6 (formula (21) on page 68).
Proposition 5.1. Let A be an n × n matrix and B defined as the matrix of cofactors of A; that is, B_{ij} = (−1)^{i+j} det A_{ji}. Then BA = AB = (det A)·I_{n×n}.

Proof. Suppose A has the column form A = (a_1, ..., a_n). By replacing the jth column with the ith column in A, we obtain

M = (a_1, ..., a_{j−1}, a_i, a_{j+1}, ..., a_n).

On one hand, Property (i) of a determinant gives det M = δ_{ij} det A; on the other hand, Laplace expansion of a determinant gives

\[
\det M = \sum_{k=1}^n (-1)^{k+j} a_{ki} \det A_{kj} = \sum_{k=1}^n a_{ki} B_{jk} = (B_{j1}, \dots, B_{jn})\begin{pmatrix} a_{1i}\\ \vdots\\ a_{ni} \end{pmatrix}.
\]

Combined, we can conclude (det A)·I_{n×n} = BA. Applying the same argument to rows instead of columns, we get the similar result for AB.
(b)

Proof. By definition, we have

P(p_1 ∘ p_2 (x_1, ..., x_n)) = P(p_1(p_2(x_1, ..., x_n))) = σ(p_1) P(p_2(x_1, ..., x_n)) = σ(p_1)σ(p_2) P(x_1, ..., x_n).
Proof. To see (c) is true, we suppose t interchanges i_0 and j_0. Without loss of generality, we assume i_0 < j_0. Then a direct count shows the number of inversions of t is 2(j_0 − i_0 − 1) + 1, which is odd. So σ(t) = −1.

To see (d) is true, note formula (9) is equivalent to id = t_k ∘ ··· ∘ t_1 ∘ p⁻¹. Acting these operations on (1, ..., n), we have (1, ..., n) = t_k ∘ ··· ∘ t_1 (p⁻¹(1), ..., p⁻¹(n)). Then the problem is reduced to proving that a sequence of transpositions can sort an array of numbers into ascending order. There are many ways to achieve that. For example, we can let t_1 be the transposition that interchanges p⁻¹(1) and p⁻¹(i_0), where i_0 satisfies p⁻¹(i_0) = 1. That is, t_1 puts 1 in the first position of the sequence. Then we let t_2 be the transposition that puts 2 in the second position. We continue this procedure until we sort out the whole sequence. This shows sorting can be accomplished by a sequence of transpositions.
I 3. (page 48) Show that the decomposition (9) is not unique, but that the parity of the number k of factors is unique.

Proof. For any transposition t, we have t ∘ t = id. So if p = t_k ∘ ··· ∘ t_1, we can get another decomposition p = t_k ∘ ··· ∘ t_1 ∘ t ∘ t. This shows the decomposition is not unique.

Suppose the permutation p has two different decompositions into transpositions: p = t_k ∘ ··· ∘ t_1 = τ_m ∘ ··· ∘ τ_1. By formula (7), part (b) and formula (8), σ(p) = (−1)^k = (−1)^m. So k − m is an even number. This shows the parity of the number of factors is unique.
I 4. (page 49) Show that D defined by (16) has Properties (ii), (iii) and (iv).

Proof. To verify Property (ii), note for any index j and α, β ∈ K, we have

\[
D(a_1, \dots, \alpha a_j + \beta a_j', \dots, a_n) = \sum_p \sigma(p)\, a_{p_1 1} \cdots (\alpha a_{p_j j} + \beta a_{p_j j}') \cdots a_{p_n n} = \alpha D(a_1, \dots, a_j, \dots, a_n) + \beta D(a_1, \dots, a_j', \dots, a_n).
\]

To verify Property (iii), note e_{p_1 1} ··· e_{p_n n} is non-zero if and only if p_i = i for every 1 ≤ i ≤ n. In this case the product is 1.

To verify Property (iv), note for any i ≠ j, if we denote by t the transposition that interchanges i and j, then p ↦ p ∘ t is a one-to-one and onto map from the set of all permutations to itself. Therefore, we have

\[
D(a_1, \dots, a_i, \dots, a_j, \dots, a_n) = \sum_p \sigma(p)\, a_{p_1 1} \cdots a_{p_i i} \cdots a_{p_j j} \cdots a_{p_n n} = \sum_p (-1)\sigma(p\circ t)\, a_{(p\circ t)_1 1} \cdots a_{(p\circ t)_j i} \cdots a_{(p\circ t)_i j} \cdots a_{(p\circ t)_n n} = (-1)\sum_q \sigma(q)\, a_{q_1 1} \cdots a_{q_i j} \cdots a_{q_j i} \cdots a_{q_n n} = -D(a_1, \dots, a_j, \dots, a_i, \dots, a_n).
\]
I 5. (page 49) Show that Property (iv) implies Property (i), unless the field K has characteristic two, that is, 1 + 1 = 0.

Proof. By Property (iv), D(a_1, ..., a_i, ..., a_i, ..., a_n) = −D(a_1, ..., a_i, ..., a_i, ..., a_n). So, adding D(a_1, ..., a_i, ..., a_i, ..., a_n) to both sides of the equation, we have 2D(a_1, ..., a_i, ..., a_i, ..., a_n) = 0. If the characteristic of the field K is not two, we can conclude D(a_1, ..., a_i, ..., a_i, ..., a_n) = 0.

Remark 7. This exercise and Exercise 4 together show formula (16) is equivalent to Properties (i)-(iii), provided the characteristic of K is not two. Therefore, for K = R or C, we can either use (16) or Properties (i)-(iii) as the definition of the determinant.
I 6. (page 52) Verify that C(A_{11}) has properties (i)-(iii).

Proof. If two column vectors a_i and a_j (i ≠ j) of A_{11} are equal, we have \begin{pmatrix}0\\ a_i\end{pmatrix} = \begin{pmatrix}0\\ a_j\end{pmatrix}. So C(A_{11}) = 0 and property (i) is satisfied. Since any linear operation on a column vector a_i of A_{11} can be naturally extended to \begin{pmatrix}0\\ a_i\end{pmatrix}, property (ii) is also satisfied. Finally, we note when A_{11} = I_{(n−1)×(n−1)}, \begin{pmatrix}1 & 0\\ 0 & A_{11}\end{pmatrix} = I_{n×n}. So property (iii) is satisfied.
I 7. (page 52) Deduce Corollary 5 from Lemma 4.

Proof. We first move the jth column to the position of the first column. This can be done by interchanging neighboring columns (j − 1) times. The determinant of the resulting matrix A_1 is (−1)^{j−1} det A. Then we move the ith row to the position of the first row. This can be done by interchanging neighboring rows (i − 1) times. The resulting matrix A_2 has a determinant equal to (−1)^{i−1} det A_1 = (−1)^{i+j} det A. On the other hand, A_2 has the form

\[
\begin{pmatrix} 1 & *\\ 0 & A_{ij} \end{pmatrix}.
\]

By Lemma 4, we have det A_{ij} = det A_2 = (−1)^{i+j} det A. So det A = (−1)^{i+j} det A_{ij}.

Remark 8. Rigorously speaking, we only proved that swapping two neighboring columns gives a minus sign to the determinant (Property (iv)), but we haven't proved this property for neighboring rows. This can be made rigorous by using det A = det A^T (Exercise 8 of this chapter).
[Hint: Use formula (16) and show that for any permutation p, σ(p) = σ(p⁻¹).]

Proof. We first show for any permutation p, σ(p) = σ(p⁻¹). Indeed, by formula (7)(b), we have 1 = σ(id) = σ(p ∘ p⁻¹) = σ(p)σ(p⁻¹). By formula (7)(a), we conclude σ(p) = σ(p⁻¹). Second, we denote by b_{ij} the (i, j)-th entry of A^T; then b_{ij} = a_{ji}. By formula (16) and the fact that p ↦ p⁻¹ is a one-to-one and onto map from the set of all permutations to itself, we have

\[
\det A^T = \sum_p \sigma(p)\, b_{p_1 1} \cdots b_{p_n n} = \sum_p \sigma(p)\, a_{1 p_1} \cdots a_{n p_n} = \sum_p \sigma(p^{-1})\, a_{p^{-1}(p_1)\, p_1} \cdots a_{p^{-1}(p_n)\, p_n} = \sum_p \sigma(p^{-1})\, a_{p^{-1}(1)\, 1} \cdots a_{p^{-1}(n)\, n} = \det A.
\]
I 9. (page 54) Given a permutation p of n objects, we define an associated so-called permutation matrix P as follows:

P_{ij} = 1 if j = p(i), and P_{ij} = 0 otherwise.

Show that the action of P on any vector x performs the permutation p on the components of x. Show that if p, q are two permutations and P, Q are the associated permutation matrices, then the permutation matrix associated with p ∘ q is the product P Q.

Proof. By Exercise 2, it suffices to prove the property for transpositions. Suppose p interchanges i_1, i_2 and q interchanges j_1, j_2. Denote by P and Q the corresponding permutation matrices, respectively. Then for any x = (x_1, ..., x_n)^T ∈ R^n, we have (δ_{ij} is the Kronecker sign)

\[
(Px)_i = \sum_j P_{ij} x_j = \sum_j \delta_{p(i)\, j}\, x_j = \begin{cases} x_{i_2} & \text{if } i = i_1\\ x_{i_1} & \text{if } i = i_2\\ x_i & \text{otherwise.} \end{cases}
\]

This shows the action of P on any column vector x performs the permutation p on the components of x. Similarly, we have

\[
(Qx)_i = \begin{cases} x_{j_2} & \text{if } i = j_1\\ x_{j_1} & \text{if } i = j_2\\ x_i & \text{otherwise.} \end{cases}
\]

Since (PQ)(x) = P(Q(x)), the action of the matrix PQ on x performs first the permutation q and then the permutation p on the components of x. Therefore, the permutation matrix associated with p ∘ q is the product of P and Q.
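A small numerical illustration (ours, not the manual's; note the 0-based index convention of the arrays, unlike the text's 1-based one):

import numpy as np

def perm_matrix(p):
    n = len(p)
    P = np.zeros((n, n))
    P[np.arange(n), p] = 1.0           # row i has its 1 in column p(i)
    return P

p = np.array([1, 0, 2])                # transposition swapping positions 0, 1
q = np.array([0, 2, 1])                # transposition swapping positions 1, 2
x = np.array([10.0, 20.0, 30.0])
print(perm_matrix(p) @ x)              # [20. 10. 30.]: components permuted by p
r = q[p]                               # composed index map i -> q(p(i))
print(np.allclose(perm_matrix(p) @ perm_matrix(q), perm_matrix(r)))  # True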
I 10. (page 56) Let A be an m × n matrix, B an n × m matrix. Show that

tr AB = tr BA.

Proof.

\[
\operatorname{tr}(AB) = \sum_{i=1}^m (AB)_{ii} = \sum_{i=1}^m \sum_{j=1}^n a_{ij} b_{ji} = \sum_{j=1}^m \sum_{i=1}^n a_{ji} b_{ij} = \sum_{i=1}^n \sum_{j=1}^m b_{ij} a_{ji} = \sum_{i=1}^n (BA)_{ii} = \operatorname{tr}(BA),
\]

where the third equality is obtained by interchanging the names of the indices i, j.
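A one-line numerical spot check (our addition) with rectangular factors:

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))        # m x n
B = rng.standard_normal((5, 3))        # n x m
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))  # True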
I 11. (page 56) Let A be an n × n matrix, A^T its transpose. Show that

tr AA^T = Σ_{i,j} a_{ij}².

The square root of the double sum on the right is called the Euclidean, or Hilbert-Schmidt, norm of the matrix A.

Proof.

\[
\operatorname{tr}(AA^T) = \sum_i (AA^T)_{ii} = \sum_i \sum_j A_{ij} A^T_{ji} = \sum_i \sum_j a_{ij} a_{ij} = \sum_{i,j} a_{ij}^2.
\]

is D = ad − bc.
Proof. Apply Laplace expansion of a determinant according to its columns (Theorem 6).
I 13. (page 56) Show that the determinant of an upper triangular matrix, one whose elements are zero
below the main diagonal, equals the product of its elements along the diagonal.
Proof. Apply Laplace expansion of a determinant according to its columns (Theorem 6) and work by induction.
I 14. (page 57) How many multiplications does it take to evaluate det A by using Gaussian elimination to bring it into upper triangular form?

Proof. Denote by M(n) the number of multiplications needed to evaluate det A of an n × n matrix A by using Gaussian elimination to bring it into upper triangular form. To use the first row to eliminate a_{21}, a_{31}, ..., a_{n1}, we need n(n − 1) multiplications. So M(n) = n(n − 1) + M(n − 1) with M(1) = 0, and

\[
M(n) = \sum_{k=1}^n k(k-1) = \frac{n(n+1)(2n+1)}{6} - \frac{n(n+1)}{2} = \frac{(n-1)n(n+1)}{3}.
\]
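The count can be confirmed by instrumenting a toy elimination (our sketch; it counts one multiplication per multiplier and per updated entry, matching the bookkeeping above):

def count_mults(n):
    count = 0
    for k in range(n - 1):             # eliminate below pivot k
        for i in range(k + 1, n):      # each of the n-k-1 rows below
            count += 1                 # forming the multiplier
            count += n - k - 1         # updating the remaining row entries
    return count

for n in range(2, 7):
    print(n, count_mults(n), (n - 1) * n * (n + 1) // 3)   # the two agree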
I 15. (page 57) How many multiplications does it take to evaluate det A by formula (16)?
Proof. Denote by M (n) the number of multiplications needed to evaluate the determinant of an n n matrix
by formula (16). Then M (n) = nM (n 1). So M (n) = n!.
can be calculated as follows. Copy the first two columns of A as a fourth and fifth column:

\[
\begin{array}{ccc|cc} a & b & c & a & b\\ d & e & f & d & e\\ g & h & i & g & h \end{array}
\]

Proof. We apply Laplace expansion of a determinant according to its columns (Theorem 6):

\[
\det\begin{pmatrix} a & b & c\\ d & e & f\\ g & h & i \end{pmatrix} = a\det\begin{pmatrix} e & f\\ h & i \end{pmatrix} - d\det\begin{pmatrix} b & c\\ h & i \end{pmatrix} + g\det\begin{pmatrix} b & c\\ e & f \end{pmatrix} = a(ei - fh) - d(ib - ch) + g(bf - ce) = aei + bfg + cdh - gec - afh - idb.
\]
6 Spectral Theory

The book's own solution gives answers to Ex 2, 5, 7, 8, 12.

Comments:
1) λ ∈ C is an eigenvalue of a square matrix A if and only if it is a root of the characteristic polynomial p_A(a) = det(aI − A) (Corollary 3 of Chapter 5). The spectral mapping theorem (Theorem 4) extends this result further to polynomials of A.
2) The proof of Lemma 6 in this chapter (formula (21) on page 68) uses Proposition 5.1 (see the Comments of Chapter 5).
3) On p.72, the fact that from a certain index on, the N_d's become equal can be seen from the following line of reasoning. Assume N_{d−1} ≠ N_d while N_d = N_{d+1}. For any x ∈ N_{d+2}, we have (A − aI)x ∈ N_{d+1} = N_d. So x ∈ N_{d+1} = N_d. Then we work by induction.
4) Theorem 12 can be enhanced to a statement on necessary and sufficient conditions, which leads to the Jordan canonical form (see Appendix A.15 for details).

Supplementary notes:
1) The minimal polynomial is defined from the algebraic point of view, as the generator of the polynomial ideal {p : p(A) = 0}. So the powers of its linear factors are given algebraically. Meanwhile, the index of an eigenvalue is defined from the geometric point of view. Theorem 11 says they are equal.
2) As a corollary of Theorem 11, we claim an n × n matrix A can be diagonalized over the field F if and only if its minimal polynomial can be decomposed into the product of distinct linear factors (polynomials of degree 1 over the field F). Indeed, by the uniqueness of the minimal polynomial, we have

m_A is the product of distinct linear factors ⟺ F^n = ⊕_{j=1}^k N_1(a_j) ⟺ F^n has a basis {x_i}_{i=1}^n consisting of eigenvectors of A.
Because U⁻¹U = I, we must have

\[
U^{-1}AU = \begin{pmatrix} \lambda I_{s\times s} & B\\ 0 & C \end{pmatrix}
\]

and

\[
\det(\mu I - A) = \det(\mu I - U^{-1}AU) = \det\begin{pmatrix} (\mu-\lambda) I_{s\times s} & -B\\ 0 & \mu I_{(n-s)\times(n-s)} - C \end{pmatrix} = (\mu - \lambda)^s \det(\mu I - C).^1
\]

So s ≤ m.

We continue to use the notation from Proposition 6.1, and we define d(λ) as the index of λ. Then we have

Proposition 6.3 (Algebraic multiplicity and the dimension of the space of generalized eigenvectors). m(λ) = dim N_{d(λ)}(λ).

In words: the algebraic multiplicity of an eigenvalue equals the dimension of its space of generalized eigenvectors.
I 2. (page 65) (a) Prove that if A has n distinct eigenvalues a_j and all of them are less than one in absolute value, then for all h in C^n,

A^N h → 0, as N → ∞,

that is, all components of A^N h tend to zero.
(b) Prove that if all a_j are greater than one in absolute value, then for all h ≠ 0,

|A^N h| → ∞, as N → ∞.

(b)

¹ For the last equality, see, for example, Munkres [10, page 24], Problem 6, or [6, page 173].
Proof. We use the same notation as in part (a): write h = Σ_{j=1}^n h_j with h_j in the eigenspace of a_j, so that A^N h = Σ_j a_j^N h_j. Since h ≠ 0, there exists some k_0 ∈ {1, ..., n} so that the k_0-th coordinate of h satisfies h_{k_0} = Σ_j (h_j)_{k_0} ≠ 0. Then |(A^N h)_{k_0}| = |Σ_j a_j^N (h_j)_{k_0}|. Define b_1 = max_{1≤i≤n}{|a_i| : (h_i)_{k_0} ≠ 0}. Then b_1 > 1 and

|(A^N h)_{k_0}| = |b_1|^N · |Σ_{i=1}^n (a_i/b_1)^N (h_i)_{k_0}| → ∞ as N → ∞.
that the sum of the eigenvalues equals the trace, and their product is the determinant of the matrix.
Proof. The verication is straightforward.
Proof. By Lemma 9, N_{p_1···p_k} = N_{p_1} ⊕ N_{p_2···p_k} = N_{p_1} ⊕ (N_{p_2} ⊕ N_{p_3···p_k}) = N_{p_1} ⊕ N_{p_2} ⊕ N_{p_3···p_k} = ··· = N_{p_1} ⊕ N_{p_2} ⊕ ··· ⊕ N_{p_k}.
where the last equality comes from the observation N_{m_j}(a_j) ⊂ N_{m_j + d_j}(a_j) = N_{d_j}(a_j), by the definition of d_j. This shows the polynomial Π_{j=1}^k (s − a_j)^{d_j} belongs to {polynomials p : p(A) = 0}. By the definition of the minimal polynomial, r_j ≤ d_j for j = 1, ..., k.

Assume for some j, r_j < d_j; we can then find x ∈ N_{d_j}(a_j) \ N_{r_j}(a_j) with x ≠ 0. Define q(s) = Π_{i=1, i≠j}^k (s − a_i)^{r_i}; then by Corollary 10, x can be uniquely decomposed into x = x′ + x″ with x′ ∈ N_q and x″ ∈ N_{r_j}(a_j). We have 0 = (A − a_j I)^{d_j} x = (A − a_j I)^{d_j} x′ + 0. So x′ ∈ N_q ∩ N_{d_j}(a_j) = {0}. This implies x = x″ ∈ N_{r_j}(a_j), a contradiction. So r_j = d_j for every j.
Remark 9. Along the way, we have shown that the index d of an eigenvalue is no greater than the algebraic multiplicity of the eigenvalue in the characteristic polynomial. See also Proposition 6.2.
I 9. (page 75) Prove Corollary 15.

Proof. The extension is straightforward, as the key feature of the proof, that B maps N^{(j)} into N^{(j)}, remains the same regardless of the number of linear maps, as long as they commute pairwise.
(b)

Proof. We note (h_1, h_1) = 1 + a_1² = (5 + √5)/2 and (h_2, h_2) = 1 + a_2² = (5 − √5)/2. For x = \begin{pmatrix}0\\1\end{pmatrix}, we have (h_1, x) = a_1 and (h_2, x) = a_2. So using formulas (44) and (45), x = c_1 h_1 + c_2 h_2 with

c_1 = a_1 / ((5 + √5)/2) = 1/√5, c_2 = a_2 / ((5 − √5)/2) = −1/√5.

This agrees with the expansion obtained in Example 2.
I 12. (page 76) In Example 1 we have determined the eigenvalues and corresponding eigenvectors of the matrix

\[
\begin{pmatrix} 3 & 2\\ 1 & 4 \end{pmatrix}
\]

as a_1 = 2, h_1 = \begin{pmatrix}2\\-1\end{pmatrix}, and a_2 = 5, h_2 = \begin{pmatrix}1\\1\end{pmatrix}. Determine eigenvectors l_1 and l_2 of its transpose and show that

\[
(l_i, h_j) \begin{cases} = 0 & \text{for } i \neq j\\ \neq 0 & \text{for } i = j. \end{cases}
\]

Proof. The transpose of the matrix has the same eigenvalues a_1 = 2, a_2 = 5. Solving the equation

\[
\begin{pmatrix} 3 & 1\\ 2 & 4 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = 2\begin{pmatrix} x\\ y \end{pmatrix},
\]

we have l_1 = [1, −1]. Solving the equation

\[
\begin{pmatrix} 3 & 1\\ 2 & 4 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = 5\begin{pmatrix} x\\ y \end{pmatrix},
\]

we have l_2 = [1, 2]. Then it's easy to calculate (l_1, h_1) = 3, (l_1, h_2) = 0, (l_2, h_1) = 0, and (l_2, h_2) = 3.
I 13. (page 76) Show that the matrix

\[
A = \begin{pmatrix} 0 & 1 & 1\\ 1 & 0 & 1\\ 1 & 1 & 0 \end{pmatrix}
\]

has −1 as an eigenvalue. What are the other two eigenvalues?

Solution.

\[
\det(\lambda I - A) = \det\begin{pmatrix} \lambda & -1 & -1\\ -1 & \lambda & -1\\ -1 & -1 & \lambda \end{pmatrix} = \lambda(\lambda^2 - 1) - (\lambda + 1) - (1 + \lambda) = \lambda^3 - 3\lambda - 2 = (\lambda + 1)^2(\lambda - 2).
\]

So the other two eigenvalues are −1 and 2.
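A numerical check of the factorization (our addition):

import numpy as np

A = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])
print(np.sort(np.linalg.eigvalsh(A)))   # [-1. -1.  2.], as derived above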
7 Euclidean Structure
The book's own solution gives answers to Ex 1, 2, 3, 5, 6, 7, 8, 14, 17, 19, 20.

Erratum: In the Note on page 92, the infinite-dimensional version of Theorem 15 is Theorem 5 in Chapter 15, not Theorem 15.
I 3. (page 89) Construct the matrix representing reflection of points in R³ across the plane x_3 = 0. Show that the determinant of this matrix is −1.

Proof. Under the reflection across the plane {(x_1, x_2, x_3) : x_3 = 0}, the point (x_1, x_2, x_3) is mapped to (x_1, x_2, −x_3). So the corresponding matrix is

\[
\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & -1 \end{pmatrix},
\]

whose determinant is −1.
Proof. Suppose the plane L is determined by the equation Ax + By + Cz = D. For any point x = (x_1, x_2, x_3) ∈ R³, we first find y = (y_1, y_2, y_3) ∈ L such that the line segment xy ⊥ L. Then y must satisfy the following equations:

\[
\begin{cases} Ay_1 + By_2 + Cy_3 = D\\ (y_1 - x_1,\, y_2 - x_2,\, y_3 - x_3) = k(A, B, C), \end{cases}
\]

where k is some constant. Solving the equations gives us k = \frac{D - (Ax_1 + Bx_2 + Cx_3)}{A^2 + B^2 + C^2} and

\[
\begin{pmatrix} y_1\\ y_2\\ y_3 \end{pmatrix} = \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} + k\begin{pmatrix} A\\ B\\ C \end{pmatrix} = \left(I - \frac{1}{A^2+B^2+C^2}\begin{pmatrix} A^2 & AB & AC\\ AB & B^2 & BC\\ CA & CB & C^2 \end{pmatrix}\right)\begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} + \frac{D}{A^2+B^2+C^2}\begin{pmatrix} A\\ B\\ C \end{pmatrix}.
\]

The reflected point is R(x) = 2y − x. To make the reflection R a linear mapping, it is necessary and sufficient that D = 0. So the problem's statement should be corrected to "let R be reflection across any plane in R³ that contains the origin." Then

\[
R = \frac{1}{A^2+B^2+C^2}\begin{pmatrix} -A^2+B^2+C^2 & -2AB & -2AC\\ -2AB & A^2-B^2+C^2 & -2BC\\ -2CA & -2CB & A^2+B^2-C^2 \end{pmatrix}.
\]
Proof. The proof is the same as the one for the real Euclidean space.
I 11. (page 96) Show that a unitary map M satisfies the relation

M*M = I.

Proof. If M is a unitary map, then by the polarization identity, M preserves the inner product. So ∀x, y, (x, M*My) = (Mx, My) = (x, y). Since x is arbitrary, M*My = y, ∀y ∈ X. So M*M = I. Conversely, if M*M = I, then (x, x) = (x, M*Mx) = (Mx, Mx). So M is an isometry.
I 12. (page 96) Show that if M is unitary, so are M⁻¹ and M*.

I 13. (page 96) Show that the unitary maps form a group under multiplication.

i.e. |det M| = 1.
I 15. (page 96) Let X be the space of continuous complex-valued functions on [−1, 1] and define the scalar product in X by

(f, g) = ∫_{−1}^1 f(s) \overline{g(s)} ds.
I 16. (page 98) Prove the following analogue of (51) for matrices with complex entries:

||A|| ≤ (Σ_{i,j} |a_{ij}|²)^{1/2}.

Proof. The proof is very similar to that of the real case, so we omit the details. Note we need the complex version of the Schwarz inequality (Exercise 8).
I 17. (page 98) Show that

Σ_{i,j} |a_{ij}|² = tr AA*.
Proof. We have

\[
(AA^*)_{ij} = [a_{i1}, \dots, a_{in}]\begin{pmatrix} \bar a_{j1}\\ \vdots\\ \bar a_{jn} \end{pmatrix} = \sum_{k=1}^n a_{ik} \bar a_{jk}.
\]

So (AA*)_{ii} = Σ_{k=1}^n |a_{ik}|² and tr(AA*) = Σ_{i,j} |a_{ij}|².

Solution. Suppose λ_1 and λ_2 are the two eigenvalues of A. Then by Theorem 3 of Chapter 6, λ_1 + λ_2 = tr A = 4 and λ_1λ_2 = det A = 3. Solving the equations gives us λ_1 = 1, λ_2 = 3. By formula (46), ||A|| ≥ 3. According to formula (51), we have ||A|| ≤ √(1² + 2² + 3²) = √14. Combined, we have 3 ≤ ||A|| ≤ √14 ≈ 3.7417.
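A numerical sketch with a hypothetical matrix (the exercise statement did not survive this extraction, but A = [[1, 2], [0, 3]] matches the data used above: tr A = 4, det A = 3, entry squares summing to 14):

import numpy as np

A = np.array([[1.0, 2.0], [0.0, 3.0]])
norm = np.linalg.norm(A, 2)              # spectral norm = largest singular value
print(norm)                              # about 3.65
print(3 <= norm <= np.sqrt(14))          # True: the two bounds bracket it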
I 20. (page 99) (i) w is a bilinear function of x and y. Therefore we write w as a product of x and y, denoted as

w = x × y,

and called the cross product.
(ii) Show that the cross product is antisymmetric:

y × x = −x × y.
(i)
Proof. For any a_1, a_2 ∈ R, we have

(w(a_1x_1 + a_2x_2, y), z) = det(a_1x_1 + a_2x_2, y, z) = a_1 det(x_1, y, z) + a_2 det(x_2, y, z) = (a_1 w(x_1, y) + a_2 w(x_2, y), z).

Since z is arbitrary, we necessarily have w(a_1x_1 + a_2x_2, y) = a_1w(x_1, y) + a_2w(x_2, y). Similarly, we can prove w(x, b_1y_1 + b_2y_2) = b_1w(x, y_1) + b_2w(x, y_2). Combined, we have proved w is a bilinear function of x and y.

(ii)

Proof. We note (y × x, z) = det(y, x, z) = −det(x, y, z) = −(x × y, z) for every z; hence y × x = −x × y.

(iii)

Proof. Since (w(x, y), x) = det(x, y, x) = 0 and (w(x, y), y) = det(x, y, y) = 0, x × y is orthogonal to both x and y.

(iv)

Proof. We suppose every vector is in column form and R is the matrix that represents a rotation. Then

(Rx × Ry, z) = det(Rx, Ry, z) = det(Rx, Ry, R(R^T z)) = det R · det(x, y, R^T z) = det(x, y, R^T z),

and

(R(x × y), z) = (R(x × y))^T z = (x × y)^T R^T z = (x × y, R^T z) = det(x, y, R^T z).

A rotation is isometric, so R^T = R⁻¹ and det R = 1. Combining the above two equations gives us (Rx × Ry, z) = (R(x × y), z). Since z is arbitrary, we must have Rx × Ry = R(x × y).
(v)

Proof. In the equation det(x, y, z) = (x × y, z), we set z = x × y. Since the geometric meaning of det(x, y, z) is the signed volume of the parallelepiped determined by x, y, z, and since z = x × y is perpendicular to x and y, we have det(x, y, z) = ||x|| ||y|| sin θ ||z||, where θ is the angle between x and y. Then by (x × y, z) = ||z||², we conclude ||x × y|| = ||z|| = ||x|| ||y|| sin θ.
(vi)

Proof.

\[
1 = \det\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix} = \left(\begin{pmatrix}1\\0\\0\end{pmatrix} \times \begin{pmatrix}0\\1\\0\end{pmatrix},\, \begin{pmatrix}0\\0\\1\end{pmatrix}\right).
\]

So e_1 × e_2 = (a, b, 1)^T for some a, b. By part (iii), we necessarily have a = b = 0. Therefore, we can conclude

\[
\begin{pmatrix}1\\0\\0\end{pmatrix} \times \begin{pmatrix}0\\1\\0\end{pmatrix} = \begin{pmatrix}0\\0\\1\end{pmatrix}.
\]

(vii)
Proof. By Exercises 8 and 16 of Chapter 5,

\[
\det\begin{pmatrix} a & d & g\\ b & e & h\\ c & f & i \end{pmatrix} = \det\begin{pmatrix} a & b & c\\ d & e & f\\ g & h & i \end{pmatrix} = aei + bfg + cdh - gec - hfa - idb = (bf - ec)g + (cd - fa)h + (ae - db)i = [bf - ce,\; cd - af,\; ae - bd]\begin{pmatrix} g\\ h\\ i \end{pmatrix}.
\]

So we have

\[
\begin{pmatrix} a\\ b\\ c \end{pmatrix} \times \begin{pmatrix} d\\ e\\ f \end{pmatrix} = \begin{pmatrix} bf - ce\\ cd - af\\ ae - bd \end{pmatrix}.
\]
I 21. (page 100) Show that in a Euclidean space every pair of vectors satisfies

Proof.
8 Spectral Theory of Self-Adjoint Mappings

Comments:
1) In Theorem 4, the eigenvectors of H can be complex (the proof did not show they are real), although the eigenvalues of H are real.
2) The following result will help us understand some details in the proof of Theorem 4 (page 108, "It follows from this easily that we may choose an orthonormal basis consisting of real eigenvectors in each eigenspace N_a.")
Proposition 8.1. Let X be a conjugate invariant subspace of C^n (i.e. X is invariant under the conjugate operation). Then we can find a basis of X consisting of real vectors.

Proof. We work by induction. First, assume dim X = 1. ∀v ∈ X with v ≠ 0, we must have Re v ∈ X and Im v ∈ X. At least one of them is non-zero and can be taken as a basis. Suppose for all conjugate invariant subspaces with dimension no more than k the claim is true. Let dim X = k + 1 and take v ∈ X with v ≠ 0. If Re v and Im v are (complex) linearly dependent, there must exist c ∈ C and v_0 ∈ R^n such that v = cv_0, and we let Y = span{v_0}; if Re v and Im v are (complex) linearly independent, we let Y = span{v, v̄} = span{Re v, Im v}. In either case, Y is conjugate invariant. Let Y^⊥ = {x ∈ X : Σ_{i=1}^n x_i y_i = 0, ∀y = (y_1, ..., y_n) ∈ Y}. Then clearly X = Y ⊕ Y^⊥ and Y^⊥ is also conjugate invariant. By assumption, we can choose a basis of Y^⊥ consisting exclusively of real vectors. Combined with the real basis of Y, we get a real basis of X.
3) For an elementary proof of Theorem 4 by mathematical induction, see [12, page 297], Theorem 5.9.4.
4) Theorem 5 (the spectral resolution representation of self-adjoint operators) can be extended to infinite dimensional spaces and is phrased as: any self-adjoint operator can be decomposed as an integral with respect to orthogonal projections. See any textbook on functional analysis for details.
5) For the second proof of Theorem 4, compare Spivak [13, page 122], Exercise 5-17 and Keener [5, page 15], Theorem 1.6 (the maximum principle).
Supplementary notes

In view of the spectral theorem (Theorem 7 of Chapter 6, p.70), the diagonalization of a self-adjoint matrix A is reduced to showing that in the decomposition

C^n = N_{d_1}(λ_1) ⊕ ··· ⊕ N_{d_k}(λ_k),

we must have d_i = 1, i = 1, ..., k. Indeed, assume for some λ, d(λ) > 1. Then for any x ∈ N_2(λ) \ N_1(λ), we have

(λI − A)x ≠ 0, (λI − A)²x = 0.

But the second equation implies

0 = ((λI − A)²x, x) = ((λI − A)x, (λI − A)x) = ||(λI − A)x||²,

a contradiction. So we must have d(λ) = 1. This is the substance of the proof of Theorem 4, part (b).
I 2. (page 104) We have described above an algorithm for diagonalizing q; implement it as a computer
program.
Solution. Skipped for this version.
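As a partial substitute (our sketch, not the manual's program), here is a minimal congruence diagonalization of a quadratic form in Python; it assumes all pivots encountered are nonzero (no pivoting or zero-pivot handling):

import numpy as np

def diagonalize_form(H):
    # Return (C, D) with C^T H C = D diagonal, for symmetric H with nonzero pivots.
    n = H.shape[0]
    C = np.eye(n)
    D = H.astype(float).copy()
    for k in range(n):
        for i in range(k + 1, n):
            m = D[i, k] / D[k, k]
            D[i, :] -= m * D[k, :]     # row operation ...
            D[:, i] -= m * D[:, k]     # ... and the matching column operation
            C[:, i] -= m * C[:, k]     # record the change of coordinates
    return C, np.diag(np.diag(D))

H = np.array([[2.0, 1.0], [1.0, 3.0]])
C, D = diagonalize_form(H)
print(np.allclose(C.T @ H @ C, D))     # True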
Proof. We prove p_+ + p_0 = max_{q(S)≥0} dim S; p_− + p_0 = max_{q(S)≤0} dim S can be proved similarly. We shall use representation (11) for q in terms of the coordinates z_1, ..., z_n; suppose we label them so that d_1, ..., d_p are nonnegative, where p = p_+ + p_0, and the rest are negative. Define the subspace S_1 to consist of all vectors for which z_{p+1} = ··· = z_n = 0. Clearly, dim S_1 = p and q is nonnegative on S_1. This shows p_+ + p_0 = p ≤ max_{q(S)≥0} dim S. If < held, there would exist a subspace S_2 such that q ≥ 0 on S_2 and dim S_2 > p = p_+ + p_0. Define P: S_2 → S_3 := {z : z_{p+1} = ··· = z_n = 0} by P(z) = (z_1, ..., z_p, 0, ..., 0). Since dim S_2 > p = dim S_3, there exists some z ∈ S_2 such that z ≠ 0 and P(z) = 0. This implies z_1 = ··· = z_p = 0. So q(z) = Σ_{i=1}^p d_i z_i² + Σ_{i=p+1}^n d_i z_i² = Σ_{i=p+1}^n d_i z_i² < 0, a contradiction. Therefore, our assumption is not true and < cannot hold.
I 4. (page 109) Show that the columns of M are the eigenvectors of H.
Proof. Write M in the column form M = [c_1, ..., c_n] and multiply M on both sides of formula (24); we get

\[
HM = [Hc_1, \dots, Hc_n] = MD = [c_1, \dots, c_n]\begin{pmatrix} \lambda_1 & 0 & \cdots & 0\\ 0 & \lambda_2 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} = [\lambda_1 c_1, \dots, \lambda_n c_n],
\]

i.e. Hc_i = λ_i c_i for each i: the columns of M are eigenvectors of H.
I 5. (page 118) (a) Show that the minimum problem (47) has a nonzero solution f.
(b) Show that a solution f of the minimum problem (47) satisfies the equation

Hf = bMf,

\[
\min_{(y, Mf) = 0} \frac{(y, Hy)}{(y, My)}
\]

Hg = cMg,

Applying the second proof of Theorem 4, with (·,·) replaced by ⟨·,·⟩ and H replaced by M⁻¹H, we can verify claims (a)-(d) are true.

Solution. This is just Theorem 11 with (·,·) replaced by ⟨·,·⟩ and H replaced by M⁻¹H, where ⟨·,·⟩ is defined in the solution of Exercise 5.
Solution. Skipped for this version.
I 10. (page 119) Prove Theorem 12. (Hint: Use Theorem 8.)
Proof. By Theorem 8, we can assume N has an orthonormal basis v_1, ..., v_n consisting of genuine eigenvectors. We assume the eigenvalue corresponding to v_j is n_j. Then by letting x = v_j, j = 1, ..., n, and by the definition of ||N||, we can conclude ||N|| ≥ max_j |n_j|. Meanwhile, ∀x ∈ X with ||x|| = 1, there exist a_1, ..., a_n ∈ C so that Σ_j |a_j|² = 1 and x = Σ_j a_j v_j. So

\[
\frac{\|Nx\|}{\|x\|} = \Big\|\sum_j a_j n_j v_j\Big\| = \Big(\sum_j |a_j n_j|^2\Big)^{1/2} \le \max_{1\le j\le n}|n_j| \Big(\sum_j |a_j|^2\Big)^{1/2} = \max_j |n_j|.
\]

This implies ||N|| ≤ max_j |n_j|. Combined, we can conclude ||N|| = max_j |n_j|.
Remark 10. Compare the above result with formula (48) and Theorem 18 of Chapter 7.
I 11. (page 119) We define the cyclic shift mapping S, acting on vectors in C^n, by S(a_1, a_2, ..., a_n) = (a_n, a_1, ..., a_{n−1}).
(a) Prove that S is an isometry in the Euclidean norm.
(b) Determine the eigenvalues and eigenvectors of S.
(c) Verify that the eigenvectors are orthogonal.

Proof. |S(a_1, ..., a_n)| = |(a_n, a_1, ..., a_{n−1})| = |(a_1, ..., a_n)|. So S is an isometry in the Euclidean norm. To determine the eigenvalues and eigenvectors of S, note under the canonical basis e_1, ..., e_n, S corresponds to the matrix

\[
A = \begin{pmatrix} 0 & 0 & \cdots & 0 & 1\\ 1 & 0 & \cdots & 0 & 0\\ 0 & 1 & \cdots & 0 & 0\\ \vdots & & \ddots & & \vdots\\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix},
\]

whose characteristic polynomial is p(s) = det(A − sI) = (−s)^n + (−1)^{n+1}. So the eigenvalues of S are the solutions of the equation s^n = 1, i.e. λ_k = e^{2πki/n}, k = 1, ..., n. Solving the equation Sx_k = λ_k x_k, we obtain the general solution x_k = (λ_k^{n−1}, λ_k^{n−2}, ..., λ_k, 1)^T. After normalization, x_k = (1/√n)(λ_k^{n−1}, λ_k^{n−2}, ..., λ_k, 1)^T. Therefore, for i ≠ j,

\[
(x_i, x_j) = \frac{1}{n}\sum_{k=1}^n \lambda_i^{k-1} \bar\lambda_j^{k-1} = \frac{1}{n}\sum_{k=1}^n (\lambda_i \bar\lambda_j)^{k-1} = \frac{1}{n}\cdot\frac{1 - (\lambda_i \bar\lambda_j)^n}{1 - \lambda_i \bar\lambda_j} = 0.
\]
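A numerical sketch (ours) that builds the cyclic shift on C^5 and checks the spectrum and the orthogonality of the eigenvectors:

import numpy as np

n = 5
S = np.roll(np.eye(n), 1, axis=0)        # (Sx)_0 = x_{n-1}, (Sx)_1 = x_0, ...
lam = np.sort_complex(np.linalg.eigvals(S))
roots = np.sort_complex(np.exp(2j * np.pi * np.arange(n) / n))
print(np.allclose(lam, roots))           # True: eigenvalues are nth roots of unity
_, vecs = np.linalg.eig(S)
G = vecs.conj().T @ vecs                 # Gram matrix of the unit eigenvectors
print(np.allclose(G, np.eye(n)))         # True: they are orthonormal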
I 13. (page 120) What is the norm of the matrix

\[
\begin{pmatrix} 1 & 0 & 1\\ 2 & 3 & 0 \end{pmatrix}?
\]
9 Calculus of Vector- and Matrix-Valued Functions

Erratum: In Exercise 6 (p.129), we should have det e^A = e^{tr A} instead of det e^A = e^{A}.

Comments: In the proof of Theorem 11, to see why (sI − A)^{d−1} ≠ 0 (∗) holds, see Lemma 3 of Appendix 6, formula (9) (p.369).
I 1. (page 122) Prove the fundamental lemma for vector valued functions. (Hint: Show that for every vector y, (x(t), y) is a constant.)

Proof. Following the hint, note d/dt (x(t), y) = (ẋ(t), y) = 0. So (x(t), y) is a constant by the fundamental lemma for scalar valued functions. Therefore (x(t) − x(0), y) = 0, ∀y ∈ K^n. This implies x(t) ≡ x(0).
I 2. (page 124) Derive formula (3) using product rule (iii).

Proof. A⁻¹(t)A(t) = I. So

0 = d/dt [A⁻¹(t)A(t)] = [d/dt A⁻¹(t)] A(t) + A⁻¹(t)Ȧ(t),

and therefore d/dt A⁻¹(t) = −A⁻¹(t)Ȧ(t)A⁻¹(t).
Therefore, we have

\[
\exp\{A + B\} = \sum_{n=0}^\infty \frac{1}{n!}\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}^n = \sum_{k=0}^\infty \frac{1}{(2k)!}\, I_{2\times2} + \sum_{k=0}^\infty \frac{1}{(2k+1)!}\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} = \frac{e + e^{-1}}{2}\, I_{2\times2} + \frac{e - e^{-1}}{2}\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}.
\]
Proof. For any ε > 0, there exists M > 0 so that ∀m ≥ M, sup_t ||E′_m(t) − F(t)|| < ε. So ∀m ≥ M and ∀t, h,

\[
\left\| \frac{1}{h}[E_m(t+h) - E_m(t)] - F(t) \right\| = \left\| \frac{1}{h}\int_t^{t+h} [E_m'(s) - F(s)]\,ds + \frac{1}{h}\int_t^{t+h} F(s)\,ds - F(t) \right\| \le \frac{\int_t^{t+h} \|E_m'(s) - F(s)\|\,ds}{h} + \left\| \frac{1}{h}\int_t^{t+h} F(s)\,ds - F(t) \right\| < \varepsilon + \left\| \frac{1}{h}\int_t^{t+h} F(s)\,ds - F(t) \right\|.
\]
I 5. (page 129) Carry out the details of the argument that Ė_m(t) converges.

Proof. By formula (12), Ė_m(t) = Σ_{k=1}^m (1/k!) Σ_{i=0}^{k−1} A^i(t)Ȧ(t)A^{k−i−1}(t). So for m and n with m < n,

\[
\|\dot E_m(t) - \dot E_n(t)\| \le \sum_{k=m+1}^n \frac{1}{k!} \sum_{i=0}^{k-1} \|A^i(t)\dot A(t)A^{k-i-1}(t)\| \le \sum_{k=m+1}^n \frac{1}{k!} \sum_{i=0}^{k-1} \|A(t)\|^{k-1}\|\dot A(t)\| = \sum_{k=m+1}^n \frac{\|A(t)\|^{k-1}}{(k-1)!}\, \|\dot A(t)\| = \|\dot A(t)\|\,[e_n(\|A(t)\|) - e_m(\|A(t)\|)] \to 0,
\]

where e_m denotes the mth partial sum of the exponential series.
I 6. (page 129) Apply formula (10) to Y(t) = e^{At} and show that

det e^A = e^{tr A}.
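A numerical spot check of the identity (our addition; uses scipy for the matrix exponential):

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
print(np.isclose(np.linalg.det(expm(A)), np.exp(np.trace(A))))  # True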
I 8. (page 142) (a) Show that the set of all complex, self-adjoint n × n matrices forms an N = n²-dimensional linear space over the reals.
(b) Show that the set of complex, self-adjoint n × n matrices that have one double and n − 2 simple eigenvalues can be described in terms of N − 3 real parameters.

(a)

Proof. The total number of free entries is n(n+1)/2. The entries on the diagonal must be real. So the dimension is (n(n+1)/2)·2 − n = n².

(b)

Proof. Similar to the argument in the text, the total number of complex parameters that determine the eigenvectors is (n − 1) + ··· + 2 = n(n−1)/2 − 1. This is equivalent to n(n − 1) − 2 real parameters. The number of distinct (real) eigenvalues is n − 1. So the dimension is n² − n − 2 + n − 1 = n² − 3.
I 9. (page 142) Choose in (41) at random two self-adjoint 10 × 10 matrices M and B. Using available software (MATLAB, MAPLE, etc.) calculate and graph at suitable intervals the 10 eigenvalues of B + tM as functions of t over some t-segment.

Solution. See the Matlab/Octave program aoc.m below.

function aoc
  % graph the 10 eigenvalues of B + t*M for random self-adjoint M, B
  A = randn(10) + 1i*randn(10); M = (A + A')/2;
  C = randn(10) + 1i*randn(10); B = (C + C')/2;
  ts = linspace(-2, 2, 201); ev = zeros(10, numel(ts));
  for k = 1:numel(ts)
    ev(:,k) = sort(real(eig(B + ts(k)*M)));  % self-adjoint: real spectrum
  end
  plot(ts, ev); xlabel('t'); ylabel('eig(B + tM)'); hold off;
end
10 Matrix Inequalities
The book's own solution gives answers to Ex 1, 3, 4, 5, 6, 7, 10, 11, 12, 13, 14, 15.
I 1. (page 146) How many square roots are there of a positive mapping?
I 2. (page 146) Formulate and prove properties of nonnegative mappings similar to parts (i), (ii), (iii), (iv), and (vi) of Theorem 1.

Proposition 10.1. (i) The identity I is nonnegative. (ii) If M and N are nonnegative, so is their sum M + N, as well as aM for any nonnegative number a. (iii) If H is nonnegative and Q is invertible, we have Q*HQ ≥ 0. (iv) H is nonnegative if and only if all its eigenvalues are nonnegative. (vi) Every nonnegative mapping has a nonnegative square root, uniquely determined.

Proof. (i) and (ii) are obvious. For part (iii), we write the quadratic form associated with Q*HQ as

(x, Q*HQx) = (Qx, HQx) = (y, Hy) ≥ 0,

where y = Qx. For part (iv), by the self-adjointness of H, there exists an orthogonal basis of eigenvectors. Denote these by h_j and the corresponding eigenvalues by a_j. Then any vector x can be expressed as a linear combination of the h_j's: x = Σ_j x_j h_j. So (x, Hx) = Σ_{i,j}(x_i h_i, x_j a_j h_j) = Σ_{j=1}^n a_j |x_j|². From the formula it is clear that (x, Hx) ≥ 0 for any x if and only if a_j ≥ 0, ∀j. For part (vi), the proof is similar to that for positive mappings and we omit the lengthy details. Cf. also the solution to Exercise 10.1.
I 3. (page 149) Construct two real, positive 2 2 matrices whose symmetrized product is not positive.
Solution. Let A be a mapping that maps the vector (0, 1) to (0, 2 ) with 2 > 0 suciently small and
(1, 0) to (1 , 0) with 1 > 0 suciently large. Let B be a mapping that maps the vector (1, 1) to (1 , 1 )
with 1 > 0 suciently small and (1, 1) to (2 , 2 ) with 2 > 0 suciently large. Then both A and B
are positive mappings, and we can nd x between (1, 1) and (0, 1) so that (Ax, Bx) < 0. By(the analysis
) in
0
the paragraph below formula (14) , AB + BA is not positive. More precisely, we have A = 1
and
0 2
( )
1 + 2 1 2
B = 21 .
1 2 1 + 2
I 4. (page 151) Show that if 0 < M < N, then (a) M^{1/4} < N^{1/4}. (b) M^{1/m} < N^{1/m}, m a power of 2. (c) log M ≤ log N.

Proof. By Theorem 5 and induction, it is easy to prove (a) and (b). For (c), we follow the hint. If M has the spectral resolution M = Σ_{i=1}^k λ_i P_i, log M is defined as

\[
\log M = \sum_{i=1}^k (\log \lambda_i) P_i = \lim_{m\to\infty} \sum_{i=1}^k m(\lambda_i^{1/m} - 1) P_i = \lim_{m\to\infty} m\Big(\sum_{i=1}^k \lambda_i^{1/m} P_i - \sum_{i=1}^k P_i\Big) = \lim_{m\to\infty} m(M^{1/m} - I).
\]

So log M = lim_{m→∞} m(M^{1/m} − I) ≤ lim_{m→∞} m(N^{1/m} − I) = log N.
I 5. (page 151) Construct a pair of mappings 0 < M < N such that M² is not less than N². (Hint: Use Exercise 3.)

Solution. (From the textbook's solution, page 291) Choose A and B as in Exercise 3, that is, positive matrices whose symmetrized product is not positive. Set

M = A, N = A + tB;

then

N² = A² + t(AB + BA) + t²B²;

for t small the term t²B² is negligible compared with the linear term. Therefore for t small, N² is not greater than M².
I 6. (page 151) Verify that (19) defines f(z) for a complex argument z as an analytic function, as well as that Im f(z) > 0 for Im z > 0.

Proof. For f(z) = az + b − ∫_0^∞ dm(t)/(z + t), we have

f(z + Δz) − f(z) = aΔz + Δz ∫_0^∞ dm(t)/((z + Δz + t)(z + t)).

So if we can show lim_{Δz→0} ∫_0^∞ dm(t)/((z + Δz + t)(z + t)) exists and is finite, f(z) is analytic by definition. Indeed, if Im z > 0, for Δz sufficiently small we have

1/|z + Δz + t| ≤ 1/(|z + t| − |Δz|) ≤ 1/(Im z − |Δz|) ≤ 2/Im z.

So by the Dominated Convergence Theorem, lim_{Δz→0} ∫_0^∞ dm(t)/((z + Δz + t)(z + t)) exists and is equal to ∫_0^∞ dm(t)/(z + t)². Moreover,

Im f(z) = a Im z + Im z ∫_0^∞ dm(t)/|z + t|²,

which is positive for Im z > 0.

Remark 11. This exercise can be used to verify formula (19) on p.151.
Proof. Consider the Euclidean space L²((−∞, 1]) with the inner product (f, g) := ∫_{−∞}^1 f(t)g(t) dt. Choose f_j = e^{r_j(t−1)}, j = 1, ..., m; then the associated Gram matrix is

\[
G_{ij} = (f_i, f_j) = \int_{-\infty}^1 e^{(r_i + r_j)(t-1)}\,dt = \frac{1}{r_i + r_j}.
\]

Clearly, (f_j)_{j=1}^m are linearly independent. So G is positive.
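A numerical illustration (ours; the exponents r_j below are arbitrary positive values):

import numpy as np

r = np.array([0.5, 1.0, 2.0, 3.0])       # hypothetical distinct positive exponents
G = 1.0 / (r[:, None] + r[None, :])      # G_ij = 1/(r_i + r_j), the Gram matrix
print(np.linalg.eigvalsh(G) > 0)         # all True: G is positive definite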
I 9. (page 162) Extend Theorem 14 to the case when dim V = dim U − m, where m is greater than 1.

Proof. The extension is straightforward: just replace the paragraph (on page 161) "If S is a subspace of V, then T = S and dim T = dim S. ... It follows that dim T ≥ dim S − 1, as asserted." with the following one: Let T = S ∩ V and T_1 = S ∩ V^⊥, where V^⊥ stands for the complement of V in U. Then dim T + dim T_1 ≥ dim S. Since dim T_1 ≤ dim V^⊥ = n − (n − m) = m, dim T ≥ dim S − m. The rest of the proof is the same as the proof of Theorem 14.
I 12. (page 168) Prove that if the self-adjoint part of Z is positive, then Z is invertible, and the self-adjoint part of Z⁻¹ is positive.

Proof. Assume Z is not invertible. Then there exists x ≠ 0 such that Zx = 0. In particular, this implies (x, Zx) = (x, Z*x) = 0. Summing these two, we get (x, (Z + Z*)x) = 0, contradictory to the assumption that the self-adjoint part of Z is positive. So Z is invertible. For any x ≠ 0, there exists y ≠ 0 so that x = Zy. So

(x, Z⁻¹x) + (Z⁻¹x, x) = (Zy, y) + (y, Zy) = (y, (Z + Z*)y) > 0,

which shows the self-adjoint part of Z⁻¹ is positive.
I 13. (page 170) Let A be any mapping of a Euclidean space into itself. Show that AA* and A*A have the same eigenvalues with the same multiplicity.

Proof. Exercise 14 has proved the claim for non-zero eigenvalues. Since the dimensions of the spaces of generalized eigenvectors of AA* and A*A are both equal to the dimension of the underlying Euclidean space, we conclude by the Spectral Theorem that their zero eigenvalues must have the same multiplicity.
I 14. (page 171) Let A be a mapping of a Euclidean space into another Euclidean space. Show that AA* and A*A have the same nonzero eigenvalues with the same multiplicity.
I 16. (page 171) Verify that the commutator (50) of two self-adjoint matrices is anti-self-adjoint.

Proof. Suppose A and B are self-adjoint. Then for any x and y,

(x, (AB − BA)*y) = ((AB − BA)x, y) = (ABx, y) − (BAx, y) = (x, BAy) − (x, ABy) = −(x, (AB − BA)y).
11 Kinematics and Dynamics

I 2. (page 174) Suppose that A is independent of t; show that the solution of equation (11) satisfying the initial condition (5) is

M(t) = e^{tA}.

Proof. lim_{h→0} (M(t+h) − M(t))/h = lim_{h→0} e^{tA}(e^{hA} − I)/h = Ae^{tA}, i.e. Ṁ(t) = AM(t). Clearly M(0) = I.
I 3. (page 174) Show that when A depends on t, equation (11) is not solved by

M(t) = e^{∫_0^t A(s)ds},

unless A(t) commutes with ∫_0^t A(s)ds.

Proof. The reason we need commutativity is that the following identity is required in the calculation of the derivative:

\[
\frac{1}{h}(M(t+h) - M(t)) = \frac{1}{h}\Big( e^{\int_0^{t+h} A(s)ds} - e^{\int_0^t A(s)ds} \Big) = \frac{1}{h}\Big( e^{\int_0^t A(s)ds + \int_t^{t+h} A(s)ds} - e^{\int_0^t A(s)ds} \Big) = \frac{1}{h}\, e^{\int_0^t A(s)ds}\Big( e^{\int_t^{t+h} A(s)ds} - I \Big),
\]

i.e. e^{∫_0^t A(s)ds + ∫_t^{t+h} A(s)ds} = e^{∫_0^t A(s)ds} e^{∫_t^{t+h} A(s)ds}. So when this commutativity holds,

Ṁ(t) = lim_{h→0} (1/h)(M(t+h) − M(t)) = M(t)A(t).
I 4. (page 175) Show that if A in (15) is not equal to 0, then all vectors annihilated by A are multiples of (16).

Proof. By discussing the various possibilities (a, b, c = 0 or not), we can check f is a multiple of (c, −b, a)^T.

I 5. (page 175) Show that the two other eigenvalues of A are ±i√(a² + b² + c²).

Proof.

\[
\det(\lambda I - A) = \det\begin{pmatrix} \lambda & -a & -b\\ a & \lambda & -c\\ b & c & \lambda \end{pmatrix} = \lambda(\lambda^2 + c^2) + a(a\lambda + bc) - b(ac - b\lambda) = \lambda^3 + (a^2 + b^2 + c^2)\lambda.
\]
Proof. Since A = \begin{pmatrix} 0 & a & b\\ -a & 0 & c\\ -b & -c & 0 \end{pmatrix} is anti-symmetric, M(t)M(t)^T = e^{tA}(e^{tA})^T = e^{tA}e^{tA^T} = e^{tA}e^{−tA} = I. By Exercise 7 of Chapter 9, all eigenvalues of e^{At} have the form e^{at}, where a is an eigenvalue of A. Since the eigenvalues of A are 0 and ±iκ with κ = √(a² + b² + c²) (Exercise 5), the eigenvalues of e^{At} are 1 and e^{±iκt}. This implies det e^{At} = 1·e^{iκt}·e^{−iκt} = 1. By Theorem 1, M = e^{At} is a rotation. Let f be given by formula (16). From Af = 0 we deduce that e^{At}f = f; thus f is the axis of the rotation e^{At}. The trace of e^{At} is 1 + e^{iκt} + e^{−iκt} = 2cos(κt) + 1. According to formula (4), the angle of rotation θ of e^{At} satisfies 2cos θ + 1 = tr e^{At}. This shows that θ = κt = t√(a² + b² + c²).
[A, B] = AB − BA
I 8. (page 177) Let A denote the 3 × 3 matrix (15); we denote the associated null vector (16) by f_A. Obviously, f_A depends linearly on A.
(a) Let A and B denote two 3 × 3 antisymmetric matrices. Show that

tr AB = −2(f_A, f_B).

(b) Show that

f_{[A,B]} = f_A × f_B.

Proof. It suffices to note that the set of 2n functions {(cos c_j t)h_j, (sin c_j t)h_j}_{j=1}^n is linearly independent, since any two of them are orthogonal when their subscripts are distinct.
12 Convexity
The book's own solution gives answers to Ex 2, 6, 7, 8, 10, 16, 19, 20.

Comments:
1) The following results will help us understand some details in the proofs of Theorem 6 and Theorem 10.

Proposition 12.1. Let S be an arbitrary subset of X and x an interior point of S. For any real linear function l defined on X with l ≢ 0, l(x) is an interior point of Γ = l(S) in the topological sense.

Proof. We can find y ∈ X so that l(y) ≠ 0. Then for t sufficiently small, l(x) + t l(y) = l(x + ty) ∈ Γ. So Γ contains an interval which contains l(x), i.e. l(x) is an interior point of Γ under the topology of R¹.

Corollary 1. If K is an open convex set and l is a linear function with l ≢ 0, then Γ = l(K) is an open interval.
Proposition 12.2. Let K be a convex set and K0 the set of all interior points of K. Then K0 is convex
and open.
Proof. (Convexity) Fix x, y ∈ K_0 and a ∈ [0, 1]. For any z ∈ X, [ax + (1−a)y] + tz = a(x + tz) + (1−a)(y + tz) ∈ K when t is sufficiently small, since x, y are interior points of K and K is convex.

(Openness) Fix x ∈ K_0 and y_1 ∈ X. We need to show that for t sufficiently small, x + ty_1 ∈ K_0. Indeed, for every y_2 ∈ X we can find a common δ > 0 so that whenever (t_1, t_2) ∈ [−δ, δ] × [−δ, δ], x + t_1y_1 ∈ K and x + t_2y_2 ∈ K. Fix any t′ ∈ [−δ/2, δ/2]; by the convexity of K, x + t′y_1 + t″y_2 = ½(x + 2t′y_1) + ½(x + 2t″y_2) ∈ K when t″ ∈ [−δ/2, δ/2]. This shows x + t′y_1 ∈ K_0. Since t′ is arbitrarily chosen from [−δ/2, δ/2], we conclude that for t sufficiently small, x + ty_1 ∈ K_0. That is, x is an interior point of K_0. By the arbitrariness of x, K_0 is open.
2) Regarding Theorem 10 (Carathéodory): (i) Among the three conditions, convexity is the essential one; closedness and boundedness are there to guarantee that K has extreme points. (ii) The solution to Exercise 14 may help us understand the proof of Theorem 10. When a convex set has no interior points, it's often useful to realize that the dimension can be reduced by 1. (iii) To understand "... then all points x on the open segment bounded by x_0 and x_1 are interior points of K", we note that if this is not true, then we can find y such that for all t > 0 or for all t < 0, x + ty ∉ K. Without loss of generality, assume x + ty ∉ K for all t > 0. For t sufficiently small, x_0 + ty ∈ K, so the segment [x_1, x_0 + ty] ⊂ K. But this segment necessarily intersects the ray {x + ty : t > 0}. A contradiction. (iv) We can summarize the idea of the proof as follows. Dimension one is clear, so we work by induction and have two scenarios. Scenario one, K has no interior points. Then the dimension is reduced by 1 and we are done. Scenario two, K has interior points. Then intuition shows any interior point lies on a segment with one endpoint an extreme point and the other a boundary point; a boundary point resides on a hyperplane, whose dimension is reduced by 1. By induction, we are done.
Proof. Fix a point x ∈ {z : l(z) < c}. For any y ∈ X, f(t) = l(x + ty) = l(x) + t·l(y) is a continuous function of t, with f(0) = l(x) < c. By continuity, f(t) < c for t sufficiently small. So x + ty ∈ {z : l(z) < c} for t sufficiently small, i.e. x is an interior point. Since x is arbitrarily chosen, we have proved {z : l(z) < c} is open.
I 4. (page 188) Show that if A is an open convex set and B is convex, then A + B is open and convex.
Proof. The convexity of A + B is Theorem 1(b). To see the openness, fix x ∈ A and y ∈ B. For any z ∈ X, (x + y) + tz = (x + tz) + y. For t sufficiently small, x + tz ∈ A, so (x + y) + tz ∈ A + B for t sufficiently small. This shows A + B is open.
I 5. (page 188) Let X be a Euclidean space, and let K be the open ball of radius a centered at the origin:
||x|| < a.
(i) Show that K is a convex set.
(ii) Show that the gauge function of K is p(x) = ||x||/a.
Proof. That K is a convex set is trivial to see. It's also clear that p(0) = 0. For any x ∈ Rⁿ \ {0} and ε ∈ (0, a), r = ||x||/(a − ε) satisfies r > 0 and x/r ∈ K. So p(x) ≤ ||x||/(a − ε). By letting ε → 0, we conclude p(x) ≤ ||x||/a. If "<" held, we could find r > 0 such that r < ||x||/a and x/r ∈ K. But r < ||x||/a implies ||x||/r > a and hence x/r ∉ K. Contradiction. Combined, we conclude p(x) = ||x||/a.
I 6. (page 188) In the (u, v) plane take K to be the quarter-plane u < 1, v < 1. Show that the gauge
function of K is
p(u, v) =
  0           if u ≤ 0, v ≤ 0,
  v           if 0 < v, u ≤ 0,
  u           if 0 < u, v ≤ 0,
  max(u, v)   if 0 < u, 0 < v.
q_S(m + l) = sup_{x∈S} (l + m)(x) < (l + m)(x(ε)) + ε ≤ sup_{x∈S} l(x) + sup_{x∈S} m(x) + ε = q_S(m) + q_S(l) + ε.
Proof. q_{S+T}(l) = sup_{x∈S, y∈T} l(x + y) = sup_{x∈S, y∈T} [l(x) + l(y)] ≤ q_S(l) + q_T(l). Conversely, for every ε > 0 there exist x_0 ∈ S and y_0 ∈ T s.t. q_S(l) < l(x_0) + ε/2 and q_T(l) < l(y_0) + ε/2. So q_S(l) + q_T(l) < l(x_0 + y_0) + ε ≤ q_{S+T}(l) + ε. By the arbitrariness of ε, q_S(l) + q_T(l) ≤ q_{S+T}(l). Combined, we get

q_{S+T}(l) = q_S(l) + q_T(l).
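For finite point sets, where the sup is a max, the additivity is easy to test numerically; a small sketch (random sets of our choosing):

    import numpy as np

    rng = np.random.default_rng(1)
    S = rng.standard_normal((5, 2))     # the points of S (one per row)
    T = rng.standard_normal((7, 2))     # the points of T
    l = rng.standard_normal(2)          # a linear functional l(x) = l . x

    q = lambda P: np.max(P @ l)         # support function q_P(l) for finite P
    ST = (S[:, None, :] + T[None, :, :]).reshape(-1, 2)   # the sum set S + T
    print(np.isclose(q(ST), q(S) + q(T)))                 # True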
I 10. (page 193) Show that q_{S∪T}(l) = max{q_S(l), q_T(l)}.

Proof. q_{S∪T}(l) = sup_{x∈S∪T} l(x) ≥ sup_{x∈S} l(x) = q_S(l). Similarly, q_{S∪T}(l) ≥ q_T(l). Therefore q_{S∪T}(l) ≥ max{q_S(l), q_T(l)}. For any ε > 0 sufficiently small, we can find x_ε ∈ S∪T such that q_{S∪T}(l) ≤ l(x_ε) + ε. But l(x_ε) ≤ max{q_S(l), q_T(l)}. So q_{S∪T}(l) ≤ max{q_S(l), q_T(l)} + ε. Letting ε → 0, we get q_{S∪T}(l) ≤ max{q_S(l), q_T(l)}. Combined, we conclude q_{S∪T}(l) = max{q_S(l), q_T(l)}.
I 11. (page 194) Show that a closed half-space as defined by (4) is a closed convex set.
Proof. Convexity: if l(x) ≤ c and l(y) ≤ c, then l(ax + (1−a)y) = a·l(x) + (1−a)·l(y) ≤ c for any a ∈ [0, 1]. Closedness in the linear sense: if l(ax + (1−a)y) ≤ c for all a ∈ (0, 1), then by continuity (letting a → 1 and a → 0) we have l(x) ≤ c and l(y) ≤ c. This shows {x : l(x) ≤ c} is a closed convex set.
I 12. (page 194) Show that the closed unit ball in Euclidean space, consisting of all points ||x|| ≤ 1, is a closed convex set.
Proof. Convexity is obvious. For closedness, note f(t) = ||tx + (1−t)y|| is a continuous function of t. So if f(t) ≤ 1 for all t ∈ (0, 1), then f(0) = ||y|| ≤ 1 and f(1) = ||x|| ≤ 1. So the unit ball B(0, 1) is closed. Combined, we conclude B(0, 1) = {x : ||x|| ≤ 1} is a closed convex set.
I 13. (page 194) Show that the intersection of closed convex sets is a closed convex set.
Proof. Suppose H and K are closed convex sets. Theorem 1(a) says H ∩ K is also convex. Moreover, if for all a ∈ (0, 1), ax + (1−a)y ∈ H ∩ K, then the closedness of H and K implies x, y ∈ H and x, y ∈ K, i.e. x, y ∈ H ∩ K. So H ∩ K is closed.
I 14. (page 194) Complete the proof of Theorems 7 and 8.
Proof. Proof of Theorem 7: Suppose K has an interior point x_0. If a linear function l and a real number c determine a closed half-space that contains K − x_0 but not y − x_0, i.e. l(x − x_0) ≤ c for all x ∈ K and l(y − x_0) > c, then l and c + l(x_0) determine a closed half-space that contains K but not y, i.e. l(x) ≤ c + l(x_0) and l(y) > c + l(x_0). So without loss of generality, we can assume x_0 = 0. Note that convexity and closedness are preserved under translation, so this simplification is all right for this problem's purpose.

Define the gauge function p_K as in (5). Then we can show p_K(x) ≤ 1 if and only if x ∈ K. Indeed, if x ∈ K, then p_K(x) ≤ 1 by definition. Conversely, if p_K(x) ≤ 1, then for any ε > 0 there exists r(ε) < 1 + ε so that x/r(ε) ∈ K. We choose r(ε) > 1 and note x/r(ε) = a(ε)·0 + (1 − a(ε))x with a(ε) = 1 − 1/r(ε). As r(ε) can be as close to 1 as we want when ε → 0, a(ε) can be as close to 0 as we want. Meanwhile, 0 is an interior point of K, so for r large enough, x/r ∈ K; this shows that for a close enough to 1, a·0 + (1−a)x ∈ K. Combined, we conclude K contains the open segment {a·0 + (1−a)x : 0 < a < 1}. By the definition of closedness, x ∈ K. The rest of the proof is completely analogous to that of Theorem 3, with p(x) < 1 replaced by p(x) ≤ 1.

If K has no interior point, we have two possibilities. Case one, y and K are not on the same hyperplane. In this case, there exist a linear function l and a real number c such that l(x) = c (x ∈ K) but l(y) ≠ c. By considering −l if necessary, we can have l(x) = c (x ∈ K) and l(y) > c. So the half-space {x : l(x) ≤ c} contains K, but not y. Case two, y and K reside on the same hyperplane. Then the dimension of the ambient space for y and K can be reduced by 1. Working by induction and noting that the space is of finite dimension, we can finish the proof.
Proof of Theorem 8: By the definition (16), if x ∈ K, then l(x) ≤ q_K(l) for all l ∈ X′. Conversely, suppose y is not in K; then there exist l ∈ X′ and a real number c such that l(x) ≤ c for all x ∈ K and l(y) > c. This implies l(y) > q_K(l). Combined, we conclude x ∈ K if and only if l(x) ≤ q_K(l) for all l ∈ X′.
Remark: From the above solution and the proof of Theorem 3, we can see a useful routine for proving results on convex sets: first assume the convex set has an interior point and use the gauge function, which often helps to construct the desired linear functionals via the Hahn–Banach Theorem. If there exists no interior point, reduce the dimension by 1 and work by induction. Such a use of interior points as the criterion for a dichotomy is also present in the proof of Theorem 10 (Carathéodory).
I 15. (page 195) Prove Theorem 9.
Proof. Denote by Ŝ the closed convex hull of S, and define Σ_l = {x : l(x) ≤ q_S(l)} for l ∈ X′. Then it is easy to see each Σ_l is a closed convex set containing S, so Ŝ ⊂ ∩_{l∈X′} Σ_l. For the other direction, suppose ∩_{l∈X′} Σ_l \ Ŝ ≠ ∅ and choose a point x from ∩_{l∈X′} Σ_l \ Ŝ. By Theorem 8, there exists l_0 ∈ X′ such that l_0(x) > q_Ŝ(l_0) ≥ q_S(l_0). So x ∉ Σ_{l_0}, a contradiction. Combined, we conclude Ŝ = ∩_{l∈X′} Σ_l = {x : l(x) ≤ q_S(l) for all l ∈ X′}.
I 16. (page 195) Show that if x1 , , xm belong to a convex set, then so does any convex combination of
them.
Proof. Suppose α_1, …, α_m ∈ (0, 1) satisfy ∑_{i=1}^m α_i = 1. We need to show ∑_{i=1}^m α_i x_i ∈ K, where K is the convex set to which x_1, …, x_m belong. Indeed, since

∑_{i=1}^m α_i x_i = (α_1 + ⋯ + α_{m−1}) ∑_{i=1}^{m−1} [α_i/(α_1 + ⋯ + α_{m−1})] x_i + α_m x_m,

it suffices to show ∑_{i=1}^{m−1} [α_i/(α_1 + ⋯ + α_{m−1})] x_i ∈ K. Working by induction, we are done.
I 17. (page 195) Show that an interior point of K cannot be an extreme point.
I 18. (page 197) Verify that every permutation matrix is a doubly stochastic matrix.
Proof. Let S be a permutation matrix as defined in formula (25). Then clearly S_ij ≥ 0. Furthermore, ∑_{i=1}^n S_ij = ∑_{i=1}^n δ_{p(i)j}, where j is fixed and equals p(i_0) for exactly one i_0; so ∑_{i=1}^n S_ij = 1. Finally, ∑_{j=1}^n S_ij = ∑_{j=1}^n δ_{i p^{−1}(j)}, where i is fixed and equals p^{−1}(j_0) for exactly one j_0; so ∑_{j=1}^n S_ij = 1. Combined, we conclude S is a doubly stochastic matrix.
I 19. (page 199) Show that, except for two dimensions, the representation of doubly stochastic matrices as
convex combinations of permutation matrices is not unique.
Proof. The textbook's solution demonstrates the case of dimension 3. Counterexamples for higher dimensions can be obtained by building permutation matrices upon the case of dimension 3.
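For instance, in dimension 3 the doubly stochastic matrix with all entries 1/3 admits two different representations, one by the cyclic permutations and one by the transpositions. A numerical check of this standard example (our choice of counterexample, not necessarily the textbook's):

    import numpy as np

    I = np.eye(3)
    C = np.roll(I, 1, axis=0)           # a 3-cycle; C @ C is the other one
    T12, T13, T23 = I[[1, 0, 2]], I[[2, 1, 0]], I[[0, 2, 1]]  # transpositions

    D = np.full((3, 3), 1 / 3)          # doubly stochastic
    print(np.allclose(D, (I + C + C @ C) / 3))     # True: first representation
    print(np.allclose(D, (T12 + T13 + T23) / 3))   # True: a different one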
I 20. (page 201) Show that if a convex set in a finite-dimensional Euclidean space is open, or closed, or bounded in the linear sense defined above, then it is open, or closed, or bounded in the topological sense, and conversely.
Proof. Suppose K is a convex subset of an n-dimensional linear space X. We have the following properties.

(1) If x is an interior point of K in the linear sense, then x is an interior point of K in the topological sense. Consequently, being open in the linear sense is the same as being open in the topological sense.

Indeed, let e_1, …, e_n be a basis of X. There exists ε > 0 so that for any t_i ∈ (−ε, ε), x + t_i e_i ∈ K, i = 1, …, n. For any y ∈ X close enough to x, the norm of y − x is so small that if we write y = x + ∑_{i=1}^n a_i e_i, then |a_i| < ε/n. Since for t_i ∈ (−ε/n, ε/n) (i = 1, …, n) we have x + ∑_{i=1}^n t_i e_i = ∑_{i=1}^n (1/n)(x + n t_i e_i) ∈ K by the convexity of K, we conclude y ∈ K if y is sufficiently close to x. This shows x is an interior point of K in the topological sense.
(2) If K is closed in the linear sense, it is closed in the topological sense.

Indeed, suppose (x_k)_{k=1}^∞ ⊂ K and x_k → x in the topological sense; we need to show x ∈ K. We work by induction on the dimension. The case n = 1 is trivial, because x is necessarily an endpoint of a segment contained in K. Assume the property is true for any n ≤ N. For n = N + 1, we have two cases to consider. Case one, K has no interior points. Then, as argued in the proof of Theorem 10, K is contained in a subspace of X of dimension less than n. By induction, K is closed in the topological sense and hence x ∈ K. Case two, K has at least one interior point x_0. In this case, all the points on the open segment (x_0, x) must be in K. Indeed, assume not; then there exists x′ ∈ (x_0, x) such that the open segment (x_0, x′) ⊂ K but (x′, x] ∩ K = ∅. Since x_0 is an interior point of K, we can find n linearly independent vectors e_1, …, e_n so that x_0 + e_i ∈ K, i = 1, …, n. For any x_k sufficiently close to x, the cone with x_k as the vertex and x_0 + e_1, …, x_0 + e_n as the base necessarily intersects (x′, x]. So such an x′ ∈ (x_0, x) with (x′, x] ∩ K = ∅ does not exist. Therefore (x_0, x) ⊂ K, and by the definition of being closed in the linear sense, we conclude x ∈ K.
(3) If K is bounded in the linear sense, it is bounded in the topological sense.

Indeed, assume K is not bounded in the topological sense; then we can find a sequence (x_k)_{k=1}^∞ ⊂ K such that ||x_k|| → ∞. We shall show K is not bounded in the linear sense. If the dimension n = 1, this is clearly true. Assume the claim is true for any n ≤ N. For n = N + 1, we have two cases to consider. Case one, K has no interior points. Then, as argued in the proof of Theorem 10, K is contained in a subspace of X of dimension less than n. By induction, K is not bounded in the linear sense. Case two, K has at least one interior point x_0. Denote by y_k the intersection of the segment [x_0, x_k] with the sphere S(x_0, 1) = {z : ||z − x_0|| = 1}. For k large enough, y_k always exists. Since a sphere in a finite-dimensional space is compact, we can assume without loss of generality that y_k → y ∈ S(x_0, 1). Then, by an argument similar to that of part (2) (the argument based on the cone), the ray starting at x_0 and going through y is contained in K. So K is not bounded in the linear sense.
Comments: For a linear equation Ax = y to have a solution, it is necessary and sufficient that y ∈ R(A) = N(A′)^⊥. This observation of duality helps us determine the existence of solutions. In optimization theory, if the collection of points satisfying certain constraints is a convex set, we use the hyperplane separation theorem to find and state necessary and sufficient conditions for the existence of solutions.
ty + (1−t)y′ = ∑_{j=1}^m [t p_j + (1−t) p′_j] y_j ∈ Y.

So Y is a convex set.
I 2. (page 205) Show that if x ≥ z and γ ≥ 0, then γx ≥ γz.

Proof. γx − γz = γ(x − z) ≥ 0.
I 3. (page 208) Show that the sup and inf in Theorem 3 are in fact a maximum and a minimum. [Hint: The sign of equality holds in (21).]
Proof. In the proof of Theorem 3, we already showed that there is an admissible p* for which p*·ξ ≥ S (formula (21)). Since p·ξ ≤ S for every admissible p by formulas (16) and (20), the sup in Theorem 3 is attained at p*, hence a maximum. To see that the inf in Theorem 3 is a minimum, note that under the condition that there are admissible p and η, Theorem 3 can be written as

sup{ξ·p : y ≥ Y p, p ≥ 0} = inf{y·η : Y′η ≥ ξ, η ≥ 0} = S.

This is equivalent to the statement that the common value S is attained on both sides.
Comments: The geometric intuition of Theorem 9 is clear if we identify X with X′ and assume X is an inner product space.
I 1. (page 214) (a) Show that the open and closed unit balls are convex.
(b) Show that the open and closed unit balls are symmetric with respect to the origin, that is, if x belongs to the unit ball, so does −x.
Proof. Trivial; the proof is omitted.
I 2. (page 215) Prove the triangle inequality, that is, for all x, y, z in X,
|x − z| ≤ |x − y| + |y − z|.
I 4. (page 216) Prove or look up a proof of Hölder's inequality.
Proof. f(x) = ln x is a strictly concave function on (0, ∞). So for any a, b > 0 and λ ∈ [0, 1],

f(λa + (1−λ)b) ≥ λf(a) + (1−λ)f(b),

where "=" holds if and only if λa + (1−λ)b = a or b; that is, one of the following three cases occurs: 1) λ = 0; 2) λ = 1; 3) a = b.

We note the inequality f(λa + (1−λ)b) ≥ λf(a) + (1−λ)f(b) is equivalent to a^λ b^{1−λ} ≤ λa + (1−λ)b. By letting λ = 1/p, a_i = |x_i|^p/|x|_p^p and b_i = |y_i|^q/|y|_q^q, we get

|x_i y_i| / (|x|_p |y|_q) = a_i^{1/p} b_i^{1/q} ≤ a_i/p + b_i/q,

and summing over i gives ∑_i |x_i y_i| ≤ |x|_p |y|_q (1/p + 1/q) = |x|_p |y|_q. Here "=" holds if and only if a_i = b_i for each i, i.e. |x_i|^p/|x|_p^p = |y_i|^q/|y|_q^q; that is, (|x_1|^p, …, |x_n|^p) and (|y_1|^q, …, |y_n|^q) are proportional. For ∑_i x_i y_i = ∑_i |x_i y_i| to hold, we need x_i y_i = |x_i y_i| for each i. This is the same as sign(x_i) = sign(y_i) for each i. In summary, we conclude x·y ≤ |x|_p |y|_q, and "=" holds if and only if (|x_1|^p, …, |x_n|^p) and (|y_1|^q, …, |y_n|^q) are proportional to each other and sign(x_i) = sign(y_i) (i = 1, …, n).
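A random numerical check of the inequality and of the stated equality case (a sketch; y is rebuilt from x so that |y_i|^q is proportional to |x_i|^p with matching signs):

    import numpy as np

    rng = np.random.default_rng(2)
    p = 3.0
    q = p / (p - 1)                        # conjugate exponent, 1/p + 1/q = 1
    x, y = rng.standard_normal(6), rng.standard_normal(6)

    print(np.sum(x * y) <= np.linalg.norm(x, p) * np.linalg.norm(y, q))  # True

    # equality case: |y_i|^q proportional to |x_i|^p, signs matching
    y_eq = np.sign(x) * np.abs(x) ** (p - 1)
    print(np.isclose(np.sum(x * y_eq),
                     np.linalg.norm(x, p) * np.linalg.norm(y_eq, q)))    # True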
I 5. (page 216) Prove that

|x|_∞ = lim_{p→∞} |x|_p.
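The convergence is quick numerically; a small sketch:

    import numpy as np

    x = np.array([3.0, -4.0, 1.0])
    for p in (1, 2, 8, 32, 128):
        print(p, np.linalg.norm(x, p))    # decreases toward max_i |x_i| = 4
    print(np.linalg.norm(x, np.inf))      # 4.0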
I 6. (page 219) Prove that every subspace of a finite-dimensional normed linear space is closed.

Proof. Every linear subspace of a finite-dimensional normed linear space is again a finite-dimensional normed linear space. So the problem is reduced to proving that any finite-dimensional normed space is complete. Fix a basis e_1, …, e_n and introduce the following norm: if x = ∑_j a_j e_j, ||x|| := (∑_j a_j²)^{1/2}. Then the original norm |·| is equivalent to ||·||. So (x_k)_{k=1}^∞ is a Cauchy sequence under |·| if and only if {(a_{k1}, …, a_{kn})}_{k=1}^∞ is a Cauchy sequence in Cⁿ or Rⁿ, where x_k = ∑_{j=1}^n a_{kj} e_j. Since Cⁿ and Rⁿ are complete, we conclude there exists x = ∑_{j=1}^n a_j e_j with a_{kj} → a_j, so that x_k → x and the subspace is closed.
I 8. (page 223) Show that |l| defined by (23) satisfies all postulates for a norm listed in (1).

Proof. (i) Positivity: |l| = 0 implies l(x) = 0 for all x with |x| = 1. So for any y ≠ 0, l(y) = |y|·l(y/|y|) = 0, i.e. l ≡ 0. So |l| = 0 implies l = 0, which is equivalent to: l ≠ 0 implies |l| > 0. That |0| = 0 is obvious.

(ii) Subadditivity: |l_1 + l_2| = sup_{|x|=1} (l_1 + l_2)(x) ≤ sup_{|x|=1} l_1(x) + sup_{|x|=1} l_2(x) = |l_1| + |l_2|.

(iii) Homogeneity: |kl| = sup_{|x|=1} |(kl)(x)| = |k| sup_{|x|=1} |l(x)| = |k| |l|.
I 9. (page 228) (i) Show that for all rational r, (rx, y) = r(x, y).

Proof. Write r = q/p with integers p, q, p ≠ 0. Therefore

p(rx, y) = (prx, y) = (qx, y) = q(x, y), i.e. (rx, y) = (q/p)(x, y) = r(x, y).
(ii)
Proof. For any given k, we can find a sequence of rational numbers (r_n)_{n=1}^∞ such that r_n → k as n → ∞. Then k(x, y) = lim_n r_n (x, y) = lim_n (r_n x, y) = (lim_n r_n x, y) = (kx, y), where the third "=" uses the fact that (·, y) defines a continuous linear functional on X.
Erratum: In Exercise 7 (p. 236), it should be "defined by formulas (3) and (5) in Chapter 14" instead of "defined by formulas (3) and (4) in Chapter 14".
I 1. (page 230) Show that every linear map T : X → Y is continuous, that is, if lim x_n = x, then lim T x_n = T x.
I 2. (page 235) Show that if for every x in X, |T_n x − T x| tends to zero as n → ∞, then |T_n − T| tends to zero.
Proof. Suppose |T_n − T| does not tend to zero. Then there exist ε > 0 and a sequence (x_n)_{n=1}^∞ such that |x_n| = 1 and |(T_n − T)x_n| ≥ ε. By Theorem 3(ii), we can without loss of generality assume (x_n)_{n=1}^∞ converges to some point x*. Then

|(T_n − T)x_n| ≤ |(T_n − T)x*| + |T||x_n − x*| + |T_n||x_n − x*|.

For n sufficiently large, |(T_n − T)x_n| − |(T_n − T)x*| will be greater than ε/2, while |T||x_n − x*| + |T_n||x_n − x*| will be as small as we want, provided that we can prove (|T_n|)_{n=1}^∞ is bounded. Indeed, this is the principle of uniform boundedness (see, for example, Lax [7], Chapter 10, Theorem 3). Thus we have arrived at a contradiction, which shows our assumption is wrong.
Remark 13. Can we find an elementary proof that avoids the principle of uniform boundedness from functional analysis, especially since we are working with finite-dimensional spaces?
I 3. (page 235) Show that T_n = ∑_{k=0}^n R^k converges to S^{−1} in the sense of definition (16).

Proof. First of all, ∑_{k=0}^∞ R^k is well-defined, since by |R| < 1, ( ∑_{k=0}^K R^k )_{K=0}^∞ is a Cauchy sequence. Then we note S ∑_{k=0}^∞ R^k = (I − R) ∑_{k=0}^∞ R^k = ∑_{k=0}^∞ R^k − ∑_{k=1}^∞ R^k = I, and similarly ( ∑_{k=0}^∞ R^k ) S = ∑_{k=0}^∞ R^k − ∑_{k=1}^∞ R^k = I. So S is invertible and S^{−1} = ∑_{k=0}^∞ R^k.
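A numerical illustration of the Neumann series with S = I − R (a sketch; the scaling of R is arbitrary):

    import numpy as np

    rng = np.random.default_rng(3)
    R = rng.standard_normal((4, 4))
    R *= 0.5 / np.linalg.norm(R, 2)    # rescale so that |R| < 1
    S = np.eye(4) - R

    T, term = np.zeros((4, 4)), np.eye(4)
    for _ in range(200):               # T_n = sum_{k=0}^{n} R^k
        T += term
        term = term @ R
    print(np.allclose(T, np.linalg.inv(S)))   # True: the series sums to S^{-1}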
I 4. (page 235) Deduce Theorem 5 from Theorem 6 by factoring S = T + (S − T) as T[I − T^{−1}(T − S)].

Proof. Assume all the conditions in Theorem 5. Define R = T^{−1}(T − S); then |R| ≤ |T^{−1}||S − T| < 1. So by Theorem 6, I − R = T^{−1}S is invertible, hence S = T·(T^{−1}S) is invertible.
I 5. (page 235) Show that Theorem 6 remains true if the hypothesis (17) is replaced by the following
hypothesis. For some positive integer m,
|R^m| < 1.

Proof. If |R^m| < 1 for some m, define U = ∑_{k=0}^∞ R^{km}; U is well-defined and satisfies U = I + R^m U. The following linear map is then also well-defined: V = U + RU + ⋯ + R^{m−1}U. Then SV = (U + RU + ⋯ + R^{m−1}U) − (RU + R²U + ⋯ + R^m U) = U − R^m U = I. This shows S is invertible.
I 6. (page 235) Take X = Y = Rⁿ, and T : X → X the matrix (t_ij). Take for the norm |x| the maximum norm |x|_∞ defined by formula (3) of Chapter 14. Show that the norm |T| of the matrix (t_ij), regarded as a mapping of X into X, is

|T| = max_i ∑_j |t_ij|.

Proof. For any x ∈ Rⁿ, |Tx|_∞ = max_i |∑_{j=1}^n t_ij x_j| ≤ max_i (∑_{j=1}^n |t_ij|)·|x|_∞. So |T| = sup_{x≠0} |Tx|_∞/|x|_∞ ≤ max_i ∑_j |t_ij|. For the other direction, suppose ∑_j |t_{i_0 j}| = max_i ∑_j |t_ij|, and choose x with x_j = sign(t_{i_0 j}); then |x|_∞ = 1 and (Tx)_{i_0} = ∑_j |t_{i_0 j}|. So |T| = sup_{|x|_∞=1} |Tx|_∞ ≥ max_i ∑_j |t_ij|. Combined, |T| = max_i ∑_j |t_ij|.
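This is the familiar maximum-row-sum (∞-operator) norm; a brute-force numerical confirmation over sign vectors (sketch):

    import numpy as np
    from itertools import product

    rng = np.random.default_rng(4)
    T = rng.standard_normal((5, 5))
    row_sum = np.abs(T).sum(axis=1).max()       # max_i sum_j |t_ij|

    # the sup over |x|_inf = 1 is attained at a vector of signs
    best = max(np.abs(T @ np.array(s)).max() for s in product((-1, 1), repeat=5))
    print(np.isclose(best, row_sum))                        # True
    print(np.isclose(np.linalg.norm(T, np.inf), row_sum))   # True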
I 8. (page 236) X is any finite-dimensional normed linear space over C, and T is a linear mapping of X into X. Denote by t_j the eigenvalues of T, and denote by r(T) its spectral radius: r(T) = max_j |t_j|.
Proof. The proof is very similar to the content on page 97, Chapter 7, the material up to Theorem 18. So
we omit the proof.
16 Positive Matrices
The book's own solution gives answers to Ex 1, 2.
Comments: To see the property of complex numbers mentioned on p. 240, we note that if z_1, z_2 ∈ C \ {0}, then |z_1 + z_2| = |z_1| + |z_2| if and only if arg z_1 = arg z_2. For n ≥ 3, if |∑_{i=1}^n z_i| = ∑_{i=1}^n |z_i|, then

∑_{i=1}^n |z_i| = |∑_{i=1}^n z_i| ≤ |z_1 + z_2| + ∑_{i=3}^n |z_i| ≤ |z_1| + |z_2| + ∑_{i=3}^n |z_i| = ∑_{i=1}^n |z_i|.

So |z_1 + z_2| = |z_1| + |z_2| and hence arg z_1 = arg z_2. Then we work by induction.
λ(P) = min_{λ ∈ t(P)} λ,  where t(P) = {λ ≥ 0 : Px ≤ λx for some x ≥ 0, x ≠ 0}.
Proof. Let x = (1, …, 1)^T and μ = max_{1≤i≤n} ∑_{j=1}^n p_ij; then Px ≤ μx. So μ ∈ t(P), and t′(P) = {λ ∈ t(P) : 0 ≤ λ ≤ μ} is a bounded, nonempty set. We show further that t′(P) is closed. Suppose (λ_m)_{m=1}^∞ ⊂ t′(P) converges to a point λ. Denote by x^m a nonnegative, nonzero vector such that Px^m ≤ λ_m x^m. Without loss of generality, we assume ∑_{i=1}^n x_i^m = 1. Then (x^m)_{m=1}^∞ is bounded and we can assume x^m → x for some x ≥ 0 with ∑_{i=1}^n x_i = 1. Passing to the limit gives Px ≤ λx. Clearly 0 ≤ λ ≤ μ, so λ ∈ t′(P). This shows t′(P) is compact and t(P) has a minimum κ.

Denote by x a nonzero, nonnegative vector such that Px ≤ κx. We show that we actually have Px = κx. Assume not; then there must exist some k ∈ {1, …, n} such that ∑_{j=1}^n p_ij x_j ≤ κx_i for i ≠ k and ∑_{j=1}^n p_kj x_j < κx_k. Consider the vector x̂ = x − εe_k, where ε > 0 and e_k has the k-th component equal to 1 and all other components zero. Then in the inequality Px ≤ κx, each component of the left side is decreased when x is replaced by x̂, while only the k-th component of the right side is decreased, by the amount κε. So for ε small enough, Px̂ < κx̂, and we can find a κ̂ < κ such that Px̂ ≤ κ̂x̂. Note κ > 0 (otherwise Px ≤ 0, which forces x = 0, a contradiction), so we can also take κ̂ > 0. This contradicts κ = min t(P). We have shown that κ > 0 is an eigenvalue of P which has a nonzero, nonnegative eigenvector. By Theorem 1(iv), κ = λ(P).
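Numerically, the characterization says that for positive x the smallest admissible λ is max_i (Px)_i/x_i, that this quantity is minimized at the Perron vector, and that the minimum is λ(P); a sketch:

    import numpy as np

    rng = np.random.default_rng(5)
    P = rng.uniform(0.1, 1.0, (4, 4))              # a positive matrix
    w, V = np.linalg.eig(P)
    k = np.argmax(w.real)
    lam = w[k].real                                # dominant eigenvalue lambda(P)
    perron = np.abs(V[:, k].real)                  # positive eigenvector

    t = lambda x: (P @ x / x).max()                # least t with Px <= t x, x > 0
    print(np.isclose(t(perron), lam))              # True: the minimum is attained
    print(all(t(rng.uniform(0.1, 1.0, 4)) >= lam - 1e-9
              for _ in range(1000)))               # True: lambda(P) is a lower bound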
I 2. (page 243) Show that if some power P^m of P is positive, then P has a dominant positive eigenvalue.
Proof. P^m has a dominant positive eigenvalue λ_0. By the Spectral Mapping Theorem, there is an eigenvalue λ of P such that λ^m = λ_0. Suppose λ is real; then we can further assume λ > 0 by replacing λ with −λ if necessary. Then for any other eigenvalue μ of P, the Spectral Mapping Theorem implies μ^m is an eigenvalue of P^m. So |μ^m| < λ_0 = λ^m, i.e. |μ| < λ.

To show we can take λ to be real, denote by x the eigenvector of P^m associated with λ_0. Then

P^m x = λ_0 x.
17 How to Solve Systems of Linear Equations

The three-term recursion method of this chapter is introduced by Rutishauser et al. [1]. See Papadrakakis [11] for a survey on a family of vector iterative methods with three-term recursion formulae, and Golub and van Loan [2] for a gentle introduction to the Chebyshev semi-iterative method (Section 10.1.5).
I 4. (page 261) Use the computer program to solve a system of equations of your choice.
Solution. We solve the following problem from the first edition of this book: Use the computer program in Exercise 3 to solve the system of equations
Ax = f,  A_ij = c + 1/(i + j + 1),  f_i = 1/i!,

c some nonnegative constant. Vary c between 0 and 1, and the order K of the system between 5 and 20.
TO BE CONTINUED ...
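In the meantime, a sketch of what such an experiment could look like (our code; a direct solve plus a condition-number report, rather than the iterative program of Exercise 3):

    import numpy as np
    from math import factorial

    def experiment(K, c):
        i = np.arange(1, K + 1)
        A = c + 1.0 / (i[:, None] + i[None, :] + 1)     # A_ij = c + 1/(i+j+1)
        f = np.array([1.0 / factorial(k) for k in i])   # f_i = 1/i!
        x = np.linalg.solve(A, f)
        return np.linalg.cond(A), np.linalg.norm(A @ x - f)

    for K in (5, 10, 20):
        for c in (0.0, 0.5, 1.0):
            cond, res = experiment(K, c)
            print(f"K={K:2d} c={c:.1f} cond={cond:.2e} residual={res:.1e}")

The matrix is Hilbert-like and becomes severely ill-conditioned as K grows, which is presumably the point of varying K and c.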
18 How to Calculate the Eigenvalues of Self-Adjoint Matrices
I 1. (page 266) Show that the off-diagonal entries of A_k tend to zero as k tends to ∞.
(d/dt) a_k = 2(b_k² − b_{k−1}²),
(d/dt) b_k = b_k(a_{k+1} − a_k),

where k = 1, …, n and b_0 = b_n = 0.
A Appendix
A.1 Special Determinants
I 1. (page 304) Let

p(s) = x_1 + x_2 s + ⋯ + x_n s^{n−1}

be a polynomial of degree less than n. Let a_1, …, a_n be n distinct numbers, and let p_1, …, p_n be n arbitrary complex numbers; we wish to choose the coefficients x_1, …, x_n so that

p(a_i) = p_i,  i = 1, …, n.

This is a system of n linear equations for the n coefficients x_i. Find the matrix of this system of equations, and show that its determinant is ≠ 0.
Solution. Denote the matrix by A. We claim

det A = ∏_{j>i} (a_j − a_i)² / ∏_{i,j} (1 + a_i a_j).

Indeed, by subtracting column 1 from columns 2 through n, then applying the Laplace expansion and extracting the common factor (a_i − a_1) from rows 2 through n and the common factor 1/(1 + a_1 a_j) from columns 2 through n, we get

det A = [ ∏_{j=2}^n (a_j − a_1)² / ( ∏_{i=1}^n (1 + a_i a_1) · ∏_{j=2}^n (1 + a_1 a_j) ) ] · det [ 1/(1 + a_i a_j) ]_{2≤i,j≤n},

and iterating this reduction (induction on n) yields the claim.
A.2 The Pfaffian

Proof. By the Laplace expansion and Exercise 16 of Chapter 5, we have

det [ 0, a, b, c ; −a, 0, d, e ; −b, −d, 0, f ; −c, −e, −f, 0 ]
 = −a det [ −a, d, e ; −b, 0, f ; −c, −f, 0 ] + b det [ −a, 0, e ; −b, −d, f ; −c, −e, 0 ] − c det [ −a, 0, d ; −b, −d, 0 ; −c, −e, −f ]
 = −a(bef − cdf − af²) + b(be² − cde − aef) + c(adf − bde + cd²)
 = −2abef + 2acdf − 2bcde + a²f² + b²e² + c²d²
 = (af − be + cd)².
A.3 Symplectic Matrices

and F is defined accordingly as the product of the block factors L and U above with the block diag(I_{2×2}, F_1^{−1}) term. Then a direct block computation gives

F^{−1} A (F^{−1})^T = [ 0, I_{(n+1)×(n+1)} ; −I_{(n+1)×(n+1)}, 0 ].
By induction, we have proved the claim.
I 2. (page 310) Prove the converse.
Proof. For any given x and y, define f(t) = (S(t)x, JS(t)y). Then we have

f′(t) = ((d/dt)S(t)x, JS(t)y) + (S(t)x, J(d/dt)S(t)y)
 = (G(t)S(t)x, JS(t)y) + (S(t)x, JG(t)S(t)y)
 = (JL(t)S(t)x, JS(t)y) + (S(t)x, J²L(t)S(t)y)
 = (L(t)S(t)x, J^T J S(t)y) − (S(t)x, L(t)S(t)y)
 = (S(t)x, L(t)S(t)y) − (S(t)x, L(t)S(t)y)
 = 0.
So f(t) ≡ f(0) = (S(0)x, JS(0)y) = (x, Jy). Since x and y are arbitrary, we conclude S(t) is a family of symplectic matrices.
I 3. (page 311) Prove that plus or minus 1 cannot be an eigenvalue of odd multiplicity of a symplectic
matrix.
Proof. Skipped for this version.
I 4. (page 312) Verify Theorem 6.
Proof. We note

dv/dt = (∂v/∂u)(du/dt) = (∂v/∂u) J H_u,

where (∂v/∂u)_{ij} = ∂v_i/∂u_j. H(u) can be seen as a function of v: K(v) = H(u(v)). So

∂K/∂v_i = ∑_{j=1}^{2n} (∂H/∂u_j)(∂u_j/∂v_i),  i.e.  K_v = (∂u/∂v)^T H_u,  equivalently  H_u = (∂v/∂u)^T K_v.

Since ∂v/∂u is symplectic, by Theorem 2, ∂u/∂v and (∂v/∂u)^T are also symplectic. So using formula (4) gives us

dv/dt = (∂v/∂u) J H_u = (∂v/∂u) J (∂v/∂u)^T K_v = J K_v.
A.4 Tensor Product
I 1. (page 313) Establish a natural isomorphism between tensor products defined with respect to two pairs of distinct bases.
A.5 Lattices
I 1. (page 318) Show that a_1 is a rational number.
A.7 Gershgorin's Theorem

I 1. (page 324) Show that if C_i is disjoint from all the other Gershgorin discs, then C_i contains exactly one eigenvalue of A.
Proof. Using the notation of the Gershgorin Circle Theorem, let B(t) = D + tF, t ∈ [0, 1]. The eigenvalues of B(t) are continuous functions of t (Theorem 6 of Chapter 9). For t = 0, the eigenvalues of B(0) are the diagonal entries of A. As t goes from 0 to 1, the radii of the Gershgorin discs corresponding to B(t) become bigger while the centers remain the same. So we can find for each d_i a continuous path λ_i(t) such that λ_i(0) = d_i and λ_i(t) is an eigenvalue of B(t) (0 ≤ t ≤ 1, i = 1, …, n). Moreover, by the Gershgorin Circle Theorem, each path λ_i(t) (0 ≤ t ≤ 1) is contained in the disc C_i = {x : |x − d_i| ≤ |f_i|_{ℓ¹}}. If for some i_1 ≠ i_2 the path λ_{i_2}(t) entered C_{i_1}, then necessarily C_{i_1} ∩ C_{i_2} ≠ ∅. This implies that any Gershgorin disc that is disjoint from all the other Gershgorin discs contains one and only one eigenvalue of A.
Remark 14. There's a strengthened version of the Gershgorin Circle Theorem that can be found at Wikipedia (http://en.wikipedia.org/wiki/Gershgorin_circle_theorem). The above solution is an adaptation of the proof therein. The claim: if the union of k Gershgorin discs is disjoint from the union of the other (n − k) Gershgorin discs, then the former union contains exactly k and the latter exactly (n − k) eigenvalues of A.
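A small numerical illustration (a sketch with a hand-picked matrix whose three discs are pairwise disjoint):

    import numpy as np

    A = np.array([[10.0, 0.1, 0.2],
                  [ 0.1, 2.0, 0.4],
                  [ 0.0, 0.3, 3.0]])
    eigs = np.linalg.eigvals(A)
    radii = np.abs(A).sum(axis=1) - np.abs(np.diag(A))   # Gershgorin radii
    for d, r in zip(np.diag(A), radii):
        count = int(np.sum(np.abs(eigs - d) <= r))
        print(f"disc |z - {d}| <= {r:.1f} contains {count} eigenvalue(s)")  # 1 each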
A.10 The Spectral Radius

I 1. (page 337) Prove that the eigenvalues of an upper triangular matrix are its diagonal entries.

Proof. Let T be upper triangular with diagonal entries a_1, …, a_n. Then λ_0 is an eigenvalue of T ⟺ det(λ_0 I − T) = ∏_{i=1}^n (λ_0 − a_i) = 0 ⟺ λ_0 is equal to one of a_1, …, a_n.
I 2. (page 338) Show that the Euclidean norm of a diagonal matrix is the maximum of the absolute value
of its eigenvalues.
Proof. Let D = diag{a_1, …, a_n}. Then De_i = a_i e_i, so ||De_i|| = |a_i|. This shows ||D|| ≥ max_{1≤i≤n} |a_i|. For any x ∈ Cⁿ, Dx = (a_1 x_1, …, a_n x_n)^T, so ||Dx||² = ∑_{i=1}^n |a_i x_i|² ≤ (max_{1≤i≤n} |a_i|)² ||x||². So ||D|| ≤ max_{1≤i≤n} |a_i|. Combined, we conclude ||D|| = max_{1≤i≤n} |a_i|.
I 3. (page 339) Prove the analogue of relation (2) when A is a linear mapping of any finite-dimensional normed linear space X (see Chapters 14 and 15).
Proof. By examining the proof for Euclidean space, we see the inner product is not really used; all that has been exploited is the norm. So the proof for any finite-dimensional normed linear space is entirely identical to that for finite-dimensional Euclidean space.
I 4. (page 339) Show that the two definitions are equivalent.

Proof. It suffices to note that a sequence (A_n)_{n=1}^∞ converges to A in matrix norm iff each entry sequence ((A_n)_{ij})_{n=1}^∞ converges to A_ij (Exercise 16 of Chapter 7).
I 5. (page 339) Let A(z) be an analytic matrix function in a domain G, invertible at every point of G. Show that then A^{−1}(z), too, is an analytic matrix function in G.
Proof. By formula (16) of Chapter 5, D(a_1, …, a_n) = ∑_p σ(p) a_{p(1)1} ⋯ a_{p(n)n}, the determinant of any analytic matrix (i.e. matrix-valued analytic function) is analytic. By Cramer's rule and det A(z) ≠ 0 in G, we conclude A^{−1}(z) is also analytic in G.
I 6. (page 339) Show that the Cauchy integral theorem holds for matrix-valued functions.
Proof. By Exercise 4, the Cauchy integral theorem for matrix-valued functions is reduced to the Cauchy integral theorem for each entry of an analytic matrix.
where d is the dimension of the Euclidean space in which G resides. This shows the elements of D are equicontinuous in G.

(ii)

Proof. From (i), we know each element of D is uniformly continuous. So they can be extended to Ḡ, the closure of G. Then Theorem 3 is the result of the following version of the Arzelà–Ascoli Theorem (see, for example, Yosida [14]): Let S be a compact metric space, and C(S) the Banach space of (real- or) complex-valued continuous functions x(s) normed by ||x|| = sup_{s∈S} |x(s)|. Then a sequence {x_n(s)} ⊂ C(S) is relatively compact in C(S) if the following two conditions are satisfied: (a) {x_n(s)} is uniformly bounded; (b) {x_n(s)} is equicontinuous.
A.14 Liapunov's Theorem

I 1. (page 360) Show that the sums (14) tend to a limit as the size of the subintervals Δ_j tends to zero. (Hint: Imitate the proof for the scalar case.)

Proof. This is basically about how to extend the Riemann integral to Banach-space-valued functions. The theory is essentially the same as in the scalar case: just replace the Euclidean norm with an arbitrary norm. So we omit the details.
I 2. (page 360) Show that the two definitions are equivalent.

Proof. It suffices to note that A_n → A in matrix norm if and only if each entry of A_n converges to the corresponding entry of A (see Exercise 7 and formula (51) of Chapter 7).
I 3. (page 360) Show, using Lemma 4, that the integral (12),

lim_{T→∞} ∫_0^T e^{W∗t} e^{W t} dt,

exists.

Proof. By Lemma 4, lim_{T,T′→∞} ∫_T^{T′} e^{W∗t} e^{W t} dt = 0. So by Cauchy's criterion, we conclude the integral (12) exists.
A.16 Numerical Range

||A|| ≤ w(A). By Theorem 13(ii) of Chapter 7, w(A) ≤ ||A||. Combined, we conclude w(A) = ||A||.
I 3. (page 369) Verify (7) and (8).
Proof. To verify (7), we note ∏_k r_k = e^{(2πi/n) ∑_{k=1}^n k} = e^{(n+1)πi} = (−1)^{n+1}. Therefore

∏_k (1 − r_k z) = ∏_k (−r_k)(z − r_k^{−1}) = (−1)^n ∏_k r_k · ∏_k (z − r_k) = (−1)^{2n+1}(z^n − 1) = 1 − z^n,

where we used the fact that {r_k^{−1}} is again the set of all n-th roots of unity.

To verify (8), note ∑_j 1/(1 − r_j z) is a rational function over the complex plane C, which can be assumed to have the form P(z)/Q(z) with P(z) and Q(z) polynomials without common factors. Since r_1^{−1}, …, r_n^{−1} are singularities of degree 1 for ∑_j 1/(1 − r_j z), we conclude Q(z) = ∏_k (1 − r_k z) = 1 − z^n, up to the difference of a constant factor. Since ∑_j 1/(1 − r_j z) has no zeros in the complex plane, we conclude P(z) must be a constant. Combined, we conclude

∑_{j=1}^n 1/(1 − r_j z) = C/(1 − z^n)

for some constant C. By letting z → 0, we see C = n. This finishes the verification of (8).
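Both identities are easy to confirm numerically; a sketch:

    import numpy as np

    n, z = 7, 0.3 + 0.4j
    r = np.exp(2j * np.pi * np.arange(1, n + 1) / n)         # n-th roots of unity
    print(np.isclose(np.prod(1 - r * z), 1 - z**n))           # (7): True
    print(np.isclose(np.sum(1 / (1 - r * z)), n / (1 - z**n)))  # (8): True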
I 4. (page 370) Determine the numerical range of A = [ 1, 1 ; 0, 1 ] and of A² = [ 1, 2 ; 0, 1 ].

Solution. If x = (x_1, x_2)^T, (Ax, x) = x_1² + x_2² + x_1 x_2. If x_1² + x_2² = 1, we have (Ax, x) = 1 + x_1 sign(x_2)√(1 − x_1²). Calculus shows f(α) = 1 + α√(1 − α²) (−1 ≤ α ≤ 1) achieves its maximum at α_0 = √2/2. So w(A) = 1 + 1/2 = 3/2. Similarly, a plain calculation shows w(A²) = 2.
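A brute-force numerical estimate of w(A) = sup over unit vectors x of |(Ax, x)|, sampling random complex unit vectors, is consistent with these values (sketch):

    import numpy as np

    rng = np.random.default_rng(6)

    def num_radius(A, trials=200_000):
        # sample random complex unit vectors and maximize |(Ax, x)|
        X = rng.standard_normal((trials, 2)) + 1j * rng.standard_normal((trials, 2))
        X /= np.linalg.norm(X, axis=1, keepdims=True)
        return np.abs(np.einsum('ti,ij,tj->t', X.conj(), A, X)).max()

    A = np.array([[1.0, 1.0], [0.0, 1.0]])
    print(num_radius(A))       # approx 1.5 = w(A)
    print(num_radius(A @ A))   # approx 2.0 = w(A^2)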
References
[1] M. Engeli, T. Ginsburg, H. Rutishauser and E. Stiefel. Refined iterative methods for computation of the solution and the eigenvalues of self-adjoint boundary value problems, Birkhäuser Verlag, Basel/Stuttgart, 1959. 51
[2] Gene H. Golub and Charles F. van Loan. Matrix computations, 3rd Edition. Johns Hopkins University Press, 1996. 51
[3] Gong Sheng and Gong Youhong. Concise complex analysis, Revised Edition, World Scientific, 2007. 57
[4] William H. Greene. Econometric Analysis, 7th ed., Prentice Hall, 2011. 13
[5] James P. Keener. Principles of applied mathematics: Transformation and approximation, revised edition.
Westview Press, 2000. 30
[6] Lan Yi-Zhong. A concise course of advanced algebra (in Chinese), Volume 1, Peking University Press, Beijing, 2002. 21
[7] P. Lax. Functional analysis, Wiley-Interscience, 2002. 48
[8] P. Lax. Linear algebra and its applications, 2nd Edition, Wiley-Interscience, 2007. 1
[9] Steven Huss-Lederman, Elaine M. Jacobson, Anna Tsao, Thomas Turnbull, Jeremy R. Johnson. Implementation of Strassen's algorithm for matrix multiplication. Proceedings of the 1996 ACM/IEEE Conference on Supercomputing, Pittsburgh, Pennsylvania, United States, 1996. 56
[10] J. Munkres. Analysis on manifolds, Westview Press, 1997. 15, 21
[11] M. Papadrakakis. A family of methods with three-term recursion formulae. International Journal for
Numerical Methods in Engineering, Vol. 18, 1785-1799 (1982). 51
[13] Michael Spivak. Calculus on manifolds: A modern approach to classical theorems of advanced calculus. Perseus Books Publishing, 1965. 30

[14] K. Yosida. Functional analysis, Springer-Verlag. 58