
Notes for Linear Algebra

Fall 2013

Vector spaces

Definition 1.1 [Vector spaces] Let V be a set with a binary operation +, F a field, and (c, v) ↦ cv a mapping from F × V into V. Then V is called a vector space over F (or a linear space over F) if
(i) u + v = v + u for all u, v ∈ V
(ii) u + (v + w) = (u + v) + w for all u, v, w ∈ V
(iii) there exists 0 ∈ V such that v + 0 = v for all v ∈ V
(iv) for every u ∈ V there exists −u ∈ V such that u + (−u) = 0
(v) (ab)v = a(bv) for all a, b ∈ F and v ∈ V
(vi) (a + b)v = av + bv for all a, b ∈ F and v ∈ V
(vii) a(u + v) = au + av for all a ∈ F and u, v ∈ V
(viii) 1v = v for all v ∈ V

Note that (i)-(iv) mean that (V, +) is an abelian group. The mapping (c, v) ↦ cv is called scalar multiplication, the elements of F scalars, the elements of V vectors. The vector 0 ∈ V is called the zero vector (do not confuse it with 0 in F).
In this course we practically study two cases: F = ℝ and F = ℂ. We call them real vector spaces and complex vector spaces, respectively.

Examples 1.2 [of vector spaces] (a) F eld, F n = {(x1 , . . . , xn ) : xi F } n-tuple space. The operations are dened by (x1 , . . . , xn ) + (y1 , . . . , yn ) = (x1 + y1 , . . . , xn + yn ) c(x1 , . . . , xn ) = (cx1 , . . . , cxn ) Note that IR2 and IR3 can be interpreted as ordinary plane and space, respectively, in rectangular coordinate systems. (b) F eld, S set, V = {f : S F } (arbitrary functions from S to F ). The operations are dened by (f + g )(s) = f (s) + g (s) (cf )(s) = cf (s) for f, g V, s S for c F, f V, s S

(c) F[x], the space of polynomials over a field F in the unknown x. Note that by definition these are identified with sequences (a0, a1, a2, . . .), an ∈ F for all n, such that only finitely many of the an are non-zero. Addition and scalar multiplication are defined componentwise.
(d) F a field, F^{m×n} the space of m × n matrices with components in F. For A ∈ F^{m×n} denote by A_ij the component of the matrix A in the i-th row and the j-th column, i ∈ [1, m], j ∈ [1, n]. Then the operations are defined by
(A + B)_ij = A_ij + B_ij,   (cA)_ij = c A_ij.
Proposition 1.3 [Basic properties of vector spaces] Let V be a vector space over F, v ∈ V and c ∈ F. Then
c0 = 0,   0v = 0,   (−1)v = −v,   and   cv = 0 ⟺ c = 0 or v = 0.

Denition 1.4 [Linear combination] Let v1 , . . . , vn V and c1 , . . . , cn F . Then the vector c1 v1 + + cn vn is called a linear combination of v1 , . . . , vn . Denition 1.5 [Subspace] A subset of a vector space V which is itself a vector space with respect to the operations in V is called a subspace of V . Theorem 1.6 [Subspace criterion] A non-empty subset W of a vector space V is a subspace of V if and only if au + bv W for all a, b F, u, v W.

Examples 1.7 [of subspaces] (a) Let I IR be an interval. Then C (I ), the set of all continuous real-valued functions on I , is a subspace of the vector space of all functions on I . If k is a positive integer, then C k (I ), the k times continuously dierentiable functions on I , is a subspace of C (I ). (b) F [x]n , the polynomials of degree n over F , is a subspace of F [x]. (c) The space of symmetric n n matrices (i.e. such that Aij = Aji ) is a subspace of F n n . (e) W = {0} is a trivial subspace in any vector space (it only contains the zero vector). (f) If C is a collection of subspaces of V , then their intersection W C W is a subspace of V . This is not true for unions. 2

Definition 1.8 [Span] Let A be a subset of a vector space V. Denote by C the collection of all subspaces of V that contain A. Then the span of A is the subspace of V defined by
span A := ∩_{W ∈ C} W.
Note: span A is the smallest subspace of V which contains A.
Theorem 1.9 If A = ∅, then span A = {0}; otherwise span A is the set of all linear combinations of elements of A.
Example 1.10 The vectors (3, 5) and (0, 2) span ℝ².
Example 1.11 If W1, W2 ⊆ V are subspaces, then span(W1 ∪ W2) = W1 + W2, where W1 + W2 = {u + v : u ∈ W1, v ∈ W2}.
Definition 1.12 [Linear independence, linear dependence] A finite set of vectors v1, . . . , vn is said to be linearly independent if c1 v1 + ⋯ + cn vn = 0 implies c1 = ⋯ = cn = 0. In other words, all nontrivial linear combinations are different from zero. Otherwise, the set {v1, . . . , vn} is said to be linearly dependent. An infinite set of vectors is said to be linearly independent if every finite subset of it is linearly independent.
Examples 1.13 (a) In the space F^n, the vectors e1 = (1, 0, 0, . . . , 0), e2 = (0, 1, 0, . . . , 0), . . . , en = (0, 0, 0, . . . , 1) are linearly independent. (b) In F[x]_n, the polynomials 1, x, x², . . . , xⁿ are linearly independent. In F[x], the infinite collection of polynomials {x^i : 0 ≤ i < ∞} is linearly independent. (c) If a set of vectors contains the zero vector, it is linearly dependent.
Lemma 1.14 Let S be a linearly independent subset of V. Suppose that u ∈ V is not contained in span S. Then the set S ∪ {u} is linearly independent.
Definition 1.15 [Basis] A basis of V is a linearly independent set B ⊆ V such that span B = V.

Examples 1.16 a) {e1 , . . . , en } is a basis of F n (the canonical basis). b) {1, x, . . . , xn } is a basis of F [x]n . The innite collection of polynomials {xi : 0 i < } i=0 is a basis of F [x]. Theorem 1.17 Let V = {0} be a vector space and S a linearly independent subset of V . Then there exists a basis B of V such that S B . In particular, every non-trivial vector space has a basis. Theorem 1.18 Let V be spanned by u1 , . . . , um n, and let v1 , . . . , vm be linearly independent in V . Then m n. Denition 1.19 [nite-dimensional vector space] A vector space V is said to be nitedimensional if it has a nite basis. Corollary 1.20 Let V be a nite-dimensional vector space. Then (a) Each basis of V is nite. (b) If {u1 , . . . , um } and {v1 , . . . , vn } are two bases of V , then m = n. Denition 1.21 [Dimension] The dimension of a nite-dimensional vector space V is the number of vectors in any basis of V . It is denoted by dim V . The trivial vector space {0}, the one consisting of a single zero vector, has no bases, and we dene dim{0} = 0. Examples 1.22 (a) dim F n = n (b) dim F [x]n = n + 1 (c) dim F mn = mn (d) F [x] is not a nite-dimensional space Corollary 1.23 Let V be a nite dimensional vector space. (a) If W is a subspace of V , then W is nite dimensional and dim W dim V . (b) If W is a proper subspace of V , then dim W < dim V . Theorem 1.24 Let W1 and W2 be nite dimensional subspaces of V . Then W1 + W2 is nite dimensional and dim W1 + dim W2 = dim (W1 W2 ) + dim (W1 + W2 ). Lemma 1.25 If {u1 , . . . , un } is a basis for V , then for any v V there exist unique scalars c1 , . . . , cn such that v = c1 u1 + + cn un

Definition 1.26 [Coordinates] Let B = {u1, . . . , un} be an ordered basis of V. If v = c1 u1 + ⋯ + cn un, then (c1, . . . , cn) are the coordinates of the vector v with respect to the basis B. Notation:
(v)_B = (c1, . . . , cn)^t
is the coordinate vector of v relative to B.
Examples 1.27 (a) The canonical coordinates of a vector v = (x1, . . . , xn) ∈ F^n in the standard (canonical) basis {e1, . . . , en} are its components x1, . . . , xn, since v = x1 e1 + ⋯ + xn en. (b) The coordinates of a polynomial p ∈ P_n(ℝ) given by p = a0 + a1 x + a2 x² + ⋯ + an xⁿ in the basis {1, x, . . . , xⁿ} are its coefficients a0, a1, . . . , an.
Theorem 1.28 [coordinate mapping] Let dim V = n and B = {u1, . . . , un} be an ordered basis of V. Then the mapping M_B : V → F^n given by v ↦ (v)_B is a bijection. For any u, v ∈ V and c ∈ F we have
M_B(u + v) = M_B(u) + M_B(v),   M_B(cv) = c M_B(v).
Note: The mapping M_B : V → F^n not only is a bijection, but also preserves the vector operations. Since there is nothing else defined in V, we have a complete identity of V and F^n. Any property of V can be proven by first substituting F^n for V and then using the mapping M_B.
Theorem 1.29 [Change of coordinates] Let B = {u1, . . . , un} and B′ = {u′1, . . . , u′n} be two ordered bases of V. Denote by P_{B′,B} the n × n matrix with j-th column given by
(P_{B′,B})_j = (u′_j)_B
for j = 1, . . . , n. Then the matrix P_{B′,B} is invertible with P_{B′,B}^{−1} = P_{B,B′}, and
(v)_B = P_{B′,B} (v)_{B′}
for every v ∈ V.
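Theorem 1.29 can be checked numerically. The sketch below (assuming NumPy is available) uses a pair of arbitrarily chosen bases of ℝ²; the specific vectors and the variable names U, Uprime are illustrative only.

```python
import numpy as np

# Numerical check of Theorem 1.29 in R^2 (illustrative bases, not taken from the notes).
# Columns of U are the vectors of the basis B, columns of Uprime those of B',
# all written in standard coordinates.
U      = np.array([[1.0, 2.0],
                   [1.0, 3.0]])           # B  = {(1,1), (2,3)}
Uprime = np.array([[1.0, 0.0],
                   [1.0, 1.0]])           # B' = {(1,1), (0,1)}

# The j-th column of P_{B',B} is (u'_j)_B, i.e. the solution of U c = u'_j,
# so P_{B',B} = U^{-1} Uprime.
P = np.linalg.solve(U, Uprime)

v        = np.array([3.0, 5.0])           # an arbitrary vector in standard coordinates
v_B      = np.linalg.solve(U, v)          # (v)_B
v_Bprime = np.linalg.solve(Uprime, v)     # (v)_B'

# Theorem 1.29: (v)_B = P_{B',B} (v)_B'  and  P_{B',B}^{-1} = P_{B,B'}.
assert np.allclose(v_B, P @ v_Bprime)
assert np.allclose(np.linalg.inv(P), np.linalg.solve(Uprime, U))
```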

Denition 1.30 [Row space, row rank] Let A F mn . Denote by vi = (Ai1 , . . . , Ain ) F n the ith row vector of A. Then the subspace span{v1 , . . . , vm } of F n is called the row space of A, and dim(span{v1 , . . . , vm }) is called the row rank of A. Denition 1.31 [elementary row operations, row equivalence] The following operations on m n-matrices are called elementary row operations: (a) Multiplication of one row by a non-zero scalar. (b) Addition of a multiple of one row to another. (c) Interchange of two rows. Two matrices A, B F mn are called row-equivalent if A can be transformed into B by a nite series of elementary row operations. Note: This is an equivalence relation. Theorem 1.32 Row equivalent matrices have the same row space. Note: Theorem 1.30 has a converse: if two matrices have the same row space, then they are row equivalent (a proof may be found in textbooks). We will not need this fact. Denition 1.33 [row echelon matrix] Let A F mn have row vectors v1 , . . . , vm . A is called a row-echelon matrix if (a) There exists an integer M such that v1 , . . . , vM = 0 and vM +1 , . . . , vm = 0. (b) If ji = min{j : Aij = 0}, 1 i M , then j1 < j2 < . . . < jM . Theorem 1.34 If R is a row echelon matrix, then the non-zero row vectors of R are a basis of the row space of R. Remark 1.35 (i) Every matrix is row equivalent to a row echelon matrix. (ii) Theorems 1.32 and 1.34 show how to nd a basis of span{v1 , . . . , vm } for a given set of vectors v1 , . . . , vm F n . In particular, this gives the dimension of that subspace. Denition 1.36 [direct sums, complementary subspaces] Let W1 and W2 be subspaces of a vector space V . Their sum W1 + W2 is called direct if W1 W2 = {0}. In this case one writes W1 W2 for their sum. If W1 W2 = V , then the subspaces W1 and W2 are called complementary subspaces of V . Theorem 1.37 Let W1 and W2 be subspaces of V . Then the following are equivalent: (i) The sum of W1 and W2 is direct. (ii) If w1 W1 , w2 W2 and w1 + w2 = 0, then w1 = w2 = 0. (iii) For each w W1 + W2 there exist unique vectors w1 W1 and w2 W2 such that w = w1 + w2 . (iii) If B1 is a basis of W1 and B2 is a basis of W2 , then B1 B2 is a basis of W1 + W2 . 6

Corollary 1.38 For every subspace W of V there exists a subspace U of V such that W and U are complementary. Note: U is generally not unique. Example 1.39 Let V = C [0, 1]. Let W1 = {f V : 01 f (x) dx = 0} and W2 = {f V : f const}. Then W1 and W2 are complementary subspaces of V . Denition 1.40 [external direct sum] Let V and W be vector spaces over F . The external direct sum V W of V and W is dened by V W := {(v, w)| v V, w W }, (v1 , w1 ) + (v2 , w2 ) = (v1 + v2 , w1 + w2 ) for all v1 , v2 V , w1 , w2 W , c(v, w) = (cv, cw) for all v V , w W , c F . Theorem 1.41 Let V W be the external direct sum of V and W . Then (a) V W is a vector space over F . (b) If V and W are nite dimensional, then V W is nite-dimensional and dim V W = dim V + dim W. (c) V := {(v, 0)|v V } and W := {(0, w)|w W } are complementary subspaces of V W.

Linear Transformations

Denition 2.1 [Linear transformation] Let V and W be vector spaces over a eld F . A linear transformation from V to W is a mapping T : V W such that (i) T (u + v ) = T u + T v for all u, v V ; (ii) T (cu) = c T u for all u V , c F . If V = W then T is often called a linear operator on V . Theorem 2.2 [Elementary properties of linear transformations] Let T : V W be a linear transformation. Then (a) T (0) = 0, T (u) = T u, T ( ci ui ) = ci T ui , (b) Im T = {T u : u V } is a subspace of W (also denoted by R(T )), (c) Ker T = {u V : T u = 0} is a subspace of V (also denoted by N (T )), (d) T is injective (one-to-one) if and only if Ker T = {0}. Examples 2.3 [of linear transformations] (a) A matrix A F mn denes a linear transformation T : F n F m by T u = Au (multiplication). We denote this transformation by TA . (b) T : C 1 (0, 1) C (0, 1) dened by T f = f , where f is the derivative of the function f . (c) T : IR[x]n IR[x]n1 dened by T f = f , as above. (d) T : C [0, 1] IR dened by T f = 01 f (x) dx. Theorem 2.4 Let V and W be vector spaces, where V is nite dimensional with basis {u1 , . . . , un }. Let {v1 , . . . , vn } be a subset of W . Then there exists a unique linear transformation T : V W such that T ui = vi for all 1 i n. Theorem 2.5 Let T : F n F m be a linear transformation, then there exists a unique matrix A F mn such that T = TA . Denition 2.6 [Rank of a linear transformation] Let T : V W be a linear and Im(T ) be nite dimensional. Then rank(T ) := dim Im(T ). (2.1) Theorem 2.7 Let T : V W be a linear transformation and V nite dimensional, then Im(T ) is nite dimensional and rank(T ) + dim ker(T ) = dim V. Denition 2.8 [Column space, column rank] Let A F mn , then the subspace of F m spanned by the columns of A is called the column space of A, its dimension the column rank of A. 8

Lemma 2.9 If R F mn is a row echelon matrix, then row rank R = column rank R. Theorem 2.10 Let A F mn and T = TA the induced linear transformation from F n to F m . Then (a) Im(TA ) is the column space of A. In particular rank(TA ) = column rank(A), (b) dim Ker(TA ) = n row rank(A), (c) row rank(A) = column rank(A). Denition 2.11 [Rank of a matrix] Let A F mn . Then rank(A) := row rank(A) = column rank(A) . Denition 2.12 Let V and W be vector spaces over F . The set of all linear transformations from V to W is denoted by L(V, W ) Theorem 2.13 L(V, W ) is a vector space (over F ), with ordinary addition and multiplication by scalars as dened for functions (from V to W ). If V and W are nite dimensional, then L(V, W ) is nite dimensional and dim L(V, W ) = dim V dim W. Note: A proof of this will follow from Theorem 2.25. Proposition 2.14 Let V, W, Z be vector spaces over F . For any T L(V, W ) and U L(W, Z ) the composition U T , also denoted by U T , is a linear transformation from V to Z , i.e. U T L(V, Z ). Example 2.15 Let A F mn , B F km and TA , TB be dened as in Example 2.3(a). Then the composition TB TA is a linear transformation from F n to F k given by the product matrix BA, i.e. TB TA = TBA . Denition 2.16 [Isomorphism] A transformation T L(V, W ) is called an isomorphism if T is bijective. If an isomorphism T L(V, W ) exists, the vector spaces V and W are called isomorphic. Proposition 2.17 If T L(V, W ) is an isomorphism, then T 1 L(W, V ) is an isomorphism.
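Theorem 2.10 is easy to probe numerically. The following sketch (assuming NumPy; the 3 × 4 matrix is an arbitrary illustration) checks that row rank equals column rank and that dim Ker(T_A) = n − rank(A), in line with Theorem 2.7.

```python
import numpy as np

# Illustration of Theorems 2.7 and 2.10 for an arbitrary example matrix.
A = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 4.0, 6.0, 8.0],      # proportional to the first row
              [1.0, 0.0, 1.0, 0.0]])
m, n = A.shape

rank = np.linalg.matrix_rank(A)                   # column rank of A
assert rank == np.linalg.matrix_rank(A.T) == 2    # = row rank (Theorem 2.10(c))

# dim Ker(T_A) = n - rank(A): the last n - rank right singular vectors span Ker(A).
_, _, Vt = np.linalg.svd(A)
null_basis = Vt[rank:]                            # (n - rank) x n
assert null_basis.shape[0] == n - rank            # Theorem 2.7 / 2.10(b)
assert np.allclose(A @ null_basis.T, 0.0)
```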

Theorem 2.18 Let V and W be finite dimensional, and T ∈ L(V, W).
(i) T is injective if and only if whenever {u1, . . . , uk} is linearly independent, then {Tu1, . . . , Tuk} is also linearly independent.
(ii) T is surjective (i.e. Im(T) = W) if and only if rank(T) = dim W.
(iii) T is an isomorphism if and only if whenever {u1, . . . , un} is a basis in V, then {Tu1, . . . , Tun} is a basis in W.
(iv) If T is an isomorphism, then dim V = dim W.
Theorem 2.19 Let V and W be finite dimensional, dim V = dim W, and T ∈ L(V, W). Then the following are equivalent: (i) T is an isomorphism, (ii) T is injective, (iii) T is surjective.
Definition 2.20 [Automorphism, GL(V)] An isomorphism T ∈ L(V, V) is called an automorphism of V. The set of all automorphisms of V is denoted by GL(V), which stands for general linear group. In the special case V = F^n one sometimes writes GL(V) = GL(n, F).
Note: GL(V) is not a subspace of L(V, V) (it does not contain zero, for example), but it is a group with respect to composition. This group is not abelian, i.e. TU ≠ UT for some T, U ∈ GL(V), unless dim V = 1.
Example 2.21 Let T : ℝ[x]_n → ℝ[x]_n be defined by Tf(x) = f(x + a), where a ∈ ℝ is a fixed number. Then T is an isomorphism.
Theorem 2.22 If dim V = n, then V is isomorphic to F^n.
Definition 2.23 [Matrix representation] Let dim V = n and dim W = m. Let B = {u1, . . . , un} be a basis in V and C = {v1, . . . , vm} be a basis in W. Let T ∈ L(V, W). The unique matrix A defined by
T u_j = Σ_{i=1}^m A_ij v_i
is called the matrix of T relative to B, C and denoted by [T]_{B,C}.
Theorem 2.24 In the notation of Definitions 1.26 and 2.23, for each vector u ∈ V,
(Tu)_C = [T]_{B,C} (u)_B.
Theorem 2.25 The mapping T ↦ [T]_{B,C} defines an isomorphism from L(V, W) to F^{m×n}.


Theorem 2.26 Let B, C, D be bases in the finite dimensional vector spaces V, W, Z, respectively. For any T ∈ L(V, W) and U ∈ L(W, Z) we have
[UT]_{B,D} = [U]_{C,D} [T]_{B,C}.
Corollary 2.27 Let V, W be finite dimensional with bases B, C. Then T ∈ L(V, W) is an isomorphism if and only if [T]_{B,C} is an invertible matrix. In this case
[T]_{B,C}^{−1} = [T^{−1}]_{C,B}.

Corollary 2.28 Let B = {u1, . . . , un} and B′ = {u′1, . . . , u′n} be two bases in V. Then [I_V]_{B′,B} = P_{B′,B}, where P_{B′,B} is the transition matrix defined in Theorem 1.29.
Theorem 2.29 Let V and W be finite dimensional, B and B′ bases in V, and C and C′ bases in W. Then for every T ∈ L(V, W) we have
[T]_{B′,C′} = Q [T]_{B,C} P,   where P = [I]_{B′,B} and Q = [I]_{C,C′}.
In the special case V = W, B = C and B′ = C′ we have [T]_{B′,B′} = P^{−1} [T]_{B,B} P.
Definition 2.30 [Similar matrices] Two matrices A, A′ ∈ F^{n×n} are said to be similar, denoted A ∼ A′, if there is an invertible matrix P ∈ F^{n×n} such that A = P^{−1} A′ P.
Note: Similarity is an equivalence relation.
Theorem 2.31 Two matrices A and A′ in F^{n×n} are similar if and only if there exist an n-dimensional vector space V, T ∈ L(V, V), and two bases B, B′ of V with A = [T]_{B,B} and A′ = [T]_{B′,B′}.
Theorem 2.32 Let V and W be finite dimensional with bases B, C. For all T ∈ L(V, W), rank(T) = rank [T]_{B,C}.
Corollary 2.33 If A and A′ are similar, then rank(A) = rank(A′).


Definition 2.34 [Linear functional, Dual space] Let V be a vector space over F. Then V* := L(V, F) is called the dual space of V. The elements f ∈ V* are called linear functionals on V (they are linear transformations from V to F).
Corollary 2.35 V* is a vector space. If V is finite dimensional, then dim V* = dim V.
Examples 2.36 (a) T : C[0, 1] → ℝ defined by Tf = ∫₀¹ f(x) dx. (b) The trace of a square matrix A ∈ F^{n×n} is defined by tr A = Σ_i A_ii. Then tr ∈ (F^{n×n})*. (c) Every linear functional f on F^n is of the form
f(x1, . . . , xn) = Σ_{i=1}^n a_i x_i
for some a_i ∈ F (and each f of this form is a linear functional).
Theorem 2.37 Let V be finite dimensional and B = {u1, . . . , un} be a basis of V. Then there is a unique basis B* = {f1, . . . , fn} of V* such that f_i(u_j) = δ_ij, where δ_ij = 1 if i = j and 0 if i ≠ j (Kronecker delta symbol). B* has the property that
f = Σ_{i=1}^n f(u_i) f_i
for all f ∈ V*.
Definition 2.38 The basis B* found in Theorem 2.37 is called the dual basis of B.
Example 2.39 In Example 2.36(c) the dual basis of the standard basis B = {e1, . . . , en} is given by f_i(x1, . . . , xn) = x_i, i = 1, . . . , n.

Systems of linear equations

Definition 2.40 [Systems of linear equations] A system of m linear equations in n unknowns x1, . . . , xn is defined by
a11 x1 + . . . + a1n xn = b1
a21 x1 + . . . + a2n xn = b2
. . .
am1 x1 + . . . + amn xn = bm        (2.2)
where a_ij ∈ F and b_i ∈ F are given. If A ∈ F^{m×n} has entries A_ij = a_ij and x ∈ F^{n×1} and b ∈ F^{m×1} are column vectors with entries x1, . . . , xn and b1, . . . , bm, then (2.2) can be written as Ax = b. Every x ∈ F^{n×1} with Ax = b is called a solution of Ax = b. Systems of the form Ax = 0 are called homogeneous. If b ≠ 0, then Ax = b is called inhomogeneous.

Theorem 2.41 [homogeneous systems] If A F mn , then the set of solutions of Ax = 0 is a subspace of F n1 of dimension n rank(A). In particular, x1 = x2 = . . . = xn = 0 is the unique solution of Ax = 0 if and only if rank(A) = n. Theorem 2.42 [inhomogeneous systems] Let A F mn and b F m1 . Dene the matrix (A, b) F m(n+1) by adjoining b as (n + 1)-st column to A. (i) The inhomogeneous system Ax = b has at least one solution if and only if rank(A, b) = rank(A). (ii) Suppose that rank(A, b) = rank(A). Then the solution of Ax = b is unique if and only if rank(A) = n. (iii) Let rank(A, b) = rank(A) < n (i.e. Ax = b has multiple solutions), and let xp be a particular solution of Ax = b. Then x is a solution of Ax = b if and only if x = xp + xh , where xh is a solution of Ax = 0. Corollary 2.43 [elementary form of Fredholms Alternative] Let A F nn . Then exactly one of the following is true: (i) Ax = 0 has a non-trivial solution. (ii) Ax = b is solvable for every b F n1 . Gaussian elimination Denition 2.44 [Equivalent systems] Let A, A F mn , b, b F m1 . Then the systems Ax = b and A x = b are called equivalent if and only if they have the same set of solutions. Theorem 2.45 If (A, b) and (A , b ) are row equivalent, then Ax = b and A x = b are equivalent. Remark 2.46 [Gaussian elimination] The inhomogeneous system Ax = b is solved by the following method: To (A, b) nd a row-equivalent matrix (R, b ), where R is a row-echelon matrix. If b has non-zero entries below the non-zero rows of R, then Ax = b has no solution. Otherwise solve Rx = b by backward substitution. Denote the rst non-zero entries in each row of R as pivots. If the k -th column of R contains a pivot, then xk is a determined variable. Otherwise xk is a free variable (any choice will lead to a solution). Example 2.47 Consider

(  1   1   1 ) (x1)   ( 1 )
( −1   2  −1 ) (x2) = ( 5 )
(  1  −1   1 ) (x3)   ( b )

This system is equivalent to

( 1  1  1 ) (x1)   (  1  )
( 0  3  0 ) (x2) = (  6  )
( 0  0  0 ) (x3)   ( b+3 )

If b ≠ −3, then the system has no solution. If b = −3, then x3 is free, x2 = 2 and x1 = −1 − x3, i.e. the general solution is
x = ( −1 − x3, 2, x3 )^t.
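The same conclusions can be confirmed in code. This is a minimal sketch (assuming NumPy), with the system entries as printed above; the helper rhs and the chosen values of the free variable are illustrative.

```python
import numpy as np

# A quick numerical check of Example 2.47 (entries as printed above).
A = np.array([[ 1.0,  1.0,  1.0],
              [-1.0,  2.0, -1.0],
              [ 1.0, -1.0,  1.0]])

def rhs(b):
    return np.array([1.0, 5.0, b])

# rank(A) = 2 < 3, so by Theorem 2.42 the system is solvable iff rank(A, b) = rank(A).
assert np.linalg.matrix_rank(A) == 2
assert np.linalg.matrix_rank(np.column_stack([A, rhs(-3.0)])) == 2   # b = -3: solvable
assert np.linalg.matrix_rank(np.column_stack([A, rhs(0.0)])) == 3    # b != -3: unsolvable

# For b = -3 every x = (-1 - t, 2, t) solves the system (t = x3 is the free variable).
for t in (0.0, 1.5, -2.0):
    x = np.array([-1.0 - t, 2.0, t])
    assert np.allclose(A @ x, rhs(-3.0))
print("Example 2.47 checks passed")
```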


Determinants

Definition 3.1 [Permutation, Transposition, Sign] A permutation of a set S is a bijective mapping σ : S → S. If S = {1, . . . , n} we call σ a permutation of degree n (or a permutation of n letters). Notation: σ = (σ(1), . . . , σ(n)). S_n is the set of all permutations of degree n. A permutation which interchanges two numbers and leaves all the others fixed is called a transposition. Notation: (j, k). If σ ∈ S_n, then the sign of σ is defined as follows: sg(σ) = 1 if σ can be written as a product of an even number of transpositions, and sg(σ) = −1 if σ can be written as a product of an odd number of transpositions.
Note: The sign of a permutation is well-defined, see MA634.
Definition 3.2 [Determinant] Let A ∈ F^{n×n}. Then the determinant of A is
det(A) = Σ_{σ ∈ S_n} sg(σ) a_{1σ(1)} a_{2σ(2)} ⋯ a_{nσ(n)}.
Note: every term in this sum contains exactly one element from each row and exactly one element from each column.
Examples 3.3 (i) If A ∈ F^{2×2}, then det(A) = a11 a22 − a12 a21. (ii) Let A = diag(a11, . . . , ann) be a diagonal matrix (i.e., A is the n × n matrix with diagonal elements a11, . . . , ann and zeros off the diagonal). Then det(A) = a11 ⋯ ann.
Theorem 3.4 Let A, B, C ∈ F^{n×n}.
(a) If B is obtained from A by multiplying one row of A by k, then det(B) = k det(A). (Note that det(kB) = kⁿ det(B).)
(b) If A, B and C are identical, except for the i-th row, where c_ij = a_ij + b_ij (1 ≤ j ≤ n), then det(C) = det(A) + det(B). (Note that in general det(A + B) ≠ det(A) + det(B).)
(c) If B is obtained from A by interchange of two rows, then det(B) = −det(A).
(d) If A has two equal rows, then det(A) = 0.
(e) If B is obtained from A by adding a multiple of one row to another row, then det(B) = det(A).

Remark 3.5 Let A F nn . By using two elementary row operations row interchange and adding a multiple of one row to another row we can transform A into a row echelon matrix R, whose leading entries in non-zero rows are non-zero numbers (not necessarily ones). By the above theorem, det(A) = (1)p det(R), where p is the number of row interchanges used. From Theorem 3.13 below it will follow that det(R) is the product of the diagonal entries of R. This gives a method for calculating det(A) which is generally more convenient than using Dention 3.2. Denition 3.6 [Elementary matrices] A matrix E F nn is called an elementary matrix if it is of one of the following three types (here E (ij ) denotes the matrix with ij -entry 1 and all other entries 0): (i) E = diag(1, . . . , 1, c, 1, . . . , 1), where c = 0 is the entry in the j -th row and j -th column . (ii) E = In + cE (ij ) , where c F , i = j . (iii) E = In E (ii) E (jj ) + E (ij ) + E (ji) , where i = j . Note: If A F nn and E is an elementary matrix, then EA if found from A by multiplying the j -th row by c if E is of type (i), by adding c times the j -th row to the i-th row if E is of type (ii), and by exchanging the i-th and j -th rows if E is of type (iii). Remark 3.7 A square matrix A is invertible if and only if it can be transformed into In via elementary row operations. From this it follows that A is invertible if and only if A = Es . . . E1 for elementary matrices E1 , . . . , Es . Lemma 3.8 Let A, B F nn and A invertible. Then det(AB ) = det(A) det(B ) Theorem 3.9 A F nn is invertible if and only if det(A) = 0. In this case det(A1 ) = 1/ det(A). Theorem 3.10 Let A, B F nn , then det(AB ) = det(A) det(B ). Corollary 3.11 If A and B are similar, then det(A) = det(B ). Theorem+Denition 3.12 [Transpose] The transpose At of A F nn is dened by At ij := Aji . Then det(A) = det(At ). Note: This implies that all parts of Theorem 3.4 are valid with row replaced by column.
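Remark 3.5 translates into a short routine: reduce to echelon form using only row interchanges and adding multiples of rows, then multiply the diagonal entries and adjust the sign by (−1)^p. A minimal sketch (assuming NumPy; the function name and the test matrix are illustrative, and no care is taken about numerical robustness):

```python
import numpy as np

# Determinant via row reduction (Remark 3.5): only row swaps and adding multiples
# of rows are used, so det changes at most by a sign.
def det_by_elimination(A):
    R = np.array(A, dtype=float)
    n = len(R)
    swaps = 0
    for col in range(n):
        pivot = next((r for r in range(col, n) if abs(R[r, col]) > 1e-12), None)
        if pivot is None:
            return 0.0                        # no pivot in this column
        if pivot != col:
            R[[col, pivot]] = R[[pivot, col]]
            swaps += 1                        # each interchange flips the sign
        for r in range(col + 1, n):
            R[r] -= (R[r, col] / R[col, col]) * R[col]   # does not change det
    return (-1) ** swaps * np.prod(np.diag(R))

A = np.array([[2.0, 1.0, 3.0],
              [0.0, 0.0, 5.0],
              [1.0, 4.0, 2.0]])               # arbitrary test matrix
assert np.isclose(det_by_elimination(A), np.linalg.det(A))
```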


Theorem 3.13 If A11 ∈ F^{r×r}, A21 ∈ F^{(n−r)×r}, and A22 ∈ F^{(n−r)×(n−r)}, then

det ( A11   0
      A21   A22 ) = det(A11) det(A22).

Note: By induction it follows that

det ( A11   0   ⋯   0
      ⋮     ⋱       ⋮
      ⋮         ⋱   0
      Am1   ⋯   ⋯   Amm ) = det(A11) ⋯ det(Amm).

In particular, for a lower triangular matrix one has

det ( a11   0   ⋯   0
      ⋮     ⋱       ⋮
      ⋮         ⋱   0
      an1   ⋯   ⋯   ann ) = a11 ⋯ ann.

By Theorem 3.12 this also holds for upper triangular matrices.
Definition 3.14 [Cofactors] Let A ∈ F^{n×n} and i, j ∈ {1, . . . , n}. Denote by A(i|j) ∈ F^{(n−1)×(n−1)} the matrix obtained by deleting the i-th row and j-th column from A. Then c_ij := (−1)^{i+j} det(A(i|j)) is called the (i, j)-cofactor of A.
Theorem 3.15 [Cofactor expansion]
(a) For every i ∈ {1, . . . , n} one has the i-th row cofactor expansion
det(A) = Σ_{j=1}^n a_ij c_ij.
(b) For every j ∈ {1, . . . , n} one has the j-th column cofactor expansion
det(A) = Σ_{i=1}^n a_ij c_ij.
Theorem 3.16 If A is invertible, then
A^{−1} = (1/det(A)) adj(A),
where adj(A) ∈ F^{n×n} is the adjoint matrix of A defined by (adj(A))_ij = c_ji.
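Both the cofactor expansion of Theorem 3.15 and the adjugate formula of Theorem 3.16 can be written directly from the definitions. A sketch (assuming NumPy; function names and the 3 × 3 example are illustrative, and the recursion is only sensible for small matrices):

```python
import numpy as np

def minor(A, i, j):
    # A(i|j): delete the i-th row and j-th column.
    return np.delete(np.delete(A, i, axis=0), j, axis=1)

def det_cofactor(A):
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    # first-row cofactor expansion: det(A) = sum_j a_{0j} c_{0j}
    return sum((-1) ** j * A[0, j] * det_cofactor(minor(A, 0, j)) for j in range(n))

def adjugate(A):
    n = A.shape[0]
    C = np.array([[(-1) ** (i + j) * det_cofactor(minor(A, i, j))
                   for j in range(n)] for i in range(n)])
    return C.T                                   # adj(A)_{ij} = c_{ji}

A = np.array([[2.0, 0.0, 1.0],
              [1.0, 3.0, 2.0],
              [0.0, 1.0, 1.0]])                  # arbitrary invertible example
d = det_cofactor(A)
assert np.isclose(d, np.linalg.det(A))
assert np.allclose(adjugate(A) / d, np.linalg.inv(A))   # Theorem 3.16
```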

Theorem 3.17 [Cramer's rule] If A ∈ F^{n×n} is invertible and b ∈ F^n, then the unique solution x ∈ F^n of the equation Ax = b is given by
x_j = det(B_j) / det(A)   (j = 1, . . . , n),

where Bj is the matrix obtained by replacing the j -th column of A by the vector b. Remark 3.18 Using adj(A) to compute A1 or Cramers rule to solve the equation Ax = b is numerically impractical, since the computation of determinants is too expensive compared to other methods (e.g. Gaussian elimination). However, Theorems 3.16 and 3.17 have theoretical value: they show the continuous dependence of entries of A1 on entries of A and, respectively, of x on the entries of A and b. Denition 3.19 [triangular matrices] A matrix A F nn is said to be upper triangular if Aij = 0 for all i > j . If, in addition, Aii = 1 for all 1 i n, the matrix A is said to be unit upper triangular. Similarly, lower triangular matrices and unit lower triangular matrices are dened. Theorem 3.20 The above four classes of matrices are closed under multiplication. For example, if A, B F nn are unit lower triangular matrices, then so is AB . Theorem 3.21 The above four classes of matrices are closed under taking inverses (if invertible).
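Cramer's rule (Theorem 3.17) is equally short in code. The sketch below (assuming NumPy; the 2 × 2 system is an arbitrary example) is for illustration only; as Remark 3.18 notes, it is not an efficient way to solve systems.

```python
import numpy as np

# Cramer's rule: x_j = det(B_j) / det(A), where B_j is A with its j-th column
# replaced by b.
def cramer_solve(A, b):
    d = np.linalg.det(A)
    x = np.empty(len(b))
    for j in range(len(b)):
        Bj = A.copy()
        Bj[:, j] = b                     # replace the j-th column by b
        x[j] = np.linalg.det(Bj) / d
    return x

A = np.array([[2.0, 1.0], [1.0, 3.0]])   # arbitrary invertible example, det = 5
b = np.array([3.0, 5.0])
assert np.allclose(cramer_solve(A, b), np.linalg.solve(A, b))
```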


The LU Decomposition Method

This chapter is not covered any more in Algebra II. Part of this material has been integrated into Chapter 2 (systems of linear equations, Gaussian elimination). The other contents are now part of the course Applied Linear Algebra.


Diagonalization

Throughout this section, we use the following notation: V is a finite dimensional vector space, dim V = n, and T ∈ L(V, V). Also, B is a basis in V and [T]_B := [T]_{B,B} ∈ F^{n×n} is a matrix representing T. Our goal in this and the following sections is to find a basis in which the matrix [T]_B is as simple as possible. In this section we study conditions under which the matrix [T]_B can be made diagonal.
5.1 Definition (Eigenvalue, eigenvector)
A scalar λ ∈ F is called an eigenvalue of T if there is a non-zero vector v ∈ V such that Tv = λv. In this case, v is called an eigenvector of T corresponding to the eigenvalue λ. (Sometimes these are called characteristic value, characteristic vector.)
5.2 Theorem + Definition (Eigenspace)
For every λ ∈ F the set E_λ = {v ∈ V : Tv = λv} = Ker(λI − T) is a subspace of V. If λ is an eigenvalue, then E_λ ≠ {0}, and it is called the eigenspace corresponding to λ. Note that E_λ always contains 0, even though 0 is never an eigenvector. At the same time, the zero scalar 0 ∈ F may be an eigenvalue.
5.3 Remark
0 is an eigenvalue ⟺ Ker T ≠ {0} ⟺ T is not invertible ⟺ [T]_B is singular ⟺ det[T]_B = 0.
5.4 Definition (Eigenvalue of a matrix)
λ ∈ F is called an eigenvalue of a matrix A ∈ F^{n×n} if there is a nonzero v ∈ F^n such that Av = λv. Eigenvectors and eigenspaces of matrices are defined accordingly.
5.5 Simple properties
(a) λ is an eigenvalue of A ∈ F^{n×n} ⟺ λ is an eigenvalue of T_A ∈ L(F^n, F^n).
(b) λ is an eigenvalue of T ⟺ λ is an eigenvalue of [T]_B for any basis B.
(c) If A = diag(λ1, . . . , λn), then λ1, . . . , λn are eigenvalues of A with eigenvectors e1, . . . , en.

(d) dim E_λ = nullity(λI − T) = n − rank(λI − T).
5.6 Lemma
The function p(x) := det(xI − [T]_B) is a polynomial of degree n. This function is independent of the basis B.
5.7 Definition (Characteristic polynomial)
The function C_T(x) := det(xI − [T]_B) is called the characteristic polynomial of T. For a matrix A ∈ F^{n×n}, the function C_A(x) := det(xI − A) is called the characteristic polynomial of A. Note that C_T(x) = C_{[T]_B}(x) for any basis B.
5.8 Lemma
λ is an eigenvalue of T if and only if C_T(λ) = 0.
5.9 Corollary
T has at most n eigenvalues. If V is complex (F = ℂ), then T has at least one eigenvalue. In this case, according to the Fundamental Theorem of Algebra, C_T(x) is decomposed into linear factors.
5.10 Corollary
If A ∼ B are similar matrices, then C_A(x) ≡ C_B(x), hence A and B have the same set of eigenvalues.
5.11 Examples
(a) A = ( 3  −2
         1   0 ). Then C_A(x) = x² − 3x + 2 = (x − 1)(x − 2), so that λ = 1, 2. Also, E_1 = span(1, 1) and E_2 = span(2, 1), so that E_1 ⊕ E_2 = ℝ².
(b) A = ( 1  1
         0  1 ). Then C_A(x) = (x − 1)², so that λ = 1 is the only eigenvalue, E_1 = span(1, 0).
(c) A = ( 0  −1
         1   0 ). Then C_A(x) = x² + 1, so that there are no eigenvalues in the real case and two eigenvalues (i and −i) in the complex case.
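Examples 5.11(a) and (b) can be verified numerically. A small sketch, assuming NumPy, with the matrices as printed above:

```python
import numpy as np

# Example 5.11(a): eigenvalues 1, 2 with eigenspaces span(1,1) and span(2,1).
A = np.array([[3.0, -2.0],
              [1.0,  0.0]])
assert np.allclose(sorted(np.linalg.eigvals(A)), [1.0, 2.0])   # C_A(x) = (x-1)(x-2)
for lam, v in [(1.0, np.array([1.0, 1.0])), (2.0, np.array([2.0, 1.0]))]:
    assert np.allclose(A @ v, lam * v)                         # the stated eigenvectors

# Example 5.11(b): only lambda = 1, and the eigenspace is 1-dimensional.
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])
assert np.linalg.matrix_rank(B - np.eye(2)) == 1               # dim E_1 = 2 - 1 = 1
```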


3 3 2 2 2 (d) A = 1 . Then CA (x) = (x1)(x2)2 , so that = 1, 2. E1 =span(1, 1, 0) 1 1 0 and E2 =span(2 , 0, 1). Here E1 E2 = IR3 (not enough eigenvectors). 1 0 0 (e) A = 0 2 0 . Then CA (x) = (x 1)(x 2)2 , so that = 1, 2. E1 =span{e1 } 0 0 2 and E2 =span{e2 , e3 }. Now E1 E2 = IR3 . 5.12 Denition (Diagonalizability) T is said to be diagonalizable if there is a basis B such that [T ]B is a diagonal matrix. A matrix A F nn is said to be diagonalizable if there is a similar matrix D A which is diagonal. 5.13 Lemma (i) If v1 , . . . , vk are eigenvalues of T corresponding to distinct eigenvalues 1 , . . . , k , then the set {v1 , . . . , vn } is linearly independent. (ii) If 1 , . . . , k are distinct eigenvalues of T , then E1 + + En = E1 En . Proof of (i) goes by induction on k . 5.14 Corollary If T has n distinct eigenvalues, then T is diagonalizable. (The converse is not true.) 5.15 Theorem T is diagonalizable if and only if there is a basis B consisting entirely of eigenvectors of T . (In this case we say that T has a complete set of eigenvectors.) Not all matrices are diagonalizable, even in the complex case, see Example 5.11(b). 5.16 Denition (Invariant subspace) A subspace W is said to be invariant under T if T W W , i.e. T w W for all w W. The restriction of T to a T -invariant subspace W is denoted by T |W . It is a linear transformation of W into itself. 5.17 Examples (a) Any eigenspace E is T -invariant. Note that any basis in E consists of eigenvectors. 0 1 0 0 0 (b) Let A = 1 . Then the subspaces W1 = span{e1 , e2 } and W2 = 0 0 1


span{e3} are T_A-invariant. The restriction T_A|_{W1} is represented by the matrix
( 0  −1
  1   0 )
and the restriction T_A|_{W2} is the identity.

5.18 Lemma
Let V = W1 ⊕ W2, where W1 and W2 are T-invariant subspaces. Let B1 and B2 be bases in W1, W2, respectively. Denote [T|_{W1}]_{B1} = A1 and [T|_{W2}]_{B2} = A2. Then
[T]_{B1 ∪ B2} = ( A1   0
                 0    A2 ).
Matrices like this are said to be block-diagonal. Note: by induction, this generalizes to V = W1 ⊕ ⋯ ⊕ Wk.
5.19 Definition (Algebraic multiplicity, geometric multiplicity)
Let λ be an eigenvalue of T. The algebraic multiplicity of λ is its multiplicity as a root of C_T(x), i.e. the highest power of (x − λ) that divides C_T(x). The geometric multiplicity of λ is the dimension of the eigenspace E_λ. The same definition goes for eigenvalues of matrices. Both algebraic and geometric multiplicities are at least one.
5.20 Theorem
T is diagonalizable if and only if the sum of geometric multiplicities of its eigenvalues equals n. In this case, if λ1, . . . , λs are all distinct eigenvalues, then
V = E_{λ1} ⊕ ⋯ ⊕ E_{λs}.
Furthermore, if B1, . . . , Bs are arbitrary bases in E_{λ1}, . . . , E_{λs} and B = B1 ∪ ⋯ ∪ Bs, then
[T]_B = diag(λ1, . . . , λ1, λ2, . . . , λ2, . . . , λs, . . . , λs),
where each eigenvalue λi appears m_i = dim E_{λi} times.
5.21 Corollary
Assume that C_T(x) = (x − λ1) ⋯ (x − λn), where the λi's are not necessarily distinct.
(i) If all the eigenvalues have the same algebraic and geometric multiplicities, then T is diagonalizable.
(ii) If all the eigenvalues are distinct, then T is diagonalizable.


5.22 Corollary
Let T be diagonalizable, and let D1, D2 be two diagonal matrices representing T (in different bases). Then D1 and D2 have the same diagonal elements, up to a permutation.
5.23 Examples (continued from 5.11)
(a) The matrix is diagonalizable; its diagonal form is D = ( 1  0
                                                            0  2 ).
(b) The matrix is not diagonalizable.
(c) The matrix is not diagonalizable in the real case, but is diagonalizable in the complex case. Its diagonal form is D = ( i   0
                                                                                                                           0  −i ).
(d) The matrix is not diagonalizable.
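Theorem 5.20 suggests a simple diagonalizability test: sum the geometric multiplicities n − rank(A − λI) over the distinct eigenvalues and compare with n. Below is a rough numerical sketch (assuming NumPy; the eigenvalue clustering tolerance and the test matrices are illustrative):

```python
import numpy as np

def is_diagonalizable(A, tol=1e-9):
    # Sum of geometric multiplicities dim E_lambda = n - rank(A - lambda I)
    # over the (numerically) distinct eigenvalues, compared with n (Theorem 5.20).
    n = A.shape[0]
    distinct = []
    for lam in np.linalg.eigvals(A):
        if not any(abs(lam - mu) < tol for mu in distinct):
            distinct.append(lam)
    geo = sum(n - np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol) for lam in distinct)
    return geo == n

assert is_diagonalizable(np.array([[3.0, -2.0], [1.0, 0.0]]))                  # 5.11(a)
assert not is_diagonalizable(np.array([[1.0, 1.0], [0.0, 1.0]]))               # 5.11(b)
assert is_diagonalizable(np.array([[0.0, -1.0], [1.0, 0.0]], dtype=complex))   # 5.11(c), over C
```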


Generalized eigenvectors, Jordan decomposition

Throughout this and the next sections, we use the following notation: V is a finite dimensional complex vector space, dim V = n, and T ∈ L(V, V). In this and the next sections we focus on nondiagonalizable transformations. Those, as we have seen by examples, do not have enough eigenvectors.
6.1 Definition (Generalized eigenvector, Generalized eigenspace)
Let λ be an eigenvalue of T. A vector v ≠ 0 is called a generalized eigenvector of T corresponding to λ if (T − λI)^k v = 0 for some positive integer k. The generalized eigenspace corresponding to λ is the set of all generalized eigenvectors corresponding to λ plus the zero vector. We denote that space by U_λ.
6.2 Example
Let A = ( λ  1
         0  λ ) for some λ ∈ ℂ. Then λ is the (only) eigenvalue of A, and E_λ = span{e1}. Since (A − λI)² is the zero matrix, U_λ = ℂ².

6.3 Notation
For each k ≥ 1, denote U_λ^{(k)} = Ker(T − λI)^k. Clearly, U_λ^{(1)} = E_λ and U_λ^{(k)} ⊆ U_λ^{(k+1)} (i.e., {U_λ^{(k)}} is an increasing sequence of subspaces of V). Note that if U_λ^{(k)} ≠ U_λ^{(k+1)}, then dim U_λ^{(k)} < dim U_λ^{(k+1)}.
Observe that U_λ = ∪_{k=1}^∞ U_λ^{(k)}. Since each U_λ^{(k)} is a subspace of V, their union U_λ is a subspace, too.
6.4 Lemma
There is an m = m_λ such that U_λ^{(k)} ≠ U_λ^{(k+1)} for all 1 ≤ k ≤ m − 1, and
U_λ^{(m)} = U_λ^{(m+1)} = U_λ^{(m+2)} = ⋯
In other words, the sequence of subspaces U_λ^{(k)} strictly increases up to k = m and stabilizes after k ≥ m. In particular, U_λ = U_λ^{(m)}.
Proof. By way of contradiction, assume that
U_λ^{(k)} = U_λ^{(k+1)} ≠ U_λ^{(k+2)}.
Pick a vector v ∈ U_λ^{(k+2)} \ U_λ^{(k+1)} and put u := (T − λI)v. Then (T − λI)^{k+2} v = 0, so (T − λI)^{k+1} u = 0, hence u ∈ U_λ^{(k+1)}. But then u ∈ U_λ^{(k)}, and so (T − λI)^k u = 0, hence (T − λI)^{k+1} v = 0, a contradiction. □


6.5 Corollary
m_λ ≤ n, where n = dim V. In particular, U_λ = U_λ^{(n)}.
6.6 Definition (Polynomials in T)
By T^k we denote T ∘ ⋯ ∘ T (k times), i.e. inductively T^k v = T(T^{k−1} v). A polynomial in T is
a_k T^k + ⋯ + a_1 T + a_0 I
where a_0, . . . , a_k ∈ ℂ. Example: (T − λI)^k is a polynomial in T for any k ≥ 1.

Note: T k T m = T m T k . For arbitrary polynomials p, q we have p(T )q (T ) = q (T )p(T ). 6.7 Lemma. The generalized eigenvectors of T span V . Proof goes by induction on n = dimV . For n = 1 the lemma follows from 5.9. Assume that the lemma holds for all vector spaces of dimension < n. Let be an eigenvalue of T . We claim that V = V1 V2 , where V1 := Ker (T I )n = U and V2 := Im (T I )n . Proof of the claim: (i) Show that V1 V2 = 0. Let v V1 V2 . Then (T I )n v = 0 and (T I )n u = v for some uV . Hence (T I )2n u = (T I )n v = 0, i.e. uU . Now 6.5 implies that 0 = (T I )n u = v . (ii) Show that V = V1 + V2 . In view of 2.6 we have dim V1 + dim V2 = dim V . Since V1 and V2 are independent by (i), we have V1 + V2 = V . The claim is proven. Since, V1 = U , it is already spanned by generalized eigenvectors. Next, V2 is T invariant, because for any v V2 we have v = (T I )n u for some uV , so T v = T (T I )n U = (T I )n T uV2 . Since dim V2 < n (remember that dim V1 1), by the inductive assumption V2 is spanned by generalized eigenvectors of T |V2 , which are, of course, generalized eigenvectors for T . This proves the lemma.2 6.8 Lemma Generalized eigenvectors corresponding to distinct eigenvalues of T are linearly independent. Proof. Let v1 , . . ., vm be generalized eigenvectors corresponding to distinct eigenvalues 1 , . . ., m . We need to show that if v1 + + vm = 0, then v1 = = vm = 0. It is enough to show that v1 = 0. Let k be the smallest positive integer such that (T 1 I )k v1 = 0. We now apply the transformation R := (T 1 I )k1 (T 2 I )n (T m I )n to the vector v1 + + vm (if k = 1, then the rst factor in R is missing). Since all the factors in R commute (as polynomials in T ), R kills all the vectors v2 , . . . , vm , and we 26

get Rv1 = 0. Next, we replace T i I by (T 1 I ) + (1 i )I for all i = 2, . . . , m in the product formula for R. We then expand this formula by Binominal Theorem. All the terms but one will contain (T 1 I )r with some r k , which kills v1 , due to our choice of k . The equation Rv1 = 0 is then equivalent to (1 2 )n (1 m )n (T 1 I )k1 v1 = 0 This contradicts our choice of k (remember that the eigenvalues are distinct, so that i = 1 ). 2 6.9 Denition (Nilpotent transformations) The transformation T : V V is said to be nilpotent, if T k = 0 for some positive integer k . The same denition goes for matrices. 6.10 Example | nn is an upper triangular matrix whose diagonal entries are zero, then it is If A C nilpotent. 6.11 Lemma T is nilpotent if and only if 0 is its only eigenvalue. Proof. Let T be nilpotent and be its eigenvalue with eigenvector v = 0. Then T k v = k v for all k , and then T k v = 0 implies k = 0, so = 0. Conversely, if 0 is the only eigenvalue, then by 6.7, V = U0 = Ker T n , so T n = 0. 2 6.12 Theorem (Structure Theorem) Let 1 , . . ., m be all distinct eigenvalues of T : V V with corresponding generalized eigenspaces U1 , . . ., Um . Then (i) V = U1 Um (ii) Each Uj is T -invariant (iii) (T j I )|Uj is nilpotent for each j (iv) Each T |Uj has exactly one eigenvalue, j (v) dim Uj equals the algebraic multiplicity of j Proof. (i) Follows from 6.7 and 6.8. (ii) Recall that Uj = Ker (T j I )n by 6.5. Hence, for any v Uj we have (T j I )n T v = T (T j I )n v = 0 so that T v Uj . (iii) Follows from 6.5. (iv) From 6.11 and (iii), (T j I )|Uj has exactly one eigenvalue, 0. Therefore, TUj has exactly one eigenvalue, j . (A general fact: is an eigenvalue of T if and only if 27
by 6.6

is an eigenvalue of T I .) (v) Pick a basis Bj in each Uj , then B := m j =1 Bj is a basis of V by (i). Due to 5.18 and (i), the matrix [T ]B is block-diagonal, whose diagonal blocks are [T |Uj ]Bj , 1 j m. Then CT (x) = det(xI [T ]B ) = C[T |U1 ]B1 (x) C[T |Um ]Bm (x) = CT |U1 (x) CT |Um (x) due to 3.11. Since T |Uj has the only eigenvalue j , we have CT |Uj (x) = (x j )dim Uj , hence CT (x) = (x 1 )dim U1 (x m )dim Um Theorem is proven. 2 6.13 Corollary Since E U , the geometric multiplicity of never exceeds its algebraic multiplicity. 6.14 Denition (Jordan block matrix) An mm matrix J is called a Jordan block matrix for the eigenvalue if
J = ( λ  1  0  ⋯  0
      0  λ  1  ⋯  0
      ⋮     ⋱  ⋱  ⋮
      0  ⋯  0  λ  1
      0  ⋯  ⋯  0  λ )

6.15 Properties of Jordan block matrices
(i) C_J(x) = (x − λ)^m, hence λ is the only eigenvalue of J, and its algebraic multiplicity is m.
(ii) (J − λI)^m = 0, i.e. the matrix J − λI is nilpotent.
(iii) rank(J − λI) = m − 1, hence nullity(J − λI) = 1, so the geometric multiplicity of λ is 1.
(iv) Je_1 = λe_1, hence (J − λI)e_1 = 0, i.e. E_λ = span{e_1}.
(v) Je_k = λe_k + e_{k−1}, hence (J − λI)e_k = e_{k−1} for 2 ≤ k ≤ m.
(vi) (J − λI)^k e_k = 0 for all 1 ≤ k ≤ m. Note that the map J − λI takes e_m ↦ e_{m−1} ↦ ⋯ ↦ e_1 ↦ 0.
6.16 Definition (Jordan chain)
Let λ be an eigenvalue of T. A Jordan chain is a sequence of non-zero vectors {v_1, . . . , v_m} such that (T − λI)v_1 = 0 and (T − λI)v_k = v_{k−1} for k = 2, . . . , m. The length of the Jordan chain is m. Note that a Jordan chain contains exactly one eigenvector (v_1), and all the other vectors in the chain are generalized eigenvectors.
6.17 Lemma
Let λ be an eigenvalue of T. Suppose we have s ≥ 1 Jordan chains corresponding to λ, call them {v_{11}, . . . , v_{1m_1}}, . . ., {v_{s1}, . . . , v_{sm_s}}. Assume that the vectors {v_{11}, v_{21}, . . . , v_{s1}} (the eigenvectors in these chains) are linearly independent. Then all the vectors {v_{ij} : i = 1, . . . , s, j = 1, . . . , m_i} are linearly independent.
Proof. Let M = max{m_1, . . . , m_s} be the maximum length of the chains. The proof goes by induction on M. For M = 1 the claim is trivial. Assume the lemma is proved for chains of lengths ≤ M − 1. Without loss of generality, assume that m_1 ≥ m_2 ≥ ⋯ ≥ m_s, i.e. the lengths are decreasing, so M = m_1. By way of contradiction, let
Σ_{i,j} c_{ij} v_{ij} = 0.
Applying (T − λI)^{m_1 − 1} to the vector Σ c_{ij} v_{ij} kills all the terms except the last vectors v_{i m_1} in the chains of maximum length (of length m_1). Those vectors will be transformed to (T − λI)^{m_1 − 1} v_{i m_1} = v_{i1}, so we get
Σ_{i=1}^p c_{i m_1} v_{i1} = 0,
where p ≤ s is the number of chains of length m_1. Since the vectors {v_{i1}} are linearly independent, we conclude that c_{i m_1} = 0 for all i. That reduces the problem to the case of chains of lengths ≤ M − 1. □
6.18 Corollary
Let B = {v_1, . . . , v_m} be a Jordan chain corresponding to an eigenvalue λ. Then B is linearly independent, i.e. it is a basis in the subspace W := span{v_1, . . . , v_m}. Note that W is T-invariant, and the matrix [T|_W]_B is exactly a Jordan block matrix.
6.19 Definition (Jordan basis)
A basis B of V is called a Jordan basis for T if it is a union of some Jordan chains.
6.20 Remark
If B is a Jordan basis of V, then [T]_B is a block diagonal matrix, whose diagonal blocks are Jordan block matrices.


6.21 Denition (Jordan matrix) A matrix Q is called a Jordan matrix corresponding to an eigenvalue if J1 0 . . . .. . Q= . . . 0 Js where J1 , . . . , Js are Jordan block matrices corresponding to , and their lengths decrease: |J1 | |Js |. A matrix A is called a Jordan matrix if Q1 0 . . .. A= . . . . . 0 Qr where Q1 , . . . , Qr are Jordan matrices corresponding to distinct eigenvalues 1 , . . . , r . 6.22 Example | 4 have only one eigenvalue, . We can nd all possible Jordan matrices | 4 C Let T : C for T ; there are 5 distinct such matrices. 6.23 Theorem (Jordan decomposition) Let V be a nite dimensional complex vector space. (i) For any T : V V there is a Jordan basis B of V , so that the matrix [T ]B is a Jordan matrix. The latter is unique, up to a permutation of eigenvalues. | nn is similar to a Jordan matrix. The latter is unique, up to a (ii) Every matrix A C permutation of eigenvalues (i.e., Qj s in 6.21). Note: the uniqueness of the matrices Qj is achieved by the requirement |J1 | |Js | in 6.21. Note: the Jordan basis B is not unique, not even after xing the order of eigenvalues. Proof of Theorem 6.23. (ii) follows from (i). It is enough to prove (i) assuming that T has just one eigenvalue , and then use 6.12. We can even assume that the eigenvalue of T is zero, by switching from T to T I . Hence, we assume that T is nilpotent. The proof of existence goes by induction on n = dim V . The case n = 1 is trivial. Assume the theorem is proved for all spaces of dimension < n. Consider W := Im V . If dim W = 0, then T = 0, so the matrix [T ]B is diagonal (actually, zero) in any basis. So, assume that m := dim W 1.


Note that W is T-invariant, and m < n, because n − m = nullity T ≠ 0. So, by the inductive assumption there is a basis B in W that is the union of Jordan chains. Let k be the number of those chains. The last vector in each Jordan chain has a pre-image under T (because it belongs in W = Im T). So, we can extend each of those k Jordan chains by one more vector. Now we get k Jordan chains for the transformation T : V → V. By 6.17 all the vectors in those chains are linearly independent, so they span a subspace of dimension m + k. Next, consider the space K := Ker T. Note that K_0 := Ker T|_W is a subspace of K. The first vectors in our Jordan chains make a basis in K_0, so k = dim K_0 = nullity T|_W. This basis can be extended to a basis in K, and thus we can get r := dim K − dim K_0 new vectors. Note that dim K − dim K_0 = n − m − k. Hence, the total number of vectors we found is n. The last r vectors are eigenvectors, so they make r Jordan chains of length 1. Finally, note that all our vectors are independent, therefore they make a basis in V. To prove the uniqueness, note that for any k ≥ 1 the number of Jordan chains in the Jordan basis that have length ≥ k equals rank T^{k−1} − rank T^k, so it is independent of the basis. This easily proves the uniqueness. □
6.24 Strategy to find the Jordan matrix
Given a matrix A ∈ ℂ^{n×n} you can find a similar Jordan matrix J ∼ A as follows. First, find all the eigenvalues. Then, for every eigenvalue λ and k ≥ 1 find the number r_k = rank(A − λI)^k. You will get a sequence n > r_1 > r_2 > ⋯ > r_p = r_{p+1} = ⋯ (in view of 6.4). Then r_{k−1} − r_k is the number of Jordan blocks of length ≥ k corresponding to λ.
Example: A = ( 1  0  1
               0  1  1
               0  0  1 ). Here λ = 1 is the only eigenvalue, and r_1 = rank(A − I) = 1. Then (A − I)² is a zero matrix, so r_2 = 0. Therefore,
J = ( 1  1  0
      0  1  0
      0  0  1 ).
In addition, we can find a Jordan basis in this example, i.e. a basis in which the transformation is represented by a Jordan matrix. The strategy: pick a vector v_1 ∈ Im(A − I), then take one of its preimages v_2 ∈ (A − I)^{−1} v_1, and an arbitrary eigenvector v_3 ∈ Ker(A − I) independent of v_1. Then {v_1, v_2} and {v_3} are two Jordan chains making a basis.
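The rank-counting strategy of 6.24 is mechanical enough to code. A sketch in exact arithmetic (assuming SymPy; the function name is an illustrative choice), applied to the example above:

```python
import sympy as sp

# Strategy 6.24 in exact arithmetic.  For an eigenvalue lam, r_k = rank((A - lam*I)^k);
# with r_0 = n, the number of Jordan blocks for lam of length >= k is r_{k-1} - r_k.
def jordan_block_counts(A, lam):
    n = A.shape[0]
    ranks = [n]
    M = A - lam * sp.eye(n)
    P = sp.eye(n)
    while True:
        P = P * M                      # P = (A - lam*I)^k
        ranks.append(P.rank())
        if ranks[-1] == ranks[-2]:     # the sequence has stabilized (Lemma 6.4)
            break
    return {k: ranks[k - 1] - ranks[k] for k in range(1, len(ranks))}

A = sp.Matrix([[1, 0, 1],
               [0, 1, 1],
               [0, 0, 1]])             # the example from 6.24
counts = jordan_block_counts(A, sp.Integer(1))
# Two Jordan blocks in total, one of them of length >= 2: a 2x2 block and a 1x1
# block for lambda = 1, exactly the Jordan matrix J written above.
assert counts[1] == 2 and counts[2] == 1
```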


Minimal Polynomial and Cayley-Hamilton Theorem

We continue using the notation V, n, T of Section 6. The statements 7.17.4 below hold for any eld F , but in the rest of the section 7 we work in the complex eld. 7.1 Lemma For any T : V V there exists a nonzero polynomial f P (F ) such that f (T ) = 0. If we require that the leading coecient equal 1 and the degree of the polynomial be minimal, then such a polynomial is unique. Proof: Recall that L(V, V ) is a vector space of a nite dimension, dim L(V, V ) = n2 . The transformations I, T, T 2 , . . . belong in L(V, V ), so there is a k 1 such that I, T, . . . , T k are linearly dependent, i.e. c0 , . . ., ck F : c0 I + c1 T + + ck T k = 0 and we can assume that ck =0. Dividing through by ck gives ck = 1. Lastly, if there are two distinct polynomials of degree k and with ck = 1 satisfying the above equation, then their dierence is a polynomial g of degree < k such that g (T ) = 0. This proves the uniqueness. 2 7.2 Denition (Minimal polynomial) The unique polynomial found in Lemma 7.1 is called the minimal polynomial MT (x) of T : V V . If A is an nn-matrix then the minimal polynomial of TA is denoted by MA (x). 7.3 Examples ) ( 0 1 . Here A2 = 0, so that the minimal polynomial is MA (x) = x2 . (It (a) A = 0 0 is easy to check that cI + A = 0 for any c, so the degree one is not enough.) Compare this to the characteristic polynomial CA (x) = x2 . (b) It is an easy exercise to prove CA (A) = 0 for any 2 2 matrix A (over any eld F ). Hence MA (x) either coincides with CA (x) or is of degree one. In the latter case A cI = 0 for some c F , hence A = cI , in which case CA (x) = (x c)2 and MA (x) = x c. 2 3 1 (c) A = 0 2 0 . Here A 2I is an upper triangular matrix with zero main 0 0 2 diagonal, so it is nilpotent, see 6.10. It is easy to check that (A 2I )2 = 0, hence the minimal polynomial is MA (x) = (x 2)2 = x2 4x +4. Compare this to the characteristic polynomial CA (x) = (x 2)3 . Note that CA (x) is a multiple of MA (x).


7.4 Corollary (i) Let B be a basis of V; then M_T(x) = M_{[T]_B}(x). (ii) Similar n×n matrices have the same minimal polynomial.
From now on, F = ℂ again.

7.5 Theorem Let 1 , . . ., r be all distinct eigenvalues of T and Uj the corresponding generalized eigenspaces. Let mj is the maximum size of Jordan blocks corresponding to the eigenvalue j . Consider the polynomial p(x) = (x 1 )m1 (x r )mr Then: (i) deg p dim V (ii) if f (x) is a nonzero polynomial such that f (T ) = 0, then f is a multiple of p (iii) p(x) = MT (x) Note: mj is the length of the longest Jordan chain in Uj . Also, mj is the smallest positive integer s.t. (T j I )mj v = 0 for all v Uj . Proof: (i) Follows from 6.23. (ii) Let f (T ) = 0. We will show that f (x) is a multiple of (x j )m j for all j . Fix a j t1 ts t and write f (x) = c(x c1 ) (x cs ) (x j ) where c1 , . . .cs denote the roots of f other than j (if j is not a root of f , we simply put t = 0). If t < mj , then there is a vector v Uj such that u := (T j I )t v = 0. Recall that Uj is T -invariant, so it is (T cI )-invariant for any c. Hence, u Uj . Furthermore, each transformation T ci I leaves Uj invariant, and is a bijection of Uj because ci = j . Therefore, the transformation c(T c1 I )t1 (T cs I )ts is a bijection of Uj , so it takes u to a nonzero vector w. Thus, w = f (T )v = 0, hence f (T ) = 0, a contradiction. This proves (ii). (iii) By (ii), MT (x) is a multiple of p(x). It remains to prove that p(T ) = 0, then use 7.1. To prove that p(T ) = 0, recall that V = U1 Ur , and for every v Uj we have (T j I )mj v = 0. 2 7.6. Example | m Let J be an mm Jordan block matrix for eigenvalue , see 6.14. Then U = C and (T I )m = 0 (and m is the minimal such power). So, MJ (x) = (x )m . Note that CJ (x) = MJ (x). In general, though, CA (x) = MA (x). 7.7 Theorem (Cayley-Hamilton) The characteristic polynomial CT (x) is a multiple of the minimal polynomial MT (x). In particular, CT (T ) = 0, i.e. any linear operator satises its own characteristic equation. 33

Proof: Let 1 , . . ., r be all distinct eigenvalues of T , and p1 , . . . , pr their algebraic multiplicities. Then CT (x) = (x 1 )p1 (x r )pr . Note that pj = dim Uj mj , where mj is the maximum size of Jordan blocks corresponding to j . So, by 7.5 the minimal polynomial MT (x) = (x 1 )m1 (x r )mr divides CT (x). 2 7.8 Corollary (i) CT (x) and MT (x) have the same linear factors. (ii) T is diagonalizable if and only if MT (x) = (x 1 ) (x r ), i.e. MT (x) has no multiple roots. 7.9 Examples

1 0 0 0 1 (a) A = 1 . Here CA (x) = (x + 1)2 (x 1). Therefore, MT (x) may be 1 1 2 2 either (x +1) (x 1) or (x +1)(x 1). To nd it, it is enough to check if (A + I )(A I ) = 0, i.e. if A2 = I .This is not true, so MT (x) = (x + 1)2 (x 1). The Jordan form of the 1 1 0 matrix is J = 0 1 0 . 0 0 1 (b) Assume that CA (x) = (x 2)4 (x 3)2 and MA (x) = (x 2)2 (x 3). Find all possible Jordan forms of A. Answer: there are two Jordan blocks of length one for = 3 and two or three Jordan blocks of lengths 2+2 or 2+1+1 for = 2.
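The computations in Examples 7.9 amount to plugging the matrix into each admissible divisor of C_A(x). A sketch (assuming SymPy); the matrix below is an illustrative choice with C_A(x) = (x − 2)²(x − 3), not the matrix of Example 7.9(a):

```python
import sympy as sp

# Testing candidate minimal polynomials, as in Examples 7.9.  By Corollary 7.8(i)
# the minimal polynomial must contain every linear factor of C_A(x).
x = sp.symbols('x')
A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])             # hypothetical example matrix
I = sp.eye(3)

assert sp.expand(A.charpoly(x).as_expr() - (x - 2)**2 * (x - 3)) == 0

assert (A - 2*I) * (A - 3*I) != sp.zeros(3)        # (x-2)(x-3) is not enough
assert (A - 2*I)**2 * (A - 3*I) == sp.zeros(3)     # so M_A(x) = (x-2)^2 (x-3)

# Consistency with Corollary 7.8(ii): M_A has a repeated root, so A is not
# diagonalizable; its Jordan form contains a 2x2 block for the eigenvalue 2.
_, J = A.jordan_form()
assert J != sp.diag(2, 2, 3)
```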


Norms and Inner Products

This section is devoted to vector spaces with an additional structure: an inner product. The underlying field here is either F = ℝ or F = ℂ.
8.1 Definition (Norm, distance)
A norm on a vector space V is a real valued function || · || satisfying
1. ||v|| ≥ 0 for all v ∈ V, and ||v|| = 0 if and only if v = 0.
2. ||cv|| = |c| ||v|| for all c ∈ F and v ∈ V.
3. ||u + v|| ≤ ||u|| + ||v|| for all u, v ∈ V (triangle inequality).
A vector space V = (V, || · ||) together with a norm is called a normed space. In normed spaces, we define the distance between two vectors u, v by d(u, v) = ||u − v||. Note that property 3 has a useful implication: | ||u|| − ||v|| | ≤ ||u − v|| for all u, v ∈ V.
8.2 Examples
(i) Several standard norms in ℝⁿ and ℂⁿ:
||x||_1 := Σ_{i=1}^n |x_i|   (1-norm)
||x||_2 := ( Σ_{i=1}^n |x_i|² )^{1/2}   (2-norm); note that this is the Euclidean norm
||x||_p := ( Σ_{i=1}^n |x_i|^p )^{1/p} for any real p ∈ [1, ∞)   (p-norm)
||x||_∞ := max_{1≤i≤n} |x_i|   (∞-norm)
(ii) Several standard norms in C[a, b] for any a < b:
||f||_1 := ∫_a^b |f(x)| dx
||f||_2 := ( ∫_a^b |f(x)|² dx )^{1/2}
||f||_∞ := max_{a≤x≤b} |f(x)|
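The norms of Example 8.2(i) are one line each in code. A small sketch (assuming NumPy; the vectors are arbitrary choices), checked against numpy.linalg.norm and against the triangle inequality of 8.1:

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0, 1.0])

one_norm = np.sum(np.abs(x))                      # ||x||_1
two_norm = np.sqrt(np.sum(np.abs(x) ** 2))        # ||x||_2 (Euclidean)
p = 3
p_norm   = np.sum(np.abs(x) ** p) ** (1 / p)      # ||x||_p
inf_norm = np.max(np.abs(x))                      # ||x||_infinity

assert np.isclose(one_norm, np.linalg.norm(x, 1))
assert np.isclose(two_norm, np.linalg.norm(x, 2))
assert np.isclose(p_norm,  np.linalg.norm(x, 3))
assert np.isclose(inf_norm, np.linalg.norm(x, np.inf))

# Triangle inequality (property 3 of 8.1) for a second arbitrary vector:
y = np.array([1.0, 2.0, -1.0, 5.0])
assert np.linalg.norm(x + y, 2) <= np.linalg.norm(x, 2) + np.linalg.norm(y, 2)
```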


8.3 Denition (Unit sphere, Unit ball) Let || || be a norm on V . The set S1 = {v V : ||v || = 1} is called a unit sphere in V , and the set B1 = {v V : ||v || 1} a unit ball (with respect to the norm || ||). The vectors v S1 are called unit vectors. For any vector v = 0 the vector u = v/||v || belongs in the unit sphere S1 , i.e. any nonzero vector is a multiple of a unit vector. The following four theorems, 8.48.7, are given for the sake of completeness, they will not be used in the rest of the course. Their proofs involve some advanced material of real analysis. The students who are not familiar with it, may disregard the proofs. 8.4 Theorem | n is a continuous function. Precisely: for any > 0 there is Any norm on IRn or C a > 0 such that for any two vectors u = (x1 , . . . , xn ) and v = (y1 , . . . , yn ) satisfying maxi |xi yi | < we have ||u|| ||v || < . Proof. Let g = max{||e1 ||, . . . , ||en ||}. By the triangle inequality, for any vector v = (x1 , . . . , xn ) = x1 e1 + + xn en we have ||v || |x1 | ||e1 || + + |xn | ||en || (|x1 | + + |xn |) g Hence if max{|x1 |, . . . , |xn |} < , then ||v || < ng . Now let two vectors v = (x1 , . . . , xn ) and u = (y1 , . . . , yn ) be close, so that maxi |xi yi | < . Then ||u|| ||v || ||u v || < ng It is then enough to set = /ng . This proves the continuity of the norm || || as a function of v . 2 8.5 Corollary | n . Denote by Let || || be a norm on IRn or C
S_1^e = {(x_1, . . . , x_n) : |x_1|² + ⋯ + |x_n|² = 1}
the Euclidean unit sphere in ℝⁿ (or ℂⁿ). Then the function || · || is bounded above and below on S_1^e:
0 < min_{v ∈ S_1^e} ||v|| ≤ max_{v ∈ S_1^e} ||v|| < ∞


e Proof. Indeed, it is known in real analysis that S1 is a compact set. Note that || || is e a continuous function on S1 . It is known in real analysis that a continuous function on a compact set always takes its maximum and minimum values. In our case, the mine e imum value of || || on S1 is strictly positive, because 0 / S1 . This proves the corollary. 2

8.6 Denition (Equivalent norms) Two norms, || ||a and || ||b , on V are said to be equivalent if there are constants 0 < C1 < C2 such that C1 ||u||a /||u||b C2 < for all u = 0. This is an equivalence relation. 8.7 Theorem In any nite dimensional space V , any two norms are equivalent.
| n . It is enough to prove that any norm is Proof. Assume rst that V = IRn or V = C equivalent to the 2-norm. Any vector v = (x1 , . . . , xn ) V is a multiple of a Euclidean e e unit vector u S1 , so it is enough to check the equivalence for vectors u S1 , which n | n . An immediately follows from 8.5. So, the theorem is proved for V = IR and V = C | is isomorphic to IRn or C | n, arbitrary n-dimensional vector space over F = IR or F = C respectively. 2

8.8 Theorem + Definition (Matrix norm)
Let || · || be a norm on ℝⁿ or ℂⁿ. Then
||A|| := sup_{||x||=1} ||Ax|| = sup_{x≠0} ||Ax|| / ||x||
defines a norm in the space of n × n matrices. It is called the matrix norm induced by || · ||.
Proof is a direct inspection.
Note that supremum can be replaced by maximum in Theorem 8.8. Indeed, one can obviously write ||A|| = sup_{||x||_2 = 1} ||Ax||/||x||, then argue that the function ||Ax|| is continuous on the Euclidean unit sphere S_1^e (as a composition of two continuous functions, Ax and || · ||), then argue that ||Ax||/||x|| is a continuous function, as a ratio of two continuous functions, of which ||x|| ≠ 0, so that ||Ax||/||x|| takes its maximum value on S_1^e.
Note that there are norms on ℝ^{n×n} that are not induced by any norm on ℝⁿ, for example ||A|| := max_{i,j} |a_ij| (Exercise; use 8.10(ii) below).
8.9 Theorem
(i) ||A||_1 = max_{1≤j≤n} Σ_{i=1}^n |a_ij|   (maximum column sum)
(ii) ||A||_∞ = max_{1≤i≤n} Σ_{j=1}^n |a_ij|   (maximum row sum)
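Theorem 8.9 can be checked numerically: compute the column-sum and row-sum formulas and confirm that ||Ax|| ≤ ||A|| ||x|| (Theorem 8.10(i)) holds for many vectors. A sketch, assuming NumPy; the random matrix and the number of trials are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))

max_col_sum = np.abs(A).sum(axis=0).max()         # ||A||_1  (Theorem 8.9(i))
max_row_sum = np.abs(A).sum(axis=1).max()         # ||A||_inf (Theorem 8.9(ii))
assert np.isclose(max_col_sum, np.linalg.norm(A, 1))
assert np.isclose(max_row_sum, np.linalg.norm(A, np.inf))

# Theorem 8.10(i): ||Ax|| <= ||A|| ||x|| for the induced norms.
for _ in range(1000):
    x = rng.normal(size=4)
    assert np.linalg.norm(A @ x, 1)      <= max_col_sum * np.linalg.norm(x, 1) + 1e-12
    assert np.linalg.norm(A @ x, np.inf) <= max_row_sum * np.linalg.norm(x, np.inf) + 1e-12
```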


Note: There is no explicit characterization of ||A||2 in terms of the aij . 8.10 Theorem | n . Then Let || || be a norm on IRn or C (i) ||Ax|| ||A|| ||x|| for all vectors x and matrices A. (ii) ||AB || ||A|| ||B || for all matrices A, B . 8.11 Denition (Real Inner Product) Let V be a real vector space. A real inner product on V is a real valued function on V V , denoted by , , satisfying 1. u, v = v, u for all u, v V 2. cu, v = cu, v for all c IR and u, v V 3. u + v, w = u, w + v, w for all u, v, w V 4. u, u 0 for all u V , and u, u = 0 i u = 0 Comments: 1 says that the inner product is symmetric, 2 and 3 say that it is linear in the rst argument (the linearity in the second argument follows then from 1), and 4 says that the inner product is non-negative and non-degenerate (just like a norm). Note that 0, v = u, 0 = 0 for all u, v V . A real vector space together with a real inner product is called a real inner product space, or sometimes a Euclidean space. 8.12 Examples (standard inner product). Note: u, v = ut v = v t u. (i) V = IRn : u, v = n i=1 ui vi b (ii) V = C ([a, b]) (real functions): f, g = a f (x)g (x) dx 8.13 Denition (Complex Inner Product) Let V be a complex vector space. A complex inner product on V is a complex valued function on V V , denoted by , , satisfying 1. u, v = v, u for all u, v V 2, 3, 4 as in 8.11. A complex vector space together with a complex inner product is called a complex inner product space, or sometimes a unitary space. 8.14 Simple properties (i) u, v + w = u, v + u, w (ii) u, cv = c u, v The properties (i) and (ii) are called conjugate linearity in the second argument.


8.15 Examples
(i)  V = C^n: ⟨u, v⟩ = Σ_{i=1}^n u_i \overline{v_i} (standard inner product). Note: ⟨u, v⟩ = u^t \overline{v} = \overline{v}^t u.
(ii) V = C([a, b]) (complex functions): ⟨f, g⟩ = ∫_a^b f(x) \overline{g(x)} dx
Note: the term inner product space from now on refers to either a real or a complex inner product space.

8.16 Theorem (Cauchy-Schwarz-Buniakowsky inequality)
Let V be an inner product space. Then

    |⟨u, v⟩| ≤ ⟨u, u⟩^{1/2} ⟨v, v⟩^{1/2}

for all u, v ∈ V. The equality holds if and only if {u, v} is linearly dependent.
Proof. We do it in the complex case. Assume that v ≠ 0. Consider the function

    f(z) = ⟨u − zv, u − zv⟩ = ⟨u, u⟩ − z⟨v, u⟩ − \bar{z}⟨u, v⟩ + |z|² ⟨v, v⟩

of a complex variable z. Let z = re^{iθ} and ⟨u, v⟩ = se^{iφ} be the polar forms of the numbers z and ⟨u, v⟩. Set θ = φ and assume that r varies from −∞ to ∞; then

    0 ≤ f(z) = ⟨u, u⟩ − 2sr + r² ⟨v, v⟩

Since this holds for all r ∈ IR (also for r < 0, because the coefficients are all nonnegative), the discriminant has to be ≤ 0, i.e. s² − ⟨u, u⟩⟨v, v⟩ ≤ 0. This completes the proof in the complex case. In the real case it goes even easier, just assume z ∈ IR. The equality case in the theorem corresponds to the zero discriminant, hence the above polynomial assumes a zero value, and hence u = zv for some z ∈ C. (We left out the case v = 0, do it yourself as an exercise.) 2

8.17 Theorem + Definition (Induced norm)
If V is an inner product vector space, then ||v|| := ⟨v, v⟩^{1/2} defines a norm on V. It is called the induced norm. To prove the triangle inequality, you will need 8.16.

8.18 Example
The inner products in Examples 8.12(i) and 8.15(i) induce the 2-norm on IR^n and C^n, respectively. The inner products in Examples 8.12(ii) and 8.15(ii) induce the 2-norm on the spaces C[a, b] of real and complex functions, respectively.
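A quick numerical check of 8.16 with the standard complex inner product of 8.15(i) (a sketch; the vectors are random and the small tolerance only guards against round-off):

    import numpy as np

    rng = np.random.default_rng(2)
    u = rng.normal(size=3) + 1j * rng.normal(size=3)
    v = rng.normal(size=3) + 1j * rng.normal(size=3)

    inner = lambda a, b: np.sum(a * np.conj(b))   # <a, b> = sum a_i conj(b_i)

    lhs = abs(inner(u, v))
    rhs = np.sqrt(inner(u, u).real * inner(v, v).real)
    print(lhs <= rhs + 1e-12)                     # Cauchy-Schwarz holds

    # equality case: {u, 2j*u} is linearly dependent
    print(np.isclose(abs(inner(u, 2j * u)),
                     np.sqrt(inner(u, u).real * inner(2j * u, 2j * u).real)))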


8.19 Theorem
Let V be a real vector space with norm ||·||. The norm ||·|| is induced by an inner product if and only if the function

    ⟨u, v⟩ := (1/4) ( ||u + v||² − ||u − v||² )        (polarization identity)

satisfies the definition of an inner product. In this case ||·|| is induced by the above inner product.
Note: A similar but more complicated polarization identity holds in complex inner product spaces.
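A small check of 8.19 for IR^n with the Euclidean norm (a sketch): the polarization identity recovers the standard inner product. For complex spaces, the more complicated identity alluded to above is ⟨u, v⟩ = (1/4) Σ_{k=0}^{3} i^k ||u + i^k v||², with the conventions of 8.13/8.14.

    import numpy as np

    def polar(u, v):
        # <u, v> := (1/4) (||u + v||^2 - ||u - v||^2), Euclidean 2-norm
        return 0.25 * (np.linalg.norm(u + v)**2 - np.linalg.norm(u - v)**2)

    u = np.array([1., 2., -3.])
    v = np.array([0.5, -1., 4.])
    print(polar(u, v), np.dot(u, v))   # both -13.5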


9    Orthogonal vectors

In this section, V is always an inner product space (real or complex).

9.1 Definition (Orthogonal vectors)
Two vectors u, v ∈ V are said to be orthogonal if ⟨u, v⟩ = 0.

9.2 Example
(i)  The canonical basis vectors e_1, ..., e_n in IR^n or C^n with the standard inner product are mutually (i.e., pairwise) orthogonal.
(ii) Any vectors u = (u_1, ..., u_k, 0, ..., 0) and v = (0, ..., 0, v_{k+1}, ..., v_n) are orthogonal in IR^n or C^n with the standard inner product.
(iii) The zero vector 0 is orthogonal to any vector.

9.3 Theorem (Pythagoras)
If ⟨u, v⟩ = 0, then ||u + v||² = ||u||² + ||v||². Inductively, it follows that if u_1, ..., u_k are mutually orthogonal, then

    ||u_1 + ··· + u_k||² = ||u_1||² + ··· + ||u_k||²

9.4 Theorem
If nonzero vectors u_1, ..., u_k are mutually orthogonal, then they are linearly independent.

9.5 Definition (Orthogonal/Orthonormal basis)
A basis {u_i} in V is said to be orthogonal if all the basis vectors u_i are mutually orthogonal. If, in addition, all the basis vectors are unit (i.e., ||u_i|| = 1 for all i), then the basis is said to be orthonormal, or an ONB.

9.6 Theorem (Fourier expansion)
If B = {u_1, ..., u_n} is an ONB in a finite dimensional space V, then

    v = Σ_{i=1}^n ⟨v, u_i⟩ u_i

for every v ∈ V, i.e. c_i = ⟨v, u_i⟩ are the coordinates of the vector v in the basis B. One can also write this as [v]_B^t = (⟨v, u_1⟩, ..., ⟨v, u_n⟩).
Note: the numbers ⟨v, u_i⟩ are called the Fourier coefficients of v in the ONB {u_1, ..., u_n}.

For example, in IR^n or C^n with the standard inner product, the coordinates of any vector v = (v_1, ..., v_n) satisfy the equations v_i = ⟨v, e_i⟩.
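A short illustration of 9.6 in IR³ (a sketch; the ONB below is just one convenient example, not from the notes):

    import numpy as np

    u1 = np.array([1., 1., 0.]) / np.sqrt(2)
    u2 = np.array([1., -1., 0.]) / np.sqrt(2)
    u3 = np.array([0., 0., 1.])
    onb = [u1, u2, u3]

    v = np.array([3., -2., 5.])
    coords = [np.dot(v, u) for u in onb]    # Fourier coefficients <v, u_i>
    print(np.allclose(sum(c * u for c, u in zip(coords, onb)), v))   # v = sum <v,u_i> u_i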


9.7 Definition (Orthogonal projection)
Let u, v ∈ V, and v ≠ 0. The orthogonal projection of u onto v is

    Pr_v u = ( ⟨u, v⟩ / ||v||² ) v

Note that the vector w := u − Pr_v u is orthogonal to v. Therefore, u is the sum of two vectors, Pr_v u parallel to v, and w orthogonal to v (see the diagram below).

[Figure 1: Orthogonal projection of u onto v. The projection Pr_v u has length ||u|| cos θ and direction v/||v||.]
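In coordinates, Definition 9.7 can be checked directly (a sketch in NumPy; the vectors are arbitrary examples):

    import numpy as np

    def proj(u, v):
        # Pr_v u = (<u, v> / ||v||^2) v, for v != 0
        return (np.dot(u, v) / np.dot(v, v)) * v

    u = np.array([2., 3., 1.])
    v = np.array([1., 0., 1.])
    p = proj(u, v)
    w = u - p
    print(np.isclose(np.dot(w, v), 0.0))   # the remainder w is orthogonal to v
    print(np.allclose(p + w, u))           # u = Pr_v u + w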

9.8 Definition (Angle)
In the real case, for any nonzero vectors u, v ∈ V let

    cos θ = ⟨u, v⟩ / ( ||u|| ||v|| )

By 8.16, we have cos θ ∈ [−1, 1]. Hence, there is a unique angle θ ∈ [0, π] with this value of cosine. It is called the angle between u and v. Note that cos θ = 0 if and only if u and v are orthogonal. Also, cos θ = ±1 if and only if u, v are proportional, v = cu; then the sign of c coincides with the sign of cos θ.

9.9 Theorem
Let B = {u_1, ..., u_n} be an ONB in a finite dimensional space V. Then

    v = Σ_{i=1}^n Pr_{u_i} v = Σ_{i=1}^n ||v|| cos θ_i · u_i

where θ_i is the angle between u_i and v.


9.10 Theorem (Gram-Schmidt)
Let nonzero vectors w_1, ..., w_m be mutually orthogonal. For v ∈ V, set

    w_{m+1} = v − Σ_{i=1}^m Pr_{w_i} v

Then the vectors w_1, ..., w_{m+1} are mutually orthogonal, and

    span{w_1, ..., w_m, v} = span{w_1, ..., w_m, w_{m+1}}

In particular, w_{m+1} = 0 if and only if v ∈ span{w_1, ..., w_m}.

9.11 Algorithm (Gram-Schmidt orthogonalization)
Let {v_1, ..., v_n} be a basis in V. Define w_1 = v_1 and then inductively, for m ≥ 1,

    w_{m+1} = v_{m+1} − Σ_{i=1}^m Pr_{w_i} v_{m+1}
            = v_{m+1} − Σ_{i=1}^m ( ⟨v_{m+1}, w_i⟩ / ||w_i||² ) w_i

This gives an orthogonal basis {w_1, ..., w_n}, which agrees with the basis {v_1, ..., v_n} in the following sense:

    span{v_1, ..., v_m} = span{w_1, ..., w_m}    for all 1 ≤ m ≤ n.

The basis {w_1, ..., w_n} can be normalized by u_i = w_i / ||w_i|| to give an ONB {u_1, ..., u_n}. Alternatively, an ONB {u_1, ..., u_n} can be obtained directly by w_1 = v_1, u_1 = w_1 / ||w_1|| and inductively, for m ≥ 1,

    w_{m+1} = v_{m+1} − Σ_{i=1}^m ⟨v_{m+1}, u_i⟩ u_i,        u_{m+1} = w_{m+1} / ||w_{m+1}||
9.12 Example
Let V = P_n(IR) with the inner product given by ⟨f, g⟩ = ∫_0^1 f(x) g(x) dx. Applying Gram-Schmidt orthogonalization to the basis {1, x, ..., x^n} gives the first n + 1 of the so-called Legendre polynomials.
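Example 9.12 can be reproduced symbolically (a sketch using SymPy; up to normalization these are the shifted Legendre polynomials on [0, 1]):

    import sympy as sp

    x = sp.symbols('x')
    inner = lambda f, g: sp.integrate(f * g, (x, 0, 1))   # the inner product of 9.12

    ortho = []
    for v in [sp.Integer(1), x, x**2, x**3]:
        w = v - sum(inner(v, u) / inner(u, u) * u for u in ortho)
        ortho.append(sp.expand(w))
    print(ortho)   # 1, x - 1/2, x**2 - x + 1/6, x**3 - 3*x**2/2 + 3*x/5 - 1/20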


9.13 Corollary
Let W ⊂ V be a finite dimensional subspace of an inner product space V, and dim W = k. Then there is an ONB {u_1, ..., u_k} in W. If, in addition, V is finite dimensional, then the basis {u_1, ..., u_k} of W can be extended to an ONB {u_1, ..., u_n} of V.

9.14 Definition (Orthogonal complement)
Let S ⊂ V be a subset (not necessarily a subspace). Then

    S^⊥ := {v ∈ V : ⟨v, w⟩ = 0 for all w ∈ S}

is called the orthogonal complement to S.

9.15 Theorem
S^⊥ is a subspace of V. If W = span S, then W^⊥ = S^⊥.

9.16 Example
If S = {(1, 0, 0)} in IR³, then S^⊥ = span{(0, 1, 0), (0, 0, 1)}.
Let V = C[a, b] (real functions) with the inner product from 8.12(ii). Let S = {f ≡ const} (the subspace of constant functions). Then S^⊥ = {g : ∫_a^b g(x) dx = 0}. Note that V = S ⊕ S^⊥, see 1.34(b).

9.17 Theorem
If W is a finite dimensional subspace of V, then

    V = W ⊕ W^⊥

Proof. By 9.13, there is an ONB {u_1, ..., u_k} of W. For any v ∈ V the vector

    v − Σ_{i=1}^k ⟨v, u_i⟩ u_i

belongs to W^⊥. Hence, V = W + W^⊥. The linear independence of W and W^⊥ follows from 9.4. 2
Note: the finite dimension of W is essential. Let V = C[a, b] with the inner product from 8.12(ii) and W ⊂ V be the set of real polynomials restricted to the interval [a, b]. Then W^⊥ = {0}, and at the same time V ≠ W.

9.18 Theorem (Parseval's identity)
Let B = {u_1, ..., u_n} be an ONB in V. Then

    ⟨v, w⟩ = Σ_{i=1}^n ⟨v, u_i⟩ \overline{⟨w, u_i⟩} = [v]_B^t \overline{[w]_B} = [w]_B^H [v]_B

for all v, w ∈ V. In particular,

    ||v||² = Σ_{i=1}^n |⟨v, u_i⟩|² = [v]_B^H [v]_B

Proof. Follows from 9.6. 2

9.19 Theorem (Bessel's inequality)
Let {u_1, ..., u_n} be an orthonormal subset of V. Then

    ||v||² ≥ Σ_{i=1}^n |⟨v, u_i⟩|²

for all v ∈ V.
Proof. For any v ∈ V the vector w := v − Σ_{i=1}^n ⟨v, u_i⟩ u_i belongs to {u_1, ..., u_n}^⊥. Hence,

    ||v||² = ||w||² + Σ_{i=1}^n |⟨v, u_i⟩|² ≥ Σ_{i=1}^n |⟨v, u_i⟩|²        2
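A numerical illustration of 9.18 and 9.19 in IR³ (a sketch; u1 and u2 alone form an orthonormal subset, which Bessel allows, and adding u3 = e_3 completes them to an ONB, which Parseval requires):

    import numpy as np

    u1 = np.array([1., 1., 0.]) / np.sqrt(2)
    u2 = np.array([1., -1., 0.]) / np.sqrt(2)
    u3 = np.array([0., 0., 1.])

    v = np.array([1., 2., 3.])
    w = np.array([-1., 0., 2.])

    # Bessel (9.19): an orthonormal subset only gives an inequality
    bessel = sum(np.dot(v, u)**2 for u in (u1, u2))
    print(bessel, np.dot(v, v))                  # 5.0 <= 14.0

    # Parseval (9.18): with the full ONB we recover <v, w> exactly
    parseval = sum(np.dot(v, u) * np.dot(w, u) for u in (u1, u2, u3))
    print(np.isclose(parseval, np.dot(v, w)))    # True, both equal 5.0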

9.20 Definition (Isometry)
Let V, W be two inner product spaces (both real or both complex). An isomorphism T : V → W is called an isometry if it preserves the inner product, i.e. ⟨Tv, Tw⟩ = ⟨v, w⟩ for all v, w ∈ V. In this case V and W are said to be isometric.
Note: it can be shown via the polarization identity that T : V → W preserves the inner product if and only if T preserves the induced norms, i.e. ||Tv|| = ||v|| for all v ∈ V.

9.21 Theorem
Let dim V < ∞. A linear transformation T : V → W is an isometry if and only if whenever {u_1, ..., u_n} is an ONB in V, then {T u_1, ..., T u_n} is an ONB in W.

9.22 Corollary
Finite dimensional inner product spaces V and W (over the same field) are isometric if and only if dim V = dim W.

9.23 Example
Let V = IR² with the standard inner product. Then the maps defined by the matrices

    A_1 = [ 0  -1 ]        A_2 = [ 1   0 ]
          [ 1   0 ]              [ 0  -1 ]

are isometries of IR².

10    Operators in Inner Product Spaces

Definition 10.1 [isometries]
Let V and W be inner product spaces. An isomorphism T : V → W is called an isometry if ⟨Tv, Tw⟩ = ⟨v, w⟩ for all v, w ∈ V. In this case V and W are called isometric. If V = W is a complex (real) inner product space, then an isometry T : V → V is called a unitary (orthogonal) operator.

Proposition 10.2 [Properties of isometries]
(a) ⟨Tv, Tw⟩ = ⟨v, w⟩ for all v, w ∈ V if and only if ||Tv|| = ||v|| for all v ∈ V.
(b) If dim V < ∞ and {u_1, ..., u_n} is an ONB in V, then T : V → W is an isometry if and only if {T u_1, ..., T u_n} is an ONB in W.
(c) Finite dimensional inner product spaces V and W are isometric if and only if dim V = dim W.
(d) The unitary operators on a complex inner product space V form a subgroup U(V) of GL(V). The orthogonal operators on a real inner product space form a subgroup O(V) of GL(V).

Definition 10.3 [orthogonal, unitary matrices]
A matrix Q ∈ IR^{n×n} is called orthogonal if Q Q^t = I. A matrix U ∈ C^{n×n} is called unitary if U U^H = I. (Here U^H = \overline{U}^t is the Hermitean transpose.)

Theorem 10.4
(a) Let Q ∈ IR^{n×n}. Then the following are equivalent:
(i) Q is orthogonal,
(ii) Q^t is orthogonal,
(iii) The rows of Q are an ONB of IR^n,
(iv) The columns of Q are an ONB of IR^n,
(v) ⟨Qu, Qv⟩ = ⟨u, v⟩ for all u, v ∈ IR^n (where ⟨·, ·⟩ is the standard inner product).
(b) A completely analogous result holds for unitary matrices in C^{n×n}.

Corollary 10.5
(a) The orthogonal matrices in IR^{n×n} form a subgroup O(n) of GL(n, IR). The unitary matrices in C^{n×n} form a subgroup U(n) of GL(n, C).
(b) If Q ∈ O(n), then det Q ∈ {−1, 1}. If U ∈ U(n), then |det U| = 1.

Lemma 10.6
Let T ∈ L(V), B = {u_1, ..., u_n} be an ONB of V and A := [T]_B. Then A_{kj} = ⟨T u_j, u_k⟩ for all j, k.

Theorem 10.7
(a) Let V be a real (complex) inner product space and B an ONB in V. Then T ∈ L(V) is orthogonal (unitary) if and only if [T]_B is an orthogonal (unitary) matrix.
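A quick check of Definition 10.3 and Theorem 10.4 (a sketch; the rotation matrix and the unitary matrix below are just convenient examples):

    import numpy as np

    t = 0.3
    Q = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])
    print(np.allclose(Q @ Q.T, np.eye(2)))                  # Q Q^t = I, so Q is orthogonal

    U = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)
    print(np.allclose(U @ U.conj().T, np.eye(2)))           # U U^H = I, so U is unitary

    # 10.4(v): an orthogonal Q preserves the standard inner product
    u, v = np.array([1., 2.]), np.array([-3., 0.5])
    print(np.isclose(np.dot(Q @ u, Q @ v), np.dot(u, v)))   # True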


Theorem 10.8
If Q ∈ IR^{n×n} is orthogonal, then σ(Q) ⊂ {−1, 1}. If U ∈ C^{n×n} is unitary, then σ(U) ⊂ {z ∈ C : |z| = 1}.

Theorem 10.9 [Riesz]
Let V be a finite-dimensional inner product space and f ∈ V* (i.e. f : V → F a linear functional). Then there exists a unique v ∈ V such that f(u) = ⟨u, v⟩ for all u ∈ V.
Note: For each fixed v ∈ V the mapping u ↦ ⟨u, v⟩ defines a linear functional on V. The Riesz Theorem says that all linear functionals are of this form and thereby leads to a natural identification of V with V* via the mapping v ↦ ⟨·, v⟩.

Theorem+Definition 10.10 [adjoint operator]
Let V be a finite-dimensional inner product space and T ∈ L(V). Then there exists a unique T* ∈ L(V) such that ⟨Tu, v⟩ = ⟨u, T*v⟩ for all u, v ∈ V. T* is called the adjoint operator of T.

Proposition 10.11 [Properties of adjoints]
Let T, S ∈ L(V) and c ∈ F. Then
(i) (T + S)* = T* + S*,
(ii) (cT)* = \bar{c} T*,
(iii) (TS)* = S* T*,
(iv) (T*)* = T.

Theorem 10.12
Let V be a real (complex) finite-dimensional inner product space, B an ONB of V, and T ∈ L(V). Then [T*]_B = [T]_B^t (respectively [T*]_B = [T]_B^H).

Definition 10.13 [selfadjoint operators, symmetric and hermitean matrices]
An operator T ∈ L(V) on a finite-dimensional inner product space V is called selfadjoint if T* = T.
A matrix A ∈ IR^{n×n} is called symmetric if A = A^t, i.e. A_{kj} = A_{jk} for all 1 ≤ j, k ≤ n.
A matrix A ∈ C^{n×n} is called hermitean if A = A^H, i.e. A_{kj} = \overline{A_{jk}} for all 1 ≤ j, k ≤ n.

Corollary 10.14
Let V be a finite-dimensional real (complex) inner product space and B an ONB of V. Then T ∈ L(V) is selfadjoint if and only if [T]_B is symmetric (hermitean).

Theorem 10.15
Let T be selfadjoint on V. Then
(a) σ(T) ⊂ IR.
(b) If λ_1, λ_2 ∈ σ(T), λ_1 ≠ λ_2 and v_1, v_2 are corresponding eigenvectors, then v_1 ⊥ v_2.
(c) T has an eigenvalue.

Theorem 10.16
U ∈ L(V) is orthogonal (unitary) if and only if U* = U^{−1}.

Definition 10.17 [orthogonally, unitarily equivalent matrices]
Two matrices A, B ∈ IR^{n×n} (C^{n×n}) are called orthogonally equivalent (unitarily equivalent) if there exists an orthogonal matrix O (unitary matrix U) such that B = O^{−1} A O (B = U^{−1} A U).
Note: If two matrices are orthogonally (unitarily) equivalent, then they are similar, but not vice versa.

Lemma 10.18
Let V be a finite-dimensional inner product space, T ∈ L(V), and W ⊂ V a T-invariant subspace, i.e. Tw ∈ W for all w ∈ W.
(a) If T is selfadjoint, then W^⊥ is T-invariant.
(b) If T is an isometry, then W^⊥ is T-invariant.

Theorem 10.19 [Spectral theorem for selfadjoint operators]
Let V be a finite-dimensional inner product space and T ∈ L(V) selfadjoint. Then there exists an ONB of V consisting entirely of eigenvectors of T. In particular, T is diagonalizable.
Theorem 10.20
If A is symmetric in IR^{n×n} (hermitean in C^{n×n}), then A is orthogonally (unitarily) equivalent to a diagonal matrix.
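A concrete instance of Theorem 10.20 (a sketch; numpy.linalg.eigh returns, for a symmetric input, real eigenvalues and an orthogonal matrix whose columns are eigenvectors):

    import numpy as np

    A = np.array([[2., 1., 0.],
                  [1., 3., 1.],
                  [0., 1., 2.]])                 # symmetric

    eigvals, Q = np.linalg.eigh(A)               # columns of Q form an ONB of eigenvectors
    print(np.allclose(Q.T @ Q, np.eye(3)))               # Q is orthogonal
    print(np.allclose(Q.T @ A @ Q, np.diag(eigvals)))    # Q^{-1} A Q is diagonal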

Note: (1) Theorems 10.19 and 10.20 carry over to unitary operators on complex finite-dimensional inner product spaces, respectively to unitary matrices, with essentially the same proofs. They do not hold for orthogonal operators and matrices, due to the lack of an analogue of Theorem 10.15(c). However the following can be shown: Let T : V → V be an orthogonal operator on a finite-dimensional real inner product space V. Then V = V_1 ⊕ ··· ⊕ V_m, where the V_i are mutually orthogonal subspaces, and each V_i is a T-invariant one- or two-dimensional subspace of V.
(2) Much of the above theory for selfadjoint and unitary operators can be carried out for the more general case of normal operators T, i.e. operators such that T*T = TT*.

