$$(A - 5I)\begin{pmatrix} 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$
which shows that we have $\lambda = 5$ with $v = \begin{pmatrix} 3 \\ 1 \end{pmatrix}$, and
$$A v_2 = 5 v_2 , \qquad A v_1 = 5 v_1 + v_2 ,$$
or
$$A \, ( v_2 \;\; v_1 ) = ( 5 v_2 \;\; 5 v_1 + v_2 ) = ( v_2 \;\; v_1 ) \begin{pmatrix} 5 & 1 \\ 0 & 5 \end{pmatrix} .$$
To prove that every square matrix is similar to an upper triangular matrix we argue by induction on $n$, conjugating by a block matrix
$$Q = \begin{pmatrix} I_m & 0 \\ 0 & Q_2 \end{pmatrix} ,$$
where $m = \dim E_\lambda$ for an eigenvalue $\lambda$, and $Q_2$ is supplied by the inductive hypothesis, so that $U = Q_2^{-1} A_2 Q_2$ is upper triangular. Choose the basis $B$ so that its first $m$ vectors span $E_\lambda$; each remaining basis vector $v_j$ then satisfies
$$T(v_j) = w_j + u_j$$
with $u_j \in E_\lambda$ and $w_j \in W$;
if we find the coordinate vectors of the two terms on the right-hand side with respect to $B$, then the first will end with the $j$th column of $U$, while the second will end with $n - m$ zeros. Thus the matrix of $T$ with respect to $B$ has the form
$$\begin{pmatrix} \lambda I_m & A \\ 0 & U \end{pmatrix} ,$$
which is upper triangular, and the existence of such a matrix for all $n$ follows by induction. The diagonal elements are the eigenvalues of $A$, because similar matrices have the same characteristic polynomial.
In block matrix terms, the inductive step is the computation
$$\begin{pmatrix} I_m & 0 \\ 0 & Q_2 \end{pmatrix}^{-1} \begin{pmatrix} \lambda I_m & A_1 \\ 0 & A_2 \end{pmatrix} \begin{pmatrix} I_m & 0 \\ 0 & Q_2 \end{pmatrix} = \begin{pmatrix} \lambda I_m & A_1 Q_2 \\ 0 & Q_2^{-1} A_2 Q_2 \end{pmatrix} = \begin{pmatrix} \lambda I_m & A_1 Q_2 \\ 0 & U \end{pmatrix} .$$
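The inductive construction in this proof is, in numerical practice, delivered by the Schur decomposition. A minimal sketch (this assumes numpy and scipy, which the notes themselves do not use):

```python
import numpy as np
from scipy.linalg import schur

# Schur: A = Z T Z* with T upper triangular and Z unitary, so the
# diagonal of T carries the eigenvalues of A, as the lemma asserts.
A = np.array([[2.0, 9.0],
              [-1.0, 8.0]])
T, Z = schur(A, output='complex')

print(np.round(T, 6))                        # upper triangular, diagonal (5, 5)
print(np.allclose(Z @ T @ Z.conj().T, A))    # True
```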
Comment. Consider the significance of this lemma. For many applications, we would like to diagonalise a matrix A, making it similar to a
matrix with the eigenvalues of A on the diagonal and zeros elsewhere.
It is convenient to introduce the direct sum notation
$$A_1 \oplus A_2 \oplus \cdots \oplus A_r = \begin{pmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & A_r \end{pmatrix} ,$$
where the blocks $A_1, \ldots, A_r$ are square matrices, not necessarily of the same size:
$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \oplus \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 0 & 0 \\ 3 & 4 & 0 & 0 \\ 0 & 0 & 5 & 6 \\ 0 & 0 & 7 & 8 \end{pmatrix} ; \qquad (2) \oplus \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix} \oplus (4) = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 3 & 1 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 4 \end{pmatrix} .$$
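In computational work the direct sum is just a block diagonal matrix; for example, with scipy (an assumed dependency, not something the notes use):

```python
from scipy.linalg import block_diag

# block_diag implements A1 (+) A2 (+) ... ; blocks of unequal
# sizes are allowed, exactly as in the examples above.
print(block_diag([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
print(block_diag([[2]], [[3, 1], [0, 3]], [[4]]))
```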
The $k \times k$ Jordan block with eigenvalue $\lambda$ is
$$J_k(\lambda) = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & \cdots & 0 & \lambda \end{pmatrix} .$$
Matrix polynomials. Let $\mathbb{F}$ be a field. We write $\mathbb{F}[z]$ for the set of all polynomials in one variable $z$ which have coefficients in the field $\mathbb{F}$. If
$$f(z) = f_0 + f_1 z + \cdots + f_d z^d$$
is a polynomial in $\mathbb{F}[z]$ and $A$ is an $n \times n$ matrix over $\mathbb{F}$, then $f(A)$ is the $n \times n$ matrix
$$f(A) = f_0 I + f_1 A + \cdots + f_d A^d .$$
If $A$ is fixed, the set of all such matrices is denoted $\mathbb{F}[A]$. That is,
$$\mathbb{F}[A] = \{\, f(A) \mid f \text{ is a polynomial} \,\} .$$
Lemma. Properties of matrix polynomials. If $f, g$ are polynomials, $\lambda$ is a scalar and $A$ is an $n \times n$ matrix, then
$$(f + g)(A) = f(A) + g(A) ; \qquad (\lambda f)(A) = \lambda f(A) ; \qquad (z^p f)(A) = A^p f(A) ;$$
$$(gf)(A) = g(A) f(A) ; \qquad f(A) \text{ and } g(A) \text{ commute} ; \qquad A f(A) = f(A) A .$$
Moreover, if $v$ is a vector in the eigenspace $E_\lambda$ of $A$, then
$$f(A) v = f(\lambda) v .$$
Finally, if $P$ is invertible and $A = P B P^{-1}$, then
$$f(A) = P f(B) P^{-1} .$$
Proof. Follows directly from the definition.
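As a concrete illustration of these rules, here is a small numpy sketch (mine, not from the notes): it evaluates $f(A)$ by Horner's scheme and checks the eigenvector property $f(A)v = f(\lambda)v$.

```python
import numpy as np

def poly_of_matrix(coeffs, A):
    """Evaluate f(A) = f0*I + f1*A + ... + fd*A^d (Horner scheme).
    coeffs = [f0, f1, ..., fd], ascending order."""
    result = np.zeros_like(A, dtype=float)
    for c in reversed(coeffs):
        result = result @ A + c * np.eye(A.shape[0])
    return result

A = np.array([[2.0, 9.0], [-1.0, 8.0]])
f = [1.0, -2.0, 3.0]                    # f(z) = 1 - 2z + 3z^2

lam, v = 5.0, np.array([3.0, 1.0])      # A v = 5 v
print(np.allclose(poly_of_matrix(f, A) @ v,
                  (f[0] + f[1]*lam + f[2]*lam**2) * v))   # True
```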
Problem. What is the dimension of the vector space $\mathbb{C}[A]$? Since this space contains all possible (complex) polynomial expressions in $A$, a spanning set is
$$\{\, I, A, A^2, A^3, \ldots \,\} .$$
There is no immediately obvious finite spanning set for $\mathbb{C}[A]$, so it might appear that the space is infinite-dimensional. However, the previous result shows that this is not so: $M_{n \times n}(\mathbb{C})$ has dimension $n^2$, so the subspace $\mathbb{C}[A]$ must have dimension $n^2$ or less. In fact, much more than this is true!
Example. The characteristic polynomial of $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ is
$$p(z) = \det(zI - A) = z^2 - 5z - 2 ,$$
and we can calculate
$$p(A) = A^2 - 5A - 2I = \begin{pmatrix} 7 & 10 \\ 15 & 22 \end{pmatrix} - 5 \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} - \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = 0 .$$
This is an instance of the Cayley-Hamilton theorem (Arthur Cayley, 1821-1895): the characteristic polynomial of any square matrix $A$ satisfies $p(A) = 0$. Over $\mathbb{C}$ we can factorise
$$p(z) = (z - \lambda_1) \cdots (z - \lambda_n)$$
and write $A = Q U Q^{-1}$ with $U$ upper triangular and with diagonal elements $\lambda_1, \ldots, \lambda_n$. If $V_k$ denotes the span of the first $k$ standard basis vectors, then $(U - \lambda_k I)(V_k) \subseteq V_{k-1}$ for each $k$, so that
$$p(U)(\mathbb{C}^n) = (U - \lambda_1 I) \cdots (U - \lambda_n I)(V_n) \subseteq (U - \lambda_1 I)(V_1) \subseteq V_0 = \{ 0 \} .$$
That is, $p(U)$ maps every element of $\mathbb{C}^n$ to the zero vector, and so $p(U)$ is the zero matrix; hence $p(A) = Q \, p(U) \, Q^{-1} = 0$ too. Since $p$ is monic of degree $n$, it follows that $A^n$, and inductively every higher power of $A$, is a linear combination of lower powers; so
$$\{\, I, A, A^2, \ldots, A^{n-1} \,\}$$
is a spanning set for $\mathbb{C}[A]$, and hence $\dim \mathbb{C}[A] \le n$.
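The example, and the theorem behind it, are easy to check numerically; a sketch (np.poly returns the characteristic polynomial's coefficients, monic and in descending order):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])

print(np.poly(A))                    # [ 1. -5. -2.]
print(A @ A - 5*A - 2*np.eye(2))     # the zero matrix
```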
Lemma. The division algorithm for polynomials. Let $\mathbb{F}$ be a field and let $f, g$ be in $\mathbb{F}[z]$. If $g$ is not the zero polynomial, then there exists a unique pair of polynomials $q, r$ in $\mathbb{F}[z]$ such that
$$f = qg + r , \qquad\text{where } r = 0 \text{ or } \deg r < \deg g .$$
Proof (uniqueness). Suppose that
$$f = q_1 g + r_1 = q_2 g + r_2 ,$$
where the pairs $q_1, r_1$ and $q_2, r_2$ both satisfy the conditions of the theorem. Then
$$(q_1 - q_2) g = r_2 - r_1 ;$$
consequently $q_1 - q_2 = 0$ and $r_2 - r_1 = 0$, as otherwise the right-hand side would have smaller degree than the left-hand side. This completes the proof.
By applying the division algorithm repeatedly we obtain the Euclidean algorithm for polynomials. For example, let
$$f = 2z^5 - 3z^4 + 6z^3 - 7z^2 + 6 \qquad\text{and}\qquad g = z^4 - z^3 + 2z^2 - 3z - 3 .$$
Then
$$f = (2z - 1) g + r_1 \quad\text{where } r_1 = z^3 + z^2 + 3z + 3 ,$$
$$g = (z - 2) r_1 + r_2 \quad\text{where } r_2 = z^2 + 3 ,$$
$$r_1 = (z + 1) r_2 . \qquad (*)$$
In general the calculation ends
$$r_{n-2} = q_n r_{n-1} + r_n , \qquad r_{n-1} = q_{n+1} r_n .$$
Note that as long as $r_k \neq 0$ we have $\deg r_k < \deg r_{k-1}$; this cannot continue indefinitely, so at some stage we must have $r_{n+1} = 0$, as indicated in this calculation. Looking at the last equation and working backwards we have
$$r_n \mid r_{n-1} , \quad r_n \mid r_{n-2} , \quad \ldots , \quad r_n \mid f_2 \ \text{ and } \ r_n \mid f_1 .$$
On the other hand, working forwards through the algorithm, any common divisor $d$ satisfies
$$d \mid f_1 \text{ and } d \mid f_2 \;\Longrightarrow\; d \mid r_1 \;\Longrightarrow\; \cdots \;\Longrightarrow\; d \mid r_n ,$$
so $r_n$ is a greatest common divisor of $f_1$ and $f_2$. Now consider
$$L = \{\, a_1 f_1 + a_2 f_2 \mid a_1 , a_2 \in \mathbb{F}[z] \,\} .$$
Every remainder produced by the algorithm lies in $L$: for instance,
$$r_2 = (b_1 f_1 + c_1 f_2) - q_1 (b_0 f_1 + c_0 f_2) = (b_1 - q_1 b_0) f_1 + (c_1 - q_1 c_0) f_2$$
is in $L$; and so on. Eventually we find that $r_n$ is in $L$; and we already know that $\gcd(f_1, f_2)$ is a constant times $r_n$.
Example and comment. Let
$$f = 2z^5 - 3z^4 + 6z^3 - 7z^2 + 6 \qquad\text{and}\qquad g = z^4 - z^3 + 2z^2 - 3z - 3$$
as on page 9. From the Euclidean algorithm on page 10 we have
$$\gcd(f, g) = r_2 = z^2 + 3 ;$$
running the Euclidean algorithm backwards, we have
$$z^2 + 3 = g - (z - 2) r_1 = g - (z - 2) \bigl( f - (2z - 1) g \bigr) = -(z - 2) f + (2z^2 - 5z + 3) g ,$$
so that the greatest common divisor is itself an element of $L$.
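The whole calculation is easy to automate. A minimal sketch using numpy's polynomial division, with coefficients in descending order (exact for this example, floating point in general):

```python
import numpy as np

def poly_gcd(f, g, tol=1e-9):
    """Euclidean algorithm for polynomials given as descending
    coefficient lists; returns a monic gcd."""
    f, g = np.asarray(f, float), np.asarray(g, float)
    while g.size and np.max(np.abs(g)) > tol:
        _, r = np.polydiv(f, g)            # f = q*g + r
        f, g = g, np.trim_zeros(r, 'f')    # drop leading zeros
    return f / f[0]

f = [2, -3, 6, -7, 0, 6]    # 2z^5 - 3z^4 + 6z^3 - 7z^2 + 6
g = [1, -1, 2, -3, -3]      # z^4 - z^3 + 2z^2 - 3z - 3
print(poly_gcd(f, g))       # [1. 0. 3.]  i.e.  z^2 + 3
```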
The minimal polynomial of a matrix $A$ is the monic polynomial $m$ of least degree such that $m(A) = 0$. Example. Let
$$A = \begin{pmatrix} 1 & -3 & 3 \\ 4 & -7 & 6 \\ 2 & -3 & 2 \end{pmatrix} .$$
Since $A$ is not a scalar multiple of $I$, no linear polynomial $f = f_0 + f_1 z$ can satisfy $f(A) = 0$. However,
$$A^2 = \begin{pmatrix} -5 & 9 & -9 \\ -12 & 19 & -18 \\ -6 & 9 & -8 \end{pmatrix} ,$$
and one checks that $A^2 + 3A + 2I = 0$; so the minimal polynomial of $A$ is $m_1(z) = (z + 1)(z + 2)$ rather than the characteristic polynomial
$$m_2(z) = (z + 1)^2 (z + 2) .$$
In general, if $v_k$ is an eigenvector of $A$ with eigenvalue $\lambda_k$, then $0 = m(A) v_k = m(\lambda_k) v_k$, and so $m(\lambda_k) = 0$: every eigenvalue of $A$ is a root of the minimal polynomial. On the other hand, if $v$ is in both $\ker(f_1(A))$ and $\ker(f_2(A))$ then, for coprime $f_1, f_2$ with $1 = a_1 f_1 + a_2 f_2$, we get
$$v = a_1(A) f_1(A) v + a_2(A) f_2(A) v = 0 ,$$
so that $\ker(f_1(A)) \cap \ker(f_2(A)) = \{ 0 \} .$
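A quick numerical check of this example (with the entries of $A$ as reconstructed above):

```python
import numpy as np

A = np.array([[1.0, -3.0, 3.0],
              [4.0, -7.0, 6.0],
              [2.0, -3.0, 2.0]])

# (z+1)(z+2) = z^2 + 3z + 2 already annihilates A, so the minimal
# polynomial is a proper divisor of the characteristic polynomial.
print(A @ A + 3*A + 2*np.eye(3))    # the zero matrix
```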
Lemma. Let $A$ be an $n \times n$ complex matrix with distinct eigenvalues $\lambda_1, \ldots, \lambda_s$ of algebraic multiplicities $a_1, \ldots, a_s$. Then
1. each generalised eigenspace $GE_{\lambda_k}$ is invariant under $A$;
2. $\mathbb{C}^n = GE_{\lambda_1} \oplus \cdots \oplus GE_{\lambda_s}$;
3. $A$ is similar to a direct sum
$$A_1 \oplus A_2 \oplus \cdots \oplus A_s ,$$
where $A_k$ is an upper triangular matrix of size $a_k \times a_k$, with all diagonal elements equal to $\lambda_k$;
4. $GE_{\lambda_k}$ has dimension equal to the algebraic multiplicity $a_k$.
Comment. Two important things are assured by this lemma. Firstly,
though we have seen that the dimension of an eigenspace may be less
than the corresponding algebraic multiplicity, this is not so for generalised eigenspaces: each generalised eigenspace has dimension equal to
the algebraic multiplicity of the corresponding eigenvalue. So, at the expense of introducing some extra complications, we have overcome one of
the obstacles to diagonalisation in the general case. Secondly, although
we cannot always find a basis of $\mathbb{C}^n$ consisting of eigenvectors of a given
matrix A, we can always find a basis (moreover, quite a special one)
consisting of generalised eigenvectors.
Proof of the lemma. For the first result we use the fact that the matrices $A$ and $(A - \lambda I)^a$ commute (exercise: give two different proofs of this). Therefore
$$v \in GE_\lambda \;\Longrightarrow\; (A - \lambda I)^a v = 0 \;\Longrightarrow\; (A - \lambda I)^a A v = A (A - \lambda I)^a v = 0 \;\Longrightarrow\; A v \in GE_\lambda .$$
For the second we apply the primary decomposition theorem (page 12) or its inductive extension (page 13) to $p(z)$, the characteristic polynomial of $A$. We have (by definition of algebraic multiplicity) that
$$p(z) = (z - \lambda_1)^{a_1} \cdots (z - \lambda_s)^{a_s} ;$$
the factors $(z - \lambda_k)^{a_k}$ are coprime in pairs and therefore
$$\mathbb{C}^n = GE_{\lambda_1} \oplus \cdots \oplus GE_{\lambda_s} .$$
Note that this is stronger than saying that there is no common
factor of all s polynomials.
Lemma. Let $\lambda$ be an eigenvalue of the $n \times n$ matrix $A$, write $d_k = \operatorname{nullity}(A - \lambda I)^k$, and let $T : GE_\lambda \to GE_\lambda$ be the linear map where $T(v) = Av$. Then
1. $\ker(A - \lambda I)^k \subseteq \ker(A - \lambda I)^{k+1}$ for every $k$;
2. the sequence $d_1, d_2, d_3, \ldots$ is non-decreasing;
3. the differences $d_{k+1} - d_k$ are non-increasing;
4. once two successive kernels are equal, all subsequent kernels are equal to them.
Comment. The second and third results of this lemma will be very useful later in avoiding hard work. They show that if we write down the sequence of nullities, then the nullities themselves must be non-decreasing, while their differences must be non-increasing. For example,
$$5 , \, 8 , \, 11 , \, 12 , \, 12 , \, 12 , \ldots$$
is a possible sequence, but
$$5 , \, 8 , \, 7 , \, 12 , \, 12 , \, 12 , \ldots \qquad\text{and}\qquad 5 , \, 8 , \, 9 , \, 12 , \, 12 , \, 12 , \ldots$$
are not. The fourth result shows that in order to find $GE_\lambda$, we do not necessarily have to calculate the kernel of $(A - \lambda I)^a$; we only need to calculate successive kernels until there is no change.
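In computational terms the fourth result says: keep multiplying by $A - \lambda I$ and stop as soon as the nullity stops changing. A sketch in numpy, computing the nullity as $n$ minus the numerical rank:

```python
import numpy as np

def nullity_sequence(A, lam):
    """Nullities of (A - lam I)^k for k = 1, 2, ... until they
    stabilise; the last value is dim GE_lam."""
    n = A.shape[0]
    M = A - lam * np.eye(n)
    P, out = np.eye(n), []
    while True:
        P = P @ M
        out.append(n - np.linalg.matrix_rank(P))
        if len(out) > 1 and out[-1] == out[-2]:
            return out[:-1]

A = np.array([[2.0, 9.0], [-1.0, 8.0]])
print(nullity_sequence(A, 5.0))    # [1, 2]: GE_5 is all of C^2
```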
Proof of the lemma. The first statement is a very easy exercise; the second follows immediately. For the third, choose a basis $\{ u_1, \ldots, u_p \}$ for $\ker(A - \lambda I)^k$ and extend it to obtain a basis
$$\{ u_1, \ldots, u_p, v_1, \ldots, v_q \}$$
for $\ker(A - \lambda I)^{k+1}$; extend this again to a basis
$$\{ u_1, \ldots, u_p, v_1, \ldots, v_q, w_1, \ldots, w_r \}$$
for $\ker(A - \lambda I)^{k+2}$. Now for $j = 1, 2, \ldots, r$ the vector $(A - \lambda I) w_j$ is in $\ker(A - \lambda I)^{k+1}$ and hence is a linear combination of $u_1, \ldots, u_p$ and $v_1, \ldots, v_q$; for each $j$ we write
$$(A - \lambda I) w_j = x_j + y_j$$
with $x_j \in \operatorname{span}\{ u_1, \ldots, u_p \}$ and $y_j \in \operatorname{span}\{ v_1, \ldots, v_q \}$. We wish to show that the vectors $y_1, y_2, \ldots, y_r$ are linearly independent. So, set
$$\alpha_1 y_1 + \alpha_2 y_2 + \cdots + \alpha_r y_r = 0 .$$
Now $(A - \lambda I)^k x_j = 0$ for each $j$, and so
$$(A - \lambda I)^{k+1} (\alpha_1 w_1 + \alpha_2 w_2 + \cdots + \alpha_r w_r) = (A - \lambda I)^k (\alpha_1 x_1 + \cdots + \alpha_r x_r) + (A - \lambda I)^k (\alpha_1 y_1 + \cdots + \alpha_r y_r) = 0 ,$$
so that $\alpha_1 w_1 + \cdots + \alpha_r w_r$ lies in $\ker(A - \lambda I)^{k+1}$ and is therefore a linear combination of $u_1, \ldots, u_p, v_1, \ldots, v_q$. Since the whole extended family is a basis, every $\alpha_j = 0$. Thus $y_1, \ldots, y_r$ are independent vectors in $\operatorname{span}\{ v_1, \ldots, v_q \}$, and it follows that $r \le q$, which is the third statement.
Finally, for an eigenvalue $\lambda$ of algebraic multiplicity $a$, write $p(z) = (z - \lambda)^a q(z)$ with $q(\lambda) \neq 0$, and expand
$$q(z) = \sum_{k=0}^{n-a} \mu_k (z - \lambda)^k , \qquad \mu_0 = q(\lambda) \neq 0 .$$
For any $v \in \ker(A - \lambda I)^{a+1}$, the Cayley-Hamilton theorem gives
$$0 = p(A) v = \sum_{k=0}^{n-a} \mu_k (A - \lambda I)^{a+k} v = \mu_0 (A - \lambda I)^a v ,$$
so that $(A - \lambda I)^a v = 0$. Hence the kernels stop growing at $k = a$, and
$$GE_\lambda = \bigcup_{k=1}^{\infty} \ker(A - \lambda I)^k = \ker(A - \lambda I)^a .$$
Jordan forms. It can be shown that the basis found in the previous lemma can be further refined to give a Jordan basis for $\mathbb{C}^n$, with respect to which the matrix of $A$ is in Jordan form.
Definition. A Jordan chain in a generalised eigenspace $GE_\lambda$ of a matrix $A$ is a sequence $v_1, v_2, \ldots, v_k$ such that
$$(A - \lambda I) v_j = v_{j+1} \quad\text{for } j = 1, \ldots, k-1 \qquad\text{and}\qquad (A - \lambda I) v_k = 0 ,$$
that is,
$$v_1 \xrightarrow{\,A - \lambda I\,} v_2 \xrightarrow{\,A - \lambda I\,} \cdots \xrightarrow{\,A - \lambda I\,} v_{k-1} \xrightarrow{\,A - \lambda I\,} v_k \xrightarrow{\,A - \lambda I\,} 0$$
(where the zero vector is not actually part of the chain). A Jordan basis for an $n \times n$ matrix $A$ is a basis of $\mathbb{C}^n$ consisting of one or more Jordan chains of $A$.
Lemma. Jordan chains produce Jordan blocks. Let $v$ be a generalised eigenvector of an $n \times n$ matrix $A$ corresponding to the eigenvalue $\lambda$.
Then there is a Jordan chain
$$v = v_1 \xrightarrow{\,A - \lambda I\,} v_2 \xrightarrow{\,A - \lambda I\,} \cdots \xrightarrow{\,A - \lambda I\,} v_{k-1} \xrightarrow{\,A - \lambda I\,} v_k \xrightarrow{\,A - \lambda I\,} 0$$
with $v_k \neq 0$, and
1. every $v_j$ is in $GE_\lambda$;
2. the set $\{ v_k, \ldots, v_1 \}$ is linearly independent;
3. the subspace $W = \operatorname{span}\{ v_k, \ldots, v_1 \}$ of $\mathbb{C}^n$ is invariant under $A$;
4. the matrix of $T : W \to W$, $T(v) = Av$, with respect to the ordered basis $B = \{ v_k, \ldots, v_1 \}$ is the Jordan block $J_k(\lambda)$.
Proof. For the first claim note that by definition of a Jordan chain we have
$$(A - \lambda I)^{k+1-j} v_j = 0 .$$
For the second, let $\alpha_1 v_1 + \cdots + \alpha_k v_k = 0$. Multiplying by $(A - \lambda I)^{k-1}$ gives $\alpha_1 v_k = 0$ and so $\alpha_1 = 0$; a similar calculation shows that every coefficient $\alpha_j$ is zero. For the rest, note that by definition
$$A v_j = v_{j+1} + \lambda v_j \quad\text{for } j = 1, \ldots, k-1 \qquad\text{and}\qquad A v_k = \lambda v_k ;$$
this shows that $A v_j \in W$, and that the matrix of $T$ with respect to $B$ has $j$th column from the right given by
$$[A v_j]_B = ( 0, \ldots, 0, \underbrace{1}_{\substack{\text{row } j+1 \\ \text{from the bottom}}}, \underbrace{\lambda}_{\substack{\text{row } j \\ \text{from the bottom}}}, 0, \ldots, 0 )^T$$
provided $j < k$, and $k$th column from the right (that is, first column) equal to $(\lambda, 0, \ldots)^T$. So the matrix of $T$ is $J_k(\lambda)$, as claimed.
Examples.
1. Consider the matrix from the start of this chapter,
$$A = \begin{pmatrix} 2 & 9 \\ -1 & 8 \end{pmatrix} .$$
We know already that $A$ has only one eigenvalue $\lambda = 5$. It is easy to check that $(A - 5I)^2 = 0$ and so
$$GE_5 = \ker(0) = \mathbb{C}^2 .$$
Alternatively, this is given without calculation by the corollary on page 17. Starting with the vector $v = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, we have a chain
$$v = v_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \xrightarrow{\,A - 5I\,} v_2 = \begin{pmatrix} -3 \\ -1 \end{pmatrix} \xrightarrow{\,A - 5I\,} \begin{pmatrix} 0 \\ 0 \end{pmatrix} ,$$
and the matrix of $A$ with respect to the ordered basis $\{ v_2, v_1 \}$ is
$$J = J_2(5) = \begin{pmatrix} 5 & 1 \\ 0 & 5 \end{pmatrix}$$
as we found earlier. The fact that the generalised eigenspace is the whole of $\mathbb{C}^2$ shows that our choice of $v_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ was not just lucky; in fact, any nonzero vector other than an eigenvector would have done as well. For example, the chain
$$v_1 = \begin{pmatrix} 7 \\ 3 \end{pmatrix} \xrightarrow{\,A - 5I\,} v_2 = \begin{pmatrix} 6 \\ 2 \end{pmatrix} \xrightarrow{\,A - 5I\,} \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
gives
$$A = \begin{pmatrix} 6 & 7 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} 5 & 1 \\ 0 & 5 \end{pmatrix} \begin{pmatrix} 6 & 7 \\ 2 & 3 \end{pmatrix}^{-1} .$$
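The chain construction is easy to replay numerically; a sketch for the second chain above:

```python
import numpy as np

A = np.array([[2.0, 9.0], [-1.0, 8.0]])
N = A - 5.0 * np.eye(2)

v1 = np.array([7.0, 3.0])        # any non-eigenvector works
v2 = N @ v1                      # = (6, 2), an eigenvector

P = np.column_stack([v2, v1])    # ordered basis {v2, v1}
print(np.linalg.inv(P) @ A @ P)  # [[5. 1.] [0. 5.]] = J_2(5)
```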
2. Take
$$A = \begin{pmatrix} 7 & 1 & -1 \\ 2 & 7 & -2 \\ 5 & 3 & 1 \end{pmatrix} ;$$
the eigenvalues are $6, 6, 3$, with eigenvectors
$$v = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} , \qquad v = \begin{pmatrix} 1 \\ 3 \\ 7 \end{pmatrix}$$
respectively. We have
$$GE_6 = \ker(A - 6I)^2 = \ker \begin{pmatrix} -2 & -1 & 2 \\ -6 & -3 & 6 \\ -14 & -7 & 14 \end{pmatrix} = \operatorname{span} \left\{ \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} , \begin{pmatrix} -1 \\ 2 \\ 0 \end{pmatrix} \right\}$$
and chains
$$v_1 = \begin{pmatrix} -1 \\ 2 \\ 0 \end{pmatrix} \xrightarrow{\,A - 6I\,} v_2 = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} \xrightarrow{\,A - 6I\,} 0 , \qquad v_3 = \begin{pmatrix} 1 \\ 3 \\ 7 \end{pmatrix} \xrightarrow{\,A - 3I\,} 0 .$$
We have
$$A v_2 = 6 v_2 , \qquad A v_1 = v_2 + 6 v_1 , \qquad A v_3 = 3 v_3 ,$$
and so $A = P J P^{-1}$ where
$$P = ( v_2 \;\; v_1 \;\; v_3 ) = \begin{pmatrix} 1 & -1 & 1 \\ 0 & 2 & 3 \\ 1 & 0 & 7 \end{pmatrix}$$
and
$$J = J_2(6) \oplus J_1(3) = \begin{pmatrix} 6 & 1 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 3 \end{pmatrix} .$$
3. Consider the matrices
$$A = \begin{pmatrix} 4 & 2 & 1 \\ 2 & 3 & 1 \\ -6 & -5 & -1 \end{pmatrix} \qquad\text{and}\qquad B = \begin{pmatrix} 0 & -4 & -4 \\ 2 & 6 & 4 \\ -1 & -2 & 0 \end{pmatrix} .$$
Each has eigenvalue $\lambda = 2$ with algebraic multiplicity 3, and therefore has one generalised eigenspace $GE_2 = \mathbb{C}^3$. We may compute
$$\ker(A - 2I) = \ker \begin{pmatrix} 2 & 2 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} = \operatorname{span} \left\{ \begin{pmatrix} 1 \\ 0 \\ -2 \end{pmatrix} \right\}$$
and
$$\ker(A - 2I)^2 = \ker \begin{pmatrix} 2 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} = \operatorname{span} \left\{ \begin{pmatrix} 1 \\ 0 \\ -2 \end{pmatrix} , \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} \right\} .$$
A chain beginning with a vector in $\ker(A - 2I)$ will only have length 1; one beginning in $\ker(A - 2I)^2$ will have length 2; so we need to begin with a vector in neither. For example, take
$$v_1 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \xrightarrow{\,A - 2I\,} v_2 = \begin{pmatrix} 1 \\ 1 \\ -3 \end{pmatrix} \xrightarrow{\,A - 2I\,} v_3 = \begin{pmatrix} 1 \\ 0 \\ -2 \end{pmatrix} \xrightarrow{\,A - 2I\,} \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} ;$$
then $A = P J P^{-1}$ with
$$P = ( v_3 \;\; v_2 \;\; v_1 ) = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ -2 & -3 & 1 \end{pmatrix} \qquad\text{and}\qquad J = J_3(2) = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix} .$$
If we try to do the same with $B$ we find that things are not quite the same, because we have
$$\ker(B - 2I) = \ker \begin{pmatrix} 1 & 2 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} = \operatorname{span} \left\{ \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} , \begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix} \right\}$$
and already
$$\ker(B - 2I)^2 = \ker \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} = \mathbb{C}^3 ,$$
so no chain has length greater than 2. Starting with any vector not in $\ker(B - 2I)$, we get a chain
$$v_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \xrightarrow{\,B - 2I\,} v_2 = \begin{pmatrix} -2 \\ 2 \\ -1 \end{pmatrix} \xrightarrow{\,B - 2I\,} \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} ,$$
and completing with an eigenvector $v_3 = \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}$ independent of $v_2$ gives $B = P J P^{-1}$ with
$$P = ( v_2 \;\; v_1 \;\; v_3 ) = \begin{pmatrix} -2 & 1 & 0 \\ 2 & 0 & 1 \\ -1 & 0 & -1 \end{pmatrix} \qquad\text{and}\qquad J = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix} ;$$
alternatively,
$$P = ( v_3 \;\; v_2 \;\; v_1 ) = \begin{pmatrix} 0 & -2 & 1 \\ 1 & 2 & 0 \\ -1 & -1 & 0 \end{pmatrix} \qquad\text{and}\qquad J = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix} .$$
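The different Jordan forms of $A$ and $B$ show up immediately in the nullity sequences (a sketch, with the entries as reconstructed above):

```python
import numpy as np

A = np.array([[ 4.0,  2.0,  1.0],
              [ 2.0,  3.0,  1.0],
              [-6.0, -5.0, -1.0]])
B = np.array([[ 0.0, -4.0, -4.0],
              [ 2.0,  6.0,  4.0],
              [-1.0, -2.0,  0.0]])

for M in (A, B):
    N = M - 2.0 * np.eye(3)
    print([3 - np.linalg.matrix_rank(np.linalg.matrix_power(N, k))
           for k in (1, 2, 3)])
# A: [1, 2, 3] -> a single chain of length 3, J = J_3(2)
# B: [2, 3, 3] -> chains of lengths 2 and 1
```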
The facts we have proved about the dimensions of successive kernels will,
in certain circumstances, make it very easy to find the Jordan form J
of a matrix; though if we want to find the matrix P as well, there will
usually still be a lot of computation to be done.
Examples/exercises.
1. A $12 \times 12$ matrix $A$ has only one eigenvalue $\lambda = 7$. Given that
$$\operatorname{nullity}(A - 7I) = 4 , \quad \operatorname{nullity}(A - 7I)^2 = 7 , \quad \operatorname{nullity}(A - 7I)^3 = 10 , \quad \operatorname{nullity}(A - 7I)^4 = 12 ,$$
we can construct a Jordan basis as follows.

[Diagram: the nested kernels $\ker(A - 7I) \subset \ker(A - 7I)^2 \subset \ker(A - 7I)^3 \subset \ker(A - 7I)^4 = \mathbb{C}^{12}$, with the chain vectors marked in their rings.]

Starting with a vector $v_1$ which is in $\ker(A - 7I)^4$ but not in $\ker(A - 7I)^3$, we obtain a chain of four generalised eigenvectors
$$v_1 \xrightarrow{\,A - 7I\,} v_2 \xrightarrow{\,A - 7I\,} v_3 \xrightarrow{\,A - 7I\,} v_4 .$$
As $v_4$ is an eigenvector of $A$ this chain will go no further, and we construct a new one, starting with a vector $v_5$ which is in $\ker(A - 7I)^4$ but is not a linear combination of $v_1, v_2, v_3, v_4$ and vectors in $\ker(A - 7I)^3$. This gives another chain of four generalised eigenvectors
$$v_5 \xrightarrow{\,A - 7I\,} v_6 \xrightarrow{\,A - 7I\,} v_7 \xrightarrow{\,A - 7I\,} v_8 .$$
Next, starting with a suitable vector $v_9 \in \ker(A - 7I)^3$, we obtain a chain
$$v_9 \xrightarrow{\,A - 7I\,} v_{10} \xrightarrow{\,A - 7I\,} v_{11}$$
of length 3. Finally we have a chain of length 1, that is, a single vector $v_{12} \in \ker(A - 7I)$. These twelve generalised eigenvectors form a basis of $\mathbb{C}^{12}$, and we have $A = P J P^{-1}$, where
$$P = ( v_4 \; v_3 \; v_2 \; v_1 \;\; v_8 \; v_7 \; v_6 \; v_5 \;\; v_{11} \; v_{10} \; v_9 \;\; v_{12} )$$
and $J = J_4(7) \oplus J_4(7) \oplus J_3(7) \oplus J_1(7)$.
To see that $v_1, \ldots, v_8$ are independent, suppose a linear combination of them is zero and apply $(A - 7I)^3$: since many of the vectors in this expression are in $\ker(A - 7I)^3$, we have
$$(A - 7I)^3 (\alpha_1 v_1 + \alpha_5 v_5) = 0 .$$
This means that
$$\alpha_1 v_1 + \alpha_5 v_5 = w$$
for some $w$ in $\ker(A - 7I)^3$; but by our choice of $v_5$, this is possible only if $\alpha_5 = 0$. Hence also $\alpha_1 = 0$, and we have shown that $v_1, \ldots, v_8$ are independent. Finally, note that for our production of chains to work, each ring of the diagram must contain no more vectors than the previous ring; but this is true as a consequence of our earlier result
$$d_m - d_{m-1} \le d_{m-1} - d_{m-2} .$$
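The counting used here is mechanical: the number of chains (blocks) of length at least $k$ is $d_k - d_{k-1}$. A small sketch (not from the notes) turning a stabilised nullity sequence into block sizes:

```python
def block_sizes(nullities):
    """Jordan block sizes for one eigenvalue from the nullities
    d_k of (A - lam I)^k; blocks of size >= k number d_k - d_{k-1}."""
    d = [0] + list(nullities)
    ge = [d[k] - d[k-1] for k in range(1, len(d))]   # blocks >= k
    sizes = []
    for k, count in enumerate(ge, start=1):
        longer = ge[k] if k < len(ge) else 0
        sizes += [k] * (count - longer)
    return sorted(sizes, reverse=True)

print(block_sizes([4, 7, 10, 12]))   # [4, 4, 3, 1]
```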
Examples/exercises, continued.
2. Use these ideas to find with minimal work Jordan forms for the
matrices A and B in example 3 on page 23.
3. Find a Jordan form for the matrix
$$C = \begin{pmatrix} 1 & 0 & 4 & 8 \\ 2 & 8 & 1 & 1 \\ 0 & 4 & 1 & 8 \\ 1 & 2 & 2 & 9 \end{pmatrix} ,$$
given that two of its eigenvalues are 1 and 3.
4. Suppose that $A$ is a $15 \times 15$ matrix with only one eigenvalue $\lambda$, and that the nullities of $(A - \lambda I)^k$ are
$$4 , \; 8 , \; 11$$
for $k = 1, 2, 3$ respectively. Find all possible completions of the sequence of nullities, and hence all possible Jordan forms of $A$.
5. A $13 \times 13$ matrix $A$ is known to have an eigenvalue $\lambda$ with
$$\operatorname{nullity}(A - \lambda I)^k = 3, 5 \quad\text{for } k = 1, 2 ,$$
and an eigenvalue $\mu$ with $\operatorname{nullity}(A - \mu I)^k$ given for $k = 1, 2, 3, 4$. Find all possible Jordan forms of $A$. If $A$ were given and the values of $\lambda$ and $\mu$ were known, how could you decide with minimum calculation which of these Jordan forms is the correct one?
6. Let $A$ be a square matrix and $\lambda$ an eigenvalue of $A$. Explain why all of the following are the same:
the number of Jordan blocks $J_k(\lambda)$ in the Jordan form of $A$;
the number of separate Jordan chains forming a basis for $GE_\lambda$;
the number of independent eigenvectors of $A$ corresponding to eigenvalue $\lambda$;
the dimension of $E_\lambda$;
the geometric multiplicity of $\lambda$;
the nullity of $A - \lambda I$;
the number of parameters in the solution of $(A - \lambda I) v = 0$;
the number of non-leading columns in a row-echelon form of $A - \lambda I$.
7. One more example of finding both $P$ and $J$, with a bit of a hint to reduce the workload: let
$$A = \begin{pmatrix} 3 & 1 & 3 & 1 \\ 8 & 1 & 3 & 4 \\ 1 & 1 & 1 & 1 \\ 1 & 0 & 2 & 3 \end{pmatrix} .$$
Given that 2 and 3 are eigenvalues of $A$, find an invertible matrix $P$ and a Jordan matrix $J$ such that $A = P J P^{-1}$.
Solution. Finding the eigenspaces for $\lambda = 2$ and $\lambda = 3$ by row reducing $A - 2I$ and $A - 3I$, we obtain a two-dimensional eigenspace $E_2$ and a one-dimensional eigenspace $E_3$. Hence $A$ has eigenvalues $2, 2, 3$, and from the trace we find that the fourth eigenvalue is 2 again. Hence there will be three independent generalised eigenvectors corresponding to $\lambda = 2$; since $\dim E_2 = 2$, they must form one Jordan chain of length 2 and one of length 1. Computing $(A - 2I)^2$ (only as much of it as is needed), we choose $v_1$ in $\ker(A - 2I)^2$ but not in $\ker(A - 2I)$, for example $v_1 = (1, 0, 0, 0)^T$, and set
$$v_2 = (A - 2I) v_1 .$$
For $v_3$ we take an eigenvector for $\lambda = 2$ which is not a multiple of $v_2$. To complete the matrix $P$ we take, say, an eigenvector $v_4$ for $\lambda = 3$; then $A = P J P^{-1}$ with $P = ( v_2 \;\; v_1 \;\; v_3 \;\; v_4 )$ and
$$J = J_2(2) \oplus J_1(2) \oplus J_1(3) = \begin{pmatrix} 2 & 1 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix} .$$
Jordan forms and similarity. The following theorem shows that the Jordan form gives a definite test for similarity of two matrices. Compare the determinant, characteristic polynomial and other similarity invariants which we studied in previous chapters: these can be used to show that two matrices are not similar, but can never show that two matrices are similar.
Theorem. The Jordan form is a complete similarity invariant. Two $n \times n$ matrices $A$ and $B$ are similar if and only if they have the same Jordan form, except possibly for a permutation of the Jordan blocks.
Proof. First, suppose that $A$ and $B$ are similar. Then they have the same characteristic polynomial and hence the same eigenvalues with the same algebraic multiplicities (see the lemma on computing eigenvalues, chapter 5, page 5). Moreover, for any $\lambda$ and any $k$ the matrices $(A - \lambda I)^k$ and $(B - \lambda I)^k$ are similar; so they have the same nullity (chapter 2, page 39). But we have seen that these nullities determine the Jordan blocks; so, if we ignore the order of these blocks, $A$ and $B$ have the same Jordan form. Conversely, if $A$ and $B$ have the same Jordan form $J$ (up to a permutation of the blocks), then each is similar to $J$, and hence they are similar to each other.
Example. The matrices
$$A = \begin{pmatrix} 4 & 2 & 1 \\ 2 & 3 & 1 \\ -6 & -5 & -1 \end{pmatrix} \qquad\text{and}\qquad B = \begin{pmatrix} 0 & -4 & -4 \\ 2 & 6 & 4 \\ -1 & -2 & 0 \end{pmatrix}$$
from page 23 share all the similarity invariants that we have discussed up to chapter 5 (rank, nullity, trace, determinant, eigenvalues and so on). But they are not similar, because they have different Jordan forms (top and bottom of page 24).
Jordan forms and the minimal polynomial. Earlier in this chapter
we found some minimal polynomials more or less by trial and error.
Theorem. Jordan forms and the minimal polynomial. Let $A$ be an $n \times n$ matrix; for each eigenvalue $\lambda_k$ of $A$, denote the size of the largest Jordan block $J_b(\lambda_k)$ occurring in the Jordan form of $A$ by $b_k$. Then the minimal polynomial of $A$ is
$$m(z) = (z - \lambda_1)^{b_1} \cdots (z - \lambda_s)^{b_s} .$$
Proof. Note that $b_k$ is the length of the longest Jordan chain corresponding to $\lambda_k$, and that $GE_{\lambda_k} = \ker(A - \lambda_k I)^{b_k}$. So for every $v_k$ in $GE_{\lambda_k}$ we have
$$(A - \lambda_k I)^{b_k} v_k = 0 ;$$
since $(A - \lambda_k I)^{b_k}$ is one of the terms in the product for $m(A)$, and since it commutes with all the other terms, we have
$$m(A) v_k = \cdots (A - \lambda_k I)^{b_k} v_k = 0 .$$
As the generalised eigenvectors span $\mathbb{C}^n$, this shows that $m(A) = 0$.
Conversely, suppose that the factor $z - \lambda_k$ occurs in some annihilating polynomial $f$ with exponent $c_k < b_k$, and write $f(z) = (z - \lambda_k)^{c_k} g(z)$ where
$$g(z) = \sum_{j=0}^{\deg g} \nu_j (z - \lambda_k)^j \quad\text{with } \nu_0 = g(\lambda_k) \neq 0 .$$
Since $c_k < b_k$ we can choose $v$ in a Jordan chain for $\lambda_k$ with $(A - \lambda_k I)^{c_k} v \neq 0$ but $(A - \lambda_k I)^{c_k + 1} v = 0$; then
$$f(A) v = \sum_{j=0}^{\deg g} \nu_j (A - \lambda_k I)^{c_k + j} v = \nu_0 (A - \lambda_k I)^{c_k} v \neq 0 ,$$
and so $f(A)$ is not the zero matrix. This completes the proof.
Corollary. If $A$ is an $n \times n$ matrix, and if $b_1, b_2, \ldots, b_s$ are defined as in the preceding theorem, then
$$\dim \mathbb{C}[A] = b_1 + b_2 + \cdots + b_s \le n .$$
Example. If the matrix $A$ has Jordan form
$$J_5(2) \oplus J_2(2) \oplus J_3(-2) \oplus J_3(-2) \oplus J_2(-2) ,$$
then its minimal polynomial is
$$m(z) = (z - 2)^5 (z + 2)^3 .$$
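Only the largest block for each eigenvalue matters; as a tiny sketch (with the blocks read as in the example above):

```python
def minimal_poly_exponents(blocks):
    """blocks: (eigenvalue, size) pairs of the Jordan form.
    Returns {eigenvalue: exponent in the minimal polynomial}."""
    exps = {}
    for lam, size in blocks:
        exps[lam] = max(size, exps.get(lam, 0))
    return exps

blocks = [(2, 5), (2, 2), (-2, 3), (-2, 3), (-2, 2)]
print(minimal_poly_exponents(blocks))   # {2: 5, -2: 3}
```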
Each Jordan block can be written as
$$J_k(\lambda) = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & \cdots & 0 & \lambda \end{pmatrix} = \lambda I + N \quad\text{with}\quad N = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & \cdots & 0 & 0 \end{pmatrix} .$$
For example, in the case $k = 4$,
$$N = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix} , \quad N^2 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} , \quad N^3 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} , \quad N^4 = 0 :$$
each multiplication by $N$ pushes the diagonal of 1s one step closer to the corner.
Examples.
1. We have
$$J_4(\lambda)^n = (\lambda I + N)^n = \lambda^n I + \binom{n}{1} \lambda^{n-1} N + \binom{n}{2} \lambda^{n-2} N^2 + \binom{n}{3} \lambda^{n-3} N^3 ,$$
the series stopping at this point since $N^4 = 0$. (The binomial expansion
$$(A + B)^n = A^n + \binom{n}{1} A^{n-1} B + \binom{n}{2} A^{n-2} B^2 + \cdots + B^n$$
is valid here because $\lambda I$ and $N$ commute; the coefficients $\binom{n}{r}$ arise exactly as in the scalar case.)
2. Combining this with the direct sum structure, for $J = J_2(7) \oplus J_1(7) \oplus J_1(3)$ we get
$$J^n = \begin{pmatrix} 7^n & n \, 7^{n-1} & 0 & 0 \\ 0 & 7^n & 0 & 0 \\ 0 & 0 & 7^n & 0 \\ 0 & 0 & 0 & 3^n \end{pmatrix} .$$
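A numerical spot-check of this formula (a sketch; the block matrix $J$ is assembled with scipy's block_diag):

```python
import numpy as np
from scipy.linalg import block_diag

J = block_diag([[7.0, 1.0], [0.0, 7.0]], [[7.0]], [[3.0]])

n = 5
closed = block_diag([[7.0**n, n * 7.0**(n-1)], [0.0, 7.0**n]],
                    [[7.0**n]], [[3.0**n]])
print(np.allclose(np.linalg.matrix_power(J, n), closed))   # True
```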
3. For the matrix
$$B = \begin{pmatrix} 0 & -4 & -4 \\ 2 & 6 & 4 \\ -1 & -2 & 0 \end{pmatrix}$$
from page 23 we found
$$P = ( v_2 \;\; v_1 \;\; v_3 ) = \begin{pmatrix} -2 & 1 & 0 \\ 2 & 0 & 1 \\ -1 & 0 & -1 \end{pmatrix} \qquad\text{and}\qquad J = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix} ;$$
therefore
$$B^n = P \begin{pmatrix} 2^n & n \, 2^{n-1} & 0 \\ 0 & 2^n & 0 \\ 0 & 0 & 2^n \end{pmatrix} P^{-1} = 2^{n-1} \begin{pmatrix} 2 - 2n & -4n & -4n \\ 2n & 2 + 4n & 4n \\ -n & -2n & 2 - 2n \end{pmatrix} .$$
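The closed form for $B^n$ can be spot-checked against repeated multiplication (a sketch, again with the reconstructed entries of $B$):

```python
import numpy as np

B = np.array([[ 0.0, -4.0, -4.0],
              [ 2.0,  6.0,  4.0],
              [-1.0, -2.0,  0.0]])

def B_power(n):
    # B^n = 2^(n-1) [[2-2n, -4n, -4n], [2n, 2+4n, 4n], [-n, -2n, 2-2n]]
    return 2.0**(n-1) * np.array([[2 - 2*n, -4*n, -4*n],
                                  [2*n, 2 + 4*n, 4*n],
                                  [-n, -2*n, 2 - 2*n]])

n = 6
print(np.allclose(np.linalg.matrix_power(B, n), B_power(n)))   # True
```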