
7. DIAGONALISATION
§7.1. Diagonal Matrices
The diagonal of a matrix is the line of components that runs down from the top left corner. Its components are of the form $a_{ii}$. If the matrix is square this diagonal will reach the bottom right hand corner.
We shall consider here only square matrices.
A diagonal matrix is a square matrix where every component above or below the diagonal is zero. So $A = (a_{ij})$ is diagonal if $a_{ij} = 0$ whenever $i \neq j$. Some or all of the diagonal components can also be zero. Special examples are the zero matrix and the identity matrix.

So a diagonal matrix has the form
$$\begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{pmatrix}.$$
Sometimes we write this as diag(d1, d2, ..., dn). Clearly the sum, difference and product of two n × n diagonal matrices are again diagonal.

Example 1: If $A = \begin{pmatrix} a_1 & 0 & 0 \\ 0 & a_2 & 0 \\ 0 & 0 & a_3 \end{pmatrix}$ and $B = \begin{pmatrix} b_1 & 0 & 0 \\ 0 & b_2 & 0 \\ 0 & 0 & b_3 \end{pmatrix}$ then
$$A + B = \begin{pmatrix} a_1 + b_1 & 0 & 0 \\ 0 & a_2 + b_2 & 0 \\ 0 & 0 & a_3 + b_3 \end{pmatrix}, \quad A - B = \begin{pmatrix} a_1 - b_1 & 0 & 0 \\ 0 & a_2 - b_2 & 0 \\ 0 & 0 & a_3 - b_3 \end{pmatrix}, \quad AB = \begin{pmatrix} a_1b_1 & 0 & 0 \\ 0 & a_2b_2 & 0 \\ 0 & 0 & a_3b_3 \end{pmatrix}.$$
In particular, if m is a positive integer, the m’th power of A is
$$A^m = \begin{pmatrix} a_1^m & 0 & 0 \\ 0 & a_2^m & 0 \\ 0 & 0 & a_3^m \end{pmatrix}.$$
If each $a_i \neq 0$ then A is invertible and
$$A^{-1} = \begin{pmatrix} a_1^{-1} & 0 & 0 \\ 0 & a_2^{-1} & 0 \\ 0 & 0 & a_3^{-1} \end{pmatrix}.$$

There are many cases where it’s important to be able to find a formula for the m’th power of
an n × n matrix. If the matrix is a diagonal matrix this is easy. But it is possible to do this for most
square matrices by a process called diagonalisation.

Theorem 1: If $A = SDS^{-1}$, where S is invertible and D is diagonal, then $A^m = SD^mS^{-1}$ for every positive integer m.

Proof: $A^m = (SDS^{-1})(SDS^{-1}) \cdots (SDS^{-1})(SDS^{-1})$ (m groups)
$= SD(S^{-1}S)D(S^{-1}S) \cdots (S^{-1}S)D(S^{-1}S)DS^{-1}$
$= SD^mS^{-1}$, since each inner product $S^{-1}S$ is the identity matrix.

There remains the problem, given a square matrix, of finding, if possible, such an invertible S and diagonal D. The technique for doing this involves eigenvalues and eigenvectors.

§7.2. Diagonalisable Matrices
A square matrix A is defined to be diagonalisable if it is similar to a diagonal matrix, that is, if there exist an invertible matrix S and a diagonal matrix D such that $A = SDS^{-1}$.
Are all matrices diagonalisable? No, but most are. An example of a non-diagonalisable matrix is $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, which has a double eigenvalue of 1. If A is similar to a diagonal matrix D then D must be the diagonal matrix with the same eigenvalues. In other words we must have D = I. But if $A = SDS^{-1}$ with D = I then $A = SIS^{-1} = I$, which is clearly not so.

Example 2: If A is an upper-triangular matrix where all the diagonal components are equal then A is not diagonalisable, unless it’s a scalar multiple of the identity matrix.
The eigenvalues of
$$A = \begin{pmatrix} \lambda & * & \cdots & * \\ 0 & \lambda & \cdots & * \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda \end{pmatrix}$$
are all equal to λ. So if $A = SDS^{-1}$ then $D = \lambda I$, in which case $A = \lambda SIS^{-1} = \lambda I$.

Here is a partial list of diagonalisable matrices. It doesn’t cover every diagonalisable matrix, but it will be clear that to be diagonalisable is the rule, not the exception.
(1) n × n matrices with n distinct eigenvalues;
(2) matrices A such that Am = I for some positive integer m;
(3) real symmetric matrices.

If A and S are n × n matrices and the columns of S are eigenvectors for A then S is called an eigenmatrix of A.

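Theorem 2: If $S = (v_1, v_2, \ldots, v_n)$ is an eigenmatrix for A then $AS = SD$, where
$$D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}$$
and $\lambda_i$ is the eigenvalue corresponding to the eigenvector $v_i$.
Proof: $AS = A(v_1, v_2, \ldots, v_n) = (Av_1, Av_2, \ldots, Av_n) = (\lambda_1v_1, \lambda_2v_2, \ldots, \lambda_nv_n)$
$$= (v_1, v_2, \ldots, v_n)\begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} = SD.$$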
Corollary: A matrix is diagonalisable if and only if it has an invertible eigenmatrix.
Proof: Suppose A has an invertible eigenmatrix S. Then AS = SD, as above, and since S is
invertible we may write this as A = SDS−1 = (S−1)−1D(S−1).
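Conversely suppose that A is diagonalisable. Then $A = T^{-1}DT$ for some diagonal matrix D and invertible matrix T. Let $S = T^{-1}$. Then $A = SDS^{-1}$ and so AS = SD.
Let
$$D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}$$
and $S = (v_1, v_2, \ldots, v_n)$.
Then $AS = (Av_1, Av_2, \ldots, Av_n)$ and $SD = (\lambda_1v_1, \lambda_2v_2, \ldots, \lambda_nv_n)$, so $Av_i = \lambda_iv_i$ for each i. Each $v_i$ is non-zero, being a column of an invertible matrix, and so each $v_i$ is an eigenvector for A.

Example 3: Diagonalise the matrix $A = \begin{pmatrix} 4 & 2 \\ -1 & 1 \end{pmatrix}$.
Solution: tr(A) = 5 and |A| = 6.
So $\chi_A(\lambda) = \lambda^2 - 5\lambda + 6 = (\lambda - 2)(\lambda - 3)$.
λ = 2: $A - 2I = \begin{pmatrix} 2 & 2 \\ -1 & -1 \end{pmatrix}$ so a corresponding eigenvector is $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$.
λ = 3: $A - 3I = \begin{pmatrix} 1 & 2 \\ -1 & -2 \end{pmatrix}$ so a corresponding eigenvector is $\begin{pmatrix} 2 \\ -1 \end{pmatrix}$.
Let $S = \begin{pmatrix} 1 & 2 \\ -1 & -1 \end{pmatrix}$ and $D = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}$.
Then $S^{-1} = \begin{pmatrix} -1 & -2 \\ 1 & 1 \end{pmatrix}$.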
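We can check that $A = SDS^{-1}$.

Example 4: Diagonalise the matrix $A = \begin{pmatrix} 7 & 140 & -240 \\ 6 & 65 & -120 \\ 4 & 42 & -78 \end{pmatrix}$.
Solution: $\operatorname{tr}_1(A) = \operatorname{tr}(A) = 7 + 65 - 78 = -6$.
$$\operatorname{tr}_2(A) = \begin{vmatrix} 7 & 140 \\ 6 & 65 \end{vmatrix} + \begin{vmatrix} 7 & -240 \\ 4 & -78 \end{vmatrix} + \begin{vmatrix} 65 & -120 \\ 42 & -78 \end{vmatrix}$$
$$= \begin{vmatrix} 1 & 75 \\ 6 & 65 \end{vmatrix} + \begin{vmatrix} -1 & -84 \\ 4 & -78 \end{vmatrix} + \begin{vmatrix} 23 & -42 \\ 42 & -78 \end{vmatrix}$$
$$= \begin{vmatrix} 1 & 75 \\ 6 & 65 \end{vmatrix} + \begin{vmatrix} -1 & -84 \\ 4 & -78 \end{vmatrix} + \begin{vmatrix} 23 & -42 \\ -4 & 6 \end{vmatrix}$$
$$= 65 - 450 + 78 + 336 + 138 - 168 = -1.$$
$$\operatorname{tr}_3(A) = \begin{vmatrix} 7 & 140 & -240 \\ 6 & 65 & -120 \\ 4 & 42 & -78 \end{vmatrix} = \begin{vmatrix} 1 & 75 & -120 \\ 6 & 65 & -120 \\ 4 & 42 & -78 \end{vmatrix} = \begin{vmatrix} 1 & 75 & -120 \\ 0 & -385 & 600 \\ 0 & -258 & 402 \end{vmatrix} = \begin{vmatrix} -385 & 600 \\ -258 & 402 \end{vmatrix} = 30.$$

So $\chi_A(\lambda) = \lambda^3 + 6\lambda^2 - \lambda - 30$.

By inspection λ = 2 is a zero.
So $\chi_A(\lambda) = (\lambda - 2)(\lambda^2 + 8\lambda + 15) = (\lambda - 2)(\lambda + 3)(\lambda + 5)$.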
Hence the eigenvalues are λ = 2, −3, −5.

λ = 2:
$$A - 2I = \begin{pmatrix} 5 & 140 & -240 \\ 6 & 63 & -120 \\ 4 & 42 & -80 \end{pmatrix} \to \begin{pmatrix} 1 & -77 & 120 \\ 5 & 140 & -240 \\ 4 & 42 & -80 \end{pmatrix} \quad (R_2 - R_1,\ R_1 \leftrightarrow R_2)$$
$$\to \begin{pmatrix} 1 & -77 & 120 \\ 0 & 525 & -840 \\ 0 & 350 & -560 \end{pmatrix} \quad (R_2 - 5R_1,\ R_3 - 4R_1)$$
$$\to \begin{pmatrix} 1 & -77 & 120 \\ 0 & 5 & -8 \\ 0 & 350 & -560 \end{pmatrix} \quad (R_2 \div 105)$$
$$\to \begin{pmatrix} 1 & -77 & 120 \\ 0 & 5 & -8 \\ 0 & 0 & 0 \end{pmatrix} \quad (R_3 - 70R_2)$$
Put z = 5k, 5y = 8(5k) so y = 8k and x = 77(8k) − 120(5k) = 16k.
So $\begin{pmatrix} 16 \\ 8 \\ 5 \end{pmatrix}$ is an eigenvector corresponding to the eigenvalue 2.

λ = −3:
$$A + 3I = \begin{pmatrix} 10 & 140 & -240 \\ 6 & 68 & -120 \\ 4 & 42 & -75 \end{pmatrix} \to \begin{pmatrix} 1 & 14 & -24 \\ 6 & 68 & -120 \\ 4 & 42 & -75 \end{pmatrix} \quad (R_1 \div 10)$$
$$\to \begin{pmatrix} 1 & 14 & -24 \\ 0 & -16 & 24 \\ 0 & -14 & 21 \end{pmatrix} \quad (R_2 - 6R_1,\ R_3 - 4R_1)$$
$$\to \begin{pmatrix} 1 & 14 & -24 \\ 0 & 2 & -3 \\ 0 & -14 & 21 \end{pmatrix} \quad (R_2 \div (-8))$$
$$\to \begin{pmatrix} 1 & 14 & -24 \\ 0 & 2 & -3 \\ 0 & 0 & 0 \end{pmatrix} \quad (R_3 + 7R_2)$$
Put z = 2k, 2y = 3(2k) so y = 3k and x = −14(3k) + 24(2k) = 6k.
So $\begin{pmatrix} 6 \\ 3 \\ 2 \end{pmatrix}$ is an eigenvector corresponding to the eigenvalue −3.

λ = −5:
$$A + 5I = \begin{pmatrix} 12 & 140 & -240 \\ 6 & 70 & -120 \\ 4 & 42 & -73 \end{pmatrix} \to \begin{pmatrix} 3 & 35 & -60 \\ 6 & 70 & -120 \\ 4 & 42 & -73 \end{pmatrix} \quad (R_1 \div 4)$$
$$\to \begin{pmatrix} 1 & 7 & -13 \\ 6 & 70 & -120 \\ 3 & 35 & -60 \end{pmatrix} \quad (R_3 - R_1,\ R_1 \leftrightarrow R_3)$$
$$\to \begin{pmatrix} 1 & 7 & -13 \\ 0 & 28 & -42 \\ 0 & 14 & -21 \end{pmatrix} \quad (R_2 - 6R_1,\ R_3 - 3R_1)$$
$$\to \begin{pmatrix} 1 & 7 & -13 \\ 0 & 2 & -3 \\ 0 & 0 & 0 \end{pmatrix} \quad (R_2 \div 14,\ R_3 - 7R_2)$$
Put z = 2k, 2y = 3(2k) so y = 3k and x = −7(3k) + 13(2k) = 5k.
So $\begin{pmatrix} 5 \\ 3 \\ 2 \end{pmatrix}$ is an eigenvector corresponding to the eigenvalue −5.

16 6 5  2 0 0 
   
Take S =  8 3 3  and D =  0 − 3 0  . Then A = SDS−1. We can check this by performing
 5 2 2  0 0 − 5
   
the appropriate multiplication.

 3 − 22 18 
 
Example 5: Show that A =  3 − 14 9  is not diagonalisable.
2 −8 4 
 
Solution: We begin by working out the characteristic polynomial.
tr(A) = 3 − 14 + 4 = −7.
3 − 22 3 18 − 14 9
tr2(A) = + +
3 − 14 2 4 −8 4
= − 42 + 66 + 12 − 36 − 56 + 72
= 16.
3 − 22 18 3 − 22 18
|A| = 3 − 14 9 = 0 8 −9
2 −8 4 2 −8 4
= 3(32 − 72) + 2(198 − 144)
= − 120 + 108
= − 12.
Hence χA(λ) = λ3 + 7λ2 + 16λ + 12
= (λ + 2)(λ + 2)(λ + 3).
[We factorise this by trying certain integer values until we discover that λ = −2 is a zero.
Hence λ + 2 is a factor. We then divide χA(λ) by λ + 2 and solve the resulting quadratic.]
So the eigenvalues are λ = −2 (twice) and λ = −3.
 6 − 22 18  1 − 3 2 
   
λ = − 3: A + 3I =  3 − 11 9  →  2 − 8 7 
2 −8 7   6 − 22 18 
   
1 − 3 2
 
→ 0 − 2 3
0 − 4 6
 

117
1 − 3 2 
 
→  0 2 − 3 .
0 0 0 

5
 
Hence  3  is an eigenvector.
 2
 
 5 − 22 18  1 − 4 3   1 − 4 3 1 − 4 3 
       
λ = − 2: A + 2I =  3 − 12 9  →  2 − 8 6  →  0 0 0  →  0 2 − 3  .
2 −8 6   5 − 22 18   0 − 2 3 0 0 0 
      
6
 
Hence  3  is an eigenvector.
 2
 
If A was diagonalisable there would have been two eigenvectors for λ = − 2 that are not multiples of
one another.
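The argument in Example 5 can be checked mechanically: the number of independent eigenvectors for an eigenvalue λ is n minus the rank of A − λI. The following is a minimal sketch in Python using exact rational arithmetic. The helper names `rank` and `shifted` are illustrative and do not come from the text.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0  # number of pivot rows found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def shifted(M, lam):
    """Return M - lam*I."""
    n = len(M)
    return [[M[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]

A = [[3, -22, 18], [3, -14, 9], [2, -8, 4]]
n = len(A)
dim_minus2 = n - rank(shifted(A, -2))  # independent eigenvectors for lambda = -2
dim_minus3 = n - rank(shifted(A, -3))  # independent eigenvectors for lambda = -3
print(dim_minus2, dim_minus3)
```

If the two dimensions add up to less than 3, no invertible eigenmatrix exists, which is the conclusion reached above.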

The steps involved in diagonalisation are as follows.

DIAGONALISATION OF AN n × n MATRIX A
(1) Compute $\operatorname{tr}(A), \operatorname{tr}_2(A), \ldots, \operatorname{tr}_{n-1}(A)$ and |A|.
(2) Write down $\chi_A(\lambda) = \lambda^n - \operatorname{tr}(A)\lambda^{n-1} + \operatorname{tr}_2(A)\lambda^{n-2} - \cdots + (-1)^{n-1}\operatorname{tr}_{n-1}(A)\lambda + (-1)^n|A|$.
(3) Solve $\chi_A(\lambda) = 0$ to find the eigenvalues λ1, λ2, ..., λn.
(4) For each eigenvalue find as many independent eigenvectors as possible.
(5) If you can obtain an invertible eigenmatrix S (columns are eigenvectors) then A is
diagonalisable. (Otherwise it is not diagonalisable.)
(6) Let D = diag(λ1, λ2, ..., λn), where λi is the eigenvalue corresponding to the i’th column of S.
(7) (OPTIONAL) Find $S^{-1}$ and check that $A = SDS^{-1}$.
NOTE: The columns of S can be in any order so long as the components of D are in
the same order.
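Step (7) is easy to automate. The sketch below checks the diagonalisation found in Example 3 with exact arithmetic. It assumes S and D have already been found by hand, and the helpers `matmul`, `diag` and `inverse_2x2` are illustrative names rather than anything defined in the text.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def diag(values):
    """Diagonal matrix with the given diagonal components."""
    n = len(values)
    return [[values[i] if i == j else 0 for j in range(n)] for i in range(n)]

def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix by the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[4, 2], [-1, 1]]
S = [[1, 2], [-1, -1]]   # columns are the eigenvectors found in Example 3
D = diag([2, 3])         # eigenvalues in the same order as the columns of S
S_inv = inverse_2x2(S)

print(matmul(A, S) == matmul(S, D))        # AS = SD
print(matmul(matmul(S, D), S_inv) == A)    # A = S D S^{-1}
```

Checking AS = SD first is a cheap way to catch a wrong eigenvector before any inverse is computed.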

§7.3. Powers of Matrices


Computing integer powers of a matrix is just a matter of applying matrix multiplication to the matrix or its inverse. But if n is large this can involve an enormous amount of work. And often we require a formula for $A^n$ in terms of n. Sometimes we’re interested in the behaviour of $A^n$ as n approaches infinity.

If D is a diagonal matrix this problem is simple. If
$$D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} \quad\text{then}\quad D^m = \begin{pmatrix} \lambda_1^m & 0 & \cdots & 0 \\ 0 & \lambda_2^m & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^m \end{pmatrix}.$$
If A is diagonalisable, powers of A can be computed in terms of the diagonal matrix of eigenvalues.

Theorem 3: If $A = SDS^{-1}$ then $A^m = SD^mS^{-1}$.

Proof: $A^m = (SDS^{-1})^m = SD(S^{-1}S)D(S^{-1}S) \cdots (S^{-1}S)DS^{-1} = SD^mS^{-1}$.

Example 6: If $A = \begin{pmatrix} 7 & 140 & -240 \\ 6 & 65 & -120 \\ 4 & 42 & -78 \end{pmatrix}$, find $A^5$ and $A^{-4}$.
Solution: In Example 4 we showed that A is diagonalisable, and that $A = SDS^{-1}$ where
$S = \begin{pmatrix} 16 & 6 & 5 \\ 8 & 3 & 3 \\ 5 & 2 & 2 \end{pmatrix}$ and $D = \begin{pmatrix} 2 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & -5 \end{pmatrix}$. We must first find $S^{-1}$. Since S is only a 3 × 3 matrix it is just as easy to use the cofactor method.
$$\operatorname{adj}(S) = \begin{pmatrix} 0 & -1 & 1 \\ -2 & 7 & -2 \\ 3 & -8 & 0 \end{pmatrix}^T = \begin{pmatrix} 0 & -2 & 3 \\ -1 & 7 & -8 \\ 1 & -2 & 0 \end{pmatrix} \quad\text{and}\quad |S| = \begin{vmatrix} 16 & 6 & 5 \\ 8 & 3 & 3 \\ 5 & 2 & 2 \end{vmatrix} = 16(0) - 6(1) + 5(1) = -1.$$
Hence $S^{-1} = \begin{pmatrix} 0 & 2 & -3 \\ 1 & -7 & 8 \\ -1 & 2 & 0 \end{pmatrix}$.
So $A = SDS^{-1}$ and so
$$A^5 = SD^5S^{-1} = \begin{pmatrix} 16 & 6 & 5 \\ 8 & 3 & 3 \\ 5 & 2 & 2 \end{pmatrix}\begin{pmatrix} 32 & 0 & 0 \\ 0 & -243 & 0 \\ 0 & 0 & -3125 \end{pmatrix}\begin{pmatrix} 0 & 2 & -3 \\ 1 & -7 & 8 \\ -1 & 2 & 0 \end{pmatrix}$$
$$= \begin{pmatrix} 512 & -1458 & -15625 \\ 256 & -729 & -9375 \\ 160 & -486 & -6250 \end{pmatrix}\begin{pmatrix} 0 & 2 & -3 \\ 1 & -7 & 8 \\ -1 & 2 & 0 \end{pmatrix}$$
$$= \begin{pmatrix} 14167 & -20020 & -13200 \\ 8646 & -13135 & -6600 \\ 5764 & -8778 & -4368 \end{pmatrix}.$$
To compute $A^{-4}$ we could find the inverse of A and then raise it to the 4th power, but it’s simpler to do it directly. This time, instead of computing it as a rational matrix we will work in decimals, rounding the components of $D^{-4}$ to 4 decimal places, so the final components are only approximate.
$$A^{-4} = SD^{-4}S^{-1} \approx \begin{pmatrix} 16 & 6 & 5 \\ 8 & 3 & 3 \\ 5 & 2 & 2 \end{pmatrix}\begin{pmatrix} 0.0625 & 0 & 0 \\ 0 & 0.0123 & 0 \\ 0 & 0 & 0.0016 \end{pmatrix}\begin{pmatrix} 0 & 2 & -3 \\ 1 & -7 & 8 \\ -1 & 2 & 0 \end{pmatrix}$$
$$= \begin{pmatrix} 1 & 0.0738 & 0.008 \\ 0.5 & 0.0369 & 0.0048 \\ 0.3125 & 0.0246 & 0.0032 \end{pmatrix}\begin{pmatrix} 0 & 2 & -3 \\ 1 & -7 & 8 \\ -1 & 2 & 0 \end{pmatrix}$$
$$= \begin{pmatrix} 0.0658 & 1.4994 & -2.4096 \\ 0.0321 & 0.7513 & -1.2048 \\ 0.0214 & 0.4592 & -0.7407 \end{pmatrix}.$$

Suppose A(x) is a matrix where each component is a function of x. We define the limit of A(x) as x approaches a real number or ±∞ to be the matrix whose components are the respective limits, provided these limits exist.

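Example 7: (i) $\displaystyle \lim_{x \to 0} \begin{pmatrix} \dfrac{\sin x}{x} & x \\ \tan x & 2\cos x \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$,
(ii) $\displaystyle \lim_{x \to \infty} \begin{pmatrix} 1 - \dfrac{1}{x} & 2 + e^{-x} \\ 3 - \dfrac{1}{x^2} & \left(\dfrac{1}{2}\right)^x \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 3 & 0 \end{pmatrix}$.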

Since the limit of a sum, difference and product of functions is the sum, difference and product of the limits, and matrix addition, subtraction and multiplication can be expressed in terms of these operations on the components, the limit of a sum, difference and product of matrices is the sum, difference and product of the respective limits.
If each $\lambda_i$ satisfies $-1 < \lambda_i \le 1$ then
$$\begin{pmatrix} \lambda_1^m & 0 & \cdots & 0 \\ 0 & \lambda_2^m & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^m \end{pmatrix}$$
converges as m → ∞ and the limit is a diagonal matrix of 1’s and 0’s where the 1’s on the diagonal correspond to the $\lambda_i$ that are equal to 1.

Example 8: $\begin{pmatrix} 0.9^n & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & (-0.3)^n \end{pmatrix}$ approaches $\begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ as n → ∞.

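If every eigenvalue λ of a diagonalisable matrix A satisfies −1 < λ ≤ 1 then $A^n$ converges to a limit as n → ∞.

Example 9: (i) Show that if $A = \begin{pmatrix} -1.5 & 0.5 \\ -5 & 2 \end{pmatrix}$ then $A^n$ converges as n → ∞.
(ii) Find the limit.
Solution: (i)
$$|\lambda I - A| = \begin{vmatrix} \lambda + 1.5 & -0.5 \\ 5 & \lambda - 2 \end{vmatrix} = (\lambda + 1.5)(\lambda - 2) + 2.5 = \lambda^2 - 0.5\lambda - 0.5 = (\lambda - 1)(\lambda + 0.5).$$
So the eigenvalues of A are 1, −0.5.
Since these are in the interval (−1, 1] the matrix $A^n$ converges as n → ∞.

(ii) To find the limit we must diagonalise.
λ = 1: $A - I = \begin{pmatrix} -2.5 & 0.5 \\ -5 & 1 \end{pmatrix} \to \begin{pmatrix} -5 & 1 \\ 0 & 0 \end{pmatrix}$ so $\begin{pmatrix} 1 \\ 5 \end{pmatrix}$ is an eigenvector for λ = 1.
λ = −0.5: $A + 0.5I = \begin{pmatrix} -1 & 0.5 \\ -5 & 2.5 \end{pmatrix} \to \begin{pmatrix} 2 & -1 \\ 0 & 0 \end{pmatrix}$ so $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$ is an eigenvector for λ = −0.5.
Let $S = \begin{pmatrix} 1 & 1 \\ 5 & 2 \end{pmatrix}$ and $D = \begin{pmatrix} 1 & 0 \\ 0 & -0.5 \end{pmatrix}$.
Then $A = SDS^{-1}$ and $A^n = SD^nS^{-1}$.
Since $D^n \to \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ as n → ∞, $A^n \to S\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}S^{-1}$.
Now $S^{-1} = -\dfrac{1}{3}\begin{pmatrix} 2 & -1 \\ -5 & 1 \end{pmatrix}$ so
$$\lim_{n \to \infty} A^n = -\frac{1}{3}\begin{pmatrix} 1 & 1 \\ 5 & 2 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 2 & -1 \\ -5 & 1 \end{pmatrix} = -\frac{1}{3}\begin{pmatrix} 1 & 0 \\ 5 & 0 \end{pmatrix}\begin{pmatrix} 2 & -1 \\ -5 & 1 \end{pmatrix} = -\frac{1}{3}\begin{pmatrix} 2 & -1 \\ 10 & -5 \end{pmatrix} = \begin{pmatrix} -2/3 & 1/3 \\ -10/3 & 5/3 \end{pmatrix}.$$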

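Fractional powers can be found for a diagonalisable matrix in the same way.

Example 10: If $A = \begin{pmatrix} 4 & 2 \\ -1 & 1 \end{pmatrix}$, find $A^{1/2}$, if one exists.
Solution: If $S = \begin{pmatrix} 1 & 2 \\ -1 & -1 \end{pmatrix}$ and $D = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}$ then $A = SDS^{-1}$. [We omit the details.]
Then $A^{1/2} = SD^{1/2}S^{-1}$. Now $S^{-1} = \begin{pmatrix} -1 & -2 \\ 1 & 1 \end{pmatrix}$.
$$\therefore A^{1/2} = \begin{pmatrix} 1 & 2 \\ -1 & -1 \end{pmatrix}\begin{pmatrix} \sqrt{2} & 0 \\ 0 & \sqrt{3} \end{pmatrix}\begin{pmatrix} -1 & -2 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} \sqrt{2} & 2\sqrt{3} \\ -\sqrt{2} & -\sqrt{3} \end{pmatrix}\begin{pmatrix} -1 & -2 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2\sqrt{3} - \sqrt{2} & 2\sqrt{3} - 2\sqrt{2} \\ \sqrt{2} - \sqrt{3} & 2\sqrt{2} - \sqrt{3} \end{pmatrix}.$$
In fact, there are two square roots for each eigenvalue, giving 4 square roots of A. These are obtained by replacing √2 by ±√2 and √3 by ±√3. Thus the four square roots of A are
$$\pm\begin{pmatrix} 2\sqrt{3} - \sqrt{2} & 2\sqrt{3} - 2\sqrt{2} \\ \sqrt{2} - \sqrt{3} & 2\sqrt{2} - \sqrt{3} \end{pmatrix} \quad\text{and}\quad \pm\begin{pmatrix} 2\sqrt{3} + \sqrt{2} & 2\sqrt{3} + 2\sqrt{2} \\ -\sqrt{2} - \sqrt{3} & -2\sqrt{2} - \sqrt{3} \end{pmatrix}.$$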

An n × n matrix with distinct eigenvalues may have as many as $2^n$ square roots, especially over ℂ. However if there are repeated eigenvalues there can be just two, or infinitely many.

Example 11: Find the square roots of the identity matrix.
Solution: Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and suppose that $A^2 = I$.
Then $a^2 + bc = d^2 + bc = 1$ and $b(a + d) = c(a + d) = 0$.
Case 1: a + d = 0: Then $a^2 + bc = 1$.
If c = 0 then a = ±1 and this gives the square roots $\pm\begin{pmatrix} 1 & k \\ 0 & -1 \end{pmatrix}$ for all k.
If c ≠ 0 then $b = \dfrac{1 - a^2}{c}$ and this gives $\begin{pmatrix} a & \dfrac{1 - a^2}{c} \\ c & -a \end{pmatrix}$ for all a and all c ≠ 0.
Case 2: a + d ≠ 0: Then b = c = 0 and $a^2 = d^2 = 1$, so a = d = ±1. This gives the square roots $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$. (The diagonal square roots $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ and $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$ have a + d = 0 and arise in Case 1 with k = 0.)

Example 12: Find the square roots of $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$.
Solution: Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and suppose that $A^2 = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$.
Then $a^2 + bc = d^2 + bc = b(a + d) = 1$ and $c(a + d) = 0$.
Hence a + d ≠ 0 and so c = 0.
Since $a^2 = d^2 = 1$ and a + d ≠ 0 we can only have a = d = 1 or a = d = −1.
In the first case b = ½ and in the second, b = −½.
So $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ only has 2 square roots: $\begin{pmatrix} 1 & 1/2 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} -1 & -1/2 \\ 0 & -1 \end{pmatrix}$.

§7.4. Recurrence Equations


A sequence is an ordered list of numbers or functions a1, a2, ....... (Sometimes the list starts
with a0 instead of a1.) Although a sequence can be completely random it only becomes interesting
mathematically if there is some rule or pattern that describes the list. One type of rule is where we
have a formula for the n’th term. Another type of rule is where each term is defined in terms of the
previous term, or a certain number of previous terms. Such a rule is called a recurrence equation.

Example 13: an = n2 describes the sequence 1, 4, 9, 16, ...


an = 3an−1 describes the sequence 1, 3, 9, 27, ... It also describes the sequence 2, 6, 18, 54, ...

As the above example shows, a recurrence equation does not define the sequence uniquely.
We must be given one or more terms explicitly before the recurrence equation can start.

Example 14: There is a famous sequence, called the Fibonacci sequence, that is described as
follows:
$a_{n+2} = a_{n+1} + a_n$;
$a_0 = 1$;
$a_1 = 1$.
The sequence begins as follows: 1, 1, 2, 3, 5, 8, 13, ...

Now it’s preferable to have an explicit formula for the n’th term than to have a recurrence equation. If $a_n = n^2$ it is easy to see that $a_{1000}$ is 1,000,000. To find the 1000th term of the Fibonacci sequence we appear to have to compute the sequence all the way up to $a_{1000}$.
Also, if we have an explicit formula, we can make general statements about the sequence, such as whether it converges.

If we have the sequence defined by $a_n = 3a_{n-1}$, $a_1 = 2$, we can obtain an explicit formula for $a_n$. This is a geometric progression (GP) with first term 2 and common ratio 3. The n’th term is $a_n = 2 \times 3^{n-1}$.

A linear recurrence is one of the form an+k = c1an+k−1 + c2an+k−2 + ... + ckan where the ci’s
are constant. The order of this recurrence is k. For example the Fibonacci sequence is a second
order linear recurrence. Using matrices it’s possible to find an explicit formula for an in terms of n
in such cases. The technique involves considering a more general situation.

Example 15: Suppose $a_0, a_1, \ldots$ and $b_0, b_1, \ldots$ are two sequences where
$a_{n+1} = 4a_n + 2b_n$;
$b_{n+1} = -a_n + b_n$;
$a_0 = 4$, $b_0 = 7$.
Find explicit expressions for $a_n$ and $b_n$ in terms of n.
Solution: We need to generate both sequences concurrently.
The first sequence is 4, 30, 126, 450, ...
The second sequence is 7, 3, −27, −153, ...
But suppose we define $v_n = \begin{pmatrix} a_n \\ b_n \end{pmatrix}$. Instead of two sequences of numbers we have a single sequence of vectors. Moreover the recurrence equations can be expressed as a single matrix/vector equation
$$\begin{pmatrix} a_{n+1} \\ b_{n+1} \end{pmatrix} = \begin{pmatrix} 4 & 2 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} a_n \\ b_n \end{pmatrix}.$$
If we put $A = \begin{pmatrix} 4 & 2 \\ -1 & 1 \end{pmatrix}$ then $v_{n+1} = Av_n$. Remarkably the pair of intertwined sequences becomes not only a single sequence, but a very simple one at that. It is simply a geometric sequence with common ratio A.
It doesn’t matter that A is a matrix and not just a number. The formula for the n’th term still works: $v_n = A^nv_0$. Here $v_0 = \begin{pmatrix} a_0 \\ b_0 \end{pmatrix} = \begin{pmatrix} 4 \\ 7 \end{pmatrix}$. All we need to do is to diagonalise the matrix and hence find $A^n$ in terms of n.
We can diagonalise A and get $A = SDS^{-1}$ where $S = \begin{pmatrix} 1 & 2 \\ -1 & -1 \end{pmatrix}$, $D = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}$ and $S^{-1} = \begin{pmatrix} -1 & -2 \\ 1 & 1 \end{pmatrix}$.
$$\text{So } A^nv_0 = \begin{pmatrix} 1 & 2 \\ -1 & -1 \end{pmatrix}\begin{pmatrix} 2^n & 0 \\ 0 & 3^n \end{pmatrix}\begin{pmatrix} -1 & -2 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 4 \\ 7 \end{pmatrix} = \begin{pmatrix} 2^n & 2 \times 3^n \\ -2^n & -3^n \end{pmatrix}\begin{pmatrix} -18 \\ 11 \end{pmatrix} = \begin{pmatrix} -18 \times 2^n + 22 \times 3^n \\ 18 \times 2^n - 11 \times 3^n \end{pmatrix}.$$
Hence $a_n = 22 \times 3^n - 18 \times 2^n$ and
$b_n = 18 \times 2^n - 11 \times 3^n$.
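A quick way to gain confidence in the formulas of Example 15 is to generate both sequences directly from the recurrence and compare them with the closed forms. The sketch below does this for the first 20 terms. The function names `iterate` and `closed_form` are illustrative.

```python
def iterate(n):
    """Return (a_n, b_n) by applying the recurrence n times from a_0 = 4, b_0 = 7."""
    a, b = 4, 7
    for _ in range(n):
        a, b = 4 * a + 2 * b, -a + b
    return a, b

def closed_form(n):
    """Return (a_n, b_n) from the explicit formulas obtained by diagonalisation."""
    return 22 * 3**n - 18 * 2**n, 18 * 2**n - 11 * 3**n

print([iterate(n) for n in range(4)])
print(all(iterate(n) == closed_form(n) for n in range(20)))
```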

To solve a higher order recurrence $u_{n+k} = a_1u_{n+k-1} + a_2u_{n+k-2} + \cdots + a_{k-1}u_{n+1} + a_ku_n$ we turn it into a first order system by defining
$$v_n = \begin{pmatrix} u_{n+k-1} \\ u_{n+k-2} \\ \vdots \\ u_n \end{pmatrix}. \quad\text{Then}\quad v_{n+1} = \begin{pmatrix} a_1 & a_2 & \cdots & a_{k-1} & a_k \\ 1 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}v_n.$$

Theorem 4: If the k × k matrix A is diagonalisable with eigenvalues λ1, λ2, ..., λk, the components of $A^n$, and of $A^nv$ for any v, have the form $C_1\lambda_1^n + C_2\lambda_2^n + \cdots + C_k\lambda_k^n$ for constants C1, C2, ..., Ck.

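Example 16: Solve the recurrence
$$u_{n+3} = 6u_{n+2} - 11u_{n+1} + 6u_n, \quad u_0 = 1, \quad u_1 = 2, \quad u_2 = 3.$$
Solution: Let $R = \begin{pmatrix} 6 & -11 & 6 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$.
tr(R) = 6, tr2(R) = 11 and |R| = 6.
Hence $\chi_R(\lambda) = \lambda^3 - 6\lambda^2 + 11\lambda - 6 = (\lambda - 1)(\lambda - 2)(\lambda - 3)$.
So the eigenvalues are λ = 1, 2, 3.
Hence $u_n = A + B2^n + C3^n$ for some A, B, C. Putting n = 0, 1, 2 gives three linear equations for A, B, C:
$$\left(\begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 2 \\ 1 & 4 & 9 & 3 \end{array}\right) \to \left(\begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 1 \\ 0 & 3 & 8 & 2 \end{array}\right) \to \left(\begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & 2 & -1 \end{array}\right).$$
Hence $C = -\frac{1}{2}$, $B = 2$, $A = -\frac{1}{2}$.
So $u_n = \dfrac{-1 + 2^{n+2} - 3^n}{2}$ for all n.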
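The answer to Example 16 can be checked by stepping the first order system $v_{n+1} = Rv_n$ and comparing with the explicit formula. The following is a small sketch under the convention $v_n = (u_{n+2}, u_{n+1}, u_n)$, so that the top component of each new vector is the next term. The names `step` and `u_closed` are illustrative.

```python
R = [[6, -11, 6], [1, 0, 0], [0, 1, 0]]

def step(v):
    """Apply the companion matrix R to the column vector v."""
    return [sum(r * x for r, x in zip(row, v)) for row in R]

def u_closed(n):
    """Explicit formula u_n = (-1 + 2^(n+2) - 3^n) / 2; the numerator is always even."""
    return (-1 + 2 ** (n + 2) - 3 ** n) // 2

terms = [1, 2, 3]          # u_0, u_1, u_2
v = [3, 2, 1]              # v_0 = (u_2, u_1, u_0)
for _ in range(20):
    v = step(v)            # v is now (u_{n+3}, u_{n+2}, u_{n+1})
    terms.append(v[0])

print(terms[:6])
print(all(terms[n] == u_closed(n) for n in range(len(terms))))
```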

§7.5. Probability Matrices


A probability matrix is a square matrix where the entries are non-negative real numbers
and the total of each row is 1. Suppose a system has n possible states S1, S2, ..., Sn and at any instant
it is in one of these states. We monitor the system over time using discrete time intervals, so that if
n is a natural number we can say that at time n the system is in state Sr.
The initial state is S0. If the system is in state Si at time n it might be in state Sj at time
n + 1. Denote the probability of this occurring by pij and suppose this is independent of n.
Then P = (pij) is called the transition matrix of the system. Since pi1 + pi2 + ... + pin is the
probability of going somewhere from state i at the next step, clearly this is 1 and P is a probability
matrix.

Example 17: The “system” might be the age structure of the population with Sr being the state of being aged r years. We would take one year as the unit of time and prs would be the probability of someone aged r years being s years old one year later. If it wasn’t for the problem of death this transition matrix would be the infinite matrix
$$\begin{pmatrix} 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & 1 & 0 & \cdots \\ 0 & 0 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}.$$
Because of death we need only go up to state S100, which can include all those 100 or over, and a state S101 for “dead”.
Because of mortality at all ages, especially infant mortality, the 1’s just above the diagonal will be slightly reduced.

The following is a typical transition matrix for the age structure of a population, expressed as a table, rather than as a matrix, to simplify layout.

AGE     0      1      2      3      .....  99     ≥ 100  dead
0       0      0.95   0      0      .....  0      0      0.05
1       0      0      0.99   0      .....  0      0      0.01
2       0      0      0      0.997  .....  0      0      0.003
.....   .....  .....  .....  .....  .....  .....  .....  .....
99      0      0      0      0      .....  0      0.66   0.34
≥ 100   0      0      0      0      .....  0      0.5    0.5
dead    0      0      0      0      .....  0      0      1

Here the first row and column correspond to state S0, those in their first year of life. The
first row reflects the fact that 5% of those in their first year of life fail to reach their first birthday.
(This is obviously a third world country!)
The first column of all zeros reflects the fact that nobody, whatever their age, will be in their
first year of life next year. Here we are taking a fixed cohort of people and are not considering
births.
The last row and column represent state S101, the state of being dead. Clearly all of these
remain dead after a further year. You can see that the death rate drops to 1% for those aged 1 and
even further to 0.3% for those aged 2.
Of those aged 99 this year, 34% will die within the year with the remaining 66% moving
into the “100 and over” group.

If $P = (p_{ij})$ is an n × n probability matrix and the vector $v^{(k)} = (x_1^{(k)}, x_2^{(k)}, \ldots, x_n^{(k)})$ is the row vector giving the numbers in each state at time k then the expected number in state j one time interval later is
$p_{1j}x_1^{(k)} + p_{2j}x_2^{(k)} + \cdots + p_{nj}x_n^{(k)}$, which is the j’th component of $v^{(k)}P$.

Hence v(k+1) = v(k)P and so v(k) = v(0)Pk. This shows the need to be able to find a formula for the k’th
power of a matrix.
A matrix of size 102 × 102 as in the above example would require a computer to analyse.
The following example has only three states and lends itself to manual calculation.

Example 18: A robot vacuum cleaner moves about the floor randomly, picking up dust. If it reaches a doorway it will go through into the next room. Suppose we have 9 rooms as follows.

[Figure: floor plan of the 9 rooms, with four corner rooms, four “edge” rooms and one middle room.]

For each door let p be the probability that the robot will move through that door in a 10 minute interval. We assume that the probability that it will move through a second door in that time interval is negligible.
We have 9 rooms and could therefore potentially have to deal with a 9 × 9 matrix. However by symmetry we can combine the four corner rooms into one state and the four “edge” rooms also into one state.
Let S1 be the state of being in one of the corner rooms, S2 the state of being in one of the edge rooms and S3 the state of being in the middle room. The transition matrix is
$$P = \begin{pmatrix} 1 - 2p & 2p & 0 \\ 2p & 1 - 3p & p \\ 0 & 4p & 1 - 4p \end{pmatrix}.$$
Let’s suppose that p = 0.1. Then $P = \begin{pmatrix} 0.8 & 0.2 & 0 \\ 0.2 & 0.7 & 0.1 \\ 0 & 0.4 & 0.6 \end{pmatrix}$.
If we start with such a robot vacuum cleaner in a corner room we have $v^{(0)} = (1, 0, 0)$ and so the expected distribution after 10 minutes is
$$(1\ \ 0\ \ 0)\begin{pmatrix} 0.8 & 0.2 & 0 \\ 0.2 & 0.7 & 0.1 \\ 0 & 0.4 & 0.6 \end{pmatrix} = (0.8\ \ 0.2\ \ 0).$$
After 20 minutes the distribution is
$$(1\ \ 0\ \ 0)P^2 = (0.8\ \ 0.2\ \ 0)\begin{pmatrix} 0.8 & 0.2 & 0 \\ 0.2 & 0.7 & 0.1 \\ 0 & 0.4 & 0.6 \end{pmatrix} = (0.68\ \ 0.3\ \ 0.02).$$
To determine the distribution after 24 hours would require a very long calculation. Fortunately we can find an explicit formula for $P^{144}$, or any power of P for that matter.

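First we find the characteristic polynomial.
$\operatorname{tr}_1(P) = 2.1$,
$$\operatorname{tr}_2(P) = \begin{vmatrix} 0.7 & 0.1 \\ 0.4 & 0.6 \end{vmatrix} + \begin{vmatrix} 0.8 & 0 \\ 0 & 0.6 \end{vmatrix} + \begin{vmatrix} 0.8 & 0.2 \\ 0.2 & 0.7 \end{vmatrix} = 0.42 - 0.04 + 0.48 + 0.56 - 0.04 = 1.38,$$
$$\operatorname{tr}_3(P) = \begin{vmatrix} 0.8 & 0.2 & 0 \\ 0.2 & 0.7 & 0.1 \\ 0 & 0.4 & 0.6 \end{vmatrix} = 0.8(0.42 - 0.04) - 0.2(0.12) = 0.28.$$
So $\chi_P(\lambda) = \lambda^3 - 2.1\lambda^2 + 1.38\lambda - 0.28$.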

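Solving a cubic can be tricky, but fortunately we know that one of the eigenvalues is 1. How do we know this? Because the rows total 1 and so $\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$ is an eigenvector with corresponding eigenvalue 1.
Dividing $\chi_P(\lambda)$ by λ − 1 gives $\lambda^2 - 1.1\lambda + 0.28$.
Solving by the quadratic formula we get zeros of 0.7 and 0.4. So the eigenvalues are 1, 0.7 and 0.4.
λ = 1: We already know that $\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$ is a corresponding eigenvector.
λ = 0.7:
$$P - 0.7I = \begin{pmatrix} 0.1 & 0.2 & 0 \\ 0.2 & 0 & 0.1 \\ 0 & 0.4 & -0.1 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 0 \\ 0.2 & 0 & 0.1 \\ 0 & 0.4 & -0.1 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 0 \\ 0 & -0.4 & 0.1 \\ 0 & 0.4 & -0.1 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 0 \\ 0 & 4 & -1 \\ 0 & 0 & 0 \end{pmatrix}.$$
So $\begin{pmatrix} -2 \\ 1 \\ 4 \end{pmatrix}$ is an eigenvector for λ = 0.7.
λ = 0.4:
$$P - 0.4I = \begin{pmatrix} 0.4 & 0.2 & 0 \\ 0.2 & 0.3 & 0.1 \\ 0 & 0.4 & 0.2 \end{pmatrix} \to \begin{pmatrix} 0.2 & 0.3 & 0.1 \\ 0 & -0.4 & -0.2 \\ 0 & 0.4 & 0.2 \end{pmatrix} \to \begin{pmatrix} 2 & 3 & 1 \\ 0 & 2 & 1 \\ 0 & 0 & 0 \end{pmatrix}$$
so $\begin{pmatrix} 1 \\ -2 \\ 4 \end{pmatrix}$ is an eigenvector for λ = 0.4.
Let $S = \begin{pmatrix} 1 & -2 & 1 \\ 1 & 1 & -2 \\ 1 & 4 & 4 \end{pmatrix}$ and $D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0.7 & 0 \\ 0 & 0 & 0.4 \end{pmatrix}$.
We now must calculate $S^{-1}$.
$$\left(\begin{array}{ccc|ccc} 1 & -2 & 1 & 1 & 0 & 0 \\ 1 & 1 & -2 & 0 & 1 & 0 \\ 1 & 4 & 4 & 0 & 0 & 1 \end{array}\right) \to \left(\begin{array}{ccc|ccc} 1 & -2 & 1 & 1 & 0 & 0 \\ 0 & 3 & -3 & -1 & 1 & 0 \\ 0 & 6 & 3 & -1 & 0 & 1 \end{array}\right) \to \left(\begin{array}{ccc|ccc} 1 & -2 & 1 & 1 & 0 & 0 \\ 0 & 3 & -3 & -1 & 1 & 0 \\ 0 & 0 & 9 & 1 & -2 & 1 \end{array}\right)$$
$$\to \left(\begin{array}{ccc|ccc} 1 & -2 & 1 & 1 & 0 & 0 \\ 0 & 1 & -1 & -1/3 & 1/3 & 0 \\ 0 & 0 & 1 & 1/9 & -2/9 & 1/9 \end{array}\right) \to \left(\begin{array}{ccc|ccc} 1 & -2 & 0 & 8/9 & 2/9 & -1/9 \\ 0 & 1 & 0 & -2/9 & 1/9 & 1/9 \\ 0 & 0 & 1 & 1/9 & -2/9 & 1/9 \end{array}\right) \to \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 4/9 & 4/9 & 1/9 \\ 0 & 1 & 0 & -2/9 & 1/9 & 1/9 \\ 0 & 0 & 1 & 1/9 & -2/9 & 1/9 \end{array}\right)$$
So $S^{-1} = \dfrac{1}{9}\begin{pmatrix} 4 & 4 & 1 \\ -2 & 1 & 1 \\ 1 & -2 & 1 \end{pmatrix}$.
$$\text{Then } P^n = SD^nS^{-1} = \frac{1}{9}\begin{pmatrix} 1 & -2 & 1 \\ 1 & 1 & -2 \\ 1 & 4 & 4 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0.7^n & 0 \\ 0 & 0 & 0.4^n \end{pmatrix}\begin{pmatrix} 4 & 4 & 1 \\ -2 & 1 & 1 \\ 1 & -2 & 1 \end{pmatrix}$$
$$= \frac{1}{9}\begin{pmatrix} 1 & -2 \times 0.7^n & 0.4^n \\ 1 & 0.7^n & -2 \times 0.4^n \\ 1 & 4 \times 0.7^n & 4 \times 0.4^n \end{pmatrix}\begin{pmatrix} 4 & 4 & 1 \\ -2 & 1 & 1 \\ 1 & -2 & 1 \end{pmatrix}$$
$$= \frac{1}{9}\begin{pmatrix} 4 + 4 \times 0.7^n + 0.4^n & 4 - 2 \times 0.7^n - 2 \times 0.4^n & 1 - 2 \times 0.7^n + 0.4^n \\ 4 - 2 \times 0.7^n - 2 \times 0.4^n & 4 + 0.7^n + 4 \times 0.4^n & 1 + 0.7^n - 2 \times 0.4^n \\ 4 - 8 \times 0.7^n + 4 \times 0.4^n & 4 + 4 \times 0.7^n - 8 \times 0.4^n & 1 + 4 \times 0.7^n + 4 \times 0.4^n \end{pmatrix}.$$
Clearly when n is large $0.7^n$ and $0.4^n$ are negligible. For example $0.7^{144}$ is about $5 \times 10^{-23}$. So for large n,
$$P^n \approx \frac{1}{9}\begin{pmatrix} 4 & 4 & 1 \\ 4 & 4 & 1 \\ 4 & 4 & 1 \end{pmatrix} \quad\text{and so}\quad v^{(144)} \approx \frac{1}{9}(1\ \ 0\ \ 0)\begin{pmatrix} 4 & 4 & 1 \\ 4 & 4 & 1 \\ 4 & 4 & 1 \end{pmatrix} = \frac{1}{9}(4\ \ 4\ \ 1).$$
This means that the probability of the robot vacuum cleaner being in a corner room is $\frac{4}{9}$. Since there are 4 such rooms the probability of it being in any one of those rooms is $\frac{1}{9}$. In fact, this is true of any of the 9 rooms, a fact that we might have expected.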
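The formula for $P^n$ in Example 18 can be tested against direct iteration of $v^{(k+1)} = v^{(k)}P$. The sketch below does this in exact arithmetic for the starting vector (1, 0, 0), so it only exercises the first row of the formula. The names `step` and `first_row_closed` are illustrative.

```python
from fractions import Fraction as F

P = [[F(8, 10), F(2, 10), F(0)],
     [F(2, 10), F(7, 10), F(1, 10)],
     [F(0), F(4, 10), F(6, 10)]]

def step(v):
    """One time interval: multiply the row vector v by P on the right."""
    return [sum(v[i] * P[i][j] for i in range(3)) for j in range(3)]

def first_row_closed(n):
    """First row of P^n according to the formula obtained by diagonalisation."""
    a, b = F(7, 10) ** n, F(4, 10) ** n
    return [(4 + 4 * a + b) / 9, (4 - 2 * a - 2 * b) / 9, (1 - 2 * a + b) / 9]

v = [F(1), F(0), F(0)]     # robot starts in a corner room
matches = True
for n in range(1, 145):    # 144 ten-minute intervals = 24 hours
    v = step(v)
    matches = matches and (v == first_row_closed(n))

print(matches)
print([float(x) for x in v])
```

After 144 steps the distribution is indistinguishable, at floating-point precision, from (4/9, 4/9, 1/9).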

EXERCISES FOR CHAPTER 7
For exercises 1 - 3:
(i) find the eigenvalues and eigenvectors;
(ii) determine whether or not it is diagonalisable;
(iii) if it is diagonalisable, find an invertible matrix S and a diagonal matrix D such that
$A = SDS^{-1}$;
(iv) if it is diagonalisable find a formula for An.

5 6 
Exercise 1: A =   .
3 − 2

 3 −2 4 
 
Exercise 2: A =  − 1 2 − 2  .
 −1 1 −1
 

0 − 5 1 
 
Exercise 3: A =  0 3 − 1 .
1 3 1 

Exercise 4: Solve the recurrence system:
$$a_{n+1} = 3a_n + 2b_n, \quad b_{n+1} = 2a_n + 3b_n, \quad a_0 = 1, \quad b_0 = 0.$$

Exercise 5: Solve the recurrence equation
$$a_{n+3} = 6a_{n+2} - 11a_{n+1} + 6a_n, \quad a_1 = 1, \quad a_2 = 2, \quad a_3 = 3.$$

Exercise 6: For a certain type of insect 90% of those under a week old die within the following week
and 50% of those between 1 and 2 weeks old die within the following week. Very few survive
beyond the first three weeks. They breed in the second and third weeks only. On average each
female in the one to two week age group produces 30 offspring in that week. Half of these are
female. For those in the two to three week age group each female produces 20 offspring in that
week. Again half of these are female.
Let an be the number of thousands of females, aged 0 to 1 week, in week n. Let bn be the
number of thousands of females in the one to two week group in week n and let cn be the number of
thousands of females aged from 2 to 3 weeks in week n.

(i) How many weeks will it take before there are insects in all three age groups?
(ii) Let $v_n = \begin{pmatrix} a_n \\ b_n \\ c_n \end{pmatrix}$. Find a matrix A such that $v_{n+1} = Av_n$.
(iii) If 4000 females aged between 2 and 3 weeks are released into a certain area, and sufficient
males to breed with them, find the population distribution after n weeks.

SOLUTIONS FOR CHAPTER 7
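Exercise 1: (i) tr(A) = 3,
|A| = −10 − 18 = −28.
$\chi_A(\lambda) = \lambda^2 - 3\lambda - 28 = (\lambda - 7)(\lambda + 4)$.
So the eigenvalues are λ = 7, −4.
λ = 7: $A - 7I = \begin{pmatrix} -2 & 6 \\ 3 & -9 \end{pmatrix} \to \begin{pmatrix} 1 & -3 \\ 0 & 0 \end{pmatrix}$. So $\begin{pmatrix} 3 \\ 1 \end{pmatrix}$ is an eigenvector.
λ = −4: $A + 4I = \begin{pmatrix} 9 & 6 \\ 3 & 2 \end{pmatrix} \to \begin{pmatrix} 3 & 2 \\ 0 & 0 \end{pmatrix}$. So $\begin{pmatrix} -2 \\ 3 \end{pmatrix}$ is an eigenvector.
(ii) Since we have two independent eigenvectors A is diagonalisable.
(iii) Let $S = \begin{pmatrix} 3 & -2 \\ 1 & 3 \end{pmatrix}$ and $D = \begin{pmatrix} 7 & 0 \\ 0 & -4 \end{pmatrix}$. Then $S^{-1} = \dfrac{1}{11}\begin{pmatrix} 3 & 2 \\ -1 & 3 \end{pmatrix}$.
$$SDS^{-1} = \frac{1}{11}\begin{pmatrix} 3 & -2 \\ 1 & 3 \end{pmatrix}\begin{pmatrix} 7 & 0 \\ 0 & -4 \end{pmatrix}\begin{pmatrix} 3 & 2 \\ -1 & 3 \end{pmatrix} = \frac{1}{11}\begin{pmatrix} 21 & 8 \\ 7 & -12 \end{pmatrix}\begin{pmatrix} 3 & 2 \\ -1 & 3 \end{pmatrix} = \frac{1}{11}\begin{pmatrix} 55 & 66 \\ 33 & -22 \end{pmatrix} = \begin{pmatrix} 5 & 6 \\ 3 & -2 \end{pmatrix}.$$
$$\text{(iv) } A^n = SD^nS^{-1} = \frac{1}{11}\begin{pmatrix} 3 & -2 \\ 1 & 3 \end{pmatrix}\begin{pmatrix} 7^n & 0 \\ 0 & (-4)^n \end{pmatrix}\begin{pmatrix} 3 & 2 \\ -1 & 3 \end{pmatrix} = \frac{1}{11}\begin{pmatrix} 3 \times 7^n & -2 \times (-4)^n \\ 7^n & 3 \times (-4)^n \end{pmatrix}\begin{pmatrix} 3 & 2 \\ -1 & 3 \end{pmatrix}$$
$$= \frac{1}{11}\begin{pmatrix} 9 \times 7^n + 2 \times (-4)^n & 6 \times 7^n - 6 \times (-4)^n \\ 3 \times 7^n - 3 \times (-4)^n & 2 \times 7^n + 9 \times (-4)^n \end{pmatrix}.$$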

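Exercise 2:
(i) tr(A) = 3 + 2 − 1 = 4.
tr2(A) = 4 + 1 + 0 = 5.
|A| = 3(0) + 2(−1) + 4(1) = 2.
$\chi_A(\lambda) = \lambda^3 - 4\lambda^2 + 5\lambda - 2 = (\lambda - 1)^2(\lambda - 2)$.
The eigenvalues are λ = 1 (twice), λ = 2.
λ = 1: $A - I = \begin{pmatrix} 2 & -2 & 4 \\ -1 & 1 & -2 \\ -1 & 1 & -2 \end{pmatrix} \to \begin{pmatrix} 1 & -1 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ so $\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} -2 \\ 0 \\ 1 \end{pmatrix}$ are independent eigenvectors (neither is a scalar multiple of the other).
λ = 2: $A - 2I = \begin{pmatrix} 1 & -2 & 4 \\ -1 & 0 & -2 \\ -1 & 1 & -3 \end{pmatrix} \to \begin{pmatrix} 1 & -2 & 4 \\ 0 & -2 & 2 \\ 0 & -1 & 1 \end{pmatrix} \to \begin{pmatrix} 1 & -2 & 4 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{pmatrix}$ so $\begin{pmatrix} -2 \\ 1 \\ 1 \end{pmatrix}$ is an eigenvector.
(ii) Since we have 3 independent eigenvectors A is diagonalisable.
(iii) Let $S = \begin{pmatrix} 1 & -2 & -2 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}$ and $D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}$. Then |S| = −1 − (−2) − 2 = −1.
$$S^{-1} = -\begin{pmatrix} -1 & -(1) & 1 \\ -(0) & 1 & -(1) \\ -2 & -(3) & 2 \end{pmatrix}^T = \begin{pmatrix} 1 & 0 & 2 \\ 1 & -1 & 3 \\ -1 & 1 & -2 \end{pmatrix}.$$
$$\text{Then } SDS^{-1} = \begin{pmatrix} 1 & -2 & -2 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}\begin{pmatrix} 1 & 0 & 2 \\ 1 & -1 & 3 \\ -1 & 1 & -2 \end{pmatrix} = \begin{pmatrix} 1 & -2 & -4 \\ 1 & 0 & 2 \\ 0 & 1 & 2 \end{pmatrix}\begin{pmatrix} 1 & 0 & 2 \\ 1 & -1 & 3 \\ -1 & 1 & -2 \end{pmatrix} = \begin{pmatrix} 3 & -2 & 4 \\ -1 & 2 & -2 \\ -1 & 1 & -1 \end{pmatrix} = A.$$
$$\text{(iv) } A^n = SD^nS^{-1} = \begin{pmatrix} 1 & -2 & -2 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2^n \end{pmatrix}\begin{pmatrix} 1 & 0 & 2 \\ 1 & -1 & 3 \\ -1 & 1 & -2 \end{pmatrix} = \begin{pmatrix} 1 & -2 & -2 \times 2^n \\ 1 & 0 & 2^n \\ 0 & 1 & 2^n \end{pmatrix}\begin{pmatrix} 1 & 0 & 2 \\ 1 & -1 & 3 \\ -1 & 1 & -2 \end{pmatrix}$$
$$= \begin{pmatrix} 2^{n+1} - 1 & 2 - 2^{n+1} & 2^{n+2} - 4 \\ 1 - 2^n & 2^n & 2 - 2^{n+1} \\ 1 - 2^n & 2^n - 1 & 3 - 2^{n+1} \end{pmatrix}.$$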

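Exercise 3:
(i) tr(A) = 4.
tr₂(A) = 0 + (−1) + 6 = 5.
|A| = 0 + 5(1) + (−3) = 2.
\[ \chi_A(\lambda) = \lambda^3 - 4\lambda^2 + 5\lambda - 2 = (\lambda - 1)^2(\lambda - 2) \quad\text{(as in Exercise 2).} \]
So the eigenvalues are λ = 1 (twice) and λ = 2.

λ = 1:
\[ A - I = \begin{pmatrix} -1 & -5 & 1 \\ 0 & 2 & -1 \\ 1 & 3 & 0 \end{pmatrix} \to \begin{pmatrix} 1 & 5 & -1 \\ 0 & 2 & -1 \\ 0 & -2 & 1 \end{pmatrix} \to \begin{pmatrix} 1 & 5 & -1 \\ 0 & 2 & -1 \\ 0 & 0 & 0 \end{pmatrix}. \]
The reduced matrix has rank 2, so $(-3, 1, 2)^T$, and its scalar multiples, are the only eigenvectors for λ = 1.

λ = 2:
\[ A - 2I = \begin{pmatrix} -2 & -5 & 1 \\ 0 & 1 & -1 \\ 1 & 3 & -1 \end{pmatrix} \to \begin{pmatrix} 1 & 3 & -1 \\ 0 & 1 & -1 \\ -2 & -5 & 1 \end{pmatrix} \to \begin{pmatrix} 1 & 3 & -1 \\ 0 & 1 & -1 \\ 0 & 1 & -1 \end{pmatrix} \to \begin{pmatrix} 1 & 3 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{pmatrix}. \]
So $(-2, 1, 1)^T$, and its scalar multiples, are the only eigenvectors for λ = 2.

(ii) Since we have only two independent eigenvectors, A is not diagonalisable.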
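
The count of independent eigenvectors can be confirmed by computing ranks, since the eigenspace for λ has dimension 3 − rank(A − λI). The sketch below is one way to do this exactly with Python's `fractions` module. It assumes A is the matrix obtained by adding I to the displayed A − I, and the function names `rank` and `eigenspace_dimension` are illustrative.

```python
from fractions import Fraction


def rank(M):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0  # number of pivots found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def eigenspace_dimension(M, lam):
    """Dimension of the null space of M - lam*I."""
    n = len(M)
    shifted = [[M[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)


# Assumption: A is recovered from the displayed A - I by adding the identity.
A = [[0, -5, 1],
     [0, 3, -1],
     [1, 3, 1]]

dims = {lam: eigenspace_dimension(A, lam) for lam in (1, 2)}
print(dims)  # {1: 1, 2: 1}
```

The dimensions add up to 2, which is less than 3, in agreement with the conclusion that A is not diagonalisable.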

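Exercise 4: Let
\[ v_n = \begin{pmatrix} a_n \\ b_n \end{pmatrix} \quad\text{and}\quad A = \begin{pmatrix} 3 & 2 \\ 2 & 3 \end{pmatrix}. \quad\text{Then } v_{n+1} = Av_n \text{ and } v_0 = \begin{pmatrix} 10 \\ 1 \end{pmatrix}. \]
tr(A) = 6 and |A| = 5.
\[ \chi_A(\lambda) = \lambda^2 - 6\lambda + 5 = (\lambda - 1)(\lambda - 5). \]
The eigenvalues are λ = 1, 5.

λ = 1:
\[ A - I = \begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix} \to \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \text{ so } \begin{pmatrix} 1 \\ -1 \end{pmatrix} \text{ is an eigenvector.} \]
λ = 5:
\[ A - 5I = \begin{pmatrix} -2 & 2 \\ 2 & -2 \end{pmatrix} \to \begin{pmatrix} 1 & -1 \\ 0 & 0 \end{pmatrix}, \text{ so } \begin{pmatrix} 1 \\ 1 \end{pmatrix} \text{ is an eigenvector.} \]
Let
\[ S = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix} 1 & 0 \\ 0 & 5 \end{pmatrix}. \quad\text{Then } |S| = 2 \text{ and } S^{-1} = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}. \]
\[ SD^nS^{-1}v_0 = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 5^n \end{pmatrix} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 10 \\ 1 \end{pmatrix} \]
\[ = \frac{1}{2}\begin{pmatrix} 1 & 5^n \\ -1 & 5^n \end{pmatrix} \begin{pmatrix} 9 \\ 11 \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 9 + 11 \times 5^n \\ -9 + 11 \times 5^n \end{pmatrix}. \]
So
\[ a_n = \frac{11 \times 5^n + 9}{2} \quad\text{and}\quad b_n = \frac{11 \times 5^n - 9}{2}. \]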
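
As a check, the formulas for a_n and b_n can be compared with direct iteration of v_{n+1} = Av_n. This is a minimal sketch in plain Python; the function names `iterate` and `closed_form` are illustrative only.

```python
def iterate(n, a=10, b=1):
    """Apply v -> A v a total of n times, with A = [[3, 2], [2, 3]] and v0 = (10, 1)."""
    for _ in range(n):
        a, b = 3 * a + 2 * b, 2 * a + 3 * b
    return a, b


def closed_form(n):
    """a_n and b_n from the diagonalisation; 11 * 5**n is odd, so the division is exact."""
    return (11 * 5 ** n + 9) // 2, (11 * 5 ** n - 9) // 2


# Both approaches should give the same pair for every n.
for n in range(12):
    assert iterate(n) == closed_form(n)
print(closed_form(3))  # (692, 683)
```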

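Exercise 5: Let
\[ v_n = \begin{pmatrix} u_{n+2} \\ u_{n+1} \\ u_n \end{pmatrix} \quad\text{and}\quad A = \begin{pmatrix} 6 & -11 & 6 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}. \quad\text{Then } v_{n+1} = Av_n \text{ and } v_0 = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}. \]
tr(A) = 6, tr₂(A) = 11, |A| = 6.
\[ \therefore\ \chi_A(\lambda) = \lambda^3 - 6\lambda^2 + 11\lambda - 6 = (\lambda - 1)(\lambda - 2)(\lambda - 3), \]
so the eigenvalues are λ = 1, 2, 3.

λ = 1:
\[ A - I = \begin{pmatrix} 5 & -11 & 6 \\ 1 & -1 & 0 \\ 0 & 1 & -1 \end{pmatrix} \to \begin{pmatrix} 1 & -1 & 0 \\ 0 & -6 & 6 \\ 0 & 1 & -1 \end{pmatrix} \to \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{pmatrix}, \text{ so } \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \text{ is an eigenvector.} \]
λ = 2:
\[ A - 2I = \begin{pmatrix} 4 & -11 & 6 \\ 1 & -2 & 0 \\ 0 & 1 & -2 \end{pmatrix} \to \begin{pmatrix} 1 & -2 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{pmatrix}, \text{ so } \begin{pmatrix} 4 \\ 2 \\ 1 \end{pmatrix} \text{ is an eigenvector.} \]
λ = 3:
\[ A - 3I = \begin{pmatrix} 3 & -11 & 6 \\ 1 & -3 & 0 \\ 0 & 1 & -3 \end{pmatrix} \to \begin{pmatrix} 1 & -3 & 0 \\ 0 & -2 & 6 \\ 0 & 1 & -3 \end{pmatrix} \to \begin{pmatrix} 1 & -3 & 0 \\ 0 & 1 & -3 \\ 0 & 0 & 0 \end{pmatrix}, \text{ so } \begin{pmatrix} 9 \\ 3 \\ 1 \end{pmatrix} \text{ is an eigenvector.} \]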

1 4 9  1 0 0
   
Let S = 1 2 3  and D =0 2 0 .
1 1 1   0 0 3
   
 −1 2 −1 T  −1 5 − 6
−1 1  1 
Then |S| = −2 and S = −  5 − 8 3  =  2 − 8 6  .
2  2 
 − 6 6 − 2  −1 3 − 2

1 4 9 1 0 0  −1 5 − 6 1
1      
A v0 = − 1
n
2 3 0 2
n
0  2 − 8 6   2
2
1 1 1  0 0
 3n   −1 3 − 2  3
  
1 2n + 2 3n + 2   − 1
1   
= − 1 2n +1 3n +1  2
2
1 2n 3n   − 1
 
1 2 n + 2 3n + 2   − 1 + 2n + 3 − 3n + 2 
1   
= − 1 2n +1 3n +1   − 1 + 2n + 2 − 3n +1  .
3n   − 1 + 2n +1 − 3n 
2 n
1 2  
3n − 2n+1 + 1
Hence un = 2 .

Exercise 6: (i) The age distributions for females (in thousands) for the first few weeks are as follows.

               week 0   week 1   week 2   week 3   week 4   week 5
0 weeks old       0       40        0       60       20       90
1 week old        0        0        4        0        6        2
2 weeks old       4        0        0        2        0        3

So it is not until week 5 that there are insects in all age groups.

(ii)
\[ \begin{pmatrix} a_{n+1} \\ b_{n+1} \\ c_{n+1} \end{pmatrix} = \begin{pmatrix} 0 & 15 & 10 \\ 0.1 & 0 & 0 \\ 0 & 0.5 & 0 \end{pmatrix} \begin{pmatrix} a_n \\ b_n \\ c_n \end{pmatrix}. \]
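
The table in part (i) can be regenerated by applying this matrix repeatedly to the week 0 distribution (0, 0, 4). The following is a minimal sketch in Python; the rates 0.1 and 0.5 are stored as exact fractions to avoid rounding, and the names `step` and `age_table` are illustrative.

```python
from fractions import Fraction

# Transition matrix from part (ii), with the rates stored exactly.
A = [[0, 15, 10],
     [Fraction(1, 10), 0, 0],
     [0, Fraction(1, 2), 0]]


def step(v):
    """One week: multiply the age-distribution vector v by A."""
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]


def age_table(weeks, start=(0, 0, 4)):
    """Age distributions (in thousands) for week 0 up to the given week."""
    v = list(start)
    table = [v]
    for _ in range(weeks):
        v = step(v)
        table.append(v)
    return table


for week, (a, b, c) in enumerate(age_table(5)):
    print(f"week {week}: {a} {b} {c}")

# First week in which every age group is non-empty.
first_full = next(w for w, v in enumerate(age_table(10)) if all(x > 0 for x in v))
print(first_full)  # 5
```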

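(iii) tr(A) = 0, tr₂(A) = −1.5, |A| = 0.5.
\[ \chi_A(\lambda) = \lambda^3 - 1.5\lambda - 0.5 = (\lambda + 1)(\lambda^2 - \lambda - 0.5). \]
Hence the eigenvalues are
\[ -1, \ \frac{1 \pm \sqrt{3}}{2}, \quad\text{that is,}\quad -1, \ 1.366, \ -0.366. \]
Since the eigenvalues are distinct, A is diagonalisable and so each of a_n, b_n and c_n can be expressed in the form $(-1)^n A + 1.366^n B + (-0.366)^n C$, where A, B and C now denote constants.
Suppose $a_n = (-1)^n A + 1.366^n B + (-0.366)^n C$.

Since a_0 = 0, a_1 = 40 and a_2 = 0 we have
\[ \begin{pmatrix} 1 & 1 & 1 \\ -1 & 1.366 & -0.366 \\ 1 & 1.366^2 & 0.366^2 \end{pmatrix} \begin{pmatrix} A \\ B \\ C \end{pmatrix} = \begin{pmatrix} 0 \\ 40 \\ 0 \end{pmatrix}, \quad\text{that is,}\quad \begin{pmatrix} 1 & 1 & 1 \\ -1 & 1.366 & -0.366 \\ 1 & 1.866 & 0.134 \end{pmatrix} \begin{pmatrix} A \\ B \\ C \end{pmatrix} = \begin{pmatrix} 0 \\ 40 \\ 0 \end{pmatrix}. \]
\[ \left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ -1 & 1.366 & -0.366 & 40 \\ 1 & 1.866 & 0.134 & 0 \end{array}\right) \to \left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 2.366 & 0.634 & 40 \\ 0 & 0.866 & -0.866 & 0 \end{array}\right) \to \left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 2.366 & 0.634 & 40 \end{array}\right) \]
\[ \to \left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 3 & 40 \end{array}\right) \to \left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & 13.3333 \end{array}\right). \]
So C = 13.3333, B = 13.3333, A = −26.6667.
Hence $a_n = 13.3333 \times 1.366^n + 13.3333 \times (-0.366)^n - 26.6667 \times (-1)^n$.
Now $b_n = 0.1a_{n-1} = 1.3333 \times 1.366^{n-1} + 1.3333 \times (-0.366)^{n-1} - 2.6667 \times (-1)^{n-1}$.
Finally, $c_n = 0.5b_{n-1} = 0.6667 \times 1.366^{n-2} + 0.6667 \times (-0.366)^{n-2} - 1.3333 \times (-1)^{n-2}$.

In terms of thirds, the numbers (in thousands) in week n are:

0 weeks old: $\dfrac{40 \times 1.366^n + 40 \times (-0.366)^n - 80 \times (-1)^n}{3}$

1 week old: $\dfrac{4 \times 1.366^{n-1} + 4 \times (-0.366)^{n-1} - 8 \times (-1)^{n-1}}{3}$

2 weeks old: $\dfrac{2 \times 1.366^{n-2} + 2 \times (-0.366)^{n-2} - 4 \times (-1)^{n-2}}{3}$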
