
MA2012: Linear Algebra

by

Zsuzsanna Gonye

Department of Mathematics
Polytechnic University
Brooklyn, New York
2003

Contents

1 Linear Equations
1.1 Systems of Linear Equations
1.2 Gauss-Jordan Elimination
1.3 Examples
1.4 The Zp field
1.5 Systems of Linear Equations in Zp

2 Matrix Algebra
2.1 Matrix Arithmetic
2.2 Elementary Matrices
2.3 Matrix Inverse
2.4 Method for Finding the Inverse
2.5 Diagonal, Triangular and Symmetric Matrices

3 Determinant
3.1 The Determinant Function
3.2 Calculating the Determinant for 2 × 2 Matrices
3.3 Geometric Meaning of the Determinant
3.4 Properties of the Determinant Function
3.5 Evaluating the Determinant by Row Reduction
3.6 Determinant, Invertibility and Systems of Linear Equations
3.7 Cofactor Expansion, Adjoint Matrix
3.8 Calculating the Determinant for 3 × 3 Matrices
3.9 Block-Triangular Matrices

4 Vector Spaces
4.1 Introduction to the Euclidean n-space
4.2 Linear Transformation from Rn to Rm
4.3 Real Vector Spaces
4.4 Subspaces
4.5 Spanning
4.6 Linear Independence
4.7 Basis and Dimension
4.8 Column Space, Row Space, Null Space, Rank and Nullity

5 Eigenvalues and Eigenvectors
5.1 Eigenvalues and Eigenvectors
5.2 Examples
5.3 Diagonalization
5.4 Computing Powers of a Matrix

6 General Linear Transformations
6.1 Linear Transformations from Rn to Rm
6.2 General Linear Transformations
6.3 Matrix Representation of Linear Transformations
6.4 Kernel and Range of Linear Transformations

A Supplementary Material
A.1 Cayley-Hamilton Theorem
A.2 Exponential of Matrices

B The Laplace Expansion For Determinant
B.1 First Minors and Cofactors; Row and Column Expansions
B.2 Alien Cofactors; The Sum Formula
B.3 Cramer's Formula

C.1 Rules of Matrix Arithmetic
C.2 Equivalent Statements

Bibliography

Index

Chapter 1
Linear Equations
1.1 Systems of Linear Equations
A linear equation in the n variables x1, x2, . . . , xn is one that can be expressed in the form
a1 x1 + a2 x2 + · · · + an xn = b
where a1, a2, . . . , an and b are constants. The constants a1, a2, . . . , an are called the coefficients, b is the constant term, and the variables x1, x2, . . . , xn are also called unknowns. A solution of a linear equation a1 x1 + a2 x2 + · · · + an xn = b is a sequence of n numbers t1, t2, . . . , tn, such that the equation is satisfied when we substitute x1 = t1, x2 = t2, . . . , xn = tn. We will also call it a solution vector and write it as (t1, t2, . . . , tn). The collection of all solutions of the equation is called the solution set.
A finite set of linear equations is called a system of linear equations or a linear
system. A sequence of numbers (t1 , t2 , . . . , tn ) is called a solution of the system if
x1 = t1 , x2 = t2 , . . . , xn = tn is a solution of every equation in the system. If a
system of equations has at least one solution (maybe infinitely many), then we say the
system is consistent. If a system of equations has no solutions, then it is said to be
inconsistent.
A linear equation in two unknowns, for example 2x1 - 5x2 = 3, can be graphed in the Descartes coordinate system, and its graph is a straight line.
In the study of two equations
a11 x1 + a12 x2 = b1
a21 x1 + a22 x2 = b2
in two unknowns x1 , x2 over R there are (assuming each equation determines a line)
three possibilities:
The two lines are distinct and not parallel. In this case they intersect in a point and there is a unique solution.
The two lines are parallel. In this case there is no solution.
The two lines coincide. In this case any solution of one equation is also a solution of the other, so there are infinitely many solutions.
Example 1.1.1. The system
x - y = 10
3x + 5y = -26
has one solution over R (or over C): (3, -7). That is, x = 3 and y = -7 satisfies both of the equations. The system is consistent.
Example 1.1.2. The system
x - y = 10
3x - 3y = 3
has no solution over R (or over C). There is no pair of numbers (x, y) which satisfies both equations. The system is inconsistent.
Example 1.1.3. The system
x - y = -10
3x - 3y = -30
has infinitely many solutions over R; the solution set is { (t, t + 10) : t in R }. That is, for every number t, the pair x = t, y = t + 10 will satisfy both equations. The system is consistent.
Example 1.1.4. The system
x - y = i - 2
5x - iy = 15
has no solutions over R, but it has a solution over C, which is (3 + i, 5). So it is always important to specify the field where you are looking for solutions. This system is inconsistent over R, but consistent over C.


1.2 Gauss-Jordan Elimination


If a system is in a so-called triangular form, then you can solve it easily using back substitution.
Example 1.2.1. The system
x - 3y + 2z = 17
y + 5z = 13
z = 3
is triangular. The last equation gives you the value of z directly, z = 3. Substituting this back into the second equation, you get the value of y, y = -2. Substituting z = 3 and y = -2 back into the first equation you get x = 5. The system is consistent; it has only one solution, which is (5, -2, 3).
Most systems are not in such a nice triangular form, but if we can bring them to a triangular form, then the solution is easy. How can we do that? If you multiply an equation by a nonzero number, does the solution change? If you swap two equations of the system, does the solution change? If you add a constant times an equation to another equation, does the solution change?
Example 1.2.2. We can bring the system
x + 2y - 3z = 9
y - 2z = 4
3x + 5y - 6z = 20
to a triangular form, if we first add -3 times the first equation to the third equation (R3 -> R3 - 3R1):
x + 2y - 3z = 9
y - 2z = 4
-y + 3z = -7,
then we add the second equation to the third (R3 -> R3 + R2):
x + 2y - 3z = 9
y - 2z = 4
z = -3,
and then we can get the solution by back substitution: (4, -2, -3).

As we did this example, you may see that we have to keep track of only the coefficients. Let's look at a general linear system.
a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
...
am1 x1 + am2 x2 + · · · + amn xn = bm.
The coefficients of each variable aligned in columns,
[a11 a12 ... a1n]
[a21 a22 ... a2n]
[...............]
[am1 am2 ... amn]
is called the coefficient matrix, and if we also include the constant terms of the equations, then the matrix
[a11 a12 ... a1n b1]
[a21 a22 ... a2n b2]
[..................]
[am1 am2 ... amn bm]
is called the augmented matrix of the system. The size of a matrix tells you how many rows and columns it has. The coefficient matrix above is an m × n (read m by n) matrix. The augmented matrix above is m × (n + 1). The following operations are allowed on the augmented matrix in order to get a solution.
Definition 1.2.1. Elementary row operations:
1. Multiply a row through by a nonzero number.
2. Interchange two rows.
3. Add a multiple of one row to another row.
Example 1.2.3. Let's redo Example 1.2.2 using the augmented matrix. The augmented matrix of
x + 2y - 3z = 9
y - 2z = 4
3x + 5y - 6z = 20
is
[1 2 -3  9]
[0 1 -2  4]
[3 5 -6 20]
We add -3 times the first row to the third row (R3 -> R3 - 3R1), then the matrix changes to
[1  2 -3  9]
[0  1 -2  4]
[0 -1  3 -7]
then we add the second row to the third one (R3 -> R3 + R2):
[1 2 -3  9]
[0 1 -2  4]
[0 0  1 -3]
From here you can again use the equations given by this matrix. Now it is a triangular system, so use back substitution to find the solution.
The goal is to bring the augmented matrix to an easy-to-solve form, like
[1 2 -3  9]
[0 1 -2  4]
[0 0  1 -3]
This matrix is in a row echelon form. To be of this form a matrix must have the following three properties:
1. If there are rows that consist entirely of zeroes, then they are all at the bottom of the matrix.
2. If a row does not consist entirely of zeroes, then the first nonzero number in the row must be a 1. We will call it a leading 1.
3. In any two successive rows that do not consist entirely of zeroes, the leading
1 in the lower row occurs farther to the right than the leading 1 in the higher
row.

Example 1.2.4. Here are some examples of matrices that are in row echelon form:

[1 ? ? ?]    [1 ? ? ? ?]    [0 1 ? ? ?]    [1 ? ? ? ?]
[0 1 ? ?]    [0 0 0 1 ?]    [0 0 1 ? ?]    [0 1 ? ? ?]
[0 0 1 ?]    [0 0 0 0 1]    [0 0 0 0 0]    [0 0 0 1 ?]
[0 0 0 1]    [0 0 0 0 0]    [0 0 0 0 0]

where ? can be any number.


If a matrix has in addition a fourth property:
4. Each column that contains a leading 1 has zeroes everywhere else.
then the matrix is said to be in reduced row echelon form (rref).
Example 1.2.5. Here are some examples of matrices that are in reduced row echelon form:

[1 0 0 0]    [1 ? ? 0 0 0]    [1 0 ? ?]    [1 0 ? 0 ?]
[0 1 0 0]    [0 0 0 1 0 0]    [0 1 ? ?]    [0 1 ? 0 ?]
[0 0 1 0]    [0 0 0 0 1 0]    [0 0 0 0]    [0 0 0 1 ?]
[0 0 0 1]    [0 0 0 0 0 0]    [0 0 0 0]

where ? can be any number.


Using elementary row operations you can always bring a matrix to a row echelon form. The procedure is called Gaussian elimination or the Gaussian algorithm. We can work even more to bring the matrix into reduced row echelon form. This procedure is called Gauss-Jordan elimination. Here we note that a matrix can have more than one row echelon form, but it can be shown that every matrix has a unique reduced row echelon form.

1.3 Examples
Note: This section is also available in txt format at
http://www.math.poly.edu/courses/ma2012/classnotes.phtml
that you can save (Save as ... File... choose a name, like class2.txt) and use with MATLAB. Open MATLAB, set your current directory to your working directory. Among the files you should see class2.txt. By clicking twice on this file you can open it. MATLAB will open the file using its own text editor. If you want to try something out, just cut and paste the command into the Command Window of MATLAB. Hit the enter key to evaluate the command. MATLAB commands are written as:
>> command
Example 1.3.1. Solve the following system of linear equations over the real numbers:
x + y + 4z = 15
2x + 4y - 3z = 1
3x + 6y - 6z = -3.
First write down the augmented matrix:
>> M=[1 1 4 15;2 4 -3 1;3 6 -6 -3]
M =
[1 1  4 15]
[2 4 -3  1]
[3 6 -6 -3]
Then use the Gaussian algorithm to bring the augmented matrix to row-echelon or reduced row-echelon form. Add -2 times the first row to the second:
>> M(2,:)=M(2,:)-2*M(1,:)
M =
[1 1   4  15]
[0 2 -11 -29]
[3 6  -6  -3]
Notice that the third row can be divided by three, so divide the third row by 3:
>> M(3,:)=M(3,:)/3
M =
[1 1   4  15]
[0 2 -11 -29]
[1 2  -2  -1]
Subtract the first row from the third:
>> M(3,:)=M(3,:)-M(1,:)
M =
[1 1   4  15]
[0 2 -11 -29]
[0 1  -6 -16]
Switch the second and third rows:
>> M([2,3],:)=M([3,2],:)
M =
[1 1   4  15]
[0 1  -6 -16]
[0 2 -11 -29]
Add -2 times the second row to the third one:
>> M(3,:)=M(3,:)-2*M(2,:)
M =
[1 1  4  15]
[0 1 -6 -16]
[0 0  1   3]
Now you have the row-echelon form. From here you can get the solution by back substitution: From the last row you have that z = 3. The second row says that y - 6z = -16, so y = 2. The first row says that x + y + 4z = 15, so x = 1.
Instead of the back substitution you can work more with the matrix to get the reduced row-echelon form. Subtract the second row from the first one:
>> M(1,:)=M(1,:)-M(2,:)
M =
[1 0 10  31]
[0 1 -6 -16]
[0 0  1   3]
Subtract 10 times the third row from the first row:
>> M(1,:)=M(1,:)-10*M(3,:)
M =
[1 0  0   1]
[0 1 -6 -16]
[0 0  1   3]

Add 6 times the third row to the second:
>> M(2,:)=M(2,:)+6*M(3,:)
M =
[1 0 0 1]
[0 1 0 2]
[0 0 1 3]
Now you have the reduced row-echelon form. From this you can easily see the solutions. The first row says x = 1, the second row says y = 2, and the third row says z = 3. So the solution of the system is: x = 1, y = 2, z = 3. You can also write down the solution in vector form: (1, 2, 3). Notice that the first number is the value of x, the second is the value of y, and the third is the value of z. Keep the order of the variables!
MATLAB can give you the reduced row-echelon form in one step; let's see how. Enter the augmented matrix (same as above):
>> M=[1 1 4 15;2 4 -3 1;3 6 -6 -3]
M =
[1 1  4 15]
[2 4 -3  1]
[3 6 -6 -3]
Ask MATLAB to find the reduced row-echelon form:
>> rref(M)
ans =
[1 0 0 1]
[0 1 0 2]
[0 0 1 3]
That's it. The solution of the system is (1, 2, 3).
Example 1.3.2. Solve the system of linear equations over the real numbers:
2y + 3z - v = 1
2x + 6y - 4z + 10v = 8
3x + 5y - 12z + 17v = 7
Enter the augmented matrix:
>> M=[0 2 3 -1 1;2 6 -4 10 8;3 5 -12 17 7]
M =
[0 2   3 -1 1]
[2 6  -4 10 8]
[3 5 -12 17 7]

Notice that the second row can be divided by 2, so divide the second row by 2:
>> M(2,:)=M(2,:)/2
M =
[0 2   3 -1 1]
[1 3  -2  5 4]
[3 5 -12 17 7]
Since the second row starts with 1, which could be a convenient pivot point, switch the first and second rows:
>> M([1,2],:)=M([2,1],:)
M =
[1 3  -2  5 4]
[0 2   3 -1 1]
[3 5 -12 17 7]

Subtract 3 times the first row from the third:
>> M(3,:)=M(3,:)-3*M(1,:)
M =
[1  3 -2  5  4]
[0  2  3 -1  1]
[0 -4 -6  2 -5]
Divide the second row by two, to get a 1 for your pivot point (this step is optional):
>> M(2,:)=M(2,:)/2
M =
[1.0000  3.0000 -2.0000  5.0000  4.0000]
[     0  1.0000  1.5000 -0.5000  0.5000]
[     0 -4.0000 -6.0000  2.0000 -5.0000]

There are fractions in this matrix, and MATLAB shows you the numbers up to the 4th decimal place. If you would rather see the matrix in rational form, then type:
>> format rat
>> M
M =
[1  3  -2    5    4]
[0  1 3/2 -1/2  1/2]
[0 -4  -6    2   -5]
However, we note here that MATLAB uses rationals to APPROXIMATE the exact value. Sometimes this is a problem; please see Example 1 in the on-line MATLAB manual for further explanation on this.
Add 4 times the second row to the third:
>> M(3,:)=M(3,:)+4*M(2,:)
M =
[1 3  -2    5   4]
[0 1 3/2 -1/2 1/2]
[0 0   0    0  -3]
Look at the last row. That says: 0 · x + 0 · y + 0 · z + 0 · v = -3. There are no values of (x, y, z, v) for which this can be true. So the system HAS NO SOLUTION. The system is inconsistent.
Again, we could have used MATLAB's built-in function to get the reduced row-echelon form in one step. Enter the augmented matrix:
>> M=[0 2 3 -1 1;2 6 -4 10 8;3 5 -12 17 7]
M =
[0 2   3 -1 1]
[2 6  -4 10 8]
[3 5 -12 17 7]
Get the reduced row-echelon form:
>> rref(M)
ans =
[1 0 -13/2 13/2 0]
[0 1   3/2 -1/2 0]
[0 0     0    0 1]
Look at the last row and conclude that the system is inconsistent.
Example 1.3.3. Find all real solutions (p, q, r, s) of the system:
p + 2r = 0
2p - 2q + 4r - 3s = -1
q + 3s = 5
2p + 8q + 4r + 15s = 13.
Enter the augmented matrix:
>> M=[1 0 2 0 0;2 -2 4 -3 -1;0 1 0 3 5;2 8 4 15 13]
M =
[1  0 2  0  0]
[2 -2 4 -3 -1]
[0  1 0  3  5]
[2  8 4 15 13]
Bring to reduced row-echelon form:
>> rref(M)
ans =
[1 0 2 0 0]
[0 1 0 0 4]
[0 0 0 1 3]
[0 0 0 0 0]
This means: p + 2r = 0, q = 4 and s = 3. There are only three equations, but we have 4 unknowns. That means one of the unknowns is free; it can be anything. How do we give the solutions?
An unknown that has a leading 1 in its column is determined by the row of its leading 1. An unknown that has no leading 1 in its column is a free variable.
That is, p is determined by the first row: p + 2r = 0, or if you rearrange this: p = -2r.
q has a leading 1, so it is determined by the second row: q = 4.
r has no leading 1, so it is a free variable.
s has a leading 1, so it is determined by the third row: s = 3.
So the solutions are: p = -2r, q = 4, r free, s = 3. Writing this down with the vector notation, the solutions are: (-2r, 4, r, 3). The system has infinitely many solutions.
Example 1.3.4. For which value(s) of the constant k does the system
x + (k - 4)y = k + 3
-kx + (2k - 3)y = 2
have
(a) no solution
(b) exactly 1 solution
(c) infinitely many solutions
over the field of real numbers?
First we have to teach MATLAB that k is a parameter (a symbolic variable):
>> syms k
>> M=[1 k-4 k+3;-k 2*k-3 2]
M =
[ 1,   k - 4, k + 3]
[-k, 2*k - 3,     2]
Add k times the first row to the second row:
>> M(2,:)=M(2,:)+k*M(1,:)
M =
[1,                 k - 4,           k + 3]
[0, 2*k - 3 + k*(k - 4), 2 + k*(k + 3)]
Factor the nonzero terms in the last row:
>> M(2,2)=factor(M(2,2));
>> M(2,3)=factor(M(2,3))
M =
[1,             k - 4,             k + 3]
[0, (k + 1)*(k - 3), (k + 2)*(k + 1)]
(a) The system has no solution if the last row becomes 0, 0, nonzero; that is, if (k + 1)(k - 3) = 0 but (k + 2)(k + 1) is not zero. This happens if k = 3.
(b) The system has exactly one solution if the second row also has a leading one, that is, if (k + 1)(k - 3) is not zero. Therefore the system has exactly one solution if k is neither equal to -1 nor equal to 3. Notice that (k + 2)(k + 1) can be anything, zero or not zero; you still get exactly one solution.
(c) The system has infinitely many solutions if the second row has no leading 1 and the system has solution(s). That is when both (k + 1)(k - 3) and (k + 2)(k + 1) are equal to 0. This happens when k = -1.
Remark 1.3.1. I do NOT suggest using the command rref(M) for this problem. Try it, and see what happens! How would you answer the questions from that form? Why don't you get the same answer for part (c)? You will not be able to answer part (c) correctly, because MATLAB (and most calculators) will divide the second row by (k + 1). You know that you cannot divide by (k + 1) if it is zero; MATLAB just assumes that it is not zero, which is not correct.

1.4 The Zp field


Definition 1.4.1. Two integers a and b are said to be congruent modulo p, written
a ≡ b (mod p),
if p divides b - a.
Example 1.4.1.
-1 ≡ 1 (mod 2)
22 ≡ 1 (mod 3)
12 ≡ 2 (mod 5)
-12 ≡ 3 (mod 5)
Definition 1.4.2. Zp denotes the residue classes modulo p, that is
Zp = {0, 1, 2, . . . , p - 1}.
Example 1.4.2.
Z2 = {0, 1}
Z3 = {0, 1, 2}
Z5 = {0, 1, 2, 3, 4}
You can add, subtract and multiply numbers in Zp , but no division!
Example 1.4.3.
1 + 1 ≡ 0 (mod 2)
2 + 2 ≡ 1 (mod 3)
3 + 4 ≡ 2 (mod 5)
3 - 4 ≡ 4 (mod 5)
2 · 2 ≡ 1 (mod 3)
4 · 4 ≡ 1 (mod 5)
3 · 4 ≡ 2 (mod 5)
6 · 3 ≡ 4 (mod 7)
5 · 3 ≡ 1 (mod 7)
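If you want to double-check calculations like these, MATLAB's built-in mod function returns the remainder after division. For example:
>> mod(3 - 4, 5)     % 3 - 4 = -1, and -1 is congruent to 4 (mod 5)
ans = 4
>> mod(4*4, 5)       % 16 = 3*5 + 1
ans = 1
>> mod(6*3, 7)       % 18 = 2*7 + 4
ans = 4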

1.5 Systems of Linear Equations in Zp


Example 1.5.1. Solve the following system of linear equations in Z3 :
x + y + 2z = 0
2x + 2z = 1
x + 2y = 2.
The augmented matrix of this system is:
[1 1 2 0]
[2 0 2 1]
[1 2 0 2]
We use the Gaussian algorithm in Z3 to reduce the matrix to a row-echelon form. Add the first row to the second (mod 3):
[1 1 2 0]
[0 1 1 1]
[1 2 0 2]
subtract the first row from the third (mod 3):
[1 1 2 0]
[0 1 1 1]
[0 1 1 2]
subtract the second row from the third:
[1 1 2 0]
[0 1 1 1]
[0 0 0 1]

The last row says: 0 · x + 0 · y + 0 · z = 1, which is not possible; therefore the system is inconsistent and it has no solution in Z3.
Example 1.5.2. Solve the following system in Z2:
x1 + x2 = 1
x2 + x3 + x4 = 0
x1 + x4 = 0
x1 + x2 + x3 = 0.

The augmented matrix is
[1 1 0 0 1]
[0 1 1 1 0]
[1 0 0 1 0]
[1 1 1 0 0]
Using row operations in Z2, add the first row to the third and fourth rows:
[1 1 0 0 1]
[0 1 1 1 0]
[0 1 0 1 1]
[0 0 1 0 1]
add the second row to the third row:
[1 1 0 0 1]
[0 1 1 1 0]
[0 0 1 0 1]
[0 0 1 0 1]
add the third row to the fourth:
[1 1 0 0 1]
[0 1 1 1 0]
[0 0 1 0 1]
[0 0 0 0 0]

This is now in row-echelon form. Notice that there are only 3 leading 1s, so one of the four unknowns will be free. We can use back-substitution to solve:
The fourth column has no leading one; that means x4 is a free variable. So x4 is either 0 or 1, since these are the only elements in Z2.
The first row says: x1 + x2 = 1. Solving for x1: x1 = 1 + x2 (remember, we are calculating in Z2!).
The second row says: x2 + x3 + x4 = 0. Solving for x2: x2 = x3 + x4.
The third row says: x3 = 1.
So if x4 = 0, then the solution is (x1, x2, x3, x4) = (0, 1, 1, 0); and if x4 = 1, then the solution is (x1, x2, x3, x4) = (1, 0, 1, 1). The system has these two solutions in Z2.
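If you want to verify these two solutions in MATLAB, you can multiply the coefficient matrix by each solution vector and reduce modulo 2; both products should give the right-hand side (1, 0, 0, 0):
>> A = [1 1 0 0; 0 1 1 1; 1 0 0 1; 1 1 1 0];
>> mod(A*[0; 1; 1; 0], 2)     % first solution: gives [1; 0; 0; 0]
>> mod(A*[1; 0; 1; 1], 2)     % second solution: gives [1; 0; 0; 0]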


Chapter 2
Matrix Algebra
2.1 Matrix Arithmetic
Definition 2.1.1. A matrix is a rectangular array of numbers. We will always denote matrices by capital letters. The numbers in the array are called the entries of the matrix. If we want to refer to the entry of matrix A standing in the ith row and jth column, then we write (A)ij or aij. If a matrix has m rows and n columns, then the matrix is said to be of size m × n. The entries of a matrix can be real numbers; then we say the matrix is over R. If the entries are complex numbers, then the matrix is said to be over C. A matrix which has only one row is also called a row matrix. A matrix which has only one column is also called a column matrix or vector. A matrix with n rows and n columns is called a square matrix of order n, and the entries a11, a22, . . . , ann are said to be on the main diagonal of the matrix.
Example 2.1.1.
A = [2 4 1 0]
    [3 3 1 2]
    [0 2 1 0]
B = [2 0 1 2 3]
C = [1]
    [0]
    [2]
    [7]
D = [2 3]
    [1 7]
E = [5]
Matrix A is of size 3 × 4. Matrix B is of size 1 × 5; it is also called a row matrix. Matrix C is of size 4 × 1; it is also called a column matrix or vector. Matrix D is a 2 × 2 matrix; since it has the same number of rows and columns it is called a square matrix of order 2. Its main diagonal consists of the numbers 2 and 7. Matrix E is of size 1 × 1; every number can be considered a 1 × 1 matrix.
Definition 2.1.2. The m × n zero matrix is the m × n matrix whose entries are all zeroes.

Example 2.1.2. Some examples of zero matrices:

[0],    [0 0 0],    [0]    [0 0 0 0]
                    [0]    [0 0 0 0]
                    [0]    [0 0 0 0]   , . . .

Definition 2.1.3. A square matrix whose entries along its main diagonal are all 1s and whose other entries are all 0s is called an identity matrix. The n × n identity matrix is denoted by In.
Example 2.1.3. The following are examples of identity matrices:

I1 = [1],    I2 = [1 0],    I3 = [1 0 0],    I4 = [1 0 0 0]   , . . .
                  [0 1]          [0 1 0]          [0 1 0 0]
                                 [0 0 1]          [0 0 1 0]
                                                  [0 0 0 1]

Definition 2.1.4. Two matrices are equal if they have the same size and their corresponding entries are equal.
Definition 2.1.5. If two matrices, A and B, are of the same size, then the sum A+B
is the matrix obtained by adding the entries of B to the corresponding entries of A.
Definition 2.1.6. If two matrices, A and B, are of the same size, then the difference
A B is the matrix obtained by subtracting the entries of B from the corresponding
entries of A.
Example 2.1.4. Consider the matrices
A = [2 -1  0 1]    and    B = [ 0 -2  3 -1]
    [3  2 -2 0]               [-1  5 -3  7]
Both matrices have the same size, 2 × 4, so the sum and difference are defined:
A + B = [2 -3  3 0]    and    A - B = [2  1 -3  2]
        [2  7 -5 7]                   [4 -3  1 -7]

Definition 2.1.7. If A is a matrix and c is a scalar, then the product cA is the matrix
obtained by multiplying each entry of the matrix A by c.
Example 2.1.5. Let
A = [2 -1  0 1]
    [3  2 -2 0]
Then
5A = [10 -5   0 5]
     [15 10 -10 0]

Theorem 2.1.1 (Properties of Matrix Arithmetic I). Assuming that the sizes of the matrices are such that the indicated operations can be performed, the following rules of matrix arithmetic are valid. Here A, B and C denote matrices, 0 is a zero matrix, and a and b are scalars.
1. A + B = B + A    (commutative law for addition)
2. A + (B + C) = (A + B) + C    (associative law for addition)
3. a(B + C) = aB + aC = (B + C)a
4. a(B - C) = aB - aC = (B - C)a
5. (a + b)C = aC + bC = C(a + b)
6. (a - b)C = aC - bC = C(a - b)
7. a(bC) = (ab)C
8. A + 0 = 0 + A = A
9. A - A = 0
10. 0 - A = -A
Definition 2.1.8. If A is an m × r matrix and B is an r × n matrix, then the matrix product AB is the m × n matrix whose entries are obtained as follows: for the entry (AB)ij choose the ith row of matrix A and the jth column of matrix B, multiply the corresponding entries of this row and column together, and then add up the resulting products. (See Figure 2.1 for the sizes.)
With sum notation we can write
(AB)ij = (A)i1 (B)1j + (A)i2 (B)2j + · · · + (A)ir (B)rj.

[Figure 2.1: Multiplying Matrices. A is m × r, B is r × n, and AB is m × n: the inside dimensions must agree, and the outside dimensions give the size of the product.]


Remark 2.1.1. It is easy to calculate AB if you write the two matrices arranged like this:
     B
A    AB
For example, for a 5 × 4 matrix A and a 4 × 3 matrix B, the product C = AB is arranged as:

                    [b11 b12 b13]
                    [b21 b22 b23]
                    [b31 b32 b33]
                    [b41 b42 b43]
[a11 a12 a13 a14]   [c11 c12 c13]
[a21 a22 a23 a24]   [c21 c22 c23]
[a31 a32 a33 a34]   [c31 c32 c33]
[a41 a42 a43 a44]   [c41 c42 c43]
[a51 a52 a53 a54]   [c51 c52 c53]

where for example c32 = row3(A) col2(B) is given by
c32 = a31 b12 + a32 b22 + a33 b32 + a34 b42.
Example 2.1.6. Consider the matrices
A = [2 -1  0 1]    and    B = [1 0 1]
    [3  2 -2 0]               [2 2 0]
                              [0 3 2]
                              [3 5 2]
Then
AB = [3  3  4]
     [7 -2 -1]
but BA is not defined since the number of columns in B and the number of rows in A are not the same.
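You can reproduce this in MATLAB; asking for B*A produces an error message complaining about the dimensions:
>> A = [2 -1 0 1; 3 2 -2 0];
>> B = [1 0 1; 2 2 0; 0 3 2; 3 5 2];
>> A*B      % the 2 x 3 product calculated above
>> B*A      % error: the inner dimensions (3 and 2) do not agree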
Example 2.1.7. If
A = [ 2]    and    B = [5 3],
    [-1]
then
AB = [10  6]    and    BA = [7].
     [-5 -3]
Definition 2.1.9. For a positive integer p and a square matrix A we define the pth power of A by
A^p = A A · · · A   (p factors),
and we also define A^0 = I.


Would A^p be defined for a non-square matrix?
Theorem 2.1.2 (Properties of Matrix Arithmetic II). Assuming that the sizes of the matrices are such that the indicated operations can be performed, the following rules of matrix arithmetic are valid. Here A, B and C denote matrices, 0 is a zero matrix, a is a scalar, and r and s are positive integers.
11. A(BC) = (AB)C    (associative law for multiplication)
12. A(B + C) = AB + AC    (left distributive law)
13. (B + C)A = BA + CA    (right distributive law)
14. A(B - C) = AB - AC
15. (B - C)A = BA - CA
16. a(BC) = (aB)C = B(aC)
17. A0 = 0A = 0
18. AI = IA = A
19. A^r A^s = A^(r+s)
20. (A^r)^s = A^(rs)
21. A^0 = I
Definition 2.1.10. If A is any m × n matrix, then the transpose of A, denoted by A^T, is defined to be the n × m matrix that is obtained by interchanging the rows and columns of A.
Example 2.1.8. If
A = [2 -1  0 1]
    [3  2 -2 0]   (size 2 × 4),
then
A^T = [ 2  3]
      [-1  2]
      [ 0 -2]
      [ 1  0]   (size 4 × 2).

Theorem 2.1.3 (Properties of Matrix Arithmetic III). Assuming that the sizes
of the matrices are such that the indicated operations can be performed, the following
rules of matrix arithmetic are valid. Here A and B denote matrices, and a is a scalar.
22. (A + B)^T = A^T + B^T
23. (A - B)^T = A^T - B^T
24.* (AB)^T = B^T A^T
25. (A^T)^T = A
26. (aB)^T = aB^T
Definition 2.1.11. If A is a square matrix, then the trace of A, denoted by tr(A),
is defined to be the sum of the entries along the main diagonal of A.
Example 2.1.9. The trace of
A = [2 -1  0 1]
    [3  2 -2 0]
is undefined, since A is not a square matrix.


Example 2.1.10. Consider the square matrix
D = [2 3]
    [1 7]
Then tr(D) = 2 + 7 = 9.
Theorem 2.1.4 (Properties of Matrix Arithmetic IV). Assuming that the sizes
of the matrices are such that the indicated operations can be performed, the following
rules of matrix arithmetic are valid. Here A and B denote matrices, and c is a scalar.
27. tr(A + B) = tr(A) + tr(B)
28.* tr(A - B) = tr(A) - tr(B)
29.* tr(cB) = c tr(B)
30.* tr(AB) = tr(BA)
Proof. Let
A = [a11 a12 ... a1n]          B = [b11 b12 ... b1n]
    [a21 a22 ... a2n]   and        [b21 b22 ... b2n]
    [...............]              [...............]
    [an1 an2 ... ann]              [bn1 bn2 ... bnn]
then tr(A) = a11 + a22 + · · · + ann and tr(B) = b11 + b22 + · · · + bnn.
Since
A + B = [a11 + b11  a12 + b12  ...  a1n + b1n]
        [a21 + b21  a22 + b22  ...  a2n + b2n]
        [....................................]
        [an1 + bn1  an2 + bn2  ...  ann + bnn]
we get
tr(A + B) = a11 + b11 + a22 + b22 + · · · + ann + bnn
          = a11 + a22 + · · · + ann + b11 + b22 + · · · + bnn
          = tr(A) + tr(B).
This proves that tr(A + B) = tr(A) + tr(B). Similarly you can prove that tr(A - B) = tr(A) - tr(B).
Since
cA = [ca11 ca12 ... ca1n]
     [ca21 ca22 ... ca2n]
     [...................]
     [can1 can2 ... cann]
we get
tr(cA) = ca11 + ca22 + · · · + cann = c(a11 + a22 + · · · + ann) = c tr(A).
This proves that tr(cA) = c tr(A).
If we calculate the product matrix AB, then the entries along the main diagonal are:
(AB)11 = a11 b11 + a12 b21 + · · · + a1n bn1
(AB)22 = a21 b12 + a22 b22 + · · · + a2n bn2
...
(AB)nn = an1 b1n + an2 b2n + · · · + ann bnn,
so
tr(AB) = (a11 b11 + · · · + a1n bn1) + (a21 b12 + · · · + a2n bn2) + · · · + (an1 b1n + · · · + ann bnn).   (2.1.1)
The entries along the main diagonal of the product BA are:
(BA)11 = b11 a11 + b12 a21 + · · · + b1n an1
(BA)22 = b21 a12 + b22 a22 + · · · + b2n an2
...
(BA)nn = bn1 a1n + bn2 a2n + · · · + bnn ann,
so
tr(BA) = (b11 a11 + · · · + b1n an1) + (b21 a12 + · · · + b2n an2) + · · · + (bn1 a1n + · · · + bnn ann).   (2.1.2)
If you expand 2.1.1 and 2.1.2, you can see that they have exactly the same terms, so tr(AB) = tr(BA).
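You can also test property 30 in MATLAB on randomly generated matrices, using the built-in trace function; the difference should be zero up to rounding error:
>> A = rand(3); B = rand(3);    % two random 3 x 3 matrices
>> trace(A*B) - trace(B*A)      % zero, up to rounding error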

Finally, we would like to emphasize two very important things about matrix calculations.
Remark 2.1.2. For matrices the cancellation law usually does not hold! Consider the matrices
A = [0 1],    B = [1 1],    C = [2 5],    and    D = [3 7]
    [0 2]         [3 4]         [3 4]                [0 0]
You can calculate that
AB = AC = [3 4]    and    AD = [0 0]
          [6 8]                [0 0]
That is, although AB = AC and A ≠ 0, it is incorrect to cancel A from both sides; we cannot conclude that B = C.
Moreover, AD = 0, yet A ≠ 0 and D ≠ 0.
Remark 2.1.3. Matrix multiplication is not commutative! Using the matrices from the previous remark, verify that AB ≠ BA.

2.2 Elementary Matrices


Elementary matrices are the simplest of all invertible matrices. We shall see that they
are the building blocks from which the invertible matrices are constructed. Here is
the definition.
Definition 2.2.1. A matrix that results from the identity matrix by applying a single
elementary row operation (see Definition 1.2.1) is called an elementary matrix . An
elementary matrix is always a square matrix. There are three kinds.
Scale The matrix E = Scale(I, i, c) is an elementary matrix for i = 1, 2, . . . , m and c ≠ 0. It differs from the m × m identity matrix I = Im in that (E)ii = c rather than 1.
Swap The matrix E = Swap(I, i, j) is an elementary matrix for i, j = 1, 2, . . . , m, i ≠ j. It differs from the identity matrix in that
(E)ii = 0, (E)ij = 1, (E)ji = 1, (E)jj = 0.
Shear The matrix E = Shear(I, i, j, c) is an elementary matrix for i, j = 1, 2, . . . , m, i ≠ j. It differs from the identity matrix in that
(E)ij = c.
Example 2.2.1. The matrices
E1 = [1 0 0],    E2 = [0 1 0],    E3 = [1 0 2]
     [0 1 0]          [1 0 0]          [0 1 0]
     [0 0 4]          [0 0 1]          [0 0 1]
are examples of elementary matrices. E1 = Scale(I, 3, 4) is a scale, E2 = Swap(I, 1, 2), and E3 = Shear(I, 1, 3, 2).
What happens if we multiply a 3 × 3 matrix by these elementary matrices from the left? (Remember: matrix multiplication is not commutative!) Let
A = [a11 a12 a13]
    [a21 a22 a23]
    [a31 a32 a33]
then
E1 A = [ a11  a12  a13]        E2 A = [a21 a22 a23]
       [ a21  a22  a23]   and         [a11 a12 a13]
       [4a31 4a32 4a33]               [a31 a32 a33]
and
E3 A = [a11 + 2a31  a12 + 2a32  a13 + 2a33]
       [a21         a22         a23       ]
       [a31         a32         a33       ]
Theorem 2.2.1. The matrix EA that results by multiplying a matrix A on the left
by an elementary matrix E is the same as the matrix that results by applying the
corresponding elementary row operation to A.
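You can see Theorem 2.2.1 in action in MATLAB by building an elementary matrix from the identity and multiplying from the left; for example, with the elementary matrices of Example 2.2.1:
>> A = magic(3);                              % any 3 x 3 matrix will do
>> E1 = eye(3); E1(3,3) = 4;                  % Scale(I,3,4)
>> E1*A                                       % same as multiplying row 3 of A by 4
>> E2 = eye(3); E2([1 2],:) = E2([2 1],:);    % Swap(I,1,2)
>> E2*A                                       % same as interchanging rows 1 and 2 of A
>> E3 = eye(3); E3(1,3) = 2;                  % Shear(I,1,3,2)
>> E3*A                                       % same as adding 2 times row 3 to row 1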

2.3 Matrix Inverse


Definition 2.3.1. If A is a square matrix, and if a matrix B of the same size can be found such that AB = I and BA = I, then A is said to be invertible and B is called an inverse of A. If no such matrix B can be found, then A is said to be singular or non-invertible. The inverse of A is denoted by A^-1.
Example 2.3.1. The matrix
A = [2 3 0]
    [1 2 5]
has no inverse, because it is not a square matrix.
Example 2.3.2. A diagonal matrix with nonzero entries along its diagonal is invertible, and its inverse is the diagonal matrix whose diagonal entries are the reciprocals of the original diagonal entries. For example
[2   0 0]^-1    [1/2 0   0]
[0 1/3 0]    =  [0   3   0]
[0   0 4]       [0   0 1/4]





Example 2.3.3. The matrix
B = [1/2 -3/2]    is an inverse of    A = [2 3]
    [0      1]                            [0 1]
because
AB = [2 3][1/2 -3/2] = [1 0]
     [0 1][0      1]   [0 1]
and
BA = [1/2 -3/2][2 3] = [1 0]
     [0      1][0 1]   [0 1]

Example 2.3.4. The matrix
A = [0 0 0]
    [3 1 4]
    [2 5 7]
is singular; it has no inverse, because the first row of any matrix product AB is always a zero row, so AB cannot give the identity matrix.
Example 2.3.5. Every elementary matrix is invertible, and the inverse is also an elementary matrix. The inverse of Scale(I, i, c) is Scale(I, i, 1/c). The inverse of Swap(I, i, j) is Swap(I, i, j). The inverse of Shear(I, i, j, c) is Shear(I, i, j, -c). For example:
[1 0 0]^-1    [1 0   0]
[0 1 0]    =  [0 1   0]
[0 0 4]       [0 0 1/4]

[0 1 0]^-1    [0 1 0]
[1 0 0]    =  [1 0 0]
[0 0 1]       [0 0 1]

[1 0 2]^-1    [1 0 -2]
[0 1 0]    =  [0 1  0]
[0 0 1]       [0 0  1]

Example 2.3.6. Let A and B be two invertible matrices of the same size. Then
(B^-1 A^-1)(AB) = B^-1 (A^-1 A) B = B^-1 I B = B^-1 B = I,
and
(AB)(B^-1 A^-1) = A (B B^-1) A^-1 = A I A^-1 = A A^-1 = I.
This shows that the product AB is invertible, and
(AB)^-1 = B^-1 A^-1.
Theorem 2.3.1 (Properties of Matrix Arithmetic V). If A and B are invertible matrices, then:
31. (A^-1)^-1 = A
32. (A^n)^-1 = (A^-1)^n
33. (aA)^-1 = (1/a) A^-1, for any nonzero scalar a.
34.* (AB)^-1 = B^-1 A^-1.
35. (A^T)^-1 = (A^-1)^T.

2.4 Method for Finding the Inverse


Using row operations to find A^-1: Construct the matrix
[A | I]
and apply row operations until you get the identity matrix on the left side. If that is possible, then the matrix on the right side will be the inverse of A, so the final matrix will be
[I | A^-1].
If we cannot get I on the left hand side using elementary row operations, then A is non-invertible.
Proof. Let A be a square matrix. Every elementary row operation corresponds to a multiplication by an elementary matrix (scale, swap or shear) from the left. We do the row operations on [A | I] until we get the reduced row echelon form (rref) of A on the left side:
Ek · · · E3 E2 E1 [A | I] = [Ek · · · E2 E1 A | Ek · · · E3 E2 E1 I] = [rref | Ek · · · E3 E2 E1].
There are two possibilities:
1. The rref of A has a zero row. In this case the matrix A is not invertible.
2. The rref of A is the identity. That is, Ek · · · E2 E1 A = I, which means that Ek · · · E2 E1 is an inverse of A. Therefore
Ek · · · E3 E2 E1 [A | I] = [rref | Ek · · · E3 E2 E1] = [I | A^-1].
Example 2.4.1. Find the inverse of the matrix
A = [1 2  1]
    [0 4 -5]
    [2 3  3]
over R.
Construct the matrix [A | I] and use row operations to bring the left side to the form of I3:
[1 2  1 | 1 0 0]
[0 4 -5 | 0 1 0]
[2 3  3 | 0 0 1]
Add -2 times the first row to the third row:
[1  2  1 |  1 0 0]
[0  4 -5 |  0 1 0]
[0 -1  1 | -2 0 1]
Multiply the third row by -1, and switch the second and third rows:
[1 2  1 | 1 0  0]
[0 1 -1 | 2 0 -1]
[0 4 -5 | 0 1  0]
Subtract four times the second row from the third:
[1 2  1 |  1 0  0]
[0 1 -1 |  2 0 -1]
[0 0 -1 | -8 1  4]
Subtract two times the second row from the first:
[1 0  3 | -3 0  2]
[0 1 -1 |  2 0 -1]
[0 0 -1 | -8 1  4]
Add three times the third row to the first, subtract the third row from the second, and multiply the third row by -1:
[1 0 0 | -27  3 14]
[0 1 0 |  10 -1 -5]
[0 0 1 |   8 -1 -4]
The inverse of A is now on the right side:
A^-1 = [-27  3 14]
       [ 10 -1 -5]
       [  8 -1 -4]
Example 2.4.2. Show that the matrix
B = [1 0 2]
    [0 2 1]
    [2 0 1]
is singular in Z3.
Construct the matrix [B | I] and use row operations to try to bring the left side to the form of I3:
[1 0 2 | 1 0 0]
[0 2 1 | 0 1 0]
[2 0 1 | 0 0 1]
Subtracting 2 times the first row from the third row (modulo 3):
[1 0 2 | 1 0 0]
[0 2 1 | 0 1 0]
[0 0 0 | 1 0 1]
The reduced row-echelon form of the matrix is not I3, so B is singular in Z3.

2.5 Diagonal, Triangular and Symmetric Matrices


Definition 2.5.1. A square matrix in which all the entries off the main diagonal are zero is called a diagonal matrix.
Example 2.5.1. Some examples of diagonal matrices:

[3 0],    [2 0 0],    [2 0 0]
[0 7]     [0 4 0]     [0 6 0]
          [0 0 0]     [0 0 5]

Definition 2.5.2. A square matrix in which the entries below the main diagonal are all zero is called upper triangular.
Example 2.5.2. Some examples of upper triangular matrices:

[3 4],    [2 0 3],    [2 7 3]
[0 7]     [0 4 0]     [0 4 5]
          [0 0 0]     [0 0 3]
Definition 2.5.3. A square matrix in which the entries above the main diagonal are all zero is called lower triangular.
Example 2.5.3. Some examples of lower triangular matrices:

[3 0],    [2 0 0],    [2 0 0]
[9 7]     [0 4 0]     [5 4 0]
          [3 0 0]     [6 5 0]
Definition 2.5.4. A matrix that is either upper triangular or lower triangular is called triangular.
Example 2.5.4. Some examples of triangular matrices:

[3 0],    [2 0 0],    [2 7 0]
[9 7]     [0 4 0]     [0 4 7]
          [0 0 0]     [0 0 6]

Definition 2.5.5. A square matrix A is called symmetric if A^T = A.
Example 2.5.5. Some examples of symmetric matrices over R (and also over C):

[2 7 0],    [ 1 21  9]
[7 4 3]     [21 11  0]
[0 3 6]     [ 9  0 17]

Example 2.5.6. Show that if S is an invertible symmetric matrix, then S^-1 is also symmetric.
We have to show that (S^-1)^T = S^-1. We can use the rule for transpose and inverse, (A^-1)^T = (A^T)^-1, and that the matrix S is symmetric, which means S^T = S, so
(S^-1)^T = (S^T)^-1 = S^-1,
which proves that S^-1 is symmetric.
Definition 2.5.6. A square matrix A is called skew-symmetric if A^T = -A.
Example 2.5.7.
[ 0 7 0],    [  0 21 9]
[-7 0 3]     [-21  0 0]
[ 0 -3 0]    [ -9  0 0]
are skew-symmetric matrices over R (and also over C). Notice that the entries along the main diagonal must be zero.
Example 2.5.8.
[1 1 0],    [1 1 1]
[1 0 1]     [1 0 0]
[0 1 0]     [1 0 1]
are skew-symmetric matrices in Z2. Notice that the entries along the main diagonal can be either 0 or 1, because -1 ≡ 1 (mod 2).
Example 2.5.9. Show that: If A is an invertible skew-symmetric matrix, then A^-1 is also skew-symmetric.
Definition 2.5.7. A square matrix A is called nilpotent if A^k = 0 for some positive integer k.
Example 2.5.10. The matrices
A = [0 1 0]           B = [0 0 0 0 0]
    [0 0 1]    and        [7 0 0 0 0]
    [0 0 0]               [0 3 0 0 0]
                          [0 0 4 0 0]
                          [0 0 0 1 0]
are nilpotent over R, because A^3 = 0 and B^5 = 0.


Definition 2.5.8. A real square matrix A is called orthogonal if A^T = A^-1.
Example 2.5.11. Some examples of orthogonal matrices:

[ 0 1],    [sqrt(3)/2       1/2],    [1 0 0],    [1 0 0]
[-1 0]     [     -1/2 sqrt(3)/2]     [0 0 1]     [0 1 0]
                                     [0 1 0]     [0 0 1]
Chapter 3
Determinant
3.1 The Determinant Function
We follow an intuitive approach to introduce the definition of the determinant. We already have a function defined on certain matrices: the trace. The trace assigns a number to a square matrix by summing the entries along the main diagonal of the matrix. So the trace is a function; its domain is the set of all square matrices, its range is the set of numbers. We also showed that the trace is a linear function, that is
tr(A + B) = tr(A) + tr(B)
tr(cA) = c tr(A)
where A and B are two n × n matrices, and c is a constant.
The determinant is also a function which assigns a number to every square matrix. So its domain will again be the set of square matrices, and its range is the set of numbers. The notation for the determinant of a matrix A is
det(A).
The determinant function has some nice properties, and we should emphasize two of
them at this point:
det(AB) = det(A) det(B)
det(I) = 1
where A and B are two n × n matrices. We will see the other properties later in this chapter.

3.2 Calculating the Determinant for 2 × 2 Matrices
Definition 3.2.1. The determinant of a 2 × 2 matrix is defined as
det [a b] = ad - bc.
    [c d]
Example 3.2.1.
det(I2) = det [1 0] = 1.
              [0 1]
Example 3.2.2.
det [ 3 2] = 3 · 4 - 2 · (-2) = 16.
    [-2 4]
Theorem 3.2.1. For any two 2 × 2 matrices A and B,
det(AB) = det(A) det(B).
Proof. Let
A = [a11 a12]    and    B = [b11 b12]
    [a21 a22]               [b21 b22]
Then
det(A) det(B) = (a11 a22 - a12 a21)(b11 b22 - b12 b21)
= a11 a22 b11 b22 - a11 a22 b12 b21 - a12 a21 b11 b22 + a12 a21 b12 b21.   (3.2.1)
The product of the two matrices is
AB = [a11 b11 + a12 b21   a11 b12 + a12 b22]
     [a21 b11 + a22 b21   a21 b12 + a22 b22]
so
det(AB) = (a11 b11 + a12 b21)(a21 b12 + a22 b22) - (a11 b12 + a12 b22)(a21 b11 + a22 b21)
= a11 b11 a21 b12 + a11 b11 a22 b22 + a12 b21 a21 b12 + a12 b21 a22 b22
  - a11 b12 a21 b11 - a11 b12 a22 b21 - a12 b22 a21 b11 - a12 b22 a22 b21
= a11 a22 b11 b22 + a12 a21 b12 b21 - a11 a22 b12 b21 - a12 a21 b11 b22.   (3.2.2)
As we see, the terms in 3.2.1 and in 3.2.2 are the same, so
det(A) det(B) = det(AB).

If A is an invertible matrix, then we can apply this theorem to A and A^-1:
det(A A^-1) = det(A) det(A^-1).
The left hand side of the equation is det(A A^-1) = det(I) = 1, so we have that
1 = det(A) det(A^-1).
From this equation we can conclude that if A is invertible, then det(A) ≠ 0, det(A^-1) ≠ 0, and
det(A^-1) = 1/det(A).
Corollary 3.2.2. If A is invertible, then
det(A^-1) = 1/det(A).

Using the definition of the determinant we can easily show some properties of the determinant function for 2 × 2 matrices.
Corollary 3.2.3.
det [1 0] = 1.
    [0 1]
Corollary 3.2.4. If a row or column of a matrix is 0, then the determinant of the matrix is 0.
Corollary 3.2.5. If a matrix is triangular, then its determinant is the product of the diagonal entries:
det [a 0] = ad,    det [a b] = ad.
    [c d]              [0 d]
Corollary 3.2.6. If we swap two rows or two columns, then the determinant changes its sign:
det [c d] = -det [a b],    det [b a] = -det [a b].
    [a b]        [c d]         [d c]        [c d]
Corollary 3.2.7. If we multiply a row or column by a number k, then the determinant will be k times as before:
det [ka kb] = k det [a b],    det [ka b] = k det [a b].
    [c   d]         [c d]         [kc d]         [c d]
Corollary 3.2.8. If A is a 2 × 2 matrix and c is a scalar, then det(cA) = c^2 det(A).
Corollary 3.2.9. If we add k times a row (or column) to the other row (or column), then the determinant will not change:
det [a + kc  b + kd] = (a + kc)d - (b + kd)c = ad - bc = det [a b].
    [c       d     ]                                         [c d]

3.3 Geometric Meaning of the Determinant
We can consider a 2 × 2 matrix
[x1 y1]
[x2 y2]
as a collection of two column vectors
~x = (x1, x2)    and    ~y = (y1, y2).
We can draw these two vectors in the Descartes coordinate system, and we can see that any two such vectors determine a parallelogram. (You may see Section 4.1 for more about vectors.)
Example 3.3.1. The area of the parallelogram with vertices (0, 0), (1, 0), (1, 0.8) and (2, 0.8) is 0.8, see Figure 3.1. This is the same as the determinant of the matrix formed by the two column vectors which determine the parallelogram:
det [1   1] = 0.8.
    [0 0.8]
[Figure 3.1: Area of a parallelogram -- the parallelogram with vertices (0, 0), (1, 0), (1, 0.8) and (2, 0.8).]


Example 3.3.2. The area of the parallelogram with vertices (1, 1), (2, 1), (2, 1.8)
and (3, 1.8) is the same as in the previous example, since this parallelogram is just
shifted, but its area has not changed.


x1 y1
Theorem 3.3.1. The determinant of a 2 2 matrix
is the signed area of
x2 y2
the parallelogram determined by the two column vectors of the matrix. The sign is
positive if the angle that you get by rotating the first column vector toward the second
column vector in the counterclockwise direction is less than . The sign is negative if
this angle is greater than .
Proof. We will only give a visual verification of this theorem here by cutting and
moving pieces of the parallelogram around, see Figure 3.2. On each picture, the dark
shaded part has the same area as the lightly shaded piece.

3.4 Properties of the Determinant Function


We defined the determinant function for 2 × 2 matrices, and saw some nice properties of it. We would like to extend our definition to all square matrices, so that these properties remain true.
Theorem 3.4.1. The determinant assigns a number for every square matrix, with
the following properties. Let A and B be two n n matrices.
1. det(In ) = 1.
2. If B is the matrix that results when a single row or single column of A is
multiplied by a scalar k, then det(B) = k det(A).
3. If B is the matrix that results when two rows or two columns of A are interchanged, then det(B) = -det(A).
4. If B is the matrix that results when a multiple of one row is added to another
row or when a multiple of one column of A is added to another column, then
det(B) = det(A).
Some further properties of the determinant that we showed for 2 × 2 matrices remain true for larger matrices:
Theorem 3.4.2. Let A and B be n × n matrices.
1. If A has a row or a column of zeroes, then det(A) = 0.
2. If A is a triangular matrix, then its determinant is the product of the diagonal
entries.
3. det(A) = det(AT ).
4. det(AB) = det(A) det(B).
5. A square matrix A is invertible if and only if det(A) ≠ 0.
6. If A is invertible, then det(A^-1) = 1/det(A).
7. If k is a scalar, then det(kA) = k^n det(A).
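Properties 4 and 7 are easy to test in MATLAB on random matrices; both differences below should be zero up to rounding error:
>> A = rand(3); B = rand(3); k = 5;
>> det(A*B) - det(A)*det(B)     % property 4
>> det(k*A) - k^3*det(A)        % property 7, with n = 3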

3.5 Evaluating the Determinant by Row Reduction


Theorem 3.4.1 allows us to use row- (or column-) reduction to calculate the value of the determinant of matrices larger than 2 × 2. The goal is to reduce the matrix to a triangular form, because we know that the determinant of a triangular matrix is the product of its entries along the main diagonal.
Example 3.5.1. Using row reduction let's calculate the determinant of the matrix
[1    1    1   1]
[x    1    1   1]
[x^2  x^2  1   1]
[x^3  x^3  x^3 1]

If we do the following row operations R2 = R2 - xR1, R3 = R3 - x^2 R1, R4 = R4 - x^3 R1, then the value of the determinant does not change:
    [1    1    1   1]       [1  1      1        1      ]
det [x    1    1   1] = det [0  1 - x  1 - x    1 - x  ]
    [x^2  x^2  1   1]       [0  0      1 - x^2  1 - x^2]
    [x^3  x^3  x^3 1]       [0  0      0        1 - x^3]
= (1 - x)(1 - x^2)(1 - x^3).
Example 3.5.2. Since we already calculated the determinant of
A = [1    1    1   1]
    [x    1    1   1]
    [x^2  x^2  1   1]
    [x^3  x^3  x^3 1]
we can easily tell that this matrix is singular (has no inverse) over the real numbers if the determinant is equal to zero, that is if x = ±1.
Notice that det(A) = 0 has two more complex roots, so over the complex numbers A is singular not only for x = ±1 but also for x = e^(i 2π/3) or x = e^(i 4π/3).
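You can let MATLAB's Symbolic Math Toolbox (the syms command we used in Example 1.3.4) do this determinant for you:
>> syms x
>> A = [1 1 1 1; x 1 1 1; x^2 x^2 1 1; x^3 x^3 x^3 1];
>> factor(det(A))     % displays the factors of the determinant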

3.6 Determinant, Invertibility and Systems of Linear Equations


Every system of linear equations can be written in matrix form:
A~x = ~b,
where A is the coefficient matrix, ~x is the vector of unknowns, and ~b is the vector of constant terms. You can always solve a system of linear equations using the Gaussian algorithm, by reducing it to row echelon form or to reduced row echelon form. However, if the coefficient matrix A is a square matrix, then calculating the determinant might be useful.
Theorem 3.6.1. If A is an n × n matrix, then the following are equivalent. (That is, if one of these statements is true, then all the others must also be true.)
1. A is invertible.
2. det(A) ≠ 0.
3. A~x = ~b has exactly one solution for every n × 1 matrix ~b.
4. A~x = ~0 has only the trivial solution (that is, the solution is the zero vector).
5. The reduced row-echelon form of A is In.
Proof. The idea of the proof: if A is invertible, then we can multiply both sides of the equation
A~x = ~b
by A^-1, and we get
A^-1 A~x = A^-1 ~b.
Since A^-1 A = I, we have solved the equation:
~x = A^-1 ~b,
and there is only one solution for every ~b. If ~b is the zero vector, then the solution is
~x = A^-1 ~0 = ~0,
the trivial solution (each of its components is 0).
Corollary 3.6.2. If A is an n × n matrix such that det(A) = 0, then the equation
1. A~x = ~0 has a non-trivial solution (that is, a solution whose components are not all 0).
2. A~x = ~b has either more than one solution for a non-zero n × 1 matrix ~b, or has no solutions at all. (We have to use the Gaussian algorithm to find out what happens in this case.)
Example 3.6.1. The linear system
2kx + (k + 1)y = 2
(k + 6)x + (k + 3)y = 3
has exactly one solution if
det [2k     k + 1] ≠ 0,
    [k + 6  k + 3]
that is, when 2k(k + 3) - (k + 1)(k + 6) = k^2 - k - 6 = (k - 3)(k + 2) ≠ 0, i.e. if k ≠ -2, 3. However, if k = -2 or 3, then we have to use the Gaussian algorithm.
If k = -2, then the system becomes
-4x - y = 2
4x + y = 3
whose row echelon form is
[1 1/4 -1/2]
[0   0    5]
and we can conclude that the system has no solution.
If k = 3, then the system becomes
6x + 4y = 2
9x + 6y = 3
whose row echelon form is
[1 2/3 1/3]
[0   0   0]
In this case y is a free variable, and the system has infinitely many solutions over R and also over C.

3.7 Cofactor Expansion, Adjoint Matrix


Definition 3.7.1. The recursive definition of the determinant using cofactor expansion along the ith row of A:
det A = ai1 Ci1 + ai2 Ci2 + ai3 Ci3 + · · · + ain Cin.
The recursive definition of the determinant using cofactor expansion along the jth column of A:
det A = a1j C1j + a2j C2j + a3j C3j + · · · + anj Cnj.
Here Cij denotes the cofactor of the entry aij: Cij = (-1)^(i+j) Mij, where the minor Mij is the determinant of the matrix you get from A by cancelling the ith row and jth column. The sign (-1)^(i+j) is a plus or minus sign according to the checkerboard pattern
[+ - + - ...]
[- + - + ...]
[+ - + - ...]
[- + - + ...]
[...........]
Definition 3.7.2. The cofactor matrix of A,
[C11 C12 C13 ... C1n]
[C21 C22 C23 ... C2n]
[C31 C32 C33 ... C3n]
[...................]
[Cn1 Cn2 Cn3 ... Cnn]
is the matrix you get by replacing each entry in A by its cofactor.
Definition 3.7.3. The adjoint matrix of A is the transpose of the cofactor matrix:
adj(A) = [C11 C21 C31 ... Cn1]
         [C12 C22 C32 ... Cn2]
         [C13 C23 C33 ... Cn3]
         [...................]
         [C1n C2n C3n ... Cnn]
Theorem 3.7.1.
A adj(A) = [det(A) 0      ... 0     ]
           [0      det(A) ... 0     ]   = det(A) In.
           [....................... ]
           [0      0      ... det(A)]
Theorem 3.7.2. If A is an invertible matrix, then
A^-1 = (1/det(A)) adj(A).

Example 3.7.1. Let
A = [1 1 0]
    [3 4 5]
    [3 2 1]
To get the determinant of A we can use cofactor expansion. The first row would be the best choice, since it has a zero in it and the other entries are 1, which makes the calculations easier:
det(A) = a11 C11 + a12 C12 + a13 C13
       = 1 · det [4 5] + 1 · (-1) · det [3 5] + 0
                 [2 1]                  [3 1]
       = -6 + 12
       = 6.
The cofactor matrix of A is:
[-6 12 -6]
[-1  1  1]
[ 5 -5  1]
The adjoint of A is:
adj(A) = [-6 -1  5]
         [12  1 -5]
         [-6  1  1]
Calculating A adj(A):
A adj(A) = [1 1 0][-6 -1  5]   [6 0 0]
           [3 4 5][12  1 -5] = [0 6 0] = det(A) I3.
           [3 2 1][-6  1  1]   [0 0 6]
Therefore we can use the adjoint matrix to find the inverse of A:
A^-1 = (1/det(A)) adj(A) = (1/6) [-6 -1  5]   [-1 -1/6  5/6]
                                 [12  1 -5] = [ 2  1/6 -5/6]
                                 [-6  1  1]   [-1  1/6  1/6]
3.8 Calculating the Determinant for 3 × 3 Matrices
Theorem 3.8.1 (Sarrus's rule). To find the determinant of a 3 × 3 matrix A, write the first two columns of A to the right of A:
a11 a12 a13 | a11 a12
a21 a22 a23 | a21 a22
a31 a32 a33 | a31 a32
Multiply the entries along the three diagonals running down to the right (these products get a plus sign) and along the three diagonals running down to the left (these get a minus sign), then add or subtract these products:
det(A) = a11 a22 a33 + a12 a23 a31 + a13 a21 a32 - a13 a22 a31 - a11 a23 a32 - a12 a21 a33.
Please note that Sarrus's rule works only for 3 × 3 matrices. For larger matrices you either have to use row or column reduction, cofactor expansion, or see if the matrix is block-triangular.

3.9 Block-Triangular Matrices


If A is a block-triangular matrix, then the determinant of A is the product of the determinants of its diagonal blocks.
Example 3.9.1.
    [1 1 0 4 5 6]
    [3 4 5 9 4 6]
    [3 2 1 4 3 1]       [1 1 0]
det [0 0 0 1 0 2] = det [3 4 5] · det [1] · det [2 1]
    [0 0 0 0 2 1]       [3 2 1]                 [5 2]
    [0 0 0 0 5 2]
= 6 · 1 · (-1)
= -6.

Chapter 4
Vector Spaces
4.1 Introduction to the Euclidean n-space
Definition 4.1.1. A vector is an ordered sequence of numbers:
~u = [u1]
     [u2]
     [..]
     [un]
where u1, u2, . . . , un are the coordinates of the vector ~u. We may write a vector in a row format to save space:
~u = (u1, u2, u3, . . . , un).
Example 4.1.1.
(3, 5) is a vector in R2 (in the plane),
(2, 3, 3) is a vector in R3 (in space),
(3, 5, 7, 0, 1) is a vector in R5.
Let us consider a simple example. Nonzero vectors in R2 (the plane) can be represented geometrically by directed line segments. For example, the vector
(3, 5)
can be represented by the directed line segment starting at the origin and pointing to (3, 5); see Figure 4.1. In general, the vector
~x = (x1, x2)

[Figure 4.1: Vectors in R2 -- the vector (3, 5) drawn from (0, 0) to (3, 5), and the same vector drawn from other starting points.]


is associated with the line segment in the plane from (0, 0) to (x1 , x2 ). Every vector
has a magnitude (length) and direction. The line segment starting at a point (a, b)
and ending at (a + x1 , b + x2 ) has the same magnitude and direction. So the vector
~x could be represented by this line segment.
Definition 4.1.2 (Vector arithmetic). Let ~u = (u1, u2, . . . , un) and ~v = (v1, v2, . . . , vn) be two vectors in Rn, and let k be any scalar number from R.
1. ~u + ~v = (u1 + v1, u2 + v2, . . . , un + vn)
2. ~u - ~v = (u1 - v1, u2 - v2, . . . , un - vn)
3. k~u = (ku1, ku2, . . . , kun)
4. ~0 = (0, 0, . . . , 0) is the zero vector.
Theorem 4.1.1. Properties of vector arithmetic:
1. ~u + ~v = ~v + ~u (commutative law)
2. (~u + ~v) + ~w = ~u + (~v + ~w) (associative law)
3. ~u + ~0 = ~u
4. ~u + (-~u) = ~0.


4.2 Linear Transformation from Rn to Rm
Definition 4.2.1. If the domain of a function f is Rn and it maps every vector from Rn to Rm, then f is called a map or transformation from Rn to Rm. The notation is
f : Rn -> Rm.
If n = m, then f : Rn -> Rn is often called an operator.
Example 4.2.1.
f : R2 -> R, f(x, y) = x^2 + y^2.
Example 4.2.2.
g : R2 -> R3, g(x1, x2) = (x1 + x2, 2x1x2, x1^2 - x2^2).
With another notation: g(x1, x2) = (w1, w2, w3), where
w1 = x1 + x2
w2 = 2x1x2
w3 = x1^2 - x2^2.
Definition 4.2.2. A transformation T : Rn -> Rm is a linear transformation if for all vectors ~u and ~v in Rn and every scalar c:
(a) T(~u + ~v) = T(~u) + T(~v), and
(b) T(c~u) = cT(~u).
Example 4.2.3. The transformation T : R3 -> R2 defined by
T(x1, x2, x3) = (2x1 + 1, x2 - 3x3)
is not linear. To show this, take two vectors (a, b, c) and (x, y, z) from R3; then
T((a, b, c) + (x, y, z)) = T(a + x, b + y, c + z) = (2(a + x) + 1, b + y - 3(c + z)),
but
T(a, b, c) + T(x, y, z) = (2a + 1, b - 3c) + (2x + 1, y - 3z) = (2a + 2x + 2, b + y - 3c - 3z)
is not the same.

Theorem 4.2.1. A transformation T : Rn -> Rm is a linear transformation if the new coordinates, w1, w2, . . . , wm, can be expressed as linear functions of the old coordinates, x1, x2, . . . , xn.
Example 4.2.4. Let T : R3 -> R2, where
w1 = x1 + 2x2 - 4x3
w2 = -x2 + 5x3
is a linear transformation. We can write this as a matrix equation:
[w1] = [1  2 -4] [x1]
[w2]   [0 -1  5] [x2]
                 [x3]
The matrix
[1  2 -4]
[0 -1  5]
is called the standard matrix for the linear transformation T. Note that if T maps from Rn to Rm, then the matrix of the linear transformation is of size m × n.
Example 4.2.5. Let T : R3 -> R3, where
T [x1]   [x1 + 3x2       ]
  [x2] = [2x1 + x2 + 2x3 ]
  [x3]   [4x1 + 5x2 - 3x3]
This is also a linear transformation; its standard matrix is:
A = [1 3  0]
    [2 1  2]
    [4 5 -3]
Then the transformation is just multiplication by the matrix A, that is, T(~x) = A~x:
T [x1]   [1 3  0] [x1]
  [x2] = [2 1  2] [x2]
  [x3]   [4 5 -3] [x3]
Why do we call this matrix the standard matrix of T? If you calculate the image of the standard coordinate vectors of R3 under the map T you will find the answer:
T [1]   [1]
  [0] = [2]
  [0]   [4]
the first column of the matrix.



0
3

T 1 = 1 ,
0
5
the second column of the matrix.

0
0
T 0 = 2 ,
1
3
the third column of the matrix. Actually this gives another method to find the
standard matrix of a linear transformation. We just have to calculate the image of
the standard basis vectors of Rn and put them in the column of the matrix.
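Not in the original notes, but this column-by-column recipe is easy to automate. A sketch assuming NumPy; the function T below is simply this example's formula:

    import numpy as np

    def T(x):
        x1, x2, x3 = x
        return np.array([x1 + 3*x2,
                         2*x1 + x2 + 2*x3,
                         4*x1 + 5*x2 - 3*x3])

    # the images of the standard basis vectors are the columns of the standard matrix
    A = np.column_stack([T(e) for e in np.eye(3)])
    print(A)   # rows: (1, 3, 0), (2, 1, 2), (4, 5, -3)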
Definition 4.2.3. Let T be a linear transformation with standard matrix A. The transformation T is invertible if its matrix A is invertible (i.e., when det(A) ≠ 0). The inverse of the transformation T is denoted by T^{-1}, and its standard matrix is A^{-1}.
Example 4.2.6. Let T : R^3 → R^3 be defined by
$$T\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 \\ 3 & 4 & 5 \\ 3 & 2 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.$$
In Example 3.7.1 we calculated that $\det\begin{pmatrix} 1 & 1 & 0 \\ 3 & 4 & 5 \\ 3 & 2 & 1 \end{pmatrix} = 6 \ne 0$, so the transformation is invertible, and
$$T^{-1}\begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix} = \begin{pmatrix} -1 & -1/6 & 5/6 \\ 2 & 1/6 & -5/6 \\ -1 & 1/6 & 1/6 \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix}.$$
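As a quick numerical cross-check (an aside, assuming NumPy is available), the inverse matrix above can be recomputed directly:

    import numpy as np

    A = np.array([[1, 1, 0],
                  [3, 4, 5],
                  [3, 2, 1]], dtype=float)
    print(np.linalg.det(A))   # 6.0 up to rounding, so T is invertible
    print(np.linalg.inv(A))   # the standard matrix of T^{-1} shown above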
Example 4.2.7. The transformation T : R^n → R^m, T(x) = 0 is called the zero transformation. Its standard matrix is the m × n zero matrix.
Example 4.2.8. The transformation T : R^n → R^n, T(x) = x is called the identity transformation. Its standard matrix is the n × n identity matrix.
Example 4.2.9. The projection from R^3 onto the xy-plane is also a linear transformation:
$$P : \mathbb{R}^3 \to \mathbb{R}^3, \qquad P\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x \\ y \\ 0 \end{pmatrix}.$$
Example 4.2.10. The rotation of the xy-plane by angle θ is also a linear transformation, and its standard matrix is
$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$
For example,
$$R : \mathbb{R}^2 \to \mathbb{R}^2, \qquad R\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \sqrt{3}/2 & -1/2 \\ 1/2 & \sqrt{3}/2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$
is a rotation by 30 degrees (counterclockwise).


Example 4.2.11. The dilatation by a number λ from R^n to R^n is also a linear transformation:
D : R^n → R^n, D(x) = λx.
Example 4.2.12.
$$T : \mathbb{R}^2 \to \mathbb{R}^2, \qquad T\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \sqrt{3} & -1 \\ 1 & \sqrt{3} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$
is a rotation by 30 degrees (counterclockwise) combined with a dilatation by 2.
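An aside, not from the notes: both claims are easy to verify numerically. A short NumPy sketch:

    import numpy as np

    theta = np.pi / 6                              # 30 degrees
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    print(2 * R)                      # equals [[sqrt(3), -1], [1, sqrt(3)]]
    print(R @ np.array([1.0, 0.0]))   # e1 rotates to (sqrt(3)/2, 1/2)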

4.3 Real Vector Spaces


Definition 4.3.1. Let V be a non-empty set of objects on which two operations are defined: addition and multiplication by scalars. If these two operations have the following properties, then we call V a real vector space.
1. V is closed under addition: if u and v are in V, then u + v ∈ V.
2. The addition is associative: (u + v) + w = u + (v + w).
3. There is a zero vector 0 (additive identity), and for that: 0 + u = u.
4. There is an inverse for addition: u + (−u) = 0.
5. The addition is commutative: u + v = v + u.
6. V is closed under scalar multiplication: if u ∈ V and k is a scalar, then ku ∈ V.
7. The scalar multiplication is distributive: k(u + v) = ku + kv.
8. The scalar multiplication is distributive: (k + l)u = ku + lu.
9. The scalar multiplication is associative: k(lu) = (kl)u.
10. Multiplication by the unit: 1u = u.
Example 4.3.1. Here are some examples of real vector spaces:
V = {0}
V = R
V = R^2
V = R^3
V = R^n
V = {all m × n real matrices}
V = {all 2 × 2 real matrices}
V = {all continuous functions on R}
V = {all polynomials of degree ≤ n}

4.4 Subspaces
Definition 4.4.1. A subset W of a vector space V is called a subspace of V if W is itself a vector space under the same addition and scalar multiplication. We have to check the following three things:
1. The zero vector of V is in W.
2. For any two elements u and v from W: u + v ∈ W.
3. If u ∈ W and k is any scalar, then ku ∈ W.
Example 4.4.1. Let
V = P = {all polynomials} and
W = P_n = {polynomials of degree ≤ n}.
1. The zero vector of P is the zero polynomial, which has degree ≤ n, so it is also in P_n.
2. The sum of two polynomials of degree ≤ n is also a polynomial of degree ≤ n, so P_n is closed under addition.
3. If a_n x^n + a_{n−1} x^{n−1} + ⋯ + a_0 ∈ P_n and k is a scalar, then k(a_n x^n + a_{n−1} x^{n−1} + ⋯ + a_0) = ka_n x^n + ka_{n−1} x^{n−1} + ⋯ + ka_0 is also in P_n.
Therefore P_n is a subspace of P.
Example 4.4.2. Let
V = {all n × n real matrices} and
W = {all n × n upper triangular matrices}.
1. The zero vector in V is the n × n zero matrix. The zero matrix is upper triangular, so it is in W.
2. The sum of two upper triangular matrices is also upper triangular, so W is closed under addition.
3. k times an upper triangular matrix is also upper triangular, so W is closed under scalar multiplication.
Therefore W is a subspace of V.
Example 4.4.3. Any plane through the origin is a subspace of R^3.
Example 4.4.4. Let V = R^3. Show that
W = {(a, b, c) : a^2 + b^2 + c^2 ≤ 1}
is not a subspace of V. The zero vector is in W, but you can find two vectors in W whose sum is no longer in W; for example, (1, 0, 0) + (0, 1, 0) = (1, 1, 0) ∉ W. Or you can find a vector in W and a scalar so that the product is not in W; for example, 2·(1, 0, 0) = (2, 0, 0) ∉ W.
Example 4.4.5. Let V be the vector space of functions f : R → R. Show that
W = {f : f(−t) = f(t)}
is a subspace of V. Notice that W is the set of all even functions.
1. The zero vector in V is the zero function f(t) = 0. This is an even function, so it is also in W.
2. Let f and g be two functions from W, i.e., f and g are even functions. Is the sum h(t) = f(t) + g(t) in W, i.e., is h even?
h(−t) = f(−t) + g(−t) = f(t) + g(t) = h(t),
so h = f + g is in W. W is closed under addition.
3. Let f be a function from W (i.e., f is an even function) and k be a scalar. Is h(t) = k·f(t) also in W, i.e., is h even?
h(−t) = kf(−t) = kf(t) = h(t),
so h = kf is in W. W is closed under scalar multiplication.
Therefore W is a subspace of V.

More examples of subspaces


Problem 4.4.6. Show that W = {(a, b, c) : a ≥ 0} is not a subspace of V = R^3.
Solution. The subset W is not closed under scalar multiplication. If we take, for example, the vector u = (1, 2, 0) from W and a negative constant, say k = −3, then ku = (−3, −6, 0) does not belong to W, since its first coordinate is negative.
Problem 4.4.7. Let V be the vector space of functions f : R → R. Show that
W = {f (x) : f (1) = 0},
all functions whose value at 1 is 0, is a subspace of V .
Solution. Check the three things required for subspaces.
(a) The zero-vector in the vector space V is the zero function: f (x) = 0 for all
values of x. Since the zero function is zero everywhere, in particular at x = 1,
so it belongs to W .
(b) Check the addition. Take two functions f and g from W . This means f (1) = 0,
and g(1) = 0. Then for the sum: (f + g)(1) = f (1) + g(1) = 0 + 0 = 0, so the
sum f + g is also in W .
(c) Check the scalar multiplication. Take a function h from W; that means h(1) = 0. Let c be any constant. Then (c·h)(1) = c·h(1) = c·0 = 0, so c·h is also in W.
Therefore W is a subspace of V .
 
Problem 4.4.8. Let W = {x : |x_1| = |x_2|}, where x = (x_1, x_2). The set W is a subset of R^2, but is W a subspace of R^2?
Solution. We can show that W is not closed under addition. Take, for example, u = (1, 1) and v = (3, −3) from W. Then u + v = (4, −2) does not belong to W, because |4| ≠ |−2|.

4.5 Spanning
Definition 4.5.1. Take a set of vectors {v_1, v_2, …, v_r} from a vector space V. Define W as the set of all linear combinations of the vectors v_1, v_2, …, v_r, that is, all vectors of the form
k_1 v_1 + k_2 v_2 + ⋯ + k_r v_r,
where k_1, k_2, …, k_r are arbitrary scalars. Then W is called the space spanned by v_1, v_2, …, v_r. Notation: W = span{v_1, v_2, …, v_r}.
Example 4.5.1. span{1, t, t^2, t^3, …, t^n} = P_n.
Example 4.5.2. span{(1, 0, 0), (0, 1, 0), (0, 0, 1)} = R^3.
Example 4.5.3. Is the vector u = (8, 4, 3) in the span of {(1, 2, 1), (2, 0, −1)}? This is the same question as: can we write the vector u as a linear combination of (1, 2, 1) and (2, 0, −1)? Are there constants a and b so that
a(1, 2, 1) + b(2, 0, −1) = (8, 4, 3)?
This is a system of linear equations:
a + 2b = 8
2a = 4
a − b = 3,
whose augmented matrix is
$$\begin{pmatrix} 1 & 2 & 8 \\ 2 & 0 & 4 \\ 1 & -1 & 3 \end{pmatrix}.$$
You can solve this system using the Gaussian algorithm; the system has no solution, which means that u is not in span{(1, 2, 1), (2, 0, −1)}.
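An aside, not in the original notes: the same question can be settled numerically with a least-squares solve, since an exact solution exists precisely when the residual is zero. A sketch assuming NumPy:

    import numpy as np

    # columns are the spanning vectors; solve a*(1,2,1) + b*(2,0,-1) = (8,4,3)
    M = np.array([[1.0,  2.0],
                  [2.0,  0.0],
                  [1.0, -1.0]])
    u = np.array([8.0, 4.0, 3.0])
    coeffs, residual, rank, _ = np.linalg.lstsq(M, u, rcond=None)
    print(residual)   # nonzero -> inconsistent system -> u is not in the span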
Example 4.5.4. The vectors v_1 = (1, 0, 2) and v_2 = (0, 1, 0) do not span R^3: every linear combination a v_1 + b v_2 = (a, b, 2a) has third coordinate equal to twice the first, so, for example, (0, 0, 1) is not in the span.
Example 4.5.5. The vectors v_1 = (1, 0, 2), v_2 = (0, 1, 0), and v_3 = (2, 1, 4) do not span R^3 either, since v_3 = 2v_1 + v_2, so the span is the same as in the previous example.

4.6 Linear Independence


Definition 4.6.1. A set of vectors v_1, v_2, …, v_r is called linearly dependent if
k_1 v_1 + k_2 v_2 + ⋯ + k_r v_r = 0
for some scalars k_1, k_2, …, k_r, not all zero. Otherwise v_1, v_2, …, v_r are linearly independent.
Example 4.6.1. The set of vectors
v_1 = (1, −2, 3), v_2 = (5, 6, −1), v_3 = (3, 2, 1)
is linearly dependent, because
k_1 v_1 + k_2 v_2 + k_3 v_3 = 0
holds not only for the scalars k_1 = 0, k_2 = 0, k_3 = 0. We can easily see this if we write down the coefficient matrix for the unknowns k_1, k_2, k_3:
$$A = \begin{pmatrix} 1 & 5 & 3 \\ -2 & 6 & 2 \\ 3 & -1 & 1 \end{pmatrix}.$$
Since det(A) = 0, the matrix A is not invertible, so the homogeneous matrix equation Ak = 0 has infinitely many solutions. For example, k_1 = 1, k_2 = 1, k_3 = −2 is also a solution:
v_1 + v_2 − 2v_3 = 0.
Example 4.6.2. The set of vectors
v_1 = (1, −2, 0), v_2 = (5, 6, −1), v_3 = (3, 2, 1)
is linearly independent, because
k_1 v_1 + k_2 v_2 + k_3 v_3 = 0
only if the scalars k_1, k_2, and k_3 are all equal to 0. We can easily see this if we write down the coefficient matrix for the unknowns k_1, k_2, k_3:
$$A = \begin{pmatrix} 1 & 5 & 3 \\ -2 & 6 & 2 \\ 0 & -1 & 1 \end{pmatrix}.$$
Since det(A) ≠ 0, the matrix equation Ak = 0 has only the trivial solution, that is, k_1 = 0, k_2 = 0, k_3 = 0.
So if the coefficient matrix is a square matrix, then the determinant is a convenient test. If the coefficient matrix is not a square matrix, then we have to use the Gaussian algorithm to determine how many solutions the vector equation has.
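As an aside (not in the original notes), both tests are one-liners numerically; a sketch assuming NumPy, reusing the vectors of the surrounding examples:

    import numpy as np

    # square case (Example 4.6.2): vectors as columns, nonzero determinant <=> independent
    A = np.array([[1.0, 5.0, 3.0],
                  [-2.0, 6.0, 2.0],
                  [0.0, -1.0, 1.0]])
    print(np.linalg.det(A))          # 24.0 -> linearly independent

    # non-square case: compare the rank with the number of vectors
    B = np.column_stack([[1, 0, 2, 4], [0, 1, 0, 0], [1, 3, 2, 4]])  # Example 4.6.3 with c = 1
    print(np.linalg.matrix_rank(B))  # 2 < 3 -> linearly dependent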
Example 4.6.3. We would like to determine the values of the parameter c for which the set of vectors
v_1 = (1, 0, 2, 4), v_2 = (0, 1, 0, 0), v_3 = (c, 3, 2, 4)
is linearly independent. For this we have to solve the vector equation
k_1 v_1 + k_2 v_2 + k_3 v_3 = 0,
whose augmented matrix (note that its coefficient matrix is not a square matrix) is
$$\begin{pmatrix} 1 & 0 & c & 0 \\ 0 & 1 & 3 & 0 \\ 2 & 0 & 2 & 0 \\ 4 & 0 & 4 & 0 \end{pmatrix}.$$
Using the Gaussian algorithm we can get a row-echelon form:
$$\begin{pmatrix} 1 & 0 & c & 0 \\ 0 & 1 & 3 & 0 \\ 0 & 0 & -c+1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$
The vectors are linearly independent if this system has only the (0, 0, 0) solution, that is, when c ≠ 1.
Example 4.6.4. The standard basis vectors e_1 = (1, 0, 0), e_2 = (0, 1, 0), e_3 = (0, 0, 1) of R^3 are linearly independent.
Example 4.6.5. The vectors u = (2, 0, 1, 3, 5) and v = (−4, 0, −2, −6, −10) are linearly dependent, because 2u + v = 0. Notice that two vectors are linearly dependent if one is a constant multiple of the other.
Example 4.6.6. The set {1, t, t^2, t^3} is linearly independent in P_3.
Example 4.6.7. If r > n, then any set of r vectors from R^n is linearly dependent.
Example 4.6.8. The set of vectors S = {v_1, v_2, …, v_k, 0} is linearly dependent.

4.7 Basis and Dimension


Definition 4.7.1. A set S = {v_1, v_2, …, v_n} of vectors in the vector space V is a basis of V if
1. S is linearly independent, and
2. S spans V.
The number of vectors in a basis is the dimension of the vector space V. Notation: dim(V) = n.
Example 4.7.1. The set
{(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)}
is the standard basis of R^4; dim(R^4) = 4.
Example 4.7.2. The vectors
v_1 = (1, 2, 1), v_2 = (2, 9, 0), v_3 = (3, 3, 4)
form a basis for R^3.
Example 4.7.3. Two vectors of R^3 cannot form a basis for R^3: they cannot span R^3.
Example 4.7.4. Four vectors of R^3 cannot form a basis for R^3: they are not linearly independent.
Example 4.7.5. S = {t^n, t^{n−1}, …, t, 1} is the standard basis for P_n, so dim(P_n) = n + 1.
Example 4.7.6. The set
$$S = \left\{ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \right\}$$
is the standard basis for M_{2×2}, and therefore dim(M_{2×2}) = 4.
Example 4.7.7. The set
$$S = \left\{ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \end{pmatrix}, \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix} \right\}$$
forms a basis for M_{2×3}, and dim(M_{2×3}) = 6.
We can show that the set S is linearly independent by solving the homogeneous equation
c_1 A_1 + c_2 A_2 + c_3 A_3 + c_4 A_4 + c_5 A_5 + c_6 A_6 = 0,
where A_1, …, A_6 denote the six matrices above, in order. Comparing the entries on the two sides gives an equation for each entry:
c_1 + c_2 + c_3 + c_4 + c_5 + c_6 = 0
c_2 + c_3 + c_4 + c_5 + c_6 = 0
c_3 + c_4 + c_5 + c_6 = 0
c_4 + c_5 + c_6 = 0
c_5 + c_6 = 0
c_6 = 0.
We can easily solve this system and see that the only solution is the trivial (zero) solution. In more difficult cases, form the augmented matrix
$$\begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}$$
and use Gauss-Jordan elimination to solve the system. Since the coefficient matrix
$$A = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$$
is a square matrix, we can also use the determinant. The determinant of A is equal to 1, therefore the coefficient matrix A is invertible, so the homogeneous system Ac = 0 has only the trivial solution.
The determinant of the coefficient matrix A also answers our other question: does the set S span M_{2×3}? Since the determinant of A is nonzero, the coefficient matrix A is invertible, therefore the matrix equation Ac = b has a unique solution for every vector b. This shows that the set S spans M_{2×3}.

4.8 Column Space, Row Space, Null Space, Rank and Nullity
Column space
How can we find a basis for the subspace spanned by vectors which are not linearly
independent?
Example 4.8.1. Let us show first that the vectors
v_1 = (1, 2, 1, 0), v_2 = (2, 1, 2, 1), v_3 = (0, 3, 0, −1), v_4 = (1, 1, 1, 2)
are not linearly independent. That is, we have to show that the vector equation
k_1 v_1 + k_2 v_2 + k_3 v_3 + k_4 v_4 = 0
has a nonzero solution. The augmented matrix is
$$\begin{pmatrix} 1 & 2 & 0 & 1 & 0 \\ 2 & 1 & 3 & 1 & 0 \\ 1 & 2 & 0 & 1 & 0 \\ 0 & 1 & -1 & 2 & 0 \end{pmatrix}.$$
You can either calculate the determinant, or you can bring this augmented matrix to row-echelon form:
$$\begin{pmatrix} 1 & 2 & 0 & 1 & 0 \\ 0 & 1 & -1 & 2 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$
From this, k_3 is a free variable, so the vector equation
k_1 v_1 + k_2 v_2 + k_3 v_3 + k_4 v_4 = 0
has nonzero solutions, and therefore the set of vectors {v_1, v_2, v_3, v_4} is linearly dependent.
How can we find a basis (a set of linearly independent vectors) for the subspace they span? We will use the Gaussian algorithm. Form a matrix from these vectors so that the columns of the matrix are the given vectors. Using elementary row operations, find a row-echelon form of the matrix. Then the columns of the original matrix which correspond to the columns with the leading 1s of the row-echelon matrix form a basis for the space spanned by the original (column) vectors (i.e., a basis for the column space of the original matrix).
Definition 4.8.1. Let A be an m × n matrix. The vectors formed by the columns of A are called the column vectors of A. The space spanned by the column vectors is the column space of A.
So form the matrix from the given vectors:
$$A = \begin{pmatrix} 1 & 2 & 0 & 1 \\ 2 & 1 & 3 & 1 \\ 1 & 2 & 0 & 1 \\ 0 & 1 & -1 & 2 \end{pmatrix}.$$
We already found its row-echelon form:
$$R = \begin{pmatrix} 1 & 2 & 0 & 1 \\ 0 & 1 & -1 & 2 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$

A basis for the space spanned by the vectors v_1, v_2, v_3, and v_4 is given by the columns of the original matrix A which correspond to the columns with leading 1s in the row-echelon matrix R, that is, the first, second and fourth columns:
$$\begin{pmatrix} 1 \\ 2 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 2 \\ 1 \\ 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ 1 \\ 2 \end{pmatrix}.$$
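Not part of the original notes: SymPy's rref reports exactly these pivot columns. A small sketch:

    from sympy import Matrix

    A = Matrix([[1, 2, 0, 1],
                [2, 1, 3, 1],
                [1, 2, 0, 1],
                [0, 1, -1, 2]])
    R, pivots = A.rref()
    print(pivots)                       # (0, 1, 3): columns 1, 2 and 4 of A
    print([A.col(j) for j in pivots])   # a basis for the column space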
Row space
Definition 4.8.2. The vectors formed by the rows of a matrix A are called row
vectors. The space spanned by these row vectors is called the row space of the matrix
A.
Example 4.8.2. Find a basis for the row space of
$$B = \begin{pmatrix} 1 & -3 & 4 & -2 & 5 & 4 \\ 2 & -6 & 9 & -1 & 8 & 2 \\ 2 & -6 & 9 & -1 & 9 & 7 \\ -1 & 3 & -4 & 2 & -5 & -4 \end{pmatrix}.$$
Again, use elementary row operations to bring the matrix into row-echelon form:
$$R = \begin{pmatrix} 1 & -3 & 4 & -2 & 5 & 4 \\ 0 & 0 & 1 & 3 & -2 & -6 \\ 0 & 0 & 0 & 0 & 1 & 5 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$
The nonzero row vectors of R form a basis for the row space of R, and also for the row space of the original matrix B. The basis vectors for the row space of B are:
r_1 = (1, −3, 4, −2, 5, 4), r_2 = (0, 0, 1, 3, −2, −6), r_3 = (0, 0, 0, 0, 1, 5).

Null space
Definition 4.8.3. The solution space of the homogeneous system Ax = 0 is called the null space of A. The dimension of the null space is called the nullity of A and is denoted by nullity(A).
Example 4.8.3. Find the null space of the system
x + y = 0
3x − y + 4z = 0.
The augmented matrix is
$$\begin{pmatrix} 1 & 1 & 0 & 0 \\ 3 & -1 & 4 & 0 \end{pmatrix};$$
its row-echelon form is
$$\begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & -1 & 0 \end{pmatrix},$$
so z is a free variable, y = z and x = −z. Therefore the null space (or solution space) of the system consists of all vectors of the form (−z, z, z). With mathematical notation, the null space = {(−z, z, z) : z ∈ R}. A basis for the solution space is (−1, 1, 1). The dimension of the solution space is 1: nullity(A) = 1.
Example 4.8.4. Find the solution space, and a basis for the solution space, of the system
x_1 + x_2 − 2x_3 − x_5 = 0
2x_1 + 2x_2 − x_3 + x_5 = 0
−x_1 − x_2 + 2x_3 − 3x_4 + x_5 = 0
x_3 + x_4 + x_5 = 0.
The augmented matrix is
$$\begin{pmatrix} 1 & 1 & -2 & 0 & -1 & 0 \\ 2 & 2 & -1 & 0 & 1 & 0 \\ -1 & -1 & 2 & -3 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 \end{pmatrix};$$
its row-echelon form is
$$\begin{pmatrix} 1 & 1 & -2 & 0 & -1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix},$$
so x_2 and x_5 are free variables. Then x_4 = 0 and x_3 = −x_5, and x_1 = x_5 + 2x_3 − x_2 = −x_5 − x_2. Therefore the null space (or solution space) of the system consists of all vectors of the form (−x_5 − x_2, x_2, −x_5, 0, x_5). With mathematical notation, the
null space = {(−x_5 − x_2, x_2, −x_5, 0, x_5) : x_2, x_5 ∈ R}.
A basis for the solution space is {(−1, 0, −1, 0, 1), (−1, 1, 0, 0, 0)}. The dimension of the solution space is 2: nullity(A) = 2.
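An aside, not from the notes: SymPy computes such a basis directly (it may return a different but equivalent basis). A sketch:

    from sympy import Matrix

    A = Matrix([[1, 1, -2, 0, -1],
                [2, 2, -1, 0, 1],
                [-1, -1, 2, -3, 1],
                [0, 0, 1, 1, 1]])
    for v in A.nullspace():   # two basis vectors -> nullity(A) = 2
        print(v.T)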

Rank and Nullity


Theorem 4.8.1. If A is an m × n matrix, then dim(row space) = dim(column space) = the number of leading 1s in a row-echelon form of A.
Definition 4.8.4. The rank of A is the dimension of the row space of A (same as the dimension of the column space of A, and same as the number of leading 1s in a row-echelon form of A). Notation: rank(A).
Theorem 4.8.2. If A is an m × n matrix, then
rank(A) + nullity(A) = n.
Example 4.8.5. Let A be a 5 × 7 matrix with rank(A) = 3. Then nullity(A) = 7 − 3 = 4. Notice that rank(A^T) = rank(A) = 3, because dim(row space) = dim(column space), and we get A^T from A by interchanging the rows and columns of the matrix. Therefore we can calculate nullity(A^T) = 5 − 3 = 2 (the number of columns in A^T minus the rank of A^T).
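Not from the notes: a quick numerical illustration of this bookkeeping, assuming NumPy (the rank-3 matrix below is a made-up example, built as a product of random factors):

    import numpy as np

    # build a 5 x 7 matrix of rank 3 (product of random 5x3 and 3x7 factors)
    A = np.random.rand(5, 3) @ np.random.rand(3, 7)
    r = np.linalg.matrix_rank(A)
    print(r, 7 - r)                             # rank(A) = 3, nullity(A) = 4
    print(np.linalg.matrix_rank(A.T), 5 - r)    # rank(A^T) = 3, nullity(A^T) = 2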
Example 4.8.6. Let
$$A = \begin{pmatrix} 1 & 1 \\ 2 & 3 \\ 3 & 2 \end{pmatrix}.$$
Find a basis for its row space, a basis for its column space, and its null space (with a basis, if one exists). What are rank(A) and nullity(A)? (Remark: the system Ax = 0 is called over-determined, because there are more equations than unknowns.)
First we solve the system Ax = 0, so we set up the augmented matrix
$$\begin{pmatrix} 1 & 1 & 0 \\ 2 & 3 & 0 \\ 3 & 2 & 0 \end{pmatrix}.$$
Its row-echelon form is
$$\begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
A basis for the row space of A is
{(1, 1), (0, 1)},
those rows of the row-echelon form which have a leading 1.
A basis for the column space of A is
$$\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}, \begin{pmatrix} 1 \\ 3 \\ 2 \end{pmatrix},$$
those columns of A which correspond to the columns of the row-echelon matrix with a leading 1.
The only solution of the system is the zero solution (0, 0), so the null space consists of the zero vector only:
null space = {(0, 0)}.
The null space has no basis (it has dimension 0). Therefore rank(A) = 2 and nullity(A) = 2 − 2 = 0.

Z. G
onye

66

Vector Spaces

Z. G
onye

Chapter 5
Eigenvalues and Eigenvectors
5.1 Eigenvalues and Eigenvectors
Let T : R^n → R^n be a linear transformation. Then T can be represented by a matrix (the standard matrix), and we can write
T(v) = Av.
Example 5.1.1. Consider the transformation T : R^2 → R^2 given by its standard matrix
$$A = \begin{pmatrix} 3 & -1 \\ 5 & -3 \end{pmatrix},$$
and let's calculate the image of some vectors under the transformation T:
$$T\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 & -1 \\ 5 & -3 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 2 \end{pmatrix}, \qquad T\begin{pmatrix} 3 \\ 3 \end{pmatrix} = \begin{pmatrix} 6 \\ 6 \end{pmatrix}.$$
We may notice that the image of (x, x) is 2(x, x). Let's calculate some more images:
$$T\begin{pmatrix} 1 \\ 5 \end{pmatrix} = \begin{pmatrix} -2 \\ -10 \end{pmatrix}, \qquad T\begin{pmatrix} 2 \\ 10 \end{pmatrix} = \begin{pmatrix} -4 \\ -20 \end{pmatrix}.$$
We may notice that the image of a vector (x, 5x) is −2(x, 5x). A couple more images:
$$T\begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \end{pmatrix}, \qquad T\begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} -4 \\ -8 \end{pmatrix}.$$
There are no such nice patterns for these vectors.

Although a transformation given by a matrix A may move vectors in a variety of directions, it often happens that there are special vectors on which the action is quite simple. In this section we would like to find those nonzero vectors v which are mapped to a scalar multiple of themselves, that is,
Av = λv
for some scalar λ. In our example above, these vectors are (x, x) and (x, 5x), where x can be any nonzero number.
Definition 5.1.1. If A is an n × n matrix, then a nonzero vector v in R^n is called an eigenvector of A if there is a scalar λ such that
Av = λv.
The scalar λ is called an eigenvalue of A, and v is said to be an eigenvector of A corresponding to λ.
We emphasize that eigenvectors are nonzero vectors. So the question is: when can we find a nonzero vector v which satisfies the matrix equation Av = λv for some scalar λ? Let's rearrange the equation Av = λv to
Av − λv = 0.
Then we can factor v out of both terms on the left-hand side. However, we have to be careful: these products are not commutative, so we have to keep the order, and we also have to write λI (a matrix) instead of λ, which is only a number. So we get
(A − λI)v = 0.
This is a homogeneous equation Bv = 0 with B = A − λI. This homogeneous linear system has nonzero solutions if det(B) = 0, that is, if det(A − λI) = 0.
So here is the idea: first we find those values of λ for which det(A − λI) = 0. Then for such a value of λ we solve the linear system (A − λI)v = 0 to get the eigenvectors.
Definition 5.1.2. The equation det(A − λI) = 0 is called the characteristic equation of A. When expanded, the determinant det(A − λI) is a polynomial in λ. This is called the characteristic polynomial of A.
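As an aside (not part of the notes), NumPy finds eigenvalues and eigenvectors numerically; here applied to the matrix of Example 5.1.1:

    import numpy as np

    A = np.array([[3.0, -1.0],
                  [5.0, -3.0]])
    eigvals, eigvecs = np.linalg.eig(A)
    print(eigvals)    # 2 and -2
    print(eigvecs)    # columns are eigenvectors, multiples of (1, 1) and (1, 5)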
Definition 5.1.3. The eigenvectors corresponding to λ are the nonzero vectors in the solution space of (A − λI)v = 0. We call this solution space the eigenspace of A corresponding to λ.
Remark 5.1.1. In some books you will find the characteristic polynomial defined by det(λI − A). Using that definition, the characteristic polynomial has 1 as its leading coefficient. You can show that the polynomials det(λI − A) and det(A − λI) differ only by a negative sign if the size of A is odd; if the size of A is even, then the two polynomials are the same.

5.2 Examples
Example 5.2.1. Let
$$A = \begin{pmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{pmatrix}.$$
The characteristic polynomial of A is
$$\det(A - \lambda I) = \det\begin{pmatrix} -\lambda & 0 & -2 \\ 1 & 2-\lambda & 1 \\ 1 & 0 & 3-\lambda \end{pmatrix} = -\lambda^3 + 5\lambda^2 - 8\lambda + 4.$$
To get the eigenvalues, find the zeroes of the characteristic polynomial:
−λ^3 + 5λ^2 − 8λ + 4 = −(λ − 2)^2(λ − 1),
so the eigenvalues of A are: λ = 2, which has algebraic multiplicity 2 (that is, λ = 2 is a double root of the characteristic equation), and λ = 1, which has algebraic multiplicity 1 (that is, λ = 1 is a simple root of the characteristic equation).
Let's find the eigenspace, and a basis for the eigenspace, for each of the eigenvalues of A. To find the eigenspace corresponding to λ, we have to find the solution space of the equation (A − λI)v = 0. So for λ = 2 the augmented matrix is
$$\begin{pmatrix} -2 & 0 & -2 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 \end{pmatrix},$$
whose row-echelon form is
$$\begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$
Since there are two free variables, the solution space of (A − 2I)v = 0, and therefore the eigenspace of A corresponding to λ = 2, has dimension two. The geometric multiplicity of the eigenvalue λ = 2 is 2 (the dimension of the corresponding eigenspace). The solutions of (A − 2I)v = 0 are (−v_3, v_2, v_3), where v_2 and v_3 are free variables. These are the eigenvectors of A corresponding to λ = 2. The eigenspace corresponding to λ = 2 is
$$\left\{ \begin{pmatrix} -v_3 \\ v_2 \\ v_3 \end{pmatrix} : v_2, v_3 \in \mathbb{C} \right\}.$$
A basis for the eigenspace is
$$\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}.$$
To find the eigenspace corresponding to λ = 1 we repeat the same procedure. We have to find the solution space of the equation (A − 1I)v = 0; the augmented matrix is
$$\begin{pmatrix} -1 & 0 & -2 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 0 & 2 & 0 \end{pmatrix},$$
whose row-echelon form is
$$\begin{pmatrix} 1 & 0 & 2 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$
Since there is only one free variable, the solution space of (A − 1I)v = 0, and therefore the eigenspace of A corresponding to λ = 1, has dimension one. The geometric multiplicity of the eigenvalue λ = 1 is 1. The solutions of (A − I)v = 0 are (−2v_3, v_3, v_3), where v_3 is a free variable. These are the eigenvectors of A corresponding to λ = 1. The eigenspace corresponding to λ = 1 is
$$\left\{ \begin{pmatrix} -2v_3 \\ v_3 \\ v_3 \end{pmatrix} : v_3 \in \mathbb{C} \right\}.$$
A basis for the eigenspace is
$$\begin{pmatrix} -2 \\ 1 \\ 1 \end{pmatrix}.$$
Example 5.2.2. Let
$$B = \begin{pmatrix} 5 & 0 & 4 \\ 0 & 3 & 1 \\ 0 & 0 & -2 \end{pmatrix}.$$
It is a triangular matrix. The eigenvalues of B are λ = 5, 3, and −2. Each has algebraic multiplicity one. For each eigenvalue we can find the eigenspace and a basis for it. The eigenspace corresponding to λ = 5 is
$$\left\{ \begin{pmatrix} v_1 \\ 0 \\ 0 \end{pmatrix} : v_1 \in \mathbb{C} \right\},$$
with basis (1, 0, 0). The eigenspace corresponding to λ = 3 is
$$\left\{ \begin{pmatrix} 0 \\ v_2 \\ 0 \end{pmatrix} : v_2 \in \mathbb{C} \right\},$$
with basis (0, 1, 0). The eigenspace corresponding to λ = −2 is
$$\left\{ \begin{pmatrix} -\tfrac{4}{7} v_3 \\ -\tfrac{1}{5} v_3 \\ v_3 \end{pmatrix} : v_3 \in \mathbb{C} \right\}.$$
A basis for the eigenspace is (−4/7, −1/5, 1), or a more convenient one is (−20, −7, 35).
Example 5.2.3. Let
$$C = \begin{pmatrix} 5 & 1 \\ -1 & 3 \end{pmatrix}.$$
The characteristic polynomial of C is
$$\det(C - \lambda I) = \det\begin{pmatrix} 5-\lambda & 1 \\ -1 & 3-\lambda \end{pmatrix} = \lambda^2 - 8\lambda + 16.$$
To find the eigenvalues we have to find the roots of the characteristic polynomial:
λ^2 − 8λ + 16 = (λ − 4)^2,
so C has only one eigenvalue, λ = 4, which has algebraic multiplicity two (i.e., it is a double root of the characteristic equation).
To find the eigenspace corresponding to λ = 4 we have to find the solution space of the equation (C − 4I)v = 0; the augmented matrix is
$$\begin{pmatrix} 1 & 1 & 0 \\ -1 & -1 & 0 \end{pmatrix},$$
whose row-echelon form is
$$\begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
Since there is only one free variable, the solution space, and therefore the eigenspace corresponding to λ = 4, has dimension one. The geometric multiplicity of the eigenvalue λ = 4 is 1. The solutions, so the eigenvectors, are (−v_2, v_2), where v_2 is a free variable. The eigenspace corresponding to λ = 4 is
$$\left\{ \begin{pmatrix} -v_2 \\ v_2 \end{pmatrix} : v_2 \in \mathbb{C} \right\},$$
with basis (−1, 1).
Example 5.2.4. Let
$$N = \begin{pmatrix} 2 & 3 & -1 \\ 0 & 4 & 0 \\ 0 & 0 & -1 \end{pmatrix},$$
a triangular matrix. Then the matrix
$$N - \lambda I = \begin{pmatrix} 2-\lambda & 3 & -1 \\ 0 & 4-\lambda & 0 \\ 0 & 0 & -1-\lambda \end{pmatrix}$$
is also triangular, therefore the determinant of N − λI is the product of the entries along the main diagonal:
(2 − λ)(4 − λ)(−1 − λ),
and the roots of the characteristic equation of N are λ = 2, 4, and −1.
In general, if N is a triangular matrix, then the entries along its main diagonal are its eigenvalues.
Remark 5.2.1. If you add up the algebraic multiplicities of all eigenvalues of a given matrix, the sum equals the size of the matrix. The geometric multiplicity of an eigenvalue cannot be greater than its algebraic multiplicity.

5.3 Diagonalization
Definition 5.3.1. A square matrix A is called diagonalizable if there exists an invertible matrix P so that P^{-1}AP is a diagonal matrix.

Procedure for diagonalizing a matrix
1. Find the characteristic polynomial of the matrix A.
2. Find its roots to obtain the eigenvalues.
3. Repeat (a) and (b) for each eigenvalue λ of A:
(a) Form the augmented matrix of the equation (A − λI)v = 0 and bring it to a row-echelon form.
(b) Find a basis for the eigenspace corresponding to λ, that is, a basis for the solution space of (A − λI)v = 0.
4. Consider the collection S = {v_1, v_2, …, v_m} of all basis vectors of the eigenspaces found in step 3.
(a) If m is less than the size of the matrix A, then A is not diagonalizable.
(b) If m is equal to the size of the matrix A, then A is diagonalizable; the matrix P is the matrix whose columns are the vectors v_1, v_2, …, v_m found in step 3, and
$$D = \begin{pmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n \end{pmatrix},$$
where v_1 corresponds to λ_1, v_2 corresponds to λ_2, and so on.
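An aside, not in the original notes: SymPy implements this entire procedure. A sketch using the matrix of Example 5.2.1:

    from sympy import Matrix

    A = Matrix([[0, 0, -2],
                [1, 2, 1],
                [1, 0, 3]])
    P, D = A.diagonalize()        # raises an error if A is not diagonalizable
    print(D)                      # diagonal entries 1, 2, 2 (in some order)
    print(P.inv() * A * P == D)   # True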
We will look at the three examples we did in Section 5.2, and see whether the
matrices A, B, and C are diagonalizable.
Example 5.3.1.
$$A = \begin{pmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{pmatrix}$$
is diagonalizable, because it has three basis vectors for all of its eigenspaces combined: (−1, 0, 1) and (0, 1, 0) correspond to λ = 2, and (−2, 1, 1) corresponds to λ = 1. So
$$P = \begin{pmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
Note: since we could have found another basis for the eigenspaces, this matrix P is not unique.
Example 5.3.2. The matrix
$$B = \begin{pmatrix} 5 & 0 & 4 \\ 0 & 3 & 1 \\ 0 & 0 & -2 \end{pmatrix}$$
is also diagonalizable, because we found 3 basis vectors for the eigenspaces combined. Therefore
$$P = \begin{pmatrix} 1 & 0 & -20 \\ 0 & 1 & -7 \\ 0 & 0 & 35 \end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix} 5 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -2 \end{pmatrix}.$$

Example 5.3.3. The matrix
$$C = \begin{pmatrix} 5 & 1 \\ -1 & 3 \end{pmatrix}$$
is not diagonalizable, because it has only one basis vector for its eigenspace(s).
Example 5.3.4. If all eigenvalues are different, then the matrix is diagonalizable, because for each eigenvalue there will be one basis vector for the corresponding eigenspace. For example,
$$M = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 4 & -17 & 8 \end{pmatrix}$$
has eigenvalues λ = 4 and 2 ± √3. All eigenvalues are different, so M is diagonalizable. To find the matrix P you would have to find a basis for each of the three eigenspaces. However, we already know the diagonal form:
$$D = \begin{pmatrix} 4 & 0 & 0 \\ 0 & 2+\sqrt{3} & 0 \\ 0 & 0 & 2-\sqrt{3} \end{pmatrix}.$$
Example 5.3.5. The triangular matrix
$$N = \begin{pmatrix} 2 & 3 & -1 \\ 0 & 4 & 0 \\ 0 & 0 & -1 \end{pmatrix}$$
has eigenvalues λ = 2, 4 and −1. The eigenvalues of N are all different, so N is diagonalizable, and D can be
$$\begin{pmatrix} 2 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & -1 \end{pmatrix}.$$
To find the corresponding matrix P, you have to find an eigenvector for each eigenvalue.
Example 5.3.6. The matrix
$$\begin{pmatrix} 0 & 2 \\ -3 & 0 \end{pmatrix}$$
has complex eigenvalues, λ = ±√6 i. In the diagonal form we would see these complex entries. Since the diagonal form is not a matrix over R, we say this matrix is not diagonalizable over R.

5.4 Computing Powers of a Matrix


There are numerous problems that require the computation of high powers of a matrix. If the matrix is diagonal, then this is easy.
Example 5.4.1. The 100th power of
$$D = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
is
$$D^{100} = \begin{pmatrix} 2^{100} & 0 & 0 \\ 0 & 2^{100} & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Suppose that a matrix A is not diagonal, but diagonalizable. That is,
P^{-1}AP = D
for some diagonal matrix D and some invertible matrix P. Multiply this equation by P from the left and by P^{-1} from the right:
P P^{-1} A P P^{-1} = P D P^{-1};
using that P P^{-1} = I, P^{-1} P = I and AI = A, we get
A = P D P^{-1}.
Now let's take powers of A:
$$A^n = (PDP^{-1})(PDP^{-1})\cdots(PDP^{-1}) = PD(P^{-1}P)D(P^{-1}P)\cdots DP^{-1} = PDD\cdots DP^{-1} = PD^nP^{-1}.$$
Therefore
A^n = P D^n P^{-1}.
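Not in the original notes: a numerical check of this identity with the matrices of Example 5.3.1, assuming NumPy:

    import numpy as np

    A = np.array([[0.0, 0.0, -2.0],
                  [1.0, 2.0, 1.0],
                  [1.0, 0.0, 3.0]])
    P = np.array([[-1.0, 0.0, -2.0],
                  [0.0, 1.0, 1.0],
                  [1.0, 0.0, 1.0]])
    D = np.diag([2.0, 2.0, 1.0])
    lhs = np.linalg.matrix_power(A, 15)
    rhs = P @ np.linalg.matrix_power(D, 15) @ np.linalg.inv(P)
    print(np.allclose(lhs, rhs))   # True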
Example 5.4.2. Let's calculate the 15th power of
$$A = \begin{pmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{pmatrix}.$$
We showed in Example 5.3.1 that A is diagonalizable with
$$P = \begin{pmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
So
$$A^{15} = PD^{15}P^{-1} = \begin{pmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix} \begin{pmatrix} 2^{15} & 0 & 0 \\ 0 & 2^{15} & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 2-2^{15} & 0 & 2-2^{16} \\ 2^{15}-1 & 2^{15} & 2^{15}-1 \\ 2^{15}-1 & 0 & 2^{16}-1 \end{pmatrix}.$$
For further applications, see Section A.2 in the appendix.

Z. G
onye

78

Eigenvalues and Eigenvectors

Z. G
onye

Chapter 6
General Linear Transformations
6.1 Linear Transformations from R^n to R^m
We studied linear transformations in Section 4.2 and used the following definition to determine whether a transformation from R^n to R^m is linear.
Definition 6.1.1. A transformation T : R^n → R^m is linear if the following two properties hold for all vectors u and v in R^n and every scalar c:
(a) T(u + v) = T(u) + T(v), and
(b) T(cu) = cT(u).
Example 6.1.1. Using this definition we can show, for example, that
1. T : R^2 → R^2 defined by T(x, y) = (2x, x − y) is a linear transformation;
2. T : R^2 → R^2 defined by T(x, y) = (x^2, y) is not linear.

6.2 General Linear Transformations


Definition 6.2.1. Let V and W be two vector spaces, not necessarily Euclidean spaces. A function f which maps every vector from V to W is called a map or transformation from V to W. If V = W, then it can also be called an operator.
Definition 6.2.2. Let V and W be two vector spaces. A transformation T : V → W is called a linear transformation if the following two properties hold for all vectors u and v in V and every scalar c:
(a) T(u + v) = T(u) + T(v), and
(b) T(cu) = cT(u).


Example 6.2.1. Let V be the vector space of all real-valued functions that are differentiable, and let W be the vector space of all real-valued functions. Define D : V → W by
D(f) = f′,
where f′ is the derivative of f. We will show that this transformation is a linear transformation.
(a) Checking addition. Take two functions f and g from V; then
D(f + g) = (f + g)′ = f′ + g′ = D(f) + D(g).
(b) Checking scalar multiplication. Take a function f from V and let c be a scalar. Then
D(cf) = (cf)′ = cf′ = cD(f).
Both of the properties of Definition 6.2.2 are satisfied, so D is a linear transformation. This transformation is called the differential operator.
Example 6.2.2. Let V be the vector space of all real-valued functions that are integrable over the interval [a, b]. Let W = R. Define I : V → W by
$$I(f) = \int_a^b f(x)\,dx.$$
Using calculus we can show that this transformation is linear:
(a) Checking addition. Take two functions f and g from V; then
$$I(f + g) = \int_a^b \big(f(x) + g(x)\big)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx = I(f) + I(g).$$
(b) Checking scalar multiplication. Take a function f from V and let c be a scalar. Then
$$I(cf) = \int_a^b c f(x)\,dx = c \int_a^b f(x)\,dx = cI(f).$$
Both of the properties are satisfied, so I is a linear transformation.

Example 6.2.3. Consider the transformation T : P_2(t) → P_3(t) defined by
T(p(t)) = t·p(t) + 1.
Since p(t) is from P_2(t), the vector space of polynomials of degree 2 or less, we can write p(t) = at^2 + bt + c. Let's see if this transformation is linear.
(a) Checking addition. Take two polynomials from P_2(t), say p(t) = at^2 + bt + c and q(t) = αt^2 + βt + γ. Then
$$T\big(p(t) + q(t)\big) = T\big((a+\alpha)t^2 + (b+\beta)t + (c+\gamma)\big) = (a+\alpha)t^3 + (b+\beta)t^2 + (c+\gamma)t + 1. \tag{6.2.1}$$
But
$$T\big(p(t)\big) + T\big(q(t)\big) = \big(t(at^2+bt+c)+1\big) + \big(t(\alpha t^2+\beta t+\gamma)+1\big) = (a+\alpha)t^3 + (b+\beta)t^2 + (c+\gamma)t + 2. \tag{6.2.2}$$
Since (6.2.1) and (6.2.2) are not equal, the first property of Definition 6.2.2 fails. Therefore this is not a linear transformation. (As practice, show that the other property, T(cp(t)) = cT(p(t)), also fails.)
Example 6.2.4. Let P_4(x) be the vector space of all real polynomials of degree 4 or less. Let T : P_4(x) → P_4(x) be given by
$$T = \frac{d^2}{dx^2} + 3\frac{d}{dx}.$$
First, let's just see how to calculate the image of a polynomial, say the image of 3x^4 − 5x^2 − 7x + 10. That is, we have to find T(3x^4 − 5x^2 − 7x + 10):
$$T(3x^4 - 5x^2 - 7x + 10) = (3x^4 - 5x^2 - 7x + 10)'' + 3(3x^4 - 5x^2 - 7x + 10)' = 36x^2 - 10 + 3(12x^3 - 10x - 7) = 36x^3 + 36x^2 - 30x - 31.$$
Is this transformation linear? We have to check the two properties given in Definition 6.2.2.
(a) Checking addition. Let p(x) and q(x) be two polynomials from P_4(x).
$$T\big(p(x) + q(x)\big) = \frac{d^2}{dx^2}\big(p(x)+q(x)\big) + 3\frac{d}{dx}\big(p(x)+q(x)\big) = \frac{d^2}{dx^2}p(x) + 3\frac{d}{dx}p(x) + \frac{d^2}{dx^2}q(x) + 3\frac{d}{dx}q(x) = T(p(x)) + T(q(x)).$$
(b) Checking scalar multiplication. Let p(x) be a polynomial from P_4(x), and let k be any constant.
$$T\big(kp(x)\big) = \frac{d^2}{dx^2}\big(kp(x)\big) + 3\frac{d}{dx}\big(kp(x)\big) = k\frac{d^2}{dx^2}p(x) + 3k\frac{d}{dx}p(x) = kT(p(x)).$$
This shows that T is a linear transformation.

6.3 Matrix Representation of Linear Transformations


The matrix representation of a linear transformation from R^n to R^m is the standard matrix of the transformation. But how can we find a matrix representation for a general linear transformation?
Example 6.3.1. We will use the transformation given in Example 6.2.4: let T : P_4(x) → P_4(x) be given by
$$T = \frac{d^2}{dx^2} + 3\frac{d}{dx}.$$
The standard basis for P_4(x) is {x^4, x^3, x^2, x, 1}. First we calculate the image of each of these basis vectors under the transformation T:
T(x^4) = (x^4)″ + 3(x^4)′ = 12x^2 + 12x^3
T(x^3) = (x^3)″ + 3(x^3)′ = 6x + 9x^2
T(x^2) = (x^2)″ + 3(x^2)′ = 2 + 6x
T(x) = (x)″ + 3(x)′ = 3
T(1) = (1)″ + 3(1)′ = 0.
Now we have to decode these polynomials as follows:
$$ax^4 + bx^3 + cx^2 + dx + e \;\text{ is represented by }\; \begin{pmatrix} a \\ b \\ c \\ d \\ e \end{pmatrix}.$$
Using this, decode T(x^4), T(x^3), T(x^2), T(x), T(1); the vectors we get are the columns of the standard matrix:
$$A = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 12 & 0 & 0 & 0 & 0 \\ 12 & 9 & 0 & 0 & 0 \\ 0 & 6 & 6 & 0 & 0 \\ 0 & 0 & 2 & 3 & 0 \end{pmatrix}.$$
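As an aside (not in the notes), this bookkeeping can be automated with SymPy; a sketch that rebuilds A from the images of the basis vectors:

    from sympy import symbols, diff, Poly, Matrix

    x = symbols('x')

    def T(p):
        return diff(p, x, 2) + 3*diff(p, x)

    # coordinates of each image w.r.t. the basis {x^4, x^3, x^2, x, 1}
    cols = [[Poly(T(b), x).coeff_monomial(x**k) for k in (4, 3, 2, 1, 0)]
            for b in [x**4, x**3, x**2, x, 1]]
    A = Matrix(cols).T   # transpose so the images become the columns
    print(A)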

6.4 Kernel and Range of Linear Transformations


Example 6.4.1. We continue working with the same example, the transformation T : P_4(x) → P_4(x) given by
$$T = \frac{d^2}{dx^2} + 3\frac{d}{dx}.$$
Definition 6.4.1. The kernel of a transformation T from V to W is the set of all vectors from V which are mapped to the zero vector of W, that is,
Ker(T) = {v ∈ V : T(v) = 0}.
For transformations from R^n to R^m, the kernel of a linear transformation is the null space of its standard matrix. For general linear transformations we can use the matrix representation, but we will have to decode. So first we find the null space of the standard matrix A. The reduced row-echelon form of A is
$$\operatorname{rref}(A) = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix},$$
so its null space is {(0, 0, 0, 0, c) : c ∈ R}, and a basis for this null space is (0, 0, 0, 0, 1). Using the decoding again, the vector (0, 0, 0, 0, c) corresponds to 0·x^4 + 0·x^3 + 0·x^2 + 0·x + c·1 = c, which is the constant polynomial c. So
Ker(T) = {all constant polynomials}.
The nullity of T is the dimension of the kernel of T, therefore nullity(T) = 1.
Definition 6.4.2. The range of a transformation T : V → W is the set of all vectors w ∈ W for which there is a vector v ∈ V such that T(v) = w.
For linear transformations on Euclidean spaces, the range is equal to the column space of the standard matrix of the transformation. To find the range of a general linear transformation T, we can use its matrix representation.
So in our example, let's find the column space of the matrix representation. The first 4 columns of A form a basis for the column space, so
$$\text{column space} = \left\{ a\begin{pmatrix} 0 \\ 12 \\ 12 \\ 0 \\ 0 \end{pmatrix} + b\begin{pmatrix} 0 \\ 0 \\ 9 \\ 6 \\ 0 \end{pmatrix} + c\begin{pmatrix} 0 \\ 0 \\ 0 \\ 6 \\ 2 \end{pmatrix} + d\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 3 \end{pmatrix} : a, b, c, d \in \mathbb{R} \right\},$$
and these four columns form a basis for the column space.
Remark 6.4.1. We could choose a simpler basis:
$$\begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 3 \\ 2 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 0 \\ 3 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}.$$
To get the range of T you have to decode the column space of A:
Range(T) = {a(12x^3 + 12x^2) + b(9x^2 + 6x) + c(6x + 2) + d·3 : a, b, c, d ∈ R}.
A basis for the range is
{12x^3 + 12x^2, 9x^2 + 6x, 6x + 2, 3}.

Appendix A
Supplementary Material
A.1 Cayley-Hamilton Theorem
Theorem A.1.1. Every square matrix is a root of its characteristic polynomial.
Example A.1.1. Let q(t) = (t − 4)^4 − t^2 + 8t. Evaluate q(C), where
$$C = \begin{pmatrix} 5 & 1 \\ -1 & 3 \end{pmatrix}.$$
The characteristic polynomial of the matrix is λ^2 − 8λ + 16. The Cayley-Hamilton Theorem says that the matrix C is a root of this polynomial, which means that if we plug C into it, we get zero: C^2 − 8C + 16I = 0. So we try to rearrange the given polynomial q(t):
$$q(t) = (t-4)^4 - t^2 + 8t = (t^2 - 8t + 16)^2 - (t^2 - 8t + 16) + 16.$$
Using the Cayley-Hamilton theorem:
$$q(C) = (C^2 - 8C + 16I)^2 - (C^2 - 8C + 16I) + 16I = 16I = \begin{pmatrix} 16 & 0 \\ 0 & 16 \end{pmatrix}.$$
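A quick numerical aside (not in the original notes), assuming NumPy:

    import numpy as np

    C = np.array([[5.0, 1.0],
                  [-1.0, 3.0]])
    I = np.eye(2)
    print(C @ C - 8*C + 16*I)                        # the zero matrix (Cayley-Hamilton)
    Q = np.linalg.matrix_power(C - 4*I, 4) - C @ C + 8*C
    print(Q)                                         # 16 I, as computed above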

A.2 Exponential of Matrices


In differential equations you will use the exponential of a matrix, so here we work through a few questions about it. The exponential of a square matrix A is defined by the Taylor series
$$e^A = I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \dots + \frac{A^n}{n!} + \dots$$

As you will learn, the solution of the vector differential equation
$$\frac{d}{dt}\vec{y} = A\vec{y}$$
with initial condition y(0) = y_0 is
y(t) = e^{At} y_0,
where
$$e^{At} = I + At + \frac{(At)^2}{2!} + \frac{(At)^3}{3!} + \dots + \frac{(At)^n}{n!} + \dots$$
(a) If A is a diagonal matrix, then it is very easy to calculate e^A. For example, let
$$A = \begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix}.$$
Then
$$e^A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix} + \begin{pmatrix} \frac{3^2}{2!} & 0 \\ 0 & \frac{(-2)^2}{2!} \end{pmatrix} + \dots = \begin{pmatrix} 1 + 3 + \frac{3^2}{2!} + \frac{3^3}{3!} + \dots & 0 \\ 0 & 1 + (-2) + \frac{(-2)^2}{2!} + \frac{(-2)^3}{3!} + \dots \end{pmatrix} = \begin{pmatrix} e^3 & 0 \\ 0 & e^{-2} \end{pmatrix}.$$
Problem A.2.1. What is e^B, where
$$B = \begin{pmatrix} 7 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 3 \end{pmatrix}?$$
Solution.
$$e^B = \begin{pmatrix} e^7 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & e^3 \end{pmatrix}.$$
Watch the middle term: e^0 = 1.

(b) If the matrix A is diagonalizable (when A has as many linearly independent eigenvectors as its size), then A = PDP^{-1}, A^2 = PD^2P^{-1}, …, A^n = PD^nP^{-1}, …, so
$$e^A = I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \dots = I + PDP^{-1} + \frac{PD^2P^{-1}}{2!} + \frac{PD^3P^{-1}}{3!} + \dots = P\left(I + D + \frac{D^2}{2!} + \frac{D^3}{3!} + \dots\right)P^{-1} = P e^D P^{-1}.$$
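An aside (not part of the notes): SciPy's expm agrees with the P e^D P^{-1} recipe. A sketch assuming NumPy and SciPy, using the matrix of Problem A.2.2 below:

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[5.0, 3.0, 0.0],
                  [0.0, 3.0, 1.0],
                  [0.0, 0.0, 2.0]])
    w, P = np.linalg.eig(A)                          # distinct eigenvalues -> diagonalizable
    eA = P @ np.diag(np.exp(w)) @ np.linalg.inv(P)   # P e^D P^{-1}
    print(np.allclose(eA, expm(A)))                  # True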
Problem A.2.2. Let
$$A = \begin{pmatrix} 5 & 3 & 0 \\ 0 & 3 & 1 \\ 0 & 0 & 2 \end{pmatrix}.$$
Find the eigenvalues of A. For each eigenvalue, find a basis for the corresponding eigenspace. Explain why the matrix A is diagonalizable, find its diagonal form D and the matrix P so that A = PDP^{-1}. Calculate e^A.
Solution. The eigenvalues of A are 5, 3, and 2. A basis for the eigenspace corresponding to λ = 5 is (1, 0, 0); a basis for the eigenspace corresponding to λ = 3 is (−3, 2, 0); and a basis for the eigenspace corresponding to λ = 2 is (1, −1, 1). Since A has three linearly independent eigenvectors, A is diagonalizable. In this case
$$P = \begin{pmatrix} 1 & -3 & 1 \\ 0 & 2 & -1 \\ 0 & 0 & 1 \end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix} 5 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 2 \end{pmatrix}.$$
Note: depending on the basis vectors you chose, you could get a different matrix P.
$$e^A = P e^D P^{-1} = \begin{pmatrix} 1 & -3 & 1 \\ 0 & 2 & -1 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} e^5 & 0 & 0 \\ 0 & e^3 & 0 \\ 0 & 0 & e^2 \end{pmatrix} \cdot \frac{1}{2}\begin{pmatrix} 2 & 3 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{pmatrix} = \begin{pmatrix} e^5 & \frac{3}{2}e^5 - \frac{3}{2}e^3 & \frac{1}{2}e^5 - \frac{3}{2}e^3 + e^2 \\ 0 & e^3 & e^3 - e^2 \\ 0 & 0 & e^2 \end{pmatrix}.$$
(c) Let
$$C = \begin{pmatrix} -2 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & -4 & -4 \end{pmatrix}.$$
Find the eigenvalue(s) of C. For each eigenvalue find a basis for the corresponding eigenspace. Explain why the matrix C is not diagonalizable.
In the case when the matrix C has only one eigenvalue λ, there is a trick to calculate its exponential e^C. Instead of C we write C − λI + λI, and we use the property of exponents (e^{a+b} = e^a e^b):
e^C = e^{C−λI+λI} = e^{C−λI} e^{λI}.
In the last factor, λI is diagonal, so we can easily calculate e^{λI} as described in part (a). The matrix C − λI is nilpotent (a consequence of the Cayley-Hamilton Theorem), so the infinite Taylor series truncates to a finite sum.
For the matrix C given above, show that (C − λI)^2 ≠ 0 but (C − λI)^3 = 0. So the Taylor series of e^{C−λI} truncates to the first three terms:
$$e^{C-\lambda I} = I + (C - \lambda I) + \frac{1}{2!}(C - \lambda I)^2.$$

Problem A.2.3. Calculate e^C (by calculating e^{λI} and e^{C−λI} and then using e^C = e^{C−λI} e^{λI}).
Solution. The eigenvalue of C is λ = −2, and a basis for its eigenspace is (1, 0, 0). So C is not diagonalizable, because there is only one basis vector for the eigenspace(s).
$$e^C = e^{C+2I} e^{-2I} = \left( I + (C + 2I) + \frac{1}{2!}(C + 2I)^2 \right) e^{-2},$$
with
$$C + 2I = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & -4 & -2 \end{pmatrix}, \qquad (C + 2I)^2 = \begin{pmatrix} 0 & 2 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},$$
so
$$e^C = e^{-2}\begin{pmatrix} 1 & 2 & 1/2 \\ 0 & 3 & 1 \\ 0 & -4 & -1 \end{pmatrix}.$$

Appendix B
The Laplace Expansion For Determinant
B.1 First Minors and Cofactors; Row and Column Expansions
To each element a_ij in the determinant |A| = |a_ij|_n there is associated a subdeterminant of order (n − 1), which is obtained from |A| by deleting row i and column j. This subdeterminant is known as a first minor of |A| and is denoted by M_ij = D(A(i|j)). The first cofactor (CA)_ij is then defined as a signed first minor:
$$(CA)_{ij} = (-1)^{i+j} M_{ij} = (-1)^{i+j} D(A(i|j)). \tag{B.1.1}$$
It is customary to omit the adjective "first" and to refer simply to minors and cofactors, and it is convenient to regard M_ij and (CA)_ij as quantities which belong to a_ij, in order to give meaning to the phrase "an element and its cofactor".
The expansion of |A| by elements from row i and their cofactors is
$$|A| = D(A) = \sum_{j=1}^{n} a_{ij}(CA)_{ij} = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} D(A(i|j)), \qquad 1 \le i \le n. \tag{B.1.2}$$

The expansion of |A| by elements from column j and their cofactors is
$$|A| = D(A) = \sum_{i=1}^{n} a_{ij}(CA)_{ij} = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} D(A(i|j)), \qquad 1 \le j \le n. \tag{B.1.3}$$

Since (CA)_ij belongs to, but is independent of, a_ij, another way to define (CA)_ij is
$$(CA)_{ij} = \frac{\partial |A|}{\partial a_{ij}}. \tag{B.1.4}$$


B.2 Alien Cofactors; The Sum Formula


The theorem on alien cofactors states that
$$\sum_{j=1}^{n} a_{ij}(CA)_{kj} = 0, \qquad 1 \le i \le n, \quad 1 \le k \le n, \quad k \ne i. \tag{B.2.1}$$
The elements come from row i of |A|, but the cofactors belong to the elements in row k and are said to be alien to the elements. The identity is merely an expansion, by elements from row k, of a determinant in which row k equals row i, and which is therefore zero.
The identity can be combined with the expansion formula for |A| with the aid of the Kronecker delta function δ_ik to form a single identity which may be called the sum formula for elements and cofactors:
$$\sum_{j=1}^{n} a_{ij}(CA)_{kj} = \delta_{ik}|A|, \qquad 1 \le i \le n, \ 1 \le k \le n. \tag{B.2.2}$$

It follows that
$$\sum_{j=1}^{n} (CA)_{ij} C_j = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ D(A) \\ 0 \\ \vdots \end{pmatrix}, \qquad 1 \le i \le n, \tag{B.2.3}$$
where C_j is column j of the matrix A, the element D(A) is in row i of the column vector, and all the other elements are zero. If D(A) = 0, then
$$\sum_{j=1}^{n} (CA)_{ij} C_j = 0, \qquad 1 \le i \le n, \tag{B.2.4}$$
that is, the columns are linearly dependent. Conversely, if the columns are linearly dependent, then D(A) = 0.
B.3 Cramer's Formula

The set of equations
$$\sum_{j=1}^{n} a_{ij} x_j = b_i, \qquad 1 \le i \le n, \tag{B.3.1}$$
can be expressed in column vector notation as follows:
$$\sum_{j=1}^{n} C_j x_j = B, \tag{B.3.2}$$
where
$$B = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_n \end{pmatrix}.$$
So
$$\det\big(C_1 \cdots C_{j-1}\ B\ C_{j+1} \cdots C_n\big) = \det\Big(C_1 \cdots C_{j-1}\ \textstyle\sum_{k=1}^{n} C_k x_k\ C_{j+1} \cdots C_n\Big) = \sum_{k=1}^{n} x_k \det\big(C_1 \cdots C_{j-1}\ C_k\ C_{j+1} \cdots C_n\big).$$
Every term with k ≠ j contains a repeated column, so its determinant is zero; only the k = j term survives:
$$\det\big(C_1 \cdots C_{j-1}\ B\ C_{j+1} \cdots C_n\big) = x_j \det\big(C_1 \cdots C_{j-1}\ C_j\ C_{j+1} \cdots C_n\big) = x_j |A|. \tag{B.3.3}$$
If |A| = |a_ij|_n ≠ 0, then the unique solution of the equations can also be expressed in column vector notation:
$$x_j = \frac{1}{|A|}\det\big(C_1 \cdots C_{j-1}\ B\ C_{j+1} \cdots C_n\big) = \frac{1}{|A|}\sum_{i=1}^{n} b_i (CA)_{ij}. \tag{B.3.4}$$
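As an aside (not part of the original text), formula (B.3.4) translates directly into code; a sketch assuming NumPy, with a small made-up system as the test case:

    import numpy as np

    # hypothetical test system, chosen only for illustration
    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])
    b = np.array([3.0, 5.0])
    x = np.empty(2)
    for j in range(2):
        Aj = A.copy()
        Aj[:, j] = b                         # replace column j of A by b
        x[j] = np.linalg.det(Aj) / np.linalg.det(A)
    print(x, np.linalg.solve(A, b))          # both give (0.8, 1.4)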

Appendix C

C.1 Rules of Matrix Arithmetic


Theorem C.1.1 (Properties of Matrix Arithmetic). Assuming that the sizes of the matrices are such that the indicated operations can be performed, the following rules of matrix arithmetic are valid. Here A, B and C denote matrices, 0 is a zero matrix, a and b are scalars, and r and s are positive integers.
1. A + B = B + A (commutative law for addition)
2. A + (B + C) = (A + B) + C (associative law for addition)
3. a(B + C) = aB + aC = (B + C)a
4. a(B − C) = aB − aC = (B − C)a
5. (a + b)C = aC + bC = C(a + b)
6. (a − b)C = aC − bC = C(a − b)
7. a(bC) = (ab)C
8. A + 0 = 0 + A = A
9. A − A = 0
10. 0 − A = −A
11. A(BC) = (AB)C (associative law for multiplication)
12. A(B + C) = AB + AC (left distributive law)
13. (B + C)A = BA + CA (right distributive law)

14. A(B − C) = AB − AC
15. (B − C)A = BA − CA
16. a(BC) = (aB)C = B(aC)
17. A0 = 0A = 0
18. AI = IA = A
19. A^r A^s = A^{r+s}
20. (A^r)^s = A^{rs}
21. A^0 = I
22. (A + B)^T = A^T + B^T
23. (A − B)^T = A^T − B^T
24.* (AB)^T = B^T A^T
25. (A^T)^T = A
26. (aB)^T = aB^T
27.* tr(A + B) = tr(A) + tr(B)
28.* tr(A − B) = tr(A) − tr(B)
29.* tr(aB) = a tr(B)
30.* tr(AB) = tr(BA)
31. (A^{-1})^{-1} = A
32. (A^n)^{-1} = (A^{-1})^n
33. (aA)^{-1} = (1/a) A^{-1}, for any nonzero scalar a
34.* (AB)^{-1} = B^{-1} A^{-1}
35. (A^T)^{-1} = (A^{-1})^T
36.* det(I) = 1
37.* det(AB) = det(A) det(B)
38.* det(aA) = a^n det(A), where n is the size of the matrix A.
39.* det(A^{-1}) = 1/det(A).
40.* A adj(A) = det(A) I, and so A^{-1} = adj(A)/det(A).


C.2 Equivalent Statements


Theorem C.2.1. If A is an n × n matrix, and if T_A : R^n → R^n is multiplication by A, then the following are equivalent.
1. A is invertible.
2. Ax = 0 has only the trivial solution.
3. The reduced row-echelon form of A is I_n.
4. Ax = b is consistent for every n × 1 matrix b.
5. Ax = b has exactly one solution for every n × 1 matrix b.
6. det(A) ≠ 0.
7. The range of T_A is R^n.
8. T_A is one-to-one.
9. The column vectors of A are linearly independent.
10. The row vectors of A are linearly independent.
11. The column vectors of A span R^n.
12. The row vectors of A span R^n.
13. The column vectors of A form a basis for R^n.
14. The row vectors of A form a basis for R^n.
15. A has rank n.
16. A has nullity 0.
17. λ = 0 is not an eigenvalue of A.

