An Introduction
David A. Evans
What is a matrix?

A matrix is a rectangular array of numbers. All positions in the array contain a number; there are no gaps. Each individual number is called an element of the matrix. The dimension of a matrix is specified by the number of rows and the number of columns; conventionally, the number of rows is stated first. Thus the following matrices are respectively 2×2 and 3×4:

\[
A = \begin{bmatrix} 2.3 & 3.1 \\ 0.7 & 1.2 \end{bmatrix}; \quad
B = \begin{bmatrix} 3 & 0 & 2 & 1 \\ 4 & 2 & 5 & 1 \\ 4 & 2 & 1 & 0 \end{bmatrix}
\]

A matrix is a complete entity and is referred to by a single symbol just as you have been used to x, y, a, etc. standing for some number. Printed convention is to use bold letters to stand for matrices. In handwritten work the underline is sometimes used thus, $\underline{A}$. We often need to refer to a particular element (e.g., 2nd row, 3rd column of B) or the general element (ith row, jth column of A). These have the symbols $b_{23}$ and $a_{ij}$ (it is conventional to use upper-case letters for the matrix and the corresponding lower-case letter for the elements). These symbols stand for ordinary numbers. We can write the equation: $b_{23} = 5$. Note the order of the subscripts is row-column.

Addition of matrices

There would be little point in introducing these entities without defining how they combine and interact. While it appears trivial, we must first agree on the meaning of the statement that two matrices are equal. If A and B are two matrices, then the statement A = B means that the matrices have the same dimensions and that the corresponding elements of each are equal. That is, $a_{ij} = b_{ij}$ for all possible values of i and j.

Two matrices can be added to give a third matrix whose elements are the sums of the elements in the corresponding row/column positions of the matrices being added. Thus if

\[
A = \begin{bmatrix} 1 & 0 & 1 \\ 2 & 3 & 1 \end{bmatrix}; \quad
B = \begin{bmatrix} 2 & 1 & 4 \\ 0 & 1 & 3 \end{bmatrix}
\]

The matrix A + B is given by

\[
A + B = \begin{bmatrix} 1+2 & 0+1 & 1+4 \\ 2+0 & 3+1 & 1+3 \end{bmatrix}
= \begin{bmatrix} 3 & 1 & 5 \\ 2 & 4 & 4 \end{bmatrix}
\]

If we call this matrix C, we can write the matrix statement or equation

\[
C = A + B
\]

where now the '+' and '=' signs stand for matrix addition and matrix equality. The subtraction of two matrices follows in an obvious way from the definition of addition.

Although any two numbers can always be added together, the same is not true for matrices. A moment's thought will show you that they have to have the same dimensions in order for addition to work. Thus, if A is a 2×2 matrix and B is 3×2, the symbol A + B is meaningless. The third row of B does not have any corresponding elements in A to be added. Such matrices are not conformable for addition.

Scalar multiple of a matrix

Adding a matrix to itself, A + A, will produce a matrix whose elements are twice the corresponding elements in A. It is not unreasonable to call this matrix 2A. In the same way, A + A + A will give a matrix with elements three times those of A and should be written 3A. By a simple extension of this idea, the symbol kA stands for a matrix whose elements are each k times the corresponding ones of A, where k is any number. We see that this now defines the multiplication of a matrix by a number. This is sometimes called scalar multiplication since ordinary numbers are referred to as scalars to distinguish them from matrices and other complicated entities (such as vectors).

Matrix multiplication

The definition of matrix multiplication is usually considered by students to be the trickiest and least intuitive of matrix properties. First of all, the product of two matrices AB is not, NOT, NOT! given by multiplying the corresponding elements of A and B!
Matrices - 1 of 9 - January, 98
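The element-by-element rules for addition and for scalar multiples described earlier can be sketched in a few lines of Python. This is only an illustration under my own assumptions: a matrix is stored as a list of rows, and the function names `dimensions`, `add` and `scalar_multiple` are invented for this sketch.

```python
def dimensions(m):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return len(m), len(m[0])


def add(a, b):
    """Element-wise sum; only defined when the dimensions match."""
    if dimensions(a) != dimensions(b):
        raise ValueError("matrices are not conformable for addition")
    return [[x + y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def scalar_multiple(k, a):
    """Multiply every element of the matrix a by the number k."""
    return [[k * x for x in row] for row in a]


# The two 2x3 matrices used in the addition example of the text.
A = [[1, 0, 1], [2, 3, 1]]
B = [[2, 1, 4], [0, 1, 3]]

C = add(A, B)                    # the matrix called C in the text
twice_A = scalar_multiple(2, A)  # same elements as add(A, A)
```

Calling `add` on a 2×2 and a 3×2 matrix raises an error, which is the code equivalent of saying that the symbol A + B is meaningless for those dimensions.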
Let us consider the following simultaneous equations:

\[
\begin{aligned}
3x + 2y &= 6 \\
x + 5y + 2z &= 13 \\
2x + 2z &= 4
\end{aligned}
\]

The solution depends on three items: the array of coefficients of x, y and z, the right-hand sides of the equations, and the symbols x, y, z that identify the unknowns. Regroup this information thus:

\[
\begin{bmatrix} 3 & 2 & 0 \\ 1 & 5 & 2 \\ 2 & 0 & 2 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
= \begin{bmatrix} 6 \\ 13 \\ 4 \end{bmatrix}
\]

This can be expressed in terms of symbols representing matrices thus

\[
Ax = b
\]

if we agree to interpret the product Ax such that the element in the first row and first column of the product is obtained from the elements in the first row of A and the elements in the first column of x in the following way: start at the left of the row and the top of the column and take the product of these two elements. Move one element to the right in the row and one element down in the column, and multiply these two elements. Continue moving right along the row and down the column, forming products of the elements. Add up all the products; this gives the value of the element in the product. For the first row this gives 3x + 2y + 0z, which is the left-hand side of the first equation. The element in the second row, first column of the product is obtained from the elements in the second row of A and the first column of x. As an example the following product is worked out for you:

… the same dimension. Notice also that you can't always multiply matrices. Write down a couple of 2×3 matrices and try to find their product. You can't do it because there are 3 elements in each row of the first and only 2 elements in each column of the second. You can't match up the appropriate elements to find the products required. In order for the product of two matrices to exist, the first factor must have the same number of columns as the second factor has rows. Matrices which can be multiplied are called conformable for multiplication. A convenient way of checking for conformity is to write the dimensions underneath the matrices, thus:

\[
\underset{m \times n}{A} \; \underset{n \times p}{B} = \underset{m \times p}{C}
\]

Note that the inner values must be the same and that the dimensions of the product are given by the outer values.

Notice another odd thing: if the two matrices above are multiplied in the opposite order, a different matrix results. It doesn't even have the same dimensions. As a further example, if A is a 2×3 matrix and B is a 3×4 matrix, you can form the product AB, but in the order BA the matrices are not conformable: B has 4 columns and A has 2 rows.

Thus the order of the factors is crucially important in matrix multiplication. Unlike the products of ordinary numbers (scalars), matrix multiplication does not commute.
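The row-into-column rule and the conformability check can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: matrices are assumed to be stored as lists of rows, the function name `multiply` is my own, and the column x and the matrices P and Q are hypothetical values chosen only to exercise the rule.

```python
def multiply(a, b):
    """Row-into-column product of two matrices stored as lists of rows."""
    rows_a, cols_a = len(a), len(a[0])
    rows_b, cols_b = len(b), len(b[0])
    if cols_a != rows_b:
        # The inner dimensions must agree: (m x n) times (n x p).
        raise ValueError("matrices are not conformable for multiplication")
    return [
        [sum(a[i][k] * b[k][j] for k in range(cols_a)) for j in range(cols_b)]
        for i in range(rows_a)
    ]


# Coefficient matrix of the simultaneous equations in the text.
A = [[3, 2, 0], [1, 5, 2], [2, 0, 2]]
# Hypothetical trial values for x, y, z (not the solution), as a 3x1 column.
x = [[1], [1], [1]]
Ax = multiply(A, x)   # 3x1 column holding the three left-hand sides

# Hypothetical 2x3 and 3x2 matrices, to show that the order matters.
P = [[1, 2, 3], [4, 5, 6]]
Q = [[1, 0], [0, 1], [1, 1]]
PQ = multiply(P, Q)   # a 2x2 matrix
QP = multiply(Q, P)   # a 3x3 matrix, so PQ and QP cannot be equal
```

Trying `multiply(P, P)` raises the error, since a 2×3 matrix has 3 columns but the second factor has only 2 rows.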
Transpose of a matrix

A matrix can be transposed to create another matrix. Given a matrix A, its transpose is written $A^T$ (or sometimes $A'$). The first row of this matrix contains, in order, the elements in the first column of A, and likewise for all the other rows. Thus the number of rows (columns) in $A^T$ is equal to the number of columns (rows) in A. An example follows:

\[
A = \begin{bmatrix} 1 & -2 & 5 \\ 0 & -1 & -3 \end{bmatrix}; \quad
A^T = \begin{bmatrix} 1 & 0 \\ -2 & -1 \\ 5 & -3 \end{bmatrix}
\]

Notice that you can always form the products $A^T A$ and $A A^T$ for any matrix. The transposing guarantees that the numbers of rows and columns match up.

Suppose we have two matrices, A and B, which can be multiplied together to give a matrix, C = AB. For the sake of example let us assume that A is 2×3 and B is 3×4; then C is 2×4. Consider now products of the transposes, $A^T$ and $B^T$. The product $A^T B^T$ does not exist since $A^T$ is 3×2 and $B^T$ is 4×3. However, the product $B^T A^T$ does exist; it is a 4×2 matrix which is in fact equal to $C^T$. Try this calculation for a couple of matrices.

We have the general result:

\[
(AB)^T = B^T A^T.
\]

Square matrices

We often encounter matrices with equal numbers of rows and columns (2×2, 3×3, etc.); these are quite reasonably called square matrices. Notice that all square matrices of a given dimension are conformable for multiplication, and the products are square matrices with that same dimension.

Identity or Unit matrices

Within the realm of square matrices, there are some special ones which have ones (1) on the leading diagonal and zeros (0) everywhere else. The product of one of these matrices with any conformable matrix, A, (not necessarily square) gives the same matrix, A. For example,

\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 2 & -3 \\ 0 & -1 \\ -4 & 7 \end{bmatrix}
= \begin{bmatrix} 2 & -3 \\ 0 & -1 \\ -4 & 7 \end{bmatrix}
\]

These matrices play a role very similar to unity (1) in ordinary arithmetic. They are called identity or unit matrices and are written $I_n$, where n denotes the dimension; the identity matrix in the example above is $I_3$. Where there is little chance of confusion, the subscript is often omitted. Take the non-identity matrix in the above example and multiply it by $I_2$, with the identity matrix as the right-hand or second factor. You still get the same result. If A is a square matrix, then IA = AI, i.e. an identity matrix commutes with any conformable square matrix.

Matrix inverse

There is no such thing as matrix division. The symbol A/B cannot be given a meaning. The reason for this is easily demonstrated. Were division defined, the matrix equation C = A/B would imply that C is a matrix that satisfies the relation BC = A. This is how division is defined for ordinary numbers. With matrices there is a problem. Let

\[
A = \begin{bmatrix} 4 & -7 \\ -8 & 14 \end{bmatrix}; \quad
B = \begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix}.
\]

It is easy to verify that
\[
\begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 3 \end{bmatrix}
= \begin{bmatrix} 4 & -7 \\ -8 & 14 \end{bmatrix}
\]

which would imply instead that

\[
C = \begin{bmatrix} 2 & -1 \\ -1 & 3 \end{bmatrix}.
\]

Thus a unique value cannot be assigned to the operation "A/B" for these matrices. In fact, in this case there are an infinite number of matrices C that will satisfy the relation BC = A. Sometimes, however, C is unique. For the matrices

\[
A = \begin{bmatrix} 6 & -8 \\ 4 & -2 \end{bmatrix}; \quad
B = \begin{bmatrix} 3 & -2 \\ 1 & 0 \end{bmatrix}
\]

there is indeed only one matrix:

\[
C = \begin{bmatrix} 4 & -2 \\ 3 & 1 \end{bmatrix}
\]

which satisfies BC = A.

We pose the question: given a square matrix A, is there a square matrix B such that the product AB is an identity matrix, i.e. AB = I? B must, of course, have the same dimension as A.

If

\[
A = \begin{bmatrix} 7 & 5 \\ 4 & 3 \end{bmatrix}
\]

we find that:

\[
\begin{bmatrix} 7 & 5 \\ 4 & 3 \end{bmatrix}
\begin{bmatrix} 3 & -5 \\ -4 & 7 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

and I state (without proof!) that this is a unique result, like the second example above; no other matrix exists whose product with A is the (2×2) identity matrix. This matrix is called the inverse of A and is written $A^{-1}$. The result above is written in matrix notation as $A A^{-1} = I$. As an exercise, form the product $A^{-1} A$; you will find that this is also the identity matrix. A matrix commutes with its inverse. It is not difficult to show that the inverse of the product AB, i.e. $(AB)^{-1}$, is equal to $B^{-1} A^{-1}$.

But not every square matrix has an inverse. For example, the matrix

\[
\begin{bmatrix} 2 & -4 \\ -3 & 6 \end{bmatrix}
\]

does not have an inverse (again, I state this without proof!). There is no matrix by which you can multiply it to give the identity matrix. Matrices which have no inverse are said to be singular.

The inverse of a matrix is analogous to the reciprocal in ordinary numbers. If x is a number, then its reciprocal $x^{-1}$ is the number such that $x^{-1} x = 1$. One number, 0, has no reciprocal; there is no number by which you can multiply zero to obtain 1. Singular matrices are the matrix analogues of zero.

I have not indicated how, given the numerical values of the elements of a matrix, to calculate the elements of the inverse. The discussion of the various techniques for doing this forms an important topic in numerical methods. The calculations become very complex for even a 3×3 matrix. One of the powerful features of using matrix notation is that tremendously complicated arithmetic is encapsulated in the little superscript, $-1$! For example, we have seen that a set of simultaneous equations can be written in matrix notation as Ax = b. The equations can be solved symbolically, thus:

\[
\begin{aligned}
Ax &= b \\
\text{multiply each side on the left by } A^{-1}: \quad A^{-1} A x &= A^{-1} b, \\
\text{but } A^{-1} A = I, \text{ therefore} \quad I x &= A^{-1} b, \\
x &= A^{-1} b.
\end{aligned}
\]

This shows how the inverse of a matrix is related to the solution of a set of simultaneous equations. Note the simplicity of the relation expressed in this form even though the calculation of the inverse and a matrix product involve many arithmetical operations on the elements of the matrices.

Symmetric Matrices

Consider the matrix
\[
\begin{bmatrix}
-1 & 4 & 0 & 5 & -7 \\
4 & -3 & 2 & -1 & 0 \\
0 & 2 & 1 & 4 & -2 \\
5 & -1 & 4 & 3 & 1 \\
-7 & 0 & -2 & 1 & -5
\end{bmatrix}
\]

Notice the symmetrical arrangement of the numbers. Draw an imaginary line down the leading diagonal (from upper left to lower right). This divides the matrix into two triangles. The numbers in the lower triangle are …

… These are called diagonal matrices. Note that the unit matrix is a diagonal matrix in which the diagonal elements are all unity (1). It is very easy to find the inverse of a diagonal matrix; it is also a diagonal matrix whose non-zero elements are the reciprocals of the corresponding elements. Thus the inverse of the above matrix is

\[
\begin{bmatrix} \tfrac{1}{2} & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & \tfrac{1}{4} \end{bmatrix}
\]
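The symmetry property and the reciprocal rule for diagonal matrices can both be checked with a short Python sketch. The function names `is_symmetric` and `diagonal_inverse` are my own, and matrices are assumed to be stored as lists of rows. The diagonal matrix D below is an assumption: it is inferred from the inverse quoted above, since the matrix being inverted does not appear in this copy of the notes.

```python
from fractions import Fraction


def is_symmetric(m):
    """True when element (i, j) equals element (j, i) everywhere."""
    n = len(m)
    return all(len(row) == n for row in m) and all(
        m[i][j] == m[j][i] for i in range(n) for j in range(i)
    )


def diagonal_inverse(d):
    """Inverse of a diagonal matrix: reciprocals on the leading diagonal."""
    n = len(d)
    for i in range(n):
        for j in range(n):
            if i != j and d[i][j] != 0:
                raise ValueError("matrix is not diagonal")
        if d[i][i] == 0:
            raise ValueError("a zero on the diagonal has no reciprocal")
    return [
        [Fraction(1) / d[i][i] if i == j else Fraction(0) for j in range(n)]
        for i in range(n)
    ]


# The 5x5 symmetric matrix from the text.
S = [
    [-1, 4, 0, 5, -7],
    [4, -3, 2, -1, 0],
    [0, 2, 1, 4, -2],
    [5, -1, 4, 3, 1],
    [-7, 0, -2, 1, -5],
]
symmetric = is_symmetric(S)

# Assumed diagonal matrix, inferred from the inverse quoted in the text.
D = [[2, 0, 0], [0, -1, 0], [0, 0, 4]]
D_inv = diagonal_inverse(D)   # diagonal elements 1/2, -1, 1/4
```

Exact fractions are used so that 1/2 and 1/4 appear as such rather than as rounded decimals.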
… do not have a solution, i.e. no pair of values of x and y can satisfy them. The reason is easy to see: the left-hand side of the second equation is just twice the left-hand side of the first. Suppose we choose any pair of values of x and y that satisfy the first equation. These values can never satisfy the second since, on substituting the values of x and y, the left-hand side will always result in the value 8, never 6. The equations are said to be inconsistent. Conversely, the equations:

\[
\begin{aligned}
x - 2y &= 4 \\
2x - 4y &= 8
\end{aligned}
\]

have an infinite number of solutions. The second equation is automatically satisfied for any values of x and y that satisfy the first. We basically have just one equation in two unknowns. There is no additional information provided by the second equation. The problem, of course, is that the equations are just multiples of each other.

Since there is no unique solution, the matrix of coefficients, A:

\[
\begin{bmatrix} 1 & -2 \\ 2 & -4 \end{bmatrix}
\]

is singular. We see that the second row is just twice the first row. Note a similar dependence between the columns. The second column is the negative of twice the first column. It is clear that the problems will always arise whenever one row is any multiple of the other. Let us consider a general 2×2 matrix

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]

If the matrix is singular, one row is just a multiple of the other:

\[
a = kc; \quad b = kd
\]

or

\[
\frac{a}{c} = \frac{b}{d} = k
\]

or

\[
ad = bc
\]

or

\[
ad - bc = 0.
\]

Here we have a test for the singularity of a 2×2 matrix. The expression $ad - bc$ is calculated, i.e. (top left)(bottom right) − (top right)(bottom left). If this is zero the matrix is singular and has no inverse; otherwise it is non-singular and has an inverse. The number we have calculated is called the determinant of the matrix. We use the symbol

\[
\begin{vmatrix} a & b \\ c & d \end{vmatrix}
\]

to stand for the determinant of the matrix. Note that the determinant is a number which is a function of the elements of the matrix. We have, for the example equations above

\[
\begin{vmatrix} 1 & -2 \\ 2 & -4 \end{vmatrix} = 1 \times (-4) - (-2) \times 2 = -4 - (-4) = 0
\]

Thus, as promised, the matrix is singular. In the section dealing with inverse matrices, it was stated that the matrix

\[
\begin{bmatrix} 2 & -4 \\ -3 & 6 \end{bmatrix}
\]

was singular. Checking the determinant we find

\[
\begin{vmatrix} 2 & -4 \\ -3 & 6 \end{vmatrix} = 2 \times 6 - (-4) \times (-3) = 12 - 12 = 0
\]

showing that the matrix is indeed singular.

This can be extended to square matrices with larger dimensions. Any square matrix has a determinant, a number which is a function of the elements. If the determinant is zero, the matrix is singular and vice versa. The expression for the determinant becomes quite complicated and, at this stage, need not concern us greatly. The determinant of a 3×3 matrix is given by

\[
\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & j \end{vmatrix}
= a \begin{vmatrix} e & f \\ h & j \end{vmatrix}
- b \begin{vmatrix} d & f \\ g & j \end{vmatrix}
+ c \begin{vmatrix} d & e \\ g & h \end{vmatrix}
\]

Each of the 2×2 determinants is then expanded using the formula given above. In this way we can obtain explicitly the value of the
determinant in terms of the elements. This is left as an exercise. Notice that each of the 2×2 determinants is obtained from the original matrix by removing the first row and then removing each column in turn. A similar relationship can be written to express a 4×4 determinant in terms of four 3×3 determinants, and this can be continued for the determinants of larger matrices.

Homogeneous equations

Consider the set of equations:

\[
\begin{aligned}
x + 3y &= 0 \\
2x - y &= 0
\end{aligned}
\]

Both right-hand sides are zero; such equations are called homogeneous. The matrix of coefficients is nonsingular (check that the determinant is not zero), so the equations have a unique solution x = 0, y = 0, not a very interesting state of affairs. In fact, this is called a trivial solution.

Consider now the following:

\[
\begin{aligned}
x - 2y &= 0 \\
2x - 4y &= 0
\end{aligned}
\]

where the left-hand sides are the same as the examples given in the last section. There are now an infinite number of solutions in addition to the trivial one. Any pair of values where x is twice y satisfies the set of equations. The set has the solutions in addition to the trivial one because the determinant of the matrix of coefficients is zero. This idea extends to equations in three, four, etc. unknowns.

We have the result that a set of homogeneous equations has only the trivial solution unless the determinant of the matrix of the coefficients is zero. When the determinant is zero, there are an infinite number of solutions where definite relationships exist between the values of the unknowns.

Consider the following system of equations:

\[
\begin{bmatrix} 2 & -3 & 1 \\ 1 & 2 & -3 \\ 3 & -1 & -2 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = 0
\]

The system is homogeneous. It is straightforward (but labor-intensive!) to check that:

\[
\begin{vmatrix} 2 & -3 & 1 \\ 1 & 2 & -3 \\ 3 & -1 & -2 \end{vmatrix} = 0
\]

Therefore there are solutions in addition to the trivial one, x = y = z = 0.

It can be shown that when a determinant is zero, at least one row (or one column) can be expressed as a linear combination of the remaining rows (or columns). That is, in the determinant above, one has that:

\[
\text{row}_3 = a \, \text{row}_1 + b \, \text{row}_2
\]

is true for some value of a and b. In our case it is easy to see that a = 1 and b = 1 so that:

\[
\text{row}_3 = \text{row}_1 + \text{row}_2.
\]

Another way of putting it is that the third equation is merely obtained by adding the first two. Therefore it must be satisfied by any values of x, y, and z that satisfy the first two equations. It does not bring any new information to the system; it is redundant.

It is not the third equation that is at fault. We might equally well write:

\[
\text{row}_1 = \text{row}_3 - \text{row}_2
\]

so that the first row is a linear combination of the second and the third. The important thing is that there is only information from two equations present. We know that two equations in three unknowns have an infinite number of solutions.

We solve two of the equations using the method of transformation to echelon form. We shall arbitrarily choose the second and third.
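The determinants quoted so far can be checked mechanically. The following Python sketch expands along the first row, in the way described for the 3×3 case, and applies the same rule recursively to larger square matrices. The function name `determinant` and the list-of-rows storage are assumptions of this sketch; it is meant for small hand-sized examples, not as an efficient numerical method.

```python
def determinant(m):
    """Determinant by expansion along the first row (small matrices only)."""
    n = len(m)
    if any(len(row) != n for row in m):
        raise ValueError("only square matrices have a determinant")
    if n == 1:
        return m[0][0]
    if n == 2:
        # (top left)(bottom right) - (top right)(bottom left)
        return m[0][0] * m[1][1] - m[0][1] * m[1][0]
    total = 0
    for col in range(n):
        # Remove the first row and the current column to get the minor.
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * determinant(minor)
    return total


# Matrices taken from the text.
d_singular_a = determinant([[1, -2], [2, -4]])     # singular
d_singular_b = determinant([[2, -4], [-3, 6]])     # singular
d_invertible = determinant([[7, 5], [4, 3]])       # has an inverse
d_trivial = determinant([[1, 3], [2, -1]])         # only the trivial solution
d_three = determinant([[2, -3, 1], [1, 2, -3], [3, -1, -2]])
```

A zero result marks a singular matrix, so the first two and the last value should vanish, while the other two should not.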
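The coefficient rows of the second and third equations are reduced as follows:

\[
\begin{bmatrix} 1 & 2 & -3 \\ 3 & -1 & -2 \end{bmatrix}, \quad \text{add } (-3) \times \text{row}_1 \text{ to row}_2
\]
\[
\rightarrow \begin{bmatrix} 1 & 2 & -3 \\ 0 & -7 & 7 \end{bmatrix}, \quad \text{divide row}_2 \text{ by } -7
\]
\[
\rightarrow \begin{bmatrix} 1 & 2 & -3 \\ 0 & 1 & -1 \end{bmatrix}
\]

The final matrix corresponds to the equation system:

\[
\begin{aligned}
x + 2y - 3z &= 0 \\
y - z &= 0
\end{aligned}
\]

This is re-arranged thus:

\[
\begin{aligned}
y &= z \\
x &= -2y + 3z
\end{aligned}
\]

Let z = r, where r is an arbitrary number. These equations then give:

\[
\begin{aligned}
y &= z = r \\
x &= -2y + 3z = -2r + 3r = r
\end{aligned}
\]

So the final solution can be expressed:

\[
\begin{aligned} x &= r \\ y &= r \\ z &= r \end{aligned}
\quad \text{or} \quad
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
= \begin{bmatrix} r \\ r \\ r \end{bmatrix}
= r \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
\]

Thus all the solutions are multiples of the column matrix:

\[
\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}.
\]

I leave it as an exercise to show that the same result is obtained with any other choice of two of the three equations.

Normalization

In the above example, we could equally say that all the solutions are multiples of the matrix,

\[
\begin{bmatrix} 2 \\ 2 \\ 2 \end{bmatrix} \text{ or }
\begin{bmatrix} -74 \\ -74 \\ -74 \end{bmatrix} \text{ or }
\begin{bmatrix} \pi \\ \pi \\ \pi \end{bmatrix} \text{ or } \dots
\]

… same. For another set of equations we might find that the solutions are all multiples of

\[
\begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix} \text{ or }
\begin{bmatrix} 3 \\ -6 \\ -3 \end{bmatrix} \text{ or }
\begin{bmatrix} 4.7 \\ -9.4 \\ -4.7 \end{bmatrix} \text{ or } \dots
\]

What is important is the ratios between the elements. It is often useful to choose a particular matrix as the basic solution. To do this we must specify an additional criterion. For example, let the first element be "1". Although this may seem reasonable, how would we deal with the following solution set

\[
\begin{bmatrix} 0 \\ -3 \\ 2 \end{bmatrix}?
\]

No amount of scaling will produce a "1" in the top row. It is preferable to use a criterion which does not single out one row over all the others. For reasons that we shall see later, we define a "normalized" matrix such that the sum of the squares of its elements adds to unity. If a matrix is denoted by x, then the sum of the squares of its elements is given by $x^T x$. For example

\[
x = \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}; \quad
x^T x = \begin{bmatrix} -1 & 2 & 1 \end{bmatrix} \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}
= (-1)^2 + 2^2 + 1^2 = 6
\]

Thus if we divide each element of x by $\sqrt{6}$ we will get a normalized matrix,

\[
\begin{bmatrix} -1/\sqrt{6} \\ 2/\sqrt{6} \\ 1/\sqrt{6} \end{bmatrix}.
\]

This works even if some of the elements are zero:

\[
\begin{bmatrix} 0 \\ -3 \\ 2 \end{bmatrix} \text{ normalizes to }
\begin{bmatrix} 0 \\ -3/\sqrt{13} \\ 2/\sqrt{13} \end{bmatrix}
\]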
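Normalization as defined above is easy to sketch in Python. Here a column matrix is assumed to be stored as a flat list of its elements, and the function name `normalize` is my own. The sketch divides every element by the square root of the sum of squares, so that the sum of squares of the result is unity.

```python
import math


def normalize(x):
    """Scale a column matrix so the sum of squares of its elements is 1."""
    sum_of_squares = sum(v * v for v in x)   # the number x^T x
    if sum_of_squares == 0:
        raise ValueError("a column of zeros cannot be normalized")
    length = math.sqrt(sum_of_squares)
    return [v / length for v in x]


# The two columns normalized in the text.
n1 = normalize([-1, 2, 1])   # each element divided by sqrt(6)
n2 = normalize([0, -3, 2])   # each element divided by sqrt(13)

# Positive multiples of the same column normalize to the same result
# (up to floating-point rounding).
n3 = normalize([2, 2, 2])
n4 = normalize([math.pi, math.pi, math.pi])
```

Note that a negative multiple, such as the column of −74s, normalizes to the same matrix with every sign reversed, so this criterion fixes the solution only up to an overall sign.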
Rank of a matrix
In the example just given it was shown that
the information content of the matrix really
amounted to two equations rather than three. If
a matrix has a non-zero determinant, none of the
rows or columns can be expressed as a linear
combination of the others. They all bring
independent information. They are called
linearly independent. Such a matrix is said to be
of full rank. In this case the full rank is 3, the
dimension of the square matrix.
If the determinant is zero, the matrix is said
to be not of full rank. The matrix just used
above corresponds to two independent
equations and has a rank of 2.
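One way to count the independent rows of a small matrix is to reduce it to echelon form and count the rows that do not reduce to all zeros. The Python sketch below does this with exact fractions. The function name `rank` and the list-of-rows storage are assumptions of the sketch, and it is intended only for small examples like those in these notes.

```python
from fractions import Fraction


def rank(m):
    """Number of independent rows, found by reduction to echelon form."""
    rows = [[Fraction(v) for v in row] for row in m]
    n_rows, n_cols = len(rows), len(rows[0])
    pivots = 0  # number of independent rows found so far
    for col in range(n_cols):
        # Look for a row at or below the current position with a non-zero entry.
        pivot = next(
            (i for i in range(pivots, n_rows) if rows[i][col] != 0), None
        )
        if pivot is None:
            continue
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        # Subtract multiples of the pivot row to clear the column below it.
        for i in range(pivots + 1, n_rows):
            factor = rows[i][col] / rows[pivots][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[pivots])]
        pivots += 1
        if pivots == n_rows:
            break
    return pivots


# The 3x3 matrix of the homogeneous system solved earlier: rank 2.
r_system = rank([[2, -3, 1], [1, 2, -3], [3, -1, -2]])
# A matrix with a non-zero determinant is of full rank.
r_full = rank([[1, 3], [2, -1]])
```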
Consider the following matrix:
\[
\begin{bmatrix} -2 & 3 & 5 \\ 4 & -6 & -10 \\ -6 & 9 & 15 \end{bmatrix}.
\]