Matrices: David A. Evans

Matrices
An Introduction
David A. Evans
What is a matrix?

A matrix is a rectangular array of numbers. All positions in the array contain a number, there are no gaps. Each individual number is called an element of the matrix. The dimension of a matrix is specified by the number of rows and the number of columns, conventionally, the number of rows is stated first. Thus the following matrices are respectively 2×2 and 3×4:

$$\mathbf{A} = \begin{bmatrix} 2.3 & 3.1 \\ 0.7 & 1.2 \end{bmatrix}; \quad \mathbf{B} = \begin{bmatrix} 3 & 0 & 2 & 1 \\ 4 & 2 & 5 & 1 \\ 4 & 2 & 1 & 0 \end{bmatrix}$$

A matrix is a complete entity and is referred to by a single symbol just as you have been used to x, y, a, etc. standing for some number. Printed convention is to use bold letters to stand for matrices. In handwritten work the underline is sometimes used thus, $\underline{A}$. We often need to refer to a particular element (e.g., 2nd row, 3rd column of B) or the general element (ith row, jth column of A). These have the symbols $b_{23}$ and $a_{ij}$, (it is conventional to use upper-case letters for the matrix and the corresponding lower-case letter for the elements). These symbols stand for ordinary numbers. We can write the equation: $b_{23} = 5$. Note the order of the sub-scripts is row-column.

Addition of matrices

There would be little point in introducing these entities without defining how they combine and interact. While it appears trivial, we must first agree on the meaning of the statement that two matrices are equal. If A and B are two matrices, then the statement, A = B means that the matrices have the same dimensions and that the corresponding elements of each are equal. That is, $a_{ij} = b_{ij}$, for all possible values of i and j.

Two matrices can be added to give a third matrix whose elements are the sums of the elements in the corresponding row/column positions of the matrices being added. Thus if

$$\mathbf{A} = \begin{bmatrix} 1 & 0 & 1 \\ 2 & 3 & 1 \end{bmatrix}; \quad \mathbf{B} = \begin{bmatrix} 2 & 1 & 4 \\ 0 & 1 & 3 \end{bmatrix}$$

The matrix A + B is given by

$$\mathbf{A} + \mathbf{B} = \begin{bmatrix} 1+2 & 0+1 & 1+4 \\ 2+0 & 3+1 & 1+3 \end{bmatrix} = \begin{bmatrix} 3 & 1 & 5 \\ 2 & 4 & 4 \end{bmatrix}$$

If we call this matrix C, we can write the matrix statement or equation

$$\mathbf{C} = \mathbf{A} + \mathbf{B}$$

where now the '+' and '=' signs stand for matrix addition and matrix equality. The subtraction of two matrices follows in an obvious way from the definition of addition.

Although any two numbers can always be added together, the same is not true for matrices. A moment's thought will show you that they have to have the same dimensions in order for addition to work. Thus, if A is a 2×2 matrix and B is 3×2, the symbol A + B is meaningless. The third row of B does not have any corresponding elements in A to be added. Such matrices are not conformable for addition.

Scalar multiple of a matrix

Adding a matrix to itself, A + A, will produce a matrix whose elements are twice the corresponding elements in A. It is not unreasonable to call this matrix 2A. In the same way, A + A + A will give a matrix with elements three times those of A and should be written 3A. By a simple extension of this idea, the symbol kA stands for a matrix whose elements are each k times the corresponding ones of A, where k is any number. We see that this now defines the multiplication of a matrix by a number. This is sometimes called scalar multiplication since ordinary numbers are referred to as scalars to distinguish them from matrices and other complicated entities (such as vectors).

Matrix multiplication

The definition of matrix multiplication is usually considered by students to be the trickiest and least intuitive of matrix properties. First of all, the product of two matrices AB is not, NOT, NOT! given by multiplying the corresponding elements of A and B!
Matrices - 1 of 9 - January, 98
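As a brief aside before the multiplication rule is developed, the addition and scalar-multiple rules above are simple enough to sketch in a few lines of Python. This is only an illustration and not part of the original notes: the helper names `dims`, `add` and `scale` are made up here, and a matrix is assumed to be stored as a list of rows.

```python
def dims(m):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return len(m), len(m[0])


def add(a, b):
    """Element-by-element sum; only defined for matrices of equal dimensions."""
    if dims(a) != dims(b):
        raise ValueError("matrices are not conformable for addition")
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def scale(k, a):
    """Scalar multiple kA: every element of A multiplied by the number k."""
    return [[k * x for x in row] for row in a]


# The 2x3 matrices used in the addition example of the text.
A = [[1, 0, 1], [2, 3, 1]]
B = [[2, 1, 4], [0, 1, 3]]

C = add(A, B)        # the matrix C = A + B
twice_A = scale(2, A)  # the same matrix as A + A
```

Calling `add` with a 2×2 and a 3×2 matrix raises an error, which mirrors the statement that such matrices are not conformable for addition.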
Let us consider the following simultaneous equations:

$$\begin{aligned} 3x + 2y &= 6 \\ x + 5y + 2z &= 13 \\ 2x + 2z &= 4 \end{aligned}$$

The solution depends on three items: the array of coefficients of x, y and z, the right-hand sides of the equations, and the symbols x, y, z that identify the unknowns. Regroup this information thus:

$$\begin{bmatrix} 3 & 2 & 0 \\ 1 & 5 & 2 \\ 2 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 6 \\ 13 \\ 4 \end{bmatrix}$$

This can be expressed in terms of symbols representing matrices thus

$$\mathbf{A}\mathbf{x} = \mathbf{b}$$

if we agree to interpret the product Ax such that the element in the first row and first column of the product is obtained from the elements in the first row of A and the elements in the first column of x in the following way: start at the left of the row and top of the column and take the product of the elements. Move one element right in the row and one element down in the column, and multiply these two elements. Continue moving right in the row and down in the column, forming products of the elements. Add up all the products; this gives the value of the element in the product. The element in the second row, first column of the product is obtained from the elements in the second row of A and the first column of x. As an example the following product is worked out for you:

$$\begin{bmatrix} 3 & -1 & 2 \\ 1 & 0 & 4 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -1 & 4 \\ 2 & -2 \end{bmatrix} = \begin{bmatrix} 3 \times 1 + (-1) \times (-1) + 2 \times 2 & 3 \times 0 + (-1) \times 4 + 2 \times (-2) \\ 1 \times 1 + 0 \times (-1) + 4 \times 2 & 1 \times 0 + 0 \times 4 + 4 \times (-2) \end{bmatrix} = \begin{bmatrix} 8 & -8 \\ 9 & -8 \end{bmatrix}$$

Notice that a 2×3 matrix multiplied by a 3×2 matrix gives a 2×2 matrix. This is an important thing about matrix multiplication: it can connect matrices with different dimensions. Addition and scalar multiplication only work with matrices of the same dimension. Notice also that you can't always multiply matrices. Write down a couple of 2×3 matrices and try to find their product. You can't do it because there are 3 elements in each row of the first and only 2 elements in each column of the second. You can't match up the appropriate elements to find the products required. In order for the product of two matrices to exist, the first factor must have the same number of columns as the second factor has rows. Matrices which can be multiplied are called conformable for multiplication. A convenient way of checking for conformity is to write the dimensions underneath the matrices, thus:

$$\underset{m \times n}{\mathbf{A}} \; \underset{n \times p}{\mathbf{B}} = \underset{m \times p}{\mathbf{C}}$$

Note that the inner values must be the same and that the dimensions of the product are given by the outer values.

Notice another odd thing: if the two matrices in the worked example above are multiplied in the opposite order, a different matrix results. It doesn't even have the same dimensions. As a further example, if A is a 2×3 matrix and B is a 3×4 matrix, you can form the product AB, but in the order BA the matrices are not conformable: B has 4 columns and A has 2 rows.

Thus the order of the factors is crucially important in matrix multiplication. Unlike the products of ordinary numbers (scalars), matrix multiplication does not commute.

Continued products of more than two factors can be done, e.g. ABC, providing the numbers of rows and columns match up:

$$\underset{m \times n}{\mathbf{A}} \; \underset{n \times p}{\mathbf{B}} \; \underset{p \times r}{\mathbf{C}}$$

In this case the final product is an m×r matrix.
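The row-by-column rule and the conformability check can be sketched in Python as follows. This is a minimal illustration rather than anything prescribed by the notes: the function name `matmul` is an assumption, and matrices are again assumed to be lists of rows.

```python
def matmul(a, b):
    """Row-by-column product AB of matrices stored as lists of rows."""
    m, n = len(a), len(a[0])
    if n != len(b):
        # columns of the first factor must equal rows of the second
        raise ValueError("matrices are not conformable for multiplication")
    p = len(b[0])
    # element (i, j) is the sum of products along row i of a and column j of b
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


# The worked example from the text: a 2x3 matrix times a 3x2 matrix.
A = [[3, -1, 2], [1, 0, 4]]
B = [[1, 0], [-1, 4], [2, -2]]

AB = matmul(A, B)  # 2x2
BA = matmul(B, A)  # 3x3: the opposite order gives a different matrix
```

Here `AB` reproduces the 2×2 result worked out above, while `BA` is a 3×3 matrix, and `matmul(A, A)` fails because two 2×3 matrices are not conformable for multiplication.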
Transpose of a matrix

A matrix can be transposed to create another matrix. Given a matrix A, its transpose is written $\mathbf{A}^T$ (or sometimes $\mathbf{A}'$). The first row of this matrix contains, in order, the elements in the first column of A, and likewise for all the other rows. Thus the number of rows (columns) in $\mathbf{A}^T$ is equal to the number of columns (rows) in A. An example follows:

$$\mathbf{A} = \begin{bmatrix} 1 & -2 & 5 \\ 0 & -1 & -3 \end{bmatrix}; \quad \mathbf{A}^T = \begin{bmatrix} 1 & 0 \\ -2 & -1 \\ 5 & -3 \end{bmatrix}$$

Notice that you can always form the products $\mathbf{A}^T\mathbf{A}$ and $\mathbf{A}\mathbf{A}^T$ for any matrix. The transposing guarantees that the numbers of rows and columns match up.

Suppose we have two matrices, A and B which can be multiplied together to give a matrix, C = AB. For the sake of example let us assume that A is 2×3 and B is 3×4, then C is 2×4. Consider now products of the transposes, $\mathbf{A}^T$ and $\mathbf{B}^T$. The product $\mathbf{A}^T\mathbf{B}^T$ does not exist since $\mathbf{A}^T$ is 3×2 and $\mathbf{B}^T$ is 4×3. However, the product $\mathbf{B}^T\mathbf{A}^T$ does exist; it is a 4×2 matrix which is in fact equal to $\mathbf{C}^T$. Try this calculation for a couple of matrices.

We have the general result:

$$(\mathbf{A}\mathbf{B})^T = \mathbf{B}^T\mathbf{A}^T.$$

Square matrices

We often encounter matrices with equal numbers of rows and columns (2×2, 3×3, etc.), these are quite reasonably called square matrices. Notice that all square matrices of a given dimension are conformable for multiplication, and the products are square matrices with that same dimension.

It is still true that multiplication does not commute, AB ≠ BA, in general, even though both products have the same dimension.

The set of elements on the diagonal of a square matrix leading from top left to bottom right is referred to as the leading diagonal.

Identity or Unit matrices

Within the realm of square matrices, there are some special ones which have ones (1) on the leading diagonal and zeros (0) everywhere else. The product of one of these matrices with any conformable matrix, A, (not necessarily square) gives the same matrix, A. For example,

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & -3 \\ 0 & -1 \\ -4 & 7 \end{bmatrix} = \begin{bmatrix} 2 & -3 \\ 0 & -1 \\ -4 & 7 \end{bmatrix}$$

These matrices play a role very similar to unity (1) in ordinary arithmetic. They are called identity or unit matrices and are written, $\mathbf{I}_n$, where n denotes the dimension, the identity matrix in the example above is $\mathbf{I}_3$. Where there is little chance of confusion, the subscript is often omitted. Take the non-identity matrix in the above example and multiply it by $\mathbf{I}_2$, with the identity matrix as the right-hand or second factor. You still get the same result. If A is a square matrix, then IA = AI, i.e. an identity matrix commutes with any conformable square matrix.

Matrix inverse

There is no such thing as matrix division. The symbol A/B cannot be given a meaning. The reason for this is easily demonstrated. Were division defined, the matrix equation C = A/B, would imply that C is a matrix that satisfies the relation BC = A. This is how division is defined for ordinary numbers. With matrices there is a problem. Let

$$\mathbf{A} = \begin{bmatrix} 4 & -7 \\ -8 & 14 \end{bmatrix}; \quad \mathbf{B} = \begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix}.$$

It is easy to verify that

$$\begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} 4 & -3 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} 4 & -7 \\ -8 & 14 \end{bmatrix}$$

thus it would appear that

$$\mathbf{C} = \begin{bmatrix} 4 & -3 \\ 0 & 2 \end{bmatrix}.$$
But it is equally easy to verify that

$$\begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ -1 & 3 \end{bmatrix} = \begin{bmatrix} 4 & -7 \\ -8 & 14 \end{bmatrix}$$

which would imply instead that

$$\mathbf{C} = \begin{bmatrix} 2 & -1 \\ -1 & 3 \end{bmatrix}.$$

Thus a unique value cannot be assigned to the operation "A/B" for these matrices. In fact, in this case there are an infinite number of matrices, C, that will satisfy the relation BC = A. Sometimes, however, C is unique. For the matrices

$$\mathbf{A} = \begin{bmatrix} 6 & -8 \\ 4 & -2 \end{bmatrix}; \quad \mathbf{B} = \begin{bmatrix} 3 & -2 \\ 1 & 0 \end{bmatrix}$$

there is indeed only one matrix:

$$\mathbf{C} = \begin{bmatrix} 4 & -2 \\ 3 & 1 \end{bmatrix}$$

which satisfies BC = A.

We pose the question: given a square matrix A, is there a square matrix, B, such that the product AB is an identity matrix, i.e. AB = I? B must, of course, have the same dimension as A.

If

$$\mathbf{A} = \begin{bmatrix} 7 & 5 \\ 4 & 3 \end{bmatrix}$$

we find that:

$$\begin{bmatrix} 7 & 5 \\ 4 & 3 \end{bmatrix} \begin{bmatrix} 3 & -5 \\ -4 & 7 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

and I state (without proof!) that this is a unique result, like the second example above; no other matrix exists whose product with A is the (2×2) identity matrix. This matrix is called the inverse of A and is written $\mathbf{A}^{-1}$. The result above is written in matrix notation as $\mathbf{A}\mathbf{A}^{-1} = \mathbf{I}$. As an exercise, form the product $\mathbf{A}^{-1}\mathbf{A}$; you will find that this is also the identity matrix. A matrix commutes with its inverse. It is not difficult to show that the inverse of the product AB, i.e. $(\mathbf{A}\mathbf{B})^{-1}$, is equal to $\mathbf{B}^{-1}\mathbf{A}^{-1}$.

But not every square matrix has an inverse. For example, the matrix

$$\begin{bmatrix} 2 & -4 \\ -3 & 6 \end{bmatrix}$$

does not have an inverse (again, I state this without proof!). There is no matrix by which you can multiply it to give the identity matrix. Matrices which have no inverse are said to be singular.

The inverse of a matrix is analogous to the reciprocal in ordinary numbers. If x is a number, then its reciprocal $x^{-1}$ is the number such that $x^{-1}x = 1$. One number, 0, has no reciprocal; there is no number by which you can multiply zero to obtain 1. Singular matrices are the matrix analogues of zero.

I have not indicated how, given the numerical values of the elements of a matrix, to calculate the elements of the inverse. The discussion of the various techniques for doing this forms an important topic in numerical methods. The calculations become very complex for even a 3×3 matrix. One of the powerful features of using matrix notation is that tremendously complicated arithmetic is encapsulated in the little superscript, -1! For example, we have seen that a set of simultaneous equations can be written in matrix notation as Ax = b. The equations can be solved symbolically, thus:

$$\mathbf{A}\mathbf{x} = \mathbf{b}$$

multiply each side by $\mathbf{A}^{-1}$,

$$\mathbf{A}^{-1}\mathbf{A}\mathbf{x} = \mathbf{A}^{-1}\mathbf{b},$$

but $\mathbf{A}^{-1}\mathbf{A} = \mathbf{I}$, therefore

$$\mathbf{I}\mathbf{x} = \mathbf{A}^{-1}\mathbf{b},$$

$$\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}.$$

This shows how the inverse of a matrix is related to the solution of a set of simultaneous equations. Note the simplicity of the relation expressed in this form even though the calculation of the inverse and a matrix product involve many arithmetical operations on the elements of the matrices.

Symmetric Matrices

Consider the matrix

$$\begin{bmatrix} -1 & 4 & 0 & 5 & -7 \\ 4 & -3 & 2 & -1 & 0 \\ 0 & 2 & 1 & 4 & -2 \\ 5 & -1 & 4 & 3 & 1 \\ -7 & 0 & -2 & 1 & -5 \end{bmatrix}$$

Notice the symmetrical arrangement of the numbers. Draw an imaginary line down the leading diagonal (from upper left to lower right). This divides the matrix into two triangles. The numbers in the lower triangle are the "reflection" in this line of those in the upper triangle. Another way of putting this is that the elements in the first (or ith) row are the same and in the same order as those in the first (or ith) column. A more succinct way of describing the arrangement is to note that the transpose of the matrix is identical to the original matrix. That is, for the matrix above we can state

$$\mathbf{A}^T = \mathbf{A}$$

A matrix with this property is said to be symmetric. Notice that only square matrices can be symmetric. Symmetric matrices often appear in practical applications, for example, in multivariate statistics the correlation and covariance matrices are symmetric.

It is not difficult to see that the sum (or difference) of two symmetric matrices is also symmetric. The inverse of a symmetric matrix (if it exists) is symmetric. However, the product of two symmetric matrices is not necessarily symmetric. (Can you deduce what property symmetric matrices A and B must have if their product AB is also symmetric?)

It was previously stated that for any matrix, A, we can always make the products $\mathbf{A}^T\mathbf{A}$ and $\mathbf{A}\mathbf{A}^T$. Both these products are square (though not necessarily of the same size) and it is easy to show that they are symmetric.

A special set of symmetric matrices are those whose off-diagonal elements are all zero, e.g.

$$\begin{bmatrix} 2 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 4 \end{bmatrix}$$

These are called diagonal matrices. Note that the unit matrix is a diagonal matrix in which the diagonal elements are all unity (1). It is very easy to find the inverse of a diagonal matrix; it is also a diagonal matrix whose non-zero elements are the reciprocals of the corresponding elements. Thus the inverse of the above matrix is

$$\begin{bmatrix} \tfrac{1}{2} & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & \tfrac{1}{4} \end{bmatrix}$$

Check this by multiplying them together.

Another kind of symmetry can occur in matrices, consider the following matrix,

$$\begin{bmatrix} 0 & 1 & -3 \\ -1 & 0 & 2 \\ 3 & -2 & 0 \end{bmatrix}$$

The elements in the lower triangle are the negative of the corresponding elements in the upper triangle. The ith column is the negative of the ith row. The transpose of the matrix gives the negative of the original matrix, i.e.

$$\mathbf{A}^T = -\mathbf{A}$$

Such matrices are called skew-symmetric or antisymmetric. Notice that the leading diagonal elements of a skew-symmetric matrix must be zero.

Determinants

We have seen that a matrix may be singular, i.e. it has no inverse. Can we detect this without trying unsuccessfully to invert the matrix?

A set of linear simultaneous equations expressed in matrix form as Ax = b, has the solution expressed as $\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}$. If A is singular, $\mathbf{A}^{-1}$ does not exist and so there is no unique solution: the equations have either no solution at all or infinitely many. Thus the singularity of A is related to the existence of solutions of a set of simultaneous equations.

The following equations:

$$\begin{aligned} x - 2y &= 4 \\ 2x - 4y &= 6 \end{aligned}$$
do not have a solution, i.e. no pair of values of x and y can satisfy them. The reason is easy to see, the left hand side of the second equation is just twice the left hand side of the first. Suppose we choose any pair of values of x and y that satisfy the first equation. These values can never satisfy the second since on substituting the values of x and y, the left hand side will always result in the value 8, never 6. The equations are said to be inconsistent. Conversely, the equations:

$$\begin{aligned} x - 2y &= 4 \\ 2x - 4y &= 8 \end{aligned}$$

have an infinite number of solutions. The second equation is automatically satisfied for any values of x and y that satisfy the first. We basically have just one equation in two unknowns. There is no additional information provided by the second equation. The problem, of course, is that the equations are just multiples of each other.

Since neither pair of equations has a unique solution, the matrix of coefficients, A:

$$\begin{bmatrix} 1 & -2 \\ 2 & -4 \end{bmatrix}$$

is singular. We see that the second row is just twice the first row. Note a similar dependence between the columns. The second column is the negative of twice the first column. It is clear that the problems will always arise whenever one row is any multiple of the other. Let us consider a general 2×2 matrix

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

If the matrix is singular, one row is just a multiple of the other:

$$a = kc; \quad b = kd$$

or

$$\frac{a}{c} = \frac{b}{d} = k$$

or

$$ad = bc$$

or

$$ad - bc = 0.$$

Here we have a test for the singularity of a 2×2 matrix. The expression $ad - bc$ is calculated, i.e. (top left)(bottom right) − (top right)(bottom left). If this is zero the matrix is singular and has no inverse, otherwise it is non-singular and has an inverse. The number we have calculated is called the determinant of the matrix. We use the symbol

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix}$$

to stand for the determinant of the matrix. Note that the determinant is a number which is a function of the elements of the matrix. We have, for the example equations above

$$\begin{vmatrix} 1 & -2 \\ 2 & -4 \end{vmatrix} = 1 \times (-4) - (-2) \times 2 = -4 - (-4) = 0$$

Thus, as promised, the matrix is singular. In the section dealing with inverse matrices, it was stated that the matrix

$$\begin{bmatrix} 2 & -4 \\ -3 & 6 \end{bmatrix}$$

was singular. Checking the determinant we find

$$\begin{vmatrix} 2 & -4 \\ -3 & 6 \end{vmatrix} = 2 \times 6 - (-4) \times (-3) = 12 - 12 = 0$$

showing that the matrix is indeed singular.

This can be extended to square matrices with larger dimensions. Any square matrix has a determinant, a number which is a function of the elements. If the determinant is zero, the matrix is singular and vice versa. The expression for the determinant becomes quite complicated and, at this stage, need not concern us greatly. The determinant of a 3×3 matrix is given by

$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & j \end{vmatrix} = a \begin{vmatrix} e & f \\ h & j \end{vmatrix} - b \begin{vmatrix} d & f \\ g & j \end{vmatrix} + c \begin{vmatrix} d & e \\ g & h \end{vmatrix}$$

Each of the 2×2 determinants is then expanded using the formula given above. In this way we can obtain explicitly the value of the
determinant in terms of the elements. This is left as an exercise. Notice that each of the 2×2 determinants is obtained from the original matrix by removing the first row and then removing each column in turn. A similar relationship can be written to express a 4×4 determinant in terms of four 3×3 determinants, and this can be continued for the determinants of larger matrices.

Homogeneous equations

Consider the set of equations:

$$\begin{aligned} x + 3y &= 0 \\ 2x - y &= 0 \end{aligned}$$

Both right hand sides are zero; such equations are called homogeneous. The matrix of coefficients is nonsingular (check that the determinant is not zero), the equations have a unique solution x = 0, y = 0, not a very interesting state of affairs. In fact, this is called a trivial solution.

Consider now the following:

$$\begin{aligned} x - 2y &= 0 \\ 2x - 4y &= 0 \end{aligned}$$

where the left hand sides are the same as the examples given in the last section. There are now an infinite number of solutions in addition to the trivial one. Any pair of values where x is twice y satisfies the set of equations. The set has the solutions in addition to the trivial one because the determinant of the matrix of coefficients is zero. This idea extends to equations in three, four, etc. unknowns.

We have the result that a set of homogeneous equations has only the trivial solution unless the determinant of the matrix of the coefficients is zero. When the determinant is zero, there are an infinite number of solutions where definite relationships exist between the values of the unknowns.

Consider the following system of equations:

$$\begin{bmatrix} 2 & -3 & 1 \\ 1 & 2 & -3 \\ 3 & -1 & -2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \mathbf{0}$$

The system is homogeneous. It is straightforward (but labor-intensive!) to check that:

$$\begin{vmatrix} 2 & -3 & 1 \\ 1 & 2 & -3 \\ 3 & -1 & -2 \end{vmatrix} = 0$$

Therefore there are solutions in addition to the trivial one, x = y = z = 0.

It can be shown that when a determinant is zero, at least one row (or one column) can be expressed as a linear combination of the remaining rows (or columns). That is, in the determinant above, one has that:

$$\text{row}_3 = a\,\text{row}_1 + b\,\text{row}_2$$

is true for some value of a and b. In our case it is easy to see that a = 1 and b = 1 so that:

$$\text{row}_3 = \text{row}_1 + \text{row}_2.$$

Another way of putting it is that the third equation is merely obtained by adding the first two. Therefore it must be satisfied by any values of x, y, and z that satisfy the first two equations. It does not bring any new information to the system; it is redundant.

It is not the third equation that is at fault. We might equally well write:

$$\text{row}_1 = \text{row}_3 - \text{row}_2$$

so that the first row is a linear combination of the second and the third. The important thing is that there is only information from two equations present. We know that two equations in three unknowns have an infinite number of solutions.

We solve two of the equations using the method of transformation to echelon form. We shall arbitrarily choose the second and third.

$$\begin{bmatrix} 1 & 2 & -3 \\ 3 & -1 & -2 \end{bmatrix}, \quad \text{add } (-3) \times \text{row}_1 \text{ to row}_2$$

$$\rightarrow \begin{bmatrix} 1 & 2 & -3 \\ 0 & -7 & 7 \end{bmatrix}, \quad \text{divide row}_2 \text{ by } -7$$

$$\rightarrow \begin{bmatrix} 1 & 2 & -3 \\ 0 & 1 & -1 \end{bmatrix}$$

The final matrix corresponds to the equation system:

$$\begin{aligned} x + 2y - 3z &= 0 \\ y - z &= 0 \end{aligned}$$

This is re-arranged thus:

$$\begin{aligned} y &= z \\ x &= -2y + 3z \end{aligned}$$

Let z = r, where r is an arbitrary number. These equations then give:

$$\begin{aligned} y &= z = r \\ x &= -2y + 3z = -2r + 3r = r \end{aligned}$$

So the final solution can be expressed:

$$x = r, \; y = r, \; z = r, \quad \text{or} \quad \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} r \\ r \\ r \end{bmatrix} = r \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$

Thus all the solutions are multiples of the column matrix:

$$\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}.$$

I leave it as an exercise to show that the same result is obtained with any other choice of two of the three equations.

Normalization

In the above example, we could equally say that all the solutions are multiples of the matrix,

$$\begin{bmatrix} 2 \\ 2 \\ 2 \end{bmatrix} \text{ or } \begin{bmatrix} -74 \\ -74 \\ -74 \end{bmatrix} \text{ or } \begin{bmatrix} \pi \\ \pi \\ \pi \end{bmatrix} \text{ or } \ldots$$

In this particular case it is easy to see that all that is necessary is that the elements are all the same. For another set of equations we might find that the solutions are all multiples of

$$\begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix} \text{ or } \begin{bmatrix} 3 \\ -6 \\ -3 \end{bmatrix} \text{ or } \begin{bmatrix} 4.7 \\ -9.4 \\ -4.7 \end{bmatrix} \text{ or } \ldots$$

What is important is the ratios between the elements. It is often useful to choose a particular matrix as the basic solution. To do this we must specify an additional criterion. For example, let the first element be "1". Although this may seem reasonable, how would we deal with the following solution set

$$\begin{bmatrix} 0 \\ -3 \\ 2 \end{bmatrix}?$$

No amount of scaling will produce a "1" in the top row. It is preferable to use a criterion which does not single out one row over all the others. For reasons that we shall see later, we define a "normalized" matrix such that the sum of the squares of its elements add to unity. If a matrix is denoted by x, then the sum of the squares of its elements is given by $\mathbf{x}^T\mathbf{x}$. For example

$$\mathbf{x} = \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}; \quad \mathbf{x}^T\mathbf{x} = \begin{bmatrix} -1 & 2 & 1 \end{bmatrix} \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix} = (-1)^2 + 2^2 + 1^2 = 6$$

Thus if we divide each element of x by $\sqrt{6}$ we will get a normalized matrix,

$$\begin{bmatrix} -1/\sqrt{6} \\ 2/\sqrt{6} \\ 1/\sqrt{6} \end{bmatrix}.$$

This works even if some of the elements are zero:

$$\begin{bmatrix} 0 \\ -3 \\ 2 \end{bmatrix} \text{ normalizes to } \begin{bmatrix} 0 \\ -3/\sqrt{13} \\ 2/\sqrt{13} \end{bmatrix}$$
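The normalization rule can be tried out numerically with a short Python sketch. The function name `normalize` and the representation of a column matrix as a flat list are assumptions made for this illustration, and the guard against the all-zero column is an addition that the notes do not discuss.

```python
import math


def normalize(x):
    """Scale a column (given as a flat list) so that its squares sum to 1."""
    norm_sq = sum(v * v for v in x)  # the quantity written x^T x in the text
    if norm_sq == 0:
        # assumption: the all-zero column has no normalized form
        raise ValueError("the zero column cannot be normalized")
    length = math.sqrt(norm_sq)
    return [v / length for v in x]


u = normalize([-1, 2, 1])  # each element divided by the square root of 6
w = normalize([0, -3, 2])  # each element divided by the square root of 13
```

Any non-zero multiple of a solution column with the same sign pattern, for example `[4.7, -9.4, -4.7]` scaled by a positive factor, normalizes to the same result, which is what makes the normalized form a convenient representative.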
Rank of a matrix
In the example just given it was shown that
the information content of the matrix really
amounted to two equations rather than three. If
a matrix has a non-zero determinant, none of the
rows or columns can be expressed as a linear
combination of the others. They all bring
independent information. They are called
linearly independent. Such a matrix is said to be
of full rank. In this case the full rank is 3, the
dimension of the square matrix.
If the determinant is zero, the matrix is said
to be not of full rank. The matrix just used
above corresponds to two independent
equations and has a rank of 2.
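The zero determinant behind this statement can be confirmed with the first-row expansion from the Determinants section. The sketch below is illustrative only: the names `det2` and `det3` are assumptions, and the code handles just the 2×2 and 3×3 cases covered in these notes.

```python
def det2(m):
    """ad - bc for a 2x2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c


def det3(m):
    """Expand a 3x3 determinant along its first row."""
    total = 0
    for col in range(3):
        # remove the first row and the current column to get the 2x2 minor
        minor = [[row[j] for j in range(3) if j != col] for row in m[1:]]
        # signs alternate +, -, + along the first row
        total += (-1) ** col * m[0][col] * det2(minor)
    return total


# Coefficient matrix of the homogeneous system solved earlier.
M = [[2, -3, 1], [1, 2, -3], [3, -1, -2]]

d = det3(M)
row_sum = [p + q for p, q in zip(M[0], M[1])]  # row1 + row2
```

Here `d` evaluates to zero and `row_sum` reproduces the third row, in line with the observation that the third equation adds no new information.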
Consider the following matrix:
− 2 3 5 
 4 − 6 − 10 .
 
− 6 9 15 
The determinant is zero. In this case it is easy to
see that the second and third rows are
respectively, (–2) and 3 times the first row. (Or
equally correctly, the first and third rows are
respectively (–0.5) and (–1.5) times the second
row.) Thus there is only one independent row.
The rank of this matrix is therefore 1.
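One way to count independent rows mechanically is to reduce the matrix to echelon form, as was done by hand for the homogeneous system, and count the rows that do not reduce to zero. The following Python sketch takes that approach; it is an assumption of this illustration rather than the method the notes go on to develop, the function name `rank` is made up, and exact fractions are used to avoid rounding problems.

```python
from fractions import Fraction


def rank(matrix):
    """Count the non-zero rows left after reduction to echelon form."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # index of the next pivot row
    for c in range(n_cols):
        # find a row at or below r with a non-zero entry in column c
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # subtract multiples of the pivot row to clear column c below it
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


rank_one = rank([[-2, 3, 5], [4, -6, -10], [-6, 9, 15]])
rank_two = rank([[2, -3, 1], [1, 2, -3], [3, -1, -2]])
```

With the two singular 3×3 matrices discussed in this section, `rank_one` and `rank_two` come out as 1 and 2 respectively, matching the counts of independent rows found by inspection.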
A non-singular matrix has a non-zero
determinant and has full rank equal to the
number of rows (or columns); its rows (and
columns) are linearly independent. We see that
a singular matrix has a zero determinant and
also has a rank which is less than the dimension
of the matrix. The rank is the number of rows
(or columns) which are linearly independent.
Calculating the rank of a matrix is not a trivial
exercise. We shall see later on that it is closely
related to the eigenvalues of the matrix.
