
Subsections Definition of a Matrix Special Matrices Operations on Matrices Multiplication of Matrices Inverse of a Matrix Some More Special Matrices

Submatrix of a Matrix Block Matrices Miscellaneous Exercises Matrices over Complex Numbers


DEFINITION 1.1.1 (Matrix) A rectangular array of numbers is called a matrix. We shall mostly be concerned with matrices having real numbers as entries. The horizontal arrays of a matrix are called its ROWS and the vertical arrays are called its COLUMNS. A matrix having $m$ rows and $n$ columns is said to have the order $m \times n$. A matrix $A$ of order $m \times n$ can be represented in the following form:

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix},$$

where $a_{ij}$ is the entry at the intersection of the $i^{\text{th}}$ row and $j^{\text{th}}$ column.

In a more concise manner, we also denote the matrix $A$ by $[a_{ij}]$, by suppressing its order.

Remark 1.1.2 Some books also use $(a_{ij})$ to represent a matrix.

Let

Then

and

A matrix having only one column is called a COLUMN VECTOR; and a matrix with only one row is called a ROW VECTOR. WHENEVER A VECTOR IS USED, IT SHOULD BE UNDERSTOOD FROM THE CONTEXT WHETHER IT IS A ROW VECTOR OR A COLUMN VECTOR. DEFINITION 1.1.3 (Equality of two Matrices) Two matrices $A = [a_{ij}]$ and $B = [b_{ij}]$ having the same order $m \times n$ are equal if $a_{ij} = b_{ij}$ for each $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$.

where the point of symmetry is left as an exercise to the reader.

is the centre. The calculation of the planes
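These notions (order, rows, columns and entry-wise equality) can be checked concretely. Below is a minimal NumPy sketch using a made-up $2 \times 3$ matrix; it is an illustration only.

```python
import numpy as np

# A hypothetical 2 x 3 matrix: 2 rows and 3 columns, so its order is 2 x 3.
A = np.array([[1, 4, 5],
              [0, 1, 2]])

print(A.shape)     # (2, 3): the order m x n
print(A[0, :])     # first row (a row vector)
print(A[:, 2])     # third column (a column vector)
print(A[0, 2])     # entry a_{13}: intersection of row 1 and column 3

# Two matrices are equal when they have the same order and all
# corresponding entries agree (Definition 1.1.3).
B = np.array([[1, 4, 5],
              [0, 1, 2]])
print(A.shape == B.shape and np.array_equal(A, B))   # True
```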

DEFINITION 1.1.5
1. A matrix in which each entry is zero is called a zero-matrix, denoted by $\mathbf{0}$. For example, $\mathbf{0}_{2 \times 2} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$ and $\mathbf{0}_{2 \times 3} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$.
2. A matrix for which the number of rows equals the number of columns is called a square matrix. So, if $A$ is an $n \times n$ matrix then $A$ is said to have order $n$.
3. In a square matrix $A = [a_{ij}]$ of order $n$, the entries $a_{11}, a_{22}, \ldots, a_{nn}$ are called the diagonal entries and form the principal diagonal of $A$.
4. A square matrix $A = [a_{ij}]$ is said to be a diagonal matrix if $a_{ij} = 0$ for $i \neq j$. In other words, the non-zero entries appear only on the principal diagonal. For example, the zero matrix $\mathbf{0}_n$ and $\begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix}$ are a few diagonal matrices. A diagonal matrix $D$ of order $n$ with the diagonal entries $d_1, d_2, \ldots, d_n$ is denoted by $D = \operatorname{diag}(d_1, \ldots, d_n)$. If $d_i = d$ for all $i = 1, 2, \ldots, n$, then the diagonal matrix $D$ is called a scalar matrix.
5. A diagonal matrix $D$ of order $n$ is called an IDENTITY MATRIX if $d_i = 1$ for all $i = 1, 2, \ldots, n$. This matrix is denoted by $I_n$. For example, $I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ and $I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$. The subscript $n$ is suppressed in case the order is clear from the context or if no confusion arises.

A square matrix $A = [a_{ij}]$ is said to be an upper triangular matrix if $a_{ij} = 0$ for $i > j$. A square matrix $A = [a_{ij}]$ is said to be a lower triangular matrix if $a_{ij} = 0$ for $i < j$. A square matrix $A$ is said to be triangular if it is an upper or a lower triangular matrix.

For example, $\begin{bmatrix} 2 & 1 & 4 \\ 0 & 3 & -1 \\ 0 & 0 & -2 \end{bmatrix}$ is an upper triangular matrix. An upper triangular matrix will be represented by $\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & a_{nn} \end{bmatrix}$.

DEFINITION 1.2.1 (Transpose of a Matrix) The transpose of an $m \times n$ matrix $A = [a_{ij}]$ is defined as the $n \times m$ matrix $B = [b_{ij}]$, with $b_{ij} = a_{ji}$ for $1 \le i \le n$ and $1 \le j \le m$. The transpose of $A$ is denoted by $A^t$.

That is, by the transpose of an $m \times n$ matrix $A$, we mean a matrix of order $n \times m$ having the rows of $A$ as its columns and the columns of $A$ as its rows.

For example, if $A = \begin{bmatrix} 1 & 4 & 5 \\ 0 & 1 & 2 \end{bmatrix}$ then $A^t = \begin{bmatrix} 1 & 0 \\ 4 & 1 \\ 5 & 2 \end{bmatrix}$.

Thus, the transpose of a row vector is a column vector and vice-versa.

THEOREM 1.2.2 For any matrix $A$, we have $(A^t)^t = A$.

Proof. Let $A = [a_{ij}]$, $A^t = [b_{ij}]$ and $(A^t)^t = [c_{ij}]$. Then, the definition of transpose gives $c_{ij} = b_{ji} = a_{ij}$ for all $i, j$, and the result follows. $\square$

DEFINITION 1.2.3 (Addition of Matrices) Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be two $m \times n$ matrices. Then the sum $A + B$ is defined to be the matrix $C = [c_{ij}]$ with $c_{ij} = a_{ij} + b_{ij}$.

Note that we define the sum of two matrices only when the orders of the two matrices are the same.

DEFINITION 1.2.4 (Multiplying a Scalar to a Matrix) Let $A = [a_{ij}]$ be an $m \times n$ matrix. Then for any element $k \in \mathbb{R}$, we define $kA = [k\, a_{ij}]$.

For example, if

and

then
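As a small computational illustration of Definitions 1.2.1, 1.2.3 and 1.2.4 (the matrices below are made up, since the original example is not reproduced here): addition is entry-wise and requires equal orders, a scalar multiplies every entry, and transposing twice returns the original matrix.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 0],
              [1, 2]])

# Addition (Definition 1.2.3): defined only when A and B have the same order.
print(A + B)        # [[6 2] [4 6]]

# Scalar multiplication (Definition 1.2.4): k*A multiplies every entry by k.
k = 3
print(k * A)        # [[3 6] [9 12]]

# Transpose (Definition 1.2.1) and Theorem 1.2.2: (A^t)^t = A.
print(np.array_equal(A.T.T, A))   # True
```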

THEOREM 1.2.5 Let $A$, $B$ and $C$ be matrices of order $m \times n$, and let $k, \ell \in \mathbb{R}$. Then
1. $A + B = B + A$ (commutativity).
2. $(A + B) + C = A + (B + C)$ (associativity).
3. $k(\ell A) = (k\ell)A$.
4. $(k + \ell)A = kA + \ell A$.

Proof. Part 1. Let $A = [a_{ij}]$ and $B = [b_{ij}]$. Then $A + B = [a_{ij} + b_{ij}] = [b_{ij} + a_{ij}] = B + A$, as real numbers commute. The reader is required to prove the other parts as all the results follow from the properties of real numbers. $\square$

EXERCISE 1.2.6
1. Suppose $A + B = A$. Then show that $B = \mathbf{0}$.
2. Suppose $A + B = \mathbf{0}$. Then show that $B = (-1)A = [-a_{ij}]$.

DEFINITION 1.2.7 (Additive Inverse) Let $A$ be an $m \times n$ matrix.
1. Then there exists a matrix $B$ with $A + B = \mathbf{0}$. This matrix $B$ is called the additive inverse of $A$, and is denoted by $-A = (-1)A$.
2. Also, for the matrix $\mathbf{0}_{m \times n}$, $A + \mathbf{0} = \mathbf{0} + A = A$. Hence, the matrix $\mathbf{0}_{m \times n}$ is called the additive identity.

DEFINITION 1.2.8 (Matrix Multiplication / Product) Let $A = [a_{ij}]$ be an $m \times n$ matrix and $B = [b_{ij}]$ be an $n \times r$ matrix. The product $AB$ is a matrix $C = [c_{ij}]$ of order $m \times r$ with
$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}, \qquad 1 \le i \le m, \; 1 \le j \le r.$$
That is, the $(i, j)^{\text{th}}$ entry of $AB$ is obtained by multiplying the $i^{\text{th}}$ row of $A$ with the $j^{\text{th}}$ column of $B$, entry by entry, and adding.

Observe that the product $AB$ is defined if and only if

THE NUMBER OF COLUMNS OF $A$ = THE NUMBER OF ROWS OF $B$.

For example, if

and

then

(1.2.1)

(1.2.2)

Observe the following:
1. In this example, while $AB$ is defined, the product $BA$ is not defined. However, for square matrices $A$ and $B$ of the same order, both the products $AB$ and $BA$ are defined.
2. The product $AB$ corresponds to operating on the rows of the matrix $B$ (see 1.2.1).
3. The product $AB$ also corresponds to operating on the columns of the matrix $A$ (see 1.2.2).

DEFINITION 1.2.9 Two square matrices $A$ and $B$ are said to commute if $AB = BA$.

Remark 1.2.10
1. Note that if $A$ is a square matrix of order $n$ then $AI_n = I_n A = A$. Also, a scalar matrix of order $n$ commutes with any square matrix of order $n$.
2. In general, the matrix product is not commutative. For example, take two square matrices $A$ and $B$ of the same order and check that, in general, the matrix products $AB$ and $BA$ are not equal.
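The product formula of Definition 1.2.8 and the failure of commutativity in Remark 1.2.10 can both be checked numerically; the $2 \times 2$ matrices below are made up for illustration.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

# Entry (i, j) of AB is the sum over k of a_{ik} * b_{kj} (Definition 1.2.8).
m, n = A.shape
n2, r = B.shape
assert n == n2, "AB is defined only when columns of A = rows of B"
C = np.zeros((m, r))
for i in range(m):
    for j in range(r):
        C[i, j] = sum(A[i, k] * B[k, j] for k in range(n))

print(np.array_equal(C, A @ B))        # True: matches the built-in product

# In general AB != BA, even for square matrices of the same order.
print(np.array_equal(A @ B, B @ A))    # False for this pair
```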

THEOREM 1.2.11 Suppose that the matrices $A$, $B$ and $C$ are so chosen that the matrix multiplications are defined.
1. Then $(AB)C = A(BC)$. That is, the matrix multiplication is associative.
2. For any $k \in \mathbb{R}$, $(kA)B = k(AB) = A(kB)$.
3. Then $A(B + C) = AB + AC$. That is, multiplication distributes over addition.
4. If $A$ is an $n \times n$ matrix then $AI_n = I_n A = A$.
5. For any square matrix $A$ of order $n$ and $D = \operatorname{diag}(d_1, d_2, \ldots, d_n)$, the first row of $DA$ is $d_1$ times the first row of $A$; and for $1 \le i \le n$, the $i^{\text{th}}$ row of $DA$ is $d_i$ times the $i^{\text{th}}$ row of $A$. A similar statement holds for the columns of $A$ when $A$ is multiplied on the right by $D$.

Proof. Part 1. Let $A = [a_{ij}]_{m \times n}$, $B = [b_{ij}]_{n \times p}$ and $C = [c_{ij}]_{p \times q}$. Then
$$(BC)_{kj} = \sum_{\ell=1}^{p} b_{k\ell}\, c_{\ell j} \quad \text{and} \quad (AB)_{i\ell} = \sum_{k=1}^{n} a_{ik}\, b_{k\ell}.$$
Therefore,
$$\bigl(A(BC)\bigr)_{ij} = \sum_{k=1}^{n} a_{ik} (BC)_{kj} = \sum_{k=1}^{n} \sum_{\ell=1}^{p} a_{ik} b_{k\ell} c_{\ell j} = \sum_{\ell=1}^{p} (AB)_{i\ell}\, c_{\ell j} = \bigl((AB)C\bigr)_{ij}.$$

Part 5. For all $j = 1, 2, \ldots, n$, we have $(DA)_{ij} = \sum_{k=1}^{n} d_{ik} a_{kj} = d_i a_{ij}$, as $d_{ik} = 0$ whenever $i \neq k$. Hence, the required result follows.

The reader is required to prove the other parts. $\square$

EXERCISE 1.2.12
1. Let $A$ and $B$ be two matrices. If the matrix addition $A + B$ is defined, then prove that $(A + B)^t = A^t + B^t$. Also, if the matrix product $AB$ is defined, then prove that $(AB)^t = B^t A^t$.

2. Let

and

Compute the matrix products

and

Let

be a positive integer. Compute

for the following matrices:

Can you guess a formula for

1. Suppose that the matrix product $AB$ is defined. Then the product $BA$ need not be defined.
2. Suppose that the matrix products $AB$ and $BA$ are defined. Then the matrices $AB$ and $BA$ can have different orders.
3. Suppose that the matrices $A$ and $B$ are square matrices of order $n$. Then $AB$ and $BA$ are defined, but $AB$ and $BA$ may or may not be equal.

DEFINITION 1.2.13 (Inverse of a Matrix) Let $A$ be a square matrix of order $n$.
1. A square matrix $B$ is said to be a LEFT INVERSE of $A$ if $BA = I_n$.
2. A square matrix $C$ is called a RIGHT INVERSE of $A$ if $AC = I_n$.
3. A matrix $A$ is said to be INVERTIBLE (or is said to have an INVERSE) if there exists a matrix $B$ such that $AB = BA = I_n$.

LEMMA 1.2.14 Let $A$ be an $n \times n$ matrix. Suppose that there exist $n \times n$ matrices $B$ and $C$ such that $AB = I_n$ and $CA = I_n$. Then $B = C$.

Proof. Note that $C = CI_n = C(AB) = (CA)B = I_n B = B$. $\square$

Remark 1.2.15
1. From the above lemma, we observe that if a matrix $A$ is invertible, then the inverse is unique.
2. As the inverse of a matrix $A$ is unique, we denote it by $A^{-1}$. That is, $AA^{-1} = A^{-1}A = I$.

THEOREM 1.2.16 Let $A$ and $B$ be two matrices with inverses $A^{-1}$ and $B^{-1}$, respectively. Then
1. $(A^{-1})^{-1} = A$;
2. $(AB)^{-1} = B^{-1}A^{-1}$;
3. $(A^t)^{-1} = (A^{-1})^t$.

Proof. Proof of Part 1. By definition, $AA^{-1} = A^{-1}A = I$. Hence, if we denote $A^{-1}$ by $B$, then we get $AB = BA = I$, or equivalently $BA = AB = I$. Thus, the definition implies $B^{-1} = A$, that is, $(A^{-1})^{-1} = A$.

Proof of Part 2. Verify that $(AB)(B^{-1}A^{-1}) = I = (B^{-1}A^{-1})(AB)$.

Proof of Part 3. We know $AA^{-1} = A^{-1}A = I$. Taking transpose, we get $(A^{-1})^t A^t = A^t (A^{-1})^t = I$. Hence, by definition $(A^t)^{-1} = (A^{-1})^t$. $\square$

EXERCISE 1.2.17
1. Let $A_1, A_2, \ldots, A_r$ be invertible matrices. Prove that the product $A_1 A_2 \cdots A_r$ is also an invertible matrix.
2. Let $A$ be an invertible matrix. Then prove that $A$ cannot have a row or column consisting of only zeros.
3. Let $A$ be an invertible matrix and let $c$ be a nonzero real number. Then determine the inverse of the matrix $cA$.

DEFINITION 1.3.1
1. A matrix $A$ over $\mathbb{R}$ is called symmetric if $A^t = A$, and skew-symmetric if $A^t = -A$.
2. A matrix $A$ is said to be orthogonal if $AA^t = A^t A = I$.

EXAMPLE 1.3.2
1. Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & -1 \\ 3 & -1 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 1 & 2 \\ -1 & 0 & -3 \\ -2 & 3 & 0 \end{bmatrix}$. Then $A$ is a symmetric matrix and $B$ is a skew-symmetric matrix.
2. Let $A = \begin{bmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \end{bmatrix}$. Then $A$ is an orthogonal matrix.

3. Let $A$ be an $n \times n$ matrix with $A \neq \mathbf{0}$, for which a positive integer $k$ exists such that $A^k = \mathbf{0}$ and $A^{\ell} \neq \mathbf{0}$ for $1 \le \ell \le k - 1$. The matrices $A$ that satisfy this condition are called NILPOTENT matrices. The least positive integer $k$ for which $A^k = \mathbf{0}$ is called the ORDER OF NILPOTENCY. For example, $A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ is nilpotent of order 2, since $A \neq \mathbf{0}$ and $A^2 = \mathbf{0}$.

4. Let $A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$. Then $A^2 = A$. The matrices that satisfy the condition $A^2 = A$ are called IDEMPOTENT matrices.
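The defining conditions above (symmetric, skew-symmetric, orthogonal, nilpotent, idempotent) are simple equality checks. The sketch below uses made-up matrices purely for illustration.

```python
import numpy as np

S = np.array([[1, 2], [2, 3]])          # symmetric: S^t = S
K = np.array([[0, 1], [-1, 0]])         # skew-symmetric: K^t = -K
t = np.pi / 4
Q = np.array([[np.cos(t), -np.sin(t)],  # rotation matrix: Q Q^t = I
              [np.sin(t),  np.cos(t)]])
N = np.array([[0, 1], [0, 0]])          # nilpotent: N^2 = 0
P = np.array([[1, 0], [0, 0]])          # idempotent: P^2 = P

print(np.array_equal(S.T, S))                   # True
print(np.array_equal(K.T, -K))                  # True
print(np.allclose(Q @ Q.T, np.eye(2)))          # True (orthogonal)
print(np.array_equal(N @ N, np.zeros((2, 2))))  # True (nilpotent)
print(np.array_equal(P @ P, P))                 # True (idempotent)
```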

EXERCISE 1.3.3
1. Show that for any square matrix $A$, $S = \tfrac{1}{2}(A + A^t)$ is symmetric, $T = \tfrac{1}{2}(A - A^t)$ is skew-symmetric, and $A = S + T$.
2. Show that the product of two lower triangular matrices is a lower triangular matrix. A similar statement holds for upper triangular matrices.
3. Let $A$ and $B$ be symmetric matrices. Show that $AB$ is symmetric if and only if $AB = BA$.
4. Show that the diagonal entries of a skew-symmetric matrix are zero.
5. Let $A$, $B$ be skew-symmetric matrices with $AB = BA$. Is the matrix $AB$ skew-symmetric?
6. Let $A$ be a symmetric matrix of order $n$ with $A^2 = \mathbf{0}$. Is it necessarily true that $A = \mathbf{0}$?
7. Let $A$ be a nilpotent matrix. Show that there exists a matrix $B$ such that $B(I + A) = I = (I + A)B$.

DEFINITION 1.3.4 A matrix obtained by deleting some of the rows and/or columns of a matrix is said to be a submatrix of the given matrix. For example, if a few submatrices of are

But the matrices

and

are not submatrices of

Let $A$ be an $n \times m$ matrix and $B$ be an $m \times p$ matrix. Suppose $r < m$. Then, we can decompose the matrices $A$ and $B$ as $A = [P \;\; Q]$ and $B = \begin{bmatrix} H \\ K \end{bmatrix}$, where $P$ has order $n \times r$ and $H$ has order $r \times p$. That is, the matrices $P$ and $Q$ are submatrices of $A$; $P$ consists of the first $r$ columns of $A$ and $Q$ consists of the last $m - r$ columns of $A$. Similarly, $H$ and $K$ are submatrices of $B$; $H$ consists of the first $r$ rows of $B$ and $K$ consists of the last $m - r$ rows of $B$. We now prove the following important theorem.

THEOREM 1.3.5 Let $A = [P \;\; Q]$ and $B = \begin{bmatrix} H \\ K \end{bmatrix}$ be defined as above. Then $AB = PH + QK$.

Proof. First note that the matrices $PH$ and $QK$ are each of order $n \times p$. The matrix products $PH$ and $QK$ are valid as the orders of the matrices $P$, $H$, $Q$ and $K$ are respectively $n \times r$, $r \times p$, $n \times (m - r)$ and $(m - r) \times p$. Let $P = [P_{ij}]$, $Q = [Q_{ij}]$, $H = [H_{ij}]$ and $K = [K_{ij}]$. Then, for $1 \le i \le n$ and $1 \le j \le p$, we have
$$(AB)_{ij} = \sum_{k=1}^{m} a_{ik} b_{kj} = \sum_{k=1}^{r} a_{ik} b_{kj} + \sum_{k=r+1}^{m} a_{ik} b_{kj} = (PH)_{ij} + (QK)_{ij} = (PH + QK)_{ij}. \qquad \square$$

Theorem 1.3.5 is very useful due to the following reasons:
1. The orders of the matrices $P$, $Q$, $H$ and $K$ are smaller than those of $A$ or $B$.
2. It may be possible to block the matrix in such a way that a few blocks are either identity matrices or zero matrices. In this case, it may be easy to handle the matrix product using the block form.
3. When we want to prove results using induction, we may assume the result for $r \times r$ submatrices and then look at $(r + 1) \times (r + 1)$ submatrices, etc.

For example, if

and

Then

If

then

can be decomposed as follows:

or

or

and so on.

Suppose $A = \begin{bmatrix} P & Q \\ R & S \end{bmatrix}$ and $B = \begin{bmatrix} E & F \\ G & H \end{bmatrix}$. Then the matrices $P, Q, R, S$ and $E, F, G, H$ are called the blocks of the matrices $A$ and $B$, respectively.

Even if $A + B$ is defined, the orders of $P$ and $E$ may not be the same and hence, we may not be able to add $A$ and $B$ in the block form. But, if $A + B$ and $P + E$ are defined, then $A + B = \begin{bmatrix} P + E & Q + F \\ R + G & S + H \end{bmatrix}$.

Similarly, if the product $AB$ is defined, the product $PE$ need not be defined. Therefore, we can talk of the matrix product $AB$ as a block product of matrices, if both the products $AB$ and $PE$ are defined. And in this case, we have $AB = \begin{bmatrix} PE + QG & PF + QH \\ RE + SG & RF + SH \end{bmatrix}$.

That is, once a partition of $A$ is fixed, the partition of $B$ has to be properly chosen for purposes of block addition or multiplication.
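Theorem 1.3.5 can be verified numerically: splitting $A$ by columns and $B$ by the matching rows, the full product equals the sum of the block products. The sizes and the split point $r$ in this sketch are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p, r = 3, 4, 2, 2          # A is n x m, B is m x p, split after column r of A
A = rng.integers(-3, 4, size=(n, m))
B = rng.integers(-3, 4, size=(m, p))

P, Q = A[:, :r], A[:, r:]        # P: first r columns of A, Q: last m - r columns
H, K = B[:r, :], B[r:, :]        # H: first r rows of B,   K: last m - r rows

print(np.array_equal(A @ B, P @ H + Q @ K))   # True: AB = PH + QK
```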

Here the entries of the matrix are complex numbers. All the definitions still hold. One just needs to look at the following additional definitions.

DEFINITION 1.4.1 (Conjugate Transpose of a Matrix)
1. Let $A$ be an $m \times n$ matrix over $\mathbb{C}$ with $A = [a_{ij}]$. Then the Conjugate of $A$, denoted by $\overline{A}$, is the matrix $B = [b_{ij}]$ with $b_{ij} = \overline{a_{ij}}$. For example, let $A = \begin{bmatrix} 1 & 4 + 3i \\ 0 & 1 - i \end{bmatrix}$. Then $\overline{A} = \begin{bmatrix} 1 & 4 - 3i \\ 0 & 1 + i \end{bmatrix}$.
2. Let $A$ be an $m \times n$ matrix over $\mathbb{C}$ with $A = [a_{ij}]$. Then the Conjugate Transpose of $A$, denoted by $A^*$, is the matrix $B = [b_{ij}]$ with $b_{ij} = \overline{a_{ji}}$. For example, let $A = \begin{bmatrix} 1 & 4 + 3i \\ 0 & 1 - i \end{bmatrix}$. Then $A^* = \begin{bmatrix} 1 & 0 \\ 4 - 3i & 1 + i \end{bmatrix}$.
3. A square matrix $A$ over $\mathbb{C}$ is called Hermitian if $A^* = A$.
4. A square matrix $A$ over $\mathbb{C}$ is called skew-Hermitian if $A^* = -A$.
5. A square matrix $A$ over $\mathbb{C}$ is called unitary if $AA^* = A^*A = I$.
6. A square matrix $A$ over $\mathbb{C}$ is called Normal if $AA^* = A^*A$.

Remark 1.4.2 If $A = [a_{ij}]$ with $a_{ij} \in \mathbb{R}$, then $A^* = A^t$.
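A short computational sketch of Definition 1.4.1 and Remark 1.4.2, with a made-up complex matrix:

```python
import numpy as np

A = np.array([[2, 1 + 1j],
              [1 - 1j, 3]])

A_conj = np.conj(A)        # the conjugate of A
A_star = np.conj(A).T      # the conjugate transpose A*

print(np.array_equal(A_star, A))             # True: A is Hermitian
print(np.allclose(A @ A_star, A_star @ A))   # True: Hermitian matrices are normal

# A real matrix has A* = A^t (Remark 1.4.2).
B = np.array([[1.0, 2.0], [3.0, 4.0]])
print(np.array_equal(np.conj(B).T, B.T))     # True
```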

Subsections Introduction A Solution Method Row Operations and Equivalent Systems Gauss Elimination Method Row Reduced Echelon Form of a Matrix Gauss-Jordan Elimination Elementary Matrices Rank of a Matrix Existence of Solution of Example Main Theorem Equivalent conditions for Invertibility Inverse and the Gauss-Jordan Method Determinant Adjoint of a Matrix Cramer's Rule Miscellaneous Exercises

Let us look at some examples of linear systems. 1. Suppose 1. If 2. If 1. 2. Consider the system then the system has a UNIQUE SOLUTION and then the system has NO SOLUTION. then the system has INFINITE NUMBER OF SOLUTIONS, namely all equations in unknowns. If one of the coefficients, Thus for the system or is non-zero, then this linear

2. We now consider a system with Consider the equation equation represents a line in

the set of solutions is given by the points of intersection of the two lines. There are three cases to be considered. Each case is illustrated by an example. 1. UNIQUE SOLUTION and The unique solution is Observe that in this case, 2. INFINITE NUMBER OF SOLUTIONS and The set of solutions is with represent the same line. Observe that in this case, 3. NO SOLUTION and no point of intersection. Observe that in this case, 3. As a last example, consider A linear equation arbitrary. In other words, both the equations and The equations represent a pair of parallel lines and hence there is but provided As in the

equations in unknowns. represent a plane in

case of equations in unknowns, we have to look at the points of intersection of the given three planes. Here again, we have three cases. The three cases are illustrated by examples. 1. UNIQUE SOLUTION Consider the system and The unique

solution to this system is 2. INFINITE NUMBER OF SOLUTIONS Consider the system solutions to this system is
THE THREE PLANES INTERSECT ON A LINE. 3. NO SOLUTION

i.e. THE THREE PLANES INTERSECT AT A POINT. and The set of with arbitrary:

The system

and

has no solution. In this

case, we get three parallel lines as intersections of the above planes taken two at a time. The readers are advised to supply the proof.

DEFINITION 2.1.1 (Linear System) A linear system of $m$ equations in $n$ unknowns $x_1, x_2, \ldots, x_n$ is a set of equations of the form
$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\
&\;\;\vdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &= b_m
\end{aligned} \qquad (2.1.1)$$
where $a_{ij}, b_i \in \mathbb{R}$ for $1 \le i \le m$ and $1 \le j \le n$. Linear System (2.1.1) is called HOMOGENEOUS if $b_1 = b_2 = \cdots = b_m = 0$ and NON-HOMOGENEOUS otherwise. We rewrite the above equations in the form $A\mathbf{x} = \mathbf{b}$, where
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}, \quad \text{and} \quad \mathbf{b} = \begin{bmatrix} b_1 \\ \vdots \\ b_m \end{bmatrix}.$$
The matrix $A$ is called the COEFFICIENT matrix and the block matrix $[A \;\; \mathbf{b}]$ is the AUGMENTED matrix of the linear system (2.1.1).

Remark 2.1.2 Observe that the $i^{\text{th}}$ row of the augmented matrix $[A \;\; \mathbf{b}]$ represents the $i^{\text{th}}$ equation, and the $j^{\text{th}}$ column of the coefficient matrix $A$ corresponds to coefficients of the variable $x_j$. That is, for $1 \le i \le m$ and $1 \le j \le n$, the entry $a_{ij}$ of the coefficient matrix $A$ corresponds to the $i^{\text{th}}$ equation and the $j^{\text{th}}$ variable $x_j$. For a system of linear equations $A\mathbf{x} = \mathbf{b}$, the system $A\mathbf{x} = \mathbf{0}$ is called the ASSOCIATED HOMOGENEOUS SYSTEM.

DEFINITION 2.1.3 (Solution of a Linear System) A solution of the linear system $A\mathbf{x} = \mathbf{b}$ is a column vector $\mathbf{y}$ with entries $y_1, y_2, \ldots, y_n$ such that the linear system (2.1.1) is satisfied by substituting $y_i$ in place of $x_i$. That is, if $\mathbf{y} = (y_1, \ldots, y_n)^t$ then $A\mathbf{y} = \mathbf{b}$ holds.

Note: The zero $n$-tuple $\mathbf{x} = \mathbf{0}$ is always a solution of the system $A\mathbf{x} = \mathbf{0}$, and is called the TRIVIAL solution. A non-zero $n$-tuple $\mathbf{x}$ satisfying $A\mathbf{x} = \mathbf{0}$ is called a NON-TRIVIAL solution.

EXAMPLE 2.1.4 Let us solve the linear system

and

Solution: 1. The above linear system and the linear system (2.1.2)

have the same set of solutions. (why?) 2. Using the equation, we eliminate from and equation to get the linear system

(2.1.3) This system and the system (2.1.2) has the same set of solution. (why?) 3. Using the equation, we eliminate from the last equation of system (2.1.3) to get the system

(2.1.4) which has the same set of solution as the system (2.1.3). (why?) 4. The system (2.1.4) and system

(2.1.5) has the same set of solution. (why?) 5. Now, solution is implies and Or in terms of a vector, the set of

DEFINITION 2.2.1 (Elementary Operations) The following operations 1, 2 and 3 are called elementary operations.
1. interchange of two equations, say ``interchange the $i^{\text{th}}$ and $j^{\text{th}}$ equations";
(compare the system (2.1.2) with the original system.)
2. multiply a non-zero constant throughout an equation, say ``multiply the $k^{\text{th}}$ equation by $c \neq 0$";
(compare the system (2.1.5) and the system (2.1.4).)
3. replace an equation by itself plus a constant multiple of another equation, say ``replace the $k^{\text{th}}$ equation by the $k^{\text{th}}$ equation plus $c$ times the $j^{\text{th}}$ equation".
(compare the system (2.1.3) with (2.1.2) or the system (2.1.4) with (2.1.3).)

Remark 2.2.2
1. In Example 2.1.4, observe that the elementary operations helped us in getting a linear system (2.1.5), which was easily solvable.
2. Note that at Step 1, if we interchange the first and the second equation, we get back to the linear system from which we had started. This means the operation at Step 1 has an inverse operation. In other words, the INVERSE OPERATION sends us back to the step where we had precisely started. So, in Example 2.1.4, the application of a finite number of elementary operations helped us to obtain a simpler system whose solution can be obtained directly. That is, after applying a finite number of elementary operations, a simpler linear system is obtained which can be easily solved. Note that the three elementary operations defined above have corresponding INVERSE operations, namely,
1. ``interchange the $i^{\text{th}}$ and $j^{\text{th}}$ equations";
2. ``divide the $k^{\text{th}}$ equation by $c$";
3. ``replace the $k^{\text{th}}$ equation by the $k^{\text{th}}$ equation minus $c$ times the $j^{\text{th}}$ equation".

It will be a useful exercise for the reader to IDENTIFY THE INVERSE OPERATIONS at each step in Example 2.1.4. DEFINITION 2.2.3 (Equivalent Linear Systems) Two linear systems are said to be equivalent if one can be

obtained from the other by a finite number of elementary operations. The linear systems at each step in Example 2.1.4 are equivalent to each other and also to the original linear system. LEMMA 2.2.4 Let be the linear system obtained from the linear system by a single and have the same set of solutions. elementary operation. Then the linear systems Proof . We prove the result for the elementary operation ``the times the equation is replaced by equation plus

equation." The reader is advised to prove the result for other elementary operations. and vary only in the Then substituting for equation. Let 's in the and be a

In this case, the systems solution of the linear system equations, we get

's in place of

Therefore, (2.2.1)

But then the

equation of the linear system

is (2.2.2)

Therefore, using Equation (2.2.1), Use a similar argument to show that if also a solution of the linear system

is also a solution for the

Equation (2.2.2). then it is

is a solution of the linear system

Hence, we have the proof in this case. height6pt width 6pt depth 0pt Lemma 2.2.4 is now used as an induction step to prove the main result of this section (Theorem 2.2.5). THEOREM 2.2.5 Two equivalent systems have the same set of solutions. Proof . Let be the number of elementary operations performed on theorem by induction on If suppose Lemma 2.2.4 answers the question. If to get We prove the Now, step from the

assume that the theorem is true for

Apply the Lemma 2.2.4 again at the ``last step" (that is, at the

step) to get the required result using induction. $\square$

Let us formalise the above section which led to Theorem 2.2.5. For solving a linear system of equations, we applied elementary operations to equations. It is observed that in performing the elementary operations, the calculations were made on the COEFFICIENTS (numbers). The variables $x_1, x_2, \ldots, x_n$ and the sign of equality (that is, the symbol $=$) are not disturbed. Therefore, in place of looking at the system of equations as a whole, we just need to work with the coefficients. These coefficients, when arranged in a rectangular array, give us the augmented matrix $[A \;\; \mathbf{b}]$.

DEFINITION 2.2.6 (Elementary Row Operations) The elementary row operations are defined as:
1. interchange of two rows, say ``interchange the $i^{\text{th}}$ and $j^{\text{th}}$ rows", denoted $R_{ij}$;
2. multiply a non-zero constant $c$ throughout a row, say ``multiply the $k^{\text{th}}$ row by $c \neq 0$", denoted $R_k(c)$;
3. replace a row by itself plus a constant multiple of another row, say ``replace the $k^{\text{th}}$ row by the $k^{\text{th}}$ row plus $c$ times the $j^{\text{th}}$ row", denoted $R_{kj}(c)$.

EXERCISE 2.2.7 Find the INVERSE row operations corresponding to the elementary row operations that have been defined just above. DEFINITION 2.2.8 (Row Equivalent Matrices) Two matrices are said to be row-equivalent if one can be obtained from the other by a finite number of elementary row operations. EXAMPLE 2.2.9 The three matrices given below are row equivalent.

Whereas the matrix

is not row equivalent to the matrix

DEFINITION 2.2.10 (Forward/Gauss Elimination Method) Gaussian elimination is a method of solving a linear system (consisting of equations in unknowns) by bringing the augmented matrix

to an upper triangular form

This elimination process is also called the forward elimination method. The following examples illustrate the Gauss elimination procedure. EXAMPLE 2.2.11 Solve the linear system by Gauss elimination method.

Solution: In this case, the augmented matrix is steps. 1. Interchange and equation (or ).

The method proceeds along the following

2. Divide the

equation by

(or

).

3. Add

times the

equation to the

equation (or

).

4. Add

times the

equation to the

equation (or

).

5. Multiply the

equation by

(or

).

The last equation gives Hence the set of solutions is

the second equation now gives

Finally the first equation gives

A UNIQUE SOLUTION.
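The elimination steps used in the example above can be collected into a small routine. The following is a minimal sketch of forward elimination; it assumes a non-zero pivot can always be found by a row interchange, and the system in it is made up (it is not the system of Example 2.2.11).

```python
import numpy as np

def forward_eliminate(aug):
    """Bring an augmented matrix [A | b] to upper triangular form (a sketch)."""
    M = aug.astype(float).copy()
    m, _ = M.shape
    for k in range(m):
        # Operation 1: interchange rows, if needed, to get a non-zero pivot.
        piv = k + np.argmax(np.abs(M[k:, k]))
        if np.isclose(M[piv, k], 0):
            continue
        M[[k, piv]] = M[[piv, k]]
        M[k] = M[k] / M[k, k]              # Operation 2: scale the pivot row.
        for i in range(k + 1, m):          # Operation 3: eliminate below the pivot.
            M[i] = M[i] - M[i, k] * M[k]
    return M

# Hypothetical system: x + y + z = 6, 2y + 5z = -4, 2x + 5y - z = 27.
aug = np.array([[1, 1, 1, 6],
                [0, 2, 5, -4],
                [2, 5, -1, 27]])
print(forward_eliminate(aug))   # upper triangular form; back substitution finishes
```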

EXAMPLE 2.2.12 Solve the linear system by Gauss elimination method.

Solution: In this case, the augmented matrix is

and the method proceeds as follows:

1. Add

times the first equation to the second equation.

2. Add

times the first equation to the third equation.

3. Add

times the second equation to the third equation

Thus, the set of solutions is words, the system has INFINITE NUMBER OF SOLUTIONS. EXAMPLE 2.2.13 Solve the linear system by Gauss elimination method.

with

arbitrary. In other

Solution: In this case, the augmented matrix is

and the method proceeds as follows:

1. Add

times the first equation to the second equation.

2. Add

times the first equation to the third equation.

3. Add

times the second equation to the third equation

The third equation in the last step is

This can never hold for any value of

Hence, the system has NO SOLUTION. one needs to apply only the elementary row

Remark 2.2.14 Note that to solve a linear system, operations to the augmented matrix

DEFINITION 2.3.1 (Row Reduced Form of a Matrix) A matrix

is said to be in the row reduced form if

1. THE FIRST NON-ZERO ENTRY IN EACH ROW OF MATHEND000# IS MATHEND000# 2. THE COLUMN CONTAINING THIS MATHEND000# HAS ALL ITS OTHER ENTRIES ZERO. A matrix in the row reduced form is also called a ROW REDUCED MATRIX. EXAMPLE 2.3.2 1. One of the most important examples of a row reduced matrix is the that the entry of the identity matrix is identity matrix, Recall

is usually referred to as the Kronecker delta function.

2. The matrices

and

are also in row reduced form.

3. The matrix

is not in the row reduced form. (why?)

DEFINITION 2.3.3 (Leading Term, Leading Column) For a row-reduced matrix, the first non-zero entry of any row is called a LEADING TERM. The columns containing the leading terms are called the LEADING COLUMNS. in variables and DEFINITION 2.3.4 (Basic, Free Variables) Consider the linear system equations. Let be the row-reduced matrix obtained by applying the Gauss elimination method to the augmented matrix Then the variables corresponding to the leading columns in the first

columns of variables.

are called the BASIC variables. The variables which are not basic are called FREE

The free variables are called so as they can be assigned arbitrary values and the value of the basic variables can then be written in terms of the free variables. Observation: In Example 2.2.12, the solution set was given by

That is, we had two basic variables,

and

and

as a free variable.

Remark 2.3.5 It is very important to observe that if there are non-zero rows in the row-reduced form of the matrix then there will be leading terms. That is, there will be leading columns. Therefore, IF THERE
ARE MATHEND000# LEADING TERMS AND MATHEND000# VARIABLES, THEN THERE WILL BE MATHEND000# BASIC VARIABLES AND MATHEND000# FREE VARIABLES.

We now start with Step 5 of Example 2.2.11 and apply the elementary operations once again. But this time, we start with the 1. Add row. ). times the third equation to the second equation (or

2. Add

times the third equation to the first equation (or

).

3. From the above matrix, we directly have the set of solution as DEFINITION 2.3.6 (Row Reduced Echelon Form of a Matrix) A matrix reduced echelon form if is said to be in the row

1. is already in the row reduced form; 2. The rows consisting of all zeros comes below all non-zero rows; and 3. the leading terms appear from left to right in successive rows. That is, for leading column of the row. Then

let

be the

EXAMPLE 2.3.7 Suppose

and

are in row reduced form.

Then the corresponding matrices in the row reduced echelon form are respectively,

and

DEFINITION 2.3.8 (Row Reduced Echelon Matrix) A matrix which is in the row reduced echelon form is also called a row reduced echelon matrix. DEFINITION 2.3.9 (Back Substitution/Gauss-Jordan Method) The procedure to get to Step II of Example 2.2.11 from Step 5 of Example 2.2.11 is called the back substitution. The elimination process applied to obtain the row reduced echelon form of the augmented matrix is called the Gauss-Jordan elimination. That is, the Gauss-Jordan elimination method consists of both the forward elimination and the backward substitution. Method to get the row-reduced echelon form of a given matrix Let be an matrix. Then the following method is used to obtain the row-reduced echelon form the matrix Step 1: Consider the first column of the matrix If all the entries in the first column are zero, move to the second column. Else, find a row, say the first row with the whole row by below this row, which contains a non-zero entry in the first column. Now, interchange row. Suppose the non-zero entry in the -entry of the new matrix is -position is Now, use the Divide the to make all the entries

so that the

equal to

Step 2: If all entries in the first column after the first step are zero, consider the right submatrix of the matrix obtained in step 1 and proceed as in step 1. Else, forget the first row and first column. Start with the lower matrix obtained in the first step and proceed as in step 1. Step 3: Keep repeating this process till we reach a stage where all the entries below a particular row, Then has the following form: say , are zero. Suppose at this stage we have obtained a matrix 1. THE FIRST NON-ZERO ENTRY IN EACH ROW of is These 's are the leading terms of and the columns containing these leading terms are the leading columns. 2. THE ENTRIES OF MATHEND000# BELOW THE LEADING TERM ARE ALL ZERO. Step 4: Now use the leading term in the to zero. Step 5: Next, use the leading term in the row to make all entries in the leading column equal leading submatrix of the

row to make all entries in the

column equal to zero and continue till we come to the first leading term or column.

The final matrix is the row-reduced echelon form of the matrix
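The method just described can be expressed as a short routine. This is a minimal sketch, not the notes' own procedure verbatim, that returns the row reduced echelon form of a matrix.

```python
import numpy as np

def rref(A, tol=1e-12):
    """Return the row reduced echelon form of A (a minimal sketch)."""
    M = A.astype(float).copy()
    m, n = M.shape
    row = 0
    for col in range(n):
        if row >= m:
            break
        # Find a row at or below `row` with a non-zero entry in this column.
        piv = row + np.argmax(np.abs(M[row:, col]))
        if abs(M[piv, col]) < tol:
            continue                     # move to the next column
        M[[row, piv]] = M[[piv, row]]    # interchange rows
        M[row] = M[row] / M[row, col]    # make the leading term 1
        for i in range(m):               # make every other entry in the column 0
            if i != row:
                M[i] = M[i] - M[i, col] * M[row]
        row += 1
    return M

print(rref(np.array([[0, 2, 4],
                     [1, 1, 1],
                     [2, 4, 6]])))       # [[1 0 -1], [0 1 2], [0 0 0]]
```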


RIGHT. Hence, if

Remark 2.3.10 Note that the row reduction involves only row operations and proceeds from LEFT TO is a matrix consisting of first columns of a matrix then the row reduced form of will be the first columns of the row reduced form of

The proof of the following theorem is beyond the scope of this book and is omitted. THEOREM 2.3.11 The row reduced echelon form of a matrix is unique. EXERCISE 2.3.12 1. Solve the following linear system. 1. 2. 3. 4. 5. and and and and and

2. Find the row-reduced echelon form of the following matrices.

DEFINITION 2.3.13 A square matrix of order is called an elementary matrix if it is obtained by applying exactly one elementary row operation to the identity matrix, Remark 2.3.14 There are three types of elementary matrices. 1. which is obtained by the application of the elementary row operation to the identity matrix,

Thus, the 2.

entry of

is to the identity

which is obtained by the application of the elementary row operation

matrix,

The

entry of

is

3.

which is obtained by the application of the elementary row operation

to the identity

matrix,

The

entry of

is

In particular, if we start with a

identity matrix

, then

EXAMPLE 2.3.15

1. Let

Then

That is, interchanging the two rows of the matrix is same as multiplying on the left by the corresponding elementary matrix. In other words, we see that the left multiplication of elementary matrices to a matrix results in elementary row operations. 2. Consider the augmented matrix same as the matrix product Then the result of the steps given below is
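The observation that a row operation equals left multiplication by the corresponding elementary matrix, and a column operation equals right multiplication, can be checked directly. The matrix and the chosen operations below are made up for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Elementary matrix for "replace row 3 by row 3 plus (-7) times row 1":
# apply the same row operation to the identity matrix.
E = np.eye(3)
E[2] = E[2] + (-7) * E[0]

# The same row operation applied directly to A.
B = A.astype(float).copy()
B[2] = B[2] + (-7) * B[0]

print(np.array_equal(E @ A, B))   # True: left multiplication = row operation

# Right multiplication by an elementary matrix performs a column operation.
F = np.eye(3)[:, [0, 2, 1]]       # interchange columns 2 and 3 of the identity
print(A @ F)                      # A with its second and third columns interchanged
```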

Now, consider an

matrix

and an elementary matrix

of order

Then multiplying by

on the

right to corresponds to applying column transformation on the matrix matrix, there is a corresponding column transformation. We summarize:

Therefore, for each elementary

DEFINITION 2.3.16 The column transformations obtained by right multiplication of elementary matrices are called elementary column operations.

EXAMPLE 2.3.17 Let

and consider the elementary column operation

which

interchanges the second and the third column of

Then

EXERCISE 2.3.18 1. Let is, be an elementary row operation and let is the matrix obtained from be the corresponding elementary matrix. That Show that

by applying the elementary row operation

2. Show that the Gauss elimination method is same as multiplying by a series of elementary matrices on the left to the augmented matrix. Does the Gauss-Jordan method also corresponds to multiplying by elementary matrices on the left? Give reasons. 3. Let and be two matrices. Then prove that the two matrices are row-equivalent if and only if where is product of elementary matrices. When is this unique? 4. Show that every elementary matrix is invertible. Is the inverse of an elementary matrix, also an elementary matrix?

In previous sections, we solved linear systems using Gauss elimination method or the Gauss-Jordan method. In the examples considered, we have encountered three possibilities, namely 1. existence of a unique solution, 2. existence of an infinite number of solutions, and 3. no solution. Based on the above possibilities, we have the following definition. DEFINITION 2.4.1 (Consistent, Inconsistent) A linear system is called CONSISTENT if it admits a solution and is called INCONSISTENT if it admits no solution. The question arises, as to whether there are conditions under which the linear system is consistent. The answer to this question is in the affirmative. To proceed further, we need a few definitions and remarks. Recall that the row reduced echelon form of a matrix is unique and therefore, the number of non-zero rows is a unique number. Also, note that the number of non-zero rows in either the row reduced form or the row reduced echelon form of a matrix are same. DEFINITION 2.4.2 (Row rank of a Matrix) The number of non-zero rows in the row reduced form of a matrix is called the row-rank of the matrix. By the very definition, it is clear that row-equivalent matrices have the same row-rank. For a matrix write ` EXAMPLE 2.4.3 ' to denote the row-rank of we

1. Determine the row-rank of Solution: To determine the row-rank of 1. we proceed as follows.

2.

3.

4. The last matrix in Step 1d is the row reduced form of which has non-zero rows. Thus, This result can also be easily deduced from the last matrix in Step 1b. 2. Determine the row-rank of Solution: Here we have 1.

2. From the last matrix in Step 2b, we deduce Remark 2.4.4 Let be a linear system with equations and unknowns. Then the row-reduced echelon form of agrees with the first columns of and hence

The reader is advised to supply a proof. Remark 2.4.5 Consider a matrix (see Definition 2.3.16) to the matrix After application of a finite number of elementary column operations we can have a matrix, say which has the following properties:

1. The first nonzero entry in each column is 2. A column containing only 0 's comes after all columns with at least one non-zero entry. 3. The first non-zero entry (the leading term) in each non-zero column moves down in successive columns. Therefore, we can define column-rank of that as the number of non-zero columns in It will be proved later

Thus we are led to the following definition.

DEFINITION 2.4.6 The number of non-zero rows in the row reduced form of a matrix $A$ is called the rank of $A$, denoted $\operatorname{rank}(A)$.

THEOREM 2.4.7 Let $A$ be a matrix of rank $r$. Then there exist elementary matrices $E_1, E_2, \ldots, E_s$ and $F_1, F_2, \ldots, F_{\ell}$ such that
$$E_1 E_2 \cdots E_s \, A \, F_1 F_2 \cdots F_{\ell} = \begin{bmatrix} I_r & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix}.$$

Proof . Let be the row reduced echelon matrix obtained by applying elementary row operations to the given matrix As the matrix will have the first rows as the non-zero rows. So by Remark 2.3.5, will have in the will have leading columns, say Note that, for the column

row and zero elsewhere. Let be the matrix obtained from Then the matrix block of by successively and column of for can be written in the form is an identity matrix, the block This gives the required result.

We now apply column operations to the matrix interchanging the where

is a matrix of appropriate size. As the

can be made the zero matrix by application of column operations to height6pt width 6pt depth 0pt COROLLARY 2.4.8 Let be a matrix of rank

Then the system of equations

has

infinite number of solutions. Proof . By Theorem 2.4.7, there exist elementary matrices Define and . Then the matrix such that

as the elementary martices

's are being multiplied on the left of the matrix . Then check that for

Let .

be the columns of the matrix

Hence, we can use the

's which are non-zero (Use Exercise 1.2.17.2) to generate infinite number of

solutions. height6pt width 6pt depth 0pt EXERCISE 2.4.9 1. Determine the ranks of the coefficient and the augmented matrices that appear in Part 1 and Part 2 of Exercise 2.3.12. 2. Let be an matrix with Then prove that is row-equivalent to 3. If and are invertible matrices and and and 5. Let and 1. if 2. if 6. Let be two matrices. Show that is defined, then is defined, then and Then show that there exists invertible matrices and prove that the matrix 7. Let 8. Let be an and and have rank is an matrix of rank is a matrix of size invertible matrix. Then can be written as and is a matrix of size Then show that then
and such that

is defined then show that where

4. Find matrices

which are product of elementary matrices such that

be any matrix of rank

such that Also,

where both

and

be two matrices such that for some matrix for some matrix
and

is defined and is defined and

Similarly, if

[Hint: Choose non-singular matrices

Define

9. If matrices

and

are invertible and the involved partitioned products are defined, then show that

10. Suppose

is the inverse of a matrix

Partition

and

as follows:

If

is invertible and

then show that

and

We try to understand the properties of the set of solutions of a linear system through an example, using the Gauss-Jordan method. Based on this observation, we arrive at the existence and uniqueness results for the linear system This example is more or less a motivation.

Subsections Example Main Theorem Equivalent conditions for Invertibility Inverse and the Gauss-Jordan Method

Consider a linear system with

which after the application of the Gauss-Jordan method reduces to a matrix

For this particular matrix Observations:

we want to see the set of solutions. We start with some observations.

1. The number of non-zero rows in

is

This number is also equal to the number of non-zero rows in and

2. The first non-zero entry in the non-zero rows appear in columns 3. Thus, the respective variables 4. The remaining variables, 5. We assign arbitrary constants Hence, we have the set of solutions as and and and are free variables. to the free variables

are the basic variables. and respectively.

where

and

are arbitrary.

Let

and

Then it can easily be verified that

and for

A similar idea is used in the proof of the next theorem and is omitted. The interested readers can read the proof in Appendix 14.1.

THEOREM 2.5.1 [Existence and Non-existence] Consider a linear system matrix, and are vectors with orders and respectively. Suppose

where

is a and

Then exactly one of the following statement holds: 1. if the set of solutions of the linear system is an infinite set and has the form

where 2. if 3. If

are

vectors satisfying

and vector

for satisfying

the solution set of the linear system has a unique the linear system has no solution. be an matrix and consider the linear system is consistent if and only if

Remark 2.5.2 Let

Then by Theorem 2.5.1,

we see that the linear system

The following corollary of Theorem 2.5.1 is a very important result about the homogeneous linear system COROLLARY 2.5.3 Let solution if and only if rank Proof . Suppose the system assumption, we need to show that has a non-trivial solution, That is, and So, Under this be an matrix. Then the homogeneous system has a non-trivial

On the contrary, assume that rank

Also implies that is a solution of the linear system solution under the condition (see Theorem 2.5.1), we get that was a given non-trivial solution. Then

Hence, by the uniqueness of the A contradiction to the fact

Now, let us assume that rank

So, by Theorem 2.5.1, the solution set of the linear system has infinite number of vectors satisfying From this infinite set, we can choose any vector that is different from Thus, we have a solution That is, we have obtained a non-trivial solution height6pt width 6pt depth 0pt

We now state another important result whose proof is immediate from Theorem 2.5.1 and Corollary 2.5.3. PROPOSITION 2.5.4 Consider the linear system hold together. 1. The system 2. The system Remark 2.5.5 1. Suppose 2. If are two solutions of then of Then is also a solution of That is, differ by a for any has a unique solution for every has a non-trivial solution. Then the two statements given below cannot

are two solutions of for some solution

is a solution of the system That is, any two solutions of

solution of the associated homogeneous system In conclusion, for the set of solutions of the system and is a solution is of the form, where

is a particular solution of EXERCISE 2.5.6 1. For what values of and

-the following systems have

no solution,

a unique solution and

infinite number of solutions. 1. 2. 3. 4. 5. 6. 2. Find the condition on so that the linear system

is consistent. 3. Let be an

matrix. If the system

has a non trivial solution then show that

also has a non trivial solution.

DEFINITION 2.5.7 A square matrix THEOREM 2.5.8 For a square matrix 1. 2. 3. 4. Proof . 1

or order of order

is said to be of full rank if the following statements are equivalent.

is invertible. is of full rank. is row-equivalent to the identity matrix. is a product of elementary matrices. 2 Then there exists an invertible matrix where is an (a product of elementary is invertible, let

Let if possible rank matrices) such that

matrix. Since

where

is an

matrix. Then

(2.5.1)

Thus the matrix

has

rows as zero rows. Hence, is of full rank.

cannot be invertible. A contradiction to

being

a product of invertible matrices. Thus, 2 3

Suppose is of full rank. This implies, the row reduced echelon form of has all non-zero rows. But as many columns as rows and therefore, the last row of the row reduced echelon form of will be Hence, the row reduced echelon form of is the identity matrix. 3 Since 4 is row-equivalent to the identity matrix there exist elementary matrices That is, 4 1 is product of elementary matrices.

has

such that

Suppose

where the

's are elementary matrices. We know that elementary matrices are

invertible and product of invertible matrices is also invertible, we get the required result. height6pt width 6pt depth 0pt The ideas of Theorem 2.5.8 will be used in the next subsection to find the inverse of an invertible matrix. The idea used in the proof of the first part also gives the following important Theorem. We repeat the proof for the sake of clarity. THEOREM 2.5.9 Let be a square matrix of order such that such that Then Then exists. exists.

1. Suppose there exists a matrix 2. Suppose there exists a matrix Proof . Suppose that Let if possible, rank matrices) such that

We will prove that the matrix

is of full rank. That is, (a product of elementary matrix. Then

Then there exists an invertible matrix Let where is an

(2.5.2)

Thus the matrix

has

rows as zero rows. So,

cannot be invertible. A contradiction to

being a

product of invertible matrices. Thus, is an invertible matrix. That is, Using the first part, it is clear that the matrix

That is, as well.

is of full rank. Hence, using Theorem 2.5.8,

in the second part, is invertible. Hence

Thus,

is invertible as well. height6pt width 6pt depth 0pt of order

Remark 2.5.10 This theorem implies the following: ``if we want to show that a square matrix is invertible, it is enough to show the existence of 1. either a matrix 2. or a matrix such that such that of order

THEOREM 2.5.11 The following statements are equivalent for a square matrix 1. 2. is invertible. has only the trivial solution

3. Proof . 1 Since 2

has a solution

for every

is invertible, by Theorem 2.5.8

is of full rank. That is, for the linear system Hence, by Theorem 2.5.1 the system

the number has a unique

of unknowns is equal to the rank of the matrix solution 2 1

Let if possible be non-invertible. Then by Theorem 2.5.8, the matrix is not of full rank. Thus by Corollary 2.5.3, the linear system has infinite number of solutions. This contradicts the assumption has only the trivial solution that 1 Since 3 For 1 define for each and consider the linear system Define a matrix Then By That 3 is invertible, for every the system has a unique solution

assumption, this system has a solution is, the column of

is the solution of the system

Therefore, by Theorem 2.5.9, the matrix EXERCISE 2.5.12 1. Show that a triangular matrix 2. Let be a matrix and invertible? Give reasons. 3. Let be an matrix and and only if the matrix

is invertible. height6pt width 6pt depth 0pt

is invertible if and only if each diagonal entry of is non-zero. be a matrix having positive entries. Which of or is be an matrix. Prove that the matrix is invertible if

is invertible.

We first give a consequence of Theorem 2.5.8 and then use it to find the inverse of an invertible matrix.

COROLLARY 2.5.13 Let $A$ be an invertible matrix. Suppose that a sequence of elementary row-operations reduces $A$ to the identity matrix. Then the same sequence of elementary row-operations, when applied to the identity matrix, yields $A^{-1}$.

Proof. Let $A$ be a square matrix of order $n$. Also, let $E_1, E_2, \ldots, E_k$ be a sequence of elementary row operations such that $E_k \cdots E_2 E_1 A = I_n$. Then $E_k \cdots E_2 E_1 I_n = E_k \cdots E_2 E_1$. This implies $A^{-1} = E_k \cdots E_2 E_1$. $\square$

Summary: Let $A$ be an $n \times n$ matrix. Apply the Gauss-Jordan method to the matrix $[A \;\; I_n]$. Suppose the row reduced echelon form of the matrix $[A \;\; I_n]$ is $[B \;\; C]$. If $B = I_n$, then $A^{-1} = C$, or else $A$ is not invertible.
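The summary above translates into a few lines of code. This sketch row reduces $[A \;\; I_n]$ with row interchanges and reads off the inverse; the $3 \times 3$ matrix is a made-up example.

```python
import numpy as np

def inverse_by_gauss_jordan(A):
    """Row reduce [A | I]; if the left block becomes I, the right block is A^{-1}."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for col in range(n):
        piv = col + np.argmax(np.abs(M[col:, col]))
        if np.isclose(M[piv, col], 0):
            raise ValueError("matrix is not invertible")
        M[[col, piv]] = M[[piv, col]]    # interchange rows
        M[col] = M[col] / M[col, col]    # make the pivot 1
        for i in range(n):               # clear the rest of the column
            if i != col:
                M[i] = M[i] - M[i, col] * M[col]
    return M[:, n:]

A = np.array([[2, 1, 1],
              [1, 3, 2],
              [1, 0, 0]])
Ainv = inverse_by_gauss_jordan(A)
print(np.allclose(A @ Ainv, np.eye(3)))   # True
```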

EXAMPLE 2.5.14 Find the inverse of the matrix

using the Gauss-Jordan method.

Solution: Consider the matrix

A sequence of steps in the Gauss-Jordan method

are:

1.

2.

3.

4.

5.

6.

7.

8. Thus, the inverse of the given matrix is

EXERCISE 2.5.15 Find the inverse of the following matrices using the Gauss-Jordan method.

Notation: For an $n \times n$ matrix $A$, by $A(i|j)$ we mean the submatrix of $A$ which is obtained by deleting the $i^{\text{th}}$ row and $j^{\text{th}}$ column.

EXAMPLE 2.6.1 Consider a matrix

Then

and

DEFINITION 2.6.2 (Determinant of a Square Matrix) Let $A$ be a square matrix of order $n$. With $A$, we associate inductively (on $n$) a number, called the determinant of $A$, written $\det(A)$ (or $|A|$), by
$$\det(A) = \begin{cases} a_{11} & \text{if } n = 1, \\ \displaystyle\sum_{j=1}^{n} (-1)^{1+j} a_{1j} \det\bigl(A(1|j)\bigr) & \text{if } n > 1. \end{cases}$$

EXAMPLE 2.6.3
1. Let $A = [a_{ij}]$ be a $2 \times 2$ matrix. Then $\det(A) = a_{11}\det(A(1|1)) - a_{12}\det(A(1|2)) = a_{11}a_{22} - a_{12}a_{21}$. For example, for $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, $\det(A) = 1 \cdot 4 - 2 \cdot 3 = -2$.
2. Let $A = [a_{ij}]$ be a $3 \times 3$ matrix. Then,
$$\det(A) = a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31}). \qquad (2.6.1)$$
For example, if $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 1 & 2 & 2 \end{bmatrix}$ then $\det(A) = 1\cdot(6 - 2) - 2\cdot(4 - 1) + 3\cdot(4 - 3) = 4 - 6 + 3 = 1$.
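Definition 2.6.2 is directly recursive. The sketch below makes the expansion along the first row explicit (purely for illustration; it is far slower than practical methods) and compares the result with NumPy's determinant.

```python
import numpy as np

def det_by_expansion(A):
    """Determinant by expansion along the first row (Definition 2.6.2)."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0
    for j in range(n):
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)   # A(1|j)
        total += (-1) ** j * A[0, j] * det_by_expansion(minor)
    return total

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 10]])
print(det_by_expansion(A))                                   # -3
print(np.isclose(np.linalg.det(A), det_by_expansion(A)))     # True
```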

EXERCISE 2.6.4 1. Find the determinant of the following matrices.

2. Show that the determinant of a triangular matrix is the product of its diagonal entries. DEFINITION 2.6.5 A matrix is said to be a singular matrix if It is called non-singular if

The proof of the next theorem is omitted. The interested reader is advised to go through Appendix 14.3. THEOREM 2.6.6 Let 1. if 2. if be an matrix. Then by interchanging two rows, then by multiplying a row by then , times the th row, where , ,

is obtained from is obtained from

3. if all the elements of one row or column of 4. if then 5. if is obtained from , by replacing the

are 0 then th row by itself plus

is a square matrix having two rows equal then

Remark 2.6.7

1. Many authors define the determinant using ``Permutations." It turns out that THE WAY WE HAVE DEFINED DETERMINANT is usually called the expansion of the determinant along the first row. 2. Part 1 of Lemma 2.6.6 implies that ``one can also calculate the determinant by expanding along any matrix for every , one also has row." Hence, for an

Remark 2.6.8 1. Let and be two vectors in and Then consider the parallelogram, We

formed by the vertices

Recall that the dot product, vector vectors We denote the length by and then

and With the above notation, if

is the length of the is the angle between the

Which tells us,

Hence, the claim holds. That is, in 2. Let cross product of two vectors in is,

the determinant is and

times the area of the parallelogram. be three elements of Recall that the

Note here that if

then

Let

be the parallelopiped formed with

as a vertex and the vectors

as adjacent

vertices. Then observe that formed by the vectors at formed by where and and

is a vector perpendicular to the plane that contains the parallelogram So, to compute the volume of the parallelopiped we need to look and the normal vector to the parallelogram

is the angle between the vector So,

Hence, Let properties of and let also hold for the volume of an be an matrix. Then the following

-dimensional parallelopiped formed with as adjacent vertices: and then then the determinant of the new matrix is

as one vertex and the vectors 1. If Also, volume of a unit -dimensional cube is 2. If we replace the vector by for some

. This is also true for the volume, as the original volume gets multiplied by 3. If for some then the vectors will give rise to an -dimensional hyperplane.

-dimensional parallelopiped. So, this parallelopiped lies on an Thus, its -dimensional volume will be zero. Also, matrix it can be proved that

In general, for any the

is indeed equal to the volume of

DEFINITION 2.6.9 (Minor, Cofactor of a Matrix) The number . We write The cofactor of be an denoted denoted

is called the is the number with

minor of

DEFINITION 2.6.10 (Adjoint of a Matrix) Let for

matrix. The matrix

is called the Adjoint of

EXAMPLE 2.6.11 Let

Then

as THEOREM 2.6.12 Let 1. for 2. for 3. Thus, and be an matrix. Then

and so on.

(2.6.2)

Proof . Let the row of

be a square matrix with as the row of

the other rows of By the construction of construction again,

are the same as that of two rows ( and ) are equal. By Part 5 of Lemma 2.6.6, for Thus, by Remark 2.6.7, we have By

Now,

Thus,

Since, has an inverse and

Therefore,

has a right

inverse. Hence, by Theorem 2.5.9

height6pt width 6pt depth 0pt

EXAMPLE 2.6.13 Let

Then

and

By Theorem 2.6.12.3,

The next corollary is an easy consequence of Theorem 2.6.12 (recall Theorem 2.5.9). COROLLARY 2.6.14 If is a non-singular matrix, then and

THEOREM 2.6.15 Let Proof . Step 1. Let

and

be square matrices of order

Then

This means, is invertible. Therefore, either matrices (see Theorem 2.5.8). So, let

is an elementary matrix or is a product of elementary be elementary matrices such that

Then, by using Parts 1, 2 and 4 of Lemma 2.6.6 repeatedly, we get

Thus, we get the required result in case Step 2. Suppose Then So,

is non-singular.

is not invertible. Hence, there exists an invertible matrix and therefore

such that

where

Thus, the proof of the theorem is complete. height6pt width 6pt depth 0pt COROLLARY 2.6.16 Let Proof . Suppose inverse. Suppose has an inverse. Then there exists a matrix both sides, we get such that Taking determinant of be a square matrix. Then is non-singular if and only if has an inverse. Thus, has an

is non-singular. Then

and therefore,

This implies that THEOREM 2.6.17 Let Proof . If If

Thus,

is non-singular. height6pt width 6pt depth 0pt

be a square matrix. Then

is a non-singular Corollary 2.6.14 gives Hence, by Corollary 2.6.16, has an inverse then doesn't have an inverse. Therefore, Thus again by

is singular, then

also doesn't have an inverse (for if Corollary 2.6.16, Hence, we have

Therefore, we again have height6pt width 6pt depth 0pt

Recall the following: the linear system $A\mathbf{x} = \mathbf{b}$ has a unique solution for every $\mathbf{b}$ if and only if $A^{-1}$ exists, and $A$ has an inverse if and only if $\det(A) \neq 0$. Thus, $A\mathbf{x} = \mathbf{b}$ has a unique solution FOR EVERY $\mathbf{b}$ if and only if $\det(A) \neq 0$.

The following theorem gives a direct method of finding the solution of the linear system $A\mathbf{x} = \mathbf{b}$ when $\det(A) \neq 0$.

THEOREM 2.6.18 (Cramer's Rule) Let $A\mathbf{x} = \mathbf{b}$ be a linear system with $n$ equations in $n$ unknowns. If $\det(A) \neq 0$, then the unique solution to this system is
$$x_j = \frac{\det(A_j)}{\det(A)}, \qquad j = 1, 2, \ldots, n,$$
where $A_j$ is the matrix obtained from $A$ by replacing the $j^{\text{th}}$ column of $A$ by the column vector $\mathbf{b}$.

Proof. Since $\det(A) \neq 0$, the matrix $A$ is invertible. Thus, the linear system $A\mathbf{x} = \mathbf{b}$ has the solution $\mathbf{x} = A^{-1}\mathbf{b} = \frac{1}{\det(A)} \operatorname{Adj}(A)\,\mathbf{b}$. Hence, the $j^{\text{th}}$ coordinate of $\mathbf{x}$ is given by
$$x_j = \frac{1}{\det(A)} \sum_{i=1}^{n} b_i\, C_{ij} = \frac{\det(A_j)}{\det(A)},$$
where $C_{ij}$ denotes the cofactor of $a_{ij}$. $\square$

The theorem implies that $x_1 = \dfrac{\det(A_1)}{\det(A)}$, and in general $x_j = \dfrac{\det(A_j)}{\det(A)}$ for $j = 1, 2, \ldots, n$.
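Cramer's rule translates into a few lines; the sketch below uses a made-up $2 \times 2$ system and compares the answer with a direct solve.

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b by Cramer's rule (assumes det(A) != 0)."""
    d = np.linalg.det(A)
    x = np.empty(len(b))
    for j in range(len(b)):
        Aj = A.astype(float).copy()
        Aj[:, j] = b                    # replace the j-th column of A by b
        x[j] = np.linalg.det(Aj) / d
    return x

A = np.array([[2, 1], [1, 3]])
b = np.array([3, 5])
print(cramer(A, b))                                        # [0.8 1.4]
print(np.allclose(cramer(A, b), np.linalg.solve(A, b)))    # True
```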

EXAMPLE 2.6.19 Suppose that

and

Use Cramer's rule to find a vector

such that Solution: Check that Therefore

and

That is,

In this chapter, the linear transformations are from a given finite dimensional vector space to itself. Observe that in this case, the matrix of the linear transformation is a square matrix. So, in this chapter, all the for some positive integer matrices are square matrices and a vector means EXAMPLE 6.1.1 Let be a real symmetric matrix. Consider the following problem:

To solve this, consider the Lagrangian

Partially differentiating

with respect to

for

we get

and so on, till

Therefore, to get the points of extrema, we solve for

We therefore need to find a

and

such that

for the extremal problem.

EXAMPLE 6.1.2 Consider a system of

ordinary differential equations of the form

(6.1.1)

where

is a real

matrix and

is a column vector.

To get a solution, let us assume that (6.1.2)

is a solution of (6.1.1) and look into what and has to satisfy, i.e., we are investigating for a necessary condition on and so that (6.1.2) is a solution of (6.1.1). Note here that (6.1.1) has the zero solution, namely and so we are looking for a non-zero Differentiating (6.1.2) with respect to and substituting in (6.1.1), leads to (6.1.3)

So, (6.1.2) is a solution of the given system of differential equations if and only if matrix we are this lead to find a pair such that That is, given an satisfied. Let be a matrix of order In general, we ask the question: For what values of there exist a non-zero vector such that

and

satisfy (6.1.3). and (6.1.3) is

(6.1.4) stands for either the vector space Here, equation over or over Equation (6.1.4) is equivalent to the

By Theorem 2.5.1, this system of linear equations has a non-zero solution, if

So, to solve (6.1.4), we are forced to choose those values of that is a polynomial in of degree

for which

Observe

We are therefore led to the following definition.

DEFINITION 6.1.3 (Characteristic Polynomial) Let $A$ be a matrix of order $n$. The polynomial $\det(A - \lambda I)$ is called the characteristic polynomial of $A$ and is denoted by $p(\lambda)$. The equation $p(\lambda) = 0$ is called the characteristic equation of $A$. If $\lambda \in \mathbb{F}$ is a solution of the characteristic equation $p(\lambda) = 0$, then $\lambda$ is called a characteristic value of $A$.

Some books use the term EIGENVALUE in place of characteristic value.

THEOREM 6.1.4 Let $A$ be a matrix of order $n$ and suppose $\lambda \in \mathbb{F}$ is a root of the characteristic equation. Then there exists a non-zero $\mathbf{v} \in \mathbb{F}^n$ such that $A\mathbf{v} = \lambda\mathbf{v}$.

Proof. Since $\lambda$ is a root of the characteristic equation, $\det(A - \lambda I) = 0$. This shows that the matrix $A - \lambda I$ is singular and therefore by Theorem 2.5.1 the linear system $(A - \lambda I)\mathbf{x} = \mathbf{0}$ has a non-zero solution. $\square$

Remark 6.1.5 Observe that the linear system $A\mathbf{x} = \lambda\mathbf{x}$ has a solution $\mathbf{x} = \mathbf{0}$ for every $\lambda \in \mathbb{F}$. So, we consider only those $\mathbf{x} \in \mathbb{F}^n$ that are non-zero and are solutions of the linear system $A\mathbf{x} = \lambda\mathbf{x}$.

DEFINITION 6.1.6 (Eigenvalue and Eigenvector) If the linear system $A\mathbf{x} = \lambda\mathbf{x}$ has a non-zero solution $\mathbf{x} \in \mathbb{F}^n$ for some $\lambda \in \mathbb{F}$, then
1. $\lambda$ is called an eigenvalue of $A$,
2. $\mathbf{x}$ is called an eigenvector of $A$ corresponding to the eigenvalue $\lambda$, and
3. the tuple $(\lambda, \mathbf{x})$ is called an eigenpair.
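Eigenvalues (the roots of the characteristic equation) and eigenvectors can be computed numerically. The sketch below checks $A\mathbf{v} = \lambda\mathbf{v}$ for a made-up symmetric matrix.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Roots of the characteristic polynomial det(A - lambda*I) = 0 ...
eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)                            # the eigenvalues 3 and 1 (in some order)

# ... and for each root a non-zero vector v with A v = lambda v.
for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))    # True, True
```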

Remark 6.1.7 To understand the difference between a characteristic value and an eigenvalue, we give the following example. Consider the matrix Then the characteristic polynomial of is

Given the matrix

recall the linear transformation

defined by

1. If has

that is, if and

is considered a COMPLEX matrix, then the roots of as eigenpairs.

in

are

So,

2. If

that is, if then

is considered a REAL matrix, then

has no solution in

Therefore, if

has no eigenvalue but it has

as characteristic values. matrix then for any non-zero are eigenvectors of it is easily seen that if Hence, when

Remark 6.1.8 Note that if

is an eigenpair for an Similarly, if

is also an eigenpair for corresponding to the eigenvalue , then

then for any non-zero

is also an eigenvector of

corresponding to the eigenvalue

we talk of eigenvectors corresponding to an eigenvalue Suppose is a root of the characteristic equation Suppose has

we mean LINEARLY INDEPENDENT EIGENVECTORS. Then is singular and

Then by Corollary 4.3.9, the linear system has linearly independent

linearly independent solutions. That is, whenever

eigenvectors corresponding to the eigenvalue EXAMPLE 6.1.9 1. Let with

for

Then

is the

characteristic equation. So, the eigenpairs are

2. Let That is

Then

Hence, the characteristic equation has roots for Hence, from the above is

is a repeated eigenvalue. Now check that the equation And this has the solution

equivalent to the equation remark,

is a representative for the eigenvector. Therefore, HERE WE HAVE TWO EIGENVALUES

MATHEND000# BUT ONLY ONE EIGENVECTOR.

3. Let

Then and we know that from

The characteristic equation has roots for every to get and then

Here,

the matrix that we have is

and we can CHOOSE ANY as the two eigenpairs.

TWO LINEARLY INDEPENDENT VECTORS

In general, if

are linearly independent vectors in are eigenpairs for the identity matrix,

4. Let

Then Now check that the eigenpairs are and

The characteristic equation has roots In this case, we have TWO

DISTINCT EIGENVALUES AND THE CORRESPONDING EIGENVECTORS ARE ALSO LINEARLY INDEPENDENT.

The reader is required to prove the linear independence of the two eigenvectors. Then Hence, over that the eigenpairs are the matrix and

5. Let

The characteristic equation has roots has no eigenvalue. Over the reader is required to show

EXERCISE 6.1.10 1. Find the eigenvalues of a triangular matrix. 2. Find eigenpairs over for each of the following matrices: and 3. Let and be similar matrices. 1. Then prove that and have the same set of eigenvalues. 2. Let be an eigenpair for and be an eigenpair for between the vectors and ? and are similar, then there exists a non-singular matrix

What is the relationship

[Hint: Recall that if the matrices such that ] 4. Let be an

matrix. Suppose that for all

Then prove that

is an eigenvalue of What is the corresponding eigenvector? have the same set of eigenvalues. Construct a 5. Prove that the matrices and

matrix

such

that the eigenvectors of and are different. be a matrix such that ( is called an idempotent matrix). Then prove that its 6. Let eigenvalues are either 0 or or both. 7. Let be a matrix such that ( is called a nilpotent matrix) for some positive integer . Then prove that its eigenvalues are all 0 . THEOREM 6.1.11 Let distinct. Then be an and matrix with eigenvalues not necessarily

Proof . Since

are the

eigenvalues of

by definition, (6.1.5)

(6.1.5) is an identity in

as polynomials. Therefore, by substituting

in (6.1.5), we get

Also,

(6.1.6)

(6.1.7) for some Note that the coefficient of comes from the product

So,

by definition of trace.

But , from (6.1.5) and (6.1.7), we get

(6.1.8) Therefore, comparing the coefficient of we have

Hence, we get the required result. height6pt width 6pt depth 0pt EXERCISE 6.1.12

1. Let 2. Let

be a skew symmetric matrix of order be a orthogonal matrix such that .If

Then prove that 0 is an eigenvalue of , then prove that there exists a

non-zero vector Let be an

matrix. Then in the proof of the above theorem, we observed that the characteristic is a polynomial equation of degree it has the form in Also, for some numbers

equation

Note that, in the expression elements of It turns out that the expression

is an element of

Thus, we can only substitute

by

holds true as a matrix identity. This is a celebrated theorem called the Cayley Hamilton Theorem. We state this theorem without proof and give some implications. THEOREM 6.1.13 (Cayley Hamilton Theorem) Let characteristic equation. That is, be a square matrix of order Then satisfies its

holds true as a matrix identity. Some of the implications of the Cayley Hamilton Theorem are as follows.

Remark 6.1.14
1. Let $A$ be a non-zero square matrix whose only eigenvalue is $0$. Then its characteristic polynomial is $\lambda^n$, and for the function $f(\lambda) = \lambda$ we have $f(\lambda) = 0$ for each eigenvalue $\lambda$ of $A$, while $f(A) = A \neq \mathbf{0}$. This shows that the condition $f(\lambda) = 0$ for each eigenvalue $\lambda$ of $A$ does not imply that $f(A) = \mathbf{0}$; it is the characteristic polynomial itself that annihilates $A$.

2. Suppose we are given a square matrix $A$ of order $n$ and we are interested in calculating $A^{\ell}$, where $\ell$ is large compared to $n$. Then we can use the division algorithm to find numbers $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ and a polynomial $f(\lambda)$ such that
$$\lambda^{\ell} = f(\lambda)\left(\lambda^n + c_{n-1}\lambda^{n-1} + \cdots + c_1\lambda + c_0\right) + \alpha_{n-1}\lambda^{n-1} + \cdots + \alpha_1\lambda + \alpha_0.$$
Hence, by the Cayley Hamilton Theorem,
$$A^{\ell} = \alpha_{n-1}A^{n-1} + \cdots + \alpha_1 A + \alpha_0 I.$$
That is, we just need to compute the powers of $A$ only up to $A^{n-1}$.

In the language of graph theory, it says the following:

``Let $G$ be a graph on $n$ vertices. Suppose there is no path of length $n - 1$ or less from a vertex $u$ to a vertex $v$. Then there is no path from $u$ to $v$ of any length. That is, the graph is disconnected and $u$ and $v$ are in different components.''
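A rough numerical sketch of Remark 6.1.14.2 follows; the matrix, the exponent and the helper function are illustrative choices, not taken from the notes. The high power is obtained by dividing $\lambda^{\ell}$ by the characteristic polynomial and evaluating only the remainder at $A$.

```python
import numpy as np

def polyval_matrix(coeffs, M):
    """Evaluate a polynomial (coefficients highest degree first) at a square matrix M."""
    result = np.zeros_like(M)
    for c in coeffs:
        result = result @ M + c * np.eye(M.shape[0])
    return result

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
l = 50
char_poly = np.poly(A)                      # monic characteristic polynomial of A
x_to_l = np.zeros(l + 1); x_to_l[0] = 1.0   # coefficients of lambda**l
_, rem = np.polydiv(x_to_l, char_poly)      # remainder has degree < n

A_to_l = polyval_matrix(rem, A)             # uses only I, A, ..., A**(n-1)
print(np.allclose(A_to_l, np.linalg.matrix_power(A, l)))   # True (up to round-off)
```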

3. Let $A$ be a non-singular matrix of order $n$. Then note that $c_0 = (-1)^n \det(A) \neq 0$ and
$$A^{-1} = \frac{-1}{c_0}\left(A^{n-1} + c_{n-1}A^{n-2} + \cdots + c_1 I\right).$$
This matrix identity can be used to calculate the inverse.

Note that the vector $A^{-1}$ (as an element of the vector space of all $n \times n$ matrices) is a linear combination of the vectors $I, A, A^2, \ldots, A^{n-1}$.
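A minimal numerical sketch of this identity (the matrix below is an arbitrary invertible example):

```python
import numpy as np

# Inverse via Cayley-Hamilton: A^{-1} = -(1/c_0)(A^{n-1} + c_{n-1} A^{n-2} + ... + c_1 I).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
n = A.shape[0]

c = np.poly(A)            # monic characteristic polynomial: [1, c_{n-1}, ..., c_1, c_0]
c0 = c[-1]

# Horner evaluation of A^{n-1} + c_{n-1} A^{n-2} + ... + c_1 I
B = np.eye(n)
for coeff in c[1:-1]:
    B = A @ B + coeff * np.eye(n)

A_inv = -B / c0
print(np.allclose(A_inv, np.linalg.inv(A)))   # True
```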

EXERCISE 6.1.15 Find the inverse of the following matrices by using the Cayley Hamilton Theorem:

THEOREM 6.1.16 If $\lambda_1, \lambda_2, \ldots, \lambda_k$ are distinct eigenvalues of a matrix $A$ with corresponding eigenvectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k$, then the set $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k\}$ is linearly independent.

Proof . The proof is by induction on the number $m$ of eigenvalues. The result is obviously true if $m = 1$, as the corresponding eigenvector is non-zero and we know that any set containing exactly one non-zero vector is linearly independent.

Let the result be true for $m = k - 1$. We prove the result for $m = k$. We consider the equation
$$c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + \cdots + c_k\mathbf{x}_k = \mathbf{0} \qquad (6.1.9)$$
for the unknowns $c_1, c_2, \ldots, c_k$. We have
$$\mathbf{0} = A\mathbf{0} = A(c_1\mathbf{x}_1 + \cdots + c_k\mathbf{x}_k) = c_1\lambda_1\mathbf{x}_1 + c_2\lambda_2\mathbf{x}_2 + \cdots + c_k\lambda_k\mathbf{x}_k. \qquad (6.1.10)$$
From Equations (6.1.9) and (6.1.10), we get
$$c_2(\lambda_2 - \lambda_1)\mathbf{x}_2 + c_3(\lambda_3 - \lambda_1)\mathbf{x}_3 + \cdots + c_k(\lambda_k - \lambda_1)\mathbf{x}_k = \mathbf{0}.$$
This is an equation in the $k - 1$ eigenvectors $\mathbf{x}_2, \ldots, \mathbf{x}_k$. So, by the induction hypothesis, we have $c_i(\lambda_i - \lambda_1) = 0$ for $2 \leq i \leq k$. But the eigenvalues are distinct implies $\lambda_i - \lambda_1 \neq 0$ for $2 \leq i \leq k$. We therefore get $c_i = 0$ for $2 \leq i \leq k$, and therefore (6.1.9) gives $c_1\mathbf{x}_1 = \mathbf{0}$. As $\mathbf{x}_1 \neq \mathbf{0}$, we also get $c_1 = 0$. Thus, we have the required result.

We are thus led to the following important corollary.

COROLLARY 6.1.17 The eigenvectors corresponding to distinct eigenvalues of an $n \times n$ matrix $A$ are linearly independent.

EXERCISE 6.1.18
1. For an $n \times n$ matrix $A$, prove the following.
   1. $A$ and $A^t$ have the same set of eigenvalues.
   2. If $\lambda$ is an eigenvalue of an invertible matrix $A$ then $\frac{1}{\lambda}$ is an eigenvalue of $A^{-1}$.
   3. If $\lambda$ is an eigenvalue of $A$ then $\lambda^k$ is an eigenvalue of $A^k$ for any positive integer $k$.
   4. If $A$ and $B$ are $n \times n$ matrices with $A$ nonsingular then $AB$ and $BA$ have the same set of eigenvalues.
   In each case, what can you say about the eigenvectors?
2. Let $A$ and $B$ be $2 \times 2$ matrices for which $\det(A) = \det(B)$ and $\operatorname{tr}(A) = \operatorname{tr}(B)$.
   1. Do $A$ and $B$ have the same set of eigenvalues?
   2. Give examples to show that the matrices $A$ and $B$ need not be similar.
3. Let $(\lambda_1, \mathbf{x})$ be an eigenpair for a matrix $A$ and let $(\lambda_2, \mathbf{x})$ be an eigenpair for another matrix $B$.
   1. Then prove that $(\lambda_1 + \lambda_2, \mathbf{x})$ is an eigenpair for the matrix $A + B$.
   2. Give an example to show that if $\lambda_1$ and $\lambda_2$ are respectively the eigenvalues of $A$ and $B$, then $\lambda_1 + \lambda_2$ need not be an eigenvalue of $A + B$.
4. Let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be distinct non-zero eigenvalues of an $n \times n$ matrix $A$. Let $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k$ be the corresponding eigenvectors. Then show that $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k\}$ forms a basis of its linear span. If $\mathbf{b} = c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + \cdots + c_k\mathbf{x}_k$, then show that the system $A\mathbf{x} = \mathbf{b}$ has the unique solution $\mathbf{x} = \frac{c_1}{\lambda_1}\mathbf{x}_1 + \frac{c_2}{\lambda_2}\mathbf{x}_2 + \cdots + \frac{c_k}{\lambda_k}\mathbf{x}_k$ in that span.
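A quick numerical illustration of Corollary 6.1.17 (a sketch; the matrix is an arbitrary example with distinct eigenvalues): stacking the eigenvectors as columns gives a matrix of full rank, so they are linearly independent.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])
evals, vecs = np.linalg.eig(A)          # three distinct eigenvalues: 2, 3, 5
print(np.linalg.matrix_rank(vecs))      # 3, i.e. the eigenvectors are independent
```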

Let $A$ be a square matrix of order $n$ and let $T_A : \mathbb{R}^n \to \mathbb{R}^n$ be the corresponding linear transformation. In this section, we ask the question ``does there exist a basis $\mathcal{B}$ of $\mathbb{R}^n$ such that the matrix of the linear transformation $T_A$ with respect to $\mathcal{B}$ is in the simplest possible form?''

We know that the simplest forms for a matrix are the identity matrix and the diagonal matrix. In this section, we show that for a certain class of matrices $A$, we can find a basis $\mathcal{B}$ such that the matrix of $T_A$ with respect to $\mathcal{B}$ is a diagonal matrix, consisting of the eigenvalues of $A$. This is equivalent to saying that $A$ is similar to a diagonal matrix. To show the above, we need the following definition.

DEFINITION 6.2.1 (Matrix Diagonalisation) A matrix $A$ is said to be diagonalisable if there exists a non-singular matrix $P$ such that $P^{-1}AP$ is a diagonal matrix.

Remark 6.2.2 Let $A$ be an $n \times n$ diagonalisable matrix with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. By definition, $A$ is similar to a diagonal matrix $D$. Observe that $D = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, as similar matrices have the same set of eigenvalues and the eigenvalues of a diagonal matrix are its diagonal entries.

EXAMPLE 6.2.3 Let $A$ be as given. Then we have the following:

1. Let Then has no real eigenvalue (see Example 6.1.8 and hence doesn't have Hence, there does not exist any non-singular real matrix eigenvectors that are vectors in such that 2. In case, are Define a and is a diagonal matrix. the two complex eigenvalues of respectively. Also, complex matrix by and Then are and the corresponding eigenvectors can be taken as a basis of

THEOREM 6.2.4 Let $A$ be an $n \times n$ matrix. Then $A$ is diagonalisable if and only if $A$ has $n$ linearly independent eigenvectors.

Proof . Let $A$ be diagonalisable. Then there exist matrices $P$ and $D$ such that

$P^{-1}AP = D$, with $D$ a diagonal matrix. Or equivalently, $AP = PD$. Let $P = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n]$ and $D = \operatorname{diag}(d_1, d_2, \ldots, d_n)$. Then $AP = PD$ implies that $A\mathbf{u}_i = d_i\mathbf{u}_i$ for $1 \leq i \leq n$. Since the $\mathbf{u}_i$'s are the columns of a non-singular matrix $P$, they are non-zero and so, for $1 \leq i \leq n$, we get the eigenpairs $(d_i, \mathbf{u}_i)$ of $A$. Since the $\mathbf{u}_i$'s are columns of the non-singular matrix $P$, using Corollary 4.3.9, we get

that $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$ are linearly independent.

Thus we have shown that if $A$ is diagonalisable then $A$ has $n$ linearly independent eigenvectors.

Conversely, suppose $A$ has $n$ linearly independent eigenvectors $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$ with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then $A\mathbf{u}_i = \lambda_i\mathbf{u}_i$ for $1 \leq i \leq n$. Let $P = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n]$. Since $\mathbf{u}_1, \ldots, \mathbf{u}_n$ are linearly independent, by Corollary 4.3.9, $P$ is non-singular. Also,
$$AP = [A\mathbf{u}_1, A\mathbf{u}_2, \ldots, A\mathbf{u}_n] = [\lambda_1\mathbf{u}_1, \lambda_2\mathbf{u}_2, \ldots, \lambda_n\mathbf{u}_n] = P\operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n).$$
Therefore the matrix $A$ is diagonalisable.

COROLLARY 6.2.5 Let $A$ be an $n \times n$ matrix. Suppose that the eigenvalues of $A$ are distinct. Then $A$ is diagonalisable.

Proof . As $A$ is an $n \times n$ matrix, it has $n$ eigenvalues. Since all the eigenvalues of $A$ are distinct, by Corollary 6.1.17, the $n$ corresponding eigenvectors are linearly independent. Hence, by Theorem 6.2.4, $A$ is diagonalisable.

COROLLARY 6.2.6 Let $A$ be an $n \times n$ matrix with

as its distinct eigenvalues and divides . Then but

its characteristic polynomial. Suppose that for each does not divides for some positive integers

Or equivalently Proof . As is diagonalisable, by Theorem 6.2.4, as has linearly independent eigenvalues. Also, , has exactly linearly

. Hence, for each eigenvalue

independent eigenvectors. Thus, for each has exactly Indeed Now suppose that for each choose

, the homogeneous linear system .

linearly independent vectors in its solution set. Therefore, for follows from a simple counting argument. . Then for each

, we can

linearly independent eigenvectors. Also by Corollary 6.1.17, the eigenvectors corresponding to has linearly independent eigenvectors.

distinct eigenvalues are linearly independent. Hence $A$ has $n$ linearly independent eigenvectors, and so, by Theorem 6.2.4, $A$ is diagonalisable.

EXAMPLE 6.2.7

1. Let It is easily seen that

Then and

Hence,

has eigenvalues

are the only eigenpairs. That is, the matrix Hence, by Theorem 6.2.4,

has exactly one eigenvector corresponding to the repeated eigenvalue the matrix is not diagonalisable. 2. Let can be easily verified that corresponds to the eigenvalue Then and Note that the set Hence,

has eigenvalues and

It

correspond to the eigenvalue

consisting of eigenvectors

corresponding to the eigenvalue are not orthogonal. This set can be replaced by the orthogonal set which still consists of eigenvectors corresponding to the eigenvalue as . Also, the set forms a basis of

So, by Theorem 6.2.4, the matrix

is diagonalisable. Also, if

is the

corresponding unitary matrix then Observe that the matrix is a symmetric matrix. In this case, the eigenvectors are mutually orthogonal. In general, for any real symmetric matrix there always exist eigenvectors and they are mutually orthogonal. This result will be proved later. EXERCISE 6.2.8 1. By finding the eigenvalues of the following matrices, justify whether or not real non-singular matrix and a real diagonal matrix for any with for some

2. Are the two matrices

and

diagonalisable? , where matrix. Suppose if and otherwise. is

3. Find the eigenvalues and eigenvectors of 4. Let be an matrix and an

Then show that

diagonalisable if and only if both and are diagonalisable. 5. Let be a linear transformation with

and

Then 1. determine the eigenvalues of 2. find the number of linearly independent eigenvectors corresponding to each eigenvalue? 3. is diagonalisable? Justify your answer. 6. Let be a non-zero square matrix such that Show that cannot be diagonalised. [Hint: Use Remark 6.2.2.] 7. Are the following matrices diagonalisable?
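As a small numerical illustration of Theorem 6.2.4 (the matrix below is an arbitrary example, not one from the exercises), collecting $n$ linearly independent eigenvectors as the columns of $P$ makes $P^{-1}AP$ diagonal:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
evals, P = np.linalg.eig(A)                   # columns of P are eigenvectors
D = np.linalg.inv(P) @ A @ P
print(np.allclose(D, np.diag(evals)))         # True: A is diagonalisable
```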

In this section, we will look at some special classes of square matrices which are diagonalisable. We will also recall the following be dealing with matrices having complex entries and hence for a matrix definitions. DEFINITION 6.3.1 (Special Matrices) 1. Note that 2. A square matrix with complex entries is called 1. a Hermitian matrix if 2. a unitary matrix if 3. a skew-Hermitian matrix if 4. a normal matrix if 3. A square matrix with real entries is called 1. a symmetric matrix if 2. an orthogonal matrix if 3. a skew-symmetric matrix if Note that a symmetric matrix is always Hermitian, a skew-symmetric matrix is always skew-Hermitian and an orthogonal matrix is always unitary. Each of these matrices are normal. If is a unitary matrix then EXAMPLE 6.3.2 1. Let Then is skew-Hermitian. is called the conjugate transpose of the matrix

2. Let that

and is also a normal matrix.

Then

is a unitary matrix and

is a normal matrix. Note

DEFINITION 6.3.3 (Unitary Equivalence) Let equivalent if there exists a unitary matrix

and

be two

matrices. They are called unitarily

such that

Note that EXERCISE 6.3.4

as

is a unitary matrix. So,

is unitarily similar to the matrix

1. Let be a square matrix such that is a normal matrix. 2. Let be any matrix. Then and

is a diagonal matrix for some unitary matrix where

. Prove that

is the Hermitian part of

is the skew-Hermitian part of where both and are Hermitian matrices.

3. Every matrix can be uniquely expressed as 4. Show that is always skew-Hermitian. such that

5. Does there exist a unitary matrix and

where

PROPOSITION 6.3.5 Let $A$ be an $n \times n$ Hermitian matrix. Then all the eigenvalues of $A$ are real.

Proof . Let $(\lambda, \mathbf{x})$ be an eigenpair. Then $A\mathbf{x} = \lambda\mathbf{x}$ and $A^* = A$ implies
$$\mathbf{x}^* A = \mathbf{x}^* A^* = (A\mathbf{x})^* = (\lambda\mathbf{x})^* = \overline{\lambda}\,\mathbf{x}^*.$$
Hence
$$\lambda\,\mathbf{x}^*\mathbf{x} = \mathbf{x}^*(\lambda\mathbf{x}) = \mathbf{x}^*(A\mathbf{x}) = (\mathbf{x}^* A)\mathbf{x} = \overline{\lambda}\,\mathbf{x}^*\mathbf{x}.$$
But $\mathbf{x}$ is an eigenvector and hence $\mathbf{x} \neq \mathbf{0}$. That is, the real number $\mathbf{x}^*\mathbf{x} = \|\mathbf{x}\|^2$ is non-zero as well. Thus $\lambda = \overline{\lambda}$, that is, $\lambda$ is a real number.
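A quick numerical check of this proposition, and of the unitary diagonalisation asserted by the next theorem (a sketch; the Hermitian matrix below is an arbitrary example):

```python
import numpy as np

A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])
assert np.allclose(A, A.conj().T)                        # A is Hermitian

evals, U = np.linalg.eigh(A)                             # eigh is tailored to Hermitian matrices
print(evals)                                             # real eigenvalues
print(np.allclose(U.conj().T @ U, np.eye(2)))            # U is unitary
print(np.allclose(U.conj().T @ A @ U, np.diag(evals)))   # U*AU is diagonal
```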

THEOREM 6.3.6 Let $A$ be an $n \times n$ Hermitian matrix. Then $A$ is unitarily diagonalisable. That is, there exists a unitary matrix $U$ such that $U^* A U = D$, where $D$ is a diagonal matrix with the eigenvalues of $A$ as the diagonal entries. In other words, the eigenvectors of $A$ form an orthonormal basis of $\mathbb{C}^n$.

Proof . We will prove the result by induction on the size of the matrix. The result is clearly true if Let the result be true for we will prove the result in case So, let be a matrix and let be an eigenpair of orthonormal basis with We now extend the linearly independent set (using Gram-Schmidt Orthogonalisation) of . to form an

As

is an orthonormal set,

Therefore, observe that for all

Hence, we also have as columns of

for

Now, define is a unitary matrix and

(with

). Then the matrix

where

is a

matrix. As

,we get

. This condition, . That is, is

together with the fact that

is a real number (use Proposition 6.3.5), implies that

also a Hermitian matrix. Therefore, by induction hypothesis there exists a such that

unitary matrix

Recall that , the entries

for

are the eigenvalues of the matrix

We also know that two are Define

similar matrices have the same set of eigenvalues. Hence, the eigenvalues of Then is a unitary matrix and

Thus,

is a diagonal matrix with diagonal entries

the eigenvalues of

Hence, the

result follows.

COROLLARY 6.3.7 Let $A$ be an $n \times n$ real symmetric matrix. Then
1. the eigenvalues of $A$ are all real,
2. the corresponding eigenvectors can be chosen to have real entries, and
3. the eigenvectors also form an orthonormal basis of $\mathbb{R}^n$.

Proof . As $A$ is symmetric, $A$ is also an Hermitian matrix. Hence, by Proposition 6.3.5, the eigenvalues of $A$ are all real. Let $(\lambda, \mathbf{x})$ be an eigenpair of $A$. Suppose $\mathbf{x} \in \mathbb{C}^n$. Then there exist $\mathbf{y}, \mathbf{z} \in \mathbb{R}^n$ such that $\mathbf{x} = \mathbf{y} + i\mathbf{z}$. So,
$$\lambda(\mathbf{y} + i\mathbf{z}) = \lambda\mathbf{x} = A\mathbf{x} = A(\mathbf{y} + i\mathbf{z}) = A\mathbf{y} + iA\mathbf{z}.$$
Comparing the real and imaginary parts, we get $A\mathbf{y} = \lambda\mathbf{y}$ and $A\mathbf{z} = \lambda\mathbf{z}$. Thus, we can choose the

To prove the orthonormality of the eigenvectors, we proceed on the lines of the proof of Theorem 6.3.6, Hence, the readers are advised to complete the proof. height6pt width 6pt depth 0pt EXERCISE 6.3.8 1. Let be a skew-Hermitian matrix. Then all the eigenvalues of are either zero or purely imaginary. Also, the eigenvectors corresponding to distinct eigenvalues are mutually orthogonal. [Hint: Carefully study the proof of Theorem 6.3.6.] be an unitary matrix. Then 2. Let 1. the rows of form an orthonormal basis of 2. the columns of form an orthonormal basis of 3. for any two vectors 4. for any vector

5. for any eigenvalue 6. the eigenvectors if 3. Let for 4. Show that the matrices and are similar. Is it possible to find a unitary and corresponding to distinct eigenvalues are eigenpairs, with then and and satisfy That is,

are mutually orthogonal. then is an eigenpair

be a normal matrix. Then, show that if

is an eigenpair for

such that matrix 5. Let be a orthogonal matrix. Then prove the following: 1. if then for some

2. if

then there exists a basis of

in which the matrix of

looks like

Or equivalently, reflects the vectors in

for some

In this case, prove that

about a line passing through origin. Also, determine this line.

6. Let 7. Let be a

Determine

orthogonal matrix. Then prove the following: then is a rotation about a fixed axis, in the sense that to the plane has an eigenpair

1. if

such that the restriction of 2. if then the action of

is a two dimensional rotation of followed by

corresponds to a reflection through a plane

a rotation about the line through the origin that is perpendicular to Remark 6.3.9 In the previous exercise, we saw that the matrices and are

similar but not unitarily equivalent, whereas unitary equivalence implies similarity equivalence as But in numerical calculations, unitary transformations are preferred as compared to similarity transformations. The main reasons being: 1. Exercise 6.3.8.2 implies that an orthonormal change of basis leaves unchanged the sum of squares of the absolute values of the entries which need not be true under a non-orthonormal change of basis.

2. As

for a unitary matrix

unitary equivalence is computationally simpler.

3. Also in doing ``conjugate transpose'', the loss of accuracy due to round-off errors doesn't occur.

We next prove Schur's Lemma and use it to show that normal matrices are unitarily diagonalisable.

LEMMA 6.3.10 (Schur's Lemma) Every $n \times n$ complex matrix is unitarily similar to an upper triangular matrix.

Proof . We will prove the result by induction on the size of the matrix. The result is clearly true if $n = 1$. Let the result be true for matrices of order $n - 1$; we will prove the result for matrices of order $n$. So, let $A$ be an $n \times n$ matrix and let $(\lambda_1, \mathbf{x})$ be an eigenpair for $A$ with $\|\mathbf{x}\| = 1$. Then the linearly independent set $\{\mathbf{x}\}$ can be extended, using the Gram-Schmidt Orthogonalisation process, to get an orthonormal basis $\{\mathbf{x}, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ of $\mathbb{C}^n$. Then $U_1 = [\mathbf{x}, \mathbf{u}_2, \ldots, \mathbf{u}_n]$ (with $\mathbf{x}, \mathbf{u}_2, \ldots, \mathbf{u}_n$ as the columns of the matrix $U_1$) is a unitary matrix and
$$U_1^* A U_1 = \begin{pmatrix} \lambda_1 & * \\ \mathbf{0} & B \end{pmatrix},$$
where $B$ is an $(n-1) \times (n-1)$ matrix. By the induction hypothesis there exists an $(n-1) \times (n-1)$ unitary matrix $U_2$ such that $U_2^* B U_2$ is an upper triangular matrix whose diagonal entries are the eigenvalues of $B$. Observe that, since the eigenvalues of $B$ are also eigenvalues of $A$, these diagonal entries are eigenvalues of the matrix $A$. Define
$$U = U_1 \begin{pmatrix} 1 & \mathbf{0} \\ \mathbf{0} & U_2 \end{pmatrix}.$$
Then check that $U$ is a unitary matrix and $U^* A U$ is an upper triangular matrix with diagonal entries the eigenvalues of $A$. Hence, the result follows.

EXERCISE 6.3.11 1. Let $A$ be an $n \times n$

real invertible matrix. Prove that there exists an orthogonal matrix with positive diagonal entries such that and .

and a

diagonal matrix 2. Show that matrices

are unitarily equivalent via the unitary

matrix

Hence, conclude that the upper triangular matrix obtained in the

"Schur's Lemma" need not be unique. 3. Show that the normal matrices are diagonalisable. [Hint: Show that the matrix in the proof of the above theorem is also a normal matrix and if an upper triangular matrix with then has to be a diagonal matrix]. Remark 6.3.12 (The Spectral Theorem for Normal Matrices) Let be an normal matrix. of

is

Then the above exercise shows that there exists an orthonormal basis such that for

be a normal matrix. Prove the following: 4. Let 1. if all the eigenvalues of are then 2. if all the eigenvalues of 5. Let be an 1. if 2. if is Hermitian and are then for all then for all . then . matrix. Prove that

is a real, symmetric matrix and

Do these results hold for arbitrary matrices? We end this chapter with an application of the theory of diagonalisation to the study of conic sections in analytic geometry and the study of maxima and minima in analysis.
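Before turning to that application, here is a compact numerical sketch of the results of this section (the matrix is an arbitrary normal example): Schur's Lemma produces an upper triangular matrix under a unitary similarity, and for a normal matrix that triangular matrix is diagonal, as in Remark 6.3.12.

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])                    # A is normal: A A* = A* A
T, U = schur(A, output='complex')              # A = U T U*, U unitary, T upper triangular
print(np.allclose(A, U @ T @ U.conj().T))      # True
print(np.allclose(T, np.diag(np.diag(T))))     # True here (up to rounding), since A is normal
```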

DEFINITION 6.4.1 (Bilinear Form) Let

be a

matrix with real entries. A bilinear form in

is an expression of the type

Observe that if (the identity matrix) then the bilinear form reduces to the standard real inner product. Also, if we want it to be symmetric in and then it is necessary and sufficient that for all Why? Hence, any symmetric bilinear form is naturally associated with a real symmetric matrix. DEFINITION 6.4.2 (Sesquilinear Form) Let form in be a matrix with complex entries. A sesquilinear is given by

Note that if (the identity matrix) then the sesquilinear form reduces to the standard complex inner product. Also, it can be easily seen that this form is `linear' in the first component and `conjugate linear' in the second component. Also, if we want Note that if The expression and of and then the matrix need to be an Hermitian matrix. , then the sesquilinear form reduces to a bilinear form. is called the quadratic form and in place of and is a real number. , the Hermitian form can be rewritten as the Hermitian form. We generally write

, respectively. It can be easily shown that for any choice

the Hermitian form

Therefore, in matrix notation, for a Hermitian matrix

EXAMPLE 6.4.3 Let the Hermitian form

Then check that

is an Hermitian matrix and for

where `Re' denotes the real part of a complex number. This shows that for every choice of Hermitian form is always real. Why? The main idea is to express Note that if we replace by

the

as sum of squares and hence determine the possible values that it can take. where is any complex number, then for which ( i.e., simply gets multiplied by

and hence one needs to study only those From Exercise 6.3.11.3 one knows that if such that ( know are real). So, taking

is a normalised vector.

is Hermitian) then there exists a unitary matrix with 's the eigenvalues of the matrix which we 's as linear combination of 's with coefficients

(i.e., choosing ), one gets

coming from the entries of the matrix

(6.4.1)

Thus, one knows the possible values that case is a Hermitian matrix. Also, for

can take depending on the eigenvalues of the matrix

in

represents the principal axes of the conic that

they represent in the n-dimensional space. Equation (6.4.1) gives one method of writing as a sum of absolute squares of linearly independent as sum of squares. The

linear forms. One can easily show that there are more than one way of writing question arises, ``what can we say about the coefficients when squares".

has been written as sum of absolute

This question is answered by `Sylvester's law of inertia' which we state as the next lemma.

LEMMA 6.4.4 Every Hermitian form written as

(with

an Hermitian matrix) in

variables can be

where

are linearly independent linear forms in depend only on

and the integers

and

Proof . From Equation (6.4.1) it is easily seen that are uniquely given by

has the required form. Need to show that

and

Hence, let us assume on the contrary that there exist positive integers

with

such that /

Since, find a matrix get such that

and Choose

are linear combinations of . Since such that

we can Theorem 2.5.1, Hence, we

gives the existence of finding nonzero values of

Now, this can hold only if Similarly, the case

which gives a contradiction. Hence

can be resolved.

Note: The integer degree of

is the rank of the matrix

and the number

is sometimes called the inertial

We complete this chapter by understanding the graph of

for

We first look at the following example.

EXAMPLE 6.4.5 Sketch the graph of Solution: Note that

The eigenpairs for

are

Thus,

Let

Then

Thus the given graph reduces to

Therefore, the given graph represents an ellipse with the principal axes principal axes are

and

That is, the

The eccentricity of the ellipse is

the foci are at the points

and

and the equations of the directrices are
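The reduction used above can also be checked numerically. The sketch below uses the conic $3x^2 + 4xy + 3y^2 = 5$ as an assumed stand-in, since the coefficients of the example are not reproduced here; the method is the one described in the solution.

```python
import numpy as np

A = np.array([[3.0, 2.0],
              [2.0, 3.0]])                  # x^t A x = 3x^2 + 4xy + 3y^2
evals, P = np.linalg.eigh(A)                # A = P diag(evals) P^t with P orthogonal
print(evals)                                # [1., 5.]
# In the rotated coordinates (u, v) = P^t (x, y) the conic becomes
# evals[0]*u^2 + evals[1]*v^2 = 5, an ellipse whose principal axes are the
# eigenvector directions (the columns of P).
```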

Figure 6.1: Ellipse DEFINITION 6.4.6 (Associated Quadratic Form) Let equation of a general conic. The quadratic expression be the

is called the quadratic form associated with the given conic. We now consider the general conic. We obtain conditions on the eigenvalues of the associated quadratic form to characterise the different conic sections in (endowed with the standard inner product). PROPOSITION 6.4.7 Consider the general conic

Prove that this conic represents 1. an ellipse if 2. a parabola if 3. a hyperbola if and

Proof . Let

Then the associated quadratic form

As

is a symmetric matrix, by Corollary 6.3.7, the eigenvalues are orthonormal and

of

are both real, the corresponding

eigenvectors

is unitarily diagonalisable with

(6.4.2)

Let

Then

and the equation of the conic section in the

-plane, reduces to

Now, depending on the eigenvalues 1. Substituting

we consider different cases:

in (6.4.2) gives in the -plane.

Thus, the given conic reduces to a straight line

2. In this case, the equation of the conic reduces to

1. If 2. If 1. If 2. If 3. If

then in the

-plane, we get the pair of coincident lines

then we get a pair of parallel lines the solution set corresponding to the given conic is an empty set. Then the given equation is of the form for some translates

and

and thus represents a parabola. implies that the That is,

Also, observe that 3. Let and

Then the equation of the conic can be rewritten as

In this case, we have the following: 1. suppose Then the equation of the conic reduces to

The terms on the left can be written as product of two factors as the given equation represents a pair of intersecting straight lines in the 2. suppose As we can assume

Thus, in this case, -plane.

So, the equation of the conic reduces to

This equation represents a hyperbola in the

-plane, with principal axes

As

we have

4. In this case, the equation of the conic can be rewritten as

we now consider the following cases: 1. suppose Then the equation of the ellipse reduces to a pair of perpendicular lines and 2. suppose in the -plane.

Then there is no solution for the given equation. Hence, we do not get any real

ellipse in the 3. suppose

-plane. In this case, the equation of the conic reduces to

This equation represents an ellipse in the

-plane, with principal axes

Also, the condition

implies that

Remark 6.4.8 Observe that the condition

implies that the principal axes of the conic are functions of the eigenvectors EXERCISE 6.4.9 Sketch the graph of the following surfaces: 1. 2. 3. 4.

and

As a last application, we consider the following problem that helps us in understanding the quadrics. Let (6.4.3)

be a general quadric. Then we need to follow the steps given below to write the above quadric in the standard form and thereby get the picture of the quadric. The steps are:

1. Observe that this equation can be rewritten as

where

2. As the matrix is symmetric matrix, find an orthogonal matrix matrix. 3. Replace the vector by Then writing

such that

is a diagonal

the equation (6.4.3) reduces to (6.4.4)

where

are the eigenvalues of so

4. Complete the squares, if necessary, to write the equation (6.4.4) in terms of the variables

that this equation is in the standard form. 5. Use the relation between the original and the new variables to determine the centre and the planes of symmetry of the quadric in terms of the original coordinate system.
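A short numerical sketch of steps 1-3 (the quadric below is an illustrative stand-in, not the one treated in Example 6.4.10):

```python
import numpy as np

# Assumed quadric: 2x^2 + 2y^2 + 3z^2 + 2xy + 4x - 6z + 1 = 0, i.e. x^t A x + b.x + 1 = 0.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])
b = np.array([4.0, 0.0, -6.0])

evals, P = np.linalg.eigh(A)        # step 2: orthogonal P with P^t A P diagonal
b_new = P.T @ b                     # step 3: replace x by P y; the quadric becomes
print(evals, b_new)                 #   sum_i evals[i]*y_i^2 + (b_new . y) + 1 = 0
# Step 4 (completing squares in each y_i) then puts the equation in standard form.
```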

EXAMPLE 6.4.10 Determine the quadric. Solution: In this case,

and

and

. Check that for the orthonormal matrix

So, the equation of the quadric reduces to

Or equivalently,

So, the equation of the quadric in standard form is

where the point obtained above is the centre. The calculation of the planes of symmetry is left as an exercise to the reader.
